In brief
- Multiverse’s CompactifAI technology allegedly cuts the number of parameters by 70% and the model’s memory footprint by 93% while preserving 97–98% accuracy.
- The company has just closed a $215 million Series B round led by Bullhound Capital, with HP Tech Ventures and Toshiba participating.
- The method uses tensor networks from quantum physics to compress models, then “heals” them with rapid retraining that the company claims runs 50% faster than the original training.
A Spanish AI startup has just convinced investors to hand over more than $215 million on the strength of a bold claim: it can shrink large language models by up to 95% without compromising their performance.
Multiverse Computing’s innovation rests on its CompactifAI technology, a compression method that borrows mathematical concepts from quantum physics to shrink models down to smartphone size.
The San Sebastián-based company says its compressed Llama-2 7B model runs 25% faster at inference and uses 70% fewer parameters, with accuracy dropping only 2–3%.
If confirmed at scale, it could address AI’s elephant-sized problem: models so massive that they require specialized data centers just to run.
“For the first time in history, we are able to profile the inner workings of a neural network to eliminate billions of spurious correlations to truly optimize all sorts of AI models,” said Román Orús, Multiverse’s Chief Scientific Officer, in a blog post Thursday.
Bullhound Capital led the $215 million Series B round with support from HP Tech Ventures and Toshiba.
The physics behind the compression
Applying quantum-inspired concepts to one of AI’s biggest problems sounds implausible, but if the research holds up, it is real.
Unlike traditional compression that simply cuts neurons or reduces numerical precision, CompactifAI uses tensor networks: mathematical structures developed by physicists to track particle interactions without drowning in data.
The process works like origami for AI models: weight matrices are folded into smaller, interconnected structures called matrix product operators.
Instead of storing every connection between neurons, the system keeps only the significant correlations, discarding redundant patterns and relationships that repeat over and over.
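CompactifAI’s exact method is proprietary, but the core move of keeping only significant correlations can be sketched with a truncated singular value decomposition, the basic building block that tensor-network decompositions chain together. Everything below (matrix shapes, the rank cutoff, the noise level) is an illustrative assumption, not the company’s actual pipeline:

```python
import numpy as np

def truncated_svd(mat, max_rank):
    """Factor mat into A @ B, keeping only the top singular values."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    r = min(max_rank, len(s))
    return u[:, :r] * s[:r], vt[:r, :]   # A: (m, r), B: (r, n)

rng = np.random.default_rng(0)
# A toy 256x256 "weight matrix": a rank-16 signal plus small noise,
# standing in for the redundant structure real weight matrices carry.
w = rng.standard_normal((256, 16)) @ rng.standard_normal((16, 256))
w += 0.01 * rng.standard_normal((256, 256))

a, b = truncated_svd(w, max_rank=16)
compression = (a.size + b.size) / w.size            # fraction of params kept
error = np.linalg.norm(w - a @ b) / np.linalg.norm(w)

print(f"params kept: {compression:.1%}, relative error: {error:.4f}")
```

Here the two factors store 12.5% of the original entries while reconstructing the matrix almost exactly, because the discarded singular values carry only noise. A matrix product operator generalizes this by reshaping the weight matrix into a higher-order tensor and applying such truncations link by link along a chain.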
Multiverse also found that AI models are not uniformly compressible. Early layers proved fragile, while deeper layers, recently shown to matter less for performance, can withstand aggressive compression.
This selective approach lets them achieve dramatic size reductions where other methods fail.
After compression, the models undergo a brief “healing” phase: retraining that takes less than one epoch thanks to the reduced parameter count. The company claims this restoration process runs 50% faster than training the original models, due to lighter GPU-to-CPU transfer loads.
Long story short, by the company’s own account: you start with a model, run the CompactifAI magic, and end up with a compressed version that has less than half of its parameters, can run at twice the inference speed, costs much less, and is just as capable as the original.
In its research, the team shows it can reduce the memory needs of the Llama-2 7B model by 93%, cut the parameter count by 70%, speed up training by 50%, and speed up responses (inference) by 25%, while losing only 2–3% accuracy.
Traditional shrinking methods such as quantization (reducing precision, like using fewer decimal places), pruning (cutting out less important neurons entirely, like trimming dead branches from a tree), or distillation (training a smaller model to imitate a larger one’s behavior) come nowhere near those numbers.
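For a sense of what those baselines actually do, here are deliberately naive sketches of quantization and magnitude pruning; real int8 quantization and structured pruning schemes are considerably more sophisticated, and the array sizes and thresholds here are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)
w = rng.standard_normal(1000).astype(np.float32)  # toy weight vector

# Quantization: snap each float32 weight to one of 255 int8 levels,
# trading a little precision for a 4x smaller memory footprint.
scale = np.abs(w).max() / 127
w_q = np.round(w / scale).astype(np.int8)
w_dequant = w_q.astype(np.float32) * scale        # what the model sees

# Magnitude pruning: zero out the 70% of weights with the smallest
# absolute value, like trimming dead branches from a tree.
threshold = np.quantile(np.abs(w), 0.70)
w_pruned = np.where(np.abs(w) >= threshold, w, 0.0)

quant_error = float(np.abs(w - w_dequant).max())
frac_pruned = float(np.mean(w_pruned == 0))
print(f"max quantization error: {quant_error:.4f}")
print(f"fraction pruned: {frac_pruned:.2f}")
```

Both methods shrink the model, but quantization caps out around 4–8x memory savings before accuracy collapses, and pruning at this rate typically requires careful retraining to recover quality, which is part of why the numbers Multiverse claims stand out.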
Multiverse already serves more than 100 clients, including Bosch and the Bank of Canada, applying its quantum-inspired algorithms beyond AI to energy optimization and financial modeling.
The Spanish government co-invested 67 million euros in March, pushing total funding above $250 million.
The company currently offers compressed versions of open-source models like Llama and Mistral via AWS, and plans to expand to DeepSeek R1 and other reasoning models.
Proprietary systems from OpenAI or Anthropic’s Claude obviously remain off-limits, since they are not available for tinkering or study.
The technology’s promise extends beyond cost savings. HP Tech Ventures’ involvement signals interest in edge AI deployment: running sophisticated models locally rather than on cloud servers.
“Multiverse’s innovative approach has the potential to bring AI benefits of improved performance, personalization, privacy and cost efficiency to life for companies of any size,” said Tuan Tran, HP’s President of Technology and Innovation.
So if you one day find yourself running DeepSeek R1 on your smartphone, these may be the folks to thank.
Edited by Josh Quittner and Sebastian Sinclair
Generally Intelligent Newsletter
A weekly AI journey narrated by Gen, a generative AI model.