Multiverse debuts tiny AI models that run offline without losing performance

One of Europe’s most prominent AI startups, Multiverse Computing, has unveiled two AI models so small they’re named after a fly’s brain and a chicken’s brain — yet they still boast impressive performance in chat, speech, and even reasoning tasks.

The Donostia, Spain–based company, known for its quantum-inspired AI compression technology, says these are the world’s smallest high-performing models designed to run entirely on-device — from smartphones and laptops to IoT gadgets like smart appliances and wearables.

“We can compress the model so much that they can fit on devices,” said founder Román Orús in an interview with TechCrunch. “You can run them on premises, directly on your iPhone, or on your Apple Watch.”

From Quantum Physics to AI Compression

Multiverse Computing was founded in 2019 by Orús, quantum computing expert Samuel Mugel, and former Unnim Banc deputy CEO Enrique Lizaso Olmos. The company recently raised €189 million ($215M) in June, bringing total funding to roughly $250 million. The latest round was led by Bullhound Capital, with participation from HP Tech Ventures, Toshiba, and other strategic investors.

Its core technology, called CompactifAI, is a quantum-inspired compression algorithm that drastically reduces AI model size without sacrificing — and sometimes even improving — performance. Unlike typical AI compression methods, CompactifAI draws on principles from quantum physics, which Orús says allows for more “subtle and refined” optimization.
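CompactifAI's exact algorithm is proprietary, so the following is only a minimal sketch of the general idea behind quantum-inspired (tensor-network-style) compression: a dense weight matrix is factored into thin low-rank pieces and the least important components are truncated, shrinking the parameter count while approximately preserving the layer's behavior. The truncated SVD used here is a simplified stand-in, and the function name and keep_ratio parameter are illustrative, not part of CompactifAI.

```python
import numpy as np

def compress_layer(weight: np.ndarray, keep_ratio: float = 0.25):
    """Factor a dense weight matrix and truncate small singular values.

    Simplified stand-in for tensor-network compression: an (m x n) matrix
    is replaced by two thin factors whose product approximates it, cutting
    the parameter count from m*n down to k*(m + n).
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    k = max(1, int(len(s) * keep_ratio))   # number of ranks kept after truncation
    a = u[:, :k] * s[:k]                   # (m x k) factor, singular values folded in
    b = vt[:k, :]                          # (k x n) factor
    return a, b

# A 512 x 512 layer (262,144 parameters) shrinks to 131,072 at keep_ratio=0.25,
# and the forward pass becomes x @ a @ b instead of x @ weight.
rng = np.random.default_rng(0)
w = rng.standard_normal((512, 512))
a, b = compress_layer(w, keep_ratio=0.25)
print(w.size, a.size + b.size)
```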

Meet the Model Zoo: ChickBrain and SuperFly

Multiverse has compressed numerous open-source models, from small Llama and Mistral versions to OpenAI’s latest open releases. But with its new Model Zoo, the company aims to make the smallest yet most capable AI models for embedded use:

  • SuperFly – A compressed version of Hugging Face’s SmolLM2-135M, reduced from 135 million to 94 million parameters, roughly the neural capacity of a fly’s brain. While not built for deep reasoning, it’s ideal for ultra-lightweight voice-enabled applications. Multiverse envisions SuperFly running on devices like washing machines or microwaves, enabling commands like “start quick wash” or real-time troubleshooting without internet access (see the sketch after this list).
  • ChickBrain – A 3.2 billion parameter compressed version of Meta’s Llama 3.1 8B model. Despite being less than half the original size, ChickBrain slightly outperforms its parent on benchmarks like MMLU-Pro (language understanding), MATH-500 and GSM8K (math reasoning), and GPQA Diamond (graduate-level science questions). The model can run entirely offline on a MacBook and is aimed at higher-level reasoning tasks.
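
The article does not give public checkpoint names for SuperFly or ChickBrain, so the sketch below uses the uncompressed SmolLM2-135M-Instruct base model from Hugging Face as a stand-in to show what fully offline generation in this size class looks like with the transformers library; the prompt mirrors the appliance example above.

```python
# Minimal sketch of on-device generation with a model in the SmolLM2-135M
# size class, using Hugging Face transformers. The uncompressed base model
# is a stand-in: the compressed SuperFly/ChickBrain checkpoints are not
# publicly named in the article.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-135M-Instruct"   # ~135M-parameter base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# An appliance-style command, as in the "start quick wash" example above.
messages = [{"role": "user", "content": "Start a quick wash for lightly soiled clothes."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Once the weights are cached locally, nothing in this loop touches the network, which is the point of shipping models small enough to live on the device itself.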

Why It Matters

While these models won’t dethrone the largest AI systems on global leaderboards, their advantage lies in size-to-performance efficiency. Smaller models can run offline, protect user privacy, and dramatically cut cloud processing costs.

Multiverse is already in discussions with Apple, Samsung, Sony, and HP to integrate its tech directly into consumer devices. It also serves corporate clients like BASF, Ally, Moody’s, and Bosch, and offers its compressed models through an AWS-hosted API with lower token costs than many competitors.

As AI increasingly moves out of the data center and into everyday devices, Multiverse Computing’s Model Zoo could mark a turning point in bringing powerful intelligence to the edge — whether that edge is a smart fridge, a laptop, or even a wristwatch.
