Tether Releases QVAC Genesis I, World’s Largest Synthetic Data Set to Train STEM-Focused AI Models, Alongside QVAC Workbench, a Comprehensive Local AI App

The largest and most advanced synthetic dataset ever created for AI training, with 41 billion tokens, aims to level the playing field, enabling open, community-driven intelligence to thrive outside Big Tech’s walls

24 October 2025 – Tether Data’s AI research division, QVAC, has released the largest synthetic dataset ever created for artificial intelligence training under a new initiative called QVAC Genesis. This first release, Genesis I, a massive collection of 41 billion text tokens, is designed to help the world build smarter, more capable, and highly precise STEM-focused language models. Each “text token” represents a tiny fragment of language, the building blocks that AI models use to understand and generate text. By training on 41 billion of these tokens from QVAC Genesis’s dataset, models grasp not just words, but the relationships and logic that connect them.

This dataset has been rigorously validated across educational and scientific benchmarks, demonstrating superior reasoning and problem-solving performance in subjects such as mathematics, physics, biology, and medicine. It is the first publicly available synthetic dataset built and rigorously validated specifically for educational content, offering comprehensive coverage across key STEM domains where today’s public training datasets fall short.

More than a technical milestone, this release is a statement about who should own the future of intelligence. As AI becomes increasingly centralized, with models trained, hosted, and controlled by a handful of corporations, QVAC Genesis I aims to return that power to the people by providing open, high-quality data to advance scientific research.

Tether Data also released its first consumer app today, QVAC Workbench, a comprehensive workspace that demonstrates the potential of local, on-device artificial intelligence. QVAC Workbench currently targets AI enthusiasts, advanced users, and researchers. It already supports a wide variety of LLMs and other AI models, including Llama, MedGemma, Qwen, SmolVLM, Whisper, and many more.

The app is available for smartphones (Android for now, and iOS within a few days) as well as desktop platforms (Windows, macOS, and Linux), providing the most comprehensive on-device support compared to existing offerings.

With QVAC Workbench, all chats and interactions with the AI models remain local on the device, where data is owned by the user and remains 100% private. It also offers a unique feature called “Delegated Inference,” which allows a user to connect their mobile Workbench app peer-to-peer with the Workbench desktop app to fully utilize the power and resources of their home or office workstations.

“Intelligence shouldn’t be centralized,” said Paolo Ardoino, CEO of Tether. “With QVAC Workbench and Genesis I, we’re opening the door to infinite intelligence, AI that lives, learns, and evolves locally on your own device. We believe that intelligence, like information, should be free, accessible, and owned by everyone, not locked behind corporate firewalls or sold as a service. Whether it’s a phone, a robot, or a wearable, intelligence should belong to the individual, not the institution. QVAC Genesis I represents a future where people, not platforms, control how knowledge is created, shared, and used. It’s about restoring balance, bringing intelligence back to the edge, where it belongs, and ensuring the freedom to build and learn is universal.”

By making the QVAC Genesis dataset public, we invite researchers to build and use models that can compete with, and even surpass, proprietary systems. In fact, our dataset was created using a multi-stage generation and validation process that turns high-quality scientific and educational materials into structured learning data. The result is a training resource that helps models reason, solve problems, and think critically, rather than merely imitate language.

“Most AI today sounds smart, but doesn’t truly think,” continued Paolo Ardoino, CEO of Tether. “We designed this dataset to help models understand cause and effect, to make connections, draw conclusions, and reason their way through complexity. And we’re making it open to everyone.”

The release of the first two QVAC projects is part of a broader mission to reshape how AI exists in the world, introducing a new paradigm of ‘local intelligence,’ where tools learn and evolve directly on any device. 

The full technical breakdown of the dataset, code-named QVAC Genesis I, is available now in the accompanying research blog post, “QVAC Genesis I: the Largest and Highest-Quality Multi-domain Educational Synthetic Dataset for Pre-training.”

QVAC Workbench apps can be downloaded from the qvac.tether.dev website.

About Tether Data

Tether Data, S.A. de C.V. (“Tether Data”) is part of Tether’s broader vision to advance freedom, transparency, and innovation through technology. Its mission is to enable people and organizations to connect and share information directly, without unnecessary intermediaries. By creating secure, peer-to-peer systems, Tether Data gives users greater control over their data, communications, and digital interactions. Tether Data aims to redefine how information flows across networks by replacing centralized models with decentralized infrastructure designed for privacy, efficiency, and resilience. The company’s goal is to make global connectivity faster, safer, and more private, empowering individuals and institutions alike to exchange information freely and securely.

About QVAC

QVAC is Tether Data’s advanced AI research initiative dedicated to building open, decentralized, and adaptive intelligence systems. Its mission, “Local AI. Infinite Intelligence. No Compromise,” envisions a world where AI lives and learns on any device, empowering individuals and communities rather than concentrating power in corporate data centers.
