Tether’s QVAC Launches World’s First Cross-Platform BitNet LoRA Framework to Enable Billion-Parameter AI Training and Inference on Consumer GPUs and Smartphones  

17 March 2026 – Tether today announced a breakthrough in AI model training with the launch of the world’s first cross-platform LoRA fine-tuning framework for Microsoft’s BitNet models (1-bit LLMs). This new capability, part of QVAC Fabric, dramatically reduces memory and compute requirements, enabling billion-parameter language models to be fine-tuned on everyday hardware, including laptops, consumer GPUs, and modern smartphones.

Developing and maintaining AI models has typically required enterprise-grade NVIDIA systems or cloud infrastructure, both of which have become prohibitively expensive. As a result, advanced AI development has been available almost exclusively to the largest organizations, those with access to specialized hardware and significant budgets.

Tether’s QVAC Fabric LLM, further enhanced with this new breakthrough BitNet-based framework, eliminates those barriers by enabling cross-platform LoRA fine-tuning and inference acceleration support across heterogeneous consumer GPUs, including Intel, AMD, Apple Silicon M chips, and others. This advancement enables users to train and customize AI models directly on widely available consumer devices. 

This achievement by Tether’s engineering team marks the first successful demonstration of BitNet fine-tuning on mobile GPUs, including Adreno, Mali, and Apple Bionic GPUs. Users can fine-tune 125M-parameter BitNet models in ~10 minutes on a Samsung S25 (Adreno GPU) using a biomedical dataset of ~300 documents (~18k tokens). For the 1B model, fine-tuning on the same biomedical data completes in 1 hour 18 minutes on the Samsung S25 and 1 hour 45 minutes on the iPhone 16. Pushing the devices to their limit, our team was able to fine-tune models of up to 13B parameters on the iPhone 16.

The framework also demonstrates the capability to fine-tune models that are 2 times larger on edge devices than Q4 non-BitNet models, showcasing the superior memory advantage of the BitNet architecture.

BitNet inference performance also sees a substantial boost through QVAC Fabric. On mobile devices, the BitNet family of models ran between two and eleven times faster on the GPU than on the CPU, showing that today’s mobile GPUs can support workloads that previously required expensive specialized hardware or data centers.

The memory savings are equally significant. Benchmarks show that BitNet-1B (TQ1_0) uses up to 77.8% less VRAM than Gemma-3-1B (16-bit) and 65.6% less than Qwen3-0.6B (16-bit) across both inference and LoRA fine-tuning workloads. These reductions create meaningful memory headroom, enabling larger models and personalization workflows to run on hardware that would have been considered insufficient just months ago.
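As a rough illustration of where savings of this kind come from, the sketch below estimates weight storage alone from a parameter count and a bits-per-weight figure. The bits-per-weight values (about 1.69 for a ternary TQ1_0-style format, about 4.5 for a Q4-style format) are assumptions chosen for illustration and are not taken from the benchmarks above. Measured VRAM also includes activations, caches, and any tensors kept at higher precision, so benchmark figures will differ from this weights-only estimate.

```python
def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Approximate storage for model weights only, in bytes."""
    return n_params * bits_per_weight / 8


def savings_pct(baseline_bytes: float, candidate_bytes: float) -> float:
    """Percentage reduction of candidate relative to baseline."""
    return 100 * (1 - candidate_bytes / baseline_bytes)


N_PARAMS = 1_000_000_000          # a hypothetical 1B-parameter model

# Assumed bits per weight for each format (illustrative, not measured).
FP16_BPW = 16.0
TERNARY_BPW = 1.6875              # assumption for a TQ1_0-style ternary format
Q4_BPW = 4.5                      # assumption for a Q4-style 4-bit format

fp16 = weight_bytes(N_PARAMS, FP16_BPW)
ternary = weight_bytes(N_PARAMS, TERNARY_BPW)
q4 = weight_bytes(N_PARAMS, Q4_BPW)

if __name__ == "__main__":
    print(f"16-bit weights:  {fp16 / 1e9:.2f} GB")
    print(f"ternary weights: {ternary / 1e9:.2f} GB")
    print(f"weights-only saving vs 16-bit: {savings_pct(fp16, ternary):.1f}%")
    print(f"Q4 / ternary size ratio: {q4 / ternary:.2f}x")
```

Under these assumptions, the ratio between Q4 and ternary weight storage is in the same range as the roughly twofold model-size advantage described above, though the real limit on a device depends on everything else that must fit in memory during fine-tuning.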

Additionally, the framework enables LoRA fine-tuning for 1-bit LLMs on non-NVIDIA hardware for the first time, expanding support across AMD, Intel, Apple Silicon, and mobile GPUs. By reducing dependence on specialized hardware and cloud providers, the system broadens access to AI fine-tuning while keeping sensitive data local to the device. This efficiency also makes federated learning realistic in the near future: fine-tuned updates can be trained on distributed devices and shared among them without user data leaving those devices or relying on centralized infrastructure.
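To give a sense of why LoRA updates are small enough to train and share from a phone, the sketch below counts the trainable parameters a LoRA adapter adds to a single weight matrix. It relies only on the standard LoRA formulation, in which a frozen matrix of shape d_out × d_in is adapted by two low-rank factors of rank r. The layer dimensions and rank used here are hypothetical and do not describe any specific BitNet model or QVAC Fabric setting.

```python
def full_matrix_params(d_in: int, d_out: int) -> int:
    """Parameters in the frozen base weight matrix."""
    return d_in * d_out


def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two LoRA factors: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


# Hypothetical layer size and rank, chosen only for illustration.
D_IN, D_OUT, RANK = 2048, 2048, 8

base = full_matrix_params(D_IN, D_OUT)
adapter = lora_trainable_params(D_IN, D_OUT, RANK)
fraction = adapter / base

if __name__ == "__main__":
    print(f"base matrix parameters:  {base:,}")
    print(f"LoRA adapter parameters: {adapter:,}")
    print(f"adapter / base:          {fraction:.4%}")
```

In this hypothetical case the adapter holds under 1% of the parameters of the matrix it modifies, which is why only the adapter, not the full model, would need to be updated or exchanged.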

“Intelligence will be a key determining factor in the future of society. It has the potential to improve the stability of society, serve as connective tissue, or further empower the few. The future of AI should be accessible, available, and open to people and builders everywhere, and it should not require an absurd amount of resources only available to a handful of cloud providers,” said Paolo Ardoino, CEO of Tether. “When training large language models depends on centralized infrastructure, innovation becomes stagnant, the ecosystem becomes fragile, and societal equilibrium is put at risk. By enabling meaningful large-model training on consumer hardware, including smartphones, Tether’s QVAC is proving that advanced AI can be decentralized, inclusive, and empowering for everyone. Tether will continue to invest significant resources and capital over the coming weeks, months, and years to ensure that AI becomes accessible to everyone, everywhere, locally on-device. The era of Stable Intelligence has just begun.”

Full technical details, including the paper, adapters, benchmarks, and cross-platform binaries, are available on the Hugging Face blog: LoRA Fine-Tuning BitNet b1.58 LLMs on Heterogeneous Edge GPUs via QVAC Fabric.



About Tether 

Tether’s vision is to advance freedom, transparency, and innovation through technology. Its mission is to enable people and organizations to connect and share information directly, without unnecessary intermediaries. By creating secure, peer-to-peer systems, Tether gives users greater control over their data, communications, and digital interactions. 

Tether aims to redefine how information flows across networks by replacing centralized models with decentralized infrastructure designed for privacy, efficiency, and resilience. The company’s goal is to make global connectivity faster, safer, and more private, empowering individuals and institutions alike to exchange information freely and securely.
