FuriosaAI Ends 2024 on a High Note: Llama 3.1 Performance, SDK Release, Leadership Expansion
Friday, December 27, 2024

SANTA CLARA, Calif., Dec. 19, 2024 /PRNewswire/ -- FuriosaAI, an emerging leader in AI semiconductor solutions, is closing out the year with rapid technical and customer progress on its second-generation chip, RNGD (pronounced 'Renegade'). The recently announced AI solution has achieved compelling performance metrics in real-world enterprise deployments, meeting the demand for inference with advanced large language and multimodal models.

The new performance benchmarks showcase RNGD's ability to meet industry-leading throughput demands for Llama 3.1 models, including the 8B and 70B variants, with additional optimizations already in progress. The company also announced key software features that bring advanced optimization for customers currently sampling RNGD hardware in their production environments. These achievements represent the first phase of Furiosa's vision for AI infrastructure that overcomes the inherent limitations of GPUs.

RNGD delivers winning throughput metrics with Llama 3.1 8B and 70B:

Building on the AI-native Tensor Contraction Processor (TCP) architecture of RNGD, Furiosa is redefining real-world AI deployments, delivering unmatched performance, programmability, and power efficiency. Furiosa's RNGD recently achieved a throughput of 3,200-3,300 tokens per second (TPS) when running the Llama 3.1 8B model. In single-user scenarios, RNGD consistently delivers 40-60 TPS.

Additionally, RNGD demonstrates exceptional power efficiency, consuming 181W per card, with further optimization efforts underway. Rather than pushing per-user speed as high as possible, the company aims to keep per-user performance above typical text-reading speeds (10-20 TPS or higher) while optimizing throughput for multi-user environments.
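The throughput and power figures above can be combined into a rough energy-efficiency estimate. The sketch below assumes the 3,200-3,300 TPS figure was measured on a single card drawing the quoted 181W; the release does not state this explicitly, so treat the result as illustrative only. The function name is hypothetical.

```python
# Figures quoted in the release for Llama 3.1 8B on RNGD.
THROUGHPUT_TPS = (3200, 3300)  # aggregate tokens per second (low, high)
CARD_POWER_W = 181             # watts per card

def tokens_per_joule(tps: float, watts: float) -> float:
    """Tokens generated per joule of energy (1 W = 1 J/s)."""
    return tps / watts

# Assumption: the throughput range applies to one card at 181 W.
low, high = (tokens_per_joule(t, CARD_POWER_W) for t in THROUGHPUT_TPS)
print(f"{low:.1f}-{high:.1f} tokens per joule")
# prints: 17.7-18.2 tokens per joule
```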

Furiosa is also advancing the performance and efficiency of the Llama 3.1 70B model, which can be executed effectively with just two RNGD cards. Currently, a single server supports up to 100 concurrent user queries, with ongoing optimizations aiming to achieve 8,000 TPS per server when equipped with 8 RNGD cards.
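To put the server-level target in per-card terms, the arithmetic can be sketched as follows. This assumes the 8,000 TPS target refers to the 70B model and that an 8-card server would host independent two-card model instances; neither detail is confirmed in the release, and the variable names are illustrative.

```python
# Figures quoted in the release.
TARGET_SERVER_TPS = 8000    # optimization target, tokens per second per server
CARDS_PER_SERVER = 8
CARDS_PER_70B_INSTANCE = 2  # two cards are enough to run the 70B model

# Share of the target carried by each card.
per_card_tps = TARGET_SERVER_TPS / CARDS_PER_SERVER

# Assumption: the server is partitioned into independent two-card instances.
instances = CARDS_PER_SERVER // CARDS_PER_70B_INSTANCE
per_instance_tps = TARGET_SERVER_TPS / instances

print(per_card_tps, instances, per_instance_tps)
# prints: 1000.0 4 2000.0
```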

With the release of SDK v2024.3.0, Furiosa will expand the range of preloaded models. The SDK will also include support for tensor parallelism, enabling seamless processing across multiple elements without requiring model modifications, and torch.compile support, providing the foundation for executing customized models. Integration with HuggingFace Optimum will further empower customers to leverage a broader variety of models.

Advanced optimization tools delivered to early RNGD customers:

Building on these milestones, domestic and global enterprise customers are conducting tests with Furiosa to find a more efficient solution for scaling the inference of their self-developed models, compared to their existing setup. Their objective is to manage TCO effectively as they prepare for large-scale AI adoption. Furiosa plans to provide a high-quality AI development environment through a powerful and user-friendly SDK optimized for RNGD. The SDK v2024.1.0, currently available through the Early Access Program (EAP), is designed to handle high-performance processing of multiple LLM serving requests. It incorporates optimization techniques such as PagedAttention, Block KV Cache, and Continuous Batching, while also supporting various token sampling methods, including Greedy, Beam Search, and Top-k/p. These features allow developers to seamlessly create AI services customized to meet a wide range of requirements. The SDK and online sample will be available after the release of v2024.3.0.
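For readers unfamiliar with the token sampling methods named above, here is a minimal, generic sketch of greedy and top-k/top-p sampling in plain Python. It is not Furiosa's SDK API; all function names are hypothetical, and beam search is omitted for brevity.

```python
import math
import random

def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def greedy(logits):
    """Greedy decoding: always pick the highest-scoring token id."""
    return max(range(len(logits)), key=lambda i: logits[i])

def top_k_top_p_filter(logits, k=0, p=1.0):
    """Keep the k most likely tokens (k=0 disables the limit), then the
    smallest prefix whose cumulative probability reaches p; renormalize."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if k > 0:
        ranked = ranked[:k]
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

def sample(logits, k=0, p=1.0, rng=random):
    """Draw one token id from the filtered distribution."""
    dist = top_k_top_p_filter(logits, k, p)
    ids = list(dist)
    return rng.choices(ids, weights=[dist[i] for i in ids], k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0]    # toy scores for a 4-token vocabulary
print(greedy(logits))              # prints: 0
print(sorted(top_k_top_p_filter(logits, k=2)))  # prints: [0, 1]
```

Setting k=1 reduces sampling to greedy decoding, while lower values of p restrict the choice to the few most likely tokens.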

Furiosa remains committed to delivering the most sustainable AI deployment solutions through rigorous optimization at an unprecedented pace.

"With RNGD now in customers' hands, we are accelerating the next generation of frontier LLMs to unlock emerging Agentic AI applications--bringing advanced reasoning capabilities to enterprise verticals, all at dramatically lower costs," said June Paik, Co-Founder and CEO of FuriosaAI.

Furiosa Expands Global Footprint with Strategic Leadership Appointment

Furiosa is scaling production and expanding its leadership team with the appointment of Alex Liu as Senior Vice President of Product and Business. A Technology Emmy Award winner and co-founder of NETINT Technologies, Alex brings over 20 years of expertise in startup management, technology innovation, and strategic leadership. At NETINT, he spearheaded groundbreaking achievements, including the development of the world's first VPU SoC, setting new industry benchmarks and securing the prestigious 2024 Technology Emmy Award. At Furiosa, Alex will lead global product management, go-to-market strategies, and partnerships to drive innovation and align the company's AI-native technologies with a vision to empower the development of planet-scale AI infrastructure.

RNGD is currently sampling with customers, and mass production will ramp up in partnership with TSMC for 2025 availability. To learn more about Furiosa, please visit https://furiosa.ai/.

About FuriosaAI

FuriosaAI is a semiconductor company dedicated to creating sustainable AI computing solutions that make powerful AI accessible to all. With its innovative Tensor Contraction Processor architecture, FuriosaAI is revolutionizing the AI hardware landscape, offering unparalleled efficiency and programmability for the most demanding AI workloads. For more information, please visit https://furiosa.ai/.

View original content to download multimedia: https://www.prnewswire.com/news-releases/furiosaai-ends-2024-on-a-high-note-llama-3-1-performance-sdk-release-leadership-expansion-302336756.html

SOURCE FuriosaAI
