Nvidia Launches First Groq LPX Chip: Up to 35x Improvement in Inference Efficiency Per Megawatt When Combined with Vera Rubin, and Demonstrates Next-Generation Kyber Prototype

According to CoinWorld, citing monitoring by 1M AI News, the Groq 3 LPU (Language Processing Unit) is the first chip NVIDIA has launched since acquiring the AI inference chip startup Groq last December for approximately $20 billion; it is expected to begin shipping in the third quarter of this year. The Groq 3 LPX rack holds 256 LPUs, equipped with 128 GB of on-chip SRAM and 640 TB/s of interconnect bandwidth.

Officially, when deployed alongside Vera Rubin NVL72, LPX can raise inference throughput per megawatt by up to 35 times, unlocking revenue potential for trillion-parameter and multi-million-token-context inference scenarios. Jensen Huang described the two processors as “extremely different yet unified: one pursuing high throughput, the other low latency,” with LPX’s on-chip memory significantly expanding the total memory available to models. The LPX rack is planned to launch later this year alongside the Vera Rubin platform.

At the conference, Huang also showcased a prototype of the next-generation rack architecture, codenamed Kyber. Kyber rearranges 144 GPU compute trays vertically to improve physical density and reduce latency, and will be integrated into the successor platform, Vera Rubin Ultra, expected to be released in 2027.
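As a rough illustration of the "throughput per megawatt" metric behind the up-to-35x claim, the sketch below simply normalizes token throughput by facility power draw. The baseline figure of 1 million tokens per second at 1 MW is invented for the example; only the 35x multiplier comes from the announcement.

```python
def throughput_per_megawatt(tokens_per_second: float, power_megawatts: float) -> float:
    """Inference throughput normalized by power draw (tokens/s per MW)."""
    return tokens_per_second / power_megawatts

# Hypothetical baseline deployment: 1M tokens/s at a 1 MW power budget
# (an invented figure, used only to make the ratio concrete).
baseline = throughput_per_megawatt(1_000_000, 1.0)

# The claimed "up to 35x" uplift with LPX + Vera Rubin NVL72 would mean
# 35M tokens/s from the same 1 MW budget, all else equal.
combined = 35 * baseline
print(f"baseline: {baseline:,.0f} tok/s/MW; with LPX: {combined:,.0f} tok/s/MW")
```

The point of the metric is that it folds both performance and power cost into a single number, which is why the announcement quotes the gain per megawatt rather than raw throughput.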
