DeepSeek Open-Sources TileKernels, GPU Kernel Library for Large Model Training and Inference

Gate News message, April 23: DeepSeek has open-sourced TileKernels, a GPU kernel library written in TileLang for large language model training and inference, under the MIT license. TileLang is a domain-specific language developed by the tile-ai team for expressing high-performance GPU kernels in Python. DeepSeek stated that most kernels in the library approach hardware performance limits in compute density and memory bandwidth, and that some are already deployed in its internal training and inference workloads.

The library comprises six categories of kernels: MoE (mixture of experts) gating and routing, including Top-k expert selection, token-to-expert mapping, and fused expand/shrink with weight normalization; quantization supporting FP8, FP4, and E5M6 formats with per-token, per-block, and per-channel quantization, including fused SwiGLU+quantization operations; batch transpose; Engram gating with fused RMSNorm forward/backward propagation and weight gradient reduction; Manifold HyperConnection with Sinkhorn normalization and mixed split/apply; and high-level autograd interfaces that wrap low-level kernels into trainable layers.
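To make the first category concrete, the following is a minimal pure-Python sketch of Top-k expert selection with weight normalization and token-to-expert mapping. It is not TileKernels code and does not reflect the library's API. The function names `top_k_gating` and `token_to_expert_map`, and the choice of a softmax over router logits, are assumptions made for illustration only.

```python
import math


def top_k_gating(logits, k):
    """Pick the k highest-scoring experts for one token and renormalize their weights.

    logits: router scores for one token, one per expert (hypothetical input).
    Returns (expert_ids, weights); the weights sum to 1.
    """
    if not 0 < k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Numerically stable softmax over all experts (assumed scoring function).
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Top-k selection: indices of the k largest probabilities, best first.
    expert_ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    # Weight normalization: rescale the selected probabilities so they sum to 1.
    selected_sum = sum(probs[i] for i in expert_ids)
    weights = [probs[i] / selected_sum for i in expert_ids]
    return expert_ids, weights


def token_to_expert_map(expert_ids_per_token, num_experts):
    """Invert per-token expert choices into a per-expert list of token indices."""
    mapping = [[] for _ in range(num_experts)]
    for token_idx, expert_ids in enumerate(expert_ids_per_token):
        for expert in expert_ids:
            mapping[expert].append(token_idx)
    return mapping
```

A GPU kernel library would fuse these steps and run them over whole batches of tokens in parallel. The sketch only shows the logic for one token at a time.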

Engram and Manifold HyperConnection are proprietary components of DeepSeek’s model architecture, with implementation details disclosed publicly for the first time. The library requires NVIDIA SM90 or SM100 architecture GPUs (H100/H200 or Blackwell series), CUDA Toolkit 13.1 or higher, and PyTorch 2.10 or higher.
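The stated requirements can be checked before installation with a short script. The sketch below is an illustration and not part of TileKernels. The helper names `major_minor` and `check_environment` are hypothetical, and treating the supported architectures as exactly SM90 and SM100 is an assumption based on the requirements listed above. Versions are compared as integer tuples because a plain string comparison would rank "2.10" below "2.9".

```python
import re

# Requirements as reported in the article.
SUPPORTED_SM = {(9, 0), (10, 0)}  # SM90 and SM100
MIN_CUDA = (13, 1)
MIN_TORCH = (2, 10)


def major_minor(version):
    """Extract (major, minor) as integers from a string such as '2.10.0+cu131'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def check_environment(compute_capability, cuda_version, torch_version):
    """Return a list of unmet requirements; an empty list means all checks pass."""
    problems = []
    if tuple(compute_capability) not in SUPPORTED_SM:
        problems.append(f"GPU compute capability {compute_capability} is not SM90 or SM100")
    if major_minor(cuda_version) < MIN_CUDA:
        problems.append(f"CUDA Toolkit {cuda_version} is older than 13.1")
    if major_minor(torch_version) < MIN_TORCH:
        problems.append(f"PyTorch {torch_version} is older than 2.10")
    return problems
```

On a machine with PyTorch installed, the three arguments could be taken from `torch.cuda.get_device_capability()`, `torch.version.cuda`, and `torch.__version__`.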

