DeepSeek reveals new model "MODEL1" on the first anniversary of DeepSeek-R1
BlockBeats News, January 21: According to Quantum Bit, DeepSeek revealed a new model, "MODEL1", on the first anniversary of DeepSeek-R1's release. DeepSeek updated the FlashMLA code on GitHub, where MODEL1 is mentioned in 28 places across 114 files and is treated as a model distinct from V32. Since V32 refers to DeepSeek-V3.2, MODEL1 is likely a new architecture. The differences visible in the code concern the KV cache layout, sparsity handling, and FP8 decoding, along with several changes to memory optimization.
