Cursor AI Accused of Hiding Model Origins
Cursor, a rapidly growing AI code editor, is facing controversy after releasing its new AI model, Composer 2. The company is accused of not being upfront about the model’s origins, sparking debate within the AI community.
Composer 2: A Powerful New AI Model
Cursor launched Composer 2, touting it as a “frontier level” AI for coding that is significantly cheaper than other top-tier models. Initial reactions were overwhelmingly positive, with many users impressed by its capabilities. The model was presented as Cursor’s own creation, built from the ground up.
The Allegation: Based on Kimi K2.5?
The situation took a turn when a user named Finn raised concerns, suggesting that Composer 2 was actually a fine-tuned version of Kimi K2.5, an open-source model developed by the Chinese company Moonshot AI (also known as Kimi AI). Finn claimed that Composer 2 was essentially Kimi K2.5 with added reinforcement learning. While Composer 1.5 seemed to have masked its origins more effectively, Composer 2’s connection to Kimi K2.5 became harder to hide.
What is Kimi K2.5?
Kimi K2.5 is a powerful open-source AI model from China, known for its strong performance in agentic tasks. Although it is open-source, it ships under a modified MIT license. This license has specific requirements: companies with over 100 million monthly active users or over $20 million in monthly revenue must publicly disclose their use of the model. Smaller companies or those using it for personal projects have more flexibility, but large commercial entities have disclosure obligations.
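The disclosure rule described above reduces to a simple either-or threshold check. The sketch below is only an illustration of that logic as this article summarizes it, not the license text itself; the `Deployment` class and `must_disclose` function are hypothetical names introduced here, and the thresholds are read as strictly "over" the stated figures.

```python
from dataclasses import dataclass

# Thresholds as summarized in the article; illustrative only, not legal advice.
MAU_THRESHOLD = 100_000_000         # monthly active users
REVENUE_THRESHOLD_USD = 20_000_000  # monthly revenue in US dollars


@dataclass
class Deployment:
    """Hypothetical description of a product built on the model."""
    monthly_active_users: int
    monthly_revenue_usd: float


def must_disclose(d: Deployment) -> bool:
    """Return True if either threshold is exceeded (strictly greater than)."""
    return (
        d.monthly_active_users > MAU_THRESHOLD
        or d.monthly_revenue_usd > REVENUE_THRESHOLD_USD
    )


if __name__ == "__main__":
    hobby_project = Deployment(monthly_active_users=1_000, monthly_revenue_usd=0)
    large_vendor = Deployment(monthly_active_users=2_000_000, monthly_revenue_usd=50_000_000)
    print(must_disclose(hobby_project))  # small project: no obligation under this reading
    print(must_disclose(large_vendor))   # revenue alone triggers the obligation
```

Note that the two conditions are joined by "or": a company with few users but high revenue still falls under the disclosure obligation.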
Cursor’s Response and the License Issue
Initially, Cursor did not mention Kimi K2.5 as the base for Composer 2. Lee Robinson, who works for Cursor, stated that Composer 2 started from an open-source base and that Cursor planned to do full pre-training in the future. He explained that only about a quarter of the compute for the final model came from the base, with the rest being Cursor’s own training, including reinforcement learning using vast amounts of user data. He also mentioned they were following the license terms through their “inference partner.”
The situation escalated when an employee from Kimi AI posted (and later deleted) that their tests confirmed Composer 2 used the same tokenizer as Kimi K2.5. This employee stated they were “shocked that Cursor AI did not respect our license, nor did they pay us any fees.”
Following this, Kimi AI released a more conciliatory statement, congratulating Cursor on Composer 2 and expressing pride that Kimi K2.5 provided the foundation. The statement described Cursor’s continued pre-training and reinforcement learning on top of the model as the type of open model integration the company supports. It appears Cursor accessed Kimi K2.5 through Fireworks AI, a hosted inference platform, as part of a commercial partnership.
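The deleted post did not say how the tokenizer test was run. One plausible way to compare two tokenizers, shown here purely as an assumption, is to check how many tokens map to the same ID in both vocabularies. The function name `vocab_agreement` and the tiny example vocabularies are hypothetical; real vocabularies contain on the order of a hundred thousand entries or more.

```python
def vocab_agreement(vocab_a: dict[str, int], vocab_b: dict[str, int]) -> float:
    """Fraction of tokens (over the union of both vocabularies) with identical IDs."""
    all_tokens = set(vocab_a) | set(vocab_b)
    if not all_tokens:
        return 1.0  # two empty vocabularies trivially agree
    same = sum(
        1
        for token in all_tokens
        if token in vocab_a and token in vocab_b and vocab_a[token] == vocab_b[token]
    )
    return same / len(all_tokens)


if __name__ == "__main__":
    # Toy vocabularies standing in for real token-to-ID tables.
    base_model = {"def": 0, "return": 1, " ": 2}
    fine_tuned = dict(base_model)                    # fine-tuning usually keeps the tokenizer
    unrelated = {"def": 0, "return": 5, "x": 2}

    print(vocab_agreement(base_model, fine_tuned))   # identical vocabularies
    print(vocab_agreement(base_model, unrelated))    # mostly different
```

A score at or near 1.0 is strong circumstantial evidence of a shared base, because a model trained from scratch would normally come with its own independently built vocabulary.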
Why the Secrecy? Geopolitics and Perception
Several reasons likely contributed to Cursor’s initial reluctance to disclose the Kimi K2.5 connection. First, for a company valued at nearly $30 billion and aiming to be seen as a serious AI research firm, building its own models is crucial to its image. Being perceived as merely a wrapper for other companies’ technology could be detrimental. Second, the geopolitical climate between the US and China plays a significant role. For a US-based company, especially one serving enterprise clients, building on a model from a Chinese company could be a public relations headache and raise concerns about data security and infrastructure ties, particularly given the ongoing AI race narrative.
Cursor’s Technical Contributions: Self-Summarization
Despite the controversy over attribution, Cursor’s technical work on Composer 2 is significant. The company claims that three-quarters of the compute power used went into their own training and innovations. A key innovation highlighted is “self-summarization.” This technique allows the AI model to pause mid-task, summarize its progress and current understanding (up to about 1,000 tokens), and then continue with this condensed context. This is crucial for handling very long and complex coding tasks that might exceed a model’s standard context window.
This self-summarization process is part of their reinforcement learning strategy. Poor summaries are downweighted, while good ones are reinforced, improving the model’s ability to manage information over extended tasks. Cursor demonstrated this capability by solving the challenging “Make Doom for MIPS” benchmark, which involves getting the classic game Doom to run on MIPS, an older processor instruction set. This task required extensive code testing and engineering, pushing the limits of context handling, and Composer 2 reportedly handled it by condensing more than 100,000 tokens of working context into self-summaries of roughly 1,000 tokens.
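The control flow behind self-summarization can be sketched as a loop that compacts its context whenever the next step would overflow the window. This is a minimal illustration under stated assumptions, not Cursor’s implementation: `run_with_self_summary`, `count_tokens`, and `truncate_summary` are hypothetical names, tokens are approximated by whitespace-separated words, and the toy summarizer simply keeps the most recent words where a real system would have the model write the summary.

```python
from typing import Callable, List


def count_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


def run_with_self_summary(
    steps: List[str],
    summarize: Callable[[str, int], str],
    context_limit: int = 8000,
    summary_budget: int = 1000,
) -> str:
    """Accumulate steps in a rolling context, compacting it before it overflows.

    `summarize(text, budget)` is a hypothetical hook; in a real system the
    model itself would produce the summary.
    """
    context = ""
    for step in steps:
        candidate = f"{context}\n{step}".strip()
        if count_tokens(candidate) > context_limit:
            # Pause: replace the accumulated context with a short summary,
            # then continue the task from that condensed state.
            summary = summarize(context, summary_budget)
            candidate = f"{summary}\n{step}".strip()
        context = candidate
    return context


def truncate_summary(text: str, budget: int) -> str:
    """Toy summarizer: keep only the most recent `budget` words."""
    return " ".join(text.split()[-budget:])


if __name__ == "__main__":
    work_log = [f"step {i}: edited file and ran tests" for i in range(50)]
    final_context = run_with_self_summary(
        work_log, truncate_summary, context_limit=40, summary_budget=10
    )
    print(count_tokens(final_context) <= 40)
```

The sketch assumes no single step is larger than the context limit. The reinforcement learning part described above, rewarding summaries that let the task succeed and penalizing those that lose needed detail, is outside what this loop shows.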
Why This Matters
This incident highlights the complex dynamics of the open-source AI landscape. While companies like Moonshot AI release models to foster innovation, attribution and licensing compliance remain critical. For startups like Cursor, leveraging existing powerful models can accelerate development and reduce costs, especially when aiming for high-value enterprise markets. However, transparency about these foundations is essential for community trust and ethical practice. The geopolitical aspect also shows how international relations can influence technology development and adoption. Ultimately, the episode reflects the evolving nature of AI development, where adapting, fine-tuning, and productizing open-source models are becoming as important as training them from scratch.
The Future of Open Source AI
Clement Delangue, co-founder and CEO of Hugging Face, commented that open source continues to be a major enabler of competition and that Chinese open-source models are now a significant force. He noted that the frontier of AI is shifting from solely training from scratch to adapting, fine-tuning, and rapidly productizing existing models. While Cursor faced criticism for its initial lack of transparency, its technical innovations and contributions to the open-source ecosystem are also recognized. The hope is that companies will continue to build upon open-source foundations, while also providing proper attribution and contributing back to the community.
Source: Cursor is CAUGHT red handed… (YouTube)