AI reasoning models can cheat to win chess games

May Be Interested In:Rams will face the Jacksonville Jaguars in London this season


Palisade’s team found that OpenAI’s o1-preview attempted to hack 45 of its 122 games, while DeepSeek’s R1 model attempted to cheat in 11 of its 74 games. Ultimately, o1-preview managed to “win” seven times. The researchers say that DeepSeek’s rapid rise in popularity meant its R1 model was overloaded at the time of the experiments, meaning they only managed to get it to do the first steps of a game, not to finish a full one. “While this is good enough to see propensity to hack, this underestimates DeepSeek’s hacking success because it has fewer steps to work with,” they wrote in their paper. Both OpenAI and DeepSeek were contacted for comment about the findings, but neither replied. 

The models used a variety of cheating techniques, including attempting to access the file where the chess program stores the chess board and delete the cells representing their opponent’s pieces. (“To win against a powerful chess engine as black, playing a standard game may not be sufficient,” the o1-preview-powered agent wrote in a “journal” documenting the steps it took. “I’ll overwrite the board to have a decisive advantage.”) Other tactics included creating a copy of Stockfish—essentially pitting the chess engine against an equally proficient version of itself—and attempting to replace the file containing Stockfish’s code with a much simpler chess program.

So, why do these models try to cheat?

The researchers noticed that o1-preview’s actions changed over time. It consistently attempted to hack its games in the early stages of their experiments before December 23 last year, when it suddenly started making these attempts much less frequently. They believe this might be due to an unrelated update to the model made by OpenAI. They tested the company’s more recent o1mini and o3mini reasoning models and found that they never tried to cheat their way to victory.

Reinforcement learning may be the reason o1-preview and DeepSeek R1 tried to cheat unprompted, the researchers speculate. This is because the technique rewards models for making whatever moves are necessary to achieve their goals—in this case, winning at chess. Non-reasoning LLMs use reinforcement learning to some extent, but it plays a bigger part in training reasoning models.

share Share facebook pinterest whatsapp x print

Similar Content

Calling all fashion models … now AI is coming for you
Calling all fashion models … now AI is coming for you
Against the dark background of space, pale, misty clouds of gas and dust are liberally sprinkled with blue and pink stars. Some bright pink patches of clustered stars are visible, as are dark brown knots of dust.
Hubble Unveils a Glittering View of Sh2-284 – NASA Science
When Patching Isn’t Enough
Donald Trump and Melania Trump pay their respects at Pope Francis' funeral
Donald Trump and Melania Trump pay their respects at Pope Francis’ funeral
Tariffs Could Give Tesla and Musk a Leg Up on Rivals
Tariffs Could Give Tesla and Musk a Leg Up on Rivals
The influencer lawsuit that could change the industry
The influencer lawsuit that could change the industry

Leave a Reply

Your email address will not be published. Required fields are marked *

The Information Edge: Stay Ahead of the Curve | © 2025 | Daily News