ForkLog

Four Out of Six AI Models Suffer Losses in Trading Tournament

The first season of Alpha Arena, a trading tournament among popular AI models, concluded on November 3. Four of the six models finished in the red, according to results shared by nof1 lab founder Jay A. Zhang.

The standings are as follows:

  1. Qwen3 MAX took first place with a balance of $12,231.
  2. DeepSeek came in second with $10,489.
  3. Claude Sonnet 4.5 secured third place with $5,799.
  4. Gemini 2.5 Pro placed fourth with $5,445.
  5. Grok finished fifth with $4,208.
  6. GPT 5 came last with $4,126.

The figures on the tournament’s website paint an even bleaker picture:

[Image: figures from the tournament's website. Source: nof1.]

The competition began on October 18, with each model given $10,000. At one point, DeepSeek surged into the lead, earning over $13,000 net. However, a market downturn then reduced its balance.
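For scale, the balances in the standings can be converted into percentage returns on the $10,000 starting stake. The Python sketch below is illustrative only: it assumes the figures Zhang shared are final account values, and the names `FINAL_BALANCES`, `return_pct`, and `summarize` are hypothetical helpers, not part of any nof1 tooling.

```python
# Final balances (USD) as listed in the standings shared by Jay A. Zhang.
# Assumption: these are final account values at the end of the season.
FINAL_BALANCES = {
    "Qwen3 MAX": 12_231,
    "DeepSeek": 10_489,
    "Claude Sonnet 4.5": 5_799,
    "Gemini 2.5 Pro": 5_445,
    "Grok": 4_208,
    "GPT 5": 4_126,
}

# Each model started the season with $10,000.
STARTING_BALANCE = 10_000


def return_pct(final_balance, start=STARTING_BALANCE):
    """Return the percentage gain or loss relative to the starting balance."""
    return (final_balance - start) / start * 100


def summarize(balances, start=STARTING_BALANCE):
    """Map each model name to its return in percent, rounded to 2 decimals."""
    return {name: round(return_pct(bal, start), 2) for name, bal in balances.items()}


if __name__ == "__main__":
    for name, pct in summarize(FINAL_BALANCES).items():
        print(f"{name}: {pct:+.2f}%")
```

On these figures, only Qwen3 MAX and DeepSeek end above their starting balance; the other four models each lose more than 40% of the stake.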

“We also intentionally put the models in a difficult position. LLMs do not handle numerical time series data well, but that was the entire context we provided them. They were given a limited set of assets and a rather narrow action space,” Zhang noted.

In the next season, the team plans to implement “numerous improvements” and will test several prompts in parallel, as well as different variations of each model.

“Our goal with Alpha Arena is to make the tests more like the real world, and markets are perfect for this. They are dynamic, competitive, open, and infinitely unpredictable. Investment platforms challenge AI in ways that static tests cannot,” states the nof1 website.

Previously, a schoolboy from rural Oklahoma let ChatGPT manage $100 and outperformed the market by a significant margin.
