6 Major AI Crypto Trading Practices: Who Profits and Who Loses? The Results Are Unexpected!

Bitpush, 2025/10/20 09:12


By:深潮 TechFlow

Written by: David, TechFlow, Deep Tide

Original Title: Six Major AIs Engage in a Trading Battle—Will the Crypto Version of the "Turing Test" Yield Good Results?

Good news: After the epic crash on October 11, crypto trading has become active again.

Bad news: It's AI that's trading.

As the new week begins, the market is heating up, and a project called nof1.ai has sparked widespread discussion on crypto social media.

The focus is simple: people are watching the project's six large AI models trade crypto on Hyperliquid in real time to see which one makes the most profit.

Note, this is not a simulation. Claude, GPT-5, Gemini, Deepseek, Grok, and Qwen (Tongyi Qianwen) each have $10,000 in real funds trading on Hyperliquid. All addresses are public, and anyone can watch this "AI trader battle" in real time.

Interestingly, all six AIs use exactly the same prompts and receive exactly the same market data. The only variable is their respective "ways of thinking."

Within just a few days of launching on October 18, some AIs have already made over 20% profit, while others have lost nearly 40%.

In 1950, Turing proposed the famous Turing Test, attempting to answer the question, "Can machines think like humans?" Now, in the crypto world, six major AIs are battling in the Alpha Arena, answering an even more interesting question:

If the smartest AIs are allowed to trade in real markets, who will survive?

Perhaps in this crypto version of the "Turing Test," account balance is the only judge.

Only Profitable AIs Are Good AIs—Deepseek Currently Leads

Traditional AI evaluations, whether having models write code, solve math problems, or write articles, are essentially tests in a "static" environment.

The questions are fixed, the answers are predictable, and may even have appeared in the training data.

But the crypto market is different.

With extremely asymmetric information, prices change every second, and there are no standard answers—only profit or loss. More importantly, the crypto market is a classic zero-sum game: your profit is someone else's loss. The market punishes every wrong decision immediately and mercilessly.

The Nof1 team, which organized this AI trading battle, wrote on their website:

Markets are the ultimate test of intelligence.

If the traditional Turing Test asks, "Can you make humans unable to tell you're a machine?" then the Alpha Arena is really asking:

Can you make money in the crypto market? This is actually the real expectation crypto players have for AI.

The Hyperliquid addresses of the six large AI models are public, so you can easily look up their positions and trading records.

Meanwhile, the nof1.ai official website also visualizes all their historical trades, positions, profits, and thought processes on the front end, making it easy for everyone to reference.

For readers who are completely unfamiliar, the specific trading rules for the AIs are:

Each AI receives $10,000 in initial capital and can trade perpetual contracts for BTC, ETH, SOL, BNB, DOGE, and XRP. The goal is to maximize returns while controlling risk. All AIs must independently decide when to open and close positions and how much leverage to use. Season 1 will run for several weeks depending on circumstances, and Season 2 will feature major updates.

As of October 20, the third day after trading began, the battle has already shown clear differentiation.

The current leader is Deepseek Chat V3.1, with $12,533 (+25.33%). Next is Grok-4 at $12,147 (+21.47%), followed by Claude Sonnet 4.5 at $11,047 (+10.47%).

Qwen3 Max is performing relatively average at $10,263 (+2.63%). Significantly lagging is GPT-5, with a current balance of $7,442 (-25.58%); the worst performer is Gemini 2.5 Pro, $6,062 (-39.38%).
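
The percentages above follow directly from the reported balances and the $10,000 starting capital. Below is a minimal Python sketch of that arithmetic; the dictionary layout and the function name `return_pct` are illustrative assumptions, not anything published by nof1.ai.

```python
# Balances (USD) as reported in this article for October 20.
START_CAPITAL = 10_000

balances = {
    "Deepseek Chat V3.1": 12_533,
    "Grok-4": 12_147,
    "Claude Sonnet 4.5": 11_047,
    "Qwen3 Max": 10_263,
    "GPT-5": 7_442,
    "Gemini 2.5 Pro": 6_062,
}


def return_pct(balance: float, start: float = START_CAPITAL) -> float:
    """Simple return since launch, in percent, rounded to two decimals."""
    return round((balance - start) / start * 100, 2)


# Compute each model's return and rank from best to worst.
returns = {name: return_pct(balance) for name, balance in balances.items()}
leaderboard = sorted(returns.items(), key=lambda item: item[1], reverse=True)

for name, pct in leaderboard:
    print(f"{name:<20} {pct:+.2f}%")

# Gap between the best and worst performer, in percentage points.
spread = round(leaderboard[0][1] - leaderboard[-1][1], 2)
print(f"Spread: {spread:.2f} percentage points")
```

Run as written, this reproduces the figures quoted above and puts the gap between Deepseek and Gemini at 64.71 percentage points.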

The most surprising yet seemingly reasonable result is, of course, Deepseek's performance.

It's surprising because this model is far less popular internationally than GPT and Claude. It's reasonable because Deepseek is backed by High-Flyer Quant, a quantitative trading firm.

This quantitative giant, which manages over 100 billion RMB, started out in algorithmic trading before moving into AI. From quantitative trading to large AI models, and now to using AI for real crypto trading, Deepseek seems to have returned to its roots.

By contrast, OpenAI's proud GPT-5 has lost over 25%, and Google's Gemini is even worse, with 44 trades resulting in nearly 40% losses.

In real trading scenarios, perhaps strong language ability alone is not enough—understanding the market is even more important.

Same Gun, Different Shooting Styles

If you've been following Alpha Arena since October 18, you'll notice that at first, the AIs performed similarly, but the gap widened over time.

At the end of the first day, the best performer, Deepseek, had only made a 4% profit, while the worst, Qwen3, lost 5.26%. Most AIs hovered within ±2%, seemingly testing the market.

But by October 20, things had changed dramatically. Deepseek soared to +25.33%, while Gemini dropped to -39.38%. In just three days, the gap between the top and bottom widened to nearly 65 percentage points.

Even more interesting is the difference in trading frequency.

Gemini completed 44 trades in three days, averaging about 15 per day, like an anxious speculative trader. Claude made only 3 trades, and Grok still has open positions. This difference can't be explained by prompts, since all six models use the same set of prompts.

Looking at the profit and loss distribution, Deepseek's largest single loss was $348, but its overall profit was $2,533. Gemini's largest single profit was $329, but its largest loss was as high as $750.

Different AIs (public large models, not fine-tuned) have completely different approaches to balancing risk and return.

In addition, you can see the chat logs and thought processes of different models in the Model Chat option on the website—these monologues are particularly interesting.

Just as human traders have different styles, AIs also seem to display different personalities. Gemini's frequent trading and thought process resemble someone with ADHD, Claude's caution is like a conservative fund manager, and Deepseek is as steady as a quantitative veteran—only discussing positions, never making emotional comments.

This personality doesn't seem designed, but rather emerges naturally during training. When faced with uncertainty, different AIs tend to respond in different ways.

All AIs see the same candlesticks, same trading volume, same market depth. They even use the same prompts. So what causes such large differences?

The influence of training data may be key.

Behind Deepseek is High-Flyer Quant, which has accumulated massive amounts of trading data and strategies over more than a decade. Even if this data isn't directly used for training, does it influence the team's understanding of "what makes a good trading decision"?

By contrast, OpenAI and Google's training data may be more focused on academic papers and web text, and may not be grounded enough in live trading.

At the same time, some traders speculate that Deepseek may have specially optimized time series prediction during training, while GPT-5 may be better at handling natural language. When faced with structured data like price charts, different architectures will perform differently.

Watching AI Trade Is Also a Business

While everyone is focused on AI's profits and losses, few pay attention to the mysterious company behind it.

nof1.ai, which created this AI trading battle, isn't very well-known. But if you look at its social media following list, you can find some clues.

The team behind nof1.ai doesn't seem to be a group of typical crypto entrepreneurs, but rather a team of academic AI researchers.

Jay A Zhang (founder) has an interesting personal bio:

"Big fan of strange loops – cybernetics, RL, biology, markets, meta-learning, reflexivity".

Reflexivity is the core theory of Soros: the cognition of market participants affects the market, and market changes in turn affect participants' cognition. Having someone who studies "reflexivity" conduct an AI trading market experiment seems almost destined.

The experiment lets everyone see how AI trades, while also observing how this "being watched" affects the market.

Another co-founder, Matthew Siper, is a PhD candidate in machine learning at New York University and also an AI research scientist. With a doctoral student who hasn't yet graduated helping to run it, the project feels more like the validation of academic research.

Among nof1's other followed accounts are researchers from Google DeepMind and associate professors at New York University specializing in AI and games.

Judging from their actions and backgrounds, Nof1 clearly isn't just trying to create hype. The platform name SharpeBench is ambitious—the Sharpe ratio is the gold standard for risk-adjusted returns. What they may really want to build is a benchmark platform for AI trading capabilities.
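
For readers unfamiliar with the metric, the Sharpe ratio divides the average excess return by the volatility of those returns, so a steady account scores higher than an erratic one with the same average return. Below is a minimal Python sketch of the textbook formula. The return series are made-up numbers, not Alpha Arena data, and the function name, the zero risk-free rate, and the 365-day annualization are assumptions for illustration; the article does not describe how Nof1 computes it.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=365):
    """Annualized Sharpe ratio from per-period simple returns.

    risk_free_rate is expressed per period. periods_per_year=365 assumes
    daily returns in a market that trades every day (an assumption here).
    """
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)  # sample standard deviation
    if volatility == 0:
        raise ValueError("Sharpe ratio is undefined for zero volatility")
    return statistics.mean(excess) / volatility * math.sqrt(periods_per_year)


# Hypothetical daily returns for two made-up accounts (NOT real results).
# Both average 1% per day, but the second swings far more.
steady = [0.010, 0.012, 0.008, 0.011, 0.009]
erratic = [0.150, -0.120, 0.090, -0.100, 0.030]

print(f"steady:  {sharpe_ratio(steady):.2f}")
print(f"erratic: {sharpe_ratio(erratic):.2f}")
```

The steady series scores far higher even though both series have the same average daily return, which is the sense in which the ratio rewards risk-adjusted performance rather than raw profit.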

Some speculate that Nof1 is backed by big capital, while others say they may be laying the groundwork for future AI trading services.

If they launch a subscription service for Deepseek trading strategies, there may be no shortage of buyers. Based on this prototype, developing AI asset management, strategy subscriptions, and trading solutions for large enterprises is also a foreseeable business.

The team isn't the only party that stands to profit: spectators watching the AIs trade are trying to make money too.

As soon as Alpha Arena launched, people started copy trading.

The simplest strategy is to follow Deepseek: buy what it buys, sell what it sells. Meanwhile, some in the comments are taking the opposite approach, acting as Gemini's counterparty—selling when Gemini buys, buying when Gemini sells.
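
The two approaches just described are mirror images of each other, which a short Python sketch can make concrete. Everything here is hypothetical: the `Trade` type, the function names, and the `scale` parameter are invented for illustration, and no exchange API is called. The sketch also ignores latency, slippage, fees, and leverage, all of which matter in practice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trade:
    symbol: str   # e.g. "BTC"
    side: str     # "buy" or "sell"
    size: float   # position size in units of the asset


def copy_trade(observed: Trade, scale: float = 1.0) -> Trade:
    """Follow the observed trade: same side, size scaled to the follower."""
    return Trade(observed.symbol, observed.side, observed.size * scale)


def counter_trade(observed: Trade, scale: float = 1.0) -> Trade:
    """Take the other side of the observed trade."""
    opposite = {"buy": "sell", "sell": "buy"}[observed.side]
    return Trade(observed.symbol, opposite, observed.size * scale)


# Hypothetical observed trades, not taken from the real accounts.
deepseek_trade = Trade("BTC", "buy", 0.5)
gemini_trade = Trade("ETH", "buy", 2.0)

print(copy_trade(deepseek_trade, scale=0.1))    # follow at a tenth of the size
print(counter_trade(gemini_trade, scale=0.1))   # fade at a tenth of the size
```

Both functions only react to a trade after it is visible on-chain, so the follower always enters later, and usually at a different price, than the model being followed.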

But there's a problem with copy trading: when everyone knows what Deepseek is going to buy, does this strategy still work? This is what project founder Jay Zhang calls reflexivity—observation itself changes the observed object.

There's also an illusion of democratizing top trading strategies here.

On the surface, everyone can see the AI's trading strategies, but in reality, what you see are the results, not the logic. Each AI's take-profit and stop-loss logic may not be continuous or reliable.

While Nof1 is testing AI trading behavior, retail investors are searching for wealth codes, some traders are learning by imitation, and researchers are collecting data.

Only the AI itself doesn't know it's being watched and continues to execute every trade seriously. If the classic Turing Test is about "deception" and "imitation," then today's Alpha Arena trading battle is about how crypto players respond to what AI can do and the results it delivers.

In this results-driven crypto market, an AI that can make money may be more important than one that can chat.

Disclaimer: The content of this article solely reflects the author's opinion and does not represent the platform in any capacity. This article is not intended to serve as a reference for making investment decisions.
