The first AI to help create itself! GPT-5.3 participates in its own development, turning science fiction into reality

MarketWhisper

OpenAI releases GPT-5.3-Codex, billed as the first “participatory creation” model: one that helped debug its own training code, manage its own deployment, and diagnose its own test results. Andrej Karpathy calls the update the closest thing to an AI takeoff he has seen.

The Technological Singularity Breakthrough: AI Begins Creating AI

OpenAI’s official account announced that GPT-5.3-Codex is now live, calling it “the first model to participate in creating itself.” What does that mean? During development, the AI helped debug its training code, managed parts of its deployment process, and diagnosed its own test results. In plain terms: AI is starting to help build AI.

Andrej Karpathy, a former OpenAI researcher and former director of AI at Tesla, tweeted shortly after: “This is the closest thing I’ve seen to an AI takeoff scene from science fiction.” Coming from a researcher of his standing, the assessment carries weight: Karpathy has worked through several key stages of AI development, and his judgment rests on deep technical understanding.

AI iterating on itself is not just marketing hype. According to internal disclosures from OpenAI, GPT-5.3-Codex did the following during development: analyzed training logs to identify failed tests, suggested fixes for training scripts and configuration files, generated deployment recipes, and summarized evaluation anomalies for human review (a minimal illustrative sketch of such a triage step follows the list below). The implication: AI is no longer just a tool; it is becoming a member of the development team, and one that can help improve itself.

This level of participation in its own development breaks with the traditional role of AI. Until now, models were designed, trained, and deployed entirely by humans; the AI was a passive product. GPT-5.3 plays an active role in its own creation, albeit still under human supervision. The shift matters: it hints at a future in which much of the design and optimization of AI models is done by AI itself, with humans providing direction and final review.

Four Ways GPT-5.3 Participated in Its Own Development

Analyzing Training Logs: Automatically tagging failed tests and identifying anomalies during training

Suggesting Repair Plans: Offering improvements for training scripts and configuration files

Generating Deployment Recipes: Automating deployment processes to reduce manual operations

Summarizing Evaluation Anomalies: Organizing complex evaluation results into human-understandable reports
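
OpenAI has not published what this internal tooling looks like, so the following is only a minimal sketch of what an automated log-triage step of this kind could resemble. The log format, file name, and the way suggestions are drafted are all assumptions made for illustration, not OpenAI’s actual pipeline.

```python
# Minimal illustrative sketch of a log-triage step like the one described above.
# OpenAI's internal tooling is not public; the log format, file name, and the
# drafted suggestions here are assumptions made purely for illustration.
import re
from pathlib import Path

FAIL_PATTERN = re.compile(r"FAILED\s+(?P<test>\S+)\s+-\s+(?P<reason>.+)")

def triage_training_log(log_path: str) -> list[dict]:
    """Scan a training/CI log, tag failed tests, and draft a note for human review."""
    findings = []
    for line in Path(log_path).read_text().splitlines():
        match = FAIL_PATTERN.search(line)
        if match:
            findings.append({
                "test": match.group("test"),
                "reason": match.group("reason"),
                # A real agent would propose a concrete patch; here we only
                # draft a placeholder suggestion for a human to review.
                "suggested_action": f"Inspect {match.group('test')}: {match.group('reason')}",
            })
    return findings

if __name__ == "__main__":
    for item in triage_training_log("train_run_042.log"):  # hypothetical log file
        print(f"[needs review] {item['test']} -> {item['suggested_action']}")
```

The key feature of the pattern, as described by OpenAI, is that the model proposes and summarizes while a human still reviews; nothing is applied automatically.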

MIT recently published the SEAL paper (arXiv:2506.10943), describing an approach in which a model keeps learning after deployment by generating and applying its own updates rather than waiting for a full external retraining cycle. Notably, some of the SEAL researchers have reportedly since joined OpenAI. The direction is clear: AI is shifting from a “static tool” to a “dynamic system,” learning no longer stops at deployment, and the boundary between inference and training is dissolving. GPT-5.3 may be the first commercial application of this kind of architecture.
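
SEAL’s actual method has the model generate its own “self-edits” (finetuning data and directives) and reinforce the edits that improve downstream performance. The loop below is only a schematic of that general idea, with placeholder functions for proposing, applying, and evaluating edits; it is not the paper’s implementation or any OpenAI system.

```python
# Schematic of a SEAL-style self-adaptation loop: the model proposes its own
# training edits, and only edits that improve a held-out evaluation are kept.
# All callables below are placeholders, not the paper's actual implementation.
from typing import Any, Callable

def self_adaptation_loop(
    model: Any,
    tasks: list,
    propose_self_edit: Callable[[Any, Any], Any],  # model drafts its own finetuning data/directives
    apply_edit: Callable[[Any, Any], Any],         # returns a candidate updated model
    evaluate: Callable[[Any, list], float],        # held-out score, higher is better
    rounds: int = 3,
):
    baseline = evaluate(model, tasks)
    for _ in range(rounds):
        for task in tasks:
            edit = propose_self_edit(model, task)
            candidate = apply_edit(model, edit)
            score = evaluate(candidate, tasks)
            if score > baseline:  # keep only edits that actually help
                model, baseline = candidate, score
    return model
```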

Benchmark Showdown: 77.3% and a Clear Lead Over Claude

On February 5, just 20 minutes apart, OpenAI and Anthropic both announced new-generation models: Anthropic released Claude Opus 4.6 first, and OpenAI followed with GPT-5.3-Codex, setting up a head-to-head showdown. If OpenAI wants GPT-5.3-Codex to outdo its rival’s new model, the numbers have to back it up, and the data doesn’t lie: on release, GPT-5.3-Codex set new records across multiple industry benchmarks.

Terminal-Bench 2.0 tests an AI’s ability to operate in real terminal environments: writing code, training models, configuring servers. GPT-5.3-Codex scored 77.3%, against 64.0% for GPT-5.2-Codex and a reported 65.4% for Claude Opus 4.6. A thirteen-point jump between generations is a huge leap for AI, and 77.3% versus 65.4% shows a significant advantage for GPT-5.3 in practical engineering tasks.

SWE-Bench Pro is a benchmark built around real software-engineering tasks in Python, JavaScript, Go, and Ruby. GPT-5.3-Codex achieved 56.8%, edging past GPT-5.2-Codex’s 56.4% and keeping the industry lead. More importantly, OpenAI says GPT-5.3-Codex reached that score with the fewest output tokens of any model, meaning it is not only accurate but also efficient.

OSWorld-Verified tests an AI’s ability to complete productivity tasks in a visual desktop environment: editing spreadsheets, building presentations, handling documents, and so on. GPT-5.3-Codex scored 64.7%, nearly double the previous generation and approaching the human average of 72%. For the first time, that near-human level lets AI genuinely handle computer-based office work rather than merely assist with it.

Claude’s Counterattack: 1 Million Tokens and Agent Teams

On Anthropic’s side, Claude Opus 4.6 supports a 1-million-token context window (in beta), enough to process an entire codebase or hundreds of pages of documents at once. It also introduces the Agent Teams feature, in which multiple AI agents collaborate simultaneously on programming, testing, and documentation, an “AI team” mode that is turning coding from an individual skill into collaborative work.

With OpenAI and Anthropic releasing flagship models on the same day, almost to the minute, the competition is no longer just about technical prowess but about the future direction of AI: OpenAI’s “self-evolution” route or Anthropic’s “multi-agent collaboration” route? OpenAI’s strategy is to make a single AI ever more powerful, even capable of improving itself. Anthropic’s approach is to have multiple AIs divide up the work and collaborate to complete complex objectives.

The 1-million-token context window is a technical breakthrough in its own right. It corresponds to roughly 750,000 English words or 3 million Chinese characters, enough to hold an entire medium-sized software project or a thick technical manual. That capacity lets Claude “see” the whole project rather than isolated fragments, and for large-scale architecture analysis and refactoring, that global view is crucial.
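
As a rough sanity check of what “1 million tokens” buys, the sketch below estimates whether a local project would fit in such a window, using the common rule of thumb of roughly four characters per token for English text and code. The heuristic and the file-type filter are assumptions; real counts depend on the tokenizer, and this is not any vendor API.

```python
# Back-of-the-envelope check of whether a project fits in a 1M-token context
# window, using the rough "~4 characters per token" heuristic for English/code.
# The heuristic is approximate and tokenizer-dependent.
from pathlib import Path

CONTEXT_WINDOW = 1_000_000
CHARS_PER_TOKEN = 4  # rough average for English prose and source code

def estimate_tokens(root: str, suffixes=(".py", ".md", ".txt")) -> int:
    total_chars = sum(
        len(p.read_text(errors="ignore"))
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in suffixes
    )
    return total_chars // CHARS_PER_TOKEN

if __name__ == "__main__":
    tokens = estimate_tokens(".")
    print(f"~{tokens:,} tokens; fits in window: {tokens <= CONTEXT_WINDOW}")
```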

Agent Teams bring the idea of collaboration into AI itself. One agent writes code, another tests it, a third writes documentation, and they communicate and coordinate as they go (a rough sketch of this pattern follows below). The mode mimics a human software team and may suit certain scenarios better than a single super-capable AI. It also introduces new complexity, though: how the agents coordinate, avoid conflicting changes, and keep their outputs consistent.
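
Anthropic has not published how Agent Teams works internally. The sketch below only illustrates the general pattern the article describes, role-specialized agents handing off work on a shared task; every class, method, and the sequential hand-off strategy are invented for illustration and are not Anthropic’s API.

```python
# Illustrative sketch of role-based agent collaboration (coder / tester /
# doc writer sharing one task). Anthropic's Agent Teams internals are not
# public; everything here is invented purely to show the pattern.
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    artifacts: dict = field(default_factory=dict)

class Agent:
    def __init__(self, role: str):
        self.role = role

    def work(self, task: Task) -> None:
        # A real agent would call an LLM here; we just record which role acted.
        task.artifacts[self.role] = f"{self.role} output for: {task.description}"

def run_team(task: Task, agents: list[Agent]) -> Task:
    """Simple sequential hand-off: code -> test -> document.
    Coordination, conflict resolution, and retries are the hard parts omitted here."""
    for agent in agents:
        agent.work(task)
    return task

if __name__ == "__main__":
    team = [Agent("coder"), Agent("tester"), Agent("doc_writer")]
    result = run_team(Task("add pagination to the API"), team)
    print(result.artifacts)
```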

Each route has its advantages and drawbacks. OpenAI’s self-evolution approach is more aggressive: if it succeeds, it could trigger exponential capability growth, but it also carries the risk of losing control. Anthropic’s multi-agent route is more conservative: distributing capability reduces single-point risk, though coordination costs may limit efficiency. As AI begins evolving in the wild, the governance question shifts from “how smart is it” to “how do we manage a system that keeps changing.” And the fact that two leading AI companies released breakthrough models within 20 minutes of each other leaves humanity with a shrinking window to think and prepare.
