Zhiyuan GE-Sim 2.0: Generating worlds with a World Model, as Unitree’s formidable rival pushes humanoid robots toward self-evolution

ChainNews ABMedia

Embodied AI is reaching a critical turning point. China’s Zhiyuan Robotics (AgiBot) recently released Genie Envisioner World Simulator 2.0 (GE-Sim 2.0), an attempt to turn the World Model from a tool that merely understands the environment into a World Simulator in which robots can be run, trained, and optimized directly.

If the significance is not yet clear, start with a fundamental limitation of LLM architectures. In terms of training logic, existing LLMs only predict text from context, based on huge text corpora. They can learn that the words “an apple falls” often appear together, but they do not truly understand gravity or the causal relationships of the physical world.

That is why scientists such as Yann LeCun and Fei-Fei Li have thrown themselves into the World Model track. Once AI can understand 3D environments and make physical predictions, the technology will become the digital brain for “Physical AI” such as autonomous robots, self-driving vehicles, and smart manufacturing. The World Model approach therefore holds that robots will be a crucial carrier. The entry of a humanoid robot vendor like Zhiyuan Robotics marks it as a vanguard of China’s hardware-driven push.

TSMC chairman C.C. Wei (Wei Zhe-jia) previously said: if you look at mainland China, it keeps making robots that can jump and bounce around. That is useless; it is just for show. He pointed out that the key is getting the robot “brain” to work. And who makes the brains? Nvidia, AMD, and a number of other U.S. companies, yet 95% of those brains are manufactured by TSMC. The development bottlenecks facing GE-Sim 2.0 remain closely tied to the state of China’s model development.

The World Model roadmap claims that robots are key

Current mainstream LLMs rely on massive corpora and statistical relationships to model context and predict the next word. They can learn that the words “an apple falls” often appear together, but they do not truly understand gravity or the causal relationships of the physical world.

This approach performs extremely well in text generation, programming assistance, and Q&A tasks, but it still has fundamental limitations in scenarios that require understanding real-world structure, reasoning about cause and effect, and long-term planning. The bigger problem is that data sources are gradually running out. LLM training depends heavily on high-quality human data, and in recent years the industry has begun warning that the available human text may be used up within the next few years. If models are then trained increasingly on model-generated text, they will, much as inbreeding can lead to genetic defects, gradually drift away from reality and degrade in performance.

(In-depth analysis: Do LLMs have flaws? Why is Yann LeCun betting on the AMI World Model route?)

This is also why, in recent years, two heavyweight figures in AI academia, Yann LeCun and Fei-Fei Li (known as the “AI godmother”), have both chosen to bet on a next-generation AI architecture called the World Model.

As the author wrote earlier: looking further ahead, once AI can understand 3D environments and make physical predictions, the technology will become the digital brain for “Physical AI” such as autonomous robots, self-driving vehicles, and smart manufacturing. The World Model roadmap therefore holds that robots will be a very important carrier. The entry of humanoid robot maker Zhiyuan Robotics marks it as a vanguard of China’s hardware-driven push.

Speaking about robots and semiconductor development, TSMC chairman C.C. Wei (Wei Zhe-jia) said plainly: if you look at mainland China, it keeps making robots that can jump and bounce around. That is useless; it only looks good. He pointed out that the key is making sure the robot brain can function. And who makes the brain? Nvidia, AMD, and a number of other U.S. companies. But 95% of those brains are manufactured by TSMC.

(TSMC’s C.C. Wei quips: China’s robots bounce around, good-looking but useless! The key still comes from Nvidia)

World Model evolution: from understanding the world to learning within the world

Over the past few years, the World Model has been regarded as a key technology for letting AI understand reality. Using image, language, and sensor data, the model can predict how the environment will change and give robots basic decision-making abilities.

But the core breakthrough of GE-Sim 2.0 is not just understanding the world. It is letting the robot learn and act directly inside the “world generated by the model.” It makes Action a core variable, upgrading traditional state prediction into a complete loop:

State

Action

State Evolution

This means robots no longer just observe and respond; they can actively try things out, optimize themselves, and keep learning in a simulated environment. The shift turns the World Model from a “cognitive model” into “training infrastructure.”
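The State, Action, State Evolution loop above can be illustrated with a minimal sketch. It is not GE-Sim 2.0 code: `toy_world_model`, `move_right`, and the 2-D position state are hypothetical stand-ins, and a real world model would be a learned network that predicts images rather than coordinates.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float]   # hypothetical state: a 2-D gripper position
Action = Tuple[float, float]  # hypothetical action: a 2-D displacement command


def toy_world_model(state: State, action: Action) -> State:
    """Stand-in for a learned model: predict the next state from (state, action)."""
    return (state[0] + action[0], state[1] + action[1])


def move_right(state: State) -> Action:
    """Hypothetical policy: always push one unit along x."""
    return (1.0, 0.0)


def rollout(
    model: Callable[[State, Action], State],
    policy: Callable[[State], Action],
    start: State,
    steps: int,
) -> List[State]:
    """Run the loop entirely inside the model: state -> action -> next state."""
    states = [start]
    for _ in range(steps):
        action = policy(states[-1])                # the robot acts
        states.append(model(states[-1], action))   # the model evolves the state
    return states


trajectory = rollout(toy_world_model, move_right, (0.0, 0.0), 3)
print(trajectory)
```

The point of the sketch is that no real environment is queried: every next state comes from the model, which is what makes the action a first-class input rather than something inferred after the fact.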

GE-Sim 2.0: Let robots “evolve” in a virtual world

GE-Sim 2.0 is positioned as an “embodied world simulator.” Its core goal is to solve three major bottlenecks of real-world training: high cost, insufficient data, and difficulty scaling. Because the environments are generated by a model, the system can train robots at scale without relying on the real world.

Technically, GE-Sim 2.0 integrates three key capabilities. The first is “action-driven image generation”: the model generates future frames that correspond to the robot’s actions while keeping multiple views consistent, including the head viewpoint and the left- and right-hand manipulation viewpoints.

The second is “proprioception modeling”: beyond simulating external visuals, it also predicts the robot’s own joint and motion states, bringing decision-making closer to the real physical world.
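As a rough illustration of what one simulator step would have to return under these two capabilities, the sketch below bundles a frame per camera view with the robot’s joint state. All names (`Observation`, `predict_next`, the string placeholder frames, joint angles in radians) are assumptions made for illustration; the article does not describe GE-Sim 2.0’s actual interfaces.

```python
from dataclasses import dataclass
from typing import Dict, List

VIEWS = ("head", "left_hand", "right_hand")  # the three viewpoints named in the article


@dataclass(frozen=True)
class Observation:
    step: int
    frames: Dict[str, str]  # placeholder frame per view; a real model would emit images
    joints: List[float]     # proprioceptive state (hypothetical: joint angles in radians)


def predict_next(obs: Observation, joint_deltas: List[float]) -> Observation:
    """Toy action-conditioned step: advance every view and the joint state together."""
    next_step = obs.step + 1
    next_joints = [q + dq for q, dq in zip(obs.joints, joint_deltas)]
    # Every view is produced for the same timestep, so the views stay in sync.
    next_frames = {view: f"{view}@t{next_step}" for view in VIEWS}
    return Observation(next_step, next_frames, next_joints)


start = Observation(0, {view: f"{view}@t0" for view in VIEWS}, [0.0, 0.5])
nxt = predict_next(start, [0.25, -0.25])
print(nxt)
```

Returning the views and the joint state from a single call is a design choice of this sketch: it makes it impossible for the visual prediction and the proprioceptive prediction to refer to different timesteps.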

The third is “automatic task assessment”: a built-in reward model lets the system determine automatically whether a task such as “put the blue object into the red box” has been completed, and feed that result directly back into reinforcement learning. This lets robots complete a full closed loop inside the simulated environment.
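A minimal sketch of such a closed loop follows, under heavy simplification. The 1-D task, the constants, and the functions `simulate`, `reward_model`, and `improve_policy` are all hypothetical, and the policy is improved by a plain search over candidate actions rather than by reinforcement learning, purely to keep the example short.

```python
from typing import List, Optional

BOX_POSITION = 6.0  # hypothetical target: centre of the "red box" on a 1-D track
TOLERANCE = 0.5     # hypothetical success radius around the box
HORIZON = 4         # number of pushes per episode


def simulate(position: float, push: float) -> float:
    """Stand-in for the generated world: object position after one push."""
    return position + push


def reward_model(position: float) -> float:
    """Stand-in for automatic task assessment: 1.0 if the object is in the box."""
    return 1.0 if abs(position - BOX_POSITION) <= TOLERANCE else 0.0


def run_episode(push: float) -> float:
    """Roll out one episode in simulation and score the final state."""
    position = 0.0
    for _ in range(HORIZON):
        position = simulate(position, push)
    return reward_model(position)


def improve_policy(candidates: List[float]) -> Optional[float]:
    """Return the first candidate push the reward model judges successful."""
    for push in candidates:
        if run_episode(push) == 1.0:
            return push
    return None


best_push = improve_policy([0.5, 1.0, 1.5, 2.0])
print(best_push)  # 1.5
```

No human labels an episode as success or failure here; the reward function does it, which is the property that lets the loop run unattended and at scale.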

GE-Sim 2.0 can already achieve “minute-level” stable video generation

Earlier models could generate only short video clips, whereas GE-Sim 2.0 can sustain stable generation for minutes at a time, which supports long-duration task simulation. Because it is trained on large-scale real data (teleoperation, deployment, and interaction data), the model also generalizes better across scenarios and tasks. This matters especially for humanoid robots: real-world operations vary widely, and training in fixed scenes alone cannot cover them.

The emergence of the World Simulator means robots can “practice infinitely” in a virtual world. This is expected to bring two structural changes: first, training costs should drop dramatically; second, capability iteration should speed up sharply.

Zhiyuan Robotics: a new force in China’s humanoid robotics

Zhiyuan Robotics was founded in 2023 by Peng Zhihui, a former member of Huawei’s “Genius Youth” program, and focuses on embodied intelligence that merges AI and robotics.

The company’s core products include:

the “Yuan Zheng” series humanoid robots

the “Lingxi” robot system

the general large model GO-1

The company has completed multiple funding rounds and has received investment from institutions such as Sequoia China and Hillhouse Capital. It is viewed as an important player in China’s humanoid robot sector and as a direct competitor to Unitree Robotics.

This article, “Zhiyuan GE-Sim 2.0: generate worlds with a World Model, Unitree’s formidable rival pushes humanoid robots toward self-evolution,” first appeared on ChainNews ABMedia.
