
2025 Annual Review: Embodied Intelligence Industry

Is Embodied Intelligence the Biggest “Bubble” in 2025?

At the beginning of the year, Unitree suddenly made a big move by releasing the R1 humanoid robot priced at $5,900. Just a year ago, the industry generally believed that the cost floor for humanoid robots was still between $20,000 and $30,000. Unitree’s move shattered the entire industry’s price expectations.

Subsequently, Figure AI’s valuation soared from $2.6 billion in 2024 to $39 billion, a 15-fold increase. Its investor list reads like a tech-industry awards ceremony: Microsoft, OpenAI, NVIDIA, Jeff Bezos, Intel, and Samsung.

The capital market is betting wildly, as if the future of embodied intelligence is just around the corner.

However, at the same time, Tesla promised to produce 5,000 Optimus robots but assembled only about 1,000 before hitting the pause button and heading into a redesign. Elon Musk’s claim that “80% of Tesla’s value will come from Optimus” now sits awkwardly against reality.

This contrast between hype and reality is confusing. Where exactly does embodied intelligence stand today? This article breaks the question down across algorithms, hardware, data, capital, and the strategies of the major players.

01 What Is Embodied Intelligence? Why Did It Explode in 2025?

Before discussing the current industry situation, let’s first clarify what embodied intelligence is.

If ChatGPT is an AI that “can talk”, then embodied intelligence is an AI that “can act”. Its core is the VLA (Vision-Language-Action) model, which integrates three elements into a single neural network: vision, to perceive the current scene; language, to understand task goals and common sense; and action, to output specific control commands.

Simply put, it has three capabilities: understanding the environment, comprehending instructions, and performing actions.
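To make this concrete, here is a minimal sketch of what a VLA policy’s interface could look like in code. It is purely illustrative: the class, field names, and placeholder logic are assumptions for this article, not any company’s actual API.

```python
# Illustrative sketch of a VLA policy interface: one model that maps a camera
# frame plus a language instruction to a low-level action. The "model" here is
# a stub; a real system would run a neural network forward pass instead.
from dataclasses import dataclass
import numpy as np

@dataclass
class Action:
    xyz: np.ndarray      # target end-effector position in meters
    rpy: np.ndarray      # target orientation (roll, pitch, yaw) in radians
    gripper: float       # 0.0 = fully open, 1.0 = fully closed

class VLAPolicy:
    """Vision + Language + Action in a single model (stubbed for illustration)."""

    def step(self, rgb_image: np.ndarray, instruction: str) -> Action:
        # In a real system this is one forward pass:
        #   vision encoder   -> scene features from rgb_image
        #   language encoder -> goal features from the instruction
        #   action decoder   -> the next control command
        # Here we just return a placeholder action.
        return Action(xyz=np.zeros(3), rpy=np.zeros(3), gripper=0.0)

# Closed-loop use: because the policy is re-queried every timestep with a fresh
# image, it can adapt when the towel is wrinkled or misplaced, instead of
# replaying a fixed trajectory like a traditional industrial robot.
policy = VLAPolicy()
frame = np.zeros((224, 224, 3), dtype=np.uint8)   # stand-in for a camera frame
print(policy.step(frame, "fold the towel"))
```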

How is it different from traditional robots?

For example, traditional industrial robots are like actors who can only recite fixed lines. You program them, and they execute step by step. However, embodied intelligent robots are more like actors who can improvise. They can understand environmental changes and make autonomous decisions.

For instance, if you ask it to fold a towel, a traditional robot requires the towel to be placed in exactly the same position every time. But an embodied intelligent robot can recognize that the towel is wrinkled or misaligned this time and adjust its movement trajectory accordingly to still fold it properly.

Dyna Robotics is one of the hottest embodied intelligence startups in Silicon Valley. Founded just a year ago, it has already raised a $120 million Series A at a $600 million valuation, with NVIDIA among its investors. The towel-folding task was the demo that first made Dyna famous.

York Yang

Co-founder of Dyna Robotics

VLA, in simple terms, means we use a VLM from the large-model field as the “backbone”, but at the final output stage we convert the result into actions usable in robotics. An action can be intuitively understood as a command like moving an arm to a certain coordinate point.

The most criticized aspect of VLA is: why do we need the L (Language)? Many traditional robot algorithms were purely vision-based. But if you think carefully, in a long-term task your brain actually generates something like language to tell you what to do in the first step and what to do in the second.

The role of L is that, for very complex tasks, it can draw on the logical knowledge learned by large language models. For example, if you want to drink water, it knows you first need to find a cup or a bottle, something a large language model can provide directly. The main purpose of VLA is to combine Language with Vision more effectively. With Vision alone, you can only perform short-term tasks and cannot handle long-term tasks that require reasoning. That is the main reason we focus on introducing the language component.
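To illustrate the point made here, the sketch below shows one common way the language component earns its keep: a language-level planner decomposes a long-horizon goal into subgoals that a short-horizon vision-action policy executes one at a time. The function names and the hard-coded plan are illustrative stand-ins, not Dyna’s actual system.

```python
# Illustrative sketch: language-level reasoning turns a long-horizon goal into
# short subgoals. plan_subgoals is a hypothetical stand-in for an LLM/VLM call.
from typing import List

def plan_subgoals(goal: str) -> List[str]:
    """Stand-in for language-level reasoning (e.g. prompting a language model)."""
    if goal == "get me a drink of water":
        # Commonsense knowledge a language model supplies for free:
        # drinking water requires finding a cup or bottle first.
        return ["find a cup", "pick up the cup", "fill the cup with water",
                "place the cup in front of the user"]
    return [goal]  # a short task needs no decomposition

def execute_subgoal(subgoal: str) -> None:
    # A real system would run the vision-action policy in a closed loop
    # until the subgoal is reached; here we just log the step.
    print(f"executing: {subgoal}")

for step in plan_subgoals("get me a drink of water"):
    execute_subgoal(step)
```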

This is a qualitative leap: robots are no longer just mechanical arms executing fixed programs but intelligent agents that can understand, plan, and adapt through the integration of vision, language, and action.

Embodied intelligence is not a new concept. Why did it suddenly explode in 2025? There are three factors.

First, large models have become relatively mature.

Whether from OpenAI or other companies, recently released large models show improvements that are more incremental evolution than the kind of leap seen in the early transition from GPT-3.5 to GPT-4. Against this backdrop, the overall capabilities of large models are stabilizing and are sufficient to serve as a reliable foundation for embodied intelligence systems.

ChatGPT has proven that large language models can understand complex instructions, reason, and plan. This set of capabilities can be transferred to robots. When you say “make me breakfast”, it can plan a multi-step sequence like “first get the eggs, then crack the eggs, then turn on the stove and fry them”.

Second, the cost of computing power has been halved again and again. As chip manufacturers continue to launch new-generation chips with stronger performance, the unit cost of equivalent computing power has been on a long-term downward trend. Usually, every few years, the cost of obtaining the same amount of computing power is reduced by half.

In 2023, renting an NVIDIA H100 GPU was extremely expensive. Now, with the intensifying price war in cloud computing power, the cost of training large models has dropped significantly. What was once only affordable for leading companies is now accessible to startups.

Third, the hardware supply chain has matured.

Robot hardware components are already fairly mature overall. Driven especially by the humanoid robot boom of the past year, large amounts of capital and engineering resources have gone into the research and development of core components such as motors and reducers. As a result, the relevant technologies have continued to mature and costs have kept falling.

Unitree directly brought the price down to $5,900. Previously, the industry generally believed that mass production could be achieved in the $20,000 to $30,000 price range. The sharp decline in the cost curve has made commercialization no longer a pipe dream.

The combination of these three forces has pushed embodied intelligence from the laboratory to the verge of commercialization. This is not blind optimism but a rational judgment based on technological maturity. So what can embodied intelligence actually do today?

02 What Can Robots Do Now?

Chapter 2.1: What They Can Already Do

Let’s first talk about what robots can do: there are already practical applications in industrial and commercial scenarios.

Folding towels and clothes may sound simple, but Dyna’s robots can fold 700 towels in 24 hours with a success rate of 99.4%. That is real productivity in hotels and laundries. Moreover, their base model covers data from a variety of scenarios, such as cutting vegetables and fruit, preparing food, cleaning up after breakfast, and logistics sorting.

In BMW Group’s factories, Figure’s robots are performing simple assembly and material-handling tasks. Agility Robotics’ Digit is moving boxes in warehousing and logistics. 1X will also deliver up to 10,000 1X Neo humanoid robots to the Swedish giant EQT, mainly for industrial scenarios such as manufacturing, warehousing, and logistics. Amazon, meanwhile, has deployed 1 million specialized robots, approaching its 1.56 million human employees in number.

These are not just demos but real-world commercial projects. This is “rational progress”: not aiming for all-around capabilities but focusing on practical applications.

Chapter 2.2: Tasks Being Tackled

What are the tasks that robots can’t do yet and that leading companies are currently working on? For example, medium-difficulty tasks like making breakfast.

This is a “long-term task” that requires planning multiple steps: getting ingredients, cutting vegetables, arranging the dishes, turning on the stove, and stir-frying. Each step needs to be executed precisely and with controlled force; you can’t crush the eggs or cut your hand while chopping vegetables. Dyna’s latest demo shows that it has cracked this long-term breakfast-making task.

Figure has also shown a demo of two robots working together, one passing tools and the other operating. This would be very useful in household scenarios, but the stability is still being refined.

Chapter 2.3: What They Can’t Do Yet

The most difficult tasks are household chores, because every home environment is different. Changing lighting, varied item placement, and family members moving around are all challenges of an “unstructured environment”.

Relatively speaking, factories are “structured environments” with fixed lighting, fixed item positions, and standardized processes. But a home is a completely different story. Moreover, household chores have a strict requirement: zero tolerance for errors. If a robot breaks a part in a factory, the loss is controllable. But if it breaks a bowl or hurts someone at home, it’s an accident.

Wang Hao

CTO of Independent Variable Robotics

For example, when a robot is performing a task, there may be a small wrinkle in the tablecloth, a cup placed unstably, or a transparent object reflecting light in a way that happens to interfere with the camera. Humans adapt to these slight physical changes instantly, drawing on intuition and rich experience. A large AI model, however, relies heavily on data-driven methods and may not truly perceive these new situations.

Therefore, the technical threshold for robots to enter households is much higher than that for entering factories. But this doesn’t mean it’s out of reach.

York Yang

Co-founder of Dyna Robotics

We think that initially, robots will be used in the markets we are currently exploring, such as commercial services, where they can work together with humans to complete tasks. But we don’t think household use is that far off. You don’t need a fully fledged, general AGI; you may only need a few capabilities to enter the household scenario. First let the robot start working at home, then gradually develop more capabilities through model iteration.

Of course, once our hardware cost drops to a level ordinary families can afford, we may first sell families a robot that can fold clothes, and then gradually add other functions. So this timeline shouldn’t be too far off, maybe only one to two years.

This is “rational progress”: instead of waiting for robots to become the all-around butlers of science fiction before entering the market, start with one clear function users need and iterate from there.

03 Technological Breakthroughs in 2025

Despite the many challenges, 2025 did see several notable technological breakthroughs. Industry insiders frankly told us that none of them is revolutionary, but all are real progress.

Chapter 3.1 Breakthrough 1: The Popularity of the Dual-System Architecture

Many companies have started to adopt the so-called “System 1 + System 2” architecture.
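Before looking at specific implementations, here is a minimal sketch of the general pattern: a slow, deliberate “System 2” reasons about the scene and the goal at a low rate, while a fast, reactive “System 1” turns its latest output into motor commands at a high rate. The rates, subgoal strings, and threading scheme below are illustrative assumptions, not any particular company’s implementation.

```python
# Illustrative dual-system sketch: a slow reasoning loop and a fast control loop
# sharing the current subgoal.
import threading
import time

latest_subgoal = "idle"          # written by System 2, read by System 1
lock = threading.Lock()

def system2_reasoner(goal: str, period_s: float = 1.0) -> None:
    """Slow loop (~1 Hz): vision-language reasoning that keeps replanning the subgoal."""
    global latest_subgoal
    print(f"System 2 planning for goal: {goal}")
    # A real System 2 would derive these steps from the goal and live camera images.
    for step in ["locate the towel", "grasp a corner", "fold in half", "smooth it flat"]:
        with lock:
            latest_subgoal = step
        time.sleep(period_s)     # stands in for a slow VLM forward pass

def system1_controller(rate_hz: float = 50.0, duration_s: float = 4.0) -> None:
    """Fast loop (~50 Hz): reactive control toward whatever subgoal is current."""
    last_printed = None
    t_end = time.time() + duration_s
    while time.time() < t_end:
        with lock:
            subgoal = latest_subgoal
        if subgoal != last_printed:  # log only when the subgoal changes
            print(f"System 1 now tracking: {subgoal}")
            last_printed = subgoal
        # A real controller would emit joint or end-effector commands here.
        time.sleep(1.0 / rate_hz)

planner = threading.Thread(target=system2_reasoner, args=("fold the towel",))
planner.start()
system1_controller()
planner.join()
```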
