X Square Robot’s Wang Qian: Robots will eventually reach Mars

On March 30, 2026, in Shenzhen, a quiet but telling milestone took place in the embodied AI world. X Square Robot (自变量机器人) hosted the inaugural Embodied AI Developers Conference — EAIDC 2026 — billed as the world’s first large-scale gathering dedicated specifically to developers building embodied AI systems. The event brought together researchers, engineers, and technology companies from across the industry for live robotic demonstrations, a national-level hackathon, and focused discussions on a topic that most robotics conferences still treat as futuristic: not whether robots can be intelligent, but how to deploy them at scale in the real world.

That framing, from “can it work” to “how do we ship it,” is precisely what distinguishes X Square from most companies in this space. While competitors have spent the past two years perfecting demo videos of robots doing backflips, X Square has focused on a harder, less photogenic problem: building a foundation model general enough to run in a real home, a messy and variable space, without failing the moment something unexpected happens.

EAIDC was, in some ways, a coming-out party. X Square is no longer a well-funded startup operating in the background. It is increasingly the company setting the terms of the conversation. Following the conclusion of the conference, Wang Qian (王潜), X Square’s founder and CEO, sat down with Pandaily for an exclusive interview to discuss the road ahead.

What X Square Actually Builds

To understand X Square, it helps to understand what problem it is trying to solve.

Traditional industrial robots are essentially very precise, very fast machines that follow pre-programmed instructions. They excel in structured environments — the same task, the same object, the same position, repeated thousands of times. Change any variable and they fail. They have no ability to observe the world, reason about it, and adapt.

The company’s answer is what it calls an Embodied Intelligence Foundation Model — specifically, a Vision-Language-Action (VLA) model that processes sensory input (video, language, haptic signals) and outputs physical action (joint torques, velocities, poses) in a single end-to-end architecture. The model, branded the Great Wall series with its flagship called WALL-A, is designed to enable robots to perceive, decide, and act autonomously across diverse environments and tasks without being explicitly reprogrammed for each one.
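The input and output sides of such a model can be pictured as a single function from one observation to one action. The sketch below is only an illustration of that interface: every class and field name is hypothetical, the source does not describe WALL-A's actual data structures, and the placeholder policy simply commands zero motion.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Observation:
    """One time step of sensory input (all field names are hypothetical)."""
    video_frame: List[List[float]]  # e.g. a tiny grayscale image
    instruction: str                # natural-language command
    haptic: List[float]             # per-sensor contact readings


@dataclass
class Action:
    """One time step of motor output (all field names are hypothetical)."""
    joint_torques: List[float]
    joint_velocities: List[float]
    end_effector_pose: List[float]  # x, y, z, roll, pitch, yaw


class VLAPolicy(Protocol):
    """Anything that maps an observation straight to an action."""
    def act(self, obs: Observation) -> Action: ...


class ZeroPolicy:
    """Placeholder policy: always commands zero motion for `num_joints` joints."""

    def __init__(self, num_joints: int) -> None:
        self.num_joints = num_joints

    def act(self, obs: Observation) -> Action:
        return Action(
            joint_torques=[0.0] * self.num_joints,
            joint_velocities=[0.0] * self.num_joints,
            end_effector_pose=[0.0] * 6,
        )


policy: VLAPolicy = ZeroPolicy(num_joints=7)
obs = Observation(video_frame=[[0.0]], instruction="fetch a cup", haptic=[0.0])
action = policy.act(obs)
```

The point of the sketch is the shape of the call: there is no separate planner object between `Observation` and `Action`, which is what "end-to-end" means at the interface level.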

X Square claims WALL-A is currently one of the world’s largest-scale unified embodied foundation models — achieved via a novel three-stage training paradigm, a proprietary architecture combining shared attention with expert-routed feed-forward modules, and training data drawn from a mix of self-collected real-world demonstrations, open-source robot datasets, and automatically generated multimodal data.
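To make "shared attention with expert-routed feed-forward modules" concrete, here is a toy sketch in plain Python. It rests on assumptions that are not in the source: a single attention head with identity projections, routing by a hypothetical modality tag rather than a learned router, and two made-up experts. It shows the general pattern, not X Square's architecture.

```python
import math
from typing import Callable, Dict, List

Vector = List[float]


def softmax(scores: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shared_attention(tokens: List[Vector]) -> List[Vector]:
    """Single-head self-attention applied to all tokens, whatever their modality.

    Queries, keys, and values are the raw tokens (identity projections),
    a simplification made for this sketch.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    mixed = []
    for query in tokens:
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in tokens]
        weights = softmax(scores)
        mixed.append(
            [sum(w * tok[i] for w, tok in zip(weights, tokens)) for i in range(dim)]
        )
    return mixed


def expert_routed_ffn(
    tokens: List[Vector],
    modalities: List[str],
    experts: Dict[str, Callable[[Vector], Vector]],
) -> List[Vector]:
    """Send each token through the feed-forward expert named by its modality tag."""
    return [experts[tag](tok) for tok, tag in zip(tokens, modalities)]


# Hypothetical experts: stand-ins for per-modality feed-forward networks.
EXPERTS: Dict[str, Callable[[Vector], Vector]] = {
    "vision": lambda tok: [max(0.0, x) for x in tok],  # ReLU
    "action": lambda tok: [2.0 * x for x in tok],      # fixed scaling
}


def block(tokens: List[Vector], modalities: List[str]) -> List[Vector]:
    """One layer: attention shared by all tokens, then per-token expert routing."""
    return expert_routed_ffn(shared_attention(tokens), modalities, EXPERTS)


output = block([[1.0, 0.0], [0.0, 1.0]], ["vision", "action"])
```

In this arrangement every token attends to every other token through the same attention weights, while the feed-forward step differs per token type. That is one common way to let modalities exchange information without forcing them through identical parameters.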

In September 2025, X Square open-sourced a developer-facing version of its model called WALL-OSS, which has since been integrated into Hugging Face’s LeRobot framework — a move that simultaneously builds community goodwill and feeds the data flywheel. EAIDC itself can be read as an extension of this strategy: by convening the ecosystem, X Square is positioning its open-source model as the platform developers build on top of.

On the hardware side, X Square produces two robot platforms: Quanta X1, a wheeled dual-arm robot designed for precision manipulation in service environments, and Quanta X2, a wheeled humanoid launched in August 2025. It also makes the Artixon Hand, a 20 DoF dexterous robotic hand — which the company claims is the world’s first such hand to be controlled directly by a foundation model rather than traditional motion-planning algorithms.

The company is headquartered in Shenzhen with offices in Beijing and Shanghai, and has deployed its robots in hotel groups, logistics companies, elderly care facilities, supermarkets, and research institutions. Early commercial revenue is coming in from education, hospitality, and elder care. Household services, including a collaboration with classifieds platform 58.com, are also currently active.

The Founder: A Transformer Pioneer Who Quit Finance to Build Robots

Wang Qian is not a typical robotics entrepreneur. His academic background is in deep learning — specifically, he was among the earliest researchers in the world to introduce the attention mechanism into neural networks, publishing work at the same conference as Google’s early attention paper in 2014, three years before the Transformer architecture would reshape the entire field of AI.

After completing his bachelor’s and master’s degrees at Tsinghua University, he pursued a PhD at the University of Southern California, conducting robotics learning and human-robot interaction research at top American robotics laboratories. But he pivoted mid-career — to quantitative finance. He founded a quant fund in the United States, which by his own account was financially successful. And then he spent many nights unable to sleep.

“I kept thinking: I should have stayed in robotics,” Wang has said in interviews. In 2023, he dissolved the fund and returned to China.

His co-founder and CTO, Wang Hao (王昊), brings a complementary profile. He holds a PhD in computational physics from Peking University and previously led the algorithm team for the Fengshenbang large language model at the Institute for Intelligent Computing (IDEA Research), overseeing development of China’s first hundred-billion-parameter foundation model and one of the earliest trillion-parameter models, Ziya.

Together, the two founders embody X Square’s core thesis: that building a general-purpose robot brain requires expertise in both large foundation models and robotics learning — and that the two cannot be solved in isolation.

“We Were the Only Variable”

The name “X Square” is not arbitrary. In mathematics, x represents the independent variable — the thing that changes and drives outcomes. The Chinese name, 自变量 (zì biànliàng), carries the same meaning with an additional nuance: zì means “self” or “autonomous.” The company wants to be the variable that changes the world — and it wants that change to be self-generated, not dependent on others.

This framing turns out to be an accurate description of how X Square navigated its early years.

When the company was founded in late 2023, the embodied intelligence landscape in China was already competitive. Galbot (银河通用) and Agibot (智元机器人) had both launched that same year with considerable fanfare, larger teams, and more initial funding. X Square, by contrast, operated with little public visibility and struggled to raise early rounds.

“The biggest difficulty was that nobody trusted us,” Wang Qian has recalled. “Early investors didn’t believe we could actually pull this off. They bet on the team being decent enough that even if we failed at this, we’d find something else to do.”

What changed was output. X Square released its first embodied intelligence model just two months after founding. By October 2024, it had trained WALL-A — at the time, the largest-parameter general-purpose embodied manipulation model in the world. The demos showed something qualitatively different: robots that could hang laundry, prepare shaved ice, wind cables around pegs, and sort parcels of arbitrary shape, all controlled by a single model without task-specific reprogramming. By the time Physical Intelligence (PI), the American benchmark company in this space, released its π0 model in late 2024 — validating the end-to-end VLA approach — X Square had already been on that path for over a year.

“We didn’t need to copy anyone’s homework,” Wang has said. “We were already there.”

By early 2025, the funding picture had transformed entirely. Meituan led a Series A round; Alibaba Cloud and Hongshan (formerly Sequoia China) co-led an A+ round of nearly RMB 1 billion; and by January 2026, an A++ round of RMB 1 billion had closed with ByteDance and Hongshan. The company has now raised approximately $280 million in total and is the only Chinese embodied intelligence startup to have simultaneously attracted backing from Meituan, Alibaba, and ByteDance.

The Technical Bet: Why End-to-End Matters

The central debate in embodied intelligence — and the one X Square has wagered its existence on — is architectural. Should a robot’s “brain” be built as a modular stack (perception module → planning module → control module), or as a single end-to-end model that processes raw sensor input and outputs raw motor commands without explicit intermediate representations?

X Square’s answer has been unambiguous from day one: end-to-end, always.

Wang Qian’s reasoning is both technical and philosophical. Modular systems, he argues, impose human-designed abstractions onto a continuous physical world. Each handoff between modules introduces error, latency, and brittleness. The physical world doesn’t segment neatly into “perception” and “planning” — it is a continuous stream of force, contact, spatial relationships, and temporal dependencies. A model that learns to navigate that stream as a whole, rather than as a sequence of sub-problems, will generalize better and fail more gracefully.
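The handoff argument can be illustrated with simple arithmetic. The sketch below assumes that each module in a stack succeeds independently with the same probability; both that assumption and the numbers are illustrative and do not come from the source. It shows only how per-stage errors compound, not that an end-to-end model necessarily does better.

```python
def pipeline_success(stage_success: float, num_stages: int) -> float:
    """Probability that every stage succeeds, assuming independent stages."""
    if not 0.0 <= stage_success <= 1.0:
        raise ValueError("stage_success must be a probability in [0, 1]")
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    return stage_success ** num_stages


# Illustrative numbers only: perception, planning, and control modules,
# each correct 95% of the time on a given task.
three_stage = pipeline_success(0.95, 3)
print(f"three-stage pipeline succeeds with probability {three_stage:.3f}")
```

Under these assumptions a three-module stack completes the task about 85.7% of the time even though each module is right 95% of the time, which is the brittleness the modular critique points to.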

“Embodied intelligence is a foundation model for the physical world — independent from, and parallel to, language models and multimodal models for the virtual world,” Wang said at the 2025 MEET conference. “The long-term advancement of robotics intelligence depends on general-purpose AI capability.”

The practical implication is that WALL-A takes language, video, and haptic sensor data as input and outputs velocity, pose, and torque directly — with no intermediate planning layer. The same model controls the company’s wheeled robots, its dexterous hand, and in principle any robotic platform it is trained on. In July 2025, X Square demonstrated what it claims was an industry first: a foundation model controlling a high-degree-of-freedom dexterous hand to perform complex manipulation tasks, including card dealing, without task-specific engineering. The company has also integrated chain-of-thought reasoning natively into the model — enabling robots to break down complex instructions into subtask sequences and reason through them before acting.

Where the Robots Are Actually Working

For all the model capability claims, X Square has been notably focused on real-world deployment — and specifically on environments that are messy, variable, and require genuine generalization.

The company’s most commercially developed vertical is hospitality. It has deployed robots at hotel groups across China, where the Quanta X1 and X2 platforms handle guest reception, luggage assistance, laundry — including the full cycle from collection to washing to folding and return — room delivery, and cleaning. The company has also deployed at elderly care facilities, where robots perform transport, monitoring, and companion tasks around the clock.

In logistics, X Square has deployed at major domestic parcel sorting companies, where its reinforcement learning integration allows robots to sort parcels of diverse shapes at speeds the company says exceed human performance. In retail, it has partnered with supermarket chains where Quanta X1 performs restocking tasks autonomously.

This year, X Square is doubling down on deploying robots in home environments—a far more complex space defined by greater task variability. Unlike controlled settings, homes present real kitchens, real living rooms, and real bedrooms, each with unique layouts, objects, and user habits. A single task like “fetch a cup” plays out differently across a hundred households. True generalization, the company argues, can only be achieved through training on real-world data. After all, “No matter how long you train in a pool, you won’t learn to swim in the ocean.”

These deployments matter beyond revenue. They are X Square’s data engine. The company has invested heavily in proprietary teleoperation and data collection hardware — including exoskeleton-based data capture systems — to generate the high-quality real-world demonstration data that trains WALL-A. Wang Qian has described this closed loop of hardware, data, and model as X Square’s primary competitive moat.

“The next phase of competition in embodied intelligence,” he has said, “is essentially a battle of foundation models built on data closed-loops and their capacity for model evolution.”

EAIDC, viewed through this lens, is also a data strategy. By building a developer ecosystem and hosting a hackathon, X Square is creating a funnel of contributors who extend the model’s coverage into new tasks and environments — without X Square having to collect every data point itself.

Wang Qian’s Worldview: Long-Term Believer, Short-Term Skeptic

For all his confidence in X Square’s own trajectory, Wang Qian is measured — sometimes contrarian — when he talks about the broader robotics industry.

On the long arc, he is a true believer. He thinks robots will be ubiquitous within a decade, handling every category of physical labor that humans currently perform. He has spoken about robots eventually working on Mars. The question for him is never whether; it is when, and who gets the architecture right.

On the near term, he is more cautious. He thinks a number of high-profile robotics companies are over-promising and over-valued, and that the industry would benefit from less hype and more realistic expectations about what the current generation of models can actually do reliably at scale. His view is that the sector is roughly at the equivalent of GPT-2 in the large language model development curve — genuinely impressive, unmistakably the right direction, but still well short of the inflection point that unlocks mass-market utility.

On where robots should be deployed first, Wang Qian has a clear hierarchy. Service industries — hotels, elderly care, property management — come before factories. The reason is partly commercial (service environments reward generalization; factories reward repetition) and partly technical. Factory work, he argues, demands extremely high precision and throughput speeds that current embodied models cannot consistently deliver, while offering limited data diversity in return. Service environments, by contrast, are rich in variation, which is exactly what trains a better general model. “Working in a factory doing simple, repetitive tasks — that’s mostly a PR move,” he said.

On the question of competitive moats, Wang Qian believes hardware will matter more over time than most software-first investors currently assume. Hardware is sticky: switching costs are high, users develop preferences, and physical form factors create brand loyalty in ways that pure software does not. But he also thinks the post-ChatGPT era has fundamentally elevated the importance of model capability — and that companies which under-invest in foundation model research will find their hardware advantages eroded quickly. The winning formula, in his view, is the one X Square is pursuing: hardware and model co-developed from the ground up, with each informing the other.

From Product Definition to Standard Setting: X Square’s Global Leadership

Wang Qian’s confidence stems not only from his proven business logic and a high-execution team but also from the rare distinction of securing backing from Meituan, Alibaba, and ByteDance—three giants that seldom align on investments. Now, as the host of EAIDC, Wang is transitioning X Square from a product company into a convener for the next generation of embodied AI. At CVPR 2026, X Square Robot partnered with Sun Yat-sen University and MBZUAI to launch ManipArena, a platform establishing a unified global benchmark with 20 real-world tasks and 188 hours of high-quality data. By offering cloud-based remote testing, X Square is lowering hardware barriers and defining the rules for the industry’s next phase: building a global, open, and highly collaborative developer ecosystem. As Wang noted, “I don’t need to convince anyone; the results speak.”


Our source is the Bulgarian-Chinese Chamber of Commerce and Industry.
