Voice commerce isn’t a future trend—it’s here and changing how people shop. In 2024, global voice-driven shopping reached $116.8 billion, with projections hitting $151.4 billion in 2025. Behind this rapid growth is a shift toward more natural, frictionless ways to interact with digital services.
Nowhere is this transformation more advanced than in China. Tech giants like Alibaba, Baidu, and JD.com have turned voice assistants into full-service retail engines. These platforms don’t just respond to commands—they drive end-to-end shopping experiences, embedded in smart homes, cars, and daily routines.
This blog unpacks how voice commerce works, why it matters globally, and how China has turned it into a strategic advantage. You’ll learn what makes the Chinese model so effective—and what global retailers can take from it.
Key Takeaways
Here’s a brief overview of the following article:
- Definition of Voice Commerce: Voice commerce lets shoppers use voice assistants such as Tmall Genie, Alexa, or Xiaodu to search for, order, and pay for products with spoken commands instead of screens or keyboards.
- How It Works: These systems use natural language processing to understand voice input and complete transactions through pre-linked payment systems, offering a hands-free shopping experience.
- Why China Leads in Voice Commerce: China’s rapid adoption of smart devices, strong tech ecosystems, and AI voice integration across homes, cars, and retail platforms has made it the global leader in voice-driven shopping.
- Platform Innovations: Companies like Alibaba, Baidu, JD.com, and Xiaomi use voice tech to connect shopping, logistics, and customer service, making transactions faster and more seamless across devices.
- Benefits and Challenges: Voice commerce simplifies routine purchases and supports accessibility, but accuracy, privacy, and complex shopping still present challenges.
What Is Voice Commerce?
Voice commerce (“v-commerce”) refers to any shopping transaction or product search performed through voice-enabled technology.
Instead of manually browsing websites or apps, a voice commerce shopper can simply speak to a digital assistant—Amazon’s Alexa, Google Assistant, Apple’s Siri, or China’s AliGenie—to find products, ask questions, and even complete purchases.
These systems rely on voice recognition technology and natural language processing to understand and execute purchase requests.
How It Works
Voice commerce begins with a voice assistant such as Tmall Genie, Alexa, or Google Assistant listening for a trigger phrase. Once activated, the system processes the spoken request, interprets the user’s intent, and accesses relevant product or service databases.
If the item is available, the assistant provides options, confirms details, and completes the transaction through pre-linked payment systems.
For example, someone might say “Alexa, add milk to my cart” or ask “AliGenie, order the top-selling smartphone on Tmall”. The voice assistant uses natural language processing to understand the request and facilitate the transaction.
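The flow described above (trigger phrase, intent, product lookup, confirmation, payment) can be sketched in a few lines of Python. This is a toy illustration only: real assistants use speech recognition and trained language models rather than regular expressions, and the catalog, wake words, and function names below are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical catalog and wake words, invented for illustration.
CATALOG = {"milk": 12.5, "toothpaste": 9.9}
WAKE_WORDS = ("alexa", "tmall genie")


@dataclass
class Intent:
    action: str  # "add_to_cart" or "order"
    item: str


def parse_utterance(text: str) -> Optional[Intent]:
    """Toy stand-in for speech recognition and NLP: wake-word check, then regex matching."""
    lowered = text.lower().strip().rstrip(".!?")
    wake = next((w for w in WAKE_WORDS if lowered.startswith(w)), None)
    if wake is None:
        return None  # without the trigger phrase the assistant stays idle
    command = lowered[len(wake):].lstrip(" ,")
    match = re.match(r"add (?P<item>.+?) to my cart$", command)
    if match:
        return Intent("add_to_cart", match.group("item"))
    match = re.match(r"(?:order|reorder) (?P<item>.+)$", command)
    if match:
        return Intent("order", match.group("item"))
    return None


def handle(text: str, confirmed: bool = False) -> str:
    """Look up the item, ask for confirmation, then 'pay' with the pre-linked method."""
    intent = parse_utterance(text)
    if intent is None:
        return "ignored"
    price = CATALOG.get(intent.item)
    if price is None:
        return f"Sorry, I could not find {intent.item}."
    if intent.action == "add_to_cart":
        return f"Added {intent.item} to your cart."
    if not confirmed:
        return f"{intent.item} costs {price:.2f}. Say yes to confirm."
    return f"Ordered {intent.item} for {price:.2f} using your linked payment method."


print(handle("Alexa, add milk to my cart"))
print(handle("Tmall Genie, reorder toothpaste", confirmed=True))
```

The confirmation step is kept separate from the order step on purpose: a spoken purchase usually needs an explicit "yes" before any payment is triggered.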
This hands-free shopping mode is gaining traction worldwide because it is both convenient and intuitive.
It feels like chatting, making online shopping more accessible for people who find typing or navigating screens cumbersome. This can benefit the visually impaired or elderly by simplifying how they interact with e-commerce.
Although still a relatively new slice of the retail market, voice commerce is expanding fast: the global figures of $116.8 billion for 2024 and a projected $151.4 billion for 2025 imply growth of roughly $34.6 billion, or about 30%, in a single year.
Voice Commerce Use Cases Around the World
Smart speakers and voice assistants have become everyday tools in homes worldwide. From quick shopping tasks to reordering household staples, voice commerce is becoming part of daily routines, especially in North America and Europe.
In these markets, most voice shopping revolves around speed and convenience. People use assistants like Alexa or Google Assistant to:
- Reorder groceries or household items
- Check delivery status
- Book appointments or services
- Buy digital products like music or movie rentals
These tasks are typically repeat-based or low-stakes, ideal for situations where users know precisely what they need. For example, saying “Alexa, reorder toothpaste” is faster than opening an app and searching manually. This simplicity is a major driver of adoption.
However, the focus in most Western markets remains narrow. Voice commerce is primarily used for single-purpose tasks, and product discovery still leans heavily on screens.
China, by contrast, has pushed voice commerce further. Its platforms integrate voice into complete shopping journeys—from browsing to checkout to post-sale support—creating a much deeper role for voice in retail.
Why China Is Leading in Voice Commerce
China has emerged as the global frontrunner in voice commerce, not just in scale but also in how deeply voice is integrated into everyday retail. While many countries use voice for quick, simple tasks, China has built entire shopping ecosystems around it.
China’s voice assistant market was valued at $858 million in 2024 and is projected to grow at a CAGR of 31.05%, reaching roughly $4,347 million (about $4.35 billion) by 2030.
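For readers who want to sanity-check the projection, here is a minimal sketch of the compound-growth arithmetic. It assumes 2024 is the base year, so there are six compounding periods to 2030; the function name is ours, not from any report.

```python
def project(value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return value * (1 + cagr) ** years


start_musd = 858.0       # 2024 market size in $ millions, from the text
cagr = 0.3105            # 31.05% per year, from the text
years = 2030 - 2024      # six periods (assumption: 2024 is the base year)

projected = project(start_musd, cagr, years)
print(f"Projected 2030 market size: ${projected:,.0f} million")
```

Under that assumption the result lands within a few million dollars of the quoted 2030 figure, so the three numbers are consistent with each other.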
Platforms like Alibaba, Baidu, JD.com, and Xiaomi have embedded voice tools into their e-commerce, logistics, and smart home ecosystems.
In China, voice is not a standalone feature—it’s part of a connected, intelligent network that spans online shopping, payments, customer service, and smart home control. Whether ordering breakfast, booking movie tickets in your car, or restocking kitchen staples from your fridge, voice is becoming a central interface for commerce.
The following section will explain how each major platform—Alibaba, Baidu, JD.com, and Xiaomi—has shaped this transformation.
China’s Platform-Specific Innovations in Voice Commerce
China’s largest tech companies are driving its voice commerce revolution. Alibaba, Baidu, Xiaomi, and JD.com have each taken a distinct approach, creating voice experiences tightly linked to their own ecosystems.
Xiaomi’s XiaoAi
As of 2025, XiaoAi is China’s most widely used voice assistant embedded in consumer hardware. Developed by Xiaomi, it powers everything from smartphones and TVs to air conditioners, wearables, and even electric vehicles, making voice control a seamless part of daily life.
The platform’s scale is massive. Xiaomi’s platform now connects 943.7 million IoT devices (excluding smartphones), and the Mi Home app has reached 106.4 million monthly active users, up 19.5% year-over-year. XiaoAi handles millions of voice interactions daily, serving as the default interface across Xiaomi’s smart ecosystem.
Core Use Cases in 2025
- Smart home routines: Control lights, appliances, and air quality with simple voice commands.
- Voice shopping on Mi TVs: Browse and buy products via Dongfang Shopping without touching a remote.
- In-car assistant: In the SU7 EV, with 75,869 units delivered in Q1 2025, XiaoAi handles navigation, music, and smart home sync.
- Wearables and smart glasses: Offer real-time translation, reminders, and voice-activated tasks on the go.
XiaoAi’s strength lies in its integration. Over 19.3 million users operate five or more connected devices, creating a tightly woven voice-driven experience. It also supports Xiaomi’s high-margin internet services, which generated RMB 9.1 billion in revenue in Q1 2025 with a 76.9% gross margin.
By embedding voice across all touchpoints—home, car, screen, and wearables—XiaoAi turns ambient voice interaction into commerce, control, and convenience. It’s not an add-on, but the connective layer powering one of China’s most advanced smart ecosystems.
Alibaba (Tmall Genie)
Alibaba has led the voice commerce movement through its Tmall Genie smart speaker and the AliGenie voice assistant. Launched in 2017, Tmall Genie was built to handle voice-based shopping from the ground up.
Users can:
- Check flash sale deals
- Reorder household items
- Track deliveries
- Make payments via voice
Alibaba has expanded voice commerce beyond the home. Tmall Genie is now integrated into vehicles from Audi and Honda, allowing drivers to shop, book tickets, or place orders while on the road. The AliMe voice assistant is also embedded in the Taobao and Alipay apps, enabling voice search, bill payment, and in-app purchases.
Voice has also become a branding tool. In a partnership with KFC, Alibaba launched a Colonel Sanders-themed Tmall Genie that lets users order meals by speaking. Another collaboration with Starbucks lets customers place voice orders and receive deliveries through Ele.me—all powered by a quick spoken request.
These integrations show that Alibaba doesn’t treat voice as an add-on—it builds it directly into its retail, food delivery, and mobility platforms.
Baidu’s Xiaodu
Baidu’s DuerOS voice assistant powers Xiaodu smart speakers and a wide range of connected devices. Once handling around 1 billion voice searches daily, DuerOS is Baidu’s core tool for integrating commerce with natural dialogue.
Its applications include:
- Smart TVs and home appliances that recommend products
- Smart fridges that remind users to restock and reorder items through JD.com or Baidu Mall
- In-car systems that allow drivers to shop or search without taking their hands off the wheel
By embedding voice into everyday objects, Baidu turns passive devices into active retail portals and positions voice as the default way to interact with its digital services.
JD.com: Voice Commerce as Logistics Infrastructure
JD.com brings a unique angle to voice commerce by focusing on logistics and smart supply chains. Its proprietary voice assistant is embedded into the JD+ smart home platform and screen-equipped devices.
With JD’s voice interface, users can:
- Check order status
- Schedule or reschedule deliveries
- Confirm installation services
- Modify delivery preferences
Voice commerce on JD isn’t limited to screens or apps. TV boxes let users navigate products with spoken commands while seeing them on-screen.
JD’s delivery robots—already active in cities like Changsha and Beijing—respond to voice input for ID checks, package retrieval, and route changes. This bridges the gap between digital requests and real-world fulfillment.
Beyond the Big Four: Daily Voice Use in China’s Ecosystem
China’s broader voice ecosystem goes far beyond individual brands. Voice is already part of how millions interact with shopping and services across platforms.
- Smart Speakers as Shopping Assistants: Beyond Alibaba’s Tmall Genie, JD.com’s LingLong enables voice orders of household staples, while Xiaomi’s XiaoAi users can shop via Mi TVs. These devices excel at “low-touch” purchases, such as replenishing toothpaste or buying rice, where convenience trumps product comparison.
- Hyper-Convenient Food & Grocery: Voice eliminates friction in time-sensitive scenarios. On Ele.me and Meituan, users can speak orders like “Deliver hotpot for two at 7 PM” while cooking. Alibaba’s Hema Fresh stores even let shoppers add items to their cart mid-aisle by speaking to the app.
- Cars as Mobile Checkout Lanes: Baidu’s DuerOS powers voice commerce in XPeng and NIO EVs, where drivers book restaurants or pay for parking without lifting a hand. This isn’t futuristic—it’s already routine in China’s connected-car ecosystem.
- Social Commerce Gets Vocal: On WeChat, brands like Nike use Mini Programs to process voice queries like “Find running shoes under 800 RMB.” Douyin’s live sellers answer spoken questions during streams, merging entertainment and instant checkout.
- Healthcare: Voice as a Lifeline: Elderly users rely on Alibaba Health to refill prescriptions by voice. AI symptom checkers parse spoken descriptions (e.g., “I have a fever and cough”) to recommend products.
What Global Brands Can Learn from China’s Voice Commerce Model
While many global retailers experiment with voice assistants, China’s voice commerce evolution reveals a strategic shift in how digital retail is structured and monetized. The real takeaway isn’t about devices but the business model redesign behind them. Here’s what global brands are missing:
1. Voice Commerce Is a Traffic-Generating Engine for Owned Ecosystems
Chinese platforms don’t rely on third-party voice ecosystems. Alibaba, JD.com, and Baidu created proprietary voice assistants to ensure voice interactions stay inside their retail environments, boosting customer retention, reducing acquisition costs, and collecting first-party data without platform fees.
Strategic Lesson: Global brands relying on Alexa or Google Assistant risk losing traffic, insights, and transactional control. To own the customer journey, brands must develop closed-loop voice experiences via native apps, in-car systems, or white-label assistants.
2. Voice-Driven Commerce Unlocks Untapped User Segments
China’s voice-first strategy has accelerated adoption in nontraditional user groups: the elderly, rural consumers, and mobility-impaired users. Voice simplifies access for users who avoid screens, expanding the consumer base without expensive UI localization or device upgrades.
Missed Opportunity: Western brands often optimize voice for tech-savvy early adopters. In contrast, China proves that inclusive design can open entirely new markets. Brands should use voice to serve populations excluded by screen-first design, especially in aging economies like Japan, Germany, and South Korea.
3. Voice Is a Loyalty Infrastructure, Not a One-Time Interaction
Chinese platforms use voice prompts to drive habit loops—e.g., daily deal alerts, shopping reminders, or “smart restocking” nudges. This behavior conditioning isn’t just convenient—it builds loyalty through automation. Consumers are more likely to reorder from a voice assistant that already “knows” them.
Brand Takeaway: Global retailers should design voice flows that proactively re-engage users with contextual prompts (e.g., “Want to reorder your gym supplements for the month?”). This turns passive voice tools into CRM engines.
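As a rough illustration of a "smart restocking" nudge, the sketch below checks past orders and generates a spoken prompt when an item is about to run out. The `PastOrder` structure, the `days_supply` estimate, and the three-day lead time are all assumptions made for this example, not a description of any platform's actual logic.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class PastOrder:
    item: str
    ordered_on: date
    days_supply: int  # hypothetical estimate of how long one purchase lasts


def restock_prompts(orders: List[PastOrder], today: date, lead_days: int = 3) -> List[str]:
    """Return a prompt for every item expected to run out within lead_days."""
    prompts = []
    for order in orders:
        runs_out = order.ordered_on + timedelta(days=order.days_supply)
        if runs_out - today <= timedelta(days=lead_days):
            prompts.append(f"Want to reorder your {order.item}?")
    return prompts


history = [
    PastOrder("gym supplements", date(2025, 1, 1), 30),
    PastOrder("rice", date(2025, 1, 20), 30),
]
print(restock_prompts(history, today=date(2025, 1, 29)))
```

The point of the design is that the prompt is triggered by the customer's own purchase history rather than by a generic campaign calendar.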
4. Voice Isn’t Replacing Visual Commerce—It’s Restructuring It
In China, voice commerce isn’t positioned as a replacement for visual interfaces—it’s used to filter noise and direct intent. Smart displays don’t eliminate screens; they prioritize what to show based on voice inputs. This changes how visual content is curated, not how it disappears.
Implication: Western brands should rethink their product discovery UX. Instead of relying on endless scrolls, voice-filtered curation can surface fewer, better-matched SKUs, boosting decision speed and reducing drop-offs.
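To make "voice-filtered curation" concrete, here is a minimal sketch that turns a spoken query such as "Find running shoes under 800 RMB" into a short, ranked list. The catalog rows, product names, and the regex-based parsing are invented for illustration; a production system would rely on a proper language-understanding model.

```python
import re
from typing import List, Tuple

# Hypothetical catalog rows: (name, category, price in RMB, rating)
SKUS: List[Tuple[str, str, int, float]] = [
    ("Trail Runner", "running shoes", 899, 4.7),
    ("City Runner", "running shoes", 499, 4.4),
    ("Track Lite", "running shoes", 549, 4.5),
    ("Court Classic", "sneakers", 599, 4.6),
]


def curate(query: str, skus=SKUS, limit: int = 2) -> List[str]:
    """Extract category and price cap from the query, then return the top-rated matches."""
    match = re.search(r"find (?P<category>.+?) under (?P<cap>\d+)", query.lower())
    if not match:
        return []
    category = match.group("category")
    cap = int(match.group("cap"))
    matches = [s for s in skus if s[1] == category and s[2] < cap]
    matches.sort(key=lambda s: s[3], reverse=True)  # best rated first
    return [s[0] for s in matches[:limit]]


print(curate("Find running shoes under 800 RMB"))
```

Instead of an endless scroll, the shopper sees only the few items that satisfy the spoken constraints, which is the behavior this section argues for.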
5. Voice Commerce Success Requires Retail + Infrastructure Collaboration
China’s success stems from tight collaboration between retailers, telecoms, device OEMs, and city-level policy makers. Smart speakers ship pre-configured for commerce. EVs integrate payment-linked voice assistants. In some smart cities, even public kiosks offer voice-enabled transactions.
Lesson for Global Markets: Brands need to move beyond a direct-to-consumer strategy. Voice commerce at scale requires B2B2C alliances—with carmakers, appliance brands, hotels, and telcos—to embed voice deeper into real-world environments.
6. Voice Data as a Differentiator in Product Development
In China, voice search data shapes not just marketing but product design. If 100,000 users ask for “low-sugar soy milk” monthly, Alibaba’s supply chain partners adjust inventory, packaging, and pricing. Voice queries surface unfiltered, high-intent insights that traditional surveys miss.
Strategic Edge: Global CPG and DTC brands can use anonymized voice queries to detect unmet needs, seasonal trends, or regional demand. This gives product teams a faster feedback loop than social media or email.
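A simple way to picture this feedback loop is a frequency count over anonymized query text. The sketch below is an assumption-laden toy: the sample queries and the `min_count` threshold are made up, and real pipelines would also group near-duplicate phrasings.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def demand_signals(queries: Iterable[str], min_count: int = 2) -> List[Tuple[str, int]]:
    """Count normalized queries and keep those asked at least min_count times."""
    counts = Counter(q.strip().lower() for q in queries)
    return [(query, n) for query, n in counts.most_common() if n >= min_count]


sample = [
    "Low-sugar soy milk",
    "low-sugar soy milk ",
    "oat milk",
    "low-sugar soy milk",
    "oat milk",
    "almond milk",
]
print(demand_signals(sample))
```

Anything that clears the threshold becomes a candidate signal for product, inventory, or pricing teams to investigate further.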
7. Regulatory Navigation Determines Long-Term Viability
While most global discussions focus on voice UX, Chinese platforms invest heavily in compliance infrastructure. They align early with evolving government requirements on biometric data, consent protocols, and AI explainability, giving them a head start when voice regulations tighten.
Reality Check: Global brands need voice commerce strategies that anticipate regulation, not retrofit around it. That means building opt-in models, voice-data minimization frameworks, and regional compliance gates from day one, especially in GDPR and CPRA regions.
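The idea of opt-in, data minimization, and regional gates can be sketched as a small policy check that runs before any voice data is stored. Everything here is hypothetical: the region codes, the policy fields, and the values are placeholders for illustration, not legal guidance or a summary of what GDPR or CPRA actually require.

```python
from typing import Optional

# Hypothetical per-region policy table; values are placeholders, not legal advice.
REGION_POLICY = {
    "EU": {"requires_opt_in": True, "store_raw_audio": False},
    "US-CA": {"requires_opt_in": True, "store_raw_audio": False},
    "SANDBOX": {"requires_opt_in": True, "store_raw_audio": True},  # internal test region
}
# Unknown regions fall back to the strictest settings.
DEFAULT_POLICY = {"requires_opt_in": True, "store_raw_audio": False}


def build_record(region: str, opted_in: bool, transcript: str, audio: bytes) -> Optional[dict]:
    """Return the data allowed to be stored, or None if consent is missing."""
    policy = REGION_POLICY.get(region, DEFAULT_POLICY)
    if policy["requires_opt_in"] and not opted_in:
        return None  # opt-in gate: nothing is stored without consent
    record = {"region": region, "transcript": transcript}
    if policy["store_raw_audio"]:
        record["audio"] = audio  # minimization: raw audio kept only where allowed
    return record


print(build_record("EU", True, "reorder milk", b"raw-bytes"))
```

Keeping the rules in one table means a new market or a tightened regulation becomes a configuration change rather than a rewrite of the voice flow.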
Challenges and Future Outlook
Despite rapid growth, voice commerce still faces important challenges, many of which China’s platforms actively address. From accuracy and trust to the limits of voice-only shopping, the next development phase will depend on solving these friction points.
Accuracy and Language Localization
One concern is accuracy and language localization—voice assistants must flawlessly understand diverse languages, dialects, and accents.
China’s voice pioneers have invested heavily in mastering Mandarin’s tones and regional dialects. However, extending that precision to every user and setting, including non-native speakers and noisy environments, is still a work in progress.
Privacy and Security
Privacy and security are other considerations. Any device listening for voice commands could record sensitive information, and consumers need confidence that voice payments are secure.
Chinese providers have tackled security with features like voiceprint ID for authentication, yet users globally still worry about data privacy with always-on microphones. Over time, stricter data practices and user education will be key to addressing these trust issues.
Shopping Experience via Voice
Another practical challenge is the shopping experience via voice. Voice commerce works brilliantly for straightforward tasks (re-ordering known items, quick queries like store hours, or simple purchases like movie tickets). However, it can be less ideal for browsing complex product categories or comparing options, tasks where visual information is helpful.
Chinese companies are innovating around this limitation. Alibaba, for instance, introduced smart speakers with screens (smart displays) that combine voice with visual feedback. Users can speak requests and see product images or live videos on the device. This multimodal approach will likely grow, merging voice convenience with visual detail.
Moreover, as AI improves, voice assistants will better personalize results and read back only the most relevant info, making purely audio shopping more efficient.
Future Outlook
Looking ahead, the future of voice-driven shopping seems extremely promising, with China at the forefront. Industry forecasts suggest that voice commerce will continue its explosive growth. Chinese companies will likely export these voice-commerce technologies and models to other countries as they innovate.
We’re already seeing hints of this: Alibaba’s voice tech has been used in Marriott hotels in China to let guests order services by voice, and such ideas can be applied internationally.
In the coming years, expect voice shopping to become even more seamless and deeply embedded in daily life. Consumers might converse with virtual shopping assistants who know their preferences, whether at home, in the car, or walking down the street.
China’s experience shows that when voice technology is accurate, convenient, and integrated with popular services, people enthusiastically welcome it. Ultimately, voice commerce is about removing friction from shopping.
Chinese retailers have illustrated just how far this can go, from buying a grocery item with a simple command to experiencing an entire shopping festival hands-free. The rest of the world is watching and learning.
As voice-driven shopping moves from novelty to norm, China’s pioneering journey offers a glimpse of a future where shopping might be as easy as asking aloud for what we need – and having it delivered, no clicks required.
From Trend to Strategy: Turning China’s Voice Commerce Insights Into Action
Voice commerce in China isn’t an isolated trend—it’s part of a larger transformation that has made the country the world’s most advanced digital commerce market. As global retailers study China’s rapid evolution in AI, omnichannel retail, and consumer personalization, one question stands out: How do you turn China’s breakthroughs into strategy?
Ashley Dudarenok has the answer.
Named one of the World’s Top 100 Retail Influencers by RETHINK Retail and a Top Voice in Marketing by LinkedIn, Ashley is a naturalized Chinese serial entrepreneur with over 15 years of experience helping global companies navigate China’s fast-changing digital ecosystem.
She has advised executives from LVMH, Clarins, Shiseido, Saatchi & Saatchi, and Fortune 500 boards, delivering actionable insights on how Chinese platforms like Alibaba, JD.com, and Tencent drive digital transformation across voice, AI, livestream, and social commerce.
Why Book Ashley Dudarenok?
- Firsthand Insight from the Source: Member of Alibaba’s Global Influencer Entourage, and part of JD.com and Pinduoduo’s Global China Experts Group.
- Results-Driven Content: Clients report accelerated innovation, smarter customer targeting, and deeper digital integration after working with her.
- Global Experience: Ashley has delivered keynotes on five continents, bridging China’s pace with global brand expectations.
- Trusted By Industry Leaders: Executives from Estée Lauder, Harley-Davidson, and GrandVision NV rate her sessions as “essential.”
If your team is exploring voice commerce and customer-centric retail models, or wants to future-proof its China strategy, Ashley offers the expertise, energy, and clarity needed to move from trend-watching to tactical execution.
Book Ashley Dudarenok today to turn China’s digital edge into your competitive advantage.
FAQs on What is Voice Commerce
- How does voice commerce work in China?
In China, voice commerce is integrated into daily life through platforms like Tmall Genie and Baidu Xiaodu. Users speak commands to order products, track deliveries, or make payments. These systems connect directly to e-commerce, logistics, and payment platforms for seamless, end-to-end transactions.
- Why is China leading in voice commerce?
China leads due to rapid smart device adoption, deep platform integration, and localized AI that understands Mandarin and regional dialects. Companies like Alibaba and Baidu have embedded voice into retail, logistics, and smart home systems, creating a full-cycle commerce experience through spoken input.
- What are the leading voice commerce platforms in China?
Tmall Genie (Alibaba), Xiaodu (Baidu), and XiaoAi (Xiaomi) are China’s top voice assistants. Each connects to its parent ecosystem, allowing users to shop, pay bills, reorder goods, and access services using voice commands across home, mobile, and automotive environments.
- What are the benefits of voice commerce for consumers?
Voice commerce simplifies shopping by enabling fast, hands-free transactions. It’s ideal for routine purchases, multitasking, and accessibility. In China, it enhances convenience by integrating into smart homes, vehicles, and apps people use daily, reducing friction in the purchase journey.
- Is voice commerce secure?
Yes, leading platforms in China secure voice transactions through voiceprint ID, facial recognition, and encrypted payments. While privacy concerns exist globally, Chinese systems emphasize multi-factor authentication and biometric verification to protect user data and payment credentials.
- What kinds of products are bought through voice in China?
Common voice purchases include groceries, household items, digital content, and takeaway meals. Users also book movie tickets, request delivery updates, or reorder frequent items. Voice commerce in China thrives on convenience and repeat purchases embedded in everyday routines.
- How do smart speakers support voice shopping in China?
Smart speakers are centralized assistants that link voice commands to apps, shopping platforms, and home devices. Tmall Genie and Xiaodu, for example, allow users to browse products, make payments, and control connected appliances, all with simple voice prompts.
- How is China addressing voice recognition challenges?
Chinese companies have trained AI on diverse Mandarin dialects, tonal variations, and informal speech. This localization improves accuracy across regions and user groups, and voice systems continue to evolve to handle noisy settings and user diversity more effectively.
- Can voice commerce replace traditional shopping channels?
Voice complements rather than replaces visual browsing. It’s highly effective for simple, repeat purchases and quick tasks. In China, smart displays are now combining voice with screens to offer a multimodal experience, merging convenience with visual clarity for more complex shopping needs.
- How are Chinese brands using voice to enhance customer experience?
Brands like KFC and Starbucks have integrated voice assistants for ordering, delivery, and personalization. Tmall Genie enables branded voice interactions, allowing users to speak directly to services and receive customized responses tied to past orders or preferences.
- What can global retailers learn from China’s voice commerce model?
Global retailers can learn to build tightly integrated ecosystems, localize voice UX for natural behavior, and treat voice as a gateway to full-service experiences. China shows success comes from embedding voice across platforms, not treating it as a standalone feature.