The world of edge AI is rapidly evolving, and M5Stack has made it incredibly accessible with their CoreS3 SE and LLM Module Kit combination. In this comprehensive tutorial, we’ll walk through creating a fully functional offline AI voice assistant that can understand speech, process natural language, and respond with synthesized speech—all without requiring an internet connection.
What You’ll Need
Hardware Components
- M5Stack CoreS3 SE IoT Controller ($38.90) – A lightweight version of the CoreS3 featuring ESP32-S3, 2.0″ capacitive touch screen, and built-in speaker/microphone
- M5Stack LLM Module Kit ($49.90) – Includes the LLM module with AX630C processor and LLM Mate module for connectivity
Software Requirements
- Arduino IDE
- M5Stack LLM library
- USB-C cable for programming
Understanding the Hardware
M5Stack CoreS3 SE Features
The CoreS3 SE is a streamlined version of the popular CoreS3, designed specifically for IoT applications. Key specifications include:
- Processor: ESP32-S3 dual-core Xtensa LX7 @ 240MHz
- Memory: 16MB Flash, 8MB PSRAM
- Display: 2.0″ capacitive touch IPS screen (320×240)
- Audio: Built-in 1W speaker and dual microphones
- Connectivity: WiFi, USB-C with OTG support
- Power Management: AXP2101 chip for efficient power consumption
Compared to the full CoreS3, the SE version omits the camera, proximity sensor, IMU, and compass to focus on core functionality at a lower price point.
LLM Module Kit Capabilities
The LLM Module Kit is where the AI magic happens:
- Processor: AX630C SoC with dual Cortex-A53 @ 1.2GHz
- AI Performance: 3.2 TOPS @ INT8 precision
- Memory: 4GB LPDDR4 (1GB system + 3GB for AI acceleration)
- Storage: 32GB eMMC with pre-installed Ubuntu system
- AI Functions: KWS (wake word), ASR (speech recognition), LLM (language model), TTS (text-to-speech)
- Power Consumption: Only ~1.5W at full load
Setting Up the Development Environment
Step 1: Arduino IDE Configuration
- Install the latest Arduino IDE
- Add the M5Stack board package to your board manager
- Select “M5Stack CoreS3” as your target board
- Install the M5Stack LLM library from the Library Manager
Step 2: Library Installation
Search for and install the “M5Module LLM” library. This provides all the necessary functions to communicate with the LLM module and handle AI operations.
The Complete Voice Assistant Code
Here’s the actual working code for our voice assistant application, taken directly from the M5Stack examples:
/*
 * SPDX-FileCopyrightText: 2024 M5Stack Technology CO LTD
 *
 * SPDX-License-Identifier: MIT
 */
#include <Arduino.h>
#include <M5Unified.h>
#include <M5ModuleLLM.h>

M5ModuleLLM module_llm;
M5ModuleLLM_VoiceAssistant voice_assistant(&module_llm);

/* On ASR data callback */
void on_asr_data_input(String data, bool isFinish, int index)
{
    M5.Display.setTextColor(TFT_GREEN, TFT_BLACK);
    // M5.Display.setFont(&fonts::efontCN_12);  // Support Chinese display
    M5.Display.printf(">> %s\n", data.c_str());

    /* If ASR data is finish */
    if (isFinish) {
        M5.Display.setTextColor(TFT_YELLOW, TFT_BLACK);
        M5.Display.print(">> ");
    }
}

/* On LLM data callback */
void on_llm_data_input(String data, bool isFinish, int index)
{
    M5.Display.print(data);

    /* If LLM data is finish */
    if (isFinish) {
        M5.Display.print("\n");
    }
}

void setup()
{
    M5.begin();
    M5.Display.setTextSize(2);
    M5.Display.setTextScroll(true);

    /* Init module serial port */
    // int rxd = 16, txd = 17;  // Basic
    // int rxd = 13, txd = 14;  // Core2
    // int rxd = 18, txd = 17;  // CoreS3
    int rxd = M5.getPin(m5::pin_name_t::port_c_rxd);
    int txd = M5.getPin(m5::pin_name_t::port_c_txd);
    Serial2.begin(115200, SERIAL_8N1, rxd, txd);

    /* Init module */
    module_llm.begin(&Serial2);

    /* Make sure module is connected */
    M5.Display.printf(">> Check ModuleLLM connection..\n");
    while (1) {
        if (module_llm.checkConnection()) {
            break;
        }
    }

    /* Begin voice assistant preset */
    M5.Display.printf(">> Begin voice assistant..\n");
    int ret = voice_assistant.begin("HELLO");
    // int ret = voice_assistant.begin("你好你好", "", "zh_CN");  // Chinese kws and asr
    if (ret != MODULE_LLM_OK) {
        while (1) {
            M5.Display.setTextColor(TFT_RED);
            M5.Display.printf(">> Begin voice assistant failed\n");
        }
    }

    /* Register on ASR data callback function */
    voice_assistant.onAsrDataInput(on_asr_data_input);

    /* Register on LLM data callback function */
    voice_assistant.onLlmDataInput(on_llm_data_input);

    M5.Display.printf(">> Voice assistant ready\n");
}

void loop()
{
    /* Keep voice assistant preset update */
    voice_assistant.update();
}
Code Breakdown and Explanation
Library Includes and Object Creation
#include <Arduino.h>
#include <M5Unified.h>
#include <M5ModuleLLM.h>
M5ModuleLLM module_llm;
M5ModuleLLM_VoiceAssistant voice_assistant(&module_llm);
The code starts by including the necessary libraries:
- M5Unified.h: Provides unified access to all M5Stack hardware features
- M5ModuleLLM.h: Contains the LLM module communication and AI functionality

Two main objects are created:

- module_llm: Handles low-level communication with the LLM module
- voice_assistant: Provides high-level voice assistant functionality
ASR (Speech Recognition) Callback Function
void on_asr_data_input(String data, bool isFinish, int index)
{
    M5.Display.setTextColor(TFT_GREEN, TFT_BLACK);
    M5.Display.printf(">> %s\n", data.c_str());

    if (isFinish) {
        M5.Display.setTextColor(TFT_YELLOW, TFT_BLACK);
        M5.Display.print(">> ");
    }
}
This callback function is triggered whenever the speech recognition system processes audio input:
- Real-time Display: Shows recognized text in green as it’s being processed
- Progressive Updates: The isFinish parameter indicates when speech recognition is complete
- Visual Feedback: Changes text color to yellow when ready for the next input
- User Experience: Provides immediate visual feedback of what the system “heard”
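As a simple extension, you can mirror each recognized fragment to the USB serial monitor while keeping the on-screen behavior unchanged. This is a minimal sketch, assuming M5.begin() has already initialized the default Serial port (M5Unified does this by default):

/* Hedged sketch: mirror ASR text to the serial monitor for debugging */
void on_asr_data_input(String data, bool isFinish, int index)
{
    M5.Display.setTextColor(TFT_GREEN, TFT_BLACK);
    M5.Display.printf(">> %s\n", data.c_str());
    Serial.printf("[ASR%s] %s\n", isFinish ? " final" : "", data.c_str());

    if (isFinish) {
        M5.Display.setTextColor(TFT_YELLOW, TFT_BLACK);
        M5.Display.print(">> ");
    }
}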
LLM (Language Model) Response Callback
void on_llm_data_input(String data, bool isFinish, int index)
{
    M5.Display.print(data);

    if (isFinish) {
        M5.Display.print("\n");
    }
}
This callback handles the AI’s response generation:
- Streaming Output: Displays the AI response as it’s being generated (token by token)
- Natural Flow: Creates a typewriter effect showing the AI “thinking” and responding
- Completion Handling: Adds a newline when the response is complete
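If you want a rough sense of response length, the callback can also tally characters across fragments. A minimal sketch, assuming each invocation delivers one streamed fragment:

/* Hedged sketch: count streamed characters and show a total at the end */
void on_llm_data_input(String data, bool isFinish, int index)
{
    static size_t chars = 0;  // persists across callback invocations
    chars += data.length();
    M5.Display.print(data);

    if (isFinish) {
        M5.Display.printf("\n(%u chars)\n", (unsigned)chars);
        chars = 0;  // reset for the next response
    }
}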
Setup Function – Hardware Initialization
void setup()
{
    M5.begin();
    M5.Display.setTextSize(2);
    M5.Display.setTextScroll(true);
The setup begins with basic hardware initialization:
- M5.begin(): Initializes all M5Stack hardware components
- setTextSize(2): Sets readable text size for the 2″ display
- setTextScroll(true): Enables automatic scrolling when text fills the screen
Serial Communication Setup
int rxd = M5.getPin(m5::pin_name_t::port_c_rxd);
int txd = M5.getPin(m5::pin_name_t::port_c_txd);
Serial2.begin(115200, SERIAL_8N1, rxd, txd);
This section configures the serial communication between CoreS3 and the LLM module:
- Dynamic Pin Assignment: Uses M5.getPin() to automatically get the correct pins for different M5Stack models
- Port C Connection: The LLM module connects via the Port C interface
- Standard Settings: 115200 baud rate with 8 data bits, no parity, 1 stop bit
- Hardware Flexibility: The commented lines show pin configurations for different M5Stack models
Module Connection and Verification
module_llm.begin(&Serial2);

M5.Display.printf(">> Check ModuleLLM connection..\n");
while (1) {
    if (module_llm.checkConnection()) {
        break;
    }
}
This critical section ensures reliable communication:
- Module Initialization: Starts the LLM module communication
- Connection Verification: Continuously checks until the module responds
- User Feedback: Displays connection status on screen
- Robust Startup: Won’t proceed until connection is established
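One caveat: the example polls in a tight loop. A short delay between checks is gentler on the UART while the module boots, and a timeout-based variant appears under "Enhanced Error Handling" later in this article. A minimal sketch:

/* Hedged sketch: poll every 500 ms instead of as fast as possible */
while (!module_llm.checkConnection()) {
    delay(500);  // give the module time to boot between checks
}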
Voice Assistant Configuration
M5.Display.printf(">> Begin voice assistant..\n");
int ret = voice_assistant.begin("HELLO");
// int ret = voice_assistant.begin("你好你好", "", "zh_CN"); // Chinese kws and asr
if (ret != MODULE_LLM_OK) {
while (1) {
M5.Display.setTextColor(TFT_RED);
M5.Display.printf(">> Begin voice assistant failed\n");
}
}
The voice assistant initialization includes:
- Wake Word Setup: Configures “HELLO” as the activation phrase
- Language Support: Shows how to enable Chinese wake words and ASR
- Error Handling: Displays failure message if initialization fails
- System Reliability: Prevents operation if the assistant can’t start properly
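Note that the failure branch re-prints its message forever, which quickly fills the scrolling display. A hedged alternative prints the error once and then halts:

/* Hedged sketch: report the failure once, then stop */
if (ret != MODULE_LLM_OK) {
    M5.Display.setTextColor(TFT_RED);
    M5.Display.printf(">> Begin voice assistant failed\n");
    while (true) {
        delay(1000);  // halt here; power-cycle after reseating the module
    }
}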
Callback Registration
voice_assistant.onAsrDataInput(on_asr_data_input);
voice_assistant.onLlmDataInput(on_llm_data_input);
This registers our callback functions:
- Event-Driven Architecture: The system calls these functions when events occur
- Separation of Concerns: Display logic is separated from AI processing
- Customizable Responses: Easy to modify how the system responds to events
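Because registration simply takes a function with the matching signature, a capture-less lambda also works for quick experiments. A sketch, assuming the library accepts anything convertible to its callback type:

/* Hedged sketch: inline callback via a capture-less lambda */
voice_assistant.onAsrDataInput([](String data, bool isFinish, int index) {
    M5.Display.printf(">> %s\n", data.c_str());  // minimal display-only handler
});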
Main Loop – Continuous Operation
void loop()
{
    voice_assistant.update();
}
The main loop is elegantly simple:
- Single Responsibility: Only needs to update the voice assistant
- Internal Processing: All AI operations happen inside the update() function
- Efficient Design: Minimal overhead in the main loop
- Event Handling: Callbacks handle all user interactions and responses
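If your application later needs touch or button input alongside the assistant, the loop can stay lean. A minimal sketch, assuming you want a tap on the screen to clear the conversation log:

void loop()
{
    M5.update();               // refresh touch/button state
    voice_assistant.update();  // service KWS/ASR/LLM/TTS events

    /* Hypothetical extra: tap the screen to clear the log */
    if (M5.Touch.getCount() > 0) {
        M5.Display.clear();
        M5.Display.setCursor(0, 0);
    }
}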
How the Complete System Works
1. Initialization Sequence
- Hardware Setup: CoreS3 initializes display, audio, and communication systems
- Module Connection: Establishes serial communication with LLM module
- AI Loading: LLM module loads the Qwen2.5-0.5B model into memory
- Voice Assistant Start: Configures wake word detection and callback functions
2. Runtime Operation
- Continuous Listening: System monitors for the “HELLO” wake word
- Speech Capture: When activated, captures and processes audio input
- Real-time Display: Shows recognized speech in green text as it’s processed
- AI Processing: Sends recognized text to the language model
- Response Generation: AI generates response and streams it to display
- Audio Output: Text-to-speech converts response to audio
3. User Interaction Flow
User says "Hello" → Wake word detected → System activates
User speaks command → ASR processes → Text displayed in green
AI processes request → LLM generates response → Text streams to display
TTS speaks response → System returns to listening state
Compilation and Deployment
Building the Project
- Open Arduino IDE and load the voice assistant example
- Select “M5Stack CoreS3” as your board
- Choose the correct COM port when CoreS3 is connected
- Click “Upload” to compile and flash the firmware
- Compilation typically takes 1-2 minutes depending on your system
Hardware Assembly
- Ensure both devices are powered off
- Align the LLM module with the CoreS3 SE's M-Bus connector
- Press firmly until the modules click together
- Power on the system – the LLM module’s LED should turn green
Testing Your AI Assistant
Startup Sequence
When you power on the system, you’ll see:
- “Check ModuleLLM connection..” – Establishing communication
- “Begin voice assistant..” – Loading AI models
- “Voice assistant ready” – System ready for use
Example Interactions
Based on the actual demo, here are real interactions you can try:
Basic Conversation:
- Say: “Hello”
- Response: “Hi, how can I help you today?”
Identity Questions:
- Say: “Hello, what is your name?”
- Response: “I’m a large language model created by Qwen”
Capability Inquiry:
- Say: “Hello, what can you do?”
- Response: Detailed explanation of language model capabilities including translation, question answering, and text generation
Translation Requests:
- Say: “Hello, translate ‘How are you?’ to Spanish”
- Response: “¿Cómo estás?”
Language Support:
- Say: “Hello, do you know Spanish?”
- Response: “Sí, conozco español” (Yes, I know Spanish)
Advanced Features and Customization
Multi-Language Support
The code includes commented lines showing Chinese language support:
// int ret = voice_assistant.begin("你好你好", "", "zh_CN"); // Chinese kws and asr
This demonstrates how to:
- Set Chinese wake words (“你好你好” – “Hello Hello”)
- Configure Chinese ASR (Automatic Speech Recognition)
- Support multiple languages in the same application
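Putting the two commented options together (the wake word/locale here, plus the display font from the next subsection), a Chinese-language build might look like this. This is a hedged sketch: the empty middle argument follows the example's usage, and the exact begin() parameters should be verified against the M5ModuleLLM documentation:

/* Hedged sketch: Chinese wake word, ASR locale, and display font */
M5.Display.setFont(&fonts::efontCN_12);                    // Chinese-capable font
int ret = voice_assistant.begin("你好你好", "", "zh_CN");  // Chinese KWS and ASR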
Display Customization
The code shows how to enable Chinese character display:
// M5.Display.setFont(&fonts::efontCN_12); // Support Chinese display
Model Management
The LLM module supports multiple AI models through its apt-based package system:
- Language Models: Qwen2.5-0.5B, Qwen2.5-1.5B, Llama-3.2-1B
- Vision Models: InternVL2_5-1B-MPO, YOLO11
- Speech Models: Whisper-tiny, Whisper-base, MeloTTS
Performance Characteristics
Power Consumption
Based on the specifications:
- Standby Mode: 104.64µA @ 4.2V (battery powered)
- Active Mode: 166.27mA @ 5V (USB powered)
- AI Processing: ~1.5W during inference
Response Times
From the demo video, typical response times are:
- Wake Word Detection: Instant recognition
- Speech Recognition: 1-2 seconds for short phrases
- AI Processing: 2-4 seconds for simple queries
- Complex Responses: 5-10 seconds for detailed explanations
Memory Usage
The system efficiently manages its 4GB of memory:
- System Operations: 1GB reserved for Ubuntu and basic functions
- AI Acceleration: 3GB dedicated to model inference and caching
- Model Storage: 32GB eMMC holds multiple models and system files
Troubleshooting Common Issues
Connection Problems
If you see “Check ModuleLLM connection..” stuck on screen:
- Verify proper module stacking alignment
- Check that both devices are powered
- Ensure the LLM module’s green LED is illuminated
- Try reseating the connection
Compilation Errors
Common Arduino IDE issues:
// If you get pin definition errors, manually set pins:
int rxd = 18, txd = 17; // For CoreS3
// int rxd = 13, txd = 14; // For Core2
// int rxd = 16, txd = 17; // For Basic
Performance Optimization
For better response times:
- Allow 30-60 seconds for full system initialization
- Speak clearly and at moderate pace
- Keep queries concise for faster processing
- Ensure stable power supply (USB recommended for development)
Audio Issues
If speech recognition isn’t working:
- Check that the built-in microphones aren’t blocked
- Ensure you’re saying “HELLO” clearly as the wake word
- Verify the speaker is producing audio output
- Test in a quiet environment initially
Real-World Applications
Smart Home Integration
The voice assistant can be extended for home automation:
// Example: Custom command handling
void on_asr_data_input(String data, bool isFinish, int index) {
    if (isFinish) {
        if (data.indexOf("turn on lights") >= 0) {
            // Add your smart home control code here
            controlLights(true);
        }
        // Continue with normal display
        M5.Display.setTextColor(TFT_GREEN, TFT_BLACK);
        M5.Display.printf(">> %s\n", data.c_str());
    }
}
Educational Projects
Perfect for teaching AI concepts:
- Machine Learning: Demonstrate real-time inference
- Natural Language Processing: Show speech-to-text and text-to-speech
- Edge Computing: Explain offline AI processing
- IoT Development: Integrate with sensors and actuators
Industrial Applications
Suitable for environments requiring:
- Offline Operation: No internet dependency
- Privacy Protection: All processing happens locally
- Low Latency: Immediate response without cloud delays
- Reliability: Consistent performance without network issues
Prototyping Platform
Ideal for rapid development:
- Quick Iteration: Fast compile and deploy cycle
- Modular Design: Easy to add sensors and actuators
- Scalable Architecture: Can grow from prototype to production
- Community Support: Extensive documentation and examples
Advanced Code Modifications
Custom Wake Words
To change the wake word, modify the setup function:
// Change from "HELLO" to custom wake word
int ret = voice_assistant.begin("JARVIS"); // Custom wake word
Enhanced Error Handling
Add more robust error checking:
void setup() {
    // ... existing setup code ...

    // Enhanced connection checking with timeout
    int connection_attempts = 0;
    M5.Display.printf(">> Check ModuleLLM connection..\n");
    while (connection_attempts < 30) {  // 30 second timeout
        if (module_llm.checkConnection()) {
            break;
        }
        delay(1000);
        connection_attempts++;
        M5.Display.printf("Attempt %d/30\n", connection_attempts);
    }
    if (connection_attempts >= 30) {
        M5.Display.setTextColor(TFT_RED);
        M5.Display.printf(">> Connection failed after 30 seconds\n");
        while (1) delay(1000);  // Stop execution
    }
}
Custom Response Processing
Add intelligence to response handling:
void on_llm_data_input(String data, bool isFinish, int index) {
    // Store complete responses for further processing
    static String complete_response = "";
    complete_response += data;
    M5.Display.print(data);

    if (isFinish) {
        M5.Display.print("\n");

        // Process complete response
        if (complete_response.indexOf("temperature") >= 0) {
            // Trigger temperature sensor reading
            displayTemperature();
        }
        complete_response = "";  // Reset for next response
    }
}
Future Enhancements
Model Upgrades
The system supports model updates through:
- APT Package Manager: apt update && apt upgrade on the LLM module
- SD Card Updates: Load new firmware via microSD card
- OTA Updates: Over-the-air updates via WiFi
Integration Possibilities
Extend functionality with additional M5Stack modules:
- Environmental Sensors: Add temperature, humidity, air quality monitoring
- Camera Module: Enable visual AI capabilities
- GPS Module: Location-aware responses
- LoRaWAN Module: Long-range communication for remote deployments
API Integration
Enable OpenAI API compatibility:
# On the LLM module's Ubuntu system
apt install openai-api-plugin
This allows integration with existing OpenAI-compatible applications and services.
Conclusion
The M5Stack CoreS3 SE and LLM Module Kit combination represents a remarkable achievement in making advanced AI accessible to developers and hobbyists. The provided code demonstrates how sophisticated AI capabilities can be implemented with minimal complexity, thanks to M5Stack’s well-designed libraries and hardware integration.
Key takeaways from this tutorial:
Technical Excellence
- Efficient Architecture: The callback-based design ensures responsive user interaction
- Robust Communication: Serial communication with proper error handling
- Modular Design: Easy to extend and customize for specific applications
Practical Benefits
- Complete Offline Operation: No cloud dependency ensures privacy and reliability
- Low Power Consumption: Suitable for battery-powered applications
- Rapid Development: From concept to working prototype in minutes
- Professional Results: Production-quality voice interaction
Educational Value
- Real AI Implementation: Hands-on experience with modern AI technologies
- Hardware Integration: Learn how software and hardware work together
- Scalable Learning: Start simple, add complexity as skills develop
The future of AI is moving to the edge, and this tutorial shows how accessible that future has become. Whether you’re building smart home devices, educational tools, industrial applications, or just exploring AI capabilities, the M5Stack ecosystem provides a solid foundation for innovation.
Start with the provided code, experiment with modifications, and gradually build more sophisticated applications. The combination of powerful hardware, comprehensive software libraries, and extensive community support makes this an ideal platform for both learning and professional development.
Ready to build your own AI applications? The CoreS3 SE and LLM Module Kit are available from the M5Stack store, complete with documentation, examples, and community support to help you succeed in your AI journey.