Highly efficient multimodal GenAI large-model training and model compression
Acceleration technology for deploying GenAI large models on edge devices
Real-time end-to-end TTS, dialogue, translation, digital human, and other models, along with their deployment
Large Language Models: DeepSeek, Qwen series, LLaMA series, etc.
Multimodal large language models and embodied AI models
End-to-end TTS, dialogue, translation, digital human, and other models
Text-to-image and video generation models
Nvidia, AMD, Intel, Huawei, Hailo, Qualcomm, MTK, Chinese domestic acceleration chips, etc.
Moxin LLM is a high-performance language model co-developed by MoYoYo AI and optimized for local deployment on edge devices. It achieves SOTA-level results on multiple zero-shot tasks while balancing accuracy with response speed, making it well suited as the core of local intelligent applications.
Local real-time voice translation, no internet required, fast and accurate, designed for professional communication and cross-language meetings.
In collaboration with AMD and GMK, we deployed the full-size Qwen3-235B model on an AMD Ryzen AI Max+ 395 mini PC, achieving an inference speed of 14 tokens/s.
Locally run voice dialogue, intelligent agent, and digital human solutions supporting speech recognition, language understanding, and TTS synthesis. Already applied in scenarios such as voice-based Werewolf games, delivering rapid offline response and natural interaction.
Enabling robots to “understand and operate in the real world”: the model identifies surrounding objects, comprehends commands, and controls robotic arms or mobile chassis to complete practical tasks such as grasping and transporting, and is widely applied in service, education, and smart manufacturing scenarios.