LPM 1.0 Real-Time Avatars
Anuttacon's LPM 1.0 (April 11, 2026) turns one image + audio + context into a real-time video of that character speaking, singing, or listening — 0.35 s latency, stable for 45+ minute conversations without identity drift. 17B Diffusion Transformer with DMD distillation. Anuttacon's founder is MiHoYo/Genshin Impact co-founder Cai Haoyu.
**LPM 1.0** (Live Portrait Model) is a real-time avatar synthesis system from Anuttacon, released as a research preview on April 11, 2026. It is the first avatar model to combine sub-second latency with multi-minute stability, meaningfully crossing the threshold from demo tech to potential deployment tech.

## Capability

Inputs:

- One reference image of a character
- Audio (speech, singing)
- Context (optional pose, style, environmental parameters)

Output:

- Real-time video of that character speaking, singing, listening, or expressing
- Full face animation (lip sync, micro-expressions, eye blinks, gaze shifts)
- Body language (shoulder, head, hand motion)
- **Idle motion during silent listening** — the character fidgets, breathes, glances — a subtle but crucial element for believability

## Performance

- **0.35 second end-to-end latency** — from audio input to video frame display.
  - For comparison: LiveAvatar and similar prior art were 1+ second.
  - 0.35 s is within the range of a natural conversational gap and doesn't trigger the 'it's lagging' perception.
- **Stable for 45+ minute videos** — character identity doesn't 'warp' or drift. Other avatar models start to deform facial structure, change apparent age, or shift proportions over long sessions; LPM 1.0 holds a consistent appearance.
- Multilingual — handles lip sync across many languages.
- Handles **singing and emotional transitions** — not just neutral speech.

## Architecture

- **17B parameter Diffusion Transformer** (DiT)
- **DMD distillation** (Distribution Matching Distillation) — reduces many-step diffusion to few-step inference, the key to real-time latency.
- Custom audio-to-motion encoder trained on a large motion-capture + speech dataset.
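The latency argument behind step distillation can be made concrete with a toy sketch. This is a hypothetical illustration, not Anuttacon's pipeline: the `sample` function, per-step cost, and step counts are all assumptions, chosen only to show the arithmetic of going from a many-step sampler to a few-step one.

```python
# Toy sketch (illustrative, not LPM 1.0's actual code): why distilling a
# many-step diffusion sampler into a few-step one is the lever for
# real-time latency. Assumes a fixed, hypothetical per-step cost.

def sample(steps: int, cost_per_step_ms: float = 20.0):
    """Run a toy iterative denoiser; return (final_value, latency_ms)."""
    x = 1.0  # stand-in for a noisy latent
    for t in range(steps, 0, -1):
        x *= 1.0 - 1.0 / t  # toy update pulling x toward the "clean" value 0
    return x, steps * cost_per_step_ms

# A 50-step "teacher" sampler vs. a 4-step distilled "student":
_, teacher_ms = sample(steps=50)   # 1000 ms per frame chunk
_, student_ms = sample(steps=4)    # 80 ms per frame chunk
# Only the few-step model fits inside a sub-second interactive budget.
```

The idea behind DMD is that the few-step student is trained so its output *distribution* matches the teacher's, rather than imitating the teacher's denoising trajectory step by step; the sketch above only illustrates the latency arithmetic, not the training objective.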
## Release

- Research preview only
- **No weights, no code, no public demo** released — a departure from many recent open research patterns
- Demo videos published on Anuttacon's channels

This is deliberate — Anuttacon's business model (see below) depends on controlling the deployment.

## Anuttacon

Anuttacon was founded by Cai Haoyu, co-founder of MiHoYo (the company behind Genshin Impact, Honkai: Star Rail, and Zenless Zone Zero). MiHoYo has one of the strongest character-design, animation, and gacha-monetization operations in the world, and Cai Haoyu's move into AI-driven real-time characters is a direct application of that expertise to the next medium.

The plausible business model: AI-driven virtual companions, game NPCs with real-time natural dialogue, virtual idol / VTuber production tools, and potentially customer-service avatars for enterprise.

## Significance

LPM 1.0 is the inflection point where avatar-based interfaces become technically viable for mass deployment. Implications:

- **Replacement of text chat**: for many consumer use cases, a real-time avatar that looks and sounds like a character is more engaging than text. AI assistants shipping avatar front-ends within 12 months is plausible.
- **Education and coaching**: avatar tutors that can actually hold sustained 45-minute lessons.
- **Companion apps**: the current wave of Replika-style text companions will upgrade to avatars.
- **Customer service**: avatar-based support with cultural localisation.
- **Deepfake and fraud concerns**: this technology also makes impersonation much easier. Real-time voice-cloned avatar video chat is a new attack surface for social engineering.

## Related developments

- HeyGen, Synthesia, Colossyan: current enterprise video-avatar tools, non-real-time.
- D-ID: real-time talking head, shallower motion.
- Hedra: competitive real-time avatars, slightly behind LPM 1.0 on latency and stability.
- Runway and Pika Labs: general video generation, not avatar-specialised.
The avatar category is the subset of generative video where real-time latency and identity stability are the key constraints, and LPM 1.0 is the first model to cross both thresholds clearly.

## Part of the broader weekly context

See AI News Week of April 12 2026 — Four Headline Stories for the surrounding releases. LPM 1.0 is the under-covered '4th headline' story, but arguably the one with the nearest-term user-experience impact.