
LPM 1.0: AI Turns Single Photo into 45-Min Video

LPM 1.0 creates 45‑minute lip‑synced video from a single photo in real time


A single portrait can now become three-quarters of an hour of moving speech without a studio or a render farm. Researchers unveiled LPM 1.0, a model that takes one still image and produces a continuously streaming video that stays in sync with a spoken script. The claim isn’t just about length; it’s about flexibility.

From realistic human faces to stylised anime avatars and even three‑dimensional game characters, the system allegedly handles the full spectrum of visual styles without extra fine‑tuning. Real‑time generation means the output appears frame by frame, sidestepping the batch‑processing pipelines that typically lock users into hours‑long post‑production. If the model truly keeps a 45‑minute clip stable, it could reshape how creators think about low‑cost, on‑the‑fly content.

The following details lay out exactly how LPM 1.0 achieves that breadth and speed.

According to the researchers, LPM 1.0 works across different image styles (photorealistic faces, anime, and 3D game characters) without any additional training. The entire video generation runs as a streaming process in real time rather than rendering a finished video all at once; a minimal sketch of that idea follows below. The team says videos up to 45 minutes long remain stable.
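To make the streaming claim concrete, here is a minimal hypothetical sketch of frame-by-frame generation: each audio chunk yields one frame, playback can begin immediately, and memory use stays flat regardless of clip length. Every name in it (FrameGenerator, next_frame, stream_video) is illustrative; LPM 1.0's actual interface has not been published.

```python
# A minimal sketch of streaming generation, not LPM 1.0's real API:
# frames are yielded one at a time instead of rendering the whole clip first.

import time

FRAME_SECONDS = 1 / 25  # 25 fps target; one audio chunk per frame


class FrameGenerator:
    """Stand-in for the animation model; returns one frame per audio chunk."""

    def __init__(self, portrait: str):
        self.portrait = portrait  # the single source photo

    def next_frame(self, audio_chunk: bytes) -> dict:
        # A real model would predict facial motion from the audio here.
        return {"source": self.portrait, "audio_bytes": len(audio_chunk)}


def stream_video(portrait: str, audio_chunks):
    """Yield frames as they are generated, so playback can start immediately."""
    generator = FrameGenerator(portrait)
    for chunk in audio_chunks:
        yield generator.next_frame(chunk)  # no batch render, no final-cut wait


# Usage: a 45-minute clip is just a longer chunk iterator; memory stays flat.
for frame in stream_video("portrait.png", [b"\x00" * 640] * 3):
    time.sleep(FRAME_SECONDS)  # pace output at real time
    print("emitted frame:", frame)
```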

LPM 1.0 relies on what the researchers call "multi-granularity identity conditioning": alongside a main image, the model also receives reference images from different angles and with varying facial expressions. This means it doesn't have to invent details such as teeth, emotion-specific wrinkles, or profile views on its own; it can pull them directly from the reference material. When listening, it generates reactive facial expressions, such as nodding or gaze shifts, based on incoming audio.
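A rough way to picture that conditioning: identity features from the main portrait and from each reference shot are encoded and handed to the generator together. The toy encoder and stacking below are assumptions made for illustration; the article does not describe the actual architecture.

```python
# Illustrative sketch of multi-view identity conditioning; the encoder and
# fusion here are toy placeholders, not the model's published design.

import numpy as np


def embed_identity(image: np.ndarray) -> np.ndarray:
    """Placeholder encoder: a real system would use a trained identity network."""
    return image.mean(axis=(0, 1))  # toy per-channel global feature


def build_condition(main_image: np.ndarray, references: list) -> np.ndarray:
    """Stack features from the main photo and every reference view."""
    features = [embed_identity(main_image)]
    features += [embed_identity(ref) for ref in references]
    return np.stack(features)  # shape: (1 + num_references, feature_dim)


main = np.random.rand(256, 256, 3)                      # the single source photo
refs = [np.random.rand(256, 256, 3) for _ in range(3)]  # e.g. profile, smile, frown
condition = build_condition(main, refs)
print(condition.shape)  # (4, 3): one feature row per identity view
```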

When speaking, the response audio drives lip movements and body language. During pauses, LPM generates natural idle behavior based on text instructions. Beyond real-time conversation, LPM 1.0 also supports offline video generation from existing audio, useful for podcasts or movie dialogue, according to project manager Ailing Zeng.
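Taken together, the description amounts to three driving modes: speaking (response audio drives the lips), listening (incoming audio drives reactions), and idle (text instructions drive filler motion). The dispatch below is a hypothetical illustration of that control flow, not LPM 1.0's internal logic.

```python
# Hypothetical dispatch over the three behavioral modes the article describes.

from enum import Enum, auto


class Mode(Enum):
    SPEAKING = auto()
    LISTENING = auto()
    IDLE = auto()


def animate_step(mode: Mode, response_audio: bytes = b"",
                 incoming_audio: bytes = b"", idle_prompt: str = "") -> str:
    if mode is Mode.SPEAKING:
        return f"lip-sync and gestures from {len(response_audio)} response bytes"
    if mode is Mode.LISTENING:
        return f"reactive nod/gaze from {len(incoming_audio)} incoming bytes"
    return f"idle behavior from prompt: {idle_prompt!r}"


print(animate_step(Mode.SPEAKING, response_audio=b"\x01" * 320))
print(animate_step(Mode.LISTENING, incoming_audio=b"\x02" * 320))
print(animate_step(Mode.IDLE, idle_prompt="wait calmly, blink occasionally"))
```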

One image, a full conversation. LPM 1.0 claims to turn that still into a 45‑minute, lip‑synced video in real time. The model hooks directly into voice AIs such as ChatGPT, producing speaking, listening, or singing avatars that display hesitation, gaze shifts, and smooth emotional changes.

It reportedly handles photorealistic faces, anime art, and 3D game characters without additional training, and the generation proceeds as a streaming process rather than a batch render. However, the claim that videos up to 45 minutes “should remain stable” lacks published benchmarks, leaving durability under extended use unclear. The ability to operate across diverse visual styles is noteworthy, yet the article does not detail computational requirements or latency figures, which are critical for real‑time deployment.

Consequently, while LPM 1.0 demonstrates a promising integration of image‑to‑video synthesis and voice interaction, further evidence is needed to assess its practical limits and consistency across varied content. Future studies that publish quantitative performance data would help clarify its suitability for commercial or research applications.


Common Questions Answered

How does LPM 1.0 generate video from a single photo across different visual styles?

LPM 1.0 uses a 'multi-granularity identity conditioning' technique that lets it generate video from a single image across photorealistic faces, anime, and 3D game characters without additional training. The model can create videos up to 45 minutes long that preserve the original image's identity while synchronizing lip movements and displaying natural emotional variation.

What makes LPM 1.0's video generation process unique compared to traditional rendering methods?

Unlike traditional video rendering, which requires batch processing and extensive computational resources, LPM 1.0 generates video as a streaming process in real time. This approach allows continuous generation with smooth transitions and clips up to 45 minutes long without a render farm.

How does LPM 1.0 integrate with voice AI technologies like ChatGPT?

LPM 1.0 can directly hook into voice AI systems like ChatGPT to produce speaking avatars that display nuanced behaviors such as hesitation, gaze shifts, and emotional changes. The model can generate avatars that not only lip-sync with spoken content but also provide a more natural and dynamic conversational experience across various visual styles.