Creating AI-Generated Videos with Wan2GP, LTX 2.3, and OmniVoice: Behind-the-Scenes of the Johor Election Stone Lion Podcast Video
Introduction
A recent YouTube video titled 《居銮石狮子都要下来做Podcast,柔佛州选大乱斗?》 (roughly, "Even Kluang's Stone Lion Is Coming Down to Do a Podcast: A Johor State Election Free-for-All?") has sparked discussion across Malaysian social media. The video features the stone lion of Kluang as a podcast host humorously commenting on the latest developments in Johor's state election. What makes it particularly interesting is that it was generated entirely with AI tools: the creator used Wan2GP for video generation, LTX 2.3 for quality enhancement, and OmniVoice for realistic podcast voice synthesis.
This article details the characteristics of these tools, their collaborative workflow in video creation, and the implications of using AI tools for producing timely commentary content.
Tool Overview
Wan2GP: Low-VRAM-Friendly AI Video Generator
Wan2GP is an open-source AI video generation tool designed specifically for low-memory GPUs, capable of producing high-quality videos on consumer-grade graphics cards. It is based on optimized Wan series models, making it ideal for quickly generating short clips and social content.
- Official site: Wan2GP: Free AI Video Generator Online, No Install
- GitHub: deepbeepmeep/Wan2GP
- Features: Low VRAM consumption, suitable for rapid prototyping; the GitHub project runs locally on the user's own GPU, while the linked site advertises online use with no installation
LTX 2.3: Latest Open-Source AI Video Model
LTX 2.3 is an open-source AI video generation model released by Lightricks, the company behind the LTX model family. It supports 4K resolution and 50 FPS video output with built-in native audio generation, and it excels at text-to-video and image-to-video tasks.
- Official introduction: LTX-2.3: Introducing LTX's Latest AI Video Model
- Tutorial: LTX-2.3 Tutorial: Text to Video and Image to Video
- Features: High resolution, smooth frame rates, active open-source community
OmniVoice: Multilingual AI Voice Cloning and TTS
OmniVoice is a voice generation platform supporting 600+ languages, featuring zero-shot voice cloning and natural speech synthesis capabilities. It can generate target voices from short audio samples or synthesize multilingual speech directly from text.
- Official site: OmniVoice: Free AI Voice Generator & Voice Cloning
- GitHub: k2-fsa/OmniVoice
- Features: Cross-language support, high-fidelity voice cloning, generous free usage tier
Video Creation Workflow
1. Conceptualization and Script Writing
First, the author wrote a podcast script based on the latest Johor election news, working in humorous elements such as the stone lion "coming down" from its pedestal, the election chaos, and local public reactions. The script mixes Chinese and English to make it more entertaining and shareable.
2. Audio Generation (OmniVoice)
In OmniVoice, the author chose a middle-aged male voice for the stone lion. From an uploaded sample audio clip (or a voice from the built-in library), OmniVoice generated the complete podcast voiceover as an audio file. The tool's multilingual support kept the Chinese pronunciation natural and fluent.
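Long scripts are often easier to synthesize and retake when they are fed to a TTS tool in sentence-sized pieces. The sketch below shows one way to pre-split a mixed Chinese-English script. It assumes plain-text input; `split_script` and `max_chars` are hypothetical names for illustration and are not part of OmniVoice, whose own interface is not shown here.

```python
import re

# Split right after Chinese sentence punctuation, or after ASCII sentence
# punctuation that is followed by whitespace (so "LTX 2.3" stays intact).
_SENTENCE_BREAK = re.compile(r"(?<=[。！？])|(?<=[.!?])\s+")


def split_script(script: str, max_chars: int = 80) -> list[str]:
    """Group sentences into chunks of at most max_chars where possible.

    A single sentence longer than max_chars becomes its own chunk.
    """
    sentences = [s.strip() for s in _SENTENCE_BREAK.split(script) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # Start a new chunk when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) + 1 > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    demo = "大家好。Welcome to the show. 今天聊州选！LTX 2.3 is fine."
    for chunk in split_script(demo, max_chars=25):
        print(chunk)
```

Each chunk can then be synthesized separately and the resulting audio files joined in the editor, which makes it easier to redo a single line without regenerating the whole voiceover.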
3. Base Video Generation (Wan2GP)
With the voiceover ready, the author input text descriptions of key scenes into Wan2GP. Examples:
- "An ancient stone lion walking through the streets of Kluang town, with the Johor state government building in the background"
- "The stone lion holding a microphone, speaking seriously about election results"
Wan2GP quickly generated base video clips for these scenes in a low-VRAM environment, though resolution and detail might be limited.
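Because each generation produces only a few seconds of footage, it helps to estimate how many clips are needed to cover the voiceover before queuing prompts. The following is a minimal planning sketch, not the author's actual process. The defaults of 81 frames at 16 FPS are assumptions chosen for illustration, and `Shot` and `plan_shots` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    prompt: str
    clips: int  # number of clips to generate for this prompt


def plan_shots(prompts, voiceover_seconds, frames_per_clip=81, fps=16):
    """Spread the clips needed to cover the voiceover across the prompts."""
    if not prompts:
        raise ValueError("at least one prompt is required")
    clip_seconds = frames_per_clip / fps  # assumed length of one clip
    total_clips = math.ceil(voiceover_seconds / clip_seconds)
    base, extra = divmod(total_clips, len(prompts))
    # Earlier prompts absorb the remainder so the total always adds up.
    return [
        Shot(prompt, base + (1 if index < extra else 0))
        for index, prompt in enumerate(prompts)
    ]


if __name__ == "__main__":
    prompts = [
        "An ancient stone lion walking through the streets of Kluang town",
        "The stone lion holding a microphone, speaking seriously",
    ]
    for shot in plan_shots(prompts, voiceover_seconds=53):
        print(shot.clips, shot.prompt)
```

Rounding up means the footage slightly overshoots the audio, which leaves room to trim weak clips during editing.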
4. Video Enhancement (LTX 2.3)
To improve video quality, the author imported Wan2GP-generated initial clips into LTX 2.3 for secondary processing. LTX 2.3's super-resolution and frame interpolation features enhanced clarity and smoothness, particularly in the stone lion's texture and motion details.
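To make the effect of this step concrete, the arithmetic below shows how spatial upscaling and frame interpolation change a clip's specifications. It is a worked example under stated assumptions: the 832x480, 81-frame, 16 FPS input and the 2x factors are illustrative values rather than settings confirmed by the author, and `enhanced_spec` is a hypothetical helper, not an LTX 2.3 function.

```python
def enhanced_spec(width, height, frames, fps, scale=2, interp=2):
    """Return clip specs after upscaling by `scale` and interpolating by `interp`.

    Interpolation inserts (interp - 1) new frames between each pair of
    original frames, so the clip duration stays the same.
    """
    return {
        "width": width * scale,
        "height": height * scale,
        "frames": (frames - 1) * interp + 1,
        "fps": fps * interp,
    }


def duration_seconds(frames, fps):
    """Duration measured from the first frame to the last frame."""
    return (frames - 1) / fps


if __name__ == "__main__":
    before = {"width": 832, "height": 480, "frames": 81, "fps": 16}
    after = enhanced_spec(**before)
    print(after)
    print(duration_seconds(before["frames"], before["fps"]),
          duration_seconds(after["frames"], after["fps"]))
```

The frame count and frame rate grow together, so the enhanced clip still lines up with the same stretch of voiceover.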
5. Audio-Video Synthesis and Post-Production
Finally, using video editing software (such as DaVinci Resolve or CapCut), the author synchronized the OmniVoice-generated audio with the LTX 2.3-enhanced video track. Subtitles, background music, and simple transition effects were added to complete the final video.
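For readers who prefer a scriptable alternative to a graphical editor, the basic step of attaching the voiceover to the video can also be done with ffmpeg. This is a sketch of that alternative, not the author's method. The file names are placeholders, `build_mux_command` and `mux` are hypothetical helper names, and ffmpeg is assumed to be installed and on the PATH.

```python
import shlex
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that pairs one video stream with one audio stream."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,        # input 0: enhanced video
        "-i", audio_path,        # input 1: voiceover
        "-map", "0:v:0",         # take the video stream from input 0
        "-map", "1:a:0",         # take the audio stream from input 1
        "-c:v", "copy",          # keep the video as-is, no re-encode
        "-c:a", "aac",           # encode the audio as AAC for MP4
        "-shortest",             # stop at the end of the shorter stream
        output_path,
    ]


def mux(video_path, audio_path, output_path):
    """Run the command; raises CalledProcessError if ffmpeg fails."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)


if __name__ == "__main__":
    # Placeholder file names; print the command instead of running it.
    print(shlex.join(build_mux_command("lion_enhanced.mp4", "voiceover.wav", "final.mp4")))
```

Subtitles, background music, and transitions are still easier to handle in an editor such as DaVinci Resolve or CapCut; the command above only covers the audio-video pairing.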
Results and Reflections
Through this workflow, the author produced a timely, entertaining AI-generated video on a short turnaround. The video drew thousands of views and numerous comments on YouTube, with viewers frequently praising the realism of the stone lion's voice and the video's satirical tone.
Key Advantages:
- Extremely Low Cost: All tools offer free tiers or open-source versions, avoiding traditional video production labor and equipment expenses.
- Remarkable Speed: From concept to finished product in under 24 hours, enabling rapid response to hot topics.
- Creative Freedom: AI tools made previously difficult-to-achieve concepts (like a stone lion podcast) realizable.
Limitations and Improvement Directions:
- Generated videos occasionally exhibit slight "unnaturalness" (e.g., lip-sync mismatches), requiring manual adjustment.
- For complex camera movements and multi-character interactions, AI still struggles to fully replace live-action filming.
- Future exploration could involve more advanced models (e.g., Wan 2.2) or motion control techniques to improve consistency.
Conclusion
This case demonstrates the potential of modern AI toolchains in content creation. By combining Wan2GP (rapid prototyping), LTX 2.3 (quality enhancement), and OmniVoice (voice synthesis), creators can produce polished video content with a very low barrier to entry. The workflow is particularly well suited to rapid-response and experimental work in news commentary, social satire, and educational content.
As AI video and speech models continue to advance, we can expect more creators to use similar toolchains to express viewpoints, tell stories, and perhaps give even more stone lions their own podcast channels.



