Sharper Fine Detail - The VAE Rebuild That Changes Everything
LTX 2.3 rebuilds its latent space with an updated Variational Autoencoder (VAE) trained on higher-quality data. The practical difference shows up immediately in outputs: fine hair texture, fabric weave, product surface gloss, and small on-screen text all pass through the generation pipeline without softening or smearing.
| Prompt | Output Video |
|---|---|
| Pixar-style 3D girl with brown braids, white tee, jeans, red sneakers, leaping toward camera, fisheye lens, medieval European timber town, bright sunny day, playful adventurous vibe, 4K |  |
Tighter Prompt Adherence - The 4x Text Connector Upgrade
Most AI video generators struggle with specificity. Mention three subjects in different positions doing different actions, and the model collapses them into something approximate. The LTX 2.3 video generator uses a 4x larger text connector, which lets it follow multi-subject prompts, spatial relationships, and stylistic instructions more accurately.
| Prompt | Output Video |
|---|---|
| A 6-second fantasy animation of a multicolored acrylic paint-splash cat (red, blue, yellow, orange, purple, green) walking on a white canvas. The cat leaps and splatters paint droplets (0-2s), spins mid-air and dissolves into swirling color streams (2-4s), then the paint converges and coils into a glossy paint-splash flower (4-6s). Fluid paint physics, vibrant saturated colors, cartoonish style, white background, 8K, whimsical atmosphere. |  |
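The example prompt above splits the clip into timed beats (0-2s, 2-4s, 4-6s). As a rough illustration of how such a prompt could be assembled consistently, here is a minimal Python sketch. The `Beat` class and `build_timed_prompt` helper are hypothetical names invented for this example; they are not part of LTX 2.3 or any official tooling, and the output is just a plain text string.

```python
from dataclasses import dataclass


@dataclass
class Beat:
    """One timed segment of the clip (hypothetical helper type)."""
    start_s: int
    end_s: int
    action: str


def build_timed_prompt(intro: str, beats: list[Beat], style_tags: list[str]) -> str:
    """Join an intro, contiguous timed beats, and style tags into one prompt string."""
    if not beats:
        raise ValueError("at least one beat is required")

    expected_start = 0
    parts = []
    for beat in beats:
        # Each beat must start where the previous one ended and have positive length.
        if beat.start_s != expected_start or beat.end_s <= beat.start_s:
            raise ValueError(f"beat {beat.start_s}-{beat.end_s}s leaves a gap or overlap")
        parts.append(f"{beat.action} ({beat.start_s}-{beat.end_s}s)")
        expected_start = beat.end_s

    prompt = f"{intro.rstrip('.')}. {', '.join(parts)}."
    if style_tags:
        tags = ", ".join(style_tags)
        prompt += f" {tags[:1].upper()}{tags[1:]}."
    return prompt


prompt = build_timed_prompt(
    "A 6-second fantasy animation of a paint-splash cat on a white canvas",
    [
        Beat(0, 2, "The cat leaps and splatters paint droplets"),
        Beat(2, 4, "spins mid-air and dissolves into swirling color streams"),
        Beat(4, 6, "then the paint coils into a glossy paint-splash flower"),
    ],
    ["fluid paint physics", "cartoonish style", "white background"],
)
print(prompt)
```

The contiguity check is a design choice for this sketch: it catches beats that overlap or leave unscripted time before the prompt is submitted.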
Stronger Image-to-Video - Real Motion, No More Frozen Frames
The image-to-video problem in earlier models came in two forms: subjects that barely moved, leaving little more than a slow pan or zoom over a still image (the Ken Burns effect), or subjects that moved but lost visual consistency with the source frame. The LTX 2.3 video generator addresses both, producing real subject motion that stays consistent with the source frame.
| Start Frame | End Frame | Prompt | Output Video |
|---|---|---|---|
|  |  | A 6-second whimsical animation: a glossy acrylic paint-splash cat (red, blue, yellow, orange, purple, green) walks on a white canvas, leaps and splatters paint droplets (0-2s), spins and dissolves into swirling color streams (2-4s), which converge and solidify into a bright paint-splash flower with a green stem (4-6s). Fluid paint physics, vibrant colors, cartoonish style, 8K, soft white background. |  |
Cleaner Audio - New Vocoder, Better Training Data
Audio is often the weakest point in AI video generation. Clicks, pops, misaligned dialogue, and inconsistent environmental sound degrade the final output even when the visuals are strong. LTX 2.3 addresses this with a new vocoder and with training data filtered to remove audio artifacts, so they are not carried through to the output.
| Prompt | Output Video |
|---|---|
| A 5-second cinematic video of luxury perfume "CELESTIAL BLOSSOM" on a soft beige background. Blue delphinium petals float gently (0-1s), accelerate and swirl toward the center forming a bottle silhouette (1-3s), then solidify into transparent glass filled with golden liquid, black cap and gold lettering appear (3-4s). Final shot: the complete bottle stands elegantly with a few petals floating around it in warm light (4-5s). Soft focus, golden lighting, magical particle physics, dreamy elegant style, 8K. |  |
Native Portrait Mode - Built for Vertical Video From the Start
The LTX 2.3 video generator produces portrait video at up to 1080×1920 resolution, using training data collected in vertical orientation. This is different from simply cropping a landscape video: the composition, subject framing, and motion behavior are all designed for portrait from the beginning.
| Prompt | Output Video |
|---|---|
| A smiling man with curly hair and a woman ride a yellow vintage Vespa scooter through a bustling Asian alleyway, warm golden hour light, hanging red lanterns, fruit stalls, cobblestone street, cinematic bokeh, joyful travel vibe, 8K |  |
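To make the difference between native portrait output and a cropped landscape video concrete, the short Python sketch below works through the arithmetic. It assumes a 1920×1080 landscape source, which is an illustrative choice and not a figure from LTX 2.3; the `center_crop_to_aspect` helper is likewise a hypothetical name for this example.

```python
def center_crop_to_aspect(src_w: int, src_h: int, aspect_w: int, aspect_h: int) -> tuple[int, int]:
    """Return (width, height) of the largest crop of the source with the given aspect ratio."""
    # Compare the two ratios by cross-multiplying to avoid floating-point error.
    if src_w * aspect_h > src_h * aspect_w:
        # Source is wider than the target: keep full height, trim the sides.
        return src_h * aspect_w // aspect_h, src_h
    # Source is taller (or equal): keep full width, trim top and bottom.
    return src_w, src_w * aspect_h // aspect_w


# Assumed landscape source (illustrative), cropped to 9:16 portrait.
src_w, src_h = 1920, 1080
crop_w, crop_h = center_crop_to_aspect(src_w, src_h, 9, 16)

kept_fraction = (crop_w * crop_h) / (src_w * src_h)  # share of source pixels that survive
upscale = 1920 / crop_h                              # enlargement needed to reach 1080x1920

print(f"crop: {crop_w}x{crop_h}")
print(f"pixels kept: {kept_fraction:.1%}")
print(f"upscale factor to 1080x1920: {upscale:.2f}x")
```

Under these assumptions, the crop keeps a strip roughly 607 pixels wide, discards about two thirds of the frame, and still has to be enlarged by about 1.78x to fill 1080×1920. It also inherits framing composed for a wide shot.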