
Is this AI video generator the future of anime? Definitely not

AI-generated anime girl sitting in a field.
Animon

Japan has been fairly quiet during the AI boom so far, but now, just after the whole ChatGPT x Ghibli trend, Japanese company Animon has come out with an AI video generator built specifically for anime.

The free-to-use tool takes still images of anime, cartoon, or CG art and creates five-second-long animations based on the prompt you enter. The company claims it will help both professional and amateur animators speed up the creative process — essentially requiring only one hand-drawn frame per five seconds of video instead of hundreds.


Is there a place for AI tools in the anime industry?

Quick confession here — I happen to be a total anime nerd. I’ve watched hundreds of shows, I visit Japan nearly every year, I’ve worked there, and I speak Japanese. I’m no expert on the animation process, but I know what good animation looks like and what bad animation looks like.

Even without AI getting involved, there are already a lot of ways to cut corners with animation in the anime industry. A lot of the work now gets outsourced to Korean animation studios, and the time and money going into it keep going down.

When a studio doesn’t invest enough in its animation, it affects the shows in two main ways: there’s a lot less movement, and there’s a noticeable drop in art quality. The character designs might look beautiful on the promotional posters, but the versions drawn from frame to frame don’t have the same level of detail or the right proportions, and in some cases, they even look straight-up derpy.

I have an unfortunate example of this from a show I like (a popular volleyball anime called Haikyuu). Earlier episodes were animated really nicely, but for whatever reason, the quality plummeted during one of the later seasons. Here’s an image comparing a shot of the protagonist from an episode in season one to an episode in season four.

To be clear, some shows still look great — but low-quality animation can pop up just about anywhere, and it can definitely be bad enough to drive away viewers like me.

With the bar sinking so low, you might think AI video generation tools actually have a chance in this industry. However, judging by the content I got from Animon.ai, I’m afraid that doesn’t appear to be the case.

How do the generated animations look?

The tool does work in that it takes the image you give it and makes it move for five seconds, but that’s about all you can guarantee will happen. For my first experiment, I gave it a still of Jet Black from Cowboy Bebop about to drink from a whiskey glass.

My prompt was simply “The pictured character takes a drink from the glass of whiskey he is holding.” The video I got in return has a few things wrong with it.

Firstly, Jet Black does not drink the whiskey. He appears to be talking, and he lifts the glass closer to his face, but I don’t see anything that looks like actual drinking. The whiskey inside the glass appears to be very busy, however — the glass both fills up and drains despite the lack of drinking that appears to be going on. If you want to see Jet drink for real, it’s right here on YouTube.

The AI model also struggles with Jet’s scar. The bottom end of it sort of disappears, and when he opens his eyes, the scar is going straight over his eyeball. Because I grabbed the still from a video on Crunchyroll’s YouTube channel, it has CR’s logo in the top right. In the AI video, however, the text appears to have morphed into “Crunchyolo” instead.

Next, I tried feeding it a still of Maomao’s dance scene from The Apothecary Diaries, and I got a really weird result. Unlike the trippy, mushy, morphing movements of the 2D Jet video, the AI gave me what looked like a 3D model.

Animon.ai is meant to work with CG art as well, so I suppose it makes sense that it can generate 3D models, but I was pretty surprised. While The Apothecary Diaries makes use of 3D models a lot for buildings and backgrounds, the character was definitely drawn in the shot I used.

Either way, the generated videos seem significantly more stable when 3D models are involved. Movement looks less sloshy, and the model keeps the size and shape of the character more consistent. I still wouldn’t want to see it in a show as is, but it’s undoubtedly a step up from the 2D-style content it gave me.

However, Animon.ai’s sales pitch is all about helping animators save time by drawing fewer frames, and that pitch is irrelevant to 3D animation. 3D is already quick and inexpensive compared to 2D, and it doesn’t make much sense to create a 3D model and then have an AI video generator animate it haphazardly rather than using animation software.

Should you use Animon.ai?

If you don’t know anything about animation and you want a quick GIF or a video for personal purposes, this tool will probably work just fine (or give you a good laugh if not). However, it’s hard to imagine any amateur or professional animators seeing real value in this.

If you want to see what the tool is capable of, the Animon YouTube channel posted a music video that appears to be almost 100% AI-generated — it definitely looks that way, at least.

Overall, I think there are two main reasons AI video generators aren’t ready for any real work. One is that movement looks too bad, and the other is the lack of control you have over the result. It’s the same as any other generative AI tool, really — they’re just not consistent and responsive enough to be reliable. We will likely get there one day, and it’s still fun to mess around with early versions of the technology, but don’t let the salespeople fool you — none of these tools are ready for commercial use yet.

Willow Roberts
Willow Roberts has been a Computing Writer at Digital Trends for a year and has been writing for about a decade. She has a…
I tested the future of AI image generation. It’s astoundingly fast.
Imagery generated by HART.

One of the core problems with AI is its notoriously high power and computing demand, especially for tasks such as media generation. On mobile phones, only a handful of pricey devices with powerful silicon can run such features natively, and even when implemented at scale in the cloud, it’s a pricey affair.
Nvidia may have quietly addressed that challenge in partnership with the folks over at the Massachusetts Institute of Technology and Tsinghua University. The team created a hybrid AI image generation tool called HART (hybrid autoregressive transformer) that essentially combines two of the most widely used AI image creation techniques. The result is a blazing-fast tool with dramatically lower compute requirements.
To give you an idea of just how fast it is, I asked it to create an image of a parrot playing a bass guitar. It returned the following picture in about a second; I could barely even follow the progress bar. When I gave the same prompt to Google’s Imagen 3 model in Gemini, it took roughly 9-10 seconds on a 200 Mbps internet connection.

A massive breakthrough
When AI images first started making waves, the diffusion technique was behind it all, powering products such as OpenAI’s DALL-E image generator, Google’s Imagen, and Stable Diffusion. This method can produce images with an extremely high level of detail. However, it is a multi-step approach to creating AI images, and as a result, it is slow and computationally expensive.
The second approach, which has recently gained popularity, is autoregressive models, which essentially work in the same fashion as chatbots, generating images with a pixel-by-pixel prediction technique. They are faster, but also more error-prone.
The team at MIT fused both methods into a single package called HART. It relies on an autoregressive model to predict compressed image assets as discrete tokens, while a small diffusion model handles the rest to compensate for the quality loss. The overall approach reduces the number of steps involved from over two dozen to eight.
The experts behind HART claim that it can “generate images that match or exceed the quality of state-of-the-art diffusion models, but do so about nine times faster.” HART pairs a roughly 700-million-parameter autoregressive model with a small 37-million-parameter diffusion model.
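The division of labor described above can be caricatured in a few lines of Python. This is purely an illustrative sketch of the hybrid idea (a coarse autoregressive pass that fixes global structure, followed by a handful of diffusion-style refinement steps), not HART’s actual code; every function name and number here is invented for the example.

```python
import random

random.seed(0)

def autoregressive_draft(n_pixels):
    """Stand-in for the AR transformer: emit discrete tokens one at a time,
    producing the coarse global structure of the image."""
    return [random.randrange(256) for _ in range(n_pixels)]

def diffusion_refine(draft, steps=8):
    """Stand-in for the small diffusion model: start from a noisy copy of
    the AR draft and denoise toward it over just a few steps."""
    x = [p + random.gauss(0, 25) for p in draft]  # noisy starting point
    for _ in range(steps):
        # Each denoising step halves the remaining gap to the draft,
        # so residual noise shrinks geometrically.
        x = [0.5 * xi + 0.5 * di for xi, di in zip(x, draft)]
    return x

draft = autoregressive_draft(64)          # coarse structure, token by token
final = diffusion_refine(draft, steps=8)  # 8 steps vs ~25+ for pure diffusion
```

The point of the sketch is the step count: because the diffusion model only cleans up residual detail rather than generating the whole image from noise, it can get away with far fewer denoising iterations, which is where the claimed speedup comes from.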

ChatGPT app could soon generate AI videos with Sora
Depiction of OpenAI Sora video generator on a phone.

OpenAI released its Sora text-to-video generation tool late in 2024, and expanded it to the European market at the end of February this year. It seems the next avenue for Sora is the ChatGPT app.

According to a TechCrunch report, which cites internal conversations, OpenAI is planning to bring the video creation AI tool to ChatGPT. So far, the video generator has been available only via a web client, and has remained exclusive to paid users.

Adobe releases its first commercially safe Firefly video generating AI
Firefly video still shot of an Icelandic horse

Following the success of its IP-friendly Firefly Image model, Adobe announced on Wednesday the beta release of a new Firefly Video model, as well as two subscription packages for accessing its audio and video generation abilities. Generate Video, according to the announcement post, "empowers creative professionals with tools to generate video clips from a text prompt or image, use camera angles to control shots, create professional quality images from 3D sketches, craft atmospheric elements and develop custom motion design elements."

The model will initially generate video at 1080p resolution, though the company plans to release a 4K model for professional production work in the near future. Like the image generator, Firefly Video is trained exclusively on Adobe Stock, licensed, and public domain content, making its outputs usable in commercial applications without fear of them running afoul of copyright or intellectual property protections. And, unlike Grok 2, there's minimal chance of it outputting racist, offensive, or illegal content.
