AI dente —

AI-generated video of Will Smith eating spaghetti astounds with terrible beauty

Open source "text2video" ModelScope AI made the viral sensation possible.

Benj Edwards - Mar 30, 2023 9:02 pm UTC

Stills from an AI-generated video of Will Smith eating spaghetti that has been heating up the Internet.
chaindrop / Reddit

Amid this past week's controversies in AI over regulation, fears of world-ending doom, and job disruption, the clouds have parted. For a brief and shining moment, we can enjoy an absolutely ridiculous AI-generated video of Will Smith eating spaghetti that is now lighting up our lives with its terrible glory.

On Monday, a Reddit user named "chaindrop" shared the AI-generated video on the r/StableDiffusion subreddit. It quickly spread to other forms of social media and inspired mixed ruminations in the press. For example, Vice said the video will "haunt you for the rest of your life," while the AV Club called it the "natural end point for AI development."

We're somewhere in between. The 20-second silent video consists of 10 independently generated two-second segments stitched together. Each one shows different angles of a simulated Will Smith (at one point, even two Will Smiths) ravenously gobbling up spaghetti. It's entirely computer-generated, thanks to AI.

And you will see it now:

We know what you're thinking: "Didn't I see this kind of advanced deepfake technology in 1987's The Running Man?" No, that was Jesse "The Body" Ventura defeating a fake Arnold Schwarzenegger in a dystopic game show cage match, set somewhere between 2017 and 2019. Here in 2023, we have fake Will Smith eating spaghetti.

This feat is possible due to a new open source AI tool called ModelScope, released a few weeks ago by DAMO Vision Intelligence Lab, a research division of Alibaba. ModelScope is a "text2video" diffusion model that has been trained to create new videos from prompts by analyzing millions of images and thousands of videos scraped into the LAION-5B, ImageNet, and WebVid datasets. That includes videos from Shutterstock, hence the ghostly "Shutterstock" watermark on its output.

AI community Hugging Face currently hosts an online demo of ModelScope, although it requires an account, and you'll need to pay for compute time to run it. We tried to use it, but it was overloaded, likely due to Smith's spaghetti mania.

According to chaindrop, the workflow for creating the video was fairly simple: give ModelScope the prompt "Will Smith eating spaghetti" and generate it at 24 frames per second (FPS). Next, chaindrop used the Flowframes interpolation tool to increase the FPS from 24 to 48 and then slowed it down to half speed, resulting in a smoother video.
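For the curious, the frame-rate arithmetic behind that workflow can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration, not chaindrop's actual process or anything Flowframes exposes: the function name `interpolate_then_slow`, its parameters, and the one-second source clip in the usage line are all hypothetical, and we assume the interpolator outputs exactly twice as many frames as it receives.

```python
from fractions import Fraction


def interpolate_then_slow(frame_count, source_fps, interp_factor, speed):
    """Return (frame count, playback FPS, duration in seconds) after
    frame interpolation followed by a playback speed change.

    Hypothetical helper for illustration only. Assumes the interpolator
    emits exactly `interp_factor` times as many frames as it is given.
    """
    # Interpolation multiplies both the frame count and the frame rate,
    # so the clip's duration is unchanged at this point.
    new_frames = frame_count * interp_factor
    interpolated_fps = source_fps * interp_factor

    # Slowing playback keeps every frame but shows them at a lower rate.
    playback_fps = Fraction(interpolated_fps) * Fraction(speed)
    duration = Fraction(new_frames) / playback_fps
    return new_frames, playback_fps, duration


# Assumed input: a 24-frame clip at 24 FPS, doubled to 48 FPS,
# then slowed to half speed.
frames, fps, seconds = interpolate_then_slow(24, 24, 2, Fraction(1, 2))
print(frames, fps, seconds)  # prints "48 24 2"
```

Under those assumptions, the result plays back at the original 24 FPS but lasts twice as long, with every second frame synthesized by the interpolator. That is where the smoother, slow-motion look would come from.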

Of course, ModelScope isn't the only game in town regarding the emerging field of text2video. Recently, Runway debuted "Gen-2," and we've previously covered early text2video research projects from Meta and Google.

Since Will Smith eating spaghetti became a viral hit, the Internet has been graced with follow-ups such as Scarlett Johansson and Joe Biden eating spaghetti. There's even Smith eating meatballs, a video that is perhaps actually truly horrifying. But it's still great somehow—perfect future meme fodder.

Of course, once the outputs of these text2video tools get too realistic, we'll have other issues to deal with—deep social and cultural issues, likely. But for now, let's enjoy ModelScope's imperfect, horrible glory. We apologize in advance.

Benj Edwards is an AI and Machine Learning Reporter for Ars Technica. In his free time, he writes and records music, collects vintage computers, and enjoys nature. He lives in Raleigh, NC.
