
Nvidia unveils VideoLDM, a text-to-video generator

20.04.2023 ForkLog

Nvidia has developed VideoLDM, a neural network that generates short, realistic videos from text descriptions.

The model produces clips of about five seconds at resolutions up to 2048×1280 pixels and 24 frames per second. It can generate video from both simple and complex prompts.

VideoLDM draws on advances from the Stable Diffusion algorithm. The model comprises about 4.1 billion parameters, of which 2.7 billion were trained on video.
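To put the reported figures in perspective, the short Python sketch below works out the frame count of one clip and the share of parameters trained on video. It uses only the numbers quoted in the article; the constant and function names are illustrative assumptions, and the "about" values make the results approximate.

```python
# Figures as reported in the article; the "about" values are approximate.
CLIP_SECONDS = 5            # "about five seconds"
FPS = 24                    # frames per second
WIDTH, HEIGHT = 2048, 1280  # maximum reported resolution
TOTAL_PARAMS_B = 4.1        # total parameters, in billions (approximate)
VIDEO_PARAMS_B = 2.7        # parameters trained on video, in billions


def clip_stats(seconds, fps, width, height):
    """Return (frame count, pixels per frame, total pixels) for one clip."""
    frames = seconds * fps
    pixels_per_frame = width * height
    return frames, pixels_per_frame, frames * pixels_per_frame


def video_param_share(video_b, total_b):
    """Return the fraction of parameters that were trained on video."""
    return video_b / total_b


frames, per_frame, total = clip_stats(CLIP_SECONDS, FPS, WIDTH, HEIGHT)
share = video_param_share(VIDEO_PARAMS_B, TOTAL_PARAMS_B)

print(f"{frames} frames of {per_frame:,} pixels ({total:,} pixels per clip)")
print(f"{share:.0%} of parameters trained on video")
```

On these assumptions, a five-second clip comes to roughly 120 frames, and about two thirds of the model's parameters were trained on video.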

The company said it had achieved “significant progress” and had trained the neural network relatively quickly. According to the developers, VideoLDM began generating detailed videos that match their descriptions after just a month.

The developers published several examples of the network’s output on their site.

“A turtle swims in the ocean.” Data: Nvidia.

“A fighter jet vacuums a sandy beach.” Data: Nvidia.

“A fox in a suit dances in the park.” Data: Nvidia.

“A lion stands on a surfboard in the ocean at sunset, 4K, high resolution.” Data: Nvidia.

“Two pandas sit at a table and play cards, 4K, high resolution.” Data: Nvidia.

“Pouring beer into a glass from a low angle.” Data: Nvidia.

The model can also generate driving scenes. Such videos have a resolution of 1024×512 pixels and last up to five minutes.

VideoLDM can model specific driving scenarios and predict the behavior of objects on the road. According to the developers, this makes the generated frames realistic.

Example of a generated driving scene. Data: Nvidia.

The paper is part of the program of the IEEE Conference on Computer Vision and Pattern Recognition, which will be held in Vancouver from June 18 to 22. It is unclear whether Nvidia plans to release the algorithm publicly.

In April, Meta unveiled a tool for image and video segmentation.

In March, Microsoft released a preview version of Bing Image Creator.
