{"id":77576,"date":"2023-04-20T16:35:33","date_gmt":"2023-04-20T13:35:33","guid":{"rendered":"https:\/\/forklog.com\/en\/?p=77576"},"modified":"2025-09-11T00:05:46","modified_gmt":"2025-09-10T21:05:46","slug":"nvidia-unveils-videoldm-a-text-to-video-generator","status":"publish","type":"post","link":"https:\/\/forklog.com\/en\/nvidia-unveils-videoldm-a-text-to-video-generator\/","title":{"rendered":"Nvidia unveils VideoLDM, a text-to-video generator"},"content":{"rendered":"<p>Nvidia has <a href=\"https:\/\/research.nvidia.com\/labs\/toronto-ai\/VideoLDM\/samples.html\" target=\"_blank\" rel=\"noopener nofollow\" title=\"\">developed<\/a> VideoLDM, a neural network that generates short, realistic videos from text descriptions.<\/p>\n<p>The model produces clips of about five seconds at resolutions of up to 2048\u00d71280 pixels and 24 frames per second. It can generate video for both simple and complex prompts.<\/p>\n<p>VideoLDM builds on the Stable Diffusion model. It comprises about 4.1 billion parameters, of which 2.7 billion were trained on video.<\/p>\n<p>The company said it had made &#8220;significant progress&#8221; in training the neural network in a short time. 
According to the developers, VideoLDM began generating detailed videos that match the descriptions in just a month.<\/p>\n<p>The developers published several examples of the network&#8217;s work on their <a href=\"https:\/\/research.nvidia.com\/labs\/toronto-ai\/VideoLDM\/samples.html\" target=\"_blank\" rel=\"noopener nofollow\" title=\"\">site<\/a>.<\/p>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/video26.mp4\"><\/video><figcaption>&#8220;A turtle swims in the ocean.&#8221; Data: Nvidia.<\/figcaption><\/figure>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/video14.mp4\"><\/video><figcaption>&#8220;A fighter jet vacuums a sandy beach.&#8221; Data: Nvidia.<\/figcaption><\/figure>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/video7.mp4\"><\/video><figcaption>&#8220;A fox in a suit dances in the park.&#8221; Data: Nvidia.<\/figcaption><\/figure>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/video10.mp4\"><\/video><figcaption>&#8220;A lion stands on a surfboard in the ocean at sunset, 4K, high resolution.&#8221; Data: Nvidia.<\/figcaption><\/figure>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/video27.mp4\"><\/video><figcaption>&#8220;Two pandas sit at a table and play cards, 4K, high resolution.&#8221; Data: Nvidia.<\/figcaption><\/figure>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/video57.mp4\"><\/video><figcaption>&#8220;Pouring beer into a glass from a low angle.&#8221; Data: Nvidia.<\/figcaption><\/figure>\n<p>The model can also generate driving scenes. Such videos have a resolution of 1024\u00d7512 pixels and last up to five minutes.<\/p>\n<p>VideoLDM can model specific driving scenarios and predict the behavior of objects on the road. 
According to the developers, this makes the generated footage look realistic.<\/p>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/high_res_driving_1.mp4\"><\/video><\/figure>\n<figure class=\"wp-block-video\"><video controls src=\"https:\/\/forklog.com\/wp-content\/uploads\/high_res_driving_2.mp4\"><\/video><figcaption>Example of a generated driving scene. Data: Nvidia.<\/figcaption><\/figure>\n<p>The paper has been accepted to the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), which will be held in Vancouver from June 18 to 22. It is unclear whether Nvidia plans to release the model publicly.<\/p>\n<p>In April, Meta unveiled a tool for image and video segmentation.<\/p>\n<p>In March, Microsoft <a href=\"https:\/\/forklog.com\/en\/news\/microsoft-integrates-image-generator-into-bing-and-edge\">released<\/a> a preview version of Bing Image Creator.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Nvidia has developed the VideoLDM neural network that generates short and realistic videos from text 
descriptions.<\/p>\n","protected":false},"author":1,"featured_media":77577,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"select":"1","news_style_id":"1","cryptorium_level":"","_short_excerpt_text":"","creation_source":"","_metatest_mainpost_news_update":false,"footnotes":""},"categories":[3],"tags":[438,1760,1294],"class_list":["post-77576","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-analysis","tag-artificial-intelligence","tag-generative-ai","tag-nvidia"],"aioseo_notices":[],"amp_enabled":true,"views":"34","promo_type":"1","layout_type":"1","short_excerpt":"","is_update":"","_links":{"self":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/77576","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/comments?post=77576"}],"version-history":[{"count":1,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/77576\/revisions"}],"predecessor-version":[{"id":77578,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/77576\/revisions\/77578"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/media\/77577"}],"wp:attachment":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/media?parent=77576"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/categories?post=77576"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/tags?post=77576"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}