{"id":12554,"date":"2024-04-15T12:23:09","date_gmt":"2024-04-15T09:23:09","guid":{"rendered":"https:\/\/forklog.com\/en\/xai-unveils-first-multimodal-version-of-grok-1-5v\/"},"modified":"2024-04-15T12:23:09","modified_gmt":"2024-04-15T09:23:09","slug":"xai-unveils-first-multimodal-version-of-grok-1-5v","status":"publish","type":"post","link":"https:\/\/forklog.com\/en\/xai-unveils-first-multimodal-version-of-grok-1-5v\/","title":{"rendered":"xAI Unveils First Multimodal Version of Grok-1.5V"},"content":{"rendered":"<p>Elon Musk&#8217;s company xAI <a href=\"https:\/\/x.ai\/blog\/grok-1.5v\">has unveiled<\/a> a new version of its chatbot, Grok, capable of processing requests in various formats.<\/p>\n<p>The presentation came just weeks after the release of the previous version.<\/p>\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>&#8220;Grok-1.5V competes with existing multimodal models in several areas: from interdisciplinary reasoning to understanding scientific diagrams, graphs, and screenshots,&#8221; the blog states.<\/p>\n<\/blockquote>\n<p>In the announcement, the developers provided several examples demonstrating the chatbot&#8217;s new capabilities:<\/p>\n<ul class=\"wp-block-list\">\n<li>converting a flowchart sketch into Python code;<\/li>\n<li>generating a bedtime story from a child&#8217;s drawing;<\/li>\n<li>explaining memes;<\/li>\n<li>converting a table into CSV format.<\/li>\n<\/ul>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-eu.googleusercontent.com\/ENN9_Mm1kHq4Q-4-votGCdYqElbqMdma_LaUkQw1UqPNWePdyLgTc5qYtS89F4c-N8Xr1GLIUUxhZ6Exd-NwuVFhXWRYwK06y9t0uQ72UUUl92bw_nafL4yrYbClDYFGrjlmF7SK0LKKhGyt7QPzGO8\" alt=\"Startup xAI unveils first multimodal 
version of Grok-1.5V\"\/><figcaption class=\"wp-element-caption\">Example of converting a sketch into Python code. Data: xAI.<\/figcaption><\/figure>\n<p>After testing it against counterparts such as GPT-4V, Claude 3 Sonnet, Claude 3 Opus, and Gemini Pro 1.5, xAI claims its multimodal model leads on many metrics.<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-eu.googleusercontent.com\/H_F0UqY2CsoWQ23MI62ugpy-rzHgKv1aP5fwzYp6YFIKEfDI1yX9KOHQ7ud35ktk8mwKdEiHatUqmw-CvVncqEsAgrclUcAyiGNuY3EFrX87TohWg0X3MWraC6mwXhx2TllEw2FeCSrVkJsw-wh20no\" alt=\"Startup xAI unveils first multimodal version of Grok-1.5V\"\/><figcaption class=\"wp-element-caption\">Comparison of AI models. Data: xAI.<\/figcaption><\/figure>\n<p>Company representatives emphasized that Grok-1.5V surpasses its competitors on RealWorldQA, a new benchmark designed to assess real-world spatial understanding.<\/p>\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/lh7-eu.googleusercontent.com\/dRXWXuI3rKMCPJj3q-7Aj9Bm18Rh1Kd-UoN-eL59CRO-I-jVWh9DXQk2J_wZyYM7zhfAk7X7F8byDAso3nAByuBNF6d2mmR6NJ6BAcH22KDAOPaEVkGCv7vNGhVVuFmLKb79xZlxhAiI6iBOdG_Vz_M\" alt=\"Startup xAI unveils first multimodal version of Grok-1.5V\"\/><figcaption class=\"wp-element-caption\">Examples from RealWorldQA. Data: xAI.<\/figcaption><\/figure>\n<p>The initial release of the benchmark consists of over 700 images, each accompanied by a question and an answer. 
xAI has made RealWorldQA publicly available under a Creative Commons license.<\/p>\n<p>Grok-1.5V was launched less than a month after xAI <a href=\"https:\/\/forklog.com\/en\/news\/elon-musk-announces-open-sourcing-of-groks-code\">released<\/a> the model&#8217;s open-source code.<\/p>\n<p>According to the developers, &#8220;significant&#8221; updates to the chatbot&#8217;s ability to understand and generate multimodal signals are expected in the coming months.<\/p>\n<p>Early testers and current users will gain access to Grok-1.5V shortly.<\/p>\n<p>Back in December 2023, xAI representatives notified the <span data-descr=\"U.S. Securities and Exchange Commission\" class=\"old_tooltip\">SEC<\/span> of plans to raise $1 billion through a private sale of equity securities.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Elon Musk&#8217;s company xAI has unveiled a new version of its chatbot, Grok, capable of processing requests in various formats. The presentation came just weeks after the release of the previous version. &#8220;Grok-1.5V competes with existing multimodal models in several areas: from interdisciplinary reasoning to understanding scientific diagrams, graphs, and screenshots,&#8221; the blog states. 
The [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":12553,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"select":"","news_style_id":"","cryptorium_level":"","_short_excerpt_text":"","creation_source":"","_metatest_mainpost_news_update":false,"footnotes":""},"categories":[3],"tags":[438,1217,1493],"class_list":["post-12554","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-and-analysis","tag-artificial-intelligence","tag-elon-musk","tag-explainable-ai"],"aioseo_notices":[],"amp_enabled":true,"views":"59","promo_type":"","layout_type":"","short_excerpt":"","is_update":"","_links":{"self":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/12554","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/comments?post=12554"}],"version-history":[{"count":0,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/posts\/12554\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/media\/12553"}],"wp:attachment":[{"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/media?parent=12554"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/categories?post=12554"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/forklog.com\/en\/wp-json\/wp\/v2\/tags?post=12554"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}