ForkLog

OpenAI unveils POINT-E, a 3D-generation model

OpenAI has released POINT-E, a new algorithm for generating three-dimensional objects from text prompts.

According to the paper, the models need a single Nvidia V100 GPU and about two minutes to create an object.

The algorithm does not generate 3D objects in the traditional sense. It creates ‘point clouds’, discrete sets of data points in space that represent a three-dimensional form.

The researchers noted that such data are easier to synthesise computationally. However, they do not capture the object’s detailed structure, shape, or texture.
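To make the idea of a point cloud concrete, here is a minimal sketch in plain Python. It is not OpenAI's code: the `Point` class and the `sample_sphere_cloud` and `bounding_box` helpers are hypothetical names chosen for illustration, and the cloud is simply sampled from the surface of a unit sphere. Note that the data holds only positions and colours, with no faces or edges connecting the points.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """One coloured point: a position (x, y, z) and a colour (r, g, b) in [0, 1]."""
    x: float
    y: float
    z: float
    r: float
    g: float
    b: float


def sample_sphere_cloud(n, seed=0):
    """Return n coloured points on the unit sphere (illustrative data only)."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        # A normalised Gaussian vector is uniformly distributed on the sphere.
        while True:
            dx, dy, dz = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
            norm = (dx * dx + dy * dy + dz * dz) ** 0.5
            if norm > 1e-9:
                break
        x, y, z = dx / norm, dy / norm, dz / norm
        # Derive a colour from the position so that each point carries RGB values.
        points.append(Point(x, y, z, (x + 1) / 2, (y + 1) / 2, (z + 1) / 2))
    return points


def bounding_box(points):
    """Return the (min corner, max corner) of the cloud."""
    xs = [p.x for p in points]
    ys = [p.y for p in points]
    zs = [p.z for p in points]
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


cloud = sample_sphere_cloud(1024)
lo_corner, hi_corner = bounding_box(cloud)
print(len(cloud), "points; bounding box from", lo_corner, "to", hi_corner)
```

Because nothing links one point to its neighbours, a renderer cannot tell where the surface lies between the samples, which is the limitation the researchers describe.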

Three-dimensional objects created with POINT-E. Data: OpenAI.

To overcome this limitation, the OpenAI team trained an additional AI system to convert POINT-E point clouds into meshes.

POINT-E itself consists of two parts:

The text-to-image model works similarly to DALL-E 2. It was trained on labelled images so that the algorithm understands associations between words and visual concepts.

The image-to-3D model was trained on images paired with three-dimensional objects.

For example, if you enter the text prompt ‘A cat eats a burrito’, POINT-E first generates a synthetic image consistent with the prompt. The second model then synthesises a rough ‘cloud’ of 1,024 points and refines the 3D object to 4,096 points.
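The flow described above can be sketched as a chain of three steps. This is only a structural illustration in plain Python, not the POINT-E implementation: `text_to_image`, `image_to_coarse_cloud`, `upsample_cloud`, and `generate` are hypothetical stand-ins that return placeholder data, and the upsampling step here merely jitters existing points instead of running a learned model. Only the point counts (1,024 and 4,096) come from the article.

```python
import random

COARSE_POINTS = 1024  # size of the rough cloud, per the article
FINE_POINTS = 4096    # size of the refined cloud, per the article


def text_to_image(prompt):
    """Hypothetical stand-in for the text-to-image model: returns a placeholder."""
    return {"prompt": prompt, "width": 64, "height": 64}


def image_to_coarse_cloud(image, rng):
    """Hypothetical stand-in for the image-to-3D model: random points in a cube."""
    return [
        (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        for _ in range(COARSE_POINTS)
    ]


def upsample_cloud(coarse, rng, jitter=0.02):
    """Placeholder refinement: keep the coarse points and add jittered copies.

    This is an assumption made for illustration; it only reproduces the
    change in point count, not the quality of a learned upsampler.
    """
    fine = list(coarse)
    while len(fine) < FINE_POINTS:
        x, y, z = rng.choice(coarse)
        fine.append((
            x + rng.gauss(0, jitter),
            y + rng.gauss(0, jitter),
            z + rng.gauss(0, jitter),
        ))
    return fine


def generate(prompt, seed=0):
    """Run the three placeholder stages in order and return every intermediate."""
    rng = random.Random(seed)
    image = text_to_image(prompt)
    coarse = image_to_coarse_cloud(image, rng)
    fine = upsample_cloud(coarse, rng)
    return image, coarse, fine


image, coarse, fine = generate("A cat eats a burrito")
print(image["prompt"], len(coarse), len(fine))
```

Splitting the work this way means the text is interpreted only once, by the image model, and the 3D stage works from the synthetic image it produced.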

Turning a 2D image into 3D. Data: OpenAI.

According to the researchers, the models were trained on a dataset of ‘several million’ 3D objects and associated metadata, after which POINT-E can generate coloured point clouds that correspond to text prompts. They acknowledged that the model’s output is imperfect, but highlighted how quickly it generates results.

“Although our method yields worse results in this evaluation than the most advanced methods, it provides samples in a small fraction of the time. This could make it more practical for certain applications or enable the discovery of higher-quality 3D objects,” the developers said.

OpenAI has published the project’s open-source code on GitHub.

In December, the company introduced the ChatGPT chatbot, built on a large language model.

In April, OpenAI released DALL-E 2, the second version of its text-to-image generator.
