Creating realistic 3D shapes using generative AI

A simple fix to an existing technique.


MIT researchers have addressed a key challenge in generating high-quality 3D models by improving the Score Distillation technique. This technique traditionally uses 2D image generation models to create 3D shapes.

While 2D generative AI models can produce lifelike images, they aren’t designed for 3D. When applied to 3D shapes, these models often produce blurry or cartoonish outputs.

By investigating the differences between 2D and 3D generation algorithms, the researchers identified the root causes of the low-quality results. They then introduced a simple fix to enhance Score Distillation, allowing for the creation of sharper, more realistic 3D shapes, bringing them closer in quality to the best 2D models.

Unlike other methods, which require expensive and time-consuming retraining or fine-tuning of the AI model, this new technique achieves 3D shape quality on par with those methods and doesn’t require additional training or complex postprocessing.


Also, identifying the root cause of the problem allowed researchers to improve their mathematical understanding of Score Distillation.

Artem Lukoianov, an electrical engineering and computer science (EECS) graduate student and lead author of a paper on this technique, said, “Now we know where we should be heading, which allows us to find more efficient solutions that are faster and higher-quality.”

From 2D images to 3D shapes

The researchers investigated the Score Distillation Sampling (SDS) process and discovered a key issue causing blurry or cartoonish 3D shapes: a mismatch between a formula used in SDS and its counterpart in 2D diffusion models. This formula dictates how the model updates a random representation, gradually adding and removing noise until it resembles the desired image.


Because the equation is so complex, SDS replaces it with randomly sampled noise at each step, which leads to subpar results. The researchers identified this reliance on random noise as the cause of the lower-quality 3D shapes, including their blurriness and lack of realism.
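The role of the random noise can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the researchers' code: the "diffusion model" is a closed-form noise predictor for Gaussian data (the constants MU and S are hypothetical), and the "rendering" is a list of three pixel values that is updated directly instead of going through a 3D renderer. All function names are invented for the example.

```python
import math
import random

# Toy stand-in for a pretrained 2D diffusion model (an assumption for
# illustration): its training "images" are Gaussian around MU with standard
# deviation S, so the ideal noise prediction has a closed form.
MU = [0.8, 0.2, 0.5]
S = 0.5


def predict_noise(x_t, alpha_bar):
    """Noise the toy model believes was added to produce x_t."""
    var = alpha_bar * S**2 + (1.0 - alpha_bar)
    scale = math.sqrt(1.0 - alpha_bar) / var
    return [scale * (p - math.sqrt(alpha_bar) * m) for p, m in zip(x_t, MU)]


def sds_gradient(render, alpha_bar, noise):
    """Score-distillation-style gradient for one given noise sample.

    The rendering is noised, the model predicts that noise, and the gradient
    is the gap between the prediction and the noise that was actually added.
    """
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [a * p + b * e for p, e in zip(render, noise)]
    predicted = predict_noise(x_t, alpha_bar)
    return [p - e for p, e in zip(predicted, noise)]


def sds_step(render, rng, lr=0.05):
    """One update that draws a fresh noise level and fresh random noise."""
    alpha_bar = rng.uniform(0.05, 0.95)
    noise = [rng.gauss(0.0, 1.0) for _ in render]
    grad = sds_gradient(render, alpha_bar, noise)
    return [p - lr * g for p, g in zip(render, grad)]


rng = random.Random(0)
render = [0.0, 0.0, 0.0]  # stands in for a rendered view of the 3D shape
for _ in range(2000):
    render = sds_step(render, rng)
```

In this toy, a nonzero noise sample gives a nonzero gradient even when the rendering already equals MU, so every update keeps nudging the result in a random direction. That is a small-scale picture of the noise dependence described above.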

The researchers improved SDS by replacing the randomly sampled noise term with an approximation that infers the missing noise from the current 3D shape rendering. As their analysis predicted, this approach produces sharper and more realistic 3D shapes.
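One way to picture "inferring the noise from the current rendering" is DDIM inversion: push the rendering deterministically up the noise schedule, then read off the noise that the result implies. The sketch below assumes that approach, which the title of the cited paper suggests but this article does not spell out. It is not the authors' implementation. The closed-form Gaussian "model" (MU, S), the schedule ALPHA_BARS, and all function names are hypothetical, and the rendering is again a short list of pixel values updated directly.

```python
import math

# Toy stand-in for a pretrained diffusion model (an assumption): data are
# Gaussian around MU with standard deviation S.
MU = [0.8, 0.2, 0.5]
S = 0.5
# Illustrative schedule: remaining signal level at increasing noise levels.
ALPHA_BARS = [0.95, 0.85, 0.7, 0.5, 0.3]


def predict_noise(x_t, alpha_bar):
    """Noise the toy model believes was added to produce x_t."""
    var = alpha_bar * S**2 + (1.0 - alpha_bar)
    scale = math.sqrt(1.0 - alpha_bar) / var
    return [scale * (p - math.sqrt(alpha_bar) * m) for p, m in zip(x_t, MU)]


def infer_noise(render, level):
    """DDIM-invert the clean rendering up to ALPHA_BARS[level].

    Returns the final signal level and the noise implied by the inverted
    sample, i.e. the noise that maps `render` onto that sample.
    """
    x, prev = list(render), 1.0  # prev = 1.0 means "no noise yet"
    for alpha_bar in ALPHA_BARS[: level + 1]:
        eps = predict_noise(x, prev)
        clean = [(p - math.sqrt(1.0 - prev) * e) / math.sqrt(prev)
                 for p, e in zip(x, eps)]
        x = [math.sqrt(alpha_bar) * c + math.sqrt(1.0 - alpha_bar) * e
             for c, e in zip(clean, eps)]
        prev = alpha_bar
    a, b = math.sqrt(prev), math.sqrt(1.0 - prev)
    return prev, [(p - a * r) / b for p, r in zip(x, render)]


def gradient_with_inferred_noise(render, level):
    """Same gradient form as score distillation, but with inferred noise."""
    alpha_bar, noise = infer_noise(render, level)
    a, b = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    x_t = [a * p + b * e for p, e in zip(render, noise)]
    predicted = predict_noise(x_t, alpha_bar)
    return [p - e for p, e in zip(predicted, noise)]


render = [0.0, 0.0, 0.0]  # stands in for a rendered view of the 3D shape
for step in range(200):
    level = step % len(ALPHA_BARS)  # cycle through the noise levels
    grad = gradient_with_inferred_noise(render, level)
    render = [p - 0.05 * g for p, g in zip(render, grad)]
```

Because the inferred noise is a deterministic function of the rendering, the same rendering always yields the same gradient, and in this toy setup the gradient vanishes once the rendering matches MU. A real pipeline would backpropagate the gradient through a differentiable renderer into the 3D representation.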

Additionally, the researchers increased the resolution of the image rendering and adjusted some model parameters to enhance the quality of the 3D shapes further. They successfully used an off-the-shelf, pretrained image diffusion model to generate smooth, realistic 3D shapes, avoiding the need for costly retraining.


The final 3D objects are comparable in sharpness to those produced by other advanced methods.

Lukoianov said, “Trying to experiment with different parameters blindly, sometimes it works, and sometimes it doesn’t, but you don’t know why. We know this is the equation we need to solve. Now, this allows us to think of more efficient ways to solve it.”

“Because the method relies on a pretrained diffusion model, it inherits the biases and shortcomings of that model, making it prone to hallucinations and other failures. Improving the underlying diffusion model would enhance the process.”

Researchers are now studying the formula to see how they could solve it more effectively. They are interested in exploring how these insights could improve image editing techniques.

Journal Reference:

  1. Artem Lukoianov, Haitz Saez et al. Score distillation via reparametrized DDIM.