Facebook unveils SEER, a self-supervised computer-vision model

ForkLog


Facebook has developed SEER (SElf-supERvised), a computer-vision model that can learn to recognise objects in photographs with minimal human input.

Future AI systems will learn as people do, without relying on labeled data sets. Today we’re detailing SEER, a breakthrough in self-supervised #computervision, and open-sourcing VISSL, the library we used to build it. Learn more: https://t.co/CBg6ZkiqFU

— Facebook AI (@facebookai) March 4, 2021

SEER can be trained on any random collection of images from the Internet, without the meticulous curation and labeling that most of today's computer-vision algorithms depend on.
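To illustrate what learning without labels can look like, here is a minimal sketch of one common self-supervised idea: two randomly altered views of the same image should produce similar representations. It is not SEER's actual training procedure. The functions `make_view`, `embed` and `self_supervised_loss` are hypothetical names, the "image" is a short list of pixel values, and the hand-written `embed` merely stands in for a neural network.

```python
import math
import random


def make_view(image, rng):
    """Return a randomly altered copy of a 1-D 'image' (a list of floats)."""
    view = list(image)
    if rng.random() < 0.5:
        view.reverse()  # random horizontal flip
    # Add a small amount of pixel noise to every value.
    return [pixel + rng.gauss(0.0, 0.01) for pixel in view]


def embed(image):
    """Hypothetical stand-in for a neural network: a few summary features."""
    n = len(image)
    mean = sum(image) / n
    variance = sum((pixel - mean) ** 2 for pixel in image) / n
    return [mean, math.sqrt(variance), max(image), min(image)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def self_supervised_loss(image, rng):
    """Low when two views of the same image get similar embeddings.

    No label is used: the training signal comes from the image itself.
    """
    view_a = make_view(image, rng)
    view_b = make_view(image, rng)
    return 1.0 - cosine_similarity(embed(view_a), embed(view_b))


rng = random.Random(0)  # fixed seed so the run is repeatable
unlabeled_image = [0.1, 0.4, 0.9, 0.3, 0.7, 0.2]
loss = self_supervised_loss(unlabeled_image, rng)
print(f"loss between two views: {loss:.6f}")
```

In a real system the embedding function would be a trainable network, and its weights would be adjusted to reduce a loss of this kind over a very large number of images.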

After pretraining on a billion random, unlabeled and uncurated public Instagram images, the model achieved a top-1 accuracy of 84.2% on ImageNet, a large visual data set that computer-vision developers use to verify the accuracy of their algorithms.
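The 84.2% figure is a top-1 accuracy: the share of test images for which the model's single best guess matches the reference label. A minimal sketch of that calculation follows; the function name `top1_accuracy` and the toy labels are illustrative assumptions, not part of any Facebook code.

```python
def top1_accuracy(predictions, labels):
    """Fraction of items whose single predicted class equals the true class."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one labeled example is required")
    correct = sum(1 for guess, truth in zip(predictions, labels) if guess == truth)
    return correct / len(labels)


# Toy example: four of the five guesses match the reference labels.
predicted = ["cat", "dog", "car", "dog", "cat"]
expected = ["cat", "dog", "car", "cat", "cat"]
accuracy = top1_accuracy(predicted, expected)
print(f"{accuracy:.1%}")  # prints 80.0%
```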

Yann LeCun, Facebook's chief AI scientist, calls the self-supervised approach one of the most promising ways to build machines endowed with foundational knowledge and capable of tackling tasks far beyond the reach of current AI models.

“Self-supervised learning could have many useful applications, for example, learning to read medical images without the need to label a large number of X-ray scans,” says LeCun.

He added that a similar approach is already used to automatically generate hashtags for images on Instagram. He also said that SEER technology could be used at Facebook to match ads with posts or to filter unwanted content.

Facebook said that some of the technology underlying SEER will be freely available to developers. The model itself, however, will remain closed because it was trained on Instagram user data.

Earlier, in January 2021, Facebook introduced a new version of automatic alternative text (AAT) for photographs, which uses machine learning to describe images for people with visual impairments.
