If you’ve ever tried editing photos or analyzing intricate imagery for scientific or security purposes, you know how challenging it can be to identify and separate different objects within an image.
The process often involves starting from scratch every time you want to analyze a new part of the image. But Meta, formerly known as Facebook, is changing the game with its latest AI tool.
Meta has introduced the “Segment Anything Model” or “SAM,” an AI-based model designed to make image analysis more accessible and efficient.
With SAM, researchers and web developers can now create “cutouts” or segments of any item in an image with just a few clicks or by drawing a box around the object.
This versatile tool can be used for research purposes, creative editing, or even enhancing the virtual reality (VR) experience by quickly and effectively dissecting different parts of an image.
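The workflow SAM enables — a point or box prompt in, an object mask out, a cutout extracted — can be sketched in miniature. The toy below is not Meta's API (the real open-source model requires a multi-gigabyte checkpoint); it simply simulates a "click prompt" against a pre-labeled grid to show what a segmentation mask and cutout are. All names here are hypothetical.

```python
import numpy as np

def mask_from_click(labels: np.ndarray, click: tuple) -> np.ndarray:
    """Toy 'point prompt': return a boolean mask of every pixel that
    shares the clicked pixel's label — a stand-in for the object mask
    a real segmentation model would predict from the click."""
    return labels == labels[click]

def cutout(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep only the masked pixels; zero out the background."""
    return np.where(mask[..., None], image, 0)

# A 4x4 'image' containing two objects: label 1 (top-left block)
# and label 2 (bottom-right block); 0 is background.
labels = np.array([[1, 1, 0, 0],
                   [1, 1, 0, 0],
                   [0, 0, 2, 2],
                   [0, 0, 2, 2]])
image = np.random.randint(0, 255, size=(4, 4, 3), dtype=np.uint8)

mask = mask_from_click(labels, (0, 0))  # 'click' inside object 1
segment = cutout(image, mask)           # cutout of object 1 only
```

A real model does the hard part this toy skips: predicting the mask from raw pixels rather than reading it off a label grid.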
What sets SAM apart is Meta’s commitment to providing a comprehensive solution. The company has launched the browser-based tool to the public and open-sourced its computer vision model.
Meta claims that its model is trained on an extensive dataset of 1.1 billion segmentation masks (each representing a distinct part of an image) and 11 million images licensed from a prominent photo company.
Although Meta hasn’t disclosed the identity of the company, it’s clear that they have leveraged a vast amount of data to train their AI model.
The creation of this dataset was no small feat. Meta AI, the artificial intelligence research arm of the social media giant, collaborated with 130 human annotators based in Kenya.
Through a combination of manual and automatic labeling, they meticulously annotated more than a billion segments across millions of images, ensuring the dataset's reliability and accuracy.
While object recognition and computer vision technologies have existed for years, Meta’s approach aims to merge the power of AI foundation models with the potential of computer vision.
Startups like Runway and established players like Adobe have already introduced AI-based tools for detecting and selecting different objects within images.
However, Meta’s ambition is to take these advancements a step further by encouraging users to build on top of their generalized model and develop more specific applications in fields like biology and agriculture.
The release of SAM coincides with Meta’s plans to incorporate generative AI into its advertising strategy for Instagram and Facebook. In February, Meta CEO Mark Zuckerberg announced the formation of a dedicated product team focused solely on building generative AI tools.
From artificial personas to Instagram filters and chat-based features in WhatsApp and Instagram, Meta is determined to leverage the potential of AI across its platforms.
The SAM tool specifically caters to users who lack the necessary AI infrastructure or data capacity to create their own models for image segmentation.
Because it runs in real time in the browser, Meta’s tool is easily accessible to people who lack GPU resources of their own. That accessibility opens the door to numerous edge use cases that were previously out of reach for resource-intensive methods.
However, it’s important to acknowledge that a computer vision model trained on two-dimensional images has limitations. For example, detecting and selecting a remote held upside down would require training the model on different orientations of the same object.
Additionally, models trained on 2D images may struggle to identify partially occluded objects. This means they might fail to recognize non-standardized objects viewed through AR/VR headsets, or miss partially obscured objects in public spaces when used by autonomous vehicle manufacturers.
Meta recognizes the potential of SAM within its own virtual reality spaces, such as the online VR game Horizon Worlds.
The object detection tool developed by Meta’s researchers, Alexander Kirillov and Nikhila Ravi, can be used for “gaze-based” detection of objects through VR and augmented reality (AR) headsets.
The applications of this generalized image segmenting model are far-reaching. Kirillov shares an example of researchers needing to count and identify trees in photos collected for studying fires in California. With SAM, they can now accomplish this task efficiently, saving time and effort.
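Once a model has produced per-object masks, counting objects like trees reduces to tallying distinct regions. The sketch below illustrates that final step with a plain connected-components count over a binary "canopy" map — a stand-in for the list of masks a segmentation model would return, not Meta's actual pipeline.

```python
import numpy as np

def count_objects(binary: np.ndarray) -> int:
    """Count 4-connected foreground blobs via iterative flood fill —
    a toy stand-in for tallying per-object segmentation masks."""
    seen = np.zeros_like(binary, dtype=bool)
    h, w = binary.shape
    count = 0
    for i in range(h):
        for j in range(w):
            if binary[i, j] and not seen[i, j]:
                count += 1               # found a new, unvisited blob
                stack = [(i, j)]
                while stack:             # flood-fill the whole blob
                    y, x = stack.pop()
                    if 0 <= y < h and 0 <= x < w and binary[y, x] and not seen[y, x]:
                        seen[y, x] = True
                        stack += [(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)]
    return count

# Three separate 'trees' in an aerial-style binary map.
canopy = np.array([[1, 1, 0, 0, 1],
                   [1, 0, 0, 0, 1],
                   [0, 0, 1, 0, 0],
                   [0, 0, 1, 1, 0]], dtype=bool)
print(count_objects(canopy))  # → 3
```

The heavy lifting in the real use case is producing the masks themselves; the bookkeeping afterward is this simple.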
Meta’s AI tool represents a significant step toward simplifying image analysis and empowering researchers and developers in various domains.
By democratizing access to advanced computer vision capabilities, Meta hopes to foster innovation and enable users to unlock the full potential of this technology in diverse fields.
Whether you’re a biologist, an artist, or a VR enthusiast, Meta’s SAM tool brings powerful image analysis capabilities to your fingertips, opening up new possibilities for exploration and creativity.