SAM 3.1: Faster and More Accessible Real-Time Video Detection and Tracking With Multiplexing and Global Reasoning
Update March 27, 2026:
We’ve seen incredible adoption of SAM 3 over the last few months, and during that time, we’ve been working behind the scenes on updates to improve video processing efficiency. Today, we’re pleased to introduce SAM 3.1.
As a drop-in replacement for SAM 3, our updated model delivers a significant boost in video processing efficiency by introducing object multiplexing, which allows the model to track up to 16 objects in a single forward pass. This innovation doubles processing speed for videos with a moderate number of tracked objects, increasing throughput from 16 to 32 frames per second on a single H100 GPU. As a result, SAM 3.1 enables real-time object tracking in complex videos while reducing overall GPU resource requirements, making high-performance applications feasible on smaller, more accessible hardware.
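To see why sharing one forward pass roughly doubles throughput, consider a simple cost model: per-object tracking pays a decoder cost for every object on every frame, while a multiplexed pass amortizes most of that cost across objects. The sketch below is illustrative only; the millisecond figures and the `batch_overhead` factor are hypothetical numbers chosen to mirror the 16 → 32 FPS improvement, not measured SAM 3.1 internals.

```python
def fps_per_object_passes(n_objects, backbone_ms=20.0, decoder_ms=5.0):
    """Frame rate when every tracked object needs its own decoder pass."""
    frame_ms = backbone_ms + n_objects * decoder_ms
    return 1000.0 / frame_ms

def fps_multiplexed(n_objects, backbone_ms=20.0, decoder_ms=5.0, batch_overhead=0.25):
    """Frame rate when up to 16 objects share one multiplexed decoder pass.

    Each extra object adds only a small marginal cost (batch_overhead) rather
    than a full decoder pass; 0.25 is an assumed value for illustration.
    """
    assert n_objects <= 16, "SAM 3.1 multiplexes up to 16 objects per pass"
    frame_ms = backbone_ms + decoder_ms * (1 + batch_overhead * (n_objects - 1))
    return 1000.0 / frame_ms

# With ~8 tracked objects, this toy model goes from roughly 17 FPS to
# roughly 30 FPS -- the same order of speedup described above.
print(fps_per_object_passes(8), fps_multiplexed(8))
```

The key design point is that the marginal cost per object drops from a full decoder pass to a small batched increment, so the speedup grows with the number of tracked objects.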
Meta Segment Anything Model 3 (SAM 3) Overview
Meta has released Segment Anything Model 3 (SAM 3), the next-generation unified model for detection, segmentation, and tracking of objects in both images and videos. It supports highly flexible prompts including:
- Text prompts (short open-vocabulary noun phrases, e.g., “striped red umbrella”)
- Exemplar image prompts
- Traditional visual prompts (points, boxes, masks)
This addresses a major limitation of earlier models by enabling promptable concept segmentation — finding and segmenting all instances of a concept, even rare or nuanced ones not in fixed label sets.
Key Improvements
- 2x performance gain on the new Segment Anything with Concepts (SA-Co) benchmark for promptable concept segmentation in images and videos.
- Better accuracy in crowded scenes and interactive tasks compared to previous SAM models and strong baselines (e.g., OWLv2, Gemini 2.5 Pro).
- Fast inference: ~30ms per image (even with 100+ objects) on an H200 GPU; near real-time for video with multiple objects.
- SAM 3.1 update: Introduces multiplexing — processes all tracked objects together in a single pass instead of separate passes per object. This reduces redundant computation, lowers memory usage, and improves efficiency and accuracy, especially in crowded or complex video scenes.
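The multiplexing change in the last bullet can be sketched as a difference in how many decoder passes a video requires. The functions below are schematic stand-ins: the real model batches objects along a tensor dimension inside the network, not in Python loops, and `encode_frame` is a placeholder for the shared backbone.

```python
def encode_frame(frame):
    """Stand-in for the shared image backbone, run once per frame either way."""
    return sum(frame)  # placeholder for an image embedding

def track_per_object(frames, n_objects):
    """Pre-3.1 style: one decoder pass per object per frame."""
    passes = 0
    for frame in frames:
        _ = encode_frame(frame)
        for _obj in range(n_objects):
            passes += 1  # each object repeats work the others could share
    return passes

def track_multiplexed(frames, n_objects, max_batch=16):
    """SAM 3.1 style: objects share decoder passes, up to 16 at a time."""
    passes = 0
    for frame in frames:
        _ = encode_frame(frame)
        passes += -(-n_objects // max_batch)  # ceil(n_objects / max_batch)
    return passes

frames = [[1, 2, 3]] * 10          # 10 toy frames
print(track_per_object(frames, 8))   # 80 decoder passes
print(track_multiplexed(frames, 8))  # 10 decoder passes
```

For eight tracked objects, the per-frame decoder work drops from eight passes to one, which is the structural source of the efficiency and memory savings described above.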
Data & Training Innovation
Meta built a scalable hybrid data engine combining SAM 3, Llama-based AI annotators, and human reviewers. This made annotation ~5x faster for negative prompts and enabled creation of a massive training dataset covering over 4 million unique concepts.
Additional Releases
- SAM 3D: Open-source models and data for 3D object/scene reconstruction and human pose/shape estimation from a single image.
- Segment Anything Playground: A user-friendly web platform where anyone (no coding required) can experiment with SAM 3 for creative edits, annotations, and media modification (e.g., pixelating faces, adding effects, spotlighting objects).
- SA-Co benchmark dataset for community evaluation and research.
- Fine-tuning code and approaches to help users adapt SAM 3 to specific domains.
Real-World Applications
- Facebook Marketplace: “View in Room” feature uses SAM 3/SAM 3D to let users visualize furniture and decor in their own space.
- Creator tools: New effects coming to Instagram’s Edits app (apply dynamic effects to specific people/objects with one tap), Meta AI app (Vibes), and meta.ai.
- Science & Conservation: Powers new public wildlife datasets (SA-FARI for camera traps, FathomNet for underwater imagery) in partnership with Conservation X Labs and others.
Future Directions
SAM 3 performs well on short prompts and common scenarios, but fine-grained, domain-specific concepts (e.g., medical terms) may require fine-tuning. It also has room to grow in handling very long or compositional prompts, and in multi-object video tracking that shares context across objects even more efficiently.
Overall, SAM 3 makes advanced visual understanding more accessible and powerful, with open weights, code, data, and a playground for broad experimentation. It continues Meta’s push to empower creators, researchers, and developers while enabling practical applications in e-commerce, content creation, and scientific monitoring.