Institution

Meta AI

Meta's AI research organization, known for open models, computer vision systems, and large-scale infrastructure.

MAE: Masked Autoencoders as Scalable Vision Learners

MAE pretrains Vision Transformers by masking a large fraction of image patches (75% in the paper), encoding only the visible patches, and reconstructing the missing pixels with a lightweight decoder.

Segmentation · Meta AI

Mask R-CNN: Instance Segmentation on Top of Faster R-CNN

Mask R-CNN extends Faster R-CNN with a parallel branch that predicts a segmentation mask for each region of interest, and replaces RoIPool with RoIAlign to keep masks pixel-aligned.

Brain Decoding · Meta AI

Brain2Qwerty: Non-Invasive Brain-to-Text Decoding

Brain2Qwerty decodes typed sentences from non-invasive brain recordings: MEG reaches 32% CER on average, EEG trails at 67%, and the best participants reach 19%.

Segmentation · Meta AI

Mask2Former: One Transformer for Segmentation Tasks

Mask2Former uses masked attention to unify semantic, instance, and panoptic segmentation, reaching 57.8 PQ on COCO panoptic and 57.7 mIoU on ADE20K.

Small Language Models · Meta AI

MobileLLM: Better Sub-Billion Models for Devices

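MobileLLM argues architecture matters more at sub-billion scale: deep-thin designs with embedding sharing and grouped-query attention improve 125M/350M models by 2.7%/4.3% over the prior state of the art, and block-wise weight sharing adds a further 0.7%/0.8%.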

Multimodal Models · Meta AI

VLM3: Vision Language Models Are Native 3D Learners

VLM3 shows a standard 4B vision-language model matches expert 3D models — 0.904 depth accuracy, 94.0% camera-pose AUC, 91.35% object-3D accuracy — with no 3D-specific architecture, only focal unification and scaling.

Code Generation · Meta AI

Code Llama: Open Code Models Built on Llama 2 (7B-70B)

Code Llama continues training Llama 2 on code, reaching up to 67% on HumanEval and 65% on MBPP, the best open scores at its release, with infilling, instruction following, and 100k-token context support.

Self-Supervised Learning · Meta AI

DINOv2: Self-Supervised Visual Features That Skip Finetuning

DINOv2 pretrains Vision Transformers with no labels on a curated 142M-image set, then freezes the backbone — a linear probe on top matches or beats OpenCLIP on most image- and pixel-level benchmarks.

Open Models · Meta AI

Llama 2 Explained: Meta's Open Weights and the RLHF Chat Recipe

Llama 2 shipped 7B, 13B, and 70B open-weight models plus Llama 2-Chat, the first open chat model whose RLHF pipeline — including a separate safety reward model and Ghost Attention — was documented in full.

Retrieval-Augmented Generation · Meta AI

RAG (2020): The Paper That Named Retrieval-Augmented Generation

The original RAG paper bolts a Wikipedia dense retriever (DPR) onto a BART seq2seq generator, sets a new state of the art on three open-domain QA tasks, and updates knowledge by swapping the index, with no retraining.

Segmentation · Meta AI

Segment Anything (SAM): One Promptable Model, a Billion Masks

Meta AI's SAM treats segmentation as a promptable task and ships with SA-1B (1.1B masks on 11M images), letting one model transfer zero-shot to new objects and image distributions.

LLM Reasoning · Meta AI

Toolformer: How a Language Model Teaches Itself to Use Tools

Toolformer trains a model to decide which API to call — calculator, QA, search, translation, calendar — purely by keeping the sampled calls that lower next-token loss, with only a handful of demos per tool.

Open Models · Meta AI

Llama 3: A 405B Dense Open Model That Matches GPT-4

Meta released Llama 3 as a herd of open-weight language models led by a dense 405B-parameter flagship with a 128K context window, trained on more than 15T tokens.

Segmentation · Meta AI

SAM 2 Explained: Promptable Segmentation Across Video

SAM 2 carries a single click through a whole video using a streaming memory module, producing more accurate masks with 3x fewer interactions than prior video methods and running 6x faster than SAM on images.