From AGI to ASI: DeepMind's Map of Superintelligence Pathways
Google DeepMind's report lays out four non-exclusive paths from AGI to ASI and treats each bottleneck, from data walls to regulation, as an open research question.
Institution
Google's frontier AI lab, spanning language models, robotics, scientific discovery, and reasoning systems.
Self-Supervised Learning · Google DeepMind
BYOL turns self-supervised visual learning without negative pairs into a concrete research object, with evidence anchors, method tradeoffs, and limits for practical use.
Theorem Proving · Google DeepMind
This work evaluates AI-aided formal proof search on open math problems: the strongest agent resolves 9 of 353 Erdős problems and proves 44 of 492 OEIS conjectures.
Interpretability · Google DeepMind
Gemma Scope is a free, open suite of JumpReLU sparse autoencoders covering every layer of Gemma 2 2B and 9B (plus parts of 27B): over 400 SAEs and 30M+ features, trained with more than 20% of GPT-3's training compute.
Code Generation · Google DeepMind
DeepMind's AlphaCode ranked within the top 54.3% on average in Codeforces contests with 5,000+ participants by generating up to a million candidate programs per problem, then filtering and clustering them down to ten submissions.
Language Models · Google DeepMind
DeepMind's Chinchilla shows that, for compute-optimal training, model size and training tokens should scale in equal proportion. Its 70B model trained on ~1.4T tokens beats Gopher 280B, GPT-3 175B, and MT-NLG 530B.
Multimodal Models · Google DeepMind
Flamingo bolts trainable cross-attention onto a frozen vision encoder and a frozen language model, then learns new image and video tasks from a handful of in-context examples — no fine-tuning.
Gemma is a family of 2B and 7B open-weight models built from Gemini research and technology. It beats similarly sized open models on 11 of 18 text tasks and ships with pretrained and instruction-tuned checkpoints.
Biomolecular Modeling · Google DeepMind
AlphaFold 3 replaces AlphaFold 2's structure module with a diffusion network and predicts whole complexes — proteins with nucleic acids, ligands, ions, and modified residues — in one model.
Theorem Proving · Google DeepMind
AlphaGeometry pairs a language model with a symbolic engine and trains on 100M synthetic theorems, solving 25 of 30 olympiad geometry problems versus 10 for the prior best.
Long Context · Google DeepMind
Gemini 1.5 Pro and Flash keep >99% retrieval recall up to at least 10M tokens of text, video, and audio — and Pro matches Gemini 1.0 Ultra with far less compute.
Vision-Language-Action · Google DeepMind
RT-2 co-fine-tunes a web-pretrained vision-language model on robot trajectories, expresses actions as text tokens, and gets emergent generalization to novel objects, unseen commands, and basic reasoning across 6k trials.