The Scale of AI Training Compute

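This is the scale of AI training compute, from 1950 to 2024. Each figure is the total number of floating-point operations used to train the system, not a rate per second: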

Bell Labs and Claude Shannon, Theseus, 40 FLOPS (4.0e+1) - 1950
At Bell Laboratories, mathematician Claude Shannon built Theseus, a robotic maze-solving mouse and one of the world's first examples of machine learning.

Cornell University, Perceptron Mark I, 690 kiloFLOPS (6.9e+5) - 1957
Perceptron Mark I was an experimental simulation program at Cornell University. It used the IBM 704 computer to simulate perceptual learning, recognition, and spontaneous classification of visual stimuli in the perceptron.

IBM, LTE Speaker Verification System, 110 megaFLOPS (1.1e+8) - 1966
The IBM Speaker Verification System was a two-level adaptive linear threshold element (LTE) system for discriminating between speakers. It achieved over 90% accuracy in separating a known speaker from impostors.

Carnegie Mellon University, ALVINN, 11 gigaFLOPS (1.1e+10) - 1989
ALVINN (Autonomous Land Vehicle In a Neural Network) was a 3-layer back-propagation network designed at Carnegie Mellon University for the task of road following. ALVINN took images from a camera and a laser range finder as input and produced as output the direction the vehicle should travel in order to follow the roads near Carnegie Mellon University.

University of Toronto, Dropout (ImageNet), 270 petaFLOPS (2.7e+17) - 2012
The key idea of Dropout (ImageNet), created at the University of Toronto, was to randomly drop units (along with their connections) from the neural network during training. This prevented units from co-adapting too much. During training, Dropout (ImageNet) sampled from an exponential number of different thinned networks. At test time, the effect of averaging the predictions of all these thinned networks could be approximated simply by using a single unthinned network with smaller weights. The University of Toronto researchers showed that Dropout (ImageNet) improved the performance of neural networks on supervised learning tasks in vision, speech recognition, document classification and computational biology, obtaining state-of-the-art results on many benchmark data sets.
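The train-time and test-time behaviour described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the Toronto implementation: the function names `dropout_train` and `dropout_test`, the keep probability of 0.5, and the toy activations are all hypothetical, and the test-time rule is written as scaling activations by the keep probability, which has the same effect as shrinking the outgoing weights.

```python
import random


def dropout_train(activations, p_keep, rng):
    """Training pass: keep each unit with probability p_keep, otherwise zero it.

    Each call samples one "thinned" network.
    """
    return [a if rng.random() < p_keep else 0.0 for a in activations]


def dropout_test(activations, p_keep):
    """Test pass: keep every unit, scaled by p_keep.

    Scaling a unit's output by p_keep is equivalent to multiplying its
    outgoing weights by p_keep (the single unthinned network with smaller weights).
    """
    return [a * p_keep for a in activations]


# Hypothetical toy layer output and keep probability (assumptions for the demo).
units = [1.0, 2.0, 3.0, 4.0]
p_keep = 0.5
rng = random.Random(0)

# Average many thinned samples and compare with the single scaled pass.
n_samples = 20000
average = [0.0] * len(units)
for _ in range(n_samples):
    thinned = dropout_train(units, p_keep, rng)
    average = [acc + t / n_samples for acc, t in zip(average, thinned)]

scaled = dropout_test(units, p_keep)
print("average of thinned networks:", [round(v, 2) for v in average])
print("single scaled network:      ", scaled)
```

For a single layer like this the two results agree in expectation; in a deep nonlinear network the scaled pass is only an approximation of the true average.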

OpenAI, GPT-1, 18 exaFLOPS (1.8e+19) - 2018
With GPT-1, OpenAI demonstrated that large gains can be realized by generative pre-training of a language model on a diverse corpus of unlabeled text, followed by discriminative fine-tuning on each specific task. GPT-1 achieved absolute improvements of 8.9% on commonsense reasoning (Stories Cloze Test), 5.7% on question answering (RACE), and 1.5% on textual entailment (MultiNLI).

DeepMind, AlphaStar, 59 zettaFLOPS (5.9e+22) - 2019
DeepMind evaluated AlphaStar in the full game of StarCraft II through a series of online games against human players. AlphaStar was rated at Grandmaster level for all three StarCraft races and above 99.8% of officially ranked human players.

OpenAI, GPT-3, 310 zettaFLOPS (3.1e+23) - 2020
OpenAI trained GPT-3, an autoregressive language model with 175 billion parameters, 10x more than any previous non-sparse language model, and tested its performance in the few-shot setting. For all tasks, GPT-3 is applied without any gradient updates or fine-tuning, with tasks and few-shot demonstrations specified purely via text interaction with the model.
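To make "specified purely via text" concrete, here is a minimal sketch of how a few-shot prompt might be assembled as a single string. Everything in it is an assumption for illustration: the helper name `build_few_shot_prompt`, the `Input:`/`Output:` labels, and the translation examples are hypothetical, and no model or API is called.

```python
def build_few_shot_prompt(task_description, demonstrations, query):
    """Assemble a task description, worked examples, and a new query into one prompt.

    The model's weights are never updated; the demonstrations live only in the text.
    """
    lines = [task_description, ""]
    for source, target in demonstrations:
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
        lines.append("")
    # The final entry is left open for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


# Hypothetical task and demonstrations, chosen only for the example.
prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("cheese", "fromage"), ("house", "maison")],
    "book",
)
print(prompt)
```

With zero demonstrations the same helper produces a zero-shot prompt, and with one it produces a one-shot prompt.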

Baidu, ERNIE 3.0 Titan, 1 yottaFLOPS (1.0e+24) - 2021
To explore the performance of scaling up ERNIE 3.0, Baidu trained a hundred-billion-parameter-scale model called ERNIE 3.0 Titan, with up to 260 billion parameters, on the PaddlePaddle platform. Baidu also designed a self-supervised adversarial loss and a controllable language modeling loss to make ERNIE 3.0 Titan generate credible and controllable texts.

Google Research, PaLM, 2.5 yottaFLOPS (2.5e+24) - 2022
Google Research PaLM was trained on 6144 TPU v4 chips using Pathways, a new ML system which enables highly efficient training across multiple TPU Pods. On a number of language understanding and generation tasks, PaLM 540B achieves breakthrough performance, outperforming the finetuned state-of-the-art on a suite of multi-step reasoning tasks, and outperforming average human performance on the recently released BIG-bench benchmark.

OpenAI, GPT-4, 21 yottaFLOPS (2.1e+25) - 2023
OpenAI GPT-4 is a large-scale, multimodal model which can accept image and text inputs and produce text outputs. GPT-4 exhibits human-level performance on various professional and academic benchmarks, including passing a simulated bar exam with a score around the top 10% of test takers.

Meta AI, Llama 3.1, 38 yottaFLOPS (3.8e+25) - 2024
Meta AI's Llama 3 is a herd of language models that natively support multilinguality, coding, reasoning, and tool usage. The largest Llama 3 model is a dense Transformer with 405B parameters and a context window of up to 128K tokens.
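The prefixed figures in the entry headings can be cross-checked against their scientific-notation values with a short script. This is a sketch for illustration only: the helper names `with_prefix` and `orders_of_magnitude` are made up here, and the values are copied from the headings above.

```python
import math

# SI prefixes from largest to smallest, as used in the headings above.
PREFIXES = [
    (1e24, "yotta"), (1e21, "zetta"), (1e18, "exa"), (1e15, "peta"),
    (1e12, "tera"), (1e9, "giga"), (1e6, "mega"), (1e3, "kilo"),
]


def with_prefix(value):
    """Format a raw operation count using the largest SI prefix that fits."""
    for scale, name in PREFIXES:
        if value >= scale:
            return f"{value / scale:g} {name}FLOPS"
    return f"{value:g} FLOPS"


def orders_of_magnitude(earlier, later):
    """Base-10 logarithm of the growth factor between two compute figures."""
    return math.log10(later / earlier)


# (name, training compute) pairs copied from the list above.
entries = [
    ("Theseus", 4.0e1), ("Perceptron Mark I", 6.9e5),
    ("Dropout (ImageNet)", 2.7e17), ("GPT-3", 3.1e23), ("Llama 3.1", 3.8e25),
]
for name, compute in entries:
    print(f"{name}: {with_prefix(compute)}")

span = orders_of_magnitude(entries[0][1], entries[-1][1])
print(f"Theseus to Llama 3.1: about {span:.1f} orders of magnitude")
```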

Video: The Scale of AI Training Compute, from the PlivoAI channel