What are Parameters in an AI Model?

Parameters are the learned numerical weights inside a neural network. Learn what they are, how they work, and why model size is measured in billions.

Parameters in an AI model are the learned numerical values stored inside a neural network that control how it transforms inputs into outputs. Most parameters are weights on connections between artificial neurons, and a large language model typically contains billions to hundreds of billions of them. The full set of parameters, often called the model's weights, is the artifact produced by training and is what gets saved to disk and loaded at inference time.

How parameters work

During training, the model processes examples, makes predictions, and compares them to the correct answers. An optimizer then nudges every parameter slightly in the direction that reduces the error, a process called gradient descent. After many such updates, which for large language models can span trillions of tokens of training data, the parameters settle into values that encode statistical patterns about language, images, or whatever data the model was trained on.
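The update loop can be sketched with a toy model that has exactly one parameter. This is a minimal illustration, not how any real framework trains a network: the function name `train_single_weight`, the data, the learning rate, and the step count are all assumptions chosen for the example.

```python
# Toy gradient descent: fit a single weight w so that w * x approximates y.
# The data, learning rate, and step count are illustrative assumptions.

def train_single_weight(data, lr=0.05, steps=200):
    w = 0.0  # the one "parameter" of this toy model
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        # Nudge w in the direction that reduces the error.
        w -= lr * grad
    return w


data = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # generated by y = 3x
w = train_single_weight(data)
print(f"learned weight: {w:.4f}")  # should land very close to 3
```

A real model repeats the same idea across billions of parameters at once, with the gradients computed automatically by backpropagation rather than by a hand-written formula.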

At inference, a prompt is converted into numbers and passed through dozens or hundreds of layers. At each layer, the input is multiplied by weight matrices and passed through simple nonlinear functions, with attention mechanisms letting the model mix information across positions. The weights are not a database of the training data; they hold a compressed statistical representation of it, although models can still memorize and reproduce some passages that appear often in training. A concrete example: in a transformer, the query and key projections for each attention head are matrices of parameters that decide which earlier words the model attends to when predicting the next one, and the value projection decides what information is carried forward from those words.
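To make the attention example concrete, here is a rough sketch of a single causal attention head in plain Python. The matrices `W_q`, `W_k`, and `W_v` stand in for learned parameters; their tiny 2x2 values and the three input vectors are made up for illustration, and real models use far larger matrices and optimized tensor libraries.

```python
import math


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(xs, W_q, W_k, W_v):
    """Causal self-attention for one head; xs is a list of token vectors."""
    qs = [matvec(W_q, x) for x in xs]
    ks = [matvec(W_k, x) for x in xs]
    vs = [matvec(W_v, x) for x in xs]
    d = len(qs[0])
    outputs, all_weights = [], []
    for i, q in enumerate(qs):
        # Each position scores itself and earlier positions only.
        scores = [
            sum(a * b for a, b in zip(q, ks[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is a weighted mix of the value vectors.
        out = [
            sum(w * vs[j][c] for j, w in enumerate(weights))
            for c in range(len(vs[0]))
        ]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Hypothetical parameters and inputs, chosen only for illustration.
W_q = [[1.0, 0.0], [0.0, 1.0]]
W_k = [[1.0, 0.0], [0.0, 1.0]]
W_v = [[0.5, 0.0], [0.0, 2.0]]
xs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

outputs, weights = attention_head(xs, W_q, W_k, W_v)
for i, w in enumerate(weights):
    print(f"position {i} attends with weights {[round(v, 3) for v in w]}")
```

Changing the numbers in `W_q` or `W_k` changes which earlier positions receive the most weight, which is the sense in which these parameters decide what the model attends to.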

Why it matters

Parameter count is the most-cited proxy for a model's capability, and for good reason: more parameters give a network more capacity to memorize and generalize from patterns, and the largest modern models often show abilities that smaller ones lack. Parameter count also drives practical concerns: memory (each parameter typically takes 2 bytes in FP16, 1 byte at 8-bit, and about half a byte when aggressively quantized to 4-bit), compute cost per token, latency, and the hardware required to run or fine-tune the model. This is why a 7-billion-parameter model (about 14 GB of weights in FP16, or roughly 3.5 GB at 4-bit) can run on a laptop while a 400-billion-parameter model usually cannot.
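The memory arithmetic is simple enough to check by hand or in a few lines of code. The helper below, `weight_memory_gb`, is a hypothetical name for this sketch; it counts only the weights themselves (using 1 GB = 1e9 bytes) and ignores activations, caches, and file-format overhead, so real requirements run somewhat higher.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


# Compare a small and a very large model at common precisions.
for name, n_params in [("7B", 7e9), ("400B", 400e9)]:
    for bits in (16, 8, 4):
        size = weight_memory_gb(n_params, bits)
        print(f"{name} at {bits}-bit: about {size:.1f} GB")
```

Under these assumptions the 7B model needs about 14 GB at 16-bit and 3.5 GB at 4-bit, while the 400B model needs about 800 GB and 200 GB respectively, which is well beyond typical laptop memory.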

Key types

  • Weights: the bulk of the parameters, stored in matrices that multiply inputs and hidden states.
  • Biases: small additive offsets (typically one per neuron) that shift activations.
  • Embedding parameters: the lookup tables that convert token IDs into vectors, counted in the total parameter budget.
  • Attention parameters: the query, key, value, and output projections inside each transformer block.
  • Feed-forward parameters: the two large dense layers in each transformer block, which usually account for the majority of total weights.
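A back-of-the-envelope count shows how these groups add up. The sketch below assumes a simplified transformer: no biases or normalization parameters, a feed-forward hidden size of four times the model width, two feed-forward matrices per block, and a single embedding table. The function name and the configuration values are hypothetical and do not describe any specific published model.

```python
def transformer_param_counts(d_model, n_layers, vocab_size, d_ff=None):
    """Rough parameter counts for a simplified transformer.

    Assumes no biases or normalization parameters, and a feed-forward
    hidden size of 4 * d_model unless d_ff is given.
    """
    if d_ff is None:
        d_ff = 4 * d_model
    attention = 4 * d_model * d_model   # query, key, value, output projections
    feed_forward = 2 * d_model * d_ff   # the two dense layers
    embedding = vocab_size * d_model    # token ID -> vector lookup table
    return {
        "embedding": embedding,
        "attention": n_layers * attention,
        "feed_forward": n_layers * feed_forward,
        "total": embedding + n_layers * (attention + feed_forward),
    }


# Hypothetical configuration, for illustration only.
counts = transformer_param_counts(d_model=4096, n_layers=32, vocab_size=32000)
for group, n in counts.items():
    share = n / counts["total"]
    print(f"{group:>12}: {n / 1e9:.2f}B ({share:.0%})")
```

With these assumptions the total comes to roughly 6.6 billion parameters, and the feed-forward layers hold about twice as many weights as the attention projections, close to two thirds of the whole model.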

Models are also commonly described by the precision in which their parameters are stored. A model described as "70B" has 70 billion parameters, but its file size depends on whether those are stored in 32-bit, 16-bit, 8-bit, or 4-bit format, which is why the same model can range from roughly 280 GB at 32-bit, through about 140 GB at 16-bit, down to around 35 GB at 4-bit on disk. Understanding parameters clarifies almost every other concept in modern AI, from fine-tuning and quantization to context length and inference cost.
