What is Chain-of-Thought Prompting?

Chain-of-thought prompting is a technique that asks an LLM to reason step by step before answering, boosting accuracy on math, logic, and multi-step problems.

Chain-of-thought prompting is a prompt-engineering technique in which a user instructs a large language model to work through a problem one step at a time, exposing the intermediate reasoning that leads to the final answer. Instead of jumping straight to a conclusion, the model writes out the logical steps in natural language, much like a student showing their work on a math test. The technique was popularized by Wei et al. (2022) in "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models" and has since become a foundation of modern prompt design.

How chain-of-thought prompting works

The core idea is deceptively simple. When a prompt contains one or more worked examples that spell out a reasoning chain ("first I do X, then I compute Y, therefore Z"), the model tends to imitate that structure on the new problem. The examples are written by the prompt author, not generated by the model. This is known as few-shot chain-of-thought prompting, and it requires no changes to the model's weights; only the prompt changes.
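
As a rough illustration of few-shot chain-of-thought prompting, the Python sketch below assembles a prompt from one hand-written worked example followed by a new question. The `WorkedExample` class, the `build_few_shot_cot_prompt` helper, the "Q:/A:" layout, and the sample problems are all assumptions made for this example; they are not taken from Wei et al. or from any particular library.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class WorkedExample:
    """One hand-written demonstration (hypothetical structure)."""
    question: str
    reasoning: str  # the intermediate steps, written by the prompt author
    answer: str


def build_few_shot_cot_prompt(examples: List[WorkedExample], question: str) -> str:
    """Join the worked examples and the new question into one prompt string."""
    blocks = []
    for ex in examples:
        # Each demonstration shows the reasoning first, then the answer.
        blocks.append(
            f"Q: {ex.question}\n"
            f"A: {ex.reasoning} The answer is {ex.answer}."
        )
    # The new question ends with an open "A:" so the model continues
    # in the same reason-then-answer pattern.
    blocks.append(f"Q: {question}\nA:")
    return "\n\n".join(blocks)


examples = [
    WorkedExample(
        question="A shelf holds 3 boxes with 4 books each. Two books are removed. How many books remain?",
        reasoning="There are 3 * 4 = 12 books to start. Removing 2 leaves 12 - 2 = 10.",
        answer="10",
    ),
]

prompt = build_few_shot_cot_prompt(
    examples,
    "A farmer has 5 crates of 6 apples and sells 7 apples. How many are left?",
)
print(prompt)
```

The resulting string can be sent to whichever model is in use. Only the prompt text changes; the model itself is untouched.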

A second variant, zero-shot chain-of-thought, was introduced by Kojima et al. (2022). It appends a short trigger phrase such as "Let's think step by step" to the question, with no worked examples, and that phrase is often enough to coax the model into decomposing the problem. Both variants rely on the same underlying capability: sufficiently large language models can carry out arithmetic and logical steps when prompted to write them out, and surfacing those steps as text tends to improve answer accuracy.
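
The zero-shot variant can be sketched in a few lines of Python. The prompt layout, the `zero_shot_cot_prompt` and `ask` helpers, and the `fake_complete` stand-in are hypothetical names chosen for this example; `fake_complete` returns a canned reply so the sketch runs without a real model.

```python
from typing import Callable

TRIGGER = "Let's think step by step."


def zero_shot_cot_prompt(question: str, trigger: str = TRIGGER) -> str:
    """Append the trigger phrase where the model's answer would begin."""
    return f"Q: {question}\nA: {trigger}"


def ask(question: str, complete: Callable[[str], str]) -> str:
    """Send the triggered prompt to any text-completion callable."""
    return complete(zero_shot_cot_prompt(question))


def fake_complete(prompt: str) -> str:
    # Hypothetical stand-in for a real model call; returns a canned reply.
    return " 5 crates of 6 apples is 30 apples. 30 - 7 = 23. The answer is 23."


question = "A farmer has 5 crates of 6 apples and sells 7 apples. How many are left?"
prompt = zero_shot_cot_prompt(question)
reply = ask(question, fake_complete)
print(prompt)
print(reply)
```

In practice `fake_complete` would be replaced by a call to an actual model, and the trigger wording can be varied, since similar phrases may work as well.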

Why it matters

Chain-of-thought prompting matters because it directly attacks one of the most visible failure modes of LLMs: confidently wrong answers produced in a single pass on multi-step problems. By prompting the model to externalize its reasoning, the technique reduces arithmetic errors, improves performance on commonsense benchmarks, and makes model behavior easier to audit because a human can inspect each step. It is now a building block for more advanced methods such as self-consistency (sampling many chains and voting on the answer), tree-of-thought search, and the reasoning traces produced by modern reasoning models.

Key variants

  • Few-shot CoT: The prompt includes several hand-written examples that demonstrate step-by-step reasoning before the real question. Usually more reliable than zero-shot CoT, especially on less capable models.
  • Zero-shot CoT: Simply add "Let's think step by step" (or a similar trigger) to any prompt. Cheap and surprisingly effective on capable models.
  • Self-consistency: Sample many independent chains of thought and pick the most common final answer, trading compute for accuracy.
  • Tree-of-Thought: Let the model branch and explore multiple reasoning paths, then backtrack or prune weak ones — useful for puzzles and planning tasks.
  • Reasoning-model traces: Newer models, such as OpenAI's o-series and DeepSeek-R1, are trained to produce long chain-of-thought reasoning by default.
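
The self-consistency variant listed above amounts to a majority vote over final answers. The Python sketch below assumes the chains have already been sampled from a model; the three chains are made up for illustration, and taking the last number in each chain as its answer is a simplifying assumption, not the extraction rule from any paper.

```python
import re
from collections import Counter
from typing import List, Optional


def extract_answer(chain: str) -> Optional[str]:
    """Return the last number in a chain, or None if it contains no number."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", chain)
    return numbers[-1] if numbers else None


def self_consistent_answer(chains: List[str]) -> Optional[str]:
    """Pick the most common final answer across the sampled chains."""
    answers = [a for a in map(extract_answer, chains) if a is not None]
    if not answers:
        return None
    # most_common(1) returns a list holding one (answer, count) pair.
    return Counter(answers).most_common(1)[0][0]


# Made-up chains standing in for independent samples from a model.
sampled_chains = [
    "5 * 6 = 30 apples. 30 - 7 = 23. The answer is 23.",
    "He starts with 30 and sells 7, leaving 23.",
    "5 + 6 = 11 apples. 11 - 7 = 4. The answer is 4.",
]

final = self_consistent_answer(sampled_chains)
print(final)  # prints 23
```

Two of the three chains agree on 23, so the faulty chain is outvoted. Sampling more chains costs more compute but makes the vote less sensitive to any single bad chain.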

Chain-of-thought prompting turned "show your work" from a classroom rule into a powerful, general-purpose tool for getting more reliable answers out of large language models.
