Scientific Computing · Beginner

Matrix Multiplication in Python

Multiply matrices in Python the right way. Compare @, np.dot, np.matmul, and pure-Python loops with runnable examples — no setup required.


How it works

Matrix multiplication is the workhorse of scientific computing — it's what powers neural networks, 3D graphics, physics simulations, and pretty much every recommendation algorithm you've ever used. Getting it right in Python comes down to one simple choice: use NumPy and the @ operator.

What Matrix Multiplication Actually Is

If you have a matrix A of shape (m, n) and a matrix B of shape (n, p), then A @ B produces a new matrix of shape (m, p). Every entry (i, j) in the result is the dot product of row i from A with column j from B.

The only hard rule: the inner dimensions must match. A 3×4 matrix can multiply a 4×2 matrix (the 4s line up), but it cannot multiply a 5×2.
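A minimal sketch of the shape rule in NumPy (the array contents here are arbitrary; only the shapes matter):

```python
import numpy as np

A = np.ones((3, 4))   # 3×4
B = np.ones((4, 2))   # 4×2 — inner dimensions (4 and 4) line up

C = A @ B
print(C.shape)        # (3, 2): the outer dimensions survive

D = np.ones((5, 2))   # inner dimensions (4 and 5) do not match
try:
    A @ D
except ValueError as e:
    print("shape mismatch:", e)
```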

The Four Ways to Multiply

| Approach | Syntax | When to use |
| --- | --- | --- |
| `@` operator | `A @ B` | Default choice. Cleanest and most readable. |
| `np.matmul` | `np.matmul(A, B)` | Same as `@`. Use the function form inside other expressions if you prefer it. |
| `np.dot` | `np.dot(A, B)` | Older API. For 2D arrays it's identical to `@`, but it behaves differently for 1D and higher-D arrays — easy to trip on. |
| Pure Python loops | triple `for` loop | Never in production. Hundreds of times slower. Useful only for understanding what's happening. |

If you remember nothing else, remember this: use `@`.
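A quick sanity check (random values, seeded for reproducibility) that the three NumPy routes agree on 2D arrays:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((3, 4))
B = rng.random((4, 2))

r1 = A @ B
r2 = np.matmul(A, B)
r3 = np.dot(A, B)

# For 2D inputs, all three produce the same (3, 2) result
print(np.allclose(r1, r2) and np.allclose(r1, r3))  # True
```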

The Element-Wise Trap

This one bites everyone exactly once:

A * B   # element-wise — multiplies positions individually
A @ B   # matrix multiplication — dot products of rows and columns

For two 2×2 matrices these give completely different answers. If your gradient explodes or your output looks wrong, this is the first thing to check.
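A concrete 2×2 comparison makes the difference obvious:

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[5, 6],
              [7, 8]])

print(A * B)   # element-wise: [[ 5 12], [21 32]]
print(A @ B)   # matrix product: [[19 22], [43 50]]
```

Entry (0, 0) of `A @ B` is `1*5 + 2*7 = 19` — a dot product — while entry (0, 0) of `A * B` is just `1*5 = 5`.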

Why NumPy Crushes Plain Python

A pure-Python triple loop for matrix multiplication does the math correctly but spends most of its time on Python-level overhead — every multiplication and addition goes through the interpreter. NumPy hands the work off to BLAS (Basic Linear Algebra Subprograms), a hyper-tuned library written in C and Fortran that uses CPU vector instructions and cache-aware algorithms.

For a 200×200 multiply, the difference is typically 100-1000×. Scale up to 1000×1000 and the pure-Python version takes minutes while NumPy finishes in milliseconds.
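A minimal sketch of that head-to-head (exact timings vary by machine and BLAS build):

```python
import time
import numpy as np

def matmul_loops(A, B):
    """Pure-Python triple loop: correct, but interpreter-bound."""
    m, n, p = len(A), len(A[0]), len(B[0])
    C = [[0.0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            s = 0.0
            for k in range(n):
                s += A[i][k] * B[k][j]
            C[i][j] = s
    return C

size = 200
rng = np.random.default_rng(0)
A, B = rng.random((size, size)), rng.random((size, size))

t0 = time.perf_counter()
C_loops = matmul_loops(A.tolist(), B.tolist())
t_loops = time.perf_counter() - t0

t0 = time.perf_counter()
C_np = A @ B
t_np = time.perf_counter() - t0

print(np.allclose(C_loops, C_np))              # same answer both ways
print(f"loops: {t_loops:.3f}s  numpy: {t_np:.5f}s")
```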

Common Shapes You'll Actually Use

  • (batch, features) @ (features, output) — a single layer of a neural network
  • (3, 3) @ (3, n) — applying a rotation/transformation to n 3D points
  • X.T @ X — the Gram matrix, the foundation of linear regression's normal equation
  • A @ x where x is 1D — solving systems, projecting vectors

Quick Gotchas

  • Shape mismatch error? Print A.shape and B.shape. The inner two numbers must match.
  • Result is a single number? You probably gave NumPy two 1D arrays — that's the dot product, not a matrix product.
  • Want the transpose? A.T (no parentheses, no method call).
  • Identity matrix? np.eye(n). A @ np.eye(n) == A.

Run the examples above and you'll see the same multiplication done four ways, the element-wise trap, and a head-to-head speed test that explains in one print statement why the rest of the scientific Python world is built on NumPy.
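The gotchas above can be verified in a few lines (the array values here are arbitrary):

```python
import numpy as np

# 1D @ 1D is a dot product — a scalar, not a matrix
v = np.array([1.0, 2.0, 3.0])
print(v @ v)                              # 14.0

# Transpose is an attribute, not a method call
A = np.arange(6).reshape(2, 3)
print(A.T.shape)                          # (3, 2)

# Multiplying by the identity leaves the matrix unchanged
B = np.arange(9).reshape(3, 3).astype(float)
print(np.array_equal(B @ np.eye(3), B))   # True
```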
