
Spam Classifier with Naive Bayes in Python

Build a spam email classifier in Python using Multinomial Naive Bayes and TF-IDF. Train, evaluate, and predict on new messages — runnable in your browser.

Try it yourself

Run this code directly in your browser. Click "Open in full editor" to experiment further.


How it works

Spam classification is the "hello world" of natural language processing — and Multinomial Naive Bayes is the algorithm that, for decades, ran most of the world's spam filters. It's fast, it works with shockingly little data, and the math behind it is short enough to fit on a napkin.

Why Naive Bayes Works So Well For Spam

Naive Bayes asks one question for every incoming message:

Given the words I see, is this message more likely to come from the spam pile or the ham pile?

It answers using Bayes' theorem:

P(spam | words) ∝ P(spam) · P(words | spam)

The "naive" part is that it assumes every word is independent of every other word — which is obviously false ("free" and "prize" are correlated). Yet for text classification this assumption barely hurts performance, because what matters is the relative probability of words across classes, not their joint distribution.

The payoff: training is just counting. No gradient descent, no iterations, no hyperparameter sweeps. Counting is what makes Naive Bayes blindingly fast and resistant to overfitting on small datasets.
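To make "training is just counting" concrete, here is a minimal sketch of the whole algorithm using a tiny hypothetical corpus (the messages and priors below are invented for illustration, not taken from the snippet):

```python
import math
from collections import Counter

# Toy two-class corpus (hypothetical messages, not the snippet's dataset)
spam_docs = ["free prize win", "claim free prize"]
ham_docs = ["lunch tomorrow", "see you at lunch"]

# "Training" is literally counting word occurrences per class
spam_counts = Counter(w for d in spam_docs for w in d.split())
ham_counts = Counter(w for d in ham_docs for w in d.split())
vocab = set(spam_counts) | set(ham_counts)

def score(message, counts, prior, alpha=1.0):
    """log P(class) + sum of log P(word | class), with Laplace smoothing."""
    total = sum(counts.values())
    s = math.log(prior)
    for w in message.split():
        s += math.log((counts[w] + alpha) / (total + alpha * len(vocab)))
    return s

# Higher score wins; priors are 0.5 each since the classes are balanced
msg = "win a free prize"
print(score(msg, spam_counts, 0.5) > score(msg, ham_counts, 0.5))
```

Note that prediction compares unnormalized log posteriors — the denominator P(words) is the same for both classes, so it never needs to be computed.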

TF-IDF: Turning Words Into Numbers

Naive Bayes needs numerical features, so we convert each message into a TF-IDF vector first.

  • TF (term frequency) — how often a word appears in this message.
  • IDF (inverse document frequency) — how rare that word is across all messages.
  • TF-IDF — multiply them. Words that appear often in this message but rarely elsewhere score high.
  • Why not raw word counts? Because words like "the" and "and" appear constantly in both spam and ham — they carry no signal. IDF crushes them down. Words like "WINNER" or "prize" appear in spam but rarely in normal conversation — IDF lifts them up.

A few details in the snippet's vectorizer are worth knowing:

  • lowercase=True — "FREE" and "free" become the same feature.
  • stop_words='english' — drops a, the, and, is, etc.
  • ngram_range=(1, 2) — captures both single words and word pairs. "free" alone is suspicious, but "free entry" is a smoking gun.
What `alpha` Does (Laplace Smoothing)

If the word "giveaway" never appeared in any ham message during training, then P(giveaway | ham) = 0, which would make the entire posterior probability collapse to zero — even if every other word in the message screams ham.

Laplace smoothing fixes this by pretending every word appeared at least alpha times in every class. With alpha=0.1, the model still trusts the data heavily but never assigns true zero probability to anything. alpha=1.0 is the classic default; alpha=0.1 works better for small corpora like this one.
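The arithmetic is easy to verify directly. The counts below are hypothetical, chosen only to show how the smoothed estimate behaves as alpha changes:

```python
# Hypothetical counts: "giveaway" never seen in ham during training
giveaway_in_ham = 0
total_ham_words = 500
vocab_size = 1000

def smoothed_prob(count, total, alpha, vocab):
    """Laplace-smoothed estimate of P(word | class)."""
    return (count + alpha) / (total + alpha * vocab)

# alpha=0 gives a hard zero, which would zero out the whole posterior
print(smoothed_prob(giveaway_in_ham, total_ham_words, 0.0, vocab_size))
# alpha=0.1 gives a small but nonzero probability
print(smoothed_prob(giveaway_in_ham, total_ham_words, 0.1, vocab_size))
# alpha=1.0 is the classic Laplace default: (0 + 1) / (500 + 1000)
print(smoothed_prob(giveaway_in_ham, total_ham_words, 1.0, vocab_size))
```
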

How To Read The Confusion Matrix

The matrix shows four numbers:

  • True positives (correctly flagged spam)
  • True negatives (correctly accepted ham)
  • False positives (legitimate messages wrongly marked spam) — these are the worst errors. A real friend's email going to junk is a much worse experience than a single spam slipping into the inbox.
  • False negatives (spam that got through) — annoying but recoverable.
Spam filtering is one of the rare ML problems where you should bias toward false negatives. Better to let a few spammy messages through than to silently lose someone's job offer.
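In scikit-learn those four numbers come out of `confusion_matrix`, whose flattened order is TN, FP, FN, TP. The labels below are hypothetical, just to show the unpacking:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels: 1 = spam, 0 = ham
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

# Rows are true classes, columns are predicted classes;
# ravel() on the 2x2 matrix yields (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(f"TP={tp} TN={tn} FP={fp} FN={fn}")
```

Here one legitimate message was wrongly flagged (FP=1) and one spam slipped through (FN=1) — per the reasoning above, the first error is the one to minimize.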

What The Word Bar Charts Reveal

The two bar charts show the words with the largest log-probability gap between classes. This is the model's learned vocabulary of spam — and it's almost always entertaining. You'll see things like "free", "win", "click", "urgent", "claim" rise to the top, while words like "meeting", "lunch", "thanks", "tomorrow" anchor the ham side.

This interpretability is one of Naive Bayes' great strengths. Compare with a deep neural network where "why did you classify this as spam?" has no clean answer.

Why Not Logistic Regression Or A Transformer?

For large, modern email systems with millions of training messages and access to GPUs, transformer-based classifiers (BERT, etc.) outperform Naive Bayes on accuracy. But for the vast majority of small-to-medium text classification problems, Naive Bayes is:

  • Faster to train by orders of magnitude — you can retrain on millions of messages in seconds.
  • Simpler to deploy — no GPU, tiny memory footprint, model serializes to a few KB.
  • More robust on small data — works well with hundreds of examples, not millions.
  • Easier to debug — every prediction can be traced back to specific word probabilities.
For review sentiment, support ticket routing, or email triage at startup scale, Naive Bayes is still the right first move.

Things To Tweak

  • Add more training data — the snippet uses 40 messages for clarity. Try the SMS Spam Collection (5,574 messages) for a real benchmark — Multinomial NB hits ~98% accuracy.
  • Try `BernoulliNB` instead — uses presence/absence of words rather than counts. Often slightly better on short messages like SMS.
  • Switch to character n-grams — `analyzer='char_wb', ngram_range=(3, 5)` makes the model robust to leetspeak and obfuscation ("fr3e", "v.i.a.g.r.a").
  • Tune the decision threshold — `predict_proba` gives you the probability of spam. If false positives matter more, only flag as spam when p > 0.7 instead of p > 0.5.
  • Add length and punctuation features — spam messages tend to be longer, ALL CAPS, and have more exclamation marks. Concatenate these to the TF-IDF vector.
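The threshold tweak is a one-liner on top of any fitted model. This sketch uses a hypothetical mini-dataset in place of the snippet's 40 messages:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Hypothetical mini-dataset; the real snippet trains on 40 messages
texts = ["free prize click now", "win cash claim fast",
         "lunch at noon today", "see you tomorrow thanks"]
labels = [1, 1, 0, 0]  # 1 = spam, 0 = ham

vec = TfidfVectorizer()
clf = MultinomialNB(alpha=0.1).fit(vec.fit_transform(texts), labels)

def flag_spam(message, threshold=0.7):
    """Flag as spam only when the model is more than `threshold` confident."""
    # predict_proba columns follow classes_ order, so column 1 is spam
    p_spam = clf.predict_proba(vec.transform([message]))[0, 1]
    return p_spam > threshold

print(flag_spam("claim your free prize now"))
print(flag_spam("lunch tomorrow"))
```

Raising the threshold trades a few extra false negatives (spam in the inbox) for fewer false positives (lost legitimate mail) — the direction the confusion-matrix discussion above argues for.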
Where Naive Bayes Shows Up In Real Code

  • Spam filters — the original killer app. Paul Graham's 2002 essay A Plan for Spam popularized it.
  • Sentiment classification — positive vs negative reviews, polite vs hostile messages.
  • Document categorization — auto-routing support tickets, news article topics, legal document types.
  • Author identification — identifying which of several writers produced a given text.
  • Medical diagnosis — classifying patient symptoms into possible conditions, especially when training data is limited.
Run the snippet above and you'll see a Naive Bayes classifier learn to separate spam from ham from 40 tiny examples, surface the most spam-revealing words it discovered, and confidently classify six brand-new messages it has never seen before.
