
Word Cloud in Python

Create stunning word clouds in Python with the wordcloud library! Step-by-step tutorial with custom colors, shapes, and stopwords — runnable in your browser.

Try it yourself

Run this code directly in your browser. Click "Open in full editor" to experiment further.

How it works

Word clouds turn a wall of text into an instant visual summary — the more often a word appears, the bigger it gets. ☁️

What's a Word Cloud Good For?

  • Quick text summaries — get the gist of a long article, review batch, or transcript at a glance
  • Social media analysis — see what topics dominate a hashtag or comment section
  • Survey responses — visualize what customers actually said in open-ended answers
  • Blog posts and presentations — they look great as a hero image
  • NLP exploration — a fast first look at your text data before deeper analysis
How It Works (Behind the Scenes)

    1. Tokenize — split your text into individual words.

    2. Filter — remove "stopwords" (the, and, is, of...) that would dominate but mean nothing.

    3. Count — tally how many times each word appears.

    4. Size — bigger count → bigger font.

    5. Pack — fit all the words into the image without overlapping, prioritizing the biggest ones.

    6. Color — apply a colormap so the result looks visually pleasing.

    The wordcloud library handles every step — you just feed it text.
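
To make steps 1 to 4 concrete, here is a minimal pure-Python sketch. It is not the wordcloud library's actual implementation: the tiny DEMO_STOPWORDS set, the word_sizes helper, and the linear font scaling are assumptions made for illustration, and the packing and coloring steps are left out.

```python
import re
from collections import Counter

# A tiny hand-picked stopword set for this illustration only;
# the wordcloud library ships a much longer list.
DEMO_STOPWORDS = {"the", "and", "is", "of", "a", "to", "in"}


def word_sizes(text, max_font=100, min_font=10, stopwords=DEMO_STOPWORDS):
    """Return {word: font_size} for steps 1-4 (no packing or coloring)."""
    # 1. Tokenize: lowercase, then pull out runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # 2. Filter: drop stopwords.
    kept = [t for t in tokens if t not in stopwords]
    # 3. Count how often each remaining word appears.
    counts = Counter(kept)
    if not counts:
        return {}
    # 4. Size: scale linearly so the most frequent word gets max_font,
    #    and never go below min_font.
    top = max(counts.values())
    return {
        word: max(min_font, round(max_font * n / top))
        for word, n in counts.items()
    }


sizes = word_sizes("the cat and the hat and the cat in the bat cave: cat!")
print(sizes)
```

Here "cat" appears three times and gets the largest size, while the stopwords never reach the counting step.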

The Three Ways to Make One

    1. From raw text — the simplest case

    from wordcloud import WordCloud
    WordCloud(width=800, height=400).generate("your big string of text...")

    2. From a frequency dictionary — when you already counted

    freqs = {'python': 100, 'javascript': 85, 'rust': 45}
    WordCloud().generate_from_frequencies(freqs)

    This is the method to reach for when you've already counted with pandas.Series.value_counts() or collections.Counter. It skips the text-parsing step entirely, which also means the stopwords setting is not applied to your counts.

    3. From a file

    with open('article.txt', encoding='utf-8') as f:
        text = f.read()
    WordCloud().generate(text)

Key Parameters You'll Actually Use

    Parameter                       What it does
    width, height                   Output image size in pixels
    background_color                'white', 'black', or any color name (default: 'black')
    colormap                        Any matplotlib colormap: 'viridis', 'plasma', 'inferno', 'cool', 'autumn'
    max_words                       Cap the number of words shown (default: 200)
    stopwords                       A set of words to ignore; combine STOPWORDS with your own additions
    min_font_size / max_font_size   Size range for the words
    relative_scaling                0 = sizes follow frequency rank only, 1 = sizes proportional to frequency
    random_state                    Seed that locks the layout so it's reproducible
    mask                            A numpy array image to shape the cloud (heart, country, logo, etc.)

Stopwords: The Most Important Setting

    Without stopwords, your cloud will be dominated by the, and, of, a. Always filter them:

    from wordcloud import STOPWORDS
    
    stopwords = set(STOPWORDS)             # built-in list of common English words
    stopwords.update(['said', 'would'])    # add your own domain-specific noise

Displaying the Cloud

    A WordCloud object isn't an image yet — it's a data structure. Hand it to matplotlib to render:

    import matplotlib.pyplot as plt

    plt.imshow(cloud, interpolation='bilinear')   # cloud = a generated WordCloud; 'bilinear' = smooth edges
    plt.axis('off')                               # hide ticks and labels
    plt.show()

    Use interpolation='bilinear' (or 'lanczos') — without it, the words look pixelated.

Pro Tips

  • Start small: build a cloud with 50–100 words before scaling up — easier to read.
  • Custom shapes: pass an image array as mask= to make the cloud fit a logo or country outline. Pure white (255) pixels are blocked out, and words are drawn everywhere else.
  • Save to file: cloud.to_file('cloud.png') exports a PNG directly.
  • Multilingual: the built-in stopword list is English-only, so add your own for French, Arabic, etc. Non-Latin scripts also need font_path= pointing at a font that contains their glyphs.
  • Avoid the cliché: word clouds work best for exploration, not for serious analytics. For real frequency comparison, a bar chart is more honest.

Run the snippet above to see three different word clouds: a basic one, a styled dark-theme one, and one built from a frequency dictionary of programming languages. Try swapping in your own text! ✨
