Data StructuresAdvanced

LRU Cache in Python

Learn to build an LRU (Least Recently Used) Cache in Python using OrderedDict. A really fun and simple guide to O(1) time complexity caching!

Try it yourself

Run this code directly in your browser. Click "Open in full editor" to experiment further.

Loading...

Click Run to see output

Or press Ctrl + Enter

How it works

The LRU (Least Recently Used) cache shows up everywhere: in your OS, CPU, browser, CDN, ORM, and Redis. It's also the canonical coding interview problem, immortalised as LeetCode 146. Implement one with O(1) get and put and you've proven you understand hash maps, linked lists, and how to make them dance together. Let's break it down. ๐ŸŽฏ

What LRU Means

Least Recently Used is an eviction policy: a rule for deciding what to throw away when the cache fills up. The thing you touched longest ago gets evicted first.

Think of your browser tabs. You probably have 47 open. The ones you haven't clicked in three days are the ones you'd close first if your laptop started melting. That's LRU.

Another analogy: a chef with a cookbook stand that only fits three open books. Every time she consults a recipe, she puts that book on top of the pile. When she needs a fourth, she removes the book at the bottom โ€” the least recently touched. The pile is always sorted by recency of use, not insertion order.

The O(1) Trick

We want both get(key) and put(key, value) in O(1), no matter how big the cache grows. Neither structure alone can pull this off:

  • A hash map gives O(1) lookup but has no order. Finding the "oldest" key means scanning everything โ€” O(n).
  • A doubly linked list maintains order beautifully (append to tail, evict from head) but finding a key by value is O(n).
  • Combine them and the magic happens: a hash map stores key โ†’ node pointer, and a doubly linked list stores entries ordered by recency (most recent at tail, least at head). Now get(key) is: hash-map lookup โ†’ follow pointer โ†’ unlink node โ†’ move to tail. All O(1). The list must be doubly linked because unlinking a node in O(1) given only a pointer requires prev pointers.

    Walkthrough

    Let's trace a capacity-3 cache through this sequence:

    put(1, 'A')  โ†’  [1:A]
    put(2, 'B')  โ†’  [1:A, 2:B]
    put(3, 'C')  โ†’  [1:A, 2:B, 3:C]   (full)
    get(1)       โ†’  'A', cache: [2:B, 3:C, 1:A]   (1 moved to back)
    put(4, 'D')  โ†’  [3:C, 1:A, 4:D]   (2 evicted โ€” it was oldest)
    get(2)       โ†’  -1   (2 was evicted)
    put(1, 'Z')  โ†’  [3:C, 4:D, 1:Z]   (1 updated, moved to back)

    Notice how get(1) actually mutates the cache by re-promoting key 1. That's the part beginners forget. The cache's state is a function of every operation, not just the writes. ๐Ÿ‘€

    Complexity Analysis

    OperationTimeSpace
    get(key)O(1)โ€”
    put(key, value)O(1)โ€”
    Total spaceโ€”O(text{capacity})

    Every operation โ€” lookup, recency update, eviction โ€” is constant time. Memory grows linearly with the cache size, never with the number of operations performed.

    Python's `functools.lru_cache` and `functools.cache`

    If you don't need a class โ€” just want to memoize a pure function โ€” Python ships with a battle-tested decorator:

    from functools import lru_cache, cache
    
    @lru_cache(maxsize=128)
    def fib(n):
        if n < 2: return n
        return fib(n - 1) + fib(n - 2)
    
    @cache  # Python 3.9+, equivalent to lru_cache(maxsize=None)
    def expensive(x):
        ...

    Key differences:

  • lru_cache(maxsize=N) evicts the least recently used entry once N is reached.
  • functools.cache (added in Python 3.9) is uncapped โ€” it grows forever. It's smaller and faster than lru_cache because it skips the eviction bookkeeping.
  • Both are thread-safe at the cache-coherence level: the underlying data structure won't corrupt under concurrent access. However, the wrapped function can still be called more than once if two threads race past a cold cache simultaneously.
  • Both expose .cache_info() (hits, misses, currsize) and .cache_clear().
  • OrderedDict Approach

    Here's the secret: collections.OrderedDict already is a hash map plus a doubly linked list under the hood. You can write a full LRU cache in ~15 lines, as the example shows. Two methods make it work:

  • move_to_end(key) โ€” O(1) promotion to most-recently-used.
  • popitem(last=False) โ€” O(1) eviction of the oldest entry.
  • OrderedDict is slightly slower per op than dict, but it's purpose-built for frequent reorder operations and very fast in practice.

    Common Pitfalls

  • Forgetting to update recency on `get`. A get that skips move_to_end is not LRU โ€” it's FIFO in a costume.
  • Off-by-one capacity. Evict when len(cache) > capacity, not >=. Otherwise you'll evict the item you just inserted.
  • Not handling key-already-present in `put`. If the key exists, promote it to the back and update its value. Skipping the promotion breaks LRU semantics.
  • `@lru_cache` on instance methods. The decorator holds references to self, so cached instances never get garbage collected โ€” a sneaky memory leak. Use cachetools.LRUCache per-instance instead.
  • Unhashable arguments. lru_cache hashes its args, so passing a list or dict raises TypeError. Convert to tuple or frozenset first.
  • Real-World Uses

  • CPU caches (L1/L2/L3) use LRU-style replacement in hardware.
  • Browser and disk caches โ€” your back button rides on a recency-based page cache.
  • CDN edge caches (Cloudflare, Fastly) use LRU variants to keep hot assets near users.
  • ORM query caches in Django, SQLAlchemy, Hibernate cache rendered queries with LRU eviction.
  • Web framework session stores evict idle sessions LRU-style.
  • Redis offers maxmemory-policy allkeys-lru for memory-bounded operation.
  • GPU memory managers in PyTorch and TensorFlow use LRU-ish tensor caching.
  • LRU vs Other Eviction Policies

  • LFU (Least Frequently Used) evicts the least-often used entry. Better when popularity is stable.
  • FIFO evicts in insertion order, ignoring access. Simple, often worse than LRU.
  • MRU (Most Recently Used) evicts the most recent โ€” niche, useful for full table scans.
  • ARC (Adaptive Replacement Cache) dynamically balances LRU and LFU. Used by ZFS.
  • TinyLFU / W-TinyLFU mixes a frequency sketch with a small LRU window. Powers Caffeine and outperforms plain LRU on most real workloads.
  • Frequently Asked Questions

    Why a doubly linked list, not singly? Removing a node in O(1) requires updating the previous node's next pointer. With a singly linked list you'd scan from the head to find the predecessor โ€” O(n).

    Is `functools.lru_cache` thread-safe? Yes for cache integrity โ€” concurrent get/put won't corrupt the structure. But the wrapped function can be invoked multiple times concurrently on a cold key, since there's no lock around the user function.

    How is LRU different from LFU? LRU asks "when did you last touch this?" LFU asks "how often?" LFU wins when popularity is stable; LRU wins when access patterns drift.

    What happens at full capacity? A put that exceeds capacity evicts the head node โ€” size stays at exactly capacity.

    Can the cache size be changed at runtime? Not directly with functools.lru_cache โ€” you must clear and re-decorate. cachetools.LRUCache supports resize() natively.

    When should I NOT use LRU? When access is uniformly random, when frequency matters more than recency (use LFU), or when items have explicit TTLs (use TTLCache). Skip caching entirely if recomputation is cheap โ€” caching has overhead too.

    Related examples