Portmint Lighthouse

The Trouble With Repeating Yourself

Every time you copy a fact into a second spot, you've quietly made a promise: to keep both copies in step forever. Break that promise — and you always do, eventually — and your data starts telling two different stories.

Picture writing a friend's phone number on three sticky notes around the house. The day they change their number, you update the note on the fridge and forget the two in the drawer. Now you've got three notes and no way to know which one is right. That's exactly what repeated data does inside a spreadsheet, just at a scale that hurts more.

How the mess sneaks in

Say you keep one big grid of orders. Every order row also carries the customer's name, email, and address, because you wanted it all in one place. Feels convenient. One customer with twenty orders now has their address copied across twenty rows.

Then they move. To fix it honestly, you'd have to find and update all twenty rows. Miss one — and tired people always miss one — and now your data disagrees with itself. Half the orders ship to the new address, half to the old. Nothing crashed. No error popped up. The data just quietly went wrong, which is the worst kind of wrong, because you can't see it until a package lands on the wrong porch.

One fact, one home

The cure has a homely name: keep each fact in exactly one place. The customer's address belongs in one row of a customers table, and nowhere else. The orders don't carry a copy of the address — they just point to the customer it belongs to. Change the address once, in its one true home, and every order automatically reflects it, because they were never holding their own copy to begin with.

This is the heart of why databases split things into separate tables. It isn't fussiness. It's so that no fact ever has to be remembered in two places at once. One home per fact means one place to fix it, and no chance of two copies drifting apart.

The everyday version

You already do this in your own life when you're being careful. You keep your friends' numbers in one contacts list, not scribbled on scattered notes, precisely so there's a single source of truth. When a number changes, you update the contact, and every text and call from then on uses the right one. The database idea is the same instinct, made into a rule the system enforces for you.

Why this matters

Repeated data is the number-one reason business records turn untrustworthy. Two spreadsheets, two totals, two answers to "how many customers do we have?" — and a long, grumpy afternoon spent reconciling them. Designing so each fact has one home is how you avoid that afternoon entirely.

It's also the "why" behind nearly everything left in this course. Keys, relationships, and queries are the tools that let tables stay separate and lean while still working together. They only make sense once you feel this ache — the cost of saying the same thing twice.

Your turn

Look at a spreadsheet you use and hunt for a fact that's copied down many rows — a name, a price, a category. Imagine it changing tomorrow. Count how many cells you'd have to fix by hand. That count is the size of the problem databases solve.

Next we'll meet the small but mighty idea that lets one table point at another without copying anything: the key. 🔦

Stuck or curious?

Ask Pip about this lesson — tap the porthole bottom-right.