02.03.2025
Science

A Young Mathematician’s Breakthrough Could Turbocharge the Internet

US Grad Student Upends 40-Year-Old Hash Table Theory

A young mathematician from the US has toppled a theory unshaken for 40 years—mostly because he’d never heard of it. Now, Andrew Krapivin’s discovery might just give the internet a serious speed boost.

Every time you flick through your phone’s contacts, hunt for a product online, or sift through emails, special data structures called hash tables swoop in to fetch what you need in a flash. They’re the unsung heroes keeping the modern web humming. But for ages, experts thought hash tables had hit their speed limit. Enter Krapivin, a grad student whose offbeat find has flipped that notion on its head. Here’s how hash tables tick, why his breakthrough matters, and what it might mean for how we handle data down the road.

Hash Tables 101: The Basics

A hash table is a clever way to stash and snag information, organizing it into “key-value” pairs for quick access. Think of it as a filing system that lets you add, ditch, or find entries lickety-split. The magic lies in hash functions—nifty formulas that crunch a key into an index, pointing straight to where its value sits in an array. Instead of rummaging through every item, the system zips to the right spot, slashing search time.
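As a toy illustration (not the breakthrough algorithm this article is about), here is roughly what that key-to-index trick looks like in Python. The built-in `hash()` stands in for the “nifty formula”, and the names `put`/`get` and the eight-slot array are invented for the example:

```python
# A toy key-value store: hash() maps a key to a slot index, and the
# modulo keeps that index inside the array. No rummaging required.
SLOTS = 8
table = [None] * SLOTS

def put(key, value):
    index = hash(key) % SLOTS    # key -> index in one step
    table[index] = (key, value)  # this toy version ignores collisions

def get(key):
    entry = table[hash(key) % SLOTS]
    return entry[1] if entry is not None and entry[0] == key else None

put("alice", "555-0199")
print(get("alice"))  # prints 555-0199 (same run, same hash)
```

Within a single run, `hash()` always gives the same index for the same key, which is exactly what makes the lookup a straight jump rather than a search.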

Picture a library: each book’s got a call number (the key), and a catalog (the hash function) tells you exactly which shelf (the index) it’s on. No need to scour the stacks. Hash functions spread data evenly across the table, dodging pile-ups called collisions that could slow things down. That’s why they’re a go-to for online services—nobody likes twiddling their thumbs while a database dawdles.

The catch? A table’s efficiency hinges on how full it gets—the ratio of stored items to total slots. As it nears capacity, collisions stack up, and finding a free spot takes more elbow grease.
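That fullness ratio, usually called the load factor, is a one-liner to compute; a minimal sketch:

```python
# Load factor: stored items divided by total slots. The closer it
# creeps to 1.0, the more collisions pile up.
def load_factor(table):
    used = sum(slot is not None for slot in table)
    return used / len(table)

toy = [("a", 1), None, ("b", 2), None]  # 2 of 4 slots occupied
print(load_factor(toy))  # -> 0.5
```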

Why Hash Tables Were Thought Maxed Out

In open-addressing hash tables, one of the two main designs (the other, called chaining, keeps colliding entries in linked lists), data lives right in the array’s cells. Each slot is either empty or holds a key. To slot in a new key, a sequence of hash functions suggests spots until an empty one turns up. The snag, known as search complexity, is how many tries it takes to land that vacancy. That’s the heartbeat of a hash table’s speed.
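A hedged sketch of that insert loop: here the “sequence of hash functions” is simulated with a random generator seeded by the key, which is an assumption made purely for illustration, not the scheme any of the researchers used.

```python
import random

# Open-addressing insertion, sketched: keep probing suggested slots
# until a vacancy appears, and report how many tries it took.
def insert(table, key):
    rng = random.Random(key)            # probe sequence fixed by the key
    probes = 0
    while True:                         # assumes the table is not full
        probes += 1
        i = rng.randrange(len(table))   # next suggested slot
        if table[i] is None:
            table[i] = key
            return probes               # the "search complexity" of this insert

table = [None] * 16
print(insert(table, "first key"))  # 1 probe: every slot is still empty
```

The returned probe count is exactly the quantity the rest of the article is about: how it grows as the table fills up.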

The goal? Trim that complexity, especially when the table’s bursting. In plain speak we’d say a table is 50% or 80% full; researchers instead use a figure, x, equal to one divided by the fraction of slots still free, to gauge how close it is to overflowing. An x of 100 means 99% full; at 1,000, it’s 99.9%.
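Reading the gauge as x = 1 / (free fraction of slots) reproduces the article’s numbers; that formula is an inference from the figures given, not something the text states outright:

```python
# x = 1 / (1 - load factor): a table that is 99% full has 1% of its
# slots free, so x = 100; at 99.9% full, x = 1000.
def fullness_gauge(load):
    return 1 / (1 - load)

print(round(fullness_gauge(0.99)))   # 100  -> table is 99% full
print(round(fullness_gauge(0.999)))  # 1000 -> table is 99.9% full
```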

Back in 1985, computer scientist Andrew Yao, later a Turing Award winner, argued that for open-addressing tables the best way to hunt for a slot was to probe positions uniformly at random, a strategy dubbed uniform probing. He reckoned that in the worst case, like snagging the last free spot, you couldn’t dodge a time cost proportional to x: a 99% full table might mean checking around 100 spots. For 40 years, most figured Yao had nailed it.
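A quick simulation makes that cost concrete: probing uniformly at random in a table at load factor d takes about 1 / (1 - d), i.e. about x, tries on average to hit a vacancy. This is a sketch for intuition only; the table size, trial count, and seed are arbitrary demo choices, not part of the 1985 analysis.

```python
import random

# Uniform probing, simulated: pick slots uniformly at random until an
# empty one turns up, and average the number of tries over many runs.
def probes_to_vacancy(occupied, size, rng):
    tries = 0
    while True:
        tries += 1
        if rng.randrange(size) not in occupied:
            return tries

def average_probes(load, size=10_000, trials=2_000, seed=1):
    rng = random.Random(seed)
    occupied = set(rng.sample(range(size), int(load * size)))
    total = sum(probes_to_vacancy(occupied, size, rng) for _ in range(trials))
    return total / trials

print(average_probes(0.99))  # hovers around 100, as the article describes
```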

Then, in January 2025, Cambridge grad student Andrew Krapivin and a couple of pals proved the titan was off the mark—and that hash tables could run even leaner.

Krapivin’s Coup—and How It Happened by Chance

It all started in late 2021, when Krapivin, then an undergrad at Rutgers, stumbled on a paper about shrinking memory pointers. Years later, he circled back, reread it, and twigged that pointers could slim down further—if the data they pointed to got a makeover.

Diving into hash tables, he cooked up a new breed that outpaced the old guard. His version tracked down items faster, with less sweat. He pitched it to his prof, Martin Farach-Colton, a co-author of that pointer paper. Farach-Colton raised an eyebrow—hash tables were mined dry, right? But when he tapped William Kuszmaul from Carnegie Mellon to double-check, Kuszmaul saw gold.

Kuszmaul broke it to Krapivin: this wasn’t just a new table—he’d toppled a 40-year orthodoxy. The kicker? Krapivin had never clocked Yao’s work. Unshackled by dogma, he’d wandered where others feared to tread.

Now a Cambridge grad student, he teamed with Farach-Colton and Kuszmaul to pen a paper unveiling a method that beats random probing, even in a jam-packed table. Their trick doesn’t just quicken the hunt—it locks in steady search times, no matter how stuffed the array gets.

What’s Next?

Don’t expect the web to zip overnight, but Krapivin’s brainwave could ripple out in time. At the least, it’s lit a fire under further digs—already, a follow-up paper’s floated a model tweaking search tactics based on what’s taken. The internet’s gears might just spin smoother yet.
