When is your birthday? The math behind hash collisions

0xkrt26.github.io
52 points | denismenace | about 17 hours ago | 13 comments

Comments

pedrosbmartins|about 10 hours ago
> What is the probability that you are sharing the same birthday with people around you?

> What if I told you that in a room with only 23 people there’s already a 50% chance for two of them to have matching birthdays?

I guess it's the subject shift from _you_ to _any two people from a group_ that creates the surprise in the birthday paradox. You definitely need way more than 23 randomly sampled people to get to a high probability that _you_ specifically share a birthday with one of them, and the result does not contradict that notion.
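To put rough numbers on that distinction, here is a minimal sketch comparing the two questions. It assumes 365 equally likely birthdays and no leap day, and the function names are made up for illustration.

```python
from math import prod

DAYS = 365  # assumption: uniform birthdays, leap day ignored


def p_any_pair(n: int, days: int = DAYS) -> float:
    """Probability that at least two of n people share a birthday."""
    if n > days:
        return 1.0  # pigeonhole: more people than days
    # Complement of "all n birthdays are distinct".
    return 1.0 - prod((days - k) / days for k in range(n))


def p_matches_you(others: int, days: int = DAYS) -> float:
    """Probability that at least one of `others` people shares *your* birthday."""
    return 1.0 - ((days - 1) / days) ** others


def others_needed(target: float = 0.5, days: int = DAYS) -> int:
    """Smallest number of other people giving at least `target` chance of matching you."""
    n = 0
    while p_matches_you(n, days) < target:
        n += 1
    return n


if __name__ == "__main__":
    print(f"any pair among 23:      {p_any_pair(23):.4f}")
    print(f"you vs. 22 others:      {p_matches_you(22):.4f}")
    print(f"others needed for 50%:  {others_needed(0.5)}")
```

Under these assumptions the any-pair probability for 23 people is a bit over 50%, your own chance of a match in that same room is roughly 6%, and you would need 253 other people to reach 50% for yourself.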

JKCalhoun|23 minutes ago
Yeah, they should not have led with subterfuge. It's still remarkable to many people (myself included) that a pool as small as 23 gives a 50% probability.

I think even given that premise, the "50% probability" is still a bit of a rug pull. The casual listener still believes the problem should address the 100% match.

A more honest approach is to plainly ask how many people have to be at a party to guarantee there are at least two people with the same birthday. Even to the layman, the answer is 366 of course. Follow that, though, with, "And how many people will have had to arrive for there to be a 50% likelihood that two people at the party have the same birthday?"

To go from 366 to 23 I think is a surprise to many people. Because humans suck at probability, most people might instinctively assume half of 366 (183). So it becomes a surprise how low (less than two dozen!) it really is.

My own "drunk walk" to making sense of the small number: when two people are at the party, it is intuitive to me that there is a 1 in 365 chance they will have the same birthday. As soon as a 3rd person arrives, though, there are two partygoers they might match, so the odds have just doubled! :-) I understand, though, that the 4th person arriving does not double the odds but nonetheless increases the chances by 50%.

Suddenly I can now see a kind of asymptotic curve that, when we get to 366, will at last cross the threshold for 100% probability. But the asymptotic nature makes it clear to me that it will cross the 50% mark much sooner than would a linear growth. I am already convinced at this point that your 23 number is probably a pretty good one.
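For reference, the curve described here can be written down exactly. This is the standard textbook form, assuming 365 equally likely birthdays, with P(n) standing for the chance that at least two of n partygoers match:

```latex
P(n) \;=\; 1 - \prod_{k=0}^{n-1} \left(1 - \frac{k}{365}\right)
      \;\approx\; 1 - \exp\!\left(-\frac{n(n-1)}{2 \cdot 365}\right),
\qquad P(23) \approx 0.507, \qquad P(366) = 1 .
```

Each new arrival multiplies the "no match yet" probability by a slightly smaller factor, which is why the curve climbs quickly early on. It reaches exactly 1 at n = 366, because the k = 365 factor in the product is zero.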

utopcell|about 2 hours ago
Sup-par phrasing is a subtle advantage of non-AI generated text. In the past, I would be put off by this bad phrasing and the typo ("requier") in the text, but these days, it's a signal that a human took the time to write this, which makes me happy to see.

..or is it "sub-par"?

incognito124|about 1 hour ago
I didn't even notice it at first!
aidenn0|about 7 hours ago
This is also an easy way to detect RNGs that are not truncated (i.e. that return their entire state, or any 1-to-1 permutation of their entire state):

https://www.pcg-random.org/posts/birthday-test.html

Example:

Any RNG with a period of 2**32 that can output every 32-bit value at least once must have zero collisions for the first 2**32 outputs, but from a truly random 32-bit source we would expect to see a handful of collisions (about 5 on average) after just 200k outputs.
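A rough way to estimate how many collisions an ideal random source should show is the usual pair-counting approximation: n draws contain n(n-1)/2 pairs, and each pair collides with probability 1/N. The sketch below assumes uniform, independent 32-bit outputs; the function names are hypothetical.

```python
def expected_colliding_pairs(n: int, space: int = 2**32) -> float:
    """Expected number of colliding pairs among n uniform draws from `space` values."""
    return n * (n - 1) / (2 * space)


def draws_for_expected_pairs(target: float, space: int = 2**32) -> int:
    """Approximate number of draws needed for `target` expected colliding pairs.

    Inverts n**2 / (2 * space) ~= target, ignoring the small (n - 1) vs. n difference.
    """
    return round((2 * space * target) ** 0.5)


if __name__ == "__main__":
    print(f"expected pairs after 200,000 draws: {expected_colliding_pairs(200_000):.2f}")
    print(f"draws for ~100 expected pairs:      {draws_for_expected_pairs(100):,}")
```

Under this approximation, 200,000 draws give about 4.7 expected colliding pairs, and it takes roughly 930,000 draws to reach about 100. A full-period generator that returns its whole 32-bit state shows zero in either case, which is what the test picks up on.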

gblargg|about 2 hours ago
Such an RNG would be great for playing your 2^32 song collection, since you'd never hear the same song twice within a given time through.
rmunn|about 7 hours ago
> What is the probability that you are sharing the same birthday with people around you?

If you're a twin and your twin sibling is standing next to you, nearly 100%. But not exactly 100%: there have been cases of twins born on either side of midnight ending up with birthdays that differ by a day. (I don't personally know of any twins born on either side of midnight between Dec 31st and Jan 1st, who would then have different calendar years in their birthdays, but odds are very good that it has happened at least once in human history).

bryanrasmussen|about 5 hours ago
Twins born mid-flight as the plane crosses a time zone: can the second twin be born before the first!!?
rmunn|about 2 hours ago
Or (more likely since most airlines won't let a nine-month pregnant woman on board), born aboard a ship. (These days, most likely a cruise ship).

For extra fun, have them be born on opposite sides of the International Date Line, crossing west-to-east so that the younger twin (born on the east side of the line) is born on (say) July 1st at 8:00 AM local time, while the older twin (born fifteen minutes earlier on the west side of the line) is born on July 2nd at 8:45 AM local time.

For extra EXTRA fun, have them be born on opposite sides of the International Date Line on opposite sides of midnight, AND as the calendar ticks over from Dec 31st to Jan 1st. It gets really, really confusing. Though thankfully, I would bet money that particular example is contrived enough that it has never happened in real life.

pablowegw|about 11 hours ago
And what are the odds of your birthday being exactly at the center of the (non-leap) year? That's my B'day. Cool!
rossvc|about 11 hours ago
50/50. Either it is or it isn’t!
ChrisArchitect|about 10 hours ago
Related today:

Ask HN: We just had an actual UUID v4 collision...

https://news.ycombinator.com/item?id=48060054

chistev|about 7 hours ago
That might be why the OP posted it.