Sunday, April 2, 2017

Birthday problem: Facebook friends



The birthday problem is a famous exercise in probability:  at a gathering of N people, what is the probability that two persons share the same birthday?  It seems counter-intuitive that when N = 23, the probability just exceeds 50%.  You can read the analysis in wikipedia, or my blog post about it (here).  Or how it snuck up on me in another context and I didn't recognize it (here).

Last week I was on Facebook (again), and thinking about birthdays.  I have a total of 45 friends who let Facebook know their birthday.  That's not many friends by FB standards, but I try not to take it personally.  In any event, I was curious to know how many shared birthdays there were among my FB friends.  It's easy to check once you're on the right page (thanks, Google, for pointing the way).  FB shows the birthday when you hover above the person's photo.  In any event, there were two pairs:  February 14 and April 11.

Here's the point:  it's easy to calculate the probability of one pair when N = 23.  It's much harder to enumerate the possibilities and their probabilities when N = 45 or 250 or whatever.  This is a job for simulation!

Fifty lines of Python (here) and I can run 10000 simulations for my case (N = 45) in an instant.  The results for one run show that the most common outcome is to have two pairs, and 80% of the runs have 1, 2, 3 or 4 pairs.

> python friends.py 
 2727 {2: 2}
 2251 {2: 3}
 1934 {2: 1}
 1138 {2: 4}
  591 {}
  396 {2: 5}
  248 {2: 1, 3: 1}
  220 {2: 2, 3: 1}
  175 {2: 3, 3: 1}
   85 {3: 1}
   83 {2: 6}
   60 {2: 4, 3: 1}
   17 {2: 5, 3: 1}
   13 {2: 7}
    9 {2: 2, 3: 2}
    7 {4: 1}
    6 {3: 2}
    5 {2: 1, 4: 1}
    3 {2: 6, 3: 1}
    2 {2: 3, 4: 1}
    1 {5: 1}
10000

1 comment:

Anuj Agarwal said...

Hi Python for Bioinformatics Team,

My name is Anuj Agarwal. I'm Founder of Feedspot.

I would like to personally congratulate you as your blog Python for Bioinformatics has been selected by our panelist as one of the Top 75 Bioinformatics Blogs on the web.

http://blog.feedspot.com/bioinformatics_blogs/

I personally give you a high-five and want to thank you for your contribution to this world. This is the most comprehensive list of Top 75 Bioinformatics Blogs on the internet and I’m honored to have you as part of this!

Also, you have the honor of displaying the following badge on your blog.