So, I give you a million numbers written on plain flash cards and along with each number is written a unique name. The numbers are not in any particular order and are totally random. And once I’ve handed out these cards to you, I start asking you what the name corresponding to a certain number was. The problem is to estimate how long it will take you to do this. What you will most likely do is start going through the cards one by one until you came upon the card which had the number I asked you to find me the name for(if you’re really lucky this might come soon), at which point you would shout out, “HEY, the name is xxx’. The question is: is this really the fastest you can go?
Here you would have to look at each and every one of the cards, see if the number matches and if it does than you would reveal the name. Now in the worst case scenario, the card with the number on it is at the end of the pile so you have to look at a million cards every time I ask you for the name corresponding to a particular number.
Now, my claim is that I will look at only 20 cards and tell you the unique number corresponding to the name. If this is possible, compared to you I am 50,000 times faster.
Take a moment and think about it. Spend a day on it, if you wish. Take two. Actually, I’m going to be generous here. Take a week. And come back and look at the solution given below. Or don’t wait and just read it now, which is what you’re probably going to do.
One of the solutions:
Now, this problem has multiple neat solutions. The one I am going to share with you is one of the easy ones. The solution uses an almost magical structure called a Binary Tree. So, when you give me a card, rather than just putting the cards one on top of the other as I get them, I am going to read the number on the card and put it at a certain location. Now when you give me the second card, I am going to read the second number and ifit’s larger than the first number I am going to put it on the right of the first card and if it is smaller than the first number then I will put it on the left side of the first card. So, after a few run’s I get a structure that reads like this:
So, when you tell me a name, I look at the card at the top and go: ‘Hmm.. well let’s see. The number I want to find is greater then this one. So I should look at only the cards on the right.’ I have thus removed half a million cards. And with some very simple math we get 2 to the power 20 which is 1048576, a bit larger then a million. So anything under this number I can get to by just looking at 20 Cards.
Here’s how.
Cards to Consider Cards Checked
1048576 Step 0
524288 Step 1
262144 Step 2
131072 Step 3
.
8 Step 17
4 Step 18
2 Step 19
1 Step 20
Here after each step the size of my problem remains only half of what it originally was. So BAM!
Impressed? You better be! You were going to turn over a million cards every time. I just saved you a whole lot of time.
Where the fun starts:
In Computer Science such solutions have their own problems. This solution will only be valid when we have a balanced tree (i.e. each side of the tree has an equal number of cards in it). So if I gave you a sorted list of numbers, your simple strategy would fail and you would still end up searching one million even with the structure I have proposed. However, such problems are not unsolvable and we have some slightly trickier structures with slightly more properties that allow us to create very balanced binary Trees (Red Black Trees). So, say you can’t find my card in 20 moves. Well you will most likely find it in 30. This is still more than 30,000 times faster than the time you were going to spend.
Blame yourself yet?
Haris