First, some logical thought. (That has absolutely nothing to do with MATLAB.) Does the argument made hold water?
We can think of any permutation of the numbers 1:100 (or 1:n) in general as representing a graph, in the sense that box n maps to some other box, based on the number the box contains. (A box may even rarely map to itself for some boxes.)
But because this is a permutation of the numbers 1:n, then any box can only have one other box that maps to it, and there will always be exactly one box that maps into box k. This is true because we are working with a permutation of the numbers 1:n.
Next, if we start out in box #k, MUST we always return to box #k? The only way this could fail is if it was possible for us to have a sequence that would go something like
A --> B --> C --> D --> B --> C --> D --> B ...
Can you see why this is impossible, under the conditions we have set up the boxes? For a sequence like the above to happen, we MUST have TWO boxes that each point to box B. Thus in the postulated search sequence, box A and box D must both point to box B. Since we set up the numbers in the boxes as a permutation of the numbers 1:n, that can never happen. Therefore, if we start in box #k, then we will always returns to box number k. A cycle must always exist that contains box k, so if we start at box k, we will eventually return to box k, although the number of steps required for that event may be the maximum possible.
The only way we can fail then is if the cycle containing box k is if the associated cycle happens to pass throguh more boxes than 50.
In fact, any permutation of 1:n can result in multiple cycles, some of a short length, some longer. For example, consider the numbers 1:20. I'll just pick one random sample.
[1:20;randperm(20)]
ans =
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
12 20 2 14 15 10 13 18 8 9 1 5 17 11 7 6 16 3 4 19
What cycles can I find there? Only 1 cycle in this permutation.
1--> 12 --> 5 --> 15 --> 7 --> 13 --> 17 --> 16 --> 6 --> 10 --> 9 --> 8 --> 18 --> 3 --> 2 --> 20 --> 19 --> 4 --> 14 --> 11 --> 1
Hmm, this case would fail. Any box I start in will take exactly 21 steps before I return to that box.
[1:20;randperm(20)]
ans =
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
8 9 17 14 18 12 2 10 16 20 3 7 19 4 1 13 11 5 15 6
Here, we find 4 distinct cycles.
1 --> 8 --> 10 --> 20 --> 6 --> 12 --> 7 --> 2 --> 9 --> 16 --> 13 --> 19 --> 15 --> 1
3 --> 17 --> 11 --> 3
4 --> 14 --> 4
5 --> 18 --> 5
As you can see, each cycle is disjoint from the other cycles. At the same time, one of the 4 disjoint cycles was of length greater than 50% of the total number of elements. So this permutation is also a failed permutation, in that if we started in any of the 13 boxes in that first cycle, it will take exactly 13 steps to return to that box. And returning to the start box means we have found the box containing our number.
One thing you do need to recognize if that IF we start in box #k, then we must ALWAYS return to box #k by following the chain thus induced. The only question is the expectation of how many steps it will take at most. The question really reduces to determining the probability that a permutation of 1:100 will contain no cycle of length greater than 50. (The claim made is the frequency is a wee bit over 30% for permutations of the number 1:100.)
Can we use the tools in MATLAb to do this for us? Well, yes, we can do a lot.
First, I'll create a random permutation graph. I'll do it for the numbers 1:20. The function conncomp really does it all.
s = 1:20;
t = randperm(20);
G = graph(s,t);
conncomp(G)
ans =
1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 2 3 2
So this time we had 3 disjoint cycles, but one of them was a long one.
Do I really need to write code to estimate that probability using simulation? After all, that was your assignment. I'll look back in tomorrow morning. In fact, the functions graph and conncomp do almost all the work for you. accumarray would do the rest, IF you understand what it tells you. (If you are not allowed to use the graph tools to solve the problem, too bad. I won't re-write them.) Consider what this little code fragment tells you.
t = randperm(20);
G = graph(s,t);
conncomp(G)
ans =
1 1 2 2 2 3 2 2 1 3 2 2 2 3 1 1 2 1 2 3
accumarray(conncomp(G)',1)
ans =
6
10
4
max(accumarray(conncomp(G)',1))
ans =
10
Edit: Since I know the answer to that silly question above, what does this tell you?
n = 100;
s = 1:n;
N = 100000;
success = 0;
for i = 1:N
t = randperm(n);
G = graph(s,t);
if max(accumarray(conncomp(G)',1)) <= n/2
success = success + 1;
end
end
disp(success/N)
0.31181
Of course, if you turn in this terse little code fragment as the solution to your homework assignment, my guess is your teacher will know you did not do it yourself. Que sera, sera...
Post edit:
There are a couple of important points to understand here. I think some of the commenters missed the point, not fully thinking through the algorithm as described.
- Can you randomly have fallen onto the wrong cycle? NO! If you have number k, then you start by looking at box #k. That induces some sequence of numbers that starts with the number k. Eventually, if you end up searching through every box, this tells us only that there is only one cycle induced by the particular permutation, and that is a really large cycle. In that case, everybody fails to find their original number in a sufficiently low time by this algorithm.
- Suppose a really large cycle exists that spanns ALL of the boxes. Will this algorithm start you at some random point in that cycle, so that even though there is a cycle of length 100, you might find your number before 100 boxes have been searched? NO! That is like the joke about why are your missing car keys always in the last place you look - you stop looking because there is no reason to look anymore. Effectively, if a long cycle exists, and you start loooking for #k in box number k, then if a box ever points you back into box #k befoe you have looked in every box, then the cycle is not one that is the fullest possible length.
- Can some people happen to have numbers that allow them to suceed in fewer than 100/2=50 steps? Yes. But the gist of the algorithm posed is that EVERYBODY needs to succeed, or the case is deemed to have failed. For example...
I ran off a couple of sample graphs, looking for one that proves my point, but here is the one I chose:
n = 100;
s = 1:n;
t = randperm(n);
G = graph(s,t);
accumarray(conncomp(G)',1)
ans =
77
21
1
1
Here is a permutation where out of 100 numbers, there is one cycle of length 77. So if you start from ANY of those boxes, say box #k, while looking for the number k? In every case, it will require you to search through EXACTLY 77 boxes before you find the box that directs you back to the box you started from. That final box contains the number you were searching for.
However, there are also two unit cycles, where a box contains its own number. So this permutation is not a "derangement". A derangement is a permutation that has no numbers that appear in their original position. (Yes, there is even a name for such a result.)
So if you happen to be one of those two people for this permutation, if you start with that number, then the box you will open first contains your own number. So this particular permutation is NOT a derangement.
There is also a short cycle here, of length 21. So for this particular permutation, there are actually 23 people who will succeed in finding their own number. Two will find it immediately. And 21 of them will find their number after 21 steps. But if you happen to lie in the cycle of length 21, no matter what number you hold, you will always run through all 21 boxes before you are done. The important point is that for any given permutation, unless ALL 100 numbers lead to a successful conclusion, then the algorithm is deemed to have failed on that permutation. And that explains why the simulation I ran was sufficient to determine if a solution exists for a given permutation.
Finally, why did the code that Fisnik wrote fail? I can best only say that it was not written to solve the problem. There are literally too many problems with that code for me to list. I'm sorry, but that is true. You need to fully understand the algorithm described to search through the boxes, and what that entails in terms of permutations. I think you also need to understand things like derangements to fully appreciate things here. Could code have been written by me to solve this problem without using the graph tools in MATLAB? Of course, but I won't write that code. As well, I have not written what I would consider a complete proof that the probability is around 31% that a permutation allows success using this algorithm. That would get deeply into the theory of things like derangements, and I won't go there. This answer is already far too long.
I think this now covers all possible things that need be discussed, at least in my eyes. If you are confused, then you should start by a careful re-reading of my answer, but I can try to entertain further thoughtfully posed questions, as long as you have thoughtfully read my full answer here and the analysis contained therein.