Solve this statistical question

July 4th, 2009   Filed Under enart.notebookputer.com

  • Clyde takes the same 30-minute walk every day. During his walks, he listens to music on his iPod. His iPod contains 611 songs in memory, which Clyde listens to on a randomly shuffling basis. Last week a certain song started playing at precisely the same point along his walk 3 times in 5 days. My question: What is the precise mathematical probability of this highly unusual occurrence? Clyde allows that the second and third time, there may have been up to a 3-second variance from the exact spot where the song began the first time, but that "all three times were within +/- 3 seconds at the most."

    I would like 3 answers, changing one variable:

    1. perfect alignment, no variance - 3 hits at exactly the same point along the walk
    2. with up to +/- 3 seconds variance - a 6-second spread
    3. with up to +/- 6 seconds variance - a 12-second spread

    To answer this question you will have to determine or make an educated guess regarding average song length. All songs are in the Contemporary Christian category. If a crooked number, like 3:56, back it up. (Good backup merits a tip.) Otherwise choose a round number like 4:00. You might also need to know how far into the walk the exact song-start location is. I'll ask Clyde and post that info as a clarification.

    If the Random function is not truly random with respect to iPod track selection and playback, that is important to know. If this affects your analysis, explain how it affected the analysis and how it increased the degree of difficulty and I will tip you accordingly.

    I require a thorough mathematical explanation. Walk me through the math and explain all assumptions, variables, and statistical variation. Your answer must be mathematically and scientifically unassailable.


  • Clyde reports that the point where the song began is "3/4 of the way through the 1 mile walk." Also my request that your answer be "unassailable" was perhaps too extreme. I do not want to scare off a Researcher who can provide a reasoned, methodical answer. Your methods and math need to be solid and defensible, but I understand your answer is ultimately going to be an approximation built on approximations. The point of the exercise is to make clear that the odds of this happening are astronomical - whether it's 60 million to 1 or 600 million to 1 is of lesser importance. But I want the analysis to be solid because it might make its way into publication.


  • Hi there! Clyde has too much time on his hands :)

    Ok, first of all we have to know how random the shuffle algorithm is. Information on this is hard to come by, with one site ( http://www.audiorevolution.com/equip/ipod/ ) saying the iPod's algorithm is good, and another ( http://www.v-2.org/displayArticle.php?article_num=330 ) complaining that it favors a dozen or so files over the others. I will presume first that the algorithm is "perfect" - that is, each song has precisely the same probability of playing next as any other.

    Next, an assumption. Since we do not know the lengths of Clyde's songs, nor precisely what time the "same time each day" was, I am assuming that each song has an equal probability of beginning at any given second. This may not be a valid assumption. Consider as a simplification 2 songs, one 1 minute long, one 10 minutes long. The probability of a song beginning at 1 minute is 50% (of the four equally likely two-song sequences, long-short and long-long do not start a song at 1 minute, while short-short and short-long do), whereas the probability of a song beginning at 2 minutes is only 25% (short-short-short; short-short-long; short-long-short; short-long-long; long-short-short; long-short-long; long-long-short; long-long-long; 2 of 8 have songs starting at minute 2). As the time interval lengthens, the probabilities even out; as the number of songs increases, the probabilities even out. So, since this is an unknown, I am assuming the simplest and most likely case, that they are equal.

    Next, 1 more assumption. I assume that by "at the same time" you mean "during the same second", and that "within 3 seconds" means "during the same second or the two adjacent seconds". This addresses the concern of a commenter below who notes that no two events will occur at *precisely* the same time - there is always some small but nonzero variation.

    Another assumption: song length. This establishes the exact probability of *any* song starting during any given second.
    Shorter songs imply a greater probability of a song starting; longer songs imply a lower probability. The number we choose is unimportant to the analysis since it is merely a scaling factor. That is, should Clyde's songs be, on average, half as long as mine, his chances of a song starting at any given second will be twice what mine are. The math remains the same, and you can multiply the numbers by 2 if you would like.

    With that, and with the disclaimer that I'm not big into Christian anything, I do have a diverse collection of audio, spanning books on CD, classical, jazz, '80s, movie soundtracks, Celtic, electronic/alternative/new-age, and many other categories. According to Winamp, there are a total of 11464 tracks spanning 2377220 seconds, or an average of 207.36 seconds per song. Thus there is a 1/207.36 (0.48224%) probability of any song beginning at any given second (with our simplifying assumption from above taken into account).

    Yet more assumptions! "3 times in 5 days" indicates the phenomenon occurred on the 1st, 5th, and one other day. After all, if it happened 3 days in a row, we would be hearing "3 days in a row" and the question would have been asked with perhaps more incredulity.

    We have 2 probabilities to calculate and multiply: first the chance of it happening on a given day, and then the chance of it happening on the requisite 3 of 5 days. We assume that the probabilities associated with song distribution do not change from day to day.

    At long last, on to the calculation: We have 32 possible day-patterns (which days the phenomenon occurs: Y/Y/Y/Y/Y, Y/Y/Y/Y/N, Y/Y/Y/N/Y, Y/Y/Y/N/N, Y/Y/N/Y/Y, Y/Y/N/Y/N, etc.), of which 3 interest us (Y/Y/N/N/Y, Y/N/Y/N/Y, Y/N/N/Y/Y). That is, regardless of the probability of song S starting during second T on any given day, the chance of it happening on 3 of 5 days, according to my interpretation of the question, is only 3/32 as high. During second T we have a 0.48224% chance of a song starting.
    Since there are 611 songs, we have 1/611th of that chance (0.00078927%) of the *particular* song starting at the right time. Multiplying that by the 3/32 from before, we get 0.000073994%, or 1 chance in 1,351,460. Pretty slim.

    There is one more wrinkle, however. The first day is really "free". Suppose the song had not played on day 1. Then Clyde would not have mentioned it. Thus, the question only becomes important because it happened the 2nd and 3rd time, not because it happened the first. After all, on day 1, *some* song started playing at *some* time, and if it happened twice more in the following 4 days, it would be important; otherwise not. Without rehashing the calculations above (which I can do if it's not obvious what changes I am making), the final chances are 1 in 675,730.

    That was part 1. Fortunately, parts 2 and 3 are easy. We know the probability that the right song will start during the right second. The probability that the right song will start during 6 right seconds is 6 times as high; during 12 right seconds, 12 times as high. That gives 1:112,622 and 1:56,311 respectively.

    Probability has some very nonintuitive areas to it, so if my explanation is unclear or seems wrong, don't hesitate to ask about it. If you take issue with any of the many assumptions I have made to answer your question, I can change them in my analysis. This has been an interesting problem, and I would be more than happy to revisit it.

    -Haversian
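
    For readers who want to check the figures, here is a minimal Python sketch that reproduces the arithmetic of the answer above as stated. It takes the answer's assumptions at face value (the Winamp library totals, the 3/32 day-pattern factor) and does not try to validate them. The answer does not show the "free first day" step; doubling the probability is an assumption made here only because it reproduces the quoted 1 in 675,730. All variable names are illustrative.

```python
from fractions import Fraction

TRACKS = 11464             # tracks in the answerer's Winamp library
TOTAL_SECONDS = 2377220    # total running time of that library, in seconds
SONGS_ON_IPOD = 611        # songs on Clyde's iPod
PATTERN_FACTOR = Fraction(3, 32)  # the answer's "3 of 32 day-patterns" factor


def one_in(p):
    """Express a probability as 'one chance in N', rounded to an integer."""
    return round(1 / p)


# Probability that *any* song starts during a given second (about 0.48224%).
p_any_start = Fraction(TRACKS, TOTAL_SECONDS)

# Probability that the *particular* song starts during that second.
p_song_start = p_any_start / SONGS_ON_IPOD

# Scaled by the day-pattern factor, as the answer does.
p_three_of_five = p_song_start * PATTERN_FACTOR

# ASSUMPTION: the "free first day" step is modelled as doubling the
# probability, which matches the quoted figure; the answer gives no derivation.
p_first_day_free = 2 * p_three_of_five

odds_before_wrinkle = one_in(p_three_of_five)   # part 1 before the wrinkle
odds_exact = one_in(p_first_day_free)           # part 1, same second
odds_6s = one_in(6 * p_first_day_free)          # part 2, 6-second spread
odds_12s = one_in(12 * p_first_day_free)        # part 3, 12-second spread

print(odds_before_wrinkle, odds_exact, odds_6s, odds_12s)
```

    Exact fractions are used instead of floats so that rounding happens only once, at the point where each result is turned into "1 in N".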

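    The two-song simplification in the answer above (one 1-minute song, one 10-minute song) can also be checked by brute force. This is a small sketch, assuming each song is picked independently with equal probability and that the first song starts at minute 0; the function name and the choice of three picks are illustrative, not part of the original answer.

```python
from fractions import Fraction
from itertools import accumulate, product

SONG_MINUTES = (1, 10)  # the short and the long song from the toy example


def start_probability(minute, picks=3):
    """Fraction of equally likely song sequences in which a new song
    starts exactly `minute` minutes into playback (minute > 0)."""
    sequences = list(product(SONG_MINUTES, repeat=picks))
    # accumulate() yields the running total of song lengths, i.e. the
    # moments at which the next song begins.
    hits = sum(1 for seq in sequences if minute in accumulate(seq))
    return Fraction(hits, len(sequences))


print(start_probability(1))  # chance that a song starts at minute 1
print(start_probability(2))  # chance that a song starts at minute 2
```

    With three picks there are 8 sequences, matching the enumeration in the answer; the 50% and 25% figures fall out of counting the sequences whose running total hits the minute in question.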

  • It happened again on the 9th day, and Clyde says he took special note of the next song - it was one he did not remember ever hearing before. Let's assume pure randomness for the purposes of the calculation. Let's also eliminate the "window" by asking simply, What are the chances that out of 611 possible songs, a certain song would be the 5th song played in a random sequence on day 1, day 2, day 5, and day 9?


  • > Let's assume pure randomness for the purposes of the calculation.

    More precisely, that each song has an equal chance of playing, regardless of what has played before?

    > Let's also eliminate the "window" by asking simply, What are the chances that out of 611 possible songs, a certain song would be the 5th song played in a random sequence on day 1, day 2, day 5, and day 9?

    Again, we assume day 1 is "free", leaving us multiplying the probability of the song *NOT* playing as #5 on days 3, 4, 6, 7, and 8 by the probability of the song playing as #5 on days 2, 5, and 9. If each song is independently chosen, the odds of it playing at any given time are 1/611; the odds of it not playing are 610/611. Thus, going through days 2 to 9 in order, we have 1/611 * 610/611 * 610/611 * 1/611 * 610/611 * 610/611 * 610/611 * 1/611 = 84459630100000 / 19423598036535983659681, or about 1 in 230 million.
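
    The product above is easy to verify with exact rational arithmetic. The sketch below assumes, as the reply does, that day 1 is "free" and that the song in slot #5 is chosen independently each day with probability 1/611; the variable names are illustrative.

```python
from fractions import Fraction

SONGS = 611
HIT_DAYS = (2, 5, 9)        # days on which the song must be 5th (day 1 is "free")

hit = Fraction(1, SONGS)    # the song is 5th on a given day
miss = 1 - hit              # the song is not 5th on a given day

# Multiply one factor per day for days 2 through 9.
p = Fraction(1)
for day in range(2, 10):
    p *= hit if day in HIT_DAYS else miss

print(p)                    # exact fraction: 610**5 / 611**8
print(round(1 / p))         # "one chance in N"
```

    Because 610 and 611 share no common factor, the fraction does not reduce, so the numerator and denominator printed here can be compared directly with the ones quoted in the reply.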






