Probability distribution of mail time of arrival

ccurtis

Hello,

Just for fun, I have collected data (40 points so far) on the daily arrival time of snail mail at my house. I applied normal-distribution statistics to the data to get an objective view of the probability of receiving mail within certain time intervals. But there is an issue: the data distribution does not appear to be bell-shaped (normal). The left tail of the curve is short and the right tail is long. There is a hard limit on how early the mail can arrive, yet the mail does arrive, albeit infrequently, well after dark. The 3-sigma interval gives an earliest arrival time that is apparently impossible, while its latest arrival time is reasonable.
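To make the 3-sigma problem concrete, here is a minimal sketch using only the Python standard library. The `arrivals` list is made-up illustrative data (not the 40 points described above), and the `hhmm` helper is just a hypothetical convenience for printing. It computes the mean ± 3 standard deviations and the sample skewness for a right-skewed set of times.

```python
import statistics

# Hypothetical arrival times in minutes after midnight (NOT the poster's data):
# most cluster shortly after an early limit, a few arrive much later.
arrivals = [815, 820, 822, 825, 828, 830, 835, 840, 845, 850,
            860, 870, 980, 1010, 1150]

mean = statistics.mean(arrivals)
sd = statistics.stdev(arrivals)            # sample standard deviation
low, high = mean - 3 * sd, mean + 3 * sd   # the "3-sigma" interval

# Sample skewness (third standardized moment); positive means a long right tail.
n = len(arrivals)
m2 = sum((x - mean) ** 2 for x in arrivals) / n
m3 = sum((x - mean) ** 3 for x in arrivals) / n
skewness = m3 / m2 ** 1.5

def hhmm(minutes):
    """Format minutes after midnight as HH:MM (24-hour clock)."""
    minutes = int(round(minutes))
    return f"{minutes // 60:02d}:{minutes % 60:02d}"

print("3-sigma interval:", hhmm(low), "to", hhmm(high))
print("earliest observed:", hhmm(min(arrivals)))
print("skewness:", round(skewness, 2))
```

With data shaped like this, the lower 3-sigma bound lands hours before the earliest observation, because the long right tail inflates the standard deviation and the symmetric interval spends half of it on the side where nothing can happen.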

Question 1: Does this mean that the normal distribution is a flawed model for this data?
Question 2: Is there a better model for this situation? I read about the Poisson distribution, but it describes rates of events, so I doubt it applies, although the curve shape looks like a fit. Only one event occurs per day, never more than one event per unit time.
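On Question 2: the Poisson distribution models counts of events per interval, whereas an arrival time is a continuous quantity with a hard lower bound. One candidate for that shape (by no means the only one; gamma or Weibull would be others) is a shifted lognormal. The sketch below is an illustration under stated assumptions: the data are made up, and `T0`, the assumed earliest possible arrival time, is a hypothetical value you would have to choose yourself.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical arrival times, minutes after midnight (NOT the poster's data).
arrivals = [815, 820, 822, 825, 828, 830, 835, 840, 845, 850,
            860, 870, 980, 1010, 1150]

# Assumed hard lower limit: the earliest the carrier could possibly arrive.
# It must be strictly earlier than every observation.
T0 = 800

# Fit a normal distribution to log(delay past T0); that is a shifted
# lognormal model for the arrival time itself.
logs = [math.log(t - T0) for t in arrivals]
log_model = NormalDist(mean(logs), stdev(logs))

def prob_between(start, end):
    """Model probability that mail arrives between start and end (minutes)."""
    def cdf(t):
        return 0.0 if t <= T0 else log_model.cdf(math.log(t - T0))
    return cdf(end) - cdf(start)

print(round(prob_between(13 * 60, 14 * 60), 3))  # 1:00p to 2:00p
print(prob_between(0, T0))                       # before the limit: 0.0
```

Unlike the plain normal model, this one assigns zero probability to anything before `T0` and keeps a long right tail. It would not capture a second hump, so treat it as a starting point rather than a final answer.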

It is certainly true that some random variables are not normally distributed. It would be a good idea to investigate other possible candidate distributions for a model with improved prediction qualities. At the same time, you should also be aware of the powerful implications of the Central Limit Theorem.

You probably don't have the same mail carrier every day. I know in our case, the carriers follow different routes, so the time can vary considerably.

Or the same number of pieces to sort and deliver.

The time of mail arrival does vary a lot. From my data so far, mail arrival spans 1:47p to 7:22p with two humps in the histogram: one huge hump in the interval from 1:47p to 2:54p and another small hump in the interval from 4:01p to 5:08p. There is also an interesting gap from 3:00p to 4:00p when mail has, so far, never arrived. Maybe that suggests a bimodal model?

I just thought it would be interesting to see if I can compute the probability that mail arrives in a particular time interval. They don't call them random variables for nothing, so that's the whole idea. The Central Limit Theorem does tend to indicate that, given enough samples, the normal distribution model should be useful, even if not the best fit, and so far it looks reasonably accurate. Thanks for that! But then, given enough samples, I don't need a model, because the samples approximate the population, from which the probability is then evident. As the samples accumulate, it will be interesting to watch how the probability calculations change, and by how much.

I also have a plot of the arrivals for each day of the week, but there is not much to note from it, other than that Monday has the widest spread in arrival time (spanning almost the entire range) and Wednesday has the least spread, clustered around 2:00p. This all started with the installation of a mailbox sensor, which makes data collection a breeze.
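The "let the samples speak" idea can be sketched without any distribution model at all: count the fraction of recorded arrivals that fall in the interval of interest. The data below are made up for illustration (not the measurements described above), and `empirical_prob` is a hypothetical helper name. The standard error uses the usual binomial approximation, which is rough for small samples.

```python
import math

# Hypothetical arrival times, minutes after midnight (NOT the poster's data).
arrivals = [815, 820, 822, 825, 828, 830, 835, 840, 845, 850,
            860, 870, 980, 1010, 1150]

def empirical_prob(samples, start, end):
    """Fraction of samples in [start, end) plus a rough binomial standard error."""
    n = len(samples)
    p = sum(start <= t < end for t in samples) / n
    se = math.sqrt(p * (1 - p) / n)
    return p, se

# Probability of arrival between 1:30p and 3:00p, straight from the samples.
p, se = empirical_prob(arrivals, 13 * 60 + 30, 15 * 60)
print(f"P = {p:.2f} +/- {se:.2f}")
```

Rerunning this as points accumulate shows how much the estimate still moves. Note that the Central Limit Theorem describes averages of many observations; the arrival time on any single day keeps its skewed, possibly two-humped shape however many days are recorded, which is one reason the empirical fraction can be a safer guide than a fitted bell curve.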

It's not purely random, as there are hard limits based on human actions: shift start and end times, plus the minimum possible travel time from the start of a shift to your location.

It sounds to me like there could be two shifts working, with the second shift delivering smaller volumes (so less variation) in the mid to late afternoon?
