Q:

A building has n floors numbered 1,2,...,n, plus a ground floor g. at the ground floor, m people get on the elevator together, and each gets off at a uniformly random one of the n floors (independently of everybody else). what is the variance of the number of floors the elevator does not stop at? (in fact, the variance of the number of floors the elevator does stop at must be the same (do you see why?) but the former is a little easier to compute.)

Accepted Solution

A:
Let [tex]X_i[/tex] be the random variable indicating whether the elevator does not stop at floor [tex]i[/tex], with

[tex]X_i=\begin{cases}1&\text{if the elevator does not stop at floor }i\\0&\text{otherwise}\end{cases}[/tex]

Let [tex]Y[/tex] be the random variable representing the number of floors at which the elevator does not stop. Then

[tex]Y=X_1+X_2+\cdots+X_{n-1}+X_n[/tex]

We want to find [tex]\mathrm{Var}(Y)[/tex]. By definition,

[tex]\mathrm{Var}(Y)=\mathbb E[(Y-\mathbb E[Y])^2]=\mathbb E[Y^2]-\mathbb E[Y]^2[/tex]

As stated in the question, there is a [tex]\dfrac1n[/tex] probability that any one person will get off at floor [tex]n[/tex] (here, [tex]n[/tex] refers to any of the [tex]n[/tex] total floors, not just the top floor). Then the probability that a person will not get off at floor [tex]n[/tex] is [tex]1-\dfrac1n[/tex]. There are [tex]m[/tex] people in the elevator, so the probability that not a single one gets off at floor [tex]n[/tex] is [tex]\left(1-\dfrac1n\right)^m[/tex].

So,

[tex]\mathbb P(X_i=x)\begin{cases}\left(1-\dfrac1n\right)^m&\text{for }x=1\\\\1-\left(1-\dfrac1n\right)^m&\text{for }x=0\end{cases}[/tex]

which means

[tex]\mathbb E[Y]=\mathbb E\left[\displaystyle\sum_{i=1}^nX_i\right]=\displaystyle\sum_{i=1}^n\mathbb E[X_i]=\sum_{i=1}^n\left(1\cdot\left(1-\dfrac1n\right)^m+0\cdot\left(1-\left(1-\dfrac1n\right)^m\right)[/tex]
[tex]\implies\mathbb E[Y]=n\left(1-\dfrac1n\right)^m[/tex]

and

[tex]\mathbb E[Y^2]=\mathbb E\left[\left(\displaystyle\sum_{i=1}^n{X_i}\right)^2\right]=\mathbb E\left[\displaystyle\sum_{i=1}^n{X_i}^2+2\sum_{1\le i<j}X_iX_j\right]=\displaystyle\sum_{i=1}^n\mathbb E[{X_i}^2]+2\sum_{1\le i<j}\mathbb E[X_iX_j][/tex]

Computing [tex]\mathbb E[{X_i}^2][/tex] is trivial since it's the same as [tex]\mathbb E[X_i][/tex]. (Do you see why?)

Next, we want to find the expected value of the following random variable, when [tex]i\neq j[/tex]:

[tex]X_iX_j=\begin{cases}1&\text{if }X_i=1\text{ and }X_j=1\\0&\text{otherwise}\end{cases}[/tex]

If [tex]X_iX_j=0[/tex], we don't care; when we compute [tex]\mathbb E[X_iX_j][/tex], the contributing terms will vanish. We only want to see what happens when both floors are not visited.

[tex]\mathbb P(X_iX_j=1)=\left(1-\dfrac2n\right)^m[/tex]
[tex]\implies\mathbb E[X_iX_j]=\left(1-\dfrac2n\right)^m[/tex]
[tex]\implies2\displaystyle\sum_{1\le i<j}\mathbb E[X_iX_j]=2n(n-1)\left(1-\dfrac2n\right)^m[/tex]

where we multiply by [tex]n(n-1)[/tex] because that's how many ways there are of choosing indices [tex]i,j[/tex] for [tex]X_iX_j[/tex] such that [tex]1\le i<j[/tex].

So,

[tex]\mathrm{Var}[Y]=n\left(1-\dfrac1n\right)^m+2n(n-1)\left(1-\dfrac2n\right)^m-n^2\left(1-\dfrac1n\right)^{2m}[/tex]