Useful Asymptotics

\( \renewcommand{\vec}[1]{\boldsymbol{#1}} \DeclareMathOperator*{\E}{\mathbb{E}} \DeclareMathOperator*{\Var}{\mathrm{Var}} \DeclareMathOperator*{\Cov}{\mathrm{Cov}} \DeclareMathOperator*{\argmin}{\mathrm{arg\,min\;}} \DeclareMathOperator*{\argmax}{\mathrm{arg\,max\;}} \def\ZZ{{\mathbb Z}} \def\NN{{\mathbb N}} \def\RR{{\mathbb R}} \def\CC{{\mathbb C}} \def\QQ{{\mathbb Q}} \def\FF{{\mathbb FF}} \def\EE{{\mathbb E}} \newcommand{\tr}{{\rm tr}} \newcommand{\sign}{{\rm sign}} \newcommand{\1}{đťź™} \require{physics} \)

1 Taylor Expansion #

Approximating \(e^x\) #

\[\begin{aligned} e^x &= 1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\dots\\ &= 1+x+O(x^2) \end{aligned} \]

This Taylor expansion holds for all \(x\), but only for small \(x\) is \(O(x^2)\) less significant than \(x\).

Approximating \(\ln (1+\frac{1}{n})\) #

For \(-1\leq x\leq 1\)

\[\ln (1+x)=x-\frac{x^2}{2}+\frac{x^3}{3}-\dots \]

When \(x=\frac{1}{n}\), we have \(\ln (1+\frac{1}{n})\sim \frac{1}{n}\).

2 Harmonic Numbers Asymptotics #

The \(n\)-th harmonic number:

\[H_n = 1+\frac{1}{2}+\frac{1}{3}+ \dots +\frac{1}{n} = \sum_{k=1}^n \frac{1}{k}\]

Upper bound: if \(k = \lceil \log_2(n+1)\rceil\), then

\[\begin{aligned} H_n &= 1+\frac{1}{2}+\frac{1}{3}+ \dots + \frac{1}{n}\\ &\leq 1+\frac{1}{2}+\frac{1}{3}+ \dots + \frac{1}{2^k-1}\\ &= 1+ \left( \frac{1}{2} + \frac{1}{3} \right) + \left( \frac{1}{4} + \frac{1}{5} + \frac{1}{6}+\frac{1}{7} \right)+ \dots + \left( \frac{1}{2^{k-1}}+\dots + \frac{1}{2^{k}-1} \right)\\ &\leq 1+ \left( \frac{1}{2} + \frac{1}{2} \right) + \left( \frac{1}{4} + \frac{1}{4} + \frac{1}{4}+\frac{1}{4} \right)+ \dots + \left( \frac{1}{2^{k-1}}+\dots + \frac{1}{2^{k-1}} \right)\\ &=\underbrace{1+1+\dots +1}_k = k \end{aligned} \]

Therefore \(H_n \leq \lceil \log_2(n+1)\rceil \sim \log_2 n\)

Lower bound is similar: if \(k = \lfloor \log_2(n)\rfloor\), then

\[\begin{aligned} H_n &= 1+\frac{1}{2}+\frac{1}{3}+ \dots + \frac{1}{n}\\ &\geq 1+\frac{1}{2}+\frac{1}{3}+ \dots + \frac{1}{2^k}\\ &= 1+ \frac{1}{2} + \left( \frac{1}{3} + \frac{1}{4} \right) + \left( \frac{1}{5} + \frac{1}{6}+\frac{1}{7} + \frac{1}{8}\right) + \dots + \left( \frac{1}{2^{k-1}+1}+\dots + \frac{1}{2^{k}} \right)\\ &\geq 1+ \frac{1}{2} + \left( \frac{1}{4} + \frac{1}{4} \right) + \left( \frac{1}{8} + \frac{1}{8} + \frac{1}{8}+\frac{1}{8} \right)+ \dots + \left( \frac{1}{2^{k}}+\dots + \frac{1}{2^{k}} \right)\\ &=1 + \underbrace{\frac{1}{2}+\frac{1}{2}+\dots +\frac{1}{2}}_k = 1 + \frac{1}{2}k \end{aligned} \]

Therefore \(H_n \geq 1+\frac{1}{2}\lfloor \log_2 n\rfloor \sim \frac{1}{2}\log_2 n\)

In summary, \[H_n = \Theta(\log n)\]

We can also approximate \(H_n\) by an integral.

The integral \(\int_{1}^n \frac{1}{x} dx = \ln n -\ln 1 = \ln n\) has lower bound \(\frac{1}{2}+\frac{1}{3}+\dots + \frac{1}{n}=H_n-1\), and upper bound \(1+\frac{1}{2}+\dots +\frac{1}{n-1} = H_{n-1}\), this can be easily seen by drawing the graph, so we have:

\[H_n-1\leq \ln n \leq H_{n-1} \Leftrightarrow \ln (n+1) \leq H_n \leq 1+\ln n\]

We can write down a slightly less accurate but prettier estimate: \(\ln n \leq H_n \leq 1+\ln n\)

3 Birthday Paradox #

Randomly put \(n\) balls into \(m\) bins (\(n\leq m\)). What is the probability that no two balls land in the same bin?

The probability of no collisions is:

\[p_{n,m} = 1 \cdot \left( 1-\frac{1}{m} \right) \left( 1-\frac{2}{m} \right) \cdots \left( 1-\frac{n-1}{m} \right)\]

Using \(e^x\geq 1+x\), we get a supper bound:

\[\begin{aligned} p_{n,m} &\leq e^{-\frac{1}{m}}\cdot e^{-\frac{2}{m}} \cdots e^{-\frac{n-1}{m}} \\ &= \exp \left( -\frac{1}{m}-\frac{2}{m}- \dots -\frac{n-1}{m} \right) \\ &= \exp \left( -\frac{n(n-1)}{2m} \right) \end{aligned} \]

Roughly, we say \(p_{n,m}\leq \exp \left( -\frac{n^2}{2m} \right)\). (It doesn’t follow from the above, but just believe it.)

We can answer the question “when is \(p_{n.m}\approx \frac{1}{2}\)” by solving \(n\), obtaining \(n\sim \sqrt{2 \ln 2} \cdot \sqrt{m}\).

For the lower bound, since the Taylor expansion:

\[\ln (1+x)=x-\frac{x^2}{2}+\frac{x^3}{3}-\dots \]

We have \(\ln (1+x)\geq x-\frac{x^2}{2}\), so \(1+x \geq \exp (x-O(x^2))\). It follows that:

\[\begin{aligned} p_{n,m} &\geq e^{-\frac{1}{m}-O(\frac{1}{m^2})}\cdot e^{-\frac{2}{m}-O(\frac{2^2}{m^2})} \cdots e^{-\frac{n-1}{m}-O(\frac{(n-1)^2}{m^2})} \\ &= \exp \left( -\frac{1}{m}-\frac{2}{m}- \dots -\frac{n-1}{m} \right) \exp \left( \frac{1}{m^2} O(1^2+2^2+\dots+(n-1)^2) \right) \\ &= \exp \left( -\frac{n(n-1)}{2m} \right) \exp \left( -O\left(\frac{n^3}{m^2}\right) \right) \end{aligned} \]

This is the same thing as the upper bound, multiplied by an error factor. When \(n=\Theta(\sqrt{m})\), this error factor is \(\exp \left( -O \left( \frac{1}{\sqrt{m}} \right) \right)\), which is \(1-O \left( \frac{1}{\sqrt{m}} \right)\).

So when \(n=\Theta(\sqrt{m})\), the error factor doesn’t matter too match. This implies that when there are \(O(\sqrt{m})\) balls and \(m\) bins, there is around 50% chance to get a collision.

4 Factorial Asymptotics (Stirling’s Formula) #

Stirling’s formula: \[n! = \Theta(\sqrt{n})\cdot \left( \frac{n}{e} \right)^n\]

A more precise variant states that: \[n! = \sqrt{2\pi n} \cdot \left( \frac{n}{e} \right)^n \left( 1 \pm O \left( \frac{1}{n} \right) \right)\]

5 Binomial Coefficients Asymptotics #

6 PadĂ© Approximant #