Written by Li Zhuohua
on September 1, 2021

Quantum Computation and Information

Table of Contents

1 Preliminaries
- 1.1 Quantum States
2 Rotate, Compute, Rotate
3 Measurements
- 3.1 General Measurements
- 3.2 Projective Measurements
- 3.3 POVM Measurements
- 3.4 Partial Measurements
4 Trace, Partial Trace, and Reduced Density Matrix
- 4.1 Trace
- 4.2 Partial Trace
- 4.3 Reduced Density Matrix
5 Quantum Operations
- 5.1 Describing An Open Quantum System
- 5.2 Operator-Sum Representation
6 Circuit Implementation
7 Quantum Gates
- 7.1 Pauli Group over \(n\) Qubits: \(\mathcal{P}_n\)
- 7.2 Clifford Group over \(n\) Qubits: \(\mathcal{C}_n\)
8 Measures for Quantum Information
- 8.1 Fidelity
- 8.2 Trace Distance
9 Useful Transformations
- 9.1 Hadamard Transform
- 9.2 Quantum Fourier Transform
10 Algorithms
- 10.1 Quantum Teleportation
- 10.2 Fourier Sampling
- 10.3 Phase Kickback
- 10.4 Amplitude Amplification
- 10.5 Grover’s Algorithm
- 10.6 Phase Estimation
- 10.7 Amplitude Estimation
- 10.8 Quantum Counting
- 10.9 Twirling Quantum Channels
- 10.10 Entanglement Distillation (Purification)
- 10.11 Quantum Monte Carlo
- 10.12 Randomized Benchmarking
11 Quantum Adversary Method in Quantum Query Complexity
12 Number Theory

1 Preliminaries #

1.1 Quantum States #

Hilbert spaces \(\mathcal{H}\) #

Definition: an inner product space in which each Cauchy sequence of vectors has a limit.

Sounds daunting, however, in quantum computing we only care about finite dimensional space. In this case a Hilbert space degenerates to exactly a “normal” inner product space, which is what we learned in our undergraduate level linear algebra course.

Dirac Notation #

Normally we write vectors like the following:

Column vector: \(x\)
Conjugate transpose of \(x\) (which is a row vector): \(x^{\dagger}\)
Inner product of \(x\) and \(y\): \(x^{\dagger}y\) or \(\braket{x}{y}\)

A reminder: Doing a inner product in complex vector space requires taking a “conjugate transpose” instead of simply a “transpose”.

In quantum computing, people usually use a different notation to represent vectors:

Column vector: \(\ket{\varphi}\)
Conjugate transpose of \(\ket{\varphi}\) (which is a row vector): \(\bra{\varphi}\)
Inner product of \(\ket{\varphi}\) and \(\ket{\psi}\): \(\braket{\varphi}{\psi}\)

The advantage of Dirac notation is that we can easily know whether a value is a column vector, a row vector, or a scalar. Read the article by Prof. E. F. Redish for more information.

2 Rotate, Compute, Rotate #

Quantum Mechanics = probability with minus signs

Probabilistic Computing = deterministic computing + coin flip instruction

Strongly Believed: Every (compute-a-function) problem in P probabilistically, also in P deterministically. (pseudo-random generator)

Probabilistic computing gives speedups over deterministic computing, from one P algorithm to another for many problems.

Quantum computing gives speedups over probabilistic computing, from one P algorithm to another for many problems. (E.g., Grover)

(Seems, because factoring may be in P) Quantum computing gives exponential speedups over probabilistic computing for factoring

Strongly Believed: quantum computing does not give exponential speedups for NP-complete problems. (E.g., SAT)

Strongly believed: NP != BPP, factoring is not NP complete

3 Measurements #

3.1 General Measurements #

Quantum measurement operators: \(\{M_m\}\) such that \(\sum_m M_m^{\dagger} M_m = I\).

Measurement of a single qubit in the computational basis: \(M_0=\ket{0}\bra{0}\), \(M_1=\ket{1}\bra{1}\).

For Pure States #

If the state is \(\ket{\psi}\), result \(m\) occurs with probability:

\[p(m)= \bra{\psi} M_m^{\dagger} M_m \ket{\psi}\]

and the state after the measurement snaps to:

\[\frac{M_m \ket{\psi}}{\sqrt{p(m)}}\]

For Mixed States #

If the state is \(\rho\), result \(m\) occurs with probability:

\[p(m)=\tr \left( M_m^{\dagger} M_m \rho \right)\]

and the state after the measurement snaps to:

\[\frac{M_m \rho M_m^{\dagger}}{p(m)}\]

3.2 Projective Measurements #

Most literature only introduces this kind of measurements.

Observable \(M = \sum_m m P_m\), a Hermitian operator. \(P_m\) is a orthogonal projection matrix, i.e., \(P^2 = P^{\dagger} = P\).

In quantum mechanics, the measurement of an observable \(M\) prepares an eigenstate of \(M\), and the observer learns the value of the corresponding eigenvalue.

“Measure in basis \(\ket{m}\)” means \(P_m = \ket{m}\bra{m}\). Often people simply list a set of basis instead of giving the observable \(M\).

For Pure States #

If the state is \(\ket{\psi}\), result \(m\) occurs with probability:

\[p(m)= \|P_m \ket{\psi}\|^2 = \bra{\psi} P_m^{\dagger}P_m \ket{\psi} = \bra{\psi} P_m \ket{\psi}\]

and the state after the measurement snaps to:

\[\frac{P_m \ket{\psi}}{\sqrt{p(m)}}\]

Note

If \(P_m=\ket{m}\bra{m}\), then

\[p(m)=\bra{\psi} \ket{m} \bra{m} \ket{\psi} = |\bra{m} \ket{\psi}|^2\]

and the state snaps to:

\[\frac{\bra{m} \ket{\psi}\ket{m}}{|\bra{m} \ket{\psi}|}\]

If many identically prepared states \(\ket{\psi}\) are measured, then the expectation value of the outcomes is

\[\langle M \rangle = \sum_{m} m \cdot p(m) = \sum_{m} m \bra{\psi} P_m \ket{\psi} = \bra{\psi} M \ket{\psi}\]

For Mixed States #

If the state is \(\rho\), result \(m\) occurs with probability:

\[p(m)=\tr \left( P_m \rho \right)\]

and the state after the measurement snaps to:

\[\frac{P_m \rho P_m^{\dagger}}{p(m)}\]

Note

If \(P_m=\ket{m}\bra{m}\), then

\[p(m) = \tr(\ket{m}\bra{m} \rho) = \tr(\bra{m} \rho \ket{m}) = \bra{m} \rho \ket{m}\]

and the state snaps to:

\[\ket{m}\bra{m}\]

Note

Since a density matrix \(\rho\) can be written as an ensemble of pure states:

\[\rho = \sum_i p_i \ket{\psi_i} \bra{\psi_i} \]

If the observable \(M=\sum_{m} m \ket{\psi_m} \bra{\psi_m}\), then \(p_i=p(m) = \tr(\ket{\psi_m}\bra{\psi_m} \rho)\).

3.3 POVM Measurements #

Positive operator \(E_m \triangleq M_m^{\dagger} M_m\), such that \(\sum_m E_m = I\).

For Pure States #

If the state is \(\ket{\psi}\), result \(m\) occurs with probability:

\[p(m)= \bra{\psi} E_m \ket{\psi}\]

and POVM measurements do not consider the state after the measurement.

For Mixed States #

If the state is \(\rho\), result \(m\) occurs with probability:

\[p(m)=\tr \left( E_m \rho \right)\]

and POVM measurements do not consider the state after the measurement.

3.4 Partial Measurements #

We only use projective measurements as examples.

For state \(\ket{\psi} = \sum_{i,j} \alpha_{i,j} \ket{u_i}\ket{v_j}\), where \(\ket{u_i}\), \(\ket{v_j}\) are orthonormal bases. We only measure the first qubit using observable \(M=\sum_{m} m P_m\). Since the projection only applies on the first qubit, the operator we use is actually \(P_m \otimes I = \ket{m}\bra{m} \otimes I\).

Applying the projection:

\[\begin{split} \ket{\psi’}&= (P_m \otimes I) \ket{\psi}\\ &= (\ket{m}\bra{m} \otimes I) \left( \sum_{i,j} \alpha_{i,j} \ket{u_i}\ket{v_j} \right)\\ &= \sum_{j} \alpha_{m,j} \ket{m}\ket{v_j} \end{split} \]

The probability that the outcome is \(m\):

\[\begin{split} \Pr\left[m\right] &= \bra{\psi’}\ket{\psi’}\\ &= \left( \sum_{i} \alpha_{m,i}^{*} \bra{m}\otimes\bra{v_i} \right) \left( \sum_{j} \alpha_{m,j} \ket{m}\otimes\ket{v_j} \right)\\ &= \sum_{i,j}\alpha_{m,i}^{*} \alpha_{m,j} 1 \otimes \bra{v_i}\ket{v_j}\\ &= \sum_{i} |\alpha_{m,i}|^2 \end{split} \]

4 Trace, Partial Trace, and Reduced Density Matrix #

4.1 Trace #

The trace of an operator \(A\) is defined as the sum of its diagonal entries: \[\tr(A) = \sum_{i} \bra{i}A\ket{i}\] The trace is the same no matter which basis you use, so it is a property of the operator, not of the basis you choose.

For a composite system \(AB\), let \(\ket{a}\) and \(\ket{b}\) be basis sets for \(A\) and \(B\), for an operator \(O=O_A \otimes O_B\), the trace of \(O\) is: \[\begin{split} \tr(O) &= \sum_{a,b}\bra{a,b}O\ket{a,b}\\ &= \sum_{a,b} (\bra{a} \otimes \bra{b})(O_A\otimes O_B)(\ket{a}\otimes\ket{b})\\ &= \sum_{a,b} \bra{a}O_A\ket{a} \bra{b}O_B\ket{b}\\ &= \sum_a \bra{a}O_A\ket{a} \sum_{b} \bra{b}O_B\ket{b} \end{split} \] The result is simply the product of traces of \(O_A\) and \(O_B\) in their respective Hilbert spaces, i.e., \[\tr(A\otimes B) = \tr(A)\tr(B)\]

4.2 Partial Trace #

Similar to the trace, a partial trace is an operator that only traces over a part of the system. For example, if we “trace over B”, we eliminate the variables pertaining to B and what we get is an operator acting only on the Hilbert space of A. \[\begin{split} \tr_B(O_A\otimes O_B) &= \sum_b (I_A \otimes \bra{b})(O_A\otimes O_B)(I_A\otimes\ket{b})\\ &= \sum_b O_A \bra{b}O_B\ket{b}\\ &= \tr(O_B)O_A \end{split} \] More formally, we can define partial trace as follows (from the lecture notes by Dave Bacon): \[\tr_B(\rho) = \sum_{i=1}^{\dim \mathcal{H}_{B}} \bra{\phi_i}_B \rho \ket{\phi_i}_B\] where the quantum system is defined in Hilbert space \(\mathcal{H}=\mathcal{H}_A \otimes \mathcal{H}_B\), and \(\ket{\phi_i}\) is a basis for \(\mathcal{H}_B\). \(\ket{\phi_i}_B \triangleq I_A \otimes \ket{\phi_i}\). Note that many literature simply omits the identity \(I_A\), which is a bit confusing.

Another definition in Nielsen & Chuang, which is not very intuitive in my opinion:

\[\tr_B \left( \ket{a_1}\bra{a_2} \otimes \ket{b_1}\bra{b_2} \right) \triangleq \ket{a_1}\bra{a_2} \tr \left( \ket{b_1}\bra{b_2} \right)\]

The two definitions are equivalent.

Proof

If \(\rho = \ket{a_1}\bra{a_2} \otimes \ket{b_1}\bra{b_2}\), then from the first definition:

\[\begin{split} &\tr_B \left( \ket{a_1}\bra{a_2} \otimes \ket{b_1}\bra{b_2} \right)\\ =& \sum_i (I_A \otimes \bra{\phi_i}) \ket{a_1}\bra{a_2} \otimes \ket{b_1}\bra{b_2} (I_A \otimes \ket{\phi_i})\\ =& \sum_i \ket{a_1}\bra{a_2} \otimes \braket{\phi_i}{b_1} \braket{b_2}{\phi_i}\\ =& \ket{a_1}\bra{a_2} \otimes \sum_i \braket{\phi_i}{b_1} \braket{b_2}{\phi_i}\\ =& \ket{a_1}\bra{a_2} \tr \left( \ket{b_1} \bra{b_2} \right) \end{split} \]

4.3 Reduced Density Matrix #

Suppose we have physical systems A and B, whose state is described by a density operator \(\rho^{AB}\). The reduced density operator for system A is defined by: \[\rho^A \triangleq \tr_B(\rho^{AB})\]

5 Quantum Operations #

A quantum operation is just a linear map that maps density matrices to density matrices: \[\rho’ = \mathcal{E}(\rho)\] It is a general tool for describing the evolution of quantum systems, for example, unitary transformations and measurements are all quantum operations.

Two equivalent representation of quantum operators:

5.1 Describing An Open Quantum System #

Suppose we have a system in state \(\rho\), which is sent into a box which is coupled to an environment with state \(\rho_{\text{env}}\). After the box’s transformation \(U\) the system no longer interacts with the environment, and thus we perform a partial trace over the environment to obtain the reduced state of the system alone: \[\mathcal{E}(\rho)=\tr_{\text{env}} \left[ U(\rho \otimes \rho_{\text{env}}) U^{\dagger}\right]\]

5.2 Operator-Sum Representation #

Another way of representing quantum operations is called the Operator-Sum Representation: \[\mathcal{E}(\rho)=\sum_k E_k \rho E_k^{\dagger} \quad \text{with } \sum_k E_k^{\dagger}E_k=I\] The operators \(\set{E_k}\) are known as operation elements or Kraus operators for the quantum operation \(\mathcal{E}\).

The derivation of equivalence between the two representations, i.e., Equation (8.10) in the textbook by Nielsen & Chuang is not trivial:

From Equation (8.6):

\[\mathcal{E}(\rho)=\tr_B \left[ U(\rho \otimes \rho_B) U^{\dagger}\right]\]

Let \(\ket{e_k}\) be an orthonormal basis for system \(B\). Assume system \(B\) has the initial state \(\rho_B = \ket{e_0}\bra{e_0}\).

To derive equation (8.10), the trick is the following fact:

\[\begin{split} &\rho \otimes \ket{e_0}\bra{e_0}\\ =& (\rho \otimes I_B) (I_A \otimes \ket{e_0}) (I_A \otimes \bra{e_0})\\ =& (\rho I_A \otimes I_B \ket{e_0}) (I_A \otimes \bra{e_0})\\ =& (I_A \rho \otimes \ket{e_0} \cdot 1) (I_A \otimes \bra{e_0})\\ =& (I_A \otimes \ket{e_0}) (\rho \otimes 1) (I_A \otimes \bra{e_0})\\ =& (I_A \otimes \ket{e_0}) \rho (I_A \otimes \bra{e_0}) \end{split} \]

Therefore, equation (8.10):

\[\begin{split} \mathcal{E}(\rho)&=\tr_B \left[ U(\rho \otimes \ket{e_0}\bra{e_0}) U^{\dagger}\right]\\ &= \sum_k (I_A \otimes \bra{e_k}) \left[ U \left( \rho \otimes \ket{e_0}\bra{e_0} \right) U^{\dagger} \right] (I_A \otimes \ket{e_k})\\ &= \sum_k \underbrace{(I_A \otimes \bra{e_k}) U (I_A \otimes \ket{e_0})}_{E_k} \rho \underbrace{(I_A \otimes \bra{e_0}) U^{\dagger} (I_A \otimes \ket{e_k})}_{E_k^{\dagger}}\\ \end{split} \]

Reference: StackExchange.

The derivation of equation (8.7) is also based on this.

6 Circuit Implementation #

6.1 Implementation \(Q_{F}\) #

For \(F: \{0,1\}^n \to \{0,1\}^m\) (\(m\) is usually 1)

Implements:

\[\ket{x}\ket{b} \stackrel{Q_F}{\longrightarrow} \ket{x}\ket{b\oplus F(x)}\]

where \(x \in \{0,1\}^n, b \in \{0,1\}^m\), \(\oplus\) is bitwise XOR.

6.2 Sign-implementation \(Q_F^{\pm}\) #

For \(F: \{0,1\}^n \to \{0,1\}\) (only for \(m=1\))

Implements:

\[\ket{x} \stackrel{Q_F^{\pm}}{\longrightarrow} (-1)^{F(x)}\ket{x}\]

where \(x \in \{0,1\}^n\).

6.3 Quantum Random Access Memory (QRAM) #

Given a superposition of address in the address register \(a\), QRAM returns a superposition of data in the data register \(d\), correlated to the address register. Let \(D_i\) be the content of the \(i\)th memory cell. This is shown in the formula: \[\sum_{i} \alpha_i \ket{i}_a \longrightarrow \sum_{i} \alpha_i \ket{i}_a \ket{D_i}_d\]

7 Quantum Gates #

7.1 Pauli Group over \(n\) Qubits: \(\mathcal{P}_n\) #

\(\mathcal{P}_1\) on 1 qubit has 16 elements (2×2 unitary matrices) consisting of all the Pauli matrices: \[I=\begin{bmatrix}1&0\\0&1\end{bmatrix}, \quad X=\begin{bmatrix}0&1\\1&0\end{bmatrix}, \quad Y=\begin{bmatrix}0&-i\\i&0\end{bmatrix}, \quad Z=\begin{bmatrix}1&0\\0&-1\end{bmatrix}\] together with the products of these matrices with factors \(\pm 1\) and \(\pm i\).

Therefore,

\[\mathcal{P}_1 = \{\pm I,\pm iI,\pm X,\pm iX,\pm Y,\pm iY,\pm Z,\pm iZ\}\]

\(\mathcal{P}_n\) is generated by the operators described above applied to each of \(n\) qubits in the tensor product Hilbert space \((\CC^2)^{\otimes n}\).

Note

\(X\) gate is like a “NOT” gate, \(\ket{0} \mapsto \ket{1}\) and \(\ket{1} \mapsto \ket{0}\).

\(Z\) gate multiplies -1 if input is \(\ket{1}\).

\(Y\) gate \(\ket{0}\mapsto i\ket{1}\), and \(\ket{1}\mapsto -i\ket{0}\).

7.2 Clifford Group over \(n\) Qubits: \(\mathcal{C}_n\) #

The Clifford group is defined as the group of unitaries that transforms Paulis into other Paulis:

\[\mathcal{C}_n = \set{V \in U(2^n): V P V^{\dagger} \in \mathcal{P}_n, \forall P \in \mathcal{P}_n}\]

8 Measures for Quantum Information #

8.1 Fidelity #

Fidelity of Quantum States #

Given two density matrix \(\rho\) and \(\sigma\), the fidelity is generally defined as the quantity \[F(\rho, \sigma) = \left(\operatorname{tr} \sqrt{\sqrt\rho \sigma\sqrt\rho}\right)^2\]

Note that for a pure state \(\ket{\psi}\), the square root of its density matrix is the density matrix itself (easy to verify).

Usually we compute the fidelity between a pure state \(\ket{\psi}\) and an arbitrary mixed state \(\rho\).

\[\begin{split} F(\ket{\psi}, \rho) &= \left( \tr \sqrt{(\ket{\psi}\bra{\psi})^{\frac{1}{2}} \rho (\ket{\psi}\bra{\psi})^{\frac{1}{2}}} \right)^2\\ &= \left( \tr \sqrt{\ket{\psi}\bra{\psi} \rho \ket{\psi}\bra{\psi}} \right)^2\\ &= \bra{\psi} \rho \ket{\psi} \end{split} \]

Note

In Nielsen & Chuang, the definition does not have the square.

Fidelity of Quantum Channels #

The average fidelity of a quantum channel \(\mathcal{E}\) is defined by \[F(\mathcal{E})=\int \dd{\psi} \bra{\psi} \mathcal{E}(\ket{\psi}\bra{\psi})\ket{\psi}\] where the integral is over the uniform (Haar) measure on state space, normalized so \(\int \dd{\psi}=1\).

8.2 Trace Distance #

9 Useful Transformations #

9.1 Hadamard Transform #

There are several kinds of notations (denote \(N=2^n\)): \[\begin{split} H^{\otimes n} \ket{x} &= H x_1 \otimes H x_2 \otimes \dots \otimes H x_n\\ &= \bigotimes_{i=1}^n \frac{1}{\sqrt{2}} \left( \ket{0} + (-1)^{x_i}\ket{1} \right)\\ &= \frac{1}{\sqrt{2^n}} \sum_{s \in \{0,1\}^{n}} (-1)^{\sum_{i=1}^{n} s_i x_i} \ket{s} \\ &= \frac{1}{\sqrt{2^n}} \sum_{s \in \{0,1\}^{n}} (-1)^{\sum_{i=1}^{n} s_i x_i \mod 2} \ket{s} \\ &= \frac{1}{\sqrt{2^n}} \sum_{s \in \{0,1\}^{n}} (-1)^{\mathrm{XOR}_s(x)} \ket{s} \\ &= \frac{1}{\sqrt{2^n}} \sum_{s \in \{0,1\}^{n}} (-1)^{s \cdot x} \ket{s}\\ &= \frac{1}{\sqrt{N}} \sum_{s=0}^{N-1} (-1)^{s \cdot x} \ket{s} \end{split} \]

Where \(\text{XOR}_s(x)\) is defined as XOR of bits \(x_i\) where \(s_i=1\). I.e., \(\text{XOR}_s(x)=\left(\sum_{i: s_i=1}^n x_i\right) \mod 2 = \left(\sum_{i=1}^n s_ix_i\right) \mod 2\).

Denote \(\ket{g} \triangleq \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \ket{x}\), where \(g: \{0,1\}^n \to \CC\) and satisfies \(\frac{1}{N} \sum_{x=0}^{N-1} |g(x)|^2=1\) (so \(\ket{g}\) is really a quantum state).

Define \(\mathcal{X}_s(x) \triangleq (-1)^{s \cdot x}\), then the above equation can be written as

\[H^{\otimes n} \ket{s} = \ket{\mathcal{X}_{s}}\]

Note that \(H^{\otimes n}\) is its own inverse, so we also have

\[H^{\otimes n} \ket{\mathcal{X}_{s}} = \ket{s}\]

Using this transform, we can get some useful results:

\[\begin{split} H^{\otimes n} \ket{g} &= H^{\otimes n} \left( \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \ket{x} \right) \\ &= \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) H^{\otimes n}\ket{x}\\ &= \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \frac{1}{\sqrt{N}} \sum_{s=0}^{N-1} (-1)^{s \cdot x} \ket{s}\\ &= \frac{1}{N} \sum_{s=0}^{N-1} \sum_{x=0}^{N-1} (-1)^{s \cdot x} g(x) \ket{s}\\ &= \sum_{s=0}^{N-1} \hat{g}(s) \ket{s}\\ \end{split} \]

where \(\hat{g}(s)\) is the “strength” of pattern \(\ket{\mathcal{X}_s}\) for \(\ket{g}\), defined as

\[\begin{split} \hat{g}(s) &\triangleq \braket{\mathcal{X}_s}{g}\\ &= \frac{1}{N} \sum_{x=0}^{N-1} \mathcal{X}_s(x)^{\dagger} g(x)\\ &= \frac{1}{N} \sum_{x=0}^{N-1} (-1)^{s \cdot x} g(x) \end{split} \]

Again multiply \(H^{\otimes n}\) on both sides, we have

\[\ket{g} = \sum_{s=0}^{N-1} \hat{g}(s) H^{\otimes n} \ket{s} = \sum_{s=0}^{N-1} \hat{g}(s) \ket{\mathcal{X}_s}\]

In summary, \(H^{\otimes n}\) transforms \(\ket{g}\) from \(\mathcal{X}\)-basis into standard basis (actually Boolean Fourier transform):

\[\ket{g} = \sum_{s=0}^{N-1} \hat{g}(s) \ket{\mathcal{X}_s} \stackrel{H^{\otimes n}}{\longrightarrow} \sum_{s=0}^{N-1} \hat{g}(s) \ket{s}\]

Also it transforms it back (Boolean Fourier inverse transform).

9.2 Quantum Fourier Transform #

Define \(N\) functions \(\mathcal{X}_0, \mathcal{X}_2, \dots, \mathcal{X}_{N-1}: \ZZ_N \to \CC\),

\[\mathcal{X}_s(x) = \omega_N^{s\cdot x}\]

where \(\omega_N = e^{\frac{2\pi i}{N}}\).

Facts

\(\mathcal{X}_0(x) \equiv 1\)

\(\mathcal{X}_s(x)^N=1\), i.e., it is an \(N\)-th root of unity

\(\mathcal{X}_s(x)^{*} = \left( \omega_N^{s \cdot x} \right)^{*} = \omega_N^{-s \cdot x} = \mathcal{X}_{-s}(x)\)

\(\mathcal{X}_s(x) = \mathcal{X}_x(s)\)

\(\ket{\mathcal{X}_0}, \ket{\mathcal{X}_2}, \dots, \ket{\mathcal{X}_{N-1}}\) are orthonormal.

Notation (Important)

The conventions for the name of “QFT” and “inverse QFT” vary. This is sometimes annoying because Nielsen & Chuang uses notations where QFT has the same effect as the classical inverse Discrete Fourier Transform (DFT), see here for details.

So in the following we will only use \(\text{DFT}\) instead of \(\text{QFT}\). Note that \(\text{DFT}\) means inverse QFT, and \(\text{DFT}^{\dagger}\) means QFT if you follow Nielsen & Chuang’s notation.

Define the inverse DFT matrix (or QFT matrix) as

\[\begin{split} \text{DFT}_N^{\dagger} &\triangleq \begin{bmatrix} \rvert & & \rvert\\ \ket{\mathcal{X}_0} & \dots & \ket{\mathcal{X}_{N-1}}\\ \rvert & & \rvert \end{bmatrix}\\ &= \frac{1}{\sqrt{N}} \begin{bmatrix} \omega_N^0 & \omega_N^0 & \omega_N^0 & \dots & \omega_N^0\\ \omega_N^0 & \omega_N^1 & \omega_N^2 & \dots & \omega_N^{N-1}\\ \omega_N^0 & \omega_N^2 & \omega_N^4 & \dots & \omega_N^{2(N-1)}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ \omega_N^0 & \omega_N^{N-1} & \omega_N^{2(N-1)} & \dots & \omega_N^{(N-1)(N-1)} \end{bmatrix} \end{split} \]

Therefore if \(\ket{x}\) is a basis state, i.e., it is a vector where only the \(x\)-th entry is 1 (index starting with 0), we have the inverse Quantum Fourier Transform:

\[\text{DFT}_N \ket{x} = \frac{1}{\sqrt{N}} \sum_{s=0}^{N-1} \mathcal{X}_s(x)^{*} \ket{s}\]

I.e., \(\text{DFT}_N \ket{s} = \ket{\mathcal{X}^{*}_s}\)

And the Quantum Fourier Transform:

\[\text{DFT}^{\dagger}_N \ket{x} = \frac{1}{\sqrt{N}} \sum_{s=0}^{N-1} \mathcal{X}_s(x) \ket{s}\]

I.e., \(\text{DFT}^{\dagger}_N \ket{s} = \ket{\mathcal{X}_s}\)

An equivalent definition is: (See proof in Nielsen & Chuang, formula (5.5) – (5.10))

\[\text{DFT}^{\dagger}_N \ket{x} = \frac{1}{\sqrt{N}} \left( \ket{0}+e^{\frac{2\pi i}{2}x} \ket{1} \right) \otimes \left( \ket{0}+e^{\frac{2\pi i}{2^2}x} \ket{1} \right) \otimes \dots \otimes \left( \ket{0}+e^{\frac{2\pi i}{2^{n-1}}x} \ket{1} \right) \otimes \left( \ket{0}+e^{\frac{2\pi i}{2^n}x} \ket{1} \right)\]

So the inverse QFT is: \[\text{DFT}_N \ket{x} = \frac{1}{\sqrt{N}} \left( \ket{0}+e^{-\frac{2\pi i}{2}x} \ket{1} \right) \otimes \left( \ket{0}+e^{-\frac{2\pi i}{2^2}x} \ket{1} \right) \otimes \dots \otimes \left( \ket{0}+e^{-\frac{2\pi i}{2^{n-1}}x} \ket{1} \right) \otimes \left( \ket{0}+e^{-\frac{2\pi i}{2^n}x} \ket{1} \right)\]

Similar with Hadamard Transform, we have

\[\begin{split} \text{DFT}_N \ket{g} &= \text{DFT}_N \left( \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \ket{x} \right) \\ &= \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \text{DFT}_N\ket{x}\\ &= \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \frac{1}{\sqrt{N}} \sum_{s=0}^{N-1} \mathcal{X}_s(x)^{*} \ket{s}\\ &= \frac{1}{N} \sum_{s=0}^{N-1} \sum_{x=0}^{N-1} \mathcal{X}_s(x)^{*} g(x) \ket{s}\\ &= \sum_{s=0}^{N-1} \hat{g}(s) \ket{s}\\ \end{split} \]

where \[\begin{split} \hat{g}(s) &\triangleq \braket{\mathcal{X}_s}{g}\\ &= \frac{1}{N} \sum_{x=0}^{N-1} \mathcal{X}_s(x)^{\dagger} g(x) \end{split} \]

Taking the inverse DFT on both sides, we have

\[\ket{g} = \sum_{s=0}^{N-1} \hat{g}(s) \text{DFT}^{\dagger}\ket{s} = \sum_{s=0}^{N-1} \hat{g}(s) \ket{\mathcal{X}_s}\]

In summary, the Fourier Transform and the Inverse Transform convert functions between two bases, as shown in the table

	Original Function \(\ket{g}\)	Transformed Function \(\text{DFT}_N \ket{g}\)
Std. Basis	\(\displaystyle \frac{1}{\sqrt{N}} \sum_{x=0}^{N-1} g(x) \ket{x}\)	\(\displaystyle\sum_{s=0}^{N-1} \hat{g}(s) \ket{s}\)
\(\ket{\mathcal{X}_s}\) Basis	\(\displaystyle\sum_{s=0}^{N-1} \hat{g}(s) \ket{\mathcal{X}_s}\)	\(\displaystyle\frac{1}{\sqrt{N}} \sum_{s=0}^{N-1} g(s) \ket{\mathcal{X}^{*}_s}\)

10 Algorithms #

10.1 Quantum Teleportation #

The essential operation in quantum teleportation is Bell state measurement.

Four Bell states can be obtained by starting with any of the four Bell states and applying either the identity or one of the Pauli matrices:

\[\begin{split} (I \otimes \sigma_X) \left( \frac{\ket{01} - \ket{10}}{\sqrt{2}} \right) &= \frac{1}{\sqrt{2}}(\ket{00} - \ket{11})\\ (I \otimes \sigma_Z) \left( \frac{\ket{01} - \ket{10}}{\sqrt{2}} \right) &= -\frac{1}{\sqrt{2}}(\ket{01} + \ket{10})\\ (I \otimes \sigma_Y) \left( \frac{\ket{01} - \ket{10}}{\sqrt{2}} \right) &= -\frac{i}{\sqrt{2}}(\ket{00} + \ket{11})\\ (I \otimes I) \left( \frac{\ket{01} - \ket{10}}{\sqrt{2}} \right) &= \frac{1}{\sqrt{2}}(\ket{01} - \ket{10})\\ \end{split} \]

They are orthonormal so form a basis. For convenience, we can ignore the global phase and define:

\[\begin{split} \ket{\Phi^+} &= \frac{1}{\sqrt{2}} (\ket{00}+\ket{11})\\ \ket{\Phi^-} &= \frac{1}{\sqrt{2}} (\ket{00}-\ket{11})\\ \ket{\Psi^+} &= \frac{1}{\sqrt{2}} (\ket{01}+\ket{10})\\ \ket{\Psi^-} &= \frac{1}{\sqrt{2}} (\ket{01}-\ket{10})\\ \end{split} \]

Assume Alice has \(\ket{\psi}_1 = \alpha \ket{0}_1 + \beta \ket{1}_1\). Alice and Bob share an entangled pair:

\[\ket{\Phi^+}_{23} = \frac{1}{\sqrt{2}} (\ket{00}_{23}+\ket{11}_{23})\]

Then the combined state of the three qubits can be expressed as:

\[\begin{split} \ket{\psi}_1\ket{\Phi^+}_{23} &= (\alpha\ket{0}_1 + \beta \ket{1}_1)\frac{1}{\sqrt{2}} (\ket{00}_{23}+\ket{11}_{23})\\ &= \frac{1}{\sqrt{2}} (\alpha\ket{000}_{123} + \alpha \ket{011}_{123} + \beta\ket{100}_{123} + \beta \ket{111}_{123})\\ &= \frac{1}{2} \left[ \alpha \left(\ket{\Phi^+}_{12} + \ket{\Phi^-}_{12}\right)\ket{0}_3 + \alpha \left(\ket{\Psi^+}_{12} + \ket{\Psi^-}_{12}\right)\ket{1}_3 + \right.\\ & \hphantom{=\frac{1}{2}\left[\right.} \left.\beta \left(\ket{\Psi^+}_{12} - \ket{\Psi^-}_{12}\right)\ket{0}_3 + \beta \left(\ket{\Phi^+}_{12} - \ket{\Phi^-}_{12}\right)\ket{1}_3\right] \\ &= \frac{1}{2} \left[ \ket{\Phi^+}_{12} \left( \alpha\ket{0}_3 + \beta\ket{1}_3 \right) + \ket{\Phi^-}_{12} \left( \alpha\ket{0}_3 - \beta\ket{1}_3 \right) + \right. \\ & \hphantom{=\frac{1}{2}\left[\right.} \left. \ket{\Psi^+}_{12} \left( \alpha\ket{1}_3 + \beta\ket{0}_3 \right) + \ket{\Psi^-}_{12} \left( \alpha\ket{1}_3 - \beta\ket{0}_3 \right) \right] \end{split} \]

After Alice measures the first two qubits in Bell basis, she will get one of the Bell states, and depending on the outcome, she can send two classical bits to instruct Bob to recover the state \(\ket{\psi}\).

10.2 Fourier Sampling #

10.3 Phase Kickback #

Suppose we have a one-qubit unitary \(U\) which has an eigenvector \(\ket{\psi}\) and \(e^{2\pi i \theta}\) is the corresponding eigenvalue.

Start with \(\ket{0}\ket{\psi}\), apply the Hadamard gate on the first qubit and then the controlled-\(U\) gate (denoted by \(C(U)\)) on the two qubits (the first qubit is the control bit), we have

\[\text{C}(U) (H \otimes I) (\ket{0}\otimes\ket{\psi}) = \frac{\ket{0} + e^{2\pi i \theta}\ket{1}}{\sqrt{2}} \ket{\psi}\]

The overall effect is to add a phase shift to the control qubit, and the second qubit remains intact. This is why it is called phase kickback.

For example, assume \(U=X\) (i.e., NOT gate), where \(X \ket{-} = -\ket{-}\). We have

\[\begin{split} &\text{CNOT}((H \otimes I) (\ket{0}\otimes\ket{-}))\\ =& \text{CNOT} (\ket{+}\ket{-})\\ =& \ket{-}\ket{-} \end{split} \]

10.4 Amplitude Amplification #

Given a Boolean function \(\chi: \ZZ \to \set{0,1}\). We define: If \(\chi(x)=1\), \(\ket{x}\) is a “good” state, otherwise it is a “bad” state.

The goal of amplitude amplification algorithm is to find a “good” state with high probability.

Define unitary operator: \[Q=-\mathcal{A}S_0\mathcal{A}^{-1}S_{\chi}\]

Where:

\(S_0\) changes the sign of the amplitude if and only if the state is \(\ket{0}\). The algebraic expression is \(S_0 = I-2\ket{0}\bra{0}\). In other words, \(-S_0\) is just the sign-implementation of the logical OR function: \(Q^{\pm}_{\text{OR}}\).
\(S_{\chi}\) changes the sign of the amplitudes of the “good” states:

\[S_{\chi}\ket{x} = \left\{ \begin{array}{l@{\quad:\quad}l} -\ket{x} & \text{if } \chi(x)=1\\ \phantom{-}\ket{x} & \text{if } \chi(x)=0 \end{array}\right. \] The algebraic expression is \(S_{\chi} = I-2\Pi_1\), where \(\Pi_i\) is a projection operator defined as \[\Pi_i=\sum_{x: \chi(x)=i} \ket{x}\bra{x}, \quad i \in \set{0,1}\]

Intuitively, \(\Pi_0\) maintains all the “bad” states and maps all the “good” states to 0. \(\Pi_1\) maintains all the “good” states and maps all the “bad” states to 0. In particular, if there is only one “good state”, say \(\ket{\psi}\), then \(S_{\chi} = I-2\ket{\psi}\bra{\psi}\).

Let \(\mathcal{A}\) be any quantum algorithm (unitary) such that \(\mathcal{A}\ket{0} = \sum_{x} \alpha_x \ket{x}\), and we say the algorithm is “successful” if a “good” result is produced when \(\mathcal{A}\ket{0}\) is measured. Define

\[\begin{split} &\ket{\Psi}=\mathcal{A}\ket{0}\\ &\ket{\Psi_i}=\Pi_i\ket{\Psi} = \Pi_i\mathcal{A}\ket{0}, \quad i \in \set{0,1} \end{split} \]

We call \(\ket{\Psi}\) the “initial state” because it will be used as the input state of the whole algorithm. Intuitively, \(\ket{\Psi_0}\) is a superposition of all the “bad” states, \(\ket{\Psi_1}\) is a superposition of all the “good” states. Note that \(\ket{\Psi_i}\) are orthogonal. Here we also abuse the notation because \(\ket{\Psi_i}\) are NOT normalized. By definition, we have \(\ket{\Psi} = \ket{\Psi_0}+\ket{\Psi_1}\), and \(S_{\chi}\ket{\Psi_0}=\ket{\Psi_0}\), \(S_{\chi}\ket{\Psi_1}=-\ket{\Psi_1}\).

Lemma \[\begin{aligned} Q\ket{\Psi_1}&=(1-2a)\ket{\Psi_1}-2a\ket{\Psi_0} \\ Q\ket{\Psi_0}&=2(1-a)\ket{\Psi_1}+(1-2a)\ket{\Psi_0} \end{aligned} \] where \(a = \braket{\Psi_1}{\Psi_1}\).

Proof

First proof a useful equality: \[\begin{split} \mathcal{A}S_0\mathcal{A}^{-1} &= \mathcal{A}(I-2\ket{0}\bra{0})\mathcal{A}^{\dagger}\\ &= I - 2 \mathcal{A}\ket{0}\bra{0}\mathcal{A}^{\dagger}\\ &= I - 2 \ket{\Psi}\bra{\Psi}\\ &= I-2\left(\ket{\Psi_0}+\ket{\Psi_1}\right)\left(\bra{\Psi_0}+\bra{\Psi_1}\right) \end{split} \]

And since \(\braket{\Psi}{\Psi} = \braket{\Psi_0}{\Psi_0} + \braket{\Psi_1}{\Psi_1} = 1\), we have \(\braket{\Psi_0}{\Psi_0}=1-a\).

Therefore

\[\begin{split} Q\ket{\Psi_0} &= - \left[ I-2\left(\ket{\Psi_0}+\ket{\Psi_1}\right)\left(\bra{\Psi_0}+\bra{\Psi_1}\right) \right] S_{\chi} \ket{\Psi_0}\\ &= -\left[ \ket{\Psi_0}-2\left(\ket{\Psi_0}+\ket{\Psi_1}\right)\braket{\Psi_0}{\Psi_0}\right]\\ &= 2(1-a)\ket{\Psi_1}+(1-2a)\ket{\Psi_0} \end{split} \]

The other one is similar.

In fact, \(a\) is the success probability of algorithm \(\mathcal{A}\), i.e., the probability of getting a “good” state by directly measuring \(\mathcal{A}\ket{0}\).

Define the angle \(0<\theta\leq \frac{\pi}{2}\) such that \(\sin^2(\theta) = a\). And we normalize \(\ket{\Psi_1}\) and \(\ket{\Psi_0}\) to \(\frac{1}{\sqrt{a}}\ket{\Psi_1}\) and \(\frac{1}{\sqrt{1-a}}\ket{\Psi_0}\) respectively, so that they can be used as an orthonormal basis. We call the space spanned by them \(\mathcal{H}_{\Psi}\). The above lemma can be reformulated as:

\[\begin{aligned} Q\frac{1}{\sqrt{a}}\ket{\Psi_1}&=\cos(2\theta)\frac{1}{\sqrt{a}}\ket{\Psi_1}-2\sin(\theta)\cos(\theta) \frac{1}{\sqrt{1-a}}\ket{\Psi_0} \\ Q\frac{1}{\sqrt{1-a}}\ket{\Psi_0}&=2\sin(\theta)\cos(\theta)\frac{1}{\sqrt{a}}\ket{\Psi_1}+\cos(2\theta)\frac{1}{\sqrt{1-a}}\ket{\Psi_0} \end{aligned} \]

Thus if a vector is in the space \(\mathcal{H}_{\Psi}\), applying \(Q\) on it corresponds to a rotation by the angle \(2\theta\), which can be represented by the rotation matrix:

\[Q = \begin{bmatrix}\cos(2\theta)& -\sin(2\theta)\\ \sin(2\theta) & \phantom{-}\cos(2\theta)\end{bmatrix}\]

Thus applying \(Q\) on state \(\ket{\Psi}\) for \(n\) times is equivalent to rotating it by the angle \(2n\theta\):

\[\begin{split} Q^n\ket{\Psi} &= Q^n(\ket{\Psi_1} + \ket{\Psi_0})\\ &= \frac{1}{\sqrt{a}} (\sin(2n\theta))\cos(\theta)+\cos(2n\theta)\sin(\theta)) \ket{\Psi_1} + \frac{1}{\sqrt{1-a}} (\cos(2n\theta)\cos(\theta)-\sin(2n\theta)\sin(\theta))\ket{\Psi_0}\\ &= \frac{1}{\sqrt{a}} \sin((2n+1)\theta) \ket{\Psi_1} + \frac{1}{\sqrt{1-a}} \cos((2n+1)\theta)\ket{\Psi_0}\\ \end{split} \]

(The original paper uses a different derivation, see Brassard et al. 2000)

Now measure, the probability of finding a “good” state is \(\sin^2((2n+1)\theta)\). To maximize it, let \((2n+1)\theta=\frac{\pi}{2}\), we get \(n=\frac{\pi}{4\theta}-\frac{1}{2}\). So we choose \(n=\lfloor \frac{\pi}{4\theta}\rfloor\). In this case, we have:

\[\frac{\pi}{4\theta} -1 < \left\lfloor \frac{\pi}{4\theta} \right\rfloor \leq \frac{\pi}{4\theta} \implies \frac{\pi}{2} < (2n+1)\theta \leq \frac{\pi}{2} + \theta \leq \pi\]

Since \(\sin\) is decreasing in range \([\frac{\pi}{2}, \pi]\), we have:

\[\sin^2((2n+1)\theta) \geq \sin^2(\frac{\pi}{2} + \theta) = \cos^2(\theta) = 1-a\]

I.e., the measurement outcome is “good” with probability at least \(1-a\).

Usually \(\sin(\theta)\ll 1\), so we can approximate as follows:

\[n=\left\lfloor \frac{\pi}{4\theta}\right\rfloor \approx \left\lfloor \frac{\pi}{4\sin(\theta)}\right\rfloor = \Theta\left(\frac{1}{\sqrt{a}}\right)\]

I.e., by applying \(Q\) for \(\Theta\left(\frac{1}{\sqrt{a}}\right)\) times, we can find a “good” state with high probability.

Since the algorithm \(\mathcal{A}\) has a success probability \(a\), in classical case, to find a “good” output, the expected number of applications of \(\mathcal{A}\) is \(\frac{1}{a}\). Therefore we obtain a quadratic speedup over the best possible classical algorithm.

Note that this algorithm requires that the value of \(a\) is known so that \(n\), the number of times we apply \(Q\), can be computed.

Geometric Illustration

When applying operator \(Q=-\mathcal{A}S_0\mathcal{A}^{-1}S_{\chi}\) to the initial state \(\ket{\Psi}\), the operation can be regarded as two flips, as shown in the figure:

Figure 1: Geometric Illustration of Amplitude Amplification (image credit: mm)

(Shown in red) First apply \(S_{\chi} = I-2\Pi_1\), which geometrically flips a vector with respect to \(\ket{\Psi_0}\), the superposition of all the “bad” states.

(Shown in green) Then apply \(-\mathcal{A}S_0\mathcal{A}^{-1} = -I + 2 \ket{\Psi}\bra{\Psi}\), which geometrically flips a vector with respect to \(\ket{\Psi}\), the initial input state.

The overall effect is to rotate the initial state towards \(\ket{\Psi_{1}}\), the superposition of “good” states, by an angle of \(2\theta\) (This is a general conclusion, A rotation equals two reflections). The amplitude amplification algorithm essentially repeats this procedure \(n\) times, then measure.

10.5 Grover’s Algorithm #

Given a Boolean function \(f: \set{0,1, \dots, N-1} \to \set{0,1}\) satisfying that there exists a unique \(x_0\) on which \(f\) takes value 1, and Grover’s algorithm is to find \(x_0\).

Grover’s algorithm is a spacial case of the Amplitude Amplification algorithm. let \(\chi=f\), and let \(\mathcal{A}=H^{\otimes n}\) be the Hadamard transform on \(n\) qubits. The operator \[Q = -\mathcal{A}S_0\mathcal{A}^{-1}S_{\chi} = -H^{\otimes n}S_0 \left(H^{\otimes n}\right)^{\dagger}S_f = H^{\otimes n} Q^{\pm}_{\text{OR}} H^{\otimes n}S_f\]

This algorithm finds \(x_0\) with high probability by using \(O(\sqrt{N})\) evaluations of function \(f\).

10.6 Phase Estimation #

Given a unitary operator \(U\), which has an eigenvector \(\ket{\psi}\) with eigenvalue \(e^{2\pi i\theta}\), \(0\leq\theta<1\). We assume that \(\ket{\psi}\) is known and \(\theta\) is unknown. The goal of the algorithm is to estimate \(\theta\).

Intuition: Using Phase Kickback to write the phase of \(U\) (in the Fourier basis) to \(t\) qubits in the “counting register”. Then using the inverse transform to translate this from the Fourier basis into the computational basis, which we can measure. \(t\) is chosen according to the accuracy we wish to have in our estimate of \(\theta\).

Prepare state. The first register (“counting register”) has \(t\) qubits initialized to \(\ket{0}\). The second register is initialized to \(\ket{\psi}\): \[\ket{\psi_0}=\ket{0}^{\otimes t} \ket{\psi}\]
Apply \(H^{\otimes t}\) on the first register: \[\ket{\psi_1}= \frac{1}{\sqrt{2^t}} \left( \ket{0}+\ket{1} \right)^{\otimes t} \ket{\psi} \]
Apply controlled-\(U^{2^j}\) on the second register with \(0\leq j \leq t-1\), where the control bit is the \(j\)-th qubit in the first register, respectively. (See Figure 5.2 in Nielsen & Chuang)

\(U^{2^j}\) is defined as: \[U^{2^j} \ket{\psi} = U^{2^j-1}U\ket{\psi} = U^{2^j-1} e^{2\pi i \theta}\ket{\psi} = \dots = e^{2\pi i 2^j \theta}\ket{\psi}\]

Using the property of phase kickback: \[C(U^{2^j})(\ket{+}\ket{\psi}) = \frac{1}{\sqrt{2}} C(U^{2^j})\left( \ket{0}\ket{\psi} + \ket{1}\ket{\psi} \right) = \frac{1}{\sqrt{2}} \left( \ket{0} + e^{2\pi i 2^j \theta} \ket{1} \right) \otimes \ket{\psi} 1\]

We get the final state of the both registers:

\[\begin{split} \ket{\psi_2} &= \frac{1}{\sqrt{2^t}}\left( \ket{0} + e^{2\pi i 2^{t-1} \theta} \ket{1} \right) \otimes \dots \otimes \left( \ket{0} + e^{2\pi i 2^1 \theta} \ket{1} \right) \otimes \left( \ket{0} + e^{2\pi i 2^0 \theta} \ket{1} \right) \otimes \ket{\psi}\\ &= \frac{1}{\sqrt{2^t}} \sum_{k=0}^{2^{t}-1} e^{2\pi i k \theta} \ket{k}\ket{\psi}\\ &= \left(\text{DFT}^{\dagger}_{2^t} \ket{2^t \theta}\right) \otimes \ket{\psi} \quad \text{(if } 2^t\theta \text{ is an integer)} \end{split} \] The second equality is due to the definition of the Quantum Fourier Transform.
Perform \(\text{DFT}_{2^t}\) on the first register, if \(2^t\theta\) is an integer, we get

\[\ket{2^t \theta}\ket{\psi}\]

Now measure and we get the outcome \(2^t\theta\) in the first register, so we know \(\theta\).

If \(2^t\theta\) is not an integer, we can approximate \(\theta\) by rounding \(2^t\theta\) to the nearest integer, see discussion below.

Performance:

In summary, exactly \(\sum_{j=0}^{t-1} 2^j = 2^t-1\) controlled-\(U\) gates are used.

If \(2^t\theta\) is an integer, then we can get \(\theta\) exactly with probability 1.

If \(2^t\theta\) is not an integer, we round it to the nearest integer \(a\), i.e.,

\[2^t\theta = a + 2^t\delta\]

where \(-\frac{1}{2}\leq 2^t\delta \leq \frac{1}{2}\), the \(2^t\) factor is only for the convenience of the following derivation.

We will prove that when measuring the first register, we get outcome \(a\) with high probability.

Proof

At step 4, applying \(\text{DFT}_{2^t}\) on the first register, which has state:

\[\frac{1}{\sqrt{2^t}} \sum_{k=0}^{2^{t}-1} e^{2\pi i k \theta} \ket{k}\]

We get: \[\begin{split} &\frac{1}{\sqrt{2^t}} \sum_{k=0}^{2^{t}-1} e^{2\pi i k \theta} \text{DFT}_{2^t}\ket{k}\\ =&\frac{1}{\sqrt{2^t}} \sum_{k=0}^{2^{t}-1} e^{2\pi i k \theta} \left[\frac{1}{\sqrt{2^t}} \sum_{x=0}^{2^{t}-1} e^{\frac{-2\pi i k x}{2^t}}\ket{x}\right]\\ =&\frac{1}{2^t} \sum_{k=0}^{2^{t}-1}\sum_{x=0}^{2^{t}-1} e^{2\pi i k \left( \theta-\frac{x}{2^{t}} \right)}\ket{x}\\ =&\frac{1}{2^t} \sum_{x=0}^{2^{t}-1}\sum_{k=0}^{2^{t}-1} e^{2\pi i k \left( \theta-\frac{x}{2^{t}} \right)}\ket{x} \end{split} \]

So measuring it yields the output \(a\) with probability:

\[\begin{split} \Pr\left[a\right]&= \left| \frac{1}{2^{t}}\sum_{k=0}^{2^{t}-1} e^{2\pi i k \left( \theta-\frac{a}{2^{t}} \right)} \right|^2\\ &= \left| \frac{1}{2^{t}}\sum_{k=0}^{2^{t}-1} e^{2\pi i k \delta} \right|^2 \quad \text{(Pluging in } a=2^t\theta - 2^t\delta \text{)}\\ &= \left| \frac{1}{2^{t}} \frac{1-e^{2\pi i \delta 2^{t}}}{1-e^{2\pi i \delta}} \right|^2 \quad \text{(Sum of geometric series)}\\ &= \frac{1}{2^{2t}} \frac{|\sin(\pi 2^{t} \delta)|^2}{|\sin(\pi \delta)|^2} \quad \text{(} \left| 1-e^{2ix} \right|^2 = 4 |\sin(x)|^2 \text{)}\\ &\geq \frac{1}{2^{2t}} \frac{|\sin(\pi 2^{t} \delta)|^2}{|\pi \delta|^2} \quad \text{(} |\sin(x)| \leq |x| \text{)}\\ &\geq \frac{1}{2^{2t}} \frac{|2 \cdot 2^t \delta|^2}{|\pi \delta|^2} \quad \text{(} |\sin(x)| \geq \left|\frac{2}{\pi}x\right| \text{ for } |x| \leq \frac{\pi}{2} \text{)}\\ &= \frac{4}{\pi^2} \approx 0.405 \end{split} \]

where we used the fact that \(|\delta|\leq \frac{1}{2^{t+1}}\).

10.7 Amplitude Estimation #

Amplitude Amplification assumes that the success probability \(a\) is known. If \(a\) is unknown, we can estimate it using Amplitude Estimation.

The algorithm uses Phase Estimation to estimate the eigenvalues of the Grover iteration \(Q\). Since \(Q\) is a rotation matrix, whose angle of rotation is \(2\theta\), then its eigenvalues are \(e^{i 2\theta}\) and \(e^{-i2\theta}\), and eigenvectors are \[\begin{split} \ket{\Psi_{+}}&=\frac{1}{\sqrt{2}}(i,1)^\mathsf{T} = \frac{1}{\sqrt{2}} \left( \frac{i}{\sqrt{1-a}}\ket{\Psi_{0}} + \frac{1}{\sqrt{a}}\ket{\Psi_{1}} \right)\\ \ket{\Psi_{-}}&=\frac{1}{\sqrt{2}}(-i,1)^\mathsf{T} = \frac{1}{\sqrt{2}} \left( \frac{-i}{\sqrt{1-a}}\ket{\Psi_{0}} + \frac{1}{\sqrt{a}}\ket{\Psi_{1}} \right) \end{split} \]

(The calculation of the eigenvalues/eigenvectors are here, or here)

Since \(a=\sin^2(\theta)\), we can estimate \(\theta\) so we know \(a\).

The error of estimating \(\theta\) can be transformed to the error of \(a\) in the next lemma.

Lemma (Brassard et al. 2000, Lemma 7)

Let \(a=\sin^2(\theta)\) and \(\tilde{a}=\sin^2(\tilde{\theta})\), then

\[\left| \tilde{\theta}-\theta \right| \leq \varepsilon \implies \left| \tilde{a}-a \right| \leq 2 \varepsilon \sqrt{a(1-a)} + \varepsilon^2 \]

By Phase Estimation, we can use the number of \(2^t-1\) controlled-\(Q\) gates to estimate \(\theta\) with probability at least \(\frac{4}{\pi^2}\), such that

\[\left| \widehat{\theta}-\theta \right| \leq \frac{\pi}{2^{t+1}}\]

So by the above lemma, we have

\[\left| \widehat{a}-a \right| \leq \frac{\pi}{2^t} \sqrt{a(1-a)} + \frac{\pi^2}{2^{2t+2}} \]

By using the “Powering Lemma”, the success probability can be improved to \(1-\delta\) for any \(\delta\) at the cost of an \(\mathcal{O}(\log \frac{1}{\delta})\) multiplicative factor.

10.8 Quantum Counting #

Grover’s Algorithm attempts to find a solution to the oracle, whereas the quantum counting algorithm tells us how many of these solutions there are. I.e., the number of \(x\) such that \(f(x)=1\).

This is a straightforward application of Amplitude Estimation. Under the same setting with Grover’s algorithm, assume the number of solutions is \(M\), then the success probability \(a=\frac{M}{N}\). By Amplitude Estimation, we can estimate \(a\), so we know the estimate of \(M\).

10.9 Twirling Quantum Channels #

Denote \(\mathcal{U}(d)\) be the set of all \(d \times d\) unitary matrices. Twirling the quantum channel \(\Lambda\) on a density operator \(\rho\) is

\[\rho \mapsto \int_{\mathcal{U}(d)} \dd{\eta(U)} U^{\dagger} \Lambda (U \rho U^{\dagger}) U\] where \(U\) is chosen randomly with respect to the Haar measure \(\eta\).

Intuitively, twirling a channel is equivalent to taking the average, or expected value, of conjugation with all possible \(U \in \mathcal{U}(d)\).

10.10 Entanglement Distillation (Purification) #

Transform \(N\) arbitrary states \(\rho\) into some number of approximately pure Bell pairs.

10.11 Quantum Monte Carlo #

Given a quantum algorithm \(\mathcal{A}\) which operates on \(n\) qubits. We define a (discrete) random variable \(v(\mathcal{A})\) as follows:

Apply \(\mathcal{A}\) on state \(\ket{0^n}\), i.e., \(\mathcal{A}\ket{0^n}=\sum_{x} \alpha_x \ket{x}\), then measure in the computational basis.
Based on the measurement outcome \(x \in \set{0,1}^n\), the value of the random variable \(v(\mathcal{A})\) is defined to be \(\phi(x)\) for some fixed function \(\phi: \set{0,1}^n \to \RR\).

The expectation is defined as \(\E[v(\mathcal{A})] = \sum_{x} |\alpha_x|^2 \phi(x)\).

The goal of Quantum Monte Carlo is to estimate \(\E[v(\mathcal{A})]\) given \(\mathcal{A}\). Here, we assume the value of \(v(\mathcal{A})\) is bounded: \(v(\mathcal{A}) \in [0,1]\). Refer Montanaro 2015 for more general cases.

Given \(\mathcal{A}\), the algorithm assumes that we can construct an unitary operator \(W\) on \(n+1\) qubits, defined by \[W\ket{x}\ket{0} = \ket{x} \left( \sqrt{1-\phi(x)} \ket{0} + \sqrt{\phi(x)} \ket{1} \right)\]

Then define the initial state: \[\ket{\Psi}=W (\mathcal{A}\ket{0^n}) \ket{0} = \sum_{x}\alpha_x\ket{x} \left( \sqrt{1-\phi(x)} \ket{0} + \sqrt{\phi(x)} \ket{1} \right)\] And two reflections: \[\begin{split} U&=2\ket{\Psi}\bra{\Psi}-I\\ V&=I-2P \end{split} \] where \(P\) is a projector defined as \(P=I_n \otimes \ket{1}\bra{1}\).

Then according to Amplitude Estimation, we can let the Grover operator \(Q=UV\) and estimate \(a=\bra{\Psi} P \ket{\Psi} = \E[v(\mathcal{A})]\).

Performance:

By Amplitude Estimation, we can estimate \(a\) using \(T=\mathcal{\widetilde{O}}(2^t)\) queries of \(Q\) with high probability that

\[ \left| \widehat{a}-a \right| \leq \mathcal{O}\left(\frac{1}{T}\right)\]

Note that this demonstrates a quadratic quantum speedup because classical methods like Chernoff bound take \(\mathcal{\widetilde{O}}(T^2)\) samples to estimate the same precision.

10.12 Randomized Benchmarking #

11 Quantum Adversary Method in Quantum Query Complexity #

Given a secret \(N\)-bit input string \(w\), encoded as a quantum oracle \(O^{\pm}_w\). You can query the \(i\)-th element of \(w\) in superposition: \[O^{\pm}_w: \ket{i} \mapsto (-1)^{w_i}\ket{i}\]

The goal is to solve a fixed and known decision problem \(F: \set{0,1}^N \mapsto \set{0,1}\) applied on \(w\), e.g., if \(F=\text{OR}\), then we need to decide whether \(w\) has at least one 1.

The cost is defined as the number of queries to the oracle \(O^{\pm}_w\).

12 Number Theory #

If \((a \mod n) = (b \mod n)\), and \(\gcd(a,n)=1\), then \(\gcd(b,n)=1\).

Proof

Let \(r \triangleq a \mod n = b \mod n\), then

\[a = pn + r, \quad b=qn+r\]

So \[a=pn+b-qn=b+(p-q)n\]

Since \(\gcd(b,n)\) divides \(b\) and \(n\), so it also divides \(a\).

By condition, \(\gcd(a,n)=1\), so the only number that divides \(a\) and \(n\) is 1. So \(\gcd(b,n)=1\).

\(\gcd(a,n)=1\) if and only if \(\exists x,y \in \ZZ\), such that \(xa+yn=1\).
Let \(G\) be a finite group with \(m=|G|\), the order of the group. Then for any element \(g \in G\), \(g^m=1\).

Links to this note

Welcome To My Digital Garden

← → Top