Answers to questions

June 6, 2018*

*Last modified 25-Aug-18

During the last month or so of the course, I accepted anonymous questions submitted to my website, and posted the answers so the rest of the class could learn from it too. These questions and answers are archived here.


Question 1.

In Notes 8, why does equation (1.3) follow hold? We invert \(I_d \otimes R\) and \(\Lambda \otimes I_n\). Can you describe why please? Is it because the identity matrix commutes with everything?

Good question! The short answer is yes. In this part, we start with \[ \Lambda \otimes \operatorname{id}_k ( |\varphi_i\rangle \langle\varphi_i|), \] and use a relation from Notes 7 to write \(|\varphi_i\rangle = I_d\otimes R_i\, |\Omega\rangle\) for some matrix \(R_i\). Substituting this, we have \[ \Lambda \otimes \operatorname{id}_k ( I_d\otimes R_i\, |\Omega\rangle \langle\Omega| \,I_d \otimes R_i^\dagger), \](1) which (1.3) claims is equal to \[ I_d\otimes R_i\, (\Lambda \otimes \operatorname{id}_k ( |\Omega\rangle \langle\Omega| ) )I_d \otimes R_i^\dagger. \](2) Intuitively, we can see this because \(I_d\otimes R_i\) only acts non-trivially on the second system, and \(\Lambda\otimes \operatorname{id}_k\) only acts non-trivially on the first system. But how can we show this algebraically? We can define a linear map \(\mathcal{R}_i : A\mapsto R_i A R_i^\dagger\). Then (1) becomes \[\begin{aligned} (\Lambda \otimes \operatorname{id}_k) \circ (\operatorname{id}_d \otimes \mathcal{R}_i) (|\Omega\rangle \langle\Omega|) &= (\Lambda \circ \operatorname{id}_d) \otimes (\operatorname{id}_k \circ \mathcal{R}_i ) \, (| \Omega \rangle\langle \Omega |) \\ &=(\operatorname{id}_d \circ \Lambda ) \otimes (\mathcal{R}_i \circ \operatorname{id}_k) \, (| \Omega \rangle\langle \Omega |)\\ &=\operatorname{id}_d \otimes \mathcal{R}_i \circ ( \Lambda \otimes \operatorname{id}_k) \, (| \Omega \rangle\langle \Omega |)\\ &=I_d \otimes R_i\, ( \Lambda \otimes \operatorname{id}_k \, (| \Omega \rangle\langle \Omega |) ) \, I_d \otimes R_i^\dagger \end{aligned}\] which is (2). So it just follows from the rules of composition1 and tensor product, along with the fact that, e.g. \(\operatorname{id}_k \circ \mathcal{R}_i = \mathcal{R}_i = \mathcal{R}_i \circ \operatorname{id}_k\). That is, composing a map with the identity map does nothing.

In practice, one usually doesn’t write out full proofs for facts like these, because it comes up a lot and it’s the same argument every time. But it’s important to see how it works to know that the intuition about “which system does it act on non-trivially” really holds up.

  1. Note that these superoperators \(\mathcal{R}_i\) are just linear maps like \(R_i\), except on a larger Hilbert space. So composition of superoperators respects tensor product just as matrix multiplication does, since they are in fact the same operation, just on different sized Hilbert spaces.


Question 2.

Hey! Could you also please help me with an exercise (Notes 5, eq 21): show that a separable bipartite state can always be expressed as convex combination of pure product states: \[ \rho_{AB} = \sum\limits_{i=1} p_i |\psi_i\rangle\langle\psi_i|\otimes |\phi_i\rangle\langle\phi_i|. \] I get confused when I expand the density matrices and try to confine the sum to only one variable.

Hi! Good question. The definition of a separable state is that we can write it in the form \[ \rho_{AB} = \sum_{i=1}^n p_i \omega_i^A \otimes \sigma_i^B \] for some probability distribution \(\{p_i\}_{i=1}^n\) and density matrices \(\omega_i^A\) and \(\sigma_i^B\). As you mentioned, the trick in in the number of index variables. Each state \(\omega_i^A\) has a decomposition into at most \(d_A:= \dim \mathcal{H}_A\) pure states: \[ \omega_i^A = \sum_{j=1}^{d_A} q^{(i)}_j | \psi^{(i)}_j \rangle\langle \psi^{(i)}_j |. \](1) We know we can restrict to \(d_A\) pure states because the eigendecomposition gives a decomposition with that many pure states, and we don’t need to restrict the sum to fewer than \(d_A\) states because we can always have the associated probability \(q^{(i)}_j = 0\). So for each \(i\), we have a probability distribution \(\{q^{(i)}_j\}_{j=1}^{d_A}\) and set of pure states \(\{| \psi^{(i)}_j \rangle\}_{j=1}^{d_A}\) which satisfy (1). Likewise, we can decompose each \(\sigma_B^{(i)}\) as \[ \sigma_i^B = \sum_{k=1}^{d_B} r^{(i)}_k | \phi^{(i)}_k \rangle\langle \phi^{(i)}_k |. \] for some probability distribution \(\{r^{(i)}_k\}_{k=1}^{d_B}\) and pure states \(\{| \phi^{(i)}_k \rangle\}_{k=1}^{d_B}\). Putting this together, we have \[ \rho_{AB} = \sum_{i=1}^n \sum_{j=1}^{d_A} \sum_{k=1}^{d_B} p_i q^{(i)}_j r^{(i)}_k | \psi^{(i)}_j \rangle\langle \psi^{(i)}_j |\otimes | \phi^{(i)}_k \rangle\langle \phi^{(i)}_k |. \](2) This is probably where you got to in your expansions. The trick is to realize that the bound variables \(i,j,k\) don’t have inherent meaning; the sum is simply \[ \rho_{AB} = p_1 q^{(1)}_1 r^{(1)}_1 | \psi^{(1)}_1 \rangle\langle \psi^{(1)}_1 |\otimes | \phi^{(1)}_1 \rangle\langle \phi^{(1)}_1 | + p_1 q^{(1)}_2 r^{(1)}_k | \psi^{(1)}_2 \rangle\langle \psi^{(1)}_2 |\otimes | \phi^{(1)}_1 \rangle\langle \phi^{(1)}_1 | + \ldots \] where we have a term for each value of \(i,j,k\). So we can biject the set \(\{1,2,\dotsc,n\} \times \{1,2,\dotsc,d_A\} \times \{1,2,\dotsc,d_B\}\) with the set \(\{1,2,\dotsc, n*d_A*d_B\}\), by for example1.] \[ \alpha(i,j,k) = (i-1)*d_Ad_B + (j-1)*d_B + k \] which runs from \(\alpha(1,1,1) = 1\) to \(\alpha(n,d_A,d_B) = nd_Ad_B\), and in fact is a bijection as we could check. So we can define a probability distribution \(\{t_\beta\}_{\beta=1}^{nd_Ad_B}\) by the inverse mapping: if \(\alpha(i,j,k) = \beta\), then \(t_\beta = p_i q^{(i)}_j r^{(i)}_j\). Likewise, we can define pure states \[ | \chi_\beta \rangle_A = | \psi^{(i)}_j \rangle_A \] where \(i\) and \(j\) are such that \(\beta= \alpha(i,j,k)\) for some \(k\). Likewise, we define pure states \[ | \tilde \chi_\beta \rangle_B = | \phi^{(i)}_k \rangle_B \] where \(i\) and \(k\) are such that \(\beta= \alpha(i,j,k)\) for some \(j\). Putting it all together, \[ \rho_{AB} = \sum_{\beta=1}^{nd_Ad_B} t_\beta | \chi_\beta \rangle\langle \chi_\beta |\otimes | \tilde \chi_\beta \rangle\langle \tilde \chi_\beta |. \](3) Why is this equal? Since each term in the sum on the right-hand side of (3) appears exactly once as a term in the sums on the right-hand side of (2).

This trick often isn’t written out, so I hope it helps to have it here. In scientific programming languages, the bijection \(\alpha\) is often called sub2ind, since it converts “subscripts” like \(i,j,k\) to a single index, \(\beta\). For example, Julia is an open-source scientific programming language and documents the function here, if you’re interested.

  1. Thanks to this stackoverflow answer


Question 3.

Could you please help me with an exercise given in the lectures (Notes 13, eq 1.18): show that there exist a set of unitary operators \(\{U_j\}_j\) and a probability distribution \(\{p_j\}_j\) such that \[ \rho_A \otimes \sigma_B = \sum\limits_j p_jU_j\rho_{AB}U_j^\dagger, \](1) where \(\sigma_B = I/d\) is a completely mixed state.

The exercise in (1.18) is a bit tricky— I wouldn’t know how to do it without having seen it before. The unitaries needed are described in Mark Wilde’s book. In Section 3.7.2 (p. 110), he defines qudit generalizations of the Pauli \(X\) and \(Z\) operators, which he calls the Heisenberg-Weyl operators. Then in Exercise 4.7.6 on p. 176 of the same book, he has an exercise to show that these operators “do the job” by averaging out to the completely mixed state (see equation (4.349) in the book). Note you have a doubly-indexed family of unitaries \(\{ X(i) Z(j) \}_{i,j}\), but that could always be reindexed to have only one index, like in the previous question. Actually, there is one more subtlety: in Exercise 4.7.6 the operators only act on one system; here you have two systems, but only want to map the \(B\) system to the completely mixed state. So you would use \(I\otimes X(i) Z(j)\) instead.