### Exercise 4.

Use the HSW theorem to find the product state capacity of the depolarizing channel, \(\Lambda\), defined by \[ \Lambda(\rho) = p\rho + (1-p) \frac{I}{2}. \]

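We are only dealing with qubits: in dimension \(d\) the output would have trace \(p + (1-p)\frac{d}{2}\), so \(\Lambda\) would not be trace-preserving unless \(d = 2\) (or \(p = 1\)). First, note that for any unitary \(U\), we have \[ \Lambda(U \rho U^*) = pU\rho U^* + (1-p) \frac{I}{2} = U \left( p\rho + (1-p) \frac{I}{2} \right) U^*, \] using \(U I U^* = I\), and therefore \(\Lambda(U \rho U^*) = U \Lambda(\rho)U^*\). In other words, \(\Lambda\) is unitarily covariant.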
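The covariance property can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names (`depolarize`, `conjugate_by`, and so on) and the particular values of \(p\), \(U\) and \(\rho\) are illustrative assumptions, not part of the exercise.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Conjugate transpose of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def depolarize(rho, p):
    """Lambda(rho) = p*rho + (1-p)*I/2 for a trace-one qubit state rho."""
    return [[p * rho[i][j] + (1 - p) * (0.5 if i == j else 0.0)
             for j in range(2)] for i in range(2)]

def conjugate_by(u, rho):
    """Return U rho U^*."""
    return matmul(matmul(u, rho), dagger(u))

def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

# Hypothetical test data: an SU(2) element [[a, -conj(b)], [b, conj(a)]]
# with |a|^2 + |b|^2 = 1, and a mixed qubit state rho.
theta = 0.7
a = math.cos(theta) * cmath.exp(0.3j)
b = math.sin(theta) * cmath.exp(1.1j)
U = [[a, -b.conjugate()], [b, a.conjugate()]]
rho = [[0.8, 0.1 + 0.2j], [0.1 - 0.2j, 0.2]]
p = 0.6

lhs = depolarize(conjugate_by(U, rho), p)   # Lambda(U rho U^*)
rhs = conjugate_by(U, depolarize(rho, p))   # U Lambda(rho) U^*
covariance_error = max_abs_diff(lhs, rhs)
print("max entrywise difference:", covariance_error)
```

The two sides should agree up to floating-point rounding for any choice of unitary, state and \(p\).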

The HSW theorem gives that the product state capacity of \(\Lambda\) is \[ \chi^*(\Lambda) = \max_{\{p_x, \rho_x\}} S\left( \Lambda\left(\sum_x p_x \rho_x\right)\right) - \sum_x p_x S(\Lambda(\rho_x)). \] We can reduce to pure state ensembles \(\{p_x,\psi_x\}\): decomposing each \(\rho_x\) into pure states leaves the average state unchanged and, by concavity of the entropy, cannot increase the second term. Then defining the average state \(\rho := \sum_x p_x \psi_x\), we consider \[ S( \Lambda(\rho)) - \sum_x p_x S(\Lambda(\psi_x)). \]

Since all pure states can be related by unitaries, for each \(x\) and \(y\) there is a unitary \(U\) such that \(\psi_x = U \psi_y U^*\). Thus, \[ S(\Lambda(\psi_x)) = S( \Lambda( U \psi_y U^*)) = S(U \Lambda(\psi_y) U^*) = S(\Lambda(\psi_y)) \] using that the entropy is unitarily invariant. Thus, \(S(\Lambda(\psi_x))\) does not depend on \(x\). To calculate its value, we can extend \(| \psi_x \rangle\) to an orthonormal basis \(\{| \psi_x \rangle, | \psi_x \rangle^\perp\}\). In this basis, \[ \Lambda(\psi_x) = p\psi_x + (1-p) \frac{I}{2} = \begin{pmatrix} p + \frac{1-p}{2} & 0 \\ 0 & \frac{1-p}{2} \end{pmatrix}. \] Thus, \(S(\Lambda(\psi_x)) = h\left( \frac{1-p}{2}\right)\), where \(h\) is the binary entropy. Since \(\sum_x p_x =1\), we have \[ S( \Lambda(\rho)) - \sum_x p_x S(\Lambda(\psi_x)) = S( \Lambda(\rho)) - h\left( \frac{1-p}{2}\right). \]

It remains to maximize the output entropy \(S( \Lambda(\rho))\) over all states \(\rho\). Since \(\Lambda\) commutes with unitary conjugation and the entropy is unitarily invariant, we can work in the eigenbasis of \(\rho\), in which case \(\rho = \begin{pmatrix} \lambda & 0 \\ 0 & 1- \lambda \end{pmatrix}\) for some number \(0\leq \lambda\leq 1\). In this basis, we have \[ \Lambda(\rho) = \begin{pmatrix} p \lambda + \frac{1-p}{2}& 0 \\ 0 & p(1- \lambda) + \frac{1-p}{2} \end{pmatrix}. \] Thus, \[ S(\Lambda(\rho)) = h\left(p \lambda + \frac{1-p}{2} \right) = h\left( \frac{1}{2} + p\left( \lambda - \frac{1}{2}\right)\right). \] The binary entropy is maximized at \(\frac{1}{2}\), where it takes the value \(\log_2 2 = 1\) (since, e.g., the entropy is maximized on a uniform distribution), so \(S(\Lambda(\rho))\) is maximized when \(\lambda = \frac{1}{2}\), i.e. when \(\rho = \frac{I}{2}\). This average state is attained by a pure state ensemble, for example any orthonormal basis with equal weights \(\frac{1}{2}\), so both terms are optimized simultaneously.

Thus, \[ \chi^*(\Lambda) = 1- h\left( \frac{1-p}{2}\right) \] gives the product state capacity of the depolarizing (qubit) channel.
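As a sanity check on the result, the following minimal sketch evaluates the quantity \(h\left(p\lambda + \frac{1-p}{2}\right) - h\left(\frac{1-p}{2}\right)\) on a grid of \(\lambda\) values and compares its maximum with the closed form. The function names, the grid resolution and the sample value \(p = 0.6\) are assumptions chosen for illustration, and \(p\) is taken in \([0,1]\).

```python
import math

def h(x):
    """Binary entropy in bits, with the convention h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def holevo_diagonal(lam, p):
    """S(Lambda(rho)) - h((1-p)/2) for rho = diag(lam, 1 - lam)."""
    return h(p * lam + (1 - p) / 2) - h((1 - p) / 2)

def capacity(p):
    """Closed form derived above: 1 - h((1-p)/2)."""
    return 1 - h((1 - p) / 2)

p = 0.6  # hypothetical sample value
grid = [k / 1000 for k in range(1001)]

# Grid search over the eigenvalue lambda of the average input state.
best_lam = max(grid, key=lambda lam: holevo_diagonal(lam, p))
best_value = holevo_diagonal(best_lam, p)

print("maximizing lambda:", best_lam)
print("grid maximum:     ", best_value)
print("closed form:      ", capacity(p))
```

The limiting cases behave as expected: `capacity(1.0)` is one bit (the identity channel) and `capacity(0.0)` is zero (the output is always \(\frac{I}{2}\)).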