# Locally maximizing the Rényi entropies

## August 25, 2018*

Tags: entropies, visualization

As I was rewriting my website, I found some visualizations I had stored on my old website to show a collaborator, and I figured it was worth writing a little to have a more proper place to put them; hence this post 😊.

Probability distributions on three letters consist of just three non-negative numbers that add up to 1, which we can view as a vector in $$\mathbb{R}^3$$. The set of all such distributions forms a simplex, which looks like a 2D triangle lying in $$\mathbb{R}^3$$:

The simplex of probability distributions on three letters from two perspectives.

We can parametrize such distributions by their $$x$$- and $$y$$-coordinates alone, since the $$z$$-coordinate is determined by $$z=1-x-y$$. This lets us plot any function $$f$$ of probability distributions on three letters as a surface in $$\mathbb{R}^3$$: for each valid choice of $$(x,y)$$, that is, $$x, y \geq 0$$ with $$x + y \leq 1$$, we plot the number $$f(x,y,1-x-y)$$ at the point $$(x,y)$$.

One particular function of probability distributions that I’m interested in is the $$\alpha$$-Rényi entropy. For probability distributions on three letters, it is given by

$$H_\alpha(\vec p) = \frac{1}{1-\alpha}\log(x^\alpha + y^\alpha + z^\alpha),$$

where $$\vec p = (x,y,z)$$, and $$\alpha \in (0,1)\cup(1,\infty)$$ is a parameter. From this function, we can define another function of probability distributions,

$$\Delta_\varepsilon(\vec p) = \max_{\vec q \in B_\varepsilon(\vec p)} H_\alpha(\vec q) - H_\alpha(\vec p),$$

where $$B_\varepsilon(\vec p)$$ is the $$\varepsilon$$-ball around $$\vec p$$: the set of all probability distributions that are $$\varepsilon$$-close to $$\vec p$$ in total variation distance. For example, if $$\vec r = (0.21, 0.24, 0.55)$$, then $$B_\varepsilon(\vec r)$$ is the filled purple hexagon in Figure 2.

Figure 2: $$B_\varepsilon(\vec r)$$ is the purple hexagon.
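To make these definitions concrete, here is a minimal Python sketch that evaluates $$H_\alpha$$ and approximates $$\Delta_\varepsilon$$ by a plain grid search over the simplex. The function names, the natural-log convention, and the grid resolution `n` are all assumptions made for illustration; the grid search is chosen only for simplicity and is not how the figures in this post were produced.

```python
import math

def renyi_entropy(p, alpha):
    """alpha-Renyi entropy of p = (x, y, z); natural log is an assumption."""
    # Zero entries contribute nothing to the sum for alpha > 0, so skip them.
    return math.log(sum(t ** alpha for t in p if t > 0)) / (1.0 - alpha)

def tv_distance(p, q):
    """Total variation distance: half the l1 distance between p and q."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def delta_bruteforce(p, eps, alpha, n=200):
    """Approximate Delta_eps(p) by scanning simplex points with step 1/n."""
    h_p = renyi_entropy(p, alpha)
    best = h_p  # p itself always lies in its own eps-ball
    for i in range(n + 1):
        for j in range(n + 1 - i):
            q = (i / n, j / n, (n - i - j) / n)
            # Small tolerance so boundary points survive rounding error.
            if tv_distance(p, q) <= eps + 1e-12:
                best = max(best, renyi_entropy(q, alpha))
    return best - h_p
```

For the example above, one would call `delta_bruteforce((0.21, 0.24, 0.55), eps=0.1, alpha=2.0)`. Because only grid points are searched, the result is a lower bound on the true maximum that improves as `n` grows.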

It turns out that this maximum is achieved at a unique point, and we can write down an explicit form for the maximizer. For the example above, it is shown here:

The maximizer of $$H_\alpha$$ over the ball $$B_\varepsilon(\vec r)$$ is the unlabelled black point at the bottom of the hexagon.

Having an explicit maximizer allows us to plot the value of $$\Delta_\varepsilon$$ as it varies over the set of probability distributions, for a given $$\varepsilon$$ and $$\alpha$$. The quantity $$\Delta_\varepsilon$$ is useful for proving continuity bounds. I’ve included some plots of it below.

$$\Delta_\varepsilon$$ with $$\varepsilon = 0.1$$, for the $$\alpha$$-Rényi entropy with $$\alpha = 0.5$$.

$$\Delta_\varepsilon$$ with $$\varepsilon = 0.1$$, for the $$\alpha$$-Rényi entropy with $$\alpha = 1.5$$.

$$\Delta_\varepsilon$$ with $$\varepsilon = 0.1$$, for the $$\alpha$$-Rényi entropy with $$\alpha = 2.0$$.

$$\Delta_\varepsilon$$ with $$\varepsilon = 0.1$$, for the $$\alpha$$-Rényi entropy with $$\alpha = 3.0$$.
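Surfaces like these can be roughly approximated without the explicit maximizer. The NumPy sketch below tabulates $$\Delta_\varepsilon$$ on a grid of simplex points by comparing every pair of grid points. The helper names, the natural-log convention, and the resolution `n` are hypothetical choices for illustration, and the pairwise search is a stand-in rather than the method used for the plots above.

```python
import numpy as np

def simplex_grid(n):
    """Points (i/n, j/n, k/n) with i + j + k = n, as an array of shape (m, 3)."""
    i, j = np.meshgrid(np.arange(n + 1), np.arange(n + 1), indexing="ij")
    keep = i + j <= n
    i, j = i[keep], j[keep]
    return np.stack([i, j, n - i - j], axis=1) / n

def renyi(points, alpha):
    """Row-wise alpha-Renyi entropy; natural log is an assumption."""
    return np.log(np.sum(points ** alpha, axis=1)) / (1.0 - alpha)

def delta_surface(eps, alpha, n=40):
    """Return grid points and the grid approximation of Delta_eps at each."""
    pts = simplex_grid(n)
    h = renyi(pts, alpha)
    # Pairwise total variation distances between grid points, shape (m, m).
    tv = 0.5 * np.abs(pts[:, None, :] - pts[None, :, :]).sum(axis=2)
    in_ball = tv <= eps + 1e-12  # tolerance guards against rounding
    # For each row p, take the largest entropy among grid points in its ball.
    best = np.where(in_ball, h[None, :], -np.inf).max(axis=1)
    return pts, best - h
```

The first two columns of `pts` are the $$(x,y)$$ coordinates used for plotting, so the output could be passed to something like Matplotlib's `tricontourf(pts[:, 0], pts[:, 1], delta)`. Memory grows with the square of the number of grid points, so this only suits coarse grids.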