
Information Theory

Notes on information theory.

Sources and Entropy

Markov Sources

A source with memory length $m+1$, i.e. the symbol emitted at the current time depends only on the preceding $m$ symbols, is called a Markov source. For a first-order Markov source, identities like the following hold:

$$p(x_1\mid x_2,x_3)=p(x_1\mid x_2)$$

This holds because, conditioned on $x_2$, the symbols $x_1$ and $x_3$ are independent, so $x_3$ carries no further information about $x_1$.

Entropy of Discrete Sources

Definition

Entropy of a discrete source:

$$H(X)=E[I(X)]=E[-\log p(x_i)]=-\sum_i p(x_i)\log p(x_i)$$
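As a quick numerical sanity check of the definition (a minimal sketch; the base-2 logarithm and the plain probability list are assumptions, not notation from these notes):

```python
import math

def entropy(p):
    """H(X) = -sum_i p(x_i) log2 p(x_i), in bits; terms with p = 0 contribute 0."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

print(entropy([0.5, 0.5]))                 # fair coin → 1.0 bit
print(entropy([0.25, 0.25, 0.25, 0.25]))   # uniform over 4 symbols → 2.0 bits
```

A uniform distribution over $2^k$ symbols gives exactly $k$ bits, the maximum for that alphabet size.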

Conditional entropy:

$$\begin{aligned} H(X\mid Y)=\sum_j p(y_j)H(X\mid y_j)&=-\sum_{i,j}p(y_j)p(x_i\mid y_j)\log p(x_i\mid y_j)\\ &=-\sum_{i,j}p(x_i,y_j)\log p(x_i\mid y_j) \end{aligned}$$

Joint entropy:

$$H(X,Y)=-\sum_{i,j}p(x_i,y_j)\log p(x_i,y_j)$$

Key Identities

$$H(X,Y)=H(Y)+H(X\mid Y)$$


Proof:

$$\begin{aligned} H(Y)+H(X\mid Y)&=-\sum_j p(y_j)\log p(y_j)-\sum_{i,j}p(x_i,y_j)\log p(x_i\mid y_j)\\ &=-\sum_{i,j} p(x_i,y_j)\log p(y_j)-\sum_{i,j}p(x_i,y_j)\log p(x_i\mid y_j)\\ &=-\sum_{i,j} p(x_i,y_j)\log p(y_j)p(x_i\mid y_j)\\ &=-\sum_{i,j}p(x_i,y_j)\log p(x_i,y_j)=H(X,Y) \end{aligned}$$
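The identity can be checked numerically on a small joint distribution (the 2×2 table below is a made-up example, not data from these notes):

```python
import math

def H(ps):
    """Entropy of an iterable of probabilities, in bits."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Hypothetical joint distribution p(x, y) over {0,1} x {0,1}.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
py = {y: sum(p for (_, y2), p in pxy.items() if y2 == y) for y in (0, 1)}

H_XY = H(pxy.values())                          # joint entropy H(X, Y)
H_Y = H(py.values())                            # marginal entropy H(Y)
H_X_given_Y = -sum(p * math.log2(p / py[y])     # H(X|Y) = -sum p(x,y) log p(x|y)
                   for (_, y), p in pxy.items())

print(abs(H_XY - (H_Y + H_X_given_Y)) < 1e-12)  # → True
```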

Generalization (chain rule):

$$H(X_1,X_2,\cdots ,X_n)=\sum_{i=1}^n H(X_i\mid X_1,X_2,\cdots ,X_{i-1})$$

Mutual Information

Definition

$$I(X;Y)=H(X)-H(X\mid Y)$$

Definition of conditional mutual information:

$$\begin{aligned} I(X;Y\mid Z)&=H(X\mid Z)-H(X\mid Y,Z)\\ &=\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i\mid y_j,z_k)}{p(x_i\mid z_k)} \end{aligned}$$

Key Identities

$$I(X;Y)=I(Y;X)$$

Proof:

$$\begin{aligned} I(X;Y)&=H(X)-H(X\mid Y)\\ &=-\sum_i p(x_i)\log p(x_i)+\sum_{i,j}p(x_i,y_j)\log p(x_i\mid y_j)\\ &=-\sum_{i,j} p(x_i,y_j)\log p(x_i)+\sum_{i,j}p(x_i,y_j)\log p(x_i\mid y_j)\\ &=\sum_{i,j} p(x_i,y_j)\log \dfrac{p(x_i,y_j)}{p(x_i)p(y_j)}=I(Y;X) \end{aligned}$$
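The symmetric final form makes $I(X;Y)$ easy to compute directly from the joint distribution, and the symmetry easy to observe (again a made-up 2×2 example):

```python
import math

# Hypothetical joint distribution p(x, y).
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
px = {x: sum(p for (x2, _), p in pxy.items() if x2 == x) for x in (0, 1)}
py = {y: sum(p for (_, y2), p in pxy.items() if y2 == y) for y in (0, 1)}

def mi(pab, pa, pb):
    """I(A;B) = sum p(a,b) log2 [ p(a,b) / (p(a) p(b)) ]."""
    return sum(p * math.log2(p / (pa[a] * pb[b])) for (a, b), p in pab.items())

# Swapping the roles of X and Y leaves the value unchanged: I(X;Y) = I(Y;X).
pyx = {(y, x): p for (x, y), p in pxy.items()}
print(abs(mi(pxy, px, py) - mi(pyx, py, px)) < 1e-12)   # → True
```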

$$0\leqslant I(X;Y)\leqslant H(X)$$

$$I(X;Y,Z)=I(X;Y)+I(X;Z\mid Y)$$

Proof:

$$\begin{aligned} I(X;Y)+I(X;Z\mid Y)&=\sum_{i,j}p(x_i,y_j)\log \dfrac{p(x_i,y_j)}{p(x_i)p(y_j)}+\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i\mid y_j,z_k)}{p(x_i\mid y_j)}\\ &=\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i,y_j)}{p(x_i)p(y_j)}\cdot \dfrac{p(x_i,y_j,z_k)p(y_j)}{p(y_j,z_k)p(x_i,y_j)}\\ &=\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i,y_j,z_k)}{p(x_i)p(y_j,z_k)}=I(X;Y,Z) \end{aligned}$$
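A numeric spot-check of this chain rule on a small three-variable distribution (the random weights over $\{0,1\}^3$ are an illustrative assumption):

```python
import math
import itertools
import random

random.seed(0)
# Hypothetical joint distribution p(x, y, z) over {0,1}^3 (random, normalized).
keys = list(itertools.product((0, 1), repeat=3))
w = [random.random() for _ in keys]
p = {k: wi / sum(w) for k, wi in zip(keys, w)}

def marg(p, axes):
    """Marginalize a joint distribution onto the given coordinate positions."""
    out = {}
    for k, v in p.items():
        kk = tuple(k[a] for a in axes)
        out[kk] = out.get(kk, 0.0) + v
    return out

px, py = marg(p, (0,)), marg(p, (1,))
pxy, pyz = marg(p, (0, 1)), marg(p, (1, 2))

# I(X; Y,Z) = sum p(x,y,z) log2 [ p(x,y,z) / (p(x) p(y,z)) ]
I_X_YZ = sum(v * math.log2(v / (px[(x,)] * pyz[(y, z)]))
             for (x, y, z), v in p.items())
# I(X; Y) = sum p(x,y) log2 [ p(x,y) / (p(x) p(y)) ]
I_X_Y = sum(v * math.log2(v / (px[(x,)] * py[(y,)]))
            for (x, y), v in pxy.items())
# I(X; Z | Y) = sum p(x,y,z) log2 [ p(y) p(x,y,z) / (p(x,y) p(y,z)) ]
I_X_ZgY = sum(v * math.log2(py[(y,)] * v / (pxy[(x, y)] * pyz[(y, z)]))
              for (x, y, z), v in p.items())

print(abs(I_X_YZ - (I_X_Y + I_X_ZgY)) < 1e-9)   # → True
```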

When $X-Y-Z$ forms a Markov chain,

$$I(X;Y,Z)=I(X;Y)$$

Proof:

$$\begin{aligned} I(X;Y,Z)&=\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i,y_j,z_k)}{p(x_i)p(y_j,z_k)}\\ &=\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i\mid y_j,z_k)}{p(x_i)}\\ &=\sum_{i,j,k}p(x_i,y_j,z_k)\log \dfrac{p(x_i\mid y_j)}{p(x_i)}=I(X;Y) \end{aligned}$$

where the last step uses the Markov property $p(x_i\mid y_j,z_k)=p(x_i\mid y_j)$.

Data Processing Inequality

For random variables $X-Y-Z$ forming a Markov chain, their mutual informations satisfy:

$$I(X;Y)\geqslant I(X;Z)$$

Proof:

$$\begin{aligned} I(X;Y)-I(X;Z)&=I(X;Y,Z)-I(X;Z)\\ &=I(X;Y\mid Z)\\ &=\sum_k p(z_k)I(X;Y\mid z_k)\geqslant 0 \end{aligned}$$

Here the first equality uses $I(X;Y,Z)=I(X;Y)$ for the Markov chain, the second is the chain rule for mutual information, and each $I(X;Y\mid z_k)$ is itself a mutual information, hence nonnegative.
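The inequality can be observed on a concrete Markov chain $X-Y-Z$, built here from a hypothetical uniform binary source pushed through two cascaded binary symmetric channels (the 0.1 and 0.2 flip probabilities are arbitrary choices for illustration):

```python
import math

def bsc(p_in, flip):
    """Joint p(in, out) for a binary symmetric channel with the given flip probability."""
    return {(a, b): p_in[a] * (flip if a != b else 1 - flip)
            for a in (0, 1) for b in (0, 1)}

def mi(pab):
    """I(A;B) computed from a joint distribution over pairs."""
    pa = {a: sum(v for (a2, _), v in pab.items() if a2 == a) for a in (0, 1)}
    pb = {b: sum(v for (_, b2), v in pab.items() if b2 == b) for b in (0, 1)}
    return sum(v * math.log2(v / (pa[a] * pb[b]))
               for (a, b), v in pab.items() if v > 0)

px = {0: 0.5, 1: 0.5}
pxy = bsc(px, 0.1)                    # X -> Y through a BSC(0.1)
py = {y: sum(v for (_, y2), v in pxy.items() if y2 == y) for y in (0, 1)}
pyz = bsc(py, 0.2)                    # Y -> Z through a BSC(0.2)

# p(x, z) = sum_y p(x, y) p(z | y), using the Markov property
pxz = {(x, z): sum(pxy[(x, y)] * pyz[(y, z)] / py[y] for y in (0, 1))
       for x in (0, 1) for z in (0, 1)}

print(mi(pxy) >= mi(pxz))             # → True: processing by the second
                                      # channel can only lose information
```

Each extra channel stage can only shrink the mutual information with the original source, which is exactly what the inequality asserts.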