# Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution

### Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.
Assignment-1 Solutions
Solution 1: (d) By the chain rule for entropy we have:
H(Y; Z=X) = H(Z=X) + H(Y=X; Z) = H(Y=X) + H(Z=X; Y ) ! (c)
Also, as conditioning can not increase entropy. Hence
H(Z=X; Y ) H(Z=X)
and
H(Y=X; Z) H(Y=X)
Therefore we have
H(Y; Z=X) H(Y=X) + H(Z=X) ! (a)
and
H(Y; Z=X) H(Z=X) + H(Y=X) ! (b)
Hence, all the statements are true.
Solution 2: (a) Given the probabilities fp1; p2; p3g = f0:25; 0:25; 0:5g, the information content in X (in bits)
is given as
H(X) =
3Xi
=1
pi log2 pi = (0:25 log2 0:25 + 0:25 log2 0:25 + 0:5 log2 0:5) = 1:5 bits.
Solution 3: (b) We will prove this by contradiction. Assume that there exists an x, say x0 and two different values
of
y, say y1 and y2 such that p(x0; y1) > 0 and p(x0; y2) > 0. Then p(x0) p(x0; y1) + p(x0; y2) > 0, and p(y1=x0)
and
p(y2=x0) are not equal to 0 or 1. Thus
H(Y=X) = X
x
p(x)X
y
p(y=x) log p(y=x)
p(x0)(p(y1=x0) log p(y1=x0) p(y2=x0) log p(y2=x0))
> 0;
since t log t 0 for 0 t 1, and is strictly positive for t not equal to 0 or 1. Therefore our assumption that there
are two different values of
y such that p(x0; yi) > 0; i 2 f1; 2g is not correct, i.e., for all x with p(x) > 0, there is
only one possible value of
y with p(x; y) > 0 or in other words, Y is a function of X.
Solution 4: (b) This can be proved by justifying the following steps:
H(X; (g(X))) (=a) H(X) + H(g(X)=X)
(b)
= H(X);
H(X; g(X)) (=c) H(g(X)) + H(X=g(X))
| {z }
0
(
d)
H(g(X)):
Thus H(g(X)) H(X). The proof of all the steps are as follows:
1

(a) H(X; g(X)) = H(X) + H(g(X)=X) by the chain rule for entropies.
(b)
H(g(X)=X) = 0 since for any particular value of X, g(X) is fixed, and hence H(g(X)=X) = Px p(x)H(g(X)=X =
x) = Px 0 = 0.
(c)
H(X; g(X)) = H(g(X)) + H(x=g(X)) again by the chain rule.
(d)
H(X=g(X)) 0, with equality if and only if X is a function of g(X), i.e., g(·) is one-to-one.
Hence
H(X) H(g(X)).
Solution 5: (c) Since the coin is selected at random, the probability of selecting any of the coin is P(coin =
fair) = P(coin = biased) = 1 2. Therefore, the random variable (RV) X takes values 0 and 1 with equal probability.
The
H(X) is given as
H(X) = X
xi2f0;1g
P(X = xi) log2 P(X = xi) = 1 2 log2 1 2 + 1 2 log2 1 2 = 1:
Solution 6: (b) To find H(Y ), we have to first obtain the P(Y = yi); 8i. If an unbiased coin is chosen, then the
0 1
0 1 2
X Y
1
12
p =
12
p =
1/4
1/4
1/2
Figure 1:
sample space of the experiment will be
S1 = fHH; HT; T H; T Tg, so that there may be 0, 1 or 2 number of tails
recorded. However, if we select the biased coin, the sample space of the experiment will be
S2 = fHHg, so that
there will be 0 tails. This experiment can be understood by Figure 1. First we select the coin randomly, and then
conduct the tossing experiment. The probability
P(Y = yi) is given as:
P(Y = 0) = X
x
P(Y = 0; X) = P(X = 0)P(Y = 0jX = 0) + P(X = 1)P(Y = 0jX = 1)
=
1 2
·
1 4
+
1 2
· 1 =
5 8
P(Y = 1) = X
x
P(Y = 1; X) = P(X = 0)P(Y = 1jX = 0) + P(X = 1)P(Y = 1jX = 1)
=
1 2
·
1 2
+
1 2
· 0 =
1 4
P(Y = 2) = X
x
P(Y = 3; X) = P(X = 0)P(Y = 3jX = 0) + P(X = 1)P(Y = 3jX = 1)
=
1 2
·
1 4
+
1 2
· 0 =
1 8
Therefore,
H(Y ) is given as
H(Y ) = X
yi2f0;1;2g
P(Y = yi) log2 P(Y = yi) = 58 log2 5 8 + 1 4 log2 1 4 + 18 log2 1 8
= 1:3 bits:
Solution 7: (b) The mutual information I(X; Y ) is given as:
I(X; Y ) = H(Y ) H(Y=X)
We know that
H(Y ) = 1:3 bits. The conditional entropy H(Y=X) is given as:
H(Y=X) = H(Y=X = 0)P(X = 0) + H(Y=X = 1)P(X = 1):
2
When X = 1, Y always takes value 0, therefore there is no uncertainty in the value of Y , i.e., H(Y=X = 1) = 0.
And, when
X = 0, Y takes values 0, 1 and 2 with probability 14, 1 2 and 1 4 respectively. Therefore we have
H(Y=X = 0) = 2 · 1 4 log2 1 4 + 1 2 log2 12 = 1:5
=
) H(Y=X = 0)P(X = 0) = 2 · 1 4 log2 1 4 + 1 2 log2 12 · 1 2 = 0:75
=
) I(X; Y ) = H(Y ) H(Y=X) = 1:3 0:75 = 0:55 bits
Solution 8: (c) Information theory proves the existence of a code, however it does not give the best coding scheme.
Solution 9: (a) The inequality in option (b) directly follows from Fano’s lemma. Now, as we know that H(p) 1
and log(
L 1) < log L, we have the inequality in (c)
1 +
p log L H(p) + p log(L 1) H(X=Y ):
Hence, the only incorrect option is (a).
Solution 10: (b) We prove the convexity/non-convexity of all the functions as follows:
(a) We know that a function
f(x) is concave if f00(x) 0. Here,

 f00(x) = (1 – log x) = ≥ 0 d2 1 dx2 x2

Hence, it is not a concave function.
(b) See the lecture notes.
(c) We fix
p(x) and consider two different conditional distributions p1(y=x) and p2(y=x). The corresponding joint
distributions are
p1(x; y) = p(x)p1(y=x) and p(x)p2(y=x), and their respective marginals are p(x); p1(y) and
p(x); p2(y). Consider a conditional distribution
pλ(y=x) = λp1(y=x) + (1 λ)p2(y=x)
that is a mixture of
p1(y=x) and p2(y=x). The corresponding joint distribution is also a mixture of the
corresponding joint distributions,
pλ(x; y) = λp1(x; y) + (1 λ)p2(x; y);
and the distribution of Y is also a mixture
pλ(y) = λp1(y) + (1 λ)p2(y):
Hence if we let qλ(x; y) = p(x)pλ(y) be the product of the marginal distributions, we have
qλ(x; y) = λq1(x; y) + (1 λ)q2(x; y):
Since the mutual information is the relative entropy between the joint distribution and the product of the
marginal distributions, i.e.,
I(X; Y ) = D(pλkqλ);
and the relative entropy D(pkq) is a convex function of (p; q) (See lecture notes), it follows that the mutual
information is a convex function of
p(y=x) for fixed p(x).
(d) See the lecture notes.
3

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.

Assignment-1-Solution for An Introduction to Information Theory.