Information
Experience shows that the information value of a piece of news depends on its probability:
$$ I_{E_i} = f(p_i) $$
where \( I_{E_i} \) denotes the information value of the event \( E_i \) and \( p_i \) its probability. In this sense, the more unexpected or unlikely a piece of news is (a rumour, for instance), the greater its information value.
The function \( f \) was therefore chosen according to Shannon's suggestion:
$$ I_E = \log_2 \frac{1}{p_E} = -\log_2 p_E \ \text{[bit]} $$
The properties of the logarithm play an important role in modeling the quantitative properties of information. In particular, the logarithm makes the information content of independent events additive: since \( p_{E_1 E_2} = p_{E_1} \cdot p_{E_2} \) for independent events, it follows that \( I_{E_1 E_2} = I_{E_1} + I_{E_2} \).
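To make this concrete, here is a minimal Python sketch of the formula above (the function name information_bits is illustrative, not from the original text); it also checks numerically that the information content of independent events adds up:

<code python>
import math

def information_bits(p: float) -> float:
    """Self-information I_E = -log2(p_E) of an event with probability p, in bits."""
    return -math.log2(p)

# For independent events p(E1 and E2) = p1 * p2, so the logarithm
# turns the product into a sum: I(E1 and E2) = I(E1) + I(E2).
p1, p2 = 0.5, 0.25
assert math.isclose(information_bits(p1 * p2),
                    information_bits(p1) + information_bits(p2))

print(information_bits(0.5))   # 1.0 bit  (a fair coin flip)
print(information_bits(0.25))  # 2.0 bits (one of four equally likely outcomes)
</code>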
If the event space consists of two equally probable events, \( p(E_1) = p(E_2) = 0.5 \), then
$$ I_{E_1} = I_{E_2} = \log_2 \frac{1}{0.5} = \log_2 2 = 1 \ \text{[bit]} $$
Thus the unit of information, the bit, is the information value associated with the simplest possible choice: a choice between two equally likely alternatives.
If the event system consists of \( n \) events, all with the same probability, then the probability of any single event is:
$$ p_E = \frac{1}{n} $$
In this case, the information value associated with each of these events is:
$$ I_E = \log_2 \frac{1}{p_E} = \log_2 n \ \text{[bit]} $$
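For example, selecting one of \( n = 8 \) equally likely outcomes carries
$$ I_E = \log_2 8 = 3 \ \text{[bit]} $$
of information, matching the three binary (yes/no) questions needed to identify the chosen outcome.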
Entropy
If the events in the event space are not equally likely, the set of messages can still be characterized by the average information content of its messages. This average is called the entropy of the message set.
$$ H_E = \sum_{i=1}^n p_i \cdot I_{E_i} = \sum_{i=1}^n p_i \cdot \log_2 \frac{1}{p_i} = - \sum_{i=1}^n p_i \cdot \log_2 p_i$$
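As a quick numerical check, here is a minimal Python sketch of this sum (the name entropy_bits is illustrative); note that the entropy is largest when all events are equally likely, in which case it reduces to \( \log_2 n \):

<code python>
import math

def entropy_bits(probs) -> float:
    """Entropy H = -sum(p_i * log2(p_i)) of a discrete distribution, in bits.
    Events with p_i = 0 contribute nothing (the limit of p*log2(p) is 0)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy_bits([0.5, 0.5]))   # 1.0    fair coin: maximum average information
print(entropy_bits([0.9, 0.1]))   # ~0.469 biased coin: more predictable, lower entropy
print(entropy_bits([0.25] * 4))   # 2.0    four equally likely events: log2(4)
</code>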