$$ I_E = \log_2 \frac{1}{p_E} = -\log_2( p_E ) \; [bit] $$
Shannon used the logarithm to measure information because only the logarithmic function makes the information of independent events additive. If two independent events \( A \) and \( B \) occur, their joint probability is \( p(A,B) = p(A) \cdot p(B) \).

We expect the total information to add up:

$$ I(A,B) = I(A) + I(B) $$

Only the logarithm satisfies this property:

$$ I(p) = -\log p \quad \Rightarrow \quad I(A,B) = -\log(p(A) \cdot p(B)) = -\log p(A) - \log p(B) = I(A) + I(B) $$

If we used \( I(p) = 1/p \) instead, the values of independent events would multiply, not add.
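A minimal C sketch to check this additivity numerically (the probabilities 0.5 and 0.125 are arbitrary illustrative values; compile with ''-lm''):

<code c>
#include <stdio.h>
#include <math.h>

int main(void) {
    /* arbitrary probabilities of two independent events */
    double pA = 0.5;    /* I(A) = -log2(0.5)   = 1 bit  */
    double pB = 0.125;  /* I(B) = -log2(0.125) = 3 bits */

    double I_A  = -log2(pA);
    double I_B  = -log2(pB);
    double I_AB = -log2(pA * pB); /* joint event: p(A,B) = p(A) * p(B) */

    /* prints: I(A) = 1.0  I(B) = 3.0  I(A,B) = 4.0 */
    printf("I(A) = %.1f  I(B) = %.1f  I(A,B) = %.1f\n", I_A, I_B, I_AB);
    return 0;
}
</code>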
The properties of the logarithm function play an important role in modeling the quantitative properties of information.
If an event space consists of two equally probable events \( p(E_1) = p(E_2) = 0.5 \), then the information content of either event is

$$ I_E = \log_2 \frac{1}{0.5} = 1 \; [bit] $$
The average information content of the set of messages is called the //entropy// of the message set.
$$ H_E = \sum_{i=1}^n p_i \cdot I_{E_i} = \sum_{i=1}^n p_i \cdot \log_2 \frac{1}{p_i} = - \sum_{i=1}^n p_i \cdot \log_2 p_i \; [bit] $$
**Example**: consider a message set of three messages with probabilities \( p_1 = 0.5 \), \( p_2 = 0.25 \), \( p_3 = 0.25 \). Its entropy is

$$ H_E = 0.5 \cdot \log_2 2 + 0.25 \cdot \log_2 4 + 0.25 \cdot \log_2 4 = 0.5 + 0.5 + 0.5 = 1.5 \; [bit] $$
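A minimal C sketch of the same calculation, assuming the illustrative probabilities above (compile with ''-lm''):

<code c>
#include <stdio.h>
#include <math.h>

int main(void) {
    /* illustrative probabilities from the example above */
    double p[] = { 0.5, 0.25, 0.25 };
    double H = 0.0;

    /* H = -sum(p_i * log2(p_i)) */
    for (int i = 0; i < 3; i++) {
        H -= p[i] * log2(p[i]);
    }

    printf("H = %.2f bit\n", H); /* prints: H = 1.50 bit */
    return 0;
}
</code>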
Entropy can also be viewed as a measure of the "uncertainty" of a message source: the less predictable the messages are, the higher the entropy.
Example:

  * If a source always sends the same letter ("AAAAA…"), its entropy is 0 bits: every character is completely predictable and carries no new information.
  * If every letter occurs with equal probability (e.g., random characters), the entropy is maximal: no character is more predictable than any other.
This concept is crucial in various fields, including //data compression//, //cryptography//, and //telecommunications//.
The following C program slides a 20-byte window over a text sample and prints the entropy of each window. The garbled fragment ''yQ%v?'' is deliberate "noise" embedded in otherwise plain English text, so the windows that cover it show an entropy spike.

<code c>
#include <stdio.h>
#include <string.h>
#include <math.h>

float calculateEntropy(unsigned char counts[], int length);

int main(void) {
    const char sample[] =
        "Some poetry types are unique to particular cultures and genres "
        "and respond to yQ%v? characteristics of the language in which the "
        "poet writes. Readers accustomed to identifying poetry with Dante, "
        "Goethe, Mickiewicz, or Rumi may think of it as written in lines "
        "based on rhyme and regular meter. There are, however, traditions, "
        "such as Biblical poetry and alliterative verse, that use other "
        "means to create rhythm and euphony. Much modern poetry reflects "
        "a critique of poetic tradition, testing the principle of euphony "
        "itself or altogether forgoing rhyme or set rhythm.";

    const int windowWidth = 20;
    unsigned char counts[256];
    int sampleLength = (int)strlen(sample);

    /* slide the window one byte at a time over the sample */
    for (int start = 0; start <= sampleLength - windowWidth; start++) {
        memset(counts, 0, sizeof(counts));

        /* count the byte frequencies inside the current window */
        for (int j = 0; j < windowWidth; j++) {
            unsigned char c = (unsigned char)sample[start + j];
            counts[c]++;
        }

        printf("%f\n", calculateEntropy(counts, windowWidth));
    }
    return 0;
}

/* H = -sum(p_i * log2(p_i)) over the bytes that occur in the window */
float calculateEntropy(unsigned char counts[], int length) {
    float entropy = 0.0f;
    for (int i = 0; i < 256; i++) {
        if (counts[i] > 0) {
            float freq = (float)counts[i] / length;
            entropy -= freq * log2f(freq);
        }
    }
    return entropy;
}
</code>
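Compile with ''gcc entropy.c -o entropy -lm'' (the file name is arbitrary). Each printed value is the entropy of one 20-byte window. Since a 20-byte window contains at most 20 distinct bytes, its entropy is at most \( \log_2 20 \approx 4.32 \) bits; the windows covering the random ''yQ%v?'' characters should print noticeably higher values than windows of ordinary English text, whose byte distribution is less uniform.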