tanszek:oktatas:techcomm:lzw_coding
Differences
This shows you the differences between two versions of the page.
Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
tanszek:oktatas:techcomm:lzw_coding [2024/10/07 08:11] – [Step 1: Initialize the Dictionary] knehez | tanszek:oktatas:techcomm:lzw_coding [2024/11/19 07:54] (current) – knehez | ||
---|---|---|---|
Line 6: | Line 6: | ||
1. **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text). | 1. **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text). | ||
+ | |||
2. **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary. | 2. **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary. | ||
+ | |||
3. **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input. | 3. **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input. | ||
+ | ---- | ||
=== Example of LZW Compression === | === Example of LZW Compression === | ||
Line 55: | Line 58: | ||
4. **Current substring**: | 4. **Current substring**: | ||
- It exists in the dictionary (index 65). | - It exists in the dictionary (index 65). | ||
+ | - Read the next character: `" | ||
+ | - The substring `" | ||
+ | - Read the next character: `" | ||
- No more characters left to read. | - No more characters left to read. | ||
- | - **Output** the code for `"A"` (65). | + | - **Output** the code for `"ABA"` (258). |
- | #### Final Dictionary After Compression | + | ==== Final Dictionary After Compression |
The final dictionary looks like this: | The final dictionary looks like this: | ||
- | | Index | Character | + | ^Index^Character^ |
- | |-------|-----------| | + | |
| 65 | A | | | 65 | A | | ||
| 66 | B | | | 66 | B | | ||
Line 70: | Line 75: | ||
| 258 | ABA | | | 258 | ABA | | ||
- | #### Encoded Output | + | ==== Encoded Output |
The final output (in terms of dictionary indices) is: | The final output (in terms of dictionary indices) is: | ||
- | ``` | + | |
- | 65 66 256 65 | + | < |
- | ``` | + | |
This represents the compressed version of the original string `" | This represents the compressed version of the original string `" | ||
Line 81: | Line 85: | ||
--- | --- | ||
- | ### Decoding LZW | + | ==== Decoding LZW ==== |
To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings. | To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings. | ||
- | Let’s go through the decoding process of the encoded sequence | + | Let’s go through the decoding process of the encoded sequence |
- | 1. **65**: Look up in the dictionary. It maps to `" | + | 1.) **65**: Look up in the dictionary. It maps to `" |
- | - **Decoded string so far**: `" | + | |
- | 2. **66**: Look up in the dictionary. It maps to `" | + | |
- | | + | |
- | 3. **256**: Look up in the dictionary. It maps to `"AB"`. | + | 2.) **66**: Look up in the dictionary. It maps to `"B"`. |
- | - **Decoded string so far**: `" | + | |
- | 4. **65**: Look up in the dictionary. It maps to `" | + | |
- | | + | |
- | The original | + | 3.) **256**: Look up in the dictionary. It maps to `" |
+ | |||
+ | * **Decoded | ||
+ | |||
+ | 4.) **258**: Look up in the dictionary. It maps to `" | ||
+ | |||
+ | * **Decoded string so far**: | ||
+ | |||
+ | The original string '' | ||
--- | --- | ||
- | ### Summary of LZW Compression | + | ==== Summary of LZW Compression |
1. **Dictionary Initialization**: | 1. **Dictionary Initialization**: | ||
+ | |||
2. **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings. | 2. **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings. | ||
+ | |||
3. **Dictionary Expansion**: | 3. **Dictionary Expansion**: | ||
+ | |||
4. **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes. | 4. **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes. | ||
Line 113: | Line 124: | ||
+ | C implementation of LZW encoding: | ||
+ | |||
+ | <sxh c> | ||
+ | #include < | ||
+ | #include < | ||
+ | |||
+ | #define MAX_sxhS 100 | ||
+ | |||
+ | char sxhTable[MAX_sxhS][10] = {" | ||
+ | |||
+ | int tableElements = 5; | ||
+ | |||
+ | int isInsxhTable(char *s) | ||
+ | { | ||
+ | for(int i = 0; i < tableElements; | ||
+ | { | ||
+ | if(strcmp(sxhTable[i], | ||
+ | { | ||
+ | return i; | ||
+ | } | ||
+ | } | ||
+ | return -1; | ||
+ | } | ||
+ | |||
+ | int main() | ||
+ | { | ||
+ | char text[] = " | ||
+ | char *p = text; | ||
+ | int bufferEnd = 1; | ||
+ | |||
+ | int lastFoundIndex = -1; | ||
+ | while(*p != ' | ||
+ | { | ||
+ | char subStr[10]; | ||
+ | strncpy(subStr, | ||
+ | subStr[bufferEnd] = ' | ||
+ | int foundIndex = isInsxhTable(subStr); | ||
+ | if(foundIndex != -1) | ||
+ | { | ||
+ | | ||
+ | | ||
+ | | ||
+ | } | ||
+ | p += strlen(subStr) - 1; | ||
+ | bufferEnd = 1; | ||
+ | strcpy(sxhTable[tableElements++], | ||
+ | printf(" | ||
+ | } | ||
+ | } | ||
+ | </ |
tanszek/oktatas/techcomm/lzw_coding.1728288709.txt.gz · Last modified: 2024/10/07 08:11 by knehez