tanszek:oktatas:techcomm:lzw_coding
Differences
This shows you the differences between two versions of the page.
| Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
| tanszek:oktatas:techcomm:lzw_coding [2024/10/07 08:10] – [Key Steps in LZW Compression] knehez | tanszek:oktatas:techcomm:lzw_coding [2024/11/19 07:54] (current) – knehez | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| 1. **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text). | 1. **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text). | ||
| + | |||
| 2. **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary. | 2. **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary. | ||
| + | |||
| 3. **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input. | 3. **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input. | ||
| + | ---- | ||
| === Example of LZW Compression === | === Example of LZW Compression === | ||
| Line 29: | Line 32: | ||
| | 66 | B | | | 66 | B | | ||
| - | #### Step 2: Start Encoding | + | ==== Step 2: Start Encoding |
| We start reading the input string character by character and follow these steps: | We start reading the input string character by character and follow these steps: | ||
| Line 55: | Line 58: | ||
| 4. **Current substring**: | 4. **Current substring**: | ||
| - It exists in the dictionary (index 65). | - It exists in the dictionary (index 65). | ||
| + | - Read the next character: `" | ||
| + | - The substring `" | ||
| + | - Read the next character: `" | ||
| - No more characters left to read. | - No more characters left to read. | ||
| - | - **Output** the code for `"A"` (65). | + | - **Output** the code for `"ABA"` (258). |
| - | #### Final Dictionary After Compression | + | ==== Final Dictionary After Compression |
| The final dictionary looks like this: | The final dictionary looks like this: | ||
| - | | Index | Character | + | ^Index^Character^ |
| - | |-------|-----------| | + | |
| | 65 | A | | | 65 | A | | ||
| | 66 | B | | | 66 | B | | ||
| Line 70: | Line 75: | ||
| | 258 | ABA | | | 258 | ABA | | ||
| - | #### Encoded Output | + | ==== Encoded Output |
| The final output (in terms of dictionary indices) is: | The final output (in terms of dictionary indices) is: | ||
| - | ``` | + | |
| - | 65 66 256 65 | + | < |
| - | ``` | + | |
| This represents the compressed version of the original string `" | This represents the compressed version of the original string `" | ||
| Line 81: | Line 85: | ||
| --- | --- | ||
| - | ### Decoding LZW | + | ==== Decoding LZW ==== |
| To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings. | To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings. | ||
| - | Let’s go through the decoding process of the encoded sequence | + | Let’s go through the decoding process of the encoded sequence |
| - | 1. **65**: Look up in the dictionary. It maps to `" | + | 1.) **65**: Look up in the dictionary. It maps to `" |
| - | - **Decoded string so far**: `" | + | |
| - | 2. **66**: Look up in the dictionary. It maps to `" | + | |
| - | | + | |
| - | 3. **256**: Look up in the dictionary. It maps to `"AB"`. | + | 2.) **66**: Look up in the dictionary. It maps to `"B"`. |
| - | - **Decoded string so far**: `" | + | |
| - | 4. **65**: Look up in the dictionary. It maps to `" | + | |
| - | | + | |
| - | The original | + | 3.) **256**: Look up in the dictionary. It maps to `" |
| + | |||
| + | * **Decoded | ||
| + | |||
| + | 4.) **258**: Look up in the dictionary. It maps to `" | ||
| + | |||
| + | * **Decoded string so far**: | ||
| + | |||
| + | The original string '' | ||
| --- | --- | ||
| - | ### Summary of LZW Compression | + | ==== Summary of LZW Compression |
| 1. **Dictionary Initialization**: | 1. **Dictionary Initialization**: | ||
| + | |||
| 2. **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings. | 2. **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings. | ||
| + | |||
| 3. **Dictionary Expansion**: | 3. **Dictionary Expansion**: | ||
| + | |||
| 4. **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes. | 4. **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes. | ||
| Line 113: | Line 124: | ||
| + | C implementation of LZW encoding: | ||
| + | |||
| + | <sxh c> | ||
| + | #include < | ||
| + | #include < | ||
| + | |||
| + | #define MAX_sxhS 100 | ||
| + | |||
| + | char sxhTable[MAX_sxhS][10] = {" | ||
| + | |||
| + | int tableElements = 5; | ||
| + | |||
| + | int isInsxhTable(char *s) | ||
| + | { | ||
| + | for(int i = 0; i < tableElements; | ||
| + | { | ||
| + | if(strcmp(sxhTable[i], | ||
| + | { | ||
| + | return i; | ||
| + | } | ||
| + | } | ||
| + | return -1; | ||
| + | } | ||
| + | |||
| + | int main() | ||
| + | { | ||
| + | char text[] = " | ||
| + | char *p = text; | ||
| + | int bufferEnd = 1; | ||
| + | |||
| + | int lastFoundIndex = -1; | ||
| + | while(*p != ' | ||
| + | { | ||
| + | char subStr[10]; | ||
| + | strncpy(subStr, | ||
| + | subStr[bufferEnd] = ' | ||
| + | int foundIndex = isInsxhTable(subStr); | ||
| + | if(foundIndex != -1) | ||
| + | { | ||
| + | | ||
| + | | ||
| + | | ||
| + | } | ||
| + | p += strlen(subStr) - 1; | ||
| + | bufferEnd = 1; | ||
| + | strcpy(sxhTable[tableElements++], | ||
| + | printf(" | ||
| + | } | ||
| + | } | ||
| + | </ | ||
tanszek/oktatas/techcomm/lzw_coding.1728288635.txt.gz · Last modified: 2024/10/07 08:10 by knehez
