User Tools

Site Tools


tanszek:oktatas:techcomm:lzw_coding

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
tanszek:oktatas:techcomm:lzw_coding [2024/10/07 08:11] – [Step 1: Initialize the Dictionary] kneheztanszek:oktatas:techcomm:lzw_coding [2024/11/19 07:54] (current) knehez
Line 6: Line 6:
  
 1. **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text). 1. **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text).
 +
 2. **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary. 2. **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary.
 +
 3. **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input. 3. **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input.
 +----
  
 === Example of LZW Compression === === Example of LZW Compression ===
Line 55: Line 58:
 4. **Current substring**: `"A"`   4. **Current substring**: `"A"`  
    - It exists in the dictionary (index 65).    - It exists in the dictionary (index 65).
 +   - Read the next character: `"B"`.
 +   - The substring `"AB"` **already exists** in the dictionary (index 256).
 +   - Read the next character: `"A"`, forming `"ABA"`.
    - No more characters left to read.    - No more characters left to read.
-   - **Output** the code for `"A"` (65).+   - **Output** the code for `"ABA"` (258).
  
-#### Final Dictionary After Compression+==== Final Dictionary After Compression ====
  
 The final dictionary looks like this: The final dictionary looks like this:
  
-Index Character +^Index^Character^
-|-------|-----------|+
 | 65    | A         | | 65    | A         |
 | 66    | B         | | 66    | B         |
Line 70: Line 75:
 | 258   | ABA       | | 258   | ABA       |
  
-#### Encoded Output+==== Encoded Output ====
  
 The final output (in terms of dictionary indices) is: The final output (in terms of dictionary indices) is:
-``` + 
-65 66 256 65 +<code>65 66 256 258</code>
-```+
  
 This represents the compressed version of the original string `"ABABABA"`. This represents the compressed version of the original string `"ABABABA"`.
Line 81: Line 85:
 --- ---
  
-### Decoding LZW+==== Decoding LZW ====
  
 To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings. To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings.
  
-Let’s go through the decoding process of the encoded sequence `65 66 256 65`:+Let’s go through the decoding process of the encoded sequence ''65 66 256 65'':
  
-1. **65**: Look up in the dictionary. It maps to `"A"`. +1.**65**: Look up in the dictionary. It maps to `"A"`.
-   - **Decoded string so far**: `"A"`+
  
-2. **66**: Look up in the dictionary. It maps to `"B"`. +  * **Decoded string so far**: `"A"`
-   **Decoded string so far**: `"AB"`+
  
-3. **256**: Look up in the dictionary. It maps to `"AB"`. +2.**66**: Look up in the dictionary. It maps to `"B"`.
-   - **Decoded string so far**: `"ABAB"`+
  
-4. **65**: Look up in the dictionary. It maps to `"A"`. +  * **Decoded string so far**: `"AB"`
-   **Decoded string so far**: `"ABABABA"`+
  
-The original string `"ABABABA"` is reconstructed from the compressed codes.+3.) **256**: Look up in the dictionary. It maps to `"AB"`. 
 + 
 +  * **Decoded string so far**: `"ABAB"
 + 
 +4.) **258**: Look up in the dictionary. It maps to `"ABA"`. 
 + 
 +  * **Decoded string so far**: `"ABABABA"` 
 + 
 +The original string ''ABABABA'' is reconstructed from the compressed codes.
  
 --- ---
  
-### Summary of LZW Compression+==== Summary of LZW Compression ====
  
 1. **Dictionary Initialization**: The dictionary starts with all individual characters. 1. **Dictionary Initialization**: The dictionary starts with all individual characters.
 +
 2. **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings. 2. **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings.
 +
 3. **Dictionary Expansion**: New substrings are added to the dictionary as the input is processed. 3. **Dictionary Expansion**: New substrings are added to the dictionary as the input is processed.
 +
 4. **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes. 4. **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes.
  
Line 113: Line 124:
  
  
 +C implementation of LZW encoding:
 +
 +<sxh c>
 +#include <stdio.h>
 +#include <string.h>
 +
 +#define MAX_sxhS 100
 +
 +char sxhTable[MAX_sxhS][10] = {"a","b","c","d","e"};
 +
 +int tableElements = 5;
 +
 +int isInsxhTable(char *s)
 +{
 +    for(int i = 0; i < tableElements; i++)
 +    {
 +        if(strcmp(sxhTable[i], s) == 0)
 +        {
 +            return i;
 +        }
 +    }
 +    return -1;
 +}
 +
 +int main()
 +{
 +    char text[] = "dabbacdabbacdabbacdabbacdee";
 +    char *p = text;
 +    int bufferEnd = 1;
 +
 +    int lastFoundIndex = -1;
 +    while(*p != '\0')
 +    {
 +        char subStr[10];
 +        strncpy(subStr, p, bufferEnd);
 +        subStr[bufferEnd] = '\0';
 +        int foundIndex = isInsxhTable(subStr);
 +        if(foundIndex != -1)
 +        {
 +           bufferEnd++;
 +           lastFoundIndex = foundIndex;
 +           continue;
 +        }
 +        p += strlen(subStr) - 1;
 +        bufferEnd = 1;
 +        strcpy(sxhTable[tableElements++], subStr);
 +        printf("%d,", lastFoundIndex + 1);
 +    }
 +}
  
 +</sxh>
tanszek/oktatas/techcomm/lzw_coding.1728288709.txt.gz · Last modified: 2024/10/07 08:11 by knehez