--- tanszek:oktatas:techcomm:lzw_coding [2024/10/07 08:11] – [Step 1: Initialize the Dictionary] knehez
+++ tanszek:oktatas:techcomm:lzw_coding [2024/11/19 07:54] (current) – knehez
@@ Line 6: / Line 6: @@
 . **Initialize a dictionary** with all single characters in the input data (e.g., all ASCII characters if the input is text).
 . **Scan the input** string character by character, building longer and longer substrings that exist in the dictionary.
 . **Add new substrings** to the dictionary and output the dictionary index for the current substring when it can no longer be extended by the next character in the input.
+----
 === Example of LZW Compression ===
@@ Line 55: / Line 58: @@
 . **Current substring**: `"A"`
    - It exists in the dictionary (index 65).
+   - Read the next character: `"B"`.
+   - The substring `"AB"` **already exists** in the dictionary (index 256).
+   - Read the next character: `"A"`, forming `"ABA"`.
    - No more characters left to read.
-   - **Output** the code for `"A"` (65).
+   - **Output** the code for `"ABA"` (258).
-#### Final Dictionary After Compression
+==== Final Dictionary After Compression ====
 The final dictionary looks like this:
-| Index | Character |
+^Index^Character^
-|-------|-----------|
 | 65    | A         |
 | 66    | B         |
@@ Line 70: / Line 75: @@
 | 258   | ABA       |
-#### Encoded Output
+==== Encoded Output ====
 The final output (in terms of dictionary indices) is:
-```
-66 256 65
+<code>65 66 256 258</code>
-```
 This represents the compressed version of the original string `"ABABABA"`.
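
As a cross-check of the walkthrough, here is a minimal encoder sketch. It assumes the standard 256-entry byte dictionary (so `"A"` is 65 and the first new entry is 256). The names `lzw_encode`, `find_entry`, `DICT_MAX`, and `FIRST_NEW` are illustrative and do not come from the page.

```c
#include <assert.h>
#include <stddef.h>

#define DICT_MAX  4096  /* assumed dictionary capacity */
#define FIRST_NEW 256   /* codes 0..255 stand for the single bytes */

/* Every learned entry is stored as (code of its prefix, final byte). */
static int enc_prefix[DICT_MAX];
static unsigned char enc_last[DICT_MAX];

/* Linear search for the entry "prefix + byte"; returns -1 if absent. */
static int find_entry(int next, int prefix, unsigned char byte)
{
    for (int i = FIRST_NEW; i < next; i++) {
        if (enc_prefix[i] == prefix && enc_last[i] == byte) {
            return i;
        }
    }
    return -1;
}

/* Encode a NUL-terminated string. Returns the number of codes written,
   or -1 if codes[] cannot hold them all. */
int lzw_encode(const char *text, int *codes, size_t max_codes)
{
    assert(text != NULL && codes != NULL);
    int next = FIRST_NEW;   /* next free dictionary index */
    size_t count = 0;
    if (*text == '\0') {
        return 0;
    }
    int current = (unsigned char)*text++;   /* code of the current substring */
    for (; *text != '\0'; text++) {
        unsigned char byte = (unsigned char)*text;
        int longer = find_entry(next, current, byte);
        if (longer != -1) {
            current = longer;   /* the extended substring is already known */
            continue;
        }
        if (count == max_codes) {
            return -1;
        }
        codes[count++] = current;           /* emit the longest match */
        if (next < DICT_MAX) {              /* learn "match + byte" */
            enc_prefix[next] = current;
            enc_last[next] = byte;
            next++;
        }
        current = byte;                     /* restart from the unmatched byte */
    }
    if (count == max_codes) {
        return -1;
    }
    codes[count++] = current;               /* flush the final substring */
    return (int)count;
}
```

Calling `lzw_encode("ABABABA", codes, 16)` with a 16-element `int codes[]` array should fill it with the four codes listed above. The linear dictionary search keeps the sketch short; a real encoder would more likely use a hash table or trie.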
@@ Line 81: / Line 85: @@
 ---
-### Decoding LZW
+==== Decoding LZW ====
 To decode an LZW-compressed string, you use the same dictionary-based approach but in reverse. The decoder reconstructs the dictionary entries as it processes the encoded output, using the indices to retrieve the corresponding substrings.
-Let’s go through the decoding process of the encoded sequence `65 66 256 65`:
+Let’s go through the decoding process of the encoded sequence ''65 66 256 258'':
-. **65**: Look up in the dictionary. It maps to `"A"`.
+.) **65**: Look up in the dictionary. It maps to `"A"`.
-   - **Decoded string so far**: `"A"`
-. **66**: Look up in the dictionary. It maps to `"B"`.
+  * **Decoded string so far**: `"A"`
-   - **Decoded string so far**: `"AB"`
-. **256**: Look up in the dictionary. It maps to `"AB"`.
+.) **66**: Look up in the dictionary. It maps to `"B"`.
-   - **Decoded string so far**: `"ABAB"`
-. **65**: Look up in the dictionary. It maps to `"A"`.
+  * **Decoded string so far**: `"AB"`
-   - **Decoded string so far**: `"ABABABA"`
-The original string `"ABABABA"` is reconstructed from the compressed codes.
+.) **256**: Look up in the dictionary. It maps to `"AB"`.
+  * **Decoded string so far**: `"ABAB"`
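+.) **258**: This code is not in the decoder's dictionary yet, so it must stand for the previous string `"AB"` followed by that string's first character `"A"`, which gives `"ABA"`.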
+  * **Decoded string so far**: `"ABABABA"`
+The original string ''ABABABA'' is reconstructed from the compressed codes.
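
The decoding steps can be sketched in C as follows. This is a rough illustration, again assuming the standard 256-entry byte dictionary. The names `lzw_decode`, `expand`, `first_byte`, `DICT_MAX`, and `FIRST_NEW` are hypothetical and do not come from the page. The sketch also handles the case where a code arrives before the decoder has defined it, as happens with 258 in this example.

```c
#include <assert.h>
#include <stddef.h>

#define DICT_MAX  4096  /* assumed dictionary capacity */
#define FIRST_NEW 256   /* codes 0..255 stand for the single bytes */

/* Every learned entry is stored as (code of its prefix, final byte). */
static int prefix_of[DICT_MAX];
static unsigned char last_of[DICT_MAX];

/* First byte of the string that a code stands for. */
static unsigned char first_byte(int code)
{
    while (code >= FIRST_NEW) {
        code = prefix_of[code];
    }
    return (unsigned char)code;
}

/* Write the string for `code` into dst (no terminator).
   Returns its length, or 0 if it does not fit into cap bytes. */
static size_t expand(int code, char *dst, size_t cap)
{
    size_t len = 1;
    for (int c = code; c >= FIRST_NEW; c = prefix_of[c]) {
        len++;
    }
    if (len > cap) {
        return 0;
    }
    size_t i = len;
    while (code >= FIRST_NEW) {         /* fill from the back */
        dst[--i] = (char)last_of[code];
        code = prefix_of[code];
    }
    dst[0] = (char)code;
    return len;
}

/* Decode n codes into a NUL-terminated string.
   Returns 0 on success, -1 on an invalid code or a too-small buffer. */
int lzw_decode(const int *codes, size_t n, char *out, size_t out_size)
{
    assert(codes != NULL && out != NULL && out_size > 0);
    int next = FIRST_NEW;   /* next free dictionary index */
    int prev = -1;          /* previously decoded code */
    size_t pos = 0;
    for (size_t k = 0; k < n; k++) {
        int code = codes[k];
        if (code < 0 || code > next || code >= DICT_MAX) {
            return -1;
        }
        if (prev >= 0 && next < DICT_MAX) {
            /* New entry = previous string + first byte of the current one.
               If the code is not defined yet (code == next), the current
               string starts with the first byte of the previous string. */
            prefix_of[next] = prev;
            last_of[next] = first_byte(code == next ? prev : code);
            next++;
        }
        if (code >= next) {
            return -1;      /* still undefined: the input is corrupt */
        }
        size_t len = expand(code, out + pos, out_size - 1 - pos);
        if (len == 0) {
            return -1;
        }
        pos += len;
        prev = code;
    }
    out[pos] = '\0';
    return 0;
}
```

For the codes `65 66 256 258` and a sufficiently large buffer, `lzw_decode` should reproduce `ABABABA`. Storing entries as prefix and final byte avoids copying strings into the dictionary, at the cost of walking the prefix chain on every lookup.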
 ---
-### Summary of LZW Compression
+==== Summary of LZW Compression ====
 . **Dictionary Initialization**: The dictionary starts with all individual characters.
 . **Scanning Input**: Input data is scanned character by character, grouping together characters to form substrings.
 . **Dictionary Expansion**: New substrings are added to the dictionary as the input is processed.
 . **Output**: The dictionary indices corresponding to the substrings are output, resulting in a compressed sequence of codes.
@@ Line 113: / Line 124: @@
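+C implementation of LZW encoding:
+<sxh c>
+#include <stdio.h>
+#include <string.h>
+#define MAX_CODES 100
+#define MAX_LEN 10
+/* Toy dictionary: starts with the five single letters used in the text. */
+char codeTable[MAX_CODES][MAX_LEN] = {"a","b","c","d","e"};
+int tableElements = 5;
+/* Return the index of s in the table, or -1 if it is not there. */
+int isInCodeTable(const char *s)
+{
+    for(int i = 0; i < tableElements; i++)
+    {
+        if(strcmp(codeTable[i], s) == 0)
+        {
+            return i;
+        }
+    }
+    return -1;
+}
+int main(void)
+{
+    /* The input is assumed to contain only the letters a to e. */
+    char text[] = "dabbacdabbacdabbacdabbacdee";
+    char *p = text;
+    size_t bufferEnd = 1;     /* length of the candidate substring */
+    int lastFoundIndex = -1;  /* index of the longest match so far */
+    while(*p != '\0')
+    {
+        /* Extend only while the candidate fits in the input and in a table row. */
+        if(bufferEnd <= strlen(p) && bufferEnd < MAX_LEN)
+        {
+            char subStr[MAX_LEN];
+            memcpy(subStr, p, bufferEnd);
+            subStr[bufferEnd] = '\0';
+            int foundIndex = isInCodeTable(subStr);
+            if(foundIndex != -1)
+            {
+                bufferEnd++;
+                lastFoundIndex = foundIndex;
+                continue;
+            }
+            /* First unseen substring: store it as a new dictionary entry. */
+            if(tableElements < MAX_CODES)
+            {
+                strcpy(codeTable[tableElements++], subStr);
+            }
+        }
+        /* Emit the 1-based index of the longest match, then restart at the first unmatched character. */
+        printf("%d,", lastFoundIndex + 1);
+        p += bufferEnd - 1;
+        bufferEnd = 1;
+    }
+    printf("\n");
+    return 0;
+}
+</sxh>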