Differences

This shows you the differences between two versions of the page.

--- tanszek:oktatas:techcomm:rle_coding [2024/10/07 08:00] – created knehez
+++ tanszek:oktatas:techcomm:rle_coding [2025/11/10 14:51] (current) – knehez
@@ Line 1: / Line 1: @@
-===== Run-length Encoding (RLE) =====
+===== Run-Length Encoding (RLE) =====
-Run-Length Encoding (RLE) – simple compression technique where consecutive repeating identical characters or bits are encoded with a single symbol and a repetition count. It was previously used for compressing black-and-white images but also in TGA, PCX formats, and FAX machines.
+Run-Length Encoding (RLE) is one of the simplest **lossless data compression** techniques.
+It works by replacing sequences of **repeated elements** (called "runs") with a single value and a repetition count.
-It’s easy to understand with an example.
+Run-Length Encoding cannot be attributed to a single inventor; rather, it is based on one of the earliest and simplest general principles of data compression, developed and first used in the 1950s for computer image processing.
-Let’s encode the following characters using RLE encoding:
+----
-<code>WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW</code>
-Encoded:
+==== Basic Idea ====
-<code>12W1B12W3B24W1B14W</code>
+When a sequence contains the same value multiple times in a row, instead of storing it multiple times, we store only the value and the number of times it repeats.
-The interpretation is: 12 W’s, 1 B, 12 W’s, 3 B’s, and so on.
+For example:
+<code>
+AAAAABBBCCDAA → 5A3B2C1D2A
+</code>
-Thus, the text encoded with run-length encoding describes the original 67 characters using 18. Of course, it is used with binary code rather than characters.
+Here:
+  * 'A' repeats 5 times → 5A
+  * 'B' repeats 3 times → 3B
+  * 'C' repeats 2 times → 2C
+  * 'D' repeats 1 time → 1D
+  * 'A' repeats again 2 times → 2A
-C program that implements this algorithm:
+The original message had 12 characters, but the encoded version has only 10, so there is some compression (especially for longer runs).
+----
+==== Binary Example ====
+RLE is especially effective for **binary data**, such as black-and-white images.
+Example (0 = white pixel, 1 = black pixel):
+<code>
+Original: 000000001111100000011111111000000000
+Encoded : 8 0s, 5 1s, 6 0s, 8 1s, 7 0s → (8,0)(5,1)(6,0)(8,1)(7,0)
+</code>
+This binary form is what early image formats (like **PCX**, **TGA**) and **FAX machines** used.
+----
+==== Decoding ====
+To reconstruct (decode) the original message, we simply repeat each symbol as many times as its count indicates.
+For example:
+<code>
+Encoded: 12W1B12W3B24W1B14W
+Decoded: WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW
+</code>
+----
+==== Advantages ====
+  * Very simple to implement
+  * Works well for data containing **long runs of identical values** (e.g., simple images, icons, masks)
+  * Low computational cost (fast encoding/decoding)
+----
+==== Disadvantages ====
+  * Inefficient when the data changes frequently (e.g., photographs, text files)
+  * May increase file size if no long runs exist (e.g,. `ABCD` → `1A1B1C1D`, which is longer)
+  * Usually combined with other compression algorithms in practical applications
+----
+==== Applications ====
+  * Image compression (TGA, PCX, BMP with RLE)
+  * FAX transmission (CCITT Group 3/4 standard)
+  * Simple bitmap fonts
+  * Storing masks and binary patterns
+  * Used as a sub-step in complex formats (e.g., TIFF, PDF internal compression)
+----
+A C program that implements this algorithm:
 <sxh c>