===== Run-Length Encoding (RLE) =====
Run-Length Encoding (RLE) is one of the simplest **lossless data compression** techniques.
It works by replacing sequences of **repeated elements** (called "runs") with a single value and a repetition count.
Run-Length Encoding cannot be attributed to a single inventor; rather, it is based on one of the earliest and simplest general principles of data compression, developed and first used in the 1950s for computer image processing.
----
==== Basic Idea ====
When a sequence contains the same value multiple times in a row, instead of storing it multiple times, we store only the value and the number of times it repeats.
For example:
AAAAABBBCCDAA → 5A3B2C1D2A
Here:
* 'A' repeats 5 times → 5A
* 'B' repeats 3 times → 3B
* 'C' repeats 2 times → 2C
* 'D' repeats 1 time → 1D
* 'A' repeats again 2 times → 2A
The original message had 12 characters, but the encoded version has only 10, so there is some compression (especially for longer runs).
----
==== Binary Example ====
RLE is especially effective for **binary data**, such as black-and-white images.
Example (0 = white pixel, 1 = black pixel):
Original: 000000001111100000011111111000000000
Encoded : 8 0s, 5 1s, 6 0s, 8 1s, 7 0s → (8,0)(5,1)(6,0)(8,1)(7,0)
This binary form is what early image formats (like **PCX**, **TGA**) and **FAX machines** used.
----
==== Decoding ====
To reconstruct (decode) the original message, we simply repeat each symbol as many times as its count indicates.
For example:
Encoded: 12W1B12W3B24W1B14W
Decoded: WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW
----
==== Advantages ====
* Very simple to implement
* Works well for data containing **long runs of identical values** (e.g., simple images, icons, masks)
* Low computational cost (fast encoding/decoding)
----
==== Disadvantages ====
* Inefficient when the data changes frequently (e.g., photographs, text files)
* May increase file size if no long runs exist (e.g,. `ABCD` → `1A1B1C1D`, which is longer)
* Usually combined with other compression algorithms in practical applications
----
==== Applications ====
* Image compression (TGA, PCX, BMP with RLE)
* FAX transmission (CCITT Group 3/4 standard)
* Simple bitmap fonts
* Storing masks and binary patterns
* Used as a sub-step in complex formats (e.g., TIFF, PDF internal compression)
----
A C program that implements this algorithm:
#include
#include
void runLengthEncoding(const char *input) {
int length = strlen(input);
for (int i = 0; i < length; i++) {
// Count occurrences of the current character
int count = 1;
while (i < length - 1 && input[i] == input[i + 1]) {
count++;
i++;
}
// Print the count followed by the character
printf("%d%c", count, input[i]);
}
printf("\n"); // New line after encoding
}
int main() {
// Example string to encode
char input[] = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW";
printf("Original String: %s\n", input);
printf("Run-Length Encoded: ");
runLengthEncoding(input);
return 0;
}