Run-Length Encoding (RLE) is one of the simplest lossless data compression techniques. It works by replacing sequences of repeated elements (called “runs”) with a single value and a repetition count.
Run-Length Encoding cannot be attributed to a single inventor; rather, it is based on one of the earliest and simplest general principles of data compression, developed and first used in the 1950s for computer image processing.
When a sequence contains the same value multiple times in a row, instead of storing it multiple times, we store only the value and the number of times it repeats.
For example:
AAAAABBBCCDAA → 5A3B2C1D2A
Here:
The original message had 12 characters, but the encoded version has only 10, so there is some compression (especially for longer runs).
RLE is especially effective for binary data, such as black-and-white images.
Example (0 = white pixel, 1 = black pixel):
Original: 000000001111100000011111111000000000 Encoded : 8 0s, 5 1s, 6 0s, 8 1s, 7 0s → (8,0)(5,1)(6,0)(8,1)(7,0)
This binary form is what early image formats (like PCX, TGA) and FAX machines used.
To reconstruct (decode) the original message, we simply repeat each symbol as many times as its count indicates.
For example:
Encoded: 12W1B12W3B24W1B14W Decoded: WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW
A C program that implements this algorithm:
#include <stdio.h>
#include <string.h>
void runLengthEncoding(const char *input) {
int length = strlen(input);
for (int i = 0; i < length; i++) {
// Count occurrences of the current character
int count = 1;
while (i < length - 1 && input[i] == input[i + 1]) {
count++;
i++;
}
// Print the count followed by the character
printf("%d%c", count, input[i]);
}
printf("\n"); // New line after encoding
}
int main() {
// Example string to encode
char input[] = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW";
printf("Original String: %s\n", input);
printf("Run-Length Encoded: ");
runLengthEncoding(input);
return 0;
}