User Tools

Site Tools


tanszek:oktatas:techcomm:rle_coding

Run-Length Encoding (RLE)

Run-Length Encoding (RLE) is one of the simplest lossless data compression techniques. It works by replacing sequences of repeated elements (called “runs”) with a single value and a repetition count.

Run-Length Encoding cannot be attributed to a single inventor; rather, it is based on one of the earliest and simplest general principles of data compression, developed and first used in the 1950s for computer image processing.


Basic Idea

When a sequence contains the same value multiple times in a row, instead of storing it multiple times, we store only the value and the number of times it repeats.

For example:

AAAAABBBCCDAA → 5A3B2C1D2A

Here:

  • 'A' repeats 5 times → 5A
  • 'B' repeats 3 times → 3B
  • 'C' repeats 2 times → 2C
  • 'D' repeats 1 time → 1D
  • 'A' repeats again 2 times → 2A

The original message had 12 characters, but the encoded version has only 10, so there is some compression (especially for longer runs).


Binary Example

RLE is especially effective for binary data, such as black-and-white images.

Example (0 = white pixel, 1 = black pixel):

Original: 000000001111100000011111111000000000
Encoded : 8 0s, 5 1s, 6 0s, 8 1s, 7 0s → (8,0)(5,1)(6,0)(8,1)(7,0)

This binary form is what early image formats (like PCX, TGA) and FAX machines used.


Decoding

To reconstruct (decode) the original message, we simply repeat each symbol as many times as its count indicates.

For example:

Encoded: 12W1B12W3B24W1B14W
Decoded: WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW

Advantages

  • Very simple to implement
  • Works well for data containing long runs of identical values (e.g., simple images, icons, masks)
  • Low computational cost (fast encoding/decoding)

Disadvantages

  • Inefficient when the data changes frequently (e.g., photographs, text files)
  • May increase file size if no long runs exist (e.g,. `ABCD` → `1A1B1C1D`, which is longer)
  • Usually combined with other compression algorithms in practical applications

Applications

  • Image compression (TGA, PCX, BMP with RLE)
  • FAX transmission (CCITT Group 3/4 standard)
  • Simple bitmap fonts
  • Storing masks and binary patterns
  • Used as a sub-step in complex formats (e.g., TIFF, PDF internal compression)

A C program that implements this algorithm:

#include <stdio.h>
#include <string.h>

void runLengthEncoding(const char *input) {
    int length = strlen(input);
    
    for (int i = 0; i < length; i++) {
        // Count occurrences of the current character
        int count = 1;
        while (i < length - 1 && input[i] == input[i + 1]) {
            count++;
            i++;
        }
        
        // Print the count followed by the character
        printf("%d%c", count, input[i]);
    }
    
    printf("\n"); // New line after encoding
}

int main() {
    // Example string to encode
    char input[] = "WWWWWWWWWWWWBWWWWWWWWWWWWBBBWWWWWWWWWWWWWWWWWWWWWWWWBWWWWWWWWWWWWWWW";
    
    printf("Original String: %s\n", input);
    printf("Run-Length Encoded: ");
    runLengthEncoding(input);
    
    return 0;
}

tanszek/oktatas/techcomm/rle_coding.txt · Last modified: 2025/11/10 14:51 by knehez