Run Length Encoding¶
Run length encoding is a common operation in compression applications. In this article, we discuss how to do this efficiently in MATLAB using vectorization techniques.
Let’s consider a simple sequence of integers:
x = [0 0 0 0 0 0 0 4 4 4 3 3 2 2 2 2 2 2 2 1 1 0 0 0 0 0 2 3 9 5 5 5 5 5 5]
The sequence has 35 elements.
First step is change detection:
>> diff_positions = find(diff(x) ~= 0)
diff_positions =
7 10 12 19 21 26 27 28 29
Note that these positions are 0 based indexes. The first difference is occurring at x(8).
We can use this to compute the runs of each symbol:
>> runs = diff([0 diff_positions numel(x)])
runs =
7 3 2 7 2 5 1 1 1 6
The start position for the first symbol of each run can also be easily obtained:
>> start_positions = [1 (diff_positions + 1)]
start_positions =
1 8 11 13 20 22 27 28 29 30
We can now pick up the symbols from x:
>> symbols = x(start_positions)
symbols =
0 4 3 2 1 0 2 3 9 5
Combine the symbols and their runs:
>> encoding = [symbols; runs]
encoding =
0 4 3 2 1 0 2 3 9 5
7 3 2 7 2 5 1 1 1 6
Flatten the encoding:
>> encoding = encoding(:)';
>> fprintf('%d ', encoding)
0 7 4 3 3 2 2 7 1 2 0 5 2 1 3 1 9 1 5 6 >>
We can cross check that the length of the encoded sequence is correct:
>> total_symbols = sum(runs)
total_symbols =
35
We can check the length of the encoded sequence:
>> >> numel(encoding)
ans =
20
It is indeed less than 35. The gain is not much since there were many symbols with just one occurrence.
The decoding can be easily done using a for loop:
x_dec = [];
for i=1:numel(encoding) / 2
symbol = encoding(i*2 -1);
run_length = encoding(i*2);
x_dec = [x_dec symbol * ones([1, run_length])];
end
Let’s print the decoded sequence:
>> fprintf('%d ', x_dec);
0 0 0 0 0 0 0 4 4 4 3 3 2 2 2 2 2 2 2 1 1 0 0 0 0 0 2 3 9 5 5 5 5 5 5
Verify that the decoded sequence is indeed same as original sequence:
>> sum(x_dec - x)
ans =
0
The library provides useful methods for performing run length encoding and decoding.
Encoding:
>> x = [0 0 0 0 3 3 3 2 2];
>> encoding = spx.dsp.runlength.encode(x)
encoding =
0 4 3 3 2 2
Decoding:
>> spx.dsp.runlength.decode(encoding)
ans =
0 0 0 0 3 3 3 2 2