An Efficient Algorithm for Hamming (7,4) Decoding — Matlab Implementation

We’re gonna investigate Hamming (7, 4) codes, and efficiently implement them in Matlab / Octave. The explanation of the algorithm used is here.

When data is transmitted over a channel, errors are often introduced due to noise, and what not. To account for the losses, redundant data is added the actual data and transmitted. This data can be used to detect, and correct errors up to an extent. This is known as Channel Coding, and Hamming codes are an example.

Hamming (7, 4) codes encode 4 bits of data into 7 bit blocks. The extra 3 bits are the redundant parity bits we talked about.

For example, consider 4 bits of data: d1 d2 d3 d4. With this, parity bits are calculated as

p1 = d2 + d3 + d4
p2 = d1 + d3 + d4
p3 = d1 + d2 + d4

All of this is combined, and encoded as

p1 p2 p3 d1 d2 d3 d4

At the receiver end, parity bits are again constructed, and checked with the received bits to detect, and correct errors.

Let’s assume only 1 error occurs, and the value of the bit is flipped. If d1 is flipped, it would cause disagreement in p2 and p3. And, so on.

Also note that, errors could occur in parity bits too. If only p1 is in disagreement, the error occurred in p1.

This would, of course, fail if there are more than 1 errors. But if your channel introduces 2 or more errors for every 7 bits, you're destined to doom anyway. Typical error probabilities are of the order 10^-6 .

Alright. Let’s get to the implementation. You might have noticed that the encoding can be represented as a matrix multiplication

data = [d1 d2 d3 d4]
                                |0 1 1 1 0 0 0|
encoded_data = [d1 d2 d3 d4]  * |1 0 1 0 1 0 0|
                                |1 1 0 0 0 1 0|
                                |1 1 1 0 0 0 1|

The first element of the resultant matrix would be d1*0 + d2*1 + d3*1 + d4*1. Excellent! This is exactly what we wanted for parity bit p1.

The last four bits would be the data bits d1 d2 d3 d4 themselves.

Such a matrix is called generator matrix. We generated hamming codes by multiplying our data with the generator matrix.

The converse of this is the decoder matrix. At the receiver end, we use it detect where the error occured. Here is the decoder matrix for Hamming (7, 4) codes.

|1 0 0 0 1 1 1|
|0 1 0 1 0 1 1| * [p1 p2 p3 d1 d2 d3 d4]
|0 0 1 1 1 0 1|

Multiplying our 7 bit code with this matrix, we will be left with 3 bits, which tell us if, and where, an error occurred.

Examine what the decoder matrix is doing. The first element would be p1 + d2 + d3 + d4.

This is equal to 2 * (d2 + d3 + d4). Since this is binary, we will be left with a 0. But if an error occurred anywhere, the sum is 1 .

Here’s how the resultant matrix will be for error in different positions.

No errors   -> [0 0 0]
Error in p1 -> [1 0 0]
Error in p2 -> [0 1 0]
Error in p3 -> [0 0 1]
Error in d1 -> [0 1 1]
Error in d2 -> [1 0 1]
Error in d3 -> [1 1 0]
Error in d4 -> [1 1 1]

Now that we know where the error is, decoding is just flipping the error bit, and removing the parity bits.

Encoding

in Matlab (complete code at the end)

function [output_vector] = sevenFourHammingEncode(input_vector)
    output_vector = [];
    number_of_windows = length(input_vector) / 4; % assumes length to be multiple of four
    four_bit_windows = reshape(input_vector, 4, number_of_windows)';
    %                p1  p2  p3  d1  d2  d3  d4
    generator_matrix = [
                          0 1 1 1 0 0 0 ;
                          1 0 1 0 1 0 0 ;
                          1 1 0 0 0 1 0 ;
                          1 1 1 0 0 0 1 ;
                       ];
    output_vector = mod(four_bit_windows * generator_matrix, 2);
    output_vector = reshape(output_vector', numel(output_vector), 1)';
end

Decoding

function [output_vector] = sevenFourHammingDecode(input_vector)
    output_vector = [];

    number_of_windows = numel(input_vector) / 7;
    seven_bit_windows = reshape(input_vector, 7, number_of_windows)';
    %   p1  p2  p3  d1  d2  d3  d4
    decoder_matrix = [
                        1 0 0 0 1 1 1 ;
                        0 1 0 1 0 1 1 ;
                        0 0 1 1 1 0 1 ;
                     ];
    % [ 1 0 0 ; 0 1 0 ; 0 0 1; 0 1 1 ; 1 0 1 ; 1 1 0 ; 1 1 1; ]
    % [4, 2, 1, 3, 5, 6, 7]
    result_vector = mod(decoder_matrix * seven_bit_windows', 2)';
    result_vector = result_vector * [4 ; 2 ; 1]; % summing all elements

    result_vector(result_vector == 4) = 9;
    result_vector(result_vector == 1) = 11;
    result_vector(result_vector == 3) = 12;
    result_vector = mod(result_vector, 8);

    seven_bit_windows = [ones(size(seven_bit_windows, 1), 1) seven_bit_windows]; % adding 1 column at start

    result_vector = [(0:8:rows(result_vector)*8-1)' result_vector];
    result_vector = result_vector * [1 ; 1]; % summing row and column

    padded_data = reshape(seven_bit_windows', numel(seven_bit_windows), 1)';
    padded_data(result_vector' + 1) = not(padded_data(result_vector' + 1));
    padded_data = reshape(padded_data, 8, numel(padded_data) / 8)';
    padded_data = padded_data(:, 5:end);

    output_vector = reshape(padded_data', numel(padded_data), 1)';
end

I wrote an explanation for how this works here

The decoding uses only matrix operations, and is hence a lot faster than done using a for loop. Orders of magnitude faster.

Here is a script for comparison of performance of (3, 1), (5, 1), and Hamming (7, 4) encodings.



	'@author: Aravind Reddy V <aravind.reddy@iiitb.org> (https://aravindvoggu.in)';

	% References
	%
	% http://michael.dipperstein.com/hamming/index.html
	% https://en.wikipedia.org/wiki/Hamming(7,4)

	%% createDataSet: Creates random binary dataset
	function [dataset] = createDataSet(size)
	dataset = randi([0, 1], 1, size);
	end


	%% addErrors: Adds errors
	function [output_vector, number_of_errors] = addErrors(input_vector, probability, input_size)
	number_of_errors = probability * input_size;
	error_locations = randperm(input_size, number_of_errors); % randperm to avoid repetitions

	output_vector = input_vector;
	output_vector(error_locations) = not(input_vector(error_locations));
	end

	%% findErrors: Returns an array with ones where errors occured
	function [error_array, error_lengths] = findErrors(input_vector, output_vector)
	error_array = (output_vector != input_vector);

	t1 = find(diff(error_array)); % indices where the pattern changes
	t2 = [t1 numel(error_array)] - [0 t1]; % pattern lengths

	error_lengths = [];

	if error_array(1) == 0 % then odd rows of t2 are non errors. drop them.
	error_lengths = t2'(2:2:end, :)'; % even elements
	else
	error_lengths = t2'(1:2:end, :)'; % odd elements
	end
	end

	%% numberOfErrors: Returns number of errors
	function [number] = numberOfErrors(input_data, decoded_data)
	number = sum(input_data != decoded_data);
	end



	function [output_vector] = sevenFourHammingEncode(input_vector)
	output_vector = [];
	number_of_windows = length(input_vector) / 4; % assumes length to be multiple of four
	four_bit_windows = reshape(input_vector, 4, number_of_windows)';
	% p1 p2 p3 d1 d2 d3 d4
	generator_matrix = [
	0 1 1 1 0 0 0 ;
	1 0 1 0 1 0 0 ;
	1 1 0 0 0 1 0 ;
	1 1 1 0 0 0 1 ;
	];

	output_vector = mod(four_bit_windows * generator_matrix, 2);

	output_vector = reshape(output_vector', numel(output_vector), 1)';

	% output_vector = encode(input_vector, 7, 4, 'hamming/binary');
	end

	% Orders of magnitude faster than using loops
	% This code is cache efficient and uses matrices for
	% parallell operations by the processor, hence is
	% faster.
	%
	% Sorry about the lack of comments though. I didn't
	% realise their importance back then, and I don't have
	% time to understand this and add comments now.
	function [output_vector] = sevenFourHammingDecode(input_vector)
	output_vector = [];
	number_of_windows = numel(input_vector) / 7;


	seven_bit_windows = reshape(input_vector, 7, number_of_windows)';

	% q = 1
	% seven_bit_windows
	% seven_bit_windows(2, q) = not(seven_bit_windows(2, q))

	% p1 p2 p3 d1 d2 d3 d4
	decoder_matrix = [
	1 0 0 0 1 1 1 ;
	0 1 0 1 0 1 1 ;
	0 0 1 1 1 0 1 ;
	];

	% [ 1 0 0 ; 0 1 0 ; 0 0 1; 0 1 1 ; 1 0 1 ; 1 1 0 ; 1 1 1; ]
	% [4, 2, 1, 3, 5, 6, 7]

	result_vector = mod(decoder_matrix * seven_bit_windows', 2)';

	result_vector = result_vector * [4 ; 2 ; 1]; % summing all elements

	result_vector(result_vector == 4) = 9;
	result_vector(result_vector == 1) = 11;
	result_vector(result_vector == 3) = 12;

	result_vector = mod(result_vector, 8);

	seven_bit_windows = [ones(size(seven_bit_windows, 1), 1) seven_bit_windows]; % adding 1 column at start

	result_vector = [(0:8:rows(result_vector)*8-1)' result_vector];
	result_vector = result_vector * [1 ; 1]; % summing row and column

	padded_data = reshape(seven_bit_windows', numel(seven_bit_windows), 1)';
	padded_data(result_vector' + 1) = not(padded_data(result_vector' + 1));
	padded_data = reshape(padded_data, 8, numel(padded_data) / 8)';

	padded_data = padded_data(:, 5:end);


	output_vector = reshape(padded_data', numel(padded_data), 1)';
	end

	%% hammingProbabilities: Returns output poe for hamming 7,4
	function [poe] = hammingProbabilities(input_size, probabilities)
	poe = [];
	printf('\n');

	for probability = probabilities
	printf('Hamming (7,4). Input size = %d, probability of error = %f\n', input_size, probability);
	input_data = createDataSet(input_size);

	transmitted_data = sevenFourHammingEncode(input_data);

	received_data = addErrors(transmitted_data, probability, input_size);

	decoded_data = sevenFourHammingDecode(received_data);

	poe = [poe numberOfErrors(input_data, decoded_data) / input_size];
	end
	end


	hamming_input_probabilities = [0.4 0.2 0.1 0.05 0.01 0.005 0.001];
	hamming_output_probabilities = hammingProbabilities(10**5, hamming_input_probabilities);

	plot(hamming_input_probabilities, hamming_output_probabilities);
	xlabel('Input Probabilities');
	ylabel('Output Probabilities');
	hold on;



	%% threeOneEncode: (3, 1) encoding
	function [output_vector] = threeOneEncode(input_vector)
	output_vector = repmat(input_vector', 1, 3)';
	output_vector = output_vector(:)';
	end

	%% threeOneDecode: (3, 1) decoding
	function [outputs] = threeOneDecode(input_vector)
	three_bit_windows = reshape(input_vector, 3, numel(input_vector) / 3)';

	three_bit_windows = three_bit_windows * [1; 1; 1;];

	outputs = (three_bit_windows >= 2)';
	end



	%% threeOneProbabilities: Three one performance
	function [poe] = threeOneProbabilities(input_size, probabilities)
	poe = [];
	printf('\n');

	for probability = probabilities
	printf('Repeat (3, 1). Input size = %d, probability of error = %f\n', input_size, probability);
	input_data = createDataSet(input_size);

	transmitted_data = threeOneEncode(input_data);

	received_data = addErrors(transmitted_data, probability, input_size);

	decoded_data = threeOneDecode(received_data);

	poe = [poe numberOfErrors(input_data, decoded_data) / input_size];

	end
	end


	input_size = 10**5;

	threeOne_input_probabilities = [0.4 0.2 0.1 0.05 0.01 0.005 0.001];
	threeOne_output_probabilities = threeOneProbabilities(10**5, threeOne_input_probabilities);


	plot(threeOne_input_probabilities, threeOne_output_probabilities);
	xlabel('Input Probabilities');
	ylabel('Output Probabilities');
	hold on;


	%% fiveOneEncode: (5, 1) encoding
	function [output_vector] = fiveOneEncode(input_vector)
	output_vector = repmat(input_vector', 1, 5)';
	output_vector = output_vector(:)';
	end

	%% fiveOneDecode: (5, 1) decoding
	function [outputs] = fiveOneDecode(input_vector)
	five_bit_windows = reshape(input_vector, 5, numel(input_vector) / 5)';

	five_bit_windows = five_bit_windows * [1; 1; 1; 1; 1;];

	outputs = (five_bit_windows > 2)';
	end

	%% fiveOneProbabilities: five one performance
	function [poe] = fiveOneProbabilities(input_size, probabilities)
	poe = [];
	printf('\n');

	for probability = probabilities
	printf('Repeat (5, 1). Input size = %d, probability of error = %f\n', input_size, probability);
	input_data = createDataSet(input_size);

	transmitted_data = fiveOneEncode(input_data);

	received_data = addErrors(transmitted_data, probability, input_size);

	decoded_data = fiveOneDecode(received_data);

	poe = [poe numberOfErrors(input_data, decoded_data) / input_size];

	end
	end

	input_size = 10**5;

	fiveOne_input_probabilities = [0.4 0.2 0.1 0.05 0.01 0.005 0.001];
	fiveOne_output_probabilities = fiveOneProbabilities(10**5, fiveOne_input_probabilities);


	plot(fiveOne_input_probabilities, fiveOne_output_probabilities);
	xlabel('Input Probabilities');
	ylabel('Output Probabilities');
	hold on;

	legend('Hamming', 'Three one', 'Five one');
	title('Comparision of performance');

	set(gca, 'XDir', 'reverse')

	print -dpng 'Comparision.png'

view raw encoding.m hosted with ❤ by GitHub