sep nom de site %%

Foreword

For more information and source code, you can go to my git repository:

Summary

Definition
Features
How it works
Binary representation
Usage
Pseudo-Code
Bonus – Position Counting

Definition

The LZ77 algorithm is a lossless compression algorithm using a sliding window.

It can be found under different names such as Lempel Ziv 77 or LZ1.

Lossless compression is a compression method that restores the same data as the original after decompression.
A sliding window compression uses a search memory area that can be moved, which prevents the algorithm from being too resource-intensive.

Features

As we have seen previously, the LZ77 algorithm uses a sliding window like the majority of algorithms in the LZ family.

This sliding window is split into two parts, one part which is the buffer and another which is the dictionary.

The buffer is the read part and the dictionary is the search part.

The purpose of this algorithm is to encode a recurrence found in the buffer and the dictionary by the position/size combination.

How it works

The operation is simple, find the largest pattern of the stamp in the dictionary.

To illustrate this, let’s take an example:

In the buffer = “AABBCC”.
In the dictionary = “RRYYIIAANNMMXX

We start looking:

AABBCC
NOT FOUND
AABBC
NOT FOUND
AABB
NOT FOUND
AAB
NOT FOUNDAA
FOUND RRYYIIAANNMMXX

The distance from the found pattern to the playback buffer is calculated.
“RRYYIIAANNMMXX” = 8

The length of the found pattern is 2 (AA) and the next character of the pattern in the buffer is B so for encoding we get (8, 2, B).

If no pattern is found, the encoding will be like this (0, 0, [first character of the buffer]), then if no pattern is found in the previous example, the encoding will be (0, 0, A).

To summarize, if a pattern is found ([distance], [size], [next character]) and if no pattern is found (0, 0, [first character of the buffer]).

On small data, the encoded part can be larger than the original data.
But on large data such as Victor Hugo’s Les Misérables (Volume 1):

Original file: 710 kilobytes
Compressed file: 420 kilobytes

Either 40% compression.

Binary representation

For the binary representation, the commonly used values are:

Dictionary: 12 bits or 4095.
Buffer: 4 bits or 15.

Either encoded on 2 bytes.

Usage

The LZ77 is no longer used, the LZSS shape will be preferred which is a great improvement in compression ratio but also in compression/decompression speed.

Pseudo-Code

Here is a pseudo-code for the compression part:

And the decompression part:

Bonus – Position Counting

There are mainly 2 forms of position counting.

What I call absolute that is counted from the beginning of the dictionaryExample with the sentence: “I am here and I am superman” of 29.

| — 8 — |
The encoded sentence : `I am here and (0,8)superman`.
| — 0

And the relative format counts from the current index. Example with the sentence: “I’m here and I’m superman” of 29.

| — 8 — |
The encoded sentence : `I am here and (0,8)superman`.
| —– 15 —– |

In most uses, the relative mode is used.

This Post Has 3 Comments

bursa escort 1 Mar 2021 Reply

Wow, marvelous blog layout! How long have you been blogging for? you make blogging look easy. The overall look of your website is fantastic, as well as the content!| Maris Izak Devina
erotik 2 Mar 2021 Reply

I am sure this piece of writing has touched all the internet viewers, its really really pleasant piece of writing on building up new blog.| Adi Godart Marci
atk young and hairy 5 Mar 2021 Reply

Hey there. I found your web site by way of Google whilst looking for a related topic, your website got here up. It appears good. I have bookmarked it in my google bookmarks to come back then. Julina Morlee Beverlie

Tutorials and explanations for developers

Understanding the LZ77 compression algorithm

Foreword

Summary

Definition

Features

How it works

Binary representation

Usage

Pseudo-Code

Bonus – Position Counting

This Post Has 3 Comments

Leave a Reply Cancel reply

Foreword

Summary

Definition

Features

How it works

Binary representation

Usage

Pseudo-Code

Bonus – Position Counting

You Might Also Like

Understanding the Knuth-Morris-Pratt algorithm

This Post Has 3 Comments

Leave a Reply Cancel reply