Bienvenido! - Willkommen! - Welcome!

Technical Log of Tux&Cía., Santa Cruz de la Sierra, BO
Central Log: Tux&Cía.
Advanced Information Log: Tux&Cía.-Información
May the source be with you!

Thursday, May 15, 2008

The digital fingerprint

Every file has its own digital fingerprint, unique for practical purposes and extremely difficult to reproduce or clone. Any serious Linux or Unix user knows the mechanisms for hashing a file. I have known them since mid-1995 and have used them since my first download of an ISO file under Linux kernel 1.2.
MD5 and SHA-1 are algorithms for computing a 'condensed representation' of a message or a data file. The 'condensed representation' is of fixed length and is known as a 'message digest' or 'fingerprint'.
The MD5 hash of a file, also called its checksum, is a 128-bit value that acts like a fingerprint of the file. The chance that two different files produce the same hash by accident is extremely small, which makes the hash useful both for comparing files and for checking their integrity.

Hash length
The length of the hash value is determined by the algorithm used and does not depend on the size of the file. The most common hash value lengths are 128 bits (MD5) and 160 bits (SHA-1).
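To see that the digest length depends only on the algorithm, here is a minimal Python sketch using the standard hashlib module. The helper name digest_bits is made up for this illustration.

```python
import hashlib


def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest length in bits for the given algorithm and input."""
    return len(hashlib.new(algorithm, data).digest()) * 8


if __name__ == "__main__":
    # Inputs of very different sizes give digests of the same length.
    for size in (0, 1, 1_000_000):
        data = b"x" * size
        print(size, digest_bits("md5", data), digest_bits("sha1", data))
```

Whatever the input size, the MD5 column stays at 128 and the SHA-1 column at 160.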

Non-discoverability
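Two files that differ by even a single bit will, with overwhelming probability, produce completely different hash values. Collisions must exist in principle, because the number of possible files is unlimited while the number of hash values is fixed, but a good hash function makes them infeasible to find. MD5 no longer meets this standard: researchers demonstrated practical MD5 collisions in 2004, so it should not be relied on where someone may deliberately craft a matching file.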
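The sensitivity to a single changed bit can be illustrated with a short Python sketch. The sample text and the helper differing_bits are assumptions chosen for the demonstration, not part of any tool mentioned here.

```python
import hashlib


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


original = b"The quick brown fox jumps over the lazy dog"
# Flip the lowest bit of the first byte: 'T' (0x54) becomes 'U' (0x55).
modified = bytes([original[0] ^ 0x01]) + original[1:]

d1 = hashlib.md5(original).digest()
d2 = hashlib.md5(modified).digest()

print("input bits changed: ", differing_bits(original, modified))
print("digest bits changed:", differing_bits(d1, d2), "of", len(d1) * 8)
```

A one-bit change in the input typically alters roughly half of the 128 digest bits, which is why the two fingerprints look unrelated.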

Repeatability
Each time a particular file is hashed using the same algorithm, the exact same hash value will be produced.
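Repeatability also holds no matter how the data is fed to the algorithm. The sketch below (function names are illustrative) hashes the same bytes in one call and in small pieces, the way a large file would be read, and gets the same MD5 value both ways.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Hash all the data in a single call."""
    return hashlib.md5(data).hexdigest()


def md5_hex_chunked(data: bytes, chunk_size: int = 4) -> str:
    """Hash the same data piece by piece, as when reading a large file."""
    h = hashlib.md5()
    for i in range(0, len(data), chunk_size):
        h.update(data[i:i + chunk_size])
    return h.hexdigest()


message = b"May the source be with you!"
print(md5_hex(message) == md5_hex_chunked(message))  # True
```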

Irreversibility
Cryptographic hashing algorithms are designed to be one-way. Given a checksum value, it is infeasible to recover the original input. In practice, nothing useful about the original message can be determined from the checksum value alone.

The MD5 algorithm was invented by:
Professor Ronald L. Rivest (born 1947, Schenectady, New York) is a cryptographer, and is the Viterbi Professor of Computer Science at MIT's Department of Electrical Engineering and Computer Science. He is most celebrated for his work on public-key encryption with Len Adleman and Adi Shamir, specifically the RSA algorithm, for which they won the 2002 ACM Turing Award.

A method for checking the integrity of files that might be modified by viruses, very common in the Unix and Linux world and also used by certain antivirus programs, works as follows:

Each file is given a tab called "File Hashes". The tab contains the MD5, SHA-1 and CRC-32 hashes of the file. These are common hashes used to verify the integrity and authenticity of files. Many download sites list the MD5 hash along with the download link.
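The same three values can be computed with Python's standard library. This is only a sketch of the idea, not the code of any particular tool; the names stream_hashes and file_hashes are assumptions made for the example.

```python
import hashlib
import io
import zlib


def stream_hashes(stream, chunk_size=65536):
    """Compute MD5, SHA-1 and CRC-32 of a binary stream in one pass."""
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    crc = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        md5.update(chunk)
        sha1.update(chunk)
        crc = zlib.crc32(chunk, crc)  # running CRC-32 over all chunks
    return {
        "md5": md5.hexdigest(),
        "sha1": sha1.hexdigest(),
        "crc32": format(crc & 0xFFFFFFFF, "08x"),
    }


def file_hashes(path):
    """Compute the three hashes for the file at the given path."""
    with open(path, "rb") as f:
        return stream_hashes(f)


# In-memory demo; "123456789" is the usual CRC-32 test input.
demo = stream_hashes(io.BytesIO(b"123456789"))
print(demo["crc32"])  # cbf43926
```

Reading in chunks keeps memory use low, so the same function works for a large ISO image as well as a small text file.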

For instance, when you download or receive a file, you can use MD5 or SHA-1 to check that you have the correct, unaltered file by comparing its hash with the one published for the original. You are essentially verifying the file's integrity.
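That comparison can be sketched in Python as follows. The function names hash_file and verify_download are hypothetical, and a throwaway temporary file stands in for the download; the published value used in the demo is the well-known SHA-1 of the three bytes "abc".

```python
import hashlib
import os
import tempfile


def hash_file(path, algorithm="sha1", chunk_size=65536):
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def verify_download(path, published_hex, algorithm="sha1"):
    """Compare a file's hash with the value published by the download site."""
    # Sites print hashes in varying case and with stray whitespace.
    expected = published_hex.strip().lower()
    return hash_file(path, algorithm) == expected


# Demo: a temporary file containing b"abc" plays the role of the download.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"abc")
demo_path = tmp.name

demo_ok = verify_download(demo_path, " A9993E364706816ABA3E25717850C26C9CD0D89D\n")
demo_bad = verify_download(demo_path, "0" * 40)
os.remove(demo_path)

print(demo_ok, demo_bad)  # True False
```

A match only shows that the file equals whatever the published hash describes, so the hash itself should come from a source you trust.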

The Secure Hash Algorithm Directory


SHA-1: The Secure Hash Algorithm (SHA) was developed by NIST and is specified in the Secure Hash Standard (SHS, FIPS 180). SHA-1 is a revision of that original version and was published in 1995 as FIPS 180-1. It is also described in the ANSI X9.30 (part 2) standard. SHA-1 produces a 160-bit (20 byte) message digest. Although slower than MD5, this larger digest size makes it stronger against brute force attacks.

In both cases (SHA-1 or MD5), the fingerprint (message digest) is also non-reversible: your data cannot be retrieved from the message digest, yet, as stated earlier, the digest identifies the data for all practical purposes.

The md5 sum of a (text) file (message or email)
Message Digest 5 is a standard algorithm which takes as input a message of arbitrary length and produces as output a 128-bit fingerprint or message digest of the input.
Somewhat similar in general concept to a CRC (cyclic redundancy check), the MD5 algorithm is used as part of the SNMPv3 (Simple Network Management Protocol version 3) security subsystem.
