The most widely used symmetric cipher is DES (Data Encryption Standard). Because of the increase in the
computing power of computers, the basic version of DES can no longer be considered sufficiently secure.
Therefore a new, stronger cipher called AES (Advanced Encryption Standard) was selected in 2000 and
standardized (as FIPS 197) in 2001. It will likely replace DES as the most widely used symmetric
encryption algorithm. RSA is probably the best known asymmetric encryption algorithm. The books page
lists several good textbooks on cryptography and related topics.
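To make the terminology concrete, here is a minimal sketch of authenticated symmetric encryption with
AES, assuming the third-party Python "cryptography" package is available:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=128)  # a random 128-bit AES key
    nonce = os.urandom(12)                     # never reuse a nonce with the same key
    aesgcm = AESGCM(key)

    ciphertext = aesgcm.encrypt(nonce, b"attack at dawn", None)
    plaintext = aesgcm.decrypt(nonce, ciphertext, None)  # the same key decrypts
    assert plaintext == b"attack at dawn"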
* Digital Signatures
Some public-key algorithms can be used to generate digital signatures. A digital signature is a small amount
of data that was created using some private key, and there is a public key that can be used to verify that the
signature was really generated using the corresponding private key. The algorithm used to generate the
signature must be such that without knowing the private key it is not possible to create a signature that
would verify as valid.
Digital signatures are used to verify that a message really comes from the claimed sender (assuming only the
sender knows the private key corresponding to the public key). This is called (data origin) authentication.
They can also be used to timestamp documents: a trusted party signs the document and its timestamp with
his/her private key, thus testifying that the document existed at the stated time.
Digital signatures can also be used to certify that a public key belongs to a particular entity. This is done by
signing the combination of the public key and the information about its owner by a trusted key. The
resulting data structure is often called a public-key certificate (or simply, a certificate). Certificates can be
thought of as analogous to passports that guarantee the identity of their bearers.
The trusted party who issues certificates to the identified entities is called a certification authority (CA).
Certification authorities can be thought of as being analogous to governments issuing passports for their
citizens.
A certification authority can be operated by an external certification service provider, or even by a
government, or the CA can belong to the same organization as the entities. CAs can also issue certificates
to other (sub-)CAs. This leads to a tree-like certification hierarchy. The highest trusted CA in the tree is
called a root CA. The hierarchy of trust formed by end entities, sub-CAs, and root CA is called a public-key
infrastructure (PKI).
A public-key infrastructure does not necessarily require a universally accepted hierarchy or set of roots, and
each party may have different trust points. This is the web of trust concept used, for example, in PGP.
A digital signature of an arbitrary document is typically created by computing a message digest from the
document (by applying a cryptographic hash function to the data) and concatenating it with information
about the signer, a timestamp, and so on. The resulting string is then encrypted using the private key of the
signer and a suitable algorithm. The resulting encrypted block of bits is the signature. It is often
distributed together with information about the public key needed to verify it.
To verify a signature, the recipient first determines whether it trusts that the key belongs to the person it is
supposed to belong to (using a certificate or a priori knowledge), and then decrypts the signature using the
public key of the person. If the signature decrypts properly and the information matches that of the message
(proper message digest etc.), the signature is accepted as valid. In addition to authentication, this technique
also provides data integrity, which means that unauthorized alteration of the data during transmission is
detected.
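As a minimal sketch of this digest-then-sign procedure and its verification, assuming the third-party
Python "cryptography" package is available:

    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding
    from cryptography.exceptions import InvalidSignature

    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    message = b"An arbitrary document"
    pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                      salt_length=padding.PSS.MAX_LENGTH)
    # Hashes the message and signs the digest with the private key:
    signature = private_key.sign(message, pss, hashes.SHA256())

    try:
        public_key.verify(signature, message, pss, hashes.SHA256())
        print("signature valid")
    except InvalidSignature:
        print("signature INVALID")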
Several methods for making and verifying digital signatures are freely available. The most widely known
algorithm is RSA.
* Cryptographic Hash Functions
Cryptographic hash functions are used in various contexts, for example, to compute the message digest
when making a digital signature. A hash function compresses the bits of a message to a fixed-size hash
value in a way that distributes the possible messages evenly among the possible hash values. A
cryptographic hash function does this in
a way that makes it extremely difficult to come up with a message that would hash to a particular hash value.
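A minimal example using Python's standard hashlib module:

    import hashlib

    digest = hashlib.sha256(b"An arbitrary document").hexdigest()
    print(digest)   # 256-bit value printed as 64 hex characters
    # A tiny change in the input yields a completely different digest:
    print(hashlib.sha256(b"An arbitrary document.").hexdigest())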
Cryptographic hash functions typically produce hash values of 128 or more bits. The number of different
hash values thus obtained, 2^128, is vastly larger than the number of different messages likely to ever be
exchanged in the world. The reason for requiring more than 128 bits is based on the birthday paradox. The
birthday paradox roughly states that given a hash function mapping any message to a 128-bit hash digest,
we can expect that the same digest will be computed twice after about 2^64 randomly selected messages
have been hashed. As cheaper memory chips for computers become available, it may become necessary to
require message digests longer than 128 bits (such as 160 bits, which has recently become standard).
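The birthday bound is easy to demonstrate by truncating a hash function's output. The following sketch
truncates SHA-256 to 32 bits, so a collision should appear after roughly 2^16 (about 65,000) hashes
rather than 2^32:

    import hashlib

    seen = {}
    i = 0
    while True:
        h = hashlib.sha256(str(i).encode()).digest()[:4]   # keep only 32 bits
        if h in seen:
            print(f"collision after {i + 1} hashes ({seen[h]} and {i})")
            break
        seen[h] = i
        i += 1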
Many good cryptographic hash functions are freely available. The most famous are those derived from the
MD4 design: MD4, MD5, SHA-1, and RIPEMD-160. Of these, MD4, MD5, and SHA-1 have been broken.
MD4 and MD5 should be considered insecure and should not be used anymore. SHA-1 is still widely used,
although its stronger counterparts SHA-256, SHA-384, and SHA-512 are likely to replace it in the
future.
* Cryptographic Random Number Generators
Cryptographic random number generators generate random numbers for use in cryptographic applications,
such as for keys. Conventional random number generators available in most programming languages or
programming environments are not suitable for use in cryptographic applications (they are designed for
statistical randomness, not to
resist prediction by cryptanalysts).
In the optimal case, random numbers are based on true physical sources of randomness that cannot be
predicted. Such sources may include the noise from a semiconductor device, the least significant bits of an
audio input, or the intervals between device interrupts or user keystrokes. The noise obtained from a
physical source is then "distilled" by a cryptographic hash function to make every bit depend on every
other bit. Quite often a large pool (several thousand bits) is used to contain randomness, and every bit of
the pool is made to depend on every bit of input noise and every other bit of the pool in a cryptographically
strong way.
When true physical randomness is not available, pseudo-random numbers must be used. This situation is
undesirable, but often arises on general purpose computers. It is always desirable to obtain some
environmental noise - such as from device latencies, resource utilization statistics, network statistics, or
keyboard interrupts. The point is that the data must be unpredictable for any external observer; to achieve
this, the random pool must contain at least 128 bits of true entropy.
Cryptographic pseudo-random number generators typically have a large pool ("seed value") containing
randomness. Bits are returned by taking data from the pool and optionally running it through a
cryptographic hash function to avoid revealing the contents of the pool. When more bits are needed, the
pool is stirred by encrypting its contents with a suitable cipher using a random key (which may be taken
from an unreturned part of the pool) in a mode that makes every bit of the pool depend on every other bit.
New environmental noise should be mixed into the pool before stirring to make predicting previous or
future values even more difficult.
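In practice, application code should not implement such a pool itself but draw from the operating
system's generator, which follows this design. A minimal sketch in Python:

    import secrets

    key = secrets.token_bytes(16)   # 128 bits drawn from the OS entropy pool
    print(key.hex())
    # os.urandom(16) is the lower-level equivalent. Avoid the random module
    # for keys: it is designed for statistical quality, not unpredictability.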
Even though cryptographically strong random number generators are not very difficult to build, they are
often overlooked. The importance of the random number generator must thus be emphasized: if done badly,
it will easily become the weakest point of the system.
* Strength of Cryptographic Algorithms
Good cryptographic systems should always be designed so that they are as difficult to break as possible.
It is possible to build systems that cannot be broken in practice (though this cannot usually be proved).
This does not significantly increase system implementation effort; however, some care and expertise is
required. There is no excuse for a system designer to leave the system breakable. Any mechanisms that can
be used to circumvent security must be made explicit, documented, and brought to the attention of the end
users.
In theory, any cryptographic method with a key can be broken by trying all possible keys in sequence. If
brute force is the only option, the required computing power increases exponentially with the length of
the key: a 32-bit key takes 2^32 (about 10^9) steps, something anyone can do on a home computer. A system
with 56-bit keys, such as DES, requires substantial effort, but massive distributed systems can break it
in hours. In 1999, a brute-force search using a specially designed supercomputer and a worldwide network
of nearly 100,000 PCs on the Internet found a DES key in 22 hours and 15 minutes. It is currently
believed that keys with at least 128 bits
(as in AES, for example) will be sufficient against brute-force attacks into the foreseeable future. However,
key length is not the only relevant issue. Many ciphers can be broken without trying all possible keys. In
general, it is very difficult to design ciphers that could not be broken more effectively using other methods.
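To make the exponential growth concrete, here is a small back-of-the-envelope calculation (the assumed
rate of 10^9 keys per second is an arbitrary illustrative figure):

    RATE = 1e9                          # assumed keys tested per second (illustrative)
    SECONDS_PER_YEAR = 365 * 24 * 3600

    for bits in (32, 56, 128):
        keys = 2 ** bits
        seconds = keys / RATE
        print(f"{bits:3d}-bit key: {keys:.2e} keys, "
              f"{seconds / SECONDS_PER_YEAR:.2e} years")
    # 32 bits: a few seconds; 56 bits: about two years on one machine
    # (hence hours for a large distributed effort); 128 bits: ~1e22 years.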
Unpublished or secret algorithms should generally be regarded with suspicion. Quite often the designer is
not sure of the security of the algorithm, or its security depends on the secrecy of the algorithm. Generally,
no algorithm that depends on the secrecy of the algorithm is secure. For professionals, it is easy to
disassemble and reverse-engineer the algorithm. Experience has shown that the vast majority of secret
algorithms that have become public knowledge have later turned out to be pitifully weak.
The keys used in public-key algorithms are usually much longer than those used in symmetric algorithms.
This is caused by the extra structure that is available to the cryptanalyst: the problem is not guessing
the right key, but deriving the matching private key from the public key. In the case of RSA, this could
be done by factoring a large integer that has two large prime factors. In the case of some other
cryptosystems, it is equivalent to computing the discrete logarithm modulo a large integer (which is
believed to be roughly comparable to factoring when the modulus is a large prime). There are public-key
cryptosystems
based on yet other problems. To give some idea of the complexity for the RSA cryptosystem, a 256-bit
modulus is easily factored at home, and 512-bit keys can be broken by university research groups within a
few months. Keys with 768 bits are probably not secure in the long term. Keys with 1024 bits and more
should be safe for now unless major cryptographic advances are made against RSA. RSA Security claims
that 1024-bit keys are equivalent in strength to 80-bit symmetric keys and recommends their use until
2010; 2048-bit RSA keys are claimed to be equivalent to 112-bit symmetric keys and usable until at least
2030.
It should be emphasized that the strength of a cryptographic system is usually equal to its weakest
link. No aspect of the system design should be overlooked, from the choice of algorithms to the key
distribution and usage policies.
* Cryptanalysis and Attacks on Cryptosystems
Cryptanalysis is the art of deciphering encrypted communications without knowing the proper keys. There
are many cryptanalytic techniques. Some of the more important ones for a system implementor are described
below.
* Ciphertext-only attack: This is the situation where the attacker does not know anything about the
contents of the message, and must work from ciphertext only. In practice it is quite often possible
to make guesses about the plaintext, as many types of messages have fixed format headers. Even
ordinary letters and documents begin in a very predictable way. For example, many classical attacks
use frequency analysis of the ciphertext; however, this does not work well against modern ciphers.
* Modern cryptosystems are generally not weak against ciphertext-only attacks, although they are
sometimes analyzed under the added assumption that the message contains some statistical bias.
* Known-plaintext attack: The attacker knows or can guess the plaintext for some parts of the
ciphertext. The task is to decrypt the rest of the ciphertext blocks using this information. This may
be done by determining the key used to encrypt the data, or via some shortcut.
* One of the best known modern known-plaintext attacks is linear cryptanalysis against block
ciphers.
* Chosen-plaintext attack: The attacker is able to have any text he likes encrypted with the unknown
key. The task is to determine the key used for encryption.
* A good example of this attack is differential cryptanalysis, which can be applied against block
ciphers (and in some cases also against hash functions).
* Some cryptosystems, particularly RSA, are vulnerable to chosen-plaintext attacks. When such
algorithms are used, care must be taken to design the application (or protocol) so that an attacker
can never have chosen plaintext encrypted.
* Man-in-the-middle attack: This attack is relevant for cryptographic communication and key
exchange protocols. The idea is that when two parties, A and B, are exchanging keys for secure
communication (for example, using Diffie-Hellman), an adversary positions himself between A and
B on the communication line. The adversary then intercepts the signals that A and B send to each
other, and performs a key exchange with A and B separately. A and B will end up using different keys,
each of which is known to the adversary. The adversary can then decrypt any communication from A with
the key he shares with A, and then resend the communication to B by encrypting it
again with the key he shares with B. Both A and B will think that they are communicating securely,
but in fact the adversary is hearing everything.
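* As a toy illustration of why unauthenticated key exchange is vulnerable, here is a minimal
Diffie-Hellman sketch in Python (the tiny parameters p = 23, g = 5 are purely illustrative; real
deployments use standardized groups of 2048 bits or more):

      import secrets

      p, g = 23, 5                          # toy group parameters, far too small for real use
      a = secrets.randbelow(p - 2) + 1      # A's secret exponent
      b = secrets.randbelow(p - 2) + 1      # B's secret exponent
      A = pow(g, a, p)                      # value A sends to B
      B = pow(g, b, p)                      # value B sends to A
      assert pow(B, a, p) == pow(A, b, p)   # both sides derive the same shared key
      # Nothing above authenticates the exchanged values, so an active
      # adversary can substitute her own values and run two separate
      # exchanges, one with A and one with B.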
* The usual way to prevent the man-in-the-middle attack is to use a public-key cryptosystem capable
of providing digital signatures. During setup, the parties must know each other's public keys in
advance. After the shared secret has been generated, the parties send digital signatures of it to
each other. The man-in-the-middle fails in his attack, because he is unable to forge these signatures
without the knowledge of the private keys used for signing.
* This solution is sufficient if there also exists a way to securely distribute public keys. One such
way is a certification hierarchy such as X.509, which is used, for example, in IPsec.
* Correlation between the secret key and the output of the cryptosystem is the main source of
information to the cryptanalyst. In the easiest case, the information about the secret key is directly
leaked by the cryptosystem. More complicated cases require studying the correlation (basically,
any relation that would not be expected on the basis of chance alone) between the observed (or
measured) information about the cryptosystem and the guessed key information.
* For example, in linear (resp. differential) attacks against block ciphers the cryptanalyst studies the
known (resp. chosen) plaintext and the observed ciphertext. By guessing some of the key bits and checking
for correlation between the plaintext and the ciphertext, the analyst can determine whether the guess was
correct. This can be repeated, and has many variations.
* The differential cryptanalysis introduced by Eli Biham and Adi Shamir in the late 1980s was the first
attack that fully utilized this idea against block ciphers (especially against DES). Later, Mitsuru
Matsui came up with linear cryptanalysis, which was even more effective against DES. More recently, new
attacks using similar ideas have been developed.
* Perhaps the best introduction to this material is the proceedings of EUROCRYPT and CRYPTO
throughout the 1990s. There one can find Mitsuru Matsui's discussion of linear cryptanalysis of
DES, and the ideas of truncated differentials by Lars Knudsen (for example, IDEA cryptanalysis).
The book by Eli Biham and Adi Shamir about the differential cryptanalysis of DES is the "classical"
work on this subject.
* The correlation idea is fundamental to cryptography and several researchers have tried to
construct cryptosystems which are provably secure against such attacks. For example, Knudsen
and Nyberg have studied provable security against differential cryptanalysis.
* Attack against or using the underlying hardware: In the last few years, as more and more small
mobile crypto devices have come into widespread use, a new category of attacks has become relevant
that aims directly at the hardware implementation of the cryptosystem.
* These attacks use very fine measurements of the crypto device while it is, say, encrypting, and
compute key information from them. The basic ideas are closely related to those in other correlation
attacks: the attacker guesses some key bits and attempts to verify the correctness of the guess by
studying correlation against her measurements.
* Several such attacks have been proposed, including careful timing of the device, fine measurement of
its power consumption, and analysis of its radiation patterns. These measurements can be used to obtain
the secret key or other kinds of information stored on the device.
* This attack is generally independent of the cryptographic algorithm used and can be applied to
any device that is not explicitly protected against it.
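* A small software-level sketch of the timing idea: a naive comparison of a guessed authentication tag
returns at the first mismatching byte, so its running time leaks how long the matching prefix is;
constant-time comparison closes the leak (the function names here are illustrative):

      import hmac

      def naive_equal(a: bytes, b: bytes) -> bool:
          # Returns at the first mismatching byte, so its running time
          # leaks the length of the matching prefix.
          if len(a) != len(b):
              return False
          for x, y in zip(a, b):
              if x != y:
                  return False
          return True

      # Constant-time comparison from the standard library:
      ok = hmac.compare_digest(b"secret-mac", b"guessed-mac")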
* More information about differential power analysis is available at
http://www.cryptography.com.
* Faults in cryptosystems can lead to cryptanalysis and even to the discovery of the secret key.
Interest in cryptographic devices led to the discovery that some algorithms behave very badly
when small faults are introduced into the internal computation.
* For example, the usual implementations of RSA private-key operations are very susceptible to fault
attacks: it has been shown that causing a one-bit error at a suitable point can reveal the
factorization of the modulus (i.e., the private key).
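* As a concrete toy sketch of this attack on CRT-based RSA signing (the tiny primes and the simulated
fault are purely illustrative): if a fault corrupts the computation modulo one prime but not the other,
computing gcd(s'^e - m, n) from the faulty signature s' reveals a factor of n.

      import math

      p, q, e = 11, 19, 7                      # toy primes, illustration only
      n = p * q
      d = pow(e, -1, (p - 1) * (q - 1))        # private exponent (Python 3.8+)
      m = 42                                   # message representative

      # CRT-based signing: compute the signature mod p and mod q separately.
      sp = pow(m, d % (p - 1), p)
      sq = (pow(m, d % (q - 1), q) + 1) % q    # simulate a fault in the mod-q half
      s_faulty = (sp * q * pow(q, -1, p) + sq * p * pow(p, -1, q)) % n

      # The faulty signature is correct mod p but not mod q, so:
      factor = math.gcd((pow(s_faulty, e, n) - m) % n, n)
      assert factor == p                       # the modulus is factored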
* Similar ideas have been applied to a wide range of algorithms and devices. It is thus necessary that
cryptographic devices are designed to be highly resistant against faults (and against the malicious
introduction of faults by cryptanalysts).
* Quantum computing: Peter Shor's paper on polynomial-time factoring and discrete logarithm algorithms
for quantum computers has caused growing interest in quantum computing. Quantum computing is a recent
field of research that uses quantum mechanics to build computers that are, in theory, more powerful
than modern serial computers. The power is derived from the inherent parallelism of quantum mechanics:
instead of doing tasks one at a time, as serial machines do, quantum computers can in a sense perform
them all at once. Thus it is hoped that quantum computers can solve problems that are infeasible on
serial machines.
* Shor's results imply that if quantum computers could be implemented effectively, most of public-key
cryptography would become history. However, they are much less effective against secret-key
cryptography.
* The current state of the art of quantum computing does not appear alarming, as only very small
machines have been implemented. The theory of quantum computation promises much better performance
than serial computers; however, whether this will be realized in practice is an open question.
* Quantum mechanics is also a source of new ways of hiding data and communicating securely, with the
potential of offering unbreakable security; this is the field of quantum cryptography. Unlike quantum
computing, many successful experimental implementations of quantum cryptography have already been
achieved. However, quantum cryptography is still some way from being realized in commercial
applications.
* DNA cryptography: Leonard Adleman (one of the inventors of RSA) came up with the idea of using DNA
molecules as computers. DNA molecules can be viewed as a very large computer capable of parallel
execution. This parallel nature could give DNA computers an exponential speed-up over modern serial
computers.
* Unfortunately, there are problems with DNA computers, one being that the exponential speed-up also
requires exponential growth in the volume of the material needed. Thus, in practice, DNA computers
would have limits on their performance. Also, it is not very easy to build one.
There are many other cryptographic attacks and cryptanalysis techniques. However, these are probably the
most important ones for an application designer. Anyone contemplating designing a new cryptosystem
should have a much deeper understanding of these issues. Good places to start looking for information are
the excellent books Handbook of Applied Cryptography by Menezes, van Oorschot, and Vanstone and
Applied Cryptography by Schneier.
Cryptographic Algorithms