Is cryptography important? Why?

What to know about cryptography

Cryptography is a science that deals with encrypting and protecting information. As a sub-area of computer science, it has become an integral part of the modern IT world. It is therefore helpful to know the basic terms.

The word "cryptography" is made up of the two ancient Greek words "kryptos" and "graphein", which mean "secret" and "write". So cryptography is about hiding information with the help of secret writing. This is what distinguishes cryptography from steganography, which also deals with hiding information, but hides it in a higher-level carrier medium.

Cryptography must also be distinguished from cryptanalysis, which deals with the decryption and cracking of cryptographic processes. One could therefore say that cryptography and cryptanalysis are in constant competition with one another.

The starting point in cryptography is the so-called "plaintext", which is the message to be protected. Before transmission, the plaintext is transformed into a "ciphertext". This process is called "encrypting"; the reverse (i.e. restoring the plaintext from the ciphertext) is called "decrypting". Both processes follow an algorithm that is usually parameterized with a key.

In the cryptography literature, a message is usually transmitted from "Alice" to "Bob", while "Eve" (from the English "eavesdropper") tries to intercept and read the message. To communicate, Alice and Bob first have to agree on a common algorithm and key, and both can vary considerably in quality.

A simple algorithm is the so-called "Caesar cipher", which shifts each letter forward in the alphabet by a fixed number of places. An "A" becomes a "B", a "B" becomes a "C", a "C" becomes a "D", and so on, until finally the "Y" becomes a "Z". The "Z" itself is in turn mapped onto the "A". This example shows a shift by one place; of course, the letters can also be shifted by more places.

The plaintext "HELLO" becomes the ciphertext "IFMMP" when the Caesar cipher is applied with the key "1". Since there are only 26 possible keys, this encryption can be cracked with very little effort: it is sufficient to try all the keys until one yields readable text. Such a procedure, in which all possible keys are tested with pure computing power, is called "brute force".
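The Caesar cipher and the brute-force attack on it can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the sketch assumes that the message consists of the uppercase letters A to Z (other characters are passed through unchanged).

```python
import string

ALPHABET = string.ascii_uppercase  # "A" .. "Z"

def caesar_encrypt(plaintext: str, key: int) -> str:
    """Shift every letter `key` places forward, wrapping from Z back to A."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + key) % 26] if ch in ALPHABET else ch
        for ch in plaintext
    )

def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Decrypting is the same shift in the opposite direction."""
    return caesar_encrypt(ciphertext, -key)

def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Try all 26 keys; the attacker then picks the candidate that is readable."""
    return [(key, caesar_decrypt(ciphertext, key)) for key in range(26)]

ciphertext = caesar_encrypt("ATTACK", 3)
print(ciphertext)  # DWWDFN

for key, candidate in brute_force(ciphertext):
    print(key, candidate)  # one of the 26 lines reads "3 ATTACK"
```

Because the key space is so small, the loop at the end finishes instantly, which is exactly why the cipher offers no real protection.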

It is obvious that a more secure system is needed in practice. It might seem natural to choose the most complex algorithm possible and keep it secret. However, this is precisely what is considered bad practice: at the end of the 19th century, the Dutch-born cryptographer Auguste Kerckhoffs formulated the principle named after him, according to which the security of a cryptosystem should depend only on the secrecy of the key, not on that of the algorithm. In other words, disclosing an encryption algorithm must not weaken its security.

Symmetrical encryption: AES & Co.

The search for a better algorithm raises the question of whether there is mathematically demonstrably perfect encryption. Surprisingly, there is actually such a thing, namely the so-called one-time pad (OTP). It is based on the knowledge that the Caesar encryption is unbreakable if only a single character is encrypted. Although there are still only 26 options for decryption, the correct value can no longer be read due to the lack of context.

If you encrypt a message character by character with the Caesar cipher and use a separate, randomly chosen key for each individual character, so that any patterns that could allow an analysis are eliminated from the plaintext, the ciphertext can no longer be decrypted without knowing the key.

The simplicity and elegance of this process is marred by the problem of how to transport the key safely to the recipient. This is made even more difficult by the fact that a key that has been used once must be considered "used up" and may not be used again - otherwise, the key would no longer be random and the scheme would become vulnerable again. The one-time pad therefore only works with genuinely random keys that are as long as the plaintext.
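The idea of a one-time pad as described here (a Caesar shift with a fresh random key per character) might look as follows in Python. The names `generate_pad`, `otp_encrypt` and `otp_decrypt` are chosen for this sketch only, and it assumes a plaintext made up of the uppercase letters A to Z.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase

def generate_pad(length: int) -> list[int]:
    """One independent random shift (0..25) per plaintext character."""
    return [secrets.randbelow(26) for _ in range(length)]

def otp_encrypt(plaintext: str, pad: list[int]) -> str:
    # The pad must cover the whole message; reusing or stretching it breaks the scheme.
    if len(pad) < len(plaintext):
        raise ValueError("the pad must be at least as long as the plaintext")
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26]
        for ch, shift in zip(plaintext, pad)
    )

def otp_decrypt(ciphertext: str, pad: list[int]) -> str:
    """Undo each character's individual shift."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26]
        for ch, shift in zip(ciphertext, pad)
    )

pad = generate_pad(5)
ciphertext = otp_encrypt("HELLO", pad)
print(ciphertext)                    # differs on every run
print(otp_decrypt(ciphertext, pad))  # HELLO
```

The sketch also makes the practical drawback visible: the pad is exactly as long as the message and has to reach the recipient over a secure channel.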

In the past 2000 years, numerous other methods have therefore been developed which attempt to build a pragmatic bridge between high security and good applicability. In the modern era, the Data Encryption Standard (DES) from 1976 and its improvement 3DES should be mentioned here, both of which were designed by IBM in collaboration with the NSA. However, they are no longer considered safe.

Instead, the modern de facto standard is the AES algorithm, which was selected as the winner in a competition in 2001 by NIST, the US National Institute of Standards and Technology. Unlike Caesar and OTP, AES does not encrypt each character individually, but entire blocks of characters. One therefore speaks of a block cipher. Depending on the key length, AES is available in different versions, for example AES-128, AES-192 and AES-256.

AES also offers different modes of operation: in ECB (Electronic Code Book), each block is encrypted independently, which in turn can lead to patterns and repetitions in the ciphertext. CBC (Cipher Block Chaining) combines each plaintext block with the ciphertext of the previous block before encrypting it, which avoids this problem. GCM (Galois/Counter Mode) is relatively young and ensures even better security.
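The difference between ECB and CBC can be made visible with a small Python sketch. Note the assumptions: Python's standard library contains no AES, so `toy_encrypt_block` below is a hypothetical stand-in (a 4-round Feistel construction built on SHA-256) that is not secure and is only meant to behave like a keyed block cipher. The sketch also assumes that the message length is a multiple of the block size, so padding is left out.

```python
import hashlib

BLOCK = 16  # bytes per block, the same block size AES uses

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher standing in for AES. NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, xor(left, f)
    return left + right

def split_blocks(data: bytes) -> list[bytes]:
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]

def ecb_encrypt(key: bytes, plaintext: bytes) -> list[bytes]:
    """ECB: every block is encrypted on its own."""
    return [toy_encrypt_block(key, b) for b in split_blocks(plaintext)]

def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> list[bytes]:
    """CBC: each block is XORed with the previous ciphertext block first."""
    out, prev = [], iv
    for block in split_blocks(plaintext):
        prev = toy_encrypt_block(key, xor(block, prev))
        out.append(prev)
    return out

key = b"k" * 16
iv = bytes(range(16))                # in practice the IV must be random
message = b"ATTACK AT DAWN!!" * 2    # two identical 16-byte blocks

ecb = ecb_encrypt(key, message)
cbc = cbc_encrypt(key, iv, message)
print(ecb[0] == ecb[1])  # True: the repetition is visible in the ciphertext
print(cbc[0] == cbc[1])  # False: chaining hides the repetition
```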

From a practical point of view, AES-256-CBC and AES-128-GCM are primarily used today, although the GCM variant is not yet as widespread as CBC. However, regardless of the variant chosen, the problem of key exchange between the parties involved still persists.

Asymmetric encryption: RSA & Co.

Symmetric encryption is contrasted with so-called asymmetric encryption. The main difference is that symmetric methods use a single key for both encryption and decryption, whereas asymmetric methods use a separate key for each. So there are two keys that complement each other: what was encrypted with one key can only be decrypted with the other.

One of these two keys remains secret, while the other can be published. One speaks therefore of a "private key" and a "public key" and overall of "public key cryptography". If Alice now wants to send a message to Bob, all she needs is Bob's public key and uses it to encrypt the message. Then the message can only be decrypted again with Bob's private key. Since only Bob knows this key, the message is secure.

From a mathematical point of view, the method is based on so-called trapdoor functions, which are easy to calculate, but whose reversal is very difficult unless you have special knowledge. A simple example of this is prime factorization. While it is very easy to calculate the product (namely 240,067) of the two numbers 431 and 557, decomposing, for example, 566,609 into its two prime factors is far more difficult and time-consuming. The best-known asymmetric algorithm, RSA, is based on exactly this kind of problem: it computes with remainder classes (modular arithmetic), and its security rests on the difficulty of factoring the product of two large primes.
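The asymmetry between multiplying and factoring can be tried out directly. The following Python sketch uses naive trial division; the function name `factor` is made up for this example, and real attacks use far more sophisticated algorithms. For six-digit numbers both directions are instantaneous on a computer, but the gap widens dramatically as the primes grow to hundreds of digits.

```python
def factor(n: int) -> tuple[int, int]:
    """Naive trial division: test 2 and then every odd candidate up to sqrt(n)."""
    if n % 2 == 0:
        return 2, n // 2
    candidate = 3
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate, n // candidate
        candidate += 2
    raise ValueError(f"{n} has no non-trivial factors")

# The easy direction: a single multiplication.
print(431 * 557)       # 240067

# The hard direction: hundreds of trial divisions even for this small number.
print(factor(566609))  # (733, 773)
```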

What is particularly practical about asymmetric procedures is that they solve the key exchange problem, because each communication partner only needs two keys, one private and one public. However, asymmetric methods also have their disadvantages. On the one hand, there is the enormous amount of computation required, and on the other hand, the maximum length of the encryptable messages is severely limited.

The hybrid use of symmetric and asymmetric encryption is a solution: first, a random key is generated with which the actual message can be encrypted quickly and efficiently using a symmetric procedure. The randomly generated key is then encrypted using an asymmetric procedure, and both (the encrypted message and the encrypted key) are transmitted to the recipient. In this way, public key cryptography can be combined with high efficiency.
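The hybrid scheme can be sketched in Python under some loudly stated assumptions: the asymmetric part is textbook RSA with the tiny primes 431 and 557 and the public exponent 3 (chosen here purely for illustration), and the symmetric part is a hypothetical toy stream cipher derived from SHA-256. Neither is secure; the sketch only shows how the two steps fit together. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import hashlib
import secrets

# Toy RSA key pair. Real keys use primes with hundreds of digits.
P, Q = 431, 557
N = P * Q                            # public modulus
E = 3                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent

def keystream_xor(session_key: int, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR the data with a SHA-256-derived keystream."""
    stream, counter = b"", 0
    while len(stream) < len(data):
        stream += hashlib.sha256(f"{session_key}:{counter}".encode()).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def hybrid_encrypt(message: bytes) -> tuple[int, bytes]:
    session_key = secrets.randbelow(N - 2) + 2      # random one-off symmetric key
    wrapped_key = pow(session_key, E, N)            # asymmetric step, public key
    return wrapped_key, keystream_xor(session_key, message)  # symmetric step

def hybrid_decrypt(wrapped_key: int, ciphertext: bytes) -> bytes:
    session_key = pow(wrapped_key, D, N)            # needs the private key
    return keystream_xor(session_key, ciphertext)   # XOR is its own inverse

wrapped, ciphertext = hybrid_encrypt(b"Hello Bob, this is Alice.")
print(hybrid_decrypt(wrapped, ciphertext))  # b'Hello Bob, this is Alice.'
```

The asymmetric operation only ever touches the short session key, while the message itself, however long, is handled by the fast symmetric step.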

In practice, RSA is no longer used very often for this purpose; instead, methods based on elliptic curves predominate today. This is called Elliptic Curve Cryptography (ECC). Even though these methods work differently from RSA mathematically, they are likewise based on an operation that is easy to compute but very hard to reverse.

Hash functions: SHA & Co.

Another problem that asymmetric methods cannot solve is protecting the integrity of the message. Neither AES nor RSA prevent the encrypted message from being manipulated. A so-called "hash function" is required to detect manipulation. This is a mathematical function that acts as a one-way function - which, in contrast to a trapdoor function, is no longer reversible.

A hash function calculates a so-called hash value for a given input, and this value is always the same size, regardless of the size of the input. The hash value represents a kind of digital fingerprint. Even a small change to the input usually causes a drastic change in the hash value, which makes it almost impossible to predict the hash value. It is important that hash functions are also "collision-resistant". This means that it should be practically infeasible to construct two inputs that yield the same hash value.
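Both properties, the fixed output size and the drastic change after a small modification, can be observed with Python's standard `hashlib` module. The inputs below are arbitrary examples, and SHA-256 is picked here simply as one common hash function.

```python
import hashlib

short = hashlib.sha256(b"Hello").hexdigest()
long_ = hashlib.sha256(b"Hello" * 100_000).hexdigest()
changed = hashlib.sha256(b"Hellp").hexdigest()   # last letter altered

# The hash value has the same size no matter how large the input is.
print(len(short), len(long_))  # 64 64 (hex digits, i.e. 256 bits)

# A one-letter change alters the fingerprint almost everywhere.
differing = sum(a != b for a, b in zip(short, changed))
print(f"{differing} of 64 hex digits differ")
```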

Since hash values are always the same size, it is in the nature of things that there are more potential inputs than hash values, so that there are theoretically an infinite number of inputs that lead to the same hash value. The collision resistance therefore only relates to the fact that it must not be possible to bring about such a collision in a targeted manner.

Although many hash functions have been used over the past years and decades, two families have essentially prevailed: the Message Digest (MD) family and the Secure Hash Algorithms (SHA). Both come in several variants, and all MD variants are now considered insecure. Not all SHA variants are trustworthy anymore either; SHA-1 in particular should be avoided. SHA-256, SHA-512 or, more recently, SHA-3 are preferred.

If a hash value is transmitted along with a message, the recipient can recalculate it and compare the two. In this way it can be seen whether the message was transmitted unchanged. However, it would be easy for an attacker not only to manipulate the message, but also to simply recalculate the hash value. The hash must therefore itself be specially protected.

This happens with the help of a so-called Message Authentication Code (MAC). A MAC is generally nothing more than a hash value, the calculation of which, in addition to the data to be hashed, also includes a secret key that is only known to the two communication partners. One therefore speaks of a "shared key". Since an attacker does not know this shared key, no valid new MAC can be calculated from the outside, and manipulation would be noticed.
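A common concrete form of such a MAC is HMAC, which Python ships in the standard `hmac` module. The following sketch assumes HMAC with SHA-256; the key and the messages are placeholder values, and the helper names `make_mac` and `verify_mac` are made up for this example.

```python
import hashlib
import hmac

shared_key = b"known only to Alice and Bob"   # placeholder secret

def make_mac(key: bytes, message: bytes) -> bytes:
    """Hash the message together with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify_mac(key: bytes, message: bytes, mac: bytes) -> bool:
    """Recalculate the MAC and compare it in constant time."""
    return hmac.compare_digest(make_mac(key, message), mac)

message = b"Transfer 100 euros to Bob"
mac = make_mac(shared_key, message)

print(verify_mac(shared_key, message, mac))                         # True
print(verify_mac(shared_key, b"Transfer 900 euros to Eve", mac))    # False
print(verify_mac(b"Eve's guess", message, mac))                     # False
```

Without the shared key, an attacker who changes the message has no way of producing a matching MAC, so the recipient notices the manipulation.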

Certificates, signatures & Co.

Last but not least, there remains the question of how communication between a client and a server on the Internet is secured. Two aspects are particularly interesting here: On the one hand, the connection should be encrypted, on the other hand, the client must be able to check that it is communicating with the actual server and not with someone who pretends to be the actual server.

The encryption is easy to achieve. To do this, HTTP is tunneled over an encrypting protocol such as TLS (formerly SSL). This is called HTTPS, and symmetric encryption takes place under the hood. The symmetric key is generated ad hoc during the initial key exchange, using a procedure such as the Diffie-Hellman key exchange, which is based on principles similar to those of asymmetric encryption.
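How two parties can arrive at the same key without ever transmitting it can be shown with a toy version of the Diffie-Hellman key exchange in Python. The prime 23 and the generator 5 are deliberately tiny textbook values chosen for this sketch; real deployments use numbers that are thousands of bits long or elliptic curves.

```python
import secrets

P = 23   # public prime modulus (toy size)
G = 5    # public generator

# Each side picks a private value in 1..P-2 that never leaves its machine.
alice_secret = secrets.randbelow(P - 2) + 1
bob_secret = secrets.randbelow(P - 2) + 1

# Only these public values travel over the network, where Eve may read them.
alice_public = pow(G, alice_secret, P)
bob_public = pow(G, bob_secret, P)

# Each side combines the other's public value with its own secret.
alice_shared = pow(bob_public, alice_secret, P)
bob_shared = pow(alice_public, bob_secret, P)

print(alice_shared == bob_shared)  # True: both now hold the same key
```

Both sides end up with G raised to the product of the two secrets, modulo P, while an eavesdropper only sees the two public values.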

The second aspect, the authentication of the server, is a bit more complex. The basic idea here is that a server can use asymmetric encryption to prove that it actually belongs to the desired domain. To do this, it holds the private key that belongs to the domain, while the client can retrieve the public key. However, this raises the question of how the client knows that a public key actually belongs to a domain.

For this purpose, the public key is supplemented with metadata such as the associated domain name and digitally signed by a trustworthy third party. The combination of public key, metadata and digital signature is called a certificate.

A digital signature, in turn, can easily be created using the means already known, because it is nothing more than the reverse use of an asymmetric procedure: until now, Alice encrypted a message with Bob's public key so that only Bob could decrypt it again with his private key. However, if Alice encrypts a message with her own private key, the message can only be decrypted with her public key.

At first glance that seems absurd; after all, anyone can do that. But on closer inspection, there is an interesting effect: if a message can be decrypted with Alice's public key, it must have been encrypted with Alice's private key, which in turn is only possible for Alice, and this confirms the authorship. That is exactly what a digital signature is.
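The reversed use of the key pair can be sketched in Python with textbook RSA. The assumptions are the same as for any toy example: the primes 431 and 557 and the exponent 3 are far too small for real use, and the function names are made up for this sketch. It also applies the private key to a hash of the message rather than to the message itself, which is a common practice but an addition to the description above. Python 3.8 or later is required for the modular inverse via `pow`.

```python
import hashlib

# Alice's toy RSA key pair. Real keys use primes with hundreds of digits.
P, Q = 431, 557
N = P * Q                            # part of the public key
E = 3                                # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, known only to Alice

def digest_as_int(message: bytes) -> int:
    """Hash the message and reduce it so that it fits the tiny toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N

def sign(message: bytes) -> int:
    """Alice applies her PRIVATE key to the hash of the message."""
    return pow(digest_as_int(message), D, N)

def verify(message: bytes, signature: int) -> bool:
    """Anyone can undo the operation with Alice's PUBLIC key and compare."""
    return pow(signature, E, N) == digest_as_int(message)

message = b"I owe Bob 10 euros. Alice"
signature = sign(message)

print(verify(message, signature))            # True
print(verify(message, (signature + 1) % N))  # False: a forged signature fails
```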

The only question that remains is why one should trust a third party's digital signature. How can you check who this third party is? In principle it is very simple: for this party there is also a certificate that matches the private key used for signing, and that certificate in turn must be signed by a trustworthy fourth party, and so on.

The trustworthy entities are called Certificate Authorities (CA). But so that the chain does not go on indefinitely, there has to be a final instance somewhere, a so-called Root CA. The question is how this can be validated in turn. The relatively simple answer to this is that, from a technical point of view, this is not even possible, but the trust relationship is simply based on the fact that the manufacturers of operating systems and web browsers already deliver the root CA certificates by default.

Admittedly, that is a poor answer, but in fact all of the security of the modern web is based on it. So you should always be aware of the fact that security has a lot to do with trust, and that ultimately you trust the person who trusts the root CAs. For this very reason, you should be extremely skeptical and careful when asked to install an additional (root) CA certificate on your own machine - the security of your own computer is potentially compromised as a result.

Conclusion

Even if cryptography is very complex and demanding in the details, the basic concepts can be explained and understood simply and clearly. It is much more important for developers to understand the difference between encryption and hashing or between symmetric and asymmetric encryption than to be able to describe in detail how and why the S-boxes from AES work.

In fact, you don't need more than what is described in this article for a clear understanding of cryptography: All other approaches are ultimately based on the modules presented here and their combination. That is reassuring, because anyone who has understood how the algorithms work conceptually and knows which method is suitable for what, has a good starting point for using cryptography in a targeted and appropriate manner.

Golo Roden

Golo Roden is the founder, CTO and managing director of the native web GmbH, a company specializing in native web technologies.
