Chapter 4. Information Security
At a Glance
This chapter focuses on mechanisms for protecting information from unwanted exposure, tampering, or destruction. These aspects of security are usually referred to as confidentiality [211] (preventing unauthorized users from accessing or modifying data and programs) and integrity (ensuring that information and software remain intact and correct). The discussion here is largely conceptual, though examples of the application of several principles on actual systems are given.
[211] Or privacy, which is sometimes used interchangeably with confidentiality and sometimes refers more specifically to protecting personally identifiable information about individuals.
Cryptography
Cryptography is a collection of mathematical techniques for protecting information. Using cryptography, you can transform written words and other kinds of messages so that they are unintelligible to anyone who does not possess a specific mathematical key necessary to unlock the message. The process of using cryptography to scramble a message is called encryption. The process of unscrambling the message by use of the appropriate key is called decryption.
Cryptography is used to prevent information from being accessed by an unauthorized recipient. In theory, once a piece of information is encrypted, that information can be accidentally disclosed or intercepted by a third party without compromising the security of the information, provided that the key necessary to decrypt the information is not disclosed and that the method of encryption will resist attempts to decrypt the message without the key.
In addition to enhancing confidentiality, cryptography has also been used to ensure message integrity and nonrepudiation.
Cryptographic Algorithms and Functions
There are fundamentally two kinds of encryption algorithms:
Symmetric key algorithms
With these algorithms, the same key is used to encrypt and decrypt the message. Symmetric key algorithms are sometimes called secret key algorithms and sometimes called private key algorithms. Unfortunately, both of these names are easily confused with public key algorithms, which are unrelated to symmetric key algorithms. Symmetric key algorithms can be divided into two categories: block and stream. Block algorithms encrypt data a block (many bytes) at a time, while stream algorithms encrypt byte-by-byte (or even bit-by-bit).
Asymmetric key algorithms
With these algorithms, one key is used to encrypt the message and another key to decrypt it. A particularly important class of asymmetric key algorithms is public key cryptosystems. The encryption key is normally called the public key in these algorithms because it can be made publicly available without compromising the secrecy of the message or the decryption key. The decryption key is normally called the private key or secret key.
Symmetric key algorithms are the workhorses of modern cryptographic systems. They are generally much faster than public key algorithms. They are also somewhat easier to implement. And finally, it is generally easier for cryptographers to ascertain the strength of symmetric key algorithms. Unfortunately, symmetric key algorithms have three problems that limit their use in the real world:
• For two parties to securely exchange information using a symmetric key algorithm, those parties must first exchange an encryption key. Exchanging an encryption key in a secure fashion can be quite difficult.
• As long as they wish to send or receive messages, both parties must keep a copy of the key, and must keep it safe. If one party’s copy is compromised and the second party doesn’t know this fact, then the second party might send a message to the first party—and that message could then be subverted using the compromised key.
• If each pair of parties wishes to communicate in private, then they need a unique key. This requires (N^2 - N) / 2 keys for N different users. This number quickly becomes unmanageable.
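To make the growth in the last point concrete, the following minimal sketch evaluates the pairwise key count for a few group sizes. The function name pairwise_keys is illustrative and does not come from the original text.

```python
def pairwise_keys(n: int) -> int:
    """Number of distinct symmetric keys needed so that every pair
    among n users shares its own key: (n^2 - n) / 2."""
    return (n * n - n) // 2


for users in (2, 10, 100, 1000):
    print(f"{users:>5} users need {pairwise_keys(users):>7} keys")
```

Ten users already need 45 keys, and a thousand users need 499,500.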
Public key algorithms overcome these problems by separating the encryption and decryption keys. In theory, public key technology makes it relatively easy to send somebody an encrypted message. People who wish to receive encrypted messages will typically publish their public keys in directories or make their keys otherwise readily available. Then, to send somebody an encrypted message, all you have to do is get a copy of her public key, encrypt your message, and send it to her. With a good public key system, you know that the only person who can decrypt the message is the person who has possession of the matching private key. Furthermore, all you really need to store on your own machine is your private key (though it’s convenient and unproblematic to have your public key available as well).
Public key cryptography can also be used for creating digital signatures. Like a real signature, a digital signature is used to denote authenticity or intention. For example, you can sign a piece of electronic mail to indicate your authorship in a manner akin to signing a paper letter. And as with signing a bill of sale agreement, you can electronically sign a transaction to indicate that you wish to purchase or sell something. Using public key technology, you use the private key to create the digital signature; others can then use your matching public key to verify the signature.
Unfortunately, public key algorithms are computationally expensive. In practice, public key encryption and decryption require as much as 1000 times more computer power than an equivalent symmetric key encryption algorithm.
To get both the benefits of public key technology and the speed of symmetric encryption systems, most modern encryption systems actually use a combination:
Hybrid public/private cryptosystems
With these systems, slower public key cryptography is used to exchange a random session key, which is then used as the basis of a private (symmetric) key algorithm. (A session key is used only for a single encryption session and is then discarded.) Nearly all practical public key cryptography implementations are actually hybrid systems.
Finally, there is a special class of functions that are almost always used in conjunction with public key cryptography. These algorithms are not encryption algorithms at all. Instead, they are used to create a “fingerprint” of a file or a key:
Message digest functions
A message digest function generates a seemingly random pattern of bits for a given input. The digest value is computed in such a way that finding a different input that will exactly generate the given digest is computationally infeasible. Message digests are often regarded as fingerprints for files. Most systems that perform digital signatures encrypt a message digest of the data rather than the actual file data itself.
Cryptographic Strength of Symmetric Algorithms
Different encryption algorithms are not equal. Some systems are not very good at protecting data, allowing encrypted information to be decrypted without knowledge of the requisite key. Others are quite resistant to even the most determined attack. The ability of a cryptographic system to protect information from attack is called its strength. Strength depends on many factors, including:
• The secrecy of the key.
• The difficulty of guessing the key or trying out all possible keys (a key search). Longer keys are generally more difficult to guess or find.
• The difficulty of inverting the encryption algorithm without knowing the encryption key (breaking the encryption algorithm).
• The existence (or lack) of back doors, or additional ways by which an encrypted file can be decrypted more easily without knowing the key.
• The ability to decrypt an entire encrypted message if you know the way that a portion of it decrypts (called a known plaintext attack).
• The properties of the plaintext and knowledge of those properties by an attacker. For example, a cryptographic system may be vulnerable to attack if all messages encrypted with it begin or end with a known piece of plaintext.
In general, cryptographic strength is not proven; it is only disproven. When a new encryption algorithm is proposed, the author of the algorithm almost always believes that the algorithm offers complete security—that is, the author believes there is no way to decrypt an encrypted message without possession of the corresponding key. After all, if the algorithm contained a known flaw, then the author would not propose the algorithm in the first place (or at least would not propose it in good conscience).
As part of studying the strength of an algorithm, a mathematician can show that the algorithm is resistant to specific kinds of attacks that have been previously shown to compromise other algorithms. Unfortunately, even an algorithm that is resistant to every known attack is not necessarily secure, because new attacks are constantly being developed.
From time to time, some individuals or corporations claim that they have invented new symmetric encryption algorithms that are dramatically more secure than existing algorithms. Generally, these algorithms should be avoided. As there are no known attacks against the encryption algorithms that are in wide use today, there is no reason to use new, unproven encryption algorithms—algorithms that might have flaws lurking in them.
Key Length with Symmetric Key Algorithms
Short keys can significantly compromise the security of encrypted messages, because an attacker can merely decrypt the message with every possible key so as to decipher the message’s content. But while short keys provide comparatively little security, extremely long keys do not necessarily provide significantly more practical security than keys of moderate length. That is, while keys of 40 or 56 bits are not terribly secure, a key of 256 bits does not offer significantly more real security than a key of 168 bits, or even a key of 128 bits.
If you are attempting to decrypt a message and do not have a copy of the key, the simplest way to decrypt the message is to do a brute force attack. These attacks are also called key search attacks, because they involve trying every possible key to see if that key decrypts the message. If the key is selected at random, then on average, an attacker will need to try half of all the possible keys before finding the actual decryption key.
Inside a computer, a cryptographic key is represented as a string of binary digits. Each binary digit can be a 0 or a 1. In general, each added key bit doubles the number of keys.
So how many bits is enough? That depends on how fast the attacker can try different keys and how long you wish to keep your information secure. If an attacker can try only 10 keys per second, then a 40-bit key will protect a message for more than 3,484 years. Of course, today’s computers can try many thousands of keys per second, and with special-purpose hardware and software, they can try hundreds of thousands. Key search speed can be further improved by running the same program on hundreds or thousands of computers at a time. Thus, it’s possible to search a million keys per second or more using today’s technology. If you have the ability to search a million keys per second, you can try all 40-bit keys in only 13 days.
If a key that is 40 bits long is clearly not sufficient to keep information secure, how many bits are necessary? If you could search a billion keys per second, trying all 80-bit keys would still require 38 million years. A 128-bit key search would require 10^22 years with current technology, and hundreds of millions of years even with advances in quantum computing. As our Sun is likely to become a red giant within the next 4 billion years and, in so doing, destroy the Earth, a 128-bit encryption key should be sufficient for most cryptographic uses, assuming that there are no other weaknesses in the algorithm used.
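The key search figures above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative, a year is assumed to be 365.25 days, and the search rates are the ones quoted in the text rather than measurements of any real hardware.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365.25 * SECONDS_PER_DAY  # assumption: Julian year


def full_search_seconds(key_bits: int, keys_per_second: float) -> float:
    """Time to try every one of the 2**key_bits possible keys."""
    return 2 ** key_bits / keys_per_second


def full_search_years(key_bits: int, keys_per_second: float) -> float:
    return full_search_seconds(key_bits, keys_per_second) / SECONDS_PER_YEAR


# 40-bit key at 10 keys per second
print(f"{full_search_years(40, 10):,.0f} years")
# 40-bit key at a million keys per second
print(f"{full_search_seconds(40, 1e6) / SECONDS_PER_DAY:.1f} days")
# 80-bit and 128-bit keys at a billion keys per second
print(f"{full_search_years(80, 1e9):.2e} years")
print(f"{full_search_years(128, 1e9):.2e} years")
```

These are times for an exhaustive search. With a randomly chosen key, the expected time to find it is about half of each figure.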
Common Symmetric Key Algorithms
There are many symmetric key algorithms in use today. Some of the algorithms that are commonly encountered in the field of computer security are summarized below; a more complete list of algorithms is in (PUIS, 169-176):
DES
The Data Encryption Standard was adopted as a U.S. government standard in 1977 and as an ANSI standard in 1981. The DES is a block cipher that uses a 56-bit key and has several different operating modes depending on the purpose for which it is employed. The DES is a strong algorithm, but today the short key length limits its use. Indeed, in 1998 a special-purpose machine for “cracking DES” was created by the Electronic Frontier Foundation (EFF) for under $250,000. In one demonstration, it found the key to an encrypted message in less than a day in conjunction with a coalition of computer users around the world.
Triple-DES
Triple-DES is a way to make the DES dramatically more secure by using the DES encryption algorithm three times with three different keys, for a total key length of 168 bits. Also called “3DES,” this algorithm has been widely used by financial institutions and by the Secure Shell program (ssh). Simply using the DES twice with two different keys does not improve its security to the extent that one might at first suspect because of a theoretical kind of known plaintext attack called meet-in-the-middle, in which an attacker simultaneously attempts encrypting the plaintext with a single DES operation and decrypting the ciphertext with another single DES operation, until a match is made in the middle.
Blowfish
Blowfish is a fast, compact, and simple block encryption algorithm invented by Bruce Schneier. The algorithm allows a variable-length key, up to 448 bits, and is optimized for execution on 32- or 64-bit processors. The algorithm is unpatented and has been placed in the public domain. Blowfish is used in the Secure Shell and other programs.
IDEA
The International Data Encryption Algorithm (IDEA) was developed in Zurich, Switzerland, by James L. Massey and Xuejia Lai and published in 1990. IDEA uses a 128-bit key. IDEA is used by the popular program PGP to encrypt files and electronic mail. Unfortunately, wider use of IDEA has been hampered by a series of software patents on the algorithm, which are currently held by Ascom-Tech AG in Solothurn, Switzerland.
RC4
This stream cipher was originally developed by Ronald Rivest and kept as a trade secret by RSA Data Security. The algorithm was revealed by an anonymous Usenet posting in 1994 and appears to be reasonably strong. RC4 allows keys between 1 and 2048 bits.
Rijndael (AES)
This block cipher was developed by Joan Daemen and Vincent Rijmen, and chosen in October 2000 by the National Institute of Standards and Technology to be the United States’ new Advanced Encryption Standard. Rijndael is an extraordinarily fast and compact cipher that can use keys that are 128, 192, or 256 bits long.
Cryptographers establish the strength of their algorithms through a process of peer review. When an algorithm is published, other cryptographers may look for flaws or weaknesses. Do not trust people who say they’ve developed a new encryption algorithm, but also say that they don’t want to disclose how the algorithm works because such disclosure would compromise the strength of the algorithm. In practice, there is no way to keep an algorithm secret: true security lies in openness.
On the other hand, it’s important to realize that simply publishing an algorithm or a piece of software does not guarantee that flaws will be found. The WEP (Wired Equivalent Privacy) encryption algorithm used by the 802.11 networking standard was published for many years before a significant flaw was found in the algorithm; the flaw had been there all along, but no one had bothered to look for it.
One-Time Pads
There is a provably unbreakable symmetric key cryptosystem – the one-time pad system. In a one-time pad system, the communicating parties share a key composed of a very long stream of random bytes (longer than the message that is to be sent). The message is encrypted and decrypted by transforming each byte of the message by a byte of the key, after which that key byte is discarded and never used again. Because the key is random and nonrepeating, even a key search attack is infeasible, because every possible message can be produced by some key.
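The one-time pad described above can be sketched in a few lines. The text says only that each message byte is transformed by a key byte; this sketch assumes the transformation is XOR, which is the usual choice. The function names and sample messages are illustrative, and Python's secrets module stands in for a source of truly random key material.

```python
import secrets


def generate_pad(length: int) -> bytes:
    """Produce `length` random key bytes; each pad must be used only once."""
    return secrets.token_bytes(length)


def otp_transform(data: bytes, pad: bytes) -> bytes:
    """XOR each data byte with the matching pad byte.
    The same operation both encrypts and decrypts."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, pad))


message = b"ATTACK AT DAWN"
pad = generate_pad(len(message))
ciphertext = otp_transform(message, pad)
recovered = otp_transform(ciphertext, pad)

# Why key search fails: for any other message of the same length there is
# a pad that "decrypts" the ciphertext to it, so an attacker cannot tell
# which candidate plaintext is the real one.
decoy = b"RETREAT AT TEN"
decoy_pad = otp_transform(ciphertext, decoy)
decoy_recovered = otp_transform(ciphertext, decoy_pad)
```

The decoy shows why trying every key reveals nothing: each candidate key yields some plausible message, and nothing distinguishes the right one.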
Unfortunately, one-time pads have several limitations that make them impractical. In addition to the usual symmetric encryption problems of securely distributing and managing keys, generating large amounts of truly random data is not always straightforward, and distributing large amounts of key material can be difficult. Nevertheless, this system is sometimes used for extremely high-security communications links.
Public Key Algorithms
Public key algorithms are more difficult to create than symmetric key algorithms, and there are fewer in use. Because the keys of symmetric and asymmetric encryption algorithms are used in fundamentally different ways, it is not possible to infer the relative cryptographic strength of these algorithms by comparing the length of their keys – key lengths in public key algorithms typically range from 512 to 2048 or 4096 bits; for most users, 1024 bits are sufficient for the foreseeable future. The following list summarizes the public key systems in common use today:
Diffie-Hellman key exchange
A system for exchanging cryptographic keys between active parties. Diffie-Hellman is not actually a method of encryption and decryption, but a method of developing and exchanging a shared private key over a public communications channel. In effect, the two parties agree to some common numerical values, and then each party creates a key. Mathematical transformations of the keys are exchanged. Each party can then calculate a third session key that cannot easily be derived by an attacker who knows both exchanged values.
DSA/DSS
The Digital Signature Standard (DSS) was developed by the U.S. National Security Agency and adopted as a Federal Information Processing Standard (FIPS) by the National Institute of Standards and Technology. DSS is based on the Digital Signature Algorithm (DSA). Although DSA allows keys of any length, only keys between 512 and 1024 bits are permitted under the DSS FIPS. As specified, DSS can be used only for digital signatures, although it is possible to use some DSA implementations for encryption as well.
Elliptic curves
Elliptic curve cryptosystems are public key encryption systems that are based on an elliptic curve rather than on a traditional logarithmic function. The advantage to using elliptic curve systems stems from the fact that there are no known computationally feasible algorithms for computing discrete logarithms of elliptic curves. Thus, short keys in elliptic curve cryptosystems can offer a high degree of confidentiality and security, while remaining very fast to calculate. Elliptic curves can also be computed very efficiently in hardware.
RSA
RSA is a well-known public key cryptography system developed in 1977 by three professors then at MIT: Ronald Rivest, Adi Shamir, and Leonard Adleman. RSA can be used both for encrypting information and as the basis of a digital signature system. Digital signatures can be used to prove the authorship and authenticity of digital information. The key may be any length, depending on the particular implementation used.
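The Diffie-Hellman exchange summarized in the list above can be illustrated with modular exponentiation. This is a toy sketch, not a secure implementation: the modulus (the Mersenne prime 2^127 - 1) and the base 3 are assumptions chosen for brevity, and real deployments use much larger, carefully selected parameters. The names PRIME, BASE, make_key_pair, and shared_key are illustrative.

```python
import secrets
from typing import Tuple

# Toy public parameters that both parties agree on in advance (assumed values).
PRIME = 2**127 - 1
BASE = 3


def make_key_pair() -> Tuple[int, int]:
    """Return (private value, public value); only the public value is sent."""
    private = secrets.randbelow(PRIME - 3) + 2  # random integer in [2, PRIME - 2]
    public = pow(BASE, private, PRIME)
    return private, public


def shared_key(own_private: int, other_public: int) -> int:
    """Combine your own private value with the other party's public value."""
    return pow(other_public, own_private, PRIME)


alice_private, alice_public = make_key_pair()
bob_private, bob_public = make_key_pair()

# Only alice_public and bob_public cross the public channel.
alice_key = shared_key(alice_private, bob_public)
bob_key = shared_key(bob_private, alice_public)
```

Both parties arrive at the same value because (BASE^a)^b and (BASE^b)^a are equal modulo PRIME, while an eavesdropper sees only the two public values.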
Message Digest Functions
Message digest functions distill the information contained in a file (small or large) into a single large number, typically between 128 and 256 bits in length. The best message digest functions combine these mathematical properties:
a. Every bit of the message digest function’s output is potentially influenced by every bit of the function’s input.
b. If any given bit of the function’s input is changed, every output bit has a 50 percent chance of changing.
c. Given an input file and its corresponding message digest, it should be computationally infeasible to find another file with the same message digest value.
In theory, two different files can have the same message digest value. This is called a collision. For a message digest function to be secure, it should be computationally infeasible to find or produce these collisions.
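Property (b) above can be observed directly. The sketch below flips a single bit of the input and counts how many bits of the digest change. It uses SHA-256 from Python's hashlib as a convenient example function; the sample messages and the function name are illustrative.

```python
import hashlib


def digest_bits_changed(a: bytes, b: bytes) -> int:
    """Count the bit positions in which the SHA-256 digests of a and b differ."""
    da = int.from_bytes(hashlib.sha256(a).digest(), "big")
    db = int.from_bytes(hashlib.sha256(b).digest(), "big")
    return bin(da ^ db).count("1")


original = b"Pay Alice 100 dollars"
altered = b"Pay Alice 101 dollars"  # '0' and '1' differ in exactly one bit

changed = digest_bits_changed(original, altered)
print(f"{changed} of 256 digest bits changed")
```

For a well-behaved digest function the count should land near half of the 256 output bits, even though only one input bit changed.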
Many message digest functions have been proposed and are now in use. Here are a few:
MD2
Message Digest #2, developed by Ronald Rivest. This message digest is probably the most secure of Rivest’s message digest functions, but takes the longest to compute. As a result, MD2 is rarely used. MD2 produces a 128-bit digest.
MD4
Message Digest #4, also developed by Ronald Rivest. This message digest algorithm was developed as a fast alternative to MD2. Subsequently, MD4 was shown to have a possible weakness. That is, it may be possible to find a file that produces the same MD4 as a given file without requiring a brute force search (which would be infeasible for the same reason that it is infeasible to search a 128-bit keyspace). MD4 produces a 128-bit digest.
MD5
Message Digest #5, also developed by Ronald Rivest. MD5 is a modification of MD4 that includes techniques designed to make it more secure. Although MD5 is widely used, a few flaws were discovered in it in the summer of 1996 that allowed some kinds of collisions in a weakened form of the algorithm to be calculated. As a result, MD5 is slowly falling out of favor. MD5 and SHA-1 are both used in SSL and in Microsoft’s Authenticode technology. MD5 produces a 128-bit digest.
SHA
The Secure Hash Algorithm, related to MD4 and designed for use with the National Institute of Standards and Technology’s Digital Signature Standard (NIST’s DSS). Shortly after the publication of the SHA, NIST announced that it was not suitable for use without a small change. SHA produces a 160-bit digest.
SHA-1
The revised Secure Hash Algorithm incorporates minor changes from SHA. It is not publicly known if these changes make SHA-1 more secure than SHA, although many people believe that they do. SHA-1 produces a 160-bit digest.
SHA-256, SHA-384, SHA-512
These are, respectively, 256-, 384-, and 512-bit hash functions designed to be used with 128-, 192-, and 256-bit encryption algorithms. These functions were proposed by NIST in 2001 for use with the Advanced Encryption Standard.
Besides these functions, it is also possible to use traditional symmetric block encryption systems such as the DES as message digest functions. To use an encryption function as a message digest function, simply run the encryption function in cipher feedback mode. For a key, use a key that is randomly chosen and specific to the application. Encrypt the entire input file. The last block of encrypted data is the message digest. Symmetric encryption algorithms produce excellent hashes, but they are significantly slower than the message digest functions described previously.
Message digest functions are a powerful tool for detecting very small changes in very large files or messages. To use one this way, calculate the MD5 code for your message and set it aside. If you think that the file has been changed (either accidentally or on purpose), simply recalculate the MD5 code and compare it with the MD5 that you originally calculated. If they match, you can safely assume that the file was not modified.
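The record-and-compare procedure above might look like the following sketch, built on Python's hashlib. The function names are illustrative. The default algorithm here is SHA-256 rather than the MD5 named in the text, which is a substitution on our part; passing algorithm="md5" follows the text literally where the local Python build supports it.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Return the hex digest of a file, reading it in chunks so that
    very large files do not have to fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_unchanged(path: str, recorded_digest: str, algorithm: str = "sha256") -> bool:
    """Recompute the digest and compare it with the one set aside earlier."""
    return file_digest(path, algorithm) == recorded_digest
```

The recorded digest is only useful if it is stored where an attacker who can modify the file cannot also modify the digest.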
Because of their properties, message digest functions are also an important part of many cryptographic systems in use today. Message digests are the basis of most digital signature standards. Instead of signing the entire document, most digital signature standards specify that the message digest of the document be calculated. It is the message digest, rather than the entire document, which is actually signed.
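The idea of signing the digest rather than the document can be shown with a deliberately tiny, textbook-style RSA sketch. Everything here is an assumption made for illustration: the primes are two small Mersenne primes, no padding scheme is used, and the names are hypothetical. It is far too weak for real use and does not follow any particular signature standard. It needs Python 3.8 or later for the modular inverse form of pow.

```python
import hashlib

# Toy key material (assumed values; far too small to be secure).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # private exponent


def digest_as_int(document: bytes) -> int:
    """Reduce the document to a fixed-size number, however large it is."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """The signer applies the private key to the digest, not to the document."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Anyone holding the public key (N, E) can check the signature."""
    return pow(signature, E, N) == digest_as_int(document)
```

Because only the digest passes through the slow public key operation, the cost of signing is nearly independent of the size of the document.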
Message digests can also be readily used to build message authentication codes (MACs), which use a shared secret between two parties to prove that a message is authentic. MACs are appended to the end of the message to be verified. (RFC 2104 describes how to use keyed hashing for message authentication.) MACs based on message digests provide the “cryptographic” security for most of the Internet’s routing protocols.
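A minimal sketch of the keyed hashing approach from RFC 2104, using Python's standard hmac module with SHA-256 as the underlying digest. The choice of SHA-256, the helper names, and the framing (the tag is simply appended to the message) are assumptions for illustration, not a description of any particular routing protocol.

```python
import hashlib
import hmac

TAG_LENGTH = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def append_mac(shared_secret: bytes, message: bytes) -> bytes:
    """Sender: compute the MAC and append it to the end of the message."""
    tag = hmac.new(shared_secret, message, hashlib.sha256).digest()
    return message + tag


def check_and_strip_mac(shared_secret: bytes, received: bytes) -> bytes:
    """Receiver: recompute the MAC and return the message only if it matches."""
    if len(received) < TAG_LENGTH:
        raise ValueError("received data is too short to contain a MAC")
    message, tag = received[:-TAG_LENGTH], received[-TAG_LENGTH:]
    expected = hmac.new(shared_secret, message, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed: message altered or wrong secret")
    return message
```

Unlike a plain digest, the MAC cannot be recomputed by someone who alters the message in transit, because doing so requires the shared secret.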
Maintaining Integrity
Maintaining the integrity of information stored on your computers is critical to overall security and reliable operation. You must ensure the integrity of your operating system, the integrity of your applications, and the integrity of your data. For operating systems and applications, this requires not only monitoring for unwanted changes to your software, but also applying necessary security patches and updates to keep your software protected.
Keeping Systems Up to Date
From the moment a workstation or server is connected to the Internet, it is open to discovery and attempted access by unwanted outsiders. Attackers find new Internet hosts with amazing speed. Detailed reports on the aggressiveness of attackers can be found at the website maintained by The Honeynet Project, http://project.honeynet.org/. In one case, a newly-configured Honeynet system was successfully penetrated 15 minutes after the computer was placed on the network. It is thus imperative that any system that will be on a network be kept up-to-date with security fixes – both before connecting it to the network and after.
Software Management Systems
A software management system is a set of tools and procedures for keeping track of which versions of what software you’ve got installed, and whether any local changes have been made to the software or its configuration files. Without such a system, it is impossible to know whether a piece of software needs to be updated or what local changes have been made and need to be preserved after the update. Using some software management system to keep up-to-date is essential for security purposes, and useful for non-security upgrades as well.
Fortunately, nearly all Unix systems and Microsoft NT-based systems provide some form of software management for the core components of the operating system and applications distributed with it. The most common approaches are managing packages — precompiled executables and supporting files — and managing the software source code from which executables can be compiled and installed.
Package-based Systems
A typical package file is a file containing a set of executable programs, already compiled, along with any supporting files such as libraries, default configuration files, and documentation. Under most packaging systems, the package also contains some meta-data, such as:
• Version information for the software it contains
• Information about compatible operating system versions or hardware architectures
• Lists of other packages that the package requires
• Lists of other packages with which the package conflicts
• Lists of which included files are configuration files (or otherwise likely to be changed by users once installed)
• Commands to run before, during, or after the included files are installed
The other important component of a package-based system is a database of which versions of which packages have been installed on the system. On Windows systems, the Registry often serves this purpose.
Package-based systems are easy to use: with a simple command or two, a system administrator can install new software or upgrade their current software when a new or patched version is released. Because the packaged executables are already compiled for the target operating system and hardware platform, the administrator doesn’t have to spend time building (and maybe even porting) the application.
On the other hand, packages are compiled to work on the typical installation of the operating system, and not necessarily on your installation. If you need to tune your applications to work with some special piece of hardware, adapt them to an unusual authentication system, or simply compile them with an atypical configuration setting, source code will likely be more useful to you, if it is available. This is often the case with the kernel on Unix operating systems, for example.
Commercial systems that don’t provide source code are obvious candidates for package-based management. Solaris 2.x, for example, provides the pkgadd, pkgrm, pkginfo, and showrev commands (and others) for adding, removing, and querying packages from the shell, and admintool for managing software graphically. Microsoft Windows systems use the web-based Windows Update to download and install updates to the operating system and core utilities.
Package management isn’t only for commercial systems. Free software Unix distributions provide package management systems to make it easier for system administrators to keep the system up to date. Several Linux distributions have adopted the RPM Package Manager (RPM) system. This system uses a single command, rpm, for all of its package management functions. Debian GNU/Linux uses an alternative package management system called dpkg. The BSD-based Unix systems focus on source-based updates, but also provide a collection of precompiled packages that are managed with the pkg_add, pkg_delete, and pkg_info commands.
Source-based Systems
In contrast to package-based systems, source-based systems focus on helping the system administrator maintain an up-to-date copy of the operating system’s or application’s source code, from which new executables can be compiled and installed. Source-based management has its own special convenience: a source-based update comes in only a single version, as opposed to compiled packages, which must be separately compiled and packaged for each architecture or operating system on which the software runs. Source-based systems can also be particularly useful when it’s necessary to make local source code changes.
From a security standpoint, building packages from source code can be a mixed blessing. On the one hand, you are free to inspect the source code and determine if there are any lurking bugs or Trojan horses. In practice, such inspection is difficult and rarely done. Moreover, if an attacker can get access to your source code, it is not terribly difficult for the attacker to add a Trojan horse of her own! To avoid this problem, you need to be sure both that the source code you are compiling is for a reliable application and that you have the genuine source code.
Source code and patches
The simplest approach to source management is to keep application source code available on the system and recompile it whenever it’s changed. When a patch to an application is released, it typically takes the form of a patch diff, a file that describes which lines in the old version should be changed, removed, or added to in order to produce the new version. The diff program produces these files, and the patch program is used to apply them to an old version to create the new version. After patching the source code, the system administrator recompiles and reinstalls the application.
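To show what such a patch file contains, the sketch below generates one with Python's difflib module, which emits the same unified format that diff -u produces. The file names and the C fragment are made up for illustration. Applying the result to an old source tree remains the job of the patch program.

```python
import difflib

# Hypothetical old and new versions of a small source file.
old_version = [
    "#define MAX_LEN 64\n",
    "char buf[MAX_LEN];\n",
    "strcpy(buf, input);\n",
]
new_version = [
    "#define MAX_LEN 64\n",
    "char buf[MAX_LEN];\n",
    "strncpy(buf, input, MAX_LEN - 1);\n",
    "buf[MAX_LEN - 1] = 0;\n",
]

# Lines starting with "-" are removed from the old version,
# lines starting with "+" are added, and unmarked lines are context.
patch_lines = list(
    difflib.unified_diff(
        old_version, new_version, fromfile="app.c.orig", tofile="app.c"
    )
)
print("".join(patch_lines))
```

The context lines let the patch program locate the right spot even if the administrator's copy has drifted slightly from the version the patch was made against.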
For example, FreeBSD and related versions of Unix distribute many applications in their ports collection. An application in the ports collection consists of the original source code from the application’s author along with a set of patches that have been applied to better integrate the application into the BSD environment. The makefiles included in the ports system automatically build the application, install it, and then register the application’s files with the BSD pkg_add command. This approach is widely used for maintaining third-party software on FreeBSD systems.
CVS
Another approach to source management is to store the source code on a server using a source code versioning system such as the Concurrent Versions System (CVS), and configure the server to allow anonymous client connections. Users who want to update their source code to the latest release use the cvs program to “check out” the latest patched version from the remote server’s repository. The updated code can then be compiled and installed.
FreeBSD, NetBSD, and OpenBSD use CVS to distribute and maintain their core operating system software. In addition, tens of thousands of open source software projects maintain CVS servers of their own, or are hosted at sites such as sourceforge.net that provide CVS repositories. A good reference on CVS is Essential CVS, published by O’Reilly and Associates.
Updating System Software
It is imperative that you ensure that patches are available for all known security problems in the software you run, that you find those patches, and that you apply them – ideally, before the system is connected to a network. Similarly, once the system is up and running, you must be vigilant to learn about newly discovered security problems in your operating system and applications so as to apply patches for them as they become available.
The most secure way to patch a new installation is to download the patches to another computer that’s already connected to the Internet and updated with the latest security patches (perhaps a Mac or PC client that runs no server services). Once downloaded, they can be burned onto a CD or transferred to the new system using a local network connection, and applied. This approach is also convenient if you have many computers running the same operating system to update, and a slow network connection. Updates can be transferred once, and then applied on each machine from the CD. For Microsoft systems, the Windows Update Catalog web site provides downloadable updates that can be used in this fashion.
If no other Internet-connected host is available or suitable, the new host may have to be connected before the patches are applied. In this case, disable all network servers on the machine, and make the connection as brief as possible — only long enough to download the required patches — and then physically remove the machine from the network while the patches are applied. This process can be made even more secure if the machine’s connection can be protected by a stateful firewall or a router that implements network address translation, so that the only packets that can reach the new host are those associated with a connection initiated by the new host.
You can’t stay up-to-date with software that you don’t know you’ve installed. An important component of any ongoing updating process is to inventory your system and keep track of new applications that you’ve installed. Operating systems that use packages usually provide commands that will let you determine which packages you have installed. Source-based software management typically relies on keeping all of the source code to the installed applications in a single location where it can be easily found.
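One way to keep track of what has been installed is to save the package list periodically and compare snapshots. The Python sketch below assumes the inventory has already been collected into a dictionary mapping package names to versions (how that is done depends on the package manager); the function name and the sample data are hypothetical.

```python
def compare_inventories(previous, current):
    """Compare two {package: version} snapshots.

    Returns three sorted lists: packages added, packages removed,
    and packages whose version string changed.
    """
    added = sorted(set(current) - set(previous))
    removed = sorted(set(previous) - set(current))
    changed = sorted(name for name in set(previous) & set(current)
                     if previous[name] != current[name])
    return added, removed, changed

# Made-up snapshots for illustration only.
last_month = {"bzip2-devel": "1.0.1-4", "lynx": "2.8.4-17"}
today = {"bzip2-devel": "1.0.2-2", "wget": "1.8.2-3"}

added, removed, changed = compare_inventories(last_month, today)
print("added:", added)
print("removed:", removed)
print("changed:", changed)
```

Anything that appears in the "added" list without a corresponding change record deserves a closer look.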
Learning about patches
There are several avenues for learning about security problems and patches for operating systems and applications.
• Every Unix operating system and most major applications, such as web servers, have an associated mailing list for announcements of new versions. Microsoft offers e-mail notification of security bulletins through the Microsoft Profile Center (http://register.microsoft.com/regsys/pic.asp). Many vendors maintain a separate list for announcements of security-related issues. Subscribe to these lists and pay attention to the messages.
• Several mailing lists, such as BugTraq and NT-BugTraq, collect and distribute security alerts for many products. Subscribe to these lists (perhaps in digest form) and pay attention to the messages.
• Many operating system and application developers post security and release announcements in relevant USENET newsgroups (for example, the BIND name server announcements appear in comp.protocols.dns.bind). Skim these newsgroups regularly.
• If your vendor provides a subscription patch CD service, consider subscribing. Although these CDs may not provide up-to-the-minute patches, they can save a lot of time when bringing up a new system by reducing the number of patches that need to be downloaded.
• Automatic update systems compare installed packages with the latest versions of packages available on the vendor’s web site and report which packages are out-of-date. Most also can be configured to automatically download and install the upgraded packages, which can be useful if you don’t change your configuration from the vendor defaults, and you trust the vendor to upgrade your system. Some can be run automatically on a scheduled basis; others must be run manually.
• Finally, you can manually check the vendor’s website on a regular basis for new versions of software.
Once you learn about a security patch, don’t wait – apply it immediately. Vulnerabilities that become public begin to be exploited almost immediately. (Patches that add new features, rather than fixing security vulnerabilities, do not require the same urgency.)
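The comparison that an automatic update system performs can be sketched in a few lines of Python. This is a simplification under stated assumptions: version strings are compared by their numeric components only, whereas real package managers apply more elaborate ordering rules, and the package names and versions below are invented.

```python
import re

def version_key(version):
    """Turn a string such as '1.0.2-2' into (1, 0, 2, 2) for comparison.

    Only the numeric components are considered, which is a simplification.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))

def out_of_date(installed, available):
    """List (name, installed_version, latest_version) for stale packages."""
    stale = []
    for name, have in sorted(installed.items()):
        latest = available.get(name)
        if latest is not None and version_key(latest) > version_key(have):
            stale.append((name, have, latest))
    return stale

# Hypothetical data: what is installed versus what the vendor offers.
installed = {"bzip2-devel": "1.0.1-4", "openssl": "0.9.6-9"}
available = {"bzip2-devel": "1.0.2-2", "openssl": "0.9.6-9"}

for name, have, latest in out_of_date(installed, available):
    print(name, have, "->", latest)
```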
Downloading and Verifying Patches
Whether you use packages or source code, you’ve got to get the files from somewhere. Vendors typically make their applications available on the Internet through the World-Wide Web or an anonymous FTP site. When an operating system or application is popular, however, a single Web site or FTP site can’t keep up with the demand to download it, so many software vendors arrange to have other sites serve as mirrors for their site. Users are encouraged to download the software from the mirror site closest (in network geography) to them. In principle, all of the software on the vendor’s site is replicated to each mirror site on a regular (often daily) basis.
Mirror sites provide an important security benefit, by making the availability of software more reliable through redundancy. They are also useful when you have a fast network connection to the mirror site, but a slow connection to the principal site. On the other hand, mirror sites also create some security concerns:
• The administrators of the mirror site control their local copies of the software, and may have the ability to corrupt it, replace it with a trojaned version, etc. You must trust not only the vendor but also the administrators of the mirror site. If the vendor distributes digital signatures along with the software (for example, detached PGP signatures with source code archives, gnupg signatures in rpm files, or ActiveX code signatures), you can be more sure that you’re receiving the software as released by the vendor, as long as you acquire the vendor’s public key directly – not through the mirror! Some update systems automatically check signatures before an update will be applied.
• Even if you trust the mirror, daily updating may not be fast enough. If a critical security patch is released, you may not have time to wait 24 hours for your local mirror to be updated. In these cases, there is no substitute for downloading the patch directly from the vendor as soon as possible.
Using a mirror site is thus a trade-off between the convenience of being able to get a high-speed download when you want it, and the necessity to possibly extend your trust to a third party.
Be very wary of applying patches found in mailing lists and on bulletin boards: at worst, they may be planted to trick people into installing a new vulnerability. At best, they are often produced by inexperienced programmers whose systems are unlike yours, so their solutions may cause more damage than they fix.
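When a vendor publishes a cryptographic digest for a download, the file can be checked before it is installed. The Python sketch below compares a file's SHA-256 digest with a published value. Note that this is a plain digest comparison, not a PGP signature check, and it is only meaningful if the published digest was obtained from the vendor directly rather than from the mirror. The function names are illustrative.

```python
import hashlib
import hmac

def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def matches_published_digest(path, published_hex):
    """Return True if the file's digest equals the vendor's published one."""
    actual = sha256_of_file(path)
    # Normalize whitespace and case before comparing the hex strings.
    return hmac.compare_digest(actual, published_hex.strip().lower())
```

A typical use would be to call matches_published_digest() with the downloaded archive and the digest copied from the vendor's own announcement, and to refuse to install the file if the result is False.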
Upgrading applications
Under Unix-based package management systems, upgrading a package is usually a very simple procedure. For example, to upgrade the bzip2-devel package on a system that uses the RPM package manager:
# ls -l *.rpm
-rw-r--r-- 1 root root 33708 Apr 16 23:15 bzip2-devel-1.0.2-2.i386.rpm
# rpm -K bzip2-devel-1.0.2-2.i386.rpm Check the checksum and signature
bzip2-devel-1.0.2-2.i386.rpm: md5 OK
# rpm -Uvh bzip2-devel-1.0.2-2.i386.rpm Upgrade the package
Preparing... ########################################### [100%]
1:bzip2-devel ########################################### [100%]
# rpm -q bzip2-devel Confirm that the version is now 1.0.2-2
bzip2-devel-1.0.2-2
Installing a Solaris security patch is similarly easy. After patch 104489-15.tar.Z has been downloaded from http://sunsolve.sun.com, the installpatch script bundled inside the patch archive is used to install the patch:
% ls *.tar.Z
104489-15.tar.Z
% uncompress *.Z
% tar xf 104489-15.tar
% cd 104489-15
% ls
.diPatch* SUNWtltk/ backoutpatch* postbackout*
Install.info* SUNWtltkd/ installpatch* postpatch*
README.104489-15 SUNWtltkm/ patchinfo*
% su
Password: password
# ./installpatch .
Checking installed patches...
Generating list of files to be patched...
Verifying sufficient filesystem capacity (exhaustive method)...
Installing patch packages...
Patch number 104489-15 has been successfully installed.
See /var/sadm/patch/104489-15/log for details
Executing postpatch script...
Patch packages installed:
SUNWtltk
SUNWtltkd
SUNWtltkm
# showrev -p | egrep 104489
Patch: 104489-01 Obsoletes: Packages: SUNWtltk, SUNWtltkd
Patch: 104489-14 Obsoletes: Packages: SUNWtltk, SUNWtltkd, SUNWtltkm
Patch: 104489-15 Obsoletes: Packages: SUNWtltk, SUNWtltkd, SUNWtltkm
If you’re using source-based management, upgrading involves either a CVS checkout of the updated source code or applying a patch file to the old source code to update it. In either case, the source code must then be recompiled and reinstalled. Here is an example of applying a patch to an application:
% ls -ld *
-rw-rw---- 1 dunemush dunemush 188423 Jul 20 12:07 1.7.5-patch09
drwx------ 10 dunemush dunemush 4096 Jul 4 16:15 pennmush/
% cd pennmush
% patch -p1 -s < ../1.7.5-patch09
% make
....source code compile messages...
% make install
...installation messages...
%
If you’re upgrading a server program, of course, you will need to stop the running server process and restart it to run the newly installed version — simply changing the server program on disk is not sufficient!
Upgrading applications on Microsoft Windows systems is typically more eccentric. If the application is one of the core Microsoft applications, like Internet Explorer or Media Player, Windows Update will handle patches. But each third-party application must provide its own approach to upgrades. Some may require you to remove the older version and install the new one, others may suggest you simply install the new version over the older, and others may have their own built-in update functionality (antivirus engines are particularly notable in this regard). You’ll have to examine each application individually.
Backing Out and Backing Up
Not every upgrade is a panacea. Sometimes upgrades cause more problems than they solve, either because they break important functionality or because they don’t provide the desired fix. It’s important to be able to revert to the pre-upgrade software if the upgrade should prove troublesome.
There are two basic strategies for recovering from a bad upgrade. First, it may be possible to “back out” the patch and reinstall the earlier version. Under source-based management systems, the patch program can also be used to remove a previously applied patch, or the earlier version can be checked out from a CVS repository. It can be more difficult to cleanly back out a package. Although most package management software provides a way to overwrite an installed package with an earlier version, if the package dependencies have also been updated, older versions of the dependencies may also have to be located and installed. Many, but not all, Microsoft patches are capable of uninstalling themselves or provide uninstall instructions.
A second strategy for source-based systems is to locally back up older versions of software. By keeping older versions of source code, it’s generally not difficult to reinstall the earlier version. Multiple versions can be kept in separate directories in /usr/src, or a version control system such as RCS or CVS can be used locally to track multiple versions of software in the same directory.
Perhaps the most reliable method is to perform a full backup of your system prior to the changes. Then, if the upgrade goes badly, you can restore your system to the prior state.
Integrity Monitoring
Insuring that system software is up to date when new patches are released is an important part of maintaining integrity. Equally important is insuring that system software – and your valuable data – doesn’t change when you don’t expect it to. Ideally, no unauthorized user or process would be able to tamper with your information; good server information practices reduce the likelihood of someone gaining privileges they shouldn’t have. In practice, however, it’s necessary to monitor your data on an ongoing basis so that you can discover tampering if it should occur, and to archive your data so you can restore it to a correct state.
Tampering
There are several ways to safeguard against tampering. In addition to using care in the organization of user and file permissions, critical files that change infrequently can be kept on read-only media. Files can also be encrypted so that additional passwords are required to covertly modify the information they contain (though it may be possible to corrupt or delete the files themselves).
There are also many approaches to detecting tampering. For smaller systems or when there are a limited number of key files to protect, making backups of the files on write-once media can be an effective strategy. Files can be regularly compared to their archived counterparts, and if a file is corrupted, the backup can be used to restore it. Of course, when an authorized change is made to a file, the backup must also be updated.
Cryptographic digests of important files can be computed and stored off-line or protected by encryption. As noted earlier, an important property of cryptographic digests is that it is infeasible to generate a new file that will match a given digest. Some antivirus systems can perform a similar function, often called “inoculation”, in which checksums are inserted into executable files themselves. Chapter 5 discusses the use of comparison files and cryptographic digests for ongoing auditing of system data in greater detail.
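The digest approach can be sketched as follows: record a digest for each important file while the system is known to be good, store that record somewhere an intruder cannot alter it, and recompute the digests later. This Python sketch uses SHA-256 from the standard hashlib module; the function names are illustrative, and protecting the stored baseline (off-line or by encryption) is left to the reader.

```python
import hashlib

def file_digest(path):
    """Return the SHA-256 digest of a file's contents as a hex string."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()

def take_baseline(paths):
    """Record the current digest of every file in `paths`."""
    return {path: file_digest(path) for path in paths}

def find_tampering(baseline):
    """Compare files against a baseline; report missing or modified ones."""
    report = {}
    for path, recorded in baseline.items():
        try:
            current = file_digest(path)
        except FileNotFoundError:
            report[path] = "missing"
            continue
        if current != recorded:
            report[path] = "modified"
    return report
```

An empty report means every file still matches its recorded digest. As with write-once backups, the baseline must be regenerated whenever an authorized change is made.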
Backups
Bugs, accidents, natural disasters, and attacks on your system cannot be predicted. Often, despite your best efforts, they can’t be prevented. But if you have backups, you can compare your current system and your backed-up system, and you can restore your system to a stable state. Even if you lose your entire computer—to fire, for instance—with a good set of backups you can restore the information after you have purchased or borrowed a replacement machine. Insurance can cover the cost of a new CPU and disk drive, but your data is something that in many cases can never be replaced.
Years ago, making daily backups was a common practice because computer hardware would often fail for no obvious reason. A backup was the only protection against data loss. Today, hardware failure is still a good reason to back up your system. Hard disk failures are a random process: even though a typical hard disk will now last for five years or more, an organization that has 20 or 30 hard disks can expect a significant drive failure every few months. Drives frequently fail without warning—sometimes only a few days after they have been put into service. It’s prudent, therefore, to back up your system on a regular basis.
Backups can also be an important tool for securing computers against attacks. Specifically, a full backup allows you to see what an intruder has changed, by comparing the files on the computer with the files on the backup. Make your first backup of your computer after you install its operating system, load your applications, and install all of the necessary security patches. Not only will this first backup allow you to analyze your system after an attack to see what has been modified, but it will also save the time of rebuilding your system from scratch in the event of a hardware failure.
How to back up
There are many different forms of backups in use today. Here are just a few:
• Copy your critical files to a high-density removable magnetic or optical disk.
• Periodically copy your disk to a spare or “mirror” disk.
• Instantaneously mirror two disks using either software or hardware RAID systems.
• Make periodic zip, “sit” or “tar” archives of your important files. You can keep these backups on your primary system or you can copy them to another computer, possibly at a different location.
• Make backups onto magnetic or optical tape.
• Back up your files over a network or over the Internet to another computer that you own, or to an Internet backup service. Some of these services can be exceedingly sophisticated. For example, the services can examine the MD5 checksums of your files and only back up files that are “unique.” Thus, if you have a thousand computers, each with a copy of Microsoft Office, none of those application files need to be copied over the network to add them to the backup.
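The "unique files only" idea in the last item can be sketched in Python. Here file contents are held in memory as bytes to keep the example short, and SHA-256 is used in place of the MD5 checksums mentioned above; the function name, paths, and data are hypothetical and do not describe any particular backup service.

```python
import hashlib

def plan_backup(files, stored_digests):
    """Decide which file contents actually need to be sent.

    files: mapping of path -> file contents (bytes).
    stored_digests: set of digests the backup service already holds.
    Returns (catalog, uploads): catalog maps every path to its digest,
    uploads maps each new digest to the contents to send once.
    """
    catalog = {}
    uploads = {}
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        catalog[path] = digest
        if digest not in stored_digests and digest not in uploads:
            uploads[digest] = content
    return catalog, uploads

# Two machines hold an identical application file; only the report is new.
files = {
    "/hostA/office/winword.exe": b"identical application bytes",
    "/hostB/office/winword.exe": b"identical application bytes",
    "/hostA/home/report.doc": b"one-of-a-kind report",
}
already_stored = {hashlib.sha256(b"identical application bytes").hexdigest()}

catalog, uploads = plan_backup(files, already_stored)
print(len(catalog), "files catalogued,", len(uploads), "uploaded")
```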
What to back up
There are two approaches to computer backup systems:
1. Back up everything that is unique to your system—user accounts, data files, and important system directories that have been customized for your computer. This approach saves tape or disk and decreases the amount of time that a backup takes; in the event of a system failure, you recover by reinstalling your computer’s operating system, reloading all of the applications, and then restoring your backup tapes.
2. Back up everything, because restoring a complete system is easier than restoring an incomplete one, and tape is cheap.
The second approach should generally be preferred. While some of the information you back up is already “backed up” on the original distribution disks or tapes you used to load the system onto your hard disk, distribution disks or tapes sometimes get lost. Furthermore, as your system ages, programs get installed in the operating system’s reserved directories as security holes get discovered and patched, and as other changes occur. If you’ve ever tried to restore your system after a disaster, you know how much easier the process is when everything is in the same place.
For this reason, it is recommended that you store everything from your system (and that means everything necessary to reinstall the system from scratch—every last file) onto backup media at regular, predefined intervals. How often you do this depends on the speed of your backup equipment and the amount of storage space allocated for backups, as well as the needs of your organization. You might want to do a total backup once a week, or you might want to do it only twice a year.
Types of Backups
There are three basic types of backups:
Level-zero backup
Makes a copy of your original system. When your system is first installed, before people have started to use it, back up every file and program on the system. Such a backup can be invaluable after a break-in.
Full backup
Makes a copy to the backup device of every file on your computer. This method is similar to a level-zero backup, except that you do it on a regular basis.
Incremental backup
Makes a copy to the backup device of only those items in a filesystem that have been modified after a particular event (such as the application of a vendor patch) or date (such as the date of the last full backup).
Full backups and incremental backups work together. A common backup strategy is:
• Make a full backup on the first day of every other week.
• Make an incremental backup every evening of everything that has been modified since the last full backup. This kind of incremental backup is sometimes called a differential backup, as it archives those files that differ since the last full backup.
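Selecting the files for such a differential backup amounts to comparing each file's modification time with the time of the last full backup. The Python sketch below assumes the caller has recorded that time as seconds since the epoch; the function name is illustrative, and real backup programs track considerably more state than this.

```python
import os

def modified_since(root, since_epoch):
    """Return a sorted list of files under `root` changed after `since_epoch`.

    `since_epoch` is the time of the last full backup, in seconds since
    the epoch, as recorded by the caller.
    """
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > since_epoch:
                selected.append(path)
    return sorted(selected)
```

Because the comparison is always made against the last full backup rather than against last night's run, each evening's set contains everything needed to bring the full backup up to date.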
Most administrators of large systems plan and store their backups by disk drive or partition. Different partitions usually require different backup strategies. Some partitions, such as your system partitions (if they are separate), should probably be backed up whenever you make a change to them, on the theory that every change that you make to them is too important to lose. You should use full backups with these systems, rather than incremental backups, because they are only usable in their entirety. Likewise, partitions that are used solely for storing application programs really only need to be backed up when new programs are installed or when the configuration of existing programs is changed.
On the other hand, partitions that are used for keeping user files are more amenable to incremental backups. But you may wish to make such backups frequently, to minimize the amount of work that would be lost in the event of a failure.
When you make incremental backups, use a rotating set of backup disks or tapes. The backup you do tonight shouldn’t write over the tape you used for your backup last night. Otherwise, if your computer crashes in the middle of tonight’s backup, you would lose the data on the disk, the data in tonight’s backup (because it is incomplete), and the data in last night’s backup (because you partially overwrote it with tonight’s backup). Ideally, perform an incremental backup once a night, and have a different tape for every night of the week.
How Long Should You Keep a Backup?
It may take a week or a month to realize that a file has been deleted. Therefore, you should keep some backup tapes for a week, some for a month, and some for several months. Many organizations make yearly or quarterly backups that they archive indefinitely. Some organizations decide to keep their yearly or biannual backups “forever” — it’s a small investment in the event that it should ever be needed again. In some countries, there may be legal requirements that backups of specific kinds of data (such as accounting records) be kept for a minimum period. On the other hand, it may be important to have a “data destruction” policy that specifies the maximum time backups may be kept.
You may wish to keep on your system an index or listing of the names of the files on your backup tapes. This way, if you ever need to restore a file, you can find the right tape to use by scanning the index, rather than by reading in every single tape. Having a printed copy of these indices is also a good idea, especially if you keep the online index on a system that may need to be restored!
If you keep backups for a long period of time, be sure to migrate the data on your backups each time you purchase a new backup system. Otherwise, you might find yourself stuck with tapes that can’t be read by anyone, anywhere. This has happened to major research universities and even the U.S. National Aeronautics and Space Administration.
Other Backup Tips
There are several other good ways to increase the reliability of your backups:
Use redundant backup sets
You can use two distinct sets of backup tapes to create a tandem backup. With this backup strategy, you create two complete backups (call them A and B) on successive backup occasions. Then, when you perform your first incremental backup, the “A incremental,” you back up all of the files that were created or modified after the last A backup, even if they are on the B backup. The second time you perform an incremental backup, the “B incremental,” you write out all of the files that were created or modified since the last B backup (even if they are on the A incremental backup). This system protects you against media failure, because every file is backed up in two locations. It does, however, double the amount of time that you will spend performing backups.
Replace tapes as needed
Tapes are physical media, and each time you run them through your tape drive they degrade somewhat. Based on your experience with your tape drive and media, you should set a lifetime for each tape. Some vendors establish limits for their tapes (for example, 3 years or 2000 cycles), but others do not. Be certain to see what the vendor recommends, and don’t push that limit. The few pennies you may save by using a tape beyond its useful range will not offset the cost of a major loss.
Keep your tape drives clean
If you make your backups to tape, follow the preventative maintenance schedule of your tape drive vendor, and use an appropriate cleaning cartridge or other process as recommended. Being unable to read a tape because a drive is dirty is inconvenient; discovering that the data you’ve written to tape is corrupt and no one can read it is a disaster.
Verify the backup
On a regular basis you should attempt to restore a few files chosen at random from your backups, to make sure that your equipment and software are functioning properly. Stories abound about computer centers that have lost disk drives and gone to their backup tapes, only to find them all unreadable. This scenario can occur as a result of bad tapes, improper backup procedures, faulty software, operator error, or other problems.
At least once a year, you should attempt to restore your entire system completely from backups to ensure that your entire backup system is working properly. Starting with a different, unconfigured computer, see if you can restore all of your tapes and get the new computer operational. Sometimes you will discover that some critical file is missing from your backup tapes. These practice trials are the best times to discover a problem and fix it.
A related exercise that can prove valuable is to pick a file at random, once a week or once a month, and try to restore it. Not only will this reveal if the backups are comprehensive, but the exercise of doing the restoration may also provide some insight.
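A spot check of this kind can be partly automated. The Python sketch below picks one file at random, asks a caller-supplied restore function to bring it back from the backup, and compares digests of the live and restored copies. The restore argument is a hypothetical hook standing in for whatever your backup software provides, and a mismatch does not always mean a bad backup: the live file may simply have changed since the backup was made.

```python
import hashlib
import random

def digest(path):
    """Return the SHA-256 digest of a file's contents as a hex string."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()

def spot_check(live_files, restore, rng=random):
    """Restore one randomly chosen file and compare it with the live copy.

    live_files: iterable of paths that should be present in the backup.
    restore: hypothetical callable taking a path and returning the path
             of the copy restored from the backup.
    Returns (chosen_path, True if the digests match).
    """
    chosen = rng.choice(sorted(live_files))
    restored = restore(chosen)
    return chosen, digest(chosen) == digest(restored)
```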
An in-depth discussion of backup and restore systems can fill a book; W. Curtis Preston’s Unix Backup & Recovery, published by O’Reilly and Associates, is an excellent one.
Transmission Integrity
Cryptography also provides a solution to the problem of insuring that when you transmit data to someone else over a network, the recipient receives the data as you sent it, protected from accidental corruption or intentional tampering. A typical strategy involves digitally signing the file: computing a cryptographic digest, encrypting the digest with a symmetric or asymmetric algorithm, and then sending the encrypted digest along with the file (which may itself be encrypted for confidentiality). The recipient recomputes the digest from the file and then decrypts the transmitted digest. If the two digests match, the message’s integrity is ensured.
A Hash Message Authentication Code (HMAC) function is another technique for verifying the integrity of a message transmitted between two parties that agree on a shared secret key. Essentially, HMAC combines the original message and a key to compute a message digest of the two. Sometimes additional information, such as a protocol sequence number, is included as well, to thwart replay attacks. The sender of the message computes the HMAC of the message, the key, and any additional information and transmits the HMAC with the original message. The recipient recalculates the HMAC using the message and the recipient’s copy of the secret key (along with any additional information, such as the expected sequence number), then compares the received HMAC with the calculated HMAC to see if they match. If the two HMACs match, then the recipient knows that the original message has not been modified, because the message digest hasn’t changed, and that it is authentic, because the sender knew the shared key, which is presumed to be secret.
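The exchange can be sketched with Python's standard hmac module. The choice of SHA-256 as the underlying digest, the 8-byte sequence number prefix, and the function names are assumptions made for this example rather than part of any particular protocol.

```python
import hashlib
import hmac
import struct

def make_tag(key, sequence, message):
    """Compute an HMAC over a sequence number followed by the message."""
    data = struct.pack(">Q", sequence) + message
    return hmac.new(key, data, hashlib.sha256).digest()

def verify(key, expected_sequence, message, received_tag):
    """Recompute the HMAC with the expected sequence number and compare."""
    expected = make_tag(key, expected_sequence, message)
    # compare_digest takes time independent of where the values differ.
    return hmac.compare_digest(expected, received_tag)

shared_key = b"example shared secret"   # illustrative only
message = b"transfer 100 to account 42"

tag = make_tag(shared_key, 7, message)
print(verify(shared_key, 7, message, tag))                        # genuine
print(verify(shared_key, 7, b"transfer 900 to account 42", tag))  # altered
print(verify(shared_key, 8, message, tag))                        # replayed
```

Only the first check succeeds. The altered message fails because its digest differs, and the replayed message fails because the recipient has moved on to the next expected sequence number.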
HMACs are often used to harden network protocol messages against tampering, because they are much faster to calculate than digital signatures. They are also typically smaller in size. However, HMACs are based on a shared key that must be protected from compromise, while digital signatures are usually performed with public key systems.
Several general cryptographic protocols have been developed to secure network connections. These protocols are typically built from a combination of cryptographic algorithms to support key exchange, authentication, encryption, and message authentication codes, along with specifications for how a client and a server will agree on algorithms and exchange credentials and session keys. For example, the SSL/TLS protocol supports these combinations of algorithms:
EDH-RSA-DES-CBC3-SHA SSLv3 Kx=DH Au=RSA Enc=3DES(168) Mac=SHA1
EDH-DSS-DES-CBC3-SHA SSLv3 Kx=DH Au=DSS Enc=3DES(168) Mac=SHA1
DES-CBC3-SHA SSLv3 Kx=RSA Au=RSA Enc=3DES(168) Mac=SHA1
DHE-DSS-RC4-SHA SSLv3 Kx=DH Au=DSS Enc=RC4(128) Mac=SHA1
RC4-SHA SSLv3 Kx=RSA Au=RSA Enc=RC4(128) Mac=SHA1
RC4-MD5 SSLv3 Kx=RSA Au=RSA Enc=RC4(128) Mac=MD5
EXP1024-DHE-DSS-RC4-SHA SSLv3 Kx=DH(1024) Au=DSS Enc=RC4(56) Mac=SHA1 export
EXP1024-RC4-SHA SSLv3 Kx=RSA(1024) Au=RSA Enc=RC4(56) Mac=SHA1 export
EXP1024-DHE-DSS-DES-CBC-SHA SSLv3 Kx=DH(1024) Au=DSS Enc=DES(56) Mac=SHA1 export
EXP1024-DES-CBC-SHA SSLv3 Kx=RSA(1024) Au=RSA Enc=DES(56) Mac=SHA1 export
EXP1024-RC2-CBC-MD5 SSLv3 Kx=RSA(1024) Au=RSA Enc=RC2(56) Mac=MD5 export
EXP1024-RC4-MD5 SSLv3 Kx=RSA(1024) Au=RSA Enc=RC4(56) Mac=MD5 export
EDH-RSA-DES-CBC-SHA SSLv3 Kx=DH Au=RSA Enc=DES(56) Mac=SHA1
EDH-DSS-DES-CBC-SHA SSLv3 Kx=DH Au=DSS Enc=DES(56) Mac=SHA1
DES-CBC-SHA SSLv3 Kx=RSA Au=RSA Enc=DES(56) Mac=SHA1
EXP-EDH-RSA-DES-CBC-SHA SSLv3 Kx=DH(512) Au=RSA Enc=DES(40) Mac=SHA1 export
EXP-EDH-DSS-DES-CBC-SHA SSLv3 Kx=DH(512) Au=DSS Enc=DES(40) Mac=SHA1 export
EXP-DES-CBC-SHA SSLv3 Kx=RSA(512) Au=RSA Enc=DES(40) Mac=SHA1 export
EXP-RC2-CBC-MD5 SSLv3 Kx=RSA(512) Au=RSA Enc=RC2(40) Mac=MD5 export
EXP-RC4-MD5 SSLv3 Kx=RSA(512) Au=RSA Enc=RC4(40) Mac=MD5 export
Each algorithm combination specifies an algorithm to use for key exchange (Kx, which may be Diffie-Hellman or RSA), authentication (Au, which may be RSA or DSS), encryption (Enc, which may be DES, Triple-DES, RC4, or RC2, with the key length shown), and message authentication codes (Mac, which may be SHA1 or MD5).
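Because each entry follows the same layout, a listing like this is easy to process mechanically, for example to weed out suites with short keys. The Python sketch below parses one line into its fields; the function names and the dictionary layout are choices made for this example.

```python
import re

def parse_suite(line):
    """Split one listing line into name, protocol, and Kx/Au/Enc/Mac fields."""
    fields = line.split()
    suite = {"name": fields[0], "protocol": fields[1], "export": False}
    for field in fields[2:]:
        if field == "export":
            suite["export"] = True
        else:
            key, _, value = field.partition("=")
            suite[key] = value
    return suite

def encryption_bits(suite):
    """Extract the key length from a value such as '3DES(168)'."""
    match = re.search(r"\((\d+)\)", suite["Enc"])
    return int(match.group(1)) if match else None

line = "EXP-RC4-MD5 SSLv3 Kx=RSA(512) Au=RSA Enc=RC4(40) Mac=MD5 export"
suite = parse_suite(line)
print(suite["name"], suite["Kx"], encryption_bits(suite), suite["export"])
```

A filter built on encryption_bits() could, for instance, reject every suite whose encryption key is shorter than 128 bits.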