All about digital certificates -- a layperson's introduction
Talking on the phone / talking on the 'Net
Imagine that you are about to go to your favorite bookstore on the Web. You find just the book you want -- on sale, yet! -- and decide you have to order it right away. You notice a link to an order form; you select the link and are presented with a page which allows you to enter all your ordering information: name, billing address, credit card information, book title, and so on. After you fill out the form you submit it; in a few moments you see a page that tells you that your order will soon be processed and thanks you for your business.
This is electronic commerce as it should be: easy, fast, and -- ideally -- secure. When you submit your order you'd like to be sure that the bookstore gets the information but no one else does. You probably don't even stop to think about it, but at every step of the way someone might be listening in on your 'conversation'. If you are sharing a computer with someone else, they might be monitoring your keystrokes. (If you are a government target, they might have a van sitting in front of your house monitoring electromagnetic radiation in order to determine what you typed, even if you are on a personal workstation! We'll presume that this is highly unlikely, however.) It's possible that the browser you are using has been modified so that it copies all input to an unscrupulous user's file. While your information travels through the Internet to the bookstore, someone may be 'sniffing packets' (trying to intercept the information from the network, like listening in to a phone line). An unhappy bookstore employee could steal the data once it arrives there. The information could be stored insecurely at the bookstore site so that someone from outside of the business could steal it. Someone could be masquerading as the bookstore and claim to process your order when in fact it's being collected for illegal use.
Now, many of these possibilities may seem far-fetched, and in fact many of them are. But as electronic commerce grows into a serious money-making venture, it will become a growing target of criminal activity including, for example, credit card theft. We face this head-on by implementing security technology that develops along with the electronic commerce market.
Who are you talking to?
When you look up a merchant's phone number in the Yellow Pages, you can be reasonably sure that the phone number really does belong to the merchant. But what's to prevent someone posting a Web address on the Internet, claiming to be a site that they aren't? In fact, there are means of prevention, which rely on certificates and certifying authorities.
A certificate in this context is a digital proof of identity which the bookstore can present to your browser and which your browser recognizes as being valid because it somehow 'recognizes' the issuer of the certificate.
For example, when you want to travel abroad you must acquire a passport from the U.S. Government which serves both as proof of identity and as authorization to travel abroad. To get this passport you present various documents establishing your identity and U.S. citizenship. Upon verification of these documents the government issues the passport which, while not 'signed', is printed on special paper with watermarks and other devices to prevent forgery and at the same time to establish its authenticity as a passport issued by the U.S. government.
When you show this passport to a border official, he or she can verify your identity because the passport is most likely not a forgery, and because the U.S. government can be trusted to have checked your identity before issuing the passport.
In the world of digital certificates, a small number of companies play the role of the U.S. government, issuing digital certificates that verify a person's identity or a Web site's identity. They digitally 'sign' their certificates in a manner that makes forgery very difficult and thus establishes the authenticity of the certificate. Some browsers are shipped with a small built-in directory of certification authorities; any certificate signed by one of the authorities in this list is accepted automatically as okay.
Certification Authorities (CAs) are also commonly known as trusted third parties (TTPs) because they play the role of a trusted middleman between two parties who have never met. The bookstore has dealt with the TTP in order to get a certificate; your browser knows the TTP because it's been pre-configured with information about it. Although you and the bookstore are complete strangers, the TTP enables you to make transactions with the bookstore in complete confidence because of the trust you both put in the TTP.
Typically a certification authority may require, for a Web site certificate, proof that the Web address is registered to the company that is submitting the application, in the form of notarized documents. How much proof is needed and what liability is assumed by the CA in the case of a stolen certificate (or a wrongly issued one) is a complicated subject, and some companies have drafted long documents called 'Certification Practice Statements' to discuss these issues.
Who's listening in?
When you talk to a merchant on the phone, you can be reasonably sure, depending on how nosy your housemates are, that no one is listening in on your conversation. But on the Internet, how can you be sure? You can't even be sure what route the information is taking to get from your computer to the bookstore's. How, then, can you know if someone is snooping while the information is en route?
Certainly most sites take reasonable precautions to prevent eavesdroppers, but the wily hacker always finds a way around the obstacles eventually. The trick is to encrypt your data; that is, you scramble it in such a way that eavesdroppers hear only noise, but the person on the other end of the communication can unscramble the data and understand you perfectly.
Typically the way that this works is that you and your friend, who want to have a private conversation, agree on some mathematical process which you apply to the data on your end to scramble it and which your friend applies in reverse on the other end to unscramble it. If only you and your friend know the mathematical process, this would be good enough, but there is a small class of such algorithms that are commonly used for encryption, so an eavesdropper might just try each one in turn until he or she found the one that worked.
Instead of keeping the algorithm secret, then, you keep secret a small piece of data on which the algorithm depends, while the algorithm itself can be known (and in fact all of the commonly-used algorithms are not only known but published on the Internet, with sources, graphical interfaces, and all kinds of other bells and whistles, all free for the taking). This small piece of data, usually a randomly generated number of between 40 and 128 bits (between about 12 digits and about 39 digits), is called the 'key' because it changes the operation of the algorithm: the scrambled data looks completely different when the key is changed, and if someone has the scrambled data and the algorithm but doesn't have the key, then he or she cannot understand the data.
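To make the idea of a public algorithm with a secret key concrete, here is a minimal Python sketch. It is a toy built for illustration only, not one of the published algorithms mentioned above and not safe for real use; the function names `keystream` and `scramble` are made up for this example. The same routine scrambles and unscrambles, and only the key decides whether the result is readable.

```python
import hashlib
import secrets

def keystream(key: bytes, length: int) -> bytes:
    """Stretch the key into `length` pseudo-random bytes by hashing key + counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def scramble(key: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; doing it twice with the same key undoes it."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))

message = b"Please send one copy of the book."
key = secrets.token_bytes(16)          # a random 128-bit key shared with your friend
wrong_key = secrets.token_bytes(16)    # an eavesdropper's guess

scrambled = scramble(key, message)
print(scramble(key, scrambled))        # the right key recovers the message
print(scramble(wrong_key, scrambled))  # the wrong key yields only noise
```

Everything in the sketch except the value of `key` could be published without helping the eavesdropper.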
Because the algorithms are all mathematical routines, all done on computer, if someone had long enough he or she could try each possible 'key' with the algorithm in turn, until the unscrambling produced data that had meaning. To thwart this, the keys are chosen long enough so that it would take a very, very long time on the fastest computers ever built (or that will be built within the next 10-20 years, in case you want to store your data securely), to do this exhaustive 'brute-force' search. 'A very, very long time' often means 'the current lifetime of the universe'. The longer the key length, the longer a brute-force search will take. A typical algorithm with a 40-bit key was used to encrypt some data and the results were posted on the Internet along with a challenge to all comers to unscramble the data and find the key. Within about 2 1/2 hours a Berkeley PhD student, using about 250 Pentium workstations, found the key. A similar challenge with a 48-bit key stood for 2 weeks before someone found the key. Another challenge with an algorithm of similar strength that uses a 56-bit key is still unsolved after 4 months.
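As a rough back-of-the-envelope sketch of why key length matters so much, the Python below counts the possible keys for a given length and scales the 40-bit result quoted above (about 2 1/2 hours) to longer keys. It assumes the same hardware throughout and that search time is simply proportional to the number of keys, which the real challenges (run on different numbers of machines) will not match exactly. The function names are made up for illustration.

```python
def key_count(bits):
    """Number of distinct keys of the given length in bits."""
    return 2 ** bits

def scaled_search_hours(base_bits, base_hours, target_bits):
    """Scale a known brute-force time to another key length (time ~ key count)."""
    return base_hours * key_count(target_bits) / key_count(base_bits)

BASE_BITS, BASE_HOURS = 40, 2.5   # figures taken from the 40-bit challenge above
HOURS_PER_YEAR = 24 * 365

for bits in (48, 56, 128):
    hours = scaled_search_hours(BASE_BITS, BASE_HOURS, bits)
    years = hours / HOURS_PER_YEAR
    print(f"{bits}-bit key: {key_count(bits)} keys, roughly {years:.3g} years")
```

Under these assumptions every extra bit doubles the search time, which is why a 128-bit key comes out many orders of magnitude beyond the age of the universe.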
The one drawback of this setup is that you have to find a way to give your friend the key in private, before you can start having private conversations. If anyone else learns the key, they can eavesdrop with impunity. But if you have a way to give your friend the key in private without being overheard, why can't you just have the rest of the conversation you were trying to protect from eavesdropping, right then and there?
But now, imagine that you could tell your friend the key in public, everyone else could overhear, and the friend could use this key to send data to you which only you could read, even though everyone else knew the key. Your friend would similarly publish a key which you (and everyone else) would use to send data to your friend that only he or she could read. Directories of such keys could be published like Yellow Pages and you could look up someone's key in the directory, send a message using it, and rest secure in the knowledge that no one else but the intended recipient would ever see the contents of the message.
In fact, it's already happening, thanks to a type of cryptography called 'public-key cryptography' because the key used to encrypt data is made public.
An individual's public key is used to encrypt all messages sent to him or her; he or she then uses a private key, which is never given to anyone, to decrypt the message so it can be read.
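The text does not name a particular algorithm, so the sketch below assumes one well-known public-key scheme, RSA, with a toy key pair built from two tiny primes (61 and 53). Real keys are hundreds of digits long; the numbers and function names here are purely illustrative. It shows that anyone holding the public numbers can encrypt, while only the holder of the private number can decrypt.

```python
# Toy RSA key pair built from the primes 61 and 53 (illustration only).
N = 61 * 53          # 3233, shared by both halves of the key pair
PUBLIC_E = 17        # public exponent: anyone may know (N, PUBLIC_E)
PRIVATE_D = 2753     # private exponent: known only to the recipient

def encrypt(number: int) -> int:
    """What a sender does, using only the public key."""
    return pow(number, PUBLIC_E, N)

def decrypt(number: int) -> int:
    """What the recipient does, using the private key."""
    return pow(number, PRIVATE_D, N)

secret = 1234                 # the message, as a number smaller than N
ciphertext = encrypt(secret)
print(ciphertext)             # what an eavesdropper would see
print(decrypt(ciphertext))    # what the recipient recovers
```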
This type of encryption is typically a little slower than the other, 'symmetric-key encryption', and so it is often used only to... send a key for one of the other, faster algorithms so that eavesdroppers can't grab it! The faster algorithm is then used to carry on the rest of the communication.
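The two-step arrangement just described can be sketched as follows. This is a toy under stated assumptions: the public-key step uses a textbook-sized RSA key pair (the numbers 3233, 17 and 2753), and the fast symmetric step is a made-up XOR routine called `fast_scramble`. Neither is what a real browser uses; the point is only the order of operations.

```python
import hashlib
import secrets

# Toy public-key pair for the recipient (illustrative numbers, far too small for real use).
N, PUBLIC_E, PRIVATE_D = 3233, 17, 2753

def fast_scramble(session_key: int, data: bytes) -> bytes:
    """Symmetric step: XOR the data with bytes derived from the session key."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = session_key.to_bytes(2, "big") + counter.to_bytes(8, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Sender: pick a random session key and wrap it with the recipient's PUBLIC key.
session_key = secrets.randbelow(N)
wrapped_key = pow(session_key, PUBLIC_E, N)
ciphertext = fast_scramble(session_key, b"Card number: 0000 0000 0000 0000")

# Recipient: unwrap the session key with the PRIVATE key, then unscramble quickly.
recovered_key = pow(wrapped_key, PRIVATE_D, N)
print(fast_scramble(recovered_key, ciphertext))
```

Only `wrapped_key` and `ciphertext` travel over the network; the slow public-key operation is done once, on a short key, and everything after that uses the faster routine.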
Your browser knows about these algorithms, so it could send your data to the bookstore encrypted. But where does it find the bookstore's public key? In the bookstore's certificate! Along with the other identifying information, the public key is provided exactly for this purpose.
It is also provided so that you can test that you are really talking to the bookstore and not someone who stole the certificate. If you send data to the bookstore, encrypted with the public key, and you get a reply that makes sense, you can be sure that you really are talking to the bookstore; only the bookstore would be able to decrypt the data. (There is also the possibility that someone else stole the bookstore's private key; just as someone might steal your wallet and try to use your credit card, someone might break into the bookstore's computer and steal the private key. Companies take many precautions to prevent this from happening, however.)
Your browser knows how to do this kind of test; it is built into a security protocol called SSL that your browser uses with certain Web servers automatically. Whenever you order merchandise from the Web, you should be talking to one of these servers. Your browser will usually indicate to you that you are talking to one of these security-enabled servers, perhaps by the addition of a key icon in a corner or by a change in color of the border of the display area. You should check your browser documentation for more details.
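Programs other than browsers can apply the same checks. As a small illustration, Python's standard `ssl` module (which implements TLS, the successor to SSL) can build a client configuration that insists on a valid certificate. The sketch below only inspects that configuration and does not connect to any site.

```python
import ssl

# Default client settings: trust the system's list of certification authorities.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted authority.
print(context.verify_mode == ssl.CERT_REQUIRED)

# The certificate must also match the name of the site being contacted.
print(context.check_hostname)

# Authorities loaded so far; this can be 0 on systems that load them on demand.
print(len(context.get_ca_certs()))
```

A connection made with this context fails if the server's certificate is not signed by a recognized authority or does not match the requested site name.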