6 - 1
Information Assurance Foundations - SANS
©2001
1
Web Security
Security Essentials
The SANS Institute
Hello. With everything that is occurring on the Internet and all of the articles that have been written,
web security is a very exciting area. Most attacks that are publicized are either directly or indirectly
web-based attacks. Every company and person seems to have a web site, yet most web sites are not
designed or built properly from a security standpoint. In the next hour, we are going to take a look at
web security and cover some things you can do to check the security of the web sites you either
maintain or use. This is a foundational course, developed for the SANS Security Essentials program.
When you complete this course, there will be a quiz available from the SANS web page to help
reinforce the material and ensure your mastery of it. Also, I would recommend trying these steps out
on your own web sites to see what vulnerabilities might exist, but always get prior permission first.
Remember, before you can fix a problem you must be aware of the problem. Hopefully,
after this module you will have some of the knowledge you need to start securing your web
applications.
6 - 2
Web Security - SANS
©2001
2
Agenda
• Web communication
• Web security protocols
• Active content
• Cracking web applications
• Web application defenses
On the slide “Agenda” we list some of the key things that we are going to cover in this section. First,
we are going to cover web communication and how it works. Topics that are often misunderstood,
called a “Web browser”, or just browser for short. Browsers take input from users, convert that
input into a language the server will understand, send it off to the server over the network, and wait
for the reply. When the server sends the reply, the browser will format it and display it for the user.
Simple as that. OK, it’s not really all that simple. There may be a lot of processing that goes on
behind the scenes. For example, the server may have to contact other computers to get the
information the client needs, or the client may have to run some other programs in order to properly
interpret the response from the server, but here you have the basics: Client sends the request,
server responds to the request.
The way clients and servers communicate on the Web is through a protocol called HTTP – the
HyperText Transfer Protocol. Like any other protocol, HTTP is just a set of standards,
conventions, and notations the two systems must understand in order to communicate.
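To make the request and response exchange a little more concrete, here is a minimal sketch in Python that builds the text of a simple HTTP request and picks apart the first line of a reply. The host name `www.example.com`, the path, and the canned response are assumptions made up for illustration, and nothing is actually sent over the network.

```python
def build_get_request(host, path="/"):
    """Return the raw text a browser might send for a simple HTTP/1.0 GET."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # a blank line marks the end of the headers
        "",
    ]
    return "\r\n".join(lines)


def parse_status_line(response_text):
    """Split the first line of a response into (version, status code, reason)."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_get_request("www.example.com", "/index.html")
# A canned reply standing in for what a server might send back (assumed).
canned_response = "HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
status = parse_status_line(canned_response)
```

Both sides only need to agree on this text format; that agreement is all the protocol really is.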
The HyperText Markup Language, or HTML, is the actual language used to develop web pages.
HTML uses a set of special notations, called tags, to tell the browser how to display a page,
including things like where to center text, what fonts to use, where to place images on a page, and so
on. If you want to see examples of HTML, most browsers allow you to view the HTML source code
for any page they display.
Everything You Always Wanted to
Know About Web Communications (2)
• Stateless Communications
• Retrieving Information – GET
• Sending Information – POST
Communication on the Web is called “stateless.” This is because each interaction between clients
and servers is an independent transaction. For example, each time you click on a web page you are
starting a completely new interaction between your browser and the server. If you click on 12
different links on a page, your browser will make 12 different connections to the server. There is no
information about the state of any previous transactions carried over from one transaction to the next.
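Because each request stands alone, everything the server needs has to travel inside that one request. The slide names the two request methods used for this, GET and POST. The sketch below is only an illustration of where the data goes in each case; the `/order` path and the form field names are hypothetical.

```python
from urllib.parse import urlencode

form_data = {"item": "widget", "qty": "2"}  # hypothetical form fields
encoded = urlencode(form_data)

# GET: the data travels in the URL itself, after a "?".
get_request = f"GET /order?{encoded} HTTP/1.0\r\n\r\n"

# POST: the data travels in the request body, after the headers.
post_request = (
    "POST /order HTTP/1.0\r\n"
    "Content-Type: application/x-www-form-urlencoded\r\n"
    f"Content-Length: {len(encoded)}\r\n"
    "\r\n"
    f"{encoded}"
)
```

In both cases the values are sent as plain text, so either form can be read or altered by anyone who can see or construct the request.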
structure and organization of the server, in order to plan an attack. And it’s all there, free for the
looking.
HTML Security
• Hidden Fields
• Server Side Includes
Many web pages, particularly those that use input forms, make use of a feature of HTML called
Hidden Fields. As their name implies, hidden fields reside on a web page form but they are
hidden from view when the page is displayed. Hidden fields are typically used as a method for
carrying information from one form to another without requiring the user to re-enter the
information on each form. However, hidden fields can also contain values not entered by the user.
For example, when a user enters a user ID on a web form, the server might look up the user’s
Social Security Number and place that in a hidden field for later use. If you look at the HTML
source for the page with the hidden field, you will see that information. Unfortunately, so will
anyone else who may be sniffing the network when that page is transmitted.
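To see how little "hidden" really means, here is a short sketch that lists the hidden fields in a page using Python's standard `html.parser` module. The page source, the field names, and the values are all invented for this example.

```python
from html.parser import HTMLParser


class HiddenFieldFinder(HTMLParser):
    """Collect (name, value) pairs from <input type="hidden"> tags."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "input" and (attributes.get("type") or "").lower() == "hidden":
            self.hidden_fields.append((attributes.get("name"), attributes.get("value")))


# Hypothetical page source; the field names and values are made up.
page_source = """
<form action="/account" method="post">
  <input type="text" name="user_id" value="jsmith">
  <input type="hidden" name="ssn" value="000-00-0000">
  <input type="submit" value="Continue">
</form>
"""

finder = HiddenFieldFinder()
finder.feed(page_source)
hidden = finder.hidden_fields
```

Running the same check against your own forms is a quick way to find sensitive values that should never have been sent to the browser in the first place.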
Another neat tool is the use of a technology called Server Side Includes. Server Side Includes
are small pieces of code that are embedded in HTML documents. When a web server prepares to
send a web page, it will go line by line through the document looking for these directives.
When it comes upon a Server Side Include line, it stops and does whatever the include says. For
example, it might insert text from a different file, like a copyright notice or policy statement. It
might insert today’s date and time to be displayed on the page. Or, and this is the scary part, it
might run a separate program and insert its output into the HTML document. This is scary
because if the included program has a bug, or the attacker can manipulate the program to run
some malicious code, the potential exists for the attacker to compromise the server and gain
unauthorized access or obtain confidential information.
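If you maintain pages that use Server Side Includes, one simple check is to look for the directives that run programs. The sketch below assumes the common `<!--#directive ... -->` comment syntax and flags any `exec` directive; the page, the file names, and the command path are invented for illustration.

```python
import re

# SSI directives are written as HTML comments beginning with "<!--#".
SSI_DIRECTIVE = re.compile(r"<!--#(\w+)\s+(.*?)\s*-->", re.DOTALL)


def find_ssi_directives(html_text):
    """Return a list of (directive, arguments) pairs found in the page."""
    return SSI_DIRECTIVE.findall(html_text)


def find_exec_directives(html_text):
    """Return only the directives that run a program on the server."""
    return [d for d in find_ssi_directives(html_text) if d[0].lower() == "exec"]


# Hypothetical page; the file names and the command are made up.
page = """
<html><body>
<!--#include file="copyright.html" -->
<!--#echo var="DATE_LOCAL" -->
<!--#exec cmd="/usr/local/bin/hitcounter" -->
</body></html>
"""

all_directives = find_ssi_directives(page)
risky = find_exec_directives(page)
```

The include and echo lines are the harmless cases described above; the exec line is the one that deserves a careful review.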
Now, despite these shortcomings, and some others we will examine shortly, nobody is saying that
we should do away with HTML. But security practitioners need to take extra care when
protect the system. SYN floods, fragmentation attacks, and the Ping of Death are all examples of what
happens when a system receives input it did not expect.
Plain vanilla HTML also has no built-in methods for validating user input. There are no
variable checks or data validation rules built into HTML to prevent bad input from happening. If you are
using a scripting language to develop your pages, you can build validation routines into your forms, but if
you want to stick with plain HTML, you are out of luck.
That’s why you need to pay particular attention to any web page, or any program for
that matter, that requires user input. You need to ensure that all input is validated for correctness. What
does “validated” mean? It means that you need to check that the input is correct for the type of information
being requested. If you are looking for a Social Security number, make sure that there are no letters
entered by the user. If you are requesting a piece of text that should be 10 characters long, make sure the
user doesn’t enter 500 characters of text.
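The two checks just described can be sketched in a few lines of Python. This is only a minimal illustration, assuming a Social Security number may be typed with or without dashes and that the text field allows at most 10 characters; the function names are hypothetical.

```python
import re

# Nine digits, optionally written as NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"[0-9]{3}-?[0-9]{2}-?[0-9]{4}")


def is_valid_ssn(text):
    """Accept only digits in the expected layout; reject anything else."""
    return SSN_PATTERN.fullmatch(text) is not None


def is_valid_length(text, max_length=10):
    """Reject empty input and input longer than the field allows."""
    return 0 < len(text) <= max_length
```

The important point is that these checks run on the server, where the user cannot skip them.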
Beyond simple type and length validation, you also need to check the input to see if it
matches the type of information you are expecting. For example, if you normally only sell 2 or 3 of a
particular item, is it normal for a user to order 999? Is the name on the customer’s credit card different
from the name on the shipping address? Things like this can be a clue to possible unauthorized activity or
fraud.
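A reasonableness check like this might look something like the sketch below. The threshold of 3 items and the simple name comparison are assumptions taken from the examples above, not rules from any real fraud system.

```python
def order_warnings(quantity, card_name, shipping_name, typical_max=3):
    """Return a list of reasons an order deserves a second look."""
    warnings = []
    if quantity > typical_max:
        warnings.append(f"quantity {quantity} exceeds typical maximum {typical_max}")
    if card_name.strip().lower() != shipping_name.strip().lower():
        warnings.append("card name differs from shipping name")
    return warnings
```

Note that the function only reports warnings. An unusual order is a clue worth reviewing, not proof of fraud.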
Cookies
• HTTP is “stateless” – no context information
• Cookies provide “state” and context
• Can only hold information given to the browser
by the server
• Can only be exchanged with originating server or
domain
• Beware of cross-site sharing (e.g. DoubleClick)
• Can block cookies if desired
Some people object to cookies on privacy principles. They believe that cookies are somehow magically
taking information from you or your computer and spreading that information around the Internet. Most
of these fears are based on a lack of understanding of how cookies really work. First off, cookies can
only contain information that you’ve already given to the web server or the company you are dealing
with. There is no way the site can know your home address or credit card number unless you have
already given it to them. So you’ve already given up some of your privacy before the cookies even
entered into the picture. Secondly, cookies can only be sent to and from the server or domain that
originally created the cookie. There is no way that a cookie from xyz.com can be shared with a server
from abc.com.
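The domain rule can be illustrated with a small sketch. It uses Python's standard `http.cookies` module to build the header a server would send, plus a simplified matching function of the kind a browser applies before returning a cookie. The cookie name and value are made up, and real browsers apply more rules than this.

```python
from http.cookies import SimpleCookie


def domain_matches(request_host, cookie_domain):
    """True if the host is the cookie's domain or a subdomain of it."""
    host = request_host.lower()
    domain = cookie_domain.lower().lstrip(".")
    return host == domain or host.endswith("." + domain)


# The server creates a cookie scoped to its own domain (values are made up).
cookie = SimpleCookie()
cookie["session_id"] = "abc123"
cookie["session_id"]["domain"] = "xyz.com"
header = cookie.output()
```

With this rule, a request to `www.xyz.com` carries the cookie, while a request to `www.abc.com` does not.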
While this last point is technically true, a wrinkle has emerged lately. It is true that one company’s
server cannot share a cookie with another company’s server. But what if one company were able to
distribute cookies on ALL servers? This is exactly what a company called DoubleClick has done. You’ve
probably seen their advertisements on web pages you’ve visited. DoubleClick rents space on web pages
for advertisements. So, for example, when you visit the web page for acme.com, you will see an ad that
is actually generated by DoubleClick from the DoubleClick server. The cookies generated by that ad are
shared between the browser and DoubleClick, not the browser and Acme. Then, when you go to
widgets.com, you may see another DoubleClick advertisement. Again, you will share a cookie with
DoubleClick, not Widgets.com. In this way, the DoubleClick service can begin to collect information on
what sites you have visited over the Internet. Many privacy advocates are extremely worried about this
practice.
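Here is a toy simulation of that tracking pattern. It is not how any particular advertising company works; the `AdServer` class is hypothetical, and the sketch assumes the ad server can tell which site embedded the ad (for example, from information the browser sends with the request).

```python
from collections import defaultdict


class AdServer:
    """Toy third-party ad server that sets one cookie and sees it on every site."""

    def __init__(self):
        self.next_id = 1
        self.visits = defaultdict(list)  # cookie id -> sites where an ad was shown

    def serve_ad(self, cookie_id, embedding_site):
        # First contact: the browser has no cookie yet, so issue one.
        if cookie_id is None:
            cookie_id = self.next_id
            self.next_id += 1
        self.visits[cookie_id].append(embedding_site)
        return cookie_id


ad_server = AdServer()
browser_cookie = None  # the browser's cookie for the ad server's domain
for site in ["acme.com", "widgets.com"]:
    browser_cookie = ad_server.serve_ad(browser_cookie, site)

profile = ad_server.visits[browser_cookie]
```

No cookie ever crossed between Acme and Widgets, yet the ad server ends up with a list of both sites tied to one browser.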
If you are really worried about cookies, you can take steps to protect yourself. In most browsers, you can
set an option to prevent the downloading of cookies to your browser. There are also a number of
shareware add-on utilities that let you selectively block cookies based on various criteria.
What About Non-Persistent
SSL
• Protocol for encrypting network traffic
• Operates at Transport Layer
• Operates on port 443
• How it works
– Client connects to server
– Server indicates need for SSL
– Client and server exchange crypto keys
– Secure session begins
• Not a guarantee of security
Plain, generic HTTP is fine for open, non-secret communications, but some applications require more privacy
than that provided by HTTP. For example, you may want to keep your credit card information or information
about your bank accounts secret over the Internet. For these types of applications, there is the Secure Sockets
Layer protocol, or SSL.
SSL is a general-purpose protocol for encryption of network traffic. Although it is most commonly associated
with HTTP traffic, SSL operates at the Transport Layer of the TCP/IP stack and can be used with many different
application protocols. Any program that uses TCP can be modified to use SSL. General HTTP traffic typically
operates on port 80. When HTTP runs over SSL, it usually uses port 443.
When a client connects to a web server, the server will generally indicate whether SSL is required for that
page. If it is required, the client and the server will negotiate to determine what type of encryption the session
will use. Generally, the strongest algorithm that the two programs support will be selected.
The client and the server will then exchange encryption keys. These are the codes that will enable the two to
encrypt messages back and forth. Once the keys have been exchanged, all further communications between the
client and the server are encrypted.
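The connect, negotiate, and encrypt sequence can be sketched with Python's standard `ssl` module. Note that current Python versions negotiate TLS, the successor to SSL, so this is an illustration of the same idea rather than of SSL as it existed when these notes were written. The function names are hypothetical, and `fetch_cipher` needs a reachable server to run.

```python
import socket
import ssl


def make_client_context():
    """Create a client-side context that verifies the server's certificate."""
    return ssl.create_default_context()


def fetch_cipher(host, port=443, timeout=5.0):
    """Connect, complete the handshake, and report what was negotiated."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        # The handshake (algorithm negotiation and key exchange) happens here.
        with context.wrap_socket(raw_sock, server_hostname=host) as tls_sock:
            # cipher() returns (cipher name, protocol version, secret bits).
            return tls_sock.version(), tls_sock.cipher()


context = make_client_context()
```

Calling `fetch_cipher` with the name of a server you are allowed to test shows which protocol version and cipher the two sides settled on.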
I have left out a LOT of detail here about the specifics of the key exchange and the use of certificates to
validate the identity of the client and the server, but most of it is not needed to gain a high-level
understanding of the process. What’s important to remember is that all sensitive information that is to be
validating credit card numbers, checking the customer’s authorization to use the credit card,
authorizing the transaction with the bank, and processing the transaction. SET provides an integrated
system that handles the entire transaction, including card authorization and finalization of the sale.
SET has a number of mechanisms that protect the customer, the merchant, and the bank. For
example, the protocol hides the actual credit card number from the merchant, instead sending it
directly to the bank. Also, the bank does not know the actual merchandise purchased by the
customer, protecting the privacy of the customer’s purchases.
Secure Electronic Transactions
(SET) (2)
• Services provided
– Authentication
– Confidentiality
– Message Integrity
– Linkage
SET provides four basic services that protect transactions.
Authentication: All the parties to the transaction are authenticated using digital signatures. We will
learn more about digital signatures later when we discuss cryptography.
Confidentiality: The transaction is encrypted so that Internet eavesdroppers can not capture the data
and discover the details of the transaction.
Message Integrity: The transaction can not be tampered with by attackers. Thus, they can not alter
the account numbers or payment amounts involved in the transaction.
Linkage: SET allows a message sent by one party to the transaction (either the customer, the
merchant, or the bank) to contain an attachment that can be read only by another specified party.
This allows the first party to verify that the attachment is correct without being able to read the
contents of the attachment. This is very important for the privacy reasons stated above.
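The linkage idea can be illustrated with message digests. The following is a heavily simplified sketch in the spirit of the description above, not the SET protocol itself: it uses SHA-256 as a stand-in hash, leaves out the digital signature that would normally protect the combined digest, and all messages and function names are made up.

```python
import hashlib


def digest(data):
    """Return a fixed-size fingerprint of a message."""
    return hashlib.sha256(data).digest()


def combined_digest(payment_digest, order_digest):
    """Digest that ties the payment and order messages together."""
    return digest(payment_digest + order_digest)


# Customer side: hash each message, then hash the two digests together.
order_info = b"2 widgets, ship to 1 Main St"    # made-up order details
payment_info = b"card on file, amount 19.98"    # made-up payment details
link = combined_digest(digest(payment_info), digest(order_info))


def merchant_check(order_info_seen, payment_digest_received, link_received):
    """The merchant sees the order and only a digest of the payment details."""
    recomputed = combined_digest(payment_digest_received, digest(order_info_seen))
    return recomputed == link_received
```

In this sketch the merchant can confirm that the order and the payment belong together without ever seeing the payment details themselves.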
SET has many advantages over plain SSL in that it covers the entire transaction from end to end. If
However, CGI is a very primitive process for handling such interaction, and it may create a large number of
vulnerabilities on the server on which it is used. For example, if the results of the CGI execution are not filtered
before being sent to the user, the use of CGI programs can lead to the leakage of information about the system
or its data. Because CGI has few built-in data checking mechanisms, it can be relatively easy for a user to
falsify the information sent to the CGI program, increasing the potential for the execution of unauthorized or
fraudulent transactions. Finally, since many CGI programs use underlying command interpreters (like Perl or a
UNIX shell), the potential exists for an attacker to run programs not intended by the designers of the system.
This is a popular method of gaining unauthorized administrative access on web servers.
Common Gateway Interface
(CGI) (2)
• Common Mistakes
– Misuse of command interpreters
– Bad memory management
– Passing unchecked parameters to system
There are several common mistakes that many CGI developers make when writing their programs. The first is
misuse of command interpreters. As mentioned before, many CGI programs use command interpreters that
are called by the CGI program. Since there is no direct linkage between the CGI program and the command
interpreter, the interpreter has no good way of validating the information it is being sent. If an attacker can find a
way to pass arbitrary system commands to the interpreter, they have the potential to successfully compromise the
system.
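One way to reduce this risk is to validate the value first and then pass it as a separate argument instead of pasting it into a shell command line. The sketch below assumes a hypothetical form that takes a host name and looks it up with `nslookup`; the pattern and function name are illustrative, and the command is only built here, not run.

```python
import re

# Letters, digits, dots, and hyphens only; must start and end with a letter or digit.
HOSTNAME_PATTERN = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def build_lookup_command(hostname):
    """Validate a form value and return an argument list, never a shell string."""
    if not HOSTNAME_PATTERN.fullmatch(hostname):
        raise ValueError("rejected suspicious hostname")
    # A list of arguments can be handed to subprocess.run() without a shell,
    # so characters such as ";" or "|" are never interpreted as commands.
    return ["nslookup", hostname]
```

An input such as `example.com; cat /etc/passwd` is rejected before it ever gets near a command interpreter.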
Another common mistake is the lack of attention paid to memory management. As we will see later on when
we discuss buffer overflows, a common method of attack is to send a program more information than it was
designed to handle. If the input exceeds a certain size, or if it is carefully crafted, it has the ability to
crash the server, often leaving the attacker with administrator privileges on the computer. Also, if the program
itself does not pay close attention to the resources it is using, it can potentially consume all the available
resources of the computer, again leaving it exposed to compromise. The final common mistake, and the one
Two of the most common examples of active content are Java and ActiveX. Java is a programming
and execution environment originally developed by Sun Microsystems. It was designed for
developing programs that run on many different types of devices. One of the features of Java’s
portability is that a special type of Java program, called an applet, can be embedded in a web page’s
HTML code and run on a user’s machine. ActiveX is the term Microsoft uses for its active content
components. ActiveX components are called “controls” and, like Java, are downloaded to the user’s
computer where they are executed.