Cross-Site Scripting (XSS)
Cross-site scripting
(often abbreviated to XSS) is a form of injection, where an attacker finds a
way to have the target site display code they control. In its most basic form,
this can be as simple as a site that allows HTML characters in usernames, where
someone can specify a username like:
DaveChild<script type="text/javascript" src="http://www.example.com/my_script.js"></script>
Now, whenever someone sees my username on the target site, the
script I've added to my username will run. I could potentially use this to grab
the person's login information, log their keystrokes - any number of nefarious
activities.
As a developer, you can combat this type of attack by encoding
or removing HTML characters (watch out for character encoding issues, as
outlined in the Character Encoding section below). Even better than stripping
out unwanted characters is to allow only a whitelist of safe characters in
usernames and other fields. Be especially careful with e-commerce sites where
you are listing orders in a CMS - an XSS vulnerability may allow an attacker to
gain administrative access to your CMS. It is also important to turn off TRACE
and TRACK support on the server: if there is an XSS vulnerability (and always
assume that, despite your best efforts, there will be), these methods
potentially allow an attacker to steal a user's cookie.
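As a rough illustration of the two defences just described, the sketch below
encodes HTML characters on output and checks a username against a whitelist on
input. The helper names (escape_html, is_valid_username) and the whitelist
itself (letters, digits and underscores, 3 to 20 characters) are assumptions
made for this example, not rules taken from any particular site.

```php
<?php
// Hypothetical helper: encode HTML special characters before output.
// The charset is stated explicitly so PHP and the page agree on it.
function escape_html(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
}

// Hypothetical helper: accept only whitelisted characters in a username.
// The pattern (letters, digits, underscore, 3-20 chars) is an assumption.
function is_valid_username(string $name): bool
{
    return preg_match('/\A[A-Za-z0-9_]{3,20}\z/', $name) === 1;
}

$username = 'DaveChild<script type="text/javascript" '
          . 'src="http://www.example.com/my_script.js"></script>';

// The whitelist rejects the malicious username at registration time.
var_dump(is_valid_username($username));

// If the value is displayed anyway, the tag is shown as text, not run.
echo escape_html($username), "\n";
```

The whitelist is applied when the data comes in, the encoding when it goes out;
using both means a mistake in one is less likely to be exploitable.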
As a user you are also vulnerable to this sort of attack, and it
is very difficult, at the moment, to make yourself safe against it.
Cross-Site Request Forgery (CSRF)
Despite the similar name, CSRF is unconnected to XSS. CSRF is a
form of attack where an authenticated user performs an action on a site without
knowing it.
Let's assume that Jack is logged in to his bank, and has a
cookie stored on his computer. Each time he sends an HTTP request to the bank
(i.e., views a page or an image on a page) his browser sends the cookie along
with the request so that the bank knows that it's him making the request.
Jill, meanwhile, runs a different website and has managed to get
Jack to visit it. One of the items on the page is in fact loaded from the bank,
for example in an iframe. The URL of the iframe or request contains instructions
to the bank to transfer money from Jack's account to Jill's. Because the
request is coming from Jack's computer, and includes his cookie, the bank
assumes it is a legitimate request and the money is transferred.
This type of attack is extremely dangerous and virtually
untraceable. As a developer, your job is to protect against it, and the best way
to do that is to remember Rule
Number One: Never, Ever Trust Your Users. No matter how authenticated they are,
do not assume every request was intended.
In practical PHP terms, you can combat CSRF with several
relatively simple coding habits. Never let the user perform an action with a
GET request - always use POST for anything that changes data. Confirm actions
before performing them with a confirmation dialog on a separate page - and make
sure both the original action button or link and the confirmation were clicked.
Even better, have the user enter information like letters from their password
on the confirmation page.
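The POST-only habit can be sketched as a small guard, shown here under some
assumptions: the function names (is_post_request, handle_action) and the use of
plain HTTP status codes as return values are invented for this example. On its
own this does not stop a forged form post, so it is only one layer among the
habits listed here.

```php
<?php
// Hypothetical guard: true only when the request method is POST.
// Takes the $_SERVER array as a parameter so it is easy to test.
function is_post_request(array $server): bool
{
    return isset($server['REQUEST_METHOD'])
        && $server['REQUEST_METHOD'] === 'POST';
}

// Hypothetical action handler: returns the HTTP status code to send.
function handle_action(array $server): int
{
    if (!is_post_request($server)) {
        return 405; // Method Not Allowed: refuse to act on GET and friends
    }
    // ... perform the state-changing action here ...
    return 200;
}
```

In a real script the result would be passed to http_response_code(), for
example http_response_code(handle_action($_SERVER)).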
Add a randomly generated token to forms and verify it when the
form is submitted. Use frame-breaking JavaScript. Time-out
sessions with a short timespan (think minutes, not hours). Encourage the user
to log out when they've finished. Check the HTTP_REFERER header (it can be
hidden, but it is still worth checking: if it names a different domain from the
one expected, treat the request as a CSRF attempt).
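A minimal sketch of the token and referer checks follows. The function names,
the csrf_token session key and the 32-byte token length are all assumptions
made for illustration; it also assumes PHP 7 or later for random_bytes(). The
session array is passed in as a parameter so the logic can be exercised without
a live session.

```php
<?php
// Hypothetical helper: create a token once per session and reuse it.
function csrf_token(array &$session): string
{
    if (empty($session['csrf_token'])) {
        // 32 random bytes, hex-encoded to 64 characters (length is an assumption)
        $session['csrf_token'] = bin2hex(random_bytes(32));
    }
    return $session['csrf_token'];
}

// Hypothetical helper: compare the submitted token with the stored one.
// hash_equals() compares in constant time to avoid leaking the token.
function csrf_token_is_valid(array $session, $submitted): bool
{
    return is_string($submitted)
        && !empty($session['csrf_token'])
        && hash_equals($session['csrf_token'], $submitted);
}

// Hypothetical helper: reject a referer that names a different host.
// A missing header proves nothing either way, so it is let through
// and the token check is left to decide.
function referer_host_matches(array $server, string $expectedHost): bool
{
    if (empty($server['HTTP_REFERER'])) {
        return true;
    }
    $host = parse_url($server['HTTP_REFERER'], PHP_URL_HOST);
    return is_string($host) && strcasecmp($host, $expectedHost) === 0;
}
```

In use, csrf_token($_SESSION) would be written into a hidden form field after
session_start(), and csrf_token_is_valid($_SESSION, $_POST['token'] ?? null)
would be checked before the action is performed; the field name token is
likewise only an example.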
Character Encoding
Character encoding in PHP and associated database systems is
worthy of its own series. In any one request, there may be more different
character encodings in use than you might think.
For example, a single request and response (uploading a file to
a server and writing information to a database) may involve all of the
following items, each with its own character encoding: the HTTP
request headers, post data, PHP's default encoding, the PHP MySQL module,
MySQL's default set, the set of each table being used, a file being opened and
read, a new file being created and written, the response headers and the
response body.
English-speaking developers generally don't have much cause to
get embroiled in character encoding issues, and that results in a lot of
developers with a serious lack of understanding of how character encodings work
and fit together. For those that do have a reason to look at character
encodings, usually that interest ends with the setting of the response's
character set.
However, character sets are a fundamental part of all web
development. English alone can exist in any one of a wide variety of sets, and
developers are usually familiar with the most common two: ISO-8859-1 and UTF-8.
Fewer are familiar with UCS-2, UTF-16 or windows-1252. Still fewer are familiar
with commonly used alternative language sets (e.g., GB2312 for Chinese).
Which, in a very roundabout way, brings me on to the security
pitfalls of character encodings. Where data is processed by PHP using one
character set, but a database server uses a different character set, a
character (or series of characters) deemed safe by PHP may in fact allow SQL
injection against the database.
PHP security expert Chris Shiflett has written about this issue and included an example of
how it can be exploited to allow SQL injection even where input is sanitized
using addslashes().
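To make the mechanism concrete, here is a small byte-level sketch of how
addslashes() can be defeated by a multi-byte character set. It uses the
commonly cited GBK case; the specific input bytes are chosen for illustration
and the snippet only prints bytes, it does not talk to a database.

```php
<?php
// Two bytes: 0xBF (a GBK lead byte with no trail byte yet) and 0x27,
// which is an ordinary single quote.
$input = "\xbf\x27";

// addslashes() works byte by byte, so it inserts a backslash (0x5C)
// in front of the quote and knows nothing about GBK.
$escaped = addslashes($input);

echo bin2hex($escaped), "\n"; // prints "bf5c27"

// A database connection that reads these bytes as GBK treats 0xBF 0x5C
// as ONE two-byte character. The backslash has been swallowed, and the
// final 0x27 is once again a bare, unescaped quote.
```

The escaping function and the database disagree about where one character ends
and the next begins, and that disagreement is what opens the door to injection.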
The solution is to always use mysql_real_escape_string() rather
than addslashes() (or use prepared statements / stored procedures), and to
explicitly state character sets at all stages of interaction. Note that the old
mysql extension was removed in PHP 7, so on current versions the equivalent is
mysqli_real_escape_string() or, better, prepared statements. Ideally, use the
same character set throughout your system (UTF-8 is recommended) and where PHP
allows you to specify a character encoding for a function (e.g.,
htmlspecialchars() or htmlentities()), make use of it.
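One way to combine prepared statements with an explicit connection character
set is PHP's PDO extension, which this article does not otherwise cover, so
treat the following as a sketch under assumptions. The helper names, the users
table, its columns and the choice of utf8mb4 (MySQL's full UTF-8 character set)
are all invented for the example.

```php
<?php
// Hypothetical helper: build a MySQL DSN that states the connection charset.
function build_mysql_dsn(
    string $host,
    string $database,
    string $charset = 'utf8mb4'
): string {
    return "mysql:host={$host};dbname={$database};charset={$charset}";
}

// Hypothetical helper: look up a user with a prepared statement.
// The value is sent as a bound parameter, never spliced into the SQL.
function find_user_by_name(PDO $pdo, string $name)
{
    $stmt = $pdo->prepare('SELECT id, name FROM users WHERE name = :name');
    $stmt->execute([':name' => $name]);
    return $stmt->fetch(PDO::FETCH_ASSOC);
}

// Example wiring (needs a reachable MySQL server, so it is commented out;
// host, database and credentials are placeholders):
// $pdo = new PDO(build_mysql_dsn('localhost', 'shop'), 'db_user', 'db_password', [
//     PDO::ATTR_ERRMODE          => PDO::ERRMODE_EXCEPTION,
//     PDO::ATTR_EMULATE_PREPARES => false, // use real server-side prepares
// ]);
// $user = find_user_by_name($pdo, $_POST['name']);
```

Stating the charset in the DSN tells both the client library and the server
which encoding the connection uses, so the two sides cannot disagree about it.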
It's not just SQL that's vulnerable as a result of character
encoding bugs. Cross-site scripting is possible even where HTML characters are
escaped if character sets are not handled properly. Fortunately, once again
that is simple to avoid by properly setting character encodings at all stages
of the process and specifying character encoding for functions where possible.
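As a sketch of what setting encodings at the output stage can look like, the
snippet below declares the response character set and passes the same one to
htmlspecialchars(). The helper name escape_utf8 is an assumption, as is the
choice of the ENT_SUBSTITUTE flag, which replaces invalid byte sequences with
the Unicode replacement character rather than returning an empty string.

```php
<?php
// State the response charset so the browser does not have to guess it.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Hypothetical helper: escape for HTML using the same charset as the
// response. ENT_SUBSTITUTE swaps invalid UTF-8 bytes for U+FFFD.
function escape_utf8(string $value): string
{
    return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

// A stray 0xBF byte followed by markup: the invalid byte is replaced
// and the angle brackets are still encoded.
$bad = "\xbf<script>";
echo escape_utf8($bad), "\n";
```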