Why does software crash, freeze and hang?
Hands up who has never had a desktop application fatally crash, temporarily freeze or permanently hang on them? Apart from the abacus users over there we've all had it happen to us. But why?
Is it just sloppy programming or is there something fundamentally broken with the way applications and operating systems are coded?
After upgrading to OS X 10.4.3, Mail started consistently crashing a few seconds after launching. After digging around the forums I got a possible fix: remove all stored email messages and configuration/preference files. It worked, I'm happy to say. I got a virgin Mail application that was seemingly stable. Luckily, I use IMAP email - all my mail is stored on my server. So, after setting up the account details in Mail - yet again - I just synced Mail with the IMAP server and I had all my email local again - nice.
Now then, this debacle isn't new. I've had to do this all before. There's a pattern here perhaps. When the chips are down, or when configuration files are corrupted or not understood by the new version of the software, why does the application have to crash on me? Why is this such a fragile world?
What's the basis for my thinking that our software could be built to not crash, freeze or hang? I'll tell you...
Take the scenario where you're browsing the Web, hopping from one website to another - and then you click on a link to a website that doesn't exist. How do you know? Well, the browser says "can't find website" or some such thing. It doesn't crash, freeze or throw a wobbly - hopefully.
Now, what's special about the browser is that it is generally robust enough not to crash when it comes across websites that don't exist or don't adhere 100% to the HTML standard - hopefully. In fact, the expectation that websites won't exist or won't conform to web standards is built right into the browser.
Now, what would happen if Mail found that its configuration files were corrupt, wrongly formatted or from an old version? Could it just say "can't read configuration files - shall I try to repair them or just set up brand new ones?" rather than just crashing all forlorn like? Flump! I know Mail is currently capable of setting up new configuration files because that's what it did when I removed the old ones.
What would happen if Mail was to pretty much expect configuration files not to conform to standard (like the web browser and its websites)? What if, rather than crashing, Mail just asked politely "old preference files not readable - would you like to start afresh?" Wouldn't that be much nicer?
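To make that a bit more concrete, here's a rough sketch in Python of what "expecting the configuration to be broken" might look like. This is in no way how Mail actually does it. The JSON file format, the `DEFAULTS` values, the status strings and the `load_config` name are all made up for illustration. The point is only that a bad file becomes a status the application can ask the user about, rather than a crash.

```python
import copy
import json
from pathlib import Path

# Hypothetical defaults for an imaginary mail client (not Mail's real format).
DEFAULTS = {"version": 2, "accounts": []}


def load_config(path):
    """Return (config, status). A missing or broken file never raises."""
    try:
        text = Path(path).read_text(encoding="utf-8")
        data = json.loads(text)
    except FileNotFoundError:
        return copy.deepcopy(DEFAULTS), "missing"
    except (OSError, ValueError):
        # OSError: the file can't be read; ValueError: bad encoding or bad JSON.
        return copy.deepcopy(DEFAULTS), "unreadable"
    if not isinstance(data, dict) or data.get("version") != DEFAULTS["version"]:
        # Parsed fine, but not the shape or version this release understands.
        return copy.deepcopy(DEFAULTS), "old-or-unknown-version"
    return data, "ok"
```

The caller gets a usable configuration every time, plus a status. Anything other than "ok" is the cue to ask "shall I try to repair or start afresh?" instead of falling over.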
And another thing... What about that spinning coloured ball or watch (for pre OS X users) or sand timer (for Windows users), eh? That long pause. Why is my interface frozen when the application is waiting for data or running a calculation? That just seems archaic. A good (or bad) case in point is Spotlight searches. Why does Finder freeze with that spinning ball when it does a Spotlight search? And why can't I open any other Finder windows when it freezes?
So, something seems wrong at the core here. Spinning balls shouldn't exist. What we could be talking about here is decoupling the desktop interface from the collection and management of the data. That way the interface never crashes or stalls, but it can report when it's not finding the data where, or in the form, it expects. And the interface never momentarily freezes when accessing information or performing a calculation; it just displays a progress bar and lets you get on with other stuff.
Wouldn't this be much better - much more robust? Applications would be built to handle corrupt, malformed or missing data gracefully, and to handle waiting for data to sync up by showing the user what's going on. And they would never crash or hang the user interface.
Developers could split the interface from the data processing part of their applications and enable these two components to message each other. Or an application could be split into many more components all messaging each other. The contents of each message would be checked before being ingested so that a bad message couldn't hang or crash the recipient component.
I must admit that sometimes applications seem to cope well, but there just seem to be far too many times when they don't.
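Here's a toy sketch of that decoupling in Python, using a background thread and a queue from the standard library. It's an assumption-laden illustration, not a description of how Finder or any real toolkit works: `slow_search` and `run_interface` are invented names, and the "interface" is just a loop that polls for messages without ever blocking on the slow job.

```python
import queue
import threading
import time


def slow_search(term, results):
    """Stand-in for a slow job: posts progress, then the final answer."""
    for percent in (25, 50, 75, 100):
        time.sleep(0.01)  # pretend to do some work
        results.put(("progress", percent))
    results.put(("done", f"results for {term!r}"))


def run_interface(term):
    """Toy interface loop: polls for messages, never waits on the worker."""
    results = queue.Queue()
    worker = threading.Thread(target=slow_search, args=(term, results), daemon=True)
    worker.start()
    log = []
    while True:
        try:
            kind, value = results.get_nowait()  # returns at once, or raises Empty
        except queue.Empty:
            time.sleep(0.001)  # a real interface would redraw and take clicks here
            continue
        log.append((kind, value))  # e.g. update the progress bar
        if kind == "done":
            return log
```

Calling `run_interface("mail")` returns the list of messages the interface saw. The bit that matters is `get_nowait()`: the loop keeps turning whether or not the worker has anything to say yet.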
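As a minimal sketch of the "check before ingesting" idea, again in Python and again with made-up names (`EXPECTED_FIELDS`, `validate_message` and `receive` are hypothetical, as are the message kinds), a recipient might refuse anything that doesn't have the shape it expects:

```python
# Hypothetical message kinds and the fields each one must carry.
EXPECTED_FIELDS = {
    "search": {"term": str},
    "progress": {"percent": int},
}


def validate_message(message):
    """Return (ok, reason). Checks shape before the recipient uses the contents."""
    if not isinstance(message, dict):
        return False, "not a mapping"
    kind = message.get("kind")
    if not isinstance(kind, str) or kind not in EXPECTED_FIELDS:
        return False, f"unknown kind: {kind!r}"
    for field, expected_type in EXPECTED_FIELDS[kind].items():
        if not isinstance(message.get(field), expected_type):
            return False, f"bad or missing field: {field}"
    return True, "ok"


def receive(message, inbox):
    """Accept the message only if it validates; report the reason either way."""
    ok, reason = validate_message(message)
    if ok:
        inbox.append(message)
    return reason
```

A rejected message just produces a reason the sender (or the user) can be told about. The recipient carries on as if nothing happened.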
Now, am I dreaming or is there fundamentally something amiss in the way applications and operating systems are built today? Couldn't all this be much better? What do you reckon, eh?