On the importance of backward compatibility

I’m often asked why I’m so obsessed with backward compatibility and, as a result, why I’ve made the issue such a central part of the LSB over the past year. Yes, it’s hard, particularly in the Linux world, because there are thousands of developers building the components that make up the platform, and it just takes one to break compatibility and make our lives difficult. Even worse, the idea of keeping extraneous stuff around for the long term “just” for the sake of compatibility is anathema to most engineers. Elegance of design is a much higher calling than the pedestrian task of making sure things don’t break.

Why is backward compatibility important? Here’s a great example, via Joel Spolsky (note: from 2004):

Raymond Chen is a developer on the Windows team at Microsoft. He’s been there since 1992, and his weblog The Old New Thing is chock-full of detailed technical stories about why certain things are the way they are in Windows, even silly things, which turn out to have very good reasons.

The most impressive things to read on Raymond’s weblog are the stories of the incredible efforts the Windows team has made over the years to support backwards compatibility: “Look at the scenario from the customer’s standpoint. You bought programs X, Y and Z. You then upgraded to Windows XP. Your computer now crashes randomly, and program Z doesn’t work at all. You’re going to tell your friends, ‘Don’t upgrade to Windows XP. It crashes randomly, and it’s not compatible with program Z.’ Are you going to debug your system to determine that program X is causing the crashes, and that program Z doesn’t work because it is using undocumented window messages? Of course not. You’re going to return the Windows XP box for a refund. (You bought programs X, Y, and Z some months ago. The 30-day return policy no longer applies to them. The only thing you can return is Windows XP.)”

I first heard about this from one of the developers of the hit game SimCity, who told me that there was a critical bug in his application: it used memory right after freeing it, a major no-no that happened to work OK on DOS but would not work under Windows where memory that is freed is likely to be snatched up by another running application right away. The testers on the Windows team were going through various popular applications, testing them to make sure they worked OK, but SimCity kept crashing. They reported this to the Windows developers, who disassembled SimCity, stepped through it in a debugger, found the bug, and added special code that checked if SimCity was running, and if it did, ran the memory allocator in a special mode in which you could still use memory after freeing it.

This was not an unusual case. The Windows testing team is huge and one of their most important responsibilities is guaranteeing that everyone can safely upgrade their operating system, no matter what applications they have installed, and those applications will continue to run, even if those applications do bad things or use undocumented functions or rely on buggy behavior that happens to be buggy in Windows n but is no longer buggy in Windows n+1…

A lot of developers and engineers don’t agree with this way of working. If the application did something bad, or relied on some undocumented behavior, they think, it should just break when the OS gets upgraded. The developers of the Macintosh OS at Apple have always been in this camp. It’s why so few applications from the early days of the Macintosh still work…

To contrast, I’ve got DOS applications that I wrote in 1983 for the very original IBM PC that still run flawlessly, thanks to the Raymond Chen Camp at Microsoft.

I can almost feel the revulsion among my readership right about now. However, next time you’re in Best Buy or CompUSA, look at the shelf of Windows applications, then compare it to the shelf of Mac applications, and perhaps you’ll better understand why it’s important.

Beyond the results speaking for themselves, I’ll argue that it takes a better engineer to move a platform forward while at the same time making sure things don’t break. It’s pretty easy to wash your hands of something and declare it to be someone else’s problem.
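
To make the SimCity story concrete, here is a minimal sketch (an illustration only, not the actual SimCity code) of the use-after-free pattern Joel describes: on single-tasking DOS the freed block usually still held its old contents, so the bug went unnoticed; on a protected, multitasking OS the allocator may hand that memory to someone else right away.

    /* use_after_free.c -- illustrative only; not the real SimCity code.
     * The program reads a heap block after free()ing it.  On DOS this
     * often "worked" because nothing reused the block; on a modern OS
     * the behaviour is undefined and may return garbage or crash. */
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        char *name = malloc(32);
        if (name == NULL)
            return 1;
        strcpy(name, "SimCity");
        free(name);                /* block handed back to the allocator... */
        return name[0] == 'S';     /* ...but still read afterwards: undefined behaviour */
    }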

42 comments on “On the importance of backward compatibility”

  1. Alejandro

    I completely agree. Even if MS’ backwards compatibility isn’t as perfect as Spolsky suggests, it’s still an important Windows selling point. It’s also an important feature of, say, PlayStation consoles (most of those who buy new Sony consoles previously owned older models, and they appreciate being able to keep playing their old games).

  2. Simon Waters

    The crux of the problem here is that stuff gets left behind.

    SimCity could have been fixed for XP, and XP could have checked at install time and fetched the patched version. Then the additional “patch”, which adds complexity and thus potential security risk, could have been avoided. Of course this would have required Windows to have some sort of meaningful way of recognizing what software was installed, rather than the ad-hoc stuff that happened (especially before XP). It would also require that the authors of SimCity (or the source code) be available.

    The downside has to be considered: several of Microsoft’s major security issues this year on XP have been in code that is effectively (or actually) obsolete, but still hanging around.

    Besides, I think you are comparing API compatibility with binary compatibility.

    I thought Mac OS X shipped with an entire copy of Mac OS 9 basically stuck in to allow old Mac OS apps to run, which, when you make a major change to the underlying technology, is probably the only way. (I’m not an Apple user, though.)

    I suspect the reason there aren’t more Mac apps around is more to do with the fact that Microsoft has roughly 80 times the desktop share, and so roughly 80 times the market. Certainly hiring anyone with Mac OS programming clue is pretty hard, and I know projects with a fair bit of programming clue still struggling to get that polished Mac OS experience as a result.

  3. Pingback: Backward Compatibility on iface thoughts

  4. Roger

    Hmm.

    I read Raymond’s blog quite a bit, as I code for both Windows and Unix. I applaud the effort the Windows team has put into this.

    I don’t think this is the main reason for Windows’ success, but it is an important part of it.

    For a Windows user, this grand effort protects your investment in the operating system and applications. It means the chances of you deciding to toss out everything and start again (possibly on another OS) are lower.

    Is it better engineering? Maybe. It is certainly more skillful engineering, except that other views into Microsoft development suggest they have substantial maintenance issues in the Windows code, which is just the sort of thing I would expect this policy to produce.

  5. Mark Brown

    Compatibility isn’t the big issue for ISVs working with Mac OS. As Simon says, they’ve got a copy of Mac OS 9 sitting there for any pre-OS X applications, and the two CPU transitions they’ve done (68k to PowerPC and then to Intel) have both included emulators to allow binaries for the previous architecture to continue to run. Sounds like exactly the sort of thing you’re asking for, really.

  6. Kari Pahula

    Contrast having to keep binary compatibility with ancient binary blobs with having the source available for the software that you use and need. Isn’t it wonderful to live in a GNU world?

  7. stephen o'grady

    “Beyond the results speaking for themselves, I’ll argue that it takes a better engineer to move a platform forward while at the same time making sure things don’t break.”

    that’s certainly been the Solaris folks’ philosophy for some time, and undoubtedly has benefits.

    it does, however, have costs as well.

  8. Joe Buck

    Ian,

    Yes, Microsoft expends enormous effort on this kind of thing, and it is the reason that Windows has so many problems. We run mainly Linux at my house, but can dual-boot if absolutely needed. My daughter, six years old at the time, was given a (Windows, of course) computer game as a gift, so I booted up Windows to install it. It turned out I had to make her an administrator so she could run the game; this kind of thing is the main reason why so many Windows users run as administrators all the time. Heavy investment in backward compatibility to make sloppy programs keep running forces the platform to be less secure, harder to maintain, and buggier.

    That doesn’t mean that you can’t support backward compatibility at all. We can use compatibility libraries to make old interfaces available.
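
    As a sketch of what such a compatibility library can look like on a GNU/Linux system (an illustration only; the library name, symbols, and version script are hypothetical), a shared library can keep an old, bug-compatible entry point alive under its old symbol version while new programs link against the fixed one:

        /* libfoo.c -- hypothetical library using GNU symbol versioning.
         * Old binaries keep resolving foo@LIBFOO_1.0 (the quirky old
         * behaviour they were built against); new binaries get the
         * corrected default foo@@LIBFOO_2.0.
         *
         * Build (assumes a GNU toolchain and a matching version script):
         *   gcc -shared -fPIC -Wl,--version-script=libfoo.map libfoo.c -o libfoo.so.2
         */
        int foo_v1(int x) { return x + 1; }      /* legacy behaviour, kept as-is */
        int foo_v2(int x) { return x + 2; }      /* fixed behaviour, new default */

        __asm__(".symver foo_v1,foo@LIBFOO_1.0");
        __asm__(".symver foo_v2,foo@@LIBFOO_2.0");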

  9. Johan

    Are you seriously suggesting that having the linux kernel special-case its behaviour to work around bugs in userspace programs would be a good idea?

    If so, I suggest you float that idea on lkml. I’m sure the results would be hilarious.

  10. Eugenia

    Ian, VERY well said. Backwards compatibility is extremely important, but unfortunately most developers in the Linux community just don’t understand that.

  11. Pingback: Eugenia’s rants and thoughts :: The importance of backwards compatibility :: January :: 2007

  12. (dim)

    Absolutely agree!

    Look at Solaris: a 12-year-old binary compiled on a SPARCstation 1 will still run on the latest Solaris 10 on any of the latest SPARC chips!

    Now if somebody tries to invest in Linux (let’s say writing a modem driver, or even a simple database application), the same binary probably will no longer work in 6 months… Will you be happy to see all your time/money wasted?…

    Rgds,
    -dim

  13. Redeeman

    dim, that’s just totally wrong.

    linux is very backwards compatible, yes, kernel drivers are not so much, but your database example is completely bogus.

    sure, if you DYNAMICALLY link, you may not have good results in years; however, your 6 months example is totally wrong. just statically link, and you will not have any issues. i still have years-old binaries that continue to function perfectly. and as for your kernel driver example, they can follow procedure and have it included.

    and even if some totally unforeseen thing happens and your binary fails to work, you need only issue the command “make”, and it will work again.
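
    To illustrate the static-versus-dynamic linking point above, here is a trivial sketch (illustrative only; file names are made up):

        /* hello_compat.c -- trivial program for the static-vs-dynamic point.
         *
         *   gcc hello_compat.c -o hello_dyn          (dynamic, the default)
         *     -> depends on whatever libc.so.6 the target system provides
         *   gcc -static hello_compat.c -o hello_static
         *     -> carries its own copy of libc, so it keeps running even if
         *        the system libraries change underneath it (at the cost of
         *        size and of missing later library bug/security fixes)
         */
        #include <stdio.h>

        int main(void)
        {
            printf("still running, years later\n");
            return 0;
        }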

  14. McBofh

    The database example is not bogus at all. The db vendors certify their products based on certain versions of the Linux kernel, and if you change the installed kernel version you are outside the vendor’s tested behaviour.

    Re dynamic linking: a major reason why Sun decided to do away with static linking of the OS utilities is support. If a customer is using statically-linked binaries, then you have *no* guarantee that when they patch their system they will actually end up getting the fix for whatever bug their application is stumbling over.

    There is more to backwards compatibility, and dynamic linking, than Redeeman appears to understand.

  15. Rich Steiner

    Joe Buck suggested that “Microsoft expends enormous effort on this kind of thing, and it is the reason that Windows has so many problems.”

    With all due respect, there are many platforms out there which are able to maintain a very high level of backwards-compatibility without also being encumbered with the level of cruft that Microsoft’s product lines are famous for.

    I work on Unisys Clearpath Dorado mainframes, for example, and we use software all the time that was written in the late 1960s or 1970s (small utilities mainly) and which still compiles and runs on that platform. Most of the stuff we use is far more recent, of course, but it’s nice to have the ability to invest a significant amount of effort in the creation of a piece of software and know that the platform won’t leave us behind even though it’s constantly advancing “under the covers” at the hardware and software (OS/EXEC) level.

    Even in the PC realm there are examples of platforms that maintain a high level of backwards compatibility in a far more elegant fashion. The OS/2 operating system I use on one of my main desktops is a good example: it runs DOS and 16-bit Windows software without requiring a FAT-based filesystem or much of the architectural morass that MS claimed was “required for backwards compatibility” in Windows 95, and it also runs almost everything written for the 16-bit OS/2 variants going back to the mid 1980s. Heck, you can even boot a variant of CP/M in a VMB under OS/2 without problems, alongside various actual MS-DOS, DR-DOS, or PC-DOS images. Try that with Windows. The current variant of OS/2 (eComStation) boots on modern hardware and still maintains the same level of compatibility with older software.

    Microsoft spends enormous effort on marketing and on finding new and exciting ways to tie their current user base to their cash cows (Windows and Office), and *very* little effort (apparently) on solving engineering problems within their product lines.

  16. gerd

    All you really need is a userspace “subdistribution”: a standard collection of libraries, released every 6 months and shipped as a whole. Not the kernel, not vanilla, not a desktop environment. More a kind of GCC distribution with a standard library collection. If every distribution is built on the basis of that standard collection, no problem will occur. And new versions come out every 6 months.

    I mean, KDE has no problem guaranteeing backward compatibility for KDE applications within the same major version (3), because the project controls the libraries. The space between the kernel and the desktop environments is still controlled by the distributors’ diversity, and here shit happens all the time. It is not a frontend or backend problem.

  17. Carlie J. Coats, Jr.

    I agree whole-heartedly with maintaining backwards compatibility. (I’ve been maintaining an OpenSource environmental-modeling support library for 14 years now, and I’ve been very careful: any app that would work with the 1992-vintage 0.7 prototype release will work with the current 2007-vintage 3.1 release, either at the source level or the binary level.)

    But responding to McBofh:

    (a) Dynamic linking means that *I* have no guarantee that apps won’t fail because of *new bugs in shared-object libraries* of the new OS version. I’ve had it happen too many times that new OS versions had buggy Motif libraries that killed “nedit” for me, and for me that is a mission-critical application. Nowadays, I don’t trust distro releases, and get a statically-linked “nedit” straight from “nedit.org”, or build my own from source (which can be a royal PITA on RH-based systems, or some HPUX and AIX systems).

    (b) I note that the Sun Studio compilers for Linux (see http://developers.sun.com/sunstudio/index.jsp) require the _static_ “libm.a”.

    fwiw.

  18. bob hunter

    I completely disagree. Would you like to have a vehicle that uses gas and is backward compatible with coal too? In the case you have described, the situation is far worse. As a matter of policy, an OS must never patch buggy applications. If the application crashes after an OS update, it was likely to crash or malfunction before the update too. On the number of applications on the shelf for Windows vs. OS X: I’ve used Windows, Linux and OS X, and I can say that OS X is the best. It is rock solid (up and running, with no crashes in three years), and I can run OS X applications, Windows applications, and Linux applications from the same OS, with no need to reboot. The next version will use ZFS by default, and the GUI is elegant and functional like no other. I do not understand all this bullying or mobbing against Mac OS X. It is a really nice OS.

  19. Ian Murdock Post author

    If the vast majority of fuel stations only sold coal, then yes, I would like to have a vehicle that is backward compatible with coal too. -ian

  20. Ian Murdock Post author

    Rich Steiner: “Joe Buck suggested that ‘Microsoft expends enormous effort on this kind of thing, and it is the reason that Windows has so many problems.’ With all due respect, there are many platforms out there which are able to maintain a very high level of backwards-compatibility without also being encumbered with the level of cruft that Microsoft’s product lines are famous for.”

    True enough.

    -ian

  21. Pingback: from hades » Blog Archive » On the importance of backward compatibility

  22. Redeeman

    mcbofh:
    i understand it perfectly.
    but the person saying a db app binary would stop working after 6 months is clearly totally wrong. besides, really the only reason one would have problems would be if it were linked against a very old glibc; remember, all the recent major versions of glibc have been compatible. but 6 months is just hilarious to say, really.

    and as for certifying against a specific linux kernel, well, their application should be good enough to run on any kernel version. so in reality, are you complaining about linux releasing too often, or about the lousy quality of some applications?

  23. Redeeman

    Oh, sorry for posting again, but i forgot something.

    you claim there is no backwards compatibility beyond 6 months? then how is it i am able to run my ORIGINAL first-linux-release quake3 binaries? you know, i think people would tell you that quake3 is a fair bit older than half a year.

  24. Andre Sankari

    Ian Murdock has cited Joel Spolsky to explain why backward compatibility is important: “They reported this to the Windows developers, who disassembled SimCity, stepped through it in a debugger, found the bug, and added special code that checked if SimCity was running, and if it did, ran the memory allocator in a special mode in which you could still use memory after freeing it.”

    Fixing the faulty application is not only more elegant but most probably also much more efficient: it would be easier to maintain the OS’s code without all those special chunks of code that fix buggy applications. And anyone can fix a buggy FOSS application.

    This example of a SimCity problem in Windows is totally irrelevant in a discussion about a FOSS OS. However, that doesn’t mean you’re not right in the whole argument about how important backward compatibility is.

    Cheers,
    Andre

  25. Pingback: Backwards compatibility: not backward at all: Rudd-O.com

  26. Anon

    Ah, but if we were in a FOSS world, then instead of stepping through the binary for SimCity, one could step through the source, debug the error in less time, and submit a patch to the Maxis team, thereby correcting the error in the appropriate location.

    This is an example of solving the problem vs working around the problem. Working around the problem is often easier in the short term, but if you do it too many times then life gets interesting.

    In the long run one should solve problems. Only work around problems that are not going to affect others.

  27. dna

    Speaking of backward compatibility, I had to make major changes to an application written in C++ for Linux while upgrading to the newest Linux distribution. There was no source compatibility for an application written 4 years ago, and I could only dream about binary compatibility. This really makes me sad about writing applications for Linux in the future. There are no such problems in Windows.

    If we want Linux to be widely used and industry-adopted, we must make it more backward compatible!!!

    The investment in software is like any other investment. If you buy a house, you don’t want to redesign all its walls all the time.

  28. Anton

    Yes, compatibility is very important, but here are a couple of examples:

    I have a set of DOS games: Lemmings, Goblins 2 (and 3), Day of the Tentacle. They would NOT work at all under any of the Windows versions of the last decade, but they work nicely in DOSBox.

    I have an HPIB/GPIB (IEEE-488) card made by HP in the ages of yore, which works only under Windows 9x. Should I throw it away and buy a new one from Agilent or NI for $1000 (one thousand bucks for an expansion card with two chips worth $40)? But it works under Linux.

    Or should I download a whole Bluetooth stack and a bundle of software and drivers whenever I just plug in a new Bluetooth dongle (bought on the way home)? Compare that to my Linux udev, which knows that all the Bluecrap has only one or two chip vendors behind it, despite a big range of VendorID/DeviceID combinations.

    Enough for now, but the list of stories could go on…

  29. GerardM

    There are several key differences between Windows and Linux.

    *There is no one Linux; there are (imho) too many distributions that are often mutually incompatible when you want to run the same application. Windows is one platform that only changes over time.
    *The article says it correctly: people have invested in applications. Typically a Linux system is running Free/Open software, and consequently it is a matter of ensuring that you ARE running the latest version of the software. Because later versions can be made available at marginal cost, the importance of backward compatibility lies only in ensuring that the data can be read and/or converted by the new version of the software.

    Thanks,
    GerardM

  30. André

    I can’t believe you actually posted this.

    Apart from the fact that bugfixes for applications absolutely do not belong in OS code, I advise everyone to read Raskin’s position on backward compatibility in “The Humane Interface”, where he illustrates quite nicely where that kind of thinking leads.

    Maybe users should start contacting MS support instead of the ISV about software bugs. After all, isn’t that what the MS BackComp Dept. is there for? Plus, third-party software gets a lot cheaper that way! ;-)

  31. G Fernandes

    Ian,

    Bad example. An example illustrating working around an obvious bug for the sake of backward compatibility just doesn’t cut it.

    The whole point of iterative improvement is to improve the code base at a slow but steady pace. This absolutely means trimming off cruft.

    IMHO, SimCity wrote the bad code, so SimCity needed to fix the bad code. The OS adding workarounds to accommodate bad code is just such a BAD idea.

    We all laugh about Windows vulnerabilities. Well, now we know at least one reason why Windows is as secure as a leaky sieve.

  32. G Fernandes

    Backward compatibility is important. But good practice is FAR more important.

    A good example might have been the JDBC API. If there is a problem in a specific driver implementation, clients do not have to change their code. The implementation of the offending driver changes to fix bad behavior.

    Backward compatibility doesn’t suffer: all clients access the driver through the JDBC interface. The JDK doesn’t add workarounds to accommodate bad driver implementations. The offending driver is fixed.

    Voila! You have backward compatibility with good practice!
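
    The same interface-versus-implementation idea can be sketched in C as well, since most of this thread is about C-level ABIs (the names below are illustrative, not a real library): clients call only through a fixed table of function pointers, so a broken driver can be fixed or swapped without touching the callers.

        /* A JDBC-like pattern in C: the interface (a struct of function
         * pointers) stays stable; the driver behind it can be fixed freely. */
        #include <stdio.h>

        typedef struct db_driver {
            int  (*connect)(const char *dsn);
            int  (*execute)(int conn, const char *sql);
            void (*disconnect)(int conn);
        } db_driver;

        /* One hypothetical driver; a bug fix changes only these functions. */
        static int  pg_connect(const char *dsn)           { (void)dsn; return 1; }
        static int  pg_execute(int conn, const char *sql) { (void)conn; printf("run: %s\n", sql); return 0; }
        static void pg_disconnect(int conn)               { (void)conn; }

        static const db_driver postgres_driver = { pg_connect, pg_execute, pg_disconnect };

        int main(void)
        {
            const db_driver *drv = &postgres_driver;   /* client sees only the interface */
            int conn = drv->connect("dbname=test");
            drv->execute(conn, "SELECT 1");
            drv->disconnect(conn);
            return 0;
        }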

  33. bob hunter

    Did I mention that we can run emulators nowadays? If one happens to have a legacy application that runs only on a certain version of a certain OS, one can run that OS, and with it the legacy application, on top of OS X using emulation, with no need to carry patches for old bugs into each new release. As a matter of fact, Apple trashes old bugs and legacy technology with each update, both in software and in hardware. It is a policy that follows natural selection: the old is cleared out of the way. It is also a policy that allows you to run your software museum, using emulators.

    By the way, it is the first time I have read a Linux insider speaking well of Microsoft. Things are really changing around here…

  34. Jacob Boswell

    I think any programmer who has maintained a program beyond its first release should recognize the importance of backwards compatibility. However, the example you chose to illustrate its importance is the worst one possible. The chosen way to fix the “bug” is backward compatibility with the worst possible practice, and the “results speak for themselves.” Go ahead and install XP on a system, then install a Linux system with the SAME functionality (OS, email, browser, text editor), and this example will tell you exactly why your fresh install of XP is now taking up 3-5 GB of space vs. the >1 GB taken by Linux. And now every application and user has to suffer the poor performance imposed by checking to see whether a “SimCity” is running.

    I agree 100% with G Fernandes’ comments. Applications should strive to maintain API compatibility and give very long deprecation cycles if a change is absolutely required. But trying to follow the MS Windows approach is a bad path for everyone involved except the one group whose application gets the special treatment. If an application chooses to use an undocumented or unexposed call, it risks breaking its own stuff. No one should be held responsible for maintaining compatibility of their “private” interfaces.

    Finally, I have to agree with Simon Waters, who implied that the number of applications available on the shelves of Best Buy is a poor indicator of the success of the approach MS has taken. In fact, if you compared the number of available applications vs. the total desktop market share, you would likely see that Linux has a huge advantage, followed by OS X and then Windows. But that is just an uneducated guess :)

    Anyway, the point here is that backward compatibility is important but, as in most things, you need to find the right balance when moving a product forward.

  35. Pingback: tecosystems » A Surprise Quiz

  36. Simon Peter

    “Typically a Linux system is running Free/Open software and consequently it is a matter of ensuring that you ARE running the latest version of the software.”

    While this might be true for the base system, it is certainly not true for all the apps you might want to run on top of it and that are not part of the distribution. In fact, the whole point of an operating system is to provide the infrastructure for running applications on top. These might or might not happen to be part of your distribution of choice.

    It’s always important to remember that to the developer, binaries without the source are worthless. But it is equally important to remember that to the end user, source without (compatible!) binaries is worthless.

  37. Bill Mason

    “look at the shelf of Windows applications, then compare it to the shelf of Mac applications, and perhaps you’ll better understand why it’s important.”

    No, what I’ll understand is that Microsoft had no competition on the IBM PC for far too long, and that their operating system is pre-loaded when the hardware is purchased.

    Backward compatibility is a noble goal to an extent, but in the extreme it can be detrimental. We can see that now as we watch Microsoft take half a decade to release a minor upgrade to an unstable, bug-ridden, and insecure operating system. Some things are not worth emulating.

  38. Pingback: Murdock: On the Importance of Backward Compatibility | Technology News - Technology-x.net

  39. RG3

    There’s a difference between maintaining backwards API and even ABI compatibility, and bending over backwards to support broken apps. The former are laudable goals, and can usually be achieved without too much trouble if you thought things out in the first place. Obviously, no amount of thinking ahead can set you up for every possible future situation, but it can help.

    But if applications use undocumented features, or rely on bugs, or plainly fly in the face of the documentation (as in the SimCity example), there should be no onus on the system provider to continue to support that behaviour.

    If you want to be confident about your program working in the future, write it according to the documented API, and following the portability guidelines that litter the net. This won’t safeguard you against reckless maintainers, but many (most?) library maintainers do care about API compatibility, and even ABI compatibility for the more core ones.
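
    One common idiom for leaving that room for the future (a sketch, not something from the comment above) is to keep structs opaque behind accessor functions, so the library can add or reorder fields in a later release without breaking existing binaries:

        /* widget.h -- the public, ABI-stable surface */
        typedef struct widget widget;             /* opaque: size/layout never exposed */
        widget *widget_create(void);
        int     widget_get_count(const widget *w);
        void    widget_destroy(widget *w);

        /* widget.c -- private; fields may be added in later releases
         * without recompiling (or even relinking) existing callers. */
        #include <stdlib.h>
        struct widget {
            int count;
            /* future fields go here */
        };
        widget *widget_create(void)               { return calloc(1, sizeof(widget)); }
        int     widget_get_count(const widget *w) { return w->count; }
        void    widget_destroy(widget *w)         { free(w); }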

  40. Pingback: Ian Murdock’s Weblog » Blog Archive » More on the importance of backward compatibility

Comments are closed.