From: Erik Naggum
Subject: Re: dealing with errors (was: "Programming is FUN again")
Date: 1998/03/26
Message-ID: <3099927027198005@naggum.no>#1/1
X-Deja-AN: 337979141
References: <6e9p0b$h0j@er3.rutgers.edu> <6ec6os$7a@bgtnsc03.worldnet.att.net> <6ecbd5$epg$1@client3.news.psi.net> <6f0m6n$71u$1@news.hal-pc.org> <351692EA.41C6@bogus.acm.org> <351A72C3.5CEF1977@badlands.nodak.edu>
Followup-To: comp.lang.lisp
mail-copies-to: never
Organization: Naggum Software; +47 8800 8879; http://www.naggum.no
Newsgroups: comp.lang.scheme,comp.lang.lisp

* Brent A Ellingson
| In other words, why the hell did the equipment test for the exception if
| it didn't know what to do with it, except crash and destroy the rocket?
| Had it not tested for the exception on the non-critical code, the rocket
| probably would not have failed.

it is amazing that the view that "don't test for errors you don't know
how to handle" is _still_ possible in the light of that report, but I
guess that's the way with _beliefs_.  I cannot fathom how people will
read all the contradictory evidence they can find and still end up
believing in some braindamaged myths.

the problem is: the equipment did _not_ test for the exception.  the
exception was allowed to propagate unchecked until the "crash-and-burn"
exception handler took care of it.  this could be viewed as silly, but
the report clearly states why this was sound design: unhandled
exceptions should be really serious and should indicate random hardware
failure.  the _unsound_ design was not in the exception handling at all,
it was in allowing old code from Ariane 4 to still run in Ariane 5,
notably code that should run for a while into the launch sequence on
Ariane 4 because it would enable shorter re-launch cycles -- which was
not necessary at all on Ariane 5.  the error was thus not in stopping at
the wrong time or under the wrong conditions -- it was in _running_ code
at the wrong time and under the wrong conditions.
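
to make the distinction concrete, here is a minimal sketch in Python.  it
is not the flight software: every name in it (`to_int16`,
`alignment_task`, `run_unit`, `OperandError`) is hypothetical, and the
overflowing 16-bit conversion is only a stand-in for whatever check a
legacy routine leaves unhandled.  the last-resort handler treats any
unhandled exception as a hardware fault and shuts the unit down; the only
thing that varies is whether the legacy task is run at all.

```python
# Illustrative sketch only: all names are hypothetical, nothing here is
# taken from the Ariane flight software.

INT16_MIN, INT16_MAX = -32768, 32767


class OperandError(Exception):
    """Raised when a value does not fit the target representation."""


def to_int16(value: float) -> int:
    """Convert to a 16-bit signed integer, raising instead of wrapping."""
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OperandError(f"{value} does not fit in 16 bits")
    return result


def alignment_task(horizontal_bias: float) -> int:
    """Legacy routine: assumes its input always fits in 16 bits and
    therefore installs no handler of its own."""
    return to_int16(horizontal_bias)


def run_unit(task_enabled: bool, horizontal_bias: float) -> str:
    """Top-level policy: an unhandled exception means 'hardware fault'."""
    try:
        if task_enabled:
            alignment_task(horizontal_bias)
        return "running"
    except Exception:
        # last-resort handler: nothing below knew what to do, so shut down
        return "shutdown"


print(run_unit(True, 100.0))   # input within the assumed range
print(run_unit(True, 1e6))     # assumption violated, unit shuts down
print(run_unit(False, 1e6))    # legacy task not run, nothing to violate
```

the handler is identical in all three calls.  what changes the outcome is
whether code whose assumptions no longer hold is allowed to run.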

"had it not run the bogus code, the rocket would not have failed in it."

how can you expect to learn from mistakes when you insist that the
errors you observe are caused by mistakes you think you have _already_
learned from, and that others (the dumbass people who use exceptions, in
this case) are at fault for not learning from?

rather than "don't test for errors you don't know how to handle", I
propose "don't run code with errors you aren't prepared to handle".

did you notice how the report had the brilliant insight that we have
gotten used to thinking that code is good until proven faulty and that
this was the major cultural problem pervading the whole design and
deployment process?  it's high time this insight could sink into the
right people and cause more focus on provably correct code and
verification.  with the extremely arrogant attitude still held by Brent
and many others like him, we will continue to produce crappy code that
crashes rockets for millennia to come without learning what the problem
really is: unchecked assumptions!

#:Erik
-- 
religious cult update in light of new scientific discoveries:
"when we cannot go to the comet, the comet must come to us."