By: Mark H. Goodrich – Copyright © 2011
Since the introduction of the Boeing 767 in the early 1980s, design standards, operational performance and use protocols of automated aircraft systems have been the proverbial “800 pound gorilla” – central to many issues regarding airplane design, training, standards, procedures, incidents and accidents, and yet a topic that few want to address directly.
Over the past thirty years, I have been retained by airlines, insurers and government agencies to investigate hundreds of incidents and dozens of accidents in which automation issues were causal to some greater or lesser extent. I have also been retained by airlines and training centers to survey safety of operations, report on training requirements, evaluate training effectiveness, and draft curricula. As an instructor and check airman, I have trained and observed the ways in which airlines, military services, training centers and flight crews worldwide address automation issues in simulator, base, and line training. As a design engineer and test pilot, I have worked on automation issues in the context of design and certification projects, including conduct of flight test phases. And, as one who made the quantum leap from the so-called “steam gauge” environment to automated flight decks in the 1980s, and has since become rated to operate substantially all of the transport category airplane types of Airbus, Boeing and McDonnell Douglas equipped with Electronic Flight Instrument Systems (EFIS) and computerized Flight Management Systems (FMS), I have been personal witness to the checkered way in which automation has been developed by manufacturers, and incorporated by the user community of airlines and military services.
The recent incident with an Airbus A380 operated by Qantas (Qantas Flight Number 32, or QF032) has once again brought automation issues to the fore. Fortunately, no loss of life resulted, but unfortunately, the path is already set for industry, labor and the international aviation regulators to address the lessons of QF032 in much the same way as has become traditional. Statements will be made about how things must change. Orders, directives and regulations will be promulgated that appear – at least to the trade press and the less than well-informed public – to “keep this from ever happening again”. But, those responses will miss the critical marks and instead reflect a tacit agreement by corporate interests and governments to create the appearance of solution without addressing the central issues, requiring much expenditure, or significantly changing how things have been done.
At first blush, that may sound like a result dictated by the large moneyed interests of the corporatocracy, but the fact is that all of the involved parties come to a similar result through their respectively perceived interests. I use the word “perceived” because acknowledgement of their real long-term interests would bring all of the parties to a much different, and even more closely aligned, course of action.
Airplane manufacturers cater to the perceived desires of airline customers, and to the lessors that now own most airplanes operated by airlines worldwide. If managers at airlines and lessors believe costs are reduced by using more automation, requiring less experienced flight crews and less training, the die is cast in the perception of the airplane manufacturers for designing towards an ever higher level of automation. Lessors may be airplane owners, but they perceive themselves only as bankers, with no interest in how the airplanes operate in fact, how well they are maintained or any of the other myriad details that actually impact long-term airplane airworthiness, value and safety. So long as insurance is in effect, lessors believe they are immune from responsibility or adverse impact, and the decisions they make in terms of requirements upon the selling manufacturer, installed equipment and systems, continuing airworthiness through quality maintenance, and standards of the operating lessee-airlines, are driven solely by short-term profit, balanced with the special tax benefits for lessors hidden by legislatures worldwide within their respective revenue codes.
Airline managers see costs for training as pure expense, and the offer from manufacturers to reduce those costs through automation looks like free money to bean counters with little appreciation for operating realities. Lured in by the false promises that automation turns lead into gold, airline managers make the manuals ever thinner, and the training ever shorter, relying more upon computer-based training to minimum standards, and less upon a determination of whether flight crews have actually learned enough to safely and efficiently operate the airplanes to which they are assigned. Some argue that pilots and their unions are the check and balance on safety and adequacy of training, but perceived self-interests over the short term also drive the position of labor. Pilots want the same thinner manuals, with less to read and less upon which to be tested. Fewer memory items and thinner manuals mean there is less upon which a check ride failure can be based.
Aviation regulators have also bought into the false equation. Agencies hear from legislators that their airline constituents are on the verge of economic collapse, and need relief from the old (and with automation, no longer truly relevant) regulatory requirements. The die is cast for a rationalization by the regulators that automation eliminates much of the formerly required training – both initial and recurrent. If automation prevents a stall, they reason, then training for stall recoveries is no longer necessary. If automation prevents over-banking, then there seems to be no reason to train or check for the ability of flight crew to perform steep turns. If automation allows for computer-flown approach and go-around maneuvers, then why should there be any requirement for flight crew to demonstrate the ability to hand-fly such maneuvers? The natural progression of this nonsensical approach has been to extend the traditional six-month schedule for recurrent training and checking, which is now required by different regulators around the world to be accomplished only at intervals of 12, 18 or even 24 months. And, for most jurisdictions, flight crew are allowed to fly all but a very few minutes of their proficiency checks with full automation engaged. In essence, most regulators now require proficiency checking for only automated operation, but then only for a few specific functions. Indeed, one large carrier famously published a document entitled, “The 7 Things You Need To Know About FMS to Pass Your Checkride”, ensuring that few of its pilots would read any more of the manuals in order to learn about the other 300 things that FMS could do for them, and how it did those things. The corresponding requirements of the airlines, with agreement by their pilot groups, have been contemporaneously reduced to match the minimum standards required by the regulating authority. 
No matter how much common sense and experience may dictate training and checking above the minimum requirements, the mantra of airlines worldwide has become, “If the regulator does not require it, and it costs money, why would we want to do it?”
The fallacy of using minimum regulatory standards as a “design target” – whether for the design of airplanes, systems or equipment on one hand, maintenance inspections on another, or training on yet another – is that substantive regulatory standards are by their nature general, and can never meet the specific requirements that vary between airlines and their respective types of operation. Whether one is designing an airplane, training pilots or inspecting airplanes, standards must be set by the requirements of the operation first, and then verified as meeting or exceeding the minimum regulatory requirements.
The QF032 incident brings the reliance of automated systems on their inputs into sharp relief, because the computerized systems gave misleading information to that crew, made recommendations that were invalid, would have made the overall situation worse had they been followed, and prevented the crew from using systems that were undamaged, but locked out by computer responses. QF032 therefore brought many of the potential hazards of over-reliance on automation into play during a single flight event. The potential for damage to systems has always been an important aspect of airplane and systems design – using fault analysis to ensure to a reasonable statistical probability that a failure will not create a cascading fault pattern in other systems – but has also been one that frequently bends under the pressures of manufacturing economics and the resulting rationalizations.
On QF032, an uncontained engine failure cut lines and interrupted signals between sensors, computers and components, creating signal losses and patterns that computers could not interpret accurately. Evaluating damage tolerance has long been a design consideration, even on “steam-gauge” airplanes, because shrapnel damage from engine failures is not the exception, but rather the rule. Minimum standards for certification regarding containment, that all manufacturers use as design targets, were based on small core turbines with far less energy, and blades that were both smaller and made of weaker materials. Although the FAA interpretive rule making has long – since 1962 – required manufacturers to design and manufacture for the purposes intended, rather than to merely comply with “minimum standards” of the substantive regulations (“minimum standards do not constitute the optimum to which the regulated should strive” in the design, materials and manufacture of aviation products – FAA Order 2100.13 dated 01 Jan 1976, and its predecessor CAA Order 1000.9, dated 01 Oct 1962), the true facts are that the FAA often requires no more than the minimums in practice, and often through the “equivalent level of safety” system, has allowed significantly less than minimum standards. In addition, the FAA has by policy agreed that training and checking of airline pilots address only a single failure at a time, ignoring the fact that engine failures are often accompanied by immediately following fuel leaks, hydraulic leaks, pressurization leaks, bleed air leaks, cabin window failures, and failure of an adjoining engine. In regulation and in practice, neither the airlines nor the regulators acknowledge the true facts, because anticipating the reality of how engines actually fail – from design, certification and manufacturing to the training of flight crew – is more expensive than pretending that uncontained failures are improbable.
QF032 was not an isolated example of how uncontained failures occur and affect other systems. The history of transport category airplanes with large-fan engines is littered with examples over the past 40 years. Neither is QF032 an isolated example of how sensor input and programming failures can affect automatic systems. Many incidents and some accidents over the past 25 years reflect the myriad ways in which automatic systems are adversely affected by a failure by programmers to understand flight operations, sensor failures, programming errors, wiring shorts, RF interference, key punch errors by database update programmers, and a variety of data input errors. One recent example is the USAF B-2 crash at Guam, where water in the pitot-static system gave a false input to the flight control computers, on an airplane that cannot be flown without those computers. The crew bailed out, but there was a $2 Billion USD loss because the airplane had been designed such that it could not be flown absent computer assistance. That airline passengers do not have parachutes seems to have escaped the notice of many who seek to expand the reliance upon computers in military airplanes to the civilian sector. In a similar vein, consider how ice crystals in Airbus pitot systems produce uncommanded auto-thrust changes, how the speed protection modes for the autopilot and auto-throttle systems on the Turkish B737 approach crash at Amsterdam were defeated by the failure of just one radar altimeter, and how the “V-Mini” airspeed responses on Airbus airplanes so routinely bring airplanes across the runway threshold at 20 or more knots above a proper reference speed, with landings well beyond the touchdown zone, and in some cases, over-runs. Yet most airlines worldwide require crews to turn airspeed selection over to a computer for the final approach segment.
Faulty programming does not mean that system faults must occur as a condition precedent or precipitating event. As more items are wired through single circuit breakers, or combined controllers, and as so-called backup systems share the same actuator assemblies as the primary systems, the potential for “single points of failure” to multiple systems becomes greater. Fault analysis during the design process is supposed to anticipate failure points, but neither engineers nor regulators are likely to have the experience necessary to foresee how failures may occur, and how they can affect airplane operation or crew performance. An antiseptic view there sets the stage for flight crews to be confronted with highly confusing situations, where a simple failure cascades through other systems in ways that are illogical, difficult to understand, and in some cases, impossible to control.
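The shared-component concern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system names, the component names, and the `shared_components` helper are all hypothetical, and a real fault analysis is far more involved than a lookup of shared parts.

```python
from collections import defaultdict

# Hypothetical dependency map: every name here is invented for illustration.
# Each system lists the components it needs in order to operate.
DEPENDENCIES = {
    "primary_flight_control": {"actuator_A", "breaker_1", "bus_left"},
    "backup_flight_control": {"actuator_A", "breaker_2", "bus_right"},
    "autothrust": {"breaker_1", "bus_left"},
    "standby_instruments": {"breaker_3", "battery_bus"},
}


def shared_components(dependencies):
    """Map each component used by more than one system to those systems."""
    users = defaultdict(set)
    for system, components in dependencies.items():
        for component in components:
            users[component].add(system)
    # Keep only components whose loss would affect two or more systems.
    return {c: sorted(s) for c, s in users.items() if len(s) > 1}


single_points = shared_components(DEPENDENCIES)
for component in sorted(single_points):
    print(component, "->", ", ".join(single_points[component]))
```

In this invented map, the so-called backup flight control shares `actuator_A` with the primary, so the loss of that one assembly takes out both.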
The bulletin histories at every major airline are replete with notices to flight crew about how programming errors in controlling software or database updates have been observed to cause incidents, and in some cases, accidents. In all such cases, the software or database update has successfully made its way through a programming, production, and quality assurance process without the defect being detected and repaired. This fact alone should create a cautionary alert to all concerned about the “reliability approaching infallibility” of computers and automatic systems that is so highly touted by so many.
The bottom line is that everyone from design engineering and manufacturing management, to airline management, must harken back to the lessons of the past – too often purchased with fatalities – and realize that airplanes must be designed and manufactured so that automatic systems are a tool of assistance, rather than reliance. Sufficient display detail must be provided so that pilots can ascertain what automatic functions are taking place, including the ability to quickly discern a sensor or input fault from a basic system operating fault. Automatic systems should be capable of being switched off by the crew, and airplanes designed such that they can be easily flown absent computerized or automatic operating systems. Flight crew should be trained to constantly monitor the performance of automatic systems, with a healthy disrespect for what those systems are reporting and doing, and to fly the airplane to standards when automatic systems fail. Finally, shared routings for cable, fiber optics, and ARINC buses should be avoided, and greater attention paid to protecting inter-system and intra-system communications features from damage during operation.
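One way to picture the difference between a sensor fault and a system fault is a simple cross-check among redundant sensors. The sketch below is an assumption-laden illustration, not a description of how any certified system works: the function name, the sensor names, the readings and the tolerance are all invented.

```python
from statistics import median


def flag_disagreeing_sensors(readings, tolerance):
    """Return names of sensors whose value differs from the median by more than tolerance."""
    mid = median(readings.values())
    return sorted(
        name for name, value in readings.items() if abs(value - mid) > tolerance
    )


# Hypothetical radio altimeter readings in feet; the values are invented.
readings = {"rad_alt_1": -8.0, "rad_alt_2": 1950.0, "rad_alt_3": 1945.0}
suspect = flag_disagreeing_sensors(readings, tolerance=50.0)
print(suspect)  # ['rad_alt_1']
```

The point of the illustration is that a display which names the disagreeing input tells the crew the sensor is suspect, rather than leaving them to guess why the system it feeds is misbehaving.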
A prevailing assumption throughout the aviation industry worldwide is that a broader use of and reliance upon automatic systems allows reduced training requirements. I have been arguing the reverse to be true for some thirty years, and have usually felt as though I was shouting at the rain.
In the years before sophisticated automation, most airline training started with a two-week ground school – 80 hours on avionics, electrics, hydraulics, and so forth. When making the transition from the Martin 404 to the DC6, from the DC7 to the B707, or from the B720 to the B727, avionics required only four hours – the first morning of the first day of ground school – because avionics systems were different only in the location of switching panels. The advent of even the simple EFIS/FMS on the first B767 was a quantum leap, and should have been addressed prophylactically by airline management with an additional week of ground school – 40 hours devoted to understanding how the new EFIS and FMS systems functioned. Despite horrendous attrition rates (many pilots dropped their request for transition after the first few days of training, decided to finish out their careers on the older equipment, or simply retired early), airlines by-and-large refused to expand training, but instead sold their regulators on the notion that the automatic systems were so proficient and dependable that additional training was unnecessary.
Among many telling experiences that I had in those early days of automation was that I established a practice of telling students on the briefing for the first simulator session that I would conduct one or more non-briefed events in each session where an automated system would fail, provide bad output information to the crew, or act in a way that was different than planned. The goal was to create an atmosphere where the crew was constantly on alert to verify proper operation of automatic systems, including a verbal annunciation of changes to flight mode annunciators, and verification that the airplane turned in the correct direction, captured altitudes intended and so forth. In addition, the primary failures to be practiced were briefed, but resulting failure cascades were not. For example, the briefed engine fire and explosion was followed by a non-briefed fuel leak in the adjoining tank. I was ordered by several airline managements to discontinue such training, and to instead stick with the vanilla protocol designed to make everyone feel good about the process, regardless of how ineffective that training was. Since those days, the levels and scope of automation have increased substantially, but the manuals have become thinner, the memory items fewer, and the required systems knowledge ever less detailed.
Insufficiency of the minimum regulatory standards also exists with respect to flight training, checking and certification. One example is the continuing use of a gross weight takeoff with the loss of a critical engine at decision speed. Engine failure at high weights was critical for Constellations and DC7s, but the enormous thrust available with modern airplanes means that the most critical engine failure at decision speed occurs not at heavy weights, but rather at light weights, where velocities of minimum control for ground and air are the critical factors. A decision-speed engine failure on the B747-400 at maximum weight (870,000 pounds or 394,600 kilograms) is easy at 184 knots, but there is insufficient rudder to alone maintain the runway center-line in the same airplane at lighter weights (500,000 pounds or 226,800 kilograms), with only 135 knots of dynamic airspeed. Yet, the minimum standards of 1956 live on, and we check all airline pilots on modern types to be certain they can operate only at the higher weights and not at the more difficult lighter weights, the technological changes notwithstanding. Other examples of how minimum regulatory standards are insufficient to meet current technological and operational realities abound.
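The light-weight case can be put in rough numbers. The sketch below assumes that the aerodynamic force available from a fully deflected rudder scales with dynamic pressure, q = ½ρV², and that air density is the same in both cases; the function names are hypothetical, and real minimum control speeds depend on many other factors that this ignores.

```python
# Illustrative only: at fixed air density, dynamic pressure (and so the force
# a given rudder deflection can produce) scales with the square of airspeed.

KNOTS_TO_MPS = 0.514444      # one knot expressed in metres per second
SEA_LEVEL_DENSITY = 1.225    # kg/m^3, standard sea-level value (assumed)


def dynamic_pressure(speed_knots, density=SEA_LEVEL_DENSITY):
    """Return dynamic pressure in pascals for an airspeed given in knots."""
    speed = speed_knots * KNOTS_TO_MPS
    return 0.5 * density * speed * speed


def rudder_authority_ratio(light_speed_knots, heavy_speed_knots):
    """Ratio of dynamic pressure at the lighter-weight speed to the heavier one."""
    return dynamic_pressure(light_speed_knots) / dynamic_pressure(heavy_speed_knots)


# Speeds quoted in the text: 135 knots (light) versus 184 knots (heavy).
ratio = rudder_authority_ratio(135.0, 184.0)
print(f"{ratio:.2f}")  # 0.54
```

Under these assumptions the rudder at 135 knots has only about 54 percent of the dynamic pressure it has at 184 knots, while the asymmetric thrust it must oppose is not correspondingly smaller.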
The same types of insufficiencies exist with respect to the preflight inspection of modern airplanes. Airlines and regulators worldwide are still training and checking as though the flight crew was looking at designs of four decades ago, which means that the unique damage problems one might observe with composite skins, with carbon brakes, or with metal-to-composite control surface hinge attachments, have not been added to the training curricula or checking process. One day, a preflight oversight will result in a high-profile crash, and all concerned will immediately begin talking about how training must be expanded so that such an event “never happens again”. Unfortunately, the industry – from top to bottom – seems unable to anticipate such problems, because people with their eyes focused only on costs are willing to rationalize until the worst potentials become reality.
In the wake of QF032, some within IFALPA have called for a review on the sufficiency of minimum standards in the regulations addressing automation design and training, despite the fact that its constituent associations worldwide have traditionally pushed for relaxation of training standards. The training and checking standards as interpreted and enforced by regulators have always been subject to influence by pilot groups and airline managements, rather than reflecting the strict language of the substantive regulations, or the realities of actual operation. Instead, minimum standards have been modified by “letters of agreement” and “interpretive rule making” so as to minimize what has to be memorized, to drop the level of performance required by flight crew, and to reduce training time and expense for the airlines. While it is refreshing to see a union group talking about actual safety, rather than perceived safety, one wonders how that will hold up once the membership understands how it will translate, if enacted and enforced. I predict some “walking back” of the initial position articulated, once membership inputs have been received, and airline managements weigh in by preaching fear about the effects of training costs on ticket prices and airplane orders. Too often in the past, such issues have been resolved with “hazard pay”, as in, “If we increase pay, rather than doing more training, does the union still think it is unsafe?”
Finally, an increasing number of regulating authorities are ever more subject to the political influence inherent with a handful of worldwide airframe, engine, avionics and component manufacturers having enormous financial effects on national economies, international balances of payments, and employment. I expect to see some regulatory fluttering about the dovecote as a result of QF032, followed by a nonsensical regulatory response that mostly misses the mark, but sounds good to the uneducated media and public. After that, I predict the industry will return to the comfortable assurances by all concerned that any real issue can be easily solved by simply increasing the levels of available automation.
“The Automation Paradox” was first published in the February 2011 Issue (Vol 8 No 1) of Position Report magazine.