The Performance Margin Game


By: Mark H Goodrich – Copyright 2015

Few topics can generate more beer bets than aircraft performance. Thousands of cockpit voice recorders are filled with hours of argument on the subject, and new airline procedures reflect monthly victories of tribal lore about performance over the laws of physics. Many in the aviation business, from pilots to managers, regulators and investigators, have the mistaken notion that performance is a matter of “black and white”. To their perception, performance values and limitations are exact, defined across the industry using common protocols, and represent dependably available capabilities. The facts are quite different.

Like safety in more general terms, performance is a science of margins, and not of precise values. In any operation under a given set of conditions, the variables are too many to allow for the precise prediction of limitations or performance available. From the variations in lift and drag between airframes, thrust produced from one engine to the next, the myriad factors that affect air density, actual performance available from autoflight systems and human operators, and the presumed strength of airframes as manufactured, hundreds of assumptions underlie every limitation and performance value presented. The idea behind margins is that errors within the process and inabilities to define actual values with precision will not result in disaster, but rather only infringe upon a margin. To extend the logic, a problem occurs when the original performance margin was insufficient, and infringement events stack up to ultimately reduce the margin to zero.


As published, limitations and values mostly represent mathematical predictions. In some cases such values are only predictions, and in others are extended from test results with a small sample of exemplar coupons, components or aircraft-engine combinations. Whether purely predictive or extended from tests, a number of presumptions are at the root of the process. Because tests are always conducted under a particular set of conditions, it is then necessary to mathematically adjust the results to what would likely have been the result if tested under a “standard condition”. This process is often akin to a “secret handshake” – that is, with its details known to insiders, but not available or explained in ordinary language to the dispatchers and crews that will use the data in day-to-day operations.

Additionally problematic are the myriad ways in which errors are introduced to the process of establishing performance values, from simple calculation errors, to the far more sinister situations where the pressures of marketing, finance or contract deadlines result in the purposeful misrepresentation of calculation or test results. Even before the “age of computers”, alteration of a formula or data buried within reams of test reports was likely to escape notice, and easily explained later as a simple oversight. But, it has now become all too easy to use the alteration of algorithmic data as a means of skewing computer-generated results in the calculation of limitations or performance capabilities.

The increasing trend towards expanding “self certification” by manufacturers increases the likelihood that the ultimate discovery of such errors – whether accidental or purposeful in their origins – will lag far beyond the dates of certification and reliance by airline companies and crew upon the values provided. In far too many cases, that discovery will occur within the context of a crash investigation, or an airworthiness directive review board inquiry, as has always been the case.

While minimum certification standards may provide for a presumed buffer margin of 33% or 50% in a particular parameter, errors in the underlying presumptions can mean a limitation or performance margin is actually much smaller from the outset. The significance of that includes that such margins are often eroded over time by incremental allowances of heavier weights, higher speeds and longer airframes as fuel system capacities and payloads become larger, and more powerful engines are installed. In the cases of two different high-performance single-engine types, it was incremental increases in aircraft length, weight and speed over more than a decade that reduced erroneously established certification margins to the point where tails began to fail in service.
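The way an overstated margin and incremental growth combine can be sketched with a few lines of Python. Every number here is an illustrative assumption rather than certification data: a nominal 50% margin, an underlying capability that is actually 15% lower than presumed, and five successive 5% increases in demand (weight, speed or length, normalized to a single figure).

```python
def remaining_margin(actual_capability, demand):
    """Margin expressed as a fraction of the current demand."""
    return (actual_capability - demand) / demand


def erode(nominal_margin, capability_error, growth_steps):
    """Track the real margin as demand grows in increments.

    nominal_margin   -- margin claimed at certification (0.50 means 50%)
    capability_error -- fraction by which capability was overstated (assumed)
    growth_steps     -- fractional demand increases applied one after another
    """
    demand = 1.0                                   # normalized demand at certification
    claimed = demand * (1.0 + nominal_margin)      # capability on paper
    actual = claimed * (1.0 - capability_error)    # capability actually present
    history = [remaining_margin(actual, demand)]
    for step in growth_steps:
        demand *= 1.0 + step
        history.append(remaining_margin(actual, demand))
    return history


# Hypothetical inputs, chosen only to show the stacking effect.
history = erode(0.50, 0.15, [0.05] * 5)
for increment, margin in enumerate(history):
    print(f"after {increment} increments: {margin:+.1%}")
```

Under these assumed figures the real margin starts at 27.5% rather than 50%, and falls slightly below zero after the fifth increment, even though no single step looked dramatic.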

In other cases, the performance game can be a manifestation of corporate influence over the political process, and the irresistible temptation for politicians to satisfy their corporate sponsors by meddling in the detailed worlds of science and engineering. In the early 1970s, one such manufacturer sought to import, certify and sell its business turboprop, only to learn that the airplane could not meet the minimum performance requirements of even Part 23 for normal category airplanes. The regulator was bombarded by lobbyists and legislators arguing that competitive products had been “grandfathered” as follow-on models under standards in effect on May 15, 1956. Despite that those airplanes did not suffer the same deficiencies in performance capability, the regulator buckled and allowed certification under the minimum standards that were effective some two decades earlier, and which had been long superseded. Pilots were not informed of this process, and the performance data provided failed to adequately inform, instruct or warn that the ability to execute a balked-landing, a go-around or a missed approach procedure was non-existent when using the recommended configurations and speeds in operating data for approach and landing. After another twenty years of crash scenarios, the “many people or one famous person killed” standard was finally satisfied, and special rules were issued for training and currency to operate the type, as though somehow that would be sufficient to overcome the laws of physics measured against a poor design. It is probably redundant to note that the loss experience continues, and will until the entire fleet has been retired or lost to accident.


In another case of political machinations, carriers with political friends wanted part of the lucrative contracts to transport troops to and from Southeast Asia, and to further increase the number of passenger seats on one transport category type by almost 30% over its certified capacity. That the airplane had not been designed for that contingency was seen merely as a documentation problem to be overcome without unnecessary resort to re-certification testing and exposure to those pesky laws of physics. From the number of onboard lavatories and emergency exits to the minimum performance ostensibly required by regulations, mathematical projections about the modest effects of the expanded payload appeared, demonstrating that just enough margin in both limitations and performance was available. With no powerplant change, “war emergency power” was approved for the government contract missions, essentially eliminating the original limitation on the time over which takeoff thrust could be maintained, solving field length, tire and brake heating and climb gradient issues with a single stroke of the pen. When the need to transport troops was no longer extant, operators wanted to continue using the expanded passenger capacity for commercial transport, and once again the political process was up to the task. While senior managers at the regulator did not believe the term “war emergency power” could be approved for operations under Part 121 – at least not while maintaining a straight face – a simple name change to “-1A power” solved that problem of perception, and once again, the evaporation of original limitation and performance margins was accomplished.


In some areas, performance is a “net sum” equation that may allow for a trade of degraded margin in one performance parameter in order to increase the margin in another. Giving up a balanced field capability during takeoff – that is, the ability to accelerate-stop or accelerate-go with a critical engine failure from the decision speed – may allow for additional weight to be carried, but under penalty of losing the ability to stop, go or both in the event of a power failure. The problem is defining just how much has been lost and gained.

Similarly, what are the offsetting parameters when engines fail to produce the minimum presumed levels of thrust, the empty weight of the aircraft differs from the documented values because of residual paint or an error in the last weighing event, or the payload and mean aerodynamic chord calculations are in error because the approved loading schedule fails to accurately address the average weight of passengers and carry-on luggage? Management may have been pleased when it was able to convince the regulator that an average summer weight for each passenger and carry-on was a mere 73 kilograms, but the reality on a charter with a full-load of senior citizens and large bags might well include a five tonne error on that assumption alone. In the event of an aborted takeoff, is the ability to stop possible if tires and brakes have not been maintained in conformance with the assumptions made in the certification calculus, and since the crew is without information as to just what parameters were used in certification, how would they know in any event? Even more esoteric, what if high humidity for takeoff is outside the presumed relationship between temperature of the air and its moisture content upon which certification assumptions were based – is a correction necessary, or will the margins built into the certification process be sufficient to provide the necessary buffer?
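The passenger weight problem is simple arithmetic. A minimal sketch follows, in which only the 73 kilogram figure comes from the text; the 300-seat cabin and the 90 kilogram actual average are hypothetical values chosen for illustration.

```python
APPROVED_AVG_KG = 73.0  # approved average per passenger with carry-on (from the text)


def payload_error_tonnes(passengers, actual_avg_kg, approved_avg_kg=APPROVED_AVG_KG):
    """Unaccounted payload, in tonnes, when the real average exceeds the approved one."""
    return passengers * (actual_avg_kg - approved_avg_kg) / 1000.0


# Hypothetical charter: 300 passengers averaging 90 kg each with carry-on.
error = payload_error_tonnes(300, 90.0)
print(f"{error:.1f} tonnes")  # prints "5.1 tonnes"
```

A difference of 17 kilograms per seat is enough, on these assumptions, to put roughly five tonnes aboard that appear nowhere on the load sheet.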

Some might believe the fundamental question is how to calculate the loss of performance margins, or the level of performance still available once the letter of the performance charts has been exceeded. The better question may be whether operating within the published data actually gives one the performance expected, or even that purportedly guaranteed by the certification process, when observing all limitations and operating to the letter of the performance charts. Unfortunately, the most accurate answer is “maybe”. As an example of this distressing fact in operation, consider that airlines dispatch with inoperable thrust reversers, even with minimum runway lengths for takeoff. Regulations do not allow the use of thrust reversers for calculation of stopping distances, so the presumption in the minimum equipment analysis is they are unnecessary. Yet, takeoffs are aborted most often not due to engine failure, but tire failure resulting in wheel and brake fires during the stop. Is the braking effectiveness presumed in certification available under those circumstances?

A couple of decades ago, certification testing to establish landing distances was routinely conducted in the early morning to ensure no wind, gusts or turbulence, and flown by engineering test pilots who could hold the speed at V-ref plus or minus 0 knots. After each stop, new wheels, tires and brakes were installed. Test pilots with the temerity to ask just how such testing produced data of any real world value were told to be “team players” unless they would be happier working somewhere else. Predictably, overrun accidents occurred in service and analysis led back to the reality that airlines do not replace wheels, tires and brakes after every landing. In point of fact, they also do not cancel flights because of modest turbulence, and do not restrict flight crew hiring to engineering test pilots. Thus, the procedures used to establish the flight testing protocols – set forth within interpretive guidance and not substantive regulations – were upgraded to require that brakes used for those certification flights should be worn to the levels recommended in the maintenance and inspection documents for replacement. Nothing about tire wear or crew skill was altered.

The above exemplar is not highlighted as a special case, but rather to establish the greatest set of variables in certification testing – that is, the variance between substantive federal regulations with the force and effect of law in an area of federal interest, concern and regulation on one hand, and the interpretive guidance that actually governs how the substantive regulations are interpreted and applied in practice. The substantive regulations, including a great deal of specific engineering mathematics, must be revealed to the public through publication. A period for public comment must be provided, and care taken to ensure conformity with comparable regulations under the regulatory schemes of other nations. Ultimately, a “final rule” is issued, published and ostensibly defines the common set of minimum standards to which aircraft, engines and propellers will be certified.


But in reality, it is interpretive guidance that most often defines certification standards. In some cases, interpretive guidance is published in the form of advisory circulars, but in many cases, it takes other forms, such as internal memoranda, waivers, special conditions, equivalent level of safety determinations and letters of agreement. Too often it takes the form of a wink and a handshake, leaving no record by which the compromise to a certification margin may be later traced. The regulator is not required to notify or consider comments from the public regarding interpretive guidance, and may alter such guidance at will. Even published interpretive guidance is often quite different than the substantive regulation it purports to interpret, and the unpublished often correlates only insofar as it is completely distinguishable.

One recurring issue with performance-related certification is that the analysis is often more subjective than objective, requiring uncompromised integrity of the evaluator if the goals of the regulatory standards are to be realized in service. For example, engineering test pilots deal with the regulatory requirement that stall warnings must be “clear and distinctive”. The airplane must be “free from excessive vibration”. Operation of an aircraft must not require “exceptional piloting skill, alertness or strength”. Just what constitutes “clear”, “excessive” and “exceptional” is left to the discretion of the test crews employed by the manufacturer for self-certification schemes, and in other cases, overseeing regulatory personnel. Even with specific standards – such as allowable pounds of force required on a control or number of seconds before recovery is initiated – most test pilots have had experiences where manufacturing company management or regulatory personnel have “ordered” certification in the absence of actual compliance. An engineer or pilot who elects to whistle-blow such events will, following his highly publicized receipt of honors for integrity, be fired on some trumped-up charge and black-balled from further employment within the industry.

In the 1970s, a business jet manufacturer was confronted with a competitive issue, in that other jets could land on runways some 1,500 feet shorter. Engineering solutions included changing the airfoil – which would allow slower approach speeds but sacrifice higher speeds in cruise – or adding leading edge devices and more effective trailing edge flaps. Political solutions included arguing to legislators that bankruptcy and loss of manufacturing jobs would result if special waivers were not granted from the minimum standard that required use of a 1.3 g buffet margin for the final approach speed. Legislators talked to regulators and soon a stall warning computer was being touted as guaranteeing safety at a 1.02 g margin – the equivalent of eliminating the entire performance margin for approach and threshold crossing. Before long airplanes were crashing with some degree of regularity. Neither the regulator nor the manufacturer – both of which knew all too well what was causing the crashes – wanted to admit the true facts. Investigators were ordered to “slow walk” their work and hide evidence. More crashes ensued. Finally, the victims included a member of the United States Senate and his wife, who was killed. The floodgates opened and it was soon revealed that a high percentage of the airplanes had also inadvertently stalled during both certification and production test flights, but the data had been suppressed under directives from management and the test reports falsified. The decades old lesson that performance is a game of margins had been officially ignored… again.
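The effect of moving from 1.3 to 1.02 can be put in numbers. The sketch below assumes the two figures are read as multiples of stall speed, which is the usual convention for approach speeds but is an interpretation of the text, and it uses a hypothetical stall speed of 100 knots. The energy comparison relies only on kinetic energy scaling with the square of speed; it is not a landing distance calculation.

```python
def approach_margin(stall_speed_kt, factor):
    """Return (approach speed, margin above stall) in knots for a given factor."""
    approach = stall_speed_kt * factor
    return approach, approach - stall_speed_kt


VS_KT = 100.0  # hypothetical stall speed in the landing configuration

std_speed, std_margin = approach_margin(VS_KT, 1.30)
red_speed, red_margin = approach_margin(VS_KT, 1.02)

# Fraction of the original speed margin that has been given away.
margin_lost = 1.0 - red_margin / std_margin

# Kinetic energy to dissipate at touchdown, relative to the 1.3 case.
energy_ratio = (red_speed / std_speed) ** 2

print(f"margin above stall: {std_margin:.0f} kt -> {red_margin:.0f} kt")
print(f"margin lost: {margin_lost:.0%}, energy to dissipate: {energy_ratio:.0%} of original")
```

On these assumptions a 30 knot cushion becomes 2 knots, about 93% of the margin is gone, and the energy to be dissipated on the runway drops to roughly 62% of the original. That is the commercial attraction and the hazard in a single calculation.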

Some years ago, a modification project was undertaken to certify newer replacement engines for a popular jet transport. The original manufacturer refused to provide source data regarding a variety of issues, including intake airflow volume through the cowlings and associated ducting. Obviously, baseline testing for these issues was an incredibly expensive proposition, essentially requiring that the original certification process be repeated. The modification company offered to make the following assumption regarding induction airflow volumes: “Since no aviation product may be designed and manufactured below the ‘minimum design standards’ of the Federal Aviation Regulations, one can reasonably infer that use of those minimum standards as baseline assumptions is a conservative approach to design.” The regulator agreed.


Unfortunately, when testing began some months later, engine performance quickly revealed induction volume issues, and baseline testing on an unmodified exemplar aircraft was re-established as a requirement. The evaluation revealed the original product to have been certified at far less than minimum design standards, shedding some light on why flight crews had found it necessary over the prior decades to avoid adding takeoff power to the center engine during crosswind takeoffs until some 80 knots of airspeed had been reached, in order to avoid compressor stalls. In the context of whether takeoff and runway analysis data was valid as published, what was the effect of delaying power application on literally thousands of takeoff events over the years, and what percentage of overruns after aborted takeoffs resulted from the original failure to meet published certification standards?

The war in Southeast Asia was raging. Helicopters were being lost and replaced at a fast pace, and a well-connected manufacturing executive wanted to cash in on the deal before peace broke out. Military purchase contracts required aircraft to meet both the military flight test standards, and those for civilian certification. Both required a minimum of three seconds between a loss of engine power at 80 knots in cruise, and rotor speed decay below the minimum revolutions that would allow for a successful autorotation. Flight test revealed the actual time for crew response to be one second. In short, an engine failure while the pilot reached to tune a radio would be non-recoverable before he could move his hand back to the collective control lever. There were several possible engineering solutions, but all involved the expenditure of money and time delays before production could begin. The political solution was adopted and the regulator’s manager of certification ordered his test crews to falsify the test results. One year later the helicopter type at issue represented 17% of the training fleet, and some 95% of the fatal accidents in training.
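The disproportion in those two percentages is easier to appreciate as a ratio. The sketch below assumes that exposure (hours or sorties flown) is proportional to fleet share, which the text does not state, so the result is an order-of-magnitude illustration only.

```python
def overrepresentation(fleet_share, accident_share):
    """How many times its proportional share of accidents a type accounts for."""
    return accident_share / fleet_share


def relative_rate(fleet_share, accident_share):
    """Accident rate of the type divided by the rate for the rest of the fleet.

    Assumes exposure is proportional to fleet share (an assumption, not source data).
    """
    type_rate = accident_share / fleet_share
    rest_rate = (1.0 - accident_share) / (1.0 - fleet_share)
    return type_rate / rest_rate


share_ratio = overrepresentation(0.17, 0.95)
rate_ratio = relative_rate(0.17, 0.95)
print(f"{share_ratio:.1f} times its proportional share")
print(f"about {rate_ratio:.0f} times the rate of the rest of the fleet")
```

With 17% of the fleet and 95% of the fatal accidents, the type carried roughly 5.6 times its proportional share, and under the stated assumption its fatal accident rate was on the order of 93 times that of every other type in the training fleet combined.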


Transport category airplanes of the same model, and even within the same manufacturing block, may perform quite differently from one to the next. Who among us has not said, “I like that one because it flies better than the others”? Weight varies significantly on even new airplanes, and manufacturers guarantee empty weight to be only within a range of several tonnes on wide-body types. In addition, less than perfect alignment for the various details of the aerodynamic surfaces, and for those features that result in parasite drag, can result in significant variations between individual airplanes even before the repair that results from a maintenance adventure with wing-tip modification. Indeed, asymmetric adjustment of flaps and other control rigging is routinely used to correct airframe alignment issues as a part of the post-production flight test protocols, by definition increasing drag.

Both Airbus and Boeing make provisions within their respective flight management systems to correct for some deviations from nominal aircraft and engine performance. On Boeing airplanes, corrections for both “airframe drag factor” and “fuel flow factor” may be entered, and on Airbus types, a single correction for “performance factor” is available. In each case, the correction is translated to modify only the predicted effect on fuel burn. There is no correction applied to other performance parameters affected by deviations in the airframe drag values, despite that application of the configuration deviation list (CDL) requires weight corrections for a wide variety of airframe anomalies that may be far less significant to drag effects. In both cases, the values inserted are based on historical data measured over time. There are no corrections for performance deficiencies caused by a floating spoiler panel, an incorrectly rigged gear door or an engine power issue that has not yet been discovered during the normal inspection procedures. Some flight management systems incorporate a “cost index” feature that allows for an automatic application of the balance between time enroute and fuel burn without need to consult performance charts. Indeed, we have entered an era where dispatch personnel establish fuel loads and performance parameters from digitized data – that is, data resulting from the certification process with all of the vagaries and unknowns discussed above – and provide a single dispatch and release document upon which the crew relies exclusively as a reference for performance beyond the data and calculations within the onboard computers.

One part of what has been compromised is the ability of the flight crew to notice, inquire and correct a human error in the dispatch process. Just as important is the inability of the crew to use performance charts to optimize current performance under the variables of altitude, temperature, weight and speed. The computation of “cruise performance weight” for a particular airplane under existing conditions corrects for variations in aerodynamic, engine and atmospheric factors. This procedure was once a staple on long-haul flights, using raw data charts to track the speed and fuel flows at a given altitude and temperature, and then backing out to the airplane weight that should, for a “standard airplane”, result in those performance values. Despite that release data might show a calculated weight of 320,000 kilograms, a derived cruise performance weight of 360,000 kilograms would inform the crew that their airplane, on the current day and under the current conditions, would perform in cruise as if 40,000 kilograms heavier than calculated.
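The backing-out procedure can be sketched as a reverse table lookup. The chart values below are hypothetical, invented only so that the example reproduces the 320,000 and 360,000 kilogram figures in the text; a real chart would be specific to type, altitude, temperature and speed.

```python
from bisect import bisect_left

# Hypothetical "standard airplane" cruise chart at one altitude, temperature
# and Mach number: (gross weight in kg, total fuel flow in kg/h).
STANDARD_CHART = [
    (300_000, 9_000.0),
    (320_000, 9_500.0),
    (340_000, 10_000.0),
    (360_000, 10_500.0),
    (380_000, 11_000.0),
]


def cruise_performance_weight(observed_fuel_flow, chart=STANDARD_CHART):
    """Weight at which a standard airplane would show the observed fuel flow."""
    flows = [flow for _, flow in chart]
    if not flows[0] <= observed_fuel_flow <= flows[-1]:
        raise ValueError("observed fuel flow is outside the chart")
    i = bisect_left(flows, observed_fuel_flow)
    if flows[i] == observed_fuel_flow:
        return float(chart[i][0])
    # Linear interpolation between the two bracketing chart rows.
    (w0, f0), (w1, f1) = chart[i - 1], chart[i]
    return w0 + (w1 - w0) * (observed_fuel_flow - f0) / (f1 - f0)


release_weight = 320_000.0                      # calculated weight on the release
cpw = cruise_performance_weight(10_500.0)       # observed fuel flow in cruise
penalty = cpw - release_weight
print(f"performs as if {penalty:,.0f} kg heavier ({penalty / release_weight:.1%})")
```

In this sketch fuel flow alone stands in for the full set of speed and fuel flow readings a crew would track. The point is the direction of the lookup: from observed performance back to an equivalent weight, which is then compared against the release figure.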


Those who remember the hours of study devoted to preparation for the old Performance A Examination will recall their trek through unfolded graphs some eight pages wide, as a single pencil line moved up, over and down, establishing v-speeds, safety speeds, climb gradients and obstacle clearance data. The width of that fine pencil line was often three knots when a spread of five knots was the difference between climbing and descending during an engine-out second segment climb. The process, both in training and in operational use, brought the reality of the performance margin game into sharp relief, and resulted in flight crews introducing corrective margin increases into their operating decisions.

While it is true that far more powerful engines and reduced spool-up delay intervals are now available to “save the day”, I fear the younger pilots of today have far too much respect for the accuracy of published performance data, and a reliance upon computers that will cause them to finally appreciate the performance margin game only as the windshield fills with a deteriorating situation that they do not understand, and have not been trained to anticipate or avoid. Watch this space.

Plus ça change, plus c’est la même chose. – Jean-Baptiste Alphonse Karr (1808-1890).

Mark H. Goodrich – Copyright © 2015

“The Performance Margin Game” was first published in the October 2015 Issue (Vol 12 No 3) of Position Report magazine.