Achieving High Reliability in a Complex Work Environment

BY MAT FRATUS

How do you deal with errors in your organization? Do you draw attention to them, or do you downplay their significance? Do you accept errors as unavoidable side effects of the complex nature of emergency operations, or do you make every attempt to eliminate them completely? Can anyone in the organization point out actual or potential errors, or are such observations generally reserved for higher-ranking members? When errors do occur, do operations seem to unravel, or do things seem to quickly recover without a significant loss in efficiency?

Answering these questions honestly can give valuable insight into how effectively your organization will respond when errors occur. These questions become particularly important in the context of the fast-paced, high-risk, and sometimes unpredictable environment we have come to accept as a normal day at work. Although the type of work may be the same, each organization’s approach to error management and risk tolerance will be different. This may explain why some organizations seem to experience higher error rates while others seem to operate relatively error-free over a long time, even though both are operating in similar complex environments.

HIGH-RELIABILITY ORGANIZATIONS

A growing body of research has revealed specific characteristics that appear to be common among organizations that have the lowest error rates. They include seeking out and addressing errors; being resilient when errors occur; placing decision-making responsibilities on those most qualified to make them (regardless of rank); and valuing input from all levels of the organization, particularly input suggesting that things may not be going as planned. Organizations that exemplify such characteristics maintain lower error rates and have been identified as “high-reliability organizations,” or HROs.

Weick and Sutcliffe (1) summarized these characteristics into the five hallmarks of HROs: a healthy preoccupation with failure, a reluctance to simplify interpretations, a sensitivity to operations, a commitment to resilience, and deferring of critical decisions to those who have the highest level of expertise in the issue at hand.

Preoccupation with Failure

At first glance, this may appear to represent an overly pessimistic organizational point of view. However, consider what the opposite looks like—a preoccupation with the organization’s past successes. When it comes to evaluating an organization’s success, HROs will give you the same advice any good stockbroker would: Past successes are not, by themselves, a good predictor of future performance.

However, unlike the stockbroker, HROs believe that there is value in the failures that have occurred as long as the organization is willing to face them head-on, because failures, even small ones, are symptoms of potentially larger problems with broader systems. Therefore, HROs emphasize observed failures and act to mitigate them early, while they are still small.

The organization engages in frequent and sometimes even brutal self-analysis. Do not confuse this with beating yourself up over your errors. On the contrary, HROs reject the notion that errors can be eliminated in complex operations such as those in which we engage. Therefore, the most effective thing organizations/departments can do when errors occur is to seek them out, expose them, and share them in an organizational learning environment.

This concept is not new to the fire service. Programs such as the National Firefighter Near-Miss Reporting System are good examples of this concept in practice. To maximize effectiveness, however, even small errors must be analyzed and appropriate adjustments made to operational protocols to prevent the errors from reoccurring. We must also encourage all organizational members, regardless of rank or status, to “bring us the bad news” when things didn’t go as we’d hoped. This feedback is like gold, because it comes from the people who likely will be the first ones to spot errors or potential errors. This head start in error detection is a distinct advantage HROs typically enjoy.

Reluctance to Simplify Interpretations

By their nature, HROs normally operate in complex environments. Therefore, they understand that the complexities associated with their operations are not compatible with simplified approaches. We see and hear examples of simplified approaches during emergency operations, particularly in near-miss or after-action reports. Statements such as “It seemed like just an ordinary house fire” and “I didn’t think it was a big deal” when referring to some anomaly that occurred at an incident often reflect a mindset of oversimplification. (See “It’s Not Always That Simple.”)

Such oversimplification can create a belief that events will evolve in a predictable, positive way, reducing the need for skepticism or contrary input. This strengthens a perception of infallibility, which may then lead to increased risk taking and a propensity to marginalize indicators that the operation may be heading for disaster. Collectively, this delivers a sharp blow to situational awareness.

In many organizations, such skeptical or contrary opinions regarding tactical decisions are not highly valued and are even discouraged. This promotes an environment of tacit compliance instead of an open communication stream that makes use of the collective observations and experiences of those working on an incident. HROs take deliberate steps to encourage and reward members who present diverse opinions, challenge conventional wisdom, and emphasize subtle anomalies as operations unfold. They emphasize a culture of trust, speaking up, and respectful interaction across all organizational layers, and they nurture that culture in both emergency and nonemergency settings.

The question that frequently comes up regarding this concept is how incident managers maintain the balance between evaluating the merit of dissenting points of view and keeping emergency decisions timely. Obviously, this is not the time for a committee process, but you don’t want to overlook what may be vital information, either. The key to success here likely is the process used to determine which skeptical views should be used to modify operational plans and which should be dismissed. Either option may be appropriate depending on the situation, but the approach must be structured to ensure timeliness and to clarify what people can expect when they suggest there is a flaw in the current plan.

When approached with a dissenting view, incident managers must ensure that they listen thoroughly and provide feedback for or against the alternative view in a professional and respectful manner. To do otherwise may make personnel apprehensive about presenting their views in the future.

Sensitivity to Operations

HROs emphasize what’s going on at the front lines and empower their front-line workers. By maintaining a vantage point that encompasses all components of the organization’s operations, leaders in HROs are able to stay in touch, detect “latent errors,” and take steps to correct them before they grow to catastrophic proportions.

Simply put, latent errors are operational deviations that put in place most, if not all, of the components of a potential failure, although the failure itself has not yet occurred. On the surface, it appears that everything is okay. Therefore, the organization can be lulled into thinking that the deviation is insignificant. However, in the complex system of emergency operations, even small deviations can weaken our defenses against failures. In describing such latent conditions, James Reason (2) uses the analogy of slices of Swiss cheese to illustrate the potential for relatively small errors to “line up” to produce larger failures.

According to Reason, key elements consisting of safeguards, defenses, and barriers create layers that provide for safe operations. When each layer is intact, it creates a strong resistance against failure. However, these layers are rarely perfect. Deficiencies can develop in different areas of any layer for a variety of reasons. Even so, a deficiency, or hole, in any single layer (or slice) would usually not result in a catastrophic system failure. However, the location and nature of these deficiencies are not static. As these “holes” in the individual layers move, occasionally they will line up, allowing for a significant failure in multiple layers of the safety system (Figure 1).


Figure 1. Although the multiple layers of defenses can compensate for deficiencies in any single layer, if the deficiencies (holes) should line up, the outcome can be catastrophic. (Illustration by author.)
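The layered-defense idea can be illustrated with some rough arithmetic. The short Python sketch below is not part of Reason's model. It assumes, purely for illustration, that each layer has a fixed chance of having a hole in the critical spot at a given moment and that the layers fail independently of one another. The function name and all of the percentages are hypothetical.

```python
from math import prod


def breach_probability(hole_probabilities):
    """Chance that every layer has a hole in the same spot at once.

    Assumes (hypothetically) that layers fail independently, so the
    individual chances simply multiply together.
    """
    return prod(hole_probabilities)


# Hypothetical figures chosen only to show the effect of layering.
one_layer = breach_probability([0.10])
three_layers = breach_probability([0.10, 0.10, 0.10])
# Same three layers, but two have quietly degraded over time.
degraded = breach_probability([0.10, 0.50, 0.50])

print(f"one layer:       {one_layer:.3f}")     # 0.100
print(f"three layers:    {three_layers:.3f}")  # 0.001
print(f"degraded layers: {degraded:.3f}")      # 0.025
```

Under these assumed numbers, three intact layers make a complete breach 100 times less likely than one layer alone, while letting two of the layers degrade unnoticed makes it 25 times more likely again. In real operations the layers are rarely independent, which is one more reason to keep watching where the holes are.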

Being sensitive to operations, the organization obsesses about the nature and location of the holes. Because HROs accept that errors will occur, they try to minimize their impact by constantly observing where they are occurring, why they are occurring, and how to keep them from lining up. They do this by keeping in touch, digging for honest feedback, making timely adjustments, and encouraging organizational members to become familiar with operations beyond their own job. Because the cause and effect of some system failures can develop quickly, HROs encourage front-line workers to be vigilant and give them broad latitude to take quick and appropriate corrective action when they detect potential failures.

Commitment to Resilience

Although past experiences in emergency operations provide us with insight on how events are likely to unfold, nobody has a crystal ball. No two incidents are identical. Unexpected events occur; many times, they are accompanied by unfavorable outcomes. HROs understand that it is nearly impossible to predict and, therefore, prepare for every potential unplanned event that can occur in complex environments. In fact, HROs believe that focusing too much on the predicted outcome of events may cause the organization to develop a false sense of control. The organization begins to believe that it can predict all the possible endings and mitigate unexpected events with specific response plans. In practice, however, its tightly fixed plans and procedures lack the flexibility needed for successfully dealing with the pace and complexity of such events.

To prepare their personnel to deal with unplanned events, HROs advocate frequent training that encourages personnel to look beyond typical approaches and outcomes. Such training includes scenarios where things don’t go as expected, people don’t respond as expected, and the “normal” way of getting things done doesn’t work. The value in this type of training is that people learn to adapt to events in different ways and become familiar with operations beyond their specific job title.

By practicing in this environment, organization members develop keen observation and improvisation skills. Because the training purposely does not follow a linear pattern, members begin to accept and deal with events that come at them in unpredictable ways. They begin to view their skill sets as tools that can be mixed and matched in unconventional ways so they can be effective during unconventional challenges. They also develop effective communications skills that make the most of other team members’ skills, observations, and knowledge. When an organization has collectively developed these skills of resilience, it is more likely to adjust rapidly and effectively to unexpected events as they unfold.

Deference to Expertise

As the public’s expectation of the fire service continues to diversify, it becomes increasingly difficult for any one person to develop the level of expertise needed to specialize in every aspect of what today’s firefighters are asked to do. Although we develop a functional knowledge of the multiple dimensions of our job, we find that in certain situations, some of our personnel will have more experience, training, and insight than others. This expertise, however, may or may not come with a higher rank or organizational status.

To make the most of their organization’s talent and resources, leaders in HROs select key decision makers based on their expertise in the situation at hand, not necessarily based on their rank. As a result, these personnel are substantially more effective in dealing with rapidly evolving complex situations. In contrast, organizations that select key decision makers based on rank or status alone become slow, inflexible, and ineffective during times when such characteristics can cause the collapse of key operations.

Deferring decisions to our resident experts improves operational efficiency in two ways. First, it helps ensure that a person who may hold rank but lacks the needed expertise does not become the key decision maker. (3) Second, because people on the front lines are given the latitude and authority to make the decisions, they are more likely to do so in a timelier fashion than if they were required to move the decision up through one or more layers of the organization. (4) This allows them to implement tactical plans based on where the incident is now, not where it was 10 minutes ago.

Some may view the idea of deferring to expertise as a breach of the traditional fire service command structure. On the contrary, policy makers still make policy, and organizational leaders still retain control over the organization. Roles, procedures, and lines of communication are still clear, but they are enhanced by the flexibility needed to accommodate rapidly changing or unusual conditions. The unique difference in organizations that practice deferring to expertise is that their leaders use their formal authority to employ all available resources in achieving broader objectives. They refrain from micromanaging areas that can be effectively managed by other highly qualified people. In doing so, they develop people and systems within their organizations capable of achieving the highest levels of effectiveness and efficiency during situations where inflexible organizations struggle.

HRO IN PRACTICE

Currently, HRO concepts are being effectively used to address issues in patient safety and health care, commercial aviation, critical infrastructure management, aerospace, and airport security, to name a few. Closer to home, we find that the United States Forest Service has made great strides in using HRO concepts to improve firefighter safety in the wildland environment. Collectively, these organizations have found that they can enhance the efficiency and safety of emergency operations by implementing HRO concepts into their organizational structure and culture.

For the fire service in general, HRO theory offers a fresh look at risk mitigation that confronts the reality of errors in our fast-paced and complex work environment. Instead of suggesting that we can somehow sterilize our environment through rigid operational protocols, HROs opt for flexible systems that can adjust to the unique challenges of each incident. When errors occur, HROs facilitate organizational learning through fact finding and adjusting, not fault finding and punishment.

At the same time, HROs can decrease error rates by staying connected to daily operations and looking for potential failures before the failures occur. They value the diverse views and observations of organizational members and reward them for consistent, honest feedback. They accept the complexities of their working environment and understand that they will not be successful by using simple observations or solutions. Each of these areas represents positive steps toward making fire service organizations more efficient in providing emergency services to their communities while actively providing a safe and rewarding work environment for their members.

References

1. Weick, K., & K. Sutcliffe, Managing the Unexpected (San Francisco: Jossey-Bass) 2001.

2. Reason, J., “Human error: Models and management,” British Medical Journal, 2000; 320:768-770.

3. Roberts, K., K. Yu, & D. van Stralen, “Patient safety is an organizational system issue: Lessons from a variety of industries.” In B. Youngberg & M. Hatlie (eds.), The Patient Safety Handbook (Sudbury, Mass.: Jones and Bartlett Publishers) 2003, 169-186.

4. Rochlin, G., T. LaPorte, & K. Roberts, “The self-designing high reliability organization: Aircraft carrier flight operations at sea,” Naval War College Review, 1998; 51(3): 97-113.

It’s Not Always That Simple

An example of how oversimplifying observations can lead to catastrophic failures can be found in NASA’s investigation report of the space shuttle Columbia breakup. According to the report, foam strikes similar to the one that was identified as a primary causal factor in the Columbia crash had been occurring for more than 22 years without major incident.


Investigators believe that this led to the normalizing of the event—that is, it prompted the belief that it was simply a maintenance issue and did not threaten the success of the mission. In reality, this was a classic example of a latent error; it was just a matter of time before the error manifested itself in a catastrophic way. The report referred to this as “an unfortunate illustration of how NASA’s cultural bias and optimistic organizational thinking undermined effective decision making.” The Columbia Accident Investigation Board further determined that high-reliability theory is extremely useful in describing the culture that should exist in the human space flight organization.

Source: Columbia Accident Investigation Board (limited first printing). National Aeronautics & Space Administration, Washington, DC, U.S. Government Printing Office, 2003, 181.

MAT FRATUS, a 20-year veteran of the fire service, is deputy chief of the San Bernardino City (CA) Fire Department. He is an instructor in incident command, tactics, fireground decision making, and leadership. He has a B.S. degree in fire administration and is enrolled in the Executive Fire Officer Program at the National Fire Academy.
