Modernizing Legacy Systems in Airline Operations Management: Approaches and Best Practices
Roger Hall, an Emirates A380 captain, recently described an incident aboard a commercial A340 passenger flight cruising over Africa, when a sudden bang and vibration signaled to the crew that an engine had failed. After consulting with the airline, the crew decided to return to the departure airport because the destination didn’t have the capacity to repair the failed engine. To further clarify their actions, the crew used ACARS (Aircraft Communications Addressing and Reporting System) to send text messages to the home base, which didn’t respond as rapidly as the crew needed.
The reason for the slow response was that the message traffic system didn’t prioritize text messages sent from the A340 to the ground control team over other incoming signals. Yet text messaging is frequently the more effective practice for an aircraft crew, as it’s less distracting than voice communication via satcom (satellite communications). While the incident didn’t lead to passenger casualties, the case is still alarming. It highlights two problems: the lack of prioritization in the airline message traffic system and, more generally, a software drawback that could have been easily avoided if message priority allocation had been considered.
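To illustrate what message priority allocation could look like, here is a minimal sketch of a priority queue for incoming messages. The category names, priority levels, and sender labels are hypothetical and are not taken from ACARS or any real airline system; the sketch only shows the general idea of serving urgent traffic first while keeping arrival order within the same priority.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical priority levels: a lower number is handled first.
PRIORITY = {"emergency": 0, "operational": 1, "routine": 2}


@dataclass(order=True)
class Message:
    priority: int
    seq: int  # arrival order, breaks ties within the same priority
    sender: str = field(compare=False)
    text: str = field(compare=False)


class MessageQueue:
    """Serves the most urgent message first, oldest first within a priority."""

    def __init__(self):
        self._heap = []
        self._counter = count()

    def push(self, sender, text, category):
        msg = Message(PRIORITY[category], next(self._counter), sender, text)
        heapq.heappush(self._heap, msg)

    def pop(self):
        return heapq.heappop(self._heap)


queue = MessageQueue()
queue.push("gate-12", "Catering delayed", "routine")
queue.push("A340-flight", "Engine failure, returning to departure airport", "emergency")
queue.push("crew-desk", "Roster update", "operational")

first = queue.pop()
print(first.sender, "->", first.text)
```

With this ordering, the in-flight emergency message is read before the routine and operational traffic that arrived around it.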
This story is indicative of the ailing software and hardware in airlines. Early this year, Delta suffered an IT failure that resulted in delays and cancellations of about 280 flights. Low-cost carrier Southwest lost about $54 million after the cancellation of over 2,000 flights due to a technology outage.
The existing mix of legacy and state-of-the-art tech is a well-known issue that cripples schedules from time to time. These recurring failures nudge carriers to accelerate legacy system modernization, which is critical to maintaining a sustainable level of disruption management and, eventually, revenue.
According to Sabre and Forbes’ 2017 research “Shifting the Operational Mindset to Process Integration,” service delivery, flight optimization, and unified technology initiatives are critical to improving the operational performance of carriers.
Source: The Competitive Airline, 2017
So, this time, we will explore operations management software modernization in the airline industry. What should be addressed first, and what are the best practices for optimizing airline software?
We’ll start our review of best practices with fundamental architectural changes and end with airline-specific solutions that should be considered within the modernization strategy.
Airline operations management is a complex ecosystem comprising planning, scheduling, fleet operations control, and management with ongoing analysis that defines changes in further planning. While all these aspects of operations management have conceptually different tasks, the real-time data exchange between them ensures timely reactions to disruptions and ultimately the health of the entire airline infrastructure.
We recommend starting with fundamentals and then scaling to more specific solutions. However, the changes don’t have to be applied in that order.
A cloud-based architecture is considered the main driver of digital transformation across all industries, not only those that deal with air travel and transportation. Because airlines mostly depend on internal mainframes, they not only spend more on IT but also limit infrastructure elasticity during peak moments.
Holiday travel, extreme weather conditions causing mass rescheduling, and popular events create seismic-scale data spikes that increase the risk of failure in reservations systems, crew and fleet scheduling, and baggage management. Cloud computing – especially public clouds that don’t rely on an internal infrastructure – ensures that these peak loads are handled by dynamic allocation of computing power. Airlines thus are able to reduce resource usage during normal loads or valley periods.
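The dynamic allocation described above can be sketched as a simple capacity calculation. This is an illustration under stated assumptions, not a real cloud provider API: the function name, the per-instance capacity, the 20 percent headroom, and the load figures are all hypothetical.

```python
import math


def instances_needed(requests_per_sec, capacity_per_instance,
                     min_instances=2, headroom=0.2):
    """Return how many instances to run for the current load.

    headroom adds spare capacity on top of the measured load;
    min_instances keeps a baseline running during valley periods.
    """
    target = requests_per_sec * (1 + headroom) / capacity_per_instance
    return max(min_instances, math.ceil(target))


# Hypothetical load levels in requests per second.
load_by_period = {"valley": 300, "normal": 1200, "holiday_peak": 9000}

plan = {
    period: instances_needed(load, capacity_per_instance=500)
    for period, load in load_by_period.items()
}
print(plan)  # {'valley': 2, 'normal': 3, 'holiday_peak': 22}
```

An on-premises setup would have to be sized for the holiday peak all year round, whereas an elastic setup pays for the peak only while it lasts.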
Another airline-specific use case for the cloud is real-time data accessibility and management aimed at staff mobility, i.e. the access to crew operations from any location through mobile interfaces. Cloud infrastructure becomes beneficial in such operational aspects as:
- Crew planning
- Crew control and reports
- Ad hoc task assignment and reassignment
- Crew recovery during disruptions
- Instructions and other info retrievals, etc.
Cloud-driven enterprise mobility in these cases ensures real-time coordination between frontline crews, airline management, and control center workers. We’ll discuss the main benefits of enterprise mobility adoption below.
According to 2015 estimates, around 30 percent of organizations have already moved to public clouds. This trend has begun to impact airline IT policies as well. This June, Daniel Henry, an American Airlines vice president, announced a partnership with IBM to move several critical enterprise applications to the cloud and turn them into microservices discussed below. Currently, the carrier mostly focuses on moving aa.com, the kiosk app, and the mobile app to the cloud as a single set of microservices.
One of the cloud migration practices that fits well with airline industry systems is based on the gradual replacement of legacy software:
- Prioritize high-level components that should be moved to the cloud. It’s unlikely that you will attempt to migrate the entire ecosystem to the cloud from the start. We recommend focusing on components that 1) have the most fluctuating load patterns, including reservations, baggage booking, and other traveler-facing applications; or 2) relate to the crew operations discussed above, as they need real-time information exchange. For instance, AltexSoft helped ASL Aviation, an aviation service provider managing a fleet of about 90 aircraft, with cloud migration partly because of the second reason. The ASL employees couldn’t access the information portal and collaboration environment from any location, which forced ASL management to consider the cloud option.
- Review relevant modules. Separate the relevant modules that are of focal importance and/or don’t need rebuilding within the cloud environment. Another good practice is to identify outdated and deprecated modules that should be entirely rebuilt. This will allow for setting a holistic migration roadmap that separates the parts of the code that can be directly transitioned to the cloud from those that need refactoring and modernization.
- Replicate relevant modules in the cloud. By relocating the original code base that isn’t outdated, you’ll be able to preserve a number of functions. This smooth transition allows your staff to start using the software straight away without additional training.
- Rebuild incompatible and outdated modules. With the core functionality already running in the cloud environment, you can restore full functionality by eventually rebuilding the outdated components.
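The four steps above can be sketched as a small triage routine that turns a module inventory into a migration roadmap. The module names and the three boolean attributes are hypothetical; a real inventory would rest on a much richer assessment.

```python
from dataclasses import dataclass


@dataclass
class Module:
    name: str
    fluctuating_load: bool  # traveler-facing, spiky traffic
    crew_realtime: bool     # needs real-time exchange with crews
    outdated: bool          # deprecated or incompatible with the cloud


def migration_roadmap(modules):
    # Step 1: prioritize modules that benefit most from the cloud.
    prioritized = [m for m in modules if m.fluctuating_load or m.crew_realtime]
    # Steps 2-4: review, then replicate what is still sound
    # and schedule a rebuild for what is outdated.
    return {
        "replicate": [m.name for m in prioritized if not m.outdated],
        "rebuild": [m.name for m in prioritized if m.outdated],
        "deferred": [m.name for m in modules if m not in prioritized],
    }


inventory = [
    Module("reservations", fluctuating_load=True, crew_realtime=False, outdated=False),
    Module("crew_portal", fluctuating_load=False, crew_realtime=True, outdated=True),
    Module("payroll", fluctuating_load=False, crew_realtime=False, outdated=True),
]

roadmap = migration_roadmap(inventory)
print(roadmap)
```

Here the reservations module is replicated as is, the crew portal is queued for a rebuild, and payroll stays where it is for now.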
Monolith and microservices architectures
Most airlines today rely on a software architecture known as monolithic, meaning that different components and services are tied together within an application and have strong dependencies on one another. If one component experiences an outage, it impacts a large share of the other functional components. In the case of airline management, a small software failure can lead to catastrophic results, blocking data flow and instant coordination between different business units. For example, a failure in crew planning in the early stages can lead to operational rostering problems during a scheduling period, which further devolves into the disruption ripple effect.
Microservices, on the other hand, is a type of architecture that prioritizes independence of components and atomizes business logic to the simplest services that communicate with each other through REST APIs or enterprise service bus layers. This high level of independence reduces the overall level of system fragility and ensures timely reactions to outages.
Another benefit of the microservices architecture is high business agility. Such architecture allows the engineering team to add new services faster and improve airline efficiency with recent technologies irrespective of the technologies used by other system components. As microservices communicate with each other through APIs, they don’t have to share the same technology stack and consequently allow for wider options, both in terms of choosing the best-fit technology and acquiring engineering talent.
So, how do you approach modernization using microservices?
- Define the monoliths that your airline software has. It’s a rare case for airlines to have a single monolith for the entire scope of operations management. Most likely, the software landscape is already distributed between multiple monoliths dedicated to different operations: maintenance, repair, and overhaul (MRO) processes, aircraft scheduling, the operations control center, etc. Some of these monoliths may be legacy and some relatively new. On top of that, the architecture may be complicated by third-party services embedded through APIs. The initial mapping of the entire software landscape will allow for highlighting the main monolithic parts.
- Adopt the DevOps culture. Microservices architecture is the backbone of engineering agility. However, you won’t be able to achieve it without DevOps, the practice that entails combining frontline IT (operations, including systems engineers, administrators, operations staff, etc.) and a software engineering team. When these two groups of people work as a single unit, microservices allow for rapid adjustments and a streamlined feedback loop.
- Don’t grow the monoliths; apply new functionalities as microservices. While this doesn’t directly impact the legacy monolith, the adoption of the microservices approach in the early stages reduces the further effort of breaking down and refactoring the monoliths.
- Start continuously extracting modules. Once you’ve mapped the modules within each monolith, start consistently extracting them into separate microservices and connecting them through APIs.
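The last two steps can be sketched as a routing layer that sends each operation either to an already extracted service or to the monolith. This is a simplified, in-process illustration: the class names and operations are hypothetical, and in a real system the extracted services would sit behind REST APIs or a service bus rather than plain function calls.

```python
from typing import Callable, Dict


class LegacyMonolith:
    """Stand-in for the existing system; it still handles everything by default."""

    def handle(self, operation: str, payload: dict) -> str:
        return f"monolith handled {operation}"


class Router:
    """Routes each operation to an extracted service if one exists."""

    def __init__(self, monolith: LegacyMonolith):
        self._monolith = monolith
        self._services: Dict[str, Callable[[dict], str]] = {}

    def extract(self, operation: str, service: Callable[[dict], str]) -> None:
        # Register a microservice that takes over one operation.
        self._services[operation] = service

    def handle(self, operation: str, payload: dict) -> str:
        service = self._services.get(operation)
        if service is not None:
            return service(payload)
        return self._monolith.handle(operation, payload)


def crew_roster_service(payload: dict) -> str:
    # Hypothetical extracted microservice.
    return f"roster service returned roster for {payload['crew_id']}"


router = Router(LegacyMonolith())
before = router.handle("crew_roster", {"crew_id": "C-17"})
router.extract("crew_roster", crew_roster_service)
after = router.handle("crew_roster", {"crew_id": "C-17"})
untouched = router.handle("aircraft_schedule", {})
print(before, after, untouched, sep="\n")
```

Callers keep using the same entry point while modules move out of the monolith one at a time, so each extraction can be rolled out, and rolled back, independently.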
Our article on microservices architecture basics will supply use cases and provide guidance on how this approach is adopted, and what its best practices are.
Microservices architecture and cloud migration set the architectural fundamentals of airline software optimization measures. However, modern technology suggests a number of other, more specific practices that should be considered within your modernization strategy.
According to IATA, 97 percent of airlines recognize the need to better manage and internally share data to streamline cross-unit operations. However, the maturity of data management among carriers remains low. Records are siloed and underused, which complicates both ETL (extraction, transformation, load) and BI (business intelligence) operations.
One of the most recognized approaches to data management is to maintain data operations within a three-layer model: foundation – analytics – visualization.
- Foundation. Basically, the foundation aspect is covered by implementing cloud and microservices architecture. They allow for data unification and real-time retrieval for BI operations and cross-unit data exchange.
- Analytics. The current state of data science and machine learning allows for scaling from descriptive to predictive and prescriptive analytics stages. While predictive analytics entails anticipating future events, the prescriptive level defines automation both for BI and current decision-support systems. You may find more details about the approaches to implementing company-wide analytics in our breakdown of data science strategy.
- Visualization. Data visualization defines the accessibility of analytics and the speed at which human operators can react to incoming information.
Data management modernization depends on both engineering effort and internal management changes. Airline management should align software use with the goal of making data accessible and usable in real time:
- Introduce data retention and archival policies that lead to data coherency and allow for further data processing;
- Introduce data governance policies to audit and control data operations.
A growing number of airlines apply BYOD (bring your own device) policies and other enterprise mobility measures to enable unified and real-time crew, scheduling, and rostering updates for frontline employees. For instance, Alaska Airlines has deployed its software on more than 10,000 corporate devices.
The main benefits of such an approach are:
- Instant retrieval of aircraft movement times,
- Constant monitoring of fuel levels,
- Storage of all flight notes in one document repository,
- Real-time debriefing data entered by pilots,
- Reduced paperwork and increased quality of data.
Comprehensive implementation of the enterprise mobility strategy in airlines relies on two main initiatives:
- Cloud migration. The cloud-based architecture provides a back-end environment to instantly source data from multiple business units and make it available through SaaS-level mobile applications. They can be installed on corporate or employee-owned devices.
- Mobile UX engineering. The main challenge here lies less in programming and more in proper UX design. The emphasis should be on making applications that are convenient to use and allow for instant data retrieval.
Automation of asset management
MRO asset management is a set of procedures that track and account for changes in such assets as aircraft and aero-engines. Airline ERP systems usually provide modules to track the asset’s history and align its configuration with safety compliance rules. Another essential problem of asset and configuration management is the reduction of costs associated with maintenance and compliance. Legacy IT systems lack real-time asset tracking. This means that most assets are over-maintained to ensure safety. The adoption of real-time tracking allows for reducing maintenance and overhaul cost while preserving the same level of safety.
The traditional approach to managing the life cycles of aircraft assets and other machinery is semi-automated and partly relies on manual effort due to template-based configuration management and cumulative transaction approaches. Let’s have a look at them and explore the main modernization practices for these types of systems.
Template-based configuration management (CM). Most legacy configuration management systems rely on templates that represent real-life asset builds. Basically, a template is a digital description of an asset’s structure. The practice of using templates entails many challenges, as real assets undergo modifications and the templates must be altered as well. This leads to an unmanageable growth in the number of templates and in the manual effort needed to manage and alter them. The most effective approach to MRO software modernization is to focus on rules-based CM.
Rules-based configuration management. This approach comes as a highly autonomous substitute for template-based CM. As assets undergo changes, they are validated against specific rules shared among all configurations within a model. Rules-based CM allows for higher automation, as it doesn’t require creating an individual template for each as-built asset. Rules-based CM also reduces manual management effort and is more intuitive for MRO personnel to use.
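To make the contrast with templates concrete, here is a minimal sketch of rules-based validation. The asset structure, the model names, and both rules are hypothetical examples rather than real compliance rules; the point is that one shared rule set validates any as-built configuration, so no per-asset template is needed.

```python
RULES = []


def rule(fn):
    """Register a validation rule shared by all configurations."""
    RULES.append(fn)
    return fn


@rule
def engine_count_matches_model(asset):
    expected = {"twin-jet": 2, "quad-jet": 4}[asset["model"]]
    found = sum(1 for c in asset["components"] if c["type"] == "engine")
    if found != expected:
        return f"expected {expected} engines, found {found}"
    return None


@rule
def no_component_over_life_limit(asset):
    over = [c["serial"] for c in asset["components"]
            if c["cycles"] > c["life_limit_cycles"]]
    if over:
        return "life limit exceeded: " + ", ".join(over)
    return None


def validate(asset):
    """Return the list of rule violations (empty if the build is valid)."""
    violations = []
    for check in RULES:
        message = check(asset)
        if message is not None:
            violations.append(message)
    return violations


aircraft = {
    "model": "twin-jet",
    "components": [
        {"type": "engine", "serial": "ENG-001", "cycles": 1200, "life_limit_cycles": 20000},
        {"type": "engine", "serial": "ENG-002", "cycles": 20500, "life_limit_cycles": 20000},
    ],
}

violations = validate(aircraft)
print(violations)  # ['life limit exceeded: ENG-002']
```

When an asset is modified, only its data changes; the rules stay the same and are simply run again.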
Cumulative transactions to track component life cycle. Asset life-cycle management requires a precise understanding of how and when components were changed. This also entails using robust systems for handling the corrections and data conflicts that inevitably happen during changes in asset configuration. Legacy MRO modules use cumulative transaction systems for these purposes. At specific intervals, they take data snapshots of assets. Obviously, this limits the ability to understand the asset configuration in between two snapshots. Another characteristic of the cumulative transactions approach is the hierarchy of parts that represents the as-built assembly. As depicted in the image below, every time there is an update to any of the components, a new transaction is first applied to the top of the structure, and the further changes are cascaded to affect meters on the lower levels.
Cumulative transactions of airframe components. Source: IBM
The drawbacks of the approach become evident every time updates to components must be corrected due to human error, system crashes, or other irregular events. All transactions have to be applied in a specific hierarchical order to precisely reflect real-life events. This puts a heavy load on the database and increases the risk of error. To apply such changes, additional effort is needed to reconstruct the previous hierarchy state and then apply the updates in the correct order again.
Component life accounting. The modernized method doesn’t use a hierarchical structure and applies real-life changes to all meters. This allows for asynchronous asset updates and simple data retrieval for any point in the asset’s life cycle. The real-life nature of component tracking streamlines the overall asset management workflow and sets the ground for simpler extrapolation of future events based on the tracked history. By applying data science techniques, you can make timely predictions of events (e.g. compressor failure) and allocate maintenance effort in advance.
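One possible reading of component life accounting is a per-component ledger of dated usage entries. The sketch below is an assumption about how such a ledger might work, not the design of any particular MRO product; the class name, serial number, and figures are hypothetical. Entries can arrive in any order, corrections are just additional entries, and the meter can be queried for any date.

```python
from collections import defaultdict
from datetime import date


class ComponentLedger:
    """Keeps dated usage entries per component instead of periodic snapshots."""

    def __init__(self):
        self._entries = defaultdict(list)  # serial -> [(day, flight_hours)]

    def record(self, serial, day, hours):
        # Entries may be recorded late or out of order (asynchronous updates).
        self._entries[serial].append((day, hours))

    def hours_at(self, serial, day):
        # Accumulated flight hours of the component up to and including `day`.
        return sum(h for d, h in self._entries.get(serial, []) if d <= day)


ledger = ComponentLedger()
ledger.record("ENG-001", date(2017, 3, 10), 6.5)
ledger.record("ENG-001", date(2017, 3, 1), 8.0)    # reported late
ledger.record("ENG-001", date(2017, 3, 10), -1.5)  # correction of a data-entry error

print(ledger.hours_at("ENG-001", date(2017, 3, 5)))   # 8.0
print(ledger.hours_at("ENG-001", date(2017, 3, 31)))  # 13.0
```

Because the correction is appended rather than replayed through a hierarchy, no earlier state has to be reconstructed to apply it.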
Decision-support systems for disruption management and in-flight optimization
According to the T2RL study, airlines and airports in the US alone lost about 6 percent of revenue due to disruptions (irregular operations) accounting for nearly $6 billion. That’s why 79 percent of airline executives think that efficient disruption management and flight optimization are the main factors in improving the operational performance of airlines.
While we’ve broadly discussed the topic in our disruption management piece, this time we would like to concentrate solely on software modernization measures that can help mitigate the harm caused by irregular operations.
Currently, Airline Operations Control Centers (AOCC) mainly use two types of systems that to one extent or another address disruption management. We’ll also explore the third, the most advanced type of solutions, in the following section.
Database Query Systems (DBQS). Systems of this type can be considered purely legacy solutions in terms of automation and human-support capacities. DBQS allows AOCC human operators to retrieve data on a crew roster, flight scheduling, asset maintenance data, passenger reservations, etc. In technical terms, DBQS are simple to implement as they don’t need profound architectural effort on the back-end and operate within the query-retrieval pattern. Most legacy airline software used in AOCC is based on these simple query systems. Their main drawback is that human operators have to act upon their knowledge considering large volumes of variables simultaneously. In such time-sensitive circumstances as flight disruptions, decision-making becomes highly error-prone if it relies on human experience and intelligence only.
Decision Support Systems (DSS). Unlike the previous type, DSS propose possible solutions to human operators. The most primitive implementations may only suggest rescheduling a disrupted flight, while others can analyze a broad spectrum of data and make intelligent, justified suggestions, even showing the cost of each solution. Such an approach reduces the impact of the human factor, although the final decision is always made by AOCC executives. Currently, most airline-targeted vendors offer this type of system, and it’s considered the best industry practice. The main barrier to DSS adoption is the integrity and connectivity of computerized information systems. These challenges can be solved in part by implementing microservices architectures deployed on a cloud to enable real-time and consistent data flow from different operational units.
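A DSS that shows the cost of each solution can be sketched as a ranking of recovery options. The cost model below (direct cost plus a flat rate per passenger-minute of delay) and all the figures are hypothetical simplifications; real systems weigh many more variables. The operator still makes the final call.

```python
from dataclasses import dataclass


@dataclass
class RecoveryOption:
    action: str
    delay_minutes: int
    passengers_affected: int
    direct_cost: float


def estimated_cost(option, cost_per_passenger_minute=0.5):
    # Hypothetical cost model: direct cost plus a delay penalty.
    delay_penalty = (option.delay_minutes * option.passengers_affected
                     * cost_per_passenger_minute)
    return option.direct_cost + delay_penalty


def suggest(options):
    """Return the options ordered from cheapest to most expensive."""
    return sorted(options, key=estimated_cost)


options = [
    RecoveryOption("delay flight", delay_minutes=90, passengers_affected=180, direct_cost=0),
    RecoveryOption("swap aircraft", delay_minutes=30, passengers_affected=180, direct_cost=4000),
    RecoveryOption("cancel and rebook", delay_minutes=240, passengers_affected=180, direct_cost=15000),
]

ranked = suggest(options)
for option in ranked:
    print(f"{option.action}: {estimated_cost(option):.0f}")
```

In this example the aircraft swap ranks first (6700), ahead of the plain delay (8100) and the cancellation (36600).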
Multi-agent systems for disruption management automation
Automatic or Semi-Automatic Systems (ASAS) reduce human involvement to the minimum by automating most repetitive tasks and the tasks related to finding the best disruption solutions. Eventually, adoption of ASAS can reduce the number of human operators and narrow the human role down to supervising functions. This type of AOCC software can be considered the most advanced, but also the most difficult to implement. One of the realization paradigms is called the multi-agent paradigm.
The paradigm is well described in the 2009 paper Disruption Management in Airline Operations by Antonio J. M. Castro and Eugenio Oliveira from the University of Porto. Multi-agent systems (MAS) address disruption management automation by introducing multiple AI agents, i.e. decision-making modules that can operate autonomously, each within its own domain and environment. These can be Monitoring, Aircraft Manager, Crew Manager, Supervisor, etc. The modules find solutions to the three main disruption management problems: aircraft recovery, crew recovery, and passenger recovery. MAS can react to complex disruptions involving multiple heterogeneous problems. While agents operate independently on the lower problem-solving tiers, they cooperate on the decision-making tier.
Possible architectural implementation of MAS.
Source: Disruption Management in Airline Operations, Antonio J. M. Castro, Eugenio Oliveira, 2009
Such intelligent systems also have their evident drawbacks as they require deep customization to tailor their reactions to each airline. The main benefit of MAS adoption is that it doesn’t require full modernization of the underlying system as the entire multi-agent architecture can be implemented as a wrapping environment, given that the legacy system is fully functional.
The same paradigm can also be applied to scheduling and planning duties.
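The multi-agent idea can be sketched in a few lines: specialist agents each propose solutions within their own domain, and a supervisor merges the proposals into a single plan. This is a heavily simplified illustration and not the architecture from the paper; the proposals and costs are hard-coded, hypothetical values, whereas real agents would query operational data and negotiate with each other.

```python
class AircraftManager:
    domain = "aircraft"

    def propose(self, disruption):
        return [("swap to spare aircraft", 4000), ("wait for repair", 9000)]


class CrewManager:
    domain = "crew"

    def propose(self, disruption):
        return [("call reserve crew", 2500), ("extend duty of current crew", 1200)]


class PassengerManager:
    domain = "passengers"

    def propose(self, disruption):
        return [("rebook on next flight", 3000), ("provide hotel vouchers", 6000)]


class Supervisor:
    """Collects proposals from every agent and assembles one recovery plan."""

    def __init__(self, agents):
        self.agents = agents

    def resolve(self, disruption):
        plan, total_cost = {}, 0
        for agent in self.agents:
            # Each agent solves its own problem; the supervisor picks
            # the cheapest proposal per domain and merges the results.
            action, cost = min(agent.propose(disruption), key=lambda p: p[1])
            plan[agent.domain] = action
            total_cost += cost
        return plan, total_cost


supervisor = Supervisor([AircraftManager(), CrewManager(), PassengerManager()])
plan, total_cost = supervisor.resolve({"flight": "XX123", "cause": "engine fault"})
print(plan, total_cost)
```

Since the agents only need to read from and write to the underlying systems, a layer like this can sit on top of a functioning legacy system as a wrapper.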
Gradual replacement as the best-fit modernization approach for airlines
We consider the gradual replacement approach, according to the Cognizant classification, to be the main type of legacy software optimization that fits the airline industry. What are its main ideas?
- Replacing one or a limited number of components at a time. This ensures low-risk modernization, as the method reduces business impact in terms of budget, staff training, and reliability measures.
- Prioritizing spheres of modernization. Gradual replacement entails starting with the components that generate the most issues. These can be reservations modules, un-automated AOCC operations, or the lack of real-time data streaming across units.
- Decoupling of elements. Because monolithic architectures are common for airlines, one of the priorities is to make their components less dependent on each other by gradually adopting the microservices type of architecture and building enterprise service bus (ESB) and application programming interface (API) layers between decoupled modules.
- Building service wrappers to streamline new technologies. The multi-agent systems and other modern solutions that aid decision-making or data analysis can be built as wrappers around existing legacy systems. The wrapper approach shortens time to market and keeps new systems agile, as they don’t depend too much on the underlying software.
- Cloud migration of mobile-critical components. Mobile accessibility defines how fast your control units cooperate with frontline employees and how the latter cooperate horizontally among each other. By moving mobile-critical components to a cloud, you can achieve the desired level of communication velocity.
In terms of general prioritization, we recommend considering architectural and database changes as the first initiatives as they will serve as a backbone to further software improvements of operations.