
How to Fix NDC and EDIFACT Content Inconsistencies in Flight Distribution

Olga Pereverzieva, Tech Journalist

Airline content distribution today is shaped by a growing network of technology providers — including GDSs, NDC aggregators, direct airline APIs, and third-party platforms.

Because TMCs and OTAs must integrate with many of them, each using different channels, standards, and implementation logic, content becomes inherently inconsistent — fragmented across sources and difficult to unify without specialized normalization solutions.

AltexSoft encountered an air content discrepancy issue while working with many of our clients, including OTAs and leading TMCs. This article, based on our experience, breaks down where and why things go wrong in real-world air content integration and explores practical ways to normalize suppliers' data (and whether AI can realistically help).

EDIFACT vs. NDC vs. LCCs: from structured codes to flexible content

Today’s airline distribution landscape is shaped by three parallel models: GDS content based on traditional EDIFACT protocol, New Distribution Capability (NDC) implementations, and the proprietary systems of low-cost carriers (LCCs).

GDS (EDIFACT). The connection between airline service systems and GDSs runs on EDIFACT, a legacy messaging protocol based on structured text messages. It’s highly rigid by design. Core attributes—such as fare references, seats, and baggage—are structured through predefined formats and standardized fields. This makes the data predictable and consistent across airlines, but also limits flexibility. While OTAs and TMCs don’t consume EDIFACT messages directly, the limitations are still there. EDIFACT was never meant for modern merchandising, which results in a more constrained set of options available through indirect channels.

NDC, introduced by IATA in 2012, relies on XML schemas built for web-based distribution. It enables airlines to sell via third parties in a way closer to their own websites, with rich content (text descriptions, images, and videos), bundled offers, and broader support for ancillaries. While often referred to as a standard, NDC primarily defines message structures and flows, without strictly enforcing how content or business logic should be modeled. So each airline ends up implementing it differently.

Low-cost carriers, historically focused on direct distribution, have built their APIs independently of industry standards—typically using REST/JSON, a newer, lighter web-based format compared to XML. As LCCs increasingly integrate with aggregators or establish direct API connections with large agencies, their content enters the same platforms as NDC and GDS.

Three different content sources naturally lead to inconsistencies. This is further compounded within NDC itself, where implementations vary from one airline to another.

Why is NDC inconsistent?

Before going any further, it’s important to clarify why NDC doesn’t work as a true standard. Although it is commonly referred to as one, in practice, it behaves more like a flexible framework. Structurally, NDC provides a common foundation — similar in intent to EDIFACT — but, as mentioned before, it does not enforce the same level of consistency across implementations.

“The main divide in modern airline distribution is no longer between EDIFACT and NDC: it is between NDC airlines themselves,” says Dmytro Hurkovsky, Business Analyst at AltexSoft.

The inconsistency spreads over several layers.

Airlines interpret the schema differently. A substantial gap often exists between the IATA specification and real implementations. One airline may return all fare families and bundles in the initial shopping response, while another requires additional or non-obvious calls to retrieve the same information. What should be a straightforward shopping flow often becomes less predictable in practice.

Version proliferation adds complexity. More than ten NDC schema versions are in use simultaneously, and airlines rarely migrate cleanly from one to another. Instead, many adopt a hybrid approach, which is a cheaper, simpler way to capture the business value of new releases without a full transition.

Shopping and booking may run on newer versions (e.g., 21.3 or 24.1), while servicing remains on older ones (e.g., 17.2), with an order management system bridging the gaps. This creates inconsistencies not just between airlines, but within a single airline’s own stack.

Implementation depth varies widely. Some carriers support the full booking and servicing lifecycle, while others expose only partial functionality. The uneven automation means that the same operation may work seamlessly with one airline and fail with another, even though both are NDC-enabled.

So, how to convert chaos to order?

The industry players who deal with the air content discrepancies – flight aggregators and large OTAs and TMCs that directly integrate with airlines – have to absorb the complexity of mixed sources and formats.

To present air offers to their customers consistently, they unify various APIs through a normalization layer. This includes a canonical data model that outlines a standardized structure, mapping logic that translates each supplier’s quirks into that model, and adapter layers that handle the specifics of each connection. In most implementations, the resulting data is exposed via a unified API.


How the API normalization layer in flight aggregators is built

Canonical data model built to meet user needs

When integrating multiple airlines, one tempting approach is to take the latest NDC schema as a reference point and map everything — older NDC versions and EDIFACT content — into that structure. In practice, this approach quickly runs into limitations.

“By the time we finish, a new NDC version will probably appear, and we’ll have to rewrite everything again,” says Ivan Mosiev, Solution Architect at AltexSoft, engaged in airline API integration projects.

Instead, the architecture typically starts from the opposite direction: user needs. This means designing an internal canonical model that defines how offers should appear within the system, independent of any external format.

Such a model usually includes core entities such as fare attributes, baggage, seat options, and change and cancellation rules, along with additional fields to preserve flexibility. The goal is to abstract away format-specific differences while still capturing the full range of airline content.

In some cases, this model may resemble an NDC style, organizing content around offers and orders and allowing more flexibility in how information is represented. In others, it may lean toward a more rigid, field-based structure similar to EDIFACT, where attributes are clearly defined and consistently organized.

Crucially, the model must account for missing data. Not every airline supports the same features, and the system must handle these gaps gracefully.

“For example: with one airline you can select a seat, but with another you cannot… In that case, we simply show the user that seat selection is not available during booking,” says Mosiev.

The same applies to post-booking services. Not every carrier supports a full end-to-end servicing flow. Changes, refunds, or ancillary add-ons may not be available via API at all. In such cases, the system must reflect this clearly — for example, by directing the traveler to contact an agent. 
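A canonical model that tolerates gaps can be sketched roughly as follows. This is a minimal illustration rather than the model used in any real project: the class names, field names, and user-facing labels are all assumptions made for the example. The key idea is that "unknown" or "unsupported" is a first-class value (None here), so the display layer can fall back to a clear message instead of failing.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class BaggageAllowance:
    # None means the supplier did not provide the value.
    checked_pieces: Optional[int] = None
    carry_on_pieces: Optional[int] = None


@dataclass
class CanonicalOffer:
    # All field names are hypothetical, chosen for this sketch only.
    source: str                      # e.g. "edifact", "ndc", "lcc"
    airline: str
    price_minor_units: int           # price in cents to avoid float rounding
    currency: str
    baggage: BaggageAllowance = field(default_factory=BaggageAllowance)
    seat_selection: Optional[bool] = None    # None = unknown or unsupported
    changes_via_api: Optional[bool] = None   # None = unknown or unsupported
    extras: dict = field(default_factory=dict)  # airline-specific leftovers


def seat_selection_label(offer: CanonicalOffer) -> str:
    """Return the text shown to the traveler for seat selection."""
    if offer.seat_selection:
        return "Choose your seat"
    return "Seat selection is not available during booking"


def servicing_label(offer: CanonicalOffer) -> str:
    """Return the text shown to the traveler for post-booking changes."""
    if offer.changes_via_api:
        return "Change your booking online"
    return "Please contact an agent to change this booking"
```

An offer built without seat or servicing data, such as CanonicalOffer("ndc", "XX", 19900, "USD"), simply produces the fallback labels.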

Mapping and adapters to translate airline-specific data

Besides the canonical model, the normalization layer includes two key components: mapping and adapters.

Mapping defines how airline-specific attributes correspond to the internal model — for example, aligning an airline’s branded fare name or service label with a standardized concept in the system. In simple cases, this can be implemented as a lookup table, although more complex mappings often require additional transformation logic.
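In the simple case, such a lookup table might look like the sketch below. The airline codes ("XX", "YY"), brand names, and canonical category names are placeholders invented for illustration, not real airline data.

```python
from typing import Optional

# Hypothetical table: (airline code, branded fare name) -> canonical category.
FARE_BRAND_MAP = {
    ("XX", "SAVER"): "basic_economy",
    ("XX", "FLEX"): "flexible_economy",
    ("YY", "LIGHT"): "basic_economy",
}


def map_fare_brand(airline: str, brand_name: str) -> Optional[str]:
    """Look up the canonical category for an airline's branded fare.

    Returns None for unknown brands so they can be flagged for review
    instead of being silently mapped to the wrong category.
    """
    key = (airline.upper(), brand_name.strip().upper())
    return FARE_BRAND_MAP.get(key)
```

Returning None for unmapped names is a deliberate choice in this sketch: an unknown brand is surfaced as a gap rather than guessed.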

To operationalize this, each airline (or group of similar integrations) is handled through an adapter layer. The adapter transforms incoming data into the canonical model. This translation works in both directions:

  • inbound: airline → canonical model (for search and display)
  • outbound: canonical model → airline format (for booking and servicing)

While adapters can share some logic — for example, when airlines use similar NDC versions — they still require configuration and customization for each integration.
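The two directions can be expressed as a small adapter interface. The sketch below assumes an imaginary REST/JSON carrier API; the raw field names (fareKey, totalAmount, curr) and the class names are invented for the example and do not correspond to any real supplier.

```python
from abc import ABC, abstractmethod


class SupplierAdapter(ABC):
    """Common interface every supplier-specific adapter implements."""

    @abstractmethod
    def to_canonical(self, raw: dict) -> dict:
        """Inbound: supplier response -> canonical offer."""

    @abstractmethod
    def from_canonical(self, offer: dict) -> dict:
        """Outbound: canonical offer -> supplier booking request."""


class ExampleLccAdapter(SupplierAdapter):
    """Adapter for an imaginary carrier API (all field names hypothetical)."""

    def to_canonical(self, raw: dict) -> dict:
        return {
            "offer_id": raw["fareKey"],
            "price": raw["totalAmount"],
            "currency": raw["curr"],
        }

    def from_canonical(self, offer: dict) -> dict:
        return {"fareKey": offer["offer_id"], "action": "book"}
```

The rest of the system only ever calls to_canonical and from_canonical, so supplier quirks stay inside the adapter.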

As carriers evolve, these adapters must evolve as well.

“Airlines may migrate to a new version gradually… If we already have an adapter for the new version, we just update the configuration for that airline,” says Ivan Mosiev, Solution Architect at AltexSoft.

This also requires ongoing monitoring and planning. “For example, we know that some airlines regularly migrate to new standards, so this should be planned in the product backlog,” Mosiev adds.
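One way to picture this configuration-driven approach is a registry of adapters keyed by schema version, plus a per-airline config that states which version each operation uses. The sketch below is an assumption about how such a setup could look: the class names, the airline code "XX", and the config layout are hypothetical, while the version numbers echo the examples mentioned earlier.

```python
class Ndc172Adapter:
    schema_version = "17.2"


class Ndc213Adapter:
    schema_version = "21.3"


# Registry of available adapters, keyed by NDC schema version.
ADAPTERS = {cls.schema_version: cls for cls in (Ndc172Adapter, Ndc213Adapter)}

# Hypothetical per-airline configuration: a hybrid setup where shopping
# and booking run on a newer version while servicing stays on an older one.
AIRLINE_CONFIG = {
    "XX": {"shopping": "21.3", "booking": "21.3", "servicing": "17.2"},
}


def adapter_for(airline: str, operation: str):
    """Pick the adapter instance for a given airline and operation."""
    version = AIRLINE_CONFIG[airline][operation]
    return ADAPTERS[version]()
```

When airline "XX" moves servicing to the newer version, the only change is AIRLINE_CONFIG["XX"]["servicing"] = "21.3"; no adapter code is touched.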

Business logic to select and normalize content

Once a canonical model is in place, the next step is deciding what content to show when multiple sources are available.

For example, if the same fare is available via both EDIFACT and NDC, the system may prioritize NDC. If the content is equivalent but prices differ, the cheaper option is selected.

More complex scenarios arise when prices are the same, but conditions differ—such as baggage allowance or flexibility rules. In these cases, there is no universal solution.
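The rules above can be sketched as a small selection function. It is only an illustration under assumed field names (source, price, conditions): the cheapest offer wins, NDC is preferred when price and conditions are identical, and when conditions differ at the same price, all candidates are returned so that a business rule or the user can decide.

```python
# Lower number = higher priority when offers are otherwise equivalent.
SOURCE_PRIORITY = {"ndc": 0, "edifact": 1}


def select_offers(offers: list) -> list:
    """Choose which of several versions of the same fare to display.

    Each offer is a dict with hypothetical keys:
      "source" (str), "price" (int, minor units), "conditions" (dict).
    """
    cheapest = min(o["price"] for o in offers)
    tied = [o for o in offers if o["price"] == cheapest]

    # Compare conditions (baggage, flexibility, ...) among the cheapest offers.
    distinct_conditions = {tuple(sorted(o["conditions"].items())) for o in tied}
    if len(distinct_conditions) > 1:
        # Same price, different conditions: no universal rule, keep them all.
        return tied

    # Equivalent content at the same price: prefer the higher-priority source.
    return [min(tied, key=lambda o: SOURCE_PRIORITY.get(o["source"], 99))]
```

Prices are kept as integers in minor units here so that the equality check is not affected by floating-point rounding.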

Now that we understand the general approach, let’s look at one example of a canonical model implementation offered by a key industry organization.

Doesn’t NGS solve the problem?

In an effort to resolve inconsistent display of airline products and services, the Airline Tariff Publishing Company (ATPCO), a centralized aggregator of airlines’ fares, rules, and pricing logic, introduced the Next Generation Storefront (NGS) in 2019. NGS is a data standard designed to harmonize the consumer shopping experience.


An example of NGS interface: “basic economy”, “economy extra” etc. are shelves. Source: ATPCO

This standard shifts the display logic from “price and schedule” to a value-based grid with “shelves” and “drawers”. The NGS display is a grid of rows, each containing offers from a single airline, and columns called shelves.

Shelves: NGS uses 6 standardized columns to categorize fares by value rather than by brand name (e.g., “Basic Economy” instead of “Saver”).

Offers are grouped based on 18 specific attributes, such as seat pitch, baggage allowance, and ticket changeability, which determine the column a fare fits into.

Drawers: Travelers can expand "drawers" (drop-down menus) to view granular details, such as Wi-Fi availability or power outlets, ensuring "apples-to-apples" comparisons.


An example of NGS interface: “seat selection”, “checked baggage” etc. are options in drawers. Source: ATPCO

As mentioned, NGS is just one example of a possible data model. An interface reflecting this model is quite convenient for travelers, as it allows them to choose a flight not only by price but also by additional features (for example, some people place particular importance on having Wi-Fi and a power outlet for working on the plane, while others prefer a seat near an exit). That said, some might find it a bit too cluttered with details.

But whether it’s good or not, this standard relies on data from the airlines and GDSs to decide which "shelf" a flight belongs on. If an airline sends a cryptic code instead of a clear attribute, the NGS logic won't know where to put that flight.

Think of NGS as the shelf layout in a supermarket. It makes shopping easier for the customer. However, further normalization is the logistics team's responsibility: these guys unload the trucks, remove incorrect labels, and ensure price tags are accurate, so shelves are clean and organized.

The usual suspects: Air content issues you’ll definitely encounter

Mismatches in how different suppliers model, return, and interpret the same concepts tend to cluster into recurring problem areas. Below, we break down the most common categories observed by the AltexSoft engineering team while building a flight aggregator for OTAs.

For this project, we integrated with Amadeus, Sabre, and Travelport, as well as an air consolidator and several other platforms.

Baggage: messy codes

Baggage is one of the least reliable parts of airline content. Airlines and global distribution systems (GDSs) do not always clearly indicate whether baggage refers to checked, carry-on, or personal items, or how allowances (such as first vs. second bag) should be interpreted.

This is particularly disappointing given that ATPCO long ago introduced standardized definitions for ancillary services using three-character RFISC codes. For example: 0CC (first checked bag), 0CD (second checked bag), 0B1 (prepaid baggage), 0DG (carry-on baggage). Airlines can attach attributes to these codes, such as weight, size, and fees.

IATA has also created a retail-oriented taxonomy (used in NDC) that provides a higher-level classification of airline services. For instance, the “Checked baggage” category (13EC) includes items like “Bag” (1450), “Stroller” (1518), and “Pet” (15E0). However, this taxonomy does not fully define details such as baggage order (first vs. second) or specific allowance rules.

In practice, adoption of these standards varies across airlines. Some rely on airline-specific codes, while others return partially structured or fully unstructured descriptions.

The complexity increases further in interline itineraries, where different segments may follow different baggage rules, making even basic concepts like “first bag” difficult to interpret consistently.

GDSs do not fully resolve these inconsistencies, so baggage data often requires additional normalization. “They may return a human-readable string like ‘no checked baggage and two carry-on items,’ which then has to be parsed,” says Andrey Dudnik, Software Engineer at AltexSoft working on a normalization layer for a flight aggregator.

Our solution: we used lookup tables for known airline codes and text parsing for everything else. Regular expressions (regex) were applied to extract structured data from free-text descriptions.
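A simplified version of this two-step approach might look like the sketch below. It is not the production code: the airline-specific codes in the lookup table are invented, and the regular expression covers only a narrow family of phrasings such as the one quoted above.

```python
import re
from typing import Optional

# Hypothetical airline-specific codes with known meanings.
KNOWN_BAGGAGE_CODES = {
    "BG0": {"checked": 0, "carry_on": 1},
    "BG1": {"checked": 1, "carry_on": 1},
}

WORD_NUMBERS = {"no": 0, "one": 1, "two": 2, "three": 3}

# Matches fragments like "no checked", "two carry-on", "1 checked".
BAGGAGE_PATTERN = re.compile(
    r"\b(no|one|two|three|\d+)\s+(checked|carry-on)\b", re.IGNORECASE
)


def parse_baggage_text(text: str) -> dict:
    """Extract piece counts from a free-text baggage description."""
    result = {"checked": None, "carry_on": None}  # None = not mentioned
    for quantity, kind in BAGGAGE_PATTERN.findall(text):
        count = WORD_NUMBERS.get(quantity.lower())
        if count is None:
            count = int(quantity)  # the quantity was written as digits
        key = "checked" if kind.lower() == "checked" else "carry_on"
        result[key] = count
    return result


def normalize_baggage(code: Optional[str] = None, text: str = "") -> dict:
    """Use the lookup table when the code is known, else parse the text."""
    if code in KNOWN_BAGGAGE_CODES:
        return dict(KNOWN_BAGGAGE_CODES[code])
    return parse_baggage_text(text)
```

For the string quoted above, parse_baggage_text returns zero checked pieces and two carry-on pieces. Anything the pattern does not recognize stays None, which signals that the case still needs individual handling.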

Normalization ultimately remained case-specific. “If it’s not possible to fully unify baggage data, you have to handle it case by case,” Andrey notes.

Flight attributes: missing data

Not all suppliers return complete flight information, yet OTA users still expect it. These gaps need to be handled on a case-by-case basis. For example, we encountered missing flight duration data — even though travelers rely on it to plan their time, meals, and connections. 

Our solution: we maintained a custom database of airport time zones, allowing us to calculate flight duration accurately even when it wasn’t provided by the supplier.
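The calculation itself is straightforward once each airport is tied to a time zone. The sketch below uses Python's standard zoneinfo module and a two-entry table standing in for the full database; the table contents and function name are assumptions for illustration, and the system is assumed to have IANA time zone data available.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Tiny stand-in for a full airport -> IANA time zone database.
AIRPORT_TZ = {
    "JFK": "America/New_York",
    "LHR": "Europe/London",
}


def flight_duration_minutes(origin: str, departure_local: str,
                            destination: str, arrival_local: str) -> int:
    """Compute flight duration from local ISO 8601 departure/arrival times."""
    dep = datetime.fromisoformat(departure_local).replace(
        tzinfo=ZoneInfo(AIRPORT_TZ[origin])
    )
    arr = datetime.fromisoformat(arrival_local).replace(
        tzinfo=ZoneInfo(AIRPORT_TZ[destination])
    )
    # Convert both to UTC so the difference reflects real elapsed time.
    delta = arr.astimezone(timezone.utc) - dep.astimezone(timezone.utc)
    return int(delta.total_seconds() // 60)
```

For a flight leaving JFK at 18:00 local time on January 15, 2024, and landing at LHR at 06:00 local time the next day, the function returns 420 minutes, since New York is five hours behind London on that date.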

Fare conditions: too much info

Penalties for changes and refunds are often provided across multiple scenarios — such as before departure, after departure, or no-show — but without clear prioritization or context.

“They return too much information. But customers mainly care about penalties before departure. They just want to know how much they’ll have to pay if they cancel an already ticketed flight,” says Andrey.

Our solution: the aggregator filters, reorganizes, and maps raw penalty data into user-relevant categories such as “change” and “cancel,” effectively translating supplier logic into user intent.
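As a rough illustration of this filtering step, the sketch below keeps only before-departure penalties and groups them under "change" and "cancel". The raw record layout (type, applies, amount) and the supplier type names are assumptions; real supplier responses are structured differently and vary by source.

```python
# Hypothetical mapping from supplier penalty types to user-facing categories.
PENALTY_TYPE_MAP = {
    "CHANGE": "change",
    "REFUND": "cancel",
    "CANCEL": "cancel",
}


def summarize_penalties(raw_penalties: list) -> dict:
    """Reduce raw penalty records to what travelers usually ask about.

    Only penalties that apply before departure are kept; after-departure
    and no-show scenarios are dropped from the summary.
    """
    summary = {"change": None, "cancel": None}  # None = not provided
    for penalty in raw_penalties:
        if penalty["applies"] != "BEFORE_DEPARTURE":
            continue
        category = PENALTY_TYPE_MAP.get(penalty["type"])
        if category is not None:
            summary[category] = penalty["amount"]
    return summary
```

The full raw data can still be stored alongside the summary for agents who need the after-departure or no-show rules.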

Observability: limited visibility into supplier behavior

Diagnosing inconsistencies is itself a challenge. Supplier responses often lack transparency, making it difficult to understand failure patterns or performance issues without additional tooling.

Our solution: we built custom observability layers to track route performance, error rates, and supplier-specific anomalies — turning operational visibility into a prerequisite for effective normalization.
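At its simplest, such tracking comes down to counting calls and failures per supplier and route. The sketch below is an in-memory illustration with hypothetical names; a real observability layer would typically export these counters to a metrics system rather than keep them in a Python object.

```python
from collections import defaultdict


class SupplierStats:
    """In-memory counters of calls and errors per (supplier, route)."""

    def __init__(self):
        self.calls = defaultdict(int)
        self.errors = defaultdict(int)

    def record(self, supplier: str, route: str, ok: bool) -> None:
        """Register the outcome of one supplier request."""
        key = (supplier, route)
        self.calls[key] += 1
        if not ok:
            self.errors[key] += 1

    def error_rate(self, supplier: str, route: str) -> float:
        """Share of failed requests; 0.0 when nothing was recorded."""
        key = (supplier, route)
        total = self.calls.get(key, 0)
        if total == 0:
            return 0.0
        return self.errors.get(key, 0) / total
```

Comparing error rates for the same route across suppliers is one way to tell a supplier-specific anomaly from a problem with the route itself.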

Can AI help?

In some cases—absolutely. AI won’t replace the need for solid core architecture, but it’s increasingly taking over the heavy lifting of air content—sorting, routing, and translating data.

Adapter generation. As airlines introduce new NDC versions, developers often spend months building and maintaining adapters. AI can assist by analyzing provider documentation to identify response schemas, field structures, and business rules — significantly accelerating this process, though still requiring validation.

AI-assisted mapping. AI can help determine how new data structures fit into an existing canonical model. As Ivan Mosiev notes, “AI can suggest transformations for field names and formats, helping ensure that a ‘Deluxe Bag’ from one airline is consistently mapped to a standard checked baggage attribute, reducing manual effort.”

Sophisticated classification. AI excels at classification, which is critical for solving the "messy descriptions" problem. By processing marketing fare names or unstructured service strings, machine learning models can accurately determine which category a fare belongs to if a supplier hasn’t labeled it consistently.

Olga Pereverzieva

Olga is a tech journalist at AltexSoft, specializing in travel technologies. With over 25 years of experience in journalism, she began her career writing travel articles for glossy magazines before advancing to editor-in-chief of a specialized media outlet focused on science and travel. Her diverse background also includes roles as a QA specialist and tech writer.
