Aviation, NLP, Applied ML, GenAI & LLMs

A quality issue at 35,000 feet is a brand issue at ground level.

The catering arm of a global aviation group needed an early-warning system for in-flight food quality - we built it from the complaint stream they already had.

Ismail Mebsout • April 15, 2026 • 2 min read

The Stakes

In-flight catering at network scale is a quality challenge with reputational consequences. Foreign-object incidents - foil fragments, hair, small stones, insects - sit at the intersection of safety, customer experience, and brand. Cabin crews already log customer feedback in the operational complaint system, but that feedback is subjective, unstructured, and arrives at a volume no one can manually review at the granularity required. By the time a pattern surfaces at a specific outstation or provider, the damage has compounded. The question: could we read the complaint stream automatically and surface the weak signals that the manual triage process kept missing?

The Approach

01
Build the labeled set the function never had.
We extracted historical complaints, sampled across routes and providers, and labeled them manually for foreign-object relevance. That labeled corpus became the spine of everything that followed - and a permanent asset for the quality function.
02
Start with a classifier, graduate to an LLM.
We shipped a text-classification baseline first - fast to deploy, easy to validate. We then upgraded to an LLM-based classifier once the precision-recall trade-off demanded it. The judgment call: production beats elegance. We didn't wait for the perfect model to start delivering value to the quality team.
03
Aggregate where the operating decisions live.
Detection at the complaint level is useful. Detection rolled up by route, station, and provider - with temporal tracking - is actionable. We delivered the output into the function's operational dashboard, where the people running quality reviews could already see everything else.
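To make the roll-up in step 03 concrete, here is a minimal sketch, not the production pipeline. It assumes each complaint has already been classified and carries a flight date, route, station, and provider. The record layout, the station and provider names, and the weekly bucket are all hypothetical choices made for illustration.

```python
from collections import Counter
from datetime import date

# Hypothetical classified complaints: (flight date, route, station, provider, flagged).
# "flagged" stands in for the classifier's foreign-object decision.
complaints = [
    (date(2026, 1, 5), "AAA-BBB", "AAA", "provider_1", True),
    (date(2026, 1, 6), "AAA-BBB", "AAA", "provider_1", False),
    (date(2026, 1, 7), "AAA-CCC", "AAA", "provider_1", True),
    (date(2026, 1, 7), "DDD-BBB", "DDD", "provider_2", False),
    (date(2026, 1, 14), "DDD-BBB", "DDD", "provider_2", True),
]


def rollup(rows):
    """Count flagged and total complaints per (ISO year, ISO week, station, provider)."""
    flagged, total = Counter(), Counter()
    for day, _route, station, provider, is_flagged in rows:
        iso = day.isocalendar()  # (ISO year, ISO week number, ISO weekday)
        key = (iso[0], iso[1], station, provider)
        total[key] += 1
        if is_flagged:
            flagged[key] += 1
    # Sorting the keys puts the periods in chronological order for temporal tracking.
    return {key: (flagged[key], total[key]) for key in sorted(total)}


weekly = rollup(complaints)
for (year, week, station, provider), (hits, n) in weekly.items():
    print(f"{year}-W{week:02d} {station} {provider}: {hits}/{n} flagged")
```

Grouping by route instead only requires changing the key. The point of the sketch is that the unit of analysis shifts from the single complaint to the place and period where an operating decision can be made.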

The Outcome

A scalable, objective, proactive monitoring system replaced subjective manual triage. Specific outstations and provider facilities now surface as anomalies before the issue compounds. The quality team's manual review workload dropped sharply, and the remaining effort is directed at the cases the model flags. Leadership now has data-backed visibility into catering quality across the network - for the first time at this granularity. The unlock: catering quality moved from reactive incident management to proactive risk monitoring.
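The case study does not say how anomalies are scored, so the following is only one plausible sketch. It compares each station's flagged-complaint count with what the network-wide rate would predict, using a binomial z-score. The counts, the threshold of 3.0, and the minimum-volume cutoff are assumptions chosen for illustration.

```python
import math

# Hypothetical counts for one period: station -> (flagged complaints, total complaints).
counts = {
    "AAA": (4, 200),
    "BBB": (5, 250),
    "CCC": (18, 150),
    "DDD": (3, 400),
}


def find_anomalies(counts, z_threshold=3.0, min_total=50):
    """Return (station, z) pairs whose flagged count sits well above the network rate."""
    network_flagged = sum(f for f, _ in counts.values())
    network_total = sum(n for _, n in counts.values())
    p = network_flagged / network_total  # network-wide flagged rate

    results = []
    for station, (f, n) in counts.items():
        if n < min_total:
            continue  # too few complaints to judge reliably
        expected = n * p
        sd = math.sqrt(n * p * (1 - p))  # binomial standard deviation
        z = (f - expected) / sd if sd > 0 else 0.0
        if z > z_threshold:
            results.append((station, round(z, 2)))
    # Highest z-score first, so the worst outlier leads the list.
    return sorted(results, key=lambda item: -item[1])


flagged_stations = find_anomalies(counts)
print(flagged_stations)
```

With these made-up numbers only station "CCC" exceeds the threshold. A real deployment would likely need a more careful baseline, for example excluding the station under test from the network rate or normalizing by meals served rather than complaints logged.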

The Takeaway

“When your data already exists in the form of customer feedback, you don't need a new system - you need a model that reads what's already there. Start with classifiers, escalate to LLMs only when the math demands it.”
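One way to read "when the math demands it" is as a check of the baseline's precision and recall against targets agreed with the quality team. The sketch below assumes a small hand-labeled evaluation set. The labels, the predictions, and the 0.80 targets are hypothetical and do not come from the project.

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = foreign-object complaint)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical evaluation data: manual labels vs. baseline classifier output.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
baseline = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(labels, baseline)

# Hypothetical targets; in practice these would come from the quality function.
TARGET_PRECISION = 0.80
TARGET_RECALL = 0.80

# Escalate to a heavier model only if the baseline misses either target.
escalate = precision < TARGET_PRECISION or recall < TARGET_RECALL
print(f"precision={precision:.2f} recall={recall:.2f} escalate={escalate}")
```

Here the baseline finds two of the four true cases and raises one false alarm, so it misses both targets and the escalation rule fires.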

Sitting on a complaint stream, a feedback channel, or an inspection log that nobody reads at scale? We turn unstructured customer signal into early-warning systems for operations and quality teams.

