Flight Data Monitoring and FOQA: Data De-Identification

(Note: This article was originally published on October 28, 2020).

One of the biggest and most frequent concerns that we hear from operators and pilots who are new to Flight Data Monitoring or FOQA programs is how we protect the identity of the crew.

If you have read some of my other blogs, hopefully you will know that the really interesting information from a Flight Data Monitoring program comes from the trending and statistical data – not how a particular pilot is flying. For the most part, operators understand this and do not use the data from their programs for punitive actions; however, it is certainly a legitimate concern for pilots, particularly those in larger organizations.

Some operators may not even have a choice about whether to de-identify the data – it may be a requirement of the pilots’ union.

So, how do we go about doing it?

The first bit of good news is that, in general, the flight data does not record anything that can directly identify the crew that was flying the aircraft. For example, there is no parameter for “Pilot ID” recorded on the typical Flight Data Recorder (FDR) or Quick Access Recorder (QAR).

However, there are parameters recorded that can be used indirectly to identify the crew. For example, most airline operators consider the Flight Number and the Flight Date/Time to be “Identifiable” parameters. These two pieces of information, used along with a crew schedule, certainly could be used to identify the crew of a particular flight.

Other operators may consider other types of information “identifiable”, but the process of hiding this information is generally the same regardless of the parameters involved.

There are two steps to implementing a data de-identification strategy. The first, and probably most obvious, step is to de-identify the data. The second step is to have a mechanism in place to identify the crew if there is ever a requirement to interview them. It sounds counterintuitive, but it is actually pretty straightforward.

The de-identification step in the process involves masking those parameters that are considered identifiable. In the example of the typical airline above, our Sky Analyst FDM software would blank out the flight number and set the date of the flight to the first of the month.

Why not blank out the date and time the same way we did the flight number?

For most operators, setting the date of the flight to the first of the month provides an acceptable level of de-identification but still allows us to look at the effects of seasonal and day/night variations on our event rates and trends. But an operator is free to implement any de-identification strategy that works for them.
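To make the masking step concrete, here is a minimal sketch in Python. It is not how Sky Analyst is implemented; the `FlightRecord` type, its field names, and the `deidentify` function are all hypothetical, and it assumes the time of day is left untouched so that day/night analysis remains possible.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class FlightRecord:
    # Hypothetical processed-flight summary; field names are illustrative only.
    flight_number: Optional[str]
    departure_time: datetime
    event_count: int


def deidentify(record: FlightRecord) -> FlightRecord:
    """Blank the flight number and move the date to the first of the month.

    The month and year are kept so seasonal trends survive, and the time of
    day is kept (an assumption of this sketch) so day/night trends survive.
    """
    masked_time = record.departure_time.replace(day=1)
    return replace(record, flight_number=None, departure_time=masked_time)


original = FlightRecord("SA123", datetime(2020, 10, 28, 21, 45), event_count=2)
masked = deidentify(original)
print(masked)
```

The masked record can still be grouped by month or by hour of departure, but it can no longer be matched against a crew schedule on its own.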

The second step allows for a process to identify the crew if there ever is a case where we may need to interview them for more information on the flight. This can occur for a number of reasons, none of which are nefarious. For example, a Safety Officer may want to interview the crew simply to get further insight into what ATC communications were like, or what the workload was like in the flight deck at the time.

In such cases, a user with the special role of “Gatekeeper” is consulted. Only Gatekeepers have access to those parameters considered identifiable. The Gatekeeper can then log into the system to get the actual Flight Date and Flight Number (in our example) and compare them to the flight schedule to identify the crew.

In larger organizations, the Gatekeeper is typically a line pilot and representative of the pilots’ union. The idea is to select someone who can provide a buffer between management and the line pilots.
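The Gatekeeper lookup can be sketched as a simple role check in front of the identifiable values. This is only an illustration under stated assumptions: the `IdentityVault` class, the role name `"gatekeeper"`, the internal flight ID, and the access log are all hypothetical and do not describe any particular FDM product.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class IdentifiableData:
    # The real values that were masked in the processed data.
    flight_number: str
    departure_time: datetime


class IdentityVault:
    """Hypothetical store of real values, keyed by an internal flight ID."""

    def __init__(self) -> None:
        self._records: Dict[int, IdentifiableData] = {}
        # Each entry is (user, flight_id, reason), kept for later audit.
        self.access_log: List[Tuple[str, int, str]] = []

    def store(self, flight_id: int, data: IdentifiableData) -> None:
        self._records[flight_id] = data

    def reveal(self, flight_id: int, user: str, role: str, reason: str) -> IdentifiableData:
        # Only users holding the Gatekeeper role may see identifiable values.
        if role != "gatekeeper":
            raise PermissionError(f"{user} is not a Gatekeeper")
        self.access_log.append((user, flight_id, reason))
        return self._records[flight_id]


vault = IdentityVault()
vault.store(1001, IdentifiableData("SA123", datetime(2020, 10, 28, 21, 45)))
actual = vault.reveal(1001, user="union_rep", role="gatekeeper", reason="crew interview request")
print(actual.flight_number, actual.departure_time.date())
```

With the real flight number and date in hand, the Gatekeeper can consult the crew schedule outside the system. Recording a reason with every lookup is a design choice of this sketch that keeps each identification traceable.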

The process really is that simple, but there are some other things to consider when establishing your policies related to data de-identification, such as:

Will you keep the identifiable data for the Gatekeeper indefinitely, or will you want to permanently de-identify it after a certain period of time so that even the Gatekeeper cannot identify it? If so, what is that time period?
Will you keep your raw data indefinitely, or will you want a strategy to permanently delete it after some time period? It is important to note that data de-identification typically applies only to the processed data. Raw flight data is very rarely de-identified, so anyone with the proper software could always extract identifiable information from the raw flight data.
What will the process be for requesting identifiable information? This should be a formal process – not just a simple, casual email request.
Do you have a non-punitive agreement in place with your pilots? This is not necessarily related to de-identification, but if you are going to go through the process of de-identifying data, it would be worthwhile to have a non-punitive agreement with them to ensure that when you DO identify the data, it is to promote and improve safety, not to punish for mistakes.
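The first question above, about permanently de-identifying data after a certain period, could be handled by a scheduled clean-up job. The sketch below is one possible approach rather than a recommendation: the 90-day `RETENTION` value, the `purge_expired` function, and the layout of the store are assumptions made purely for illustration, and each operator would set its own period.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Assumed policy value; the right retention period is up to each operator.
RETENTION = timedelta(days=90)


def purge_expired(
    identifiable: Dict[int, Tuple[datetime, str]],
    now: datetime,
    retention: timedelta = RETENTION,
) -> List[int]:
    """Delete identifiable entries older than the retention period.

    `identifiable` maps a hypothetical internal flight ID to the real
    (departure time, flight number). Returns the IDs that were removed.
    """
    expired = sorted(
        flight_id
        for flight_id, (flown_at, _flight_number) in identifiable.items()
        if now - flown_at > retention
    )
    for flight_id in expired:
        del identifiable[flight_id]
    return expired


store = {
    1001: (datetime(2020, 6, 1, 8, 30), "SA123"),
    1002: (datetime(2020, 10, 20, 21, 45), "SA456"),
}
removed = purge_expired(store, now=datetime(2020, 10, 28))
print(removed, sorted(store))
```

Once an entry is purged, not even the Gatekeeper can recover the real date and flight number from the processed data, although the raw flight data would still contain them unless it is deleted as well.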

A Flight Data Monitoring program provides many safety, maintenance and operational benefits to an organization, but it is important to have all parties on board with the program before it begins. Having a well-thought-out data de-identification policy is a great step toward ensuring that your pilot group is on board with the program.

For more information or help on setting up your Flight Data Monitoring program, feel free to contact us at info@scaledanalytics.com.
