Theory Approach to Decision Optimization in Collaborative Trajectory Options Program

Using the Collaborative Trajectory Options Program (CTOP), the FAA considers airlines’ business goals to allocate available slots to fly through a Flow Constrained Area (FCA) by combining rerouting and delay at the same restrictive measure. Although, the FAA turned this collaborative environment possible, when a CTOP demand is initiated each airline has knowledge only about its own flights related to CTOP. The Decision Support System (DSS) is included as a fundamental part of the overall management by improving the fluency and safety, and supporting daily activities of National Airspace System (NAS) users. Thus, this paper presents a two-phase Game Theory approach to indicate how many route preferences should be sent by each airline to reduce its delay. The first stage will indicate the best TOS message to be sent when each airline has no history about CTOP negotiations with other airline competitors. The second stage will learn by the CTOP negotiations and adapt the DSS behavior to improve the results in every game round, by predicting the possible future scenarios and airline competitors strategies. Comparing with our proposal, the first phase made possible that Airline A achieved a global delay less than, or equal, in 97% of CTOP negotiations representing a delay reduction of 537 hours for the airline. In the second phase, a learning process was used by

∗Corresponding author
Email addresses: leocruciol@gmail.com (Leonardo Cruciol), johnpaul@gatech.edu
(John-Paul Clarke), weigang@unb.br (Li Weigang)
Preprint submitted to SI: Comp. and Ind. Math. on Applied and Comp. Math.March 10, 2019
adapting its strategies against competitors to be allocated in better available slots in a CTOP demand, which made possible an improvement rate about 21% for Airline A.
Keywords: ATFM, CTOP, DSS, Game Theory, Uncertain Systems

1. Introduction
During last three decades, the advances in Air Traffic Management (ATM) increased significantly by using techniques such as Artificial Intelligence and Game Theory. The Federal Aviation Administration (FAA) keeps improving the ATM programs and procedures to provide better conditions for the airspace control and management in the USA. Some of most important programs, such as Collaborative Decision-Making (CDM); Ground Delay Problem (GDP) and
Airspace Flow Program (AFP), were evolved to take a new step in the Next Generation Air Transportation System (NextGen) with the Collaborative Trajectory Options Program (CTOP) [1, 2, 3, 4, 5, 6].
Using the CTOP, the FAA considers airlines’ business goals to allocate available slots to fly through a Flow Constrained

Area (FCA) by combining rerouting and delay at the same restrictive measure. This evolution of ATM was important for the National Airspace System (NAS) users, since now the airlines could share their route preferences by sending Trajectory Option Set (TOS) messages for every affected flight to the FAA and improve their business results, considering every individual flight condition [7]. Although, the FAA turned this collaborative environment possible, when a CTOP demand is initiated each airline has knowledge only about its own flights related to CTOP. This makes the strategy definition more important about how many route preferences will be send. Once the final sort list to be allocated is defined from the earliest Estimated Time of Arrival (ETA) of each route preference informed in the TOS. The strategy could decreases until 7 times the delay suffered in its CTOP captured flights, which turns this approach important to handle with the overall uncertainty involved during the TOS definition process.
Considering that the CTOP process is similar as the game activities between airlines and air traffic controllers, this paper presents a two-phase Game Theory approach to indicate how many route preferences should be sent by each airline to reduce its delay. The model used a game principle to reduce the risks to increase its delay and predict competitors’ game moves by reducing its gain over big loss, which makes the DSS results more balanced. The first stage will indicate the best TOS message to be sent when each airline has no history about CTOP negotiations with other airline competitors. The second stage will learn by the CTOP negotiations and adapt the DSS behavior to improve the results in every game round, by predicting the possible future scenarios and airline competitors strategies. It was used a predictive model to estimate the historical environment to develop an expected reputation and likelihood of in-game movement. Once after a CTOP demand is done, each airline keeps not having knowledge about the other flights and airlines were involved in the
negotiation.
This paper is organized as follows. In section 2, we review the relevant research and concepts of CTOP. In section 3, the Game Theory models are proposed for CTOP. In section 4, we present the case studies and results using data from the scenario of USA. In section 5, we conclude the paper and present directions for future study.
2. Collaborative Trajectory Options Program
The CTOP is initiated when a specific geographic area will be restricted to fly, partially or fully, and so, reducing the en route capacity during a predetermined time. This area is called as FCA, and more than one FCA could happen in a same CTOP demand with different capacity and demand. At this time, the FAA informs to every airline, who have affected flights by the CTOP, which flights; capacity in each FCA for every 15 minutes window; and the start/end period of restrictions [2, 3].
An example of a CTOP demand is presented in Figure 1, with a single flight from Hartsfield-Jackson Atlanta International Airport (ATL) to Denver International Airport (DEN). It was defined three FCAs in this example, which makes possible four cases: to fly with the least costly route through the FCA1 or the FCA2 or the FCA3 or fly around avoiding the flow constrained areas
defined.
Figure 1: CTOP example from ATL to DEN
The main strategic question for the airlines is which TOS message will be sent for every flight, once each airline has no knowledge about its competitors. It was verified that the delay could be increased if the airline send for every flight more route options than the amount of FCA’s, plus one NOSLOT option to fly around, e.g., if there are 2 FCA’s the airline should send 2 routes + 1 NOSLOT option. Thus, verifying the Fig. 1, there are four options to send: 1the best option (least costly) for the FCA1, 2- the best option for the FCA2, 3- the best option for the FCA3, or 4- the best option to avoid the FCA’s.
The final sort list is defined by the ETA of each route in every FCA and is executed by the CTOP algorithm [8]. The amount of route options an airline send in each CTOP demand will likely define its position to be allocated and receive less delay in its CTOP captured flights.
The CTOP decision is complex for the airlines because there are many unknown or poorly estimated factors and variables. Information such as the number of captured flights of other airlines, demand and capacity for each FCA, strategy used by others airlines to define their TOSs, and others, are hardly estimated to take better strategies. Current solutions for this problem are based on Greedy methods [9, 10, 11].
The proposed Game Theory approach will works exactly in this part, supporting the airlines operational team to define how many route preferences will be sent in that specific CTOP demand. The first phase will analyze and propose the best TOS for an unique CTOP demand, and the second phase will learn and treat the competitor reputation to make adjustments in its strategy for every demand.
3. Game Theory Models for CTOP
During the development of the Game Theory approaches, we made the following assumptions regarding the strategies of the two airlines, A and B, i.e., the players. First, the players were assumed to have three strategy options: 1send NOSLOT for all flights and fly around the FCAs; 2- send one trajectory for every flight to fly through only one FCA plus a NOSLOT option, which will be allocated if the assigned delay is worse than the delay to fly through the FCA; and 3- send two trajectories, i.e., one through each FCA, plus a NOSLOT option. The Airline A was selected as the main player and the Airline B as the competitor player. The players were assumed to be rational and therefore will send the best trajectory option set for every flight.
3.1. CTOP Negotiations Without History Approach
The approach has no prior knowledge about competitors and no information about who is playing and their strategy in a specific CTOP demand, i.e., exactly how it works in a real CTOP demand [12, 13, 14, 15, 16, 17, 18, 19].
It was modeled as a non-cooperative, non-repeated game with incomplete information, which is analogous to the Prisoner’s Dilemma game. The Figure 2 presents the CTOP negotiations without history approach overview.
Figure 2: CTOP Negotiations Without History Approach Overview
It was used data mining techniques to analyze the historical CTOP demand data available. Based on this analysis result, it was specified three cases in terms of the fraction of demand contributed by each airline and three scenarios regarding to the flights from each airline. So, the flights were allocated to the first and second half of the demand list as specified by the IAT at the earliest FCA.
It was defined three cases: 1- Airline A has 50% of CTOP captured flights and Airline B has 50%, 2- Airline A has 67% of CTOP captured flights and Airline B has 33%, and 3- Airline A has 75% of CTOP captured flights and Airline B has 25%. Considering the scenarios, it was defined three for each case regarding the number of flight of Airline A and Airline B in each half of IAT order. For example, in the second scenario it is: first half of IAT order contains
50% of Airline A flights and 50% of Airline B, and second half contains 50% of Airline A and 50% of Airline B.
Considering the Prisoner’s Dilemma game, it was possible to make some assumptions to construct the matrix of estimate: 1- The estimated delay cost, if Airline A sends NOSLOT for every its flights, 2- The estimated delay cost, if Airline A sends one trajectory plus NOSLOT for every its flights and Airline B sends NOSLOT for every its flights, and 3- The estimated delay cost, if Airline
A sends two trajectories plus NOSLOT for every its flights and Airline B sends NOSLOT for every its flights.
The game strategy regards strictly available information for each airline and is defined by a payoff function, which is presented by the following equations. The estimated delay for each combination of case, scenario and Airline A’s move is given in minutes by the Eq. (1), when Airline B’s move is equals 0 (NOSLOT). The relationship among predicted delays is given by Eqs. (2), (3), (4) and (5).
D(C,S,M) = dy (1)
(2)
ED(C,S) = ECS(C,S) − (D(C,S,α) − D(C,S,χ)) (3)
(4)
RD(TC,TS) = X [MD(C,TS) − |ED(C,S)|] (5)
C1→TC
Where:
• D(C,S,M) = estimated delay in minutes (dy) for Airline A as Case (C), Scenario (S) and Airline A’s Move (M), when Airline B’s move is equals 0 (NOSLOT). The Airline A’s move is given by 0 = NOSLOT, 1 = one trajectory plus NOSLOT option, and 2 = two trajectories plus NOSLOT
option.
• TC = Total of Cases. It was defined three: Case 1 (A=50%,B=50%), Case 2 (A=67%,B=33%) and Case 3 (A=33%,B=67%).
• TS = Total of Scenarios.
• ECS(C,S) = Estimated delay for each case and scenario. The α represents the move 0, β represents the move 1 and χ represents the move 2.
• ED(C,S) = Estimated deviation for each case and scenario. The α represents the move 0 and χ the move 2.
• MD(C,TS) = Estimated mean delay for each case, regarding its scenarios and possible moves.
• RD(TC,TS) = Relationship among all estimated delays, as each possible move for Airline A and Airline B’s move equals 0 (NOSLOT).
Considering the payoff value, the move for each game will be predicted between sending: 1 trajectory plus NOSLOT option or 2 trajectories plus NOSLOT option. The strategy to send only a NOSLOT option was not considered because of the greater possibility of receiving a worse global delay, when there is no information about competitors’ strategies. The payoff function is given by Eqs. (6) and (7).
(6)
(7)
0, GM = 2 Trajectories + NOSLOT
NHPayoff =
NHPayoff > 0, GM = 1 Trajectory + NOSLOT
Where:
• E(TC,TS) = The delay difference between one trajectory plus NOSLOT and two trajectories plus NOSLOT, for each case and scenario. The β represents the move 1 and χ the move 2.
• NHPayoff = It represents the payoff. Considering its value, if NHPayoff is higher than 0, game move (GM) might send one trajectory plus NOSLOT option for each flight, otherwise it might send two trajectories plus NOSLOT option.
3.2. CTOP Negotiations Learning Approach
This approach aims to suggest to the airline operation centers how many trajectory options will likely reduce the involved risks of increase the delay on CTOP captured flights. It was modeled to work dynamically with a learning process that adapts its strategies against competitors, and so, be allocated in better available slots in each CTOP demand [14, 20, 21]. It is a non-cooperative, non-repeated game with incomplete information. The CTOP negotiations learning approach overview is presented in Figure 3.
Figure 3: CTOP Negotiations Learning Approach Overview
Considering that each case and scenario had the same probability to happen, as well as each move of Airline B, it was defined five cases: 1- Airline A had 50% of CTOP captured flights and Airline B had 50%, 2- Airline A had 67% of CTOP captured flights and Airline B had 33%, 3- Airline A had 33% of CTOP captured flights and Airline B had 67%, 4- Airline A had 75% of CTOP captured flights and Airline B had 25%, and 5- Airline A had 25% of CTOP captured flights and Airline B had 75%.
The NLpayoff considers the NHpayoff to calculate its final result, once the initial steps, estimate and risks analysis, are done by the first phase. The Detective Agent (DA) is responsible to estimate which case and strategy likely happened in the last game. The DA’s payoff function is given by Eqs. (8), (9) and (10).
DRL(C,S,M) = dy (8)
(9)
(10)

DA−2 Payoff≤ DAPayoff≤ -3, Detective< 2 , Detective= 0 = 2
DAPayoff =
2DA≤ PayoffDAPayoff≥ 5,<Detective5, Detective= 1= 3
Where:
• DRL(C,S,M) = Airline A’s delay in the last game, in minutes, given by dy for each case, scenario and move when Airline B sent NOSLOT for every its flight. The Airline A’s moves are given by: 0- (NOSLOT), 1(1 trajectory + NOSLOT) and 2- (2 trajectories + NOSLOT). The α represents the move 0, β represents the move 1 and χ represents the move
2.
• EMD(C,TSL) = Estimated mean delay for each case, regarding its scenarios and moves in the last game for Airline A.
• DAPayoff: It estimates which case and strategy likely happened in the last game. Considering the DAPayoff value, the Detective will be equals: 0- if (DAPayoff≤-3), 1- if (DAPayoff≥5), 2- if (-2≤DAPayoff<2) and 3if (2≤DAPayoff<5).
• TCL = Total of cases in the last game. It was defined 5 cases, regarding the flights proportion of each airline during the CTOP: 1- (A=50%,B=50%), 2- (A=67%,B=33%), 3- (A=33%,B=67%), 4- (A=75%,B=25%) and 5(A=25%,B=75%).
• TSL = Total of scenarios in the last game.
The Learner Agent (LA) is responsible to estimate which Airline B’s strategy likely was taken during last 5 games. The LA’s payoff function is given by Eqs. (11) and (12). Thus, the NL payoff function is given by Eq. (13).
(11)
(12)

LAPayoff = 0 ≤ LAPayoff 0, GM = 1 Trajectory + NOSLOT (13)
Where:
• EDF(C,K,M) = Estimated mean delay for each case, regarding its scenarios and moves in the last 5 games for Airline A. The α represents the move 0, β represents the move 1 and χ represents the move 2. The δ represents the last 1st game and the last 5th game.
• LAPayoff: It estimates which Airline B’s strategy likely was taken during last 5 games. Considering the LAPayoff value, the Learner will be equals:
0- if (LAPayoff< 0), 1- if (0≤LAPayoff<2) and 2- if (LAPayoff≥2).
• Detective = It represents the Detective value, given by DAPayoff’s payoff.
• Learner = It represents the Learner value, given by LAPayoff’s payoff.
• NLPayoff = It represents the payoff. Considering its value, if NLPayoff is higher than 0, game move (GM) might send one trajectory plus NOSLOT option for each flight, otherwise it might send two trajectories plus NOSLOT option.
The NOSLOT option will not be suggested in both cases, because the cost to send this option in specific games is too high and unbalances the strategy used by these models.
4. Case Study
The CTOP demand was created with two FCAs, FCA001 and FCA002,
which every aircraft flying through would have some restriction based on available capacity and excluding exempt flights. There was a total of 331 flights destined to the New York metropolitan area from airports in Miami, Dallas, Chicago, San Francisco, Los Angeles and Las Vegas. The flights were randomly assigned to each airline and airport.
It was defined 3 scenarios for each case, regarding the partition of airlines’ flights in the IAT order; 3 possible moves for each player, resulting 9 possible combinations for each game; and regarding the amount of CTOP captured flights, it was defined 3 cases for the first phase, without historical information, and 5 cases for the second phase, with learning structures. Considering that both airlines do not know which case or scenario is being played, there were 81 possible final results for the game in the first phase and 135 in the second phase.
Two cases studies were conducted involving two airlines and 100 different CTOP demands, called as cycle. Thus, it was performed 100 cycles, generating 20.000 CTOP negotiations for each case study, to verify the model behavior by the time. The CTOP period was randomly defined between 6 and 8 hours. The capacity to fly through each FCA was randomly defined as 3 or 5 aircraft per each 15 minutes during the CTOP period.
To evaluate the game, it was defined four parameters that influence the final outcome: 1- final case, defined by the amount of CTOP captured flights of both airlines; 2- final scenario as chosen case, defined by the partition of flights in the IAT order; 3- Airline B move, defined by case and scenario chosen; and 4Airline A move, defined by the combination of all others final parameters and its move.
Considering that Airline A could estimate its delay when Airline B sends NOSLOT for every flight, there are 3 possible moves for the game and 4 game strategies to choose: 1- Airline A would send NOSLOT for every its flights in every game, 2- Airline A would send one trajectory plus NOSLOT option for every its flights in every game, 3- Airline A would send two trajectories plus NOSLOT option for every its flights in every game, and 4- Airline A would choose the game move based on the payoff functions proposed by this approach.
4.1. Without Historical Information Results
It was defined two categories to present the results, regarding the first phase behavior. First, each case has 33.3% to happen during each game; and second, the amount of flights of Airline A is the same in every game.
Considering the first category, the accumulated delay in hours after 100 cycles is presented in Figure 4 by the normal Probability Distribution Function (PDF) for this approach and others possible strategies for Airline A. It is possible to verify that the Game Theory approach is highly related to two trajectories options strategy, which keeps achieving better results than others strategies by the time and reducing the accumulated delay. Considering the second category, the accumulated delay in hours after 100 cycles for case 1 is presented in Figure 5 by the normal PDF for Game Theory and others possible strategies for Airline A.
Figure 4: PDF of Game Theory and others possible strategies for Case 1, 2, 3 of Airline A
Figure 5: PDF of Game Theory and others possible strategies for Case 1 of Airline A
Analyzing Figure 5, the Game Theory approach achieved greater results than others possible strategies for all flights in every game for case 1, which each case has the same probability to happen in the first negotiation. So, regarding only the first category, Airline A would achieve a global delay less than, or equal, the best possible result, in 97% of CTOP negotiations. Thus, this improvement represents a delay reduction of 537 hours for Airline A.
4.2. Learning Approach Results
It was defined two categories regarding the second phase behavior to evaluate the results: 1- Each case has 33.3% to happen during each game, and 2- The amount of flights of Airline A is the same in every game. There were four possible strategies for every CTOP flight: NOSLOT (NS), one trajectory + 1 NS, two trajectories + 2 NS or decide the move according to second phase
payoff.
Considering the case 2 and a 33% strategy, i.e., Airline A will change the strategy in each one third of game total amount following the sequence: 1 trajectory + NOSLOT, 2 trajectories + NOSLOT, and NOSLOT, the accumulated delay in hours after 100 cycles is presented in Figure 6 by the normal probability distribution function for the Game Theory approach and others possible strategies for Airline A.
Figure 6: PDF of Game Theory and others possible strategies for Case 2 and 33% strategy of
Airline A
Figure 7: PDF of Game Theory and others possible strategies for Case 3 and 33% strategy of Airline A
Analyzing Figure 6, this approach reduced the global delay in about 16%. Considering the case 3 and the 33% strategy, i.e., Airline A will change the strategy in each one third of game total amount following the sequence: NOSLOT, 1 trajectory + NOSLOT, and 2 trajectories + NOSLOT. The scenario is presented in Figure 7 and reduced the global delay in about 21%.
During the case study, it was analyzed the behavior of each strategy and final results regarding the possibility to identify a Nash Equilibrium in CTOP and verify if there is some equilibrium that would be the same for every case. The Nash Equilibrium for cases 1; 2 and 3 of the first phase, including the approach that every case would have the same probability to happen, is presented in
Figure 8.
Figure 8: Nash Equilibrium for cases 1, 2 and 3
Analyzing Figure 8, it is possible to verify that Nash Equilibrium happened in two cases: 1- (1×1) both airlines would choose to send 1 trajectory + NOSLOT for every flight, and 2- (2×2) both airlines would choose to send 2 trajectories + NOSLOT for every flight. Considering this, it would be possible to deduce that the higher the proportion of flights most likely to achieve better results sending 2 trajectories + NOSLOT.
5. Conclusion
The CTOP initiative introduced a great opportunity to improve airline business results. A large uncertainty involving a better slot allocation is handled in the TOS. The TOS planning process involves how many options per TOS and which is the best strategy to choose. Each TOS message sent by airlines will determine the final assignment order for each available time slot to fly through a FCA, regarding to each airline has only information about its own flights to take its decisions when a CTOP demand is created.
In this research, we identified that, in general, if an airline does not have any information about the CTOP demand, minimum delay would be achieved sending one trajectory option for each FCA plus a NOSLOT. However, it is possible to make an optimization when some attributes of a CTOP demand indicates a better strategy to be taken in specific cases. Thus, two case studies were conducted for two scenarios: 1- to evaluate the first negotiation, and 2to evaluate the historical information on negotiations. It was generated 20.000 CTOP negotiations, considering 331 flights destined to the New York metropolitan area from airports in Miami, Dallas, Chicago, San Francisco, Los Angeles and Las Vegas. It was considered 2 airlines, 2 FCAs and 81 possible scenarios in the first phase and 135 in the second phase for each game.
It was possible to verify that when an airline took the same strategy for every flight in 100 cycles, the strategy of sending one trajectory in the TOS message for each FCA plus a NOSLOT option to fly around achieved the best results. Considering this strategy and comparing with our proposal, the first phase made possible that Airline A achieved a global delay less than, or equal, in 97% of CTOP negotiations representing a delay reduction of 537 hours for the airline. For the second phase, it was used a learning process by adapting its strategies against competitors to be allocated in better available slots in a CTOP demand, which made possible an improvement rate about 21% for an airline using the second phase.
As the main contribution of this research, the models were developed and achieved satisfactory results using the Game Theory. For initial negotiations, it could be used to reduce the assigned delay and support the airline decision process when historical information is not available. For historical information on negotiations, it could be used to learn and adapt its TOS strategies.
For the future work, it will be considered new methods such as genetic algorithms to improve the results of both game theory models presented in this paper. We intend to use more than two airlines and multiple FCAs en route, which will improve the current airline decision-making process.