Kamal Acharya

Ph.D. Candidate at UMBC, AI Researcher

Time Series Forecasting for Advanced Air Mobility: Best Practices

Introduction

A comprehensive guide to building robust time series forecasting models specifically for air mobility data. Drawing from my experience with Tennessee AAM data, I share techniques, common pitfalls, and validation strategies that have proven effective in practice.

Time series forecasting is critical for advanced air mobility operations. Accurate demand predictions enable better resource allocation, reduce operational costs, and improve service reliability.

Data Preprocessing for Temporal Data

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.

Key considerations when working with temporal data:

Handling Missing Values

Missing data is inevitable in real-world applications. Options include:

  • Forward filling (carry last value forward)
  • Interpolation (linear or spline)
  • Model-based imputation
  • Domain-specific approaches

For air mobility data, interpolation often works well for short gaps, while longer gaps may require domain knowledge.

Dealing with Seasonality

Air mobility demand exhibits multiple seasonal patterns:

  • Hourly: Rush hours vs. off-peak
  • Daily: Weekday vs. weekend
  • Weekly: Business travel patterns
  • Seasonal: Weather and holiday effects

Managing Trend Components

Trends in air mobility can be driven by:

  • Gradual growth or decline
  • Policy changes
  • Infrastructure development
  • Economic factors

Accounting for External Factors

External variables can significantly impact demand:

  • Weather conditions
  • Special events
  • Holidays and vacation periods
  • Fuel prices
  • Regulatory changes

Feature Engineering Techniques

Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

Effective features for time series include:

Lagged Values

Use previous values as features (t-1, t-7, t-30, etc.). The lag values selected depend on your domain knowledge and data characteristics.

# Example: Create lagged features
for lag in [1, 7, 30]:
    df[f'demand_lag_{lag}'] = df['demand'].shift(lag)

Rolling Statistics

Compute statistics over rolling windows to capture trends and volatility:

  • Rolling mean (smoothing)
  • Rolling standard deviation (volatility)
  • Rolling min/max (extremes)

Seasonal Indicators

Create binary features for seasonal patterns:

  • Hour of day
  • Day of week
  • Month of year
  • Is holiday?
  • Is weekend?

Domain-Specific Features

For air mobility specifically:

  • Number of active vertiports
  • Fuel availability
  • Weather severity index
  • Event proximity
  • Network density

Model Selection Strategies

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Several approaches can be employed:

Classical Methods

ARIMA (AutoRegressive Integrated Moving Average)

  • Works well for stationary or nearly stationary data
  • Interpretable parameters
  • Limited ability to capture complex patterns
  • Fast training

Exponential Smoothing

  • Good for data with clear trends and seasonality
  • Simple and interpretable
  • Less flexible than modern approaches

Machine Learning Methods

Random Forests

  • Captures non-linear relationships
  • Handles multiple feature types
  • Less prone to overfitting than single trees
  • Fast predictions

Gradient Boosting (XGBoost, LightGBM)

  • Often achieves state-of-the-art performance
  • Requires careful hyperparameter tuning
  • Can handle mixed data types
  • Good feature importance estimation

Deep Learning Methods

LSTM Networks

  • Excellent for capturing long-term dependencies
  • Handles variable-length sequences
  • Requires significant training data
  • Computationally expensive

Transformers

  • State-of-the-art in many applications
  • Parallelize well on modern hardware
  • Even more data-hungry than LSTMs
  • Great for capturing complex patterns

Temporal Convolutional Networks (TCN)

  • Efficient training and inference
  • Good for fixed-window predictions
  • Competitive with LSTMs for many tasks

Ensemble Methods

The best approach is often an ensemble of multiple techniques:

  • Average predictions from multiple models
  • Stack multiple models
  • Use different models for different forecast horizons

Cross-Validation for Time Series

Unlike standard cross-validation, temporal data requires special handling to avoid data leakage. Time series cross-validation uses expanding or sliding windows to maintain the temporal order of data.

Expanding Window (Walk-Forward)

  • Train on data up to time t, test on time t+1
  • Simulate real-world deployment
  • More realistic but computationally expensive
  • Best for model evaluation

Sliding Window

  • Fixed-size training window
  • Test on next time period
  • Balance between realism and efficiency
  • Good for hyperparameter tuning

Handling Missing Data and Anomalies

Missing Data

As discussed in preprocessing, strategies vary based on:

  • Gap duration
  • Data frequency
  • Business requirements
  • Availability of related variables

Anomalies

Anomalies (outliers) require special treatment:

  • Detect using statistical methods (Z-score, IQR)
  • Use robust models resistant to outliers
  • Separately model anomalous periods
  • Investigate root causes

Retraining Strategy

Decide how frequently to retrain your model:

  • Daily: Adapt to recent changes quickly
  • Weekly: Balance responsiveness and stability
  • Monthly: Reduce computational overhead

Evaluation Metrics

Standard regression metrics:

  • MAE (Mean Absolute Error): Average prediction error
  • RMSE (Root Mean Squared Error): Penalizes large errors
  • MAPE (Mean Absolute Percentage Error): Scale-independent

Time series specific:

  • Directional Accuracy: Predict up/down correctly?
  • Forecast Horizon Performance: Error varies by horizon
  • Seasonal MAPE: Track performance during peak/off-peak

Practical Implementation Tips

  1. Start Simple: Begin with ARIMA or exponential smoothing
  2. Establish Baselines: Compare against naive forecasts
  3. Use Ensembles: Combine multiple models
  4. Monitor Performance: Track errors over time
  5. Iterate: Continuously refine based on results
  6. Document Everything: Track experiments and changes

Conclusion

Building effective forecasting models requires domain knowledge, careful feature engineering, and rigorous validation. The best approach is often an ensemble of multiple techniques. For air mobility specifically, incorporating domain knowledge about operations, demand patterns, and external factors is crucial.

The models we build today will shape the efficiency and reliability of future air mobility services. Taking the time to do this right pays dividends in operational performance.


Have questions or thoughts? Feel free to reach out