Figure 1. The Atmospheric Radiative Transfer (ART) - GeoChronoTransformers (GCT) model online framework used for global aerosol retrievals from Landsat imagery on the Google Earth Engine (GEE) platform.
Validation method
To comprehensively assess the performance of the proposed ART-GCT-GEE model for global Landsat AOD retrievals, we employ two categories of independent validation. The first is the widely used ten-fold cross-validation (10-CV), a standard approach for validating AI regression tasks (Rodriguez et al., 2010). It is conducted at the sample, station, and monthly levels: 90% of the data samples, ground monitoring stations, or months of the year, respectively, are randomly selected for training the model, while the remaining 10% are reserved for validation (J. Wei et al., 2023). This ensures that the training samples are independent of the testing samples at the overall, spatial, and temporal scales. The cycle is repeated 10 times so that all data samples are used for testing in the cross-validation. Together, the three variants evaluate the overall accuracy of AOD estimates at monitoring stations and the predictive accuracy at locations and on dates where ground measurements are not available, respectively.
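The three 10-CV variants differ only in the unit that is shuffled into folds. The following is a minimal sketch, not the implementation used in this study: the function name `grouped_kfold`, the synthetic arrays, and the random seeds are all hypothetical, and months are treated as the 12 calendar months purely for illustration.

```python
import numpy as np

def grouped_kfold(groups, n_splits=10, seed=0):
    """Yield (train_idx, test_idx) pairs; no group value appears in both sets."""
    groups = np.asarray(groups)
    unique = np.unique(groups)
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(unique)
    # Split the shuffled group labels (not the samples) into n_splits folds
    for fold_groups in np.array_split(shuffled, n_splits):
        test_mask = np.isin(groups, fold_groups)
        yield np.where(~test_mask)[0], np.where(test_mask)[0]

# Hypothetical matchup table: 1,000 samples from 50 stations over 12 months
rng = np.random.default_rng(42)
n = 1000
sample_id = np.arange(n)                    # one group per sample
station_id = rng.integers(0, 50, size=n)    # assumed station labels
month = rng.integers(1, 13, size=n)         # assumed calendar month

splits = {
    "sample": list(grouped_kfold(sample_id)),    # overall accuracy
    "station": list(grouped_kfold(station_id)),  # unseen locations
    "month": list(grouped_kfold(month)),         # unseen dates
}
```

Passing a unique identifier per sample reduces the grouped split to ordinary sample-based 10-CV, so one routine covers all three levels.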
The second category comprises two parts. First, considering the unbalanced distribution of ground monitors and the pronounced spatiotemporal clustering of AOD, we evaluate the model’s predictive capability by withholding entire temporal and spatial units. This entails holding out, in turn, each year from 2013 to 2022 and each of the 10 geographical continents [defined in Figure S1 according to J. Wei et al. (2019a)] for independent validation (withhold one year or one continent): all data samples from a single year or a single continent form the validation set, while data samples from the remaining 9 years or 9 continents are used for model training. Second, we use data samples from the middle 6 years (i.e., 2015–2020) for model training and those from the first two years (i.e., 2013 and 2014) and the last two years (i.e., 2021 and 2022) for validation. This split yields approximately 65% of the samples for training and 35% for testing. It allows us to validate the model’s capacity both to predict historical and to forecast future AOD levels.
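Both parts of this second validation can be written as simple index masks. The sketch below is illustrative only: the helper names `leave_one_group_out` and `middle_years_split`, the integer continent codes, and the uniformly sampled synthetic data are assumptions, so the resulting train/test proportions will not match the approximately 65%/35% split of the real sample set.

```python
import numpy as np

def leave_one_group_out(groups):
    """Yield (held_out_value, train_idx, test_idx) for each unique group."""
    groups = np.asarray(groups)
    for g in np.unique(groups):
        test_mask = groups == g
        yield g, np.where(~test_mask)[0], np.where(test_mask)[0]

def middle_years_split(years, train_start=2015, train_end=2020):
    """Train on the middle years; test on the years before and after them."""
    years = np.asarray(years)
    train_mask = (years >= train_start) & (years <= train_end)
    return np.where(train_mask)[0], np.where(~train_mask)[0]

# Hypothetical sample attributes (uniformly drawn, unlike the real data)
rng = np.random.default_rng(0)
years = rng.integers(2013, 2023, size=2000)   # 2013..2022 inclusive
continent = rng.integers(0, 10, size=2000)    # assumed codes for 10 continents

year_folds = list(leave_one_group_out(years))           # withhold one year
continent_folds = list(leave_one_group_out(continent))  # withhold one continent
train_idx, test_idx = middle_years_split(years)         # 2015-2020 vs. the rest
```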
To quantitatively assess the model’s accuracy and facilitate model comparison, several statistical indicators are used, namely, the Pearson correlation coefficient (R), median bias (MB), mean absolute error (MAE), and root-mean-square error (RMSE). Additionally, to assess the uncertainties of satellite AOD retrievals, we employ the expected error (EE) envelope for AOD retrievals from the MODIS Deep Blue algorithm over land (Equation 8) (Hsu et al., 2013) and the criterion for AOD retrievals of the Global Climate Observing System (GCOS) (Equation 9) (GCOS, 2010).
\(\mathrm{EE}=\pm(0.05+20\%\times\tau_{\text{observation}})\) (8)
\(\mathrm{GCOS}=\pm\max(0.03,\ 10\%\times\tau_{\text{observation}})\) (9)
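As a worked illustration of these indicators and of Equations 8 and 9, the sketch below computes each statistic and the fraction of retrievals falling within the EE and GCOS envelopes. The function name `validation_metrics`, the sign convention for the bias (retrieval minus observation), and the four-point toy data set are assumptions made for the example.

```python
import numpy as np

def validation_metrics(retrieved, observed):
    """Return R, MB, MAE, RMSE and the fractions within the EE and GCOS envelopes."""
    retrieved = np.asarray(retrieved, dtype=float)
    observed = np.asarray(observed, dtype=float)
    diff = retrieved - observed                    # assumed sign convention
    ee = 0.05 + 0.20 * observed                    # Equation 8 half-width
    gcos = np.maximum(0.03, 0.10 * observed)       # Equation 9 half-width
    return {
        "R": float(np.corrcoef(retrieved, observed)[0, 1]),
        "MB": float(np.median(diff)),              # median bias, as defined in the text
        "MAE": float(np.mean(np.abs(diff))),
        "RMSE": float(np.sqrt(np.mean(diff ** 2))),
        "within_EE": float(np.mean(np.abs(diff) <= ee)),
        "within_GCOS": float(np.mean(np.abs(diff) <= gcos)),
    }

# Toy values only; not results from this study
observed = np.array([0.10, 0.20, 0.50, 1.00])
retrieved = np.array([0.12, 0.30, 0.46, 1.30])
stats = validation_metrics(retrieved, observed)
```

Because the GCOS envelope is narrower than the EE envelope at every AOD level, the fraction of retrievals within GCOS can never exceed the fraction within EE.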
Results and discussion
Feature contribution analysis using XAI
DL models are commonly seen as black boxes, but with the emergence of XAI, their internal workings can be unveiled. Here, we select the SHapley Additive exPlanation (SHAP) method to investigate the driving factors in the Landsat AOD retrieval by assessing the contribution of each input variable through the computation of Shapley values (Figure 2). SHAP is model-agnostic and offers both local and global interpretability, thereby supporting transparency, fairness, and interpretability across diverse AI applications, especially DL (Lundberg and Lee, 2017). Our findings demonstrate that the coastal aerosol channel within the deep-blue wavelength (Band 1) makes the largest contribution, with the highest SHAP value of approximately 36% among all features, followed by the blue channel (Band 2), accounting for ~9%. The contributions of individual channels to the AOD retrieval tend to decrease gradually as wavelength increases, consistent with the decreasing sensitivity of the aerosol signal to apparent reflectance (Figure S2). Nevertheless, these contributions remain substantial, and the total contribution of all channels, spanning from visible to shortwave infrared wavelengths, amounts to approximately 58%, underscoring the considerable importance of multi-band information in aerosol retrievals. Observation angles, especially the solar zenith and scattering angles, also have a large impact on the AOD retrieval (total SHAP = 19%). Furthermore, multi-dimensional information encompassing space, time, and altitude, as well as surface \(\text{NDVI}_{\text{SWIR}}\), also plays an important role (SHAP = 2–11%) in enhancing the AOD retrieval within the Transformers model. These results support the rationale behind our feature selection and contribute to a deeper understanding of the physical interpretability of DL.
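Percentage contributions of this kind are commonly obtained by averaging the absolute Shapley values of each feature over all samples and normalizing so that the shares sum to 100%. The sketch below assumes this convention, which is not confirmed by the text; the function name `relative_contributions`, the feature names, and the small Shapley-value matrix are synthetic stand-ins rather than model output.

```python
import numpy as np

def relative_contributions(shap_values, feature_names):
    """Rank features by their share (%) of the total mean absolute Shapley value."""
    shap_values = np.asarray(shap_values, dtype=float)
    mean_abs = np.abs(shap_values).mean(axis=0)    # global importance per feature
    share = 100.0 * mean_abs / mean_abs.sum()      # normalize to percentages
    order = np.argsort(share)[::-1]                # largest contribution first
    return [(feature_names[i], float(share[i])) for i in order]

# Synthetic (samples x features) Shapley values; names are hypothetical
names = ["band1", "band2", "solar_zenith"]
values = np.array([
    [0.30, -0.10, 0.10],
    [-0.50, 0.10, 0.25],
    [0.40, -0.10, -0.10],
])
ranking = relative_contributions(values, names)
```

Summing the shares of a set of features, for instance all spectral bands, gives a grouped contribution in the same way the band and angle totals are reported above.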