Figure 2. Sunburst chart of feature contributions to the satellite AOD retrieval from Landsat imagery, derived using the SHapley Additive exPlanations (SHAP) approach of eXplainable AI (XAI).

Evaluation and uncertainty analysis

Model cross-validation

We first used three independent cross-validation (CV) techniques (sample-based, station-based, and month-based) to assess the performance of the proposed model for the Landsat global aerosol retrieval. In the sample-based CV, the model performs well in estimating AOD from Landsat images across the world, agreeing moderately well with measurements at approximately 78% of the sites (CV-R > 0.5), with median biases within ±2% at about 77% of the sites (Figure 3). Accuracy is higher in populous areas with elevated levels of air pollution, including Southern Africa, India, and East Asia (Table 1), where correlations exceed 0.8. The retrieval uncertainties at most sites remain consistently low, with approximately 84% of the sites having MAE values below 0.08 and 76% having RMSE values below 0.1. Exceptions are observed at a few sites in North Africa, the Middle East, and eastern China (Table 1), where larger absolute errors are primarily associated with high AOD levels resulting from frequent, heavy sand and dust emissions or anthropogenic activities. Overall, at more than 87% of the sites, over 70% of the retrievals fall within the EE envelope. Furthermore, at 68% of the sites, at least 40% of the retrievals meet the GCOS requirements.

Similar spatial patterns are observed in the spatial (temporal) CV results (Figures S3 and S4), but the performance is generally poorer than in the sample-based CV. Nevertheless, in the station-based (month-based) CV, approximately 73% (72%) of the sites still show moderate correlations (CV-R > 0.5), 73% (78%) show low MAE (< 0.08), and 66% (72%) show low RMSE (< 0.1) between the retrievals and ground-truth values. Additionally, retrievals meet the EE criterion (> 70% within the envelope) at approximately 64% (77%) of the sites and the GCOS criterion (> 40% within the requirement) at approximately 48% (57%) of the sites on land.
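The site-level statistics quoted above can be illustrated with a minimal sketch that computes per-site CV-R, MAE, RMSE, and the fractions of retrievals within the EE and GCOS limits, and then the share of sites passing each threshold. This is not the authors' evaluation code. The function names and the record layout are hypothetical, and the EE envelope of ±(0.05 + 20%) and the GCOS requirement of max(0.03, 10%) are assumed here as commonly used definitions; neither is stated in this section.

```python
import math
from collections import defaultdict


def pearson_r(x, y):
    """Pearson correlation; returns NaN if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def site_metrics(obs, pred, ee_abs=0.05, ee_rel=0.20,
                 gcos_abs=0.03, gcos_rel=0.10):
    """Metrics for one site. The EE and GCOS defaults are assumptions."""
    errors = [p - o for o, p in zip(obs, pred)]
    n = len(errors)
    return {
        "r": pearson_r(obs, pred),
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Fraction of retrievals inside the assumed EE envelope
        "within_ee": sum(abs(e) <= ee_abs + ee_rel * o
                         for e, o in zip(errors, obs)) / n,
        # Fraction of retrievals meeting the assumed GCOS requirement
        "within_gcos": sum(abs(e) <= max(gcos_abs, gcos_rel * o)
                           for e, o in zip(errors, obs)) / n,
    }


def summarize_sites(records):
    """records: iterable of (site_id, observed_aod, retrieved_aod)."""
    by_site = defaultdict(lambda: ([], []))
    for site, o, p in records:
        by_site[site][0].append(o)
        by_site[site][1].append(p)
    per_site = {s: site_metrics(o, p) for s, (o, p) in by_site.items()}
    n_sites = len(per_site)

    def share(passes):
        # Share of sites whose metrics satisfy the given test
        return sum(passes(m) for m in per_site.values()) / n_sites

    summary = {
        "r_gt_0.5": share(lambda m: m["r"] > 0.5),
        "mae_lt_0.08": share(lambda m: m["mae"] < 0.08),
        "rmse_lt_0.1": share(lambda m: m["rmse"] < 0.1),
        "ee_gt_70pct": share(lambda m: m["within_ee"] > 0.7),
        "gcos_gt_40pct": share(lambda m: m["within_gcos"] > 0.4),
    }
    return per_site, summary
```

Applied to held-out predictions from each CV scheme (sample-based, station-based, or month-based), `summarize_sites` would return shares comparable in form to the percentages reported above. The thresholds mirror those in the text (CV-R > 0.5, MAE < 0.08, RMSE < 0.1, EE > 70%, GCOS > 40%).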