3.3 Random Forest Classifier Analysis
The performance of the RFC model was evaluated on the test samples using the parameters listed in Table 2. Precision was excellent, with values close to 1.00 for the high and medium concentration tests (Test_H and Test_M, respectively). For the low concentration test (Test_L), precision remained at 1.00 for TNT but decreased to 0.75 for RDX. This decrease occurred because some low-concentration TNT samples (3 samples, see confusion matrix) were predicted as RDX. It is essential to highlight that, although the RFC model did not correctly predict most of the TNT and RDX samples in the low concentration test, it maintained good accuracy.
The misclassifications arise because at these concentrations (< 3%), soil particle size causes a lack of homogeneity in the samples, creating explosive clusters that make detection difficult, since the interrogation area of the QCL spot is only 4 × 2 mm². With this spot size, there is a certain probability that no HE particles are found when the sample surface is sensed. A possible solution to this problem is to increase the interrogation area of the QCL system so that each spectral acquisition yields an adequate average over the sample.
The recall values in the low concentration test were poor for both explosives because most of the samples that contained HE were classified as containing no HE (none). The f1-score, which summarizes model performance by combining precision and recall, was consequently also poor for the low concentration test.
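The relationship between precision, recall, and f1-score described above can be illustrated with a minimal Python sketch. The confusion matrix below is hypothetical: its counts are not the data behind Table 2 and were chosen only so that the pattern matches the text (TNT precision of 1.00, RDX precision of 0.75 caused by 3 TNT samples predicted as RDX, and most HE samples classified as none). The names `LABELS`, `CONFUSION`, and `per_class_metrics` are likewise assumptions introduced for illustration.

```python
# Hypothetical 3-class confusion matrix (rows = true class, columns = predicted class).
# The counts are illustrative only and are NOT the values reported in Table 2.
LABELS = ["none", "TNT", "RDX"]
CONFUSION = [
    [20, 0, 0],  # true none
    [12, 5, 3],  # true TNT: most predicted as none, 3 predicted as RDX
    [11, 0, 9],  # true RDX: most predicted as none
]


def per_class_metrics(matrix, labels):
    """Return precision, recall, and f1-score for each class of a confusion matrix."""
    metrics = {}
    n = len(labels)
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        fp = sum(matrix[i][k] for i in range(n)) - tp  # column total minus diagonal
        fn = sum(matrix[k]) - tp                       # row total minus diagonal
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
        else:
            f1 = 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


metrics = per_class_metrics(CONFUSION, LABELS)
for label in LABELS:
    m = metrics[label]
    print(f"{label:>4}: precision={m['precision']:.2f} "
          f"recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

With these illustrative counts, TNT keeps a precision of 1.00 while its recall is only 0.25, and RDX has a precision of 0.75 with a recall of 0.45. Because the f1-score is the harmonic mean of the two, it is pulled toward the lower value (0.40 for TNT and about 0.56 for RDX), which is the behavior reported for the low concentration test.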
Fig. 4 shows the ROC curves for each of the tests using the model generated by the RFC method. The ROC curve evaluates the model across different decision thresholds, and the probability of sensing is calculated as the area under the ROC curve. The probability of sensing for the medium and high concentration tests was excellent, with values close to 1.00. Although the probability of sensing for the low concentration test was expected to be poor, it was moderate.
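As a rough illustration of how the area under a ROC curve is obtained, the following Python sketch sweeps the decision threshold over a set of classifier scores and integrates the resulting curve with the trapezoidal rule. It assumes a binary simplification (HE present versus none), which may differ from how the curves in Fig. 4 were built. The labels and scores are hypothetical, and `roc_points` and `area_under_curve` are illustrative helper names, not functions from the study.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = scores[order[i]]
        # Handle tied scores together so each threshold yields one point.
        while i < len(order) and scores[order[i]] == threshold:
            if y_true[order[i]] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def area_under_curve(points):
    """Integrate the ROC curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical labels (1 = HE present, 0 = none) and classifier scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = area_under_curve(roc_points(y_true, scores))
print(f"AUC = {auc:.3f}")
```

In this toy example one negative sample scores higher than one positive sample, so the area is 8/9 (about 0.889) rather than 1.00. A perfectly separated set of scores would give an area of 1.00, the situation approached by the medium and high concentration tests.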