3.3 Random Forest Classifier Analysis
The performance of the RFC model was evaluated on the test samples
using the metrics reported in Table 2. Precision was excellent, with
values close to 1.00 for the high and medium concentration tests
(Test_H and Test_M, respectively).
However, for the low concentration test (Test_L), precision remained at
1.00 for TNT but decreased to 0.75 for RDX. This decrease occurred
because some low-concentration TNT samples (3 samples, see confusion
matrix) were predicted as RDX. Although the RFC model did not correctly
predict most of the samples containing TNT and RDX in the low
concentration test, it maintained good accuracy. The misclassifications
arise because at these concentrations (< 3%), the soil particle size
causes a lack of homogeneity in the samples, creating explosive clusters
that make detection difficult, since the interrogation area of the QCL
spot is only 4 × 2 mm².
With this spot size, there is a certain probability of not finding HE
particles when sensing takes place on the sample surface. A possible
solution to this problem is to increase the interrogation area of the
QCL system so that the acquired spectrum is an adequate average over
the sample surface.
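
The effect of the interrogation area can be illustrated with a simple sketch. It assumes, purely for illustration, that HE clusters are scattered over the sample surface as a homogeneous Poisson process; this model and the cluster density used below are assumptions, not measured values or part of the reported analysis. Under that assumption, the probability that a spot contains no cluster decays exponentially with the spot area.

```python
import math


def p_miss(clusters_per_mm2: float, area_mm2: float) -> float:
    """Probability that a spot of the given area contains no HE cluster,
    assuming clusters follow a homogeneous Poisson process (an assumption)."""
    return math.exp(-clusters_per_mm2 * area_mm2)


density = 0.1        # hypothetical cluster density (clusters per mm^2)
base_area = 4 * 2    # QCL spot area in mm^2, from the 4 x 2 mm spot in the text

# Compare the current spot with hypothetical spots 2x and 4x larger.
for scale in (1, 2, 4):
    area = base_area * scale
    print(f"{area:3d} mm^2 -> P(no HE cluster in spot) = {p_miss(density, area):.3f}")
```

In this sketch the probability of missing every cluster drops each time the area is doubled, which is the qualitative behavior expected from enlarging the interrogation area.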
The recall values in the low concentration test were poor for both
explosives because most of the samples that contained HE were classified
as having no HE (none). The f1-score, which summarizes model performance
by combining precision and recall (it is their harmonic mean), was
consequently also poor for the low concentration test.
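
The relationship between the confusion matrix and the precision, recall, and f1-score values can be sketched as follows. The counts below are hypothetical and are not the values of the confusion matrix reported in this work; they were chosen only so that 3 TNT samples are predicted as RDX and most HE samples are predicted as none, mirroring the situation described for Test_L. The function name `per_class_metrics` is likewise illustrative.

```python
labels = ["none", "TNT", "RDX"]

# Hypothetical counts: rows = true class, columns = predicted class.
cm = [
    [20, 0, 0],   # true none
    [12, 5, 3],   # true TNT: 12 predicted as none, 3 predicted as RDX
    [11, 0, 9],   # true RDX: 11 predicted as none
]


def per_class_metrics(cm, labels):
    """Compute precision, recall and f1-score for each class of a confusion matrix."""
    metrics = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]
        predicted_k = sum(row[k] for row in cm)   # column sum: all samples predicted as k
        actual_k = sum(cm[k])                     # row sum: all samples truly of class k
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


metrics = per_class_metrics(cm, labels)
for name, m in metrics.items():
    print(f"{name:5s} precision={m['precision']:.2f} "
          f"recall={m['recall']:.2f} f1={m['f1']:.2f}")
```

With these illustrative counts, TNT keeps a precision of 1.00 and RDX drops to 0.75 because of the 3 TNT samples predicted as RDX, while the recall of both explosives is low because most HE samples fall in the none column; the f1-score is pulled down accordingly.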
Fig. 4 shows the ROC curves for each of the tests using the model
generated by the RFC method. The ROC curve evaluates the model at
different decision thresholds, and the probability of sensing is
calculated as the area under the ROC curve. The probability of sensing
for the medium and high concentration tests was excellent, with values
close to 1.00. Although the probability of sensing for the low
concentration test was expected to be poor, it was moderate.
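
A minimal sketch of how a ROC curve and its area can be obtained from classifier scores is given below. It treats the problem as binary (one class against the rest), which is an assumption about how the multi-class output is handled; the labels and scores are made-up illustrative values, and the function names are hypothetical. The area is computed by trapezoidal integration and equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one.

```python
def roc_curve_points(y_true, scores):
    """Return (fpr, tpr) lists obtained by sweeping the decision threshold
    from the highest score down to the lowest."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # Indices of the samples sorted by decreasing score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if y_true[i]:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample of a group of tied scores.
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by trapezoidal integration."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2
               for k in range(1, len(fpr)))


# Illustrative data: 1 = class of interest present, 0 = rest.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]

fpr, tpr = roc_curve_points(y_true, scores)
print(f"AUC = {auc(fpr, tpr):.3f}")
```

A classifier that ranks every positive sample above every negative one gives an area of 1.00, while random scoring gives an area close to 0.5, which is why values near 1.00 are read as excellent and intermediate values as moderate.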