Diagnostic performance of RCPath Thy grading system (Table 4 and supplementary Table B)

We measured the diagnostic performance of our FNAC categories and compared the results with the figures published in the latest RCPath guidelines in 2016,1 and with the latest meta-analyses published by Poller et al. 5,6 for the RCPath Thy system (13 articles, 3911 nodules) and Bongiovanni et al. 9 for TBSRTC (8 articles, 6362 nodules) (Table 4).
In our series, 35% of the samples were categorised as Thy1, which is higher than previously published figures (18-27%).1,6 There were also some minor differences in the utilisation of the Thy1 and Thy2 categories between the RCPath system and TBSRTC (Grades I and II), with the latter system identifying fewer grade I and more grade II samples (Table 4).
While the risk of malignancy (ROM or PPV) in our Thy5 category (100%) was comparable to the published literature for the RCPath system (98-100%)1,5 and TBSRTC (99%),9 our malignancy rates were higher for all other categories (Table 4). ROM was higher for the Thy2 category in our study (15%) compared to the RCPath system rates (1.4-5%) and TBSRTC (4%). ROM was also considerably higher for our Thy4 patients (90%) compared to the RCPath figures (up to 68%),1 the meta-analysis of results using the Thy system by Poller et al. 5 (79%), or the TBSRTC system meta-analysis by Bongiovanni et al. 9 (75.2%). Combining the Thy4 and Thy5 groups (suspicious or malignant FNAC) demonstrated a high specificity, PLR and PPV for malignancy (99.1%, 30.1 and 92.3% respectively), with low sensitivity (27.6%) and moderate NPV (77.4%) and accuracy (78.7%) (Table 4 and supplementary Table B). Combining the Thy3-5 groups (any abnormal FNAC) improved the sensitivity and the NPV (67.8% and 85.3% respectively) at the expense of reducing the specificity, PPV and overall accuracy (74.3%, 51.3% and 72.5% respectively).
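For reference, the measures reported above follow from a standard 2×2 table that cross-classifies the FNAC result (positive or negative at the chosen threshold) against final histology. The Python sketch below shows the usual definitions only. The function name and the counts in the usage lines are hypothetical illustrations and are not the study data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic measures.

    tp: cytology positive, histology malignant
    fp: cytology positive, histology benign
    fn: cytology negative, histology malignant
    tn: cytology negative, histology benign
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # The positive likelihood ratio is undefined when there are no false positives.
    plr = sensitivity / (1 - specificity) if fp else float("inf")
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),   # equals the ROM of the positive group
        "npv": tn / (tn + fn),
        "plr": plr,
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts for illustration only (not the study data).
m = diagnostic_metrics(tp=30, fp=10, fn=20, tn=140)
```

Lowering the positivity threshold (for example from Thy4-5 to Thy3-5) moves cases from the negative to the positive row of the table, which is why sensitivity and NPV rise while specificity and PPV fall.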
Discussion
FNAC plays an important role in the initial evaluation and decision planning for patients with thyroid nodules. However, FNAC has drawbacks, particularly its relatively high rate of inadequate or unsatisfactory samples, which necessitates repeat testing, and its inability to distinguish between benign and malignant lesions in some situations.8,10-12 Moreover, false positive diagnoses of malignancy can sometimes occur, which can lead to unnecessary thyroid surgery with a 2-10% risk of long-term postoperative morbidity.13,14 As the decision to pursue surgery as opposed to conservative management is greatly influenced by the FNAC results, there is a need for a consistent reporting process and rigorous evaluation of the diagnostic utility of thyroid FNAC.1,13,15,16
The RCPath Thy grading system was designed to refine and improve the reporting process, and to provide clarity for patient management.1 It can provide consistent, reproducible and auditable thyroid cytopathological reports, improve the communication process between clinicians and patients, and give figures for the predicted risk of malignancy with each cytological diagnosis.8,9
This study builds on the growing body of literature to validate the diagnostic utility of the RCPath Thy system in guiding the day-to-day clinical management.1,5,6 While the validity of using six-tiered systems (like the RCPath system or TBSRTC) is justifiable by the strong reported cyto-histological correlation, there was a notable variability in the implied risk of malignancy for different DCs and subsequently the percentage of patients undergoing surgery.1,5,6,9 As standards for FNAC reporting outcomes are not universally set, quality assurance at individual institutions by undertaking regular audit is paramount to maintaining accuracy.1,6,17
Our results demonstrate higher rates of malignancy and higher utilisation of the Thy1 non-diagnostic category in our cohort. This can be partially explained by sampling error from using the less precise PGFNAC technique in cases from the early years of the study, and possibly by poor operator technique.19 In addition, unsatisfactory sample preparation and preservation, especially from cystic lesions, are well recognised factors leading to higher rates of non-diagnostic aspirates.5,8 The Thy2 category also had a higher rate of malignancy in our cohort. Interestingly, 40% of the false negative Thy2 cases had PTMC (<1cm), which further illustrates the sampling challenges of smaller lesions, especially with the PGFNAC technique. In the meta-analysis performed by Wang et al. 20, the authors noted a significant difference between the FN rates for benign FNAC between academic (2%) and community hospitals (10%). The authors attributed this difference to higher sampling error with PGFNA and differences in cytological interpretation.20 Moreover, selection bias for treatment may also skew the ROM figures, as patients with Thy1 or Thy2 results will only undergo surgery if they show suspicious clinical or radiological features.5,7
Cystic changes and degenerative processes in thyroid nodules can often cause florid atypia, with a considerable potential for FN results and malignancy in around 14-17% of Thy1c and 4-33% of Thy2c nodules.13,21-24 Interestingly, compared with the published figures, our results showed a higher ROM in Thy1c (19%) and Thy2c (50%), which can only be partially explained by treatment selection bias. Nevertheless, we agree with the BTA guidelines that FNAC should be repeated for all Thy1 and Thy2 cases with suspicious clinical or sonographic features.8 Table 5 summarises the recommended clinical actions for each RCPath FNAC category.
One of the main aims of the RCPath Thy nomenclature is to reduce the cytological reporting variability for indeterminate thyroid nodules. These are often challenging for clinicians and pathologists because of their heterogeneous morphology and the difficulty of establishing any invasive characteristics cytologically without thorough histopathological examination.6 Our malignancy rates for the Thy3a and Thy3f categories are notably higher, as shown in Table 4. It is well recognised that the Thy3a category can often be perceived by cytopathologists as a ‘haven of safety’, avoiding false negatives when assigned instead of Thy2, potentially unnecessary surgery when assigned instead of Thy3f, and false positives when assigned instead of Thy4.7
In our cohort, Thy4 patients also had a higher malignancy rate (90%) compared to the published figures. The BTA recommendation of diagnostic hemithyroidectomy for Thy4 lesions is based on the RCPath guidelines, which quote a 30-35% possibility of benign disease in this cohort, and hence aims to avoid the potential long-term morbidity of total thyroidectomy.8 However, in centres with a malignancy rate of >90% for Thy4 cytology, an argument could be made for offering total thyroidectomy to patients with larger nodules (>4cm) to avoid a second procedure of completion hemithyroidectomy.8 Malignancy is almost always histologically confirmed in Thy5 patients, justifying our standard practice of therapeutic hemi- or total thyroidectomy ± central compartment neck dissection guided by the MDT decision.1,8
In our cohort, Thy4 patients also had higher malignancy rates compared to the published figures. In keeping with the BTA recommendations (Table 5), our results confirm that total thyroidectomy should not be offered to Thy4 lesions as this would put at least one in ten patients at risk of unnecessary surgery with its potential long-term morbidity.8 However, malignancy is almost always histologically confirmed in Thy5 patients, justifying our standard practice of therapeutic hemi- or total thyroidectomy ± central compartment neck dissection guided by the MDT decision.1,8
The limitations of our study include a possibly heterogeneous population, the inclusion of samples taken using PGFNAC techniques, and a study period that crosses multiple revisions of the Thy system nomenclature. Since this was a retrospective study, it is sometimes difficult to ascertain that a histologically diagnosed malignant nodule is the same one that was aspirated for FNAC preoperatively. Moreover, we only included histologically correlated FNAC samples, which likely skewed our malignancy rates in the lower risk categories, where cancer is not frequently encountered. While using a tiered classification nomenclature like the Thy or TBSRTC systems may improve the comparability of results between institutions, these comparisons must be made with caution as the results are often influenced by multiple factors. These factors include differences in thyroid cancer prevalence, variations in nodule selection for aspiration, the skill of the aspirators, the aspiration techniques, the experience of the cytopathologists, and the percentage of cases progressing to surgery.6,7 Moreover, the methods of calculating ROM and PPV rates vary widely in the literature, making it very difficult to compare different studies.6,7,16
The other issue limiting the generalisability of the FNAC outcomes is the inherent inter- and intraobserver variability of thyroid cytology reporting.6,13,15,25 In a large multi-centre prospective study by Cibas et al. 16 that assessed the reporting variability of the TBSRTC system, the concordance level between the local cytopathologists and a central review panel was only 64%, with 74.7% intraobserver concordance. The false positive rate of category VI (Thy5) was 6%, and these patients could potentially have undergone unnecessary surgery if they had not been downgraded by the central review panel.15 Studies on the RCPath Thy system show a very similar pattern, with the highest concordance for Thy1 and Thy5, moderate concordance for Thy2 and Thy3f, and the lowest concordance for the Thy3a and Thy4 categories.25
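The effect of the chosen denominator on the ROM can be sketched as follows. This is a minimal illustration of two commonly used calculation approaches, assuming hypothetical counts chosen only to show the direction of the bias. The function names are illustrative and the numbers are not taken from this study.

```python
def rom_resected(malignant, resected):
    """ROM among nodules with histological follow-up only.

    This tends to overestimate the true risk in low-risk categories,
    because only clinically or radiologically suspicious nodules are resected.
    """
    return malignant / resected


def rom_all_aspirated(malignant, aspirated):
    """ROM among all aspirated nodules in the category.

    This assumes every unresected nodule is benign, so it is a lower bound.
    """
    return malignant / aspirated


# Hypothetical low-risk category: 200 nodules aspirated, 20 resected, 3 malignant.
upper = rom_resected(malignant=3, resected=20)         # 15% of resected nodules
lower = rom_all_aspirated(malignant=3, aspirated=200)  # 1.5% of all aspirates
```

Under these assumptions the same category yields a ROM of either 15% or 1.5%, so a study reporting one figure cannot be compared directly with a study reporting the other.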
Conclusion
The use of tiered classification nomenclature, such as the RCPath Thy system, has paved the way to standardised thyroid FNAC reporting. However, diagnostic performance can vary between institutions. Our results demonstrate generally higher rates of malignancy compared to other published series. Each centre should be able to discuss suspicious cytology results in a multidisciplinary team setting and to quote local malignancy rates during patient counselling. It is prudent for all units performing thyroid diagnostics to control the factors that might lead to reporting variability, and to undertake regular audit of their performance. Adjunct immunohistochemical and molecular testing is promising, and may in future provide a route to improving thyroid cytology outcomes and so help to standardise reporting outcomes.