Links To And Excerpts From “The New Era of TIRADSs to Stratify the Risk of Malignancy of Thyroid Nodules: Strengths, Weaknesses and Pitfalls”

In this post, I link to, and excerpt from The New Era of TIRADSs to Stratify the Risk of Malignancy of Thyroid Nodules: Strengths, Weaknesses and Pitfalls [PubMed Abstract] [Full-Text HTML] [Full-Text PDF]. Cancers (Basel). 2021 Aug 26;13(17):4316. doi: 10.3390/cancers13174316.

The above resource has been cited by 12 articles in PubMed.

There are 72 similar articles in PubMed.

All that follows is from the above resource.


Simple Summary

The aim of this review is to provide the reader with a comprehensive overview of thyroid imaging and reporting data systems used for thyroid nodules, so as to understand how nodules are scored with all existing systems. Both ultrasound-based risk stratification systems (RSSs) and indications for fine-needle aspiration (FNA) are described. Systems are compared by analyzing their strengths and weaknesses. Studies show satisfactory sensitivities and specificities for the diagnosis of malignancy for all systems, and none of them has shown a real significant advantage over the others in terms of raw diagnostic value. Interobserver agreement is also very similar for all systems, fairly adequate to robust. Dimensional cut-offs for fine-needle aspiration are quite similar and all RSSs seem to effectively reduce the number of unnecessary FNAs. Merging all existing systems into a common international one is desirable.


Since 2009, thyroid imaging reporting and data systems (TI-RADS) have been playing an increasing role in the field of thyroid nodule (TN) imaging. Their common aims are to provide sonologists of varied medical specialties and clinicians with an ultrasound (US) based malignancy risk stratification score and to guide decision making for fine-needle aspiration (FNA). Schematically, all TI-RADS scores can be classified as either pattern-based or point-based approaches. The main strengths of these systems are their ability (i) to homogenize US TN descriptions among operators, (ii) to facilitate and shorten communication on the malignancy risk of TN between sonologists and clinicians, (iii) to provide quantitative ranges of malignancy risk assessment with high sensitivity and negative predictive values, and (iv) to reduce the number of unnecessary FNAs. Their weaknesses are (i) the remaining inter-observer discrepancies and (ii) their insufficient sensitivity for the diagnosis of follicular cancers and follicular variant of papillary cancers. The most common pitfalls are degenerating shrinking nodules and confusion between individual and coalescent nodules. The benefits of all TI-RADSs far outweigh their shortcomings, explaining their rising use, but the necessity to improve and merge the different existing systems remains.

Keywords: thyroid, nodule, risk stratification, TI-RADS, fine-needle aspiration

1. Introduction

Risk stratification systems (RSSs) have two main aims. The first one is to homogenize the results of thyroid ultrasound (US) reports, by using a quantitative cancer risk estimation approach, in order to facilitate communication between practitioners and with the patients. Reducing the ambiguities of qualitative descriptions such as “multinodular goiter to be confronted with biological tests” allows for a quick understanding of the risk level of a thyroid nodule. The second one is to provide guidelines regarding the indications for fine-needle aspiration biopsy (FNA). Here again, limiting the subjectivity of this decision is crucial if patients are to receive homogeneous care.

Some of these systems, but not all, have incorporated a lexicon and even more rarely a standardized report. At least the former seems mandatory to increase inter-observer description agreement.

However, all RSSs tend to base the whole stratification and decision making process solely on US criteria and nodular size, whereas obviously many other factors should be, and are, integrated when accomplishing these tasks. Among these are patient’s age and sex; age of the disease; family history of thyroid cancer; personal history of cervical irradiation; clinical symptoms such as dysphonia, dysphagia, or dyspnea; nodular location; number of nodules; and presence of suspicious cervical lymph nodes. Thus, a more thorough algorithm, also including laboratory tests such as TSH and calcitonin and thyroid scintigraphy when deemed appropriate, may be sought in the future. This review will describe present RSSs, their strengths, weaknesses, and pitfalls via a comprehensive analysis of the literature and make some suggestions for the future.

1.1. Description of Present RSSs

Several national and international professional organizations have developed US-based risk-stratification systems. They are often referred to as thyroid imaging reporting and data systems, or TIRADS, terms derived from those used for breast cancer imaging. Some societies have chosen to stay with their own name to refer to their system (e.g., the American Thyroid Association). RSSs assign thyroid nodules to categories characterized by increasing risk ranges for cancer, based on the presence or not of specific US features. Two of the eight RSSs described below, ACR- and C-TIRADS, are point-based systems and the six others are pattern-based. Pattern-based scoring consists of recognizing a grouping of US features in a single figure, whereas point-based scoring systems consist of summing points that have been formerly attributed to US features.
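The difference between the two scoring approaches can be sketched in a few lines of Python. This is only an illustration of the mechanics: the feature names, point values, and patterns below are hypothetical placeholders and are not the published ACR-TIRADS, C-TIRADS, or any other RSS tables.

```python
# Illustrative only: feature names, point values, and patterns are
# hypothetical placeholders, NOT taken from any published TIRADS.
HYPOTHETICAL_POINTS = {
    "solid": 2,
    "hypoechoic": 2,
    "taller_than_wide": 3,
    "irregular_margin": 2,
    "microcalcifications": 3,
}

# Each hypothetical pattern is a set of features that must all be present,
# listed from most to least suspicious.
HYPOTHETICAL_PATTERNS = [
    ({"solid", "hypoechoic", "microcalcifications"}, "high"),
    ({"solid", "hypoechoic"}, "intermediate"),
]


def point_based_score(features):
    """Point-based approach: sum the points attributed to each observed feature."""
    return sum(HYPOTHETICAL_POINTS.get(f, 0) for f in features)


def pattern_based_category(features):
    """Pattern-based approach: return the category of the first pattern fully present."""
    observed = set(features)
    for required, category in HYPOTHETICAL_PATTERNS:
        if required <= observed:  # every required feature is observed
            return category
    return "low"


nodule = ["solid", "hypoechoic"]
print(point_based_score(nodule))       # sum of the placeholder points
print(pattern_based_category(nodule))  # first matching placeholder pattern
```

In the point-based sketch every feature contributes independently to a total, whereas in the pattern-based sketch the nodule is assigned to a category only when a whole grouping of features is recognized.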

1.4. Raw Diagnostic Values in Comparative Studies (before Applying Size Cut-Offs for the Decision to Perform FNA)

Many studies have attempted to compare the systems with each other. In particular, a comparison was performed between the BTA, AACE, and ATA RSSs []. The conclusions were that classification systems had an elevated positive predictive value for malignancy in high-risk classes. ATA and AACE/ACE/AME systems were effective for ruling out the indication for FNA in low US risk nodules. A similar diagnostic accuracy and a substantial inter-observer agreement were provided by the 3- and the 5-category classifications.

Thus, up to now, no RSS has shown a real significant advantage over the others in terms of raw diagnostic value.

2. Indications for FNA and Diagnostic Values of RSSs after Applying Size Cut-Offs for FNA

2.1. Dimensional Cut-Offs

Each US risk-stratification system has set its own cut-offs for guiding fine needle aspiration cytology indications (Table 2).

2.2. Diagnostic Value after Applying Cut-Offs: Decision Guidance, Avoided FNAs, and Missed Carcinomas

Different retrospective series demonstrated that a very small proportion of thyroid cancers are missed after applying the ACR committee's size cut-offs for FNA. This proportion would decrease with lower cut-offs, but at the cost of a substantial increase in the number of benign nodules that would be explored. A series showed that 13 cancers (11 PTC, one follicular and one medullary thyroid cancer) among nodules measuring 15–25 mm would have been missed if FNA had not been performed for the 874 nodules measuring less than 25 mm included in this series [38]. Middleton et al. showed, in a series of 3822 nodules Bethesda II or VI, that among 352 malignant nodules (303 were histologically confirmed), 40 nodules would have received a recommendation for no further evaluation, among which were 16 malignant nodules ≥10 mm [38].

A recent meta-analysis [39], including 12 studies [7,13,26,40–48] representing 18,750 nodules, evaluated the ability of 5 US RSSs (AACE, ACR, ATA, EU-TIRADS, K-TIRADS) for the appropriate selection of thyroid nodules for FNA. Diagnostic odds ratios, representing the test performance and corresponding to the odds of the FNA being indicated in a malignant nodule compared to the odds of the FNA being indicated in a benign one, were calculated for each US RSS. Keeping in mind that data on AACE and EU-TIRADS were sparse, the diagnostic odds ratio was higher for ACR-TIRADS in comparison with the other systems. The higher discriminative power was related to a higher ability of ACR-TIRADS to select malignant nodules for FNA, while no difference was found for benign nodules. This cannot be explained by the size cut-offs for FNA in intermediate- and high-risk nodules, given that they are similar to those of the other US RSSs. However, fewer nodules will probably be classified as intermediate or highly suspicious than in other systems, because of the point-based pattern of this RSS. As intermediate risk nodules are frequent, this could explain the advantage of the ACR-TIRADS over the other systems.
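The diagnostic odds ratio described above can be written out as a short calculation. The following is a minimal sketch; the function name and the counts are hypothetical and are not data from the meta-analysis.

```python
def diagnostic_odds_ratio(tp, fn, fp, tn):
    """Odds of FNA being indicated in malignant nodules divided by the
    odds of FNA being indicated in benign nodules.

    tp: malignant nodules with FNA indicated
    fn: malignant nodules with FNA not indicated
    fp: benign nodules with FNA indicated
    tn: benign nodules with FNA not indicated
    """
    if min(tp, fn, fp, tn) <= 0:
        raise ValueError("all four counts must be positive")
    odds_malignant = tp / fn
    odds_benign = fp / tn
    return odds_malignant / odds_benign


# Hypothetical counts, chosen only to make the arithmetic easy to follow:
# odds in malignant nodules = 80 / 20 = 4, odds in benign = 300 / 600 = 0.5
print(diagnostic_odds_ratio(tp=80, fn=20, fp=300, tn=600))  # 8.0
```

A ratio of 1 would mean the RSS indicates FNA equally often in malignant and benign nodules; the larger the ratio, the better the system discriminates between them.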

3. Weaknesses of TIRADSs

3.1. Insufficient Sensitivity for the Diagnosis of Follicular Thyroid Carcinoma and Follicular Variant of PTC

While historically the follicular variant of PTC (FVPTC) was considered a diagnostic pitfall of US, this notion was not confirmed in a report published in 2018 on 34 cases []. The K-TIRADS score was 3, 4, and 5 in 5.9%, 2.9%, and 91.2%, respectively. Thus, the false negative rate does not seem to exceed 6%.
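The percentages above can be checked with a quick calculation. The per-category counts (2, 1, and 31) are not stated in the excerpt; they are an assumption inferred here as the only whole numbers out of 34 that round to the reported percentages.

```python
total_cases = 34

# Assumed counts per K-TIRADS score, inferred from the reported percentages.
assumed_counts = {3: 2, 4: 1, 5: 31}

# The assumed counts must add up to the 34 reported cases.
assert sum(assumed_counts.values()) == total_cases

percentages = {
    score: round(100 * n / total_cases, 1) for score, n in assumed_counts.items()
}
print(percentages)  # {3: 5.9, 4: 2.9, 5: 91.2}
```

Under this reading, the "does not seem to exceed 6%" figure corresponds to the 5.9% of cases scored K-TIRADS 3.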

To conclude, in FTCs cases, the RSSs false negative rate seems persistently higher than for FVPTCs, around 25%. Clinicians should be aware of this, especially in the era of thermal ablation, to try to avoid treating such nodules by alternatives to surgery. More specifically, exclusively solid isoechoic and mildly hypoechoic nodules should always be considered with caution.

3.2. Insufficient Specificity to Rule-Out Autonomously Functioning/Hot Thyroid Nodules from FNA

Autonomously functioning thyroid nodules (AFTN) account for 5–10% of palpable lesions and are very rarely malignant. . . . It was concluded that ultrasound RSSs prompt inappropriate FNA in a significant number of patients with AFTN. The management strategy of thyroid nodules being essentially based on US risk stratification and size cut-offs, it could be considered that, depending on the RSS used, 2.7% to 9% of all nodules should have been excluded from FNA.

However, the reverse strategy of submitting all TNs to scintigraphy to exclude an AFTN before US exploration would drastically augment the costs with no diagnostic gain in, at least, 90% of all nodules.

3.3. High Rates of Nodules Classified at Intermediate Risk (Usually TI-RADS 4)

Based on the high negative predictive value of all RSSs, it could be considered that FNA could be avoided for most nodules classified as low risk, especially for those of mixed composition. At the opposite end, the high positive predictive value of high-risk categories prompts the indication for FNA in most cases if the size is over 10 mm, knowing these represent a minority of all nodules.

Conversely, the indication for FNA in intermediate risk nodules is still a matter of concern. Indeed, these nodules represent a substantial part of all nodules discovered during US thyroid imaging and even a more substantial part of those referred for FNA. Using the ATA US pattern risk assessment, nodules were classified as intermediate risk in 31% of cases []. Regarding the AACE, 56.9% were considered at intermediate risk in another report []. In a study on 305 nodules with final histology as gold standard, it was shown that ACR-TIRADS 4 nodules represented 28.8% of all nodules and EU-TIRADS 4 category 22% []. Finally, in a study with a prospective design with cytological examination as a gold standard on 4550 nodules [], the rate of TIRADS 4A nodules (equivalent to EU-TIRADS 4) was 44.5%.

Thus, the main difficulty in significantly and appropriately reducing the indications for FNA is the high rate of intermediate risk nodules.

3.4. Thyroid Diffuse Masses

All RSSs have been studied and developed for nodules. However, it is unclear whether diffuse thyroid masses have been taken into account in those systems. These are most of the time responsible for pressure symptoms with a rapid development. The most common US presentation is a hypoechoic mass invading one lobe or all the thyroid gland. It is usually hypoechoic, with poorly defined margins. Vascularity and stiffness are variable and they can be accompanied or not by suspect cervical lymph nodes. The following main aspects should be considered:

  • First, several etiological hypotheses should always be mentioned in the US report, including anaplastic carcinoma, lymphoma, metastases from non-thyroidal origin, and large differentiated papillary and follicular carcinomas. Riedel’s thyroiditis could be added to this list. In this case, marked hypoechogenicity and absorption of the US beam, absence of vascularity, and high stiffness are relatively characteristic features.
  • The context helps refine the hypotheses. Knowledge of a prior renal cell carcinoma, for instance, favors a metastasis, and rapid development in an elderly subject with severe pressure symptoms favors an anaplastic carcinoma.
  • Core-needle biopsy or surgical biopsy, depending on the center’s habits, should systematically be added to FNA, because FNA has low diagnostic power in this situation.
  • Quick referral to a tertiary care center is advised.

3.5. Absence of Validation in Large Non-Specialized Medical Communities

One of the main issues in adopting RSSs in daily life is the limited evidence regarding their diagnostic value when applied by non-specialized teams, most of the available literature on the subject being produced by expert centers. Studies carried out outside the specialized world of thyroid imaging without dedicated US machines are necessary to confirm the real world efficiency of all RSSs.

4. Pitfalls

4.1. Shrinking Nodules

Nodules with a cystic or hemorrhagic component can evolve by shrinking. Risk factors for such evolution include abundant blood supply, non-smooth margin of the internal solid portion, and a spongiform internal content []. The process can be of variable length, sometimes lasting for years, but frequently leads to ambiguous US features mimicking malignancy. . . . The cytology is mainly composed of thick colloid and macrophages and the cytopathologist should be informed of the hypothesis. Otherwise, the result could be considered as non-diagnostic instead of representative of the lesion [].

4.2. Subacute Thyroiditis

Subacute thyroiditis can also mimic malignancy on US, because it frequently displays a taller-than-wide shape and marked hypoechogenicity. However, the existence of spontaneous thyroid pain, low TSH, and elevated serum inflammatory markers frequently allows the diagnosis. From the US point of view, it has been shown that the lesions have poorly defined margins, which can help differentiate them from a carcinoma []. In case of persistent doubt, it is advised to proceed to FNA if TSH is normal, or to scintigraphy if TSH is low, which will show an absence of tracer uptake. US follow-up is also advised, showing progressive regression of the hypoechoic zone and absence of a true nodule that could also have been hidden initially by the marked hypoechogenicity of the lesions.

4.3. Confusion or Absence of Clear Distinction between Nodular Disease and Hyperplasia

Hyperplasia of the follicular epithelium is the most common morphological change in the thyroid seen by the pathologist []. The manifestation of this process is the goiter (diffuse or nodular hyperplasia). The US features range from a simple isoechoic enlargement of the thyroid gland to multiple coalescent isoechoic nodules, usually of small size individually, with absent or poorly defined margins. This pattern is very frequent in regions of endemic goiter. Only the EU-TIRADS has addressed this issue, but it should be included in future RSSs, because of its very low risk of malignancy and the limited value of FNA, which may even lead to false positive results [].

5. Suggestions for the Future

5.1. Absence of Classification for TNs Treated with Thermal Ablation

Thermal ablation, especially laser and radiofrequency (RFA), is of increasing use in the treatment of benign thyroid nodules and is considered as a possible alternative to surgery []. In a systematic review, it has been shown that RFA induces a volume reduction ratio ranging between 66.9% and 97.9% three years after the procedure []. These treatments induce important changes in the US features of nodules that can mimic malignancy. Nodules turn solid and hypoechoic, even markedly hypoechoic, sometimes with irregular margins and calcifications []. As radiofrequency is of frequent use for liver tumors, the LI-RADS Treatment Response (LR-TR) algorithm was introduced in 2017 to assist radiologists in assessing hepatocellular carcinoma (HCC) response following locoregional therapy []. A comparable addendum should be part of future thyroid RSSs.
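The volume reduction ratio mentioned above is not defined in the excerpt. As a minimal sketch, assuming the usual definition (the percentage decrease in nodule volume relative to the pre-treatment volume), it can be computed as follows; the function name and the example volumes are hypothetical.

```python
def volume_reduction_ratio(initial_volume_ml, followup_volume_ml):
    """Percentage reduction in nodule volume relative to baseline.

    Assumes VRR (%) = 100 * (initial - follow-up) / initial, which is the
    commonly used definition; the excerpt itself does not spell it out.
    """
    if initial_volume_ml <= 0:
        raise ValueError("initial volume must be positive")
    return 100.0 * (initial_volume_ml - followup_volume_ml) / initial_volume_ml


# Hypothetical nodule shrinking from 10 mL to 2 mL after treatment.
print(volume_reduction_ratio(10.0, 2.0))  # 80.0
```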

5.2. Incorporating in the Algorithm the Number of Nodules Especially If They Belong to the Same Category

Different studies demonstrated that a single nodule increases the risk of malignancy compared to multiple nodules [,]. Moreover, this parameter has high inter-observer agreement and is easy to implement. Taking into account in the algorithm of future US RSSs the number of nodules to decrease the estimated risk of malignancy, especially if all are low to intermediate risk nodules, could be valuable.

5.3. Taking into Account Age, Sex, Time Since Discovery, Results of Previous FNAs

Many risk factors for thyroid nodule malignancy have been suggested, such as patient age, sex, nodule size, and composition, but the specific risk attributable to each of them is not precisely known. An interesting study [] of 20,001 thyroid nodules evaluated by FNA from 1995 to 2017 demonstrated a significantly increased risk of malignancy for patient age >52, male sex, and nodule size, with the risk growing from 20 mm to more than 40 mm in comparison with nodules less than 20 mm. Conversely, cystic content (at least 25% of the nodule) was associated with a decreased risk of malignancy compared with predominantly solid nodules, as was the presence of additional nodules, with the lowest risk for more than 4 nodules. Interestingly, a free online calculator was constructed to provide malignancy-risk estimates based on these variables.

5.4. Taking into Account the Serum Value of TSH (to Exclude an AFTN) and Calcitonin (to Detect a Medullary Cancer), When Available

Serum TSH should be measured during the initial evaluation of a patient with one or more thyroid nodule(s). If the serum TSH is low, a radionuclide (preferably 123I) thyroid scan should be performed to exclude AFTN from FNAC and to explore the etiology of hyperthyroidism, provided that there is no evidence of Graves’ disease. In case of normal serum TSH value, there are no US features correlated with autonomous nodules [,]. The cost-effectiveness of submitting all nodules to a thyroid scan to avoid unnecessary FNA for AFTN is questioned.
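The TSH-based triage described in this paragraph can be summarized as a small decision function. This is only a sketch of the logic as stated in the text, not a clinical tool; the function name, argument names, and returned labels are hypothetical.

```python
def initial_workup(tsh_is_low, evidence_of_graves=False):
    """Sketch of the first triage step described in the text.

    Returns a plain-text label for the next step. The labels are
    illustrative wording, not terms defined by any guideline.
    """
    if tsh_is_low and not evidence_of_graves:
        # Low TSH without Graves' disease: scan first, to exclude AFTN
        # from FNAC and to explore the etiology of hyperthyroidism.
        return "radionuclide thyroid scan (preferably 123I)"
    if tsh_is_low:
        # The text does not say what follows when Graves' disease is evident.
        return "not specified in the text"
    # Normal TSH: no US feature identifies autonomous nodules, so the
    # nodule is assessed with US risk stratification and size cut-offs.
    return "US risk stratification and size cut-offs"


print(initial_workup(tsh_is_low=True))
print(initial_workup(tsh_is_low=False))
```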

Calcitonin may detect C-cell hyperplasia and medullary thyroid cancer (MTC). However, most guidelines cannot recommend for or against routine calcitonin measurement in patients with thyroid nodules. A recent review [] demonstrated that calcitonin has good sensitivity and specificity to diagnose MTC and could be useful when available in the evaluation of thyroid nodules. The literature and experience show that, for a calcitonin level over 100 pg/mL, nodules larger than 1 cm are MTC. For levels below 100 ng/L (equivalent to 100 pg/mL) in nodules larger than 1 cm, systematic calcitonin measurement does not bring a clear advantage for the diagnosis, especially if the nodule is at low or intermediate US risk. However, the value of routine testing in patients with thyroid nodules remains questionable, due to the low prevalence of MTC, and whether routine calcitonin testing improves prognosis in MTC patients remains unclear. In clinical practice, situations associated with false positivity of calcitonin tests (e.g., renal insufficiency, treatment with proton pump inhibitor, obesity) and the correlation of calcitonin value with the nodule volume should be taken into account for the interpretation of the result. Calcitonin measurement remains mandatory when active surveillance of EU-TIRADS 5 nodules or proven microcarcinomas is considered, and before surgery or thermal ablation. Regardless, the heterogeneous US presentation of MTC [] and the low sensitivity of FNA in detecting MTC [] have to be taken into account in clinical practice.

5.5. 4D Vascularity

Advanced ultrasound techniques may improve the risk estimation and could be used more extensively. For example, Borlea et al. [] demonstrated that adding 4D vascularity to the French TIRADS score proved beneficial for predicting the malignancy risk and may add important knowledge in uncertain situations.

An international team has been set up and is currently working on a global new TIRADS, to be called I-TIRADS for International TIRADS. It will include a lexicon, an RSS, and recommendations for FNA and follow-up. Maybe some of these suggestions could be taken into account to create this new version. The pitfalls they imply are detailed in Table 3.

6. Conclusions

The different US RSSs introduced since the late 2000s have facilitated the effective interpretation and communication of thyroid US findings among physicians and cytopathologists and with the patient. On the whole, there are similarities among the different RSSs regarding the lexicons used and the categorization of nodules, although differences and specificities remain. Diagnostic performance and efficacy of FNA performed according to the different RSSs vary, mainly influenced by different size cut-offs and partially by different risk categorizations of nodules. Understanding the strengths and weaknesses of the different RSSs will help to improve each system and may provide the basis for an ultimate international standardization. Efforts should be made to merge the different systems utilized around the world with the ultimate aim of eliminating unnecessary thyroid biopsies without jeopardizing the detection of clinically significant malignancies.
