Assessing confidence in clinical information at the individual patient level and during artificial intelligence (AI)-assisted clinical reasoning and decision making (CRDM).

5.1.1 Assessing confidence in clinical information at the individual patient level

Clinicians interviewed for this research discussed the complexity of clinical reasoning and decision making (CRDM) and the assessment of the various types of information involved.

They described conventional, human CRDM as a complex and nuanced process, where clinician experience, intuition, expertise, and biases interact, often subconsciously. Clinicians are experts in assessing the appropriate level of confidence to have in various forms of information feeding into CRDM, from patient histories to test results, imaging, and reports from professional colleagues.

For the clinician interviewees, effective CRDM requires them to make value judgements about the significance and trustworthiness of information derived from sources of either unknown reliability (for example, patient history) or reliability that has been demonstrated at a cohort or population level (for example, laboratory test results). They combine that information, which may be contradictory, to make an optimal decision with each patient.

The individual nature of any clinical decision places the responsibility on the clinician to know how to weigh up, or indeed when to disregard, certain information. Clinicians can base their decisions on a complex synthesis of knowledge, professional experience, patient history, demographic factors, test results, imaging, expert opinion, patient preference and intuition. 

Through assimilation of all this information and given their expectations as to the most likely clinical scenario, clinicians make implicit or explicit probabilistic estimates, determine a course of action with the patient as appropriate, and act accordingly. Interviewees noted that clinicians learn these skills over an extended period through training and experience.

This context suggests that the CRDM process, while guided by best practice, guidelines and published literature, can also be highly individual and contextual.

When faced with uncertainty, clinicians may access peer opinion, discuss conflicting views and seek the opinion of a multidisciplinary team if required. Some clinical decisions (for example, in emergency medicine) are made rapidly under time pressure, while others are made in a very considered way and without such urgency, potentially in consultation with other experts or considering recent academic literature.

Interviewees noted that these nuanced and experience-driven CRDM processes are susceptible to cognitive biases, which can be exacerbated by the unreliability of human intuitions about probability and by the perceived trustworthiness of the information provided. Confirmation bias and automation bias (as defined and discussed in section 3.3.1) are common challenges in existing CRDM situations, and it is important to consider them carefully when new and unfamiliar tools or information sources, such as AI decision-support tools, are introduced into CRDM.

Interviewees concluded that AI technologies present both risks and opportunities in this context: cognitive biases may be either reinforced or mitigated by the inclusion of AI-derived information in the CRDM process. This suggests the importance of clinicians understanding how their current decision making process could be influenced by the introduction of AI technologies, particularly where their own intuition or opinion conflicts with the information or recommendation provided by an AI system.

5.1.2 Confidence during AI-assisted CRDM

AI technologies used to support clinical reasoning and decision making are often called clinical decision support systems; their use in this way is referred to here as AI-assisted CRDM.

Incorporating AI-derived information into CRDM has enormous potential to improve the consistency and quality of clinical decisions, increase efficiency, and benefit patients.69 In AI-assisted CRDM, the clinician retains ultimate responsibility for the decision made, so it has been suggested that clinical ‘reasoning support’, as opposed to ‘decision support’, is more appropriate terminology.70

As highlighted by this research’s interviewees, clinicians who use AI-derived information during CRDM will need to understand the nature and context of this information to assess whether it warrants low or high confidence.

Interviewees suggested that an awareness that most AI technologies use statistical methods to make probabilistic predictions is a fundamental starting point for anyone using AI in CRDM. For example, AI models can be trained on existing data to predict the most likely diagnosis or the treatment strategy with the best chance of success. This approach relies on the assumption that the present case and situation are similar enough to those used to train the AI system.
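
As a purely illustrative sketch of this idea (not drawn from this research or from any particular clinical product), the following Python example fits a simple statistical model to invented historical cases and produces a probability for a new case. All feature names and values are assumptions made for illustration only.

```python
from sklearn.linear_model import LogisticRegression
import numpy as np

# Invented historical cases: [age, biomarker level]; outcome 1 = condition present.
X_train = np.array([[45, 1.2], [60, 3.1], [52, 2.0], [70, 4.5], [38, 0.9], [66, 3.8]])
y_train = np.array([0, 1, 0, 1, 0, 1])

# Fit a simple statistical model to the historical cases.
model = LogisticRegression().fit(X_train, y_train)

# The output for a new patient is a probability, not a certainty, and it is only
# as reliable as the resemblance between this patient and the training data.
new_patient = np.array([[58, 2.6]])
probability = model.predict_proba(new_patient)[0, 1]
print(f"Estimated probability of the condition: {probability:.2f}")
```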

For this research’s interviewees, AI-derived information should be perceived as a prediction or estimate of the most likely diagnosis or optimal strategy and should be considered to have a degree of uncertainty associated with it, in the same way an external opinion might be assessed.

The reliability of AI-derived information for an individual patient will be a function of the data used to train the model, the training process itself, and the characteristics of the particular patient for whom a prediction is being made.

Interviewees observed that clinicians are used to such estimates of reliability for non-AI information sources. For example, clinicians are aware that laboratory results can on occasion be incorrect, and may know their statistical performance at a population level. A well-known example is the common prostate-specific antigen (PSA) test for prostate cancer, which has a false positive rate of around 75 per cent and a false negative rate of around 15 per cent.71,72 Clinicians are used to interpreting results such as PSA during clinical decision making and counselling patients about the potential unreliability of such tests.
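
As a simple illustration of how such population-level error rates translate into individual-level uncertainty, the following sketch applies the figures cited above to a hypothetical group of results; the cohort size is an assumption made purely for this example.

```python
# Illustrative arithmetic only: how the population-level error rates cited above
# translate into individual-level uncertainty. The cohort size is an assumption
# made purely for this example.
positive_results = 100        # hypothetical men with a raised PSA result
false_positive_share = 0.75   # ~75 per cent of raised results occur without cancer (cited above)
false_negative_share = 0.15   # ~15 per cent of cancers produce a normal PSA (cited above)

true_positives = positive_results * (1 - false_positive_share)
false_positives = positive_results - true_positives
print(f"Of {positive_results} raised PSA results, roughly {true_positives:.0f} reflect cancer "
      f"and {false_positives:.0f} do not.")
print(f"A normal result is also not definitive: about {false_negative_share:.0%} of cancers are missed.")
```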

AI-derived information is potentially different to test results or other quantitative measurements, where the likelihood of error remains largely constant across broad categories of patients, or varies predictably with factors such as patient demographics. In the case of AI, the aspects of the data affecting the accuracy of a prediction may be more complex and are often unknown,73 making it harder for clinicians to assess how confident they should be in a specific AI prediction (as discussed further in 5.2.3).
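
The following sketch, with entirely invented data, illustrates one way this can manifest: a model whose overall accuracy appears reasonable may perform far worse for a particular subgroup of patients, and that subgroup may not be known in advance.

```python
import numpy as np

# Illustrative only: overall accuracy can hide much weaker performance in a
# particular patient subgroup. All values below are invented for illustration.
y_true = np.array([1, 0, 1, 0, 1, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 1, 0, 0, 0, 1, 1])   # model predictions
group  = np.array(["A", "A", "A", "A", "A", "A", "B", "B", "B", "B"])  # e.g. a demographic subgroup

for g in np.unique(group):
    mask = group == g
    print(f"Subgroup {g}: accuracy {(y_true[mask] == y_pred[mask]).mean():.0%}")
print(f"Overall accuracy: {(y_true == y_pred).mean():.0%}")
```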

‘Brittleness’ (the tendency for an AI algorithm’s performance to fall off rapidly at the boundaries of its scope)74 is a particular challenge to determining appropriate levels of confidence in the clinical use of AI during CRDM.75 It suggests that, when applying population-level evidence and performance metrics to AI predictions concerning individual patients, clinicians will need to be cautious and retain a critical eye for unexpected, contradictory or implausible predictions (as discussed also in Box 5 and Box 6).
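
One simplified way to picture brittleness is a check on whether a new case falls outside the range of cases a model was trained on. The sketch below illustrates this idea only; the function, feature names and values are assumptions for illustration and do not represent how any particular clinical AI product detects out-of-scope cases.

```python
import numpy as np

def outside_training_range(training_features: np.ndarray, new_case: np.ndarray) -> np.ndarray:
    """Return a boolean mask of features in new_case that lie outside the
    minimum-maximum range observed in training_features (cases x features)."""
    lower = training_features.min(axis=0)
    upper = training_features.max(axis=0)
    return (new_case < lower) | (new_case > upper)

# Invented example: [age, systolic blood pressure, biomarker level]
training = np.array([[54, 120, 2.1], [67, 135, 3.4], [72, 150, 4.0]])
new_patient = np.array([91, 128, 9.7])   # older, with a higher biomarker, than any training case

if outside_training_range(training, new_patient).any():
    print("Caution: this case lies outside the model's training experience for some features.")
```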

Further, the cases in which an AI system fails may or may not resemble those in which human performance is low,68 making identification of these error cases particularly challenging. This is an additional reason for clinicians to retain a critical eye when dealing with AI-derived information for CRDM.

Therefore, even when confidence in an AI technology (as derived from the factors presented in Chapters 3 and 4) is high, clinicians should still be encouraged to question predictions that appear to go against their clinical intuition or other evidence, remembering that appropriate confidence in that specific case may need to be low.76

It is important to note, however, that AI technologies can potentially identify correlations in an individual patient’s data that human CRDM would not account for.77 In such cases, an AI-derived prediction or recommendation that runs contrary to a clinician’s intuition may be correct, and should be considered seriously rather than dismissed out of hand.

These considerations suggest that educating clinicians to retain a degree of scepticism about AI-derived information, without losing confidence in the overall performance of the AI technology, is an important aspect of practising CRDM with AI technologies. If done well, AI-assisted CRDM has been shown to have the potential to outperform both human-only and fully automated approaches.78

Information:

Aspects of CRDM - Key confidence insights

  • Clinical Reasoning and Decision Making (CRDM) is a complex, nuanced process, learned through lengthy education and professional experience. It relies on making value judgements about information from a range of sources.
  • Appropriate confidence in AI-derived information should be assessed for each patient and each AI-assisted clinical decision.
  • Clinicians who use AI-derived information during CRDM will need to understand the nature and context of this information to assess whether it warrants low or high confidence.
  • AI-derived information should be perceived as a prediction or estimate of the most likely diagnosis or optimal strategy and should be considered to have a degree of uncertainty associated with it, in the same way an external opinion might be assessed.
  • Clinicians need to understand how their current decision making process could be affected by AI-derived information and understand the importance of retaining a critical eye, to detect potential AI failure cases.
  • Education and training will be key to developing appropriate levels of confidence during CRDM. 

References

69 Garcia-Vidal C, Sanjuan G, Puerta-Alcalde P, Moreno-García E, Soriano A. Artificial intelligence to support clinical decision-making processes. EBioMedicine. 2019;46:27-29. doi:10.1016/j.ebiom.2019.07.019

70 van Baalen S, Boon M, Verhoef P. From clinical decision support to clinical reasoning support systems. J Eval Clin Pract. 2021;27(3):520-528. doi:10.1111/jep.13541

71 NICE. PSA testing. In: Prostate cancer: diagnosis. NICE Clinical Knowledge Summaries. https://cks.nice.org.uk/topics/prostate-cancer/diagnosis/psa-testing/. Published 2017. Accessed February 28, 2022.

72 Saraiya M, Kottiri BJ, Leadbetter S, et al. Total and percent free prostate-specific antigen levels among U.S. men, 2001-2002. Cancer Epidemiol Biomarkers Prev. 2005;14(9):2178-2182. doi:10.1158/1055-9965.EPI-05-0206

73 Kelly CJ, Karthikesalingam A, Suleyman M, Corrado G, King D. Key challenges for delivering clinical impact with artificial intelligence. BMC Med. 2019;17(1):1-9. doi:10.1186/s12916-019-1426-2

74 Magrabi F, Ammenwerth E, McNair JB, et al. Artificial Intelligence in Clinical Decision Support: Challenges for Evaluating AI and Practical Implications. Yearb Med Inform. 2019;28(1):128-134. doi:10.1055/s-0039-1677903

75 Myers PD, Ng K, Severson K, et al. Identifying unreliable predictions in clinical risk models. npj Digit Med. 2020;3(1):1-8. doi:10.1038/s41746-019-0209-7

68 Gaube S, Suresh H, Raue M, et al. Do as AI say: susceptibility in deployment of clinical decision-aids. npj Digit Med. 2021;4(1):1-8. doi:10.1038/s41746-021-00385-9

76 Benda NC, Novak LL, Reale C, Ancker JS. Trust in AI: why we should be designing for APPROPRIATE reliance. J Am Med Inform Assoc. 2021;29(1):207-212. doi:10.1093/jamia/ocab238

77 Shen J, Zhang CJP, Jiang B, et al. Artificial intelligence versus clinicians in disease diagnosis: Systematic review. JMIR Med Informatics. 2019;7(3):e10010. doi:10.2196/10010

78 Lee MH, Siewiorek DP, Smailagic A. A human-AI collaborative approach for clinical decision making on rehabilitation assessment. Proc CHI Conf Hum Factors Comput Syst. 2021. doi:10.1145/3411764.3445472
