The World Health Organization (WHO) is calling for caution in the use of artificial intelligence (AI) large language model tools (LLMs), in order to protect and promote human wellbeing, safety and autonomy, and preserve public health.
LLMs include some of the most rapidly expanding platforms, such as ChatGPT, Bard, BERT and many others that imitate understanding, processing, and producing human communication. Their meteoric public diffusion and growing experimental use for health-related purposes are generating significant excitement around their potential to support people’s health needs.
However, WHO cautions that it is imperative to examine the risks carefully when LLMs are used to improve access to health information, as a decision-support tool, or even to enhance diagnostic capacity in under-resourced settings, so that people’s health is protected and inequity is reduced.
WHO states it is enthusiastic about the appropriate use of technologies, including LLMs, to support healthcare professionals, patients, researchers and scientists, but has expressed concern that the caution normally exercised for any new technology is not being exercised consistently with LLMs. That caution includes widespread adherence to the key values of transparency, inclusion, public engagement, expert supervision and rigorous evaluation.
Precipitous adoption of untested systems could lead to errors by healthcare workers, cause harm to patients, erode trust in AI and thereby undermine (or delay) the potential long-term benefits and uses of such technologies around the world.
Concerns that call for the rigorous oversight needed for these technologies to be used in safe, effective, and ethical ways include:
• The data used to train AI may be biased, generating misleading or inaccurate information that could pose risks to health, equity and inclusiveness
• LLMs generate responses that can appear authoritative and plausible to an end user; however, these responses may be completely incorrect or contain serious errors, especially when they are health-related
• LLMs may be trained on data for which consent may not have been previously provided for such use, and LLMs may not protect sensitive data (including health data) that a user provides to an application to generate a response
• LLMs can be misused to generate and disseminate highly convincing disinformation in the form of text, audio or video content that is difficult for the public to differentiate from reliable health content
• While committed to harnessing new technologies, including AI and digital health, to improve human health, WHO recommends that policy-makers ensure patient safety and protection while technology firms work to commercialise LLMs.
WHO proposes that these concerns be addressed, and clear evidence of benefit be measured, before LLMs come into widespread use in routine health care and medicine, whether by individuals, care providers or health system administrators and policy-makers.
Furthermore, it reiterates the importance of applying ethical principles and appropriate governance, as enumerated in the WHO guidance on the ethics and governance of AI for health, when designing, developing and deploying AI for health.