Generative AI in healthcare discussed

Llama 3 announced, and a disinformation model

Posted on April 26th, 2024

Summary

This week, several articles appeared that discuss the use of generative AI in healthcare. Identified applications include patient triage to prioritize urgent cases and the development of personalized health plans. Key healthcare providers are collaborating with major tech companies on platform development. Serious questions arise about chatbots: for instance, they can reinforce false beliefs about biological differences between black and white people, a particular problem since the lower level of healthcare available in marginalized communities makes their members more likely to use generative AI for healthcare-related queries.

Other articles explore the adoption of generative AI to streamline government processes and help the sports industry offer new video experiences to fans.

Articles that examine the impact of generative AI in a given domain are noteworthy for their potential to prompt organizations in that domain to identify and address inefficiencies or weaknesses in their processes, even if they ultimately decide not to adopt generative AI because of its inherent risks.

In 2024, elections will be held that affect nearly half of the global population. An article on disinformation discusses a model that assesses the impact of fake news on election campaigns. A key finding is that for fake news to be effective, it must be generated regularly; sporadic fake news creation tends to be ineffective due to fact-checking efforts.

A blog post by Meta discusses its initiatives to prevent AI-generated child sexual abuse material. From an ethical perspective, a Nature article summarizes expert discussions on the challenges that large language models pose to scientific practice.

In related Meta news, Llama 3 is now available on Hugging Face.

Lastly, an extensive survey paper on large language models by researchers from Renmin University in China has been published on arXiv. This paper is an excellent starting point for those seeking a deeper technical understanding of large language model technology and its evolution.

1. Four ways generative AI will improve the federal government

The Nextgov/FCW website examines how technology can improve how government agencies carry out crucial functions for citizens. This article delves into the potential of generative AI to enhance different types of processes. One use case is application forms: a form that currently takes days or even weeks to process could be handled in hours as AI algorithms "learn" the subsequent steps in processes and workflows. This could facilitate automated regulatory filings and simplify income tax returns. Another use case is contractor management. The federal government collaborates with thousands of contractors across more than a million distinct programs. This necessitates extensive resources for contract procurement and management, compliance oversight, record-keeping, data integration, permissions, approvals, and numerous other tasks that consume significant labor hours and are susceptible to human error. The hope is that generative AI can pinpoint deficiencies in supply and provider chains such as redundancy, fraud, and inefficiency.

2. Healthcare Generative AI Practitioners Prioritize Industry-Specific and Task-Specific Models

The 2024 Generative AI in Healthcare Survey was conducted online in February and March 2024. Large companies, with greater resources and potentially more diverse use cases, are more inclined to use generative AI, and budgets for it have increased by 300%. There is a distinct preference for custom-built, task-specific language models, with 36% of all respondents using healthcare-specific small models. Open-source LLMs (24%) and open-source small models (21%) trail behind. The most prevalent use cases for LLMs in healthcare include answering patient questions (21%), medical chatbots (20%), and information extraction/data abstraction (19%). Accuracy, security, and privacy risks are the most critical criteria when evaluating LLMs. "Human in the loop" is the most common step taken to test and refine LLMs (55%), followed by supervised fine-tuning (32%). The most frequently tested requirements for LLM solutions are Fairness (32%), Explainability (27%), and Private Data Leakage (27%). Despite this interest and active research, only 14% of companies are in the early stages of adoption, with a first solution operational, while 11% are in more advanced stages of adoption.

3. Generative AI’s Potential in Healthcare Questioned

Google is collaborating with Highmark Health to customize patient triage procedures using generative AI. Other applications include tailoring health plans for patients and enabling personalized interactions at scale. Amazon's AWS is exploring the technology to analyze medical databases for "social determinants of health", while Microsoft Azure is creating a system for the Providence healthcare network to automatically categorize patient communications. The challenge here lies in managing the influx of messages in healthcare organization inboxes, which makes it difficult to identify patients needing immediate treatment. Other developments mentioned in the article include Ambience Healthcare, which is developing a generative AI app for clinicians; Nabla, which offers an ambient AI assistant for healthcare practitioners; and Abridge, which creates analytics tools for medical documentation. A Deloitte survey indicates that only about 53% of U.S. consumers believe that generative AI can improve healthcare accessibility, reduce waiting times, or make healthcare more affordable. Research, including a study in JAMA Pediatrics, highlights high error rates in disease diagnosis by generative AI systems like ChatGPT, with a 35% failure rate in the MedAlign benchmark tests. Nevertheless, generative AI has shown benefits in medical imaging.

4. Generative AI is coming for healthcare, and not everyone’s thrilled

This article covers much of the same ground as the preceding one. It notes that generative AI in healthcare can perpetuate harmful stereotypes, as evidenced by a 2023 Stanford Medicine study. Researchers tested ChatGPT and other generative AI chatbots on medical questions, revealing frequent inaccuracies and the reinforcement of untrue beliefs about biological differences between black and white people. Such misinformation has historically led to misdiagnoses, particularly in marginalized communities, which, ironically, often lack healthcare coverage and are therefore more likely to turn to generative AI for healthcare-related purposes.

The World Health Organization has released guidelines with over 40 recommendations for governments, technology companies, and healthcare providers to ensure that use of the technology promotes and protects health. The guidelines call for all stakeholders in healthcare to participate in examining generative AI throughout its development. The guidelines can be found here.

Source TechCrunch

5. Ever-evolving generative AI brings new, game changing element to sports landscape

Some Grand Slam tennis tournaments have employed generative AI to automatically generate commentary for video highlights. Additionally, the PGA Golf Tour collaborates with WSC Sports to produce AI-generated sports videos and automatically clip and publish highlights. WSC Sports' models incorporate generative AI that can process audio and video to provide real-time commentary on the unfolding action in a match, apply metadata tags, and generate video content based on sport-specific criteria. The overarching aim of using generative AI is to design hyper-personalized experiences for existing fans and to attract new ones. According to Allied Market Research, the value of artificial intelligence in the global sports market could exceed 29 billion USD by 2032.

6. Meta Joins Thorn and Industry Partners in New Generative AI Principles

In this publication, Meta announces that it is joining with Thorn and All Tech is Human to proactively address child safety risks, notably in relation to child sexual abuse material. A key aim of this collaboration is to prevent generative models from creating child sexual abuse content. One key problem is that the generalization capabilities of models allow them to combine adult sexual content with non-sexual depictions of children to produce abusive content. Pedophiles have also used generative AI to “nudify” pictures of children. One avenue Meta is pursuing is structured, scalable, and consistent stress testing of models throughout the development process for their capability to produce abusive content; this task is challenging because the testers themselves must remain within the bounds of the law. Another avenue is detecting content origin, for example with watermarking.

Source Meta

7. Welcome Llama 3 – Meta's new open LLM

Meta's Llama 3 is now available on Hugging Face and integrates with the platform's ecosystem. It comes in two sizes, 8B for deployment on consumer-grade GPUs and 70B for large-scale AI applications, each offered in base and instruction-tuned variants. Additionally, Llama Guard 2, fine-tuned on Llama 3 8B, is released for enhanced safety; this is an LLM that analyzes prompts and outputs for possible toxic content. The tokenizer's expanded vocabulary of 128,256 tokens improves text encoding efficiency. Llama 3 can be deployed on Hugging Face's Inference Endpoints, leveraging Text Generation Inference (a toolkit for deploying and serving LLMs) for production-ready deployment, with features like continuous batching and tensor parallelism (a technique for splitting an LLM across multiple GPUs). Deployment options also include Google Cloud via Vertex AI or Google Kubernetes Engine, and Amazon SageMaker through AWS Jumpstart.
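
For readers who want to try the model locally, the following is a minimal sketch using the Hugging Face transformers text-generation pipeline. It assumes access has been granted to the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint and that a GPU with enough memory for the 8B model in bfloat16 (roughly 16 GB) is available; the prompt and generation settings are illustrative, not prescriptive.

```python
# Minimal sketch: running Llama 3 8B Instruct locally with Hugging Face transformers.
# Assumes the gated meta-llama/Meta-Llama-3-8B-Instruct checkpoint is accessible
# (license accepted on the Hub) and a sufficiently large GPU is available.
import torch
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# The text-generation pipeline accepts a chat-style message list and applies the
# model's chat template before generating.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Explain tensor parallelism in one sentence."},
]

output = generator(messages, max_new_tokens=128, do_sample=False)
# The result contains the full conversation; the last message is the model's reply.
print(output[0]["generated_text"][-1]["content"])
```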

8. Generative AI model shows fake news has a greater influence on elections when released at a steady pace without interruption

This article discusses research examining the impact of disinformation on election campaigns, highlighting the challenges of conducting experiments due to the "one-history problem": there is only one historical outcome to observe. To study counterfactual scenarios, the author models how voters receive information and simulates subsequent developments. Initially designed for studying disinformation in financial markets, the model was adapted to analyze its impact on elections as fake news became more prevalent. One type of disinformation considered is released at a random moment, grows in strength, and is eventually dampened (e.g., through fact-checking). The research suggests that a single release of such disinformation, well before election day, has minimal impact on the outcome, whereas persistent and repeated releases can influence it. Surprisingly, the study finds that even if voters are unsure about the truth of individual items, knowing the frequency and bias of disinformation can largely negate its impact. Simply being aware of the existence of fake news serves as a potent antidote against its effects.
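
The article does not reproduce the model's equations, but its qualitative mechanism can be illustrated with a toy simulation: each piece of disinformation enters circulation and is then progressively dampened (e.g., by fact-checking), and what matters is how much disinformation is still circulating on election day. The sketch below is an illustrative caricature, not the author's model; the decay rate, release schedule, and strength values are arbitrary assumptions.

```python
# Toy illustration (not the published model): compare a single early burst of
# disinformation with steady, repeated releases. Each released item decays over
# time as fact-checking dampens it; the quantity of interest is the net signal
# still circulating on election day. All numeric values are arbitrary.

def bias_at_election(release_days, days=100, decay=0.85, strength=1.0):
    """Return the net disinformation signal still circulating on election day."""
    active = []                                # amplitudes of items still circulating
    for day in range(days):
        if day in release_days:
            active.append(strength)            # a new piece of disinformation is released
        active = [a * decay for a in active]   # fact-checking dampens every item each day
    return sum(active)

single_burst = bias_at_election({5})                    # one release, well before election day
steady_drip  = bias_at_election(set(range(5, 100, 5)))  # regular releases throughout the campaign
print(f"single early burst: {single_burst:.4f}")        # effectively zero by election day
print(f"steady releases   : {steady_drip:.4f}")         # a persistent bias remains
```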

9. Science in the age of large language models

Four experts in artificial intelligence ethics and policy discuss potential risks and advocate careful consideration and responsible use of LLMs to preserve good scientific practice. The challenges raised are many: hallucination; maintaining responsiveness to change (such as adapting to shifts in world knowledge over time); ensuring reliable generation of accurate content for infrequent or sparsely studied phenomena; addressing plagiarism and authorial misrepresentation when language models are used; quantifying the extent of assistance provided by language models in writing tasks; handling outdated or incorrect knowledge in the published literature; detecting and mitigating bias; and accommodating future paradigm shifts in scientific understanding (since language models derive insights from past research, potentially leading to paradigm lock-in). Additionally, there is concern about the impact of language model use on critical thinking skills.

10. A Survey of Large Language Models

This paper presents an extensive review of large language models (LLMs), covering their development, applications, and the technical challenges associated with them. It traces the evolution of language models from statistical to pre-trained models, including the rise of transformer-based models like GPT-3 and beyond, which have dramatically enhanced natural language processing capabilities. The paper examines concepts such as pre-training, adaptation tuning, utilization, and capacity evaluation of LLMs. It highlights the impact of model scaling on performance and introduces emergent abilities, such as in-context learning, that appear at larger scales. The survey also discusses the practical application of LLMs, their role in advancing AI research, and their potential societal impacts.