Summary
The European Data Protection Supervisor (EDPS) has just published guidelines on using generative AI in a way that remains compliant with the GDPR. Personal data can appear in training data or be inferred by AI systems during operation, and the EDPS argues for strategies that ensure data protection rights are respected throughout the whole lifecycle of an AI platform.
On the technical side, Mistral announced its AI-based coding assistant called Codestral. An MIT Technology Review article examines highly publicized errors in Google's AI Overviews search feature and argues that, despite the search improvements brought by retrieval-augmented generation (RAG), AI systems still struggle to recognize tone in data, which compromises search results. Galileo announced a series of evaluation foundation models for detecting incorrect or toxic content in generative AI output. A TechCrunch article describes research from Carnegie Mellon, the University of Amsterdam and Hugging Face showing that popular models respond very differently to questions on politically and socially sensitive topics such as immigration and LGBTQ+ rights.
The IEEE Spectrum magazine reviews the 2024 Artificial Intelligence Index Report from Stanford University. Key lessons from this year’s report include the huge increase in corporate investment in generative AI in 2023, along with figures for the CO2 emissions produced by training these models since 2020. An MIT Technology Review article presents a brief review of the UN’s recent “AI for Good” summit in Geneva.
The case of US actress Scarlett Johansson, who is threatening to sue OpenAI, is discussed in a CNN article. Even if OpenAI manages to show that it did not infringe copyright law by copying the actress’ voice for its Sky chatbot, it may still be liable under right-of-publicity laws, which protect individuals’ likenesses from being misused.
Table of Contents
1. Galileo Introduces First-of-its-Kind Evaluation Foundation Models for Enterprise GenAI Evaluations
2. Study finds that AI models hold opposing views on controversial topics
3. What I learned from the UN’s “AI for Good” summit
4. Stanford's 2024 AI Index Tracks Generative AI and More – IEEE Spectrum
5. Why Google’s AI Overviews gets things wrong
6. Mistral releases Codestral, its first generative AI model for code
7. GoDaddy has 50 large language models; its CTO explains why
8. Why OpenAI should fear a Scarlett Johansson lawsuit
9. EDPS Guidelines on generative AI: embracing opportunities, protecting people
1. Galileo Introduces First-of-its-Kind Evaluation Foundation Models for Enterprise GenAI Evaluations
Galileo announced a series of Evaluation Foundation Models (EFMs), called Luna, that are designed to facilitate generative AI evaluations. Each of the models has been fine-tuned for a particular evaluation task, such as hallucination detection, data leakage detection or malicious prompt detection. Galileo claims that Luna can be used in real-time applications like chatbots and monitoring systems since evaluations complete in milliseconds. It also claims that the system is 18% more accurate than using OpenAI's GPT-3.5.
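The article does not describe Galileo's API. Purely as an illustrative sketch (not Galileo's actual interface), an evaluation model of this kind could be wired into a chatbot as a real-time guardrail roughly as follows, where the generation and evaluation callables are hypothetical placeholders:

```python
# Sketch of using an evaluation model as a real-time guardrail in a chatbot.
# This is NOT Galileo's API: `generate` and `evaluate` are placeholders for
# whatever generation model and evaluation model (EFM) are actually deployed.
from typing import Callable

def guarded_reply(
    generate: Callable[[str], str],         # the chatbot's generation function
    evaluate: Callable[[str, str], float],  # returns a hallucination score in [0, 1]
    prompt: str,
    threshold: float = 0.5,
) -> str:
    """Generate a reply, score it with the evaluation model, and block it if flagged."""
    reply = generate(prompt)
    score = evaluate(prompt, reply)  # a fast call is what makes real-time use feasible
    if score > threshold:
        return "I'm not confident in that answer; could you rephrase the question?"
    return reply
```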
2. Study finds that AI models hold opposing views on controversial topics
This article reports on a study by researchers from Carnegie Mellon, the University of Amsterdam and Hugging Face. The researchers tested five models with questions related to immigration, LGBTQ+ rights, surrogacy and other politically sensitive issues. Questions were posed in several languages, including English, German, Italian and Turkish. The models tested in the study were Mistral 7B, Cohere’s Command-R, Alibaba’s Qwen (from China), Google’s Gemma and Meta’s Llama 3. The study illustrated variations in the "values" conveyed by these models. Questions around LGBTQ+ rights led to the most refusals to answer, although Qwen had four times as many refusals as Mistral’s model. A model's values are fundamentally linked to those of the contracted workers who annotate its training data, whose annotations can reflect their worldview. In the case of Qwen, output is also shaped by political pressure, and the model can refuse to answer questions related to repression in Tibet and the 1989 Tiananmen Square massacre. The original research paper can be found here.
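The paper's exact protocol is not reproduced in the article, but as a rough sketch of the kind of harness such a study requires, refusal rates across models could be compared as follows; `query_model` is a hypothetical wrapper around each provider's API and the refusal heuristic is deliberately simplistic:

```python
# Rough sketch of a cross-model refusal-rate comparison (not the study's code).
# `query_model(model, question, language)` is a hypothetical wrapper that
# returns the model's raw text answer.
REFUSAL_MARKERS = ("i cannot", "i can't", "i won't answer", "as an ai")

def looks_like_refusal(answer: str) -> bool:
    lowered = answer.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

def refusal_rates(models, questions, languages, query_model):
    rates = {}
    for model in models:
        refusals, total = 0, 0
        for question in questions:
            for language in languages:
                answer = query_model(model, question, language)
                refusals += looks_like_refusal(answer)
                total += 1
        rates[model] = refusals / total
    return rates
```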
3. What I learned from the UN’s “AI for Good” summit
This article reports on the author’s impressions of the UN’s AI for Good Summit in Geneva, Switzerland, organized by the International Telecommunication Union. The summit’s objective was to discuss how AI can be used to meet the UN’s Sustainable Development Goals (promoting clean energy, eradicating poverty and hunger, adapting to climate change, gender equality, etc.). The author was unimpressed by a video appearance from OpenAI’s CEO Sam Altman, who failed to give details about what OpenAI is concretely doing for AI safety. (Aside: a Washington Post article mentions how OpenAI is creating a new AI Safety Committee.) Fundamental AI concerns were not deeply discussed at the summit – the author reminds us of two of these: 1) the energy used by a GenAI platform to create a single image is roughly equivalent to that needed to charge a smartphone, and 2) workers are being exploited when evaluating AI content (e.g., workers from Ethiopia, Eritrea and Kenya were paid less than 2 USD per hour to tag toxic and potentially traumatic content for ChatGPT’s evaluation).
4. Stanford's 2024 AI Index Tracks Generative AI and More – IEEE Spectrum
An IEEE Spectrum article summarizes the annual AI report published by the Stanford Institute for Human-Centered Artificial Intelligence (HAI), which presents 15 trends around generative AI over the last year. Among the key lessons are:
- Generative AI investment has significantly increased, despite overall corporate investment being down over the past year. Investment exceeded 25 billion USD in 2023 compared to only 3 billion USD in 2022.
- Google leads in the number of foundation models released, with 18, ahead of Meta (11), Microsoft (9) and OpenAI (7).
- Closed models are outperforming open models on core benchmarks, though the designers of open models are also concerned about mitigating risks in their models.
- Foundation models are increasingly expensive to train. The training cost of Gemini is reported as over 190 million USD and GPT-4’s training cost as over 78 million USD. In comparison, Google’s original transformer model was trained for only around 930 USD in 2017.
- There is a significant carbon footprint: training GPT-3 (a 175-billion-parameter model), for instance, is estimated to have emitted roughly 500 tonnes of CO2, and the report presents similar figures for models released between 2020 and 2023.
- The US leads in foundation model creation, though China leads in AI patents granted.
Among the other findings are that organizations are increasingly aware of AI risks and the need for responsible AI, and that people in Western nations are more pessimistic about AI, due to its perceived risks, than people in emerging nations, where AI is seen more as having great potential.
5. Why Google’s AI Overviews gets things wrong
This article looks at highly publicized errors made by Google’s AI Overviews – the AI-enhanced search feature. Two examples cited are search results suggesting that people add glue to pizza and that they eat at least one small rock a day. AI Overviews is believed to be built on Google’s Gemini model, with retrieval-augmented generation (RAG) used to access data beyond that used in training. In the case of the “add glue to pizza” error, AI Overviews is thought to have used a Reddit post that made this suggestion. The post was intended to be funny, but AI Overviews was unable to detect the satire. Another error mentioned is the claim that Barack Obama is Muslim – the AI platform drew its information from a book entitled “Barack Hussein Obama: America’s First Muslim President?”. Despite the use of RAG, AI platforms still struggle to correctly interpret meaning.
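For readers unfamiliar with the technique, retrieval-augmented generation works roughly as sketched below: passages are retrieved from outside the training data and the model is asked to answer from them. This is a minimal illustration, not Google's actual pipeline; `search` and `llm` are placeholder callables:

```python
# Minimal retrieval-augmented generation (RAG) sketch, not Google's pipeline.
# `search(query)` and `llm(prompt)` stand in for a real retriever and model.
def rag_answer(query: str, search, llm, k: int = 3) -> str:
    passages = search(query)[:k]  # pull supporting documents from outside the training data
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the sources below. "
        "If the sources appear satirical or unreliable, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )
    return llm(prompt)
```

As the pizza example shows, the weak point is not the retrieval step itself but the model's inability to judge the tone and reliability of whatever the retriever returns.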
6. Mistral releases Codestral, its first generative AI model for code
Mistral, the French AI startup, announced the release of Codestral, a generative AI model for coding. It was trained on over 80 programming languages, including Python, Java, C, C++, JavaScript and Bash. The Mistral AI Non-Production License prohibits the use of Codestral and its generated code in commercial software, and goes on to explicitly ban “any internal usage by employees in the context of the company’s business activities”. The article postulates that the motivation for this clause is that Codestral is partly trained on copyrighted content. The article also highlights the risk of errors in code generated by AI-assisted tools and mentions a research paper from Purdue University showing that 52% of ChatGPT answers to programming-related questions contain incorrect information.
7. GoDaddy has 50 large language models; its CTO explains why
In this article, GoDaddy’s CTO explains how the company is using AI. GoDaddy is one of the largest web hosting firms. The company just launched Airo – a chatbot that can automatically design a company logo, website, email and social campaigns in seconds. Choosing a good domain name and populating sites with initial content is a time-consuming phase for clients, so GoDaddy is hoping for significant productivity gains in this area. The CTO explains that the company uses 50 GenAI models for multi-modal content generation, though only a handful are in production with Airo, with the rest running in test environments. The platforms used include OpenAI’s ChatGPT, Anthropic’s models, Google’s Gemini and AWS’s Titan. From an architectural viewpoint, a gateway service coordinates the different models, selecting results from models based on the type of content being generated and checking for toxic output.
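The article gives no implementation details for this gateway, but the pattern described (route a request to a model according to the content type, then screen the result) can be sketched roughly as follows; all model identifiers and helper functions here are hypothetical:

```python
# Rough sketch of the gateway pattern described by GoDaddy's CTO: route a
# request to a model based on content type, then screen the output for toxicity.
# Model identifiers, `call_model` and `is_toxic` are hypothetical placeholders.
MODEL_BY_CONTENT_TYPE = {
    "logo": "image-model-a",
    "website_copy": "text-model-b",
    "social_post": "text-model-c",
}

def gateway(content_type: str, prompt: str, call_model, is_toxic) -> str:
    model = MODEL_BY_CONTENT_TYPE.get(content_type, "text-model-b")
    output = call_model(model, prompt)
    if is_toxic(output):
        raise ValueError("generated content failed the toxicity check")
    return output
```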
8. Why OpenAI should fear a Scarlett Johansson lawsuit
This article discusses the threatened legal action by US actress Scarlett Johansson against OpenAI for creating a voice assistant, Sky, whose voice sounds like that of the actress in the 2013 film Her. The film tells the story of a man who falls in love with an artificial intelligence, voiced by Johansson. The article argues that OpenAI is potentially in violation of two types of law. The first is copyright law, which protects Johansson from having her voice copied without her express permission; OpenAI claims that the voice used for Sky is that of a different professional actress. The second is right-of-publicity law, which protects individuals’ names, voices and likenesses from being misused. For instance, the US singer Bette Midler won a lawsuit in 1988 against Ford Motor Company, which had made a commercial using a voice that sounded like hers. The article expresses surprise at OpenAI’s apparent lack of awareness of these past cases.
9. EDPS Guidelines on generative AI: embracing opportunities, protecting people
The European Data Protection Supervisor (EDPS) has published guidelines for EU institutions and offices on using generative AI with personal data, so that usage remains compliant with the GDPR. Personal data can appear in training data or may be inferred when the system is used. The Data Protection Officer is advised to evaluate the lifecycle of the GenAI platform to verify the steps taken to protect personal data, to conduct a data protection impact assessment, and to implement systematic monitoring. This is especially important due to the risks of bias and, consequently, discriminatory results. Processing of personal data must have a lawful basis, such as explicit and freely given consent, including when web-scraping techniques are used to collect the data in the first place. People must be allowed to exercise their data rights (e.g., demanding that stored data be accurate, or withdrawing consent). Hallucination in AI systems is a concern since it can compromise the data accuracy principle. When retrieval-augmented generation (RAG) is used, checks must be implemented to prevent personal data leakage.
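The guidelines do not prescribe a particular technique for such checks. As one deliberately simplistic illustration of the idea, retrieved passages could be screened for obvious personal identifiers before they reach the model; a real deployment would need far more robust detection (named-entity recognition, dedicated PII services, etc.):

```python
# Deliberately simplistic sketch of a personal-data check on retrieved passages
# before they are passed to the model; real systems need much stronger PII
# detection than these two regular expressions.
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact_personal_data(passage: str) -> str:
    passage = EMAIL_RE.sub("[REDACTED EMAIL]", passage)
    passage = PHONE_RE.sub("[REDACTED PHONE]", passage)
    return passage

def safe_context(passages: list[str]) -> str:
    """Join retrieved passages into a prompt context after redaction."""
    return "\n\n".join(redact_personal_data(p) for p in passages)
```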