Apple AI Open to Independent Review

Influence operations and health chatbots

Posted on June 22nd, 2024

Summary

Apple announced that AI services will be integrated with the iPhone 15 Pro and in macOS Sequoia on Macs and iPads. Some of these AI tools will need to process personal data in Apple's cloud, and Apple is inviting independent cybersecurity researchers to review the firm's security architecture.

Meta's wish to use the public social media profiles of adult users to train its AI models is causing problems for the company in the EU. Meta is offering an "opt-out" mechanism for users who do not want their data processed, but privacy advocates argue that an "opt-in" mechanism is required, since the GDPR demands that users give explicit consent for each form of processing.

Chatbots in the health domain have also been in the news. Google's Personal Health Large Language Model is a chatbot trained on time-series data from wearable fitness devices, and Google claims that it outperformed human experts on questions relating to fitness and sleep. One potential weakness is that the training data comes from people who are relatively active, and so is not representative of everyone. Another article examines hallucination, prompted by errors reportedly given by the World Health Organization's SARAH chatbot. The article notes that the better chatbots get, the more likely people are to miss an error when it happens.

On the subject of using generative AI to produce disinformation, an MIT Technology Review article examines a report from Meta on influence campaigns on its platforms. The report cites instances of state-sponsored actors using GenAI to create large volumes of social media comments and to turn news articles into social media posts. Meta has also released AudioSeal, a tool for watermarking AI-generated audio clips so that platforms can detect these clips as being AI-generated.

An article from ZDNET summarizes predictions from several analyst firms. One interesting prediction from Gartner is that the rate of unionization among knowledge workers will increase by 1,000% by 2028 due to perceived threats to their jobs from GenAI. Finally, a study from the US National Bureau of Economic Research working paper series evaluates the impact of chatbots that advise agents in a call center environment. The research shows a large productivity gain for lower-skilled workers and a negligible increase for experienced workers, in contrast to earlier waves of computerization, which tended to favor higher-skilled workers.

1. Why artists are becoming less scared of AI

This article from MIT Technology Review looks at the current relationship between GenAI and the creative process. Some artists do not want their works used in GenAI training data sets, in order to protect their copyright claims, and this has driven the development of tools that counter AI data-scraping engines. For instance, Nightshade, from the University of Chicago, modifies images so that, to a human, the image appears darkened, whereas to a GenAI image model the modifications completely distort its feature representations. Glaze uses a similar approach and is now deployed on Cara, a social media site for artists. At the same time, a study by Google DeepMind had 20 professional comedians use GenAI to write comedy pieces. The artists found the platforms satisfactory for initial "vomit" drafts, but they could not produce anything funny or stimulating. The article argues that "creative writing requires its creator to deviate from the norm, whereas LLMs can only mimic it".
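
Tools such as Nightshade and Glaze rest on adversarial perturbation: small pixel changes chosen to shift an image's representation inside a model's feature extractor while staying tolerable to a human viewer. The sketch below is a minimal, conceptual illustration of that idea against a generic pretrained encoder; it is not Nightshade's or Glaze's actual algorithm, and the model choice, step size, and iteration count are assumptions.

```python
# Conceptual sketch of a feature-space "cloaking" perturbation.
# NOT the actual Nightshade/Glaze algorithm; model and hyperparameters are assumptions.
import torch
import torchvision.models as models

# A generic pretrained encoder stands in for the image model a scraper might train.
encoder = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()   # keep the 512-d feature vector
encoder.eval()

def cloak(image, epsilon=0.03, steps=50, lr=0.01):
    """Return a perturbed copy of `image` whose features diverge from the original."""
    with torch.no_grad():
        target_feat = encoder(image)          # features of the unmodified image
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        feat = encoder(image + delta)
        loss = -torch.nn.functional.mse_loss(feat, target_feat)  # push features apart
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-epsilon, epsilon)   # keep the change small for human viewers
    return (image + delta).clamp(0, 1).detach()

# Example: cloak a random 224x224 RGB image (a stand-in for an artist's work).
perturbed = cloak(torch.rand(1, 3, 224, 224))
```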

2. Why does AI hallucinate?

The World Health Organization launched a chatbot last April, but warned on its website that the chatbot may not always give accurate answers. Based on GPT-3.5, the chatbot, called SARAH (Smart AI Resource Assistant for Health), gives advice about eating well, handling stress, quitting smoking, Covid-19, mental health and sexually transmitted diseases. The chatbot reportedly proposed addresses of non-existent health clinics. The article underlines that GenAI is not designed to be used as an encyclopedia, but to create new, and perhaps original, text. Among the approaches cited for controlling counterfactual output are chain-of-thought prompting, where the user asks the GenAI platform to explain its reasoning, and training the model on larger data sets. Though these approaches may help, the article notes a fundamental weakness: "the better chatbots get, the more likely people are to miss an error when it happens".
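
Chain-of-thought prompting simply asks the model to lay out its reasoning before committing to an answer, which makes unsupported leaps easier to spot. Below is a minimal sketch using the OpenAI Python SDK; the model name and prompt wording are illustrative assumptions, not the configuration of the WHO's SARAH chatbot.

```python
# Minimal chain-of-thought prompt: ask the model to show its reasoning and to
# flag anything it is unsure of, so counterfactual claims are easier to catch.
# Model name and wording are illustrative assumptions, not SARAH's setup.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

question = "Where can I get help to quit smoking in Geneva?"

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system",
         "content": "You are a health assistant. Think step by step, explain your "
                    "reasoning before answering, and say 'I don't know' rather than "
                    "inventing clinic names or addresses."},
        {"role": "user", "content": question},
    ],
)

print(response.choices[0].message.content)
```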

3. Meta has created a way to watermark AI-generated speech

Meta has released AudioSeal, a tool that adds watermarks to AI-generated audio clips so that the clips can be detected as AI-generated by other platforms. The context for the tool is the fight against disinformation campaigns and voice scamming, where a person's voice is cloned in order to scam someone close to them with a fake audio message. Meta claims it can detect the audio watermark with over 90% accuracy. Also, the watermark is evenly distributed over the whole clip, as opposed to being placed in isolated chunks of the audio, as tends to be current practice. However, Meta currently has no plans to integrate watermarks into the audio-creation tools on its own platforms. AudioSeal can be downloaded from GitHub for free.
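
For readers who want to experiment, the sketch below shows the embed-then-detect workflow the library is built around. The class and model-card names are assumptions based on the project's published usage examples and may differ between releases; check the GitHub README for the current API.

```python
# Sketch of watermarking and detecting an audio clip with AudioSeal.
# Class and model-card names are assumptions from the project's examples;
# consult the AudioSeal README on GitHub for the current API.
import torch
from audioseal import AudioSeal

sample_rate = 16_000
wav = torch.randn(1, 1, sample_rate)   # 1 second of audio, shape (batch, channels, samples)

generator = AudioSeal.load_generator("audioseal_wm_16bits")
watermark = generator.get_watermark(wav, sample_rate)
watermarked = wav + watermark          # the watermark is spread across the whole clip

detector = AudioSeal.load_detector("audioseal_detector_16bits")
result, message = detector.detect_watermark(watermarked, sample_rate)
print("probability the clip is watermarked:", result)
```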

4. The future of generative AI: Here's what technology analysts are saying

This article summarizes conclusions from several analyst firms on the future of generative AI. Gartner, for instance, is quoted as saying that 40% of enterprise applications will include conversational AI in 2024 and that 100 million people will use robo-colleagues to help them by 2026. On the software development front, Gartner says that GenAI will automate 60% of the design effort for websites and mobile apps by 2026, and that 15% of applications will be generated by AI without a human in the loop by 2027. By also using the technology to refactor legacy applications, modernization costs could be reduced by as much as 70%. Among the social impacts, Gartner foresees a 1,000% increase in unionization among knowledge workers by 2028 due to fears of being replaced by AI. Analysis from IDC looks at data quality issues. Though data is the cornerstone of GenAI, the article reports IDC as finding that 82% of organizations have siloed data, 41% feel that data is changing faster than they can keep up with, and 24% do not trust their data.

5. Meta Pauses AI Training on EU User Data Amid Privacy Concerns

Following a request from the Irish Data Protection Commission, Meta has announced a pause in its plan to train its GenAI models on the public Facebook and Instagram profiles of adult users in the European Union. Meta is currently using the public profile data of users in the United States and other markets to train its models. Data protection watchdogs such as noyb ("none of your business") point out that the GDPR requires that personal data only be used when users give informed and explicit consent for the processing, and noyb has consequently filed a complaint. Meta had argued that using this personal data without explicit user permission was valid under the GDPR on the grounds of legitimate interests, the argument that an organization may use personal data in a manner that people can "reasonably expect".

6. How to opt out of Meta’s AI training

This article from MIT Technology Review explains how to opt out of Meta using your public profile data on its platforms (Facebook, Instagram, WhatsApp) to train its GenAI models. Meta has used public profile data to train its models in the past, and a person's photo might end up in training data even if the photo was published on someone else's profile. The goal of the opt-out is to avoid violating data protection laws such as the GDPR, but, as mentioned earlier, the opt-out clause is contested: privacy advocates argue that an opt-in clause should be used, whereby profile data is only used if the user gives their explicit consent. Meta has stated that it does not use the contents of private messages to train its models. In the US and other countries where personal data protection laws are weaker, Meta says it might not respond to opt-out requests.

7. Google Gemini proves a better health coach than humans

This article reports on Google's Personal Health Large Language Model (PH-LLM), based on Google's Gemini platform. A distinguishing feature is that the model is built to process time-series data from wearable devices such as smartwatches and heart rate monitors, as well as from exercise apps and social media activity. The article describes an experiment in which answers to sleep and exercise questions from the PH-LLM model and from human experts were compared, and PH-LLM outperformed the domain experts in all categories. The model is not without its flaws, however. For instance, it did not always generate consistent responses and could be overly conservative or cautious; in one case, it failed to identify under-sleeping as a potential source of harm. Further, the model was trained on data from relatively active individuals and therefore cannot address the full range of sleep and fitness concerns.
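
The core engineering idea, conditioning a language model on device time-series alongside a question, can be illustrated very simply: the readings are serialized into text the model can reason over. The sketch below is a toy illustration of that framing with invented field names and values; it is not the PH-LLM data pipeline.

```python
# Toy illustration of turning wearable time-series into an LLM prompt.
# Field names and values are invented; this is not the PH-LLM pipeline.
from dataclasses import dataclass

@dataclass
class NightlySleep:
    date: str
    total_sleep_min: int
    deep_sleep_min: int
    resting_heart_rate: int

week = [
    NightlySleep("2024-06-10", 412, 55, 58),
    NightlySleep("2024-06-11", 365, 41, 61),
    NightlySleep("2024-06-12", 448, 63, 57),
]

def build_prompt(records, question):
    lines = ["Nightly sleep data (date: total sleep min, deep sleep min, resting HR):"]
    lines += [f"{r.date}: {r.total_sleep_min}, {r.deep_sleep_min}, {r.resting_heart_rate}"
              for r in records]
    lines.append(f"Question: {question}")
    return "\n".join(lines)

prompt = build_prompt(week, "How can I improve my deep sleep?")
print(prompt)  # this text would be sent to the coaching model
```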

8. Apple is promising personalized AI in a private cloud. Here’s how that will work.

Apple has presented its offering of personalized AI services, which will appear with the iPhone 15 Pro and in macOS Sequoia on Macs and iPads with M1 chips or newer. Since some AI tasks are too demanding to run on these devices, many will have to be processed in the cloud. Apple says that any personal data sent to the cloud will be encrypted and deleted immediately after use, and only the AI task invoked by the user will be able to decrypt the data. Apple is inviting independent cybersecurity researchers to review this security architecture, which it calls Private Cloud Compute. The article notes that Apple has less incentive than other companies to collect personal data, since its business model is oriented towards hardware and services rather than ads.
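
The privacy promise rests on a simple pattern: each request is encrypted with a key that only the specific service handling it can use, and both the data and the key are discarded once the response is produced. The sketch below illustrates that pattern with a per-request symmetric key from the Python cryptography package; it is a conceptual toy, not Apple's Private Cloud Compute design, which additionally relies on hardware attestation and auditable server images.

```python
# Conceptual toy of per-request encryption: only the invoked task holds the key,
# and the key, ciphertext and plaintext are all dropped after the response.
# This is NOT Apple's Private Cloud Compute implementation.
from cryptography.fernet import Fernet

def handle_ai_request(personal_data: bytes) -> str:
    key = Fernet.generate_key()                      # fresh key for this single request
    ciphertext = Fernet(key).encrypt(personal_data)  # what travels to the cloud

    # --- inside the cloud task that the user invoked ---
    plaintext = Fernet(key).decrypt(ciphertext)
    response = f"summarised {len(plaintext)} bytes of user context"

    # key, ciphertext and plaintext go out of scope here: nothing is retained
    return response

print(handle_ai_request(b"calendar entries, recent messages, photo metadata"))
```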

9. In the shadow of generative AI, what remains uniquely human?

This article discusses one of the existential questions around GenAI: what qualities do humans retain that cannot be replicated by GenAI models? The article cites a scene from Stanley Kubrick's film 2001: A Space Odyssey in which a reporter asks whether the machine on the spacecraft feels emotions: "Well, he acts like he has genuine emotions... but as to whether or not he has real feelings is something that I don't think anyone can truthfully answer". The article argues that human emotions, cognitive flexibility, intuition and the contribution made by the five senses are all required for original thinking and complex problem solving. Another human and primate characteristic that AI does not possess is the presence of mirror neurons. These are neurons that fire when a specific motor act or emotion is experienced, and which also fire when a person observes that act or emotion in another person. Mirror neurons are seen as important in enhancing empathy, competition and teamwork.

10. Propagandists are using AI too – and companies need to be open about it

This article analyzes a report by Meta on abuses of its platforms for influence operations, specifically citing Russia, China, Iran, and Israel. The operations mentioned include using GenAI to create large volumes of social media comments in multiple languages as well as turning news articles into Facebook posts. Following Russian interference in the 2016 US Presidential elections, Facebook, YouTube, and Twitter (now X) began reporting on influence operations (though Elon Musk has since discontinued the practice at X). Meta removed six covert operations from its platforms in 2024, though it claims these operations were not having a big impact. One argument made is that creating mal-information is not enough: effort must still be made to bring people to the content, and this still requires paying influencers. A notable example is a party in Bangladesh that amassed 3.4 million followers across 98 pages of fake content; the party's lack of political relevance to US big tech may explain why the operation went unnoticed for so long. At the same time, the community is getting organized, with initiatives such as the AI Incident Database and the Political Deepfakes Incidents Database, as well as more cases of social media users alerting platform administrators and other users to instances of mal-information.

11. Generative AI at Work

This paper presents the results of a study that seeks to precisely measure the productivity improvement from using AI chatbots in a customer support context. The experiment was conducted during the deployment of one of OpenAI's GPT models in the call center of a US software supplier, where 5,179 customer support agents received advice from the chatbot. The chatbot was trained on the practices of the most productive employees. The study found an overall productivity increase of 14%, measured as the number of issues resolved per hour. However, the improvements were especially large for lower-skilled and less experienced employees, whereas the improvement for experienced employees was negligible. This is attributed to the chatbot capturing the skills of the most productive workers and making them easier to transfer to less experienced colleagues. The result contrasts with earlier waves of computer technology, which empowered higher-skilled employees more than lower-skilled ones. The paper raises the question of remunerating experienced employees for the skills they contributed to the training of the chatbot.