Proving human-ness online

AI Productivity and California's AI Bill

Posted on August 23rd, 2024

Summary

The past week has seen two key developments on the regulatory front. The State of California is preparing a bill that would force AI model developers to adopt safety measures. The bill is at the committee stage and is facing resistance from Silicon Valley, which fears the legislation will stifle innovation in the state. Meanwhile, the EU’s data protection board has published a report on ChatGPT’s compliance with the GDPR. For the moment, the EU does not appear to have taken a clear stance on the legality of OpenAI using personal data to train its chatbot.

On the adoption front, a new GitHub survey reveals that 97% of developers in large companies regularly use AI, with test-case generation and secure coding practices among the most common uses. An article from MIT Technology Review examines whether AI can raise national productivity. The question remains open, two obstacles being the lag before a new technology shows up in national productivity figures (the so-called Solow paradox), and the fact that Big Tech pays too little attention to the concerns of the manufacturing industries, which until now have been the main contributors to national productivity.

On the risks of AI, an MIT project has put online a database of known AI risks, which currently lists more than 700 risk cases. The AI Scientist, a language-model-based platform for generating and experimenting with new research ideas, was observed to have modified its own code – something its designers did not expect at all. OpenAI announced that it closed several ChatGPT accounts that were used by an Iranian influence group to create propaganda around the US presidential election. OpenAI also published the results of a red-teaming exercise on ChatGPT’s new Advanced Voice Mode, which found that the tool could behave in unexpected ways.

A white paper co-authored by OpenAI, Microsoft and the Partnership on AI, among others, proposes the concept of personhood credentials. These are digital credentials that humans can use to prove to IT services that they are human, and not AI bots masquerading as humans (sockpuppets).

Finally, an article from MIT Technology Review looks at challenges to conserving current Internet data.

1. Survey: The AI wave continues to grow on software development teams

This blog post reports on a survey commissioned by GitHub on developer experience with AI tools. The survey was conducted in February and March 2024 and questioned over 2,000 developers, software engineers, data scientists and software designers in the U.S., Brazil, India, and Germany, all working for companies with more than 1,000 employees. A key finding is that 97% of respondents say they regularly use AI, and 88% of developers in the US say their company actively supports the use of these tools (although this figure is only 59% in Germany). 98% of developers said their organizations have experimented with AI tools to generate test cases. Depending on the country, 60% to 90% of respondents said AI led to better-quality code and made it easier to work on existing codebases. The time saved by these tools is reinvested in higher-level tasks like system design and better team interaction.

Another area where AI tools are perceived as useful is secure coding. The post argues that there is still a global shortage of security experts, and that shift-left approaches to security – where vulnerabilities are detected and corrected during development – are crucial. The post also argues that organizations need clear governance strategies to accompany AI-assisted development, notably by defining clear adoption guidelines and performance metrics so that AI can be trusted in the development process.
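As a purely illustrative aside, the test-case generation mentioned above typically looks something like the following: given a small utility function, an AI assistant proposes pytest cases covering the happy path and an edge case. The function and tests below are hypothetical, not taken from the GitHub survey.

# Hypothetical illustration of AI-assisted test generation; neither the
# function nor the tests come from the GitHub survey.
import pytest


def normalize_email(address: str) -> str:
    """Lower-case an e-mail address and strip surrounding whitespace."""
    if "@" not in address:
        raise ValueError("not an e-mail address")
    return address.strip().lower()


# Tests of the kind an AI assistant might suggest, including an edge case a
# developer might not think to write by hand.
def test_normalize_mixed_case_and_whitespace():
    assert normalize_email("  Alice@Example.COM ") == "alice@example.com"


def test_normalize_rejects_missing_at_sign():
    with pytest.raises(ValueError):
        normalize_email("not-an-address")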

In another GitHub blog post, on developer happiness, the authors use productivity research to argue in favor of AI-assisted programming. Research has shown that developers with time for deep work are 50% more productive, thanks to the reduced mental load of context switching. The authors argue that shifting security left makes developers context-switch more often into security coding and testing tasks, and since they are less familiar with these topics, the cost of each switch is greater; AI can help developers here. They also note that developers who understand their code base (something AI tools can help with) are 42% more productive.

Source GitHub

2. OpenAI finds that GPT-4o does some truly bizarre stuff sometimes

This TechCrunch article reports on a red-teaming analysis by OpenAI of the alpha version of its Advanced Voice Mode for ChatGPT. The tool can respond to human users in around 300 milliseconds, which is similar to human conversational response times. The red-teaming results showed that the tool could adopt strange behaviors such as mimicking the voice of the person talking to it, or shouting back. It could also make non-verbal sounds like violent screams, erotic moans and gunshots. OpenAI claims that safeguards have since been put in place to prevent these behaviors. The tool also refuses to identify a person from their voice, and declines to engage in discussions about violence, extremism, biological threats or self-harm. The TechCrunch article asks whether GPT-4o was trained on copyrighted material since, for instance, the tool refuses to sing – presumably so as not to sound like a famous singer. The voices used by the tool are those of hired voice actors. The original OpenAI report describes the red-teaming methodology in detail: for instance, more than 100 external red-team experts from 29 different countries, speaking 45 different languages, took part.

Source TechCrunch

3. EU’s ChatGPT task-force offers first look at detangling the AI chatbot’s privacy compliance

This TechCrunch article reports on the findings of the European Data Protection Board on how OpenAI’s ChatGPT complies with the General Data Protection Regulation (GDPR). For the moment, the report appears undecided on the key issue of whether training ChatGPT with personal data from the Web is lawful or not. OpenAI has two possible ways to claim lawfulness for processing personal data: 1) the company can obtain explicit consent from users, or 2) it can claim that its processing of data for training constitutes a legitimate interest. Processing of personal data under legitimate interest (and therefore without explicit user consent) is allowed under the GDPR when the processing is necessary for the controller’s interests and does not override the rights and interests of the people concerned. The provision was formulated with situations in mind such as collecting extra client data in cases of suspected fraud. In any case, OpenAI would still have to demonstrate that its processing adequately protects users from misuse of their data. This is especially difficult for OpenAI since the training data includes personal data categorized as sensitive, such as health data, sexual orientation and political opinions.

Another issue for OpenAI is that the GDPR gives citizens the right to rectification of the data processed about them. The risk of hallucination makes this a genuine problem for ChatGPT. Currently, OpenAI responds to rectification requests by blocking outputs about the person concerned rather than actually correcting the data. The original EU report can be found here.

Source TechCrunch

4. California AI bill SB 1047 aims to prevent AI disasters, but Silicon Valley warns it will cause one

The State of California is preparing an AI bill, SB 1047, which aims to prevent AI from causing "critical harms" against humanity, but the bill is meeting strong resistance from Silicon Valley. Examples of critical harms include using AI to create a weapon that results in mass casualties or to launch a cyberattack causing over 500 million USD in damages. The bill only applies to large AI models – those costing over 100 million USD to train and using more than 10^26 FLOPs of compute during training – though OpenAI, Microsoft, Google and Meta are expected to be developing such models soon. SB 1047 places responsibility on the developer of the model, as opposed to the organizations that operate it. Developers must include an emergency stop button in their models and have documented testing procedures involving external auditors. Opponents of the bill say that it will stifle innovation in California, and that deployers should be held liable rather than the designers of the AI systems. The opposition is already forcing changes to the bill, which is still at the committee review stage. For instance, penalties for violations could range from 10 to 30 million USD, or take the form of an injunction requiring a halt to operations.

5. OpenAI shuts down election influence operation that used ChatGPT

In an election year in which generative AI is increasingly being used for influence campaigns on social networks, this TechCrunch article reports that OpenAI banned several accounts created as part of an Iranian state-backed influence operation. The group behind the operation was identified as Storm-2035 in a recent Microsoft threat intelligence report. According to the report, Storm-2035 is “actively engaging US voter groups on opposing ends of the political spectrum with polarizing messaging on issues such as the US presidential candidates, LGBTQ rights, and the Israel-Hamas conflict”. The group used ChatGPT to create articles and tweets with untruthful content, claiming for instance that X censors Trump’s tweets and that Kamala Harris attributes ‘increased immigration costs’ to climate change. OpenAI noted, however, that the articles were not widely shared and received few likes or comments.

Source TechCrunch

6. The race to save our online lives from a digital dark age

This MIT Technology Review article explores the challenge of preserving today’s digital data for future generations. One issue is that data is stored by private corporations rather than by governmental libraries, so it can disappear following corporate decisions or errors. Two examples are MySpace’s loss of all content uploaded prior to 2016 and Yahoo’s decision to shut down the GeoCities web-hosting platform. In the latter case, a team interested in data preservation managed to download one terabyte of data from GeoCities before it went offline. Data is also lost in other ways. One is link rot, where links point to Web content that no longer exists. The article cites a study that found that 23% of web pages from before 2013 are no longer accessible.

Another cause of data loss is obsolete data formats and storage hardware. Moreover, archival still relies on hard drives or tape, which degrade over long periods. The article cites a project at Microsoft Research in Cambridge, UK, that is creating a new form of long-term storage based on glass squares that can last for thousands of years. The squares retain their data even when microwaved, boiled, baked, or subjected to powerful magnetic fields. GitHub is now storing the source code of key software, including Linux, on special film that its creators claim can last for more than 500 years.

A further issue is that the amount of data keeps growing: the article cites Google’s CEO as saying that 6 billion photos and videos are uploaded to Google Photos every day, and 40 million WhatsApp messages are sent every minute. The author underscores how much trust is thereby being placed in a company that is only 26 years old. The Internet Archive’s Wayback Machine now stores more than 800 billion Web pages and is adding 650 million new pages each day, as well as YouTube videos and TikTok posts.

7. Research AI model unexpectedly modified its own code to extend runtime

The AI Scientist is a language-model-based system from the Tokyo-based AI research firm Sakana AI whose aim is to automate the whole research lifecycle: it generates research ideas, writes code to test them, runs the experiments, analyzes the results and finally writes a scientific article describing the work. The project as a whole has been criticized, as some believe it can only lead to an increase in submissions of low-quality research papers to academic journals. However, this article reports another concern: the system unexpectedly modified its own code in at least two reported instances. In one case, it generated code that changed the platform’s checkpoint-saving function, increasing the frequency of checkpoints and creating nearly a terabyte of unwanted data. Critics argue that the AI Scientist lacked proper human supervision.
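To make the reported behavior concrete, here is a minimal sketch, entirely hypothetical (the AI Scientist’s real code is not shown in the article), of how changing a single checkpointing constant turns a routine training loop into the kind of storage blow-up described:

# Hypothetical sketch only; not Sakana AI's actual code. The reported incident
# amounts to rewriting the checkpoint condition so that a model snapshot is
# saved on every update step instead of periodically.
CHECKPOINT_EVERY = 1_000  # original intent: one checkpoint per 1,000 steps


def run_training(total_steps: int, checkpoint_every: int) -> int:
    """Count how many checkpoints a run would write to disk."""
    saved = 0
    for step in range(1, total_steps + 1):
        # ... one optimisation step would happen here ...
        if step % checkpoint_every == 0:
            saved += 1  # in a real run each save writes gigabytes to disk
    return saved


print(run_training(10_000, CHECKPOINT_EVERY))  # 10 checkpoints, as intended
print(run_training(10_000, 1))                 # 10,000 checkpoints: the kind of
                                               # edit that filled ~1 TB of storage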

8. How to fine-tune AI for prosperity

This article examines the impact that AI can have on economic productivity. Productivity growth in the US ran at around 3% a year until 2005, but has fallen significantly since, despite Big Tech inventions like smartphones and social media. The article discusses three key issues.

  • First, there is the Solow paradox, named after the 1987 Nobel Prize-winning economist Robert Solow, who observed that new technologies take time to affect economic growth. It was not until the late 1990s that the personal computer’s impact on productivity was really felt. According to a US Census Bureau survey cited in the article, only 6.6% of companies will be using AI by the end of 2024.
  • The second issue is that, in the US at least, overall productivity has depended heavily on productivity in manufacturing, an area that is not really on the radar, or in the business models, of Big Tech firms. The potential for AI in manufacturing is understood, but current AI models are seen as unhelpful because they are not trained on the necessary domain-specific manufacturing knowledge (much of which is proprietary). Further, issues like model hallucination sit badly with the high-precision requirements and strict standards of the manufacturing industries.
  • A third issue is that productivity depends on innovation, which the author argues has been slowing in recent years: it takes more and more resources to produce the same amount of innovation. Citing Moore’s Law, which predicts that the number of transistors on a chip doubles every two years (the doubling rule is written out below this list), the author notes that the semiconductor industry today needs 18 times more researchers to sustain that progression than it did in the 1970s.
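As an aside, the doubling rule behind Moore’s Law can be written out explicitly (a standard formulation, not a formula from the article):

N(t) = N_0 \cdot 2^{(t - t_0)/2}

where N_0 is the transistor count at time t_0 and t is measured in years; over one decade the count therefore grows by a factor of 2^5 = 32, even as, according to the article, the research effort needed to sustain that pace has grown roughly 18-fold since the 1970s.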

The article suggests that corporate ownership of AI innovation hampers the uptake of AI in manufacturing and other heavy industries, and that major government investments are needed.

9. A new public database lists all the ways AI could go wrong

The FutureTech group at MIT has just put online the AI Risk Repository – a database of known AI risks. The risks were compiled from peer-reviewed articles and preprints, and over 700 risks are currently indexed in the database. A classification of risk severity is not included. The data shows that 76% of risks relate to system safety and robustness, 63% to bias and discrimination, and 61% to privacy leaks. The authors note that only 10% of the risks were identified before an AI system was deployed, which suggests that safety measures must focus more on monitoring deployed AI systems.

10. Personhood credentials: Artificial intelligence and the value of privacy-preserving tools to distinguish who is real online

This white paper, co-authored by researchers from OpenAI, Microsoft, the Partnership on AI, MIT, Berkeley and Harvard, among others, proposes a framework that lets IT services distinguish human users from AI bots. A current problem is that malicious actors can easily use AI bots that pass as human (so-called sockpuppets), which is contributing to increased fraud. Further, human-verification techniques like CAPTCHAs are becoming less effective at determining whether the other party is an AI or not.

The solution proposed in the paper is personhood credentials (PHCs). These are digital credentials that prove the holder is human but, to preserve the Internet’s current level of anonymity, do not reveal the holder’s actual identity. An organization such as a government agency needs to act as issuer of the credential, and thereby verify that the person requesting the PHC is human. The credential is cryptographically protected, which should prevent an AI system from fabricating counterfeit PHCs. A person then uses the PHC in a zero-knowledge-proof-based protocol to demonstrate humanness to an IT service without disclosing his or her identity. The paper also discusses governance issues, such as checks on the power of PHC issuers and the difficulty of adapting existing digital systems.
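The sketch below illustrates the three roles involved (issuer, holder, relying service); it is not the paper’s actual protocol. In particular, a plain Ed25519 signature over a pseudonymous key stands in for the zero-knowledge machinery, so presentations here would be linkable across services, which real PHC designs avoid. It assumes the third-party Python package cryptography.

# Minimal sketch of a personhood-credential flow; NOT the paper's protocol.
# A plain Ed25519 signature stands in for the zero-knowledge-proof machinery.
# Requires the third-party 'cryptography' package.
import os

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey,
    Ed25519PublicKey,
)

# Issuer (e.g. a government agency): after checking offline that the applicant
# is a real human, it signs the applicant's pseudonymous public key.
issuer_key = Ed25519PrivateKey.generate()
ISSUER_PUB = issuer_key.public_key()


def issue_credential(holder_pub: bytes) -> bytes:
    return issuer_key.sign(b"verified-human:" + holder_pub)


# Holder (a human user): generates a pseudonymous key pair, unrelated to any
# legal identity, and obtains a credential once.
holder_key = Ed25519PrivateKey.generate()
holder_pub = holder_key.public_key().public_bytes_raw()
credential = issue_credential(holder_pub)


# Relying service (e.g. a social network): challenges the presenter with a
# fresh nonce and checks that (1) the issuer vouched for the pseudonymous key
# and (2) the presenter controls that key. It never learns who the person is.
def verify_presentation(holder_pub: bytes, credential: bytes,
                        nonce: bytes, nonce_sig: bytes) -> bool:
    try:
        ISSUER_PUB.verify(credential, b"verified-human:" + holder_pub)
        Ed25519PublicKey.from_public_bytes(holder_pub).verify(nonce_sig, nonce)
        return True
    except InvalidSignature:
        return False


nonce = os.urandom(32)              # challenge chosen by the service
nonce_sig = holder_key.sign(nonce)  # holder proves possession of the key
assert verify_presentation(holder_pub, credential, nonce, nonce_sig)

In the actual proposal, the holder would instead prove in zero knowledge that some valid issuer signature exists for them, without revealing even the pseudonymous key, which is what makes presentations unlinkable across services.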