California Governor Vetoes AI Bill

High-Quality Training Data and Fine-Tuning

Posted on October 7th, 2024

Summary

In the past week, California Governor Gavin Newsom vetoed the AI bill SB 1047, which was designed to place responsibility on AI model developers to implement safety protocols to prevent “critical harms”. The bill applied only to large AI systems. The governor felt that the bill should also have addressed high-risk uses of smaller AI systems.

OpenAI has raised another 6.6 billion USD in funding, with the company now valued at 157 billion USD. The cash inflow will help fund the high operating and training costs of the OpenAI models, and allow the company to focus on future projects like the development of an AI chip and on paying for licenses with content providers (Reddit, Condé Nast, etc.).

On the usage of generative AI, a report by the US National Bureau of Economic Research (NBER) has found that 40% of adults in the US have used generative AI. 24% of US employees use GenAI tools weekly and one employee in nine uses a tool daily. Another report estimates that GenAI could boost European GDP by between 1.2 and 1.4 trillion EUR over the next 10 years, an increase of 8%, but notes that the EU still trails in research and talent development. A Hugging Face article looks at the motivations for Chinese AI firms to seek markets abroad: one reason is a very competitive home market; another is the perceived high degree of regulation in China – with the compliance costs that follow.

On the question of adopting AI within companies, InfoQ reports on a panel discussion of questions such as whether an on-site model is better than an external API-based model, and whether fine-tuning is preferable to prompt engineering. Cohere has released a new service to help companies fine-tune one of its AI models.

On the technical side, the startup Augmented Intelligence has released a chatbot that uses a mixture of symbolic (rule-based) AI and neural networks. One advantage of this approach is that the rules improve the explainability of the AI’s decisions. An InfoWorld article looks at how AI chatbots that hallucinate code package names increase the risk of supply chain attacks on software. Finally, the Allen Institute for Artificial Intelligence (Ai2) has released a model family called Molmo that is trained on image data annotated by people. This yields higher-quality training data, which in turn yields models with far fewer parameters.

1. Generative AI hits 28% usage rate, spreads throughout US workplace: NBER

This CFODive article relates findings by the US National Bureau of Economic Research (NBER) on the use of generative AI in US workplaces and homes. The research finds that by March 2024, over 40% of adults in the US had used generative AI. The most popular GenAI tools are used more than 3 billion times per month. In the workplace, 24% of employees use GenAI tools weekly and one employee in nine uses a tool daily. The most popular uses are in business and management (for writing, interpretation, and administrative support) and in software development and operations. The researchers estimate that GenAI contributes to between 0.5% and 3.5% of working hours, which they calculate leads to a modest productivity boost of 0.125% to 0.875%. One expert quoted in the article asks whether increased worker productivity should lead to wage increases. NBER underlines that the GenAI adoption rate exceeds those of PC adoption and Internet access in previous decades. The complete report from the National Bureau of Economic Research can be found here.
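As a quick worked check of these figures, the reported boost follows directly from the share of AI-assisted hours if one assumes a 25% productivity gain during those hours – an assumption implied by the numbers, not stated in this summary:

```python
# Worked check of the NBER arithmetic above: the 0.125%-0.875%
# productivity boost follows from the 0.5%-3.5% share of AI-assisted
# working hours under an assumed 25% gain during those hours.
assumed_gain_on_assisted_hours = 0.25  # assumption, not from the report

for share_of_hours in (0.005, 0.035):  # 0.5% and 3.5% of working hours
    boost = share_of_hours * assumed_gain_on_assisted_hours
    print(f"{share_of_hours:.1%} of hours -> {boost:.3%} productivity boost")
```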

Source CFODive

2. Cohere just made it way easier for companies to create their own AI language models

The AI firm Cohere has announced improvements to the service that allows enterprise clients to fine-tune Cohere’s Command R 08-2024 model. This model has open weights, meaning the model’s weights are visible to clients and can be modified – which is what happens when a model is fine-tuned. The model is optimized for reasoning, summarization, and question answering, and is seen as having strong retrieval-augmented generation capabilities. Fine-tuning a model on new data sets is generally an expensive and time-consuming task because it involves tweaking a number of hyperparameters (the model’s configuration settings) and then measuring the training and validation loss that result from these settings. The training loss measures how well the model fits the training data; the validation loss measures how well the model fits an independent but relevant data set, not seen during training. Cohere has integrated the Weights & Biases platform into its fine-tuning service, providing real-time monitoring of training metrics, including validation and training loss, thereby speeding up the iteration cycles of the fine-tuning process. Cohere has also increased the maximum training context length to 16,384 tokens, thus supporting the processing of longer text sequences.
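To make the loss terminology concrete, here is a minimal, self-contained sketch of training-versus-validation-loss monitoring on a toy linear model; the data, learning rate, and epoch count are illustrative and have nothing to do with Cohere’s actual fine-tuning setup:

```python
import random

random.seed(0)
# Synthetic data: y = 2x + 1 plus noise, split into train and validation sets.
data = [(i / 100, 2 * (i / 100) + 1 + random.gauss(0, 0.3)) for i in range(100)]
random.shuffle(data)
train, val = data[:80], data[80:]

def mse(points, w, b):
    """Mean squared error of the model y = w*x + b on a data set."""
    return sum((w * x + b - y) ** 2 for x, y in points) / len(points)

w, b, lr = 0.0, 0.0, 0.1  # lr is a hyperparameter of the kind tweaked above
for epoch in range(1, 101):
    for x, y in train:  # one stochastic-gradient pass over the training data
        err = w * x + b - y
        w -= lr * err * x
        b -= lr * err
    if epoch % 25 == 0:
        # Validation loss rising while training loss keeps falling would
        # signal overfitting -- the divergence such monitoring surfaces.
        print(f"epoch {epoch}: train loss {mse(train, w, b):.4f}, "
              f"val loss {mse(val, w, b):.4f}")
```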

3. A Short Summary of Chinese AI Global Expansion

This Hugging Face article explains the business strategy of Chinese AI companies that are investing and trading outside of China. The larger IT firms are investing heavily in models and cloud infrastructure. For instance, Alibaba announced the construction of data centers in Korea, Malaysia, the Philippines, Thailand, and Mexico, along with a new international version of its large-model service platform (Model Studio). Huawei announced a 430 million USD, five-year investment plan for digital transformation across Asia. Most Chinese operations are concentrated in Asia, where Chinese is generally understood. The situation is different for startups, which cannot afford to train their own models; they are focusing more on the AI application space.

Companies of all sizes are looking for markets outside of China. One reason is that the Chinese market is very competitive: 238 language models were released in China between October 2023 and September 2024. This high number is partly explained by a Chinese government initiative called “AI+” that is pushing for AI adoption across all industries. The increase comes at a time when investment in AI firms is falling: Chinese AI startups had raised approximately 4.4 billion USD in funding by mid-2024, compared to a huge 24.9 billion USD in 2021. Another reason for looking abroad is the high level of AI regulation in China and the perceived costs of compliance. For instance, since September 2024, images created with AI must be labeled as AI-generated (e.g., to counter misinformation). Companies are keen to invest in countries where regulation is less stringent, with Asia, the Middle East, and Africa being popular destinations.

4. The economic opportunity of AI in the EU

This report, commissioned by Google, evaluates the impact of generative AI (GenAI) on the EU economy. Using a financial model by Goldman Sachs, the report estimates that GenAI could boost GDP by between 1.2 and 1.4 trillion EUR over the next 10 years, an increase of 8%. The increase would stem from people working with AI and reallocating the time freed up to other tasks. However, the report warns that failure to adapt job profiles to incorporate GenAI, or delaying this transition, could limit the GDP boost to just 2%. The report estimates that GenAI could raise the productivity of 61% of jobs, with 7% of jobs being in danger of replacement by AI. The report also says that the EU lags behind on “innovation drivers” around AI, including research and talent development, and that the EU’s weak position in AI development is part of a wider technological gap that has been growing since the 2000s. Finally, the report points to some inconsistencies between the GDPR and the new AI Act; these create regulatory uncertainty that can discourage researchers and investors.
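As a quick sanity check on these figures, an 8% boost worth 1.2 to 1.4 trillion EUR implies a baseline EU GDP of roughly 15 to 17.5 trillion EUR, which is consistent with current EU output:

```python
# Sanity check: what baseline GDP do the report's figures imply?
for boost_eur_trillion in (1.2, 1.4):
    baseline = boost_eur_trillion / 0.08  # boost divided by the 8% increase
    print(f"{boost_eur_trillion} trillion EUR boost -> "
          f"~{baseline:.1f} trillion EUR baseline GDP")
```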

5. Large language models hallucinating non-existent developer packages could fuel supply chain attacks

This InfoWorld article discusses a security vulnerability, known as a package confusion attack, that large language models (LLMs) can help to exploit. A package is a software unit that implements a feature and can be easily inserted into a software application during development. In terms of lines of code, an application today can contain a higher percentage of external package code than code written by the application’s primary developers. External packages often come from GitHub and repositories specializing in open-source software. The article notes that a 2023 study found that 245,000 packages on popular repositories contained malicious code, meaning that using these packages in an application could compromise its security. The package confusion attack is a further security risk: a criminal uploads a malicious package to a repository under a name very similar to that of a commonly used package. The aim is to fool application developers into thinking that this is the package they are looking for, a technique also known as typo-squatting or brand-jacking.

The article highlights research finding that LLMs could increase package confusion. The problem is that when LLMs generate code, they can hallucinate the names of the packages the code refers to. This opens the door to supply-chain attacks in which criminals create malicious packages whose names correspond to the hallucinated names proposed by the LLMs. In tests carried out for Python and JavaScript, researchers found that 19.7% of 2.23 million code samples created by popular language models contained hallucinated package names. They suggest that mitigations might include better prompt engineering on the part of programmers and the use of Retrieval-Augmented Generation (RAG) to generate more precise responses from specialized data sets of program code.
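One simple programmatic guard, sketched below under the assumption that packages are installed from PyPI, is to verify that every LLM-suggested package name actually exists before installing it. The suggested names are invented for illustration; note that existence alone does not prove safety – a typo-squatted package exists by design – so this only catches outright hallucinations:

```python
import urllib.error
import urllib.request

def exists_on_pypi(name: str) -> bool:
    """Return True if `name` is a registered project on PyPI."""
    try:
        with urllib.request.urlopen(f"https://pypi.org/pypi/{name}/json",
                                    timeout=10):
            return True
    except urllib.error.HTTPError:
        return False  # 404: no such package name is registered

suggested = ["requests", "requezts-helpers"]  # hypothetical LLM output
for pkg in suggested:
    verdict = "exists" if exists_on_pypi(pkg) else "NOT on PyPI: likely hallucinated"
    print(f"{pkg}: {verdict}")
```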

6. Virtual Panel: What to Consider when Adopting Large Language Models

This InfoQ article summarizes a discussion between four AI firm leaders (from TitanML, Kungfu.AI, Inflection, and AWS) on questions that many organizations are asking themselves about AI adoption. One question is whether one should use an API-based external model or self-host. The consensus seems to be that API-based models are better in the short term, while the organization experiments to find the uses of AI with the best return on investment, because they avoid set-up and maintenance costs. However, self-hosting is more appropriate in the long term for high-volume AI usage, so long as the business model covers infrastructure and operating costs, since it alleviates privacy concerns about data and removes common service-provider risks like pricing changes. Further, the increasing number of open-source language models is widening the choice for self-hosting.
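A back-of-the-envelope way to frame the API-versus-self-host question is as a breakeven calculation between usage-based API pricing and a mostly fixed self-hosting cost. All numbers below are illustrative assumptions, not quotes from any provider:

```python
# Illustrative breakeven sketch: API costs scale with usage, while
# self-hosting is treated as a fixed monthly cost. Both rates are assumptions.
api_cost_per_1k_tokens = 0.01       # USD, assumed blended API rate
selfhost_fixed_per_month = 5_000.0  # USD, assumed GPU + operations cost

breakeven_tokens = selfhost_fixed_per_month / api_cost_per_1k_tokens * 1_000
print(f"Self-hosting pays off above ~{breakeven_tokens:,.0f} tokens/month")

for monthly_tokens in (1e8, 1e9):
    api = monthly_tokens / 1_000 * api_cost_per_1k_tokens
    cheaper = "API" if api < selfhost_fixed_per_month else "self-host"
    print(f"{monthly_tokens:.0e} tokens: API ${api:,.0f} vs "
          f"fixed ${selfhost_fixed_per_month:,.0f} -> {cheaper}")
```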

Another common question is whether it is worthwhile to fine-tune a model, especially when working with company data. Fine-tuning is resource-intensive, and the experts believe that prompt engineering, complemented with retrieval-augmented generation (RAG) to mitigate hallucinations, should be prioritized over fine-tuning. Fine-tuning should only be done when prompt engineering does not give satisfactory results: prompt engineering is more suitable for general tasks, while fine-tuning is more suitable when a high degree of customization is required.
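To illustrate the recommended prompt-engineering-plus-RAG pattern, here is a deliberately tiny sketch: retrieve the most relevant company document and prepend it to the prompt, instead of fine-tuning. The toy keyword-overlap scoring stands in for the embedding similarity a real system would use, and the documents are invented:

```python
DOCS = [
    "Refund policy: refunds are issued within 14 days of purchase.",
    "Shipping policy: orders ship within 2 business days.",
]

def retrieve(question: str) -> str:
    """Pick the document with the most words in common with the question."""
    q_words = set(question.lower().split())
    return max(DOCS, key=lambda d: len(q_words & set(d.lower().split())))

def build_prompt(question: str) -> str:
    # Grounding the model in retrieved context is what mitigates hallucination.
    context = retrieve(question)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

print(build_prompt("How long do refunds take?"))
```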

Another question discussed is the choice between open-source and proprietary models. There is debate among these experts about whether current open-source models are as powerful as proprietary alternatives. However, the experts underline that many organizational use cases can be handled well by open-source models. One expert cited the case of a consultant who was using GPT-4 for a task easily handled by a smaller model; the expert suggested using GPT-4 to label a few thousand examples and then fine-tuning a smaller, cheaper-to-run model on those labels. Finally, the experts insist on clear success criteria for AI projects and on a highly agile approach to implementing them, with a focus on continuous learning.
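The labeling-then-distillation pattern the expert describes can be sketched as follows; the labeler below is a stub standing in for a GPT-4 call, the example texts and labels are invented, and scikit-learn is assumed to be available:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def label_with_large_model(text: str) -> str:
    # Stub: in practice this would prompt a large model (e.g. GPT-4)
    # to classify the text and parse its answer.
    return "complaint" if "refund" in text.lower() else "other"

texts = [
    "I want a refund for my broken order",
    "What are your opening hours?",
    "Refund still not received after two weeks",
    "Thanks for the quick delivery!",
]
labels = [label_with_large_model(t) for t in texts]  # a few thousand in practice

# The small, cheap-to-run model: TF-IDF features + logistic regression.
small_model = make_pipeline(TfidfVectorizer(), LogisticRegression())
small_model.fit(texts, labels)
print(small_model.predict(["Please refund me"]))
```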

7. Augmented Intelligence claims its AI can make chatbots more useful

Symbolic AI, also called rule-based AI, is an alternative to the currently more popular neural-network approach to AI (at the heart of ChatGPT, Llama, etc.). Neural networks are architectures, loosely modeled on the brain, that learn by observing patterns in data. Symbolic AI systems use programmatically defined rules to represent knowledge and make decisions. A key advantage of neural-network approaches is that they learn from observed data; the key advantage of symbolic AI is that its rules allow the decisions made by the AI to be explained. Symbolic AI is traditionally very compute-intensive compared to neural networks, but a recent research paper from the Georgia Institute of Technology, UC Berkeley, and IBM Research demonstrates that a combined approach, called neuro-symbolic AI, can be implemented relatively efficiently. The startup Augmented Intelligence has released a conversational AI called Apollo based on this approach. Trained on the conversations of thousands of customer service agents, Apollo is claimed to have key advantages over other chatbots. First, its rule-based core makes its decisions explainable, using data retrieved from the AI’s logs. Second, the rules permit more agentic behavior: for instance, instead of just answering questions about available flights, the AI can go ahead and book a flight.
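A minimal sketch of the neuro-symbolic split described here: a (stubbed) neural component guesses the user’s intent, while explicit rules decide which action is taken and log an explanation, so every decision can be traced back to a rule. The intents, rules, and actions are hypothetical, not Apollo’s actual design:

```python
def neural_intent(utterance: str) -> str:
    # Stub for a neural intent classifier; returns an intent label.
    return "book_flight" if "book" in utterance.lower() else "ask_flights"

RULES = {
    # intent -> (action, explanation logged for auditability)
    "ask_flights": ("list_flights", "Rule: informational queries get a listing."),
    "book_flight": ("create_booking", "Rule: explicit booking requests may book."),
}

def decide(utterance: str) -> str:
    intent = neural_intent(utterance)          # neural part: pattern recognition
    action, why = RULES[intent]                # symbolic part: rule-based decision
    print(f"intent={intent} action={action} | {why}")  # the explainability log
    return action

decide("Can you book me a flight to Rome?")
```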

Source TechCrunch

8. A tiny new open-source AI model performs as well as powerful big ones

The non-profit Allen Institute for Artificial Intelligence (Ai2) is releasing Molmo, a family of open-source multimodal language models that, Ai2 claims, perform as well as major proprietary models from Google, Anthropic, and OpenAI. A key feature of Molmo is its relatively small number of parameters. Molmo’s largest model has around 72 billion parameters, and Ai2 claims that it outperforms OpenAI’s GPT-4o on tasks relating to understanding images, charts, and documents, even though GPT-4o is believed to have over one trillion parameters. Ai2 attributes this success to its training methods. Existing popular AIs were trained on whatever Internet data was available, a process that Ai2 believes leads to a lot of noise and hallucinations. Molmo’s models, in contrast, were trained on a dataset of only 600,000 images, each annotated by hand beforehand, with the contents of the images described in detail by the annotators. The key lesson is that training on high-quality data can lead to AI models with far fewer parameters.

9. Gov. Newsom vetoes California’s controversial AI bill, SB 1047

The California Governor, Gavin Newsom, has vetoed the AI bill SB 1047. The goal of the bill was to place responsibility on AI model developers to develop safety protocols that prevent “critical harms”. The regulation was to apply to large AI systems – those whose development costs exceed 100 million USD or whose training uses more than 10²⁶ FLOPs of computation. The bill was opposed by many Silicon Valley firms, including OpenAI, despite late amendments prompted by Anthropic and others. Governor Newsom agreed that the bill was well-intentioned, but said it “does not take into account whether an AI system is deployed in high-risk environments, involves critical decision-making or the use of sensitive data. Instead, the bill applies stringent standards to even the most basic functions — so long as a large system deploys it.” The California State Senator who brought the bill, Scott Wiener, called the decision “a setback for everyone who believes in oversight of massive corporations that are making critical decisions that affect the safety and welfare of the public and the future of the planet.”

10. OpenAI raises $6.6B and is now valued at $157B

OpenAI announced that it has raised 6.6 billion USD in a funding round that values the company at 157 billion USD, bringing the total cash raised to 17.9 billion USD. The investors include Microsoft (investing just under 1 billion USD) and Nvidia (100 million USD). In comparison, Elon Musk’s xAI raised over 6 billion USD this year, Anthropic has now raised 9.7 billion USD in total, and Cohere and Mistral have each raised around 1 billion USD. The article cites a report that OpenAI has asked its investors not to invest in its rivals. It goes on to mention that ChatGPT now has more than 250 million users, of whom 10 million are paying subscribers, and that annual revenue is reportedly 3.4 billion USD. Beyond Microsoft’s partnership, Apple is integrating ChatGPT into its Apple Intelligence feature.

The cash inflow is needed to finance operations. The GPT-4 model cost more than 100 million USD to train, and ChatGPT can cost up to 700,000 USD per day to run. OpenAI has reportedly already spent 7 billion USD on model training. The money will also allow the company to invest in future projects, such as developing its own AI chip to reduce its dependence on Nvidia, its current supplier. Funding is also needed to pay for licenses from content providers like Reddit and Condé Nast, thereby reducing the risk of future lawsuits for IP violations.