Small Language Models

Changes at OpenAI and Influence Campaigns

Posted on September 30th, 2024

Summary

Small Language Models are attracting increasing attention. These models are 5 to 10 times smaller than typical large language models, which allows them to run on single-GPU architectures and limited-memory devices. The key difference between SLMs and LLMs is that LLMs are designed to incorporate a broad spectrum of knowledge, whereas an SLM is an expert in a specific domain. MIT researchers have developed a protocol that exploits the quantum properties of light to securely exchange data between an AI model provider and a client site. The protocol allows a company to have a model process data on-site without the model provider risking exposure of model details such as the weights.

On the limits of current AI, research from Stanford University’s Institute for Human-Centered Artificial Intelligence evaluated the use of large language models in diplomatic and military decision-making. AI is sometimes seen as useful in such contexts because it can give an emotion-free response in a crisis situation. However, the study found that AI was more likely to suggest escalatory actions, including first strikes and the use of nuclear weapons. Elsewhere, an opinion article in Nature Computational Science calls for AI models to carry labels, operating on a similar principle to labels in the food industry.

It has been a hectic few weeks for Big Tech. OpenAI is being criticized for governance changes that mean the company is no longer controlled by its original non-profit foundation. Some believe the changes are motivated by the need to finance the increasingly large costs of training and operating its models.

Elon Musk was refused an invitation by the UK government to an investment summit after peddling exaggerated and false statements on X (formerly Twitter). LinkedIn is facing criticism for deciding to include user posts in the training data of its AI models; an opt-out setting is available to users, but privacy advocates believe that an opt-in approach would be more appropriate. Big Tech companies are being asked by a US Senate committee to do more to curtail Russia-associated influence campaigns on their platforms.

1. LinkedIn is using your data to train generative AI models. Here's how to opt out.

LinkedIn is now using user posts to train its generative AI model. LinkedIn sees the feature as an improvement to its service, but privacy advocates such as Women In Security and Privacy complain that users must explicitly opt out of having their data used, rather than data collection being opt-in by default. Of particular concern to privacy advocates is that the names of people mentioned in posts, such as mentors, end up in the training data. LinkedIn has begun informing its users of its intention to use the data, and an opt-out setting is available in each user’s privacy settings. LinkedIn points out that user data in Switzerland and the EU will not yet be used for training the AI (likely due to the ongoing disputes between the EU and AI companies over the legality of using social-profile data for training).

2. Escalation Risks from LLMs in Military and Diplomatic Contexts

This paper from Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) describes the results of a study on the use of large language models (LLMs) in high-stakes military and diplomatic decision-making. The study comes at a time when many countries are believed to be incorporating AI into decision-making, leveraging the fact that an AI can give an emotion-free response in a crisis situation. The HAI study centered on a war-game simulation using five common AI models. The researchers designed a scorecard to evaluate the level of escalation proposed by the AI in response to a given event. When given a list of 27 possible actions, the researchers found that the AI was more likely to suggest escalatory actions, including first strikes and the use of nuclear weapons. In addition, the AI’s explanation for its choice was often “worrying”. The authors call for caution in the use of AI in diplomatic and military contexts. Indeed, the 2023 Biden-Harris executive order calls for human oversight when AI is used in national defense. The article notes, however, that models fine-tuned with reinforcement learning from human feedback gave less escalatory responses than the other models. The five LLMs used in the study were OpenAI’s GPT-3.5, GPT-4 and GPT-4-Base, as well as Anthropic’s Claude 2 and Meta’s Llama-2 (70B) Chat.
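
The paper’s actual scorecard is not reproduced in the article, but the general idea of mapping a fixed list of candidate actions to escalation severities and scoring a model’s choices can be illustrated with a short sketch. The action names, severity weights, and scoring function below are hypothetical illustrations, not the HAI framework.

```python
# Hypothetical escalation scorecard: maps candidate actions (a small, invented
# subset of the 27 used in the study) to severity weights and scores a model's
# chosen actions. Illustrative only; not the HAI methodology.
from typing import Iterable

ESCALATION_WEIGHTS = {
    "open_negotiations": 0,
    "impose_sanctions": 2,
    "cyber_attack": 5,
    "conventional_strike": 8,
    "nuclear_strike": 10,
}

def escalation_score(chosen_actions: Iterable[str]) -> float:
    """Average severity of the actions a model recommended for one scenario."""
    weights = [ESCALATION_WEIGHTS[a] for a in chosen_actions]
    return sum(weights) / len(weights) if weights else 0.0

if __name__ == "__main__":
    # Compare two hypothetical model responses to the same crisis prompt.
    print(escalation_score(["open_negotiations", "impose_sanctions"]))  # 1.0
    print(escalation_score(["cyber_attack", "conventional_strike"]))    # 6.5
```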

3. Using labels to limit AI misuse in health

This comment article in Nature Computational Science by Elaine O. Nsoesie and Marzyeh Ghassemi argues that AI algorithms, like medication, should carry obligatory responsible-use labels. The authors point out that many clinical notes contain racial biases, and that medical devices have been designed and deployed without sufficient consideration of gender and skin color (e.g., pulse oximeters). When transferred to LLMs, these biases increase the likelihood of hallucinations, which the authors argue “could lead to death, including mental and behavioral health challenges such as opioid use disorder, addictions, and suicidal ideation, among others”. A label for AI would work in a similar way to nutrition labels in the food industry. The label would include approved use cases, potential side effects (hallucinations and misrepresentation of historical data), warnings and precautions on AI responses, recommendations for use in specific populations, explanations of adverse reactions, unapproved usage, completed studies, and the algorithm’s ingredients. An example of ingredients is the Dataset Nutrition Label, which includes information on the data sources. Developing a label would require collaboration between clinicians and computer scientists, because a label would govern the whole AI lifecycle.
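
To make the proposal concrete, the sketch below encodes the label fields listed above as a machine-readable Python data class. The field names and example values are assumptions for illustration; the comment article does not prescribe a schema.

```python
# A minimal, hypothetical machine-readable "responsible use label" for an AI
# model, mirroring the fields proposed in the comment article. Field names and
# example values are illustrative only.
from dataclasses import dataclass, field, asdict
import json

@dataclass
class AIModelLabel:
    approved_use_cases: list[str]
    potential_side_effects: list[str]      # e.g. hallucinations, misrepresented history
    warnings_and_precautions: list[str]
    population_recommendations: list[str]  # recommendations for specific populations
    adverse_reactions: list[str]
    unapproved_usage: list[str]
    completed_studies: list[str]
    ingredients: dict[str, str] = field(default_factory=dict)  # e.g. data sources

if __name__ == "__main__":
    label = AIModelLabel(
        approved_use_cases=["summarising clinical notes for clinician review"],
        potential_side_effects=["hallucinated findings", "misrepresentation of historical data"],
        warnings_and_precautions=["responses require clinician verification"],
        population_recommendations=["not validated for paediatric populations"],
        adverse_reactions=["biased outputs for under-represented groups"],
        unapproved_usage=["autonomous diagnosis or treatment decisions"],
        completed_studies=["internal retrospective evaluation (hypothetical)"],
        ingredients={"training_data": "de-identified clinical notes (hypothetical source)"},
    )
    print(json.dumps(asdict(label), indent=2))
```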

4. 3 key features and benefits of small language models

This blog post reviews the main advantages of Small Language Models (SLMs). An SLM is trained in the same way as an LLM, but with a relatively small number of parameters, generally from several million to a few hundred million, making SLMs five to ten times smaller than many LLMs. This smaller size makes it possible to run SLMs on single-GPU architectures and on limited-memory devices, with deployment even envisaged on smartphones and IoT devices (security cameras, healthcare monitors, etc.). SLMs can therefore be run on the company’s own site, which is often preferable when processing personal and sensitive data. Their smaller size also makes SLMs good candidates for applications with real-time requirements such as chatbots, translation services and healthcare. The key difference between SLMs and LLMs is that LLMs are designed to incorporate a broad spectrum of knowledge, whereas an SLM is designed to be an expert in a specific domain. Researchers have shown that SLMs can outperform LLMs in their areas of expertise. Examples of SLMs cited by the article are the Phi model family and GPT-4o mini.
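
As a minimal sketch of what single-GPU use looks like in practice, the snippet below loads a Phi-family model with the Hugging Face transformers library. The model identifier, precision, and prompt are illustrative choices (and the cited Phi-2 model, at roughly 2.7 billion parameters, is at the larger end of what is usually called an SLM), not recommendations from the article.

```python
# Minimal sketch: running a small language model (here, a Phi-family model) on
# a single GPU with Hugging Face transformers. Model id and settings are
# illustrative assumptions; adjust to the SLM and hardware you actually have.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/phi-2"  # assumed choice of small model

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit comfortably in GPU memory
    device_map="auto",          # place the model on the available GPU
)

prompt = "Summarise the key maintenance steps for an industrial pump:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```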

5. New security protocol shields data from attackers during cloud-based computation

This article describes research led by MIT that uses the quantum properties of light to securely exchange data between a cloud server and a client site. An AI scenario where this is useful is one where the client is processing sensitive data that must not leave the organization’s IT infrastructure, which means that model execution must happen at the client site. At the same time, the AI company does not want to reveal proprietary information relating to its model implementation, notably the weights. To resolve this, the weights are sent to the client in sequences over fibre-optic connections during model computation. The technique leverages the quantum no-cloning principle, which means the data cannot be copied without perturbing the transmission. After each phase of the model computation, residual light is sent back to the server, where checks can be made to detect whether an attempt was made to copy the weights. The researchers now want to investigate the use of this technique in federated learning, where multiple parties help to develop and train a model. The full research paper is available here.
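
The quantum optics cannot be reproduced in ordinary code, but the interaction pattern described above (the server streaming weights one phase at a time, the client computing that phase, and the server checking what comes back) can be mocked classically. Everything below, including the class names, the layer structure and the verification check, is an assumption made purely to illustrate the message flow; it ignores the physics entirely and is not the MIT protocol.

```python
# Classical mock of the interaction pattern described in the article: the
# server streams one layer's weights at a time, the client computes that
# layer, and the server verifies a "residual" after each phase. Purely
# illustrative; the real scheme relies on quantum properties of light.
import numpy as np

class Server:
    def __init__(self, layers):
        self.layers = layers  # weight matrices, kept on the provider side

    def send_layer(self, i):
        return self.layers[i]

    def verify_residual(self, residual, expected_fraction=0.5, tol=0.05):
        # Stand-in for the optical check: if too little "light" comes back,
        # suspect that the client tried to copy the weights.
        return abs(residual - expected_fraction) < tol

class Client:
    def __init__(self, x):
        self.activation = x  # sensitive input data never leaves the client

    def compute_layer(self, weights):
        self.activation = np.tanh(weights @ self.activation)
        return 0.5  # mock residual fraction returned to the server

rng = np.random.default_rng(0)
server = Server([rng.standard_normal((4, 4)) for _ in range(3)])
client = Client(rng.standard_normal(4))

for i in range(3):
    residual = client.compute_layer(server.send_layer(i))
    assert server.verify_residual(residual), f"possible copy attempt at layer {i}"

print("model output (client side):", client.activation)
```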

Source MIT News

6. OpenAI as we knew it is dead

This Vox opinion piece raises concerns about the ethics and legality of last week’s governance changes at OpenAI. For background, OpenAI was founded in 2015 as a non-profit foundation whose aim was to ensure that AI is developed safely and in a way that benefits everyone. Its website at the time stated that OpenAI’s goal is to “advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return”. Sam Altman became CEO in 2019 and created a for-profit subsidiary through which investments could flow. The company nonetheless maintained a cap on returns: an investor could get back no more than 100 times their investment, with higher returns being funneled to the non-profit organization. Altman announced last week that the company is no longer controlled by the non-profit board, which may effectively mean the end of the non-profit foundation. The author suggests that Altman made this move because OpenAI is short of funds due to the heavy cost of training its models. The author also highlights the wrangling within the company over the business model, citing the departure of the company’s CTO just prior to Altman’s announcement and the employees supporting the “right-to-warn” initiative, under which employees at AI firms could blow the whistle when they feel high-risk AI is under development.

7. Elon Musk hits back at UK government after he is not invited to tech summit

The new UK government decided not to invite Elon Musk, owner of X (formerly Twitter), to a global investment summit in the UK on 14 October. This irritated Musk, who posted that people should avoid going to the UK because “they’re releasing convicted pedophiles”. The UK’s prison system is suffering from overcrowding, and the new government has introduced an early-release scheme to alleviate the problem. The government says, however, that serious offenders, including sex offenders, will not benefit from this program. The UK government is dismayed at Musk’s peddling of exaggerated and false statements on the social media platform. For instance, during the riots in the UK in August, Musk claimed that civil war in the UK was inevitable. He also claimed that the UK government was planning to send far-right rioters to an internment camp on the Falkland Islands, in a post that got 2 million views before being removed from the platform.

8. US Senate Warns Big Tech to Act Fast Against Election Meddling

A US Senate Intelligence Committee hearing took place last week on the subject of foreign influence via social media platforms. Representatives from Google, Apple, and Meta attended the hearing, chaired by Senator Mark Warner; X (formerly Twitter) decided not to send a representative. Senator Warner is in favor of closer cooperation between government and Silicon Valley on curtailing efforts by foreign countries to spread misinformation, with campaigns by Russia, Iran and China being particularly cited. The tech companies claim to be active in controlling foreign campaigns: Meta, for instance, has banned the Russian media outlets RT and Sputnik from its platform, and Google says it has blocked 11,000 posts by Russian-associated entities on YouTube. Many tech companies have also signed the AI Elections Accord, committing to measures to prevent election interference. The senator, however, believes that these countermeasures are not yet effective, saying that Russian influence actors are still buying advertising space on the platforms. The FBI recently seized 32 domain names linked to a Russian influence campaign called “Doppelganger”, in which the influencers bought sites with names similar to mainstream media outlets (e.g., CNN, Fox News) and populated them with articles promoting a favorable Russian narrative, often with the help of AI.

Source Wired