Summary
One of this week’s developments is the publication of the Open Source AI Definition by the Open Source Initiative. The definition sets out what “open” means in the context of AI models. It upholds the “four essential freedoms”, that is, the right to use, study, modify, and share models. It covers model code as well as model parameters and weights. However, the definition does not require training data to be shared.
An article reporting a study published in Nature estimates that the hardware (GPUs, CPUs, memory modules, and storage devices) used to train and run generative AI models could produce 5 million metric tons of e-waste by 2030. For comparison, the worldwide population creates 60 million metric tons of e-waste each year.
On the application side, a UK firm is deploying its UK-trained autonomous vehicles in the US. The vehicles learned to drive on the left-hand side of the road and must now adapt to the right-hand side. Cars from Cruise and Tesla have been involved in accidents recently, and some experts doubt the current ability of autonomous cars to drive in neighborhoods outside the locations covered by their training data, or to deal with the unpredictable behavior of human drivers. An article from VentureBeat argues that federated learning, where several participants contribute independently to training a model, is the best way to train cybersecurity models for threat detection. On the one hand, participant trainers need not reveal sensitive data to other participants. On the other, it helps ensure fresh datasets, since too many models developed in university environments are trained on the same shared datasets.
The newest version of the Claude 3.5 Sonnet model can now interact with desktop applications and is seen as a step towards the development of AI “agents” (which can execute tasks like booking flights for users). Investors regard agents as an avenue to a quicker return on their AI investments. An InfoWorld article looks at the challenges organizations face in managing legacy data for their AI deployments. Organizations want to use AI to extract value from their unstructured data, which can account for as much as 95% of their data.
A Guardian news article looks at the financial problems at X (formerly known as Twitter) that stem from advertisers avoiding the platform. Apple is adding AI to recent versions of the iPhone, iPad, and Mac computers. However, iPhone and iPad users in Europe will not yet have access to the AI features due to uncertainties around the EU’s Digital Markets Act, which requires Internet giants to avoid closed ecosystems.
Table of Contents
1. How Wayve’s driverless cars will meet one of their biggest challenges yet
2. Anthropic’s new AI model can control your PC
3. AI will add to the e-waste problem. Here’s what we can do about it.
4. Elon Musk hopes Trump victory will help his $44bn Twitter bet pay off
5. Apple Intelligence is coming to the EU in April 2025
6. OSI unveils Open Source AI Definition 1.0
7. Bridging the performance gap in data infrastructure for AI
8. How (and why) federated learning enhances cybersecurity
9. Differentiable Adaptive Merging is accelerating SLMs for enterprises
1. How Wayve’s driverless cars will meet one of their biggest challenges yet
This MIT Technology Review article puts the spotlight on self-driving cars, and on the UK firm Wayve in particular. Wayve received 1 billion USD in funding in 2024 and, after training its AI model in the UK, where cars drive on the left-hand side of the road, the company is deploying its cars in the US (where cars drive on the right-hand side). This deployment is seen as a big test of the Wayve AI model. Unlike other self-driving car developers, Wayve uses an “end-to-end” learning approach: a single AI model is deployed, trained on camera footage, driving-instructor feedback, and simulation inputs. The company says this approach has enabled it to deploy its cars in several UK cities even though the model was trained exclusively on footage from London, and claims that only limited retraining is required for deployment in the US.
Overall, there is still debate about whether autonomous cars are ready for large-scale use. Cars belonging to Cruise and Tesla have been involved in accidents recently, at least one of which was fatal. Autonomous cars are seen to perform best in heavy traffic, where they mainly need to follow the car directly in front. However, some experts doubt their current ability in neighborhoods outside the locations covered by their training data, as well as their ability to deal with the unpredictable behavior of human drivers.
2. Anthropic’s new AI model can control your PC
Anthropic wants to build AI agents for desktops that can perform “back-office jobs” like document search and answering emails, or tasks like filling out web forms (e.g., booking an airline flight) by searching for the relevant information on the desktop. The newest version of the Claude 3.5 Sonnet model can now interact with desktop applications and can emulate the gestures of a person sitting at a PC, such as mouse movements and clicks. Several companies are working on similar AI “agents”, which some see as a way of getting a quicker return on investment in AI. The article cites a Capgemini report claiming that 10% of US companies are already using AI agents and that 80% will adopt them in the next three years. The article nonetheless reports that 3.5 Sonnet failed many agentic tasks in tests. Another worry is security, since a jail-broken agent could leak personal information or carry out improper tasks like ordering a fake passport.
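To make the mechanics concrete, below is a minimal sketch of how the computer-use capability is exposed through Anthropic’s API, based on the beta names documented at the feature’s launch (the model identifier, beta flag, and tool type shown here may have changed since). The model does not click anything itself: it returns tool-use blocks describing GUI actions that the calling program must execute and report back on.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

response = client.beta.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    betas=["computer-use-2024-10-22"],        # beta flag at launch
    tools=[{
        "type": "computer_20241022",          # virtual mouse/keyboard/screen
        "name": "computer",
        "display_width_px": 1280,
        "display_height_px": 800,
    }],
    messages=[{"role": "user", "content": "Book a flight from the open tab."}],
)

# The agent loop: execute each requested gesture with a local automation
# library, send back a screenshot as the tool result, and re-prompt.
for block in response.content:
    if block.type == "tool_use":
        print(block.input)  # e.g. {"action": "left_click", "coordinate": [640, 400]}
```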
3. AI will add to the e-waste problem. Here’s what we can do about it.
This article reports that the hardware (GPUs, CPUs, memory modules, and storage devices) used to train and run generative AI models could produce 5 million metric tons of e-waste by 2030. E-waste designates all forms of electrical and electronic equipment that have been discarded. The worldwide population creates 60 million metric tons of e-waste each year. One problem with e-waste is that it can contain environmentally dangerous materials like lead, mercury, and chromium. It also represents a missed opportunity to recycle valuable materials like copper, gold, silver, aluminum, and rare earth elements.
In a study published in Nature, scientists estimate the likely volume of e-waste linked to generative AI under different adoption rates and under various server-farm management scenarios that can minimize e-waste. The latter include prolonging the lifetime of hardware (2 to 5 years is currently the norm) and designing hardware whose smaller components can be refurbished and reused. Such measures could reduce generative AI e-waste by up to 86% in the best-case scenario. Currently, only 22% of e-waste is collected and recycled, and recycling requires a waste management infrastructure that not all countries have put in place.
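A quick back-of-the-envelope check of the figures quoted above, using only the article’s own numbers (5 million metric tons projected, an 86% best-case reduction, a 22% collection rate):

```python
projected_genai_ewaste_mt = 5.0   # million metric tons by 2030 (worst case)
best_case_reduction = 0.86        # circular-economy measures, best case
collection_rate = 0.22            # share of e-waste collected and recycled today

best_case_mt = projected_genai_ewaste_mt * (1 - best_case_reduction)
recycled_mt = projected_genai_ewaste_mt * collection_rate

print(f"Best-case GenAI e-waste by 2030: {best_case_mt:.1f} Mt")     # 0.7 Mt
print(f"Recycled at today's collection rate: {recycled_mt:.1f} Mt")  # 1.1 Mt
```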
4. Elon Musk hopes Trump victory will help his $44bn Twitter bet pay off
This Guardian article looks at the financial situation of X (formerly known as Twitter) since its takeover by Elon Musk in October 2022. Musk paid 44 billion USD for Twitter, but the Fidelity investment group now values the platform at 9.4 billion USD. This devaluation reflects a large fall in advertising revenue, which in 2021 accounted for 90% of Twitter’s 5.1 billion USD total income. Visits to the platform are also down: 4.3 billion daily visits today compared to 5 billion in 2022. Advertisers have pulled back out of concern over Musk’s right-wing rhetoric and his approach to content moderation, including his decision to sue the Center for Countering Digital Hate (CCDH) because, Musk claimed, the center’s accusations caused a loss in revenue for the platform. An expert from the Reuters Institute for the Study of Journalism believes that Musk has had some success in pushing the political agenda to the right. In any case, Musk’s personal fortune is currently estimated at 270 billion USD, so the short-term future of the platform is considered safe.
5. Apple Intelligence is coming to the EU in April 2025
Apple has announced the first suite of Apple Intelligence features for recent iPhone, iPad, and Mac models. These include writing tools, image cleanup, article summarization, and a redesigned Siri experience. The features will not yet be available to European users on the iPhone and iPad, though they will be available to European Mac users. The reason relates to regulatory uncertainty around the EU’s Digital Markets Act (DMA), which defines new digital competition rules preventing the Internet’s gatekeepers (currently Alphabet, Amazon, Apple, ByteDance, Meta, and Microsoft) from entrenching self-reinforcing ecosystems. The DMA applies to any company with a market cap of 75 billion EUR and a user base of over 45 million monthly active users. In response to the regulation, Apple is already preparing to allow developers to distribute iPhone and iPad applications in other marketplaces, and is opening up Apple devices to other (non-WebKit-based) browsers.
6. OSI unveils Open Source AI Definition 1.0
The Open Source Initiative (OSI) has released the first version of its Open Source AI Definition (OSAID). The goal of the definition is to create a framework for “permission-less, pragmatic, and simplified collaboration for AI practitioners, similar to that which the Open Source Definition has done for the software ecosystem”. A total of 25 organizations were involved in the project, including companies such as Microsoft, Google, Amazon, Meta, Intel, and Samsung, and groups including the Mozilla Foundation, the Linux Foundation, the Apache Software Foundation, and the United Nations International Telecommunication Union (ITU) in Geneva.
An AI platform whose license adheres to the principles of the OSAID must guarantee the four essential freedoms: a user may i) use the system for any purpose without having to ask for permission, ii) study how the system works, iii) modify the system for any purpose, and iv) share the system for others to use, with or without modifications. The definition applies to model code as well as to model parameters and weights. However, as one expert points out, the definition does not require that training data be shared. This is a sensitive issue, e.g., for models trained on patient medical data. The OSI does, however, request that model providers give enough information about the training data that a “skilled person can recreate a substantially equivalent system using the same or similar data”. The issue is complicated by ongoing worries about the use of copyrighted content in training data and, as the Internet Watch Foundation reports, by significant activity on dark-web forums where open source models are used to traffic Child Sexual Abuse Material (CSAM).
7. Bridging the performance gap in data infrastructure for AI
One of the biggest obstacles to deploying AI in organizations is the lack of IT infrastructure and having to deal with legacy data systems. This InfoWorld article reviews why legacy systems have become a problem. By the 1990s, enterprise storage was supported by on-premises data centers, and the media used included hard disk drives, optical disks, and magnetic tape. The compute environment at the time was migrating from mainframes to client-server architectures. Data access was handled by manual SQL queries and the Open Database Connectivity (ODBC) standard, which permitted access to databases from within application code. The emergence of business intelligence followed, where data was moved to data lakes using ETL (extract, transform, load) processes that transformed operational data into a business-interpretable form. A key difference with the advent of AI is that organizations are now looking to extract value from their in-house unstructured data, which can account for as much as 95% of their data. The article argues that processing this data requires investment not only in GPUs but also in software infrastructure such as edge computing, DevOps approaches like containerization and infrastructure as code, and MLOps platforms for continuous delivery of models, both for training and for data extraction and transformation.
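As a reminder of what that legacy pattern looks like in practice, here is a minimal ETL sketch in the 1990s style the article describes: rows pulled from an operational database over ODBC, reshaped into a business-interpretable form, and loaded into an analytics store. The DSNs, table names, and columns are hypothetical.

```python
import pyodbc

src = pyodbc.connect("DSN=operational_db")   # hypothetical source DSN
dst = pyodbc.connect("DSN=analytics_store")  # hypothetical target DSN

# Extract: a manual SQL query against the operational system.
rows = src.cursor().execute(
    "SELECT order_id, amount_cents, ordered_at FROM orders"
).fetchall()

# Transform: make the data business-interpretable (cents -> currency units).
facts = [(r.order_id, r.amount_cents / 100.0, r.ordered_at) for r in rows]

# Load: write the reshaped rows into the analytics store.
cur = dst.cursor()
cur.executemany(
    "INSERT INTO sales_facts (order_id, amount, ordered_at) VALUES (?, ?, ?)",
    facts,
)
dst.commit()
```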
8. How (and why) federated learning enhances cybersecurity
This article looks at some of the pros and cons of using federated learning to improve cybersecurity. Federated learning is an approach in which several participants work together to train a single AI model. In the standard approach, known as horizontal partitioning, the participants share the primary algorithm and a common feature set, and each trains the model locally on its own data. The resulting models are then aggregated centrally by the model owner into a unified model. One benefit of the approach is that participants can train the model on security-sensitive data without having to reveal that data to other participants. Also, there is no central point of attack for data exfiltration or training data corruption, and the approach yields lower overall training latency. On the other hand, it requires mutual trust between all participants, and each needs sufficient infrastructure to train models locally. For the authors, the federated approach is particularly suited to threat classification tasks and attack detection scenarios. Perhaps the strongest argument in favor of the federated approach is that it encourages fresh datasets. The article warns that AI model misalignment (where the model does not behave according to the objectives of its designers) can result from models being trained on borrowed datasets: on the Papers With Code website, 50% of the datasets come from just 12 universities, and datasets are borrowed more than 58% of the time.
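The horizontal-partitioning scheme described above can be sketched in a few lines, in the style of the classic federated averaging (FedAvg) algorithm. Toy linear models and synthetic data stand in for real security models: each participant takes a gradient step on its own private data, and only the resulting weights are sent back for aggregation, weighted by dataset size.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1):
    """One local gradient step on a participant's private data (MSE loss)."""
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights, shards):
    """Aggregate locally trained models, weighted by each dataset's size."""
    total = sum(len(y) for _, y in shards)
    return sum(local_update(global_weights, X, y) * (len(y) / total)
               for X, y in shards)

# Three participants with private data drawn from the same distribution;
# the raw data never leaves a participant, only model weights do.
rng = np.random.default_rng(0)
true_w = np.array([1.5, -2.0])
shards = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    shards.append((X, y))

w = np.zeros(2)
for _ in range(100):          # federated training rounds
    w = federated_round(w, shards)
print(w)                      # converges towards true_w
```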
9. Differentiable Adaptive Merging is accelerating SLMs for enterprises
Developing and training an AI model is an expensive operation. For this reason, organizations seeking to tailor AI to their own needs will combine existing models, an approach known as model merging. There have been several approaches to merging. The simplest is a mixture-of-experts approach, where the individual models run independently and a routing component hands each request off to the most appropriate model. One advantage of this approach over fine-tuning a new model is that it exploits the expertise of all the models and avoids the catastrophic forgetting problem (where a model loses previously learned information when learning new knowledge). Its disadvantage is the cost of keeping all the distinct models running. Alternative approaches seek to amalgamate the component models into a single model, which can be technically difficult since the models’ internal representations differ. Researchers from Arcee AI and Liquid AI have developed a technique called Differentiable Adaptive Merging (DAM) that enables cost-efficient model merging by identifying an optimal contribution for each component model. DAM has been open-sourced and is available on GitHub.
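The core idea can be illustrated with a toy version of coefficient-learned merging (a hypothetical simplification in the spirit of DAM, not Arcee AI and Liquid AI’s actual implementation): the merged weights are a weighted combination of the component models’ weights, and the mixing coefficients are tuned by gradient descent on a small calibration set.

```python
import numpy as np

rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=2), rng.normal(size=2)  # two toy "expert" models
X = rng.normal(size=(100, 2))                    # small calibration set
y = X @ (0.3 * W1 + 0.7 * W2)                    # target behavior to match

alpha = np.array([0.5, 0.5])                     # learnable merge coefficients
for _ in range(2000):
    W = alpha[0] * W1 + alpha[1] * W2            # merged model
    err = X @ W - y
    grad_W = 2 * X.T @ err / len(y)              # d(loss)/dW
    # Chain rule: d(loss)/d(alpha_i) = d(loss)/dW . W_i
    alpha -= 0.02 * np.array([grad_W @ W1, grad_W @ W2])

print(alpha)  # approaches the optimal contributions [0.3, 0.7]
```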