Summary
A new Google Gemini model includes a dial to control how much reasoning the model does in response to a query. This highlights a major concern about reasoning models: overthinking. For many tasks, reasoning models expend far more compute and energy than the task warrants. Meanwhile, a VOX article debates whether reasoning models are really reasoning in the “human” sense of the term. For one philosopher, the proof that models do not reason can be seen in the examples where they fail. An emerging term is “jagged intelligence”, which describes the fact that AI models “can both perform extremely impressive tasks while simultaneously struggling with some very dumb problems”.
The well-known AI researcher Tamay Besiroglu has founded a company called Mechanize whose mission is to provide digital environments for “the full automation of all work” and “the full automation of the economy”. Elsewhere in the US, a group of law professors has filed an amicus brief to support authors in their lawsuit against Meta. Meta claims that training models with copyrighted material should be considered fair use, but the brief labels this claim “a breathtaking request for greater legal privileges than courts have ever granted human authors”. A unit of US Marines in the Pacific region is experimenting with generative AI on a large scale, using it to translate and summarize foreign news sources, perform sentiment analysis of social media posts, and write daily and weekly intelligence briefs. An MIT Technology Review article examines the interplay of creativity and generative AI. One artist using AI says that some artists “don’t actually talk about these AI generative models as a tool – they talk about them as a material, like an artistic material, like a paint”. The use of AI for creation has also been called co-creativity or more-than-human creativity.
The US tariff exemption on imports from China of smartphones, laptops and other electronic equipment will be removed within two months. The US Commerce Secretary said that “a special focus-type of tariff” will be put in place for these products. Elsewhere, President Trump has announced that he is freezing 2.3 billion USD in funding to Harvard University, and reviewing a further 9 billion USD of grants, in retaliation for what he says was antisemitism displayed during student protests against Israel’s invasion of Gaza. The Harvard University president accused the US government of threatening the “values of a private institution devoted to the pursuit, production, and dissemination of knowledge”.
On the model development front, OpenAI has released a new family of models – GPT-4.1, GPT-4.1 mini and GPT-4.1 nano – which it claims perform particularly well on software engineering benchmarks. At the same time, it is deprecating its largest model, GPT-4.5 Preview, launched only two months ago, saying that GPT-4.1 delivers “improved or similar performance on many key capabilities at much lower cost and latency”. The move will allow OpenAI to reclaim computing resources. Finally, a WIRED article reviews the importance of small language models. Though there is no universal definition, a model with fewer than 10 billion parameters is generally considered small. Focused on narrower tasks, a small model can perform very well within its domain of expertise without the huge development costs of large models.
Table of Contents
1. Generative AI is learning to spy for the US military
2. Trump warns exemptions on smartphones, electronics will be short-lived, promises future tariffs
3. Law professors side with authors battling Meta in AI copyright case
4. How AI can help supercharge creativity
5. OpenAI’s new GPT-4.1 models can process a million tokens and solve coding problems better than ever
6. Harvard rejects Trump demands, gets hit by $2.3 billion funding freeze
7. Is AI really thinking and reasoning — or just pretending to?
8. Small Language Models Are the New Rage, Researchers Say
9. A Google Gemini model now has a “dial” to adjust how much it reasons
10. Famed AI researcher launches controversial startup to replace all human workers everywhere
1. Generative AI is learning to spy for the US military
This article reports on how a unit of US Marines, composed of 2’500 service members in the Pacific region, is using generative AI to interpret intelligence reports, in a test of a tool developed by the Pentagon. Uses of the technology include translating and summarizing foreign news sources, doing sentiment analysis of social media posts, and writing daily and weekly intelligence briefs. Intelligence services are believed to process terabytes of data in 80 different languages. The data also includes unclassified reports from operatives in the field as well as readings from physical sensors (e.g., devices listening in on shipping communications). Critics of the approach say that language models are still not great at detecting sentiment and emotions in text, which undermines political sentiment analysis, and that open-source intelligence data on the Internet is subject to misinformation and manipulation. This calls for a “human in the loop”, but checking generative AI summaries of such vast amounts of data is challenging. There are also worries about how checks are being made on controversial technologies. The article mentions, for instance, the case of US Immigration and Customs Enforcement, which now uses an advanced database for tracking undocumented immigrants. The US is not the only country using generative AI in the military: Israel has reportedly been using the technology to sift through intelligence and identify targets in Gaza.
2. Trump warns exemptions on smartphones, electronics will be short-lived, promises future tariffs
This article reports that the US tariff exemption on imports from China of smartphones, laptops and other electronic equipment will be removed within two months. The US Commerce Secretary said that “a special focus-type of tariff” will be put in place for these products, as well as for pharmaceuticals. A Chinese government representative said the exemption shows “how important China is to major US tech companies that rely heavily on the country for manufacturing and innovation”. Beijing also said it will not react to future increases in US tariffs by Trump since Chinese tariffs are now so high that there is “no market acceptance for US goods”.
Meanwhile, Nvidia announced that it is going to build 500 billion USD worth of AI infrastructure on US soil over the next four years. Though Nvidia designs its own chips, production is outsourced to the Taiwan Semiconductor Manufacturing Company (TSMC). Trump has placed a 32% tariff on products from Taiwan, though this has been suspended for 90 days along with the other “reciprocal” tariffs. Production of Nvidia’s Blackwell GPU has already started at a TSMC plant in Phoenix, Arizona. This choice is also a response to the 2022 US CHIPS Act, which subsidizes the development of chips in the US.
3. Law professors side with authors battling Meta in AI copyright case
In the US, a group of professors who specialize in copyright law have filed an amicus brief to support authors in their lawsuit against Meta. The authors claim that Meta used their works to train the Llama AI models without permission from, or remuneration for, the authors, and that this violates copyright law. An amicus brief is a legal document submitted to a court by people not directly involved in a case but who have a strong interest in the outcome. Meta claims that using copyrighted material to train models should fall under the fair use doctrine of copyright law, but the brief calls this claim “a breathtaking request for greater legal privileges than courts have ever granted human authors”. It also argues that “training use is also not ‘transformative’ because its purpose is to enable the creation of works that compete with the copied works in the same markets – a purpose that, when pursued by a for-profit company like Meta, also makes the use undeniably ‘commercial’”. Other briefs in support of the authors have been filed by the International Association of Scientific, Technical, and Medical Publishers (the global trade association for academic and professional publishers), the Copyright Alliance, and the Association of American Publishers.
4. How AI can help supercharge creativity
This article examines the interplay of creativity and generative AI. One of the artists mentioned in the piece is an algorave (for “algorithmic rave”) musician: at these live music events, the musician creates music in real time by writing code. This particular musician uses an AI agent to suggest novel sound combinations, which she feels adds to the creativity of her own music. Another artist commented that many people “don’t actually talk about these AI generative models as a tool – they talk about them as a material, like an artistic material, like a paint”. Another artist cited uses a neural network with an extra layer added that creates visual effects in images to give them a Rothko-like feel. Some artists believe that the suggestions from an AI tool can challenge the artist’s assumptions and push them in directions not foreseen. The use of AI for creation has also been called co-creativity or more-than-human creativity.
For others, the use of generative AI cannot be qualified as artistic creation. One problem cited is that it is “one-shot”, lacking the back-and-forth dialog and reflection (focused, deliberate thinking) that artists are used to. One creativity researcher remarked that AI “tools do not give you what you want; they give you what their designers think you want”. AI tools facilitate creation, but not creativity. The challenge going forward is for AI to help us get better at what we want to do, rather than doing it for us.
5. OpenAI’s new GPT-4.1 models can process a million tokens and solve coding problems better than ever
OpenAI has released a new family of models: GPT-4.1, GPT-4.1 mini and GPT-4.1 nano. The models are claimed to be particularly good at software engineering and agent tasks. GPT-4.1 scored 54.6% on the software engineering benchmark SWE-bench Verified, which is over 21% better than GPT-4o. The model scored over 38% on Scale’s agentic MultiChallenge benchmark, which is 10% better than GPT-4o’s score. All models have a context window of one million tokens, equivalent to 750’000 words (and certainly a very large codebase). OpenAI admits nonetheless that model performance degrades when inputs are very large: on its internal OpenAI-MRCR benchmark, model accuracy dropped from around 84% with 8’000 input tokens to 50% with one million tokens. OpenAI is also releasing two new benchmarks: OpenAI-MRCR, for testing a model’s ability to reason in long-context conversations, and Graphwalks, for evaluating reasoning across lengthy documents.
OpenAI is positioning GPT-4.1 as an enterprise solution, in direct competition with Google’s Gemini 2.5 Pro, Anthropic’s Claude 3.7 Sonnet and the latest updates to DeepSeek’s R1. For instance, enterprise pricing is 26% less than for GPT-4o. Further, the models are available via APIs so that enterprises can hook their applications to them, while integration into ChatGPT will happen only gradually. In tests using enterprise applications, Thomson Reuters saw a 17% improvement in multi-document review with its legal AI assistant CoCounsel, and the financial firm Carlyle saw a 50% performance improvement in extracting financial data from documents. Finally, OpenAI announced that it is deprecating its largest model, GPT-4.5 Preview, launched only two months ago, saying that GPT-4.1 delivers “improved or similar performance on many key capabilities at much lower cost and latency”. The move will allow OpenAI to reclaim computing resources.
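For developers, hooking an application to the new models looks like any other chat completions call. Below is a minimal Python sketch using OpenAI’s official SDK; the model identifier “gpt-4.1” follows the announced naming, but treat it as an assumption and check OpenAI’s current model list before relying on it.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# One-shot coding question against the new model family.
response = client.chat.completions.create(
    model="gpt-4.1",  # or "gpt-4.1-mini" / "gpt-4.1-nano" for cheaper tiers
    messages=[
        {"role": "system", "content": "You are a careful senior software engineer."},
        {"role": "user", "content": "Explain what this regex matches: ^\\d{4}-\\d{2}-\\d{2}$"},
    ],
)
print(response.choices[0].message.content)
```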
6. Harvard rejects Trump demands, gets hit by $2.3 billion funding freeze
President Trump has announced that he is freezing 2.3 billion USD in funding to Harvard University, and reviewing a further 9 billion USD of grants, in retaliation for what he says was antisemitism displayed during student protests against Israel’s invasion of Gaza. A group of Harvard professors is suing to block the funding freeze. A government official said that the university exhibited a “troubling mindset that is endemic in our nation's most prestigious universities and colleges – that federal investment does not come with the responsibility to uphold civil rights laws”. Deportation proceedings have already begun against students arrested during the protests, and many student visas have been canceled. The Harvard University president, in a public letter, accused the government of threatening the “values of a private institution devoted to the pursuit, production, and dissemination of knowledge”, adding that “no government – regardless of which party is in power – should dictate what private universities can teach, whom they can admit and hire, and which areas of study and inquiry they can pursue”. The US government has also suspended 400 million USD in federal funding and grants to Columbia University. Harvard is trying to borrow 750 million USD from Wall Street to make up for the frozen funds.
7. Is AI really thinking and reasoning — or just pretending to?
This VOX article debates whether the latest reasoning, or chain-of-thought (CoT), models are really reasoning in the “human” sense of the term. AI companies tend to define reasoning as the ability to break a problem down into smaller problems and to tackle them step by step. Human reasoning, however, is more varied. There is deductive reasoning (arriving at a specific conclusion from a general statement), inductive reasoning (arriving at a broad generalization from specific statements), analogical reasoning (understanding something new or unfamiliar by comparing it with something already known), causal reasoning (understanding cause and effect between events), and common-sense reasoning. For Shannon Vallor, a philosopher of technology at the University of Edinburgh, reasoning models exhibit “a kind of meta-mimicry”: whereas earlier models mimicked statements from training data, reasoning models mimic the human process that produces the statements. For Vallor, the proof that models do not reason can be seen in the examples where they fail. For Melanie Mitchell, professor at the Santa Fe Institute, reasoning models act “more like a bag of heuristics than a reasoning model”: instead of reasoning, they draw on a mix of memorized information and heuristics. An emerging term for AI systems is “jagged intelligence”, which describes the fact that AI models “can both perform extremely impressive tasks while simultaneously struggling with some very dumb problems”. For Ajeya Cotra, senior analyst at Open Philanthropy, the key to approaching an AI model is to “use it as a thought partner, not an oracle”.
8. Small Language Models Are the New Rage, Researchers Say
This article reviews the importance of small language models. Large models remain expensive to train and to operate: Google reportedly spent 191 million USD to train its Gemini 1.0 Ultra model, and the Electric Power Research Institute estimates that a single ChatGPT query consumes 10 times more energy than a Google search request. While large models are designed for general-purpose applications (e.g., chatbots) and data-intensive computing (e.g., image generators, drug discovery data processing), small models work well for focused domains like healthcare questions, summarization, etc. Though there is no universal definition of a small language model, a model with fewer than 10 billion parameters is generally considered small. There are two main ways of creating small models. The first is distillation – a process whereby a large (teacher) model is used to teach a smaller (student) model. This allows the smaller model to be trained on high-quality data, in contrast to the volumes of noisy, not always useful data that the large model was trained on. The second is pruning – a process whereby little-used parameters are removed from a neural network. The approach originates in a 1989 paper by Yann LeCun, now at Meta, who found that 90% of the parameters in a trained neural network could be removed without sacrificing accuracy.
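To make the two techniques concrete, here is a generic PyTorch sketch – an illustration of the ideas, not the recipe used by any model named above. The distillation loss follows the classic Hinton et al. (2015) formulation; the pruning helper uses simple magnitude pruning, a cruder heuristic than the second-derivative criterion in LeCun’s 1989 paper.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a KL term that pulls the
    student's softened output distribution toward the teacher's."""
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="batchmean")
    kd = kd * temperature ** 2  # rescale so the KD gradients match the CE term
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

def magnitude_prune(model, fraction=0.9):
    """Zero out the smallest-magnitude weights globally, keeping only
    the largest (1 - fraction) share of parameters active."""
    all_weights = torch.cat([p.detach().abs().flatten()
                             for p in model.parameters()])
    k = max(1, int(fraction * all_weights.numel()))
    threshold = all_weights.kthvalue(k).values  # k-th smallest magnitude
    with torch.no_grad():
        for p in model.parameters():
            p.mul_((p.abs() > threshold).to(p.dtype))
```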
9. A Google Gemini model now has a “dial” to adjust how much it reasons
The latest update to a Google Gemini model includes a dial to control how much reasoning the model does in response to a query. One motivation is to reduce costs: outputs are six times more expensive to generate, in compute (and energy), when reasoning is enabled. One leaderboard reportedly shows that a simple task can cost over 200 USD to complete. Most top AI models use chain-of-thought reasoning today, as it is seen as a better way to yield powerful models than simply increasing the volume of high-quality training data. That said, the article highlights a major concern about reasoning models: overthinking. For many tasks, reasoning models expend far more compute and energy than the task warrants. Tulsee Doshi, product team lead for Gemini, says that “for simple prompts, the model does think more than it needs to”, which is why developers asked for a dial in the model’s API.
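In practice the dial is a per-request token budget for reasoning. Below is a minimal sketch using Google’s google-genai Python SDK; the ThinkingConfig/thinking_budget parameter matches the SDK’s documented interface for Gemini 2.5 Flash, but treat the exact model identifier as an assumption and check the current API reference.

```python
from google import genai
from google.genai import types

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # assumed preview identifier
    contents="What is 12 * 9?",
    config=types.GenerateContentConfig(
        # The "dial": 0 turns reasoning off for simple prompts;
        # larger budgets allow more reasoning tokens before answering.
        thinking_config=types.ThinkingConfig(thinking_budget=0),
    ),
)
print(response.text)
```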
10. Famed AI researcher launches controversial startup to replace all human workers everywhere
The AI researcher Tamay Besiroglu, who also founded the non-profit AI research organization Epoch, has launched a company called Mechanize whose mission is to provide digital environments for “the full automation of all work” and “the full automation of the economy”. It seeks to make it possible to automate any job. For the moment, the company plans to concentrate on white-collar jobs, since automating blue-collar jobs would also require robots. Besiroglu calculates the annual total addressable market as the sum of all salaries paid to humans – around 18 trillion USD in the US and 60 trillion USD worldwide. He claims that automating all labor will raise standards of living, and that wages will actually increase for humans working in roles that AI cannot perform. He has not, however, detailed the economics behind these claims.
Meanwhile, TechCrunch is maintaining a list of layoffs in the tech industry, mainly in the US. It reports 150’000 jobs lost in 2024, and 22’000 already lost in 2025. Big Tech companies are included in the list (Google, Microsoft, Amazon, Meta, TikTok, HP, eBay, …). AI is not often cited as a reason for the layoffs, though many companies attribute them to the need to improve operational efficiency.