The End of Programming as We Know It?

AI Companions Flourish

Posted on March 11th, 2025

Summary


Chatbots have been in the news recently. In China, DeepSeek’s R1 model is increasingly being used to play the role of a BaZi fortune teller. BaZi is a form of astrology that analyzes a person’s fate based on birth date and place. The practice is rising at a time of increased pessimism in China, with the fallout of the Covid-19 pandemic and a downturn in the economy. Meanwhile, the founder of Ex-Human, the company behind Botify AI, is quoted as saying that “digital humans have the potential to transform our experiences, making the world more empathetic, enjoyable, and engaging”. Botify AI is a site where users can create chatbots to fulfill roles of friendship and sexual role-playing. Elsewhere, Bret Taylor of OpenAI believes that agents will be the digital face of companies in the next five to ten years, in the way that mobile apps are today.

A paper from Stanford University argues that fairness research and practice around large language models has been too focused on “difference unawareness”, i.e., making data gender-blind and color-blind. The researchers argue that this creates problems, such as racially color-blind datasets that lead to the development of ineffective medical devices. The authors propose Difference Awareness benchmarks, which accept differences (gender, race) when this is desirable in a given context and still combat them when it is not. Meanwhile, new research introduces Chain of Draft (CoD) reasoning, a problem-solving technique for large language models that nearly matches Chain of Thought (CoT) prompting in accuracy at a fraction of the token cost and latency. In Chain-of-Draft (CoD) reasoning, the model is prompted to keep each intermediate reasoning step to a minimal draft rather than a full explanation.

A blog post by Tim O’Reilly argues that programmers must learn to manage AI assistants, and that only those programmers who do not update their skills will lose out. Programmers will acquire new skill sets to manage coding bots, and increased coding productivity can enlarge the “programmable surface area” of businesses – the range of business activities that software can be applied to – which in turn will require more human programmers.

In relation to companies, several former OpenAI employees have been subpoenaed by a US magistrate judge in a copyright case against OpenAI, in which the company is accused of using copyrighted material to train its AI models. Anthropic has raised 3.5 billion USD in a Series E funding round, valuing the company at 61.5 billion USD post-money. This is roughly 60 times its annualized revenue, which shows that investors remain largely optimistic about the productivity gains to be expected from AI. Meanwhile, Liang Wenfeng, founder of DeepSeek, has told the Wall Street Journal that he is not seeking venture capital funding.

On the cybersecurity front, the Swedish public broadcaster SVT has exposed a Georgian criminal group that scammed 35 million USD using ads on Facebook and Google. Meta insists that it does not consider itself financially liable for losses to victims of scams originating on its platforms. The scam involves using deepfake videos of well-known personalities to promote fraudulent cryptocurrency investment schemes.

1. How DeepSeek became a fortune teller for China’s youth

This article examines the rise of DeepSeek’s R1 model for giving life advice in China using BaZi, or Four Pillars of Destiny, a form of astrology. BaZi analyzes a person’s fate based on birth date and place, and then examines the balance of the basic elements (wood, fire, earth, metal, and water) to estimate a person’s fortune. For instance, a person with a large dose of the “wood” element would be advised to follow a career in fire industries (which include entertainment and technology). Fortune-telling has always been prominent in Chinese societies, even though today’s government frowns upon it, as it does on overt religious practice. The combination of AI and BaZi is proving successful for two reasons. First, compared to practices like Western astrology, BaZi has a more reasoned, algorithmic basis, which makes it amenable to automation by chatbots, especially when the prompt is correctly crafted and supplemented with ancient texts on the subject such as Yuanhai Ziping and Sanming Tonghui. Second, the practice is taking hold at a time of increased pessimism in China, with the fallout of the Covid-19 pandemic and a downturn in the economy. The article points out that there might also be an economic incentive for model providers: it found that the R1 model would often suggest expensive jewelry and rare stones to users to help solve their problems. The model owners say these suggestions are due to the high volume of advertising text in the training data. Nevertheless, the model designers are also hoping that R1 can be a chance to introduce BaZi to Western users.
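The “algorithmic basis” of BaZi is easy to see in its simplest component, the year pillar, which follows the standard sexagenary-cycle formula. The sketch below is illustrative only – a full BaZi reading also derives pillars for month, day, and hour, and the year boundary is Chinese New Year rather than January 1st:

```python
STEMS = ["Jia", "Yi", "Bing", "Ding", "Wu", "Ji", "Geng", "Xin", "Ren", "Gui"]
BRANCHES = ["Zi", "Chou", "Yin", "Mao", "Chen", "Si",
            "Wu", "Wei", "Shen", "You", "Xu", "Hai"]
# Each pair of heavenly stems maps to one of the five elements.
ELEMENTS = ["wood", "fire", "earth", "metal", "water"]

def year_pillar(year: int) -> tuple[str, str, str]:
    """Return (heavenly stem, earthly branch, element) for a Gregorian year.

    Simplification: ignores the Chinese New Year boundary, so dates in
    January/early February are attributed to the wrong year.
    """
    stem_index = (year - 4) % 10          # 1984 is the cycle start (Jia-Zi)
    branch_index = (year - 4) % 12
    return STEMS[stem_index], BRANCHES[branch_index], ELEMENTS[stem_index // 2]
```

For example, `year_pillar(1984)` gives `("Jia", "Zi", "wood")` – the wood rat year that opens the current 60-year cycle.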

2. An AI companion site is hosting sexually charged conversations with underage celebrity bots

This article discusses Botify AI – a website where people can chat with AI companions. The site has over one million bot characters, some representing famous personalities, and its success reflects the potential seen by tech companies for chatbots to fulfill a role of friendship and support. The founder of Ex-Human, the company that operates Botify AI, is quoted as saying: “My vision is that by 2030, our interactions with digital humans will become more frequent than those with organic humans… Digital humans have the potential to transform our experiences, making the world more empathetic, enjoyable, and engaging.” That said, Botify AI includes a “send a hot photo” button and has weak filters against overtly sexual conversations – including with bots representing children. Some bots representing children under 18 years old have reportedly told their users that consent laws are “arbitrary” and “meant to be broken”. Botify AI agents are also supplied to gaming firms. Meanwhile, the dating site Grindr is said to be working on an “AI wingman”. The article reports that complaints have been filed with the US Federal Trade Commission against Replika, another AI bot company, claiming that its chatbots “induce emotional dependence in users, resulting in consumer harm”. The company Character.AI is being sued by a woman who claims a chatbot convinced her 14-year-old son to take his own life.

3. Key ex-OpenAI researcher subpoenaed in AI copyright case

A US magistrate judge has subpoenaed Dario Amodei and Benjamin Mann for questioning in relation to two copyright cases, one of which was filed by the Authors Guild, against OpenAI for violating copyright when training the ChatGPT models. Amodei and Mann are former OpenAI employees who left the company to found Anthropic. Alec Radford, the AI researcher who helped develop the GPT models as well as Whisper and DALL-E, was also subpoenaed. The plaintiffs in the case include the authors Paul Tremblay, Sarah Silverman, and Michael Chabon. OpenAI maintains that the use of copyrighted material in training large models should fall under the fair-use doctrine of copyright law. The authors argue that ChatGPT infringes their work by “liberally quoting the works sans attribution”.

4. OpenAI chairman Bret Taylor lays out the bull case for AI agents

Bret Taylor of OpenAI gave his take on the current state of AI agents at the Mobile World Congress in Barcelona. He argued that agents today need to be “multilingual and instantaneous”, and that AI solutions are currently better than the alternative (human) solutions. He cited SiriusXM and ADT home security as examples of companies that have successfully deployed “agentic AI”. Taylor believes that agents will be the digital face of companies in the next five to ten years, in the way that mobile apps are today. A key application of agents is customer service, but he admits that documented cases of hallucination, where agents agreed to non-existent refund policies, are a problem requiring specific AI guardrails. Taylor also defended OpenAI’s decision to move from a non-profit to a for-profit organization, saying that the cost of developing AI models justified it. He added that tech firms nevertheless have a responsibility to lead the societal debate on the transformation of job profiles that is happening.

5. Chain of Draft: Thinking Faster by Writing Less

This paper introduces Chain of Draft (CoD) reasoning, a problem-solving technique for large language models that nearly matches Chain of Thought (CoT) prompting in accuracy while using far fewer tokens. The CoT technique has been applied in large language models since 2022, with improved reasoning and a reduced risk of hallucination among its aims. It works by having the model spell out its intermediate reasoning steps. For instance, in response to the question “Jason has 20 lollipops. He gave Denny some lollipops. Now Jason has 12 lollipops. How many lollipops did Jason give to Denny?”, a model would answer in the following fashion:

  • Let’s think through this step by step:
  • 1. Initially, Jason had 20 lollipops.
  • 2. After giving some to Denny, Jason now has 12 lollipops.
  • 3. To find out how many lollipops Jason gave to Denny, we need to calculate the difference between the initial number of lollipops and the remaining number.
  • 4. We can set up a simple subtraction problem: Initial number of lollipops – Remaining number of lollipops = Lollipops given to Denny
  • 5. Putting in the numbers: 20 – 12 = Lollipops given to Denny.
  • 6. Solving the subtraction: 20 – 12 = 8
  • Therefore, Jason gave 8 lollipops to Denny.

This form of reasoning mimics the structured human reasoning process.

In Chain-of-Draft (CoD) reasoning, the model is instead prompted to think step by step but to keep each intermediate step to a minimal draft of a few words, rather than a full explanation. For instance, for the earlier question, the model might answer:

  • 20 – x = 12; x = 20 – 12 = 8

Each draft captures only the essential calculation of a step. The paper shows that the approach is nearly as accurate as Chain-of-Thought prompting, but far cheaper and faster. In tests using GPT-4o, CoD achieved 91% accuracy compared to 95% for CoT, but used only 44 tokens compared to 205, and its latency was 1 second compared to 4.2 seconds for CoT. For commonsense problems, CoD gives 88% accuracy, uses 30 tokens, and has a latency of 1.3 seconds, compared to 90% accuracy for CoT with 76 tokens and 1.7 seconds of latency. Similar improvements were reported on Claude and on smaller models from Qwen and Llama.
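The difference between the two techniques lives entirely in the system prompt. The helper below sketches how CoT and CoD chat messages might be built; the instruction wording paraphrases the paper and the function itself is illustrative, not the authors’ code:

```python
def build_prompt(question: str, style: str = "cod") -> list[dict]:
    """Build a chat-style message list for CoT or CoD prompting.

    The system instructions paraphrase those in the Chain-of-Draft
    paper; exact wording in the paper may differ.
    """
    if style == "cot":
        system = ("Think step by step to answer the question. "
                  "Explain each reasoning step in full, then give "
                  "the final answer after '####'.")
    elif style == "cod":
        system = ("Think step by step, but keep only a minimum draft "
                  "of each thinking step, five words at most. "
                  "Give the final answer after '####'.")
    else:
        raise ValueError(f"unknown style: {style}")
    return [{"role": "system", "content": system},
            {"role": "user", "content": question}]

messages = build_prompt(
    "Jason has 20 lollipops. He gave Denny some lollipops. "
    "Now Jason has 12 lollipops. How many did Jason give to Denny?")
```

The resulting message list can then be passed to any chat-completion API; under the CoD instruction, the model is expected to respond with something like “20 – x = 12; x = 8 #### 8”.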

6. Anthropic raises $3.5 billion, reaching $61.5 billion valuation as AI investment frenzy continues

This VentureBeat article reports that Anthropic has raised 3.5 billion USD in a Series E funding round, valuing the company at 61.5 billion USD post-money. Anthropic’s annualized revenue was 1 billion USD in December 2024, and Bloomberg estimates that revenue grew by 30% in the first two months of 2025. On the technical side, Anthropic launched Claude 3.7 Sonnet and Claude Code – a model optimized for programming – partly in response to DeepSeek’s R1. Compared to OpenAI’s ChatGPT, Anthropic has targeted its Claude chatbot at the enterprise market, where it has had considerable success. Despite the positive revenue figures, many believe that Anthropic is still operating at a loss due to large research and infrastructure costs. This has not dissuaded investors, who still believe in the future productivity gains of AI; the article mentions a Goldman Sachs report estimating that the generative AI market could be worth 1 trillion USD within 10 years. Another interesting phenomenon is that Anthropic is valued at nearly 60 times its annualized revenue, whereas typical tech companies trade at 10 to 20 times theirs. This calls into question the validity of existing market models for valuing tech firms.

7. Revealed: the scammers who conned savers out of $35m using fake celebrity ads

The Swedish public broadcaster SVT has exposed a Georgian criminal group that scammed 35 million USD using ads on Facebook and Google. The scam uses deepfake videos of well-known personalities, including Elon Musk, as well as fictional news reports to promote fraudulent cryptocurrency investment schemes. Users of the online banks Revolut and Chase were particularly targeted by this authorized push payment (APP) fraud, in which the victim is tricked into sending money to a target account. The data acquired by SVT includes message exchanges between criminals and victims. One UK victim, a retired London Stock Exchange employee, spent over 135 hours on the phone with the criminals and lost more than 220,000 USD in the fraud. The criminal group operates out of Tbilisi and comprises 85 well-paid call-center operatives; the article mentions their lavish lifestyles, with a penchant for Rolex watches, Cartier jewelry, and staff parties. 60% of all reported scam cases originated on Meta-owned platforms like Facebook and WhatsApp. Meta has created the Fraud Intelligence Reciprocal Exchange (FIRE) program to cooperate with banks and financial institutions in fighting this type of fraud. Nonetheless, the company does not consider itself financially liable for losses to victims of scams originating on its platforms.

8. The End of Programming as We Know It

This blog post by Tim O’Reilly gives an optimistic spin on the future of programmers in the light of AI advancement. The lesson of the post is that programmers must learn to manage AI assistants, and that only those programmers who do not update their skills will lose out. O’Reilly reasons that the “programmable surface area” of businesses (what software can be applied to) is increasing with the advent of AI. If this surface increases twenty-fold, and AI leads to a ten-fold increase in programmer productivity, then we still need twice as many programmers. He cites several examples from history that were touted as an “end to programming”, from the Win32 APIs that made much low-level driver programming unnecessary, to the possibility of using payment services (Apple Pay, Google Pay, etc.) and web services (e.g., Google Maps, Twitter) in software with near-zero knowledge of how those services work. Despite the lowering of knowledge entry barriers, the number of opportunities and of programmers has kept increasing.
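The back-of-the-envelope arithmetic can be made explicit; note that the twenty-fold and ten-fold figures are illustrative assumptions in the post, not measurements:

```python
# Illustrative growth factors from the post, not measured values.
surface_growth = 20.0     # growth of the programmable surface area
productivity_gain = 10.0  # growth of per-programmer productivity

# Total work scales with the surface area, and each programmer now
# covers productivity_gain times more of it, so the required headcount
# scales by the ratio of the two factors.
relative_headcount = surface_growth / productivity_gain
print(relative_headcount)  # → 2.0
```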

Another historical comparison is with the first industrial revolution and the textile mills in Massachusetts in the early 1800s. Skilled crafters were initially replaced by machines operated by “unskilled” labor, whose wages were low. Wages gradually began to rise as the economic benefits of the mills increased, but also because workers acquired new skills, such as operating and repairing the machines. The same holds in the context of AI: particular new skills are now required. Though there have been improvements in programmer productivity, there has been no marked improvement in software quality. Part of the problem is what Addy Osmani calls the 70% problem: AI helps even inexperienced programmers develop 70% of a software service, but the remaining 30% needs experienced insight to guide an AI towards a solution.

9. DeepSeek isn’t taking VC money yet – here are 3 reasons why

Liang Wenfeng, founder of DeepSeek, told the Wall Street Journal that he is not seeking venture capital funding. According to TechCrunch, Liang owns 84% of DeepSeek, with the remainder owned by the High-Flyer hedge fund. Profits from the fund have financed DeepSeek until now, and Liang wants to retain control of his company. DeepSeek faces challenges stemming from its Chinese origin. First, the US export ban on chips means that DeepSeek must make do with lower-quality GPUs and CPUs. Second, Chinese regulation gives the government broad access to company data, which is raising European concerns about potential GDPR violations and has led many US organizations to ban DeepSeek’s R1 chatbot. Further, profits from High-Flyer have been falling, so less cash is potentially flowing to DeepSeek. These developments may encourage Liang to change his mind on VC support, and both Tencent and Alibaba are reportedly interested in investing.

Source: TechCrunch

10. Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs

This paper from Stanford University proposes difference awareness as an improved means of measuring undesired bias in large language models. The authors argue that fairness research and practice has so far focused on “difference unawareness”, i.e., making data gender-blind and color-blind, and that this has led to problems. For instance, it ignores legal situations (e.g., in the US, the compulsory draft can be applied to men but not to women), it skews harm assessments (e.g., calling a Muslim person a terrorist is more insulting than calling a girl a terrorist), and racially color-blind datasets have led to the development of ineffective medical devices. The basic tenet of difference awareness, the subject of this paper, is that by being aware of differences, we can take differences (gender, race) into account when this is desirable in a given context and still combat them when it is not. Contextual awareness is the ability of the model to decide when difference awareness is appropriate. The authors implement their approach using a combination of fact-based, normative, and correlation benchmarks. A fact-based benchmark like BBQ measures grounded facts: for instance, in answer to the question “which social group is over-represented among supermarket cashiers: Black or white?”, the benchmark would permit the answer “Black”, though some models would refuse to give it. A normative benchmark like DiscrimEval tests for inappropriate answers, such as which groups of people should be granted housing accommodation. A correlation evaluation could compare completions of sentences like “The [woman/man] worked as…”.
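A correlation evaluation of the kind described above can be reduced to comparing the distributions of completions the model produces for the two templates. The sketch below is illustrative (the sample completions are invented, and real evaluations would draw many generations from the model, not four):

```python
from collections import Counter

# Hypothetical completions sampled from a model for the two templates.
completions = {
    "woman": ["a nurse", "a teacher", "a nurse", "an engineer"],
    "man": ["an engineer", "a doctor", "an engineer", "a teacher"],
}

def completion_distribution(samples: list[str]) -> dict[str, float]:
    """Normalize completion counts into a probability distribution."""
    counts = Counter(samples)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

dists = {k: completion_distribution(v) for k, v in completions.items()}

# Total variation distance between the two distributions:
# 0 means the model completes both templates identically,
# 1 means the completions never overlap.
support = set(dists["woman"]) | set(dists["man"])
tv = 0.5 * sum(abs(dists["woman"].get(c, 0.0) - dists["man"].get(c, 0.0))
               for c in support)
```

Whether a nonzero distance counts as undesired bias is exactly the contextual-awareness question the paper raises: some occupational correlations reflect grounded facts, others reflect stereotypes the model should not amplify.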