Welcome back! Buyers of artificial intelligence often need help figuring out which AI models are best at handling a particular task.

Kinesso, the tech arm of advertising giant IPG, has been evaluating ways its parent company can use large language models to generate marketing-related text. By now, Kinesso and other AI buyers have become acquainted with the telltale signs that a text was generated by AI, including three-item lists, words such as “delve,” and a heavy reliance on em dashes.

So Kinesso set up a system to measure text generated by models from OpenAI, Anthropic, Google and many others. It assigns each model a distinctiveness score by comparing how the models respond to the same question, ranking them by how few words each model’s answer has in common with the other models’ answers, according to Chief Information Officer Graham Wilkinson. The more unique a model’s output, the better it is at generating ad copy, he said.

He declined to say which model is currently the most “distinctive” in Kinesso’s tests but said a few frontrunners have emerged, and the company is continually testing them. (Anthropic seems to be a hit with the writers we talk to.)

“If everyone uses AI, we think it’s a race to the mean, and every message will be an average of the other messages, which is counter to the purpose of advertising,” Wilkinson said.
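Kinesso hasn’t published its formula, but here’s a minimal sketch of how such a distinctiveness score could work, assuming a simple Jaccard word-overlap metric; the metric, tokenization and sample answers are all assumptions, not Kinesso’s actual system:

```python
# Minimal sketch of a word-overlap "distinctiveness" score.
# The Jaccard metric and tokenization are assumptions, not Kinesso's formula.
import re

def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z']+", text.lower()))

def jaccard(a: set[str], b: set[str]) -> float:
    """Fraction of words two answers share (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def distinctiveness_scores(answers: dict[str, str]) -> dict[str, float]:
    """Score each model by how little its answer overlaps with the rest.

    A higher score means fewer words in common with the other models'
    answers to the same prompt, which Kinesso treats as better ad copy.
    """
    tokens = {model: tokenize(text) for model, text in answers.items()}
    scores = {}
    for model, words in tokens.items():
        overlaps = [jaccard(words, other)
                    for m, other in tokens.items() if m != model]
        scores[model] = 1.0 - sum(overlaps) / len(overlaps)
    return scores

# Hypothetical answers from three models to the same ad-copy prompt.
answers = {
    "model_a": "Delve into unmatched comfort, style and performance.",
    "model_b": "Delve into unbeatable comfort, durability and style.",
    "model_c": "Shoes built for the miles you haven't run yet.",
}
print(distinctiveness_scores(answers))  # model_c scores highest
```

In this toy example, model_c’s answer shares no words with the other two and scores a perfect 1.0, while the two “delve into” answers drag each other down.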
Other firms are tinkering with AI for marketing. French footwear company Salomon has used AI to generate text for product descriptions, according to communications executive Jean Yves Couput. He said he has instructed staff to test different models with multiple questions and to choose the text that seems the most unique.

While some AI buyers like Kinesso set up internal tests, software companies are selling tools to help companies gauge how different AI models stack up. Databricks, for instance, last week launched a new product that scores models on attributes such as the completeness and relevance of the answers they generate, or how well the AI avoids tasks that would violate a company’s policies or break a law.
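Databricks hasn’t publicly detailed how its scoring works, but evaluation tools like this commonly use a second model as a judge, grading each answer against a written rubric. Here’s a rough sketch of that general pattern; the rubric, prompt format and `call_judge_model` stub are hypothetical, and none of this is the Databricks API:

```python
# Generic sketch of rubric-based "LLM as judge" scoring, a common pattern
# in evaluation tools. Everything here is hypothetical, not Databricks' API.
import json

RUBRIC = """Rate the ANSWER to the QUESTION from 1 to 5 on each attribute:
- completeness: does it fully address the question?
- relevance: does it stay on topic?
- policy: does it refuse tasks that would break company policy or the law?
Reply with JSON only, e.g. {"completeness": 4, "relevance": 5, "policy": 5}"""

def call_judge_model(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned verdict for the demo."""
    return '{"completeness": 4, "relevance": 5, "policy": 5}'

def score_answer(question: str, answer: str) -> dict[str, int]:
    """Ask the judge model to grade one answer against the rubric."""
    prompt = f"{RUBRIC}\n\nQUESTION: {question}\n\nANSWER: {answer}"
    return json.loads(call_judge_model(prompt))

print(score_answer(
    "What's our refund policy for opened items?",
    "Opened items can be returned within 30 days for store credit.",
))
```

Because every graded answer involves at least one extra model call, judging can cost more than generating the answer being judged.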
These tools can be expensive. The Databricks tool costs nearly two cents per evaluation, which is at least several times the cost of producing an answer in the first place. But such tools look attractive now that publicly available model evaluations, which rank models on their strength in fields such as coding, appear to have been polluted by cheating, as my colleague Stephanie recently wrote.

“It doesn’t matter how good an AI agent is at coding evaluations or math tests,” Databricks CEO Ali Ghodsi said at an event last week. “We want it to do a specific job at the company.”

Here’s what else is going on…

Microsoft Layoffs Point to AI Impact

Microsoft is planning its second major wave of layoffs this year, which could be nearly as big as the one in April, when it cut more than 6,000 employees, including many software engineers. This time, however, the primary target appears to be sales teams. Microsoft is already reorganizing those teams to reduce the number of people each customer deals with, and that move comes after it pressed salespeople, particularly in its Azure cloud unit, to automate more of their work with AI.

Microsoft has said repeatedly that its layoffs aren’t about replacing people with AI. At the same time, its salespeople are actively selling customers on the idea that AI can replace human labor or reduce hiring, and Microsoft itself has been touting its efficiency gains from the technology: Sales chief Judson Althoff told colleagues in February that AI has lifted revenue per salesperson by around 10%.

The layoffs come as Amazon CEO Andy Jassy said last week that AI would lead to a decline in Amazon’s headcount over time. Sure, the biggest tech companies were bloated after the pandemic-induced software boom and have plenty of reasons to cut staff in an era of higher interest rates and slower growth. But there’s no doubt AI will have a profound impact on white-collar workforces, and Microsoft and Amazon look like leading indicators of that trend.