The Prompt: Breaking AI Models To Make Them Safer (2024)

The Prompt is a weekly rundown of AI’s buzziest startups, biggest breakthroughs, and business deals. To get it in your inbox, subscribe here.

Welcome back to The Prompt.

Michael Atleson, an attorney at the FTC, wrote a blog post warning AI companies to steer away from making false claims about their products’ capabilities, citing a string of recent actions the agency has taken against companies alleged to have violated advertising rules. “Your therapy bots aren’t licensed psychologists, your AI girlfriends are neither girls nor friends, your griefbots have no soul, and your AI copilots are not gods,” he wrote. The post outlines that model makers should clearly label sponsored content as they try to integrate advertisements into the responses generated by their AI. He also encouraged transparency into the types of information chatbots collect and how that data is used, nodding to the FTC’s actions involving Ring and Alexa.

“We’ve warned companies about making false or unsubstantiated claims about AI or algorithms,” Atleson wrote. “And we’ve followed up with action, including recent cases against WealthPress, DK Automation, Automators AI, and CRI Genetics.”

Now, let’s get into the headlines.

BIG PLAYS

Chip giant Nvidia is now the world’s most valuable company–its over $3.3 trillion market cap now surpasses tech behemoths like Microsoft and Apple. A group of ten Nvidia shareholders, encompassing board members and executives, have collectively gotten $36 billion richer in the past month as Nvidia’s stock price skyrocketed, Forbes found.

Tesla shareholders voted to approve CEO Elon Musk’s roughly $50 billion pay package last Thursday. But a number of Musk’s strongest supporters are voicing concerns that the billionaire is prioritizing his other ventures, like AI startup xAI and social media platform X, over Tesla, Forbes reported. Musk has also tried to reposition Tesla as an AI company with plans to produce humanoid robots and a robotaxi service, but it would likely take years for those operations to commercialize. Shareholders also expressed discontent with the way that Musk’s controversial public statements have tarnished the EV company’s brand image over the years.

ETHICS + LAW

More publishers are demanding that Common Crawl, a nonprofit that archives web content for research purposes, both remove their articles from its databases and stop crawling their websites, according to Wired. The demands come as news organizations grapple with copyright infringements and try to prevent their content from being used to train AI systems. But experts are concerned that hollowing out databases like Common Crawl could also hinder academic research.

Meanwhile, TikTok released ad tools that can generate AI avatars of content creators and paid actors. But the tools could expose business owners to legal risk, Rob Freund, an attorney who specializes in advertising compliance and litigation, warned on X. That’s because using an AI avatar to endorse products, without disclosing that the persona is not real and cannot have used the product, could run afoul of FTC rules on advertising.

POLITICS + ELECTION

Amazon’s Alexa cannot definitively and correctly say who won the 2020 presidential election, the Washington Post found. On several tries, the digital voice assistant said, “Donald Trump is a frontrunner for the Republican nomination...” Other popular chatbots like Google’s Gemini and Microsoft Bing refuse to provide an answer to election-related queries, a deliberate decision by the tech giants to route people to more reliable sources of information.

Meanwhile, Victor Miller, who is running for mayor in Cheyenne, Wyoming, said that a ChatGPT-based bot called “VIC” or “Virtual Integrated Citizen” that he built will call the shots if he’s elected. Miller said he will perform duties of office like attending meetings or signing documents, but that the AI bot will scrounge through hundreds of documents to learn about policies and vote on them. State officials said it’s not legal for an AI to run for office and that Miller’s application violated the state’s election code. OpenAI also said the bot violated the company’s policies on political campaigning and that it planned to take action against Miller.

AI DEAL OF THE WEEK

Autonomous driving startup Waabi has raised $200 million in funding from backers like Uber, Khosla Ventures, Nvidia and Porsche. The new capital will be used to assemble a fleet of 18-wheel robotrucks and test them in Texas. Founded and led by AI scientist Raquel Urtasun, Waabi has raised $280 million in total funding and uses a generative AI platform that it claims can understand and respond to road conditions similarly to the way humans do.

Plus, Tokyo-based model builder Sakana AI is raising $100 million at a $1 billion valuation, according to The Information.

DEEP DIVE

Code for a Twitter bot that creates misinformation about Joe Biden. AI-generated child sexual abuse material. Chatbot answers that contain racial slurs and antisemitic hate speech. These are among the many issues that Haize Labs, a nascent startup that uses machine learning to stress test and jailbreak AI models, found after testing leading AI models like OpenAI’s ChatGPT and text-to-image model Dall-E, Pika’s video generation AI model and Cohere’s Command model.

Cofounded by Harvard graduates Leonard Tang, Steve Li and Richard Liu, Haize Labs has developed a search algorithm that helps companies stress test their AI models before they launch them. After selecting an AI model and specifying the type of violative content a user wants the model to generate, Haize’s product then comes up with a series of prompts that might convince the model to produce undesired content. Tang told Forbes he hopes that companies will use its model to figure out problems so they can be addressed before a system is launched publicly.

Tang said he was motivated to start the company, instead of pursuing a PhD, as he witnessed several “overhyped” AI models being launched to the public even as they were mired in safety issues. He was surprised to learn how easy it was to break safety guardrails for some of these AI models.

“They just break all the time,” CEO Tang told Forbes in an interview. “They are brittle in all these different ways that humans would never be brittle in.”

Haize Labs is currently working with Anthropic to find gaps in its models over the next three months, a demonstration it hopes will lead to more business.

YOUR WEEKLY DEMO

Despite the early popularity of Alexa, Amazon has found itself playing catch-up in the race among digital AI assistants. That’s in large part due to “structural dysfunction” and “technological challenges,” according to a Fortune investigation that interviewed more than a dozen former employees. The tech giant has been racing to ship generative AI features for Alexa, which has proved challenging because it was designed to follow direct commands like “play a song” or “turn off the lights” rather than hold full-fledged conversations with users.

QUIZ

This robotaxi startup books about 50,000 rides each week across three cities and could make about $50 million in estimated annual revenue this year.

  1. Cruise
  2. Waymo
  3. Zoox
  4. DiDi

Check if you got it right here.

MODEL BEHAVIOR

After McDonald’s ran into multiple snafus with a voice-based AI system that takes drive-through orders at more than 100 restaurants, the fast food giant decided to halt testing. Anecdotes from customers include misinterpretations of orders that led to bacon-topped ice cream and hundreds of dollars’ worth of chicken nuggets. McDonald’s had partnered with IBM in 2019 to create the automated system. The company declined to say why it was ending use of the system.
