Gate Square “Creator Certification Incentive Program” — Recruiting Outstanding Creators!
Join now, share quality content, and compete for over $10,000 in monthly rewards.
How to Apply:
1️⃣ Open the App → Tap [Square] at the bottom → Click your [avatar] in the top right.
2️⃣ Tap [Get Certified], submit your application, and wait for approval.
Apply Now: https://www.gate.com/questionnaire/7159
Token rewards, exclusive Gate merch, and traffic exposure await you!
Details: https://www.gate.com/announcements/article/47889
"Prompt injection" has become the main threat to AI browsers - ForkLog: cryptocurrencies, AI, singularity, future
The company OpenAI reported on the vulnerability of AI browsers and the measures to strengthen the security of its own solution — Atlas.
The company acknowledged that “prompt injection” attacks, which manipulate agents into executing malicious instructions, are a risk. And it will not disappear anytime soon.
She noted that the “agent mode” in Atlas “increases the threat area.”
In addition to Sam Altman's startup, other experts have also drawn attention to the problem. In early December, the UK's National Cyber Security Centre warned that attacks involving the integration of malicious prompts “will never go away.” The government advised cybersecurity specialists not to try to stop the problem but to reduce the risks and consequences.
Measures of combating
Prompt injection is a way of manipulating AI by intentionally adding text to its input data that causes it to ignore the original instructions.
OpenAI reported on the use of a proactive rapid response cycle, which shows promising results in identifying new attack strategies before they emerge “in real-world conditions”.
Anthropic and Google express similar thoughts. The competitors suggest implementing multi-layered protection and continuously conducting stress tests.
OpenAI uses an “automated LLM-based attacker” — an AI bot that is trained to play the role of a hacker looking for ways to infiltrate an agent with malicious prompts.
An artificial scammer is capable of testing the exploitation of a vulnerability in a simulator that will show the actions of the attacked neural network. Then the bot will study the reaction, adjust its actions, and make a second attempt, then a third, and so on.
Third parties do not have access to information about the internal thinking of the target AI. In theory, a “virtual hacker” should find vulnerabilities faster than a real attacker.
After the security update, the “agent mode” was able to detect an attempt at sudden prompt injection and mark it for the user.
OpenAI emphasized that while it is difficult to reliably defend against such types of attacks, it relies on large-scale testing and rapid correction cycles.
Recommendations for Users
The Chief Security Researcher at Wiz, Rami McCarthy, emphasized that reinforcement learning is one of the key ways to continuously adapt to the behavior of malicious actors, but it is only part of the picture.
These two recommendations were provided by OpenAI to users to mitigate risk. The startup also suggested giving agents specific instructions rather than providing access to email and asking to “take any necessary actions.”
McCarthy noted that as of today, browsers with built-in AI agents do not provide enough benefit to justify the risk profile.
Let us remind you that in November, Microsoft experts presented an environment for testing AI agents and identified vulnerabilities inherent in modern digital assistants.