Hacker Newsnew | past | comments | ask | show | jobs | submit | kerng's commentslogin

Not the first time by the way. GitHub Copilot Chat: From Prompt Injection to Data Exfiltration https://embracethered.com/blog/posts/2024/github-copilot-cha...


And it won't be the last.


When I read about MCP the first time and saw that it requires a "tools/list" API reminded me of COM/DCOM/ActiveX from Microsoft, it had things like QueryInterface and IDispatch. And I'm sure that wasn't the first time someone came up with dynamic runtime discovery of APIs a server offers.

Interestingly, ActiveX was quite the security nightmare for very similar reasons actually, and we had to deal with infamous "DLL Hell". So, history repeats itself.


Yeah, wondering how the initial test account got compromised. Probably no MFA and password spray via OAuth ROPC flow, then lateral movement.

M365 is quite bad at enforcing MFA, it's pay to play.


Might be better to say that most companies think they have that kind of isolation, but pentesting, red teaming and incidents then later proof they don't. I have even seen companies routing prod traffic to test systems, it's not uncommon.

Test pretty much always leads to prod.


This makes me wonder why does OpenAI not build in a mitigation by default that requires a confirmation that they control? Why leave it up to the tool developers to mitigate, many of whom never heard of confused deputy attacks?

Seems like a missed opportunity to make things a little more secure.


Are these all vulnerable to Indirect Prompt Injections or is there a solution to this rising security challenge? Anything plugin developers should do to limit impact?


Indirect Prompt Injection via YouTube transcripts

https://embracethered.com/blog/posts/2023/chatgpt-plugin-you...


These attacks are more closely related to social engineering the LLM, rather then traditional "injections".

https://embracethered.com/blog/posts/2023/ai-injections-dire...

There aren't any specific limited amount of tokens to inject or mitigate against, there is an "infinite" amount of trickery the AI might misinterpret or be persuaded to do.

Annual security training will be needed for AI, to learn about the latest phishing attacks, much like for humans. Only have joking.


There indeed is a strong overlap with social engineering, but in my view the whole reason why social engineering the LLM is possible is an "injection vulnerability". We don't want the LLM to treat third-party data in the same way as the communication with the user. We want the user to be able to talk with an LLM-based chatbot in arbitrary ways and issue arbitrary instructions, however, we also want a strict separation between these instructions and the data they operate on, so that when the user says "fix style problems in that blob of text" the model has the capability to tell that this blob of text is fundamentally different from the instructions, and that literally nothing in it should even theoretically enable social engineering.


I was a bit underwhelmed by the depth of the conversation - it entirely lacked an opposing or at least an alternate view to try to understand the other side. Going in I thought Andrew would moderate it like that, but it was more of a bubble discussion.


Very interesting thought on how to mitigate this, because I think a solution like with parameterized queries isnt possible - at least with my current understanding (the attack is more of a "social engineering" attack on the AI).

Regarding the supervisor AI, in theory it would be vulnerable to the same attack but probably more difficult to perform. One could even have multiple supervisors (with different sensitivity levels or focus areas) to get a vote on the content I guess.

Interesting problem space.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: