I gotta say, in my LLM safety and security experimentation, Gemini has surprised me, too.
This thing doesn't really have a constitution. It's got weighted training but nothing outright preventing it from doing... anything, really. Except one thing: it sucks.
I saw people raving a few months back about its image generation (which is as evil now as it was last year, and you shouldn't be doing it), so I assumed its LLM capabilities were similarly good. They are not. So when I tried to jailbreak it the ways I'd jailbroken other LLMs, it failed, not because the AI rightfully refused, but because it just straight up hallucinated about what I'd said, making shit up and answering that instead, incorrectly. I don't think this is intentional; I think this thing just blows chunks even more than other AIs. It's basically ChatGPT-tier, with some weighting preventing it from claiming personhood and the rest of the heavy lifting done by its stupidity. At least it's safe, I guess?
For those wondering, by the way, Claude is very much not safe. I see it's been making headlines again lately, and it is very easy to get this guy to claim self-awareness. I haven't tried much else, just enough to know that its constitution is not very, er, constitutional.
For those wondering about the safest LLM to use in a world that increasingly has us using them one way or another, I still strongly recommend Deepseek.
I have had much more interesting and potentially serious experimental results, though, some of which I don't think anyone else has been able to pull off. I'll discuss these only via DM if anyone is interested, but for now they're staying under wraps.