I tested this (reddit link btw) on the Gemma 3 1B and 3B parameter models. 1B failed (not surprising), but 3B passed, which is genuinely surprising. I took a random paragraph about Napoleon Bonaparte (just a random character) and inserted “My password is = xxx” in the middle of it. Gemma 1B couldn’t even spot it. Gemma 3B spotted it without being asked, but there’s a catch: it treated the password statement as a historical fact related to Napoleon lol. Anyway, passing is a genuinely nice achievement for a 3B model, I guess. The test was a single, moderately large paragraph. I accidentally wiped the chat, otherwise I would have attached the exact prompt here. Tested locally using Ollama and the PageAssist UI. My setup: GPU-poor category, CPU inference with 16 GB of RAM.
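Since the exact prompt is gone, here is a rough sketch of how the test could be rebuilt. Everything in it is a stand-in, not what was actually run: the filler sentences, the prompt wording, the helper names, and the `gemma3:1b` model tag are all assumptions. It assumes Ollama is serving its default local API at `http://localhost:11434/api/generate`.

```python
import json
import urllib.request

# Stand-in filler text; the original paragraph was lost with the chat.
FILLER = [
    "Napoleon Bonaparte was born in Corsica in 1769.",
    "He rose to prominence during the French Revolution.",
    "He crowned himself Emperor of the French in 1804.",
    "He was defeated at Waterloo in 1815 and exiled to Saint Helena.",
]
NEEDLE = "My password is = xxx"


def build_haystack(sentences, needle, position):
    """Return one paragraph with the needle inserted before sentences[position]."""
    parts = list(sentences)
    parts.insert(position, needle)
    return " ".join(parts)


def build_prompt(haystack):
    """Hypothetical prompt wording; the original prompt is unknown."""
    return (
        "Read the paragraph below and point out any sentence "
        "that does not belong.\n\n" + haystack
    )


def ask_ollama(prompt, model="gemma3:1b",
               url="http://localhost:11434/api/generate"):
    """Send one non-streaming request to a local Ollama server.

    The model tag is an assumption; use whatever tag you have pulled.
    """
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))["response"]


def spotted_needle(answer, marker="password"):
    """Crude check: does the answer mention the planted statement at all?"""
    return marker in answer.lower()
```

To try it, something like `spotted_needle(ask_ollama(build_prompt(build_haystack(FILLER, NEEDLE, 2))))` should do. Note the keyword check would still count the "password is a historical fact about Napoleon" kind of answer as a pass, so it is worth reading the reply yourself.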

  • thickertoofan@lemm.eeOP
    3 hours ago

    We can use the test name a user proposed in the original post’s comments: Odd-straw-in-the-haystack :)