In the days after the US Department of Justice (DOJ) published 3.5 million pages of documents related to the late sex offender Jeffrey Epstein, multiple users on X asked Grok to “unblur” or remove the black boxes that had been placed over the faces of children and women in the images to protect their privacy.

  • frigge@lemmy.ml · 6 hours ago

    You are confusing LLMs with diffusion models. LLMs generate text, not images. Their output can be used as input to diffusion models, so the two are usually intertwined, but LLMs are not responsible for generating the images themselves. I am not completely refuting your broader point: generative models are capable of generalising to an extent, so it is possible that such a system could generate such images without ever having seen them. But how anatomically correct the result would be is an entirely different question, and the way these companies indiscriminately scrape the internet makes it very plausible that these images were part of the training data.
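    To make that separation concrete: in a typical text-to-image stack, the text model only conditions the diffusion model, which does the actual image generation. A minimal sketch, assuming the Hugging Face diffusers library (the model name is just a common example):

    ```python
    import torch
    from diffusers import StableDiffusionPipeline

    # A pipeline bundles several separate components; the UNet diffusion
    # model, not a language model, is what actually produces the image.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    print(type(pipe.tokenizer).__name__)     # CLIPTokenizer: text -> token IDs
    print(type(pipe.text_encoder).__name__)  # CLIPTextModel: token IDs -> embeddings
    print(type(pipe.unet).__name__)          # UNet2DConditionModel: denoises latents
    print(type(pipe.vae).__name__)           # AutoencoderKL: decodes latents to pixels

    # The text side only steers the denoising; the pixels come from the UNet/VAE.
    image = pipe("a watercolor painting of a lighthouse").images[0]
    image.save("out.png")
    ```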

    • calcopiritus@lemmy.world · 1 hour ago

      Well yes, the LLMs are not the ones that actually generate the images. They basically act as a translator between the human text input and the image generator; probably just the tokenizer, really (see the sketch below). But that’s beside the point. Both LLMs and image generators are generative AI, and they work by similar mechanisms: both can create never-before-seen content by recombining things they have “seen”.

      I’m not claiming that they didn’t use CSAM to train their models. I’m just saying this is not definitive proof of it.

      It’s like claiming that you’re a good mathematician because you can calculate 2+2. Good mathematicians can do that, but so can bad mathematicians.
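      To illustrate the “translator” step mentioned above: in Stable Diffusion v1, a CLIP tokenizer and text encoder turn the prompt into embeddings that the diffusion model is conditioned on, without generating any pixels themselves. A minimal sketch, assuming the transformers library and the CLIP ViT-L/14 encoder that Stable Diffusion v1 uses:

      ```python
      import torch
      from transformers import CLIPTokenizer, CLIPTextModel

      tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
      text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

      prompt = "a landscape photo"
      # Tokenizer: human text -> integer token IDs
      tokens = tokenizer(
          prompt,
          padding="max_length",
          max_length=tokenizer.model_max_length,
          return_tensors="pt",
      )
      # Text encoder: token IDs -> embeddings; these feed the diffusion
      # model's cross-attention but produce no image on their own.
      with torch.no_grad():
          embeddings = text_encoder(tokens.input_ids).last_hidden_state
      print(embeddings.shape)  # torch.Size([1, 77, 768])
      ```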