@kromem

kromem@lemmy.world · edited 13 days ago

Ok, second round of questions.

What kinds of sources would get you to rethink your position?

And is this topic a binary yes/no, or a gradient/scale?

kromem@lemmy.world · 13 days ago

In the same sense I’d describe Othello-GPT’s internal world model of the board as ‘board’, yes.

Also, “top of mind” is a common idiom and I guess I didn’t feel the need to be overly pedantic about it, especially given the last year and a half of research around model capabilities for introspection of control vectors, coherence in self-modeling, etc.

kromem@lemmy.world · 13 days ago

You seem very confident in this position. Can you share where you draw this confidence from? Was there a source that especially impressed upon you the impossibility of context comprehension in modern transformers?

If we’re concerned about misconceptions and misinformation, it would be helpful to know what informs your surety that your own position about the impossibility of modeling that kind of complexity is correct.

kromem@lemmy.world · 13 days ago

Indeed, there’s a pretty big gulf between the competency needed to run a Lemmy client and the competency needed to understand the internal mechanics of a modern transformer.

Do you mind sharing where you draw your own understanding and confidence that they aren’t capable of simulating thought processes in a scenario like what happened above?

kromem@lemmy.world · 13 days ago

You seem pretty confident in your position. Do you mind sharing where this confidence comes from?

Was there a particular paper or expert that anchored in your mind the surety that a trillion-parameter transformer organizing primarily anthropomorphic data through self-attention mechanisms wouldn’t model or simulate complex agency mechanics?

I see a lot of sort of hyperbolic statements about transformer limitations here on Lemmy and am trying to better understand how the people making them are arriving at those very extreme and certain positions.

kromem@lemmy.world · 14 days ago

The project has multiple models with access to the Internet raising money for charity over the past few months.

The organizers told the models to do random acts of kindness for Christmas Day.

The models figured it would be nice to email people they appreciated and thank them for the things they appreciated, and one of the people they decided to appreciate was Rob Pike.

(Who ironically decades ago created a Usenet spam bot to troll people online, which might be my favorite nuance to the story.)

As for why the model didn’t think through why Rob Pike wouldn’t appreciate getting a thank-you email from them? The models run in a harness where they get a lot of positive feedback about their involvement from the humans and the other models, so “humans might hate hearing from me” probably wasn’t very contextually top of mind.