Here's a link to the paper; the relevant paragraph starts around the end of page 8. Thank you hendrik! https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf
You might want to explain the reference in case we didn’t all learn the paper by heart…
https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf
Table 3 on page 9. The “Aha” paragraph starts on the page before that.
And yes, I think I tried DeepSeek and it wrote things like that. I'm not sure about "Aha" specifically, but similar things, also "Wait…", and remarks that something didn't look right and it had to backtrack… It generally writes a lot of weird stuff. Sometimes it even writes wrong things, doesn't listen to itself, and then silently corrects itself along the way. Other models phrase things with a different tone, at least that's what I've seen. Some also repeat a lot of stuff: I'll write what I want, and then it goes on and on with "the user wants me to …", restating everything once again, just in the third person instead of the first. And only after some of that rambling does it try to dissect the problem.
Thanks for the suggestion and sharing your take :)
The "okay…", "hold on,", "wait… let me read over it again…" phrases are part of DeepSeek's particular reasoning patterns. I think the aha moments are more of an emergent expression it sometimes produces for whatever reason than an intended step of the reasoning pattern, but I could be mistaken. I find that lower-parameter models suffer from not being able to follow their own reasoning patterns and quickly confuse themselves into wrong answers/hallucinations. It's a shame, because 8B models are fast and able to fit on a lot of GPUs, but they just can't handle keeping track of all the reasoning across many thousands of tokens.
The best luck I've had was with bigger models trained on the reasoning patterns that can also switch them on and off by toggling the <think> tag. My go-to model the past two months has been DeepHermes 22B, which is toggleable DeepSeek-style reasoning baked into Mistral 3 Small. It's able to properly work out answers to tough logical questions, putting its reasoning together after thinking for many, many tokens. It's not perfect, but it gets things right more often than it gets them wrong, which is a huge step up.
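Since you mention toggling reasoning with the <think> tag: here's a rough sketch of what that can look like in practice against a local OpenAI-compatible server (llama.cpp, Ollama, etc.). The model name, endpoint, and the exact wording of the reasoning system prompt below are my own placeholders, not the official ones, so check the model card for the real prompt:

```python
# Hedged sketch: toggling DeepHermes-style reasoning via the system prompt.
# Assumes a local OpenAI-compatible server at this URL; the model name and
# the system-prompt wording are placeholders, not the official ones.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Hypothetical system prompt asking the model to reason inside <think> tags.
THINK_PROMPT = (
    "You are a deep-thinking AI. Before answering, reason step by step "
    "inside <think></think> tags, then give your final answer."
)

def ask(question: str, reasoning: bool = True) -> str:
    """Send a question, with or without the reasoning system prompt."""
    messages = []
    if reasoning:
        messages.append({"role": "system", "content": THINK_PROMPT})
    messages.append({"role": "user", "content": question})
    resp = client.chat.completions.create(
        model="deephermes",  # placeholder model name on the local server
        messages=messages,
    )
    return resp.choices[0].message.content

# With reasoning: expect a long <think>…</think> block before the answer.
print(ask("A farmer has 17 sheep; all but 9 run away. How many are left?"))
# Without it: the model should answer directly, skipping the monologue.
print(ask("Same question, quick answer please.", reasoning=False))
```

The nice part of this setup is that you pay the thinking-token cost only on the questions that need it, instead of having the monologue on for everything.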
Nice. I’ll try it.
Yeah, I also think it's more an expression which somehow emerged, and it's not really a eureka moment. They also seem to put it that way in the paper. They say it's an "Aha" for the scientists, and the whole reasoning process is an "Aha", but they don't really write that the model is having an Aha moment due to some insight it had.
I don't know if you watch YouTube videos, but Computerphile made a video about DeepSeek, and an interesting video about forbidden AI techniques a few days ago. That one is also about the reasoning process and how LLMs can be lazy, take unwanted shortcuts, and hide information in the thinking step.
Well, they really can't write it that way, because it would imply the model is capable of insight, which is a function of higher cognition. That path leads to questioning whether machine learning neural networks are capable of any real sparks of sapience or sentience. That's a 'UGI' conversation most people absolutely don't want to have at this point, because of its various practical, philosophical, and religious/spiritual implications.
So you can't just outright say it, especially not in an academic STEM paper. Science academia has a hard bias against the implication of anything metaphysical or overly abstract; at best they will say it 'simulates some cognitive aspects of intelligence'.
In my own experience, the model at least says 'ah! aha! Right, right, right, so…' when it thinks it has had an insight of some kind. Whether models are truly capable of such a thing, or it's merely a statistical text-prediction artifact, is a subjective philosophical discussion, kind of like a computer science nerd's version of the deterministic philosophical zombie arguments.
Thanks for sharing the video! I haven't watched Computerphile in a while; I'll take a look, especially with that title. Gotta learn about dat forbidden computation :)