SmokeyDope@lemmy.world to LocalLLaMA@sh.itjust.works · 14 days ago
DeepSeek just released updated r1 models with 'deeper and more complex reasoning patterns'. Includes a r1 distilled qwen3 8b model boasting "10% improved performance" over original (huggingface.co)
Even_Adder@lemmy.dbzer0.com · 13 days ago
I’ve gotten the deepseek-r1-0528-qwen3-8b to answer correctly once, but not consistently. Abliterated DeepSeek models I’ve used in the past have been able to pass the test.
BaroqueInMind@lemmy.one · 13 days ago
I can’t find any abliterated models of this new release that aren’t quantized to shit and that come as GGUF so they work with my Ollama instance.