Deepseek when asked about sensitive topics (i.postimg.cc)
petrescatraian@libranet.de to Technology@beehaw.org · 3 days ago
iii@mander.xyz · 3 days ago
Most commercial models have that, sadly. At training time they’re presented with both positive and negative responses to prompts.
If you have access to the trained model weights and biases, it’s possible to undo this through a method called abliteration (1).
The silver lining is that it makes explicit what different societies want to censor.
drspod@lemmy.ml · 3 days ago
I didn’t know they were already doing that. Thanks for the link!
SkyeStarfall@lemmy.blahaj.zone · 2 days ago
In fact, there are already abliterated models of DeepSeek out there. I got a distilled version of one running on my local machine, and it talks about Tiananmen Square just fine.
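For anyone wondering what abliteration does mechanically, here is a rough sketch of the idea as it is commonly described: estimate a "refusal direction" as the difference between mean activations on prompts the model refuses and prompts it answers, then project that direction out of a weight matrix that writes into the same activation space. Everything here is an assumption for illustration: the function names are hypothetical, and tiny Python lists stand in for real activations and weights. Real tooling works on full model tensors across many layers.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized activation vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def refusal_direction(refused_acts, answered_acts):
    """Unit vector pointing from the 'answered' mean to the 'refused' mean."""
    diff = [r - a for r, a in zip(mean_vector(refused_acts),
                                  mean_vector(answered_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def ablate(weight, direction):
    """Return W - r (r^T W): the weight's output can no longer point along r.

    `weight` is a list of rows with shape (d_model, d_in); `direction` is a
    unit vector of length d_model.
    """
    n_rows, n_cols = len(weight), len(weight[0])
    # projection[j] is the component of column j along the direction.
    projection = [sum(direction[i] * weight[i][j] for i in range(n_rows))
                  for j in range(n_cols)]
    return [[weight[i][j] - direction[i] * projection[j]
             for j in range(n_cols)]
            for i in range(n_rows)]

# Toy activations (made up): the two groups differ only on the second axis.
refused = [[1.0, 2.0, 0.0], [1.0, 4.0, 0.0]]
answered = [[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
direction = refusal_direction(refused, answered)

# Toy weight matrix (made up) mapping 2 inputs to the 3-dimensional space.
weight = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ablated = ablate(weight, direction)
```

In this toy case the direction is the second axis, so the second row of the weight matrix is zeroed and the other rows are untouched. The model can still write everything else; it just cannot write along that one direction.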
Snot Flickerman@lemmy.blahaj.zone · 3 days ago
Hi, I noticed you added a footnote. Did you know that footnotes are actually able to be used like this?[1]
Code for it looks like this:
able to be used like this?[^1]
[^1]: Here's my footnote
Here’s my footnote ↩︎
Links?