DeepSeek local deployment test

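Boss, try the native DS model. Someone has compressed the model from 720GB to just 131GB, and it runs smoothly as long as CPU + GPU memory totals more than 80GB. This is fascinating: everyone can run the world's top AI themselves: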

  1. We shrank R1, the 671B parameter model, from 720GB to just 131GB (an 80% size reduction) whilst keeping it fully functional and great
  2. No, the dynamic GGUFs do not work directly with Ollama, but they do work with llama.cpp, as it supports sharded GGUFs and disk mmap offloading. For Ollama, you will need to merge the GGUFs manually using llama.cpp.
  3. Minimum requirements: a CPU with 20GB of RAM (but it will be slow) and 140GB of disk space (to download the model weights)
  4. Optimal requirements: sum of your VRAM+RAM = 80GB+ (this will be somewhat OK)
  5. No, you do not need hundreds of GB of RAM+VRAM, but if you have it, you can get 140 tokens/s for throughput and 14 tokens/s for single-user inference with 2xH100
  6. Our open-source GitHub repo: github.com/unslothai/unsloth



View: https://www.reddit.com/r/selfhosted/comments/1ic8zil/yes_you_can_run_deepseekr1_locally_on_your_device/
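
For a rough sense of what those numbers mean, here is a minimal sketch that works out the size reduction and the average bits stored per parameter. It uses only the figures quoted in the post and assumes 1 GB = 1e9 bytes; the function names are made up for illustration.

```python
# Figures quoted in the post above.
ORIGINAL_GB = 720      # size of the original R1 weights
QUANTIZED_GB = 131     # size of the dynamic GGUF
PARAMS_BILLION = 671   # parameter count of R1


def size_reduction(original_gb: float, quantized_gb: float) -> float:
    """Fraction of the original size that was removed."""
    return 1 - quantized_gb / original_gb


def bits_per_param(size_gb: float, params_billion: float) -> float:
    """Average bits stored per parameter (assumes 1 GB = 1e9 bytes)."""
    return size_gb * 1e9 * 8 / (params_billion * 1e9)


print(f"reduction: {size_reduction(ORIGINAL_GB, QUANTIZED_GB):.1%}")
print(f"original:  {bits_per_param(ORIGINAL_GB, PARAMS_BILLION):.2f} bits/param")
print(f"quantized: {bits_per_param(QUANTIZED_GB, PARAMS_BILLION):.2f} bits/param")
```

By this arithmetic the reduction is about 82%, and the 131GB file averages roughly 1.56 bits per parameter, versus about 8.6 bits per parameter for the 720GB original.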


 




There is now a video of someone installing DS R1 8B on a Raspberry Pi; it reached 1.3M views in a single day.
 



This is impressive. I'm going to go buy more RAM.
 
@Riven Nice! Thinking back to the first time I ran the 译星 translation software on a 286, things have come a very long way.

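@Hui The applications go far beyond translating a few sentences. For example, it can translate instantly and accurately into any language, such as Tahitian. It can also reason over source material to solve problems. If you're interested, it's worth studying.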

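I think DeepSeek has made two contributions: first, it brought costs straight down to Pinduoduo levels; second, it put AI in the hands of ordinary people in China.
Fully agree. DS's contribution is lowering both the cost and the barrier to entry, which has greatly advanced AI development.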
 
I'm planning to buy two 4060 Ti 16GB cards ($599), or two AMD 7900 24GB cards, to try it out. Memory is the key.
 
I have 64GB of RAM and 16GB of VRAM; I think that should be enough.
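
A quick way to check a machine against the requirements quoted from the Reddit post is sketched below. The thresholds come from that post; reading "80GB+" as "80GB or more" is an assumption, and the function name and tier labels are made up for illustration.

```python
OPTIMAL_TOTAL_GB = 80   # "sum of your VRAM+RAM = 80GB+" from the quoted post
MINIMUM_RAM_GB = 20     # "a CPU with 20GB of RAM (but it will be slow)"


def classify(ram_gb: float, vram_gb: float = 0.0) -> str:
    """Return a rough tier for the 131GB dynamic GGUF, per the quoted post."""
    if ram_gb + vram_gb >= OPTIMAL_TOTAL_GB:   # assumption: 80GB exactly counts
        return "optimal"
    if ram_gb >= MINIMUM_RAM_GB:
        return "minimum (slow)"
    return "insufficient"


print(classify(ram_gb=64, vram_gb=16))  # 64 + 16 = 80GB
print(classify(ram_gb=32))              # CPU only, above the 20GB floor
```

Under that reading, 64GB of RAM plus 16GB of VRAM lands exactly on the 80GB line, so there is no headroom. The post also asks for 140GB of disk space, which this sketch does not check.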
 

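In terms of value for money, buying a GPU doesn't give much of a boost.

I ran an experiment with R1 70B, which is a very large model. On a machine with 256GB of RAM it runs reasonably well even without a GPU. Adding a GPU card that costs about as much as the computer itself does speed up token generation, but only by around 15-20%.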
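
To put that value-for-money point in numbers, here is a small sketch. It assumes, as described above, that the GPU costs about as much as the rest of the computer (so total cost doubles) and that speed rises by 15-20%. No absolute prices or token rates are assumed, only ratios against the CPU-only baseline.

```python
def relative_throughput_per_cost(speedup: float, extra_cost_ratio: float) -> float:
    """Tokens/s per unit cost relative to the CPU-only baseline (baseline = 1.0).

    speedup:          fractional gain in tokens/s, e.g. 0.15 for +15%
    extra_cost_ratio: added cost as a fraction of the baseline machine cost,
                      e.g. 1.0 when the GPU costs as much as the computer
    """
    return (1 + speedup) / (1 + extra_cost_ratio)


for speedup in (0.15, 0.20):
    ratio = relative_throughput_per_cost(speedup, extra_cost_ratio=1.0)
    print(f"+{speedup:.0%} speed at 2x cost -> {ratio:.3f}x tokens/s per unit cost")
```

With those inputs, each unit of money buys only about 0.575x to 0.6x as many tokens per second as the CPU-only setup.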
 
Let's see a Chinese-to-English translation.

哈马斯证实其原军事部门首领穆罕默德·戴夫身亡
来源:新华网 | 2025年01月31日 03:27:36
原标题:哈马斯证实其原军事部门首领穆罕默德·戴夫身亡

新华社加沙1月30日电 巴勒斯坦伊斯兰抵抗运动(哈马斯)下属武装组织卡桑旅发言人阿布·乌拜达30日证实,哈马斯原军事部门首领穆罕默德·戴夫已在以色列军队袭击中身亡。

乌拜达在视频声明中还证实,卡桑旅原副首领马尔万·伊萨、哈马斯原汗尤尼斯旅指挥官拉法阿·萨拉马等多名哈马斯高级指挥官也已身亡。但他并未透露更多细节。

以色列军方去年8月1日发表声明说,穆罕默德·戴夫在同年7月13日以军对加沙地带汗尤尼斯地区的空袭中死亡。

Hamas has confirmed that its former military leader, Muhammad Daf, has been killed.

The statement from Abu Ubaidah of al-Qassam Brigades on the 30th confirmed that Muhammad Daf, who was the former commander of Hamas' military wing, was killed in an Israeli airstrike. Abu Ubaidah also confirmed that the former deputy commander of al-Qassam, Marwan Eissa, and Rafaa Slaama, who was the commander of the Khan Younis-based faction of Hamas, have been killed.

According to the Israeli military in August, Daf had been killed in airstrikes on Khan Younis on July 13th of that year.



DeepSeek-R1-abliterated:32B

 
Each has its strengths; DS isn't that magical.
The DS <think> block is really dumb, and there is no way to turn off the thinking step.
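
The thinking step itself can't be switched off, but the <think> block can at least be stripped from the output after the fact. Below is a minimal sketch, assuming the model wraps its reasoning in a literal `<think>...</think>` pair; the sample string is made up for illustration. This only hides the reasoning text. The model still spends time generating it.

```python
import re

# Matches one <think>...</think> block plus any whitespace after it.
# DOTALL lets "." span newlines; the non-greedy ".*?" stops at the first closing tag.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove every <think>...</think> block from model output."""
    return THINK_BLOCK.sub("", text).strip()


# Hypothetical raw output, for illustration only.
raw = "<think>\nThe user wants a translation.\n</think>\n\nHamas has confirmed ..."
print(strip_think(raw))
```

Text without any <think> tags passes through unchanged, apart from leading and trailing whitespace being trimmed.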

llama3.3 70b
新任交通部长肖恩·达菲(Sean Duffy)在宣誓就职仅几个小时后就面临了他的第一次重大危机。达菲于周二被参议院确认,并迅速成为联邦政府应对发生在华盛顿特区最近的雷根国家机场的致命飞机坠毁事故的公共形象。星期三晚上,一架美军直升机与载有60名乘客和4名机组人员的美国航空公司飞机相撞,当时该飞机正在降落,导致其冲入冰冷的波托马克河。人们担心机上所有人都已遇难,这可能是几十年来最致命的美国空难之一。

“我们的新交通部长肖恩·达菲——在他上任第二天就发生了这件事,那真的是一个艰难的局面,”特朗普在星期四白宫简报会上与达菲一起出席时说。

( translation note: "肖恩·达菲" 是Sean Duffy 的中文翻译,"雷根国家机场" 是Reagan National Airport 的中文翻译,"波托马克河" 是Potomac River 的中文翻译)


ds-r1:70b (distilled)
新任美国运输部长肖恩·达菲在就职几小时后便迎来了他执政以来的第一次重大危机。达菲于周二被参议院确认,他迅速成为联邦政府回应里根国家机场附近致命飞机坠毁事件的公开面孔。周三晚间,一架美国航空公司的客机正在里根国家机场着陆时,与一架阿米尼亚军用直升机发生碰撞,导致客机冲向冰冷的波托马克河。机上所有人都担心已死亡,这场灾难看起来是几十年来美国最致命的航空事故。

周四在白宫的一个简报会上,特朗普和达菲一起出席时说:“我们的新任运输部长肖恩·达菲——他就职第二天时发生了这件事。这确实很艰难。”
 