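Fengyan's Blog
With a bamboo staff and straw sandals I travel lighter than on horseback. Who's afraid? In a straw cloak through mist and rain, I'll go on through life.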
Literature Reading
Tag
2025
08-04
[Literature Reading] Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
06-30
DeepSeekMoE+MTP
02-18
[Literature Reading] Better & Faster Large Language Models via Multi-token Prediction
01-08
Introduction to ModernBERT
2024
11-22
STaR and Quiet-STaR
02-02
[Literature Reading] R-Drop: Regularized Dropout for Neural Networks
01-22
[Literature Reading] LST: Ladder Side-Tuning for Parameter and Memory Efficient Transfer Learning
01-18
[Literature Reading] RAG: Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
01-16
[Literature Reading] Mixtral of Experts
01-11
[Literature Reading] ALiBi: Attention With Linear Biases Positional Encoding