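Position Encoding: The Widely Used RoPE 1 RoPE Position Encoding Theory and origin of RoPE: https://kexue.fm/archives/8265 Suppose the query vector $q$ at position $m$ has dimension $d$; RoPE then transforms $q$ into $\begin{bmatrix} q_0 \\ q_1 \\ q_2 \\ q_3 \\ \vdots \\ q_{d-2} \\ q_{d-1} \end{bmatrix} \otimes \begin{bmatrix} \cos m\theta_0 \\ \cos m\theta_0 \\ \cos m\theta_1 \\ \cos m\theta_1 \\ \vdots \\ \cos m\theta_{d/2-1} \\ \cos m\theta_{d/2-1} \end{bmatrix} + \begin{bmatrix} -q_1 \\ q_0 \\ -q_3 \\ q_2 \\ \vdots \\ -q_{d-1} \\ q_{d-2} \end{bmatrix} \otimes \cdots$ 2026-03-15 LLM > Position Encoding #Deep Learning #Intelligent Systems #AIGC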
Chapter 1 Notations 1 Logic Operations There are four operators: $\neg$, $\land$, $\lor$, and $\Rightarrow$. The outcome of each operation is given by a truth table. For the proposition $\neg A$: $A$ = 0, 1; $\neg A$ = 1, 0. For c 2026-03-04 Math > Analysis-1 #Analysis #Math
05-GPU MatMul and Compilers These are lecture notes for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1 2025-08-22 AI Infra > ML System #Deep Learning #AI Infra #ML Systems
Efficient PyTorch Implementation of MoE with Aux loss and Token drop 1 Preliminaries Mixture-of-Experts is an essential architecture choice when building LLMs. Since DeepSeekV3 became prevalent, companies consider whether to use an MoE structure before LLM pretraini 2025-08-04 AI Infra #Deep Learning #AI Infra
04-GPU Programming 101 These are lecture notes for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1 2025-06-30 AI Infra > ML System #Deep Learning #AI Infra #ML Systems
03-Optimization on Operator and Matrix Multiplication These are lecture notes for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1 2025-06-28 AI Infra > ML System #Deep Learning #AI Infra #ML Systems
02-Behind ML Framework These are lecture notes for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1 2025-06-28 AI Infra > ML System #Deep Learning #AI Infra #ML Systems
01-Introduction These are lecture notes for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1 2025-06-23 AI Infra > ML System #Deep Learning #AI Infra #ML Systems
03-Flow Matching and Conditional Flow Matchings This tutorial series is based on Flow Matching Guide and Code (arXiv: 2412.06264). Thank you, META. Flow Matching Problem: Instead of learning the likelihood of the target like they 2025-06-11 Visual Generation > Flow Matching #Deep Learning #Generative Model #Flow Matching
02-Flow model, Everything Before Flow Matching This tutorial series is based on Flow Matching Guide and Code (arXiv: 2412.06264). Thank you, META. Flow Models: Before flow matching, flow models were hard to train; it entails solv 2025-06-09 Visual Generation > Flow Matching #Deep Learning #Generative Model #Flow Matching