Lynx Li Blog

The Widely Used RoPE

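Position Encoding: The Widely Used RoPE. 1 RoPE Position Encoding. For the theory and origin of RoPE, see https://kexue.fm/archives/8265. Suppose the query vector $q$ at position $m$ has dimension $d$; RoPE then transforms $q$ into $\begin{pmatrix} q_0 \\ q_1 \\ q_2 \\ q_3 \\ \vdots \\ q_{d-2} \\ q_{d-1} \end{pmatrix} \otimes \begin{pmatrix} \cos m\theta_0 \\ \cos m\theta_0 \\ \cos m\theta_1 \\ \cos m\theta_1 \\ \vdots \\ \cos m\theta_{d/2-1} \\ \cos m\theta_{d/2-1} \end{pmatrix} + \begin{pmatrix} -q_1 \\ q_0 \\ -q_3 \\ q_2 \\ \vdots \\ -q_{d-1} \\ q_{d-2} \end{pmatrix} \otimes$ …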
2026-03-15
LLM > Position Encoding
#Deep Learning #Intelligent Systems #AIGC

Chapter 1 Notations

1 Logic Operations. There are 4 operators: $\neg$, $\land$, $\lor$, and $\Rightarrow$. The outcome of each operation is given by its truth table. For the proposition $\neg A$: when $A$ is 0, $\neg A$ is 1; when $A$ is 1, $\neg A$ is 0. For c
2026-03-04
Math > Analysis-1
#Analysis #Math

05-GPU MatMul and Compilers

This is a lecture note for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1
2025-08-22
AI Infra > ML System
#Deep Learning #AI Infra #ML Systems

Efficient PyTorch Implementation of MoE with Aux loss and Token drop

1 Preliminaries. Mixture-of-Experts (MoE) is an essential architecture choice when building LLMs. Since DeepSeekV3 became prevalent, companies have been considering whether to use an MoE structure before LLM pretraini
2025-08-04
AI Infra
#Deep Learning #AI Infra

04-GPU Programming 101

This is a lecture note for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1
2025-06-30
AI Infra > ML System
#Deep Learning #AI Infra #ML Systems

03-Optimization on Operator and Matrix Multiplication

This is a lecture note for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1
2025-06-28
AI Infra > ML System
#Deep Learning #AI Infra #ML Systems

02-Behind ML Framework

This is a lecture note for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1
2025-06-28
AI Infra > ML System
#Deep Learning #AI Infra #ML Systems

01-Introduction

This is a lecture note for the course CSE 234 - Data Systems for ML - LE [A00]. From UC San Diego, Prof. Zhang Hao, Winter 2025. Link: https://podcast.ucsd.edu/watch/wi25/cse234_a00/1
2025-06-23
AI Infra > ML System
#Deep Learning #AI Infra #ML Systems

03-Flow Matching and Conditional Flow Matchings

This tutorial series is based on Flow Matching Guide and Code (arXiv: 2412.06264). Thank you, META. Flow Matching Problem: Instead of learning the likelihood of the target like they
2025-06-11
Visual Generation > Flow Matching
#Deep Learning #Generative Model #Flow Matching

02-Flow model, Everything Before Flow Matching

This tutorial series is based on Flow Matching Guide and Code (arXiv: 2412.06264). Thank you, META. Flow Models: Before flow matching, flow models were hard to train; it entails solv
2025-06-09
Visual Generation > Flow Matching
#Deep Learning #Generative Model #Flow Matching