AI Research: A Year of Ramblings
Last updated: May 12, 2025
The Beginning of Everything
Today marks exactly one year since I became an amateur AI researcher. I still remember how disheartened I was with my circumstances back then. I was originally supposed to become an antenna engineer, but due to certain reasons, I was forced to work on things I had no interest in—things completely unrelated to antennas or even my undergraduate studies.
During that time, I was extremely anxious. Every morning, I’d wake up wondering: What should I do with my future? What if I can’t understand a single word of the research topic I’m assigned? What if this field has no commercial prospects? How will I find a job? But gradually, my anxiety turned into defiance. Am I really going to let myself rot here?
The year I graduated, two things left a deep impression on me:
- Segment Anything—the first model capable of segmenting any object, demonstrating a general understanding of objects.
- ChatGPT—people were using it for all sorts of fun things, like casting “magic spells” in Python, role-playing, or even mimicking a Linux terminal.
After finishing my postgraduate entrance exams, I started studying deep learning. Out of inertia from how I'd first learned, I kept focusing on computer vision (CV). Then a bold idea struck me: What if I just do AI research on my own?
Starting with a Server on My Desk
The first step in research is having your own computing power. Renting cloud machines long-term was too expensive, so I decided to build my own. I emptied my savings—scholarships, competition prizes—and even had to ask my parents for extra money to assemble a 4x RTX 3090 server.
At first, I kept it at my university since our research group needed simulation resources. I thought, “The CPU has plenty of cores, so everyone can share it.” But the campus network became the biggest obstacle—constantly changing IPs, unstable speeds, SSH disconnections, and no way to expose it to the public internet. The IT department ignored my requests. Eventually, I had no choice but to move the server back home.
Chongqing summers are scorching. To prevent overheating, I locked each GPU’s power limit to 250W. That server, sitting on my desk, accompanied me for an entire year.
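Capping the cards this way can be done entirely from the command line. A minimal sketch, assuming four GPUs at indices 0–3 and that `nvidia-smi` is available with root privileges (the 250 W figure is the cap mentioned above; whether 250 W is within the allowed range depends on the card's vBIOS):

```shell
# Enable persistence mode so the power limit survives between CUDA jobs
sudo nvidia-smi -pm 1

# Cap each of the four GPUs at 250 W
for i in 0 1 2 3; do
  sudo nvidia-smi -i "$i" -pl 250
done
```

On an RTX 3090 (stock limit 350 W) this trades a modest amount of training throughput for a large drop in heat output, which is usually a good bargain for a desk-side machine in summer.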
Choosing a Research Direction
By mid-March, I was still planning to work on CV. But by chance, someone asked if I wanted to collaborate on large language models (LLMs). I figured that AIGC (AI-generated content) was the hottest field, with higher commercial potential, so I began a long collaboration with my co-authors.
The Journey into LLMs
In May, we decided to submit to ICLR. I started with Andrej Karpathy’s LLM introductory course, then branched into multiple areas:
- GPT architecture
- The three stages of training
- Parameter-efficient fine-tuning (PEFT)
- Infrastructure frameworks
- Multimodal models
Every day was a deep dive into papers and code implementations. My biggest regret was not learning infrastructure (Infra) properly—things like CUDA programming or Triton. Now, I only understand the principles of parallelism and acceleration kernels but can’t implement them myself.
The most painful part? Reading code. Big tech companies (especially Meta) write elegant, educational code, but digesting it all was exhausting. Meanwhile, some academic open-source projects were pure torture.
During that period, my favorite escape was walking along the beach while listening to music. To this day, Brahms’ Piano Concerto No. 1 and Chopin’s Piano Concerto No. 1 instantly bring back memories of those study sessions.
Failure, Then More Failure
Of course, not everything went smoothly. Our ICLR submission failed—we didn’t even finish the paper. We pivoted and aimed for ICML, but the submission deadline was during Chinese New Year. While my whole family gathered, I locked myself in my room, running experiments and writing. I barely spent any time with them.
After submission, the rebuttal phase brought even more issues. Judging by the final scores, it was probably a lost cause.
It’s impossible not to feel demoralized. Right now, I’m juggling failed projects, new projects, and even speed-running quantum mechanics just to graduate. The anxiety keeps waking me up at night, leaving me unable to fall back asleep—it’s taking a toll on my health.
Once, I woke up at 4 AM, gave up on sleep, and decided to take a walk to the beach. That's when I realized how early the sun rises in Qingdao. As a longtime urban photographer, I've always noted sunrise and sunset times. I set out to watch the changing light, only to find my usual path to the shore blocked by construction barriers.
I truly envy those who have the ability—both financially and skill-wise—to pursue their dreams. I’m still far from where I need to be.
All this rambling just proves one thing: I’m not strong enough yet. The only way forward is to keep improving. If I were stronger, maybe I’d have fewer worries.
Future Hopes
Right now, I’m too busy, but someday I want to work on an AI companion—a true digital human.
- LLM + humanoid robotics
- Capable of human-like movements and facial expressions
- Long-term memory that adapts to user interactions (even embedding memories into model weights)
- Customizable personalities
Maybe one day, I’ll be working on something like that.