Blog

Thoughts on ML models, AI applications, distributed systems, and data platforms.

Learning from Production Agent Systems: Claude Code
Non-linear AI·April 19, 2026
What studying Claude Code's architecture taught us: how the agent loop, context management, tool schema cost, sub-agent isolation, coordination protocol, and reliability infrastructure actually work in production.
Writing Effective Skills: Best Practices for Agent Onboarding
Non-linear AI·March 2, 2026
How to write skills that trigger reliably, load efficiently, and stay maintainable, with concrete examples from official docs, the open specification, and public skill repositories.
Understand Reasoning Engine: RL Infrastructure for LLMs, From Algorithms to Production
Guanghua Shu·February 26, 2026
A deep dive into RL infrastructure for LLMs, from RLHF to RLVR to Agentic RL, with head-to-head experiments comparing rule-based vs. LLM-as-a-Judge rewards and sync vs. async training on SkyRL.
Context Engineering: Managing the Scarcest Resource in Agent Systems
Non-linear AI·February 1, 2026
How independently built agent systems converged on the same architecture for managing finite context windows.
Embracing Functional Architecture in Agent Engineering
Non-linear AI·January 1, 2026
How functional architecture helps with durability, security, scalability, and maintainability in agent systems.
From PPO to GRPO: The Evolution of Fine-Tuning for Reasoning Models
Guanghua Shu·December 5, 2025
From PPO to GRPO: tracing the evolution of policy optimization algorithms for efficient LLM alignment and comparing their real-world applications.
The Recommender's Lesson: How Scalable Learning Augments Human Insight
Guanghua Shu, Li Tan, Shen Zhu·November 16, 2025
From collaborative filtering to generative AI: tracing the evolution of recommender systems through the lens of the bitter lesson.
Arc-Graph: Declarative Machine Learning for the Age of AI Agents
Li Tan, Shen Zhu, Guanghua Shu·November 1, 2025
How declarative ML creates a shared language for human-agent collaboration.
DeepSeek V3 and R1: Innovative Architectures and Advanced Reasoning Capabilities in Open-Source LLMs
Guanghua Shu·February 23, 2025
A comprehensive analysis of DeepSeek V3 and R1 models, covering key innovations like MLA, MoE, MTP, GRPO, and the evolution from base models to reasoning-capable systems.

