Yun’s Substack

Phase 2 Experiment Design: Cross-Lab Reasoning Drift Benchmark (GPT-5 vs. Claude Opus 4.1)
Executive Summary
Aug 24 • Yun Huang
Detecting Behavioral Drift in GPT-5 vs. 4o Under Social, Emotional, and Authority Framing
Phase 2 (Completed): GPT-5 vs GPT-4o Baseline
Aug 21 • Yun Huang

June 2025

Fine-Tuning GPT-2 for Narrative Structure: A Behavioral Probe
An exercise to understand model behavior
Jun 21 • Yun Huang
From Performance to Pressure: How GPT-5 May Shift the Ground Beneath Model Evaluation
GPT-4o gave us a glimpse of what’s coming — fast, expressive, multimodal.
Jun 12 • Yun Huang
A statistical approach to model evaluations
A recap of key insights from Anthropic's blog on model evaluations (Nov 2024)
Jun 9 • Yun Huang

January 2025

Understand GPT-4: A Technical Analysis
Core Capabilities
Jan 10 • Yun Huang

July 2024

Comparing Zero-Shot vs. Few-Shot Prompting: Cost-Efficiency vs. Precision
Comparing Accuracy in Zero-Shot vs. Few-Shot Prompting Using GPT-3.5 Turbo API
Jul 18, 2024 • Yun Huang

June 2024

What's happening in AI? An analysis of funding & adoption trends, June 2024
This year, over $30 billion has been invested in AI.
Jun 12, 2024 • Yun Huang
© 2025 Yun Huang