Lihao Sun

University of Chicago

Hi there! I am a recent graduate of the University of Chicago, with a B.S. in Computer Science and a B.A. (Honors) in Cognitive Science.

My research interests center on understanding why and how LLMs demonstrate impressive capabilities and, at times, undesirable or unsafe behaviors. By applying and developing mechanistic interpretability tools, I aim to illuminate the circuits and representations that drive these outcomes and, in turn, develop principled, evidence-based interventions that reshape the model’s internals. Ultimately, I hope this line of inquiry helps us reflect more deeply on what it means to learn, think, and be human.

During my undergraduate years, I was fortunate to collaborate with Xuechunzi Bai, Andrew Lee, Chengzhi Mao, and Valentin Hofmann.

I’m also an indie music enthusiast, music magazine writer, startup builder, and competition math specialist. You can learn more about my life in this tab.

Email | Google Scholar | GitHub | X | Bluesky

news

Jun 10, 2025 I will be attending Y Combinator AI Startup School on Jun 16-17. See you in San Francisco!

publications

  1. Aligned but Blind: Alignment Increases Implicit Bias by Reducing Awareness of Race
    Accepted to ACL (Main), 2025
  2. The Geometry of Self-Verification in a Task-Specific Reasoning Model
    Preprint. In submission to NeurIPS, 2025