LLM Reasoning as Trajectories:
Step-Specific Representation Geometry and Correctness Signals

Microsoft
ACL 2026 (Main)

Chain-of-thought reasoning in LLMs traces structured trajectories through representation space. Step-specific regions become linearly separable with depth, late-step geometry predicts correctness, and trajectory-based steering enables both error correction and reasoning length control at inference time.

t-SNE visualization of step-specific activations across layers
Step-specific representation structure across layers. t-SNE visualization of activations preceding Step markers at layers 0, 11, 21, and 31. Clusters become increasingly separated at deeper layers, revealing step-specific geometric structure.

Overview

Current large language models generate tokens by iteratively updating high-dimensional representations. When solving math problems with chain-of-thought prompting, the sequence of reasoning steps can be viewed as successive states forming a trajectory through the model's representation space. We characterize this trajectory and find it is highly structured: each reasoning step occupies a distinct, linearly separable region that becomes progressively more delineated at deeper layers. This organization is already present in base models — reasoning training primarily reshapes when convergence occurs rather than introducing new representational structure.

Building on this, we show that correct and incorrect solutions follow similar early-step paths but diverge systematically at late steps, yielding actionable mid-reasoning correctness signals. We further introduce trajectory-based steering, an inference-time intervention framework that enables both error correction and reasoning length control without retraining.

Step-Specific Representation Subspaces

We extract hidden-state activations immediately preceding each Step marker during chain-of-thought reasoning on GSM8K. These activations form snapshots of the model's representation trajectory as reasoning unfolds.
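The extraction step can be sketched with plain NumPy on precomputed per-layer hidden states; the token list, the literal `Step` marker convention, and the random states below are illustrative stand-ins for the actual tokenizer-level setup:

```python
import numpy as np

def step_marker_positions(tokens, marker="Step"):
    """Indices of tokens immediately preceding each step marker."""
    return [i - 1 for i, t in enumerate(tokens) if t == marker and i > 0]

def extract_trajectory(hidden_states, tokens, marker="Step"):
    """Snapshot the hidden state just before each step marker.
    hidden_states: (seq_len, d_model) array for a single layer."""
    return hidden_states[step_marker_positions(tokens, marker)]

# toy example: random states standing in for real layer activations
tokens = ["Q", ":", "...", "Step", "1", ":", "x", "Step", "2", ":", "y"]
H = np.random.randn(len(tokens), 8)
traj = extract_trajectory(H, tokens)
print(traj.shape)  # (2, 8)
```

Stacking these snapshots across steps gives one trajectory per solution, repeated at each layer of interest.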

Layer-wise linear probe accuracy for step-specific classification
Layer-wise linear probe accuracy for step classification (Steps 2, 3, 5, and Final Answer Marker) across Instruct, R1-Distill, and Base models. Later steps require deeper layers to become separable.
  • Step-specific regions are linearly separable. Step 1 has probe accuracy above 0.99 at every layer for all models. Later steps require deeper layers to become separable (Table 1).
  • Structure is shared across training regimes. Cross-model transfer of step-specific linear probes consistently achieves accuracy above 0.90 for nearly all model pairs (Table 1).
  • Robust to prompt format. Probes trained on fixed-format Step X: activations transfer to freeform responses (no explicit step markers) with best-layer accuracies consistently above 0.84 (Table 7).
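A probe of this kind is ordinary multinomial logistic regression on activation snapshots. As a minimal sketch, the synthetic clusters below stand in for real step-specific activations at one layer; the dimensionality and cluster spread are arbitrary:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
d, n_per_step = 32, 200
# synthetic stand-in: each "step" occupies its own region of activation space
centers = 3.0 * rng.normal(size=(4, d))
X = np.vstack([c + 0.5 * rng.normal(size=(n_per_step, d)) for c in centers])
y = np.repeat(np.arange(4), n_per_step)  # labels: which step produced the state

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
acc = probe.score(X_te, y_te)
print(f"probe accuracy: {acc:.2f}")
```

Cross-model transfer corresponds to fitting `probe` on one model's activations and calling `score` on another's.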

💡 Key Insight: Reasoning steps occupy functionally ordered, linearly separable subspaces that are largely shared across training regimes and robust to surface-level formatting.

Correctness in Trajectory Geometry

We group trajectories by final-answer correctness and compute between-step activation distances. Early-step transitions are nearly identical for correct and incorrect solutions, but late-step transitions diverge systematically. We then train logistic regression classifiers on trajectory features to predict final-answer correctness before the answer is emitted.

Between-step activation distances and mid-reasoning correctness prediction
(a) Between-step activation distances. Late transitions show statistically significant divergence between correct and incorrect solutions († marks non-overlapping 95% CIs). (b) Test ROC-AUC across layers. Late-step trajectory features (peak 0.87 at layer 29) substantially outperform early-step geometry and logit-lens baselines.
  • Early-step geometry is correctness-invariant. Step 1→2 distances show no significant difference (Δ(I−C) ≈ 0).
  • Late-step trajectories diverge. Last step → answer marker: Euclidean Δ(I−C) = −13.39, cosine Δ = −0.06.
  • Trajectory features predict correctness with ROC-AUC 0.87 (peak at layer 29), substantially outperforming step-count-only baselines (0.65) and logit-lens features (0.77).
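One way to turn these observations into a predictor, as a rough sketch: featurize each trajectory by its per-transition Euclidean and cosine distances, then fit logistic regression on those features. The synthetic trajectories below (incorrect ones take a larger final hop) are illustrative, not the paper's data:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
d, steps = 16, 5

def make_traj(correct):
    traj = rng.normal(size=(steps, d))
    if not correct:
        traj[-1] += 2.0 * rng.normal(size=d)  # larger last-step hop
    return traj

def transition_features(traj):
    """Euclidean and cosine distances between consecutive step states."""
    a, b = traj[:-1], traj[1:]
    eucl = np.linalg.norm(b - a, axis=1)
    cos = 1.0 - np.sum(a * b, axis=1) / (
        np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1))
    return np.concatenate([eucl, cos])

y = np.array([0, 1] * 150)  # 1 = correct
X = np.array([transition_features(make_traj(c)) for c in y])
clf = LogisticRegression(max_iter=1000).fit(X, y)
print(f"train accuracy: {clf.score(X, y):.2f}")
```

Because the features only cover transitions seen so far, the same classifier can be queried mid-reasoning, before the answer is emitted.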

💡 Key Insight: Correctness is encoded not in where the model ends up in representation space, but in how it gets there — the trajectory, not the destination.

Error-Targeted Inference-Time Interventions

We show that unconditional test-time scaling — always injecting control tokens like "Wait" or "Hmm" — is often harmful, reducing accuracy by up to 36%. Instead, we use correctness predictors to gate interventions, applying them only when failure is predicted.

Unconditional vs predictor-gated interventions
Unconditional vs. error-targeted interventions on GSM8K. Predictor-gated methods convert large unconditional accuracy drops into consistent gains.
  • Unconditional interventions are often harmful. Always injecting "Wait" reduces accuracy by 36.0%; even "Step" incurs a 1.6% drop.
  • Predictor-gated interventions yield gains of up to +35.4% relative to always-on, intervening on only 12.3% of examples.
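The gating logic itself is simple. A minimal sketch, where `p_correct` is supplied by some mid-reasoning correctness predictor (a hypothetical interface) and the `"Wait"` injection stands in for the full intervention mechanics:

```python
def gated_injection(partial_trace, p_correct, threshold=0.5, token="Wait"):
    """Inject a control token only when the correctness predictor flags
    likely failure; otherwise leave the reasoning trace untouched."""
    if p_correct < threshold:
        return partial_trace + f" {token},"
    return partial_trace

# flagged trace gets the control token; confident trace is left alone
print(gated_injection("Step 3: 12 * 4 = 46.", p_correct=0.2))
print(gated_injection("Step 3: 12 * 4 = 48.", p_correct=0.9))
```

The unconditional baseline corresponds to `threshold = 1.0` (always inject), which is exactly the regime that hurts accuracy above.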

💡 Key Insight: Late-step trajectory geometry encodes detectable correctness signals that can guide interventions, but one-shot corrections remain limited in their ability to reliably repair diverse failure modes.

Trajectory-Based Steering

Correcting Deviating Reasoning Trajectories

We derive an ideal reasoning trajectory from correct examples in PCA space and apply low-rank steering updates whenever the current trajectory deviates beyond learned thresholds. This enables localized, repeated corrections at every step boundary.

Trajectory-based steering for correctness
Trajectory-based steering for correctness on GSM8K, stratified by original step count. Gains are largest on longer, error-prone reasoning chains.
  • Most effective on longer chains: +7.60% accuracy on 6-step problems (75.44% → 83.04%) and +7.69% on 7-step problems (67.69% → 75.38%).
  • High preservation rates (≥97%): repeated low-magnitude interventions avoid destabilizing already-correct trajectories.
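A minimal numerical sketch of the correction rule: measure deviation from a reference trajectory inside a low-dimensional PCA subspace, and apply a low-rank nudge only when it exceeds a threshold. The basis, threshold, and step size `alpha` here are illustrative placeholders for the learned values:

```python
import numpy as np

def steer_to_reference(h, ref, components, threshold=1.0, alpha=0.5):
    """Project the deviation from the reference state onto the top PCA
    components and nudge the hidden state back when it exceeds threshold."""
    delta = (ref - h) @ components.T           # coords in the PCA subspace
    if np.linalg.norm(delta) <= threshold:
        return h                               # on track: no intervention
    return h + alpha * (delta @ components)    # low-rank correction

# toy check: a deviated state moves toward the reference, an on-track
# state is left untouched
d, k = 16, 3
rng = np.random.default_rng(2)
comps, _ = np.linalg.qr(rng.normal(size=(d, k)))  # orthonormal (d, k)
comps = comps.T                                   # rows span the subspace
ref = rng.normal(size=d)
h = ref + 4.0 * comps[0]          # deviated along the first component
h2 = steer_to_reference(h, ref, comps)
print(np.linalg.norm(h2 - ref) < np.linalg.norm(h - ref))  # True
```

Because `alpha < 1`, each correction is low-magnitude; applying it at every step boundary yields the repeated, localized updates described above.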

Reasoning Length Control

Using a termination-related subspace (the region of representation space associated with the final-answer marker), we can directly and continuously control reasoning length by steering activations toward (Shorten) or away from (Prolong) the termination region.

Reasoning length control via termination subspace
Reasoning length and number of steps as a function of steering strength |α|. Moderate strengths (|α| ≤ 0.4) change length with ~1% accuracy impact.
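The control knob reduces to a signed shift along a unit direction. In this sketch, `term_dir` is a placeholder for the actual termination direction, with the sign of `alpha` following the Shorten/Prolong convention above:

```python
import numpy as np

def steer_length(h, term_dir, alpha):
    """Shift an activation along the termination direction.
    alpha > 0 steers toward termination (Shorten); alpha < 0 away (Prolong)."""
    u = term_dir / np.linalg.norm(term_dir)
    return h + alpha * u

h = np.zeros(8)
u = np.ones(8)                         # placeholder termination direction
shortened = steer_length(h, u, +0.4)   # moves toward the termination region
prolonged = steer_length(h, u, -0.4)   # moves away from it
print(shortened @ (u / np.linalg.norm(u)))  # projection onto the direction
```

Because `alpha` is continuous, length varies smoothly with steering strength, matching the |α| sweep in the figure.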

💡 Key Insight: Reasoning trajectories can be causally manipulated to both correct errors and control reasoning length, establishing them as a unifying abstraction for interpreting, predicting, and influencing LLM reasoning behavior.

Cross-Task Generalization

Step-specific probes trained on GSM8K transfer robustly to both MATH-500 and MMLU, achieving best-layer accuracies above 0.85 for all steps. Steering vectors derived from GSM8K also improve MATH-500 accuracy from 36.40% to 38.20% (+1.80%) without any retuning — indicating that the identified geometry reflects general properties of CoT reasoning.

BibTeX

@misc{sun2026llmreasoningtrajectoriesstepspecific,
      title={LLM Reasoning as Trajectories: Step-Specific Representation Geometry and Correctness Signals}, 
      author={Lihao Sun and Hang Dong and Bo Qiao and Qingwei Lin and Dongmei Zhang and Saravan Rajmohan},
      year={2026},
      eprint={2604.05655},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2604.05655}, 
}