Unit 10 · Lesson 7

Standard Deviation Tuning for Vision Trust

Lesson 6 showed that standard deviations control how much the Kalman filter trusts each measurement. This lesson shows how to choose them — starting from the physics of why accuracy degrades with distance and why multiple tags are categorically better than one, then building toward a dynamic formula you can deploy and iterate.

By the end of this lesson, you will:

Explain what each of the three standard deviation values controls (x, y, theta) and why theta is typically set higher than x and y
State the two physical factors that most affect vision measurement accuracy and how they map to standard deviation values
Implement a dynamic standard deviation computation using the formula stddev = base × distance² / numTags
Measure your robot's actual odometry drift rate and set odometry standard deviations accordingly
Log all three dynamic standard deviation values to SmartDashboard for tuning iteration
Identify the failure mode caused by setting theta standard deviation too low

What the Three Standard Deviations Control

In Lesson 6, we saw that addVisionMeasurement(pose, timestamp) uses the constructor's default vision standard deviations. A more powerful overload — addVisionMeasurement(pose, timestamp, stdDevs) — lets you pass a Matrix<N3, N1> that specifies trust per-measurement, based on the actual quality of that specific detection.

The three values in the matrix correspond to the three degrees of freedom in a 2D pose estimate: x position (meters), y position (meters), and heading theta (radians). Each one is an estimate of the standard deviation of the error in that axis. Larger value = the filter trusts that axis less. Smaller value = the filter trusts it more, and the fused estimate moves closer to the vision estimate on that axis.

x stddev (meters)

Error along the field length Primarily driven by distance to the tag (Z axis error in the camera frame maps to field-X or field-Y depending on robot heading). Grows quadratically with distance as corner pixel positions subtend fewer pixels. start: 0.3 m at 2 m

y stddev (meters)

Error along the field width Similar to x — driven by distance and pixel noise. In practice x and y stddevs are often set equal. At steep viewing angles (tag nearly edge-on), the axis perpendicular to the tag normal has higher error. start: 0.3 m at 2 m

theta stddev (radians)

Error in heading estimate Heading from single-tag solvePnP is significantly less accurate than X/Y position. Vision heading should generally not override the gyro. Set this 3–10× higher than x/y unless using multi-tag at close range. start: 9999 (ignore vision heading)

🔍 The heading override failure — most common tuning mistake

Setting theta standard deviation too low is the single most common vision tuning error I see at competition. When theta stddev is small (e.g., 0.1 rad), the Kalman filter trusts the vision-computed heading more than the gyro. Single-tag solvePnP heading estimates are often 10–30° off, especially at distance or steep angles. The result: the robot's fused heading jumps 15° sideways every time a tag detection arrives, the swerve field-oriented controller tries to compensate, and the robot fishtails during autonomous. The fix is either setting theta stddev to a very large number (9999) to effectively ignore vision heading entirely, or using multi-tag estimates which have much lower heading error. The gyro is better at heading than vision for nearly all FRC scenarios — let it own that axis.

The Two Physical Factors That Drive Accuracy

Factor 1: Distance to the tag

As the robot moves farther from a tag, the tag occupies fewer pixels in the camera image. The corner pixel coordinates used by solvePnP become less precise — each pixel represents more physical space. The error in the computed pose grows roughly with the square of the distance: double the distance, roughly quadruple the position error. This is why measurements taken close to tags (1–3 meters) are dramatically more reliable than measurements at long range (5–7 meters), and why vision trust should scale down as distance increases.

Factor 2: Number of visible tags

With a single tag, solvePnP has the pose ambiguity problem (Lesson 1) — the position estimate has significant heading uncertainty, and the position estimate depends on one set of four corner coordinates. With two or more tags, the multi-tag solvePnP solves across all visible corner sets simultaneously. This has two effects: (1) pose ambiguity is eliminated, and (2) random pixel noise in any one tag's corners is partially averaged out by the other tags. Multi-tag estimates at equivalent distance are typically 3–5× more accurate than single-tag estimates.

💡 Distance² is the right scaling, not distance

A linear scaling with distance is insufficient. If you double the distance, a tag's corner spans half as many pixels — but the angular uncertainty in each pixel doubles, making the 3D position error scale with distance squared, not linearly. The distance² term in the formula below captures this physical relationship. Teams that use linear scaling underweight nearby measurements (which are very accurate) and overweight far measurements (which should be distrusted). The quadratic formula matches the actual physics of solvePnP error growth.

Standard Deviation Explorer

Adjust the detection scenario below to see how the computed standard deviations change with distance and tag count. The charts show how the X/Y position uncertainty grows with distance under the formula stddev = base × distance² / numTags, and how each scenario maps to a SmartDashboard trust level.

Dynamic stddev computation — formula: base × dist² / numTags adjust controls

Current detection

Distance to tag (m): 2.5

Visible tags: 1

Formula constants

Base XY stddev (m): 0.03

Theta trust: 9999

XY stddev vs. distance

XY stddev vs. tag count (@ current dist)

XY stddev —

Theta stddev —

Trust level —

Java code —

The Dynamic Formula

Instead of a single fixed standard deviation for all vision measurements, the dynamic approach computes a per-measurement value based on the actual quality of that specific detection. The formula that most closely models the physics:

VisionSubsystem.java — dynamic standard deviation computation

import edu.wpi.first.math.Matrix;
import edu.wpi.first.math.VecBuilder;
import edu.wpi.first.math.numbers.*;

// ── Base constants ─────────────────────────────────────────────────────────
// Start with these values and iterate based on observed performance.
// BASE_XY_STDDEV: position stddev at 1m with 1 tag
// Decreasing this makes vision more aggressive at correcting odometry.
private static final double BASE_XY_STDDEV = 0.03;  // meters at 1m/1tag

// Theta stddev — effectively ignore vision heading.
// Only change this if you have multi-tag with very close range AND
// you've verified vision heading is more accurate than your gyro.
private static final double THETA_STDDEV = 9999.0;   // radians (effectively infinite)

/**
 * Computes vision measurement standard deviations based on detection quality.
 *
 * @param distanceMeters  Distance to the nearest detected tag (meters).
 *                        From: target.getBestCameraToTarget().getTranslation().getNorm()
 * @param numTags         Number of tags used in this estimate.
 *                        From: estimate.targetsUsed.size()
 * @return Matrix<N3,N1> of [x stddev, y stddev, theta stddev]
 */
private Matrix<N3, N1> computeVisionStdDevs(double distanceMeters, int numTags) {
    // XY error scales with distance² and improves proportionally with tag count
    double xyStdDev = BASE_XY_STDDEV * (distanceMeters * distanceMeters) / numTags;

    // Cap to prevent extreme values at long range from blowing up the filter.
    // Measurements beyond 7m or so should be rejected in Lesson 8 anyway.
    xyStdDev = Math.min(xyStdDev, 5.0);

    return VecBuilder.fill(xyStdDev, xyStdDev, THETA_STDDEV);
}

// Usage in periodic() — extract distance from the detected targets:
estimateOpt.ifPresent(estimate -> {
    int numTags = estimate.targetsUsed.size();

    // Average distance across all detected tags (better than using only the closest)
    double totalDist = estimate.targetsUsed.stream()
        .mapToDouble(t -> t.getBestCameraToTarget().getTranslation().getNorm())
        .sum();
    double avgDist = totalDist / numTags;

    Matrix<N3, N1> stdDevs = computeVisionStdDevs(avgDist, numTags);

    // Pass to drivetrain with both pose and per-measurement stddevs
    m_drive.addVisionMeasurement(
        estimate.estimatedPose.toPose2d(),
        estimate.timestampSeconds,
        stdDevs
    );

    // Log for tuning iteration (Lesson 7)
    SmartDashboard.putNumber("Vision/StdDevXY", stdDevs.get(0, 0));
    SmartDashboard.putNumber("Vision/AvgTagDist", avgDist);
    SmartDashboard.putNumber("Vision/NumTags", numTags);
});

Odometry Standard Deviations: Measuring Your Robot's Actual Drift

The odometry standard deviations set in the SwerveDrivePoseEstimator constructor represent how much uncertainty accumulates per loop from encoder integration and wheel slip. These should reflect your robot's actual measured drift rate, not an arbitrary guess.

How to measure odometry drift

Disable vision inputs entirely (comment out addVisionMeasurement()). Drive a precisely measured path — for example, 5 meters forward and back to the starting position. The odometry's reported final position versus the true starting position is the accumulated drift. Do this 5 times. The standard deviation of those drift measurements, divided by the path length, gives you a drift-per-meter estimate. For a well-tuned swerve drive, this is typically 1–3 cm per meter of travel.

From this measurement: if your drift is 2 cm per meter and your drive loop runs at 50 Hz, odometry accumulates roughly 0.02 m/m × robot_speed_m_s × 0.02s of uncertainty per loop. The value VecBuilder.fill(0.1, 0.1, 0.1) from Lesson 6 is a reasonable starting point for most well-tuned swerve drives traveling at moderate speeds. Teams with very accurate drivetrains can use smaller values (0.05); teams with significant wheel slip should use larger ones (0.2–0.3).

💡 Odometry stddev affects how quickly vision corrections take hold

The ratio between odometry and vision standard deviations determines the Kalman gain — how aggressively the filter moves toward the vision estimate when one arrives. If odometry stddev is 0.1 and vision stddev is 0.3, the filter trusts odometry 3× more, so vision moves the estimate only about 25% of the way to the vision pose. If you want faster corrections, lower the vision stddev (trust vision more) or raise the odometry stddev (trust odometry less). Neither alone is right — the ratio must reflect the actual relative accuracy of your two sources. Measure your odometry drift and use that to calibrate, not intuition.

The Tuning Workflow

Start with the conservative defaults

Begin with BASE_XY_STDDEV = 0.03, THETA_STDDEV = 9999, and odometry stddevs of [0.1, 0.1, 0.1]. These give moderate vision trust that won't destabilize the robot. Deploy and verify fusion is happening (counter increments, Field2d shows correction).

Log the computed stddevs per loop

Add SmartDashboard logging for Vision/StdDevXY, Vision/AvgTagDist, and Vision/NumTags as shown in the code above. During autonomous, check whether the logged values look reasonable — stddev should be small (high trust) when 2–3 tags are visible at 2m, and large (low trust) when only one tag is visible at 5m.

Compare fused pose vs. vision pose in Field2d

With both the fused estimate and raw vision estimate visible on the Field2d widget, observe whether the fused estimate smoothly incorporates corrections or jumps abruptly. Abrupt jumps (the robot icon teleporting) mean vision stddev is too low — increase BASE_XY_STDDEV. No visible correction means vision stddev is too high or odometry stddev is too low — lower BASE_XY_STDDEV or raise odometry stddevs.

Test the autonomous scoring accuracy

Run the full autonomous routine and measure landing accuracy at the scoring position. Acceptable: ±3 cm at a precision scoring target. If accuracy is worse with vision than without, vision is adding noise rather than correction — likely due to a bad measurement leaking through. The filtering in Lesson 8 addresses this.

Adjust BASE_XY_STDDEV based on results

If scoring accuracy is consistently biased (robot always misses left, right, or at a consistent distance), the vision measurements are systematically wrong — the issue is calibration or robot-to-camera transform, not stddev. If accuracy is randomly scattered around the target, stddevs are correctly set but there's underlying measurement noise. If accuracy is better on multi-tag frames than single-tag, consider using different base values per tag count (smaller base when multiple tags are visible).

🔍 How championship teams handle theta stddev in practice

Teams like 2910 and 6328 effectively ignore vision heading (theta stddev = very large number) for most of the match and rely entirely on the gyro for heading. The gyro on a well-maintained Pigeon 2 or NavX drifts less than 1° over an entire match, while single-tag vision heading can have 15–30° error. Even MegaTag 2 with gyro assistance is only marginally better than the gyro alone for heading. The only scenario where vision heading is worth incorporating is when the gyro has failed, been reset incorrectly, or accumulated significant drift from a collision — and in that case, you'd want separate error-recovery logic, not just a lower theta stddev.

Organizing Constants in VisionConstants

Keep all tuning values in one place so they're easy to adjust between events:

VisionConstants.java — complete constants file

public class VisionConstants {

    // ── Camera geometry ───────────────────────────────────────────────────────
    // Re-measure after any change to camera mounting.
    public static final Transform3d ROBOT_TO_FRONT_CAMERA =
        new Transform3d(new Translation3d(0.30, -0.10, 0.50),
                       new Rotation3d(0, Units.degreesToRadians(-15), 0));

    // ── Field ─────────────────────────────────────────────────────────────────
    public static final double FIELD_LENGTH_METERS = 16.54;
    public static final double FIELD_WIDTH_METERS  = 8.21;

    // ── Kalman filter — odometry trust ────────────────────────────────────────
    // Measured from actual robot drift tests (5m path, 5 runs).
    // Increase if your robot slips more (carpet-to-carpet variation).
    public static final Matrix<N3, N1> ODOMETRY_STD_DEVS =
        VecBuilder.fill(0.1, 0.1, 0.1);

    // ── Kalman filter — vision trust formula ──────────────────────────────────
    // xyStdDev = BASE_XY_STDDEV * distance² / numTags
    //
    // START HERE for tuning. Lower this → trust vision more aggressively.
    // Typical range: 0.01 (very aggressive) to 0.1 (very conservative).
    // Raise if vision is adding noise instead of correcting drift.
    public static final double BASE_XY_STDDEV = 0.03;

    // Heading from vision is rarely better than the gyro.
    // Set to a large number to effectively ignore vision heading.
    // Only reduce if: multi-tag at close range AND gyro has known drift issues.
    public static final double VISION_THETA_STDDEV = 9999.0;

    // Maximum distance at which to accept a vision measurement.
    // Beyond this, stddev is so large that measurements add noise.
    // Also used as an explicit filter in Lesson 8.
    public static final double MAX_VISION_DISTANCE_METERS = 5.0;
}

🔌 System Check

⚙️ Verifying Standard Deviation Tuning

Work through these checks after implementing dynamic stddevs before competition:

SmartDashboard shows dynamic stddev values that vary by scenario. Log Vision/StdDevXY and watch it change as you move the robot closer to and farther from tags, and as you move between positions with 1 tag and 2+ tags visible. The values should be noticeably smaller (higher trust) when close with multiple tags visible.
Theta stddev is set large (≥ 100 rad) in production. Unless you've specifically tested and verified that vision heading improves on your gyro, keep theta stddev at 9999. Check the code — don't assume a previous programmer set it correctly. The heading fishtail failure during auto is one of the hardest bugs to diagnose because it looks mechanical, not like a software tuning issue.
Fused estimate moves smoothly, not abruptly. In Field2d, the fused robot estimate should move toward vision measurements gradually, not teleport. If the robot icon jumps suddenly when a tag is detected, lower BASE_XY_STDDEV or add the distance filter from Lesson 8 to reject close-range noisy measurements.
Odometry stddevs reflect measured drift, not defaults. Run the 5m path drift test described in this lesson. If your actual drift is 5 cm per meter but odometry stddevs are set to 0.1 (which implies ~10 cm drift), you're overtrusting vision relative to your actual odometry quality. Adjust based on measurement, not assumption.

Knowledge Check

1. A robot is facing a scoring structure where it can see 3 AprilTags at an average distance of 2.0 m. Using BASE_XY_STDDEV = 0.03, what is the computed XY standard deviation, and what does this mean for how much the Kalman filter trusts this measurement?

A 0.03 m — the formula returns the base value unchanged when multiple tags are visible
B 0.04 m — computed as 0.03 × (2.0)² / 3 = 0.03 × 4 / 3 ≈ 0.04 m; this is a relatively low stddev indicating high trust, so the Kalman filter will move the fused estimate significantly toward this vision pose
C 0.12 m — computed as 0.03 × 2.0 × 3 = 0.18, which represents the combined measurement from 3 tags
D 0.12 m — computed as 0.03 × 2.0 / 3 = 0.02, rounded up for safety margin

2. During autonomous, a robot's fused pose estimate causes it to spin 20° off-axis every time a tag detection arrives, then snap back as the gyro reasserts. Odometry and gyro are working correctly. The camera calibration is good and the robot-to-camera transform is measured correctly. What is the most likely cause?

A BASE_XY_STDDEV is too small — the position correction is too aggressive
B THETA_STDDEV is too small — the Kalman filter is trusting the vision-computed heading, which has 20° error from single-tag solvePnP; the fix is setting THETA_STDDEV to a large value (≥ 100 rad) to effectively ignore vision heading and rely on the gyro
C The camera is too close to the tags — reduce vision update frequency
D The odometry stddevs are too large relative to vision stddevs

3. A team's vision fusion code uses a fixed stddev of VecBuilder.fill(0.3, 0.3, 9999) for all measurements regardless of distance or tag count. During their match, the robot sometimes sees tags at 1.5 m (multi-tag) and sometimes at 6 m (single-tag). What is the key limitation of fixed stddevs compared to the dynamic formula?

A Fixed stddevs aren't supported — addVisionMeasurement() requires per-measurement values
B Fixed stddevs weight all measurements equally regardless of actual accuracy — accurate close-range multi-tag measurements (which could provide tight corrections) are underweighted, while noisy 6 m single-tag measurements (which should be distrusted) are overweighted; the dynamic formula assigns higher trust where physics guarantees better accuracy
C The value 0.3 is too small for FRC — it should be at least 1.0 regardless of conditions
D Fixed stddevs cause the Kalman filter to ignore measurements that differ by more than 0.3 m from odometry

💪 Practice Prompt

Implement and Tune Dynamic Standard Deviations

Add a VisionConstants.java file with all the constants from this lesson: BASE_XY_STDDEV, VISION_THETA_STDDEV, ODOMETRY_STD_DEVS, and MAX_VISION_DISTANCE_METERS. Replace any hardcoded values in your drivetrain constructor and vision subsystem with these constants. Confirm the code builds and behavior is unchanged.
Implement the computeVisionStdDevs(double distanceMeters, int numTags) method in your VisionSubsystem. Add the average-distance computation from estimate.targetsUsed. Log Vision/StdDevXY, Vision/AvgTagDist, and Vision/NumTags to SmartDashboard. Deploy and watch the values change as you move the robot between different distances and positions with different tag counts visible.
Perform the odometry drift measurement: disable vision inputs, drive a 5-meter straight path and back to the start 5 times, record the odometry error each time, compute the mean and standard deviation of the errors. Use these measurements to set ODOMETRY_STD_DEVS appropriately and document your results in a code comment.
Test the heading failure deliberately: temporarily set VISION_THETA_STDDEV = 0.1 and watch the Field2d fused estimate while slowly rotating the robot to face different tags. Observe how the heading jumps. Restore to 9999 and verify the jumps stop. Write a comment in VisionConstants.java explaining what happens if this is set too low, so future team members understand why it's large.
Bonus: Implement a per-tag-count base stddev: use BASE_XY_STDDEV_SINGLE = 0.08 for 1 tag and BASE_XY_STDDEV_MULTI = 0.02 for 2+ tags, with the same distance² scaling. Compare landing accuracy at your scoring target between single-tag and multi-tag estimates. Do multi-tag estimates improve consistency? By how much? Record measurements in your VisionConstants.java as a comment.