Unit 10 · Lesson 10

Introduction to Game Piece Detection (ML Vision)

AprilTags are placed at known field positions. Game pieces are not. Finding an orange ring on carpet, distinguishing it from a cone in a shadow, or tracking a ball through a crowd of robots requires a different approach — machine learning. This lesson introduces neural network object detection and shows how to wire its output into autonomous decision-making.

By the end of this lesson, you will:

Explain the conceptual difference between rule-based detection (AprilTags) and learned detection (ML/neural networks)
Describe what a bounding box, class label, and confidence score represent in an object detection result
Outline the steps to train a custom model with Roboflow: collect images, annotate, train, export, deploy
Load and use a PhotonVision ML pipeline to detect game pieces and read detection results in Java
Convert a bounding box center pixel position to a horizontal angle (yaw) using camera calibration geometry
Build a simple autonomous "aim at game piece" behavior using ML detection output

Why Rule-Based Detection Fails for Game Pieces

AprilTags are engineered for detection. Their black-and-white high-contrast pattern is specifically designed to be unambiguously detected under varied lighting and at distance. The detection algorithm is deterministic — the same tag always produces the same result from the same angle, because the tag is a known mathematical pattern.

Game pieces are not engineered for detection. An orange ring looks different depending on lighting (overhead fluorescent vs. direct sunlight from skylights), viewing angle (face-on vs. edge-on), distance (full detail vs. a few pixels), and whether it's partially behind a field element, another robot, or in shadow. A rule-based approach — "find orange pixels above a brightness threshold" — breaks under exactly these real-competition conditions. Shadows make orange look brown. Jerseys in the crowd have orange. The robot's own mechanism, if orange, triggers false detections.

Machine learning object detection — specifically neural network models trained on hundreds or thousands of labeled examples — learns the appearance of the game piece across all these variations. The model doesn't look for "orange pixels"; it has internalized what a ring looks like from experience with real images, and it produces a confidence-weighted answer for every region of the frame.

Rule-based (AprilTag style)

Works on engineered targets with known patterns
Deterministic — same input, same output
No training required
Fails if the appearance varies (game pieces, natural objects)
Returns exact pose with geometry (solvePnP)
Very little tolerance for partial occlusion

ML object detection (game pieces)

Works on arbitrary visual objects — game pieces, field elements, robots
Probabilistic — returns confidence score, not certainty
Requires training data (images + labels)
Handles appearance variation, lighting changes, partial occlusion
Returns bounding box (pixel rectangle), not 3D pose
Fast enough for real-time at 30+ fps on coprocessors

What a Detection Result Contains

When a neural network detects an object in a camera frame, it returns three pieces of information for each detected instance:

Bounding box: A rectangle in pixel coordinates that encloses the detected object. Typically expressed as center X, center Y, width, and height — all in pixels. The bounding box tells you where in the frame the object is, not where it is in 3D space.
Class label: Which type of object was detected. A model trained to detect multiple types (e.g., "ring", "cone", "robot") returns which class each detected bounding box belongs to.
Confidence score: A value from 0.0 to 1.0 indicating how confident the model is that this is a true detection. Low-confidence detections (below ~0.5) are more likely to be false positives and should be filtered or treated cautiously.
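The three fields above can be held in a small value type. The following is a minimal sketch in plain Java: `Detection` and `DetectionFilter` are hypothetical names used only for illustration (they are not PhotonLib classes), and the 0.5 cutoff simply mirrors the rule of thumb above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical container for one detection result (not a PhotonLib type). */
final class Detection {
    final String label;       // class label, e.g. "ring"
    final double confidence;  // 0.0 to 1.0
    final double centerX;     // bounding box center, pixels
    final double centerY;
    final double width;       // bounding box size, pixels
    final double height;

    Detection(String label, double confidence,
              double centerX, double centerY, double width, double height) {
        this.label = label;
        this.confidence = confidence;
        this.centerX = centerX;
        this.centerY = centerY;
        this.width = width;
        this.height = height;
    }

    /** Left edge of the box in pixels, derived from center and width. */
    double left() { return centerX - width / 2.0; }

    /** Top edge of the box in pixels, derived from center and height. */
    double top() { return centerY - height / 2.0; }
}

final class DetectionFilter {
    /** Returns detections of the given class at or above minConfidence. */
    static List<Detection> keep(List<Detection> all, String label, double minConfidence) {
        List<Detection> kept = new ArrayList<>();
        for (Detection d : all) {
            if (d.label.equals(label) && d.confidence >= minConfidence) {
                kept.add(d);
            }
        }
        return kept;
    }
}
```

Calling `DetectionFilter.keep(detections, "ring", 0.5)` would drop both low-confidence rings and detections of other classes before any aiming logic runs.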

This is fundamentally different from AprilTag detection. AprilTags give you a 3D pose (camera-relative position and rotation). ML detection gives you a 2D pixel region and a label. To get a robot-relative direction to the game piece, you must convert the bounding box center to an angle using camera geometry — covered in the interactive below.

Bounding Box Explorer: From Pixels to Angles

Click the camera view on the left to place a game piece detection at different positions. The system computes the horizontal angle (yaw) to the object from camera geometry, and the approximate distance estimate from the bounding box height. Use this to understand how pixel position maps to drive commands.

[Interactive widget: a clickable camera view with adjustable object size, box height (default 45 px), and confidence (default 0.80). The detection output panel reports class, confidence, bbox center (px), horizontal yaw, estimated distance, and drive command.]

💡 Bounding box → angle, not pose

The horizontal angle to the game piece (yaw) tells you which direction to turn the robot to face the piece. It does not tell you how far away the piece is. Distance from a bounding box is a rough approximation at best — it depends on knowing the object's physical size and assuming it's oriented normally. For driving toward a game piece, the yaw is the primary control input (rotate until yaw ≈ 0°), with distance estimated from box height or derived from the time it takes to reach the piece. For precise game piece position (needed for PathPlanner navigation), fuse the camera angle with your odometry pose to project an approximate field position.
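The pixel-to-angle conversion described here can be sketched with an ideal pinhole camera model. This is an illustration under stated assumptions, not PhotonVision's internal implementation: `PixelGeometry` and its method names are hypothetical, lens distortion is ignored, the optical center is assumed to sit at the image center, and pixels are assumed square so one focal length serves both axes.

```java
/** Hypothetical helper: pinhole-camera conversions from pixels to angle and distance. */
final class PixelGeometry {

    /** Focal length in pixels, derived from the horizontal field of view. */
    static double focalLengthPx(double imageWidthPx, double hFovDeg) {
        return (imageWidthPx / 2.0) / Math.tan(Math.toRadians(hFovDeg / 2.0));
    }

    /** Yaw in degrees from the pinhole model; positive means right of image center. */
    static double yawDeg(double centerXPx, double imageWidthPx, double hFovDeg) {
        double offsetPx = centerXPx - imageWidthPx / 2.0;
        return Math.toDegrees(Math.atan2(offsetPx, focalLengthPx(imageWidthPx, hFovDeg)));
    }

    /** Linear shortcut: scales the pixel offset by half the FOV. Less accurate off-center. */
    static double linearYawDeg(double centerXPx, double imageWidthPx, double hFovDeg) {
        double halfWidth = imageWidthPx / 2.0;
        return (centerXPx - halfWidth) / halfWidth * (hFovDeg / 2.0);
    }

    /** Rough range from apparent size: D = objectHeight * focalLength / bboxHeight. */
    static double distanceFromHeightM(double objectHeightM, double bboxHeightPx, double focalPx) {
        return objectHeightM * focalPx / bboxHeightPx;
    }

    public static void main(String[] args) {
        double f = focalLengthPx(1280, 70);
        System.out.println("pinhole yaw (deg): " + yawDeg(480, 1280, 70));
        System.out.println("linear yaw (deg):  " + linearYawDeg(480, 1280, 70));
        System.out.println("distance (m):      " + distanceFromHeightM(0.10, 45, f));
    }
}
```

For a 1280-pixel-wide frame with a 70° horizontal FOV, a box centered at x = 480 works out to about −9.9° with the pinhole formula and −8.75° with the linear shortcut. The two agree at the image center and at the frame edges and differ most in between, which is one reason to prefer a calibrated yaw when one is available.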

The ML Detection Pipeline

From raw camera image to robot-usable data, game piece detection passes through five stages. Each stage is a potential source of latency, error, or configuration problems.

Camera acquisition: capture the frame

Same camera as your AprilTag pipeline — or a dedicated second camera. Game piece detection often benefits from a downward-angled camera (intake-side) rather than the forward-facing AprilTag camera. A camera at 30–45° downward angle from 0.4–0.6 m above the floor sees floor-level game pieces at 1–4 m range effectively.

Neural network inference: run the model on the frame

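The trained model processes each camera frame. For FRC use, YOLOv5 and YOLOv8 nano/small variants are typical because they are designed to run in real time on limited compute. Throughput depends heavily on hardware: a coprocessor with a neural accelerator (such as the NPU on an Orange Pi 5) can run a nano model at 640×640 input resolution in real time, while a CPU-only board such as a Raspberry Pi 4 typically manages only a few frames per second. Larger models are more accurate but slower, so there is always a tradeoff between accuracy and inference speed.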

Non-maximum suppression: remove duplicate detections

A single object often triggers multiple overlapping bounding boxes from the neural network. Non-maximum suppression (NMS) compares overlapping boxes and, when their overlap (intersection over union) exceeds a threshold, discards all but the highest-confidence box for each distinct object. The confidence threshold and NMS overlap threshold are configurable in PhotonVision's pipeline settings. A lower confidence threshold yields more detections but also more false positives.

Results published to NetworkTables

PhotonVision publishes detection results to NetworkTables under the same topic as AprilTag detections: the camera's pipeline results. Each target includes the bounding box, a class ID (an index into the model's label list), and a confidence score. Your robot code reads these using the same PhotonCamera.getLatestResult() API from Lesson 3, but reads different PhotonTrackedTarget fields for ML detections than for AprilTag detections.

Robot code: convert detection to robot action

The bounding box center pixel coordinates are converted to a camera-relative angle using the camera's field of view. This angle becomes a drive target — rotate until the game piece is centered. Alternatively, project the angle forward with distance estimation to get an approximate field position for PathPlanner navigation (Lesson 9 on-the-fly paths).
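The non-maximum suppression stage in the pipeline above runs inside PhotonVision, so robot code never implements it. Seeing the logic can still help when tuning the two thresholds. The following is a minimal greedy NMS sketch in plain Java; `Box` and `Nms` are hypothetical names, and real detectors may differ in details such as per-class suppression.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical axis-aligned box with corner coordinates in pixels. */
final class Box {
    final double x1, y1, x2, y2, confidence;

    Box(double x1, double y1, double x2, double y2, double confidence) {
        this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        this.confidence = confidence;
    }

    double area() { return Math.max(0, x2 - x1) * Math.max(0, y2 - y1); }
}

final class Nms {
    /** Intersection over union: 0 for disjoint boxes, 1 for identical boxes. */
    static double iou(Box a, Box b) {
        double iw = Math.max(0, Math.min(a.x2, b.x2) - Math.max(a.x1, b.x1));
        double ih = Math.max(0, Math.min(a.y2, b.y2) - Math.max(a.y1, b.y1));
        double inter = iw * ih;
        double union = a.area() + b.area() - inter;
        return union <= 0 ? 0 : inter / union;
    }

    /** Greedy NMS: visit boxes from most to least confident, keep those that
     *  do not overlap an already-kept box by more than iouThreshold. */
    static List<Box> suppress(List<Box> boxes, double minConfidence, double iouThreshold) {
        List<Box> sorted = new ArrayList<>();
        for (Box b : boxes) {
            if (b.confidence >= minConfidence) sorted.add(b);
        }
        sorted.sort(Comparator.comparingDouble((Box b) -> b.confidence).reversed());

        List<Box> kept = new ArrayList<>();
        for (Box candidate : sorted) {
            boolean duplicate = false;
            for (Box k : kept) {
                if (iou(candidate, k) > iouThreshold) { duplicate = true; break; }
            }
            if (!duplicate) kept.add(candidate);
        }
        return kept;
    }
}
```

Raising `iouThreshold` lets closely spaced pieces survive as separate detections at the cost of more duplicates; lowering it does the opposite.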

Training a Custom Model with Roboflow

Roboflow is the most accessible platform for FRC teams to collect, annotate, and train object detection models. Many FRC game piece models from the community are already available on Roboflow Universe — before building a custom model, check if a model for the current game's pieces already exists.

Collect images — variety is more important than quantity

Capture 200–500 images of the game piece from your camera at various distances (0.5–4 m), angles, lighting conditions (shop, practice field, outdoor, dim), and with real competition distractors — other robots nearby, field structures in background, partially occluded pieces. Include images where the piece is NOT present (to teach the model what a true negative looks like). The model can only handle situations similar to its training data — images from your exact competition environment are more valuable than generic images.

Annotate: draw bounding boxes around each game piece

In Roboflow's annotation tool, draw a tight bounding box around each game piece in every image. Assign the class label (e.g., "ring", "cone", "ball"). Consistency matters — a bounding box that includes too much background trains the model to associate irrelevant pixels with the object. For FRC pieces, aim for a box that is 5–15% larger than the physical piece on each side. Roboflow supports team annotation sharing — multiple annotators speed up large datasets.

Augment and train

Roboflow automatically applies augmentations (brightness variation, horizontal flip, rotation, crop) to artificially expand the training set. Train using the YOLOv8 nano or small architecture, which can be exported to ONNX for deployment. Training on Roboflow's cloud GPU takes 10–30 minutes for a typical FRC dataset. After training, review the mAP (mean Average Precision) score: above 0.85 is typically competition-ready for a single-class detector.

Export as ONNX and deploy to PhotonVision

Export the trained model in ONNX format from Roboflow. In PhotonVision, create an ML pipeline and upload the model file through the dashboard. Check the PhotonVision documentation for the model format your coprocessor requires: NPU-based boards such as the Orange Pi 5 need the ONNX export converted to RKNN format before upload. Specify the class labels (in the same order as your Roboflow label map). PhotonVision handles the inference, and your robot code reads results through the same PhotonLib API. Test by pointing the camera at a game piece and confirming green detection boxes appear in the stream.

🔍 Community models save weeks of work

The FRC community publishes trained models on Roboflow Universe (universe.roboflow.com) every season. Search for the current year's game piece names, for example "2025 REEFSCAPE coral" or "2024 Crescendo note". A community model with 500+ annotated images trained on competition lighting is often better than a model you train yourself in a week, and it's available immediately. Teams like 6328 (AdvantageKit authors) and others publish their vision models publicly. Check before spending time building your own, and contribute back if you improve on what you find.

Reading ML Detections in Java

PhotonVision's ML pipeline publishes results using the same PhotonCamera API as AprilTags. The difference is in which fields of PhotonTrackedTarget you read. ML targets don't have a tag ID or solvePnP pose. Instead they carry bounding box corners, a class ID (getDetectedObjectClassID()), and a confidence score (getDetectedObjectConfidence()).

GamePieceSubsystem.java — reading ML detection results

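import java.util.Comparator;
import java.util.Optional;

import edu.wpi.first.wpilibj.smartdashboard.SmartDashboard;
import edu.wpi.first.wpilibj2.command.SubsystemBase;
import org.photonvision.PhotonCamera;
import org.photonvision.targeting.PhotonPipelineResult;
import org.photonvision.targeting.PhotonTrackedTarget;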

public class GamePieceSubsystem extends SubsystemBase {

    // Dedicated ML pipeline camera — same API, different pipeline type
    private final PhotonCamera m_mlCamera = new PhotonCamera("IntakeCamera");

    // Camera horizontal FOV — from calibration or camera spec sheet
    // Used to convert bounding box center to horizontal angle
    private static final double CAMERA_H_FOV_DEG = 70.0;   // degrees
    private static final double CAMERA_WIDTH_PX  = 1280.0;  // pixels
    private static final double GAME_PIECE_HEIGHT_M = 0.10; // game piece physical height
    private static final double CAMERA_HEIGHT_M  = 0.50;  // camera height above floor

    private Optional<PhotonTrackedTarget> m_bestTarget = Optional.empty();
    private double m_targetYawDeg = 0;
    private double m_estimatedDist = 0;

    @Override
    public void periodic() {
        PhotonPipelineResult result = m_mlCamera.getLatestResult();

        if (!result.hasTargets()) {
            m_bestTarget = Optional.empty();
            return;
        }

        // Get the highest-confidence detection among all visible game pieces.
        // Object-detection targets expose confidence via getDetectedObjectConfidence().
        m_bestTarget = result.getTargets().stream()
            .filter(t -> t.getDetectedObjectConfidence() > 0.5)  // reject low-confidence detections
            .max(Comparator.comparingDouble(PhotonTrackedTarget::getDetectedObjectConfidence));

        m_bestTarget.ifPresent(target -> {
            // For ML targets, yaw (horizontal angle) is the primary useful output.
            // PhotonVision computes this from the bounding box center using the
            // camera's calibration. Positive yaw = target is to the right of center.
            m_targetYawDeg = target.getYaw();

            // Rough distance estimate from bounding box height.
            // D ≈ (object_height_meters × camera_focal_length_px) / bbox_height_px
            // This is an approximation; use only for speed/approach decisions.
            double maxY = target.getDetectedCorners().stream()
                .mapToDouble(c -> c.y).max().orElse(0);
            double minY = target.getDetectedCorners().stream()
                .mapToDouble(c -> c.y).min().orElse(0);
            double bboxHeight = maxY - minY;
            double focalLengthPx = CAMERA_WIDTH_PX / (2 *
                Math.tan(Math.toRadians(CAMERA_H_FOV_DEG / 2)));
            // Keep the previous estimate if no box height is available,
            // which avoids dividing by zero.
            if (bboxHeight > 0) {
                m_estimatedDist = (GAME_PIECE_HEIGHT_M * focalLengthPx) / bboxHeight;
            }

            SmartDashboard.putNumber("GamePiece/Yaw",        m_targetYawDeg);
            SmartDashboard.putNumber("GamePiece/Dist",       m_estimatedDist);
            SmartDashboard.putNumber("GamePiece/Confidence", target.getDetectedObjectConfidence());
        });
    }

    public boolean hasGamePiece() { return m_bestTarget.isPresent(); }
    public double  getTargetYaw() { return m_targetYawDeg; }
    public double  getEstimatedDistance() { return m_estimatedDist; }
}

Using Game Piece Detection in Autonomous

The most common autonomous use of ML detection is a "drive toward nearest game piece" behavior. The yaw from the detection becomes the target for a rotation PID controller, and the distance estimate determines when to stop driving forward.

AimAtGamePieceCommand.java — rotation control from ML yaw

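import edu.wpi.first.math.MathUtil;
import edu.wpi.first.math.controller.PIDController;
import edu.wpi.first.math.kinematics.ChassisSpeeds;
import edu.wpi.first.wpilibj2.command.Command;

public class AimAtGamePieceCommand extends Command {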

    private final DriveSubsystem     m_drive;
    private final GamePieceSubsystem m_gamePiece;
    private final PIDController      m_yawController;

    public AimAtGamePieceCommand(
            DriveSubsystem drive,
            GamePieceSubsystem gamePiece) {
        m_drive = drive;
        m_gamePiece = gamePiece;
        // Rotate to zero yaw (game piece centered in frame)
        // Tune kP so robot rotates toward piece without oscillating
        m_yawController = new PIDController(0.04, 0, 0.002);
        m_yawController.setSetpoint(0.0);   // target: game piece centered
        m_yawController.setTolerance(2.0);  // degrees — acceptable aim error
        addRequirements(m_drive);
    }

    @Override
    public void execute() {
        if (!m_gamePiece.hasGamePiece()) {
            // No detection — rotate slowly to search
            m_drive.driveRobotRelative(new ChassisSpeeds(0, 0, 0.5));
            return;
        }

        double yaw      = m_gamePiece.getTargetYaw();
        double distance = m_gamePiece.getEstimatedDistance();

        // Rotation: PID on yaw — rotate until game piece is centered
        // Clamped to avoid excessive rotation speed
        double rotation = MathUtil.clamp(
            m_yawController.calculate(yaw), -2.5, 2.5);

        // Forward drive: proportional to distance, stops near the piece
        // Only drive forward if robot is roughly aimed at the piece
        double forward = (Math.abs(yaw) < 15) ?
            MathUtil.clamp(distance * 0.5, 0, 2.5) : 0;

        m_drive.driveRobotRelative(new ChassisSpeeds(forward, 0, rotation));
    }

    @Override
    public boolean isFinished() {
        // Done when aimed accurately AND within intake range
        return m_gamePiece.hasGamePiece()
            && m_yawController.atSetpoint()
            && m_gamePiece.getEstimatedDistance() < 0.3;
    }

    @Override
    public void end(boolean interrupted) {
        m_drive.driveRobotRelative(new ChassisSpeeds());
    }
}

💡 Combine ML detection with on-the-fly PathPlanner paths

Once you have a detection yaw and distance estimate, you can project an approximate field-relative position for the game piece by combining the camera angle with the robot's current field pose and known camera geometry. This field position becomes the target for AutoBuilder.pathfindToPose() (Lesson 9 of Unit 9) — giving you a smooth, physics-constrained path to the detected piece rather than a simple rotate-and-drive behavior. The game piece position will drift as the robot moves and the camera updates, so generate a new pathfindToPose() command whenever the estimated position changes significantly, interrupting the previous one.
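The projection described above can be sketched with basic trigonometry. This is an illustration under several assumptions, all of which would need adjusting for a real robot: the camera sits at the robot's center and faces straight forward, the distance estimate is treated as distance along the floor, robot heading is counterclockwise-positive, and target yaw is positive to the right. `FieldProjection` is a hypothetical helper that uses plain doubles so it runs without WPILib; in robot code the result would typically be wrapped in a Translation2d.

```java
/** Hypothetical helper: project a detection's yaw and distance onto the field. */
final class FieldProjection {

    /**
     * @param robotX          robot X on the field, meters
     * @param robotY          robot Y on the field, meters
     * @param robotHeadingDeg robot heading, degrees, counterclockwise-positive
     * @param targetYawDeg    detection yaw, degrees, positive to the right
     * @param distanceM       estimated floor distance to the piece, meters
     * @return {x, y} estimate of the game piece position, meters
     */
    static double[] project(double robotX, double robotY, double robotHeadingDeg,
                            double targetYawDeg, double distanceM) {
        // A target to the right (positive yaw) lies clockwise of the heading,
        // so the yaw is subtracted from the counterclockwise-positive heading.
        double bearingRad = Math.toRadians(robotHeadingDeg - targetYawDeg);
        double x = robotX + distanceM * Math.cos(bearingRad);
        double y = robotY + distanceM * Math.sin(bearingRad);
        return new double[] {x, y};
    }

    public static void main(String[] args) {
        double[] p = project(2.0, 3.0, 90.0, 0.0, 1.5);
        System.out.println("piece at x=" + p[0] + " y=" + p[1]);
    }
}
```

Because the distance input is only a rough estimate, the projected position inherits that error, which is one reason to regenerate the path as new detections arrive.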

🔌 System Check

⚙️ Before Using ML Detection in Autonomous

ML pipelines have different failure modes from AprilTag pipelines. Check these before relying on game piece detection in matches:

Model is loaded in PhotonVision and the correct pipeline is active. Open the dashboard and confirm the ML pipeline (not the AprilTag pipeline) is selected for the intake camera. The stream should show inference boxes over detected game pieces. If no boxes appear but the pipeline shows as "running," the model file may not have loaded correctly — redeploy from the dashboard's model management section.
Confidence threshold is set appropriately. In PhotonVision's ML pipeline settings, check the confidence threshold (typically 0.45–0.65). Too low: false positives from the carpet, shadows, or other orange objects. Too high: misses pieces at distance or in shadow. Tune on your actual competition field if possible, or at minimum on carpet under fluorescent lighting.
Detection works in competition lighting conditions. Competition venues have very different lighting from your shop. If possible, test the model before the competition at the venue during setup day. Common failure: model trained on shop images doesn't handle bright skylight overhead or direct LED competition lighting. Add competition-condition images to the training set if failures occur.
Robot code falls back gracefully when no game piece is detected. Test the autonomous routine with no game piece visible. The robot should either search (rotate slowly), wait, or transition to an alternate path — not stop or error. Check that hasGamePiece() returning false is handled in every code path that uses game piece detection.
Detection doesn't false-positive on the robot's own mechanisms. If any part of your robot is the same color as the game piece (orange intake rollers for an orange ring game, yellow structural elements for a yellow ball game), position the camera so the robot's own body isn't in the bottom of the frame. A model that sees the intake as a "game piece" will never finish the pickup command.

Knowledge Check

1. An ML detection pipeline returns a bounding box with center at pixel (480, 360) in a 1280×720 frame with a 70° horizontal field of view. The pixel center is to the left of the image center (640). Approximately what horizontal yaw angle does this detection correspond to, and which direction should the robot rotate to aim at the game piece?

A +8.75°: rotate clockwise (right) to aim at the piece
B −8.75°: the piece is left of center (480 vs. 640 center); offset = 480−640 = −160 px out of ±640 px half-width; by linear approximation, yaw ≈ (−160/640) × 35° ≈ −8.75° (the exact pinhole formula, atan(−160/f) with f = 640/tan 35° ≈ 914 px, gives about −9.9°); rotate counterclockwise (left) to center it
C −12.5°: based on the full 70° FOV divided by image width
D The yaw cannot be determined without camera calibration

2. A team's ML game piece detector is producing false positives on an orange sponsor banner in the background behind the scoring structure. Their model was trained only on images of game pieces on the floor. What is the most effective fix?

A Increase the confidence threshold to 0.95 to reject all but the most obvious detections
B Add labeled images of the false positive context (orange banners, field graphics) to the training set as negative examples (with no bounding box annotations), retrain the model, and/or add bounding box annotations of the banners labeled as a separate "not-game-piece" class so the model learns to distinguish them
C Switch from YOLO to an AprilTag-based detection approach for more precision
D Add a distance filter — only accept detections closer than 2 m by bounding box size

3. An AimAtGamePieceCommand is running during autonomous. The game piece is detected at yaw = −25°. The command rotates the robot left. After 0.5 seconds, the yaw reading jumps to +30° for two frames, then returns to −22°. What is the most likely cause, and how should this be handled in the command?

A The PID controller's kD term is causing derivative kick — reduce kD to zero
B A false positive detection appeared at +30° (a different object briefly classified as the game piece) for two frames; the yaw jumping to the opposite side of the frame is characteristic of a momentary false detection; the fix is to add a low-pass filter or debounce on the target yaw — only use a yaw value if it was consistent for 2–3 consecutive frames, or limit the maximum yaw change per loop
C The game piece physically moved 55° in 0.5 seconds — increase the robot's rotation speed
D The camera frame rate is too low — increase to 120fps to reduce detection latency
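The "limit the maximum yaw change per loop" idea from question 3 can be sketched as a small step-limiting filter. This is one possible approach, not the only one: `YawFilter` is a hypothetical class, and the step limit would need tuning against how fast the robot can really rotate per loop.

```java
/** Hypothetical filter: limits how far the reported yaw may move per update. */
final class YawFilter {
    private final double maxStepDeg;
    private double filteredDeg;
    private boolean initialized = false;

    YawFilter(double maxStepDeg) {
        this.maxStepDeg = maxStepDeg;
    }

    /** Feeds one raw yaw sample and returns the step-limited yaw. */
    double update(double rawYawDeg) {
        if (!initialized) {
            // The first sample is accepted as-is so the filter starts on target.
            filteredDeg = rawYawDeg;
            initialized = true;
            return filteredDeg;
        }
        double step = rawYawDeg - filteredDeg;
        step = Math.max(-maxStepDeg, Math.min(maxStepDeg, step));
        filteredDeg += step;
        return filteredDeg;
    }

    /** Call when the target is lost so the next sample re-seeds the filter. */
    void reset() {
        initialized = false;
    }
}
```

With a 5° step limit, a two-frame jump from −25° to +30° moves the filtered value only 10° before the true reading pulls it back, so the rotation controller never sees the sign flip.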

💪 Practice Prompt

Build a Game Piece Detection Pipeline

Search Roboflow Universe (universe.roboflow.com) for a model trained on the current FRC season's game pieces. Download the best-available model in ONNX format. If no good model exists for the current season, use last season's game piece to learn the workflow. Import it into PhotonVision's ML pipeline and verify detections appear in the stream.
Implement the GamePieceSubsystem from this lesson. Add the confidence filter (reject below 0.5). Log GamePiece/Yaw, GamePiece/Dist, and GamePiece/Confidence to SmartDashboard. Deploy and verify the values update when you hold a game piece in front of the camera at various distances and angles.
Implement AimAtGamePieceCommand. Place a game piece 2 meters in front of the robot, offset 30° to the right. Command the robot to aim in place on the floor (verify the rotation direction first with the wheels elevated) and confirm the yaw error decreases to within ±3° in under 2 seconds. Tune the PID kP value until the response is fast without oscillation.
Test the no-detection fallback: remove the game piece from the camera's view while the command is running. Confirm the robot performs the search behavior (slow rotation) rather than stopping or erroring. Restore the piece and confirm the command resumes aiming.
Bonus: Project the game piece's approximate field position. Using your robot's current field pose (m_drive.getPose()), the camera's horizontal FOV, and the target yaw from detection, compute an estimated Translation2d for where the game piece is on the field. Pass this to AutoBuilder.pathfindToPose() to navigate to it rather than using rotate-and-drive. Log the projected position to Field2d and verify it moves realistically as you move the physical game piece.