Case Study

Codifying Agentic Engineering for a Platform Team

How I turned a personal AI-assisted workflow into infrastructure the whole team can install, so we share working patterns instead of each reinventing them.

The Challenge

AI-assisted engineering made me dramatically faster. The problem is that a personal workflow does not scale to a team. The work I was doing in my head every day (push the branch, draft a structured PR description, run review, fix what is real, reply to comments, update the tracker) is mechanical but easy to get wrong. And when every engineer does it a little differently, the team has no shared baseline to improve from.

The goal was to codify the working pattern as infrastructure the whole team could install, so the good version of the workflow becomes the default rather than a thing each person has to remember to do.

The Approach

I built an internal Claude Code plugin marketplace at HMH. Teammates add the marketplace once and install plugins from it, and updates flow automatically after that. The flagship plugin, git-workflows, turns the pull request workflow into two skills, two commands, and a hook:

  • ship-pr, one command that runs the entire PR lifecycle. It pushes, drafts a structured description, runs whichever reviewers the repo supports (each one self-detects by capability rather than by a config switch), fixes legitimate findings, replies to comments, refreshes the description after every push, and posts an acceptance-criteria coverage comment back to the linked Jira story.
  • pr-description, which discovers the repo's own PR templates, recommends the best fit from the branch name, the changed paths, and the commits, fleshes it out, and injects machine context into sentinel-marked regions (HTML comments) so any template keeps its human narrative untouched.
  • a config command that prints every resolved setting alongside its provenance: default, detected, user preference, or team policy. Configuration is an opinion cascade, so it should be legible.
  • a setup command that seeds a repo's pull request templates from a shared profile.
  • a hook that refreshes the PR's diff breakdown automatically on every push.
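The provenance idea behind the config command can be sketched as a layered lookup. This is a minimal illustration, not the plugin's code. The precedence order (default, then detected, then user preference, then team policy, with later layers winning), the function names, and the setting names are all assumptions made for the example.

```python
from typing import Any, NamedTuple

# Assumed order, lowest to highest precedence. The four labels come from the
# text above; which one wins is an assumption made for this sketch.
LAYERS = ("default", "detected", "user preference", "team policy")


class Resolved(NamedTuple):
    value: Any
    provenance: str


def resolve_settings(layers: dict[str, dict[str, Any]]) -> dict[str, Resolved]:
    """Merge the layers in order, remembering which layer supplied each value."""
    resolved: dict[str, Resolved] = {}
    for name in LAYERS:
        for key, value in layers.get(name, {}).items():
            # A later layer overrides the value set by an earlier one.
            resolved[key] = Resolved(value, name)
    return resolved


def format_settings(resolved: dict[str, Resolved]) -> str:
    """Render one 'key = value  (provenance)' line per setting, sorted by key."""
    return "\n".join(
        f"{key} = {entry.value!r}  ({entry.provenance})"
        for key, entry in sorted(resolved.items())
    )


# Hypothetical setting names, for illustration only.
example = resolve_settings({
    "default": {"base_branch": "main", "reviewers": []},
    "detected": {"base_branch": "develop"},
    "team policy": {"reviewers": ["copilot"]},
})
print(format_settings(example))
```

Keeping the provenance next to the value is what makes the cascade legible: anyone can see not just what a setting is, but which layer decided it.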
[Figure: Component map of the git-workflows plugin: setup and config commands, the ship-pr and pr-description skills, and a push hook all coordinate through shared library helpers, with sentinel-marked regions defining which writer owns which part of the PR body.]
The component map from the plugin's README. Commands and skills lean on shared, tested helpers, and the sentinel contract spells out exactly who may write which region of the PR body.

Here is what that adds up to in practice. Shipping a PR takes me under five minutes from a finished branch, and it produces the same outcome every time: a fleshed-out description on the repo's own template, Copilot review comments assessed and addressed rather than skimmed, the Jira story updated with which acceptance criteria the PR covers, and the full agentic context tucked into a collapsed details block at the bottom of the description. The description stays useful to the engineer reading it, and the breadcrumbs agentic tooling needs are right there for whatever picks the work up next.

[Figure: Flowchart of a full ship-pr run, preflight to final verification: preflight checks and hard stops, capability probes for Copilot, Codex, and the tracker, PR creation with a structured description, review evaluation and fixes, a description refresh, Copilot replies, an acceptance-criteria coverage comment on the Jira story, and final verification.]

The ship-pr flowchart from the plugin's README. Every gate is a capability probe, so one command adapts to whatever the repo supports. The plugin hides machine context behind a details collapse in every PR it writes, so its own flowchart gets the same treatment here.
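As a rough sketch of what "every gate is a capability probe" can look like, each step below carries its own probe, and the runner executes only the steps whose probes pass. The `Step` type, the step names, and the hard-coded probe answers are hypothetical stand-ins, not the plugin's implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    name: str
    probe: Callable[[], bool]  # answers "does this repo support the step?"
    run: Callable[[], None]


def run_supported(steps: list[Step]) -> tuple[list[str], list[str]]:
    """Run every step whose probe passes; return (ran, skipped) step names."""
    ran: list[str] = []
    skipped: list[str] = []
    for step in steps:
        if step.probe():
            step.run()
            ran.append(step.name)
        else:
            skipped.append(step.name)
    return ran, skipped


# Hypothetical probes with fixed answers, standing in for real checks.
log: list[str] = []
steps = [
    Step("copilot-review", lambda: True, lambda: log.append("copilot")),
    Step("codex-review", lambda: False, lambda: log.append("codex")),
    Step("tracker-comment", lambda: True, lambda: log.append("tracker")),
]
ran, skipped = run_supported(steps)
print("ran:", ran)
print("skipped:", skipped)
```

Because the decision lives in the probe rather than in a config switch, a repo without a given reviewer simply skips that step instead of failing.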

A few principles held the whole thing together:

  • Detection over assumption. Never assume the default branch is main, and parse the forge from the remotes rather than guessing.
  • Team opinions packaged as adoptable profiles, so a team's preferences travel with the plugin.
  • Deterministic logic lives in tested scripts, so it runs the same way every time.
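The first principle can be illustrated with two small helpers. This is a sketch under assumptions, not the plugin's scripts: it assumes the remote is named `origin` and that `refs/remotes/origin/HEAD` is set locally, and the function names are hypothetical.

```python
import re
import subprocess
from typing import Optional


def parse_forge_host(remote_url: str) -> Optional[str]:
    """Extract the host from an scp-style (user@host:path) or URL-style remote."""
    match = re.match(r"^[\w.-]+@([\w.-]+):", remote_url)
    if match:
        return match.group(1).lower()
    match = re.match(
        r"^[a-z][a-z0-9+.-]*://(?:[^@/]+@)?([^/:]+)", remote_url, re.IGNORECASE
    )
    if match:
        return match.group(1).lower()
    return None  # a local path or an unrecognized format: do not guess


def default_branch_from_ref(symbolic_ref: str) -> str:
    """Turn 'refs/remotes/origin/main' into 'main'."""
    prefix = "refs/remotes/origin/"
    ref = symbolic_ref.strip()
    if not ref.startswith(prefix):
        raise ValueError(f"unexpected ref: {ref!r}")
    return ref[len(prefix):]


def detect_default_branch(repo_dir: str = ".") -> str:
    """Ask git where origin/HEAD points instead of assuming 'main'."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "symbolic-ref", "refs/remotes/origin/HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return default_branch_from_ref(out)
```

Splitting the parsing from the `git` call keeps the deterministic part testable without a repository, which is the third principle in miniature.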

I dogfooded the plugin on its own pull requests through five releases before asking anyone else to install it. That dogfooding surfaced real failures that synthetic tests never caught. Underneath it sits a suite of hundreds of tests and a CI check that keeps the version pinned consistently everywhere it appears.
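A version-consistency check of this kind can be quite small. The sketch below is an assumption about the shape of such a check, not the actual CI script: the file names and the `"version": "..."` pattern are hypothetical.

```python
import re

# Assumed convention: each file carries a JSON-style "version": "x.y.z" field.
VERSION_RE = re.compile(r'"version"\s*:\s*"([^"]+)"')


def find_versions(files: dict[str, str]) -> dict[str, str]:
    """Map each file name to the first version string found in its text."""
    found: dict[str, str] = {}
    for name, text in files.items():
        match = VERSION_RE.search(text)
        if match is None:
            raise ValueError(f"no version field in {name}")
        found[name] = match.group(1)
    return found


def check_pinned(files: dict[str, str]) -> list[str]:
    """Return one 'file: version' line per file if they disagree, else []."""
    versions = find_versions(files)
    if len(set(versions.values())) <= 1:
        return []
    return [f"{name}: {version}" for name, version in sorted(versions.items())]


# Hypothetical file names and contents.
problems = check_pinned({
    "plugin.json": '{"name": "git-workflows", "version": "1.4.0"}',
    "marketplace.json": '{"plugins": [{"version": "1.3.9"}]}',
})
print(problems)
```

In CI, a non-empty result would fail the build, so a release cannot ship with one file still advertising the previous version.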

The Impact

  • Under 5 min: one command from a finished branch to a reviewed, described, and tracked PR
  • 5 releases: dogfooded on itself before team rollout
  • Any repo: adapts by detection, never by assumption
  • Hundreds of tests: behind the workflow the team installs

The marketplace is how the team now shares working patterns: you install a proven one and adapt it to your repo. The same approach (prototype on yourself, codify what works, then publish it) is becoming the team's playbook for agentic engineering. The git-workflows plugin is the foundation of that playbook, with plenty still on its roadmap.

Lessons Learned

A workflow becomes a standard when it is easier to install than to improvise. The fastest way to get adoption was to make the good path the path of least resistance. Paved roads beat mandates, and this project proved it again.

Sentinels work as contracts. Machines own the clearly marked regions, humans own the narrative, and any template keeps working. Drawing that boundary explicitly is what let one tool play nicely with every team's existing templates.
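A minimal sketch of the sentinel contract, assuming paired HTML comments of the form `<!-- name:begin -->` and `<!-- name:end -->`. The marker format, the region name, and the function name are hypothetical; the source only says the regions are delimited by HTML comments.

```python
import re


def replace_region(body: str, name: str, content: str) -> str:
    """Rewrite only the text between the named sentinels.

    Everything outside the markers is left untouched. If the region does not
    exist yet, it is appended to the end of the body.
    """
    begin = f"<!-- {name}:begin -->"
    end = f"<!-- {name}:end -->"
    block = f"{begin}\n{content}\n{end}"
    pattern = re.compile(re.escape(begin) + r".*?" + re.escape(end), re.DOTALL)
    if pattern.search(body):
        # A function replacement avoids backslash handling in the new content.
        return pattern.sub(lambda _match: block, body, count=1)
    return body.rstrip("\n") + "\n\n" + block + "\n"


body = (
    "## Summary\n"
    "Human narrative.\n\n"
    "<!-- diff-breakdown:begin -->\n"
    "stale content\n"
    "<!-- diff-breakdown:end -->\n"
)
updated = replace_region(body, "diff-breakdown", "3 files changed")
print(updated)
```

Running the same update twice produces the same body, which is what lets a hook refresh a region on every push without disturbing the human-written text around it.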

Dogfooding is the test suite that synthetic tests cannot replace. Every release closed a finding that came from real use. No fixture I wrote in advance would have caught those.

Cole Conrad

Principal Platform Engineer

I build platforms teams rely on to ship. If this work maps to a problem you are trying to solve, I would enjoy the conversation.