Developer Tool•2024

AI-Powered Bug Fix Automation

From error detection to pull request in seconds

An automation system that converts accessibility and performance issues into GitHub pull requests. A Chrome extension extracts issue data from error monitoring tools, sanitizes PII, enriches with AI analysis, and creates PRs automatically.

RoleLead Developer

Tech Stack

TypeScriptExpressChrome Extension+3 more

The Problem

Developers spent 10-30 minutes per bug report on repetitive triage: reading the error, finding the file, understanding context, creating a PR. With dozens of daily issues from monitoring tools, this was a massive productivity drain.

10-30 minutes of manual work per bug report
Inconsistent issue descriptions and fix approaches
Sensitive customer data in error logs risked exposure to LLMs
Context switching between monitoring tool, codebase, and GitHub

Constraints

PII must never reach LLM APIs (customer emails, IPs, session IDs)
Extension must work within Manifest V3 restrictions
Low latency, under 15 seconds end-to-end or developers won't use it
Must integrate with existing GitHub workflow (PRs, not commits)

The Approach

Three-phase architecture: Chrome extension captures and signs requests with HMAC, Express backend sanitizes PII and enriches with Claude Haiku, then creates PRs via AI agent API. Deterministic sanitization enables 24-hour caching.

Why Chrome extension over bookmarklet or browser automation?

Manifest V3 extensions have persistent service workers and storage APIs. Bookmarklets can't persist config. Browser automation (Playwright) would require always-running process and can't inject UI elements.

Why HMAC authentication between extension and backend?

Extension runs in user's browser, can't trust origin alone. HMAC with shared secret proves request came from our extension. Timing-safe comparison prevents timing attacks on signature validation.

Why Claude Haiku specifically?

Cost: $0.25/1M input, $1.25/1M output tokens, about $0.0008 per request. Speed: ~7-8 second latency. Quality: Haiku handles structured output (JSON schema) reliably. Temperature 0.2 for deterministic output.

Why deterministic PII sanitization?

Same input always produces same sanitized output (REDACTED_EMAIL, REDACTED_IP, etc.). Enables 24-hour caching by payload hash, duplicate errors don't hit LLM twice. Also provides audit trail.

Key Tradeoffs

Aggressive PII sanitization may remove useful context

Security over convenience. Email addresses and IPs can't help LLM understand code bugs. Session IDs are replaced, not removed entirely, preserving structure.

Single LLM (Claude) instead of model-agnostic

Optimizing prompts for one model is more effective than generic prompts. Claude's structured output (JSON mode) is reliable. Can add fallback to GPT later if needed.

Backend required (not extension-only)

Manifest V3 restrictions prevent API keys in extension. Backend enables PII sanitization, caching, and audit logging that extension can't do securely.

Implementation Highlights

HMAC request signing with Web Crypto API

Extension signs payload with SHA-256 HMAC using Web Crypto (browser) API. Backend validates with timing-safe comparison to prevent timing attacks. Shared secret stored in extension config.

Deterministic PII sanitization

Regex patterns for emails, credit cards, session IDs, phone numbers, IPv4/IPv6. Bounded quantifiers prevent ReDoS. Output is always identical for same input, enabling cache keying.

GitHub file verification before LLM

Uses GitHub API to verify suspected files exist in repo before asking LLM. Prevents hallucinated file paths. Octokit integration searches by filename patterns from stack trace.

Structured prompt with few-shot examples

System prompt includes JSON schema, target context (stack, framework), and two detailed examples. Output schema enforces task_title (50 chars), summary (400 chars), reproduction steps, fix plan.

24-hour deduplication cache

Payload hash (URL + errorType + title) keys into cache. Identical errors within 24 hours return cached LLM response. Reduces costs and prevents duplicate PRs from flaky errors.

Outcomes

3-phase

Architecture

Extension capture → backend sanitization → AI agent PR creation

24 hours

Cache TTL

Deduplication prevents repeat processing of same errors

What I'd Do Differently

Build better feedback loop. Currently no way to mark generated PR as good/bad. Would improve prompts faster with explicit quality signals.
Add multi-model fallback earlier. Claude outages would block all automation. Having GPT-4 fallback would improve reliability.
Implement confidence thresholds. Low-confidence file locations should create issues, not PRs. Current approach creates some PRs that need significant manual editing.

From error detection to pull request in seconds

RoleLead Developer

Tech Stack

TypeScriptExpressChrome Extension+3 more

10-30 minutes of manual work per bug report

Inconsistent issue descriptions and fix approaches

Sensitive customer data in error logs risked exposure to LLMs

Context switching between monitoring tool, codebase, and GitHub

Why Chrome extension over bookmarklet or browser automation?

Why HMAC authentication between extension and backend?

Extension runs in user's browser, can't trust origin alone. HMAC with shared secret proves request came from our extension. Timing-safe comparison prevents timing attacks on signature validation.

Why Claude Haiku specifically?

Why deterministic PII sanitization?

Same input always produces same sanitized output (REDACTED_EMAIL, REDACTED_IP, etc.). Enables 24-hour caching by payload hash, duplicate errors don't hit LLM twice. Also provides audit trail.

Aggressive PII sanitization may remove useful context

Security over convenience. Email addresses and IPs can't help LLM understand code bugs. Session IDs are replaced, not removed entirely, preserving structure.

Single LLM (Claude) instead of model-agnostic

Optimizing prompts for one model is more effective than generic prompts. Claude's structured output (JSON mode) is reliable. Can add fallback to GPT later if needed.

Backend required (not extension-only)

Manifest V3 restrictions prevent API keys in extension. Backend enables PII sanitization, caching, and audit logging that extension can't do securely.

HMAC request signing with Web Crypto API

Extension signs payload with SHA-256 HMAC using Web Crypto (browser) API. Backend validates with timing-safe comparison to prevent timing attacks. Shared secret stored in extension config.

Deterministic PII sanitization

Regex patterns for emails, credit cards, session IDs, phone numbers, IPv4/IPv6. Bounded quantifiers prevent ReDoS. Output is always identical for same input, enabling cache keying.

GitHub file verification before LLM

Uses GitHub API to verify suspected files exist in repo before asking LLM. Prevents hallucinated file paths. Octokit integration searches by filename patterns from stack trace.

Structured prompt with few-shot examples

System prompt includes JSON schema, target context (stack, framework), and two detailed examples. Output schema enforces task_title (50 chars), summary (400 chars), reproduction steps, fix plan.

24-hour deduplication cache

Payload hash (URL + errorType + title) keys into cache. Identical errors within 24 hours return cached LLM response. Reduces costs and prevents duplicate PRs from flaky errors.

Build better feedback loop. Currently no way to mark generated PR as good/bad. Would improve prompts faster with explicit quality signals.

Add multi-model fallback earlier. Claude outages would block all automation. Having GPT-4 fallback would improve reliability.

Implement confidence thresholds. Low-confidence file locations should create issues, not PRs. Current approach creates some PRs that need significant manual editing.

The ProblemThe Problem

ConstraintsConstraints

The ApproachThe Approach

Why Chrome extension over bookmarklet or browser automation?

Why HMAC authentication between extension and backend?

Why Claude Haiku specifically?

Why deterministic PII sanitization?

Key TradeoffsKey Tradeoffs

Aggressive PII sanitization may remove useful context

Single LLM (Claude) instead of model-agnostic

Backend required (not extension-only)

Implementation HighlightsImplementation Highlights

HMAC request signing with Web Crypto API

Deterministic PII sanitization

GitHub file verification before LLM

Structured prompt with few-shot examples

24-hour deduplication cache

OutcomesOutcomes

What I'd Do DifferentlyWhat I'd Do Differently

The ProblemThe Problem

ConstraintsConstraints

The ApproachThe Approach

Why Chrome extension over bookmarklet or browser automation?

Why HMAC authentication between extension and backend?

Why Claude Haiku specifically?

Why deterministic PII sanitization?

Key TradeoffsKey Tradeoffs

Aggressive PII sanitization may remove useful context

Single LLM (Claude) instead of model-agnostic

Backend required (not extension-only)

Implementation HighlightsImplementation Highlights

HMAC request signing with Web Crypto API

Deterministic PII sanitization

GitHub file verification before LLM

Structured prompt with few-shot examples

24-hour deduplication cache

OutcomesOutcomes

What I'd Do DifferentlyWhat I'd Do Differently

The Problem

Constraints

The Approach

Key Tradeoffs

Implementation Highlights

Outcomes

What I'd Do Differently

The Problem

Constraints

The Approach

Key Tradeoffs

Implementation Highlights

Outcomes

What I'd Do Differently