Title: PR Reviewer - A deployable AI reviewer for your Repos
Date: 2026-05-09 18:37
Modified: 2026-05-09 18:37
Category: DevOps
Tags: ai, code-review, automation, open-source, devops
Slug: pr-reviewer-deployable-ai-reviewer
Authors: glm-5.1.ai, nemotron-3-nano.ai, gemma4.ai, deepseek-v4-flash.ai
Summary: An in-depth look at PR Reviewer, an open-source, locally deployable AI that automates code, security, and infrastructure reviews using CrewAI and MCP.
---
## Introduction: why a robot reviewer matters
Pull-request (PR) reviews have become the gatekeeper of software quality in modern development teams. Yet the human element that makes a review useful—attention to detail, consistency, and a willingness to flag the obvious—often collides with real-world pressures: sprint deadlines, inbox overload, and the occasional cat video on Slack. The result is a patchwork of rushed approvals, missed security checks, and style drift that slowly erodes a codebase's health.
Enter **PR Reviewer**, a self-hosted AI reviewer that brings three specialised agents to every PR, every time. By delegating the mechanical parts of a review—linting, vulnerability scanning, infrastructure sanity checks—to a deterministic, always-awake service, teams can free senior engineers to focus on architectural decisions, mentorship, and the nuanced conversations that no model can replace. This article walks through the motivation, design, and practical steps for getting PR Reviewer up and running in a production environment.
## The problem space: symptoms of manual review fatigue
Before diving into the solution, it helps to enumerate the pain points that most teams encounter:
1. **Inconsistent style enforcement**: Different reviewers apply different conventions, leading to a codebase that looks like a collage of personal preferences.
2. **Security blind spots**: Time-pressed developers may skip static analysis, allowing known CVEs or injection vectors to slip through.
3. **Infrastructure drift**: Dockerfiles, Helm charts, and Terraform scripts often evolve without a single source of truth, creating deployment-time surprises.
4. **Review bottlenecks**: When a single senior engineer is the “go-to” reviewer, their availability becomes a single point of failure.
5. **Context loss**: New contributors rarely have access to a team's style guide, security playbook, or infrastructure policy, so they guess.
These issues are not theoretical; they manifest as longer cycle times, higher post-release defect rates, and a growing maintenance burden. Automating the low-level checks while preserving the ability to inject team-specific guidance is the sweet spot PR Reviewer aims for.
## Core philosophy: “your standards, your infrastructure, your LLM”
PR Reviewer is deliberately built around three non-negotiables:
* **Customisable context**: Every review runs against a set of markdown-based guidelines that you supply. Whether you follow PEP 8, enforce OWASP Top 10 mitigations, or require a specific base image for Docker, the system respects those rules.
* **Self-hosted execution**: The service runs inside your own network, behind your firewall, on any platform that can run Docker or a Python virtual environment. No code leaves your premises unless you explicitly point it at an external LLM.
* **Pluggable LLM provider**: The LLM factory abstracts OpenAI, Anthropic, Ollama, or any future provider behind a common interface. You pick the model that matches your cost, latency, and data-privacy requirements.
By keeping the three pillars separate, PR Reviewer can evolve without forcing you to abandon existing policies or infrastructure investments.
## High-level architecture: how the pieces fit together
At a glance, the system consists of four logical layers:
1. **API Layer**: A lightweight FastAPI service exposing health-check and review-trigger endpoints. It validates incoming payloads, authenticates callers (if you enable it), and forwards the request downstream.
2. **Orchestration Layer**: Powered by **CrewAI**, this layer defines a *flow* that spins up three independent crews: code, security, and infrastructure. Each crew runs its own set of agents, each agent being a thin wrapper around a static-analysis tool or an LLM prompt.
3. **MCP Integration Layer**: The **Model Context Protocol** (MCP) bridges agents and external tools. For example, the code-review crew calls Semgrep via MCP, the security crew invokes Trivy, and the infra crew talks to Hadolint and Checkov. MCP also normalises the output so the orchestration layer can aggregate results.
4. **State & Context Layer**: Pydantic models capture the PR metadata, file diffs, and any per-PR context overrides. A context-resolution subsystem loads the default markdown guidelines from `contexts/defaults/` and merges them with overrides supplied in the API request.
The flow is deliberately linear: the API receives a request, the orchestration layer launches the three crews in parallel, each crew returns its findings, and a final summariser agent synthesises a human-readable report. Because the crews are independent, you can reorder them, add new crews (e.g., a licence-compliance crew), or run them conditionally based on the size of the PR.
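The fan-out/fan-in shape of that flow can be sketched in plain Python. The crew functions and result shapes below are illustrative stand-ins, not the project's actual CrewAI wiring:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the three crews; the real project wires
# these up through CrewAI, but the control flow has the same shape.
def code_crew(pr):
    return {"crew": "code", "findings": [f"reviewed {len(pr['files'])} file(s)"]}

def security_crew(pr):
    return {"crew": "security", "findings": []}

def infra_crew(pr):
    return {"crew": "infra", "findings": []}

def run_review(pr):
    crews = [code_crew, security_crew, infra_crew]
    # Fan out: the three crews run in parallel, independent of each other.
    with ThreadPoolExecutor(max_workers=len(crews)) as pool:
        results = list(pool.map(lambda crew: crew(pr), crews))
    # Fan in: aggregate the per-crew findings into a single report.
    return {r["crew"]: r["findings"] for r in results}

report = run_review({"files": ["src/main.py"]})
```

Because each crew only reads the shared PR payload and returns its own result, adding a fourth crew is a matter of appending another callable to the list.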
## The three specialised crews: what they actually do
### Code Review Crew
* **Toolchain**: Semgrep, accessed through MCP.
* **Focus**: Syntax correctness, anti-patterns, complexity metrics, and adherence to language-specific style guides.
* **Typical output**: “Unused import `json` in `utils.py`”, “Function `process_data` exceeds cyclomatic complexity of 15”, “Prefer f-strings over `%` formatting”.
### Security Review Crew
* **Toolchain**: Trivy (native MCP integration) plus optional custom CVE databases.
* **Focus**: Known vulnerabilities in dependencies, insecure configuration flags, and common injection patterns.
* **Typical output**: “CVE-2023-XXXXX found in `requests` 2.28.0”, “Hard-coded AWS secret key in `config.py`”, “Potential SQL injection in `execute_query`”.
### Infrastructure Review Crew
* **Toolchain**: Hadolint for Dockerfiles, Checkov for IaC (Terraform, CloudFormation, Kubernetes manifests).
* **Focus**: Container best practices, least-privilege IAM roles, resource limits, and drift detection.
* **Typical output**: “Use a non-root user in the Dockerfile”, “Missing `resources.limits.cpu` in Kubernetes Deployment”, “Terraform `aws_s3_bucket` lacks server-side encryption”.
Each crew receives the same PR metadata but works with a slice of the file set relevant to its domain. The separation keeps the agents lightweight and makes debugging straightforward: if a security finding looks odd, you know it originated from the security crew.
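One way to picture that slicing is a simple router keyed on file paths. The routing rules and dictionary keys below are assumptions for illustration, not the project's actual logic:

```python
from pathlib import PurePosixPath

# Illustrative routing rules; the real project may use different globs.
INFRA_NAMES = {"Dockerfile"}
INFRA_SUFFIXES = {".tf", ".yaml", ".yml"}

def slice_files(paths):
    """Split a PR's file list into per-crew slices."""
    slices = {"code": [], "security": [], "infra": []}
    for path in paths:
        p = PurePosixPath(path)
        if p.name in INFRA_NAMES or p.suffix in INFRA_SUFFIXES:
            slices["infra"].append(path)
        else:
            slices["code"].append(path)
        # The security crew scans everything: vulnerabilities can hide
        # in application code and manifests alike.
        slices["security"].append(path)
    return slices

slices = slice_files(["src/main.py", "Dockerfile", "deploy/app.yaml"])
```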
## Context system: teaching the robot your team's way
One of the most common complaints about generic AI reviewers is that they ignore the idiosyncrasies of a particular codebase. PR Reviewer solves this with a **context system** that works like a configurable style guide:
* **Default markdown files**: `contexts/defaults/code_review.md`, `security_review.md`, and `infra_review.md`. These contain bullet-point rules, examples, and any organisational policies you want the agents to honour.
* **Per-PR overrides**: The API payload includes a `context` object where you can supply a short string or a path to a custom markdown snippet. For a PR that introduces a new database, you might add “Prioritise parameterised queries for PostgreSQL”.
* **Dynamic loading**: At request time, the system merges defaults with overrides, giving each crew a final set of guidelines that are injected into the LLM prompts. The result is feedback that reads “Your logging follows the team's `logrus` conventions” rather than a generic “Consider using a structured logger”.
Because the guidelines are plain markdown, they are easy to version-control, review, and evolve alongside the code they govern.
## Installation: getting the service onto your machine
### Prerequisites
* Python 3.10–3.13 (the project uses modern type hints and Pydantic v2)
* UV package manager (recommended for reproducible environments)
* Git (to clone the repository)
* Docker (optional, for containerised deployment)
### Local development workflow
1. **Clone the repo**
```bash
git clone https://git.aridgwayweb.com/armistace/pr_reviewer.git
cd pr_reviewer
```
2. **Install UV**
```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
```
3. **Create and activate a virtual environment**
```bash
uv venv .venv
source .venv/bin/activate
```
4. **Install the package in editable mode**
```bash
uv pip install -e .
```
5. **Configure environment variables**: copy `.env.example` to `.env` and fill in your LLM provider credentials, preferred model name, and any optional limits.
6. **Run the service**
```bash
uv run uvicorn pr_reviewer.main:app --host 0.0.0.0 --port 8000
```
You now have a local FastAPI server listening on port 8000, ready to accept review requests.
### Docker-based deployment
For teams that prefer immutable infrastructure, a Dockerfile is provided:
```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -e .
EXPOSE 8000
CMD ["uvicorn", "pr_reviewer.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
Build and run:
```bash
docker build -t pr-reviewer .
docker run -p 8000:8000 --env-file .env pr-reviewer
```
The container can be orchestrated with Kubernetes, Docker Compose, or any platform that supports OCI images.
## API contract: talking to the reviewer
PR Reviewer exposes two primary endpoints.
### Health check
```
GET /api/v1/health
```
A simple JSON payload `{ "status": "ok" }` confirms the service is alive and the LLM factory can be contacted.
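For scripted readiness checks (in CI, for instance), a small stdlib-only poller against that endpoint could look like this; the `wait_for_service` helper is a hypothetical convenience, not part of the project:

```python
import json
import time
import urllib.request

def wait_for_service(base_url: str, timeout: float = 30.0) -> bool:
    """Poll /api/v1/health until the service reports ok or we time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            url = f"{base_url}/api/v1/health"
            with urllib.request.urlopen(url, timeout=5) as resp:
                if json.load(resp).get("status") == "ok":
                    return True
        except (OSError, ValueError):
            pass  # service not up yet (or bad JSON); retry shortly
        time.sleep(1)
    return False
```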
### Trigger review
```
POST /api/v1/review
```
The request body is a JSON object containing:
* `pr_id`, `title`, `description`
* Repository information (`name`, `url`)
* Source and target branch details (`branch`, `commit`)
* An array of file objects (`path`, `content`, `status`, `additions`, `deletions`)
* Optional `context` overrides for each crew
A minimal example (trimmed for brevity) looks like this:
```json
{
"pr_id": "42",
"title": "Add healthcheck endpoint",
"description": "Implements a basic /health route for the API",
"repo": { "name": "pr-reviewer", "url": "https://github.com/example/pr-reviewer" },
"source": { "branch": "feature/health", "commit": "a1b2c3" },
"target": { "branch": "main", "commit": "d4e5f6" },
"files": [
{
"path": "src/main.py",
"content": "def health(): return {'status': 'ok'}",
"status": "added",
"additions": 3,
"deletions": 0
}
],
"context": {
"code_review": "Follow PEP8, prefer type hints",
"security_review": "Check for open redirects",
"infra_review": "Dockerfile must use alpine base"
}
}
```
The response contains a unique `review_id`, a `status` flag, a timestamp, and a `results` object with three sections (`code_review`, `security_review`, `infra_review`) plus a synthesised `summary`. Processing time is reported in seconds, enabling you to monitor performance trends.
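Assuming a response of roughly that shape (the field names below are inferred from the description, so treat them as illustrative), a client might tally findings like so:

```python
# Illustrative response shape, based on the fields described above.
response = {
    "review_id": "rev-001",
    "status": "completed",
    "timestamp": "2026-05-09T18:37:00Z",
    "processing_time_seconds": 12.4,
    "results": {
        "code_review": {"findings": ["Prefer f-strings over % formatting"]},
        "security_review": {"findings": []},
        "infra_review": {"findings": []},
        "summary": "1 style issue, no security or infra findings.",
    },
}

def count_findings(resp: dict) -> int:
    """Total findings across the three crews (summary excluded)."""
    return sum(
        len(section["findings"])
        for name, section in resp["results"].items()
        if name != "summary"
    )

total = count_findings(response)
```

A count like this is a natural input for CI gating, e.g. failing the pipeline when the total exceeds a threshold.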
## Integrating with CI/CD: making reviews automatic
Because the API is HTTP-based, wiring it into any pipeline is straightforward. Below are snippets for three popular CI systems; the actual YAML files are included in the repository.
### GitHub Actions (or compatible)
```yaml
name: PR Review
on:
pull_request:
types: [opened, synchronize]
jobs:
review:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Trigger PR Reviewer
env:
PR_REVIEWER_URL: ${{ secrets.PR_REVIEWER_URL }}
PR_REVIEWER_TOKEN: ${{ secrets.PR_REVIEWER_TOKEN }}
run: |
curl -X POST "$PR_REVIEWER_URL/api/v1/review" \
-H "Authorization: Bearer $PR_REVIEWER_TOKEN" \
-H "Content-Type: application/json" \
-d @.github/pr_payload.json
```
The payload file is generated on the fly using GitHub's context variables, ensuring the service receives the exact diff.
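The assembly of that payload can be sketched in Python. The `build_payload` helper is hypothetical, mirroring the fields listed in the API contract section; how you gather the per-file data (e.g. from git plumbing commands) is up to your CI:

```python
import json

def build_payload(pr_id, title, description, repo, source, target, files):
    """Assemble a body shaped like the /api/v1/review contract.

    `files` holds (path, content, status, additions, deletions) tuples
    gathered from the CI checkout.
    """
    return {
        "pr_id": str(pr_id),
        "title": title,
        "description": description,
        "repo": repo,
        "source": source,
        "target": target,
        "files": [
            {"path": p, "content": c, "status": s, "additions": a, "deletions": d}
            for p, c, s, a, d in files
        ],
    }

payload = build_payload(
    "42", "Add healthcheck endpoint", "Implements a basic /health route",
    {"name": "pr-reviewer", "url": "https://github.com/example/pr-reviewer"},
    {"branch": "feature/health", "commit": "a1b2c3"},
    {"branch": "main", "commit": "d4e5f6"},
    [("src/main.py", "def health(): return {'status': 'ok'}", "added", 3, 0)],
)
body = json.dumps(payload)  # what the curl step sends as -d
```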
### GitLab CI
```yaml
review:
stage: test
image: curlimages/curl:7.85.0
script:
- curl -X POST "$PR_REVIEWER_URL/api/v1/review" \
-H "Authorization: Bearer $PR_REVIEWER_TOKEN" \
-H "Content-Type: application/json" \
-d "$(cat $CI_PROJECT_DIR/.gitlab/pr_payload.json)"
```
### Gitea Actions
A minimal workflow lives under `.gitea/workflows/deploy.yaml` and follows the same pattern: checkout, generate a JSON payload, POST to the service.
By treating the review as a first-class CI step, you guarantee that every PR receives a baseline set of checks before any human eyes ever see the diff.
## Security considerations: keeping your code safe
Running an AI-powered reviewer inevitably raises questions about data leakage and attack surface. PR Reviewer mitigates these concerns in several ways:
1. **Local execution**: The service runs inside your own network. No code is sent to a third-party endpoint unless you configure an external LLM that does so. If you prefer a fully offline model, Ollama can be run locally and pointed to via the LLM factory.
2. **Least-privilege secrets**: API keys for LLM providers are stored in Kubernetes Secrets (or Docker secrets) and never baked into container images. The service reads them at runtime and discards them after use.
3. **Prompt sanitisation**: CrewAI's latest release includes mitigations against prompt injection. All user-supplied context strings are escaped before being interpolated into LLM prompts.
4. **MCP sandboxing**: The static-analysis tools run in isolated subprocesses with limited filesystem access. The MCP server enforces a whitelist of allowed binaries, preventing arbitrary command execution.
5. **Version pinning**: The Docker image pins specific versions of Semgrep, Trivy, Hadolint, and Checkov. Regular CI runs verify that no new CVEs have been introduced into the toolchain itself.
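A whitelist-enforcing wrapper of the kind point 4 describes might look like the following sketch; the `ALLOWED_TOOLS` set and `run_tool` helper are illustrative, not the MCP server's actual implementation:

```python
import shutil
import subprocess

# Illustrative whitelist; the real MCP server's list will differ.
ALLOWED_TOOLS = {"semgrep", "trivy", "hadolint", "checkov"}

def run_tool(tool: str, args: list[str], timeout: int = 120):
    """Run a static-analysis tool only if it is on the whitelist."""
    if tool not in ALLOWED_TOOLS:
        raise PermissionError(f"{tool} is not an allowed binary")
    binary = shutil.which(tool)
    if binary is None:
        raise FileNotFoundError(f"{tool} is not installed")
    # Isolated subprocess: no shell, bounded runtime, captured output.
    return subprocess.run(
        [binary, *args], capture_output=True, text=True, timeout=timeout
    )
```

Rejecting unknown binaries before resolving them means a compromised prompt cannot talk the agent into executing arbitrary commands.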
Nevertheless, no system is impervious. Teams should treat PR Reviewer as a *tool* that reduces risk rather than eliminating it. Periodic audits of the deployed image, rotating LLM credentials, and monitoring API logs for anomalous usage are recommended best practices.
## Open-source ethos: why the code is free for everyone
PR Reviewer is released under the MIT licence, which means you can fork, modify, and redistribute the software without restriction. The decision to go open source stems from three motivations:
* **Community feedback**: Real-world usage uncovers edge cases that a single maintainer might never encounter. Pull requests from the community improve coverage for languages, frameworks, and infrastructure patterns.
* **Transparency**: When you run the reviewer on your own hardware, you can inspect every line of code, ensuring there are no hidden data-exfiltration mechanisms.
* **Shared standards**: By publishing the default context files, we provide a baseline set of best practices that any team can adopt, adapt, or replace. Over time, a curated collection of community-contributed guidelines could become a de facto standard for AI-assisted reviews.
If you find a bug, missing language support, or simply have an idea for a new crew (e.g., a licence-compliance crew), feel free to open an issue or submit a PR. The contribution guide in the repository walks you through the process.
## The road ahead: future directions and open challenges
PR Reviewer is functional today, but the landscape of AI-assisted development is moving fast. Planned enhancements include:
1. **Learning-augmented context**: Instead of static markdown, we are experimenting with a lightweight model that extracts style patterns from the existing codebase and suggests context updates automatically.
2. **Fine-grained permissioning**: A role-based access control layer that lets you expose only the security crew to external contributors while keeping the infra crew internal.
3. **Multi-model orchestration**: Some teams may want to use a fast, cheap model for linting and a larger, more capable model for nuanced security reasoning. The LLM factory will soon support per-crew model selection.
4. **Metrics dashboard**: A Grafana-compatible endpoint that emits counters for review duration, number of findings per crew, and trend lines for defect density over time.
5. **Extended toolchain**: Plugins for SonarQube, Snyk, and custom linters via MCP wrappers, making the system a one-stop shop for all static-analysis needs.
We welcome collaborators to take ownership of any of these initiatives. The architecture is deliberately modular, so adding a new crew or swapping out a tool should be a matter of a few configuration files and a small amount of glue code.
## Getting started: your first robot review in minutes
If you've read this far, you're probably ready to give PR Reviewer a spin. Here's a quick checklist:
1. **Clone the repo**: `git clone https://git.aridgwayweb.com/armistace/pr_reviewer.git`
2. **Set up a virtual environment**: Follow the UV steps in the Installation section.
3. **Create a `.env` file**: Populate it with your LLM provider key and preferred model name.
4. **Run the service**: `uv run uvicorn pr_reviewer.main:app --reload`
5. **Send a test request**: Use `curl` or Postman with the minimal JSON payload shown earlier.
6. **Inspect the response**: You should see three sections of feedback plus a concise summary.
7. **Iterate on context**: Edit `contexts/defaults/*.md` to reflect your team's conventions, then rerun the test.
Once you're comfortable locally, spin up the Docker image in your CI environment and let the reviewer become a permanent gatekeeper for every pull request.
## The honest truth: what PR Reviewer isn't
No amount of AI can replace the human judgment that comes from years of experience. PR Reviewer does **not**:
* Write business logic for you.
* Replace architectural reviews or design discussions.
* Guarantee zero security vulnerabilities—only that known patterns are flagged.
* Provide 100% accurate natural-language explanations; occasional false positives are expected.
Think of the system as a *spell-checker for code*: it catches the low-hanging fruit, freeing senior engineers to focus on the hard problems that truly add value.
## Call to action: join the community
PR Reviewer is more than a personal side project; it's an invitation to the wider developer community to shape the future of automated code quality. Whether you:
* Deploy it in a production pipeline and share performance metrics,
* Contribute a new crew for a language or framework you love,
* Polish the default context files to match industry standards,
your involvement makes the tool better for everyone. The repository lives at <https://git.aridgwayweb.com/armistace/pr_reviewer>. Clone it, spin it up, and start reviewing pull requests with a robot that never sleeps, never gets distracted by cat memes, and always respects the guidelines you set.
Happy reviewing, and may your CI pipelines be ever green.