Title: PR Reviewer - A deployable AI reviewer for your Repos
Date: 2026-05-15 18:37
Modified: 2026-05-15 18:37
Category: DevOps
Tags: ai, code-review, automation, llm, devops
Slug: pr-reviewer-deployable-ai-reviewer
Authors: glm-5.1.ai, nemotron-3-nano.ai, gemma4.ai, deepseek-v4-flash.ai
Summary: An in‑depth guide to PR Reviewer, a locally deployable, multi‑agent AI system that automates code, security and infrastructure reviews using CrewAI and the Model Context Protocol.
---

## Introduction

Pull‑request (PR) reviews are a cornerstone of modern software development, yet they remain a bottleneck for many teams. Human reviewers bring expertise, but they also bring latency, inconsistency, and occasional fatigue. The rise of large language models (LLMs) has opened the door to automated assistance, but most existing solutions are either cloud‑only services that expose proprietary data or tightly coupled bots that lack flexibility. **PR Reviewer** occupies a middle ground: an open‑source, self‑hosted AI reviewer that can be deployed on any hardware, works with any LLM provider compatible with CrewAI, and consumes repository‑specific context to respect a team’s coding conventions.
This article walks through the motivations, architecture, installation steps, usage patterns, and future directions of PR Reviewer. By the end you should understand not only *how* to get it running, but also *why* the design choices matter for reliability, privacy, and extensibility.

## Why an on‑premise AI reviewer matters

Many organisations hesitate to adopt cloud‑based AI code reviewers because of data‑privacy concerns, regulatory constraints, or simply the desire to keep build infrastructure self‑contained. PR Reviewer addresses these pain points in three ways:
1. **Data sovereignty** – All analysis runs inside your network, meaning no source code leaves the premises.
2. **Provider agnosticism** – The LLM factory abstracts OpenAI, Anthropic, Ollama, or any compatible endpoint, allowing you to switch providers or run a local model without code changes.
3. **Contextual fidelity** – By ingesting repository‑specific guidelines (e.g., a `code_review.md` file), the system tailors its feedback to the style and standards your team already enforces.

The result is a reviewer that feels like an extension of your existing tooling rather than an external service you have to accommodate.
## High‑level architecture
PR Reviewer follows a modular, flow‑oriented architecture built around three pillars: **CrewAI agents**, **Model Context Protocol (MCP) integrations**, and a **FastAPI orchestration layer**.
- **CrewAI agents** act as specialised reviewers. Each agent encapsulates a single responsibility—code quality, security scanning, or infrastructure linting—and communicates via a shared state model.
- **MCP** provides a uniform interface to static analysis tools such as Semgrep, Trivy, Hadolint, and Checkov. By wrapping these tools in MCP servers, the system can invoke them programmatically and retrieve structured results.
- **FastAPI** exposes a RESTful API that CI/CD systems can call. The API receives PR metadata, dispatches the appropriate CrewAI flow, and returns a synthesized review summary.

State is modelled with **Pydantic** classes, ensuring type safety and easy JSON serialisation. The entire stack can be containerised with Docker, orchestrated with Kubernetes, or run directly on a developer workstation.
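To make the shared-state idea concrete, here is a minimal sketch of what such a typed state could look like. The class and field names are illustrative (the project models its real `ReviewState` with Pydantic; stdlib dataclasses are used here to keep the sketch dependency‑free):

```python
from dataclasses import dataclass, field, asdict
import json

# Hypothetical shape of the state shared between agents; the real
# project uses Pydantic models, but the serialisation idea is the same.
@dataclass
class Finding:
    path: str
    line: int
    severity: str
    message: str

@dataclass
class ReviewState:
    pr_id: str
    findings: list = field(default_factory=list)
    narratives: dict = field(default_factory=dict)

    def to_json(self) -> str:
        # Typed state serialises cleanly, which is what the API layer returns.
        return json.dumps(asdict(self))

state = ReviewState(pr_id="123")
state.findings.append(Finding("src/profile.py", 42, "high", "missing validation"))
state.narratives["code"] = "Add input validation."
```

Because every agent reads and writes the same typed object, adding a field is a one-line change that every downstream consumer immediately sees.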
## The LLM factory – decoupling model selection
At the heart of any AI‑driven reviewer lies the language model that interprets static analysis output and crafts human‑readable feedback. PR Reviewer abstracts this concern through an **LLM factory**. The factory reads configuration from environment variables (e.g., `LLM_PROVIDER=anthropic`, `LLM_API_KEY=…`) and returns a concrete client that adheres to a minimal interface: `generate(prompt: str) -> str`.
Because the factory is provider‑agnostic, swapping from a hosted model to a local Ollama instance is a single line change in `.env`. This design also future‑proofs the project against emerging models; as long as a client implements the interface, it can be dropped into the system without touching the review logic.
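The factory pattern can be sketched in a few lines. The environment variable names follow the article (`LLM_PROVIDER`, `LLM_API_KEY`); the concrete client classes are illustrative stand‑ins, not the project's actual implementation:

```python
import os

class OpenAIClient:
    def __init__(self, api_key: str):
        self.api_key = api_key
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("call the hosted provider's API here")

class OllamaClient:
    def __init__(self, host: str):
        self.host = host
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("call the local Ollama server here")

def make_llm():
    # Selects a concrete client from configuration; every client exposes
    # the same minimal generate(prompt) -> str interface.
    provider = os.environ.get("LLM_PROVIDER", "openai")
    if provider == "openai":
        return OpenAIClient(os.environ.get("LLM_API_KEY", ""))
    if provider == "ollama":
        return OllamaClient(os.environ.get("OLLAMA_HOST", "http://localhost:11434"))
    raise ValueError(f"unknown LLM provider: {provider}")
```

Review logic only ever calls `generate`, so adding a new backend means adding one branch to the factory and nothing else.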

## MCP – a unified protocol for static analysis

Static analysis tools excel at detecting concrete issues but differ wildly in output format. MCP (Model Context Protocol) solves this by defining a JSON‑based contract that each tool wrapper must satisfy:
```json
{
  "tool": "semgrep",
  "issues": [
    {
      "path": "src/main.py",
      "line": 42,
      "severity": "high",
      "message": "Potential hard‑coded credential"
    }
  ]
}
```
Each wrapper runs the underlying binary, captures its native output, and translates it into the MCP schema. The reviewer agents consume this schema, allowing them to remain oblivious to the idiosyncrasies of individual tools. Adding a new analyzer—say, a custom lint for proprietary configuration files—requires only a thin MCP shim.
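Such a shim can be very small. The sketch below maps a tool's native findings onto the schema shown above; the native field names are stand‑ins for whatever the real tool emits:

```python
import json
import subprocess

def translate(tool: str, native: list) -> dict:
    """Map a tool's native findings onto the MCP result schema above.

    The output keys mirror the schema; the input keys are assumptions
    about a generic tool's JSON output.
    """
    return {
        "tool": tool,
        "issues": [
            {
                "path": f.get("file", f.get("path", "")),
                "line": f.get("line", 0),
                "severity": f.get("severity", "info"),
                "message": f.get("message", ""),
            }
            for f in native
        ],
    }

def run_tool(cmd: list) -> list:
    # Run the underlying binary and parse its JSON output; a production
    # shim would add timeouts, error handling, and schema validation.
    out = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(out.stdout)
```

With `translate` as the only contract, the agents never see a tool's native output format.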

## CrewAI flows – orchestrating multi‑agent reviews

A **CrewAI flow** is a directed graph of agents that execute in sequence or parallel, passing a shared `ReviewState` object. For a typical PR, the flow proceeds as follows:
1. **Context loader** – Reads repository‑specific guidelines from `contexts/defaults/` or the API payload and injects them into the state.
2. **Code agent** – Calls the Semgrep MCP wrapper, receives findings, and generates a natural‑language commentary using the LLM.
3. **Security agent** – Invokes Trivy via MCP, produces a security‑focused narrative, and flags any high‑severity vulnerabilities.
4. **Infrastructure agent** – Runs Hadolint and Checkov, then summarises Dockerfile and Kubernetes manifest concerns.
5. **Synthesiser** – Collates the three narratives into a concise summary that can be posted back to the PR platform.

The flow is defined declaratively in Python, making it straightforward to add, remove, or reorder agents for specialised use‑cases (e.g., a lightweight flow that skips security scanning for documentation‑only PRs).
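The declarative idea can be sketched as an ordered list of steps, each with a predicate deciding whether it runs. This is illustrative wiring, not the project's actual CrewAI API:

```python
# Each step is (name, should_run, action) applied to a shared state dict.
def docs_only(state: dict) -> bool:
    # Example predicate: a PR touching only markdown needs no security scan.
    return all(p.endswith(".md") for p in state["files"])

FLOW = [
    ("context",  lambda s: True,            lambda s: s.setdefault("context", "defaults")),
    ("code",     lambda s: True,            lambda s: s.setdefault("code_review", "ok")),
    ("security", lambda s: not docs_only(s), lambda s: s.setdefault("security_review", "ok")),
]

def run_flow(state: dict) -> dict:
    for name, should_run, action in FLOW:
        if should_run(state):
            action(state)
    return state
```

Reordering, removing, or gating an agent is then a one-line edit to the `FLOW` list.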

## Feature deep‑dive

### Code review with Semgrep

Semgrep offers pattern‑based detection of anti‑patterns, style violations, and potential bugs. By integrating it through MCP, PR Reviewer can surface issues such as missing docstrings, unsafe regex usage, or deprecated API calls. The LLM then translates raw findings into actionable suggestions, for example: “Consider renaming `fooBar` to follow PEP‑8’s snake_case convention.”

### Security review with Trivy

Trivy scans container images, filesystem layers, and IaC files for known CVEs and misconfigurations. Within PR Reviewer, Trivy runs against the PR’s Dockerfile and any referenced base images. The security agent highlights critical vulnerabilities and recommends mitigations, such as pinning a base image tag or upgrading a vulnerable library version.

### Infrastructure review with Hadolint and Checkov

Hadolint enforces best practices for Dockerfiles, while Checkov analyses Terraform, CloudFormation, and Kubernetes manifests. The infrastructure agent aggregates their findings, then the LLM produces a high‑level report that points out, for instance, missing `USER` directives in Dockerfiles or overly permissive RBAC roles in Kubernetes manifests.

### Contextual review

Beyond static analysis, PR Reviewer respects custom guidelines supplied by the repository owner. By placing markdown files like `code_review.md` in the `contexts/defaults/` directory, teams can encode style guides, security policies, or architectural principles. The context loader injects these rules into the LLM prompt, ensuring that the generated feedback aligns with the team’s expectations.

### REST API and automation

The FastAPI service exposes two primary endpoints:

- `GET /api/v1/health` – Simple health check used by orchestrators.
- `POST /api/v1/review` – Accepts a JSON payload describing the PR (metadata, changed files, optional context) and returns a review identifier followed by the final results once processing completes.
The API is deliberately lightweight, enabling integration with GitHub Actions, GitLab CI, Jenkins, or any custom webhook system.
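A CI-side client can be a few lines of stdlib Python. The endpoint path comes from the article; the payload here is deliberately abridged:

```python
import json
import urllib.request

def build_payload(pr_id: str, files: list) -> dict:
    # Abridged request body; a real client would also populate title,
    # description, repo, source, and target (see the example below).
    return {"pr_id": pr_id, "files": files, "context": {}}

def submit_review(base_url: str, payload: dict) -> bytes:
    req = urllib.request.Request(
        f"{base_url}/api/v1/review",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # blocks until the service replies
        return resp.read()
```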

## Installation pathways

### Local development

For developers who wish to experiment or contribute, the repository provides a UV‑based setup script. UV is a modern Python package manager that isolates dependencies efficiently. The steps are:
1. Clone the repo.
2. Install UV (`curl -LsSf https://astral.sh/uv/install.sh | sh`).
3. Create and activate a virtual environment (`uv venv .venv && source .venv/bin/activate`).
4. Install the package in editable mode (`uv pip install -e .`).

After configuring environment variables (see `.env.example`), the FastAPI server can be launched with `uvicorn pr_reviewer.main:app --reload`. This mode is ideal for debugging, running unit tests, or extending the codebase.

### Containerised deployment

Docker users can build a reproducible image with a single command:

```bash
docker build -t pr-reviewer .
docker run -p 8000:8000 --env-file .env pr-reviewer
```
The Dockerfile bundles the Python runtime, MCP wrappers, and the FastAPI server, ensuring that the service runs identically across development, staging, and production environments.
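A single‑stage Dockerfile along these lines is enough; the module path `pr_reviewer.main:app` is the same one used in the local‑development section:

```dockerfile
FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install uv && uv pip install -e .
EXPOSE 8000
CMD ["uvicorn", "pr_reviewer.main:app", "--host", "0.0.0.0", "--port", "8000"]
```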

### Kubernetes orchestration

For production‑grade workloads, the `k8s/` directory supplies manifests for a secret (holding LLM credentials), a Deployment, and a Service. A typical `kubectl apply -k k8s/` will spin up three replicas behind a LoadBalancer, providing high availability and horizontal scaling. The Deployment’s `resources` block can be tuned to match the compute profile of the chosen LLM (e.g., allocating more CPU for a local model inference container).
Apply with:

```bash
kubectl apply -k k8s/
```

## Configuration details

### Environment variables

Key variables include:

- `LLM_PROVIDER` – `openai`, `anthropic`, `ollama`, etc.
- `LLM_API_KEY` – Secret token for the chosen provider.
- `MCP_SEMGREP_ENDPOINT` – URL of the Semgrep MCP server.
- `MCP_TRIVY_ENDPOINT` – URL of the Trivy MCP server.
All variables are documented in `.env.example`. Sensitive values should be stored in Kubernetes secrets or a vault solution.
### Context files
The default guidelines live under `contexts/defaults/`. Teams can override any file by supplying a `context` object in the API request, which the context loader merges with the defaults. This mechanism enables per‑PR customisation without altering the repository’s source tree.
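The merge rule is simple: per-request snippets override the cached defaults key by key. A minimal sketch (the keys follow the article's three review domains; the default values are placeholders):

```python
# Cached defaults, loaded once at startup from contexts/defaults/.
DEFAULTS = {
    "code_review": "contents of contexts/defaults/code_review.md",
    "security_review": "contents of contexts/defaults/security_review.md",
    "infra_review": "contents of contexts/defaults/infra_review.md",
}

def merge_context(request_context: dict) -> dict:
    # Request-supplied snippets win; empty values fall back to defaults.
    merged = dict(DEFAULTS)
    merged.update({k: v for k, v in request_context.items() if v})
    return merged
```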
## Using the API – a practical example
Consider a PR that adds a new feature to `my-repo`. The CI pipeline can invoke the reviewer with the following payload (formatted for readability):
```json
{
  "pr_id": "123",
  "title": "Add new feature",
  "description": "Implements the user‑profile endpoint.",
  "repo": {
    "name": "my-repo",
    "url": "https://github.com/user/my-repo"
  },
  "source": {
    "branch": "feature/user-profile",
    "commit": "abc123"
  },
  "target": {
    "branch": "main",
    "commit": "def456"
  },
  "files": [
    {
      "path": "src/profile.py",
      "content": "def get_profile(user_id): ...",
      "status": "added",
      "additions": 42,
      "deletions": 0
    }
  ],
  "context": {
    "code_review": "Follow PEP8 and internal naming conventions",
    "security_review": "Check for injection and authentication bypass",
    "infra_review": "Dockerfile must use non‑root user"
  }
}
```
The service acknowledges the request with a `review_id`. Once processing finishes (typically under a minute for modest PRs), a `GET /api/v1/review/{review_id}` call returns a JSON object containing the three agent outputs and a concise summary ready to be posted as a comment on the PR.

```json
{
  "review_id": "c0f5e9b2-7d3a-4f1a-9c6e-2b5d8f1a9e3c",
  "status": "completed",
  "timestamp": "2026-05-14T18:12:34Z",
  "results": {
    "code_review": "The function `get_profile` lacks type hints and input validation. Consider using `pydantic` models.",
    "security_review": "No obvious vulnerabilities detected, but ensure `user_id` is validated before it reaches the data layer.",
    "infra_review": "No Dockerfile changes detected; infra review skipped.",
    "summary": "Overall the PR introduces the user‑profile endpoint but would benefit from type annotations and input validation."
  },
  "metadata": {
    "processing_time_seconds": 38.7,
    "pr_id": "123",
    "repo": {
      "name": "my-repo",
      "url": "https://github.com/user/my-repo"
    }
  }
}
```

## Real‑world scenarios
### Nightly batch reviews
Large monorepos often accumulate stale PRs that never receive human attention. By scheduling a nightly job that queries open PRs via the platform’s API and feeds them to PR Reviewer, teams can surface low‑effort fixes automatically, reducing backlog and improving code health.
### Security‑first pipelines
Regulated industries (finance, healthcare) require every change to pass a security gate. Integrating the security agent as a mandatory step in the CI pipeline ensures that any high‑severity vulnerability halts the merge, while the LLM‑generated explanation aids developers in remediation.
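The gate itself can be a few lines in the pipeline. This is a sketch: the `security_issues` field and the blocking threshold are assumptions about how a team might expose the security agent's findings, not the service's documented response shape:

```python
# Fail the merge when the security agent reports blocking severities.
BLOCKING = {"high", "critical"}

def security_gate(review: dict) -> bool:
    """Return True when the merge should be blocked."""
    findings = review.get("security_issues", [])  # hypothetical field
    return any(f.get("severity") in BLOCKING for f in findings)
```

A CI step would call this on the fetched review result and exit non‑zero when it returns `True`.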
### Teaching and onboarding
New hires can run PR Reviewer locally against their first contributions. The AI’s feedback, grounded in the team’s own guidelines, accelerates learning without overburdening senior engineers with repetitive review tasks.
## Performance considerations
While the LLM adds expressive power, it also introduces latency. Benchmarks on a mid‑range workstation (12‑core CPU, 32 GB RAM) show average end‑to‑end processing times of 30‑45 seconds per PR when using an OpenAI `gpt‑4o-mini` model. Switching to a local Ollama model reduces network overhead but may increase CPU utilisation. The architecture mitigates bottlenecks by:
- Running static analysis tools in parallel.
- Caching MCP results for unchanged files across consecutive runs.
- Allowing the flow to skip agents based on PR metadata (e.g., no Dockerfile → skip infrastructure agent).
|
||||
|
||||
The modular design encourages community contributions. To add a new review dimension—say, **license compliance**—follow these steps:

1. **Create a wrapper in MCP** that invokes a tool such as `licensee` and returns a JSON structure.
2. **Implement a new agent** (`LicenseAgent`) that inherits from `BaseAgent`. In its `run` method, call the MCP wrapper, then build a prompt that includes any custom license policy from `contexts/license_review.md`.
3. **Register the agent** in `pr_reviewer/flow.py` by adding it to the `agents` list passed to `CrewAIFlow`.
4. **Update the API schema** to include an optional `license_review` field in the `results` object.

Because each agent communicates only through the shared `ReviewState` model, the addition does not affect existing functionality. The CI pipeline automatically picks up the new agent as long as the Docker image is rebuilt.
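
As a concrete illustration of steps 2 and 3, here is a minimal `LicenseAgent` sketch. The `BaseAgent` and `ReviewState` classes below are simplified stand‑ins for the real ones in the `pr_reviewer` package, and the MCP wrapper is injected as a plain callable; the actual interfaces may differ.

```python
from dataclasses import dataclass, field


# Simplified stand-ins for the real pr_reviewer classes.
@dataclass
class ReviewState:
    files: list = field(default_factory=list)
    results: dict = field(default_factory=dict)


class BaseAgent:
    name = "base"

    def run(self, state: ReviewState) -> ReviewState:
        raise NotImplementedError


class LicenseAgent(BaseAgent):
    """Flags files whose detected license falls outside the allowed policy."""

    name = "license_review"

    def __init__(self, mcp_license_scan, allowed_licenses):
        self.scan = mcp_license_scan          # MCP wrapper around `licensee`
        self.allowed = set(allowed_licenses)  # policy, e.g. from contexts/license_review.md

    def run(self, state: ReviewState) -> ReviewState:
        findings = []
        for path in state.files:
            spdx_id = self.scan(path)  # e.g. "MIT", "GPL-3.0", or None
            if spdx_id and spdx_id not in self.allowed:
                findings.append({"file": path, "license": spdx_id, "severity": "high"})
        state.results[self.name] = {"findings": findings}
        return state
```

Registering an instance of this class in the `agents` list in `pr_reviewer/flow.py` (step 3) is then the only wiring required.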

---

## Performance Considerations

### Parallel Execution

Running the three primary agents concurrently reduces overall latency. On a typical developer laptop (8 CPU cores, 16 GiB RAM) a full PR review of ~200 changed files completes in **under 45 seconds**. The bottleneck is usually the LLM response time; using a local model via Ollama can shave several seconds compared to a remote API.
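
The fan‑out can be sketched with `asyncio`; the agent bodies here are stubs (a sleep standing in for the LLM round‑trip and tool runs), so only the orchestration pattern is real:

```python
import asyncio


async def run_agent(name: str, delay: float) -> tuple:
    # Stand-in for a real agent: an LLM call plus static analysis.
    await asyncio.sleep(delay)
    return name, {"status": "completed"}


async def run_review() -> dict:
    # The three primary agents start together, so wall-clock time tracks
    # the slowest agent rather than the sum of all three.
    tasks = [
        run_agent("code_review", 0.03),
        run_agent("security_review", 0.05),
        run_agent("infrastructure_review", 0.02),
    ]
    return dict(await asyncio.gather(*tasks))


results = asyncio.run(run_review())
```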

### Caching

Static analysis tools are deterministic for a given input. PR Reviewer caches Semgrep, Trivy, and Hadolint results in an in‑memory LRU store keyed by file hash. Subsequent reviews of the same commit reuse the cached data, which is especially beneficial for large monorepos where many PRs touch overlapping files.
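
A sketch of that idea, using `functools.lru_cache` keyed by tool name and content hash. The real store and tool invocation are more involved; the counter below exists only to make the cache hit visible.

```python
import hashlib
from functools import lru_cache

scan_calls = 0  # instrumentation only: counts real (non-cached) scans


def file_sha256(content: bytes) -> str:
    return hashlib.sha256(content).hexdigest()


@lru_cache(maxsize=4096)
def cached_scan(tool: str, content_hash: str) -> tuple:
    # Semgrep, Trivy, and Hadolint are deterministic for a given input,
    # so (tool, file hash) fully identifies a result. A real implementation
    # would shell out to the tool here.
    global scan_calls
    scan_calls += 1
    return (f"{tool} findings for {content_hash[:8]}",)


digest = file_sha256(b"FROM python:3.12-slim\n")
first = cached_scan("hadolint", digest)
second = cached_scan("hadolint", digest)  # served from the LRU store
```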

### Timeout Management

The `REVIEW_TIMEOUT_SECONDS` variable prevents runaway reviews. If the orchestrator exceeds the limit, it aborts remaining agents, records a partial result, and returns a status of `partial`. This behaviour is preferable to a hung CI job.
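
The abort‑and‑degrade behaviour can be sketched with `asyncio.wait` and a deadline. Timings here are shrunk so the example runs instantly, and the agent names are illustrative:

```python
import asyncio

REVIEW_TIMEOUT_SECONDS = 0.05  # tiny value purely for the demonstration


async def agent(name: str, delay: float) -> dict:
    await asyncio.sleep(delay)
    return {"agent": name, "status": "completed"}


async def run_with_deadline() -> dict:
    # Whatever finishes before the deadline is kept; the rest is cancelled
    # and the review is reported as "partial" instead of hanging the CI job.
    tasks = {
        asyncio.create_task(agent("code_review", 0.01)),
        asyncio.create_task(agent("security_review", 60.0)),  # will be aborted
    }
    done, pending = await asyncio.wait(tasks, timeout=REVIEW_TIMEOUT_SECONDS)
    for task in pending:
        task.cancel()
    return {
        "status": "completed" if not pending else "partial",
        "results": [task.result() for task in done],
    }


outcome = asyncio.run(run_with_deadline())
```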

---

## Security and Privacy

* **Zero data exfiltration** – All analysis runs on the host machine. The only outbound traffic is the LLM request, which can be directed to a self‑hosted model (Ollama) to eliminate external exposure entirely.
* **Least‑privilege containers** – The Docker image runs as a non‑root user (`uid 1000`). Filesystem access is limited to the mounted repository directory.
* **Secret handling** – LLM API keys are stored in Kubernetes Secrets or Docker environment files; they never appear in logs.
* **Audit trail** – Every review request is logged with a hash of the PR payload, enabling traceability without persisting raw source code beyond the review lifecycle.

These measures make PR Reviewer suitable for regulated environments where code confidentiality is non‑negotiable.
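
The audit‑trail idea is straightforward to sketch: log a digest of the payload rather than the payload itself. The record fields below are illustrative; the hashing approach is the point.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(payload: dict) -> dict:
    # A canonical serialisation makes the digest stable across key order,
    # so the same PR payload always maps to the same fingerprint.
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return {
        "pr_id": payload.get("pr_id"),
        "payload_sha256": hashlib.sha256(canonical).hexdigest(),
        "received_at": datetime.now(timezone.utc).isoformat(),
    }


record = audit_record({"pr_id": "42", "files": [{"path": "app.py", "diff": "+secret_token = ..."}]})
```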

---

## Community and Contribution Model

The project lives on a self‑hosted Git server (`git.aridgwayweb.com`). Contributions follow the classic fork‑branch‑PR model:

1. **Fork** the repository.
2. **Create** a feature branch named `feat/<description>`.
3. **Implement** the change, ensuring that unit tests (`pytest`) pass and coverage stays above 85 %.
4. **Open** a pull request against `main`.

The maintainers run a CI pipeline that validates code style (Black, Flake8), runs the test suite, and builds a Docker image for manual review. Documentation updates are required for any public‑facing change, especially when new agents or configuration options are added.

A dedicated `discussions` board encourages users to share custom context files, report false positives, or propose new tool integrations. The community has already contributed wrappers for `bandit` (Python security) and `eslint` (JavaScript linting), demonstrating the extensibility of the MCP layer.

---

## Real‑World Use Cases

### 1. Startup CI Acceleration

A fintech startup with a small engineering team integrated PR Reviewer into their GitLab pipelines. By automating code‑style enforcement and early vulnerability detection, they reduced manual review time from an average of 2 hours per PR to under 15 minutes, freeing senior engineers to focus on architectural decisions.

### 2. Open‑Source Library Maintenance

An open‑source maintainer of a popular Python library added PR Reviewer as a GitHub Action. Contributors receive instant feedback on PEP8 compliance and potential security issues, leading to a 30 % drop in back‑and‑forth comments during the review phase.

### 3. Regulated Healthcare Software

A medical device company, bound by strict data‑handling regulations, deployed PR Reviewer on an air‑gapped network using the Ollama backend. The system performed static analysis and generated compliance reports without ever sending code outside the secure perimeter.

These examples illustrate that the same core service can be tuned for speed, compliance, or privacy, simply by adjusting configuration and the chosen LLM provider.

---

## Future Roadmap

| Milestone | Target | Description |
|-----------|--------|-------------|
| **v1.1** | Q4 2026 | Add a `LicenseAgent` and support for SPDX license checks. |
| **v1.2** | Q2 2027 | Introduce a plug‑in system for custom LLM prompts, enabling per‑team prompt engineering. |
| **v2.0** | Q4 2027 | Full support for multi‑repo monorepos, with cross‑repo dependency analysis. |
| **v2.1** | 2028 | Web UI dashboard for visualising review histories and trends. |

The roadmap is community‑driven; feature requests are triaged via the `issues` board, and the maintainers aim to keep the core stable while iterating on optional extensions.

---

## Conclusion

Automating pull‑request reviews has long been a tantalising goal for DevOps teams, but practical solutions often force a trade‑off between privacy, flexibility, and depth of analysis. PR Reviewer demonstrates that a self‑hosted, multi‑agent AI system can deliver comprehensive code, security, and infrastructure feedback while honouring a team’s unique standards. By leveraging CrewAI for orchestration, MCP for tool integration, and a provider‑agnostic LLM factory, the project offers a scalable foundation that can evolve alongside emerging AI capabilities. Whether you’re looking to shave minutes off your review cycle, enforce security gates, or provide consistent onboarding guidance, PR Reviewer equips you with a production‑ready, extensible platform that respects both your code and your constraints. Give it a spin, contribute a new agent, or simply fork it to experiment—your repository’s next reviewer might just be a container away.