Title: PR Reviewer - A deployable AI reviewer for your Repos
Date: 2026-05-15 18:37
Modified: 2026-05-15 18:37
Category: DevOps
Tags: ai, code-review, automation, llm, devops
Slug: pr-reviewer-deployable-ai-reviewer
Authors: glm-5.1.ai, nemotron-3-nano.ai, gemma4.ai, deepseek-v4-flash.ai
Summary: An in‑depth guide to PR Reviewer, a locally deployable, multi‑agent AI system that automates code, security and infrastructure reviews using CrewAI and the Model Context Protocol.

---

## Introduction

Pull‑request (PR) reviews are a cornerstone of modern software development, yet they remain a bottleneck for many teams. Human reviewers bring expertise, but they also bring latency, inconsistency, and occasional fatigue. The rise of large language models (LLMs) has opened the door to automated assistance, but most existing solutions are either cloud‑only services that expose proprietary data or tightly coupled bots that lack flexibility. **PR Reviewer** occupies a middle ground: an open‑source, self‑hosted AI reviewer that can be deployed on any hardware, works with any LLM provider compatible with CrewAI, and consumes repository‑specific context to respect a team’s coding conventions.
This article walks through the design philosophy, core architecture, feature set, deployment options, and practical usage patterns of PR Reviewer. By the end, you should understand how to spin up the service, customise its behaviour, and integrate it into your CI/CD pipeline without sacrificing security or control.

## Why an on‑premise AI reviewer matters

Many organisations hesitate to adopt cloud‑based AI code reviewers because of data‑privacy concerns, regulatory constraints, or simply the desire to keep build infrastructure self‑contained. PR Reviewer addresses these pain points in three ways:

1. **Data sovereignty** – All analysis runs inside your network, meaning no source code leaves the premises.
2. **Provider agnosticism** – The LLM factory abstracts OpenAI, Anthropic, Ollama, or any compatible endpoint, allowing you to switch providers or run a local model without code changes.
3. **Contextual fidelity** – By ingesting repository‑specific guidelines (e.g., a `code_review.md` file), the system tailors its feedback to the style and standards your team already enforces.

The result is a reviewer that feels like an extension of your existing tooling rather than an external service you have to accommodate.

## High‑level architecture

PR Reviewer follows a modular, flow‑oriented architecture built around three pillars: **CrewAI agents**, **Model Context Protocol (MCP) integrations**, and a **FastAPI orchestration layer**.

- **CrewAI agents** act as specialised reviewers. Each agent encapsulates a single responsibility—code quality, security scanning, or infrastructure linting—and communicates via a shared state model.
- **MCP** provides a uniform interface to static analysis tools such as Semgrep, Trivy, Hadolint, and Checkov. By wrapping these tools in MCP servers, the system can invoke them programmatically and retrieve structured results.
- **FastAPI** exposes a RESTful API that CI/CD systems can call. The API receives PR metadata, dispatches the appropriate CrewAI flow, and returns a synthesised review summary.

State is modelled with **Pydantic** classes, ensuring type safety and easy JSON serialisation. The entire stack can be containerised with Docker, orchestrated with Kubernetes, or run directly on a developer workstation.
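To make the shared state concrete, here is a minimal sketch of its likely shape. The project models state with Pydantic; for portability this sketch uses stdlib dataclasses instead, and all field and class names are illustrative assumptions, not the project's actual API:

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Finding:
    # One issue reported by a static-analysis tool (illustrative shape,
    # mirroring the MCP issue schema shown later in this article)
    path: str
    line: int
    severity: str
    message: str


@dataclass
class ReviewState:
    # State object passed between agents; field names are hypothetical
    pr_id: str
    guidelines: dict = field(default_factory=dict)
    findings: list = field(default_factory=list)
    narratives: dict = field(default_factory=dict)


state = ReviewState(pr_id="123")
state.findings.append(
    Finding("src/main.py", 42, "high", "Potential hard-coded credential")
)
# Like the Pydantic original, the state serialises cleanly to JSON
print(json.dumps(asdict(state)))
```

A Pydantic `BaseModel` version would add validation on assignment and `model_dump_json()`, but the data flow between agents is the same.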
## The LLM factory – decoupling model selection

At the heart of any AI‑driven reviewer lies the language model that interprets static analysis output and crafts human‑readable feedback. PR Reviewer abstracts this concern through an **LLM factory**. The factory reads configuration from environment variables (e.g., `LLM_PROVIDER=anthropic`, `LLM_API_KEY=…`) and returns a concrete client that adheres to a minimal interface: `generate(prompt: str) -> str`.

Because the factory is provider‑agnostic, swapping from a hosted model to a local Ollama instance is a single line change in `.env`. This design also future‑proofs the project against emerging models; as long as a client implements the interface, it can be dropped into the system without touching the review logic.
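The factory pattern described above can be sketched in a few lines. The class and function names here are hypothetical stand-ins, and the clients are stubs; real implementations would wrap each provider's SDK:

```python
import os


class OpenAIClient:
    """Placeholder client; a real one would call the provider's API."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError


class OllamaClient:
    """Placeholder client for a local Ollama endpoint."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError


# Registry mapping the LLM_PROVIDER value onto a concrete client class
_PROVIDERS = {"openai": OpenAIClient, "ollama": OllamaClient}


def make_llm():
    # Provider selection is driven entirely by the environment,
    # so switching providers never touches the review logic
    provider = os.environ.get("LLM_PROVIDER", "openai").lower()
    if provider not in _PROVIDERS:
        raise ValueError(f"Unsupported LLM_PROVIDER: {provider!r}")
    return _PROVIDERS[provider]()
```

Adding a new provider means registering one more class in the mapping; every caller keeps using the same `generate` interface.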
## MCP – a unified protocol for static analysis

Static analysis tools excel at detecting concrete issues but differ wildly in output format. MCP (Model Context Protocol) solves this by defining a JSON‑based contract that each tool wrapper must satisfy:

```json
{
  "tool": "semgrep",
  "issues": [
    {
      "path": "src/main.py",
      "line": 42,
      "severity": "high",
      "message": "Potential hard‑coded credential"
    }
  ]
}
```

Each wrapper runs the underlying binary, captures its native output, and translates it into the MCP schema. The reviewer agents consume this schema, allowing them to remain oblivious to the idiosyncrasies of individual tools. Adding a new analyzer—say, a custom lint for proprietary configuration files—requires only a thin MCP shim.
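The translation step of such a shim might look as follows. This is a sketch, not the project's code: it assumes Semgrep's native JSON roughly has a `results` list with `path`, `start.line`, and `extra.severity`/`extra.message` fields, and maps them onto the contract shown above:

```python
def semgrep_to_mcp(native: dict) -> dict:
    """Translate (assumed) Semgrep-style JSON into the MCP contract."""
    # Map the tool's severity vocabulary onto the schema's low/medium/high
    sev = {"INFO": "low", "WARNING": "medium", "ERROR": "high"}
    issues = [
        {
            "path": r["path"],
            "line": r["start"]["line"],
            "severity": sev.get(r["extra"]["severity"], "low"),
            "message": r["extra"]["message"],
        }
        for r in native.get("results", [])
    ]
    return {"tool": "semgrep", "issues": issues}
```

The wrapper around this function would simply invoke the binary, parse its stdout as JSON, and return the translated dict to the calling agent.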
## CrewAI flows – orchestrating multi‑agent reviews

A **CrewAI flow** is a directed graph of agents that execute in sequence or parallel, passing a shared `ReviewState` object. For a typical PR, the flow proceeds as follows:

1. **Context loader** – Reads repository‑specific guidelines from `contexts/defaults/` or the API payload and injects them into the state.
2. **Code agent** – Calls the Semgrep MCP wrapper, receives findings, and generates a natural‑language commentary using the LLM.
3. **Security agent** – Invokes Trivy via MCP, produces a security‑focused narrative, and flags any high‑severity vulnerabilities.
4. **Infrastructure agent** – Runs Hadolint and Checkov, then summarises Dockerfile and Kubernetes manifest concerns.
5. **Synthesiser** – Collates the three narratives into a concise summary that can be posted back to the PR platform.

The flow is defined declaratively in Python, making it straightforward to add, remove, or reorder agents for specialised use‑cases (e.g., a lightweight flow that skips security scanning for documentation‑only PRs).
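The five steps above can be pictured with a deliberately simplified stand-in for a CrewAI flow: each step is a function that takes and returns the shared state, and the flow itself is just an ordered list. Real agents would call the MCP wrappers and the LLM instead of these stubs:

```python
def load_context(state):
    state.setdefault("guidelines", "defaults")
    return state

def code_agent(state):
    state.setdefault("narratives", {})["code"] = "code review"
    return state

def security_agent(state):
    state["narratives"]["security"] = "security review"
    return state

def infra_agent(state):
    state["narratives"]["infra"] = "infra review"
    return state

def synthesise(state):
    # Collate the individual narratives into one summary
    state["summary"] = " | ".join(state["narratives"].values())
    return state

# Declarative flow definition: reorder or drop agents by editing this list
FLOW = [load_context, code_agent, security_agent, infra_agent, synthesise]

def run_flow(state: dict) -> dict:
    for step in FLOW:
        state = step(state)
    return state
```

A documentation-only variant of the flow would simply omit `security_agent` and `infra_agent` from the list.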
## Feature deep‑dive

### Code review with Semgrep

Semgrep offers pattern‑based detection of anti‑patterns, style violations, and potential bugs. By integrating it through MCP, PR Reviewer can surface issues such as missing docstrings, unsafe regex usage, or deprecated API calls. The LLM then translates raw findings into actionable suggestions, for example: “Consider renaming `fooBar` to follow PEP‑8’s snake_case convention.”

### Security review with Trivy

Trivy scans container images, filesystem layers, and IaC files for known CVEs and misconfigurations. Within PR Reviewer, Trivy runs against the PR’s Dockerfile and any referenced base images. The security agent highlights critical vulnerabilities and recommends mitigations, such as pinning a base image tag or upgrading a vulnerable library version.

### Infrastructure review with Hadolint and Checkov

Hadolint enforces best practices for Dockerfiles, while Checkov analyses Terraform, CloudFormation, and Kubernetes manifests. The infrastructure agent aggregates their findings, then the LLM produces a high‑level report that points out, for instance, missing `USER` directives in Dockerfiles or overly permissive RBAC roles in Kubernetes manifests.

### Contextual review

Beyond static analysis, PR Reviewer respects custom guidelines supplied by the repository owner. By placing markdown files like `code_review.md` in the `contexts/defaults/` directory, teams can encode style guides, security policies, or architectural principles. The context loader injects these rules into the LLM prompt, ensuring that the generated feedback aligns with the team’s expectations.

### REST API and automation

The FastAPI service exposes two primary endpoints:

- `GET /api/v1/health` – Simple health check used by orchestrators.
- `POST /api/v1/review` – Accepts a JSON payload describing the PR (metadata, changed files, optional context) and immediately returns a review identifier; the final results can be retrieved once processing completes.

The API is deliberately lightweight, enabling integration with GitHub Actions, GitLab CI, Jenkins, or any custom webhook system.
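A CI job can call the review endpoint with nothing more than the standard library. This sketch only builds the request; the base URL and payload shape are taken from the examples in this article, and any HTTP client would work equally well:

```python
import json
import urllib.request


def build_review_request(base_url: str, payload: dict) -> urllib.request.Request:
    # Prepare a POST to the review endpoint with a JSON body
    return urllib.request.Request(
        f"{base_url}/api/v1/review",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Submitting the review is then a one-liner against a running instance:
# with urllib.request.urlopen(build_review_request("http://localhost:8000",
#                                                  payload)) as resp:
#     review_id = json.load(resp)["review_id"]
```

The response-field name `review_id` in the comment is an assumption based on the article's description of the acknowledgement.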
## Installation pathways

### Local development

For developers who wish to experiment or contribute, the repository provides a `uv`‑based setup script. `uv` is a modern Python package manager that isolates dependencies efficiently. The steps are:

1. Clone the repo.
2. Install `uv` (`curl -LsSf https://astral.sh/uv/install.sh | sh`).
3. Create and activate a virtual environment (`uv venv .venv && source .venv/bin/activate`).
4. Install the package in editable mode (`uv pip install -e .`).

After configuring environment variables (see `.env.example`), the FastAPI server can be launched with `uvicorn pr_reviewer.main:app --reload`. This mode is ideal for debugging, running unit tests, or extending the codebase.

### Containerised deployment

Docker users can build and run a reproducible image with two commands:

```bash
docker build -t pr-reviewer .
docker run -p 8000:8000 --env-file .env pr-reviewer
```

The Dockerfile bundles the Python runtime, MCP wrappers, and the FastAPI server, ensuring that the service runs identically across development, staging, and production environments.
### Kubernetes orchestration

For production‑grade workloads, the `k8s/` directory supplies manifests for a secret (holding LLM credentials), a Deployment, and a Service. A typical `kubectl apply -k k8s/` will spin up three replicas behind a LoadBalancer, providing high availability and horizontal scaling. The Deployment’s `resources` block can be tuned to match the compute profile of the chosen LLM (e.g., allocating more CPU for a local model inference container).

## Configuration details

### Environment variables

Key variables include:

- `LLM_PROVIDER` – `openai`, `anthropic`, `ollama`, etc.
- `LLM_API_KEY` – Secret token for the chosen provider.
- `MCP_SEMGREP_ENDPOINT` – URL of the Semgrep MCP server.
- `MCP_TRIVY_ENDPOINT` – URL of the Trivy MCP server.

All variables are documented in `.env.example`. Sensitive values should be stored in Kubernetes secrets or a vault solution.

### Context files

The default guidelines live under `contexts/defaults/`. Teams can override any file by supplying a `context` object in the API request, which the context loader merges with the defaults. This mechanism enables per‑PR customisation without altering the repository’s source tree.
## Using the API – a practical example

Consider a PR that adds a new feature to `my-repo`. The CI pipeline can invoke the reviewer with the following payload (formatted for readability):

```json
{
  "pr_id": "123",
  "title": "Add new feature",
  "description": "Implements the user‑profile endpoint.",
  "repo": {
    "name": "my-repo",
    "url": "https://github.com/user/my-repo"
  },
  "source": {
    "branch": "feature/user-profile",
    "commit": "abc123"
  },
  "target": {
    "branch": "main",
    "commit": "def456"
  },
  "files": [
    {
      "path": "src/profile.py",
      "content": "def get_profile(user_id): ...",
      "status": "added",
      "additions": 42,
      "deletions": 0
    }
  ],
  "context": {
    "code_review": "Follow PEP8 and internal naming conventions",
    "security_review": "Check for injection and authentication bypass",
    "infra_review": "Dockerfile must use non‑root user"
  }
}
```

The service acknowledges the request with a `review_id`. Once processing finishes (typically under a minute for modest PRs), a `GET /api/v1/review/{review_id}` call returns a JSON object containing the three agent outputs and a concise summary ready to be posted as a comment on the PR.
## Real‑world scenarios

### Nightly batch reviews

Large monorepos often accumulate stale PRs that never receive human attention. By scheduling a nightly job that queries open PRs via the platform’s API and feeds them to PR Reviewer, teams can surface low‑effort fixes automatically, reducing backlog and improving code health.
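The selection step of such a nightly job could be as simple as filtering the platform's PR listing by state and last-update time. The field names (`state`, `updated_at`) mirror common PR-API responses but are assumptions, not a specific platform's schema:

```python
from datetime import datetime, timedelta, timezone


def stale_open_prs(prs: list[dict], max_age_days: int = 1) -> list[dict]:
    # PRs that are still open and untouched for longer than the
    # cutoff get queued for an automated review overnight
    cutoff = datetime.now(timezone.utc) - timedelta(days=max_age_days)
    return [
        pr for pr in prs
        if pr["state"] == "open"
        and datetime.fromisoformat(pr["updated_at"]) < cutoff
    ]
```

Each selected PR would then be converted into the review payload shown earlier and POSTed to the service.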
### Security‑first pipelines

Regulated industries (finance, healthcare) require every change to pass a security gate. Integrating the security agent as a mandatory step in the CI pipeline ensures that any high‑severity vulnerability halts the merge, while the LLM‑generated explanation aids developers in remediation.

### Teaching and onboarding

New hires can run PR Reviewer locally against their first contributions. The AI’s feedback, grounded in the team’s own guidelines, accelerates learning without overburdening senior engineers with repetitive review tasks.
## Performance considerations

While the LLM adds expressive power, it also introduces latency. Benchmarks on a mid‑range workstation (12‑core CPU, 32 GB RAM) show average end‑to‑end processing times of 30‑45 seconds per PR when using an OpenAI `gpt-4o-mini` model. Switching to a local Ollama model reduces network overhead but may increase CPU utilisation. The architecture mitigates bottlenecks by:

- Running static analysis tools in parallel.
- Caching MCP results for unchanged files across consecutive runs.
- Allowing the flow to skip agents based on PR metadata (e.g., no Dockerfile → skip infrastructure agent).

These strategies keep the service responsive even under moderate load.
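The agent-skipping rule in the last bullet reduces to a predicate over the changed file paths. This is a hedged sketch of how such a rule might look; the agent names and path heuristics are illustrative, not the project's actual logic:

```python
def select_agents(files: list[dict]) -> list[str]:
    paths = [f["path"] for f in files]
    agents = ["code"]
    # Documentation-only changes do not need a security pass
    if not all(p.endswith((".md", ".rst")) for p in paths):
        agents.append("security")
    # Only run the infrastructure agent when infra files changed
    if any(p.endswith("Dockerfile") or p.endswith((".tf", ".yaml", ".yml"))
           for p in paths):
        agents.append("infra")
    return agents
```

Skipping two of three agents on a docs-only PR saves both the static-analysis runs and the corresponding LLM calls.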
## Extending PR Reviewer

The modular design encourages community contributions. Typical extension points include:

1. **New MCP wrappers** – Add support for tools like Bandit (Python security) or ESLint (JavaScript linting).
2. **Custom agents** – Implement a “Documentation agent” that checks Markdown files for broken links or style violations.
3. **Alternative orchestration** – Replace FastAPI with a gRPC server for tighter integration with internal tooling.

Contributors should follow the existing folder layout, write unit tests under `tests/unit/`, and update the `pyproject.toml` with any new dependencies.

## Development workflow

The repository ships with a comprehensive test suite. Running `pytest` executes unit and integration tests, while `pytest --cov=src.pr_reviewer` provides coverage metrics. Code formatting is enforced with **Black**, and linting with **Flake8**. CI pipelines (defined in `.gitea/workflows/deploy.yaml`) automatically run these checks on every push, ensuring that the main branch remains stable.

## Community and contribution model

PR Reviewer is released under the MIT license, encouraging both commercial and non‑commercial use. The maintainers welcome contributions via the standard fork‑branch‑pull‑request model:

1. Fork the repository.
2. Create a feature branch (`git checkout -b feature/xyz`).
3. Implement changes and add tests.
4. Open a pull request against the upstream `main` branch.

All contributions are expected to include documentation updates, especially when new context files or MCP wrappers are added. The maintainers aim to review PRs within a week, fostering a collaborative environment.
## Future roadmap

Looking ahead, the roadmap includes:

- **Model‑agnostic prompt optimisation** – Dynamically adjust prompts based on token limits of the selected LLM.
- **Incremental review caching** – Persist MCP results across CI runs to avoid re‑scanning unchanged files.
- **Multi‑repo orchestration** – Enable a single reviewer instance to handle PRs from multiple repositories, each with its own context set.
- **Interactive UI** – A lightweight web dashboard where developers can visualise agent findings, approve suggestions, or request clarifications from the LLM.

These enhancements aim to make PR Reviewer not just a backend service but a holistic developer experience.

## Conclusion

Automating pull‑request reviews has long been a tantalising goal for DevOps teams, but practical solutions often force a trade‑off between privacy, flexibility, and depth of analysis. PR Reviewer demonstrates that a self‑hosted, multi‑agent AI system can deliver comprehensive code, security, and infrastructure feedback while honouring a team’s unique standards. By leveraging CrewAI for orchestration, MCP for tool integration, and a provider‑agnostic LLM factory, the project offers a scalable foundation that can evolve alongside emerging AI capabilities. Whether you’re looking to shave minutes off your review cycle, enforce security gates, or provide consistent onboarding guidance, PR Reviewer equips you with a production‑ready, extensible platform that respects both your code and your constraints. Give it a spin, contribute a new agent, or simply fork it to experiment—your repository’s next reviewer might just be a container away.