# Blog Creator
An automated blog generation system that uses CrewAI agents to research, write, and edit blog posts from Trilium notes.
## Architecture

The system uses three CrewAI crews orchestrated by a Flow:

- **Research Crew** - A critical researcher agent with web search capabilities investigates the topic and produces verified findings
- **Writing Crew** - Creative journalist agents, one per configured model, write draft blog articles in parallel, each with a different creative style
- **Editor Crew** - A critical editor loads the drafts into a vector database, queries for relevant context, and produces the final polished document with metadata
## Requirements

- Python 3.10 or later
- Ollama server running with the required models
- ChromaDB server for vector storage
- Trilium notes instance
- Gitea instance (for automated workflows)
- n8n instance (for notifications)
## Environment Variables

Create a `.env` file in the project root with the following variables:

```
# Trilium Configuration
TRILIUM_HOST=
TRILIUM_PORT=
TRILIUM_PROTOCOL=https
TRILIUM_PASS=
TRILIUM_TOKEN=

# Ollama Configuration
OLLAMA_PROTOCOL=http
OLLAMA_HOST=
OLLAMA_PORT=11434
EMBEDDING_MODEL=nomic-embed-text
EDITOR_MODEL=llama3.1:8b
CONTENT_CREATOR_MODELS=["phi4-mini:latest", "qwen3:1.7b", "gemma3:latest"]

# ChromaDB Configuration
CHROMA_HOST=chroma
CHROMA_PORT=8000

# Git Configuration
GIT_USER=
GIT_PASS=
GIT_PROTOCOL=https
GIT_REMOTE=git.aridgwayweb.com/armistace/blog.git

# Notification Configuration
N8N_SECRET=
N8N_WEBHOOK_URL=

# Ollama Web Search (required for researcher agent)
OLLAMA_API_KEY=
```
### `CONTENT_CREATOR_MODELS` Format

The `CONTENT_CREATOR_MODELS` variable should be a JSON array of Ollama model names. Each model is used by one journalist agent. Example:

```
CONTENT_CREATOR_MODELS=["llama3.1:8b", "qwen2.5:7b", "phi4:latest"]
```
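Because the value is a JSON array rather than a plain string, it must be parsed with `json.loads` rather than split on commas. A minimal sketch of how the variable might be loaded (the helper name and fallback default are illustrative, not the project's actual code):

```python
import json
import os

def load_content_creator_models(default: str = '["llama3.1:8b"]') -> list[str]:
    """Parse CONTENT_CREATOR_MODELS from the environment as a JSON array of strings."""
    raw = os.environ.get("CONTENT_CREATOR_MODELS", default)
    models = json.loads(raw)
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError("CONTENT_CREATOR_MODELS must be a JSON array of strings")
    return models
```

Note that a malformed value (for example, single quotes instead of double quotes) will raise a `json.JSONDecodeError`, which is worth checking first if agents fail to start.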
### `OLLAMA_API_KEY`

The researcher agent uses Ollama's native web search API. Create an API key in your Ollama account at https://ollama.com and add it to your `.env` file. Web searches are billed against your existing Ollama subscription.
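For debugging authentication issues, it can help to build the search request by hand. The sketch below constructs an authenticated request with the standard library; the endpoint URL and JSON body shape are assumptions based on Ollama's published web search API and may need adjusting:

```python
import json
import os
import urllib.request

# Assumed endpoint for Ollama's hosted web search API
OLLAMA_SEARCH_URL = "https://ollama.com/api/web_search"

def build_search_request(query: str) -> urllib.request.Request:
    """Build a POST request authenticated with OLLAMA_API_KEY (does not send it)."""
    api_key = os.environ["OLLAMA_API_KEY"]
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_SEARCH_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(build_search_request("test"))` and inspecting the status code is a quick way to confirm the key is valid before running the full crew.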
## Project Structure

```
blog_creator/
├── .env                         # Environment variables (create this)
├── .gitea/workflows/deploy.yml  # Gitea Actions workflow
├── docker-compose.yml           # Local development setup
├── requirements.txt             # Python dependencies
├── README.md                    # This file
└── src/
    ├── main.py                  # Entry point
    └── ai_generators/
        ├── ollama_md_generator.py  # Main interface (used by main.py)
        ├── blog_flow.py            # CrewAI Flow orchestrator
        ├── crews/
        │   ├── research_crew/      # Researcher agent with web search
        │   ├── writing_crew/       # Journalist agents (one per model)
        │   └── editor_crew/        # Editor agent with metadata generation
        └── tools/
```
## Local Development Setup

### Using Docker Compose

- Clone the repository and navigate to the project directory
- Create your `.env` file with all required variables
- Start the services:

  ```shell
  docker-compose up -d
  ```

  This starts:

  - `blog_creator` - The main application container
  - `chroma` - ChromaDB vector database

- The container runs `main.py` automatically on startup. To run it manually:

  ```shell
  docker-compose exec blog_creator python src/main.py
  ```
### Manual Setup (without Docker)

- Install system dependencies:

  ```shell
  apt update && apt install -y rustc cargo python-is-python3 pip python3-venv libmagic-dev git
  ```

- Create and activate a virtual environment:

  ```shell
  python -m venv .venv
  source .venv/bin/activate
  ```

- Install Python dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Configure Git:

  ```shell
  git config --global user.name "Blog Creator"
  git config --global user.email "your-email@example.com"
  git config --global push.autoSetupRemote true
  ```

- Run the application:

  ```shell
  python src/main.py
  ```
## How It Works

### Trilium Integration

The system fetches notes from Trilium that are tagged for blog creation. Each note becomes one blog post, and the note content is used as the basis for the AI-generated article.

### Blog Generation Flow

- **Research Phase** - The researcher agent investigates the topic using web search, critically evaluates claims, and produces verified findings
- **Writing Phase** - The journalist agents (one per configured model) write creative drafts in parallel, each with different temperature and `top_p` settings for variety
- **Editor Phase** - The editor:
  - Chunks and embeds all drafts into ChromaDB
  - Queries the vector database for relevant context
  - Generates the final polished document with a metadata header
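The editor's chunk-and-embed step can be sketched independently of the crew machinery. Below is a minimal word-window chunker; the chunk size and overlap are illustrative defaults, not the project's actual settings, and each resulting chunk would then be passed to ChromaDB via `collection.add(...)`:

```python
def chunk_text(text: str, chunk_words: int = 200, overlap: int = 40) -> list[str]:
    """Split a draft into overlapping word windows suitable for embedding."""
    words = text.split()
    if not words:
        return []
    step = max(chunk_words - overlap, 1)  # advance by window size minus overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # last window already covers the tail of the draft
    return chunks
```

Overlapping windows mean a sentence that straddles a chunk boundary still appears intact in at least one chunk, which tends to improve retrieval quality when the editor queries for context.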
### Output Format

Each blog post includes a metadata header followed by the markdown body:

```
Title: Designing and Building an AI Enhanced CCTV System
Date: 2026-02-02 20:00
Modified: 2026-02-02 20:00
Category: Homelab
Tags: proxmox, hardware, self host, homelab, ai_content, not_human_content
Slug: ai-enhanced-cctv
Authors: phi4-mini.ai, qwen3.ai, gemma3.ai
Summary: Home CCTV Security has become a bastion of cloud subscription awfulness. This blog describes creating your own AI enhanced system.

<full markdown blog body follows>
```
The metadata fields are generated as follows:

- **Title** - From the Trilium note title
- **Date/Modified** - Current datetime when generated
- **Category** - AI-generated single word (e.g., Homelab, DevOps, Security)
- **Tags** - AI-generated relevant tags plus `ai_content, not_human_content`
- **Slug** - AI-generated URL-friendly slug
- **Authors** - Derived from `CONTENT_CREATOR_MODELS` (model name + `.ai`)
- **Summary** - AI-generated 15-25 word summary
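The author derivation implied by the example output (strip the `:tag` suffix from each model name, then append `.ai`) can be sketched as follows; the project's actual rule may differ in edge cases:

```python
def authors_from_models(models: list[str]) -> list[str]:
    """Turn Ollama model names into author names, e.g. 'phi4-mini:latest' -> 'phi4-mini.ai'."""
    return [model.split(":", 1)[0] + ".ai" for model in models]
```

Applied to the `.env` example, `["phi4-mini:latest", "qwen3:1.7b", "gemma3:latest"]` yields the `Authors` line shown in the sample header.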
### Git Workflow

After generation, the blog post is:

- Committed to a new branch named after the slug
- Pushed to the configured Git remote
- Announced via an n8n notification to Matrix for review
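The branch-and-push step can be sketched with `subprocess`; the exact command sequence and commit message are assumptions based on the description above, not the project's actual implementation:

```python
import subprocess

def publish_post(slug: str, path: str, remote: str = "origin",
                 dry_run: bool = False) -> list[list[str]]:
    """Commit a generated post on a branch named after its slug and push it."""
    commands = [
        ["git", "checkout", "-b", slug],
        ["git", "add", path],
        ["git", "commit", "-m", f"Add blog post: {slug}"],
        ["git", "push", remote, slug],
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raise on any git failure
    return commands
```

With `push.autoSetupRemote true` (set during manual setup), the final push creates the matching remote branch automatically.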
## Gitea Actions Workflow

The `.gitea/workflows/deploy.yml` file defines an automated workflow that:

- Runs on a schedule (daily at 18:15 UTC) or on push to the master branch
- Installs all dependencies
- Creates the `.env` file from Gitea secrets and variables
- Runs the blog generation script
### Setting Up Gitea Variables

In your Gitea repository settings, configure the following:

**Variables** (Repository Settings -> Variables):

- `TRILIUM_HOST` - Your Trilium server hostname
- `TRILIUM_PORT` - Trilium port
- `TRILIUM_PROTOCOL` - http or https
- `OLLAMA_PROTOCOL` - http or https
- `OLLAMA_HOST` - Ollama server hostname
- `OLLAMA_PORT` - Ollama port (default 11434)
- `EMBEDDING_MODEL` - Embedding model name
- `EDITOR_MODEL` - Editor/Researcher model name
- `CONTENT_CREATOR_MODELS_1` through `CONTENT_CREATOR_MODELS_4` - Individual model names (the workflow joins these into an array)
- `GIT_PROTOCOL` - https or ssh
- `GIT_REMOTE` - Git repository URL
- `GIT_USER` - Git username for pushing
- `N8N_WEBHOOK_URL` - n8n webhook URL for notifications
- `CHROMA_HOST` - ChromaDB hostname
- `CHROMA_PORT` - ChromaDB port
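The join the workflow performs (numbered variables into the JSON array that `CONTENT_CREATOR_MODELS` expects) can be sketched in Python. The variable names come from this README; the helper itself and the skip-empty-slots behavior are illustrative:

```python
import json
import os

def join_model_vars(count: int = 4) -> str:
    """Join CONTENT_CREATOR_MODELS_1..N into the JSON array CONTENT_CREATOR_MODELS expects."""
    models = []
    for i in range(1, count + 1):
        value = os.environ.get(f"CONTENT_CREATOR_MODELS_{i}", "").strip()
        if value:  # skip unset slots so configuring fewer than four models still works
            models.append(value)
    return json.dumps(models)
```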
**Secrets** (Repository Settings -> Secrets):

- `TRILIUM_PASS` - Trilium password
- `TRILIUM_TOKEN` - Trilium API token
- `GIT_PASS` - Git password or personal access token
- `N8N_SECRET` - n8n webhook secret key
- `OLLAMA_API_KEY` - Ollama API key for web search
### Workflow Triggers

The workflow runs automatically when:

- A push is made to the master branch
- The scheduled cron time is reached (18:15 UTC daily)

To trigger the workflow manually, push any change to master or modify the cron schedule in `.gitea/workflows/deploy.yml`.
## Customizing Agent Behavior

Agent personalities and task instructions are defined in YAML files under `src/ai_generators/crews/*/config/`. You can modify these without changing any Python code:

- `research_crew/config/agents.yaml` - Researcher role, goal, backstory
- `research_crew/config/tasks.yaml` - Research task description
- `writing_crew/config/agents.yaml` - Journalist personalities
- `writing_crew/config/tasks.yaml` - Writing task descriptions
- `editor_crew/config/agents.yaml` - Editor role, goal, backstory
- `editor_crew/config/tasks.yaml` - Editing task and metadata format

After editing the YAML files, restart the application or container to apply the changes.
## Troubleshooting

### Ollama Connection Errors

Ensure the Ollama server is running and accessible from the blog_creator container, and check `OLLAMA_HOST` and `OLLAMA_PORT` in your `.env` file.
### ChromaDB Connection Errors

Verify that ChromaDB is running and that the `CHROMA_HOST` and `CHROMA_PORT` variables are correct. In Docker Compose, use `chroma` as the host name.
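A quick way to test reachability from inside the container is a plain TCP check with the standard library (the host and port below mirror the `.env` example; any host/port pair works):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, DNS failure, or timeout
        return False
```

Running `can_reach("chroma", 8000)` from a shell inside the `blog_creator` container distinguishes a networking problem from a ChromaDB configuration problem.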
### Ollama Web Search Errors

If the researcher agent fails with web search errors, check that `OLLAMA_API_KEY` is set correctly, and verify that your Ollama subscription is active and includes web search access.
### Empty Output

If blog posts are generated but empty, check that:

- The Ollama models are downloaded and available
- `CONTENT_CREATOR_MODELS` contains valid model names
- The timeout for model inference is sufficient (default is 30 minutes per operation)
### Git Push Failures

Verify that `GIT_USER` and `GIT_PASS` are correct and that the user has write access to the remote repository. Check that the remote URL in `GIT_REMOTE` is accessible.
## Development Notes

- The `main.py` entry point should not be modified for normal operation
- All AI generation logic lives in `src/ai_generators/`
- The Flow pattern allows easy addition of new crews or steps
- Vector database collections are named `blog_{title}_{random_id}` and persist across runs
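A plausible sketch of that collection naming scheme (the title sanitization and id length are assumptions, not the project's exact code):

```python
import random
import re
import string

def collection_name(title: str, id_length: int = 8) -> str:
    """Build a ChromaDB collection name of the form blog_{title}_{random_id}."""
    # Lowercase the title and collapse non-alphanumeric runs into underscores
    safe_title = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    random_id = "".join(random.choices(string.ascii_lowercase + string.digits, k=id_length))
    return f"blog_{safe_title}_{random_id}"
```

The random suffix keeps repeated runs on the same note from colliding, which is why collections accumulate across runs and may need periodic cleanup.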