# Blog Creator
An automated blog generation system that uses CrewAI agents to research, write, and edit blog posts from Trilium notes.
## Architecture

The system uses three CrewAI crews orchestrated by a Flow:

- **Research Crew** - A critical researcher agent with web search capabilities investigates the topic and produces verified findings
- **Writing Crew** - Creative journalist agents, one per configured model, write draft blog articles in parallel, each with a different creative style
- **Editor Crew** - A critical editor loads the drafts into a vector database, queries for relevant context, and produces the final polished document with metadata
## Requirements

- Python 3.10 or later
- Ollama server running with the required models
- ChromaDB server for vector storage
- Trilium notes instance
- Gitea instance (for automated workflows)
- n8n instance (for notifications)
## Environment Variables

Create a `.env` file in the project root with the following variables:

```
# Trilium Configuration
TRILIUM_HOST=
TRILIUM_PORT=
TRILIUM_PROTOCOL=https
TRILIUM_PASS=
TRILIUM_TOKEN=

# Ollama Configuration
OLLAMA_PROTOCOL=http
OLLAMA_HOST=
OLLAMA_PORT=11434
EMBEDDING_MODEL=nomic-embed-text
EDITOR_MODEL=llama3.1:8b
CONTENT_CREATOR_MODELS=["phi4-mini:latest", "qwen3:1.7b", "gemma3:latest"]

# ChromaDB Configuration
CHROMA_HOST=chroma
CHROMA_PORT=8000

# Git Configuration
GIT_USER=
GIT_PASS=
GIT_PROTOCOL=https
GIT_REMOTE=git.aridgwayweb.com/armistace/blog.git

# Notification Configuration
N8N_SECRET=
N8N_WEBHOOK_URL=

# Ollama Web Search (required for researcher agent)
OLLAMA_API_KEY=
```
### `CONTENT_CREATOR_MODELS` Format

The `CONTENT_CREATOR_MODELS` variable should be a JSON array of Ollama model names. Each model is used by one journalist agent. Example:

```
CONTENT_CREATOR_MODELS=["llama3.1:8b", "qwen2.5:7b", "phi4:latest"]
```
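Because the value is a JSON array rather than a plain string, it must be parsed with `json.loads` rather than split on commas. A minimal sketch of how the variable might be loaded (the helper name and fallback default are illustrative, not the project's actual code):

```python
import json
import os

def load_content_creator_models(default: str = '["llama3.1:8b"]') -> list[str]:
    """Parse CONTENT_CREATOR_MODELS from the environment as a JSON array of strings."""
    raw = os.environ.get("CONTENT_CREATOR_MODELS", default)
    models = json.loads(raw)
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError("CONTENT_CREATOR_MODELS must be a JSON array of strings")
    return models
```

Note that a malformed value (for example, single quotes instead of double quotes) will raise a `json.JSONDecodeError`, which is worth checking first if agents fail to start.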
### `OLLAMA_API_KEY`

The researcher agent uses Ollama's native web search API. Create an API key in your Ollama account at https://ollama.com and add it to your `.env` file. Web searches are billed against your existing Ollama subscription.
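For debugging authentication issues, it can help to build the search request by hand. The sketch below constructs an authenticated request with the standard library; the endpoint URL and JSON body shape are assumptions based on Ollama's published web search API and may need adjusting:

```python
import json
import os
import urllib.request

# Assumed endpoint for Ollama's hosted web search API
OLLAMA_SEARCH_URL = "https://ollama.com/api/web_search"

def build_search_request(query: str) -> urllib.request.Request:
    """Build a POST request authenticated with OLLAMA_API_KEY (does not send it)."""
    api_key = os.environ["OLLAMA_API_KEY"]
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_SEARCH_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen(build_search_request("test"))` and inspecting the status code is a quick way to confirm the key is valid before running the full crew.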
## Project Structure

```
blog_creator/
├── .env                         # Environment variables (create this)
├── .gitea/workflows/deploy.yml  # Gitea Actions workflow
├── docker-compose.yml           # Local development setup
├── requirements.txt             # Python dependencies
├── README.md                    # This file
└── src/
    ├── main.py                  # Entry point
    └── ai_generators/
        ├── ollama_md_generator.py  # Main interface (used by main.py)
        ├── blog_flow.py            # CrewAI Flow orchestrator
        ├── crews/
        │   ├── research_crew/      # Researcher agent with web search
        │   ├── writing_crew/       # Journalist agents (one per model)
        │   └── editor_crew/        # Editor agent with metadata generation
        └── tools/
```
## Local Development Setup

### Using Docker Compose

- Clone the repository and navigate to the project directory
- Create your `.env` file with all required variables
- Start the services:

  ```shell
  docker-compose up -d
  ```

  This starts:

  - `blog_creator` - The main application container
  - `chroma` - ChromaDB vector database

- The container runs `main.py` automatically on startup. To run it manually:

  ```shell
  docker-compose exec blog_creator python src/main.py
  ```
### Manual Setup (without Docker)

- Install system dependencies:

  ```shell
  apt update && apt install -y rustc cargo python-is-python3 pip python3-venv libmagic-dev git
  ```

- Create and activate a virtual environment:

  ```shell
  python -m venv .venv
  source .venv/bin/activate
  ```

- Install Python dependencies:

  ```shell
  pip install -r requirements.txt
  ```

- Configure Git:

  ```shell
  git config --global user.name "Blog Creator"
  git config --global user.email "your-email@example.com"
  git config --global push.autoSetupRemote true
  ```

- Run the application:

  ```shell
  python src/main.py
  ```
## How It Works

### Trilium Integration

The system fetches notes from Trilium that are tagged for blog creation. Each note becomes one blog post, and the note content is used as the basis for the AI-generated article.

### Blog Generation Flow

- **Research Phase** - The researcher agent investigates the topic using web search, critically evaluates claims, and produces verified findings
- **Writing Phase** - The journalist agents (one per configured model) write creative drafts in parallel, each with different temperature and `top_p` settings for variety
- **Editor Phase** - The editor:
  - Chunks and embeds all drafts into ChromaDB
  - Queries the vector database for relevant context
  - Generates the final polished document with a metadata header
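The editor's chunk-and-embed step can be sketched independently of the crew machinery. Below is a minimal word-window chunker; the chunk size and overlap are illustrative defaults, not the project's actual settings, and each resulting chunk would then be passed to ChromaDB via `collection.add(...)`:

```python
def chunk_text(text: str, chunk_words: int = 200, overlap: int = 40) -> list[str]:
    """Split a draft into overlapping word windows suitable for embedding."""
    words = text.split()
    if not words:
        return []
    step = max(chunk_words - overlap, 1)  # advance by window size minus overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_words]))
        if start + chunk_words >= len(words):
            break  # last window already covers the tail of the draft
    return chunks
```

Overlapping windows mean a sentence that straddles a chunk boundary still appears intact in at least one chunk, which tends to improve retrieval quality when the editor queries for context.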
### Output Format

Each blog post includes a metadata header followed by the markdown body:

```
Title: Designing and Building an AI Enhanced CCTV System
Date: 2026-02-02 20:00
Modified: 2026-02-02 20:00
Category: Homelab
Tags: proxmox, hardware, self host, homelab, ai_content, not_human_content
Slug: ai-enhanced-cctv
Authors: phi4-mini.ai, qwen3.ai, gemma3.ai
Summary: Home CCTV Security has become a bastion of cloud subscription awfulness. This blog describes creating your own AI enhanced system.

<full markdown blog body follows>
```
The metadata fields are generated as follows:

- **Title** - From the Trilium note title
- **Date/Modified** - Current datetime when generated
- **Category** - AI-generated single word (e.g., Homelab, DevOps, Security)
- **Tags** - AI-generated relevant tags plus `ai_content, not_human_content`
- **Slug** - AI-generated URL-friendly slug
- **Authors** - Derived from `CONTENT_CREATOR_MODELS` (model name + `.ai`)
- **Summary** - AI-generated 15-25 word summary
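The author derivation implied by the example output (strip the `:tag` suffix from each model name, then append `.ai`) can be sketched as follows; the project's actual rule may differ in edge cases:

```python
def authors_from_models(models: list[str]) -> list[str]:
    """Turn Ollama model names into author names, e.g. 'phi4-mini:latest' -> 'phi4-mini.ai'."""
    return [model.split(":", 1)[0] + ".ai" for model in models]
```

Applied to the `.env` example, `["phi4-mini:latest", "qwen3:1.7b", "gemma3:latest"]` yields the `Authors` line shown in the sample header.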
### Git Workflow

After generation, the blog post is:

- Committed to a new branch named after the slug
- Pushed to the configured Git remote
- Announced via an n8n notification to Matrix for review
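The branch-and-push step can be sketched with `subprocess`; the exact command sequence and commit message are assumptions based on the description above, not the project's actual implementation:

```python
import subprocess

def publish_post(slug: str, path: str, remote: str = "origin",
                 dry_run: bool = False) -> list[list[str]]:
    """Commit a generated post on a branch named after its slug and push it."""
    commands = [
        ["git", "checkout", "-b", slug],
        ["git", "add", path],
        ["git", "commit", "-m", f"Add blog post: {slug}"],
        ["git", "push", remote, slug],
    ]
    if not dry_run:
        for cmd in commands:
            subprocess.run(cmd, check=True)  # raise on any git failure
    return commands
```

With `push.autoSetupRemote true` (set during manual setup), the final push creates the matching remote branch automatically.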
## Gitea Actions Workflow

The `.gitea/workflows/deploy.yml` file defines an automated workflow that:

- Runs on a schedule (daily at 18:15 UTC) or on push to the master branch
- Installs all dependencies
- Creates the `.env` file from Gitea secrets and variables
- Runs the blog generation script
### Setting Up Gitea Variables

In your Gitea repository settings, configure the following:

**Variables** (Repository Settings -> Variables):

- `TRILIUM_HOST` - Your Trilium server hostname
- `TRILIUM_PORT` - Trilium port
- `TRILIUM_PROTOCOL` - http or https
- `OLLAMA_PROTOCOL` - http or https
- `OLLAMA_HOST` - Ollama server hostname
- `OLLAMA_PORT` - Ollama port (default 11434)
- `EMBEDDING_MODEL` - Embedding model name
- `EDITOR_MODEL` - Editor/Researcher model name
- `CONTENT_CREATOR_MODELS_1` through `CONTENT_CREATOR_MODELS_4` - Individual model names (the workflow joins these into an array)
- `GIT_PROTOCOL` - https or ssh
- `GIT_REMOTE` - Git repository URL
- `GIT_USER` - Git username for pushing
- `N8N_WEBHOOK_URL` - n8n webhook URL for notifications
- `CHROMA_HOST` - ChromaDB hostname
- `CHROMA_PORT` - ChromaDB port
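The join the workflow performs (numbered variables into the JSON array that `CONTENT_CREATOR_MODELS` expects) can be sketched in Python. The variable names come from this README; the helper itself and the skip-empty-slots behavior are illustrative:

```python
import json
import os

def join_model_vars(count: int = 4) -> str:
    """Join CONTENT_CREATOR_MODELS_1..N into the JSON array CONTENT_CREATOR_MODELS expects."""
    models = []
    for i in range(1, count + 1):
        value = os.environ.get(f"CONTENT_CREATOR_MODELS_{i}", "").strip()
        if value:  # skip unset slots so configuring fewer than four models still works
            models.append(value)
    return json.dumps(models)
```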
**Secrets** (Repository Settings -> Secrets):

- `TRILIUM_PASS` - Trilium password
- `TRILIUM_TOKEN` - Trilium API token
- `GIT_PASS` - Git password or personal access token
- `N8N_SECRET` - n8n webhook secret key
- `OLLAMA_API_KEY` - Ollama API key for web search
### Workflow Triggers

The workflow runs automatically when:

- A push is made to the master branch
- The scheduled cron time is reached (18:15 UTC daily)

To trigger the workflow manually, push any change to master or modify the cron schedule in `.gitea/workflows/deploy.yml`.
## Customizing Agent Behavior

Agent personalities and task instructions are defined in YAML files under `src/ai_generators/crews/*/config/`. You can modify these without changing any Python code:

- `research_crew/config/agents.yaml` - Researcher role, goal, backstory
- `research_crew/config/tasks.yaml` - Research task description
- `writing_crew/config/agents.yaml` - Journalist personalities
- `writing_crew/config/tasks.yaml` - Writing task descriptions
- `editor_crew/config/agents.yaml` - Editor role, goal, backstory
- `editor_crew/config/tasks.yaml` - Editing task and metadata format

After editing the YAML files, restart the application or container to apply the changes.
## Troubleshooting

### Ollama Connection Errors

Ensure the Ollama server is running and accessible from the blog_creator container, and check `OLLAMA_HOST` and `OLLAMA_PORT` in your `.env` file.
### ChromaDB Connection Errors

Verify that ChromaDB is running and that the `CHROMA_HOST` and `CHROMA_PORT` variables are correct. In Docker Compose, use `chroma` as the host name.
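A quick way to test reachability from inside the container is a plain TCP check with the standard library (the host and port below mirror the `.env` example; any host/port pair works):

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, DNS failure, or timeout
        return False
```

Running `can_reach("chroma", 8000)` from a shell inside the `blog_creator` container distinguishes a networking problem from a ChromaDB configuration problem.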
### Ollama Web Search Errors

If the researcher agent fails with web search errors, check that `OLLAMA_API_KEY` is set correctly, and verify that your Ollama subscription is active and includes web search access.
### Empty Output

If blog posts are generated but empty, check that:

- The Ollama models are downloaded and available
- `CONTENT_CREATOR_MODELS` contains valid model names
- The timeout for model inference is sufficient (default is 30 minutes per operation)
### Git Push Failures

Verify that `GIT_USER` and `GIT_PASS` are correct and that the user has write access to the remote repository. Check that the remote URL in `GIT_REMOTE` is accessible.
## Development Notes

- The `main.py` entry point should not be modified for normal operation
- All AI generation logic lives in `src/ai_generators/`
- The Flow pattern allows easy addition of new crews or steps
- Vector database collections are named `blog_{title}_{random_id}` and persist across runs
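A plausible sketch of that collection naming scheme (the title sanitization and id length are assumptions, not the project's exact code):

```python
import random
import re
import string

def collection_name(title: str, id_length: int = 8) -> str:
    """Build a ChromaDB collection name of the form blog_{title}_{random_id}."""
    # Lowercase the title and collapse non-alphanumeric runs into underscores
    safe_title = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    random_id = "".join(random.choices(string.ascii_lowercase + string.digits, k=id_length))
    return f"blog_{safe_title}_{random_id}"
```

The random suffix keeps repeated runs on the same note from colliding, which is why collections accumulate across runs and may need periodic cleanup.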