Blog Creator

An automated blog generation system that uses CrewAI agents to research, write, and edit blog posts from Trilium notes.

Architecture

The system uses three CrewAI crews orchestrated by a Flow:

  1. Research Crew - A critical researcher agent with web search capabilities investigates the topic and produces verified findings
  2. Writing Crew - Creative journalist agents (one per model in CONTENT_CREATOR_MODELS) write draft blog articles in parallel, each with a different creative style
  3. Editor Crew - A critical editor loads the drafts into a vector database, queries for relevant context, and produces the final polished document with metadata
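The three-phase orchestration can be pictured with plain Python. This is a conceptual sketch only (stubbed agents, standard-library thread pool); the real project wires these phases up as CrewAI crews inside a Flow:

```python
from concurrent.futures import ThreadPoolExecutor

# Stub agents standing in for the three crews.
def research(topic: str) -> str:
    return f"verified findings about {topic}"

def write_draft(style: str, findings: str) -> str:
    return f"[{style}] draft based on: {findings}"

def edit(drafts: list[str]) -> str:
    return f"final document from {len(drafts)} drafts"

def run_flow(topic: str, styles: list[str]) -> str:
    findings = research(topic)                 # 1. Research Crew
    with ThreadPoolExecutor() as pool:         # 2. Writing Crew (parallel drafts)
        drafts = list(pool.map(lambda s: write_draft(s, findings), styles))
    return edit(drafts)                        # 3. Editor Crew
```

The key structural point is that the writing phase fans out (one draft per style/model) while research and editing are single sequential steps.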

Requirements

  • Python 3.10 or later
  • Ollama server running with required models
  • ChromaDB server for vector storage
  • Trilium notes instance
  • Gitea instance (for automated workflows)
  • n8n instance (for notifications)

Environment Variables

Create a .env file in the project root with the following variables:

# Trilium Configuration
TRILIUM_HOST=
TRILIUM_PORT=
TRILIUM_PROTOCOL=https
TRILIUM_PASS=
TRILIUM_TOKEN=

# Ollama Configuration
OLLAMA_PROTOCOL=http
OLLAMA_HOST=
OLLAMA_PORT=11434
EMBEDDING_MODEL=nomic-embed-text
EDITOR_MODEL=llama3.1:8b
CONTENT_CREATOR_MODELS=["phi4-mini:latest", "qwen3:1.7b", "gemma3:latest"]

# ChromaDB Configuration
CHROMA_HOST=chroma
CHROMA_PORT=8000

# Git Configuration
GIT_USER=
GIT_PASS=
GIT_PROTOCOL=https
GIT_REMOTE=git.aridgwayweb.com/armistace/blog.git

# Notification Configuration
N8N_SECRET=
N8N_WEBHOOK_URL=

# Ollama Web Search (required for researcher agent)
OLLAMA_API_KEY=

CONTENT_CREATOR_MODELS Format

The CONTENT_CREATOR_MODELS variable must be a JSON array of Ollama model names. Each model is assigned to one journalist agent. Example:

CONTENT_CREATOR_MODELS=["llama3.1:8b", "qwen2.5:7b", "phi4:latest"]
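Because the value is a JSON array embedded in an environment variable, it should be parsed with json.loads rather than split on commas. A minimal sketch (the variable name is the real one; the helper function is illustrative):

```python
import json
import os

def load_creator_models(default: str = '["llama3.1:8b"]') -> list[str]:
    """Parse CONTENT_CREATOR_MODELS from the environment as a JSON array."""
    raw = os.environ.get("CONTENT_CREATOR_MODELS", default)
    models = json.loads(raw)
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise ValueError(f"CONTENT_CREATOR_MODELS must be a JSON array of strings, got: {raw!r}")
    return models
```

A malformed value (for example, unquoted model names) raises a JSONDecodeError here rather than silently producing a broken model list downstream.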

OLLAMA_API_KEY

The researcher agent uses Ollama's native web search API. Create an API key from your Ollama account (https://ollama.com) and add it to your .env file. This uses your existing Ollama subscription for web searches.

Project Structure

blog_creator/
├── .env                          # Environment variables (create this)
├── .gitea/workflows/deploy.yml   # Gitea Actions workflow
├── docker-compose.yml            # Local development setup
├── requirements.txt              # Python dependencies
├── README.md                     # This file
└── src/
    ├── main.py                   # Entry point
    └── ai_generators/
        ├── ollama_md_generator.py    # Main interface (used by main.py)
        ├── blog_flow.py              # CrewAI Flow orchestrator
        ├── crews/
        │   ├── research_crew/        # Researcher agent with web search
        │   ├── writing_crew/         # Journalist agents (one per model)
        │   └── editor_crew/          # Editor agent with metadata generation
        └── tools/

Local Development Setup

Using Docker Compose

  1. Clone the repository and navigate to the project directory

  2. Create your .env file with all required variables

  3. Start the services:

docker-compose up -d

This starts:

  • blog_creator - The main application container
  • chroma - ChromaDB vector database
  4. The container will run main.py automatically on startup. To run manually:
docker-compose exec blog_creator python src/main.py

Manual Setup (without Docker)

  1. Install system dependencies:
apt update && apt install -y rustc cargo python-is-python3 python3-pip python3-venv libmagic-dev git
  2. Create and activate a virtual environment:
python -m venv .venv
source .venv/bin/activate
  3. Install Python dependencies:
pip install -r requirements.txt
  4. Configure Git:
git config --global user.name "Blog Creator"
git config --global user.email "your-email@example.com"
git config --global push.autoSetupRemote true
  5. Run the application:
python src/main.py

How It Works

Trilium Integration

The system fetches notes from Trilium that are tagged for blog creation. Each note becomes one blog post. The note content is used as the basis for the AI-generated article.
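Trilium exposes its notes over the ETAPI HTTP interface, so the fetch step can be sketched with plain requests. The "#blog" label is an assumption (the README does not name the tag), and the helper names are illustrative; the ETAPI endpoints and Authorization header are Trilium's documented interface:

```python
import os

def etapi_search_url(label: str = "blog") -> str:
    """Build the Trilium ETAPI search URL from the env vars this project uses.

    The 'blog' label is an assumed tagging convention; adjust to your own.
    """
    base = (
        f"{os.environ.get('TRILIUM_PROTOCOL', 'https')}://"
        f"{os.environ.get('TRILIUM_HOST', 'localhost')}:"
        f"{os.environ.get('TRILIUM_PORT', '8080')}"
    )
    return f"{base}/etapi/notes?search=%23{label}"  # %23 == '#'

def fetch_note_content(note_id: str) -> str:
    """Fetch one note's body via ETAPI (requires TRILIUM_TOKEN)."""
    import requests  # deferred so the pure helper above has no dependency
    base = etapi_search_url().split("/etapi/")[0]
    resp = requests.get(
        f"{base}/etapi/notes/{note_id}/content",
        headers={"Authorization": os.environ["TRILIUM_TOKEN"]},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.text
```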

Blog Generation Flow

  1. Research Phase - The researcher agent investigates the topic using web search, critically evaluates claims, and produces verified findings

  2. Writing Phase - The journalist agents write creative drafts in parallel, each with different temperature and top_p settings for variety

  3. Editor Phase - The editor:

    • Chunks and embeds all drafts into ChromaDB
    • Queries the vector database for relevant context
    • Generates the final polished document with metadata header
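The chunk-embed-query part of the editor phase can be sketched against the ChromaDB Python client. This is an illustrative sketch, not the project's implementation: the chunk sizes, collection naming, and host/port literals are assumptions, and it relies on Chroma's default embedding function rather than the project's configured EMBEDDING_MODEL (nomic-embed-text via Ollama):

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split a draft into overlapping chunks for embedding."""
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text), 1), step):
        chunk = text[start:start + size]
        if chunk:
            chunks.append(chunk)
    return chunks

def index_and_query(drafts: list[str], question: str, collection_name: str) -> list[str]:
    """Embed draft chunks into ChromaDB, then pull back relevant context."""
    import chromadb  # deferred import; requires a running Chroma server
    client = chromadb.HttpClient(host="chroma", port=8000)
    collection = client.get_or_create_collection(collection_name)
    for d_idx, draft in enumerate(drafts):
        chunks = chunk_text(draft)
        collection.add(
            documents=chunks,
            ids=[f"draft{d_idx}-chunk{c}" for c in range(len(chunks))],
        )
    result = collection.query(query_texts=[question], n_results=5)
    return result["documents"][0]
```

The overlap between chunks keeps sentences that straddle a chunk boundary retrievable from at least one chunk.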

Output Format

Each blog post includes a metadata header followed by the markdown body:

Title: Designing and Building an AI Enhanced CCTV System
Date: 2026-02-02 20:00
Modified: 2026-02-02 20:00
Category: Homelab
Tags: proxmox, hardware, self host, homelab, ai_content, not_human_content
Slug: ai-enhanced-cctv
Authors: phi4-mini.ai, qwen3.ai, gemma3.ai
Summary: Home CCTV Security has become a bastion of cloud subscription awfulness. This blog describes creating your own AI enhanced system.

<full markdown blog body follows>

The metadata fields are generated as follows:

  • Title - From the Trilium note title
  • Date/Modified - Current datetime when generated
  • Category - AI-generated single word (e.g., Homelab, DevOps, Security)
  • Tags - AI-generated relevant tags plus ai_content, not_human_content
  • Slug - AI-generated URL-friendly slug
  • Authors - Derived from CONTENT_CREATOR_MODELS (model name + .ai)
  • Summary - AI-generated 15-25 word summary
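Assembling the fields above into the header is mechanical once the AI-generated values exist. A sketch (the function name is illustrative; the author derivation follows the "model name + .ai" rule, dropping the tag after the colon so phi4-mini:latest becomes phi4-mini.ai as in the example):

```python
from datetime import datetime

def build_metadata_header(title: str, category: str, tags: list[str],
                          slug: str, models: list[str], summary: str) -> str:
    """Assemble the metadata header in the format shown above."""
    now = datetime.now().strftime("%Y-%m-%d %H:%M")
    # Authors: model name before any ':' tag, with '.ai' appended.
    authors = ", ".join(f"{m.split(':')[0]}.ai" for m in models)
    # The two fixed disclosure tags are always appended.
    all_tags = ", ".join(tags + ["ai_content", "not_human_content"])
    return "\n".join([
        f"Title: {title}",
        f"Date: {now}",
        f"Modified: {now}",
        f"Category: {category}",
        f"Tags: {all_tags}",
        f"Slug: {slug}",
        f"Authors: {authors}",
        f"Summary: {summary}",
    ])
```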

Git Workflow

After generation, the blog post is:

  1. Committed to a new branch named after the slug
  2. Pushed to the configured Git remote
  3. Announced for review via an n8n notification to Matrix
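The git steps are roughly equivalent to the commands below; the commit message wording and file path are illustrative, and only the slug-named branch is confirmed by the workflow described above:

```python
import subprocess

def git_publish_commands(slug: str, post_path: str) -> list[list[str]]:
    """The publish steps as argument lists; the branch is named after the slug."""
    return [
        ["git", "checkout", "-b", slug],
        ["git", "add", post_path],
        ["git", "commit", "-m", f"Add blog post: {slug}"],
        ["git", "push", "origin", slug],
    ]

def publish(slug: str, post_path: str) -> None:
    """Run the publish steps, stopping on the first failure."""
    for cmd in git_publish_commands(slug, post_path):
        subprocess.run(cmd, check=True)
```

Building the commands as argument lists (rather than shell strings) avoids quoting problems when a slug or path contains unusual characters.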

Gitea Actions Workflow

The .gitea/workflows/deploy.yml file defines an automated workflow that:

  • Runs on a schedule (daily at 18:15 UTC) or on push to master branch
  • Installs all dependencies
  • Creates the .env file from Gitea secrets and variables
  • Runs the blog generation script

Setting Up Gitea Variables

In your Gitea repository settings, configure the following:

Variables (Repository Settings -> Variables):

  • TRILIUM_HOST - Your Trilium server hostname
  • TRILIUM_PORT - Trilium port
  • TRILIUM_PROTOCOL - http or https
  • OLLAMA_PROTOCOL - http or https
  • OLLAMA_HOST - Ollama server hostname
  • OLLAMA_PORT - Ollama port (default 11434)
  • EMBEDDING_MODEL - Embedding model name
  • EDITOR_MODEL - Editor/Researcher model name
  • CONTENT_CREATOR_MODELS_1 through CONTENT_CREATOR_MODELS_4 - Individual model names (the workflow joins these into an array)
  • GIT_PROTOCOL - https or ssh
  • GIT_REMOTE - Git repository URL
  • GIT_USER - Git username for pushing
  • N8N_WEBHOOK_URL - n8n webhook URL for notifications
  • CHROMA_HOST - ChromaDB hostname
  • CHROMA_PORT - ChromaDB port
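The workflow's join of the numbered model variables into the JSON array the application expects can be pictured as follows (the actual join happens in the workflow YAML; this Python sketch shows the intended transformation, skipping any unset slots):

```python
import json
import os

def join_creator_models(count: int = 4) -> str:
    """Collect CONTENT_CREATOR_MODELS_1..N and emit the JSON array string."""
    models = []
    for i in range(1, count + 1):
        value = os.environ.get(f"CONTENT_CREATOR_MODELS_{i}", "").strip()
        if value:
            models.append(value)
    return json.dumps(models)
```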

Secrets (Repository Settings -> Secrets):

  • TRILIUM_PASS - Trilium password
  • TRILIUM_TOKEN - Trilium API token
  • GIT_PASS - Git password or personal access token
  • N8N_SECRET - n8n webhook secret key
  • OLLAMA_API_KEY - Ollama API key for web search

Workflow Triggers

The workflow runs automatically when:

  • A push is made to the master branch
  • The scheduled cron time is reached (18:15 UTC daily)

To trigger manually, push any change to master or modify the cron schedule in .gitea/workflows/deploy.yml.

Customizing Agent Behavior

Agent personalities and task instructions are defined in YAML files under src/ai_generators/crews/*/config/. You can modify these without changing Python code:

  • research_crew/config/agents.yaml - Researcher role, goal, backstory
  • research_crew/config/tasks.yaml - Research task description
  • writing_crew/config/agents.yaml - Journalist agent personalities
  • writing_crew/config/tasks.yaml - Writing task descriptions
  • editor_crew/config/agents.yaml - Editor role, goal, backstory
  • editor_crew/config/tasks.yaml - Editing task and metadata format

After editing YAML files, restart the application or container to apply changes.

Troubleshooting

Ollama Connection Errors

Ensure the Ollama server is running and accessible from the blog_creator container. Check OLLAMA_HOST and OLLAMA_PORT in your .env file.

ChromaDB Connection Errors

Verify ChromaDB is running and the CHROMA_HOST and CHROMA_PORT variables are correct. In Docker Compose, use chroma as the host name.

Ollama Web Search Errors

If the researcher agent fails with web search errors, check that OLLAMA_API_KEY is set correctly. Verify your Ollama subscription is active and has web search access.

Empty Output

If blog posts are generated but empty, check:

  • Ollama models are downloaded and available
  • CONTENT_CREATOR_MODELS contains valid model names
  • Sufficient timeout for model inference (default is 30 minutes per operation)

Git Push Failures

Verify GIT_USER and GIT_PASS are correct and the user has write access to the remote repository. Check that the remote URL in GIT_REMOTE is accessible.

Development Notes

  • The main.py entry point should not be modified for normal operation
  • All AI generation logic is in src/ai_generators/
  • The Flow pattern allows easy addition of new crews or steps
  • Vector database collections are named blog_{title}_{random_id} and persist across runs