AutoBlog: Automating Technical Blog Posts from GitHub Activity
Project Overview
AutoBlog is a Cloudflare‑Worker‑based service that turns GitHub repository activity into ready‑to‑publish technical blog posts.
Instead of manually writing a post after every release or major change, AutoBlog watches commits, extracts the relevant code diffs and README content, and produces a well‑structured markdown article. The generated article is opened as a pull request against a Hugo‑based blog repository, where a human can review and merge it.
Why it matters:
- Developers spend a lot of time writing documentation and blog posts that could be derived from the same source code they already maintain.
- Consistent, high‑quality content improves knowledge sharing, project visibility, and onboarding for open‑source projects.
- Automating the first draft frees developers to focus on the actual engineering work while still keeping the community informed.
Key Features
| Feature | Benefit |
|---|---|
| Automated Content Generation | Turns commits and README files into a complete markdown blog post. |
| Two‑Pass LLM System | First a “draft” model creates the narrative, then a “review” model validates technical accuracy and removes hallucinations. |
| GitHub Webhook & Cron Triggers | Real‑time generation on push events or scheduled daily runs (6 AM UTC). |
| Secrets Filtering | Scans diffs and README for API keys, tokens, or other sensitive data and redacts them before any LLM call. |
| Deduplication | Stores processed commit SHAs in a Cloudflare D1 SQLite database to avoid duplicate posts. |
| PR‑Based Publishing | Generates a markdown file in a Hugo blog repo and opens a pull request, giving a human a final safety net. |
| Configurable Thresholds | Minimum number of commits and minimum days between posts are configurable via environment variables. |
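The two thresholds in the last row can be parsed from the Worker's environment with fallback defaults; a minimal sketch (the helper name and default values are illustrative, the variable names come from the table):

```typescript
// Hypothetical helper: parse posting thresholds from the Worker environment.
// Variable names follow the feature table; the defaults are illustrative.
export interface Thresholds {
  minCommits: number;
  minDaysBetweenPosts: number;
}

export function readThresholds(
  env: Record<string, string | undefined>
): Thresholds {
  return {
    minCommits: Number(env.MIN_COMMITS_FOR_POST ?? "3"),
    minDaysBetweenPosts: Number(env.MIN_DAYS_BETWEEN_POSTS ?? "7"),
  };
}
```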
Architecture Overview
+----------------+ +----------------+ +-----------------+
| GitHub (push) | ---> | Cloudflare | ---> | Ollama LLMs |
| / Cron / API | | Worker (Trigger| | (draft + review)|
+----------------+ | + Pipeline) | +-----------------+
+--------+-------+
|
v
+----------------+
| Cloudflare D1 |
| (SQLite) |
+--------+-------+
|
v
+----------------+
| Hugo Blog Repo |
| (PR creation) |
+----------------+

- Triggers – Webhook, daily cron, or manual API call.
- Worker – Orchestrates the pipeline, stores state in D1, and talks to the LLM service.
- LLMs – First pass (`gemma3:27b-cloud`) drafts the post; second pass (`gpt-oss:120b-cloud`) reviews it.
- Output – A markdown file is added to the Hugo blog repo via a pull request; GitHub Actions builds the site on Cloudflare Pages.
How It Works
1. Trigger Detection
The worker receives one of three triggers:
- Webhook – GitHub POSTs a `push` event to `/webhook`. The payload's HMAC signature is verified with `GITHUB_WEBHOOK_SECRET`.
- Cron – A scheduled request hits `/cron` at 06:00 UTC.
- Manual API – A `POST /trigger/{owner}/{repo}` call initiates processing on demand.

If the configured thresholds (`MIN_COMMITS_FOR_POST`, `MIN_DAYS_BETWEEN_POSTS`) are met, the pipeline continues.
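Webhook signature verification can be sketched as follows. In the Worker itself this would use the Web Crypto API; `node:crypto` is used here for a compact, self-contained illustration, and `verifySignature` is a hypothetical helper, not code from the repo:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the expected `X-Hub-Signature-256` value for the raw request body
// and compare it to the received header in constant time.
export function verifySignature(
  secret: string,
  rawBody: string,
  signatureHeader: string
): boolean {
  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  return a.length === b.length && timingSafeEqual(a, b);
}
```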
2. GitHub Data Retrieval
The worker uses a thin GitHub client wrapper around Octokit. Example:

```typescript
// services/github/client.ts
import { Octokit } from "octokit";

export class GitHubClient {
  private octokit: Octokit;

  constructor(private token: string) {
    this.octokit = new Octokit({ auth: token });
  }

  async getCommitDiff(
    owner: string,
    repo: string,
    sha: string
  ): Promise<string> {
    // `getCommit` takes the SHA via `ref`; each changed file carries its
    // own unified diff in `patch`.
    const { data } = await this.octokit.rest.repos.getCommit({
      owner,
      repo,
      ref: sha,
    });
    return (data.files ?? []).map((f) => f.patch ?? "").join("\n");
  }

  async getReadme(owner: string, repo: string): Promise<string> {
    const { data } = await this.octokit.rest.repos.getReadme({
      owner,
      repo,
    });
    return Buffer.from(data.content, "base64").toString("utf-8");
  }
}
```

The client fetches the README and the diff patches for all new commits that have not yet been processed.
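Deduplication then reduces this list to commits that are not yet recorded in D1. The database lookup itself is a simple `SELECT sha FROM processed_commits`; the filtering step can be sketched as a pure function (the table and helper names are illustrative):

```typescript
// Hypothetical dedup helper: `processed` holds SHAs already recorded in the
// D1 `processed_commits` table; only unseen commits continue through the
// pipeline, so the same push never produces two posts.
export function selectUnprocessed(
  candidates: string[],
  processed: Set<string>
): string[] {
  return candidates.filter((sha) => !processed.has(sha));
}
```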
3. Data Filtering
Diff Filter
Only source files that match the project's inclusion patterns are kept:

```typescript
// services/github/diff-filter.ts
export function filterDiffs(patch: string, include: RegExp[]): string {
  const lines = patch.split("\n");
  const filtered: string[] = [];
  let keep = false;
  for (const line of lines) {
    if (line.startsWith("diff --git")) {
      // Headers have the form: diff --git a/<path> b/<path>
      const path = line.split(" b/")[1] ?? "";
      keep = include.some((rx) => rx.test(path));
    }
    if (keep) filtered.push(line);
  }
  return filtered.join("\n");
}
```

Secrets Filter
A simple regex‑based redactor removes anything that looks like a secret:

```typescript
// services/secrets-filter.ts
export function redactSecrets(text: string): string {
  const patterns = [
    /ghp_[A-Za-z0-9_]{36}/g, // GitHub PAT
    /(?:aws|gcp|azure)[-_]access[_-]?key[^\n]*/gi,
    /(?:api|secret)[-_]key[^\n]*/gi,
  ];
  let redacted = text;
  for (const rx of patterns) {
    redacted = redacted.replace(rx, "[REDACTED]");
  }
  return redacted;
}
```

Both the README and the filtered diffs pass through `redactSecrets` before any LLM request.
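As a quick illustration of the redactor's behavior, the snippet below re-declares two of the patterns so it runs on its own; the sample token is synthetic:

```typescript
// Self-contained demo of regex-based secret redaction, mirroring the
// patterns shown above.
export function redactDemo(text: string): string {
  const patterns = [
    /ghp_[A-Za-z0-9_]{36}/g, // GitHub PAT
    /(?:api|secret)[-_]key[^\n]*/gi,
  ];
  let redacted = text;
  for (const rx of patterns) {
    redacted = redacted.replace(rx, "[REDACTED]");
  }
  return redacted;
}
```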
4. Two‑Pass LLM Generation
The content generator builds prompts for each pass and calls the Ollama API:

```typescript
// services/ollama/generator.ts
import { OllamaClient } from "./ollama-client";

export class ContentGenerator {
  constructor(
    private client: OllamaClient,
    private draftModel: string,
    private reviewModel: string
  ) {}

  async generatePost(readme: string, diffs: string): Promise<string> {
    const draftPrompt = `
You are a technical writer. Using the following README and code diffs, draft a concise blog post that explains:
- What the project does
- The key changes introduced by the commits
- Example code snippets (preserve formatting)

README:
${readme}

Diffs:
${diffs}
`;
    const draft = await this.client.generate(draftPrompt, this.draftModel);

    const reviewPrompt = `
You are a senior engineer reviewing a draft blog post. Verify technical accuracy, correct any hallucinations, and improve code examples. Return only the revised markdown.

Draft:
${draft}
`;
    const final = await this.client.generate(reviewPrompt, this.reviewModel);
    return final;
  }
}
```

The first model (`gemma3:27b-cloud`) creates a readable draft; the second model (`gpt-oss:120b-cloud`) polishes it.
5. Publishing Flow
After the final markdown is produced, the worker creates a new branch, adds the file, and opens a PR:

```typescript
// services/github/pr-creator.ts
// Assumes GitHubClient exposes its underlying Octokit instance.
export async function createPostPR(
  client: GitHubClient,
  owner: string,
  repo: string,
  markdown: string,
  slug: string
) {
  const branch = `autoblog/${slug}-${Date.now()}`;

  // 1️⃣ Create branch from default
  const { data: ref } = await client.octokit.rest.git.getRef({
    owner,
    repo,
    ref: "heads/main",
  });
  await client.octokit.rest.git.createRef({
    owner,
    repo,
    ref: `refs/heads/${branch}`,
    sha: ref.object.sha,
  });

  // 2️⃣ Add markdown file
  const path = `content/posts/${slug}.md`;
  await client.octokit.rest.repos.createOrUpdateFileContents({
    owner,
    repo,
    path,
    message: `auto: add blog post ${slug}`,
    content: Buffer.from(markdown).toString("base64"),
    branch,
  });

  // 3️⃣ Open PR
  await client.octokit.rest.pulls.create({
    owner,
    repo,
    title: `auto: new blog post – ${slug}`,
    head: branch,
    base: "main",
    body: "Generated by AutoBlog. Please review before merging.",
  });
}
```

A human reviewer can then merge the PR; the Hugo GitHub Action builds the site and Cloudflare Pages serves the updated blog.
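The `slug` argument above has to be URL- and filename-safe. One way to derive it from a post title (a sketch; `slugify` is a hypothetical helper, not part of the repo):

```typescript
// Hypothetical helper: turn a post title into a URL-safe slug for the
// `content/posts/<slug>.md` path.
export function slugify(title: string): string {
  return title
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-") // collapse runs of non-alphanumerics
    .replace(/^-+|-+$/g, ""); // strip leading/trailing hyphens
}
```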
Getting Started
Clone the repository and install dependencies:
BASHgit clone https://github.com/xxdesmus/autoblog.git cd autoblog npm ciConfigure environment variables (example
.dev.varsforwrangler):ENVDB=... # D1 database binding GITHUB_PAT=ghp_XXXXXXXXXXXXXXXXXXXXXXXXXXXX GITHUB_WEBHOOK_[REDACTED] OLLAMA_API_URL=https://api.ollama.com/v1
[REDACTED] OLLAMA_DRAFT_MODEL=gemma3:27b-cloud OLLAMA_REVIEW_MODEL=gpt-oss:120b-cloud HUGO_REPO=xxdesmus/autoblog-blog MIN_DAYS_BETWEEN_POSTS=7 MIN_COMMITS_FOR_POST=3
3. **Deploy the worker** (requires a Cloudflare account with Workers and D1 enabled):
```bash
npx wrangler publishSet up GitHub webhook on any repository you want to monitor, pointing to
https://<worker>.workers.dev/webhookand using the secret defined above.Trigger manually (optional) to test the flow:
BASHcurl -X POST "https://<worker>.workers.dev/trigger/octocat/Hello-World"
After the PR is merged, the Hugo site rebuilds automatically and the new post appears under https://blog.codenow.dev.
Recent Developments
- Anti‑Hallucination Prompts (commit 98f04f8) – Updated the draft and review prompts to explicitly ask the LLM to cite only the provided diffs and README, dramatically reducing fabricated code snippets.
- URL Structure Update (commit 6d8f647) – Blog posts are now written to `posts/<year>/<month>/<day>/<slug>/` for better SEO and navigation.
- Added `CLAUDE.md` (commit 74f8be5) – Provides a concise, human‑readable overview of the project architecture and goals.
These refinements improve content reliability and make the generated site easier to browse.
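The dated URL structure can be produced with a small path builder (a sketch; the helper name is illustrative):

```typescript
// Hypothetical helper: build the dated Hugo path introduced by the URL
// structure update, e.g. posts/2025/01/05/my-post/.
export function postPath(date: Date, slug: string): string {
  const pad = (n: number) => String(n).padStart(2, "0");
  const y = date.getUTCFullYear();
  const m = pad(date.getUTCMonth() + 1);
  const d = pad(date.getUTCDate());
  return `posts/${y}/${m}/${d}/${slug}/`;
}
```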
Conclusion
AutoBlog demonstrates how a modest amount of infrastructure—Cloudflare Workers, a lightweight SQLite store, and two well‑prompted LLM calls—can turn routine development activity into polished technical communications. By automating the first draft and enforcing a human review step, the project balances speed with accuracy, keeping sensitive data safe while delivering high‑quality blog posts.
If you maintain an open‑source project or a team‑run codebase, give AutoBlog a try: let your commits speak for themselves, and let the world read about the work you’re already doing.