Initial system state: Claude Code CLI installed, research documents accumulating. After three weeks of development, the system contained 37 workflow documents and zero blog posts published.
Analysis revealed the constraint was not technical capability but process structure.
The research-to-blog pipeline evolved into a self-improving system through iterative problem-solving rather than deliberate design. This case study documents the emergent architecture and measurable outcomes.
The Pattern: Problem → Research → Solution → Documentation
Analysis of 15 improvement cycles reveals a consistent pattern: system failures trigger research, research yields solutions, and solutions, once documented, prevent recurrence.
Note: Code examples in this post are simplified conceptual implementations that illustrate the problem-solving approach.
Example 1: The Duplicate PDF Problem
Problem Identified: Duplicate processing of research PDFs across multiple directories with variant filenames.
The quick fix of checking filenames didn’t scale — “AI-research-2024.pdf” and “ai_research_final_v2.pdf” contained identical content. The research-driven solution used SHA-256 content hashing for deduplication, which then expanded into a full intake pipeline with metadata extraction, automatic routing, and standardized naming.
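As a conceptual sketch of the content-hashing approach (function names and directory layout here are illustrative, not the system's actual code), deduplication keys on what a file contains rather than what it is called:

```python
import hashlib
from pathlib import Path

def file_digest(path: Path, chunk_size: int = 65536) -> str:
    """Hash file contents in chunks so large PDFs never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group PDFs under root by content hash; any group with >1 path is a duplicate set."""
    by_hash: dict[str, list[Path]] = {}
    for pdf in root.rglob("*.pdf"):
        by_hash.setdefault(file_digest(pdf), []).append(pdf)
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```

With this, "AI-research-2024.pdf" and "ai_research_final_v2.pdf" hash identically and collapse into one duplicate group, regardless of filename.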
Example 2: Citation Verification Bottleneck
Problem Identified: Manual verification of research claims required 3-5 minutes per claim. With 50+ claims per document, verification consumed 4+ hours per research paper.
The solution evolved through three stages: from manual markdown checklists, to batch processing with structured templates, to a tiered verification system that classifies claims by priority — immediate verification for blog-bound claims, background verification for academic sources, optional for tutorials and opinions.
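The routing rule behind the tiered stage can be sketched as follows (tier names and claim fields are hypothetical simplifications of the described behavior):

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    IMMEDIATE = 1   # claim is bound for a blog post: verify before publishing
    BACKGROUND = 2  # academic source: verify asynchronously
    OPTIONAL = 3    # tutorial or opinion content: verify only if questioned

@dataclass
class Claim:
    text: str
    source_type: str   # e.g. "academic", "tutorial", "opinion"
    blog_bound: bool

def classify(claim: Claim) -> Tier:
    """Route a claim to a verification tier by destination, then by source type."""
    if claim.blog_bound:
        return Tier.IMMEDIATE
    if claim.source_type == "academic":
        return Tier.BACKGROUND
    return Tier.OPTIONAL
```

The point of the design is that verification effort follows priority: the 4-hour flat cost per paper becomes a small immediate cost plus deferred background work.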
Example 3: Insight Synthesis Failure
Problem Identified: Research documents accumulated in topic folders without pattern synthesis or insight generation.
The solution: dynamic insight tracking where each new research document triggers updates to existing pattern files. Understanding evolved incrementally — from “feedback loops seem important” (2 documents) to “the system that documents itself, improves itself” (12 documents) — with explicit evidence for and against each insight.
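A minimal model of that evidence-accumulation mechanism might look like this (the data structure and maturity metric are illustrative assumptions, not the system's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Insight:
    statement: str
    evidence_for: list[str] = field(default_factory=list)
    evidence_against: list[str] = field(default_factory=list)

    @property
    def maturity(self) -> int:
        """Net supporting evidence: a simple proxy for how settled the insight is."""
        return len(self.evidence_for) - len(self.evidence_against)

def update_insight(insight: Insight, doc: str, supports: bool) -> None:
    """Each new research document updates the existing pattern file for its insight."""
    target = insight.evidence_for if supports else insight.evidence_against
    target.append(doc)
```

Because evidence against is recorded alongside evidence for, an insight's statement can be strengthened, weakened, or rewritten as documents accumulate.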
Emergent System Architecture
The architecture developed through evolutionary pressure rather than deliberate design. Each component exists as a response to specific failures:
- Research Intake Pipeline — Deduplication, extraction, classification, and routing
- Dynamic Insight System — Pattern matching, evidence accumulation, maturity tracking
- Verification Pipeline — Four-tier system from immediate to optional
- Blog Seed Generation — Automatic promotion when insights reach maturity
- Feedback Integration — SESSION_LOG.md captures friction, drives improvements
System Emergence Principles
The self-improvement capability derives from three documented principles:
- Problem documentation at time of occurrence
- Evidence-based solution research from verified sources
- Automation of validated solutions through scripting
The effectiveness stems from consistent application of this feedback loop rather than architectural complexity.
Measurable Outcomes
Initial objective: Build technical blog. Actual outcome: Self-improving research system with duplicate elimination via content hashing, automated citation extraction, dynamic insight evolution across documents, systematic claim verification, and 37 workflows capturing iterative improvements.
The blog implementation remains secondary to the research system that enables it.
Implementation recommendation: Begin with SESSION_LOG.md documentation. Record one problem. Develop one solution. Enable systematic iteration. The architecture emerges through consistent application of the feedback loop.
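The starting point can be as small as a script that appends one entry per friction point (the entry format shown is an assumption; any consistent structure works):

```python
from datetime import date
from pathlib import Path

def log_friction(problem: str, solution: str = "TBD",
                 log: Path = Path("SESSION_LOG.md")) -> None:
    """Append one dated problem/solution pair to SESSION_LOG.md."""
    entry = (
        f"\n## {date.today().isoformat()}\n"
        f"- Problem: {problem}\n"
        f"- Solution: {solution}\n"
    )
    with log.open("a", encoding="utf-8") as f:
        f.write(entry)
```

Recording the problem at the moment it occurs is the whole trick; solutions and automation follow from rereading the log.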
What systems have emerged from your workflow friction?