name: paper-illustration
description: Generate publication-quality academic paper illustrations from scientific content. 6 figure types: architecture diagrams, data plots, conceptual illustrations, process flows, comparison charts, result visualizations. Outputs AI-generated images (via basic-image-gen) or reproducible code (matplotlib/mermaid/TikZ). Academic style compliance (NeurIPS/ICML/ACL/IEEE).

USE WHEN: User wants to create academic figures, paper illustrations, or scientific diagrams from vault notes or raw content.

Trigger phrases: 论文插图 (paper illustration), 学术配图 (academic figure), paper figure, scientific illustration, 架构图 (architecture diagram), 实验结果图 (experimental results figure), ablation figure, generate figures for paper, 数据可视化 (data visualization), 流程图 (flowchart), 对比图 (comparison chart), 概念图 (concept diagram).

DON'T USE WHEN:

  • User wants decorative blog illustrations → use baoyu-article-illustrator.
  • User wants social media cover images → use baoyu-cover-image.
  • User wants infographics for Xiaohongshu → use baoyu-xhs-images.
  • User wants general AI image generation → use basic-image-gen directly.
  • User says "封面" (cover), "配图" (illustration), or "社媒" (social media) without academic context → not this skill.

Paper-Illustration: Academic Figure Generation Skill

Purpose

Generate publication-quality figures from scientific content with reproducible code output. Every data-driven figure must be reproducible from code; every conceptual figure must be visually precise.

Core principle: code-first, with AI image generation as the fallback. Data figures (plots, charts) always go through matplotlib/seaborn code generation, which guarantees text accuracy; conceptual figures go through basic-image-gen, since AI handles creative visuals well.

Routing Rules (Use when / Don't use when)

Use this skill when:

  • User wants academic-quality figures for papers, presentations, or technical reports
  • User says "论文插图" (paper illustration), "paper figure", "实验结果图" (experimental results figure), or "ablation study plot"
  • Content involves data visualization, system architecture, or scientific processes
  • User needs reproducible figure code (matplotlib, mermaid, TikZ)

Do NOT use this skill when:

  • User wants decorative blog/social illustrations → baoyu-article-illustrator
  • User wants cover images for articles → baoyu-cover-image
  • User wants infographics for social media → baoyu-xhs-images or baoyu-infographic
  • User wants general AI images without academic context → basic-image-gen
  • Content has no scientific/technical structure to visualize

Edge cases:

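  • "帮我画个系统架构图" (draw me a system architecture diagram) with paper context → this skill (academic architecture diagram)
  • "帮我画个系统架构图" (draw me a system architecture diagram) for a blog post → not this skill (use baoyu-article-illustrator)
  • "生成实验对比图" (generate an experiment comparison chart) → this skill (data-driven, needs reproducible code)
  • "做个信息图" (make an infographic) → not this skill (use baoyu-infographic)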

Input Requirements

Required:

  • source: Vault note path OR raw text content describing the figure subject
  • figure_type: One of 6 types (auto-detected if omitted, see Decision Tree below)

Optional:

  • venue: Academic venue style — neurips (default) | icml | acl | ieee | nature
  • output_format: code (default for data types) | image (default for conceptual) | both
  • slug: Asset slug for vault storage (derived from source note if omitted)
  • caption: Figure caption (generated if omitted)
  • data: Inline data dict/CSV for data-driven figures
  • colorblind_mode: true | false (default: true, which uses a colorblind-safe palette)
  • subfigures: Number of subfigures (for multi-panel layouts)
  • dpi: Output resolution (default: 300, minimum: 150)
  • language: Label language — en (default) | zh-CN

Figure Type Decision Tree

Source content analysis
    │
    ├─ Contains numeric data / experimental results?
    │   ├─ Comparing methods/models? ─────────────── comparison-chart
    │   ├─ Showing trends / distributions? ────────── data-plot
    │   └─ Showing ablation / metrics table? ──────── result-visualization
    │
    ├─ Describes a system / model architecture?
    │   └─ Has components + connections? ──────────── architecture-diagram
    │
    ├─ Describes a sequential process / pipeline?
    │   └─ Has ordered steps + decisions? ─────────── process-flow
    │
    └─ Describes an abstract concept / metaphor?
        └─ Needs creative visual representation? ──── conceptual
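
The tree above can be read as an ordered series of checks. The sketch below assumes the content analysis has already been reduced to boolean signals; `ContentSignals` and `classify` are hypothetical names, and how each signal is detected is left open. Falling through to the later branches when numeric data matches no sub-branch is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContentSignals:  # hypothetical: one flag per question in the tree
    has_numeric_data: bool = False
    compares_methods: bool = False
    shows_trends: bool = False
    shows_ablation: bool = False
    has_components_and_connections: bool = False
    has_ordered_steps: bool = False
    is_abstract_concept: bool = False


def classify(s: ContentSignals) -> Optional[str]:
    """Walk the tree top to bottom; return None when no branch matches."""
    if s.has_numeric_data:
        if s.compares_methods:
            return "comparison-chart"
        if s.shows_trends:
            return "data-plot"
        if s.shows_ablation:
            return "result-visualization"
    if s.has_components_and_connections:
        return "architecture-diagram"
    if s.has_ordered_steps:
        return "process-flow"
    if s.is_abstract_concept:
        return "conceptual"
    return None  # ambiguous: present options and ask the user


print(classify(ContentSignals(has_numeric_data=True, shows_trends=True)))
```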

Figure Type → Backend Routing

figure_type              Primary Backend              Fallback
─────────────────────    ─────────────────────────    ────────────────
architecture-diagram     Mermaid → basic-image-gen    TikZ
data-plot                matplotlib code              none (code only)
conceptual               basic-image-gen              SVG description
process-flow             Mermaid flowchart            TikZ
comparison-chart         matplotlib code              none (code only)
result-visualization     matplotlib/seaborn code      none (code only)

Key constraint: Data-driven types (data-plot, comparison-chart, result-visualization) NEVER use AI image generation. Text on axes, labels, and legends must be pixel-perfect — only code can guarantee this.
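
One way to enforce this constraint is to keep the routing table as data and check it on every lookup. The sketch below is illustrative only: the backend identifier strings and the `route` function are hypothetical, not an interface the skill defines.

```python
from typing import Optional, Tuple

# (primary backend, fallback) per figure type, transcribed from the table.
ROUTES = {
    "architecture-diagram": ("mermaid", "tikz"),
    "data-plot": ("matplotlib", None),
    "conceptual": ("basic-image-gen", "svg-description"),
    "process-flow": ("mermaid", "tikz"),
    "comparison-chart": ("matplotlib", None),
    "result-visualization": ("matplotlib/seaborn", None),
}
AI_BACKENDS = {"basic-image-gen"}
DATA_TYPES = {"data-plot", "comparison-chart", "result-visualization"}


def route(figure_type: str) -> Tuple[str, Optional[str]]:
    """Return (primary, fallback) and refuse AI backends for data figures."""
    if figure_type not in ROUTES:
        raise ValueError(f"unknown figure_type: {figure_type}")
    primary, fallback = ROUTES[figure_type]
    if figure_type in DATA_TYPES and AI_BACKENDS & {primary, fallback}:
        raise RuntimeError(f"{figure_type} must never use AI image generation")
    return primary, fallback


print(route("comparison-chart"))
print(route("conceptual"))
```

Keeping the guard inside `route` means a later edit to the table cannot silently send a data figure to an AI backend.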

Mandatory Workflow (6 Steps — No Skipping)

Step 1: Retrieve & Classify

Read source content and classify figure type. Load references/illustration-taxonomy.md for classification guidance.

# If source is a vault note path
cd "${VAULT_PATH:-/home/kang/apps/content-forge/content-forge}"
obsidian read path="<source_note_path>"

Actions:

  • Read all source material (vault notes, inline data, referenced papers)
  • Auto-detect figure_type using the Decision Tree above (if not provided)
  • Extract key elements: entities, relationships, data points, labels
  • Confirm figure_type with user if ambiguous

Output:

## Figure Classification

- Source: [vault path or "inline content"]
- Detected type: [figure_type]
- Key elements: [list of entities/data points]
- Confidence: [high/medium/low]
- [?] Ambiguous — suggest [type_a] or [type_b], awaiting user choice (if low confidence)

Step 2: Plan the Figure (Blueprint)

Generate a structured Figure Blueprint describing the visual composition.

Blueprint format:

figure_blueprint:
  type: <figure_type>
  title: "<descriptive title>"
  caption: "<self-contained caption describing content and key findings>"
  dimensions:
    width: <inches>
    height: <inches>
    aspect_ratio: "<W:H>"
  layout:
    grid: "<NxM>"  # for subfigures
    elements:
      - id: "element_1"
        type: "<box|arrow|label|axis|legend|...>"
        content: "<text or data reference>"
        position: "<relative position description>"
  data_mapping:  # for data-driven types only
    x_axis: "<variable name>"
    y_axis: "<variable name>"
    series: ["<method_1>", "<method_2>"]
    error_bars: <true|false>
  text_elements:
    labels: ["<label_1>", "<label_2>"]
    annotations: ["<annotation_1>"]

Constraints:

  • Every text element must be explicitly listed (no implicit labels)
  • Data mappings must reference actual data from Step 1
  • Caption must be self-contained (understandable without reading the paper)
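
Two of these constraints can be checked mechanically once the Blueprint is loaded as a dictionary. The sketch below is one possible reading, not the skill's own validator: `validate_blueprint` is a hypothetical helper, "no implicit labels" is interpreted as "every layout element of type label appears in text_elements.labels", and the caption check only confirms the caption is non-empty, because self-containedness needs human or Critic judgment.

```python
from typing import Dict, List, Set


def validate_blueprint(bp: Dict, data_columns: Set[str]) -> List[str]:
    """Return constraint violations; an empty list means the blueprint passes."""
    problems = []
    if not bp.get("caption", "").strip():
        problems.append("caption is empty")
    # Every label element in the layout must be listed in text_elements.labels.
    listed = set(bp.get("text_elements", {}).get("labels", []))
    for el in bp.get("layout", {}).get("elements", []):
        if el.get("type") == "label" and el.get("content") not in listed:
            problems.append(f"implicit label: {el.get('content')!r}")
    # Data mappings must name variables that exist in the Step 1 data.
    mapping = bp.get("data_mapping", {})
    referenced = [mapping.get("x_axis"), mapping.get("y_axis"), *mapping.get("series", [])]
    for name in referenced:
        if name is not None and name not in data_columns:
            problems.append(f"unknown data reference: {name!r}")
    return problems


# Hypothetical blueprint and data column names, for illustration only.
blueprint = {
    "caption": "Validation accuracy over training epochs.",
    "layout": {"elements": [{"id": "element_1", "type": "label", "content": "Epoch"}]},
    "data_mapping": {"x_axis": "epoch", "y_axis": "accuracy", "series": ["ours"]},
    "text_elements": {"labels": ["Epoch"], "annotations": []},
}
print(validate_blueprint(blueprint, {"epoch", "accuracy", "ours"}))
```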

Step 3: Apply Academic Style

Load references/academic-styles.md and apply venue-specific styling to the Blueprint.

Style augmentation adds:

  • Color palette (hex values, colorblind-safe)
  • Font family and sizes (axis labels, titles, legends)
  • Line widths and marker styles
  • Figure size constraints (single-column vs double-column)
  • Caption format (numbered, style-specific)

Output: Enhanced Blueprint with all visual parameters specified as concrete values (no "use default" — every parameter must be a number or hex code).

Step 4: Generate Figure

Route to the appropriate backend based on figure_type.

4a. Code-based generation (data-plot, comparison-chart, result-visualization)

Generate complete, runnable Python code:

import matplotlib.pyplot as plt
import matplotlib
import numpy as np
# import seaborn as sns  # if needed

# --- Style Configuration (from Step 3) ---
matplotlib.rcParams.update({
    'font.family': '<font_family>',
    'font.size': <base_size>,
    'axes.labelsize': <label_size>,
    'axes.titlesize': <title_size>,
    'legend.fontsize': <legend_size>,
    'figure.dpi': <dpi>,
})

# --- Data ---
# <data from Step 1, hardcoded for reproducibility>

# --- Plot ---
fig, ax = plt.subplots(figsize=(<width>, <height>))

# <plotting code>

# --- Labels & Legend ---
ax.set_xlabel('<x_label>')
ax.set_ylabel('<y_label>')
ax.legend(loc='best', frameon=True)

# --- Output ---
plt.tight_layout()
plt.savefig('<slug>_fig.png', dpi=<dpi>, bbox_inches='tight')
plt.savefig('<slug>_fig.pdf', bbox_inches='tight')  # vector format
plt.show()

Code output rules:

  • Print code to terminal (NEVER write .py files to vault)
  • All data must be inline (no external file references)
  • Must include both raster (PNG) and vector (PDF) save commands
  • Must include tight_layout() and explicit DPI
  • Must include all imports at the top
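
Most of these rules can be spot-checked on the generated code before it is printed. The sketch below uses plain string and regex heuristics, so it can miss unusual formatting; `check_code_rules` and its list of file-reading function names are assumptions made for illustration.

```python
import re
from typing import List


def check_code_rules(code: str) -> List[str]:
    """Heuristically check generated matplotlib code against the output rules."""
    problems = []
    if "tight_layout()" not in code:
        problems.append("missing tight_layout()")
    if not re.search(r"dpi\s*=", code):
        problems.append("missing explicit DPI")
    for ext in ("png", "pdf"):
        if not re.search(r"savefig\([^)]*\." + ext, code):
            problems.append(f"missing savefig for .{ext}")
    # Assumed set of calls that would pull data from an external file.
    if re.search(r"\b(open|read_csv|loadtxt|genfromtxt)\s*\(", code):
        problems.append("references an external file")
    # Imports must precede the first statement that is not an import.
    seen_statement = False
    for line in code.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        is_import = stripped.startswith(("import ", "from "))
        if is_import and seen_statement:
            problems.append(f"import not at top: {stripped}")
        elif not is_import:
            seen_statement = True
    return problems


sample = """import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(3.5, 2.5))
ax.plot([1, 2, 3], [0.7, 0.8, 0.9])
plt.tight_layout()
plt.savefig('demo_fig.png', dpi=300, bbox_inches='tight')
plt.savefig('demo_fig.pdf', bbox_inches='tight')
"""
print(check_code_rules(sample))
```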

4b. Mermaid-based generation (architecture-diagram, process-flow)

Generate Mermaid code, then optionally render via basic-image-gen for high-quality output:

graph TD
    A[Component A] --> B[Component B]
    B --> C{Decision}
    C -->|Yes| D[Output 1]
    C -->|No| E[Output 2]

For AI-enhanced rendering:

  • Convert Mermaid to a detailed text description
  • Use basic-image-gen with academic-style prompt (from Step 3)
  • Prompt template: see references/illustration-taxonomy.md

4c. AI image generation (conceptual)

Use basic-image-gen with a carefully engineered prompt:

  • Include all text elements from Blueprint (labels, annotations)
  • Specify exact colors from venue palette
  • Request "clean, minimalist, scientific illustration style"
  • Specify "white background, no decorative elements, no shadows, no 3D effects"
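
The prompt requirements above can be assembled mechanically from the Blueprint. This is only a sketch of the string assembly: `build_prompt` is a hypothetical helper, the sentence wording and example values are assumptions, and how the prompt is passed to basic-image-gen is not shown because that interface is not described here.

```python
from typing import List

# The two style requests are quoted from the list above.
STYLE_PHRASES = [
    "clean, minimalist, scientific illustration style",
    "white background, no decorative elements, no shadows, no 3D effects",
]


def build_prompt(subject: str, labels: List[str],
                 annotations: List[str], palette: List[str]) -> str:
    """Combine subject, every text element, exact colors, and style phrases."""
    text_items = ", ".join(f'"{t}"' for t in labels + annotations)
    parts = [
        subject,
        f"Render exactly these text elements: {text_items}",
        f"Use only these colors: {', '.join(palette)}",
        *STYLE_PHRASES,
    ]
    return ". ".join(parts) + "."


# Hypothetical blueprint values, for illustration only.
prompt = build_prompt(
    subject="Encoder and decoder exchanging a latent vector",
    labels=["Encoder", "Decoder"],
    annotations=["latent z"],
    palette=["#0072B2", "#D55E00"],
)
print(prompt)
```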

Step 5: Critic Evaluation

Load references/critic-checklist.md and evaluate the generated figure.

5 dimensions, 20 items — every item scored PASS/FAIL:

  • Clarity (5 items): caption, labels, hierarchy, overlap, whitespace
  • Accuracy (4 items): data fidelity, encoding, axes, error bars
  • Style (5 items): palette, colorblind, grayscale, fonts, size
  • Reproducibility (3 items): code completeness, imports, output config
  • Caption (3 items): content description, key findings, subfigure refs

Evaluation output:

## Critic Evaluation — Round N/3

| Dimension       | Score | Issues |
|-----------------|-------|--------|
| Clarity         | 3/5   | [!] Legend overlaps data points; [!] y-axis label lacks units |
| Accuracy        | 4/4   | — |
| Style           | 5/5   | — |
| Reproducibility | 3/3   | — |
| Caption         | 2/3   | [!] Missing key finding in caption |

Overall: REVISE (17/20, threshold: 18/20)

Revision actions:
1. Move legend to upper-left corner
2. Add units to the y-axis label
3. Add "Method A outperforms by 12%" to caption

Iteration rules:

  • Threshold: 18/20 to PASS
  • Maximum 3 revision rounds
  • Each revision must address ALL flagged issues
  • If round 3 still fails: output with [!] Quality warning and list remaining issues
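
The scoring and iteration rules reduce to a small amount of arithmetic. In the sketch below, the per-dimension item counts, the threshold, and the round limit come from this step; the function name `evaluate` and the three verdict strings are hypothetical.

```python
from typing import Dict, Tuple

# Items per dimension (20 in total), as listed in this step.
MAX_ITEMS = {"Clarity": 5, "Accuracy": 4, "Style": 5, "Reproducibility": 3, "Caption": 3}
PASS_THRESHOLD = 18
MAX_ROUNDS = 3


def evaluate(passed: Dict[str, int], round_no: int) -> Tuple[int, str]:
    """Sum passed items across dimensions and decide what happens next."""
    for dim, limit in MAX_ITEMS.items():
        if not 0 <= passed[dim] <= limit:
            raise ValueError(f"{dim}: expected 0..{limit}, got {passed[dim]}")
    total = sum(passed[dim] for dim in MAX_ITEMS)
    if total >= PASS_THRESHOLD:
        return total, "PASS"
    if round_no < MAX_ROUNDS:
        return total, "REVISE"
    # Round 3 still below threshold: output with a quality warning.
    return total, "QUALITY WARNING"


scores = {"Clarity": 3, "Accuracy": 4, "Style": 5, "Reproducibility": 3, "Caption": 2}
print(evaluate(scores, round_no=1))
print(evaluate(scores, round_no=3))
```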

Step 6: Archive to Vault

Store final assets and update vault metadata.

cd "${VAULT_PATH:-/home/kang/apps/content-forge/content-forge}"

# Create asset directory (if source is vault note)
# mkdir -p "05-assets/<slug>/"

# For code-based figures: user runs the code, saves output to 05-assets/<slug>/
# For AI-generated figures: basic-image-gen saves to 05-assets/<slug>/

# Update source note's frontmatter (if vault note)
obsidian property:set path="<source_note_path>" name="assets" \
  value='["05-assets/<slug>/<filename>.png"]'

Output summary:

## Figure Generation Complete

[✓] Type: <figure_type>
[✓] Backend: <matplotlib|mermaid|basic-image-gen>
[✓] Critic: PASS (score/20) — round N
[✓] Assets: 05-assets/<slug>/<filename>.{png,pdf}
[✓] Code: <printed to terminal | N/A>

Caption:
"Figure N. <caption text>"

NEVER Rules (8 Items — Zero Tolerance)

  1. NEVER use AI image generation for data-driven figures (data-plot, comparison-chart, result-visualization). Text on axes, labels, and legends is unreliable in AI-generated images. Always use matplotlib/seaborn code.

  2. NEVER use more than 7 distinct colors in a single figure. Beyond 7, use patterns (hatching, markers, line styles) to differentiate series. Color is a scarce resource.

  3. NEVER rely solely on red-green distinction. Always include a secondary encoding (shape, pattern, label) for colorblind accessibility. Test with simulated deuteranopia.

  4. NEVER output a figure without a caption. Every figure must have a self-contained caption that describes what is shown and highlights key findings.

  5. NEVER generate matplotlib code without plt.tight_layout() and explicit DPI setting. Clipped labels and low-resolution output are publication-rejection-worthy bugs.

  6. NEVER use decorative elements: drop shadows, 3D effects, gradients, bevels, or ornamental borders. Academic figures must be clean, flat, and information-dense.

  7. NEVER skip the Critic evaluation step (Step 5). Even if the figure "looks fine", run the 20-item checklist. Visual intuition misses subtle issues.

  8. NEVER write code files (.py, .mmd) to the vault. Code output goes to terminal. Only image assets (.png, .pdf, .svg) go to 05-assets/. Vault is for content, not code.
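
Rules 2 and 3 can be satisfied together by assigning each series a color plus a marker and line style, so no series is identified by color alone. The sketch below is an assumption-laden example: it swaps in the Okabe-Ito colorblind-safe palette because the venue palettes live in references/academic-styles.md, and `series_styles` is a hypothetical helper.

```python
from typing import Dict, List

# Assumption: Okabe-Ito palette (7 colors) stands in for the venue palette.
OKABE_ITO = ["#0072B2", "#D55E00", "#009E73", "#E69F00", "#56B4E9", "#CC79A7", "#F0E442"]
MARKERS = ["o", "s", "^", "D", "v", "P", "X"]
LINESTYLES = ["-", "--", "-.", ":"]


def series_styles(n: int) -> List[Dict[str, str]]:
    """Give each of n series a color plus a secondary encoding.

    Colors cycle after 7 series; the line style then changes, so every
    (color, linestyle) pair stays unique and markers differ within a cycle.
    """
    limit = len(OKABE_ITO) * len(LINESTYLES)
    if n > limit:
        raise ValueError(f"at most {limit} series supported")
    return [
        {
            "color": OKABE_ITO[i % len(OKABE_ITO)],
            "marker": MARKERS[i % len(MARKERS)],
            "linestyle": LINESTYLES[i // len(OKABE_ITO)],
        }
        for i in range(n)
    ]


for style in series_styles(9):
    print(style)
```

Each dictionary uses keyword names that matplotlib's `ax.plot` accepts, so a style can be applied as `ax.plot(x, y, **style)`.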

Output Constraints

  • All code output is printed to terminal, not saved as files
  • Image assets go to 05-assets/<slug>/ with vault-relative paths
  • Default language for labels: English (override with language: zh-CN)
  • Default colorblind mode: ON (override with colorblind_mode: false)
  • Default DPI: 300 (minimum 150, recommended 300-600 for print)
  • Caption language matches language parameter
  • Vector format (PDF) always generated alongside raster (PNG) for code-based figures

Failure Handling

  • Missing source content: [?] Missing input: source — stop and ask
  • Ambiguous figure_type: [?] Cannot determine figure type — present options, ask user
  • basic-image-gen unavailable: [✗] Image generation failed — fall back to Mermaid/TikZ code (SVG description for conceptual figures)
  • Data insufficient for plot: [?] Insufficient data — request specific data points
  • Critic fails after 3 rounds: [!] Quality warning: <remaining_issues> — output with disclaimer
  • obsidian CLI failure: [✗] <command> failed: <error> — stop subsequent steps

Reference Files

  • references/illustration-taxonomy.md — 6 figure types: structure, routing, prompt templates, decision tree
  • references/academic-styles.md — Venue-specific styles: NeurIPS/ICML/ACL/IEEE/Nature with hex colors, fonts, sizes
  • references/critic-checklist.md — 5 dimensions, 20 items in total: Clarity, Accuracy, Style, Reproducibility, Caption