Illustration Taxonomy Reference

Loaded by Step 1 (Retrieve & Classify). Provides structure templates, routing rules, prompt engineering guidance, and auto-detection heuristics for all 6 figure types.

Table of Contents

  1. Architecture Diagram
  2. Data Plot
  3. Conceptual Illustration
  4. Process Flow
  5. Comparison Chart
  6. Result Visualization
  7. Auto-Detection Heuristics
  8. Prompt Engineering Templates

1. Architecture Diagram

Purpose: Show system components and their interconnections. Common in ML papers (model architecture), systems papers (distributed systems), and software engineering papers.

Structural template:

  • Top-level: input → processing layers → output
  • Each layer: named box with internal description
  • Connections: directed arrows with optional labels (data type, tensor shape)
  • Optional: skip connections, attention mechanisms, residual paths

Key elements to extract:

  • Named components (encoder, decoder, attention head, loss function)
  • Data flow direction (left-to-right or top-to-bottom)
  • Tensor shapes / data dimensions at each stage
  • Grouping (which components form a module)

Backend: Mermaid (structure) → basic-image-gen (polished rendering)
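
A minimal sketch of the Mermaid structure pass, assuming the extracted components, groupings, and connections are already available as plain Python data. The `Edge` class, the `to_mermaid` helper, the node ids, and the tensor shape labels are hypothetical illustrations, not part of the skill itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    label: str = ""  # e.g. a tensor shape; empty means an unlabeled arrow


def to_mermaid(components, groups, edges, direction="LR"):
    """Serialize an architecture description as Mermaid flowchart text.

    components: dict of node id -> display name
    groups: dict of module id -> list of node ids that form the module
    edges: list of Edge
    """
    lines = [f"flowchart {direction}"]
    grouped = {node for members in groups.values() for node in members}
    # Components outside any module are declared at the top level.
    for node_id, name in components.items():
        if node_id not in grouped:
            lines.append(f'    {node_id}["{name}"]')
    # Each module becomes a Mermaid subgraph.
    for module, members in groups.items():
        lines.append(f"    subgraph {module}")
        for node_id in members:
            lines.append(f'        {node_id}["{components[node_id]}"]')
        lines.append("    end")
    # Directed arrows, labeled when a data type or shape is known.
    for edge in edges:
        arrow = f'-->|"{edge.label}"|' if edge.label else "-->"
        lines.append(f"    {edge.src} {arrow} {edge.dst}")
    return "\n".join(lines)


# Hypothetical example: one encoder block with placeholder tensor shapes.
diagram = to_mermaid(
    components={
        "emb": "Input Embeddings",
        "attn": "Multi-Head Self-Attention",
        "ffn": "Feed-Forward",
        "out": "Output",
    },
    groups={"EncoderBlock": ["attn", "ffn"]},
    edges=[
        Edge("emb", "attn", "B x T x 512"),
        Edge("attn", "ffn"),
        Edge("ffn", "out", "B x T x 512"),
    ],
)
print(diagram)
```

Keeping the structure as data makes it straightforward to check for missing dimension labels before the diagram is passed on for polished rendering.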

Exemplar description (text-based reference):

A Transformer encoder-decoder diagram: input embeddings feed into N stacked encoder blocks (each containing multi-head self-attention + feed-forward + layer norm), connected by residual arrows. Decoder blocks mirror the structure with added cross-attention. All arrows are labeled with tensor dimensions. Clean white background, consistent box sizing, horizontal flow.

Common mistakes:

  • Too many components crammed into one figure — split into overview + detail figures
  • Arrows crossing without clear layering — use consistent flow direction
  • Missing dimension annotations — reviewers need to verify tensor shapes

2. Data Plot

Purpose: Visualize quantitative experimental results. The most common figure type in empirical papers.

Subtypes:

  • Line plot (trends over epochs, hyperparameter sweeps)
  • Bar chart (discrete comparisons, ablation results)
  • Scatter plot (correlation, embedding visualization)
  • Histogram (distribution of scores, latency)
  • Box plot (variance across runs, statistical comparison)
  • Heatmap (attention weights, confusion matrices)

Structural template:

  • X-axis: independent variable (clearly labeled with units)
  • Y-axis: dependent variable (clearly labeled with units)
  • Legend: method names, consistent with paper text
  • Error bars / confidence intervals: when multiple runs exist
  • Grid: light gray, no heavy gridlines

Key elements to extract:

  • Data values (exact numbers from tables or inline)
  • Variable names and units
  • Method/baseline names (must match paper text exactly)
  • Number of runs / seeds (for error bars)

Backend: matplotlib/seaborn code — NEVER AI image generation
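
A minimal sketch of the code-based route for a line plot with error bars, assuming per-seed results are available as lists of numbers. The function names `summarize_runs` and `plot_with_error_bars` are hypothetical, and matplotlib is imported inside the plotting function so the summary helper can be used on its own.

```python
import statistics


def summarize_runs(runs):
    """Return per-step mean and sample standard deviation across seeds.

    runs: list of equal-length lists, one list of metric values per seed.
    """
    means, stds = [], []
    for values in zip(*runs):
        means.append(statistics.mean(values))
        # Sample std needs at least two runs; use 0 for a single run.
        stds.append(statistics.stdev(values) if len(values) > 1 else 0.0)
    return means, stds


def plot_with_error_bars(x, runs_by_method, xlabel, ylabel, path):
    """Draw one line per method with error bars and save the figure."""
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 3))
    for method, runs in runs_by_method.items():
        means, stds = summarize_runs(runs)
        ax.errorbar(x, means, yerr=stds, label=method, capsize=3)
    ax.set_xlabel(xlabel)  # label should include units
    ax.set_ylabel(ylabel)
    ax.grid(True, color="lightgray", linewidth=0.5)
    ax.legend(loc="best")  # lets matplotlib pick a spot away from the data
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```

Method names passed as keys of `runs_by_method` appear in the legend unchanged, so they should be copied from the paper text exactly.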

Common mistakes:

  • Truncated y-axis without break marker — misleading visual
  • Missing error bars when results vary across seeds
  • Legend covering data points — move to whitespace area
  • Inconsistent method colors across subfigures

3. Conceptual Illustration

Purpose: Explain abstract ideas, metaphors, or high-level intuitions that cannot be expressed as data or architecture.

Structural template:

  • Central metaphor / visual analogy
  • Labeled regions or entities
  • Visual encoding of relationships (proximity, size, color)
  • Minimal text annotations (key terms only)

Key elements to extract:

  • Core concept and its visual metaphor
  • Entities and their relationships
  • Any text that must appear in the figure
  • Target impression (what should the reader understand at a glance)

Backend: basic-image-gen (AI excels at creative visual metaphors)

Exemplar description:

A clean scientific illustration showing knowledge distillation: a large "Teacher" network (detailed, multi-layered structure) on the left, connected by flowing arrows labeled "soft labels" to a smaller, simpler "Student" network on the right. The teacher is rendered in deep blue, the student in teal. White background, no decorative elements, all text in sans-serif font.

Common mistakes:

  • Too abstract — reviewers need to understand without reading the caption
  • Decorative clutter (gradients, shadows) — breaks academic norms
  • Text too small or embedded in complex visuals — must be readable at print size

4. Process Flow

Purpose: Show sequential steps, decision points, and branching logic. Common in method sections and algorithm descriptions.

Structural template:

  • Start/end: rounded rectangles
  • Process steps: rectangles with action descriptions
  • Decisions: diamonds with yes/no branches
  • Data stores: cylinders
  • Inputs/outputs: parallelograms
  • Flow: top-to-bottom or left-to-right, consistent direction

Key elements to extract:

  • Ordered steps (numbered or sequential)
  • Decision points and their conditions
  • Parallel branches (if any)
  • Loop/iteration indicators
  • Input/output at boundaries

Backend: Mermaid flowchart (primary), TikZ (fallback)

Exemplar description:

A training pipeline flowchart: "Raw Data" → "Preprocessing" → "Feature Extraction" → diamond "Validation Loss Decreasing?". The Yes branch leads to "Continue Training", which loops back to the diamond. The No branch leads to "Early Stop" → "Evaluate on Test Set" → "Report Results". Clean boxes with consistent padding, single-color scheme, directional arrows.
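
The training pipeline exemplar can be sketched as Mermaid flowchart text generated from Python. This is a rough illustration only: the `SHAPES` mapping, the `flowchart` and `unlabeled_decisions` helpers, and the node ids are hypothetical names, and the check covers just one of the common mistakes (missing decision labels).

```python
# Mermaid node syntax for the shapes named in the structural template.
SHAPES = {
    "terminal": '{id}(["{text}"])',   # rounded start/end
    "process": '{id}["{text}"]',      # rectangle
    "decision": '{id}{{"{text}"}}',   # diamond
    "datastore": '{id}[("{text}")]',  # cylinder
}


def flowchart(nodes, edges, direction="TD"):
    """nodes: list of (id, kind, text); edges: list of (src, dst, label)."""
    lines = [f"flowchart {direction}"]
    for node_id, kind, text in nodes:
        lines.append("    " + SHAPES[kind].format(id=node_id, text=text))
    for src, dst, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    return "\n".join(lines)


def unlabeled_decisions(nodes, edges):
    """Return ids of decision nodes that have an outgoing edge without a label."""
    decisions = {node_id for node_id, kind, _ in nodes if kind == "decision"}
    return sorted({src for src, _, label in edges if src in decisions and not label})


nodes = [
    ("raw", "terminal", "Raw Data"),
    ("prep", "process", "Preprocessing"),
    ("feat", "process", "Feature Extraction"),
    ("check", "decision", "Validation Loss Decreasing?"),
    ("cont", "process", "Continue Training"),
    ("stop", "process", "Early Stop"),
    ("evaluate", "process", "Evaluate on Test Set"),
    ("report", "terminal", "Report Results"),
]
edges = [
    ("raw", "prep", ""),
    ("prep", "feat", ""),
    ("feat", "check", ""),
    ("check", "cont", "Yes"),
    ("cont", "check", ""),  # loop back to the decision
    ("check", "stop", "No"),
    ("stop", "evaluate", ""),
    ("evaluate", "report", ""),
]

print(flowchart(nodes, edges))
print(unlabeled_decisions(nodes, edges))
```

Node ids avoid the word `end`, which Mermaid treats as a keyword inside flowcharts.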

Common mistakes:

  • Too many steps in one diagram — split into phases
  • Inconsistent box sizes — standardize width
  • Missing decision labels (yes/no/condition)

5. Comparison Chart

Purpose: Compare multiple methods, models, or configurations across shared metrics. The "table as a figure" pattern.

Subtypes:

  • Grouped bar chart (methods × metrics)
  • Radar/spider chart (multi-dimensional comparison)
  • Parallel coordinates (many dimensions)
  • Table with heatmap coloring (for many metrics)

Structural template:

  • X-axis: methods/models (categorical)
  • Y-axis: metric value (numerical)
  • Grouping: by metric or by method
  • Highlighting: best result bolded or starred
  • Baseline: horizontal dashed line for reference method

Key elements to extract:

  • Method names (must match paper text)
  • Metric names and values
  • Which method is "ours" (for highlighting)
  • Baseline/reference method
  • Statistical significance markers (if applicable)

Backend: matplotlib code — NEVER AI image generation
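
A minimal sketch of a grouped bar chart with a baseline reference line, assuming bars are grouped by metric with one bar per method. The helper names, the figure size, and the styling choices are hypothetical, and matplotlib is imported inside the plotting function so the layout helper stays dependency-free.

```python
def grouped_bar_positions(n_groups, n_series, group_width=0.8):
    """Return (bar_width, positions).

    positions[s][g] is the x-center of series s within group g,
    with groups centered on 0, 1, 2, ...
    """
    bar_width = group_width / n_series
    positions = []
    for s in range(n_series):
        offset = (s - (n_series - 1) / 2) * bar_width
        positions.append([g + offset for g in range(n_groups)])
    return bar_width, positions


def plot_comparison(metrics, scores, ours, baseline_value, ylabel, path):
    """scores: dict of method name -> list of values, one per metric."""
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt

    methods = list(scores)
    bar_width, positions = grouped_bar_positions(len(metrics), len(methods))
    fig, ax = plt.subplots(figsize=(5, 3))
    for method, xs in zip(methods, positions):
        # Outline the bars of the highlighted ("ours") method.
        edge = "black" if method == ours else "none"
        ax.bar(xs, scores[method], width=bar_width, label=method, edgecolor=edge)
    # Horizontal dashed line for the reference method.
    ax.axhline(baseline_value, linestyle="--", color="gray", linewidth=1)
    ax.set_xticks(list(range(len(metrics))))
    ax.set_xticklabels(metrics)
    ax.set_ylabel(ylabel)
    ax.legend()
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```

Because the insertion order of `scores` determines bar order, sorting the dict beforehand (by performance or alphabetically) keeps ordering consistent across figures.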

Common mistakes:

  • Too many methods/metrics in one chart — split or use table
  • No baseline reference line — hard to judge improvement
  • Inconsistent ordering — sort by performance or alphabetically

6. Result Visualization

Purpose: Show qualitative or semi-quantitative results: attention maps, feature visualizations, generated samples, ablation grids.

Subtypes:

  • Attention heatmap overlay
  • Feature map / activation visualization
  • Sample grid (generated vs real)
  • Ablation grid (each cell = one config)
  • Confusion matrix
  • t-SNE / UMAP embedding plots

Structural template:

  • Grid layout: rows = conditions, columns = samples/metrics
  • Consistent cell sizing
  • Row/column headers clearly labeled
  • Color scale legend (for heatmaps)
  • Highlight: circles or boxes around key regions

Key elements to extract:

  • What each row/column represents
  • Color scale meaning and range
  • Which samples to highlight and why
  • Grid dimensions (rows × columns)

Backend: matplotlib/seaborn code (heatmaps, grids), basic-image-gen (overlays on real images)
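
A minimal sketch of the code-based route for a heatmap such as a confusion matrix, assuming raw counts are available as a list of rows. The `normalize_rows` and `plot_heatmap` names and the choice of row normalization are hypothetical; the point is a fixed color range with an explicit color scale legend.

```python
def normalize_rows(matrix):
    """Scale each row to sum to 1 so the color scale has a fixed 0-1 range."""
    normalized = []
    for row in matrix:
        total = sum(row)
        # An all-zero row stays zero instead of dividing by zero.
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


def plot_heatmap(matrix, row_labels, col_labels, path):
    """Draw a labeled heatmap with a colorbar and save it."""
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 3.5))
    image = ax.imshow(normalize_rows(matrix), vmin=0.0, vmax=1.0, cmap="viridis")
    ax.set_xticks(list(range(len(col_labels))))
    ax.set_xticklabels(col_labels)
    ax.set_yticks(list(range(len(row_labels))))
    ax.set_yticklabels(row_labels)
    # The colorbar is the color scale legend the template calls for.
    fig.colorbar(image, ax=ax, label="Fraction of row total")
    fig.savefig(path, bbox_inches="tight")
    plt.close(fig)
```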

Common mistakes:

  • Missing color scale legend — uninterpretable
  • Grid cells too small at print size — ensure readability
  • No row/column labels — reader cannot identify conditions

7. Auto-Detection Heuristics

Keyword-based classification from source content:

| Keywords / Patterns | Detected Type |
|---|---|
| "accuracy", "F1", "BLEU", "%", table of numbers, "vs", "baseline" | comparison-chart |
| "epoch", "learning rate", "loss curve", "over time", "trend" | data-plot |
| "architecture", "encoder", "decoder", "layer", "module", "block" | architecture-diagram |
| "step 1", "then", "if...then", "pipeline", "workflow", "algorithm" | process-flow |
| "ablation", "heatmap", "attention map", "feature map", "t-SNE" | result-visualization |
| "intuition", "concept", "metaphor", "analogy", "overview" | conceptual |

Confidence scoring:

  • 3+ keywords from one category → HIGH confidence
  • 2 keywords → MEDIUM confidence
  • 1 keyword, or keywords split across multiple categories → LOW confidence (ask the user)

Tie-breaking rules:

  • Numeric data present → prefer data-driven types (data-plot > comparison-chart > result-visualization)
  • No numeric data → prefer structural types (architecture-diagram > process-flow > conceptual)
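
The heuristics above can be sketched as a small classifier. This is one possible reading, not the skill's actual implementation: it assumes "keywords from multiple categories" means a tie for the highest keyword count, it takes numeric-data detection as a caller-supplied flag, it omits the "table of numbers" and "if...then" patterns because they are not literal keywords, and the names `KEYWORDS`, `classify`, and `has_numeric_data` are hypothetical.

```python
import re

# Literal keywords from the classification table.
KEYWORDS = {
    "comparison-chart": ["accuracy", "F1", "BLEU", "%", "vs", "baseline"],
    "data-plot": ["epoch", "learning rate", "loss curve", "over time", "trend"],
    "architecture-diagram": ["architecture", "encoder", "decoder", "layer", "module", "block"],
    "process-flow": ["step 1", "then", "pipeline", "workflow", "algorithm"],
    "result-visualization": ["ablation", "heatmap", "attention map", "feature map", "t-SNE"],
    "conceptual": ["intuition", "concept", "metaphor", "analogy", "overview"],
}

DATA_ORDER = ["data-plot", "comparison-chart", "result-visualization"]
STRUCT_ORDER = ["architecture-diagram", "process-flow", "conceptual"]


def _matches(keyword, text):
    if keyword == "%":
        return "%" in text
    # Match at the start of a word, case-insensitively, so plurals also count.
    return re.search(r"\b" + re.escape(keyword), text, re.IGNORECASE) is not None


def classify(text, has_numeric_data):
    """Return (figure_type, confidence); figure_type is None if nothing matches."""
    counts = {
        fig_type: sum(_matches(keyword, text) for keyword in keywords)
        for fig_type, keywords in KEYWORDS.items()
    }
    best = max(counts.values())
    if best == 0:
        return None, "LOW"
    tied = [fig_type for fig_type, count in counts.items() if count == best]
    # Tie-breaking: numeric data prefers data-driven types, otherwise structural.
    if has_numeric_data:
        order = DATA_ORDER + STRUCT_ORDER
    else:
        order = STRUCT_ORDER + DATA_ORDER
    winner = min(tied, key=order.index)
    if len(tied) > 1 or best == 1:
        confidence = "LOW"  # caller should ask the user
    elif best == 2:
        confidence = "MEDIUM"
    else:
        confidence = "HIGH"
    return winner, confidence


print(classify("The encoder and decoder share each layer.", False))
print(classify("accuracy per epoch", True))
```

A stricter reading would return LOW whenever more than one category has any match at all; that variant only changes the `len(tied) > 1` condition.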

8. Prompt Engineering Templates

For basic-image-gen (architecture-diagram, conceptual)

Base template:

Create a publication-quality scientific illustration for an academic paper.

Subject: {description_from_blueprint}

Style requirements:
- Clean white background
- No decorative elements (no shadows, no 3D, no gradients)
- Flat design with consistent line weights
- Colors: {palette_from_academic_styles}
- All text in {font_family}, minimum {min_font_size}pt equivalent
- Professional academic illustration style

Text elements that MUST appear legibly:
{list_all_labels_and_annotations}

Layout: {layout_description}
Aspect ratio: {width}:{height}
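
A minimal sketch of filling the base template with Python's built-in string formatting, with a check that no placeholder is left unfilled. The `required_fields` and `render_prompt` helpers are hypothetical, and the example values (palette, font, labels, layout) are placeholders chosen for illustration, not values prescribed by the skill.

```python
from string import Formatter

BASE_TEMPLATE = """Create a publication-quality scientific illustration for an academic paper.

Subject: {description_from_blueprint}

Style requirements:
- Clean white background
- No decorative elements (no shadows, no 3D, no gradients)
- Flat design with consistent line weights
- Colors: {palette_from_academic_styles}
- All text in {font_family}, minimum {min_font_size}pt equivalent
- Professional academic illustration style

Text elements that MUST appear legibly:
{list_all_labels_and_annotations}

Layout: {layout_description}
Aspect ratio: {width}:{height}
"""


def required_fields(template):
    """Return the sorted placeholder names used in a format-style template."""
    return sorted({name for _, name, _, _ in Formatter().parse(template) if name})


def render_prompt(template, **values):
    """Fill the template, failing loudly if any placeholder has no value."""
    missing = [name for name in required_fields(template) if name not in values]
    if missing:
        raise KeyError(f"missing template fields: {missing}")
    return template.format(**values)


# Placeholder values for illustration only.
prompt = render_prompt(
    BASE_TEMPLATE,
    description_from_blueprint="Knowledge distillation from a teacher to a student network",
    palette_from_academic_styles="deep blue, teal",
    font_family="sans-serif",
    min_font_size=8,
    list_all_labels_and_annotations="- Teacher\n- Student\n- soft labels",
    layout_description="teacher on the left, student on the right",
    width=16,
    height=9,
)
print(prompt)
```

Checking for missing fields before formatting gives a single error that lists every gap, rather than failing on the first unfilled placeholder.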

For architecture diagrams specifically

Architecture template:

Create a clean technical architecture diagram for a research paper.

Components (left to right / top to bottom):
{component_list_with_descriptions}

Connections:
{arrow_descriptions_with_labels}

Style: flat design, {venue} conference style, colors {hex_values},
sans-serif labels, consistent box sizing, white background.
No decorative elements. Print-ready at {width}" x {height}".

For conceptual illustrations

Concept template:

Create a minimalist scientific concept illustration.

Core concept: {concept_description}
Visual metaphor: {metaphor_description}

Must include these labeled elements:
{element_list}

Style: clean, academic, flat design. Colors: {hex_values}.
White background, no gradients or shadows.
All text must be clearly readable at small print size.

Cross-reference: SKILL.md §Decision Tree, §Figure Type → Backend Routing