Illustration Taxonomy Reference
Loaded by Step 1 (Retrieve & Classify). Provides structure templates, routing rules, prompt engineering guidance, and auto-detection heuristics for all 6 figure types.
Table of Contents
- Architecture Diagram
- Data Plot
- Conceptual Illustration
- Process Flow
- Comparison Chart
- Result Visualization
- Auto-Detection Heuristics
- Prompt Engineering Templates
1. Architecture Diagram
Purpose: Show system components and their interconnections. Common in ML papers (model architecture), systems papers (distributed systems), and software engineering papers.
Structural template:
- Top-level: input → processing layers → output
- Each layer: named box with internal description
- Connections: directed arrows with optional labels (data type, tensor shape)
- Optional: skip connections, attention mechanisms, residual paths
Key elements to extract:
- Named components (encoder, decoder, attention head, loss function)
- Data flow direction (left-to-right or top-to-bottom)
- Tensor shapes / data dimensions at each stage
- Grouping (which components form a module)
Backend: Mermaid (structure) → basic-image-gen (polished rendering)
Exemplar description (text-based reference):
A Transformer encoder-decoder diagram: input embeddings feed into N stacked encoder blocks (each containing multi-head self-attention + feed-forward + layer norm), connected by residual arrows. Decoder blocks mirror the structure with added cross-attention. All arrows are labeled with tensor dimensions. Clean white background, consistent box sizing, horizontal flow.
Common mistakes:
- Too many components crammed into one figure — split into overview + detail figures
- Arrows crossing without clear layering — use consistent flow direction
- Missing dimension annotations — reviewers need to verify tensor shapes
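The structural template above maps fairly directly onto Mermaid flowchart source. The following is a minimal sketch under stated assumptions, not the pipeline's actual implementation: `Edge`, `to_mermaid`, and `unlabeled_edges` are hypothetical names, the component ids and tensor shapes are illustrative, and label escaping is not handled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    label: str = ""  # optional annotation, e.g. a tensor shape


def to_mermaid(components: dict[str, str], edges: list[Edge], direction: str = "LR") -> str:
    """Build Mermaid flowchart source: one named box per component, one arrow per edge."""
    lines = [f"flowchart {direction}"]
    for node_id, title in components.items():
        lines.append(f'    {node_id}["{title}"]')
    for edge in edges:
        arrow = f"-->|{edge.label}|" if edge.label else "-->"
        lines.append(f"    {edge.src} {arrow} {edge.dst}")
    return "\n".join(lines)


def unlabeled_edges(edges: list[Edge]) -> list[Edge]:
    """Return edges with no label, to catch missing dimension annotations."""
    return [edge for edge in edges if not edge.label]


# Illustrative input only: ids, titles, and shapes are made up for this sketch.
components = {"emb": "Input Embedding", "enc": "Encoder x N", "dec": "Decoder x N"}
edges = [Edge("emb", "enc", "B x T x 512"), Edge("enc", "dec")]
print(to_mermaid(components, edges))
print("edges missing a dimension label:", unlabeled_edges(edges))
```

Running `unlabeled_edges` before rendering is one way to flag the missing-annotation mistake early, while the structure is still plain text.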
2. Data Plot
Purpose: Visualize quantitative experimental results. The most common figure type in empirical papers.
Subtypes:
- Line plot (trends over epochs, hyperparameter sweeps)
- Bar chart (discrete comparisons, ablation results)
- Scatter plot (correlation, embedding visualization)
- Histogram (distribution of scores, latency)
- Box plot (variance across runs, statistical comparison)
- Heatmap (attention weights, confusion matrices)
Structural template:
- X-axis: independent variable (clearly labeled with units)
- Y-axis: dependent variable (clearly labeled with units)
- Legend: method names, consistent with paper text
- Error bars / confidence intervals: when multiple runs exist
- Grid: light gray, no heavy gridlines
Key elements to extract:
- Data values (exact numbers from tables or inline)
- Variable names and units
- Method/baseline names (must match paper text exactly)
- Number of runs / seeds (for error bars)
Backend: matplotlib/seaborn code — NEVER AI image generation
Common mistakes:
- Truncated y-axis without break marker — misleading visual
- Missing error bars when results vary across seeds
- Legend covering data points — move to whitespace area
- Inconsistent method colors across subfigures
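As a rough illustration of the template above, the sketch below aggregates per-seed results into means and error bars, then draws one labeled line per method on a light gray grid. It assumes matplotlib is installed; `summarize_runs` and `plot_methods` are hypothetical helper names, and the figure size and marker style are arbitrary choices rather than requirements of this reference.

```python
from __future__ import annotations

from statistics import mean, stdev


def summarize_runs(runs: list[list[float]]) -> tuple[list[float], list[float]]:
    """Per-x-position mean and sample standard deviation across seeds.

    runs[i][j] is the metric for seed i at x-position j.
    """
    columns = list(zip(*runs))
    means = [mean(col) for col in columns]
    # stdev needs at least two samples; a single seed has no spread to show.
    spreads = [stdev(col) if len(col) > 1 else 0.0 for col in columns]
    return means, spreads


def plot_methods(x, results, xlabel, ylabel, out_path):
    """Draw one error-bar line per method; `results` maps method name -> runs."""
    # Imported lazily so summarize_runs stays usable without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # render to file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4.0, 3.0))
    for name, runs in results.items():
        means, spreads = summarize_runs(runs)
        ax.errorbar(x, means, yerr=spreads, marker="o", capsize=3, label=name)
    ax.set_xlabel(xlabel)  # include units in the label text
    ax.set_ylabel(ylabel)
    ax.grid(True, color="lightgray", linewidth=0.5)
    ax.legend(loc="best", frameon=False)
    fig.savefig(out_path, bbox_inches="tight")
    plt.close(fig)
```

Passing the method names in `results` exactly as they appear in the paper text keeps the legend consistent with the prose.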
3. Conceptual Illustration
Purpose: Explain abstract ideas, metaphors, or high-level intuitions that cannot be expressed as data or architecture.
Structural template:
- Central metaphor / visual analogy
- Labeled regions or entities
- Visual encoding of relationships (proximity, size, color)
- Minimal text annotations (key terms only)
Key elements to extract:
- Core concept and its visual metaphor
- Entities and their relationships
- Any text that must appear in the figure
- Target impression (what should the reader understand at a glance)
Backend: basic-image-gen (AI excels at creative visual metaphors)
Exemplar description:
A clean scientific illustration showing knowledge distillation: a large "Teacher" network (detailed, multi-layered structure) on the left, connected by flowing arrows labeled "soft labels" to a smaller, simpler "Student" network on the right. The teacher is rendered in deep blue, the student in teal. White background, no decorative elements, all text in sans-serif font.
Common mistakes:
- Too abstract — reviewers need to understand without reading the caption
- Decorative clutter (gradients, shadows) — breaks academic norms
- Text too small or embedded in complex visuals — must be readable at print size
4. Process Flow
Purpose: Show sequential steps, decision points, and branching logic. Common in method sections and algorithm descriptions.
Structural template:
- Start/end: rounded rectangles
- Process steps: rectangles with action descriptions
- Decisions: diamonds with yes/no branches
- Data stores: cylinders or parallelograms
- Flow: top-to-bottom or left-to-right, consistent direction
Key elements to extract:
- Ordered steps (numbered or sequential)
- Decision points and their conditions
- Parallel branches (if any)
- Loop/iteration indicators
- Input/output at boundaries
Backend: Mermaid flowchart (primary), TikZ (fallback)
Exemplar description:
A training pipeline flowchart: "Raw Data" → "Preprocessing" → "Feature Extraction" → diamond "Validation Loss Decreasing?", which branches two ways. Yes: "Continue Training" (loops back). No: "Early Stop" → "Evaluate on Test Set" → "Report Results". Clean boxes with consistent padding, single-color scheme, directional arrows.
Common mistakes:
- Too many steps in one diagram — split into phases
- Inconsistent box sizes — standardize width
- Missing decision labels (yes/no/condition)
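The shape conventions above can be sketched as a small Mermaid emitter. This is an illustrative sketch only: `SHAPES`, `flowchart`, and `unlabeled_decision_edges` are hypothetical names, the node ids are invented, and the target of the "loops back" arrow (here, the decision node) is an assumption because the exemplar does not specify it.

```python
from __future__ import annotations

# Mermaid node syntax for each shape in the structural template.
SHAPES = {
    "terminal": '{id}("{text}")',    # rounded rectangle: start/end
    "step": '{id}["{text}"]',        # rectangle: process step
    "decision": '{id}{{"{text}"}}',  # diamond: decision point
    "store": '{id}[("{text}")]',     # cylinder: data store
}


def flowchart(nodes: dict[str, tuple[str, str]],
              edges: list[tuple[str, str, str]],
              direction: str = "TD") -> str:
    """Emit Mermaid source; nodes map id -> (shape kind, text), edges are (src, dst, label)."""
    lines = [f"flowchart {direction}"]
    for node_id, (kind, text) in nodes.items():
        lines.append("    " + SHAPES[kind].format(id=node_id, text=text))
    for src, dst, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {src} {arrow} {dst}")
    return "\n".join(lines)


def unlabeled_decision_edges(nodes, edges):
    """Edges leaving a decision node without a yes/no/condition label."""
    return [(s, d) for s, d, label in edges if nodes[s][0] == "decision" and not label]


# The training pipeline exemplar, with hypothetical node ids.
NODES = {
    "raw": ("terminal", "Raw Data"),
    "prep": ("step", "Preprocessing"),
    "feat": ("step", "Feature Extraction"),
    "check": ("decision", "Validation Loss Decreasing?"),
    "train": ("step", "Continue Training"),
    "stop": ("step", "Early Stop"),
    "test": ("step", "Evaluate on Test Set"),
    "report": ("terminal", "Report Results"),
}
EDGES = [
    ("raw", "prep", ""), ("prep", "feat", ""), ("feat", "check", ""),
    ("check", "train", "Yes"), ("train", "check", ""),  # assumed loop target
    ("check", "stop", "No"), ("stop", "test", ""), ("test", "report", ""),
]
print(flowchart(NODES, EDGES))
```

Checking `unlabeled_decision_edges(NODES, EDGES)` before rendering catches the missing-label mistake listed above.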
5. Comparison Chart
Purpose: Compare multiple methods, models, or configurations across shared metrics. The "table as a figure" pattern.
Subtypes:
- Grouped bar chart (methods × metrics)
- Radar/spider chart (multi-dimensional comparison)
- Parallel coordinates (many dimensions)
- Table with heatmap coloring (for many metrics)
Structural template:
- X-axis: methods/models (categorical)
- Y-axis: metric value (numerical)
- Grouping: by metric or by method
- Highlighting: best result bolded or starred
- Baseline: horizontal dashed line for reference method
Key elements to extract:
- Method names (must match paper text)
- Metric names and values
- Which method is "ours" (for highlighting)
- Baseline/reference method
- Statistical significance markers (if applicable)
Backend: matplotlib code — NEVER AI image generation
Common mistakes:
- Too many methods/metrics in one chart — split or use table
- No baseline reference line — hard to judge improvement
- Inconsistent ordering — sort by performance or alphabetically
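For the grouped bar subtype, the layout arithmetic and the "best result" highlighting can be worked out before any plotting call. The sketch below is one possible approach, not a prescribed one: `grouped_bar_positions` and `best_per_metric` are hypothetical helpers, and the 0.8 group width is an arbitrary default.

```python
from __future__ import annotations


def grouped_bar_positions(n_groups: int, n_bars: int, group_width: float = 0.8):
    """Return (positions, bar_width); positions[b][g] is the x-center of bar b in group g.

    Groups are centered on the integers 0..n_groups-1 so tick labels can sit there.
    """
    bar_width = group_width / n_bars
    positions = [
        [g + (b - (n_bars - 1) / 2) * bar_width for g in range(n_groups)]
        for b in range(n_bars)
    ]
    return positions, bar_width


def best_per_metric(scores: dict[str, dict[str, float]],
                    higher_is_better: bool = True) -> dict[str, str]:
    """Map each metric to the method with the best value, for bolding or starring."""
    pick = max if higher_is_better else min
    return {metric: pick(by_method, key=by_method.get)
            for metric, by_method in scores.items()}
```

The positions feed straight into a matplotlib `ax.bar(positions[b], values, width=bar_width)` call per method, and the baseline can be added with `ax.axhline(value, linestyle="--")`.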
6. Result Visualization
Purpose: Show qualitative or semi-quantitative results: attention maps, feature visualizations, generated samples, ablation grids.
Subtypes:
- Attention heatmap overlay
- Feature map / activation visualization
- Sample grid (generated vs real)
- Ablation grid (each cell = one config)
- Confusion matrix
- t-SNE / UMAP embedding plots
Structural template:
- Grid layout: rows = conditions, columns = samples/metrics
- Consistent cell sizing
- Row/column headers clearly labeled
- Color scale legend (for heatmaps)
- Highlight: circles or boxes around key regions
Key elements to extract:
- What each row/column represents
- Color scale meaning and range
- Which samples to highlight and why
- Grid dimensions (rows × columns)
Backend: matplotlib/seaborn code (heatmaps, grids), basic-image-gen (overlays on real images)
Common mistakes:
- Missing color scale legend — uninterpretable
- Grid cells too small at print size — ensure readability
- No row/column labels — reader cannot identify conditions
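Collecting the Backend lines from sections 1 to 6 gives a small routing table. The sketch below only restates those lines in code; the authoritative routing lives in SKILL.md. `Route`, `ROUTES`, `allows_ai_generation`, and the `mode` strings are hypothetical names, and the type ids follow the Detected Type column of the auto-detection table.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    backends: tuple[str, ...]
    # "single": one backend; "pipeline": run in order;
    # "fallback": try in order until one works; "by-content": pick per figure content.
    mode: str


ROUTES = {
    "architecture-diagram": Route(("mermaid", "basic-image-gen"), "pipeline"),
    "data-plot": Route(("matplotlib/seaborn",), "single"),
    "conceptual": Route(("basic-image-gen",), "single"),
    "process-flow": Route(("mermaid", "tikz"), "fallback"),
    "comparison-chart": Route(("matplotlib",), "single"),
    "result-visualization": Route(("matplotlib/seaborn", "basic-image-gen"), "by-content"),
}


def allows_ai_generation(figure_type: str) -> bool:
    """True if any backend listed for this figure type is AI image generation."""
    return "basic-image-gen" in ROUTES[figure_type].backends
```

A guard like `allows_ai_generation` makes the "NEVER AI image generation" rule for data plots and comparison charts checkable rather than a convention.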
7. Auto-Detection Heuristics
Keyword-based classification from source content:
| Keywords / Patterns | Detected Type |
|---|---|
| "accuracy", "F1", "BLEU", "%", table of numbers, "vs", "baseline" | comparison-chart |
| "epoch", "learning rate", "loss curve", "over time", "trend" | data-plot |
| "architecture", "encoder", "decoder", "layer", "module", "block" | architecture-diagram |
| "step 1", "then", "if...then", "pipeline", "workflow", "algorithm" | process-flow |
| "ablation", "heatmap", "attention map", "feature map", "t-SNE" | result-visualization |
| "intuition", "concept", "metaphor", "analogy", "overview" | conceptual |
Confidence scoring:
- 3+ keywords from one category → HIGH confidence
- 2 keywords → MEDIUM confidence
- 1 keyword, or keywords spread across multiple categories → LOW confidence (ask the user to confirm the type)
Tie-breaking rules:
- Numeric data present → prefer data-driven types (data-plot > comparison-chart > result-visualization)
- No numeric data → prefer structural types (architecture-diagram > process-flow > conceptual)
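The keyword table, confidence scoring, and tie-breaking rules can be sketched as a small classifier. Several choices here are assumptions rather than part of this reference: `classify` and `count_hits` are hypothetical names, keywords match at the start of a word so plurals still count, the "table of numbers" and "if...then" patterns are left out, any hit in a second category forces LOW, and "numeric data present" is approximated as a decimal or percentage appearing in the text.

```python
from __future__ import annotations

import re

KEYWORDS = {
    "comparison-chart": ["accuracy", "f1", "bleu", "%", "vs", "baseline"],
    "data-plot": ["epoch", "learning rate", "loss curve", "over time", "trend"],
    "architecture-diagram": ["architecture", "encoder", "decoder", "layer", "module", "block"],
    "process-flow": ["step 1", "then", "pipeline", "workflow", "algorithm"],
    "result-visualization": ["ablation", "heatmap", "attention map", "feature map", "t-sne"],
    "conceptual": ["intuition", "concept", "metaphor", "analogy", "overview"],
}
DATA_ORDER = ["data-plot", "comparison-chart", "result-visualization"]
STRUCT_ORDER = ["architecture-diagram", "process-flow", "conceptual"]


def count_hits(text: str) -> dict[str, int]:
    """Count distinct keywords found per figure type (types with zero hits are omitted)."""
    lowered = text.lower()
    counts: dict[str, int] = {}
    for figure_type, keywords in KEYWORDS.items():
        hits = 0
        for kw in keywords:
            # "%" has no word boundary to anchor on; everything else matches at a word start.
            pattern = re.escape(kw) if kw == "%" else r"\b" + re.escape(kw)
            if re.search(pattern, lowered):
                hits += 1
        if hits:
            counts[figure_type] = hits
    return counts


def classify(text: str) -> tuple[str | None, str]:
    """Return (figure type or None, confidence) following the rules above."""
    counts = count_hits(text)
    if not counts:
        return None, "LOW"
    top = max(counts.values())
    tied = [t for t, n in counts.items() if n == top]
    # Assumption: a decimal or a percentage counts as "numeric data present".
    has_numbers = re.search(r"\d+\.\d+|\d+\s*%", text) is not None
    order = DATA_ORDER + STRUCT_ORDER if has_numbers else STRUCT_ORDER + DATA_ORDER
    best = min(tied, key=order.index)
    if len(counts) > 1 or top == 1:
        return best, "LOW"  # caller should ask the user
    return best, "HIGH" if top >= 3 else "MEDIUM"
```

For instance, `classify("The encoder and decoder each stack six layers.")` finds three architecture keywords and no others, so it returns `("architecture-diagram", "HIGH")`.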
8. Prompt Engineering Templates
For basic-image-gen (architecture-diagram, conceptual)
Base template:
Create a publication-quality scientific illustration for an academic paper.
Subject: {description_from_blueprint}
Style requirements:
- Clean white background
- No decorative elements (no shadows, no 3D, no gradients)
- Flat design with consistent line weights
- Colors: {palette_from_academic_styles}
- All text in {font_family}, minimum {min_font_size}pt equivalent
- Professional academic illustration style
Text elements that MUST appear legibly:
{list_all_labels_and_annotations}
Layout: {layout_description}
Aspect ratio: {width}:{height}
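Because the placeholders use brace syntax, the base template can be filled with Python's standard string formatting. The following is a minimal sketch: `placeholders` and `render_prompt` are hypothetical helpers, and how the blueprint and style values are obtained is outside its scope.

```python
from __future__ import annotations

from string import Formatter

BASE_TEMPLATE = """Create a publication-quality scientific illustration for an academic paper.
Subject: {description_from_blueprint}
Style requirements:
- Clean white background
- No decorative elements (no shadows, no 3D, no gradients)
- Flat design with consistent line weights
- Colors: {palette_from_academic_styles}
- All text in {font_family}, minimum {min_font_size}pt equivalent
- Professional academic illustration style
Text elements that MUST appear legibly:
{list_all_labels_and_annotations}
Layout: {layout_description}
Aspect ratio: {width}:{height}"""


def placeholders(template: str) -> set[str]:
    """Names of all {field} placeholders in a format-style template."""
    return {name for _, name, _, _ in Formatter().parse(template) if name}


def render_prompt(template: str, **fields: object) -> str:
    """Fill the template, refusing to emit a prompt with unfilled placeholders."""
    missing = placeholders(template) - set(fields)
    if missing:
        raise KeyError(f"unfilled placeholders: {sorted(missing)}")
    return template.format(**fields)
```

Failing loudly on a missing field avoids sending a prompt that still contains a literal `{font_family}` to the image backend. The same two helpers apply unchanged to the architecture and concept templates.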
For architecture diagrams specifically
Architecture template:
Create a clean technical architecture diagram for a research paper.
Components (left to right / top to bottom):
{component_list_with_descriptions}
Connections:
{arrow_descriptions_with_labels}
Style: flat design, {venue} conference style, colors {hex_values},
sans-serif labels, consistent box sizing, white background.
No decorative elements. Print-ready at {width}" x {height}".
For conceptual illustrations
Concept template:
Create a minimalist scientific concept illustration.
Core concept: {concept_description}
Visual metaphor: {metaphor_description}
Must include these labeled elements:
{element_list}
Style: clean, academic, flat design. Colors: {hex_values}.
White background, no gradients or shadows.
All text must be clearly readable at small print size.
Cross-reference: SKILL.md §Decision Tree, §Figure Type → Backend Routing