vault: auto-sync 2026-03-09 02:00

This commit is contained in:
lizikk 2026-03-09 02:00:02 +08:00
parent 7e70828216
commit 77a61e1106
25 changed files with 1467 additions and 0 deletions

View File

@ -0,0 +1,573 @@
---
id: "2026-03-08-gpt-5-4-prompt-engineering-guide"
title: "Prompt Guidance for GPT-5.4"
slug: "gpt-5-4-prompt-engineering-guide"
status: "inbox"
content_type: "article"
channels: []
language: "en"
source_urls:
- "https://developers.openai.com/api/docs/guides/latest-model"
assets: []
cover_image: ""
template: "article"
owner: "content-forge"
created_at: "2026-03-08T00:00:00+08:00"
updated_at: "2026-03-08T00:00:00+08:00"
published_at: null
tags:
- "prompt-engineering"
- "gpt-5"
- "openai"
- "llm"
---
# Prompt Guidance for GPT-5.4
GPT-5.4, our newest mainline model, is designed to balance long-running task performance, stronger control over style and behavior, and more disciplined execution across complex workflows. Building on advances from GPT-5 through GPT-5.3-Codex, GPT-5.4 improves token efficiency, sustains multi-step workflows more reliably, and performs well on long-horizon tasks.
GPT-5.4 is designed for production-grade assistants and agents that need strong multi-step reasoning, evidence-rich synthesis, and reliable performance over long contexts. It is especially effective when prompts clearly specify the output contract, tool-use expectations, and completion criteria. In practice, the biggest gains come from choosing the right reasoning effort for the task, using explicit grounding and citation rules, and giving the model a precise definition of what "done" looks like. This guide focuses on prompt patterns and migration practices that preserve those efficiency wins. For model capabilities, API parameters, and broader migration guidance, see [our latest model guide](https://developers.openai.com/api/docs/guides/latest-model).
## Understand GPT-5.4 behavior
### Where GPT-5.4 is strongest
GPT-5.4 tends to work especially well in these areas:
- Strong personality and tone adherence, with less drift over long answers
- Agentic workflow robustness, with a stronger tendency to stick with multi-step work, retry, and complete agent loops end to end
- Evidence-rich synthesis, especially in long-context or multi-tool workflows
- Instruction adherence in modular, skill-based, and block-structured prompts when the contract is explicit
- Long-context analysis across large, messy, or multi-document inputs
- Batched or parallel tool calling while maintaining tool-call accuracy
- Spreadsheet, finance, and Excel workflows that need instruction following, formatting fidelity, and stronger self-verification
### Where explicit prompting still helps
Even with those strengths, GPT-5.4 benefits from more explicit guidance in a few recurring patterns:
- Low-context tool routing early in a session, when tool selection can be less reliable
- Dependency-aware workflows that need explicit prerequisite and downstream-step checks
- Reasoning effort selection, where higher effort is not always better and the right choice depends on task shape, not intuition
- Research tasks that require disciplined source collection and consistent citations
- Irreversible or high-impact actions that require verification before execution
- Terminal or coding-agent environments where tool boundaries must stay clear
These patterns are observed defaults, not guarantees. Start with the smallest prompt that passes your evals, and add blocks only when they fix a measured failure mode.
## Use core prompt patterns
### Keep outputs compact and structured
To improve token efficiency with GPT-5.4, constrain verbosity and enforce structured output through clear output contracts. In practice, this acts as an additional control layer alongside the `verbosity` parameter in the Responses API, allowing you to guide both how much the model writes and how it structures the output.
```xml
<output_contract>
- Return exactly the sections requested, in the requested order.
- If the prompt defines a preamble, analysis block, or working section, do not treat it as extra output.
- Apply length limits only to the section they are intended for.
- If a format is required (JSON, Markdown, SQL, XML), output only that format.
</output_contract>
<verbosity_controls>
- Prefer concise, information-dense writing.
- Avoid repeating the user's request.
- Keep progress updates brief.
- Do not shorten the answer so aggressively that required evidence, reasoning, or completion checks are omitted.
</verbosity_controls>
```
### Set clear defaults for follow-through
Users often change the task, format, or tone mid-conversation. To keep the assistant aligned, define clear rules for when to proceed, when to ask, and how newer instructions override earlier defaults.
Use a default follow-through policy like this:
```xml
<default_follow_through_policy>
- If the user's intent is clear and the next step is reversible and low-risk, proceed without asking.
- Ask permission only if the next step is:
(a) irreversible,
(b) has external side effects (for example sending, purchasing, deleting, or writing to production), or
(c) requires missing sensitive information or a choice that would materially change the outcome.
- If proceeding, briefly state what you did and what remains optional.
</default_follow_through_policy>
```
Make instruction priority explicit:
```xml
<instruction_priority>
- User instructions override default style, tone, formatting, and initiative preferences.
- Safety, honesty, privacy, and permission constraints do not yield.
- If a newer user instruction conflicts with an earlier one, follow the newer instruction.
- Preserve earlier instructions that do not conflict.
</instruction_priority>
```
Higher-priority developer or system instructions remain binding.
**Guidance:** When instructions change mid-conversation, make the update explicit, scoped, and local. State what changed, what still applies, and whether the change affects the next turn or the rest of the conversation.
### Handle mid-conversation instruction updates
For mid-conversation updates, use explicit, scoped steering messages that state:
1. Scope
2. Override
3. Carry forward
```text
<task_update>
For the next response only:
- Do not complete the task.
- Only produce a plan.
- Keep it to 5 bullets.
All earlier instructions still apply unless they conflict with this update.
</task_update>
```
If the task itself changes, say so directly:
```text
<task_update>
The task has changed.
Previous task: complete the workflow.
Current task: review the workflow and identify risks only.
Rules for this turn:
- Do not execute actions.
- Do not call destructive tools.
- Return exactly:
1. Main risks
2. Missing information
3. Recommended next step
</task_update>
```
### Make tool use persistent when correctness depends on it
Use explicit rules to keep tool use thorough, dependency-aware, and appropriately paced, especially in workflows where later actions rely on earlier retrieval or verification. A common failure mode is skipping prerequisites because the right end state seems obvious.
GPT-5.4 can be less reliable at tool routing early in a session, when context is still thin. Prompt for prerequisites, dependency checks, and exact tool intent.
```xml
<tool_persistence_rules>
- Use tools whenever they materially improve correctness, completeness, or grounding.
- Do not stop early when another tool call is likely to materially improve correctness or completeness.
- Keep calling tools until:
(1) the task is complete, and
(2) verification passes (see <verification_loop>).
- If a tool returns empty or partial results, retry with a different strategy.
</tool_persistence_rules>
```
This is especially important for workflows where the final action depends on earlier lookup or retrieval steps. One of the most common failure modes is skipping prerequisites because the intended end state seems obvious.
```xml
<dependency_checks>
- Before taking an action, check whether prerequisite discovery, lookup, or memory retrieval steps are required.
- Do not skip prerequisite steps just because the intended final action seems obvious.
- If the task depends on the output of a prior step, resolve that dependency first.
</dependency_checks>
```
Prompt for parallelism when the work is independent and wall-clock matters. Prompt for sequencing when dependencies, ambiguity, or irreversible actions matter more than speed.
```xml
<parallel_tool_calling>
- When multiple retrieval or lookup steps are independent, prefer parallel tool calls to reduce wall-clock time.
- Do not parallelize steps that have prerequisite dependencies or where one result determines the next action.
- After parallel retrieval, pause to synthesize the results before making more calls.
- Prefer selective parallelism: parallelize independent evidence gathering, not speculative or redundant tool use.
</parallel_tool_calling>
```
### Force completeness on long-horizon tasks
For multi-step workflows, a common failure mode is incomplete execution: the model finishes after partial coverage, misses items in a batch, or treats empty or narrow retrieval as final. GPT-5.4 becomes more reliable when the prompt defines explicit completion rules and recovery behavior.
Coverage can be achieved through sequential or parallel retrieval, but completion rules should remain explicit either way.
```xml
<completeness_contract>
- Treat the task as incomplete until all requested items are covered or explicitly marked [blocked].
- Keep an internal checklist of required deliverables.
- For lists, batches, or paginated results:
- determine expected scope when possible,
- track processed items or pages,
- confirm coverage before finalizing.
- If any item is blocked by missing data, mark it [blocked] and state exactly what is missing.
</completeness_contract>
```
For workflows where empty, partial, or noisy retrieval is common:
```xml
<empty_result_recovery>
If a lookup returns empty, partial, or suspiciously narrow results:
- do not immediately conclude that no results exist,
- try at least one or two fallback strategies,
such as:
- alternate query wording,
- broader filters,
- a prerequisite lookup,
- or an alternate source or tool,
- Only then report that no results were found, along with what you tried.
</empty_result_recovery>
```
### Add a verification loop before high-impact actions
Once the workflow appears complete, add a lightweight verification step before returning the answer or taking an irreversible action. This helps catch requirement misses, grounding issues, and format drift before commit.
```xml
<verification_loop>
Before finalizing:
- Check correctness: does the output satisfy every requirement?
- Check grounding: are factual claims backed by the provided context or tool outputs?
- Check formatting: does the output match the requested schema or style?
- Check safety and irreversibility: if the next step has external side effects, ask permission first.
</verification_loop>
```
```xml
<missing_context_gating>
- If required context is missing, do NOT guess.
- Prefer the appropriate lookup tool when the missing context is retrievable; ask a minimal clarifying question only when it is not.
- If you must proceed, label assumptions explicitly and choose a reversible action.
</missing_context_gating>
```
For agents that actively take actions, add a short execution frame:
```xml
<action_safety>
- Pre-flight: summarize the intended action and parameters in 1-2 lines.
- Execute via tool.
- Post-flight: confirm the outcome and any validation that was performed.
</action_safety>
```
## Handle specialized workflows
### Choose image detail explicitly for vision and computer use
If your workflow depends on visual precision, specify the image `detail` level in the prompt or integration instead of relying on `auto`. Use `high` for standard high-fidelity image understanding. Use `original` for large, dense, or spatially sensitive images, especially [computer use, localization, OCR, and click-accuracy tasks](https://developers.openai.com/api/docs/guides/tools-computer-use) on `gpt-5.4` and future models. Use `low` only when speed and cost matter more than fine detail. For more details on image detail levels, see the [Images and Vision guide](https://developers.openai.com/api/docs/guides/images-vision).
### Lock research and citations to retrieved evidence
When citation quality matters, make both the source boundary and the format requirement explicit. This helps reduce fabricated references, unsupported claims, and citation-format drift.
```xml
<citation_rules>
- Only cite sources retrieved in the current workflow.
- Never fabricate citations, URLs, IDs, or quote spans.
- Use exactly the citation format required by the host application.
- Attach citations to the specific claims they support, not only at the end.
</citation_rules>
```
```xml
<grounding_rules>
- Base claims only on provided context or tool outputs.
- If sources conflict, state the conflict explicitly and attribute each side.
- If the context is insufficient or irrelevant, narrow the answer or say you cannot support the claim.
- If a statement is an inference rather than a directly supported fact, label it as an inference.
</grounding_rules>
```
If your application requires inline citations, require inline citations. If it requires footnotes, require footnotes. The key is to lock the format and prevent the model from improvising unsupported references.
### Research mode
Push GPT-5.4 into a disciplined research mode. Use this pattern for research, review, and synthesis tasks. Do not force it onto short execution tasks or simple deterministic transforms.
```xml
<research_mode>
- Do research in 3 passes:
1) Plan: list 3-6 sub-questions to answer.
2) Retrieve: search each sub-question and follow 1-2 second-order leads.
3) Synthesize: resolve contradictions and write the final answer with citations.
- Stop only when more searching is unlikely to change the conclusion.
</research_mode>
```
If your host environment uses a specific research tool or requires a submit step, combine this with the host's finalization contract.
### Clamp strict output formats
For SQL, JSON, or other parse-sensitive outputs, tell GPT-5.4 to emit only the target format and check it before finishing.
```text
<structured_output_contract>
- Output only the requested format.
- Do not add prose or markdown fences unless they were requested.
- Validate that parentheses and brackets are balanced.
- Do not invent tables or fields.
- If required schema information is missing, ask for it or return an explicit error object.
</structured_output_contract>
```
If you are extracting document regions or OCR boxes, define the coordinate system and add a drift check:
```text
<bbox_extraction_spec>
- Use the specified coordinate format exactly, such as [x1,y1,x2,y2] normalized to 0..1.
- For each box, include page, label, text snippet, and confidence.
- Add a vertical-drift sanity check so boxes stay aligned with the correct line of text.
- If the layout is dense, process page by page and do a second pass for missed items.
</bbox_extraction_spec>
```
### Keep tool boundaries explicit in coding and terminal agents
In coding agents, GPT-5.4 works better when the rules for shell access and file editing are unambiguous. This is especially important when you expose tools like [Shell](https://developers.openai.com/api/docs/guides/tools-shell) or [Apply patch](https://developers.openai.com/api/docs/guides/tools-apply-patch).
### User updates
GPT-5.4 does well with brief, outcome-based updates. Reuse the user-updates pattern from the 5.2 guide, but pair it with explicit completion and verification requirements.
Recommended update spec:
```xml
<user_updates_spec>
- Only update the user when starting a new major phase or when something changes the plan.
- Each update: 1 sentence on outcome + 1 sentence on next step.
- Do not narrate routine tool calls.
- Keep the user-facing status short; keep the work exhaustive.
</user_updates_spec>
```
For coding agents, see the Prompting patterns for coding tasks section below for more specific guidance.
### Prompting patterns for coding tasks
**Autonomy and persistence**
GPT-5.4 is generally more thorough end to end than earlier mainline models on coding and tool-use tasks, so you often need less explicit "verify everything" prompting. Still, for high-stakes changes such as production, migrations, or security work, keep a lightweight verification clause.
```xml
<autonomy_and_persistence>
Persist until the task is fully handled end-to-end within the current turn whenever feasible: do not stop at analysis or partial fixes; carry changes through implementation, verification, and a clear explanation of outcomes unless the user explicitly pauses or redirects you.
Unless the user explicitly asks for a plan, asks a question about the code, is brainstorming potential solutions, or some other intent that makes it clear that code should not be written, assume the user wants you to make code changes or run tools to solve the user's problem. In these cases, it's bad to output your proposed solution in a message, you should go ahead and actually implement the change. If you encounter challenges or blockers, you should attempt to resolve them yourself.
</autonomy_and_persistence>
```
**Intermediary updates**
Keep updates sparse and high-signal. In coding tasks, prefer updates at key points.
```xml
<user_updates_spec>
- Intermediary updates go to the `commentary` channel.
- User updates are short updates while you are working. They are not final answers.
- Use 1-2 sentence updates to communicate progress and new information while you work.
- Do not begin responses with conversational interjections or meta commentary. Avoid openers such as acknowledgements ("Done -", "Got it", or "Great question") or similar framing.
- Before exploring or doing substantial work, send a user update explaining your understanding of the request and your first step. Avoid commenting on the request or starting with phrases such as "Got it" or "Understood."
- Provide updates roughly every 30 seconds while working.
- When exploring, explain what context you are gathering and what you learned. Vary sentence structure so the updates do not become repetitive.
- When working for a while, keep updates informative and varied, but stay concise.
- When work is substantial, provide a longer plan after you have enough context. This is the only update that may be longer than 2 sentences and may contain formatting.
- Before file edits, explain what you are about to change.
- While thinking, keep the user informed of progress without narrating every tool call. Even if you are not taking actions, send frequent progress updates rather than going silent, especially if you are thinking for more than a short stretch.
- Keep the tone of progress updates consistent with the assistant's overall personality.
</user_updates_spec>
```
**Formatting**
GPT-5.4 often defaults to more structured formatting and may overuse bullet lists. If you want a clean final response, explicitly clamp list shape.
```xml
Never use nested bullets. Keep lists flat (single level). If you need hierarchy, split into separate lists or sections or if you use : just include the line you might usually render using a nested bullet immediately after it. For numbered lists, only use the `1. 2. 3.` style markers (with a period), never `1)`.
```
**Frontend tasks**
Use this only when additional frontend guidance is useful.
```xml
<frontend_tasks>
When doing frontend design tasks, avoid generic, overbuilt layouts.
Use these hard rules:
- One composition: The first viewport must read as one composition, not a dashboard, unless it is a dashboard.
- Brand first: On branded pages, the brand or product name must be a hero-level signal, not just nav text or an eyebrow. No headline should overpower the brand.
- Brand test: If the first viewport could belong to another brand after removing the nav, the branding is too weak.
- Full-bleed hero only: On landing pages and promotional surfaces, the hero image should usually be a dominant edge-to-edge visual plane or background. Do not default to inset hero images, side-panel hero images, rounded media cards, tiled collages, or floating image blocks unless the existing design system clearly requires them.
- Hero budget: The first viewport should usually contain only the brand, one headline, one short supporting sentence, one CTA group, and one dominant image. Do not place stats, schedules, event listings, address blocks, promos, "this week" callouts, metadata rows, or secondary marketing content there.
- No hero overlays: Do not place detached labels, floating badges, promo stickers, info chips, or callout boxes on top of hero media.
- Cards: Default to no cards. Never use cards in the hero unless they are the container for a user interaction. If removing a border, shadow, background, or radius does not hurt interaction or understanding, it should not be a card.
- One job per section: Each section should have one purpose, one headline, and usually one short supporting sentence.
- Real visual anchor: Imagery should show the product, place, atmosphere, or context.
- Reduce clutter: Avoid pill clusters, stat strips, icon rows, boxed promos, schedule snippets, and competing text blocks.
- Use motion to create presence and hierarchy, not noise. Ship 2-3 intentional motions for visually led work, and prefer Framer Motion when it is available.
Exception: If working within an existing website or design system, preserve the established patterns, structure, and visual language.
</frontend_tasks>
```
```xml
<terminal_tool_hygiene>
- Only run shell commands via the terminal tool.
- Never "run" tool names as shell commands.
- If a patch or edit tool exists, use it directly; do not attempt it in bash.
- After changes, run a lightweight verification step such as ls, tests, or a build before declaring the task done.
</terminal_tool_hygiene>
```
### Document localization and OCR boxes
For bbox tasks, be explicit about coordinate conventions and add drift tests.
```xml
<bbox_extraction_spec>
- Use the specified coordinate format exactly (for example [x1,y1,x2,y2] normalized 0..1).
- For each bbox, include: page, label, text snippet, confidence.
- Add a vertical-drift sanity check:
- ensure bboxes align with the line of text (not shifted up or down).
- If dense layout, process page by page and do a second pass for missed items.
</bbox_extraction_spec>
```
### Use runtime and API integration notes
For long-running or tool-heavy agents, the runtime contract matters as much as the prompt contract.
**Phase parameter**
To better support preamble messages with GPT-5.4, the Responses API includes a `phase` field designed to prevent early stopping on longer-running tasks and other misbehaviors.
- `phase` is optional at the API level, but it is highly recommended. Best-effort inference may exist server-side, but explicit round-tripping of `phase` is strictly better.
- Use `phase` for long-running or tool-heavy agents that may emit commentary before tool calls or before a final answer.
- Preserve `phase` when replaying prior assistant items so the model can distinguish working commentary from the completed answer. This matters most in multi-step flows with preambles, tool-related updates, or multiple assistant messages in the same turn.
- Do not add `phase` to user messages.
- If you use `previous_response_id`, that is usually the simplest path, since OpenAI can often recover prior state without manually replaying assistant items.
- If you replay assistant history yourself, preserve the original `phase` values.
- Missing or dropped `phase` can cause preambles to be interpreted as final answers and degrade behavior on longer, multi-step tasks.
### Preserve behavior in long sessions
Compaction unlocks significantly longer effective context windows, where user conversations can persist for many turns without hitting context limits or long-context performance degradation, and agents can perform very long trajectories that exceed a typical context window for long-running, complex tasks.
If you are using [Compaction](https://developers.openai.com/api/docs/guides/compaction) in the Responses API, compact after major milestones, treat compacted items as opaque state, and keep prompts functionally identical after compaction. The endpoint is ZDR compatible and returns an `encrypted_content` item that you can pass into future requests. GPT-5.4 tends to remain more coherent and reliable over longer, multi-turn conversations with fewer breakdowns as sessions grow.
For more guidance, see the [`/responses/compact` API reference](https://developers.openai.com/api/docs/api-reference/responses/compact).
### Control personality for customer-facing workflows
GPT-5.4 can be steered more effectively when you separate persistent personality from per-response writing controls. This is especially useful for customer-facing workflows such as emails, support replies, announcements, and blog-style content.
- **Personality (persistent):** sets the default tone, verbosity, and decision style across the session.
- **Writing controls (per response):** define the channel, register, formatting, and length for a specific artifact.
- **Reminder:** personality should not override task-specific output requirements. If the user asks for JSON, return JSON.
For natural, high-quality prose, the highest-leverage controls are:
- Give the model a clear persona.
- Specify the channel and emotional register.
- Explicitly ban formatting when you want prose.
- Use hard length limits.
```xml
<personality_and_writing_controls>
- Persona: <one sentence>
- Channel: <Slack | email | memo | PRD | blog>
- Emotional register: <direct/calm/energized/etc.> + "not <overdo this>"
- Formatting: <ban bullets/headers/markdown if you want prose>
- Length: <hard limit, e.g. <=150 words or 3-5 sentences>
- Default follow-through: if the request is clear and low-risk, proceed without asking permission.
</personality_and_writing_controls>
```
For more personality patterns you can lift directly, see the [Prompt Personalities cookbook](https://developers.openai.com/cookbook/examples/gpt-5/prompt_personalities).
**Professional memo mode**
For memos, reviews, and other professional writing tasks, general writing instructions are often not enough. These workflows benefit from explicit guidance on specificity, domain conventions, synthesis, and calibrated certainty.
```xml
<memo_mode>
- Write in a polished, professional memo style.
- Use exact names, dates, entities, and authorities when supported by the record.
- Follow domain-specific structure if one is requested.
- Prefer precise conclusions over generic hedging.
- When uncertainty is real, tie it to the exact missing fact or conflicting source.
- Synthesize across documents rather than summarizing each one independently.
</memo_mode>
```
This mode is especially useful for legal, policy, research, and executive-facing writing, where the goal is not just fluency, but disciplined synthesis and clear conclusions.
## Tune reasoning and migration
### Treat reasoning effort as a last-mile knob
Reasoning effort is not one-size-fits-all. Treat it as a last-mile tuning knob, not the primary way to improve quality. In many cases, stronger prompts, clear output contracts, and lightweight verification loops recover much of the performance teams might otherwise seek through higher reasoning settings.
Recommended defaults:
- `none`: Best for fast, cost-sensitive, latency-sensitive tasks where the model does not need to think.
- `low`: Works well for latency-sensitive tasks where a small amount of thinking can produce a meaningful accuracy gain, especially with complex instructions.
- `medium` or `high`: Reserve for tasks that truly require stronger reasoning and can absorb the latency and cost tradeoff. Choose between them based on how much performance gain your task gets from additional reasoning.
- `xhigh`: Avoid as a default unless your evals show clear benefits. It is best suited for long, agentic, reasoning-heavy tasks where maximum intelligence matters more than speed or cost.
In practice, most teams should default to the `none`, `low`, or `medium` range.
Start with `none` for execution-heavy workloads such as workflow steps, field extraction, support triage, and short structured transforms.
Start with `medium` or higher for research-heavy workloads such as long-context synthesis, multi-document review, conflict resolution, and strategy writing. With `medium` and a well-engineered prompt, you can squeeze out a lot of performance.
For GPT-5.4 workloads, `none` can already perform well on action-selection and tool-discipline tasks. If your workload depends on nuanced interpretation, such as implicit requirements, ambiguity, or cancelled-tool-call recovery, start with `low` or `medium` instead.
Before increasing reasoning effort, first add:
- `<completeness_contract>`
- `<verification_loop>`
- `<tool_persistence_rules>`
If the model still feels too literal or stops at the first plausible answer, add an initiative nudge before raising reasoning effort:
```xml
<dig_deeper_nudge>
- Don't stop at the first plausible answer.
- Look for second-order issues, edge cases, and missing constraints.
- If the task is safety or accuracy critical, perform at least one verification step.
</dig_deeper_nudge>
```
### Migrate prompts to GPT-5.4 one change at a time
Use the same one-change-at-a-time discipline as the 5.2 guide: switch model first, pin `reasoning_effort`, run evals, then iterate.
These starting points work well for many migrations:
| Current setup | Suggested GPT-5.4 start | Notes |
| ------------------------- | ---------------------------------- | ------------------------------------------------------------------- |
| `gpt-5.2` | Match the current reasoning effort | Preserve the existing latency and quality profile first, then tune. |
| `gpt-5.3-codex` | Match the current reasoning effort | For coding workflows, keep the reasoning effort the same. |
| `gpt-4.1` or `gpt-4o` | `none` | Keep snappy behavior, and increase only if evals regress. |
| Research-heavy assistants | `medium` or `high` | Use explicit research multi-pass and citation gating. |
| Long-horizon agents | `medium` or `high` | Add tool persistence and completeness accounting. |
### Web search and deep research
If you are migrating a research agent in particular, make these prompt updates before increasing reasoning effort:
- Add `<research_mode>`
- Add `<citation_rules>`
- Add `<empty_result_recovery>`
- Increase `reasoning_effort` one notch only after prompt fixes.
You can start from the 5.2 research block and then layer in citation gating and finalization contracts as needed.
GPT-5.4 performs especially well when the task requires multi-step evidence gathering, long-context synthesis, and explicit prompt contracts. In practice, the highest-leverage prompt changes are choosing reasoning effort by task shape, defining exact output and citation formats, adding dependency-aware tool rules, and making completion criteria explicit. The model is often strong out of the box, but it is most reliable when prompts clearly specify how to search, how to verify, and what counts as done.
## Next steps
- Read [our latest model guide](https://developers.openai.com/api/docs/guides/latest-model) for model capabilities, parameters, and API compatibility details.
- Read [Prompt engineering](https://developers.openai.com/api/docs/guides/prompt-engineering) for broader prompting strategies that apply across model families.
- Read [Compaction](https://developers.openai.com/api/docs/guides/compaction) if you are building long-running GPT-5.4 sessions in the Responses API.

View File

@ -0,0 +1,271 @@
---
id: 2026-03-08-gpt-5-4-prompt-patterns
title: 我读完 GPT-5.4 官方指南:别急着加钱调参数,先把这 6 个模块写对
slug: gpt-5-4-prompt-patterns
status: polish
content_type: article
channels:
- wechat
- x
language: zh-CN
source_urls:
- https://developers.openai.com/api/docs/guides/latest-model
assets:
- 05-assets/gpt-5-4-prompt-patterns/cover.png
cover_image: 05-assets/gpt-5-4-prompt-patterns/cover.png
template: article
owner: content-forge
created_at: 2026-03-08T00:00:00+08:00
updated_at: 2026-03-08T11:51:04+08:00
style: tech_blog
audience: 泛 AI 从业者
tags:
- prompt-engineering
- gpt-5
- openai
- llm
- agent
source_notes:
- 00-inbox/2026-03-08-gpt-5-4-prompt-engineering-guide.md
review_status: passed
review_passed_at: 2026-03-08T11:44:21+08:00
polish_status: done
polish_version: "1"
polished_at: 2026-03-08T11:51:03+08:00
---
# 我读完 GPT-5.4 官方指南:别急着加钱调参数,先把这 6 个模块写对
OpenAI 刚放出了 GPT-5.4 的官方 Prompt Engineering 指南。说实话,我原以为又是那种"清晰指令+角色扮演+few-shot"的老三篇。
结果不是。
这份指南的核心论点让我愣了一下:**别急着调 reasoning effort先把你的 prompt 结构写对。** 在大多数场景下,`none` 或 `low` 就够了——前提是你的 prompt 里有完整的"输出契约"和"验证循环"。
这篇文章把这份 4000+ 词的英文指南拆成 6 个可直接复用的 prompt 模块。每个模块给出中文版模板,复制过去就能用。
---
## 01 GPT-5.4 变了什么
GPT-5.4 相比前代的核心变化不在"更聪明",而在**更可控**
- 人格和语气在长回复中漂移更少
- Agentic 工作流更鲁棒——更倾向于坚持多步任务、自动重试、端到端完成
- 批量/并行 tool calling 准确率提升
- 长上下文分析能力增强
但它也暴露了几个必须靠 prompt 补救的短板:
| 短板 | 具体表现 | 解法 |
|------|---------|------|
| 低上下文 tool routing | 会话初期工具选择不靠谱 | 加前置条件检查 |
| 依赖感知弱 | 跳过前置步骤直奔终态 | 加 dependency_checks |
| reasoning effort 误区 | 开发者默认拉满,但更高 ≠ 更好 | 按任务类型选档位 |
| 不可逆操作 | 高影响操作缺少确认 | 加 verification_loop |
关键洞察:**这些短板的解法不是换模型或加预算,是写更好的 prompt 结构。**
但光知道"要写好 prompt"没用,得知道具体写什么。接下来拆解的 6 个模块就是答案。
---
## 02 六个核心 Prompt 模块
整份指南的精华浓缩为 6 个 XML 模块。它们像乐高积木——你的 system prompt 按需组装就行。
### 模块 1输出契约Output Contract
控制模型"说什么、怎么说、说多少"。
```xml
<output_contract>
- 只返回要求的章节,按要求的顺序。
- 如果 prompt 定义了前言、分析块或工作区,不要把它当额外输出。
- 长度限制只应用于指定的章节。
- 如果要求特定格式JSON/Markdown/SQL/XML只输出那种格式。
</output_contract>
<verbosity_controls>
- 简洁、信息密集。
- 不要重复用户的请求。
- 进度更新保持简短。
- 不要为了短而省掉必要的证据、推理或完成检查。
</verbosity_controls>
```
**什么时候用**:几乎所有 production prompt 都该加。它解决的是"模型废话太多"和"格式乱飘"这两个最高频的问题。
### 模块 2跟进策略Follow-Through Policy
告诉模型什么时候该自己干、什么时候该问你。
```xml
<default_follow_through_policy>
- 如果用户意图清晰且下一步可逆、低风险,直接执行不要问。
- 只在以下情况问:
(a) 不可逆操作,
(b) 有外部副作用(发送、购买、删除、写入生产环境),
(c) 需要缺失的敏感信息或会显著改变结果的选择。
- 如果直接执行了,简要说明做了什么。
</default_follow_through_policy>
```
**什么时候用**Agent 场景必备。你有没有遇到过 Agent 每走一步都来问你"可以继续吗?"让你想砸键盘?这个模块就是解药。
### 模块 3工具持久化Tool Persistence
防止模型"觉得差不多了就停手"——Agent 场景最常见的翻车模式。
```xml
<tool_persistence_rules>
- 只要工具调用能实质性提升正确性或完整性,就调用。
- 当再调一次工具很可能提升质量时,不要提前停止。
- 持续调用工具直到:(1) 任务完成,且 (2) 验证通过。
- 如果工具返回空或部分结果,换策略重试。
</tool_persistence_rules>
<dependency_checks>
- 执行操作前,检查是否需要前置的发现、查找或记忆检索步骤。
- 不要因为最终操作看起来很明显就跳过前置步骤。
- 如果任务依赖前一步的输出,先解决那个依赖。
</dependency_checks>
```
多少次你让 AI 搜个东西,它搜不到就直接告诉你"没有相关结果"?这就是缺了 tool persistence。
还有一个配套模块值得加:
```xml
<parallel_tool_calling>
- 多个检索步骤相互独立时,优先并行调用以减少等待时间。
- 有前置依赖的步骤不要并行。
- 并行检索后暂停,合成结果再决定下一步。
</parallel_tool_calling>
```
RAG、多工具 Agent、批量处理——这三个块组合使用效果最好。
### 模块 4完整性保障Completeness Contract
防止模型"做到一半就交差"。
```xml
<completeness_contract>
- 所有请求项都覆盖或标记 [blocked] 之前,任务就是未完成。
- 维护内部交付物检查清单。
- 对列表或分页结果:确定预期范围,跟踪已处理项,确认覆盖率。
- 因缺失数据阻塞的项标记 [blocked],说明具体缺什么。
</completeness_contract>
<empty_result_recovery>
如果查找返回空或可疑地窄的结果:
- 不要立即断定"没有结果"。
- 至少尝试 1-2 个回退策略:换关键词、放宽过滤、前置查找、换工具。
- 然后才报告未找到,并说明尝试了什么。
</empty_result_recovery>
```
`empty_result_recovery` 这个块特别实用。它本质上是在教模型一个人类常识:**搜不到 ≠ 不存在,可能是你搜的方式不对。**
但如果你以为写对这些模块就万事大吉,那就太乐观了。
### 模块 5验证循环Verification Loop
最终交付或执行不可逆操作之前,强制自检。
```xml
<verification_loop>
最终交付前:
- 正确性:输出是否满足每一个要求?
- 可验证性:事实声明是否有上下文或工具输出支撑?
- 格式:是否符合要求的 schema 或风格?
- 安全:下一步有外部副作用的话,先请求确认。
</verification_loop>
<missing_context_gating>
- 缺少必要上下文时不要猜。
- 优先用工具查找;查不到时才问用户,且问题要最小化。
- 如果必须继续,明确标注假设并选择可逆操作。
</missing_context_gating>
```
代码部署、数据修改、对外发送——都该过这一关。别等到线上出了事故才想起来加自检,那时候成本是现在的 100 倍。
### 模块 6推理力度调优Reasoning Effort
这是整份指南最反直觉的部分。
原文说得很直白:**reasoning effort 是 last-mile knob最后一公里的旋钮不是 primary lever主杠杆。** 大多数团队应该默认在 `none`、`low` 或 `medium` 范围内。
reasoning effort 是 OpenAI API 的一个参数,控制模型在回答前"思考多久"——档位越高,延迟越大、成本越高,但不一定效果更好。
| 档位 | 适用场景 | 典型任务 |
|------|---------|---------|
| `none` | 执行型、延迟敏感 | 工作流步骤、字段提取、工单分类 |
| `low` | 需少量思考 | 复杂指令遵循 |
| `medium` | 需较强推理 | 长上下文综合、多文档审查 |
| `high` | 重度推理 | 冲突解决、策略撰写 |
| `xhigh` | 慎用,需 eval 验证 | 超长 Agent 任务 |
关键原则:**在加 reasoning effort 之前,先加模块 3工具持久化+ 模块 4完整性保障+ 模块 5验证循环。** 这三个 prompt 模块能解决大部分"模型不够认真"的问题。你以为是模型不够聪明,其实是你的 prompt 没告诉它什么叫"做完了"。
如果加了这三个模块还差点意思,再加:
```xml
<dig_deeper_nudge>
- 不要停在第一个看似合理的答案。
- 寻找二阶问题、边界情况和遗漏的约束。
- 如果任务涉及安全或准确性,至少做一次验证步骤。
</dig_deeper_nudge>
```
---
## 03 迁移策略:一次只改一个变量
从旧模型迁移到 GPT-5.4,最容易犯的错是同时换模型+改 prompt+调参数,出了问题根本不知道谁的锅。正确姿势:先换模型、保持 reasoning_effort 不变、跑 eval、一次只调一个参数。
| 你的现状 | GPT-5.4 起步建议 |
|---------|----------------|
| GPT-5.2 | 保持当前 reasoning effort |
| GPT-5.3-Codex | 保持当前 reasoning effort |
| GPT-4.1 / GPT-4o | 从 `none` 开始,不够再加 |
| 研究类 Agent | `medium``high` |
| 长流程 Agent | `medium``high` |
---
## 04 适用边界(别无脑照搬)
这些模块不只适用于 GPT-5.4。输出契约、验证循环、工具持久化都是模型无关的 prompt 工程模式。我在 Claude 的 system prompt 里也用了类似结构,效果同样显著。
但也不要照搬。
每个模型有不同的默认行为。GPT-5.4 在 agentic 工作流上比 4o 鲁棒很多,所以指南里的"验证一切"实际上比以前宽松了。Claude 的 tool use 和 GPT 的 tool use 语义有差异,复制模板没问题,但别跳过在你的场景下跑 eval。
还有一点可能让你不安reasoning effort 设成 `none`?真的够用?
指南说得很清楚GPT-5.4 在 `none` 下就能胜任 action-selection 和 tool-discipline 任务。只有任务涉及细微理解——隐含需求、歧义处理、取消的 tool call 恢复——才需要 `low``medium`
---
## 05 你的 Prompt 体检清单
这份指南传递了一个信号:**LLM 应用的质量瓶颈正在从"模型能力"转移到"prompt 结构"。**
6 个模块速查:
1. **输出契约** — 控制说什么、怎么说
2. **跟进策略** — 什么时候自主执行、什么时候问用户
3. **工具持久化** — 防止"差不多就行"
4. **完整性保障** — 防止"做到一半交差"
5. **验证循环** — 最终交付前自检
6. **推理力度** — 不是越高越好prompt 结构先行
现在就打开你的 system prompt对照这 6 个模块逐项查缺补漏。跑一遍 eval有效的保留没用的删掉——别堆模块够用就好。
> "Start with the smallest prompt that passes your evals, and add blocks only when they fix a measured failure mode."
> (从能通过 eval 的最小 prompt 开始,只在修复了实测到的失败模式时才加新模块。)
>
> — OpenAI GPT-5.4 Prompt Guide

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.7 MiB

View File

@ -0,0 +1,40 @@
---
type: metaphor
palette: warm
rendering: hand-drawn
text: title-only
mood: balanced
font: clean
aspect: "2.35:1"
language: zh
title: "先把这 6 个模块写对"
topic: GPT-5.4 Prompt Engineering 六个核心模块
---
# Cover Image Prompt
A wide panoramic hand-drawn illustration (2.35:1 aspect ratio) with warm orange and golden tones on a light cream/beige textured paper background.
## Visual Concept: Six Tuning Knobs
The central metaphor is **six hand-drawn circular knobs or dials** arranged in a gentle arc from left to right, each representing a prompt engineering module. The knobs have a vintage instrument/control panel aesthetic, like mixing board knobs or old radio dials.
Each knob has a small hand-lettered label beneath it:
1. 输出契约 (Output)
2. 跟进策略 (Follow)
3. 工具持久化 (Tools)
4. 完整性 (Complete)
5. 验证循环 (Verify)
6. 推理力度 (Reason)
The knobs are set to different positions — the first five are turned to high/optimal positions, while the sixth (推理力度) is deliberately turned low, reinforcing the article's core message: "don't crank up reasoning effort first, get the other modules right."
## Style Details
- **Hand-drawn aesthetic**: Slightly imperfect lines, pencil/ink sketch quality, warm paper texture visible
- **Color scheme**: Warm oranges (#E67E22), golden yellows (#F39C12), terracotta (#D35400), with cream background (#FDF6E3)
- **Decorative elements**: Small XML-like angle brackets `< >` scattered as decorative motifs around the knobs, subtle connection lines between knobs suggesting a system
- **Title text**: "先把这 6 个模块写对" in clean Chinese sans-serif font, positioned at the top or bottom with good contrast
- **Whitespace**: Generous breathing room, 40-50% of the image is clean space
- **NO realistic human figures** — only the knobs, labels, and decorative elements
- **Mood**: Warm, approachable, professional but not corporate — like a well-crafted notebook sketch

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.2 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.5 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.4 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.3 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.5 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.4 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.4 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 2.7 MiB

View File

@ -0,0 +1,51 @@
# Content Analysis: GPT-5.4 Prompt Patterns
## 基本信息
- **来源**: 03-review/2026-03-08-gpt-5-4-prompt-patterns.md
- **原文语言**: 中文
- **字数**: ~3500 字
- **类型**: 干货分享 / 技术解读
## 内容分类
- **主题**: GPT-5.4 官方 Prompt Engineering 指南拆解
- **角度**: 实操解读 — 把 4000+ 英文指南拆成 6 个可复用 XML 模块
- **价值主张**: 别急着调参数和加钱,先写对 prompt 结构
## 核心卖点
1. **6 个 XML 模块**可直接复用,复制即用
2. **反直觉洞察**reasoning effort 不是越高越好
3. **迁移策略**:一次只改一个变量
4. **跨模型通用**:不只适用于 GPT-5.4
## 目标受众
- AI 产品经理、AI 应用开发者
- 使用 LLM API 构建产品的技术人员
- 对 prompt engineering 有基础认知的从业者
## Hook 分析
- 标题自带钩子:"别急着加钱调参数"反常识
- "6 个模块写对"给出明确数字承诺
- 适合小红书"干货收藏"心理
## 6 个核心模块
1. 输出契约Output Contract— 控制格式和冗余
2. 跟进策略Follow-Through Policy— 自主执行 vs 请求确认
3. 工具持久化Tool Persistence— 防止提前停手
4. 完整性保障Completeness Contract— 防止半途交差
5. 验证循环Verification Loop— 交付前自检
6. 推理力度Reasoning Effort— 最后一公里旋钮
## 视觉机会
- 6 个模块天然适合做成 6 张内容卡
- reasoning effort 档位表适合对比图
- 迁移策略适合表格/流程图
- 体检清单适合 ending 卡
## 推荐图片数量
- 封面(1) + 6模块(6) + 结尾(1) = **8 张**
- 可压缩: 封面(1) + 模块两两合并(3) + 结尾(1) = **5 张**
## 收藏/分享潜力
- 收藏潜力: ★★★★★6 个模块 + 代码模板 = 强工具属性)
- 分享潜力: ★★★★☆(反直觉观点 + 实操价值)
- 评论潜力: ★★★☆☆(技术类讨论门槛较高)

View File

@ -0,0 +1,68 @@
---
strategy: a
name: Story-Driven故事驱动型
style: atomstorm
style_reason: "AtomStorm 深色科技风配合个人经历叙事,营造'技术大佬分享实战经验'的氛围"
elements:
background: dark-gradient
decorations: [neon-glow-lines, circuit-traces]
emphasis: glow-highlight
typography: highlight
layout: balanced
image_count: 8
---
## P1 Cover
**Type**: cover
**Hook**: "GPT-5.4 指南我读完了 别急着加钱调参数"
**Visual**: 深色背景 + 6 个发光旋钮图标排列 + 标题大字
**Layout**: sparse
## P2 Pain Point
**Type**: pain-point
**Hook**: "你是不是也这样?"
**Message**: reasoning effort 拉满、token 费用暴涨、效果反而不稳定的痛点共鸣
**Visual**: 费用账单示意 + 红色警告
**Layout**: balanced
## P3 Module 1-2
**Type**: content
**Hook**: "模块1输出契约 模块2跟进策略"
**Message**: 控制格式 + 控制自主权XML 模板展示
**Visual**: 两个模块卡片并排,代码高亮
**Layout**: comparison
## P4 Module 3-4
**Type**: content
**Hook**: "模块3工具持久化 模块4完整性保障"
**Message**: 防提前停手 + 防半途交差
**Visual**: 两个模块卡片并排
**Layout**: comparison
## P5 Module 5
**Type**: content
**Hook**: "模块5验证循环"
**Message**: 最终交付前强制自检流程
**Visual**: 验证检查清单 + 流程图
**Layout**: flow
## P6 Module 6
**Type**: content
**Hook**: "模块6推理力度 ≠ 越高越好"
**Message**: reasoning effort 5 档位对照表 + 反直觉洞察
**Visual**: 5 档仪表盘 + 档位表
**Layout**: comparison
## P7 Migration
**Type**: content
**Hook**: "迁移口诀:一次只改一个变量"
**Message**: 从旧模型迁移的起步建议表
**Visual**: 迁移策略表格
**Layout**: list
## P8 Ending
**Type**: ending
**Hook**: "6 模块体检清单 收藏备用"
**Message**: 速查清单 + CTA
**Visual**: 6 项检查清单 + 关注引导
**Layout**: sparse

View File

@ -0,0 +1,68 @@
---
strategy: b
name: Information-Dense信息密集型
style: atomstorm
style_reason: "AtomStorm 深色科技风搭配高密度信息卡,突出专业感和工具属性,最大化收藏动机"
elements:
background: dark-solid
decorations: [grid-pattern, neon-glow-lines]
emphasis: box-highlight
typography: code-block
layout: dense
image_count: 8
---
## P1 Cover
**Type**: cover
**Hook**: "GPT-5.4 Prompt 指南精华 6个模块复制即用"
**Visual**: 深色背景 + 6 个模块图标网格 + 大标题 + "复制即用"标签
**Layout**: sparse
## P2 Overview
**Type**: content
**Hook**: "GPT-5.4 变了什么4个关键变化"
**Message**: 可控性提升人格漂移少、Agentic鲁棒、并行tool calling、长上下文+ 4 个短板表格
**Visual**: 变化对照表,左边优点右边短板
**Layout**: comparison
## P3 Module 1: Output Contract
**Type**: content
**Hook**: "模块1 输出契约 控制说什么怎么说"
**Message**: XML 代码模板 + 使用场景说明
**Visual**: 代码块高亮 + 右侧注释
**Layout**: dense
## P4 Module 2-3
**Type**: content
**Hook**: "模块2 跟进策略 + 模块3 工具持久化"
**Message**: 两个模块的核心 XML + 一句话总结
**Visual**: 上下两栏代码展示
**Layout**: dense
## P5 Module 4-5
**Type**: content
**Hook**: "模块4 完整性保障 + 模块5 验证循环"
**Message**: 两个模块核心逻辑 + 关键规则
**Visual**: 上下两栏,关键词高亮
**Layout**: dense
## P6 Module 6: Reasoning Effort
**Type**: content
**Hook**: "模块6 推理力度 关键反直觉"
**Message**: 5 档位表 + "先加模块3/4/5最后才加 reasoning effort"
**Visual**: 仪表盘 + 档位表 + 红色警告框
**Layout**: comparison
## P7 Migration Table
**Type**: content
**Hook**: "迁移策略 一次只改一个变量"
**Message**: 现状→建议起步表 + 通用性说明
**Visual**: 迁移对照表
**Layout**: list
## P8 Ending
**Type**: ending
**Hook**: "Prompt 体检清单 逐项查缺补漏"
**Message**: 6 模块速查 + "从最小 prompt 开始" + 关注 CTA
**Visual**: 编号清单 + 底部引用
**Layout**: balanced

View File

@ -0,0 +1,61 @@
---
strategy: c
name: Visual-First视觉优先型
style: atomstorm
style_reason: "AtomStorm 霓虹科技美学为主导,每页一个大视觉隐喻,文字精简到极致"
elements:
background: dark-gradient
decorations: [neon-glow-lines, glassmorphism-panels]
emphasis: glow-highlight
typography: display
layout: sparse
image_count: 8
---
## P1 Cover
**Type**: cover
**Hook**: "6 个 Prompt 模块 > 调高参数"
**Visual**: 深色背景 + 6 个霓虹发光的乐高积木堆叠 + 大标题
**Layout**: sparse
## P2 The Insight
**Type**: content
**Hook**: "reasoning effort ≠ 越高越好"
**Visual**: 一个巨大的发光旋钮从 MAX 拨回到 LOW配合简短文字
**Layout**: sparse
## P3 Module 1-2
**Type**: content
**Hook**: "输出契约 + 跟进策略"
**Visual**: 两个发光图标(格式约束 + 自主决策),极简文字
**Layout**: sparse
## P4 Module 3-4
**Type**: content
**Hook**: "工具持久化 + 完整性保障"
**Visual**: 两个发光图标(持续调用 + 检查清单),极简文字
**Layout**: sparse
## P5 Module 5
**Type**: content
**Hook**: "验证循环"
**Visual**: 一个发光的循环箭头图标 + 4 个检查点
**Layout**: flow
## P6 Module 6
**Type**: content
**Hook**: "推理力度是最后一公里"
**Visual**: 5 个仪表盘从暗到亮排列,标注档位
**Layout**: sparse
## P7 The Rule
**Type**: content
**Hook**: "先写对结构 再调参数"
**Visual**: 天平隐喻——左边是 6 个模块(重),右边是 reasoning effort 旋钮(轻)
**Layout**: sparse
## P8 Ending
**Type**: ending
**Hook**: "你的 Prompt 缺了哪块?"
**Visual**: 6 个模块 checklist + 关注 CTA
**Layout**: sparse

View File

@ -0,0 +1,68 @@
---
strategy: b
name: Information-Dense信息密集型
style: atomstorm
style_reason: "AtomStorm 深色科技风搭配高密度信息卡,突出专业感和工具属性,最大化收藏动机"
elements:
background: dark-solid
decorations: [grid-pattern, neon-glow-lines]
emphasis: box-highlight
typography: code-block
layout: dense
image_count: 8
---
## P1 Cover
**Type**: cover
**Hook**: "GPT-5.4 Prompt 指南精华 6个模块复制即用"
**Visual**: 深色背景 + 6 个模块图标网格 + 大标题 + "复制即用"标签
**Layout**: sparse
## P2 Overview
**Type**: content
**Hook**: "GPT-5.4 变了什么4个关键变化"
**Message**: 可控性提升人格漂移少、Agentic鲁棒、并行tool calling、长上下文+ 4 个短板表格
**Visual**: 变化对照表,左边优点右边短板
**Layout**: comparison
## P3 Module 1: Output Contract
**Type**: content
**Hook**: "模块1 输出契约 控制说什么怎么说"
**Message**: XML 代码模板 + 使用场景说明
**Visual**: 代码块高亮 + 右侧注释
**Layout**: dense
## P4 Module 2-3
**Type**: content
**Hook**: "模块2 跟进策略 + 模块3 工具持久化"
**Message**: 两个模块的核心 XML + 一句话总结
**Visual**: 上下两栏代码展示
**Layout**: dense
## P5 Module 4-5
**Type**: content
**Hook**: "模块4 完整性保障 + 模块5 验证循环"
**Message**: 两个模块核心逻辑 + 关键规则
**Visual**: 上下两栏,关键词高亮
**Layout**: dense
## P6 Module 6: Reasoning Effort
**Type**: content
**Hook**: "模块6 推理力度 关键反直觉"
**Message**: 5 档位表 + "先加模块3/4/5最后才加 reasoning effort"
**Visual**: 仪表盘 + 档位表 + 红色警告框
**Layout**: comparison
## P7 Migration Table
**Type**: content
**Hook**: "迁移策略 一次只改一个变量"
**Message**: 现状→建议起步表 + 通用性说明
**Visual**: 迁移对照表
**Layout**: list
## P8 Ending
**Type**: ending
**Hook**: "Prompt 体检清单 逐项查缺补漏"
**Message**: 6 模块速查 + "从最小 prompt 开始" + 关注 CTA
**Visual**: 编号清单 + 底部引用
**Layout**: balanced

View File

@ -0,0 +1,29 @@
# P1 Cover — GPT-5.4 Prompt Patterns
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu.
## Visual Design
**Background**: Dark solid background (#1a1a24) with subtle grid pattern overlay in very faint blue (#326DF5 at 5% opacity).
**Title area** (top 30%):
- Main title in large bold white text: "GPT-5.4 Prompt 指南精华"
- Subtitle in cyan (#00e5ff) text: "6个模块 复制即用"
**Center content** (middle 50%):
- 6 module icons arranged in a 2×3 grid
- Each icon is a rounded rectangle with dark background (#2a2a38) and thin glowing border (#326DF5)
- Each card has a small icon and Chinese label:
1. 📋 输出契约
2. 🔄 跟进策略
3. 🔧 工具持久化
4. ✅ 完整性保障
5. 🔍 验证循环
6. ⚙️ 推理力度
- The 6th card (推理力度) has a different accent color (purple #6e00ff) to stand out
**Bottom area** (bottom 20%):
- Tag line: "别急着加钱调参数" in muted gray text
- Small watermark "栗子KK" at bottom-right in semi-transparent white
**Style**: Dark tech aesthetic with neon blue (#326DF5) and cyan (#00e5ff) accents. Clean, professional, high contrast. NO realistic human figures.

View File

@ -0,0 +1,29 @@
# P2 Overview — GPT-5.4 变了什么
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style with AtomStorm brand colors.
## Visual Design
**Background**: Dark solid (#1a1a24) with subtle grid pattern.
**Title** (top): "GPT-5.4 变了什么?" in bold white text, with "4个关键变化 + 4个短板" in cyan (#00e5ff) as subtitle.
**Left column — 优势** (with green/cyan accents):
- ✅ 人格漂移更少
- ✅ Agentic 工作流更鲁棒
- ✅ 并行 Tool Calling 更准
- ✅ 长上下文分析增强
**Right column — 短板** (with orange/red accents):
- ⚠️ 低上下文 Tool Routing → 加前置检查
- ⚠️ 依赖感知弱 → 加 dependency_checks
- ⚠️ Reasoning Effort 误区 → 按任务选档
- ⚠️ 不可逆操作 → 加 verification_loop
Each item should be in a rounded rectangle card with dark background (#2a2a38) and thin border. Left column border cyan, right column border orange.
**Bottom highlight box**: "解法不是换模型,是写更好的 Prompt 结构" in a glassmorphism panel with blue glow.
**Watermark**: "栗子KK" bottom-right, semi-transparent.
**Style**: Dark tech, high contrast, clean layout. NO realistic human figures.

View File

@ -0,0 +1,42 @@
# P3 Module 1 — 输出契约 Output Contract
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style.
## Visual Design
**Background**: Dark solid (#1a1a24) with faint circuit trace decorations.
**Header** (top 15%):
- Module number "01" in large cyan (#00e5ff) display font
- Title "输出契约" in bold white, subtitle "Output Contract" in muted gray
- One-line summary: "控制模型说什么、怎么说、说多少" in small cyan text
**Main content** (middle 65%):
A code block area styled like a dark IDE/terminal:
- Background slightly lighter (#1e1e2e) with rounded corners
- Monospace font, syntax-highlighted XML:
```
<output_contract>
· 只返回要求的章节
· 按要求的顺序排列
· 长度限制只应用于指定章节
· 格式要求严格遵循
</output_contract>
<verbosity_controls>
· 简洁、信息密集
· 不要重复用户请求
· 进度更新保持简短
</verbosity_controls>
```
XML tags in cyan (#00e5ff), content in white, bullets in blue (#326DF5).
**Bottom callout** (bottom 20%):
A highlighted box: "几乎所有 production prompt 都该加" with a small rocket icon.
Text in white on a dark card with blue border glow.
**Watermark**: "栗子KK" bottom-right.
**Style**: Dark tech, code-focused. NO realistic human figures.

View File

@ -0,0 +1,32 @@
# P4 Module 2-3 — 跟进策略 + 工具持久化
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style.
## Visual Design
**Background**: Dark solid (#1a1a24) with subtle grid.
**Upper half — Module 2: 跟进策略**
- Number "02" in cyan (#00e5ff), title "跟进策略" in bold white
- Subtitle: "Follow-Through Policy" in gray
- Key rules in a dark card (#2a2a38) with blue border:
· 意图清晰 + 可逆低风险 → 直接执行
· 不可逆操作 → 必须确认
· 有外部副作用 → 必须确认
· 缺敏感信息 → 才问用户
- One-liner highlight: "解决 Agent 每步都来问你的问题" in cyan
**Divider**: Thin horizontal line with gradient blue-to-purple
**Lower half — Module 3: 工具持久化**
- Number "03" in cyan, title "工具持久化" in bold white
- Subtitle: "Tool Persistence" in gray
- Key rules in a dark card with blue border:
· 工具能提升质量 → 就调用
· 不要提前停止
· 持续调用直到任务完成 + 验证通过
· 空结果 → 换策略重试
- One-liner: "搜不到 ≠ 不存在" in orange (#FF8C00) highlight
**Watermark**: "栗子KK" bottom-right.
**Style**: Dark tech. NO realistic human figures.

View File

@ -0,0 +1,32 @@
# P5 Module 4-5 — 完整性保障 + 验证循环
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style.
## Visual Design
**Background**: Dark solid (#1a1a24) with faint circuit traces.
**Upper half — Module 4: 完整性保障**
- Number "04" in cyan (#00e5ff), title "完整性保障" in bold white
- Subtitle: "Completeness Contract" in gray
- Key rules in dark card (#2a2a38) with blue border:
· 所有项覆盖或标记 [blocked] 前 → 任务未完成
· 维护内部交付物检查清单
· 跟踪已处理项,确认覆盖率
· 空结果恢复:至少尝试 1-2 个回退策略
- Highlight box: "搜不到≠不存在,可能是搜的方式不对" in yellow/orange
**Divider**: Thin gradient line blue→purple
**Lower half — Module 5: 验证循环**
- Number "05" in cyan, title "验证循环" in bold white
- Subtitle: "Verification Loop" in gray
- 4-step checklist with glowing checkboxes:
☑ 正确性:满足每一个要求?
☑ 可验证性:有上下文支撑?
☑ 格式:符合 schema
☑ 安全:有副作用先确认
- Highlight: "代码部署、数据修改、对外发送——都该过这关" in red/orange
**Watermark**: "栗子KK" bottom-right.
**Style**: Dark tech. NO realistic human figures.

View File

@ -0,0 +1,32 @@
# P6 Module 6 — 推理力度 Reasoning Effort
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style.
## Visual Design
**Background**: Dark solid (#1a1a24).
**Header**:
- Number "06" in purple (#6e00ff), title "推理力度" in bold white
- Subtitle: "Reasoning Effort" in gray
- Key insight in large text: "不是越高越好" with "越高越好" crossed out in red
**Center — 5-tier gauge/table**:
A vertical list of 5 tiers, each as a horizontal bar with different widths and colors:
| 档位 | 宽度 | 颜色 | 适用场景 |
|------|------|------|---------|
| none | 20% | 绿色 | 工作流步骤、字段提取 |
| low | 35% | 青色 | 复杂指令遵循 |
| medium | 55% | 蓝色 | 多文档审查 |
| high | 75% | 紫色 | 冲突解决、策略撰写 |
| xhigh | 95% | 红色 | 慎用!需 eval 验证 |
Each bar has the tier name on the left, use case on the right.
**Bottom callout** in a red-bordered warning box:
"先加模块 3+4+5最后才调这个旋钮"
"你以为模型不够聪明,其实是 Prompt 没说清什么叫'做完了'"
**Watermark**: "栗子KK" bottom-right.
**Style**: Dark tech with purple accent for this module. NO realistic human figures.

View File

@ -0,0 +1,34 @@
# P7 Migration — 迁移策略
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style.
## Visual Design
**Background**: Dark solid (#1a1a24) with subtle grid.
**Header**:
- Title "迁移策略" in bold white
- Subtitle "一次只改一个变量" in cyan (#00e5ff)
**Center — Migration Table**:
A clean table with dark card rows (#2a2a38), blue borders:
| 你的现状 | GPT-5.4 起步建议 |
|---------|----------------|
| GPT-5.2 | 保持当前 reasoning effort |
| GPT-5.3-Codex | 保持当前 reasoning effort |
| GPT-4.1 / GPT-4o | 从 none 开始 |
| 研究类 Agent | medium 或 high |
| 长流程 Agent | medium 或 high |
Left column in white, right column in cyan. Row backgrounds alternating slightly.
**Below table — Flow diagram**:
3 steps connected by arrows:
Step 1: "换模型" → Step 2: "跑 Eval" → Step 3: "一次调一个参数"
Each step in a rounded rect with blue glow.
**Bottom note**: "出了问题才知道是谁的锅" in muted orange text
**Watermark**: "栗子KK" bottom-right.
**Style**: Dark tech. NO realistic human figures.

View File

@ -0,0 +1,37 @@
# P8 Ending — Prompt 体检清单
A vertical infographic card (3:4 aspect ratio) for Xiaohongshu. Dark tech style.
## Visual Design
**Background**: Dark solid (#1a1a24) with subtle glow effect from center.
**Header**:
- Title "你的 Prompt 体检清单" in bold white
- Subtitle "逐项查缺补漏" in cyan (#00e5ff)
**Center — Checklist** (6 items):
Each item is a row with a glowing checkbox icon and text:
☐ 1. 输出契约 — 控制说什么、怎么说
☐ 2. 跟进策略 — 何时自主执行、何时问用户
☐ 3. 工具持久化 — 防止"差不多就行"
☐ 4. 完整性保障 — 防止"做到一半交差"
☐ 5. 验证循环 — 最终交付前自检
☐ 6. 推理力度 — 不是越高越好,结构先行
Each checkbox has a blue (#326DF5) glow. The 6th item checkbox is purple (#6e00ff).
**Quote area** (below checklist):
A glassmorphism panel with the quote:
"从能通过 eval 的最小 prompt 开始,只在修复了实测到的失败模式时才加新模块。"
— OpenAI GPT-5.4 Prompt Guide
Quote text in italic light gray.
**CTA area** (bottom):
"打开你的 system prompt 对照检查"
"关注 栗子KK 获取更多 AI 实战干货"
CTA in cyan text, slightly larger.
**Watermark**: "栗子KK" bottom-right.
**Style**: Dark tech, conclusive feel. NO realistic human figures.