Prompt Architecture

Advanced prompt design for agent tasks: structured outputs, tool-use grammar, and few-shot strategies.

Prerequisites

Course: Agent Engineering Fundamentals
Experience writing prompts for LLMs
Nexus CLI and SDK familiarity

Learning outcomes

Design effective system prompts for agent tasks
Implement tool-use grammar in agent prompts
Build dynamic prompt pipelines for complex workflows
Evaluate and iterate on prompt performance
Version and manage prompts in production

Course modules

Module 1

Prompt Design Principles

Structure, clarity, and specificity in agent prompts. System prompts vs. task prompts. The role of examples and formatting.

Duration: 45 min

Module 2

Tool-Use Grammar

Designing tool call formats, output parsers, and error handling in prompts. Teaching agents to use tools through few-shot examples.

Duration: 60 min

Module 3

Advanced Strategies

Dynamic prompt construction, multi-turn reasoning, self-correction prompts, and prompt versioning for production systems.

Duration: 45 min

Module 1: System Prompt Design (Detailed)

The system prompt is the foundational instruction set that defines an agent's behavior, capabilities, and constraints. Unlike single-turn LLM prompts, agent system prompts must handle dynamic context, tool integration, and multi-step reasoning.

System prompt anatomy. A well-structured agent system prompt contains: identity and purpose (who the agent is and what it does), capability declaration (what tools and knowledge are available), behavioral guidelines (how the agent should approach tasks), constraints (boundaries the agent must not cross), output format specification (how results should be presented), and error handling procedures (what to do when things go wrong).

Example system prompt structure:

You are a code architect agent for the Nexus platform.
Your purpose: analyze, design, and refactor codebases.

Capabilities:
- search_files: find files by pattern and content
- read_file: read file contents
- write_file: create or modify files
- run_tests: execute test suites

Behavioral guidelines:
1. Always read existing code before making changes
2. Prefer minimal, focused changes over large rewrites
3. Follow the existing code style in each module
4. Write or update tests for every change

Constraints:
- Only modify files under src/ and tests/
- Never modify package-lock.json or yarn.lock
- Do not install new dependencies without approval
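The same structure can be assembled from data rather than written by hand, which keeps section order consistent across agents. The following is a minimal sketch, not a Nexus API: the SystemPromptSpec class and render_system_prompt function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SystemPromptSpec:
    """Hypothetical container for the sections of an agent system prompt."""
    identity: str
    purpose: str
    capabilities: dict[str, str] = field(default_factory=dict)  # tool name -> description
    guidelines: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)


def render_system_prompt(spec: SystemPromptSpec) -> str:
    """Render the sections in a fixed order so every prompt has the same layout."""
    lines = [spec.identity, f"Your purpose: {spec.purpose}", "", "Capabilities:"]
    lines += [f"- {name}: {desc}" for name, desc in spec.capabilities.items()]
    lines += ["", "Behavioral guidelines:"]
    lines += [f"{i}. {rule}" for i, rule in enumerate(spec.guidelines, start=1)]
    lines += ["", "Constraints:"]
    lines += [f"- {rule}" for rule in spec.constraints]
    return "\n".join(lines)


spec = SystemPromptSpec(
    identity="You are a code architect agent for the Nexus platform.",
    purpose="analyze, design, and refactor codebases.",
    capabilities={"read_file": "read file contents", "run_tests": "execute test suites"},
    guidelines=["Always read existing code before making changes"],
    constraints=["Only modify files under src/ and tests/"],
)
prompt = render_system_prompt(spec)
print(prompt)
```

Keeping the sections as structured data also makes it easier to diff two prompt versions section by section.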

Prompt engineering principles for agents. Be specific about tool selection criteria: "Use search_files to find files by name pattern or content, and read_file to view a file's contents." Define fallback behaviors: "If a tool call fails, retry once with adjusted parameters, then try an alternative approach, then report the failure." Set quality standards: "Before writing any code, verify your understanding of the existing architecture by reading at least 3 related files."
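The fallback rule quoted above (retry once with adjusted parameters, then try an alternative, then report) can also be enforced in the code that wraps tool calls. A minimal sketch under stated assumptions: ToolError, call_with_fallback, and the flaky_search stand-in are hypothetical and not part of any Nexus SDK.

```python
from typing import Callable, Optional


class ToolError(Exception):
    """Raised by a (hypothetical) tool wrapper when a call fails."""


def call_with_fallback(
    primary: Callable[[dict], str],
    params: dict,
    adjust: Callable[[dict], dict],
    alternative: Optional[Callable[[dict], str]] = None,
) -> dict:
    """Try primary, retry once with adjusted params, then the alternative, then report."""
    errors = []
    # Attempt 1 uses the original parameters, attempt 2 the adjusted ones.
    for attempt_params in (params, adjust(params)):
        try:
            return {"ok": True, "result": primary(attempt_params), "errors": errors}
        except ToolError as exc:
            errors.append(str(exc))
    # Attempt 3: a different tool, if one was supplied.
    if alternative is not None:
        try:
            return {"ok": True, "result": alternative(params), "errors": errors}
        except ToolError as exc:
            errors.append(str(exc))
    # Nothing worked: report the failure instead of raising.
    return {"ok": False, "result": None, "errors": errors}


def flaky_search(p: dict) -> str:
    """Stand-in tool that only accepts glob-style patterns."""
    if p.get("pattern", "").startswith("**/"):
        return f"found files for {p['pattern']}"
    raise ToolError(f"bad pattern: {p.get('pattern')}")


outcome = call_with_fallback(
    flaky_search,
    {"pattern": "users.ts"},
    adjust=lambda p: {**p, "pattern": "**/" + p["pattern"]},
)
print(outcome)
```

Returning the collected errors alongside the result lets the agent mention earlier failures in its final report even when a later attempt succeeds.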

Module 2: Context Window Management (Detailed)

The context window is the agent's working memory during a task. Efficient context management is critical because context overflow causes agents to forget earlier information. The default context window is 128K tokens, but effective utilization depends on prompt structure.

Context budgeting. A well-managed allocation for a 128K window: system prompt (~4K tokens, 3%), task description (~2K tokens, 1.5%), tool documentation (~8K tokens, 6%), conversation history (~60K tokens, 47%), retrieved memories (~20K tokens, 16%), tool outputs (~30K tokens, 23%), and scratchpad (~4K tokens, 3%). These budgets add up to the full 128K window, so headroom for unexpected context needs has to come from keeping individual categories below their budgets.
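The percentages in this budget come from dividing each allocation by the window size. A minimal sketch of that arithmetic, assuming 128K means 128,000 tokens (with 131,072 tokens the unallocated remainder would be 3,072 tokens instead of 0); the budget_report helper is a hypothetical name.

```python
CONTEXT_WINDOW = 128_000  # tokens; assumes "128K" means 128,000

# Budgets as listed in the text, in tokens.
BUDGET = {
    "system prompt": 4_000,
    "task description": 2_000,
    "tool documentation": 8_000,
    "conversation history": 60_000,
    "retrieved memories": 20_000,
    "tool outputs": 30_000,
    "scratchpad": 4_000,
}


def budget_report(budget: dict[str, int], window: int) -> tuple[dict[str, float], int]:
    """Return each category's share of the window (in percent) and the unallocated tokens."""
    shares = {name: round(100 * tokens / window, 1) for name, tokens in budget.items()}
    headroom = window - sum(budget.values())
    return shares, headroom


shares, headroom = budget_report(BUDGET, CONTEXT_WINDOW)
for name, pct in shares.items():
    print(f"{name:22s} {BUDGET[name]:>7,d} tokens  {pct:5.1f}%")
print(f"unallocated: {headroom:,d} tokens")
```

Running a check like this whenever a budget changes makes it obvious when the categories no longer fit the window.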

Context compression techniques. Summarization: compress long tool outputs into key findings before storing in conversation history. Truncation: for very long outputs, keep only the first and last sections (where key information typically appears). Deduplication: if the same tool output appears twice, keep only the first instance. Prioritization: mark critical information as "pin" to prevent it from being compressed or truncated. These techniques can reduce context usage by 40-60% without losing essential information.
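Three of these techniques (truncation, deduplication, and pinning) are mechanical enough to sketch directly. The example below is an illustration under assumptions, not how Nexus implements compression: the entry layout with "tool", "output", and "pin" keys is hypothetical, and summarization is left out because it needs a model call.

```python
def truncate_middle(text: str, head: int = 20, tail: int = 20) -> str:
    """Keep the first `head` and last `tail` lines, noting how many were dropped."""
    lines = text.splitlines()
    if len(lines) <= head + tail:
        return text
    dropped = len(lines) - head - tail
    marker = f"[... {dropped} lines truncated ...]"
    return "\n".join(lines[:head] + [marker] + lines[len(lines) - tail:])


def compress_history(entries: list[dict], head: int = 20, tail: int = 20) -> list[dict]:
    """Drop repeated unpinned outputs (first instance wins) and truncate unpinned ones."""
    seen = set()
    result = []
    for entry in entries:
        key = (entry["tool"], entry["output"])
        if key in seen and not entry.get("pin"):
            continue  # deduplication: skip a repeated, unpinned output
        seen.add(key)
        if entry.get("pin"):
            result.append(entry)  # prioritization: pinned entries pass through untouched
        else:
            result.append({**entry, "output": truncate_middle(entry["output"], head, tail)})
    return result


long_output = "\n".join(f"line {i}" for i in range(100))
history = [
    {"tool": "run_tests", "output": long_output},
    {"tool": "run_tests", "output": long_output},  # exact duplicate of the first entry
    {"tool": "read_file", "output": "Only modify files under src/ and tests/", "pin": True},
]
compressed = compress_history(history, head=2, tail=2)
print(compressed[0]["output"])
```

The truncation marker records how many lines were removed, so the agent can tell the output is partial and re-run the tool if it needs the middle section.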

Module 3: Few-Shot and Structured Output (Detailed)

Few-shot examples teach agents through demonstration rather than instruction. For agents, few-shot examples are most effective for: tool selection (showing which tool to use for which task), output formatting (showing the expected result structure), reasoning patterns (showing how to decompose specific task types), and error recovery (showing how to handle common failure modes).

Example tool-selection few-shot:

User: Fix the failing test in tests/api/users.test.ts
Agent: I need to understand the test and the implementation.
1. read_file("tests/api/users.test.ts") - understand the test
2. read_file("src/api/users.ts") - understand the implementation
3. run_tests("tests/api/users.test.ts") - reproduce the failure
4. [analyze and fix]
5. run_tests("tests/api/users.test.ts") - verify the fix
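Demonstrations like this one are usually stored as data and rendered into the prompt ahead of the new request. A minimal sketch, assuming the User/Agent layout shown above; ToolSelectionExample, format_example, and build_few_shot_prompt are hypothetical names, and the second request is invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolSelectionExample:
    """One demonstration: a user request, the agent's stated intent, and its tool plan."""
    request: str
    intent: str
    steps: list[str]


def format_example(example: ToolSelectionExample) -> str:
    """Render one demonstration in the User/Agent layout with numbered steps."""
    numbered = [f"{i}. {step}" for i, step in enumerate(example.steps, start=1)]
    return "\n".join([f"User: {example.request}", f"Agent: {example.intent}", *numbered])


def build_few_shot_prompt(examples: list[ToolSelectionExample], new_request: str) -> str:
    """Join the demonstrations, then leave the final Agent turn open for the model."""
    blocks = [format_example(e) for e in examples]
    blocks.append(f"User: {new_request}\nAgent:")
    return "\n\n".join(blocks)


fix_test = ToolSelectionExample(
    request="Fix the failing test in tests/api/users.test.ts",
    intent="I need to understand the test and the implementation.",
    steps=[
        'read_file("tests/api/users.test.ts") - understand the test',
        'read_file("src/api/users.ts") - understand the implementation',
        'run_tests("tests/api/users.test.ts") - reproduce the failure',
        "[analyze and fix]",
        'run_tests("tests/api/users.test.ts") - verify the fix',
    ],
)
few_shot = build_few_shot_prompt([fix_test], "Fix the failing test in tests/api/orders.test.ts")
print(few_shot)
```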

Structured output patterns. Use JSON output for programmatic consumption of agent results. The agent can output structured plans, analysis results, and change summaries. Configure in nexus.yaml:

output:
  format: json
  schema: ./agent-output-schema.json

Structured output reduces parsing errors and enables automated downstream processing.
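On the consuming side, downstream code should still check the agent's JSON before acting on it. The sketch below uses only the standard library and borrows the code review fields named in Exercise 3; the exact field names, the severity values, and the parse_review_output function are assumptions made for illustration, not a documented Nexus schema.

```python
import json

# Hypothetical field names and types for one code review finding.
REQUIRED_FIELDS = {
    "file": str,
    "issue_type": str,
    "severity": str,
    "description": str,
    "suggested_fix": str,
}
ALLOWED_SEVERITIES = {"low", "medium", "high"}  # assumed scale


def parse_review_output(raw: str) -> list[dict]:
    """Parse the agent's JSON output and reject findings that do not match the expected shape."""
    findings = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(findings, list):
        raise ValueError("expected a JSON array of findings")
    for index, finding in enumerate(findings):
        if not isinstance(finding, dict):
            raise ValueError(f"finding {index}: expected an object")
        for name, expected_type in REQUIRED_FIELDS.items():
            if not isinstance(finding.get(name), expected_type):
                raise ValueError(f"finding {index}: '{name}' is missing or not a {expected_type.__name__}")
        if finding["severity"] not in ALLOWED_SEVERITIES:
            raise ValueError(f"finding {index}: unknown severity {finding['severity']!r}")
    return findings


raw_output = json.dumps([
    {
        "file": "src/api/users.ts",
        "issue_type": "security",
        "severity": "high",
        "description": "User input is concatenated into a SQL query.",
        "suggested_fix": "Use a parameterized query.",
    }
])
findings = parse_review_output(raw_output)
print(findings[0]["file"], findings[0]["severity"])
```

Failing loudly on a malformed finding is a deliberate choice here: silently skipping it would hide exactly the parsing errors that structured output is meant to surface.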

Exercises

Exercise 1: Write a system prompt for a code review agent. Include instructions for checking correctness, style, security, and performance. Test it by having the agent review a PR with known issues.

Exercise 2: Create a task that requires reading 20+ files (enough to overflow a 32K context). Observe how the agent manages context. Does it summarize? Does it use memory?

Exercise 3: Design a structured output schema for agent code review results. Include fields for file, issue type, severity, description, and suggested fix. Test with a real review.

Module 4: Prompt Versioning and Testing (Detailed)

Agent prompts evolve over time as you discover edge cases and improve instructions. Prompt versioning ensures you can track changes and roll back if needed. Nexus stores prompt history in the agent configuration. Each version is timestamped and associated with a deployment.

Prompt testing methodology. Before deploying a prompt change: (1) Run the evaluation suite with the new prompt and compare against the baseline. Key metrics: task completion rate, tool call accuracy, plan validity, and number of clarification requests. (2) Manually review the agent output for 5-10 representative tasks. Look for regressions in output quality or unexpected behavior changes. (3) Deploy to a canary agent (5% of traffic) and monitor for 30 minutes. If no regressions are detected, roll out to 100%. The Nexus platform supports this workflow natively:

nexus prompt diff --agent my-agent --version v1 --version v2
nexus prompt rollback --agent my-agent --version v1
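The comparison against the baseline in step (1), and the promote-or-rollback decision after the canary in step (3), can be expressed as a small gate. This is a sketch under assumptions: the metric keys, the 2% relative tolerance, and the function names are hypothetical, and the numbers are made up for the example.

```python
HIGHER_IS_BETTER = ("task_completion_rate", "tool_call_accuracy", "plan_validity")
LOWER_IS_BETTER = ("clarification_requests",)


def find_regressions(baseline: dict, candidate: dict, tolerance: float = 0.02) -> list[str]:
    """List metrics where the candidate is worse than baseline by more than the relative tolerance."""
    regressions = []
    for name in HIGHER_IS_BETTER:
        if candidate[name] < baseline[name] * (1 - tolerance):
            regressions.append(name)
    for name in LOWER_IS_BETTER:
        if candidate[name] > baseline[name] * (1 + tolerance):
            regressions.append(name)
    return regressions


def rollout_decision(regressions: list[str]) -> str:
    """Promote the new prompt only when no metric regressed."""
    return "rollback" if regressions else "promote"


baseline_metrics = {
    "task_completion_rate": 0.90,
    "tool_call_accuracy": 0.95,
    "plan_validity": 0.88,
    "clarification_requests": 1.2,  # average per task
}
canary_metrics = {
    "task_completion_rate": 0.91,
    "tool_call_accuracy": 0.90,
    "plan_validity": 0.88,
    "clarification_requests": 1.2,
}
regressions = find_regressions(baseline_metrics, canary_metrics)
print(regressions, rollout_decision(regressions))
```

A tolerance band avoids rolling back on noise, but its width should be chosen from the observed run-to-run variance of the evaluation suite rather than fixed in advance.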

Common prompt anti-patterns. Over-specification: too many rules make the agent inflexible. Solution: prioritize rules by importance and keep the top 10. Contradictory instructions: conflicting rules confuse the agent. Solution: test prompts with a contradiction detection tool. Vague directives: "be careful" is not actionable. Solution: replace with specific behaviors: "always run tests after making changes." Missing fallbacks: the agent doesn't know what to do when a tool fails. Solution: always specify fallback behavior for each tool.

Module 5: Advanced Techniques (Detailed)

Dynamic prompt injection. Inject task-specific context into the system prompt at runtime. Use prompt templates with variables:

You are reviewing {{repo_name}} pull request #{{pr_number}}.
Focus areas: {{focus_areas}}.
The project uses {{tech_stack}}.
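Rendering such a template takes only a few lines. The sketch below is not the Nexus template engine; render_template is a hypothetical helper and the variable values are placeholders. It fails on a missing variable instead of sending a prompt with an unfilled {{placeholder}} to the agent.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_template(template: str, variables: dict[str, object]) -> str:
    """Substitute every {{name}} placeholder, refusing to render if any variable is missing."""
    missing = sorted(set(PLACEHOLDER.findall(template)) - set(variables))
    if missing:
        raise KeyError(f"missing template variables: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: str(variables[match.group(1)]), template)


TEMPLATE = (
    "You are reviewing {{repo_name}} pull request #{{pr_number}}.\n"
    "Focus areas: {{focus_areas}}.\n"
    "The project uses {{tech_stack}}."
)

rendered = render_template(
    TEMPLATE,
    {
        "repo_name": "example-repo",  # placeholder values for illustration
        "pr_number": 42,
        "focus_areas": "security, performance",
        "tech_stack": "TypeScript and PostgreSQL",
    },
)
print(rendered)
```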

Multi-turn prompt refinement. Use the agent's own output to refine its prompt: after each task, the agent analyzes its performance and suggests prompt improvements. Repeated over many tasks, this creates a cycle of continuous improvement. Enable it by setting auto_refine: true under the prompt key in nexus.yaml.