Codex
AGENTS.md Configuration Principles
Core Principle: Keep it Short and Precise
- Aim for about 100 lines; treat 300 lines as a hard cap.
- AGENTS.md should only include information Codex might overlook.
- Do not include content that can be inferred from the code (e.g., that an `export default function` is a React component; Codex already knows this).
- Too many rules? Split into subdirectory AGENTS.md files, loaded by scope.
- Key rules should be tagged or numbered to prevent being overlooked.
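The length limits above are easy to check mechanically. The following is a minimal sketch, assuming the 100-line target and 300-line cap from this section; the function names and the `ok`/`warn`/`over` labels are illustrative, not part of Codex.

```python
from pathlib import Path

SOFT_LIMIT = 100  # target length from the guideline above
HARD_LIMIT = 300  # hard cap from the guideline above


def classify_length(line_count: int) -> str:
    """Return 'ok', 'warn', or 'over' for a given AGENTS.md line count."""
    if line_count > HARD_LIMIT:
        return "over"
    if line_count > SOFT_LIMIT:
        return "warn"
    return "ok"


def check_agents_file(path: Path) -> str:
    """Count the lines in one AGENTS.md file and classify its length."""
    text = path.read_text(encoding="utf-8")
    return classify_length(len(text.splitlines()))
```

A check like this could run in a pre-commit hook, with `warn` as a prompt to move rules into a subdirectory AGENTS.md.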
AGENTS.md Scope Rules
Priority order: direct user instructions > more deeply nested AGENTS.md > shallower AGENTS.md
- Each AGENTS.md applies to the entire directory tree rooted at the folder containing it.
- When instructions conflict, the more deeply nested file takes precedence.
- Direct system/developer/user instructions override anything in AGENTS.md.
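To make the scope rule concrete, here is a small sketch that lists which AGENTS.md files cover a given file, ordered from lowest to highest priority. It only illustrates the rule as described above and is not how Codex itself resolves files; `applicable_agents_files` is a hypothetical helper, and it assumes `target` is a file path inside `repo_root`.

```python
from pathlib import Path


def applicable_agents_files(repo_root: Path, target: Path) -> list[Path]:
    """List AGENTS.md files whose scope covers `target`, lowest priority first.

    `target` is assumed to be a file path located somewhere under `repo_root`.
    """
    repo_root = repo_root.resolve()
    target = target.resolve()
    # relative_to raises ValueError if target lies outside repo_root.
    relative = target.relative_to(repo_root)
    # Start at the repository root, then descend one directory at a time.
    directories = [repo_root]
    for part in relative.parts[:-1]:  # drop the file name itself
        directories.append(directories[-1] / part)
    return [d / "AGENTS.md" for d in directories if (d / "AGENTS.md").is_file()]
```

The last entry in the returned list is the most deeply nested file, so its rules win when two files disagree.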
Example: Good AGENTS.md Structure
# Project Name
## Workflow
- Run `npm test` after each code change.
- Use Conventional Commits (feat:, fix:, refactor:, docs:).
- Create PR with `gh pr create` after completion.
## Tech Stack
- Node.js 18+, Express 4.x, PostgreSQL 16.
- Testing: Jest + React Testing Library.
- Authentication: JWT + bcrypt.
## Code Standards
- Component files should not exceed 300 lines; split if they do.
- Avoid using `any` type.
- All public APIs must have JSDoc comments.
## Required Checks
After modifications, run in order:
1. `npm run lint` - Linting.
2. `npm test` - Run relevant tests.
3. `npm run typecheck` - Type checking.
Programmatic Checks in AGENTS.md
If AGENTS.md lists programmatic check commands, Codex must run all of them after making changes and confirm that they pass:
## Required Checks
After modifications, run in order:
1. `just fmt` - Code formatting.
2. `just fix -p <changed-crate>` - Linter fixes.
3. `cargo test -p <changed-project>` - Run relevant tests.
4. `just bazel-lock-check` - Dependency lock file check.
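The "run in order" requirement can be sketched as a small runner that stops at the first failing check. This is an illustration under stated assumptions: `run_checks` and `all_passed` are hypothetical helpers, and the caller is expected to fill in placeholders such as `<changed-crate>` before passing the commands in.

```python
import subprocess


def run_checks(commands: list[list[str]]) -> list[tuple[list[str], int]]:
    """Run each command in order and stop at the first failure.

    Returns (command, exit_code) pairs for every command that was run.
    """
    results = []
    for command in commands:
        completed = subprocess.run(command)
        results.append((command, completed.returncode))
        if completed.returncode != 0:
            break  # later checks are skipped once one fails
    return results


def all_passed(results: list[tuple[list[str], int]], expected_count: int) -> bool:
    """True only if every expected check ran and exited with code 0."""
    return len(results) == expected_count and all(code == 0 for _, code in results)
```

For the list above, the input would look like `[["just", "fmt"], ["just", "bazel-lock-check"]]` plus the two per-crate commands with their placeholders resolved.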
Workflow Best Practices
1. Phased Workflow
- Understand the codebase → Modify.
- Plan first → Implement.
- Generate → Validate.
- Do not compress all steps into one large prompt.
2. Simple Tasks Should Not Use Complex Workflows
- A task that takes 3-5 minutes should be requested in a single-sentence prompt.
- Reserve complex workflows for large multi-file, multi-step tasks.
- Renaming a variable, for example, needs one sentence and no workflow.
3. Let Codex Understand the Codebase First
# Let Codex understand the project structure first.
codex exec "Please explain the overall structure of the src/ directory, including:
1. What are the core modules?
2. What are the dependencies?
3. Where is the entry file?"
4. Parallel Task Strategy
- Assign independent tasks to different Codex instances.
- Use background processing to maintain workflow continuity.
- Avoid task dependencies blocking each other.
- Community Suggestion: Assign well-scoped tasks to multiple agents to run simultaneously.
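One way to run independent tasks side by side is to launch several `codex exec` calls from a thread pool. The sketch below assumes the `codex` CLI is on the PATH and that the prompts do not depend on each other; `run_in_parallel` and its `runner` parameter are illustrative names, and the runner is injectable so the logic can be tried without calling Codex.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_codex_exec(prompt: str) -> str:
    """Run one non-interactive Codex task and return its standard output."""
    completed = subprocess.run(
        ["codex", "exec", prompt], capture_output=True, text=True, check=True
    )
    return completed.stdout


def run_in_parallel(
    prompts: list[str],
    runner: Callable[[str], str] = run_codex_exec,
    max_workers: int = 4,
) -> dict[str, str]:
    """Run independent prompts concurrently and map each prompt to its output."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        outputs = list(pool.map(runner, prompts))  # map preserves input order
    return dict(zip(prompts, outputs))
```

Threads are enough here because each worker only waits on a subprocess; tasks that touch the same files should not be run this way.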
Debugging and Error Correction
1. Paste the Bug and Say “Fix”
- Paste the error message to Codex and say one word: “fix”.
- Do not explain how to fix it, guess at the cause, or prescribe a solution.
- Codex debugs better than you might expect; the more you micromanage, the more likely it is to go astray.
- The success rate of letting Codex fix directly is over 80%.
2. Two Failures = Start Over
- If two attempts to fix the same issue have failed, start a new session instead of trying a third time.
- Failed attempts stay in the context and pollute it, which reduces performance.
3. Request a Rewrite of Mediocre Solutions
- When Codex provides a working but not elegant solution, do not patch it.
- Say: “Knowing everything you know now, discard this and implement an elegant solution.”
- Rewritten versions are usually much better than patched ones.
Context Management
1. Long Sessions Reduce Performance
- The context window of Codex Web cloud agents is limited.
- In long sessions the context is compacted repeatedly, and the model loses track of earlier details.
- Community Proposal (#22642): a proposed setting, `auto_new_session_after_compactions = N`, that would automatically start a fresh session after N compactions.
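The behavior the proposal describes can be sketched as a simple counter. This is only an illustration of the idea, not an existing Codex feature; `SessionTracker` and its fields are hypothetical, with `max_compactions` playing the role of N.

```python
from dataclasses import dataclass


@dataclass
class SessionTracker:
    """Counts context compactions and signals when to start a fresh session."""

    max_compactions: int  # plays the role of N in the proposal
    compactions: int = 0

    def record_compaction(self) -> bool:
        """Register one compaction; return True when a new session is due."""
        self.compactions += 1
        return self.compactions >= self.max_compactions

    def reset(self) -> None:
        """Call after starting a new session."""
        self.compactions = 0
```

Until something like this exists in the tool, the same rule can be applied by hand: count the compactions you notice and restart once the count reaches your chosen limit.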
2. Practical Suggestions
- Complex tasks should be phased, starting a new session for each phase.
- Write key decisions and context into AGENTS.md instead of relying on dialogue memory.
- New sessions should carry key context summaries instead of continuing in overly long sessions.
3. Session Refresh Strategy
Session 1: Understand codebase structure → Write to AGENTS.md
Session 2: Implement functionality based on AGENTS.md
Session 3: Independent validation.
Prompt Engineering
Golden Structure for Effective Prompts
【Task Type】 Please fix/implement/refactor...
【Problem Description】
- Specific, clear problem description.
- Current behavior vs expected behavior.
【Constraints】
- Any limitations or requirements.
- Files/modules not to modify.
【Reference Information】
- Relevant file paths.
- Error messages or logs.
- Reference code examples.
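If prompts are generated by a script, the structure above can be assembled from its four parts. A minimal sketch follows; `build_prompt` is a hypothetical helper, and it uses plain square brackets for the section labels, which is a formatting choice rather than a requirement.

```python
def build_prompt(
    task: str,
    problem: list[str],
    constraints: list[str],
    references: list[str],
) -> str:
    """Assemble a prompt with the four sections of the template above."""
    sections = [
        ("Problem Description", problem),
        ("Constraints", constraints),
        ("Reference Information", references),
    ]
    lines = [f"[Task Type] {task}"]
    for title, items in sections:
        if not items:
            continue  # omit sections that have nothing to say
        lines.append(f"[{title}]")
        lines.extend(f"- {item}" for item in items)
    return "\n".join(lines)
```

Empty sections are dropped so that a simple task still produces a short prompt.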
Best Practices for Prompts
1. Provide Specific Paths and Line Numbers
❌ "Fix the bug in the auth module."
✅ "Fix the token validation logic in lines 45-62 of src/auth/login.py."
2. Specify the Expected Output Format
"Please generate a PR containing:
- Code changes
- Unit tests (at least 3 test cases)
- Updates to relevant documentation in README."
3. Specify the Tests to Run
"After modifications, run the following commands to validate:
npm test -- --testPathPattern=auth
npm run lint"
4. Include Error Messages or Logs
"Current error message:
TypeError: Cannot read property 'token' of undefined
at validateToken (src/auth/middleware.js:23:15)
Please locate and fix this issue."
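Paths and line numbers for a prompt can often be pulled straight from the error log. The sketch below extracts `file:line:column` locations from a Node.js-style stack trace like the one above; the regular expression is an assumption about that frame format and will not cover every runtime.

```python
import re

# Matches "path/to/file.ext:line:column" as printed in Node.js stack frames.
FRAME_PATTERN = re.compile(r"([\w./\\-]+\.\w+):(\d+):(\d+)")


def extract_locations(trace: str) -> list[tuple[str, int, int]]:
    """Return (path, line, column) for every frame found in the trace."""
    return [
        (match.group(1), int(match.group(2)), int(match.group(3)))
        for match in FRAME_PATTERN.finditer(trace)
    ]
```

The first location in the result is usually the frame closest to the failure, which is the one worth naming in the prompt.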
Subagents
1. Add “use subagents” in Prompts
- Codex will automatically split tasks for multiple subagents to process in parallel.
- Suitable for code reviews and large-scale refactoring.
2. Dedicated Subagents > General Mega-Agent
- Create function-specific subagents (e.g., “frontend component agent”) instead of general ones (e.g., “QA agent”).
- The more specific the function, the more precise the context, leading to better results.
3. Subagents Have Independent Context Windows
- Research, validation, and review each run in their own isolated context.
- Isolation keeps their intermediate output out of the main context and reduces bias carried over from earlier steps.
4. Multi-Agent Collaboration
Community discussion (#22749) notes that complex cross-project tasks need coordination between multiple Codex instances. Suggestions include:
- Each instance is responsible for independent subtasks.
- Coordinate context using shared files (like AGENTS.md).
- Use CI/CD pipelines to link outputs from multiple agents.
Skills and MCP Plugins
1. MCP Server (Model Context Protocol)
MCP is the core protocol for extending Codex capabilities, allowing integration with external tools:
| Tool | Description | Installation |
|---|---|---|
| xlsx-for-ai | 39 Excel operations (read/write, pivot, charts, etc.) | npm install -g xlsx-for-ai |
| Community MCP Registry | registry.modelcontextprotocol.io | Browse and install |
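Configuration Example (the JSON `mcpServers` layout used by many MCP clients; Codex CLI itself reads servers from `[mcp_servers.<name>]` tables in `~/.codex/config.toml`):
{
  "mcpServers": {
    "xlsx-for-ai": {
      "command": "xlsx-for-ai-mcp"
    }
  }
}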
2. Skills Should Be Folder Structures
skills/
  api-design/
    SKILL.md      # Main file: core rules and index.
    references/   # Corpus, reference materials.
    scripts/      # Auxiliary scripts.
    examples/     # Example code.
- The main file should only contain core rules and index.
- Corpus and checklists should be in references/.
- Progressive Disclosure: Codex only reads subdirectory content when needed.
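The folder layout above can be created with a short scaffold script. This is a sketch under assumptions: `scaffold_skill` is a hypothetical helper, and the starter headings written into SKILL.md are placeholders to be replaced with real rules.

```python
from pathlib import Path

SUBDIRECTORIES = ("references", "scripts", "examples")


def scaffold_skill(skills_root: Path, name: str) -> Path:
    """Create skills/<name>/ with SKILL.md and the three supporting folders."""
    skill_dir = skills_root / name
    for sub in SUBDIRECTORIES:
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    main_file = skill_dir / "SKILL.md"
    if not main_file.exists():  # never overwrite an existing main file
        main_file.write_text("## Core Rules\n\n## Gotchas\n", encoding="utf-8")
    return skill_dir
```

Calling `scaffold_skill(Path("skills"), "api-design")` reproduces the tree shown above and is safe to run again on an existing skill.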
3. Add Gotchas Section
This is the most valuable long-term technique: each time Codex makes a mistake, record the failure pattern. Over time these records accumulate into content with a high signal-to-noise ratio.
Gotchas Structure Example
# SKILL.md
## Core Rules
...
## Gotchas
### 2026-05-10: Missing API Pagination Parameter
- **Issue**: Forgot to add pagination parameters when generating API.
- **Manifestation**: Returns all data, causing performance issues.
- **Fix**: Add pagination rules in SKILL.md.
- **Prevention**: Add to checklist "Does it include pagination?"
### 2026-05-12: Insufficient Test Coverage
- **Issue**: Only tests the happy path, ignoring edge cases.
- **Manifestation**: Tests pass but actual execution fails.
- **Fix**: Add boundary value testing rules.
- **Prevention**: Add to checklist "Edge case coverage."
Gotchas Maintenance Principles
- Record every mistake immediately: write the entry as soon as the failure is understood.
- Include four elements: Problem description, manifestation, fix method, prevention measures.
- Regular Review: Review weekly to identify recurring patterns.
- Convert to Rules: If a gotcha occurs more than three times, convert it into a formal rule.
- Archive Resolved Issues: Move problems that haven’t appeared in over 30 days to the archive.
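The maintenance rules above translate into a few small checks. The sketch below models one gotcha with its four elements and applies the "more than three times" and "over 30 days" thresholds; the `Gotcha` class and helper names are hypothetical, and tracking one date per occurrence is an assumption about how recurrences are recorded.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Gotcha:
    title: str
    issue: str
    manifestation: str
    fix: str
    prevention: str
    occurrences: list[date]  # one date per time the mistake was observed

    def to_markdown(self) -> str:
        """Render the entry in the four-element layout, dated by first sighting."""
        first_seen = min(self.occurrences).isoformat()
        return "\n".join([
            f"### {first_seen}: {self.title}",
            f"- **Issue**: {self.issue}",
            f"- **Manifestation**: {self.manifestation}",
            f"- **Fix**: {self.fix}",
            f"- **Prevention**: {self.prevention}",
        ])


def should_promote(gotcha: Gotcha, threshold: int = 3) -> bool:
    """True when the gotcha has occurred more than `threshold` times."""
    return len(gotcha.occurrences) > threshold


def should_archive(gotcha: Gotcha, today: date, quiet_days: int = 30) -> bool:
    """True when the gotcha has not appeared for more than `quiet_days` days."""
    return today - max(gotcha.occurrences) > timedelta(days=quiet_days)
```

A weekly review could loop over all entries and print the ones that are due for promotion or archiving.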
Memory and Persistence
1. Codex’s Memory Mechanism
Codex itself does not have long-term memory across sessions; each task runs in an independent sandbox environment. However, the following methods can achieve a “memory” effect:
2. Methods to Implement Memory
Method 1: AGENTS.md as Project Memory
- Write project knowledge, decisions, and standards into AGENTS.md.
- Codex reads this every time it executes a task.
Method 2: Code Comments and Documentation
- Keep clear comments in the code.
- Maintain up-to-date README and architecture documentation.
Method 3: Local Automated Memory (Community Solution)
GitHub community users (#21728) proposed a “Sentinel-AI” model:
Local Agent detects an issue
↓
Checks local memory/Playbook
↓
Known issue → Execute local solution
Unknown issue → Escalate to Codex
↓
Codex generates a fix
↓
Human review + approval
↓
Save as a reusable local tool
↓
Automatically use next time, no need to request cloud AI again.
Benefits reported in the proposal:
- About 80% of tasks do not require calling the cloud API.
- Costs drop from $75/month to $15/month.
- The effect compounds: the longer the system runs, the fewer cloud calls it needs.
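The flow above can be sketched as a local playbook that is consulted before anything is escalated. This is a simplified illustration of the described pattern, not the community implementation; the `Playbook` class, the exact-match lookup on the issue text, and the `escalate` and `approve` callbacks are all assumptions.

```python
from typing import Callable, Optional


class Playbook:
    """Local store of approved fixes, keyed by the issue description."""

    def __init__(self) -> None:
        self._fixes: dict[str, str] = {}
        self.cloud_calls = 0  # how often an issue had to be escalated

    def handle(
        self,
        issue: str,
        escalate: Callable[[str], str],
        approve: Callable[[str, str], bool],
    ) -> Optional[str]:
        """Resolve locally when the issue is known; otherwise escalate."""
        if issue in self._fixes:
            return self._fixes[issue]  # known issue: no cloud call needed
        self.cloud_calls += 1
        proposed = escalate(issue)  # e.g. a request sent to Codex
        if approve(issue, proposed):  # human review gate
            self._fixes[issue] = proposed  # reusable next time
            return proposed
        return None  # rejected fixes are not stored
```

The `cloud_calls` counter makes the compounding effect measurable: for a stable set of recurring issues it should grow more slowly over time.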
Method 4: Structured Memory File
# .codex/memory.md
memory:
  append_only: true  # Only append, do not overwrite.
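An append-only rule is straightforward to enforce in whatever script writes to the memory file. A minimal sketch, assuming notes are stored as dated bullet lines; `append_memory` and the entry format are illustrative and not defined by Codex.

```python
from datetime import date
from pathlib import Path


def append_memory(memory_file: Path, note: str, today: date) -> None:
    """Append one dated note; existing content is never rewritten."""
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever writes at the end of the file.
    with memory_file.open("a", encoding="utf-8") as handle:
        handle.write(f"- {today.isoformat()}: {note}\n")
```

Because the file is only opened in append mode, earlier decisions stay intact even if a later session records something that contradicts them.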