
AI Security Checklist

Six copy-paste prompts to audit any AI-powered project. Works with Claude, GPT, Codex, or any LLM with file access. Start with the review, then work through the checklist.

Works with

Any LLM

Claude, GPT, Codex, Gemini, local models

Time to harden

~20 min

Enough for a credible first pass

Ops habit

Nightly

Automated checks catch drift before it piles up

Section 1

Let AI produce the report first

Do not ask AI to harden everything in one shot. Start with a conservative review that gives you a clear, structured report: what to fix today, what can wait, what still works, and what gets less convenient.

Recommended model: any strong reasoning model (Claude Opus, GPT, Codex)

What a good first answer looks like

  • One clear overall status instead of a wall of warnings
  • No more than 3 things to fix right now
  • Plain-language tradeoffs for every suggested change
  • A smallest next step, plus what not to change blindly
AI review prompt
Run a conservative security review for my project.

Before making any claims, read the project structure and configuration files. Use what you find as the source of truth. Do not guess from memory.

Use this 5-step baseline as your review rubric:
1. Check whether plaintext secrets exist in any config, .env, or source files
2. Check whether API routes validate input and sanitize output
3. Check whether environment-sensitive clients initialize safely
4. Check whether security headers are configured
5. Check whether dependencies have known vulnerabilities (run audit if available)

Important:
- Do not make any config changes.
- Do not auto-fix anything.
- Do not suggest maximum lockdown by default.
- If a recommendation would reduce workflow convenience, explain that tradeoff clearly.
- If a change would break tools, integrations, or local access, say so explicitly.

Return this exact format:

Overall status: [Safe enough / Needs attention / Fix now]

Fix today (max 3):
- [issue]: protects [what], costs [what convenience]

Can wait (max 3):
- [issue]: protects [what], costs [what convenience]

Skip for my workflow (optional, max 2):
- [issue]: why it may not fit this setup right now

What still works after these changes: [one sentence]
What gets less convenient: [one sentence]

End with:
- the smallest next change I should consider
- the exact command or file to inspect next
- what I should not change blindly

Section 2

Quick Check

Six practical fixes. Start with the easiest baseline wins, then decide whether your workflow can tolerate tighter isolation.

Completion: 0 / 6 done

Step 1

Run a conservative security review

Get a clear picture before changing anything. Let the AI produce a structured report, not a wall of warnings.

You need a baseline. A conservative review catches the real issues without suggesting maximum lockdown that breaks your workflow. Start with understanding, not fixing.

  • Use a strong reasoning model for the review. Save faster models for follow-up work.
  • Read the full report before making any changes.
  • If a suggestion would break your workflow, the report should say so explicitly.
Prompt
Run a conservative security review for my project.

Before making any claims, read the project structure and configuration files. Use what you find as the source of truth. Do not guess from memory.

Use this 5-step baseline as your review rubric:
1. Check whether plaintext secrets exist in any config, .env, or source files
2. Check whether API routes validate input and sanitize output
3. Check whether environment-sensitive clients initialize safely
4. Check whether security headers are configured
5. Check whether dependencies have known vulnerabilities (run audit if available)

Important:
- Do not make any config changes.
- Do not auto-fix anything.
- Do not suggest maximum lockdown by default.
- If a recommendation would reduce workflow convenience, explain that tradeoff clearly.
- If a change would break tools, integrations, or local access, say so explicitly.

Return this exact format:

Overall status: [Safe enough / Needs attention / Fix now]

Fix today (max 3):
- [issue]: protects [what], costs [what convenience]

Can wait (max 3):
- [issue]: protects [what], costs [what convenience]

Skip for my workflow (optional, max 2):
- [issue]: why it may not fit this setup right now

What still works after these changes: [one sentence]
What gets less convenient: [one sentence]

End with:
- the smallest next change I should consider
- the exact command or file to inspect next
- what I should not change blindly
Step 2

Move secrets out of config

Stop storing plaintext API keys and tokens in config files. Reference environment variables instead.

Prompt
Scan this project for exposed secrets and credentials.

Check these locations:
1. All .env files (are any committed to git?)
2. Source code files (hardcoded API keys, tokens, passwords)
3. Configuration files (plaintext credentials in JSON, YAML, TOML)
4. Git history (secrets that were committed then removed but still in history)
5. Client-side code (secrets bundled into frontend JavaScript)

For each finding, report:
- File path and line number
- What type of credential it is
- Whether it is currently exposed (committed, in client bundle, etc.)
- Risk level: Critical (actively exposed) / Warning (could leak) / Info (best practice)

Do not redact the findings from me. I need to see exactly what is exposed.

Then provide:
1. Which secrets to rotate immediately
2. How to move each one to environment variables
3. Whether a .gitignore update is needed
4. Whether git history needs cleaning (and the exact command)
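The secret scan described above can be approximated locally before handing it to an AI. A minimal sketch in Python; the regex patterns are illustrative examples of common credential shapes, not an exhaustive rule set (dedicated scanners such as gitleaks or trufflehog ship far more rules):

```python
import re
from pathlib import Path

# Illustrative patterns only -- each maps a label to a regex
# for a common credential shape.
PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "generic_api_key": re.compile(
        r"(?i)(api[_-]?key|token|secret)\s*[:=]\s*['\"][^'\"]{16,}['\"]"
    ),
    "private_key": re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
}

def scan_file(path: Path) -> list[tuple[int, str]]:
    """Return (line_number, finding_type) pairs for one file."""
    findings = []
    try:
        text = path.read_text(errors="ignore")
    except OSError:
        return findings
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, label))
    return findings

def scan_tree(root: str) -> dict[str, list[tuple[int, str]]]:
    """Scan every file under root, skipping common vendor directories."""
    skip = {".git", "node_modules", ".venv"}
    results = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and not (skip & set(path.parts)):
            hits = scan_file(path)
            if hits:
                results[str(path)] = hits
    return results
```

Run `scan_tree(".")` from the project root and treat any hit in a committed file as rotate-immediately, per the prompt's risk levels.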
Step 3

Harden your API routes

Review every public-facing API endpoint for input validation, error handling, rate limiting, and authentication.

Prompt
Review every API route in this project for security issues.

For each route, check:
1. Input validation: Are query params, body fields, and path params validated?
2. Authentication: Does it verify who is calling? Should it?
3. Authorization: Can users access only their own data?
4. Error handling: Do error responses leak internal details (stack traces, DB errors, file paths)?
5. Rate limiting: Is there protection against abuse?
6. Output sanitization: Could responses include user-controlled HTML or scripts?

For each issue found, report:
- Route path and method
- The specific vulnerability
- Severity: Critical / High / Medium / Low
- A concrete fix (show the code change, not just advice)

Do not apply fixes. Report only. I will review and apply changes myself.

End with a summary table:
| Route | Issues Found | Highest Severity |
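For reference when reviewing the AI's suggested fixes, here is what "validate input and sanitize output" looks like in practice. A framework-agnostic sketch in Python; the field names, limits, and schema shape are illustrative assumptions, not a specific framework's API:

```python
import html

# Hypothetical schema: each field maps to a (required, validator) pair.
SCHEMA = {
    "username": (True, lambda v: isinstance(v, str) and 1 <= len(v) <= 32 and v.isalnum()),
    "page": (False, lambda v: isinstance(v, int) and v >= 1),
}

def validate_input(payload: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = []
    for field, (required, check) in SCHEMA.items():
        if field not in payload:
            if required:
                errors.append(f"missing required field: {field}")
            continue
        if not check(payload[field]):
            errors.append(f"invalid value for field: {field}")
    # Reject unexpected fields rather than silently accepting them.
    for field in payload:
        if field not in SCHEMA:
            errors.append(f"unexpected field: {field}")
    return errors

def sanitize_output(value: str) -> str:
    """Escape user-controlled text before it reaches an HTML context."""
    return html.escape(value)
```

Returning generic error strings like these, rather than raw exceptions, also addresses the prompt's point about error responses leaking internal details.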
Step 4

Audit your dependencies

Check for known vulnerabilities in your package dependencies and get a clear upgrade plan.

Prompt
Audit the project dependencies for known security vulnerabilities.

Steps:
1. Run the package manager's built-in audit (npm audit, pnpm audit, yarn audit, pip audit, etc.)
2. List every vulnerability found with severity level
3. For each critical or high severity issue:
   - What package is affected
   - What the vulnerability allows (RCE, XSS, data leak, etc.)
   - Whether a patched version exists
   - Whether upgrading would break anything (check changelogs)

4. Check for outdated major versions of security-critical packages:
   - Framework (Next.js, React, Django, etc.)
   - Auth libraries
   - Database clients
   - HTTP/API clients

Report format:
- Total vulnerabilities: [count by severity]
- Immediate upgrades needed: [list with commands]
- Upgrades that need testing: [list with breaking change warnings]
- Safe to ignore for now: [list with reasoning]

Run the audit command and show me the real output. Do not guess.
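The triage logic the prompt asks for can be sketched as a small function. The record shape below is an assumption for illustration; it is not the exact JSON format of npm audit or pip-audit, so adapt the field names to your tool's real output:

```python
# Hypothetical vulnerability records -- the field names ("package",
# "severity", "fix_version", "breaking") are assumptions for illustration.
def triage(vulns: list[dict]) -> dict:
    """Split vulnerabilities into upgrade-now / test-first / low-priority buckets."""
    report = {"immediate": [], "needs_testing": [], "low_priority": []}
    for v in vulns:
        severe = v["severity"] in ("critical", "high")
        if severe and v.get("fix_version") and not v.get("breaking", False):
            # Patched version exists and the upgrade is non-breaking: do it now.
            report["immediate"].append(f"{v['package']} -> {v['fix_version']}")
        elif severe:
            # Severe but the fix is breaking or missing: upgrade behind tests.
            report["needs_testing"].append(v["package"])
        else:
            report["low_priority"].append(v["package"])
    return report
```

This mirrors the report format in the prompt: immediate upgrades, upgrades that need testing, and findings that are safe to defer with reasoning.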
Step 5

Review agent governance

If you run AI agents, verify that your permission boundaries, approval gates, and escalation rules are enforced.

Prompt
Review the agent governance and permission system in this project.

Check:
1. Permission boundaries: What can each agent do without human approval?
2. Approval gates: What requires explicit human sign-off before executing?
3. Escalation rules: When do agents stop and ask for help?
4. Rejection patterns: Is there a log of what has been rejected before?
5. File protection: Are critical files (config, credentials, governance docs) protected from agent writes?
6. External access: Can agents make API calls, send emails, publish content, or modify shared resources without approval?

For each finding, assess:
- Is the current permission appropriate for the risk?
- Could an agent accidentally do damage with its current access?
- Are there gaps where an agent could bypass governance?

Report format:

Governance health: [Strong / Adequate / Needs work / Missing]

Overpermissioned (fix now):
- [agent/role]: can [action] without approval, should require it

Underpermissioned (consider relaxing):
- [agent/role]: requires approval for [action] that could safely auto-execute

Missing controls:
- [what governance mechanism should exist but does not]

End with: the single most important governance change to make today.
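A permission boundary with approval gates and escalation can be expressed as a tiny policy table. A minimal sketch, assuming a hypothetical policy; the action names and rule labels are illustrative, not from any particular agent framework:

```python
# Hypothetical policy: which actions an agent may run freely, which are
# gated on human approval, and which are denied outright.
POLICY = {
    "read_file": "allow",
    "write_draft": "allow",
    "send_email": "require_approval",
    "publish_post": "require_approval",
    "modify_config": "deny",
}

def gate(action: str, approved: bool = False) -> str:
    """Return 'execute', 'blocked', or 'awaiting_approval' for an action.

    Unknown actions escalate to a human instead of defaulting to allow.
    """
    rule = POLICY.get(action, "escalate")
    if rule == "allow":
        return "execute"
    if rule == "require_approval":
        return "execute" if approved else "awaiting_approval"
    if rule == "deny":
        return "blocked"
    return "awaiting_approval"  # escalate: stop and ask for help
```

The key design choice is the default: anything not explicitly listed escalates, which closes the "gaps where an agent could bypass governance" the prompt asks about.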
Step 6

Set up automated security checks

Create a recurring audit so regressions get caught before they pile up.

Prompt
Help me set up automated recurring security checks for this project.

I need:
1. A nightly or weekly audit script that:
   - Runs dependency audit
   - Scans for new plaintext secrets in committed files
   - Checks that security headers are still configured
   - Verifies .env files are not committed
   - Reports any new API routes that lack input validation

2. Integration with my existing workflow:
   - If I use CI/CD (GitHub Actions, Vercel, etc.), add it there
   - If I use scheduled agents, create a task/skill for it
   - If neither, create a simple shell script I can cron

3. Alert format:
   - Only alert on new issues (not re-reporting known ones)
   - Clear severity levels
   - Actionable next steps, not vague warnings

Show me the exact files to create and where to put them.
Do not over-engineer. The simplest version that catches real problems is better than a complex system nobody runs.
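The "only alert on new issues" requirement is the part most often skipped. A minimal sketch in Python; the state file name and finding strings are illustrative assumptions:

```python
import json
from pathlib import Path

# Hypothetical state file that remembers findings from previous runs.
STATE_FILE = Path("security-known-issues.json")

def diff_new_issues(current: list[str]) -> list[str]:
    """Compare this run's findings to the last run and return only new ones,
    then persist the full set so re-runs stay quiet about known issues."""
    known = set()
    if STATE_FILE.exists():
        known = set(json.loads(STATE_FILE.read_text()))
    new = sorted(set(current) - known)
    STATE_FILE.write_text(json.dumps(sorted(set(current) | known)))
    return new
```

Feed it the findings from each nightly run; the first run reports everything, and every run after that reports only regressions, which is exactly the alert behavior the prompt specifies.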

Section 3

How Agent0 handles security by default

Agent0 is local-first, human-governed, and fully transparent. Here is what that means in practice.

| Control | Hosted platforms | Agent0 |
| --- | --- | --- |
| Data location | Their servers | Your machine |
| Who sees your ideas | The platform provider | Only you |
| API key management | Stored by vendor | Stays local |
| Agent permissions | Platform-defined | You define every boundary |
| Audit trail | Limited or none | Full git history |
| Vendor lock-in | Proprietary formats | Plain markdown |
| Model choice | Their model, their rules | Bring your own |