Frontier AI Defenses for Social on Autopilot
Inspired by BeeSafe AI — Frontier AI Defenses for Social Engineering Attacks. Loop until the workflow is current, exceptions are owned, and human sign-off is captured where required.
by Trooper
/loop 30m Start the "Frontier AI Defenses for Social on Autopilot" loop. Inspired by BeeSafe AI (https://beesafe.ai).

Goal: open work triaged, exceptions owned, and core security workflow current with audit trail
Max iterations: 20
Between iterations run: Report open queue items, stale tasks, failed automations, and items awaiting human approval for BeeSafe AI
Exit when: zero open items without owner or explicit escalation, all external actions approved or sent, and systems of record current

Step 1 - Scan surface: Discover assets, agents, endpoints, and exposure across the environment.
Step 2 - Test and validate: Run adversarial or policy checks; prove impact with graph context.
Step 3 - Triage findings: Rank by severity; assign owners and remediation plans.
Step 4 - Remediate safely: Draft fixes grounded in real config; require approval for prod changes.
Step 5 - Verify and close: Confirm fix landed; preserve audit trail from finding to resolution.

## Before you start

Connect plugins:
- GitHub (required): Read branches, PRs, reviews, checks, workflow runs, and source diffs.

Attach skills:
- Loop runner (required): Self-pace iterations, run the check between passes, and stop only on the exit condition.
- Code change + local verification (optional): Edit code safely, run commands, and keep changes scoped.
- CI debugging (optional): Read failing checks, logs, and the smallest actionable root cause.
- Approval workflows (optional): Keep outbound actions in draft or approval states when risk is non-trivial.
- Security triage (optional): Assess alerts, prioritize patches, and avoid unsafe shortcuts.

Self-pace this loop. After each iteration, run the check command, read the output, and only continue if the exit condition is not met. Stop when the exit condition passes or max iterations is reached. Give a short status update each pass.
Paste the kickoff prompt into Cursor, Claude Code, or Codex. Deeplinks do not install hook files.
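The loop control described in the kickoff prompt (run a pass, run the check, stop on the exit condition or at the iteration cap) can be sketched roughly as follows. This is a minimal illustration, not how any particular agent implements `/loop`: `QueueStatus`, `exit_condition_met`, and `run_loop` are hypothetical names, the three counters are one assumed way to model the "Exit when" clause, and the 30-minute wait between passes is left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class QueueStatus:
    """Snapshot from the between-iterations check (hypothetical shape)."""
    unowned_open_items: int            # open items with no owner or escalation
    unapproved_external_actions: int   # external actions neither approved nor sent
    stale_records: int                 # systems-of-record entries not yet current


def exit_condition_met(status: QueueStatus) -> bool:
    """Mirror the prompt's 'Exit when' clause: every count must be zero."""
    return (
        status.unowned_open_items == 0
        and status.unapproved_external_actions == 0
        and status.stale_records == 0
    )


def run_loop(
    run_iteration: Callable[[int], None],
    check: Callable[[], QueueStatus],
    max_iterations: int = 20,
) -> tuple[int, bool]:
    """Run passes until the exit condition holds or the cap is reached.

    Returns (iterations_run, exited_cleanly).
    """
    for i in range(1, max_iterations + 1):
        run_iteration(i)
        status = check()                # the "Between iterations run" report
        print(f"pass {i}: {status}")    # short status update each pass
        if exit_condition_met(status):
            return i, True
    return max_iterations, False
```

Returning whether the loop exited cleanly keeps "hit the cap" distinguishable from "work is actually done", which is the distinction the exit condition is meant to protect.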
1. Scan surface
Discover assets, agents, endpoints, and exposure across the environment.
2. Test and validate
Run adversarial or policy checks; prove impact with graph context.
3. Triage findings
Rank by severity; assign owners and remediation plans.
4. Remediate safely
Draft fixes grounded in real config; require approval for prod changes.
5. Verify and close
Confirm fix landed; preserve audit trail from finding to resolution.
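As one illustration of the triage step (rank by severity, assign owners), the sketch below sorts findings and gives each one either an owner or an explicit escalation, so that nothing is left unowned. The severity scale, the `Finding` fields, and the component-to-owner mapping are all assumptions made for the example; the source does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed severity scale; the source only says "rank by severity".
SEVERITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Finding:
    id: str
    severity: str
    component: str
    owner: Optional[str] = None
    escalated: bool = False


def triage(findings: list[Finding], owners_by_component: dict[str, str]) -> list[Finding]:
    """Rank findings by severity, then assign an owner or escalate each one."""
    # Unknown severities sort last; sorted() is stable, so ties keep input order.
    ranked = sorted(
        findings,
        key=lambda f: SEVERITY_ORDER.get(f.severity, len(SEVERITY_ORDER)),
    )
    for f in ranked:
        if f.owner is None:
            f.owner = owners_by_component.get(f.component)
        if f.owner is None:
            # No known owner: mark for escalation instead of leaving it unowned.
            f.escalated = True
    return ranked
```

After `triage` returns, every finding has an owner or is flagged for escalation, which lines up with the "zero open items without owner or explicit escalation" part of the exit condition.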
Guardrails
Rules the agent must follow so it cannot cheat the exit condition.
- Require human approval before customer-facing sends, payments, or legal submissions unless pre-approved templates apply
- Preserve full audit trail linking source data to every automated action
- Escalate compliance, safety, or regulatory-sensitive items immediately
- Never expose secrets or credentials in logs or reports
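Three of the guardrails above (the approval gate, the audit trail, and secret redaction) lend themselves to a small sketch. Everything here is hypothetical: the action kinds in `APPROVAL_REQUIRED` are one reading of the first guardrail, the regex is illustrative rather than a complete secret scanner, and `Action`, `AuditLog`, and `execute` are made-up names.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Assumed action categories, taken from the first guardrail.
APPROVAL_REQUIRED = {"customer_send", "payment", "legal_submission"}

# Illustrative pattern only; real secret detection needs a broader scanner.
SECRET_PATTERN = re.compile(
    r"(?i)\b(api[_-]?key|token|password|secret)\b(\s*[=:]\s*)(\S+)"
)


def redact(text: str) -> str:
    """Mask values that follow common credential keys before logging."""
    return SECRET_PATTERN.sub(lambda m: f"{m.group(1)}{m.group(2)}[REDACTED]", text)


@dataclass
class Action:
    kind: str
    source_ref: str                      # pointer to the data that triggered the action
    detail: str
    approved_by: Optional[str] = None
    pre_approved_template: bool = False


@dataclass
class AuditLog:
    entries: list[dict] = field(default_factory=list)

    def record(self, action: Action, outcome: str) -> None:
        """Append one entry linking the source data to the action taken."""
        self.entries.append({
            "kind": action.kind,
            "source_ref": action.source_ref,
            "detail": redact(action.detail),   # never store raw secrets
            "approved_by": action.approved_by,
            "outcome": outcome,
        })


def execute(action: Action, log: AuditLog) -> str:
    """Hold risky actions until approved; log every decision either way."""
    needs_approval = (
        action.kind in APPROVAL_REQUIRED and not action.pre_approved_template
    )
    if needs_approval and action.approved_by is None:
        outcome = "held_for_approval"
    else:
        outcome = "executed"
    log.record(action, outcome)
    return outcome
```

Held actions are logged as well as executed ones, so the audit trail also shows what the agent chose not to do on its own.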
