CloudKeepers are autonomous keepers that enforce guardrails for cost, security, and performance across every connected cloud and Kubernetes cluster. Each keeper combines a cloud provider with an operational pillar — giving you 9 specialized monitors that detect drift, surface remediation playbooks, and alert teams so issues are fixed before they become Incidents.

The Problem

Cloud drift is constant. Security groups get opened during incidents and never closed. Dev instances left running over weekends compound into hundreds of dollars of monthly waste. New resources get provisioned without required tags, skip encryption, or use overly permissive IAM roles. Kubernetes pods crash-loop unnoticed while CPU and memory requests stay oversized. Most teams only discover these issues during quarterly audits or, worse, after a security event.
Continuous monitoring today typically relies on one or more of the following:
  • AWS Config rules — powerful but complex to write, maintain, and interpret
  • Terraform Sentinel or OPA policies — code-based, developer-only, no plain-language rules
  • Cloud Custodian — YAML-based, requires DevOps expertise to maintain
  • Manual scheduled audits — infrequent, incomplete, and immediately stale
None of these give you a daily operational picture with plain-language findings, prioritized by impact, with implementation steps attached.

How Existing Tools Compare

Tool | What It Does | What’s Missing
AWS Config | Rules-based configuration drift detection | Complex rule authoring, no AI analysis, no remediation guidance, AWS-only
Terraform Sentinel / OPA | Policy-as-code enforcement | Developer-only, requires code changes to add rules, no AI recommendations
Cloud Custodian | YAML-based cloud governance automation | Complex setup, no natural-language interface, no prioritization
Wiz / Orca (CSPM) | Cloud security posture management | Security-only (no cost/performance), expensive, requires dedicated analyst
AWS Trusted Advisor | Basic well-architected checks | ~50 fixed checks, no customization, no daily operational cadence
CloudKeepers combines cost, security, and performance guardrails into a single autonomous system — with findings in plain language, prioritized by impact, with remediation playbooks attached.

Keeper Architecture

CloudKeepers organizes monitoring into a 3 × 3 matrix of providers and pillars:
Provider | Cost | Security | Performance
AWS | AWS-COST | AWS-SEC | AWS-PERF
GCP | GCP-COST | GCP-SEC | GCP-PERF
Kubernetes | K8S-COST | K8S-SEC | K8S-PERF
Each keeper is a specialized monitor for one provider + one pillar combination. Enable only the keepers you need — for example, AWS-COST and K8S-SEC — or enable all nine for full coverage. Each keeper contains multiple detection rules (40+ rules total) that you can individually toggle and configure:
  • Cost rules: idle compute instances, unattached storage, old snapshots, unused static IPs, oversized databases, idle load balancers, over-requested pod resources, and more
  • Security rules: public S3 buckets, unused IAM roles, MFA disabled on root, open security groups, secrets in parameter store, and more
  • Performance rules: RDS connection limits, missing health probes, CrashLooping pods, throttled resources, and more
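The provider and pillar matrix above can be sketched in a few lines of Python. This is an illustration only: the keeper IDs come from the table, but the `keeper_ids` function and the idea of storing enabled keepers as a set are assumptions, not the product's actual API.

```python
from itertools import product

# Provider and pillar codes as they appear in the keeper IDs in the matrix.
PROVIDERS = ["AWS", "GCP", "K8S"]
PILLARS = ["COST", "SEC", "PERF"]


def keeper_ids(providers=PROVIDERS, pillars=PILLARS):
    """Return every provider-pillar keeper ID, e.g. 'AWS-COST'."""
    return [f"{provider}-{pillar}" for provider, pillar in product(providers, pillars)]


all_keepers = keeper_ids()

# Partial coverage, using the example pair mentioned in the text.
enabled = {"AWS-COST", "K8S-SEC"}
disabled = [keeper for keeper in all_keepers if keeper not in enabled]

print(len(all_keepers), "keepers available;", len(disabled), "not enabled")
```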

Autonomy Levels

Every detection rule can operate at one of three autonomy levels:
Level | Name | Behavior
1 | Suggest | Read-only. The keeper analyzes infrastructure and reports findings. No changes are made.
2 | Approve | The keeper drafts actions for each finding. You review and approve before anything runs.
3 | Autonomous | The keeper executes approved command types automatically. You are notified after each action.
Autonomy is configured per rule, so you can run most rules in Suggest mode while allowing well-understood cost rules (like cleaning up unattached volumes) to operate autonomously.
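As a rough sketch of how per-rule autonomy could translate into behavior, the Python below maps each level to the next step for a finding. The `Rule` class, the `next_step` function, and the command-type strings are hypothetical names chosen for illustration. The fallback to approval when an Autonomous rule meets a command type that has not been approved is also an assumption; the source does not describe that case.

```python
from dataclasses import dataclass
from enum import IntEnum


class Autonomy(IntEnum):
    SUGGEST = 1     # read-only: report findings, make no changes
    APPROVE = 2     # draft actions, wait for a human to approve
    AUTONOMOUS = 3  # run approved command types, notify afterwards


@dataclass
class Rule:
    name: str
    autonomy: Autonomy
    approved_commands: frozenset = frozenset()  # hypothetical allow-list


def next_step(rule: Rule, command_type: str) -> str:
    """Decide what happens to a finding produced by this rule."""
    if rule.autonomy is Autonomy.SUGGEST:
        return "report_only"
    if rule.autonomy is Autonomy.APPROVE:
        return "await_approval"
    # Autonomous: only pre-approved command types run without review.
    if command_type in rule.approved_commands:
        return "execute_and_notify"
    return "await_approval"  # assumption: unapproved commands fall back to review


volumes = Rule("unattached-storage", Autonomy.AUTONOMOUS, frozenset({"delete_volume"}))
buckets = Rule("public-s3-buckets", Autonomy.SUGGEST)

print(next_step(volumes, "delete_volume"))
print(next_step(buckets, "block_public_access"))
```

Keeping the level on the rule rather than on the keeper mirrors the text: most rules can stay in Suggest while a few well-understood cost rules run autonomously.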

What Makes This Different

  • Keepers, not rules: instead of writing policy code, you enable provider-pillar keepers and configure detection rules — the keepers decide what matters
  • Three pillars: cost optimization, security monitoring, and performance analysis in a single system
  • Configurable autonomy: choose Suggest, Approve, or Autonomous per rule — from read-only observation to full auto-remediation
  • Tunable thresholds: adjust detection sensitivity per rule (e.g., idle CPU threshold, snapshot max age, lookback period)
  • Daily operational cadence: designed to run daily (configurable cron), not quarterly — catching drift before it compounds
  • Plain-language findings: each finding explains the risk and impact in business terms, not just a rule name
  • Remediation playbooks attached: every finding includes impact analysis, before/after estimates, and step-by-step implementation guidance
  • Multi-cloud + Kubernetes: scans AWS, GCP, and Kubernetes in a single system — not separate tools per provider
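To make the idea of tunable thresholds concrete, here is a minimal Python sketch of an idle-compute check driven by an idle CPU threshold and a lookback period. The `IdleComputeThresholds` class, its default values, and the `is_idle` function are assumptions for illustration; the actual rule logic and defaults are not documented here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class IdleComputeThresholds:
    max_avg_cpu_percent: float = 5.0  # hypothetical default idle CPU threshold
    lookback_days: int = 7            # hypothetical default lookback period


def is_idle(daily_avg_cpu, thresholds):
    """Flag an instance whose average CPU over the lookback window is below the threshold.

    daily_avg_cpu is a list of daily average CPU percentages, oldest first.
    """
    window = daily_avg_cpu[-thresholds.lookback_days:]
    if len(window) < thresholds.lookback_days:
        return False  # not enough history to judge
    return mean(window) < thresholds.max_avg_cpu_percent


week = [2.0, 3.0, 1.0, 2.0, 4.0, 3.0, 2.0]
print(is_idle(week, IdleComputeThresholds()))
print(is_idle(week, IdleComputeThresholds(max_avg_cpu_percent=2.0)))
```

Lowering the threshold or lengthening the lookback makes the rule more conservative, which is the kind of sensitivity adjustment the bullet on tunable thresholds describes.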

Responsibilities

  • Policy enforcement: apply cost, security, and performance guardrails through specialized keepers for day-to-day operations.
  • Drift detection: continuously scan for misconfigurations, risky defaults, resource bloat, and performance bottlenecks.
  • Remediation playbooks: attach implementation steps and automation options to every finding.
  • Alerting: notify the right channels by severity so teams can triage quickly.

Prerequisites

  • At least one cloud account or Kubernetes cluster connected with permissions for read/monitoring and (optionally) remediation.
  • Slack Integration, Microsoft Teams, or email destinations configured if you want outbound alerts in addition to in-app Notifications.
  • Optional: tags or filters ready if you plan to scope findings to specific environments.

Quick Start

1. Open CloudKeepers

Go to CloudKeepers to see the onboarding view. It walks you through three steps: connect a cloud account, enable keepers, and run your first detection scan. Click Enable Your First Keepers to begin.
[Screenshot: CloudKeepers onboarding page with Enable Your First Keepers CTA, three-step how-it-works timeline, and cost, security, and performance value cards]
2. Select and configure keepers

The setup wizard has two steps. In Select Keepers, choose which keepers to activate, filtering by provider (for example, AWS or Kubernetes) or pillar (Cost, Security, Performance). In Review & Configure, fine-tune detection rules per keeper, set the autonomy level (Suggest, Approve, or Autonomous), and adjust which rules are enabled.
[Screenshot: Two-step setup wizard showing keeper selection grid on the left and per-keeper rule review with autonomy level toggles on the right]
3. Review the dashboard

Once keepers are enabled, select one from the sidebar to see its Dashboard tab. Four stat cards — Open Findings, Critical & High, Potential Savings, and This Week — give you a quick pulse. The Findings Over Time chart breaks down trends by severity.
[Screenshot: AWS Cost Optimization dashboard with stat cards for open findings, critical and high, potential savings, and this week count, plus a findings over time chart]
4. Triage findings

Switch to the Findings tab to see a Kanban board with columns for Pending, In Progress, Implemented, and Ignored. Each finding card shows the title, estimated savings, effort level, and risk severity. Click a card to drill into details, or drag it between columns to update its status.
[Screenshot: Findings Kanban board with a pending finding card showing 30 unattached EBS volumes, $55.20 savings, effort low, risk medium]
5. Review detection runs

The Runs tab shows every detection run with its status, summary, duration, and how many findings were created or updated. Use this as an audit trail to verify keepers are running on schedule.
[Screenshot: Runs tab showing a completed detection run with 30 detections from 6 rules, 57-second duration, and 1 new finding]
6. Configure keeper settings

In the Settings tab, set the cron schedule (default: daily at 07:00 UTC), and toggle individual detection rules on or off. Each rule shows a description of what it detects and supports per-rule autonomy and threshold configuration.
[Screenshot: Settings tab showing cron schedule editor and a list of 10 detection rules with toggle switches for idle compute, unattached storage, old snapshots, and more]
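In standard five-field cron syntax, the default schedule of daily at 07:00 UTC is written `0 7 * * *`. The Python sketch below computes the next run for that kind of daily expression using only the standard library. It handles only the `M H * * *` shape and is not a general cron parser; the `next_daily_run` function is a hypothetical helper, and whether the product accepts exactly this syntax is an assumption.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_CRON = "0 7 * * *"  # minute hour day-of-month month day-of-week


def next_daily_run(cron: str, now: datetime) -> datetime:
    """Return the next UTC run time for a daily 'M H * * *' expression.

    `now` must be timezone-aware.
    """
    minute, hour, day_of_month, month, day_of_week = cron.split()
    is_daily = (day_of_month, month, day_of_week) == ("*", "*", "*")
    if not (is_daily and minute.isdigit() and hour.isdigit()):
        raise ValueError("this sketch only handles 'M H * * *' expressions")
    candidate = now.astimezone(timezone.utc).replace(
        hour=int(hour), minute=int(minute), second=0, microsecond=0
    )
    if candidate <= now:
        candidate += timedelta(days=1)  # today's slot has passed
    return candidate


now = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
print(next_daily_run(DEFAULT_CRON, now).isoformat())
```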

How enforcement and drift detection work

  • Keepers run on the cron schedules you set (default: daily at 7 AM UTC) or on-demand to scan all permitted resources for cost, security, and performance risk — not limited to what you previously discovered.
  • Each detection run produces an audit trail in the Runs tab showing status, timing, findings created/updated/closed, and any errors.
  • Findings are tagged with pillar, severity (Low / Medium / High / Critical), effort, and estimated savings to prioritize the highest-value fixes.
  • Findings start as drafts; promote them to active recommendations, then save to Plan when you are ready for approvals, scheduling, and execution tracking.
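The tagging described above lends itself to a simple prioritization pass. The Python sketch below sorts findings by severity first and estimated savings second. The `Finding` class and the sort order are assumptions; the severity names follow the page (Critical appears on the dashboard's Critical & High card). The first sample finding reuses the values from the Kanban screenshot, and the other two are invented for illustration.

```python
from dataclasses import dataclass

# Lower rank sorts first.
SEVERITY_RANK = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class Finding:
    title: str
    pillar: str
    severity: str
    effort: str
    estimated_savings: float  # monthly, in dollars


def prioritize(findings):
    """Order findings by severity, then by largest estimated savings."""
    return sorted(
        findings,
        key=lambda f: (SEVERITY_RANK[f.severity], -f.estimated_savings),
    )


findings = [
    Finding("30 unattached EBS volumes", "Cost", "Medium", "Low", 55.20),
    Finding("Public S3 bucket", "Security", "High", "Low", 0.0),          # invented sample
    Finding("Idle compute instances", "Cost", "Medium", "Medium", 120.0),  # invented sample
]

for finding in prioritize(findings):
    print(finding.severity, finding.title)
```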

Finding statuses

Findings move through a Kanban workflow:
Status | Meaning
New | Just detected, awaiting triage
Acknowledged | Team is aware, not yet acting
Active | Remediation in progress
Resolved | Fix implemented and verified
Dismissed | Intentionally skipped; keeper will not re-flag
CloudKeepers is your daily operational guardrail. Assessment is a deeper, periodic evaluation and is not meant for day-to-day runs.
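The statuses in the table, and the rule that a dismissed finding is not re-flagged, can be sketched in Python as follows. The `record_detection` function and the string keys are hypothetical. Treating a repeat detection of any non-dismissed finding as an update is an assumption; the source does not say how a Resolved finding that recurs is handled.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    ACKNOWLEDGED = "Acknowledged"
    ACTIVE = "Active"
    RESOLVED = "Resolved"
    DISMISSED = "Dismissed"


def record_detection(findings: dict, key: str) -> str:
    """Apply one detection to the findings store and report what happened.

    `findings` maps a hypothetical resource-and-rule key to its Status.
    """
    current = findings.get(key)
    if current is None:
        findings[key] = Status.NEW
        return "created"
    if current is Status.DISMISSED:
        return "skipped"  # dismissed findings are not re-flagged
    return "updated"      # assumption: other statuses are refreshed in place


findings = {}
key = "vol-123:unattached-storage"
print(record_detection(findings, key))
findings[key] = Status.DISMISSED
print(record_detection(findings, key))
```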

Keeper settings

Each keeper has a dedicated Settings tab where you can:
  • Schedule: set a cron expression for automated runs (minimum 1-hour interval).
  • Detection rules: toggle individual rules, adjust their autonomy level (Suggest / Approve / Autonomous), and configure per-rule thresholds (e.g., idle CPU %, lookback days, snapshot max age).
  • Commands & permissions: manage which cloud commands each rule is allowed to execute, with per-command effects (Allow / Require Approval / Deny).
  • Notifications: configure Email, Slack, and Teams channels with per-channel minimum severity thresholds.
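The per-command effects could be modeled along these lines in Python. This is a sketch under stated assumptions: the `resolve_effect` function is hypothetical, the command names are borrowed from AWS IAM action naming purely for illustration (how the product names commands is not documented here), and denying commands that have no explicit entry is a conservative default chosen for the example.

```python
from enum import Enum


class Effect(Enum):
    ALLOW = "Allow"
    REQUIRE_APPROVAL = "Require Approval"
    DENY = "Deny"


def resolve_effect(permissions: dict, command: str) -> Effect:
    """Look up the effect for a command; unlisted commands are denied (assumption)."""
    return permissions.get(command, Effect.DENY)


# Hypothetical permissions for an unattached-storage rule.
permissions = {
    "ec2:DeleteVolume": Effect.ALLOW,
    "ec2:TerminateInstances": Effect.REQUIRE_APPROVAL,
}

for command in ("ec2:DeleteVolume", "ec2:TerminateInstances", "rds:DeleteDBInstance"):
    print(command, "->", resolve_effect(permissions, command).value)
```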

Remediation playbooks

  • Every finding includes an impact analysis with before/after estimates and a step-by-step playbook.
  • Use Impact Analytics for deeper analysis, Generate Guidelines for shareable runbooks, Custom Prompt to explore edge cases, or Implement to execute changes.
  • Track status and outcomes in Plan so governance, FinOps, and security teams share the same source of truth.

Alerting and routing

  • Set per-channel minimum severities to keep noise low while still surfacing critical issues quickly.
  • Slack: real-time triage with action links back to CloudThinker.
  • Email: audit trails with workspace-aware links.
  • Teams: team-channel delivery with severity filtering.
  • In-app Notifications are always delivered regardless of channel settings.
  • Combine alerting with Plan workflows to ensure findings get reviewed, approved, and closed.
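The routing rules above reduce to a threshold comparison per channel. The Python sketch below assumes a four-level severity scale and a hypothetical `channels_for` function; the channel names and minimum severities in the example are illustrative settings, not defaults.

```python
# Severities from least to most urgent (assumed ordering).
SEVERITY_ORDER = ["Low", "Medium", "High", "Critical"]


def channels_for(severity: str, channel_minimums: dict) -> list:
    """Return the destinations that should receive a finding of this severity."""
    level = SEVERITY_ORDER.index(severity)
    targets = ["in-app"]  # in-app notifications are always delivered
    for channel, minimum in channel_minimums.items():
        if level >= SEVERITY_ORDER.index(minimum):
            targets.append(channel)
    return targets


# Illustrative per-channel minimum severities.
minimums = {"slack": "High", "email": "Low", "teams": "Medium"}

print(channels_for("Medium", minimums))
print(channels_for("Critical", minimums))
```

Raising a channel's minimum is how noise is kept low: in this example Slack only receives High and Critical findings, while email keeps a complete record.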

What’s Next

  • Plan: save findings to Plan for approvals, scheduling, and execution tracking
  • Assessment: run deeper periodic Well-Architected assessments alongside daily CloudKeepers runs
  • Slack Integration: route CloudKeepers alerts to Slack channels for real-time triage
  • Recurring Tasks: schedule additional recurring analysis to complement CloudKeepers