# CloudThinker

> Autonomous Cloud Operations — AI agents that manage infrastructure, review code, resolve incidents, and optimize costs across multi-clouds and Kubernetes.

<img src="https://mintcdn.com/cloudthinker/nVesu4UXaGtxkRRZ/images/platform/hero-banner.webp?fit=max&auto=format&n=nVesu4UXaGtxkRRZ&q=85&s=e4c3875aa9b8b6f859faf3759fa318ee" alt="CloudThinker AI agent orchestrating cloud operations: incidents resolved, PRs reviewed, costs optimized, security remediated" style={{width: '100%', height: 'auto', borderRadius: '12px'}} width="2864" height="1460" data-path="images/platform/hero-banner.webp" />

## Start here

Six concrete first tasks. Each takes 5–10 minutes and ends in a real result you can verify.

<CardGroup cols={2}>
  <Card title="Connect AWS" icon="aws" href="/guide/connections/aws">
    Add your first AWS account with an IAM role and see your resources discovered automatically
  </Card>

  <Card title="Run your first cost analysis" icon="dollar-sign" href="/guide/tutorial/cloudkeepers">
    Find idle resources, oversized instances, and unused commitments — with projected monthly savings
  </Card>

  <Card title="Set up code review" icon="code-pull-request" href="/guide/tutorial/code-review">
    Connect a Git repository and get AI review comments on the next pull request
  </Card>

  <Card title="Investigate an incident" icon="triangle-exclamation" href="/guide/tutorial/incident-response">
    Wire Pulse to your monitoring and let agents form hypotheses, gather evidence, and propose remediation
  </Card>

  <Card title="Invite your team" icon="user-plus" href="/guide/workspace-users">
    Add members, assign roles, and grant per-workspace access
  </Card>

  <Card title="Configure approvals" icon="circle-check" href="/guide/approval">
    Decide which agent actions run on their own, which need a click, and who gets to click
  </Card>
</CardGroup>

***

## Choose your goal

Pick the outcome you want next. Each goal maps to one module with a guided path.

<CardGroup cols={2}>
  <Card title="Spend less" icon="piggy-bank" href="/guide/cost-optimization/overview">
    **CostOps** — continuous spend audit across AWS, Azure, and GCP with rightsizing recommendations and approval-gated remediation
  </Card>

  <Card title="Ship safer" icon="shield-check" href="/guide/code-review/overview">
    **Code Review Agent** — every PR reviewed with context from running infrastructure, past incidents, and your team's conventions
  </Card>

  <Card title="Resolve incidents faster" icon="bolt" href="/guide/incident/overview">
    **Deep Response Engine** — Pulse strips noise from monitoring; agents investigate the rest and run approved runbooks
  </Card>

  <Card title="Assess your cloud posture" icon="magnifying-glass-chart" href="/guide/infrastructure/assessment">
    **Assessment** — Well-Architected analysis across resources and pillars, on demand
  </Card>

  <Card title="Automate recurring ops" icon="repeat" href="/guide/automation/autonomous-agents">
    **Autonomous agents + skills** — encode your runbooks, conventions, and policies so the loop runs without restating them
  </Card>
</CardGroup>

***

## Core concepts

What to learn once and use everywhere.

<CardGroup cols={3}>
  <Card title="Agents" icon="user-group" href="/guide/agents">
    Anna orchestrates; Alex, Oliver, Tony, and Kai specialize in cloud, security, databases, and Kubernetes
  </Card>

  <Card title="CloudThinker Language" icon="code" href="/guide/language">
    `@agent #tool` syntax — who you're asking, what shape of output, what to do
  </Card>

  <Card title="Connections" icon="plug" href="/guide/connections/overview">
    Cloud providers, observability, databases, ticketing, chat — 50+ built-in and MCP connections
  </Card>

  <Card title="Approvals & autonomy" icon="circle-check" href="/guide/auto-mode">
    Four autonomy levels — notify → suggest → approve → autonomous — gated by RBAC
  </Card>

  <Card title="Operations Hub" icon="grid" href="/guide/operations-hub">
    325+ pre-built operations spanning cost, security, performance, and Kubernetes
  </Card>

  <Card title="Knowledge & memory" icon="brain" href="/guide/knowledge">
    Investigations, decisions, and runbooks feed back into every future loop
  </Card>
</CardGroup>

***

## How CloudThinker works

Every module runs the same four-phase loop — **Detect → Analyze → Resolve → Validate** — under your approval policy. Agents detect signals from your environment, analyze them into a plan, execute the resolution under your autonomy ceiling, then validate the outcome and write it back into [memory](/guide/knowledge) for the next iteration.

The human stays **on the loop, not in every step**. You set the goal and the autonomy ceiling; the agent runs; you intervene when judgment matters. The four [autonomy levels](/guide/auto-mode) — **notify → suggest → approve → autonomous** — are gated by RBAC, so the policy you write is the policy that runs.
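
As a rough illustration of how an autonomy ceiling can gate the Resolve phase, here is a minimal Python sketch of one pass through the loop. Everything in it is hypothetical and not part of CloudThinker's API: the `Autonomy` enum, `LoopResult`, and `run_loop` are illustrative names, and the behavior assigned to each level (notify reports only, suggest proposes a plan, approve waits for a human, autonomous runs directly) is an assumed reading of the level names. RBAC checks are left out for brevity.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List, Optional


class Autonomy(IntEnum):
    """The four autonomy levels, ordered from least to most autonomous."""
    NOTIFY = 0
    SUGGEST = 1
    APPROVE = 2
    AUTONOMOUS = 3


@dataclass
class LoopResult:
    signal: str
    plan: str
    executed: bool
    note: str


def run_loop(
    signal: str,
    ceiling: Autonomy,
    approved: bool = False,
    memory: Optional[List[LoopResult]] = None,
) -> LoopResult:
    """One Detect -> Analyze -> Resolve -> Validate pass (hypothetical sketch)."""
    # Detect: the signal is handed in by the caller in this sketch.
    # Analyze: turn the signal into a plan (a placeholder string here).
    plan = f"remediate: {signal}"

    # Resolve: the autonomy ceiling decides whether the plan actually runs.
    if ceiling == Autonomy.AUTONOMOUS:
        executed, note = True, "executed without approval"
    elif ceiling == Autonomy.APPROVE:
        executed = approved
        note = "executed after approval" if approved else "awaiting approval"
    elif ceiling == Autonomy.SUGGEST:
        executed, note = False, "plan suggested, not executed"
    else:
        executed, note = False, "signal reported only"

    # Validate: record the outcome so later passes can consult it.
    result = LoopResult(signal, plan, executed, note)
    if memory is not None:
        memory.append(result)
    return result


if __name__ == "__main__":
    history: List[LoopResult] = []
    outcome = run_loop("idle EC2 instance", Autonomy.APPROVE, memory=history)
    print(outcome.note)  # awaiting approval
```

The point of the sketch is that the ceiling is checked in one place, so the same plan produces a different outcome depending only on the policy you set.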

This is the **AgenticOps** category. Where DevOps automated pipelines and AIOps applied ML to observability, AgenticOps introduces autonomous agents that *operate* infrastructure directly. The [field guide](/learn/aio/overview) covers the full reference architecture, the L0–L4 autonomy spectrum, and the governance discipline behind it.

***

## The six modules

<CardGroup cols={2}>
  <Card title="Code Review Agent" icon="code-pull-request" href="/guide/code-review/overview">
    AI review on every PR with context from running infrastructure, [past incidents](/guide/incident/incident-memory), and [team conventions](/guide/code-review/convention-rules). Inline comments, reproduction steps, suggested patches.
  </Card>

  <Card title="Deep Response Engine" icon="triangle-exclamation" href="/guide/incident/overview">
    [Pulse](/guide/pulse/overview) suppresses \~98% of monitoring noise. When something escalates, agents form hypotheses, gather evidence, and run approved [runbooks](/guide/incident/runbooks). MTTR under five minutes for common failure modes.
  </Card>

  <Card title="CostOps Agent" icon="dollar-sign" href="/guide/infrastructure/cloudkeepers">
    Continuous spend audit across [AWS](/guide/connections/aws), [Azure](/guide/connections/azure), and [GCP](/guide/connections/gcp). Idle resources, oversized instances, unused commitments — surfaced with projected savings and [approval-gated](/guide/approval) remediation.
  </Card>

  <Card title="SecOps Agent" icon="shield-halved">
    <span style={{display: 'inline-block', padding: '2px 8px', marginBottom: '8px', fontSize: '0.7em', fontWeight: 600, background: 'rgba(9, 170, 170, 0.12)', color: '#09AAAA', border: '1px solid rgba(9, 170, 170, 0.35)', borderRadius: '999px', textTransform: 'uppercase', letterSpacing: '0.5px'}}>Research Preview</span>

    Continuous configuration assessment and [vulnerability scans](/guide/security/overview) across cloud, container, and IaC layers. Findings ranked by exploitability; fixes opened as pull requests.
  </Card>

  <Card title="ChatOps" icon="comments" href="/guide/slack-integration">
    Agents operate inside Slack, [Microsoft Teams](/guide/teams-integration), and the CLI. Query infrastructure, approve actions, and review changes without leaving your workflow.
  </Card>

  <Card title="Team Memory" icon="brain" href="/guide/knowledge">
    Persistent multi-layer memory captures [investigations](/guide/incident/incident-memory), decisions, [runbooks](/guide/incident/runbooks), and resolved tickets. Knowledge compounds across the team instead of leaving with the engineer who wrote it.
  </Card>
</CardGroup>

***

## Why this matters

A typical engineering team works across 8–12 specialized platforms (Cost Explorer, Security Hub, Datadog, kubectl, Terraform, GitHub, PagerDuty), none of which share state. Every new cloud service expands the surface to monitor without expanding the team that monitors it.

| Failure mode            | What it looks like in practice                                                                 |
| ----------------------- | ---------------------------------------------------------------------------------------------- |
| **Tool sprawl**         | Eight dashboards open during an incident, four showing partial views of the same system        |
| **Alert fatigue**       | Most pages are noise; engineers triage by gut feel because no one can audit every notification |
| **Reactive cost**       | Bills land monthly; by the time waste is visible, it has already been paid for                 |
| **Visibility ≠ action** | Dashboards surface problems but require a human to interpret, prioritize, and execute the fix  |

Adding another dashboard fixes none of these. The [Agentic Infrastructure Operations field guide](/learn/aio/overview) explains why, and describes the architecture, governance, and adoption discipline that do.
