# Autonomous Agents

> Configure agents to work autonomously on continuous monitoring, optimization, and operations

CloudThinker agents can operate autonomously, continuously monitoring your infrastructure and taking action without requiring manual prompts. This enables 24/7 optimization, proactive [incident](/guide/incident/overview) response, and automated compliance enforcement.

***

## The Problem With Reactive Operations

Manual cloud management is inherently reactive. Teams respond to incidents after they occur, review costs monthly when invoices arrive, and run security audits when compliance deadlines force them. The window between when a problem starts and when someone acts on it is measured in hours or days, not seconds.

Even teams that set up monitoring alerts still require a human in the loop for every action: acknowledge the alert, log into the console, investigate, decide, execute. This creates:

* **On-call fatigue**: engineers interrupted at all hours for issues that could be auto-remediated
* **Accumulated drift**: small problems (unused resources, config drift) compound when nobody reviews them daily
* **Expertise bottleneck**: only specialists can act, so non-critical issues queue up indefinitely

***

## How Existing Automation Compares

| Tool                                   | What It Does                      | What's Missing                                     |
| -------------------------------------- | --------------------------------- | -------------------------------------------------- |
| **AWS Auto Scaling**                   | Scale compute based on metrics    | Single-service, no cross-domain intelligence       |
| **AWS Config Rules**                   | Detect compliance drift           | Remediation limited to predefined actions; no AI analysis |
| **AWS Lambda + EventBridge**           | Event-driven automation           | Requires custom code for every use case; no AI     |
| **Terraform + CI/CD**                  | Infrastructure-as-code automation | Deployment automation, not operational monitoring  |
| **Runbook automation (Ansible, etc.)** | Script-based operational tasks    | Pre-written scripts only; no adaptive AI reasoning |

Autonomous Agents go further: they apply AI reasoning to continuously monitor, analyze, and act, adapting to your specific environment instead of executing fixed scripts.

***

## How Autonomous Mode Works

<Steps>
  <Step title="Configure">
    Define the scope, schedule, and permissions for autonomous operations. Specify which actions agents can take without approval.
  </Step>

  <Step title="Monitor">
    Agents continuously scan your infrastructure on the configured schedules, using the same reasoning they apply in interactive conversations.
  </Step>

  <Step title="Analyze">
    When agents identify issues or opportunities, they analyze the situation and determine appropriate actions.
  </Step>

  <Step title="Act or Alert">
    Based on your [approval](/guide/approval) settings, agents either take action automatically or create recommendations for your review.
  </Step>

  <Step title="Report">
    All autonomous activities are logged and reported, with [notifications](/guide/notifications) sent through your configured channels.
  </Step>
</Steps>
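
The steps above can be pictured as one pass of a control loop. The following is a minimal Python sketch under stated assumptions, not CloudThinker's implementation: `Finding`, `CycleReport`, `run_cycle`, and the `scan` and `execute` callables are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Finding:
    """A hypothetical issue or opportunity surfaced by a scan."""
    resource: str
    summary: str
    proposed_action: str


@dataclass
class CycleReport:
    """What one pass produced: executed actions and pending recommendations."""
    executed: List[str] = field(default_factory=list)
    recommended: List[str] = field(default_factory=list)


def run_cycle(scan: Callable[[], List[Finding]],
              execute: Callable[[Finding], None],
              auto_execute: bool) -> CycleReport:
    """Run one Monitor -> Analyze -> Act or Alert -> Report pass."""
    report = CycleReport()
    for finding in scan():                  # Monitor + Analyze
        if auto_execute:                    # Act
            execute(finding)
            report.executed.append(finding.proposed_action)
        else:                               # Alert: queue for human review
            report.recommended.append(finding.proposed_action)
    return report                           # Report
```

With `auto_execute=False` the loop never calls `execute`, which mirrors a recommendation-only setup: every finding ends up in `recommended` for review.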

***

## Autonomous Capabilities by Agent

<Tabs>
  <Tab title="Alex (Cloud Engineer)">
    **Continuous Cost Optimization**

    * Monitor resource utilization and identify right-sizing opportunities
    * Detect unattached volumes and unused resources
    * Track spending anomalies and alert on budget risks
    * Generate daily/weekly cost recommendations

    **Infrastructure Monitoring**

    * Scan for configuration drift
    * Monitor resource health and availability
    * Track infrastructure changes across accounts
  </Tab>

  <Tab title="Oliver (Security)">
    **Security Monitoring**

    * Continuous security group audit for violations
    * IAM policy analysis for privilege escalation risks
    * Public resource detection (S3, RDS, etc.)
    * Compliance drift monitoring

    **Threat Detection**

    * Monitor CloudTrail for suspicious activities
    * Detect unusual access patterns
    * Alert on security misconfigurations
  </Tab>

  <Tab title="Kai (Kubernetes)">
    **Cluster Optimization**

    * Monitor pod resource utilization
    * Detect pods without resource limits
    * Identify node scaling opportunities
    * Track cluster health metrics

    **Workload Management**

    * Monitor deployment health
    * Detect failed pods and restart loops
    * Track resource quota utilization
  </Tab>

  <Tab title="Tony (Database)">
    **Performance Monitoring**

    * Track slow query patterns
    * Monitor connection pool utilization
    * Detect index optimization opportunities
    * Alert on performance degradation

    **Capacity Planning**

    * Storage growth monitoring
    * Connection limit tracking
    * Backup verification
  </Tab>

  <Tab title="Anna (General Manager)">
    **Orchestration**

    * Coordinate multi-agent autonomous tasks
    * Aggregate findings across agents
    * Generate consolidated reports
    * Manage cross-functional projects
  </Tab>
</Tabs>
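
As one concrete illustration of the checks listed in the tabs, detecting pods without resource limits comes down to inspecting each container's `resources.limits` field in the Pod spec. The sketch below works on plain manifest dictionaries; the function name is hypothetical, and it does not describe how Kai performs the check internally.

```python
from typing import Any, Dict, List


def containers_without_limits(pod: Dict[str, Any]) -> List[str]:
    """Return names of containers in a Pod manifest that set no resource limits."""
    missing = []
    for container in pod.get("spec", {}).get("containers", []):
        limits = (container.get("resources") or {}).get("limits")
        if not limits:
            missing.append(container["name"])
    return missing


pod = {
    "metadata": {"name": "web"},
    "spec": {
        "containers": [
            {"name": "app",
             "resources": {"limits": {"cpu": "500m", "memory": "256Mi"}}},
            {"name": "sidecar"},  # no resources block at all
        ]
    },
}
print(containers_without_limits(pod))  # ['sidecar']
```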

***

## Configuring Autonomous Operations

### Agent Connections

Configure which resources and accounts each agent can access autonomously:

1. Navigate to **Settings > Agent Connections** in your workspace
2. Select the agent to configure
3. Define the connection scope (accounts, regions, resource types)
4. Set access level (read-only or read-write)
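
To make the scope and access-level settings concrete, here is a small Python model of a connection. It is a sketch only: `ConnectionScope` and its fields are hypothetical names, not CloudThinker's configuration schema.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ConnectionScope:
    """Hypothetical model of one agent's connection scope."""
    accounts: FrozenSet[str]
    regions: FrozenSet[str]
    resource_types: FrozenSet[str]
    read_write: bool = False  # read-only unless explicitly widened

    def allows(self, account: str, region: str, resource_type: str,
               write: bool = False) -> bool:
        """True if the operation falls inside the scope and access level."""
        in_scope = (account in self.accounts
                    and region in self.regions
                    and resource_type in self.resource_types)
        return in_scope and (self.read_write or not write)


scope = ConnectionScope(
    accounts=frozenset({"123456789012"}),
    regions=frozenset({"us-east-1"}),
    resource_types=frozenset({"ec2"}),
)
print(scope.allows("123456789012", "us-east-1", "ec2"))              # True
print(scope.allows("123456789012", "us-east-1", "ec2", write=True))  # False
```

Defaulting `read_write` to `False` keeps a new connection read-only until someone widens it deliberately.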

### Scheduling

Define when autonomous operations run:

<AccordionGroup>
  <Accordion title="Continuous Monitoring">
    Agents monitor in near real-time, checking for issues as they occur:

    * Security violations
    * Resource health changes
    * Cost anomalies
  </Accordion>

  <Accordion title="Scheduled Scans">
    Configure periodic comprehensive scans:

    * Daily: Cost analysis, resource inventory
    * Weekly: Security audit, compliance check
    * Monthly: [Well-Architected](/guide/infrastructure/assessment) assessment
  </Accordion>

  <Accordion title="Event-Driven">
    Trigger agent actions based on events:

    * New resource creation
    * Configuration changes
    * Alert conditions
  </Accordion>
</AccordionGroup>
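
For scheduled scans, the core calculation is finding the next run time. The sketch below shows that arithmetic for daily and weekly schedules with Python's standard `datetime` module. The function names are illustrative, the datetimes are naive (no time zone handling), and CloudThinker's own scheduler may work differently.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now: datetime, at: time) -> datetime:
    """Return the next occurrence of a daily scan scheduled at `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_weekly_run(now: datetime, weekday: int, at: time) -> datetime:
    """Return the next weekly scan; weekday uses Monday=0 through Sunday=6."""
    days_ahead = (weekday - now.weekday()) % 7
    candidate = datetime.combine(now.date() + timedelta(days=days_ahead), at)
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


now = datetime(2024, 1, 1, 7, 0)              # a Monday, 07:00
print(next_daily_run(now, time(6, 0)))        # 2024-01-02 06:00:00
print(next_weekly_run(now, 2, time(6, 0)))    # 2024-01-03 06:00:00
```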

### Builtin Tools

Configure which tools agents can use autonomously:

| Tool          | Description                     | Default           |
| ------------- | ------------------------------- | ----------------- |
| **Analyze**   | Read and analyze resources      | Enabled           |
| **Report**    | Generate reports and dashboards | Enabled           |
| **Recommend** | Create recommendations          | Enabled           |
| **Alert**     | Send notifications              | Enabled           |
| **Execute**   | Make infrastructure changes     | Requires approval |
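
The defaults in the table can be read as a simple gate: four tools run freely, and **Execute** waits for approval. A minimal Python sketch of that gate follows; `ToolDefault`, `TOOL_DEFAULTS`, and `may_use` are hypothetical names, and only the tool names and defaults come from the table.

```python
from enum import Enum


class ToolDefault(Enum):
    ENABLED = "enabled"
    REQUIRES_APPROVAL = "requires approval"


# Defaults copied from the table above.
TOOL_DEFAULTS = {
    "Analyze": ToolDefault.ENABLED,
    "Report": ToolDefault.ENABLED,
    "Recommend": ToolDefault.ENABLED,
    "Alert": ToolDefault.ENABLED,
    "Execute": ToolDefault.REQUIRES_APPROVAL,
}


def may_use(tool: str, approved: bool = False) -> bool:
    """True if an agent may use the tool right now under the defaults."""
    default = TOOL_DEFAULTS[tool]  # unknown tool names raise KeyError
    return default is ToolDefault.ENABLED or approved


print(may_use("Analyze"))                 # True
print(may_use("Execute"))                 # False
print(may_use("Execute", approved=True))  # True
```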

***

## Approval Settings

Control what actions require human approval:

### Approval Levels

<CardGroup cols={3}>
  <Card title="Full Autonomy" icon="robot">
    Agent can take all configured actions without approval
  </Card>

  <Card title="Recommend Only" icon="clipboard-check">
    Agent creates recommendations but doesn't execute changes
  </Card>

  <Card title="Alert Only" icon="bell">
    Agent monitors and alerts but takes no action
  </Card>
</CardGroup>

### Action-Specific Approvals

Configure approval requirements by action type:

* **Read operations**: Typically no approval needed
* **Recommendations**: Auto-create or require review
* **Configuration changes**: Usually require approval
* **Cost-impacting changes**: Configurable thresholds

<Warning>
  Always start with "Recommend Only" mode and gradually increase autonomy as you build confidence in agent behavior.
</Warning>
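
One way to picture how the approval level and a cost threshold combine is a single decision function. This is a sketch under assumptions: the enum mirrors the three levels above, but `decide`, its return labels, and the threshold semantics (gating on the absolute monthly cost impact) are illustrative choices, not documented CloudThinker behavior.

```python
from enum import Enum


class ApprovalLevel(Enum):
    FULL_AUTONOMY = "Full Autonomy"
    RECOMMEND_ONLY = "Recommend Only"
    ALERT_ONLY = "Alert Only"


def decide(level: ApprovalLevel, monthly_cost_impact: float,
           cost_threshold: float) -> str:
    """Return 'execute', 'request_approval', 'recommend', or 'alert'."""
    if level is ApprovalLevel.ALERT_ONLY:
        return "alert"
    if level is ApprovalLevel.RECOMMEND_ONLY:
        return "recommend"
    # Full Autonomy: still gate changes whose cost impact exceeds the threshold.
    if abs(monthly_cost_impact) > cost_threshold:
        return "request_approval"
    return "execute"


print(decide(ApprovalLevel.RECOMMEND_ONLY, 50.0, 100.0))  # recommend
print(decide(ApprovalLevel.FULL_AUTONOMY, 50.0, 100.0))   # execute
print(decide(ApprovalLevel.FULL_AUTONOMY, 500.0, 100.0))  # request_approval
```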

***

## Monitoring Autonomous Activity

### Activity Log

All autonomous agent actions are logged with:

* Timestamp and duration
* Agent and action type
* Resources affected
* Outcome and any errors
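
If you export or post-process these entries, a record with the fields above might look like the following. The field names and JSON shape are assumptions made for illustration; the actual log format may differ.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ActivityLogEntry:
    """Hypothetical shape of one autonomous-activity log record."""
    timestamp: str            # ISO 8601, UTC
    duration_seconds: float
    agent: str
    action_type: str
    resources: List[str]
    outcome: str
    error: Optional[str] = None


entry = ActivityLogEntry(
    timestamp=datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc).isoformat(),
    duration_seconds=42.5,
    agent="Alex",
    action_type="cost_analysis",
    resources=["vol-0abc", "i-0def"],   # made-up resource IDs
    outcome="success",
)
print(json.dumps(asdict(entry), indent=2))
```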

### Dashboard View

Monitor autonomous operations from your workspace dashboard:

* Active autonomous tasks
* Recent completions
* Pending approvals
* Error alerts

### Notifications

Configure notifications for autonomous activities:

```text theme={null}
# Example notification configuration
Notify via Slack when:
- Agent creates a high-priority recommendation
- Agent encounters an error
- Agent completes a scheduled scan
- Agent requests approval for an action
```
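
The four conditions above behave like a filter over agent events. The Python sketch below expresses that filter and builds a simple text payload for matching events. The event names, `NOTIFY_ON`, and `slack_message` are hypothetical labels, and no message is actually sent.

```python
from typing import Dict, Optional

# Hypothetical event names for the four conditions listed above.
NOTIFY_ON = {
    "high_priority_recommendation",
    "agent_error",
    "scheduled_scan_completed",
    "approval_requested",
}


def slack_message(event_type: str, agent: str,
                  detail: str) -> Optional[Dict[str, str]]:
    """Return a message payload for events that should notify, else None."""
    if event_type not in NOTIFY_ON:
        return None
    return {"text": f"[{agent}] {event_type}: {detail}"}


print(slack_message("agent_error", "Oliver", "IAM audit timed out"))
print(slack_message("low_priority_recommendation", "Alex", "resize i-0abc"))
```

The second call returns `None` because low-priority recommendations are not in the notification set.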

***

## Background Jobs

CloudThinker runs various background processes to support autonomous operations:

### System Jobs

| Job                 | Frequency    | Purpose                                      |
| ------------------- | ------------ | -------------------------------------------- |
| **Cloud Sync**      | Configurable | Sync resource inventory from cloud providers |
| **Token Refresh**   | Automatic    | Maintain valid credentials for integrations  |
| **Quota Reset**     | Daily        | Reset usage quotas for the new day           |
| **Credit Rollover** | Monthly      | Process billing credits                      |
| **Cleanup**         | Periodic     | Archive old data, clean temporary files      |

### Agent Jobs

| Job                   | Frequency    | Purpose                                       |
| --------------------- | ------------ | --------------------------------------------- |
| **Cost Analysis**     | Daily        | Analyze spending and generate recommendations |
| **Security Scan**     | Configurable | Check for security issues                     |
| **Health Check**      | Continuous   | Monitor resource health                       |
| **Report Generation** | Scheduled    | Generate periodic reports                     |

***

## Best Practices

<AccordionGroup>
  <Accordion title="Start Conservative">
    Begin with read-only operations and recommendation-only mode. Gradually increase autonomy based on agent performance and your comfort level.
  </Accordion>

  <Accordion title="Define Clear Boundaries">
    Specify exactly which resources and actions are in scope. Use tags and account boundaries to limit agent access.
  </Accordion>

  <Accordion title="Review Regularly">
    Periodically review autonomous activity logs and recommendations. Adjust configurations based on patterns.
  </Accordion>

  <Accordion title="Configure Alerting">
    Set up notifications for important events, errors, and approval requests. Don't let critical items go unnoticed.
  </Accordion>

  <Accordion title="Test in Non-Production">
    Test autonomous configurations in development/staging environments before applying to production.
  </Accordion>
</AccordionGroup>

***

## Example Configurations

### Cost-Focused Autonomous Setup

```yaml theme={null}
Agent: Alex
Mode: Recommend Only
Schedule:
  - Daily at 6 AM: Full cost analysis
  - Continuous: Spending anomaly detection
Scope: All AWS accounts, production tag
Actions:
  - Generate cost recommendations (auto)
  - Create alerts for budget risks (auto)
  - Execute savings (requires approval)
```

### Security-Focused Autonomous Setup

```yaml theme={null}
Agent: Oliver
Mode: Alert + Recommend
Schedule:
  - Every 6 hours: Security group scan
  - Daily: IAM policy audit
  - Weekly: Full compliance check
Scope: All accounts, all regions
Actions:
  - Alert on critical findings (auto)
  - Create remediation recommendations (auto)
  - Block public access (requires approval)
```

<Card title="Configure Approval Workflows" icon="clipboard-check" href="/guide/approval">
  Set up approval workflows for autonomous agent actions
</Card>
