# Deep Response Engine Setup

> Connect your monitoring tools and let AI agents investigate incidents automatically — from alert to root cause analysis in minutes.

## What You'll Set Up

By the end of this tutorial, signals from your monitoring tools will flow through **Pulse** (which cuts \~98% of noise via deduplication, suppression, and AI classification), and any cluster that crosses the actionability bar will auto-escalate to an **Incident** — where agents correlate evidence across metrics, logs, traces, and [topology](/guide/infrastructure/topology) to identify root cause and suggest remediation.

For the upstream signal-source setup (AWS pollers, Slack, Teams, third-party webhooks), see [Pulse Setup](/guide/pulse/setup).

<Steps>
  <Step title="Navigate to Deep Response Engine Settings">
    Go to **Deep Response Engine** in your workspace. You'll see the incident dashboard and configuration options.
  </Step>

  <Step title="Connect Monitoring Tools via Webhooks">
    CloudThinker ingests alerts from 15+ monitoring platforms through [webhooks](/guide/webhooks/overview), including:

    | Platform                                  | Setup                                                 |
    | ----------------------------------------- | ----------------------------------------------------- |
    | **PagerDuty**                             | Add CloudThinker webhook URL as a service integration |
    | **Datadog**                               | Create a webhook notification in Monitors             |
    | **Prometheus / Alertmanager**             | Add webhook receiver configuration                    |
    | **AWS CloudWatch**                        | Route alarms through SNS to webhook                   |
    | **[Grafana](/guide/connections/grafana)** | Add webhook contact point                             |
    | **Opsgenie**                              | Configure webhook integration                         |
    | **New Relic**                             | Add webhook notification channel                      |
    | **Sentry**                                | Configure webhook integration for issues              |

    To connect:

    1. Go to **Deep Response Engine** > **Integrations**
    2. Select your monitoring platform
    3. Copy the generated **webhook URL**
    4. Paste it in your monitoring tool's webhook/notification settings
    5. Send a test alert to verify the connection

    <Note>
      See [Webhook Integrations](/guide/incident/webhook-integrations) for detailed setup instructions for each platform.
    </Note>
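
    If your monitoring tool has no built-in test button, the final step in the list above can be scripted. The sketch below builds a JSON `POST` for the webhook URL you copied. The URL and the payload fields (`title`, `severity`, `source`) are placeholders for illustration, not CloudThinker's documented schema, so check the platform-specific page for the real format.

    ```python
    import json
    import urllib.request

    # Placeholder: paste the webhook URL copied from the Integrations page.
    WEBHOOK_URL = "https://example.invalid/webhooks/your-generated-id"


    def build_test_alert(url: str, title: str, severity: str) -> urllib.request.Request:
        """Build (but do not send) a JSON POST carrying a sample alert.

        The field names are illustrative assumptions, not a documented schema.
        """
        payload = {"title": title, "severity": severity, "source": "manual-test"}
        body = json.dumps(payload).encode("utf-8")
        return urllib.request.Request(
            url,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )


    def send(request: urllib.request.Request, timeout: float = 10.0) -> int:
        """Send the request and return the HTTP status code."""
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status


    request = build_test_alert(WEBHOOK_URL, "Test alert: high CPU", "High")
    ```

    Calling `send(request)` performs the actual `POST`. A 2xx status suggests the endpoint accepted the alert, after which it should show up in the incident dashboard.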
  </Step>

  <Step title="Configure Alert Routing">
    Once webhooks are connected, configure how alerts are handled:

    * **Auto-investigate**: Automatically start AI investigation when an alert arrives (recommended)
    * **Severity mapping**: Map your monitoring tool's severity levels to CloudThinker's (Critical, High, Medium, Low)
    * **Deduplication**: Prevent duplicate incidents from related alerts
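
    To make severity mapping and deduplication concrete, here is a minimal sketch of both ideas. The source labels (`p1`, `warning`, and so on), the fallback to `Medium`, and the choice of `service` plus `check` as the identifying fields are assumptions for illustration. CloudThinker's own rules are configured in the UI and may work differently.

    ```python
    import hashlib

    # CloudThinker's four severity levels, as listed above.
    LEVELS = ("Critical", "High", "Medium", "Low")

    # Hypothetical source labels mapped onto those levels.
    SEVERITY_MAP = {
        "p1": "Critical",
        "p2": "High",
        "error": "High",
        "warning": "Medium",
        "info": "Low",
    }


    def map_severity(source_label: str, default: str = "Medium") -> str:
        """Translate a monitoring tool's label; unknown labels fall back to `default`."""
        return SEVERITY_MAP.get(source_label.strip().lower(), default)


    def fingerprint(alert: dict) -> str:
        """Hash the fields assumed to identify 'the same problem'."""
        key = "|".join(str(alert.get(field, "")) for field in ("service", "check"))
        return hashlib.sha256(key.encode("utf-8")).hexdigest()


    def deduplicate(alerts: list[dict]) -> list[dict]:
        """Keep the first alert per fingerprint and drop later repeats."""
        seen: set[str] = set()
        unique = []
        for alert in alerts:
            fp = fingerprint(alert)
            if fp not in seen:
                seen.add(fp)
                unique.append(alert)
        return unique


    alerts = [
        {"service": "web", "check": "cpu", "severity": "P1"},
        {"service": "web", "check": "cpu", "severity": "P1"},
        {"service": "db", "check": "disk", "severity": "warning"},
    ]
    unique = deduplicate(alerts)  # the repeated web/cpu alert is dropped
    ```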
  </Step>

  <Step title="Trigger Your First Incident">
    You have three options:

    * **Wait for a real alert**: Let your monitoring tools trigger a real incident
    * **Send a test webhook**: Use your monitoring tool's test feature to send a sample alert
    * **Log manually**: Go to **Deep Response Engine** > **Manual Logging** to create a test incident

    For your first run, manual logging lets you see the full investigation flow immediately:

    1. Click **New Incident**
    2. Describe the issue: "High CPU utilization on production web server"
    3. Set severity and affected resources
    4. Submit
  </Step>

  <Step title="Watch the AI Investigation">
    Once an incident is created, the AI agent starts a **hypothesis-driven investigation**:

    1. **Initial hypothesis**: Forms possible root causes based on the alert data
    2. **Evidence gathering**: Pulls metrics, logs, traces, configs, and recent deployments
    3. **Timeline correlation**: Maps events across systems onto a unified timeline
    4. **[Topology](/guide/infrastructure/topology) analysis**: Traces service dependencies to understand blast radius
    5. **[Root Cause Analysis](/guide/incident/root-cause-analysis)**: Narrows down to the most likely cause with a confidence score

    You can watch the investigation in real time as the agent works through each step.
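
    Timeline correlation is easiest to picture as a merge of per-source event streams. The sketch below interleaves three time-sorted lists into one timeline with `heapq.merge` from the standard library. The events and source names are invented for illustration, and this is not a description of how the agent implements the step.

    ```python
    import heapq
    from datetime import datetime, timezone


    def ts(text: str) -> datetime:
        """Parse an ISO 8601 timestamp and treat it as UTC."""
        return datetime.fromisoformat(text).replace(tzinfo=timezone.utc)


    # Hypothetical events per source, each list already sorted by time.
    deployments = [(ts("2024-05-01T10:00:00"), "deploy", "web v2.3 rolled out")]
    metrics = [
        (ts("2024-05-01T10:02:00"), "metrics", "CPU above 90% on web"),
        (ts("2024-05-01T10:06:00"), "metrics", "p99 latency above 2s"),
    ]
    logs = [(ts("2024-05-01T10:04:00"), "logs", "connection pool exhausted")]

    # heapq.merge lazily interleaves sorted inputs into one sorted stream.
    timeline = list(heapq.merge(deployments, metrics, logs))

    for when, source, message in timeline:
        print(f"{when:%H:%M} [{source}] {message}")
    ```

    Read in order, the merged timeline puts the deployment two minutes before the first CPU alert, which is the kind of ordering that makes a recent change a candidate root cause.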
  </Step>

  <Step title="Review the Root Cause Analysis">
    The completed investigation shows:

    * **Root cause**: The identified issue with its confidence score
    * **Evidence chain**: All data points that support the conclusion
    * **Blast radius**: Which services and users are affected
    * **Timeline**: Sequence of events leading to the incident
    * **Remediation**: Recommended actions to resolve and prevent recurrence

    The agent's reasoning is fully transparent — you can see every hypothesis it considered and why it was confirmed or ruled out.
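
    In graph terms, the blast radius of a failing component is every service that depends on it, directly or transitively. A minimal sketch, assuming a hypothetical dependency map (the service names and the `DEPENDS_ON` structure are made up for illustration), is a breadth-first walk over the inverted edges:

    ```python
    from collections import deque

    # Hypothetical topology: each service maps to the services it depends on.
    DEPENDS_ON = {
        "checkout": ["payments", "cart"],
        "cart": ["redis"],
        "payments": ["postgres"],
        "search": ["elasticsearch"],
    }


    def blast_radius(failed: str, depends_on: dict[str, list[str]]) -> set[str]:
        """Return every service that directly or transitively depends on `failed`."""
        # Invert the edges so we can walk from a dependency to its dependents.
        dependents: dict[str, list[str]] = {}
        for service, deps in depends_on.items():
            for dep in deps:
                dependents.setdefault(dep, []).append(service)

        affected: set[str] = set()
        queue = deque([failed])
        while queue:
            current = queue.popleft()
            for service in dependents.get(current, []):
                if service not in affected:
                    affected.add(service)
                    queue.append(service)
        return affected


    print(sorted(blast_radius("redis", DEPENDS_ON)))  # ['cart', 'checkout']
    ```

    Here a `redis` failure reaches `cart` and then `checkout`, while `search` is untouched because nothing on its dependency path is affected.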
  </Step>

  <Step title="Resolve and Learn">
    After resolving the incident:

    1. Mark the incident as **Resolved**
    2. The agent stores the investigation in its **[knowledge base](/guide/knowledge)**
    3. Future similar incidents benefit from learned patterns

    Over time, the system gets faster and more accurate at diagnosing issues it has seen before.
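
    How the knowledge base matches a new incident to past ones is not described here. As a toy illustration of the general idea only, the sketch below ranks stored incidents by Jaccard similarity of their summary words. The `KNOWLEDGE_BASE` contents and the matching method are assumptions, not the product's implementation.

    ```python
    def tokens(text: str) -> set[str]:
        """Lowercase a summary and split it into a set of words."""
        return set(text.lower().split())


    def jaccard(a: set[str], b: set[str]) -> float:
        """Size of the intersection divided by size of the union (0.0 to 1.0)."""
        union = a | b
        return len(a & b) / len(union) if union else 0.0


    # Hypothetical resolved incidents: summary -> recorded root cause.
    KNOWLEDGE_BASE = {
        "high cpu utilization on production web server": "runaway worker after deploy",
        "disk full on database replica": "log rotation disabled",
    }


    def most_similar(summary: str) -> tuple[str, float]:
        """Return the recorded root cause of the closest past incident and its score."""
        query = tokens(summary)
        best = max(KNOWLEDGE_BASE, key=lambda past: jaccard(query, tokens(past)))
        return KNOWLEDGE_BASE[best], jaccard(query, tokens(best))


    cause, score = most_similar("High CPU utilization on staging web server")
    print(cause, score)  # runaway worker after deploy 0.75
    ```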
  </Step>
</Steps>

***

## How Investigation Works

```
Alert arrives → AI forms hypotheses → Gathers evidence (metrics, logs, traces)
→ Correlates timeline → Analyzes topology → Identifies root cause
→ Recommends remediation → Learns from resolution
```

The entire flow — from alert to root cause — typically completes in under 5 minutes.

***

## Tips

* **Connect multiple monitoring tools**: The more data sources agents can access, the more accurate the [Root Cause Analysis](/guide/incident/root-cause-analysis)
* **Start with auto-investigate on**: Let the AI investigate every alert automatically — you can always tune later
* **Review dismissed hypotheses**: Understanding why the agent ruled out alternatives builds trust in its reasoning
* **Enable [Slack notifications](/guide/slack-integration)**: Route incident updates to your `#incidents` channel so the team stays informed
* **Combine with [CloudKeepers](/guide/infrastructure/cloudkeepers)**: Many incidents are preventable — CloudKeepers catch drift before it causes outages

***

## Tutorial Complete

You've now set up the core CloudThinker features end-to-end:

<CardGroup cols={2}>
  <Card title="VibeOps" icon="comments" href="/guide/tutorial/vibeops">
    Conversational cloud operations
  </Card>

  <Card title="Code Review" icon="code-pull-request" href="/guide/tutorial/code-review">
    AI-powered PR reviews
  </Card>

  <Card title="CloudKeepers" icon="radar" href="/guide/tutorial/cloudkeepers">
    Autonomous monitoring
  </Card>

  <Card title="Assessment" icon="magnifying-glass" href="/guide/tutorial/assessment">
    Infrastructure analysis
  </Card>
</CardGroup>

***

## What's Next

<CardGroup cols={3}>
  <Card title="Use Cases" icon="lightbulb" href="/guide/use-cases/actionable-dashboards">
    Real-world examples of CloudThinker in action
  </Card>

  <Card title="Connections" icon="plug" href="/guide/connections/overview">
    Add more cloud providers and integrations
  </Card>

  <Card title="Slack Integration" icon="slack" href="/guide/slack-integration">
    Run operations from Slack
  </Card>
</CardGroup>
