
Topology

The Topology Explorer provides interactive visualization of your cloud infrastructure and service relationships. Build topology maps manually, let agents discover them, or import from Infrastructure as Code.

Overview

Topology maps help you:
  • Visualize relationships between cloud resources
  • Understand dependencies across services
  • Support incident response with visual context
  • Enable root cause analysis (RCA) by tracing connections
  • Document architecture for team knowledge sharing

Building Topology

Let CloudThinker agents automatically discover and map your infrastructure.
@alex discover and map infrastructure topology for production
@kai map Kubernetes service dependencies
@alex build topology from AWS account including all VPCs
Benefits:
  • Automatic resource discovery
  • Real-time relationship mapping
  • Continuous sync with infrastructure changes

Resource Types

The Topology Explorer supports the following cloud resource types:
Category     Resources
Compute      EC2, Lambda, ECS, EKS, VMs, Cloud Run
Networking   VPC, Load Balancers, CloudFront, API Gateway
Database     RDS, Aurora, DynamoDB, Cloud SQL
Storage      S3, EFS, EBS, Cloud Storage
Security     IAM Roles, Security Groups, ACM Certificates
Kubernetes   Clusters, Deployments, Services, Pods

Using Topology for Incident Response

During an incident, topology maps support root cause analysis, impact analysis, and real-time status tracking:

Root Cause Analysis (RCA)

@alex use topology to trace the impact of RDS outage
@kai show all services affected by the failing pod
@anna coordinate incident response using infrastructure map

Impact Analysis

Visualize blast radius and affected services:
@alex show downstream dependencies of payment-service
@kai map all services connected to the database cluster
@oliver identify security exposure paths in topology
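
Blast radius can be read as a reachability question over the dependency graph: which services depend, directly or transitively, on the failed resource? The sketch below illustrates that idea only. The `depends_on` map, the service names, and the `blast_radius` helper are hypothetical and do not describe how CloudThinker agents compute impact internally.

```python
from collections import deque


def blast_radius(depends_on, failed):
    """Return every service that directly or transitively depends on `failed`."""
    # Invert the edges: for each resource, record who depends on it.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk outward from the failed resource.
    affected = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in affected:
                affected.add(service)
                queue.append(service)
    affected.discard(failed)  # only relevant if the graph contains a cycle
    return affected


# Hypothetical dependency map: service -> resources it depends on.
depends_on = {
    "checkout-ui": ["payment-service"],
    "payment-service": ["orders-db"],
    "reporting-job": ["orders-db"],
    "search-api": [],
}

print(sorted(blast_radius(depends_on, "orders-db")))
# ['checkout-ui', 'payment-service', 'reporting-job']
```

The size of the returned set corresponds to the kind of "downstream services" count that an impact view would report.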

Real-Time Status

During incidents, topology shows:
  • Health status of each resource
  • Connection states between services
  • Error propagation paths
  • Recovery progress visualization

Views and Filters

Load View

Access saved topology views from the Load View dropdown.

Filter Resources

Use the search and filter panel to:
  • Search by resource name or ID
  • Filter by resource type (EC2, RDS, EKS, etc.)
  • Filter by tags or metadata
  • Show/hide resource categories

Sync Status

The Synced indicator shows when the topology was last synchronized with your infrastructure.

Agent Integration

Agents use topology for enhanced analysis:
Agent    Topology Usage
Alex     Cost impact visualization, resource optimization paths
Oliver   Security exposure mapping, compliance visualization
Tony     Database dependency chains, performance bottlenecks
Kai      Service mesh visualization, pod relationships
Anna     Cross-service incident coordination, architecture reviews

Example Prompts

@alex analyze cost optimization opportunities using topology view
@oliver map security vulnerabilities across the infrastructure topology
@kai show Kubernetes service dependencies and potential single points of failure
@anna use topology to coordinate the database migration impact

Export Options

Export topology for documentation and sharing:
  • PNG/SVG - Static image export
  • PDF - Printable documentation
  • JSON - Machine-readable format
  • Share Link - Collaborative viewing

Real-World Use Cases

Production Outage Response

Scenario: Your payment service is down and customers can’t complete orders.
@alex show topology centered on payment-service with all dependencies
The topology reveals:
  • Payment service connects to RDS Aurora (primary database)
  • Aurora connects to ElastiCache (session cache)
  • ElastiCache shows unhealthy status ← Root cause identified
Resolution time: Minutes instead of hours, because the dependency chain can be traced visually.

Cloud Migration Planning

Scenario: Migrating from on-premises to AWS. Need to understand what moves together.
@alex build topology from our Terraform state and identify migration groups
@anna use topology to create migration waves based on dependencies
Outcome:
  • Wave 1: Stateless web services (low risk)
  • Wave 2: Application servers with database dependencies
  • Wave 3: Core databases with replication setup
  • Wave 4: Final cutover with traffic routing
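
One way waves like these could be derived from a dependency map is to layer the graph so that resources nothing else depends on move first, and the resources they rely on move later. The sketch below uses Python's standard `graphlib` module. The resource names and the `migration_waves` helper are hypothetical, and the dependents-first ordering is an assumption that mirrors the outcome above, not a documented agent behavior.

```python
from graphlib import TopologicalSorter


def migration_waves(depends_on):
    """Group resources into waves, moving dependents before their dependencies."""
    # predecessors[x] = resources that depend on x and therefore move before x.
    predecessors = {name: set() for name in depends_on}
    for name, deps in depends_on.items():
        for dep in deps:
            predecessors.setdefault(dep, set()).add(name)

    sorter = TopologicalSorter(predecessors)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies

    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything movable in this wave
        waves.append(ready)
        sorter.done(*ready)
    return waves


# Hypothetical dependency map: resource -> resources it depends on.
depends_on = {
    "web-frontend": ["app-server"],
    "static-site": [],
    "app-server": ["orders-db"],
    "orders-db": [],
}

waves = migration_waves(depends_on)
for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
```

With this input the stateless front ends land in the first wave, the application server in the second, and the database in the last. A final cutover step would still be planned separately.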

Security Incident Investigation

Scenario: Security alert - unusual traffic from an EC2 instance.
@oliver map all connections from instance i-0abc123 in topology
@oliver trace data flow paths that could expose sensitive data
Topology reveals:
  • Compromised instance has access to 3 S3 buckets
  • Connected to production RDS via security group
  • Blast radius: 12 downstream services
Action: Isolate instance, rotate credentials, audit all connected resources.

Cost Optimization Discovery

Scenario: Monthly AWS bill spiked 40%. Need to find the cause.
@alex overlay cost data on infrastructure topology
@alex highlight resources with >$500/month spend
Topology shows:
  • Orphaned load balancers with no targets: $180/month
  • Oversized RDS instance (db.r5.4xlarge) for dev: $2,400/month
  • Idle EKS node group running 24/7: $1,200/month
Savings identified: $3,780/month by visual inspection.

Compliance Audit Preparation

Scenario: SOC 2 audit next month. Need to document data flows.
@oliver generate topology showing all PII data paths
@oliver map encryption status for data at rest and in transit
Deliverables:
  • Visual data flow diagrams for auditors
  • Encryption coverage map (gaps highlighted in red)
  • Network segmentation proof
  • Access control visualization

Disaster Recovery Testing

Scenario: Validate DR plan before annual test.
@alex compare production topology with DR region topology
@alex identify resources missing from DR setup
Gaps found:
  • DR missing ElastiCache cluster
  • Lambda functions not replicated
  • S3 cross-region replication not enabled for 2 buckets
Action: Close these gaps before the DR test so the failover does not fail on missing resources.
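
Comparing two topologies comes down to a set difference per resource type. The sketch below shows that comparison on hand-written inventories. The resource names, the dictionary layout, and the `missing_in_dr` helper are hypothetical and are not the Topology Explorer's export format.

```python
def missing_in_dr(production, dr):
    """Map each resource type to the production resources absent from DR."""
    gaps = {}
    for resource_type, names in production.items():
        missing = set(names) - set(dr.get(resource_type, ()))
        if missing:
            gaps[resource_type] = sorted(missing)
    return gaps


# Hypothetical inventories: resource type -> set of resource names.
production = {
    "elasticache": {"session-cache"},
    "lambda": {"resize-image", "send-receipt"},
    "rds": {"orders-db"},
}
dr = {
    "lambda": set(),
    "rds": {"orders-db"},
}

gaps = missing_in_dr(production, dr)
for resource_type, names in gaps.items():
    print(f"{resource_type}: missing {', '.join(names)}")
```

Configuration-level gaps, such as replication that exists but is not enabled, need an attribute comparison on top of this name-level check.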

New Engineer Onboarding

Scenario: New team member needs to understand the architecture.
@anna create topology overview of our e-commerce platform
@alex annotate topology with service responsibilities
Result: An interactive architecture diagram in which new engineers can explore the platform, click resources to see details, and see how services connect.

Kubernetes Service Mesh Debugging

Scenario: Intermittent 503 errors in production.
@kai map service mesh topology with current health status
@kai show request flow from ingress to failing service
Topology reveals:
  • Ingress → API Gateway → Order Service → Inventory Service
  • Inventory Service pod: CrashLoopBackOff
  • Root cause: OOMKilled due to memory leak

Root Cause Analysis (RCA) for Errors

Scenario: Application throwing “Connection refused” errors intermittently.
@alex trace error path from web-app through topology
@tony correlate database connection errors with topology dependencies
Topology-driven RCA:
  1. Web App → Load Balancer → API Server → Database
  2. API Server shows healthy
  3. Database connection pool: Exhausted ← Root cause
  4. Upstream cause: Slow query holding connections
Resolution: Optimize slow query, increase connection pool, add connection timeout.

Performance Degradation Analysis

Scenario: API response times increased from 200ms to 2 seconds.
@alex analyze performance bottlenecks using topology view
@tony overlay latency metrics on service topology
Topology with metrics overlay:
User → CloudFront (5ms) → ALB (3ms) → API (50ms) → RDS (1800ms) ← Bottleneck
                                    ↘ ElastiCache (2ms)
Findings:
  • Database latency spiked from 20ms to 1800ms
  • Missing index on new query pattern
  • Table scan on 50M rows
Fix: Add a composite index. Response time returns to 200ms.
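
The bottleneck in a latency overlay is the hop that contributes the largest share of the end-to-end time. The sketch below applies that to the per-hop figures from the request path in this scenario. The ElastiCache side branch is left out, and the `find_bottleneck` helper is a hypothetical illustration, not a product API.

```python
def find_bottleneck(hops):
    """Return (name, latency_ms, share_of_total) for the slowest hop."""
    total = sum(latency for _, latency in hops)
    name, latency = max(hops, key=lambda hop: hop[1])
    return name, latency, latency / total


# Per-hop latencies in milliseconds, taken from the main request path.
hops = [("CloudFront", 5), ("ALB", 3), ("API", 50), ("RDS", 1800)]

name, latency_ms, share = find_bottleneck(hops)
print(f"{name}: {latency_ms}ms ({share:.0%} of the request path)")
# RDS: 1800ms (97% of the request path)
```

The hops sum to 1858ms, which is consistent with the roughly 2-second response time reported in the scenario.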

Cascading Failure Investigation

Scenario: Multiple services failing simultaneously.
@anna map failure propagation across topology
@alex identify the origin point of cascading failures
Topology timeline:
  1. T+0: Redis cluster failover triggered
  2. T+5s: Session service lost cache → returning errors
  3. T+10s: Auth service failing → can’t validate tokens
  4. T+15s: All downstream services rejecting requests
Root cause: Redis cluster hit its memory limit, which triggered an unexpected failover.
Prevention: Add memory alerts, implement circuit breakers, and add cache fallbacks.
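
When several services fail at once, a useful first cut is to look for failing resources whose own dependencies are all healthy, since those cannot be blamed on anything further upstream. The sketch below encodes one reading of the timeline above. The dependency edges, the `orders-api` name, and the `failure_origins` helper are assumptions made for illustration.

```python
def failure_origins(depends_on, failing):
    """Return failing resources that have no failing dependency of their own."""
    failing = set(failing)
    return sorted(
        service
        for service in failing
        if not failing.intersection(depends_on.get(service, ()))
    )


# Assumed dependency map: service -> resources it depends on.
depends_on = {
    "redis-cluster": [],
    "session-service": ["redis-cluster"],
    "auth-service": ["session-service"],
    "orders-api": ["auth-service"],  # hypothetical downstream service
}
failing = ["orders-api", "auth-service", "session-service", "redis-cluster"]

origins = failure_origins(depends_on, failing)
print(origins)
# ['redis-cluster']
```

If the function returns more than one resource, the failures likely have independent causes, and the timestamps in the timeline help order them.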

Memory Leak Detection

Scenario: Service restarts every few hours in production.
@kai correlate pod restarts with resource topology
@alex show memory trends for services in the request path
Topology + metrics:
  • Order Service: Memory growing 50MB/hour
  • Connected to: Message Queue, Database, Cache
  • Leak source: Unclosed database connections after queue processing
Resolution: Fix connection cleanup in queue consumer, add connection pool monitoring.

Network Latency Troubleshooting

Scenario: Cross-service calls timing out randomly.
@alex map network topology with latency annotations
@kai identify network bottlenecks between services
Topology reveals:
  • Services in different availability zones
  • NAT Gateway: Throughput limit reached
  • Cross-AZ latency: rises from 2ms to 200ms during peak
Solution: Co-locate dependent services, add NAT Gateway capacity.

Database Connection Issues

Scenario: “Too many connections” errors during peak traffic.
@tony map all services connecting to production database
@alex show connection counts per service in topology
Topology with connection metrics:
┌─────────────────────────────────────────┐
│            RDS PostgreSQL               │
│         Max connections: 500            │
│         Current: 487 (97%)              │
└─────────────────────────────────────────┘
     ↑           ↑           ↑
  API (200)  Worker (250)  Cron (37)
Issue: Worker service connection pool too large.
Fix: Right-size connection pools per service based on actual need.
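
The figures in the diagram can be checked with a few lines of arithmetic: total connections, utilization against the limit, and each service's share. The sketch below uses the numbers from this scenario. The `connection_report` helper is hypothetical and only summarizes counts; it does not query a database.

```python
def connection_report(max_connections, per_service):
    """Summarize connection usage against the database connection limit."""
    total = sum(per_service.values())
    return {
        "total": total,
        "utilization": total / max_connections,
        "share": {name: count / total for name, count in per_service.items()},
    }


# Connection counts per service, taken from the diagram above.
per_service = {"API": 200, "Worker": 250, "Cron": 37}

report = connection_report(500, per_service)
print(f"{report['total']} of 500 connections in use ({report['utilization']:.0%})")
for name, share in report["share"].items():
    print(f"  {name}: {share:.0%} of open connections")
```

The Worker service holds just over half of the open connections, which is why its pool is the first candidate for right-sizing.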