
Netdata AI: Alert Assistant & Infrastructure Insights

Netdata AI assists with both immediate alert response and strategic infrastructure analysis. It helps you understand incidents quickly and synthesizes high-resolution metrics into actionable intelligence.

It combines real-time alert assistance for emergency troubleshooting with strategic infrastructure insights for long-term planning, so you can both respond to incidents as they happen and make informed decisions about your infrastructure's future.

Two Complementary Approaches

Netdata AI serves different moments in your engineering workflow through two distinct but complementary capabilities:

| Aspect | Real-time Alert Assistant | Strategic Insights Reports |
| --- | --- | --- |
| Primary Use | Immediate incident response | Strategic planning & analysis |
| When You Use It | During active alerts | Post-incident, planning sessions |
| Mindset | "The building is on fire" | "Let's understand and plan better" |
| Response Time | Instant contextual help | 2-3 minutes for comprehensive analysis |
| Scope | Single alert or immediate issue | Infrastructure-wide trends and patterns |
| Output | Quick explanations and next steps | Detailed reports with embedded visualizations, downloadable as PDFs, shareable via email |
| Typical Scenario | 3 AM emergency response | Monday morning incident review |

tip

These serve fundamentally different moments in an engineer's workflow. The Assistant is for high-stress situations when you need immediate context, while Insights Reports are for when you have time to think strategically about your infrastructure's health and future needs.

note

Netdata Insights is currently in beta as a research preview:

  • Available in Netdata Cloud for Business users and Free Trial participants
  • Works with any infrastructure where you've deployed Netdata agents
  • No additional configuration or new pipelines required
  • Everyone can generate 10 reports for free
  • Community users can get early access via Discord or by emailing product@netdata.cloud

Real-time Alert Assistant

Get immediate context and guidance the moment an alert fires, when you need it most.

The Assistant provides instant explanations and troubleshooting steps directly within your alert workflow, helping you understand what's happening without leaving the Netdata interface.

| Feature | Benefit |
| --- | --- |
| Follows Your Workflow | The Assistant window stays with you as you navigate through Netdata dashboards during your troubleshooting process. |
| Works at Any Hour | Especially valuable during after-hours emergencies when you might not have team support available. |
| Contextual Knowledge | Combines Netdata's community expertise with the power of large language models to provide relevant advice. |
| Time-Saving | Eliminates the need for searches across multiple documentation sources or community forums. |
| Non-Intrusive | Provides helpful guidance without taking control away from you - you remain in charge of the troubleshooting process. |

Using the Alert Assistant

Accessing the Assistant

  1. Navigate to the Alerts tab.

  2. If there are active alerts, the Actions column will have an Assistant button.

  3. Click the Assistant button to open a floating window with tailored troubleshooting insights.

  4. If there are no active alerts, you can still access the Assistant from the Alert Configuration view.

Understanding Assistant Information

When you open the Assistant, you'll see:

  1. Alert Context: Explanation of what the alert means and why it's occurring

  2. Troubleshooting Steps: Recommended actions to address the issue

  3. Importance Level: Context on how critical this alert is for your system

  4. Resource Links: Curated documentation and external resources for further investigation

Real-World Alert Response

Here's how the Alert Assistant helps in a critical situation:

Strategic Insights Reports

Generate comprehensive infrastructure analysis that synthesizes days, weeks, or months of high-resolution data into actionable intelligence for strategic decision-making.

Insights Reports transform your raw telemetry data into structured narratives that help you understand trends, plan capacity, optimize performance, and conduct thorough post-incident analysis.

Four Types of Strategic Analysis

| Report Type | What It Provides | Key Capabilities | Best Used For |
| --- | --- | --- | --- |
| Infrastructure Summary | Complete timeline of incidents, performance changes, and system behavior | What happened: Timeline reconstruction; Impact assessment: Affected services; Current status: Action items | Weekend incident recovery, executive updates, team handoffs |
| Capacity Planning | Data-driven projections with concrete recommendations | Trend analysis: Resource utilization patterns; Bottleneck prediction: Inflection-point dates; Scaling recommendations: Hardware suggestions | Quarterly planning, budget justification, infrastructure roadmaps |
| Performance Optimization | Synthesized analysis of system inefficiencies and improvement opportunities | Contention patterns: Resource conflicts; Optimization opportunities: Tuning recommendations; Impact prioritization: Biggest improvement areas | Performance debugging, system tuning, SRE optimization projects |
| Anomaly Analysis | Context-aware detection and explanation of unusual infrastructure behavior | Pattern recognition: Abnormal behavior; Root cause analysis: Why anomalies occurred; Trend correlation: Cross-infrastructure connections | Post-incident analysis, proactive issue detection, system health assessment |

tip

Request the analysis you need and get a comprehensive report that turns months of raw telemetry into clear, actionable intelligence, so you can make better infrastructure decisions faster.
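
To make the "inflection-point dates" idea in the Capacity Planning report more concrete, here is a minimal sketch of a trend projection. It assumes a simple least-squares line fitted to daily utilization samples. This page does not describe how Insights actually computes its projections, so the function name `project_exhaustion`, its parameters, and the sample data are all hypothetical.

```python
import math
from datetime import date, timedelta

def project_exhaustion(daily_usage, start, limit=100.0):
    """Estimate the date a linear trend crosses `limit`.

    daily_usage: one utilization sample (percent) per day, oldest first.
    start: calendar date of the first sample.
    Returns None when there are too few samples or usage is not growing.
    Hypothetical helper: a plain linear fit, not Netdata's method.
    """
    n = len(daily_usage)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(daily_usage) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(daily_usage))
    den = sum((i - mean_x) ** 2 for i in range(n))
    slope = num / den                      # percent per day
    if slope <= 0:
        return None                        # flat or shrinking: no crossing
    intercept = mean_y - slope * mean_x    # fitted value on day 0
    days_to_limit = (limit - intercept) / slope
    return start + timedelta(days=math.ceil(days_to_limit))

# Illustrative data: disk usage growing 2 points per day from 50%.
usage = [50.0, 52.0, 54.0, 56.0, 58.0]
forecast = project_exhaustion(usage, date(2024, 1, 1))
print(forecast)
```

A real projection would likely need to handle seasonality and step changes. The straight line is used here only because it keeps the arithmetic easy to follow.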

How Netdata Insights Works

The system combines three key components to deliver infrastructure intelligence at scale:

1. Data Pipeline

Your Netdata agents continue collecting metrics every second, storing them locally as they always have. When you request analysis, Insights queries relevant time ranges across your infrastructure, pulling raw metrics, events, and anomaly detection results.
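
This page does not document how Insights queries agents internally. As a rough illustration of what "querying a time range" can look like, the sketch below builds a request URL for the Netdata agent's public `/api/v1/data` endpoint. The helper name `build_data_query`, the host, and the chosen chart are assumptions for illustration, and no request is actually sent.

```python
from urllib.parse import urlencode

def build_data_query(host, chart, seconds_back, points):
    """Build a URL for an agent's /api/v1/data endpoint (nothing is sent).

    Hypothetical helper; 19999 is the agent's default port.
    """
    params = {
        "chart": chart,           # chart id, e.g. "system.cpu"
        "after": -seconds_back,   # negative values are relative to now
        "points": points,         # number of data points to return
        "group": "average",       # how raw samples are aggregated per point
        "format": "json",
    }
    return f"http://{host}:19999/api/v1/data?{urlencode(params)}"

# Last hour of CPU data, aggregated to 60 points.
url = build_data_query("localhost", "system.cpu", 3600, 60)
print(url)
```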

2. Context Compression

Raw telemetry data is compressed into structured context bundles that include:

  • Statistical summaries (percentiles, trends, correlation coefficients)
  • Detected anomalies with confidence scores and affected metrics
  • Event timelines (alerts, deployments, configuration changes)
  • Cross-node correlations and dependency mappings
  • Historical baselines for comparison
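
The statistical summaries in the list above can be sketched in a few lines. This is a minimal illustration of reducing raw samples to percentiles, a trend slope, and a correlation coefficient. It is not Netdata's actual bundle format: the function names, dictionary keys, and sample values are all assumptions.

```python
import math

def percentile(values, q):
    """Linearly interpolated percentile, with q in [0, 100]."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def slope(values):
    """Least-squares slope per sample; 0.0 when fewer than two samples."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den

def pearson(xs, ys):
    """Pearson correlation of two equal-length series; 0.0 if either is constant."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def summarize(name, values):
    """Compress one metric's samples into a small summary (hypothetical shape)."""
    return {
        "metric": name,
        "p50": percentile(values, 50),
        "p95": percentile(values, 95),
        "trend_per_sample": slope(values),
    }

# Illustrative samples only.
cpu = [10.0, 20.0, 30.0, 40.0, 50.0]
load = [1.0, 2.0, 3.0, 4.0, 5.0]
summary = summarize("system.cpu", cpu)
cpu_load_corr = pearson(cpu, load)
print(summary, cpu_load_corr)
```

A few numbers per metric are far smaller than the per-second samples they describe, which is the point of compressing before handing context to a language model.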

3. AI Analysis

Advanced language models process the compressed context to generate structured reports with natural-language explanations, relevant visualizations, and actionable recommendations.

important

Your infrastructure data is processed to generate your reports and then discarded. We never use it for training or model improvement.

What's Coming Next

The goal is to build an autonomous debugging partner, not just another chatbot: a system that scales human decision-making using all the information Netdata already collects about your infrastructure.

For details on upcoming features and our product roadmap, read our full announcement on the Netdata blog.

