summarize - Aggregations
Note
The summarize clause aggregates data into single results or groups - with critical rules: all calculations must be inline, you cannot reference variables within the same clause, and you cannot use compute after summarize.
Overview
The summarize clause is how you aggregate data in NQL - reducing many rows to one (or to groups). It's essential for creating metrics, trends, and analytics.
What summarize does:
- Reduces data: many rows → one row (or groups of rows)
- Calculates metrics: sums, averages, counts, min/max
- Groups data: by properties, time periods, or both
- Creates dashboard KPIs and trends
CRITICAL Rules for summarize
- Cannot reference variables within same
summarizeclause - must repeat calculations - Cannot use
computeaftersummarize- all calculations must be inline - Cannot group by numeric data types - integers, datetime, or byte fields
Basic Syntax
/* Single aggregation (no grouping) */
| summarize <metric> = <field>.<function>()
/* Grouped by property */
| summarize <metric> = <field>.<function>() by <field>
/* Grouped by time period */
| summarize <metric> = <field>.<function>() by <time_period>
/* Combined grouping */
| summarize <metric> = <field>.<function>() by <field>, <time_period>
Four Summarize Patterns
Pattern 1: Simple Aggregation (No Grouping)
Reduces all data to a single result.
/* Total number of devices */
devices during past 7d
| summarize total_devices = count()
/* Returns: total_devices: 142 */
Pattern 2: Grouped by Property
Creates one row per unique value of the grouping field(s).
/* Device count by operating system */
devices during past 7d
| summarize device_count = count() by operating_system.name
/* Returns multiple rows: */
/* operating_system.name | device_count */
/* Windows 11 | 85 */
/* Windows 10 | 42 */
/* macOS | 15 */
Pattern 3: Grouped by Time Period
Creates one row per time bucket for trend analysis.
/* Daily crash trend */
execution.crashes during past 7d
| summarize total_crashes = count() by 1d
| sort start_time asc
/* Returns 7 rows (one per day) */
Pattern 4: Combined Grouping (Property + Time)
Creates one row per unique combination of property and time bucket.
/* Daily crashes by OS */
execution.crashes during past 30d
| summarize crash_count = count() by 1d, device.operating_system.platform
| sort total_crashes desc
/* Returns: rows for each (day, OS) combination */
Time Period Grouping
Valid Time Periods
| Period Type | Valid Values | Example | Use Case |
|---|---|---|---|
| Minutes | 15min, 30min, 45min | by 15min | High-resolution short-term trends |
| Hours | 1h, 2h, 3h, ... | by 1h | Recent detailed analysis |
| Days | 1d, 2d, 3d, ... | by 1d | Daily trends, week-over-week |
| Weeks | 1w | by 1w | Weekly aggregation |
Time Period Best Practices
- Use hourly (
by 1h) for troubleshooting recent issues - Use daily (
by 1d) for dashboard KPIs and weekly trends - Use 15min only for very short time windows (past few hours)
- Always sort by
start_time ascfor chronological order
Time Grouping Examples
Hourly trend (high resolution):
execution.crashes during past 48h
| summarize crash_count = count() by 1h
| sort start_time asc
/* Returns ~48 rows (one per hour) */
Daily trend (standard):
execution.crashes during past 30d
| summarize crash_count = count() by 1d
| sort start_time asc
/* Returns 30 rows (one per day) */
Combined with property:
web.page_views during past 7d
| where application.name == "Confluence"
| summarize avg_load_time = page_load_time.backend.avg() by 1d, device.name
| sort start_time asc
Property Grouping
Single Field Grouping
/* Crash count by application */
execution.crashes during past 7d
| summarize crash_count = count() by application.name
| sort crash_count desc
Multiple Field Grouping
/* Crash count by application AND version */
execution.crashes during past 7d
| summarize crash_count = count() by application.name, binary.real_binary.version
| sort crash_count desc
Cannot Group by Numeric Types
You cannot group by: - Integer fields - DateTime fields - Byte fields
Multiple Aggregations
You can calculate multiple metrics in a single summarize clause:
execution.events during past 7d
| where binary.name == "outlook.exe"
| summarize
avg_cpu = cpu_time.avg(),
max_memory = real_memory.max(),
unique_devices = device.count(),
unique_users = user.count()
With grouping:
execution.events during past 7d
| summarize
avg_cpu = cpu_time.avg(),
avg_memory = real_memory.avg(),
device_count = device.count()
by binary.name
| sort avg_cpu desc
CRITICAL: Inline Calculations
You CANNOT Reference Variables in Same Summarize Clause
All calculations must be inline - you cannot use one variable to calculate another within the same summarize.
❌ WRONG - Referencing Variables
/* ERROR - Cannot reference total_failures within same clause! */
connection.events during past 7d
| summarize
total_failures = (number_of_no_host_connections.sum() + number_of_no_service_connections.sum()),
total_attempts = (number_of_established_connections.sum() + number_of_no_host_connections.sum()),
failure_rate = total_failures / total_attempts # ERROR!
✅ CORRECT - Repeat Full Calculation
/* Must repeat the full calculation inline */
connection.events during past 7d
| summarize
total_failures = (number_of_no_host_connections.sum() + number_of_no_service_connections.sum()),
total_attempts = (number_of_established_connections.sum() + number_of_no_host_connections.sum()),
failure_rate = (number_of_no_host_connections.sum() + number_of_no_service_connections.sum())
/ (number_of_established_connections.sum() + number_of_no_host_connections.sum())
/* Repeat the entire calculation - cannot reference variables */
Why This Limitation Exists
Variables in summarize exist only for output/display purposes. They cannot be used as intermediate variables within the clause.
The query optimizer handles repeated calculations efficiently, so there's no performance penalty.
CRITICAL: Cannot Use compute After summarize
compute CANNOT Follow summarize
Once you use summarize, you cannot use compute afterward. All calculations must be inline within the summarize clause.
❌ WRONG - compute After summarize
/* ERROR - Cannot use compute after summarize! */
execution.events during past 7d
| summarize
total_cpu = cpu_time.sum(),
total_duration = execution_duration.sum()
| compute cpu_percent = (total_cpu * 100) / total_duration # ERROR!
✅ CORRECT - Calculation Inside summarize
/* All calculations inline within summarize */
execution.events during past 7d
| summarize
cpu_percent = ((cpu_time.sum()) * 100) / (execution_duration.sum() * number_of_logical_processors.max())
Real-World Examples
Example: Device Count Summary
Scenario: Simple count of all devices.
Result: Single number (e.g., 142 devices)
Example: Crash Analysis by Application
Scenario: Find applications with the most crashes.
execution.crashes during past 7d
| summarize
crash_count = count(),
affected_devices = device.name.count(),
affected_users = user.name.count()
by application.name
| where crash_count > 10
| list application.name, crash_count, affected_devices, affected_users
| sort crash_count desc
| limit 20
Explanation:
- Groups by application name
- Calculates 3 metrics: total crashes, unique devices, unique users
- Filters for apps with >10 crashes
- Sorts to show worst offenders first
Example: Daily Crash Trend
Scenario: Track crashes over time to identify patterns.
execution.crashes during past 30d
| where binary.binary.name == "outlook.exe"
| summarize
total_crashes = count(),
affected_devices = device.name.count()
by 1d
| sort start_time asc
Explanation:
- Groups by day (
by 1d) - Creates 30 rows (one per day)
- Tracks both crash count and unique devices
- Sorts chronologically for trend visualization
Example: Combined Grouping (Property + Time)
Scenario: Track crashes by OS platform over time.
execution.crashes during past 30d
| summarize
crash_count = count(),
device_count = device.name.count()
by 1d, device.operating_system.platform
| list start_time,
device.operating_system.platform,
crash_count,
device_count
| sort crash_count desc
Explanation:
- Groups by BOTH day and OS platform
- Creates rows for each (day, platform) combination
- Enables comparing trends across different platforms
Example: CPU Usage Analysis with Inline Calculation
Scenario: Calculate CPU percentage properly with inline formula.
/* Production-tested example */
execution.events during past 7d
| where binary.name in ["sensedlpprocessor.exe", "mssense.exe"]
| summarize
total_processes_started = number_of_started_processes.sum(),
total_cpu_time = cpu_time.sum(),
cpu_usage_percent = ((cpu_time.sum()) * 100) / ((execution_duration.sum()) * (number_of_logical_processors.max()))
by binary.name
| sort cpu_usage_percent desc
Explanation:
- Groups by binary name
- Calculates CPU percentage inline (cannot reference total_cpu_time!)
- Uses proper formula with max() for processors
Example: Memory Utilization by Device
Scenario: Monitor memory usage across devices.
execution.events during past 7d
| where binary.name in ["sensedlpprocessor.exe", "mssense.exe", "mpdlpservice.exe"]
| summarize
avg_memory_used = real_memory.avg(),
last_memory_used = real_memory.last(),
process_count = binary.name.count()
by device.name
| where avg_memory_used > 500MB
| list device.name,
avg_memory_used.as(format = bytes),
last_memory_used.as(format = bytes),
process_count
| sort avg_memory_used desc
| limit 100
Common Patterns
Pattern: Top N by Metric
execution.events during past 7d
| summarize avg_cpu = cpu_time.avg() by device.name
| sort avg_cpu desc
| limit 20 # Top 20 devices by CPU
Pattern: Percentage Calculation
/* Inline percentage calculation */
devices during past 7d
| include execution.crashes during past 7d
| compute has_crashes = device.count()
| summarize
total_devices = count(),
devices_with_crashes = has_crashes.sum(),
crash_percentage = (has_crashes.sum() * 100.0) / count()
Pattern: Time-Series with Threshold
device_performance.events during past 48h
| where cpu_usage.avg > 80
| summarize high_cpu_count = count() by 1h
| sort start_time asc
Pattern: Multi-Dimensional Analysis
execution.events during past 7d
| summarize
avg_cpu = cpu_time.avg(),
avg_memory = real_memory.avg(),
execution_count = count()
by device.name, binary.name
| where execution_count > 100
| sort avg_cpu desc
Tips & Tricks
Sort Time-Series Chronologically
Always sort by start_time asc for time-series queries:
Filter After Aggregation
You can use where AFTER summarize to filter aggregated results:
count() Without Field Name
count() without a field name counts records:
Time Period Can Be Anywhere in by List
Common Mistake: Grouping by Numeric Field
Common Mistake: Multiple Time Periods
Common Mistake: Using .avg Instead of .avg()
Comparing compute vs summarize
Understanding when to use compute vs summarize:
| Aspect | compute | summarize |
|---|---|---|
| Purpose | Per-object calculations | Aggregate/reduce data |
| Output | One row per object (device/user) | One row total OR one row per group |
| Requires | Must follow with or include | Can be used alone |
| Use for | Individual device metrics | Totals, averages, counts across many |
| Example | "CPU per device" | "Average CPU across all devices" |
Example showing the difference:
/* compute - Per-device crash count */
devices during past 7d
| include execution.crashes during past 7d
| compute crash_count = count() # One value per device
| list device.name, crash_count
/* Returns: 142 rows (one per device) */
/* summarize - Total crashes across all devices */
execution.crashes during past 7d
| summarize total_crashes = count() # Single aggregate
/* Returns: 1 row (total) */
/* summarize with grouping - Crashes per device */
execution.crashes during past 7d
| summarize crash_count = count() by device.name
/* Returns: ~15 rows (only devices WITH crashes) */
Related Topics
- Aggregation Functions - Available functions (avg, sum, count, etc.)
- compute - Calculations - Per-object calculations
- where - Filtering Data - Filtering before and after aggregation
- sort - Ordering Results - Sorting aggregated results
- with vs include - Event joins before aggregation
Additional Resources
- NQL Syntax Cheat Sheet - Quick summarize reference
- Common Query Templates - Aggregation templates
- Query Performance Guide - Optimizing aggregations
- Common Error Messages - Troubleshooting summarize errors