summarize - Aggregations

Note

The summarize clause aggregates data into single results or groups - with critical rules: all calculations must be inline, you cannot reference variables within the same clause, and you cannot use compute after summarize.

Overview

The summarize clause is how you aggregate data in NQL - reducing many rows to one (or to groups). It's essential for creating metrics, trends, and analytics.

What summarize does:

Reduces data: many rows → one row (or groups of rows)
Calculates metrics: sums, averages, counts, min/max
Groups data: by properties, time periods, or both
Creates dashboard KPIs and trends

CRITICAL Rules for summarize

Cannot reference variables within same summarize clause - must repeat calculations
Cannot use compute after summarize - all calculations must be inline
Cannot group by numeric data types - integers, datetime, or byte fields

Basic Syntax

/* Single aggregation (no grouping) */
| summarize <metric> = <field>.<function>()

/* Grouped by property */
| summarize <metric> = <field>.<function>() by <field>

/* Grouped by time period */
| summarize <metric> = <field>.<function>() by <time_period>

/* Combined grouping */
| summarize <metric> = <field>.<function>() by <field>, <time_period>

Four Summarize Patterns

Pattern 1: Simple Aggregation (No Grouping)

Reduces all data to a single result.

/* Total number of devices */
devices during past 7d
| summarize total_devices = count()
/* Returns: total_devices: 142 */

Pattern 2: Grouped by Property

Creates one row per unique value of the grouping field(s).

/* Device count by operating system */
devices during past 7d
| summarize device_count = count() by operating_system.name
/* Returns multiple rows: */
/* operating_system.name | device_count */
/* Windows 11             | 85 */
/* Windows 10             | 42 */
/* macOS                  | 15 */

Pattern 3: Grouped by Time Period

Creates one row per time bucket for trend analysis.

/* Daily crash trend */
execution.crashes during past 7d
| summarize total_crashes = count() by 1d
| sort start_time asc
/* Returns 7 rows (one per day) */

Pattern 4: Combined Grouping (Property + Time)

Creates one row per unique combination of property and time bucket.

/* Daily crashes by OS */
execution.crashes during past 30d
| summarize crash_count = count() by 1d, device.operating_system.platform
| sort total_crashes desc
/* Returns: rows for each (day, OS) combination */

Time Period Grouping

Valid Time Periods

Period Type	Valid Values	Example	Use Case
Minutes	15min, 30min, 45min	`by 15min`	High-resolution short-term trends
Hours	1h, 2h, 3h, ...	`by 1h`	Recent detailed analysis
Days	1d, 2d, 3d, ...	`by 1d`	Daily trends, week-over-week
Weeks	1w	`by 1w`	Weekly aggregation

Time Period Best Practices

Use hourly (by 1h) for troubleshooting recent issues
Use daily (by 1d) for dashboard KPIs and weekly trends
Use 15min only for very short time windows (past few hours)
Always sort by start_time asc for chronological order

Time Grouping Examples

Hourly trend (high resolution):

execution.crashes during past 48h
| summarize crash_count = count() by 1h
| sort start_time asc
/* Returns ~48 rows (one per hour) */

Daily trend (standard):

execution.crashes during past 30d
| summarize crash_count = count() by 1d
| sort start_time asc
/* Returns 30 rows (one per day) */

Combined with property:

web.page_views during past 7d
| where application.name == "Confluence"
| summarize avg_load_time = page_load_time.backend.avg() by 1d, device.name
| sort start_time asc

Property Grouping

Single Field Grouping

/* Crash count by application */
execution.crashes during past 7d
| summarize crash_count = count() by application.name
| sort crash_count desc

Multiple Field Grouping

/* Crash count by application AND version */
execution.crashes during past 7d
| summarize crash_count = count() by application.name, binary.real_binary.version
| sort crash_count desc

Cannot Group by Numeric Types

You cannot group by: - Integer fields - DateTime fields - Byte fields

/* ❌ WRONG - Cannot group by integer */
devices
| summarize count = count() by days_since_last_seen  # ERROR

/* ✅ CORRECT - Filter first, then group by non-numeric field */
devices
| where days_since_last_seen < 7
| summarize count = count() by device.name

Multiple Aggregations

You can calculate multiple metrics in a single summarize clause:

execution.events during past 7d
| where binary.name == "outlook.exe"
| summarize
    avg_cpu = cpu_time.avg(),
    max_memory = real_memory.max(),
    unique_devices = device.count(),
    unique_users = user.count()

With grouping:

execution.events during past 7d
| summarize
    avg_cpu = cpu_time.avg(),
    avg_memory = real_memory.avg(),
    device_count = device.count()
  by binary.name
| sort avg_cpu desc

CRITICAL: Inline Calculations

You CANNOT Reference Variables in Same Summarize Clause

All calculations must be inline - you cannot use one variable to calculate another within the same summarize.

❌ WRONG - Referencing Variables

/* ERROR - Cannot reference total_failures within same clause! */
connection.events during past 7d
| summarize
    total_failures = (number_of_no_host_connections.sum() + number_of_no_service_connections.sum()),
    total_attempts = (number_of_established_connections.sum() + number_of_no_host_connections.sum()),
    failure_rate = total_failures / total_attempts  # ERROR!

✅ CORRECT - Repeat Full Calculation

/* Must repeat the full calculation inline */
connection.events during past 7d
| summarize
    total_failures = (number_of_no_host_connections.sum() + number_of_no_service_connections.sum()),
    total_attempts = (number_of_established_connections.sum() + number_of_no_host_connections.sum()),
    failure_rate = (number_of_no_host_connections.sum() + number_of_no_service_connections.sum())
                 / (number_of_established_connections.sum() + number_of_no_host_connections.sum())
    /* Repeat the entire calculation - cannot reference variables */

Why This Limitation Exists

Variables in summarize exist only for output/display purposes. They cannot be used as intermediate variables within the clause.

The query optimizer handles repeated calculations efficiently, so there's no performance penalty.

CRITICAL: Cannot Use compute After summarize

compute CANNOT Follow summarize

Once you use summarize, you cannot use compute afterward. All calculations must be inline within the summarize clause.

❌ WRONG - compute After summarize

/* ERROR - Cannot use compute after summarize! */
execution.events during past 7d
| summarize
    total_cpu = cpu_time.sum(),
    total_duration = execution_duration.sum()
| compute cpu_percent = (total_cpu * 100) / total_duration  # ERROR!

✅ CORRECT - Calculation Inside summarize

/* All calculations inline within summarize */
execution.events during past 7d
| summarize
    cpu_percent = ((cpu_time.sum()) * 100) / (execution_duration.sum() * number_of_logical_processors.max())

Real-World Examples

Example: Device Count Summary

Scenario: Simple count of all devices.

devices during past 7d
| summarize total_devices = count()

Result: Single number (e.g., 142 devices)

Example: Crash Analysis by Application

Scenario: Find applications with the most crashes.

execution.crashes during past 7d
| summarize
crash_count = count(),
affected_devices = device.name.count(),
affected_users = user.name.count()
  by application.name
| where crash_count > 10
| list application.name, crash_count, affected_devices, affected_users
| sort crash_count desc
| limit 20

Explanation:

Groups by application name
Calculates 3 metrics: total crashes, unique devices, unique users
Filters for apps with >10 crashes
Sorts to show worst offenders first

Example: Daily Crash Trend

Scenario: Track crashes over time to identify patterns.

execution.crashes during past 30d
| where binary.binary.name == "outlook.exe"
| summarize
total_crashes = count(),
affected_devices = device.name.count()
  by 1d
| sort start_time asc

Explanation:

Groups by day (by 1d)
Creates 30 rows (one per day)
Tracks both crash count and unique devices
Sorts chronologically for trend visualization

Example: Combined Grouping (Property + Time)

Scenario: Track crashes by OS platform over time.

execution.crashes during past 30d
| summarize
crash_count = count(),
device_count = device.name.count()
  by 1d, device.operating_system.platform
| list start_time,
   device.operating_system.platform,
   crash_count,
   device_count
| sort crash_count desc

Explanation:

Groups by BOTH day and OS platform
Creates rows for each (day, platform) combination
Enables comparing trends across different platforms

Example: CPU Usage Analysis with Inline Calculation

Scenario: Calculate CPU percentage properly with inline formula.

/* Production-tested example */
execution.events during past 7d
| where binary.name in ["sensedlpprocessor.exe", "mssense.exe"]
| summarize
total_processes_started = number_of_started_processes.sum(),
total_cpu_time = cpu_time.sum(),
cpu_usage_percent = ((cpu_time.sum()) * 100) / ((execution_duration.sum()) * (number_of_logical_processors.max()))
  by binary.name
| sort cpu_usage_percent desc

Explanation:

Groups by binary name
Calculates CPU percentage inline (cannot reference total_cpu_time!)
Uses proper formula with max() for processors

Example: Memory Utilization by Device

Scenario: Monitor memory usage across devices.

execution.events during past 7d
| where binary.name in ["sensedlpprocessor.exe", "mssense.exe", "mpdlpservice.exe"]
| summarize
avg_memory_used = real_memory.avg(),
last_memory_used = real_memory.last(),
process_count = binary.name.count()
  by device.name
| where avg_memory_used > 500MB
| list device.name,
   avg_memory_used.as(format = bytes),
   last_memory_used.as(format = bytes),
   process_count
| sort avg_memory_used desc
| limit 100

Common Patterns

Pattern: Top N by Metric

execution.events during past 7d
| summarize avg_cpu = cpu_time.avg() by device.name
| sort avg_cpu desc
| limit 20  # Top 20 devices by CPU

Pattern: Percentage Calculation

/* Inline percentage calculation */
devices during past 7d
| include execution.crashes during past 7d
| compute has_crashes = device.count()
| summarize
    total_devices = count(),
    devices_with_crashes = has_crashes.sum(),
    crash_percentage = (has_crashes.sum() * 100.0) / count()

Pattern: Time-Series with Threshold

device_performance.events during past 48h
| where cpu_usage.avg > 80
| summarize high_cpu_count = count() by 1h
| sort start_time asc

Pattern: Multi-Dimensional Analysis

execution.events during past 7d
| summarize
    avg_cpu = cpu_time.avg(),
    avg_memory = real_memory.avg(),
    execution_count = count()
  by device.name, binary.name
| where execution_count > 100
| sort avg_cpu desc

Tips & Tricks

Sort Time-Series Chronologically

Always sort by start_time asc for time-series queries:

| summarize metric = field.avg() by 1d
| sort start_time asc  # Chronological order for charts

Filter After Aggregation

You can use where AFTER summarize to filter aggregated results:

execution.events during past 7d
| summarize crash_count = count() by device.name
| where crash_count > 10  # Filter aggregated results
| sort crash_count desc

count() Without Field Name

count() without a field name counts records:

execution.crashes during past 7d
| summarize total_crashes = count()  # Count crash events

Time Period Can Be Anywhere in by List

/* These are equivalent: */
| summarize metric = field.avg() by 1d, device.name
| summarize metric = field.avg() by device.name, 1d

Common Mistake: Grouping by Numeric Field

/* ❌ WRONG - Integer type not supported */
devices
| summarize count = count() by days_since_last_seen  # ERROR

/* ✅ CORRECT - Use buckets or where filters */
devices
| where days_since_last_seen >= 0 and days_since_last_seen < 7
| summarize week_1_count = count()

Common Mistake: Multiple Time Periods

/* ❌ WRONG - Only ONE time period allowed */
| summarize count = count() by 1d, 1h  # ERROR

/* ✅ CORRECT - Choose appropriate granularity */
| summarize count = count() by 1h, device.name  # One time, multiple properties OK

Common Mistake: Using .avg Instead of .avg()

/* ❌ LESS ACCURATE - Using aggregated metric field */
execution.events during past 7d
| summarize avg_cpu = cpu_time.avg  # Missing parentheses

/* ✅ CORRECT - Using smart aggregate function */
execution.events during past 7d
| summarize avg_cpu = cpu_time.avg()  # Function with parentheses

Comparing compute vs summarize

Understanding when to use compute vs summarize:

Aspect	compute	summarize
Purpose	Per-object calculations	Aggregate/reduce data
Output	One row per object (device/user)	One row total OR one row per group
Requires	Must follow `with` or `include`	Can be used alone
Use for	Individual device metrics	Totals, averages, counts across many
Example	"CPU per device"	"Average CPU across all devices"

Example showing the difference:

/* compute - Per-device crash count */
devices during past 7d
| include execution.crashes during past 7d
| compute crash_count = count()  # One value per device
| list device.name, crash_count
/* Returns: 142 rows (one per device) */

/* summarize - Total crashes across all devices */
execution.crashes during past 7d
| summarize total_crashes = count()  # Single aggregate
/* Returns: 1 row (total) */

/* summarize with grouping - Crashes per device */
execution.crashes during past 7d
| summarize crash_count = count() by device.name
/* Returns: ~15 rows (only devices WITH crashes) */

Aggregation Functions - Available functions (avg, sum, count, etc.)
compute - Calculations - Per-object calculations
where - Filtering Data - Filtering before and after aggregation
sort - Ordering Results - Sorting aggregated results
with vs include - Event joins before aggregation

Additional Resources

NQL Syntax Cheat Sheet - Quick summarize reference
Common Query Templates - Aggregation templates
Query Performance Guide - Optimizing aggregations
Common Error Messages - Troubleshooting summarize errors

summarize - Aggregations

Overview

Basic Syntax

Four Summarize Patterns

Pattern 1: Simple Aggregation (No Grouping)

Pattern 2: Grouped by Property

Pattern 3: Grouped by Time Period

Pattern 4: Combined Grouping (Property + Time)

Time Period Grouping

Valid Time Periods

Time Grouping Examples

Property Grouping

Single Field Grouping

Multiple Field Grouping

Multiple Aggregations

CRITICAL: Inline Calculations

❌ WRONG - Referencing Variables

✅ CORRECT - Repeat Full Calculation

CRITICAL: Cannot Use compute After summarize

❌ WRONG - compute After summarize

✅ CORRECT - Calculation Inside summarize

Real-World Examples

Common Patterns

Pattern: Top N by Metric

Pattern: Percentage Calculation

Pattern: Time-Series with Threshold

Pattern: Multi-Dimensional Analysis

Tips & Tricks

Comparing compute vs summarize

Related Topics

Additional Resources