Aggregation Functions

Note

Understanding the difference between smart aggregates (.avg()), aggregated metrics (.avg), and special functions is critical for writing correct NQL queries.

Overview

NQL provides multiple ways to aggregate data, and choosing the right one is crucial for accurate results. The most common confusion is between .avg() and .avg - they're different and used in different contexts.

Three function categories:

  1. Smart Aggregates - functions with () - use in compute/summarize
  2. Aggregated Metrics - database fields without () - use in where clauses
  3. Special Functions - count(), countif(), last(), time_elapsed()

Quick Decision Guide

| Context | Use This | Example |
| --- | --- | --- |
| In compute or summarize | Smart aggregates .avg() | compute avg_cpu = cpu_time.avg() |
| In where clause | Aggregated metric .avg | where cpu_usage.avg > 50 |
| For peak values | Chained .avg.max() | summarize peak = cpu_usage.avg.max() |
| Count unique objects | <object>.count() | summarize devices = device.count() |
| Count events (avoid!) | count() | ❌ Returns sample count on events |

The 95% Rule

Use .avg() in compute/summarize 95% of the time - it gives you the true average because it accounts for cardinality.

Only use .avg (without parentheses) in where clauses for filtering.
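
As a quick illustration (a minimal sketch reusing the cpu_usage metric from the examples later on this page), both forms often appear in the same query:

/* Sketch - .avg field for filtering, .avg() smart aggregate for the computed value */
devices during past 7d
| include device_performance.events
| where cpu_usage.avg > 50           # field (no parentheses) in where
| compute avg_cpu = cpu_usage.avg()  # function (with parentheses) in compute
| list device.name, avg_cpu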

Smart Aggregates (Functions with Parentheses)

Use these in compute and summarize clauses.

Available Functions

| Function | Description | Returns | Example |
| --- | --- | --- | --- |
| .avg() | True average (accounting for cardinality) | Number | cpu_time.avg() |
| .sum() | Sum of all values | Number | memory.sum() |
| .count() | Number of aggregated values | Integer | events.count() |
| .min() | Minimum value | Number | memory.min() |
| .max() | Maximum value | Number | memory.max() |

How They Work

Smart aggregates are computed on-the-fly and provide accurate aggregation by accounting for data cardinality.

Equivalent formulas:

  • .avg() = .sum.sum() / .count.sum()
  • .sum() = .sum.sum()
  • .max() = .max.max()
  • .min() = .min.min()
  • .count() = .count.sum()
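
To see the first identity in practice, the two forms can be compared side by side (a minimal sketch assuming cpu_usage on device_performance.events, as used elsewhere on this page):

/* Both columns should return the same value, since .avg() = .sum.sum() / .count.sum() */
device_performance.events during past 7d
| summarize
    smart_avg  = cpu_usage.avg(),
    manual_avg = cpu_usage.sum.sum() / cpu_usage.count.sum()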

When to Use Smart Aggregates

/* ✅ CORRECT - Use in compute */
devices during past 7d
| include device_performance.events
| compute avg_memory = free_memory.avg()  # Smart aggregate
| list device.name, avg_memory

/* ✅ CORRECT - Use in summarize */
execution.events during past 7d
| summarize avg_cpu = cpu_time.avg() by device.name  # Smart aggregate

Aggregated Metrics (Database Fields)

These are pre-computed fields in the database - use them in where clauses for filtering.

Available Fields

| Field | Description | Use In |
| --- | --- | --- |
| .avg | Pre-computed average | where clauses |
| .sum | Pre-computed sum | where clauses |
| .count | Pre-computed count | where clauses |
| .min | Pre-computed minimum | where clauses |
| .max | Pre-computed maximum | where clauses |

When to Use Aggregated Metrics

/* ✅ CORRECT - Use .avg (no parentheses) in where */
devices during past 7d
| include device_performance.events
| where cpu_usage.avg > 50  # Aggregated metric field
| list device.name

The .avg vs .avg() vs .avg.avg() vs .avg.max() Question

This is the most common source of confusion in NQL. Here's the complete breakdown:

.avg - Pre-Computed Average Field

Use in: where clauses for filtering

devices during past 7d
| include device_performance.events
| where cpu_usage.avg > 50  # Field for filtering

.avg() - Smart Aggregate (True Average)

Use in: compute and summarize clauses

This is what you want 95% of the time!

devices during past 7d
| include device_performance.events
| compute avg_cpu = cpu_usage.avg()  # Accurate average

Why it's recommended:

  • Accounts for cardinality (number of samples per device/object)
  • Returns true average across all data points
  • Result is consistent regardless of time selection

.avg.avg() - Average of Pre-Computed Averages

Nexthink does NOT recommend using this!

The problem: .avg.avg() averages the pre-computed .avg values, so the result depends on how many of those values fall inside the time selection:

  • For past 2d: Averages 2 samples (2 days)
  • For past 48h: Averages 192 samples (48 hours × 4 samples/hour)
  • Same time period, different results!

/* ❌ NOT RECOMMENDED - Result varies by time selection */
device_performance.events during past 7d
| summarize sample_avg = cpu_usage.avg.avg()  # Don't use this

/* ✅ CORRECT - Consistent, accurate result */
device_performance.events during past 7d
| summarize true_avg = cpu_usage.avg()  # Use this instead

.avg.max() - Maximum Observed Value

Use in: Finding peak values

/* ✅ CORRECT - Find peak CPU usage */
device_performance.events during past 7d
| summarize
    avg_cpu = cpu_usage.avg(),      # Average CPU
    peak_cpu = cpu_usage.avg.max()  # Peak CPU observed
  by device.name
| where peak_cpu > 80

When to use:

  • Finding maximum observed values
  • Identifying spikes or peaks
  • Alert thresholds based on peak performance

Special Functions

count() - Count Records

Syntax: count() (without a field name)

What it does: Counts events or records

CRITICAL: Never Use count() on Sampled Events!

On event tables like execution.events, count() returns the number of samples, not meaningful business metrics.

/* ❌ BAD - Returns sample count (meaningless!) */
execution.events during past 7d
| summarize total = count()
/* Returns: 1,245,892 samples - not useful! */

/* ✅ GOOD - Count unique objects */
execution.events during past 7d
| summarize
    unique_devices = device.count(),
    unique_apps = application.count()
/* Returns: 142 devices, 38 apps - business value! */

When count() IS useful:

/* ✅ Count non-sampled events (crashes, boots) */
execution.crashes during past 7d
| summarize crash_count = count()

/* ✅ Count after aggregation */
devices during past 7d
| summarize device_count = count()

.count() - Count Unique Objects

Syntax: device.count(), user.count(), application.count()

What it does:

  • In compute: Returns 1 if object has events, 0 otherwise
  • In summarize: Counts unique objects

This is the RIGHT way to count on event tables!

/* ✅ CORRECT - Count unique devices */
execution.events during past 7d
| summarize unique_devices = device.count()

/* ✅ CORRECT - Per-device presence (1 or 0) */
devices during past 7d
| include execution.crashes during past 7d
| compute has_crashes = device.count()  # 1 if crashes exist, 0 otherwise
| summarize devices_with_crashes = has_crashes.sum()

countif() - Conditional Counting

Syntax: countif(<condition>)

What it does: Counts only values that meet the condition

/* Count poor quality calls */
collaboration.sessions during past 24h
| summarize
    poor_calls = countif(session.audio.quality = poor or session.video.quality = poor),
    total_calls = count(),
    poor_ratio = countif(session.audio.quality = poor or session.video.quality = poor) / count()
  by connection_type

.last() - Most Recent Value

Syntax: <field>.last()

What it does: Returns the most recent value from the aggregated set

/* Get current disk space (most recent reading) */
devices during past 7d
| include device_performance.events
| compute
    current_free_space = free_disk_space.last(),  # Latest value
    avg_free_space = free_disk_space.avg()         # Average over period
| list device.name, current_free_space, avg_free_space

Use cases:

  • Current state from time-series data
  • Latest configuration values
  • Most recent status

time_elapsed() - Time Since Timestamp

Syntax: <timestamp_field>.time_elapsed()

What it does: Calculates time elapsed since a timestamp

Chainable: Can be followed by aggregation functions

/* Time since last update */
devices
| compute days_since_update = last_update.time_elapsed()
| where days_since_update > 30d
| list device.name, days_since_update
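
Because .time_elapsed() is chainable, an aggregation can follow it directly - for example (a sketch, assuming the same last_update timestamp field as above):

/* Oldest and most recent update age across all devices */
devices
| summarize
    oldest_update = last_update.time_elapsed().max(),
    newest_update = last_update.time_elapsed().min()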

Real-World Examples

Example: CPU Performance Analysis

Scenario: Find devices with high average CPU and identify peak spikes.

execution.events during past 7d
| summarize
    avg_cpu_percent = ((cpu_time.sum()) * 100) / ((execution_duration.sum()) * (number_of_logical_processors.max())),
    unique_devices = device.count()
  by binary.name
| where avg_cpu_percent > 10
| list binary.name, avg_cpu_percent, unique_devices
| sort avg_cpu_percent desc
| limit 20

Explanation:

  • Uses .sum() smart aggregates for accurate totals
  • Uses .max() for number of logical processors
  • Uses device.count() to count unique devices (NOT count()!)
  • Calculates CPU percentage inline in summarize

Example: Memory Usage Monitoring

Scenario: Monitor memory usage with average and peak values.

device_performance.events during past 7d
| summarize
    avg_memory = real_memory.avg(),      # Average memory usage
    peak_memory = real_memory.avg.max(), # Peak memory observed
    device_count = device.count()
  by binary.name
| where peak_memory > 2GB
| list binary.name,
   avg_memory.as(format = bytes),
   peak_memory.as(format = bytes),
   device_count
| sort peak_memory desc

Explanation:

  • .avg() for average memory over period
  • .avg.max() for peak memory spike
  • device.count() for unique devices
  • Formats output as bytes (GB/MB)

Example: Crash Rate Calculation

Scenario: Calculate percentage of devices with crashes.

devices during past 7d
| include execution.crashes during past 7d
| compute has_crashes = device.count()  # Returns 1 or 0
| summarize
    total_devices = count(),
    devices_with_crashes = has_crashes.sum(),  # Sum the 1s
    crash_percentage = (has_crashes.sum() * 100.0) / count()

Explanation:

  • device.count() in compute returns 1 if crashes exist, 0 otherwise
  • Sum the 1s to count devices with crashes
  • Divide by total for percentage
  • Uses inline calculation in summarize

Example: Production DLP Monitoring (Verified)

Scenario: Monitor Microsoft Purview DLP memory usage (from production examples).

execution.events during past 7d
| where binary.name in ["sensedlpprocessor.exe", "mssense.exe", "mpdlpservice.exe"]
| summarize
    avg_memory_used = real_memory.avg(),    # Smart aggregate
    last_memory_used = real_memory.last(),  # Most recent value
    process_count = binary.name.count()     # Unique processes
  by device.name
| list device.name,
   avg_memory_used.as(format = bytes),
   last_memory_used.as(format = bytes),
   process_count
| sort process_count desc
| limit 100

Common Mistakes & Anti-Patterns

Mistake #1: Using count() on Sampled Events

/* ❌ WRONG - Returns sample count (meaningless) */
execution.events during past 7d
| summarize total_events = count()
/* Returns: 1,245,892 (just the sample count) */

/* ✅ CORRECT - Count unique objects or sum metrics */
execution.events during past 7d
| summarize
    unique_devices = device.count(),
    unique_apps = application.count(),
    total_cpu_time = cpu_time.sum()
/* Returns meaningful business metrics */

Mistake #2: Using .avg.avg() Instead of .avg()

/* ❌ NOT RECOMMENDED - Result varies by time selection */
device_performance.events during past 7d
| summarize sample_avg = cpu_usage.avg.avg()

/* ✅ CORRECT - Consistent, accurate average */
device_performance.events during past 7d
| summarize true_avg = cpu_usage.avg()

Mistake #3: Wrong Context for .avg vs .avg()

/* ❌ WRONG - Function in where clause */
devices during past 7d
| include device_performance.events
| where cpu_usage.avg() > 50  # ERROR - use .avg not .avg()

/* ✅ CORRECT - Field in where, function in compute */
devices during past 7d
| include device_performance.events
| where cpu_usage.avg > 50        # Field for filtering
| compute avg_cpu = cpu_usage.avg()  # Function for calculation

Mistake #4: Counting All Records Instead of Unique Objects

/* ❌ LESS USEFUL - Total records */
execution.events during past 7d
| where binary.name = "outlook.exe"
| summarize total_records = count()

/* ✅ MORE USEFUL - Unique devices running Outlook */
execution.events during past 7d
| where binary.name = "outlook.exe"
| summarize
    unique_devices = device.count(),
    unique_users = user.count()

Summary Table: When to Use Each

| What You Want | Use This | Context | Example |
| --- | --- | --- | --- |
| Filter by average | .avg | where clause | where cpu.avg > 50 |
| Calculate average | .avg() | compute/summarize | compute x = cpu.avg() |
| Find peak value | .avg.max() | compute/summarize | summarize peak = cpu.avg.max() |
| Count unique devices | device.count() | compute/summarize | summarize n = device.count() |
| Check if has events | device.count() | compute | compute has = device.count() |
| Count non-sampled events | count() | summarize | summarize n = count() |
| Conditional count | countif() | summarize | summarize n = countif(x > 10) |
| Latest value | .last() | compute/summarize | compute current = field.last() |
| Sum values | .sum() | compute/summarize | compute total = field.sum() |

Additional Resources