Skip to content

where - Filtering Data

Note

The where clause filters query results based on conditions - and filtering early is the #1 performance optimization in NQL.

Overview

The where clause is one of the most important clauses in NQL. It filters data to include only rows that meet specific conditions.

Why it matters:

  • Performance - Filtering early dramatically reduces query execution time
  • Precision - Narrow results to exactly what you need
  • Flexibility - Combine multiple conditions with AND/OR logic

#1 Performance Rule

ALWAYS filter early with where before expensive operations like compute or summarize

A query that filters 1 million events down to 1,000 BEFORE aggregating is 1000x more efficient than filtering AFTER.

Basic Syntax

| where <field> <operator> <value>
| where <field_1> <operator> <field_2>
| where <condition_1> and/or <condition_2>

Three Comparison Types

1. Fixed Reference Comparison

devices during past 7d
| where operating_system.platform == windows

2. Field-to-Field Comparison

execution.events during past 7d
| where device.location != context.location  # Device changed location

3. Multiple Conditions

devices during past 7d
| where (hardware.memory > 8GB and operating_system.platform == windows)
     or (hardware.memory > 4GB and operating_system.platform == macos)

Comparison Operators

Operator Description Data Types Example
== or = Equals (interchangeable) All types where os.name == "Windows 11"
!= Not equals All types where os.name != "Windows 7"
> Greater than Numeric, dates, durations where memory > 8GB
< Less than Numeric, dates, durations where cpu_usage < 50
>= Greater than or equal Numeric, dates, durations where memory >= 8GB
<= Less than or equal Numeric, dates, durations where cpu_usage <= 80
in In list All types where name in ["A", "B", "C"]
!in Not in list All types where name !in ["TEST", "DEV"]
contains Array contains value String arrays where tags contains "production"
!contains Array doesn't contain String arrays where tags !contains "test"

Equals Operator: = vs ==

Both = and == work identically for comparisons in where clauses.

where os.name = "Windows 11"   # Valid
where os.name == "Windows 11"  # Also valid
However, only = works for aliasing in compute/summarize clauses.

Wildcards in Filters

NQL supports two wildcard characters for pattern matching:

Wildcard Function Example
* Replaces any number of characters (0 or more) "sense*.exe" matches "sensedlpprocessor.exe", "sensece.exe"
? Replaces exactly one character "Windows 1?" matches "Windows 10", "Windows 11"

Wildcard Matching is Case-Insensitive

All wildcard matching in NQL is case-insensitive:

where binary.name == "outlook.exe"  # Matches "OUTLOOK.EXE", "Outlook.exe", etc.

Wildcard Placement Examples

/* Starts with "Microsoft" */
where application.name == "Microsoft*"
/* Matches: "Microsoft Teams", "Microsoft Edge", "Microsoft 365" */

/* Contains "Office" anywhere */
where application.name == "*Office*"
/* Matches: "Microsoft Office", "Office 365", "LibreOffice" */

/* Ends with ".dll" */
where binary.name == "*.dll"
/* Matches: "kernel32.dll", "ntdll.dll", "user32.dll" */

/* Exact character replacement */
where operating_system.name == "Windows 1?"
/* Matches: "Windows 10", "Windows 11" */
/* Does NOT match: "Windows 7" (requires exactly 1 character) */

Common Use Cases

Use Case 1: Filter by Operating System

devices during past 7d
| where operating_system.platform == windows
| list device.name, operating_system.name

Use Case 2: Find Specific Applications

execution.events during past 7d
| where binary.name in ["outlook.exe", "chrome.exe", "teams.exe"]
| summarize avg_cpu = cpu_time.avg() by binary.name

Use Case 3: Numeric Range Filtering

devices during past 7d
| where hardware.memory >= 8GB and hardware.memory <= 16GB
| list device.name, hardware.memory

Use Case 4: Using Wildcards for Process Families

/* Find all Microsoft DLP processes */
execution.events during past 7d
| where binary.name in ["sense*.exe", "mpdlp*.exe"]
| summarize process_count = count() by binary.name

Use Case 5: Field-to-Field Comparison

/* Devices that changed location */
execution.events during past 7d
| where device.location != context.location
| list device.name, device.location, context.location
| sort device.name asc

Logical Operators

Combine multiple conditions using logical operators:

AND - Both conditions must be true

devices during past 7d
| where operating_system.name == "Windows 11"
    and hardware.memory > 8GB

OR - Either condition must be true

devices during past 7d
| where operating_system.name == "Windows 10"
     or operating_system.name == "Windows 11"

NOT - Negation

devices during past 7d
| where not (operating_system.name == "Windows 7")

Complex Conditions with Grouping

devices during past 7d
| where (operating_system.name == "Windows 11" and hardware.memory > 8GB)
     or (operating_system.name == "Windows 10" and hardware.memory > 16GB)

Parentheses for Clarity

Use parentheses to group conditions and ensure correct precedence:

/* Clear precedence with parentheses */
where (condition1 and condition2) or (condition3 and condition4)

Real-World Examples

Example: Find High-Risk Devices

Scenario: Security team needs to identify Windows 7 devices (unsupported) with external network access.

devices during past 7d
| where operating_system.name == "Windows 7"
| with connection.events during past 7d
| where event.destination.domain !in ["*.company.local", "*.internal"]
| list device.name,
   operating_system.name,
   event.destination.domain
| sort device.name asc

Explanation: 1. Start with all devices 2. Filter for Windows 7 (EOL operating system) 3. Join with connection events (only devices WITH connections) 4. Filter out internal domains (external connections only) 5. Display device name, OS, and external destinations

Example: Low Disk Space Alert

Scenario: Find devices with less than 10GB free disk space.

devices during past 7d
| include device_performance.events
| compute current_free_space_mb = free_disk_space.last()
| where current_free_space_mb < 10000  # Less than ~10GB
| list device.name,
   current_free_space_mb,
   device.volumes.size
| sort current_free_space_mb asc

Explanation: 1. Include all devices (even those without performance data) 2. Compute current free space (most recent value) 3. Filter after compute - Find devices below threshold 4. Display name, free space, total size 5. Sort by free space (most critical first)

Example: Microsoft Office Crash Investigation

Scenario: Find devices with Office crashes, exclude known test devices.

execution.crashes during past 7d
| where (binary.binary.name == "*office*"
 or binary.binary.name in ["outlook.exe", "excel.exe", "winword.exe"])
   and device.name !in ["TEST-*", "DEV-*"]
| summarize crash_count = count() by device.name, binary.binary.name
| where crash_count > 5
| list device.name, binary.binary.name, crash_count
| sort crash_count desc

Explanation: 1. Query crash events from past week 2. Early filtering: Office binaries only, exclude test devices 3. Aggregate crash counts by device and binary 4. Filter aggregated results: Only devices with > 5 crashes 5. Display and sort by crash count

Performance: Filter Early!

The placement of where clauses has dramatic performance implications.

❌ BAD PERFORMANCE - Filtering After Aggregation

/* SLOW - Processes ALL 30 days of data first! */
execution.events during past 30d
| summarize count by binary.name  # Aggregates millions of events
| where binary.name == "outlook.exe"  # Filters AFTER processing

/* Query time: ~30 seconds */

✅ GOOD PERFORMANCE - Filtering Before Aggregation

/* FAST - Filters early, processes only relevant data */
execution.events during past 30d
| where binary.name == "outlook.exe"  # Filters FIRST
| summarize count by binary.name  # Aggregates only Outlook events

/* Query time: ~2 seconds (15x faster!) */

Performance Comparison Table

Filter Placement Data Processed Query Time Result
After aggregation 5,000,000 events 30 sec Same
Before aggregation 5,000 events 2 sec Same
Speedup 1000x less data 15x faster Identical

Filter Early Checklist

✅ where before with/include
✅ where before compute
✅ where before summarize
❌ where after summarize (use for aggregated values only)

Common Patterns

Pattern 1: Department Filtering

devices during past 7d
| where department == "IT"
| list device.name, department

Pattern 2: Multiple OS Versions

devices during past 7d
| where operating_system.name in ["Windows 10", "Windows 11"]
| summarize device_count = count() by operating_system.name

Pattern 3: Exclusion Pattern

devices during past 7d
| where device.name !in ["TEST-*", "DEV-*", "LAB-*"]
| list device.name

Pattern 4: Numeric Threshold

devices during past 7d
| include device_performance.events
| where cpu_usage.avg > 50  # Pre-computed average field
| list device.name, cpu_usage.avg

Pattern 5: Date Range (Implicit with Time Windows)

/* Most recent 24 hours */
execution.events during past 24h
| where binary.name == "outlook.exe"

/* Last 7 days */
execution.events during past 7d
| where binary.name == "outlook.exe"

Tips & Tricks

Multiple where Clauses = AND Logic

Separate where clauses on different lines are equivalent to a single clause with and:

/* These are equivalent: */
| where os.name == "Windows 11"
| where memory > 8GB

/* Same as: */
| where os.name == "Windows 11" and memory > 8GB

Use in for Multiple Values

Instead of chaining OR conditions, use in:

/* BAD - verbose and harder to maintain */
where name == "A" or name == "B" or name == "C"

/* GOOD - concise and clear */
where name in ["A", "B", "C"]

Wildcard Performance

Specific wildcards are faster than broad wildcards:

/* Faster - specific prefix */
where name == "specific*"

/* Slower - contains anywhere */
where name == "*specific*"

Common Mistake: Using > on Strings

/* ❌ WRONG - Comparison operators on strings don't work as expected */
where device.name > "SERVER"

/* ✅ CORRECT - Use wildcards for string patterns */
where device.name == "SERVER*"

Common Mistake: Missing Time Selection

/* ❌ WRONG - Event tables require time selection */
devices
| with web.errors
| where error.type == "timeout"

/* ✅ CORRECT - Add time window to event join */
devices
| with web.errors during past 7d
| where error.type == "timeout"

Common Mistake: Filtering Before vs After Aggregation

/* ❌ WRONG - Processes all data first (slow) */
execution.events during past 7d
| summarize count by binary.name
| where count > 100

/* ✅ BETTER - Filter data early (fast) */
execution.events during past 7d
| where binary.name in ["outlook.exe", "teams.exe", "chrome.exe"]
| summarize count by binary.name
| where count > 100  # Filter aggregated results after

Additional Resources