Guide Monitoring • detections • runbooks

From logs to signal: detections that stick.

Most alerting programs fail because they start too big and nobody owns the alerts. This guide shows a practical way to map log coverage, ship a small set of high-value detections, and measure response.

A detection engineering playbook

Monitoring built for humans: clear alerts, clear actions, clear ownership.

1. Define what “success” means

Pick a few measurable outcomes and align everyone on them.

Mean time to acknowledge (MTTA) and mean time to contain (MTTC)
Coverage for identity, privilege, and data access events
False positive rate and alert fatigue (measured as pages per week)
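
As a rough illustration, the outcomes above can be computed from a list of alert records. This is a minimal sketch: the `AlertRecord` shape and its field names are assumptions made for the example, not a schema from any particular SIEM or paging tool.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import List, Optional


@dataclass
class AlertRecord:
    # Hypothetical record shape; adapt the fields to whatever your tooling exports.
    fired_at: datetime
    acknowledged_at: Optional[datetime]
    contained_at: Optional[datetime]
    true_positive: bool
    paged: bool


def mean_minutes(deltas):
    """Average a list of timedeltas in minutes; None when the list is empty."""
    return mean(d.total_seconds() / 60 for d in deltas) if deltas else None


def summarize(records: List[AlertRecord], weeks: float) -> dict:
    """Compute MTTA, MTTC, false positive rate, and pages per week."""
    acked = [r.acknowledged_at - r.fired_at for r in records if r.acknowledged_at]
    contained = [r.contained_at - r.fired_at for r in records if r.contained_at]
    fp_rate = sum(not r.true_positive for r in records) / len(records) if records else None
    return {
        "mtta_minutes": mean_minutes(acked),
        "mttc_minutes": mean_minutes(contained),
        "false_positive_rate": fp_rate,
        "pages_per_week": sum(r.paged for r in records) / weeks,
    }
```

Alerts that were never acknowledged or contained are left out of the averages here; whether to count them as open-ended instead is a choice to agree on when the outcomes are defined.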

2. Create a use-case backlog

Start with the threats that actually happen, and the data you can realistically collect.

Identity anomalies: MFA bypass indicators, impossible travel, new device logins
Privilege changes: admin role grants, new service principals, policy changes
Data access: mass export, unusual queries, new integrations pulling data
App abuse: brute force, credential stuffing, token replay, admin workflow abuse
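
One way to keep the backlog honest is to score each use case and set aside anything whose data cannot be collected yet. The sketch below assumes a simple 1 to 5 scale for frequency and impact; the scale, the `UseCase` fields, and the frequency-times-impact score are illustrative choices, not a standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UseCase:
    name: str
    observed_frequency: int  # 1 (rare) to 5 (seen regularly); hypothetical scale
    impact: int              # 1 (minor) to 5 (severe); hypothetical scale
    data_available: bool     # can the required logs be collected today?


def prioritize(backlog: List[UseCase]) -> List[UseCase]:
    """Rank use cases by frequency x impact, keeping only those with data available."""
    ready = [u for u in backlog if u.data_available]
    return sorted(ready, key=lambda u: u.observed_frequency * u.impact, reverse=True)


backlog = [
    UseCase("credential stuffing", observed_frequency=5, impact=3, data_available=True),
    UseCase("admin role grant", observed_frequency=2, impact=5, data_available=True),
    UseCase("mass export", observed_frequency=3, impact=5, data_available=False),
]
ranked = [u.name for u in prioritize(backlog)]
```

Use cases filtered out for missing data are not dropped from the backlog; they become input to the log coverage work.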

3. Map log coverage before you write alerts

Most “detections” fail because the underlying telemetry is missing or inconsistent.

Identity provider logs (SSO, MFA, user lifecycle)
Cloud audit logs (API calls, IAM, network changes)
Application audit logs for sensitive actions
Endpoint and container runtime signals where applicable
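
A coverage map can be as simple as a lookup from each use case to the log sources it needs, compared against what is actually being ingested. The source names below (`idp_signin`, `cloud_audit`, and so on) are made-up labels for this sketch.

```python
# Hypothetical mapping of use cases to the log sources they depend on.
REQUIRED_SOURCES = {
    "impossible travel": {"idp_signin"},
    "admin role grant": {"idp_lifecycle", "cloud_audit"},
    "mass export": {"app_audit"},
}


def coverage_gaps(required: dict, available: set) -> dict:
    """Return {use_case: missing_sources} for use cases that cannot be built yet."""
    gaps = {}
    for use_case, sources in required.items():
        missing = sources - available
        if missing:
            gaps[use_case] = sorted(missing)
    return gaps


# Sources currently ingested (assumed for the example).
gaps = coverage_gaps(REQUIRED_SOURCES, available={"idp_signin", "cloud_audit"})
```

With these inputs, `gaps` lists `idp_lifecycle` as missing for admin role grants and `app_audit` as missing for mass export, so only the impossible travel detection is ready to write.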

4. Build alerts that include context

Every alert should answer: what happened, why it matters, and what to do next.

Include actor, target, time, and evidence links (log IDs)
Assign severity based on impact, not “how rare” it is
Write one primary action and one escalation action
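
The checklist above can be enforced by the alert structure itself, so an alert cannot be created without its context. This is a sketch only: the `Alert` fields and the `IMPACT_SEVERITY` tiers are assumptions for illustration, and the severity labels should follow whatever scale your team already uses.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List

# Hypothetical impact tiers: severity follows what the target is, not how rare the event is.
IMPACT_SEVERITY = {
    "production_admin": "critical",
    "customer_data": "high",
    "internal_tool": "medium",
}


@dataclass
class Alert:
    title: str
    actor: str
    target: str
    target_class: str
    occurred_at: datetime
    why_it_matters: str
    evidence_log_ids: List[str]
    primary_action: str
    escalation_action: str

    @property
    def severity(self) -> str:
        # Unknown target classes fall back to "low" in this sketch.
        return IMPACT_SEVERITY.get(self.target_class, "low")

    def render(self) -> str:
        """Plain-text body answering: what happened, why it matters, what to do next."""
        return "\n".join([
            f"[{self.severity.upper()}] {self.title}",
            f"What happened: {self.actor} acted on {self.target} at {self.occurred_at.isoformat()}",
            f"Why it matters: {self.why_it_matters}",
            f"Evidence (log IDs): {', '.join(self.evidence_log_ids)}",
            f"Do next: {self.primary_action}",
            f"Escalate: {self.escalation_action}",
        ])
```

Making every field required means a detection author has to write the primary and escalation actions before the alert can ship.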

5. Add runbooks and ownership

A detection without a runbook is just noise.

Who is responsible for triage during business hours and after hours?
What is the first containment action (disable token, block IP, revoke role)?
What evidence should be preserved, and where?
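
The three questions above map naturally onto a small runbook record plus a routing function. The sketch assumes business hours of 09:00 to 18:00 on weekdays in the timezone of the `now` value passed in; the field names and team names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import List


@dataclass
class Runbook:
    detection: str
    business_hours_owner: str
    after_hours_owner: str
    first_containment: str           # e.g. disable token, block IP, revoke role
    evidence_to_preserve: List[str]
    evidence_location: str


def triage_owner(runbook: Runbook, now: datetime,
                 start: time = time(9, 0), end: time = time(18, 0)) -> str:
    """Weekdays within [start, end) go to the business-hours owner, everything else to on-call."""
    is_weekday = now.weekday() < 5  # Monday is 0, Saturday is 5
    in_hours = start <= now.time() < end
    if is_weekday and in_hours:
        return runbook.business_hours_owner
    return runbook.after_hours_owner


runbook = Runbook(
    detection="admin role grant",
    business_hours_owner="identity-team",
    after_hours_owner="security-on-call",
    first_containment="revoke role",
    evidence_to_preserve=["cloud audit log entry", "IdP session record"],
    evidence_location="incident evidence bucket",
)
```

Keeping the owner lookup in code rather than in a wiki page makes it testable, and a detection with no runbook entry can be rejected before it is deployed.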

6. Tune relentlessly

Good detections get better with feedback, not with more rules.

Record “why was this alert useful or not useful?” after each incident.
Reduce scope (and increase accuracy) before you expand coverage.
Move “expected noisy events” into dashboards instead of pages.
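
Recorded feedback can drive the tuning decision directly. In this sketch, each review yields a (rule, was it useful) pair, and rules whose useful share falls below a threshold are routed to a dashboard instead of a page. The 0.5 threshold and the use of precision as the routing signal are assumptions for illustration; a team may prefer to mark expected noisy events by hand.

```python
from collections import defaultdict


def precision_by_rule(feedback):
    """feedback: iterable of (rule_name, was_useful) pairs from post-alert reviews."""
    counts = defaultdict(lambda: [0, 0])  # rule -> [useful, total]
    for rule, useful in feedback:
        counts[rule][0] += int(useful)
        counts[rule][1] += 1
    return {rule: useful / total for rule, (useful, total) in counts.items()}


def route(precision, min_precision=0.5):
    """Rules below the threshold go to a dashboard; the rest keep paging."""
    return {
        rule: ("page" if p >= min_precision else "dashboard")
        for rule, p in precision.items()
    }


feedback = [
    ("new admin grant", True), ("new admin grant", True), ("new admin grant", False),
    ("new device login", False), ("new device login", False),
    ("new device login", True), ("new device login", False),
]
routing = route(precision_by_rule(feedback))
```

Here "new admin grant" was useful in two of three reviews and keeps paging, while "new device login" was useful in one of four and moves to a dashboard until its scope is narrowed.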

7. Review monthly and ship improvements

Security monitoring is a product. Give it a cadence.

Monthly review: top alerts, top gaps, top false positives
Quarterly tabletop drill to validate response and comms
Post-incident: convert lessons into new detections or guardrails
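
The monthly review numbers can be pulled with a few lines once alerts are exported. The sketch assumes each alert is a dict with `rule` and `true_positive` keys, which is a hypothetical export shape; top gaps are not covered here, since they come from the coverage map rather than from alert history.

```python
from collections import Counter


def monthly_review(alerts, top_n=3):
    """alerts: iterable of dicts with 'rule' and 'true_positive' keys (hypothetical shape)."""
    alerts = list(alerts)
    fired = Counter(a["rule"] for a in alerts)
    false_positives = Counter(a["rule"] for a in alerts if not a["true_positive"])
    return {
        "top_alerts": fired.most_common(top_n),
        "top_false_positives": false_positives.most_common(top_n),
    }
```

Each entry in the result is a `(rule, count)` pair sorted by count, which is enough to seed the review agenda.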

Want detections tuned for your stack?

We can build a coverage map, ship initial detections with runbooks, and hand off ownership cleanly. Lex supports teams globally from India.