
The problem with signature-based security is that signatures only exist for attacks that have already happened. The instant an attacker uses a technique that does not match a known pattern — a novel SQLi bypass, an obfuscated command injection payload, a zero-day exploit — the ruleset is useless. Behavioral baselining starts from the opposite assumption: describe what is normal, and flag everything that deviates.
This is how Raven.io's detection model is built. Before the agent blocks anything, it spends 48 hours observing. Every SQL query, every file path access, every outbound network call, every subprocess invocation — all of it gets recorded and structured into a behavioral fingerprint that is unique to your application. When exploitation happens, it deviates from that fingerprint in ways that are detectable even if the specific attack technique has never been seen before.
What Gets Observed During the Baseline Period
The agent instruments four categories of operations: database access, file system access, network calls, and subprocess execution. These are the operations that matter most for attack detection because they represent the actual damage an exploit can do.
For database access, the agent records query structure — not the values, but the template. A query like SELECT id, email FROM users WHERE id = ? is recorded as a structural pattern. Over the baseline period, the agent builds a map of which query templates appear, which code paths invoke them, and what the normal parameter types look like.
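The template extraction described above can be sketched in a few lines. This is a simplified illustration, not Raven.io's actual implementation: a production agent would use a real SQL parser rather than regexes, and `normalize_query` is a name invented here for the example.

```python
import re

def normalize_query(sql: str) -> str:
    """Reduce a SQL query to its structural template by replacing
    literal values with placeholders. A sketch only; a real agent
    would parse the SQL rather than pattern-match it."""
    template = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)        # string literals
    template = re.sub(r"\b\d+(?:\.\d+)?\b", "?", template)   # numeric literals
    return re.sub(r"\s+", " ", template).strip()

# Two queries that differ only in their values share one template:
a = normalize_query("SELECT id, email FROM users WHERE id = 42")
b = normalize_query("SELECT id, email FROM users WHERE id = 7")
assert a == b == "SELECT id, email FROM users WHERE id = ?"
```

The point of normalizing away values is that the baseline stays small and stable: millions of distinct queries collapse into a handful of templates, and a structurally new query stands out immediately.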
For file system access, it records which paths the application reads and writes under normal operation — config files, temp directories, log paths, asset directories. An exploit that tries to read /etc/passwd or write to /tmp/exploit.sh will deviate from every path the application accessed during baselining.
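A minimal sketch of that path check might look like the following. The class and method names are illustrative, not Raven.io's API, and collapsing paths to their containing directory is one plausible normalization choice among several.

```python
import os

class PathBaseline:
    """Illustrative file-path baseline. Paths are normalized to their
    containing directory so rotating temp/log filenames do not each
    count as a new location."""

    def __init__(self):
        self.observed: set[str] = set()

    @staticmethod
    def normalize(path: str) -> str:
        # Baseline directories, not exact filenames: /tmp/upload_1.dat
        # and /tmp/upload_2.dat map to the same node.
        return os.path.dirname(os.path.abspath(path))

    def observe(self, path: str) -> None:
        self.observed.add(self.normalize(path))

    def is_anomalous(self, path: str) -> bool:
        return self.normalize(path) not in self.observed

baseline = PathBaseline()
for p in ["/app/config/app.yaml", "/var/log/app/app.log", "/tmp/upload_1.dat"]:
    baseline.observe(p)

assert not baseline.is_anomalous("/tmp/upload_2.dat")  # same directory, in baseline
assert baseline.is_anomalous("/etc/passwd")            # never observed during baselining
```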
How the Baseline Is Structured
The baseline is not a flat list of observed values. It is organized by execution context — which API endpoint triggered the operation, which user role was active, which code path was executing. This context-sensitive structure is what allows the agent to distinguish between "this application normally reads SSL certificates during startup" and "this API endpoint accessed a certificate file it has never touched in 48 hours of observation."
The data model uses a hierarchical structure. At the top level, operation type (SQL, FS, network, subprocess). Below that, calling code path. Below that, specific operation parameters. Each node in the hierarchy stores frequency statistics: how often it appears, what range of values it produces, which other operations typically precede or follow it.
This frequency data matters because some operations are variable by design. A report-generating endpoint might read any number of different database tables depending on report type. The baseline captures that variability as a distribution, not as a fixed set. An operation is anomalous not merely because it is new, but because it falls outside the observed distribution in a statistically significant way.
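A toy version of that distribution check, using the report endpoint as the example: this is a hedged sketch with invented names (`OperationDistribution`, `rarity_threshold`), and the simple frequency cutoff stands in for whatever statistical test the real agent applies.

```python
from collections import Counter

class OperationDistribution:
    """Sketch of one frequency node in the baseline hierarchy. An
    operation is flagged when its observed probability falls below a
    rarity threshold, not simply when it is new."""

    def __init__(self, rarity_threshold: float = 0.005):
        self.counts: Counter[str] = Counter()
        self.total = 0
        self.rarity_threshold = rarity_threshold

    def observe(self, op: str) -> None:
        self.counts[op] += 1
        self.total += 1

    def is_anomalous(self, op: str) -> bool:
        if self.total == 0:
            return True  # no baseline yet: everything is suspect
        return (self.counts[op] / self.total) < self.rarity_threshold

# A report endpoint that read three tables during baselining:
dist = OperationDistribution()
for table, n in [("sales", 500), ("inventory", 300), ("customers", 200)]:
    for _ in range(n):
        dist.observe(table)

assert not dist.is_anomalous("inventory")  # well inside the distribution
assert dist.is_anomalous("credit_cards")   # never observed: probability zero
```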
The 48-Hour Window: Why Timing Matters
48 hours covers at least two full daily cycles. Most applications have traffic patterns that repeat daily — morning peak, afternoon batch jobs, nightly data exports, weekend maintenance windows. Two days captures these cycles without being so long that it includes rare events that would inflate the baseline and reduce detection sensitivity.
For applications with known weekly cycles (payroll systems, weekly reporting jobs), we recommend extending the observation window to seven days before enabling blocking mode. The dashboard shows baseline coverage metrics — what percentage of observed code paths have been seen at least three times, which is the threshold for confident baseline inclusion. An operator can watch the coverage metric climb toward 100% and switch to blocking mode when it stabilizes.
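The coverage metric itself is simple to express. The function below is an assumed reconstruction from the description above (the three-sighting threshold is stated in the text; the function name and signature are invented for illustration).

```python
def baseline_coverage(path_counts: dict[str, int], threshold: int = 3) -> float:
    """Fraction of observed code paths seen at least `threshold` times.
    Three sightings is the stated bar for confident baseline inclusion."""
    if not path_counts:
        return 0.0
    confident = sum(1 for n in path_counts.values() if n >= threshold)
    return confident / len(path_counts)

counts = {"GET /users": 120, "POST /orders": 45, "GET /reports/weekly": 1}
coverage = baseline_coverage(counts)
assert abs(coverage - 2 / 3) < 1e-9  # the weekly report path is not yet confident
```

An operator watching this number stabilize near 1.0 is watching rare-but-legitimate code paths (like that weekly report) accumulate enough sightings to enter the baseline.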
Background jobs that run every few hours can still be missing from a 48-hour window if they were paused during deployment. The agent flags these as "code paths with no baseline" when they first execute after blocking mode is enabled, and requests operator confirmation before applying block rules to them.
Handling Legitimate Application Changes
Applications change. New features ship, query patterns evolve, code gets refactored. The baseline cannot be a static snapshot that breaks every time a developer adds a feature. Raven.io handles this through a continuous recalibration model.
When a deployment occurs (detected via the agent's integration with deployment metadata, or by a manual recalibration trigger in the dashboard), the agent enters a 4-hour grace window for the new code paths. New operations observed during this window are logged but not blocked. After the grace window, they are incorporated into the baseline if they match the structural patterns of existing code (same query templates, same file access patterns, just new paths or columns). If the new operations look structurally different — new database tables, new external network calls — they remain flagged for operator review.
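The decision logic in that paragraph can be sketched as a small classifier. Everything here is illustrative: the function and argument names are invented, and "structural match" is reduced to template-set membership for the example.

```python
from datetime import datetime, timedelta

GRACE_WINDOW = timedelta(hours=4)

def classify_new_operation(op_structure: str,
                           known_structures: set[str],
                           deploy_time: datetime,
                           now: datetime) -> str:
    """Post-deployment handling of an operation not in the baseline.
    Returns 'log_only', 'absorb_into_baseline', or 'flag_for_review'."""
    if now - deploy_time < GRACE_WINDOW:
        # Inside the grace window: record, never block.
        return "log_only"
    if op_structure in known_structures:
        # Structurally matches existing behavior (same query template,
        # same file-access pattern): fold into the baseline.
        return "absorb_into_baseline"
    # Structurally novel (new table, new external call): human review.
    return "flag_for_review"

deploy = datetime(2025, 1, 10, 9, 0)
known = {"SELECT id, email FROM users WHERE id = ?"}

assert classify_new_operation("SELECT id FROM users WHERE email = ?",
                              known, deploy, deploy + timedelta(hours=2)) == "log_only"
assert classify_new_operation("SELECT * FROM payment_tokens",
                              known, deploy, deploy + timedelta(hours=6)) == "flag_for_review"
```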
This approach keeps the baseline current without creating a window that attackers can exploit by timing their attacks around deployments.
False Positive Control
Behavioral detection is only valuable if the false positive rate is low enough that alert fatigue does not set in. In our production deployments, teams see an average of 2-4 confirmed anomaly alerts per day in active-block mode, with fewer than 8% requiring operator investigation before being confirmed as legitimate. The rest are either confirmed attacks or benign anomalies that the operator tunes out with a one-click rule exclusion.
The most common source of false positives is integration with third-party monitoring agents that also instrument the same runtime layer. When a new APM tool is installed alongside Raven.io, the APM tool's instrumentation sometimes generates file system and network operations that the Raven.io baseline does not recognize. The agent detects these as anomalies until the operator marks the APM tool's process namespace as trusted. This is a known integration pattern and is documented in the setup guide.
When Behavior-Based Detection Catches What Signatures Miss
The value of behavioral baselining becomes clearest when an attack does not use any recognizable signatures. Consider a server-side template injection attack against a Python Flask application. The attacker sends a crafted template expression that causes the application to execute Python code. The WAF sees a POST request with what looks like a mathematical expression. The input filter sees no SQL keywords, no path traversal sequences, no known malicious patterns.
What Raven.io sees is a subprocess invocation from a web request handler. The application's baseline does not include any subprocess calls originating from the template rendering code path. That deviation — subprocess execution from a code path that has never spawned subprocesses — triggers a block regardless of the payload content. The agent does not need to know anything about template injection syntax to catch it.
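Stripped to its essence, the check is payload-agnostic: it looks only at where the subprocess call came from. The sketch below uses invented names and a dotted-module string to stand in for whatever code-path identifier the real agent records.

```python
def check_subprocess_call(code_path: str,
                          baseline_subprocess_paths: set[str]) -> str:
    """Block any subprocess invocation whose originating code path never
    spawned one during baselining; the payload is never inspected."""
    return "allow" if code_path in baseline_subprocess_paths else "block"

# The baseline knows one legitimate spawner (say, a PDF export worker);
# template rendering never spawned a subprocess in 48 hours.
baseline = {"app.export.render_pdf"}

assert check_subprocess_call("app.templates.render", baseline) == "block"
assert check_subprocess_call("app.export.render_pdf", baseline) == "allow"
```

Because the decision depends on the code path rather than the input, the same rule blocks template injection, deserialization gadget chains, or any other technique that ends in an unexpected `exec`.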
This behavior-first detection model is why Raven.io caught 93% of novel payloads in OWASP's 2024 benchmark testing, compared to 41% for signature-only WAF configurations. The remaining 7% were attacks confined entirely to in-memory data manipulation with no file system, network, or subprocess impact — a category where behavioral observability is inherently limited.
What the Baseline Does Not Cover
Behavioral baselining has limits that are important to understand clearly. It does not detect attacks that stay within normal behavior — for example, an attacker who has stolen valid credentials and is doing authorized data access at a higher volume than usual. Rate-based anomalies like credential stuffing and scraping are handled by a separate frequency analysis module, not the structural baseline.
It also does not detect vulnerabilities in business logic where the attack uses legitimate application features. An attacker exploiting an IDOR vulnerability by changing a user ID parameter to access another user's records uses a code path and SQL query that match the baseline exactly. IDOR detection requires session context analysis layered on top of the structural baseline, which is on the product roadmap for Q3 2025.
Understanding what a detection model covers and what it does not is how you build a complete security program. Behavioral baselining is not a silver bullet — it is a layer that catches a category of attacks that signature-based tools structurally cannot. Stack it with good SAST tooling, code review, and session monitoring to cover the gaps.
See Your Application's Behavioral Baseline
Run Raven.io in observation mode for 48 hours against your staging environment. The baseline dashboard shows you every SQL query, file access, and network call your application makes — visualized and organized by code path.
Start a Free Pilot