DEEP DIVE: The Context Graph - The Missing System of Record for the Agentic SOC

Agentic SOC Series

Jun 15, 2026

Attackers operate at machine speed. The SOC does not. Sorry for the longer post, but this issue has been bugging me for some time and warrants attention from all of us.

This is not news. It has been the central tension in security operations for two decades. What has changed is the structural consequence. Even with an adversary that pivots in seconds and detection that fires in minutes, the human analyst’s oldest weakness - finding the context to make a defensible decision - was only a productivity tax. Frustrating but survivable. The analyst dug, asked around, and eventually pieced enough together to act.

When agents enter the picture, that same weakness becomes a structural blocker.

An agent can triage an alert in seconds. It can correlate across data sources, check enrichment APIs, and draft a response recommendation faster than a human can open the incident. Then the agent asks the only question that actually matters - has this pattern been seen before, how did we handle it, what did we decide and why - and the answer is silence.

Not because the information does not exist somewhere. Because it was never captured as a durable artifact.

The Core Problem: Rules Without Decision Traces

Jaya Gupta and Ashu Garg recently articulated a distinction that cleans this up: agents do not need more rules, they need decision traces (Foundation Capital, December 2025). They introduced the term context graph to describe the missing system of record that captures not just what happened, but how decisions were made and why.

The insight translates directly to security operations.

We have rules. Lots of them. Detection analytics, correlation logic, playbook steps, response policies, compliance frameworks, access governance rules. The entire security stack is built on rules.

What we do not have is decision traces.

A decision trace captures the full provenance of a triage, hunt, or response decision. What inputs were weighed? What alternatives were considered? What precedent was cited? What was ruled out and why? What was the outcome, and who or what made the call?

Today, that trace evaporates the moment the analyst closes the incident or the playbook completes. The outcome survives - incident status changed to “Closed: True Positive.” The reasoning does not.

Translating the Framework to Security Operations

Here’s the term-by-term map from Gupta and Garg’s framework to our domain.

System of Record. In security operations, the canonical systems of record are the SIEM (Microsoft Sentinel), the XDR (Microsoft Defender XDR), the ITSM (ServiceNow, Jira), and the IAM directory (Microsoft Entra ID). Each owns a slice of truth: events, incidents, tickets, identities.

Rule. A detection analytic. A correlation query. A playbook step. An access policy. A compliance control. Rules encode what to look for and what to do when you find it.

Decision Trace. The artifact that captures why an analyst triaged an alert the way they did. Why a hunt concluded without findings. Why an incident was closed as benign. Why a suppression was applied. Why access was granted or revoked. The full provenance of the judgment, not just the outcome.

Context Graph. Decision traces joined to the security entity graph. Entities (users, devices, applications, IP addresses, files) connected through both the history of what happened to them and the history of what we decided about them. When you query an entity, you see what it did and the reasoning behind every decision made about it.

System of Agents. In the Microsoft security ecosystem, this is Security Copilot, Azure AI Foundry agents, Foundry IQ retrieval, Microsoft Agent 365, the Sentinel MCP Server, and the emerging Security Store. The orchestration layer that coordinates reasoning, action, and memory.

Glue Function. The SOC itself. Security operations exists because no system of record owns the cross-functional workflow. IT, engineering, identity, compliance, and the business each have their own systems. The SOC is where those systems meet when something goes wrong.

That last point is the structural argument for why the security context graph is a net-new system of record, not a feature of an existing product.

Why the SOC Is the Glue Function

Security operations has always been a coordination problem dressed up as a detection problem.

The SIEM collects events. XDR correlates threats. IAM governs access. ITSM tracks tickets. Compliance maps controls. Each system does its job. None of them owns the workflow that spans all of them.

When an OAuth consent phishing campaign hits your environment, the investigation touches Entra ID (which app was granted consent), Microsoft Defender for Cloud Apps (what did the malicious app do), Microsoft Defender for Endpoint (did the user’s device show signs of compromise before the consent), email logs (how did the phishing message arrive), HR systems (is this user in a sensitive role), and sometimes legal and compliance.

No single system of record holds the complete picture. The SOC analyst is the integration layer - the human API that queries each system, synthesizes the results, and makes a judgment call.

In the agentic SOC, agents inherit that glue function. They query across systems, correlate signals, and propose actions. But unlike the human analyst, agents cannot rely on institutional memory, hallway conversations, or the vague recollection that we saw something like this last quarter.

Agents need the context graph.

When Precedent Goes Missing

Let’s walk through a scenario I have seen variations of across multiple environments. No real customer names. The pattern is common.

An OAuth consent phishing campaign targets your organization. Users receive emails impersonating a vendor, asking them to grant permissions to a malicious application. The attack lands three times over six months, targeting different users each time.

Incident One (March). An alert fires on unusual OAuth consent activity. A Tier 2 analyst investigates, determines the app is malicious, revokes consent, and closes the incident as True Positive. The analyst adds a comment: “Malicious app, user coached, access revoked, no evidence of data exfiltration.” Closed cleanly.

Incident Two (May). Same campaign, different user. The alert fires. A different analyst sees the same pattern but also sees that the user is a contractor with limited access. The analyst assesses the risk as low and applies a suppression: “Suppress alerts for this app ID for contractor accounts for 30 days while we implement a blanket block.” The suppression is applied. The blanket block never ships. No incident is created.

Incident Three (August). Same campaign again. This time the user is an executive with access to sensitive financial data. The alert fires. A Tier 1 analyst sees it.

Here’s the problem.

The Tier 1 analyst has no way to find the precedent. Incident One is closed. The analyst would need to know exactly what to search for and where. The suppression from Incident Two exists as a policy. It is still active because someone forgot to expire it, but there is no link between the suppression and the original context. The suppression says what was suppressed, not why, not who decided, and not what the original incident looked like.

The Tier 1 analyst treats this as a new pattern. The triage takes an hour instead of five minutes. The executive’s access to the malicious app stays live for that hour because nobody realized this was the third occurrence.

What survived in the platform: alerts, incident state, the suppression policy.

What did not survive: the why. The decision trace. The provenance that would have told the Tier 1 analyst: “We have seen this twice, here is what we learned, here is what to do.”

Now imagine that happening twenty times a day across your SOC. That’s the productivity tax.

Now imagine an agent trying to triage that third incident. The agent can query the alerts, check the incident status, even read the suppression policy. But the agent cannot find the human reasoning that would have accelerated the response. The agent is as blind to precedent as the Tier 1 analyst. Worse, actually, because the agent does not have the option to walk over to the Tier 2 desk and ask.

That’s the structural blocker.

The Platform Question: What Sentinel and Defender XDR Solve Today

Here is the direct version of what the Microsoft security platform delivers today and what is still missing.

What is solved: the data plane.

Microsoft Sentinel and Microsoft Defender XDR together deliver real substrate for security operations:

Three-tier, four-surface detection architecture. Sentinel analytics, Defender XDR correlations, and custom detections across endpoint, identity, cloud, and applications.
Unified incidents. Defender XDR aggregates related alerts into incidents with entity mapping.
ASIM normalization. Advanced Security Information Model for consistent querying across data sources.
Graph analytics. Sentinel’s graph capabilities connect entities through behavioral and temporal relationships.
MCP integration. The Sentinel MCP Server (preview-sensitive; verify feature status before production use) exposes data and operations to AI agents.
Foundry IQ retrieval. Azure AI Foundry’s grounding capabilities can index security documentation and knowledge with permission-aware retrieval.

This is real infrastructure. The data plane is where Microsoft has invested heavily, and it shows. Most of the industry is still catching up to it.

What is not solved: the decision plane.

The gap shows up in four places.

Incidents are state machines, not decision traces. An incident can be open, in progress, or closed with a classification and a closure reason. The incident record does not capture what alternatives were considered, what precedent was cited, or why certain hypotheses were ruled out. The state survives. The reasoning does not.

Suppressions are policy, not lineage. A suppression rule tells you what to suppress and for how long. It does not tell you why the suppression was created, what incident prompted it, who authorized it, or what conditions should trigger a review. Suppressions accumulate without provenance until someone runs a breach report and nobody can answer “why did we not see this?”

Agent runs are not consistently persisted as durable artifacts joined to the entity graph. When Security Copilot or a Foundry agent investigates an incident, the session may exist in audit logs, but it is not connected to the incident in a way that future agents or analysts can query as precedent. The agent’s reasoning is ephemeral.

The why-we-did-not-act trace is missing entirely. When an alert is dismissed, when an investigation concludes without findings, when a hunt comes up empty, what gets captured? An outcome. No action taken. What does not get captured: the reasoning that led there. The negative decision trace is the most absent artifact in security operations.

The Warehouse Question: Why Fabric and Databricks Are Not the Answer

A reasonable question comes up here: why not capture decision traces in Microsoft Fabric or Databricks since we are already using them for security analytics?

The answer is architectural.

Fabric and Databricks sit in the read path, not the write path. By the time data lands in the lakehouse via ETL, the decision context is gone. The incident is closed, the suppression is applied, the agent run is complete. You can analyze the outcome. You cannot capture the reasoning because you are downstream of the decision.

Decision traces have to be captured at orchestration time. The moment the analyst closes the incident. The moment the playbook completes. The moment the agent commits its recommendation. That is when the decision context exists. Five minutes later it is vapor.

Fabric and Databricks are great for retrospective analytics, ML training, SOC metrics, and BI reporting. They are the wrong layer for capturing decision traces.

Here is the layer model:

Layer | Function | Write path
Sentinel Analytics Tier | Hot detection | Events to Alerts
Defender XDR | Unified incident state | Alerts to Incidents
Agent Orchestration (Security Copilot, Foundry agents, Sentinel MCP) | Decision-trace capture if instrumented | Decisions to Trace artifacts
Sentinel Data Lake | Durable storage and query | Events and Traces
Fabric / Databricks | Downstream read path | Analytics, ML, BI

The decision trace belongs in the orchestration layer, persisted to the Sentinel data lake as a peer table to events. Not downstream in the warehouse.

Caveats before the architecture talk: Sentinel MCP Server, custom Sentinel graph patterns, parts of the Security Store, and some agent-assisted investigation experiences are preview-sensitive - verify feature status before production use. Microsoft Fabric and Databricks are referenced as comparison architectures, not as recommended decision-trace stores. The Cloudflare Project Glasswing reference is an offensive vulnerability research pattern using Mythos (currently Anthropic preview, not GA); I’m citing the Validator pattern, not claiming Cloudflare runs an agentic SOC. The rules-versus-decision-traces distinction and the term context graph are credited to Jaya Gupta and Ashu Garg (Foundation Capital, December 2025); the translation to security operations is mine.

What This Could Look Like: Architecture Options

I am framing these as options, not prescriptions. The right shape depends on your tooling and your operating maturity.

Option 1: Emit structured decision traces from every agent run.

Every agent run, whether Security Copilot, a Foundry agent, or a custom MCP-based workflow, emits a structured decision trace at commit time. The trace captures:

What inputs were consumed
What queries were executed
What alternatives were considered
What precedent was cited
What was ruled out and why
The final recommendation and outcome
Who or what made the call (human authority, agent autonomy, or hybrid)

Persist that trace to the Sentinel data lake as a peer table to security events. Same retention, same query surface, same access controls.
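To make the shape of such a trace concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the DecisionTrace class, the Authority values, the field names, and the example incident and trace ids are placeholders of mine, not a Sentinel table schema or a Security Copilot output format. It only shows that the seven items above fit in one flat record that can be serialized at commit time.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from enum import Enum


class Authority(Enum):
    """Who or what made the call (illustrative values)."""
    HUMAN = "human"
    AGENT = "agent"
    HYBRID = "hybrid"


@dataclass
class DecisionTrace:
    """One decision captured at commit time. Field names are assumptions."""
    incident_id: str
    inputs: list[str]            # what inputs were consumed
    queries: list[str]           # what queries were executed
    alternatives: list[str]      # what alternatives were considered
    precedent: list[str]         # ids of prior traces that were cited
    ruled_out: dict[str, str]    # hypothesis -> why it was ruled out
    recommendation: str
    outcome: str
    authority: Authority
    decided_by: str
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json_line(self) -> str:
        """Serialize to one JSON line for whatever sink persists the trace."""
        record = asdict(self)
        record["authority"] = self.authority.value  # enums are not JSON-native
        return json.dumps(record, sort_keys=True)


# Hypothetical trace for the third OAuth consent incident.
trace = DecisionTrace(
    incident_id="INC-EXAMPLE-3",
    inputs=["OAuth consent alert", "sign-in logs for the target user"],
    queries=["prior consents for this app id"],
    alternatives=["treat as new pattern", "revoke immediately"],
    precedent=["TRACE-MARCH", "TRACE-MAY"],
    ruled_out={"benign vendor app": "matches a previously confirmed campaign"},
    recommendation="revoke consent and escalate to Tier 2",
    outcome="consent revoked",
    authority=Authority.HYBRID,
    decided_by="tier1-analyst + triage-agent",
)
print(trace.to_json_line())
```

The timestamp is set when the record is created, which is the point: the trace is written at the moment of decision, not reconstructed later from the incident state.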

Option 2: Join traces to the entity graph.

Use Sentinel’s graph analytics to connect decision traces to the existing entity graph. When you query a user, device, or application, you see not only the events and behaviors associated with that entity but the history of decisions made about it.

“Show me all prior decisions involving this user identity” becomes a graph query. “Find precedent for this alert pattern” becomes a traversal.
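As a rough illustration of the difference between the query and the traversal, here is a toy in-memory version in Python. It does not use Sentinel’s graph capabilities; the ContextGraph class, the entity id format, and the trace dictionaries are all hypothetical stand-ins. The idea it sketches is only that traces are indexed by the entities they mention, so a lookup answers the first question and a short breadth-first walk answers the second.

```python
from collections import defaultdict


class ContextGraph:
    """Toy index: entity id -> decision traces that mention that entity."""

    def __init__(self) -> None:
        self._by_entity: dict[str, list[dict]] = defaultdict(list)

    def add_trace(self, trace: dict) -> None:
        for entity in trace["entities"]:
            self._by_entity[entity].append(trace)

    def decisions_for(self, entity: str) -> list[dict]:
        """Direct query: every decision that mentions this entity, oldest first."""
        return sorted(self._by_entity.get(entity, []), key=lambda t: t["decided_at"])

    def related_decisions(self, entity: str, hops: int = 2) -> list[dict]:
        """Traversal: walk entity -> trace -> co-mentioned entity, `hops` times."""
        seen = {entity}
        frontier = [entity]
        found: dict[str, dict] = {}
        for _ in range(hops):
            next_frontier = []
            for current in frontier:
                for trace in self._by_entity.get(current, []):
                    found[trace["trace_id"]] = trace
                    for neighbor in trace["entities"]:
                        if neighbor not in seen:
                            seen.add(neighbor)
                            next_frontier.append(neighbor)
            frontier = next_frontier
        return sorted(found.values(), key=lambda t: t["decided_at"])


graph = ContextGraph()
graph.add_trace({"trace_id": "TRACE-MARCH", "decided_at": "2025-03-12",
                 "entities": ["app:example-app", "user:employee-a"]})
graph.add_trace({"trace_id": "TRACE-MAY", "decided_at": "2025-05-20",
                 "entities": ["app:example-app", "user:contractor-b"]})

by_app = [t["trace_id"] for t in graph.decisions_for("app:example-app")]
one_hop = [t["trace_id"] for t in graph.related_decisions("user:employee-a", hops=1)]
two_hops = [t["trace_id"] for t in graph.related_decisions("user:employee-a", hops=2)]
```

Querying the application returns both prior decisions directly. Starting from the first user, one hop finds only the March decision; the second hop reaches the May suppression through the shared application, which is the link the Tier 1 analyst in the scenario had no way to follow.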

Option 3: Expose precedent retrieval to agents.

Build a Sentinel MCP tool that agents can invoke: query prior decisions for this alert pattern. Integrate with Foundry IQ so the agent’s grounded retrieval includes not just documentation and runbooks but prior decision traces.

The next decision can read the prior decisions. The context graph becomes queryable at triage time.
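The retrieval logic behind such a tool can be small. The sketch below is plain Python and deliberately contains no MCP SDK or Foundry IQ calls, since the wiring depends on your environment; query_prior_decisions, AlertPattern, and the trace fields are hypothetical names. It assumes traces carry a technique label and, where known, an application id, and it ranks exact application matches ahead of same-technique matches, newest first within each group.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AlertPattern:
    technique: str                 # e.g. "oauth-consent-phishing"
    app_id: Optional[str] = None   # narrows the match when known


def query_prior_decisions(pattern: AlertPattern, traces: list[dict],
                          limit: int = 5) -> list[dict]:
    """Return prior traces for this pattern: same app first, then newest."""
    scored = []
    for trace in traces:
        if trace["technique"] != pattern.technique:
            continue
        same_app = pattern.app_id is not None and trace.get("app_id") == pattern.app_id
        scored.append((2 if same_app else 1, trace["decided_at"],
                       trace["trace_id"], trace))
    scored.sort(key=lambda item: item[:3], reverse=True)
    return [item[3] for item in scored[:limit]]


def as_grounding_text(traces: list[dict]) -> str:
    """Flatten traces into lines an agent can cite in its recommendation."""
    return "\n".join(
        f"{t['decided_at']} {t['trace_id']}: {t['summary']}" for t in traces
    )


TRACES = [
    {"trace_id": "TRACE-MARCH", "technique": "oauth-consent-phishing",
     "app_id": "app-123", "decided_at": "2025-03-12",
     "summary": "malicious app confirmed, consent revoked, user coached"},
    {"trace_id": "TRACE-SPRAY", "technique": "password-spray",
     "decided_at": "2025-04-01", "summary": "blocked at conditional access"},
    {"trace_id": "TRACE-MAY", "technique": "oauth-consent-phishing",
     "app_id": "app-123", "decided_at": "2025-05-20",
     "summary": "contractor scope suppressed for 30 days pending blanket block"},
    {"trace_id": "TRACE-OTHER-APP", "technique": "oauth-consent-phishing",
     "app_id": "app-999", "decided_at": "2025-06-01",
     "summary": "different app, closed as benign after publisher check"},
]

hits = query_prior_decisions(AlertPattern("oauth-consent-phishing", "app-123"), TRACES)
print(as_grounding_text(hits))
```

Exact-match scoring like this is the simplest thing that could work. A real deployment would likely want fuzzier pattern matching and permission checks on which traces a given agent may read.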

What This Could Look Like: Process Options

Architecture alone does not solve this. You also need process discipline. And the process side does not require any new platform.

Structured incident closure.

Every analyst-closed incident records what was weighed, what was ruled out, and what precedent was cited. Structured fields, not freeform comments. A close artifact template the analyst completes, or one the agent drafts for human approval.

Suppression governance.

Every suppression and tuning change carries:

An authority (who approved this)
A precedent link (what incident or decision prompted it)
An expiration (when should this be reviewed)
A scope (what conditions must hold for the suppression to remain valid)

Suppressions without lineage are technical debt. They accumulate until a breach report asks “why did we not see this?” and nobody can answer.
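The four requirements above are easy to check mechanically. Here is a minimal sketch, assuming suppressions can be exported into a simple record; the Suppression class, its field names, and the example rule ids are hypothetical and do not correspond to any Sentinel or Defender XDR object. It flags a suppression for review when lineage is missing or the expiration date has passed.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Suppression:
    rule_id: str
    authority: str                 # who approved this
    precedent_link: str            # incident or decision that prompted it
    expires_on: Optional[date]     # when it should be reviewed
    scope: str                     # conditions that must hold to stay valid


def lineage_gaps(s: Suppression) -> list[str]:
    """Name every governance field that is missing."""
    gaps = []
    if not s.authority.strip():
        gaps.append("authority")
    if not s.precedent_link.strip():
        gaps.append("precedent_link")
    if s.expires_on is None:
        gaps.append("expiration")
    if not s.scope.strip():
        gaps.append("scope")
    return gaps


def needs_review(s: Suppression, today: date) -> bool:
    """Review when lineage is incomplete or the expiration has passed."""
    expired = s.expires_on is not None and s.expires_on <= today
    return expired or bool(lineage_gaps(s))


# The May suppression as the scenario left it: what was suppressed, nothing else.
ungoverned = Suppression("suppress-example-app", authority="", precedent_link="",
                         expires_on=None, scope="contractor accounts, this app id")

# The same suppression with lineage attached.
governed = Suppression("suppress-example-app", authority="senior analyst",
                       precedent_link="TRACE-MARCH", expires_on=date(2025, 6, 19),
                       scope="contractor accounts, this app id")

print(lineage_gaps(ungoverned))
print(needs_review(governed, date(2025, 8, 1)))
```

Run on a schedule against the full suppression list, a check like this surfaces the forgotten 30-day rule from the scenario before the third incident, not after it.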

Meaningful Validator review.

The Validator pattern is an adversarial review function that challenges agent recommendations before action. It only works if there is a decision trace to dispute. Without the context graph, adversarial validation is theater. “I disagree with the agent’s recommendation” means nothing if you cannot see the agent’s reasoning.

Cloudflare’s Project Glasswing (using Mythos, currently Anthropic preview) demonstrates the Validator pattern in the vulnerability research domain: autonomous agents with human validators who can dispute, redirect, or halt agent runs. The pattern applies to SOC operations, but only if the decision trace exists.

The On-Ramp: Human-in-the-Loop With Trace Emission

None of this requires day-one full autonomy.

Start with human-in-the-loop operations. The agent proposes; the human disposes. What changes is the discipline of trace emission.

When the analyst approves an agent recommendation, the trace captures the agent’s reasoning, the analyst’s judgment, the inputs both considered, and the outcome. When the analyst rejects an agent recommendation, the trace captures what the agent proposed, why the analyst disagreed, what the analyst did instead, and the outcome.
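A sketch of that discipline in code, under the assumption that the approval step runs through something you control (a SOAR action, a bot, a thin wrapper around the agent). The AgentProposal class, record_review, and the field names are hypothetical. The one design choice worth noting is that a rejection is refused unless it records what the analyst did instead, because that is the half of the trace most likely to be skipped.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentProposal:
    incident_id: str
    recommendation: str
    reasoning: str
    inputs: tuple[str, ...]


def record_review(proposal: AgentProposal, analyst: str, approved: bool,
                  analyst_rationale: str, outcome: str,
                  action_taken: Optional[str] = None) -> dict:
    """Build one trace covering both the agent's proposal and the human call."""
    if not analyst_rationale.strip():
        raise ValueError("the analyst's rationale is required either way")
    if not approved and not action_taken:
        raise ValueError("a rejection must record what the analyst did instead")
    return {
        "incident_id": proposal.incident_id,
        "agent_proposal": proposal.recommendation,
        "agent_reasoning": proposal.reasoning,
        "inputs": list(proposal.inputs),
        "analyst": analyst,
        "verdict": "approved" if approved else "rejected",
        "analyst_rationale": analyst_rationale,
        # On approval the action is the proposal itself.
        "action_taken": action_taken or proposal.recommendation,
        "outcome": outcome,
    }


proposal = AgentProposal(
    incident_id="INC-EXAMPLE-3",
    recommendation="revoke consent and notify the executive",
    reasoning="third occurrence of a confirmed malicious app",
    inputs=("OAuth consent alert", "TRACE-MARCH", "TRACE-MAY"),
)

approved_trace = record_review(proposal, "tier1-analyst", True,
                               "precedent verified", "consent revoked")
rejected_trace = record_review(proposal, "tier1-analyst", False,
                               "executive is travelling, coordinate first",
                               "revoked after call",
                               action_taken="called executive, then revoked consent")
```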

The context graph compounds whether the human or the agent makes the call, as long as the workflow layer captures the inputs, the rationale, and the outcome.

This is the on-ramp. Agentic SOC operations do not require flipping a switch from human-operated to autonomous. They require building the infrastructure of decision memory incrementally, starting now, while humans are still in the loop. The moment you turn up more agent autonomy, the absence of that memory becomes the bottleneck.

The Same Scenario With a Context Graph

Picture the third OAuth consent phishing incident with a context graph in place.

The agent triages the alert. Before proposing action, the agent queries precedent: have we seen this pattern before?

The context graph returns two decision traces:

March incident. Malicious app confirmed, user coached, access revoked, no exfiltration. True Positive.
May suppression. Same app ID, contractor accounts, suppression applied for 30 days (now expired). Authority: senior analyst. Precedent: March incident. Blanket block never shipped.

The agent synthesizes: this is the third occurrence of a known malicious campaign. Prior decisions indicate immediate revocation and user coaching. The current target is an executive with sensitive access, higher risk than prior cases. Recommend immediate revocation, executive notification, and escalation to Tier 2 for impact assessment.

The Tier 1 analyst sees the agent’s recommendation with full provenance. The analyst verifies the precedent, agrees with the escalation, and approves the action. Five minutes instead of an hour.

That’s the transformation. Triage with memory. Detection that compounds into institutional learning instead of disappearing into closed-incident archives.

What to Do This Week

Start by picking one workflow - the closure path on a single incident type, or one agent run that already exists in your environment - and define what a decision trace would look like for it. Three structured fields will do: inputs weighed, precedent cited, authority. Capture it as a comment template, a SOAR variable, or a Foundry agent output. You do not need the full graph to start. You need one repeatable trace that survives the close, and a place to put it that someone other than the original analyst can find. That is the next step that actually compounds.
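If the comment template route is the easiest place to start, the three fields can be made machine-readable with very little code. This is a sketch only: the field names, the “key: value” line format, and both function names are my assumptions, and values are assumed to be single-line. The point is that the same text an analyst pastes into an incident comment can be parsed back out later by a script or an agent.

```python
FIELDS = ("inputs_weighed", "precedent_cited", "authority")


def render_close_comment(**values: str) -> str:
    """Render the three fields as 'key: value' lines; refuse to skip any."""
    missing = [name for name in FIELDS if not values.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing decision-trace fields: {', '.join(missing)}")
    return "\n".join(f"{name}: {values[name].strip()}" for name in FIELDS)


def parse_close_comment(comment: str) -> dict:
    """Recover the fields from a comment, ignoring any unrelated lines."""
    parsed = {}
    for line in comment.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() in FIELDS:
            parsed[key.strip()] = value.strip()
    return parsed


comment = render_close_comment(
    inputs_weighed="OAuth consent alert; sign-in logs; app publisher check",
    precedent_cited="TRACE-MARCH, TRACE-MAY",
    authority="tier1-analyst, approved by tier2",
)
print(comment)
round_trip = parse_close_comment("Closing as True Positive.\n" + comment)
```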

Next up is Article 07, the Trifecta - why playbooks, humans, and agents all stay in the agentic SOC. They are not competing operating models. They are complementary roles with different strengths, and the context graph is the shared memory that makes them work together instead of colliding. From here, explore the Trifecta post when it lands and try the one-workflow exercise above against your own environment.
