Optimize IAST Vulnerability Detection by jandro996 · Pull Request #8885 · DataDog/dd-trace-java · GitHub
Skip to content

Optimize IAST Vulnerability Detection#8885

Merged
jandro996 merged 34 commits into
masterfrom
alejandro.gonzalez/Optimize-IAST-Vulnerability-Detection
Jul 3, 2025
Merged

Optimize IAST Vulnerability Detection#8885
jandro996 merged 34 commits into
masterfrom
alejandro.gonzalez/Optimize-IAST-Vulnerability-Detection

Conversation

@jandro996

@jandro996 jandro996 commented May 26, 2025

Copy link
Copy Markdown
Member

What Does This Do

Implements the new algorithm for detecting IAST vulnerabilities, where vulnerabilities that were already explored in previous runs for a given endpoint are skipped, ensuring that all remaining ones are eventually explored.

This addresses the current limitation where only the first matching vulnerabilities are consistently reported, causing others to remain hidden.

Changes to OverheadContext

The OverheadContext class has been extended to support three separate tracking maps:

  1. globalMap
  • Used to track vulnerability detection counts per endpoint across all requests.
  • Keys are strings combining the request method and route (GET /login, POST /submit, etc.).
  • Values are maps from vulnerabilityType → int (count of occurrences).
  • Capped at 4,096 entries using a clear‐on‐overflow strategy, to ensure bounded memory usage.
  • Oldest entries are cleared once the limit is reached.
  1. copyMap
  • Created per request to copy the global counts at the start of the request, ensuring a consistent baseline to compare against throughout the lifecycle of the request.
  1. requestMap
  • Tracks vulnerability type counts within the request.

An additional field, isGlobal, has been added to indicate whether the context is global or request-scoped. If isGlobal is true, the maps are not used, and quota checks proceed using the global strategy only.

A new method, resetMaps(), has been added to update globalMap when the request ends and vulnerability data has been reported. Two scenarios are supported:

  • Case 1: Budget not fully used → The entry for the endpoint in globalMap is cleared, since the request stayed within budget.
  • Case 2: Budget fully used → The counts from requestMap are compared to those in copyMap. For each vulnerability type, if the value in requestMap is greater, it is used to update the corresponding entry in globalMap.

Changes to OverheadController

The method consumeQuota() has been extended to receive a vulnerabilityType and modified to support the new logic:

If an OverheadContext is present and not global, and there is remaining quota and a valid span, the controller now invokes a new method maybeSkipVulnerability() to determine whether quota should actually be consumed or not, based on endpoint-specific history.

It's better to check the Algorithm execution example flow diagram to understand how this should work

Changes to IastRequestContext

In releaseRequestContext(), the request now calls resetMaps() on the associated OverheadContext, ensuring globalMap is updated at the end of each request.

Motivation

[RFC-1029] Optimizing IAST Vulnerability Detection implementation

Additional Notes

java tracer needs to implement also [RFC-1029-A1] Solution for dynamic http routes

Contributor Checklist

Jira ticket: APPSEC-57267

@pr-commenter

pr-commenter Bot commented May 29, 2025

Copy link
Copy Markdown

@jandro996 jandro996 force-pushed the alejandro.gonzalez/Optimize-IAST-Vulnerability-Detection branch from 600341f to 416dbae Compare May 30, 2025 10:34
jandro996 and others added 2 commits June 2, 2025 11:30
…ad/OverheadController.java

Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>
@jandro996 jandro996 marked this pull request as ready for review June 3, 2025 11:11
@jandro996 jandro996 requested a review from a team as a code owner June 3, 2025 11:11
@jandro996 jandro996 requested review from robertpi and smola June 3, 2025 11:11
@github-actions

github-actions Bot commented Jun 3, 2025

Copy link
Copy Markdown
Contributor

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@jandro996 jandro996 added type: enhancement Enhancements and improvements comp: asm iast Application Security Management (IAST) labels Jun 3, 2025
Comment thread dd-java-agent/agent-iast/src/main/java/com/datadog/iast/IastGlobalContext.java Outdated
@jandro996 jandro996 requested a review from smola June 5, 2025 09:46
import org.springframework.web.bind.annotation.RestController;

@RestController
public class IastSamplingController {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we introduce a test with different code paths that can be controlled though query/path parameters?, the order in which we trigger vulns is always deterministic and it would be nice to also test with some variability.

@jandro996 jandro996 requested a review from smola June 27, 2025 11:40

@smola smola left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like an extra reviewer on this.
On the startup time regression: I think we can accept the 1ms startup regression for IAST in pet clinic. It does not affect scenarios other than DD_IAST_ENABLED=true, and the improved detection rate here is worth it.

]
}

void 'test'() {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you improve the name of the test?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was this discussed in the working group?, I did wonder if the same behavior happens in other tracers or its just our implementation.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, there is a discussion in the working group slack channel and also was mentioned in the last meeting. It should happen in al the tracer as is an algorithm issue not an implementation one.

@manuel-alvarez-alvarez manuel-alvarez-alvarez left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but we have to discuss in the working group the issue about the algorithm degrading when the codepaths can change.

@jandro996 jandro996 merged commit 99ecab7 into master Jul 3, 2025
508 checks passed
@jandro996 jandro996 deleted the alejandro.gonzalez/Optimize-IAST-Vulnerability-Detection branch July 3, 2025 11:27
@github-actions github-actions Bot added this to the 1.51.0 milestone Jul 3, 2025
svc-squareup-copybara pushed a commit to cashapp/misk that referenced this pull request Jul 10, 2025
| Package | Type | Package file | Manager | Update | Change |
|---|---|---|---|---|---|
|
[com.google.errorprone:error_prone_annotations](https://errorprone.info)
([source](https://github.com/google/error-prone)) | dependencies |
misk/gradle/libs.versions.toml | gradle | minor | `2.39.0` -> `2.40.0` |
|
[org.apache.commons:commons-lang3](https://commons.apache.org/proper/commons-lang/)
([source](https://gitbox.apache.org/repos/asf/commons-lang.git)) |
dependencies | misk/gradle/libs.versions.toml | gradle | minor |
`3.17.0` -> `3.18.0` |
|
[org.jetbrains.kotlinx.binary-compatibility-validator](https://github.com/Kotlin/binary-compatibility-validator)
| plugin | misk/gradle/libs.versions.toml | gradle | patch | `0.18.0` ->
`0.18.1` |
| [com.datadoghq:dd-trace-api](https://github.com/datadog/dd-trace-java)
| dependencies | misk/gradle/libs.versions.toml | gradle | minor |
`1.50.1` -> `1.51.0` |
| [software.amazon.awssdk:sdk-core](https://aws.amazon.com/sdkforjava) |
dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |
| [software.amazon.awssdk:sqs](https://aws.amazon.com/sdkforjava) |
dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |
|
[software.amazon.awssdk:dynamodb-enhanced](https://aws.amazon.com/sdkforjava)
| dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |
| [software.amazon.awssdk:dynamodb](https://aws.amazon.com/sdkforjava) |
dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |
| [software.amazon.awssdk:aws-core](https://aws.amazon.com/sdkforjava) |
dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |
| [software.amazon.awssdk:bom](https://aws.amazon.com/sdkforjava) |
dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |
| [software.amazon.awssdk:auth](https://aws.amazon.com/sdkforjava) |
dependencies | misk/gradle/libs.versions.toml | gradle | patch |
`2.31.77` -> `2.31.78` |

---

### Release Notes

<details>
<summary>google/error-prone
(com.google.errorprone:error_prone_annotations)</summary>

###
[`v2.40.0`](https://github.com/google/error-prone/releases/tag/v2.40.0):
Error Prone 2.40.0

Changes:

- Bug fixes and improvements
- Releases (including snapshots) have migrated from [OSSRH to the
Central Publisher
Portal](https://central.sonatype.org/pages/ossrh-eol/#process-to-migrate)

Full changelog:
google/error-prone@v2.39.0...v2.40.0

</details>

<details>
<summary>Kotlin/binary-compatibility-validator
(org.jetbrains.kotlinx.binary-compatibility-validator)</summary>

###
[`v0.18.1`](https://github.com/Kotlin/binary-compatibility-validator/releases/tag/0.18.1)

[Compare
Source](Kotlin/binary-compatibility-validator@0.18.0...0.18.1)

#### What's Changed

- Fixed a bug preventing use of cross-compilation support during KLIB
dump validation
\[[#&#8203;304](https://github.com/Kotlin/binary-compatibility-validator/issues/304)]\[[#&#8203;306](https://github.com/Kotlin/binary-compatibility-validator/issues/306)]

</details>

<details>
<summary>datadog/dd-trace-java (com.datadoghq:dd-trace-api)</summary>

###
[`v1.51.0`](https://github.com/DataDog/dd-trace-java/releases/tag/v1.51.0):
1.51.0

### Components

#### Application Security Management (IAST)

- 🐛 Fix verify error when ctor params are used after a call site
([#&#8203;9083](DataDog/dd-trace-java#9083) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))
- 🐛 Limit the maximum size of the location path in IAST
vulnerabilities
([#&#8203;9028](DataDog/dd-trace-java#9028) -
[@&#8203;jandro996](https://github.com/jandro996))
- 🐛 Fix IAST gRPC handler with null superclass
([#&#8203;8984](DataDog/dd-trace-java#8984) -
[@&#8203;smola](https://github.com/smola))
- ✨ Optimize IAST Vulnerability Detection
([#&#8203;8885](DataDog/dd-trace-java#8885) -
[@&#8203;jandro996](https://github.com/jandro996))

#### Application Security Management (WAF)

- ✨ Upgrade libddwaf-java to 15.0.0
([#&#8203;9022](DataDog/dd-trace-java#9022) -
[@&#8203;sezen-datadog](https://github.com/sezen-datadog))
- ✨ Extract RestEasy json body response schemas
([#&#8203;9015](DataDog/dd-trace-java#9015) -
[@&#8203;jandro996](https://github.com/jandro996))
- ✨ Extract Jersey json body response schemas
([#&#8203;9014](DataDog/dd-trace-java#9014) -
[@&#8203;jandro996](https://github.com/jandro996))
- ✨ Extract Ratpack json body response schemas
([#&#8203;9013](DataDog/dd-trace-java#9013) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))
- ✨ Enable API Security by default and make it lazy loading
([#&#8203;9009](DataDog/dd-trace-java#9009) -
[@&#8203;smola](https://github.com/smola))
- ✨ Extract Vert.x json body response schemas
([#&#8203;9001](DataDog/dd-trace-java#9001) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))
- ✨ Extract Play json body response schemas
([#&#8203;8995](DataDog/dd-trace-java#8995) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))
- 🐛 Fix Jackson nodes introspection for request/response schema
extraction
([#&#8203;8980](DataDog/dd-trace-java#8980) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))
- ✨ Extract Spring json body response schemas
([#&#8203;8938](DataDog/dd-trace-java#8938) -
[@&#8203;sezen-datadog](https://github.com/sezen-datadog))
- ✨ Default obfuscation regexp update
([#&#8203;8937](DataDog/dd-trace-java#8937) -
[@&#8203;sezen-datadog](https://github.com/sezen-datadog))

#### Build & Tooling

- ✨ Cancel GitLab running pipeline on new PR push
([#&#8203;9023](DataDog/dd-trace-java#9023) -
[@&#8203;PerfectSlayer](https://github.com/PerfectSlayer))
- ✨ Migrate publishing to Maven Central Portal
([#&#8203;8807](DataDog/dd-trace-java#8807) -
[@&#8203;sarahchen6](https://github.com/sarahchen6))

#### Continuous Integration Visibility

- 🐛 Fix Test Optimization to work with JDK 24
([#&#8203;9114](DataDog/dd-trace-java#9114) -
[@&#8203;nikita-tkachenko-datadog](https://github.com/nikita-tkachenko-datadog))
- ✨ Add repo root as safe directory on git client creation
([#&#8203;9033](DataDog/dd-trace-java#9033) -
[@&#8203;daniel-mohedano](https://github.com/daniel-mohedano))
- ✨ Add PR number tag and improve PR information building
([#&#8203;8990](DataDog/dd-trace-java#8990) -
[@&#8203;daniel-mohedano](https://github.com/daniel-mohedano))
- ✨ Update impacted tests logic
([#&#8203;8923](DataDog/dd-trace-java#8923) -
[@&#8203;daniel-mohedano](https://github.com/daniel-mohedano))

#### Data Streams Monitoring

- 🧹 Clean up DSM context injection
([#&#8203;8776](DataDog/dd-trace-java#8776) -
[@&#8203;PerfectSlayer](https://github.com/PerfectSlayer))

#### Database Monitoring

- 🐛 Set trace\_injected in try block
([#&#8203;9025](DataDog/dd-trace-java#9025) -
[@&#8203;natashadada](https://github.com/natashadada))

#### Dynamic Instrumentation

- 🐛 Add source file tracking enable option
([#&#8203;9115](DataDog/dd-trace-java#9115) -
[@&#8203;jpbempel](https://github.com/jpbempel))
- ✨ Add java.util.Date support
([#&#8203;9111](DataDog/dd-trace-java#9111) -
[@&#8203;jpbempel](https://github.com/jpbempel))
- ✨ Update file probe format
([#&#8203;9047](DataDog/dd-trace-java#9047) -
[@&#8203;jpbempel](https://github.com/jpbempel))
- ✨ add safe local var hoisting
([#&#8203;9034](DataDog/dd-trace-java#9034) -
[@&#8203;jpbempel](https://github.com/jpbempel))
- 🧹 Add new config for debugger upload interval
([#&#8203;8959](DataDog/dd-trace-java#8959) -
[@&#8203;jpbempel](https://github.com/jpbempel))
- ✨ Enable Code Origin with Dynamic instrumentation
([#&#8203;8940](DataDog/dd-trace-java#8940) -
[@&#8203;jpbempel](https://github.com/jpbempel))

#### ML Observability (LLMObs)

- 💡 LLM Observability SDK
([#&#8203;8781](DataDog/dd-trace-java#8781) -
[@&#8203;gary-huang](https://github.com/gary-huang),
[@&#8203;nayeem-kamal](https://github.com/nayeem-kamal))

#### Metrics

- 🐛 Ensure client stat reporter is started when the agent is not
available at bootstrap
([#&#8203;9082](DataDog/dd-trace-java#9082) -
[@&#8203;amarziali](https://github.com/amarziali))
- ✨ Create metric: appsec.waf.config\_errors
([#&#8203;8394](DataDog/dd-trace-java#8394) -
[@&#8203;sezen-datadog](https://github.com/sezen-datadog))

#### Platform components

- ✨ Introduce environment component
([#&#8203;9071](DataDog/dd-trace-java#9071) -
[@&#8203;PerfectSlayer](https://github.com/PerfectSlayer))

#### Profiling

- 🐛 Remove annoying warning for smap event parsing
([#&#8203;9119](DataDog/dd-trace-java#9119) -
[@&#8203;jbachorik](https://github.com/jbachorik))
- 🐛 Fix ByteCountingInputStream when reading past EOF
([#&#8203;8988](DataDog/dd-trace-java#8988) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))

#### Realtime User Monitoring

- ✨ Add RUM SDK injection for servlet based web servers
([#&#8203;9110](DataDog/dd-trace-java#9110) -
[@&#8203;PerfectSlayer](https://github.com/PerfectSlayer)
[@&#8203;amarziali](https://github.com/amarziali))

#### Telemetry

- ✨ Update the config origin metric to match what it's mapping
([#&#8203;9045](DataDog/dd-trace-java#9045) -
[@&#8203;sezen-datadog](https://github.com/sezen-datadog))

#### Testing

- ✨ Add testing for latest stable version (JDK 24)
([#&#8203;8875](DataDog/dd-trace-java#8875) -
[@&#8203;sarahchen6](https://github.com/sarahchen6))

#### Trace context propagation

- 🐛 Fix bug with dropping baggage when
`TracePropagationBehaviorExtract=IGNORE`
([#&#8203;9037](DataDog/dd-trace-java#9037) -
[@&#8203;mhlidd](https://github.com/mhlidd))
- 🐛 Fix ArrayIndexOutOfBoundsException in PercentEscaper
([#&#8203;9032](DataDog/dd-trace-java#9032) -
[@&#8203;mhlidd](https://github.com/mhlidd))

#### Tracer core

- 🐛 Fix `Error` handling for trace interceptors
([#&#8203;9097](DataDog/dd-trace-java#9097) -
[@&#8203;AlexeyKuznetsov-DD](https://github.com/AlexeyKuznetsov-DD))
- 💡 Add wildcard feature for `DD_TRACE_HEADER_TAGS` and enabling
for Http Response headers
([#&#8203;9067](DataDog/dd-trace-java#9067) -
[@&#8203;mhlidd](https://github.com/mhlidd))

#### Tracer public API

- 💡 Add LLM Observability SDK
([#&#8203;8781](DataDog/dd-trace-java#8781) -
[@&#8203;gary-huang](https://github.com/gary-huang))

### Instrumentations

#### Akka instrumentation

- 🐛 Fix NPE in akka-http and pekko-http integrations
([#&#8203;9019](DataDog/dd-trace-java#9019) -
[@&#8203;mcculls](https://github.com/mcculls))

#### Eclipse Vert.x instrumentation

- ✨ Extract Vert.x json body response schemas
([#&#8203;9001](DataDog/dd-trace-java#9001) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))
- ✨ Write http.route tag as soon as possible in vert.x
([#&#8203;8952](DataDog/dd-trace-java#8952) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))

#### JAX-WS instrumentation

- 💡⚠️ Enable jax-ws integration by default
([#&#8203;9030](DataDog/dd-trace-java#9030) -
[@&#8203;bm1549](https://github.com/bm1549))
- ✨ Extract Jersey json body response schemas
([#&#8203;9014](DataDog/dd-trace-java#9014) -
[@&#8203;jandro996](https://github.com/jandro996))

#### Mule instrumentation

- 🐛 Propagate grizzly http span in filters if nothing is active
([#&#8203;9016](DataDog/dd-trace-java#9016) -
[@&#8203;amarziali](https://github.com/amarziali))

#### Play Framework instrumentation

- ✨ Extract Play json body response schemas
([#&#8203;8995](DataDog/dd-trace-java#8995) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))

#### Ratpack instrumentation

- ✨ Extract Ratpack json body response schemas
([#&#8203;9013](DataDog/dd-trace-java#9013) -
[@&#8203;manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez))

#### Spring instrumentation

- ✨ Extract Spring json body response schemas
([#&#8203;8938](DataDog/dd-trace-java#8938) -
[@&#8203;sezen-datadog](https://github.com/sezen-datadog))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - "after 6pm every weekday,before 2am
every weekday" in timezone Australia/Melbourne, Automerge - At any time
(no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Never, or you tick the rebase/retry checkbox.

👻 **Immortal**: This PR will be recreated if closed unmerged. Get
[config help](https://github.com/renovatebot/renovate/discussions) if
that's undesired.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Renovate
Bot](https://github.com/renovatebot/renovate).

GitOrigin-RevId: 649b690d4c9d7dcb572c457f0802b42b8e3e682e
sarahchen6 pushed a commit that referenced this pull request Jul 24, 2025
What Does This Do
Implements the new algorithm for detecting IAST vulnerabilities, where vulnerabilities that were already explored in previous runs for a given endpoint are skipped, ensuring that all remaining ones are eventually explored.

This addresses the current limitation where only the first matching vulnerabilities are consistently reported, causing others to remain hidden.

Changes to OverheadContext
The OverheadContext class has been extended to support three separate tracking maps:

globalMap
Used to track vulnerability detection counts per endpoint across all requests.
Keys are strings combining the request method and route (GET /login, POST /submit, etc.).
Values are maps from vulnerabilityType → int (count of occurrences).
Capped at 4,096 entries using a clear‐on‐overflow strategy, to ensure bounded memory usage.
Oldest entries are cleared once the limit is reached.
copyMap
Created per request to copy the global counts at the start of the request, ensuring a consistent baseline to compare against throughout the lifecycle of the request.
requestMap
Tracks vulnerability type counts within the request.
An additional field, isGlobal, has been added to indicate whether the context is global or request-scoped. If isGlobal is true, the maps are not used, and quota checks proceed using the global strategy only.

A new method, resetMaps(), has been added to update globalMap when the request ends and vulnerability data has been reported. Two scenarios are supported:

Case 1: Budget not fully used → The entry for the endpoint in globalMap is cleared, since the request stayed within budget.
Case 2: Budget fully used → The counts from requestMap are compared to those in copyMap. For each vulnerability type, if the value in requestMap is greater, it is used to update the corresponding entry in globalMap.
Changes to OverheadController
The method consumeQuota() has been extended to receive a vulnerabilityType and modified to support the new logic:

If an OverheadContext is present and not global, and there is remaining quota and a valid span, the controller now invokes a new method maybeSkipVulnerability() to determine whether quota should actually be consumed or not, based on endpoint-specific history.

It's better to check the Algorithm execution example flow diagram to understand how this should work

Changes to IastRequestContext
In releaseRequestContext(), the request now calls resetMaps() on the associated OverheadContext, ensuring globalMap is updated at the end of each request.

Motivation
[RFC-1029] Optimizing IAST Vulnerability Detection implementation

Additional Notes
java tracer needs to implement also [RFC-1029-A1] Solution for dynamic http routes

(cherry picked from commit 99ecab7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

comp: asm iast Application Security Management (IAST) type: enhancement Enhancements and improvements

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants