Optimize IAST Vulnerability Detection#8885
Conversation
600341f to
416dbae
Compare
…ad/OverheadController.java Co-authored-by: datadog-datadog-prod-us1[bot] <88084959+datadog-datadog-prod-us1[bot]@users.noreply.github.com>
|
Hi! 👋 Thanks for your pull request! 🎉 To help us review it, please make sure to:
If you need help, please check our contributing guidelines. |
| import org.springframework.web.bind.annotation.RestController; | ||
|
|
||
| @RestController | ||
| public class IastSamplingController { |
There was a problem hiding this comment.
Can we introduce a test with different code paths that can be controlled though query/path parameters?, the order in which we trigger vulns is always deterministic and it would be nice to also test with some variability.
smola
left a comment
There was a problem hiding this comment.
I'd like an extra reviewer on this.
On the startup time regression: I think we can accept the 1ms startup regression for IAST in pet clinic. It does not affect scenarios other than DD_IAST_ENABLED=true, and the improved detection rate here is worth it.
| ] | ||
| } | ||
|
|
||
| void 'test'() { |
There was a problem hiding this comment.
Can you improve the name of the test?
There was a problem hiding this comment.
Was this discussed in the working group?, I did wonder if the same behavior happens in other tracers or its just our implementation.
There was a problem hiding this comment.
Yes, there is a discussion in the working group slack channel and also was mentioned in the last meeting. It should happen in al the tracer as is an algorithm issue not an implementation one.
manuel-alvarez-alvarez
left a comment
There was a problem hiding this comment.
LGTM, but we have to discuss in the working group the issue about the algorithm degrading when the codepaths can change.
| Package | Type | Package file | Manager | Update | Change | |---|---|---|---|---|---| | [com.google.errorprone:error_prone_annotations](https://errorprone.info) ([source](https://github.com/google/error-prone)) | dependencies | misk/gradle/libs.versions.toml | gradle | minor | `2.39.0` -> `2.40.0` | | [org.apache.commons:commons-lang3](https://commons.apache.org/proper/commons-lang/) ([source](https://gitbox.apache.org/repos/asf/commons-lang.git)) | dependencies | misk/gradle/libs.versions.toml | gradle | minor | `3.17.0` -> `3.18.0` | | [org.jetbrains.kotlinx.binary-compatibility-validator](https://github.com/Kotlin/binary-compatibility-validator) | plugin | misk/gradle/libs.versions.toml | gradle | patch | `0.18.0` -> `0.18.1` | | [com.datadoghq:dd-trace-api](https://github.com/datadog/dd-trace-java) | dependencies | misk/gradle/libs.versions.toml | gradle | minor | `1.50.1` -> `1.51.0` | | [software.amazon.awssdk:sdk-core](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | | [software.amazon.awssdk:sqs](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | | [software.amazon.awssdk:dynamodb-enhanced](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | | [software.amazon.awssdk:dynamodb](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | | [software.amazon.awssdk:aws-core](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | | [software.amazon.awssdk:bom](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | | [software.amazon.awssdk:auth](https://aws.amazon.com/sdkforjava) | dependencies | misk/gradle/libs.versions.toml | gradle | patch | `2.31.77` -> `2.31.78` | --- ### Release Notes <details> <summary>google/error-prone (com.google.errorprone:error_prone_annotations)</summary> ### [`v2.40.0`](https://github.com/google/error-prone/releases/tag/v2.40.0): Error Prone 2.40.0 Changes: - Bug fixes and improvements - Releases (including snapshots) have migrated from [OSSRH to the Central Publisher Portal](https://central.sonatype.org/pages/ossrh-eol/#process-to-migrate) Full changelog: google/error-prone@v2.39.0...v2.40.0 </details> <details> <summary>Kotlin/binary-compatibility-validator (org.jetbrains.kotlinx.binary-compatibility-validator)</summary> ### [`v0.18.1`](https://github.com/Kotlin/binary-compatibility-validator/releases/tag/0.18.1) [Compare Source](Kotlin/binary-compatibility-validator@0.18.0...0.18.1) #### What's Changed - Fixed a bug preventing use of cross-compilation support during KLIB dump validation \[[#​304](https://github.com/Kotlin/binary-compatibility-validator/issues/304)]\[[#​306](https://github.com/Kotlin/binary-compatibility-validator/issues/306)] </details> <details> <summary>datadog/dd-trace-java (com.datadoghq:dd-trace-api)</summary> ### [`v1.51.0`](https://github.com/DataDog/dd-trace-java/releases/tag/v1.51.0): 1.51.0 ### Components #### Application Security Management (IAST) - 🐛 Fix verify error when ctor params are used after a call site ([#​9083](DataDog/dd-trace-java#9083) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) - 🐛 Limit the maximum size of the location path in IAST vulnerabilities ([#​9028](DataDog/dd-trace-java#9028) - [@​jandro996](https://github.com/jandro996)) - 🐛 Fix IAST gRPC handler with null superclass ([#​8984](DataDog/dd-trace-java#8984) - [@​smola](https://github.com/smola)) - ✨ Optimize IAST Vulnerability Detection ([#​8885](DataDog/dd-trace-java#8885) - [@​jandro996](https://github.com/jandro996)) #### Application Security Management (WAF) - ✨ Upgrade libddwaf-java to 15.0.0 ([#​9022](DataDog/dd-trace-java#9022) - [@​sezen-datadog](https://github.com/sezen-datadog)) - ✨ Extract RestEasy json body response schemas ([#​9015](DataDog/dd-trace-java#9015) - [@​jandro996](https://github.com/jandro996)) - ✨ Extract Jersey json body response schemas ([#​9014](DataDog/dd-trace-java#9014) - [@​jandro996](https://github.com/jandro996)) - ✨ Extract Ratpack json body response schemas ([#​9013](DataDog/dd-trace-java#9013) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) - ✨ Enable API Security by default and make it lazy loading ([#​9009](DataDog/dd-trace-java#9009) - [@​smola](https://github.com/smola)) - ✨ Extract Vert.x json body response schemas ([#​9001](DataDog/dd-trace-java#9001) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) - ✨ Extract Play json body response schemas ([#​8995](DataDog/dd-trace-java#8995) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) - 🐛 Fix Jackson nodes introspection for request/response schema extraction ([#​8980](DataDog/dd-trace-java#8980) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) - ✨ Extract Spring json body response schemas ([#​8938](DataDog/dd-trace-java#8938) - [@​sezen-datadog](https://github.com/sezen-datadog)) - ✨ Default obfuscation regexp update ([#​8937](DataDog/dd-trace-java#8937) - [@​sezen-datadog](https://github.com/sezen-datadog)) #### Build & Tooling - ✨ Cancel GitLab running pipeline on new PR push ([#​9023](DataDog/dd-trace-java#9023) - [@​PerfectSlayer](https://github.com/PerfectSlayer)) - ✨ Migrate publishing to Maven Central Portal ([#​8807](DataDog/dd-trace-java#8807) - [@​sarahchen6](https://github.com/sarahchen6)) #### Continuous Integration Visibility - 🐛 Fix Test Optimization to work with JDK 24 ([#​9114](DataDog/dd-trace-java#9114) - [@​nikita-tkachenko-datadog](https://github.com/nikita-tkachenko-datadog)) - ✨ Add repo root as safe directory on git client creation ([#​9033](DataDog/dd-trace-java#9033) - [@​daniel-mohedano](https://github.com/daniel-mohedano)) - ✨ Add PR number tag and improve PR information building ([#​8990](DataDog/dd-trace-java#8990) - [@​daniel-mohedano](https://github.com/daniel-mohedano)) - ✨ Update impacted tests logic ([#​8923](DataDog/dd-trace-java#8923) - [@​daniel-mohedano](https://github.com/daniel-mohedano)) #### Data Streams Monitoring - 🧹 Clean up DSM context injection ([#​8776](DataDog/dd-trace-java#8776) - [@​PerfectSlayer](https://github.com/PerfectSlayer)) #### Database Monitoring - 🐛 Set trace\_injected in try block ([#​9025](DataDog/dd-trace-java#9025) - [@​natashadada](https://github.com/natashadada)) #### Dynamic Instrumentation - 🐛 Add source file tracking enable option ([#​9115](DataDog/dd-trace-java#9115) - [@​jpbempel](https://github.com/jpbempel)) - ✨ Add java.util.Date support ([#​9111](DataDog/dd-trace-java#9111) - [@​jpbempel](https://github.com/jpbempel)) - ✨ Update file probe format ([#​9047](DataDog/dd-trace-java#9047) - [@​jpbempel](https://github.com/jpbempel)) - ✨ add safe local var hoisting ([#​9034](DataDog/dd-trace-java#9034) - [@​jpbempel](https://github.com/jpbempel)) - 🧹 Add new config for debugger upload interval ([#​8959](DataDog/dd-trace-java#8959) - [@​jpbempel](https://github.com/jpbempel)) - ✨ Enable Code Origin with Dynamic instrumentation ([#​8940](DataDog/dd-trace-java#8940) - [@​jpbempel](https://github.com/jpbempel)) #### ML Observability (LLMObs) - 💡 LLM Observability SDK ([#​8781](DataDog/dd-trace-java#8781) - [@​gary-huang](https://github.com/gary-huang), [@​nayeem-kamal](https://github.com/nayeem-kamal)) #### Metrics - 🐛 Ensure client stat reporter is started when the agent is not available at bootstrap ([#​9082](DataDog/dd-trace-java#9082) - [@​amarziali](https://github.com/amarziali)) - ✨ Create metric: appsec.waf.config\_errors ([#​8394](DataDog/dd-trace-java#8394) - [@​sezen-datadog](https://github.com/sezen-datadog)) #### Platform components - ✨ Introduce environment component ([#​9071](DataDog/dd-trace-java#9071) - [@​PerfectSlayer](https://github.com/PerfectSlayer)) #### Profiling - 🐛 Remove annoying warning for smap event parsing ([#​9119](DataDog/dd-trace-java#9119) - [@​jbachorik](https://github.com/jbachorik)) - 🐛 Fix ByteCountingInputStream when reading past EOF ([#​8988](DataDog/dd-trace-java#8988) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) #### Realtime User Monitoring - ✨ Add RUM SDK injection for servlet based web servers ([#​9110](DataDog/dd-trace-java#9110) - [@​PerfectSlayer](https://github.com/PerfectSlayer) [@​amarziali](https://github.com/amarziali)) #### Telemetry - ✨ Update the config origin metric to match what it's mapping ([#​9045](DataDog/dd-trace-java#9045) - [@​sezen-datadog](https://github.com/sezen-datadog)) #### Testing - ✨ Add testing for latest stable version (JDK 24) ([#​8875](DataDog/dd-trace-java#8875) - [@​sarahchen6](https://github.com/sarahchen6)) #### Trace context propagation - 🐛 Fix bug with dropping baggage when `TracePropagationBehaviorExtract=IGNORE` ([#​9037](DataDog/dd-trace-java#9037) - [@​mhlidd](https://github.com/mhlidd)) - 🐛 Fix ArrayIndexOutOfBoundsException in PercentEscaper ([#​9032](DataDog/dd-trace-java#9032) - [@​mhlidd](https://github.com/mhlidd)) #### Tracer core - 🐛 Fix `Error` handling for trace interceptors ([#​9097](DataDog/dd-trace-java#9097) - [@​AlexeyKuznetsov-DD](https://github.com/AlexeyKuznetsov-DD)) - 💡 Add wildcard feature for `DD_TRACE_HEADER_TAGS` and enabling for Http Response headers ([#​9067](DataDog/dd-trace-java#9067) - [@​mhlidd](https://github.com/mhlidd)) #### Tracer public API - 💡 Add LLM Observability SDK ([#​8781](DataDog/dd-trace-java#8781) - [@​gary-huang](https://github.com/gary-huang)) ### Instrumentations #### Akka instrumentation - 🐛 Fix NPE in akka-http and pekko-http integrations ([#​9019](DataDog/dd-trace-java#9019) - [@​mcculls](https://github.com/mcculls)) #### Eclipse Vert.x instrumentation - ✨ Extract Vert.x json body response schemas ([#​9001](DataDog/dd-trace-java#9001) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) - ✨ Write http.route tag as soon as possible in vert.x ([#​8952](DataDog/dd-trace-java#8952) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) #### JAX-WS instrumentation - 💡⚠️ Enable jax-ws integration by default ([#​9030](DataDog/dd-trace-java#9030) - [@​bm1549](https://github.com/bm1549)) - ✨ Extract Jersey json body response schemas ([#​9014](DataDog/dd-trace-java#9014) - [@​jandro996](https://github.com/jandro996)) #### Mule instrumentation - 🐛 Propagate grizzly http span in filters if nothing is active ([#​9016](DataDog/dd-trace-java#9016) - [@​amarziali](https://github.com/amarziali)) #### Play Framework instrumentation - ✨ Extract Play json body response schemas ([#​8995](DataDog/dd-trace-java#8995) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) #### Ratpack instrumentation - ✨ Extract Ratpack json body response schemas ([#​9013](DataDog/dd-trace-java#9013) - [@​manuel-alvarez-alvarez](https://github.com/manuel-alvarez-alvarez)) #### Spring instrumentation - ✨ Extract Spring json body response schemas ([#​8938](DataDog/dd-trace-java#8938) - [@​sezen-datadog](https://github.com/sezen-datadog)) </details> --- ### Configuration 📅 **Schedule**: Branch creation - "after 6pm every weekday,before 2am every weekday" in timezone Australia/Melbourne, Automerge - At any time (no schedule defined). 🚦 **Automerge**: Enabled. ♻ **Rebasing**: Never, or you tick the rebase/retry checkbox. 👻 **Immortal**: This PR will be recreated if closed unmerged. Get [config help](https://github.com/renovatebot/renovate/discussions) if that's undesired. --- - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box --- This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate). GitOrigin-RevId: 649b690d4c9d7dcb572c457f0802b42b8e3e682e
What Does This Do Implements the new algorithm for detecting IAST vulnerabilities, where vulnerabilities that were already explored in previous runs for a given endpoint are skipped, ensuring that all remaining ones are eventually explored. This addresses the current limitation where only the first matching vulnerabilities are consistently reported, causing others to remain hidden. Changes to OverheadContext The OverheadContext class has been extended to support three separate tracking maps: globalMap Used to track vulnerability detection counts per endpoint across all requests. Keys are strings combining the request method and route (GET /login, POST /submit, etc.). Values are maps from vulnerabilityType → int (count of occurrences). Capped at 4,096 entries using a clear‐on‐overflow strategy, to ensure bounded memory usage. Oldest entries are cleared once the limit is reached. copyMap Created per request to copy the global counts at the start of the request, ensuring a consistent baseline to compare against throughout the lifecycle of the request. requestMap Tracks vulnerability type counts within the request. An additional field, isGlobal, has been added to indicate whether the context is global or request-scoped. If isGlobal is true, the maps are not used, and quota checks proceed using the global strategy only. A new method, resetMaps(), has been added to update globalMap when the request ends and vulnerability data has been reported. Two scenarios are supported: Case 1: Budget not fully used → The entry for the endpoint in globalMap is cleared, since the request stayed within budget. Case 2: Budget fully used → The counts from requestMap are compared to those in copyMap. For each vulnerability type, if the value in requestMap is greater, it is used to update the corresponding entry in globalMap. Changes to OverheadController The method consumeQuota() has been extended to receive a vulnerabilityType and modified to support the new logic: If an OverheadContext is present and not global, and there is remaining quota and a valid span, the controller now invokes a new method maybeSkipVulnerability() to determine whether quota should actually be consumed or not, based on endpoint-specific history. It's better to check the Algorithm execution example flow diagram to understand how this should work Changes to IastRequestContext In releaseRequestContext(), the request now calls resetMaps() on the associated OverheadContext, ensuring globalMap is updated at the end of each request. Motivation [RFC-1029] Optimizing IAST Vulnerability Detection implementation Additional Notes java tracer needs to implement also [RFC-1029-A1] Solution for dynamic http routes (cherry picked from commit 99ecab7)

What Does This Do
Implements the new algorithm for detecting IAST vulnerabilities, where vulnerabilities that were already explored in previous runs for a given endpoint are skipped, ensuring that all remaining ones are eventually explored.
This addresses the current limitation where only the first matching vulnerabilities are consistently reported, causing others to remain hidden.
Changes to OverheadContext
The OverheadContext class has been extended to support three separate tracking maps:
An additional field, isGlobal, has been added to indicate whether the context is global or request-scoped. If isGlobal is true, the maps are not used, and quota checks proceed using the global strategy only.
A new method, resetMaps(), has been added to update globalMap when the request ends and vulnerability data has been reported. Two scenarios are supported:
Changes to OverheadController
The method consumeQuota() has been extended to receive a vulnerabilityType and modified to support the new logic:
If an OverheadContext is present and not global, and there is remaining quota and a valid span, the controller now invokes a new method maybeSkipVulnerability() to determine whether quota should actually be consumed or not, based on endpoint-specific history.
It's better to check the Algorithm execution example flow diagram to understand how this should work
Changes to IastRequestContext
In releaseRequestContext(), the request now calls resetMaps() on the associated OverheadContext, ensuring globalMap is updated at the end of each request.
Motivation
[RFC-1029] Optimizing IAST Vulnerability Detection implementation
Additional Notes
java tracer needs to implement also [RFC-1029-A1] Solution for dynamic http routes
Contributor Checklist
type:and (comp:orinst:) labels in addition to any usefull labelsclose,fixor any linking keywords when referencing an issue.Use
solvesinstead, and assign the PR milestone to the issueJira ticket: APPSEC-57267