iframe-proxy

aidandaly24 · 2026-06-25T15:06:57Z

Fixes #808
Fixes #809

Problem

The HTTP starter templates rebuild agent/session state on every invocation, so within one session there is no multi-turn recall (turn 2 can't see turn 1) — directly contradicting agentcore invoke --session-id. The Strands no-memory template is worse: it shares one global _agent across all invocations in a process, so conversation history bleeds between sessions/users in agentcore dev and non-Runtime deployments (Lambda/ECS/FastAPI). Affects strands, openaiagents, googleadk, langchain_langgraph, and autogen.

Fix

Each template now keys in-process agent/session state by context.session_id, bounded to 128 sessions with LRU eviction so a long-running process keeps per-session isolation without growing unbounded:

strands / autogen: per-session OrderedDict cache (move_to_end on hit, evict least-recently-used at the cap).
openaiagents: @lru_cache(maxsize=128) over per-session SQLiteSession.
googleadk: module-scoped InMemorySessionService bounded via an LRU key set, evicting via delete_session.
langchain_langgraph: module-scoped InMemorySaver bounded via an LRU thread set, evicting via delete_thread.

In-memory is best-effort (resets on cold start); for durable history, attach a session manager / AgentCore Memory.

Validation

Bug-bashed with real deploy + invoke per framework on a test account: per-session recall works and there is no cross-session leak (the #809 check) on all frameworks. Generated templates py_compile cleanly across flag combinations; asset snapshot regenerated; unit suite green.

…oss all HTTP starter base templates (aws#808, aws#809) All five HTTP starter base templates (strands, openaiagents, googleadk, langchain_langgraph, autogen) plus the regenerated assets snapshot. Only base/main.py template content changed; the strands hasMemory branch and all other framework files are untouched. Refs aws#808 Refs aws#809

github-actions · 2026-06-25T15:08:02Z

agentcore-devx-automation · 2026-06-25T15:08:23Z

Claude Security Review: no high-confidence findings. (run)

agentcore-cli-automation

Thanks for the fix — the per-session memory bug is real and the overall approach (key in-process state by context.session_id) is the right design. Most of the diff looks good; flagging a few correctness/scale issues that I think need to be addressed before merge.

The core concern: the PR description explicitly justifies the 128-entry cap for the openaiagents/autogen/strands templates ("caps process memory and evicted sessions restart fresh"), but the googleadk and langchain_langgraph templates use module-level in-memory stores (InMemorySessionService, InMemorySaver) with no bound at all. That just trades the original cross-session bleed (#809-style) for an unbounded memory leak in any long-running process serving many sessions (e.g. agentcore dev over a long session, or non-Runtime deployments which the PR description specifically calls out as the failure mode of #809). The fix should be consistent across all five templates.

The FIFO-vs-LRU comments inline are smaller but worth correcting since the docstrings claim LRU.

agentcore-cli-automation · 2026-06-25T15:10:29Z

-    session_service = InMemorySessionService()
-    session = await session_service.create_session(
+# Module-level session service and runner (preserves history across invocations)
+_session_service = InMemorySessionService()


Unbounded session growth. _session_service = InMemorySessionService() at module scope retains every (app_name, user_id, session_id) triple forever — there is no eviction path. The PR description argues that the 128-entry cap on the other templates "caps process memory and evicted sessions restart fresh", but this template has no cap, so a long-running process (notably agentcore dev over time, and any non-Runtime deployment) will grow without bound as new session ids arrive.

The ADK InMemorySessionService doesn't expose a public eviction API as far as I can see, so options are roughly:

Wrap session creation in a bounded FIFO/LRU and call _session_service.delete_session(...) on eviction (ADK does expose delete_session).

Keep a collections.OrderedDict of recently-touched (user_id, session_id) keys alongside the service and prune via delete_session once it exceeds N (e.g. 128, to match the other templates).

Subclass / wrap to enforce the cap at create-time.

Whichever you pick, please match the 128-entry behavior already documented for openaiagents/autogen/strands so the fix is consistent across templates.

agentcore-cli-automation · 2026-06-25T15:10:29Z

 tools = [add_numbers]

+# Module-level checkpointer preserves conversation history across invocations
+_checkpointer = InMemorySaver()


Unbounded checkpoint growth. _checkpointer = InMemorySaver() is module-level with no eviction, so every distinct thread_id (= session_id) keeps its checkpoint forever. Same correctness concern as the googleadk template: the 128-cap rationale in the PR description doesn't apply here, so this template will leak memory in any long-running process.

A couple of options:

Switch to a bounded saver (e.g. wrap InMemorySaver in something that tracks insertion order and calls _checkpointer.delete_thread(thread_id) past 128 entries — InMemorySaver exposes a storage dict you can prune).

Use SqliteSaver/AsyncSqliteSaver with a temp file path for the base template (durable, doesn't grow process memory).

At minimum, document the unboundedness in a comment and explicitly tell the reader to swap in a durable checkpointer for production — but a 128-cap to match the other templates is the cleanest fix.

agentcore-cli-automation · 2026-06-25T15:10:29Z


+# Reuses one AssistantAgent per session_id so each session keeps its own
+# in-process conversation history (best-effort; resets on cold start). Caches up
+# to 128 active sessions; the oldest is evicted and its history reset.


The eviction policy is FIFO, not LRU, despite the docstring. Plain dict preserves insertion order, and next(iter(_agents)) returns the oldest-inserted key. Reading an existing entry via _agents[session_id] does not move it to the end, so an actively-used session can be evicted while an idle one sticks around.

Two ways to fix:

Either drop the "LRU" claim from the comment (the policy is FIFO/insertion-order, which is fine if intended), or

Use collections.OrderedDict and call _agents.move_to_end(session_id) on every hit so it's actually LRU.

Same issue applies to the new strands base-template cache (src/assets/python/http/strands/base/main.py ~line 433-438), since the PR description there also describes it as "LRU-style eviction".

(Separate, smaller concern on this template: get_or_create_agent isn't concurrency-safe — two concurrent first-invocations for the same new session_id will both fall through the if session_id not in _agents check, both await get_streamable_http_mcp_tools(), and both build an AssistantAgent; the loser's agent is discarded mid-flight. Not catastrophic, but worth an asyncio.Lock around the create path if you want it tight.)

agentcore-cli-automation · 2026-06-25T15:10:29Z

+    def get_or_create_agent(session_id{{#if hasSkillsFetcher}}, skill_plugins=None{{/if}}):
+        if session_id not in cache:
+            if len(cache) >= 128:
+                cache.pop(next(iter(cache)))


FIFO, not LRU — same as the autogen template. next(iter(cache)) returns the oldest-inserted key, and reading cache[session_id] doesn't update its position. Either change the comment to say "FIFO eviction (oldest by insertion order)" or switch to collections.OrderedDict + cache.move_to_end(session_id) on hit for real LRU behavior.

This affects the no-memory branch only — the hasMemory branch (lines ~385-423) is unchanged here and was already unbounded; you may want to consider applying a cap there too for consistency, but that's out of scope of #808/#809 so feel free to defer.

agentcore-cli-automation · 2026-06-25T15:10:29Z

+# Caches up to 128 active sessions; LRU eviction silently resets history for
+# the oldest session. For production use, replace with a durable session store
+# (e.g. SQLiteSession with a file path).
+@lru_cache(maxsize=128)


Not blocking, just confirming: SQLiteSession(session_id) with no db_path uses an in-memory SQLite DB per-instance, so the lru_cache retention semantics here are: "keep up to 128 in-memory SQLite DBs alive; evicted DBs are GC'd and their history is lost on next access." That matches the docstring and is the correct knob for a starter template. ✅

Unlike the autogen/strands dict-based caches, functools.lru_cache is true LRU — accessing an existing entry moves it to most-recently-used — so no FIFO concern here.

… caches true LRU (aws#808, aws#809) The googleadk (InMemorySessionService) and langchain_langgraph (InMemorySaver) templates kept per-session state at module scope with no eviction, trading the cross-session leak for an unbounded memory leak in long-running processes. Bound both to 128 active sessions with LRU eviction (delete_session / delete_thread on the least-recently-used key), matching the other three templates. The autogen and strands no-memory caches were plain dicts evicting by insertion order (FIFO) despite docstrings claiming LRU. Switch them to OrderedDict with move_to_end on hit so the eviction policy is genuinely LRU and the docstrings are accurate. openaiagents already uses functools.lru_cache (true LRU) and is unchanged. Regenerated the assets snapshot. Refs aws#808 Refs aws#809

agentcore-devx-automation · 2026-06-25T15:23:31Z

Claude Security Review: no high-confidence findings. (run)

agentcore-cli-automation

Looks good to merge — all the issues raised in the prior review have been addressed in commit 08bfb25:

googleadk: _session_service is now bounded to 128 active sessions via _session_keys OrderedDict with LRU eviction calling delete_session on the evicted entry. ✅
langchain_langgraph: _checkpointer is now bounded to 128 active threads via _thread_ids OrderedDict with LRU eviction calling delete_thread. ✅
autogen: Switched from plain dict to OrderedDict with move_to_end on hit — now true LRU rather than FIFO. ✅
strands: Same OrderedDict + move_to_end treatment for the no-memory cache. ✅
openaiagents: Unchanged from the prior review — functools.lru_cache was already correct.

All five HTTP templates are now consistent: a 128-entry per-process cap, true LRU eviction, and context.session_id threaded through. The snapshot is regenerated to match.

A couple of minor things I noticed but won't block on (carried over from the prior review or out of scope):

The previously-noted concurrency window in autogen's get_or_create_agent (two concurrent first-invocations for the same new session_id both falling through the miss check) still exists, and the same pattern is now present in strands and googleadk. For a starter template this seems acceptable.
The strands hasMemory branch (lines ~387-425) is still an unbounded plain-dict cache, but as the prior reviewer noted that's outside the #808/#809 scope.

No new serious issues found.

aws#808) OrderedDict.popitem() returns (key, value) where key is the (user_id, session_id) tuple and value is True. The previous unpack assigned the key tuple to old_user_id and True to old_session_id, so delete_session() was called with garbage and the real session was never deleted -- the InMemorySessionService grew unbounded past the 128-session cap, defeating the aws#808 bound for GoogleADK (LangChain unpacked correctly). Destructure the key tuple and discard the value so the real ids reach delete_session. Add an asset test asserting the store caps at 128 with correct eviction ids, and regenerate the asset snapshot.

agentcore-devx-automation · 2026-06-25T16:13:41Z

agentcore-cli-automation

Re-reviewed the new commit f3dabe9 (added after the prior approval). Looks good to merge.

f3dabe9 is a real fix. Confirmed by reading the rendered template at src/assets/python/http/googleadk/base/main.py:

The key stored in _session_keys is a 2-tuple (user_id, session_id) (line 150), with the value always True.
OrderedDict.popitem(last=False) returns (key, value), so the outer 2-tuple is (tuple_key, True).
The previous form old_user_id, old_session_id = popitem(...) unpacked the outer pair, assigning the whole inner key tuple to old_user_id and True to old_session_id — delete_session(user_id=<tuple>, session_id=True) would have errored or no-op'd, leaving the underlying InMemorySessionService to grow past 128 unbounded. The new form (old_user_id, old_session_id), _ = popitem(...) destructures correctly.
LangChain (line 75: evicted, _ = _thread_ids.popitem(last=False)) uses a single-string key, so its unpack was already correct — the commit message's parenthetical is accurate. ✅
The other LRU caches (autogen line 107, strands line 441) just call cache.popitem(last=False) and discard the result, which is fine since they don't need an async cleanup callback like delete_session. ✅

Tests: The new googleadk-session-eviction.test.ts has two effective regression guards — the substring assertion for the fixed form and the negative-match guard against the buggy form (lines 59 and 61) — which is exactly what's needed to prevent this from regressing. The simulateEviction portion (lines 27-51, 68-85) is a JS re-implementation of the Python eviction algorithm, so the caps the store at 128 assertion isn't actually exercising the template; it's documenting the intended algorithm. Not blocking, but if you wanted that assertion to mean something, you'd need to actually run the rendered template under Python — not worth the infra for a starter-template asset test, the two string guards are sufficient.

No new serious issues. The unaddressed-but-out-of-scope items from the prior review (concurrency window in get_or_create_agent, unbounded hasMemory strands branch) still stand and are fine to defer.

github-actions Bot added the size/m PR size: M label Jun 25, 2026

aidandaly24 temporarily deployed to e2e-testing June 25, 2026 15:07 — with GitHub Actions Inactive

github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 25, 2026

agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 25, 2026

agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 25, 2026

agentcore-cli-automation suggested changes Jun 25, 2026

View reviewed changes

github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 25, 2026

aidandaly24 changed the title ~~fix(templates): Make each HTTP framework template reuse per-session conversa (#808, #809)~~ fix(templates): key per-session agent state by context.session_id across all HTTP starter templates (#808, #809) Jun 25, 2026

github-actions Bot removed the size/m PR size: M label Jun 25, 2026

aidandaly24 temporarily deployed to e2e-testing June 25, 2026 15:22 — with GitHub Actions Inactive

github-actions Bot added the size/m PR size: M label Jun 25, 2026

agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 25, 2026

github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 25, 2026

agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 25, 2026

agentcore-cli-automation approved these changes Jun 25, 2026

View reviewed changes

github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 25, 2026

github-actions Bot added size/m PR size: M and removed size/m PR size: M labels Jun 25, 2026

agentcore-devx-automation Bot added the claude-security-reviewing Claude Code /security-review in progress label Jun 25, 2026

aidandaly24 temporarily deployed to e2e-testing June 25, 2026 16:12 — with GitHub Actions Inactive

github-actions Bot added the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 25, 2026

agentcore-devx-automation Bot removed the claude-security-reviewing Claude Code /security-review in progress label Jun 25, 2026

agentcore-cli-automation approved these changes Jun 25, 2026

View reviewed changes

github-actions Bot removed the agentcore-harness-reviewing AgentCore Harness review in progress label Jun 25, 2026

aidandaly24 marked this pull request as ready for review June 25, 2026 16:42

aidandaly24 requested a review from a team June 25, 2026 16:42

aidandaly24 mentioned this pull request Jun 25, 2026

feat(templates): add conversation history persistence to HTTP agent templates #794

Closed

14 tasks

Sunbelt Computer Software

PL/B Language Development and Support

Uh oh!

Conversation

aidandaly24 commented Jun 25, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Problem

Fix

Validation

Uh oh!

github-actions Bot commented Jun 25, 2026

Package Tarball

How to install

Uh oh!

agentcore-devx-automation Bot commented Jun 25, 2026

Uh oh!

agentcore-cli-automation left a comment

Choose a reason for hiding this comment

Uh oh!

agentcore-cli-automation Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

agentcore-cli-automation Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

agentcore-cli-automation Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

agentcore-cli-automation Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

agentcore-cli-automation Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

agentcore-devx-automation Bot commented Jun 25, 2026

Uh oh!

agentcore-cli-automation left a comment

Choose a reason for hiding this comment

Uh oh!

agentcore-devx-automation Bot commented Jun 25, 2026

Uh oh!

agentcore-cli-automation left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aidandaly24 commented Jun 25, 2026 •

edited

Loading