
fix: reset streaming state and close stale WebSocket on session switch (fixes #144)#151

Merged
yaojin3616 merged 2 commits into dataelement:main from Longado:fix/session-switch-streaming-state
Mar 20, 2026
Conversation


@Longado Longado commented Mar 20, 2026

Problem

When switching between conversation sessions or agents while a response was streaming, the UI got stuck in the "outputting" state permanently — the input bar stayed disabled and a loading indicator spun forever. Returning to the original session also showed an empty message list.

Three root causes in AgentDetail.tsx:

  1. selectSession() did not reset isStreaming / isWaiting — switching sessions while streaming left both flags true, disabling the input area indefinitely.
  2. selectSession() did not close the existing WebSocket — the old WS kept firing onmessage, appending stale streaming chunks into the new session's message list and corrupting its history.
  3. The agent-change useEffect had the same two omissions — navigating to a different agent carried over the stuck state and kept the old WS alive.

Reported in #144.

Fix

selectSession() — close wsRef.current when readyState !== WebSocket.CLOSED (i.e., whenever the socket is CONNECTING, OPEN, or CLOSING), then reset isStreaming and isWaiting before loading the new session.

Agent-change useEffect — apply the same WS close + state reset when id changes.

The existing WS connect useEffect already has a proper cleanup (cancelled = true + wsRef.current?.close()) that prevents double-close races when React unmounts the effect.

Changes

frontend/src/pages/AgentDetail.tsx only — two small additions:

```diff
 const selectSession = async (sess: any) => {
+    if (wsRef.current && wsRef.current.readyState !== WebSocket.CLOSED) {
+        wsRef.current.close();
+        wsRef.current = null;
+    }
     setChatMessages([]);
     setHistoryMsgs([]);
+    setIsStreaming(false);
+    setIsWaiting(false);
     setActiveSession(sess);
     ...
```

```diff
 useEffect(() => {
+    if (wsRef.current && wsRef.current.readyState !== WebSocket.CLOSED) {
+        wsRef.current.close();
+        wsRef.current = null;
+    }
     setActiveSession(null);
     setChatMessages([]);
     setHistoryMsgs([]);
+    setIsStreaming(false);
+    setIsWaiting(false);
     ...
 }, [id]);
```

🤖 Generated with Claude Code

Longado and others added 2 commits March 20, 2026 20:02
…element#143)

Two independent fixes for `_send_message_to_agent`:

1. **Retry on ReadTimeout** — wrap the `llm_client.complete()` call in a
   retry loop (up to 3 attempts, exponential back-off: 1 s then 2 s).
   A single transient `httpx.ReadTimeout` no longer aborts a long-running
   A2A task mid-execution; it is retried silently instead.  Only after
   all 3 attempts fail does the exception propagate to the outer handler.

2. **Meaningful error message** — `httpx.ReadTimeout.__str__()` returns an
   empty string, so the UI previously showed only
   `"❌ Message send error:"` with no cause.  The outer `except` block now
   falls back to `type(e).__name__` when `str(e)` is empty, producing e.g.
   `"❌ Message send error: ReadTimeout"`.

Fixes dataelement#143.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
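
The retry and error-message changes above live in the Python backend, but the pattern is language-agnostic. A minimal sketch transposed to TypeScript — `withRetry` and `describeError` are illustrative names, not code from this repository:

```typescript
// Generic retry with exponential back-off, mirroring the commit's
// 3-attempt / 1 s / 2 s schedule. Sketch only, not the backend code.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 1000,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (e) {
      lastError = e;
      if (i < attempts - 1) {
        // Delay doubles each retry: 1 s, then 2 s.
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}

// Fall back to the error's class name when its message is empty —
// the TypeScript analogue of the `type(e).__name__` fix.
function describeError(e: unknown): string {
  if (e instanceof Error && e.message) return e.message;
  return (e as object)?.constructor?.name ?? String(e);
}
```

With this shape, a transient failure is absorbed silently and only the final failure surfaces with a non-empty description.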
…dataelement#144)

Three bugs caused the UI to get stuck in "outputting" state after switching
conversation sessions or agents:

1. selectSession() did not reset isStreaming / isWaiting before loading the
   new session, so the input area remained disabled if the previous session
   was mid-stream.

2. selectSession() did not close the existing WebSocket. The old WS could
   still fire onmessage after the switch, appending stale streaming data
   into the new session's message list.

3. The agent-change useEffect also omitted both fixes above, so switching
   to a different agent carried over the same stuck state and orphaned WS.

Fix: in selectSession(), close wsRef.current when readyState !== CLOSED
(covers both OPEN and CONNECTING states) and reset isStreaming/isWaiting.
Apply the same two resets in the agent-change useEffect.

The existing WS connect useEffect already has a cleanup function
(cancelled = true + wsRef.close) that prevents double-close races.

Fixes dataelement#144.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@Congregalis
Contributor

Thanks for working on this. The overall direction makes sense: switching sessions or agents should reset stale streaming UI state, and data from one conversation should not leak into another.

That said, I do not think this fully resolves #144 yet. From my reading, there are still a few blocking issues in the current fix:

  1. Closing the old WebSocket can still cause conversation content to be lost
    The current change closes the existing socket immediately when switching sessions. However, on the backend, the assistant message is only persisted when generation completes or when it goes through the explicit interruption path. If the user switches away while a response is still streaming, this can cancel the in-flight task before the response is saved, so returning to the original session can still show missing content.

    For this issue, the minimum requirement should be: switching sessions must not lose the in-flight response, whether that response completes normally or is interrupted during the transition.

  2. The + New Session path is still inconsistent with the other switch paths
    This PR resets isStreaming / isWaiting in selectSession() and on agent change, but createNewSession() still only clears messages and switches activeSession. It does not reset the same UI state consistently. As a result, creating a new session while a response is streaming can still leave the composer in a bad state.

    I think the session-switch reset logic should be centralized and reused across:

    • selectSession()
    • createNewSession()
    • the agent-change effect
  3. selectSession() still has a race when loading messages
    After switching sessions, the code immediately fetches messages for the selected session, but that request is neither cancelled nor validated when it resolves. If the user clicks two sessions quickly, the earlier request can return later and overwrite the message list for the newer session. That can still present as “history disappeared” or “wrong session content,” which is very close to the original bug report.

    I would suggest adding an AbortController, or at least checking that the resolved response still belongs to the currently selected session before applying it.
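
    A possible shape for that guard, as a sketch only — `fetchSessionMessages` and `setHistoryMsgs` stand in for the component's real fetch call and state setter, which may differ:

    ```typescript
    // Guard against out-of-order session loads: abort the previous request
    // and drop any response that no longer matches the selected session.
    type Fetcher = (sessionId: string, signal: AbortSignal) => Promise<unknown[]>;

    function makeSessionLoader(
      fetchSessionMessages: Fetcher,
      setHistoryMsgs: (msgs: unknown[]) => void,
    ) {
      let controller: AbortController | null = null;
      let currentSessionId: string | null = null;

      return async function loadSession(sessionId: string): Promise<void> {
        controller?.abort(); // cancel any previous in-flight load
        controller = new AbortController();
        currentSessionId = sessionId;

        const msgs = await fetchSessionMessages(sessionId, controller.signal);
        // Even if the server ignores the abort signal, never apply a
        // response for a session that is no longer selected.
        if (sessionId === currentSessionId) {
          setHistoryMsgs(msgs);
        }
      };
    }
    ```

    The session-id check alone is enough to prevent the overwrite; the AbortController additionally lets the browser drop the wasted request.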

One additional note: this PR is framed as a fix for #144, but it also includes an unrelated backend change for A2A timeout retries. I would strongly recommend splitting that into a separate PR so the review scope stays clean and easier to reason about.

My view is that the stuck streaming state is only partially addressed here, while the content-loss and race-condition aspects of #144 are still not fully resolved.
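
The centralized reset suggested in point 2 could look roughly like the sketch below — `resetStreamingUi`, `SocketLike`, and `StreamingUi` are hypothetical names, the state setters mirror the ones in the PR's diff, and a minimal socket shape is used so the sketch does not depend on DOM typings:

```typescript
// Minimal socket shape; WebSocket.CLOSED === 3 per the WebSocket spec.
type SocketLike = { readyState: number; close(): void };
const WS_CLOSED = 3;

interface StreamingUi {
  wsRef: { current: SocketLike | null };
  setChatMessages: (m: unknown[]) => void;
  setHistoryMsgs: (m: unknown[]) => void;
  setIsStreaming: (v: boolean) => void;
  setIsWaiting: (v: boolean) => void;
}

// One reset shared by selectSession(), createNewSession(), and the
// agent-change effect, so no switch path can forget a flag.
function resetStreamingUi(ui: StreamingUi): void {
  if (ui.wsRef.current && ui.wsRef.current.readyState !== WS_CLOSED) {
    ui.wsRef.current.close();
    ui.wsRef.current = null;
  }
  ui.setChatMessages([]);
  ui.setHistoryMsgs([]);
  ui.setIsStreaming(false);
  ui.setIsWaiting(false);
}
```

Each switch path would then call `resetStreamingUi(...)` first and only afterwards do its path-specific work (select, create, or navigate).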

@Congregalis
Contributor

@yaojin3616
Follow-up design consideration for maintainers, separate from the scope of #144:

I think there is a broader lifecycle question underneath this bug: the current implementation appears to treat “currently visible in the UI” and “currently running conversation” as the same thing. That coupling makes session switching fragile and increases the chance of both stuck-state bugs and content-loss bugs.

A longer-term model worth considering is to separate UI visibility from conversation execution lifecycle, so that switching between sessions or agents does not necessarily interrupt work that is already in progress. That would make multi-conversation workflows much more robust.

I do not think that needs to be solved in this PR in order to fix #144. But it may be a useful follow-up direction once the immediate bug is fixed cleanly.
