Bug: anyio.Lock in oauth2.py raises "current task is not holding this lock" under cross-task generator driving #2644

@gILisH

Summary

mcp/client/auth/oauth2.py declares its OAuth state lock as anyio.Lock (via a field(default_factory=anyio.Lock) on the dataclass). When httpx drives the async_auth_flow async generator from a task different from the one that originally acquired the lock, the release() call raises:

RuntimeError: The current task is not holding this lock

This happens because anyio.Lock records task identity at acquire() time and verifies it at release() — but httpx's generator-driving pattern does not guarantee the driving task is the same task across __anext__ calls. The lock object's task-identity invariant is incompatible with how async generators are consumed downstream.

Reproduction

Install the MCP SDK in any context where an httpx.AsyncClient drives async_auth_flow across multiple httpx request retries. Under sufficient concurrency the task-identity check eventually fails. We see this reliably with Hermes Agent's MCP client when handling token refresh under load.

Stack trace shape:

File ".../mcp/client/auth/oauth2.py", line <N>, in async_auth_flow
    async with self.lock:
File ".../anyio/_core/_synchronization.py", line <N>, in __aexit__
    self.release()
File ".../anyio/_core/_synchronization.py", line <N>, in release
    raise RuntimeError("The current task is not holding this lock")

Root cause

anyio.Lock is designed for the "one task acquires, same task releases" pattern. Async generators driven by httpx violate that contract because httpx's retry/redirect machinery may resume the generator from a different task than the one that suspended it. asyncio.Lock does NOT check task identity on release: release() only requires that the lock is currently held, and waiters are woken in FIFO order, which makes it suitable for this pattern.
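A minimal, stdlib-only sketch of the mechanism, not the SDK's actual code. `OwnerCheckedLock` is a hypothetical stand-in that mimics the owner check described above (it is not anyio's implementation), and `flow`, `step`, and `drive` are illustrative names. The generator holds the lock across a `yield`, and each `__anext__` is run in a separate task:

```python
import asyncio


class OwnerCheckedLock:
    """Hypothetical stand-in: remembers the acquiring task and checks it on release."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._owner = None

    async def __aenter__(self):
        await self._lock.acquire()
        self._owner = asyncio.current_task()
        return self

    async def __aexit__(self, *exc_info):
        if self._owner is not asyncio.current_task():
            raise RuntimeError("The current task is not holding this lock")
        self._owner = None
        self._lock.release()


async def flow(lock):
    # Same shape as an auth flow: the lock is held across a yield point.
    async with lock:
        yield "request"


async def step(gen):
    # Advance the generator by one step inside whichever task runs this.
    try:
        return await gen.__anext__()
    except StopAsyncIteration:
        return "released"


async def drive(lock_factory):
    gen = flow(lock_factory())
    await asyncio.create_task(step(gen))  # task A enters the lock
    try:
        return await asyncio.create_task(step(gen))  # task B exits it
    except RuntimeError as exc:
        return str(exc)


def run_demo():
    return asyncio.run(drive(OwnerCheckedLock)), asyncio.run(drive(asyncio.Lock))


if __name__ == "__main__":
    print(run_demo())
```

The first result is the error message from the owner check; the second shows asyncio.Lock releasing cleanly from the second task.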

Proposed fix

  import anyio
+ import asyncio
  ...
- lock: anyio.Lock = field(default_factory=anyio.Lock)
+ lock: asyncio.Lock = field(default_factory=asyncio.Lock)

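(asyncio.Lock is not reentrant, but as far as we can tell async_auth_flow never re-acquires the lock while already holding it, so that is not a concern here. The only thing being given up is anyio's cross-backend portability, which async_auth_flow doesn't rely on in our deployment because httpx drives it on an asyncio event loop. The change would not work for callers running httpx on trio.)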
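As a rough illustration of what the changed field gives you, assuming a dataclass shaped like the one described in the Summary (`OAuthState` and `guarded` are hypothetical names, not the SDK's): each instance gets its own lock from `default_factory`, and concurrent holders are still serialized in arrival order.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class OAuthState:  # hypothetical stand-in for the SDK's dataclass
    lock: asyncio.Lock = field(default_factory=asyncio.Lock)


async def guarded(state, name, order):
    # Only one holder at a time; waiters are served in the order they queued.
    async with state.lock:
        await asyncio.sleep(0)  # yield while holding the lock
        order.append(name)


async def main():
    state = OAuthState()
    order = []
    await asyncio.gather(*(guarded(state, n, order) for n in ("a", "b", "c")))
    # A second instance must not share the first instance's lock.
    distinct = OAuthState().lock is not state.lock
    return order, distinct


def run_field_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    print(run_field_demo())
```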

Workaround

We currently apply this fix as a downstream patch (patches/50-mcp-oauth-lock-fix.sh) that targets the installed mcp/client/auth/oauth2.py after each Hermes upgrade. Happy to send a PR if the fix above is acceptable.

Severity

Intermittent — the failure only surfaces under specific timing where httpx re-drives the generator from a different task. In our reference deployment it surfaces ~once per day under normal load, more often during cron-bursty windows. Each failure aborts the in-flight MCP request, requiring the caller to retry.
