Fix/fix: resolve MCP tools race condition causing 'completion 无法解析' error by idiotsj · Pull Request #5534 · AstrBotDevs/AstrBot

idiotsj · 2026-02-27T13:51:52Z

Fix: Resolve MCP tools race condition causing 'completion 无法解析' error

问题描述

修复 MCP 工具竞态条件导致的 "API 返回的 completion 无法解析" 错误。

根本原因

manager.py 使用 asyncio.create_task() 启动 MCP 初始化，导致系统在工具注册完成之前就开始接受请求。

变更内容

1. `astrbot/core/provider/manager.py`

asyncio.create_task(self.llm_tools.init_mcp_clients()) → await self.llm_tools.init_mcp_clients()

2. `astrbot/core/provider/func_tool_manager.py`

添加 Future 同步机制，等待所有 MCP 客户端初始化完成
添加 20 秒总超时，防止慢速服务器阻塞启动
改进错误日志，显示失败服务的配置信息

测试场景

场景	修复前	修复后
启动后立即使用	❌ 抛出"无法解析"异常	✅ 正常工作
多个 MCP 服务器	❌ 部分工具可能未加载	✅ 所有工具都已加载
MCP 初始化失败	❌ 静默失败	✅ 记录详细错误信息
慢速 MCP 服务器	❌ 可能无限等待	✅ 20 秒后超时

检查清单

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug Fixes（错误修复）:

修复一个竞争条件：在 MCP 客户端完成初始化之前就可以使用 MCP 工具，导致补全结果解析失败。
通过在 MCP 服务器启动耗时过长时强制超时和取消，防止初始化过程挂起。

Enhancements（增强）:

为所有 MCP 客户端初始化任务增加协同等待机制，包含全局超时时间以及按服务追踪就绪状态。
改进 MCP 初始化日志记录：按服务报告成功/失败情况，并记录有限的配置细节以帮助排查问题。
强化动态启用/禁用 MCP 服务器的处理逻辑，包括正确创建 future、处理超时、任务取消以及资源清理。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

错误修复：

防止在 MCP 客户端完成初始化之前使用 MCP 工具，从而解决导致补全解析失败的竞态条件问题。

功能增强：

为 MCP 客户端添加协调的初始化与超时处理机制，包括对耗时过长任务的取消，以及针对每个服务的详细成功/失败日志记录。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug Fixes（错误修复）:

修复一个竞争条件：在 MCP 客户端完成初始化之前就可以使用 MCP 工具，导致补全结果解析失败。
通过在 MCP 服务器启动耗时过长时强制超时和取消，防止初始化过程挂起。

Enhancements（增强）:

为所有 MCP 客户端初始化任务增加协同等待机制，包含全局超时时间以及按服务追踪就绪状态。
改进 MCP 初始化日志记录：按服务报告成功/失败情况，并记录有限的配置细节以帮助排查问题。
强化动态启用/禁用 MCP 服务器的处理逻辑，包括正确创建 future、处理超时、任务取消以及资源清理。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

修复了一个竞态条件：在 MCP 客户端完成初始化之前，MCP 工具可能被使用，从而导致补全结果解析错误。

增强功能：

为所有 MCP 客户端初始化任务添加协调等待机制，并设置 20 秒超时时间，以防止缓慢或卡死的服务器无限期阻塞启动过程。
改进 MCP 初始化日志，增加按服务记录的成功/失败报告和配置信息，便于排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug Fixes（错误修复）:

修复一个竞争条件：在 MCP 客户端完成初始化之前就可以使用 MCP 工具，导致补全结果解析失败。
通过在 MCP 服务器启动耗时过长时强制超时和取消，防止初始化过程挂起。

Enhancements（增强）:

为所有 MCP 客户端初始化任务增加协同等待机制，包含全局超时时间以及按服务追踪就绪状态。
改进 MCP 初始化日志记录：按服务报告成功/失败情况，并记录有限的配置细节以帮助排查问题。
强化动态启用/禁用 MCP 服务器的处理逻辑，包括正确创建 future、处理超时、任务取消以及资源清理。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

错误修复：

防止在 MCP 客户端完成初始化之前使用 MCP 工具，从而解决导致补全解析失败的竞态条件问题。

功能增强：

为 MCP 客户端添加协调的初始化与超时处理机制，包括对耗时过长任务的取消，以及针对每个服务的详细成功/失败日志记录。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug Fixes（错误修复）:

修复一个竞争条件：在 MCP 客户端完成初始化之前就可以使用 MCP 工具，导致补全结果解析失败。
通过在 MCP 服务器启动耗时过长时强制超时和取消，防止初始化过程挂起。

Enhancements（增强）:

为所有 MCP 客户端初始化任务增加协同等待机制，包含全局超时时间以及按服务追踪就绪状态。
改进 MCP 初始化日志记录：按服务报告成功/失败情况，并记录有限的配置细节以帮助排查问题。
强化动态启用/禁用 MCP 服务器的处理逻辑，包括正确创建 future、处理超时、任务取消以及资源清理。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复：

防止在 MCP 客户端尚未完成初始化时使用 MCP 工具，从而避免在解析补全结果时出现失败。
通过在 MCP 服务器启动缓慢时对未完成的初始化任务进行超时和取消，避免出现无限挂起的情况。

增强功能：

引入对 MCP 客户端就绪状态的协同等待机制，包含全局超时时间及按服务的就绪状态跟踪。
改进 MCP 初始化日志记录，对每个服务分别报告成功/失败情况，并在受限地展示配置细节以便排查问题。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Bug 修复:

通过在 provider 启动期间等待客户端完成设置，防止在 MCP 客户端完成初始化之前 MCP 工具被使用。
确保动态启用 MCP 服务器时，在初始化超时或发生错误时快速失败，而不是留下挂起或部分初始化的客户端。

增强功能:

引入协同的 MCP 客户端初始化流程，为每个客户端提供就绪事件，并支持可配置的全局超时，以防范服务器过慢或卡死。
改进 MCP 初始化的可观测性，为每个服务提供成功/失败报告，并在日志中安全地暴露关键配置详情。

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

Original summary in English

Summary by Sourcery

在使用 MCP 工具前，确保 MCP 工具客户端已完全初始化，并为 MCP 服务器的初始化与关闭添加更健壮的超时和生命周期管理。

Bug 修复：

消除一个竞态条件：之前 MCP 工具可能在其客户端尚未完成初始化前就被调用，从而导致补全结果解析失败。
防止 MCP 初始化无限期挂起：通过强制超时并取消启动过慢或卡住的 MCP 客户端启动任务来避免。

增强改进：

引入协调的 MCP 客户端初始化跟踪机制：为每个客户端使用独立的 future、关闭事件，并通过环境变量提供可配置的全局超时时间。
提升 MCP 启动的可观测性：按服务记录启动成功或失败日志，脱敏配置细节，并提供更清晰的超时/错误报告。
加强动态启用 MCP 服务器的可靠性：在注册客户端之前验证初始化是否完成，并清理被取消或失败的任务。

Original summary in English

Summary by Sourcery

Ensure MCP tool clients are fully initialized before use and add robust timeout and lifecycle handling for MCP server initialization and shutdown.

Bug Fixes:

Eliminate a race condition where MCP tools could be invoked before their clients finished initializing, which caused completion parsing failures.
Prevent MCP initialization from hanging indefinitely by enforcing timeouts and cancelling slow or stuck MCP client startup tasks.

Enhancements:

Introduce coordinated MCP client initialization tracking with per-client futures, shutdown events, and configurable global timeouts via environment variables.
Improve observability of MCP startup by logging per-service success or failure with redacted configuration details and clearer timeout/error reporting.
Strengthen dynamic MCP server enablement by validating initialization completion before registering clients and cleaning up cancelled or failed tasks.

- Wait for MCP client initialization to complete before accepting requests - Add Future-based synchronization in init_mcp_clients() - Prevent tool_calls from being rejected due to empty func_list - Improve error logging for MCP initialization failures Fixes race condition where AI attempts to call MCP tools before they are registered, resulting in 'API 返回的 completion 无法解析' exceptions. The issue occurred because: 1. MCP clients were initialized asynchronously without waiting 2. System accepted user requests immediately after startup 3. AI received empty tool list and attempted to call non-existent tools 4. Tool matching failed, causing parsing errors This fix ensures all MCP tools are loaded before the system processes any requests that might use them.

- Add 20-second total timeout to prevent slow MCP servers from blocking startup - Show detailed configuration info when MCP initialization fails - List all failed services in a summary warning - Gracefully handle timeout by using already-completed services This ensures that even if some MCP servers are slow or unreachable, the system will start within a reasonable time and provide clear feedback about which services failed and why.

dosubot · 2026-02-27T13:53:08Z

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

^{How did I do? Any feedback?}

sourcery-ai

Hey - 我发现了两个问题，并给出了一些整体性的反馈：

在使用 ready_future 对象跟踪 MCP 初始化时，请确保 _init_mcp_client_task_wrapper 在 finally 代码块中总是为对应的 future 完成结算（设置结果或异常），这样即使初始化过程提前中断，init_mcp_clients 也不会被挂住。
当 MCP 初始化发生超时时，建议显式取消那些仍在运行、但对应 future 尚未完成的初始化任务，以避免在 20 秒超时之后仍有不必要的后台工作和潜在的副作用。

面向 AI 代理的提示词

Please address the comments from this code review:

## Overall Comments
- 在使用 `ready_future` 对象跟踪 MCP 初始化时，请确保 `_init_mcp_client_task_wrapper` 在 `finally` 代码块中总是为对应的 future 完成结算（设置结果或异常），这样即使初始化过程提前中断，`init_mcp_clients` 也不会被挂住。
- 当 MCP 初始化发生超时时，建议显式取消那些仍在运行、但对应 future 尚未完成的初始化任务，以避免在 20 秒超时之后仍有不必要的后台工作和潜在的副作用。

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="262-266" />
<code_context>
+                    logger.error(f"MCP 服务 {name} 初始化失败: {result}")
+                    # 显示配置信息以便调试
+                    cfg = mcp_server_json_obj.get(name, {})
+                    if "command" in cfg:
+                        logger.error(f"  命令: {cfg['command']}")
+                        if "args" in cfg:
+                            logger.error(f"  参数: {cfg['args']}")
+                    elif "url" in cfg:
+                        logger.error(f"  URL: {cfg['url']}")
+                    failed_services.append(name)
</code_context>
<issue_to_address>
**🚨 issue (security):** 在日志中记录完整的命令/参数/URL，可能会在生产环境日志中泄露敏感信息。

为了降低这一风险，可以在记录日志前对这些值做脱敏处理（例如去掉查询参数、token 和凭据），或者只记录粗粒度的信息，比如服务名称、主机和端口。你也可以把完整配置输出放在 debug 级别日志或需要显式开启的诊断开关之后，这样它就不会出现在正常的生产日志中。
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="215" />
<code_context>
             open(mcp_json_file, encoding="utf-8"),
         )["mcpServers"]

+        # 收集所有初始化任务的 Future
+        init_futures: dict[str, asyncio.Future] = {}
+
</code_context>
<issue_to_address>
**issue (complexity):** 建议用一个简单的 name → task 映射配合 `asyncio.wait`，再精简 `_init_mcp_client_task_wrapper` 的签名，来直接管理超时和错误，从而替换当前自定义的 future 编排逻辑。

在保持现有行为（超时控制、部分成功、详细日志）的前提下，可以通过以下方式简化编排：

* 完全移除 `ready_future`，只使用任务（tasks）。
* 使用 `asyncio.wait(..., timeout=...)`，而不是 `wait_for(gather(...))` 加上手动重建结果集。
* 维护一个 `name -> task` 的映射，并在单个循环中进行处理。

### 1. 简化任务编排

使用直接的 `name -> task` 映射替换当前的 `init_futures` + `ready_future` 管线：

```python
mcp_server_json_obj: dict[str, dict] = json.load(
    open(mcp_json_file, encoding="utf-8"),
)["mcpServers"]

tasks: dict[str, asyncio.Task] = {}

for name, cfg in mcp_server_json_obj.items():
    if cfg.get("active", True):
        event = asyncio.Event()
        task = asyncio.create_task(
            self._init_mcp_client_task_wrapper(name, cfg, event)
        )
        tasks[name] = task
        self.mcp_client_event[name] = event

if tasks:
    logger.info(f"等待 {len(tasks)} 个 MCP 服务初始化...")

    done, pending = await asyncio.wait(tasks.values(), timeout=20.0)

    if pending:
        logger.warning(
            "MCP 服务初始化超时（20秒），部分服务可能未完全加载。"
            "建议检查 MCP 服务器配置和网络连接。"
        )

    success_count = 0
    failed_services: list[str] = []

    for name, task in tasks.items():
        if task in pending:
            # 超时
            logger.error(f"MCP 服务 {name} 初始化超时")
            failed_services.append(name)
            # 可选：取消仍在运行的任务
            task.cancel()
            continue

        exc = task.exception()
        if exc is not None:
            logger.error(f"MCP 服务 {name} 初始化失败: {exc}")
            cfg = mcp_server_json_obj.get(name, {})
            if "command" in cfg:
                logger.error(f"  命令: {cfg['command']}")
                if "args" in cfg:
                    logger.error(f"  参数: {cfg['args']}")
            elif "url" in cfg:
                logger.error(f"  URL: {cfg['url']}")
            failed_services.append(name)
        else:
            success_count += 1

    if failed_services:
        logger.warning(
            f"以下 MCP 服务初始化失败: {', '.join(failed_services)}。"
            f"请检查配置文件 mcp_server.json 和服务器可用性。"
        )

    logger.info(f"MCP 服务初始化完成: {success_count}/{len(tasks)} 成功")
```

### 2. 简化 `_init_mcp_client_task_wrapper`

采用上述方式后，`_init_mcp_client_task_wrapper` 不再需要 `ready_future`，也不需要与外部 future 做协调：

```python
async def _init_mcp_client_task_wrapper(
    self,
    name: str,
    cfg: dict,
    event: asyncio.Event,
) -> None:
    """初始化 MCP 客户端的包装函数，用于捕获异常"""
    try:
        # 原有初始化逻辑
        ...
        event.set()
    except Exception as e:
        # 在这里做局部日志或清理
        logger.exception(f"MCP 服务 {name} 初始化时发生异常: {e}")
        raise  # 让 task.exception() 能捕获到
```

这样可以保留：

* 所有服务统一的 20 秒启动超时。
* 部分可用性（已完成的任务可用，超时的任务会被报告）。
* 逐服务的详细日志（错误信息、配置提示）。
* 更简单的控制流：使用单次 `asyncio.wait`，再遍历一次 `tasks.items()`，而不是 futures + gather + 手动配对。
</issue_to_address>

Sourcery 对开源项目是免费的——如果你喜欢我们的代码审查，欢迎分享给更多人 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进后续的代码审查。}

Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

When using the ready_future objects to track MCP initialization, ensure _init_mcp_client_task_wrapper always settles (sets result or exception) for its corresponding future in a finally block so init_mcp_clients cannot hang if initialization bails out early.
On MCP initialization timeout, consider explicitly cancelling the still-running initialization tasks associated with the unfinished futures to avoid unnecessary background work and potential side effects after the 20s timeout.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- When using the `ready_future` objects to track MCP initialization, ensure `_init_mcp_client_task_wrapper` always settles (sets result or exception) for its corresponding future in a `finally` block so `init_mcp_clients` cannot hang if initialization bails out early.
- On MCP initialization timeout, consider explicitly cancelling the still-running initialization tasks associated with the unfinished futures to avoid unnecessary background work and potential side effects after the 20s timeout.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="262-266" />
<code_context>
+                    logger.error(f"MCP 服务 {name} 初始化失败: {result}")
+                    # 显示配置信息以便调试
+                    cfg = mcp_server_json_obj.get(name, {})
+                    if "command" in cfg:
+                        logger.error(f"  命令: {cfg['command']}")
+                        if "args" in cfg:
+                            logger.error(f"  参数: {cfg['args']}")
+                    elif "url" in cfg:
+                        logger.error(f"  URL: {cfg['url']}")
+                    failed_services.append(name)
</code_context>
<issue_to_address>
**🚨 issue (security):** Logging full commands/args/URLs may leak sensitive information in production logs.

To reduce this risk, either sanitize these values before logging (e.g., strip query parameters, tokens, and credentials) or log only coarse‑grained details such as service name, host, and port. You could also gate the full config output behind a debug‑level or explicitly opt‑in diagnostic flag so it never appears in normal production logs.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="215" />
<code_context>
             open(mcp_json_file, encoding="utf-8"),
         )["mcpServers"]

+        # 收集所有初始化任务的 Future
+        init_futures: dict[str, asyncio.Future] = {}
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the custom future orchestration with a simple name-to-task mapping using asyncio.wait and a streamlined _init_mcp_client_task_wrapper signature to manage timeouts and errors more directly.

You can keep all the new behavior (timeout, partial success, detailed logging) while simplifying the orchestration by:

* Dropping `ready_future` entirely and using only tasks.
* Using `asyncio.wait(..., timeout=...)` instead of `wait_for(gather(...))` + manual result reconstruction.
* Tracking `name -> task` and processing in a single loop.

### 1. Simplify task orchestration

Replace the `init_futures` + `ready_future` plumbing with a direct `name -> task` mapping:

```python
mcp_server_json_obj: dict[str, dict] = json.load(
    open(mcp_json_file, encoding="utf-8"),
)["mcpServers"]

tasks: dict[str, asyncio.Task] = {}

for name, cfg in mcp_server_json_obj.items():
    if cfg.get("active", True):
        event = asyncio.Event()
        task = asyncio.create_task(
            self._init_mcp_client_task_wrapper(name, cfg, event)
        )
        tasks[name] = task
        self.mcp_client_event[name] = event

if tasks:
    logger.info(f"等待 {len(tasks)} 个 MCP 服务初始化...")

    done, pending = await asyncio.wait(tasks.values(), timeout=20.0)

    if pending:
        logger.warning(
            "MCP 服务初始化超时（20秒），部分服务可能未完全加载。"
            "建议检查 MCP 服务器配置和网络连接。"
        )

    success_count = 0
    failed_services: list[str] = []

    for name, task in tasks.items():
        if task in pending:
            # 超时
            logger.error(f"MCP 服务 {name} 初始化超时")
            failed_services.append(name)
            # 可选：取消仍在运行的任务
            task.cancel()
            continue

        exc = task.exception()
        if exc is not None:
            logger.error(f"MCP 服务 {name} 初始化失败: {exc}")
            cfg = mcp_server_json_obj.get(name, {})
            if "command" in cfg:
                logger.error(f"  命令: {cfg['command']}")
                if "args" in cfg:
                    logger.error(f"  参数: {cfg['args']}")
            elif "url" in cfg:
                logger.error(f"  URL: {cfg['url']}")
            failed_services.append(name)
        else:
            success_count += 1

    if failed_services:
        logger.warning(
            f"以下 MCP 服务初始化失败: {', '.join(failed_services)}。"
            f"请检查配置文件 mcp_server.json 和服务器可用性。"
        )

    logger.info(f"MCP 服务初始化完成: {success_count}/{len(tasks)} 成功")
```

### 2. Simplify `_init_mcp_client_task_wrapper`

With the above, `_init_mcp_client_task_wrapper` no longer needs `ready_future` and doesn’t need to coordinate external futures:

```python
async def _init_mcp_client_task_wrapper(
    self,
    name: str,
    cfg: dict,
    event: asyncio.Event,
) -> None:
    """初始化 MCP 客户端的包装函数，用于捕获异常"""
    try:
        # 原有初始化逻辑
        ...
        event.set()
    except Exception as e:
        # 在这里做局部日志或清理
        logger.exception(f"MCP 服务 {name} 初始化时发生异常: {e}")
        raise  # 让 task.exception() 能捕获到
```

This keeps:

* Startup timeout (20s) for all services.
* Partial availability (completed tasks are used, timed-out ones reported).
* Detailed per-service logging (errors, config hints).
* Simpler control flow: a single `asyncio.wait` + one loop over `tasks.items()` instead of futures + gather + manual zipping.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

gemini-code-assist · 2026-02-27T13:56:16Z

Warning

Gemini is experiencing higher than usual traffic and was unable to create the summary. Please try again in a few hours by commenting /gemini summary.

- Replace Future-based sync with asyncio.wait + name→task mapping - Explicitly cancel timed-out tasks after 20s timeout - Downgrade sensitive config details (command/args/URL) to debug level - Move urllib.parse import to top-level

idiotsj · 2026-02-27T14:05:55Z

正在优化中

- Do not cancel pending tasks on timeout; let them continue running in the background waiting for the termination signal (event.set()), so successfully initialized services remain available - Track initialization state with a flag to distinguish init failures from post-init cancellations in _init_mcp_client_task_wrapper

Pending tasks in asyncio.wait are tasks that have NOT completed initialization within 20s, so cancelling them is safe and correct.

Soulter · 2026-02-28T13:36:49Z

/gemini summary

gemini-code-assist · 2026-02-28T13:37:08Z

Summary of Changes

此拉取请求旨在解决 MCP 工具初始化过程中的竞态条件，该条件可能导致工具在完全加载之前被调用，从而引发“completion 无法解析”的错误。通过确保 MCP 客户端在系统开始处理请求之前完成初始化，并引入超时机制以防止慢速服务器阻塞启动，同时改进了错误日志，使得问题排查更加高效和清晰。

Highlights

解决竞态条件: 将 manager.py 中 MCP 工具的初始化从异步任务改为 await 调用，确保工具在被使用前完全初始化，避免了工具未加载完成就被调用的问题。
增强 MCP 客户端初始化: 在 func_tool_manager.py 中引入 asyncio.wait 机制，对所有 MCP 客户端的初始化进行协调和等待，并设置了 20 秒的超时，以防止慢速或卡死的服务器阻塞启动过程。
改进错误处理和日志: 增加了详细的错误日志记录，包括失败服务的配置信息（敏感信息在 debug 级别显示），并处理了超时情况，使得问题排查更加高效和清晰。
简化代码结构: 移除了 _init_mcp_client_task_wrapper 中的 ready_future 参数，简化了异步任务的编排逻辑，提高了代码的可读性和维护性。

Changelog

astrbot/core/provider/func_tool_manager.py
- 导入了 urllib.parse 模块。
- 重构了 init_mcp_clients 方法，使用 asyncio.wait 协调所有 MCP 客户端的初始化任务，并设置了 20 秒的超时。
- 增强了 MCP 服务初始化失败或超时的错误日志，并在 debug 级别记录了配置详情。
- 简化了 _init_mcp_client_task_wrapper 函数的签名，移除了 ready_future 参数，并调整了异常处理逻辑。
astrbot/core/provider/manager.py
- 将 self.llm_tools.init_mcp_clients() 的调用方式从 asyncio.create_task() 修改为 await，确保 MCP 客户端初始化同步完成。

Activity

dosubot[bot] 检查了相关文档，未发现需要更新。
gemini-code-assist[bot] 因流量过高未能生成摘要。
作者 idiotsj 评论“正在优化中”。
Soulter 请求生成 Gemini 摘要。
sourcery-ai[bot] 提出了关于日志中敏感信息泄露的安全问题，该问题已通过在 debug 级别记录敏感配置和解析 URL 来解决。
sourcery-ai[bot] 提出了关于异步任务编排复杂性的问题，建议使用 asyncio.wait 和 name -> task 映射来简化，该建议已被采纳并实现。

zouyonghe · 2026-03-01T12:15:29Z

@sourcery-ai review

sourcery-ai

Hey - 我在这里给出了一些整体性的反馈：

新的 init_mcp_clients 逻辑会等待 _init_mcp_client_task_wrapper 任务完成，但这些任务只有在 event.wait()（关停时）之后才会结束，因此 asyncio.wait(..., timeout=20) 几乎总是会触发超时，然后取消所有 MCP 客户端；建议将“已初始化”的显式信号与长生命周期的客户端任务分离开来，这样启动流程只等待初始化完成，而不是等待整个客户端生命周期结束。
由于 manager.initialize 现在会 await init_mcp_clients，在当前设计下，应用程序的启动时间实际上被 20 秒的超时路径所限制；在你修复生命周期/初始化的分离问题之后，请确认初始化能够在客户端准备就绪后立即返回，而不是让启动过程被整整一个超时时间窗口所阻塞。

给 AI Agents 的提示

Please address the comments from this code review:

## Overall Comments
- The new `init_mcp_clients` logic awaits task completion for `_init_mcp_client_task_wrapper`, but those tasks only finish after `event.wait()` (shutdown), so `asyncio.wait(..., timeout=20)` will almost always hit the timeout and then cancel all MCP clients; consider separating an explicit 'initialized' signal from the long-lived client task so startup waits for init without waiting for the entire client lifetime.
- Because `manager.initialize` now `await`s `init_mcp_clients`, the application startup time is effectively gated by the 20s timeout path in the current design; once you fix the lifetime/initialization separation, verify that initialization returns promptly after clients are ready instead of delaying startup for the full timeout window.

Sourcery 对开源项目是免费的 —— 如果你觉得我们的评审有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈来改进后续的评审。}

Original comment in English

Hey - I've left some high level feedback:

The new init_mcp_clients logic awaits task completion for _init_mcp_client_task_wrapper, but those tasks only finish after event.wait() (shutdown), so asyncio.wait(..., timeout=20) will almost always hit the timeout and then cancel all MCP clients; consider separating an explicit 'initialized' signal from the long-lived client task so startup waits for init without waiting for the entire client lifetime.
Because manager.initialize now awaits init_mcp_clients, the application startup time is effectively gated by the 20s timeout path in the current design; once you fix the lifetime/initialization separation, verify that initialization returns promptly after clients are ready instead of delaying startup for the full timeout window.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The new `init_mcp_clients` logic awaits task completion for `_init_mcp_client_task_wrapper`, but those tasks only finish after `event.wait()` (shutdown), so `asyncio.wait(..., timeout=20)` will almost always hit the timeout and then cancel all MCP clients; consider separating an explicit 'initialized' signal from the long-lived client task so startup waits for init without waiting for the entire client lifetime.
- Because `manager.initialize` now `await`s `init_mcp_clients`, the application startup time is effectively gated by the 20s timeout path in the current design; once you fix the lifetime/initialization separation, verify that initialization returns promptly after clients are ready instead of delaying startup for the full timeout window.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

zouyonghe · 2026-03-01T12:25:26Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了 2 个问题，并给出了一些整体性的反馈：

init_mcp_clients 和 enable_mcp_server 中硬编码的 20 秒超时时间如果改成可配置项（例如来自配置文件/环境变量），会更灵活，这样不同部署可以在不改代码的情况下进行调整。
在 init_mcp_clients 中，你在失败时会记录 MCP 服务端配置（command/args/url）；建议在日志中显式脱敏或截断可能包含敏感信息的字段（例如内嵌在 URL 中的 token 或很长的参数列表），即便是在 debug 级别日志中也应如此。
在错误日志路径中新加入的 urllib.parse.urlparse 调用需要在 func_tool_manager.py 中显式导入 urllib.parse（或 from urllib import parse），前提是该模块中目前尚未导入。

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The hard-coded 20s timeout in `init_mcp_clients` and `enable_mcp_server` would be more flexible as a configurable setting (e.g., from config/env) so different deployments can tune it without code changes.
- In `init_mcp_clients`, you log MCP server configuration (command/args/url) on failure; consider explicitly redacting or truncating potentially sensitive values (e.g., tokens embedded in URLs or long argument lists) before logging even at debug level.
- The new use of `urllib.parse.urlparse` in the error-logging path requires an import of `urllib.parse` (or `from urllib import parse`) in `func_tool_manager.py` if it is not already present in the module.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="280-289" />
<code_context>
+                    logger.error(f"MCP 服务 {name} 初始化失败: {exc}")
+                    # 仅在 debug 级别输出完整配置，避免在生产日志中泄露敏感信息
+                    cfg = mcp_server_json_obj.get(name, {})
+                    if "command" in cfg:
+                        logger.debug(f"  命令: {cfg['command']}")
+                        if "args" in cfg:
+                            logger.debug(f"  参数: {cfg['args']}")
+                    elif "url" in cfg:
+                        parsed = urllib.parse.urlparse(cfg["url"])
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Logging full MCP `command` and `args` may leak sensitive data even at debug level.

Because `command` and especially `args` can include tokens, credentials, or internal hostnames, logging them verbatim is risky, even at debug level and especially on error paths that might be enabled in production. Please consider redacting or masking sensitive-looking values (e.g., env vars, bearer tokens, secrets in URLs) or limiting output to safe metadata such as executable name and argument count.

```suggestion
                    logger.error(f"MCP 服务 {name} 初始化失败: {exc}")
                    # 仅在 debug 级别输出有限配置，避免在日志中泄露敏感信息
                    cfg = mcp_server_json_obj.get(name, {})
                    if "command" in cfg:
                        cmd = cfg["command"]
                        # 只记录可执行命令及参数数量，不输出完整参数内容
                        if isinstance(cmd, (list, tuple)) and cmd:
                            executable = str(cmd[0])
                        else:
                            executable = str(cmd)
                        args_val = cfg.get("args", [])
                        if isinstance(args_val, (list, tuple)):
                            args_count = len(args_val)
                        elif args_val is None:
                            args_count = 0
                        else:
                            # 非列表类型一律按 1 个参数计数，避免泄露具体内容
                            args_count = 1
                        logger.debug(f"  命令可执行文件: {executable}, 参数数量: {args_count}")
                    elif "url" in cfg:
                        parsed = urllib.parse.urlparse(cfg["url"])
                        logger.debug(f"  主机: {parsed.scheme}://{parsed.netloc}")
```
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="216" />
<code_context>

-        for name in mcp_server_json_obj:
-            cfg = mcp_server_json_obj[name]
+        tasks: dict[str, asyncio.Task] = {}
+        ready_futures: dict[str, asyncio.Future] = {}
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring MCP client initialization into a single helper that returns structured results so you can drop the external futures, flags, and custom timeout handling logic.

You can keep all the new behavior (timeouts, better logging, cancellation) while simplifying the orchestration by pushing the timeout and status handling into a single per‑service helper and removing the `initialized` flag / external futures.

### 1. Collapse `ready_futures` + `initialized` into a per‑service helper

Instead of:
- `tasks` + `ready_futures` dicts
- an `initialized` flag
- wrapper that sets exceptions on an external future

make `_init_mcp_client_task_wrapper` the *only* place that knows about timeout + init vs run, and return a structured result instead of setting an external future:

```python
@dataclass
class MCPInitResult:
    name: str
    ok: bool
    error: Exception | None = None
    timeout: bool = False

async def _run_mcp_client(
    self,
    name: str,
    cfg: dict,
    event: asyncio.Event,
    timeout: float | None = None,
) -> MCPInitResult:
    try:
        if timeout is not None:
            await asyncio.wait_for(self._init_mcp_client(name, cfg), timeout)
        else:
            await self._init_mcp_client(name, cfg)

        await event.wait()
        logger.info(f"收到 MCP 客户端 {name} 终止信号")
        return MCPInitResult(name=name, ok=True)

    except asyncio.TimeoutError as e:
        logger.error(f"MCP 客户端 {name} 初始化超时", exc_info=True)
        return MCPInitResult(name=name, ok=False, error=e, timeout=True)

    except Exception as e:
        logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True)
        return MCPInitResult(name=name, ok=False, error=e)

    finally:
        await self._terminate_mcp_client(name)
```

This removes:
- `ready_future` plumbing
- the `initialized` flag and dual semantics for exceptions
- special `CancelledError` handling to synthesize a `TimeoutError` on the future

### 2. Simplify `init_mcp_clients` orchestration

With `_run_mcp_client` returning an `MCPInitResult`, `init_mcp_clients` no longer needs task maps + ready futures + manual pending tracking; you can rely on `asyncio.gather` and inspect results:

```python
async def init_mcp_clients(self) -> None:
    # ... load mcp_server_json_obj ...

    events: dict[str, asyncio.Event] = {}
    tasks: list[asyncio.Task[MCPInitResult]] = []

    for name, cfg in mcp_server_json_obj.items():
        if not cfg.get("active", True):
            continue

        event = asyncio.Event()
        events[name] = event
        self.mcp_client_event[name] = event

        tasks.append(asyncio.create_task(
            self._run_mcp_client(name, cfg, event, timeout=20.0)
        ))

    if not tasks:
        return

    logger.info(f"等待 {len(tasks)} 个 MCP 服务初始化...")

    results = await asyncio.gather(*tasks)

    success_count = 0
    failed_services: list[str] = []

    for res in results:
        if res.ok:
            success_count += 1
            continue

        failed_services.append(res.name)
        self.mcp_client_event.pop(res.name, None)

        if res.timeout:
            logger.error(f"MCP 服务 {res.name} 初始化超时")
        else:
            logger.error(f"MCP 服务 {res.name} 初始化失败: {res.error}")
            cfg = mcp_server_json_obj.get(res.name, {})
            # keep your existing debug logging of command/url here

    if failed_services:
        logger.warning(
            f"以下 MCP 服务初始化失败: {', '.join(failed_services)}。"
            "请检查配置文件 mcp_server.json 和服务器可用性。"
        )

    logger.info(f"MCP 服务初始化完成: {success_count}/{len(results)} 成功")
```

This preserves:
- per‑service timeout (20s)
- detailed logging and config hints
- cleanup and event registration

while removing:
- `ready_futures` dict
- `asyncio.wait()` + manual `pending_services` computation
- separate cancellation and `TimeoutError` synthesis paths

### 3. Reuse the same helper in `enable_mcp_server`

`enable_mcp_server` can reuse `_run_mcp_client` with a single `await`, instead of managing `ready_future` + `init_task` and duplicating timeout/cancellation logic:

```python
async def enable_mcp_server(
    self,
    name: str,
    config: dict,
    event: asyncio.Event | None = None,
    timeout: float = 10,
) -> None:
    if not event:
        event = asyncio.Event()
    if name in self.mcp_client_dict:
        return

    self.mcp_client_event[name] = event
    result = await self._run_mcp_client(name, config, event, timeout=timeout)

    if not result.ok:
        # cleanup event mapping on failure
        self.mcp_client_event.pop(name, None)
        if result.timeout:
            raise asyncio.TimeoutError(
                f"MCP 服务 {name} 初始化超时（{timeout} 秒）"
            ) from result.error
        raise result.error or Exception(f"MCP 服务 {name} 初始化失败")
```

This keeps:
- the public API’s timeout semantics
- proper cancellation and cleanup
- clear error propagation

and eliminates duplicated timeout/cancel logic plus the extra `ready_future`/`init_task` layer.

Sourcery 对开源项目是免费的——如果你觉得这些评审有帮助，欢迎分享给更多人 ✨

_{请帮我变得更有用！你可以在每条评论上点 👍 或 👎，我会基于这些反馈改进后续的代码评审。}

Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

The hard-coded 20s timeout in init_mcp_clients and enable_mcp_server would be more flexible as a configurable setting (e.g., from config/env) so different deployments can tune it without code changes.
In init_mcp_clients, you log MCP server configuration (command/args/url) on failure; consider explicitly redacting or truncating potentially sensitive values (e.g., tokens embedded in URLs or long argument lists) before logging even at debug level.
The new use of urllib.parse.urlparse in the error-logging path requires an import of urllib.parse (or from urllib import parse) in func_tool_manager.py if it is not already present in the module.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The hard-coded 20s timeout in `init_mcp_clients` and `enable_mcp_server` would be more flexible as a configurable setting (e.g., from config/env) so different deployments can tune it without code changes.
- In `init_mcp_clients`, you log MCP server configuration (command/args/url) on failure; consider explicitly redacting or truncating potentially sensitive values (e.g., tokens embedded in URLs or long argument lists) before logging even at debug level.
- The new use of `urllib.parse.urlparse` in the error-logging path requires an import of `urllib.parse` (or `from urllib import parse`) in `func_tool_manager.py` if it is not already present in the module.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="280-289" />
<code_context>
+                    logger.error(f"MCP 服务 {name} 初始化失败: {exc}")
+                    # 仅在 debug 级别输出完整配置，避免在生产日志中泄露敏感信息
+                    cfg = mcp_server_json_obj.get(name, {})
+                    if "command" in cfg:
+                        logger.debug(f"  命令: {cfg['command']}")
+                        if "args" in cfg:
+                            logger.debug(f"  参数: {cfg['args']}")
+                    elif "url" in cfg:
+                        parsed = urllib.parse.urlparse(cfg["url"])
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Logging full MCP `command` and `args` may leak sensitive data even at debug level.

Because `command` and especially `args` can include tokens, credentials, or internal hostnames, logging them verbatim is risky, even at debug level and especially on error paths that might be enabled in production. Please consider redacting or masking sensitive-looking values (e.g., env vars, bearer tokens, secrets in URLs) or limiting output to safe metadata such as executable name and argument count.

```suggestion
                    logger.error(f"MCP 服务 {name} 初始化失败: {exc}")
                    # 仅在 debug 级别输出有限配置，避免在日志中泄露敏感信息
                    cfg = mcp_server_json_obj.get(name, {})
                    if "command" in cfg:
                        cmd = cfg["command"]
                        # 只记录可执行命令及参数数量，不输出完整参数内容
                        if isinstance(cmd, (list, tuple)) and cmd:
                            executable = str(cmd[0])
                        else:
                            executable = str(cmd)
                        args_val = cfg.get("args", [])
                        if isinstance(args_val, (list, tuple)):
                            args_count = len(args_val)
                        elif args_val is None:
                            args_count = 0
                        else:
                            # 非列表类型一律按 1 个参数计数，避免泄露具体内容
                            args_count = 1
                        logger.debug(f"  命令可执行文件: {executable}, 参数数量: {args_count}")
                    elif "url" in cfg:
                        parsed = urllib.parse.urlparse(cfg["url"])
                        logger.debug(f"  主机: {parsed.scheme}://{parsed.netloc}")
```
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="216" />
<code_context>

-        for name in mcp_server_json_obj:
-            cfg = mcp_server_json_obj[name]
+        tasks: dict[str, asyncio.Task] = {}
+        ready_futures: dict[str, asyncio.Future] = {}
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider refactoring MCP client initialization into a single helper that returns structured results so you can drop the external futures, flags, and custom timeout handling logic.

You can keep all the new behavior (timeouts, better logging, cancellation) while simplifying the orchestration by pushing the timeout and status handling into a single per‑service helper and removing the `initialized` flag / external futures.

### 1. Collapse `ready_futures` + `initialized` into a per‑service helper

Instead of:
- `tasks` + `ready_futures` dicts
- an `initialized` flag
- wrapper that sets exceptions on an external future

make `_init_mcp_client_task_wrapper` the *only* place that knows about timeout + init vs run, and return a structured result instead of setting an external future:

```python
@dataclass
class MCPInitResult:
    name: str
    ok: bool
    error: Exception | None = None
    timeout: bool = False

async def _run_mcp_client(
    self,
    name: str,
    cfg: dict,
    event: asyncio.Event,
    timeout: float | None = None,
) -> MCPInitResult:
    try:
        if timeout is not None:
            await asyncio.wait_for(self._init_mcp_client(name, cfg), timeout)
        else:
            await self._init_mcp_client(name, cfg)

        await event.wait()
        logger.info(f"收到 MCP 客户端 {name} 终止信号")
        return MCPInitResult(name=name, ok=True)

    except asyncio.TimeoutError as e:
        logger.error(f"MCP 客户端 {name} 初始化超时", exc_info=True)
        return MCPInitResult(name=name, ok=False, error=e, timeout=True)

    except Exception as e:
        logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True)
        return MCPInitResult(name=name, ok=False, error=e)

    finally:
        await self._terminate_mcp_client(name)
```

This removes:
- `ready_future` plumbing
- the `initialized` flag and dual semantics for exceptions
- special `CancelledError` handling to synthesize a `TimeoutError` on the future

### 2. Simplify `init_mcp_clients` orchestration

With `_run_mcp_client` returning an `MCPInitResult`, `init_mcp_clients` no longer needs task maps + ready futures + manual pending tracking; you can rely on `asyncio.gather` and inspect results:

```python
async def init_mcp_clients(self) -> None:
    # ... load mcp_server_json_obj ...

    events: dict[str, asyncio.Event] = {}
    tasks: list[asyncio.Task[MCPInitResult]] = []

    for name, cfg in mcp_server_json_obj.items():
        if not cfg.get("active", True):
            continue

        event = asyncio.Event()
        events[name] = event
        self.mcp_client_event[name] = event

        tasks.append(asyncio.create_task(
            self._run_mcp_client(name, cfg, event, timeout=20.0)
        ))

    if not tasks:
        return

    logger.info(f"等待 {len(tasks)} 个 MCP 服务初始化...")

    results = await asyncio.gather(*tasks)

    success_count = 0
    failed_services: list[str] = []

    for res in results:
        if res.ok:
            success_count += 1
            continue

        failed_services.append(res.name)
        self.mcp_client_event.pop(res.name, None)

        if res.timeout:
            logger.error(f"MCP 服务 {res.name} 初始化超时")
        else:
            logger.error(f"MCP 服务 {res.name} 初始化失败: {res.error}")
            cfg = mcp_server_json_obj.get(res.name, {})
            # keep your existing debug logging of command/url here

    if failed_services:
        logger.warning(
            f"以下 MCP 服务初始化失败: {', '.join(failed_services)}。"
            "请检查配置文件 mcp_server.json 和服务器可用性。"
        )

    logger.info(f"MCP 服务初始化完成: {success_count}/{len(results)} 成功")
```

This preserves:
- per‑service timeout (20s)
- detailed logging and config hints
- cleanup and event registration

while removing:
- `ready_futures` dict
- `asyncio.wait()` + manual `pending_services` computation
- separate cancellation and `TimeoutError` synthesis paths

### 3. Reuse the same helper in `enable_mcp_server`

`enable_mcp_server` can reuse `_run_mcp_client` with a single `await`, instead of managing `ready_future` + `init_task` and duplicating timeout/cancellation logic:

```python
async def enable_mcp_server(
    self,
    name: str,
    config: dict,
    event: asyncio.Event | None = None,
    timeout: float = 10,
) -> None:
    if not event:
        event = asyncio.Event()
    if name in self.mcp_client_dict:
        return

    self.mcp_client_event[name] = event
    result = await self._run_mcp_client(name, config, event, timeout=timeout)

    if not result.ok:
        # cleanup event mapping on failure
        self.mcp_client_event.pop(name, None)
        if result.timeout:
            raise asyncio.TimeoutError(
                f"MCP 服务 {name} 初始化超时（{timeout} 秒）"
            ) from result.error
        raise result.error or Exception(f"MCP 服务 {name} 初始化失败")
```

This keeps:
- the public API’s timeout semantics
- proper cancellation and cleanup
- clear error propagation

and eliminates duplicated timeout/cancel logic plus the extra `ready_future`/`init_task` layer.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

The previous design awaited task completion, but tasks only finish on shutdown (after event.wait()), causing asyncio.wait to always hit the 20s timeout and cancel all clients. Fix: introduce a dedicated ready_event that is set immediately after _init_mcp_client completes. init_mcp_clients now waits only for ready_event (with 20s timeout), while the long-lived client task continues running in the background until shutdown_event is set. This ensures startup returns promptly once clients are ready.

idiotsj · 2026-03-01T12:28:28Z

严肃学习修改

zouyonghe · 2026-03-01T12:30:12Z

@sourcery-ai review

Only log executable name and argument count instead of full command/args to avoid leaking tokens or credentials even at debug level.

sourcery-ai

Hey，我发现了两个问题，并给出了一些整体性的反馈：

在 init_mcp_clients 中因超时取消 MCP 初始化任务时，建议在取消后显式地 await 这些任务（例如通过 asyncio.gather(*cancelled_tasks, return_exceptions=True)），以避免事件循环里残留未结束的任务以及未被观察到的异常。
MCP 初始化目前使用的是硬编码的 20 秒全局超时时间；建议将其提取为一个具名常量或配置项，以便在不同环境中调整时，无需修改逻辑代码。
MCP 客户端初始化失败目前在 _init_mcp_client_task_wrapper（带 exc_info=True）和 init_mcp_clients 中都会记录日志；你可能希望将这部分日志集中到一处或去重，以避免噪音较大的重复错误信息。

给 AI Agent 的提示

Please address the comments from this code review:

## Overall Comments
- When cancelling MCP init tasks on timeout in `init_mcp_clients`, consider explicitly awaiting their cancellation (e.g., via `asyncio.gather(*cancelled_tasks, return_exceptions=True)`) to avoid lingering tasks and unobserved exceptions in the event loop.
- The 20-second global timeout for MCP initialization is currently hard-coded; extracting it into a named constant or configuration option would make it easier to tune across environments without touching the logic.
- Failures during MCP client initialization are logged both in `_init_mcp_client_task_wrapper` (with `exc_info=True`) and again in `init_mcp_clients`; you might want to centralize or de-duplicate this logging to avoid noisy, redundant error messages.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="246-248" />
<code_context>
+
+            wait_coros = [_wait_ready(n, e) for n, e in ready_events.items()]
+            try:
+                await asyncio.wait_for(
+                    asyncio.gather(*wait_coros, return_exceptions=True),
+                    timeout=20.0,
+                )
+            except asyncio.TimeoutError:
</code_context>
<issue_to_address>
**suggestion:** Consider making the MCP initialization timeout configurable instead of hardcoded to 20 seconds.

The 20s default seems fine, but some deployments (e.g., slow remote MCP servers, cold starts) may legitimately take longer. With a hardcoded timeout, those slow-but-healthy services will be cancelled and treated as failures. Please expose this as a configurable value (or at least a module-level constant) so operators can tune it to their environment.

Suggested implementation:

```python
import asyncio


# Default timeout (in seconds) for MCP service initialization.
# Adjust this value or wire it to configuration as needed for your deployment.
MCP_INIT_TIMEOUT: float = 20.0

```

```python
                await asyncio.wait_for(
                    asyncio.gather(*wait_coros, return_exceptions=True),
                    timeout=MCP_INIT_TIMEOUT,
                )

```

If your project has a central configuration system (e.g., settings module, env-based config, or a config object passed into this component), you may want to:
1. Load `MCP_INIT_TIMEOUT` from that configuration instead of hardcoding `20.0` here.
2. Optionally document this setting in your ops/config docs so operators know they can tune MCP initialization timeouts.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="219" />
<code_context>
+        # name -> (shutdown_event, ready_event, task)
+        # shutdown_event: 关机时 set，通知长期运行的 task 退出
+        # ready_event:    初始化完成时 set，用于等待初始化，不阻塞到关机
+        client_info: dict[str, tuple[asyncio.Event, asyncio.Event, asyncio.Task]] = {}
+
+        for name, cfg in mcp_server_json_obj.items():
</code_context>
<issue_to_address>
**issue (complexity):** Consider introducing a small McpClientInfo data class and operating on it directly to simplify client tracking and readiness waiting logic while preserving current behavior.

You can keep the new behavior but reduce complexity by introducing a small data class and simplifying the waiting logic. This also removes tuple index plumbing and `_wait_ready`.

### 1. Replace tuple-based `client_info` with a data class

Define a per-client structure once:

```python
from dataclasses import dataclass

@dataclass
class McpClientInfo:
    name: str
    cfg: dict
    shutdown_event: asyncio.Event
    ready_event: asyncio.Event
    task: asyncio.Task
```

Then refactor the initialization section:

```python
mcp_server_json_obj: dict[str, dict] = json.load(
    open(mcp_json_file, encoding="utf-8"),
)["mcpServers"]

client_infos: dict[str, McpClientInfo] = {}

for name, cfg in mcp_server_json_obj.items():
    if cfg.get("active", True):
        shutdown_event = asyncio.Event()
        ready_event = asyncio.Event()
        task = asyncio.create_task(
            self._init_mcp_client_task_wrapper(
                name=name,
                cfg=cfg,
                shutdown_event=shutdown_event,
                ready_event=ready_event,
            )
        )
        info = McpClientInfo(
            name=name,
            cfg=cfg,
            shutdown_event=shutdown_event,
            ready_event=ready_event,
            task=task,
        )
        client_infos[name] = info
        self.mcp_client_event[name] = shutdown_event
```

This avoids `info[1]`, `info[2]`, and re‑looking up `cfg` later.

### 2. Simplify the readiness waiting logic (remove `_wait_ready`)

Instead of building `ready_events` and `tasks_by_name` maps plus `_wait_ready`, operate directly on `client_infos`:

```python
if client_infos:
    logger.info(f"等待 {len(client_infos)} 个 MCP 服务初始化...")

    wait_tasks = [
        asyncio.create_task(info.ready_event.wait())
        for info in client_infos.values()
    ]
    try:
        await asyncio.wait_for(
            asyncio.gather(*wait_tasks, return_exceptions=True),
            timeout=20.0,
        )
    except asyncio.TimeoutError:
        pass

    success_count = 0
    failed_services: list[str] = []

    for info in client_infos.values():
        if not info.ready_event.is_set():
            logger.error(f"MCP 服务 {info.name} 初始化超时")
            info.task.cancel()
            failed_services.append(info.name)
        elif info.task.done() and info.task.exception() is not None:
            exc = info.task.exception()
            logger.error(f"MCP 服务 {info.name} 初始化失败: {exc}")
            cfg = info.cfg
            if "command" in cfg:
                logger.debug(f"  命令: {cfg['command']}")
                if "args" in cfg:
                    logger.debug(f"  参数: {cfg['args']}")
            elif "url" in cfg:
                parsed = urllib.parse.urlparse(cfg["url"])
                logger.debug(f"  主机: {parsed.scheme}://{parsed.netloc}")
            failed_services.append(info.name)
        else:
            success_count += 1

    if failed_services:
        logger.warning(
            f"以下 MCP 服务初始化失败: {', '.join(failed_services)}。"
            f"请检查配置文件 mcp_server.json 和服务器可用性。"
        )

    logger.info(f"MCP 服务初始化完成: {success_count}/{len(client_infos)} 成功")
```

This keeps all existing behavior (timeout, cancellation, failure logging, URL/command details) but eliminates the extra dictionaries and `_wait_ready` helper.

### 3. Optionally pass a single object to the wrapper

To further reduce parameter juggling, you can pass `McpClientInfo` into the wrapper instead of multiple arguments:

```python
async def _init_mcp_client_task_wrapper(
    self,
    info: McpClientInfo,
) -> None:
    """初始化 MCP 客户端的包装函数。

    我们保证无论成功或失败，ready_event 至少会被 set 一次，
    以解除 init_mcp_clients 中的等待。
    """
    try:
        await self._init_mcp_client(info.name, info.cfg)
        info.ready_event.set()
        await info.shutdown_event.wait()
        logger.info(f"收到 MCP 客户端 {info.name} 终止信号")
    except Exception:
        info.ready_event.set()
        logger.error(f"初始化 MCP 客户端 {info.name} 失败", exc_info=True)
        raise
    finally:
        await self._terminate_mcp_client(info.name)
```

And creation:

```python
info = McpClientInfo(
    name=name,
    cfg=cfg,
    shutdown_event=asyncio.Event(),
    ready_event=asyncio.Event(),
    task=None,  # placeholder
)
info.task = asyncio.create_task(self._init_mcp_client_task_wrapper(info))
client_infos[name] = info
self.mcp_client_event[name] = info.shutdown_event
```

This keeps the lifecycle contract explicit and removes the need to mentally track multiple positional parameters.
</issue_to_address>

Sourcery 对开源项目永久免费——如果你觉得这些 Review 有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进后续 Review。}

Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

When cancelling MCP init tasks on timeout in init_mcp_clients, consider explicitly awaiting their cancellation (e.g., via asyncio.gather(*cancelled_tasks, return_exceptions=True)) to avoid lingering tasks and unobserved exceptions in the event loop.
The 20-second global timeout for MCP initialization is currently hard-coded; extracting it into a named constant or configuration option would make it easier to tune across environments without touching the logic.
Failures during MCP client initialization are logged both in _init_mcp_client_task_wrapper (with exc_info=True) and again in init_mcp_clients; you might want to centralize or de-duplicate this logging to avoid noisy, redundant error messages.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- When cancelling MCP init tasks on timeout in `init_mcp_clients`, consider explicitly awaiting their cancellation (e.g., via `asyncio.gather(*cancelled_tasks, return_exceptions=True)`) to avoid lingering tasks and unobserved exceptions in the event loop.
- The 20-second global timeout for MCP initialization is currently hard-coded; extracting it into a named constant or configuration option would make it easier to tune across environments without touching the logic.
- Failures during MCP client initialization are logged both in `_init_mcp_client_task_wrapper` (with `exc_info=True`) and again in `init_mcp_clients`; you might want to centralize or de-duplicate this logging to avoid noisy, redundant error messages.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="246-248" />
<code_context>
+
+            wait_coros = [_wait_ready(n, e) for n, e in ready_events.items()]
+            try:
+                await asyncio.wait_for(
+                    asyncio.gather(*wait_coros, return_exceptions=True),
+                    timeout=20.0,
+                )
+            except asyncio.TimeoutError:
</code_context>
<issue_to_address>
**suggestion:** Consider making the MCP initialization timeout configurable instead of hardcoded to 20 seconds.

The 20s default seems fine, but some deployments (e.g., slow remote MCP servers, cold starts) may legitimately take longer. With a hardcoded timeout, those slow-but-healthy services will be cancelled and treated as failures. Please expose this as a configurable value (or at least a module-level constant) so operators can tune it to their environment.

Suggested implementation:

```python
import asyncio


# Default timeout (in seconds) for MCP service initialization.
# Adjust this value or wire it to configuration as needed for your deployment.
MCP_INIT_TIMEOUT: float = 20.0

```

```python
                await asyncio.wait_for(
                    asyncio.gather(*wait_coros, return_exceptions=True),
                    timeout=MCP_INIT_TIMEOUT,
                )

```

If your project has a central configuration system (e.g., settings module, env-based config, or a config object passed into this component), you may want to:
1. Load `MCP_INIT_TIMEOUT` from that configuration instead of hardcoding `20.0` here.
2. Optionally document this setting in your ops/config docs so operators know they can tune MCP initialization timeouts.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="219" />
<code_context>
+        # name -> (shutdown_event, ready_event, task)
+        # shutdown_event: 关机时 set，通知长期运行的 task 退出
+        # ready_event:    初始化完成时 set，用于等待初始化，不阻塞到关机
+        client_info: dict[str, tuple[asyncio.Event, asyncio.Event, asyncio.Task]] = {}
+
+        for name, cfg in mcp_server_json_obj.items():
</code_context>
<issue_to_address>
**issue (complexity):** Consider introducing a small McpClientInfo data class and operating on it directly to simplify client tracking and readiness waiting logic while preserving current behavior.

You can keep the new behavior but reduce complexity by introducing a small data class and simplifying the waiting logic. This also removes tuple index plumbing and `_wait_ready`.

### 1. Replace tuple-based `client_info` with a data class

Define a per-client structure once:

```python
from dataclasses import dataclass

@dataclass
class McpClientInfo:
    name: str
    cfg: dict
    shutdown_event: asyncio.Event
    ready_event: asyncio.Event
    task: asyncio.Task
```

Then refactor the initialization section:

```python
mcp_server_json_obj: dict[str, dict] = json.load(
    open(mcp_json_file, encoding="utf-8"),
)["mcpServers"]

client_infos: dict[str, McpClientInfo] = {}

for name, cfg in mcp_server_json_obj.items():
    if cfg.get("active", True):
        shutdown_event = asyncio.Event()
        ready_event = asyncio.Event()
        task = asyncio.create_task(
            self._init_mcp_client_task_wrapper(
                name=name,
                cfg=cfg,
                shutdown_event=shutdown_event,
                ready_event=ready_event,
            )
        )
        info = McpClientInfo(
            name=name,
            cfg=cfg,
            shutdown_event=shutdown_event,
            ready_event=ready_event,
            task=task,
        )
        client_infos[name] = info
        self.mcp_client_event[name] = shutdown_event
```

This avoids `info[1]`, `info[2]`, and re‑looking up `cfg` later.

### 2. Simplify the readiness waiting logic (remove `_wait_ready`)

Instead of building `ready_events` and `tasks_by_name` maps plus `_wait_ready`, operate directly on `client_infos`:

```python
if client_infos:
    logger.info(f"等待 {len(client_infos)} 个 MCP 服务初始化...")

    wait_tasks = [
        asyncio.create_task(info.ready_event.wait())
        for info in client_infos.values()
    ]
    try:
        await asyncio.wait_for(
            asyncio.gather(*wait_tasks, return_exceptions=True),
            timeout=20.0,
        )
    except asyncio.TimeoutError:
        pass

    success_count = 0
    failed_services: list[str] = []

    for info in client_infos.values():
        if not info.ready_event.is_set():
            logger.error(f"MCP 服务 {info.name} 初始化超时")
            info.task.cancel()
            failed_services.append(info.name)
        elif info.task.done() and info.task.exception() is not None:
            exc = info.task.exception()
            logger.error(f"MCP 服务 {info.name} 初始化失败: {exc}")
            cfg = info.cfg
            if "command" in cfg:
                logger.debug(f"  命令: {cfg['command']}")
                if "args" in cfg:
                    logger.debug(f"  参数: {cfg['args']}")
            elif "url" in cfg:
                parsed = urllib.parse.urlparse(cfg["url"])
                logger.debug(f"  主机: {parsed.scheme}://{parsed.netloc}")
            failed_services.append(info.name)
        else:
            success_count += 1

    if failed_services:
        logger.warning(
            f"以下 MCP 服务初始化失败: {', '.join(failed_services)}。"
            f"请检查配置文件 mcp_server.json 和服务器可用性。"
        )

    logger.info(f"MCP 服务初始化完成: {success_count}/{len(client_infos)} 成功")
```

This keeps all existing behavior (timeout, cancellation, failure logging, URL/command details) but eliminates the extra dictionaries and `_wait_ready` helper.

### 3. Optionally pass a single object to the wrapper

To further reduce parameter juggling, you can pass `McpClientInfo` into the wrapper instead of multiple arguments:

```python
async def _init_mcp_client_task_wrapper(
    self,
    info: McpClientInfo,
) -> None:
    """初始化 MCP 客户端的包装函数。

    我们保证无论成功或失败，ready_event 至少会被 set 一次，
    以解除 init_mcp_clients 中的等待。
    """
    try:
        await self._init_mcp_client(info.name, info.cfg)
        info.ready_event.set()
        await info.shutdown_event.wait()
        logger.info(f"收到 MCP 客户端 {info.name} 终止信号")
    except Exception:
        info.ready_event.set()
        logger.error(f"初始化 MCP 客户端 {info.name} 失败", exc_info=True)
        raise
    finally:
        await self._terminate_mcp_client(info.name)
```

And creation:

```python
info = McpClientInfo(
    name=name,
    cfg=cfg,
    shutdown_event=asyncio.Event(),
    ready_event=asyncio.Event(),
    task=None,  # placeholder
)
info.task = asyncio.create_task(self._init_mcp_client_task_wrapper(info))
client_infos[name] = info
self.mcp_client_event[name] = info.shutdown_event
```

This keeps the lifecycle contract explicit and removes the need to mentally track multiple positional parameters.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

- Extract MCP_INIT_TIMEOUT = 20.0 as a named module-level constant - Replace tuple-based client_info with _McpClientInfo dataclass to eliminate index-based access and improve readability - Remove _wait_ready helper; use asyncio.create_task(event.wait()) directly - Await cancelled tasks after timeout to prevent lingering background tasks and unobserved exceptions

- Catch asyncio.CancelledError separately in _init_mcp_client_task_wrapper so ready_event.set() is always called (Python 3.8+ CancelledError inherits BaseException, not Exception) - Cancel and await lingering wait_tasks after timeout to prevent them from hanging indefinitely when ready_event is never set

idiotsj · 2026-03-01T12:39:45Z

已根据审查意见学习修正

zouyonghe · 2026-03-01T12:40:20Z

基于远端最新提交 e2da013 的检查，这个 PR 目前还有 3 个建议先修复的风险点：

enable_mcp_server 与 _init_mcp_client_task_wrapper 参数语义不一致（阻断级）
enable_mcp_server 仍按旧逻辑传 ready_future，但 wrapper 已改成接收 ready_event 并调用 ready_event.set()。这会导致动态启用 MCP 时出现运行时异常或等待超时。
启动超时仍是硬编码常量，未做到可配置
现在只是把 20.0 抽成了 MCP_INIT_TIMEOUT，但没有从配置或环境变量读取。不同部署场景（慢网络/远程 MCP）仍无法按需调参。
URL 日志仍可能泄露敏感信息
失败日志使用 parsed.netloc；如果 URL 包含 user:pass@host，可能把凭据写入日志。建议仅记录 scheme + hostname + 可选端口，不记录用户信息和查询参数。

建议以上 3 点修复后再合并，会更稳妥。

…nfig issues - Fix enable_mcp_server to pass shutdown_event + ready_event instead of ready_future, matching _init_mcp_client_task_wrapper's current signature - Cancel and await init_task on timeout; clean up mcp_client_event on failure - Read MCP_INIT_TIMEOUT from env var ASTRBOT_MCP_INIT_TIMEOUT (default 20s) so operators can tune it without code changes - Strip userinfo from URL in debug log (use hostname+port only, not netloc) to avoid leaking credentials embedded in URLs

sourcery-ai

Hey - 我发现了 1 个问题，并给出了一些整体性的反馈：

将 init_mcp_clients 修改为在所有服务都失败时抛出 MCPAllServicesFailedError，相比之前的版本可能会改变启动行为；请考虑你是否希望在这种情况下管理器初始化直接失败，还是仅记录日志并在 MCP 部分不可用/完全不可用的情况下继续运行。
_wait_mcp_lifecycle_tasks 在任务外层使用了 asyncio.shield，然后在超时时显式取消任务，这让整体流程更难理解；你可以考虑去掉 shield，只依赖 wait_for + 显式取消逻辑来简化实现。

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Changing `init_mcp_clients` to raise `MCPAllServicesFailedError` when all services fail may alter startup behavior compared to previous versions; consider whether you want the manager initialization to hard-fail in this case or just log and continue for partial/zero MCP availability.
- `_wait_mcp_lifecycle_tasks` uses `asyncio.shield` around tasks and then explicitly cancels them on timeout, which makes the flow a bit harder to follow; you could simplify by dropping `shield` and just relying on `wait_for` + explicit cancellation logic.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="196" />
<code_context>
         self.mcp_client_dict: dict[str, MCPClient] = {}
         """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self.mcp_server_runtime: dict[str, _MCPServerRuntime] = {}

     def empty(self) -> bool:
</code_context>
<issue_to_address>
**issue (complexity):** Consider unifying MCP client and runtime management, centralizing start/stop logic, and simplifying timeout helpers to reduce lifecycle complexity and duplication.

You can keep all the new behavior but significantly reduce lifecycle complexity with a couple of targeted restructurings.

### 1. Unify client + runtime into a single source of truth

Right now you have:

- `self.mcp_client_dict[name] -> MCPClient`
- `self.mcp_server_runtime[name] -> _MCPServerRuntime(shutdown_event, lifecycle_task)`

This leads to sync issues and extra checks in `_run_mcp_client`. You can make `_MCPServerRuntime` own the client and use a single dict:

```python
@dataclass
class MCPServerRuntime:
    client: MCPClient
    shutdown_event: asyncio.Event
    lifecycle_task: asyncio.Task[None]

class FunctionToolManager:
    def __init__(self) -> None:
        self.func_list: list[FuncTool] = []
        self.mcp_servers: dict[str, MCPServerRuntime] = {}
```

Update `_init_mcp_client` to just return a ready client:

```python
async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
    if name in self.mcp_servers:
        # reuse runtime.client if you want to support re-init
        await self._terminate_mcp_client(name)

    mcp_client = MCPClient()
    mcp_client.name = name
    # ... existing config/connection logic ...
    return mcp_client
```

Then `_start_mcp_server` becomes both init + runtime registration:

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if name in self.mcp_servers:
        logger.warning(f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。")
        self._log_safe_mcp_debug_config(cfg)
        return

    shutdown_event = shutdown_event or asyncio.Event()

    try:
        client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    except asyncio.TimeoutError as exc:
        raise MCPInitTimeoutError(
            f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）"
        ) from exc

    lifecycle_task = asyncio.create_task(
        self._run_mcp_client(name, shutdown_event),
        name=f"mcp-client:{name}",
    )

    self.mcp_servers[name] = MCPServerRuntime(
        client=client,
        shutdown_event=shutdown_event,
        lifecycle_task=lifecycle_task,
    )
```

Inside `_run_mcp_client`, you no longer need to look up and cross-check tasks; you can let disable/cleanup own the dict mutation:

```python
async def _run_mcp_client(self, name: str, shutdown_event: asyncio.Event) -> None:
    try:
        await shutdown_event.wait()
        logger.info(f"收到 MCP 客户端 {name} 终止信号")
    except asyncio.CancelledError:
        logger.debug(f"MCP 客户端 {name} 任务被取消")
        raise
    finally:
        await self._terminate_mcp_client(name)
```

And `_terminate_mcp_client` would use `self.mcp_servers[name].client` instead of `self.mcp_client_dict`. That removes the parallel maps and the “current task vs stored task” guard.

### 2. Centralize start idempotency

You currently check “already running” both in `_start_mcp_server` and `enable_mcp_server`. You can centralize this:

```python
async def enable_mcp_server(
    self,
    name: str,
    config: dict,
    shutdown_event: asyncio.Event | None = None,
    timeout: float | int | str | None = None,
) -> None:
    timeout_value = _resolve_mcp_timeout(timeout=timeout, init_phase=False)
    await self._start_mcp_server(
        name=name,
        cfg=config,
        shutdown_event=shutdown_event,
        timeout=timeout_value,
    )
```

…and let `_start_mcp_server` be the single place that enforces “already running” semantics (as in the previous snippet). That removes duplicated guards and logging paths.

If you prefer the opposite, make `_start_mcp_server` assume the caller has already checked, and keep the guard only in `enable_mcp_server`/`init_mcp_clients`—but pick one layer.

### 3. Replace the two wait helpers with a single stop helper

`_wait_mcp_lifecycle_task` is a 1-line wrapper over `_wait_mcp_lifecycle_tasks`. You can instead create a single “stop runtime with timeout” helper that disables a large chunk of the spread shutdown logic:

```python
async def _stop_runtime(self, name: str, runtime: MCPServerRuntime, timeout: float) -> None:
    runtime.shutdown_event.set()
    try:
        await asyncio.wait_for(asyncio.shield(runtime.lifecycle_task), timeout=timeout)
    except asyncio.TimeoutError:
        if not runtime.lifecycle_task.done():
            runtime.lifecycle_task.cancel()
        await asyncio.gather(runtime.lifecycle_task, return_exceptions=True)
        raise
    finally:
        # remove from dict & tools only here
        self.mcp_servers.pop(name, None)
        self.func_list = [
            f for f in self.func_list
            if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
        ]
```

Then `disable_mcp_server` becomes very linear:

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    if name:
        runtime = self.mcp_servers.get(name)
        if runtime is None:
            return
        await _stop_runtime(name, runtime, timeout)
    else:
        runtimes = list(self.mcp_servers.items())
        await asyncio.gather(
            *[self._stop_runtime(n, r, timeout) for n, r in runtimes],
            return_exceptions=True,
        )
```

This collapses `disable_mcp_server → _wait_* → lifecycle` into a single helper that is used for both single and bulk shutdown, while keeping your “cancel-on-timeout” behavior.

### 4. Narrow timeout helpers to the two actual use cases

`_resolve_timeout` + `_resolve_mcp_timeout(init_phase=bool)` is generic but adds branching that’s only used in two places. You can keep the behavior but make it more direct:

```python
def _resolve_init_timeout(timeout: float | int | str | None = None) -> float:
    return _resolve_timeout(
        timeout=timeout,
        env_name=MCP_INIT_TIMEOUT_ENV,
        default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
    )

def _resolve_enable_timeout(timeout: float | int | str | None = None) -> float:
    return _resolve_timeout(
        timeout=timeout,
        env_name=ENABLE_MCP_TIMEOUT_ENV,
        default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
    )
```

Then call sites become self-documenting and you can drop the `init_phase` flag:

```python
init_timeout = _resolve_init_timeout()
# ...
timeout_value = _resolve_enable_timeout(timeout)
```

The underlying `_resolve_timeout` stays as-is, so you don’t lose flexibility or behavior.

---

These changes keep your new features (timeouts, per-service summary, better logging, runtime representation) but simplify the lifecycle to:

- One dict: `self.mcp_servers[name] -> MCPServerRuntime(client, event, task)`
- One start function (`_start_mcp_server`) used by both `init_mcp_clients` and `enable_mcp_server`
- One stop helper (`_stop_runtime`) used by both single and bulk shutdown
- Two explicit timeout resolvers for init/enable flows.
</issue_to_address>

Sourcery 对开源项目是免费的——如果你觉得我们的评审有帮助，欢迎分享 ✨

_{帮我变得更有用！请对每条评论点 👍 或 👎，我会根据你的反馈改进后续评审。}

Original comment in English

Hey - I've found 1 issue, and left some high level feedback:

Changing init_mcp_clients to raise MCPAllServicesFailedError when all services fail may alter startup behavior compared to previous versions; consider whether you want the manager initialization to hard-fail in this case or just log and continue for partial/zero MCP availability.
_wait_mcp_lifecycle_tasks uses asyncio.shield around tasks and then explicitly cancels them on timeout, which makes the flow a bit harder to follow; you could simplify by dropping shield and just relying on wait_for + explicit cancellation logic.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Changing `init_mcp_clients` to raise `MCPAllServicesFailedError` when all services fail may alter startup behavior compared to previous versions; consider whether you want the manager initialization to hard-fail in this case or just log and continue for partial/zero MCP availability.
- `_wait_mcp_lifecycle_tasks` uses `asyncio.shield` around tasks and then explicitly cancels them on timeout, which makes the flow a bit harder to follow; you could simplify by dropping `shield` and just relying on `wait_for` + explicit cancellation logic.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="196" />
<code_context>
         self.mcp_client_dict: dict[str, MCPClient] = {}
         """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self.mcp_server_runtime: dict[str, _MCPServerRuntime] = {}

     def empty(self) -> bool:
</code_context>
<issue_to_address>
**issue (complexity):** Consider unifying MCP client and runtime management, centralizing start/stop logic, and simplifying timeout helpers to reduce lifecycle complexity and duplication.

You can keep all the new behavior but significantly reduce lifecycle complexity with a couple of targeted restructurings.

### 1. Unify client + runtime into a single source of truth

Right now you have:

- `self.mcp_client_dict[name] -> MCPClient`
- `self.mcp_server_runtime[name] -> _MCPServerRuntime(shutdown_event, lifecycle_task)`

This leads to sync issues and extra checks in `_run_mcp_client`. You can make `_MCPServerRuntime` own the client and use a single dict:

```python
@dataclass
class MCPServerRuntime:
    client: MCPClient
    shutdown_event: asyncio.Event
    lifecycle_task: asyncio.Task[None]

class FunctionToolManager:
    def __init__(self) -> None:
        self.func_list: list[FuncTool] = []
        self.mcp_servers: dict[str, MCPServerRuntime] = {}
```

Update `_init_mcp_client` to just return a ready client:

```python
async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
    if name in self.mcp_servers:
        # reuse runtime.client if you want to support re-init
        await self._terminate_mcp_client(name)

    mcp_client = MCPClient()
    mcp_client.name = name
    # ... existing config/connection logic ...
    return mcp_client
```

Then `_start_mcp_server` becomes both init + runtime registration:

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if name in self.mcp_servers:
        logger.warning(f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。")
        self._log_safe_mcp_debug_config(cfg)
        return

    shutdown_event = shutdown_event or asyncio.Event()

    try:
        client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    except asyncio.TimeoutError as exc:
        raise MCPInitTimeoutError(
            f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）"
        ) from exc

    lifecycle_task = asyncio.create_task(
        self._run_mcp_client(name, shutdown_event),
        name=f"mcp-client:{name}",
    )

    self.mcp_servers[name] = MCPServerRuntime(
        client=client,
        shutdown_event=shutdown_event,
        lifecycle_task=lifecycle_task,
    )
```

Inside `_run_mcp_client`, you no longer need to look up and cross-check tasks; you can let disable/cleanup own the dict mutation:

```python
async def _run_mcp_client(self, name: str, shutdown_event: asyncio.Event) -> None:
    try:
        await shutdown_event.wait()
        logger.info(f"收到 MCP 客户端 {name} 终止信号")
    except asyncio.CancelledError:
        logger.debug(f"MCP 客户端 {name} 任务被取消")
        raise
    finally:
        await self._terminate_mcp_client(name)
```

And `_terminate_mcp_client` would use `self.mcp_servers[name].client` instead of `self.mcp_client_dict`. That removes the parallel maps and the “current task vs stored task” guard.

### 2. Centralize start idempotency

You currently check “already running” both in `_start_mcp_server` and `enable_mcp_server`. You can centralize this:

```python
async def enable_mcp_server(
    self,
    name: str,
    config: dict,
    shutdown_event: asyncio.Event | None = None,
    timeout: float | int | str | None = None,
) -> None:
    timeout_value = _resolve_mcp_timeout(timeout=timeout, init_phase=False)
    await self._start_mcp_server(
        name=name,
        cfg=config,
        shutdown_event=shutdown_event,
        timeout=timeout_value,
    )
```

…and let `_start_mcp_server` be the single place that enforces “already running” semantics (as in the previous snippet). That removes duplicated guards and logging paths.

If you prefer the opposite, make `_start_mcp_server` assume the caller has already checked, and keep the guard only in `enable_mcp_server`/`init_mcp_clients`—but pick one layer.

### 3. Replace the two wait helpers with a single stop helper

`_wait_mcp_lifecycle_task` is a 1-line wrapper over `_wait_mcp_lifecycle_tasks`. You can instead create a single “stop runtime with timeout” helper that disables a large chunk of the spread shutdown logic:

```python
async def _stop_runtime(self, name: str, runtime: MCPServerRuntime, timeout: float) -> None:
    runtime.shutdown_event.set()
    try:
        await asyncio.wait_for(asyncio.shield(runtime.lifecycle_task), timeout=timeout)
    except asyncio.TimeoutError:
        if not runtime.lifecycle_task.done():
            runtime.lifecycle_task.cancel()
        await asyncio.gather(runtime.lifecycle_task, return_exceptions=True)
        raise
    finally:
        # remove from dict & tools only here
        self.mcp_servers.pop(name, None)
        self.func_list = [
            f for f in self.func_list
            if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
        ]
```

Then `disable_mcp_server` becomes very linear:

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    if name:
        runtime = self.mcp_servers.get(name)
        if runtime is None:
            return
        await _stop_runtime(name, runtime, timeout)
    else:
        runtimes = list(self.mcp_servers.items())
        await asyncio.gather(
            *[self._stop_runtime(n, r, timeout) for n, r in runtimes],
            return_exceptions=True,
        )
```

This collapses `disable_mcp_server → _wait_* → lifecycle` into a single helper that is used for both single and bulk shutdown, while keeping your “cancel-on-timeout” behavior.

### 4. Narrow timeout helpers to the two actual use cases

`_resolve_timeout` + `_resolve_mcp_timeout(init_phase=bool)` is generic but adds branching that’s only used in two places. You can keep the behavior but make it more direct:

```python
def _resolve_init_timeout(timeout: float | int | str | None = None) -> float:
    return _resolve_timeout(
        timeout=timeout,
        env_name=MCP_INIT_TIMEOUT_ENV,
        default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
    )

def _resolve_enable_timeout(timeout: float | int | str | None = None) -> float:
    return _resolve_timeout(
        timeout=timeout,
        env_name=ENABLE_MCP_TIMEOUT_ENV,
        default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
    )
```

Then call sites become self-documenting and you can drop the `init_phase` flag:

```python
init_timeout = _resolve_init_timeout()
# ...
timeout_value = _resolve_enable_timeout(timeout)
```

The underlying `_resolve_timeout` stays as-is, so you don’t lose flexibility or behavior.

---

These changes keep your new features (timeouts, per-service summary, better logging, runtime representation) but simplify the lifecycle to:

- One dict: `self.mcp_servers[name] -> MCPServerRuntime(client, event, task)`
- One start function (`_start_mcp_server`) used by both `init_mcp_clients` and `enable_mcp_server`
- One stop helper (`_stop_runtime`) used by both single and bulk shutdown
- Two explicit timeout resolvers for init/enable flows.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

zouyonghe · 2026-03-02T01:53:19Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了 4 个问题，并留下了一些整体性的反馈：

现在即使所有 MCP 服务启动失败，Manager.initialize 也会静默继续执行（init_mcp_clients 使用 raise_on_all_failed=False 进行等待）；建议要么使用 raise_on_all_failed=True 调用它，要么显式暴露一个更清晰的启动失败信号，以便调用方在没有可用工具时可以决定是否中止初始化。
mcp_client_dict 现在是一个兼容性计算属性，每次访问都会重新构建一个字典；如果仍然有任何热点路径频繁使用这个属性，建议要么将这些调用方迁移为直接使用 mcp_server_runtime，要么对该映射做缓存以避免重复分配。

面向 AI Agents 的提示

Please address the comments from this code review:

## Overall Comments
- Manager.initialize now silently proceeds even if all MCP services fail to start (init_mcp_clients is awaited with raise_on_all_failed=False); consider either calling it with raise_on_all_failed=True or surfacing a clearer startup failure signal so callers can decide whether to abort initialization when no tools are available.
- mcp_client_dict is now a computed compatibility property that rebuilds a dict on every access; if any hot path still uses this attribute frequently, consider either migrating those callers to use mcp_server_runtime directly or caching the mapping to avoid repeated allocations.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="468-408" />
<code_context>
+        """Wait for lifecycle task and cancel on timeout."""
+        await self._wait_mcp_lifecycle_tasks([lifecycle_task], timeout)
+
+    async def _wait_mcp_lifecycle_tasks(
+        self, lifecycle_tasks: list[asyncio.Task[None]], timeout: float
+    ) -> None:
+        """Wait for lifecycle tasks and cancel unfinished tasks on timeout."""
+        if not lifecycle_tasks:
+            return
+        try:
+            await asyncio.wait_for(
+                asyncio.gather(*lifecycle_tasks),
+                timeout=timeout,
+            )
+        except asyncio.TimeoutError:
+            for task in lifecycle_tasks:
+                if not task.done():
+                    task.cancel()
+            await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
+            raise
+
+    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** 考虑对生命周期任务中的非超时异常进行更健壮的处理，以避免部分等待和未取消的任务。

在 `_wait_mcp_lifecycle_tasks` 中，`asyncio.gather(*lifecycle_tasks)` 被 `await` 时没有使用 `return_exceptions=True`。如果某个生命周期任务快速以非 `TimeoutError` 异常失败，该方法会立即抛出异常，并且不会等待剩余任务，导致它们继续运行。在 `disable_mcp_server` 中，这可能会让关闭过程看起来已经完成，但实际上仍有任务在运行。

建议使用 `return_exceptions=True`，这样可以等待所有任务完成，然后在保持超时取消行为的前提下，显式处理/记录失败。例如：

```python
try:
    results = await asyncio.wait_for(
        asyncio.gather(*lifecycle_tasks, return_exceptions=True),
        timeout=timeout,
    )
    for r in results:
        if isinstance(r, Exception):
            logger.error("MCP lifecycle task failed", exc_info=r)
except asyncio.TimeoutError:
    for task in lifecycle_tasks:
        if not task.done():
            task.cancel()
    await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
    raise
```

这样一来，所有生命周期任务要么完成，要么被取消，并且失败会被暴露出来，而不会留下仍在运行的后台任务。

建议实现方式：

```python
    async def _wait_mcp_lifecycle_tasks(
        self, lifecycle_tasks: list[asyncio.Task[None]], timeout: float
    ) -> None:
        """Wait for lifecycle tasks and cancel unfinished tasks on timeout.

        Ensures all tasks are awaited even if some fail, and logs failures explicitly.
        """
        if not lifecycle_tasks:
            return

        try:
            results = await asyncio.wait_for(
                asyncio.gather(*lifecycle_tasks, return_exceptions=True),
                timeout=timeout,
            )
        except asyncio.TimeoutError:
            for task in lifecycle_tasks:
                if not task.done():
                    task.cancel()
            # Ensure all tasks are awaited after cancellation
            await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
            raise
        else:
            # Handle/log non-timeout failures explicitly so they don't get lost
            for result in results:
                if isinstance(result, Exception):
                    logger.error("MCP lifecycle task failed", exc_info=result)

```

1. 确保在此模块中有可用的 `logger`。如果没有，请在 `astrbot/core/provider/func_tool_manager.py` 文件顶部添加：
   ```python
   import logging

   logger = logging.getLogger(__name__)
   ```
2. 如果你的代码库更倾向于使用其他日志记录模式（例如使用共享的 logger 实例），请将 `logger` 替换为相应的 logger 引用。
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="332-334" />
<code_context>
-            return
+            return MCPInitSummary(total=0, success=0, failed=[])

         mcp_server_json_obj: dict[str, dict] = json.load(
             open(mcp_json_file, encoding="utf-8"),
         )["mcpServers"]

</code_context>
<issue_to_address>
**suggestion (bug_risk):** 打开 MCP 配置文件时请使用上下文管理器，以避免文件描述符泄漏。

当前 `json.load(open(...))` 的写法会让文件描述符一直保持打开直到被 GC 回收，这在长时间运行的进程或频繁重载场景中存在风险。建议改为：

```python
with open(mcp_json_file, encoding="utf-8") as f:
    mcp_server_json_obj: dict[str, dict] = json.load(f)["mcpServers"]
```

这样可以让资源管理更加显式和可预测。

```suggestion
        with open(mcp_json_file, encoding="utf-8") as f:
            mcp_server_json_obj: dict[str, dict] = json.load(f)["mcpServers"]
```
</issue_to_address>

### Comment 3
<location path="astrbot/core/provider/func_tool_manager.py" line_range="493-408" />
<code_context>
+        try:
+            await mcp_client.connect_to_server(config, name)
+            tools_res = await mcp_client.list_tools_and_save()
+        except BaseException:
+            try:
+                await mcp_client.cleanup()
+            except Exception as cleanup_exc:
+                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
+            raise
         logger.debug(f"MCP server {name} list tools response: {tools_res}")
         tool_names = [tool.name for tool in tools_res.tools]
</code_context>
<issue_to_address>
**suggestion (bug_risk):** 在 MCP 客户端初始化阶段捕获 `BaseException` 范围过大，可能会干扰取消语义。

在 `_init_mcp_client` 中捕获 `BaseException` 意味着你也会拦截 `KeyboardInterrupt`、`SystemExit` 和与取消相关的异常，这即使在重新抛出异常的情况下，也会在关停和取消行为上产生细微影响。

请将其缩小为捕获 `Exception`，如果需要特殊处理，可以显式添加一个 `except asyncio.CancelledError:` 分支，在其中执行清理后再重新抛出。其余清理逻辑可以保持不变，只是放在更窄的异常类型下。

建议实现方式：

```python
        try:
            await mcp_client.connect_to_server(config, name)
            tools_res = await mcp_client.list_tools_and_save()
        except asyncio.CancelledError:
            try:
                await mcp_client.cleanup()
            except Exception as cleanup_exc:
                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
            raise
        except Exception:
            try:
                await mcp_client.cleanup()
            except Exception as cleanup_exc:
                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
            raise

```

1. 确保在 `astrbot/core/provider/func_tool_manager.py` 文件顶部已导入 `asyncio`。如果还未导入，请添加：
   `import asyncio`
2. 不需要其他行为上的改动，其余逻辑可以保持现状。
</issue_to_address>

### Comment 4
<location path="astrbot/core/provider/func_tool_manager.py" line_range="55" />
<code_context>
+    lifecycle_task: asyncio.Task[None]
+
+
+def _resolve_timeout(
+    timeout: float | int | str | None = None,
+    *,
</code_context>
<issue_to_address>
**issue (complexity):** 建议移除过于薄的包装辅助函数，将幂等性检查和生命周期等待逻辑移动到主要调用点中，以减少间接调用，让 MCP 服务器流程更易理解。

你可以在不改变行为或新特性的前提下，减少一些间接层。

### 1. 合并超时辅助函数

`_resolve_init_timeout` / `_resolve_enable_timeout` 都只是对 `_resolve_timeout` 的一行包装，并且各自只使用了一次。你可以在调用点内联它们，并删除这些薄包装函数：

```python
# before
init_timeout = _resolve_init_timeout()
timeout_value = _resolve_enable_timeout(timeout)

# after
init_timeout = _resolve_timeout(
    timeout=None,
    env_name=MCP_INIT_TIMEOUT_ENV,
    default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
)

timeout_value = _resolve_timeout(
    timeout=timeout,
    env_name=ENABLE_MCP_TIMEOUT_ENV,
    default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
)
```

然后删除 `_resolve_init_timeout` / `_resolve_enable_timeout`。行为和环境变量优先级保持不变，但文件会更扁平、更易浏览。

### 2. 移除单个生命周期等待包装

`_wait_mcp_lifecycle_task` 只是简单调用 `_wait_mcp_lifecycle_tasks`。你可以在两个调用路径中直接使用复数版本的辅助函数，并删除这个额外的包装：

```python
# before
await self._wait_mcp_lifecycle_task(runtime.lifecycle_task, timeout)

# after
await self._wait_mcp_lifecycle_tasks([runtime.lifecycle_task], timeout)
```

然后删除 `_wait_mcp_lifecycle_task`。这样可以保持当前的“超时即取消”语义，但去掉了一层间接调用。

### 3. 集中幂等性检查

目前 `_start_mcp_server` 是幂等的，但对外的入口是 `enable_mcp_server`。你可以保留这种行为，但将检查逻辑移动到 `enable_mcp_server` 中，并简化 `_start_mcp_server`，使其始终执行“实际工作”：

```python
# enable_mcp_server
async def enable_mcp_server(...):
    if name in self.mcp_server_runtime:
        logger.warning(
            f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout_value:g}）。"
        )
        self._log_safe_mcp_debug_config(config)
        return

    timeout_value = _resolve_timeout(
        timeout=timeout,
        env_name=ENABLE_MCP_TIMEOUT_ENV,
        default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
    )
    await self._start_mcp_server(
        name=name,
        cfg=config,
        shutdown_event=shutdown_event,
        timeout=timeout_value,
    )
```

```python
# _start_mcp_server
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if shutdown_event is None:
        shutdown_event = asyncio.Event()

    try:
        mcp_client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    ...
```

这样既保留了“重复 enable 时直接忽略”的幂等行为，又避免在两个地方维护这段逻辑。

### 4. 考虑将生命周期等待逻辑内联到 `disable_mcp_server` 中

鉴于 `_wait_mcp_lifecycle_tasks` 只在 `disable_mcp_server` 中使用，你可以将其逻辑内联到该方法中，减少跨函数跳转：

```python
# single-server case
runtime.shutdown_event.set()
try:
    await asyncio.wait_for(
        asyncio.gather(runtime.lifecycle_task),
        timeout=timeout,
    )
except asyncio.TimeoutError:
    if not runtime.lifecycle_task.done():
        runtime.lifecycle_task.cancel()
    await asyncio.gather(runtime.lifecycle_task, return_exceptions=True)
    raise
finally:
    self.mcp_server_runtime.pop(name, None)
    ...
```

```python
# all-servers case
for runtime in runtimes:
    runtime.shutdown_event.set()

try:
    await asyncio.wait_for(
        asyncio.gather(*(rt.lifecycle_task for rt in runtimes)),
        timeout=timeout,
    )
except asyncio.TimeoutError:
    for task in (rt.lifecycle_task for rt in runtimes):
        if not task.done():
            task.cancel()
    await asyncio.gather(
        *(rt.lifecycle_task for rt in runtimes),
        return_exceptions=True,
    )
    raise
finally:
    self.mcp_server_runtime.clear()
    ...
```

这样可以在不改变超时和取消语义的前提下，让生命周期流程在一个方法中自洽、易于理解。
</issue_to_address>

Sourcery 对开源项目免费——如果你喜欢我们的代码审查，请考虑帮忙分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进后续的代码审查。}

Original comment in English

Hey - I've found 4 issues, and left some high level feedback:

Manager.initialize now silently proceeds even if all MCP services fail to start (init_mcp_clients is awaited with raise_on_all_failed=False); consider either calling it with raise_on_all_failed=True or surfacing a clearer startup failure signal so callers can decide whether to abort initialization when no tools are available.
mcp_client_dict is now a computed compatibility property that rebuilds a dict on every access; if any hot path still uses this attribute frequently, consider either migrating those callers to use mcp_server_runtime directly or caching the mapping to avoid repeated allocations.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Manager.initialize now silently proceeds even if all MCP services fail to start (init_mcp_clients is awaited with raise_on_all_failed=False); consider either calling it with raise_on_all_failed=True or surfacing a clearer startup failure signal so callers can decide whether to abort initialization when no tools are available.
- mcp_client_dict is now a computed compatibility property that rebuilds a dict on every access; if any hot path still uses this attribute frequently, consider either migrating those callers to use mcp_server_runtime directly or caching the mapping to avoid repeated allocations.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="468-408" />
<code_context>
+        """Wait for lifecycle task and cancel on timeout."""
+        await self._wait_mcp_lifecycle_tasks([lifecycle_task], timeout)
+
+    async def _wait_mcp_lifecycle_tasks(
+        self, lifecycle_tasks: list[asyncio.Task[None]], timeout: float
+    ) -> None:
+        """Wait for lifecycle tasks and cancel unfinished tasks on timeout."""
+        if not lifecycle_tasks:
+            return
+        try:
+            await asyncio.wait_for(
+                asyncio.gather(*lifecycle_tasks),
+                timeout=timeout,
+            )
+        except asyncio.TimeoutError:
+            for task in lifecycle_tasks:
+                if not task.done():
+                    task.cancel()
+            await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
+            raise
+
+    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Consider handling non-timeout exceptions in lifecycle tasks more robustly to avoid partial waits and uncancelled tasks.

In `_wait_mcp_lifecycle_tasks`, `asyncio.gather(*lifecycle_tasks)` is awaited without `return_exceptions=True`. If one lifecycle task fails quickly with a non-`TimeoutError` exception, the method raises immediately and never waits on the remaining tasks, leaving them running. In `disable_mcp_server` this can make shutdown appear complete while tasks are still active.

Consider using `return_exceptions=True` so all tasks are awaited and then handling/logging failures explicitly, while keeping the timeout cancellation behavior. For example:

```python
try:
    results = await asyncio.wait_for(
        asyncio.gather(*lifecycle_tasks, return_exceptions=True),
        timeout=timeout,
    )
    for r in results:
        if isinstance(r, Exception):
            logger.error("MCP lifecycle task failed", exc_info=r)
except asyncio.TimeoutError:
    for task in lifecycle_tasks:
        if not task.done():
            task.cancel()
    await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
    raise
```

This way all lifecycle tasks are completed or cancelled, and failures are surfaced without leaving background tasks running.

Suggested implementation:

```python
    async def _wait_mcp_lifecycle_tasks(
        self, lifecycle_tasks: list[asyncio.Task[None]], timeout: float
    ) -> None:
        """Wait for lifecycle tasks and cancel unfinished tasks on timeout.

        Ensures all tasks are awaited even if some fail, and logs failures explicitly.
        """
        if not lifecycle_tasks:
            return

        try:
            results = await asyncio.wait_for(
                asyncio.gather(*lifecycle_tasks, return_exceptions=True),
                timeout=timeout,
            )
        except asyncio.TimeoutError:
            for task in lifecycle_tasks:
                if not task.done():
                    task.cancel()
            # Ensure all tasks are awaited after cancellation
            await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
            raise
        else:
            # Handle/log non-timeout failures explicitly so they don't get lost
            for result in results:
                if isinstance(result, Exception):
                    logger.error("MCP lifecycle task failed", exc_info=result)

```

1. Ensure there is a `logger` available in this module. If not, add at the top of `astrbot/core/provider/func_tool_manager.py`:
   ```python
   import logging

   logger = logging.getLogger(__name__)
   ```
2. If your codebase prefers a different logging pattern (e.g., using a shared logger instance), replace `logger` with the appropriate logger reference.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="332-334" />
<code_context>
-            return
+            return MCPInitSummary(total=0, success=0, failed=[])

         mcp_server_json_obj: dict[str, dict] = json.load(
             open(mcp_json_file, encoding="utf-8"),
         )["mcpServers"]

</code_context>
<issue_to_address>
**suggestion (bug_risk):** Use a context manager when opening the MCP config file to avoid leaking file descriptors.

The current `json.load(open(...))` pattern leaves the file descriptor open until GC, which is risky in long-running processes or frequent reloads. Instead:

```python
with open(mcp_json_file, encoding="utf-8") as f:
    mcp_server_json_obj: dict[str, dict] = json.load(f)["mcpServers"]
```

This keeps resource handling explicit and deterministic.

```suggestion
        with open(mcp_json_file, encoding="utf-8") as f:
            mcp_server_json_obj: dict[str, dict] = json.load(f)["mcpServers"]
```
</issue_to_address>

### Comment 3
<location path="astrbot/core/provider/func_tool_manager.py" line_range="493-408" />
<code_context>
+        try:
+            await mcp_client.connect_to_server(config, name)
+            tools_res = await mcp_client.list_tools_and_save()
+        except BaseException:
+            try:
+                await mcp_client.cleanup()
+            except Exception as cleanup_exc:
+                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
+            raise
         logger.debug(f"MCP server {name} list tools response: {tools_res}")
         tool_names = [tool.name for tool in tools_res.tools]
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Catching `BaseException` during MCP client init is broader than necessary and may interfere with cancellation semantics.

In `_init_mcp_client`, catching `BaseException` means you also intercept `KeyboardInterrupt`, `SystemExit`, and cancellation-related exceptions, which can subtly change shutdown and cancellation behavior even though you re-raise.

Please narrow this to `Exception` and, if you need special handling, add an explicit `except asyncio.CancelledError:` that performs cleanup and then re-raises. The rest of your cleanup logic can stay the same, just under the narrower exception types.

Suggested implementation:

```python
        try:
            await mcp_client.connect_to_server(config, name)
            tools_res = await mcp_client.list_tools_and_save()
        except asyncio.CancelledError:
            try:
                await mcp_client.cleanup()
            except Exception as cleanup_exc:
                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
            raise
        except Exception:
            try:
                await mcp_client.cleanup()
            except Exception as cleanup_exc:
                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
            raise

```

1. Ensure `import asyncio` is present at the top of `astrbot/core/provider/func_tool_manager.py`. If it's not already imported, add:
   `import asyncio`
2. No other behavioral changes are required; the rest of the logic can remain as-is.
</issue_to_address>

### Comment 4
<location path="astrbot/core/provider/func_tool_manager.py" line_range="55" />
<code_context>
+    lifecycle_task: asyncio.Task[None]
+
+
+def _resolve_timeout(
+    timeout: float | int | str | None = None,
+    *,
</code_context>
<issue_to_address>
**issue (complexity):** Consider removing thin wrapper helpers and moving the idempotence and lifecycle-wait logic into the main call sites to reduce indirection and make the MCP server flow easier to follow.

You can trim some indirection without changing behavior or the new features.

### 1. Collapse timeout helpers

`_resolve_init_timeout` / `_resolve_enable_timeout` are 1‑liners over `_resolve_timeout` and only used once each. You can inline them at the call sites and remove the thin wrappers:

```python
# before
init_timeout = _resolve_init_timeout()
timeout_value = _resolve_enable_timeout(timeout)

# after
init_timeout = _resolve_timeout(
    timeout=None,
    env_name=MCP_INIT_TIMEOUT_ENV,
    default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
)

timeout_value = _resolve_timeout(
    timeout=timeout,
    env_name=ENABLE_MCP_TIMEOUT_ENV,
    default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
)
```

Then delete `_resolve_init_timeout` / `_resolve_enable_timeout`. Behavior and env precedence stay identical, but the file is flatter and easier to scan.

### 2. Remove the singular lifecycle wait wrapper

`_wait_mcp_lifecycle_task` only delegates to `_wait_mcp_lifecycle_tasks`. You can use the plural helper directly from both code paths and drop the extra wrapper:

```python
# before
await self._wait_mcp_lifecycle_task(runtime.lifecycle_task, timeout)

# after
await self._wait_mcp_lifecycle_tasks([runtime.lifecycle_task], timeout)
```

Then remove `_wait_mcp_lifecycle_task`. This keeps the “cancel on timeout” semantics exactly as they are now but eliminates one level of indirection.

### 3. Centralize the idempotence check

Right now `_start_mcp_server` is idempotent even though `enable_mcp_server` is the public entry point. You can keep the behavior but move the guard to `enable_mcp_server` and simplify `_start_mcp_server` so it always “does the work”:

```python
# enable_mcp_server
async def enable_mcp_server(...):
    if name in self.mcp_server_runtime:
        logger.warning(
            f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout_value:g}）。"
        )
        self._log_safe_mcp_debug_config(config)
        return

    timeout_value = _resolve_timeout(
        timeout=timeout,
        env_name=ENABLE_MCP_TIMEOUT_ENV,
        default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
    )
    await self._start_mcp_server(
        name=name,
        cfg=config,
        shutdown_event=shutdown_event,
        timeout=timeout_value,
    )
```

```python
# _start_mcp_server
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if shutdown_event is None:
        shutdown_event = asyncio.Event()

    try:
        mcp_client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    ...
```

This keeps the idempotent “ignore duplicate enable” behavior but avoids having that logic in two places.

### 4. Consider inlining lifecycle waiting in `disable_mcp_server`

Given `_wait_mcp_lifecycle_tasks` is only used from `disable_mcp_server`, you can inline its logic there to reduce cross‑jumping:

```python
# single-server case
runtime.shutdown_event.set()
try:
    await asyncio.wait_for(
        asyncio.gather(runtime.lifecycle_task),
        timeout=timeout,
    )
except asyncio.TimeoutError:
    if not runtime.lifecycle_task.done():
        runtime.lifecycle_task.cancel()
    await asyncio.gather(runtime.lifecycle_task, return_exceptions=True)
    raise
finally:
    self.mcp_server_runtime.pop(name, None)
    ...
```

```python
# all-servers case
for runtime in runtimes:
    runtime.shutdown_event.set()

try:
    await asyncio.wait_for(
        asyncio.gather(*(rt.lifecycle_task for rt in runtimes)),
        timeout=timeout,
    )
except asyncio.TimeoutError:
    for task in (rt.lifecycle_task for rt in runtimes):
        if not task.done():
            task.cancel()
    await asyncio.gather(
        *(rt.lifecycle_task for rt in runtimes),
        return_exceptions=True,
    )
    raise
finally:
    self.mcp_server_runtime.clear()
    ...
```

Semantics for timeouts and cancellation remain unchanged, but the lifecycle flow becomes self‑contained in one method.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

idiotsj · 2026-03-02T14:13:50Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了 1 个问题，并给出了一些整体性的反馈：

mcp_client_dict 属性现在返回的是一个派生出来的 dict，而不是原始的映射，因此任何依赖对 mcp_client_dict 进行修改的现有代码都会悄无声息地失效；如果确实有这种写访问的需求，建议要么显式地废弃这种用法，要么提供一个可变的门面（facade）来同步更新 mcp_server_runtime。
在 _wait_mcp_lifecycle_tasks 和 disable_mcp_server 中，当发生超时时，你抛出了一个裸的 asyncio.TimeoutError，但没有说明是哪些服务关闭失败；为提升可操作性，建议用一个领域相关的错误类型封装，或者至少在异常/日志中包含受影响的服务名称。
新的 MCP 初始化超时解析同时使用了 MCP_INIT_TIMEOUT_ENV 和 ENABLE_MCP_TIMEOUT_ENV；为了避免运维人员在“首次启动”和“动态启用”之间配置出不一致的超时，可能需要在文档中说明（或在代码中强制）这两者的交互方式。

给予 AI Agents 的提示

Please address the comments from this code review:

## Overall Comments
- The `mcp_client_dict` property now returns a derived dict instead of the original mapping, so any existing code that mutates `mcp_client_dict` will silently stop working; if such write access is relied on, consider either deprecating it explicitly or providing a mutable facade that updates `mcp_server_runtime`.
- In `_wait_mcp_lifecycle_tasks` and `disable_mcp_server`, when a timeout occurs you raise a bare `asyncio.TimeoutError` without indicating which servers failed to shut down; it would be more actionable to wrap this in a domain-specific error or at least include the affected server names in the exception/log.
- The new MCP init timeout resolution uses both `MCP_INIT_TIMEOUT_ENV` and `ENABLE_MCP_TIMEOUT_ENV`; it may be worth documenting (or enforcing in code) how these interact to avoid operators accidentally configuring inconsistent timeouts for initial boot vs dynamic enable.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="384" />
<code_context>
+            logger.error(msg)
+        return summary
+
+    async def _run_mcp_client(
         self,
         name: str,
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining the MCP client lifecycle logic into `_start_mcp_server` and a single shutdown helper to reduce indirection and duplicated shutdown handling.

You can keep the new semantics but trim some of the abstraction layers. Two places where you get the most complexity reduction for the least change are:

1. **Fold `_run_mcp_client` into `_start_mcp_server` (no dedicated lifecycle method).**  
   `_run_mcp_client` is only a wait + terminate wrapper. You can keep the per‑client task but avoid the extra indirection:

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if name in self.mcp_server_runtime:
        logger.warning(
            f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
        )
        self._log_safe_mcp_debug_config(cfg)
        return

    if shutdown_event is None:
        shutdown_event = asyncio.Event()

    try:
        mcp_client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    except asyncio.TimeoutError as exc:
        raise MCPInitTimeoutError(
            f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）"
        ) from exc
    except Exception:
        logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True)
        raise

    async def lifecycle() -> None:
        try:
            await shutdown_event.wait()
            logger.info(f"收到 MCP 客户端 {name} 终止信号")
        except asyncio.CancelledError:
            logger.debug(f"MCP 客户端 {name} 任务被取消")
            raise
        finally:
            await self._terminate_mcp_client(name)
            current_task = asyncio.current_task()
            runtime = self.mcp_server_runtime.get(name)
            if runtime and runtime.lifecycle_task is current_task:
                self.mcp_server_runtime.pop(name, None)

    lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}")

    self.mcp_server_runtime[name] = _MCPServerRuntime(
        client=mcp_client,
        shutdown_event=shutdown_event,
        lifecycle_task=lifecycle_task,
    )
```

Then you can remove `_run_mcp_client` entirely, keeping the lifecycle behavior but with one less concept to track.


2. **Inline `_wait_mcp_lifecycle_tasks` into a single shutdown helper shared by both branches of `disable_mcp_server`.**  
   Instead of a generic helper plus branching logic, centralize the join/cancel logic into a small private method that works on a list of runtimes. This keeps the mental model local to “shutdown runtimes”:

```python
async def _shutdown_runtimes(
    self,
    runtimes: list[_MCPServerRuntime],
    timeout: float,
) -> None:
    lifecycle_tasks = [rt.lifecycle_task for rt in runtimes if not rt.lifecycle_task.done()]
    if not lifecycle_tasks:
        return

    for rt in runtimes:
        rt.shutdown_event.set()

    try:
        results = await asyncio.wait_for(
            asyncio.gather(*lifecycle_tasks, return_exceptions=True),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        for task in lifecycle_tasks:
            if not task.done():
                task.cancel()
        await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
        raise
    else:
        for result in results:
            if isinstance(result, asyncio.CancelledError):
                logger.debug("MCP lifecycle task was cancelled during shutdown.")
            elif isinstance(result, Exception):
                logger.error(
                    "MCP lifecycle task failed during shutdown.",
                    exc_info=(type(result), result, result.__traceback__),
                )
```

Then `disable_mcp_server` becomes:

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    """Disable an MCP server by its name. If name is None, ALL MCP servers will be disabled."""
    if name:
        runtime = self.mcp_server_runtime.get(name)
        if runtime is None:
            return

        try:
            await self._shutdown_runtimes([runtime], timeout)
        finally:
            self.mcp_server_runtime.pop(name, None)
            self.func_list = [
                f
                for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
    else:
        runtimes = list(self.mcp_server_runtime.values())
        try:
            await self._shutdown_runtimes(runtimes, timeout)
        finally:
            self.mcp_server_runtime.clear()
            self.func_list = [
                f for f in self.func_list if not isinstance(f, MCPTool)
            ]
```

This:

- Removes `_wait_mcp_lifecycle_tasks` entirely.
- Avoids duplicating “set event → wait with timeout → handle cancellation/errors → cleanup” between single and multi‑shutdown cases.
- Keeps the lifecycle logic in one, easily discoverable helper instead of spread across three methods.
</issue_to_address>

Sourcery 对开源项目免费 —— 如果你觉得这些评论有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进之后的代码评审。}

Original comment in English

Hey - I've found 1 issue, and left some high level feedback:

The mcp_client_dict property now returns a derived dict instead of the original mapping, so any existing code that mutates mcp_client_dict will silently stop working; if such write access is relied on, consider either deprecating it explicitly or providing a mutable facade that updates mcp_server_runtime.
In _wait_mcp_lifecycle_tasks and disable_mcp_server, when a timeout occurs you raise a bare asyncio.TimeoutError without indicating which servers failed to shut down; it would be more actionable to wrap this in a domain-specific error or at least include the affected server names in the exception/log.
The new MCP init timeout resolution uses both MCP_INIT_TIMEOUT_ENV and ENABLE_MCP_TIMEOUT_ENV; it may be worth documenting (or enforcing in code) how these interact to avoid operators accidentally configuring inconsistent timeouts for initial boot vs dynamic enable.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `mcp_client_dict` property now returns a derived dict instead of the original mapping, so any existing code that mutates `mcp_client_dict` will silently stop working; if such write access is relied on, consider either deprecating it explicitly or providing a mutable facade that updates `mcp_server_runtime`.
- In `_wait_mcp_lifecycle_tasks` and `disable_mcp_server`, when a timeout occurs you raise a bare `asyncio.TimeoutError` without indicating which servers failed to shut down; it would be more actionable to wrap this in a domain-specific error or at least include the affected server names in the exception/log.
- The new MCP init timeout resolution uses both `MCP_INIT_TIMEOUT_ENV` and `ENABLE_MCP_TIMEOUT_ENV`; it may be worth documenting (or enforcing in code) how these interact to avoid operators accidentally configuring inconsistent timeouts for initial boot vs dynamic enable.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="384" />
<code_context>
+            logger.error(msg)
+        return summary
+
+    async def _run_mcp_client(
         self,
         name: str,
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining the MCP client lifecycle logic into `_start_mcp_server` and a single shutdown helper to reduce indirection and duplicated shutdown handling.

You can keep the new semantics but trim some of the abstraction layers. Two places where you get the most complexity reduction for the least change are:

1. **Fold `_run_mcp_client` into `_start_mcp_server` (no dedicated lifecycle method).**  
   `_run_mcp_client` is only a wait + terminate wrapper. You can keep the per‑client task but avoid the extra indirection:

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if name in self.mcp_server_runtime:
        logger.warning(
            f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
        )
        self._log_safe_mcp_debug_config(cfg)
        return

    if shutdown_event is None:
        shutdown_event = asyncio.Event()

    try:
        mcp_client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    except asyncio.TimeoutError as exc:
        raise MCPInitTimeoutError(
            f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）"
        ) from exc
    except Exception:
        logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True)
        raise

    async def lifecycle() -> None:
        try:
            await shutdown_event.wait()
            logger.info(f"收到 MCP 客户端 {name} 终止信号")
        except asyncio.CancelledError:
            logger.debug(f"MCP 客户端 {name} 任务被取消")
            raise
        finally:
            await self._terminate_mcp_client(name)
            current_task = asyncio.current_task()
            runtime = self.mcp_server_runtime.get(name)
            if runtime and runtime.lifecycle_task is current_task:
                self.mcp_server_runtime.pop(name, None)

    lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}")

    self.mcp_server_runtime[name] = _MCPServerRuntime(
        client=mcp_client,
        shutdown_event=shutdown_event,
        lifecycle_task=lifecycle_task,
    )
```

Then you can remove `_run_mcp_client` entirely, keeping the lifecycle behavior but with one less concept to track.


2. **Inline `_wait_mcp_lifecycle_tasks` into a single shutdown helper shared by both branches of `disable_mcp_server`.**  
   Instead of a generic helper plus branching logic, centralize the join/cancel logic into a small private method that works on a list of runtimes. This keeps the mental model local to “shutdown runtimes”:

```python
async def _shutdown_runtimes(
    self,
    runtimes: list[_MCPServerRuntime],
    timeout: float,
) -> None:
    lifecycle_tasks = [rt.lifecycle_task for rt in runtimes if not rt.lifecycle_task.done()]
    if not lifecycle_tasks:
        return

    for rt in runtimes:
        rt.shutdown_event.set()

    try:
        results = await asyncio.wait_for(
            asyncio.gather(*lifecycle_tasks, return_exceptions=True),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        for task in lifecycle_tasks:
            if not task.done():
                task.cancel()
        await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
        raise
    else:
        for result in results:
            if isinstance(result, asyncio.CancelledError):
                logger.debug("MCP lifecycle task was cancelled during shutdown.")
            elif isinstance(result, Exception):
                logger.error(
                    "MCP lifecycle task failed during shutdown.",
                    exc_info=(type(result), result, result.__traceback__),
                )
```

Then `disable_mcp_server` becomes:

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    """Disable an MCP server by its name. If name is None, ALL MCP servers will be disabled."""
    if name:
        runtime = self.mcp_server_runtime.get(name)
        if runtime is None:
            return

        try:
            await self._shutdown_runtimes([runtime], timeout)
        finally:
            self.mcp_server_runtime.pop(name, None)
            self.func_list = [
                f
                for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
    else:
        runtimes = list(self.mcp_server_runtime.values())
        try:
            await self._shutdown_runtimes(runtimes, timeout)
        finally:
            self.mcp_server_runtime.clear()
            self.func_list = [
                f for f in self.func_list if not isinstance(f, MCPTool)
            ]
```

This:

- Removes `_wait_mcp_lifecycle_tasks` entirely.
- Avoids duplicating “set event → wait with timeout → handle cancellation/errors → cleanup” between single and multi‑shutdown cases.
- Keeps the lifecycle logic in one, easily discoverable helper instead of spread across three methods.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

sourcery-ai · 2026-03-02T14:15:41Z

astrbot/core/provider/func_tool_manager.py

+            logger.error(msg)
+        return summary
+
+    async def _run_mcp_client(


issue (complexity): 考虑将 MCP 客户端生命周期逻辑内联到 _start_mcp_server 中，并配合一个统一的关闭（shutdown）辅助函数，以减少间接层和重复的关闭处理。

你可以保持新的语义不变，只是收紧一下抽象层。下面这两个地方是在改动最小的前提下，能带来最大复杂度降低的点：

将 _run_mcp_client 折叠进 _start_mcp_server（不再单独保留生命周期方法）。
_run_mcp_client 只是一个“等待 + 终止”的包装器。你依然可以保留每个客户端独立的 task，但避免额外的间接调用：

async def _start_mcp_server( self, name: str, cfg: dict, *, shutdown_event: asyncio.Event | None = None, timeout: float, ) -> None: if name in self.mcp_server_runtime: logger.warning( f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。" ) self._log_safe_mcp_debug_config(cfg) return if shutdown_event is None: shutdown_event = asyncio.Event() try: mcp_client = await asyncio.wait_for( self._init_mcp_client(name, cfg), timeout=timeout, ) except asyncio.TimeoutError as exc: raise MCPInitTimeoutError( f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）" ) from exc except Exception: logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True) raise async def lifecycle() -> None: try: await shutdown_event.wait() logger.info(f"收到 MCP 客户端 {name} 终止信号") except asyncio.CancelledError: logger.debug(f"MCP 客户端 {name} 任务被取消") raise finally: await self._terminate_mcp_client(name) current_task = asyncio.current_task() runtime = self.mcp_server_runtime.get(name) if runtime and runtime.lifecycle_task is current_task: self.mcp_server_runtime.pop(name, None) lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}") self.mcp_server_runtime[name] = _MCPServerRuntime( client=mcp_client, shutdown_event=shutdown_event, lifecycle_task=lifecycle_task, )

这样你就可以完全移除 _run_mcp_client，在保持现有生命周期行为的同时，少了一个需要理解的概念。

将 _wait_mcp_lifecycle_tasks 内联为一个在 disable_mcp_server 双分支间共享的统一关闭辅助函数。
与其使用一个通用 helper 再在外层做分支，不如把 join/cancel 的逻辑集中到一个针对“runtimes 列表”的小型私有方法中。这样可以让心智模型更加聚焦于“关闭 runtimes”这一件事：

async def _shutdown_runtimes( self, runtimes: list[_MCPServerRuntime], timeout: float, ) -> None: lifecycle_tasks = [rt.lifecycle_task for rt in runtimes if not rt.lifecycle_task.done()] if not lifecycle_tasks: return for rt in runtimes: rt.shutdown_event.set() try: results = await asyncio.wait_for( asyncio.gather(*lifecycle_tasks, return_exceptions=True), timeout=timeout, ) except asyncio.TimeoutError: for task in lifecycle_tasks: if not task.done(): task.cancel() await asyncio.gather(*lifecycle_tasks, return_exceptions=True) raise else: for result in results: if isinstance(result, asyncio.CancelledError): logger.debug("MCP lifecycle task was cancelled during shutdown.") elif isinstance(result, Exception): logger.error( "MCP lifecycle task failed during shutdown.", exc_info=(type(result), result, result.__traceback__), )

然后 disable_mcp_server 可以变成：

async def disable_mcp_server( self, name: str | None = None, timeout: float = 10, ) -> None: """Disable an MCP server by its name. If name is None, ALL MCP servers will be disabled.""" if name: runtime = self.mcp_server_runtime.get(name) if runtime is None: return try: await self._shutdown_runtimes([runtime], timeout) finally: self.mcp_server_runtime.pop(name, None) self.func_list = [ f for f in self.func_list if not (isinstance(f, MCPTool) and f.mcp_server_name == name) ] else: runtimes = list(self.mcp_server_runtime.values()) try: await self._shutdown_runtimes(runtimes, timeout) finally: self.mcp_server_runtime.clear() self.func_list = [ f for f in self.func_list if not isinstance(f, MCPTool) ]

这样做可以：

完全移除 _wait_mcp_lifecycle_tasks。

避免在“单个关闭”和“批量关闭”的两种场景下重复实现“设置事件 → 带超时等待 → 处理取消/错误 → 清理”的逻辑。

将生命周期逻辑集中在一个容易发现的辅助方法中，而不是分散在三个方法里。

Original comment in English

issue (complexity): Consider inlining the MCP client lifecycle logic into _start_mcp_server and a single shutdown helper to reduce indirection and duplicated shutdown handling.

You can keep the new semantics but trim some of the abstraction layers. Two places where you get the most complexity reduction for the least change are:

Fold _run_mcp_client into _start_mcp_server (no dedicated lifecycle method).
_run_mcp_client is only a wait + terminate wrapper. You can keep the per‑client task but avoid the extra indirection:

async def _start_mcp_server( self, name: str, cfg: dict, *, shutdown_event: asyncio.Event | None = None, timeout: float, ) -> None: if name in self.mcp_server_runtime: logger.warning( f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。" ) self._log_safe_mcp_debug_config(cfg) return if shutdown_event is None: shutdown_event = asyncio.Event() try: mcp_client = await asyncio.wait_for( self._init_mcp_client(name, cfg), timeout=timeout, ) except asyncio.TimeoutError as exc: raise MCPInitTimeoutError( f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）" ) from exc except Exception: logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True) raise async def lifecycle() -> None: try: await shutdown_event.wait() logger.info(f"收到 MCP 客户端 {name} 终止信号") except asyncio.CancelledError: logger.debug(f"MCP 客户端 {name} 任务被取消") raise finally: await self._terminate_mcp_client(name) current_task = asyncio.current_task() runtime = self.mcp_server_runtime.get(name) if runtime and runtime.lifecycle_task is current_task: self.mcp_server_runtime.pop(name, None) lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}") self.mcp_server_runtime[name] = _MCPServerRuntime( client=mcp_client, shutdown_event=shutdown_event, lifecycle_task=lifecycle_task, )

Then you can remove _run_mcp_client entirely, keeping the lifecycle behavior but with one less concept to track.

Inline _wait_mcp_lifecycle_tasks into a single shutdown helper shared by both branches of disable_mcp_server.
Instead of a generic helper plus branching logic, centralize the join/cancel logic into a small private method that works on a list of runtimes. This keeps the mental model local to “shutdown runtimes”:

async def _shutdown_runtimes( self, runtimes: list[_MCPServerRuntime], timeout: float, ) -> None: lifecycle_tasks = [rt.lifecycle_task for rt in runtimes if not rt.lifecycle_task.done()] if not lifecycle_tasks: return for rt in runtimes: rt.shutdown_event.set() try: results = await asyncio.wait_for( asyncio.gather(*lifecycle_tasks, return_exceptions=True), timeout=timeout, ) except asyncio.TimeoutError: for task in lifecycle_tasks: if not task.done(): task.cancel() await asyncio.gather(*lifecycle_tasks, return_exceptions=True) raise else: for result in results: if isinstance(result, asyncio.CancelledError): logger.debug("MCP lifecycle task was cancelled during shutdown.") elif isinstance(result, Exception): logger.error( "MCP lifecycle task failed during shutdown.", exc_info=(type(result), result, result.__traceback__), )

Then disable_mcp_server becomes:

async def disable_mcp_server( self, name: str | None = None, timeout: float = 10, ) -> None: """Disable an MCP server by its name. If name is None, ALL MCP servers will be disabled.""" if name: runtime = self.mcp_server_runtime.get(name) if runtime is None: return try: await self._shutdown_runtimes([runtime], timeout) finally: self.mcp_server_runtime.pop(name, None) self.func_list = [ f for f in self.func_list if not (isinstance(f, MCPTool) and f.mcp_server_name == name) ] else: runtimes = list(self.mcp_server_runtime.values()) try: await self._shutdown_runtimes(runtimes, timeout) finally: self.mcp_server_runtime.clear() self.func_list = [ f for f in self.func_list if not isinstance(f, MCPTool) ]

This:

Removes _wait_mcp_lifecycle_tasks entirely.

Avoids duplicating “set event → wait with timeout → handle cancellation/errors → cleanup” between single and multi‑shutdown cases.

Keeps the lifecycle logic in one, easily discoverable helper instead of spread across three methods.

sourcery-ai

Hey - 我这边发现了 1 个问题，并补充了一些整体层面的反馈：

建议将 mcp_server_runtime 设为私有属性（例如 _mcp_server_runtime），并为仪表盘/其他调用方暴露一个专用的只读访问器，而不是依赖当前的内部 dict 结构，这样未来如果对运行时元数据做重构，不会在无感知的情况下破坏外部代码。
在 _wait_mcp_lifecycle_tasks 中，你在取消剩余任务之后总是重新抛出 TimeoutError；如果允许“部分关闭”，可以考虑记录哪些任务仍然处于挂起状态，并在非严格模式（“全部禁用”）下避免继续向上抛出异常，从而让调用方能区分“部分关闭失败”和“完全关闭失败”。

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Consider making `mcp_server_runtime` private (e.g. `_mcp_server_runtime`) and exposing a dedicated read-only accessor for dashboards/other callers instead of relying on the internal dict structure, so that future refactors of the runtime metadata don’t silently break external code.
- In `_wait_mcp_lifecycle_tasks`, you always re-raise the TimeoutError after cancelling remaining tasks; if partial shutdown is acceptable, you might want to log which tasks are still pending and avoid bubbling the exception in the non-strict ("disable all") case so that callers can distinguish between partial and total shutdown failures.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="384" />
<code_context>
+            logger.error(msg)
+        return summary
+
+    async def _run_mcp_client(
         self,
         name: str,
</code_context>
<issue_to_address>
**issue (complexity):** 考虑将这些较小的辅助协程内联，并合并清理逻辑，使 MCP 的生命周期以及初始化/关闭流程更加集中、易于理解。

1. 将 `_run_mcp_client` 折叠进 `_start_mcp_server`  

`_run_mcp_client` 实际上只是一个简单的“等待事件 → 做清理”的包装器，并且只在 `_start_mcp_server` 中被使用。你可以在 `_start_mcp_server` 中创建一个小的内部协程，把这段逻辑内联进去，这样在保持行为一致的前提下，减少一层生命周期封装，并移除一个“半公开”的辅助函数：

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if name in self.mcp_server_runtime:
        logger.warning(
            f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
        )
        self._log_safe_mcp_debug_config(cfg)
        return

    if shutdown_event is None:
        shutdown_event = asyncio.Event()

    try:
        mcp_client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    except asyncio.TimeoutError as exc:
        raise MCPInitTimeoutError(
            f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）"
        ) from exc
    except Exception:
        logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True)
        raise

    async def lifecycle() -> None:
        try:
            await shutdown_event.wait()
            logger.info(f"收到 MCP 客户端 {name} 终止信号")
        except asyncio.CancelledError:
            logger.debug(f"MCP 客户端 {name} 任务被取消")
            raise
        finally:
            await self._terminate_mcp_client(name)
            runtime = self.mcp_server_runtime.get(name)
            current_task = asyncio.current_task()
            if runtime and runtime.lifecycle_task is current_task:
                self.mcp_server_runtime.pop(name, None)

    lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}")

    self.mcp_server_runtime[name] = _MCPServerRuntime(
        client=mcp_client,
        shutdown_event=shutdown_event,
        lifecycle_task=lifecycle_task,
    )
```

这样就可以移除 `_run_mcp_client`，将生命周期逻辑集中在一个函数中。

---

2. 将 `_wait_mcp_lifecycle_tasks` 内联到 `disable_mcp_server` 中  

`_wait_mcp_lifecycle_tasks` 只在 `disable_mcp_server` 中使用。直接内联其逻辑，可以在不改变行为的情况下，让关闭流程更加直观：

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    if name:
        runtime = self.mcp_server_runtime.get(name)
        if runtime is None:
            return

        runtime.shutdown_event.set()
        lifecycle_tasks = [runtime.lifecycle_task]
    else:
        runtimes = list(self.mcp_server_runtime.values())
        for runtime in runtimes:
            runtime.shutdown_event.set()
        lifecycle_tasks = [runtime.lifecycle_task for runtime in runtimes]

    if not lifecycle_tasks:
        return

    try:
        try:
            results = await asyncio.wait_for(
                asyncio.gather(*lifecycle_tasks, return_exceptions=True),
                timeout=timeout,
            )
        except asyncio.TimeoutError:
            for task in lifecycle_tasks:
                if not task.done():
                    task.cancel()
            await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
            raise
        else:
            for result in results:
                if isinstance(result, asyncio.CancelledError):
                    logger.debug("MCP lifecycle task was cancelled during shutdown.")
                elif isinstance(result, Exception):
                    logger.error(
                        "MCP lifecycle task failed during shutdown.",
                        exc_info=(type(result), result, result.__traceback__),
                    )
    finally:
        if name:
            self.mcp_server_runtime.pop(name, None)
            self.func_list = [
                f
                for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
        else:
            self.mcp_server_runtime.clear()
            self.func_list = [f for f in self.func_list if not isinstance(f, MCPTool)]
```

这样在保留原有超时/取消语义的前提下，去掉了一层额外封装。

---

3. 使用单个 `finally` 简化 `_init_mcp_client` 的清理逻辑  

当前 `_init_mcp_client` 在两个 `except` 分支中重复了清理逻辑。你可以通过一个 `try/except/finally` 加“connected”标志来保持相同行为，同时简化分支：

```python
async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
    """初始化单个MCP客户端"""
    mcp_client = MCPClient()
    mcp_client.name = name
    tools_res = None
    connected = False

    try:
        await mcp_client.connect_to_server(config, name)
        connected = True
        tools_res = await mcp_client.list_tools_and_save()
    except asyncio.CancelledError:
        # 让上层感知取消，但仍在 finally 尝试清理
        raise
    except Exception:
        # 让上层感知失败，但仍在 finally 尝试清理
        raise
    finally:
        if not connected:
            try:
                await mcp_client.cleanup()
            except Exception as cleanup_exc:
                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")

    # 仅在成功连接并获取 tools 时执行后续逻辑
    logger.debug(f"MCP server {name} list tools response: {tools_res}")
    tool_names = [tool.name for tool in tools_res.tools]

    self.func_list = [
        f
        for f in self.func_list
        if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
    ]

    for tool in mcp_client.tools:
        func_tool = MCPTool(
            mcp_tool=tool,
            mcp_client=mcp_client,
            mcp_server_name=name,
        )
        self.func_list.append(func_tool)

    logger.info(f"已连接 MCP 服务 {name}, Tools: {tool_names}")
    return mcp_client
```

这样在减少分支和重复日志的同时，保留了相同的语义（失败/取消时始终尝试清理，成功时保留客户端）。
</issue_to_address>

Sourcery 对开源项目是免费的——如果你觉得这次 Review 有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈改进后续 Review。}

Original comment in English

Hey - I've found 1 issue, and left some high level feedback:

Consider making mcp_server_runtime private (e.g. _mcp_server_runtime) and exposing a dedicated read-only accessor for dashboards/other callers instead of relying on the internal dict structure, so that future refactors of the runtime metadata don’t silently break external code.
In _wait_mcp_lifecycle_tasks, you always re-raise the TimeoutError after cancelling remaining tasks; if partial shutdown is acceptable, you might want to log which tasks are still pending and avoid bubbling the exception in the non-strict ("disable all") case so that callers can distinguish between partial and total shutdown failures.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- Consider making `mcp_server_runtime` private (e.g. `_mcp_server_runtime`) and exposing a dedicated read-only accessor for dashboards/other callers instead of relying on the internal dict structure, so that future refactors of the runtime metadata don’t silently break external code.
- In `_wait_mcp_lifecycle_tasks`, you always re-raise the TimeoutError after cancelling remaining tasks; if partial shutdown is acceptable, you might want to log which tasks are still pending and avoid bubbling the exception in the non-strict ("disable all") case so that callers can distinguish between partial and total shutdown failures.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="384" />
<code_context>
+            logger.error(msg)
+        return summary
+
+    async def _run_mcp_client(
         self,
         name: str,
</code_context>
<issue_to_address>
**issue (complexity):** Consider inlining the small helper coroutines and consolidating cleanup logic to keep the MCP lifecycle and init/shutdown flow localized and simpler to follow.

1. Collapse `_run_mcp_client` into `_start_mcp_server`  

`_run_mcp_client` is essentially a simple “wait on event → cleanup” wrapper, and it’s only used in `_start_mcp_server`. You can reduce lifecycle layering by inlining that logic into a small inner coroutine created in `_start_mcp_server`, keeping behavior identical but removing one public-ish helper:

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    if name in self.mcp_server_runtime:
        logger.warning(
            f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
        )
        self._log_safe_mcp_debug_config(cfg)
        return

    if shutdown_event is None:
        shutdown_event = asyncio.Event()

    try:
        mcp_client = await asyncio.wait_for(
            self._init_mcp_client(name, cfg),
            timeout=timeout,
        )
    except asyncio.TimeoutError as exc:
        raise MCPInitTimeoutError(
            f"MCP 服务 {name} 初始化超时（{timeout:g} 秒）"
        ) from exc
    except Exception:
        logger.error(f"初始化 MCP 客户端 {name} 失败", exc_info=True)
        raise

    async def lifecycle() -> None:
        try:
            await shutdown_event.wait()
            logger.info(f"收到 MCP 客户端 {name} 终止信号")
        except asyncio.CancelledError:
            logger.debug(f"MCP 客户端 {name} 任务被取消")
            raise
        finally:
            await self._terminate_mcp_client(name)
            runtime = self.mcp_server_runtime.get(name)
            current_task = asyncio.current_task()
            if runtime and runtime.lifecycle_task is current_task:
                self.mcp_server_runtime.pop(name, None)

    lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}")

    self.mcp_server_runtime[name] = _MCPServerRuntime(
        client=mcp_client,
        shutdown_event=shutdown_event,
        lifecycle_task=lifecycle_task,
    )
```

With this, `_run_mcp_client` can be removed, and lifecycle flow is localized to a single function.

---

2. Inline `_wait_mcp_lifecycle_tasks` into `disable_mcp_server`  

`_wait_mcp_lifecycle_tasks` is only used in `disable_mcp_server`. Inlining its logic makes the shutdown flow easier to follow without changing behavior:

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    if name:
        runtime = self.mcp_server_runtime.get(name)
        if runtime is None:
            return

        runtime.shutdown_event.set()
        lifecycle_tasks = [runtime.lifecycle_task]
    else:
        runtimes = list(self.mcp_server_runtime.values())
        for runtime in runtimes:
            runtime.shutdown_event.set()
        lifecycle_tasks = [runtime.lifecycle_task for runtime in runtimes]

    if not lifecycle_tasks:
        return

    try:
        try:
            results = await asyncio.wait_for(
                asyncio.gather(*lifecycle_tasks, return_exceptions=True),
                timeout=timeout,
            )
        except asyncio.TimeoutError:
            for task in lifecycle_tasks:
                if not task.done():
                    task.cancel()
            await asyncio.gather(*lifecycle_tasks, return_exceptions=True)
            raise
        else:
            for result in results:
                if isinstance(result, asyncio.CancelledError):
                    logger.debug("MCP lifecycle task was cancelled during shutdown.")
                elif isinstance(result, Exception):
                    logger.error(
                        "MCP lifecycle task failed during shutdown.",
                        exc_info=(type(result), result, result.__traceback__),
                    )
    finally:
        if name:
            self.mcp_server_runtime.pop(name, None)
            self.func_list = [
                f
                for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
        else:
            self.mcp_server_runtime.clear()
            self.func_list = [f for f in self.func_list if not isinstance(f, MCPTool)]
```

This removes an extra layer without changing the timeout/cancel semantics.

---

3. Simplify `_init_mcp_client` cleanup with a single `finally`  

The current `_init_mcp_client` duplicates cleanup logic in two `except` branches. You can keep behavior the same with a single `try/except/finally` and a “connected” flag:

```python
async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
    """初始化单个MCP客户端"""
    mcp_client = MCPClient()
    mcp_client.name = name
    tools_res = None
    connected = False

    try:
        await mcp_client.connect_to_server(config, name)
        connected = True
        tools_res = await mcp_client.list_tools_and_save()
    except asyncio.CancelledError:
        # 让上层感知取消，但仍在 finally 尝试清理
        raise
    except Exception:
        # 让上层感知失败，但仍在 finally 尝试清理
        raise
    finally:
        if not connected:
            try:
                await mcp_client.cleanup()
            except Exception as cleanup_exc:
                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")

    # 仅在成功连接并获取 tools 时执行后续逻辑
    logger.debug(f"MCP server {name} list tools response: {tools_res}")
    tool_names = [tool.name for tool in tools_res.tools]

    self.func_list = [
        f
        for f in self.func_list
        if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
    ]

    for tool in mcp_client.tools:
        func_tool = MCPTool(
            mcp_tool=tool,
            mcp_client=mcp_client,
            mcp_server_name=name,
        )
        self.func_list.append(func_tool)

    logger.info(f"已连接 MCP 服务 {name}, Tools: {tool_names}")
    return mcp_client
```

This keeps the same semantics (always try to cleanup on failure/cancel, leave the client alive on success) while reducing branching and duplicated logging.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

idiotsj · 2026-03-02T14:45:37Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了 3 个问题，并给出了一些整体性的反馈：

在 _init_mcp_client 中，connected 在 connect_to_server 之后立刻被设置为 True，因此如果 list_tools_and_save 抛出异常，即使连接已经建立也不会调用清理逻辑；建议将 connected = True 延后到连接和工具列表都成功之后再设置，或者在失败时始终调用 cleanup()，以避免 MCP 客户端资源泄漏。
mcp_server_runtime 属性被描述为向后兼容的视图，但现在返回的是只读的 MappingProxyType；如果现有调用方会修改这个 dict，那么现在就会在运行时遇到 TypeError。因此可能更安全的做法是要么保持返回可变视图，要么明确将其标记为一个破坏性变更，并相应更新所有调用方。

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `_init_mcp_client`, `connected` is set to `True` immediately after `connect_to_server`, so if `list_tools_and_save` raises an exception there is no cleanup call even though the connection was opened; consider deferring `connected = True` until after both connection and tool listing succeed, or always running `cleanup()` on failure to avoid leaking MCP client resources.
- The `mcp_server_runtime` property is advertised as a backward-compatible view but now returns a read-only `MappingProxyType`; if any existing callers mutate this dict, they will now see runtime `TypeError`s, so it may be safer to either keep a mutable view or explicitly mark this as a breaking change and update callers accordingly.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="564-573" />
<code_context>
+    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
</code_context>
<issue_to_address>
**issue (bug_risk):** MCP 客户端在连接成功后如果失败，可能会因为在 `list_tools_and_save` 失败时跳过清理而发生资源泄漏。

在 `_init_mcp_client` 中，`connected` 在 `connect_to_server` 之后立刻被设置为 `True`，因此 `finally` 中的清理逻辑只会在初始连接失败时执行。如果 `connect_to_server` 成功但 `list_tools_and_save` 抛出异常，该异常会被重新抛出，客户端既不会被注册到 `_mcp_server_runtime` 中，其连接/资源也不会被清理。这会留下一个没有运行时条目的活动连接。请确保在连接成功之后发生的失败同样会触发清理（例如通过使用 `success` 标志或检查 `tools_res is not None`），并且只在客户端完全初始化并注册之后才跳过清理。
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="237-243" />
<code_context>
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
+        self._mcp_client_dict_view = _MCPClientDictView(self._mcp_server_runtime)
+
+    @property
+    def mcp_client_dict(self) -> MutableMapping[str, MCPClient]:
+        """Compatibility view for external callers that still read mcp_client_dict.
+
+        This is a read-only view; write attempts raise TypeError.
+        """
+        return self._mcp_client_dict_view
+
+    @property
</code_context>
<issue_to_address>
**question (bug_risk):** 新的只读 `mcp_client_dict` 视图改变了可变性语义，可能会破坏之前直接修改该 dict 的调用方。

`mcp_client_dict` 现在返回 `_MCPClientDictView`，在发生写操作时会抛出 `TypeError`，而不再像普通可变 dict 那样工作。之前依赖修改该映射的现有集成（例如移除或注入客户端）现在可能会出错。如果这个 API 希望保持向后兼容，建议要么将返回类型改为非可变的 `Mapping` 以明确其只读语义，要么将其明确记录为破坏性变更，并更新所有仓内调用方以避免直接修改该映射。
</issue_to_address>

### Comment 3
<location path="astrbot/core/provider/func_tool_manager.py" line_range="74" />
<code_context>
+    lifecycle_task: asyncio.Task[None]
+
+
+class _MCPClientDictView(MutableMapping[str, MCPClient]):
+    """Read-only compatibility view for mcp_client_dict.
+
</code_context>
<issue_to_address>
**issue (complexity):** 建议用更简单的只读映射代理以及直接明了的 try/except 清理逻辑，替换自定义的 `_MCPClientDictView` 和当前 `_init_mcp_client` 的控制流，以在保持行为的同时降低理解成本。

你可以在保留所有新行为的前提下，精简掉一些较重的抽象。

### 1. 移除 `_MCPClientDictView`，直接暴露一个简单的只读映射

`_MCPClientDictView` 实现了完整的 `MutableMapping` API，却只是为了在写操作时抛出异常，并且在访问客户端时又多了一层间接性。你的单一真值源已经是 `_mcp_server_runtime`；你可以直接通过 `MappingProxyType` 暴露 `{name: MCPClient}` 的只读映射，而无需自定义类。

这样可以在移除一个完整类型和一层间接访问的同时，保持 `mcp_client_dict` 对现有只读代码的兼容性。

```python
from types import MappingProxyType

class FunctionToolManager:
    def __init__(self) -> None:
        self.func_list: list[FuncTool] = []
        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)

    @property
    def mcp_client_dict(self) -> Mapping[str, MCPClient]:
        """Read-only compatibility view for external callers that still read mcp_client_dict."""
        # Build a derived {name: client} mapping from the runtime dict.
        return MappingProxyType(
            {name: runtime.client for name, runtime in self._mcp_server_runtime.items()}
        )

    @property
    def mcp_server_runtime_view(self) -> Mapping[str, _MCPServerRuntime]:
        """Read-only view of MCP runtime metadata for external callers."""
        return self._mcp_server_runtime_view

    @property
    def mcp_server_runtime(self) -> Mapping[str, _MCPServerRuntime]:
        """Backward-compatible read-only view (deprecated)."""
        return self._mcp_server_runtime_view
```

这样做可以：

- 完全移除 `_MCPClientDictView`。
- 保持 `mcp_client_dict[...]` 的读操作可用。
- 维持仅运行时元数据可变的约束，并保证外部代码无法通过 manager 直接修改 MCP 客户端。

如果你担心分配开销，可以缓存派生映射，并在 `_mcp_server_runtime` 变更时将其失效。但在大多数情况下，简单的推导式已经足够且更易于理解。

---

### 2. 简化 `_init_mcp_client` 的错误/清理处理逻辑

当前使用 `connected` / `tools_res is None` 的模式让控制流较难理解。你可以通过直接使用 `try/except/finally`，并在错误路径中进行清理，来保持“失败时总是清理、成功时不清理”的语义：

```python
async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
    """初始化单个MCP客户端"""
    mcp_client = MCPClient()
    mcp_client.name = name

    try:
        await mcp_client.connect_to_server(config, name)
        tools_res = await mcp_client.list_tools_and_save()
    except asyncio.CancelledError:
        # propagate cancellation, but still cleanup
        try:
            await mcp_client.cleanup()
        except Exception as cleanup_exc:
            logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
        raise
    except Exception:
        # any other failure: cleanup and re‑raise
        try:
            await mcp_client.cleanup()
        except Exception as cleanup_exc:
            logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
        raise

    logger.debug(f"MCP server {name} list tools response: {tools_res}")
    tool_names = [tool.name for tool in tools_res.tools]

    # 移除该MCP服务之前的工具（如有）
    self.func_list = [
        f
        for f in self.func_list
        if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
    ]

    # 将 MCP 工具转换为 FuncTool 并添加到 func_list
    for tool in mcp_client.tools:
        func_tool = MCPTool(
            mcp_tool=tool,
            mcp_client=mcp_client,
            mcp_server_name=name,
        )
        self.func_list.append(func_tool)

    logger.info(f"已连接 MCP 服务 {name}, Tools: {tool_names}")
    return mcp_client
```

这样可以保证：

- 在任何失败情况下（包括 `_start_mcp_server` 的超时错误）都会进行清理。
- 成功路径上不会发生额外的清理。
- 清晰地分离“正常路径”和错误处理，降低阅读生命周期行为时的认知负担。

---

仅这两个改动就可以在保留新的超时/生命周期/错误语义的前提下，移除一个自定义类型和若干条件分支。
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据反馈改进之后的评审。}

Original comment in English

Hey - I've found 3 issues, and left some high level feedback:

In _init_mcp_client, connected is set to True immediately after connect_to_server, so if list_tools_and_save raises an exception there is no cleanup call even though the connection was opened; consider deferring connected = True until after both connection and tool listing succeed, or always running cleanup() on failure to avoid leaking MCP client resources.
The mcp_server_runtime property is advertised as a backward-compatible view but now returns a read-only MappingProxyType; if any existing callers mutate this dict, they will now see runtime TypeErrors, so it may be safer to either keep a mutable view or explicitly mark this as a breaking change and update callers accordingly.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `_init_mcp_client`, `connected` is set to `True` immediately after `connect_to_server`, so if `list_tools_and_save` raises an exception there is no cleanup call even though the connection was opened; consider deferring `connected = True` until after both connection and tool listing succeed, or always running `cleanup()` on failure to avoid leaking MCP client resources.
- The `mcp_server_runtime` property is advertised as a backward-compatible view but now returns a read-only `MappingProxyType`; if any existing callers mutate this dict, they will now see runtime `TypeError`s, so it may be safer to either keep a mutable view or explicitly mark this as a breaking change and update callers accordingly.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="564-573" />
<code_context>
+    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
</code_context>
<issue_to_address>
**issue (bug_risk):** MCP client may leak on failure after connecting because cleanup is skipped when `list_tools_and_save` fails.

In `_init_mcp_client`, `connected` is set to `True` right after `connect_to_server`, so the `finally` cleanup only runs when the initial connect fails. If `connect_to_server` succeeds but `list_tools_and_save` raises, the exception is re-raised, the client is never registered in `_mcp_server_runtime`, and its connection/resources are never cleaned up. This leaves a live connection with no runtime entry. Please ensure failures after connect still trigger cleanup (e.g., via a `success` flag or checking `tools_res is not None`), and only skip cleanup once the client is fully initialized and registered.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="237-243" />
<code_context>
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
+        self._mcp_client_dict_view = _MCPClientDictView(self._mcp_server_runtime)
+
+    @property
+    def mcp_client_dict(self) -> MutableMapping[str, MCPClient]:
+        """Compatibility view for external callers that still read mcp_client_dict.
+
+        This is a read-only view; write attempts raise TypeError.
+        """
+        return self._mcp_client_dict_view
+
+    @property
</code_context>
<issue_to_address>
**question (bug_risk):** The new read-only `mcp_client_dict` view changes mutability semantics and may break callers that previously mutated the dict directly.

`mcp_client_dict` now returns `_MCPClientDictView`, which raises `TypeError` on mutation instead of behaving like a plain mutable dict. Existing integrations that relied on mutating this mapping (e.g., removing or injecting clients) may now break. If this API is meant to stay backward compatible, consider either changing the type to a non-mutable `Mapping` to signal immutability, or explicitly documenting this as a breaking change and updating all in-tree usages to avoid direct mutation.
</issue_to_address>

### Comment 3
<location path="astrbot/core/provider/func_tool_manager.py" line_range="74" />
<code_context>
+    lifecycle_task: asyncio.Task[None]
+
+
+class _MCPClientDictView(MutableMapping[str, MCPClient]):
+    """Read-only compatibility view for mcp_client_dict.
+
</code_context>
<issue_to_address>
**issue (complexity):** Consider replacing the custom `_MCPClientDictView` and the current `_init_mcp_client` control flow with simpler, read-only mapping proxies and straightforward try/except cleanup logic to keep behavior while reducing mental overhead.

You can keep all of the new behaviors while trimming a couple of the heavier abstractions.

### 1. Drop `_MCPClientDictView` and expose a simple read‑only mapping

`_MCPClientDictView` implements the full `MutableMapping` API just to throw on writes, and adds an extra layer to reach the client. You already have `_mcp_server_runtime` as the single source of truth; you can expose a read‑only mapping of `{name: MCPClient}` directly via `MappingProxyType` without a custom class.

This keeps `mcp_client_dict` read‑only and compatible for existing read code, while removing one whole type and a layer of indirection.

```python
from types import MappingProxyType

class FunctionToolManager:
    def __init__(self) -> None:
        self.func_list: list[FuncTool] = []
        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)

    @property
    def mcp_client_dict(self) -> Mapping[str, MCPClient]:
        """Read-only compatibility view for external callers that still read mcp_client_dict."""
        # Build a derived {name: client} mapping from the runtime dict.
        return MappingProxyType(
            {name: runtime.client for name, runtime in self._mcp_server_runtime.items()}
        )

    @property
    def mcp_server_runtime_view(self) -> Mapping[str, _MCPServerRuntime]:
        """Read-only view of MCP runtime metadata for external callers."""
        return self._mcp_server_runtime_view

    @property
    def mcp_server_runtime(self) -> Mapping[str, _MCPServerRuntime]:
        """Backward-compatible read-only view (deprecated)."""
        return self._mcp_server_runtime_view
```

This:

- Removes `_MCPClientDictView` entirely.
- Keeps `mcp_client_dict[...]` working for reads.
- Maintains the invariant that only runtime metadata is mutable, and external code cannot mutate MCP clients via the manager.

If you’re worried about allocation, you can cache the derived mapping and invalidate it when `_mcp_server_runtime` changes, but in most cases the simple comprehension is enough and much easier to reason about.

---

### 2. Simplify `_init_mcp_client` error/cleanup handling

The current pattern with `connected`/`tools_res is None` makes the flow harder to follow. You can keep the same semantics (always cleanup on failure, never on success) by using a straight `try/except/finally` with the cleanup only in the error path:

```python
async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
    """初始化单个MCP客户端"""
    mcp_client = MCPClient()
    mcp_client.name = name

    try:
        await mcp_client.connect_to_server(config, name)
        tools_res = await mcp_client.list_tools_and_save()
    except asyncio.CancelledError:
        # propagate cancellation, but still cleanup
        try:
            await mcp_client.cleanup()
        except Exception as cleanup_exc:
            logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
        raise
    except Exception:
        # any other failure: cleanup and re‑raise
        try:
            await mcp_client.cleanup()
        except Exception as cleanup_exc:
            logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
        raise

    logger.debug(f"MCP server {name} list tools response: {tools_res}")
    tool_names = [tool.name for tool in tools_res.tools]

    # 移除该MCP服务之前的工具（如有）
    self.func_list = [
        f
        for f in self.func_list
        if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
    ]

    # 将 MCP 工具转换为 FuncTool 并添加到 func_list
    for tool in mcp_client.tools:
        func_tool = MCPTool(
            mcp_tool=tool,
            mcp_client=mcp_client,
            mcp_server_name=name,
        )
        self.func_list.append(func_tool)

    logger.info(f"已连接 MCP 服务 {name}, Tools: {tool_names}")
    return mcp_client
```

This preserves:

- Cleanup on any failure (including timeout errors from `_start_mcp_server`).
- No extra cleanup on the successful path.
- Clear separation between the “happy path” and error handling, reducing the cognitive load when reading lifecycle behavior.

---

These two changes alone remove one custom type and several conditionals, while keeping your new timeout/lifecycle/error semantics intact.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

idiotsj · 2026-03-02T14:53:11Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了两个问题，并给出了一些整体反馈：

mcp_client_dict 兼容属性在每次访问时都会重建一个新的 dict 和 MappingProxyType；如果它会出现在热路径中，建议要么暴露一个长生命周期的只读视图（类似 _mcp_server_runtime_view），要么至少对该映射做缓存，避免重复分配。
_TIMEOUT_MISMATCH_LOGGED 这个模块级标志会在 _warn_on_timeout_mismatch 中被修改；如果该函数在启动阶段可能被多个任务并发调用，建议让检查/设置操作变为原子（例如使用锁，或者把状态移动到 FunctionToolManager 实例中），以避免并发条件下重复日志的竞态问题。

面向 AI Agent 的提示词

Please address the comments from this code review:

## Overall Comments
- `mcp_client_dict` 兼容属性在每次访问时都会重建一个新的 dict 和 `MappingProxyType`；如果它会出现在热路径中，建议要么暴露一个长生命周期的只读视图（类似 `_mcp_server_runtime_view`），要么至少对该映射做缓存，避免重复分配。
- `_TIMEOUT_MISMATCH_LOGGED` 这个模块级标志会在 `_warn_on_timeout_mismatch` 中被修改；如果该函数在启动阶段可能被多个任务并发调用，建议让检查/设置操作变为原子（例如使用锁，或者把状态移动到 `FunctionToolManager` 实例中），以避免并发条件下重复日志的竞态问题。

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="419-428" />
<code_context>
+            logger.error(msg)
+        return summary
+
+    async def _start_mcp_server(
         self,
         name: str,
         cfg: dict,
-        event: asyncio.Event,
-        ready_future: asyncio.Future | None = None,
+        *,
+        shutdown_event: asyncio.Event | None = None,
+        timeout: float,
     ) -> None:
-        """初始化 MCP 客户端的包装函数，用于捕获异常"""
+        """Initialize MCP server with timeout and register task/event together.
+
+        This method is idempotent. If the server is already running, the existing
+        runtime is kept and the new config is ignored.
+        """
+        if name in self._mcp_server_runtime:
+            logger.warning(
+                f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
</code_context>
<issue_to_address>
**issue (bug_risk):** 考虑对同一 name 并发调用 `_start_mcp_server` 的情况做更稳健的处理。

`if name in self._mcp_server_runtime` 检查和后续插入操作并不是原子的。如果 `enable_mcp_server`（或类似调用方）可以在同一 `name` 上并发运行，多个任务可能会同时通过该检查，各自创建独立的客户端并启动各自的生命周期任务；最终字典中只会保留最后一次赋值，但所有任务都会继续运行。

如果存在并发启动的可能性，建议按名称做同步控制（例如维护一个 `asyncio.Lock` 的字典，或在管理器层面使用一个 `asyncio.Lock`），这样同一时间只有一个 `_start_mcp_server` 能够为某个给定的 `name` 完成初始化。
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="194" />
<code_context>
-        self.mcp_client_dict: dict[str, MCPClient] = {}
-        """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
+        """MCP 服务运行时状态（唯一事实来源）"""
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
</code_context>
<issue_to_address>
**issue (complexity):** 建议通过仅使用一个可变运行时 dict、一个只读视图、更简单的超时不匹配处理，以及更直接的关闭逻辑（去掉每个服务一套生命周期任务和事件）来简化 MCP 运行时/状态管理。

- 双层运行时视图和 `mcp_client_dict` 代理的叠加可以在不牺牲安全性的前提下简化：
  - 过去 `mcp_client_dict` 是一个普通可变 dict；现在每次访问都会构建一个新的 dict 并包一层 `MappingProxyType`，同时还维护了一个缓存的 `_mcp_server_runtime_view`。
  - 你可以：
    * 保持 `_mcp_server_runtime` 作为单一可信的事实来源；
    * 去掉实例级的 `_mcp_server_runtime_view`；
    * 让 `mcp_client_dict` 成为一个普通 dict（与原始类型向后兼容）；
    * 提供一个运行时元数据的只读视图。

  例如：

  ```python
  class FunctionToolManager:
      def __init__(self) -> None:
          self.func_list: list[FuncTool] = []
          self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}

      @property
      def mcp_client_dict(self) -> dict[str, MCPClient]:
          """Compatibility view: name -> MCPClient (mutable like before)."""
          return {name: rt.client for name, rt in self._mcp_server_runtime.items()}

      @property
      def mcp_server_runtime_view(self) -> Mapping[str, _MCPServerRuntime]:
          """Read-only view of MCP runtime metadata."""
          from types import MappingProxyType

          return MappingProxyType(self._mcp_server_runtime)

      # Backward-compatible alias, if needed:
      mcp_server_runtime = mcp_server_runtime_view
  ```

  这样可以去掉一层间接访问和逐次调用 `MappingProxyType` 的开销，同时仍然保留：
  - 单一可变的事实来源；
  - 单一只读视图；
  - 以及之前的 `mcp_client_dict` 使用方式。

---

- 超时不匹配的日志逻辑目前比必要情况更复杂，并且与环境变量/全局状态耦合比较紧。你已经通过 `_resolve_timeout` 解析了超时；可以把 `_warn_on_timeout_mismatch` 改成一个只比较“解析后的值”并在每个实例只记录一次日志的纯函数：

  ```python
  class FunctionToolManager:
      def __init__(self) -> None:
          self._timeout_mismatch_warned = False
          ...

      def _warn_on_timeout_mismatch(
          self,
          init_timeout: float,
          enable_timeout: float,
      ) -> None:
          if self._timeout_mismatch_warned:
              return
          if init_timeout == enable_timeout:
              return

          logger.info(
              "检测到 MCP 初始化超时与动态启用超时配置不同："
              "初始化使用 %s 秒，动态启用使用 %s 秒。如需一致，请设置相同值。",
              f"{init_timeout:g}",
              f"{enable_timeout:g}",
          )
          self._timeout_mismatch_warned = True
  ```

  用法：

  ```python
  init_timeout = _resolve_timeout(...)
  enable_timeout = _resolve_timeout(...)
  self._warn_on_timeout_mismatch(init_timeout, enable_timeout)
  ```

  这样可以去掉：
  - 重复读取环境变量；
  - 模块级的 `_TIMEOUT_MISMATCH_LOGGED` 全局变量；
  - 以及“来自环境或来自参数”的混合逻辑，同时保留原有的告警行为。

---

- 当前的生命周期间接层（`shutdown_event` + 每个服务一个 `lifecycle_task` + `_shutdown_runtimes`）对于“启动客户端，然后在之后带超时地关闭”这个需求来说略显笨重。如果不需要外部组件通过每个服务的 `asyncio.Event` 来发出关闭信号，可以简化为直接的关闭任务，并让 `_MCPServerRuntime` 保持精简：

  ```python
  @dataclass
  class _MCPServerRuntime:
      name: str
      client: MCPClient

  async def _start_mcp_server(self, name: str, cfg: dict, *, timeout: float) -> None:
      if name in self._mcp_server_runtime:
          logger.warning("MCP 服务 %s 已在运行，忽略启用请求（timeout=%g）。", name, timeout)
          self._log_safe_mcp_debug_config(cfg)
          return

      mcp_client = await asyncio.wait_for(
          self._init_mcp_client(name, cfg),
          timeout=timeout,
      )
      self._mcp_server_runtime[name] = _MCPServerRuntime(name=name, client=mcp_client)

  async def _shutdown_runtimes(
      self,
      runtimes: list[_MCPServerRuntime],
      timeout: float,
      *,
      strict: bool = True,
  ) -> list[str]:
      tasks = [
          asyncio.create_task(self._terminate_mcp_client(rt.name), name=f"mcp-shutdown:{rt.name}")
          for rt in runtimes
      ]
      try:
          results = await asyncio.wait_for(
              asyncio.gather(*tasks, return_exceptions=True),
              timeout=timeout,
          )
      except asyncio.TimeoutError:
          pending = [rt.name for rt, t in zip(runtimes, tasks) if not t.done()]
          for t in tasks:
              if not t.done():
                  t.cancel()
          await asyncio.gather(*tasks, return_exceptions=True)
          if strict:
              raise MCPShutdownTimeoutError(pending, timeout)
          logger.warning("MCP 服务关闭超时（%s 秒），未完全关闭：%s", f"{timeout:g}", ", ".join(pending))
          return pending

      for result in results:
          if isinstance(result, Exception) and not isinstance(result, asyncio.CancelledError):
              logger.error("MCP 关闭任务失败", exc_info=(type(result), result, result.__traceback__))
      return []
  ```

  这样可以保留：
  - 受时间限制的关闭；
  - 错误日志和 `MCPShutdownTimeoutError`；
  - 明确的运行时跟踪；

  同时去掉后台生命周期任务和 `shutdown_event` 管道，使控制流更容易理解。
</issue_to_address>

Sourcery 对开源项目免费——如果你觉得我们的评审有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据反馈改进后续的评审。}

Original comment in English

Hey - I've found 2 issues, and left some high level feedback:

The mcp_client_dict compatibility property rebuilds a new dict and MappingProxyType on every access; if it is used in hot paths you might want to either expose a long-lived mapping view (similar to _mcp_server_runtime_view) or at least cache the mapping to avoid repeated allocations.
The _TIMEOUT_MISMATCH_LOGGED module-level flag is mutated from _warn_on_timeout_mismatch; if this function can be called from multiple tasks at startup, consider making the check/set atomic (e.g., using a lock or moving the state into the FunctionToolManager instance) to avoid racy double-logging.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- The `mcp_client_dict` compatibility property rebuilds a new dict and `MappingProxyType` on every access; if it is used in hot paths you might want to either expose a long-lived mapping view (similar to `_mcp_server_runtime_view`) or at least cache the mapping to avoid repeated allocations.
- The `_TIMEOUT_MISMATCH_LOGGED` module-level flag is mutated from `_warn_on_timeout_mismatch`; if this function can be called from multiple tasks at startup, consider making the check/set atomic (e.g., using a lock or moving the state into the `FunctionToolManager` instance) to avoid racy double-logging.

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="419-428" />
<code_context>
+            logger.error(msg)
+        return summary
+
+    async def _start_mcp_server(
         self,
         name: str,
         cfg: dict,
-        event: asyncio.Event,
-        ready_future: asyncio.Future | None = None,
+        *,
+        shutdown_event: asyncio.Event | None = None,
+        timeout: float,
     ) -> None:
-        """初始化 MCP 客户端的包装函数，用于捕获异常"""
+        """Initialize MCP server with timeout and register task/event together.
+
+        This method is idempotent. If the server is already running, the existing
+        runtime is kept and the new config is ignored.
+        """
+        if name in self._mcp_server_runtime:
+            logger.warning(
+                f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
</code_context>
<issue_to_address>
**issue (bug_risk):** Consider handling concurrent `_start_mcp_server` calls for the same name more robustly.

The `if name in self._mcp_server_runtime` check and the subsequent insertion are not atomic. If `enable_mcp_server` (or similar callers) can run concurrently with the same `name`, multiple tasks may pass the check, create separate clients, and spawn multiple lifecycle tasks; only the last assignment will be stored, but all tasks will keep running.

If concurrent startup is possible, consider per-name synchronization (e.g., a dict of `asyncio.Lock`s or a manager-level `asyncio.Lock`) so that only one `_start_mcp_server` can initialize a given `name` at a time.
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="194" />
<code_context>
-        self.mcp_client_dict: dict[str, MCPClient] = {}
-        """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
+        """MCP 服务运行时状态（唯一事实来源）"""
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the MCP runtime/state management by using a single mutable runtime dict, a single read-only view, simpler timeout mismatch handling, and more direct shutdown logic without per-server lifecycle tasks and events.

- The dual runtime views and `mcp_client_dict` proxy layering can be simplified without losing any safety:
  - `mcp_client_dict` used to be a plain mutable dict; now each access builds a new dict and wraps it in a `MappingProxyType`, plus there’s a cached `_mcp_server_runtime_view`.
  - You can:
    * Keep `_mcp_server_runtime` as the single source of truth,
    * Drop the instance-level `_mcp_server_runtime_view`,
    * Make `mcp_client_dict` a plain dict (backward compatible with the original type),
    * Provide a single read-only view for the runtime metadata.

  For example:

  ```python
  class FunctionToolManager:
      def __init__(self) -> None:
          self.func_list: list[FuncTool] = []
          self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}

      @property
      def mcp_client_dict(self) -> dict[str, MCPClient]:
          """Compatibility view: name -> MCPClient (mutable like before)."""
          return {name: rt.client for name, rt in self._mcp_server_runtime.items()}

      @property
      def mcp_server_runtime_view(self) -> Mapping[str, _MCPServerRuntime]:
          """Read-only view of MCP runtime metadata."""
          from types import MappingProxyType

          return MappingProxyType(self._mcp_server_runtime)

      # Backward-compatible alias, if needed:
      mcp_server_runtime = mcp_server_runtime_view
  ```

  This removes one layer of indirection and the per-call `MappingProxyType` while keeping:
  - a single mutable source of truth,
  - a single read-only view,
  - and the previous `mcp_client_dict` usage pattern.

---

- The timeout mismatch logging is more complex than needed and tightly coupled to env/ globals. You already resolve timeouts via `_resolve_timeout`; `_warn_on_timeout_mismatch` can be made a pure function that compares the *resolved* values and logs once per instance:

  ```python
  class FunctionToolManager:
      def __init__(self) -> None:
          self._timeout_mismatch_warned = False
          ...

      def _warn_on_timeout_mismatch(
          self,
          init_timeout: float,
          enable_timeout: float,
      ) -> None:
          if self._timeout_mismatch_warned:
              return
          if init_timeout == enable_timeout:
              return

          logger.info(
              "检测到 MCP 初始化超时与动态启用超时配置不同："
              "初始化使用 %s 秒，动态启用使用 %s 秒。如需一致，请设置相同值。",
              f"{init_timeout:g}",
              f"{enable_timeout:g}",
          )
          self._timeout_mismatch_warned = True
  ```

  Usage:

  ```python
  init_timeout = _resolve_timeout(...)
  enable_timeout = _resolve_timeout(...)
  self._warn_on_timeout_mismatch(init_timeout, enable_timeout)
  ```

  This removes:
  - re-reading env vars,
  - the module-level `_TIMEOUT_MISMATCH_LOGGED` global,
  - and the mixed “from env or param” logic, while keeping the same warning behavior.

---

- The lifecycle indirection (`shutdown_event` + per-server `lifecycle_task` + `_shutdown_runtimes`) is quite heavy for “start client, later shut down with a timeout”. If you don’t need external components to signal shutdown via a per-server `asyncio.Event`, you can simplify to direct shutdown tasks and keep `_MCPServerRuntime` minimal:

  ```python
  @dataclass
  class _MCPServerRuntime:
      name: str
      client: MCPClient

  async def _start_mcp_server(self, name: str, cfg: dict, *, timeout: float) -> None:
      if name in self._mcp_server_runtime:
          logger.warning("MCP 服务 %s 已在运行，忽略启用请求（timeout=%g）。", name, timeout)
          self._log_safe_mcp_debug_config(cfg)
          return

      mcp_client = await asyncio.wait_for(
          self._init_mcp_client(name, cfg),
          timeout=timeout,
      )
      self._mcp_server_runtime[name] = _MCPServerRuntime(name=name, client=mcp_client)

  async def _shutdown_runtimes(
      self,
      runtimes: list[_MCPServerRuntime],
      timeout: float,
      *,
      strict: bool = True,
  ) -> list[str]:
      tasks = [
          asyncio.create_task(self._terminate_mcp_client(rt.name), name=f"mcp-shutdown:{rt.name}")
          for rt in runtimes
      ]
      try:
          results = await asyncio.wait_for(
              asyncio.gather(*tasks, return_exceptions=True),
              timeout=timeout,
          )
      except asyncio.TimeoutError:
          pending = [rt.name for rt, t in zip(runtimes, tasks) if not t.done()]
          for t in tasks:
              if not t.done():
                  t.cancel()
          await asyncio.gather(*tasks, return_exceptions=True)
          if strict:
              raise MCPShutdownTimeoutError(pending, timeout)
          logger.warning("MCP 服务关闭超时（%s 秒），未完全关闭：%s", f"{timeout:g}", ", ".join(pending))
          return pending

      for result in results:
          if isinstance(result, Exception) and not isinstance(result, asyncio.CancelledError):
              logger.error("MCP 关闭任务失败", exc_info=(type(result), result, result.__traceback__))
      return []
  ```

  This preserves:
  - time-bounded shutdown,
  - error logging and `MCPShutdownTimeoutError`,
  - explicit runtime tracking,

  while removing the background lifecycle task and `shutdown_event` plumbing, making control flow easier to follow.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

idiotsj · 2026-03-02T15:03:47Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了 2 个问题

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="552-561" />
<code_context>
+        return []

-    async def _init_mcp_client(self, name: str, config: dict) -> None:
+    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
         """初始化单个MCP客户端"""
-        # 先清理之前的客户端，如果存在
-        if name in self.mcp_client_dict:
-            await self._terminate_mcp_client(name)
-
         mcp_client = MCPClient()
         mcp_client.name = name
-        self.mcp_client_dict[name] = mcp_client
-        await mcp_client.connect_to_server(config, name)
-        tools_res = await mcp_client.list_tools_and_save()
+        try:
+            await mcp_client.connect_to_server(config, name)
+            tools_res = await mcp_client.list_tools_and_save()
+        except asyncio.CancelledError:
+            try:
+                await mcp_client.cleanup()
+            except Exception as cleanup_exc:
+                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
+            raise
</code_context>
<issue_to_address>
**suggestion:** `_init_mcp_client` 中的清理逻辑在两个 `except` 分支里重复出现，可以重构一下以提升可维护性。

`asyncio.CancelledError` 和通用 `Exception` 分支目前都执行相同的 `mcp_client.cleanup()` 加日志的逻辑。为了减少重复代码以及未来可能的逻辑偏移，建议将这部分提取到一个共享的辅助方法中，或者重构 `try/except` 结构（例如使用单一的 `except BaseException:`，在其中对 `CancelledError` 做重新抛出），使清理逻辑只在一个地方维护。

建议实现方式：

```python
    async def _cleanup_mcp_client_safely(self, mcp_client: MCPClient, name: str) -> None:
        """安全清理单个 MCP 客户端，避免清理异常中断主流程"""
        try:
            await mcp_client.cleanup()
        except Exception as cleanup_exc:  # noqa: BLE001 - 这里仅做日志记录
            logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")

    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
        """初始化单个MCP客户端"""

```

```python
        try:
            await mcp_client.connect_to_server(config, name)
            tools_res = await mcp_client.list_tools_and_save()
        except asyncio.CancelledError:
            await self._cleanup_mcp_client_safely(mcp_client, name)
            raise
        except Exception as exc:
            await self._cleanup_mcp_client_safely(mcp_client, name)
            raise

```

1. 如果在 `except Exception as exc:` 分支中原本还有额外的错误日志（例如记录初始化失败原因），请在 `raise` 之前重新补上这些日志逻辑。
2. 如果当前文件未导入 `MCPClient` 或 `logger`，请确认顶部已正确导入这些符号。
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="194" />
<code_context>
-        self.mcp_client_dict: dict[str, MCPClient] = {}
-        """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
+        """MCP 服务运行时状态（唯一事实来源）"""
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
</code_context>
<issue_to_address>
**issue (complexity):** 建议通过直接从 `_mcp_server_runtime` 推导客户端映射，并使用单一锁来保护所有运行时状态的修改，而不是依赖多个缓存和按服务划分的锁，以简化 MCP 运行时管理逻辑。

在保留你引入的所有新特性（超时、摘要、生命周期管理、更好的日志）的前提下，可以通过以下方式消除一部分复杂度：

1. **移除反规范化的 `_mcp_client_dict_cache`**  
2. **用单一运行时锁替换“锁的字典”模式**

### 1. 从 `_mcp_server_runtime` 推导 `mcp_client_dict`

当前 `_MCPServerRuntime` 已经持有 `client`，而 `_mcp_client_dict_cache` 需要在多个地方保持同步（`_start_mcp_server`、`_terminate_mcp_client`、`init_mcp_clients`、`disable_mcp_server`、生命周期任务等）。

你可以将 `_mcp_server_runtime` 作为单一事实来源，通过派生方式提供相同的只读 API，而无需额外的缓存字典：

```python
class FunctionToolManager:
    def __init__(self) -> None:
        self.func_list: list[FuncTool] = []
        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
        # REMOVE: self._mcp_client_dict_cache and _mcp_client_dict_view

    @property
    def mcp_client_dict(self) -> Mapping[str, MCPClient]:
        """Read-only compatibility view."""
        # Build a derived mapping on access instead of caching another dict
        return MappingProxyType(
            {name: runtime.client for name, runtime in self._mcp_server_runtime.items()}
        )
```

然后在其余代码中：

- 删除所有对 `self._mcp_client_dict_cache[...] = ...` 和 `.pop(..., None)` 的调用。
- 读取 `self._mcp_client_dict_cache` 的地方，可以改用 `self.mcp_client_dict`（或者在已拿到 runtime 的上下文中，直接使用 `self._mcp_server_runtime[name].client`）。

例如：

```python
# 在 _start_mcp_server 的生命周期清理中：
if runtime and runtime.lifecycle_task is current_task:
    self._mcp_server_runtime.pop(name, None)
    # REMOVE: self._mcp_client_dict_cache.pop(name, None)

# 在 _terminate_mcp_client 中：
runtime = self._mcp_server_runtime.get(name)
if runtime:
    client = runtime.client
    ...
    # REMOVE: self._mcp_client_dict_cache.pop(name, None)
```

这样可以在保留外部行为（`name -> MCPClient` 的映射）的同时，避免状态重复和一致性问题。

### 2. 用单一运行时锁替代“每个服务一个锁 + 锁字典”

除非你有明确需求希望对不同服务的操作可以完全并行执行，否则 `self._mcp_server_locks` 加 `self._mcp_server_locks_lock` 的模式可能比实际需要的机制更复杂。

你可以引入一个单一的锁，用于保护 `_mcp_server_runtime` 的变更以及服务的（反）初始化操作。这类操作相对比较少见，使用单一锁通常足够：

```python
class FunctionToolManager:
    def __init__(self) -> None:
        ...
        self._runtime_lock = asyncio.Lock()
```

在 `_start_mcp_server` 中使用它：

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    async with self._runtime_lock:
        if name in self._mcp_server_runtime:
            logger.warning(
                f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
            )
            self._log_safe_mcp_debug_config(cfg)
            return

        if shutdown_event is None:
            shutdown_event = asyncio.Event()

        try:
            mcp_client = await asyncio.wait_for(
                self._init_mcp_client(name, cfg),
                timeout=timeout,
            )
        ...
        lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}")
        self._mcp_server_runtime[name] = _MCPServerRuntime(
            name=name,
            client=mcp_client,
            shutdown_event=shutdown_event,
            lifecycle_task=lifecycle_task,
        )
```

在 `disable_mcp_server` 中对 `_mcp_server_runtime` 做修改时同样使用该锁：

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    if name:
        async with self._runtime_lock:
            runtime = self._mcp_server_runtime.get(name)
        if runtime is None:
            return

        try:
            await self._shutdown_runtimes([runtime], timeout, strict=True)
        finally:
            async with self._runtime_lock:
                self._mcp_server_runtime.pop(name, None)
            self.func_list = [
                f
                for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
    else:
        async with self._runtime_lock:
            runtimes = list(self._mcp_server_runtime.values())
        try:
            await self._shutdown_runtimes(runtimes, timeout, strict=False)
        finally:
            async with self._runtime_lock:
                self._mcp_server_runtime.clear()
            self.func_list = [
                f for f in self.func_list if not isinstance(f, MCPTool)
            ]
```

这样既保留了新的生命周期和超时行为，又能显著减少：

- 需要保持同步的共享结构数量；  
- 围绕“每个服务一个锁 + 锁字典”这套机制带来的心智负担。
</issue_to_address>

Sourcery 对开源项目免费使用——如果你觉得我们的评审有帮助，欢迎分享给更多人 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈来改进后续的评审质量。}

Original comment in English

Hey - I've found 2 issues

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="552-561" />
<code_context>
+        return []

-    async def _init_mcp_client(self, name: str, config: dict) -> None:
+    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
         """初始化单个MCP客户端"""
-        # 先清理之前的客户端，如果存在
-        if name in self.mcp_client_dict:
-            await self._terminate_mcp_client(name)
-
         mcp_client = MCPClient()
         mcp_client.name = name
-        self.mcp_client_dict[name] = mcp_client
-        await mcp_client.connect_to_server(config, name)
-        tools_res = await mcp_client.list_tools_and_save()
+        try:
+            await mcp_client.connect_to_server(config, name)
+            tools_res = await mcp_client.list_tools_and_save()
+        except asyncio.CancelledError:
+            try:
+                await mcp_client.cleanup()
+            except Exception as cleanup_exc:
+                logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")
+            raise
</code_context>
<issue_to_address>
**suggestion:** The cleanup logic in `_init_mcp_client` is duplicated across two `except` blocks and could be refactored for maintainability.

Both the `asyncio.CancelledError` and generic `Exception` branches perform the same `mcp_client.cleanup()` + logging sequence. To reduce duplication and future drift, consider factoring this into a shared helper or restructuring the `try/except` (e.g., a single `except BaseException:` that re-raises `CancelledError`) so the cleanup logic lives in one place.

Suggested implementation:

```python
    async def _cleanup_mcp_client_safely(self, mcp_client: MCPClient, name: str) -> None:
        """安全清理单个 MCP 客户端，避免清理异常中断主流程"""
        try:
            await mcp_client.cleanup()
        except Exception as cleanup_exc:  # noqa: BLE001 - 这里仅做日志记录
            logger.error(f"清理 MCP 客户端资源 {name} 失败: {cleanup_exc}")

    async def _init_mcp_client(self, name: str, config: dict) -> MCPClient:
        """初始化单个MCP客户端"""

```

```python
        try:
            await mcp_client.connect_to_server(config, name)
            tools_res = await mcp_client.list_tools_and_save()
        except asyncio.CancelledError:
            await self._cleanup_mcp_client_safely(mcp_client, name)
            raise
        except Exception as exc:
            await self._cleanup_mcp_client_safely(mcp_client, name)
            raise

```

1. 如果在 `except Exception as exc:` 分支中原本还有额外的错误日志（例如记录初始化失败原因），请在 `raise` 之前重新补上这些日志逻辑。
2. 如果当前文件未导入 `MCPClient` 或 `logger`，请确认顶部已正确导入这些符号。
</issue_to_address>

### Comment 2
<location path="astrbot/core/provider/func_tool_manager.py" line_range="194" />
<code_context>
-        self.mcp_client_dict: dict[str, MCPClient] = {}
-        """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
+        """MCP 服务运行时状态（唯一事实来源）"""
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying the MCP runtime management by deriving the client mapping directly from `_mcp_server_runtime` and guarding all runtime mutations with a single lock instead of multiple caches and per-server locks.

You can keep all the new features (timeouts, summaries, lifecycle, better logging) but remove a chunk of complexity by:

1. **Eliminating the denormalized `_mcp_client_dict_cache`**
2. **Replacing the lock-of-locks with a single runtime lock**

### 1. Derive `mcp_client_dict` from `_mcp_server_runtime`

Right now `_MCPServerRuntime` already holds the `client`, and `_mcp_client_dict_cache` is kept in sync in multiple places (`_start_mcp_server`, `_terminate_mcp_client`, `init_mcp_clients`, `disable_mcp_server`, lifecycle task).

You can treat `_mcp_server_runtime` as the single source of truth, and provide the same read‑only API without a separate cache:

```python
class FunctionToolManager:
    def __init__(self) -> None:
        self.func_list: list[FuncTool] = []
        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
        # REMOVE: self._mcp_client_dict_cache and _mcp_client_dict_view

    @property
    def mcp_client_dict(self) -> Mapping[str, MCPClient]:
        """Read-only compatibility view."""
        # Build a derived mapping on access instead of caching another dict
        return MappingProxyType(
            {name: runtime.client for name, runtime in self._mcp_server_runtime.items()}
        )
```

Then, in the rest of the code:

- Drop all `self._mcp_client_dict_cache[...] = ...` and `.pop(..., None)` calls.
- Places that read `self._mcp_client_dict_cache` can instead use `self.mcp_client_dict` (or `self._mcp_server_runtime[name].client` where you’re already dealing with runtimes).

For example:

```python
# In _start_mcp_server lifecycle cleanup:
if runtime and runtime.lifecycle_task is current_task:
    self._mcp_server_runtime.pop(name, None)
    # REMOVE: self._mcp_client_dict_cache.pop(name, None)

# In _terminate_mcp_client:
runtime = self._mcp_server_runtime.get(name)
if runtime:
    client = runtime.client
    ...
    # REMOVE: self._mcp_client_dict_cache.pop(name, None)
```

This keeps the external behavior (a mapping of `name -> MCPClient`) while avoiding state duplication and consistency hazards.

### 2. Replace per-server locks + lock-of-locks with a single runtime lock

Unless you have a concrete requirement for concurrent operations on different servers to proceed in parallel, the `self._mcp_server_locks` + `self._mcp_server_locks_lock` pattern is probably more machinery than you need.

You can instead introduce a single lock that protects `_mcp_server_runtime` mutations and server (de)initialization, which are comparatively rare operations:

```python
class FunctionToolManager:
    def __init__(self) -> None:
        ...
        self._runtime_lock = asyncio.Lock()
```

Use it in `_start_mcp_server`:

```python
async def _start_mcp_server(
    self,
    name: str,
    cfg: dict,
    *,
    shutdown_event: asyncio.Event | None = None,
    timeout: float,
) -> None:
    async with self._runtime_lock:
        if name in self._mcp_server_runtime:
            logger.warning(
                f"MCP 服务 {name} 已在运行，忽略本次启用请求（timeout={timeout:g}）。"
            )
            self._log_safe_mcp_debug_config(cfg)
            return

        if shutdown_event is None:
            shutdown_event = asyncio.Event()

        try:
            mcp_client = await asyncio.wait_for(
                self._init_mcp_client(name, cfg),
                timeout=timeout,
            )
        ...
        lifecycle_task = asyncio.create_task(lifecycle(), name=f"mcp-client:{name}")
        self._mcp_server_runtime[name] = _MCPServerRuntime(
            name=name,
            client=mcp_client,
            shutdown_event=shutdown_event,
            lifecycle_task=lifecycle_task,
        )
```

And in `disable_mcp_server` when mutating `_mcp_server_runtime`:

```python
async def disable_mcp_server(
    self,
    name: str | None = None,
    timeout: float = 10,
) -> None:
    if name:
        async with self._runtime_lock:
            runtime = self._mcp_server_runtime.get(name)
        if runtime is None:
            return

        try:
            await self._shutdown_runtimes([runtime], timeout, strict=True)
        finally:
            async with self._runtime_lock:
                self._mcp_server_runtime.pop(name, None)
            self.func_list = [
                f
                for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
    else:
        async with self._runtime_lock:
            runtimes = list(self._mcp_server_runtime.values())
        try:
            await self._shutdown_runtimes(runtimes, timeout, strict=False)
        finally:
            async with self._runtime_lock:
                self._mcp_server_runtime.clear()
            self.func_list = [
                f for f in self.func_list if not isinstance(f, MCPTool)
            ]
```

This keeps all new lifecycle and timeout behavior, but significantly reduces:

- the number of shared structures you have to keep in sync, and  
- the conceptual overhead around per-server locking and a lock-of-locks map.
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

idiotsj · 2026-03-02T15:18:58Z

@sourcery-ai review

sourcery-ai

Hey - 我发现了 1 个问题

给 AI Agent 的提示

请根据本次代码审查中的评论进行修改：

## 单独评论

### 评论 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="194" />
<code_context>
-        self.mcp_client_dict: dict[str, MCPClient] = {}
-        """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
+        """MCP 服务运行时状态（唯一事实来源）"""
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
</code_context>
<issue_to_address>
**issue (complexity):** 建议通过使用单一运行时“唯一事实来源”来简化 MCP 客户端/状态管理，在 `_terminate_mcp_client` 中集中处理清理逻辑，并预先计算超时时间，从而避免重复的逻辑和状态。

你可以在保留所有新行为（超时、生命周期、摘要等）的前提下，通过以下方式显著降低复杂度：

---

### 1. 移除 `_mcp_client_dict_cache`，从单一事实来源派生客户端映射

目前你维护了：

- `_mcp_server_runtime`
- `_mcp_client_dict_cache`
- 两个 mapping proxy
- 并且在多个位置更新两个结构（`_start_mcp_server`、`_terminate_mcp_client`、`disable_mcp_server`、`_shutdown_runtimes` 生命周期）。

你可以保留 `_mcp_server_runtime` 作为唯一权威结构，并将 `mcp_client_dict` 暴露为从它派生的只读视图。这样既不会改变公共 API，又能去掉重复的状态和清理路径。

**之前（简化版）：**

```python
self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
self._mcp_client_dict_cache: dict[str, MCPClient] = {}
self._mcp_client_dict_view = MappingProxyType(self._mcp_client_dict_cache)

@property
def mcp_client_dict(self) -> Mapping[str, MCPClient]:
    return self._mcp_client_dict_view
```

**之后（单一事实来源）：**

```python
self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)

@property
def mcp_client_dict(self) -> Mapping[str, MCPClient]:
    # 只读兼容视图，从运行时状态派生
    return MappingProxyType({
        name: runtime.client
        for name, runtime in self._mcp_server_runtime_view.items()
    })

@property
def mcp_server_runtime_view(self) -> Mapping[str, _MCPServerRuntime]:
    return self._mcp_server_runtime_view

@property
def mcp_server_runtime(self) -> Mapping[str, _MCPServerRuntime]:
    # 已废弃别名，仍为只读
    return self._mcp_server_runtime_view
```

然后你就可以从以下位置移除所有对 `_mcp_client_dict_cache` 的修改：

- `_start_mcp_server`
- `_terminate_mcp_client`
- `_shutdown_runtimes` / `disable_mcp_server`（它们的 `finally` 块）

所有与运行时相关的代码只需要操作 `_mcp_server_runtime`，而调用方仍然通过同一个 `mcp_client_dict` 接口访问。

---

### 2. 让 `_terminate_mcp_client` 完全负责 `func_list` 清理

当前 `func_list` 的清理发生在多个地方：

- `_terminate_mcp_client`
- `disable_mcp_server`（带 name 和为 `None` 的两种分支）
- 生命周期任务的 `finally`，通过 `_terminate_mcp_client` 加上额外的字典 `pop`

这会让幂等性更难推理。你可以把工具清理集中到 `_terminate_mcp_client` 中，让 `disable_mcp_server` 只专注于协调流程。

**之前（简化版）：**

```python
async def _terminate_mcp_client(self, name: str) -> None:
    runtime = self._mcp_server_runtime.get(name)
    if runtime:
        ...
        finally:
            self.func_list = [
                f for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
            async with self._runtime_lock:
                self._mcp_client_dict_cache.pop(name, None)
            logger.info(...)

async def disable_mcp_server(self, name: str | None = None, timeout: float = 10) -> None:
    if name:
        ...
        finally:
            async with self._runtime_lock:
                self._mcp_server_runtime.pop(name, None)
                self._mcp_client_dict_cache.pop(name, None)
            self.func_list = [
                f for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
```

**之后（集中清理，`disable_mcp_server` 不再操作工具）：**

```python
async def _terminate_mcp_client(self, name: str) -> None:
    runtime = self._mcp_server_runtime.get(name)
    if not runtime:
        # 仍然要清理可能残留的工具
        self.func_list = [
            f for f in self.func_list
            if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
        ]
        return

    try:
        await runtime.client.cleanup()
    except Exception as e:
        logger.error(f"清空 MCP 客户端资源 {name}: {e}。")
    finally:
        self.func_list = [
            f for f in self.func_list
            if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
        ]
        async with self._runtime_lock:
            self._mcp_server_runtime.pop(name, None)
        logger.info(f"已关闭 MCP 服务 {name}")
```

```python
async def disable_mcp_server(self, name: str | None = None, timeout: float = 10) -> None:
    if name:
        async with self._runtime_lock:
            runtime = self._mcp_server_runtime.get(name)
        if runtime is None:
            return

        try:
            await self._shutdown_runtimes([runtime], timeout, strict=True)
        finally:
            # 这里不直接操作 func_list 或 runtime 字典；
            # 由生命周期任务 / _terminate_mcp_client 负责。
            pass
    else:
        async with self._runtime_lock:
            runtimes = list(self._mcp_server_runtime.values())
        try:
            await self._shutdown_runtimes(runtimes, timeout, strict=False)
        finally:
            # 同样：由生命周期任务 / _terminate_mcp_client 清理状态。
            pass
```

这可以在保持现有行为（生命周期、超时、按服务关闭）的同时，避免工具和运行时状态被重复清理。

---

### 3. 集中超时解析逻辑，避免到处重复和交叉校验

当前 `init_mcp_clients` 和 `enable_mcp_server` 都会：

- 通过 `_resolve_timeout(..., env_name=MCP_INIT_TIMEOUT_ENV, ...)` 解析初始化超时
- 通过 `_resolve_timeout(..., env_name=ENABLE_MCP_TIMEOUT_ENV, ...)` 解析启用超时
- 使用这两个值调用 `_warn_on_timeout_mismatch`

你可以在 `__init__` 中只解析一次，然后复用计算出的值，同时保留环境变量语义和不匹配警告的行为完全不变。

**之前（在多个地方）：**

```python
init_timeout = _resolve_timeout(
    timeout=None,
    env_name=MCP_INIT_TIMEOUT_ENV,
    default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
)
enable_timeout = _resolve_timeout(
    timeout=None,
    env_name=ENABLE_MCP_TIMEOUT_ENV,
    default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
)
self._warn_on_timeout_mismatch(init_timeout, enable_timeout)
```

**之后（在 `__init__` 中预计算并复用）：**

```python
class FunctionToolManager:
    def __init__(self) -> None:
        ...
        self._init_timeout_default = _resolve_timeout(
            timeout=None,
            env_name=MCP_INIT_TIMEOUT_ENV,
            default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
        )
        self._enable_timeout_default = _resolve_timeout(
            timeout=None,
            env_name=ENABLE_MCP_TIMEOUT_ENV,
            default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
        )
        self._warn_on_timeout_mismatch(
            self._init_timeout_default,
            self._enable_timeout_default,
        )
```

```python
async def init_mcp_clients(self, raise_on_all_failed: bool = False) -> MCPInitSummary:
    ...
    init_timeout = self._init_timeout_default
    timeout_display = f"{init_timeout:g}"
    ...
```

```python
async def enable_mcp_server(
    self,
    name: str,
    config: dict,
    shutdown_event: asyncio.Event | None = None,
    timeout: float | int | str | None = None,
) -> None:
    timeout_value = _resolve_timeout(
        timeout=timeout,
        env_name=ENABLE_MCP_TIMEOUT_ENV,
        default=self._enable_timeout_default,
    )
    await self._start_mcp_server(
        name=name,
        cfg=config,
        shutdown_event=shutdown_event,
        timeout=timeout_value,
    )
```

这样可以去掉重复的超时解析逻辑以及多次调用 `_warn_on_timeout_mismatch`，同时保持所有配置行为和告警不变。

---

这些改动在保留你新增特性和错误处理逻辑的同时，减少了：

- MCP 可变状态结构的数量
- 启动/关闭时需要保持同步的代码位置
- 重复的超时解析和不匹配检查
</issue_to_address>

Sourcery 对开源项目免费——如果你觉得这些审查有帮助，欢迎分享 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据反馈改进审查质量。}

Original comment in English

Hey - I've found 1 issue

Prompt for AI Agents

Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="astrbot/core/provider/func_tool_manager.py" line_range="194" />
<code_context>
-        self.mcp_client_dict: dict[str, MCPClient] = {}
-        """MCP 服务列表"""
-        self.mcp_client_event: dict[str, asyncio.Event] = {}
+        self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
+        """MCP 服务运行时状态（唯一事实来源）"""
+        self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying MCP client/state management by using a single runtime source of truth, centralizing cleanup in `_terminate_mcp_client`, and precomputing timeouts to avoid duplicated logic and state.

You can keep all the new behavior (timeouts, lifecycle, summaries) while cutting a good chunk of complexity by:

---

### 1. Drop `_mcp_client_dict_cache` and derive the client mapping from a single source of truth

Right now you maintain:

- `_mcp_server_runtime`
- `_mcp_client_dict_cache`
- two mapping proxies
- and you update both structs in multiple places (`_start_mcp_server`, `_terminate_mcp_client`, `disable_mcp_server`, `_shutdown_runtimes` lifecycle).

You can keep `_mcp_server_runtime` as the sole authoritative structure and expose `mcp_client_dict` as a read-only view derived from it. This removes duplicated state and cleanup paths without changing the public API.

**Before (simplified):**

```python
self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)
self._mcp_client_dict_cache: dict[str, MCPClient] = {}
self._mcp_client_dict_view = MappingProxyType(self._mcp_client_dict_cache)

@property
def mcp_client_dict(self) -> Mapping[str, MCPClient]:
    return self._mcp_client_dict_view
```

**After (single source of truth):**

```python
self._mcp_server_runtime: dict[str, _MCPServerRuntime] = {}
self._mcp_server_runtime_view = MappingProxyType(self._mcp_server_runtime)

@property
def mcp_client_dict(self) -> Mapping[str, MCPClient]:
    # Read-only compatibility view, derived from runtime state
    return MappingProxyType({
        name: runtime.client
        for name, runtime in self._mcp_server_runtime_view.items()
    })

@property
def mcp_server_runtime_view(self) -> Mapping[str, _MCPServerRuntime]:
    return self._mcp_server_runtime_view

@property
def mcp_server_runtime(self) -> Mapping[str, _MCPServerRuntime]:
    # deprecated alias, still read-only
    return self._mcp_server_runtime_view
```

Then you can remove all `_mcp_client_dict_cache` mutations from:

- `_start_mcp_server`
- `_terminate_mcp_client`
- `_shutdown_runtimes` / `disable_mcp_server` (their `finally` blocks)

All runtime-related code only touches `_mcp_server_runtime`, and callers still see the same `mcp_client_dict` interface.

---

### 2. Let `_terminate_mcp_client` own func_list cleanup

Currently, `func_list` cleanup happens in several places:

- `_terminate_mcp_client`
- `disable_mcp_server` (both name and `None` branch)
- lifecycle task’s `finally` via `_terminate_mcp_client` + extra dict pops

This makes it harder to reason about idempotency. You can centralize tool cleanup in `_terminate_mcp_client` and keep `disable_mcp_server` focused on orchestration only.

**Before (simplified):**

```python
async def _terminate_mcp_client(self, name: str) -> None:
    runtime = self._mcp_server_runtime.get(name)
    if runtime:
        ...
        finally:
            self.func_list = [
                f for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
            async with self._runtime_lock:
                self._mcp_client_dict_cache.pop(name, None)
            logger.info(...)

async def disable_mcp_server(self, name: str | None = None, timeout: float = 10) -> None:
    if name:
        ...
        finally:
            async with self._runtime_lock:
                self._mcp_server_runtime.pop(name, None)
                self._mcp_client_dict_cache.pop(name, None)
            self.func_list = [
                f for f in self.func_list
                if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
            ]
```

**After (centralized cleanup, `disable_mcp_server` doesn’t touch tools):**

```python
async def _terminate_mcp_client(self, name: str) -> None:
    runtime = self._mcp_server_runtime.get(name)
    if not runtime:
        # still clean stale tools if any
        self.func_list = [
            f for f in self.func_list
            if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
        ]
        return

    try:
        await runtime.client.cleanup()
    except Exception as e:
        logger.error(f"清空 MCP 客户端资源 {name}: {e}。")
    finally:
        self.func_list = [
            f for f in self.func_list
            if not (isinstance(f, MCPTool) and f.mcp_server_name == name)
        ]
        async with self._runtime_lock:
            self._mcp_server_runtime.pop(name, None)
        logger.info(f"已关闭 MCP 服务 {name}")
```

```python
async def disable_mcp_server(self, name: str | None = None, timeout: float = 10) -> None:
    if name:
        async with self._runtime_lock:
            runtime = self._mcp_server_runtime.get(name)
        if runtime is None:
            return

        try:
            await self._shutdown_runtimes([runtime], timeout, strict=True)
        finally:
            # No direct func_list or runtime dict manipulation here;
            # lifecycle/_terminate_mcp_client are responsible.
            pass
    else:
        async with self._runtime_lock:
            runtimes = list(self._mcp_server_runtime.values())
        try:
            await self._shutdown_runtimes(runtimes, timeout, strict=False)
        finally:
            # Same: lifecycle/_terminate_mcp_client clear state.
            pass
```

This keeps all existing behavior (lifecycle, timeouts, per-server shutdown) but avoids double-clearing tools and runtime state.

---

### 3. Centralize timeout resolution to avoid repetition and cross-checking everywhere

Both `init_mcp_clients` and `enable_mcp_server` currently:

- Resolve init timeout via `_resolve_timeout(..., env_name=MCP_INIT_TIMEOUT_ENV, ...)`
- Resolve enable timeout via `_resolve_timeout(..., env_name=ENABLE_MCP_TIMEOUT_ENV, ...)`
- Call `_warn_on_timeout_mismatch` with the pair

You can resolve these once in `__init__` and reuse the computed values, keeping the environment-variable semantics and mismatch warning exactly the same.

**Before (in multiple places):**

```python
init_timeout = _resolve_timeout(
    timeout=None,
    env_name=MCP_INIT_TIMEOUT_ENV,
    default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
)
enable_timeout = _resolve_timeout(
    timeout=None,
    env_name=ENABLE_MCP_TIMEOUT_ENV,
    default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
)
self._warn_on_timeout_mismatch(init_timeout, enable_timeout)
```

**After (precompute in `__init__` and reuse):**

```python
class FunctionToolManager:
    def __init__(self) -> None:
        ...
        self._init_timeout_default = _resolve_timeout(
            timeout=None,
            env_name=MCP_INIT_TIMEOUT_ENV,
            default=DEFAULT_MCP_INIT_TIMEOUT_SECONDS,
        )
        self._enable_timeout_default = _resolve_timeout(
            timeout=None,
            env_name=ENABLE_MCP_TIMEOUT_ENV,
            default=DEFAULT_ENABLE_MCP_TIMEOUT_SECONDS,
        )
        self._warn_on_timeout_mismatch(
            self._init_timeout_default,
            self._enable_timeout_default,
        )
```

```python
async def init_mcp_clients(self, raise_on_all_failed: bool = False) -> MCPInitSummary:
    ...
    init_timeout = self._init_timeout_default
    timeout_display = f"{init_timeout:g}"
    ...
```

```python
async def enable_mcp_server(
    self,
    name: str,
    config: dict,
    shutdown_event: asyncio.Event | None = None,
    timeout: float | int | str | None = None,
) -> None:
    timeout_value = _resolve_timeout(
        timeout=timeout,
        env_name=ENABLE_MCP_TIMEOUT_ENV,
        default=self._enable_timeout_default,
    )
    await self._start_mcp_server(
        name=name,
        cfg=config,
        shutdown_event=shutdown_event,
        timeout=timeout_value,
    )
```

This removes repeated timeout logic and the repeated `_warn_on_timeout_mismatch` calls while preserving all configuration behavior and warnings.

---

These changes keep your new features and error handling intact, but reduce:

- number of mutable MCP state structures
- number of locations that must be kept in sync on startup/shutdown
- repeated timeout resolution and mismatch checks
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

astrbot/core/provider/func_tool_manager.py

idiotsj · 2026-03-02T15:32:42Z

@sourcery-ai review

sourcery-ai

Hey - 我在这里留下了一些整体性的反馈：

在 _terminate_mcp_client 中，清理逻辑与 _cleanup_mcp_client_safely 有重复；可以考虑复用这个辅助函数，以便在所有路径中对客户端清理的错误处理保持一致。
mcp_client_dict 属性当前在每次访问时都会基于一个新的字典推导式构造一个新的 MappingProxyType；如果这个属性被频繁访问，你可能希望暴露一个缓存视图，或者复用 _mcp_server_runtime_view，以避免不必要的内存分配。

面向 AI 代理的提示

Please address the comments from this code review:

## Overall Comments
- In `_terminate_mcp_client`, the cleanup logic duplicates `_cleanup_mcp_client_safely`; consider reusing that helper to keep the error handling for client cleanup consistent in all paths.
- The `mcp_client_dict` property currently constructs a new `MappingProxyType` over a new dict comprehension on every access; if this is accessed frequently, you might want to expose a cached view or reuse `_mcp_server_runtime_view` to avoid unnecessary allocations.

Sourcery 对开源项目是免费的——如果你觉得我们的评审有帮助，欢迎分享 ✨

_{帮我变得更有用！请对每条评论点 👍 或 👎，我会根据你的反馈改进后续的评审。}

Original comment in English

Hey - I've left some high level feedback:

In _terminate_mcp_client, the cleanup logic duplicates _cleanup_mcp_client_safely; consider reusing that helper to keep the error handling for client cleanup consistent in all paths.
The mcp_client_dict property currently constructs a new MappingProxyType over a new dict comprehension on every access; if this is accessed frequently, you might want to expose a cached view or reuse _mcp_server_runtime_view to avoid unnecessary allocations.

Prompt for AI Agents

Please address the comments from this code review:

## Overall Comments
- In `_terminate_mcp_client`, the cleanup logic duplicates `_cleanup_mcp_client_safely`; consider reusing that helper to keep the error handling for client cleanup consistent in all paths.
- The `mcp_client_dict` property currently constructs a new `MappingProxyType` over a new dict comprehension on every access; if this is accessed frequently, you might want to expose a cached view or reuse `_mcp_server_runtime_view` to avoid unnecessary allocations.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

idiotsj · 2026-03-02T15:40:59Z

@sourcery-ai review

sourcery-ai

嗨，我已经审查了你的更改，看起来非常棒！

Sourcery 对开源项目是完全免费的——如果你觉得这些评审对你有帮助，欢迎分享给更多人 ✨

_{帮我变得更有用！请在每条评论上点 👍 或 👎，我会根据你的反馈来改进今后的评审。}

Original comment in English

Hey - I've reviewed your changes and they look great!

Sourcery is free for open source - if you like our reviews please consider sharing them ✨

_{Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.}

idiotsj added 2 commits February 27, 2026 20:05

dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Feb 27, 2026

dosubot bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Feb 27, 2026

sourcery-ai bot reviewed Feb 27, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

idiotsj added 2 commits February 27, 2026 22:18

fix: restore task cancellation on timeout per review feedback

38e99cf

Pending tasks in asyncio.wait are tasks that have NOT completed initialization within 20s, so cancelling them is safe and correct.

sourcery-ai bot reviewed Mar 1, 2026

View reviewed changes

dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Mar 1, 2026

sourcery-ai bot reviewed Mar 1, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

idiotsj force-pushed the fix/mcp-race-condition branch from 8132ce2 to 5fab8aa Compare March 1, 2026 12:29

security: redact sensitive MCP config from debug logs

b810dfb

Only log executable name and argument count instead of full command/args to avoid leaking tokens or credentials even at debug level.

sourcery-ai bot reviewed Mar 1, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

idiotsj added 2 commits March 1, 2026 20:35

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

refactor: simplify mcp runtime ownership and timeout resolution

bcab236

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

fix: harden mcp shutdown waiting and startup signaling

9524878

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

refactor: streamline mcp lifecycle and shutdown errors

9221f3b

dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:L This PR changes 100-499 lines, ignoring generated files. labels Mar 2, 2026

refactor: harden mcp runtime access and shutdown

24b8379

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

fix: ensure mcp client cleanup and clarify views

e69ff0c

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Show resolved Hide resolved

refactor: cache mcp client view and guard startup

146c743

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Outdated Show resolved Hide resolved

astrbot/core/provider/func_tool_manager.py Show resolved Hide resolved

refactor: simplify mcp init cleanup and runtime lock

424aa5b

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

astrbot/core/provider/func_tool_manager.py Show resolved Hide resolved

refactor: reduce mcp runtime duplication

ef8e254

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

refactor: reuse mcp cleanup and client view

1e93ce4

sourcery-ai bot reviewed Mar 2, 2026

View reviewed changes

zouyonghe merged commit 0711ec3 into AstrBotDevs:master Mar 2, 2026
6 checks passed

Uh oh!

Conversation

idiotsj commented Feb 27, 2026 • edited by sourcery-ai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Fix: Resolve MCP tools race condition causing 'completion 无法解析' error

问题描述

根本原因

变更内容

1. astrbot/core/provider/manager.py

2. astrbot/core/provider/func_tool_manager.py

测试场景

检查清单

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

Summary by Sourcery

idiotsj commented Feb 27, 2026 •

edited by sourcery-ai bot

Loading

1. `astrbot/core/provider/manager.py`

2. `astrbot/core/provider/func_tool_manager.py`