Testing Notes and Issues: LangChain, MCP Server, Qwen-Agent
LangChain LangGraph
Reference: the official documentation at https://langchain-ai.github.io/langgraph/tutorials/introduction/
1. Qwen-series models are used for testing
Since the goal is to have an LLM call tools under LangGraph orchestration, the first step is to check which models support function calling:
https://help.aliyun.com/zh/model-studio/qwen-function-calling
1.1
Using a model hosted by a cloud provider requires applying for an API key:
https://bailian.console.aliyun.com/?apiKey=1#/api-key
可用的模型名称 https://help.aliyun.com/zh/model-studio/models#ced16cb6cdfsy
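The key obtained above can be exported before constructing the model. A minimal sketch (assumption: ChatTongyi reads the DASHSCOPE_API_KEY environment variable used by the DashScope SDK; it can also be passed explicitly as dashscope_api_key=...):
import os

# Assumption: the key is picked up from the DASHSCOPE_API_KEY environment variable
os.environ["DASHSCOPE_API_KEY"] = "sk-..."  # replace with your own key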
from langchain_community.chat_models.tongyi import ChatTongyi
llm = ChatTongyi(model="qwen-max")
llm_with_tools = llm.bind_tools(tools)
1.2
Tool calling fails when serving the model locally with vLLM:
BadRequestError: Error code: 400 - {'object': 'error', 'message': '"auto" tool choice requires --enable-auto-tool-choice and --tool-call-parser to be set', 'type': 'BadRequestError', 'param': None, 'code': 400}
Adjusting the launch arguments as the error suggests did not resolve the problem:
(LLM) root@42c2e682b768:/workspace# vllm serve /workspace/Qwen2.5/Qwen/Qwen2.5-7B-Instruct --tensor-parallel-size 4 --enable-auto-tool-choice --tool-call-parser
INFO 04-17 08:40:29 [__init__.py:239] Automatically detected platform cuda.
usage: vllm serve [model_tag] [options]
vllm serve: error: argument --tool-call-parser: expected one argument
(LLM) root@42c2e682b768:/workspace# vllm serve /workspace/Qwen2.5/Qwen/Qwen2.5-7B-Instruct --tensor-parallel-size 4 --enable-auto-tool-choice
INFO 04-17 08:42:08 [__init__.py:239] Automatically detected platform cuda.
Traceback (most recent call last):
File "/opt/conda/envs/LLM/bin/vllm", line 8, in <module>
sys.exit(main())
File "/opt/conda/envs/LLM/lib/python3.10/site-packages/vllm/entrypoints/cli/main.py", line 48, in main
cmds[args.subparser].validate(args)
File "/opt/conda/envs/LLM/lib/python3.10/site-packages/vllm/entrypoints/cli/serve.py", line 30, in validate
validate_parsed_serve_args(args)
File "/opt/conda/envs/LLM/lib/python3.10/site-packages/vllm/entrypoints/openai/cli_args.py", line 284, in validate_parsed_serve_args
raise TypeError("Error: --enable-auto-tool-choice requires "
TypeError: Error: --enable-auto-tool-choice requires --tool-call-parser
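The second error indicates that --tool-call-parser expects a parser name rather than being a bare flag. According to vLLM's tool-calling documentation, Qwen2.5 models typically use the hermes parser, so an invocation along the following lines should be closer to correct (not re-verified in this test):
vllm serve /workspace/Qwen2.5/Qwen/Qwen2.5-7B-Instruct --tensor-parallel-size 4 --enable-auto-tool-choice --tool-call-parser hermes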
Serving the model locally with Ollama
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    openai_api_base="http://127.0.0.1:11434/v1",  # Ollama's default listening port
    openai_api_key="ollama",                      # any non-empty value works; Ollama does not check it
    model="llama3.3:latest",                      # replace with the model pulled in Ollama, e.g. qwen:7b, mistral, llama3
    temperature=0.7,
    verbose=True                                  # print the model's responses
)
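A quick sanity check before wiring up any tools (assumes Ollama is running locally and the model has already been pulled):
# Simple round-trip to confirm the Ollama endpoint responds
print(llm.invoke("Say hello in one sentence.").content)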
With qwen2.5 served via Ollama, tool calls do work, but the model must be explicitly restricted to a single tool; when multiple tools are bound, it emits parallel tool calls and the assertion in chatbot() fails:
Cell In[25], line 110, in chatbot(state)
106 message = llm_with_tools.invoke(state["messages"])
107 # Because we will be interrupting during tool execution,
108 # we disable parallel tool calling to avoid repeating any
109 # tool invocations when we resume.
--> 110 assert len(message.tool_calls) <= 1
111 return {"messages": [message]}
AssertionError:
ChatTongyi, by contrast, can bind multiple tools and use them together.
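One workaround consistent with the note above (a sketch, not the author's exact code): bind only a single tool to the Ollama-backed llm defined earlier, so the model cannot return parallel tool calls.
from langchain_community.tools.tavily_search import TavilySearchResults

# Hypothetical single-tool binding for the Ollama-served qwen2.5 model
tavily_search = TavilySearchResults(max_results=2)
llm_with_tools = llm.bind_tools([tavily_search])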
1.3
The full test code follows. With tools bound, each response takes noticeably longer. The local document-retrieval tool is limited by the quality of the RAG pipeline: matching is poor without dedicated tuning. The free tavily_search tool gives mostly usable results; other (partly paid) search tools are listed at:
https://python.langchain.com/docs/integrations/tools/
from typing import Annotated
from typing import Literal
from typing_extensions import TypedDict
from langchain_community.chat_models.tongyi import ChatTongyi
from langchain_community.document_loaders import UnstructuredURLLoader
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.messages import ToolMessage, HumanMessage
from langchain_core.tools import InjectedToolCallId, tool
from langchain_openai import ChatOpenAI
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import StateGraph, START, END, MessagesState
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode, tools_condition
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langgraph.types import Command, interrupt
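# NOTE: retrieve_context and human_assistance are referenced below, but their
# definitions were not included in the original post. The following are minimal
# sketches (assumptions), not the original implementations.

@tool
def retrieve_context(query: str) -> str:
    """Retrieve relevant passages from locally loaded documents (simple RAG)."""
    # Placeholder: the original version presumably loaded pages with
    # UnstructuredURLLoader, split them with RecursiveCharacterTextSplitter,
    # and searched the chunks; that retrieval logic is omitted here.
    return "No matching context found."

@tool
def human_assistance(query: str) -> str:
    """Request assistance from a human (LangGraph human-in-the-loop pattern)."""
    human_response = interrupt({"query": query})
    return human_response["data"]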
tavily_search = TavilySearchResults(max_results=2)
tools = [tavily_search, retrieve_context, human_assistance]
tool_node = ToolNode(tools=tools)

llm = ChatTongyi(model="qwen-max")
# Alternative: a local vLLM + Qwen model
# llm = ChatOpenAI(
#     openai_api_base="http://localhost:8000/v1",
#     openai_api_key="EMPTY",   # vLLM does not check the key by default
#     model_name="Qwen2.5-7B-Instruct",
#     temperature=0.7,
# )
llm_with_tools = llm.bind_tools(tools)
class State(TypedDict):
    messages: Annotated[list, add_messages]
    name: str
    birthday: str

def chatbot(state: State):
    message = llm_with_tools.invoke(state["messages"])
    # Because we will be interrupting during tool execution,
    # we disable parallel tool calling to avoid repeating any
    # tool invocations when we resume.
    assert len(message.tool_calls) <= 1
    return {"messages": [message]}

# Define the workflow with LangGraph
graph_builder = StateGraph(State)
graph_builder.add_node("chatbot", chatbot)
graph_builder.add_node("tools", tool_node)
graph_builder.add_conditional_edges(
    "chatbot",
    tools_condition,
)
graph_builder.add_edge("tools", "chatbot")
graph_builder.add_edge(START, "chatbot")

memory = MemorySaver()
graph = graph_builder.compile(checkpointer=memory)
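To exercise the compiled graph, an invocation along the following lines can be used (a sketch added for completeness; the question and thread_id are arbitrary, but some thread_id is required whenever a checkpointer is configured):
config = {"configurable": {"thread_id": "demo-1"}}
for event in graph.stream(
    {"messages": [HumanMessage(content="What's the weather in Hangzhou today?")]},
    config,
    stream_mode="values",
):
    event["messages"][-1].pretty_print()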
MCP Server
Following the official documentation (https://modelcontextprotocol.io/quickstart/server), write a server.py and a client.py.
The official example calls the Claude API, so it is mocked here:
class MockContent:
    def __init__(self, type_, text=None, name=None, input_=None):
        self.type = type_
        self.text = text
        self.name = name
        self.input = input_

class MockResponse:
    def __init__(self):
        self.content = [
            MockContent(type_='text', text="This is the assistant's initial answer."),
            MockContent(type_='tool_use', name='get_forecast', input_={'latitude': 40.7128, 'longitude': -74.0060}),
            MockContent(type_='tool_use', name='get_alerts', input_={'state': 'NY'}),
            MockContent(type_='text', text='Monday:\nTemperature: 25°C\nWind: 10 km/h NE\nForecast: sunny\n---\nTuesday:\nTemperature: 22°C\nWind: 8 km/h SW\nForecast: cloudy'),
        ]

# final_text, assistant_message_content and self.session are defined in the
# surrounding client code (the query-processing loop of the quickstart client).
response = MockResponse()
for content in response.content:
    if content.type == 'text':
        final_text.append(content.text)
        assistant_message_content.append(content)
    elif content.type == 'tool_use':
        tool_name = content.name
        tool_args = content.input
        # Execute tool call
        result = await self.session.call_tool(tool_name, tool_args)
The above only exercises server.py and client.py running and interacting locally. For remote access, note the official changelog:
https://modelcontextprotocol.io/specification/2025-03-26/changelog
Replaced the previous HTTP+SSE transport with a more flexible Streamable HTTP transport (PR #206)
So MCP itself is still maturing and may undergo major changes; for now, wait and see.
Compared with LangChain, MCP does not by itself add much new functionality, and LangChain already provides an MCP adapter, so the two are complementary rather than substitutes: https://github.com/langchain-ai/langchain-mcp-adapters
langchain-mcp-adapters
Implement server.py following the README on GitHub (a rough reproduction is included below), and modify client.py to use ChatTongyi:
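For reference, the server used here follows the math-server example in the langchain-mcp-adapters README, roughly:
# server.py (based on the README's math server example)
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("Math")

@mcp.tool()
def add(a: int, b: int) -> int:
    """Add two numbers"""
    return a + b

@mcp.tool()
def multiply(a: int, b: int) -> int:
    """Multiply two numbers"""
    return a * b

if __name__ == "__main__":
    mcp.run(transport="stdio")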
cat client.py
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
from langchain_mcp_adapters.tools import load_mcp_tools
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_community.chat_models.tongyi import ChatTongyi

async def main():
    # model = ChatOpenAI(model="gpt-4o")
    model = ChatTongyi(model="qwen2.5-72b-instruct")  # alternatives: qwen-max, qwen2.5-72b-instruct

    server_params = StdioServerParameters(
        command="python",
        # replace with the full path to the server script (server.py here)
        args=["server.py"],
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await load_mcp_tools(session)
            agent = create_react_agent(model, tools)
            agent_response = await agent.ainvoke({
                "messages": "what's (3 + 5) x 12?"
            })
            print(agent_response)

if __name__ == "__main__":
    asyncio.run(main())
The model automatically calls the math tools to complete the calculation. This can be verified by changing the messages in client.py: for an ordinary question, the math tools are not called.
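For example, swapping in an ordinary question (hypothetical prompt) should yield a response whose messages contain no tool calls:
agent_response = await agent.ainvoke({
    "messages": "Introduce yourself in one sentence."
})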
Qwen-Agent
Reference: the official documentation:
https://github.com/QwenLM/Qwen-Agent/blob/main/README_CN.md
It states:
Qwen-Agent supports connecting to the Qwen model service provided by Alibaba Cloud DashScope, as well as open-source Qwen models served through an OpenAI-compatible API.
So this framework is only suited to Qwen-series models.
Summary
The current goal is to build an agent that can connect to a range of open-source LLMs and call tools. Since MCP compatibility is also desired, langchain-mcp-adapters is the choice here.