Large Language Models: Calling LLMs with the LangChain Library (Part 2)

2201_75573294 · Published 2026-03-17 09:18:41

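When building applications on large language models (LLMs), the natural-language text a model returns is usually hard to use directly in program logic. LangChain provides output parsers and chained calls through the LangChain Expression Language (LCEL), which help format model output, extract structured data, and organize the workflow concisely.

This article walks through two complete examples that show how to parse lists and JSON data with LangChain, and then shows how the pipe operator simplifies the code structure.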

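1. Output Parsers: Lists

An output parser has two jobs: it supplies format instructions that ask the model to answer in a specified format, and it automatically converts the reply into a Python object such as a list or a dictionary.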

from langchain_openai import ChatOpenAI
from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "{parser_instructions}"),
    ("human", "列出10首{subject}最受欢迎的流行音乐。")
])

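The prompt is defined as a list of (role, template) messages: the system message carries the system instructions, and the human message carries the user's question. The placeholders in curly braces are filled in later.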
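As a rough analogy for how the placeholders get filled, the sketch below uses plain str.format instead of ChatPromptTemplate. The helper name fill_messages and the shortened instruction string are assumptions made for illustration; this is not LangChain's implementation.

```python
# Rough analogy only: plain str.format stands in for ChatPromptTemplate here.
messages = [
    ("system", "{parser_instructions}"),
    ("human", "列出10首{subject}最受欢迎的流行音乐。"),
]


def fill_messages(templates, variables):
    """Fill every (role, template) pair with the given variables."""
    return [(role, template.format(**variables)) for role, template in templates]


filled = fill_messages(messages, {
    "subject": "中国",
    # Shortened, hypothetical instruction text used only for this sketch.
    "parser_instructions": "Your response should be a list of comma separated values",
})

for role, text in filled:
    print(f"{role}: {text}")
```

Each message keeps its role, and only the template text changes, which is why the same prompt can be reused with different subjects.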

from langchain.output_parsers import CommaSeparatedListOutputParser
output_parser = CommaSeparatedListOutputParser()
parser_instructions = output_parser.get_format_instructions()
print(parser_instructions)

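Initialize the parser and fetch its format instructions. The instructions ask the model to produce output in a specific format, which the parser can then convert into a Python object.

The print call shows the instruction text that tells the model which format to use, for example: Your response should be a list of comma separated values, eg: `foo, bar, baz`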

final_prompt = prompt.invoke({"subject": "中国", "parser_instructions": parser_instructions})

model = ChatOpenAI(model="qwen3.5-plus",
                 openai_api_key="sk-45*******************9",
                 openai_api_base="https://dashscope.aliyuncs.com/compatible-mode/v1")
response = model.invoke(final_prompt)
print(response.content)

output_parser.invoke(response)

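The code above builds the final prompt, calls the model, and then parses the reply into a list.

There are two print calls so far: the first prints the format instructions, and the second prints the model's raw output.

The core value of the parser is that it converts the model's free-form text into structured data the program can use directly.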
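The conversion step can be pictured with a small stand-in written in plain Python. This is a simplified sketch, not the code inside CommaSeparatedListOutputParser; the function name parse_comma_separated, the sample reply, and the handling of the full-width comma are all assumptions added for illustration.

```python
def parse_comma_separated(text: str) -> list[str]:
    """Split a comma-separated reply into a list of stripped items."""
    # Assumption: also treat the full-width comma as a separator,
    # since a reply written in Chinese might use it.
    normalized = text.replace("，", ",")
    return [item.strip() for item in normalized.split(",") if item.strip()]


raw_reply = "foo, bar, baz"  # hypothetical model reply
items = parse_comma_separated(raw_reply)
print(items)
```

After this step the program works with an ordinary Python list, so it can loop over the items or index into them without any further text handling.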

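2. Output Parsers: JSON

Real applications often need richer data structures such as JSON. LangChain provides PydanticOutputParser, which is combined with a Pydantic model to define the expected output structure precisely.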

from typing import List
from langchain.output_parsers import PydanticOutputParser
from langchain.prompts import ChatPromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_openai import ChatOpenAI

class FilmInfo(BaseModel):
    film_name: str = Field(description="电影的名字", example="拯救大兵瑞恩")
    author_name: str = Field(description="电影的导演", example="斯皮尔伯格")
    genres: List[str] = Field(description="电影的题材", example=["历史", "战争"])

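This defines the expected output format, which amounts to giving the model a JSON template.

BaseModel: defines the data schema, comparable to the header row of a table.

Field(): annotates an individual field, comparable to describing a column.

description: tells the model what the field means.

example: gives the model a sample value.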

output_parser = PydanticOutputParser(pydantic_object=FilmInfo)
print(output_parser.get_format_instructions())

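Initialize the parser. The previous example used CommaSeparatedListOutputParser, which returns a list. Here PydanticOutputParser is used instead: it asks the model for JSON and returns a structured object.

The print call shows the generated format instructions.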

prompt = ChatPromptTemplate.from_messages([
    ("system", "{parser_instructions} 你输出的结果请使用中文。"),
    ("human", "请你帮我从电影概述中，提取电影名、导演，以及电影的体裁。电影概述会被三个#符号包围。\n###{film_introduction}###")
])

Build the prompt; the system message also asks the model to answer in Chinese.

film_introduction = """《唐人街探案》是由万达影视传媒有限公司、上海骋亚影视文化传媒有限公司出品，湖南芒果娱乐有限公司、合一影业有限公司等联合出品，陈思诚执导，陈思诚、程佳客、刘凯、白鹤编剧，王宝强、刘昊然领衔主演，陈赫、佟丽娅、肖央、小沈阳特别出演，金士杰、董成鹏、张国强友情出演的喜剧电影 [67]。该片于2015年12月31日在中国上映 [69-70]。2025年4月30日19:30在江苏卫视播出。 [80]
该片讲述了唐仁、秦风必须在躲避警察追捕、匪帮追杀、黑帮围剿的同时，在短短七天内，完成找到“失落的黄金”、查明“真凶”、为自己“洗清罪名”这些“逆天”任务的故事 [69]。
该片获得第73届威尼斯国际电影节“威尼斯日”特别推荐、“观众票选最受瞩目电影奖”、“东京电影节展映” [1] [72]，截至2024年5月8日，该片累计票房82333.7万元 [71]。
"""

final_prompt = prompt.invoke({"film_introduction": film_introduction,
                              "parser_instructions": output_parser.get_format_instructions()})
print(final_prompt)

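The film introduction text and the format instructions are filled into the template to produce the final prompt.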

model = ChatOpenAI(model="qwen-plus",
                 openai_api_key="sk-********************9",
                 openai_api_base="https://dashscope.aliyuncs.com/compatible-mode/v1")
response = model.invoke(final_prompt)
print(response.content)

Call the model and print its raw reply.

result = output_parser.invoke(response)

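Parse the reply into a FilmInfo object, which can be used directly; its fields are conveniently accessed as attributes, for example result.film_name.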
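To make the idea of "JSON text in, typed object out" concrete, here is a minimal sketch that uses only the standard library. The dataclass FilmInfoSketch, the function parse_film_info, and the sample reply are hypothetical stand-ins; PydanticOutputParser does more than this (for example, it validates field types).

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class FilmInfoSketch:
    """Hypothetical stand-in mirroring the fields of FilmInfo."""
    film_name: str
    author_name: str
    genres: List[str]


def parse_film_info(text: str) -> FilmInfoSketch:
    """Decode a JSON reply and build a typed object from it."""
    data = json.loads(text)
    # A missing or unexpected key raises TypeError here.
    return FilmInfoSketch(**data)


# Hypothetical model reply, written by hand for this sketch.
raw_reply = '{"film_name": "唐人街探案", "author_name": "陈思诚", "genres": ["喜剧"]}'
film = parse_film_info(raw_reply)
print(film.film_name, film.author_name, film.genres)
```

Attribute access such as film.genres replaces string searching in the raw reply, which is the practical benefit the section describes.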

3. Chained Calls (LCEL)

The previous examples called each component in turn, which is equivalent to a single nested call:

result = output_parser.invoke(model.invoke(prompt.invoke({...})))

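This works, but readability drops sharply as the flow grows more complex. LangChain provides the pipe operator (|), which lets us connect components declaratively.

Example: listing Chinese car brands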

from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# Define the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "{parser_instructions}"),
    ("human", "列出5个{subject}生产的汽车的品牌。")
])

# Initialize the parser and the model
output_parser = CommaSeparatedListOutputParser()
parser_instructions = output_parser.get_format_instructions()

model = ChatOpenAI(model="qwen-plus",
                 openai_api_key="sk-45********************************",
                 openai_api_base="https://dashscope.aliyuncs.com/compatible-mode/v1")

# Nested call, shown for comparison
result = output_parser.invoke(model.invoke(prompt.invoke({"subject": "中国", "parser_instructions": parser_instructions})))

# Build the chain with the pipe operator
chat_model_chain = prompt | model | output_parser

# Run the chain
result = chat_model_chain.invoke({"subject": "中国", "parser_instructions": parser_instructions})

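The pipe operator:

| connects multiple components into a processing pipeline.

The nested call output_parser.invoke(model.invoke(prompt.invoke(...))) is hard to read, while the | form is clear. When result = chat_model_chain.invoke({"subject": "中国", "parser_instructions": parser_instructions}) runs, the actual flow is: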

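Step 1: prompt.invoke({...}) → produces the formatted prompt
Step 2: model.invoke(result of step 1) → the model generates text
Step 3: output_parser.invoke(result of step 2) → parses the text into a list
Step 4: the final result is returned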
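The steps above can be imitated with a toy class that overloads Python's | operator. This is only a sketch of the composition idea: the Step class and the three toy components are hypothetical, no real model is called, and LangChain's own components are considerably richer.

```python
class Step:
    """Hypothetical stand-in for a chain component with an invoke() method."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


# Toy components: the "model" returns a fixed, made-up reply.
toy_prompt = Step(lambda variables: "列出5个{subject}生产的汽车的品牌。".format(**variables))
toy_model = Step(lambda text: "A, B, C")
toy_parser = Step(lambda reply: [item.strip() for item in reply.split(",")])

chain = toy_prompt | toy_model | toy_parser

nested = toy_parser.invoke(toy_model.invoke(toy_prompt.invoke({"subject": "中国"})))
piped = chain.invoke({"subject": "中国"})
print(piped == nested)
```

Both forms run the same three steps in the same order; the piped form simply reads left to right instead of inside out.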

This chained style is LangChain's core programming paradigm; it keeps complex flows clear and concise.