[LangChain-AI] Chat Models: Structured Output

阿_徒

阿_徒 · Published 2026-06-06 08:30:00

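1. Returning a Pydantic object

We can pass a Pydantic class as the output schema of the Runnable, in which case invoking it returns a Pydantic object.
When the model's response arrives, LangChain extracts the JSON object that carries the fields of the Pydantic model, parses and validates it against that model, and returns the validated data as a ready-to-use Pydantic instance.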

from typing import Optional, List
from langchain.chat_models import init_chat_model
from pydantic import BaseModel, Field

# Chat model with structured output
model = init_chat_model(model="deepseek-chat", model_provider="deepseek")

# Return a Pydantic object
class Joke(BaseModel):
    """A joke to tell the user"""
    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10") # Optional marks the rating as an optional field

class Data(BaseModel):
    """A list of jokes"""
    jokes: List[Joke]

model_with_structure = model.with_structured_output(Data)
print(model_with_structure.invoke("Tell a joke about singing and a joke about dancing"))

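Output:

jokes=[Joke(setup='Why are singers always so calm?', punchline='Because they always keep everything in "tune"!', rating=None), Joke(setup='Why are dancers always so popular?', punchline='Because their dancing "moves" people\'s hearts!', rating=None)]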

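Calling with_structured_output does not call the LLM directly. Instead, it builds a Runnable that produces structured output (an object that behaves much like a chat model). It differs from model in what it returns: model returns an AIMessage whose content is a message string, whereas this Runnable returns an object with the given output structure.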
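To make the parse-and-validate step concrete, here is a minimal offline sketch that uses Pydantic alone, with no model call. It assumes Pydantic v2 (`model_validate_json`); `parse_joke` is a hypothetical helper, and the JSON strings are hand-written stand-ins for what a model might return, not real model output.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class Joke(BaseModel):
    """A joke to tell the user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")
    rating: Optional[int] = Field(default=None, description="Rating from 1 to 10")


def parse_joke(raw_json: str) -> Optional[Joke]:
    """Validate a JSON string against Joke; return None if it does not conform."""
    try:
        return Joke.model_validate_json(raw_json)
    except ValidationError:
        return None


# Hand-written stand-ins for model responses (assumptions, not real output).
good = parse_joke('{"setup": "Why did the singer bring a ladder?", "punchline": "To reach the high notes."}')
bad = parse_joke('{"setup": "No punchline here"}')  # required field is missing

print(good)
print(bad)
```

The second call returns `None` because `punchline` is required. This is roughly the kind of check a Pydantic schema adds on top of the model's raw JSON.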

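2. Returning a TypedDict dictionary

TypedDict provides precise, structured type hints for dictionary objects. It lets us specify which keys a dictionary should contain and the type of the value for each key. One of its most useful abilities is letting static type checkers catch misspelled keys and wrong value types.

So we can also pass a TypedDict class as the output schema of the Runnable, in which case invoking it returns a plain dictionary. Unlike the Pydantic case, the returned dictionary is not validated against the schema at runtime; the TypedDict only describes the expected structure to the model and to type checkers.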
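A quick way to see what a TypedDict does and does not do at runtime is the following sketch, which uses only the standard library. The field descriptions are illustrative, and the deliberately wrong `rating` value is an assumption made up for the demonstration.

```python
from typing import Annotated, Optional, TypedDict, get_type_hints


class Joke(TypedDict):
    """A joke to tell the user."""

    setup: Annotated[str, ..., "The setup of the joke"]
    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "Rating from 1 to 10"]


# At runtime a TypedDict is an ordinary dict: the wrong value type is NOT rejected here.
# A static checker such as mypy or pyright would flag this line instead.
joke: Joke = {"setup": "A", "punchline": "B", "rating": "not a number"}  # type: ignore

# The Annotated metadata (default value, description) is still available for inspection.
hints = get_type_hints(Joke, include_extras=True)
rating_meta = hints["rating"].__metadata__

print(type(joke) is dict)  # True
print(rating_meta)         # (None, 'Rating from 1 to 10')
```

The metadata tuple is what allows a library to read a default and a description out of each `Annotated` field.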

from typing import Optional, TypedDict, Annotated
from langchain.chat_models import init_chat_model


# Chat model with structured output
model = init_chat_model(model="deepseek-chat", model_provider="deepseek")

# TypedDict
class Joke(TypedDict):
    """A joke to tell the user"""
    setup: Annotated[str, ..., "The setup of the joke"]
    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "How funny the joke is, from 1 to 10"]

model_with_structure = model.with_structured_output(Joke, include_raw=True) # include_raw=True also returns the raw message from the chat model
print(model_with_structure.invoke("Tell a joke about singing and a joke about dancing"))

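The returned result (shown as a screenshot in the original post) contains three fields: raw (the AIMessage itself), parsed (the AIMessage parsed into structured data), and parsing_error (whether a parsing error occurred).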
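One way to consume that three-field result is sketched below. `unwrap_structured` is a hypothetical helper, not part of LangChain, and the two dictionaries are hand-built stand-ins that only mimic the raw / parsed / parsing_error shape (in a real result, raw would be an AIMessage and parsing_error an exception or None).

```python
from typing import Any, Dict


def unwrap_structured(result: Dict[str, Any]) -> Any:
    """Return result["parsed"], or raise if parsing failed."""
    error = result.get("parsing_error")
    if error is not None:
        raise ValueError(f"could not parse model output: {error}")
    return result["parsed"]


# Hand-built stand-ins that mimic the three-key shape (not real model output).
ok = {
    "raw": "<AIMessage>",
    "parsed": {"setup": "A", "punchline": "B", "rating": None},
    "parsing_error": None,
}
failed = {
    "raw": "<AIMessage>",
    "parsed": None,
    "parsing_error": "missing field 'punchline'",
}

print(unwrap_structured(ok)["setup"])  # A
try:
    unwrap_structured(failed)
except ValueError as exc:
    print(exc)  # could not parse model output: missing field 'punchline'
```

Checking parsing_error first means the raw message stays available for logging or a retry when parsing fails.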

3. Returning JSON

We can also have the chat model return JSON directly. To declare the shape of that JSON, we need to define a JSON Schema.

from langchain.chat_models import init_chat_model

# Chat model with structured output
model = init_chat_model(model="deepseek-chat", model_provider="deepseek")

# JSON Schema
json_schema = {
    "title": "joke",
    "description": "Tell the user a joke.",
    "type": "object",
    "properties": {
        "setup": {
            "type": "string",
            "description": "The setup of the joke",
        },
        "punchline": {
            "type": "string",
            "description": "The punchline of the joke",
        },
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}

model_with_structured = model.with_structured_output(json_schema)
print(model_with_structured.invoke("Tell a joke about dancing"))

Output:

{'setup': 'Why are programmers such good dancers?', 'punchline': 'Because they are best at "jumping" out of bugs!', 'rating': 7}
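Since a JSON Schema result comes back as a plain dict, it can be worth a quick sanity check before using it. The sketch below is a hand-rolled check for the two type names used in this post, not a full JSON Schema validator; `check_against_schema` and `_matches` are hypothetical helper names, and the schema is trimmed to the parts the check reads.

```python
from typing import Any, Dict, List


def _matches(value: Any, type_name: str) -> bool:
    """Check a value against the JSON Schema type names used in this post."""
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    return True  # types not covered by this sketch are not checked


def check_against_schema(data: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the basic checks passed."""
    required = schema.get("required", [])
    problems = [f"missing required key: {key}" for key in required if key not in data]
    for key, value in data.items():
        spec = schema["properties"].get(key)
        if spec is None:
            problems.append(f"unexpected key: {key}")
        elif value is None and key not in required:
            continue  # optional fields may be left empty
        elif not _matches(value, spec["type"]):
            problems.append(f"wrong type for {key}: expected {spec['type']}")
    return problems


json_schema = {
    "title": "joke",
    "type": "object",
    "properties": {
        "setup": {"type": "string"},
        "punchline": {"type": "string"},
        "rating": {"type": "integer", "default": None},
    },
    "required": ["setup", "punchline"],
}

print(check_against_schema({"setup": "A", "punchline": "B", "rating": 7}, json_schema))
# []
print(check_against_schema({"setup": "A", "rating": "7"}, json_schema))
# ['missing required key: punchline', 'wrong type for rating: expected integer']
```

For anything beyond a quick check like this, a dedicated JSON Schema validation library or a Pydantic model is likely the better tool.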

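All of the structured outputs above rely on a capability of the chat model itself (such as tool calling or a JSON mode) rather than on something LangChain implements on its own. LangChain also provides dedicated components, its output parsers, for handling structured output.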

4. Choosing between output formats

All the structured outputs defined above are about jokes. What happens if we ask the model something else?

from langchain.chat_models import init_chat_model


# Chat model with structured output
model = init_chat_model(model="deepseek-chat", model_provider="deepseek")

# JSON Schema
json_schema = {
    "title": "joke",
    "description": "Tell the user a joke.",
    "type": "object",
    "properties": {
        "setup": {
            "type": "string",
            "description": "The setup of the joke",
        },
        "punchline": {
            "type": "string",
            "description": "The punchline of the joke",
        },
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}

model_with_structured = model.with_structured_output(json_schema)
print(model_with_structured.invoke("Who are you?"))

Output:

{'setup': 'Who am I?', 'punchline': 'I am an AI assistant, trying hard to come up with a joke about myself...', 'rating': 6}

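I asked an unrelated question and it still answered with a joke, which makes no sense. To prevent this, we can give the model a choice of output structures when it responds.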

from typing import Optional, Union

from langchain.chat_models import init_chat_model
from pydantic import BaseModel, Field

model = init_chat_model(model="deepseek-chat", model_provider="deepseek")


# Pydantic objects
class Joke(BaseModel):
    """A joke to tell the user"""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline of the joke")
    rating: Optional[int] = Field(default=None, description="How funny the joke is, from 1 to 10")


class Response(BaseModel):
    """Respond in a conversational manner."""

    content: str = Field(description="A conversational response to the user's query")

class FinalResponse(BaseModel):
    """The final reply, using whichever output structure fits"""

    final_output: Union[Joke, Response]

model_with_structured = model.with_structured_output(FinalResponse)
print(model_with_structured.invoke("Tell a joke about dancing"))
print(model_with_structured.invoke("Who are you?"))

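Output:

final_output=Joke(setup='Why are dancers always good at math?', punchline='Because they are always counting the beat: 1, 2, 3, 4, 1, 2, 3, 4...', rating=7)
final_output=Response(content='Hello! My name is Claude, an AI assistant created by Anthropic. I am here to help you answer questions, provide information, do creative work, or just chat with you! Is there anything I can help you with? 😊')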
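Downstream code then has to handle both branches of the union. The sketch below shows one way to do that with an isinstance check; it assumes Pydantic v2, `render` is a hypothetical helper, and the instances are built by hand rather than returned by a model.

```python
from typing import Optional, Union

from pydantic import BaseModel


class Joke(BaseModel):
    setup: str
    punchline: str
    rating: Optional[int] = None


class Response(BaseModel):
    content: str


class FinalResponse(BaseModel):
    final_output: Union[Joke, Response]


def render(result: FinalResponse) -> str:
    """Turn either branch of the union into display text."""
    output = result.final_output
    if isinstance(output, Joke):
        return f"{output.setup} {output.punchline}"
    return output.content


# Hand-built instances standing in for model results (assumptions, not real output).
joke_result = FinalResponse(final_output=Joke(setup="Q?", punchline="A!"))
chat_result = FinalResponse(final_output=Response(content="Hello"))

print(render(joke_result))  # Q? A!
print(render(chat_result))  # Hello
```

Wrapping the union in a single field keeps one top-level schema for the model while still letting the calling code branch on the concrete type.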