LangChainGo Agents

AirGo.

AirGo. · Published 2026-03-11 12:03:21

Based on an analysis of langchaingo v0.1.14

Overview

The agents package is the core module of the LangChainGo framework that implements intelligent agents. As doc.go describes it:

An Agent is a wrapper around a model, which takes in user input and returns a response corresponding to an “action” to take and a corresponding “action input”. Alternatively the agent can return a finish with the finished answer to the query.

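In essence, an Agent is a wrapper around an LLM that is responsible for:

Receiving the user input
Deciding which action (Action) to take and with what arguments (Action Input)
Or returning the final answer directly (Finish)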

Core architecture

┌─────────────────────────────────────────────────────────────┐
│                          Executor                           │
│  ┌─────────────────────────────────────────────────────┐    │
│  │  for i < MaxIterations:                             │    │
│  │      ┌─────────────────────────────────────────┐    │    │
│  │      │ Agent.Plan(inputs, intermediateSteps)   │    │    │
│  │      └─────────────────────────────────────────┘    │    │
│  │                           │                         │    │
│  │               ┌───────────┴───────────┐             │    │
│  │               ▼                       ▼             │    │
│  │         []AgentAction            AgentFinish        │    │
│  │               │                       │             │    │
│  │               ▼                       ▼             │    │
│  │          Tool.Call()         return final result    │    │
│  │               │                                     │    │
│  │               ▼                                     │    │
│  │       record AgentStep                              │    │
│  │    (Action + Observation)                           │    │
│  │               │                                     │    │
│  │               └──────────────────────┐              │    │
│  │                                      ▼              │    │
│  │                               next iteration        │    │
│  └─────────────────────────────────────────────────────┘    │
└─────────────────────────────────────────────────────────────┘

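Core component responsibilities

Component	Responsibility	File
Agent	Decision: analyzes the input and decides the next action	`agents.go`
Executor	Execution: calls the Agent in a loop and runs the tools	`executor.go`
Tool	Operation: performs the concrete functionality	`tools/tool.go`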

Agent interface definition

Every Agent must implement the following interface (agents.go:12-20):

// Agent is the interface all agents must implement.
type Agent interface {
    // Plan Given an input and previous steps decide what to do next. Returns
    // either actions or a finish. Options can be passed to configure LLM
    // parameters like temperature, max tokens, etc.
    Plan(ctx context.Context, intermediateSteps []schema.AgentStep, inputs map[string]string, options ...chains.ChainCallOption) ([]schema.AgentAction, *schema.AgentFinish, error)
    GetInputKeys() []string
    GetOutputKeys() []string
    GetTools() []tools.Tool
}

The Plan method in detail

Plan(ctx context.Context, intermediateSteps []schema.AgentStep, inputs map[string]string, options ...chains.ChainCallOption) ([]schema.AgentAction, *schema.AgentFinish, error)

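Parameters:

ctx: the context used for cancellation and deadlines
intermediateSteps: the history of steps executed so far (Action + Observation)
inputs: the user input as key-value pairs
options: optional chains.ChainCallOption values that configure LLM parameters such as temperature and max tokens

Return values:

[]schema.AgentAction: the list of actions to execute next
*schema.AgentFinish: the task is complete; carries the final result
error: any error encountered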
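
To make the Plan contract concrete, here is a minimal, dependency-free sketch. The types `Action`, `Finish`, `Step`, and `echoAgent` are simplified stand-ins invented for illustration; they are not the langchaingo `schema` types, and the context and options parameters are left out for brevity.

```go
package main

import (
	"errors"
	"fmt"
)

// Action, Finish, and Step are simplified stand-ins for schema.AgentAction,
// schema.AgentFinish, and schema.AgentStep (hypothetical, for illustration).
type Action struct{ Tool, ToolInput string }
type Finish struct{ ReturnValues map[string]any }
type Step struct {
	Action      Action
	Observation string
}

// echoAgent is a toy planner: it requests the "echo" tool once, then
// finishes with whatever that tool observed.
type echoAgent struct{}

// Plan returns either actions or a finish, never both.
func (echoAgent) Plan(steps []Step, inputs map[string]string) ([]Action, *Finish, error) {
	in, ok := inputs["input"]
	if !ok {
		return nil, nil, errors.New(`missing "input" key`)
	}
	if len(steps) == 0 {
		return []Action{{Tool: "echo", ToolInput: in}}, nil, nil
	}
	last := steps[len(steps)-1]
	return nil, &Finish{ReturnValues: map[string]any{"output": last.Observation}}, nil
}

func main() {
	a := echoAgent{}
	inputs := map[string]string{"input": "hello"}

	// First call: no history yet, so the agent asks for a tool.
	actions, finish, _ := a.Plan(nil, inputs)
	fmt.Println(len(actions), finish == nil)

	// Second call: the history contains an observation, so the agent finishes.
	steps := []Step{{Action: actions[0], Observation: "hello"}}
	_, finish, _ = a.Plan(steps, inputs)
	fmt.Println(finish.ReturnValues["output"])
}
```

The point of the sketch is the two-way return: the caller inspects which of the two values is non-nil to decide whether to run tools or stop.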

Executor

Struct definition

// Executor is the chain responsible for running agents.
type Executor struct {
    Agent            Agent
    Memory           schema.Memory
    CallbacksHandler callbacks.Handler
    ErrorHandler     *ParserErrorHandler

    MaxIterations           int
    ReturnIntermediateSteps bool
}

Core execution flow

func (e *Executor) Call(ctx context.Context, inputValues map[string]any, options ...chains.ChainCallOption) (map[string]any, error) {
    inputs, err := inputsToString(inputValues)
    if err != nil {
        return nil, err
    }
    nameToTool := getNameToTool(e.Agent.GetTools())

    steps := make([]schema.AgentStep, 0)
    for i := 0; i < e.MaxIterations; i++ {
        var finish map[string]any
        steps, finish, err = e.doIteration(ctx, steps, nameToTool, inputs, options...)
        if finish != nil || err != nil {
            return finish, err
        }
    }
    // The maximum number of iterations was reached without a finish
    return e.getReturn(&schema.AgentFinish{ReturnValues: make(map[string]any)}, steps), ErrNotFinished
}

Creating an Executor

executor := agents.NewExecutor(agent,
    agents.WithMaxIterations(5),
    agents.WithMemory(memory),
    agents.WithCallbacksHandler(handler),
    agents.WithReturnIntermediateSteps(),
)
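
The `With...` arguments follow Go's functional options pattern. A minimal sketch of that pattern is shown here; the `executor` and `option` types and the default value are hypothetical and written only for illustration, so the library's own `Option` type may be wired differently.

```go
package main

import "fmt"

// executor and option are hypothetical stand-ins that mimic the shape of
// an options-configured struct; they are not the langchaingo types.
type executor struct {
	maxIterations           int
	returnIntermediateSteps bool
}

type option func(*executor)

func withMaxIterations(n int) option {
	return func(e *executor) { e.maxIterations = n }
}

func withReturnIntermediateSteps() option {
	return func(e *executor) { e.returnIntermediateSteps = true }
}

// newExecutor applies defaults first, then each option in order, so later
// options override earlier ones.
func newExecutor(opts ...option) *executor {
	e := &executor{maxIterations: 5} // default chosen for this sketch
	for _, opt := range opts {
		opt(e)
	}
	return e
}

func main() {
	e := newExecutor(withMaxIterations(3), withReturnIntermediateSteps())
	fmt.Println(e.maxIterations, e.returnIntermediateSteps)
}
```

The benefit of this design is that new settings can be added without changing the constructor's signature or breaking existing callers.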

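Agent implementation patterns

The agents package currently provides three Agent implementations:

Agent type	File	Key characteristics
OneShotZeroAgent	`mrkl.go`	Based on the ReAct framework; parses output in the Action/Action Input format
ConversationalAgent	`conversational.go`	Suited to conversational scenarios; supports chat history; uses AI: as the final-answer marker
OpenAIFunctionsAgent	`openai_functions_agent.go`	Uses OpenAI's native Function Calling API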

ReAct (OneShotZeroAgent)

ReAct = Reasoning + Acting

This is the most classic Agent implementation pattern: it drives the think-act loop by parsing the model's text output.

Struct definition

type OneShotZeroAgent struct {
    Chain            chains.Chain
    Tools            []tools.Tool
    OutputKey        string
    CallbacksHandler callbacks.Handler
}

Output format

Action: calculator
Action Input: 2 + 2
Observation: 4
Thought: I now have the answer
Final Answer: 4

Output parsing logic

func (a *OneShotZeroAgent) parseOutput(output string) ([]schema.AgentAction, *schema.AgentFinish, error) {
    // First check whether the output contains a final answer
    if strings.Contains(output, _finalAnswerAction) {
        splits := strings.Split(output, _finalAnswerAction)
        return nil, &schema.AgentFinish{
            ReturnValues: map[string]any{
                a.OutputKey: strings.TrimSpace(splits[len(splits)-1]),
            },
            Log: output,
        }, nil
    }

    // Otherwise parse Action/Action Input
    r := regexp.MustCompile(`(?i)Action:\s*(.+?)\s*Action\s+Input:\s*(?s)(.+)`)
    matches := r.FindStringSubmatch(output)
    if len(matches) == 3 {
        return []schema.AgentAction{
            {Tool: strings.TrimSpace(matches[1]), ToolInput: strings.TrimSpace(matches[2]), Log: output},
        }, nil, nil
    }

    return nil, nil, fmt.Errorf("%w: %s", ErrUnableToParseOutput, output)
}

The ScratchPad mechanism

The Agent records its reasoning process in agent_scratchpad:

func constructMrklScratchPad(steps []schema.AgentStep) string {
    var scratchPad string
    if len(steps) > 0 {
        for _, step := range steps {
            scratchPad += "\n" + step.Action.Log
            scratchPad += "\nObservation: " + step.Observation + "\n"
        }
    }
    return scratchPad
}
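
A small standalone sketch shows what the scratchpad text looks like after one step. The `step` type and `buildScratchPad` function are simplified stand-ins for `schema.AgentStep` and `constructMrklScratchPad`, written for illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// step is a stand-in for schema.AgentStep: log is the raw LLM text that
// produced the action, observation is the tool's result.
type step struct{ log, observation string }

// buildScratchPad replays every earlier step as "log + Observation" text,
// which is then fed back to the model on the next Plan call.
func buildScratchPad(steps []step) string {
	var sb strings.Builder
	for _, s := range steps {
		sb.WriteString("\n" + s.log)
		sb.WriteString("\nObservation: " + s.observation + "\n")
	}
	return sb.String()
}

func main() {
	steps := []step{{
		log:         "Thought: I need the script first.\nAction: get_script\nAction Input: video_001",
		observation: "Hello everyone, and welcome to my channel.",
	}}
	fmt.Print(buildScratchPad(steps))
}
```

Each iteration appends to this text, so the prompt grows with every tool call; that growth is one practical reason to cap the loop with MaxIterations.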

Conversational Agent

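An Agent designed specifically for conversational scenarios. Its characteristics are:

Supports chat-history memory
Better suited to natural conversation than to pure task execution

Struct definition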

type ConversationalAgent struct {
    Chain            chains.Chain
    Tools            []tools.Tool
    OutputKey        string
    CallbacksHandler callbacks.Handler
}

Differences in output format

It uses AI: as the final-answer marker:

const _conversationalFinalAnswerAction = "AI:"

func (a *ConversationalAgent) parseOutput(output string) ([]schema.AgentAction, *schema.AgentFinish, error) {
    if strings.Contains(output, _conversationalFinalAnswerAction) {
        splits := strings.Split(output, _conversationalFinalAnswerAction)
        finishAction := &schema.AgentFinish{
            ReturnValues: map[string]any{
                a.OutputKey: splits[len(splits)-1],
            },
            Log: output,
        }
        return nil, finishAction, nil
    }
    // ...
}

OpenAI Functions Agent

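This implementation is built on OpenAI's native Function Calling API and has the following advantages:

Does not rely on text parsing, so it is more reliable
Supports structured arguments
Supports parallel tool calls

Struct definition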

type OpenAIFunctionsAgent struct {
    LLM              llms.Model
    Prompt           prompts.FormatPrompter
    Tools            []tools.Tool
    OutputKey        string
    CallbacksHandler callbacks.Handler
}

How tools are defined

func (o *OpenAIFunctionsAgent) functions() []llms.FunctionDefinition {
    res := make([]llms.FunctionDefinition, 0)
    for _, tool := range o.Tools {
        res = append(res, llms.FunctionDefinition{
            Name:        tool.Name(),
            Description: tool.Description(),
            Parameters: map[string]any{
                "properties": map[string]any{
                    "__arg1": map[string]string{"title": "__arg1", "type": "string"},
                },
                "required": []string{"__arg1"},
                "type":     "object",
            },
        })
    }
    return res
}

Parallel tool-call support

func (o *OpenAIFunctionsAgent) constructScratchPad(steps []schema.AgentStep) []llms.ChatMessage {
    // Handles multiple parallel tool calls
    var currentToolCalls []llms.ToolCall
    // ...
}

The relationship between ReAct and ZeroShot

In the codebase, ReAct and ZeroShot refer to the same implementation.

Evidence in the code

doc.go:10-13 states this explicitly:

// Package agents provides and implementation of the agent interface called
// OneShotZeroAgent. This agent uses the ReAct Framework (based on the
// descriptions of tools) to decide what action to take.

The constant naming in initialize.go:

const (
    ZeroShotReactDescription AgentType = "zeroShotReactDescription"
    // ...
)

func Initialize(...) (*Executor, error) {
    switch agentType {
    case ZeroShotReactDescription:
        agent = NewOneShotAgent(llm, tools, opts...)  // implemented in mrkl.go
    // ...
    }
}

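How the concepts relate

Concept	Meaning
ReAct	Reasoning + Acting, a design framework/methodology for Agents
Zero-Shot	Works from tool descriptions alone, with no examples required
OneShotZeroAgent	A Zero-Shot Agent implementation that uses the ReAct framework

Breaking down the name:

OneShotZeroAgent = One-Shot + Zero-Shot Agent
                 = single-pass planning + zero-shot Agent
                 = an Agent based on the ReAct framework that works without examples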

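Hands-on example: a video script editing toolchain

Scenario

Modify a video's voice-over script based on the user's natural-language request:

Look up the current script content
Call the model to modify the script as the user requested
Return the modified script

Directory structure

script_editor/
├── main.go              # Program entry point
├── tools/
│   ├── get_script.go    # Tool that looks up a script
│   └── modify_script.go # Tool that modifies a script
└── script_store.go      # Script storage (simulated database)

Full code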

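1. Script storage (`script_store.go`)

package main

import "sync"

// ScriptStore simulates script storage.
type ScriptStore struct {
    mu      sync.RWMutex
    scripts map[string]string // videoID -> script content
}

func NewScriptStore() *ScriptStore {
    return &ScriptStore{
        scripts: map[string]string{
            "video_001": `Hello everyone, and welcome to my channel.
Today we are going to talk about the history of artificial intelligence.
AI research began in the 1950s
and has gone through several winters and revivals.
We are now living in a golden age of AI.`,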
        },
    }
}

func (s *ScriptStore) Get(videoID string) string {
    s.mu.RLock()
    defer s.mu.RUnlock()
    return s.scripts[videoID]
}

func (s *ScriptStore) Set(videoID, content string) {
    s.mu.Lock()
    defer s.mu.Unlock()
    s.scripts[videoID] = content
}

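2. Script lookup tool (`tools/get_script.go`)

package tools

import (
    "context"
    "fmt"
    "strings"
)

// ScriptReader is the read-only view of the script store that this tool
// needs. Package tools cannot import package main, so the tool depends on
// this small interface, which *ScriptStore satisfies.
type ScriptReader interface {
    Get(videoID string) string
}

// GetScriptTool looks up a video's script.
type GetScriptTool struct {
    store ScriptReader
}

func NewGetScriptTool(store ScriptReader) *GetScriptTool {
    return &GetScriptTool{store: store}
}

func (t *GetScriptTool) Name() string {
    return "get_script"
}

func (t *GetScriptTool) Description() string {
    return `Returns the voice-over script of the given video.
Input: a video ID, for example "video_001".
Output: the full script text of that video.`
}

func (t *GetScriptTool) Call(ctx context.Context, input string) (string, error) {
    // Models often add whitespace or quotes around the ID, so strip them.
    videoID := strings.Trim(strings.TrimSpace(input), `"`)
    script := t.store.Get(videoID)
    if script == "" {
        return "", fmt.Errorf("no script found for video %s", videoID)
    }
    return script, nil
}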

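3. Script modification tool (`tools/modify_script.go`)

package tools

import (
    "context"
    "encoding/json"
    "fmt"

    "github.com/tmc/langchaingo/llms"
)

// ScriptWriter is the write-only view of the script store that this tool
// needs. Package tools cannot import package main, so the tool depends on
// this small interface, which *ScriptStore satisfies.
type ScriptWriter interface {
    Set(videoID, content string)
}

// ModifyScriptTool modifies a script by calling an LLM internally.
type ModifyScriptTool struct {
    store ScriptWriter
    llm   llms.Model
}

// ModifyScriptInput is the JSON payload the tool expects as its input.
type ModifyScriptInput struct {
    VideoID      string `json:"video_id"`
    OriginalText string `json:"original_text"`
    Modification string `json:"modification"`
}

func NewModifyScriptTool(store ScriptWriter, llm llms.Model) *ModifyScriptTool {
    return &ModifyScriptTool{store: store, llm: llm}
}

func (t *ModifyScriptTool) Name() string {
    return "modify_script"
}

func (t *ModifyScriptTool) Description() string {
    return `Calls an AI model to modify a video script according to the user's request.
Input format (JSON):
{
    "video_id": "the video ID",
    "original_text": "the original script content",
    "modification": "the user's requested changes"
}
Output: the modified script content.`
}

func (t *ModifyScriptTool) Call(ctx context.Context, input string) (string, error) {
    var params ModifyScriptInput
    if err := json.Unmarshal([]byte(input), &params); err != nil {
        return "", fmt.Errorf("failed to parse input: %w", err)
    }

    // Build the prompt for modifying the script
    prompt := fmt.Sprintf(`You are a professional video script editor. Revise the voice-over script below according to the user's request.

Original script:
%s

Requested changes:
%s

Output only the complete revised script, with no explanation or commentary:`, params.OriginalText, params.Modification)

    // Call the LLM with a single text prompt
    modifiedScript, err := llms.GenerateFromSinglePrompt(ctx, t.llm, prompt)
    if err != nil {
        return "", fmt.Errorf("model call failed: %w", err)
    }

    // Save the modified script
    t.store.Set(params.VideoID, modifiedScript)

    return modifiedScript, nil
}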
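
Because the tool input is free text produced by a model, the JSON may arrive wrapped in a Markdown code fence or with required fields missing. The following sketch shows one defensive way to decode it; `parseModifyInput`, its validation rules, and the lowercase struct name are assumptions for illustration and are not part of the example above.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// modifyScriptInput mirrors the JSON payload described by the tool.
type modifyScriptInput struct {
	VideoID      string `json:"video_id"`
	OriginalText string `json:"original_text"`
	Modification string `json:"modification"`
}

// codeFence is the three-backtick Markdown fence models sometimes emit.
var codeFence = strings.Repeat("`", 3)

// parseModifyInput strips surrounding whitespace and an optional code
// fence before decoding, then rejects inputs with missing required fields.
func parseModifyInput(raw string) (modifyScriptInput, error) {
	s := strings.TrimSpace(raw)
	s = strings.TrimPrefix(s, codeFence+"json")
	s = strings.TrimPrefix(s, codeFence)
	s = strings.TrimSuffix(s, codeFence)
	s = strings.TrimSpace(s)

	var in modifyScriptInput
	if err := json.Unmarshal([]byte(s), &in); err != nil {
		return in, fmt.Errorf("parse input: %w", err)
	}
	if in.VideoID == "" || in.Modification == "" {
		return in, errors.New("video_id and modification are required")
	}
	return in, nil
}

func main() {
	raw := codeFence + "json\n" +
		`{"video_id":"video_001","modification":"make it funnier"}` +
		"\n" + codeFence
	in, err := parseModifyInput(raw)
	fmt.Println(in.VideoID, in.Modification, err)
}
```

Returning a descriptive error instead of saving an empty result gives the agent a chance to correct its input on the next iteration, provided the executor feeds the error back as an observation.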

4. Main program (`main.go`)

package main

import (
    "context"
    "fmt"
    "log"
    "os"

    "github.com/tmc/langchaingo/agents"
    "github.com/tmc/langchaingo/chains"
    "github.com/tmc/langchaingo/llms/openai"
    "github.com/tmc/langchaingo/schema"
    "github.com/tmc/langchaingo/tools"

    // Local tools package; the module path "script_editor" is an assumption
    // and must match the module name in go.mod.
    scripttools "script_editor/tools"
)

func main() {
    ctx := context.Background()

    // 1. Initialize the LLM
    llm, err := openai.New(
        openai.WithToken(os.Getenv("OPENAI_API_KEY")),
        openai.WithModel("gpt-4"),
    )
    if err != nil {
        log.Fatal(err)
    }

    // 2. Initialize the script store
    store := NewScriptStore()

    // 3. Define the tool list
    toolList := []tools.Tool{
        scripttools.NewGetScriptTool(store),
        scripttools.NewModifyScriptTool(store, llm),
    }

    // 4. Create the Agent (different implementations can be chosen)
    // Option 1: use the ReAct Agent
    agent := agents.NewOneShotAgent(llm, toolList)

    // Option 2: use the OpenAI Functions Agent (recommended)
    // agent := agents.NewOpenAIFunctionsAgent(llm, toolList)

    // 5. Create the Executor
    executor := agents.NewExecutor(agent,
        agents.WithMaxIterations(5),
        agents.WithReturnIntermediateSteps(),
    )

    // 6. Run the user request
    userQuery := "Please make the script for video_001 more relaxed and humorous, and add some internet slang"

    result, err := chains.Call(ctx, executor, map[string]any{
        "input": userQuery,
    })
    if err != nil {
        log.Fatal(err)
    }

    fmt.Println("=== Modified script ===")
    fmt.Println(result["output"])

    // Print the intermediate steps
    if steps, ok := result["intermediateSteps"].([]schema.AgentStep); ok {
        fmt.Println("\n=== Execution steps ===")
        for i, step := range steps {
            fmt.Printf("Step %d: called tool %s\n", i+1, step.Action.Tool)
            fmt.Printf("Input: %s\n", step.Action.ToolInput)
            fmt.Printf("Output: %s\n\n", step.Observation)
        }
    }
}

Execution flow

User input: "Make the script for video_001 more relaxed and humorous"
                    │
                    ▼
        ┌──────────────────────────┐
        │        Agent.Plan        │
        │  (analyze user intent)   │
        └──────────────────────────┘
                    │
        Decides to call: get_script("video_001")
                    │
                    ▼
        ┌──────────────────────────┐
        │    GetScriptTool.Call    │
        │ returns original script  │
        └──────────────────────────┘
                    │
                    ▼
        ┌──────────────────────────┐
        │        Agent.Plan        │
        │  (decide the next step)  │
        └──────────────────────────┘
                    │
        Decides to call: modify_script({
            "video_id": "video_001",
            "original_text": "original script...",
            "modification": "more relaxed and humorous"
        })
                    │
                    ▼
        ┌──────────────────────────┐
        │  ModifyScriptTool.Call   │
        │  calls LLM to revise     │
        │  saves, returns result   │
        └──────────────────────────┘
                    │
                    ▼
        ┌──────────────────────────┐
        │        Agent.Plan        │
        │  decides task is done    │
        │  Final Answer: ...       │
        └──────────────────────────┘
                    │
                    ▼
          returns the final result

Advanced optimizations

Adding memory support

import "github.com/tmc/langchaingo/memory"

executor := agents.NewExecutor(agent,
    agents.WithMaxIterations(5),
    agents.WithMemory(memory.NewConversationBuffer()),
)

Adding more tools

// Preview a script modification (without saving)
type PreviewModificationTool struct { ... }

// Get a script's version history
type GetScriptHistoryTool struct { ... }

// Revert to a specified version
type RevertScriptTool struct { ... }
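
History and revert tools need a store that keeps old versions rather than overwriting them. One possible shape for such a store is sketched here; `versionedStore` and its `History` and `Revert` methods are hypothetical and only illustrate the idea, since the tool bodies above are left elided.

```go
package main

import (
	"fmt"
	"sync"
)

// versionedStore is a hypothetical variant of ScriptStore that keeps every
// saved version, oldest first, instead of overwriting.
type versionedStore struct {
	mu       sync.RWMutex
	versions map[string][]string
}

func newVersionedStore() *versionedStore {
	return &versionedStore{versions: make(map[string][]string)}
}

// Set appends content as the newest version of the script.
func (s *versionedStore) Set(videoID, content string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.versions[videoID] = append(s.versions[videoID], content)
}

// Get returns the newest version, or "" if the video has no script.
func (s *versionedStore) Get(videoID string) string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v := s.versions[videoID]
	if len(v) == 0 {
		return ""
	}
	return v[len(v)-1]
}

// History returns a copy of all versions so callers cannot mutate the store.
func (s *versionedStore) History(videoID string) []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return append([]string(nil), s.versions[videoID]...)
}

// Revert re-saves version n (1-based) as the newest version, which keeps
// the revert itself visible in the history.
func (s *versionedStore) Revert(videoID string, n int) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	v := s.versions[videoID]
	if n < 1 || n > len(v) {
		return fmt.Errorf("video %s has no version %d", videoID, n)
	}
	s.versions[videoID] = append(v, v[n-1])
	return nil
}

func main() {
	s := newVersionedStore()
	s.Set("video_001", "draft one")
	s.Set("video_001", "draft two")
	if err := s.Revert("video_001", 1); err != nil {
		fmt.Println("revert failed:", err)
		return
	}
	fmt.Println(s.Get("video_001"), len(s.History("video_001")))
}
```

Since this store keeps the same `Get` and `Set` method signatures as `ScriptStore`, the existing lookup and modification tools could use it without changes to their calls.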

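Summary

Core design ideas of the agents package

Interface abstraction: the Agent interface defines the contract for decision-making
Separation of responsibilities: the Agent decides, the Executor executes, and the Tool operates
Extensibility: flexible configuration through the Option pattern
Multiple implementations: supports different patterns such as ReAct, Conversational, and OpenAI Functions

Choosing an Agent pattern

Scenario	Recommended Agent	Reason
General task execution	OneShotZeroAgent	Based on ReAct; broadly applicable
Chatbot	ConversationalAgent	Supports context; conversations feel more natural
OpenAI API environment	OpenAIFunctionsAgent	Structured output is more reliable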

File structure at a glance

agents/
├── agents.go              # Agent interface definition
├── executor.go            # Executor
├── mrkl.go                # ReAct Agent implementation
├── conversational.go      # Conversational Agent implementation
├── openai_functions_agent.go  # OpenAI Functions Agent
├── options.go             # Configuration options
├── initialize.go          # Quick initialization (deprecated)
├── errors.go              # Error definitions
├── doc.go                 # Package documentation
└── prompts/               # Prompt templates
    ├── conversational_prefix.txt
    ├── conversational_format_instructions.txt
    └── conversational_suffix.txt
