Spring AI 1.x Series [54]: An Analysis of the Retry Mechanism

云烟成雨TD

Published 2026-06-09 13:34:58

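Contents

1. Overview
2. spring-ai-retry
3. Auto-configuration: SpringAiRetryAutoConfiguration
- Bean 1: RetryTemplate
- Bean 2: `ResponseErrorHandler`
4. Configuration properties
5. Retry lifecycle (full request flow)
6. Models covered
7. Streaming vs. non-streaming
8. Key design points
9. Customization and extension
Summary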

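1. Overview

Spring AI's retry mechanism is a two-layer design that automatically retries failed AI API calls.

Layer	Responsibility	Core component
Exception classification	Intercepts the HTTP response and sorts errors into "retryable" and "non-retryable"	`ResponseErrorHandler`
Retry execution	Decides whether to retry based on the exception type and applies the back-off policy	`RetryTemplate`

The whole mechanism is built on the Spring Retry project and uses programmatic RetryTemplate.execute() calls rather than AOP annotations.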

2. spring-ai-retry

Maven coordinates: org.springframework.ai:spring-ai-retry:1.1.4

The module contains three key classes:

2.1 TransientAiException

public class TransientAiException extends RuntimeException {
    public TransientAiException(String message) { super(message); }
    public TransientAiException(String message, Throwable cause) { super(message, cause); }
}

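A recoverable exception: the operation may succeed if it is retried. Typical cases:

HTTP 5xx server errors (502 Bad Gateway, 503 Service Unavailable, and so on)
Network timeouts
Temporary failures caused by rate limiting (HTTP 429 is a 4xx code, so it is only retried when explicitly configured)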

2.2 NonTransientAiException

public class NonTransientAiException extends RuntimeException {
    public NonTransientAiException(String message) { super(message); }
    public NonTransientAiException(String message, Throwable cause) { super(message, cause); }
}

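A non-recoverable exception: retrying will not change the outcome, so the root cause has to be fixed. Typical cases:

HTTP 401 (invalid API key)
HTTP 403 (insufficient permissions)
HTTP 429 (quota exceeded)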

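2.3 RetryUtils: the default configuration factory

An abstract utility class that exposes three static constants:

(1) DEFAULT_RESPONSE_ERROR_HANDLER

The built-in response error handler. It classifies errors as follows:

4xx client errors → throws NonTransientAiException (no retry)
All other errors (5xx and so on) → throws TransientAiException (retry)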

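(2) DEFAULT_RETRY_TEMPLATE

The default retry template, with these parameters:

Parameter	Value
Maximum attempts	`10` (Spring Retry counts the initial call, so at most 9 retries)
Retryable exception types	`TransientAiException.class`, `ResourceAccessException.class`
Back-off policy	Exponential
Initial interval	`2000ms`
Multiplier	`5`
Maximum interval	`180000ms` (3 minutes)
Log listener	Logs `"Retry error. Retry count: N"` at WARN level on every retry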

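(3) SHORT_RETRY_TEMPLATE

A retry template intended for tests. It differs as follows:

Fixed back-off: 100ms (no growth between retries)
Terser logging: the exception stack trace is not printed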

3. Auto-configuration: SpringAiRetryAutoConfiguration

Class: org.springframework.ai.retry.autoconfigure.SpringAiRetryAutoConfiguration

Trigger condition: @ConditionalOnClass(RetryUtils.class), that is, the spring-ai-retry module is on the classpath.

Registration: loaded automatically via META-INF/spring/org.springframework.boot.autoconfigure.AutoConfiguration.imports.

The configuration class defines two core beans, both annotated with @ConditionalOnMissingBean so that users can override them:

Bean 1: RetryTemplate

@Bean @ConditionalOnMissingBean
public RetryTemplate retryTemplate(SpringAiRetryProperties properties)

Build logic:

Reads maxAttempts from SpringAiRetryProperties
Configures retryOn(TransientAiException.class) and retryOn(ResourceAccessException.class)
Configures the exponential back-off parameters (initialInterval, multiplier, maxInterval)
Detects WebFlux dynamically: if WebClientRequestException is on the classpath, it is also registered as a retryable exception
Adds a log listener that prints "Retry error. Retry count: N, Exception: ..." on each retry

Bean 2: `ResponseErrorHandler`

@Bean @ConditionalOnMissingBean
public ResponseErrorHandler responseErrorHandler(SpringAiRetryProperties properties)

Decision flow for classifying exceptions (the checks run in strict priority order):

HTTP response status is an error
    │
    ├── 1. Is the status code listed in onHttpCodes?
    │       └── YES → throw TransientAiException (forced retry)
    │
    ├── 2. Is onClientErrors=false and the status code 4xx?
    │       └── YES → throw NonTransientAiException (no retry by default)
    │
    ├── 3. Is the status code listed in excludeOnHttpCodes?
    │       └── YES → throw NonTransientAiException (forced no retry)
    │
    └── 4. Fallback → throw TransientAiException (retry by default)
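
The priority order above can be modelled as a small pure function. The following is a minimal, dependency-free sketch rather than the Spring AI source: the class name ErrorClassifier, the Outcome enum, and the method signature are hypothetical, and only the order of the four checks is taken from the flow above.

```java
import java.util.List;

/** Hypothetical, dependency-free model of the four-step decision order. */
final class ErrorClassifier {

    enum Outcome { TRANSIENT, NON_TRANSIENT }

    /**
     * Parameter names mirror the spring.ai.retry.* properties,
     * but this method is illustrative only.
     */
    static Outcome classify(int status, List<Integer> onHttpCodes,
                            boolean onClientErrors, List<Integer> excludeOnHttpCodes) {
        // 1. Codes listed in on-http-codes are always retried.
        if (onHttpCodes.contains(status)) {
            return Outcome.TRANSIENT;
        }
        // 2. 4xx is non-retryable unless on-client-errors is enabled.
        if (!onClientErrors && status >= 400 && status < 500) {
            return Outcome.NON_TRANSIENT;
        }
        // 3. Codes listed in exclude-on-http-codes are never retried.
        if (excludeOnHttpCodes.contains(status)) {
            return Outcome.NON_TRANSIENT;
        }
        // 4. Everything else falls through to "retry".
        return Outcome.TRANSIENT;
    }

    public static void main(String[] args) {
        List<Integer> on = List.of(429);
        List<Integer> exclude = List.of(503);
        System.out.println(classify(429, on, false, exclude)); // step 1 wins over step 2
        System.out.println(classify(401, on, false, exclude)); // step 2
        System.out.println(classify(503, on, false, exclude)); // step 3
        System.out.println(classify(500, on, false, exclude)); // step 4
    }
}
```

Running main prints TRANSIENT, NON_TRANSIENT, NON_TRANSIENT, TRANSIENT. The first line shows why the order matters: a 429 listed in on-http-codes is retried even though 4xx codes are otherwise rejected in step 2.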

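4. Configuration properties

Prefix: spring.ai.retry

4.1 Property overview

Property	Type	Default	Description
`spring.ai.retry.max-attempts`	`int`	`10`	Maximum number of attempts (the initial call counts as one)
`spring.ai.retry.on-client-errors`	`boolean`	`false`	Whether 4xx errors are retried as well
`spring.ai.retry.exclude-on-http-codes`	`List<Integer>`	`[]`	HTTP status codes that must never be retried
`spring.ai.retry.on-http-codes`	`List<Integer>`	`[]`	HTTP status codes that must always be retried
`spring.ai.retry.backoff.initial-interval`	`Duration`	`2000ms`	Wait before the first retry
`spring.ai.retry.backoff.multiplier`	`int`	`5`	Multiplier for the exponential back-off
`spring.ai.retry.backoff.max-interval`	`Duration`	`180000ms`	Upper bound on the wait between two retries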

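4.2 Configuration example

spring:
  ai:
    retry:
      max-attempts: 5                          # at most 5 attempts
      on-client-errors: false                   # do not retry 4xx (default)
      backoff:
        initial-interval: 1s                    # wait 1 second before the first retry
        multiplier: 2                           # exponential multiplier of 2
        max-interval: 30s                       # cap the interval at 30 seconds
      exclude-on-http-codes:
        - 429                                   # do not retry 429 (retrying an exhausted quota is pointless)
        - 503
      on-http-codes:
        - 502                                   # always retry 502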

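4.3 Back-off interval calculation

Exponential back-off formula: wait before retry n = min(initialInterval × multiplier^(n - 1), maxInterval), with n starting at 1

With the defaults (initialInterval=2s, multiplier=5, maxInterval=180s):

Retry	Wait
1st	2s
2nd	10s
3rd	50s
4th	180s (250s capped at the maximum)
5th and later	180s

Total wait when every attempt fails: Spring Retry counts the initial call as an attempt, so 10 attempts mean 9 waits, that is 2 + 10 + 50 + 180 × 6 = 1142s, or about 19 minutes before giving up (excluding the time spent in the calls themselves).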
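
The schedule can be checked with a few lines of plain Java. This is a sketch under one stated assumption: Spring Retry's max-attempts includes the initial call, so 10 attempts produce 9 waits (if you count 10 retries instead, add one more 180s wait, which gives roughly 22 minutes). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that reproduces the capped exponential schedule. */
final class BackoffSchedule {

    /** Waits in ms between attempts: maxAttempts attempts give maxAttempts - 1 waits. */
    static List<Long> waits(long initialMs, double multiplier, long maxMs, int maxAttempts) {
        List<Long> result = new ArrayList<>();
        double next = initialMs;
        for (int i = 1; i < maxAttempts; i++) {
            result.add((long) Math.min(next, maxMs));
            // Cap after multiplying so the value never grows past maxMs.
            next = Math.min(next * multiplier, maxMs);
        }
        return result;
    }

    static long totalMs(List<Long> waits) {
        return waits.stream().mapToLong(Long::longValue).sum();
    }

    public static void main(String[] args) {
        List<Long> waits = waits(2_000, 5, 180_000, 10);
        System.out.println(waits);
        System.out.println(totalMs(waits) / 1000 + " s");
    }
}
```

With the default values this prints nine waits (2000, 10000, 50000, then 180000 six times) and a total of 1142 s.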

5. Retry lifecycle (full request flow)

Using an OpenAI chat call as the example, this is the full path of one API request from invocation to final response:

User Code
  │
  ▼
OpenAiChatModel.call(Prompt)
  │
  ▼
OpenAiChatModel.internalCall()
  │
  ├── 1. Build the ChatCompletionRequest
  │
  ├── 2. Wrap the call in an Observation
  │
  └── 3. retryTemplate.execute(ctx -> {
          return this.openAiApi.chatCompletionEntity(request, headers);
      })
        │
        ▼
      OpenAiApi.chatCompletionEntity()
        │
        ▼
      RestClient (ResponseErrorHandler registered)
        │
        ├── [success] → returns ResponseEntity<ChatCompletion> → parsed into ChatResponse
        │
        └── [failure] → ResponseErrorHandler steps in
                      │
                      ├── throws NonTransientAiException → RetryTemplate does not retry, fails immediately
                      │
                      └── throws TransientAiException → caught by RetryTemplate
                            │
                            ├── maxAttempts not reached → exponential back-off wait → retry
                            │     └── log: "Retry error. Retry count: 1, Exception: ..."
                            │
                            └── maxAttempts reached → the final exception is thrown

Code evidence, from OpenAiChatModel.internalCall():

// Lines 199-200: the core call is wrapped in retryTemplate.execute()
ResponseEntity<ChatCompletion> completionEntity = this.retryTemplate
    .execute(ctx -> this.openAiApi.chatCompletionEntity(request, getAdditionalHttpHeaders(prompt)));

In the OpenAiChatModel builder (lines 806-809), retryTemplate defaults to RetryUtils.DEFAULT_RETRY_TEMPLATE and can be overridden with OpenAiChatModel.builder().retryTemplate(...).
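
To make the lifecycle concrete, the sketch below imitates the retry loop in plain Java. It is not the Spring Retry implementation: MiniRetry and its two nested exception types are hypothetical stand-ins defined here so the example compiles without any dependency, and the loop only approximates what RetryTemplate.execute() does (retry transient failures with a capped exponential wait, let everything else fail immediately).

```java
import java.util.function.Supplier;

/** Hypothetical stand-in for the retry loop; not the Spring Retry source. */
final class MiniRetry {

    /** Stand-ins for TransientAiException / NonTransientAiException. */
    static class TransientException extends RuntimeException {
        TransientException(String message) { super(message); }
    }

    static class NonTransientException extends RuntimeException {
        NonTransientException(String message) { super(message); }
    }

    static <T> T execute(Supplier<T> call, int maxAttempts, long initialMs,
                         double multiplier, long maxMs) {
        long wait = initialMs;
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (TransientException e) {
                if (attempt >= maxAttempts) {
                    throw e; // attempts exhausted: surface the last error
                }
                System.out.println("Retry error. Retry count: " + attempt);
                sleep(wait);
                wait = (long) Math.min(wait * multiplier, maxMs);
            }
            // Any other exception is not caught here and fails immediately.
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during back-off", ie);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = execute(() -> {
            if (++calls[0] < 3) {
                throw new TransientException("503"); // first two calls fail
            }
            return "ok";
        }, 5, 10, 2.0, 100);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

The simulated call fails twice with a transient error and succeeds on the third attempt, so main logs two retries and then prints "ok after 3 calls". Throwing NonTransientException from the supplier instead would end the loop after a single call.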

6. Models covered

Every model auto-configuration declares @AutoConfiguration(after = { ... , SpringAiRetryAutoConfiguration.class, ... }) so that the retry beans are created before the model beans, and then injects both RetryTemplate and ResponseErrorHandler.

Module	Models	Uses RetryTemplate	Uses ResponseErrorHandler
OpenAI	Chat / Embedding / Image / AudioSpeech / AudioTranscription / Moderation	✓	✓
Azure OpenAI	(same API client)	✓	✓
DeepSeek	Chat	✓	✓
Zhipu AI	Chat / Embedding / Image	✓	✓
Mistral AI	Chat / Embedding / Moderation	✓	✓
MiniMax	Chat / Embedding	✓	✓
Vertex AI Gemini	Chat	✓	✓
Vertex AI	Text Embedding	✓	✓
Google GenAI	Chat / Text Embedding	✓	✓
Anthropic	Chat	✓	✓
Ollama	Chat	✓	✓
ElevenLabs	TextToSpeech	✓	✓

In practice, any model that issues its HTTP calls through RestClient and accepts an injected RetryTemplate is automatically covered by this mechanism.

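7. Streaming vs. non-streaming

This is an important design detail:

Call style	Retry behavior
Synchronous call (`call()`)	`retryTemplate.execute()` wraps the whole HTTP call, so retries are fully supported
Streaming call (`stream()`)	Does not use `RetryTemplate`; there is no retry mechanism

Taking OpenAiChatModel as the example:

// Synchronous call, lines 199-200: wrapped in retryTemplate
ResponseEntity<ChatCompletion> completionEntity = this.retryTemplate
    .execute(ctx -> this.openAiApi.chatCompletionEntity(request, ...));

// Streaming call, lines 271-279: calls the API directly, no retry
public Flux<ChatResponse> internalStream(Prompt prompt, ...) {
    return Flux.deferContextual(contextView -> {
        // calls openAiApi.chatCompletionStream() directly, not wrapped in retryTemplate
        ...
    });
}

This means:

Synchronous calls: a 5xx response is retried automatically, up to 10 attempts in total with the defaults
Streaming calls: an error fails the call immediately, with no automatic retry. However, 4xx errors raised while the initial connection is being established (such as an invalid API key) are still intercepted by ResponseErrorHandler as NonTransientAiException and fail fast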

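8. Key design points

8.1 No AOP annotations

Spring AI does not use the @Retryable/@EnableRetry annotations; there is not a single use anywhere in the codebase. Retries are programmatic: the model methods call RetryTemplate.execute() directly.

A likely reason: programmatic retry lets the call inside retryTemplate.execute() cooperate with Observation (the observability instrumentation) and keeps the call chain intact. It also makes it easy to swap the default RetryTemplate through the builder.

8.2 No filters or interceptors

Retries are not implemented with an HTTP filter or a Spring interceptor. Exception classification lives in ResponseErrorHandler (at the RestClient level) and retry execution lives inside the model method, so the two responsibilities are cleanly separated.

8.3 Dynamic WebFlux support

SpringAiRetryAutoConfiguration uses reflection to detect WebClientRequestException at runtime:

try {
    Class<?> webClientRequestEx = Class
        .forName("org.springframework.web.reactive.function.client.WebClientRequestException");
    builder.retryOn((Class<? extends Throwable>) webClientRequestEx);
} catch (ClassNotFoundException ignore) {
    // WebFlux is not on the classpath, skip
}

As a result, pre-response network errors in WebFlux projects (such as DNS resolution failures or refused connections) are retried as well.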
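
The optional-dependency pattern used here is easy to try in isolation. The following sketch is illustrative only: OptionalRetryables and findThrowable are hypothetical names, and it uses Class.asSubclass in place of the unchecked cast shown above, which is a deliberate substitution rather than what the auto-configuration does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Hypothetical helper showing classpath detection of an optional exception type. */
final class OptionalRetryables {

    /** Returns the Throwable subtype if it is on the classpath, otherwise empty. */
    static Optional<Class<? extends Throwable>> findThrowable(String className) {
        try {
            Class<?> candidate = Class.forName(className);
            // asSubclass throws ClassCastException if the class is not a Throwable.
            Class<? extends Throwable> type = candidate.asSubclass(Throwable.class);
            return Optional.of(type);
        } catch (ClassNotFoundException e) {
            return Optional.empty(); // optional dependency is absent: skip it
        }
    }

    public static void main(String[] args) {
        List<Class<? extends Throwable>> retryable = new ArrayList<>();
        retryable.add(java.io.IOException.class);
        // Added only when spring-webflux happens to be on the classpath.
        findThrowable("org.springframework.web.reactive.function.client.WebClientRequestException")
            .ifPresent(retryable::add);
        System.out.println(retryable);
    }
}
```

Because the class is looked up by name, the code compiles and runs whether or not the optional library is present, which is the property the auto-configuration relies on.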

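8.4 Retries also work in standalone use

Each model's builder defaults to RetryUtils.DEFAULT_RETRY_TEMPLATE, so a model instance created directly, without Spring Boot auto-configuration, still gets the default retry behavior.

8.5 Exception hierarchy

RuntimeException
    ├── TransientAiException      ← retryable (server errors, network jitter)
    └── NonTransientAiException   ← not retryable (authentication failures, invalid parameters)

ResourceAccessException (Spring Web's network I/O exception) is also retried, on the same footing as TransientAiException.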

9. Customization and extension

9.1 Adjusting through the configuration file

spring:
  ai:
    retry:
      max-attempts: 3
      on-client-errors: true          # retry 4xx as well (not recommended)
      backoff:
        initial-interval: 500ms
        multiplier: 2
        max-interval: 10s
      on-http-codes:
        - 429                          # retry rate-limit errors too
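9.2 Overriding a bean: custom RetryTemplate

Both beans carry @ConditionalOnMissingBean, so a user-defined bean replaces them:

import java.time.Duration;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.retry.TransientAiException;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.retry.RetryCallback;
import org.springframework.retry.RetryContext;
import org.springframework.retry.RetryListener;
import org.springframework.retry.support.RetryTemplate;

@Configuration
public class CustomRetryConfig {

    private static final Logger log = LoggerFactory.getLogger(CustomRetryConfig.class);

    @Bean
    public RetryTemplate retryTemplate() {
        return RetryTemplate.builder()
            .maxAttempts(3)
            // unlike the default template, ResourceAccessException is not retried here
            .retryOn(TransientAiException.class)
            .fixedBackoff(Duration.ofSeconds(1))    // fixed 1-second back-off
            .withListener(new RetryListener() {
                @Override
                public <T, E extends Throwable> void onError(
                        RetryContext ctx, RetryCallback<T, E> cb, Throwable t) {
                    // custom retry monitoring hook, for example reporting metrics
                    log.error("AI API retry #{}", ctx.getRetryCount(), t);
                }
            })
            .build();
    }
}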

9.3 Overriding a bean: custom ResponseErrorHandler

@Bean
public ResponseErrorHandler responseErrorHandler() {
    return new ResponseErrorHandler() {
        @Override
        public boolean hasError(ClientHttpResponse response) throws IOException {
            return response.getStatusCode().isError();
        }

        @Override
        public void handleError(ClientHttpResponse response) throws IOException {
            // custom classification: every error response (4xx and 5xx) is retried
            String body = StreamUtils.copyToString(response.getBody(), StandardCharsets.UTF_8);
            throw new TransientAiException(
                String.format("HTTP %s - %s", response.getStatusCode().value(), body));
        }
    };
}

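9.4 Overriding for a single model

Instead of replacing the global bean, you can pass a custom RetryTemplate when creating a specific model:

RetryTemplate shortRetry = RetryTemplate.builder()
    .maxAttempts(3)
    // without retryOn(...) the builder retries on any Exception, including NonTransientAiException
    .retryOn(TransientAiException.class)
    .exponentialBackoff(500, 2, 5000)
    .build();

OpenAiChatModel chatModel = OpenAiChatModel.builder()
    .openAiApi(openAiApi)
    .defaultOptions(options)
    .retryTemplate(shortRetry)    // applies to this ChatModel only
    .build();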

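Summary

Spring AI's retry mechanism handles the instability of AI API calls cleanly:

Exception classification: ResponseErrorHandler separates recoverable (Transient) from non-recoverable (NonTransient) errors
Retry execution: implemented with a programmatic RetryTemplate rather than AOP annotations, which keeps the call chain intact
Default policy: exponential back-off (2s → 10s → 50s → … → 180s), up to 10 attempts
Broad coverage: the synchronous calls of almost every AI model get retries transparently
Limitations: streaming calls (stream()) are not retried automatically, and 4xx client errors are not retried by default
Highly customizable: behavior can be tuned through the configuration file, bean overrides, or per-model parameters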
