Java Programmer Stage 42, Part 07: Intelligent Document Parsing and Review, Using LLMs for Contract Summarization and Compliance Checking
Contents
1. Chapter Overview
2. LangChain4j Architecture in Depth
3. Multi-Model Provider Integration
4. Contract Summarization AI Service: Design and Implementation
5. Streaming Output Handling
6. Working Code Examples
7. Best Practices and Performance Optimization
8. Chapter Summary
1. Chapter Overview
1.1 Introduction to LangChain4j

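LangChain4j is one of the more mature LLM (large language model) integration frameworks in the Java ecosystem. It gives Java developers a unified way to access mainstream model services such as OpenAI, Claude, Tongyi Qianwen (Qwen), and GLM. Compared with LangChain in the Python ecosystem, LangChain4j is lighter, type-safe, and closely integrated with Spring Boot.
**Core features:**
- **Unified API**: the ChatModel abstraction hides the differences between underlying models
- **Prompt templates**: prompt engineering support with `{{variable}}` interpolation
- **Chains**: several LLM calls can be linked into a more complex workflow
- **Tool calling**: support for Function Calling and Tool Use
- **Streaming responses**: a complete Streaming API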
1.2 Learning Objectives
After working through this chapter, you will be able to:
1. Understand the core architecture and design philosophy of LangChain4j
2. Configure multiple model providers and switch between them
3. Design and implement a contract summarization AI service
4. Implement streaming output handling and token accounting
5. Integrate the service into the existing intelligent document parsing and review platform
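1.3 Contract Summarization: Business Scenario
In the intelligent document parsing and review platform, the contract summarization module has the following responsibilities:
1. **Key information extraction**: extract the parties, amounts, dates, terms, and other key facts from unstructured contract text
2. **Clause classification**: identify and classify the clauses in a contract (confidentiality, breach, termination, and so on)
3. **Risk annotation**: flag potential legal risks and the points that need manual review
4. **Summary generation**: produce a contract summary document that follows the company's internal standard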
2. LangChain4j Architecture in Depth
2.1 Overall Architecture

    Application Layer      AI Services         Chains
            │                   │                 │
            └───────────────────┼─────────────────┘
                                ▼
                        LangChain4j Core
      (Chat Model, Prompt Template, Memory Manager, Tool Use)
                                │
                                ▼
                         Model Providers
            (OpenAI, Claude, Qwen, GLM, Local AI)
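2.2 Core Components
#### 2.2.1 The ChatModel Abstraction

    // Simplified view of the contract; the real interface has additional convenience overloads
    public interface ChatModel {

        /** Synchronous call: blocks until the complete response is available. */
        ChatResponse chat(ChatRequest request);
    }

ChatModel is LangChain4j's core interface and defines the standard way to interact with an LLM. Every model provider (OpenAI, Claude, and so on) implements it, so callers use the same API regardless of the backend. Streaming is not a method on ChatModel; it is exposed through a separate interface, StreamingChatModel, which delivers the response incrementally to a handler.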
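To make the idea of a provider-neutral abstraction concrete, here is a minimal, dependency-free sketch. `TextModel`, `EchoModel`, `UpperCaseModel`, and `SummaryClient` are hypothetical names invented for illustration; they are not LangChain4j types, and the two "providers" are fakes that never call a network service.

```java
import java.util.Locale;

/** Hypothetical stand-in for a provider-neutral chat abstraction (not a LangChain4j type). */
interface TextModel {
    String chat(String userMessage);
}

/** Fake provider A: echoes the prompt back. */
final class EchoModel implements TextModel {
    @Override
    public String chat(String userMessage) {
        return "echo: " + userMessage;
    }
}

/** Fake provider B: upper-cases the prompt. */
final class UpperCaseModel implements TextModel {
    @Override
    public String chat(String userMessage) {
        return userMessage.toUpperCase(Locale.ROOT);
    }
}

/** Caller code depends only on the interface, so the provider can be swapped freely. */
final class SummaryClient {
    private final TextModel model;

    SummaryClient(TextModel model) {
        this.model = model;
    }

    String summarize(String contractText) {
        return model.chat("Summarize: " + contractText);
    }
}
```

`SummaryClient` never names a concrete provider, which is the same property the ChatModel interface gives application code.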
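#### 2.2.2 AiService Enhancements

    @AiService
    interface ContractSummarizer {

        @SystemMessage("You are a professional contract analyst...")
        @UserMessage("Analyze the following contract and extract the key information...")
        ContractSummary summarize(String contractText);
    }

With the @AiService annotation, which comes from the LangChain4j Spring Boot integration, LangChain4j generates an implementation of the interface automatically, and prompts can be customized per method. The prompts above are abbreviated. In a real template the user message has to reference the method argument, for example through the `{{it}}` placeholder that LangChain4j offers for a single unannotated parameter.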
2.3 Provider Enum Design

    public enum ChatModelProvider {

        OPENAI("OpenAI GPT series", "https://api.openai.com/v1"),
        CLAUDE("Anthropic Claude", "https://api.anthropic.com/v1"),
        QWEN("Alibaba Tongyi Qianwen (Qwen)", "https://dashscope.aliyuncs.com/api/v1"),
        GLM("Zhipu GLM", "https://open.bigmodel.cn/api/paas/v4");

        private final String description;
        private final String baseUrl;

        ChatModelProvider(String description, String baseUrl) {
            this.description = description;
            this.baseUrl = baseUrl;
        }

        // Accessors, so that the stored metadata is actually usable by callers
        public String getDescription() {
            return description;
        }

        public String getBaseUrl() {
            return baseUrl;
        }
    }
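As a sketch of how such an enum can be used, the snippet below turns a configuration string into a provider and picks a base URL, falling back to the provider default when no override is configured. `ProviderSketch`, `parse`, and `resolveBaseUrl` are hypothetical names for illustration only; the URLs are copied from the enum above.

```java
import java.util.Locale;

/** Illustrative variant of the provider enum with lookup helpers (names are assumptions). */
enum ProviderSketch {
    OPENAI("https://api.openai.com/v1"),
    CLAUDE("https://api.anthropic.com/v1"),
    QWEN("https://dashscope.aliyuncs.com/api/v1"),
    GLM("https://open.bigmodel.cn/api/paas/v4");

    private final String defaultBaseUrl;

    ProviderSketch(String defaultBaseUrl) {
        this.defaultBaseUrl = defaultBaseUrl;
    }

    /** Parses a configuration value case-insensitively and reports unknown names clearly. */
    static ProviderSketch parse(String raw) {
        if (raw == null || raw.isBlank()) {
            throw new IllegalArgumentException("provider must not be blank");
        }
        try {
            // Locale.ROOT avoids locale-specific case mapping surprises
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unsupported LLM provider: " + raw, e);
        }
    }

    /** Returns the override when one is configured, otherwise the provider default. */
    String resolveBaseUrl(String override) {
        return (override == null || override.isBlank()) ? defaultBaseUrl : override.trim();
    }
}
```

Keeping the default URL next to the enum constant means an empty `base-url` setting still resolves to a usable endpoint.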
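3. Multi-Model Provider Integration
3.1 Maven Dependencies

    <!-- LangChain4j BOM -->
    <dependencyManagement>
        <dependencies>
            <dependency>
                <groupId>dev.langchain4j</groupId>
                <artifactId>langchain4j-bom</artifactId>
                <version>1.0.0-beta1</version>
                <type>pom</type>
                <scope>import</scope>
            </dependency>
        </dependencies>
    </dependencyManagement>

    <dependencies>
        <!-- LangChain4j Core -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j</artifactId>
        </dependency>
        <!-- OpenAI Provider -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-open-ai</artifactId>
        </dependency>
        <!-- Anthropic Claude Provider -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-anthropic</artifactId>
        </dependency>
        <!-- Alibaba Qwen Provider -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-dashscope</artifactId>
        </dependency>
        <!-- Zhipu GLM Provider -->
        <dependency>
            <groupId>dev.langchain4j</groupId>
            <artifactId>langchain4j-zhipu-ai</artifactId>
        </dependency>
    </dependencies>

Check these coordinates against the release you actually import. The `ChatModel` and `StreamingChatModel` type names used in this chapter belong to the 1.x API, while earlier releases named the interfaces `ChatLanguageModel` and `StreamingChatLanguageModel`. In recent releases the DashScope and Zhipu integrations are published from the LangChain4j community repository under `langchain4j-community-` artifact IDs, and the DashScope chat classes there are named `QwenChatModel` and `QwenStreamingChatModel`.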
3.2 LLM Configuration Class

    @Configuration
    @Slf4j
    public class LlmConfig {

        @Value("${llm.provider:OPENAI}")
        private String providerName;

        @Value("${llm.api-key:}")
        private String apiKey;

        @Value("${llm.base-url:}")
        private String baseUrl;

        @Value("${llm.model-name:gpt-4}")
        private String modelName;

        @Value("${llm.temperature:0.7}")
        private double temperature;

        @Value("${llm.max-tokens:4096}")
        private int maxTokens;

        @Bean
        public ChatModel chatModel() {
            ChatModelProvider provider = ChatModelProvider.valueOf(providerName.toUpperCase());
            return switch (provider) {
                case OPENAI -> createOpenAIModel();
                case CLAUDE -> createClaudeModel();
                case QWEN -> createQwenModel();
                case GLM -> createGlmModel();
            };
        }

        /** Returns the configured base URL, or null so that the builder falls back to its default. */
        private String baseUrlOrNull() {
            return StringUtils.hasText(baseUrl) ? baseUrl : null;
        }

        private ChatModel createOpenAIModel() {
            return OpenAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .baseUrl(baseUrlOrNull())
                    .logRequests(true)
                    .logResponses(true)
                    .build();
        }

        private ChatModel createClaudeModel() {
            return AnthropicChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .baseUrl(baseUrlOrNull())
                    .build();
        }

        // NOTE: the DashScope and Zhipu class names and builder methods are kept as written
        // in the original text; verify them against the module version you depend on.
        private ChatModel createQwenModel() {
            return DashScopeChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .baseUrl(baseUrlOrNull())
                    .build();
        }

        private ChatModel createGlmModel() {
            return ZhipuAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .baseUrl(baseUrlOrNull())
                    .build();
        }

        // NOTE: this bean is created eagerly at startup, so with CLAUDE or GLM configured
        // the default branch throws and the application context fails to start.
        @Bean
        public StreamingChatModel streamingChatModel() {
            ChatModelProvider provider = ChatModelProvider.valueOf(providerName.toUpperCase());
            return switch (provider) {
                case OPENAI -> OpenAiStreamingChatModel.builder()
                        .apiKey(apiKey)
                        .modelName(modelName)
                        .temperature(temperature)
                        .maxTokens(maxTokens)
                        .baseUrl(baseUrlOrNull())
                        .build();
                case QWEN -> DashScopeStreamingChatModel.builder()
                        .apiKey(apiKey)
                        .modelName(modelName)
                        .temperature(temperature)
                        .maxTokens(maxTokens)
                        .build();
                default -> throw new IllegalArgumentException(
                        "Streaming not supported for provider: " + provider);
            };
        }
    }
3.3 Configuration Properties

    # application.yml
    llm:
      provider: ${LLM_PROVIDER:OPENAI}
      api-key: ${OPENAI_API_KEY:your-api-key}
      base-url: ${LLM_BASE_URL:}
      model-name: ${LLM_MODEL_NAME:gpt-4}
      temperature: 0.7
      max-tokens: 4096
4. Contract Summarization AI Service: Design and Implementation

4.1 Contract Summary Data Model

    // @Builder alone generates only an all-args constructor; Jackson needs a no-args one
    // to deserialize the model's JSON, hence the two extra constructor annotations.
    @Data
    @Builder
    @NoArgsConstructor
    @AllArgsConstructor
    public class ContractSummary {
        private String contractId;
        private String contractType;
        private String contractNumber;
        private LocalDate signingDate;
        private LocalDate startDate;
        private LocalDate endDate;
        private PartyInfo partyA;
        private PartyInfo partyB;
        private BigDecimal totalAmount;
        private String currency;
        private List<ClauseSummary> clauses;
        private List<String> riskPoints;
        private List<String> keyObligations;
        private String overallAssessment;
        private String rawSummary;
        private TokenUsage tokenUsage;
    }

    @Data
    @Builder
    @NoArgsConstructor
    @AllArgsConstructor
    public class PartyInfo {
        private String name;
        private String type;
        private String address;
        private String representative;
        private String contactInfo;
    }

    @Data
    @Builder
    @NoArgsConstructor
    @AllArgsConstructor
    public class ClauseSummary {
        private Integer clauseNumber;
        private String clauseType;
        private String title;
        private String summary;
        private String riskLevel;
        private List<String> keyPoints;
    }

    @Data
    @Builder
    @NoArgsConstructor
    @AllArgsConstructor
    public class TokenUsage {
        private int promptTokens;
        private int completionTokens;
        private int totalTokens;
    }
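Because the fields are filled in by a language model, it can help to run a few mechanical sanity checks on the parsed values before trusting them. The following is a minimal sketch under that assumption; `SummarySanityChecks` and its rules are hypothetical and not part of the original service.

```java
import java.math.BigDecimal;
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical post-parse checks on values extracted by the model. */
final class SummarySanityChecks {
    private SummarySanityChecks() {}

    /** Returns a list of human-readable problems; an empty list means nothing was flagged. */
    static List<String> check(LocalDate startDate, LocalDate endDate, BigDecimal totalAmount) {
        List<String> problems = new ArrayList<>();
        // Missing values are allowed (the prompt asks for null when unknown)
        if (startDate != null && endDate != null && endDate.isBefore(startDate)) {
            problems.add("endDate is before startDate");
        }
        if (totalAmount != null && totalAmount.signum() < 0) {
            problems.add("totalAmount is negative");
        }
        return problems;
    }
}
```

Findings from checks like these could be appended to `riskPoints` or used to route the contract to manual review.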
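4.2 Contract Summary Prompt Template

    @Component
    public class ContractPromptTemplate {

        private static final String SYSTEM_TEMPLATE = """
            You are a professional contract analyst with extensive legal knowledge and
            contract review experience. Your job is to analyze contract text, extract key
            information, and identify potential risks.

            Output the analysis strictly in the following JSON format, with no additional explanation:
            {
              "contractType": "contract type, e.g. purchase, service, or lease contract",
              "contractNumber": "contract number",
              "signingDate": "signing date, format: YYYY-MM-DD",
              "startDate": "start date, format: YYYY-MM-DD",
              "endDate": "end date, format: YYYY-MM-DD",
              "partyA": {
                "name": "Party A name",
                "type": "Party A type",
                "address": "address",
                "representative": "legal representative"
              },
              "partyB": {
                "name": "Party B name",
                "type": "Party B type",
                "address": "address",
                "representative": "legal representative"
              },
              "totalAmount": "total contract amount (number)",
              "currency": "currency",
              "clauses": [
                {
                  "clauseNumber": 1,
                  "clauseType": "clause type",
                  "title": "clause title",
                  "summary": "clause summary",
                  "riskLevel": "risk level: LOW/MEDIUM/HIGH",
                  "keyPoints": ["point 1", "point 2"]
                }
              ],
              "riskPoints": ["risk 1", "risk 2"],
              "keyObligations": ["key obligation 1", "key obligation 2"],
              "overallAssessment": "overall assessment"
            }

            If some information is not stated explicitly in the contract, use null. Do not invent information.
            """;

        private static final String USER_TEMPLATE = """
            Analyze the following contract text, extract the key information, and identify potential risks:
            ---
            {contract_text}
            ---
            Output the analysis directly as JSON.
            """;

        // The system prompt is a SystemMessage; AiMessage represents text produced by the model.
        public SystemMessage buildSystemMessage() {
            return SystemMessage.from(SYSTEM_TEMPLATE);
        }

        public UserMessage buildUserMessage(String contractText) {
            // String.replace is literal, so '$' or '\' in the contract text is safe
            return UserMessage.from(USER_TEMPLATE.replace("{contract_text}", contractText));
        }
    }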
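The `{contract_text}` substitution can be generalized to several named variables. The sketch below is one possible way to do that with the JDK only; `PromptRenderer` is a hypothetical helper, not a LangChain4j class. It fails fast on missing variables and does not rescan substituted text, so braces inside a contract are left alone.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical single-pass renderer for {snake_case} placeholders. */
final class PromptRenderer {
    // Matches {name} where name is lowercase letters and underscores only,
    // so JSON braces such as { "key": ... } in the template are not touched.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([a-z_]+)\\}");

    private PromptRenderer() {}

    static String render(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("Missing template variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value literal
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

A call such as `PromptRenderer.render(template, Map.of("contract_text", text))` behaves like the `replace` call above for a single variable.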
4.3 Summarization Service

    @Service
    @Slf4j
    public class ContractSummarizationService {

        private final ChatModel chatModel;
        private final StreamingChatModel streamingChatModel;
        private final ObjectMapper objectMapper;
        private final ContractPromptTemplate promptTemplate;

        public ContractSummarizationService(
                ChatModel chatModel,
                StreamingChatModel streamingChatModel,
                ObjectMapper objectMapper,
                ContractPromptTemplate promptTemplate) {
            this.chatModel = chatModel;
            this.streamingChatModel = streamingChatModel;
            this.objectMapper = objectMapper;
            this.promptTemplate = promptTemplate;
        }

        /** Synchronous contract summarization. */
        public ContractSummary summarize(String contractId, String contractText) {
            log.info("Starting contract summarization, contractId: {}", contractId);
            try {
                ChatMessage systemMessage = promptTemplate.buildSystemMessage();
                ChatMessage userMessage = promptTemplate.buildUserMessage(contractText);

                ChatRequest request = ChatRequest.builder()
                        .messages(systemMessage, userMessage)
                        // JSON mode is a hint; not every provider honors it
                        .responseFormat(ResponseFormat.JSON)
                        .build();

                ChatResponse response = chatModel.chat(request);
                String jsonContent = response.aiMessage().text();
                log.debug("LLM Response: {}", jsonContent);

                ContractSummary summary = parseSummary(jsonContent);
                summary.setContractId(contractId);
                summary.setRawSummary(jsonContent);

                if (response.tokenUsage() != null) {
                    summary.setTokenUsage(TokenUsage.builder()
                            .promptTokens(response.tokenUsage().inputTokenCount())
                            .completionTokens(response.tokenUsage().outputTokenCount())
                            .totalTokens(response.tokenUsage().totalTokenCount())
                            .build());
                }

                log.info("Contract summarization completed, contractId: {}", contractId);
                return summary;
            } catch (Exception e) {
                log.error("Failed to summarize contract: {}", contractId, e);
                throw new ContractSummarizationException(
                        "Failed to summarize contract: " + contractId, e);
            }
        }

        /**
         * Streaming contract summarization. StreamingChatModel is callback-based,
         * so the callbacks are bridged into a Flux with Flux.create.
         */
        public Flux<String> summarizeStream(String contractId, String contractText,
                Consumer<String> onNext, Runnable onComplete, Consumer<Throwable> onError) {
            log.info("Starting streaming contract summarization, contractId: {}", contractId);

            List<ChatMessage> messages = List.of(
                    promptTemplate.buildSystemMessage(),
                    promptTemplate.buildUserMessage(contractText));

            return Flux.create(sink -> streamingChatModel.chat(messages,
                    new StreamingChatResponseHandler() {
                        @Override
                        public void onPartialResponse(String partialResponse) {
                            onNext.accept(partialResponse);
                            sink.next(partialResponse);
                        }

                        @Override
                        public void onCompleteResponse(ChatResponse completeResponse) {
                            log.info("Streaming completed for contract: {}", contractId);
                            onComplete.run();
                            sink.complete();
                        }

                        @Override
                        public void onError(Throwable error) {
                            onError.accept(error);
                            sink.error(error);
                        }
                    }));
        }

        private ContractSummary parseSummary(String jsonContent) {
            try {
                return objectMapper.readValue(jsonContent, ContractSummary.class);
            } catch (JsonProcessingException e) {
                log.warn("Failed to parse JSON, attempting cleanup: {}", e.getMessage());
                return parseWithCleanup(jsonContent);
            }
        }

        /** Strips Markdown code fences and line breaks that models often wrap around JSON. */
        private ContractSummary parseWithCleanup(String jsonContent) {
            String cleaned = jsonContent
                    .replaceAll("```json\\s*", "")
                    .replaceAll("```\\s*", "")
                    .replaceAll("\n\\s*", "")
                    .trim();
            try {
                return objectMapper.readValue(cleaned, ContractSummary.class);
            } catch (JsonProcessingException e) {
                throw new ContractSummarizationException(
                        "Failed to parse contract summary JSON", e);
            }
        }
    }
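Models sometimes add a sentence before or after the JSON in addition to the code fences, which the regex cleanup above does not remove. One alternative, sketched here with a hypothetical `JsonExtractor` helper, is to keep only the text between the first `{` and the last `}`. It assumes the response contains exactly one top-level JSON object.

```java
/** Hypothetical helper: isolates the outermost JSON object in a model response. */
final class JsonExtractor {
    private JsonExtractor() {}

    static String extractJsonObject(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("response is null");
        }
        int start = raw.indexOf('{');
        int end = raw.lastIndexOf('}');
        if (start < 0 || end < start) {
            throw new IllegalArgumentException("no JSON object found in response");
        }
        // Everything outside the outermost braces (fences, commentary) is dropped;
        // the JSON content itself is left untouched.
        return raw.substring(start, end + 1);
    }
}
```

The extracted string can then be passed to `objectMapper.readValue` exactly as in `parseWithCleanup`.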
5. Streaming Output Handling

5.1 Streaming Response Architecture

    Client Request
          │
          ▼
    StreamingChatModel
          │  SSE (Server-Sent Events)
          ▼
    StreamingResponseHandler
      ├── onNext      ──► Accumulator (incremental concatenation)
      ├── onComplete  ──► Callback (completion notification)
      └── onError     ──► Error handling

In the LangChain4j 1.x API these three callbacks are `onPartialResponse`, `onCompleteResponse`, and `onError` on `StreamingChatResponseHandler`. The `onNext` / `onComplete` names in the diagram follow the older `StreamingResponseHandler`.
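The three branches of the diagram can be sketched without any framework. In the snippet below, `TokenHandler`, `AccumulatingHandler`, and `FakeTokenSource` are hypothetical names; the fake source stands in for a streaming model and simply replays a fixed list of tokens.

```java
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical callback contract mirroring the three branches in the diagram. */
interface TokenHandler {
    void onNext(String token);
    void onComplete();
    void onError(Throwable error);
}

/** Accumulates tokens while forwarding each one to a downstream consumer. */
final class AccumulatingHandler implements TokenHandler {
    private final StringBuilder buffer = new StringBuilder();
    private final Consumer<String> downstream;
    private boolean completed;
    private Throwable failure;

    AccumulatingHandler(Consumer<String> downstream) {
        this.downstream = downstream;
    }

    @Override
    public void onNext(String token) {
        buffer.append(token);     // incremental concatenation
        downstream.accept(token); // e.g. push to the client
    }

    @Override
    public void onComplete() {
        completed = true;
    }

    @Override
    public void onError(Throwable error) {
        failure = error;
    }

    String content() { return buffer.toString(); }
    boolean isCompleted() { return completed; }
    Throwable failure() { return failure; }
}

/** Fake token source standing in for a streaming model. */
final class FakeTokenSource {
    private FakeTokenSource() {}

    static void stream(List<String> tokens, TokenHandler handler) {
        try {
            for (String token : tokens) {
                handler.onNext(token);
            }
            handler.onComplete();
        } catch (RuntimeException e) {
            handler.onError(e);
        }
    }
}
```

Once `onComplete` fires, `content()` holds the full response, which is the point at which the accumulated JSON can be parsed.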
5.2 Token Usage Tracking

    @Component
    @Slf4j
    public class TokenUsageCollector {

        private final Map<String, TokenStats> sessionStats = new ConcurrentHashMap<>();

        public void startSession(String sessionId) {
            sessionStats.put(sessionId, new TokenStats());
        }

        public void recordPromptTokens(String sessionId, int tokens) {
            TokenStats stats = sessionStats.get(sessionId);
            if (stats != null) {
                stats.addPromptTokens(tokens);
            }
        }

        public void recordCompletionTokens(String sessionId, int tokens) {
            TokenStats stats = sessionStats.get(sessionId);
            if (stats != null) {
                stats.addCompletionTokens(tokens);
            }
        }

        public TokenStats getStats(String sessionId) {
            return sessionStats.get(sessionId);
        }

        public void endSession(String sessionId) {
            TokenStats stats = sessionStats.remove(sessionId);
            if (stats != null) {
                log.info("Session {} completed. Total tokens: {}, Cost estimate: ${}",
                        sessionId, stats.getTotalTokens(), stats.estimateCost());
            }
        }

        // @AllArgsConstructor is removed: it would suppress the no-args constructor
        // that startSession() relies on.
        @Data
        public static class TokenStats {
            private AtomicInteger promptTokens = new AtomicInteger(0);
            private AtomicInteger completionTokens = new AtomicInteger(0);

            public void addPromptTokens(int tokens) {
                promptTokens.addAndGet(tokens);
            }

            public void addCompletionTokens(int tokens) {
                completionTokens.addAndGet(tokens);
            }

            public int getTotalTokens() {
                return promptTokens.get() + completionTokens.get();
            }

            public double estimateCost() {
                // Illustrative GPT-4 pricing: $0.03 per 1K prompt tokens, $0.06 per 1K completion tokens
                double promptCost = promptTokens.get() / 1000.0 * 0.03;
                double completionCost = completionTokens.get() / 1000.0 * 0.06;
                return promptCost + completionCost;
            }
        }
    }
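As a worked example of the cost formula, 1,200 prompt tokens and 800 completion tokens cost 1.2 × 0.03 + 0.8 × 0.06 = 0.036 + 0.048 = 0.084 USD at the prices quoted above. The sketch below does the same arithmetic with `BigDecimal` to avoid floating-point drift when many sessions are summed; `CostEstimator` is a hypothetical helper and the prices are the illustrative ones from the text, not current list prices.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

/** Hypothetical cost calculator using the illustrative per-1K-token prices from the text. */
final class CostEstimator {
    private static final BigDecimal PROMPT_PRICE_PER_1K = new BigDecimal("0.03");
    private static final BigDecimal COMPLETION_PRICE_PER_1K = new BigDecimal("0.06");
    private static final BigDecimal ONE_THOUSAND = BigDecimal.valueOf(1000);

    private CostEstimator() {}

    /** Returns the estimated cost in USD, rounded to six decimal places. */
    static BigDecimal estimateUsd(long promptTokens, long completionTokens) {
        BigDecimal prompt = BigDecimal.valueOf(promptTokens).multiply(PROMPT_PRICE_PER_1K);
        BigDecimal completion = BigDecimal.valueOf(completionTokens).multiply(COMPLETION_PRICE_PER_1K);
        return prompt.add(completion).divide(ONE_THOUSAND, 6, RoundingMode.HALF_UP);
    }
}
```

In practice the two prices would come from configuration per model rather than from constants.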
5.3 SSE Streaming Controller

    @RestController
    @RequestMapping("/api/v1/contracts")
    @Slf4j
    public class ContractStreamController {

        private final ContractSummarizationService summarizationService;
        private final TokenUsageCollector tokenCollector;

        public ContractStreamController(
                ContractSummarizationService summarizationService,
                TokenUsageCollector tokenCollector) {
            this.summarizationService = summarizationService;
            this.tokenCollector = tokenCollector;
        }

        // NOTE: passing the contract text as a query parameter only works for short inputs;
        // URL length limits make a POST body more practical for real contracts.
        @GetMapping(value = "/{contractId}/summary/stream",
                produces = MediaType.TEXT_EVENT_STREAM_VALUE)
        public Flux<ServerSentEvent<String>> streamSummary(
                @PathVariable String contractId,
                @RequestParam String contractText) {

            String sessionId = UUID.randomUUID().toString();
            tokenCollector.startSession(sessionId);
            AtomicReference<String> fullContent = new AtomicReference<>("");

            return summarizationService.summarizeStream(
                            contractId,
                            contractText,
                            content -> fullContent.updateAndGet(prev -> prev + content),
                            () -> {
                                tokenCollector.endSession(sessionId);
                                log.info("Stream completed for session: {}", sessionId);
                            },
                            error -> {
                                // End the session on failure too, otherwise its stats are never removed
                                tokenCollector.endSession(sessionId);
                                log.error("Stream error for session: {}", sessionId, error);
                            })
                    .map(content -> ServerSentEvent.<String>builder()
                            .data(content)
                            .event("content")
                            .id(sessionId)
                            .build())
                    .startWith(ServerSentEvent.<String>builder()
                            .event("start")
                            .data(sessionId)
                            .build());
        }
    }
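For readers unfamiliar with the wire format, each `ServerSentEvent` ends up as a small text frame in the `text/event-stream` response. The sketch below builds such a frame by hand; `SseFrames` is a hypothetical helper for illustration, and the exact whitespace a given framework emits may differ.

```java
/** Hypothetical helper that renders one Server-Sent Events frame as text. */
final class SseFrames {
    private SseFrames() {}

    static String format(String event, String id, String data) {
        StringBuilder frame = new StringBuilder();
        if (event != null) {
            frame.append("event: ").append(event).append('\n');
        }
        if (id != null) {
            frame.append("id: ").append(id).append('\n');
        }
        // Multi-line payloads need one "data:" field per line
        for (String line : data.split("\n", -1)) {
            frame.append("data: ").append(line).append('\n');
        }
        // A blank line terminates the frame
        return frame.append('\n').toString();
    }
}
```

On the client side, a browser `EventSource` listening for the `content` event receives the `data` payload of each frame as it arrives.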
6. Working Code Examples
6.1 Complete LLM Configuration Class

    package com.docanalysis.llm.config;

    import dev.langchain4j.model.anthropic.AnthropicChatModel;
    import dev.langchain4j.model.chat.ChatModel;
    import dev.langchain4j.model.chat.StreamingChatModel;
    // NOTE: the DashScope and Zhipu imports are kept as written in the original text;
    // verify package and class names against the module version you depend on.
    import dev.langchain4j.model.dashscope.DashScopeChatModel;
    import dev.langchain4j.model.dashscope.DashScopeStreamingChatModel;
    import dev.langchain4j.model.openai.OpenAiChatModel;
    import dev.langchain4j.model.openai.OpenAiStreamingChatModel;
    import dev.langchain4j.model.zhipu.ZhipuAiChatModel;
    import org.springframework.beans.factory.annotation.Value;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;
    import org.springframework.util.StringUtils;

    import java.time.Duration;

    /**
     * LLM configuration class.
     * Supports switching between OpenAI, Claude, Tongyi Qianwen (Qwen), and Zhipu GLM.
     */
    @Configuration
    public class LlmConfiguration {

        // Property keys use the llm.* prefix so that they match application.yml in section 3.3
        @Value("${llm.provider:OPENAI}")
        private String provider;

        @Value("${llm.api-key:}")
        private String apiKey;

        @Value("${llm.base-url:}")
        private String baseUrl;

        @Value("${llm.model-name:gpt-4}")
        private String modelName;

        @Value("${llm.temperature:0.7}")
        private double temperature;

        @Value("${llm.max-tokens:4096}")
        private int maxTokens;

        @Value("${llm.timeout:60000}")
        private long timeoutMs;

        @Bean
        public ChatModel chatModel() {
            return switch (Provider.fromString(provider)) {
                case OPENAI -> buildOpenAIModel();
                case CLAUDE -> buildClaudeModel();
                case QWEN -> buildQwenModel();
                case GLM -> buildGlmModel();
            };
        }

        // NOTE: created eagerly, so startup fails for providers without a streaming branch
        @Bean
        public StreamingChatModel streamingChatModel() {
            return switch (Provider.fromString(provider)) {
                case OPENAI -> buildOpenAIStreamingModel();
                case QWEN -> buildQwenStreamingModel();
                default -> throw new UnsupportedOperationException(
                        "Streaming not supported for provider: " + provider);
            };
        }

        private OpenAiChatModel buildOpenAIModel() {
            var builder = OpenAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .timeout(Duration.ofMillis(timeoutMs))
                    .logRequests(true)
                    .logResponses(true);
            if (StringUtils.hasText(baseUrl)) {
                builder.baseUrl(baseUrl);
            }
            return builder.build();
        }

        private OpenAiStreamingChatModel buildOpenAIStreamingModel() {
            var builder = OpenAiStreamingChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .timeout(Duration.ofMillis(timeoutMs));
            if (StringUtils.hasText(baseUrl)) {
                builder.baseUrl(baseUrl);
            }
            return builder.build();
        }

        private AnthropicChatModel buildClaudeModel() {
            return AnthropicChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .timeout(Duration.ofMillis(timeoutMs))
                    .build();
        }

        private DashScopeChatModel buildQwenModel() {
            var builder = DashScopeChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens);
            if (StringUtils.hasText(baseUrl)) {
                builder.baseUrl(baseUrl);
            }
            return builder.build();
        }

        private DashScopeStreamingChatModel buildQwenStreamingModel() {
            return DashScopeStreamingChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens)
                    .build();
        }

        private ZhipuAiChatModel buildGlmModel() {
            var builder = ZhipuAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName(modelName)
                    .temperature(temperature)
                    .maxTokens(maxTokens);
            if (StringUtils.hasText(baseUrl)) {
                builder.baseUrl(baseUrl);
            }
            return builder.build();
        }

        public enum Provider {
            OPENAI, CLAUDE, QWEN, GLM;

            public static Provider fromString(String value) {
                try {
                    return Provider.valueOf(value.toUpperCase());
                } catch (IllegalArgumentException e) {
                    throw new IllegalArgumentException(
                            "Unsupported LLM provider: " + value +
                            ". Supported providers: OPENAI, CLAUDE, QWEN, GLM");
                }
            }
        }
    }
6.2 Complete Contract Summarization Service

    package com.docanalysis.llm.service;

    import com.docanalysis.llm.config.ContractPromptTemplate;
    import com.docanalysis.llm.model.ContractSummary;
    import com.docanalysis.llm.model.TokenUsage;
    import com.fasterxml.jackson.core.JsonProcessingException;
    import com.fasterxml.jackson.databind.ObjectMapper;
    import dev.langchain4j.data.message.ChatMessage;
    import dev.langchain4j.data.message.SystemMessage;
    import dev.langchain4j.data.message.UserMessage;
    import dev.langchain4j.model.chat.ChatModel;
    import dev.langchain4j.model.chat.StreamingChatModel;
    import dev.langchain4j.model.chat.request.ChatRequest;
    import dev.langchain4j.model.chat.response.ChatResponse;
    import dev.langchain4j.model.chat.response.StreamingChatResponseHandler;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
    import org.springframework.stereotype.Service;
    import reactor.core.publisher.Flux;

    import java.util.List;
    import java.util.function.Consumer;

    /**
     * Contract summarization service.
     * Uses LangChain4j to call a large model and produce a structured contract summary.
     */
    @Service
    public class ContractSummarizationService {

        private static final Logger log = LoggerFactory.getLogger(
                ContractSummarizationService.class);

        private final ChatModel chatModel;
        private final StreamingChatModel streamingChatModel;
        private final ObjectMapper objectMapper;
        private final ContractPromptTemplate promptTemplate;

        public ContractSummarizationService(
                ChatModel chatModel,
                StreamingChatModel streamingChatModel,
                ObjectMapper objectMapper,
                ContractPromptTemplate promptTemplate) {
            this.chatModel = chatModel;
            this.streamingChatModel = streamingChatModel;
            this.objectMapper = objectMapper;
            this.promptTemplate = promptTemplate;
        }

        /**
         * Synchronous contract summarization.
         * @param contractText the contract text
         * @return the contract summary
         */
        public ContractSummary summarize(String contractText) {
            log.info("Starting contract summarization");
            List<ChatMessage> messages = buildMessages(contractText);
            try {
                ChatResponse response = chatModel.chat(ChatRequest.builder()
                        .messages(messages)
                        .build());
                String jsonResponse = response.aiMessage().text();
                log.debug("LLM Raw Response: {}", jsonResponse);

                ContractSummary summary = parseJsonResponse(jsonResponse);
                summary.setRawSummary(jsonResponse);
                if (response.tokenUsage() != null) {
                    summary.setTokenUsage(TokenUsage.builder()
                            .promptTokens(response.tokenUsage().inputTokenCount())
                            .completionTokens(response.tokenUsage().outputTokenCount())
                            .totalTokens(response.tokenUsage().totalTokenCount())
                            .build());
                }
                log.info("Contract summarization completed. Token usage: {}",
                        summary.getTokenUsage());
                return summary;
            } catch (Exception e) {
                log.error("Failed to summarize contract", e);
                throw new ContractSummarizationException(
                        "Failed to summarize contract", e);
            }
        }

        /**
         * Streaming contract summarization.
         * @param contractText the contract text
         * @param onToken callback invoked for every token
         * @return a stream of tokens
         */
        public Flux<String> summarizeStream(String contractText, Consumer<String> onToken) {
            log.info("Starting streaming contract summarization");
            List<ChatMessage> messages = buildMessages(contractText);
            StringBuilder accumulated = new StringBuilder();

            // StreamingChatModel is callback-based; Flux.create bridges it to Reactor
            return Flux.create(sink -> streamingChatModel.chat(messages,
                    new StreamingChatResponseHandler() {
                        @Override
                        public void onPartialResponse(String token) {
                            accumulated.append(token);
                            onToken.accept(token);
                            sink.next(token);
                        }

                        @Override
                        public void onCompleteResponse(ChatResponse completeResponse) {
                            log.info("Streaming completed. Total length: {}",
                                    accumulated.length());
                            sink.complete();
                        }

                        @Override
                        public void onError(Throwable error) {
                            sink.error(error);
                        }
                    }));
        }

        /** The system prompt is a SystemMessage; AiMessage is reserved for model output. */
        private List<ChatMessage> buildMessages(String contractText) {
            return List.of(
                    SystemMessage.from(promptTemplate.getSystemPrompt()),
                    UserMessage.from(promptTemplate.buildUserPrompt(contractText)));
        }

        private ContractSummary parseJsonResponse(String json) {
            try {
                String cleaned = cleanJsonResponse(json);
                return objectMapper.readValue(cleaned, ContractSummary.class);
            } catch (JsonProcessingException e) {
                log.error("Failed to parse JSON response", e);
                throw new ContractSummarizationException(
                        "Failed to parse contract summary", e);
            }
        }

        /** Removes a leading and trailing Markdown code fence, if present. */
        private String cleanJsonResponse(String raw) {
            String cleaned = raw.trim();
            if (cleaned.startsWith("```json")) {
                cleaned = cleaned.substring(7);
            }
            if (cleaned.startsWith("```")) {
                cleaned = cleaned.substring(3);
            }
            if (cleaned.endsWith("```")) {
                cleaned = cleaned.substring(0, cleaned.length() - 3);
            }
            return cleaned.trim();
        }

        public static class ContractSummarizationException extends RuntimeException {
            public ContractSummarizationException(String message, Throwable cause) {
                super(message, cause);
            }
        }
    }
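6.3 Prompt Template Configuration

    package com.docanalysis.llm.config;

    import org.springframework.stereotype.Component;

    /**
     * Prompt template for contract summarization.
     */
    @Component
    public class ContractPromptTemplate {

        private static final String SYSTEM_PROMPT = """
            You are a professional contract analyst, skilled at extracting key information
            from contract text, identifying legal risks, and producing structured summaries.

            Output strictly in the following JSON format and make sure every field has a value
            (use null when the information is not found):
            {
              "contractType": "contract type",
              "signingDate": "signing date in YYYY-MM-DD format",
              "startDate": "start date in YYYY-MM-DD format",
              "endDate": "end date in YYYY-MM-DD format",
              "partyA": {
                "name": "full name of Party A",
                "type": "entity type of Party A",
                "address": "registered address",
                "representative": "legal representative"
              },
              "partyB": {
                "name": "full name of Party B",
                "type": "entity type of Party B",
                "address": "registered address",
                "representative": "legal representative"
              },
              "totalAmount": "numeric amount",
              "currency": "currency",
              "clauses": [
                {
                  "clauseNumber": 1,
                  "clauseType": "clause type",
                  "title": "clause title",
                  "summary": "clause summary (within 50 characters)",
                  "riskLevel": "LOW|MEDIUM|HIGH",
                  "keyPoints": ["point 1", "point 2"]
                }
              ],
              "riskPoints": ["description of a risk"],
              "keyObligations": ["key obligation"],
              "overallAssessment": "overall assessment"
            }
            """;

        private static final String USER_PROMPT_TEMPLATE =
                "Analyze the following contract, extract the key information, and output it as JSON:\n\n%s";

        // Simple character-based cap; anything beyond it is not seen by the model
        private static final int MAX_CONTRACT_CHARS = 15000;

        public String getSystemPrompt() {
            return SYSTEM_PROMPT;
        }

        public String buildUserPrompt(String contractText) {
            if (contractText.length() > MAX_CONTRACT_CHARS) {
                contractText = contractText.substring(0, MAX_CONTRACT_CHARS)
                        + "\n... [content truncated]";
            }
            return String.format(USER_PROMPT_TEMPLATE, contractText);
        }
    }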
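One detail of character-based truncation is that `substring` counts UTF-16 code units, so a cut can land in the middle of a surrogate pair (an emoji or a rare CJK character) and leave a dangling half character at the end of the prompt. A minimal sketch of a guard follows; `PromptTruncation` is a hypothetical helper, not part of the original template class.

```java
/** Hypothetical helper: truncates text without splitting a surrogate pair. */
final class PromptTruncation {
    private PromptTruncation() {}

    static String truncate(String text, int maxChars, String marker) {
        if (maxChars < 1) {
            throw new IllegalArgumentException("maxChars must be at least 1");
        }
        if (text.length() <= maxChars) {
            return text;
        }
        int cut = maxChars;
        // If the last kept unit is the first half of a pair, drop it as well
        if (Character.isHighSurrogate(text.charAt(cut - 1))) {
            cut--;
        }
        return text.substring(0, cut) + marker;
    }
}
```

Truncation still discards everything after the cut, so clauses near the end of a long contract are not analyzed at all.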
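7. Best Practices and Performance Optimization
7.1 Model Selection Guidance
| Scenario | Recommended models | Rationale |
|------|---------|------|
| General contract summarization | GPT-4 / Claude-3 | Strong general capability, accurate comprehension |
| Chinese-language contracts | Tongyi Qianwen (Qwen) / GLM-4 | Better Chinese comprehension, cost-effective |
| Long contracts | GPT-4-32K / Claude-100K | Longer context window |
| Real-time streaming | GPT-4 / Tongyi Qianwen (Qwen) | Good streaming experience |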
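7.2 Performance Optimization Strategies

    @Configuration
    public class PerformanceConfig {

        @Value("${llm.api-key:}")
        private String apiKey;

        /** Prototype scope: a new instance for every injection point, avoiding shared state. */
        @Bean
        @Scope("prototype")
        public ChatModel chatModelPrototype() {
            return OpenAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName("gpt-4")
                    .build();
        }

        /**
         * Singleton: one cached instance, suitable for stateless models.
         * Two beans of the same type need @Primary (or @Qualifier at the injection point)
         * to avoid an ambiguous-dependency error.
         */
        @Bean
        @Primary
        public ChatModel chatModelCached() {
            return OpenAiChatModel.builder()
                    .apiKey(apiKey)
                    .modelName("gpt-4")
                    .build();
        }
    }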
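7.3 Error Handling and Retry

    // Requires @EnableRetry, and the method must be called through the Spring proxy
    // (that is, from another bean) for the retry advice to apply.
    @Retryable(
            // summarize() wraps every failure in ContractSummarizationException,
            // so that is the type the retry policy has to match
            value = {ContractSummarizationException.class},
            maxAttempts = 3,
            backoff = @Backoff(delay = 1000, multiplier = 2)
    )
    public ContractSummary summarizeWithRetry(String contractText) {
        return summarize(contractText);
    }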
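With `maxAttempts = 3`, `delay = 1000`, and `multiplier = 2`, there are at most three calls in total, separated by waits of 1000 ms and 2000 ms. The sketch below computes that schedule; `BackoffSchedule` is a hypothetical helper written only to make the arithmetic explicit, and it ignores options such as a maximum delay or jitter.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that lists the waits between attempts for exponential backoff. */
final class BackoffSchedule {
    private BackoffSchedule() {}

    static List<Long> delaysMillis(int maxAttempts, long initialDelayMs, double multiplier) {
        List<Long> delays = new ArrayList<>();
        double delay = initialDelayMs;
        // maxAttempts counts the first call too, so there are maxAttempts - 1 waits
        for (int retry = 1; retry < maxAttempts; retry++) {
            delays.add(Math.round(delay));
            delay *= multiplier;
        }
        return delays;
    }
}
```

For the settings above the worst case adds three seconds of waiting on top of the three model calls, which matters when the endpoint is user-facing.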
8. Chapter Summary
8.1 Key Points
1. LangChain4j architecture: the ChatModel abstraction provides one interface across multiple providers
2. Configuration management: a configuration class switches between models based on properties
3. Prompt engineering: structured prompt templates improve output quality
4. Streaming: a streaming response handler delivers tokens to the client as they are generated
5. Token accounting: token usage is tracked per session and converted into a cost estimate
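8.2 What Comes Next
The next chapter covers **designing and implementing a Drools rule engine**. You will learn how to:
- Integrate the Drools rule engine
- Define and manage compliance rules
- Implement rule matching and conflict detection
- Combine the rule engine with the contract review workflow

*End of chapter*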