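Java Programmer Stage 42, Chapter 20: Document-Parsing and Review LLM for Contract Summarization and Compliance Checks: Production Launch and Continuous Iteration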

布吉岛的石头

布吉岛的石头 · Published 2026-06-13 09:30:00

──────────────────────────────────────────────────

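Chapter Overview

20.1 Learning Objectives

This chapter walks through the technology stack for taking the contract review system to production and iterating on it. By the end you should be able to:

Draw up and execute a production acceptance test checklist
Use canary releases and rollback strategies to reduce release risk
Understand the Prometheus + Grafana monitoring and alerting stack
Understand the ELK log collection and analysis architecture
Configure a Jenkins/GitHub Actions continuous deployment pipeline

20.2 Background

Once the intelligent contract review system is in production, it needs four supporting systems:

**Quality assurance**: multi-dimensional acceptance testing to confirm the system is stable
**Release control**: canary releases to limit risk, fast rollback to preserve availability
**Operations monitoring**: real-time monitoring and alerting to locate problems quickly
**Continuous iteration**: automated pipelines to improve delivery efficiency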

──────────────────────────────────────────────────

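Production Acceptance Test Checklist

20.2.1 Functional Acceptance Testing

# Contract review system: production acceptance test checklist

## 1. Functional acceptance tests

### 1.1 Contract upload
- [ ] PDF upload is supported, with a single file no larger than 100 MB
- [ ] Word (doc/docx) upload is supported
- [ ] Image (jpg/png) upload is supported
- [ ] Upload progress is displayed correctly for large files
- [ ] Failed uploads show a clear error message
- [ ] 10 concurrent uploads succeed 100% of the time

### 1.2 Contract parsing
- [ ] PDF parsing completeness > 99%
- [ ] Text extraction accuracy > 98%
- [ ] Table structure is recognized correctly
- [ ] Contract elements (Party A, Party B, amount, dates) are extracted accurately
- [ ] Parsing time: single-page PDF < 2 seconds

### 1.3 AI summary generation
- [ ] Summary length stays within 200-500 characters
- [ ] Summary content is consistent with the original contract
- [ ] No key clause is omitted
- [ ] Generation time < 10 seconds
- [ ] Chinese and English contracts are both supported

### 1.4 Compliance checks
- [ ] Risk clause detection accuracy > 95%
- [ ] Regulations are cited correctly
- [ ] Check recommendations are practical and actionable
- [ ] Custom check rules are supported
- [ ] Check results can be exported

### 1.5 User permissions
- [ ] User registration and login work
- [ ] Role permission settings take effect
- [ ] Operation audit logs are complete
- [ ] Single sign-on (SSO) works
- [ ] Token expiry is handled correctly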
20.2.2 Performance Acceptance Testing

# Performance test script using Apache Bench
# Concurrency test
ab -n 1000 -c 100 http://contract-api.example.com/actuator/health

# Performance test script using JMeter
# jmeter -n -t contract_api_test.jmx -l result.jtl

# Performance test cases
echo "
========================================
Performance acceptance criteria
========================================
Metric                          Target        Method
----------------------------------------
Home page response time         < 1 s         average of 10 runs
Contract upload response time   < 3 s         average of 10 runs
Contract parsing response time  < 5 s/page    average of 10 runs
Summary generation time         < 10 s        average of 10 runs
Concurrent users                > 100         success rate > 99%
System throughput               > 50 QPS      10-minute load test
CPU utilization                 < 70%         at peak
Memory utilization              < 80%         at peak
"

# Performance test report template
performance_test_report:
  test_info:
    test_date: "2024-01-15"
    test_environment: "Production"
    test_tool: "Apache JMeter 5.6"
    test_duration: "30 minutes"

  test_results:
    concurrent_users:
      scenario: "Simulate 100 users operating simultaneously"
      duration: 1800
      total_requests: 45000
      successful_requests: 44955
      failed_requests: 45
      success_rate: "99.90%"
      avg_response_time: "1.2 s"
      p95_response_time: "2.5 s"
      p99_response_time: "4.1 s"

    sustained_load:
      scenario: "Sustained load at 80% of capacity"
      duration: 1800
      avg_cpu: "55%"
      avg_memory: "62%"
      avg_qps: 85
      error_rate: "0.05%"

  conclusion: "Passed performance acceptance"
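
The success rate in the report template can be recomputed from the raw request counts and compared with the "success rate > 99%" criterion. The following is a minimal sketch, not part of the original toolchain: the helper names `success_rate` and `exceeds` are hypothetical, and the counts are the ones from the template above.

```shell
#!/bin/sh
# Hypothetical helpers for checking a load-test run against the acceptance criteria.

# success_rate TOTAL FAILED -> prints the success percentage with two decimals
success_rate() {
    awk -v total="$1" -v failed="$2" \
        'BEGIN { printf "%.2f", (total - failed) * 100 / total }'
}

# exceeds VALUE THRESHOLD -> exit status 0 when VALUE > THRESHOLD
exceeds() {
    awk -v value="$1" -v threshold="$2" 'BEGIN { exit !(value > threshold) }'
}

# Counts taken from the concurrent_users scenario in the report template
rate=$(success_rate 45000 45)
if exceeds "$rate" 99; then
    echo "success rate ${rate}% passes the > 99% criterion"
else
    echo "success rate ${rate}% fails the > 99% criterion"
fi
```

Using awk keeps the arithmetic in floating point, which plain shell integer arithmetic cannot do.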

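20.2.3 Security Acceptance Testing

# Security scan commands
# 1. Dependency vulnerability scan (the plugin artifact is dependency-check-maven)
./mvnw org.owasp:dependency-check-maven:check

# 2. Static code security scan
./mvnw spotbugs:check

# 3. Container image scan
trivy image contract-api:1.0.0

# 4. Kubernetes cluster security scan
kube-bench run --targets=master,node

# Security test cases
echo "
========================================
Security acceptance test checklist
========================================

[ ] SQL injection protection
    - Input: ' OR 1=1 --
    - Expected: request is rejected or the parameter is escaped

[ ] XSS protection
    - Input: <script>alert('xss')</script>
    - Expected: the script is not executed

[ ] CSRF token validation
    - Expected: requests without a token are rejected

[ ] Authentication
    - Expected: account is locked after 5 wrong passwords

[ ] Sensitive data encryption
    - Expected: passwords are stored encrypted in the database

[ ] Forced HTTPS redirect
    - Expected: HTTP requests are redirected to HTTPS

[ ] Rate limiting on sensitive endpoints
    - Expected: HTTP 429 is returned once the limit is exceeded
"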
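
Several items in the checklist reduce to "send a request, compare the HTTP status code". A small sketch of that comparison follows. The function names `expect_status` and `http_status` and the `BASE_URL` placeholder are assumptions for illustration, and the curl flags used (`-s`, `-o`, `-w '%{http_code}'`) are standard curl options.

```shell
#!/bin/sh
# Hypothetical helpers for status-code based security checks.

# expect_status LABEL EXPECTED ACTUAL -> prints PASS/FAIL, returns non-zero on mismatch
expect_status() {
    if [ "$3" = "$2" ]; then
        echo "PASS $1 (HTTP $3)"
    else
        echo "FAIL $1: expected HTTP $2, got $3"
        return 1
    fi
}

# http_status URL -> prints only the response status code
http_status() {
    curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Usage against a real deployment (BASE_URL is a placeholder, not a value from the article):
#   expect_status "rate limit" 429 "$(http_status "$BASE_URL/actuator/health")"
```

Keeping the comparison separate from the request makes the check easy to exercise without network access.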

──────────────────────────────────────────────────

Canary Release and Rollback Strategy

20.3.1 Canary Release Flow

# canary-deployment.yaml
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: contract-api
  namespace: production
spec:
  replicas: 10
  strategy:
    canary:
      # Stepped canary rollout
      steps:
        - setWeight: 10
        - pause: {duration: 10m}
        - setWeight: 30
        - pause: {duration: 10m}
        - setWeight: 50
        - pause: {duration: 10m}
        - setWeight: 100

      # Canary analysis
      analysis:
        templates:
          - templateName: success-rate
        args:
          - name: service-name
            value: contract-api-canary

      # Labels applied to canary and stable pods (automatic rollback is driven by the analysis above)
      canaryMetadata:
        labels:
          role: canary
      stableMetadata:
        labels:
          role: stable

      # Traffic routing (weights are applied through the NGINX ingress)
      trafficRouting:
        nginx:
          stableIngress: contract-api-stable
          additionalIngressAnnotations:
            canary-by-header: X-Canary

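20.3.2 Rollback Strategy Configuration

# rollback-strategy.yaml
# Note: MetricTemplate and AlertProvider are Flagger resources. The Argo Rollouts
# Rollout in 20.3.1 resolves templateName against an Argo AnalysisTemplate, so it
# will not pick up this template as written. Also note that the query returns
# P99 latency in seconds, even though the template is named success-rate.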
apiVersion: flagger.app/v1beta1
kind: MetricTemplate
metadata:
  name: success-rate
  namespace: production
spec:
  provider:
    type: prometheus
    address: http://prometheus.monitor:9090
  query: |
    histogram_quantile(0.99,
      sum(rate(nginx_ingress_controller_request_duration_seconds_bucket{
        ingress="{{.Name}}"
      }[5m])) by (le)
    )
---
apiVersion: flagger.app/v1beta1
kind: AlertProvider
metadata:
  name: slack
  namespace: production
spec:
  type: slack
  channel: "#contract-alerts"
  webhook: https://hooks.slack.com/services/xxx

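20.3.3 Rollback Commands

# ===================================================================
# Rollback guide
# These commands operate on a Deployment named contract-api. If the
# workload is managed by the Argo Rollouts resource from 20.3.1, use the
# kubectl argo rollouts plugin instead.
# ===================================================================

# 1. View the deployment history
kubectl rollout history deployment/contract-api -n production

# 2. View the details of a specific revision
kubectl rollout history deployment/contract-api -n production --revision=3

# 3. Roll back to the previous revision
kubectl rollout undo deployment/contract-api -n production

# 4. Roll back to a specific revision
kubectl rollout undo deployment/contract-api -n production --to-revision=2

# 5. Check the rollback status
kubectl rollout status deployment/contract-api -n production

# 6. Verify the Pods after the rollback
kubectl get pods -n production -l app=contract-api

# 7. Test the service after the rollback
curl http://contract-api.production.svc.cluster.local/actuator/health

# 8. Emergency rollback script
#!/bin/bash
echo "Starting emergency rollback..."
kubectl rollout undo deployment/contract-api -n production
echo "Waiting for the rollback to finish..."
kubectl rollout status deployment/contract-api -n production --timeout=300s
echo "Verifying service status..."
# Use the fully qualified service name so the check works from any namespace
curl -f http://contract-api.production.svc.cluster.local/actuator/health || exit 1
echo "Rollback complete!"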
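
The emergency script checks health exactly once, right after the rollout finishes, so a service that needs a few more seconds to warm up would be reported as failed. One possible refinement is to poll with a bounded number of attempts. The `retry` helper below is a hypothetical sketch, not part of the original script.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted; returns 0 on success, 1 otherwise.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt ${i}/${attempts} failed" >&2
        # Do not sleep after the final attempt
        if [ "$i" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        i=$((i + 1))
    done
    return 1
}

# Usage (same health URL as step 7 above):
#   retry 10 6 curl -fsS http://contract-api.production.svc.cluster.local/actuator/health || exit 1
```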

──────────────────────────────────────────────────

Prometheus + Grafana Monitoring and Alerting Configuration

20.4.1 Prometheus Configuration

# prometheus-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitor
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
      evaluation_interval: 15s
      external_labels:
        cluster: 'production'
        env: 'prod'

    alerting:
      alertmanagers:
        - static_configs:
            - targets:
              - alertmanager.monitor.svc.cluster.local:9093

    rule_files:
      - "/etc/prometheus/rules/*.yml"

    scrape_configs:
      # Prometheus self-monitoring
      - job_name: 'prometheus'
        static_configs:
          - targets: ['localhost:9090']

      # Kubernetes API Server
      - job_name: 'kubernetes-apiservers'
        kubernetes_sd_configs:
          - role: endpoints
        scheme: https
        tls_config:
          ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
        relabel_configs:
          - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
            action: keep
            regex: default;kubernetes;https

      # Kubernetes Pods
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
          - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
            action: replace
            regex: ([^:]+)(?::\d+)?;(\d+)
            replacement: $1:$2
            target_label: __address__
          - action: labelmap
            regex: __meta_kubernetes_pod_label_(.+)

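      # Contract API application
      - job_name: 'contract-api'
        # Spring Boot Actuator serves Prometheus metrics at this path, not at the default /metrics
        metrics_path: /actuator/prometheus
        kubernetes_sd_configs:
          - role: service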
        relabel_configs:
          - source_labels: [__meta_kubernetes_service_label_app]
            action: keep
            regex: contract-api
          - source_labels: [__meta_kubernetes_service_label_monitor]
            action: keep
            regex: enabled
          - action: labelmap
            regex: __meta_kubernetes_service_label_(.+)

20.4.2 Alert Rule Configuration

# prometheus-alerts.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-alerts
  namespace: monitor
data:
  # Application alert rules
  contract-api-alerts.yml: |
    groups:
      - name: contract-api
        interval: 30s
        rules:
          # High error rate alert
          - alert: ContractAPIHighErrorRate
            expr: |
              sum(rate(http_server_requests_seconds_count{
                job="contract-api",
                status=~"5.."
              }[5m])) /
              sum(rate(http_server_requests_seconds_count{
                job="contract-api"
              }[5m])) > 0.01
            for: 5m
            labels:
              severity: critical
              team: contract
            annotations:
              summary: "Contract API error rate is too high"
              description: "Contract API 5xx error rate is above 1%, current value: {{ $value | humanizePercentage }}"
              runbook_url: "https://wiki.example.com/runbooks/high-error-rate"

          # High latency alert
          - alert: ContractAPIHighLatency
            expr: |
              histogram_quantile(0.95,
                sum(rate(http_server_requests_seconds_bucket{
                  job="contract-api",
                  uri!="/actuator/health"
                }[5m])) by (le, uri)
              ) > 2
            for: 5m
            labels:
              severity: warning
              team: contract
            annotations:
              summary: "Contract API latency is too high"
              description: "P95 latency is above 2 seconds, current value: {{ $value | humanizeDuration }}"

          # JVM heap memory alert
          - alert: ContractAPIJVMHeapUsage
            expr: |
              jvm_memory_used_bytes{job="contract-api", area="heap"} /
              jvm_memory_max_bytes{job="contract-api", area="heap"} > 0.85
            for: 10m
            labels:
              severity: warning
              team: contract
            annotations:
              summary: "JVM heap usage is too high"
              description: "JVM heap usage is above 85%, current value: {{ $value | humanizePercentage }}"

          # Database connection pool alert
          - alert: ContractAPIDBPoolExhausted
            expr: |
              hikaricp_connections_active{pool="HikariPool-1"} /
              hikaricp_connections_max{pool="HikariPool-1"} > 0.9
            for: 5m
            labels:
              severity: critical
              team: contract
            annotations:
              summary: "Database connection pool is nearly exhausted"
              description: "Active connections exceed 90% of the pool maximum"

          # AI model call failure alert
          - alert: ContractAPIAIFailureRate
            expr: |
              sum(rate(ai_model_requests_total{
                job="contract-api",
                status="error"
              }[5m])) /
              sum(rate(ai_model_requests_total{
                job="contract-api"
              }[5m])) > 0.05
            for: 5m
            labels:
              severity: critical
              team: contract
            annotations:
              summary: "AI model call failure rate is too high"
              description: "AI model call failure rate is above 5%, current value: {{ $value | humanizePercentage }}"

          # Service down alert
          - alert: ContractAPIServiceDown
            expr: |
              up{job="contract-api"} == 0
            for: 1m
            labels:
              severity: critical
              team: contract
            annotations:
              summary: "Contract API service is unavailable"
              description: "The Contract API service has been down for more than 1 minute"

  # Kubernetes cluster alert rules
  k8s-alerts.yml: |
    groups:
      - name: kubernetes
        interval: 30s
        rules:
          # Pod CPU usage too high
          - alert: K8SPodCPUUsageHigh
            expr: |
              sum(rate(container_cpu_usage_seconds_total{
                namespace="production",
                pod=~"contract-api-.*"
              }[5m])) by (pod) > 1.8
            for: 10m
            labels:
              severity: warning
            annotations:
              summary: "Pod CPU usage is too high"
              description: "Pod {{ $labels.pod }} is using more than 1.8 CPU cores (90% of an assumed 2-core limit), current value: {{ $value | humanize }} cores"

          # Pod memory usage too high
          - alert: K8SPodMemoryUsageHigh
            expr: |
              container_memory_working_set_bytes{
                namespace="production",
                pod=~"contract-api-.*"
              } / container_spec_memory_limit_bytes{
                namespace="production",
                pod=~"contract-api-.*"
              } > 0.85
            for: 10m
            labels:
              severity: warning
            annotations:
              summary: "Pod memory usage is too high"
              description: "Pod {{ $labels.pod }} memory usage is above 85% of its limit"

          # Pod restarting too often
          - alert: K8SPodRestartingTooMuch
            expr: |
              increase(kube_pod_container_status_restarts_total{
                namespace="production",
                pod=~"contract-api-.*"
              }[1h]) > 3
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "Pod is restarting too often"
              description: "Pod {{ $labels.pod }} restarted more than 3 times in the past hour"

          # HPA at maximum replicas
          - alert: K8SHPAAtMaxReplicas
            expr: |
              kube_horizontalpodautoscaler_status_current_replicas{
                namespace="production",
                name="contract-api-hpa"
              } >=
              kube_horizontalpodautoscaler_spec_max_replicas{
                namespace="production",
                name="contract-api-hpa"
              }
            for: 5m
            labels:
              severity: warning
            annotations:
              summary: "HPA has reached its maximum replica count"
              description: "The Contract API HPA has reached its maximum of {{ $value }} replicas; check the current load"

20.4.3 Grafana Dashboard Configuration

# grafana-dashboard.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: grafana-dashboard-contract-api
  namespace: monitor
data:
  contract-api-dashboard.json: |
    {
      "dashboard": {
        "title": "Contract Review System Monitoring",
        "uid": "contract-api",
        "timezone": "Asia/Shanghai",
        "panels": [
          {
            "title": "Service Health",
            "type": "stat",
            "gridPos": {"h": 4, "w": 6},
            "targets": [
              {
                "expr": "up{job='contract-api'}",
                "legendFormat": "{{pod}}"
              }
            ],
            "fieldConfig": {
              "defaults": {
                "mappings": [
                  {"type": "value", "options": {"0": {"text": "Down", "color": "red"}}},
                  {"type": "value", "options": {"1": {"text": "Up", "color": "green"}}}
                ]
              }
            }
          },
          {
            "title": "QPS",
            "type": "graph",
            "gridPos": {"h": 8, "w": 12},
            "targets": [
              {
                "expr": "sum(rate(http_server_requests_seconds_count{job='contract-api'}[1m]))",
                "legendFormat": "Total QPS"
              },
              {
                "expr": "sum(rate(http_server_requests_seconds_count{job='contract-api', status=~'2..'}[1m]))",
                "legendFormat": "Successful QPS"
              },
              {
                "expr": "sum(rate(http_server_requests_seconds_count{job='contract-api', status=~'5..'}[1m]))",
                "legendFormat": "Error QPS"
              }
            ]
          },
          {
            "title": "Response Time P50/P95/P99",
            "type": "graph",
            "gridPos": {"h": 8, "w": 12},
            "targets": [
              {
                "expr": "histogram_quantile(0.95, sum(rate(http_server_requests_seconds_bucket{job='contract-api'}[5m])) by (le))",
                "legendFormat": "P95"
              },
              {
                "expr": "histogram_quantile(0.99, sum(rate(http_server_requests_seconds_bucket{job='contract-api'}[5m])) by (le))",
                "legendFormat": "P99"
              },
              {
                "expr": "histogram_quantile(0.50, sum(rate(http_server_requests_seconds_bucket{job='contract-api'}[5m])) by (le))",
                "legendFormat": "P50"
              }
            ]
          },
          {
            "title": "JVM Memory Usage",
            "type": "graph",
            "gridPos": {"h": 8, "w": 12},
            "targets": [
              {
                "expr": "jvm_memory_used_bytes{job='contract-api', area='heap'} / 1024 / 1024 / 1024",
                "legendFormat": "Heap used ({{pod}})"
              },
              {
                "expr": "jvm_memory_max_bytes{job='contract-api', area='heap'} / 1024 / 1024 / 1024",
                "legendFormat": "Heap max ({{pod}})"
              }
            ]
          },
          {
            "title": "AI Model Call Statistics",
            "type": "graph",
            "gridPos": {"h": 8, "w": 12},
            "targets": [
              {
                "expr": "sum(rate(ai_model_requests_total{job='contract-api'}[5m])) by (type)",
                "legendFormat": "{{type}}"
              }
            ]
          },
          {
            "title": "Database Connection Pool",
            "type": "graph",
            "gridPos": {"h": 8, "w": 12},
            "targets": [
              {
                "expr": "hikaricp_connections_active{pool='HikariPool-1'}",
                "legendFormat": "Active connections"
              },
              {
                "expr": "hikaricp_connections_idle{pool='HikariPool-1'}",
                "legendFormat": "Idle connections"
              },
              {
                "expr": "hikaricp_connections_pending{pool='HikariPool-1'}",
                "legendFormat": "Pending requests"
              }
            ]
          }
        ]
      }
    }

──────────────────────────────────────────────────

ELK Log Collection and Analysis

20.5.1 Filebeat Configuration

# filebeat-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: filebeat-config
  namespace: monitor
data:
  filebeat.yml: |
    filebeat.inputs:
      # Application logs
      - type: container
        paths:
          - /var/log/containers/contract-api-*.log
        processors:
          - add_kubernetes_metadata:
              host: ${NODE_NAME}
              matchers:
                - logs_path:
                    logs_path: "/var/log/containers/"
          - add_fields:
              target: ''
              fields:
                service: contract-api
                environment: production
        json.keys_under_root: true
        json.add_error_key: true
        json.message_key: message

      # System logs
      - type: log
        paths:
          - /var/log/syslog
        fields:
          service: syslog
          environment: production

    processors:
      - add_host_metadata:
          cloud: auto
      - add_cloud_metadata: ~
      - add_docker_metadata: ~
      - decode_json_fields:
          fields: ["message"]
          target: ""
          overwrite_keys: true
          add_error_key: true
      - drop_event:
          when:
            regexp:
              message: "^\\s+$"

    output.logstash:
      hosts: ["logstash.monitor.svc.cluster.local:5044"]
      ssl.enabled: false

    logging.level: info
    logging.to_files: true
    logging.files:
      path: /var/log/filebeat
      name: filebeat
      keepfiles: 7
      permissions: 0640

20.5.2 Logstash Configuration

# logstash-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: logstash-config
  namespace: monitor
data:
  logstash.yml: |
    http.host: "0.0.0.0"
    xpack.monitoring.elasticsearch.hosts: ["http://elasticsearch.monitor.svc.cluster.local:9200"]
    pipeline.workers: 4
    pipeline.batch.size: 125

  # Logstash pipeline configuration
  contract-api.conf: |
    input {
      beats {
        port => 5044
        codec => json
      }
    }

    filter {
      # Application log processing
      if [service] == "contract-api" {
        # Parse the timestamp
        date {
          match => ["timestamp", "ISO8601"]
          target => "@timestamp"
        }

        # Extract the log level
        grok {
          match => { "message" => "%{WORD:level}\s+%{DATA:class}\s+-\s+%{GREEDYDATA:log_message}" }
          tag_on_failure => ["_grokparsefailure"]
        }

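        # Truncate exception stack traces
        if [stack_trace] {
          ruby {
            code => '
              stack_trace = event.get("stack_trace")
              if stack_trace
                # Limit the stack length to the first 11 lines.
                # "\n" must be double-quoted in Ruby to mean a newline.
                lines = stack_trace.split("\n")[0..10]
                event.set("stack_trace", lines.join("\n"))
              end
            '
          }
        }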

        # Add the index prefix
        mutate {
          add_field => { "index_prefix" => "contract-api" }
        }
      }

      # System log processing
      if [service] == "syslog" {
        grok {
          match => { "message" => "%{SYSLOGTIMESTAMP:syslog_timestamp} %{SYSLOGHOST:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" }
        }
        date {
          match => ["syslog_timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss"]
          target => "@timestamp"
        }
        mutate {
          add_field => { "index_prefix" => "syslog" }
        }
      }

      # Common processing
      mutate {
        add_field => { "[@metadata][index_date]" => "%{+YYYY.MM.dd}" }
      }
    }

    output {
      elasticsearch {
        hosts => ["http://elasticsearch.monitor.svc.cluster.local:9200"]
        # document_type is omitted: mapping types are deprecated in recent Elasticsearch versions
        index => "%{index_prefix}-%{[@metadata][index_date]}"
      }
    }
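
The `index` setting above produces one index per service per day, for example `contract-api-2026.06.13`, and Logstash formats `%{+YYYY.MM.dd}` from the event's `@timestamp` in UTC. When querying or cleaning up indices by hand it helps to rebuild the same name. The helper below is a hypothetical sketch of that naming rule, not something the pipeline itself runs.

```shell
#!/bin/sh
# index_name PREFIX DATE -> prints PREFIX-YYYY.MM.dd
# DATE is expected as YYYY-MM-DD in UTC.
index_name() {
    printf '%s-%s\n' "$1" "$(printf '%s' "$2" | sed 's/-/./g')"
}

# Name of today's application-log index (UTC date)
index_name contract-api "$(date -u +%Y-%m-%d)"
```

The result can be substituted into the `_cat/indices` style queries used in the next section.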

20.5.3 Elasticsearch Index Configuration

# Create the index template
curl -X PUT "http://elasticsearch.monitor.svc.cluster.local:9200/_index_template/contract-api" \
  -H "Content-Type: application/json" \
  -d '{
  "index_patterns": ["contract-api-*"],
  "template": {
    "settings": {
      "number_of_shards": 3,
      "number_of_replicas": 1,
      "index.refresh_interval": "5s",
      "index.lifecycle.name": "contract-api-policy",
      "index.lifecycle.rollover_alias": "contract-api"
    },
    "mappings": {
      "properties": {
        "@timestamp": {
          "type": "date"
        },
        "level": {
          "type": "keyword"
        },
        "class": {
          "type": "keyword"
        },
        "message": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "service": {
          "type": "keyword"
        },
        "environment": {
          "type": "keyword"
        },
        "trace_id": {
          "type": "keyword"
        },
        "span_id": {
          "type": "keyword"
        },
        "pod": {
          "type": "keyword"
        },
        "namespace": {
          "type": "keyword"
        }
      }
    }
  }
}'

# List the indices
curl -X GET "http://elasticsearch.monitor.svc.cluster.local:9200/_cat/indices/contract-api-*?v"

# Check index health
curl -X GET "http://elasticsearch.monitor.svc.cluster.local:9200/_cluster/health?index=contract-api-*"

──────────────────────────────────────────────────

Continuous Integration and Continuous Deployment

20.6.1 GitHub Actions Workflow

# .github/workflows/ci-cd.yml
name: Contract API CI/CD Pipeline

on:
  push:
    branches: [main, develop, 'release/*']
  pull_request:
    branches: [main]
  workflow_dispatch:

env:
  IMAGE_NAME: contract-api
  REGISTRY: registry.example.com
  HELM_chart: ./chart

jobs:
  # ===================================================================
  # Stage 1: code quality checks
  # ===================================================================
  code-quality:
    name: Code Quality Check
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Set up JDK 17
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
          cache: 'maven'

      - name: Cache Maven packages
        uses: actions/cache@v3
        with:
          path: ~/.m2/repository
          key: ${{ runner.os }}-m2-${{ hashFiles('**/pom.xml') }}
          restore-keys: ${{ runner.os }}-m2

      - name: Check code format
        run: ./mvnw spotless:check

      - name: SpotBugs Scan
        run: ./mvnw spotbugs:check

      - name: OWASP Dependency Check
        run: ./mvnw org.owasp:dependency-check-maven:check
        continue-on-error: true

      - name: Upload Dependency Check Report
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: dependency-check-report
          path: target/dependency-check-report.html

  # ===================================================================
  # Stage 2: unit and integration tests
  # ===================================================================
  test:
    name: Unit and Integration Tests
    runs-on: ubuntu-latest
    services:
      mysql:
        image: mysql:8.0
        env:
          MYSQL_ROOT_PASSWORD: test_password
          MYSQL_DATABASE: contract_test
        options: >-
          --health-cmd="mysqladmin ping"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5
        ports:
          - 3306:3306

      redis:
        image: redis:7-alpine
        options: >-
          --health-cmd="redis-cli ping"
          --health-interval=10s
          --health-timeout=5s
          --health-retries=5
        ports:
          - 6379:6379

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up JDK 17
        uses: actions/setup-java@v4
        with:
          java-version: '17'
          distribution: 'temurin'
          cache: 'maven'

      - name: Run Unit Tests
        run: ./mvnw test -Dspring.profiles.active=test

      - name: Run Integration Tests
        run: ./mvnw verify -Dspring.profiles.active=integration

      - name: Upload Test Results
        uses: actions/upload-artifact@v4
        if: always()
        with:
          name: test-results
          path: '**/target/surefire-reports/*.xml'

      - name: Upload Coverage Reports
        uses: codecov/codecov-action@v3
        with:
          files: '**/target/site/jacoco/jacoco.xml'
          fail_ci_if_error: false

  # ===================================================================
  # Stage 3: build and push the Docker image
  # ===================================================================
  build-and-push:
    name: Build and Push Docker Image
    runs-on: ubuntu-latest
    needs: [code-quality, test]
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    outputs:
      # The `tags` output is a multi-line list of full image references and cannot be
      # passed to `--set image.tag`. `version` is the single highest-priority tag,
      # which the priority setting below makes the branch-sha tag.
      image-tag: ${{ steps.meta.outputs.version }}

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      - name: Login to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ secrets.REGISTRY_USER }}
          password: ${{ secrets.REGISTRY_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=ref,event=branch
            type=sha,prefix={{branch}}-,priority=1000
            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push with cache
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            BUILD_VERSION=${{ github.sha }}
            BUILD_DATE=${{ github.event.head_commit.timestamp }}

  # ===================================================================
  # Stage 4: deploy to the test environment
  # ===================================================================
  deploy-test:
    name: Deploy to Test Environment
    runs-on: ubuntu-latest
    needs: build-and-push
    environment: test

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Helm
        uses: azure/setup-helm@v3
        with:
          version: '3.13.0'

      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBE_CONFIG_TEST }}

      - name: Deploy to Test
        run: |
          helm upgrade --install contract-api ${{ env.HELM_chart }} \
            --namespace test \
            --create-namespace \
            --set image.repository=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} \
            --set image.tag=${{ needs.build-and-push.outputs.image-tag }} \
            --wait --timeout 10m \
            --atomic \
            --cleanup-on-fail

      - name: Verify deployment
        run: |
          kubectl rollout status deployment/contract-api -n test --timeout=300s
          kubectl get pods -n test -l app=contract-api

      - name: Run Smoke Tests
        run: |
          # A ClusterIP is not reachable from a GitHub-hosted runner, so tunnel through the API server
          kubectl port-forward svc/contract-api 8080:8080 -n test &
          sleep 30
          curl -f http://localhost:8080/actuator/health || exit 1

  # ===================================================================
  # Stage 5: deploy to production (requires manual approval)
  # ===================================================================
  deploy-production:
    name: Deploy to Production Environment
    runs-on: ubuntu-latest
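    # build-and-push must be listed here too: the needs context only exposes outputs of direct dependencies
    needs: [build-and-push, deploy-test]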
    environment:
      name: production
      url: https://contract-api.example.com
    if: github.ref == 'refs/heads/main'

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Helm
        uses: azure/setup-helm@v3
        with:
          version: '3.13.0'

      - name: Configure kubectl
        uses: azure/k8s-set-context@v3
        with:
          kubeconfig: ${{ secrets.KUBE_CONFIG_PROD }}

      - name: Backup current deployment
        run: |
          kubectl get deployment contract-api -n production -o yaml > /tmp/backup-$(date +%Y%m%d%H%M%S).yaml

      - name: Deploy to Production
        run: |
          helm upgrade --install contract-api ${{ env.HELM_chart }} \
            --namespace production \
            --create-namespace \
            --set image.repository=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }} \
            --set image.tag=${{ needs.build-and-push.outputs.image-tag }} \
            --wait --timeout 15m \
            --atomic \
            --cleanup-on-fail

      - name: Verify deployment
        run: |
          kubectl rollout status deployment/contract-api -n production --timeout=600s
          kubectl get pods -n production -l app=contract-api

      - name: Run Production Smoke Tests
        run: |
          sleep 60
          curl -f https://contract-api.example.com/actuator/health || exit 1
          curl -f https://contract-api.example.com/api/v1/contract/health || exit 1

      - name: Notify on Slack
        if: always()
        uses: slackapi/slack-github-action@v1
        with:
          channel-id: 'C0123456789'
          payload: |
            {
              "text": "Contract API deployment result: ${{ job.status }}",
              "attachments": [{
                "color": "${{ job.status == 'success' && '#36a64f' || '#ff0000' }}",
                "fields": [
                  {"title": "Environment", "value": "Production", "short": true},
                  {"title": "Version", "value": "${{ needs.build-and-push.outputs.image-tag }}", "short": true},
                  {"title": "Commit", "value": "${{ github.sha }}", "short": true}
                ]
              }]
            }
        env:
          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
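
The `type=sha,prefix={{branch}}-` rule in the metadata step yields image tags of the form `<branch>-<short sha>`. When deploying by hand it can be useful to derive the same tag locally. The sketch below is an approximation under stated assumptions: a 7-character short SHA and `/` replaced by `-` in branch names. The action's exact sanitization rules may differ, and `image_tag` is a hypothetical helper.

```shell
#!/bin/sh
# image_tag BRANCH COMMIT_SHA -> prints BRANCH-SHORTSHA
image_tag() {
    # Assumption: slashes in branch names become dashes
    branch=$(printf '%s' "$1" | tr '/' '-')
    # Assumption: the short SHA is the first 7 characters
    short_sha=$(printf '%.7s' "$2")
    printf '%s-%s\n' "$branch" "$short_sha"
}

# Usage inside a git checkout:
#   image_tag "$(git rev-parse --abbrev-ref HEAD)" "$(git rev-parse HEAD)"
```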

20.6.2 Deployment Script

#!/bin/bash
# ===================================================================
# Production deployment script
# Usage: ./deploy-production.sh [VERSION] [ENVIRONMENT]
# Example: ./deploy-production.sh 1.0.0 production
# ===================================================================

set -e

VERSION=${1:-latest}
ENVIRONMENT=${2:-production}
NAMESPACE="production"
RELEASE_NAME="contract-api"
CHART_PATH="./chart"

# Color definitions
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
NC='\033[0m'

log_info() {
    echo -e "${GREEN}[INFO]${NC} $1"
}

log_warn() {
    echo -e "${YELLOW}[WARN]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

# Check prerequisites
check_prerequisites() {
    log_info "Checking prerequisites..."

    command -v kubectl >/dev/null 2>&1 || { log_error "kubectl is not installed"; exit 1; }
    command -v helm >/dev/null 2>&1 || { log_error "helm is not installed"; exit 1; }

    kubectl cluster-info >/dev/null 2>&1 || { log_error "Cannot connect to the Kubernetes cluster"; exit 1; }

    log_info "Prerequisite check passed"
}

# Back up the current deployment
backup_current_deployment() {
    log_info "Backing up the current deployment..."
    BACKUP_FILE="/tmp/backup-${RELEASE_NAME}-$(date +%Y%m%d%H%M%S).yaml"
    kubectl get deployment "${RELEASE_NAME}" -n "${NAMESPACE}" -o yaml > "${BACKUP_FILE}"
    log_info "Backup saved to: ${BACKUP_FILE}"
}

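# Run the deployment
deploy() {
    log_info "Deploying Contract API v${VERSION} to ${ENVIRONMENT}..."

    # `set -e` is not honored inside a function called from an `if` condition,
    # so propagate a helm failure explicitly.
    helm upgrade --install ${RELEASE_NAME} ${CHART_PATH} \
        --namespace ${NAMESPACE} \
        --create-namespace \
        --set image.tag=${VERSION} \
        --wait --timeout 15m \
        --atomic \
        --cleanup-on-fail || return 1

    log_info "Deployment command finished"
}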

# Verify the deployment
verify_deployment() {
    log_info "Verifying deployment status..."

    # Wait for the rolling update to finish
    kubectl rollout status deployment/${RELEASE_NAME} -n ${NAMESPACE} --timeout=600s || return 1

    # Check Pod status
    READY_PODS=$(kubectl get pods -n ${NAMESPACE} -l app=${RELEASE_NAME} -o jsonpath='{.items[*].status.conditions[?(@.type=="Ready")].status}')
    if [[ "$READY_PODS" != *"True"* ]]; then
        log_error "Pods are not ready"
        kubectl get pods -n ${NAMESPACE} -l app=${RELEASE_NAME}
        # Return instead of exit so that main can run the rollback
        return 1
    fi

    # Health check
    sleep 30
    HEALTH_STATUS=$(curl -sf http://${RELEASE_NAME}.${NAMESPACE}.svc.cluster.local:8080/actuator/health || echo "failed")
    if [[ "$HEALTH_STATUS" != *"UP"* ]]; then
        log_error "Health check failed"
        return 1
    fi

    log_info "Deployment verification passed"
}

# Send a notification
notify() {
    log_info "Sending deployment notification..."

    curl -X POST "${SLACK_WEBHOOK_URL}" \
        -H 'Content-Type: application/json' \
        -d "{
            \"text\": \"Contract API deployment finished\",
            \"attachments\": [{
                \"color\": \"#36a64f\",
                \"fields\": [
                    {\"title\": \"Environment\", \"value\": \"${ENVIRONMENT}\", \"short\": true},
                    {\"title\": \"Version\", \"value\": \"${VERSION}\", \"short\": true}
                ]
            }]
        }" 2>/dev/null || log_warn "Failed to send the notification"
}

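# Rollback function
rollback() {
    log_warn "Starting rollback..."

    # helm rollback alone restores the previous release. Following it with
    # `kubectl rollout undo` would flip the Deployment back to the failed revision.
    helm rollback ${RELEASE_NAME} -n ${NAMESPACE}

    log_info "Rollback complete"
}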

# Main function
main() {
    log_info "=========================================="
    log_info " Contract API deployment script"
    log_info "=========================================="
    log_info "Version: ${VERSION}"
    log_info "Environment: ${ENVIRONMENT}"
    log_info "=========================================="

    check_prerequisites
    backup_current_deployment

    if deploy; then
        if verify_deployment; then
            notify
            log_info "Deployment completed successfully!"
        else
            log_error "Verification failed, rolling back..."
            rollback
            exit 1
        fi
    else
        # helm --atomic has already rolled the release back; a second rollback is not needed
        log_error "Deployment failed"
        exit 1
    fi
}

# Roll back when interrupted with Ctrl+C
trap 'log_warn "Interrupt received, starting rollback..."; rollback; exit 1' INT TERM

main "$@"
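
One caveat about the readiness check in `verify_deployment`: the pattern `*"True"*` matches as soon as any one Pod reports Ready, even if others do not. A stricter variant would require every reported status to be `True`. The helper below is a hypothetical sketch of that check. It takes the same space-separated string that the jsonpath query produces.

```shell
#!/bin/sh
# all_pods_ready "STATUS STATUS ..." -> exit status 0 only when at least one
# status is present and every status equals True
all_pods_ready() {
    count=0
    for status in $1; do
        [ "$status" = "True" ] || return 1
        count=$((count + 1))
    done
    [ "$count" -gt 0 ]
}

# Usage with the variable from verify_deployment:
#   all_pods_ready "$READY_PODS" || { echo "not all Pods are ready"; exit 1; }
```

An empty result is treated as a failure, since it usually means the label selector matched no Pods.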