结构优化设计-主题045-信赖域方法
主题045:信赖域方法 (Trust Region Methods)
1. 引言
1.1 背景与动机
在结构优化设计中,我们经常需要求解复杂的非线性优化问题。传统的线搜索方法(如梯度下降法、牛顿法)在某些情况下会遇到困难:
- Hessian矩阵不正定时,牛顿方向可能不是下降方向
- 步长选择不当可能导致迭代发散
- 在病态问题上收敛缓慢
信赖域方法(Trust Region Method)提供了一种更稳健的优化框架,它通过在每次迭代中构建局部二次模型,并在信赖域内求解子问题来确定搜索方向。
1.2 核心思想
信赖域方法的核心思想是:
- 局部建模:在当前点 xkx_kxk 构建二次模型:
mk(p)=fk+gkTp+12pTBkpm_k(p) = f_k + g_k^T p + \frac{1}{2} p^T B_k pmk(p)=fk+gkTp+21pTBkp - 信赖域约束:在信赖域 ∥p∥≤Δk\|p\| \leq \Delta_k∥p∥≤Δk 内求解子问题
- 自适应调整:根据实际下降与预测下降的比值调整信赖域半径
1.3 学习目标
通过本教程,您将掌握:
- 信赖域方法的基本原理与算法框架
- 信赖域子问题的求解方法(Cauchy点、Dogleg、Steihaug)
- 信赖域半径的自适应调整策略
- 边界约束优化的信赖域反射法







2. 数学理论基础
2.1 信赖域子问题
信赖域子问题的标准形式为:
minpmk(p)=fk+gkTp+12pTBkp\min_{p} m_k(p) = f_k + g_k^T p + \frac{1}{2} p^T B_k ppminmk(p)=fk+gkTp+21pTBkp
s.t. ∥p∥≤Δk\text{s.t. } \|p\| \leq \Delta_ks.t. ∥p∥≤Δk
其中:
- fk=f(xk)f_k = f(x_k)fk=f(xk) 是当前函数值
- gk=∇f(xk)g_k = \nabla f(x_k)gk=∇f(xk) 是当前梯度
- BkB_kBk 是Hessian矩阵或其近似
- Δk\Delta_kΔk 是信赖域半径
- ∥⋅∥\|\cdot\|∥⋅∥ 通常是欧几里得范数
2.2 信赖域半径的调整
信赖域方法通过比较实际下降与预测下降来调整信赖域半径:
ρk=f(xk)−f(xk+pk)mk(0)−mk(pk)=实际下降预测下降\rho_k = \frac{f(x_k) - f(x_k + p_k)}{m_k(0) - m_k(p_k)} = \frac{\text{实际下降}}{\text{预测下降}}ρk=mk(0)−mk(pk)f(xk)−f(xk+pk)=预测下降实际下降
调整策略:
- 如果 ρk<ηbad\rho_k < \eta_{bad}ρk<ηbad(如0.25):模型不可靠,缩小信赖域
- 如果 ρk>ηgood\rho_k > \eta_{good}ρk>ηgood(如0.75)且 ∥pk∥=Δk\|p_k\| = \Delta_k∥pk∥=Δk:模型可靠且被边界限制,扩大信赖域
- 否则:保持信赖域半径不变
2.3 Cauchy点方法
Cauchy点是最简单的子问题求解方法,它沿着最速下降方向走到信赖域边界:
pkC=−τkΔk∥gk∥gkp_k^C = -\tau_k \frac{\Delta_k}{\|g_k\|} g_kpkC=−τk∥gk∥Δkgk
其中:
τk={1if gkTBkgk≤0min(∥gk∥3ΔkgkTBkgk,1)otherwise\tau_k = \begin{cases} 1 & \text{if } g_k^T B_k g_k \leq 0 \\ \min\left(\frac{\|g_k\|^3}{\Delta_k g_k^T B_k g_k}, 1\right) & \text{otherwise} \end{cases}τk={1min(ΔkgkTBkgk∥gk∥3,1)if gkTBkgk≤0otherwise
性质:Cauchy点保证至少和最速下降法有相同的收敛速度。
2.4 Dogleg方法
Dogleg方法通过连接Cauchy点和牛顿方向形成一条折线路径:
路径定义:
- 如果 ∥pN∥≤Δk\|p^N\| \leq \Delta_k∥pN∥≤Δk,取 pk=pNp_k = p^Npk=pN(牛顿方向)
- 否则,在折线路径 p(τ)p(\tau)p(τ) 上寻找满足 ∥p(τ)∥=Δk\|p(\tau)\| = \Delta_k∥p(τ)∥=Δk 的点
折线路径:
p(τ)={τpC0≤τ≤1pC+(τ−1)(pN−pC)1≤τ≤2p(\tau) = \begin{cases} \tau p^C & 0 \leq \tau \leq 1 \\ p^C + (\tau - 1)(p^N - p^C) & 1 \leq \tau \leq 2 \end{cases}p(τ)={τpCpC+(τ−1)(pN−pC)0≤τ≤11≤τ≤2
优点:
- 计算简单
- 比Cauchy点更高效
- 当Hessian正定时效果良好
2.5 Steihaug共轭梯度法
Steihaug方法使用截断共轭梯度法求解信赖域子问题:
算法步骤:
- 初始化:z0=0z_0 = 0z0=0,r0=gkr_0 = g_kr0=gk,d0=−r0d_0 = -r_0d0=−r0
- 如果 ∥r0∥<ε\|r_0\| < \varepsilon∥r0∥<ε,返回 pk=z0p_k = z_0pk=z0
- 对于 j=0,1,2,...j = 0, 1, 2, ...j=0,1,2,...:
- 计算 BkdjB_k d_jBkdj
- 如果 djTBkdj≤0d_j^T B_k d_j \leq 0djTBkdj≤0:遇到负曲率,沿当前方向走到边界
- 计算 αj=rjTrjdjTBkdj\alpha_j = \frac{r_j^T r_j}{d_j^T B_k d_j}αj=djTBkdjrjTrj
- 如果 ∥zj+αjdj∥≥Δk\|z_j + \alpha_j d_j\| \geq \Delta_k∥zj+αjdj∥≥Δk:超出信赖域,截断到边界
- 更新:zj+1=zj+αjdjz_{j+1} = z_j + \alpha_j d_jzj+1=zj+αjdj,rj+1=rj+αjBkdjr_{j+1} = r_j + \alpha_j B_k d_jrj+1=rj+αjBkdj
- 如果 ∥rj+1∥\|r_{j+1}\|∥rj+1∥ 足够小,返回 pk=zj+1p_k = z_{j+1}pk=zj+1
- 计算 βj=rj+1Trj+1rjTrj\beta_j = \frac{r_{j+1}^T r_{j+1}}{r_j^T r_j}βj=rjTrjrj+1Trj+1
- 更新方向:dj+1=−rj+1+βjdjd_{j+1} = -r_{j+1} + \beta_j d_jdj+1=−rj+1+βjdj
优点:
- 适合大规模问题
- 可以处理不定Hessian
- 不需要显式存储Hessian矩阵
2.6 收敛性分析
全局收敛性:信赖域方法具有全局收敛性,即:
limk→∞∥∇f(xk)∥=0\lim_{k \to \infty} \|\nabla f(x_k)\| = 0k→∞lim∥∇f(xk)∥=0
收敛速度:
- 在Hessian正定的假设下,信赖域方法具有超线性收敛速度
- 接近最优解时,信赖域通常不活跃,方法退化为牛顿法
3. 环境准备
3.1 依赖库
pip install numpy matplotlib scipy
3.2 导入模块
import numpy as np
import matplotlib.pyplot as plt
from typing import Callable, Tuple, Optional, List
from enum import Enum
# 设置中文显示
plt.rcParams['font.sans-serif'] = ['SimHei', 'DejaVu Sans']
plt.rcParams['axes.unicode_minus'] = False
4. 完整代码实现
4.1 信赖域主类
class TrustRegion:
"""
信赖域方法 (Trust Region Method)
信赖域方法通过在每次迭代中构建一个局部二次模型,
并在信赖域内求解该模型来确定搜索方向。
"""
def __init__(self,
subproblem_solver: SubproblemSolver = SubproblemSolver.DOGLEG,
max_iter: int = 1000,
tol: float = 1e-6,
delta_init: float = 1.0,
delta_max: float = 100.0,
eta: float = 0.1,
eta_good: float = 0.75,
eta_bad: float = 0.25):
"""
初始化信赖域方法
参数:
subproblem_solver: 子问题求解方法
max_iter: 最大迭代次数
tol: 收敛容差
delta_init: 初始信赖域半径
delta_max: 最大信赖域半径
eta: 接受步长的最小比率阈值
eta_good: 信赖域扩大阈值
eta_bad: 信赖域缩小阈值
"""
self.subproblem_solver = subproblem_solver
self.max_iter = max_iter
self.tol = tol
self.delta = delta_init
self.delta_max = delta_max
self.eta = eta
self.eta_good = eta_good
self.eta_bad = eta_bad
# 历史记录
self.x_history = []
self.f_history = []
self.grad_norm_history = []
self.delta_history = []
self.rho_history = []
self.step_norm_history = []
4.2 Cauchy点方法
def solve_cauchy_point(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""
Cauchy点方法求解子问题
Cauchy点是最速下降方向与信赖域边界的交点。
这是最简单的子问题求解方法,保证收敛但效率较低。
"""
g_norm = np.linalg.norm(grad)
if g_norm < 1e-15:
return np.zeros_like(grad)
# 最速下降方向
p_sd = -grad / g_norm * delta
# 计算曲率
gBg = grad @ hessian @ grad
if gBg <= 0:
# 负曲率或零曲率,取信赖域边界
tau = 1.0
else:
# 计算最优步长
tau = min(1.0, g_norm**3 / (delta * gBg))
p_cauchy = tau * p_sd
return p_cauchy
4.3 Dogleg方法
def solve_dogleg(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""
Dogleg方法求解子问题
Dogleg方法通过连接最速下降方向和牛顿方向形成一条折线路径,
在该路径上寻找满足信赖域约束的解。
"""
# 尝试牛顿方向
try:
p_newton = -np.linalg.solve(hessian, grad)
if np.linalg.norm(p_newton) <= delta:
return p_newton
except np.linalg.LinAlgError:
# Hessian奇异,使用修正
p_newton = None
# 计算Cauchy点
g_norm_sq = np.dot(grad, grad)
gBg = grad @ hessian @ grad
if gBg <= 0:
tau = 1.0
else:
tau = min(1.0, g_norm_sq**1.5 / (delta * gBg))
p_cauchy = -tau * grad * delta / np.sqrt(g_norm_sq)
if p_newton is None:
return p_cauchy
# 在折线路径上寻找交点
diff = p_newton - p_cauchy
# 解方程 ||p_cauchy + t*(p_newton - p_cauchy)||^2 = delta^2
a = np.dot(diff, diff)
b = 2 * np.dot(p_cauchy, diff)
c = np.dot(p_cauchy, p_cauchy) - delta**2
discriminant = b**2 - 4*a*c
if discriminant < 0:
return p_cauchy
t = (-b + np.sqrt(discriminant)) / (2*a)
t = max(0, min(1, t))
p_dogleg = p_cauchy + t * diff
return p_dogleg
4.4 Steihaug共轭梯度法
def solve_steihaug(self, grad: np.ndarray, hessian: np.ndarray,
delta: float, max_iter: int = None) -> np.ndarray:
"""
Steihaug共轭梯度法求解子问题
Steihaug方法使用截断共轭梯度法求解信赖域子问题,
适合大规模问题,可以处理不定Hessian。
"""
if max_iter is None:
max_iter = len(grad)
x = np.zeros_like(grad)
r = grad.copy()
d = -r.copy()
r_norm_sq = np.dot(r, r)
for i in range(max_iter):
Bd = hessian @ d
dBd = np.dot(d, Bd)
if dBd <= 0:
# 遇到负曲率,沿当前方向走到信赖域边界
a = np.dot(d, d)
b = 2 * np.dot(x, d)
c = np.dot(x, x) - delta**2
discriminant = b**2 - 4*a*c
if discriminant >= 0:
tau = (-b + np.sqrt(discriminant)) / (2*a)
if tau >= 0:
return x + tau * d
return x
alpha = r_norm_sq / dBd
x_new = x + alpha * d
if np.linalg.norm(x_new) >= delta:
# 超出信赖域,截断到边界
a = np.dot(d, d)
b = 2 * np.dot(x, d)
c = np.dot(x, x) - delta**2
discriminant = b**2 - 4*a*c
if discriminant >= 0:
tau = (-b + np.sqrt(discriminant)) / (2*a)
if tau >= 0:
return x + tau * d
return x
r_new = r + alpha * Bd
r_norm_sq_new = np.dot(r_new, r_new)
if np.sqrt(r_norm_sq_new) < 1e-10:
return x_new
beta = r_norm_sq_new / r_norm_sq
d = -r_new + beta * d
x = x_new
r = r_new
r_norm_sq = r_norm_sq_new
return x
4.5 主优化循环
def optimize(self, f: Callable, x0: np.ndarray,
grad_f: Callable = None, hessian_f: Callable = None) -> Tuple[np.ndarray, float]:
"""
执行信赖域优化
参数:
f: 目标函数
x0: 初始点
grad_f: 梯度函数(可选)
hessian_f: Hessian函数(可选)
返回:
x: 最优解
f_val: 最优函数值
"""
self.x_history = [x0.copy()]
self.f_history = []
self.grad_norm_history = []
self.delta_history = []
self.rho_history = []
self.step_norm_history = []
x = x0.copy()
for iteration in range(self.max_iter):
# 计算函数值
f_val = f(x)
self.f_history.append(f_val)
# 计算梯度
if grad_f is not None:
grad = grad_f(x)
else:
grad = self.compute_gradient(f, x)
grad_norm = np.linalg.norm(grad)
self.grad_norm_history.append(grad_norm)
self.delta_history.append(self.delta)
# 检查收敛
if grad_norm < self.tol:
break
# 计算Hessian
if hessian_f is not None:
hessian = hessian_f(x)
else:
hessian = self.compute_hessian(f, x)
# 确保Hessian对称正定(如果可能)
eigvals = np.linalg.eigvalsh(hessian)
if np.min(eigvals) < 1e-10:
hessian = hessian + (1e-6 - np.min(eigvals)) * np.eye(len(x))
# 求解信赖域子问题
p = self.solve_subproblem(grad, hessian, self.delta)
step_norm = np.linalg.norm(p)
self.step_norm_history.append(step_norm)
# 计算实际下降与预测下降的比值
f_new = f(x + p)
actual_reduction = f_val - f_new
predicted_reduction = -(grad @ p + 0.5 * p @ hessian @ p)
if abs(predicted_reduction) < 1e-15:
rho = 1.0 if actual_reduction > 0 else 0.0
else:
rho = actual_reduction / predicted_reduction
self.rho_history.append(rho)
# 更新迭代点
if rho > self.eta:
x = x + p
self.x_history.append(x.copy())
# 调整信赖域半径
if rho < self.eta_bad:
self.delta *= 0.25
elif rho > self.eta_good and abs(step_norm - self.delta) < 1e-10:
self.delta = min(2 * self.delta, self.delta_max)
# 防止信赖域过小
if self.delta < 1e-10:
break
return x, f(x)
5. 代码深度解析
5.1 信赖域半径的自适应调整
代码中实现了经典的信赖域半径调整策略:
# 调整信赖域半径
if rho < self.eta_bad:
self.delta *= 0.25 # 模型不可靠,大幅缩小
elif rho > self.eta_good and abs(step_norm - self.delta) < 1e-10:
self.delta = min(2 * self.delta, self.delta_max) # 模型可靠且被限制,扩大
为什么这样调整?
- 当 ρ\rhoρ 很小时,说明二次模型不能很好地近似目标函数,需要缩小信赖域
- 当 ρ\rhoρ 很大且步长达到边界时,说明模型可靠但被信赖域限制,可以扩大信赖域
- 这种自适应调整保证了算法的稳健性和效率
5.2 Hessian修正
# 确保Hessian对称正定(如果可能)
eigvals = np.linalg.eigvalsh(hessian)
if np.min(eigvals) < 1e-10:
hessian = hessian + (1e-6 - np.min(eigvals)) * np.eye(len(x))
为什么需要修正Hessian?
- 在非凸区域,Hessian可能不正定
- 不正定的Hessian会导致子问题求解困难
- 通过添加适当的正则化项,可以保证子问题有解
5.3 步长接受准则
# 更新迭代点
if rho > self.eta:
x = x + p
self.x_history.append(x.copy())
为什么需要接受准则?
- 即使求解了子问题,得到的步长也可能不使函数值下降
- 通过设置阈值 η\etaη,可以拒绝那些不能带来足够下降的步长
- 这保证了算法的单调下降性质
5.4 Dogleg方法的几何解释
Dogleg方法的路径由两部分组成:
- 从原点到Cauchy点的线段
- 从Cauchy点到牛顿方向的线段
这种折线路径充分利用了最速下降方向和牛顿方向的信息,在计算复杂度和收敛速度之间取得了良好的平衡。
6. 测试函数与案例分析
6.1 二次函数测试
二次函数是信赖域方法的理想测试案例:
f(x)=12xTAx−bTxf(x) = \frac{1}{2} x^T A x - b^T xf(x)=21xTAx−bTx
对于二次函数,信赖域方法通常能快速收敛,因为二次模型与目标函数完全匹配。
6.2 Rosenbrock函数
Rosenbrock函数是经典的困难测试函数:
f(x)=∑i=1n−1[100(xi+1−xi2)2+(1−xi)2]f(x) = \sum_{i=1}^{n-1} [100(x_{i+1} - x_i^2)^2 + (1 - x_i)^2]f(x)=i=1∑n−1[100(xi+1−xi2)2+(1−xi)2]
信赖域方法在这个函数上表现良好,能够沿着狭窄的"山谷"找到最优解。
6.3 不同求解器比较
| 求解器 | 收敛速度 | 计算成本 | 适用场景 |
|---|---|---|---|
| Cauchy点 | 慢 | 低 | 理论分析、大规模问题 |
| Dogleg | 中 | 中 | 中小规模问题、Hessian正定 |
| Steihaug | 快 | 中 | 大规模问题、不定Hessian |
6.4 信赖域半径的动态变化
通过观察信赖域半径的变化,可以了解算法的自适应过程:
- 初始阶段:信赖域可能频繁调整
- 接近最优解:信赖域通常稳定或扩大
- 遇到困难区域:信赖域会缩小以保证稳健性
8. 结构优化应用实例
8.1 有限元分析中的非线性求解
在有限元分析中,经常需要求解非线性方程组:
r(u)=fext−fint(u)=0r(u) = f_{ext} - f_{int}(u) = 0r(u)=fext−fint(u)=0
使用信赖域方法求解:
def residual(u):
return f_ext - compute_internal_force(u)
def jacobian(u):
return compute_stiffness_matrix(u)
# 转化为最小化问题
f = lambda u: 0.5 * np.dot(residual(u), residual(u))
tr = TrustRegion(SubproblemSolver.STEIHAUG)
u_opt, _ = tr.optimize(f, u0)
8.2 材料非线性优化
在材料参数识别中,信赖域方法可以用于:
def objective(material_params):
# 模拟
simulated_response = simulate(material_params)
# 计算误差
error = np.sum((simulated_response - experimental_data)**2)
return error
tr = TrustRegion(SubproblemSolver.DOGLEG, max_iter=100)
params_opt, error_opt = tr.optimize(objective, params_initial)
8.3 接触问题求解
在接触问题中,信赖域反射法可以处理边界约束:
# 定义接触约束
def contact_objective(displacement):
# 计算能量
energy = compute_potential_energy(displacement)
# 添加接触惩罚
penetration = compute_penetration(displacement)
return energy + penalty * np.sum(penetration**2)
# 边界约束(位移限制)
lower_bounds = np.zeros(n_dof) # 下界
upper_bounds = np.ones(n_dof) * max_displacement # 上界
trr = TrustRegionReflective(max_iter=200)
disp_opt, _ = trr.optimize(contact_objective, disp0, lower_bounds, upper_bounds)
9. 参数调优指南
9.1 初始信赖域半径
- 默认:1.0
- 小规模问题:可以设置较小值(0.1-0.5)
- 大规模问题:可以设置较大值(1.0-10.0)
- 病态问题:建议设置较小值
9.2 信赖域调整阈值
标准设置:
- η\etaη(接受阈值):0.1
- ηgood\eta_{good}ηgood(扩大阈值):0.75
- ηbad\eta_{bad}ηbad(缩小阈值):0.25
激进设置(追求速度):
- η\etaη:0.01
- ηgood\eta_{good}ηgood:0.9
- ηbad\eta_{bad}ηbad:0.1
保守设置(追求稳健):
- η\etaη:0.25
- ηgood\eta_{good}ηgood:0.6
- ηbad\eta_{bad}ηbad:0.4
9.3 子问题求解器选择
| 问题规模 | 推荐求解器 | 理由 |
|---|---|---|
| 小规模(n < 100) | Dogleg | 计算简单,效果好 |
| 中规模(100 < n < 10000) | Steihaug | 平衡效率和内存 |
| 大规模(n > 10000) | Steihaug | 内存友好,可处理稀疏矩阵 |
| Hessian不定 | Steihaug | 可以处理负曲率 |
9.4 收敛容差
- 粗略解:ϵ=10−3\epsilon = 10^{-3}ϵ=10−3
- 中等精度:ϵ=10−6\epsilon = 10^{-6}ϵ=10−6
- 高精度:ϵ=10−9\epsilon = 10^{-9}ϵ=10−9
- 有限元分析:ϵ=10−8∼10−10\epsilon = 10^{-8} \sim 10^{-10}ϵ=10−8∼10−10
10. 进阶挑战
挑战1:实现精确求解
对于小规模问题,可以实现信赖域子问题的精确求解:
def solve_exact(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""使用特征值分解精确求解信赖域子问题"""
# 求解 (B + lambda*I)p = -g,使得 ||p|| = delta
# 使用迭代方法找到最优lambda
pass
挑战2:稀疏矩阵支持
对于大规模问题,修改代码以支持稀疏矩阵:
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import cg
def solve_steihaug_sparse(self, grad, hessian_sparse, delta):
"""支持稀疏矩阵的Steihaug方法"""
pass
挑战3:非单调信赖域方法
实现非单调信赖域方法,允许偶尔的函数值上升:
def optimize_nonmonotone(self, f, x0, max_iter=1000, m=10):
"""非单调信赖域方法"""
# 参考最近m次迭代的函数值
pass
11. 常见报错与解决方案
11.1 “Hessian is singular”
原因:Hessian矩阵奇异
解决:
- 增加Hessian修正的系数
- 使用更保守的信赖域策略
- 检查问题建模是否正确
11.2 “Trust region too small”
原因:信赖域半径收缩到极小值
解决:
- 检查目标函数和梯度计算是否正确
- 增加最小信赖域半径限制
- 调整信赖域调整阈值
11.3 “Subproblem solver failed”
原因:子问题求解失败
解决:
- 切换到更稳健的求解器(如Steihaug)
- 检查Hessian矩阵是否正确计算
- 增加Hessian修正
11.4 “Convergence too slow”
原因:收敛速度过慢
解决:
- 使用更高效的子问题求解器(如Dogleg或Steihaug)
- 调整信赖域参数,更积极地扩大信赖域
- 检查问题的条件数
11.5 “Out of memory”
原因:内存不足
解决:
- 使用Steihaug方法,避免显式存储Hessian
- 使用稀疏矩阵格式
- 实现矩阵-向量乘法而不存储完整矩阵
附录:完整代码
完整的Python实现代码请参见 trust_region.py 文件。
"""
主题045:信赖域方法 (Trust Region Methods)
信赖域方法是一种强大的优化算法,通过在每次迭代中构建局部二次模型,
并在信赖域内求解子问题来确定搜索方向。
"""
import numpy as np
import matplotlib.pyplot as plt
from typing import Callable, Tuple, Optional, List
from enum import Enum
# 设置中文字体
plt.rcParams['font.sans-serif'] = ['SimHei', 'DejaVu Sans']
plt.rcParams['axes.unicode_minus'] = False
class SubproblemSolver(Enum):
"""子问题求解方法"""
CAUCHY = "cauchy" # Cauchy点方法
DOGLEG = "dogleg" # Dogleg方法
STEIHAUG = "steihaug" # Steihaug共轭梯度法
EXACT = "exact" # 精确求解
class TrustRegion:
"""
信赖域方法 (Trust Region Method)
信赖域方法通过在每次迭代中构建一个局部二次模型,
并在信赖域内求解该模型来确定搜索方向。
核心思想:
1. 在当前点构建二次模型:m_k(p) = f_k + g_k^T p + 1/2 p^T B_k p
2. 在信赖域 ||p|| <= Delta_k 内求解子问题
3. 根据实际下降与预测下降的比值调整信赖域半径
"""
def __init__(self,
subproblem_solver: SubproblemSolver = SubproblemSolver.DOGLEG,
max_iter: int = 1000,
tol: float = 1e-6,
delta_init: float = 1.0,
delta_max: float = 100.0,
eta: float = 0.1,
eta_good: float = 0.75,
eta_bad: float = 0.25):
"""
初始化信赖域方法
参数:
subproblem_solver: 子问题求解方法
max_iter: 最大迭代次数
tol: 收敛容差
delta_init: 初始信赖域半径
delta_max: 最大信赖域半径
eta: 接受步长的最小比率阈值
eta_good: 信赖域扩大阈值
eta_bad: 信赖域缩小阈值
"""
self.subproblem_solver = subproblem_solver
self.max_iter = max_iter
self.tol = tol
self.delta = delta_init
self.delta_max = delta_max
self.eta = eta
self.eta_good = eta_good
self.eta_bad = eta_bad
# 历史记录
self.x_history = []
self.f_history = []
self.grad_norm_history = []
self.delta_history = []
self.rho_history = []
self.step_norm_history = []
def compute_gradient(self, f: Callable, x: np.ndarray,
h: float = 1e-8) -> np.ndarray:
"""数值计算梯度"""
n = len(x)
grad = np.zeros(n)
f_x = f(x)
for i in range(n):
x_plus = x.copy()
x_plus[i] += h
grad[i] = (f(x_plus) - f_x) / h
return grad
def compute_hessian(self, f: Callable, x: np.ndarray,
h: float = 1e-5) -> np.ndarray:
"""数值计算Hessian矩阵"""
n = len(x)
H = np.zeros((n, n))
for i in range(n):
x_plus = x.copy()
x_minus = x.copy()
x_plus[i] += h
x_minus[i] -= h
grad_plus = self.compute_gradient(f, x_plus, h)
grad_minus = self.compute_gradient(f, x_minus, h)
H[:, i] = (grad_plus - grad_minus) / (2 * h)
# 对称化
H = (H + H.T) / 2
return H
def solve_cauchy_point(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""
Cauchy点方法求解子问题
Cauchy点是最速下降方向与信赖域边界的交点。
这是最简单的子问题求解方法,保证收敛但效率较低。
"""
g_norm = np.linalg.norm(grad)
if g_norm < 1e-15:
return np.zeros_like(grad)
# 最速下降方向
p_sd = -grad / g_norm * delta
# 计算曲率
gBg = grad @ hessian @ grad
if gBg <= 0:
# 负曲率或零曲率,取信赖域边界
tau = 1.0
else:
# 计算最优步长
tau = min(1.0, g_norm**3 / (delta * gBg))
p_cauchy = tau * p_sd
return p_cauchy
def solve_dogleg(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""
Dogleg方法求解子问题
Dogleg方法通过连接最速下降方向和牛顿方向形成一条折线路径,
在该路径上寻找满足信赖域约束的解。
"""
# 尝试牛顿方向
try:
p_newton = -np.linalg.solve(hessian, grad)
if np.linalg.norm(p_newton) <= delta:
return p_newton
except np.linalg.LinAlgError:
# Hessian奇异,使用修正
p_newton = None
# 计算Cauchy点
g_norm_sq = np.dot(grad, grad)
gBg = grad @ hessian @ grad
if gBg <= 0:
tau = 1.0
else:
tau = min(1.0, g_norm_sq**1.5 / (delta * gBg))
p_cauchy = -tau * grad * delta / np.sqrt(g_norm_sq)
if p_newton is None:
return p_cauchy
# 在折线路径上寻找交点
# 路径:p_cauchy -> p_newton
diff = p_newton - p_cauchy
# 解方程 ||p_cauchy + t*(p_newton - p_cauchy)||^2 = delta^2
a = np.dot(diff, diff)
b = 2 * np.dot(p_cauchy, diff)
c = np.dot(p_cauchy, p_cauchy) - delta**2
discriminant = b**2 - 4*a*c
if discriminant < 0:
return p_cauchy
t = (-b + np.sqrt(discriminant)) / (2*a)
t = max(0, min(1, t))
p_dogleg = p_cauchy + t * diff
return p_dogleg
def solve_steihaug(self, grad: np.ndarray, hessian: np.ndarray,
delta: float, max_iter: int = None) -> np.ndarray:
"""
Steihaug共轭梯度法求解子问题
Steihaug方法使用截断共轭梯度法求解信赖域子问题,
适合大规模问题,可以处理不定Hessian。
"""
if max_iter is None:
max_iter = len(grad)
x = np.zeros_like(grad)
r = grad.copy()
d = -r.copy()
r_norm_sq = np.dot(r, r)
for i in range(max_iter):
Bd = hessian @ d
dBd = np.dot(d, Bd)
if dBd <= 0:
# 遇到负曲率,沿当前方向走到信赖域边界
a = np.dot(d, d)
b = 2 * np.dot(x, d)
c = np.dot(x, x) - delta**2
discriminant = b**2 - 4*a*c
if discriminant >= 0:
tau = (-b + np.sqrt(discriminant)) / (2*a)
if tau >= 0:
return x + tau * d
# 如果无法到达边界,返回当前点
return x
alpha = r_norm_sq / dBd
x_new = x + alpha * d
if np.linalg.norm(x_new) >= delta:
# 超出信赖域,截断到边界
a = np.dot(d, d)
b = 2 * np.dot(x, d)
c = np.dot(x, x) - delta**2
discriminant = b**2 - 4*a*c
if discriminant >= 0:
tau = (-b + np.sqrt(discriminant)) / (2*a)
if tau >= 0:
return x + tau * d
return x
r_new = r + alpha * Bd
r_norm_sq_new = np.dot(r_new, r_new)
if np.sqrt(r_norm_sq_new) < 1e-10:
return x_new
beta = r_norm_sq_new / r_norm_sq
d = -r_new + beta * d
x = x_new
r = r_new
r_norm_sq = r_norm_sq_new
return x
def solve_subproblem(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""根据选择的方法求解信赖域子问题"""
if self.subproblem_solver == SubproblemSolver.CAUCHY:
return self.solve_cauchy_point(grad, hessian, delta)
elif self.subproblem_solver == SubproblemSolver.DOGLEG:
return self.solve_dogleg(grad, hessian, delta)
elif self.subproblem_solver == SubproblemSolver.STEIHAUG:
return self.solve_steihaug(grad, hessian, delta)
elif self.subproblem_solver == SubproblemSolver.EXACT:
# 精确求解(仅用于小规模问题)
return self.solve_exact(grad, hessian, delta)
else:
raise ValueError(f"Unknown solver: {self.subproblem_solver}")
def solve_exact(self, grad: np.ndarray, hessian: np.ndarray,
delta: float) -> np.ndarray:
"""
精确求解信赖域子问题(使用特征值分解)
仅适用于小规模问题,用于验证其他方法的准确性。
"""
n = len(grad)
# 尝试直接求解无约束问题
try:
p_full = -np.linalg.solve(hessian, grad)
if np.linalg.norm(p_full) <= delta:
return p_full
except np.linalg.LinAlgError:
pass
# 在边界上求解
# 使用迭代方法找到最优lambda
lambda_min = 0
lambda_max = 1000
for _ in range(100):
lambda_mid = (lambda_min + lambda_max) / 2
try:
B_lambda = hessian + lambda_mid * np.eye(n)
p_lambda = -np.linalg.solve(B_lambda, grad)
p_norm = np.linalg.norm(p_lambda)
if abs(p_norm - delta) < 1e-6:
return p_lambda
elif p_norm > delta:
lambda_min = lambda_mid
else:
lambda_max = lambda_mid
except np.linalg.LinAlgError:
lambda_min = lambda_mid
# 返回最后计算的解
B_lambda = hessian + lambda_max * np.eye(n)
return -np.linalg.solve(B_lambda, grad)
def optimize(self, f: Callable, x0: np.ndarray,
grad_f: Callable = None, hessian_f: Callable = None) -> Tuple[np.ndarray, float]:
"""
执行信赖域优化
参数:
f: 目标函数
x0: 初始点
grad_f: 梯度函数(可选)
hessian_f: Hessian函数(可选)
返回:
x: 最优解
f_val: 最优函数值
"""
self.x_history = [x0.copy()]
self.f_history = []
self.grad_norm_history = []
self.delta_history = []
self.rho_history = []
self.step_norm_history = []
x = x0.copy()
for iteration in range(self.max_iter):
# 计算函数值
f_val = f(x)
self.f_history.append(f_val)
# 计算梯度
if grad_f is not None:
grad = grad_f(x)
else:
grad = self.compute_gradient(f, x)
grad_norm = np.linalg.norm(grad)
self.grad_norm_history.append(grad_norm)
self.delta_history.append(self.delta)
# 检查收敛
if grad_norm < self.tol:
break
# 计算Hessian
if hessian_f is not None:
hessian = hessian_f(x)
else:
hessian = self.compute_hessian(f, x)
# 确保Hessian对称正定(如果可能)
eigvals = np.linalg.eigvalsh(hessian)
if np.min(eigvals) < 1e-10:
# 修正Hessian
hessian = hessian + (1e-6 - np.min(eigvals)) * np.eye(len(x))
# 求解信赖域子问题
p = self.solve_subproblem(grad, hessian, self.delta)
step_norm = np.linalg.norm(p)
self.step_norm_history.append(step_norm)
# 计算实际下降与预测下降的比值
f_new = f(x + p)
actual_reduction = f_val - f_new
predicted_reduction = -(grad @ p + 0.5 * p @ hessian @ p)
if abs(predicted_reduction) < 1e-15:
rho = 1.0 if actual_reduction > 0 else 0.0
else:
rho = actual_reduction / predicted_reduction
self.rho_history.append(rho)
# 更新迭代点
if rho > self.eta:
x = x + p
self.x_history.append(x.copy())
# 调整信赖域半径
if rho < self.eta_bad:
self.delta *= 0.25
elif rho > self.eta_good and abs(step_norm - self.delta) < 1e-10:
self.delta = min(2 * self.delta, self.delta_max)
# 防止信赖域过小
if self.delta < 1e-10:
break
return x, f(x)
class TrustRegionReflective:
"""
反射信赖域方法 (Trust Region Reflective)
用于处理带边界约束的优化问题。
"""
def __init__(self,
max_iter: int = 1000,
tol: float = 1e-6,
delta_init: float = 1.0,
delta_max: float = 100.0):
self.max_iter = max_iter
self.tol = tol
self.delta = delta_init
self.delta_max = delta_max
self.x_history = []
self.f_history = []
def project_to_bounds(self, x: np.ndarray,
lower: Optional[np.ndarray],
upper: Optional[np.ndarray]) -> np.ndarray:
"""将点投影到边界内"""
x_proj = x.copy()
if lower is not None:
x_proj = np.maximum(x_proj, lower)
if upper is not None:
x_proj = np.minimum(x_proj, upper)
return x_proj
def optimize(self, f: Callable, x0: np.ndarray,
lower: Optional[np.ndarray] = None,
upper: Optional[np.ndarray] = None,
grad_f: Callable = None) -> Tuple[np.ndarray, float]:
"""
执行带边界约束的信赖域优化
参数:
f: 目标函数
x0: 初始点
lower: 下界(可选)
upper: 上界(可选)
grad_f: 梯度函数(可选)
"""
self.x_history = [x0.copy()]
self.f_history = []
x = x0.copy()
tr = TrustRegion(SubproblemSolver.STEIHAUG, self.max_iter, self.tol)
for iteration in range(self.max_iter):
f_val = f(x)
self.f_history.append(f_val)
if grad_f is not None:
grad = grad_f(x)
else:
grad = tr.compute_gradient(f, x)
if np.linalg.norm(grad) < self.tol:
break
# 计算反射方向
# 简化的反射信赖域方法
hessian = tr.compute_hessian(f, x)
p = tr.solve_steihaug(grad, hessian, self.delta)
x_new = x + p
# 投影到可行域
x_new = self.project_to_bounds(x_new, lower, upper)
# 检查是否接受步长
if f(x_new) < f_val:
x = x_new
self.delta = min(2 * self.delta, self.delta_max)
else:
self.delta *= 0.5
self.x_history.append(x.copy())
if self.delta < 1e-10:
break
return x, f(x)
# ==================== 测试函数 ====================
def quadratic_function(x: np.ndarray) -> float:
"""二次测试函数"""
A = np.array([[4, 1], [1, 3]])
b = np.array([1, 2])
return 0.5 * x @ A @ x - b @ x
def quadratic_grad(x: np.ndarray) -> np.ndarray:
"""二次函数梯度"""
A = np.array([[4, 1], [1, 3]])
b = np.array([1, 2])
return A @ x - b
def quadratic_hessian(x: np.ndarray) -> np.ndarray:
"""二次函数Hessian"""
return np.array([[4, 1], [1, 3]])
def rosenbrock(x: np.ndarray) -> float:
"""Rosenbrock函数"""
return sum(100 * (x[i+1] - x[i]**2)**2 + (1 - x[i])**2
for i in range(len(x)-1))
def rosenbrock_grad(x: np.ndarray) -> np.ndarray:
"""Rosenbrock函数梯度"""
n = len(x)
grad = np.zeros(n)
for i in range(n-1):
grad[i] += -400 * x[i] * (x[i+1] - x[i]**2) - 2 * (1 - x[i])
grad[i+1] += 200 * (x[i+1] - x[i]**2)
return grad
def beale(x: np.ndarray) -> float:
"""Beale函数"""
x1, x2 = x[0], x[1]
term1 = (1.5 - x1 + x1*x2)**2
term2 = (2.25 - x1 + x1*x2**2)**2
term3 = (2.625 - x1 + x1*x2**3)**2
return term1 + term2 + term3
def beale_grad(x: np.ndarray) -> np.ndarray:
"""Beale函数梯度"""
x1, x2 = x[0], x[1]
d1 = 1.5 - x1 + x1*x2
d2 = 2.25 - x1 + x1*x2**2
d3 = 2.625 - x1 + x1*x2**3
df_dx1 = (2*d1*(-1 + x2) +
2*d2*(-1 + x2**2) +
2*d3*(-1 + x2**3))
df_dx2 = (2*d1*x1 +
2*d2*2*x1*x2 +
2*d3*3*x1*x2**2)
return np.array([df_dx1, df_dx2])
def himmelblau(x: np.ndarray) -> float:
"""Himmelblau函数"""
x1, x2 = x[0], x[1]
return (x1**2 + x2 - 11)**2 + (x1 + x2**2 - 7)**2
def himmelblau_grad(x: np.ndarray) -> np.ndarray:
"""Himmelblau函数梯度"""
x1, x2 = x[0], x[1]
df_dx1 = 2*(x1**2 + x2 - 11)*2*x1 + 2*(x1 + x2**2 - 7)
df_dx2 = 2*(x1**2 + x2 - 11) + 2*(x1 + x2**2 - 7)*2*x2
return np.array([df_dx1, df_dx2])
# ==================== 可视化函数 ====================
def plot_optimization_trajectory(tr: TrustRegion, func, func_name: str,
x_range=(-2, 2), y_range=(-2, 2),
true_optimum=None):
"""绘制优化轨迹"""
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
# 创建网格
x1 = np.linspace(x_range[0], x_range[1], 100)
x2 = np.linspace(y_range[0], y_range[1], 100)
X1, X2 = np.meshgrid(x1, x2)
Z = np.zeros_like(X1)
for i in range(len(x1)):
for j in range(len(x2)):
Z[j, i] = func(np.array([X1[j, i], X2[j, i]]))
# 左图:优化轨迹
ax1 = axes[0]
contour = ax1.contour(X1, X2, Z, levels=20, cmap='viridis', alpha=0.6)
ax1.clabel(contour, inline=True, fontsize=8)
x_hist = np.array(tr.x_history)
ax1.plot(x_hist[:, 0], x_hist[:, 1], 'ro-', markersize=4,
linewidth=1.5, label='优化轨迹')
ax1.plot(x_hist[0, 0], x_hist[0, 1], 'go', markersize=10,
label='起点', markeredgecolor='black', markeredgewidth=1.5)
ax1.plot(x_hist[-1, 0], x_hist[-1, 1], 'b*', markersize=15,
label='终点', markeredgecolor='black', markeredgewidth=1.5)
if true_optimum is not None:
ax1.plot(true_optimum[0], true_optimum[1], 'g+', markersize=15,
label='真实最优解', markeredgewidth=2)
ax1.set_xlabel('x₁', fontsize=12)
ax1.set_ylabel('x₂', fontsize=12)
ax1.set_title(f'{func_name} - 信赖域优化轨迹', fontsize=14)
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)
ax1.set_aspect('equal', adjustable='box')
# 右图:信赖域半径变化
ax2 = axes[1]
ax2.semilogy(tr.delta_history, 'b-', linewidth=2, label='信赖域半径 Δ')
ax2.set_xlabel('迭代次数', fontsize=12)
ax2.set_ylabel('信赖域半径 (对数尺度)', fontsize=12)
ax2.set_title('信赖域半径变化', fontsize=14)
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)
plt.tight_layout()
filename = f'trust_region_{func_name.lower().replace(" ", "_")}.png'
plt.savefig(filename, dpi=150, bbox_inches='tight')
print(f"已保存: {filename}")
plt.close()
def plot_convergence_comparison():
"""比较不同子问题求解方法的收敛性"""
x0 = np.array([-1.5, 1.5])
solvers = [
(SubproblemSolver.CAUCHY, "Cauchy点"),
(SubproblemSolver.DOGLEG, "Dogleg"),
(SubproblemSolver.STEIHAUG, "Steihaug CG")
]
fig, axes = plt.subplots(1, 2, figsize=(14, 6))
colors = ['blue', 'red', 'green']
for i, (solver, name) in enumerate(solvers):
tr = TrustRegion(subproblem_solver=solver, max_iter=100, tol=1e-6)
x_opt, f_opt = tr.optimize(rosenbrock, x0, rosenbrock_grad)
# 左图:函数值收敛
axes[0].semilogy(tr.f_history, color=colors[i], linewidth=2,
label=f'{name} (f={f_opt:.2e})', marker='o', markersize=3)
# 右图:梯度范数收敛
axes[1].semilogy(tr.grad_norm_history, color=colors[i], linewidth=2,
label=f'{name}', marker='s', markersize=3)
axes[0].set_xlabel('迭代次数', fontsize=12)
axes[0].set_ylabel('函数值 (对数尺度)', fontsize=12)
axes[0].set_title('Rosenbrock函数 - 不同求解器比较', fontsize=14)
axes[0].legend(fontsize=10)
axes[0].grid(True, alpha=0.3)
axes[1].set_xlabel('迭代次数', fontsize=12)
axes[1].set_ylabel('梯度范数 (对数尺度)', fontsize=12)
axes[1].set_title('梯度范数收敛', fontsize=14)
axes[1].legend(fontsize=10)
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('trust_region_solvers_comparison.png', dpi=150, bbox_inches='tight')
print("已保存: trust_region_solvers_comparison.png")
plt.close()
def plot_ratio_history():
"""绘制实际下降/预测下降比值的变化"""
x0 = np.array([-1.5, 1.5])
tr = TrustRegion(subproblem_solver=SubproblemSolver.DOGLEG,
max_iter=50, tol=1e-6)
x_opt, f_opt = tr.optimize(rosenbrock, x0, rosenbrock_grad)
fig, axes = plt.subplots(2, 1, figsize=(12, 8))
# 上图:rho值变化
ax1 = axes[0]
iterations = range(len(tr.rho_history))
ax1.plot(iterations, tr.rho_history, 'b-o', linewidth=2, markersize=6)
ax1.axhline(y=0.75, color='g', linestyle='--', linewidth=2, label='扩大阈值 (0.75)')
ax1.axhline(y=0.25, color='r', linestyle='--', linewidth=2, label='缩小阈值 (0.25)')
ax1.axhline(y=0.1, color='orange', linestyle='--', linewidth=2, label='接受阈值 (0.1)')
ax1.set_xlabel('迭代次数', fontsize=12)
ax1.set_ylabel('ρ (实际下降/预测下降)', fontsize=12)
ax1.set_title('信赖域比率 ρ 的变化', fontsize=14)
ax1.legend(fontsize=10)
ax1.grid(True, alpha=0.3)
# 下图:步长变化
ax2 = axes[1]
ax2.plot(iterations, tr.step_norm_history, 'r-s', linewidth=2, markersize=6, label='步长 ||p||')
ax2.plot(range(len(tr.delta_history)), tr.delta_history, 'g--',
linewidth=2, label='信赖域半径 Δ')
ax2.set_xlabel('迭代次数', fontsize=12)
ax2.set_ylabel('步长 / 信赖域半径', fontsize=12)
ax2.set_title('步长与信赖域半径', fontsize=14)
ax2.legend(fontsize=10)
ax2.grid(True, alpha=0.3)
plt.tight_layout()
plt.savefig('trust_region_ratio_analysis.png', dpi=150, bbox_inches='tight')
print("已保存: trust_region_ratio_analysis.png")
plt.close()
def demo_boundary_constraints():
"""演示边界约束优化"""
# 定义带边界的二次函数
def bounded_quadratic(x):
return (x[0] - 2)**2 + (x[1] - 1)**2
x0 = np.array([0.0, 0.0])
lower = np.array([0.5, 0.5]) # 下界
upper = np.array([1.5, 1.5]) # 上界
trr = TrustRegionReflective(max_iter=100, tol=1e-6)
x_opt, f_opt = trr.optimize(bounded_quadratic, x0, lower, upper)
# 绘制结果
fig, ax = plt.subplots(figsize=(8, 8))
# 绘制可行域
rect = plt.Rectangle((lower[0], lower[1]),
upper[0] - lower[0],
upper[1] - lower[1],
fill=True, facecolor='lightgreen',
edgecolor='green', linewidth=2, alpha=0.3,
label='可行域')
ax.add_patch(rect)
# 绘制优化轨迹
x_hist = np.array(trr.x_history)
ax.plot(x_hist[:, 0], x_hist[:, 1], 'ro-', markersize=6,
linewidth=2, label='优化轨迹')
ax.plot(x_hist[0, 0], x_hist[0, 1], 'go', markersize=12,
label='起点', markeredgecolor='black', markeredgewidth=2)
ax.plot(x_hist[-1, 0], x_hist[-1, 1], 'b*', markersize=15,
label='终点', markeredgecolor='black', markeredgewidth=2)
# 标记无约束最优解
ax.plot(2, 1, 'g+', markersize=15, markeredgewidth=2,
label='无约束最优解 (2, 1)')
ax.set_xlim(-0.5, 3)
ax.set_ylim(-0.5, 2)
ax.set_xlabel('x₁', fontsize=12)
ax.set_ylabel('x₂', fontsize=12)
ax.set_title('信赖域反射法 - 边界约束优化', fontsize=14)
ax.legend(fontsize=10, loc='upper left')
ax.grid(True, alpha=0.3)
ax.set_aspect('equal', adjustable='box')
plt.tight_layout()
plt.savefig('trust_region_boundary.png', dpi=150, bbox_inches='tight')
print("已保存: trust_region_boundary.png")
plt.close()
print(f"\n边界约束优化结果:")
print(f" 最优解: ({x_opt[0]:.6f}, {x_opt[1]:.6f})")
print(f" 最优值: {f_opt:.6f}")
print(f" 迭代次数: {len(trr.x_history)-1}")
# ==================== 主程序 ====================
if __name__ == "__main__":
print("=" * 60)
print("信赖域方法演示")
print("=" * 60)
# 1. 二次函数测试
print("\n1. 二次函数优化")
print("-" * 40)
x0 = np.array([2.0, 2.0])
tr = TrustRegion(subproblem_solver=SubproblemSolver.DOGLEG,
max_iter=50, tol=1e-8)
x_opt, f_opt = tr.optimize(quadratic_function, x0,
quadratic_grad, quadratic_hessian)
print(f"初始点: {x0}")
print(f"最优解: {x_opt}")
print(f"最优值: {f_opt:.8f}")
print(f"迭代次数: {len(tr.x_history)-1}")
print(f"真实最优解: [0.18181818, 0.54545455]")
plot_optimization_trajectory(tr, quadratic_function, "Quadratic",
x_range=(-1, 3), y_range=(-1, 3),
true_optimum=[0.1818, 0.5455])
# 2. Rosenbrock函数
print("\n2. Rosenbrock函数优化")
print("-" * 40)
x0 = np.array([-1.5, 1.5])
tr = TrustRegion(subproblem_solver=SubproblemSolver.DOGLEG,
max_iter=500, tol=1e-6)
x_opt, f_opt = tr.optimize(rosenbrock, x0, rosenbrock_grad)
print(f"初始点: {x0}")
print(f"最优解: {x_opt}")
print(f"最优值: {f_opt:.8f}")
print(f"迭代次数: {len(tr.x_history)-1}")
print(f"真实最优解: [1, 1]")
plot_optimization_trajectory(tr, rosenbrock, "Rosenbrock",
x_range=(-2, 2), y_range=(-1, 3),
true_optimum=[1, 1])
# 3. Beale函数
print("\n3. Beale函数优化")
print("-" * 40)
x0 = np.array([2.0, 2.0])
tr = TrustRegion(subproblem_solver=SubproblemSolver.DOGLEG,
max_iter=200, tol=1e-6)
x_opt, f_opt = tr.optimize(beale, x0, beale_grad)
print(f"初始点: {x0}")
print(f"最优解: {x_opt}")
print(f"最优值: {f_opt:.8f}")
print(f"迭代次数: {len(tr.x_history)-1}")
print(f"真实最优解: [3, 0.5]")
plot_optimization_trajectory(tr, beale, "Beale",
x_range=(-1, 4), y_range=(-1, 4),
true_optimum=[3, 0.5])
# 4. Himmelblau函数
print("\n4. Himmelblau函数优化")
print("-" * 40)
x0 = np.array([-2.0, 2.0])
tr = TrustRegion(subproblem_solver=SubproblemSolver.DOGLEG,
max_iter=100, tol=1e-6)
x_opt, f_opt = tr.optimize(himmelblau, x0, himmelblau_grad)
print(f"初始点: {x0}")
print(f"最优解: {x_opt}")
print(f"最优值: {f_opt:.8f}")
print(f"迭代次数: {len(tr.x_history)-1}")
plot_optimization_trajectory(tr, himmelblau, "Himmelblau",
x_range=(-5, 5), y_range=(-5, 5))
# 5. 不同求解器比较
print("\n5. 不同子问题求解器比较")
print("-" * 40)
plot_convergence_comparison()
# 6. 信赖域比率分析
print("\n6. 信赖域比率分析")
print("-" * 40)
plot_ratio_history()
# 7. 边界约束优化
print("\n7. 边界约束优化")
print("-" * 40)
demo_boundary_constraints()
print("\n" + "=" * 60)
print("信赖域方法演示完成!")
print("=" * 60)
AtomGit 是由开放原子开源基金会联合 CSDN 等生态伙伴共同推出的新一代开源与人工智能协作平台。平台坚持“开放、中立、公益”的理念,把代码托管、模型共享、数据集托管、智能体开发体验和算力服务整合在一起,为开发者提供从开发、训练到部署的一站式体验。
更多推荐
所有评论(0)