p39

1. Stochastic optimal control problem

Consider the following problem: an agent needs to make decisions on consumption and investment.
Time: $[0,T]$;
Risk-free interest rate: $r$;
Price of a risky asset: $\{S_t : t \in [0,T]\}$, with
$$dS(t) = \mu S(t)\,dt + \sigma S(t)\,dW(t)$$
Initial wealth: $X_0 = x$
Wealth: $\{X_t : t \in [0,T]\}$
Consumption: $\{c_t : t \in [0,T]\}$ (investment and consumption are considered together)
Portfolio weights: (1) risky asset $\omega_t$; (2) money account $1-\omega_t$
$$
\begin{aligned}
dX(t) &= \frac{\omega_t X_t}{S_t}\,dS(t) + (1-\omega_t) X_t r\,dt - c_t\,dt \\
&= \left[ r X_t - c_t + \omega_t X_t (\mu - r) \right] dt + \omega_t X_t \sigma\,dW(t)
\end{aligned}
$$
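To make the wealth dynamics concrete, here is a minimal Euler-Maruyama sketch of a single wealth path. It is only an illustration: the function name `simulate_wealth`, the parameter values, and the choice of a constant weight `omega` and a constant consumption rate are assumptions, not part of the notes.

```python
import math
import random


def simulate_wealth(x0, r, mu, sigma, omega, consumption, T=1.0, n_steps=250, seed=0):
    """Euler-Maruyama path of dX = [rX - c + omega*X*(mu - r)] dt + omega*X*sigma dW.

    omega and consumption are held constant (an illustrative assumption).
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        drift = r * x - consumption + omega * x * (mu - r)
        diffusion = omega * x * sigma
        x = x + drift * dt + diffusion * dW
        path.append(x)
    return path


# Hypothetical parameters: half of the wealth in the risky asset, constant consumption.
demo_path = simulate_wealth(x0=100.0, r=0.03, mu=0.08, sigma=0.2, omega=0.5, consumption=4.0)
print(f"terminal wealth on this path: {demo_path[-1]:.2f}")
```

With `omega=0` and `consumption=0` the noise term vanishes and the path reduces to deterministic growth at rate $r$, which is a quick sanity check on the drift.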

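Utility function
The agent chooses consumption and investment to maximize
$$\mathbb{E}\left[ \int_{0}^{T} U(t, c_t)\,dt + \Phi(X_T) \right]$$
Note: this is the standard form of the classical continuous-time consumption-investment problem (the Merton problem).
1. Economic meaning: utility comes from consumption, not from the stock price. In this model the stock price $S_t$ does not itself make the agent better off; what does is consumption (food, housing, entertainment) and terminal wealth (a bequest, retirement assets, and so on).
2. The core question of the Merton model is how to allocate wealth optimally between the risky asset and consumption, so $S_t$ has already entered the problem through the wealth dynamics.

$\Phi(X_T)$ is the terminal utility: the money left over at the final time $T$ also has value.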

Utility maximization problem
$$
\begin{aligned}
\max_{\{c_{t},\omega_{t}\}_{t\in[0,T]}}\quad &\mathbb{E}\left[ \int_{0}^{T}U(t,c_{t})\,dt+\Phi(X_{T})\right] \\
\text{s.t.}\quad &dX_{t} = \left[ r X_t - c_t + \omega_t X_t (\mu - r) \right] dt + \omega_t X_t \sigma\,dW(t), \\
&X_{0}=x, \\
&c_{t}\geq 0, \\
&0\leq \omega_{t}\leq 1.
\end{aligned}
$$
Stochastic optimal control problem (general form)
$$
\begin{aligned}
\sup_{\{u_{t}\}_{t\in[0,T]}\in U}\quad &\mathbb{E}\left[ \int_{0}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\right] \\
\text{s.t.}\quad &dX_{t}^{u}=\mu\left( t,X_{t}^{u},u_{t}\right) dt+\sigma\left( t,X_{t}^{u},u_{t}\right) dW(t), \\
&X_{0}^{u}=x.
\end{aligned}
$$
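Remarks:

  • $\max$ is replaced by $\sup$ because the maximum may not be attained.
  • $\{u_{t}\}_{t\in[0,T]}$ is the control process. It may be a vector of decisions; in the problem above it contains $c_t$ and $\omega_t$.
  • $U$ is the decision space, i.e. the control constraints.
  • $X_t^u$ is the state process (for example a stock price, or the wealth in the problem above). It depends on $x$ and $u$: the state is driven by the control.
  • $G$ is the terminal reward or penalty: at the final time it may be a gain or a penalty.
  • $F$ is the running reward or penalty, earned at each instant.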

p40

2. Bellman optimality condition

The objective is the running reward at each instant plus the terminal reward.
We introduce two concepts.
1. Performance criterion: the value of the objective function attained by a given control process.
Suppose the current time is $t$, the state is $x$ and the control is $u$. The payoff accumulated from time $t$ up to $T$ is
$$J(t,x,u)=\mathbb{E}_{t,x}\left[ \int_{t}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\right]$$
Note: $J$ is a functional. Its input is the whole control process $u=(u_s)_{s\in[t,T]}$ and its output is a number, i.e. $u(\cdot)\mapsto J(u)$. This is the idea of a "function of a function".
2. Value function: the best expected payoff that can be achieved from time $t$ up to $T$.
$$V(t,x)=\sup_{\{u_{\tau}\}_{\tau\in[t,T]}\in U} J(t,x,u)$$
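The definition of $J$ can be sketched numerically by averaging simulated payoffs. The helper `estimate_J` and the toy problem used to exercise it ($dX = u\,dt + 0.5\,dW$, $F=-u^2$, $G(x)=-x^2$) are hypothetical choices made only for illustration; taking the best of a few candidate controls gives a crude lower bound on $V$, not $V$ itself.

```python
import math
import random


def estimate_J(t, x, control, drift, diffusion, F, G, T, n_steps=20, n_paths=4000, seed=0):
    """Monte Carlo estimate of J(t,x,u) = E_{t,x}[ int_t^T F ds + G(X_T) ] on an Euler grid.

    control(s, x) is a feedback rule u_s = control(s, X_s).
    """
    rng = random.Random(seed)
    dt = (T - t) / n_steps
    total = 0.0
    for _ in range(n_paths):
        s, xs, running = t, x, 0.0
        for _ in range(n_steps):
            u = control(s, xs)
            running += F(s, xs, u) * dt          # running reward
            dW = rng.gauss(0.0, math.sqrt(dt))
            xs += drift(s, xs, u) * dt + diffusion(s, xs, u) * dW
            s += dt
        total += running + G(xs)                 # add terminal reward
    return total / n_paths


# Toy problem (assumed): dX = u dt + 0.5 dW, F = -u^2, G = -x^2, start at (t, x) = (0, 1).
toy = dict(
    drift=lambda s, x, u: u,
    diffusion=lambda s, x, u: 0.5,
    F=lambda s, x, u: -u * u,
    G=lambda x: -x * x,
    T=1.0,
)
j_zero = estimate_J(0.0, 1.0, lambda s, x: 0.0, **toy)   # do nothing
j_pull = estimate_J(0.0, 1.0, lambda s, x: -x, **toy)    # pull the state towards 0
print(f"J(u=0) ~ {j_zero:.3f}, J(u=-x) ~ {j_pull:.3f}")
print(f"lower bound on V(0,1) from these two controls: {max(j_zero, j_pull):.3f}")
```

For the do-nothing control the exact value is $-(x^2+\sigma^2(T-t)) = -1.25$, which gives a reference point for the Monte Carlo error.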

Bellman optimality condition
Preliminary: the Law of Iterated Expectations (LIE), also called the tower property: $\mathbb{E}\left[ Y\right] =\mathbb{E}\left[ \mathbb{E}\left( Y\mid X\right) \right]$
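A tiny exact check of the tower property may help. The two-dice setup below is an assumed example (it does not come from the notes): $X$ is the first die and $Y$ is the sum of both dice.

```python
from fractions import Fraction
from itertools import product

# Two fair dice: X is the first die, Y is the sum of both.
outcomes = list(product(range(1, 7), repeat=2))
p = Fraction(1, len(outcomes))

# Direct expectation E[Y].
e_y = sum(p * (a + b) for a, b in outcomes)


def cond_exp_given_x(a):
    """Inner step: E[Y | X = a], averaging over the second die only."""
    rows = [(a2, b) for a2, b in outcomes if a2 == a]
    return Fraction(sum(a2 + b for a2, b in rows), len(rows))


# Outer step: average the conditional expectations over the law of X.
e_e_y_given_x = sum(Fraction(1, 6) * cond_exp_given_x(a) for a in range(1, 7))

print(e_y, e_e_y_given_x)
```

Both numbers coincide, which is exactly the identity used in the derivation: first average over what is still unknown given $X$, then average over $X$.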

Remark: in the derivation below, the inner term of the third equality cannot be written as the plain expectation $\mathbb{E}\left[ \int_{\tau}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\right]$. The underlying reason in stochastic control problems is as follows. $Y$ represents the future payoff over $[\tau,T]$. From the perspective of time $t$, it depends on a completely random path. At time $\tau$, however, the state $X_{\tau}^{u}$ is known given $\mathcal{F}_{\tau}$. We therefore first condense the future random payoff into an $\mathcal{F}_{\tau}$-measurable conditional expectation, observable at time $\tau$, and only then take the outer expectation $\mathbb{E}_{t,x}$.

$$J(t,x,u)=\mathbb{E}_{t,x}\left[ \int_{t}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\right]$$
First split the time interval $[t,T]$ into $[t,\tau]$ and $[\tau,T]$:
$$=\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+\int_{\tau}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\right]$$
Why is a conditional expectation inserted in the next step?
1. Seen from time $t$, the whole future is random, so wrapping everything in the outer $\mathbb{E}_{t,x}$ is fine. 2. $\mathcal{F}_{\tau}$ denotes the information up to time $\tau$. Seen from $\tau$, what happened on $[t,\tau]$ is already known, so the remaining payoff is written as $\mathbb{E}\left[ \int_{\tau}^{T}F\,ds+G\mid \mathcal{F}_{\tau}\right]$. By the tower property:
$$=\mathbb{E}_{t,x}\left\{ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+\mathbb{E}\left[ \int_{\tau}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\mid \mathcal{F}_{\tau}\right] \right\}$$

$$=\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+J(\tau,X_{\tau}^{u},u)\right]$$

$$\leq \mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right]$$

$$\leq \sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right]$$

Taking the supremum over $u$ on the left-hand side as well:
$$
\begin{aligned}
\sup_{\{u_{s}\}_{s\in[t,T]}}J(t,x,u) &\leq \sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right] \\
V(t,x) &\leq \sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right]
\end{aligned}
$$
Next we need to show that $\geq$ also holds.
Define an $\epsilon$-optimal control $v^{\epsilon}$ as a control such that
$$V(t,x)\geq J(t,x,v^{\epsilon})\geq V(t,x)-\epsilon.$$
We use this definition for the problem restarted at $(\tau,X_{\tau}^{u})$, so that $J(\tau,X_{\tau}^{u},v^{\epsilon})\geq V(\tau,X_{\tau}^{u})-\epsilon$. Then define a control $\tilde{v}^{\epsilon}$ that follows an arbitrary control $u$ up to time $\tau$ and the $\epsilon$-optimal control afterwards:
$$\tilde{v}^{\epsilon}_{s}=\mathbf{1}_{\{s\leq \tau\}}\,u_{s}+\mathbf{1}_{\{s>\tau\}}\,v^{\epsilon}_{s}.$$

Then we have
$$V(t,x)\geq J(t,x,\tilde{v}^{\epsilon})=\mathbb{E}_{t,x}\left[ \int_{t}^{T}F(s,X_{s}^{\tilde{v}^{\epsilon}},\tilde{v}^{\epsilon}_{s})\,ds+G(X_{T}^{\tilde{v}^{\epsilon}})\right]$$
$$=\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+\int_{\tau}^{T}F(s,X_{s}^{\tilde{v}^{\epsilon}},v^{\epsilon}_{s})\,ds+G(X_{T}^{\tilde{v}^{\epsilon}})\right]$$
By the tower property and the $\epsilon$-optimality of $v^{\epsilon}$ from $(\tau,X_{\tau}^{u})$ onwards,
$$\geq \mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})-\epsilon \right]$$
Because $u$ is arbitrary on $[t,\tau]$, we may take the supremum over $u$:
$$V(t,x)\geq \sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})-\epsilon \right]$$

Let $\epsilon \rightarrow 0$, then we have

$$V(t,x)\geq \sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right]$$

$$
\begin{cases}
V(t,x) \leq \sup\limits_{\{u_{s}\}_{s\in[t,T]}} \mathbb{E}_{t,x}\left[ \int_{t}^{\tau} F(s,X_{s}^{u},u_{s})\,ds + V(\tau,X_{\tau}^{u}) \right] \\[6pt]
V(t,x) \geq \sup\limits_{\{u_{s}\}_{s\in[t,T]}} \mathbb{E}_{t,x}\left[ \int_{t}^{\tau} F(s,X_{s}^{u},u_{s})\,ds + V(\tau,X_{\tau}^{u}) \right]
\end{cases}
$$

$$V(t,x)=\sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right]$$

This is called the Bellman optimality condition.

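In other words: partway through the decision horizon, switching to the optimal policy from that point on still delivers the optimal payoff.

Note: the optimal value over $[t,T]$ equals the running reward earned from the decisions on the first segment $t\to\tau$, plus the optimal future value obtained by continuing with the optimal policy after $\tau$.
If a policy is optimal overall, then viewed from any future time its remaining part must also be optimal. Otherwise the remaining part could be improved, and the policy as a whole would not be optimal.

Note 2: $V(t,x)$ is the maximum expected total payoff that can be obtained from now until the terminal time $T$, given that at time $t$ the system is in state $x$.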
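The principle can be checked on a small discrete-time analogue. The sketch below is an assumed toy model, not the continuous-time problem of these notes: three states, two controls, a coin-flip shock, and three time steps. It computes the value by backward induction (the Bellman recursion) and, independently, by evaluating every feedback policy and keeping the best one.

```python
from itertools import product

STATES = (0, 1, 2)
CONTROLS = (0, 1)
HORIZON = 3
NOISE = ((-1, 0.5), (0, 0.5))  # (shock, probability)


def step(x, u, w):
    """Next state: move by control plus shock, clipped to the state grid."""
    return min(max(x + u + w, STATES[0]), STATES[-1])


def running_reward(x, u):
    return -0.3 * u  # pushing the state up costs effort


def terminal_reward(x):
    return float(x)


def expected_next(values, x, u):
    return sum(prob * values[step(x, u, w)] for w, prob in NOISE)


def value_by_dp():
    """Backward induction: V_t(x) = max_u { F(x,u) + E[V_{t+1}(X')] }."""
    v = {x: terminal_reward(x) for x in STATES}
    for _ in range(HORIZON):
        v = {x: max(running_reward(x, u) + expected_next(v, x, u) for u in CONTROLS)
             for x in STATES}
    return v


def value_by_enumeration():
    """Evaluate every feedback policy (t, x) -> u; keep the best J per initial state."""
    keys = [(t, x) for t in range(HORIZON) for x in STATES]
    best = {x: float("-inf") for x in STATES}
    for choice in product(CONTROLS, repeat=len(keys)):
        policy = dict(zip(keys, choice))
        j = {x: terminal_reward(x) for x in STATES}
        for t in reversed(range(HORIZON)):
            j = {x: running_reward(x, policy[(t, x)]) + expected_next(j, x, policy[(t, x)])
                 for x in STATES}
        for x in STATES:
            best[x] = max(best[x], j[x])
    return best


print(value_by_dp())
print(value_by_enumeration())
```

The two dictionaries agree: optimizing step by step with the continuation value $V$ gives the same result as optimizing over whole policies at once, which is what the Bellman optimality condition asserts.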

3. HJB equation

Methods: 1. guess and verify: guess a value function and then check it; 2. value function iteration.
The Bellman optimality condition alone is not enough. It has to be processed further, which leads to the HJB equation.

Apply Itô's formula to $V(\tau,X_{\tau}^{u})$ with arbitrary $u$:
$$d\left[ V(\tau,X_{\tau}^{u})\right] =\frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}\,d\tau+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\,dX_{\tau}^{u}+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\,(dX_{\tau}^{u})^{2}$$
The interval $[t,T]$ is split into $[t,\tau]$ and $(\tau,T]$, with a "fixed" (arbitrary) policy on the first piece and the "optimal" policy on the second. The first piece is already fixed; the variation is in the second.
$\frac{\partial V}{\partial t}$ denotes the rate of change of the value function with respect to its time argument. It is evaluated at $\tau$, which is the variable that actually moves.
$$
\begin{aligned}
=&\left( \frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \\
&+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\sigma(\tau,X_{\tau}^{u},u_{\tau})\,dW_{\tau}
\end{aligned}
$$

Calculation:

$$dX_{\tau}^{u}=\mu(\tau,X_{\tau}^{u},u_{\tau})\,d\tau+\sigma(\tau,X_{\tau}^{u},u_{\tau})\,dW_{\tau}$$
In short, $dX_{\tau}^{u}=\mu(\cdot)\,d\tau+\sigma(\cdot)\,dW_{\tau}$ and $(dX_{\tau}^{u})^{2}=\sigma^{2}(\cdot)\,d\tau$, so that $V_{x}\,dX_{\tau}^{u}=V_{x}\left[ \mu(\cdot)\,d\tau+\sigma(\cdot)\,dW_{\tau}\right]$.

$$
\begin{aligned}
d\left[ V(\tau,X_{\tau}^{u})\right] &=V_{t}\,d\tau+V_{x}\mu(\cdot)\,d\tau+V_{x}\sigma(\cdot)\,dW_{\tau}+\frac{1}{2}V_{xx}\sigma^{2}(\cdot)\,d\tau \\
&=\left[ V_{t}+V_{x}\mu(\cdot)+\frac{1}{2}V_{xx}\sigma^{2}(\cdot)\right] d\tau+V_{x}\sigma(\cdot)\,dW_{\tau}
\end{aligned}
$$
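A quick numerical sanity check of Itô's formula, and in particular of the second-order term $\frac{1}{2}V_{xx}\sigma^{2}$, can be sketched as follows. The test function $V(x)=x^{2}$, the constant coefficients, and the name `ito_check` are assumptions chosen only to keep the example short.

```python
import math
import random


def ito_check(x0=1.0, mu=0.1, sigma=0.4, T=1.0, n_steps=2000, seed=0):
    """Compare V(X_T) - V(x0) with the Ito expansion for V(x) = x**2.

    Dynamics: dX = mu dt + sigma dW with constant coefficients (illustrative assumption).
    Here V_t = 0, V_x = 2x, V_xx = 2.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    x = x0
    drift_part = 0.0        # integral of (V_x*mu + 0.5*V_xx*sigma^2) d tau
    first_order_only = 0.0  # same integral WITHOUT the second-order term
    noise_part = 0.0        # Ito integral of V_x*sigma dW
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        drift_part += (2 * x * mu + 0.5 * 2 * sigma**2) * dt
        first_order_only += 2 * x * mu * dt
        noise_part += 2 * x * sigma * dW
        x += mu * dt + sigma * dW
    lhs = x**2 - x0**2
    return lhs, drift_part + noise_part, first_order_only + noise_part


lhs, with_ito_term, without_ito_term = ito_check()
print(f"V(X_T) - V(x0)         = {lhs:.4f}")
print(f"Ito expansion          = {with_ito_term:.4f}")
print(f"ordinary chain rule    = {without_ito_term:.4f}")
```

On a fine grid the full expansion tracks the left-hand side closely, while the ordinary chain rule is off by roughly $\sigma^{2}T$, the contribution of $(dX)^{2}=\sigma^{2}\,d\tau$.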

Integrate both sides from $t$ to $t+h$:
$$
\begin{aligned}
&V(t+h,X_{t+h}^{u})-V(t,X_{t}^{u}) \\
=&\int_{t}^{t+h}\left( \frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \\
&+\int_{t}^{t+h}\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\sigma(\tau,X_{\tau}^{u},u_{\tau})\,dW_{\tau}
\end{aligned}
$$

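Note: the stochastic integral above is a martingale increment with zero conditional expectation, which is why it disappears once we take expectations. The points below unpack this statement.

1. The Itô integral

In stochastic analysis, the Itô integral is usually written as
$$I_t = \int_0^t X_s\,dW_s$$

where $W_s$ is a standard Brownian motion. Brownian motion can be thought of as the accumulation of countless tiny random shocks with no preferred direction.

2. What does "martingale increment" mean?

  • Martingale: in probability theory a martingale represents a "fair game". If you know all the information up to today, your best prediction of tomorrow's wealth is today's wealth.
  • Increment: the change over a short time interval, $dI_t = X_t\,dW_t$.
  • Conclusion: saying that the Itô integral is a "martingale increment" means that the change of this stochastic integral over a short interval has no "trend". It is driven purely by random shocks, with no built-in tendency to increase or decrease.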

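3. What does "zero conditional expectation" mean?

$$\mathbb{E}[X_t\,dW_t \mid \mathcal{F}_s] = 0 \quad (\text{for } s < t)$$

Here $\mathcal{F}_s$ represents all the information known up to time $s$.

  • Intuition: we do not know where $dW_t$ will move, but given all current information its upward and downward moves balance out.

4. Practical use of this property

When working with Itô's lemma or solving stochastic differential equations (SDEs), this property is very powerful:

  1. Simplifying computations: when we take expectations on both sides of an SDE, all Itô integral terms (the martingale increments) drop out, since their expectation is 0.
  2. Extracting the trend: it lets us separate the "deterministic trend" (drift) from the "pure random fluctuation" (diffusion).

In summary:
the random fluctuation represented by the Itô integral is pure noise and contains no systematic bias that could be predicted in advance. If at this moment you forecast the change of the integral over the next instant, your best estimate is 0.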
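The zero-mean property can be illustrated by simulation. The sketch below uses the integrand $X_s = W_s$, i.e. the integral $\int_0^T W_s\,dW_s$; this choice and the function names are assumptions made for illustration. The integrand is evaluated at the left endpoint of each step, which is what makes the sum an Itô sum.

```python
import math
import random


def ito_integral_W_dW(T=1.0, n_steps=50, rng=None):
    """Left-endpoint (Ito) sum of int_0^T W_s dW_s along one simulated path.

    Returns the Ito sum, W_T, and the realized sum of squared increments.
    """
    if rng is None:
        rng = random.Random(0)
    dt = T / n_steps
    w, ito_sum, quad_var = 0.0, 0.0, 0.0
    for _ in range(n_steps):
        dW = rng.gauss(0.0, math.sqrt(dt))
        ito_sum += w * dW      # integrand taken BEFORE the increment is revealed
        quad_var += dW * dW
        w += dW
    return ito_sum, w, quad_var


def mean_ito_integral(n_paths=5000, seed=1):
    """Average of the Ito sum over many independent paths."""
    rng = random.Random(seed)
    return sum(ito_integral_W_dW(rng=rng)[0] for _ in range(n_paths)) / n_paths


print(f"sample mean of the Ito integral: {mean_ito_integral():+.4f}")
```

Each individual path gives a nonzero value, but the average over paths is close to 0. On every path the discrete sum also satisfies the algebraic identity $\sum W_i\,\Delta W_i = \frac{1}{2}\left(W_T^2 - \sum (\Delta W_i)^2\right)$, the discrete counterpart of $\int_0^T W\,dW = \frac{1}{2}(W_T^2 - T)$.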

Apply the expectation operator $\mathbb{E}_{t,x}$. The stochastic integral has zero expectation, so
$$\mathbb{E}_{t,x}\left[ V(t+h,X_{t+h}^{u})\right] -V(t,X_{t}^{u})$$
$$=\mathbb{E}_{t,x}\left[ \int_{t}^{t+h}\left( \frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \right]$$

In $V(t+h,X_{t+h}^{u})-V(t,X_{t}^{u})$, the key point is that $X_{t+h}^{u}$ inside $V(t+h,X_{t+h}^{u})$ is a random variable. At time $t$ both the time and the state are known, but the state at the future time $t+h$ is not.

$$\mathbb{E}_{t,x}\left[ V(t+h,X_{t+h}^{u})\right] =V(t,X_{t}^{u})+\mathbb{E}_{t,x}\left[ \int_{t}^{t+h}\left( \frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \right]$$
Equivalently, inside $\mathbb{E}_{t,x}$ we may replace $V(t+h,X_{t+h}^{u})$ by
$$V(t,X_{t}^{u})+\int_{t}^{t+h}\left( \frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau,$$
because the omitted $dW_{\tau}$ integral contributes nothing in expectation.

By the Bellman optimality condition

$$V(t,x)=\sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{\tau}F(s,X_{s}^{u},u_{s})\,ds+V(\tau,X_{\tau}^{u})\right]$$
Here we replace $\tau$ by $t+h$ and then drop the $\sup$ by fixing an arbitrary control $u$; note that the equality then becomes "$\geq$".
$$V(t,x)=\sup_{\{u_{s}\}_{s\in[t,T]}}\mathbb{E}_{t,x}\left[ \int_{t}^{t+h}F(s,X_{s}^{u},u_{s})\,ds+V(t+h,X_{t+h}^{u})\right]$$
$$\geq \mathbb{E}_{t,x}\left[ \int_{t}^{t+h}F(s,X_{s}^{u},u_{s})\,ds+V(t+h,X_{t+h}^{u})\right]$$
Substituting the expression for $V(t+h,X_{t+h}^{u})$ obtained from Itô's formula (the $dW$ integral has zero expectation and is omitted):
$$
\begin{aligned}
V(t,x) \geq\ &\mathbb{E}_{t,x}\left[ \int_{t}^{t+h}F(s,X_{s}^{u},u_{s})\,ds+V(t,x)\right. \\
&\left. +\int_{t}^{t+h}\left( \frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \right]
\end{aligned}
$$

$$=V(t,x)+\mathbb{E}_{t,x}\left[ \int_{t}^{t+h}\left( F(\tau,X_{\tau}^{u},u_{\tau})+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \right]$$
Then we have
$$0\geq \mathbb{E}_{t,x}\left[ \int_{t}^{t+h}\left( F(\tau,X_{\tau}^{u},u_{\tau})+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \right]$$

Divide both sides by $h$:
$$0\geq \mathbb{E}_{t,x}\left[ \frac{1}{h}\int_{t}^{t+h}\left( F(\tau,X_{\tau}^{u},u_{\tau})+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial t}+\frac{\partial V(\tau,X_{\tau}^{u})}{\partial X}\mu(\tau,X_{\tau}^{u},u_{\tau})+\frac{1}{2}\frac{\partial^{2}V(\tau,X_{\tau}^{u})}{\partial X^{2}}\sigma^{2}(\tau,X_{\tau}^{u},u_{\tau})\right) d\tau \right]$$
Let $h\rightarrow 0$.
In short, this step uses the mean value theorem for integrals together with continuity, and finally the intermediate point $\xi$ tends to $t$.
Supplement:
Mean value theorem for integrals: for a continuous $f$, $\frac{1}{h}\int_{t}^{t+h} f(\tau)\,d\tau=f(\xi)$ for some $\xi \in [t, t+h]$.
Abbreviating the integrand as $H$, we have $0 \ge \mathbb{E}_{t,x}\left[ \frac{1}{h}\int_{t}^{t+h} H(\tau, X_{\tau}^{u}, u_{\tau})\,d\tau \right]$, with
$$\frac{1}{h}\int_{t}^{t+h} H(\tau, X_{\tau}^{u}, u_{\tau})\,d\tau = H(\xi_{h}, X_{\xi_{h}}^{u}, u_{\xi_{h}})$$
As $h\rightarrow 0$ we have $\xi_{h}\rightarrow t$, so the right-hand side tends to $H(t, X_{t}^{u}, u_{t})$ and
$$\mathbb{E}_{t,x} [H(t, X_{t}^{u}, u_{t})] \le 0$$
Here $\mathbb{E}_{t,x}$ means that at time $t$ the state of the system is $x$, i.e. the information at time $t$ is known: $\mathbb{E}( \cdot \mid X_{t} = x)$.
$H(t, X_{t}^{u}, u_{t})$ is therefore a known constant, $\mathbb{E}_{t,x} [H(t, X_{t}^{u}, u_{t})] = H(t, X_{t}^{u}, u_{t})$. Writing $H$ out in full again:

$$0\geq F(t,X_{t}^{u},u_{t})+\frac{\partial V(t,X_{t}^{u})}{\partial t}+\frac{\partial V(t,X_{t}^{u})}{\partial X}\mu(t,X_{t}^{u},u_{t})+\frac{1}{2}\frac{\partial^{2}V(t,X_{t}^{u})}{\partial X^{2}}\sigma^{2}(t,X_{t}^{u},u_{t})$$
The inequality holds for every control and becomes an equality when $u$ is optimal. Writing $x$ for the current state $X_{t}^{u}$ and $u$ for the current control value $u_t$:
$$\sup_{u}\left\{ F(t,x,u)+\frac{\partial V(t,x)}{\partial t}+\frac{\partial V(t,x)}{\partial X}\mu(t,x,u)+\frac{1}{2}\frac{\partial^{2}V(t,x)}{\partial X^{2}}\sigma^{2}(t,x,u)\right\} =0$$
Since $\frac{\partial V}{\partial t}$ does not depend on $u$, it can be taken outside the supremum:
$$\frac{\partial V(t,x)}{\partial t}+\sup_{u}\left\{ F(t,x,u)+\frac{\partial V(t,x)}{\partial X}\mu(t,x,u)+\frac{1}{2}\frac{\partial^{2}V(t,x)}{\partial X^{2}}\sigma^{2}(t,x,u)\right\} =0$$
Also notice the terminal condition (the terminal reward):
$$V(T,x)=G(x)$$
Then we have the following partial differential equation
$$
\begin{cases}
\displaystyle \frac{\partial V(t,x)}{\partial t}+\sup_{u}\left\{ F(t,x,u)+\frac{\partial V(t,x)}{\partial X}\mu(t,x,u)+\frac{1}{2}\frac{\partial^{2}V(t,x)}{\partial X^{2}}\sigma^{2}(t,x,u)\right\} =0 \\[4pt]
V(T,x) = G(x)
\end{cases}
$$
This is called the Hamilton-Jacobi-Bellman (HJB) equation.
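As a worked instance, the HJB equation can be specialized to the consumption-investment problem of Section 1 by taking the state $x$ to be wealth, the control $u=(c,\omega)$, drift $rx-c+\omega x(\mu-r)$, diffusion $\omega x\sigma$, $F=U(t,c)$ and $G=\Phi$. The first-order conditions in the last line rest on assumptions not stated in the notes: $V$ is smooth and concave in $x$, $U$ is differentiable in $c$, and the maximum is interior, so that the constraints $c\ge 0$, $0\le\omega\le 1$ are not binding.

$$
\begin{aligned}
&\frac{\partial V}{\partial t}(t,x)+\sup_{c\geq 0,\ 0\leq \omega \leq 1}\left\{ U(t,c)+\frac{\partial V}{\partial x}(t,x)\left[ rx-c+\omega x(\mu-r)\right] +\frac{1}{2}\frac{\partial^{2}V}{\partial x^{2}}(t,x)\,\omega^{2}x^{2}\sigma^{2}\right\} =0, \\
&V(T,x)=\Phi(x), \\
&\frac{\partial U}{\partial c}(t,c^{\ast})=\frac{\partial V}{\partial x}(t,x),\qquad \omega^{\ast}=-\frac{(\mu-r)\,\partial V/\partial x}{\sigma^{2}x\,\partial^{2}V/\partial x^{2}}.
\end{aligned}
$$

The condition on $c^{\ast}$ says that the marginal utility of consumption equals the marginal value of wealth. The expression for $\omega^{\ast}$ follows from setting the derivative of the bracket with respect to $\omega$ to zero.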

4. Solving the optimal consumption-investment problem with the HJB equation

How to set up and solve a stochastic optimal control problem?
$$
\begin{aligned}
\sup_{\{u_{t}\}_{t\in[0,T]}\in U}\quad &\mathbb{E}\left[ \int_{0}^{T}F(s,X_{s}^{u},u_{s})\,ds+G(X_{T}^{u})\right] \\
\text{s.t.}\quad &dX_{t}^{u}=\mu\left( t,X_{t}^{u},u_{t}\right) dt+\sigma\left( t,X_{t}^{u},u_{t}\right) dW(t), \\
&X_{0}^{u}=x.
\end{aligned}
$$
① Write down the HJB equation (with $x$ the current state and $u$ the current control value)
$$
\begin{cases}
\displaystyle \frac{\partial V(t,x)}{\partial t}+\sup_{u}\left\{ F(t,x,u)+\frac{\partial V(t,x)}{\partial X}\mu(t,x,u)+\frac{1}{2}\frac{\partial^{2}V(t,x)}{\partial X^{2}}\sigma^{2}(t,x,u)\right\} =0 \\[4pt]
V(T,x) = G(x)
\end{cases}
$$
② Solve for $u^{\ast}$ in terms of $V$, i.e. find the maximizer in
$$\sup_{u}\left\{ F(t,x,u)+\frac{\partial V(t,x)}{\partial X}\mu(t,x,u)+\frac{1}{2}\frac{\partial^{2}V(t,x)}{\partial X^{2}}\sigma^{2}(t,x,u)\right\}$$
③ Plug $u^{\ast}$ back into the HJB equation, and then solve for $V$. This is the hard step.
In practice: solve numerically, or guess a solution and verify it.
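The "guess and verify" route can be sketched on a special case. Assume (this is not worked out in the notes) no consumption, $U\equiv 0$, and terminal utility $\Phi(x)=\ln x$. The guess $V(t,x)=\ln x + k\,(T-t)$ with $k=r+\frac{(\mu-r)^{2}}{2\sigma^{2}}$ then solves the HJB equation, with constant weight $\omega^{\ast}=\frac{\mu-r}{\sigma^{2}}$, provided that weight lies in $[0,1]$. The parameter values below are hypothetical and chosen so that it does. The code checks the guess numerically with finite differences and a grid search over $\omega$.

```python
import math

R, MU, SIGMA, T = 0.03, 0.08, 0.3, 1.0   # hypothetical market parameters
K = R + (MU - R) ** 2 / (2 * SIGMA**2)


def V(t, x):
    """Guessed value function for terminal utility ln(x) and no consumption."""
    return math.log(x) + K * (T - t)


def hjb_residual(t, x, n_grid=2001, eps=1e-4):
    """Return (residual, best_omega) where residual = V_t + max over omega of the bracket."""
    # Central finite differences for the partial derivatives of the guess.
    v_t = (V(t + eps, x) - V(t - eps, x)) / (2 * eps)
    v_x = (V(t, x + eps) - V(t, x - eps)) / (2 * eps)
    v_xx = (V(t, x + eps) - 2 * V(t, x) + V(t, x - eps)) / eps**2

    def bracket(omega):
        drift = x * (R + omega * (MU - R))   # wealth drift with c = 0
        diffusion = omega * x * SIGMA
        return v_x * drift + 0.5 * v_xx * diffusion**2

    omegas = [i / (n_grid - 1) for i in range(n_grid)]  # control constraint 0 <= omega <= 1
    best_omega = max(omegas, key=bracket)
    return v_t + bracket(best_omega), best_omega


residual, omega_star = hjb_residual(t=0.5, x=2.0)
print(f"HJB residual: {residual:.2e}, maximizing omega: {omega_star:.4f}")
print(f"closed-form omega*: {(MU - R) / SIGMA**2:.4f}")
```

A residual near zero at every $(t,x)$, together with $V(T,x)=\ln x$, is what "verify" means in step ③. When no closed-form guess is available, the same residual can serve as the target of a numerical scheme.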
