Understanding Medical World Models Through Pseudocode: State, Action, Transition, and Feedback

damopa · Published 2026-05-20 23:27:57

Many medical AI projects are, at their core, still prediction pipelines:

data = collect_patient_data()
features = extract_features(data)
risk = model.predict(features)
return risk

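A system like this can answer:

What the risk of a given disease is at some future point;
Whether the current image shows an abnormality;
Whether a set of indicators suggests elevated risk;
Which risk stratum a user falls into.

That is certainly valuable.

But if the topic is a "medical world model", the question cannot stop at predict(risk). A world model cares more about:

Given the current state, if some action is taken, how might the system transition? And how does real feedback update the next round of judgment?

In other words, a medical world model is not a bigger model, nor a risk predictor with more data stuffed into it. It moves the medical question from: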

future = predict(state)

to:

next_state = simulate_transition(state, action, evidence)
updated_model = update_with_feedback(state, action, next_state, feedback)

This article tries to break the "medical world model" down into a minimal engineering framework, in terms developers can follow.

1. The core difference between a prediction model and a world model

Start with an ordinary prediction model.

class PredictionModel:
    def predict_risk(self, patient_data):
        features = self.extract_features(patient_data)
        risk_score = self.model.predict(features)
        return risk_score

Its input is data and its output is a risk.

risk = prediction_model.predict_risk(patient_data)
print(risk)

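This structure is a good fit for:

Disease risk prediction;
Image classification;
Prognosis stratification;
Anomaly detection;
Population screening.

But it usually does not explicitly answer:

If a given intervention is taken, what changes?
Why might this action work?
What evidence supports the expected change?
When real feedback disagrees with the expectation, how is the model updated?

A world model needs to introduce an action.

A minimal world model structure can be written as: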

class WorldModel:
    def simulate(self, state, action):
        next_state = self.transition_model(state, action)
        return next_state

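In a medical setting, evidence, safety, and feedback must be added as well. The version below adds a safety gate and evidence; feedback comes in later: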

class MedicalWorldModel:
    def simulate_transition(self, state, action, evidence):
        if not self.safety_gate(state, action):
            return {
                "status": "rejected",
                "reason": "Action failed safety gate"
            }

        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )

        return {
            "status": "hypothesis",
            "current_state": state,
            "action": action,
            "expected_transition": transition,
            "evidence": evidence,
            "uncertainty": transition.uncertainty
        }

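The key point here: a medical world model does not output a "treatment conclusion". It outputs a state-transition hypothesis under mechanistic constraints.

2. Breaking a medical problem into five objects

A medical world model needs at least five kinds of objects:

State      current state
Action     available actions
Transition state transition
Evidence   evidence chain
Feedback   real-world feedback

An abstract interface is a reasonable starting point: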

class MedicalWorldModel:
    def observe_state(self, subject):
        pass

    def define_action(self, intervention):
        pass

    def build_evidence_chain(self, state, action):
        pass

    def simulate_transition(self, state, action, evidence):
        pass

    def collect_feedback(self, subject, action):
        pass

    def update_model(self, state, action, transition, feedback):
        pass

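Compared with an ordinary prediction model, this interface does three more things:

It defines the action explicitly;
It records the evidence explicitly;
It uses feedback to update the next round of judgment.

This is also the core engineering difference between a world model and a prediction model.

3. State: don't reduce a person to a single risk score

Many medical AI systems compress an individual into one risk score: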

{
  "cardiovascular_risk": 0.23
}

That is useful for risk stratification, but not enough for a world model.

A world model needs a more structured state representation. For example:

{
  "subject_id": "anonymous_001",
  "timestamp": "2026-05-20",
  "metabolic_state": {
    "fasting_glucose": 5.6,
    "hba1c": 5.7,
    "fasting_insulin": 12.4,
    "triglycerides": 1.8
  },
  "inflammation_state": {
    "hs_crp": 2.1
  },
  "lifestyle_state": {
    "sleep_duration": 6.2,
    "weekly_exercise_minutes": 90,
    "diet_pattern": "high_refined_carbohydrate"
  },
  "risk_context": {
    "family_history": ["type_2_diabetes"],
    "medications": [],
    "known_conditions": []
  }
}

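Note that this is only an illustrative schema, not a clinical standard.

In a real system, the state schema must satisfy the following:

Data sources are traceable;
Units are explicit;
Timestamps are explicit;
Missing-value handling is explicit;
Variable meanings are explicit;
Unreliable data is never treated as established fact.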
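One way to make these requirements concrete is to attach the metadata to each value rather than to the state as a whole. The sketch below is a minimal illustration under that assumption; the `Measurement` class, its field names, and the units shown are hypothetical and not part of any clinical standard.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Measurement:
    """One observed value plus the metadata needed to trust it (hypothetical schema)."""
    name: str
    value: Optional[float]           # None means "not measured", never a silent 0
    unit: str                        # illustrative unit string, e.g. "mmol/L"
    source: str                      # where the value came from, e.g. "lab_report"
    measured_at: Optional[datetime]  # None means the time of measurement is unknown

    def is_usable(self) -> bool:
        """A value is usable only if it exists and carries unit, source, and time."""
        return (
            self.value is not None
            and bool(self.unit)
            and bool(self.source)
            and self.measured_at is not None
        )


glucose = Measurement("fasting_glucose", 5.6, "mmol/L", "lab_report", datetime(2026, 5, 20))
insulin = Measurement("fasting_insulin", None, "uIU/mL", "self_report", None)

print(glucose.is_usable())  # True
print(insulin.is_usable())  # False
```

Representing a missing value as `None` keeps "not measured" distinct from "measured as zero", which is the distinction the missing-value requirement is after.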

A simple data class can be defined:

from dataclasses import dataclass
from typing import Dict, Any, List

@dataclass
class HealthState:
    subject_id: str
    timestamp: str
    biomarkers: Dict[str, Any]
    lifestyle: Dict[str, Any]
    symptoms: Dict[str, Any]
    medications: List[str]
    context: Dict[str, Any]

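The first step toward a world model is not training a large model. It is defining how state is represented.

4. Action: a medical intervention must become a computable object

In an ordinary chatbot, an intervention may be nothing more than a natural-language suggestion:

Try to improve your sleep, exercise more, and watch your diet.

In a world model, however, an action must be an encodable object.

For example: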

{
  "action_id": "increase_zone2_exercise",
  "type": "lifestyle",
  "target": "weekly_exercise_minutes",
  "change": {
    "from": 90,
    "to": 150
  },
  "duration": "12_weeks",
  "monitoring": ["resting_heart_rate", "sleep_quality", "fasting_glucose"],
  "safety_notes": [
    "requires clinician review if known cardiovascular disease exists"
  ]
}

In Python (here the JSON key "type" becomes action_type, and "change" goes into parameters):

@dataclass
class MedicalAction:
    action_id: str
    action_type: str
    target: str
    parameters: Dict[str, Any]
    duration: str
    monitoring: List[str]
    safety_notes: List[str]

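An action in a medical world model does not have to be a drug. It can also be:

An exercise plan;
A dietary adjustment;
A sleep intervention;
A follow-up strategy;
A testing plan;
A change to medication or supplements;
A behavior change;
Further clinical examination.

Whichever it is, it cannot stay a one-sentence suggestion. It needs:

An action type;
A target;
Parameters;
A time window;
Monitoring indicators;
Safety boundaries;
Evidence sources.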
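The checklist above can be enforced mechanically before an action enters the system. The following is a minimal sketch, assuming actions arrive as plain dictionaries shaped like the JSON example; the `evidence_refs` key and the `missing_action_fields` helper are hypothetical names introduced for illustration.

```python
from typing import Any, Dict, List

# Required keys mirror the checklist; the key names themselves are assumptions.
REQUIRED_ACTION_FIELDS = (
    "action_id", "type", "target", "change",
    "duration", "monitoring", "safety_notes", "evidence_refs",
)


def missing_action_fields(action: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty in an action dict."""
    missing = []
    for field in REQUIRED_ACTION_FIELDS:
        value = action.get(field)
        if value is None or value == "" or value == [] or value == {}:
            missing.append(field)
    return missing


action = {
    "action_id": "increase_zone2_exercise",
    "type": "lifestyle",
    "target": "weekly_exercise_minutes",
    "change": {"from": 90, "to": 150},
    "duration": "12_weeks",
    "monitoring": ["resting_heart_rate", "sleep_quality", "fasting_glucose"],
    "safety_notes": ["requires clinician review if known cardiovascular disease exists"],
}

print(missing_action_fields(action))  # ['evidence_refs']
```

Run against the example action from earlier in this section, the check reports that no evidence source is attached, which is the kind of gap a one-sentence suggestion would hide.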

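5. Evidence: every transition needs an evidence chain

A medical world model cannot explore by trial and error the way a game environment does. It must have an evidence chain.

A minimal evidence object could be designed like this: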

{
  "evidence_id": "evidence_001",
  "claim": "Increasing weekly aerobic exercise may improve insulin sensitivity in selected metabolic-risk populations.",
  "evidence_type": ["clinical_guideline", "peer_reviewed_study", "mechanistic_rationale"],
  "strength": "moderate",
  "applicability": {
    "population_match": "partial",
    "condition_match": "partial",
    "uncertainty": "individual response may vary"
  },
  "limitations": [
    "not a personalized treatment prediction",
    "requires safety screening",
    "effect size depends on baseline state and adherence"
  ]
}

The corresponding Python class:

@dataclass
class EvidenceItem:
    evidence_id: str
    claim: str
    evidence_type: List[str]
    strength: str
    applicability: Dict[str, Any]
    limitations: List[str]

Building the evidence chain:

class EvidenceBuilder:
    def build(self, state: HealthState, action: MedicalAction):
        evidence_items = self.retrieve_relevant_evidence(state, action)
        filtered = self.filter_by_applicability(evidence_items, state)
        return filtered

    def retrieve_relevant_evidence(self, state, action):
        # In production, this should query curated knowledge bases,
        # guidelines, systematic reviews, or trusted literature indexes.
        return []

    def filter_by_applicability(self, evidence_items, state):
        # Filter evidence by population, condition, safety boundary,
        # measurement context, and uncertainty.
        return evidence_items

For medical AI, a transition without evidence is dangerous.

So here is one engineering principle worth remembering:

Don't just generate advice; generate an evidence-bound transition hypothesis.
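To give the applicability filter some substance, here is a minimal sketch that keeps only evidence whose population and condition match reach a threshold. It works on plain dictionaries shaped like the evidence object above. The match levels "none" and "full" and their ordering are assumptions of this sketch; the example object only shows "partial".

```python
from typing import Any, Dict, List

# Ordering of match levels is an assumption made for this sketch.
MATCH_RANK = {"none": 0, "partial": 1, "full": 2}


def filter_by_applicability(
    evidence_items: List[Dict[str, Any]], minimum: str = "partial"
) -> List[Dict[str, Any]]:
    """Keep items whose population and condition match both reach `minimum`.

    Unknown or missing match levels are treated as "none" (fail closed).
    """
    threshold = MATCH_RANK[minimum]
    kept = []
    for item in evidence_items:
        applicability = item.get("applicability", {})
        population = MATCH_RANK.get(applicability.get("population_match"), 0)
        condition = MATCH_RANK.get(applicability.get("condition_match"), 0)
        # The weaker of the two matches decides whether the item applies.
        if min(population, condition) >= threshold:
            kept.append(item)
    return kept


items = [
    {"evidence_id": "evidence_001",
     "applicability": {"population_match": "partial", "condition_match": "partial"}},
    {"evidence_id": "evidence_002",
     "applicability": {"population_match": "none", "condition_match": "full"}},
    {"evidence_id": "evidence_003"},  # no applicability recorded
]

print([e["evidence_id"] for e in filter_by_applicability(items)])  # ['evidence_001']
```

Treating unrecorded applicability as "none" means evidence has to earn its way into the chain, rather than being included by default.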

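6. Transition: the output is a state-transition hypothesis, not a promise of efficacy

Transition is the core of a world model.

It does not say:

This intervention will definitely work.

It says:

Given the current state and the evidence constraints, which state changes this action might lead to, and with what confidence and uncertainty.

It can be defined as: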

@dataclass
class TransitionHypothesis:
    from_state: HealthState
    action: MedicalAction
    expected_changes: Dict[str, Any]
    time_window: str
    evidence: List[EvidenceItem]
    uncertainty: Dict[str, Any]
    safety_flags: List[str]

Example:

{
  "expected_changes": {
    "weekly_exercise_minutes": {
      "direction": "increase",
      "expected_from": 90,
      "expected_to": 150
    },
    "insulin_sensitivity": {
      "direction": "potential_improvement",
      "confidence": "low_to_moderate"
    },
    "fasting_glucose": {
      "direction": "possible_decrease",
      "confidence": "uncertain"
    }
  },
  "time_window": "8_to_12_weeks",
  "uncertainty": {
    "adherence": "unknown",
    "baseline_variability": "high",
    "measurement_noise": "moderate"
  },
  "safety_flags": [
    "screen cardiovascular risk before increasing exercise intensity"
  ]
}

Simulating the transition:

class TransitionModel:
    def estimate(self, state, action, evidence):
        expected_changes = self.estimate_expected_changes(state, action, evidence)
        uncertainty = self.estimate_uncertainty(state, action, evidence)
        safety_flags = self.check_safety_flags(state, action)

        return TransitionHypothesis(
            from_state=state,
            action=action,
            expected_changes=expected_changes,
            # Prefer an explicit time window; fall back to the action's duration.
            time_window=action.parameters.get("time_window", action.duration),
            evidence=evidence,
            uncertainty=uncertainty,
            safety_flags=safety_flags
        )

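Note the naming: this is a TransitionHypothesis, not a TreatmentPlan.

That is a very important safety boundary.

7. Feedback: a world model must be calibratable

Without feedback, a world model is just a more complicated reasoner.

A medical world model needs a closed loop:

observe state → choose action → simulate transition → collect feedback → update model

Pseudocode: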

def world_model_loop(subject, action):
    state_t0 = observe_state(subject)

    evidence = build_evidence_chain(state_t0, action)

    transition_hypothesis = simulate_transition(
        state=state_t0,
        action=action,
        evidence=evidence
    )

    feedback = collect_feedback(
        subject=subject,
        action=action,
        time_window=transition_hypothesis.time_window
    )

    updated_model_state = update_model(
        previous_state=state_t0,
        action=action,
        hypothesis=transition_hypothesis,
        feedback=feedback
    )

    return updated_model_state

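Feedback can include:

Retest results;
Symptom changes;
Wearable-device trends;
Adherence records;
Adverse reactions;
Assessments by clinicians or other professionals;
User-reported outcomes;
Environmental and lifestyle changes.

An example feedback object: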

{
  "feedback_id": "feedback_001",
  "action_id": "increase_zone2_exercise",
  "time_window": "12_weeks",
  "observations": {
    "weekly_exercise_minutes": 145,
    "fasting_glucose": 5.4,
    "sleep_duration": 6.5,
    "subjective_energy": "improved"
  },
  "adherence": "partial",
  "adverse_events": [],
  "notes": "Interpret carefully; multiple concurrent changes existed."
}

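Update logic (this sketch only records the comparison in an audit log; it does not yet change any model parameters):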

class FeedbackUpdater:
    def update(self, previous_state, action, hypothesis, feedback):
        comparison = self.compare_expected_vs_observed(
            expected=hypothesis.expected_changes,
            observed=feedback["observations"]
        )

        audit_log = {
            "previous_state": previous_state,
            "action": action,
            "hypothesis": hypothesis,
            "feedback": feedback,
            "comparison": comparison,
            "update_reason": self.infer_update_reason(comparison)
        }

        return audit_log
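The comparison step is left abstract above. One simple way to fill it in is to check whether each observed value moved in the expected direction relative to its baseline. The sketch below assumes that approach; unlike the method call above, it takes an extra `baseline` argument, and the result labels are hypothetical names chosen for illustration.

```python
from typing import Any, Dict


def compare_expected_vs_observed(
    expected: Dict[str, Dict[str, Any]],
    baseline: Dict[str, float],
    observed: Dict[str, float],
) -> Dict[str, str]:
    """Label each expected change by whether the observed direction agrees."""
    result: Dict[str, str] = {}
    for name, change in expected.items():
        if name not in baseline or name not in observed:
            result[name] = "not_observed"
            continue

        direction = str(change.get("direction", ""))
        if "increase" in direction:
            wanted_sign = 1
        elif "decrease" in direction:
            wanted_sign = -1
        else:
            # e.g. "potential_improvement" has no numeric direction to test.
            result[name] = "direction_not_comparable"
            continue

        delta = observed[name] - baseline[name]
        if delta == 0:
            result[name] = "no_change"
        elif (delta > 0) == (wanted_sign > 0):
            result[name] = "consistent"
        else:
            result[name] = "inconsistent"
    return result


expected = {
    "weekly_exercise_minutes": {"direction": "increase"},
    "insulin_sensitivity": {"direction": "potential_improvement"},
    "fasting_glucose": {"direction": "possible_decrease"},
}
baseline = {"weekly_exercise_minutes": 90, "fasting_glucose": 5.6}
observed = {"weekly_exercise_minutes": 145, "fasting_glucose": 5.4}

print(compare_expected_vs_observed(expected, baseline, observed))
# {'weekly_exercise_minutes': 'consistent', 'insulin_sensitivity': 'not_observed', 'fasting_glucose': 'consistent'}
```

A "consistent" label only says the direction matched. It says nothing about cause, which is why the feedback object's note about concurrent changes still has to be read alongside it.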

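8. Safety Gate: a medical world model must pass the safety gate first

A medical system cannot pursue "model capability" alone. It needs safety boundaries first.

A safety gate can be defined like this: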

class SafetyGate:
    def check(self, state: HealthState, action: MedicalAction):
        checks = [
            self.check_contraindications(state, action),
            self.check_required_clinician_review(state, action),
            self.check_action_intensity(state, action),
            self.check_data_quality(state),
        ]

        # Pass only if every individual check passes.
        return all(checks)

    def check_contraindications(self, state, action):
        # Placeholder: should use curated medical rules and clinician review.
        return True

    def check_required_clinician_review(self, state, action):
        # Some actions should never be autonomous.
        return True

    def check_action_intensity(self, state, action):
        return True

    def check_data_quality(self, state):
        return True

In a medical world model, the safety gate should come before the transition:

def simulate_medical_transition(state, action):
    if not safety_gate.check(state, action):
        return {
            "status": "blocked",
            "reason": "Safety gate failed. Clinician review required."
        }

    evidence = evidence_builder.build(state, action)
    transition = transition_model.estimate(state, action, evidence)

    return transition

Engineering principle:

A medical world model should be safety-first, evidence-bound, and feedback-calibrated.
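A gate whose checks default to passing is fail-open: missing data lets the action through. The sketch below shows the opposite, fail-closed behavior, and returns the reasons alongside the decision. The two rules and the dictionary layout are placeholders invented for illustration; they are not clinical rules.

```python
from typing import Any, Dict, List, Tuple


def check_safety(state: Dict[str, Any], action: Dict[str, Any]) -> Tuple[bool, List[str]]:
    """Return (allowed, reasons). Any missing input blocks the action (fail closed)."""
    reasons: List[str] = []

    # Data quality: the field the action targets must be present in the state.
    target = action.get("target")
    if target is None or state.get("lifestyle", {}).get(target) is None:
        reasons.append("target value missing from state")

    # Clinician review: any known condition routes the action to a human.
    conditions = state.get("known_conditions")
    if conditions is None:
        reasons.append("known_conditions not recorded")
    elif conditions:
        reasons.append("known conditions present; clinician review required")

    return (not reasons, reasons)


action = {"target": "weekly_exercise_minutes"}
state_ok = {"lifestyle": {"weekly_exercise_minutes": 90}, "known_conditions": []}
state_missing = {"lifestyle": {}}

print(check_safety(state_ok, action))
# (True, [])
print(check_safety(state_missing, action))
# (False, ['target value missing from state', 'known_conditions not recorded'])
```

Returning the reasons, not just a boolean, gives the blocked response something specific to show the reviewer.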

9. Audit Log: every simulation should be traceable

Medical AI cannot just output an answer. It needs an audit trail.

An audit log can contain:

{
  "audit_id": "audit_001",
  "timestamp": "2026-05-20T23:00:00+08:00",
  "state_version": "state_v1",
  "action_version": "action_v1",
  "evidence_version": "evidence_v1",
  "transition_version": "transition_v1",
  "model_version": "world_model_v0.1",
  "human_review": {
    "required": true,
    "status": "pending"
  },
  "disclaimer": "Hypothesis-generating only. Not a treatment recommendation."
}

Generating an audit log:

class AuditLogger:
    def log_transition(self, state, action, evidence, transition):
        return {
            "state": state,
            "action": action,
            "evidence": evidence,
            "transition": transition,
            "model_version": self.get_model_version(),
            "human_review_required": True,
            "disclaimer": "Hypothesis-generating only. Not medical advice."
        }

For a medical world model, auditability is not an add-on. It is a core feature.
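Traceability is easier to rely on when a record can show it has not changed since it was written. The sketch below, using only the Python standard library, attaches a SHA-256 digest of the record's canonical JSON form. The function names and record layout are hypothetical, the payload is assumed to be JSON-serializable, and a timestamp is left out to keep the example deterministic. A bare digest only detects accidental or naive edits; anyone who can rewrite the record can also recompute it.

```python
import hashlib
import json
from typing import Any, Dict


def _digest(body: Dict[str, Any]) -> str:
    """SHA-256 over a canonical (key-sorted) JSON encoding of the record body."""
    canonical = json.dumps(body, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_audit_record(payload: Dict[str, Any], model_version: str) -> Dict[str, Any]:
    """Wrap a payload with review flags and a digest of everything else."""
    record = {
        "payload": payload,
        "model_version": model_version,
        "human_review_required": True,
        "disclaimer": "Hypothesis-generating only. Not a treatment recommendation.",
    }
    record["digest"] = _digest(record)
    return record


def verify_audit_record(record: Dict[str, Any]) -> bool:
    """Recompute the digest over all fields except the stored digest itself."""
    body = {key: value for key, value in record.items() if key != "digest"}
    return _digest(body) == record.get("digest")


record = make_audit_record({"action_id": "increase_zone2_exercise"}, "world_model_v0.1")
print(verify_audit_record(record))  # True

record["payload"]["action_id"] = "something_else"
print(verify_audit_record(record))  # False
```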

10. A minimal runnable conceptual framework

Putting the earlier objects together:

class MinimalMedicalWorldModel:
    def __init__(self, safety_gate, evidence_builder, transition_model, feedback_updater, audit_logger):
        self.safety_gate = safety_gate
        self.evidence_builder = evidence_builder
        self.transition_model = transition_model
        self.feedback_updater = feedback_updater
        self.audit_logger = audit_logger

    def run(self, state, action):
        if not self.safety_gate.check(state, action):
            return {
                "status": "blocked",
                "message": "Safety gate failed. Human review required."
            }

        evidence = self.evidence_builder.build(state, action)

        transition = self.transition_model.estimate(
            state=state,
            action=action,
            evidence=evidence
        )

        audit_log = self.audit_logger.log_transition(
            state=state,
            action=action,
            evidence=evidence,
            transition=transition
        )

        return {
            "status": "hypothesis_generated",
            "transition": transition,
            "audit_log": audit_log
        }

    def update_with_feedback(self, previous_state, action, transition, feedback):
        return self.feedback_updater.update(
            previous_state=previous_state,
            action=action,
            hypothesis=transition,
            feedback=feedback
        )

Usage:

state = observe_state(subject)
action = define_action(intervention)

result = medical_world_model.run(state, action)

if result["status"] == "hypothesis_generated":
    feedback = collect_feedback(subject, action)
    updated = medical_world_model.update_with_feedback(
        previous_state=state,
        action=action,
        transition=result["transition"],
        feedback=feedback
    )

This is a heavily simplified medical world model loop:

State → Action → Evidence → Transition → Audit → Feedback → Update
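To see the ordering of the loop actually execute, here is a self-contained sketch with stub components passed in as functions. Everything in it is hypothetical scaffolding: the `run_loop` name, the stubs, and the `trace` list exist only for illustration. Blocking when no evidence is found is an added assumption of this sketch, in line with the earlier point that a transition without evidence is dangerous.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]
Action = Dict[str, Any]


def run_loop(
    state: State,
    action: Action,
    safety_check: Callable[[State, Action], bool],
    build_evidence: Callable[[State, Action], List[str]],
    estimate: Callable[[State, Action, List[str]], Dict[str, str]],
) -> Dict[str, Any]:
    """Run gate -> evidence -> transition -> audit and record which steps ran."""
    trace = ["safety_gate"]
    if not safety_check(state, action):
        return {"status": "blocked", "reason": "safety gate failed", "trace": trace}

    trace.append("evidence")
    evidence = build_evidence(state, action)
    if not evidence:
        # Assumption of this sketch: no evidence means no hypothesis.
        return {"status": "blocked", "reason": "no applicable evidence", "trace": trace}

    trace.append("transition")
    expected = estimate(state, action, evidence)

    trace.append("audit")
    return {
        "status": "hypothesis_generated",
        "expected_changes": expected,
        "evidence": evidence,
        "trace": trace,
    }


# Stub components standing in for real modules.
def allow(state: State, action: Action) -> bool:
    return True


def one_item(state: State, action: Action) -> List[str]:
    return ["evidence_001"]


def no_items(state: State, action: Action) -> List[str]:
    return []


def direction(state: State, action: Action, evidence: List[str]) -> Dict[str, str]:
    target = action["target"]
    return {target: "increase" if action["to"] > state[target] else "no_increase"}


state = {"weekly_exercise_minutes": 90}
action = {"target": "weekly_exercise_minutes", "to": 150}

ok = run_loop(state, action, allow, one_item, direction)
print(ok["status"], ok["trace"])
# hypothesis_generated ['safety_gate', 'evidence', 'transition', 'audit']

blocked = run_loop(state, action, allow, no_items, direction)
print(blocked["status"], blocked["reason"])
# blocked no applicable evidence
```

Passing the components in as plain callables keeps the control flow testable before any real safety rules, evidence store, or transition model exist.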

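11. Where SteeraMed fits in this framework

SteeraMed can be understood as a steerable framework for biomedical world models. Its emphasis is not on "automatically producing a treatment answer" but on organizing a medical AI system into:

States that can be represented;
Actions that can be encoded;
Evidence that can be traced;
State transitions that can be simulated;
A process that can be audited;
A closed loop that can be calibrated by feedback.

In developer terms, SteeraMed is closer to a: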

state-action-transition-evidence-feedback framework

than to a:

chatbot that gives medical advice

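This distinction matters.

From a system-design perspective, the real problem a framework like SteeraMed sets out to solve is moving medical AI from "answering single-turn questions" to "long-term state management and feedback calibration".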

12. Developer Takeaway

To wrap up, here are the key points from a developer's perspective.

1. Don't stop at risk prediction

Risk prediction is valuable, but it is not a world model.

risk = predict(state)

is not enough.

2. Define the action explicitly

Without an action, there is no action-conditioned transition.

next_state = simulate(state, action)

3. A transition should be a hypothesis, not a promise

A medical world model outputs a state-transition hypothesis, not a promise of efficacy.

transition = generate_hypothesis(state, action, evidence)

4. The evidence chain is a core module

A medical system cannot rely only on what the model "feels". Every simulation should have an evidence chain.

evidence = build_evidence_chain(state, action)

5. The feedback loop determines whether the model can be calibrated

Without a closed feedback loop, a world model cannot keep improving.

model.update(feedback)

6. The safety gate comes first

A medical world model must pass the safety gate before any simulation.

if not safety_gate.check(state, action):
    block_and_require_human_review()

7. The audit log is not optional

Every state, action, piece of evidence, transition, and feedback item should be traceable.

audit_logger.log(state, action, evidence, transition, feedback)

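Conclusion

A medical world model is not a bigger prediction model. It is a system framework built around state, action, evidence, transition, and feedback.

For developers, the most important thing is not to train a bigger model first, but to define clearly: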

What is the state?
What is the action?
What evidence supports the transition?
What feedback will update the model?
What safety gate blocks unsafe actions?
What audit log makes the process traceable?

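If medical AI is to move from "identifying risk" to "supporting long-term decisions", architectural questions like these will matter more and more.

References and project links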

Ha, D., & Schmidhuber, J. Recurrent World Models Facilitate Policy Evolution. Advances in Neural Information Processing Systems 31, 2018. https://papers.nips.cc/paper/7512-recurrent-world-models-facilitate-policy-evolution; arXiv version: https://arxiv.org/abs/1803.10122
LeCun, Y. A Path Towards Autonomous Machine Intelligence. OpenReview, 2022. https://openreview.net/forum?id=BZ5a1r-kVsf
Yang, Y., Wang, Z.-Y., Liu, Q., Sun, S., Wang, K., Chellappa, R., Zhou, Z., Yuille, A., Zhu, L., Zhang, Y.-D., & Chen, J. Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv:2506.02327, 2025. https://arxiv.org/abs/2506.02327; project page: https://yijun-yang.github.io/MeWM/
Qazi, M. A., Nadeem, M., & Yaqub, M. Beyond Generative AI: World Models for Clinical Prediction, Counterfactuals, and Planning. arXiv:2511.16333, 2025. https://arxiv.org/abs/2511.16333
Katsoulakis, E., Wang, Q., Wu, H., et al. Digital twins for health: a scoping review. npj Digital Medicine, 7, 77, 2024. https://doi.org/10.1038/s41746-024-01073-0
Emmert-Streib, F., Parkkila, S., Laubenbacher, R., et al. The role of digital twins in P4 medicine: A paradigm for modern healthcare. npj Digital Medicine, 8, 735, 2025. https://doi.org/10.1038/s41746-025-02115-x
Xiong, J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. https://doi.org/10.20944/preprints202605.0366.v1
SteeraMed project site: https://SteeraMed.com
Steerable World project site: https://steerable.world