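Understanding it through engineering structure: why is a systems biology model not the same as a medical world model?
If you have worked on life-science data analysis, systems biology modeling, knowledge graphs, or multi-omics integration, your first reaction to the term "medical world model" may be:
Isn't this just systems biology?
Hasn't life science been modeling gene networks, metabolic pathways, signaling pathways, and disease mechanisms for a long time?
That is a fair question.
If a so-called medical world model merely repackages systems biology, digital twins, and large AI models, it adds little engineering value.
From a system-design perspective, however, the core difference between the two is not the name but the object boundary.
A systems biology model typically focuses on expressing: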
component -> relation -> pathway -> network -> mechanism
A medical world model needs to go further and express:
state + action + evidence -> transition hypothesis -> feedback update
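In other words:
- systems biology mainly helps us understand how a living system works;
- a medical world model additionally has to support intervention reasoning of the form "if a given action is taken, how might the individual's state change?";
- a steerable medical world model also has to build goals, boundaries, evidence chains, human review, and feedback calibration into the system architecture.
This article does not discuss grand concepts; it takes the question apart from a developer's point of view.
1. Engineering boundaries of three model types
Start with a table that separates prediction models, systems biology models, and medical world models.
| Model type | Core question | Main objects | Typical output | Engineering keywords |
|---|---|---|---|---|
| Prediction model | How high is the future risk? | features, labels | risk score, class, probability | classification / regression |
| Systems biology model | How does the living system work? | genes, proteins, pathways, networks | mechanism, network dynamics | graph / ODE / network model |
| Medical world model | After a given action, how might the state change? | state, action, transition, evidence, feedback | transition hypothesis, audit trail | state-action-feedback loop |
An ordinary risk prediction model can be written as: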
risk = predict_risk(patient_features)
A systems biology model may look more like:
network_state = simulate_pathway_dynamics(
    pathway_graph,
    initial_conditions,
    perturbation
)
A medical world model is closer to:
transition = estimate_transition_hypothesis(
    state=current_patient_state,
    action=candidate_intervention,
    evidence=evidence_chain
)
feedback = collect_feedback(
    patient_id=patient_id,
    action=candidate_intervention,
    time_window_weeks=8
)
updated_state = update_state(
    previous_state=current_patient_state,
    action=candidate_intervention,
    transition=transition,
    feedback=feedback
)
The key point here is not that the model form is more complex, but that the system's objects have changed:
from feature prediction
to mechanism modeling
and then to action-conditioned transition reasoning
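2. Systems biology does have actions, but the action is usually not a decision object
One thing needs to be stated first: systems biology does study perturbations and interventions.
It can certainly handle:
- gene knockout;
- drug perturbation;
- pathway activation / inhibition;
- environment change;
- ODE simulation;
- network control;
- multi-omics perturbation response.
So it is too simple to say: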
systems biology has no action
medical world model has action
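That statement is not rigorous.
A more accurate statement is:
Systems biology can study perturbation and response, but a medical world model has to treat the action as a structured object aimed at individual-level decisions, and place it inside the evidence, transition, feedback, and audit loop.
In systems biology, a perturbation may be just one of the model inputs:
result = simulate_network(
    graph=pathway_graph,
    perturbation={"gene_x": "knockout"}
)
In a medical world model, an action is more than a perturbation parameter: it is an object that can be executed, recorded, audited, and followed up with feedback.
For example: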
from dataclasses import dataclass
from typing import List, Dict
@dataclass
class InterventionAction:
    action_id: str
    category: str
    description: str
    target_mechanisms: List[str]
    intensity: str
    duration_weeks: int
    monitoring_markers: List[str]
    safety_constraints: List[str]
Example:
action = InterventionAction(
    action_id="nutrition_low_glycemic_8w",
    category="nutrition",
    description="8-week low-glycemic dietary adjustment",
    target_mechanisms=[
        "postprandial_glucose_variability",
        "insulin_resistance",
        "weight_management"
    ],
    intensity="moderate",
    duration_weeks=8,
    monitoring_markers=[
        "fasting_glucose",
        "hba1c",
        "weight",
        "waist_circumference"
    ],
    safety_constraints=[
        "not a treatment prescription",
        "clinical review required if medication is involved",
        "stop or refer if red flags appear"
    ]
)
This is the difference in engineering structure:
perturbation parameter != intervention action object
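One practical consequence of treating an action as an object is that it can be serialized, stored, and compared later, which a bare perturbation parameter does not support on its own. The sketch below is illustrative only: `ActionRecord` and `to_audit_json` are hypothetical names, and the class is a trimmed-down stand-in for `InterventionAction`, not a replacement for it.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass(frozen=True)
class ActionRecord:
    """Hypothetical, trimmed-down stand-in for InterventionAction."""
    action_id: str
    category: str
    duration_weeks: int
    monitoring_markers: List[str] = field(default_factory=list)


def to_audit_json(action: ActionRecord) -> str:
    """Serialize an action with sorted keys so stored copies compare equal."""
    return json.dumps(asdict(action), sort_keys=True)


# A bare perturbation parameter has no identity, duration, or monitoring plan.
perturbation = {"gene_x": "knockout"}

action = ActionRecord(
    action_id="nutrition_low_glycemic_8w",
    category="nutrition",
    duration_weeks=8,
    monitoring_markers=["fasting_glucose", "hba1c"],
)

serialized = to_audit_json(action)
# Round trip: the stored JSON can be turned back into the same object.
restored = ActionRecord(**json.loads(serialized))
```

Because the record round-trips through JSON, the same `action_id` can later be referenced from feedback and audit entries.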
3. A systems biology model is closer to a mechanism graph
A simplified systems biology model can be represented as a graph.
@dataclass
class BiologicalNode:
    node_id: str
    node_type: str  # gene, protein, metabolite, pathway, phenotype
    name: str
@dataclass
class BiologicalEdge:
    source: str
    target: str
    relation: str  # activates, inhibits, regulates, correlates_with, associated_with
    evidence_strength: str
@dataclass
class MechanismGraph:
    nodes: List[BiologicalNode]
    edges: List[BiologicalEdge]
For example:
mechanism_graph = MechanismGraph(
    nodes=[
        BiologicalNode("n1", "pathway", "insulin_signaling"),
        BiologicalNode("n2", "phenotype", "glucose_variability"),
        BiologicalNode("n3", "phenotype", "fatigue")
    ],
    edges=[
        BiologicalEdge(
            source="n1",
            target="n2",
            relation="regulates",
            evidence_strength="moderate"
        ),
        BiologicalEdge(
            source="n2",
            target="n3",
            relation="associated_with",
            evidence_strength="low"
        )
    ]
)
This structure is valuable because it helps us express:
- which mechanisms may be relevant;
- which pathways may be perturbed;
- which nodes are linked by regulatory relations;
- which phenotypes may be jointly influenced by several mechanisms.
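The kind of question a mechanism graph answers well can be sketched as a reachability query: starting from one node, which other nodes lie downstream? The example below is a minimal sketch using plain tuples and node names taken from the example graph; `downstream` and `EDGES` are hypothetical names, and breadth-first search is just one reasonable traversal choice.

```python
from collections import deque
from typing import Dict, List, Tuple

# Edges as (source, target, relation), mirroring the example mechanism graph.
EDGES: List[Tuple[str, str, str]] = [
    ("insulin_signaling", "glucose_variability", "regulates"),
    ("glucose_variability", "fatigue", "associated_with"),
]


def downstream(start: str, edges: List[Tuple[str, str, str]]) -> List[str]:
    """Return nodes reachable from `start`, in breadth-first order."""
    adjacency: Dict[str, List[str]] = {}
    for source, target, _relation in edges:
        adjacency.setdefault(source, []).append(target)

    seen = {start}
    order: List[str] = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                order.append(neighbor)
                queue.append(neighbor)
    return order


reachable = downstream("insulin_signaling", EDGES)
```

Note what the query does not contain: no individual, no chosen action, no time window, and no follow-up. It describes mechanism structure only.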
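But it is not yet a complete medical world model.
It still does not explicitly answer:
What is the individual's current state?
Which action is about to be taken?
How is the state expected to change after the action?
What is the evidence chain?
What is the feedback window?
If the feedback does not match expectations, how is the model updated?
4. A medical world model needs a State object
A medical world model first has to define the individual's state.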
from dataclasses import dataclass
from typing import Dict, List, Optional
@dataclass
class PatientState:
    patient_id: str
    demographics: Dict
    clinical_markers: Dict
    lifestyle: Dict
    symptoms: List[str]
    medications: List[str]
    omics: Optional[Dict] = None
    wearable: Optional[Dict] = None
    mechanism_context: Optional[Dict] = None
Example:
state = PatientState(
    patient_id="P001",
    demographics={
        "age": 52,
        "sex": "unspecified"
    },
    clinical_markers={
        "bmi": 29.1,
        "fasting_glucose": 6.2,
        "hba1c": 6.0,
        "triglycerides": 2.1,
        "hdl_c": 0.95
    },
    lifestyle={
        "sleep_hours": 5.8,
        "exercise_frequency_per_week": 1,
        "diet_pattern": "high_refined_carbohydrate"
    },
    symptoms=[
        "fatigue",
        "post_meal_sleepiness"
    ],
    medications=[],
    mechanism_context={
        "possible_insulin_resistance": True,
        "possible_glucose_variability": True,
        "data_quality": "partial"
    }
)
One important principle applies here:
A state is not better for being bigger; it has to be something that action, transition, and feedback can refer to.
If a field cannot influence action selection and cannot be used in a feedback update, it is probably data accumulation rather than an effective state representation.
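This principle can be turned into a simple check. The sketch below lists the clinical markers in a state that neither the action's monitoring plan nor the transition's expected directions refer to. `unreferenced_markers` is a hypothetical helper, and the inputs are plain dicts and lists rather than the dataclasses, to keep the example self-contained.

```python
from typing import Dict, List, Set


def unreferenced_markers(
    clinical_markers: Dict[str, float],
    monitoring_markers: List[str],
    expected_direction: Dict[str, str],
) -> Set[str]:
    """Return state markers that neither the action nor the transition refers to."""
    referenced = set(monitoring_markers) | set(expected_direction)
    return set(clinical_markers) - referenced


clinical_markers = {
    "bmi": 29.1,
    "fasting_glucose": 6.2,
    "hba1c": 6.0,
    "triglycerides": 2.1,
    "hdl_c": 0.95,
}
monitoring_markers = ["fasting_glucose", "hba1c", "weight", "waist_circumference"]
expected_direction = {
    "fasting_glucose": "decrease_possible",
    "weight": "slight_decrease_possible",
}

unused = unreferenced_markers(clinical_markers, monitoring_markers, expected_direction)
```

A flagged field is a prompt for review, not proof that it is useless: it may still be read by a safety gate or by a different candidate action.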
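5. A transition is a state-transition hypothesis, not an efficacy prediction
Developers can easily end up writing a function like this:
next_state = predict_next_state(state, action)
In a medical setting, however, this name carries real risk.
It reads as if the model could predict an individual's treatment response. For most current medical AI systems, that claim is too strong.
A safer way to write it is:
transition = estimate_transition_hypothesis(
    state=state,
    action=action,
    evidence=evidence_chain
)
That is, a state-transition hypothesis.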
@dataclass
class TransitionHypothesis:
    expected_direction: Dict
    mechanism_rationale: List[str]
    uncertainty_level: str
    time_window_weeks: int
    assumptions: List[str]
    limitations: List[str]
Example:
transition = TransitionHypothesis(
    expected_direction={
        "fasting_glucose": "decrease_possible",
        "postprandial_glucose": "decrease_possible",
        "weight": "slight_decrease_possible",
        "energy_level": "may_improve"
    },
    mechanism_rationale=[
        "lower refined carbohydrate intake may reduce postprandial glucose excursions",
        "weight reduction may improve insulin sensitivity",
        "improved sleep may reduce metabolic stress"
    ],
    uncertainty_level="moderate",
    time_window_weeks=8,
    assumptions=[
        "adequate adherence",
        "no major medication change",
        "baseline data quality is acceptable"
    ],
    limitations=[
        "individual response may vary",
        "not a treatment effect prediction",
        "not a substitute for clinical judgment"
    ]
)
Note the wording used here:
decrease_possible
may_improve
hypothesis
uncertainty
limitations
rather than:
will decrease
will reverse
will cure
This is a critical safety boundary in the engineering of medical AI.
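The wording boundary can be enforced mechanically instead of by convention. The following is a minimal sketch assuming a closed vocabulary of hedged direction labels; `ALLOWED_DIRECTIONS` and `overclaiming_entries` are hypothetical, and a real project would define its own vocabulary.

```python
from typing import Dict, List

# Hypothetical closed vocabulary of hedged direction labels (an assumption).
ALLOWED_DIRECTIONS = {
    "decrease_possible",
    "slight_decrease_possible",
    "increase_possible",
    "may_improve",
    "may_worsen",
    "uncertain",
}


def overclaiming_entries(expected_direction: Dict[str, str]) -> List[str]:
    """Return markers whose direction label is outside the hedged vocabulary."""
    return sorted(
        marker
        for marker, direction in expected_direction.items()
        if direction not in ALLOWED_DIRECTIONS
    )


hedged = {
    "fasting_glucose": "decrease_possible",
    "energy_level": "may_improve",
}
overclaimed = {
    "fasting_glucose": "will_decrease",
    "weight": "slight_decrease_possible",
}

hedged_violations = overclaiming_entries(hedged)
overclaimed_violations = overclaiming_entries(overclaimed)
```

Running a check like this when a `TransitionHypothesis` is constructed keeps deterministic phrasing such as `will_decrease` from reaching downstream output.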
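6. EvidenceChain: do not let the model only generate recommendations
If a system only outputs:
recommendation = "reduce refined carbohydrates and increase exercise"
that is not yet a medical world model.
A medical world model must be able to explain:
- why this action is proposed;
- what the mechanistic basis is;
- how strong the evidence is;
- where the boundaries of applicability lie;
- where the uncertainty is.
An evidence chain can be defined as: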
@dataclass
class EvidenceItem:
    source_type: str  # clinical_guideline, trial, mechanism, omics, individual_context
    description: str
    strength: str
    reference: Optional[str] = None
@dataclass
class EvidenceChain:
    items: List[EvidenceItem]
    overall_strength: str
    uncertainty: str
    limitations: List[str]
Example:
evidence_chain = EvidenceChain(
    items=[
        EvidenceItem(
            source_type="mechanism",
            description="Reduced refined carbohydrate intake may reduce postprandial glucose excursions.",
            strength="moderate"
        ),
        EvidenceItem(
            source_type="individual_context",
            description="Current state includes high refined carbohydrate pattern and low exercise frequency.",
            strength="contextual"
        ),
        EvidenceItem(
            source_type="clinical_guideline",
            description="Lifestyle intervention is commonly recommended for metabolic risk management.",
            strength="high"
        )
    ],
    overall_strength="moderate",
    uncertainty="moderate",
    limitations=[
        "adherence is uncertain",
        "individual response may vary",
        "clinical review required when disease or medication is involved"
    ]
)
Engineering principle:
recommendation without evidence object = weak output
action + transition + evidence + feedback plan = stronger world-model output
7. Feedback: a world model must be updatable
A medical world model cannot be a one-shot output.
It has to support feedback updates.
@dataclass
class FollowUpFeedback:
    patient_id: str
    action_id: str
    timepoint_weeks: int
    observed_markers: Dict
    adherence: Dict
    symptom_changes: Dict
    adverse_events: List[str]
Example:
feedback = FollowUpFeedback(
    patient_id="P001",
    action_id="nutrition_low_glycemic_8w",
    timepoint_weeks=8,
    observed_markers={
        "fasting_glucose": 5.8,
        "hba1c": 5.8,
        "weight_change_kg": -2.1,
        "waist_change_cm": -3.0
    },
    adherence={
        "nutrition": "medium",
        "exercise": "low",
        "sleep": "unchanged"
    },
    symptom_changes={
        "fatigue": "slightly_improved",
        "post_meal_sleepiness": "improved"
    },
    adverse_events=[]
)
The feedback is then written into the update logic:
def update_world_model_state(
    previous_state: PatientState,
    action: InterventionAction,
    transition: TransitionHypothesis,
    feedback: FollowUpFeedback
) -> Dict:
    # Keep expectation and observation side by side so they can be compared later.
    update_record = {
        "previous_state": previous_state,
        "action": action,
        "expected_transition": transition,
        "observed_feedback": feedback,
        "interpretation": None,
        "next_step": None
    }
    # Simplified rule: only nutrition adherence drives the interpretation here.
    if feedback.adherence.get("nutrition") == "medium":
        update_record["interpretation"] = (
            "Partial improvement observed; adherence may limit effect size."
        )
        update_record["next_step"] = (
            "Review adherence barriers and consider adjusting action intensity."
        )
    else:
        update_record["interpretation"] = (
            "Observed feedback should be interpreted with caution."
        )
        update_record["next_step"] = (
            "Collect more context before updating the transition hypothesis."
        )
    return update_record
Without a feedback update, the system is closer to a recommender system than to a world model.
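One way to make the update step less dependent on adherence alone is to compare observed markers against the expected directions. The sketch below is an assumption-laden illustration: `direction_consistency` is a hypothetical helper, it looks only at the sign of the change, and it ignores measurement noise and clinically meaningful thresholds.

```python
from typing import Dict


def direction_consistency(
    baseline: Dict[str, float],
    observed: Dict[str, float],
    expected_direction: Dict[str, str],
) -> Dict[str, str]:
    """Label each expected marker as consistent, inconsistent, or not assessable."""
    result: Dict[str, str] = {}
    for marker, direction in expected_direction.items():
        if marker not in baseline or marker not in observed:
            result[marker] = "not_measured"
            continue
        delta = observed[marker] - baseline[marker]
        if "decrease" in direction:
            result[marker] = "consistent" if delta < 0 else "inconsistent"
        elif "increase" in direction:
            result[marker] = "consistent" if delta > 0 else "inconsistent"
        else:
            # Labels such as "may_improve" carry no numeric direction.
            result[marker] = "not_comparable"
    return result


baseline = {"fasting_glucose": 6.2, "hba1c": 6.0}
observed = {"fasting_glucose": 5.8, "hba1c": 5.8}
expected = {
    "fasting_glucose": "decrease_possible",
    "postprandial_glucose": "decrease_possible",
    "energy_level": "may_improve",
}

comparison = direction_consistency(baseline, observed, expected)
```

A "consistent" label only says the observation moved in the hypothesized direction; it does not show that the action caused the change.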
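8. Causal boundary: action-conditioned reasoning cannot rest on correlation alone
As soon as a medical world model has to answer:
if action A, then what may happen?
it enters causal territory.
So a transition should not be just a correlational prediction:
transition = correlate(state_features, future_outcomes)
It should explicitly record causal assumptions and uncertainty: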
@dataclass
class CausalAssumption:
    assumption_id: str
    description: str
    possible_confounders: List[str]
    applicable_population: str
    evidence_level: str
    uncertainty: str
Example:
causal_assumption = CausalAssumption(
    assumption_id="CA001",
    description=(
        "Reducing refined carbohydrate intake may reduce postprandial glucose "
        "excursions in individuals with diet-related glucose variability."
    ),
    possible_confounders=[
        "medication_change",
        "physical_activity_change",
        "sleep_change",
        "stress_change",
        "baseline_disease_status"
    ],
    applicable_population="health-management context with mild metabolic risk",
    evidence_level="moderate",
    uncertainty="individual_response_varies"
)
This does not mean every system must implement a full causal inference engine. It means:
Whenever a system outputs an action-conditioned transition, it must explicitly record the causal assumptions, the boundaries of applicability, and the uncertainty.
Otherwise the transition hypothesis easily degrades into correlational extrapolation.
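Recording confounders is only useful if something reads them back. As a minimal sketch, the recorded `possible_confounders` can be checked against what was reported during follow-up. `changed_confounders` and the `observed_changes` mapping are hypothetical, and treating a missing entry as "unchanged" is itself an assumption that a real system might not want to make.

```python
from typing import Dict, List


def changed_confounders(
    possible_confounders: List[str],
    observed_changes: Dict[str, bool],
) -> List[str]:
    """Return recorded confounders that were reported as changed during follow-up."""
    # Assumption: a confounder with no report is treated as unchanged.
    return [name for name in possible_confounders if observed_changes.get(name, False)]


possible_confounders = [
    "medication_change",
    "physical_activity_change",
    "sleep_change",
    "stress_change",
]
# Hypothetical follow-up report: only physical activity changed.
observed_changes = {
    "medication_change": False,
    "physical_activity_change": True,
    "sleep_change": False,
}

flagged = changed_confounders(possible_confounders, observed_changes)
interpretation_is_confounded = len(flagged) > 0
```

When the flag is set, an observed improvement should not be attributed to the action without further review.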
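9. SafetyGate: a medical system must pass the safety boundary first
A medical world model cannot only aim to be "smarter".
It needs safety boundaries first.
@dataclass
class SafetyGateResult:
    passed: bool
    red_flags: List[str]
    contraindications: List[str]
    required_review: List[str]
    notes: List[str]
Example: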
def run_safety_gate(
    state: PatientState,
    action: InterventionAction
) -> SafetyGateResult:
    red_flags = []
    contraindications = []
    required_review = []
    # Illustrative threshold; assumes fasting glucose is recorded in mmol/L.
    if state.clinical_markers.get("fasting_glucose", 0) > 13.9:
        red_flags.append("very_high_glucose_requires_clinical_evaluation")
    if "chest_pain" in state.symptoms:
        red_flags.append("chest_pain_requires_urgent_evaluation")
    # Any current medication routes the case to a clinician without blocking the gate.
    if state.medications:
        required_review.append("medication_context_requires_clinician_review")
    passed = len(red_flags) == 0 and len(contraindications) == 0
    return SafetyGateResult(
        passed=passed,
        red_flags=red_flags,
        contraindications=contraindications,
        required_review=required_review,
        notes=[
            "not medical advice",
            "not a validated treatment planning system",
            "human review required in clinical context"
        ]
    )
Principle:
No safety gate, no medical world-model deployment.
10. AuditLog: why every step has to leave a trace
A medical world model must be able to answer:
- What was the state at the time?
- Why was this action chosen?
- What was the transition hypothesis?
- Where did the evidence chain come from?
- Who reviewed it?
- Did the feedback match the expectation?
- If not, how is the next round updated?
An audit log can be defined as:
@dataclass
class AuditLog:
    record_id: str
    patient_id: str
    state_snapshot_id: str
    action_id: str
    transition_id: str
    evidence_chain_id: str
    safety_gate_id: str
    reviewer: str
    decision: str
    timestamp: str
Example:
audit_log = AuditLog(
    record_id="AUDIT_20260521_001",
    patient_id="P001",
    state_snapshot_id="STATE_20260521",
    action_id="nutrition_low_glycemic_8w",
    transition_id="TRANSITION_20260521_001",
    evidence_chain_id="EVIDENCE_20260521_001",
    safety_gate_id="SAFETY_20260521_001",
    reviewer="human_expert",
    decision="approved_for_health_management_context",
    timestamp="2026-05-21T20:00:00+08:00"
)
A medical world model is not about "generating a better answer"; it is about making a reasoning process traceable, auditable, and open to feedback.
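Traceability is easier to rely on when a stored record can be checked for later modification. The sketch below adds an ISO 8601 timestamp and a SHA-256 content hash to a record using only the Python standard library. `seal_audit_record` and `verify_audit_record` are hypothetical helpers, and a plain content hash like this detects accidental edits but is not a security mechanism on its own.

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone
from typing import Dict


def seal_audit_record(record: Dict[str, str], now: datetime) -> Dict[str, str]:
    """Return a copy of the record with a timestamp and a content hash."""
    sealed = dict(record)
    sealed["timestamp"] = now.isoformat()
    payload = json.dumps(sealed, sort_keys=True).encode("utf-8")
    sealed["content_sha256"] = hashlib.sha256(payload).hexdigest()
    return sealed


def verify_audit_record(sealed: Dict[str, str]) -> bool:
    """Recompute the hash over everything except the hash field itself."""
    body = {key: value for key, value in sealed.items() if key != "content_sha256"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest() == sealed.get("content_sha256")


utc_plus_8 = timezone(timedelta(hours=8))
sealed = seal_audit_record(
    {
        "record_id": "AUDIT_20260521_001",
        "patient_id": "P001",
        "action_id": "nutrition_low_glycemic_8w",
        "reviewer": "human_expert",
        "decision": "approved_for_health_management_context",
    },
    now=datetime(2026, 5, 21, 20, 0, 0, tzinfo=utc_plus_8),
)

# Simulate a later edit to the stored decision.
tampered = dict(sealed)
tampered["decision"] = "approved_for_clinical_use"
```

Passing `now` in explicitly keeps the function deterministic, which makes audit behavior straightforward to test.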
11. A minimal medical world model workflow
Combining the objects above gives a minimal workflow:
def medical_world_model_loop(patient_id: str):
    # 1. Observe current state
    state = observe_patient_state(patient_id)
    # 2. Retrieve mechanism context
    mechanism_context = retrieve_mechanism_context(state)
    # 3. Generate candidate actions
    candidate_actions = generate_candidate_actions(
        state=state,
        mechanism_context=mechanism_context
    )
    transition_candidates = []
    for action in candidate_actions:
        # 4. Safety gate first
        safety = run_safety_gate(state, action)
        if not safety.passed:
            continue
        # 5. Build evidence chain
        evidence = build_evidence_chain(
            state=state,
            action=action,
            mechanism_context=mechanism_context
        )
        # 6. Estimate transition hypothesis
        transition = estimate_transition_hypothesis(
            state=state,
            action=action,
            evidence=evidence
        )
        transition_candidates.append({
            "action": action,
            "transition": transition,
            "evidence": evidence,
            "safety": safety
        })
    # 7. Human-in-the-loop review
    selected = human_expert_review(transition_candidates)
    # 8. Collect follow-up feedback
    feedback = collect_follow_up_feedback(
        patient_id=patient_id,
        action_id=selected["action"].action_id,
        time_window_weeks=selected["transition"].time_window_weeks
    )
    # 9. Update model state
    updated_record = update_world_model_state(
        previous_state=state,
        action=selected["action"],
        transition=selected["transition"],
        feedback=feedback
    )
    # 10. Write audit log
    audit_log = write_audit_log(
        state=state,
        selected=selected,
        feedback=feedback,
        updated_record=updated_record
    )
    return {
        "updated_record": updated_record,
        "audit_log": audit_log
    }
The key to this workflow is not the code itself but the order of the objects:
state
-> mechanism context
-> candidate action
-> safety gate
-> evidence chain
-> transition hypothesis
-> human review
-> feedback
-> update
-> audit log
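This is the engineering difference between a medical world model and an ordinary systems biology graph.
12. JSON example: one transition record
Below is a simplified JSON document representing one reasoning record of a medical world model: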
{
  "state": {
    "patient_id": "P001",
    "state_snapshot_id": "STATE_20260521",
    "clinical_markers": {
      "bmi": 29.1,
      "fasting_glucose": 6.2,
      "hba1c": 6.0,
      "triglycerides": 2.1
    },
    "lifestyle": {
      "sleep_hours": 5.8,
      "exercise_frequency_per_week": 1,
      "diet_pattern": "high_refined_carbohydrate"
    },
    "mechanism_context": {
      "possible_insulin_resistance": true,
      "possible_glucose_variability": true,
      "data_quality": "partial"
    }
  },
  "action": {
    "action_id": "nutrition_low_glycemic_8w",
    "category": "nutrition",
    "duration_weeks": 8,
    "target_mechanisms": [
      "postprandial_glucose_variability",
      "insulin_resistance"
    ],
    "monitoring_markers": [
      "fasting_glucose",
      "hba1c",
      "weight",
      "waist_circumference"
    ]
  },
  "transition_hypothesis": {
    "expected_direction": {
      "fasting_glucose": "decrease_possible",
      "postprandial_glucose": "decrease_possible",
      "weight": "slight_decrease_possible"
    },
    "uncertainty_level": "moderate",
    "time_window_weeks": 8,
    "limitations": [
      "individual_response_varies",
      "not_a_treatment_effect_prediction"
    ]
  },
  "evidence_chain": {
    "overall_strength": "moderate",
    "items": [
      {
        "source_type": "mechanism",
        "description": "Reduced refined carbohydrate intake may reduce postprandial glucose excursions."
      },
      {
        "source_type": "individual_context",
        "description": "Current lifestyle pattern includes high refined carbohydrate intake."
      }
    ]
  },
  "safety_gate": {
    "passed": true,
    "red_flags": [],
    "notes": [
      "not_medical_advice",
      "human_review_required_in_clinical_context"
    ]
  },
  "feedback_plan": {
    "timepoint_weeks": 8,
    "metrics": [
      "fasting_glucose",
      "hba1c",
      "weight",
      "waist_circumference",
      "symptom_score"
    ]
  }
}
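A record like this is straightforward to check structurally before it is stored. The sketch below verifies that the six top-level sections used in the JSON example are present. `REQUIRED_SECTIONS` and `missing_sections` are hypothetical names, and a production system would more likely use a full schema validator than this presence check.

```python
import json
from typing import List

# Top-level sections taken from the JSON example; the check itself is illustrative.
REQUIRED_SECTIONS = [
    "state",
    "action",
    "transition_hypothesis",
    "evidence_chain",
    "safety_gate",
    "feedback_plan",
]


def missing_sections(record_json: str) -> List[str]:
    """Return the required top-level sections absent from a transition record."""
    record = json.loads(record_json)
    return [name for name in REQUIRED_SECTIONS if name not in record]


complete_record = json.dumps({name: {} for name in REQUIRED_SECTIONS})
recommendation_only = json.dumps(
    {"action": {"action_id": "nutrition_low_glycemic_8w"}}
)

complete_gaps = missing_sections(complete_record)
recommendation_gaps = missing_sections(recommendation_only)
```

A recommendation-only payload fails this check on five of the six sections, which is the gap between a suggestion and a transition record.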
13. Implementation principles for developers
Principle 1: do not start from a chatbot
Do not begin by writing:
answer = llm.chat(user_question)
Define the objects first:
state_schema = define_state_schema()
action_schema = define_action_schema()
transition_schema = define_transition_schema()
evidence_schema = define_evidence_schema()
feedback_schema = define_feedback_schema()
Principle 2: do not write the transition as a treatment-effect prediction
Avoid:
effect = predict_treatment_effect(state, action)
Prefer:
transition = estimate_transition_hypothesis(state, action, evidence)
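Principle 3: the systems biology graph is the mechanism layer, not a complete world model
mechanism_graph = build_mechanism_graph(omics_data)
This matters, but it is not enough. You also need: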
action = define_intervention_action()
transition = estimate_transition_hypothesis(state, action, evidence)
feedback = collect_follow_up_feedback()
Principle 4: the evidence object must be a first-class object
Do not output only a recommendation:
recommendation = generate_recommendation(state)
Output instead:
output = {
    "state": state,
    "action": action,
    "transition_hypothesis": transition,
    "evidence_chain": evidence_chain,
    "safety_gate": safety_gate,
    "feedback_plan": feedback_plan
}
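Principle 5: human-in-the-loop is mandatory
A medical world model should not be designed as an automated treatment system.
decision = human_expert_review(model_output)
This should be a core step of the flow, not an option.
Principle 6: without feedback, it is not a strong world model
If the system cannot update: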
updated_state = update_state(previous_state, action, feedback)
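then it is closer to a one-off recommender system than to a medical world model.
14. SteeraMed in engineering terms
In this context, SteeraMed can be understood as a framework for steerable biomedical world models.
Its engineering focus is not "automatically controlling the human body" but organizing the following objects: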
State
Action
Transition Hypothesis
Evidence Chain
Safety Gate
Human Review
Feedback
Audit Log
It can also be written as:
class SteerableMedicalWorldModel:
    def observe_state(self, patient_id):
        pass
    def generate_actions(self, state):
        pass
    def run_safety_gate(self, state, action):
        pass
    def build_evidence_chain(self, state, action):
        pass
    def estimate_transition(self, state, action, evidence):
        pass
    def request_human_review(self, candidates):
        pass
    def collect_feedback(self, selected_action):
        pass
    def update_model(self, state, action, feedback):
        pass
    def write_audit_log(self, record):
        pass
This is far more complex than "a medical AI chatbot", and closer to the system form that medicine actually needs.
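15. Summary: systems biology is the mechanism layer; the medical world model is the action-reasoning layer
To summarize.
Systems biology is essential: it helps us understand the network structure, dynamic regulation, and mechanistic relations of living systems.
From an engineering point of view, however, a systems biology model is usually not yet a complete medical world model.
A medical world model has to connect the following objects: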
individual state
intervention action
mechanism-informed evidence
transition hypothesis
safety gate
human review
longitudinal feedback
audit log
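So the two are not substitutes for each other; they are layers:
Systems biology:
    mechanism understanding
Medical world model:
    mechanism-informed action simulation
Steerable medical world model:
    goal-directed, evidence-bounded, feedback-calibrated intervention reasoning
Put in plain words:
Systems biology lets us understand living systems; a medical world model carries that understanding into a state-action-transition-feedback reasoning process; a steerable medical world model further gives that process goals, boundaries, review, and calibration.
That is why, even with systems biology already well established, medical AI still needs medical world models.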
References and further reading
- Kitano, H. Systems Biology: A Brief Overview. Science, 2002. https://doi.org/10.1126/science.1069492
- Kitano, H. Computational systems biology. Nature, 2002. https://doi.org/10.1038/nature01254
- Ideker, T., Galitski, T., & Hood, L. A new approach to decoding life: systems biology. Annual Review of Genomics and Human Genetics, 2001. https://doi.org/10.1146/annurev.genom.2.1.343
- Barabási, A.-L., Gulbahce, N., & Loscalzo, J. Network medicine: a network-based approach to human disease. Nature Reviews Genetics, 2011. https://doi.org/10.1038/nrg2918
- Noble, D. The Music of Life: Biology Beyond Genes. Oxford University Press, 2006.
- Ha, D., & Schmidhuber, J. Recurrent World Models Facilitate Policy Evolution. NeurIPS, 2018. https://arxiv.org/abs/1803.10122
- LeCun, Y. A Path Towards Autonomous Machine Intelligence. OpenReview, 2022. https://openreview.net/forum?id=BZ5a1r-kVsf
- Pearl, J., & Mackenzie, D. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
- Katsoulakis, E., Wang, Q., Wu, H., et al. Digital twins for health: a scoping review. npj Digital Medicine, 2024. https://doi.org/10.1038/s41746-024-01073-0
- Yang, Y., Wang, Z.-Y., Liu, Q., Sun, S., Wang, K., Chellappa, R., Zhou, Z., Yuille, A., Zhu, L., Zhang, Y.-D., & Chen, J. Medical World Model: Generative Simulation of Tumor Evolution for Treatment Planning. arXiv:2506.02327, 2025. https://arxiv.org/abs/2506.02327
- Xiong, J. World Models for Biomedicine: A Steerability Framework. Preprints.org, 2026. https://doi.org/10.20944/preprints202605.0366.v1
- SteeraMed project site: https://SteeraMed.com
- Steerable World project site: https://steerable.world