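Step3-VL Multimodal Model Backbone Code: Nine Chapters Debugging and Rewrite
This is a demonstration of full-dimension debugging and code rewriting with the Nine Chapters programming method. The original code is open source: https://ai.gitcode.com/StepFun/step3/blob/main/modeling_step3.py (1,074 lines); the rewrite is about 385 lines. Because the functional and environmental boundaries of the original are unclear, the rewrite is for demonstration only. In principle it is aligned with the original in every function and piece of math that could be identified.
Step3-VL Multimodal Model Backbone Code: Nine Chapters Debugging Report
The review found 20 core defects: 6 fatal (crash-level), 8 severe, and 6 minor. All are structural defects in the original code and are independent of hardware and framework version.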
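| No. | Category | Severity | Location | Description | Suggested fix |
|---|---|---|---|---|---|
| 1 | Function | 🔴 Fatal | Step3vAttention.forward | References the member variable `self.attention_dropout`, which `__init__` never defines; in training mode this raises AttributeError | Initialize `self.attention_dropout = config.attention_dropout` in `__init__` and check that it lies in [0, 1] |
| 2 | Parameter boundary | 🔴 Fatal | Step3vModel.get_input_embeddings | Unconditionally calls `input_ids.squeeze(0)`; when batch_size ≠ 1 the dimensions are wrong, so multi-batch input corrupts all later logic or crashes; the input shape is never checked | Validate the batch dimension, drop the hard-coded `squeeze(0)`, and handle the batch dimension generically |
| 3 | Parameter | 🔴 Fatal | Step3vAttention.forward | Contains a hard `assert(attention_mask is None)`, so passing an attention_mask in production crashes; under Python's optimized mode (`python -O`) the assert is stripped, and the later logic then silently runs with a mask it does not support | Remove the hard assert, validate the parameter and report a clear error, and support a passed-in attention_mask |
| 4 | Parameter boundary | 🔴 Fatal | Step3vModel._process_image_input | When `patch_image_features` is None but `num_patch > 0`, the code accesses the missing tensor directly and crashes; there is no precondition check | Check that the patch count matches the patch features and raise a clear error on mismatch |
| 5 | Parameter boundary | 🔴 Fatal | Step3vModel._process_image_features | `HW = int(sqrt(P))` assumes P is a perfect square; if it is not, the truncated value makes the later `view` fail with a shape mismatch | Check that P is a perfect square and raise a clear error if it is not |
| 6 | Function | 🔴 Fatal | Step3vForConditionalGeneration.forward | Typo in a variable name: `los = None` where `loss` was intended; any later extension that reads `loss` would raise NameError, and the assignment itself is dead code | Fix the variable name and delete the dead code |
| 7 | Parameter boundary | 🟠 Severe | MoELinear.forward | Indexes `weight` directly with `expert_id` without checking that it lies in [0, num_experts-1]; an out-of-range value crashes on indexing | Validate the expert_id range and raise a clear error when it is out of range |
| 8 | Parameter | 🟠 Severe | Step3vDecoderLayer.__init__ | When `moe_layers_enum` is an empty string, `split(',')` yields `['']` and the conversion to int fails; the format is never validated | Validate the configuration string format and fall back to the default logic when it is empty |
| 9 | Parameter boundary | 🟠 Severe | Step3Model.forward | When attention_mask is a dict, the code reads `causal_mask_mapping[decoder_layer.attention_type]` without checking that the key exists, which raises KeyError | Check that the key exists; raise a clear error or fall back to a default mask when it does not |
| 10 | Parameter | 🟠 Severe | _parse_and_validate_image_input | When `pixel_values.dim() < 3` the input is silently passed through unchanged, leaving a latent shape error; there is no explicit validation | Validate the number of dimensions and raise a clear error when it is invalid |
| 11 | Command | 🟠 Severe | merge_multimodal_embeddings | Modifies `inputs_embeds` in place and also returns it; callers can easily mistake the result for a new tensor and trigger unintended side effects | Document the in-place behavior explicitly, or return a new tensor and leave the input untouched |
| 12 | Parameter boundary | 🟠 Severe | _flatten_embeddings | The recursion has no depth limit; deeply nested input overflows the stack | Add a maximum nesting depth and raise an error when it is exceeded |
| 13 | Parameter boundary | 🟠 Severe | Step3vAttention.forward | Calls `past_key_value.update` without checking that `layer_idx` and `cache_position` are in range; out-of-range values crash | Validate the index ranges and raise a clear error when they are out of range |
| 14 | Overall structure | 🟠 Severe | Entire codebase | The exception-fallback flow is missing entirely: no function handles errors or degrades gracefully, so any problem crashes the program; this is the typical signature of a broken "five flow directions" structure | Add unified exception capture and degradation logic, with error fallbacks on the core path |
| 15 | Overall structure | 🟡 Minor | Entire codebase | The "three pools" are mixed: configuration parameters, state data, and operation logic all live together in every class with no clear layering, a typical mixed-state architecture | Split into a configuration layer, a data layer, and an operation layer, physically isolating the three pools |
| 16 | Local structure | 🟡 Minor | Step3vDecoderLayer.forward | One function mixes several responsibilities (residual connection, normalization, attention, MLP), which makes maintenance hard and violates atomic single responsibility | Split each piece into its own atomic function and let `forward` only orchestrate the flow |
| 17 | Parameter | 🟡 Minor | MoELinear.__init__ | `weight` is created with `torch.empty` and relies entirely on an external `post_init` to assign values, so it can be used while still uninitialized | Add default initialization logic, or state explicitly that external initialization is required |
| 18 | Parameter | 🟡 Minor | eager_attention_forward | Does not check that the query and key dimensions match; a mismatch either broadcasts silently to a wrong result or crashes | Validate that the dimensions match and raise a clear error when they do not |
| 19 | Parameter boundary | 🟡 Minor | Step3vRotaryEmbedding.forward | Does not validate the value range of `position_ids`, so out-of-range positions are possible | Validate the position_ids range and raise a clear error when it is exceeded |
| 20 | Local structure | 🟡 Minor | Entire codebase | Contains a large amount of commented-out dead code, unused variables (such as `los`), and redundant logic, which raises maintenance cost | Remove dead code and redundant variables to keep the code concise |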
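Two of the boundary checks suggested in the table (rows 5 and 8) can be sketched in plain Python. This is a minimal illustration, not the original project's code: the function names `grid_side` and `parse_layer_list` and the `default` parameter are hypothetical, and the real fallback for an empty `moe_layers_enum` depends on the model configuration.

```python
import math
from typing import List, Optional


def grid_side(num_patches: int) -> int:
    """Return the side length of a square patch grid, or raise if P is not a square."""
    if num_patches <= 0:
        raise ValueError(f"num_patches must be positive, got {num_patches}")
    side = math.isqrt(num_patches)  # exact integer square root, no float rounding
    if side * side != num_patches:
        raise ValueError(f"num_patches={num_patches} is not a perfect square")
    return side


def parse_layer_list(spec: Optional[str], default: List[int]) -> List[int]:
    """Parse a comma-separated list of layer indices; use `default` when spec is empty."""
    if spec is None or not spec.strip():
        return list(default)
    try:
        # int() tolerates surrounding whitespace but rejects empty fields such as "1,,3"
        return [int(part) for part in spec.split(",")]
    except ValueError as exc:
        raise ValueError(f"malformed layer list {spec!r}") from exc
```

Both helpers fail loudly at the boundary instead of letting a truncated square root or an empty string reach a later `view` or `int()` call.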
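Core conclusions
This Step3-VL code is typical industrial-grade large-model code, and in this review its defects follow the same pattern found in other mainstream large-model inference code:
- The core problems are missing parameter boundary checks, uninitialized variables, null references, and the absence of any exception fallback.
- They all stem from a mixed-state architecture: validation is mixed with execution, operations with protection, and parameters with state.
- The defects are elementary, yet the most serious ones crash the program; their root causes are the same as those of bugs in other large-model code such as DeepSeek and Qwen.
The debugging was carried out uniformly on the Nine Chapters programming principles and cross-checked with a Xianli (线理) reasoning chain.
The rewrite follows: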
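The separation of validation from execution, together with an explicit fallback, can be sketched for the mask lookup described in row 9 of the table. This is an illustrative sketch only: `select_mask`, `fallback_type`, and the key names used in the usage note are assumptions, not the original API.

```python
from typing import Any, Mapping, Optional


def has_mask(mask_mapping: Mapping[str, Any], attention_type: str) -> bool:
    """Validation step: report whether a mask exists for this attention type."""
    return attention_type in mask_mapping


def select_mask(
    mask_mapping: Mapping[str, Any],
    attention_type: str,
    fallback_type: Optional[str] = None,
) -> Any:
    """Execution step: return the mask, a declared fallback, or a clear error."""
    if has_mask(mask_mapping, attention_type):
        return mask_mapping[attention_type]
    if fallback_type is not None and has_mask(mask_mapping, fallback_type):
        return mask_mapping[fallback_type]
    raise KeyError(
        f"no mask for attention_type={attention_type!r}; "
        f"available: {sorted(mask_mapping)}"
    )
```

With a mapping such as `{"full_attention": mask}`, a request for a missing type either resolves to the declared fallback or fails with a message that lists the available keys, rather than surfacing as a bare KeyError deep inside the layer loop.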
import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple, List
from dataclasses import dataclass

# ==========================================================================
# Pool C: constraint and configuration pool
# ==========================================================================
@dataclass(frozen=True)
class ModelConfig:
    hidden_size: int = 4096
    num_attention_heads: int = 32
    num_key_value_heads: int = 4
    head_dim: int = 128
    intermediate_size: int = 11008
    num_hidden_layers: int = 32
    vocab_size: int = 128256
    rms_norm_eps: float = 1e-5
    attention_dropout: float = 0.0
    max_position_embedding: int = 8192
    rope_theta: float = 10000.0
    moe_num_experts: int = 8
    moe_top_k: int = 2
    moe_intermediate_size: int = 2048
    share_expert_dim: int = 2048
    # NOTE: this default is not below the default vocab_size (151234 >= 128256),
    # so ModelConfig() with no overrides fails the range check in __post_init__.
    image_token_id: int = 151234
    vision_hidden_size: int = 1024
    vision_output_hidden_size: int = 512
    understand_projector_stride: int = 2
    projector_bias: bool = True
    # Added: fixed global maximum length of the KV cache
    kv_cache_max_length: int = 8192

    def __post_init__(self):
        # Explicit raise instead of assert, which `python -O` strips (see defect 3).
        checks = {
            "hidden_size > 0": self.hidden_size > 0,
            "num_attention_heads > 0": self.num_attention_heads > 0,
            "head_dim > 0": self.head_dim > 0,
            "0.0 <= attention_dropout <= 1.0": 0.0 <= self.attention_dropout <= 1.0,
            "moe_top_k <= moe_num_experts": self.moe_top_k <= self.moe_num_experts,
            "0 <= image_token_id < vocab_size": 0 <= self.image_token_id < self.vocab_size,
        }
        failed = [name for name, ok in checks.items() if not ok]
        if failed:
            raise ValueError(f"Invalid ModelConfig, violated: {'; '.join(failed)}")

# ==========================================================================
# Pool B: metadata and statistics pool
# ==========================================================================
class CachePool:
    def __init__(self, max_length: int):
        self.key_cache: List[Optional[torch.Tensor]] = []
        self.value_cache: List[Optional[torch.Tensor]] = []
        self.seq_length: int = 0
        self.max_length: int = max_length

    def update(self, layer_idx: int, key: torch.Tensor, value: torch.Tensor):
        """Pool B's own machine: append a KV pair, with an overflow check."""
        while len(self.key_cache) <= layer_idx:
            self.key_cache.append(None)
            self.value_cache.append(None)
        # Measure this layer's own length. The shared self.seq_length already
        # includes the current step once layer 0 has been updated, so using it
        # here would count the new tokens twice for every later layer.
        past_k = self.key_cache[layer_idx]
        past_len = 0 if past_k is None else past_k.shape[-2]
        new_len = key.shape[-2]
        if past_len + new_len > self.max_length:
            raise ValueError(
                f"KV cache overflow: current {past_len}, "
                f"adding {new_len} exceeds max {self.max_length}"
            )
        if past_k is None:
            self.key_cache[layer_idx] = key
            self.value_cache[layer_idx] = value
        else:
            # The only merge point: history and the current step are concatenated here
            self.key_cache[layer_idx] = torch.cat([past_k, key], dim=-2)
            self.value_cache[layer_idx] = torch.cat(
                [self.value_cache[layer_idx], value], dim=-2)
        self.seq_length = self.key_cache[layer_idx].shape[-2]

    def get(self, layer_idx: int) -> Tuple[Optional[torch.Tensor], Optional[torch.Tensor]]:
        if layer_idx >= len(self.key_cache) or self.key_cache[layer_idx] is None:
            return None, None
        return self.key_cache[layer_idx], self.value_cache[layer_idx]

    @staticmethod
    def merge_kv(past: Optional[torch.Tensor], current: torch.Tensor) -> torch.Tensor:
        """Pure function: merge past and current KV for the attention computation."""
        if past is None:
            return current
        return torch.cat([past, current], dim=-2)

    def reclaim(self, max_retain: int):
        """Keep only the most recent max_retain positions in every layer."""
        for i in range(len(self.key_cache)):
            if self.key_cache[i] is not None:
                self.key_cache[i] = self.key_cache[i][..., -max_retain:, :]
                self.value_cache[i] = self.value_cache[i][..., -max_retain:, :]
        self.seq_length = min(self.seq_length, max_retain)

# ==========================================================================
# Management manifold: the set of validation machines
# ==========================================================================
class ValidationMachines:
    @staticmethod
    def check_input_dim(tensor: torch.Tensor, expected_dim: int, name: str):
        if tensor.dim() != expected_dim:
            raise ValueError(f"{name} must be {expected_dim}D, got {tensor.dim()}D")

    @staticmethod
    def check_position_ids_range(position_ids: torch.Tensor, max_pos: int):
        lo, hi = position_ids.min().item(), position_ids.max().item()
        if lo < 0 or hi >= max_pos:
            raise ValueError(f"position_ids range [{lo}, {hi}] not within [0, {max_pos})")

    @staticmethod
    def check_expert_id_range(expert_id: int, num_experts: int):
        if expert_id < 0 or expert_id >= num_experts:
            raise ValueError(f"expert_id {expert_id} not in [0, {num_experts})")

    @staticmethod
    def check_tensor_finite(tensor: torch.Tensor, name: str):
        if not torch.isfinite(tensor).all():
            raise ValueError(f"{name} contains NaN or Inf")

    @staticmethod
    def check_mask_dim(attention_mask: torch.Tensor):
        if attention_mask.dim() != 4:
            raise ValueError(f"attention_mask must be 4D, got {attention_mask.dim()}D")

# ==========================================================================
# Pool A: pure atomic machines: normalization / Q/K/V / RoPE / attention / FFN (unchanged)
# ==========================================================================
class RMSNormMachine(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-5):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        output = x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)
        result = output * self.weight
        ValidationMachines.check_tensor_finite(result, "RMSNorm")
        return result


class QProjectionMachine(nn.Module):
    def __init__(self, hidden_size: int, num_heads: int, head_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, num_heads * head_dim, bias=False)
        self.num_heads = num_heads
        self.head_dim = head_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq, _ = x.shape
        # (batch, seq, heads * dim) -> (batch, heads, seq, dim)
        q = self.proj(x).view(batch, seq, self.num_heads, self.head_dim).transpose(1, 2)
        return q


class KProjectionMachine(nn.Module):
    def __init__(self, hidden_size: int, num_kv_heads: int, head_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)
        self.num_kv_heads = num_kv_heads
        self.head_dim = head_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq, _ = x.shape
        k = self.proj(x).view(batch, seq, self.num_kv_heads, self.head_dim).transpose(1, 2)
        return k


class VProjectionMachine(nn.Module):
    def __init__(self, hidden_size: int, num_kv_heads: int, head_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_size, num_kv_heads * head_dim, bias=False)
        self.num_kv_heads = num_kv_heads
        self.head_dim = head_dim

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq, _ = x.shape
        v = self.proj(x).view(batch, seq, self.num_kv_heads, self.head_dim).transpose(1, 2)
        return v


class RoPEMachine(nn.Module):
    def __init__(self, head_dim: int, max_seq_len: int, rope_theta: float = 10000.0):
        super().__init__()
        self.max_seq_len = max_seq_len
        inv_freq = 1.0 / (rope_theta ** (torch.arange(0, head_dim, 2).float() / head_dim))
        self.register_buffer("inv_freq", inv_freq, persistent=False)

    def forward(self, x: torch.Tensor, position_ids: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
        ValidationMachines.check_position_ids_range(position_ids, self.max_seq_len)
        inv_freq_expanded = self.inv_freq[None, :, None].float().expand(position_ids.shape[0], -1, 1)
        position_ids_expanded = position_ids[:, None, :].float()
        # (batch, head_dim/2, 1) @ (batch, 1, seq) -> (batch, seq, head_dim/2)
        freqs = (inv_freq_expanded @ position_ids_expanded).transpose(1, 2)
        emb = torch.cat((freqs, freqs), dim=-1)  # (batch, seq, head_dim)
        cos, sin = emb.cos(), emb.sin()
        return cos.to(dtype=x.dtype), sin.to(dtype=x.dtype)

    @staticmethod
    def apply_rotary(q: torch.Tensor, k: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
        half = q.shape[-1] // 2
        # cos/sin span the full head_dim with two identical halves; slice to the
        # half width so they broadcast against q1/q2 and k1/k2.
        cos_h = cos[..., :half].unsqueeze(1)
        sin_h = sin[..., :half].unsqueeze(1)
        q1, q2 = q[..., :half], q[..., half:]
        k1, k2 = k[..., :half], k[..., half:]
        q_embed = torch.cat([q1 * cos_h - q2 * sin_h, q1 * sin_h + q2 * cos_h], dim=-1)
        k_embed = torch.cat([k1 * cos_h - k2 * sin_h, k1 * sin_h + k2 * cos_h], dim=-1)
        return q_embed, k_embed


class AttentionComputeMachine(nn.Module):
    def __init__(self, num_heads: int, num_kv_heads: int, head_dim: int, hidden_size: int, attention_dropout: float = 0.0):
        super().__init__()
        self.num_kv_groups = num_heads // num_kv_heads
        self.scaling = head_dim ** -0.5
        self.attention_dropout = attention_dropout
        self.o_proj = nn.Linear(num_heads * head_dim, hidden_size, bias=False)

    @staticmethod
    def _repeat_kv(x: torch.Tensor, n_rep: int) -> torch.Tensor:
        # Repeat each KV head n_rep times so K/V match the number of query heads
        if n_rep == 1:
            return x
        b, n_kv, s, d = x.shape
        return x[:, :, None, :, :].expand(b, n_kv, n_rep, s, d).reshape(b, n_kv * n_rep, s, d)

    def forward(self, q: torch.Tensor, k: torch.Tensor, v: torch.Tensor, attention_mask: Optional[torch.Tensor] = None) -> torch.Tensor:
        b, _, seq_len, _ = q.shape
        k = self._repeat_kv(k, self.num_kv_groups)
        v = self._repeat_kv(v, self.num_kv_groups)
        attn_weights = torch.matmul(q, k.transpose(2, 3)) * self.scaling
        if attention_mask is not None:
            ValidationMachines.check_mask_dim(attention_mask)
            # Additive mask, trimmed to the key length
            attn_weights = attn_weights + attention_mask[:, :, :, :k.shape[-2]]
        attn_weights = F.softmax(attn_weights, dim=-1)
        attn_weights = F.dropout(attn_weights, p=self.attention_dropout, training=self.training)
        attn_output = torch.matmul(attn_weights, v)
        attn_output = attn_output.transpose(1, 2).contiguous().reshape(b, seq_len, -1)
        return self.o_proj(attn_output)


class FFNMachine(nn.Module):
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.gate_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.up_proj = nn.Linear(hidden_size, intermediate_size, bias=False)
        self.down_proj = nn.Linear(intermediate_size, hidden_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # SwiGLU-style feed-forward: down(silu(gate(x)) * up(x))
        return self.down_proj(F.silu(self.gate_proj(x)) * self.up_proj(x))

# ==========================================================================
# Pool A: pure atomic machine: MoE expert network (dimension mismatch fixed)
# ==========================================================================
class MoEMachine(nn.Module):
    def __init__(self, hidden_size: int, moe_intermediate_size: int, num_experts: int, top_k: int):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        self.gate = nn.Linear(hidden_size, num_experts, bias=False)
        # Fix: weights are stored as (out_features, in_features), as F.linear expects
        self.up_proj = nn.Parameter(torch.empty(num_experts, moe_intermediate_size, hidden_size))
        self.gate_proj = nn.Parameter(torch.empty(num_experts, moe_intermediate_size, hidden_size))
        self.down_proj = nn.Parameter(torch.empty(num_experts, hidden_size, moe_intermediate_size))
        # torch.empty leaves the values uninitialized (defect 17). The normal init
        # below is a placeholder default; real weights should come from a checkpoint.
        for p in (self.up_proj, self.gate_proj, self.down_proj):
            nn.init.normal_(p, mean=0.0, std=0.02)

    def _get_expert_output(self, x: torch.Tensor, expert_id: int) -> torch.Tensor:
        ValidationMachines.check_expert_id_range(expert_id, self.num_experts)
        gate_out = F.linear(x, self.gate_proj[expert_id])
        up_out = F.linear(x, self.up_proj[expert_id])
        # Fix: the erroneous .T transpose has been removed
        return F.linear(F.silu(gate_out) * up_out, self.down_proj[expert_id])

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        b, s, d = hidden_states.shape
        flat = hidden_states.view(-1, d)
        router_logits = self.gate(flat)
        weights = F.softmax(router_logits, dim=1, dtype=torch.float)
        weights, experts = torch.topk(weights, self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over the top-k
        weights = weights.to(hidden_states.dtype)
        output = torch.zeros_like(flat)
        # (tokens, top_k, experts) -> (experts, top_k, tokens)
        expert_mask = F.one_hot(experts, num_classes=self.num_experts).permute(2, 1, 0)
        for eid in range(self.num_experts):
            # idx: which top-k slot chose this expert; top_x: which token
            idx, top_x = torch.where(expert_mask[eid])
            if top_x.numel() == 0:
                continue
            expert_out = self._get_expert_output(flat[top_x], eid)
            output.index_add_(0, top_x, expert_out * weights[top_x, idx, None])
        return output.view(b, s, d)

# ==========================================================================
# Pool A: multimodal embedding merge machine
# ==========================================================================
class MultimodalEmbeddingMachine:
    @staticmethod
    def merge(input_ids: torch.Tensor, text_embeds: torch.Tensor, multimodal_embeddings: List[torch.Tensor], image_token_id: int) -> torch.Tensor:
        is_image = (input_ids == image_token_id)
        num_placeholders = is_image.sum().item()
        # Flatten first and count rows of the flattened tensor, so that the count
        # agrees with what is actually written, even for inputs with more than 2 dims.
        flat_mm = torch.cat([emb.reshape(-1, emb.shape[-1]) for emb in multimodal_embeddings], dim=0)
        total_mm_tokens = flat_mm.shape[0]
        if total_mm_tokens != num_placeholders:
            raise ValueError(f"Multimodal tokens {total_mm_tokens} != placeholders {num_placeholders}")
        # Work on a copy so the caller's text_embeds is not modified (defect 11)
        inputs_embeds = text_embeds.clone()
        inputs_embeds[is_image] = flat_mm.to(inputs_embeds.dtype)
        return inputs_embeds

# ==========================================================================
# L2 orchestration layer: decoder-layer scheduler (cache memory leak and duplicate concatenation fixed)
# ==========================================================================
class DecoderLayerScheduler(nn.Module):
    def __init__(self, config: ModelConfig):
        super().__init__()
        self.q_proj = QProjectionMachine(config.hidden_size, config.num_attention_heads, config.head_dim)
        self.k_proj = KProjectionMachine(config.hidden_size, config.num_key_value_heads, config.head_dim)
        self.v_proj = VProjectionMachine(config.hidden_size, config.num_key_value_heads, config.head_dim)
        self.attn = AttentionComputeMachine(config.num_attention_heads, config.num_key_value_heads, config.head_dim, config.hidden_size, config.attention_dropout)
        self.input_norm = RMSNormMachine(config.hidden_size, config.rms_norm_eps)
        self.post_norm = RMSNormMachine(config.hidden_size, config.rms_norm_eps)
        self.use_moe = config.moe_num_experts > 0
        if self.use_moe:
            self.moe = MoEMachine(config.hidden_size, config.moe_intermediate_size, config.moe_num_experts, config.moe_top_k)
            self.shared_expert = FFNMachine(config.hidden_size, config.share_expert_dim)
        else:
            self.mlp = FFNMachine(config.hidden_size, config.intermediate_size)

    def forward(self, h, cos, sin, mask=None, past_k=None, past_v=None):
        normed = self.input_norm(h)
        q = self.q_proj(normed)
        k = self.k_proj(normed)
        v = self.v_proj(normed)
        q, k = RoPEMachine.apply_rotary(q, k, cos, sin)
        # Fix: concatenate only what the attention computation needs; k and v stay unmodified
        k_attn = CachePool.merge_kv(past_k, k)
        v_attn = CachePool.merge_kv(past_v, v)
        attn_out = self.attn(q, k_attn, v_attn, mask)
        h = h + attn_out
        normed = self.post_norm(h)
        if self.use_moe:
            ffn_out = self.moe(normed) + self.shared_expert(normed)
        else:
            ffn_out = self.mlp(normed)
        h = h + ffn_out
        # Fix: return only the k, v produced in this step; the outer CachePool owns concatenation
        return h, k, v

# ==========================================================================
# L2 orchestration layer: multi-layer scheduler (CachePool initialization fixed)
# ==========================================================================
class ModelScheduler(nn.Module):
    def __init__(self, config: ModelConfig):
        super().__init__()
        self.layers = nn.ModuleList([DecoderLayerScheduler(config) for _ in range(config.num_hidden_layers)])
        self.norm = RMSNormMachine(config.hidden_size, config.rms_norm_eps)
        # Read the fixed constant from the configuration pool
        self.max_cache_length = config.kv_cache_max_length

    def forward(self, h, cos, sin, mask=None, cache=None):
        # Fix: initialize with the fixed constant so that the overflow check is effective
        new_cache = CachePool(max_length=self.max_cache_length)
        # If a past cache is passed in, carry its state into new_cache for this call.
        # Industrial implementations usually update the cache in place; this version
        # keeps the original design's immutability and builds a new_cache instead.
        if cache is not None:
            new_cache.key_cache = cache.key_cache.copy()
            new_cache.value_cache = cache.value_cache.copy()
            new_cache.seq_length = cache.seq_length
        for i, layer in enumerate(self.layers):
            pk, pv = new_cache.get(i)  # (None, None) when the layer has no history
            h, k, v = layer(h, cos, sin, mask, pk, pv)
            # Append the current step's k, v to new_cache
            new_cache.update(i, k, v)
        h = self.norm(h)
        return h, new_cache

# ==========================================================================
# L1 entry layer: the complete Step3-VL model
# ==========================================================================
class Step3VLModel(nn.Module):
    def __init__(self, config: ModelConfig):
        super().__init__()
        self.config = config
        self.embed = nn.Embedding(config.vocab_size, config.hidden_size)
        self.rotary = RoPEMachine(config.head_dim, config.max_position_embedding, config.rope_theta)
        self.scheduler = ModelScheduler(config)
        self.lm_head = nn.Linear(config.hidden_size, config.vocab_size, bias=False)

    def forward(self, input_ids, image_embeddings=None, attention_mask=None, position_ids=None, cache=None):
        ValidationMachines.check_input_dim(input_ids, 2, "input_ids")
        h = self.embed(input_ids)
        if image_embeddings is not None:
            h = MultimodalEmbeddingMachine.merge(input_ids, h, image_embeddings, self.config.image_token_id)
        b, s = h.shape[:2]
        if position_ids is None:
            # Continue numbering positions after whatever is already cached
            past_len = cache.seq_length if cache is not None else 0
            position_ids = torch.arange(past_len, past_len + s, device=h.device).unsqueeze(0)
        cos, sin = self.rotary(h, position_ids)
        # NOTE: no causal mask is built here; when attention_mask is None,
        # attention is unmasked, so the caller must supply a 4D additive mask.
        h, new_cache = self.scheduler(h, cos, sin, attention_mask, cache)
        logits = self.lm_head(h)
        ValidationMachines.check_tensor_finite(logits, "logits")
        return logits, new_cache
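
The cache discipline used in the rewrite (each layer returns only the keys and values of the current step, and the cache appends them once per layer) can be illustrated without tensors. The sketch below is a toy model under stated assumptions: `ToyLayerCache` and `run_step` are hypothetical names, and plain Python lists stand in for the key tensors.

```python
from typing import List


class ToyLayerCache:
    """Per-layer append-only cache; plain lists stand in for key tensors."""

    def __init__(self, num_layers: int, max_length: int):
        self.max_length = max_length
        self.keys: List[List[int]] = [[] for _ in range(num_layers)]

    def append(self, layer_idx: int, new_keys: List[int]) -> None:
        if not 0 <= layer_idx < len(self.keys):
            raise IndexError(f"layer_idx {layer_idx} out of range")
        past_len = len(self.keys[layer_idx])  # this layer's own length
        if past_len + len(new_keys) > self.max_length:
            raise ValueError(
                f"cache overflow in layer {layer_idx}: "
                f"{past_len} + {len(new_keys)} > {self.max_length}"
            )
        self.keys[layer_idx] = self.keys[layer_idx] + new_keys


def run_step(cache: ToyLayerCache, step_keys: List[int]) -> List[int]:
    """Return the attended length per layer, then store only the new keys."""
    attended_lengths = []
    for layer_idx in range(len(cache.keys)):
        attended = cache.keys[layer_idx] + step_keys  # temporary merge for attention
        attended_lengths.append(len(attended))
        cache.append(layer_idx, step_keys)  # the single place where history grows
    return attended_lengths
```

Because the overflow check reads each layer's own length, every layer accepts the same number of new tokens in a step; a three-token prompt followed by a one-token decode step fills a cache of size 4 exactly, and a further step is rejected.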