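SCNet: Training a GPT-2-Style 10-50M Parameter LLM from Scratch (Worked Example)
This article extends "SCNet Supercomputing Internet LLM Fine-Tuning LoRA Example" and focuses on a teaching example of training a GPT-style large model from scratch.
https://blog.csdn.net/YucongCai/article/details/159696147?spm=1001.2014.3001.5501
Readers who have already read that article can go straight to its step 12 to quickly deploy the development code.
After installing the libraries and downloading the free crowdflower/twitter-airline-sentiment dataset from Kaggle as the example data, this article walks through the training workflow for a GPT-2-style LLM.
https://www.kaggle.com/datasets/crowdflower/twitter-airline-sentiment
This article does not cover architecture development or optimization; it is only a hands-on technical example. The code is released for public benefit, was generated with assistance from DeepSeek, and has been tested on this example.
The article has two parts. Part 1 trains directly with GPT2LMHeadModel; Part 2 reimplements the architecture in PyTorch. The article only mimics the GPT-2 training workflow and architecture; all training data is sentiment-analysis data, not general-purpose LLM data.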
15. Load and process the dataset
import pandas as pd
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel, GPT2Config, DataCollatorForLanguageModeling, Trainer, TrainingArguments
from datasets import Dataset
# Load tokenizer early to use for truncation
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
# -------------------------------
# 1. Load and prepare the dataset
# -------------------------------
df = pd.read_csv('twitter-airline-sentimentSentiment_Analysis.csv')
data = df.head(50000)
print(f"Loaded {len(data)} rows")
# Define instruction and compute fixed token count
instruction = "Analyze the sentiment of the following tweet. Output exactly one word, with no punctuation or extra text:"
# The fixed text without tweet and label
fixed_template = f"Instruction: {instruction}\nInput:\nOutput:"
fixed_token_count = len(tokenizer.encode(fixed_template))
print(f"Fixed tokens (without tweet and label): {fixed_token_count}")
# Target total tokens per example (31 + 80 = 111); reused below as n_positions
max_total_tokens = 31 + 80
# Leave 1 token for the sentiment label
max_tweet_tokens = max_total_tokens - fixed_token_count - 1
print(f"Max tokens allowed for tweet content: {max_tweet_tokens}")
def truncate_tweet(tweet, max_tokens):
    tokens = tokenizer.encode(tweet, truncation=True, max_length=max_tokens)
    return tokenizer.decode(tokens, skip_special_tokens=True)
# Format each row with truncated tweet
formatted_texts = []
for _, row in data.iterrows():
    short_tweet = truncate_tweet(row['content'], max_tweet_tokens)
    text = f"Instruction: {instruction}\nInput: {short_tweet}\nOutput: {row['sentiment']}"
    formatted_texts.append(text)
# Create dataset
dataset = Dataset.from_dict({"text": formatted_texts})
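Note: max_total_tokens must equal n_positions in the GPT-2 configuration, because every example is padded to that length and the model has exactly n_positions positional embeddings.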
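The token budget above is simple arithmetic, and a small helper makes the constraint explicit. This is a minimal sketch, not part of the original code: the function name tweet_token_budget and the 31-token template size are assumptions for illustration (the real template size is whatever fixed_token_count prints).

```python
def tweet_token_budget(max_total_tokens: int, fixed_token_count: int, label_tokens: int = 1) -> int:
    """Tokens left for the tweet after reserving the template and the label."""
    budget = max_total_tokens - fixed_token_count - label_tokens
    if budget <= 0:
        raise ValueError("max_total_tokens is too small for the template and label")
    return budget


# Assumed numbers: a 31-token template and the 31 + 80 = 111 total used above
budget = tweet_token_budget(max_total_tokens=111, fixed_token_count=31)
print(budget)  # 79
```

Raising an error on a non-positive budget surfaces a too-small max_total_tokens early, rather than silently truncating every tweet to nothing.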
16. Prepare the training data
# -------------------------------
# 2. Tokenize the dataset
# -------------------------------
def tokenize_function(examples):
    # Pad or truncate every example to the same total token limit
    return tokenizer(
        examples["text"],
        truncation=True,
        padding="max_length",
        max_length=max_total_tokens,
        # Do NOT set return_tensors here; let the collator handle it
    )
tokenized_dataset = dataset.map(tokenize_function, batched=True, remove_columns=["text"])
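To make the effect of truncation=True with padding="max_length" concrete, here is a minimal pure-Python sketch of what happens to one sequence of token IDs. The helper pad_or_truncate is hypothetical and only mirrors the right-padding behaviour of the GPT-2 tokenizer; 50256 is the GPT-2 EOS ID, which this article also uses as the padding ID.

```python
from typing import List, Tuple


def pad_or_truncate(ids: List[int], max_length: int, pad_id: int) -> Tuple[List[int], List[int]]:
    """Return (input_ids, attention_mask), both exactly max_length long."""
    ids = ids[:max_length]            # truncation: drop tokens past the limit
    n_real = len(ids)
    n_pad = max_length - n_real       # padding: fill the remainder on the right
    return ids + [pad_id] * n_pad, [1] * n_real + [0] * n_pad


input_ids, attention_mask = pad_or_truncate([10, 11, 12], max_length=5, pad_id=50256)
print(input_ids)       # [10, 11, 12, 50256, 50256]
print(attention_mask)  # [1, 1, 1, 0, 0]
```

The attention mask is what lets later steps tell real tokens from padding, even though the padding ID is identical to the EOS ID.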
17. Inspect the input data
# -------------------------------
# 3. Inspect the tokenized dataset
# -------------------------------
print("\n--- Inspection of tokenized dataset ---")
for i in range(3):  # first 3 examples
    sample = tokenized_dataset[i]
    input_ids = sample["input_ids"]
    print(f"\nSample {i}:")
    print(" Input IDs length:", len(input_ids))
    actual_len = sum(1 for token in input_ids if token != tokenizer.pad_token_id)
    print(" Actual tokens (excluding padding):", actual_len)
    print(" Decoded text:")
    print(tokenizer.decode(input_ids, skip_special_tokens=True))
    print("-" * 50)
# Overall statistics over real (non-padding) tokens.
# len(input_ids) is always max_total_tokens after padding, so count the attention mask instead.
token_counts = [sum(sample["attention_mask"]) for sample in tokenized_dataset]
print(f"\nMax token length: {max(token_counts)}")
print(f"Min token length: {min(token_counts)}")
print(f"Average token length: {sum(token_counts)/len(token_counts):.1f}")
Part 1
18.a Set the GPT-2 model size
# -------------------------------
# 3. Configure a small GPT-2 model (~45M parameters with these settings)
# -------------------------------
config = GPT2Config(
    vocab_size=50257,              # same as GPT-2/GPT-3
    n_positions=max_total_tokens,  # context length; sets the size of the positional-embedding
                                   # matrix (n_positions, n_embd) added to the token embeddings
    n_embd=512,                    # embedding dimension (GPT-2 small uses 768, ≈ 110M params with n_layer=12)
                                   # n_embd must be divisible by n_head
    n_layer=6,                     # number of transformer blocks
    n_head=8,                      # number of attention heads
    resid_pdrop=0.1,               # dropout for residuals
    embd_pdrop=0.1,
    attn_pdrop=0.1,
)
model = GPT2LMHeadModel(config)
print(f"Model has {model.num_parameters():,} parameters")
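The parameter count printed above can be estimated by hand from the configuration. The sketch below assumes the standard GPT-2 layout (token and position embeddings, per-block attention and MLP with biases, two LayerNorms per block, a final LayerNorm, and an output head tied to the token embeddings). The function name gpt2_param_count is hypothetical; the result should be close to what model.num_parameters() reports.

```python
def gpt2_param_count(vocab_size: int, n_positions: int, n_embd: int, n_layer: int) -> int:
    """Estimate trainable parameters of a GPT-2-style model with a tied LM head."""
    d = n_embd
    embeddings = vocab_size * d + n_positions * d   # wte + wpe
    attention = (d * 3 * d + 3 * d) + (d * d + d)   # c_attn + c_proj (weights + biases)
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)     # c_fc + c_proj (weights + biases)
    layer_norms = 2 * (2 * d)                       # ln_1 + ln_2 (gain + bias each)
    per_block = attention + mlp + layer_norms       # = 12*d^2 + 13*d
    final_ln = 2 * d
    return embeddings + n_layer * per_block + final_ln


part1 = gpt2_param_count(vocab_size=50257, n_positions=111, n_embd=512, n_layer=6)
print(f"{part1:,}")
```

With the settings above this gives about 44.7M parameters, of which roughly 25.7M sit in the token embedding alone. The same formula applied to n_embd=128 and n_layer=4 gives about 7.2M, which shows how strongly the vocabulary size dominates small models.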
19.a Align the model embeddings with the tokenizer vocabulary
model.resize_token_embeddings(len(tokenizer))
Check the hardware
import torch
print("CUDA available:", torch.cuda.is_available())
print("Device count:", torch.cuda.device_count())
print("Device name:", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "None")
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
# Move the existing model; re-creating it here would discard the resize above
model = model.to(device)
print("Model device:", next(model.parameters()).device)
20.a Set the training arguments
# -------------------------------
# 4. Prepare data collator and training arguments
# -------------------------------
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,  # we are training a causal LM, not masked LM
)
# Training arguments (adjust according to your hardware)
training_args = TrainingArguments(
    output_dir="/root/private_data/gpt2-small-twitter-checkpoint",
    overwrite_output_dir=True,
    num_train_epochs=3,             # small dataset, few epochs
    per_device_train_batch_size=8,  # depends on your GPU memory (4090 has 24GB)
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=1,
    warmup_steps=100,
    weight_decay=0.01,
    logging_steps=50,
    save_steps=500,
    save_total_limit=2,
    prediction_loss_only=True,
    fp16=True,                      # enable mixed precision for speed (requires a GPU)
    dataloader_num_workers=4,
    report_to="none",               # disable wandb/tensorboard if not needed
)
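It helps to know roughly how many optimizer steps these arguments imply, since warmup_steps, logging_steps and save_steps are all expressed in steps. The following is a rough estimate under stated assumptions, not the Trainer's exact internal computation: the helper total_optimizer_steps is hypothetical, and the 50,000-row figure assumes the CSV really contains at least that many rows.

```python
import math


def total_optimizer_steps(num_examples: int, per_device_batch_size: int,
                          grad_accum_steps: int = 1, num_devices: int = 1,
                          num_epochs: int = 1) -> int:
    """Approximate number of optimizer updates over the whole training run."""
    batches_per_epoch = math.ceil(num_examples / (per_device_batch_size * num_devices))
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return updates_per_epoch * num_epochs


steps = total_optimizer_steps(num_examples=50_000, per_device_batch_size=8,
                              grad_accum_steps=1, num_devices=1, num_epochs=3)
print(steps)  # 18750
```

Under these assumptions the 100 warmup steps cover well under 1% of training, and a checkpoint is written every 500 steps with only the latest two kept.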
21.a Start training (takes a few minutes)
# -------------------------------
# 5. Create Trainer and start training
# -------------------------------
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset,
    data_collator=data_collator,
    tokenizer=tokenizer,  # newer transformers releases rename this argument to processing_class
)
# Train the model
trainer.train()
22.a Save the trained model
# -------------------------------
# 6. Save the trained model and tokenizer
# -------------------------------
model.save_pretrained("/root/private_data/gpt2-small-twitter")
tokenizer.save_pretrained("/root/private_data/gpt2-small-twitter")
print("Model and tokenizer saved to /root/private_data/gpt2-small-twitter")
23.a Load the model and test the training result
import pandas as pd
import torch
import random
from transformers import GPT2LMHeadModel, GPT2Tokenizer
# -------------------------------
# 1. Load the saved model and tokenizer
# -------------------------------
model_path = "/root/private_data/gpt2-small-twitter"
model = GPT2LMHeadModel.from_pretrained(model_path)
tokenizer = GPT2Tokenizer.from_pretrained(model_path)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
model.eval()
print(f"Model loaded on {device}")
# -------------------------------
# 2. Preprocessing parameters (must match training)
# -------------------------------
df = pd.read_csv('twitter-airline-sentimentSentiment_Analysis.csv')
data = df.head(500)
instruction = "Analyze the sentiment of the following tweet. Output exactly one word, with no punctuation or extra text:"
fixed_template = f"Instruction: {instruction}\nInput:\nOutput:"
fixed_token_count = len(tokenizer.encode(fixed_template))
print(f"Fixed tokens (without tweet and label): {fixed_token_count}")
# This must match the max_total_tokens used during training,
# so read it back from the saved model config instead of hard-coding it.
max_total_tokens = model.config.n_positions
max_tweet_tokens = max_total_tokens - fixed_token_count - 1  # leave 1 for the label
print(f"Max tokens allowed for tweet content: {max_tweet_tokens}")
def truncate_tweet(tweet, max_tokens):
    tokens = tokenizer.encode(tweet, truncation=True, max_length=max_tokens)
    return tokenizer.decode(tokens, skip_special_tokens=True)
# Build formatted texts (for loss calculation)
formatted_texts = []
for _, row in data.iterrows():
    short_tweet = truncate_tweet(row['content'], max_tweet_tokens)
    text = f"Instruction: {instruction}\nInput: {short_tweet}\nOutput: {row['sentiment']}"
    formatted_texts.append(text)
print(f"Loaded {len(formatted_texts)} formatted examples")
# -------------------------------
# 3. Choose 50 random examples
# -------------------------------
random.seed(42)
indices = random.sample(range(len(formatted_texts)), 50)
# -------------------------------
# 4. Test each example
# -------------------------------
for idx in indices:
    full_text = formatted_texts[idx]
    tweet = data.iloc[idx]['content']
    expected_sentiment = data.iloc[idx]['sentiment']
    short_tweet = truncate_tweet(tweet, max_tweet_tokens)
    print(f"\n--- Example {idx} ---")
    print(f"Original tweet: {tweet[:100]}...")
    print(f"Truncated tweet: {short_tweet[:100]}...")
    print(f"Expected sentiment: {expected_sentiment}")
    # ---- Compute loss on the full formatted text ----
    inputs = tokenizer(full_text, return_tensors="pt", truncation=True, max_length=max_total_tokens)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    loss = outputs.loss
    perplexity = torch.exp(loss)
    print(f"Loss: {loss.item():.4f}, Perplexity: {perplexity.item():.2f}")
    # ---- Generate the sentiment ----
    # Build prompt: only up to "Output:"
    prompt = f"Instruction: {instruction}\nInput: {short_tweet}\nOutput:"
    # Tokenize WITHOUT padding to get the actual length; leave room for at least one new token
    prompt_ids = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=max_total_tokens - 1)
    prompt_ids = {k: v.to(device) for k, v in prompt_ids.items()}
    # Determine how many new tokens we can generate without exceeding n_positions
    available_positions = model.config.n_positions - prompt_ids["input_ids"].shape[1]
    max_new_tokens = min(5, available_positions)  # we only need one word, but ensure it fits
    print(f"Prompt length: {prompt_ids['input_ids'].shape[1]}, Available positions: {available_positions}, Generating up to {max_new_tokens} tokens.")
    if max_new_tokens <= 0:
        print("WARNING: No room for generation; skipping.")
        continue
    with torch.no_grad():
        generated_ids = model.generate(
            prompt_ids["input_ids"],
            attention_mask=prompt_ids["attention_mask"],
            max_new_tokens=max_new_tokens,
            do_sample=False,  # greedy decoding
            pad_token_id=tokenizer.eos_token_id,
            eos_token_id=tokenizer.eos_token_id,
        )
    generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    print(f"Full generated text: {generated_text}")  # for debugging
    # Extract the generated output after the last "Output:"
    if "Output:" in generated_text:
        after_output = generated_text.split("Output:")[-1].strip()
        if after_output:
            predicted = after_output.split()[0]  # first word
        else:
            predicted = "<no output>"
    else:
        # If "Output:" not found, take the whole generated text (fallback)
        predicted = generated_text.strip().split()[0] if generated_text.strip() else "<no output>"
    print(f"Predicted sentiment: {predicted}")
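The test loop prints predictions one by one; to summarize them, the extraction logic can be factored into a function and combined with a simple accuracy measure. This is a sketch under assumptions: the names extract_label and accuracy are hypothetical, and lower-casing both sides before comparison is a choice made here, not something the original code does.

```python
from typing import Iterable, Tuple


def extract_label(generated_text: str, marker: str = "Output:") -> str:
    """Return the first word after the last marker, or '<no output>' if there is none."""
    if marker in generated_text:
        tail = generated_text.split(marker)[-1].strip()
    else:
        tail = generated_text.strip()  # fallback: use the whole text
    words = tail.split()
    return words[0].lower() if words else "<no output>"


def accuracy(pairs: Iterable[Tuple[str, str]]) -> float:
    """Fraction of (predicted, expected) pairs that match, ignoring case and whitespace."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(1 for predicted, expected in pairs
                  if predicted == expected.strip().lower())
    return correct / len(pairs)


predicted = extract_label("Instruction: ...\nInput: late again\nOutput: negative today")
print(predicted)                                   # negative
print(accuracy([(predicted, "negative"), ("positive", "neutral")]))  # 0.5
```

Collecting (predicted, expected) pairs inside the loop and calling accuracy once at the end gives a single number to compare between model sizes.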
At this point, a GPT-2-style LLM has been trained. Note that the training dataset serves sentiment analysis only; the model is not a general-purpose few-shot solution.
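Part 2
This part does not follow step 18.a. Instead, it defines the GPT-2 architecture directly in PyTorch and then trains the model with a manual training loop.
18.b First, delete GPT2LMHeadModel and re-import the required objects
# In a notebook this displays transformers.models.gpt2.modeling_gpt2.GPT2LMHeadModel
GPT2LMHeadModel
# Remove the Hugging Face class so the name can be reused for our own implementation
del GPT2LMHeadModel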
import torch
import torch.nn as nn
import torch.nn.functional as F
from typing import Optional, Tuple
19.b Define the multi-head attention layer
class GPT2Attention(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.n_head = config.n_head
        self.n_embd = config.n_embd
        self.head_dim = self.n_embd // self.n_head
        assert self.head_dim * self.n_head == self.n_embd, "n_embd must be divisible by n_head"
        # Projections
        self.c_attn = nn.Linear(self.n_embd, 3 * self.n_embd)  # Q, K, V
        self.c_proj = nn.Linear(self.n_embd, self.n_embd)
        self.attn_dropout = nn.Dropout(config.attn_pdrop)
        self.resid_dropout = nn.Dropout(config.resid_pdrop)
        # Causal mask (lower triangular: each position sees itself and earlier positions)
        self.register_buffer(
            "bias",
            torch.tril(torch.ones(config.n_positions, config.n_positions))
            .view(1, 1, config.n_positions, config.n_positions),
        )

    def forward(self, x, attention_mask=None):
        B, T, C = x.size()  # batch, sequence length, embedding dim
        # Compute Q, K, V
        qkv = self.c_attn(x)  # (B, T, 3*C)
        q, k, v = qkv.split(self.n_embd, dim=2)
        # Reshape to (B, n_head, T, head_dim)
        q = q.view(B, T, self.n_head, self.head_dim).transpose(1, 2)
        k = k.view(B, T, self.n_head, self.head_dim).transpose(1, 2)
        v = v.view(B, T, self.n_head, self.head_dim).transpose(1, 2)
        # Scaled dot-product attention
        att = (q @ k.transpose(-2, -1)) * (1.0 / (self.head_dim ** 0.5))  # (B, n_head, T, T)
        # Apply causal mask
        att = att.masked_fill(self.bias[:, :, :T, :T] == 0, float('-inf'))
        # Apply optional attention mask (from padding)
        if attention_mask is not None:
            # attention_mask: (B, T) with 1 for real tokens, 0 for padding
            att_mask = attention_mask[:, None, None, :]  # (B, 1, 1, T)
            att = att.masked_fill(att_mask == 0, float('-inf'))
        att = F.softmax(att, dim=-1)
        att = self.attn_dropout(att)
        y = att @ v  # (B, n_head, T, head_dim)
        y = y.transpose(1, 2).contiguous().view(B, T, C)
        y = self.resid_dropout(self.c_proj(y))
        return y
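The causal mask is easier to see on a tiny example. Filling a score with -inf before the softmax gives it weight zero, so masking is equivalent to taking the softmax over the visible positions only. The pure-Python sketch below illustrates that for a single head; the function name causal_attention_weights and the toy scores are assumptions for illustration.

```python
import math
from typing import List


def causal_attention_weights(scores: List[List[float]]) -> List[List[float]]:
    """Row-wise softmax where query i may only attend to keys j <= i."""
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]                       # keys at or before position i
        m = max(visible)                             # subtract max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        hidden = [0.0] * (len(row) - i - 1)          # future keys get zero weight
        weights.append([e / total for e in exps] + hidden)
    return weights


scores = [[0.0, 5.0, 5.0],
          [1.0, 1.0, 5.0],
          [0.0, 0.0, 0.0]]
weights = causal_attention_weights(scores)
for row in weights:
    print([round(w, 3) for w in row])
```

The first row puts all its weight on position 0 regardless of the large scores to its right, which is exactly what prevents the model from peeking at future tokens during training.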
20.b Define the MLP layer
class GPT2MLP(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.c_fc = nn.Linear(config.n_embd, 4 * config.n_embd)    # expand to 4x width
        self.c_proj = nn.Linear(4 * config.n_embd, config.n_embd)  # project back
        self.dropout = nn.Dropout(config.resid_pdrop)
        self.act = nn.GELU()

    def forward(self, x):
        x = self.act(self.c_fc(x))
        x = self.dropout(self.c_proj(x))
        return x
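One detail worth knowing about the activation: nn.GELU() defaults to the exact erf-based form, while the Hugging Face GPT-2 configuration defaults to the tanh approximation ("gelu_new"). The two are very close, so this does not matter for a teaching example, but it is one reason outputs will not match the library implementation bit for bit. A minimal pure-Python comparison (the function names are hypothetical):

```python
import math


def gelu_exact(x: float) -> float:
    """GELU using the Gaussian CDF: x * Phi(x)."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


grid = [i / 10 for i in range(-50, 51)]
max_gap = max(abs(gelu_exact(x) - gelu_tanh(x)) for x in grid)
print(f"largest difference on [-5, 5]: {max_gap:.6f}")
```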
21.b Combine the multi-head attention layer and the MLP layer into one GPT-2 Transformer block/layer (decoder block)
class GPT2Block(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.ln_1 = nn.LayerNorm(config.n_embd)
        self.attn = GPT2Attention(config)
        self.ln_2 = nn.LayerNorm(config.n_embd)
        self.mlp = GPT2MLP(config)

    def forward(self, x, attention_mask=None):
        # Pre-norm residual connections: normalize, apply the sublayer, then add the input back
        x = x + self.attn(self.ln_1(x), attention_mask)
        x = x + self.mlp(self.ln_2(x))
        return x
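The pre-norm pattern used in the block, x + sublayer(LayerNorm(x)), can be sketched on a single vector in pure Python. This is an illustration only: layer_norm here omits the learnable gain and bias that nn.LayerNorm has, and both function names are hypothetical.

```python
import math
from typing import Callable, List


def layer_norm(x: List[float], eps: float = 1e-5) -> List[float]:
    """Normalize to zero mean and unit (biased) variance, without gain or bias."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]


def pre_norm_residual(x: List[float], sublayer: Callable[[List[float]], List[float]]) -> List[float]:
    """x + sublayer(layer_norm(x)): the input bypasses both the norm and the sublayer."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]


# A sublayer that outputs zeros leaves the input untouched via the residual path
out = pre_norm_residual([1.0, 2.0, 3.0], lambda h: [0.0 for _ in h])
print(out)  # [1.0, 2.0, 3.0]
```

Because the residual path carries the raw input forward unchanged, gradients can flow through many stacked blocks even when individual sublayers contribute little early in training.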
22.b Stack GPT2Block modules into the GPT-2 model (embeddings + Transformer blocks)
class GPT2Model(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.config = config
        self.wte = nn.Embedding(config.vocab_size, config.n_embd)   # token embeddings
        self.wpe = nn.Embedding(config.n_positions, config.n_embd)  # position embeddings
        self.blocks = nn.ModuleList([GPT2Block(config) for _ in range(config.n_layer)])
        self.ln_f = nn.LayerNorm(config.n_embd)

    def forward(self, input_ids, attention_mask=None):
        B, T = input_ids.size()
        assert T <= self.config.n_positions, f"Sequence length {T} exceeds n_positions {self.config.n_positions}"
        # Token and position embeddings
        token_embeds = self.wte(input_ids)  # (B, T, n_embd)
        position_ids = torch.arange(T, device=input_ids.device).unsqueeze(0)  # (1, T)
        pos_embeds = self.wpe(position_ids)  # (1, T, n_embd)
        x = token_embeds + pos_embeds
        # Apply transformer blocks
        for block in self.blocks:
            x = block(x, attention_mask)
        x = self.ln_f(x)
        return x
23.b Add the language-model wrapper, in particular the linear head (lm_head), to form a trainable GPT2LMHeadModel
class GPT2LMHeadModel(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.config = config
        self.transformer = GPT2Model(config)
        self.lm_head = nn.Linear(config.n_embd, config.vocab_size, bias=False)
        # Tie weights (optional but common): the output head shares the token-embedding matrix
        self.lm_head.weight = self.transformer.wte.weight

    def forward(self, input_ids, attention_mask=None, labels=None):
        hidden_states = self.transformer(input_ids, attention_mask)
        logits = self.lm_head(hidden_states)  # (B, T, vocab_size)
        loss = None
        if labels is not None:
            # Shift so that tokens < n predict token n
            shift_logits = logits[..., :-1, :].contiguous()
            shift_labels = labels[..., 1:].contiguous()
            # Positions labeled -100 are skipped (the default ignore_index of F.cross_entropy)
            loss = F.cross_entropy(shift_logits.view(-1, shift_logits.size(-1)), shift_labels.view(-1))
        # Return a dictionary (compatible with Hugging Face Trainer)
        return {"loss": loss, "logits": logits} if loss is not None else {"logits": logits}
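The shift in the loss is the core of causal language modeling: the output at position t is scored against the token at position t + 1, and positions labeled -100 are skipped. A toy pure-Python version, with a three-token vocabulary and made-up probabilities (all names and numbers here are assumptions for illustration), also shows how the perplexity reported in the tests relates to the loss.

```python
import math
from typing import List, Sequence, Tuple

IGNORE_INDEX = -100  # same value F.cross_entropy ignores by default


def shifted_pairs(labels: Sequence[int]) -> List[Tuple[int, int]]:
    """(position, target) pairs: the output at position t is scored against labels[t + 1]."""
    return [(t, labels[t + 1]) for t in range(len(labels) - 1)
            if labels[t + 1] != IGNORE_INDEX]


def mean_nll(probs: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Average negative log-likelihood over the non-ignored shifted targets."""
    pairs = shifted_pairs(labels)
    return -sum(math.log(probs[t][target]) for t, target in pairs) / len(pairs)


# probs[t] is the predicted next-token distribution at position t
labels = [0, 2, 1, IGNORE_INDEX]
probs = [[0.1, 0.1, 0.8],
         [0.25, 0.5, 0.25],
         [1 / 3, 1 / 3, 1 / 3],
         [1 / 3, 1 / 3, 1 / 3]]
loss = mean_nll(probs, labels)
perplexity = math.exp(loss)
print(shifted_pairs(labels))  # [(0, 2), (1, 1)]
print(f"loss={loss:.4f}, perplexity={perplexity:.4f}")
```

Perplexity is simply exp(loss), so a loss of 0 corresponds to perplexity 1 (perfect prediction), and the last position never contributes because there is no next token to predict.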
24.b Define the training setup
# Define the config (adjust as needed)
class Config:
    vocab_size = 50257
    # Must equal the padded sequence length from step 16 (the max_total_tokens used there)
    n_positions = len(tokenized_dataset[0]["input_ids"])
    n_embd = 128
    n_layer = 4
    n_head = 4  # 128/4 = 32 head dimension
    attn_pdrop = 0.1
    resid_pdrop = 0.1
    embd_pdrop = 0.1  # not used in our implementation, but you can add dropout on embeddings if desired
config = Config()
model = GPT2LMHeadModel(config)
# Move to device
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
# Count parameters
print(f"Model has {sum(p.numel() for p in model.parameters()):,} parameters")
# -------------------------------
# 4. Prepare data collator and training arguments
# -------------------------------
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer,
    mlm=False,  # we are training a causal LM, not masked LM
)
25.b Train GPT2LMHeadModel
from torch.utils.data import DataLoader
from transformers import DataCollatorForLanguageModeling
# Prepare dataset (already tokenized)
train_dataloader = DataLoader(tokenized_dataset, batch_size=32, shuffle=True, collate_fn=data_collator)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    total_loss = 0.0
    for batch in train_dataloader:
        # batch contains input_ids, attention_mask and labels
        input_ids = batch["input_ids"].to(device)
        attention_mask = batch["attention_mask"].to(device)
        # The collator sets labels to -100 at padding positions, so padding is excluded from the loss
        labels = batch["labels"].to(device)
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs["loss"]
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        total_loss += loss.item()
    print(f"Epoch {epoch+1}, loss: {total_loss/len(train_dataloader):.4f}")
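A note on the labels: with mlm=False, DataCollatorForLanguageModeling builds a labels tensor as a copy of input_ids in which every position equal to the pad token ID is replaced by -100. Feeding those labels to the loss keeps padding from dominating the average, since most of each padded sequence is padding. The sketch below mimics that rule in pure Python; the helper mask_pad_labels and the example token IDs are hypothetical.

```python
from typing import List

IGNORE_INDEX = -100


def mask_pad_labels(input_ids: List[int], pad_token_id: int) -> List[int]:
    """Copy input_ids, replacing every pad token with the ignore index."""
    return [IGNORE_INDEX if tok == pad_token_id else tok for tok in input_ids]


# 50256 is GPT-2's EOS ID, which this article also uses as the pad ID
labels = mask_pad_labels([318, 257, 1332, 50256, 50256], pad_token_id=50256)
print(labels)  # [318, 257, 1332, -100, -100]
```

One consequence of reusing EOS as the pad token is that any genuine EOS token is masked too, so the model is never trained to emit EOS. That is why the test code caps generation with max_new_tokens and keeps only the first generated word.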
26.b Test the trained model
# -------------------------------
# 2. Helper: Greedy generation for custom model
# -------------------------------
def generate(model, input_ids, max_new_tokens, eos_token_id, pad_token_id=None):
    """Greedy generation for causal LM."""
    generated = input_ids.clone()
    for _ in range(max_new_tokens):
        with torch.no_grad():
            # Get logits for the last position
            outputs = model(generated)
            logits = outputs["logits"]  # (B, T, vocab_size)
            next_token_logits = logits[:, -1, :]  # (B, vocab_size)
            next_token = torch.argmax(next_token_logits, dim=-1, keepdim=True)  # (B, 1)
        generated = torch.cat([generated, next_token], dim=-1)
        # Stop if EOS token generated
        if (next_token == eos_token_id).any():
            break
    return generated
# -------------------------------
# 3. Preprocessing parameters (must match training)
# -------------------------------
import random
df = pd.read_csv('twitter-airline-sentimentSentiment_Analysis.csv')
data = df.head(500)
instruction = "Analyze the sentiment of the following tweet. Output exactly one word, with no punctuation or extra text:"
fixed_template = f"Instruction: {instruction}\nInput:\nOutput:"
fixed_token_count = len(tokenizer.encode(fixed_template))
print(f"Fixed tokens (without tweet and label): {fixed_token_count}")
# This must match the max_total_tokens used during training, so read it from the model config.
max_total_tokens = model.config.n_positions
max_tweet_tokens = max_total_tokens - fixed_token_count - 1  # leave 1 for the label
print(f"Max tokens allowed for tweet content: {max_tweet_tokens}")
def truncate_tweet(tweet, max_tokens):
    tokens = tokenizer.encode(tweet, truncation=True, max_length=max_tokens)
    return tokenizer.decode(tokens, skip_special_tokens=True)
# Build formatted texts (for loss calculation)
formatted_texts = []
for _, row in data.iterrows():
    short_tweet = truncate_tweet(row['content'], max_tweet_tokens)
    text = f"Instruction: {instruction}\nInput: {short_tweet}\nOutput: {row['sentiment']}"
    formatted_texts.append(text)
print(f"Loaded {len(formatted_texts)} formatted examples")
# -------------------------------
# 4. Choose 5 random examples
# -------------------------------
random.seed(42)
indices = random.sample(range(len(formatted_texts)), 5)
# -------------------------------
# 5. Test each example
# -------------------------------
model.eval()  # disable dropout, which is still active after model.train()
for idx in indices:
    full_text = formatted_texts[idx]
    tweet = data.iloc[idx]['content']
    expected_sentiment = data.iloc[idx]['sentiment']
    short_tweet = truncate_tweet(tweet, max_tweet_tokens)
    print(f"\n--- Example {idx} ---")
    print(f"Original tweet: {tweet[:100]}...")
    print(f"Truncated tweet: {short_tweet[:100]}...")
    print(f"Expected sentiment: {expected_sentiment}")
    # ---- Compute loss on the full formatted text ----
    inputs = tokenizer(full_text, return_tensors="pt", truncation=True, max_length=max_total_tokens)
    inputs = {k: v.to(device) for k, v in inputs.items()}
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    loss = outputs["loss"]
    perplexity = torch.exp(loss)
    print(f"Loss: {loss.item():.4f}, Perplexity: {perplexity.item():.2f}")
    # ---- Generate the sentiment ----
    # Build prompt: only up to "Output:"
    prompt = f"Instruction: {instruction}\nInput: {short_tweet}\nOutput:"
    prompt_ids = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=max_total_tokens - 1)  # leave room for generation
    prompt_ids = prompt_ids["input_ids"].to(device)
    # Determine how many new tokens we can generate
    available_positions = model.config.n_positions - prompt_ids.shape[1]
    max_new_tokens = min(5, available_positions)  # we only need one word, but ensure it fits
    print(f"Prompt length: {prompt_ids.shape[1]}, Available positions: {available_positions}, Generating up to {max_new_tokens} tokens.")
    if max_new_tokens <= 0:
        print("WARNING: No room for generation; skipping.")
        continue
    # Generate using our custom function
    with torch.no_grad():
        generated_ids = generate(
            model,
            prompt_ids,
            max_new_tokens,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.eos_token_id,  # pad with EOS for generation
        )
    generated_text = tokenizer.decode(generated_ids[0], skip_special_tokens=True)
    print(f"Full generated text: {generated_text}")
    # Extract the generated output after the last "Output:"
    if "Output:" in generated_text:
        after_output = generated_text.split("Output:")[-1].strip()
        if after_output:
            predicted = after_output.split()[0]  # first word
        else:
            predicted = "<no output>"
    else:
        predicted = generated_text.strip().split()[0] if generated_text.strip() else "<no output>"
    print(f"Predicted sentiment: {predicted}")
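The custom generate helper implements greedy decoding: score the next token, append the highest-scoring one, and repeat until EOS or the token limit. The same loop can be exercised without a neural network by swapping in a toy scoring function. Everything in this sketch (greedy_decode, toy_scores, the four-token vocabulary) is hypothetical and only illustrates the control flow.

```python
from typing import Callable, List, Sequence


def greedy_decode(next_token_scores: Callable[[Sequence[int]], Sequence[float]],
                  prompt: Sequence[int], max_new_tokens: int, eos_token_id: int) -> List[int]:
    """Repeatedly append the highest-scoring next token; stop at EOS or the token limit."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        next_token = max(range(len(scores)), key=scores.__getitem__)  # argmax
        tokens.append(next_token)
        if next_token == eos_token_id:
            break
    return tokens


def toy_scores(tokens: Sequence[int]) -> List[float]:
    """Toy 'model' over 4 tokens: always prefers (last token + 1) mod 4."""
    preferred = (tokens[-1] + 1) % 4
    return [1.0 if i == preferred else 0.0 for i in range(4)]


generated = greedy_decode(toy_scores, prompt=[0], max_new_tokens=5, eos_token_id=3)
print(generated)  # [0, 1, 2, 3]
```

Decoding stops after the fourth token because token 3 plays the role of EOS, even though up to five new tokens were allowed. Greedy decoding is deterministic, which suits a one-word classification output.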
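This model can be saved in several ways, for example by saving the PyTorch file directly or by converting it to a Hugging Face model before saving; this article does not cover them.
At this point, a 10M-50M GPT-2-style LLM has been trained. As before, the training data serves sentiment analysis only; this is not a real general-purpose LLM.
I am looking for a job. For HR or project collaboration inquiries, please contact: yucongcai_business@outlook.com
For research-related inquiries, please contact: yucongcai_research@outlook.com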