ChatGLM3 Introduction

Introduction
ChatGLM3 is a new-generation conversational pre-trained model jointly released by Zhipu AI and the KEG Lab of Tsinghua University. ChatGLM3-6B is the open-source model in the ChatGLM3 series. While keeping the many strengths of the previous two generations, such as smooth dialogue and a low deployment threshold, ChatGLM3-6B introduces the following features:
- **A more powerful base model:** ChatGLM3-6B-Base delivers the strongest performance among base models under 10B parameters.

  | Model | GSM8K | MATH | BBH | MMLU | C-Eval | CMMLU | MBPP | AGIEval |
  |---|---|---|---|---|---|---|---|---|
  | ChatGLM2-6B-Base | 32.4 | 6.5 | 33.7 | 47.9 | 51.7 | 50.0 | - | - |
  | Best Baseline (under 10B) | 52.1 | 13.1 | 45.0 | 60.1 | 63.5 | 62.2 | 47.5 | 45.8 |
  | ChatGLM3-6B-Base | 72.3 | 25.7 | 66.1 | 61.4 | 69.0 | 67.5 | 52.4 | 53.7 |

  | Model | Average | Summary | Single-Doc QA | Multi-Doc QA | Code | Few-shot | Synthetic |
  |---|---|---|---|---|---|---|---|
  | ChatGLM2-6B-32K | 41.5 | 24.8 | 37.6 | 34.7 | 52.8 | 51.3 | 47.7 |
  | ChatGLM3-6B-32K | 50.2 | 26.6 | 45.8 | 46.1 | 56.2 | 61.2 | 65 |
- **More complete feature support:** ChatGLM3-6B adopts a newly designed prompt format and natively supports the following (a sketch of tool calling follows the download table below):
  - Multi-turn dialogue
  - Tool calling (Function Call)
  - Code execution (Code Interpreter)
  - Agent tasks
- **A more comprehensive open-source series:**

  | Model | Seq Length | Download |
  |---|---|---|
  | ChatGLM3-6B | 8k | HuggingFace \| ModelScope |
  | ChatGLM3-6B-Base | 8k | HuggingFace \| ModelScope |
  | ChatGLM3-6B-32K | 32k | HuggingFace \| ModelScope |
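As a rough illustration of the Function Call support, the sketch below follows the shape of the tool-calling demo in the official ChatGLM3 repository. The tool name `get_weather`, its parameter schema, and the fake observation string are made up for illustration; it assumes `model` and `tokenizer` have already been loaded as in the inference code of the next section, and the exact reply format may vary between model versions.

```python
# Hypothetical tool definition; the overall schema mirrors the official
# tool-calling demo, but the tool name and its fields are illustrative only.
tools = [
    {
        "name": "get_weather",
        "description": "Query the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name"}
            },
            "required": ["city"],
        },
    }
]

# The tool list is announced to the model through a system message placed at
# the head of the conversation history.
system_item = {
    "role": "system",
    "content": "Answer the following questions as best as you can. "
               "You have access to the following tools:",
    "tools": tools,
}

history = [system_item]
query = "What is the weather like in Beijing right now?"
response, history = model.chat(tokenizer, query, history=history)
# When the model decides to call a tool, the response is typically a dict
# such as {"name": "get_weather", "parameters": {"city": "Beijing"}}.
print(response)

# Execute the tool yourself, then feed the result back as an observation so
# the model can compose the final natural-language answer.
observation = '{"city": "Beijing", "weather": "sunny", "temperature": "25C"}'
response, history = model.chat(tokenizer, observation, history=history, role="observation")
print(response)
```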
Inference Code
```python
from modelscope import AutoTokenizer, AutoModel, snapshot_download

# Download the chatglm3-6b weights from ModelScope, then load them in fp16 on the GPU.
model_dir = snapshot_download("ZhipuAI/chatglm3-6b", revision="master")
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
model = model.eval()

# chat() returns the reply plus the updated history, which carries context into the next turn.
response, history = model.chat(tokenizer, "你好", history=[])  # "你好" = "Hello"
print(response)
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)  # "What should I do if I can't sleep at night?"
print(response)
```
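For machines with limited GPU memory, the ChatGLM repositories also document an int4 quantization helper on the remote-code model class. A minimal sketch, reusing the `model_dir` downloaded above (the exact memory savings and accuracy cost depend on your setup):

```python
# Load the same checkpoint with int4 weight quantization to reduce GPU memory use.
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).quantize(4).cuda()
model = model.eval()

# Without a GPU, the model can also run (slowly) in float32 on the CPU:
# model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).float()
```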
Loading from the Command Line
```python
import os
import platform

from transformers import AutoTokenizer, AutoModel

model_path = "model/chatglm3_32k/"
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).cuda()
# Multi-GPU support: use the following two lines instead of the line above,
# and set num_gpus to the number of GPUs you actually have.
# from utils import load_model_on_gpus
# model = load_model_on_gpus(model_path, num_gpus=2)
model = model.eval()

os_name = platform.system()
clear_command = 'cls' if os_name == 'Windows' else 'clear'
stop_stream = False

# Welcome banner (Chinese): type text to chat, "clear" resets history, "stop" exits.
welcome_prompt = "欢迎使用 ChatGLM3-6B 模型,输入内容即可进行对话,clear 清空对话历史,stop 终止程序"


def build_prompt(history):
    # Rebuild the full transcript from the (query, response) pairs in history.
    prompt = welcome_prompt
    for query, response in history:
        prompt += f"\n\n用户:{query}"
        prompt += f"\n\nChatGLM3-6B:{response}"
    return prompt


def main():
    past_key_values, history = None, []
    global stop_stream
    print(welcome_prompt)
    while True:
        query = input("\n用户:")  # "用户" = "User"
        if query.strip() == "stop":
            break
        if query.strip() == "clear":
            # Reset the conversation and clear the terminal.
            past_key_values, history = None, []
            os.system(clear_command)
            print(welcome_prompt)
            continue
        print("\nChatGLM:", end="")
        current_length = 0
        # Stream the reply as it is generated; past_key_values carries the KV cache
        # across turns so earlier context is not re-encoded every time.
        for response, history, past_key_values in model.stream_chat(tokenizer, query, history=history,
                                                                     temperature=1,
                                                                     past_key_values=past_key_values,
                                                                     return_past_key_values=True):
            if stop_stream:
                stop_stream = False
                break
            else:
                # Print only the newly generated part of the response.
                print(response[current_length:], end="", flush=True)
                current_length = len(response)
        # Show the accumulated history after each turn (useful for debugging).
        print(history)
        print("")
        # print(past_key_values)


if __name__ == "__main__":
    main()
```
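A note on the design: `stream_chat` yields the partial response as it grows, so the loop only prints the newly added characters, and threading `past_key_values` back into each call reuses the attention KV cache instead of re-encoding the whole conversation on every turn. With `temperature=1` the replies are fairly varied; lowering the temperature gives more deterministic output.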