[特殊字符]AI洞察者中心|知识库(六)
来源:https://bluefocus.feishu.cn/docx/T0lLdbOy6oPUDGxcI3Lcr2Gtn2e
)
return response.choices[0].message["content"]
我们先来看prompt,后面会有解析
delimiter = "####"
system_message = f"""
Follow these steps to answer the customer queries.
The customer query will be delimited with four hashtags,\
i.e. {delimiter}.
Step 1:{delimiter} First decide whether the user is \
asking a question about a specific product or products. \
Product cateogry doesn't count.
Step 2:{delimiter} If the user is asking about \
specific products, identify whether \
the products are in the following list.
All available products:
-
Product: TechPro Ultrabook
Category: Computers and Laptops
Brand: TechPro
Model Number: TP-UB100
Warranty: 1 year
Rating: 4.5
Features: 13.3-inch display, 8GB RAM, 256GB SSD, Intel Core i5 processor
Description: A sleek and lightweight ultrabook for everyday use.
Price: $799.99
-
Product: BlueWave Gaming Laptop
Category: Computers and Laptops
Brand: BlueWave
Model Number: BW-GL200
Warranty: 2 years
Rating: 4.7
Features: 15.6-inch display, 16GB RAM, 512GB SSD, NVIDIA GeForce RTX 3060
Description: A high-performance gaming laptop for an immersive experience.
Price: $1199.99
-
Product: PowerLite Convertible
Category: Computers and Laptops
Brand: PowerLite
Model Number: PL-CV300
Warranty: 1 year
Rating: 4.3
Features: 14-inch touchscreen, 8GB RAM, 256GB SSD, 360-degree hinge
Description: A versatile convertible laptop with a responsive touchscreen.
Price: $699.99
-
Product: TechPro Desktop
Category: Computers and Laptops
Brand: TechPro
Model Number: TP-DT500
Warranty: 1 year
Rating: 4.4
Features: Intel Core i7 processor, 16GB RAM, 1TB HDD, NVIDIA GeForce GTX 1660
Description: A powerful desktop computer for work and play.
Price: $999.99
-
Product: BlueWave Chromebook
Category: Computers and Laptops
Brand: BlueWave
Model Number: BW-CB100
Warranty: 1 year
Rating: 4.1
Features: 11.6-inch display, 4GB RAM, 32GB eMMC, Chrome OS
Description: A compact and affordable Chromebook for everyday tasks.
Price: $249.99
Step 3:{delimiter} If the message contains products \
in the list above, list any assumptions that the \
user is making in their \
message e.g. that Laptop X is bigger than \
Laptop Y, or that Laptop Z has a 2 year warranty.
Step 4:{delimiter}: If the user made any assumptions, \
figure out whether the assumption is true based on your \
product information.
Step 5:{delimiter}: First, politely correct the \
customer's incorrect assumptions if applicable. \
Only mention or reference products in the list of \
5 available products, as these are the only 5 \
products that the store sells. \
Answer the customer in a friendly tone.
Use the following format:
Step 1:{delimiter} <step 1 reasoning>
Step 2:{delimiter} <step 2 reasoning>
Step 3:{delimiter} <step 3 reasoning>
Step 4:{delimiter} <step 4 reasoning>
Response to user:{delimiter} <response to customer>
Make sure to include {delimiter} to separate every step.
"""
写好prompt后,再测试一下用户输入
user_message = f"""
by how much is the BlueWave Chromebook more expensive \
than the TechPro Desktop"""
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': f"{delimiter}{user_message}{delimiter}"},
]
response = get_completion_from_messages(messages)
print(response)
"""
Step 1:#### The user is asking a question about two specific products, the BlueWave Chromebook and the TechPro Desktop.
Step 2:#### The prices of the two products are as follows:
-
BlueWave Chromebook: $249.99
-
TechPro Desktop: $999.99
Step 3:#### The user is assuming that the BlueWave Chromebook is more expensive than the TechPro Desktop.
Step 4:#### The assumption is incorrect. The TechPro Desktop is actually more expensive than the BlueWave Chromebook.
Response to user:#### The BlueWave Chromebook is actually less expensive than the TechPro Desktop. The BlueWave Chromebook costs $249.99 while the TechPro Desktop costs $999.99.
"""
你可以发现,它完全遵从了我们的指令来
user_message = f"""
do you sell tvs"""
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': f"{delimiter}{user_message}{delimiter}"},
]
response = get_completion_from_messages(messages)
print(response)
"""
Step 1:#### The user is asking about a specific product category, TVs.
Step 2:#### The list of available products does not include any TVs.
Response to user:#### I'm sorry, but we do not sell TVs at this time. Our store specializes in computers and laptops. However, if you are interested in purchasing a computer or laptop, please let me know and I would be happy to assist you.
"""
可以看下面的解析
从示例二的回复中我们可以看到,中间的步骤,其实不是必要的。
虽然我们给定了LLM格式,要如何输出,但有时候,模型也可能不遵守。我们需要做错误兜底
比如像下面这样
try:
final_response = response.split(delimiter)[-1].strip()
except Exception as e:
final_response = "Sorry, I'm having trouble right now, please try asking another question."
print(final_response)
"""
I'm sorry, but we do not sell TVs at this time. Our store specializes in computers and laptops. However, if you are interested in purchasing a computer or laptop, please let me know and I would be happy to assist you.
"""
总的来说,找到最佳的折中需要一些实践的经验。因此要投入生产环境之前,要尽可能地尝试各种prompt来看任务完成的情况。
# 链式提示 (Chaining Prompts)
## 链式提示 VS 思维链
我们将继续学习通过思维链技术来把复杂任务拆解成一步一步的简单任务的prompt提示词。即,把多个简单prompt提示词串成链式的流程。
你可能会疑惑,为什么我们之前能用一个prompt提示词解决许多任务,现在却倒回来了。
我们可以用做菜类比来描述这个过程:
+ 使用一条长而复杂的指令来完成任务就像一次性烹饪复杂的餐点,需要同时处理多个成分、烹饪技巧和时间安排,难以同时跟踪和确保每个部分的完美烹饪。
+ 而将多个提示串联起来则像分阶段烹饪餐点,侧重于逐个部分的烹饪,确保每个部分在移动到下一个部分之前都得到正确的烹饪。这种方式将任务的复杂性降低,使任务管理更加容易,并降低了出错的可能性。然而,对于非常简单的食谱,这种方式可能会过于不必要和复杂。
总的来说,链接提示有如下好处:
+ 拆分复杂任务:通过链接多个提示,我们可以将复杂任务拆分成一系列更简单的子任务,降低了任务的复杂性,使管理更易,也减少了出错的可能性。类似于模块化编程,减少了代码(任务)之间的复杂依赖和歧义,方便调试和执行。
+ 维持和控制工作流的状态:根据不同的状态进行不同的行动。它能确保模型拥有执行任务所需的所有信息,并降低出错的可能性。这种方法能降低成本,因为长的提示和更多的令牌会导致更高的运行成本。而且它也方便跟踪状态,然后根据需要注入相关指示,而不是在一条提示中描述整个工作流程。
+ 易于测试:方便找出可能出错的步骤,也有利于在某个步骤进行人机协作。
+ 引入工具:允许模型在必要时在工作流程的某些点使用外部工具,如查阅产品目录、调用API、搜索知识库等。
## 状态分类
在代码中,我们依旧看之前的例子
我们先让chatGPT针对用户的意图需求分类出产品的类别和涉及到的产品名
delimiter = "####"
system_message = f"""
You will be provided with customer service queries. \
The customer service query will be delimited with \
{delimiter} characters.
Output a python list of objects, where each object has \
the following format:
'category': <one of Computers and Laptops, \
Smartphones and Accessories, \
Televisions and Home Theater Systems, \
Gaming Consoles and Accessories,
Audio Equipment, Cameras and Camcorders>,
OR
'products': <a list of products that must \
be found in the allowed products below>
Where the categories and products must be found in \
the customer service query.
If a product is mentioned, it must be associated with \
the correct category in the allowed products list below.
If no products or categories are found, output an \
empty list.
Allowed products:
Computers and Laptops category:
TechPro Ultrabook
BlueWave Gaming Laptop
PowerLite Convertible
TechPro Desktop
BlueWave Chromebook
Smartphones and Accessories category:
SmartX ProPhone
MobiTech PowerCase
SmartX MiniPhone
MobiTech Wireless Charger
SmartX EarBuds
Televisions and Home Theater Systems category:
CineView 4K TV
SoundMax Home Theater
CineView 8K TV
SoundMax Soundbar
CineView OLED TV
Gaming Consoles and Accessories category:
GameSphere X
ProGamer Controller
GameSphere Y
ProGamer Racing Wheel
GameSphere VR Headset
Audio Equipment category:
AudioPhonic Noise-Canceling Headphones
WaveSound Bluetooth Speaker
AudioPhonic True Wireless Earbuds
WaveSound Soundbar
AudioPhonic Turntable
Cameras and Camcorders category:
FotoSnap DSLR Camera
ActionCam 4K
FotoSnap Mirrorless Camera
ZoomMaster Camcorder
FotoSnap Instant Camera
Only output the list of objects, with nothing else.
"""
user_message_1 = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs """
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': f"{delimiter}{user_message_1}{delimiter}"},
]
category_and_product_response_1 = get_completion_from_messages(messages)
print(category_and_product_response_1)
"""
[
{'category': 'Smartphones and Accessories', 'products': ['SmartX ProPhone']},
{'category': 'Cameras and Camcorders', 'products': ['FotoSnap DSLR Camera']},
{'category': 'Televisions and Home Theater Systems'}
]
"""
这个prompt很简单,我们也来拆解一下
## 检索回复
有了用户问询到的产品类别和产品名,我们就可以通过查询我们已有的产品数据库中找到更细的产品描述。再将topk相关的产品描述用一个回复prompt模板给到chatGPT生成最终的回复。
方便测试,我们让GPT-4生成大量的产品细节
product information
products = {
"TechPro Ultrabook": {
"name": "TechPro Ultrabook",
"category": "Computers and Laptops",
"brand": "TechPro",
"model_number": "TP-UB100",
"warranty": "1 year",
"rating": 4.5,
"features": ["13.3-inch display", "8GB RAM", "256GB SSD", "Intel Core i5 processor"],
"description": "A sleek and lightweight ultrabook for everyday use.",
"price": 799.99
},
"BlueWave Gaming Laptop": {
"name": "BlueWave Gaming Laptop",
"category": "Computers and Laptops",
"brand": "BlueWave",
"model_number": "BW-GL200",
"warranty": "2 years",
"rating": 4.7,
"features": ["15.6-inch display", "16GB RAM", "512GB SSD", "NVIDIA GeForce RTX 3060"],
"description": "A high-performance gaming laptop for an immersive experience.",
"price": 1199.99
},
"PowerLite Convertible": {
"name": "PowerLite Convertible",
"category": "Computers and Laptops",
"brand": "PowerLite",
"model_number": "PL-CV300",
"warranty": "1 year",
"rating": 4.3,
"features": ["14-inch touchscreen", "8GB RAM", "256GB SSD", "360-degree hinge"],
"description": "A versatile convertible laptop with a responsive touchscreen.",
"price": 699.99
},
"TechPro Desktop": {
"name": "TechPro Desktop",
"category": "Computers and Laptops",
"brand": "TechPro",
"model_number": "TP-DT500",
"warranty": "1 year",
"rating": 4.4,
"features": ["Intel Core i7 processor", "16GB RAM", "1TB HDD", "NVIDIA GeForce GTX 1660"],
"description": "A powerful desktop computer for work and play.",
"price": 999.99
},
"BlueWave Chromebook": {
"name": "BlueWave Chromebook",
"category": "Computers and Laptops",
"brand": "BlueWave",
"model_number": "BW-CB100",
"warranty": "1 year",
"rating": 4.1,
"features": ["11.6-inch display", "4GB RAM", "32GB eMMC", "Chrome OS"],
"description": "A compact and affordable Chromebook for everyday tasks.",
"price": 249.99
},
"SmartX ProPhone": {
"name": "SmartX ProPhone",
"category": "Smartphones and Accessories",
"brand": "SmartX",
"model_number": "SX-PP10",
"warranty": "1 year",
"rating": 4.6,
"features": ["6.1-inch display", "128GB storage", "12MP dual camera", "5G"],
"description": "A powerful smartphone with advanced camera features.",
"price": 899.99
},
"MobiTech PowerCase": {
"name": "MobiTech PowerCase",
"category": "Smartphones and Accessories",
"brand": "MobiTech",
"model_number": "MT-PC20",
"warranty": "1 year",
"rating": 4.3,
"features": ["5000mAh battery", "Wireless charging", "Compatible with SmartX ProPhone"],
"description": "A protective case with built-in battery for extended usage.",
"price": 59.99
},
"SmartX MiniPhone": {
"name": "SmartX MiniPhone",
"category": "Smartphones and Accessories",
"brand": "SmartX",
"model_number": "SX-MP5",
"warranty": "1 year",
"rating": 4.2,
"features": ["4.7-inch display", "64GB storage", "8MP camera", "4G"],
"description": "A compact and affordable smartphone for basic tasks.",
"price": 399.99
},
"MobiTech Wireless Charger": {
"name": "MobiTech Wireless Charger",
"category": "Smartphones and Accessories",
"brand": "MobiTech",
"model_number": "MT-WC10",
"warranty": "1 year",
"rating": 4.5,
"features": ["10W fast charging", "Qi-compatible", "LED indicator", "Compact design"],
"description": "A convenient wireless charger for a clutter-free workspace.",
"price": 29.99
},
"SmartX EarBuds": {
"name": "SmartX EarBuds",
"category": "Smartphones and Accessories",
"brand": "SmartX",
"model_number": "SX-EB20",
"warranty": "1 year",
"rating": 4.4,
"features": ["True wireless", "Bluetooth 5.0", "Touch controls", "24-hour battery life"],
"description": "Experience true wireless freedom with these comfortable earbuds.",
"price": 99.99
},
"CineView 4K TV": {
"name": "CineView 4K TV",
"category": "Televisions and Home Theater Systems",
"brand": "CineView",
"model_number": "CV-4K55",
"warranty": "2 years",
"rating": 4.8,
"features": ["55-inch display", "4K resolution", "HDR", "Smart TV"],
"description": "A stunning 4K TV with vibrant colors and smart features.",
"price": 599.99
},
"SoundMax Home Theater": {
"name": "SoundMax Home Theater",
"category": "Televisions and Home Theater Systems",
"brand": "SoundMax",
"model_number": "SM-HT100",
"warranty": "1 year",
"rating": 4.4,
"features": ["5.1 channel", "1000W output", "Wireless subwoofer", "Bluetooth"],
"description": "A powerful home theater system for an immersive audio experience.",
"price": 399.99
},
"CineView 8K TV": {
"name": "CineView 8K TV",
"category": "Televisions and Home Theater Systems",
"brand": "CineView",
"model_number": "CV-8K65",
"warranty": "2 years",
"rating": 4.9,
"features": ["65-inch display", "8K resolution", "HDR", "Smart TV"],
"description": "Experience the future of television with this stunning 8K TV.",
"price": 2999.99
},
"SoundMax Soundbar": {
"name": "SoundMax Soundbar",
"category": "Televisions and Home Theater Systems",
"brand": "SoundMax",
"model_number": "SM-SB50",
"warranty": "1 year",
"rating": 4.3,
"features": ["2.1 channel", "300W output", "Wireless subwoofer", "Bluetooth"],
"description": "Upgrade your TV's audio with this sleek and powerful soundbar.",
"price": 199.99
},
"CineView OLED TV": {
"name": "CineView OLED TV",
"category": "Televisions and Home Theater Systems",
"brand": "CineView",
"model_number": "CV-OLED55",
"warranty": "2 years",
"rating": 4.7,
"features": ["55-inch display", "4K resolution", "HDR", "Smart TV"],
"description": "Experience true blacks and vibrant colors with this OLED TV.",
"price": 1499.99
},
"GameSphere X": {
"name": "GameSphere X",
"category": "Gaming Consoles and Accessories",
"brand": "GameSphere",
"model_number": "GS-X",
"warranty": "1 year",
"rating": 4.9,
"features": ["4K gaming", "1TB storage", "Backward compatibility", "Online multiplayer"],
"description": "A next-generation gaming console for the ultimate gaming experience.",
"price": 499.99
},
"ProGamer Controller": {
"name": "ProGamer Controller",
"category": "Gaming Consoles and Accessories",
"brand": "ProGamer",
"model_number": "PG-C100",
"warranty": "1 year",
"rating": 4.2,
"features": ["Ergonomic design", "Customizable buttons", "Wireless", "Rechargeable battery"],
"description": "A high-quality gaming controller for precision and comfort.",
"price": 59.99
},
"GameSphere Y": {
"name": "GameSphere Y",
"category": "Gaming Consoles and Accessories",
"brand": "GameSphere",
"model_number": "GS-Y",
"warranty": "1 year",
"rating": 4.8,
"features": ["4K gaming", "500GB storage", "Backward compatibility", "Online multiplayer"],
"description": "A compact gaming console with powerful performance.",
"price": 399.99
},
"ProGamer Racing Wheel": {
"name": "ProGamer Racing Wheel",
"category": "Gaming Consoles and Accessories",
"brand": "ProGamer",
"model_number": "PG-RW200",
"warranty": "1 year",
"rating": 4.5,
"features": ["Force feedback", "Adjustable pedals", "Paddle shifters", "Compatible with GameSphere X"],
"description": "Enhance your racing games with this realistic racing wheel.",
"price": 249.99
},
"GameSphere VR Headset": {
"name": "GameSphere VR Headset",
"category": "Gaming Consoles and Accessories",
"brand": "GameSphere",
"model_number": "GS-VR",
"warranty": "1 year",
"rating": 4.6,
"features": ["Immersive VR experience", "Built-in headphones", "Adjustable headband", "Compatible with GameSphere X"],
"description": "Step into the world of virtual reality with this comfortable VR headset.",
"price": 299.99
},
"AudioPhonic Noise-Canceling Headphones": {
"name": "AudioPhonic Noise-Canceling Headphones",
"category": "Audio Equipment",
"brand": "AudioPhonic",
"model_number": "AP-NC100",
"warranty": "1 year",
"rating": 4.6,
"features": ["Active noise-canceling", "Bluetooth", "20-hour battery life", "Comfortable fit"],
"description": "Experience immersive sound with these noise-canceling headphones.",
"price": 199.99
},
"WaveSound Bluetooth Speaker": {
"name": "WaveSound Bluetooth Speaker",
"category": "Audio Equipment",
"brand": "WaveSound",
"model_number": "WS-BS50",
"warranty": "1 year",
"rating": 4.5,
"features": ["Portable", "10-hour battery life", "Water-resistant", "Built-in microphone"],
"description": "A compact and versatile Bluetooth speaker for music on the go.",
"price": 49.99
},
"AudioPhonic True Wireless Earbuds": {
"name": "AudioPhonic True Wireless Earbuds",
"category": "Audio Equipment",
"brand": "AudioPhonic",
"model_number": "AP-TW20",
"warranty": "1 year",
"rating": 4.4,
"features": ["True wireless", "Bluetooth 5.0", "Touch controls", "18-hour battery life"],
"description": "Enjoy music without wires with these comfortable true wireless earbuds.",
"price": 79.99
},
"WaveSound Soundbar": {
"name": "WaveSound Soundbar",
"category": "Audio Equipment",
"brand": "WaveSound",
"model_number": "WS-SB40",
"warranty": "1 year",
"rating": 4.3,
"features": ["2.0 channel", "80W output", "Bluetooth", "Wall-mountable"],
"description": "Upgrade your TV's audio with this slim and powerful soundbar.",
"price": 99.99
},
"AudioPhonic Turntable": {
"name": "AudioPhonic Turntable",
"category": "Audio Equipment",
"brand": "AudioPhonic",
"model_number": "AP-TT10",
"warranty": "1 year",
"rating": 4.2,
"features": ["3-speed", "Built-in speakers", "Bluetooth", "USB recording"],
"description": "Rediscover your vinyl collection with this modern turntable.",
"price": 149.99
},
"FotoSnap DSLR Camera": {
"name": "FotoSnap DSLR Camera",
"category": "Cameras and Camcorders",
"brand": "FotoSnap",
"model_number": "FS-DSLR200",
"warranty": "1 year",
"rating": 4.7,
"features": ["24.2MP sensor", "1080p video", "3-inch LCD", "Interchangeable lenses"],
"description": "Capture stunning photos and videos with this versatile DSLR camera.",
"price": 599.99
},
"ActionCam 4K": {
"name": "ActionCam 4K",
"category": "Cameras and Camcorders",
"brand": "ActionCam",
"model_number": "AC-4K",
"warranty": "1 year",
"rating": 4.4,
"features": ["4K video", "Waterproof", "Image stabilization", "Wi-Fi"],
"description": "Record your adventures with this rugged and compact 4K action camera.",
"price": 299.99
},
"FotoSnap Mirrorless Camera": {
"name": "FotoSnap Mirrorless Camera",
"category": "Cameras and Camcorders",
"brand": "FotoSnap",
"model_number": "FS-ML100",
"warranty": "1 year",
"rating": 4.6,
"features": ["20.1MP sensor", "4K video", "3-inch touchscreen", "Interchangeable lenses"],
"description": "A compact and lightweight mirrorless camera with advanced features.",
"price": 799.99
},
"ZoomMaster Camcorder": {
"name": "ZoomMaster Camcorder",
"category": "Cameras and Camcorders",
"brand": "ZoomMaster",
"model_number": "ZM-CM50",
"warranty": "1 year",
"rating": 4.3,
"features": ["1080p video", "30x optical zoom", "3-inch LCD", "Image stabilization"],
"description": "Capture life's moments with this easy-to-use camcorder.",
"price": 249.99
},
"FotoSnap Instant Camera": {
"name": "FotoSnap Instant Camera",
"category": "Cameras and Camcorders",
"brand": "FotoSnap",
"model_number": "FS-IC10",
"warranty": "1 year",
"rating": 4.1,
"features": ["Instant prints", "Built-in flash", "Selfie mirror", "Battery-powered"],
"description": "Create instant memories with this fun and portable instant camera.",
"price": 69.99
}
}
接着,我们写几个help func来帮我们通过产品类别和产品名找到它们
def get_product_by_name(name):
return products.get(name, None)
def get_products_by_category(category):
return [product for product in products.values() if product["category"] == category]
通过产品名找产品细节
print(get_product_by_name("TechPro Ultrabook"))
"""
{'name': 'TechPro Ultrabook', 'category': 'Computers and Laptops', 'brand': 'TechPro', 'model_number': 'TP-UB100', 'warranty': '1 year', 'rating': 4.5, 'features': ['13.3-inch display', '8GB RAM', '256GB SSD', 'Intel Core i5 processor'], 'description': 'A sleek and lightweight ultrabook for everyday use.', 'price': 799.99}
"""
通过产品类别,找该类别下的产品
print(get_products_by_category("Computers and Laptops"))
"""
[{'name': 'TechPro Ultrabook', 'category': 'Computers and Laptops', 'brand': 'TechPro', 'model_number': 'TP-UB100', 'warranty': '1 year', 'rating': 4.5, 'features': ['13.3-inch display', '8GB RAM', '256GB SSD', 'Intel Core i5 processor'], 'description': 'A sleek and lightweight ultrabook for everyday use.', 'price': 799.99}, {'name': 'BlueWave Gaming Laptop', 'category': 'Computers and Laptops', 'brand': 'BlueWave', 'model_number': 'BW-GL200', 'warranty': '2 years', 'rating': 4.7, 'features': ['15.6-inch display', '16GB RAM', '512GB SSD', 'NVIDIA GeForce RTX 3060'], 'description': 'A high-performance gaming laptop for an immersive experience.', 'price': 1199.99}, {'name': 'PowerLite Convertible', 'category': 'Computers and Laptops', 'brand': 'PowerLite', 'model_number': 'PL-CV300', 'warranty': '1 year', 'rating': 4.3, 'features': ['14-inch touchscreen', '8GB RAM', '256GB SSD', '360-degree hinge'], 'description': 'A versatile convertible laptop with a responsive touchscreen.', 'price': 699.99}, {'name': 'TechPro Desktop', 'category': 'Computers and Laptops', 'brand': 'TechPro', 'model_number': 'TP-DT500', 'warranty': '1 year', 'rating': 4.4, 'features': ['Intel Core i7 processor', '16GB RAM', '1TB HDD', 'NVIDIA GeForce GTX 1660'], 'description': 'A powerful desktop computer for work and play.', 'price': 999.99}, {'name': 'BlueWave Chromebook', 'category': 'Computers and Laptops', 'brand': 'BlueWave', 'model_number': 'BW-CB100', 'warranty': '1 year', 'rating': 4.1, 'features': ['11.6-inch display', '4GB RAM', '32GB eMMC', 'Chrome OS'], 'description': 'A compact and affordable Chromebook for everyday tasks.', 'price': 249.99}]
"""
我们可以把上一个状态分类的输出例子,用它来测试一下
user_message_1 = """
tell me about the smartx pro phone and the fotosnap camera, the dslr one. Also tell me about your tvs
"""
category_and_product_response_1 = """
[
{'category': 'Smartphones and Accessories', 'products': ['SmartX ProPhone']},
{'category': 'Cameras and Camcorders', 'products': ['FotoSnap DSLR Camera']},
{'category': 'Televisions and Home Theater Systems'}
]
"""
我们首先要把chatGPT返回的字符串,解析成一个python object,方便我们获取里面的类别和产品名,函数如下
import json
def read_string_to_list(input_string):
if input_string is None:
return None
try:
input_string = input_string.replace("'", "\"") # Replace single quotes with double quotes for valid JSON
data = json.loads(input_string)
return data
except json.JSONDecodeError:
print("Error: Invalid JSON string")
return None
category_and_product_list = read_string_to_list(category_and_product_response_1)
print(category_and_product_list)
"""
[{'category': 'Smartphones and Accessories', 'products': ['SmartX ProPhone']}, {'category': 'Cameras and Camcorders', 'products': ['FotoSnap DSLR Camera']}, {'category': 'Televisions and Home Theater Systems'}]
"""
接着,为了扩大召回,我们设置一个策略,将相关的所有产品和类别都找出来,并把它转换成一个字符串,方便输入到回复prompt模板中
def generate_output_string(data_list):
output_string = ""
if data_list is None:
return output_string
for data in data_list:
try:
if "products" in data:
products_list = data["products"]
for product_name in products_list:
product = get_product_by_name(product_name)
if product:
output_string += json.dumps(product, indent=4) + "\n"
else:
print(f"Error: Product '{product_name}' not found")
elif "category" in data:
category_name = data["category"]
category_products = get_products_by_category(category_name)
for product in category_products:
output_string += json.dumps(product, indent=4) + "\n"
else:
print("Error: Invalid object format")
except Exception as e:
print(f"Error: {e}")
return output_string
product_information_for_user_message_1 = generate_output_string(category_and_product_list)
print(product_information_for_user_message_1)
"""
{
"name": "SmartX ProPhone",
"category": "Smartphones and Accessories",
"brand": "SmartX",
"model_number": "SX-PP10",
"warranty": "1 year",
"rating": 4.6,
"features": [
"6.1-inch display",
"128GB storage",
"12MP dual camera",
"5G"
],
"description": "A powerful smartphone with advanced camera features.",
"price": 899.99
}
{
"name": "FotoSnap DSLR Camera",
"category": "Cameras and Camcorders",
"brand": "FotoSnap",
"model_number": "FS-DSLR200",
"warranty": "1 year",
"rating": 4.7,
"features": [
"24.2MP sensor",
"1080p video",
"3-inch LCD",
"Interchangeable lenses"
],
"description": "Capture stunning photos and videos with this versatile DSLR camera.",
"price": 599.99
}
{
"name": "CineView 4K TV",
"category": "Televisions and Home Theater Systems",
"brand": "CineView",
"model_number": "CV-4K55",
"warranty": "2 years",
"rating": 4.8,
"features": [
"55-inch display",
"4K resolution",
"HDR",
"Smart TV"
],
"description": "A stunning 4K TV with vibrant colors and smart features.",
"price": 599.99
}
{
"name": "SoundMax Home Theater",
"category": "Televisions and Home Theater Systems",
"brand": "SoundMax",
"model_number": "SM-HT100",
"warranty": "1 year",
"rating": 4.4,
"features": [
"5.1 channel",
"1000W output",
"Wireless subwoofer",
"Bluetooth"
],
"description": "A powerful home theater system for an immersive audio experience.",
"price": 399.99
}
{
"name": "CineView 8K TV",
"category": "Televisions and Home Theater Systems",
"brand": "CineView",
"model_number": "CV-8K65",
"warranty": "2 years",
"rating": 4.9,
"features": [
"65-inch display",
"8K resolution",
"HDR",
"Smart TV"
],
"description": "Experience the future of television with this stunning 8K TV.",
"price": 2999.99
}
{
"name": "SoundMax Soundbar",
"category": "Televisions and Home Theater Systems",
"brand": "SoundMax",
"model_number": "SM-SB50",
"warranty": "1 year",
"rating": 4.3,
"features": [
"2.1 channel",
"300W output",
"Wireless subwoofer",
"Bluetooth"
],
"description": "Upgrade your TV's audio with this sleek and powerful soundbar.",
"price": 199.99
}
{
"name": "CineView OLED TV",
"category": "Televisions and Home Theater Systems",
"brand": "CineView",
"model_number": "CV-OLED55",
"warranty": "2 years",
"rating": 4.7,
"features": [
"55-inch display",
"4K resolution",
"HDR",
"Smart TV"
],
"description": "Experience true blacks and vibrant colors with this OLED TV.",
"price": 1499.99
}
"""
最终,我们写一个回复模板prompt,并把上面检索到的字符串拼接到里面,给到chatGPT,给我们生成最终的回复。
system_message = f"""
You are a customer service assistant for a \
large electronic store. \
Respond in a friendly and helpful tone, \
with very concise answers. \
Make sure to ask the user relevant follow up questions.
"""
user_message_1 = f"""
tell me about the smartx pro phone and \
the fotosnap camera, the dslr one. \
Also tell me about your tvs"""
messages = [
{'role':'system',
'content': system_message},
{'role':'user',
'content': user_message_1},
{'role':'assistant',
'content': f"""Relevant product information:\n\
{product_information_for_user_message_1}"""},
]
final_response = get_completion_from_messages(messages)
print(final_response)
"""
The SmartX ProPhone has a 6.1-inch display, 128GB storage, 12MP dual camera, and 5G. The FotoSnap DSLR Camera has a 24.2MP sensor, 1080p video, 3-inch LCD, and interchangeable lenses. We have a variety of TVs, including the CineView 4K TV with a 55-inch display, 4K resolution, HDR, and smart TV features. We also have the SoundMax Home Theater system with 5.1 channel, 1000W output, wireless subwoofer, and Bluetooth. Do you have any specific questions about these products or any other products we offer?
"""
# 检查输出 (Check Outputs)
在展示给用户之前,对系统生成的输出进行质量、相关性以及安全性检查是重要的。我们可以
1. 使用moderation API对输出进行过滤和调整,这对于针对敏感群体创建的聊天机器人尤其有用。
1. 通过让模型自评其输出的质量,对输出进行检查。但这可能增加系统的延迟和成本,所以一般情况下不推荐这样做。
| 方法 | 优点 | 缺点 |
| --- | --- | --- |
| 使用 moderation API 检查输出 | 1. 可以有效地过滤和调整系统生成的输出,尤其适用于敏感群体的聊天机器人。 | 1. 虽然这种方法可以检查输出的质量,但它不能完全避免模型生成不适当或不准确的输出。 |
| 让模型自评其输出 | 1. 模型能定制化评估策略 | 1. 这种方法可能增加系统的延迟和成本,因为需要等待模型的额外调用。 |
## 使用Moderation API
让我们来看具体例子
final_response_to_customer = f"""
The SmartX ProPhone has a 6.1-inch display, 128GB storage, \
12MP dual camera, and 5G. The FotoSnap DSLR Camera \
has a 24.2MP sensor, 1080p video, 3-inch LCD, and \
interchangeable lenses. We have a variety of TVs, including \
the CineView 4K TV with a 55-inch display, 4K resolution, \
HDR, and smart TV features. We also have the SoundMax \
Home Theater system with 5.1 channel, 1000W output, wireless \
subwoofer, and Bluetooth. Do you have any specific questions \
about these products or any other products we offer?
"""
response = openai.Moderation.create(
input=final_response_to_customer
)
moderation_output = response["results"][0]
print(moderation_output)
"""
{
"categories": {
"hate": false,
"hate/threatening": false,
"self-harm": false,
"sexual": false,
"sexual/minors": false,
"violence": false,
"violence/graphic": false
},
"category_scores": {
"hate": 4.3336598e-07,
"hate/threatening": 5.640859e-10,
"self-harm": 2.8951988e-10,
"sexual": 2.2455274e-06,
"sexual/minors": 1.2425416e-08,
"violence": 6.0600332e-06,
"violence/graphic": 4.518234e-07
},
"flagged": false
}
"""
比如之前的回复,我们可以丢给Moderation api,得到一系列的分类和分数。从这个反馈中,我们可以看到,这个输出并未被标记,而且在所有类别中得分都很低,这是合理的。
对输出进行检查也是非常重要的,比如如果你正在为敏感人群创建一个聊天机器人,你可以为标记输出使用较低的阈值。如果moderation的输出表明内容被标记,你可以采取适当的行动,如回复一个完整的备选答案,或生成一个新的回应。值得注意的是,随着我们对模型的改进,它们生成有害输出的可能性也越来越小。
## 让模型自定义分类
检查输出的另一种方法是询问模型本身生成的是否满意,是否遵循你定义的某种规则。你可以通过将生成的输出作为模型输入的一部分,并要求它评估输出的质量。可以以各种不同的方式进行这个操作。例如,你可以设计一种审查规则,比如考试或评分论文的标准,并说这是否遵循了我们品牌的友好调性,你可以概述一些你的品牌指南,如果这对你来说非常重要的话。
system_message = f"""
You are an assistant that evaluates whether \
customer service agent responses sufficiently \
answer customer questions, and also validates that \
all the facts the assistant cites from the product \
information are correct.
The product information and user and customer \
service agent messages will be delimited by \
3 backticks, i.e. ```.
Respond with a Y or N character, with no punctuation:
Y - if the output sufficiently answers the question
AND the response correctly uses product information
N - otherwise
Output a single letter only.
"""
customer_message = f"""
tell me about the smartx pro phone and
the fotosnap camera, the dslr one.
Also tell me about your tvs"""
product_information = """{ "name": "SmartX ProPhone", "category": "Smartphones and Accessories", "brand": "SmartX", "model_number": "SX-PP10", "warranty": "1 year", "rating": 4.6, "features": [ "6.1-inch display", "128GB storage", "12MP dual camera", "5G" ], "description": "A powerful smartphone with advanced camera features.", "price": 899.99 } { "name": "FotoSnap DSLR Camera", "category": "Cameras and Camcorders", "brand": "FotoSnap", "model_number": "FS-DSLR200", "warranty": "1 year", "rating": 4.7, "features": [ "24.2MP sensor", "1080p video", "3-inch LCD", "Interchangeable lenses" ], "description": "Capture stunning photos and videos with this versatile DSLR camera.", "price": 599.99 } { "name": "CineView 4K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-4K55", "warranty": "2 years", "rating": 4.8, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "A stunning 4K TV with vibrant colors and smart features.", "price": 599.99 } { "name": "SoundMax Home Theater", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-HT100", "warranty": "1 year", "rating": 4.4, "features": [ "5.1 channel", "1000W output", "Wireless subwoofer", "Bluetooth" ], "description": "A powerful home theater system for an immersive audio experience.", "price": 399.99 } { "name": "CineView 8K TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-8K65", "warranty": "2 years", "rating": 4.9, "features": [ "65-inch display", "8K resolution", "HDR", "Smart TV" ], "description": "Experience the future of television with this stunning 8K TV.", "price": 2999.99 } { "name": "SoundMax Soundbar", "category": "Televisions and Home Theater Systems", "brand": "SoundMax", "model_number": "SM-SB50", "warranty": "1 year", "rating": 4.3, "features": [ "2.1 channel", "300W output", "Wireless subwoofer", "Bluetooth" ], "description": "Upgrade your TV's audio with this sleek and powerful soundbar.", "price": 199.99 } { "name": "CineView OLED TV", "category": "Televisions and Home Theater Systems", "brand": "CineView", "model_number": "CV-OLED55", "warranty": "2 years", "rating": 4.7, "features": [ "55-inch display", "4K resolution", "HDR", "Smart TV" ], "description": "Experience true blacks and vibrant colors with this OLED TV.", "price": 1499.99 }"""
q_a_pair = f"""
Customer message: {customer_message}
Product information: {product_information}
Agent response: {final_response_to_customer}
Does the response use the retrieved information correctly?
Does the response sufficiently answer the question
Output Y or N
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': q_a_pair}
]
response = get_completion_from_messages(messages, max_tokens=1)
print(response)
"""
Y
"""
another_response = "life is like a box of chocolates"
q_a_pair = f"""
Customer message: {customer_message}
Product information: {product_information}
Agent response: {another_response}
Does the response use the retrieved information correctly?
Does the response sufficiently answer the question?
Output Y or N
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': q_a_pair}
]
response = get_completion_from_messages(messages)
print(response)
"""
N
"""
这个评估prompt我们也可以拆解分析一下
# 评估 (Evaluation)
官方视频把这部分分成了三段来讲解。我们也统一分成三小节
1. 如何将之前写的prompt组装到一起拼成完整的系统
1. 当只有唯一正确答案时如何评估?
1. 当正确答案不唯一时要如何评估?
## 拼成完整系统
现在,我们将结合之前视频中学到的所有知识,来创建一个客服助理,我们将遵循以下步骤:
1. 首先,我们将检查输入,看是否会触发审核API。
1. 如果没有触发,我们将提取产品列表。
1. 如果找到产品,我们将尝试查找它们。
1. 接下来,我们会用模型回答用户的问题。
1. 最后,我们会把回答通过审核API,如果没有被标记,我们会将其返回给用户。
假设我们把之前写的函数都放到了utils.py里,这里我们import它,下面是它的简单pipeline
def process_user_message(user_input, all_messages, debug=True):
delimiter = "```"
# Step 1: Check input to see if it flags the Moderation API or is a prompt injection
response = openai.Moderation.create(input=user_input)
moderation_output = response["results"][0]
if moderation_output["flagged"]:
print("Step 1: Input flagged by Moderation API.")
return "Sorry, we cannot process this request."
if debug: print("Step 1: Input passed moderation check.")
category_and_product_response = utils.find_category_and_product_only(user_input, utils.get_products_and_category())
#print(print(category_and_product_response)
# Step 2: Extract the list of products
category_and_product_list = utils.read_string_to_list(category_and_product_response)
#print(category_and_product_list)
if debug: print("Step 2: Extracted list of products.")
# Step 3: If products are found, look them up
product_information = utils.generate_output_string(category_and_product_list)
if debug: print("Step 3: Looked up product information.")
# Step 4: Answer the user question
system_message = f"""
You are a customer service assistant for a large electronic store. \
Respond in a friendly and helpful tone, with concise answers. \
Make sure to ask the user relevant follow-up questions.
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': f"{delimiter}{user_input}{delimiter}"},
{'role': 'assistant', 'content': f"Relevant product information:\n{product_information}"}
]
final_response = get_completion_from_messages(all_messages + messages)
if debug:print("Step 4: Generated response to user question.")
all_messages = all_messages + messages[1:]
# Step 5: Put the answer through the Moderation API
response = openai.Moderation.create(input=final_response)
moderation_output = response["results"][0]
if moderation_output["flagged"]:
if debug: print("Step 5: Response flagged by Moderation API.")
return "Sorry, we cannot provide this information."
if debug: print("Step 5: Response passed moderation check.")
# Step 6: Ask the model if the response answers the initial user query well
user_message = f"""
Customer message: {delimiter}{user_input}{delimiter}
Agent response: {delimiter}{final_response}{delimiter}
Does the response sufficiently answer the question?
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': user_message}
]
evaluation_response = get_completion_from_messages(messages)
if debug: print("Step 6: Model evaluated the response.")
# Step 7: If yes, use this answer; if not, say that you will connect the user to a human
if "Y" in evaluation_response: # Using "in" instead of "==" to be safer for model output variation (e.g., "Y." or "Yes")
if debug: print("Step 7: Model approved the response.")
return final_response, all_messages
else:
if debug: print("Step 7: Model disapproved the response.")
neg_str = "I'm unable to provide the information you're looking for. I'll connect you with a human representative for further assistance."
return neg_str, all_messages
<https://github.com/OpenDocCN/dsai-notes-pt4-zh/raw/master/docs/ai-dongchazhe-zhongxin/img/b5ccf553e147915b82c1f6ba53bd120c.png>
以上代码的时序图为(GPT-4)生成的
代码中有一个细节,就是上下文记忆的管理
对于分类和审核过程,都没有用到上下文记忆,只有在生成回复模板过程中,会把之前的上下文记忆,与当前带有system prompt的messageslist拼接
# Step 4: Answer the user question
system_message = f"""
You are a customer service assistant for a large electronic store. \
Respond in a friendly and helpful tone, with concise answers. \
Make sure to ask the user relevant follow-up questions.
"""
messages = [
{'role': 'system', 'content': system_message},
{'role': 'user', 'content': f"{delimiter}{user_input}{delimiter}"},
{'role': 'assistant', 'content': f"Relevant product information:\n{product_information}"}
]
final_response = get_completion_from_messages(all_messages + messages)
if debug:print("Step 4: Generated response to user question.")
all_messages = all_messages + messages[1:]
而上下文all_messages,下一轮的更新,会把system 角色的prompt给去掉,再和上一次对话进行拼接
all_messages = all_messages + messages[1:]
打印出来看效果,是ok的
user_input = "tell me about the smartx pro phone and the fotosnap camera, the dslr one. Also what tell me about your tvs"
response,_ = process_user_message(user_input,[])
print(response)
"""
Step 1: Input passed moderation check.
Step 2: Extracted list of products.
Step 3: Looked up product information.
Step 4: Generated response to user question.
Step 5: Response passed moderation check.
Step 6: Model evaluated the response.
Step 7: Model approved the response.
The SmartX ProPhone is a powerful smartphone with a 6.1-inch display, 128GB storage, 12MP dual camera, and 5G capabilities. The FotoSnap DSLR Camera is a versatile camera with a 24.2MP sensor, 1080p video, 3-inch LCD, and interchangeable lenses. As for our TVs, we have a range of options including the CineView 4K TV with a 55-inch display, 4K resolution, HDR, and smart TV capabilities, the CineView 8K TV with a 65-inch display, 8K resolution, HDR, and smart TV capabilities, and the CineView OLED TV with a 55-inch display, 4K resolution, HDR, and smart TV capabilities. Do you have any specific questions about these products or would you like me to recommend a product based on your needs?
"""
这样我们能一次性端对端地从用户输入到客服助理输出
当然,我们可以稍微包装一下,给它加上一个GUI,变成更用户友好的DEMO,详细文档见:
[GitHub - holoviz/panel: A high-level ](https://github.com/holoviz/panel)app and dashboarding solution for Python
导入panel包
import panel as pn # GUI
pn.extension()
设置用户输入,让panel每次输出
def collect_messages(debug=False):
user_input = inp.value_input
if debug: print(f"User Input = {user_input}")
if user_input == "":
return
inp.value = ''
global context
#response, context = process_user_message(user_input, context, utils.get_products_and_category(),debug=True)
response, context = process_user_message(user_input, context, debug=False)
context.append({'role':'assistant', 'content':f"{response}"})
panels.append(
pn.Row('User:', pn.pane.Markdown(user_input, width=600)))
panels.append(
pn.Row('Assistant:', pn.pane.Markdown(response, width=600, style={'background-color': '#F6F6F6'})))
return pn.Column(*panels)
下面是完整的DEMO演示效果
<https://github.com/OpenDocCN/dsai-notes-pt4-zh/raw/master/docs/ai-dongchazhe-zhongxin/img/44d39a2322ad2883c911f65f26c969ad.png>
你发现它具备了以下能力
1. 事实相关
1. 防指令注入
1. 有一定的上下文记忆能力,能理解对话中途的指代问题
## 新范式评估方法
接下来我们要评估我们构建好的完整系统。
<https://github.com/OpenDocCN/dsai-notes-pt4-zh/raw/master/docs/ai-dongchazhe-zhongxin/img/9603318a4743e592f59261698c77e1af.png>
在你构建了这样的系统之后,你如何知道它的工作情况呢?或许,当你将它部署并让用户使用时,你又如何跟踪其表现,发现任何缺陷并继续提高系统回答的质量呢?这里会分享一些评估大型语言模型输出的最佳实践。
构建这样的系统的感觉是独特的。与之前吴恩达讲解的传统的机器学习监督学习应用中看到的有一个关键的区别。
在传统的监督学习方法中,如果你需要收集10000个标记的例子,那么收集另外1000个测试例子的增量成本并不那么高。所以,在传统的监督学习设置中,收集训练集,收集开发集或者保留训练验证集和测试集,然后在整个开发过程中手头有这些,这并不罕见。
但是,如果你能够在几分钟内指定一个提示,并在几个小时内使其工作,你却必须花很长时间来收集1000个测试样例,这似乎是一个巨大的麻烦。因为你不能在零训练样例的情况下让这个工作。因此,当你使用LLM构建一个应用程序时,就会有极大的不协调感。所以你需要在只有少量的例子上调整提示,可能是1到3到5个例子,尝试获得在它们整体跑起来的效果。
随着你的系统进行额外的测试,你偶尔会遇到一些棘手的例子,提示在它们上不起作用,或者处理不了它们。在这种情况下,你可以将这些额外的一两个或三到五个例子加入到你正在测试的集合中,这样就可以随机添加更多的棘手例子。最后,你会有足够的例子添加到你的开发集中,这会让手动运行每个例子通过提示变得有些麻烦,每次你改变提示时,你就开始开发度量方法来衡量这小部分例子的表现,如平均准确率。
如果说你的系统正确率已经达到了91%,但你希望调优至92%或者93%,那么你确实需要更大的样本集来衡量91%和93%性能之间的差异。只有当你真的需要一个公正无偏的评估系统表现的估计时,你才需要超越开发集,去收集一个保留的测试集。
有一个重要的注意事项,许多应用中,大型语言模型给出的答案可能并不完全准确,但如果这并没有造成实质性的风险或伤害,那就没有太大问题。但显然,对于任何高风险的应用,如果存在偏见或不适当的输出可能对某人造成伤害,那么收集测试集以严格评估你的系统性能,确保它在你使用前已经做好正确的事情,这就变得更为重要。
但是如果,例如,你只是为了自己阅读,使用它来总结文章,且无他人受到影响,那么可能造成的伤害就更小,你可以在这个过程的早期停止,而不需要去支付收集更大数据集的费用。
| 评估方式 | 优点 | 缺点 |
| --- | --- | --- |
| 传统监督学习方法 | 1. 数据标签清晰,易于理解和应用 | 1. 需要大量标注的数据,数据收集和处理成本高 |
| 使用大型语言模型(LLM)的评估方式 | 1. 可以在少量甚至无监督数据的情况下开始工作 | 1. 评估标准可能因应用而异,难以量化 |
LLM-based评估流程
总结如下
1. 传统监督学习方法中,收集训练集、开发集和测试集是常见的。但到了几分钟到几小时就能应用的LLM中,代价很大。
1. LLM的系统效果评估和改进质量的最佳实践:不从测试集开始,而是逐渐构建一组测试样例。首先会在几个示例上调整提示,然后在测试过程中随机添加更多的棘手例子。
1. 可以开发自动化度量方法来衡量这一小部分例子的表现,如平均准确率。
1. 当系统达到足够的效果时,可以停止不再继续。但如果需要更高的信心,就需要进一步收集随机样例或创建保留测试集。
接下来我们正式进入客服系统的评估
我们的系统评估分成了两个阶段
1. 答案唯一正确的分类问题
1. 答案不唯一正确的生成问题
## 答案唯一正确时评估
在这个例子中,我们首先从常见的功能开始,使用一个函数来获取产品和分类的列表。
def find_category_and_product_v1(user_input,products_and_category):
delimiter = "####"
system_message = f"""
You will be provided with customer service queries. \
The customer service query will be delimited with {delimiter} characters.
Output a python list of json objects, where each object has the following format:
'category': <one of Computers and Laptops, Smartphones and Accessories, Televisions and Home Theater Systems, \
Gaming Consoles and Accessories, Audio Equipment, Cameras and Camcorders>,
AND
'products': <a list of products that must be found in the allowed products below>
Where the categories and products must be found in the customer service query.
If a product is mentioned, it must be associated with the correct category in the allowed products list below.
If no products or categories are found, output an empty list.
List out all products that are relevant to the customer service query based on how closely it relates
to the product name and product category.
Do not assume, from the name of the product, any features or attributes such as relative quality or price.
The allowed products are provided in JSON format.
The keys of each item represent the category.
The values of each item is a list of products that are within that category.
Allowed products: {products_and_category}
"""
few_shot_user_1 = """I want the most expensive computer."""
few_shot_assistant_1 = """
[{'category': 'Computers and Laptops', \
'products': ['TechPro Ultrabook', 'BlueWave Gaming Laptop', 'PowerLite Convertible', 'TechPro Desktop', 'BlueWave Chromebook']}]
"""
messages = [
{'role':'system', 'content': system_message},
{'role':'user', 'content': f"{delimiter}{few_shot_user_1}{delimiter}"},
{'role':'assistant', 'content': few_shot_assistant_1 },
{'role':'user', 'content': f"{delimiter}{user_input}{delimiter}"},
]
return get_completion_from_messages(messages)
上面代码使用了少样本提示,即使用user和assistant的角色消息构造出一个好的输出示例。
比如用户问「我想要最贵的电脑」,我们返回电脑类别和所有相关的产品名字信息。针对这类问题,我们可以用另一个提示来评估它,比如「我想买一台电视,我预算有限。」看看是否返回电视类别和所有相关的产品名。
这里有三个提示,如果你是第一次开发这个提示,有一两个或者三个这样的示例是非常合理的。并且你可以不断调整提示,直到它给出适当的输出,直到提示能检索到所有相关的产品和类别,以满足客户的请求。
customer_msg_0 = f"""Which TV can I buy if I'm on a budget?"""
products_by_category_0 = find_category_and_product_v1(customer_msg_0,
products_and_category)
print(products_by_category_0)
"""
[{'category': 'Televisions and Home Theater Systems', 'products': ['CineView 4K TV', 'SoundMax Home Theater', 'CineView 8K TV', 'SoundMax Soundbar', 'CineView OLED TV']}]
"""
**测试用例一:通过**
customer_msg_1 = f"""I need a charger for my smartphone"""
products_by_category_1 = find_category_and_product_v1(customer_msg_1,
products_and_category)
print(products_by_category_1)
"""
[{'category': 'Smartphones and Accessories', 'products': ['MobiTech PowerCase', 'MobiTech Wireless Charger', 'SmartX EarBuds']}]
"""
**测试用例二:通过**
customer_msg_2 = f"""
What computers do you have?"""
products_by_category_2 = find_category_and_product_v1(customer_msg_2,
products_and_category)
products_by_category_2
"""
" [{'category': 'Computers and Laptops', 'products': ['TechPro Ultrabook', 'BlueWave Gaming Laptop', 'PowerLite Convertible', 'TechPro Desktop', 'BlueWave Chromebook']}]"
"""
**测试用例三:通过**
customer_msg_3 = f"""
tell me about the smartx pro phone and the fotosnap camera, the dslr one.
Also, what TVs do you have?"""
products_by_category_3 = find_category_and_product_v1(customer_msg_3,
products_and_category)
print(products_by_category_3)
"""
[{'category': 'Smartphones and Accessories', 'products': ['SmartX ProPhone']},
{'category': 'Cameras and Camcorders', 'products': ['FotoSnap DSLR Camera']},
{'category': 'Televisions and Home Theater Systems', 'products': ['CineView 4K TV', 'SoundMax Home Theater', 'CineView 8K TV', 'SoundMax Soundbar', 'CineView OLED TV']}]
Note: The query mentions "smartx pro phone" and "fotosnap camera, the dslr one", so the output includes the relevant categories and products. The query also asks about TVs, so the relevant category is included in the output.
"""
测试用例二三你可以看到,它回复的不是纯python list,有时会加了"和不必要的空格,有时会加上一些解释的文本。这需要你做一下较鲁棒的后处理。比如用正则查找[]内的结构体,过滤掉前后无关的字符等。
当三个测试用例通过后,我们就可以测试更难的用例。
**测试用例四:未通过**
customer_msg_4 = f"""
tell me about the CineView TV, the 8K one, Gamesphere console, the X one.
I'm on a budget, what computers do you have?"""
products_by_category_4 = find_category_and_product_v1(customer_msg_4,
products_and_category)
print(products_by_category_4)
"""
[{'category': 'Televisions and Home Theater Systems', 'products': ['CineView 8K TV']},
{'category': 'Gaming Consoles and Accessories', 'products': ['GameSphere X']},
{'category': 'Computers and Laptops', 'products': ['BlueWave Chromebook']}]
Note: The CineView TV mentioned is the 8K one, and the Gamesphere console mentioned is the X one.
For the computer category, since the customer mentioned being on a budget, we cannot determine which specific product to recommend.
Therefore, we have included all the products in the Computers and Laptops category in the output.
"""
这个case出问题在两个地方
1. 输出的类别错了,因为用户给定了一些产品信息,是关于电视的,但问题问的却是关于电脑的,这个在之前的测试用例中未见到,LLM错误地把它放到了返回类别里,而不是根据用户问到的意图
1. 回复了无关的解释文本
针对这两点,我们可以这样改进我们的prompt
1. 加入返回格式约束,让它不要输出额外的文本
1. 加入更多示例样本,提示模型更清晰当前要做的任务是什么。
def find_category_and_product_v2(user_input,products_and_category):
"""
Added: Do not output any additional text that is not in JSON format.
Added a second example (for few-shot prompting) where user asks for
the cheapest computer. In both few-shot examples, the shown response
is the full list of products in JSON only.
"""
delimiter = "####"
system_message = f"""
You will be provided with customer service queries. \
The customer service query will be delimited with {delimiter} characters.
Output a python list of json objects, where each object has the following format:
'category': <one of Computers and Laptops, Smartphones and Accessories, Televisions and Home Theater Systems, \
Gaming Consoles and Accessories, Audio Equipment, Cameras and Camcorders>,
AND
'products': <a list of products that must be found in the allowed products below>
Do not output any additional text that is not in JSON format.
Do not write any explanatory text after outputting the requested JSON.
Where the categories and products must be found in the customer service query.
If a product is mentioned, it must be associated with the correct category in the allowed products list below.
If no products or categories are found, output an empty list.
List out all products that are relevant to the customer service query based on how closely it relates
to the product name and product category.
Do not assume, from the name of the product, any features or attributes such as relative quality or price.
The allowed products are provided in JSON format.
The keys of each item represent the category.
The values of each item is a list of products that are within that category.
Allowed products: {products_and_category}
"""
few_shot_user_1 = """I want the most expensive computer. What do you recommend?"""
few_shot_assistant_1 = """
[{'category': 'Computers and Laptops', \
'products': ['TechPro Ultrabook', 'BlueWave Gaming Laptop', 'PowerLite Convertible', 'TechPro Desktop', 'BlueWave Chromebook']}]
"""
few_shot_user_2 = """I want the most cheapest computer. What do you recommend?"""
few_shot_assistant_2 = """
[{'category': 'Computers and Laptops', \
'products': ['TechPro Ultrabook', 'BlueWave Gaming Laptop', 'PowerLite Convertible', 'TechPro Desktop', 'BlueWave Chromebook']}]
"""
messages = [
{'role':'system', 'content': system_message},
{'role':'user', 'content': f"{delimiter}{few_shot_user_1}{delimiter}"},
{'role':'assistant', 'content': few_shot_assistant_1 },
{'role':'user', 'content': f"{delimiter}{few_shot_user_2}{delimiter}"},
{'role':'assistant', 'content': few_shot_assistant_2 },
{'role':'user', 'content': f"{delimiter}{user_input}{delimiter}"},
]
return get_completion_from_messages(messages)
我在改进后的prompt上重测了一下这个case,发现还没解决。也有可能是我对改进prompt的部分理解有误
| 改进前回复 | 改进后回复 |
| --- | --- |
| ```[{&#x27;category&#x27;: &#x27;Televisions and Home Theater Systems&#x27;, &#x27;products&#x27;: [&#x27;CineView 8K TV&#x27;]}, {&#x27;category&#x27;: &#x27;Gaming Consoles and Accessories&#x27;, &#x27;products&#x27;: [&#x27;GameSphere X&#x27;]}, {&#x27;category&#x27;: &#x27;Computers and Laptops&#x27;, &#x27;products&#x27;: [&#x27;BlueWave Chromebook&#x27;]}] Note: The CineView TV mentioned is the 8K one, and the Gamesphere console mentioned is the X one. For the computer category, since the customer mentioned being on a budget, we cannot determine which specific product to recommend. Therefore, we have included all the products in the Computers and Laptops category in the output. ```| ```[{&#x27;category&#x27;: &#x27;Televisions and Home Theater Systems&#x27;, &#x27;products&#x27;: [&#x27;CineView 8K TV&#x27;]}, {&#x27;category&#x27;: &#x27;Gaming Consoles and Accessories&#x27;, &#x27;products&#x27;: [&#x27;GameSphere X&#x27;]}, {&#x27;category&#x27;: &#x27;Computers and Laptops&#x27;, &#x27;products&#x27;: [&#x27;BlueWave Chromebook&#x27;]}] The CineView TV is a high-end television with 8K resolution, providing an incredibly sharp and detailed picture. It is perfect for those who want the best viewing experience possible. The GameSphere X is a powerful gaming console that offers a wide range of games and features. It is perfect for gamers who want a high-quality gaming experience. The BlueWave Chromebook is a budget-friendly laptop that is perfect for those who need a basic computer for everyday use. It is not as powerful as some of the other options, but it is affordable and reliable. ```|
视频中测试了另一个case
customer_msg_3 = f"""
tell me about the smartx pro phone and the fotosnap camera, the dslr one.
Also, what TVs do you have?"""
products_by_category_3 = find_category_and_product_v2(customer_msg_3,
products_and_category)
print(products_by_category_3)
"""
[{'category': 'Smartphones and Accessories', 'products': ['SmartX ProPhone']}, {'category': 'Cameras and Camcorders', 'products': ['FotoSnap DSLR Camera']}, {'category': 'Televisions and Home Theater Systems', 'products': ['CineView 4K TV', 'SoundMax Home Theater', 'CineView 8K TV', 'SoundMax Soundbar', 'CineView OLED TV']}]
"""
在输出格式上是对的。
假设我们上面一切就绪,便需要在之前的测试case上看,有没有干扰到之前的结果,以方便在更大的测试集上进行测试。但一个一个跑显然就很不经济。
更好地方式是,构建一个自动化回归测试。在每次改动后,都跑一次回归测试,得到准确率。
下面,我们准备了10个测试例子,包含了输入和预计输出
msg_ideal_pairs_set = [
# eg 0
{'customer_msg':"""Which TV can I buy if I'm on a budget?""",
'ideal_answer':{
'Televisions and Home Theater Systems':set(
['CineView 4K TV', 'SoundMax Home Theater', 'CineView 8K TV', 'SoundMax Soundbar', 'CineView OLED TV']
)}
},
# eg 1
{'customer_msg':"""I need a charger for my smartphone""",
'ideal_answer':{
'Smartphones and Accessories':set(
['MobiTech PowerCase', 'MobiTech Wireless Charger', 'SmartX EarBuds']
)}
},
# eg 2
{'customer_msg':f"""What computers do you have?""",
'ideal_answer':{
'Computers and Laptops':set(
['TechPro Ultrabook', 'BlueWave Gaming Laptop', 'PowerLite Convertible', 'TechPro Desktop', 'BlueWave Chromebook'
])
}
},
# eg 3
{'customer_msg':f"""tell me about the smartx pro phone and \
the fotosnap camera, the dslr one.\
Also, what TVs do you have?""",
'ideal_answer':{
'Smartphones and Accessories':set(
['SmartX ProPhone']),
'Cameras and Camcorders':set(
['FotoSnap DSLR Camera']),
'Televisions and Home Theater Systems':set(
['CineView 4K TV', 'SoundMax Home Theater','CineView 8K TV', 'SoundMax Soundbar', 'CineView OLED TV'])
}
},
# eg 4
{'customer_msg':"""tell me about the CineView TV, the 8K one, Gamesphere console, the X one.
I'm on a budget, what computers do you have?""",
'ideal_answer':{
'Televisions and Home Theater Systems':set(
['CineView 8K TV']),
'Gaming Consoles and Accessories':set(
['GameSphere X']),
'Computers and Laptops':set(
['TechPro Ultrabook', 'BlueWave Gaming Laptop', 'PowerLite Convertible', 'TechPro Desktop', 'BlueWave Chromebook'])
}
},
# eg 5
{'customer_msg':f"""What smartphones do you have?""",
'ideal_answer':{
'Smartphones and Accessories':set(
['SmartX ProPhone', 'MobiTech PowerCase', 'SmartX MiniPhone', 'MobiTech Wireless Charger', 'SmartX EarBuds'
])
}
},
# eg 6
{'customer_msg':f"""I'm on a budget. Can you recommend some smartphones to me?""",
'ideal_answer':{
'Smartphones and Accessories':set(
['SmartX EarBuds', 'SmartX MiniPhone', 'MobiTech PowerCase', 'SmartX ProPhone', 'MobiTech Wireless Charger']
)}
},
# eg 7 # this will output a subset of the ideal answer
{'customer_msg':f"""What Gaming consoles would be good for my friend who is into racing games?""",
'ideal_answer':{
'Gaming Consoles and Accessories':set([
'GameSphere X',
'ProGamer Controller',
'GameSphere Y',
'ProGamer Racing Wheel',
'GameSphere VR Headset'
])}
},
# eg 8
{'customer_msg':f"""What could be a good present for my videographer friend?""",
'ideal_answer': {
'Cameras and Camcorders':set([
'FotoSnap DSLR Camera', 'ActionCam 4K', 'FotoSnap Mirrorless Camera', 'ZoomMaster Camcorder', 'FotoSnap Instant Camera'
])}
},
# eg 9
{'customer_msg':f"""I would like a hot tub time machine.""",
'ideal_answer': []
}
]
AtomGit 是由开放原子开源基金会联合 CSDN 等生态伙伴共同推出的新一代开源与人工智能协作平台。平台坚持“开放、中立、公益”的理念,把代码托管、模型共享、数据集托管、智能体开发体验和算力服务整合在一起,为开发者提供从开发、训练到部署的一站式体验。
更多推荐

所有评论(0)