How to fine-tune chat models
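Fine-tuning improves a model by training it on many more examples than can fit in a prompt, letting you achieve better results on a wide range of tasks. This notebook provides a step-by-step guide to our new GPT-4o mini fine-tuning. We'll perform entity extraction using the RecipeNLG dataset, which provides various recipes along with a list of generic ingredients extracted from each one. It is a common dataset for named entity recognition (NER) tasks.
Note: GPT-4o mini fine-tuning is available to developers in our Tier 4 and Tier 5 usage tiers. You can start fine-tuning GPT-4o mini by visiting your fine-tuning dashboard, clicking "Create", and selecting "gpt-4o-mini-2024-07-18" from the base model drop-down.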
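We will go through the following steps:
- Setup: Load our dataset and filter it down to one domain to fine-tune on.
- Data preparation: Prepare your data for fine-tuning by creating training and validation examples and uploading them to the Files endpoint.
- Fine-tuning: Create your fine-tuned model.
- Inference: Use your fine-tuned model for inference on new inputs.
By the end, you should be able to train, evaluate, and deploy a fine-tuned gpt-4o-mini-2024-07-18 model.
For more information on fine-tuning, you can refer to our documentation guide or API reference.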
Setup
# Make sure to use the latest version of the openai Python package
!pip install --upgrade --quiet openai
import json
import openai
import os
import pandas as pd
from pprint import pprint
client = openai.OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    organization="<org id>",
    project="<project id>",
)
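Fine-tuning works best when focused on a particular domain. It's important to make sure your dataset is focused enough for the model to learn, yet general enough that unseen examples won't be missed. With this in mind, we have extracted a subset of the RecipeNLG dataset that contains only documents from cookbooks.com.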
# Read in the dataset we'll use for this task.
# This is the RecipeNLG dataset, which we've cleaned to contain only documents from www.cookbooks.com
recipe_df = pd.read_csv("data/cookbook_recipes_nlg_10k.csv")
recipe_df.head()
 | title | ingredients | directions | link | source | NER |
---|---|---|---|---|---|---|
0 | No-Bake Nut Cookies | ["1 c. firmly packed brown sugar", "1/2 c. eva... | ["In a heavy 2-quart saucepan, mix brown sugar... | www.cookbooks.com/Recipe-Details.aspx?id=44874 | www.cookbooks.com | ["brown sugar", "milk", "vanilla", "nuts", "bu... |
1 | Jewell Ball'S Chicken | ["1 small jar chipped beef, cut up", "4 boned ... | ["Place chipped beef on bottom of baking dish.... | www.cookbooks.com/Recipe-Details.aspx?id=699419 | www.cookbooks.com | ["beef", "chicken breasts", "cream of mushroom... |
2 | Creamy Corn | ["2 (16 oz.) pkg. frozen corn", "1 (8 oz.) pkg... | ["In a slow cooker, combine all ingredients. C... | www.cookbooks.com/Recipe-Details.aspx?id=10570 | www.cookbooks.com | ["frozen corn", "cream cheese", "butter", "gar... |
3 | Chicken Funny | ["1 large whole chicken", "2 (10 1/2 oz.) cans... | ["Boil and debone chicken.", "Put bite size pi... | www.cookbooks.com/Recipe-Details.aspx?id=897570 | www.cookbooks.com | ["chicken", "chicken gravy", "cream of mushroo... |
4 | Reeses Cups(Candy) | ["1 c. peanut butter", "3/4 c. graham cracker ... | ["Combine first four ingredients and press in ... | www.cookbooks.com/Recipe-Details.aspx?id=659239 | www.cookbooks.com | ["peanut butter", "graham cracker crumbs", "bu... |
Data preparation
We'll begin by preparing our data. When fine-tuning with the ChatCompletion format, each training example is a simple list of messages. For example, an entry could look like:
[{'role': 'system',
'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'},
{'role': 'user',
'content': 'Title: No-Bake Nut Cookies\n\nIngredients: ["1 c. firmly packed brown sugar", "1/2 c. evaporated milk", "1/2 tsp. vanilla", "1/2 c. broken nuts (pecans)", "2 Tbsp. butter or margarine", "3 1/2 c. bite size shredded rice biscuits"]\n\nGeneric ingredients: '},
{'role': 'assistant',
'content': '["brown sugar", "milk", "vanilla", "nuts", "butter", "bite size shredded rice biscuits"]'}]
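During training, this conversation is split: the final entry is the completion the model will produce, and the remaining messages act as the prompt. Keep this in mind when building your training examples. If your model will act on multi-turn conversations, provide representative examples so it doesn't perform poorly when the conversation starts to expand.
Please note that there is currently a 4096 token limit for each training example. Anything longer than this will be truncated at 4096 tokens.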
system_message = "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."

def create_user_message(row):
    # Build the prompt from the recipe title and its raw ingredient list
    return f"Title: {row['title']}\n\nIngredients: {row['ingredients']}\n\nGeneric ingredients: "

def prepare_example_conversation(row):
    # The assistant message carries the label (the NER column) the model should learn to produce
    return {
        "messages": [
            {"role": "system", "content": system_message},
            {"role": "user", "content": create_user_message(row)},
            {"role": "assistant", "content": row["NER"]},
        ]
    }

pprint(prepare_example_conversation(recipe_df.iloc[0]))
{'messages': [{'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.',
'role': 'system'},
{'content': 'Title: No-Bake Nut Cookies\n'
'\n'
'Ingredients: ["1 c. firmly packed brown sugar", '
'"1/2 c. evaporated milk", "1/2 tsp. vanilla", "1/2 '
'c. broken nuts (pecans)", "2 Tbsp. butter or '
'margarine", "3 1/2 c. bite size shredded rice '
'biscuits"]\n'
'\n'
'Generic ingredients: ',
'role': 'user'},
{'content': '["brown sugar", "milk", "vanilla", "nuts", '
'"butter", "bite size shredded rice biscuits"]',
'role': 'assistant'}]}
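Before building the full training set, it can help to check that every example has the shape described above. The following is a minimal sketch, not part of the original notebook: the helper name `check_example` and the `VALID_ROLES` set are assumptions chosen for this notebook's text-only examples, and the API may accept other message shapes that this check would flag.

```python
# Roles used by the examples in this notebook (an assumption, not an exhaustive list)
VALID_ROLES = {"system", "user", "assistant"}


def check_example(example):
    """Return a list of problems found in one training example (empty if none)."""
    messages = example.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    problems = []
    for i, message in enumerate(messages):
        if not isinstance(message, dict):
            problems.append(f"message {i}: must be a dict")
            continue
        if message.get("role") not in VALID_ROLES:
            problems.append(f"message {i}: unexpected role {message.get('role')!r}")
        if not isinstance(message.get("content"), str):
            problems.append(f"message {i}: content must be a string")

    # The final message is the completion the model learns to produce
    last = messages[-1]
    if isinstance(last, dict) and last.get("role") != "assistant":
        problems.append("last message should come from the assistant")
    return problems


good_example = {
    "messages": [
        {"role": "system", "content": "You are a helpful recipe assistant."},
        {"role": "user", "content": "Title: Creamy Corn"},
        {"role": "assistant", "content": '["frozen corn", "cream cheese"]'},
    ]
}
print(check_example(good_example))
```

Running `check_example` over each item of the training list and printing any non-empty results is one way to catch malformed rows before uploading.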
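Now let's do this for a subset of the dataset to use as our training data. You can begin with as few as 30-50 well-pruned examples. You should see performance continue to scale linearly as you increase the size of the training set, but your jobs will also take longer.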
# Use rows 0 through 100 for training (.loc slicing is label-based and inclusive, so this is 101 rows)
training_df = recipe_df.loc[0:100]
# Apply the prepare_example_conversation function to each row of training_df
training_data = training_df.apply(prepare_example_conversation, axis=1).tolist()
for example in training_data[:5]:
    print(example)
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: No-Bake Nut Cookies\n\nIngredients: ["1 c. firmly packed brown sugar", "1/2 c. evaporated milk", "1/2 tsp. vanilla", "1/2 c. broken nuts (pecans)", "2 Tbsp. butter or margarine", "3 1/2 c. bite size shredded rice biscuits"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["brown sugar", "milk", "vanilla", "nuts", "butter", "bite size shredded rice biscuits"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Jewell Ball\'S Chicken\n\nIngredients: ["1 small jar chipped beef, cut up", "4 boned chicken breasts", "1 can cream of mushroom soup", "1 carton sour cream"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["beef", "chicken breasts", "cream of mushroom soup", "sour cream"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Creamy Corn\n\nIngredients: ["2 (16 oz.) pkg. frozen corn", "1 (8 oz.) pkg. cream cheese, cubed", "1/3 c. butter, cubed", "1/2 tsp. garlic powder", "1/2 tsp. salt", "1/4 tsp. pepper"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["frozen corn", "cream cheese", "butter", "garlic powder", "salt", "pepper"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Chicken Funny\n\nIngredients: ["1 large whole chicken", "2 (10 1/2 oz.) cans chicken gravy", "1 (10 1/2 oz.) can cream of mushroom soup", "1 (6 oz.) box Stove Top stuffing", "4 oz. shredded cheese"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["chicken", "chicken gravy", "cream of mushroom soup", "shredded cheese"]'}]}
{'messages': [{'role': 'system', 'content': 'You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided.'}, {'role': 'user', 'content': 'Title: Reeses Cups(Candy) \n\nIngredients: ["1 c. peanut butter", "3/4 c. graham cracker crumbs", "1 c. melted butter", "1 lb. (3 1/2 c.) powdered sugar", "1 large pkg. chocolate chips"]\n\nGeneric ingredients: '}, {'role': 'assistant', 'content': '["peanut butter", "graham cracker crumbs", "butter", "powdered sugar", "chocolate chips"]'}]}
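In addition to training data, we can also optionally provide validation data, which will be used to make sure that the model does not overfit the training set.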
validation_df = recipe_df.loc[101:200]
validation_data = validation_df.apply(
    prepare_example_conversation, axis=1).tolist()
We then need to save our data as .jsonl files, with each line being one training example conversation.
def write_jsonl(data_list: list, filename: str) -> None:
    # Write one JSON object per line
    with open(filename, "w") as out:
        for ddict in data_list:
            jout = json.dumps(ddict) + "\n"
            out.write(jout)

training_file_name = "tmp_recipe_finetune_training.jsonl"
write_jsonl(training_data, training_file_name)
validation_file_name = "tmp_recipe_finetune_validation.jsonl"
write_jsonl(validation_data, validation_file_name)
This is what the first 5 lines of our training .jsonl file look like:
# Print the first 5 lines of the training file
!head -n 5 tmp_recipe_finetune_training.jsonl
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: No-Bake Nut Cookies\n\nIngredients: [\"1 c. firmly packed brown sugar\", \"1/2 c. evaporated milk\", \"1/2 tsp. vanilla\", \"1/2 c. broken nuts (pecans)\", \"2 Tbsp. butter or margarine\", \"3 1/2 c. bite size shredded rice biscuits\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"brown sugar\", \"milk\", \"vanilla\", \"nuts\", \"butter\", \"bite size shredded rice biscuits\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Jewell Ball'S Chicken\n\nIngredients: [\"1 small jar chipped beef, cut up\", \"4 boned chicken breasts\", \"1 can cream of mushroom soup\", \"1 carton sour cream\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"beef\", \"chicken breasts\", \"cream of mushroom soup\", \"sour cream\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Creamy Corn\n\nIngredients: [\"2 (16 oz.) pkg. frozen corn\", \"1 (8 oz.) pkg. cream cheese, cubed\", \"1/3 c. butter, cubed\", \"1/2 tsp. garlic powder\", \"1/2 tsp. salt\", \"1/4 tsp. pepper\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"frozen corn\", \"cream cheese\", \"butter\", \"garlic powder\", \"salt\", \"pepper\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Chicken Funny\n\nIngredients: [\"1 large whole chicken\", \"2 (10 1/2 oz.) cans chicken gravy\", \"1 (10 1/2 oz.) can cream of mushroom soup\", \"1 (6 oz.) box Stove Top stuffing\", \"4 oz. shredded cheese\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"chicken\", \"chicken gravy\", \"cream of mushroom soup\", \"shredded cheese\"]"}]}
{"messages": [{"role": "system", "content": "You are a helpful recipe assistant. You are to extract the generic ingredients from each of the recipes provided."}, {"role": "user", "content": "Title: Reeses Cups(Candy) \n\nIngredients: [\"1 c. peanut butter\", \"3/4 c. graham cracker crumbs\", \"1 c. melted butter\", \"1 lb. (3 1/2 c.) powdered sugar\", \"1 large pkg. chocolate chips\"]\n\nGeneric ingredients: "}, {"role": "assistant", "content": "[\"peanut butter\", \"graham cracker crumbs\", \"butter\", \"powdered sugar\", \"chocolate chips\"]"}]}
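To double-check a written file before uploading it, you can read it back and confirm that every line parses as JSON. This is a minimal sketch using only the standard library; `read_jsonl` is a hypothetical helper that is not part of the original notebook, and the sample record is a shortened stand-in for a real training example.

```python
import json
import os
import tempfile


def read_jsonl(filename):
    """Parse a JSONL file, raising ValueError with the line number on bad input."""
    records = []
    with open(filename, "r", encoding="utf-8") as infile:
        for line_number, line in enumerate(infile, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError as err:
                raise ValueError(f"{filename}:{line_number}: invalid JSON ({err})") from err
    return records


# Round-trip a small sample through a temporary file
sample = [{"messages": [{"role": "user", "content": "Title: Creamy Corn"}]}]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "sample.jsonl")
    with open(path, "w", encoding="utf-8") as out:
        for record in sample:
            out.write(json.dumps(record) + "\n")
    round_tripped = read_jsonl(path)

print(round_tripped == sample)
```

In the notebook, the same check would be `read_jsonl(training_file_name) == training_data`.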
Upload files
You can now upload the files to our Files endpoint so they can be used for fine-tuning.
def upload_file(file_name: str, purpose: str) -> str:
    # Upload a local file and return the ID assigned by the Files endpoint
    with open(file_name, "rb") as file_fd:
        response = client.files.create(file=file_fd, purpose=purpose)
    return response.id

training_file_id = upload_file(training_file_name, "fine-tune")
validation_file_id = upload_file(validation_file_name, "fine-tune")
print("Training file ID:", training_file_id)
print("Validation file ID:", validation_file_id)
Training file ID: file-3wfAfDoYcGrSpaE17qK0vXT0
Validation file ID: file-HhFhnyGJhazYdPcd3wrtvIoX
Fine-tuning
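Now we can create our fine-tuning job with the generated files and an optional suffix to identify the model. The response will contain an id, which you can use to retrieve updates on the job.
Note: The files have to be processed by our system first, so you might get a "File not ready" error. In that case, simply retry a few minutes later.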
MODEL = "gpt-4o-mini-2024-07-18"
response = client.fine_tuning.jobs.create(
    training_file=training_file_id,
    validation_file=validation_file_id,
    model=MODEL,
    suffix="recipe-ner",
)
job_id = response.id
print("Job ID:", response.id)
print("Status:", response.status)
Job ID: ftjob-UiaiLwGdGBfdLQDBAoQheufN
Status: validating_files
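If job creation fails because the files are still being processed, the retry advice in the note above can be automated. The following is a rough sketch, not the notebook's method: `retry` is a hypothetical helper, and because the source does not name the exception the client raises for an unprocessed file, the exception type is left as a parameter for you to narrow.

```python
import time


def retry(action, retry_on=Exception, attempts=5, wait_seconds=60.0, sleep=time.sleep):
    """Call action() until it succeeds, waiting between failed attempts.

    Re-raises the last error once `attempts` calls have failed.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(wait_seconds)


# Hypothetical usage with the job-creation call from this notebook:
# response = retry(
#     lambda: client.fine_tuning.jobs.create(
#         training_file=training_file_id,
#         validation_file=validation_file_id,
#         model=MODEL,
#         suffix="recipe-ner",
#     ),
#     wait_seconds=120.0,
# )
```

Passing `sleep` as a parameter keeps the helper easy to exercise without actually waiting.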
Check job status
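You can make a GET request to the https://api.openai.com/v1/fine_tuning/jobs endpoint to list your fine-tuning jobs. In this instance, you'll want to check that the status of the job ID you got in the previous step ends up as succeeded.
Once it has completed, you can use the result_files to sample the results from the validation set (if you uploaded one), and use the ID from the fine_tuned_model parameter to invoke your trained model.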
response = client.fine_tuning.jobs.retrieve(job_id)
print("Job ID:", response.id)
print("Status:", response.status)
print("Trained Tokens:", response.trained_tokens)
Job ID: ftjob-UiaiLwGdGBfdLQDBAoQheufN
Status: running
Trained Tokens: None
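Instead of re-running the status cell by hand, you could poll until the job reaches a final state. This is a minimal sketch: `wait_for_job` is a hypothetical helper, and the set of terminal statuses is an assumption based on the API reference (only `validating_files`, `running`, and `succeeded` appear in this notebook).

```python
import time

# Assumed final states of a fine-tuning job
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_job(get_status, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"job still not finished after {max_polls} polls")


# Hypothetical usage with the retrieve call shown above:
# final_status = wait_for_job(lambda: client.fine_tuning.jobs.retrieve(job_id).status)
```

The `max_polls` cap keeps a stuck job from blocking the notebook indefinitely.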
We can track the progress of the fine-tune with the events endpoint. You can rerun the cell below a few times until the fine-tune is ready.
response = client.fine_tuning.jobs.list_events(job_id)
events = response.data
# Reverse the returned list before printing
events.reverse()
for event in events:
    print(event.message)
Step 288/303: training loss=0.00
Step 289/303: training loss=0.01
Step 290/303: training loss=0.00, validation loss=0.31
Step 291/303: training loss=0.00
Step 292/303: training loss=0.00
Step 293/303: training loss=0.00
Step 294/303: training loss=0.00
Step 295/303: training loss=0.00
Step 296/303: training loss=0.00
Step 297/303: training loss=0.00
Step 298/303: training loss=0.01
Step 299/303: training loss=0.00
Step 300/303: training loss=0.00, validation loss=0.04
Step 301/303: training loss=0.16
Step 302/303: training loss=0.00
Step 303/303: training loss=0.00, full validation loss=0.33
Checkpoint created at step 101 with Snapshot ID: ft:gpt-4o-mini-2024-07-18:openai-gtm:recipe-ner:9o1eNlSa:ckpt-step-101
Checkpoint created at step 202 with Snapshot ID: ft:gpt-4o-mini-2024-07-18:openai-gtm:recipe-ner:9o1eNFnj:ckpt-step-202
New fine-tuned model created: ft:gpt-4o-mini-2024-07-18:openai-gtm:recipe-ner:9o1eNNKO
The job has successfully completed
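The step messages above can be turned into numbers if you want to plot or compare the losses. The sketch below parses the message text exactly as it appears in this output; that format is an observation from this run, not a documented contract, so treat the pattern as an assumption. `parse_step_event` is a hypothetical helper, and it does not distinguish "validation loss" from "full validation loss".

```python
import re

# Matches e.g. "Step 290/303: training loss=0.00, validation loss=0.31"
STEP_PATTERN = re.compile(
    r"^Step (\d+)/(\d+): training loss=([\d.]+)"
    r"(?:, (?:full )?validation loss=([\d.]+))?"
)


def parse_step_event(message):
    """Return step and loss values from a step message, or None for other messages."""
    match = STEP_PATTERN.match(message)
    if match is None:
        return None
    step, total, train_loss, val_loss = match.groups()
    return {
        "step": int(step),
        "total_steps": int(total),
        "training_loss": float(train_loss),
        "validation_loss": float(val_loss) if val_loss is not None else None,
    }


sample_messages = [
    "Step 289/303: training loss=0.01",
    "Step 290/303: training loss=0.00, validation loss=0.31",
    "Step 303/303: training loss=0.00, full validation loss=0.33",
    "The job has successfully completed",
]
parsed = [parse_step_event(message) for message in sample_messages]
for row in parsed:
    print(row)
```

In the notebook, you would apply `parse_step_event` to `event.message` for each event and drop the `None` results.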
Now that it's done, we can get a fine-tuned model ID from the job:
response = client.fine_tuning.jobs.retrieve(job_id)
fine_tuned_model_id = response.fine_tuned_model
if fine_tuned_model_id is None:
    raise RuntimeError(
        "Fine-tuned model ID not found. Your job has likely not been completed yet."
    )
print("Fine-tuned model ID:", fine_tuned_model_id)
Fine-tuned model ID: ft:gpt-4o-mini-2024-07-18:openai-gtm:recipe-ner:9o1eNNKO
Inference
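The last step is to use your fine-tuned model for inference. Similar to the classic FineTuning, you simply call ChatCompletions with the name of your new fine-tuned model in the model parameter.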
test_df = recipe_df.loc[201:300]
test_row = test_df.iloc[0]
test_messages = []
test_messages.append({"role": "system", "content": system_message})
user_message = create_user_message(test_row)
test_messages.append({"role": "user", "content": user_message})
pprint(test_messages)
[{'content': 'You are a helpful recipe assistant. You are to extract the '
'generic ingredients from each of the recipes provided.',
'role': 'system'},
{'content': 'Title: Beef Brisket\n'
'\n'
'Ingredients: ["4 lb. beef brisket", "1 c. catsup", "1 c. water", '
'"1/2 onion, minced", "2 Tbsp. cider vinegar", "1 Tbsp. prepared '
'horseradish", "1 Tbsp. prepared mustard", "1 tsp. salt", "1/2 '
'tsp. pepper"]\n'
'\n'
'Generic ingredients: ',
'role': 'user'}]
response = client.chat.completions.create(
    model=fine_tuned_model_id, messages=test_messages, temperature=0, max_tokens=500
)
print(response.choices[0].message.content)
["beef brisket", "catsup", "water", "onion", "cider vinegar", "horseradish", "mustard", "salt", "pepper"]
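The prediction above can be inspected by eye, but the notebook does not score the model against the held-out labels. As a minimal sketch of one way to do that, the code below compares a predicted ingredient list with the reference NER list as sets and reports precision, recall, and F1. The helper names are hypothetical, the set-based metric is an assumption rather than the notebook's evaluation method, and the prediction in the demo is made up for illustration (the reference label is copied from the "Chicken Funny" row shown earlier).

```python
import json


def parse_ingredients(text):
    """Parse a JSON list of strings into a set of normalized ingredient names."""
    try:
        items = json.loads(text)
    except (json.JSONDecodeError, TypeError):
        return set()  # treat unparseable output as an empty prediction
    if not isinstance(items, list):
        return set()
    return {item.strip().lower() for item in items if isinstance(item, str)}


def score_prediction(predicted, expected):
    """Compare predicted and reference ingredient lists as sets."""
    pred = parse_ingredients(predicted)
    gold = parse_ingredients(expected)
    overlap = len(pred & gold)
    precision = overlap / len(pred) if pred else 0.0
    recall = overlap / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Reference label from the "Chicken Funny" row; the prediction is hypothetical
expected = '["chicken", "chicken gravy", "cream of mushroom soup", "shredded cheese"]'
hypothetical_prediction = (
    '["chicken", "chicken gravy", "cream of mushroom soup", "stuffing", "shredded cheese"]'
)
scores = score_prediction(hypothetical_prediction, expected)
print(scores)
```

In the notebook, the equivalent call for one row would be `score_prediction(response.choices[0].message.content, test_row["NER"])`, averaged over the rows of `test_df`.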
Conclusion
Congratulations, you are now ready to fine-tune your own models using the ChatCompletion format! We look forward to seeing what you build.