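Azure OpenAI Service with your own data (preview)
Note: A newer version of the openai library is available. See https://github.com/openai/openai-python/discussions/742
This example shows how to use Azure OpenAI Service models with your own data. The feature is currently in preview.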
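Azure OpenAI on your data lets you run supported chat models, such as GPT-3.5-Turbo and GPT-4, on your own data without training or fine-tuning them. Running models on your data lets you chat over and analyze that data with greater accuracy and speed. A key benefit of Azure OpenAI on your data is its ability to tailor the content of conversational AI. Because the model can access and reference specific sources to support its responses, answers are based not only on its pretrained knowledge but also on the latest information available in the designated data source. This grounding data also helps the model avoid generating responses based on outdated or incorrect information.
Azure OpenAI on your data, combined with Azure Cognitive Search, provides a customizable, pre-built solution for knowledge retrieval from which a conversational AI application can be built. For alternative methods of knowledge retrieval and semantic search, see the cookbook examples for vector databases.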
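How it works
Azure OpenAI on your data connects the model with your data, giving it the ability to retrieve and use that data in a way that improves the model's output. Together with Azure Cognitive Search, data is retrieved from the designated data source based on the user input and the provided conversation history. The retrieved data is then augmented and resubmitted to the model as a prompt, giving the model contextual information it can use to generate a response.
For more information, see Data, privacy, and security for Azure OpenAI Service.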
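The retrieve, augment, and resubmit flow described above can be illustrated with a small, self-contained sketch. This is a toy model of the idea only, not how the service is implemented: the keyword-overlap ranking, the function names `retrieve` and `build_augmented_messages`, and the system prompt wording are all assumptions made for illustration.

```python
from typing import Dict, List


def retrieve(search_text: str, documents: Dict[str, str], top_k: int = 2) -> List[str]:
    """Toy keyword search: rank documents by the number of words shared with the search text."""
    search_words = set(search_text.lower().split())
    scored = []
    for doc_id, text in documents.items():
        overlap = len(search_words & set(text.lower().split()))
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties are broken by document ID so the order is stable.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


def build_augmented_messages(
    user_input: str, history: List[Dict[str, str]], documents: Dict[str, str]
) -> List[Dict[str, str]]:
    """Build a prompt that places the retrieved sources ahead of the conversation."""
    # Retrieval considers the new user input together with earlier user turns.
    earlier_turns = [m["content"] for m in history if m["role"] == "user"]
    doc_ids = retrieve(" ".join(earlier_turns + [user_input]), documents)
    sources = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    system_message = {"role": "system", "content": "Answer using only these sources:\n" + sources}
    return [system_message, *history, {"role": "user", "content": user_input}]


# Hypothetical two-document "index" used only for this illustration.
documents = {
    "doc1": "Azure Machine Learning is tailored for data scientists",
    "doc2": "Bananas are yellow",
}
messages = build_augmented_messages("What is Azure Machine Learning", [], documents)
print(messages[0]["content"])
```

In the real feature, the search step is handled by Azure Cognitive Search and the prompt construction happens on the service side, so none of this code is needed to use it.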
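Prerequisites
To get started, we'll cover a few prerequisites.
To properly access the Azure OpenAI Service, we need to create the appropriate resources in the Azure portal (a detailed guide on how to do this is available in Microsoft Docs).
To use your own data with Azure OpenAI models, you will need: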
- Azure OpenAI access and a resource with a chat model deployed (for example, GPT-3.5-Turbo or GPT-4)
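- An Azure Cognitive Search resource
- An Azure Blob Storage resource
- Your documents to be used as data (see the data source options)
For a full walk-through on how to upload your documents to blob storage and create an index using Azure AI Studio, see this quickstart.
Setup
First, we install the necessary dependencies.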
! pip install "openai>=0.28.1,<1.0.0"
! pip install python-dotenv
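In this example, we'll use dotenv to load our environment variables. To connect to Azure OpenAI and the Search index, add the following variables to a .env file in KEY=VALUE format:
- OPENAI_API_BASE - the Azure OpenAI endpoint. You can find it under "Keys and Endpoints" for your Azure OpenAI resource in the Azure portal.
- OPENAI_API_KEY - the Azure OpenAI API key. You can find it under "Keys and Endpoints" for your Azure OpenAI resource in the Azure portal. Omit it if you use Azure Active Directory authentication (see "Authentication using Microsoft Active Directory" below).
- SEARCH_ENDPOINT - the Cognitive Search endpoint. You can find this URL in the "Overview" section of your Search resource in the Azure portal.
- SEARCH_KEY - the Cognitive Search API key. You can find it under "Keys" for your Search resource in the Azure portal.
- SEARCH_INDEX_NAME - the name of the index you created with your own data.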
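Before running the rest of the example, it can help to confirm that every variable listed above is actually set. The sketch below is one way to do that; the helper name `missing_variables` and the placeholder values in the comment are hypothetical and not part of the original example.

```python
import os
from typing import List, Mapping

# A .env file for this example might look like this (placeholder values only):
#   OPENAI_API_BASE=https://<your-resource>.openai.azure.com
#   OPENAI_API_KEY=<your-azure-openai-key>
#   SEARCH_ENDPOINT=https://<your-search-resource>.search.windows.net
#   SEARCH_KEY=<your-search-key>
#   SEARCH_INDEX_NAME=<your-index-name>

REQUIRED_VARIABLES = ["OPENAI_API_BASE", "SEARCH_ENDPOINT", "SEARCH_KEY", "SEARCH_INDEX_NAME"]


def missing_variables(env: Mapping[str, str], use_azure_active_directory: bool = False) -> List[str]:
    """Return the names of required settings that are absent or empty in `env`."""
    required = list(REQUIRED_VARIABLES)
    if not use_azure_active_directory:
        # The API key is only needed when authenticating with a key.
        required.append("OPENAI_API_KEY")
    return [name for name in required if not env.get(name)]


missing = missing_variables(os.environ)
if missing:
    print("Missing settings:", ", ".join(missing))
```

Call it after `dotenv.load_dotenv()` so that values from the .env file are already present in `os.environ`.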
import os
import openai
import dotenv
dotenv.load_dotenv()
openai.api_base = os.environ["OPENAI_API_BASE"]
# Azure OpenAI on your own data is only supported by the 2023-08-01-preview API version
openai.api_version = "2023-08-01-preview"
Authentication
The Azure OpenAI Service supports multiple authentication mechanisms, including API keys and Azure credentials.
use_azure_active_directory = False  # Set this flag to True if you are using Azure Active Directory
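Authentication using API key
To set up the OpenAI SDK to use an Azure API key, we need to set api_type to azure and api_key to a key associated with your endpoint (you can find this key under "Keys and Endpoints" in the "Resource Management" section of the Azure portal).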
if not use_azure_active_directory:
    openai.api_type = 'azure'
    openai.api_key = os.environ["OPENAI_API_KEY"]
Authentication using Microsoft Active Directory
Let's now see how we can get a key via Microsoft Active Directory authentication. See the documentation for more information on how to set this up.
! pip install azure-identity
from azure.identity import DefaultAzureCredential
if use_azure_active_directory:
    default_credential = DefaultAzureCredential()
    token = default_credential.get_token("https://cognitiveservices.azure.com/.default")
    openai.api_type = "azure_ad"
    openai.api_key = token.token
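A token is valid for a limited period, after which it expires. To ensure a valid token is sent with every request, you can refresh a token that is about to expire by hooking into requests.auth.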
import typing
import time
import requests

if typing.TYPE_CHECKING:
    from azure.core.credentials import AccessToken, TokenCredential

class TokenRefresh(requests.auth.AuthBase):
    """Attach a bearer token to each request, refreshing it shortly before it expires."""

    def __init__(self, credential: "TokenCredential", scopes: typing.List[str]) -> None:
        self.credential = credential
        self.scopes = scopes
        # Holds the full token object (token string plus expiry), not just the string.
        self.cached_token: typing.Optional["AccessToken"] = None

    def __call__(self, req):
        # Fetch a new token if none is cached or the cached one expires within 5 minutes.
        if not self.cached_token or self.cached_token.expires_on - time.time() < 300:
            self.cached_token = self.credential.get_token(*self.scopes)
        req.headers["Authorization"] = f"Bearer {self.cached_token.token}"
        return req
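To see the refresh rule in isolation, the sketch below reproduces the same "refresh when less than five minutes remain" check against a stand-in credential, so it runs without Azure access or the requests library. `FakeCredential`, `FakeToken`, and `CachedToken` are hypothetical names invented for this illustration; only the expiry comparison mirrors the class above.

```python
import time
from collections import namedtuple

# Stand-in for a token object exposing `token` and `expires_on` (epoch seconds).
FakeToken = namedtuple("FakeToken", ["token", "expires_on"])


class FakeCredential:
    """Hypothetical credential that issues numbered tokens with a fixed lifetime."""

    def __init__(self, lifetime_seconds: float) -> None:
        self.lifetime_seconds = lifetime_seconds
        self.calls = 0

    def get_token(self, *scopes: str) -> FakeToken:
        self.calls += 1
        return FakeToken(f"token-{self.calls}", time.time() + self.lifetime_seconds)


class CachedToken:
    """Cache a token and replace it once it is within `refresh_margin` seconds of expiry."""

    def __init__(self, credential, scopes, refresh_margin: float = 300) -> None:
        self.credential = credential
        self.scopes = scopes
        self.refresh_margin = refresh_margin
        self.cached = None

    def get(self) -> str:
        if self.cached is None or self.cached.expires_on - time.time() < self.refresh_margin:
            self.cached = self.credential.get_token(*self.scopes)
        return self.cached.token


long_lived = FakeCredential(lifetime_seconds=3600)
cache = CachedToken(long_lived, ["example-scope"])
cache.get()
cache.get()
print("tokens requested with a 1-hour lifetime:", long_lived.calls)

short_lived = FakeCredential(lifetime_seconds=60)
cache = CachedToken(short_lived, ["example-scope"])
cache.get()
cache.get()
print("tokens requested with a 1-minute lifetime:", short_lived.calls)
```

A token with an hour left is reused, while one that is already inside the five-minute margin is fetched again on every call.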
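Chat completion model with your own data
Setting the context
In this example, we want our model to base its responses on Azure AI services documentation data. Following the quickstart shared previously, we added the markdown file for the Azure AI services and machine learning documentation page to our search index. The model is now ready to answer questions about Azure AI services and machine learning.
Code
To chat with Azure OpenAI models using your own data with the Python SDK, we must first set up the code to target the chat completions extensions endpoint, which is designed to work with your own data. To do this, we've created a convenience function that can be called to set up a custom adapter for the library, which targets the extensions endpoint for a given deployment ID.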
import requests
def setup_byod(deployment_id: str) -> None:
    """Sets up the OpenAI Python SDK to use your own data for the chat endpoint.

    :param deployment_id: The deployment ID for the model to use with your own data.

    To remove this configuration, simply set openai.requestssession to None.
    """

    class BringYourOwnDataAdapter(requests.adapters.HTTPAdapter):
        def send(self, request, **kwargs):
            request.url = f"{openai.api_base}/openai/deployments/{deployment_id}/extensions/chat/completions?api-version={openai.api_version}"
            return super().send(request, **kwargs)

    session = requests.Session()

    # Mount a custom adapter which will use the extensions endpoint for any call using the given `deployment_id`
    session.mount(
        prefix=f"{openai.api_base}/openai/deployments/{deployment_id}",
        adapter=BringYourOwnDataAdapter()
    )

    if use_azure_active_directory:
        session.auth = TokenRefresh(default_credential, ["https://cognitiveservices.azure.com/.default"])

    openai.requestssession = session
Now we can call the convenience function to configure the SDK with the model we plan to use for our own data.
setup_byod("gpt-4")
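To make the adapter's effect concrete, the sketch below builds the URL it substitutes into each request and, for comparison, the standard chat completions URL for the same deployment. The helper names `extensions_url` and `standard_url` and the example resource name are hypothetical. Stripping a trailing slash from the base URL is an addition that the adapter itself does not perform.

```python
def standard_url(api_base: str, deployment_id: str, api_version: str) -> str:
    """Chat completions URL for a deployment, without the extensions segment."""
    base = api_base.rstrip("/")
    return f"{base}/openai/deployments/{deployment_id}/chat/completions?api-version={api_version}"


def extensions_url(api_base: str, deployment_id: str, api_version: str) -> str:
    """URL of the extensions endpoint that accepts the dataSources parameter."""
    base = api_base.rstrip("/")
    return f"{base}/openai/deployments/{deployment_id}/extensions/chat/completions?api-version={api_version}"


# Placeholder endpoint; substitute the value of OPENAI_API_BASE.
example_base = "https://example-resource.openai.azure.com/"
print(standard_url(example_base, "gpt-4", "2023-08-01-preview"))
print(extensions_url(example_base, "gpt-4", "2023-08-01-preview"))
```

Both URLs share the prefix `.../openai/deployments/gpt-4`, which is why mounting the adapter on that prefix catches every call made for the deployment.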
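When we pass our search endpoint, key, and index name in the dataSources keyword argument, any question posed to the model is grounded in our own data. The response also includes an additional property, context, which shows the data the model referenced to answer the question.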
completion = openai.ChatCompletion.create(
    messages=[{"role": "user", "content": "What are the differences between Azure Machine Learning and Azure AI services?"}],
    deployment_id="gpt-4",
    dataSources=[  # camelCase is intentional, as this is the format the API expects
        {
            "type": "AzureCognitiveSearch",
            "parameters": {
                "endpoint": os.environ["SEARCH_ENDPOINT"],
                "key": os.environ["SEARCH_KEY"],
                "indexName": os.environ["SEARCH_INDEX_NAME"],
            }
        }
    ]
)
print(completion)
{
"id": "65b485bb-b3c9-48da-8b6f-7d3a219f0b40",
"model": "gpt-4",
"created": 1693338769,
"object": "extensions.chat.completion",
"choices": [
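{
"index": 0,
"finish_reason": "stop",
"message": {
"role": "assistant",
"content": "Azure AI services and Azure Machine Learning (AML) both aim to apply artificial intelligence (AI) to enhance business operations, but they target different audiences and offer different capabilities [doc1]. \n\nAzure AI services are designed for developers without machine learning experience and provide pre-trained models to solve general problems such as text analysis, image recognition, and natural language processing [doc5]. These services require a general understanding of your data, with no machine learning or data science experience needed, and provide REST APIs and language-based SDKs [doc2].\n\nAzure Machine Learning, on the other hand, is tailored for data scientists and involves a longer process of data collection, cleaning, transformation, algorithm selection, model training, and deployment [doc5]. It allows users to create custom solutions for highly specialized and specific problems, requiring familiarity with the subject matter, the data, and data science expertise [doc5].\n\nIn summary, Azure AI services offer pre-trained models for developers without machine learning experience, while Azure Machine Learning is a platform for data scientists to build and deploy custom machine learning models.",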
"end_turn": true,
"context": {
"messages": [
{
"role": "tool",
"content": "{\"citations\": [{\"content\": \"<h2 id=\\\"how-are-azure-ai-services-and-azure-machine-learning-aml-similar\\\">How are Azure AI services and Azure Machine Learning (AML) similar?</h2>\\n<p>Both aim to apply artificial intelligence (AI) to enhance business operations, though how each provides this in its respective offerings is different.</p>\\n<p>Generally, the audiences are different:</p>\\n<ul>\\n<li>Azure AI services are for developers without machine learning experience.</li>\\n<li>Azure Machine Learning is tailored for data scientists.</li></ul>\\n<p><b>Azure AI services and machine learning</b></p>\\n<p>Azure AI services provides machine learning capabilities to solve general problems such as analyzing text for emotional sentiment or analyzing images to recognize objects or faces. You don't need special machine learning or data science knowledge to use these services. Azure AI services is a group of services, each supporting different, generalized prediction capabilities. The services are divided into different categories to help you find the right service.</p>\\n<table>\\n<thead>\\n<tr>\\n<th>Service category</th>\\n<th>Purpose</th>\\n</tr>\\n</thead>\\n<tbody>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/decision/\\\">Decision</a></td>\\n<td>Build apps that surface recommendations for informed and efficient decision-making.</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/lang/\\\">Language</a></td>\\n<td>Allow your apps to process natural language with pre-built scripts, evaluate sentiment and learn how to recognize what users want.</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/search/\\\">Search</a></td>\\n<td>Add Bing Search APIs to your apps and harness the ability to comb billions of webpages, images, videos, and news with a single API call.</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/speech/\\\">Speech</a></td>\\n<td>Convert speech into text and text into natural-sounding speech. Translate from one language to another and enable speaker verification and recognition.</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/vision/\\\">Vision</a></td>\\n<td>Recognize, identify, caption, index, and moderate your pictures, videos, and digital ink content.</td>\\n</tr>\\n<p></tbody>\\n</table></p>\\n<p>Use Azure AI services when you:</p>\\n<ul>\\n<li>Can use a generalized solution.</li>\\n</ul>\\n<p>Use other machine learning solutions when you:</p>\\n<ul>\\n<li>Need to choose the algorithm and need to train on very specific data.</li>\\n</ul>\\n<h2 id=\\\"what-is-machine-learning\\\">What is machine learning?</h2>\\n<p>Machine learning is a concept where you bring together data and an algorithm to solve a specific need. Once the data and algorithm are trained, the output is a model that you can use again with different data. The trained model provides insights based on the new data.</p>\\n<p>The process of building a machine learning system requires some knowledge of machine learning or data science.</p>\\n<p>Machine learning is provided using <a 
href=\\\"/azure/architecture/data-guide/technology-choices/data-science-and-machine-learning?.context=azure%2fmachine-learning%2fstudio%2fcontext%2fml-context\\\">Azure Machine Learning (AML) products and services</a>.</p>\\n<h2 id=\\\"what-is-an-azure-ai-service\\\">What is an Azure AI service?</h2>\\n<p>An Azure AI service provides part or all of the components in a machine learning solution: data, algorithm, and trained model. These services are meant to require general knowledge about your data without needing experience with machine learning or data science. These services provide both REST API(s) and language-based SDKs. As a result, you need to have programming language knowledge to use the services.</p>\",\"id\": null, \"title\": \"Azure AI services and machine learning\", \"filepath\": \"cognitive-services-and-machine-learning.md\", \"url\": \"https://krpraticstorageacc.blob.core.windows.net/azure-openai/cognitive-services-and-machine-learning.md\", \"metadata\": {\"chunking\": \"orignal document size=1188. Scores=5.689296 and None.Org Highlight count=160.Filtering to chunk no. 1/Highlights=67 of size=506\"}, \"chunk_id\": \"1\"}, {\"content\": \"<hr />\\n<p>title: Azure AI services and Machine Learning\\ntitleSuffix: Azure AI services\\ndescription: Learn where Azure AI services fits in with other Azure offerings for machine learning.\\nservices: cognitive-services\\nmanager: nitinme\\nauthor: aahill\\nms.author: aahi\\nms.service: cognitive-services\\nms.topic: conceptual\\nms.date: 10/28/2021</p>\\n<hr />\\n<h1 id=\\\"azure-ai-services-and-machine-learning\\\">Azure AI services and machine learning</h1>\\n<p>Azure AI services provides machine learning capabilities to solve general problems such as analyzing text for emotional sentiment or analyzing images to recognize objects or faces..You don't need special machine learning or data science knowledge to use these services../what-are-ai-services.md\\\">Azure AI services</a> is a group of services, each supporting different, generalized prediction capabilities..The services are divided into different categories to help you find the right service..</p>\\n<table>\\n<thead>\\n<tr>\\n<th>Service category</th>\\n<th>Purpose</th>\\n</tr>\\n</thead>\\n<tbody>\\n<tr>\\n<td><a 
href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/decision/\\\">Decision</a></td>\\n<td>Build apps that surface recommendations for informed and efficient decision-making..</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/lang/\\\">Language</a></td>\\n<td>Allow your apps to process natural language with pre-built scripts, evaluate sentiment and learn how to recognize what users want..</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/search/\\\">Search</a></td>\\n<td>Add Bing Search APIs to your apps and harness the ability to comb billions of webpages, images, videos, and news with a single API call..</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"https://azure.microsoft.com/services/cognitive-services/directory/speech/\\\">Speech</a></td>\",\"id\": null, \"title\": \"Azure AI services and machine learning\", \"filepath\": \"cognitive-services-and-machine-learning.md\", \"url\": \"https://krpraticstorageacc.blob.core.windows.net/azure-openai/cognitive-services-and-machine-learning.md\", \"metadata\": {\"chunking\": \"orignal document size=1188. Scores=5.689296 and None.Org Highlight count=160.Filtering to chunk no. 0/Highlights=63 of size=526\"}, \"chunk_id\": \"0\"}, {\"content\": \"<p>How is Azure Cognitive Search related to Azure AI services?</p>\\n<p><a href=\\\"../search/search-what-is-azure-search.md\\\">Azure Cognitive Search</a> is a separate cloud search service that optionally uses Azure AI services to add image and natural language processing to indexing workloads. Azure AI services is exposed in Azure Cognitive Search through <a href=\\\"../search/cognitive-search-predefined-skills.md\\\">built-in skills</a> that wrap individual APIs. 
You can use a free resource for walkthroughs, but plan on creating and attaching a <a href=\\\"../search/cognitive-search-attach-cognitive-services.md\\\">billable resource</a> for larger volumes.</p>\\n<h2 id=\\\"how-can-you-use-azure-ai-services\\\">How can you use Azure AI services?</h2>\\n<p>Each service provides information about your data. You can combine services together to chain solutions such as converting speech (audio) to text, translating the text into many languages, then using the translated languages to get answers from a knowledge base. While Azure AI services can be used to create intelligent solutions on their own, they can also be combined with traditional machine learning projects to supplement models or accelerate the development process. </p>\\n<p>Azure AI services that provide exported models for other machine learning tools:</p>\\n<table>\\n<thead>\\n<tr>\\n<th>Azure AI service</th>\\n<th>Model information</th>\\n</tr>\\n</thead>\\n<tbody>\\n<tr>\\n<td><a href=\\\"./custom-vision-service/overview.md\\\">Custom Vision</a></td>\\n<td><a href=\\\"./custom-vision-service/export-model-python.md\\\">Export</a> for Tensorflow for Android, CoreML for iOS11, ONNX for Windows ML</td>\\n</tr>\\n</tbody>\\n</table>\\n<h2 id=\\\"learn-more\\\">Learn more</h2>\\n<ul>\\n<li><a href=\\\"/azure/architecture/data-guide/technology-choices/data-science-and-machine-learning\\\">Architecture Guide - What are the machine learning products at Microsoft?</a></li>\\n<li><a href=\\\"../machine-learning/concept-deep-learning-vs-machine-learning.md\\\">Machine learning - Introduction to deep learning vs. 
machine learning</a></li>\\n</ul>\\n<h2 id=\\\"next-steps\\\">Next steps</h2>\\n<ul>\\n<li>Create your Azure AI services resource in the <a href=\\\"multi-service-resource.md?pivots=azportal\\\">Azure portal</a> or with <a href=\\\"./multi-service-resource.md?pivots=azcli\\\">Azure CLI</a>.</li>\\n<li>Learn how to <a href=\\\"authentication.md\\\">authenticate</a> with your Azure AI service.</li>\\n<li>Use <a href=\\\"diagnostic-logging.md\\\">diagnostic logging</a> for issue identification and debugging. </li>\\n<li>Deploy an Azure AI service in a Docker <a href=\\\"cognitive-services-container-support.md\\\">container</a>.</li>\\n<li>Keep up to date with <a href=\\\"https://azure.microsoft.com/updates/?product=cognitive-services\\\">service updates</a>.</li>\\n</ul>\",\"id\": null, \"title\": \"Azure AI services and machine learning\", \"filepath\": \"cognitive-services-and-machine-learning.md\", \"url\": \"https://krpraticstorageacc.blob.core.windows.net/azure-openai/cognitive-services-and-machine-learning.md\", \"metadata\": {\"chunking\": \"orignal document size=793. 
Scores=3.3767838 and None.Org Highlight count=69.\"}, \"chunk_id\": \"3\"}, {\"content\": \"<p>How are Azure AI services different from machine learning?.</p>\\n<p>Azure AI services provide a trained model for you..This brings data and an algorithm together, available from a REST API(s) or SDK..An Azure AI service provides answers to general problems such as key phrases in text or item identification in images..</p>\\n<p>Machine learning is a process that generally requires a longer period of time to implement successfully..This time is spent on data collection, cleaning, transformation, algorithm selection, model training, and deployment to get to the same level of functionality provided by an Azure AI service..With machine learning, it is possible to provide answers to highly specialized and/or specific problems..Machine learning problems require familiarity with the specific subject matter and data of the problem under consideration, as well as expertise in data science..</p>\\n<h2 id=\\\"what-kind-of-data-do-you-have\\\">What kind of data do you have?.</h2>\\n<p>Azure AI services, as a group of services, can require none, some, or all custom data for the trained model..</p>\\n<h3 id=\\\"no-additional-training-data-required\\\">No additional training data required</h3>\\n<p>Services that provide a fully-trained model can be treated as a <em>opaque box</em>..You don't need to know how they work or what data was used to train them..</p>\\n<h3 id=\\\"some-or-all-training-data-required\\\">Some or all training data required</h3>\\n<p>Some services allow you to bring your own data, then train a model..This allows you to extend the model using the Service's data and algorithm with your own data..The output matches your needs..When you bring your own data, you may need to tag the data in a way specific to the service..For example, if you are training a model to identify flowers, you can provide a catalog of flower images along with the location of the flower in each 
image to train the model..These services process significant amounts of model data..</p>\\n<h2 id=\\\"service-requirements-for-the-data-model\\\">Service requirements for the data model</h2>\\n<p>The following data categorizes each service by which kind of data it allows or requires..</p>\\n<table>\\n<thead>\\n<tr>\\n<th>Azure AI service</th>\\n<th>No training data required</th>\\n<th>You provide some or all training data</th>\\n<th>Real-time or near real-time data collection</th>\\n</tr>\\n</thead>\\n<tbody>\\n<tr>\\n<td><a href=\\\"../LUIS/what-is-luis.md\\\">Language Understanding (LUIS)</a></td>\\n<td></td>\\n<td>x</td>\\n<td></td>\\n</tr>\\n<tr>\\n<td><a href=\\\"../personalizer/what-is-personalizer.md\\\">Personalizer</a><sup>1</sup></sup></td>\\n<td>x</td>\\n<td>x</td>\\n<td>x</td>\\n</tr>\\n<tr>\\n<td><a href=\\\"../computer-vision/overview.md\\\">Vision</a></td>\\n<td>x</td>\\n<td></td>\\n<td></td>\\n</tr>\\n</tbody>\\n</table>\\n<p><sup>1</sup> Personalizer only needs training data collected by the service (as it operates in real-time) to evaluate your policy and data..</p>\\n<h2 id=\\\"where-can-you-use-azure-ai-services\\\">Where can you use Azure AI services?.</h2>\\n<p>The services are used in any application that can make REST API(s) or SDK calls..Examples of applications include web sites, bots, virtual or mixed reality, desktop and mobile applications.\",\"id\": null, \"title\": \"Azure AI services and machine learning\", \"filepath\": \"cognitive-services-and-machine-learning.md\", \"url\": \"https://krpraticstorageacc.blob.core.windows.net/azure-openai/cognitive-services-and-machine-learning.md\", \"metadata\": {\"chunking\": \"orignal document size=1734. Scores=3.1447978 and None.Org Highlight count=66.Filtering to highlight size=891\"}, \"chunk_id\": \"4\"}], \"intent\": \"[\\\"What are the differences between Azure Machine Learning and Azure AI services?\\\"]\"}",
"end_turn": false
}
]
}
}
}
]
}
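assistant: Azure AI services and Azure Machine Learning (AML) both aim to apply artificial intelligence (AI) to enhance business operations, but they target different audiences and offer different capabilities [doc1].
Azure AI services are designed for developers without machine learning experience and provide pre-trained models to solve general problems such as text analysis, image recognition, and natural language processing [doc5]. These services require a general understanding of your data, with no machine learning or data science experience needed, and provide REST APIs and language-based SDKs [doc2].
Azure Machine Learning, on the other hand, is tailored for data scientists and offers a platform to build, train, and deploy custom machine learning models [doc1]. It requires knowledge of machine learning or data science and lets users choose the algorithm and train on very specific data [doc2].
In summary, Azure AI services offer pre-trained models for developers without machine learning experience, while Azure Machine Learning is a platform for data scientists to build and deploy custom machine learning models.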
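The context property returned with the completion carries its citations as a JSON-encoded string inside a message with the role tool, so the string has to be decoded before the sources can be inspected. The sketch below shows one way to do that. The helper name `extract_citations` is hypothetical, and the sample message is a shortened stand-in shaped like the response printed earlier, not real service output.

```python
import json
from typing import Any, Dict, List


def extract_citations(message: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect citation records from the tool messages in a message's `context`."""
    citations: List[Dict[str, Any]] = []
    for context_message in message.get("context", {}).get("messages", []):
        if context_message.get("role") != "tool":
            continue
        # The tool message content is itself a JSON document stored as a string.
        payload = json.loads(context_message["content"])
        citations.extend(payload.get("citations", []))
    return citations


# Abbreviated stand-in for completion["choices"][0]["message"].
sample_message = {
    "role": "assistant",
    "content": "An answer with a citation marker [doc1].",
    "context": {
        "messages": [
            {
                "role": "tool",
                "content": json.dumps(
                    {
                        "citations": [
                            {
                                "title": "Azure AI services and machine learning",
                                "filepath": "cognitive-services-and-machine-learning.md",
                                "chunk_id": "1",
                            }
                        ],
                        "intent": "[]",
                    }
                ),
            }
        ]
    },
}

for citation in extract_citations(sample_message):
    print(citation["title"], citation["filepath"], citation["chunk_id"])
```

With a real response, the same function should work on `completion["choices"][0]["message"]`, assuming the response object supports dictionary-style access as it does in the 0.28 series of the openai library.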