
OpenAI Developer Day | November 7, 2023

Created November 7, 2023
This article covers OpenAI's Developer Day on November 7, 2023, including the launch of new models and developer products, feature updates, price adjustments, and related announcements. Key takeaways:
1. New models: GPT-4 Turbo, which is more capable, has knowledge up to April 2023, offers a 128K context window, and is cheaper for both input and output; plus an updated GPT-3.5 Turbo that supports a 16K context window by default.
2. Feature updates: function calling can now invoke multiple functions in a single call with higher accuracy; GPT-4 Turbo follows instructions better and supports a new JSON mode; a new seed parameter enables reproducible outputs, and a log-probability feature is on the way.
3. New API: the Assistants API, which helps developers build agent-like experiences and includes Code Interpreter, Retrieval, and function calling.
4. Multimodal capabilities: GPT-4 Turbo accepts image inputs; DALL·E 3 image generation is available through the API; high-quality speech can be generated via the text-to-speech API.
5. Model customization: an experimental access program for GPT-4 fine-tuning, and a Custom Models program offering bespoke models to specific organizations.
6. Pricing and rate limits: lower prices across several models, and higher tokens-per-minute limits for paying GPT-4 customers.
7. Other updates: Copyright Shield introduced to protect customers against copyright claims; Whisper large-v3 released; the Consistency Decoder open-sourced.
1. GPT, as the foundation of the AI ecosystem, keeps getting stronger; almost every update and iteration opens up new application scenarios. This round brings price cuts and broad API availability: 128K-context GPT-4 Turbo, vision, DALL·E 3, plus GPTs and a knowledge cutoff updated to April 2023. This is exactly how an ecosystem should develop.
2. Competition among large models is no longer purely about raw capability; serving concrete scenarios is what matters: lowering application costs, meeting more developers' needs, providing developer tools, and giving developers more room to build out the ecosystem.
The following is the body of the official announcement:
New models and developer products announced at DevDay
Today, we shared dozens of new additions and improvements, and reduced pricing across many parts of our platform. These include:
New GPT-4 Turbo model that is more capable, cheaper and supports a 128K context window
New Assistants API that makes it easier for developers to build their own assistive AI apps that have goals and can call models and tools
New multimodal capabilities in the platform, including vision, image creation (DALL·E 3), and text-to-speech (TTS)
We’ll begin rolling out new features to OpenAI customers starting at 1pm PT today.
GPT-4 Turbo with 128K context
We released the first version of GPT-4 in March and made GPT-4 generally available to all developers in July. Today we’re launching a preview of the next generation of this model, GPT-4 Turbo.
GPT-4 Turbo is more capable and has knowledge of world events up to April 2023. It has a 128k context window so it can fit the equivalent of more than 300 pages of text in a single prompt. We also optimized its performance so we are able to offer GPT-4 Turbo at a 3x cheaper price for input tokens and a 2x cheaper price for output tokens compared to GPT-4.
GPT-4 Turbo is available for all paying developers to try by passing gpt-4-1106-preview in the API and we plan to release the stable production-ready model in the coming weeks.
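For concreteness, the snippet below is a minimal sketch of trying the preview model with the official Python SDK (openai v1.x, not part of the announcement itself); the client reads OPENAI_API_KEY from the environment and the prompt is purely illustrative.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Trying GPT-4 Turbo is just a matter of passing the preview model name.
response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize the DevDay announcements in one sentence."},
    ],
)
print(response.choices[0].message.content)
```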
Function calling updates
Function calling lets you describe functions of your app or external APIs to models, and have the model intelligently choose to output a JSON object containing arguments to call those functions. We’re releasing several improvements today, including the ability to call multiple functions in a single message: users can send one message requesting multiple actions, such as “open the car window and turn off the A/C”, which would previously require multiple roundtrips with the model (learn more). We are also improving function calling accuracy: GPT-4 Turbo is more likely to return the right function parameters.
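As a rough sketch of what parallel function calling looks like from the caller's side, the request below defines two hypothetical tools (control_window and set_ac, invented here for illustration) and lets the model return both tool calls in a single assistant message.

```python
from openai import OpenAI

client = OpenAI()

# Two hypothetical tool schemas, modeled on the car example above.
tools = [
    {
        "type": "function",
        "function": {
            "name": "control_window",
            "description": "Open or close a car window",
            "parameters": {
                "type": "object",
                "properties": {"action": {"type": "string", "enum": ["open", "close"]}},
                "required": ["action"],
            },
        },
    },
    {
        "type": "function",
        "function": {
            "name": "set_ac",
            "description": "Turn the air conditioning on or off",
            "parameters": {
                "type": "object",
                "properties": {"state": {"type": "string", "enum": ["on", "off"]}},
                "required": ["state"],
            },
        },
    },
]

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": "Open the car window and turn off the A/C."}],
    tools=tools,
)

# With parallel function calling, both calls can arrive in one assistant
# message instead of requiring a roundtrip per action.
for call in response.choices[0].message.tool_calls:
    print(call.function.name, call.function.arguments)
```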
Improved instruction following and JSON mode
GPT-4 Turbo performs better than our previous models on tasks that require the careful following of instructions, such as generating specific formats (e.g., “always respond in XML”). It also supports our new JSON mode, which ensures the model will respond with valid JSON. The new API parameter response_format enables the model to constrain its output to generate a syntactically correct JSON object. JSON mode is useful for developers generating JSON in the Chat Completions API outside of function calling.
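A minimal sketch of JSON mode with the Python SDK follows (the extraction task is made up for illustration); note that the API expects the word “JSON” to appear somewhere in the messages when response_format is set this way.

```python
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    response_format={"type": "json_object"},  # constrain output to valid JSON
    messages=[
        {"role": "system", "content": "Extract the event name and date as JSON."},
        {"role": "user", "content": "OpenAI DevDay took place on November 7, 2023."},
    ],
)
print(response.choices[0].message.content)  # e.g. {"event": "OpenAI DevDay", "date": "2023-11-07"}
```

JSON mode guarantees syntactic validity, not conformance to a particular schema, so the prompt still has to spell out which fields you expect.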
Reproducible outputs and log probabilities
The new seed parameter enables reproducible outputs by making the model return consistent completions most of the time. This beta feature is useful for use cases such as replaying requests for debugging, writing more comprehensive unit tests, and generally having a higher degree of control over the model behavior. We at OpenAI have been using this feature internally for our own unit tests and have found it invaluable. We’re excited to see how developers will use it. Learn more.
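Sketched below is one way the seed parameter might be used in a regression test (the seed value and prompt are arbitrary); the response's system_fingerprint field indicates when a backend change could affect determinism despite a fixed seed.

```python
from openai import OpenAI

client = OpenAI()

# Two identical requests with the same seed should usually return the same text.
for _ in range(2):
    response = client.chat.completions.create(
        model="gpt-4-1106-preview",
        seed=12345,
        temperature=0,
        messages=[{"role": "user", "content": "Name three prime numbers."}],
    )
    print(response.system_fingerprint, response.choices[0].message.content)
```

Determinism is best-effort: as long as system_fingerprint is unchanged and every request parameter matches, the completions should mostly agree.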
We’re also launching a feature to return the log probabilities for the most likely output tokens generated by GPT-4 Turbo and GPT-3.5 Turbo in the next few weeks, which will be useful for building features such as autocomplete in a search experience.
Updated GPT-3.5 Turbo
In addition to GPT-4 Turbo, we are also releasing a new version of GPT-3.5 Turbo that supports a 16K context window by default. The new 3.5 Turbo supports improved instruction following, JSON mode, and parallel function calling. For instance, our internal evals show a 38% improvement on format following tasks such as generating JSON, XML and YAML. Developers can access this new model by calling gpt-3.5-turbo-1106 in the API. Applications using the gpt-3.5-turbo name will automatically be upgraded to the new model on December 11. Older models will continue to be accessible by passing gpt-3.5-turbo-0613 in the API until June 13, 2024. Learn more.
Assistants API, Retrieval, and Code Interpreter
Today, we’re releasing the Assistants API, our first step towards helping developers build agent-like experiences within their own applications. An assistant is a purpose-built AI that has specific instructions, leverages extra knowledge, and can call models and tools to perform tasks. The new Assistants API provides new capabilities such as Code Interpreter and Retrieval as well as function calling to handle a lot of the heavy lifting that you previously had to do yourself and enable you to build high-quality AI apps.
This API is designed for flexibility; use cases range from a natural language-based data analysis app, a coding assistant, an AI-powered vacation planner, a voice-controlled DJ, a smart visual canvas—the list goes on. The Assistants API is built on the same capabilities that enable our new GPTs product: custom instructions and tools such as Code interpreter, Retrieval, and function calling.
A key change introduced by this API is persistent and infinitely long threads, which allow developers to hand off thread state management to OpenAI and work around context window constraints. With the Assistants API, you simply add each new message to an existing thread.
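A minimal sketch of that handoff using the Python SDK's beta namespace (the message text is illustrative): the thread is created once, lives server-side, and each later turn simply appends a message to it.

```python
from openai import OpenAI

client = OpenAI()

# The thread's state is stored by OpenAI, not by your application.
thread = client.beta.threads.create()

client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What were the key DevDay announcements?",
)
# Subsequent turns keep appending to the same thread id; no manual
# context-window bookkeeping is needed on the client side.
```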
Assistants also have access to call new tools as needed, including:
Code Interpreter: writes and runs Python code in a sandboxed execution environment, and can generate graphs and charts, and process files with diverse data and formatting. It allows your assistants to run code iteratively to solve challenging code and math problems, and more (see the sketch after this list).
Retrieval: augments the assistant with knowledge from outside our models, such as proprietary domain data, product information or documents provided by your users. This means you don’t need to compute and store embeddings for your documents, or implement chunking and search algorithms. The Assistants API optimizes what retrieval technique to use based on our experience building knowledge retrieval in ChatGPT.
Function calling: enables assistants to invoke functions you define and incorporate the function response in their messages.
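Putting the pieces together, here is a hedged sketch of an assistant that relies on Code Interpreter (the name, instructions, and question are placeholders); a run executes asynchronously, so its status is polled before reading the reply from the thread.

```python
import time

from openai import OpenAI

client = OpenAI()

# An assistant equipped with the Code Interpreter tool.
assistant = client.beta.assistants.create(
    name="Math helper",
    instructions="Use Python to solve numeric questions and show your work.",
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What is the 40th Fibonacci number?",
)

# Runs are asynchronous: poll until the run completes, then read the reply.
run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant.id)
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)  # the assistant's latest reply
```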
As with the rest of the platform, data and files passed to the OpenAI API are never used to train our models and developers can delete the data when they see fit.