Combining ChatGPT APIs with Python opens doors to building extraordinary AI applications. By leveraging these APIs, you can focus on the application logic and user experience, while ChatGPT’s robust NLP capabilities handle the intricacies of human-like text understanding and generation.
This book is a guide for beginners to master the ChatGPT, Whisper, and DALL-E APIs by building ten innovative AI projects. These projects offer practical experience in integrating ChatGPT with frameworks and tools such as Flask, Django, Microsoft Office APIs, and PyQt.
Throughout this book, you’ll get to grips with performing NLP tasks, building a ChatGPT clone, and creating an AI-driven code bug fixing SaaS application. You’ll also cover speech recognition, text-to-speech functionalities, language translation, and generation of email replies and PowerPoint presentations. This book teaches you how to fine-tune ChatGPT and generate AI art using DALL-E APIs, and then offers insights into selling your apps by integrating ChatGPT API with Stripe. With practical examples available on GitHub, the book gradually progresses from easy to advanced topics, cultivating the expertise required to develop, deploy, and monetize your own groundbreaking applications by harnessing the full potential of ChatGPT APIs.
Master ChatGPT, Whisper, and DALL-E APIs by building ten innovative AI projects
Martin Yanev
BIRMINGHAM—MUMBAI
Copyright © 2023 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Group Product Manager: Niranjan Naikwadi
Publishing Product Manager: Tejashwini R
Book Project Manager: Sonam Pandey
Senior Editor: Aamir Ahmed
Technical Editor: Simran Ali
Copy Editor: Safis Editing
Proofreader: Safis Editing
Indexer: Manju Arasan
Production Designer: Shyam Sundar Korumilli
DevRel Marketing Coordinator: Vinishka Kalra
First published: September 2023
Production reference: 1010923
Published by Packt Publishing Ltd.
Grosvenor House
11 St Paul’s Square
Birmingham
B3 1RB, UK.
ISBN 978-1-80512-756-7
www.packtpub.com
To my girlfriend, Xhulja Kola, for being my loving partner throughout our joint life journey. To my grandma, Maria, whose unwavering support and boundless love have shaped me into the person I am today. To my parents, Plamen and Zhana, who have been my pillars of strength.
– Martin Yanev
Martin Yanev is a highly accomplished software engineer with a wealth of expertise spanning diverse industries, including aerospace and medical technology. With an illustrious career of over 8 years, Martin has carved a niche for himself in developing and seamlessly integrating cutting-edge software solutions for critical domains such as air traffic control and chromatography systems.
Renowned as an esteemed instructor, Martin has empowered an impressive global community of over 280,000 students. His instructional prowess shines through as he imparts knowledge and guidance, leveraging his extensive proficiency in frameworks such as Flask, Django, Pytest, and TensorFlow. Possessing a deep understanding of the complete spectrum of OpenAI APIs, Martin exhibits mastery in constructing, training, and fine-tuning AI systems.
Martin’s commitment to excellence is exemplified by his dual master’s degrees in aerospace systems and software engineering. This remarkable academic achievement underscores his unwavering dedication to both the practical and theoretical facets of the industry. With his exceptional track record and multifaceted skill set, Martin continues to propel innovation and drive transformative advancements in the ever-evolving landscape of software engineering.
Sourabh Sharma works at Oracle as a lead technical member, where he is responsible for designing and developing key components of blueprint solutions. He was a key member of the team that designed the architecture used by various Oracle products. He has over 20 years of experience delivering enterprise products and applications for leading companies. His expertise lies in conceptualizing, modeling, designing, and developing N-tier and cloud-based applications, as well as leading teams. He has vast experience in developing microservice-based solutions and implementing various types of workflow and orchestration engines. He also believes in continuous learning and sharing knowledge through his books and training.
Sourabh has also worked on two other books: Mastering Microservices with Java, Third Edition, and Modern API Development with Spring and Spring Boot, Second Edition.
Arindam Ganguly is an experienced data scientist who has worked in the software development industry for more than 7 years. He has proven skill sets in developing and managing a number of software products, mainly in the field of data science and artificial intelligence. He is also a published author and has written a book, Build and Deploy Machine Learning Solutions using IBM Watson, which teaches how to build artificial intelligence applications using the popular IBM Watson toolkit.
Ashutosh Vishwakarma is the co-founder of a conversational AI company called Verifast.tech. He has developed multiple high-scale and ML-based systems over the past 8 years as a backend engineer and is currently working with the LLM ecosystem to build next-gen UX.
In the first part, encompassing two chapters, the focus is on providing a comprehensive overview of ChatGPT and its significance for natural language processing (NLP). We will discuss the fundamentals of ChatGPT, exploring its impact and usage in web applications, as well as introducing readers to the ChatGPT API. This part demonstrates the process of building a ChatGPT clone, which is a chatbot that utilizes OpenAI’s language model to generate human-like responses to user input. The application will be built using Flask, a lightweight web framework for Python.
This part has the following chapters:
Chapter 1, Beginning with the ChatGPT API for NLP Tasks

Chapter 2, Building a ChatGPT Clone

Natural Language Processing (NLP) is an area of artificial intelligence that focuses on the interaction between computers and humans through natural language. Over the years, NLP has made remarkable progress in the field of language processing, and ChatGPT is one such revolutionary NLP tool that has gained significant popularity in recent years.
ChatGPT is an advanced AI language model developed by OpenAI, and it has been trained on a massive dataset of diverse texts, including books, articles, and web pages. With its ability to generate human-like text, ChatGPT has become a go-to tool for many NLP applications, including chatbots, language translation, and content generation.
In this chapter, we will explore the basics of ChatGPT and how you can use it for your NLP tasks. We will start with an introduction to ChatGPT and its impact on the field of NLP. Then we will explore how to use ChatGPT from the web and its benefits. Next, we will learn how to get started with the ChatGPT API, including creating an account and generating API keys. After that, we will take a walk-through of setting up your development environment to work with the ChatGPT API. Finally, we will see an example of a simple ChatGPT API response to understand the basic functionalities of the tool.
In this chapter, we will cover the following topics:
The ChatGPT Revolution
Using ChatGPT from the Web
Getting Started with the ChatGPT API
Setting Up Your Python Development Environment
A Simple ChatGPT API Response

By the end of this chapter, you will have solid experience with ChatGPT, and you will know how to use it to perform NLP tasks efficiently.
To get the most out of this chapter, you will need some basic tools to work with the Python code and the ChatGPT APIs. This chapter will guide you through all software installations and registrations.
You will require the following:
Python 3.7 or later installed on your computer
An OpenAI API key, which can be obtained by signing up for an OpenAI account
A code editor, such as PyCharm (recommended), to write and run Python code

The code examples from this chapter can be found on GitHub at https://github.com/PacktPublishing/Building-AI-Applications-with-ChatGPT-APIs/tree/main/Chapter01%20ChatGPTResponse.
ChatGPT is an advanced AI language model developed by OpenAI, and it has made a significant impact on the field of natural language processing (NLP). The model is based on the transformer architecture, and it has been trained on a massive dataset of diverse texts, including books, articles, and web pages.
One of the key features of ChatGPT is its ability to generate text that is coherent and contextually appropriate. In contrast to earlier NLP models, ChatGPT possesses a more extensive comprehension of language, and it can generate text that is similar in style and structure to human-generated text. This feature has made ChatGPT a valuable tool for various applications, including conversational AI and content creation.
ChatGPT has also made significant progress in the field of conversational AI, where it has been used to develop chatbots that can interact with humans naturally. With its ability to understand context and generate text that is similar in style to human-generated text, ChatGPT has become a go-to tool for developing conversational AI.
The emergence of large language models (LLMs) such as GPT-3 has revolutionized the landscape of chatbots. Prior to LLMs, chatbots were limited in their capabilities, relying on rule-based systems with predefined responses. These chatbots lacked contextual understanding and struggled to engage in meaningful conversations. However, with LLM-based chatbots, there has been a significant transformation. These models comprehend complex queries, generate coherent and nuanced responses, and possess a broader knowledge base. They exhibit improved contextual understanding, learn from user interactions, and continually enhance their performance. LLM-based chatbots have elevated the user experience by providing more natural and personalized interactions, showcasing the remarkable advancements in chatbot technology.
ChatGPT has a long and successful history in the field of NLP. The model has undergone several advancements over the years, including the following:
GPT-1 (2018): Had 117 million parameters and was trained on a diverse set of web pages. It demonstrated impressive results in various NLP tasks, including question-answering, sentiment analysis, and language translation.
GPT-2 (2019): Had 1.5 billion parameters and was trained on over 8 million web pages. It showed remarkable progress in language understanding and generation and became a widely used tool for various NLP applications.
GPT-3 (2020): Had a record-breaking 175 billion parameters and set a new benchmark for language understanding and generation. It was used for various applications, including chatbots, language translation, and content creation.
GPT-3.5: Released after continued refinement and improvement of GPT-3 by OpenAI.
GPT-4: Can solve difficult problems with greater accuracy, thanks to its broader general knowledge and problem-solving abilities.

Developers can harness the power of GPT models without having to train their own models from scratch. This can save a lot of time and resources, especially for smaller teams or individual developers.
In the next section, you will learn how to use ChatGPT from the web. You will learn how to create an OpenAI account and explore the ChatGPT web interface.
Interacting with ChatGPT via the OpenAI website is incredibly straightforward. OpenAI provides a web-based interface that can be found at https://chat.openai.com, enabling users to engage with the model without any prior coding knowledge or setup required. Once you visit the website, you can begin entering your questions or prompts, and the model will produce its best possible answer or generated text. Notably, ChatGPT Web also provides users with various settings and options that allow them to track the conversation’s context and save the history of all interactions with the AI. This feature-rich approach to web-based AI interactions allows users to effortlessly experiment with the model’s capabilities and gain insight into its vast potential applications. To get started with the web-based interface, you’ll need to register for an account with OpenAI, which we will cover in detail in the next section. Once you’ve created an account, you can access the web interface and begin exploring the model’s capabilities, including various settings and options to enhance your AI interactions.
Before using ChatGPT or the ChatGPT API, you must create an account on the OpenAI website, which will give you access to all the tools that the company has developed. To do that, you can visit https://chat.openai.com, where you will be asked to either log in or sign up for a new account, as shown in Figure 1.1:
Figure 1.1: OpenAI Welcome Window
Simply click the Sign up button and follow the prompts to access the registration window (see Figure 1.2). From there, you have the option to enter your email address and click Continue, or you can opt to register using your Google or Microsoft account. Once this step is complete, you can select a password and validate your email, just like with any other website registration process.
After completing the registration process, you can begin exploring ChatGPT’s full range of features. Simply click the Log in button depicted in Figure 1.1 and enter your credentials into the Log In window. Upon successfully logging in, you’ll gain full access to ChatGPT and all other OpenAI products. With this straightforward approach to access, you can seamlessly explore the full capabilities of ChatGPT and see firsthand why it’s become such a powerful tool for natural language processing tasks.
Figure 1.2: OpenAI Registration Window
Now we can explore the features and functionality of the ChatGPT web interface in greater detail. We’ll show you how to navigate the interface and make the most of its various options to get the best possible results from the AI model.
The ChatGPT web interface allows users to interact with the AI model. Once a user registers for the service and logs in, they can enter text prompts or questions into a chat window and receive responses from the model. You can ask ChatGPT anything using the Send a message… text field. The chat window also displays previous messages and prompts, allowing users to keep track of the conversation’s context, as shown in Figure 1.3:
Figure 1.3: ChatGPT Following Conversational Context
In addition to that, ChatGPT allows users to easily record the history of their interactions with the model. Chat logs are saved automatically and can later be accessed from the left sidebar for reference or analysis. This feature is especially useful for researchers or individuals who want to keep track of their conversations with the model and evaluate its performance over time. The chat logs can also be used to train other models or to compare the performance of different models.

You can now distinguish between the different ChatGPT models and their advancements, and you know how to use ChatGPT from the web, from creating an account to navigating the chat interface. The ChatGPT API goes a step further: it is flexible and customizable, and it can save developers time and resources, making it an ideal choice for chatbots, virtual assistants, and automated content generation. In the next section, you will learn how to access the ChatGPT API easily using Python.
The ChatGPT API is an application programming interface developed by OpenAI that allows developers to interact with Generative Pre-trained Transformer (GPT) models for natural language processing (NLP) tasks. This API provides an easy-to-use interface for generating text, completing prompts, answering questions, and carrying out other NLP tasks using state-of-the-art machine learning models.
The ChatGPT API is used for chatbots, virtual assistants, and automated content generation. It can also be used for language translation, sentiment analysis, and content classification. The API is flexible and customizable, allowing developers to fine-tune the model’s performance for their specific use case. Let’s now discover the process of obtaining an API key. This is the first step to accessing the ChatGPT API from your own applications.
To use the ChatGPT API, you will need to obtain an API key from OpenAI. This key will allow you to authenticate your requests to the API and ensure that only authorized users can access your account.
To obtain an API key, you must access the OpenAI Platform at https://platform.openai.com using your ChatGPT credentials. The OpenAI Platform page provides a central hub for managing your OpenAI resources. Once you have signed up, you can navigate to the API access page: https://platform.openai.com/account/api-keys. On the API access page, you can manage your API keys for the ChatGPT API and other OpenAI services. You can generate new API keys, view and edit the permissions associated with each key, and monitor your usage of the APIs. The page provides a clear overview of your API keys, including their names, types, and creation dates, and allows you to easily revoke or regenerate keys as needed.
Click on the +Create new secret key button and your API key will be created:
Figure 1.4: Creating an API Key
After creating your API key, you will only have one chance to copy it (see Figure 1.5). It’s important to keep your API key secure and confidential, as anyone who has access to your key could potentially access your account and use your resources. You should also be careful not to share your key with unauthorized users and avoid committing your key to public repositories or sharing it in plain text over insecure channels.
Figure 1.5: Saving an API Key
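Once you have copied the key, you need a way to hand it to your Python code without hard-coding it anywhere it could be committed by accident. A common pattern, shown in the minimal sketch below, is to store the key in an environment variable and read it at runtime. The variable name OPENAI_API_KEY is just a convention chosen here for illustration, and the openai.api_key attribute follows the 0.x openai package that we will install with pip while setting up the development environment later in this chapter.

```python
import os

import openai  # third-party package, installed with pip later in this chapter

# Read the key from an environment variable so it never appears in your source code.
openai.api_key = os.getenv("OPENAI_API_KEY")

if openai.api_key is None:
    raise RuntimeError("Set the OPENAI_API_KEY environment variable before running this script.")
```

If the variable is set, the script runs silently and every subsequent call made through the openai package will be authenticated with your key; if it is not set, the script fails immediately with a clear message instead of an obscure authentication error.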
With the API key saved, we can supply it to our applications and scripts and start calling the ChatGPT API. Now, let's examine ChatGPT tokens and their role in the OpenAI pricing model.
When working with ChatGPT APIs, it’s important to understand the concept of tokens. Tokens are the basic units of text used by models to process and understand the input and output text.
Tokens can be words or chunks of characters and are created by breaking down the text into smaller pieces. For instance, the word “hamburger” can be broken down into “ham,” “bur,” and “ger,” while a shorter word such as “pear” is a single token. Tokens can also start with whitespace, such as “ hello” or “ bye”.
The number of tokens used in an API request depends on the length of both the input and output text. As a rule of thumb, one token corresponds to approximately 4 characters or 0.75 words in English text. It’s important to note that the combined length of the text prompt and generated response must not exceed the maximum context length of the model. Table 1.1 shows the token limits of some of the popular ChatGPT models.
MODEL               MAX TOKENS
gpt-4               8,192 tokens
gpt-4-32k           32,768 tokens
gpt-3.5-turbo       4,096 tokens
text-davinci-003    4,096 tokens

Table 1.1: API model token limits
To learn more about how text is translated into tokens, you can check out OpenAI's Tokenizer tool. This helpful resource breaks text down into individual tokens and displays their corresponding byte offsets, which can be useful for analyzing and understanding the structure of your text.
You can find the tokenizer tool at https://platform.openai.com/tokenizer. To use the tokenizer tool, simply enter the text you want to analyze and select the appropriate model and settings. The tool will then generate a list of tokens, along with their corresponding byte offsets (see Figure 1.6).
Figure 1.6: The Tokenizer Tool
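If you would rather count tokens from code than through the web tool, OpenAI also publishes the tiktoken package, which uses the same encodings as the models. It is not covered further in this chapter, so treat the snippet below as an optional aside rather than part of the project setup:

```python
import tiktoken  # install with: pip install tiktoken

# Load the encoding used by gpt-3.5-turbo and count the tokens in a sample prompt.
encoding = tiktoken.encoding_for_model("gpt-3.5-turbo")
prompt = "ChatGPT is great at natural language processing tasks."

token_ids = encoding.encode(prompt)
print(f"Token count: {len(token_ids)}")
print(token_ids)  # the integer IDs the model actually processes
```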
The ChatGPT API pricing is structured such that you are charged per 1,000 tokens processed, with a minimum charge per API request. This means that the longer your input and output texts are, the more tokens will be processed and the higher the cost will be. Table 1.2 displays the cost of processing 1,000 tokens for several commonly used ChatGPT models.
MODEL               PROMPT                 COMPLETION
gpt-4               $0.03 / 1K tokens      $0.06 / 1K tokens
gpt-4-32k           $0.06 / 1K tokens      $0.12 / 1K tokens
gpt-3.5-turbo       $0.002 / 1K tokens     $0.002 / 1K tokens
text-davinci-003    $0.0200 / 1K tokens    $0.0200 / 1K tokens

Table 1.2: ChatGPT API Model Pricing
Important note
It is important to keep an eye on your token usage to avoid unexpected charges. You can track your usage and monitor your billing information through the Usage dashboard at https://platform.openai.com/account/usage.
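To get a feel for what these prices mean in practice, here is a small, self-contained sketch that estimates the cost of a single request from the figures in Table 1.2. The token counts are invented for illustration; in a real application you would take them from the usage information the API returns or from the Usage dashboard:

```python
# Prices per 1,000 tokens from Table 1.2, in US dollars: (prompt, completion).
PRICES_PER_1K = {
    "gpt-4": (0.03, 0.06),
    "gpt-4-32k": (0.06, 0.12),
    "gpt-3.5-turbo": (0.002, 0.002),
    "text-davinci-003": (0.02, 0.02),
}

def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the dollar cost of one API request."""
    prompt_price, completion_price = PRICES_PER_1K[model]
    return (prompt_tokens / 1000) * prompt_price + (completion_tokens / 1000) * completion_price

# Example: a 500-token prompt that produces a 300-token reply with gpt-3.5-turbo.
print(f"${estimate_cost('gpt-3.5-turbo', 500, 300):.4f}")  # prints $0.0016
```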
As you can see, the ChatGPT API has an easy-to-use interface that allows developers to interact with GPT models for natural language processing tasks. Tokens are the basic units of text used by the models to process and understand the input and output text. The pricing structure for the ChatGPT API is based on the number of tokens processed, with a minimum charge per API request.
In the next section, we will cover how to set up the Python development environment for working with the ChatGPT API. This involves installing Python and the PyCharm IDE, setting up a virtual environment, and installing the necessary Python packages. Additionally, we will give you instructions on how to create a Python virtual environment using the built-in venv module and how to access the Terminal tab within PyCharm.
Before we start writing our first code, it’s important to create an environment to work in and install any necessary dependencies. Fortunately, Python has an excellent tooling system for managing virtual environments. Virtual environments in Python are a complex topic, but for the purposes of this book, it’s enough to know that they are isolated Python environments that are separate from your global Python installation. This isolation allows developers to work with different Python versions, install packages within the environment, and manage project dependencies without interfering with Python’s global installation.
In order to utilize the ChatGPT API in your NLP projects, you will need to set up your Python development environment. This section will guide you through the necessary steps to get started, including the following:
Installing Python
Installing the PyCharm IDE
Installing pip
Setting up a virtual environment
Installing the required Python packages

A properly configured development environment will allow you to make API requests to ChatGPT and process the resulting responses in your Python code.
Python is a popular programming language that is widely used for various purposes, including machine learning and data analysis. You can download and install the latest version of Python from the official website, https://www.python.org/downloads/. Once you have downloaded the Python installer, simply follow the instructions to install Python on your computer. The next step is to choose an Integrated Development Environment (IDE) to work with (see Figure 1.7).
Figure 1.7: Python Installation
One popular choice among Python developers is PyCharm, a powerful and user-friendly IDE developed by JetBrains. PyCharm provides a wide range of features that make it easy to develop Python applications, including code completion, debugging tools, and project management capabilities.
To install PyCharm, you can download the Community Edition for free from the JetBrains website, https://www.jetbrains.com/pycharm/download/. Once you have downloaded the installer, simply follow the instructions to install PyCharm on your computer.
Setting up a Python virtual environment is a crucial step in creating an isolated development environment for your project. By creating a virtual environment, you can install specific versions of Python packages and dependencies without interfering with other projects on your system.
Creating a Python virtual environment specific to your ChatGPT application project is a recommended best practice. By doing so, you can ensure that all the packages and dependencies are saved inside your project folder rather than cluttering up your computer’s global Python installation. This approach provides a more organized and isolated environment for your project’s development and execution.
PyCharm allows you to set up the Python virtual environment directly during the project creation process. Once installed, you can launch PyCharm and start working with Python. Upon launching PyCharm, you will see the Welcome Window, and from there, you can create a new project. By doing so, you will be directed to the New Project window, where you can specify your desired project name and, more importantly, set up your Python virtual environment. To do this, you need to ensure that New environment using is selected. This option will create a copy of the Python version installed on your device and save it to your local project.
As you can see from Figure 1.8, the Location field displays the directory path of your local Python virtual environment situated within your project directory. Beneath it, Base interpreter displays the installed Python version on your system. Clicking the Create button will initiate the creation of your new project.
Figure 1.8: PyCharm Project Setup
Figure 1.9 displays the two main indicators showing that the Python virtual environment is correctly installed and activated. One of these indications is the presence of a venv folder within your PyCharm project, which proves that the environment is installed. Additionally, you should observe Python 3.11 (ChatGPTResponse) in the lower-right corner, confirming that your virtual environment has been activated successfully.
Figure 1.9: Python Virtual Environment Indications
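PyCharm's wizard is the route used throughout this book, but the same kind of environment can also be created with Python's built-in venv module if you ever need to work outside the IDE. The snippet below is a minimal sketch of that alternative; it is the programmatic equivalent of running python -m venv venv in a terminal, and the folder name venv is simply the convention PyCharm also uses:

```python
# Programmatic equivalent of running "python -m venv venv" from a terminal.
import venv

# Creates a ./venv folder with an isolated interpreter and pip available inside it.
venv.create("venv", with_pip=True)
```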
A key component needed to install any package in Python is pip. Let's see how to check whether pip is already installed on your system, and how to install it if necessary.
pip is a package installer for Python. It allows you to easily install and manage third-party Python libraries and packages such as openai. If you are using a recent version of Python, pip should already be installed. You can check whether pip is installed on your system by opening a command prompt or terminal and typing pip followed by the Enter key. If pip is installed, you should see some output describing its usage and commands.
If pip is not installed on your system, you can install it by following these steps:
1. Download the get-pip.py script from the official Python website: https://bootstrap.pypa.io/get-pip.py.
2. Save the file to a location on your computer that you can easily access, such as your desktop or downloads folder.
3. Open a command prompt or terminal and navigate to the directory where you saved the get-pip.py file.
4. Run the following command to install pip: python get-pip.py
5. Once the installation is complete, verify that pip is installed by typing pip in the command prompt or terminal again.