This book is an essential guide for mastering advanced CSS3 and SVG techniques, ideal for web developers aiming to create visually striking and interactive web content. It starts with an in-depth look at CSS3, covering shadow effects, gradients, 2D and 3D transforms, and sophisticated animations. You'll learn to implement complex styling, use media queries for responsive design, and create dynamic keyframe animations. Security practices and effective use of CSS3 are also addressed.
Moving to SVG, the book explores creating scalable vector graphics for advanced data visualizations and animations. You'll learn to use SVG elements like shapes, gradients, and Bezier curves, and how to seamlessly integrate them with CSS3 for dynamic, interactive graphics. Detailed tutorials on SVG filters, transforms, and animations enable you to build intricate visual effects and engaging content. Practical examples show how to combine CSS3 and SVG for responsive, high-performance web projects.
By the end, you’ll have the skills to craft compelling web applications using CSS3 and SVG, whether you're a beginner or an experienced developer looking to elevate your web design expertise.
Page count: 315
Year of publication: 2024
LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY
By purchasing or using this book and companion files (the “Work”), you agree that this license grants permission to use the contents contained herein, including the disc, but does not give you the right of ownership to any of the textual content in the book/disc or ownership to any of the information or products contained in it. This license does not permit uploading of the Work onto the Internet or on a network (of any kind) without the written consent of the Publisher. Duplication or dissemination of any text, code, simulations, images, etc. contained herein is limited to and subject to licensing terms for the respective products, and permission must be obtained from the Publisher or the owner of the content, etc., in order to reproduce or network any portion of the textual material (in any media) that is contained in the Work.
MERCURY LEARNING AND INFORMATION (“MLI” or “the Publisher”) and anyone involved in the creation, writing, or production of the companion disc, accompanying algorithms, code, or computer programs (“the software”), and any accompanying Web site or software of the Work, cannot and do not warrant the performance or results that might be obtained by using the contents of the Work. The author, developers, and the Publisher have used their best efforts to ensure the accuracy and functionality of the textual material and/or programs contained in this package; we, however, make no warranty of any kind, express or implied, regarding the performance of these contents or programs. The Work is sold “as is” without warranty (except for defective materials used in manufacturing the book or due to faulty workmanship).
The author, developers, and the publisher of any accompanying content, and anyone involved in the composition, production, and manufacturing of this work will not be liable for damages of any kind arising out of the use of (or the inability to use) the algorithms, source code, computer programs, or textual material contained in this publication. This includes, but is not limited to, loss of revenue or profit, or other incidental, physical, or consequential damages arising out of the use of this Work.
The sole remedy in the event of a claim of any kind is expressly limited to replacement of the book and/or disc, and only at the discretion of the Publisher. The use of “implied warranty” and certain “exclusions” varies from state to state and might not apply to the purchaser of this product.
Companion files for this title are available by writing to the publisher with proof of purchase at [email protected].
Copyright ©2024 by MERCURY LEARNING AND INFORMATION.
An Imprint of DeGruyter Inc. All rights reserved.
This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display, or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.
Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
121 High Street, 3rd Floor
Boston, MA 02110
www.merclearning.com
800-232-0223
O. Campesato. CSS3 and SVG with Claude 3.
ISBN: 978-1-50152-336-6
The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.
Library of Congress Control Number: 2024942172
24 25 26 3 2 1    This book is printed on acid-free paper in the United States of America.
Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at 800-232-0223 (toll free).
All of our titles are available in digital format at academiccourseware.com and other digital vendors. Companion files for this title are available with proof of purchase by contacting [email protected]. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the files, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.
I’d like to dedicate this book to my parents – may this bring joy and happiness into their lives.
CONTENTS
Preface
Chapter 1: The Generative AI Landscape
What Is Generative AI?
Key Features of Generative AI
Popular Techniques in Generative AI
What Makes Generative AI Different
Conversational AI Versus Generative AI
Primary Objective
Applications
Technologies Used
Training and Interaction
Evaluation
Data Requirements
Is Claude 3 Part of Generative AI?
Is DALL-E Part of Generative AI?
DALL-E 3 Sample Images
What Are LLMs?
What Are SLMs?
Popular SLMs
DeepMind
DeepMind and Games
Player of Games (PoG)
OpenAI
Cohere
Hugging Face
Hugging Face Libraries
Hugging Face Model Hub
AI21
Introduction to Claude 3
Advantages of Claude 3
Limitations of Claude 3
Interacting with Claude 3
Claude 3 Custom Instructions
Claude 3 and Prompts
Claude 3 Competitors
Google Gemini
YouChat
CoPilot (OpenAI/Microsoft)
Codex (OpenAI)
Apple GPT
PaLM-2
Llama 3
Summary
Chapter 2: Prompt Engineering
What Is Prompt Engineering?
Prompts and Completions
Types of Prompts
Instruction Prompts
Reverse Prompts
System Prompts Versus Agent Prompts
Prompt Templates
Prompts for Different LLMs
Poorly Worded Prompts
Inference Parameters
Temperature Parameter
Temperature and the softmax() Function
LLMs Try to Be Helpful
Meta AI
Gemini
GPT-4
Claude 3
Perplexity
Aspects of LLM Development
LLM Size Versus Performance
Emergent Abilities of LLMs
Kaplan and Undertrained Models
Summary
Chapter 3: Introduction to CSS3
HTML and <DOCTYPE>
CSS3 Features and Browser Support
Browser-Specific Prefixes for CSS3
A Quick Overview of CSS3 Features
CSS3 Shadow Effects and Rounded Corners
CSS3 and Text Shadow Effects
CSS3 and Box Shadow Effects
CSS3 and Rounded Corners
CSS3 Gradients
Linear Gradients
Radial Gradients
CSS3 2D Transforms
Zoom Effects with Scale Transforms
Rotate Transforms
Skew Transforms
Translate Transforms
Summary
Chapter 4: CSS3 3D Animation
A CSS3-Based Cube
CSS3 Transitions
Simple CSS3 Animation Effects
Glowing Effects
Image Fading and Rotating Effects with CSS3
Bouncing Effects
CSS3 Effects for Text
Rendering Multicolumn Text
CSS3 Media Queries
CSS3 3D Animation Effects
Summary
Chapter 5: CSS3 and Claude 3
Working with HTML
Popular Use Cases for HTML
HTML and Accessibility
Security Issues with HTML
An HTML Web Page with a DropDown List
Working with CSS
Use Cases for CSS3
Security Issues with CSS3
Text with a CSS3 Linear Gradient
Text with a CSS3 Radial Gradient
A Rounded Rectangle with a CSS3 Radial Gradient
Diagonal Gradients and Multigradients
Media Queries
Keyframe Animation Effects
CSS3 Animation Effects
A Carousel Effect with CSS3
CSS3 and 2D Transforms
CSS3 Quadratic Bezier Curves
CSS3 Cubic Bezier Curves
Artistic CSS3 Cubic Bezier Curves
CSS3 Filters
CSS3 Blur Filter
CSS3 Drop Shadow
Summary
Chapter 6: Introduction to SVG
Overview of SVG
Basic 2D Shapes in SVG
SVG Gradients
SVG <polygon> Element
Bezier Curves
SVG Filters, Shadow Effects, and Text Paths
SVG Transforms
SVG Animation
SVG and JavaScript
CSS3 and SVG Bar Charts
Summary
Chapter 7: SVG and Claude 3
Working with SVG
Use Cases for SVG
Accessibility and SVG
Security Issues with SVG
SVG Linear Gradients
SVG Radial Gradients
A Triangle with a Radial Gradient
SVG 2D Shapes and Gradients
A Bar Chart in SVG
SVG Quadratic Bezier Curves
SVG Cubic Bezier Curves
SVG and 2D Transforms
Animated SVG Cubic Bezier Curves
Hover Effects
Hover Animation Effects
SVG Versus CSS3: A Comparison
SVG Versus PNG: A Comparison
SVG Filters
SVG Blur Filter
SVG Turbulence Filter
SVG and CSS3 in HTML Web Pages
SVG and JavaScript in HTML Web Pages
Elliptic Arcs with a Radial Gradient
An SVG Checkerboard Pattern
An SVG Checkerboard Pattern with Filter Effects
A Master-Detail HTML Web Page
Summary
Index
PREFACE
WHAT IS THE PRIMARY VALUE PROPOSITION FOR THIS BOOK?
This book explores the synergy between Claude 3, a state-of-the-art AI model, and Web technologies such as HTML, CSS3, and SVG, offering both novices and seasoned developers the tools to create stunning web graphics and animations.
KEY TOPICS
Generative AI and Claude 3: Explore the fundamental concepts of Generative AI, key players in the field, and the unique features of Claude 3.
Prompt Engineering: Master the art of crafting effective prompts to interact seamlessly with AI models like Claude 3.
CSS3 3D Animation: Learn how to create dynamic 3D animations using CSS3, including effects like glowing, image fading, and rotating.
SVG and Claude 3: Discover how to generate sophisticated SVG graphics and animations using Claude 3, enhancing your Web designs with advanced visual effects.
THE TARGET AUDIENCE
This book is ideal for both beginners and experienced developers, offering in-depth knowledge about AI, Web development, and programming. Moreover, this book is structured to provide both theoretical knowledge and practical insights, making it a valuable resource for those looking to deepen their understanding of these rapidly evolving fields. As such, it is useful as a go-to resource for modern developers looking to stay ahead in an AI-focused world.
CHAPTER OVERVIEW
Chapter 1 explores Generative AI, discussing its key features, the differences between Conversational AI and Generative AI, and its various applications. It also examines the roles of prominent AI players like DeepMind, OpenAI, Cohere, Hugging Face, AI21, and others in this field. A significant portion of this chapter is dedicated to understanding Claude 3, such as its key features and a comparison of Claude 3 with GPT-4.
Chapter 2 shifts focus to Prompt Engineering, providing a comprehensive overview, including the types and importance of prompts, and guidelines for effective prompt design. This part of the book is crucial for understanding how to interact effectively with AI models such as Claude 3. Chapter 3 introduces CSS3, along with manually created HTML Web pages that contain CSS3 code for linear gradients, radial gradients, and other CSS3-based effects. Chapter 4 contains Claude 3-generated HTML Web pages with CSS3 that show you how to create 3D animation effects. Chapter 5 contains an assortment of Claude 3-generated Web pages that contain CSS3. Chapter 6 introduces SVG, along with manually created HTML Web pages that contain SVG code for linear gradients, radial gradients, and other SVG-based effects. Chapter 7 contains examples of Claude 3-generated HTML Web pages that contain SVG code.
GETTING THE MOST FROM THIS BOOK
Some Web developers learn well from prose, while others learn well from sample code (and lots of it), which means that no single style suits everyone.
Moreover, some Web developers want to run the code first, see what it does, and then return to the code to delve into the details (and others use the opposite approach).
Consequently, there are several types of code samples in this book in order to illustrate some aspects of CSS3 and SVG, as well as how to supply prompts to Claude 3 in order to generate CSS3 code and SVG code.
HOW WAS THE CODE FOR THIS BOOK TESTED?
The code samples in this book have been tested in a recent version of Firefox on a MacBook Pro running macOS Sonoma 14.2.1 (earlier versions of macOS also support the code samples in this book).
WHAT DO I NEED TO KNOW FOR THIS BOOK?
The most useful prerequisite is familiarity with HTML, CSS3, and SVG, which will enable you to understand the code samples more quickly. The less technical knowledge that you have, the more diligence will be required in order to understand the diverse topics that are covered.
If you want to be sure that you can grasp the material in this book, glance through some of the code samples to get an idea of how much is familiar to you and how much is new for you.
DOES THIS BOOK CONTAIN PRODUCTION-LEVEL CODE SAMPLES?
Clarity has a higher priority than writing compact code that is difficult to understand (and possibly prone to bugs). If you decide to use any of the code in this book on a production website, you ought to subject that code to the same rigorous analysis as the other parts of your HTML Web pages.
HOW DO I SET UP A COMMAND SHELL?
If you are a Mac user, there are three ways to do so. The first method is to use Finder to navigate to Applications > Utilities and then double click on the Terminal application. Next, if you already have a command shell available, you can launch a new command shell by typing the following command:
open /Applications/Utilities/Terminal.app
A second method for Mac users is to open a new command shell from one that is already visible: simply press Command+N in that command shell, and your Mac will launch another command shell.
If you are a PC user, you can install Cygwin (an open source toolkit available at https://cygwin.com/) that provides a bash-like command environment, or use another toolkit such as MKS (a commercial product). Please read the online documentation that describes the download and installation process. Note that custom aliases are not automatically set if they are defined in a file other than the main start-up file (such as .bash_login).
COMPANION FILES
All the code samples and figures in this book may be obtained by writing to the publisher at [email protected].
O. Campesato
August 2024
CHAPTER 1
THE GENERATIVE AI LANDSCAPE
This chapter contains an introduction to generative AI and some of the companies that have a significant presence in generative AI. Some of the main features of Claude 3 will also be discussed, along with some of the competitors for Claude 3.
The first section of this chapter starts with information generated by Claude 3 regarding the nature of generative AI, as well as conversational AI versus generative AI. According to Claude 3, ChatGPT, GPT-4, and DALL-E are all part of generative AI. This section also discusses Claude 3 and some of its features, as well as alternatives to Claude 3, such as Llama 2 (Meta) and Gemini (formerly Google Bard).
The second section provides an overview of several important companies that are heavily involved in generative AI, such as OpenAI, Google, Meta, Cohere, and Hugging Face (as well as Anthropic, of course).
The third section discusses Claude 3, which includes a description of the three versions of Claude 3, called Opus, Sonnet, and Haiku. Claude 3 Opus is the most powerful, and available as a paid service ($20 per month), whereas Sonnet and Haiku are freely available to everyone. This section also briefly describes some of its competitors, such as Google Gemini and Meta AI (Meta).
As this book goes to print, Anthropic has released an updated version of Claude 3 Sonnet that outperforms GPT-4o from OpenAI.
WHAT IS GENERATIVE AI?
Generative AI refers to a subset of artificial intelligence models and techniques that are designed to generate new data samples that are similar to the input data. The goal is to produce content or data that wasn’t part of the original training set but is coherent, contextually relevant, and in the same style or structure.
Generative AI stands apart in its ability to create and innovate, as opposed to merely analyzing or classifying. The advancements in this field have led to breakthroughs in creative domains and practical applications, making it a cutting-edge area of AI research and development.
Key Features of Generative AI
The following bullet list contains key features of generative AI, followed by a brief description for each bullet item:
data generation
synthesis
learning distributions
Data generation refers to the ability to create new data points that are not part of the training data but resemble it. This can include text, images, music, videos, or any other form of data.
Synthesis means that generative models can blend various inputs to generate outputs that incorporate features from each input, like merging the styles of two images.
Learning distributions means that generative AI models aim to learn the probability distribution of the training data so they can produce new samples from that distribution.
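The idea of learning a distribution and then sampling new data from it can be illustrated with a toy sketch (the numbers and variable names below are purely illustrative, not from any real generative model): estimate the parameters of one-dimensional training data, then draw brand-new samples from the fitted distribution.

```python
import random
import statistics

# Toy illustration of "learning a distribution": fit the mean and standard
# deviation of some training data, then sample from the fitted Gaussian.
# Real generative models learn far richer distributions, but the principle
# is the same.

random.seed(42)

# "Training data": 1,000 samples from an unknown source (here, a Gaussian).
training_data = [random.gauss(5.0, 2.0) for _ in range(1000)]

# "Learning": estimate the parameters of the distribution.
mu = statistics.mean(training_data)
sigma = statistics.stdev(training_data)

# "Generation": new samples that were never in the training set,
# but that follow the learned distribution.
new_samples = [random.gauss(mu, sigma) for _ in range(5)]

print(f"learned mu={mu:.2f}, sigma={sigma:.2f}")
print(new_samples)
```

The generated values are new (none of them appears in the training data), yet they are statistically indistinguishable from it, which is the essence of sampling from a learned distribution.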
Popular Techniques in Generative AI
Generative adversarial networks (GANs): GANs consist of two networks, a generator and a discriminator, that are trained simultaneously. The generator tries to produce fake data, while the discriminator tries to distinguish between real data and fake data. Over time, the generator gets better at producing realistic data.
Variational autoencoders (VAEs): VAEs are probabilistic models that learn to encode and decode data in a manner in which the encoded representations can be used to generate new data samples.
Recurrent neural networks (RNNs): Used primarily for sequence generation, such as text or music.
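The adversarial setup in a GAN can be sketched with a toy example (the generator, discriminator, and all constants below are hypothetical illustrations, not a real GAN implementation). The discriminator D outputs the probability that a sample is real; it is trained to score real data high and generated data low, while the generator is trained to produce samples that D scores high.

```python
import math
import random

# Toy GAN objective: D maximizes log D(x) + log(1 - D(G(z))),
# while G minimizes log(1 - D(G(z))).

random.seed(0)

def generator(z, theta):
    # Hypothetical toy generator: shifts input noise by a learned offset theta.
    return z + theta

def discriminator(x, mu_real=5.0):
    # Hypothetical toy discriminator: samples near mu_real look "real".
    return 1.0 / (1.0 + math.exp(abs(x - mu_real) - 2.0))

real_batch = [random.gauss(5.0, 1.0) for _ in range(8)]
noise_batch = [random.gauss(0.0, 1.0) for _ in range(8)]

theta = 0.0  # untrained generator: fakes are centered at 0, far from 5
fake_batch = [generator(z, theta) for z in noise_batch]

# Discriminator loss: low when it scores real samples high and fakes low.
d_loss = (
    -sum(math.log(discriminator(x)) for x in real_batch) / 8
    - sum(math.log(1.0 - discriminator(x)) for x in fake_batch) / 8
)

# Generator objective: near its maximum of 0 while D easily spots the
# fakes; training pushes it down as the fakes become more realistic.
g_loss = sum(math.log(1.0 - discriminator(x)) for x in fake_batch) / 8

print(f"d_loss={d_loss:.3f}, g_loss={g_loss:.3f}")
```

In a real GAN, both networks would be updated by gradient descent on these opposing objectives, so that over time the generator's samples become indistinguishable from real data.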
What Makes Generative AI Different
Creation versus classification: While most traditional AI models aim to classify input data into predefined categories, generative models aim to create new data.
Unsupervised learning: Many generative models, especially GANs and VAEs, operate in an unsupervised manner, meaning they don’t require labeled data for training.
Diverse outputs: Generative models can produce a wide variety of outputs based on learned distributions, making them ideal for tasks like art generation, style transfer, and more.
Challenges: Generative AI poses unique challenges, such as mode collapse in GANs or ensuring the coherence of generated content.
Furthermore, there are numerous areas that involve generative AI applications, some of which are in the following bullet list:
art and music creation
data augmentation
style transfer
text generation
image synthesis
drug discovery
Art and music creation includes generating paintings, music, or other forms of art.
Data augmentation involves creating additional data for training models, especially when the original dataset is limited.
Style transfer refers to applying the style of one image to the content of another.
Text generation is a very popular application of generative AI, which involves creating coherent and contextually relevant text.
Image synthesis is another popular area of generative AI, which involves generating realistic images, faces, or even creating scenes for video games.
Drug discovery is a very important facet of generative AI that pertains to generating molecular structures for new potential drugs.
CONVERSATIONAL AI VERSUS GENERATIVE AI
Both conversational AI and generative AI are prominent subfields within the broader domain of artificial intelligence. However, these subfields have a different focus regarding their primary objective, the technologies that they use, and applications.
The primary differences between the two subfields are in the following sequence of bullet points:
primary objective
applications
technologies used
training and interaction
evaluation
data requirements
Primary Objective
The main goal of conversational AI is to facilitate human-like interactions between machines and humans. This includes chatbots, virtual assistants, and other systems that engage in dialogue with users.
The primary objective of generative AI is to create new content or data that wasn’t in the training set but is similar in structure and style. This can range from generating images, music, and text to more complex tasks like video synthesis.
Applications
Common applications for conversational AI include customer support chatbots, voice-operated virtual assistants (like Siri or Alexa), and interactive voice response (IVR) systems.
Generative AI has a broad spectrum of common applications such as creating art or music, generating realistic video game environments, synthesizing voices, and producing realistic images or even deepfakes.
Technologies Used
Conversational AI often relies on natural language processing (NLP) techniques to understand and generate human language. This includes intent recognition, entity extraction, and dialogue management.
Generative AI commonly utilizes generative adversarial networks (GANs), variational autoencoders (VAEs), and other generative models to produce new content.
Training and Interaction
While training can be supervised, semisupervised, or unsupervised, the primary interaction mode for conversational AI is through back-and-forth dialogue or conversation.
The training process for generative AI, especially with models like GANs, involves iterative processes where the model learns to generate data by trying to fool a discriminator into believing the generated data is real.
Evaluation
Conversational AI evaluation metrics often revolve around understanding and response accuracy, user satisfaction, and the fluency of generated responses.
Generative AI evaluation metrics for models like GANs can be challenging and might involve using a combination of quantitative metrics and human judgment to assess the quality of generated content.
Data Requirements
Data requirements for conversational AI typically involve dialogue data, with conversations between humans or between humans and bots.
Data requirements for generative AI involve large datasets of the kind of content it is supposed to generate, be it images, text, music, and so on.
Although both conversational AI and generative AI deal with generating outputs, their primary objectives, applications, and methodologies can differ significantly. Conversational AI is geared toward interactive communication with users, while generative AI focuses on producing new, original content.
IS CLAUDE 3 PART OF GENERATIVE AI?
According to Claude 3 itself, Claude 3 is not part of generative AI (which might be surprising). The following prompt was given to Claude 3 to ask if it’s part of generative AI: Are you considered part of generative AI?
Claude 3 provided the following response to the preceding question:
No, I am not part of what is typically considered "Generative AI." I am a large language model trained by Anthropic to have natural conversations and provide informative responses, but I do not have generative capabilities to create new content like images, videos, audio, or computer programs from scratch.
My training allows me to understand and process existing data and information in order to summarize, analyze, explain, or translate it into natural language responses. However, I do not have the ability to autonomously generate truly new content in domains like visual art, music, software code, etc.
I rely on my training data provided by Anthropic to form my responses, rather than generating wholly original content through machine learning techniques like generative adversarial networks (GANs) or diffusion models that are used in Generative AI applications.
So in summary, while I use advanced natural language processing capabilities, I am not considered part of the specific field of Generative AI that aims to create new artificial data across different modalities like images, video, and audio. I am a very capable language model, but not one with generative AI capabilities.
IS DALL-E PART OF GENERATIVE AI?
DALL-E and similar tools that generate graphics from text are indeed examples of generative AI. In fact, DALL-E is one of the most prominent examples of generative AI in the realm of image synthesis.
Here’s a bullet list of generative characteristics of DALL-E, followed by brief descriptions of each bullet item:
image generation
learning distributions
innovative combinations
broad applications
transformer architecture
Image generation is a key feature of DALL-E, which was designed to generate images based on textual descriptions. Given a prompt like “a two-headed flamingo,” DALL-E can produce a novel image that matches the description, even if it’s never seen such an image in its training data.
Learning distributions: Like other generative models, DALL-E learns the probability distribution of its training data. When it generates an image, it samples from this learned distribution to produce visuals that are plausible based on its training.
Innovative combinations: DALL-E can generate images that represent entirely novel or abstract concepts, showcasing its ability to combine and recombine learned elements in innovative ways.
In addition to image synthesis, DALL-E has provided broad application support in areas like art generation, style blending, and creating images with specific attributes or themes, highlighting its versatility as a generative tool.
DALL-E leverages a variant of the transformer architecture, similar to models like GPT-3, but adapted for image generation tasks.
Other tools that generate graphics, art, or any form of visual content based on input data (whether it’s text, another image, or any other form of data) and can produce outputs not explicitly present in their training data are also considered generative AI. They showcase the capability of AI models to not just analyze and classify but to create and innovate.
DALL-E 3 SAMPLE IMAGES
The following prompt was supplied to Meta AI (currently Claude 3 cannot generate images via text prompts) in order to generate an image of a clown steering a cigar boat.
Please generate a clown steering a cigar boat.
The image in Figure 1.1 displays a clown steering a cigar boat.
The following prompt was also supplied to Meta AI in order to generate an image of a movie star steering a cigar boat.
Please generate a movie star steering a cigar boat engulfed in swirling ocean waters.
The image in Figure 1.2 displays a movie star steering a cigar boat.
FIGURE 1.1 A clown steering a cigar boat.
FIGURE 1.2 A movie star steering a cigar boat.
As can be seen from the two markedly different preceding images, the text in a prompt can significantly affect the type of image that Meta AI will generate.
Presumably these features will spur OpenAI to provide the same set of features (some are implemented in Meta AI, but they are not publicly available).
WHAT ARE LLMS?
The following prompt was supplied to Meta AI regarding a description of LLMs, along with their advantages and disadvantages:
Please describe LLMs and their advantages as well as their disadvantages.
Meta AI
Large Language Models (LLMs) are complex algorithms trained on vast datasets to understand and generate human-like text, offering automation, creativity, and information processing opportunities. Here are some of the advantages and disadvantages of LLMs:
Advantages:
Comprehensibility: LLMs tend to produce translations that are grammatically correct and easy to comprehend.
Adaptability: Owing to their training on diverse corpora, LLMs are well-equipped to handle a broad range of topics and language styles.
Improvement Over Time: As LLMs continue to learn from user interactions and a broadening corpus of text, their performance can improve over time.
Disadvantages:
Domain-Specific Translation: LLMs may struggle due to the lack of specialized training data.
Limitations on the Availability of Language Pairs: LLMs are often trained on a large number of language pairs, which can dilute their proficiency in any specific pair.
Contextual Understanding: While LLMs have a decent understanding of immediate context, they often falter when it comes to understanding larger or implicit context within a text.
Meta AI also provided the following sources that are referenced in the preceding description of LLMs:
https://projectmanagers.net/top-10-disadvantages-of-large-language-models-llm/
https://blog.pangeanic.com/advantages-and-disadvantages-of-llm-machine-translation-versus-neural-machine-translation
WHAT ARE SLMS?
The following prompt was supplied to Meta AI regarding a description of SLMs, along with their advantages and disadvantages:
Please describe Small Language Models (SLMs) and their advantages as well as their disadvantages.
Meta AI
Small Language Models (SLMs) are a type of neural network that generate natural language content and are designed to be more lightweight and resource-efficient. SLMs are ideal for applications that need to operate within limited computational environments, offering advantages such as:
Advantages:
Efficiency: SLMs require less computational power and energy.
Speed: They are easier and quicker to deploy.
Robustness: SLMs are more manageable and controllable.
Security: They present a lower risk of data leakage.
Disadvantages:
Niche focus: SLMs may not perform well outside their specific domain of training.
Limited generalization: They lack the broad knowledge base of LLMs.
Rapid evolution: The field of Language Models is rapidly evolving, making it challenging to keep up with the latest advancements.
Evaluation and selection difficulties: Choosing the appropriate SLM for a specific application can be daunting due to the influx of various models.
Meta AI also provided the following sources that are referenced in the preceding description of SLMs:
https://winder.ai/exploring-small-language-models/
https://www.harrisonclarke.com/blog/large-language-models-vs-small-language-models
https://aisera.com/blog/small-language-models/
Popular SLMs
The following prompt was supplied to Meta AI regarding popular SLMs, along with their features:
Please provide a list of popular Small Language Models (SLMs), along with their features.
Meta AI
Here are some popular Small Language Models (SLMs):
DistilBERT: This is a smaller version of the BERT model, with around half of the parameters. It is a good balance of speed, efficiency, and cost, making it suitable for use in resource-constrained environments.
GPT-Neo: This is an open-source alternative to GPT-3, with similar architecture and capabilities. It has 2.7 billion parameters and is designed to provide high-quality results for a variety of language tasks without the need for fine-tuning.
GPT-J: This is similar to GPT-3 in design, with 6 billion parameters. It includes Rotary Position Embeddings and attention mechanisms, and is effective for tasks such as translating from English to French.
Domain-Specific LLMs: These models are designed for the healthcare sector and are specialized to process and generate information related to medical terminologies, procedures, and patient care.
Micro Language Models: These models are designed for AI customer support and are fine-tuned to understand the nuances of customer interactions, product details, and company policies.
Meta AI also provided the following sources that are referenced in the preceding list of SLMs:
https://aisera.com/blog/small-language-models/
https://winder.ai/exploring-small-language-models/
The next several sections briefly introduce some of the companies that have a strong presence in the AI world.
DEEPMIND
DeepMind has made significant contributions to artificial intelligence, including the creation of various AI systems. DeepMind was established in 2010 and became a subsidiary of Google in 2014, and its home page is here: https://deepmind.com/
DeepMind created the 280-billion-parameter language model Gopher, which significantly outperforms competitors such as GPT-3, J1-Jumbo, and MT-NLG. DeepMind also developed AlphaFold, which solved in thirty minutes a protein-folding task that had eluded researchers for ten years. Moreover, DeepMind made AlphaFold available for free to everyone in July 2021. DeepMind has also made significant contributions to the development of world-caliber AI game systems, some of which are discussed in the next section.
DeepMind and Games
DeepMind is the force behind the AI systems AlphaStar and AlphaGo, the latter of which defeated the best human players in Go (a game considerably more difficult than chess). Go provides “perfect information,” whereas games with “imperfect information” (such as poker) have posed a challenge for ML models.
AlphaGo Zero (the successor of AlphaGo) mastered the game through self-play in less time and with less computing power, and exhibited extraordinary performance by defeating AlphaGo 100–0. Another powerful system is AlphaZero, which also used a self-play technique, learned to play Go, chess, and shogi, and achieved SOTA (state-of-the-art) results.
By way of comparison, ML models that use tree search are well-suited for games with perfect information. By contrast, games with imperfect information (such as poker) involve hidden information that can be leveraged to devise counter strategies to counteract the strategies of opponents. In particular, AlphaStar is capable of playing against the best players of StarCraft II, and also became the first AI to achieve SOTA results in a game that requires “strategic capability in an imperfect information world.”
Player of Games (PoG)
The DeepMind team at Google devised the general-purpose PoG (Player of Games) algorithm that is based on the following techniques:
CVPN (counterfactual value-and-policy network)
GT-CFR (growing tree CFR)
The counterfactual value-and-policy network (CVPN) is a neural network that calculates the counterfactuals for each state belief in the game. This is key to evaluating the different variants of the game at any given time.
Growing tree CFR (GT-CFR) is a variation of CFR that is optimized for game trees that grow over time. GT-CFR is based on two fundamental phases, which are discussed in more detail here:
https://medium.com/syncedreview/deepminds-pog-excels-in-perfect-and-imperfect-information-games-advancing-research-on-general-9dbad5c04221
OPENAI
OpenAI is an AI research company that has made significant contributions to AI, including DALL-E, ChatGPT, and GPT-4o, and its home page is here: https://openai.com/api/
OpenAI was founded in San Francisco by Elon Musk and Sam Altman (as well as others), and one of its stated goals is to develop AI that benefits humanity. Given Microsoft’s massive investments in and deep alliance with the organization, OpenAI might be viewed as an arm of Microsoft. OpenAI is the creator of the GPT-x series of LLMs (large language models) as well as ChatGPT, which was made available on November 30, 2022.
OpenAI made GPT-3 commercially available via API for use across applications, charging on a per-token basis. GPT-3 was announced in July 2020 and was initially available through a beta program. Then, in November 2021, OpenAI removed the waitlist and opened GPT-3 to everyone; more details are accessible here:
https://openai.com/blog/api-no-waitlist/
In addition, OpenAI developed DALL-E, which generates images from text. OpenAI initially did not permit users to upload images that contained realistic faces, but later changed its policy to allow users to upload faces into its online system. Check the OpenAI Web page for more details. Incidentally, diffusion models have since surpassed the original DALL-E on image-generation benchmarks.
OpenAI has also released a public beta of Embeddings, which is a data format that is suitable for various types of tasks with machine learning, as described here:
https://beta.openai.com/docs/guides/embeddings
OpenAI is the creator of Codex, which provides a set of models that were trained on both natural language and source code. The initial release of Codex was in private beta, and more information is accessible here: https://beta.openai.com/docs/engines/instruct-series-beta
OpenAI provides four models that are collectively called the Instruct models, which enhance the ability of GPT-3 to follow natural-language instructions. To learn more about the features and services that OpenAI offers, navigate to the following link: https://platform.openai.com/overview
COHERE
Cohere is a start-up and a competitor of OpenAI, and its home page is here: https://cohere.ai/
Cohere develops cutting-edge NLP technology that is commercially available for multiple industries. Cohere is focused on models that perform textual analysis, rather than models for text generation (such as GPT-based models). The founding team of Cohere is impressive: CEO Aidan Gomez is one of the co-inventors of the transformer architecture, and CTO Nick Frosst is a protégé of Geoff Hinton, who is a co-winner of the Turing Award.
HUGGING FACE
Hugging Face is a popular community-based repository for open-source NLP technology, and its home page is here: https://github.com/huggingface
Unlike OpenAI or Cohere, Hugging Face does not build its own NLP models. Instead, Hugging Face is a platform that manages a plethora of open-source NLP models that customers can fine-tune and then deploy. Indeed, Hugging Face has become the preeminent location for people to collaborate on NLP models and is sometimes described as “GitHub for machine learning and NLP.”
Hugging Face Libraries
Hugging Face provides three important libraries: datasets, tokenizers, and transformers (the related Accelerate library supports running PyTorch models on different hardware setups). The datasets library provides an assortment of datasets for NLP tasks, and the tokenizers library enables you to convert text data into numeric values.
Perhaps the most impressive is the transformers library, which provides an enormous set of pretrained transformer-based models for performing a wide variety of NLP tasks. The GitHub repository is here: https://github.com/huggingface/transformers
Hugging Face Model Hub
Hugging Face provides a model hub with a plethora of models that are accessible online. Moreover, the website supports online testing of its models for tasks that include the following:
masked word completion with BERT
named entity recognition with Electra
natural language inference with RoBERTa
question answering with DistilBERT
summarization with BART
text generation with GPT-2
translation with T5
Navigate to the following link to see the text generation capabilities of “write with transformer”: https://transformer.huggingface.co
AI21
AI21 is a company that provides proprietary large language models via API to support the applications of its customers. The current SOTA model of AI21 is called Jurassic-1 (roughly the same size as GPT-3), and AI21 also creates its own applications on top of Jurassic-1 and other models. The current application suite of AI21 involves tools that can augment reading and writing.
Primer is an older competitor in this space, founded two years before the invention of the transformer. The company primarily serves clients in government and defense.
INTRODUCTION TO CLAUDE 3
Anthropic was created in 2021 by former employees of OpenAI, and its home page is here: https://www.anthropic.com/
Anthropic has significant financial support from an assortment of companies, including Google and Salesforce.
In March 2024 Anthropic released Claude 3 as a competitor to ChatGPT and GPT-4. Claude 3 is available in three versions: Opus, Sonnet, and Haiku (Opus is the most powerful version).
Claude 3 Opus has a high degree of comprehension and expert-level knowledge in fields such as mathematics. Opus is currently available in many countries for a monthly subscription of $20.
Claude 3 Sonnet is twice as fast as earlier versions of Claude (i.e., Claude 2 and Claude 2.1), offers improved reasoning capabilities, and is well-suited for fast answers to queries. In addition, Sonnet is freely available.
Claude 3 Haiku is the most cost-effective as well as the fastest version of Claude 3, and it’s available on a per-token basis.
There are other aspects of Claude 3 to keep in mind when comparing it with other LLMs. First, Claude 3 is likely to respond faster to prompts than earlier versions of Claude. Second, Claude 3 is less likely to respond with incorrect results and more likely to indicate that it lacks information regarding a prompt. In addition, Claude 3 currently provides a 200K-token context window, and is likely to support one million tokens in future iterations.
Incidentally, Claude 3 is not “self-aware” in the sense that it does not provide accurate information about itself when you submit prompts about Claude 3. For instance, submit the following prompt to Claude 3:
Which 3 versions of Claude 3 are available?
You will see the following response:
I do not actually have different numbered versions like "claude 3". I am an AI assistant created by Anthropic, but I do not have detailed information about product versions or naming conventions. I am simply known as Claude.
If you provide Claude Sonnet with the following prompt:
Who can provide me with information regarding Claude Opus, Sonnet, and Haiku?
You will see the following response:
Unfortunately, I do not have any specific information about models or versions named Claude Opus, Claude Sonnet, or Claude Haiku. As an AI assistant created by Anthropic, I do not have access to details about the company's internal AI development efforts, code names, or different model variants they may be working on.
Since you are asking about these apparently unannounced or confidential model names, the best source of accurate information would be from Anthropic employees directly involved with the development of these systems. As an external AI, I do not have any insider knowledge about unreleased or unannounced projects the company may be pursuing.
Unless Anthropic has publicly released details about things like Claude Opus, Sonnet, and Haiku, which I'm not aware of, I cannot provide any authoritative information about what those refer to or represent. My role is limited to being a general conversational AI assistant without access to Anthropic's proprietary research or confidential product roadmap details. I cannot be more helpful regarding those specific model names you mentioned.
Advantages of Claude 3