This book focuses on essential XML standards relevant to almost all developers. It investigates XML technologies applicable across a wide range of applications, rather than those limited to specific domains. While XML is a markup language, it is widely used by programmers. The book also covers supporting technologies layered on top of XML, such as XLinks, XSLT, Namespaces, Schemas, XHTML, RDDL, XPointers, XPath, SAX, and DOM.
The journey begins with understanding XML and its syntax. It then explores Document Type Definitions (DTDs), Namespaces, and XHTML. Following this, the book delves into CSS Style Sheets, XML Schema Basics, XSL and XSLT, SOAP, DOM Programming Interface, SAX, XPath, XLink, XQuery, XPointer, XForms, XSL-FO, and using XML with Databases. The final chapters cover Web Services, providing a comprehensive understanding of how XML integrates into various applications.
Mastering these standards and technologies is crucial for developers working with XML. This book transitions readers from basic XML syntax to advanced applications, blending theoretical concepts with practical examples. It is an essential resource for developers looking to leverage XML in their projects.
Page count: 651
Year of publication: 2024
XML BASICS
LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY
By purchasing or using this book (the “Work”), you agree that this license grants permission to use the contents contained herein, but does not give you the right of ownership to any of the textual content in the book or ownership to any of the information or products contained in it. This license does not permit uploading of the Work onto the Internet or on a network (of any kind) without the written consent of the Publisher. Duplication or dissemination of any text, code, simulations, images, etc. contained herein is limited to and subject to licensing terms for the respective products, and permission must be obtained from the Publisher or the owner of the content, etc., in order to reproduce or network any portion of the textual material (in any media) that is contained in the Work.
MERCURY LEARNING AND INFORMATION (“MLI” or “the Publisher”) and anyone involved in the creation, writing, or production of the companion disc, accompanying algorithms, code, or computer programs (“the software”), and any accompanying Web site or software of the Work, cannot and do not warrant the performance or results that might be obtained by using the contents of the Work. The author, developers, and the Publisher have used their best efforts to ensure the accuracy and functionality of the textual material and/or programs contained in this package; we, however, make no warranty of any kind, express or implied, regarding the performance of these contents or programs. The Work is sold “as is” without warranty (except for defective materials used in manufacturing the book or due to faulty workmanship).
The author, developers, and the publisher of any accompanying content, and anyone involved in the composition, production, and manufacturing of this work will not be liable for damages of any kind arising out of the use of (or the inability to use) the algorithms, source code, computer programs, or textual material contained in this publication. This includes, but is not limited to, loss of revenue or profit, or other incidental, physical, or consequential damages arising out of the use of this Work.
The sole remedy in the event of a claim of any kind is expressly limited to replacement of the book, and only at the discretion of the Publisher. The use of “implied warranty” and certain “exclusions” vary from state to state, and might not apply to the purchaser of this product.
XML BASICS
SHASHI BANZAL
MERCURY LEARNING AND INFORMATION
Dulles, Virginia
Boston, Massachusetts
New Delhi
Copyright ©2020 by MERCURY LEARNING AND INFORMATION LLC. All rights reserved. Reprinted and revised with permission.
Original title and copyright: Learning XML. Copyright ©2017 by University Science Press (An imprint of Laxmi Publications Pvt. Ltd.). All rights reserved.
This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.
Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA [email protected]
S. Banzal. XML Basics. ISBN: 978-1-68392-546-0
The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.
Library of Congress Control Number: 2020942355
20 21 22 3 2 1 Printed on acid-free paper in the United States of America.
Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at 800-232-0223 (toll free).
All of our titles are available in digital format at www.academiccourseware.com and other digital vendors. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the book, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.
CONTENTS
Preface
Chapter 1:
Understanding XML
Markup Languages
Specific Markup Languages
Generalized Markup Language
SGML - A Metalanguage
Why is XML so Adaptable?
XML Over SGML
Introduction to XML
Extensible
Markup
Language
History of XML
HTML and XML
XML Structure
Logical Structure
XML Declaration
XML Syntax
How Do I Structure My XML Documents?
Need for XML-Based Languages
XML Benefits
XML Disadvantages
Lack of Application Processing
General Weaknesses of XML
XML and Unicode Disadvantages
Characteristics of an XML Document
Open and Extensible
Application Independence
Data Format Integration
One Data Source, Multiple Views
Data Presentation Modification
Internationalization
Future-Oriented
Improved Data Searches
Enables E-Commerce Transactions
XML Documents form a Tree Structure
All XML Elements Must have a Closing Tag
XML Tags are Case Sensitive
XML Elements Must be Properly Nested
XML Documents Must have a Root Element
XML Attribute Values Must be Quoted
XML is Free
XML Technology
Uses
Sample XML Document
XML in Practical World
Property Inheritance
Combining Stylesheets
Questions for Discussion
Chapter 2:
XML Syntax
The Well-Formed Document
XML Document Structure
Prolog Section
The Standalone Attribute
The Encoding Attribute
Instance Section
Elements
Character Data
CDATA
Comment
Processing Instruction
Entities
General Entities
Parameter Entities
Entity References
Attributes
Entities’ References and Constants
Unparsed Data
Character Data (CDATA)
Processing Instructions (PIs)
Questions for Discussion
Chapter 3:
Document Type Definition (DTD)
Physical Structure in XML
Parsed and Unparsed Entities
Predefined Entities
Internal and External Entity
XML General Syntax
Attributes
Valid Documents
Well-Formed Documents
Well-Formed XML Documents
XML Documents
The XML Declaration
Processing Instructions
Comments
Document Type Declaration
XML Application Classification
Parsers
XML Processing-Attribute Values
XML Processing
Event-Driven Parsers
Tree-Based Parsers
XML Parser
Parse an XML Document
Parse an XML String
Document Type Definitions (DTDs)
Example DTD
DTD <!DOCTYPE>
DOCTYPE Syntax
XML Syntax Rules
DTDs (Well-Formed vs. Valid)
General Principles in Writing DTDs
Document Validation
Validating an XML Document with a DTD
The Purpose of DTDs
Creating DTDs
Code Sample: DTDs/Demos/Beatles.DTD
Internal DTD
Example Internal DTD
External DTD
Example External DTD
Combined DTD
DTD Elements
Basic Syntax
Plain Text
Unrestricted Elements
Empty Elements
Child Elements
Other Elements
Choice of Elements
Empty Elements
Mixed Content
Multiple Child Elements (Sequences)
An XML Application without a DTD
DTD Element Operators
DTD Operators with Sequences
Subsequences
The Document Element
Location of Modifier
Using Parentheses for Complex Declarations
XML CDATA
PCDATA-Parsed Character Data
CDATA-(Unparsed) Character Data
Notes on CDATA Sections
Internal & External Subsets
Standalone Attribute
DOCTYPE Declaration
Internal DTD Subset Declarations
External DTDs
Basic Markup Declarations
Formal DTD Structure-Entities
Predefined Entities
General Entities
Parameter Entities
Formal DTD Structure-Elements
Content Model
Cardinality Operators
Attributes
Default Values
Attribute Types
CDATA
ID
IDREF
Entity
Entity, Entities
NMTOKEN, NMTOKENS
Notation
Enumerations
Declaring Attributes
Conditional Sections
Limitations of DTDs
Designing XML Documents
XML for Messages
XML for Persistent Data
Mapping the Information Model to XML
A Document Type Declaration
Elements
Empty Elements
Attributes
CDATA
White Space
Special Characters
Questions for Discussion
Chapter 4:
Namespaces
Namespaces
Purpose of Namespaces
Declaring a Namespace
Scope
Qualified
XML Namespace
Example Namespace
XML Local Namespace
Example Local Namespace
Multiple Namespaces
XML Default Namespace
Understanding Namespaces
Naming Namespaces
Declaring and Using Namespaces
Default Namespaces
Explicit Namespaces
XML Namespaces
Name Conflicts
Solving the Name Conflict Using a Prefix
Locally Declared Elements and Attributes
Using Multiple Namespaces
Uniform Resource Identifier (URI)
Default Namespaces
Namespaces in Real Use
Questions for Discussion
Chapter 5:
Introduction to XHTML
A Quick History of HTML
XML Over HTML
Getting Multilingual with XML
The Convergence of HTML and XML
Add HTML to XML Data
Differences Between XHTML and HTML
XHTML
Benefits of XHTML
XHTML Coding
XML Declaration
XHTML DTDs
The DOCTYPE Declaration
XHTML Strict
XHTML Transitional
XHTML Frameset
The Document Element
A Sample XHTML Document
Document Formation
XHTML Tags
Questions for Discussion
Chapter 6:
CSS Style Sheets
CSS Documents
XML and CSS
Limitations of CSS for Complex Applications
Advantages of Authoring XML Documents with CSS
Authoring Approaches
Authoring XML Documents with CSS
Associating CSS Stylesheets with XML
Rendering XML Documents with CSS
CSS Syntax
CSS Example
CSS Comments
CSS Selectors
Embedding CSS in Web Page
CSS Styles
Displaying XML with CSS
XSL Transformation
Using XSL to Present XML Documents
XSL Patterns
XML Styles (Revisited)
Questions for Discussion
Chapter 7:
XML Schema Basics
XML Schema
Role of a Schema
DTD as a Schema
Schema Languages and Notations
The Purpose of XML Schema
The Power of XML Schema
A First Look
A Simple XML Schema
Schema as a Set of Constraints
Schema as an Explanation
DTD vs XML Schema
Structures
Preamble
Sample Preamble
Attributes and Attribute Groups
Content Models
Element Declaration
Derivation
Data Types
Primitive Types
Generated and User Defined Types
Hyperlinks
Links
Linking and Querying
XML Information Set
Link Elements
Locators
XLinks
Simple Links
Extended Links
Extended Link Groups
Validating an XML Instance Document
Simple-Type Elements
Built-in Simple Types
19 Primitive Data Types
Built-in Derived Data Types
Defining a Simple-Type Element
User-Derived Simple Types
Controlling Length
Specifying Patterns
Working with Numbers
Mins and Maxs
Number of Digits
Enumerations
Whitespace Handling
Specifying Element Type Locally
Nonatomic Types
Lists
Unions
Declaring Global Simple-Type Elements
Global vs. Local Simple-Type Elements
Default Values
Fixed Values
Nil Values
Complex-Type Elements
Content Models
Complex Model Groups
Occurrence Constraints
Declaring Global Complex-Type Elements
Mixed Content
Defining Complex Types Globally
Empty Elements
Adding Attributes to Elements with Complex Content
Adding Attributes to Elements with Simple Content
Restricting Attribute Values
Default and Fixed Values
Fixed Values
Requiring Attributes
Groups
Extending Complex Types
Abstract Types
XML Schema Keys
Keys
Annotating XML Schemas
Annotating a Schema
XSD Indicators
But This is No Longer Valid
Create an XML Schema
XSD Date and Time Data Types
XML Editors
Questions for Discussion
Chapter 8:
XSL Basics
Introduction to XSL
An XML Syntax
An XSL Processor
The XSL Templates
Location Paths
Template Ordering
Axes
Repetitions and Sortings in XSL
XSL Sorting
Uppercase and Lowercase Sorting
XSL Conditional Processing
Number Generation and Formatting in XSL
Formatting Multilevel Numbers
Numeric Calculation in XSL
Ceiling, Floor, and Round
String Function
XSL String Functions
Concatenation
XSL Output Element
HTML Output Method
Text Output Method
Copy and Copy-of Constructs in XSL
Use-Attribute-Sets Attribute
Miscellaneous Additional Functions
Combining XSL
Importing Stylesheets
Apply-Import Function
Questions for Discussion
Chapter 9:
XSLT Basics
XSLT (Extensible Stylesheet Language)
XSLT Sample Program
The Transformation Process
Processing a Transformation
Applying XSLT to an XML Document
XSLT Syntax
XML Version
XSL Root Element
Selecting the Root Node
Usage Example
XSLT <value-of> Element
Usage Example
XSLT <for-each> Element
<xsl:for-each> Example
Result
Before
After
XSLT <if> Element
The Source File
The Solution
The Source File
The Solution
Questions for Discussion
Chapter 10:
SOAP
SOAP
Communication Over Distributed Systems
Remote Procedure Call (RPC)
SOAP Syntax
SOAP Message Structure
The SOAP Envelope Element
The SOAP Header Element
The SOAP Body Element
The SOAP Fault Element
The HTTP Protocol
SOAP HTTP Binding
Content-Type
Content-Length
A SOAP Example
Transport Methods in SOAP
SOAP and the Request/Response Model
HTTP Headers and SOAP
Request Headers
Response Headers
Sending Messages Using M-Post
A Schema for the Body Content of the SOAP Message
SOAP Encoding
Encoding Style Attribute
Questions for Discussion
Chapter 11:
DOM Programming Interface
DOM (Document Object Model)
XML DOM Tree
High Level Architecture of a DOM/XML Application
DOM Implementation
The DOM Specification
XML DOM Nodes
XML DOM Node Tree
First Child - Last Child
DOM Level 2 Specification
XML Document Structure
Working with DOM
Client Side and Server Side DOM
XML DOM Parser
XML Parser
Load an XML Document
Questions for Discussion
Chapter 12:
SAX (Simple API for XML)
Introduction to SAX
SAX (Simple API for XML)
DOM and Tree-Based Processing
Pros and Cons of Tree-Based Processing
How to Choose Between SAX and DOM
The SAX API is Defined in 4 Interfaces Under the org.xml.sax Package
SAX Sample Program
Three Steps to SAX
Creating the SAX Parser the Sample File
SAX Interface Java Example
SAX Parsing Pattern Example
Questions for Discussion
Chapter 13:
XPath
XPath Introduction
XPath Syntax
The XML Example Document
Navigating a Document with XPath Patterns
Referencing Nodes
XPath (XML Path) Language
Data Types, Literals, and Variables
XPath Operators
Evaluation Context
Built-in Functions
Using XPath Functions
Node Functions
String Functions
Boolean Functions
Number Functions
The Role of XPath
Using XPath in XSLT Templates
XPath Location Path
Location Path Example
XPath Location Step
XPath Location Path – Absolute
Example of an Absolute Location Path
Selecting Nodes
Predicates
Selecting Unknown Nodes
Selecting Several Paths
The Root Node
XPath Location Path – Relative
Example of a Relative Location Path
Children
The Wildcard
XPath Attributes
XPath – Expressions
XPath—Our Sample XML File
A Simple XPath Expression
Questions for Discussion
Chapter 14:
XLink, XQuery, and XPointer
Introduction to XQuery
XQuery Example
XQuery Syntax
XQuery Basic Syntax Rules
XQuery Selecting and Filtering Elements
XQuery Functions
XQuery User-Defined Functions
XLink and XPointer Introduction
XLink and XPointer Syntax
HTML, XML, and Linking
Linking with XLink
XLink Example
The XML Example Document
Understanding XLink Attributes
Creating Links with XLink
XPointer Syntax
Addressing with XPointer
Building XPointer Expressions
Creating XPointers
XPointer Example
The Linking XML Document
XPointer Example
The Linking XML Document
Questions for Discussion
Chapter 15:
XForms
Introduction to XForms
Features of XForms
Parts of XForms
The Form Controls
The Form Controls Listed
The XForms Processor
The XForms Namespace
XForms and XPath
XForms Properties
XForms Actions
Questions for Discussion
Chapter 16:
XSL-FO
Introduction to XSL-FO
XSL-FO Documents
XSL-FO Document Structure
Font and Text Attributes
XSL-FO Areas
XSL-FO Output
Page Layout
XSL-FO Blocks
Styling Text in XSL-FO
Controlling Spacing and Borders
More Complex Structures
Tables
XSL-FO Objects
Graphics
XSL-FO Processors
XSL-FO Software
XSL-FO and XSLT
Questions for Discussion
Chapter 17:
XML with Databases
Introduction
XML Documents as Databases
Why Use a Database?
Data versus Documents
Data-Centric Documents
Document-Centric Documents
Data, Documents, and Databases
Storing and Retrieving Data
Mapping Document Schemas to Database Schemas
Relational Database Primer
The World’s Shortest Guide to SQL
Retrieving Records Using Select
Inserting Records
Updating Records
Deleting Records
Databases and XML
Resolving XML Data into Database Tables
Storing XML Documents in a Database
Exporting an XML Document from a Database
Accessing Data from a Database as XML
Questions for Discussion
Chapter 18:
Web Services
Web Services
The Web Services Platform
Web Services Platform Elements
Types of Web Services
Web Service Architectures
Web Services Example
How to Use Web Services
SOAP
WSDL and UDDI
UDDI Benefits
How Can UDDI be Used
Questions for Discussion
Appendix A:
XML Basics
Appendix B:
Well Formed XML Documents
Appendix C:
XML Overview
Glossary
Index
PREFACE
This book focuses on standards that are relevant to almost all developers working with XML. We investigate XML technologies that span a wide range of XML applications, not just those that are relevant only within a few restricted domains. XML is not a programming language. It is a markup language; but it is successfully used by many programmers. The book also covers generic supporting technologies that have been layered on top of XML and are used across a wide range of XML applications. These technologies include XLinks, XSLT, Namespaces, Schemas, XHTML, RDDL, XPointers, XPath, SAX, and DOM.
S. BANZAL
August 2020
CHAPTER 1
UNDERSTANDING XML
MARKUP LANGUAGES
The term Markup is a concatenation of the words “mark up.” This refers to the traditional way of marking up a document in the print and design worlds.
Markup is used to modify the look and formatting of text or to establish the structure and meaning of the document for output to some medium, such as the printer or the World Wide Web. Markup consists of codes, or tags, that are added to text to change the look or meaning of the tagged text. The tagged text for a document is usually called the source code for that document. Most word processors use some sort of markup language to produce formatted text. There are two types of markup languages: specific markup languages and generalized markup languages.
SPECIFIC MARKUP LANGUAGES
Specific markup languages were developed for particular purposes and cannot readily be used for anything other than the purpose for which they were developed. Hypertext Markup Language (HTML), for example, was designed for simplicity and has a flexible structure. It allows text and graphics to be displayed in any Web browser.
Many markup languages have served quite well as document formatting tools for print or for the Web. However, they do not perform well at describing the data they contain or at providing contextual information for that data. For example, HTML describes how text should be formatted, but conveys nothing about the kind of data included in the document.
When using specific markup languages, the authors are limited to a particular set of tags. If a set of tags does not meet a need, authors must find an alternative way to meet those needs. A document might not be portable to other applications, as the data is not self-describing. It cannot be used for any other purpose than that for which it was originally intended. The language probably has a proprietary way of marking up text that is not compatible with other markup languages. This can create confusion and additional work for authors who must use several languages to accommodate different applications.
GENERALIZED MARKUP LANGUAGE
In the 1970s, Dr. C. F. Goldfarb and two of his colleagues proposed a method of describing text that was not specific to an application or a device. The method had two suggestions:
•The markup should describe the structure of a document and not its formatting or style characteristics.
•The syntax of the markup should be strictly enforced so that the code can clearly be read by a software program or by a human being.
The result of these suggestions was the Standard Generalized Markup Language (SGML), which was adopted as a standard by the International Organization for Standardization (ISO) in 1986.
SGML - A METALANGUAGE
SGML has added provisions for identifying the characters to be used in a document. This makes it easier to ensure that a processor can understand everything in a document by allowing a document to specify the character set that it uses.
SGML provides a way to identify objects that will be used throughout a document. These objects, called entities, are convenient to use when a text fragment or any other data appears in several places in a document. If an entity is declared in one place of the document, any changes to that declaration will be reflected in all occurrences of the entity throughout the document.
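XML inherited this entity mechanism from SGML. The following is a minimal sketch of the idea, written in Python with the standard-library xml.etree.ElementTree parser and using XML syntax rather than SGML. The entity name company, its value, and the memo vocabulary are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "company" is declared once in the
# internal DTD subset and referenced in two places in the content.
DOC = """<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ENTITY company "Example Motors">
]>
<memo>
  <from>&company;</from>
  <body>Welcome to &company;!</body>
</memo>"""


def expand(doc):
    """Parse the document and return the text of each child of the root."""
    root = ET.fromstring(doc)
    return [child.text for child in root]


if __name__ == "__main__":
    print(expand(DOC))
    # Editing the single declaration changes every occurrence.
    print(expand(DOC.replace('"Example Motors"', '"Sample Autos"')))
```

Only the declaration is edited in the second call, yet both references pick up the new text, which is the behavior the paragraph above describes.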
SGML – Example
<!DOCTYPE CARS PUBLIC "//EXT/DTD CATALOG//EN">
<CAR>
<COLOR> Red
<PRICE> $20,000
</CAR>
The code snippet shown is an example of an SGML document. Its tagged content looks much like that of an HTML document. These similarities exist because HTML is an application of SGML; HTML was created using SGML standards. The main difference between SGML and HTML is that SGML is extensible, which means that it allows an author to define a particular structure by defining the parts that fit that structure. HTML is not extensible, which means that HTML cannot be used to create another markup language with its own rules and purposes.
WHY IS XML SO ADAPTABLE?
If XML is a new generation, then SGML is its mother. SGML is likely one of the most adaptable languages of all time, allowing the use of constructs that even XML won’t allow. Unfortunately, SGML is more complex and not as universally supported as XML, so the use of SGML instead of XML isn’t really recommended.
XML has inherited many of the key features of SGML, however, and puts them to good use; in many cases, the ways that it differs from its predecessor are inconsequential. While you may occasionally run across strange circumstances that would work better with SGML, it’s best to focus on XML since that’s where most of the support and interest lies.
XML OVER SGML
Even though XML is a subset of Standard Generalized Markup Language (SGML), XML is optimized for use on the World Wide Web. XML is designed in such a way that it has some benefits that are not found in SGML. XML is a smaller language than SGML. The designers of XML removed some specifications in SGML that were not needed for Web delivery.
The XML family includes a specification for a hyperlinking scheme, which was originally described as a separate language called eXtensible Linking Language (XLL) and later developed into XLink and XPointer. It supports the basic hyperlinking found in HTML as well as extended linking. The family also includes a specification for a style sheet language called eXtensible Stylesheet Language (XSL). XSL provides a style sheet mechanism that allows an author to create a template of various styles.
XML documents can be self-describing. That is, a document can contain, or refer to, a set of rules (a DTD) to which its data must conform. Since the same set of rules can be reused in another document, other authors can easily create the same class of document, if necessary.
XML can be used as the data interchange format. Many legacy systems can contain data in disparate forms, and developers are doing a lot of work to connect these legacy systems using the Internet. Since the XML text format is standards-based, data can be converted to XML and then easily read by another system or application.
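As a minimal sketch of this kind of interchange, one system can serialize a record to XML text and another can read it back. The example uses Python's standard-library xml.etree.ElementTree module; the car element and the record fields are hypothetical, and the sketch assumes every dictionary key is a valid XML element name.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record):
    """Serialize a flat dict as <car><key>value</key>...</car>.

    Assumes every key is a valid XML element name.
    """
    root = ET.Element("car")
    for key, value in record.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Read the XML text back into a flat dict of strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    legacy_record = {"color": "Red", "price": "20000"}
    wire_format = record_to_xml(legacy_record)
    print(wire_format)
    print(xml_to_record(wire_format))
```

The receiving side needs only an XML parser and knowledge of the element names, not the sending system's internal data format.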
XML can be used for Web data. For example, the content is stored in an XML file and the HTML page is used simply for formatting and display. So, the content can be updated and translated into another language without modifying anything in the HTML code.
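The separation described above can be sketched as follows, assuming a hypothetical greeting vocabulary and a deliberately tiny HTML template. The code is Python and uses only the standard library; it is an illustration of the idea, not a recommended publishing pipeline.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical content files: same structure, different languages.
CONTENT_EN = "<greeting><title>Welcome</title><text>Cars &amp; prices</text></greeting>"
CONTENT_DE = "<greeting><title>Willkommen</title><text>Autos &amp; Preise</text></greeting>"

# The HTML side only handles layout and never changes with the content.
TEMPLATE = "<html><body><h1>{title}</h1><p>{text}</p></body></html>"


def render(content_xml):
    """Fill the fixed HTML template with text taken from the XML content."""
    root = ET.fromstring(content_xml)
    return TEMPLATE.format(
        title=escape(root.findtext("title", default="")),
        text=escape(root.findtext("text", default="")),
    )


if __name__ == "__main__":
    print(render(CONTENT_EN))
    print(render(CONTENT_DE))
```

Swapping the content file translates the page while the template string stays untouched.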
INTRODUCTION TO XML
XML (eXtensible Markup Language) was invented for the purpose of having a standard and powerful way of describing any kind of data. XML offers a widely adopted standard way of representing text and data in a format that can be processed without much human or machine intelligence. Information formatted in XML can be exchanged across platforms, languages, and applications, and can be used with a wide range of development tools and utilities.
XML is a meta-language; that is, it is a language in which other languages are created. In XML, data is “marked up” with tags similar to HTML tags. In fact, XHTML, a reformulation of HTML, is an XML-based language, which means that XHTML follows the syntax rules of XML.
XML is used to store data or information. This data might be intended to be read by people or by machines. It can be highly structured data, such as data typically stored in databases or spreadsheets, or loosely structured data, such as data stored in letters or manuals.
XML is all about preserving useful information—information that computers can use to be more intelligent about what they do with our data. The best part of XML is that it liberates information from the shackles of a fixed-tag set.
XML provides a standard approach for describing, capturing, processing, and publishing information. It is a language that has significant benefits over HTML.
Unlike most markup languages, XML is a flexible framework in which you can create your own customized markup languages. All XML-based languages share the same look and feel, and they share a common basic syntax. The essence of XML is in its name: Extensible Markup Language.
•Markup – It is a collection of tags.
•XML Tags – Identify the content of the data
•Extensible – User-defined tags
EXTENSIBLE
XML is extensible. It lets you define your own tags, the order in which they occur, and how they should be processed or displayed. Another way to think about extensibility is to consider that XML allows us to extend our notion of what a document is: it can be a file that lives on a file server, or it can be a transient piece of data that flows between two computer systems (as in the case of Web Services).
MARKUP
The most recognizable feature of XML is its tags, or elements (to be more accurate). In fact, the elements you’ll create in XML will be very similar to the elements you’ve already been creating in your HTML documents. However, XML allows you to define your own set of tags.
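As a small illustration of user-defined tags, the sketch below parses a document whose tag names (library, book, title, author) were invented for this example and appear in no standard. It uses Python's standard-library xml.etree.ElementTree parser, which accepts any well-formed names without their being declared in advance.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: none of these tag names is predefined by XML.
DOC = """<library>
  <book isbn="978-1-68392-546-0">
    <title>XML Basics</title>
    <author>S. Banzal</author>
  </book>
</library>"""


def titles(doc):
    """Return the title of every <book> element in the document."""
    root = ET.fromstring(doc)
    return [book.findtext("title") for book in root.iter("book")]


if __name__ == "__main__":
    print(titles(DOC))
```

The program finds its data by the names the author chose, which is what makes the tags meaningful to an application.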
LANGUAGE
XML is a language that’s very similar to HTML. It’s much more flexible than HTML because it allows you to create your own custom tags. However, it’s important to realize that XML is not just a language. XML is a meta-language: a language that allows us to create or define other languages. For example, with XML we can create other languages, such as RSS, MathML (a mathematical markup language), and even tools like XSLT.
HISTORY OF XML
In the late 1960s, IBM developed the Generalized Markup Language (GML). The Standard Generalized Markup Language (SGML) grew out of GML and became an ISO standard in 1986. SGML is a semantic and structural language for text documents, but it is very complicated. HTML is an application of SGML.
In 1996, the XML Working Group was formed under the W3C. The World Wide Web Consortium (W3C) is an international consortium where member organizations, a full-time staff, and the public work together to develop Web standards. The W3C was founded in 1994 by Tim Berners-Lee, who invented the World Wide Web in 1989. In 1998, the W3C published XML 1.0.
XML (Extensible Markup Language) is a dialect of SGML. XML is not a programming language. Rather, it is a set of rules that allows you to represent data in a structured manner. Since the rules are standard, the XML documents can be automatically generated and processed.
XML was designed to describe data and is a cross-platform, software- and hardware-independent tool for transmitting or exchanging information. It is an open-standards-based technology which is both human and machine readable. XML is best suited to sets of documents that share a similar structure. In future Web development, it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. The XML specification includes the syntax and grammar of XML documents as well as of DTDs.
Website creation is a fast-growing sector. In the early days, Website design consisted primarily of creating fancy graphics and nice-looking, easy-to-read Web pages.
As today’s Websites are interactive, the steps in Website design have changed. Although creating a pleasant-looking Website is still important, the primary focus has shifted from graphical design to programmatic design.
Consider a company wanting to sell its product on the Web. In such cases, the Webpages will collect and store a user’s billing information. This calls for storing and manipulating such data in a database. This is where XML comes into the picture.
XML helps address the problems that arise when building database-driven Web pages.
HTML AND XML
HTML and XML were designed with different purposes in mind. XML is similar to HTML—they are both closely related to the SGML markup definition language that has been an ISO standard since 1986. SGML is an early attempt to combine the metadata (data about the data) with the data and it was used primarily in large document management systems. Because SGML is a very complex language, it has limited mass appeal.
HTML is the most recognized application of SGML and it allows any Web browser or application which understands HTML to display information in a consistent form. An HTML document is effective when it comes to laying out and displaying data, but HTML has a fixed set of tags, and it does not have the flexibility to describe different document and data types. HTML, in conjunction with Cascading Style Sheets (CSS), is reasonably good at displaying data, but it is not as good as XML at transporting data that is meant to be viewed or parsed in dozens of different ways by a variety of devices. In essence, where HTML is a presentation language, we require a richer communication means that can help with exchanging information from one computer to another.
The need to extract data and put a structure around information led to the creation of XML. Since XML 1.0 was released in 1998, XML use has grown rapidly. There are two fundamental differences between HTML and XML:
•Separation of form and content: HTML mostly consists of tags defining the appearance of text; in XML, the tags generally define the structure and content of the data, with the actual appearance specified by a specific application or associated stylesheet.
•XML is extensible: tags can be defined by individuals or organizations for some specific application, whereas the HTML standard tagset is defined by the World Wide Web Consortium (W3C).
XML is not intended as a replacement for HTML; the two are complementary technologies. XML is a more general and better solution to the problem of sharing data on the Web than extending HTML.
XML STRUCTURE
One of XML’s best features is its ability to provide structure to a document. Every XML document includes both a logical and a physical structure. The logical structure is like a template that details the elements to be included in a document and the order in which they have to be included. The physical structure contains the actual data used in a document.
LOGICAL STRUCTURE
Logical structure refers to the organization of the different parts of a document. It indicates how a document is built, as opposed to what a document contains. The first structural part of an XML document is the prolog, which forms the base of the document's logical structure. The prolog is not an element; it consists of two basic components, the XML Declaration and the Document Type Declaration. Both components are optional, so the prolog itself may be empty.
XML DECLARATION
The XML Declaration identifies the version of the XML specification to which the document conforms. Although the XML declaration is optional, we should always include it in the XML document. When it is present, it must be the very first thing in the document.
The code snippet here gives an example of a basic XML declaration. The keywords xml and version must be written in lowercase letters.
<?xml version="1.0"?>
An XML declaration can also contain an encoding declaration and a standalone document declaration.
The encoding declaration identifies the character-encoding scheme, such as UTF-8 or EUC-JP. Different encoding schemes map characters to bytes in different ways and cover different character sets or languages. For example, UTF-8, the default scheme, can represent every Unicode character and stores each character of the English language in a single byte.
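The effect of the encoding declaration can be seen with a short Python sketch that uses the standard library parser. The document text and the name it contains are hypothetical and serve only as an illustration; the point is that the parser reads the declared encoding and decodes the bytes accordingly.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the declaration names ISO-8859-1 as the encoding
# and marks the document as standalone.
text = '<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?><name>Zoë</name>'

# Encode the text with the same scheme that the declaration announces.
raw = text.encode("iso-8859-1")

# The parser reads the encoding declaration and decodes the bytes itself.
name = ET.fromstring(raw).text
print(name)
```

If the bytes were written in one encoding while the declaration named another, the parser could report an error or return different characters, which is why the declaration and the actual encoding should always agree.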
XML SYNTAX
The first thing that you’ll need to do is open up your text editor of choice. At this point, your document is going to look something like this (if you’re using XML version 1.0):
<?xml version="1.0"?>
Once you’ve typed your declaration, it’s time to start adding some content to the page. Information on an XML page is handled in a very precise and structured format, using tags to define your data. White space can be included in the document to make it more easily readable, though you should be careful not to put white space between the opening angle bracket and the tag name, or inside the tag name itself, because the parser will reject the document.
Let’s say that you’ve decided to create a new XML document to tell the world about your two favorite cats. You want to use the tag <cats>. Your document now looks a little something like this:
<?xml version="1.0"?>
<cats>Tooter and Shade are the best cats in the world!</cats>
Note the line break between the declaration and the first tag. You could also have put each of the tags on its own line, with the content of the tags between them, as long as you don’t add white space within the tag names.
Of course, the <cats> tags don’t do anything. If you load this page into a Web browser, you’ll end up with more or less a copy of the file contents displayed on the screen with the tags in some pretty colors. You’ll have to tell the browser how to handle the tags, which can be done in one of four ways:
•Using Cascading Style Sheets (CSS)
•Using Extensible Stylesheet Language (XSL) Style Sheets
•Using a Data Island plus Script
•Using the Document Object Model (DOM) plus Script or Client-Side Program
All of this might sound complicated, but it’s really not. It does involve creating and referencing other pages, though for now we’re still working on just the basic structure of XML. Save the document (in Text-Only mode) under the name cats.xml (making sure to use the .xml extension).
HOW DO I STRUCTURE MY XML DOCUMENTS?
Structure in an XML document is very important. Small errors in the structure of your document can have large effects on the overall outcome; pieces may not be displayed correctly, or might not appear at all. If the structure is too damaged, then the entire document might fail to work.
As previously mentioned, XML documents should begin with the XML declaration. Open up the previously saved file, cats.xml, and you’ll find your declaration already in place.
<?xml version="1.0"?>
<cats>Tooter and Shade are the best cats in the world!</cats>
Unfortunately, your file is still missing a few vital elements. The <cats> tags don’t work, and the browser has no idea how to make them work. If you load it up in a browser, you’ll just see a copy of the file, with the various elements in different colors. This is actually useful, however; as long as you see this, then your code is good. The browser doesn’t know what else to do with it, in this case because some of the elements are missing, but the lack of definitive error codes tells you that it’s at least well-coded.
Go into the file, between your declaration and the content, and get ready to add another vital element to your page. Type the following:
<?xml-stylesheet type="text/css" href="cats.css"?>
Of course, this doesn’t mean much to you right now. In time, though, it’s going to be a vital part of your page. What you just typed is a processing instruction that tells the browser where to find the stylesheet, the file that describes how the information in the XML document should be displayed. The line tells the browser to find the file called cats.css, that the file is a Cascading Style Sheet, and that it is the stylesheet to apply to this page. Now your cats.xml file should look like the following, which looks a lot more like an XML file.
<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="cats.css"?>
<cats>Tooter and Shade are the best cats in the world!</cats>
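Loading the file in a browser is one way to check that it is well formed. As an alternative, the following Python sketch runs the same check with the standard library parser. The helper name is_well_formed is an assumption made for this illustration, and the file contents are embedded as a string so that the example runs on its own.

```python
import xml.etree.ElementTree as ET

# The contents of cats.xml, embedded as a string for this illustration.
cats_xml = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="cats.css"?>
<cats>Tooter and Shade are the best cats in the world!</cats>
"""

def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False

root = ET.fromstring(cats_xml)
print(root.tag)                   # name of the root element
print(is_well_formed(cats_xml))   # the file above parses cleanly

# Hypothetical broken variant: the closing tag is missing.
print(is_well_formed("<cats>Tooter and Shade"))
```

The parser only checks the structure of the document. It does not fetch or apply cats.css, so the stylesheet does not need to exist for this check to succeed.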
NEED FOR XML-BASED LANGUAGES
The main advantage of being able to define your own markup language is that it gives you the freedom to capture and publish useful information about what your data is and how it is structured. To show the difference, consider a company that wants to sell books on the Web. To publish the information about the books on a Web page, the company needs to write an HTML document like the one shown.
<!-- Book snippet in HTML -->
<h1> Books for Sale </h1>
<table border=1>
<tr>
<td>Title</td><td>Paradise Lost</td>
</tr>
<tr>
<td>Author</td><td>John Milton</td>
</tr></table>
The original data has been formed into HTML for publishing purposes. In the course of that transformation, useful information about what the data really is has been lost. If the same content were written in XML, it would look like the following code snippet.
<!-- Book snippet in XML -->
<BooksForSale>
<Title>Paradise Lost</Title>
<Author>John Milton</Author>
</BooksForSale>
Published on the Web in this form, the data opens up some interesting possibilities, because a program can find the title and the author by name rather than by their position in a table.
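One such possibility is that a program can ask for a piece of data by its element name. The following Python sketch, using the standard library, reads the XML book snippet and retrieves the title and the author. The variable names are chosen for this illustration only.

```python
import xml.etree.ElementTree as ET

# The XML book snippet from the text.
book_xml = """<BooksForSale>
<Title>Paradise Lost</Title>
<Author>John Milton</Author>
</BooksForSale>"""

root = ET.fromstring(book_xml)

# The element names say what each value means, so no guessing
# about table rows or cell positions is needed.
title = root.findtext("Title")
author = root.findtext("Author")
print(f"{title} by {author}")
```

Doing the same with the HTML version would require knowing that the title happens to sit in the second cell of the first table row.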
XML BENEFITS
Initially, XML received a lot of excitement, but that has now died down some. This isn’t because XML is not as useful, but rather because it doesn’t provide the “Wow! factor” that other technologies, such as HTML, do. When you write an HTML document, you see a nicely formatted page in a browser—instant gratification. When you write an XML document, you see an XML document—not so exciting. However, with a little more effort, you can make that XML document sing.
XML is Everywhere
XML is now as important for the Web as HTML was to the foundation of the Web. XML is one of the most common formats for transmitting data between all sorts of applications. XML is used in many aspects of Web development, often to simplify data storage and sharing.
XML Separates Data from HTML
If you need to display dynamic data in your HTML document, it will take a lot of work to edit the HTML each time the data changes. With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for layout and display and be sure that changes in the underlying data will not require any changes to the HTML. With a few lines of JavaScript code, you can read an external XML file and update the data content of a Web page.
XML Simplifies Data Sharing
In the real world, computer systems and databases contain data in incompatible formats. XML data is stored in plain text format. This provides a software- and hardware-independent way of storing data. This makes it much easier to create data that can be shared by different applications.
XML Simplifies Data Transport
One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet. Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible applications.
XML Simplifies Platform Changes
Upgrading to new systems (hardware or software platforms) is always time consuming. Large amounts of data must be converted and incompatible data is often lost.
XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers without losing data.
XML Makes Your Data More Readily Available
Different applications can access your data, not only from HTML pages, but also from XML data sources. With XML, your data can be made available to all kinds of “reading machines” (handheld computers, voice machines, and news feeds), which also makes it more accessible to blind people or people with other disabilities.
XML is Used to Create New Internet Languages
New Internet languages are created with XML. Here are some examples:
•XHTML
•WSDL for describing the available Webservices
•WML, the markup language used with WAP for handheld devices
•RSS languages for news feeds
•RDF and OWL for describing resources and ontology
•SMIL for describing multimedia for the Web
The future might give us word processors, spreadsheet applications, and databases that can read each other’s data in XML format, without any conversion utilities in between. XML documents form a tree structure that starts at the “root” and branches to the “leaves.”
XML DISADVANTAGES
XML is useful for developing future Web applications, and it almost defines the future of Web development. However, XML also has some drawbacks. One of the biggest drawbacks of XML is that it lacks adequate applications for processing.
LACK OF APPLICATION PROCESSING
XML needs an application to process it. Browsers can parse an XML document and show its raw tree, but they do not know how to present tags that an author has invented. With HTML, anyone can write a page that can be read using any browser anywhere in the world. To be presented in a browser, XML still depends on HTML or on a stylesheet and is not independent of them. XML documents are therefore often converted to HTML before they are deployed. A common method is to write parsing routines, in DHTML or in a Java application, that walk through the XML document. The formatting rules in a style sheet can then be applied to convert the entire document into HTML.
Other disadvantages of XML include the fact that it is more difficult and more demanding than HTML, because it requires greater precision. Browsers offer no built-in presentation for custom XML vocabularies, so XML by itself gives little direct support to end-user applications.
XML is very flexible, but its flexibility can potentially become one of its disadvantages, since different parties may disagree about which tags to use. If an XML vocabulary has too many constraints, it might become very difficult to construct a conforming file. While just describing tags and building a system sounds easy, it may not be that easy in reality. For example, a business or professional organization may have hundreds of functions related to one set of documents. XML does not have the capability to synthesize all the information related to the document.
GENERAL WEAKNESSES OF XML
XML is a verbose language, and how verbose a document becomes depends on who is writing it. Verbose markup may pose problems for other users. XML is not specific to any platform, and this neutrality may be a disadvantage in a few circumstances. Not all implementations comply fully with the XML standards. Users have reported problems with parsers, and there are problems with XML over HTTP that are still being resolved.
XML documents can be difficult and expensive to set up. A freelancer, for example, can sit at his home and at his own pace create, write, and format a document or a manuscript using any of the free software available. However, the moment he introduces XML, the whole process becomes more complicated.
XML AND UNICODE DISADVANTAGES
Making multiple incompatible programs work together can be challenging. XML is tied closely to Unicode, and converting a document between character encodings can change how attribute values and other content are stored, which might result in a file that differs from the original.
Some XML parsers, including those built into RSS components, cannot disable the retrieval of external entities. Instead they treat the external content as part of the document, which can prove to be a major disadvantage. A browser cannot present an arbitrary XML vocabulary by itself, which makes XML dependent on HTML or a stylesheet for display. XML is not a highly efficient format, and although the text itself is platform independent, the tools that process it do not behave identically on every operating system. The basic limitation remains that XML by itself tells the browser nothing about presentation.
Many sample HTML and XHTML documents contain a doctype that points to a DTD. The common belief is that the DTD is actually used, but browsers do not retrieve these DTDs. An application that does retrieve the DTD breaks down whenever the DTD is unavailable. This is a problem because the DTD can be unavailable for unrelated reasons, and that should not make the service itself unavailable.
XML can make a program depend heavily on a single factor, which creates problems. Even when the DTD is available, it is of little use on its own: an outside program has to provide a backup, so users and developers might as well use an outside program written from scratch that provides backups at intermediary levels.
External entities pose a problem, which is a major disadvantage for XML. The best way to fix the external entity problem with XML DTDs is not to use external entities at all. If you have to deal with them, do not use them on the producer side, and do not attempt to retrieve them on the client side.
When you write the specification for an XML vocabulary, do not make a DTD part of the vocabulary's definition. Programs should run their XML parsers with external entity resolution disabled. Otherwise, the external entity problem will invariably crop up, triggering a series of problems that cannot be solved by the XML environment alone. A specification layered on XML is allowed to ban document type declarations altogether, as SOAP does.
If your job is to implement a Web application which is based on XML, you may need to configure the parser not to perform the DTD-based validations, and also not to try and resolve the external entities. This could be an answer to some problems, so taking precautionary measures is worthwhile. Publishing documents on the Web requires the same precautions; the document types should not be included.
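As a rough illustration of this precaution, the Python sketch below uses the standard library SAX parser with external general entities switched off. The document, the entity name ext, and the file path it points to are all hypothetical. Recent Python versions already disable this feature by default, so the setFeature call simply makes the intent explicit.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges

class TextCollector(ContentHandler):
    """Collects the character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)

# Hypothetical input: the internal DTD declares an external entity
# that points at a file the application should never read.
document = b"""<?xml version="1.0"?>
<!DOCTYPE note [<!ENTITY ext SYSTEM "file:///hypothetical/secret.txt">]>
<note>before-&ext;-after</note>"""

parser = xml.sax.make_parser()
# Tell the parser not to retrieve external general entities.
parser.setFeature(feature_external_ges, False)

collector = TextCollector()
parser.setContentHandler(collector)
parser.parse(io.BytesIO(document))

text = "".join(collector.chunks)
print(text)
```

With the feature disabled, the reference to ext is skipped instead of being replaced by the contents of the external file, so the parser never touches the file system or the network on behalf of the document.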
A document may not be valid in the way XML defines validity, and some people even believe that document validation in XML is overrated. DTDs are not very powerful when it comes to validation, because the language and grammar of a real document are often more than a DTD can express. There is also the problem of other programs not trusting the XML DTD. The doctype in HTML is quite different from the doctype in XML. You may not be able to use the doctype in XML as an indicator that helps programs understand what type of document they are dealing with.
If you have an application that can handle multiple XML vocabularies and knows how to dispatch each document to the appropriate handler by checking the namespace of the root element, then you can consider yourself lucky. If the vocabulary is not identified by a namespace, then you can look for it in the MIME type. In some cases, the vocabulary is identified neither by a namespace nor by a specific MIME type. Such a language is certainly a bad example and will create problems because you will have to rely on the root element name.
The XML specification allows three kinds of processing. The first one performs no DTD-based validation and does not retrieve external entities. The second one performs no DTD-based validation but retrieves external entities so that the information set and the entity references can be expanded. The third one performs DTD-based validation and retrieves the external entities so that the information set and the entity references can be expanded.
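The dispatching described above can be sketched in Python with the standard library. The handler table, the function names, and the strings the handlers return are assumptions made for this illustration; the XHTML and Atom namespace URIs are the real ones. ElementTree reports a namespaced root as {uri}name, which the helper splits apart.

```python
import xml.etree.ElementTree as ET

def split_tag(tag):
    """Split an ElementTree tag of the form '{uri}local' into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag

# Hypothetical handlers, keyed by the namespace of the root element.
HANDLERS = {
    "http://www.w3.org/1999/xhtml": lambda root: "xhtml handler",
    "http://www.w3.org/2005/Atom": lambda root: "atom handler",
}

def dispatch(document):
    """Choose a handler from the root namespace, else fall back to the root name."""
    root = ET.fromstring(document)
    uri, local = split_tag(root.tag)
    handler = HANDLERS.get(uri)
    if handler is None:
        return f"no namespace handler; falling back to root name '{local}'"
    return handler(root)

print(dispatch('<html xmlns="http://www.w3.org/1999/xhtml"/>'))
print(dispatch('<feed xmlns="http://www.w3.org/2005/Atom"/>'))
print(dispatch("<cats/>"))
```

The last call shows the unfortunate case from the text: with no namespace to go on, the only clue left in the document is the name of the root element.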
The point of having many profiles is so that the application has a choice and it chooses the right one. Character entities are considered unsafe for Web applications. It is a disadvantage because there will be a problem with the input and its editor. On the World Wide Web, there may be other options available when there is such a problem. The situation need not be so unfortunate because there may be a solution which exists, and there is an input method which can solve the problem with the editor. If the XHTML entities were pre-defined, then there wouldn’t be many problems.
CHARACTERISTICS OF AN XML DOCUMENT
There are a range of characteristics associated with XML.
Simplicity
Information coded in XML is easy to read and understand, and it can be processed easily by computers.
Self-Describing
OPEN AND EXTENSIBLE
XML allows you to add other elements when needed. This means you can always adapt your system to address specification modifications.
APPLICATION INDEPENDENCE
Using XML, data is no longer dependent on a specific application for creation, viewing or editing. In this sense, XML is to data what Java is to applications. Java allows programs to run anywhere—XML allows data to be used by any application.
DATA FORMAT INTEGRATION
XML documents can contain any imaginable data type, from classical data like text and numbers to multimedia objects such as sounds and video and active components like applets.
ONE DATA SOURCE, MULTIPLE VIEWS
By formatting our data in a markup language, we allow computer applications to process and present this data to us in different ways. In contrast, HTML presents data in one fixed way.
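A small Python sketch can illustrate the idea of one data source with several views. The record reuses a book that appears later in this chapter, and the two rendering functions are hypothetical names invented for this example.

```python
import xml.etree.ElementTree as ET
from html import escape

# Single data source: one hypothetical book record.
book = ET.fromstring(
    "<book><title>Learning XML</title><author>G.Ram</author><price>13.95</price></book>"
)

def as_plain_text(record):
    """First view: one line of plain text."""
    return f"{record.findtext('title')} by {record.findtext('author')}, {record.findtext('price')}"

def as_html_row(record):
    """Second view: one HTML table row, one cell per child element."""
    cells = "".join(f"<td>{escape(child.text)}</td>" for child in record)
    return f"<tr>{cells}</tr>"

text_view = as_plain_text(book)
html_view = as_html_row(book)
print(text_view)
print(html_view)
```

Both views are produced from the same XML data, so a change to the record shows up in every view without editing either presentation.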
DATA PRESENTATION MODIFICATION
You can change the look and feel of documents, or even entire Websites, with XSL Style Sheets without manipulating the data itself.
INTERNATIONALIZATION
Internationalization is important for electronic worldwide business applications. XML supports multilingual documents and the Unicode standard.
FUTURE-ORIENTED
XML is a Recommendation of the World Wide Web Consortium (W3C) and is supported by all leading software providers. Furthermore, XML is also a standard today in an increasing number of other industries, such as health care.
IMPROVED DATA SEARCHES
Tags, attributes, and element structure provide context information that can be used to interpret the meaning of content, opening up new possibilities for highly efficient search engines and intelligent data mining. An intelligent search engine for a body of XML-compliant markup languages would search both the content and the metadata, which would improve the accuracy of searches. This would increase the amount of relevant data that is accessible on a global basis.
ENABLES E-COMMERCE TRANSACTIONS
An ecommerce transaction requires instant cooperation between a host of agents involved in a single purchase. For example, a customer ordering an item from a supplier involves a number of transactions, including those with the customer (“B2C ecommerce”), businesses in a supply chain (“B2B ecommerce”), and banks (“B2B”), and between systems (“enterprise integration”). The initial reaction of most companies was to integrate these diverse operations by building or buying software that employed protocols, such as DCOM or CORBA, to perform such integrations. However, XML offers the option of performing the necessary integration by exchanging standardized data.
XML DOCUMENTS FORM A TREE STRUCTURE
XML documents must contain a root element. This element is the parent of all other elements. The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree. All elements can have sub-elements (child elements):
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have children. Children on the same level are called siblings (brothers or sisters). All elements can have text content and attributes (just like in HTML).
FIGURE 1.1 Tree structure of an XML document
The image above represents one book in the XML below:
<bookstore>
<book category="COOKING">
<title lang="en">Indian Food</title>
<author>Swati Jain</author>
<year>2011</year>
<price>200.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Dolls</title>
<author>J K Jain </author>
<year>2010</year>
<price>29.95</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>G.Ram</author>
<year>2009</year>
<price>13.95</price>
</book>
</bookstore>
The root element in the example is <bookstore>. All <book> elements in the document are contained within <bookstore>.
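The tree can also be explored in code. The following Python sketch parses the bookstore document with the standard library and walks from the root to its children; the variable names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

bookstore_xml = """<bookstore>
<book category="COOKING">
<title lang="en">Indian Food</title>
<author>Swati Jain</author>
<year>2011</year>
<price>200.00</price>
</book>
<book category="CHILDREN">
<title lang="en">Dolls</title>
<author>J K Jain</author>
<year>2010</year>
<price>29.95</price>
</book>
<book category="WEB">
<title lang="en">Learning XML</title>
<author>G.Ram</author>
<year>2009</year>
<price>13.95</price>
</book>
</bookstore>"""

root = ET.fromstring(bookstore_xml)

# The <book> elements are children of the root and siblings of each other.
books = root.findall("book")
titles = [book.findtext("title") for book in books]

# Attributes are read with get(); text content with findtext().
price_by_category = {book.get("category"): float(book.findtext("price")) for book in books}

print(root.tag, len(books))
print(titles)
print(price_by_category)
```

Each call moves one level down the tree: from the root to its book children, and from each book to its title and price.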
The syntax rules of XML are very simple and logical. The rules are easy to learn and easy to use.
ALL XML ELEMENTS MUST HAVE A CLOSING TAG
In HTML, some elements do not have to have a closing tag:
<p>This is a paragraph
<p>This is another paragraph
In XML, it is illegal to omit the closing tag. All elements must have a closing tag:
<p>This is a paragraph</p>
<p>This is another paragraph</p>
You might have noticed from the earlier examples that the XML declaration did not have a closing tag. This is not an error. The declaration is not an element, so it has no closing tag.
XML TAGS ARE CASE SENSITIVE
XML tags are case sensitive. The tag <Letter> is different from the tag <letter>. Opening and closing tags must be written with the same case:
<Message>This is incorrect</message>
<message>This is correct</message>
Opening and closing tags are often referred to as start and end tags. Use whatever terms you prefer.
XML ELEMENTS MUST BE PROPERLY NESTED
In HTML, you might see improperly nested elements:
<b><i>This text is bold and italic</b></i>
In XML, all elements must be properly nested within each other:
<b><i>This text is bold and italic</i></b>
In the example above, “properly nested” simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.
XML DOCUMENTS MUST HAVE A ROOT ELEMENT
XML documents must contain one element that is the parent of all other elements. This element is called the root element.
<root>
<child>
<subchild>.....</subchild>
</child>
</root>
XML ATTRIBUTE VALUES MUST BE QUOTED
XML elements can have attributes in name/value pairs just like in HTML.
In XML, the attribute values must always be quoted. Study the two XML documents below. The first one is incorrect, and the second is correct:
<note date=12/11/2019>
<to>Tonu</to>
<from>John</from>
</note>
<note date="12/11/2019">
<to>Tonu</to>
<from>John</from>
</note>
The error in the first document is that the date attribute in the note element is not quoted.
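The syntax rules in this section can be checked mechanically. The Python sketch below feeds the incorrect and correct samples from the text to the standard library parser and records which ones it accepts. The helper name check and the labels are assumptions made for this illustration, and the two-root sample is a hypothetical addition.

```python
import xml.etree.ElementTree as ET

def check(document):
    """Return 'ok' if the document is well formed, else 'error'."""
    try:
        ET.fromstring(document)
        return "ok"
    except ET.ParseError:
        return "error"

samples = {
    "missing closing tag": "<p>This is a paragraph",
    "mismatched case": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>This text is bold and italic</b></i>",
    "two root elements": "<p>one</p><p>two</p>",
    "unquoted attribute": "<note date=12/11/2019><to>Tonu</to><from>John</from></note>",
    "correct note": '<note date="12/11/2019"><to>Tonu</to><from>John</from></note>',
}

results = {label: check(text) for label, text in samples.items()}
for label, outcome in results.items():
    print(f"{label}: {outcome}")
```

Unlike an HTML browser, which tries to repair such mistakes, an XML parser stops at the first violation and reports an error.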
XML IS FREE
XML doesn’t cost anything to use. It can be written with a simple text editor or one of the many freely available XML authoring tools, such as XML Notepad. In addition, many Web development tools, such as Dreamweaver and Visual Studio .NET, have built-in XML support. There are also many free XML parsers, such as Microsoft’s MSXML (downloadable from microsoft.com) and Xerces (downloadable at apache.org).
XML TECHNOLOGY
The structured data is contained in an XML document, a text file with .xml as the extension. You can use CSS, as in HTML, to provide style sheets for XML data display. For more advanced features, power, and flexibility in presentation, you could use XSL (Extensible Stylesheet Language) to build the style sheets.
To enforce the structural constraints and rules on the data contained in an XML document, you could code a DTD (Document Type Definition). Due to certain limitations that were inherent in DTDs, the W3C came up with a specification to serve the same purpose as DTDs—the schemas. The schemas are contained in a .xsd file, and DTDs in a .dtd file. XML schema is an XML-based alternative to DTD.
FIGURE 1.2 XML Technology
XSD - XML Schema Definition
DTD - Document Type Definition
XSL - Extensible Stylesheet Language
USES
XML is widely used for the following purposes.
•Storing configuration information: typically data in an application that is not stored in a database. Most server software has configuration files in XML formats.
•Serving as a mini data store: the data in XML documents can be presented on a variety of targets, including browsers and print media.
•Transmitting data between applications: XML overcomes problems in client-server applications that are cross-platform in nature, for example a Windows program talking to a mainframe, little-endian and big-endian byte order problems, and data type size variations across platforms.
FIGURE 1.3 Variant uses of XML
When XML data is transferred across different systems, the data contained in an XML document can be read using a software entity called a parser. Most of the popular databases (Oracle, MS SQL Server, Sybase, and DB2) provide their own mechanisms to store and retrieve data as XML. Some of them also provide parsers to work with the XML documents programmatically. XML is a key technology when it comes to Web Services. .NET uses XML extensively. It is used as a data format for everything—configuration files, metadata, RPC, and object serialization.
SAMPLE XML DOCUMENT
The following is a sample section from a possible XML document. It is *not* a full XML document—we will discuss the structure of XML documents shortly and you will notice that we need a few extra lines to consider it to be a full document.
<employee>
<ident>3348498</ident>
<name>
<lastname>Peterson</lastname>
<firstname>Sam</firstname>
<title>Dr.</title>
</name>
<phonedetails>
<extension>8221</extension>
<companyprefix>700</companyprefix>
<regionprefix>1</regionprefix>
<intprefix>+353</intprefix>
</phonedetails>
<department>
<title>Software Development</title>
<depid>8</depid>
</department>
<location>
<building>Aston Quay</building>
<room>A142</room>
</location>
</employee>
While not necessarily the optimum structure for information such as above, it illustrates a major point of XML. The tags are defined by individuals, rather than some predefined standard structure. There are two different kinds of information in the above example:
•markup - such as <department> and <firstname>
•text/character data - such as “Peterson” and “+353”
XML documents mix markup and text together into a single file: the markup describes the structure of the document, while the text is the document’s content.
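The split between markup and character data can be made visible with a short Python sketch. It uses a shortened excerpt of the employee fragment above so that the example stays small; the list names are chosen for this illustration.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt of the employee fragment from the text.
employee_xml = """<employee>
<ident>3348498</ident>
<name>
<lastname>Peterson</lastname>
<firstname>Sam</firstname>
<title>Dr.</title>
</name>
<phonedetails>
<extension>8221</extension>
<intprefix>+353</intprefix>
</phonedetails>
</employee>"""

root = ET.fromstring(employee_xml)

# Markup: the element names, in document order.
markup = [element.tag for element in root.iter()]

# Character data: the non-blank text held inside the elements.
character_data = [
    element.text.strip()
    for element in root.iter()
    if element.text and element.text.strip()
]

print(markup)
print(character_data)
```

The first list is the structure that the author of the document defined; the second list is the content that the structure describes.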
XML IN THE PRACTICAL WORLD
Content Management
Almost all of the leading content management systems use XML in one way or another. A typical use would be to store a company’s marketing content in one or more XML documents. These XML documents could then be transformed for output on the Web, as Word documents, as PowerPoint slides, in plain text, or in audio format. The content can also easily be shared with partners who can then output the content in their own formats. Storing the content in XML makes it much easier to manage content for two reasons.
Content changes, additions, and deletions are made in a central location and the changes will cascade out to all formats of presentation. There is no need to be concerned about keeping the Word documents in sync with the Website, because the content itself is managed in one place and then transformed for each output medium.
Formatting changes are made in a central location. To illustrate, suppose a company had many marketing Web pages, all of which were produced from XML content being transformed to HTML. The format for all of these pages could be controlled from a single XSLT stylesheet, and a sitewide formatting change could be made by modifying that stylesheet.
Web Services
XML Web services are small applications or pieces of applications that are made accessible on the Internet using open standards based on XML. Web services generally consist of three components:
•SOAP: an XML-based protocol used to exchange Web service messages over the Internet.
•WSDL (Web Services Description Language): an XML-based language for describing a Web service and how to call it.
•UDDI (Universal Description, Discovery, and Integration): an XML-based registry for publishing and finding Web services.