XML Basics - S. Banzal - E-Book

XML Basics E-Book

S. Banzal

Description

This book focuses on essential XML standards relevant to almost all developers. It investigates XML technologies applicable across a wide range of applications, rather than those limited to specific domains. While XML is a markup language, it is widely used by programmers. The book also covers supporting technologies layered on top of XML, such as XLinks, XSLT, Namespaces, Schemas, XHTML, RDDL, XPointers, XPath, SAX, and DOM.
The journey begins with understanding XML and its syntax. It then explores Document Type Definitions (DTDs), Namespaces, and XHTML. Following this, the book delves into CSS Style Sheets, XML Schema Basics, XSL and XSLT, SOAP, DOM Programming Interface, SAX, XPath, XLink, XQuery, XPointer, XForms, XSL-FO, and using XML with Databases. The final chapters cover Web Services, providing a comprehensive understanding of how XML integrates into various applications.
Mastering these standards and technologies is crucial for developers working with XML. This book transitions readers from basic XML syntax to advanced applications, blending theoretical concepts with practical examples. It is an essential resource for developers looking to leverage XML in their projects.

You can read this e-book in Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Number of pages: 651

Year of publication: 2024




XML BASICS

LICENSE, DISCLAIMER OF LIABILITY, AND LIMITED WARRANTY

By purchasing or using this book (the “Work”), you agree that this license grants permission to use the contents contained herein, but does not give you the right of ownership to any of the textual content in the book or ownership to any of the information or products contained in it. This license does not permit uploading of the Work onto the Internet or on a network (of any kind) without the written consent of the Publisher. Duplication or dissemination of any text, code, simulations, images, etc. contained herein is limited to and subject to licensing terms for the respective products, and permission must be obtained from the Publisher or the owner of the content, etc., in order to reproduce or network any portion of the textual material (in any media) that is contained in the Work.

MERCURY LEARNING AND INFORMATION (“MLI” or “the Publisher”) and anyone involved in the creation, writing, or production of the companion disc, accompanying algorithms, code, or computer programs (“the software”), and any accompanying Web site or software of the Work, cannot and do not warrant the performance or results that might be obtained by using the contents of the Work. The author, developers, and the Publisher have used their best efforts to insure the accuracy and functionality of the textual material and/or programs contained in this package; we, however, make no warranty of any kind, express or implied, regarding the performance of these contents or programs. The Work is sold “as is” without warranty (except for defective materials used in manufacturing the book or due to faulty workmanship).

The author, developers, and the publisher of any accompanying content, and anyone involved in the composition, production, and manufacturing of this work will not be liable for damages of any kind arising out of the use of (or the inability to use) the algorithms, source code, computer programs, or textual material contained in this publication. This includes, but is not limited to, loss of revenue or profit, or other incidental, physical, or consequential damages arising out of the use of this Work.

The sole remedy in the event of a claim of any kind is expressly limited to replacement of the book, and only at the discretion of the Publisher. The use of “implied warranty” and certain “exclusions” vary from state to state, and might not apply to the purchaser of this product.

XML BASICS

SHASHI BANZAL

MERCURY LEARNING AND INFORMATION
Dulles, Virginia
Boston, Massachusetts
New Delhi

Copyright ©2020 by MERCURY LEARNING AND INFORMATION LLC. All rights reserved.
Reprinted and revised with permission.

Original title and copyright: Learning XML. Copyright ©2017 by University Science Press (an imprint of Laxmi Publications Pvt. Ltd.). All rights reserved.

This publication, portions of it, or any accompanying software may not be reproduced in any way, stored in a retrieval system of any type, or transmitted by any means, media, electronic display or mechanical display, including, but not limited to, photocopy, recording, Internet postings, or scanning, without prior permission in writing from the publisher.

Publisher: David Pallai
MERCURY LEARNING AND INFORMATION
22841 Quicksilver Drive
Dulles, VA [email protected]

S. Banzal. XML Basics. ISBN: 978-1-68392-546-0

The publisher recognizes and respects all marks used by companies, manufacturers, and developers as a means to distinguish their products. All brand names and product names mentioned in this book are trademarks or service marks of their respective companies. Any omission or misuse (of any kind) of service marks or trademarks, etc. is not an attempt to infringe on the property of others.

Library of Congress Control Number: 2020942355

20 21 22 3 2 1 Printed on acid-free paper in the United States of America.

Our titles are available for adoption, license, or bulk purchase by institutions, corporations, etc. For additional information, please contact the Customer Service Dept. at 800-232-0223 (toll free).

All of our titles are available in digital format at www.academiccourseware.com and other digital vendors. The sole obligation of MERCURY LEARNING AND INFORMATION to the purchaser is to replace the book, based on defective materials or faulty workmanship, but not based on the operation or functionality of the product.

CONTENTS

Preface

Chapter 1:

Understanding XML

Markup Languages

Specific Markup Languages

Generalized Markup Language

SGML - A Metalanguage

Why is XML so Adaptable?

XML Over SGML

Introduction to XML

Extensible

Markup

Language

History of XML

HTML and XML

XML Structure

Logical Structure

XML Declaration

XML Syntax

How Do I Structure My XML Documents?

Need for XML-Based Languages

XML Benefits

XML Disadvantages

Lack of Application Processing

General Weaknesses of XML

XML and Unicode Disadvantages

Characteristics of an XML Document

Open and Extensible

Application Independence

Data Format Integration

One Data Source, Multiple Views

Data Presentation Modification

Internationalization

Future-Oriented

Improved Data Searches

Enables E-Commerce Transactions

XML Documents form a Tree Structure

All XML Elements Must have a Closing Tag

XML Tags are Case Sensitive

XML Elements Must be Properly Nested

XML Documents Must have a Root Element

XML Attribute Values Must be Quoted

XML is Free

XML Technology

Uses

Sample XML Document

XML in Practical World

Property Inheritance

Combining Stylesheets

Questions for Discussion

Chapter 2:

XML Syntax

The Well-Formed Document

XML Document Structure

Prolog Section

The Standalone Attribute

The Encoding Attribute

Instance Section

Elements

Character Data

CDATA

Comment

Processing Instruction

Entities

General Entities

Parameter Entities

Entity References

Attributes

Entities’ References and Constants

Unparsed Data

Character Data (CDATA)

Processing Instructions (PIs)

Questions for Discussion

Chapter 3:

Document Type Definition (DTD)

Physical Structure in XML

Parsed and Unparsed Entities

Predefined Entities

Internal and External Entity

XML General Syntax

Attributes

Valid Documents

Well-Formed Documents

Well-Formed XML Documents

XML Documents

The XML Declaration

Processing Instructions

Comments

Document Type Declaration

XML Application Classification

Parsers

XML Processing-Attribute Values

XML Processing

Event-Driven Parsers

Tree-Based Parsers

XML Parser

Parse an XML Document

Parse an XML String

Document Type Definitions (DTDs)

Example DTD

DTD <!DOCTYPE>

DOCTYPE Syntax

XML Syntax Rules

DTDs (Well-Formed vs. Valid)

General Principles in Writing DTDs

Document Validation

Validating an XML Document with a DTD

The Purpose of DTDs

Creating DTDs

Code Sample: DTDs/Demos/Beatles.DTD

Internal DTD

Example Internal DTD

External DTD

Example External DTD

Combined DTD

DTD Elements

Basic Syntax

Plain Text

Unrestricted Elements

Empty Elements

Child Elements

Other Elements

Choice of Elements

Empty Elements

Mixed Content

Multiple Child Elements (Sequences)

An XML Application without a DTD

DTD Element Operators

DTD Operators with Sequences

Subsequences

The Document Element

Location of Modifier

Using Parentheses for Complex Declarations

XML CDATA

PCDATA-Parsed Character Data

CDATA-(Unparsed) Character Data

Notes on CDATA Sections

Internal & External Subsets

Standalone Attribute

DOCTYPE Declaration

Internal DTD Subset Declarations

External DTDs

Basic Markup Declarations

Formal DTD Structure-Entities

Predefined Entities

General Entities

Parameter Entities

Formal DTD Structure-Elements

Content Model

Cardinality Operators

Attributes

Default Values

Attribute Types

CDATA

ID

IDREF

Entity

Entity, Entities

NMTOKEN, NMTOKENS

Notation

Enumerations

Declaring Attributes

Conditional Sections

Limitations of DTDs

Designing XML Documents

XML for Messages

XML for Persistent Data

Mapping the Information Model to XML

A Document Type Declaration

Elements

Empty Elements

Attributes

CDATA

White Space

Special Characters

Questions for Discussion

Chapter 4:

Namespaces

Namespaces

Purpose of Namespaces

Declaring a Namespace

Scope

Qualified

XML Namespace

Example Namespace

XML Local Namespace

Example Local Namespace

Multiple Namespaces

XML Default Namespace

Understanding Namespaces

Naming Namespaces

Declaring and Using Namespaces

Default Namespaces

Explicit Namespaces

XML Namespaces

Name Conflicts

Solving the Name Conflict Using a Prefix

Locally Declared Elements and Attributes

Using Multiple Namespaces

Uniform Resource Identifier (URI)

Default Namespaces

Namespaces in Real Use

Questions for Discussion

Chapter 5:

Introduction to XHTML

A Quick History of HTML

XML Over HTML

Getting Multilingual with XML

The Convergence of HTML and XML

Add HTML to XML Data

Differences Between XHTML and HTML

XHTML

Benefits of XHTML

XHTML Coding

XML Declaration

XHTML DTDs

The DOCTYPE Declaration

XHTML Strict

XHTML Transitional

XHTML Frameset

The Document Element

A Sample XHTML Document

Document Formation

XHTML Tags

Questions for Discussion

Chapter 6:

CSS Style Sheets

CSS Documents

XML and CSS

Limitations of CSS for Complex Applications

Advantages of Authoring XML Documents with CSS

Authoring Approaches

Authoring XML Documents with CSS

Associating CSS Stylesheets with XML

Rendering XML Documents with CSS

CSS Syntax

CSS Example

CSS Comments

CSS Selectors

Embedding CSS in Web Page

CSS Styles

Displaying XML with CSS

XSL Transformation

Using XSL to Present XML Documents

XSL Patterns

XML Styles (Revisited)

Questions for Discussion

Chapter 7:

XML Schema Basics

XML Schema

Role of a Schema

DTD as a Schema

Schema Languages and Notations

The Purpose of XML Schema

The Power of XML Schema

A First Look

A Simple XML Schema

Schema as a Set of Constraints

Schema as an Explanation

DTD vs XML Schema

Structures

Preamble

Sample Preamble

Attributes and Attribute Groups

Content Models

Element Declaration

Derivation

Data Types

Primitive Types

Generated and User Defined Types

Hyperlinks

Links

Linking and Querying

XML Information Set

Link Elements

Locators

XLinks

Simple Links

Extended Links

Extended Link Groups

Validating an XML Instance Document

Simple-Type Elements

Built-in Simple Types

19 Primitive Data Types

Built-in Derived Data Types

Defining a Simple-Type Element

User-Derived Simple Types

Controlling Length

Specifying Patterns

Working with Numbers

Mins and Maxs

Number of Digits

Enumerations

Whitespace Handling

Specifying Element Type Locally

Nonatomic Types

Lists

Unions

Declaring Global Simple-Type Elements

Global vs. Local Simple-Type Elements

Default Values

Fixed Values

Nil Values

Complex-Type Elements

Content Models

Complex Model Groups

Occurrence Constraints

Declaring Global Complex-Type Elements

Mixed Content

Defining Complex Types Globally

Empty Elements

Adding Attributes to Elements with Complex Content

Adding Attributes to Elements with Simple Content

Restricting Attribute Values

Default and Fixed Values

Fixed Values

Requiring Attributes

Groups

Extending Complex Types

Abstract Types

XML Schema Keys

Keys

Annotating XML Schemas

Annotating a Schema

XSD Indicators

But This is No Longer Valid

Create an XML Schema

XSD Date and Time Data Types

XML Editors

Questions for Discussion

Chapter 8:

XSL Basics

Introduction to XSL

An XML Syntax

An XSL Processor

The XSL Templates

Location Paths

Template Ordering

Axes

Repetitions and Sortings in XSL

XSL Sorting

Uppercase and Lowercase Sorting

XSL Conditional Processing

Number Generation and Formatting in XSL

Formatting Multilevel Numbers

Numeric Calculation in XSL

Ceiling, Floor, and Round

String Function

XSL String Functions

Concatenation

XSL Output Element

HTML Output Method

Text Output Method

Copy and Copy-of Constructs in XSL

Use-Attribute-Sets Attribute

Miscellaneous Additional Functions

Combining XSL

Importing Stylesheets

Apply-Import Function

Questions for Discussion

Chapter 9:

XSLT Basics

XSLT (Extensible Stylesheet Language)

XSLT Sample Program

The Transformation Process

Processing a Transformation

Applying XSLT to an XML Document

XSLT Syntax

XML Version

XSL Root Element

Selecting the Root Node

Usage Example

XSLT <value-of> Element

Usage Example

XSLT <for-each> Element

<xsl:for-each> Example

Result

Before

After

XSLT <if> Element

The Source File

The Solution

The Source File

The Solution

Questions for Discussion

Chapter 10:

SOAP

SOAP

Communication Over Distributed Systems

Remote Procedure Call (RPC)

SOAP Syntax

SOAP Message Structure

The SOAP Envelope Element

The SOAP Header Element

The SOAP Body Element

The SOAP Fault Element

The HTTP Protocol

SOAP HTTP Binding

Content-Type

Content-Length

A SOAP Example

Transport Methods in SOAP

SOAP and the Request/Response Model

HTTP Headers and SOAP

Request Headers

Response Headers

Sending Messages Using M-Post

A Schema for the Body Content of the SOAP Message

SOAP Encoding

Encoding Style Attribute

Questions for Discussion

Chapter 11:

DOM Programming Interface

DOM (Document Object Model)

XML DOM Tree

High Level Architecture of a DOM/XML Application

DOM Implementation

The DOM Specification

XML DOM Nodes

XML DOM Node Tree

First Child - Last Child

DOM Level 2 Specification

XML Document Structure

Working with DOM

Client Side and Server Side DOM

XML DOM Parser

XML Parser

Load an XML Document

Questions for Discussion

Chapter 12:

SAX (Simple API for XML)

Introduction to SAX

SAX (Simple API for XML)

DOM and Tree-Based Processing

Pros and Cons of Tree-Based Processing

How to Choose Between SAX and DOM

The SAX API is Defined in 4 Interfaces Under the org.xml.sax Package

SAX Sample Program

Three Steps to SAX

Creating the SAX Parser

The Sample File

SAX Interface Java Example

SAX Parsing Pattern Example

Questions for Discussion

Chapter 13:

XPath

XPath Introduction

XPath Syntax

The XML Example Document

Navigating a Document with XPath Patterns

Referencing Nodes

XPath (XML Path) Language

Data Types, Literals, and Variables

XPath Operators

Evaluation Context

Built-in Functions

Using XPath Functions

Node Functions

String Functions

Boolean Functions

Number Functions

The Role of XPath

Using XPath in XSLT Templates

XPath Location Path

Location Path Example

XPath Location Step

XPath Location Path – Absolute

Example of an Absolute Location Path

Selecting Nodes

Predicates

Selecting Unknown Nodes

Selecting Several Paths

The Root Node

XPath Location Path – Relative

Example of a Relative Location Path

Children

The Wildcard

XPath Attributes

XPath – Expressions

XPath—Our Sample XML File

A Simple XPath Expression

Questions for Discussion

Chapter 14:

XLink, XQuery, and XPointer

Introduction to XQuery

XQuery Example

XQuery Syntax

XQuery Basic Syntax Rules

XQuery Selecting and Filtering Elements

XQuery Functions

XQuery User-Defined Functions

XLink and XPointer Introduction

XLink and XPointer Syntax

HTML, XML, and Linking

Linking with XLink

XLink Example

The XML Example Document

Understanding XLink Attributes

Creating Links with XLink

XPointer Syntax

Addressing with XPointer

Building XPointer Expressions

Creating XPointers

XPointer Example

The Linking XML Document

XPointer Example

The Linking XML Document

Questions for Discussion

Chapter 15:

XForms

Introduction to XForms

Features of XForms

Parts of XForms

The Form Controls

The Form Controls Listed

The XForms Processor

The XForms Namespace

XForms and XPath

XForms Properties

XForms Actions

Questions for Discussion

Chapter 16:

XSL-FO

Introduction to XSL-FO

XSL-FO Documents

XSL-FO Document Structure

Font and Text Attributes

XSL-FO Areas

XSL-FO Output

Page Layout

XSL-FO Blocks

Styling Text in XSL-FO

Controlling Spacing and Borders

More Complex Structures

Tables

XSL-FO Objects

Graphics

XSL-FO Processors

XSL-FO Software

XSL-FO and XSLT

Questions for Discussion

Chapter 17:

XML with Databases

Introduction

XML Documents as Databases

Why Use a Database?

Data versus Documents

Data-Centric Documents

Document-Centric Documents

Data, Documents, and Databases

Storing and Retrieving Data

Mapping Document Schemas to Database Schemas

Relational Database Primer

The World’s Shortest Guide to SQL

Retrieving Records Using Select

Inserting Records

Updating Records

Deleting Records

Databases and XML

Resolving XML Data into Database Tables

Storing XML Documents in a Database

Exporting an XML Document from a Database

Accessing Data from a Database as XML

Questions for Discussion

Chapter 18:

Web Services

Web Services

The Web Services Platform

Web Services Platform Elements

Types of Web Services

Web Service Architectures

Web Services Example

How to Use Web Services

SOAP

WSDL and UDDI

UDDI Benefits

How Can UDDI be Used

Questions for Discussion

Appendix A:

XML Basics

Appendix B:

Well Formed XML Documents

Appendix C:

XML Overview

Glossary

Index

PREFACE

This book focuses on standards that are relevant to almost all developers working with XML. We investigate XML technologies that span a wide range of XML applications, not just those that are relevant only within a few restricted domains. XML is not a programming language. It is a markup language; but it is successfully used by many programmers. The book also covers generic supporting technologies that have been layered on top of XML and are used across a wide range of XML applications. These technologies include XLinks, XSLT, Namespaces, Schemas, XHTML, RDDL, XPointers, XPath, SAX, and DOM.

S. BANZAL
August 2020

CHAPTER 1

UNDERSTANDING XML

MARKUP LANGUAGES

The term Markup is a concatenation of the words “mark up.” This refers to the traditional way of marking up a document in the print and design worlds.

Markup is used to modify the look and formatting of text or to establish the structure and meaning of the document for output to some medium, such as a printer or the World Wide Web. Markup consists of codes, or tags, that are added to text to change the look or meaning of the tagged text. The tagged text for a document is usually called the source code for that document. Most word processors use some sort of markup language to produce formatted text. There are two types of markup languages: Specific Markup Languages and Generalized Markup Languages.

SPECIFIC MARKUP LANGUAGES

Specific markup languages were developed for specific purposes and cannot readily be used for anything other than the purpose for which they were developed. Hypertext Markup Language (HTML), for example, was designed for simplicity and has a flexible structure. It allows text and graphics to be displayed in any Web browser.

Many markup languages have served quite well as document formatting tools for print or for the Web. However, they do not perform well at describing the data they contain or at providing contextual information for that data. For example, HTML describes how text should be formatted, but conveys nothing about the kind of data the document contains.

When using specific markup languages, the authors are limited to a particular set of tags. If a set of tags does not meet a need, authors must find an alternative way to meet those needs. A document might not be portable to other applications, as the data is not self-describing. It cannot be used for any other purpose than that for which it was originally intended. The language probably has a proprietary way of marking up text that is not compatible with other markup languages. This can create confusion and additional work for authors who must use several languages to accommodate different applications.

GENERALIZED MARKUP LANGUAGE

In the 1970s, Dr. C. F. Goldfarb and two of his colleagues proposed a method of describing text that was not specific to an application or a device. The method had two suggestions:

•The markup should describe the structure of a document and not its formatting or style characteristics.

•The syntax of the markup should be strictly enforced so that the code can clearly be read by a software program or by a human being.

The result of these suggestions was the Standard Generalized Markup Language (SGML), which was adopted as a standard by the International Organization for Standardization (ISO) in 1986.
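XML carries the second suggestion forward: a conforming parser rejects any document whose syntax is not strictly correct. The following is a minimal sketch of that behavior, assuming Python's standard xml.etree.ElementTree module; the helper name is_well_formed and the sample strings are illustrative only and do not come from the book.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample documents: one correct, two with syntax errors.
samples = {
    "properly closed": "<car><color>Red</color></car>",
    "missing end tag": "<car><color>Red</car>",
    "mismatched case": "<car><Color>Red</color></car>",
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Because the rules are enforced mechanically, a program can decide whether a document is acceptable without guessing at the author's intent.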

SGML - A METALANGUAGE

SGML has added provisions for identifying the characters to be used in a document. This makes it easier to ensure that a processor can understand everything in a document by allowing a document to specify the character set that it uses.

SGML provides a way to identify objects that will be used throughout a document. These objects, called entities, are convenient to use when a text fragment or any other data appears in several places in a document. If an entity is declared in one place of the document, any changes to that declaration will be reflected in all occurrences of the entity throughout the document.
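XML inherited this entity mechanism from SGML. As a rough sketch, the example below declares an entity once and references it twice, then reads the expanded text with Python's standard xml.etree.ElementTree module. The entity name company, its value, and the memo vocabulary are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# The entity "company" is declared once in the internal DTD subset
# and referenced in two places in the document body.
DOCUMENT = """<!DOCTYPE memo [
  <!ENTITY company "Example Motors Ltd.">
]>
<memo>
  <from>&company;</from>
  <body>Welcome to &company;!</body>
</memo>"""

memo = ET.fromstring(DOCUMENT)

# The parser replaces each reference with the declared text.
sender = memo.findtext("from")
body = memo.findtext("body")
print(sender)
print(body)
```

Changing the text in the single ENTITY declaration changes every place the entity is referenced.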

SGML – Example

<!DOCTYPE CARS PUBLIC "//EXT/DTD CATALOG//EN">

<CAR>

<COLOR> Red

<PRICE> $20,000

</CAR>

The code snippet shown is an example of an SGML document. We can see that its markup closely resembles that of an HTML document. These similarities exist because HTML is an application of SGML; HTML was created using SGML standards. The main difference between SGML and HTML is that SGML is extensible, which means that it allows an author to define a particular structure by defining the parts that fit that structure. HTML is not extensible, which means that HTML cannot be used to create another markup language with its own rules and purposes.

WHY IS XML SO ADAPTABLE?

If XML is a new generation, then SGML is its mother. SGML is likely one of the most adaptable languages of all time, allowing the use of constructs that even XML won’t allow. Unfortunately, SGML is more complex and not as universally supported as XML, so the use of SGML instead of XML isn’t really recommended.

XML has inherited many of the key features of SGML, however, and puts them to good use; in many cases, the ways that it differs from its predecessor are inconsequential. While you may occasionally run across strange circumstances that would work better with SGML, it’s best to focus on XML since that’s where most of the support and interest lies.

XML OVER SGML

Even though XML is a subset of Standard Generalized Markup Language (SGML), XML is optimized for use on the World Wide Web. XML is designed in such a way that it has some benefits that are not found in SGML. XML is a smaller language than SGML. The designers of XML removed some specifications in SGML that were not needed for Web delivery.

XML includes a specification for a hyperlinking scheme, which is described as a separate language called eXtensible Linking Language (XLL) and was later standardized as XLink and XPointer. XML supports the basic hyperlinking found in HTML as well as extended linking. XML also has an associated style sheet language called eXtensible Stylesheet Language (XSL). XSL provides support for a style sheet mechanism, which allows an author to create a template of various styles.

XML documents are self-describing documents. That is, each document contains a set of rules to which its data must conform. Since the same set of rules can be reused in another document, other authors can easily create the same class of document, if necessary.

XML can be used as the data interchange format. Many legacy systems can contain data in disparate forms, and developers are doing a lot of work to connect these legacy systems using the Internet. Since the XML text format is standards-based, data can be converted to XML and then easily read by another system or application.
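The interchange idea can be sketched in a few lines. The example below assumes a legacy system that exports comma-separated text; it converts that export to XML and then reads it back as a second system might. The column names, the customers vocabulary, and the function name csv_to_xml are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical export from a legacy system.
LEGACY_EXPORT = """id,name,balance
1001,Asha Rao,250.75
1002,Luis Ortega,-40.00
"""


def csv_to_xml(csv_text: str) -> ET.Element:
    """Convert each CSV row into a <customer> element."""
    root = ET.Element("customers")
    for row in csv.DictReader(io.StringIO(csv_text)):
        customer = ET.SubElement(root, "customer", id=row["id"])
        ET.SubElement(customer, "name").text = row["name"]
        ET.SubElement(customer, "balance").text = row["balance"]
    return root


# Sending side: serialize the converted data as XML text.
xml_text = ET.tostring(csv_to_xml(LEGACY_EXPORT), encoding="unicode")

# Receiving side: any XML-aware system can parse the same text.
received = ET.fromstring(xml_text)
balances = {
    c.get("id"): float(c.findtext("balance"))
    for c in received.iter("customer")
}
print(balances)
```

The receiving code needs to know only the element names, not the layout of the original export.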

XML can be used for Web data. For example, the content is stored in an XML file and the HTML page is used simply for formatting and display. So, the content can be updated and translated into another language without modifying anything in the HTML code.
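A minimal sketch of this separation follows, assuming a small page vocabulary and a fixed HTML template, both invented for illustration. The same template renders an English and a German content file, so translating the content does not touch the HTML.

```python
import xml.etree.ElementTree as ET
from html import escape

# Content lives in XML; one document per language (hypothetical data).
CONTENT_EN = "<page><title>Welcome</title><intro>Browse our catalog.</intro></page>"
CONTENT_DE = "<page><title>Willkommen</title><intro>Stöbern Sie in unserem Katalog.</intro></page>"

# The HTML is used only for formatting and display.
HTML_TEMPLATE = "<html><body><h1>{title}</h1><p>{intro}</p></body></html>"


def render(content_xml: str) -> str:
    """Fill the fixed HTML template with text taken from the XML content."""
    page = ET.fromstring(content_xml)
    return HTML_TEMPLATE.format(
        title=escape(page.findtext("title")),
        intro=escape(page.findtext("intro")),
    )


print(render(CONTENT_EN))
print(render(CONTENT_DE))
```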

INTRODUCTION TO XML

XML (eXtensible Markup Language) was invented for the purpose of having a standard and powerful way of describing any kind of data. XML offers a widely adopted standard way of representing text and data in a format that can be processed without much human or machine intelligence. Information formatted in XML can be exchanged across platforms, languages, and applications, and can be used with a wide range of development tools and utilities.

XML is a meta-language; that is, it is a language in which other languages are created. In XML, data is “marked up” with tags similar to HTML tags. In fact, XHTML, a reformulation of HTML, is an XML-based language, which means that XHTML follows the syntax rules of XML.

XML is used to store data or information. This data might be intended to be read by people or by machines. It can be highly structured data, such as data typically stored in databases or spreadsheets, or loosely structured data, such as data stored in letters or manuals.
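The difference between the two kinds of data can be illustrated with a short sketch. The employee record and the letter below are invented examples: the first is a regular set of fields, while the second is running text with only a few pieces marked up.

```python
import xml.etree.ElementTree as ET

# Highly structured: every child is a named field, as in a database row.
RECORD = "<employee><name>Mei Tan</name><dept>Sales</dept><salary>52000</salary></employee>"

# Loosely structured: prose with a few tagged items inside it.
LETTER = "<letter>Dear <name>Mei</name>, your order ships on <date>2024-05-01</date>. Thank you!</letter>"

record = ET.fromstring(RECORD)
fields = {child.tag: child.text for child in record}

letter = ET.fromstring(LETTER)
full_text = "".join(letter.itertext())                # the prose a person reads
marked_up = {child.tag: child.text for child in letter}  # the parts a program can pick out

print(fields)
print(full_text)
print(marked_up)
```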

XML is all about preserving useful information—information that computers can use to be more intelligent about what they do with our data. The best part of XML is that it liberates information from the shackles of a fixed-tag set.

XML provides a standard approach for describing, capturing, processing, and publishing information. It is a language that has significant benefits over HTML.

Unlike most markup languages, XML is a flexible framework in which you can create your own customized markup languages. All XML-based languages share the same look and feel, and they share a common basic syntax. The essence of XML is in its name: Extensible Markup Language.

•Markup – It is a collection of tags.

•XML Tags – Identify the content of the data

•Extensible – User-defined tags

EXTENSIBLE

XML is extensible. It lets you define your own tags, the order in which they occur, and how they should be processed or displayed. Another way to think about extensibility is to consider that XML allows us to extend our notion of what a document is: it can be a file that lives on a file server, or it can be a transient piece of data that flows between two computer systems (as in the case of Web Services).
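As an illustration of user-defined tags, the sketch below builds a document from a made-up recipe vocabulary using Python's standard xml.etree.ElementTree module, serializes it, and parses it again. None of the tag or attribute names are predefined by XML; they are assumptions chosen for this example.

```python
import xml.etree.ElementTree as ET

# Tags such as <recipe>, <ingredient>, and <step> are our own invention.
recipe = ET.Element("recipe", name="Masala Chai")
ET.SubElement(recipe, "ingredient", quantity="2", unit="cup").text = "water"
ET.SubElement(recipe, "ingredient", quantity="1", unit="tsp").text = "tea leaves"
ET.SubElement(recipe, "step").text = "Boil the water, add the tea leaves, and simmer."

# Serialize to text; this could be saved to a file or sent to another system.
xml_text = ET.tostring(recipe, encoding="unicode")
print(xml_text)

# A generic XML parser can read the document back without knowing the vocabulary.
parsed = ET.fromstring(xml_text)
ingredients = [
    (item.text, item.get("quantity"), item.get("unit"))
    for item in parsed.findall("ingredient")
]
print(ingredients)
```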

MARKUP

The most recognizable feature of XML is its tags, or elements (to be more accurate). In fact, the elements you’ll create in XML will be very similar to the elements you’ve already been creating in your HTML documents. However, XML allows you to define your own set of tags.

LANGUAGE

XML is a language that’s very similar to HTML. It’s much more flexible than HTML because it allows you to create your own custom tags. However, it’s important to realize that XML is not just a language. XML is a meta-language: a language that allows us to create or define other languages. For example, with XML we can create other languages, such as RSS, MathML (a mathematical markup language), and even tools like XSLT.
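Because languages such as RSS are defined in XML, a general-purpose XML parser can read them without any RSS-specific software. The sketch below assumes a tiny hand-written RSS 2.0 feed; the feed contents and the example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# A hypothetical, minimal RSS 2.0 feed.
FEED = """<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://example.com/</link>
    <description>Illustrative feed</description>
    <item><title>First post</title><link>https://example.com/1</link></item>
    <item><title>Second post</title><link>https://example.com/2</link></item>
  </channel>
</rss>"""

feed = ET.fromstring(FEED)

# RSS defines what the tags mean; XML defines how they are written and parsed.
channel_title = feed.findtext("channel/title")
headlines = [item.findtext("title") for item in feed.iter("item")]

print(channel_title)
print(headlines)
```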

HISTORY OF XML

In the late 1960s, IBM developed the Generalized Markup Language (GML). SGML (Standard Generalized Markup Language) grew out of GML and became an ISO standard in 1986. SGML is a semantic and structural language for text documents, but it is very complicated. HTML is an application of SGML.

In 1996, the XML Working Group was formed under the W3C. The World Wide Web Consortium (W3C) is an international consortium in which member organizations, a full-time staff, and the public work together to develop Web standards. The W3C was founded in 1994 by Tim Berners-Lee, who invented the World Wide Web in 1989. In 1998, the W3C published XML 1.0.

XML (Extensible Markup Language) is a dialect of SGML. XML is not a programming language. Rather, it is a set of rules that allows you to represent data in a structured manner. Since the rules are standard, the XML documents can be automatically generated and processed.

XML was designed to describe data and is a cross-platform, software- and hardware-independent tool for transmitting or exchanging information. It is an open-standards-based technology which is both human and machine readable. XML is best suited to classes of documents that share a common structure. In future Web development, it is most likely that XML will be used to describe the data, while HTML will be used to format and display the same data. The XML specification includes the syntax and grammar of XML documents as well as of DTDs.

Website creation is a fast-growing sector. In the early days, Website design consisted primarily of creating fancy graphics and nice-looking, easy-to-read Web pages.

As today’s Websites are interactive, the steps in Website design have changed. Although creating a pleasant-looking Website is still important, the primary focus has shifted from graphical design to programmatic design.

Consider a company wanting to sell its product on the Web. In such cases, the Webpages will collect and store a user’s billing information. This calls for storing and manipulating such data in a database. This is where XML comes into the picture.

XML offers a standard way to structure and exchange the data behind such database-driven Webpages.

HTML AND XML

HTML and XML were designed with different purposes in mind. XML is similar to HTML—they are both closely related to the SGML markup definition language that has been an ISO standard since 1986. SGML is an early attempt to combine the metadata (data about the data) with the data and it was used primarily in large document management systems. Because SGML is a very complex language, it has limited mass appeal.

HTML is the most recognized application of SGML and it allows any Web browser or application which understands HTML to display information in a consistent form. A HTML document is effective when it comes to laying out and displaying data, but it is a fixed set of tags, and it does not have the flexibility to describe different document and data types. HTML, in conjunction with Cascading Style Sheets (CSS), is reasonably good at displaying data, but it is not as good as XML at transporting data that is meant to be viewed or parsed in dozens of different ways by a variety of devices. In essence, where HTML is a presentation language, we require a richer communication means that can help with exchanging information from one computer to another.

The need to extract data and put a structure around information led to the creation of XML. Since XML 1.0 became a W3C Recommendation in 1998, its use has grown rapidly. There are two fundamental differences between HTML and XML:

•Separation of form and content—HTML mostly consists of tags defining the appearance of text; in XML, the tags generally define the structure and content of the data, with the actual appearance specified by a specific application or associated stylesheet.

•XML is extensible—tags can be defined by individuals or organisations for some specific application, whereas the HTML standard tagset is defined by the World Wide Web Consortium (W3C).

XML is not intended as a replacement for HTML; the two are complementary technologies. For sharing data on the Web, XML is a more general solution than extending HTML.

XML STRUCTURE

One of XML’s best features is its ability to provide structure to a document. Every XML document includes both a logical and a physical structure. The logical structure is like a template that details the elements to be included in a document and the order in which they have to be included. The physical structure contains the actual data used in a document.

LOGICAL STRUCTURE

Logical structure refers to the organization of the different parts of a document. It indicates how a document is built, as opposed to what a document contains. The first structural part of an XML document is the optional prolog, which is the base for the document’s logical structure. The prolog consists of two basic components, the XML declaration and the document type declaration, and each of these is also optional.

XML DECLARATION

The XML declaration identifies the version of the XML specification to which the document conforms. Although the XML declaration is optional, we should always include it in the XML document.

The code snippet here gives an example of a basic XML declaration. The keywords xml and version must be written in lowercase letters.

<?xml version="1.0"?>

An XML declaration can also contain an encoding declaration and a standalone document declaration.

The encoding declaration identifies the character-encoding scheme, such as UTF-8 or EUC-JP. Different encoding schemes map characters to bytes in different ways. For example, UTF-8, the default scheme, can represent every Unicode character and stores each of the letters used in English text in a single byte.
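
The effect of the encoding declaration can be sketched in a few lines of Python. The example below is only an illustration: it uses the standard library module xml.etree.ElementTree, and the function name read_root_text and the sample content are made up for this purpose. The same element is stored once with an ISO-8859-1 declaration and once with no encoding declaration, so that the UTF-8 default applies.

```python
import xml.etree.ElementTree as ET


def read_root_text(xml_bytes: bytes) -> str:
    """Parse raw XML bytes and return the text of the root element."""
    # The parser uses the encoding declaration (if any) to decode the bytes.
    return ET.fromstring(xml_bytes).text


# The letter e with an acute accent is the single byte 0xE9 in ISO-8859-1 ...
latin1_doc = b'<?xml version="1.0" encoding="ISO-8859-1"?><name>Caf\xe9</name>'
# ... and the two bytes 0xC3 0xA9 in UTF-8, the default when nothing is declared.
utf8_doc = b'<?xml version="1.0"?><name>Caf\xc3\xa9</name>'

# Different bytes on disk, same characters after parsing.
print(read_root_text(latin1_doc) == read_root_text(utf8_doc))
```

The two byte strings differ, yet both yield the same text, because the parser decodes each one according to its declared (or default) encoding.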

XML SYNTAX

The first thing that you’ll need to do is open up your text editor of choice and type the XML declaration. At this point, your document is going to look something like this (if you’re using XML version 1.0):

<?xml version="1.0"?>

Once you’ve typed your declaration, it’s time to start adding some content to the page. Information in an XML document is handled in a very precise and structured format, using tags to define your data. White space can be included in the document to make it more easily readable, though you should be careful not to put white space inside a tag name or between the opening angle bracket and the name, as the document will then fail to parse.

Let’s say that you’ve decided to create a new XML document to tell the world about your two favorite cats. You want to use the tag <cats>. Your document now looks a little something like this:

<?xml version="1.0"?>

<cats>Tooter and Shade are the best cats in the world!</cats>

Note the white space between the declaration and the first tag. You could also have put each of the tags on its own line, with the content between them, as long as you don’t add white space within the tag names.

Of course, the <cats> tags don’t do anything yet. If you load this page into a Web browser, you’ll end up with more or less a copy of the file contents displayed on the screen, with the tags in some pretty colors. You’ll have to tell the browser how to present the tags, which can be done in one of four ways:

•Using Cascading Style Sheets (CSS)

•Using Extensible Stylesheet Language (XSL) Style Sheets

•Using a Data Island plus Script

•Using the Document Object Model (DOM) plus Script or Client-Side Program

All of this might sound complicated, but it’s really not. It does involve creating and referencing other pages, though for now we’re still working on just the basic structure of XML. Save the document (in Text-Only mode) under the name cats.xml (making sure to use the .xml extension).

HOW DO I STRUCTURE MY XML DOCUMENTS?

Structure in an XML document is very important. Small errors in the structure of your document can have large effects on the overall outcome; pieces may not be displayed correctly, or might not appear at all. If the structure is too damaged, then the entire document might fail to work.

As previously mentioned, all XML documents begin with the XML declaration. Open up the previously saved file, cats.xml, and you’ll find your declaration already in place.

<?xml version="1.0"?>

<cats>Tooter and Shade are the best cats in the world!</cats>

Unfortunately, your file is still missing a few vital elements. The <cats> tags don’t do anything, because the browser has no idea how to present them. If you load the file in a browser, you’ll just see a copy of it, with the various elements in different colors. This is actually useful, however: as long as you see this, your document is well-formed. The browser doesn’t know what else to do with it, because no style information is attached, but the absence of a parsing error tells you that the markup itself is correct.

Go into the file, between your declaration and the content, and get ready to add another vital element to your page. Type the following:

<?xml-stylesheet type="text/css" href="cats.css"?>

Of course, this doesn’t mean much to you right now. In time, though, it’s going to be a vital part of your page. What you just typed is a processing instruction that tells the browser where to find the style sheet, the file that says how the information in the XML document should be presented. The line tells the browser to find the file called cats.css, and that the file is a Cascading Style Sheet to be applied to this page. Now your cats.xml file should look like the following, which looks a lot more like an XML file.

<?xml version="1.0"?>

<?xml-stylesheet type="text/css" href="cats.css"?>

<cats>Tooter and Shade are the best cats in the world!</cats>

NEED FOR XML-BASED LANGUAGES

The main advantage of being able to define your own markup language is that it gives you the freedom to capture and publish useful information about what your data is and how it is structured. To show the difference, consider a company wanting to sell books on the Web. To publish the information about the books on a Web page, the company would write an HTML document like the first snippet below.

In that snippet, the original data has been formatted as HTML for publishing purposes. In the course of that transformation, useful information about what the data really is has been lost. If the same content were written in XML, it would look like the second snippet.

<!-- Book snippet in HTML -->

<h1> Books for Sale </h1>

<table border=1>

<tr>

<td>Title</td><td>Paradise Lost</td>

</tr>

<tr>

<td>Author</td><td>John Milton</td>

</tr></table>

<!-- Book snippet in XML -->

<BooksForSale>

<Title>Paradise Lost</Title>

<Author>John Milton</Author>

</BooksForSale>

If the XML version were published on the Web, it would open up some interesting possibilities, because a program could locate the title and the author by element name rather than by their position in a table.
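
As a rough illustration of what the XML markup makes possible, the following Python sketch reads the book snippet and looks up the title and author by element name. It relies only on the standard library module xml.etree.ElementTree; the function name describe_book and the dictionary keys are invented for this example.

```python
import xml.etree.ElementTree as ET

BOOK_XML = """
<BooksForSale>
  <Title>Paradise Lost</Title>
  <Author>John Milton</Author>
</BooksForSale>
"""


def describe_book(xml_text: str) -> dict:
    """Return the title and author by looking them up by element name."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.findtext("Title"),
        "author": root.findtext("Author"),
    }


print(describe_book(BOOK_XML))
```

Doing the same with the HTML table would mean guessing which cell holds which piece of information, because the markup there only says "table cell".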

XML BENEFITS

Initially, XML received a lot of excitement, but that has now died down some. This isn’t because XML is not as useful, but rather because it doesn’t provide the “Wow! factor” that other technologies, such as HTML, do. When you write an HTML document, you see a nicely formatted page in a browser—instant gratification. When you write an XML document, you see an XML document—not so exciting. However, with a little more effort, you can make that XML document sing.

XML is Everywhere

XML became as important for the Web as HTML was to the foundation of the Web. XML is one of the most common tools for transmitting data between all sorts of applications, and it is used in many aspects of Web development, often to simplify data storage and sharing.

XML Separates Data from HTML

If you need to display dynamic data in your HTML document, it will take a lot of work to edit the HTML each time the data changes. With XML, data can be stored in separate XML files. This way you can concentrate on using HTML for layout and display and be sure that changes in the underlying data will not require any changes to the HTML. With a few lines of JavaScript code, you can read an external XML file and update the data content of a Web page.
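
The same separation can be sketched outside the browser. The Python example below is a minimal illustration under stated assumptions, not the JavaScript approach itself: the <prices> vocabulary, the constant PRICES_XML, and the function render_rows are all hypothetical. The data lives in XML, and the HTML rows are generated from it, so a price change only touches the XML.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical data; in practice this would live in its own .xml file.
PRICES_XML = """
<prices>
  <item name="Tea">2.50</item>
  <item name="Coffee &amp; cake">4.75</item>
</prices>
"""


def render_rows(xml_text: str) -> str:
    """Build HTML table rows from the XML data, leaving the layout code untouched."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.findall("item"):
        # Escape the values so that characters such as & stay valid in HTML.
        name = escape(item.get("name", ""))
        price = escape(item.text or "")
        rows.append(f"<tr><td>{name}</td><td>{price}</td></tr>")
    return "\n".join(rows)


print(render_rows(PRICES_XML))
```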

XML Simplifies Data Sharing

In the real world, computer systems and databases contain data in incompatible formats. XML data is stored in plain text format. This provides a software- and hardware-independent way of storing data. This makes it much easier to create data that can be shared by different applications.

XML Simplifies Data Transport

One of the most time-consuming challenges for developers is to exchange data between incompatible systems over the Internet. Exchanging data as XML greatly reduces this complexity, since the data can be read by different incompatible applications.

XML Simplifies Platform Changes

Upgrading to new systems (hardware or software platforms) is always time consuming. Large amounts of data must be converted and incompatible data is often lost.

XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers without losing data.

XML Makes Your Data More Readily Available

Different applications can access your data, not only from HTML pages but also from XML data sources. With XML, your data can be available to all kinds of “reading machines” (handheld computers, voice machines, and news feeds), which also makes it more accessible to blind people and people with other disabilities.

XML is Used to Create New Internet Languages

New Internet languages are created with XML. Here are some examples:

•XHTML

•WSDL for describing the available Webservices

•WML, the markup language used with WAP on handheld devices

•RSS for news feeds

•RDF and OWL for describing resources and ontologies

•SMIL for describing multimedia for the Web

The future might give us word processors, spreadsheet applications, and databases that can read each other’s data in XML format, without any conversion utilities in between. XML documents form a tree structure that starts at the “root” and branches to the “leaves.”

XML DISADVANTAGES

XML is useful for developing future Web applications, and it almost defines the future of Web development. However, XML also has some drawbacks. One of the biggest drawbacks of XML is that it lacks adequate applications for processing.

LACK OF APPLICATION PROCESSING

XML needs an application to process it. Browsers can parse an XML document, but without a style sheet they only display its raw tree, because XML tags carry no presentation meaning. An HTML page, by contrast, can be written once and read in any browser anywhere in the world. To be presented in a browser, XML therefore still depends on HTML or on a style sheet, and XML documents are commonly converted to HTML before they are deployed. The most common method is to write the parsing routines in script (DHTML) or in a Java application and run them over the XML document. The formatting rules in a style sheet can then be applied to convert the entire document into HTML.

Other disadvantages of XML include the fact that it is stricter and more demanding than HTML. On its own, XML offers nothing for presentation in the browser and nothing to support end-user applications.

XML is very flexible, but its flexibility can become one of its disadvantages, since different parties may disagree about which tags to use and what they mean. If an XML vocabulary has too many constraints, it might become very difficult to construct a conforming file. While just describing tags and building a system sounds easy, it may not be that easy in reality. For example, a business or professional organization may have hundreds of functions related to one set of documents. XML by itself does not have the capability to synthesize all the information related to the document.

GENERAL WEAKNESSES OF XML

XML is a verbose language, and how readable a document is depends on who is writing it; verbose or inconsistent markup may pose problems for other users. XML is not specific to any platform, and this neutrality may be a disadvantage in a few circumstances. Not all implementations comply fully with the XML standards. Users have reported problems with parsers, and there are problems with XML and HTTP that are still being resolved.

XML documents can be difficult and expensive to set up. A freelancer, for example, can sit at his home and at his own pace create, write, and format a document or a manuscript using any of the free software available. However, the moment he introduces XML, the whole process becomes more complicated.

XML AND UNICODE DISADVANTAGES

Making incompatible programs work together can be challenging. Because XML is tied closely to Unicode, converting a document between character encodings can change its bytes, which might result in a file that is totally different from the original.

Some XML parsers, such as those used to read RSS feeds, do not disable external entities. Instead they resolve them as if they were part of the document, which can prove to be a major disadvantage. XML by itself cannot be displayed by older browsers such as early versions of Netscape, which makes it dependent on HTML. XML is platform independent, but it is not a very efficient format, and the limitation here is also very basic: on its own it cannot tell a browser how to present the data.

There are sample documents in HTML and XHTML that contain a doctype pointing to a DTD. The common belief is that the DTD is actually used, but browsers do not retrieve these DTDs. An application that does retrieve the DTD breaks down whenever the DTD is unavailable. This is a problem because the DTD can be unavailable for unrelated reasons, and that should not mean that the service itself becomes unavailable.

Relying on an external DTD makes a program dependent on a single point of failure. Even when the DTD is available it is of little use, and an outside program has to provide a backup when it is not, so users and developers might as well use that outside program from the start, with backups at intermediary levels.

External entities pose a problem, which is a major disadvantage for XML. The best way to avoid the problems with external entities in an XML DTD is not to use them at all. If you produce documents, do not reference external entities; if you consume documents, do not attempt to retrieve them.

When you write the specification for an XML vocabulary, do not make a DTD part of it. Programs should run their XML parsers with external entity resolution disabled. Otherwise, the external entities problem will invariably crop up, triggering a series of problems that cannot be solved by the XML environment alone. Strictly speaking, a specification layered on top of XML should not ban the document type declaration, although SOAP does exactly that.

If your job is to implement a Web application that is based on XML, you may need to configure the parser not to perform DTD-based validation and not to try to resolve external entities. This could be an answer to some problems, so taking precautionary measures is worthwhile. Publishing documents on the Web requires the same precautions; document type declarations should not be included.
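
How such a parser configuration might look is sketched below in Python, using the SAX interface from the standard library (xml.sax). The two feature constants are part of that library; the class TextCollector, the function parse_without_external_entities, and the sample document are assumptions made for this illustration. Recent Python versions already leave external general entities unresolved by default, so the explicit setting mainly documents the intent.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges, feature_external_pes


class TextCollector(ContentHandler):
    """Collect all character data reported by the parser."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


def parse_without_external_entities(xml_bytes: bytes) -> str:
    parser = xml.sax.make_parser()
    # Do not fetch external general entities (for example &ext; below) ...
    parser.setFeature(feature_external_ges, False)
    # ... or external parameter entities such as an external DTD subset.
    parser.setFeature(feature_external_pes, False)
    handler = TextCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return "".join(handler.chunks)


# Hypothetical document whose DTD declares an entity stored in an external file.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY ext SYSTEM "file:///nonexistent/secret.txt">
]>
<note>before &ext; after</note>
"""

print(parse_without_external_entities(DOC))
```

With this configuration the reference to the external entity is skipped rather than fetched, so the document still parses even though the referenced file does not exist.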

A document may not be valid in the way XML defines validity, and some people even believe that document validation in XML is overrated. Document type definitions are not very powerful when it comes to validation, and it is often forgotten that a vocabulary has rules of its own that a DTD cannot express efficiently. There is also the problem of other programs not trusting the XML DTD. The doctype in HTML is quite different from the doctype in XML. You may not be able to use the doctype in XML as an indicator that tells programs what type of document they are dealing with.

If an application can handle multiple XML vocabularies and knows how to dispatch each document to the appropriate handler by checking the namespace of the root element, then you can consider yourself lucky. If the vocabulary is not identified by a namespace, then you can look for it in the MIME type. In some cases, the vocabulary is identified neither by a namespace nor by a specific MIME type. Such a language is certainly a bad example and will create problems, because you will have to rely on the root element name.
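
Dispatching on the namespace of the root element can be sketched as follows. This is a minimal Python illustration using xml.etree.ElementTree, which reports a namespaced element name in the form {namespace}local. The XHTML and SVG namespace URIs are the real ones; the HANDLERS table and the functions split_tag and dispatch are hypothetical names chosen for the example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical handlers, keyed by the namespace of the root element.
HANDLERS = {
    XHTML_NS: lambda root: "XHTML document",
    SVG_NS: lambda root: "SVG image",
}


def split_tag(tag: str) -> tuple:
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, _, local = tag[1:].partition("}")
        return namespace, local
    return "", tag


def dispatch(xml_text: str) -> str:
    root = ET.fromstring(xml_text)
    namespace, local = split_tag(root.tag)
    handler = HANDLERS.get(namespace)
    if handler is not None:
        return handler(root)
    # No known namespace: fall back to the root element name, the weakest signal.
    return f"unknown vocabulary (root element <{local}>)"


print(dispatch('<svg xmlns="http://www.w3.org/2000/svg"/>'))
print(dispatch("<cats>Tooter and Shade</cats>"))
```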

The XML specification allows three kinds of processing. The first does not perform DTD-based validation and does not retrieve external entities. The second does not perform DTD-based validation but does retrieve external entities, so that the information set and the entity references can be expanded. The third performs DTD-based validation and retrieves the external entities, so that the information set and the entity references can be expanded.

The point of having several profiles is that an application can choose the right one. Character entities defined in a DTD are considered unsafe for Web applications, because a parser that does not read the DTD cannot expand them. This is a disadvantage for authors who rely on entities to enter special characters in their editor. The situation need not be so unfortunate, because an input method that enters the characters directly can solve the problem with the editor. If the XHTML entities were predefined, there wouldn’t be many problems.

CHARACTERISTICS OF AN XML DOCUMENT

There is a range of characteristics associated with XML.

Simplicity

Information coded in XML is easy to read and understand, and it can be processed easily by computers.

Self-Describing

OPEN AND EXTENSIBLE

XML allows you to add other elements when needed. This means you can always adapt your system to address specification modifications.

APPLICATION INDEPENDENCE

Using XML, data is no longer dependent on a specific application for creation, viewing or editing. In this sense, XML is to data what Java is to applications. Java allows programs to run anywhere—XML allows data to be used by any application.

DATA FORMAT INTEGRATION

XML documents can contain any imaginable data type, from classical data like text and numbers to multimedia objects such as sounds and video and active components like applets.

ONE DATA SOURCE, MULTIPLE VIEWS

By formatting our data in a markup language, we allow computer applications to process and present this data to us in different ways. In contrast, HTML presents data in one fixed way.

DATA PRESENTATION MODIFICATION

You can change the look and feel of documents, or even entire Websites, with XSL Style Sheets without manipulating the data itself.

INTERNATIONALIZATION

Internationalization is important for electronic worldwide business applications. XML supports multilingual documents and the Unicode standard.

FUTURE-ORIENTED

XML is the endorsed industry standard of the World Wide Web Consortium (W3C) and is supported by all leading software providers. Furthermore, XML is also the standard today in an increasing number of other industries, such as health care.

IMPROVED DATA SEARCHES

Tags, attributes, and element structure provide context information that can be used to interpret the meaning of content, opening up new possibilities for highly efficient search engines and intelligent data mining. An intelligent search engine for a body of XML-compliant markup languages would search both the content and the metadata, which could considerably improve the accuracy of searches. This in turn could increase the amount of relevant and accessible data on a global basis.

ENABLES E-COMMERCE TRANSACTIONS

An ecommerce transaction requires instant cooperation between a host of agents involved in a single purchase. For example, a customer ordering an item from a supplier involves a number of transactions, including those with the customer (“B2C ecommerce”), between businesses in a supply chain (“B2B ecommerce”), with banks (“B2B”), and between systems (“enterprise integration”). The initial reaction of most companies was to integrate these diverse operations by building or buying software that employed protocols, such as DCOM or CORBA, to perform such integrations. However, XML offers the option of performing the necessary integration by exchanging standardized data.

XML DOCUMENTS FORM A TREE STRUCTURE

XML documents must contain a root element. This element is the parent of all other elements. The elements in an XML document form a document tree. The tree starts at the root and branches to the lowest level of the tree. All elements can have sub-elements (child elements):

<root>

<child>

<subchild>.....</subchild>

</child>

</root>

The terms parent, child, and sibling are used to describe the relationships between elements. Parent elements have children. Children on the same level are called siblings (brothers or sisters). All elements can have text content and attributes (just like in HTML).

FIGURE 1.1 Tree structure of an XML document

The image above represents one book in the XML below:

<bookstore>

<book category="COOKING">

<title lang="en">Indian Food</title>

<author>Swati Jain</author>

<year>2011</year>

<price>200.00</price>

</book>

<book category="CHILDREN">

<title lang="en">Dolls</title>

<author>J K Jain</author>

<year>2010</year>

<price>29.95</price>

</book>

<book category="WEB">

<title lang="en">Learning XML</title>

<author>G.Ram</author>

<year>2009</year>

<price>13.95</price>

</book>

</bookstore>

The root element in the example is <bookstore>. All <book> elements in the document are contained within <bookstore>.
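
The parent and child relationships in the bookstore document can be explored with a short Python sketch. It is only an illustration: it uses the standard library module xml.etree.ElementTree, shortens the document to two of the three books, and the function name list_books is made up for this example.

```python
import xml.etree.ElementTree as ET

BOOKSTORE_XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Indian Food</title>
    <author>Swati Jain</author>
    <year>2011</year>
    <price>200.00</price>
  </book>
  <book category="WEB">
    <title lang="en">Learning XML</title>
    <author>G.Ram</author>
    <year>2009</year>
    <price>13.95</price>
  </book>
</bookstore>
"""


def list_books(xml_text: str) -> list:
    """Return (category, title, price) for every <book> child of the root."""
    root = ET.fromstring(xml_text)  # root is the <bookstore> element
    books = []
    for book in root.findall("book"):  # direct children named <book>
        category = book.get("category")  # attribute value
        title = book.findtext("title")  # text of the <title> child
        price = float(book.findtext("price"))
        books.append((category, title, price))
    return books


for entry in list_books(BOOKSTORE_XML):
    print(entry)
```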

The syntax rules of XML are very simple and logical. The rules are easy to learn and easy to use.

ALL XML ELEMENTS MUST HAVE A CLOSING TAG

In HTML, elements do not have to have a closing tag:

<p>This is a paragraph

<p>This is another paragraph

In XML, it is illegal to omit the closing tag. All elements must have a closing tag:

<p>This is a paragraph</p>

<p>This is another paragraph</p>

You might have noticed from the earlier examples that the XML declaration did not have a closing tag. This is not an error. The declaration is not an element, so it has no closing tag.

XML TAGS ARE CASE SENSITIVE

XML tags are case sensitive. The tag <Letter> is different from the tag <letter>. Opening and closing tags must be written with the same case:

<Message>This is incorrect</message>

<message>This is correct</message>

Opening and closing tags are often referred to as start and end tags. Use whichever terms you prefer.

XML ELEMENTS MUST BE PROPERLY NESTED

In HTML, you might see improperly nested elements:

<b><i>This text is bold and italic</b></i>

In XML, all elements must be properly nested within each other:

<b><i>This text is bold and italic</i></b>

In the example above, “properly nested” simply means that since the <i> element is opened inside the <b> element, it must be closed inside the <b> element.

XML DOCUMENTS MUST HAVE A ROOT ELEMENT

XML documents must contain one element that is the parent of all other elements. This element is called the root element.

<root>

<child>

<subchild>.....</subchild>

</child>

</root>

XML ATTRIBUTE VALUES MUST BE QUOTED

XML elements can have attributes in name/value pairs just like in HTML.

In XML, the attribute values must always be quoted. Study the two XML documents below. The first one is incorrect, and the second is correct:

<note date=12/11/2019>

<to>Tonu</to>

<from>John</from>

</note>

<note date="12/11/2019">

<to>Tonu</to>

<from>John</from>

</note>

The error in the first document is that the date attribute in the note element is not quoted.
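
The syntax rules above can be checked mechanically, since a conforming parser rejects any document that breaks them. The following Python sketch, which assumes nothing beyond the standard library module xml.etree.ElementTree, feeds the parser one sample per rule. The function name is_well_formed and the sample labels are invented for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


samples = {
    "missing closing tag": "<p>This is a paragraph",
    "case mismatch": "<Message>This is incorrect</message>",
    "improper nesting": "<b><i>bold and italic</b></i>",
    "unquoted attribute": "<note date=12/11/2019><to>Tonu</to></note>",
    "two root elements": "<a/><b/>",
    "all rules followed": '<note date="12/11/2019"><to>Tonu</to></note>',
}

for label, text in samples.items():
    print(f"{label}: {is_well_formed(text)}")
```

Only the last sample is accepted; each of the others violates exactly one of the rules described in this section.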

XML IS FREE

XML doesn’t cost anything to use. It can be written with a simple text editor or one of the many freely available XML authoring tools, such as XML Notepad. In addition, many Web development tools, such as Dreamweaver and Visual Studio .NET, have built-in XML support. There are also many free XML parsers, such as Microsoft’s MSXML (downloadable from microsoft.com) and Xerces (downloadable at apache.org).

XML TECHNOLOGY

The structured data is contained in an XML document, a text file with .xml as the extension. You can use CSS, as in HTML, to provide style sheets for XML data display. For more advanced features, power, and flexibility in presentation, you could use XSL (Extensible Stylesheet Language) to build the style sheets.

To enforce the structural constraints and rules on the data contained in an XML document, you could code a DTD (Document Type Definition). Due to certain limitations that were inherent in DTDs, the W3C came up with a specification to serve the same purpose as DTDs: XML Schema. Schemas are stored in .xsd files and DTDs in .dtd files. XML Schema is an XML-based alternative to the DTD.

FIGURE 1.2 XML Technology

XSD - XML Schema Definition

DTD - Document Type Definition

XSL - Extensible Stylesheet Language

USES

XML is widely used for the following purposes.

•Storing configuration information—typically data in an application which is not stored in a database. Most server software has configuration files in XML formats.

•XML documents can also be used as a mini data store. This data can then be presented on a variety of targets, including browsers and print media.

•Transmitting data between applications—overcomes problems in client server applications which are cross-platform in nature. Ex: A Windows program talking to a mainframe, Little and Big Endian problems, and data type size variations across platforms.

FIGURE 1.3 Variant uses of XML

When XML data is transferred across different systems, the data contained in an XML document can be read using a software entity called a parser. Most of the popular databases (Oracle, MS SQL Server, Sybase, and DB2) provide their own mechanisms to store and retrieve data as XML. Some of them also provide parsers to work with the XML documents programmatically. XML is a key technology when it comes to Web Services. .NET uses XML extensively. It is used as a data format for everything—configuration files, metadata, RPC, and object serialization.

SAMPLE XML DOCUMENT

The following is a sample section from a possible XML document. It is *not* a full XML document—we will discuss the structure of XML documents shortly and you will notice that we need a few extra lines to consider it to be a full document.

<employee>

<ident>3348498</ident>

<name>

<lastname>Peterson</lastname>

<firstname>Sam</firstname>

<title>Dr.</title>

</name>

<phonedetails>

<extension>8221</extension>

<companyprefix>700</companyprefix>

<regionprefix>1</regionprefix>

<intprefix>+353</intprefix>

</phonedetails>

<department>

<title>Software Development</title>

<depid>8</depid>

</department>

<location>

<building>Aston Quay</building>

<room>A142</room>

</location>

</employee>

While not necessarily the optimum structure for information such as above, it illustrates a major point of XML. The tags are defined by individuals, rather than some predefined standard structure. There are two different kinds of information in the above example:

•markup - such as <department> and <firstname>

•text/character data - such as “Peterson” and “+353”

XML documents mix markup and text together into a single file: the markup describes the structure of the document, while the text is the document’s content.
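
The split between markup and character data can be made visible with a short Python sketch. It is an illustration only: it uses the standard library module xml.etree.ElementTree on a shortened version of the employee sample, and the function name leaf_values is made up. The element names (markup) form the path, and the text (character data) is the value found at the end of it.

```python
import xml.etree.ElementTree as ET

# Shortened version of the employee sample above.
EMPLOYEE_XML = """
<employee>
  <ident>3348498</ident>
  <name>
    <lastname>Peterson</lastname>
    <firstname>Sam</firstname>
  </name>
  <phonedetails>
    <intprefix>+353</intprefix>
  </phonedetails>
</employee>
"""


def leaf_values(element, path=""):
    """Yield (path, text) pairs for every element that has no child elements."""
    here = f"{path}/{element.tag}"
    children = list(element)
    if not children:
        yield here, (element.text or "").strip()
    for child in children:
        yield from leaf_values(child, here)


for path, text in leaf_values(ET.fromstring(EMPLOYEE_XML)):
    print(path, "->", text)
```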

XML IN PRACTICAL WORLD

Content Management

Almost all of the leading content management systems use XML in one way or another. A typical use would be to store a company’s marketing content in one or more XML documents. These XML documents could then be transformed for output on the Web, as Word documents, as PowerPoint slides, in plain text, or in audio format. The content can also easily be shared with partners who can then output the content in their own formats. Storing the content in XML makes it much easier to manage content for two reasons.

Content changes, additions, and deletions are made in a central location and the changes will cascade out to all formats of presentation. There is no need to be concerned about keeping the Word documents in sync with the Website, because the content itself is managed in one place and then transformed for each output medium.

Formatting changes are made in a central location. To illustrate, suppose a company had many marketing Web pages, all of which were produced from XML content being transformed to HTML. The format for all of these pages could be controlled from a single XSLT style sheet, and a sitewide formatting change could be made by modifying that style sheet.

Web Services

XML Web services are small applications or pieces of applications that are made accessible on the Internet using open standards based on XML. Web services generally consist of three components:

•SOAP—an XML-based protocol used to transfer Web services over the Internet.

•WSDL (Web Services Description Language)—an XML-based language for describing a Web service and how to call it.

•Universal Discovery Description and Integration (UDDI)