PHP Web 2.0 Mashup Projects: Practical PHP Mashups with Google Maps, Flickr, Amazon, YouTube, MSN Search, Yahoo! - Shu-Wai Chow - E-Book

Shu-Wai Chow

Description

A mashup is a web page or application that combines data from two or more external online sources into an integrated experience. This book is your entryway to the world of mashups and Web 2.0. You will create PHP projects that grab data from one place on the Web, mix it up with relevant information from another place on the Web and present it in a single application. This book is made up of five real-world PHP projects. Each project begins with an overview of the technologies and protocols needed for the project, and then dives straight into the tools used and details of creating the project:



Look up products on Amazon.Com from their code in the Internet UPC database

A fully customized search engine with MSN Search and Yahoo!

A personal video jukebox with YouTube and Last.FM

Deliver real-time traffic incident data via SMS and the California Highway Patrol!

Display pictures sourced from Flickr in Google maps



All the mashup applications used in the book are built upon free tools and are thoroughly explained. You will find all the source code used to build the mashups used in this book in the code download section for this book.

You can read this e-book in the Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Pages: 385

Year of publication: 2007




Table of Contents

PHP Web 2.0 Mashup Projects
Credits
About the Author
About the Reviewer
Preface
What This Book Covers
What You Need for This Book
Conventions
Reader Feedback
Customer Support
Downloading the Example Code for the Book
Errata
Questions
1. Introduction to Mashups
Web 2.0 and Mashups
Importance of Data
User Communities
How We Will Create Mashups
More Mashups
2. Buy it on Amazon
XML-RPC
XML-RPC Structure
XML-RPC Request
XML-RPC Data Types
Scalar Values
String
Integer
Double
Boolean
Date/Time
Base64-Encoded Binary
Arrays
Struct
XML-RPC Response
Working with XML-RPC in PHP
Making an XML-RPC Request
Serializing Data with XML-RPC Encode Request
Creating a Single Parameter XML-RPC Request
Double Data Type
Date/Time and Base64 Data Types
Creating a Multiple Parameter XML-RPC Request
Passing Arrays in XML-RPC Requests
Passing Struct in XML-RPC Requests
Calling XML-RPC Using Sockets
Processing an XML-RPC Response
Creating an XML-RPC Parser Class
Testing Our XML-RPC Parser Class
Using PEAR to Handle XML-RPC
REST
Working with REST in PHP
Making a REST Request
A GET and POST Refresher
Using Sockets to Initiate a REST Request
Creating GET and POST Request Functions
Making a REST Parser Class
Testing Our REST Parser Class
Processing a REST Response
Basic Walkthrough with PHP and SAX
Using the PHP’s XML Functions
Setting up the Callback Functions
Seeing the Callbacks in Action
Creating a SAX Parser Class
Examining the Classes
Using and Testing the Class
Internet UPC Database API
Amazon API
A Tour of ECS
Anatomy of an ECS REST Request
Location of Service
Mashing Up
Product Lookups
Handling Amazon’s XML Responses
An ECS Lookup Response
Your Own Amazon Cart
Summary
3. Make Your Own Search Engine
SOAP
Web Services Descriptor Language (WSDL) With XML Schema Data (XSD)
Basic WSDL Structure
definitions Element
types Element
Simple Type
Complex Type
Arrays
message Element
RPC Binding
Document Binding
portType Element
binding Element
service Element
The SOAP Message
Envelope
Header
Body
RPC Binding
Document Binding
Fault
PHP’s SoapClient
Creating Parameters
Instantiate the SoapClient
Instantiating in WSDL Mode
Instantiating in Non-WSDL Mode
Making the Call and Using SoapClient Methods
Calling SOAP Operations in WSDL Mode
Calling SOAP Operations in Non-WSDL Mode
Handling the SOAP Response
Handling SOAP Errors with SoapFault
Handling Successful Results
Microsoft Live Search Web Service
Using Search
Yahoo! Search Web Service
Using Web Search
Mashing Up
Summary
4. Your Own Video Jukebox
XSPF
RSS
YouTube Overview
YouTube Developer API
Last.fm Overview
Audioscrobbler Web Services
Parsing With PEAR
Package Installation and Usage
File_XSPF
Services_YouTube
XML_RSS
Mashing Up
Mashup Architecture
Main Page
Navigation Page
Content Page
Using the Mashup
Summary
5. Traffic Incidents via SMS
Screen Scraping the PHP Way
Parsing with DOM Functions
Basic Element and Attribute Parsing
Testing the Schema
More About PHP’s Implementation of the DOM
411Sync.com API
Creating Your Mobile Search Keyword
Name Your Keyword
Format the Users will Use when They Use Your Search
HTTP Location of the XML Data
California Highway Patrol Incident Page
Mashing Up
The Incident Class
The DOM Parser Class
The CHP DOM Parser Class
Creating the Feed Page
Testing and Deploying
Summary
6. London Tube Photos
Preliminary Planning
Finding Tube Information
Integrating Google Maps and Flickr Services
Application Sequence
Resource Description Framework (RDF)
SPARQL
Analyzing the Query Subject
Anatomy of a SPARQL Query
Writing SPARQL WHERE Clauses
Basic Principles
A Simple Query
Querying for Types
Ordering, Limiting, and Offsetting
UNION and DISTINCT
More SPARQL Features
RDF API for PHP (RAP)
XMLHttpRequest Object
XMLHttpRequest Object Overview
Using the Object
Creating the Object
Making the HTTP Request
Creating and Using the Callback
JavaScript Object Notation (JSON)
JavaScript Objects Review
JSON Structure
Accessing JSON Properties
Serializing the JSON Response
Google Maps API
Creating a Map
Geocoding
Markers
Events
InfoWindow Box
Flickr Services API
Executing a Search
Interpreting Service Results
Retrieving a Photo or a Photo’s Page
Mashing Up
Building and Populating the Database
Examining the File
Creating Our Database Schema
Building SPARQL Queries
Stations Query
Lines Query
Lines to Stations Query
Database Population Script
The TubeSource Database Interface Class
The Main User Interface
Using Flickr Services with AJAX
Creating an XMLHttpRequest Proxy
Modifying the Main JavaScript
Making the XMLHttpRequest
Race Conditions
Parsing the AJAX Response
Summary
Index

Shu-Wai Chow

PHP Web 2.0 Mashup Projects

Create practical mashups in PHP, grabbing and mixing data from Google Maps, Flickr, Amazon, YouTube, MSN Search, Yahoo!, Last.fm, and 411Sync.com

Copyright © 2007 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, Packt Publishing, nor its dealers or distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September, 2007

Production Reference: 1070907

Published by Packt Publishing Ltd.

32 Lincoln Road

Olton

Birmingham, B27 6PA, UK.

ISBN 978-1-847190-88-8

www.packtpub.com

Cover Image by Vinayak Chittar (<[email protected]>)

Credits

Author

Shu-Wai Chow

Reviewer

Stoyan Stefanov

Senior Acquisition Editor

Douglas Paterson

Development Editor

Nikhil Bangera

Technical Editors

Adil Rizwan

Ajay. S

Editorial Manager

Dipali Chittar

Project Manager

Patricia Weir

Project Coordinator

Abhijeet Deobhakta

Indexer

Bhushan Pangaonkar

Proofreader

Cathy Cumberlidge

Production Coordinator

Shantanu Zagade

Cover Designer

Shantanu Zagade

About the Author

Shu-Wai Chow has worked in computer programming and information technology for the past eight years. He started his career in Sacramento, California, spending four years as the webmaster for Educaid, a First Union Company, and another four years at Vision Service Plan as an application developer. Through the years, he has become proficient in Java, JSP, PHP, ColdFusion, ASP, LDAP, XSLT, and XSL-FO. Shu has also been the volunteer webmaster and a feline adoption counselor for several animal welfare organizations in Sacramento.

He is currently a software engineer at Antenna Software in Jersey City, New Jersey, and is finishing his studies in Economics at Rutgers, the State University of New Jersey.

Born in the British Crown Colony of Hong Kong, Shu did most of his alleged growing up in Palo Alto, California. He lives on the Jersey Shore with seven very demanding cats, four birds that are too smart for their own good, a tail-less bearded dragon, a betta who needs her tank cleaned, a dermestid beetle colony, a cherished Fender Stratocaster, and a beloved, saint-like fiancé.

I received a lot of help from many different people and companies on this book.

First and foremost, thank you to the people at Packt Publishing, especially Doug Paterson, Nikhil Bangera, Adil Rizwan, and Abhijeet Deobhakta, for a professional working experience, giving me this opportunity, and once again, for their faith in me (which I still don’t completely understand).

Thank you, Stoyan Stefanov, for your great review comments. Your insight and suggestions really improved this book’s content and personally pushed me further.

Each chapter deserves some special attention.

Chapter 2: Thank you to the folks at the UPC Internet Database and Amazon, Inc. for their permission. Thanks especially to everyone at UPC Internet Database for answering my questions.

Chapter 3: Thank you to Yahoo! and Microsoft for their permission, and the prompt service and assistance from their legal departments. Special thanks are given to Data Access Corporation and Vincent Oorsprong for their helpful and educational SOAP services. Data Access Worldwide (www.dataaccess.com) delivers advanced products and services to help customers build great applications and get more value from their data.

Chapter 4: Thanks to the people at YouTube and Last.fm for their permission. Special thanks go to Last.fm for their interest in my project.

Chapter 5: Thank you to 411Sync and the California Highway Patrol for their assistance. A big thank you goes to Manish Lachwani of 411Sync, who is probably the most patient person in the world.

Chapter 6: Thank you to Google, Flickr, and Jo Walsh for permission to be included. A big thanks to Jo Walsh for her help and insight. Special thanks to the people at Google, whose enthusiasm made me think that this book wasn’t such a nutty idea.

I neglected to thank Ed Hansen and Keith Easterbrook in the last book. They were the ones that forced us to use WebSphere Application Developer. Without WSAD, I never would have used Eclipse, and I never would have used PHPEclipse, which means that I never would have written the first book, and without the first book, there would not be this second book. So, thank you, gentlemen, and I apologize for the oversight.

Thank you to Kaya’s Kitchen of Belmar, New Jersey for making the best vegetarian cuisine in the whole wide world. If you ever find yourself on the Jersey Shore, you absolutely must visit this restaurant.

Finally, all hail the Palo Alto Duck Pond, Hobee’s on El Camino and Arastradero, Dodger Stadium, and the Davis Greenbelt.

About the Reviewer

Stoyan Stefanov is a Yahoo! web developer, Zend Certified Engineer, book author and contributor to the international PHP community. His personal blog is at http://www.phpied.com.

Once upon a time, a diviner told me that I would meet an angel on Earth in the month of February. This book is dedicated to that February angel and love of my life, Anneliese Strunk. You have brought more happiness, inspiration, and joy to my life than I could ever have imagined.

Preface

A mashup is a web page or application that combines data from two or more external online sources into an integrated experience. This book is your entryway to the world of mashups and Web 2.0. You will create PHP projects that grab data from one place on the Web, mix it up with relevant information from another place on the Web and present it in a single application. All the mashup applications used in the book are built upon free tools and are thoroughly explained. You will find all the source code used to build the mashups in the code download section on our website.

This book is a practical tutorial with five detailed and carefully explained case studies to build new and effective mashup applications.

What This Book Covers

You will learn how to write PHP code to remotely consume services like Google Maps, Flickr, Amazon, YouTube, MSN Search, Yahoo!, Last.fm, and the Internet UPC Database, not to mention the California Highway Patrol Traffic data! You will also learn about the technologies, data formats, and protocols needed to use these web services and APIs, and some of the freely-available PHP tools for working with them.

You will understand how these technologies work with each other and see how to use this information, in combination with your imagination, to build your own cutting-edge websites.

Chapter 1 provides an overview of mashups: what a mashup is, and why you would want one.

In Chapter 2 we create a basic mashup, and go shopping. We will simply look up products on Amazon.com based on the Universal Product Code (UPC). To do this, we cover two basic web services to get our feet wet — XML-RPC and REST. The Internet UPC database is an XML-RPC-based service, while Amazon uses REST. We will create code to call XML-RPC and REST services. Using PHP’s SAX function, we create an extensible object-oriented parser for XML. The mashup covered in this chapter integrates information taken from Amazon’s E-commerce Service (ECS) with the Internet UPC database.

In Chapter 3, we create a custom search engine using technology from MSN and Yahoo! The chapter starts with an introduction to SOAP, the most complex of the web service protocols. SOAP relies heavily on other standards like WSDL and XSD, which are also covered in readable detail. We take a look at a WSDL document and learn how to figure out what web services are available from it, and what types of data are passed. Using PHP 5’s SoapClient extension, we then interact with SOAP servers to grab data. Finally, we create our mashup, which gathers web search results sourced from Microsoft Live and Yahoo!

For the mashup in Chapter 4, we use the API from the video repository site YouTube, and the XML feeds from social music site Last.fm. We will take a look at three different XML-based file formats from those two sites: XSPF for song playlists, RSS for publishing frequently updated information, and YouTube’s custom XML format. We will create a mashup that takes the songs in two Last.fm RSS feeds and queries YouTube to retrieve videos for those songs. Rather than creating our own XML-based parsers to parse the three formats, we have used parsers from PEAR, one for each of the three formats. Using these PEAR packages, we create an object-oriented abstraction of these formats, which can be consumed by our mashup application.

In Chapter 5, we screen-scrape from the California Highway Patrol website. The CHP maintains a website of traffic incidents. This site auto-refreshes every minute, ensuring the user gets live data about accidents throughout the state of California. This is very valuable if you are in front of a computer, but fairly useless if you are out and about running errands. Our mashup therefore uses the web service from 411Sync.com to accept SMS messages from mobile users and deliver these traffic incidents to them.

We’ve thrown almost everything into Chapter 6! In this chapter, we use RDF documents, SPARQL, RAP, Google Maps, Flickr, AJAX, and JSON. We create a geographically-centric way to present pictures from Flickr on Google Maps. We see how to read RDF documents and how to extract data from them using SPARQL and RAP for RDF. This gets us the latitude and longitude of London tube stations. We display them on a Google Map, and retrieve pictures of a selected station from Flickr. Our application needs to communicate with the API servers, for which we use AJAX and JSON, the latter of which is emerging as a major data format. The biggest pitfall in this AJAX application is race conditions, and we will learn various techniques to overcome these.

What You Need for This Book

To follow along with the projects and use the example code in this book, you will need a web server running PHP 5.0 or higher and Apache 1.3.

All of the examples assume you are running the web server on your local work station, and all development is done locally.

Additionally, two projects have special requirements. In Chapter 5, you will need access to a web server that can be reached externally from the Internet. In Chapter 6, you will need a MySQL server. Again, we assume you are running the MySQL server locally and it is properly configured.

To quickly install PHP, Apache, and MySQL, check out XAMPP (http://www.apachefriends.org/en/xampp.html). XAMPP is a one-step installer for PHP, Apache, and MySQL, among other things.

XAMPP is available for Windows, Linux, and Mac OS X. However, many standard Linux distributions already have PHP, Apache, and MySQL installed. Check your distribution’s documentation on how to activate them. Mac OS X already has Apache and PHP installed by default. You can turn them on by enabling Web Sharing in your Sharing Preferences.

MySQL can be installed as a binary downloaded from MySQL.com (http://dev.mysql.com/downloads/mysql/4.1.html).

Customer Support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the Example Code for the Book

Visit http://www.packtpub.com/support, and select this book from the list of titles to download any example code or extra resources for this book. The files available for download will then be displayed.

The downloadable files contain instructions on how to use them.

Errata

Although we have taken every care to ensure the accuracy of our contents, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in text or code—we would be grateful if you would report this to us. By doing this you can save other readers from frustration, and help to improve subsequent versions of this book. If you find any errata, report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the Submit Errata link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata are added to the list of existing errata. The existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Questions

You can contact us at <[email protected]> if you are having a problem with some aspect of the book, and we will do our best to address it.

Chapter 1. Introduction to Mashups

Mashups, more specifically called web application hybrids by Wikipedia, have been an exciting trend in web applications in recent years. Web mashups are exactly what they sound like: web applications that merge data from two or more sources and present them in new ways. Very often, the data owners encourage and facilitate third parties to use the data. In many cases, this facilitation is made possible by the data owners providing application programming interfaces (APIs) to their data. These APIs follow standard web service protocols and can be implemented quickly and easily in a variety of programming languages, including PHP. New, innovative mashups, made by individuals who combine data from traditionally unlikely pairings, are popping up every day.

One example is the Wii Seeker site. When the Nintendo Wii launched in November 2006, many knew there would be shortages. The object of the Wii Seeker site was to help people find Wiis by combining expected initial shipment information for Target stores with Google Maps. Each marker on a Google Map represented a Target retail store. If the user clicked on the marker, they would see information about the store, such as the address. They would also see the number of Wiis the store was expected to have on launch day. By representing numerical inventory data on a map, the site let a user see Target stores near their location and plan their store visits on launch day to maximize their chances of actually finding a Wii.

After the Nintendo Wii was launched, the site reinvented itself by adding auction information from eBay and product information from Amazon. They also added additional chain retail stores like Circuit City and Walmart. Instead of seeing Nintendo Wii inventory information on each store, the site now allows visitors to post notes for each other about the store’s inventory.

Another mashup example is Astrolicio.us. This site queries data feeds from sites like Digg.com, Google News, and Google Videos and presents them to the user on one page. By combining data feeds, the site’s creator has made a portal of current astronomy news for visitors.

On the homepage, the user can quickly scan items that may interest them. For news, the user is given bullet points for each news item containing the headline and a synopsis. For videos, the user is shown a thumbnail. If a user clicks on a link, they are taken to the source of the article or video. This site is clean, simple, and full of information. It is also quite easy to make using the APIs of the sources. It probably did not take the site creator more than an afternoon to go from the start of coding to launch.

Web 2.0 and Mashups

How, in just a few short years, have mashups suddenly sprung up everywhere? The story leads back to just a few years ago. After the technology industry’s financial bubble collapsed in 2001, internet firms regrouped and redefined themselves. There were business lessons to be learned, technologies to be re-evaluated, and people’s perceptions had changed. By the middle of the decade, many trends and differences became clear. The term “Web 2.0” started to surface, to draw separation between new sites and sites that gained popularity in the late Nineties. The term was vague and seemed suspiciously gimmicky at first. However, the differences between old and new were real. They were not just historical and chronological. Sites like Google, YouTube, and Flickr demonstrated new approaches to building a web business. These sites often had simple interfaces, fully embraced web services, and returned a lot of control to the user. Many of these sites relied solely on their users for content. In September 2005, technology publisher Tim O’Reilly wrote an article entitled What Is Web 2.0 to succinctly declare the traits of Web 2.0 versus 1.0 sites. There were two characteristics that were direct catalysts for the growth of mashups:

Importance of Data

User Communities

Importance of Data

The first characteristic is the importance of data. The question of who owned data and what they chose to do with the data became a big issue. Why in the world would companies invest millions of dollars to gather their data and build their database systems, but then freely give it away for others to use? The answer is that, by opening their systems, data owners let mashup developers increase their reach.

O’Reilly used the example of MapQuest to illustrate this. MapQuest was the leader in mapping in the mid to late Nineties. However, their system was closed and did not allow outside parties to do anything with their data. In the early Aughts, competing mapping sites started to exploit this weakness. Yahoo! Maps, Microsoft Virtual Earth, and Google Maps entered the market, and each one had APIs. Despite the huge early market lead, MapQuest quickly lost to bigger players with open data. There are many examples like this. Amazon opened up their data through the Amazon Ecommerce Service (ECS). Many mashups have used this web service to create their own store fronts. Amazon gets the sale and gives a percentage to mashup developers. This has created many more channels for Amazon to sell their goods besides www.amazon.com. Contrast this with a site like BarnesAndNoble.com, which does not open their data. The only channel through which they can sell is the main website. Not only do they lose sales opportunities, but they lack the affiliate loyalty that Amazon has.

In our earlier examples, Wii Seeker helps Target by funneling buyers to stores. Wii Seeker, in turn, receives advertising revenue and affiliate commissions on their site. Google Videos, Google News, and Digg.com get visitors when a user clicks on a link from astrolicious.us. Astrolicious.us gets advertising revenue with very little development time invested.

User Communities

The second characteristic is that user-added data is more valuable than we once thought. User product reviews on ecommerce sites are nothing new. Neither are web forums. However, it is how sites are using this information, and who owns the data, that is becoming important. Movie rental site Netflix has always allowed users to rate movies they have watched. Based on these ratings, Netflix will suggest other movies you might like. Recently, they have added a new social networking feature called “Friends”, where you can see how your friends have rated movies and what they are watching. One feature of Friends is compatibility ratings. Comparing your ratings with your friends’, Netflix comes up with a percentage of your shared movie tastes.

Other sites are completely dependent on user-added data. YouTube and Flickr provide video and picture hosting, respectively, for free. Their widespread adoption, though, is not simply from hosting. Before Flickr, there were many sites that hosted images for free. That was nothing new. The difference, again, is what both sites do with user-added data. Both sites provide social networking features. You can leave your ratings and comments on a hosted item and you can subscribe to a person’s profile. Anytime that person uploads something, you will be notified of the new content. Both sites also allow folksonomic tagging, which basically lets uploaders describe the content with their own keywords. Visitors can use these keywords to search when they are looking for content. Tagging has proven to be an incredible aid for search algorithms.

Thus, it is these two characteristics of new sites that have allowed small web developers to appear much bigger. Backed with data from large internet presences, mashup developers create usage channels that data owners could not have foreseen, or were restricted from pursuing by business rules.

How We Will Create Mashups

Technologically, the mashup phenomenon could not have happened without website owners making a clean separation between the data that is used on their sites, and the actual presentation of the data. This has always been a goal in computer application development, and therefore, it is no surprise that website and web application architecture have progressed towards this stage ever since the World Wide Web was created. This separation is quickly turning the World Wide Web into what is known as the semantic web—a philosophy where web content is presented not only for humans to read, but also in a way that can be easily processed by software and machines. We have moved from static pages to database-driven sites, from presentational FONT tags to cascading style sheets. It is perhaps inevitable that the web has become an environment that fosters mashup development.

Data sources of mashups are varied. Often, data owners provide mashup developers access to their data through official application programming interfaces. As we are talking about web applications, these APIs utilize web services, which come in a variety of protocols. Really Simple Syndication (RSS), a family of formats to present data, is another common data source that has helped spur mashup adoption. When official methods are unavailable, developers become really creative in getting data. Screen scraping is a method that has always been around. Regardless of the method, mashups also deal with a variety of data formats. While mashups can be simple to create, a mashup developer must be flexible and well-rounded in the knowledge of their tools.

Open-source software is particularly well-suited to this mashup environment. The Apache and PHP combination makes for fast development. Because both are open source, developers are constantly and quickly adding new features to keep up with the web service world.

This book will take a look at how to use common data sources with PHP. Most official APIs are based on the big three web service protocols: XML-RPC, REST, and SOAP. We will, of course, look at these protocols. Calling APIs and writing raw web service requests by hand, however, are not the only ways to retrieve data. We will look at using third-party libraries to interface with some popular sites. Feeds are also an important data source which we will use. With a broad overview of the tools used in the mashup world, you should be able to start developing your own mashups quickly.

More Mashups

For more examples and inspirations, check out these popular mashups:

Popurls (popurls.com): Collects URLs from popular sites.

Housingmaps.com (www.housingmaps.com): Plots housing listings from Craigslist on to a map.

Keegy (us.keegy.com): A site that aggregates news from different sources and personalizes it for the reader.

Alkemis (local.alkemis.com): Aggregates and maps all sorts of data, for example, pictures and live web cams, in selected cities.

Gametripping.com (www.gametripping.com): A collection of satellite and Flickr photos of baseball stadiums.

Chapter 2. Buy it on Amazon

Project Overview

What: Build an application that takes UPC symbols and looks them up on Amazon.com.

Protocols Used: XML-RPC, REST

Data Formats: XML-RPC, XML

Tools Featured: PHP’s XML-RPC Functions and SAX Functions

APIs Used: Internet UPC Database, Amazon Web Services

We are going to start off with a relatively simple project. Our project will accept a Universal Product Code number from a user, look up the product information associated with the UPC number from the Internet UPC Database, and allow the user to buy the product from our site using Amazon.com. In other words, we are going to create an online store based on UPC numbers. By using Amazon.com’s inventory, users can buy from Amazon.com, but they’ll be able to do everything from our site alone. While such a site may not make us the next ecommerce king, it will introduce us to the two most basic web services—XML-RPC and REST. Each protocol will require us to structure our request in a certain way.

XML-RPC will return an XML document formatted to the XML-RPC specifications. REST responses are a lot more varied and free-form. They may be anything from a plain text string to huge, complex XML documents. Although most web services return a descriptive, well-formed XML document, REST responses are not bound to any standard or specification. We will create utilities to process both XML-RPC and REST requests that we can use for the current and future mashups.
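To make the contrast concrete, here is a minimal sketch of the REST side. It assumes a service that accepts its parameters in the query string of a GET request. The helper names, the example.com address, and the upc parameter are placeholders invented for illustration, not part of any real API.

```php
<?php
// Hypothetical helper: builds the URL for a REST-style GET request.
function buildRestUrl(string $endpoint, array $params): string
{
    // http_build_query() URL-encodes every key and value.
    return $endpoint . '?' . http_build_query($params, '', '&');
}

// Hypothetical helper: a rough guess at whether a response body is XML
// (its first non-blank character is "<") or free-form text.
function looksLikeXml(string $body): bool
{
    return strncmp(ltrim($body), '<', 1) === 0;
}

// example.com and the "upc" parameter are placeholders, not a real service.
$url = buildRestUrl('http://example.com/lookup', ['upc' => '012345678905']);
echo $url, "\n";
```

Because a REST response has no fixed shape, a check like looksLikeXml() is only a first step. Real code would need to know what each particular service promises to return.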

XML-RPC

As developers in today’s world, we should be familiar with XML. On the surface, it is a group of data that is packaged and organized neatly into opening and closing tags. A deeper look tells us that this structure of XML makes it easy for machines to read and process. Thus, while XML can be hard on human eyes, we know it is designed for machine communication. However, without any sort of agreed structure of the XML document by the machines, the advantages of XML are effectively eliminated. This is where XML-RPC comes in.

RPC is an acronym for Remote Procedure Call. XML-RPC was developed by Dave Winer of UserLand Software in 1998. Its purpose is to allow applications, regardless of how different the programs or their purposes are, to communicate with each other across a network in a standardized manner.

In computer terms, a procedure call is one piece of code asking another piece of code to do some work and hand back a result. A familiar local example is the operating system asking the input devices what you are doing: Which key did you just hit on the keyboard? Where did you move the mouse? What did you just click on with the mouse?

XML-RPC carries this idea into the networking world (the “Remote” part of RPC) by creating a standard for one program to get information from another program across the network. Program A sends a remote procedure call to Program B. This call may include parameters that Program B needs to retrieve the data.

For example, if the query is against a list of people’s names: Do you want to retrieve a list of only those whose first name is “Peter”? Do you want to narrow your search down to a city? The request, with all its parameters, is formatted in a generic way that Program B understands. Regardless of the data type or size, Program B returns the answer in a generic way that Program A understands. Program A can then do with the data whatever the user requested.

XML-RPC Structure

Two programs communicating across a network is obviously very different from an operating system talking with a mouse. An operating system has the advantage of super high speed internal buses and the ability to talk on a lower machine level. A procedure call using XML-RPC must be program neutral and friendly to the network transport protocol. It does not have the luxury of constantly polling the other machine hundreds of times per second. Thus, XML-RPC communication must accomplish its mission in the most efficient means possible in the lowest common denominator. This is accomplished by dividing calls into strictly formed XML requests and responses.

The Official Specifications

We are going to take a casual tour of an XML-RPC request and response. For more formal details, you can read the official XML-RPC specifications at http://www.xmlrpc.com/spec

XML-RPC Request

XML-RPC requests function as HTTP POST requests. Therefore, each request must have a proper HTTP POST header. The actual remote call and its parameters, in XML format, follow the header as the body of the HTTP request.

POST /RPC2 HTTP/1.0
User-Agent: PHP5 XML-RPC Client (Mac OS X)
Host: betty.userland.com
Content-Type: text/xml
Content-length: 181

<?xml version="1.0"?>
<methodCall>
  <methodName>examples.getStateName</methodName>
  <params>
    <param>
      <value><int>42</int></value>
    </param>
  </params>
</methodCall>

The first line of the header (POST /RPC2) and the Host line tell us that this XML-RPC call goes to a web service that sits at betty.userland.com/RPC2. The name of the method being called, in this case examples.getStateName, is the first useful piece of information in the message body. We pass the integer 42 as the parameter to examples.getStateName. Let’s take a look at these elements one by one:

The root element in an XML-RPC call is methodCall. It has one required child element, methodName, which specifies the name of the call to be requested. There can only be one methodCall per request. If parameters are passed to the call, they are encapsulated in the params element.

A procedure call can require an unlimited number of parameters. XML-RPC calls do not have named parameters. In other words, you do not name your parameter before assigning a value, for example:

// This is wrong. Parameters are not named.
<param name="myInt">
  <value><int>42</int></value>
</param>

Instead, for functions requiring more than one parameter, the correct parameter order is defined by the remote function. You will have to check the API’s documentation for this information and make sure you order the parameters correctly.

// The correct way to differentiate parameters is by their order, as defined by the API
<params>
  <param><value><int>42</int></value></param>
  <param><value><int>13</int></value></param>
  <param><value><int>32</int></value></param>
</params>

In the request, each parameter is enclosed by a param element. Within each param, the actual parameter is wrapped up by a value element. Within this value element are the actual parameter values and data types.
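To make the nesting concrete, here is a minimal sketch that assembles a request body with PHP's DOM extension. It assumes PHP 7 or later for the type declarations, and buildIntRequest() is a hypothetical helper name of our own, limited to integer parameters for brevity.

```php
<?php
// Sketch: build an XML-RPC request body with the DOM extension.
// buildIntRequest() is a hypothetical helper, not a library function.
function buildIntRequest(string $method, array $ints): string
{
    $doc  = new DOMDocument('1.0');
    $call = $doc->appendChild($doc->createElement('methodCall'));

    // methodName is the one required child of methodCall.
    $name = $call->appendChild($doc->createElement('methodName'));
    $name->appendChild($doc->createTextNode($method));

    $params = $call->appendChild($doc->createElement('params'));

    // Parameters are positional: array order becomes document order.
    foreach ($ints as $int) {
        $param = $params->appendChild($doc->createElement('param'));
        $value = $param->appendChild($doc->createElement('value'));
        $value->appendChild($doc->createElement('int', (string) (int) $int));
    }

    return $doc->saveXML();
}

$body = buildIntRequest('examples.getStateName', [42]);
echo $body;
```

Because the parameters are unnamed, the only thing the caller controls is the order of the array passed in, which is why the API documentation matters so much here.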

XML-RPC Data Types

XML-RPC parameters can be one of the following three types. Each is represented in a different way inside the value element.

Scalars: basic primitive data types

Arrays: similar to PHP numerically indexed arrays

Structs: equivalent to associative arrays in the PHP world

Right now, we are using these data types and their structural definitions as request parameters. However, the same data types are used throughout XML-RPC. The server response will give us data in the same schema.

Scalar Values

Scalar values are the common primitive data types found in most languages. Almost every scalar value must be encapsulated by an element that declares what data type that value is.

String

This is your basic text string. It corresponds to the string data type in PHP.

<param>
  <value><string>Hello, world!</string></value>
</param>

Since this is the most common data type, any values of the value element without data type tags inside will default to string. For example, the following is perfectly legal in XML-RPC and will default to a string:

<param>
  <value>Goodbye, cruel world!</value>
</param>

Integer

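This is a four-byte signed integer. It corresponds to PHP’s integer type, with one caveat: PHP integers are 64 bits wide on 64-bit platforms, so values outside the range -2,147,483,648 to 2,147,483,647 will not fit. The element can be written either with the more obvious tag int or with i4, for four-byte integer.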

<param>
  <value><int>42</int></value>
</param>

// This is the same:
<param>
  <value><i4>42</i4></value>
</param>

Double

Double is a double-precision signed floating point number. It is the equivalent of float in PHP. Older PHP code sometimes refers to this type as double, which is simply another name for float.

<param>
  <value><double>-44.352301</double></value>
</param>

Boolean

Boolean is your basic true or false state. It is the same as the PHP boolean. The difference is that in PHP, and many other languages for that matter, boolean can be represented by the keywords TRUE or FALSE, or a numerical setting of 1 for true and 0 for false. In XML-RPC, booleans can only be represented with 1 or 0.

<param>
  <value><boolean>0</boolean></value>
</param>

Date/Time

A date/time data type specifies a date and a time value, up to the second. It follows the format YYYYMMDDTHH:MM:SS. Year, month, day, hour, minute, and second should be apparent in that format. The “T”, however, is a literal. There is no date/time data type in PHP, so you will have to represent this as a string.

<param>
  <value>
    <dateTime.iso8601>20060710T11:30:32</dateTime.iso8601>
  </value>
</param>

In PHP 5, the date function accepts a format character of “c” that returns a date/time string in ISO 8601 format, for example 2006-07-10T11:30:32+00:00. This is not the exact format that XML-RPC needs, but later we will use a function to automatically encode it into a date/time type.
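If we want the exact XML-RPC layout by hand, a custom format string does the job. The sketch below assumes PHP 7 or later and uses gmdate so the result is in UTC; that choice is our assumption. xmlrpcDate() is a hypothetical helper name.

```php
<?php
// Format a Unix timestamp as YYYYMMDDTHH:MM:SS for XML-RPC.
// The backslash keeps "T" literal; unescaped, date() would treat it
// as the timezone-abbreviation format character.
function xmlrpcDate(int $timestamp): string
{
    // gmdate() formats in UTC (an assumption made for this sketch).
    return gmdate('Ymd\TH:i:s', $timestamp);
}

$stamp = gmmktime(11, 30, 32, 7, 10, 2006);

echo xmlrpcDate($stamp), "\n";   // XML-RPC layout
echo gmdate('c', $stamp), "\n";  // the "c" layout, for comparison
```

The two lines printed differ only in punctuation and the trailing offset, which is why a plain date('c') call cannot be dropped straight into a dateTime.iso8601 element.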

Base64-Encoded Binary

To transfer binary information via XML-RPC, encode it in base64 and wrap it in base64 tags.

<param>
  <value>
    <base64>Pj4UijBdhLr6IdvCc0Ad3NVP4OidTd8E1kRY5Edh</base64>
  </value>
</param>

There is no binary data type in PHP, but you can read a file into a string with the file_get_contents function and then encode that string for XML-RPC transfer with base64_encode.
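As a small sketch of that idea, the following wraps raw bytes in a base64 value using PHP's base64_encode function. base64Value() is a hypothetical helper name, the file path in the comment is a placeholder, and the type declarations assume PHP 7 or later.

```php
<?php
// Wrap raw bytes in an XML-RPC base64 value.
// base64Value() is a hypothetical helper, not a built-in function.
function base64Value(string $bytes): string
{
    return '<value><base64>' . base64_encode($bytes) . '</base64></value>';
}

// For a file, read it into a string first. 'photo.jpg' is a placeholder path:
// $xml = base64Value(file_get_contents('photo.jpg'));

echo base64Value('Hello');
```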

Arrays

Numerically indexed arrays are passed as a single structure within the value element. Arrays are defined with a specific structure of child and grandchild elements.

<params>
  <param>
    <value>
      <array>
        <data>
          <value><string>One</string></value>
          <value><boolean>1</boolean></value>
          <value><double>4.307400</double></value>
        </data>
      </array>
    </value>
  </param>
</params>

There are two levels of children before we actually see the data: array, which defines the value of the parameter to be an array, and data, which signals the start of the data. The data is encapsulated exactly like scalar values. There is a value element for each item in the array, and each may have a child element that defines the data type of the value.

Arrays can be nested. Any value element can itself contain another array, as long as that array follows the same array/data/value sequence.

Struct

Structs are the XML-RPC representation of PHP’s associative arrays: each item pairs a named key with a value. Like arrays, a struct is defined by a single struct element inside value. Each item is a member element, and each member has a required name element, which names the item, and one value element, which holds the value. As with arrays, each value element follows the same rules as any other XML-RPC value, so it can hold a scalar, an array, or another struct.

<value>
  <struct>
    <member>
      <name>One</name>
      <value><string>This is a string</string></value>
    </member>
    <member>
      <name>Two</name>
      <value><boolean>1</boolean></value>
    </member>
    <member>
      <name>This is a Name</name>
      <value><double>-98.330000</double></value>
    </member>
  </struct>
</value>
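The mapping between PHP values and these XML-RPC types can be sketched in a single recursive function. This is an illustration under our own assumptions, not a complete encoder: encodeValue() is a hypothetical name, PHP 7 or later is assumed, doubles are rounded to six decimal places, and date/time and base64 values are left out.

```php
<?php
// Sketch: map PHP values onto XML-RPC <value> markup.
// encodeValue() is a hypothetical helper, not a built-in function.
function encodeValue($v): string
{
    if (is_bool($v)) {
        // XML-RPC booleans are written as 1 or 0 only.
        return '<value><boolean>' . ($v ? 1 : 0) . '</boolean></value>';
    }
    if (is_int($v)) {
        // XML-RPC ints are four bytes; PHP ints may be wider.
        if ($v < -2147483648 || $v > 2147483647) {
            throw new RangeException('Integer does not fit in four bytes');
        }
        return "<value><int>$v</int></value>";
    }
    if (is_float($v)) {
        // %F gives plain decimal notation with six decimal places.
        return '<value><double>' . sprintf('%F', $v) . '</double></value>';
    }
    if (is_string($v)) {
        return '<value><string>' . htmlspecialchars($v, ENT_XML1) . '</string></value>';
    }
    if (is_array($v)) {
        if ($v === array_values($v)) {
            // Keys 0..n-1 in order: a numerically indexed array.
            $items = implode('', array_map('encodeValue', $v));
            return "<value><array><data>$items</data></array></value>";
        }
        // Anything else is treated as an associative array, i.e. a struct.
        $members = '';
        foreach ($v as $name => $item) {
            $members .= '<member><name>'
                      . htmlspecialchars((string) $name, ENT_XML1)
                      . '</name>' . encodeValue($item) . '</member>';
        }
        return "<value><struct>$members</struct></value>";
    }
    throw new InvalidArgumentException('Unsupported type: ' . gettype($v));
}

echo encodeValue(['One' => 'This is a string', 'Two' => true]);
```

Note that an empty PHP array is encoded as an empty XML-RPC array here, since PHP itself cannot tell an empty list from an empty associative array.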

XML-RPC requests, then, are basically HTTP POST actions that specify a remote method to be called with properly formatted parameters. Let’s take a look at the response back from the server.
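One way to perform that POST from PHP is with the built-in HTTP stream wrapper. The sketch below is only an outline under our own assumptions: xmlrpcHttpOptions() and postXmlRpc() are hypothetical helper names, PHP 7 or later is assumed, and the endpoint URL is whatever service you are calling. Nothing is sent until postXmlRpc() is actually invoked.

```php
<?php
// Sketch: HTTP options for POSTing an XML-RPC body with PHP streams.
// Both function names are hypothetical helpers for illustration.
function xmlrpcHttpOptions(string $body): array
{
    return ['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: text/xml\r\n"
                   . 'Content-Length: ' . strlen($body),
        'content' => $body,
    ]];
}

// Sends the request and returns the raw response body,
// or false if the request could not be completed.
function postXmlRpc(string $url, string $body)
{
    $context = stream_context_create(xmlrpcHttpOptions($body));
    return file_get_contents($url, false, $context);
}

// Inspect the options without touching the network.
print_r(xmlrpcHttpOptions('<methodCall/>'));
```

Keeping the option building separate from the sending makes the header logic easy to check on its own.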

XML-RPC Response

Once we make a request, we can expect one of two types of responses from the service. If an XML-RPC request was successful, we will receive the data we requested, returned to us in a fashion defined by the XML-RPC specifications. If there was an error, a special XML-RPC fault message will be returned.

Similar to a regular web page call, a header will be returned with the results in the body.

HTTP/1.x 200 OK
Date: Fri, 11 Aug 2007 23:34:43 GMT
Server: Apache/1.3.33 (Darwin) PHP/5.1.4 DAV/1.0.3
X-Powered-By: PHP/5.1.4
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml

<?xml version="1.0" encoding="iso-8859-1"?>
<methodResponse>
  <params>
    <param>
      <value><string>A woeful jeremiad</string></value>
    </param>
  </params>
</methodResponse>

This is a successful XML-RPC response. It should look familiar to you. methodResponse is the root element that defines this as a response. Following that is a params child, which holds a single param element, which in turn holds a single value. That value follows the same rules we saw earlier for request parameters. This example shows a single string value that the service returns. A service that needs to return several values does so by making that one value an array or a struct. For example, an array returned from the service would look like this:

<methodResponse>
  <params><param><value>
    <array>
      <data>
        <value><string>system.multicall</string></value>
        <value><string>system.listMethods</string></value>
        <value><string>system.getCapabilities</string></value>
      </data>
    </array>
  </value></param></params>
</methodResponse>
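Going the other way, a response value can be turned back into PHP data with SimpleXML. The following is a rough sketch under our own assumptions: decodeValue() is a hypothetical helper name, PHP 7 or later is assumed, and date/time values are simply returned as strings.

```php
<?php
// Sketch: turn an XML-RPC <value> element into a PHP value, recursively.
// decodeValue() is a hypothetical helper, not a built-in function.
function decodeValue(SimpleXMLElement $value)
{
    $children = $value->children();
    if (count($children) === 0) {
        return (string) $value;   // untyped values default to string
    }

    $typed = $children[0];
    switch ($typed->getName()) {
        case 'int':
        case 'i4':
            return (int) (string) $typed;
        case 'double':
            return (float) (string) $typed;
        case 'boolean':
            return (string) $typed === '1';
        case 'base64':
            return base64_decode((string) $typed);
        case 'array':
            $list = [];
            foreach ($typed->data->value as $item) {
                $list[] = decodeValue($item);
            }
            return $list;
        case 'struct':
            $map = [];
            foreach ($typed->member as $member) {
                $map[(string) $member->name] = decodeValue($member->value);
            }
            return $map;
        default:
            return (string) $typed;   // string, dateTime.iso8601
    }
}

$xml = '<methodResponse><params><param><value><array><data>'
     . '<value><string>system.multicall</string></value>'
     . '<value><string>system.listMethods</string></value>'
     . '</data></array></value></param></params></methodResponse>';

$doc = simplexml_load_string($xml);
print_r(decodeValue($doc->params->param->value));
```

Because arrays and structs call decodeValue() on their own items, nested structures come back as nested PHP arrays without any extra code.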

If the service could not fulfill your request, it will return an XML-RPC fault. Instead of a params element, methodResponse will have a fault element. methodResponse will always have either a params child or a fault child, but not both.

An XML-RPC fault is basically a struct that is returned to you. There are two named members in this struct: faultString, a human-readable description of the error, and faultCode, an integer assigned by the service. The XML-RPC specifications do not define or standardize the values of either member. They depend solely on the server implementation.

HTTP/1.x 200 OK
Date: Fri, 11 Aug 2007 23:41:18 GMT
Server: Apache/1.3.33 (Darwin) PHP/5.1.4 DAV/1.0.3
X-Powered-By: PHP/5.1.4
Keep-Alive: timeout=15, max=100
Connection: Keep-Alive
Content-Type: text/xml

<?xml version="1.0"?>
<methodResponse>
  <fault>
    <value>
      <struct>
        <member>
          <name>faultCode</name>
          <value><int>4</int></value>
        </member>
        <member>
          <name>faultString</name>
          <value><string>Too many parameters.</string></value>
        </member>
      </struct>
    </value>
  </fault>
</methodResponse>
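Since a response carries either params or fault, client code usually checks for the fault first. Here is a minimal sketch using SimpleXML, assuming PHP 7 or later. readResponse() and valueText() are hypothetical helper names, and the sketch only handles simple scalar results.

```php
<?php
// Sketch: tell a normal XML-RPC response from a fault using SimpleXML.
// Both function names are hypothetical helpers for illustration.

// Text of a <value>, whether or not it has a type element inside.
function valueText(SimpleXMLElement $value): string
{
    $typed = $value->children();
    return count($typed) > 0 ? (string) $typed[0] : (string) $value;
}

function readResponse(string $xml): array
{
    $doc = simplexml_load_string($xml);
    if ($doc === false) {
        throw new RuntimeException('Response is not well-formed XML');
    }

    if (isset($doc->fault)) {
        // A fault is a struct with faultCode and faultString members.
        $fault = [];
        foreach ($doc->fault->value->struct->member as $member) {
            $fault[(string) $member->name] = valueText($member->value);
        }
        return [
            'ok'      => false,
            'code'    => (int) $fault['faultCode'],
            'message' => $fault['faultString'],
        ];
    }

    return ['ok' => true, 'result' => valueText($doc->params->param->value)];
}

$sample = '<?xml version="1.0"?><methodResponse><fault><value><struct>'
        . '<member><name>faultCode</name><value><int>4</int></value></member>'
        . '<member><name>faultString</name>'
        . '<value><string>Too many parameters.</string></value></member>'
        . '</struct></value></fault></methodResponse>';

print_r(readResponse($sample));
```

What the application does with the code and message is up to the service's documentation, since neither value is standardized.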