PHP and MongoDB Web Development Beginner's Guide - Rubayeet Islam - E-Book

Rubayeet Islam

Description

With the rise of Web 2.0, the need for a highly scalable database capable of storing diverse user-generated content is increasing. MongoDB, an open-source, non-relational database, has stepped up to meet this demand and is being used in some of the most popular websites in the world. MongoDB is one of the NoSQL databases that are gaining popularity for developing PHP Web 2.0 applications.

PHP and MongoDB Web Development Beginner’s Guide is a fast-paced, hands-on guide to get started with web application development using PHP and MongoDB. The book follows a “Code first, explain later” approach, using practical examples in PHP to demonstrate unique features of MongoDB. It does not overwhelm you with information (or starve you of it), but gives you enough to get a solid practical grasp of the concepts.

The book starts by introducing the underlying concepts of MongoDB. Each chapter contains practical examples in PHP that teach specific features of the database.

The book teaches you to build a blogging application, handle user sessions and authentication, and perform aggregation with MapReduce. You will learn unique MongoDB features and solve interesting problems such as real-time analytics and location-aware web apps. You will be guided to use MongoDB alongside MySQL to build a diverse data back-end.
With its concise coverage of concepts and numerous practical examples, PHP and MongoDB Web Development Beginner’s Guide is the right choice for the PHP developer to get started with learning MongoDB.

This e-book can be read in the Legimi apps or in any app that supports the following format:

EPUB

Year of publication: 2011




Table of Contents

PHP and MongoDB Web Development
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers and more
Why Subscribe?
Free Access for Packt account holders
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Time for action - heading
What just happened?
Pop Quiz - heading
Have a go hero - heading
Reader feedback
Customer support
Downloading the example code
Errata
Piracy
Questions
1. Getting Started with MongoDB
The NoSQL movement
Types of NoSQL databases
MongoDB - A document-based NoSQL database
Why MongoDB?
Who is using MongoDB?
MongoDB concepts—Databases, collections, and documents
Anatomy of document
BSON—The data exchange format for MongoDB
Similarity with relational databases
Downloading, installing, and running MongoDB
System requirements
Time for action - downloading and running MongoDB on Windows
What just happened?
Installing the 64-bit version
Time for action - downloading and running MongoDB on Linux
What just happened?
Installing MongoDB on OS X
Configuring MongoDB
Command-line parameters
File-based configuration
Have a go hero - configure MongoDB to run with non-default settings
Stopping MongoDB
Hitting Control + C
From the mongo shell
Sending INT or TERM signal in UNIX
Creating databases, collections, and documents
Time for action - creating databases, collections, and documents
What just happened?
Pop Quiz - configuring MongoDB
Installing the PHP driver for MongoDB
Time for action - installing PHP driver for MongoDB on Windows
What just happened?
Installing the PHP-MongoDB driver on Unix
Connecting to the MongoDB server from PHP
Creating a PHP-Mongo connection
Time for action - creating a connection to the MongoDB server from PHP
What just happened?
Configuring the PHP-MongoDB connection
Specifying timeout for the connection attempt
Have a go hero - connect to a MongoDB server on a networked computer
Summary
2. Building your First MongoDB Powered Web App
A MongoDB powered blog
Have the MongoDB server running
Inserting documents in MongoDB
Time for action - building the Blog Post Creator
What just happened?
Creating databases and collections implicitly
Performing 'safe' inserts
Benefits of safe inserts
Specifying a timeout on insert
Setting the user generated _id
The MongoDate object
Have a go hero - allow storing tags for an article
Querying documents in a collection
Time for action - retrieving articles from a database
What just happened?
The Mongo Query Language
The MongoCursor object
Conditional Queries
Pop Quiz - what does this query do?
Doing advanced queries in MongoDB
Time for action - building the Blog Dashboard
What just happened?
Returning a subset of fields
Sorting the query results
Using count, skip, and limit
Performing range queries on dates
Have a go hero - rewrite blogs.php
Updating documents in MongoDB
Time for action - building the Blog Editor
What just happened?
Optional arguments to the update method
Performing 'upsert'
Using update versus using save
Using modifier operations
Setting with $set
Incrementing with $inc
Deleting fields with $unset
Renaming fields with $rename
Have a go hero - merge Blog editor and creator into a single module
Deleting documents in MongoDB
Time for action - deleting blog posts
What just happened?
Optional arguments to remove
Managing relationships between documents
Embedded documents
Referenced documents
Time for action - posting comments to blog posts
What just happened?
Embedded versus referenced - Which one to use?
Querying embedded objects
Have a go hero - get comments by username
Summary
3. Building a Session Manager
Understanding HTTP sessions
Understanding PHP native session handling
Time for action - testing native PHP session handling
What just happened?
Limitations of native PHP session handling
Implementing session handling with MongoDB
Extending session handling with session_set_save_handler
The SessionManager class
Time for action - building the SessionManager class
What just happened?
How the SessionManager works
The constructor
The open and close methods
The read method
The write method
The destroy method
The gc method
Pop Quiz - what does session_destroy() do?
Putting the SessionManager in action
Time for action - putting SessionManager into action
What just happened?
Building the user authentication module
Time for action - building the User class
What just happened?
Creating the login, logout, and user profile page
Time for action - creating the login, logout, and profile page
What just happened?
Have a go hero - implement user authentication in the blogging web app
Using good session practices
Setting low expiry times of session cookies
Using session timeouts
Setting proper domains for session cookies
Checking for browser consistency
Have a go hero - store and verify User Agent in SessionManager
Summary
4. Aggregation Queries
Generating sample data
Time for action - generating sample data
What just happened?
Understanding MapReduce
Visualizing MapReduce
Pop Quiz - MapReduce basics
Performing MapReduce in MongoDB
Time for action - counting the number of articles for each author
What just happened?
Defining the Map function
Defining the Reduce function
Applying the Map and Reduce
Viewing the results
Performing MapReduce on a subset of the collection
Concurrency
Performing MongoDB MapReduce within PHP
Time for action - creating a tag cloud
What just happened?
Have a go hero - repeat the earlier example with PHP
Performing aggregation using group()
Time for action - calculating the average rating per author
What just happened?
Grouping by custom keys
MapReduce versus group()
Have a go hero - find the maximum and minimum rating for each author
Pop Quiz - limitation of group()
Listing distinct values for a field
Time for action - listing distinct categories of articles
What just happened?
Using distinct() in mongo shell
Counting documents with count()
Summary
5. Web Analytics using MongoDB
Why MongoDB is a good choice as a web analytics backend
Logging with MongoDB
Time for action - logging page visits with MongoDB
What just happened?
Capped collections
Sorting in natural order
Updating and deleting documents in a capped collection
Specifying the size of a regular collection
Convert a regular collection to a capped one
Pop Quiz - capped collection
Extracting analytics data with MapReduce
Time for action - finding total views and average response time per blog post
What just happened?
The map, reduce, and finalize functions
Displaying the result
Running MapReduce in real time versus running it in the background
Have a go hero - find out usage share of browsers for the site
Real-time analytics using MongoDB
Time for action - building a real-time page visit counter
What just happened?
Have a go hero - get unique page visits in real time
Summary
6. Using MongoDB with Relational Databases
The motivation behind using MongoDB and an RDBMS together
Potential use cases
Defining the relational model
Time for action - creating the database in MySQL
What just happened?
Caching aggregation results in MongoDB
Time for action - storing the daily sales history of products in MongoDB
What just happened?
Benefits of caching queries in MongoDB
Storing results of expensive JOINs
Have a go hero - replacing Views with MongoDB
Using MongoDB for data archiving
Time for action - archiving old sales records in MongoDB
What just happened?
Challenges in archiving and migration
Dealing with foreign key constraints
Preserving data types
Storing metadata in MongoDB
Time for action - using MongoDB to store customer metadata
What just happened?
Problems with using MongoDB and RDBMS together
Summary
7. Handling Large Files with GridFS
What is GridFS?
The rationale of GridFS
The specification
Advantages over the filesystem
Pop Quiz - what is the maximum size of BSON objects?
Storing files in GridFS
Time for action - uploading images to GridFS
What just happened?
Looking under the hood
Have a go hero - perform multiple file uploads in GridFS
Serving files from GridFS
Time for action - serving images from GridFS
What just happened?
Updating metadata of a file
Deleting files
Have a go hero - create an image gallery with GridFS
Reading files in chunks
Time for action - reading images in chunks
What just happened?
When should you not use GridFS
Summary
8. Building Location-aware Web Applications with MongoDB and PHP
A geolocation primer
Methods to determine location
Pop Quiz - locating a smartphone
Detecting the location of a web page visitor
The W3C Geolocation API
Browsers that support geolocation
Time for action - detecting location with W3C API
What just happened?
The Geolocation object
The getCurrentPosition() method
Drawing the map using the Google Maps API
Geospatial indexing
Time for action - creating geospatial indexes
What just happened?
Geospatial indexing - Important things to know
Performing location queries
Time for action - finding restaurants near your location
What just happened?
The geoNear() command
Bounded queries
Geospatial haystack indexing
Time for action - finding nearby restaurants that serve burgers
What just happened?
Summary
9. Improving Security and Performance
Enhancing query performance using indexes
Time for action - creating an index on a MongoDB collection
What just happened?
The _id index
Unique indexes
Compound keys indexes
Indexing embedded document fields
Indexing array fields
Deleting indexes
When indexing cannot be used
Indexing guidelines
Choose the keys wisely
Keep an eye on the index size
Avoid using low-selectivity single key indexes
Be aware of indexing costs
On a live database, run indexing in the background
Pop Quiz - the indexing MCQ test
Have a go hero - implement search in the blogging application
Optimizing queries
Explaining queries using explain()
Optimization rules
Have a go hero - compare outputs of explain() for indexed and non-indexed queries
Using hint()
Profiling queries
Understanding the output
Optimization rules
Securing MongoDB
Time for action - adding user authentication in MongoDB
What just happened?
Creating an admin user
Creating regular user
Viewing, changing, and deleting user accounts
User authentication through PHP driver
Have a go hero - modify DBConnection class to add user authentication
Filtering user input
Running MongoDB server in a secure environment
Ensuring data durability
Journaling
Performance
Using fsync
Replication
Pop Quiz - flushing data to disk
Summary
10. Easy MongoDB Administration with RockMongo and phpMoAdmin
Administering MongoDB with RockMongo
Time for action - installing RockMongo on your computer
What just happened?
Exploring data with RockMongo
Querying
Updating, deleting, and creating documents
Importing and exporting data
Viewing stats
Miscellaneous
Using phpMoAdmin to administer MongoDB
Time for action - installing phpMoAdmin on your computer
What just happened?
Viewing databases and collections
Querying documents
Saving and deleting objects
Importing and exporting data
Viewing stats
Other features
RockMongo versus phpMoAdmin
The verdict
Summary
A. Pop Quiz Answers
Chapter 1, Getting Started with MongoDB
Chapter 2, Building your First MongoDB Powered Web App
Chapter 3, Building a Session Manager
Chapter 4, Aggregation Queries
Chapter 5, Web Analytics using MongoDB
Chapter 7, Handling Large Files with GridFS
Chapter 8, Building Location-aware Web Applications with MongoDB and PHP
Chapter 9, Improving Security and Performance
Index

PHP and MongoDB Web Development

Copyright © 2011 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: November 2011

Production Reference: 1181111

Published by Packt Publishing Ltd. Livery Place 35 Livery Street Birmingham B3 2PB, UK.

ISBN 978-1-84951-362-3

www.packtpub.com

Cover Image by Charwak A ( <[email protected]> )

Credits

Author

Rubayeet Islam

Reviewers

Sam Millman

Sigert de Vries

Nurul Ferdous

Vidyasagar N V

Acquisition Editor

Usha Iyer

Development Editor

Susmita Panda

Technical Editors

Joyslita D'Souza

Veronica Fernandes

Lubna Shaikh

Copy Editor

Laxmi Subramanian

Project Coordinator

Kushal Bhardwaj

Proofreader

Matthew Humphries

Indexer

Tejal Daruwale

Graphics

Valentina D'silva

Production Coordinator

Prachali Bhiwandkar

Cover Work

Prachali Bhiwandkar

About the Author

Rubayeet Islam is a Software Developer with over 4 years of experience in large-scale web application development on open source technology stacks (LAMP, Python/Django, Ruby on Rails). He is currently involved in developing cloud-based distributed software systems that use MongoDB as their analytics and metadata backend. He has also spoken at seminars to promote the use of MongoDB and NoSQL databases in general. He graduated from the University of Dhaka with a B.S. in Computer Science and Engineering.

I thank the Almighty for giving me such a blessed life and my parents for letting me follow my passion. My friend and colleague, Nurul Ferdous, for inspiring me to be an author in the first place. Finally, the amazing people at Packt—Usha Iyer, Kushal Bhardwaj, Priya Mukherji, and Susmita Panda, without your help and guidance this book would not have been possible to write.

About the Reviewers

Sam Millman, after achieving a B.Sc. in Computing from the University of Plymouth, immediately moved to advance his knowledge within Web development, specifically PHP. He is a fully self-taught professional Web Developer and IT Administrator working for a company in the south of England.

He first started to show an interest in MongoDB when he went in search of something new to learn. Now he is an active user of the MongoDB Google User Group and is about to release a new site written in PHP with MongoDB as the primary data store.

Sigert de Vries (1983) is a professional Web Developer working in The Netherlands. He has worked in several companies as a System Administrator and Web Developer. He is a specialist in high performance websites and is an open source enthusiast. With his communicative skills, he translates advanced technical issues to "normal" human language.

Sigert is currently working at Worldticketshop.com, helping them to be one of the largest ticket marketplaces in Europe. Within the company, there's plenty of room to use NoSQL solutions such as MongoDB.

I would like to thank Packt publishing for asking me to review this book, it has been a pleasure!

Vidyasagar N V was interested in Computer Science since an early age. Some of his serious work in computers and computer networks began during his high school days. Later, he went to the prestigious Institute Of Technology, Banaras Hindu University for his B.Tech. He has been working as a Software Developer and Data Expert, developing and building scalable systems. He has worked with a variety of 2nd, 3rd, and 4th generation languages. He has worked with flat files, indexed files, hierarchical databases, network databases, relational databases, NoSQL databases, Hadoop, and related technologies. Currently, he is working as a Senior Developer at Ziva Software Pvt. Ltd., developing big database-structured data-extraction techniques for the Web and local information. He enjoys producing high-quality software, web-based solutions, and designing secure and scalable data systems.

I would like to thank my parents, Mr. N Srinivasa Rao and Mrs. Latha Rao, and my family who supported and backed me throughout my life. My friends for being friends, and all those people willing to donate their time, effort, and expertise by participating in open source software projects. Thank you Packt Publishing for selecting me as one of the technical reviewers on this wonderful book. It is my honor to be a part of this book. You can contact me at <[email protected]>.

www.PacktPub.com

Support files, eBooks, discount offers and more

You might want to visit www.PacktPub.com for support files and downloads related to your book.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

http://PacktLib.PacktPub.com

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.

Why Subscribe?

Fully searchable across every book published by Packt

Copy and paste, print and bookmark content

On demand and accessible via web browser

Free Access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

Preface

MongoDB is an open source, non-relational database system designed to meet the needs of modern Web 2.0 applications. It is currently being used by some of the most popular websites in the world. This book introduces MongoDB to the web developer who has some background building web applications using PHP. This book teaches what MongoDB is, how it is different from relational database management systems, and when and why developers should use it instead of a relational database for storing data.

You will learn how to build PHP applications that use MongoDB as the data backend, and how to solve common problems such as HTTP session handling and user authentication.

You will also learn to solve interesting problems with MongoDB, such as web analytics with MapReduce, storing large files in GridFS, and building location-aware applications using Geospatial indexing.

Finally, you will learn how to optimize MongoDB to boost performance, improve security, and ensure data durability. The book will demonstrate the use of some handy GUI tools that make database management easier.

What this book covers

Chapter 1, Getting Started with MongoDB introduces the underlying concepts of MongoDB, provides a step-by-step guide on how to install and run a MongoDB server on a computer, and shows how to make PHP and MongoDB talk to each other.

Chapter 2, Building your First MongoDB Powered Web App shows you how to build a simple blogging web application using PHP and MongoDB. Through the examples in this chapter, you will learn how to create/read/update/delete data in MongoDB using PHP.

Chapter 3, Building a Session Manager shows you how PHP and MongoDB can be used to handle HTTP sessions. You will build a stand-alone session manager module and learn how to perform user authentication/authorization using the module.

Chapter 4, Aggregation Queries introduces MapReduce, a powerful functional programming paradigm and shows you how it can be used to perform aggregation queries in MongoDB.

Chapter 5, Web Analytics using MongoDB shows you how you can store website traffic data in MongoDB in real time and use MapReduce to extract important analytics.

Chapter 6, Using MongoDB with Relational Databases explores use cases where MongoDB can be used alongside a relational database. You will learn how to archive data in MongoDB, use it for caching expensive query results, and store non-structured metadata about different objects in the domain.

Chapter 7, Handling Large Files with GridFS introduces GridFS, a specification in MongoDB that allows us to store large files in the database.

Chapter 8, Building Location-aware Web Applications with MongoDB and PHP, uses PHP, HTML5, JavaScript, and the Geospatial Indexing feature of MongoDB to build a web application that helps you find restaurants close to your current location.

Chapter 9, Improving Security and Performance shows you how to boost query performance using indexes, use built-in tools for analyzing and fine-tuning queries, improve database security, and ensure data durability.

Chapter 10, Easy MongoDB Administration with RockMongo and phpMoAdmin demonstrates the use of a couple of PHP-based GUI tools for managing MongoDB server—RockMongo and phpMoAdmin.

What you need for this book

Apache web server (or IIS if you are on Windows) running PHP 5.2.6 or higher.

A web browser that supports the W3C Geolocation API (Internet Explorer 9.0+, Google Chrome 5.0+, Firefox 3.5+ or Safari 5.0+).

Chapter 6, Using MongoDB with Relational Databases requires that you have MySQL installed on your machine.

Who this book is for

This book assumes that you have some background in web application development using PHP, HTML, and CSS. Some of the chapters require that you know JavaScript and are familiar with AJAX. A working knowledge of a relational database system such as MySQL will help you grasp some of the concepts more quickly, but it is not strictly mandatory. No prior knowledge of MongoDB is required.

Conventions

In this book, you will find several headings appearing frequently.

To give clear instructions of how to complete a procedure or task, we use:

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <[email protected]>, and mention the book title via the subject of your message.

If there is a book that you need and would like to see us publish, please send us a note in the SUGGEST A TITLE form on www.packtpub.com or e-mail <[email protected]>.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.PacktPub.com. If you purchased this book elsewhere, you can visit http://www.PacktPub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/support, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works, in any form, on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors, and our ability to bring you valuable content.

Questions

You can contact us at <[email protected]> if you are having a problem with any aspect of the book, and we will do our best to address it.

Chapter 1. Getting Started with MongoDB

We are about to begin our journey in PHP and MongoDB web development. Since you picked up this book, I assume you have some background building web apps using PHP, and you are interested in learning to develop PHP applications with MongoDB as the data backend. In case you have never heard of MongoDB before, it is an open source, document-oriented database that supports the concept of flexible schema. In this chapter, we will learn what MongoDB is and what we gain from using it instead of trusted old SQL databases. We will start by learning briefly about NoSQL databases (a set of database technologies that are considered alternatives to relational database management systems), the basics of MongoDB, and what distinguishes it from relational databases. Then we will move on to installing and running MongoDB and hooking it up with PHP.

To sum it up, in this chapter we will:

Learn about the NoSQL movement

Learn the basic concepts behind MongoDB

Learn how to download, install, and run MongoDB on a computer

Learn to use the mongo Interactive Shell

Learn how to make PHP and MongoDB talk to each other

So let's get on with it...

The NoSQL movement

You have probably heard about NoSQL before. You may have seen it in the RSS feed headlines of your favorite tech blogs, or overheard a conversation between developers in your favorite restaurant during lunch. NoSQL (often expanded as "Not only SQL") is a term used to collectively identify a number of database systems that are fundamentally different from relational databases. NoSQL databases are increasingly being used in Web 2.0 applications and social networking sites, where the data is mostly user generated. Because user-generated content is so diverse, it is difficult to map it to a relational data model; the schema has to be kept as flexible as possible to reflect changes in the content. As the popularity of such a website grows, so do the amount of data and the number of read-write operations on it. With a relational database system, dealing with these problems is very hard. The developers of the application and the administrators of the database have to deal with the added complexity of scaling the database operations while keeping performance optimal. This is why popular websites (Facebook and Twitter, to name a few) have adopted NoSQL databases to store part or all of their data. These database systems have been developed (in many cases built from scratch by the developers of the web applications in question!) with the goal of addressing such problems, and are therefore more suitable for such use cases. They are open source, freely available on the Internet, and their use is gaining momentum in consumer and enterprise applications.

Types of NoSQL databases

The NoSQL databases currently being used can be grouped into four broad categories:

Key-value data stores: Data is stored as key-value pairs. Values are retrieved by keys. Redis, Dynomite, and Voldemort are examples of such databases.

Column-based databases: These databases organize the data in tables, similar to an RDBMS; however, they store the content by columns instead of rows. They are good for data warehousing applications. Examples of column-based databases are HBase, Cassandra, Hypertable, and so on.

Document-based databases: Data is stored and organized as a collection of documents. The documents are flexible; each document can have any number of fields. Apache CouchDB and MongoDB are prominent document databases.

Graph-based data stores: These databases apply graph theory from computer science to storing and retrieving data. They focus on the interconnectivity of different parts of the data. Units of data are visualized as nodes, and the relationships among them are defined by edges connecting the nodes. Neo4j is an example of such a database.
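To make the four categories a little more concrete, here is a minimal sketch that mimics each data model with plain in-memory JavaScript structures. It does not talk to any real database, and every variable name and sample value in it is made up for illustration.

```javascript
// Plain in-memory structures that mimic each NoSQL data model.
// Nothing here talks to a real database; all names and values are hypothetical.

// Key-value store: values are looked up by key.
const keyValueStore = new Map();
keyValueStore.set("user:1:name", "Alice");

// Column-based store: one array per column instead of one object per row.
const columnStore = {
  name: ["Alice", "Bob"],
  city: ["Dhaka", "Geneva"],
};

// Document store: a collection of documents whose fields may differ.
const documentStore = [
  { name: "Alice", city: "Dhaka" },
  { name: "Bob", tags: ["php", "mongodb"] },
];

// Graph store: nodes plus edges that describe relationships between them.
const graphStore = {
  nodes: ["Alice", "Bob"],
  edges: [{ from: "Alice", to: "Bob", type: "follows" }],
};

// Reading data back from each model.
const nameFromKeyValue = keyValueStore.get("user:1:name");
const cityFromColumns = columnStore.city[columnStore.name.indexOf("Alice")];
const cityFromDocuments = documentStore.find((d) => d.name === "Alice").city;
const followedByAlice = graphStore.edges
  .filter((e) => e.from === "Alice" && e.type === "follows")
  .map((e) => e.to);

console.log(nameFromKeyValue, cityFromColumns, cityFromDocuments, followedByAlice);
```

The point of the sketch is only the shape of the data: the same facts about a user can be stored as isolated key-value pairs, as columns, as self-contained documents, or as nodes and edges.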

MongoDB - A document-based NoSQL database

MongoDB falls into the group of document-oriented NoSQL databases. It is developed and maintained by 10gen (http://www.10gen.com). It is an open source database, written in the C++ programming language. The source code is licensed under the AGPL and freely available on GitHub; anyone can download it from the repository at https://github.com/mongodb/mongo and customize it to suit their needs. It is increasingly being used as a data storage layer in different kinds of applications, both web-based and non-web-based.

Why MongoDB?

Features that make learning and using MongoDB a win include:

It is easy to learn, or at least easier than other NoSQL systems, if I dare say. Column-oriented and graph-based databases introduce radical ideas that many developers struggle to grasp. By contrast, the basic concepts of MongoDB have a lot in common with those of a relational database, so developers coming from an RDBMS background have little trouble adapting to MongoDB.

It implements the idea of a flexible schema. You don't have to define the structure of the data before you start storing it, which makes it very suitable for storing non-structured data.

It is highly scalable. It comes with great features to help keep performance optimal as the size and traffic of data grow, with little or no change in the application layer.

It is free; it can be downloaded and used without charge.

It has excellent documentation and an active and co-operative online community whose members participate in mailing lists, forums, and IRC chat rooms.

Who is using MongoDB?

Let's take a look at some real world use cases of MongoDB:

Craigslist: Craigslist is the world's most popular website for free classified advertisements. It uses MongoDB to archive billions of records. It had previously used a MySQL-based solution for this. Replacing that solution with MongoDB has allowed Craigslist to make schema changes without delay and to scale much more easily.

Foursquare: Foursquare is a popular location-based social networking application. It stores the geographical location of interesting venues (restaurants, cafes, and so on) and records when users visit these venues. It uses MongoDB for storing venue and user information.

CERN: The renowned particle physics laboratory based in Geneva uses MongoDB as an aggregation cache for its Large Hadron Collider experiment. The results of expensive aggregation queries, performed on massive amounts of data, are stored in MongoDB for future use.

MongoDB concepts—Databases, collections, and documents

A MongoDB server hosts a number of databases. The databases act as containers of data and they are independent of each other. A MongoDB database contains one or more collections. For example, a database for a blogging application named myblogsite may typically have the collections articles, authors, comments, categories, and so on.

A collection is a set of documents. It is logically analogous to the concept of a table in a relational database. But unlike tables, you don't have to define the structure of the data that is going to be stored in the collection beforehand.

A document stored in a collection is a unit of data. A document contains a set of fields, or key-value pairs. The keys are strings; the values can be of various types: strings, integers, floats, timestamps, and so on. You can even store a document as the value of a field in another document.
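As a rough illustration of these ideas, the following sketch models a collection as a PHP list of associative arrays. It uses no MongoDB driver at all; the field names and values (including the e-mail address) are made up for the example. It shows two documents with different sets of fields living side by side, and one document embedded as a field value of another.

```php
<?php
// A collection sketched as a PHP list of documents (associative arrays).
// All field names and values here are hypothetical.
$articles = [
    [
        'title'  => 'First post',
        'views'  => 120,              // integer
        'rating' => 4.5,              // float
        // A document stored as the value of a field in another document:
        'author' => ['name' => 'Joe', 'email' => 'joe@example.com'],
    ],
    [
        // No 'rating' or 'author' here; documents need not share a structure.
        'title' => 'Second post',
        'tags'  => ['php', 'mongodb'],
    ],
];

foreach ($articles as $article) {
    // Fall back to a default when a field is absent from a document.
    $authorName = $article['author']['name'] ?? 'unknown';
    echo $article['title'], ' by ', $authorName, PHP_EOL;
}
```

Because nothing forces the two documents to have the same fields, code that reads them has to cope with missing keys, which is what the `??` fallback does here.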

Anatomy of a document

Let's take a closer look at a MongoDB document. The following is an example of a document that stores certain information about a user in a web application:

{
    _id : ObjectId("4db31fa0ba3aba54146d851a"),
    username : "joegunchy",
    email : "[email protected]",
    age : 26,
    is_admin : true,
    created : "Sun Apr 24 2011 01:52:58 GMT+0700 (BDST)"
}

The previous document has six fields. If you have some JavaScript experience, you will recognize the structure as JSON, or JavaScript Object Notation. The value of the first field, _id, is autogenerated: unless you supply one yourself, MongoDB generates an ObjectId for each document you create in a collection and assigns it as the _id of that document. The _id is also unique; no two documents in the same collection can have the same _id value, just like the primary key of a table in a relational database. The next two fields, username and email, are strings, age is an integer, and is_admin is a Boolean. Finally, created is a JavaScript Date object, represented here as a string.
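One detail worth knowing about the autogenerated _id: an ObjectId is 12 bytes long, and its first 4 bytes record the creation time as seconds since the Unix epoch. The sketch below decodes that prefix from the ObjectId in the example document using only core PHP functions. The helper name objectIdTimestamp is hypothetical and not part of any MongoDB driver.

```php
<?php
// Hypothetical helper: reads the leading 4 bytes (8 hex characters) of an
// ObjectId, which hold the creation time as seconds since the Unix epoch.
function objectIdTimestamp(string $hexId): int
{
    return (int) hexdec(substr($hexId, 0, 8));
}

$seconds = objectIdTimestamp('4db31fa0ba3aba54146d851a');
echo $seconds, PHP_EOL;                                  // 1303584672
echo gmdate('Y-m-d H:i:s', $seconds), ' UTC', PHP_EOL;   // 2011-04-23 18:51:12 UTC
```

The decoded time, 18:51:12 UTC on April 23, 2011, is 01:51:12 on April 24 in the GMT+0700 zone, which falls within two minutes of the created field of the same document.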

BSON—The data exchange format for MongoDB

We have already seen that the structure of a document imitates a JSON object. When you store this document in the database, it is serialized into a special binary-encoded format known as BSON, short for binary JSON. BSON is the default data exchange format for MongoDB. The key advantage of BSON is that it is designed to be lightweight and quick to traverse, encode, and decode, which generally makes it cheaper to process than text-based formats such as XML and JSON. BSON supports the data types found in JSON (string, number, Boolean, array, object, null), distinguishes integers from doubles, and adds some special data types such as regular expression, object ID, date, binary data, and code. Programming languages such as PHP, Python, Java, and so on have libraries that manage the conversion of language-specific data structures (for example, the associative array in PHP) to and from BSON. This enables the languages to easily communicate with MongoDB and manipulate the data in it.
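To get a feel for what the serialized form looks like, here is a deliberately tiny encoder sketched in plain PHP. It follows the layout published at bsonspec.org, but handles only strings, Booleans, and 32-bit integers. The function name bsonEncode is hypothetical; in practice the MongoDB driver performs this conversion for you, so treat this as an illustration rather than something to use in an application.

```php
<?php
// A minimal BSON encoder sketch: strings, Booleans, and 32-bit integers only.
// Assumes integer values fit in 32 bits; real drivers support every BSON type.
function bsonEncode(array $document): string
{
    $elements = '';
    foreach ($document as $key => $value) {
        $name = $key . "\x00";                  // field name as a NUL-terminated string
        if (is_string($value)) {
            // Type 0x02: int32 byte length (counting the trailing NUL), bytes, NUL.
            $elements .= "\x02" . $name . pack('V', strlen($value) + 1) . $value . "\x00";
        } elseif (is_bool($value)) {
            // Type 0x08: a single byte, 0x00 or 0x01.
            $elements .= "\x08" . $name . ($value ? "\x01" : "\x00");
        } elseif (is_int($value)) {
            // Type 0x10: little-endian 32-bit integer.
            $elements .= "\x10" . $name . pack('V', $value);
        } else {
            throw new InvalidArgumentException("Unsupported type for field $key");
        }
    }
    // The total length counts its own 4 bytes and the closing NUL as well.
    return pack('V', strlen($elements) + 5) . $elements . "\x00";
}

// The sample document from bsonspec.org: {"hello": "world"}
echo bin2hex(bsonEncode(['hello' => 'world'])), PHP_EOL;

// Three fields from the user document shown earlier.
$bson = bsonEncode(['username' => 'joegunchy', 'age' => 26, 'is_admin' => true]);
echo strlen($bson), ' bytes', PHP_EOL;
```

Every value is preceded by a one-byte type tag and the document starts with its own length, which is what lets a reader skip over fields without parsing them character by character.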

Note

If you are interested in learning more about the BSON format, visit http://bsonspec.org/.

Similarity with relational databases

Developers with a background in relational database systems will quickly recognize the similarities between the logical abstractions of the relational data model and the Mongo data model. The next figure compares components of a relational data model with those of the Mongo data model:

The next figure shows how a single row of a hypothetical table named users is mapped into a document in a collection:
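The same mapping can be sketched in a few lines of PHP. The column names and values below are hypothetical stand-ins for a row of the users table (the e-mail address in particular is made up), and treating the primary key as the document's _id is an assumption made for the sake of the example.

```php
<?php
// A hypothetical row from a relational "users" table: column names plus values.
$columns = ['id', 'username', 'email', 'age'];
$row     = [1, 'joegunchy', 'joe@example.com', 26];

// In the document model, the column names travel with the values as field names.
$document = array_combine($columns, $row);

// The primary key column plays the role of the document's _id field.
$document['_id'] = $document['id'];
unset($document['id']);

echo json_encode($document), PHP_EOL;
```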

Also, just like the columns of an RDBMS table, the fields of documents in a collection can be indexed, although the implementations of indexing are different.

So much for the similarities: now let's talk briefly about the differences. The key thing that distinguishes MongoDB from a relational model is the absence of relationship constraints. There are no foreign keys in a collection and as a result there are no JOIN queries. Constraint management is typically handled in the application layer. Also, because of its flexible schema property, there is no expensive ALTER TABLE statement in MongoDB.
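What "handled in the application layer" means in practice can be sketched with two in-memory collections. The example below uses plain PHP arrays rather than a live database, and every collection name, field name, and value in it is hypothetical. The application both resolves the reference from an article to its author and notices when that reference points nowhere, which is work that JOINs and foreign keys would do in an RDBMS.

```php
<?php
// Two collections sketched as PHP arrays; names and data are hypothetical.
$authors = [
    ['_id' => 'a1', 'name' => 'Alice'],
    ['_id' => 'a2', 'name' => 'Bob'],
];
$articles = [
    ['_id' => 'p1', 'title' => 'Hello MongoDB', 'author_id' => 'a1'],
    ['_id' => 'p2', 'title' => 'Orphaned post', 'author_id' => 'a9'],
];

// Index authors by _id so that each lookup is a single array access.
$authorsById = array_column($authors, null, '_id');

foreach ($articles as $article) {
    // The "join" and the reference check both happen in application code.
    $author = $authorsById[$article['author_id']] ?? null;
    if ($author === null) {
        echo $article['title'], ': unknown author ', $article['author_id'], PHP_EOL;
        continue;
    }
    echo $article['title'], ' by ', $author['name'], PHP_EOL;
}
```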

Downloading, installing, and running MongoDB

We are done with the theoretical part, at least for now. It is time for us to download, install, and start playing with MongoDB on the computer.

System requirements

MongoDB supports a wide variety of platforms. It can run on Windows (XP, Vista, and 7), various flavors of Linux (Debian/Ubuntu, Fedora, CentOS, and so on), and OS X on Intel-based Macs. In this section, we are going to go through step-by-step instructions for getting MongoDB up and running on a computer running Windows, Linux, or OS X.

Time for action - downloading and running MongoDB on Windows

We are going to learn how to download, install, and run MongoDB on a computer running Windows:

1. Head on over to the downloads page on the official MongoDB website, http://www.mongodb.org/downloads.
2. Click on the download link for the latest stable release under Windows 32-bit. This will start downloading a ZIP archive.
3. Once the download is finished, move the ZIP archive to the C:\ drive and extract it. Rename the extracted folder (mongodb-win32-i386-x.y.z, where x.y.z is the version number) to mongodb.
4. Create the folder C:\data\db.
5. Open a CMD prompt window and enter the following commands:
C:\> cd \mongodb\bin
C:\mongodb\bin> mongod
6. Open another CMD prompt window and enter the following commands:
C:\> cd \mongodb\bin
C:\mongodb\bin> mongo
7. Type show dbs into the shell and hit Enter.

What just happened?

In steps 1 to 3, we downloaded a ZIP archive containing the binary files for running MongoDB on Windows, extracted it under the C:\ drive, and renamed the extracted folder to mongodb for convenience. In step 4, we created the data directory (C:\data\db). This is the location where MongoDB will store its data files. In step 5, we executed the C:\mongodb\bin\mongod.exe program in the CMD prompt to launch the MongoDB server; this is the server that hosts multiple databases (you can also launch it by double-clicking on the file in Windows Explorer). In step 6, once the server had booted up, we invoked the C:\mongodb\bin\mongo.exe program to start the mongo interactive shell, which is a command-line interface to the MongoDB server:

C:\mongodb\bin\mongo
MongoDB shell version: 1.8.1
connection to test
type "help" for help
>

Once the shell has started, we issue the command show dbs to list all the pre-loaded databases in the server:

>show dbs
admin (empty)
local (empty)
>

Installing the 64-bit version

The documentation on the MongoDB website recommends that you run the 64-bit version of the system. This is because the 32-bit version cannot store more than 2 gigabytes of data. If the data in your database is likely to exceed the 2 GB limit, you should download and install the 64-bit version instead. You will also need an operating system that supports running applications in 64-bit mode. For the practical examples shown in this book, the 32-bit version is sufficient, so you need not worry about this too much.

Time for action - downloading and running MongoDB on Linux

Now, we are going to learn how to download and run the MongoDB server on a Linux box:

1. Fire up the terminal program. Type in the following command and hit Enter:
wget -O mongo.tgz http://fastdl.mongodb.org/linux/mongodb-linux-i686-1.8.3.tgz
2. Extract the downloaded archive by using the following command:
tar xzf mongo.tgz
3. Rename the extracted directory by using the following command:
mv mongodb-linux-i686-1.8.3 mongodb
4. Create the data directory /data/db by using the following commands:
sudo mkdir -p /data/db
sudo chown `id -u` /data/db
5. Start up the server by running the following command:
./mongodb/bin/mongod
6. Open another tab in the terminal and run the next command:
./mongodb/bin/mongo
7. Type show dbs into the shell and hit Enter.

What just happened?

In step 1, we downloaded the latest stable 32-bit release of MongoDB for Linux using the wget program, and stored it as a gzipped tarball named mongo.tgz on the machine.

Note

At the time of this writing, the latest production release of MongoDB is 1.8.3. If a newer production release is available when you try this, you should download that version instead.

In steps 2 and 3, we extracted the tarball and renamed the extracted directory to mongodb for convenience. In step 4, we created the data directory /data/db for MongoDB and made the current user its owner, so that the server can read from and write to that directory. In step 5, we started up the MongoDB server by executing the mongodb/bin/mongod program.

In step 6, after we have successfully launched the server, we start the mongo interactive shell:

$./mongodb/bin/mongo
MongoDB shell version: 1.8.1
url: test
connection to test
type "help" for help
>

Once the shell has started, we issue the command show dbs to list all the pre-loaded databases in the server:

>show dbs
local (empty)
admin (empty)
>

The databases listed here are special databases pre-built within the server. They are used for administration and authentication purposes. We do not need to concern ourselves with them right now.
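Since the rest of the book drives MongoDB from PHP, it can be handy to confirm from a PHP script that the setup above is in place. The following is a minimal sketch under a few assumptions: the data directory is /data/db as created earlier, the server runs on the same machine, and it listens on mongod's default port, 27017. The helper name isPortOpen is hypothetical. Note that the check only confirms that something accepts TCP connections on that port; it does not prove that the listener is MongoDB.

```php
<?php
// Hypothetical helper: returns true when something accepts TCP connections
// on the given host and port.
function isPortOpen(string $host, int $port, float $timeout = 1.0): bool
{
    // The @ suppresses the warning that fsockopen emits on a refused connection.
    $socket = @fsockopen($host, $port, $errorCode, $errorMessage, $timeout);
    if ($socket === false) {
        return false;
    }
    fclose($socket);
    return true;
}

// Assumed data directory, as created in the installation steps.
$dataDir = '/data/db';

echo 'data directory writable: ',
     (is_dir($dataDir) && is_writable($dataDir)) ? 'yes' : 'no', PHP_EOL;

// 27017 is the default port of mongod.
echo 'mongod reachable: ', isPortOpen('127.0.0.1', 27017) ? 'yes' : 'no', PHP_EOL;
```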

Tip

Installing MongoDB using package managers

You can use the package manager of your Linux distribution (apt for Debian/Ubuntu, yum for Fedora/CentOS) to install MongoDB. To get distro-specific instructions, Ubuntu/Debian users should visit http://www.mongodb.org/display/DOCS/Ubuntu+and+Debian+packages. Users of CentOS and Fedora should visit http://www.mongodb.org/display/DOCS/CentOS+and+Fedora+Packages. The advantage of using a package manager, other than being able to install with fewer commands, is that you can launch the Mongo server and the client just by typing mongod and mongo respectively in the shell.

Installing MongoDB on OS X