Get your code under control in a series of small, specific steps
PHP developers from all skill levels will be able to get value from this book and will be able to transform their spaghetti code applications to clean, modular applications. If you are in the midst of a legacy refactor or you find yourself in a state of despair caused by the code you have inherited, this is the book for you. All you need is to have PHP 5.0 installed, and you're all set to change the way you maintain and deploy your code!
Have you noticed that your legacy PHP application is composed of page scripts placed directly in the document root of the web server? Or, do your page scripts, along with any other classes and functions, combine the concerns of model, view, and controller into the same scope? Is the majority of the logical flow incorporated as include files and global functions rather than class methods? Working with such a legacy application feels like dragging your feet through mud, doesn't it?
This book will show you how to modernize your application in terms of practice and technique, rather than in terms of using tools like frameworks and libraries, by extracting and replacing its legacy artifacts. We will use a step-by-step approach, moving slowly and methodically, to improve your application from the ground up. We'll show you how dependency injection can replace both the new and global dependencies. We'll also show you how to change the presentation logic to view files and the action logic to a controller. Moreover, we'll keep your application running the whole time. Each completed step in the process will keep your codebase fully operational with higher quality. When we are done, you will be able to breeze through your code like the wind. Your code will be autoloaded, dependency-injected, unit-tested, layer-separated, and front-controlled. Most of the very limited code we will add to your application is specific to this book. We will be improving ourselves as programmers, as well as improving the quality of our legacy application.
This book gives developers an easy-to-follow, practical and powerful process to bring their applications up to a modern baseline. Each step in the book is practical, self-contained and moves you closer to the end goal you seek: maintainable code. As you follow the exercises in the book, the author almost anticipates your questions and you will have the answers, ready to be implemented on your project.
Number of pages: 316
Year of publication: 2016
Copyright © 2016 Paul M. Jones
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: August 2016
Production reference: 1260816
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78712-470-7
www.packtpub.com
Author
Paul M. Jones
Acquisition Editor
Frank Pohlmann
Technical Editor
Danish Shaikh
Indexer
Mariammal Chettiyar
Graphics
Disha Haria
Production Coordinator
Arvindkumar Gupta
Cover Work
Arvindkumar Gupta
In early 2012, while attending a popular PHP conference in Chicago, I approached a good friend, Paul Jones, with questions about PSR-0 and autoloading. We immediately broke out my laptop to view an attempt at applying the convention and Paul really helped me put the pieces together in short order. His willingness to jump right in and help others always inspires me, and has gained my respect.
So in August of 2012 I heard of a video containing a talk given by Paul at the Nashville PHP User Group, and was drawn in. The talk, It Was Like That When I Got Here: Steps Toward Modernizing A Legacy Codebase, sounded interesting because it highlighted something I am passionate about: refactoring.
After watching I was electrified! I often speak about refactoring and receive inquiries on how to apply it for legacy code rather than performing a rewrite. Put another way, how is refactoring possible in a codebase where includes and requires are the norm, namespaces don't exist, globals are used heavily, and object instantiation runs rampant with no dependency injection? And what if the codebase is procedural?
Paul's focus on modernizing a legacy application filled the gap by getting legacy code to a point where standard refactoring is possible. His step-by-step approach makes it easier for developers to get the bear dancing so that continued improvement of the code through refactoring can happen.
I felt the topic was a must-see for PHP developers and quickly fired off an email asking if he'd be interested in flying to Miami and giving the same talk for the South Florida PHP User Group. Within minutes my email was answered and Paul even offered to drive down from Nashville for the talk. However, since I had started organizing the annual SunshinePHP Developer Conference, to be held in February in Miami, we decided to have Paul speak at the conference rather than come down earlier.
Fast forward two years later, and here we are in mid-2014. Developing with PHP has really matured in recent years, but it's no secret that PHP's low level of entry for beginners helped create some nasty codebases. Companies who built applications in the dark times simply can't afford to put things on hold and rebuild a legacy application, especially with today's fast paced economy and higher developer salaries. To stay competitive, companies must continually push developers for new features and to increase application stability. This creates a hostile environment for developers working with a poorly written legacy application. Modernizing a legacy application is a necessity, and must happen. Yet knowing how to create clean code and comprehending how to modernize a legacy application are two entirely different things.
Paul and I have been speaking to packed rooms at conferences around the world about modernizing and refactoring. Developers are hungry for knowledge on how to improve the quality of their code and perfect their craft. Unfortunately, we can only reach a small portion of PHP developers using these methods. The time has come for us to create books in hopes of reaching more PHP developers to improve the situation.
I see more and more developers embrace refactoring in their development workflow to leverage methods outlined in my talks and forthcoming book Refactoring 101. But understanding how to use these refactoring processes on a legacy codebase is not straightforward, and sometimes impossible. The book you're about to read bridges the gap, allowing developers to modernize a codebase so refactoring can be applied for continued enhancement. Many thanks to Paul for putting this together. Enjoy!
Adam Culp
(https://leanpub.com/refactoring101)
Paul M. Jones is an internationally recognized PHP expert who has worked as everything from junior developer to VP of Engineering in all kinds of organizations (corporate, military, non-profit, educational, medical, and others). He blogs professionally at www.paul-m-jones.com and is a regular speaker at various PHP conferences.
Paul's latest open-source project is Aura for PHP. Previously, he was the architect behind the Solar Framework, and was the creator of the Savant template system. He was a founding contributor to the Zend Framework (the DB, DB_Table, and View components), and has written a series of authoritative benchmarks on dynamic framework performance.
Paul was one of the first elected members of the PEAR project. He is a voting member of the PHP Framework Interoperability Group, where he shepherded the PSR-1 Coding Standard and PSR-2 Coding Style recommendations, and was the primary author on the PSR-4 Autoloader recommendation. He was also a member of the Zend PHP 5.3 Certification education advisory board.
In a previous career, Paul was an operations intelligence specialist for the US Air Force. In his spare time, he enjoys putting .308 holes in targets at 400 yards.
Many thanks to all of the conference attendees who heard my It Was Like That When I Got Here presentation and who encouraged me to expand it into a full book. Without you, I would not have considered writing this at all.
Thank you to Adam Culp, who provided a thorough review of the work-in-progress, and for his concentration on refactoring approaches. Thanks also to Chris Hartjes, who went over the chapter on unit testing in depth and gave it his blessing. Many thanks to Luis Cordova, who acted as a work-in-progress editor and who corrected my many pronoun issues.
Finally, thanks to everyone who bought a copy of the book before it was complete, and especially to those who provided feedback and insightful questions regarding it. These include Hari KT (a long-time colleague on the Aura project), Ron Emaus, Gareth Evans, Jason Fuller, David Hurley, Stephen Lawrence, Elizabeth Tucker Long, Chris Smith, and others too numerous to name. Your early support helped to assure me that writing the book was worthwhile.
I have been programming professionally in one capacity or another for over 30 years. I continue to find it a challenging and rewarding career. I still learn new lessons about my profession every day, as I think is the case for every programmer dedicated to this craft.
Even more challenging and rewarding is helping other programmers to learn what I have learned. I have worked with PHP for 15 years now, in many different kinds of organizations and in every capacity from junior developer to VP of Engineering. In that time, I have learned a lot about the commonalities in legacy PHP applications. This book is distilled from my notes and memories from modernizing those codebases. I hope it can serve as a path for other programmers to follow, leading them out of a morass of bad code and bad work situations, and into a better life for themselves.
This book also serves as penance for all of the legacy code I have left behind for others to deal with. All I can say is that I didn't know then what I know now. In part, I offer this book as atonement for the coding sins of my past. I hope it can help you to avoid my previous mistakes.
In its simplest definition, a legacy application is any application that you, as a developer, inherit from someone else. It was written before you arrived, and you had little or no decision-making authority in how it was built.
However, there is a lot more weight to the word legacy among developers. It carries with it connotations of poorly organized, difficult to maintain and improve, hard to understand, untested or untestable, and a series of similar negatives. The application works as a product in that it provides revenue, but as a program, it is brittle and sensitive to change.
Because this is a book specifically about PHP-based legacy applications, I am going to offer some PHP-specific characteristics that I have seen in the field. For our purposes, a legacy application in PHP is one that matches two or more of the following descriptions:
These characteristics are probably familiar to anyone who has had to deal with a very old PHP application. They describe what I call a typical PHP application.
Most PHP developers are not formally trained as programmers, or are almost entirely self-taught. They often come to the language from other, usually non-technical, professions. Somehow or another, they are tasked with creating web pages because they are seen as the most technically savvy people in their organizations. Since PHP is such a forgiving language and grants a lot of power without a lot of discipline, it is very easy to produce working web pages and even applications without a lot of training.
These and other factors strongly influence the underlying foundation of the typical PHP application. They are usually not written in a popular full-stack framework or even a micro-framework. Instead, they are often a series of page scripts, placed directly in the web server document root, to which clients can browse directly. Any functionality that needs to be reused has been collected into a series of include files. There are include files for common configurations and settings, headers and footers, common forms and content, function definitions, navigation, and so on.
This reliance on include files in the typical PHP application is what makes me call them include-oriented architectures. The legacy application uses include calls everywhere to couple the pieces of the program into a single whole. This is in contrast to a class-oriented architecture, where even if the application does not adhere to good object-oriented programming principles, at least the behaviors are bundled into classes.
The typical include-oriented PHP application generally looks something like this:
The structure shown is a simplified example. There are many possible variations. In some legacy applications, I have seen literally hundreds of main-level page scripts and dozens of subdirectories with their own unique hierarchies for additional pages. The key is that the legacy application is usually in the document root, has page scripts that users browse to directly, and uses include files to manage most program behavior instead of classes and objects.
Legacy applications will use individual page scripts as the access point for public behavior. Each page script is responsible for setting up the global environment, performing the requested logic, and then delivering output to the client.
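The three responsibilities described above can be sketched in a few lines. This is a minimal, hypothetical page script, not the one from Appendix A: the site name, the article data, and the find_article() function are all invented for illustration, and the setup that a real legacy application would pull in through include files is inlined so that the sketch runs on its own.

```php
<?php
// Hypothetical legacy page script, e.g. /article.php in the document root.
// A real legacy application would do its setup with calls such as
// include 'includes/config.php'; the setup is inlined here (an assumption)
// so that the sketch is self-contained.

// 1. Set up the global environment.
$GLOBALS['site_name'] = 'Example Site';
$GLOBALS['articles'] = array(
    1 => 'First article',
    2 => 'Second article',
);

// A global function of the kind usually defined in an include file.
function find_article($id)
{
    global $articles; // reaches into global state instead of receiving it
    return isset($articles[$id]) ? $articles[$id] : null;
}

// 2. Perform the requested logic, reading the request directly.
$id = isset($_GET['id']) ? (int) $_GET['id'] : 1;
$title = find_article($id);

// 3. Deliver output to the client, all in the same scope.
echo '<h1>' . htmlspecialchars($GLOBALS['site_name']) . '</h1>' . "\n";
if ($title === null) {
    echo '<p>Article not found.</p>' . "\n";
} else {
    echo '<p>' . htmlspecialchars($title) . '</p>' . "\n";
}
```

Even at this size, setup, logic, and presentation share one global scope, so none of the three can be changed or tested in isolation.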
Appendix A, Typical Legacy Page Script contains a sanitized, anonymized version of a typical legacy page script from a real application. I have taken the liberty of making the indentation consistent (originally, the indents were somewhat random) and wrapping it at 60 characters so it fits better on e-reader screens. Go take a look at it now, but be careful. I won't be held liable if you go blind or experience post-traumatic stress as a result! As we examine it, we find all manner of issues that make maintenance and improvement difficult:
The Appendix A, Typical Legacy Page Script example is relatively tame as far as legacy page scripts go. I have seen other scripts where JavaScript and CSS code have been mixed in, along with remote-file inclusions and all sorts of security flaws. It is also only (!) about 400 lines long. I have seen page scripts that are thousands of lines long which generate several different page variations, all wrapped into a single switch statement with a dozen or more case conditions.
Many developers, when presented with a typical PHP application, are able to live with it for only so long before they want to scrap it and rewrite it from scratch. "Nuke it from orbit; it's the only way to be sure!" is the rallying cry of these enthusiastic and energetic programmers. Other developers, their enthusiasm drained by their death march experience, feel cautious and wary at such a suggestion. They are fully aware that the codebase is bad, but the devil (or in our case, code) they know is better than the devil they don't.
A complete rewrite is a very tempting idea. Developers championing a rewrite feel like they will be able to do all the right things the first time through. They will be able to write unit tests, enforce best practices, separate concerns according to modern pattern definitions, and use the latest framework or even write their own framework (since they know best what their own needs are). Because the existing application can serve as a reference implementation, they feel confident that there will be little or no trial-and-error work in rewriting the application. The needed behaviors already exist; all the developers need to do is copy them to the new system. The behaviors that are difficult or impossible to implement in the existing system can be added on from the start as part of the rewrite.
As tempting as a rewrite sounds, it is fraught with many dangers. Joel Spolsky had this to say regarding the old Netscape Navigator web browser rewrite in 2000:
Netscape made the single worst strategic mistake that any software company can make by deciding to rewrite their code from scratch. Lou Montulli, one of the 5 programming superstars who did the original version of Navigator, emailed me to say, "I agree completely, it's one of the major reasons I resigned from Netscape." This one decision cost Netscape 3 years. That's three years in which the company couldn't add new features, couldn't respond to the competitive threats from Internet Explorer, and had to sit on their hands while Microsoft completely ate their lunch.
--Joel Spolsky, Netscape Goes Bonkers
Netscape went out of business as a result.
Josh Kerr relates a similar story regarding TextMate:
Macromates, an indie company who had a very successful text editor called Textmate, decided to rewrite the code base for Textmate 2. It took them 6 years to get a beta release out the door which is an eternity in today's time and they lost a lot of market share. When they did release a beta, it was too late and 6 months later they folded the project and pushed it on to Github as an open source project.
--Josh Kerr, TextMate 2 And Why You Shouldn't Rewrite Your Code
Fred Brooks calls the urge to do a complete rewrite the second-system effect. He wrote about this in 1975:
The second is the most dangerous system a man ever designs. ... The general tendency is to over-design the second system, using all the ideas and frills that were cautiously sidetracked on the first one. ... The second-system effect has ... a tendency to refine techniques whose very existence has been made obsolete by changes in basic system assumptions. ... How does the project manager avoid the second-system effect? By insisting on a senior architect who has at least two systems under his belt.
--Fred Brooks, The Mythical Man-Month, pp. 53-58.
Developers were the same forty years ago as they are today. I expect them to be the same over the next forty years as well; human beings remain human beings. Overconfidence, insufficient pessimism, ignorance of history, and the desire to be one's own customer all lead developers easily into rationalizations that this time will be different when they attempt a rewrite.
There are lots of reasons why a rewrite rarely works, but I will concentrate on only one general reason here: the intersection of resources, knowledge, communication, and productivity. (Be sure to read The Mythical Man-Month (pp. 13-26) for a great description of the problems associated with thinking of resources and scheduling as interchangeable elements.)
As with all things, we have only limited resources to bring to bear against the rewrite project. There are only a certain number of developers in the organization. These are the developers who will have to do both maintenance on the existing program and write the completely new version of the program. Any developers working on the one project will not be able to work on the other.
One idea is to have the existing developers spend part of their time on the old application and part of their time on the new one. However, moving a developer between the two projects will not be an even split of productivity. Because of the cognitive load of context-switching, the developer will be less than half as productive on each.
To avoid the productivity losses from switching developers between maintenance and the rewrite, the organization may try to hire more developers. Some can then be dedicated to the old project and others to the new project. Unfortunately, this approach reveals what F. A. Hayek calls the knowledge problem. Originally applied to the realm of economics, the knowledge problem applies equally well to programming.
If we put the new developers on the rewrite project, they won't know enough about the existing system, the existing problems, the business goals, and perhaps not even the best practices for doing the rewrite to be effective. They will have to be trained on these things, most likely by the existing developers. This means the existing developers, who have been relegated to maintaining the existing program, will have to spend a lot of time communicating knowledge to the new hires. The amount of time involved is non-trivial, and the communication of this knowledge will have to continue until the new developers are as well-versed as the existing developers. This means that the linear increase in resources results in a less-than-linear increase in productivity: a 100% increase in the number of programmers will result in a less than 50% increase in output, sometimes much less (cf. The Miserable Mathematics of the Man-Month – http://paul-m-jones.com/archives/1591).
Alternatively, we could put the existing developers on the rewrite project, and the new hires on maintenance of the existing program. This too reveals a knowledge problem because the new developers are completely unfamiliar with the system. Where will they get the knowledge they need to do their work? From the existing developers, of course, who will still need to spend valuable time communicating their knowledge to the new hires. Once again, we see that the linear increase in developers leads to a less-than-linear increase in productivity.
To deal with the knowledge problem and the related communication costs, some may feel the best way to handle the project would be to dedicate all the existing developers to the rewrite, and delay maintenance and upgrades on the existing system until the rewrite is done. This is a great temptation because the developers will be all too eager to salve their own pains and become their own customers - becoming excited about what features they want to have and what fixes they want to make. These desires will lead them to overestimate their own ability to perform a full rewrite and underestimate the amount of time needed to complete it. The managers, for their part, will accept the optimism of the developers, perhaps adding some buffer in the schedule for good measure.
The overconfidence and optimism of the developers will morph into frustration and pain when they realize the task is actually much greater and more overwhelming than they first thought. The rewrite will go on much longer than anticipated, not by a little, but by an order of magnitude or more. For the duration of the rewrite, the existing program will languish - buggy and missing features - disappointing existing customers and failing to attract new ones. The rewrite project will, at the end, become a panicked death march to get it done at all costs, and the result will be a codebase that is just as bad as the first one, only in different ways. It will be merely a copy of the first system, because schedule pressures will have dictated that new features be delayed until after an initial release is achieved.
Given the risks associated with a complete rewrite, I recommend refactoring instead. Refactoring means that the quality of the program is improved in small steps, without changing the functionality of the program. A single, relatively small change is introduced across the entire system. The system is then tested to make sure it still works properly, and finally, the system is put into production. A second small change builds on the previous one, and so on. Over a period of time, the system becomes markedly easier to maintain and improve.
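A single refactoring step of this kind might look like the following sketch. The function legacy_format_price() and the class PriceFormatter are hypothetical names invented for illustration; they do not come from any real application. The global dependency is replaced with a constructor argument, and a simple check confirms that the functionality has not changed before the step is considered done.

```php
<?php
// Before: a hypothetical legacy global function that depends on global state.
$GLOBALS['currency_symbol'] = '$';

function legacy_format_price($cents)
{
    global $currency_symbol;
    return $currency_symbol . number_format($cents / 100, 2);
}

// After one small step: the same behavior in a class, with the
// dependency passed in through the constructor instead of a global.
class PriceFormatter
{
    private $symbol;

    public function __construct($symbol)
    {
        $this->symbol = $symbol;
    }

    public function format($cents)
    {
        return $this->symbol . number_format($cents / 100, 2);
    }
}

// Check that the functionality has not changed before moving on.
$formatter = new PriceFormatter($GLOBALS['currency_symbol']);
foreach (array(0, 5, 1999, 123456) as $cents) {
    if ($formatter->format($cents) !== legacy_format_price($cents)) {
        throw new RuntimeException("Behavior changed for {$cents} cents");
    }
}
echo "Behavior preserved\n";
```

The old function can stay in place until every caller has been moved to the class, so the application keeps working at each point along the way.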
A refactoring approach is decidedly less appealing than a complete rewrite. It defies the core sensibilities of most developers. The developers have to continue working with the system as it is, warts and all, for long periods of time. They do not get to switch over to the latest, hottest framework. They do not get to become their own customers and indulge their desires to do things right the first time. Being a longer-term strategy, the refactoring approach does not appeal to a culture that values rapid development of new applications over patching existing ones. Developers usually prefer to start their own new projects, not maintain older projects developed by others.
However, as a risk-reducing strategy, using an iterative refactoring approach is undeniably superior to a rewrite. The individual refactorings themselves are small compared to any similar portion of a rewrite project. They can be applied in much shorter periods of time than a comparable feature would be in a rewrite, and they leave the existing codebase in a working state at the end of each iteration. At no point does the existing application stop operating or progressing. The iterative refactorings can be integrated into a larger process with scheduling that allows for cycles of bug fixes, feature additions, and refactorings to improve the next cycle.
Finally, the goal of any single refactoring step is not perfection. The goal in each step is merely improvement. We are not trying to realize an impossible goal over a long period of time. We are taking small steps toward easily-visualized goals that can be accomplished in short timeframes. Each small refactoring win will both improve morale and drive enthusiasm for the next refactoring step. Over time, these many small wins accumulate into a single big win: a fully-modernized codebase that has never stopped generating revenue for the business.
Until now, we have been discussing legacy applications as page-based, include-oriented systems. However, there is also a large base of legacy code out there using public frameworks.
Each different public framework in PHP land is its own unique hell. Applications written in CakePHP (http://cakephp.org/) suffer from different legacy issues than those written in CodeIgniter, Solar, Symfony 1, Zend Framework 1, and so on. Each of these different frameworks, and their varying work-alikes, encourage different kinds of tight-coupling in applications. Thus, the specific steps needed to refactor applications built using one of these frameworks are very different from the steps needed for a different framework.
As such, various parts of this book may be useful as a guide to refactoring different parts of a legacy application based on a public framework, but as a whole, the book is not targeted at refactoring applications based on these public frameworks.
In-house, private, or otherwise non-public frameworks under the direct control of their own architects within the organization are likely to benefit from the refactorings included in this book.
I sometimes hear about how developers wisely wish to avoid a complete rewrite and instead want to refactor or migrate to a public framework. This sounds like the best of both worlds, combining an iterative approach with the developers' desire to use the hottest new technology.
My experience with legacy PHP applications has been that they are almost as resistant to framework integration as they are to unit testing. If the application was already in a state where its logic could be ported to a framework, there would be little need to port it in the first place.
However, by the time we have completed the refactorings in this book, the application is very likely to be in a state that will be much more amenable to a public framework migration. Whether the developers will still want to do so is another matter.
At this point, we have realized that a rewrite, while appealing, is a dangerous approach. An iterative refactoring approach sounds a lot more like actual work, but has the benefit of being achievable and realistic.
The next step is to prepare ourselves for the refactoring approach by getting some prerequisites out of the way. After that, we will proceed toward modernizing our legacy application in a series of relatively small steps, one step per chapter with each step broken down into an easy-to-follow process with answers to common questions.
Let's get started!
Before we begin modernizing our application, we need to make sure we have the necessary prerequisites in place to do the work of refactoring. These are as follows:
Revision control (also known as source control or version control) allows us to keep track of the changes we make to our codebase. We can make a change, then commit it to source control, make more changes and commit them, and push our changes to other developers on the team. If we discover an error, we can revert to an earlier version of the codebase to a point where the error does not exist and start over.
