VMware vSphere is the leading server virtualization platform with consistent management for virtual data centers. This book enhances your troubleshooting skills so that you can diagnose and resolve day-to-day problems in your VMware vSphere infrastructure environment.
This book will provide you with practical hands-on knowledge of using different performance monitoring and troubleshooting tools to manage and troubleshoot the vSphere infrastructure.
It begins by introducing a systematic approach to troubleshooting different problems and showcasing the troubleshooting techniques. You will be able to use the troubleshooting tools to monitor performance and troubleshoot issues related to hosts and virtual machines. Moving on, you will troubleshoot High Availability, Storage I/O Control problems, virtual LANs, and iSCSI, NFS, and VMFS issues.
By the end of this book, you will be able to analyze and solve advanced issues related to the vSphere environment, such as vCenter certificates, database problems, and different failed state errors.
Page count: 268
Year of publication: 2015
Copyright © 2015 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
First published: October 2015
Production reference: 1261015
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78355-176-7
www.packtpub.com
Author
Muhammad Zeeshan Munir
Reviewers
Kenneth van Ditmarsch
Péter Károly "Stone" JUHÁSZ
Commissioning Editor
Ashwin Nair
Acquisition Editors
Shaon Basu
Divya Poojari
Content Development Editor
Mamata Walkar
Technical Editor
Mohita Vyas
Copy Editor
Angad Singh
Project Coordinator
Sanjeet Rao
Proofreader
Safis Editing
Indexer
Tejal Soni
Production Coordinator
Melwyn Dsa
Cover Work
Melwyn Dsa
Muhammad Zeeshan Munir is a system architect who specializes in data center virtualization and cloud computing. He has been in the IT industry for nearly 11 years since completing his postgraduate degree in computer science, and has been working with Linux, Microsoft, and VMware products. He mainly specializes in designing, integrating, and automating private and public cloud infrastructures for companies ranging from enterprises to start-ups.
Currently, Zeeshan works at Qatar Computing and Research Institute (Hamad Bin Khalifa University). Zeeshan also provided services as a freelance Assistant Manager ICT Operations to a Milan-based company, Linx ICT Solutions.
Kenneth van Ditmarsch is a very experienced freelance virtualization consultant. As one of the few freelance VMware Certified Design eXperts (VCDX), he has clearly added value in virtualization infrastructure projects. He especially gained knowledge and extensive project experience during his last years at VMware and several specialized consulting engagements he worked on.
Kenneth agreed to review this book based on his extensive experience of VMware products. You can check out Kenneth's personal blog around virtualization at http://virtualkenneth.com/.
Péter Károly "Stone" JUHÁSZ was born in Hungary in 1980, where he lives with his family and their cat.
He got his MSc degree as a programmer mathematician. At the very beginning of his career, he turned towards operations. Since 2004, he has been working as a general—mainly GNU/Linux—system administrator.
His average working day includes: patching in the server room, installing servers, managing PBX, maintaining VMware vSphere infrastructure and servers at Amazon AWS, managing storage and backups, performing monitoring with Nagios, trying out new technology, and writing scripts to ease everyday work.
His interests in IT are Linux, server administration, virtualization, artificial intelligence, network security, and distributed systems.
For support files and downloads related to your book, please visit www.PacktPub.com.
Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.
https://www2.packtpub.com/books/subscription/packtlib
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.
If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.
Get notified! Find out when new books are published by following @PacktEnterprise on Twitter or the Packt Enterprise Facebook page.
For indeed, with hardship [will be] ease. [94:5]
This book is dedicated to my parents, who taught me how to write and communicate better!
To my lovely wife. Without her tireless support in different adventures (in UK, Italy, Qatar and Pakistan), I would not have been able to make it!
I would like to thank Dr. Ahmed Elmagarmid (Executive Director of Qatar Computing & Research Institute, Hamad Bin Khalifa University), whose vision inspired me all the way while writing this book.
I would like to extend my special thanks to everyone, including my family (brothers and sister) and friends (Muhammad Imran and Abid), who motivated and helped me achieve this, and to director Marco Li Vigni in Italy, whose technical advice and ready support have always been a source of guidance and of the greatest value to me.
I would like to thank the reviewers of this book for their feedback and for pointing me in the right direction. A special thanks to Mamata Walkar, the Content Editor of the book, Divya Poojari, Technical Editor Mohita Vyas, and Shaon Basu for getting this effort completed.
VMware has been a well-known cloud and virtualization software provider for almost two decades. The VMware virtualization suite vSphere comprises different virtualization products, including bare-metal hypervisors based on vSphere hosts (ESX/ESXi), vCenter Server, vCloud Director, VMware NSX (previously known as vCloud Networking and Security), VMware Horizon Mirage (desktop virtualization), and so on. Virtualization is based on an operating system that can be installed on bare-metal servers and workstations to host other operating systems, for example, Linux, Unix, Windows, and many more. This allows vSphere hosts to share and distribute the available resources (computation, memory, and disk drive) among different hosted virtual machines, and allows them to install different operating systems without exposing the hardware architecture.
Today, many organizations, universities, and research institutes are widely adopting virtualization for day-to-day computing needs using the VMware vSphere hypervisor. Wide growth in vSphere-based infrastructures also requires troubleshooting and resolution of different related issues of the vSphere hypervisor. This is a book that enables system engineers and data center architects to troubleshoot most of the common problems that can be faced in a data center based on the vSphere infrastructure. The book lets you develop a clear and minute troubleshooting approach and lets you adapt to it by practicing it. Real vSphere problems that system engineers may face in the data center are covered by example in this book. In addition to that, vSphere Troubleshooting can be used as a reference and provides a complete overview of the concepts and knowledge necessary for system engineers. You will learn new skills, new tools, and ready-to-use troubleshooting recipes by reading it.
Chapter 1, The Methodology of Problem Solving, covers some of the common troubleshooting skills that can also be applied to troubleshoot vSphere hosts. In this chapter, you learn the installation of VMware Management Assistant (vMA), the first tool to help you get started.
Chapter 2, Monitoring and Troubleshooting Host and VM Performance, teaches you how to use performance-monitoring tools and how these tools can help troubleshoot some very common issues in the vSphere infrastructure. This chapter also covers some of the very important vSphere host metrics and how these metrics can be viewed in performance charts.
Chapter 3, Troubleshooting Clusters, discusses how to get basic information about clusters in order to troubleshoot their common problems. This chapter also covers how this information can be used in advance to prevent problems from happening. Performance monitoring for clusters is a very important ingredient, and it helps you with your business continuity and managing workloads. The topic on troubleshooting the heartbeat datastore and Storage DRS issues gives a basic insight into some of the very common problems, how to solve them, and some tips for preventing them from occurring.
Chapter 4, Monitoring and Troubleshooting Networking, covers some of the basic concepts of switching, a deep dive into troubleshooting commands, and some of the tools for monitoring network performance. It also covers how to troubleshoot a single vSphere host using esxcli and, for multiple vSphere hosts, how to automate tasks using a scripting language from PowerCLI or a vMA appliance.
Chapter 5, Monitoring and Troubleshooting Storage, covers many different storage troubleshooting techniques, except Fiber SANs. Learning these techniques is a good starting point to manage most storage troubleshooting issues. We also keep focusing on the VMware vMA appliance to deploy our troubleshooting procedures for storage.
Chapter 6, Advanced Troubleshooting of vCenter Server and vSphere Hosts, is where you learn different vCenter Server and vSphere HA agent and state problems. It also covers how to troubleshoot and fix some of the common problems related to vSphere HA. Once you know how to fix some of the common issues, you will get some background of troubleshooting for advanced problems as well.
Appendix A, Learning PowerGUI Basics, shows you how to use the PowerGUI script editor to write your PowerShell scripts. You can use it to manage, not only your vSphere infrastructure, but also your Windows-based environment from a single centralized console.
Appendix B, Installing VMware vRealize Operations Manager, illustrates how VMware vRealize Operations Manager helps you to ensure the availability and management of your infrastructure and applications across Amazon, vSphere, physical hardware, and Hyper-V. You can monitor your applications and optimize performance for your infrastructure.
Appendix C, Power CLI - A Basic Reference, shows you how to download and run the VMware vSphere PowerCLI 6.0 Release 1 or Release 2 in a step-by-step manner.
This book requires you to have a working setup of the VMware infrastructure, and it should include at least two vSphere hosts in a cluster, preferably managed by vCenter Server. VMware Management Assistant (vMA) and vSphere PowerCLI are also required to execute different commands and management scripts. Some of the tools can be downloaded from the URLs provided in different chapters.
This book is intended for mid-level system engineers and system integrators who want to learn the VMware power tools used to troubleshoot and manage the vSphere infrastructure. A good level of knowledge and understanding of virtualization is expected.
In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.
Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Select the Deploy from a file or URL option."
A block of code is set as follows:
Any command-line input or output is written as follows:
New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Select the Deploy from a file or URL option."
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.
To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.
Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.
You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.
We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/1767EN.pdf.
Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.
To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.
Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.
Please contact us at <[email protected]> with a link to the suspected pirated material.
We appreciate your help in protecting our authors and our ability to bring you valuable content.
If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.
This chapter covers a basic overview of troubleshooting skills, a complete set of troubleshooting tools for vSphere infrastructure, and tips and techniques on how these tools can be used to troubleshoot your vSphere infrastructure.
The topics covered in this chapter are as follows:
We all fix things in our daily lives, and all it takes to fix these things are troubleshooting skills. As with all skills, whether it's playing the piano, fixing a broken car, acting, or writing a computer program, some people are gifted with these skills for troubleshooting by nature. If you have a natural skill, you might assume that everyone else is also gifted. You may have learned how to ride a bike effortlessly, without knowing how much work other people may have had to put into it.
In the same way, some people have a natural talent for troubleshooting and are better at it than others. Such people quickly grasp the necessary steps and easily isolate the problem until they are able to find the root cause. Let's say your motorbike stops working and you take it to a mechanic, telling him the problems and the symptoms of your motorbike. A mechanic who is good at troubleshooting may be able to isolate the problem right away. He may also be able to explain to you why your motorbike fails and what the root cause of the problem is. By contrast, when you take your motorbike to a mechanic who isn't good at troubleshooting, you can expect the repair to take longer and the bill to be higher. You may also need to keep going back to the mechanic before your motorbike is finally fixed.
But this does not mean that if you don't have troubleshooting skills, you cannot learn them. Troubleshooting skills can be learned and mastered by anyone. As with many other skills, we apply certain techniques in troubleshooting; it does not matter whether we are gifted with this skill or not. When we start practicing, it becomes second nature. We all want to be better troubleshooters, but we also need to be precise and fast. A good system engineer needs strong troubleshooting skills. When we work in highly available environments where downtime is measured in dollars, we always want to have the right troubleshooting skill set to solve the problem. This requires precision, speed, comprehension, and troubleshooting skills.
Of course, it makes sense that you would prefer to go to the good mechanic who knows what it takes to fix your motorbike efficiently. Applying these scenarios will not only help you to troubleshoot in all aspects of life but also to troubleshoot vSphere in terms of identifying problems and their root causes, and understanding how to fix them.
You should consider a structured approach to troubleshooting rather than doing so without applying any methodology. The following aspects can be helpful and can teach you how to best practice troubleshooting, taking the motorbike to be repaired as an example:
Root Cause of Problem | Troubleshooting Skills Required | In the Engine | Action Needed
Not working at all | Easy | Dead battery | Problem understanding
Malfunctioning | Medium | Dashboard blinking light | Problem understanding + investigation
Malfunctioning, but the symptoms are seen in other components | Hard | Loss of power | Problem understanding + real-time investigation + correlation of events
Not working, but the problem disappeared | Requires long analysis | Weak battery or some mechanical problems | Problem understanding + historical investigation + correlation of events
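The table above can also be kept as a small lookup structure, which is handy if you want to attach the same kind of triage to your own problem categories. The following is only a minimal Python sketch: the names `TriageRow`, `TRIAGE_TABLE`, and `actions_for`, as well as the field names, are hypothetical and chosen for illustration; the row contents are taken from the motorbike table.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TriageRow:
    """One row of the motorbike table (field names are illustrative)."""
    symptom: str
    difficulty: str
    example: str
    actions: Tuple[str, ...]


# Rows copied from the table; only the structure is an assumption.
TRIAGE_TABLE = (
    TriageRow("Not working at all", "Easy", "Dead battery",
              ("problem understanding",)),
    TriageRow("Malfunctioning", "Medium", "Dashboard blinking light",
              ("problem understanding", "investigation")),
    TriageRow("Malfunctioning, but the symptoms are seen in other components",
              "Hard", "Loss of power",
              ("problem understanding", "real-time investigation",
               "correlation of events")),
    TriageRow("Not working, but the problem disappeared",
              "Requires long analysis",
              "Weak battery or some mechanical problems",
              ("problem understanding", "historical investigation",
               "correlation of events")),
)


def actions_for(symptom: str) -> Tuple[str, ...]:
    """Return the actions listed for a symptom (case-insensitive match)."""
    wanted = symptom.strip().lower()
    for row in TRIAGE_TABLE:
        if row.symptom.lower() == wanted:
            return row.actions
    raise KeyError(f"no triage row for symptom: {symptom!r}")
```

Calling `actions_for("Malfunctioning")` returns the two actions from the second row; an unknown symptom raises a `KeyError`, which is a reminder that the table needs a new row.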
You should always establish good communication methods within your work environment. Communicating your problem effectively is one of the key skills required for troubleshooting, especially when you are working in a collaborative environment. Lack of communication can lead to serious and never-ending problems with increasing downtime. You might be working continuously without realizing that your other team members are working on the same problem as you are. If you communicate precisely, you can avoid retracing the path that your other team members have already explored.
The following communication methods can be effectively used to communicate within and outside of teams:
While working on any system, you will face many common problems again and again. You should always create a knowledge base of these common problems, which includes identifying the problem, its symptoms, and the solution to be applied, along with a Root Cause Analysis (RCA) of the problem. Documenting and creating a knowledge repository of these problems and steps taken to troubleshoot them will save you a lot of work in the future. This will also help you to share the knowledge of troubleshooting with all your team members at one place. In addition, it will help you transfer knowledge to your newly hired team members and allow them to use a smarter and more methodological approach towards troubleshooting.
You might be able to fix the issue with no understanding of the root cause, but then you cannot completely prevent it from recurring. You should always isolate and find the correct root cause in order to avoid problems in the future. If you know the root cause, you can easily assign the issue to the correct team to resolve it accordingly. Sometimes you will come across very complex problems where the apparent root cause changes several times during the investigation. Highly available environments are also high-stress environments and require your full concentration, excellent troubleshooting skills, and the correct domain knowledge. This becomes more crucial when the problem costs your organization money every single second.
For highly available environments, where every second of downtime can cost you dollars, you should always have the right people in the right place in order to make sure your investment has been made in the right place. Having the right people for the right job pays off not only in terms of Return on Investment (ROI) but also in terms of your reputation. If the required knowledge is missing, you should conduct training: first educate yourself and then transfer the knowledge to your team members. A technical team equipped with the knowledge of the problem space is highly desirable at all times.
Whenever you face a critical problem, you should always try to divide the problem into smaller issues and distribute them among your team members. If your team has only one member, you can still divide the problem into smaller ones. This approach not only enables you to solve the problem quickly but also lets your team members concentrate on different areas of the problem. Obviously, you should avoid working on the same problem that your other team members are working on. Thus, you should always make sure you have divided the problem space appropriately.
You should always encourage your team members to log all their problems, their solutions, and the steps that were taken to reach the solutions. You could centralize such information using a knowledge base or a local wiki within your organization. Once you have your knowledge base in place with records of problems and their troubleshooting solutions, you can start testing the solutions. This will assure you that the solutions in your knowledge base are robust and well tested. You can use some kind of document version control so that as the problem evolves, your documentation can keep track of all of these changes.
When you are working in a data center, where you need to work together with other members of a team, this documentation process enables the entire team to solve the problem more easily. If you document the solutions in your organization, you truly enable your junior team members to learn new things and solve problems without involving senior team members.
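As an illustration of what a knowledge base record might hold, here is a minimal Python sketch. It assumes a record with the elements discussed above (the problem, its symptoms, the solution, and the root cause), plus a tested flag and a simple revision list standing in for document version control. The class `KnowledgeBaseEntry` and the helper `search` are hypothetical names, not part of any VMware tool, and a real team would more likely use a wiki or a ticketing system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class KnowledgeBaseEntry:
    """One documented problem (hypothetical structure for illustration)."""
    problem: str
    symptoms: List[str]
    solution: str
    root_cause: str
    tested: bool = False                      # set once the solution is verified
    revisions: List[str] = field(default_factory=list)

    def revise(self, new_solution: str, note: str) -> None:
        """Replace the solution, keeping the old one in the revision list."""
        self.revisions.append(f"{note}: {self.solution}")
        self.solution = new_solution
        self.tested = False                   # a changed solution must be retested


def search(entries: List[KnowledgeBaseEntry], keyword: str) -> List[KnowledgeBaseEntry]:
    """Return entries whose problem or symptoms mention the keyword."""
    keyword = keyword.lower()
    return [
        entry for entry in entries
        if keyword in entry.problem.lower()
        or any(keyword in symptom.lower() for symptom in entry.symptoms)
    ]
```

Using the motorbike example, an entry could record the problem "Motorbike does not start", the symptom "Not working at all", and the root cause "Dead battery". Calling `revise` on it stores the previous solution and clears the tested flag, so a changed solution is not trusted until it has been verified again.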
In VMware vSphere troubleshooting, we will discuss and troubleshoot problems with different vSphere hosts, virtual machines, and vCenter Server. In simple walkthroughs, we will identify the problems and fix those problems by applying our knowledge. You will see how to isolate vSphere-related technical issues and how to apply troubleshooting techniques to those issues. We will discuss different VMware power tools to manage a vSphere infrastructure in a centralized way, which include VMware vSphere Management Assistant (vMA), ESXCLI, vSphere PowerCLI, esxtop, resxtop, performance monitoring charts, and many other tools. These tools will be introduced step by step in the upcoming chapters.
VMware vMA is a SUSE Linux-based virtual appliance that is shipped with vSphere SDK for Perl and the vSphere command-line interface. You can use vMA to manage your entire vSphere infrastructure from a central service console by executing different service scripts, creating and analyzing log bundles, monitoring performance, and much more. You can also use vMA as a centralized log server to receive logs from all of your vSphere hosts. Let's look at the various configuration parameters of our first VMware power tool, vMA.
VMware vMA requires a minimum of 3 GB of disk space and 600 MB of RAM. The Open Virtual Machine Format (OVF) template is based on SUSE Linux 64-bit architecture. vMA supports vSphere 4.0 Update 2 to vSphere 6.0 and vCenter 5.0 and upward. vMA can be used to target vCenter 5.0 or later, ESX/ESXi 3.5 Update 5, and vSphere ESXi 4.0 Update 2 or later systems. A single vMA appliance can support a different number of targets, depending on how it is being used at runtime. You will require a user name and password to download the vMA appliance. It can be downloaded from https://my.vmware.com/group/vmware/details?productId=352&downloadGroup=VMA550.
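Before deploying the appliance, it can be useful to compare the resources you plan to allocate against the minimums quoted above. The following is only a rough Python sketch: the function name `vma_shortfalls` and its parameters are hypothetical, and the two constants simply restate the 3 GB and 600 MB figures from the text. It does not query a vSphere host; you supply the numbers yourself.

```python
# Minimums restated from the text above; adjust if your vMA release differs.
VMA_MIN_DISK_GB = 3
VMA_MIN_RAM_MB = 600


def vma_shortfalls(free_disk_gb: float, free_ram_mb: float) -> list:
    """Return human-readable shortfalls; an empty list means both minimums are met."""
    problems = []
    if free_disk_gb < VMA_MIN_DISK_GB:
        problems.append(
            f"disk: {free_disk_gb} GB available, {VMA_MIN_DISK_GB} GB required"
        )
    if free_ram_mb < VMA_MIN_RAM_MB:
        problems.append(
            f"RAM: {free_ram_mb} MB available, {VMA_MIN_RAM_MB} MB required"
        )
    return problems
```

An empty result from `vma_shortfalls(10, 1024)` means both minimums are met, while `vma_shortfalls(2, 512)` reports one message for disk and one for RAM.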
We will deploy the new vMA from the vSphere Client tied to a vCenter Server 5.0 or vCenter Server 4.x. It can be deployed on the following vSphere releases:
The virtualized hosts that can be managed from the vMA are:
To install VMware vMA, perform the following steps:
