VMware vSphere Troubleshooting - Muhammad Zeeshan Munir - E-Book

VMware vSphere Troubleshooting E-Book

Muhammad Zeeshan Munir

Description

VMware vSphere is the leading server virtualization platform with consistent management for virtual data centers. This book enhances your troubleshooting skills so that you can diagnose and resolve day-to-day problems in your VMware vSphere infrastructure.
It gives you practical, hands-on knowledge of the performance monitoring and troubleshooting tools used to manage and troubleshoot the vSphere infrastructure.
It begins by introducing a systematic approach to troubleshooting and showcasing troubleshooting techniques. You will be able to use the troubleshooting tools to monitor performance and troubleshoot issues related to hosts and virtual machines. Moving on, you will troubleshoot High Availability, Storage I/O Control problems, virtual LANs, and iSCSI, NFS, and VMFS issues.
By the end of this book, you will be able to analyze and solve advanced issues in a vSphere environment, such as vCenter certificates, database problems, and different failed state errors.

You can read this e-book in the Legimi apps or in any app that supports the following formats:

EPUB
MOBI

Pages: 268

Publication year: 2015




Table of Contents

VMware vSphere Troubleshooting
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Why subscribe?
Free access for Packt account holders
Instant updates on new Packt books
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Downloading the example code
Downloading the color images of this book
Errata
Piracy
Questions
1. The Methodology of Problem Solving
Troubleshooting techniques
Precise communication
Creating a knowledge base of identified problems and solutions
Obtaining the required knowledge of the problem space
Isolating the problem space
Documenting and keeping track of changes
Troubleshooting with power tools
Configuring the vSphere management agent
Installation
Installation steps
VMware vMA features
Powering-on vMA
AD integration
AD unattended access
vMA web UI
vi-user
Configuring vMA as a syslog server
Creating a logrotate file
The vMA authentication mechanism
Accessing systems from vMA
vMA scripts samples
PowerCLI
Connecting to vCenter Server or an ESX/vSphere host with PowerCLI
Setting up a syslog server using PowerCLI
Setting up a sysLog server manually
A comprehensive reference of log files
vSphere log files – vSphere host 5.1 and later
Logs from vCenter Server components on vSphere host 5.1, 5.5, and 6.0
vCenter log files
vCenter inventory service log files
vSphere Profile-Driven Storage log files
Configuring logs and collecting logs
Using vSphere Client
Using vSphere Web Client
Using the vm-support tool
Running vm-support in a console session on vSphere hosts
Generating logs on stdout
Using vm-support in vMA to collect logs
Using PowerCLI to collect the log bundle
Collecting log bundles from vCenter Server
Collecting log bundles from a vSphere host
Collecting log bundles from the vSphere log browser
Exporting logs
Understanding the hardware health of vSphere hosts
Miscellaneous tools
Summary
2. Monitoring and Troubleshooting Host and VM Performance
Tools for performance monitoring
Using esxtop/resxtop
Live resource monitoring – the interactive mode
Offline performance monitoring – batch mode
Replaying performance metrics – replay mode
Using Windows Performance Monitor
Analyzing esxtop results
Understanding CPU statistics
Enabling more esxtop fields
Memory statistics
Memory management in a vSphere host
Memory overcommitment
Memory overhead
Transparent page sharing
Ballooning
Memory compression
Esxtop for memory statistics
Diagnosing memory blockage
Network metrics
Understanding network metrics
Diagnosing network performance
Storage metrics
Using vMA and resxtop
vCenter performance charts
Creating a chart and configuring metrics
Configuring logging level for performance
Virtual machine troubleshooting
USB-attached virtual machines
Non-responsive USB/CD drives
Unable VM migration with a USB device
Fault-tolerant virtual machines
Incompatible or hosts not available
Summary
3. Troubleshooting Clusters
An overview of cluster information
Cluster performance monitoring
vSphere HA
Failing heartbeat datastores
Changing heartbeating datastores
Insufficient heartbeat datastores
Unable to unmount a datastore
Detaching datastores with vMA
Detaching a datastore using vSphere PowerCLI
vCenter server rejects specific datastores
DRS-enabled storage
Failed DRS recommendations
Datastore maintenance mode failure
More common errors of Storage DRS
Insufficient resources and vSphere HA failover
I/O control troubleshooting
SIOC logging
Changing vDisk shares and limits for a virtual machine
vSphere Fault Tolerance for virtual machines
Common troubleshooting of fault tolerance
Configuring SNMP traps for continuous monitoring
Configuring SNMP traps with vMA
Tuning the SNMP agents
Configuring SNMP agents from PowerCLI
Summary
4. Monitoring and Troubleshooting Networking
Log files
Understanding the virtual network creation process
Network troubleshooting commands
Repairing a dvsdata.db file
ESXCLI network
Troubleshooting uplinks
Troubleshooting virtual switches
Troubleshooting VLANs
Verifying physical trunks and VLAN configuration
Verifying VLAN configuration from CLI
Verifying VLANs from PowerCLI
Verifying PVLANs and secondary PVLANs
Testing virtual machine connectivity
Troubleshooting VMkernel interfaces
Verifying configuration from DCUI
Verifying network connectivity from DCUI
Verifying management network from DCUI
Troubleshooting with port mirroring
Monitoring with NetFlow
Adding a default route
Deleting a route
Managing vSphere DNS
PowerCLI - changing DNS on multiple vSphere hosts
Summary
5. Monitoring and Troubleshooting Storage
Storage adapters
Storage log files
The hostd.log file
The storageRM.log file
The vmkernel.log file
DRMDump
Multipathing and PSA troubleshooting
Native Multipathing Plugins
Changing the path selection policy from VMware vMA
Storage path masking
LUN and claim rules
Identifying storage devices and LUNs
Listing storage devices from vMA
Troubleshooting paths
Disabling vSphere APD
Planned PDL
VMware vMA to automate detaching of LUNs
Unplanned PDL
Multipath policy selection from the vSphere client
Using vMA to change a path state
Unmasking a path
LUN troubleshooting tips
Storage module troubleshooting
Troubleshooting iSCSI-related issues
iSCSI error correction
Troubleshooting NFS issues
Troubleshooting VMFS issues
VMFS snapshots and resignaturing
SAN display problems
SAN performance troubleshooting
Summary
6. Advanced Troubleshooting of vCenter Server and vSphere Hosts
vCenter managed hosts
Logging for an inventory service
Viewing vCenter Server logs
Setting up vCenter Server the statistics intervals from vSphere Web Client
Relocating or removing a vSphere host
vSphere host disconnection and reconnection
vSphere SSL certificates
Replacing machine certificates
Replacing VMCA root certificate
Replacing user solution certificates
Implementing SSL certificates for ESXi
Regenerating certificates
vCenter Server database
vSphere HA agent troubleshooting
Unreachable or uninitialized state
The HA agent initialization error
Reinstalling the HA agent
HA agent host failed state
Network partitioned or network isolated errors
Commonly known auto deploy problems
Getting help
Summary
A. Learning PowerGUI Basics
Using the VMware Community PowerPack
Summary
B. Installing VMware vRealize Operations Manager
Summary
C. Power CLI - A Basic Reference
Summary
Index

VMware vSphere Troubleshooting


Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: October 2015

Production reference: 1261015

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78355-176-7

www.packtpub.com

Credits

Author

Muhammad Zeeshan Munir

Reviewers

Kenneth van Ditmarsch

Péter Károly "Stone" JUHÁSZ

Commissioning Editor

Ashwin Nair

Acquisition Editors

Shaon Basu

Divya Poojari

Content Development Editor

Mamata Walkar

Technical Editor

Mohita Vyas

Copy Editor

Angad Singh

Project Coordinator

Sanjeet Rao

Proofreader

Safis Editing

Indexer

Tejal Soni

Production Coordinator

Melwyn Dsa

Cover Work

Melwyn Dsa

About the Author

Muhammad Zeeshan Munir is a system architect who specializes in data center virtualization and cloud computing. He has been in the IT industry for nearly 11 years since completing his postgraduate degree in computer science and has been working with Linux, Microsoft, and VMware products. He mainly specializes in designing, integrating, and automating private and public cloud infrastructures for companies ranging from start-ups to enterprises.

Currently, Zeeshan works at Qatar Computing and Research Institute (Hamad Bin Khalifa University). Zeeshan has also provided services as a freelance Assistant Manager ICT Operations to a Milan-based company, Linx ICT Solutions.

About the Reviewers

Kenneth van Ditmarsch is a very experienced freelance virtualization consultant. As one of the few freelance VMware Certified Design eXperts (VCDX), he has clearly added value in virtualization infrastructure projects. He especially gained knowledge and extensive project experience during his last years at VMware and several specialized consulting engagements he worked on.

Kenneth agreed to review this book based on his extensive experience with VMware products. You can check out Kenneth's personal blog about virtualization at http://virtualkenneth.com/.

Péter Károly "Stone" JUHÁSZ was born in Hungary in 1980, where he lives with his family and their cat.

He got his MSc degree as a programmer mathematician. At the very beginning of his career, he turned towards operations. Since 2004, he has been working as a general—mainly GNU/Linux—system administrator.

His average working day includes: patching in the server room, installing servers, managing PBX, maintaining VMware vSphere infrastructure and servers at Amazon AWS, managing storage and backups, performing monitoring with Nagios, trying out new technology, and writing scripts to ease everyday work.

His interests in IT are Linux, server administration, virtualization, artificial intelligence, network security, and distributed systems.

www.PacktPub.com

Support files, eBooks, discount offers, and more

For support files and downloads related to your book, please visit www.PacktPub.com.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at <[email protected]> for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

https://www2.packtpub.com/books/subscription/packtlib

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can search, access, and read Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt
Copy and paste, print, and bookmark content
On demand and accessible via a web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view 9 entirely free books. Simply use your login credentials for immediate access.

Instant updates on new Packt books

Get notified! Find out when new books are published by following @PacktEnterprise on Twitter or the Packt Enterprise Facebook page.

For indeed, with hardship [will be] ease. [94:5]

This book is dedicated to my parents, who taught me how to write and communicate better!

To my lovely wife. Without her tireless support in different adventures (in UK, Italy, Qatar and Pakistan), I would not have been able to make it!

I would like to thank Dr. Ahmed Elmagarmid (Executive Director of Qatar Computing & Research Institute, Hamad Bin Khalifa University), whose vision inspired me all the way while writing this book.

I would like to extend my special thanks to everyone who motivated and helped me achieve this, including my family (brothers and sister), my friends (Muhammad Imran and Abid), and my director in Italy, Marco Li Vigni, whose technical advice and ready support have always been a source of guidance and of the greatest value to me.

I would like to thank the reviewers of this book for their feedback and for pointing me in the right direction. Special thanks go to Mamata Walkar, the Content Editor of the book, Divya Poojari, Technical Editor Mohita Vyas, and Shaon Basu for getting this effort completed.

Preface

VMware has been a well-known cloud and virtualization software provider for almost two decades. The VMware virtualization suite, vSphere, comprises different virtualization products, including bare-metal hypervisors based on vSphere hosts (ESX/ESXi), vCenter Server, vCloud Director, VMware NSX (previously known as vCloud Networking and Security), VMware Horizon Mirage (desktop virtualization), and so on. Virtualization is based on an operating system that can be installed on bare-metal servers and workstations to host other operating systems, for example, Linux, Unix, Windows, and many more. This allows vSphere hosts to share and distribute the available resources (computation, memory, and disk drive) among different hosted virtual machines, and allows them to install different operating systems without exposing the hardware architecture.

Today, many organizations, universities, and research institutes are widely adopting virtualization for day-to-day computing needs using the VMware vSphere hypervisor. Wide growth in vSphere-based infrastructures also requires troubleshooting and resolution of different related issues of the vSphere hypervisor. This is a book that enables system engineers and data center architects to troubleshoot most of the common problems that can be faced in a data center based on the vSphere infrastructure. The book lets you develop a clear and minute troubleshooting approach and lets you adapt to it by practicing it. Real vSphere problems that system engineers may face in the data center are covered by example in this book. In addition to that, vSphere Troubleshooting can be used as a reference and provides a complete overview of the concepts and knowledge necessary for system engineers. You will learn new skills, new tools, and ready-to-use troubleshooting recipes by reading it.

What this book covers

Chapter 1, The Methodology of Problem Solving, covers some of the common troubleshooting skills that can also be applied to troubleshoot vSphere hosts. In this chapter, you learn the installation of VMware Management Assistant (vMA), the first tool to help you get started.

Chapter 2, Monitoring and Troubleshooting Host and VM Performance, teaches you how to use performance-monitoring tools and how these tools can help troubleshoot some very common issues in the vSphere infrastructure. This chapter also covers some of the very important vSphere host metrics and how these metrics can be viewed in performance charts.

Chapter 3, Troubleshooting Clusters, discusses how to get basic information about clusters in order to troubleshoot their common problems. This chapter also covers how this information can be used in advance to prevent problems from happening. Performance monitoring for clusters is a very important ingredient, and it helps you with your business continuity and managing workloads. The topic on troubleshooting heartbeat datastores and Storage DRS issues gives a basic insight into some of the very common problems, how to solve them, and some tips for preventing them from occurring.

Chapter 4, Monitoring and Troubleshooting Networking, covers some of the basic concepts of switching, a deep dive into troubleshooting commands, and some of the tools for monitoring network performance. It also covers how to troubleshoot a single vSphere host using esxcli and, for multiple vSphere hosts, how to automate tasks using a scripting language from PowerCLI or a vMA appliance.

Chapter 5, Monitoring and Troubleshooting Storage, covers many different storage troubleshooting techniques, except Fiber SANs. Learning these techniques is a good starting point to manage most storage troubleshooting issues. We also keep focusing on the VMware vMA appliance to deploy our troubleshooting procedures for storage.

Chapter 6, Advanced Troubleshooting of vCenter Server and vSphere Hosts, is where you learn about different vCenter Server problems and vSphere HA agent state problems. It also covers how to troubleshoot and fix some of the common problems related to vSphere HA. Once you know how to fix some of the common issues, you will also gain some background in troubleshooting advanced problems.

Appendix A, Learning PowerGUI Basics, shows you how to use the PowerGUI script editor to write your PowerShell scripts. You can use it to manage, not only your vSphere infrastructure, but also your Windows-based environment from a single centralized console.

Appendix B, Installing VMware vRealize Operations Manager, illustrates how VMware vRealize Operations Manager helps you to ensure the availability and management of your infrastructure and applications across Amazon, vSphere, physical hardware, and Hyper-V. You can monitor your applications and optimize performance for your infrastructure.

Appendix C, Power CLI - A Basic Reference, shows you how to download and run the VMware vSphere PowerCLI 6.0 Release 1 or Release 2 in a step-by-step manner.

What you need for this book

This book requires you to have a working setup of the VMware infrastructure, and it should include at least two vSphere hosts in a cluster preferably managed by vCenter Server. VMware Management Assistant (vMA) and vSphere Power CLI are also required to execute different commands and management scripts. Some of the tools can be downloaded from the URLs provided in different chapters.

Who this book is for

This book is intended for mid-level system engineers and system integrators who want to learn the VMware power tools used to troubleshoot and manage the vSphere infrastructure. A good level of knowledge and understanding of virtualization is expected.

Conventions

In this book, you will find a number of text styles that distinguish between different kinds of information. Here are some examples of these styles and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: "Select the Deploy from a file or URL option."

A block of code is set as follows:

Writing inode tables: done
Creating journal (32768 blocks): done
Writing superblocks and filesystem accounting information: done
This filesystem will be automatically checked every 28 mounts or
180 days, whichever comes first. Use tune2fs -c or -i to override.

Any command-line input or output is written as follows:

sudo rm /etc/localtime
sudo ln -s /usr/share/zoneinfo/UTC /etc/localtime

New terms and important words are shown in bold. Words that you see on the screen, for example, in menus or dialog boxes, appear in the text like this: "Select the Deploy from a file or URL option."

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or disliked. Reader feedback is important for us as it helps us develop titles that you will really get the most out of.

To send us general feedback, simply e-mail <[email protected]>, and mention the book's title in the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide at www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Downloading the color images of this book

We also provide you with a PDF file that has color images of the screenshots/diagrams used in this book. The color images will help you better understand the changes in the output. You can download this file from: https://www.packtpub.com/sites/default/files/downloads/1767EN.pdf.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you could report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the Errata Submission Form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded to our website or added to any list of existing errata under the Errata section of that title.

To view the previously submitted errata, go to https://www.packtpub.com/books/content/support and enter the name of the book in the search field. The required information will appear under the Errata section.

Piracy

Piracy of copyrighted material on the Internet is an ongoing problem across all media. At Packt, we take the protection of our copyright and licenses very seriously. If you come across any illegal copies of our works in any form on the Internet, please provide us with the location address or website name immediately so that we can pursue a remedy.

Please contact us at <[email protected]> with a link to the suspected pirated material.

We appreciate your help in protecting our authors and our ability to bring you valuable content.

Questions

If you have a problem with any aspect of this book, you can contact us at <[email protected]>, and we will do our best to address the problem.

Chapter 1. The Methodology of Problem Solving

This chapter covers a basic overview of troubleshooting skills, a complete set of troubleshooting tools for vSphere infrastructure, and tips and techniques on how these tools can be used to troubleshoot your vSphere infrastructure.

The topics covered in this chapter are as follows:

Troubleshooting techniques
Installing and configuring vMA
Configuring a centralized syslog server
Utilizing PowerCLI
A comprehensive reference of log files
Collecting logs
Understanding the health of vSphere hosts

Troubleshooting techniques

We all fix things in our daily lives, and all it takes to fix these things are troubleshooting skills. As with all skills, whether it's playing the piano, fixing a broken car, acting, or writing a computer program, some people are gifted with these skills for troubleshooting by nature. If you have a natural skill, you might assume that everyone else is also gifted. You may have learned how to ride a bike effortlessly, without knowing how much work other people may have had to put into it.

In the same way, some people have a natural talent for troubleshooting and are better at it than others. Such people quickly grasp the necessary steps and narrow down the problem until they find the root cause. Let's say your motorbike stops working and you take it to a mechanic, describing the problems and symptoms. A mechanic who is good at troubleshooting may be able to isolate the problem right away and explain why your motorbike failed and what the root cause is. By contrast, if you take your motorbike to a mechanic who isn't good at troubleshooting, you can expect the repair to take longer and the bill to be higher. You may also need to return to the mechanic several times before your motorbike is finally fixed.

But this does not mean that if you don't have troubleshooting skills, you cannot learn them. Troubleshooting skills can be learned and mastered by anyone. Like many other skills, troubleshooting relies on certain techniques that we can apply whether or not we are naturally gifted, and with practice they become second nature. We all want to be better troubleshooters, but we also need to be precise and fast. A good system engineer needs strong troubleshooting skills. When we work in highly available environments where downtime is measured in dollars, we always want to have the right troubleshooting skill set to solve the problem. This requires precision, speed, and comprehension.

Of course, it makes sense that you would prefer to go to the good mechanic who knows what it takes to fix your motorbike efficiently. Applying the same thinking will help you troubleshoot not only in all aspects of life but also in vSphere: identifying problems and their root causes, and understanding how to fix them.

You should take a structured approach to troubleshooting rather than working without any methodology. The following table, which uses the motorbike repair as an example, shows how the skills and actions required vary with the type of problem:

Root Cause of Problem | Troubleshooting Skills Required | In the Engine | Action Needed
Not working at all | Easy | Dead battery | Problem understanding
Malfunctioning | Medium | Dashboard blinking light | Problem understanding + investigation
Malfunctioning, but the symptoms are seen in other components | Hard | Loss of power | Problem understanding + real-time investigation + correlation of events
Not working, but the problem disappeared | Requires long analysis | Weak battery or some mechanical problems | Problem understanding + historical investigation + correlation of events

Precise communication

You should always establish good communication methods within your work environment. Communicating your problem effectively is one of the key skills required for troubleshooting, especially when you are working in a collaborative environment. Lack of communication can lead to serious, never-ending problems and increasing downtime. You might be working continuously without realizing that your other team members are working on the same problem as you are. If you communicate precisely, you avoid retracing paths that your other team members have already explored.

The following communication methods can be effectively used to communicate within and outside of teams:

Direct conversation: You can communicate the problem directly, in person, with your team members
Voice/Video chats: Voice and video chats are very common nowadays and enable a geographically distributed team to conduct regular meetings
Web sessions: Web sessions can be used to access remote systems, conduct presentations, and share whiteboards
Email/Text chat: Email is the most common tool used nowadays for all kinds of office communication

Creating a knowledge base of identified problems and solutions

While working on any system, you will face many common problems again and again. You should always create a knowledge base of these common problems, which includes identifying the problem, its symptoms, and the solution to be applied, along with a Root Cause Analysis (RCA) of the problem. Documenting and creating a knowledge repository of these problems and the steps taken to troubleshoot them will save you a lot of work in the future. This will also help you to share troubleshooting knowledge with all your team members in one place. In addition, it will help you transfer knowledge to your newly hired team members and allow them to use a smarter and more methodical approach to troubleshooting.
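As a rough illustration, a knowledge-base entry of the kind described above can be modeled as a small record that holds the problem, its symptoms, the root cause, and the solution steps. The following is only a minimal sketch: the class names, field names, and sample symptoms are hypothetical and are not taken from any VMware tool. The sample entry reuses the motorbike example.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class KnowledgeBaseEntry:
    """One documented problem: what was seen, why it happened, and the fix."""
    problem: str
    symptoms: list[str]
    root_cause: str
    solution_steps: list[str]
    recorded_on: date = field(default_factory=date.today)


class KnowledgeBase:
    """A hypothetical in-memory collection of entries, searchable by symptom."""

    def __init__(self) -> None:
        self._entries: list[KnowledgeBaseEntry] = []

    def add(self, entry: KnowledgeBaseEntry) -> None:
        self._entries.append(entry)

    def find_by_symptom(self, keyword: str) -> list[KnowledgeBaseEntry]:
        """Return entries with at least one symptom containing the keyword."""
        needle = keyword.lower()
        return [
            entry for entry in self._entries
            if any(needle in symptom.lower() for symptom in entry.symptoms)
        ]


kb = KnowledgeBase()
kb.add(KnowledgeBaseEntry(
    problem="Motorbike does not start",
    symptoms=["Engine does not turn over", "Dashboard stays dark"],  # illustrative
    root_cause="Dead battery",
    solution_steps=["Test the battery", "Replace the battery"],
))

# Look up known root causes for a symptom a team member has just observed.
matches = kb.find_by_symptom("dashboard")
print([entry.root_cause for entry in matches])
```

In practice, such records would more likely live in a wiki or ticketing system than in memory. The point of the sketch is only that each entry keeps symptoms, root cause, and solution together so that they can be searched later.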

You might be able to fix an issue without understanding its root cause, but you cannot prevent it from recurring. You should always isolate and find the correct root cause in order to avoid problems in the future. If you know the root cause, you can easily assign the issue to the correct team to resolve. Sometimes you will come across very complex problems where the apparent root cause changes several times during the investigation. Highly available environments are also high-stress environments and require your full concentration, excellent troubleshooting skills, and the correct domain knowledge. This becomes even more crucial when every second of downtime costs your organization money.

Obtaining the required knowledge of the problem space

For highly available environments, where every second of downtime can cost you dollars, you should always have the right people in the right place to protect your investment. Having the right people for the right job pays off not only in terms of Return on Investment (ROI) but also in terms of your reputation. If the required knowledge is missing, you should conduct training: first educate yourself and then transfer the knowledge to your team members. A technical team equipped with knowledge of the problem space is highly desirable at all times.

Isolating the problem space

Whenever you face a critical problem, you should try to break it into smaller issues and divide them among your team members. If your team has only one member, you can still break the problem into smaller ones. This approach not only enables you to solve the problem quickly but also lets your team members concentrate on different areas of the problem. You should avoid duplicating work that other team members are already doing, so always make sure you have divided the problem space appropriately.

Documenting and keeping track of changes

You should always encourage your team members to log all their problems, their solutions, and the steps that were taken to reach the solutions. You could centralize such information using a knowledge base or a local wiki within your organization. Once you have your knowledge base in place with records of problems and their troubleshooting solutions, you can start testing the solutions. This will assure you that the solutions in your knowledge base are robust and well tested. You can use some kind of document version control so that as a problem evolves, your documentation keeps track of all the changes.

When you are working in a data center, where you need to work together with other members of a team, this documentation process enables the entire team to solve the problem more easily. If you document the solutions in your organization, you truly enable your junior team members to learn new things and solve problems without involving senior team members.
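As a rough illustration of what such a record might hold, here is a minimal Python sketch. The class and field names (KnowledgeBaseEntry, Revision, tested, and so on) are hypothetical and chosen only for this example; a real knowledge base or wiki will have its own structure.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class Revision:
    """One change to an entry, so the record can follow the problem as it evolves."""
    changed_on: date
    author: str
    note: str


@dataclass
class KnowledgeBaseEntry:
    """A logged problem, its solution, and the steps taken to reach it (hypothetical layout)."""
    problem: str
    solution: str
    steps: List[str]
    tested: bool = False  # set to True once the solution has been re-tested
    revisions: List[Revision] = field(default_factory=list)

    def revise(self, author: str, note: str, changed_on: date) -> None:
        """Record a change and mark the entry as needing a fresh test."""
        self.revisions.append(Revision(changed_on, author, note))
        self.tested = False
```

Resetting tested on every revision is a design choice made for this sketch: it keeps the "well tested" claim honest whenever the documented solution changes.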

Troubleshooting with power tools

In VMware vSphere troubleshooting, we will discuss and troubleshoot problems with different vSphere hosts, virtual machines, and vCenter Server. In simple walkthroughs, we will identify the problems and fix them by applying our knowledge. You will see how to isolate vSphere-related technical issues and how to apply troubleshooting techniques to those issues. We will discuss different VMware power tools to manage a vSphere infrastructure in a centralized way, including VMware vSphere Management Assistant (vMA), ESXCLI, vSphere PowerCLI, esxtop, resxtop, performance monitoring charts, and many other tools. These tools will be introduced step by step in the upcoming chapters.

Configuring the vSphere management agent

VMware vMA is a SUSE Linux-based virtual appliance that ships with the vSphere SDK for Perl and the vSphere Command-Line Interface (vCLI). You can use vMA to manage your entire vSphere infrastructure from a central service console by executing different service scripts, creating and analyzing log bundles, monitoring performance, and much more. You can also use vMA as a centralized log server to receive logs from all of your vSphere hosts. Let's look at the various configuration parameters of our first VMware power tool, vMA.

Installation

VMware vMA requires a minimum of 3 GB of disk space and 600 MB of RAM. The Open Virtualization Format (OVF) template is based on SUSE Linux 64-bit architecture. vMA supports vSphere 4.0 Update 2 to vSphere 6.0 and vCenter 5.0 and upward. vMA can be used to target vCenter 5.0 or later, ESX/ESXi 3.5 Update 5, and vSphere ESXi 4.0 Update 2 or later systems. The number of targets a single vMA appliance can support depends on how it is being used at runtime. You will require a user name and password to download the vMA appliance. It can be downloaded from https://my.vmware.com/group/vmware/details?productId=352&downloadGroup=VMA550.
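Before deploying, you may want to confirm that the chosen host and datastore can satisfy these minimums. The following Python sketch only illustrates the comparison: the function name vma_shortfalls and its parameters are hypothetical, and you would supply the free disk and memory figures yourself from whatever monitoring source you use.

```python
from typing import List

# Minimum vMA requirements as stated in the text above.
MIN_DISK_GB = 3
MIN_RAM_MB = 600


def vma_shortfalls(free_disk_gb: float, free_ram_mb: float) -> List[str]:
    """Return a list of unmet requirements; an empty list means both minimums are met."""
    problems = []
    if free_disk_gb < MIN_DISK_GB:
        problems.append(
            f"disk: {free_disk_gb} GB free, {MIN_DISK_GB} GB required"
        )
    if free_ram_mb < MIN_RAM_MB:
        problems.append(
            f"RAM: {free_ram_mb} MB free, {MIN_RAM_MB} MB required"
        )
    return problems
```

For example, calling vma_shortfalls(2, 2048) reports a single disk shortfall, while vma_shortfalls(10, 2048) returns an empty list.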

We will deploy the new vMA from the vSphere Client tied to a vCenter Server 5.0 or vCenter Server 4.x. It can be deployed on the following vSphere releases:

vCenter Server 5.0
vCenter Server 6.0

The virtualized hosts that can be managed from the vMA are:

ESXi 3.5 Update 5
ESXi 4.0 Update 2
vSphere ESXi 4.1 and 4.1 Update 1
vSphere ESXi 5.0
vSphere ESXi 6.0
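If you keep an inventory of host versions, a quick lookup against this list can flag hosts to double-check before adding them as vMA targets. The Python sketch below is only an illustration: the name is_listed_target and the (version, update) representation are assumptions made for this example, and the lookup treats the list literally, so a release that is not enumerated here is reported as "not listed" rather than as unsupported.

```python
# Releases named in the list above, as (version, update level) pairs.
# An update level of 0 stands for the base release (an assumption of this sketch).
LISTED_TARGETS = {
    ("3.5", 5),
    ("4.0", 2),
    ("4.1", 0),
    ("4.1", 1),
    ("5.0", 0),
    ("6.0", 0),
}


def is_listed_target(version: str, update: int = 0) -> bool:
    """Return True if this exact release appears in the list of manageable hosts."""
    return (version, update) in LISTED_TARGETS
```

For instance, is_listed_target("4.0", 2) is True, whereas is_listed_target("4.0") is False because the base 4.0 release does not appear in the list.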

Installation steps

To install VMware vMA, perform the following steps:

1. Once you are done with downloading the appliance, extract the vMA zip file into a directory.
2. Log in to your vCenter or vSphere Client. From your vCenter client, you can select any vSphere host to which you would like to deploy vMA.
3. To start the OVF appliance deployment wizard, choose the option Deploy OVF Template from the file menu.
4. Select the Deploy from a file or URL option.
5. Then, browse the folder where you have already extracted your vMA appliance. Click on the vMA OVF template to select it.
6. Next, accept the vMA license agreement.
7. Give an FQDN to your vMA appliance; I have given mine as vma.linxsol.com. The default name is also acceptable.
8. Choose the appropriate folder to store your appliance for inventory.
9. From your vCenter Server, choose the resource pool to allocate resources for the vMA appliance. If you do not select any resource pool, the wizard will place your appliance in the highest level of resource pool, which is selected by default.
10. Choose the storage where you would like to store your vMA appliance; it could be a local data store, iSCSI, FC SAN, or NFS data store.
11. Next, choose Disk Format