Tutorial: Get VM performance data

Today's tutorial is all about querying performance data for one or more VMs in your vSphere inventory. It is one of the topics people have been asking for and, surprisingly, it is not too difficult to deal with. You can use the resulting data to create beautiful custom charts in real time.

Routine

As usual, let us start with a clean script that connects to your vCenter or ESXi host using the Perl SDK. We'll also include the very handy Data::Dumper module.

The script is going to support several command line options in order to further customize the results. I like to set default values for all options so that the script actually does something when it's run for the first time.

#!/usr/bin/perl -w

use strict;
use warnings;
use VMware::VIRuntime;
use Data::Dumper;

my %opts = (
    vmname => {
        type => '=s',
        help => 'VM name regex',
        default => '.*',
    },
    interval => {
        type => '=i',
        help => 'Performance counters sample interval [ 20 | 300 | 1800 | 7200 | 86400 ]',
        default => '20',
    },
    counters => {
        type => '=s',
        help => 'Comma-separated list of performance counter keys',
        default => '2,33',
    },
    action => {
        type => '=s',
        help => 'Type of operation to perform [ listcounters | getperformance ]',
        default => 'getperformance',
    }
);

Opts::add_options(%opts);
Opts::parse();
Opts::validate();
Util::connect();

We are also going to need the performance manager, which allows us to query performance data. Among other things, it is the place to look for a list of all available performance counters. As vSphere supports several hundred of them, the SDK uses numeric identifiers (keys) to uniquely identify those we are interested in.

# get performance manager view
my $perfmgr_view = Vim::get_view(mo_ref => Vim::get_service_content()->perfManager)
    || die "Failed to obtain perfManager view\n";

# store performance counters in a nice hash
my %pcs = ();
foreach (@{$perfmgr_view->perfCounter}) {
    $pcs{$_->key} = {
        desc => $_->nameInfo->summary,
        roll => $_->rollupType->val,
        unit => $_->unitInfo->label,
    };
}

List performance counters

The very first thing you might want to do is list all available performance counters and their keys. Let us implement a special listcounters action for this.

# list performance counters
if (Opts::get_option('action') eq 'listcounters') {
    print "key [        unit ] (    rollup )  counter description\n";
    print "======================================================\n";
    foreach (sort {$a <=> $b} keys %pcs) {
        printf(
            "%-3d [%12s ] (%10s )  %s\n",
            $_, $pcs{$_}{unit}, $pcs{$_}{roll}, $pcs{$_}{desc}
        );
    }
    exit 0;
}

This simple loop prints all counters and their keys and units in a rather neat way.
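
Numeric keys are hard to remember, so a reverse lookup by name can be handy. The following is only a sketch under stated assumptions: it works on a plain hash standing in for the counter list, and the group, name and roll fields as well as the build_name_index helper are hypothetical names of mine. With the SDK, the corresponding values would come from groupInfo->key, nameInfo->key and rollupType->val of each counter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the counter list. The entries are illustrative and were not
# read from a real vCenter; fill this from $perfmgr_view->perfCounter instead.
my %counters = (
    2  => { group => 'cpu', name => 'usage',  roll => 'average' },
    33 => { group => 'mem', name => 'active', roll => 'average' },
);

# Build a reverse index: "group.name.rollup" => numeric counter key.
sub build_name_index {
    my ($counters) = @_;
    my %index;
    foreach my $key (keys %$counters) {
        my $c = $counters->{$key};
        $index{ join('.', @{$c}{qw(group name roll)}) } = $key;
    }
    return \%index;
}

my $by_name = build_name_index(\%counters);
print "cpu.usage.average has key $by_name->{'cpu.usage.average'}\n";
```

With such an index, a counters option could accept names as well as keys, but that is left as an exercise.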

Get performance data

Once we've identified the counters, it's time to query vSphere for the actual data. We're going to parse the counters command line option to identify our counters of interest.

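By default, the value 2,33 is used to obtain the average CPU usage and active memory usage respectively. Counter keys can differ between environments, so it is worth checking yours with the listcounters action first.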

# create metric IDs for selected counters
my @metrics = ();
foreach (split(/,\s*/, Opts::get_option('counters'))) {
    push (@metrics, PerfMetricId->new(counterId => $_, instance => ''));
}

Note the empty instance attribute, which you may know from the vSphere Client. It allows you to obtain more granular metrics, e.g. the utilization of a specific CPU core. For this example, let us simply pass an empty string, which returns aggregated statistics for all available instances.
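
The parsing step above accepts whatever the user types. As a rough sketch of how the counters option could be checked before building the metric IDs, the hypothetical parse_counter_keys helper below splits the option value and rejects anything that is not a known numeric key. The %known hash is a stand-in for %pcs, so the example runs without the SDK.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: split a comma-separated option value into counter keys,
# separating known numeric keys from everything else.
sub parse_counter_keys {
    my ($option, $known) = @_;
    my (@keys, @rejected);
    foreach my $item (split /,/, $option) {
        $item =~ s/^\s+|\s+$//g;      # trim surrounding whitespace
        next if $item eq '';
        if ($item =~ /^\d+$/ && exists $known->{$item}) {
            push @keys, $item;
        } else {
            push @rejected, $item;
        }
    }
    return (\@keys, \@rejected);
}

my %known = (2 => 1, 33 => 1);        # stand-in for %pcs
my ($keys, $rejected) = parse_counter_keys('2, 33,foo,999', \%known);
print "valid: @$keys\n";              # valid: 2 33
print "rejected: @$rejected\n";       # rejected: foo 999
```

Rejected items could be reported with a warning, or the script could simply die before contacting vSphere at all.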

The vmname command line option lets you specify the name of the virtual machine to obtain performance data for. A simple regex filter is used for this. All VMs will be queried by default, but for larger environments it is advisable to limit the scope to something more restrictive.

# find target virtual machine(s); escape slashes in the name regex
my $vmregex = Opts::get_option('vmname');
$vmregex =~ s{/}{\\/}g;

my $vmviews = Vim::find_entity_views(
    view_type => 'VirtualMachine',
    properties => ['name'],
    # group the pattern so that alternations such as 'a|b' stay anchored
    filter => { name => qr/^(?:$vmregex)$/ }
);
die "No VMs found\n" unless (scalar(@$vmviews));

And now for the final part: actually retrieving the results. The script issues one query per VM, hence the more specific your vmname parameter is, the faster the script will be.

# query performance for each vm matching selection
my %results = ();
foreach my $vm (@$vmviews) {
    # specify performance query and try to obtain results
    eval {
        my $perf_query_spec = PerfQuerySpec->new(
            entity => $vm,
            metricId => \@metrics,
            format => 'csv',
            intervalId => Opts::get_option('interval'),
        );
        my $perf_data = $perfmgr_view->QueryPerf(querySpec => $perf_query_spec);
        $results{$vm->name} = {result => 'success', perfdata => $perf_data};
    };
    # deal with errors
    if ($@) {
        $results{$vm->name} = {result => 'error', fault => $@};
    }
}

print Dumper \%results;

I'm trying to keep things easy here by simply dumping the result. Obviously you can do any type of wizardry with it. One could, for example, add units and descriptions using the %pcs hash stored earlier.
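
As one possible take on adding units and descriptions, the sketch below labels a single metric series using a %pcs-style hash. It is an illustration only: label_series is a hypothetical helper, the sample data are made up, and the code reads plain hash fields rather than calling SDK accessor methods so that it runs without a vCenter connection.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for %pcs and one result series; the strings are illustrative.
my %pcs = (
    2 => { desc => 'CPU usage', roll => 'average', unit => '%' },
);
my $series = { id => { counterId => '2', instance => '' }, value => '60,50' };

# Produce a one-line label: "description (rollup) [unit]: values".
sub label_series {
    my ($series, $pcs) = @_;
    my $info = $pcs->{ $series->{id}{counterId} }
        or return "unknown counter $series->{id}{counterId}: $series->{value}";
    return sprintf('%s (%s) [%s]: %s',
        $info->{desc}, $info->{roll}, $info->{unit}, $series->{value});
}

print label_series($series, \%pcs), "\n";   # CPU usage (average) [%]: 60,50
```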

Results

Let's take a look at the result. It is a Perl hash with VM names as keys and the resulting performance data as values. Since we have selected the CSV format, each metric series contains a comma-separated list of performance values, and there is also the sampleInfoCSV string containing the sample interval and timestamp for each value in the dataset.

[pavel@devel]$ ./cloud-stats.pl --sessionfile /tmp/session.test --vmname 'debian.*'
$VAR1 = {
    'debian test vm' => {
        'result' => 'success',
        'perfdata' => [
            bless( {
                'entity' => bless( {
                    'value' => 'vm-552',
                    'type' => 'VirtualMachine'
                 }, 'ManagedObjectReference' ),
                'value' => [
                    bless( {
                        'value' => '60,50',
                        'id' => bless( {
                            'counterId' => '2',
                            'instance' => ''
                         }, 'PerfMetricId' )
                    }, 'PerfMetricSeriesCSV' ),
                    bless( {
                        'value' => '20968,10484',
                        'id' => bless( {
                            'counterId' => '33',
                            'instance' => ''
                         }, 'PerfMetricId' )
                    }, 'PerfMetricSeriesCSV' )
                ],
                'sampleInfoCSV' => '20,2018-04-11T11:45:20Z,20,2018-04-11T11:45:40Z'
            }, 'PerfEntityMetricCSV' )
        ]
    }
};
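
To turn the output above into something chartable, the values have to be matched with their timestamps. The sketch below assumes the layout seen in the dump, where sampleInfoCSV alternates interval and timestamp and each value string holds one value per sample. pair_samples is a hypothetical helper name, and the input strings are copied from the output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Turn sampleInfoCSV plus one value string into [timestamp, value] pairs.
sub pair_samples {
    my ($sample_info_csv, $value_csv) = @_;
    my @info   = split /,/, $sample_info_csv;   # interval, timestamp, interval, timestamp, ...
    my @values = split /,/, $value_csv;
    my @rows;
    for my $i (0 .. $#values) {
        my $timestamp = $info[2 * $i + 1];
        last unless defined $timestamp;          # stop if the lists do not line up
        push @rows, [$timestamp, $values[$i]];
    }
    return \@rows;
}

my $rows = pair_samples(
    '20,2018-04-11T11:45:20Z,20,2018-04-11T11:45:40Z',
    '60,50',
);
printf "%s  %s\n", @$_ for @$rows;
```

Each row can then be fed to whatever charting or storage backend you prefer.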

Sample interval

There are two basic types of performance stats in vSphere: real-time (20-second samples kept for an hour) and historical (longer sample intervals, such as the 300, 1800, 7200 and 86400 seconds accepted by the interval option). Again, you may remember them from the vSphere Client. You may also remember that some performance counters are by default only available in real time.

The key difference is that real-time statistics are provided by the ESXi host the VM is currently running on, whereas historical stats are aggregated by vCenter. It is therefore impossible to collect historical statistics from a standalone ESXi host. You'll get a SOAP error if you attempt to do so.

In my script, you can distinguish between the two using the interval command line option. The default is 20 seconds, i.e. real-time statistics. If you attempt to collect historical statistics for counters that are only available in real time, you'll get an empty result.
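
A small guard can tell the two kinds of interval apart before any query is sent. This is a sketch only: the list of intervals is taken from the help text of the interval option, your vCenter may be configured differently, and classify_interval is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Intervals as listed in the help text of the interval option.
my %interval_kind = (
    20    => 'real-time',
    300   => 'historical',
    1800  => 'historical',
    7200  => 'historical',
    86400 => 'historical',
);

# Return 'real-time', 'historical', or undef for an unlisted interval.
sub classify_interval {
    my ($seconds) = @_;
    return $interval_kind{$seconds};
}

for my $i (20, 300, 60) {
    my $kind = classify_interval($i) // 'unsupported';
    print "$i: $kind\n";
}
```

A check like this would also let the script warn early when a historical interval is requested against a standalone ESXi host.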

Performance

This script performs quite well even in large environments.

When testing in my lab with about 20 VMs, I didn't notice any difference between real-time and historical statistics, and the script always finished in less than a second.

Obtaining real-time statistics for several thousand VMs in my production environment, however, is noticeably faster than fetching historical ones and puts less pressure on vCenter.

If you've made it this far, thanks for taking the time to read this guide and, as always, have fun with the result.
