Virtual SAN + vRealize Operations = Effective Storage Management (Part 2)

...In the last post we saw how to configure MPSD (Management Pack for Storage Devices) with vRealize Operations. Let's see what we get out of it once it is configured.

With the addition of MPSD to vROPS, we get some very useful and simple Dashboards added to the Home View.
For this post, I will be showing some Dashboard views of a Hybrid Virtual SAN setup in my lab.

Dashboard 1: Virtual SAN Troubleshooting

Below are the views from the Virtual SAN Troubleshooting Dashboard. Let's have a closer look at them.

This one shows a Hierarchical View of the entire Datacenter, mapping Physical Hosts -> VMs -> VSAN Datastore -> Disk Groups and their respective subcomponents (SSD/Mechanical Drives) -> the Physical NIC used for the VSAN network.

Clicking any Virtual Machine highlights the VSAN Datastore it's hosted on and its associated membership to other objects in the hierarchy.

If I highlight one of the objects showing a Red Fault (EsxPnic in this case), a corresponding trend line shows up on the right side, showcasing the pattern and the reason for the fault.

Likewise, we can switch to the "Workload" view and highlight a Red colored object reflecting an anomaly.
In this specific case it's a Virtual Machine experiencing an increase in Write Latency.

Dashboard 2: Virtual SAN Heatmap

This dashboard provides a heatmap view of each object's performance, measured both against the predefined limits and against its own trended behavior in the past.
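To make the "predefined limit vs. own trend" distinction concrete, here is a minimal Python sketch. It is not how vRealize Operations computes its heatmaps; the function name, the latency numbers and the "k standard deviations" rule are all assumptions chosen purely for illustration.

```python
from statistics import mean, pstdev


def classify(sample, history, static_limit, k=2.0):
    """Judge one metric sample against a fixed limit and against its own history.

    Hypothetical helper: a sample is "outside_trend" when it sits more than
    k standard deviations away from the average of its past values.
    """
    avg = mean(history)
    spread = pstdev(history)
    return {
        "over_static_limit": sample > static_limit,
        "outside_trend": abs(sample - avg) > k * spread,
    }


# Made-up write latency samples in milliseconds for a single VM.
history = [2.0, 2.2, 1.9, 2.1, 2.0, 2.3]

# 6 ms is far below an assumed 20 ms limit, yet unusual for this VM.
result = classify(sample=6.0, history=history, static_limit=20.0)
print(result)
```

In this sketch the sample is well within the fixed limit but still stands out against the VM's own past, which is the kind of anomaly a trend-based view can surface before a hard limit is crossed.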

Dashboard 3: Virtual SAN Entity Usage

This dashboard provides extremely useful data points for all Physical subcomponents of the Virtual SAN subsystem, i.e., Host Adapter, SSD Cache Drive, and Mechanical/SSD Capacity Drives.

At a glance you can make out which specific host and which of its components are worked harder than others in the Virtual SAN cluster.
It also gives us host level performance metrics for SSD/Mechanical disks, for both Read and Write IO patterns. Most of the information that contributes to the performance of the Virtual SAN can be found on this page.

Dashboard 4: Virtual SAN Device Insights

This is a very important Dashboard as it provides insights into disk level counters like reported errors, media endurance indicators etc., which determine the health of the disks.

Besides this, it provides details like Capacity utilization values across hosts. If we notice that the data is not distributed uniformly across hosts, we can initiate a rebalance operation to spread the data evenly.
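The "is the data spread evenly?" check can be sketched in a few lines of Python. This is only an illustration: the host names, the utilization numbers and the 30 percentage point gap used as a trigger are assumptions, not values read from the dashboard or from Virtual SAN itself.

```python
def capacity_spread(used_pct_by_host):
    """Return (busiest host, emptiest host, gap in percentage points)."""
    busiest = max(used_pct_by_host, key=used_pct_by_host.get)
    emptiest = min(used_pct_by_host, key=used_pct_by_host.get)
    gap = used_pct_by_host[busiest] - used_pct_by_host[emptiest]
    return busiest, emptiest, gap


# Assumed threshold for this sketch; pick whatever suits your environment.
REBALANCE_GAP_PCT = 30

# Made-up capacity utilization (percent used) per host.
usage = {"esx01": 72, "esx02": 35, "esx03": 41}

busiest, emptiest, gap = capacity_spread(usage)
needs_rebalance = gap > REBALANCE_GAP_PCT
print(f"{busiest} vs {emptiest}: gap of {gap} points, rebalance: {needs_rebalance}")
```

Looking at the gap between the fullest and emptiest host is the simplest possible measure; a standard deviation across hosts would work just as well for larger clusters.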

Other counters like CPU/memory Utilization, SSD Cache hit ratios etc. are useful for gauging the overall host performance.

Dashboard 5: Virtual SAN Cluster Insights

This dashboard is suitable for gauging disk group level details holistically. If you want to know which disk group within a host/cluster holds the most data, which is most utilized in serving IOs, or which has the most errors, you can check this dashboard.

It's a good starting point for Virtual SAN troubleshooting at a disk group level.
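The disk group questions above (most data, most IOs, most errors) boil down to ranking the same set of objects by different metrics. A rough Python sketch, using entirely made-up disk group records and hypothetical field names, could look like this:

```python
# Made-up disk group records; field names are hypothetical, not MPSD metric keys.
disk_groups = [
    {"host": "esx01", "group": "dg-1", "used_gb": 410, "iops": 1800, "errors": 0},
    {"host": "esx02", "group": "dg-1", "used_gb": 655, "iops": 950, "errors": 3},
    {"host": "esx03", "group": "dg-1", "used_gb": 380, "iops": 2400, "errors": 1},
]


def top_by(groups, metric, n=1):
    """Return the n disk groups with the highest value for the given metric."""
    return sorted(groups, key=lambda g: g[metric], reverse=True)[:n]


for metric in ("used_gb", "iops", "errors"):
    leader = top_by(disk_groups, metric)[0]
    print(f"highest {metric}: {leader['host']}/{leader['group']} ({leader[metric]})")
```

Ranking by one metric at a time mirrors how you would read the dashboard: a disk group that leads on errors but not on IOs is a different kind of problem than one that leads on both.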

I believe that this configuration in vRealize Operations is a must for any VMware administrator using Virtual SAN. With the help of this specially crafted Management Pack, one can easily isolate problems caused at the Storage subsystem and connected SAN devices.

It's no longer a secret that Software Defined Storage is the way forward to craft a true Software Defined Datacenter. With the power of SPBM with VSAN and vVOLs, we can practically take away all the headaches of planning and managing storage blocks.

But once we have created an SDS layer and hit the Day 2 operations stage, there are many things we may want to know about the underlying SDS layer.

The vSphere Web Client has some really cool and simplified widgets showcasing data/metrics around capacity utilization and performance, but it is kind of limited to that.

In some of my discussions, customers have raised queries about granular monitoring of Virtual SAN subcomponents, such as disk group level utilization, disk (SSD/Mechanical) level utilization, granular Disk metrics etc.

In this post I would like to spend some time on how to achieve granular monitoring of Virtual SAN using vRealize Operations Manager.

The VMware Virtual SAN team and the vRealize Operations team jointly developed the vRealize Operations Management Pack for Storage Devices (MPSD), available on the Solution Exchange.

The Management Pack provides us with:
  • Pre-defined dashboards that allow you to track the data path from VM to VSAN, and that notify you of any problem in the path.
  • Heat-maps, widgets and metrics for all the sub-components comprising your VSAN environment.

Basically it helps unveil what's happening "Under the Hood".

So without further ado, let me share the steps to integrate MPSD with vRealize Operations Manager.
You would be surprised to see how radically simple it is.

It's a good idea to check interoperability before we deploy it.

Once you've downloaded the Management Pack (.pak file), launch the vRealize Operations Manager page. In the left pane of vRealize Operations Manager, click the Administration icon and then click Solutions.

In the right pane, click the (+) sign and launch the Wizard to upload the Management Pack.

Browse to the location of the Management Pack for Storage Devices and proceed with the Upload.
After the upload, accept the EULA and proceed with the Install.

After the Installation, you should see the new Adapter added under the Solutions section.

Highlight the Management Pack for Storage Devices and click the Configure button as shown below.

Provide vCenter Credentials so the adapter can connect to the vCenter server and capture the relevant Data points.

Once it is configured, wait for the adapter status to change to "Collecting Data".

As an end result you should see additional Dashboards added to the Home View. 

Given enough time to populate, these Dashboards will start reflecting some very handy details about our VM to VSAN configuration. They also provide FC/iSCSI and NFS mounted Datastore details, with metrics from SAN switches and HBA cards.

