NetApp performance monitoring disk level

October 5, 2016

Hi all, in this module (NetApp performance monitoring disk level),

I will try to explain a very interesting and often confusing topic in storage performance: monitoring storage performance at the disk level.

Many of us working in a storage role, especially on NetApp, monitor storage performance using the general-purpose utilities NetApp provides, like sysstat, which give a high-level picture of overall storage performance.

Whenever we face a storage performance issue, it is very tough to pinpoint the cause, because multiple technologies and layers are involved (in a SAN, for example: the network, the host OS, host MPIO, the storage controller CPU, and the disks themselves).

In this blog I will explain disk performance monitoring on NetApp, which is the final layer of such a setup.

To monitor disk performance, NetApp provides the command-line utility statit, which we can run in advanced mode to capture live performance data over a period of time.

How to run statit

To run statit and capture performance data on a NetApp FAS running Data ONTAP 7-mode, perform the following steps (a sample session follows the list).

1. Go into advanced mode.
priv set advanced
2. Run statit -b (which starts the data capture).
3. After the desired period of time, run statit -e (which stops the capture and prints the data to the console).
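
On the 7-mode console the whole capture looks roughly like this (the filer> prompt is illustrative for your system's hostname; the asterisk in the prompt indicates advanced privilege):

    filer> priv set advanced
    filer*> statit -b
    ... let the workload you want to measure run for a few minutes ...
    filer*> statit -e
    ... per-second statistics for CPU, WAFL, RAID, network and disks print here ...
    filer*> priv set admin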

“statit -e” will print performance data for WAFL, the CPU, the network, RAID, and the disks; in this blog I will focus on understanding the performance of the disks.

To help you understand the output below, I will first explain some basic definitions related to disk performance.

1. IOPS – IOPS is the most common term we use to describe disk performance. The number of IOPS a particular disk can deliver depends on the disk type (SATA, SAS, or SSD) and its rotational speed (RPM).

As per my understanding, one IO can be defined as an operation on a single 4K block (since WAFL uses 4K blocks) or on a contiguous chain of 4K blocks; for example, an IO with a chain of 8 moves 8 * 4K = 32K of data.

2. Disk latency – the time between a request for data and its response. For rotational drives (SATA or SAS, not SSD), it depends on the latency components below; a back-of-the-envelope sketch follows the list.

a. Rotational latency.
b. Seek time.
c. Transfer time.
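
To give a feel for how these components add up, here is a rough sketch for a 7.2K RPM drive like the ones in this aggregate; the seek and transfer figures are assumptions I chose for illustration, not measured values:

    # Back-of-the-envelope per-IO latency for a rotational disk (illustrative only).
    rpm = 7200                                # 7.2K RPM SATA
    rotational_ms = (60.0 / rpm) / 2 * 1000   # average rotational latency = half a revolution, ~4.2 ms
    seek_ms = 8.5                             # assumed average seek time for a SATA drive
    transfer_ms = 0.1                         # assumed transfer time for a small IO
    print(f"~{rotational_ms + seek_ms + transfer_ms:.1f} ms per random IO")

Inverting that (~12.8 ms per IO gives roughly 78 random IOPS) lines up reasonably well with the ~73 xfers per disk we will see in the sample below.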

I captured the disk performance data in the sample below while some backup jobs were running on datastores in aggr2.


Disk Statistics (per second)
ut% is the percent of time the disk was busy.
xfers is the number of data-transfer commands issued per second.
xfers = ureads + writes + cpreads + greads + gwrites
chain is the average number of 4K blocks per command.
usecs is the average disk round-trip time per 4K block.

[statit per-disk performance data for aggr2 (screenshot in the original post)]

In the output above you can see aggr2, which consists of 3 RAID groups (RAID group size 16) with 45 disks in total, each a 4TB 7.2K RPM SATA drive.

Using this output we can calculate the read and write performance and the IOPS generated by each disk; to do that we need to understand each column of the output.

1. ut% – the percentage of time the disk was busy.
2. xfers – the number of physical IOs (data-transfer commands) issued per second.
3. chain – the average number of 4K blocks per IO.
4. usecs – the average time taken to serve one 4K block, in microseconds.
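
Before running the numbers, here is a minimal sketch (my own helper, not part of statit) of how these columns combine into a per-disk data rate; it simply multiplies commands per second by blocks per command by the 4 KB block size:

    # Minimal sketch: convert statit's xfers and chain columns into a data rate.
    def disk_throughput_kb(xfers, chain, block_kb=4.0):
        """Per-disk data rate in KB/sec: IOs/sec * 4K blocks per IO * KB per block."""
        return xfers * chain * block_kb

The same formula works for both directions: pass ureads with the read chain, or writes with the write chain, as in the two calculations below.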

From this output we can conclude the following:

1. From xfers – the average number of physical IOs served by each disk in this sample is 73.
2. From ureads – the average number of read IOs is 58 per disk.
3. From the read chain – on average, one read IO covers 1.2 4K blocks.

So for this sample the calculated read performance is as follows.

Data served by one read IO = 4K * 1.2 = 4.8K
Total read IOPS per disk = 58
Read throughput of a single disk = 4.8 KB per IO * 58 reads/sec = 278.4 KB/sec
Total read throughput of aggr2 = (no. of disks * read throughput of each disk) – some RAID or other penalties (like storage efficiency) = 43 * 278.4 = 11971.2 KB/sec – some value = 11 MB/sec approx.
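
As a quick sanity check, the read-side arithmetic can be reproduced in a couple of lines, mirroring the helper above (the 58 IOPS, 1.2 chain, and 43 disks are this sample's numbers; the RAID/efficiency penalty is left out):

    # Read-side check: 58 read IOs/sec per disk, chain of 1.2, 43 disks in aggr2.
    per_disk_kb = 58 * 1.2 * 4      # 278.4 KB/sec per disk
    aggr_kb = per_disk_kb * 43      # 11971.2 KB/sec before any penalties
    print(f"{per_disk_kb:.1f} KB/s per disk, ~{aggr_kb / 1024:.1f} MB/s for aggr2")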

In the same way, we can calculate the write performance of the filer during the period when the output was captured.

1. From xfers – the average number of physical IOs served by each disk in this sample is 73.
2. From writes – the average number of write IOs is 16 per disk.
3. From the write chain – on average, one write IO covers 58 4K blocks.

So for this sample the calculated write performance is as follows.

Data served by one write IO = 4K * 58 = 232K
Total write IOPS per disk = 16
Write throughput of a single disk = 232 KB per IO * 16 writes/sec = 3712 KB/sec
Total write throughput of aggr2 = (no. of disks * write throughput of each disk) – some RAID or other penalties (like storage efficiency) = 43 * 3712 = 159616 KB/sec – some value = 160 MB/sec approx.
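
The write-side arithmetic checks out the same way (16 IOPS, chain of 58, 43 disks, penalties again left out), landing near the ~160 MB/sec figure above:

    # Write-side check: 16 write IOs/sec per disk, chain of 58, 43 disks in aggr2.
    per_disk_kb = 16 * 58 * 4       # 3712 KB/sec per disk
    aggr_kb = per_disk_kb * 43      # 159616 KB/sec before any penalties
    print(f"{per_disk_kb:.0f} KB/s per disk, ~{aggr_kb / 1024:.0f} MB/s for aggr2")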

As stated above, this data was captured while backup operations were running on datastores contained in aggregate aggr2; the long write chains are consistent with the large sequential writes such backups generate.

The above calculation is not exact; it just gives us an idea of the read and write disk performance at a given point in time, which we can use to determine whether a performance bottleneck exists.

I hope this post was informative. I will continue to write about NetApp storage performance; if you have any analysis of or suggestions on the above post, I look forward to them.
