
Add 'vardir' disk utilization collecting #296


Closed
wants to merge 1 commit from the avtikhon/gh-qa-97-disk-bound-collect branch

Conversation

avtikhon
Contributor

@avtikhon avtikhon commented Apr 30, 2021

Added a routine that collects disk utilization for the 'vardir' path. It
parses the file:

      /proc/diskstats

takes the number of milliseconds spent doing I/Os for the given device
and computes a 'busy' value in percent, like the iostat tool does for
its '%util' field. See the Linux kernel documentation [1] for more
information. We use Field 10 of the 'diskstats' file, which actually
sits at index 12 once a line is split into whitespace-separated columns
(the first three columns are the major number, minor number and device
name):

  Field  9 -- # of I/Os currently in progress
    The only field that should go to zero. Incremented as requests are
    given to appropriate struct request_queue and decremented as they
    finish.

  Field 10 -- # of milliseconds spent doing I/Os
    This field increases so long as field 9 is nonzero.

The 'vardir' disk utilization is collected for each test run into a
standalone file, and after testing completes the tests with the highest
utilization (up to 10 of them) are printed to stdout.

To compute the 'vardir' disk utilization for each test we need to know
the disk bound value and the time at the moment the test starts. On test
finish we take the finish time and the current disk bound value and
compute the utilization from these deltas. All of these reads and
computations are implemented as the new routine
'get_disk_bound_stat_busy()' in the 'lib/utils.py' file.
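
For illustration, a minimal sketch of such a computation (not the exact
code from the patch; the helper name 'read_disk_busy_ms', the 'sda'
device name and the column handling are assumptions based on the
/proc/diskstats format described above):

import time


def read_disk_busy_ms(dev):
    # Field 10 of /proc/diskstats (milliseconds spent doing I/Os) sits
    # at index 12 after splitting a line: major number, minor number,
    # device name, then fields 1..10 and beyond.
    with open('/proc/diskstats', 'r') as f:
        for line in f:
            columns = line.split()
            if len(columns) > 12 and columns[2] == dev:
                return int(columns[12])
    return 0


# Sample the counter around a test run and derive a '%util'-like value.
busy_start = read_disk_busy_ms('sda')
wall_start = time.time()
# ... run the test ...
busy_delta = read_disk_busy_ms('sda') - busy_start
wall_delta_ms = (time.time() - wall_start) * 1000
busy_percent = 100.0 * busy_delta / wall_delta_ms if wall_delta_ms else 0.0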

To collect disk bounds and durations on test start/finish events, the
'StatisticsWatcher' listener is used; the following of its routines are
involved:

  process_result()
    Uses the appearance of a 'WorkerCurrentTask' queue message to save
    the disk bound values at the start of a test.

    Uses the appearance of a 'WorkerTaskResult' queue message to save
    the final disk bound values for a finished test.

  print_statistics() - prints statistics to stdout after testing.
    Prints disk bounds for failed tests and for up to 10 tests with the
    highest disk usage. A standalone 'statistics' directory under the
    'vardir' path stores the 'durations.log' file, which records the
    disk bound for each tested task in the following format (see the
    sketch after this list):

      <test task name> <disk bound>

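A rough sketch of how such a listener could hook the two message types
(a simplified illustration, not the actual test-run API: it reuses the
hypothetical 'read_disk_busy_ms()' helper from the sketch above, matches
the queue message classes by name, and assumes they expose a 'task_id'
attribute):

class DiskBoundWatcher(object):
    # Sketch of a StatisticsWatcher-like listener; WorkerCurrentTask and
    # WorkerTaskResult are the test-run queue message classes, matched
    # here by class name to keep the sketch self-contained.

    def __init__(self, dev):
        self.dev = dev
        self.started = {}   # task id -> disk bound value at test start
        self.results = {}   # task id -> disk bound delta over the test

    def process_result(self, obj):
        kind = obj.__class__.__name__
        if kind == 'WorkerCurrentTask':
            self.started[obj.task_id] = read_disk_busy_ms(self.dev)
        elif kind == 'WorkerTaskResult':
            begin = self.started.pop(obj.task_id, None)
            if begin is not None:
                self.results[obj.task_id] = \
                    read_disk_busy_ms(self.dev) - begin

    def print_statistics(self):
        # Report up to 10 tasks with the highest disk bound values, in
        # the '<test task name> <disk bound>' format.
        top = sorted(self.results.items(), key=lambda item: item[1],
                     reverse=True)[:10]
        for task_id, bound in top:
            print('{} {}'.format(task_id, bound))
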
Needed for tarantool/tarantool-qa#97

[1]: https://www.kernel.org/doc/Documentation/iostats.txt

@avtikhon avtikhon requested review from Totktonada and kyukhin April 30, 2021 12:47
@avtikhon avtikhon force-pushed the avtikhon/gh-qa-97-disk-bound-collect branch 3 times, most recently from d3ff560 to 979fa99 on April 30, 2021 14:22
@avtikhon avtikhon self-assigned this Apr 30, 2021
@avtikhon avtikhon added the teamQ label Apr 30, 2021
@avtikhon avtikhon force-pushed the avtikhon/gh-qa-97-disk-bound-collect branch from 979fa99 to 59bed08 on May 24, 2021 06:12
@avtikhon avtikhon changed the title from "Add vardir disk bound status collecting routine" to "Add 'vardir' disk utilization collecting" on May 24, 2021
@avtikhon avtikhon force-pushed the avtikhon/gh-qa-97-disk-bound-collect branch 7 times, most recently from f038227 to 24ec878 on May 31, 2021 10:34
@avtikhon avtikhon linked an issue May 31, 2021 that may be closed by this pull request
@avtikhon avtikhon force-pushed the avtikhon/gh-qa-97-disk-bound-collect branch 3 times, most recently from d72bbb5 to 97e2a2d on June 2, 2021 10:39
Part of tarantool/tarantool-qa#97
Closes #308
@avtikhon avtikhon force-pushed the avtikhon/gh-qa-97-disk-bound-collect branch from 97e2a2d to 4edad84 on June 2, 2021 13:43
@Totktonada
Member

We discussed it with @avtikhon. I hope Alexander will summarize our near and far goals in tarantool/tarantool-qa#97.

I'll just note here that, as far as I understand, we should measure disk saturation (whether the disk is able to serve more requests) rather than utilization (the fraction of time spent on I/O). It should be a better indicator of whether the disk I/O is close to aggressive queueing.

Relevant materials:

http://www.brendangregg.com/usemethod.html
http://www.brendangregg.com/USEmethod/use-linux.html
https://www.percona.com/blog/2017/08/28/looking-disk-utilization-and-saturation/
https://www.kernel.org/doc/html/latest/admin-guide/iostats.html
https://www.kernel.org/doc/html/latest/filesystems/proc.html#proc-pid-io-display-the-io-accounting-fields (does not count mmaps?)

Aside from this, I have doubts about splitting system-wide statistics per test. I guess it will be too vague when many tests are running in parallel. I propose to start with some binary indicator: write something to the terminal when the system reaches disk saturation.

@kyukhin asked a good question: will a 1 GiB write() be counted as one request or as many block-sized ones? I don't know, let's look deeper.

@Totktonada
Member

Quickly written notes on the iostat code (just to keep them somewhere):

  • aqu-sz (old name: avgqu-sz) is the difference in field 11, divided by the wall time. See the code. (It seems the / 1000 and * 100 in S_VALUE are about time unit conversion and are not relevant for us.)
  • The await code is here. It is the difference in fields 4 + 8 + 15 (time spent on read + write + discard requests; flush requests are skipped, see the comment in the iostat code), divided by the difference in fields 1 + 5 + 12 (# of read + write + discard requests). It looks like an average time per request.

I don't know which metric is better to choose, queue size or await time, nor how to pick a threshold. We need to experiment with those metrics and work out an indicator that works well for us. A sketch of both computations follows.
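
A minimal sketch of both computations from two /proc/diskstats snapshots
(the field-to-column mapping follows the kernel documentation; the 'sda'
device, the one-second interval and the threshold are illustrative
assumptions, and a kernel new enough to report the discard fields 12..15
is assumed):

import time

# Field numbers follow the kernel iostats documentation; the column
# index is field + 2 because each line starts with the major number,
# minor number and device name.
FIELDS = {'reads': 1, 'read_ms': 4, 'writes': 5, 'write_ms': 8,
          'weighted_ms': 11, 'discards': 12, 'discard_ms': 15}


def snapshot(dev):
    # Return the raw counters of the given device as a dict, or None if
    # the device is not listed (or the kernel lacks the discard fields).
    with open('/proc/diskstats', 'r') as f:
        for line in f:
            cols = line.split()
            if len(cols) > 17 and cols[2] == dev:
                return {name: int(cols[num + 2])
                        for name, num in FIELDS.items()}
    return None


before = snapshot('sda')
wall_start = time.time()
time.sleep(1.0)                      # sampling interval (illustrative)
after = snapshot('sda')
wall_ms = (time.time() - wall_start) * 1000

delta = {key: after[key] - before[key] for key in before}

# aqu-sz: difference in field 11 (weighted time spent doing I/Os),
# divided by the wall time.
aqu_sz = delta['weighted_ms'] / wall_ms

# await: time spent on read + write + discard requests, divided by the
# number of completed read + write + discard requests.
ios = delta['reads'] + delta['writes'] + delta['discards']
io_ms = delta['read_ms'] + delta['write_ms'] + delta['discard_ms']
await_ms = io_ms / float(ios) if ios else 0.0

# A possible binary saturation indicator (the threshold is arbitrary).
if aqu_sz > 1.0:
    print('disk queue is building up: aqu-sz = {:.2f}'.format(aqu_sz))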

Comment on lines +263 to +276
def get_vardir_device(vardir):
    # Find the mount point that contains 'vardir', then look up its
    # backing device in /proc/mounts.
    dev = 'unknown'
    path = os.path.abspath(vardir)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    try:
        with open('/proc/mounts', 'r') as f:
            for line in f:
                if line.split()[1] == path:
                    mount = line.split()[0]
                    # Strip the '/dev/' prefix to get the bare device name.
                    dev = mount.split('/')[2]
    except (OSError, IOError, IndexError):
        pass
    return dev

Member

I didn't check, but will it work as expected for me?

$ cat /proc/mounts
/dev/root / ext4 rw,noatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,relatime,size=10240k,nr_inodes=2033827,mode=755 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /run tmpfs rw,nodev,relatime,size=1627436k,mode=755 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
selinuxfs /sys/fs/selinux selinuxfs rw,relatime 0 0
efivarfs /sys/firmware/efi/efivars efivarfs rw,nosuid,nodev,noexec,relatime 0 0
cgroup_root /sys/fs/cgroup tmpfs rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755 0 0
openrc /sys/fs/cgroup/openrc cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc 0 0
none /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 0 0
cpuset /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cpu /sys/fs/cgroup/cpu cgroup rw,nosuid,nodev,noexec,relatime,cpu 0 0
cpuacct /sys/fs/cgroup/cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct 0 0
blkio /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
memory /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
devices /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
freezer /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
net_cls /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 0 0
perf_event /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
net_prio /sys/fs/cgroup/net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_prio 0 0
hugetlb /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
pids /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,noexec,noatime,size=1048576k 0 0
tmpfs /var/tmp/portage tmpfs rw,nodev,noatime,size=4194304k 0 0
binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc rw,nosuid,nodev,noexec,relatime 0 0
/dev/root /var/lib/docker ext4 rw,noatime 0 0
nsfs /run/docker/netns/default nsfs rw 0 0

$ cat /proc/diskstats 
   7       0 loop0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       1 loop1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       2 loop2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       3 loop3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       4 loop4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       5 loop5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       6 loop6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   7       7 loop7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
   8       0 sda 3914740 2091334 209636333 3300777 20400957 25673178 8530616480 245749826 0 12550525 241975264 1065702 0 590682616 2552745
   8       1 sda1 90 0 9728 53 0 0 0 0 0 44 25 0 0 0 0
   8       2 sda2 28 0 2080 22 0 0 0 0 0 15 11 0 0 0 0
   8       3 sda3 3914590 2091334 209623349 3300680 20339615 25673178 8530616480 245695226 0 12541502 241943738 1065702 0 590682616 2552745
   9       0 md0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
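
Tracing the lookup by hand against this mount table (an illustration of
the concern, not part of the patch): for a vardir whose nearest mount
point is '/', the matching /proc/mounts line would be
'/dev/root / ext4 ...', so:

line = '/dev/root / ext4 rw,noatime 0 0'
mount = line.split()[0]      # '/dev/root'
dev = mount.split('/')[2]    # 'root'

and 'root' does not appear as a device name in the /proc/diskstats
output, where the physical disk is 'sda'.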

@avtikhon
Contributor Author

We discussed it with @avtikhon. I hope Alexander will summarize our near and far goals in tarantool/tarantool-qa#97.

Right, we discussed it and found that the current statistic is really good for HDD-based hosts. For SSDs, however, it doesn't show anything meaningful except that some of the SSD's parallel read/write channels were in use, but not all of them: an SSD may have 16 or 64 such channels, and using only a few of them neither saturates nor overloads the drive.

@avtikhon avtikhon closed this Jun 18, 2021
Development

Successfully merging this pull request may close these issues.

Collect statistics for 'vardisk' disk usage