author | Jens Axboe <axboe@kernel.dk> | 2020-06-26 00:39:59 +0300 |
---|---|---|
committer | Jens Axboe <axboe@kernel.dk> | 2020-06-26 19:34:23 +0300 |
commit | c40f63790ec957e9449056fb78d8c2523eff96b5 (patch) | |
tree | 961f55ef1b97412692448335321b903b411e7e96 /fs/io-wq.h | |
parent | a1d7c393c4711a9ce6c239c3ab053a50dc96505a (diff) | |
io_uring: use task_work for links if possible
Currently links are always done in an async fashion, unless we catch them
inline after we successfully complete a request without having to resort
to blocking. This isn't necessarily the most efficient approach; it would
be better if we could just use the task_work handling for this.
Outside of saving an async jump, we can also do less prep work for these
kinds of requests.
Running dependent links from the task_work handler yields some nice
performance benefits. As an example, examples/link-cp from the liburing
repository uses read+write links to implement a copy operation. Without
this patch, a cache cold 4G file read from a VM runs in about 3
seconds:
$ time examples/link-cp /data/file /dev/null
real 0m2.986s
user 0m0.051s
sys 0m2.843s
and a subsequent cache hot run looks like this:
$ time examples/link-cp /data/file /dev/null
real 0m0.898s
user 0m0.069s
sys 0m0.797s
With this patch in place, the cold case takes about 2.4 seconds:
$ time examples/link-cp /data/file /dev/null
real 0m2.400s
user 0m0.020s
sys 0m2.366s
and the cache hot case looks like this:
$ time examples/link-cp /data/file /dev/null
real 0m0.676s
user 0m0.010s
sys 0m0.665s
As expected, the (mostly) cache hot case yields the biggest improvement,
finishing in about 25% less wall-clock time with this change, while the
cache cold case finishes in about 20% less. Besides the performance
increase, we're using less CPU as well, as we're not using the async
offload threads at all for this anymore.
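For reference, the percentages above can be rechecked from the quoted "real"
times. The snippet below is only an illustrative sketch and is not part of the
patch; the helper names time_reduction_pct and speedup_pct are made up for
this example. It shows that the quoted figures correspond to the reduction in
wall-clock time, and that the same data expressed as a throughput gain gives
somewhat larger numbers.

```python
# Wall-clock ("real") times in seconds, copied from the runs quoted above.
# Each entry is (without the patch, with the patch).
timings = {
    "cache cold": (2.986, 2.400),
    "cache hot": (0.898, 0.676),
}


def time_reduction_pct(before, after):
    """Share of wall-clock time saved, relative to the unpatched run."""
    return (before - after) / before * 100.0


def speedup_pct(before, after):
    """Throughput gain: how much more work gets done per second."""
    return (before / after - 1.0) * 100.0


for name, (before, after) in timings.items():
    print(
        f"{name}: {time_reduction_pct(before, after):.1f}% less time, "
        f"{speedup_pct(before, after):.1f}% higher throughput"
    )
```

With the numbers from this message, the time reduction comes out at roughly
19.6% for the cold run and 24.7% for the hot run, which matches the rounded
figures in the text.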
Signed-off-by: Jens Axboe <axboe@kernel.dk>