habanalabs: increase timeout during reset - kernel/linux.git

author	Oded Gabbay <oded.gabbay@gmail.com>	2020-03-27 16:38:37 +0300
committer	Oded Gabbay <oded.gabbay@gmail.com>	2020-05-17 12:06:22 +0300
commit	7a65ee046b2238e053f6ebb610e1a082cfc49490 (patch)
tree	6a70b06695d9ac9b28ad83df81263c6c546321df /net/l2tp
parent	49aba0bbab20a581dc3e32a6ee636c07a542eb9e (diff)
download	linux-7a65ee046b2238e053f6ebb610e1a082cfc49490.tar.xz

habanalabs: increase timeout during reset

During training, the DL framework (e.g. TensorFlow) performs hundreds of thousands of memory allocations and mappings. If the driver needs to perform a hard reset during training, it kills the application and unmaps all of those memory allocations.

Unfortunately, because of the large number of mappings, the driver cannot finish unmapping within the current timeout (5 seconds). Therefore, increase the timeout significantly, to 30 seconds, to avoid a situation where the driver resets the device while mappings are still active, which can sometimes cause a kernel bug.

Note that this does not mean the driver will always spend the full 30 seconds, because the reset thread checks once every second whether the unmap operation is done.

Reviewed-by: Omer Shpigelman <oshpigelman@habana.ai>
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
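
The polling behaviour described in the commit message can be illustrated with a small user-space sketch. This is not the habanalabs driver code: `wait_for_unmap`, `struct fake_unmap`, `fake_unmap_done`, `fake_unmap_tick` and `RESET_WAIT_TIMEOUT_SEC` are hypothetical names chosen for illustration, and the one-second sleep is replaced by a callback so the logic can be exercised without real delays.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constant mirroring the new 30-second limit from the commit. */
#define RESET_WAIT_TIMEOUT_SEC 30

/* Hypothetical stand-in for the teardown work: needs N seconds to finish. */
struct fake_unmap {
	int seconds_needed; /* how long the simulated unmapping takes */
	int elapsed;        /* simulated seconds that have passed */
};

/* Reports whether the simulated unmap operation has completed. */
bool fake_unmap_done(void *ctx)
{
	struct fake_unmap *u = ctx;

	return u->elapsed >= u->seconds_needed;
}

/* Simulates sleeping for one second by advancing the fake clock. */
void fake_unmap_tick(void *ctx)
{
	struct fake_unmap *u = ctx;

	u->elapsed++;
}

/*
 * Check once per (simulated) second whether the work is done, giving up
 * after timeout_sec seconds. Returns the number of seconds waited, or -1
 * if the work was still unfinished when the timeout expired.
 */
int wait_for_unmap(bool (*done)(void *), void (*wait_one_sec)(void *),
		   void *ctx, int timeout_sec)
{
	int waited;

	assert(timeout_sec >= 0);

	for (waited = 0; waited < timeout_sec; waited++) {
		if (done(ctx))
			return waited; /* finished early, stop waiting */
		wait_one_sec(ctx);
	}

	/* One last check after the final wait interval. */
	return done(ctx) ? waited : -1;
}
```

With a simulated teardown that needs 12 seconds, a 5-second limit gives up (the function returns -1), while a limit of `RESET_WAIT_TIMEOUT_SEC` returns after 12 seconds rather than waiting the full 30. That matches the point made above: the longer timeout is an upper bound, not a fixed delay.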

Diffstat (limited to 'net/l2tp')

0 files changed, 0 insertions, 0 deletions

