|
This reverts commit 81f95076281fdd3bc382e004ba1bce8e82fccbce.
It causes random failures of firmware loading at resume time (well,
random for me, it seems to be more reliable for others) because the
firmware disabling is not actually synchronous with any particular
resume event, and at least the btusb driver that uses a workqueue to
load the firmware at resume seems to occasionally hit the "firmware
loading is disabled" logic because the firmware loader hasn't gotten the
resume event yet.
Some kind of sanity check for not trying to load firmware when it's not
possible might be a good thing, but this commit was not it.
Greg seems to have silently suffered the same issue, and pointed to the
likely culprit, and Gabriel C verified the revert fixed it for him too.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Pointed-at-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Gabriel C <nix.or.die@gmail.com>
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We want the fixes in here as well for testing.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Otherwise there is no easy way to tell this actually happened.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
For some reason we have always forgotten this. Without this
we don't get a nice prefix on our pr_debug() / pr_*() messages.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Right now we send -EAGAIN to a sysfs write which got interrupted.
Userspace can't tell what happened though, so send -EINTR if we
were killed due to a signal so userspace can tell things apart.
This is only applicable to the fallback mechanism.
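The error selection described above can be sketched in userspace as follows. This is a toy model, not the kernel code: toy_fallback_errno() and TOY_ERESTARTSYS are hypothetical names, and the value 512 mirrors the kernel-internal ERESTARTSYS, which userspace <errno.h> does not expose.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Kernel-internal restart code; redefined here because userspace lacks it. */
#define TOY_ERESTARTSYS 512

/*
 * Hypothetical helper: choose the error reported to the sysfs writer once
 * the fallback wait has ended. wait_ret is the wait result (0 or negative),
 * aborted says whether the load ended up in the aborted state.
 */
int toy_fallback_errno(int wait_ret, bool aborted)
{
	assert(wait_ret <= 0);

	if (!aborted)
		return wait_ret;
	/* A signal ended the wait: tell userspace so explicitly. */
	if (wait_ret == -TOY_ERESTARTSYS)
		return -EINTR;
	/* Any other abort, for example userspace cancelled the load. */
	return -EAGAIN;
}
```

With this split a caller that sees -EINTR knows a signal was involved, while -EAGAIN keeps its old meaning for every other abort.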
Reported-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Commit 0cb64249ca500 ("firmware_loader: abort request if wait_for_completion
is interrupted"), merged in v4.0, added support to abort the fallback mechanism
when a signal was detected and wait_for_completion_interruptible() returned
-ERESTARTSYS -- for instance when a user hits CTRL-C. The abort was
*too* effective.
When a child process terminates (successful or not) the signal SIGCHLD can
be sent to the parent process which ran the child in the background and
later triggered a sync request for firmware through a sysfs interface which
relies on the fallback mechanism. This signal in turn can be received by the
interruptible wait we constructed on firmware_class, which detects it as an
abort *before* userspace could get a chance to write the firmware. Upon
failure -EAGAIN is returned, so userspace is also kept in the dark about
exactly what happened.
We can reproduce the issue with the fw_fallback.sh selftest:
Before this patch:
$ sudo tools/testing/selftests/firmware/fw_fallback.sh
...
tools/testing/selftests/firmware/fw_fallback.sh: error - sync firmware request cancelled due to SIGCHLD
After this patch:
$ sudo tools/testing/selftests/firmware/fw_fallback.sh
...
tools/testing/selftests/firmware/fw_fallback.sh: SIGCHLD on sync ignored as expected
Fix this by making the wait killable -- only killable by SIGKILL (kill -9).
We lose the ability to allow userspace to cancel a write with CTRL-C
(SIGINT), however it's been decided the compromise to require SIGKILL is
worth the gains.
Chances of this issue occurring are low due to the number of drivers upstream
exclusively relying on the fallback mechanism for firmware (2 drivers),
however this is observed in the field with custom drivers with sysfs
triggers to load firmware. Only distributions relying on the fallback
mechanism are impacted as well. An example reported issue was on Android,
as follows:
1) Android init (pid=1) fork()s (say pid=42) [this child process is totally
unrelated to firmware loading, it could be sleep 2; for all we care ]
2) Android init (pid=1) does a write() on a (driver custom) sysfs file which
ends up calling request_firmware() kernel side
3) The firmware loading fallback mechanism is used, the request is sent to
userspace and pid 1 waits in the kernel on wait_*
4) before firmware loading completes pid 42 dies (for any reason, even
normal termination)
5) Kernel delivers SIGCHLD to pid=1 to tell it a child has died, which
causes -ERESTARTSYS to be returned from wait_*
6) The kernel's wait aborts and returns -EAGAIN for the
request_firmware() caller.
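The sequence above has a close userspace analogue, sketched here under stated assumptions: a parent blocked in nanosleep() stands in for pid 1 waiting in the kernel, and a short-lived child stands in for pid 42. The function name and the 100 ms / 5 s timings are made up for illustration; only standard POSIX calls are used.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGCHLD interrupt blocking calls. */
static void toy_on_sigchld(int signo)
{
	(void)signo;
}

/*
 * Returns 1 if the parent's blocking wait was cut short by SIGCHLD,
 * 0 if it ran to completion, -1 on setup failure.
 */
int toy_wait_interrupted_by_sigchld(void)
{
	struct sigaction sa, old;
	struct timespec child_nap = { 0, 100 * 1000 * 1000 };	/* 100 ms */
	struct timespec parent_wait = { 5, 0 };			/* 5 s */
	pid_t pid, reaped;
	int interrupted;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = toy_on_sigchld;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, &old) < 0)
		return -1;

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: unrelated work, then a perfectly normal exit. */
		nanosleep(&child_nap, NULL);
		_exit(0);
	}

	/* Parent: an interruptible wait, ended early by the child's exit. */
	interrupted = (nanosleep(&parent_wait, NULL) == -1 && errno == EINTR);

	reaped = waitpid(pid, NULL, 0);
	assert(reaped == pid);
	sigaction(SIGCHLD, &old, NULL);
	return interrupted;
}
```

The parent did nothing wrong and the child exited normally, yet the wait ends early. That is the same shape as the aborted firmware request.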
Cc: stable <stable@vger.kernel.org> # 4.0
Fixes: 0cb64249ca500 ("firmware_loader: abort request if wait_for_completion is interrupted")
Suggested-by: "Eric W. Biederman" <ebiederm@xmission.com>
Suggested-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Tested-by: Martin Fuzzey <mfuzzey@parkeon.com>
Reported-by: Martin Fuzzey <mfuzzey@parkeon.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix batched requests from waiting forever on failure.
The firmware API batched requests feature has been broken since the API call
request_firmware_direct() was introduced on commit bba3a87e982ad ("firmware:
Introduce request_firmware_direct()"), added on v3.14 *iff* the firmware
being requested was not present in *certain kernel builds* [0].
When no firmware is found the worker which goes on to finish never informs
waiters queued up of this, so any batched request will stall in what seems
to be forever (MAX_SCHEDULE_TIMEOUT). Sadly, a reboot will also stall, as
the reboot notifier was only designed to kill custom fallback workers. The
issue appears to the user as a type of soft lockup; what *actually* happens
underneath the hood is a wait call which never completes as we failed to
issue a completion on error.
For device drivers with optional firmware schemes (ie, Intel iwlwifi, or
Netronome -- even though it uses request_firmware() and not
request_firmware_direct()), this could mean that when you boot a system with
multiple cards the firmware will seem to never load on the system, or that
the card is just not responsive even after driver initialization. Due to
possible differences in scheduling this should not always trigger --
one would need to ensure that multiple requests are in place at the
right time for this to work, also release_firmware() must not be called
prior to any other incoming request. The complexity may not be worth
supporting batched requests in the future given the wait mechanism is
only used also for the fallback mechanism. We'll keep it for now and
just fix it.
It's reported that at least with the Intel WiFi cards on one system this
issue was creeping up 50% of the boots [0].
Before this commit batched requests testing revealed:
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=y
Most common Linux distribution setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               OK
request_firmware_direct()              FAIL               OK
request_firmware_nowait(uevent=true)   FAIL               OK
request_firmware_nowait(uevent=false)  FAIL               OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=n
Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               OK
request_firmware_direct()              FAIL               OK
request_firmware_nowait(uevent=true)   FAIL               OK
request_firmware_nowait(uevent=false)  FAIL               OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_FW_LOADER_USER_HELPER=y
Google Android setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                 OK
request_firmware_direct()              FAIL               OK
request_firmware_nowait(uevent=true)   OK                 OK
request_firmware_nowait(uevent=false)  OK                 OK
============================================================================
After this commit batched testing results:
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=y
Most common Linux distribution setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                 OK
request_firmware_direct()              OK                 OK
request_firmware_nowait(uevent=true)   OK                 OK
request_firmware_nowait(uevent=false)  OK                 OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=n
Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                 OK
request_firmware_direct()              OK                 OK
request_firmware_nowait(uevent=true)   OK                 OK
request_firmware_nowait(uevent=false)  OK                 OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_FW_LOADER_USER_HELPER=y
Google Android setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                 OK
request_firmware_direct()              OK                 OK
request_firmware_nowait(uevent=true)   OK                 OK
request_firmware_nowait(uevent=false)  OK                 OK
============================================================================
[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477
Cc: stable <stable@vger.kernel.org> # v3.14
Fixes: bba3a87e982ad ("firmware: Introduce request_firmware_direct()")
Reported-by: Nicolas <nbroeking@me.com>
Reported-by: John Ewalt <jewalt@lgsinnovations.com>
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The firmware cache mechanism serves two purposes; the secondary purpose is
neither well documented nor well understood. This fixes a regression with the
secondary purpose of the firmware cache mechanism: batched requests on
successful lookups. Without this fix *any* time a batched request is
triggered, secondary requests, for which the batched request mechanism
was designed, will seem to last forever and never return.
This issue is present for all kernel builds possible, and a hard reset
is required.
The firmware cache is used for:
1) Addressing races with file lookups during the suspend/resume cycle
by keeping firmware in memory during the suspend/resume cycle
2) Batched requests for the same file rely only on work from the first file
lookup, which keeps the firmware in memory until the last
release_firmware() is called
Batched requests *only* take effect if secondary requests come in prior to
the first user calling release_firmware(). The devres name used for the
internal firmware cache is used as a hint that other pending requests are
ongoing; the firmware buffer data is kept in memory until the last user of
the buffer calls release_firmware(), therefore serializing requests and
delaying the release until all requests are done.
Batched requests wait for a wakeup or signal so we can rely on the first file
fetch to write to the pending secondary requests. Commit 5b029624948d
("firmware: do not use fw_lock for fw_state protection") ported the firmware
API to use swait, and in doing so failed to convert complete_all() to
swake_up_all() -- it used swake_up(), losing the ability for *some* batched
requests to take effect.
We *could* fix this by just using swake_up_all() *but* swait is now known
to be a very special use case, so it's best to just move away from it. So we
just go back to using completions as before commit 5b029624948d ("firmware:
do not use fw_lock for fw_state protection") given this was using
complete_all().
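The difference between waking one waiter and waking all of them can be sketched with a toy, single-threaded model. None of these names exist in the kernel: toy_wake_one() stands in for swake_up(), toy_wake_all() for complete_all(), and each sleeping entry is a batched request waiting on the first lookup.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_WAITERS 8

/* Toy wait queue: asleep[i] is 1 while batched request i is still waiting. */
struct toy_waitqueue {
	int asleep[TOY_MAX_WAITERS];
	size_t count;
};

void toy_add_waiter(struct toy_waitqueue *q)
{
	assert(q->count < TOY_MAX_WAITERS);
	q->asleep[q->count++] = 1;
}

/* Wakes only the first sleeping waiter; the rest keep waiting. */
void toy_wake_one(struct toy_waitqueue *q)
{
	for (size_t i = 0; i < q->count; i++) {
		if (q->asleep[i]) {
			q->asleep[i] = 0;
			return;
		}
	}
}

/* Wakes every waiter, which is what batched requests depend on. */
void toy_wake_all(struct toy_waitqueue *q)
{
	for (size_t i = 0; i < q->count; i++)
		q->asleep[i] = 0;
}

size_t toy_still_asleep(const struct toy_waitqueue *q)
{
	size_t n = 0;

	for (size_t i = 0; i < q->count; i++)
		n += q->asleep[i] ? 1 : 0;
	return n;
}
```

With a single secondary request both wake functions behave the same. With two or more, toy_wake_one() leaves waiters behind, which in the real code means they sleep until the timeout.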
Without this fix it has been reported plugging in two Intel 6260 Wifi cards
on a system will end up enumerating the two devices only 50% of the time
[0]. The ported swake_up() should have actually handled the case with two
devices, however, *if more than two cards are used* the swake_up() would
not have sufficed. This change is only part of the required fixes for
batched requests. Another fix is provided in the next patch.
This particular change should fix the cases where more than three requests
with the same firmware name are used, otherwise batched requests will wait
for MAX_SCHEDULE_TIMEOUT and just time out eventually.
Below is a summary of tests triggering batched requests on different
kernel builds.
Before this patch:
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=y
Most common Linux distribution setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               FAIL
request_firmware_direct()              FAIL               FAIL
request_firmware_nowait(uevent=true)   FAIL               FAIL
request_firmware_nowait(uevent=false)  FAIL               FAIL
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=n
Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               FAIL
request_firmware_direct()              FAIL               FAIL
request_firmware_nowait(uevent=true)   FAIL               FAIL
request_firmware_nowait(uevent=false)  FAIL               FAIL
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_FW_LOADER_USER_HELPER=y
Google Android setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               FAIL
request_firmware_direct()              FAIL               FAIL
request_firmware_nowait(uevent=true)   FAIL               FAIL
request_firmware_nowait(uevent=false)  FAIL               FAIL
============================================================================
After this patch:
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=y
Most common Linux distribution setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               OK
request_firmware_direct()              FAIL               OK
request_firmware_nowait(uevent=true)   FAIL               OK
request_firmware_nowait(uevent=false)  FAIL               OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=n
CONFIG_FW_LOADER_USER_HELPER=n
Only possible if CONFIG_DELL_RBU=n and CONFIG_LEDS_LP55XX_COMMON=n, rare.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     FAIL               OK
request_firmware_direct()              FAIL               OK
request_firmware_nowait(uevent=true)   FAIL               OK
request_firmware_nowait(uevent=false)  FAIL               OK
============================================================================
CONFIG_FW_LOADER_USER_HELPER_FALLBACK=y
CONFIG_FW_LOADER_USER_HELPER=y
Google Android setup.
API-type                               no-firmware-found  firmware-found
----------------------------------------------------------------------
request_firmware()                     OK                 OK
request_firmware_direct()              FAIL               OK
request_firmware_nowait(uevent=true)   OK                 OK
request_firmware_nowait(uevent=false)  OK                 OK
============================================================================
[0] https://bugzilla.kernel.org/show_bug.cgi?id=195477
CC: <stable@vger.kernel.org> [4.10+]
Cc: Ming Lei <ming.lei@redhat.com>
Fixes: 5b029624948d ("firmware: do not use fw_lock for fw_state protection")
Reported-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This moves the usermode helper locks into only code paths that use the
usermode helper API from the kernel. The usermode helper locks were
originally added to prevent stalling suspend, later the firmware cache
was added to help with this, and later still direct filesystem lookup
was added by Linus to completely bypass udev due to the number of issues
the umh approach had.
The usermode helper locks were kept even when the direct filesystem lookup
mechanism is used though. A lot has changed since the original usermode
helper locks were added, but the recent commit which added the code for
firmware_enabled() is intended to address any possible races which were cured
only as collateral by using the locks, a side consequence of code evolution
that was not addressed any sooner. With the firmware_enabled() code in place
we can be a bit more confident in moving the usermode helper locks to UMH-only
code.
There is a bit of history here so let's recap a bit of it to ensure nothing
is lost and things are clear. The direct filesystem approach to loading
firmware is rather new; it was added via commit abb139e75c2cdb ("firmware:
teach the kernel to load firmware files directly from the filesystem") by
Linus, merged in the v3.7 release, to make it possible to bypass udev.
usermodehelper_read_lock_wait() was added earlier via commit 9b78c1da60b3c
("firmware_class: Do not warn that system is not ready from async loads")
merged on v3.4, after Rafael noted that the async firmware API call
request_firmware_nowait() should not be penalized to fail if userspace is
not available yet or frozen, it'd allow for a timeout grace period before
giving up. The WARN_ON() was kept for the sync firmware API call though on
request_firmware(). At this time there was no direct filesystem lookup for
firmware though.
The original usermode helper lock came from commit a144c6a6c924a ("PM:
Print a warning if firmware is requested when tasks are frozen") merged on
the v3.0 kernel by Rafael to print a warning back when firmware requests
were used on resume(), thaw() or restore() callbacks and there was no
direct fs lookups or the firmware cache.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This will make subsequent changes easier to read.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The firmware API should not be used after we go to suspend
and after we reboot/halt. The suspend/resume case is a bit
complex, so this documents that so things are clearer.
We want to know about users of the API in incorrect places so
that their callers are corrected, so this also adds a warn
for those cases.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Now that we have proper wrappers for the fallback mechanism
we can easily share the reboot notifier for the firmware_class
at all times.
This change will make subsequent modifications to the reboot
notifier easier to review.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We kill pending fallback requests on suspend and reboot,
the only difference is that on suspend we only kill custom
fallback requests. Provide a wrapper that lets us customize
the request with a flag.
This also lets us simplify the #ifdef'ery over the calls.
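A minimal sketch of such a wrapper, as a userspace toy model. All names are hypothetical, and it assumes a "custom" fallback request is one issued without a uevent (need_uevent == false); the real code walks a list of pending buffers rather than an array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy pending fallback request. */
struct toy_fw_req {
	bool need_uevent;	/* false: custom fallback, no uevent was sent */
	bool aborted;
};

/*
 * Abort pending requests. With only_kill_custom set (the suspend case) only
 * custom fallback requests are aborted; otherwise (the reboot case) all are.
 * Returns how many requests this call aborted.
 */
size_t toy_kill_pending_reqs(struct toy_fw_req *reqs, size_t n,
			     bool only_kill_custom)
{
	size_t killed = 0;

	assert(reqs != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (reqs[i].aborted)
			continue;
		if (!reqs[i].need_uevent || !only_kill_custom) {
			reqs[i].aborted = true;
			killed++;
		}
	}
	return killed;
}
```

One flag covers both callers, so the suspend and reboot paths no longer need separate helpers guarded by their own #ifdefs.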
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This routine will be used next in functions declared earlier. This
code shift has no functional changes, it will make subsequent
changes easier to read.
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Since commit 5d47ec02c37ea6 ("firmware: Correct handling of
fw_state_wait() return value") fw_load_abort() could be called twice and
lead us to a kernel crash. This happens only when the firmware fallback
mechanism (regular or custom) is used. The fallback mechanism exposes a
sysfs interface for userspace to upload a file and notify the kernel
when the file is loaded and ready, or to cancel an upload by echo'ing -1
into the loading file:
echo -n "-1" > /sys/$DEVPATH/loading
This will call fw_load_abort(). Some distributions actually have a udev
rule in place to *always* immediately cancel all firmware fallback
mechanism requests (Debian), they have:
$ cat /lib/udev/rules.d/50-firmware.rules
# stub for immediately telling the kernel that userspace firmware loading
# failed; necessary to avoid long timeouts with CONFIG_FW_LOADER_USER_HELPER=y
SUBSYSTEM=="firmware", ACTION=="add", ATTR{loading}="-1"
Distributions with this udev rule would run into this crash only if the
fallback mechanism is used. Since most distributions disable by default
using the fallback mechanism (CONFIG_FW_LOADER_USER_HELPER_FALLBACK),
this would typically mean only 2 drivers which *require* the fallback
mechanism could typically incur a crash: drivers/firmware/dell_rbu.c and
the drivers/leds/leds-lp55xx-common.c driver. Distributions enabling
CONFIG_FW_LOADER_USER_HELPER_FALLBACK by default are obviously more
exposed to this crash.
The crash happens because after commit 5b029624948d ("firmware: do not
use fw_lock for fw_state protection") and subsequent fix commit
5d47ec02c37ea6 ("firmware: Correct handling of fw_state_wait() return
value") a race can happen between this cancellation and
fw_state_wait_timeout() being woken up after the state change made by
fw_load_abort(), as that calls swake_up(). Upon error
fw_state_wait_timeout() will call fw_load_abort() again and trigger
a NULL pointer dereference.
At first glance we could just fix this with a !buf check on
fw_load_abort() before accessing buf->fw_st, however there is a logical
issue in having a state machine used for the fallback mechanism and
preventing access from it once we abort as it's inside the buf
(buf->fw_st).
The firmware_class.c code is setting the buf to NULL to annotate an
abort has occurred. Replace this mechanism by simply using the state
check instead. All the other code in place already uses similar checks
for aborting as well so no further changes are needed.
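The idea of keying the abort off the state rather than a NULL pointer can be sketched as a toy model. The names are hypothetical and the wakeups counter merely stands in for waking the waiter; the point is that a second abort call finds the state already final and does nothing.

```c
#include <assert.h>
#include <stddef.h>

enum toy_fw_status {
	TOY_FW_UNKNOWN,
	TOY_FW_LOADING,
	TOY_FW_DONE,
	TOY_FW_ABORTED,
};

struct toy_fw_buf {
	enum toy_fw_status status;
	int wakeups;	/* how many times the waiter was woken */
};

/*
 * Abort a load. The buffer stays valid afterwards, so a racing second call
 * only has to look at the state. Returns 1 if this call did the abort,
 * 0 if the load was already finished or aborted.
 */
int toy_load_abort(struct toy_fw_buf *buf)
{
	assert(buf != NULL);

	if (buf->status == TOY_FW_DONE || buf->status == TOY_FW_ABORTED)
		return 0;

	buf->status = TOY_FW_ABORTED;
	buf->wakeups++;
	return 1;
}
```

Calling toy_load_abort() twice is harmless here: the second call returns 0 without touching anything else.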
An oops can be reproduced with the new fw_fallback.sh fallback mechanism
cancellation test. Either cancelling the fallback mechanism or the
custom fallback mechanism triggers a crash.
mcgrof@piggy ~/linux-next/tools/testing/selftests/firmware
(git::20170111-fw-fixes)$ sudo ./fw_fallback.sh
./fw_fallback.sh: timeout works
./fw_fallback.sh: firmware comparison works
./fw_fallback.sh: fallback mechanism works
[ this then sits here when it is trying the cancellation test ]
Kernel log:
test_firmware: loading 'nope-test-firmware.bin'
misc test_firmware: Direct firmware load for nope-test-firmware.bin failed with error -2
misc test_firmware: Falling back to user helper
BUG: unable to handle kernel NULL pointer dereference at 0000000000000038
IP: _request_firmware+0xa27/0xad0
PGD 0
Oops: 0000 [#1] SMP
Modules linked in: test_firmware(E) ... etc ...
CPU: 1 PID: 1396 Comm: fw_fallback.sh Tainted: G W E 4.10.0-rc3-next-20170111+ #30
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.1-0-g8891697-prebuilt.qemu-project.org 04/01/2014
task: ffff9740b27f4340 task.stack: ffffbb15c0bc8000
RIP: 0010:_request_firmware+0xa27/0xad0
RSP: 0018:ffffbb15c0bcbd10 EFLAGS: 00010246
RAX: 00000000fffffffe RBX: ffff9740afe5aa80 RCX: 0000000000000000
RDX: ffff9740b27f4340 RSI: 0000000000000283 RDI: 0000000000000000
RBP: ffffbb15c0bcbd90 R08: ffffbb15c0bcbcd8 R09: 0000000000000000
R10: 0000000894a0d4b1 R11: 000000000000008c R12: ffffffffc0312480
R13: 0000000000000005 R14: ffff9740b1c32400 R15: 00000000000003e8
FS: 00007f8604422700(0000) GS:ffff9740bfc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000038 CR3: 000000012164c000 CR4: 00000000000006e0
Call Trace:
request_firmware+0x37/0x50
trigger_request_store+0x79/0xd0 [test_firmware]
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
kernfs_fop_write+0x110/0x1a0
__vfs_write+0x37/0x160
? _cond_resched+0x1a/0x50
vfs_write+0xb5/0x1a0
SyS_write+0x55/0xc0
? trace_do_page_fault+0x37/0xd0
entry_SYSCALL_64_fastpath+0x1e/0xad
RIP: 0033:0x7f8603f49620
RSP: 002b:00007fff6287b788 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 000055c307b110a0 RCX: 00007f8603f49620
RDX: 0000000000000016 RSI: 000055c3084d8a90 RDI: 0000000000000001
RBP: 0000000000000016 R08: 000000000000c0ff R09: 000055c3084d6336
R10: 000055c307b108b0 R11: 0000000000000246 R12: 000055c307b13c80
R13: 000055c3084d6320 R14: 0000000000000000 R15: 00007fff6287b950
Code: 9f 64 84 e8 9c 61 fe ff b8 f4 ff ff ff e9 6b f9 ff
ff 48 c7 c7 40 6b 8d 84 89 45 a8 e8 43 84 18 00 49 8b be 00 03 00 00 8b
45 a8 <83> 7f 38 02 74 08 e8 6e ec ff ff 8b 45 a8 49 c7 86 00 03 00 00
RIP: _request_firmware+0xa27/0xad0 RSP: ffffbb15c0bcbd10
CR2: 0000000000000038
---[ end trace 6d94ac339c133e6f ]---
Fixes: 5d47ec02c37e ("firmware: Correct handling of fw_state_wait() return value")
Reported-and-Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reported-and-Tested-by: Patrick Bruenn <p.bruenn@beckhoff.com>
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
CC: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When request_firmware() finds an already open firmware object it will
wait for that object to become fully loaded and then check the status.
As __fw_state_wait_common() succeeds the timeout value returned will be
truncated in _request_firmware_prepare() and interpreted as -EPERM.
Prior to "firmware: do not use fw_lock for fw_state protection" the code
did test if we were in the "done" state before sleeping, causing this
particular code path to succeed, in some cases.
As the callers are interested in the result of the wait and not the
remaining timeout the return value of __fw_state_wait_common() is
changed to signal "done" or "error", which simplifies the logic in
_request_firmware_load() as well.
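The truncation can be illustrated in userspace. This sketch assumes the successful wait hands back its full remaining timeout of MAX_SCHEDULE_TIMEOUT (LONG_MAX) and that long is wider than int, as on common 64-bit ABIs; the out-of-range conversion is implementation-defined in C and wraps on GCC and Clang. Function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>

/* Old shape: the remaining timeout (a long) is squeezed into an int. */
int toy_wait_result_truncated(long remaining)
{
	return (int)remaining;	/* implementation-defined when out of range */
}

/*
 * New shape: callers only learn "done" (0) or an error, never the
 * remaining time, so there is nothing left to truncate.
 */
int toy_wait_result_fixed(long remaining, int aborted)
{
	assert(remaining >= INT_MIN);

	if (remaining != 0 && aborted)
		return -ENOENT;
	if (remaining == 0)
		return -ETIMEDOUT;
	return remaining < 0 ? (int)remaining : 0;
}
```

On such a build toy_wait_result_truncated(LONG_MAX) yields -1, which is -EPERM, while toy_wait_result_fixed(LONG_MAX, 0) reports plain success.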
Fixes: 5b029624948d ("firmware: do not use fw_lock for fw_state protection")
Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Reviewed-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
This patch removes the following warning from `make htmldocs`. No functional
change.
./drivers/base/firmware_class.c:1348: WARNING: Bullet list ends without a blank line; unexpected unindent.
Signed-off-by: Silvio Fricke <silvio.fricke@gmail.com>
Reviewed-by: Mauro Carvalho Chehab <mchehab@s-opensource.com>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
fw_state_is_done() is only used for UMH so move it into that section.
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
fw_lock is used to protect 'corner cases' inside firmware_class. It
is not exactly clear what those corner cases are nor what it exactly
protects. fw_state can be used without needing the fw_lock to protect
its state transition and wake ups.
fw_state holds the state in status and the completion is used to
wake up all waiters (in this case that is the user land helper so only
one). This operation has to be 'atomic' to avoid races. We can do this
by using swait which takes care we don't miss any wake up.
We also use swait instead of wait because we don't need all the additional
features wait provides.
Note there are some more cleanups possible after this change. For
example for !CONFIG_FW_LOADER_USER_HELPER we don't check for the state
anymore. Let's do this in the next patch instead of mingling too many
changes into this one. And yes you get a gcc warning "‘__fw_state_check’
defined but not used [-Wunused-function]" for the time being.
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We track the state of the firmware loading with bit ops. Since the
state machine has only a few states, they are all mutually exclusive, and
there are only a few simple state transitions, we can model this more simply.
UNKNOWN -> LOADING -> DONE | ABORTED
Because we don't use any bit ops on fw_state::status anymore we are able
to change the data type to enum fw_status and update the function
arguments accordingly.
READ_ONCE() and WRITE_ONCE() are probably not needed because there are a
lot of loads and stores around fw_st->status. But let's make it explicit
lot of load and stores around fw_st->status. But let's make it explicit
and not be sorry later.
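Reading the diagram above strictly, the state machine can be sketched as a plain enum plus a transition check. This is a toy model with hypothetical names; the real code may allow transitions the diagram does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_fw_status {
	TOY_FW_STATUS_UNKNOWN,
	TOY_FW_STATUS_LOADING,
	TOY_FW_STATUS_DONE,
	TOY_FW_STATUS_ABORTED,
};

struct toy_fw_state {
	enum toy_fw_status status;
};

/* UNKNOWN -> LOADING -> DONE | ABORTED; DONE and ABORTED are final. */
bool toy_fw_transition_ok(enum toy_fw_status from, enum toy_fw_status to)
{
	switch (from) {
	case TOY_FW_STATUS_UNKNOWN:
		return to == TOY_FW_STATUS_LOADING;
	case TOY_FW_STATUS_LOADING:
		return to == TOY_FW_STATUS_DONE || to == TOY_FW_STATUS_ABORTED;
	case TOY_FW_STATUS_DONE:
	case TOY_FW_STATUS_ABORTED:
		return false;
	}
	return false;
}

/* Apply a transition only if the diagram allows it. */
bool toy_fw_state_set(struct toy_fw_state *st, enum toy_fw_status next)
{
	assert(st != NULL);

	if (!toy_fw_transition_ok(st->status, next))
		return false;
	st->status = next;
	return true;
}
```

Because the states are mutually exclusive, one enum value replaces the set of status bits and no bit ops are needed.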
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The firmware loader tracks the current state of the loading process
via unsigned long status and a completion in struct
firmware_buf. Instead of open-coding the state tracking, introduce a data
structure which encapsulates the state tracking and synchronization.
While at it also separate UMH states from direct loading states, e.g.
the loading_timeout is only defined when CONFIG_FW_LOADER_USER_HELPER.
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When you use the firmware usermode helper fallback with a timeout value set to a
value greater than INT_MAX (2147483647) a cast overflow issue causes the
timeout value to go negative and breaks all usermode helper loading. This
regression was introduced through commit 68ff2a00dbf5 ("firmware_loader:
handle timeout via wait_for_completion_interruptible_timeout()") on kernel
v4.0.
The firmware_class driver relies on the firmware usermode helper
fallback as a mechanism to look for firmware if the direct filesystem
search failed only if:
a) You've enabled CONFIG_FW_LOADER_USER_HELPER_FALLBACK (not many distros):
Then all of these callers will rely on the fallback mechanism in case
the firmware is not found through an initial direct filesystem lookup:
o request_firmware()
o request_firmware_into_buf()
o request_firmware_nowait()
b) If you've only enabled CONFIG_FW_LOADER_USER_HELPER (most distros):
Then only callers using request_firmware_nowait() with the second
argument set to false rely on it; this explicitly requests that the UMH
firmware fallback be used in case the first filesystem lookup fails.
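Cases a) and b) can be condensed into a small decision function. This is a sketch with hypothetical names covering only the three callers listed above; it encodes nothing beyond the two rules just described.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_fw_api {
	TOY_REQUEST_FIRMWARE,
	TOY_REQUEST_FIRMWARE_INTO_BUF,
	TOY_REQUEST_FIRMWARE_NOWAIT,
};

/*
 * May this caller end up in the UMH fallback after the direct filesystem
 * lookup fails? uevent is only meaningful for request_firmware_nowait(),
 * where it is the second argument.
 */
bool toy_may_use_umh_fallback(bool cfg_user_helper,
			      bool cfg_user_helper_fallback,
			      enum toy_fw_api api, bool uevent)
{
	assert(api >= TOY_REQUEST_FIRMWARE &&
	       api <= TOY_REQUEST_FIRMWARE_NOWAIT);

	/* a) fallback enabled: every listed caller may fall back. */
	if (cfg_user_helper_fallback)
		return true;
	/* b) helper only: just nowait callers that pass uevent == false. */
	if (cfg_user_helper)
		return api == TOY_REQUEST_FIRMWARE_NOWAIT && !uevent;
	/* Neither option enabled: no fallback at all. */
	return false;
}
```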
Using Coccinelle SmPL grammar we have identified only two drivers
explicitly requesting the UMH firmware fallback mechanism:
- drivers/firmware/dell_rbu.c
- drivers/leds/leds-lp55xx-common.c
Since most distributions only enable CONFIG_FW_LOADER_USER_HELPER the
biggest impact of this regression is on users of the dell_rbu and
leds-lp55xx-common device drivers, which require the UMH to find their
respective firmware.
The default timeout for the UMH is set to 60 seconds always, as of
commit 68ff2a00dbf5 ("firmware_loader: handle timeout via
wait_for_completion_interruptible_timeout()") the timeout was bumped
to MAX_JIFFY_OFFSET ((LONG_MAX >> 1)-1). Additionally the MAX_JIFFY_OFFSET
value was also used if the timeout was configured by a user to 0.
The following works:
echo 2147483647 > /sys/class/firmware/timeout
But both of the following set the timeout to MAX_JIFFY_OFFSET even if
we display 0 back to userspace:
echo 2147483648 > /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0
echo 0 > /sys/class/firmware/timeout
cat /sys/class/firmware/timeout
0
A max value of INT_MAX (2147483647) seconds is therefore implicit due to
another cast with simple_strtol().
This fixes the secondary cast (the first one is simple_strtol(), but it's an
issue only in that it forces an implicit limit) by re-using the timeout
variable and only setting retval in appropriate cases.
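The truncation can be reproduced in userspace. What follows is only a
sketch under assumptions, not the kernel code: it assumes a 64-bit long,
and MODEL_MAX_JIFFY_OFFSET, narrow_to_int32() and the two wait_succeeded_*()
helpers are hypothetical names made up for illustration. It shows that the
low 32 bits of MAX_JIFFY_OFFSET read back as -2 once stored in an int, so a
"remaining time > 0" check no longer passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of MAX_JIFFY_OFFSET on a 64-bit build: (LONG_MAX >> 1) - 1. */
#define MODEL_MAX_JIFFY_OFFSET ((INT64_MAX >> 1) - 1)

/*
 * Keep only the low 32 bits of v and reinterpret them as a signed value.
 * This spells out, in well-defined arithmetic, what a plain long-to-int
 * assignment does on common two's-complement compilers.
 */
static int32_t narrow_to_int32(int64_t v)
{
	uint32_t low = (uint32_t)v;

	if (low <= (uint32_t)INT32_MAX)
		return (int32_t)low;
	return (int32_t)(low - 0x80000000u) + INT32_MIN;
}

/* Buggy pattern: the remaining time is stored in an int before the check. */
static bool wait_succeeded_buggy(int64_t remaining)
{
	int32_t retval = narrow_to_int32(remaining);

	return retval > 0;
}

/* Fixed pattern: the remaining time stays in a long-sized variable. */
static bool wait_succeeded_fixed(int64_t remaining)
{
	assert(remaining <= MODEL_MAX_JIFFY_OFFSET);
	return remaining > 0;
}
```

Small timeouts behave the same in both helpers; only values whose low 32
bits have the sign bit set are misread by the buggy variant.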
Lastly, it is worth noting that systemd ripped out the UMH firmware
fallback mechanism from udev in 2014 via commit be2ea723b1d023b3d
("udev: remove userspace firmware loading support"), so as of systemd v217.
Signed-off-by: Yves-Alexis Perez <corsac@corsac.net>
Fixes: 68ff2a00dbf5 "firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()"
Cc: Luis R. Rodriguez <mcgrof@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Bjorn Andersson <bjorn.andersson@linaro.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
[mcgrof@kernel.org: gave commit log a whole lot of love]
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Convert the firmware core to use class_groups instead of class_attrs as
that's the correct way to handle lists of class attribute files.
Cc: Ming Lei <ming.lei@canonical.com>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Some systems are memory constrained but they need to load very large
firmwares. The firmware subsystem allows drivers to request this
firmware be loaded from the filesystem, but this requires that the
entire firmware be loaded into kernel memory first before it's provided
to the driver. This can lead to a situation where we map the firmware
twice, once to load the firmware into kernel memory and once to copy the
firmware into the final resting place.
This creates needless memory pressure and delays loading because we have
to copy from kernel memory to somewhere else. Let's add a
request_firmware_into_buf() API that allows drivers to request firmware
be loaded directly into a pre-allocated buffer. This skips the
intermediate step of allocating a buffer in kernel memory to hold the
firmware image while it's read from the filesystem. It also requires
that drivers know how much memory they'll require before requesting the
firmware and negates any benefits of firmware caching because the
firmware layer doesn't manage the buffer lifetime.
For a 16MB buffer, about half the time is spent performing a memcpy from
the buffer to the final resting place. I see loading times go from
0.081171 seconds to 0.047696 seconds after applying this patch. Plus
the vmalloc pressure is reduced.
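As a rough userspace analogy only (this is not the kernel API, and
read_bounded(), load_via_copy() and load_into_buf() are hypothetical
helpers), the difference between the two approaches can be sketched as
follows. The first loader reads into a temporary allocation and then
copies; the second reads straight into the caller's buffer and needs
neither the extra allocation nor the memcpy.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read at most size bytes; return the byte count, or -1 on error or if
 * the file is larger than the buffer. */
static long read_bounded(FILE *f, void *dst, size_t size)
{
	size_t n = fread(dst, 1, size, f);

	if (ferror(f))
		return -1;
	if (n == size && fgetc(f) != EOF)
		return -1;
	return (long)n;
}

/* Two-step load: temporary buffer first, then a copy to the final place. */
static long load_via_copy(const char *path, void *dst, size_t size)
{
	FILE *f;
	void *tmp;
	long n;

	assert(dst && size > 0);
	f = fopen(path, "rb");
	if (!f)
		return -1;
	tmp = malloc(size);
	if (!tmp) {
		fclose(f);
		return -1;
	}
	n = read_bounded(f, tmp, size);
	if (n >= 0)
		memcpy(dst, tmp, (size_t)n);
	free(tmp);
	fclose(f);
	return n;
}

/* Direct load: the caller's pre-allocated buffer is the only buffer. */
static long load_into_buf(const char *path, void *dst, size_t size)
{
	FILE *f;
	long n;

	assert(dst && size > 0);
	f = fopen(path, "rb");
	if (!f)
		return -1;
	n = read_bounded(f, dst, size);
	fclose(f);
	return n;
}
```

As in the description above, the direct variant only works when the caller
knows an upper bound on the image size before asking for it.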
This is based on a patch from Vikram Mulukutla on codeaurora.org:
https://www.codeaurora.org/cgit/quic/la/kernel/msm-3.18/commit/drivers/base/firmware_class.c?h=rel/msm-3.18&id=0a328c5f6cd999f5c591f172216835636f39bcb5
Link: http://lkml.kernel.org/r/20160607164741.31849-4-stephen.boyd@linaro.org
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Mark Brown <broonie@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some low memory systems with complex peripherals cannot afford to have
the relatively large firmware images taking up valuable memory during
suspend and resume. Change the internal implementation of
firmware_class to disallow caching based on a configurable option. In
the near future, variants of request_firmware will take advantage of
this feature.
Link: http://lkml.kernel.org/r/20160607164741.31849-3-stephen.boyd@linaro.org
[stephen.boyd@linaro.org: Drop firmware_desc design and use flags]
Signed-off-by: Vikram Mulukutla <markivx@codeaurora.org>
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Some systems are memory constrained but they need to load very large
firmwares. The firmware subsystem allows drivers to request this
firmware be loaded from the filesystem, but this requires that the
entire firmware be loaded into kernel memory first before it's provided
to the driver. This can lead to a situation where we map the firmware
twice, once to load the firmware into kernel memory and once to copy the
firmware into the final resting place.
This design creates needless memory pressure and delays loading because
we have to copy from kernel memory to somewhere else. This patch set
adds support to the request firmware API to load the firmware directly
into a pre-allocated buffer, skipping the intermediate copying step and
alleviating memory pressure during firmware loading. The drawback is
that we can't use the firmware caching feature because the memory for
the firmware cache is not managed by the firmware layer.
This patch (of 3):
We use similar structured code to read and write the kmapped firmware
pages. The only difference is read copies from the kmap region and
write copies to it. Consolidate this into one function to reduce
duplication.
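The shape of such a consolidated helper can be sketched in userspace. This
is an illustration under assumptions, not the kernel function: the pages
are plain arrays instead of kmapped pages, and MODEL_PAGE_SIZE and
model_fw_rw() are hypothetical names. A single loop walks the pages and
only the direction of the memcpy() depends on the read flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Deliberately tiny page size so page crossings are easy to exercise. */
#define MODEL_PAGE_SIZE 16u

/*
 * Copy count bytes between buffer and paged storage, starting at offset.
 * read == true copies pages -> buffer, read == false copies buffer -> pages.
 * The caller guarantees that offset + count fits in the page array.
 */
static void model_fw_rw(unsigned char **pages, unsigned char *buffer,
			size_t offset, size_t count, bool read)
{
	assert(pages && buffer);

	while (count) {
		size_t page_nr = offset / MODEL_PAGE_SIZE;
		size_t page_ofs = offset % MODEL_PAGE_SIZE;
		size_t chunk = MODEL_PAGE_SIZE - page_ofs;

		if (chunk > count)
			chunk = count;

		if (read)
			memcpy(buffer, pages[page_nr] + page_ofs, chunk);
		else
			memcpy(pages[page_nr] + page_ofs, buffer, chunk);

		buffer += chunk;
		offset += chunk;
		count -= chunk;
	}
}
```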
Link: http://lkml.kernel.org/r/20160607164741.31849-2-stephen.boyd@linaro.org
Signed-off-by: Stephen Boyd <stephen.boyd@linaro.org>
Cc: Vikram Mulukutla <markivx@codeaurora.org>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc
Pull char/misc updates from Greg KH:
"Here is the big char/misc driver update for 4.6-rc1.
The majority of the patches here is hwtracing and some new mic
drivers, but there's a lot of other driver updates as well. Full
details in the shortlog.
All have been in linux-next for a while with no reported issues"
* tag 'char-misc-4.6-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (238 commits)
goldfish: Fix build error of missing ioremap on UM
nvmem: mediatek: Fix later provider initialization
nvmem: imx-ocotp: Fix return value of imx_ocotp_read
nvmem: Fix dependencies for !HAS_IOMEM archs
char: genrtc: replace blacklist with whitelist
drivers/hwtracing: make coresight-etm-perf.c explicitly non-modular
drivers: char: mem: fix IS_ERROR_VALUE usage
char: xillybus: Fix internal data structure initialization
pch_phub: return -ENODATA if ROM can't be mapped
Drivers: hv: vmbus: Support kexec on ws2012 r2 and above
Drivers: hv: vmbus: Support handling messages on multiple CPUs
Drivers: hv: utils: Remove util transport handler from list if registration fails
Drivers: hv: util: Pass the channel information during the init call
Drivers: hv: vmbus: avoid unneeded compiler optimizations in vmbus_wait_for_unload()
Drivers: hv: vmbus: remove code duplication in message handling
Drivers: hv: vmbus: avoid wait_for_completion() on crash
Drivers: hv: vmbus: don't loose HVMSG_TIMER_EXPIRED messages
misc: at24: replace memory_accessor with nvmem_device_read
eeprom: 93xx46: extend driver to plug into the NVMEM framework
eeprom: at25: extend driver to plug into the NVMEM framework
...
|
|
When we now use the new kernel_read_file_from_path() we
are reporting a failure when we iterate over all the paths
possible for firmware. Before using kernel_read_file_from_path()
we only reported a failure once we confirmed a file existed
with filp_open() but failed with fw_read_file_contents().
With kernel_read_file_from_path() both are done for us and
we obviously are now reporting too much information given that
some optional paths will always fail and clutter the logs.
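One way to keep the expected misses quiet while still reporting other
errors can be sketched in userspace as below. This is a model only:
model_load(), model_read_fn and model_fake_read() are hypothetical, the
errno values assume a POSIX <errno.h>, and the kernel code uses dev_dbg()
and dev_warn() rather than a counter and stderr.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

typedef int (*model_read_fn)(const char *path);

/* Hypothetical reader used for illustration only. */
static int model_fake_read(const char *path)
{
	if (strncmp(path, "/missing/", 9) == 0)
		return -ENOENT;
	if (strncmp(path, "/broken/", 8) == 0)
		return -EIO;
	return 0;
}

/*
 * Try each directory in turn. A missing file is expected for optional
 * paths and stays silent; any other error is counted and reported.
 */
static int model_load(const char *const *dirs, size_t nr_dirs,
		      const char *name, model_read_fn read_file,
		      int *warnings)
{
	char path[256];
	int rc = -ENOENT;
	size_t i;

	assert(warnings);

	for (i = 0; i < nr_dirs; i++) {
		int len = snprintf(path, sizeof(path), "%s/%s", dirs[i], name);

		if (len < 0 || (size_t)len >= sizeof(path)) {
			rc = -ENAMETOOLONG;
			break;
		}

		rc = read_file(path);
		if (rc == 0)
			break;

		if (rc != -ENOENT) {
			fprintf(stderr, "loading %s failed with error %d\n",
				path, rc);
			(*warnings)++;
		}
	}
	return rc;
}
```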
fw_get_filesystem_firmware() already has a check for failure
and uses an internal flag, FW_OPT_NO_WARN, but this does not
let us capture other unexpected errors. This enables that
as changed by Neil via commit:
"firmware: Be a bit more verbose about direct firmware loading failure"
Reported-by: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Heiner Kallweit <hkallweit1@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
Replace fw_read_file_contents() with kernel_read_file_from_path().
Although none of the upstreamed LSMs define a kernel_fw_from_file hook,
IMA is called by the security function to prevent unsigned firmware from
being loaded and to measure/appraise signed firmware, based on policy.
Instead of reading the firmware twice, once for measuring/appraising the
firmware and again for reading the firmware contents into memory, the
kernel_post_read_file() security hook calculates the file hash based on
the in memory file buffer. The firmware is read once.
This patch removes the LSM kernel_fw_from_file() hook and security call.
Changelog v4+:
- revert dropped buf->size assignment - reported by Sergey Senozhatsky
v3:
- remove kernel_fw_from_file hook
- use kernel_read_file_from_path() - requested by Luis
v2:
- reordered and squashed firmware patches
- fix MAX firmware size (Kees Cook)
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
|
|
This makes the error and success paths more readable while trying to
load firmware from the filesystem.
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: David Howells <dhowells@redhat.com>
Acked-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
|
|
This will be re-used later through a new extensible interface.
Reviewed-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
|
|
Simplify a few of the *generic* shared dev_warn() and dev_dbg()
print messages for three reasons:
0) Historically firmware_class code was added to help
get device driver firmware binaries but these days
request_firmware*() helpers are being repurposed for
general *system data* needed by the kernel.
1) This will also help generalize shared code as much as possible
later in the future in consideration for a new extensible firmware
API which will enable to separate usermode helper code out as much
as possible.
2) Kees Cook pointed out that the prints already have the device
associated as dev_*() helpers are used, that should help identify
the user and case in which the helpers are used. That should provide
enough context and simplifies the messages further.
v4: generalize debug/warn messages even further as suggested by
Kees Cook.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Vojtěch Pavlík <vojtech@suse.cz>
Cc: Kyle McMartin <kyle@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
No need to use contiguous memory; the allocation may fail
when memory is deeply fragmented.
Signed-off-by: Chen Feng <puck.chen@hisilicon.com>
Signed-off-by: Xia Qing <saberlily.xia@hisilicon.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Simplify a few of the *generic* shared dev_warn() and dev_dbg()
print messages for three reasons:
0) Historically firmware_class code was added to help
get device driver firmware binaries but these days
request_firmware*() helpers are being repurposed for
general *system data* needed by the kernel.
1) This will also help generalize shared code as much as possible
later in the future in consideration for a new extensible firmware
API which will enable to separate usermode helper code out as much
as possible.
2) Kees Cook pointed out that the prints already have the device
associated as dev_*() helpers are used, that should help identify
the user and case in which the helpers are used. That should provide
enough context and simplifies the messages further.
v4: generalize debug/warn messages even further as suggested by
Kees Cook.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Vojtěch Pavlík <vojtech@suse.cz>
Cc: Kyle McMartin <kyle@kernel.org>
Cc: Matthew Garrett <mjg59@srcf.ucam.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@kernel.org>
Signed-off-by: Mimi Zohar <zohar@linux.vnet.ibm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The kerneldoc for request_firmware_nowait() says that it may call the
provided cont() callback with @fw == NULL, if the firmware request
fails. However, this is not the case when called with an empty string
(""). This case is short-circuited by the 'name[0] == '\0'' check
introduced in commit 471b095dfe0d ("firmware_class: make sure fw requests
contain a name"), so _request_firmware() never gets to set the fw to
NULL.
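The failure mode can be modelled in userspace. The sketch below is not the
kernel code and not necessarily how the fix is implemented; struct
model_fw, model_request(), model_request_nowait() and model_record() are
hypothetical. It only illustrates the contract from the kerneldoc: every
failure path, including the empty-name check, clears the pointer that is
later handed to the callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct model_fw {
	int id;
};

typedef void (*model_cont_t)(const struct model_fw *fw, void *context);

static const struct model_fw model_known_fw = { 42 };
static const struct model_fw model_stale_fw = { -1 };

/* Look up a firmware by name; on any failure *fw_p is set to NULL. */
static int model_request(const struct model_fw **fw_p, const char *name)
{
	int ret;

	assert(fw_p);

	if (!name || name[0] == '\0') {
		ret = -1;
		goto out;
	}
	if (strcmp(name, "known.bin") != 0) {
		ret = -2;
		goto out;
	}
	*fw_p = &model_known_fw;
	return 0;
out:
	*fw_p = NULL;	/* shared exit: no failure leaves a stale pointer */
	return ret;
}

/* Model of the async path: the callback always runs, even on failure. */
static void model_request_nowait(const char *name, model_cont_t cont,
				 void *context)
{
	/* Starts as a stale non-NULL pointer to mimic leftover data. */
	const struct model_fw *fw = &model_stale_fw;

	model_request(&fw, name);
	cont(fw, context);
}

/* Example callback: records the pointer it was handed. */
static void model_record(const struct model_fw *fw, void *context)
{
	*(const struct model_fw **)context = fw;
}
```

If the empty-name branch returned directly instead of going through the
shared exit, the callback would receive the stale pointer, which is the
situation described above.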
Noticed while using the new 'trigger_async_request' testing hook:
# printf '\x00' > /sys/devices/virtual/misc/test_firmware/trigger_async_request
[10553.726178] test_firmware: loading ''
[10553.729859] test_firmware: loaded: 995209091
# printf '\x00' > /sys/devices/virtual/misc/test_firmware/trigger_async_request
[10733.676184] test_firmware: loading ''
[10733.679855] Unable to handle kernel NULL pointer dereference at virtual address 00000004
[10733.687951] pgd = ec188000
[10733.690655] [00000004] *pgd=00000000
[10733.694240] Internal error: Oops: 5 [#1] SMP ARM
[10733.698847] Modules linked in: btmrvl_sdio btmrvl bluetooth sbs_battery nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6_tables asix usbnet mwifiex_sdio mwifiex cfg80211 jitterentropy_rng drbg joydev snd_seq_midi snd_seq_midi_event snd_rawmidi snd_seq snd_seq_device ppp_async ppp_generic slhc tun
[10733.725670] CPU: 0 PID: 6600 Comm: bash Not tainted 4.4.0-rc4-00351-g63d0877 #178
[10733.733137] Hardware name: Rockchip (Device Tree)
[10733.737831] task: ed24f6c0 ti: ee322000 task.ti: ee322000
[10733.743222] PC is at do_raw_spin_lock+0x18/0x1a0
[10733.747831] LR is at _raw_spin_lock+0x18/0x1c
[10733.752180] pc : [<c00653a0>] lr : [<c054c204>] psr: a00d0013
[10733.752180] sp : ee323df8 ip : ee323e20 fp : ee323e1c
[10733.763634] r10: 00000051 r9 : b6f18000 r8 : ee323f80
[10733.768847] r7 : c089cebc r6 : 00000001 r5 : 00000000 r4 : ec0e6000
[10733.775360] r3 : dead4ead r2 : c06bd140 r1 : eef913b4 r0 : 00000000
[10733.781874] Flags: NzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[10733.788995] Control: 10c5387d Table: 2c18806a DAC: 00000051
[10733.794728] Process bash (pid: 6600, stack limit = 0xee322218)
[10733.800549] Stack: (0xee323df8 to 0xee324000)
[10733.804896] 3de0: ec0e6000 00000000
[10733.813059] 3e00: 00000001 c089cebc ee323f80 b6f18000 ee323e2c ee323e20 c054c204 c0065394
[10733.821221] 3e20: ee323e44 ee323e30 c02fec60 c054c1f8 ec0e7ec0 ec3fcfc0 ee323e5c ee323e48
[10733.829384] 3e40: c02fed08 c02fec48 c07dbf74 eeb05a00 ee323e8c ee323e60 c0253828 c02fecac
[10733.837547] 3e60: 00000001 c0116950 ee323eac ee323e78 00000001 ec3fce00 ed2d9700 ed2d970c
[10733.845710] 3e80: ee323e9c ee323e90 c02e873c c02537d4 ee323eac ee323ea0 c017bd40 c02e8720
[10733.853873] 3ea0: ee323ee4 ee323eb0 c017b250 c017bd00 00000000 00000000 f3e47a54 ec128b00
[10733.862035] 3ec0: c017b10c ee323f80 00000001 c000f504 ee322000 00000000 ee323f4c ee323ee8
[10733.870197] 3ee0: c011b71c c017b118 ee323fb0 c011bc90 becfa8d9 00000001 ec128b00 00000001
[10733.878359] 3f00: b6f18000 ee323f80 ee323f4c ee323f18 c011bc90 c0063950 ee323f3c ee323f28
[10733.886522] 3f20: c0063950 c0549138 00000001 ec128b00 00000001 ec128b00 b6f18000 ee323f80
[10733.894684] 3f40: ee323f7c ee323f50 c011bed8 c011b6ec c0135fb8 c0135f24 ec128b00 ec128b00
[10733.902847] 3f60: 00000001 b6f18000 c000f504 ee322000 ee323fa4 ee323f80 c011c664 c011be24
[10733.911009] 3f80: 00000000 00000000 00000001 b6f18000 b6e79be0 00000004 00000000 ee323fa8
[10733.919172] 3fa0: c000f340 c011c618 00000001 b6f18000 00000001 b6f18000 00000001 00000000
[10733.927334] 3fc0: 00000001 b6f18000 b6e79be0 00000004 00000001 00000001 8068a3f1 b6e79c84
[10733.935496] 3fe0: 00000000 becfa7dc b6de194d b6e20246 400d0030 00000001 7a4536e8 49bda390
[10733.943664] [<c00653a0>] (do_raw_spin_lock) from [<c054c204>] (_raw_spin_lock+0x18/0x1c)
[10733.951743] [<c054c204>] (_raw_spin_lock) from [<c02fec60>] (fw_free_buf+0x24/0x64)
[10733.959388] [<c02fec60>] (fw_free_buf) from [<c02fed08>] (release_firmware+0x68/0x74)
[10733.967207] [<c02fed08>] (release_firmware) from [<c0253828>] (trigger_async_request_store+0x60/0x124)
[10733.976501] [<c0253828>] (trigger_async_request_store) from [<c02e873c>] (dev_attr_store+0x28/0x34)
[10733.985533] [<c02e873c>] (dev_attr_store) from [<c017bd40>] (sysfs_kf_write+0x4c/0x58)
[10733.993437] [<c017bd40>] (sysfs_kf_write) from [<c017b250>] (kernfs_fop_write+0x144/0x1a8)
[10734.001689] [<c017b250>] (kernfs_fop_write) from [<c011b71c>] (__vfs_write+0x3c/0xe4)
After this patch:
# printf '\x00' > /sys/devices/virtual/misc/test_firmware/trigger_async_request
[ 32.126322] test_firmware: loading ''
[ 32.129995] test_firmware: failed to async load firmware
-bash: printf: write error: No such device
Fixes: 471b095dfe0d ("firmware_class: make sure fw requests contain a name")
Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
|
|
Device resource data allocated with devres_alloc() must be deallocated
by devres_free().
Signed-off-by: Vladimir Zapolskiy <vz@mleia.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The firmware class uevent function accessed the "fw_priv->buf" buffer
without the proper locking and testing for NULL. This is an old bug
(looks like it goes back to 2012 and commit 1244691c73b2: "firmware
loader: introduce firmware_buf"), but for some reason it's triggering
only now in 4.2-rc1.
Shuah Khan is trying to bisect what it is that causes this to trigger
more easily, but in the meantime let's just fix the bug since others are
hitting it too (at least Ingo reports having seen it as well).
Reported-and-tested-by: Shuah Khan <shuahkh@osg.samsung.com>
Acked-by: Ming Lei <ming.lei@canonical.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The recent fix to use kstrdup_const() failed to add a
kfree upon failure of name allocation...
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Kyle McMartin <kyle@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
We currently use flexible arrays with a char at the
end for the remaining internal firmware name uses.
There are two limitations with the way we use this.
Since we're using a flexible array for a string on the
struct if we wanted to use two strings it means we'd
have a disjoint means of handling the strings, one
using the flexible array, and another a char * pointer.
We're also currently not using 'const' for the string.
We wish to later extend some firmware data structures
with other string/char pointers, but we also want to be
very pedantic about const usage. Since we're going to
change things to use 'const' we might as well also address
a unified way to use multiple strings on the structs.
Replace the flexible array practice for strings with
kstrdup_const() and kfree_const(), this will avoid
allocations when the vmlinux .rodata is used, and just
allocate a new proper string for us when needed. This
also means we can simplify the struct allocations by
removing the string length from the allocation size
computation, which would otherwise get even more
complicated when supporting multiple strings.
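The allocation difference can be illustrated with a small userspace
sketch. It rests on assumptions: struct buf_flex, struct buf_ptr and the
helpers are hypothetical, and model_strdup() always copies, whereas
kstrdup_const() can skip the copy for strings living in .rodata. With the
flexible array the string length is part of the struct allocation size;
with a const pointer the struct has a fixed size and each string is
handled on its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Userspace stand-in for a string duplication helper. */
static char *model_strdup(const char *s)
{
	size_t len = strlen(s) + 1;
	char *copy = malloc(len);

	if (copy)
		memcpy(copy, s, len);
	return copy;
}

/* Old layout: the single string lives in a flexible array at the end. */
struct buf_flex {
	unsigned long size;
	char fw_id[];
};

static struct buf_flex *buf_flex_new(const char *fw_id)
{
	size_t len = strlen(fw_id) + 1;
	struct buf_flex *buf = malloc(sizeof(*buf) + len);

	if (!buf)
		return NULL;
	buf->size = 0;
	memcpy(buf->fw_id, fw_id, len);
	return buf;
}

/* New layout: a const pointer, so more strings can be added the same way. */
struct buf_ptr {
	unsigned long size;
	const char *fw_id;
};

static struct buf_ptr *buf_ptr_new(const char *fw_id)
{
	struct buf_ptr *buf = malloc(sizeof(*buf));

	assert(fw_id);
	if (!buf)
		return NULL;
	buf->size = 0;
	buf->fw_id = model_strdup(fw_id);
	if (!buf->fw_id) {
		free(buf);	/* do not leak the struct if the name fails */
		return NULL;
	}
	return buf;
}

static void buf_ptr_free(struct buf_ptr *buf)
{
	if (!buf)
		return;
	free((char *)buf->fw_id);
	free(buf);
}
```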
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Kyle McMartin <kyle@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Asynchronous firmware loading copies the pointer to the
name passed as an argument only to be scheduled later and
used. This behaviour works well for synchronous calling
but in asynchronous mode there's a chance the caller could
immediately free the passed string after making the
asynchronous call. This could trigger a use after free
having the kernel look on disk for arbitrary file names.
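The lifetime problem can be shown with a userspace model. This is a sketch
only: struct model_fw_work, model_schedule() and model_run() are
hypothetical names, and plain malloc()/free() stand in for the kernel
helpers. The work item takes its own copy of the name when it is
scheduled, so the caller may free or reuse its string before the deferred
part runs.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct model_fw_work {
	char *name;	/* private copy, owned by the work item */
};

/* "Schedule" a request: copy the name instead of keeping the pointer. */
static struct model_fw_work *model_schedule(const char *name)
{
	struct model_fw_work *work;
	size_t len;

	assert(name);
	len = strlen(name) + 1;

	work = malloc(sizeof(*work));
	if (!work)
		return NULL;

	work->name = malloc(len);
	if (!work->name) {
		free(work);
		return NULL;
	}
	memcpy(work->name, name, len);
	return work;
}

/* Deferred part: report the name it sees, then release name and work. */
static void model_run(struct model_fw_work *work, char *seen,
		      size_t seen_size)
{
	snprintf(seen, seen_size, "%s", work->name);
	free(work->name);
	free(work);
}
```

Had model_schedule() stored the caller's pointer instead, model_run()
would read whatever the caller left in that memory.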
In order to force-test the issue you can use a test-driver
designed to illustrate this issue on github [0], use the
next-20150505-fix-use-after-free branch.
With this patch applied you get:
[ 283.512445] firmware name: test_module_stuff.bin
[ 287.514020] firmware name: test_module_stuff.bin
[ 287.532489] firmware found
Without this patch applied you can end up with something such as:
[ 135.624216] firmware name: \xffffff80BJ
[ 135.624249] platform fake-dev.0: Direct firmware load for \xffffff80Bi failed with error -2
[ 135.624252] No firmware found
[ 135.624252] firmware found
Unfortunately, in the worst and most common case you can
typically crash your system with a page fault by trying to
free something which you cannot, and/or hit a NULL pointer
dereference [1].
The issue and its fix when using schedule_work() for asynchronous
runs are generalized in the following SmPL grammar patch.
When applied to next-20150505 only the firmware_class
code is affected. This grammar patch can and should be further
generalized to vet for other kernel asynchronous
mechanisms.
@ calls_schedule_work @
type T;
T *priv_work;
identifier func, work_func;
identifier work;
identifier priv_name, name;
expression gfp;
@@
func(..., const char *name, ...)
{
...
priv_work = kzalloc(sizeof(T), gfp);
...
- priv_work->priv_name = name;
+ priv_work->priv_name = kstrdup_const(name, gfp);
...
(... when any
if (...)
{
...
+ kfree_const(priv_work->priv_name);
kfree(priv_work);
...
}
) ... when any
INIT_WORK(&priv_work->work, work_func);
...
schedule_work(&priv_work->work);
...
}
@ the_work_func depends on calls_schedule_work @
type calls_schedule_work.T;
T *priv_work;
identifier calls_schedule_work.work_func;
identifier calls_schedule_work.priv_name;
identifier calls_schedule_work.work;
identifier some_work;
@@
work_func(...)
{
...
priv_work = container_of(some_work, T, work);
...
+ kfree_const(priv_work->priv_name);
kfree(priv_work);
...
}
[0] https://github.com/mcgrof/fake-firmware-test.git
[1] The following kernel ring buffer splat:
firmware name: test_module_stuff.bin
firmware name:
firmware found
general protection fault: 0000 [#1] SMP
Modules linked in: test(O) <...etc-it-does-not-matter>
drm sr_mod cdrom xhci_pci xhci_hcd rtsx_pci mfd_core video button sg
CPU: 3 PID: 87 Comm: kworker/3:2 Tainted: G O 4.0.0-00010-g22b5bb0-dirty #176
Hardware name: LENOVO 20AW000LUS/20AW000LUS, BIOS GLET43WW (1.18 ) 12/04/2013
Workqueue: events request_firmware_work_func
task: ffff8800c7f8e290 ti: ffff8800c7f94000 task.ti: ffff8800c7f94000
RIP: 0010:[<ffffffff814a586c>] [<ffffffff814a586c>] fw_free_buf+0xc/0x40
RSP: 0000:ffff8800c7f97d78 EFLAGS: 00010286
RAX: ffffffff81ae3700 RBX: ffffffff816d1181 RCX: 0000000000000006
RDX: 0001ee850ff68500 RSI: 0000000000000246 RDI: c35d5f415e415d41
RBP: ffff8800c7f97d88 R08: 000000000000000a R09: 0000000000000000
R10: 0000000000000358 R11: ffff8800c7f97a7e R12: ffff8800c7ec1e80
R13: ffff88021e2d4cc0 R14: ffff88021e2dff00 R15: 00000000000000c0
FS: 0000000000000000(0000) GS:ffff88021e2c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000034b8cd8 CR3: 000000021073c000 CR4: 00000000001407e0
Stack:
ffffffff816d1181 ffff8800c7ec1e80 ffff8800c7f97da8 ffffffff814a58f8
000000000000000a ffffffff816d1181 ffff8800c7f97dc8 ffffffffa047002c
ffff88021e2dff00 ffff8802116ac1c0 ffff8800c7f97df8 ffffffff814a65fe
Call Trace:
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffff814a58f8>] release_firmware+0x58/0x80
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffffa047002c>] test_mod_cb+0x2c/0x43 [test]
[<ffffffff814a65fe>] request_firmware_work_func+0x5e/0x80
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffff8108d23a>] process_one_work+0x14a/0x3f0
[<ffffffff8108d911>] worker_thread+0x121/0x460
[<ffffffff8108d7f0>] ? rescuer_thread+0x310/0x310
[<ffffffff810928f9>] kthread+0xc9/0xe0
[<ffffffff81092830>] ? kthread_create_on_node+0x180/0x180
[<ffffffff816d52d8>] ret_from_fork+0x58/0x90
[<ffffffff81092830>] ? kthread_create_on_node+0x180/0x180
Code: c7 c6 dd ad a3 81 48 c7 c7 20 97 ce 81 31 c0 e8 0b b2 ed ff e9 78 ff ff ff 66 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 <4c> 8b 67 38 48 89 fb 4c 89 e7 e8 85 f7 22 00 f0 83 2b 01 74 0f
RIP [<ffffffff814a586c>] fw_free_buf+0xc/0x40
RSP <ffff8800c7f97d78>
---[ end trace 4e62c56a58d0eac1 ]---
BUG: unable to handle kernel paging request at ffffffffffffffd8
IP: [<ffffffff81093ee0>] kthread_data+0x10/0x20
PGD 1c13067 PUD 1c15067 PMD 0
Oops: 0000 [#2] SMP
Modules linked in: test(O) <...etc-it-does-not-matter>
drm sr_mod cdrom xhci_pci xhci_hcd rtsx_pci mfd_core video button sg
CPU: 3 PID: 87 Comm: kworker/3:2 Tainted: G D O 4.0.0-00010-g22b5bb0-dirty #176
Hardware name: LENOVO 20AW000LUS/20AW000LUS, BIOS GLET43WW (1.18 ) 12/04/2013
task: ffff8800c7f8e290 ti: ffff8800c7f94000 task.ti: ffff8800c7f94000
RIP: 0010:[<ffffffff81092ee0>] [<ffffffff81092ee0>] kthread_data+0x10/0x20
RSP: 0018:ffff8800c7f97b18 EFLAGS: 00010096
RAX: 0000000000000000 RBX: 0000000000000003 RCX: 000000000000000d
RDX: 0000000000000003 RSI: 0000000000000003 RDI: ffff8800c7f8e290
RBP: ffff8800c7f97b18 R08: 000000000000bc00 R09: 0000000000007e76
R10: 0000000000000001 R11: 000000000000002f R12: ffff8800c7f8e290
R13: 00000000000154c0 R14: 0000000000000003 R15: 0000000000000000
FS: 0000000000000000(0000) GS:ffff88021e2c0000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000210675000 CR4: 00000000001407e0
Stack:
ffff8800c7f97b38 ffffffff8108dcd5 ffff8800c7f97b38 ffff88021e2d54c0
ffff8800c7f97b88 ffffffff816d1500 ffff880213d42368 ffff8800c7f8e290
ffff8800c7f97b88 ffff8800c7f97fd8 ffff8800c7f8e710 0000000000000246
Call Trace:
[<ffffffff8108dcd5>] wq_worker_sleeping+0x15/0xa0
[<ffffffff816d1500>] __schedule+0x6e0/0x940
[<ffffffff816d1797>] schedule+0x37/0x90
[<ffffffff810779bc>] do_exit+0x6bc/0xb40
[<ffffffff8101898f>] oops_end+0x9f/0xe0
[<ffffffff81018efb>] die+0x4b/0x70
[<ffffffff81015622>] do_general_protection+0xe2/0x170
[<ffffffff816d74e8>] general_protection+0x28/0x30
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffff814a586c>] ? fw_free_buf+0xc/0x40
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffff814a58f8>] release_firmware+0x58/0x80
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffffa047002c>] test_mod_cb+0x2c/0x43 [test]
[<ffffffff814a65fe>] request_firmware_work_func+0x5e/0x80
[<ffffffff816d1181>] ? __schedule+0x361/0x940
[<ffffffff8108d23a>] process_one_work+0x14a/0x3f0
[<ffffffff8108d911>] worker_thread+0x121/0x460
[<ffffffff8108d7f0>] ? rescuer_thread+0x310/0x310
[<ffffffff810928f9>] kthread+0xc9/0xe0
[<ffffffff81092830>] ? kthread_create_on_node+0x180/0x180
[<ffffffff816d52d8>] ret_from_fork+0x58/0x90
[<ffffffff81092830>] ? kthread_create_on_node+0x180/0x180
Code: 00 48 89 e5 5d 48 8b 40 c8 48 c1 e8 02 83 e0 01 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 8b 87 30 05 00 00 55 48 89 e5 <48> 8b 40 d8 5d c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00
RIP [<ffffffff81092ee0>] kthread_data+0x10/0x20
RSP <ffff8800c7f97b18>
CR2: ffffffffffffffd8
---[ end trace 4e62c56a58d0eac2 ]---
Fixing recursive fault but reboot is needed!
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Seth Forshee <seth.forshee@canonical.com>
Cc: Kyle McMartin <kyle@kernel.org>
Generated-by: Coccinelle SmPL
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When direct firmware loading is used we iterate over a list
of possible firmware paths and concatenate the desired firmware
name with each path and look for the file there. Should the
passed firmware name be too long we end up truncating the
file we want to look for, the search however is still done.
Add a check for truncation instead of looking for a
truncated firmware filename.
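A truncation check of this kind can be sketched in userspace with
snprintf(), which returns the length the full string would have needed.
This is an illustration, not the kernel hunk: build_fw_path() is a
hypothetical helper and ENAMETOOLONG assumes a POSIX <errno.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Build "<dir>/<name>" into path. Returns 0 on success and
 * -ENAMETOOLONG if the result did not fit, so the caller never
 * searches for a truncated file name.
 */
static int build_fw_path(char *path, size_t path_size,
			 const char *dir, const char *name)
{
	int len;

	assert(path && path_size > 0 && dir && name);

	len = snprintf(path, path_size, "%s/%s", dir, name);
	if (len < 0)
		return -EINVAL;
	if ((size_t)len >= path_size)
		return -ENAMETOOLONG;
	return 0;
}
```

A caller iterating over several directories would skip the lookup for that
entry, or stop, as soon as the helper reports an error.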
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
The request_firmware*() APIs use __getname() to iterate
over the list of paths possible for firmware to be found,
the code however never checked for failure on __getname().
Although *very unlikely*, this can still happen. Add the
missing check.
There are still no checks on the concatenation of the path
and filename passed, that requires a bit more work and
subsequent patches address this. The commit that introduced
this is abb139e7 ("firmware: teach the kernel to load
firmware files directly from the filesystem").
mcgrof@ergon ~/linux (git::firmware-fixes) $ git describe --contains abb139e7
v3.7-rc1~120
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Ming Lei <ming.lei@canonical.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: David Howells <dhowells@redhat.com>
Cc: Kyle McMartin <kyle@kernel.org>
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
When using the user mode helper to load firmware, the function _request_firmware
gets a positive return value from fw_load_from_user_helper and because of this
the firmware buffer is not assigned. The buffer is assigned only when the return
value is zero. This patch fixes this problem in _request_firmware_load: when the
completion is ready the return value is set to zero.
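The return-value convention involved can be modelled as follows. This is a
sketch with hypothetical helpers (model_wait_to_status() and
model_assign()); the errno values are illustrative, assume a POSIX
<errno.h>, and are not necessarily the ones the kernel returns. A timed
wait yields a positive remaining time on completion, 0 on timeout and a
negative value when interrupted, while the caller assigns the buffer only
for a status of exactly 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Map a timed-wait result to a 0 / negative-errno status. */
static int model_wait_to_status(long wait_ret)
{
	if (wait_ret > 0)
		return 0;		/* completed with time to spare */
	if (wait_ret == 0)
		return -ETIMEDOUT;	/* illustrative timeout error */
	return (int)wait_ret;		/* interrupted: pass the error on */
}

/* The caller hands out the loaded buffer only when status is zero. */
static int model_assign(int status, const void *loaded, const void **out)
{
	assert(out);
	*out = status ? NULL : loaded;
	return status;
}
```

Passing the raw positive wait result to model_assign() reproduces the
problem described above: the load completed, but the buffer is withheld.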
Signed-off-by: Zahari Doychev <zahari.doychev@linux.com>
Cc: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use the static attribute groups assigned to the device instead of
manual device_create_file() & co calls. It simplifies the code and
can avoid possible races, too.
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Fix coding style issues reported by checkpatch.pl: remove stray
whitespace and fix indentation.
Signed-off-by: Andrei Oprea <andrei.br92@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core patches from Greg KH:
"Really tiny set of patches for this kernel. Nothing major, all
described in the shortlog and have been in linux-next for a while"
* tag 'driver-core-3.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
sysfs: fix warning when creating a sysfs group without attributes
firmware_loader: handle timeout via wait_for_completion_interruptible_timeout()
firmware_loader: abort request if wait_for_completion is interrupted
firmware: Correct function name in comment
device: Change dev_<level> logging functions to return void
device: Fix dev_dbg_once macro
|
|
This patch reduces the kernel size by removing error messages that duplicate
the normal OOM message.
A simplified version of the semantic patch that finds this problem is as
follows: (http://coccinelle.lip6.fr)
@@
identifier f,print,l;
expression e;
constant char[] c;
@@
e = \(kzalloc\|kmalloc\|devm_kzalloc\|devm_kmalloc\)(...);
if (e == NULL) {
<+...
- print(...,c,...);
... when any
(
goto l;
|
return ...;
)
...+> }
Signed-off-by: Quentin Lambert <lambert.quentin@gmail.com>
Acked-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
It is simpler to handle the timeout with wait_for_completion_interruptible_timeout(),
so remove the delayed work previously used for the timeout.
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
If the current request is interrupted by a signal, such as 'ctrl + c',
the request has to be aborted for the following reasons:
- the buf needs to be removed from the pending list
- the same requests from other contexts need to be completed
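The two reasons above can be modelled with a small userspace sketch.
Everything here is hypothetical and only mirrors the description: struct
pending_buf, abort_pending() and handle_wait_result() are not kernel
names, and SKETCH_ERESTARTSYS stands in for the kernel-internal
ERESTARTSYS value, which userspace <errno.h> does not define.

```c
#include <stdbool.h>

/* Assumption: stand-in for the kernel-internal ERESTARTSYS (512). */
#define SKETCH_ERESTARTSYS 512

/* Hypothetical model of one firmware buffer shared by several requests. */
struct pending_buf {
    bool on_pending_list; /* still queued for the user helper */
    bool aborted;
    int  waiters;         /* other contexts waiting for the same firmware */
    int  waiters_woken;
};

/* Abort once: unlink from the pending list and release every waiter. */
static void abort_pending(struct pending_buf *buf)
{
    if (buf->aborted)
        return;
    buf->on_pending_list = false;      /* reason 1: leave the pending list */
    buf->aborted = true;
    buf->waiters_woken = buf->waiters; /* reason 2: complete the other requests */
}

/* Called with the result of an interruptible wait on the buffer. */
static long handle_wait_result(struct pending_buf *buf, long wait_ret)
{
    if (wait_ret == -SKETCH_ERESTARTSYS) /* interrupted by a signal */
        abort_pending(buf);
    return wait_ret;
}
```

If the interrupted request simply returned without aborting, the buffer
would stay on the pending list and the other waiters would never be
released.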
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Use the correct function name in the kernel-doc comment above it.
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ming Lei <ming.lei@canonical.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|