| Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 6bd30d8fc523fb880b4be548e8501bc0fe8f42d4 ]
channel_handler() sets intf->channels_ready to true but never
clears it, so __scan_channels() skips any rescan. When the BMC
firmware changes a rescan is required. Allow it by clearing
the flag before starting a new scan.
Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-3-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 936750fdba4c45e13bbd17f261bb140dd55f5e93 ]
The race window between __scan_channels() and deliver_response() causes
the parameters of some channels to be set to 0.
1.[CPUA] __scan_channels() issues an IPMI request and waits with
wait_event() until all channels have been scanned.
wait_event() internally calls might_sleep(), which might
yield the CPU. (Moreover, an interrupt can preempt
wait_event() and force the task to yield the CPU.)
2.[CPUB] deliver_response() is invoked when the CPU receives the
IPMI response. After processing a IPMI response,
deliver_response() directly assigns intf->wchannels to
intf->channel_list and sets intf->channels_ready to true.
However, not all channels are actually ready for use.
3.[CPUA] Since intf->channels_ready is already true, wait_event()
never enters __wait_event(). __scan_channels() immediately
clears intf->null_user_handler and exits.
4.[CPUB] Once intf->null_user_handler is set to NULL, deliver_response()
ignores further IPMI responses, leaving the remaining
channels zero-initialized and unusable.
CPUA CPUB
------------------------------- -----------------------------
__scan_channels()
intf->null_user_handler
= channel_handler;
send_channel_info_cmd(intf,
0);
wait_event(intf->waitq,
intf->channels_ready);
do {
might_sleep();
deliver_response()
channel_handler()
intf->channel_list =
intf->wchannels + set;
intf->channels_ready = true;
send_channel_info_cmd(intf,
intf->curr_channel);
if (condition)
break;
__wait_event(wq_head,
condition);
} while(0)
intf->null_user_handler
= NULL;
deliver_response()
if (!msg->user)
if (intf->null_user_handler)
rv = -EINVAL;
return rv;
------------------------------- -----------------------------
Fix the race between __scan_channels() and deliver_response() by
deferring both the assignment intf->channel_list = intf->wchannels
and the flag intf->channels_ready = true until all channels have
been successfully scanned or until the IPMI request has failed.
Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-2-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 8ffcb7560b4a15faf821df95e3ab532b2b020f8c ]
The source and destination of some strcpy operations was the same.
Split out the part of the operations that needed to be done for those
particular calls so the unnecessary copy wasn't done.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506140756.EFXXvIP4-lkp@intel.com/
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ec50ec378e3fd83bde9b3d622ceac3509a60b6b5 ]
During BMC firmware upgrades on live systems, the ipmi_msghandler
generates excessive "BMC returned incorrect response" warnings
while the BMC is temporarily offline. This can flood system logs
in large deployments.
Replace dev_warn() with dev_warn_ratelimited() to throttle these
warnings and prevent log spam during BMC maintenance operations.
Signed-off-by: Breno Leitao <leitao@debian.org>
Message-ID: <20250710-ipmi_ratelimit-v1-1-6d417015ebe9@debian.org>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2378bd0b264ad3a1f76bd957caf33ee0c7945351 ]
devm_kasprintf() can return a NULL pointer on failure but this
returned value is not checked.
Fixes: 51bd6f291583 ("Add support for IPMB driver")
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Message-ID: <20240926094419.25900-1-hanchunchao@inspur.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 6cf1a126de2992b4efe1c3c4d398f8de4aed6e3f upstream.
Kmemleak reported the following leak info in try_smi_init():
unreferenced object 0xffff00018ecf9400 (size 1024):
comm "modprobe", pid 2707763, jiffies 4300851415 (age 773.308s)
backtrace:
[<000000004ca5b312>] __kmalloc+0x4b8/0x7b0
[<00000000953b1072>] try_smi_init+0x148/0x5dc [ipmi_si]
[<000000006460d325>] 0xffff800081b10148
[<0000000039206ea5>] do_one_initcall+0x64/0x2a4
[<00000000601399ce>] do_init_module+0x50/0x300
[<000000003c12ba3c>] load_module+0x7a8/0x9e0
[<00000000c246fffe>] __se_sys_init_module+0x104/0x180
[<00000000eea99093>] __arm64_sys_init_module+0x24/0x30
[<0000000021b1ef87>] el0_svc_common.constprop.0+0x94/0x250
[<0000000070f4f8b7>] do_el0_svc+0x48/0xe0
[<000000005a05337f>] el0_svc+0x24/0x3c
[<000000005eb248d6>] el0_sync_handler+0x160/0x164
[<0000000030a59039>] el0_sync+0x160/0x180
The problem was that when an error occurred before handlers registration
and after allocating `new_smi->si_sm`, the variable wouldn't be freed in
the error handling afterwards since `shutdown_smi()` hadn't been
registered yet. Fix it by adding a `kfree()` in the error handling path
in `try_smi_init()`.
Cc: stable@vger.kernel.org # 4.19+
Fixes: 7960f18a5647 ("ipmi_si: Convert over to a shutdown handler")
Signed-off-by: Yi Yang <yiyang13@huawei.com>
Co-developed-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
Signed-off-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
Message-Id: <20230629123328.2402075-1-gongruiqi@huaweicloud.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit b8d72e32e1453d37ee5c8a219f24e7eeadc471ef ]
The adapter scan ssif_info_find() sets info->adapter_name if the adapter
info came from SMBIOS, as it's not set in that case. However, this
function can be called more than once, and it will leak the adapter name
if it had already been set. So check for NULL before setting it.
Fixes: c4436c9149c5 ("ipmi_ssif: avoid registering duplicate ssif interface")
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit c5586d0f711e9744d0cade39b0c4a2d116a333ca ]
Add check for the return value of kstrdup() and return the error
if it fails in order to avoid NULL pointer dereference.
Fixes: c4436c9149c5 ("ipmi_ssif: avoid registering duplicate ssif interface")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Message-Id: <20230619092802.35384-1-jiasheng@iscas.ac.cn>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2a587b9ad052e7e92e508aea90c1e2ae433c1908 ]
REGMAP is a hidden (not user visible) symbol. Users cannot set it
directly thru "make *config", so drivers should select it instead of
depending on it if they need it.
Consistently using "select" or "depends on" can also help reduce
Kconfig circular dependency issues.
Therefore, change the use of "depends on REGMAP_MMIO" to
"select REGMAP_MMIO", which will also set REGMAP.
Fixes: eb994594bc22 ("ipmi: bt-bmc: Use a regmap for register access")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Message-Id: <20230226053953.4681-2-rdunlap@infradead.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 6d2555cde2918409b0331560e66f84a0ad4849c6 upstream.
The ipmi communication is not restored after a specific version of BMC is
upgraded on our server.
The ipmi driver does not respond after printing the following log:
ipmi_ssif: Invalid response getting flags: 1c 1
I found that after entering this branch, ssif_info->ssif_state always
holds SSIF_GETTING_FLAGS and never return to IDLE.
As a result, the driver cannot be loaded, because the driver status is
checked during the unload process and must be IDLE in shutdown_ssif():
while (ssif_info->ssif_state != SSIF_IDLE)
schedule_timeout(1);
The process trigger this problem is:
1. One msg timeout and next msg start send, and call
ssif_set_need_watch().
2. ssif_set_need_watch()->watch_timeout()->start_flag_fetch() change
ssif_state to SSIF_GETTING_FLAGS.
3. In msg_done_handler() ssif_state == SSIF_GETTING_FLAGS, if an error
message is received, the second branch does not modify the ssif_state.
4. All retry action need IS_SSIF_IDLE() == True. Include retry action in
watch_timeout(), msg_done_handler(). Sending msg does not work either.
SSIF_IDLE is also checked in start_next_msg().
5. The only thing that can be triggered in the SSIF driver is
watch_timeout(), after destory_user(), this timer will stop too.
So, if enter this branch, the ssif_state will remain SSIF_GETTING_FLAGS
and can't send msg, no timer started, can't unload.
We did a comparative test before and after adding this patch, and the
result is effective.
Fixes: 259307074bfc ("ipmi: Add SMBus interface driver (SSIF)")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20230412074907.80046-1-zhangyuchen.lcr@bytedance.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 6ce7995a43febe693d4894033c6e29314970646a upstream.
A recent change removed an increment of send_retries, re-add it.
Fixes: 95767ed78a18 ipmi:ssif: resend_msg() cannot fail
Reported-by: Pavel Machek <pavel@denx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <minyard@acm.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 00bb7e763ec9f384cb382455cb6ba5588b5375cf ]
The IPMI spec has a time (T6) specified between request retries. Add
the handling for that.
Reported by: Tony Camuso <tcamuso@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 39721d62bbc16ebc9bb2bdc2c163658f33da3b0b ]
The spec states that the minimum message retry time is 60ms, but it was
set to 20ms. Correct it.
Reported by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Stable-dep-of: 00bb7e763ec9 ("ipmi:ssif: Add a timer between request retries")
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 8230831c43a328c2be6d28c65d3f77e14c59986b upstream.
Rename the SSIF_IDLE() to IS_SSIF_IDLE(), since that is more clear, and
rename SSIF_NORMAL to SSIF_IDLE, since that's more accurate.
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 95767ed78a181d5404202627499f9cde56053b96 upstream.
The resend_msg() function cannot fail, but there was error handling
around using it. Rework the handling of the error, and fix the out of
retries debug reporting that was wrong around this, too.
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit a92ce570c81dc0feaeb12a429b4bc65686d17967 upstream.
The intf_free() function frees the "intf" pointer so we cannot
dereference it again on the next line.
Fixes: cbb79863fc31 ("ipmi: Don't allow device module unload when in use")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Message-Id: <Y3M8xa1drZv4CToE@kili>
Cc: <stable@vger.kernel.org> # 5.5+
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit f6f1234d98cce69578bfac79df147a1f6660596c upstream.
When fixing the problem mentioned in PATCH1, we also found
the following problem:
If the IPMI is disconnected and in the sending process, the
uninstallation driver will be stuck for a long time.
The main problem is that uninstalling the driver waits for curr_msg to
be sent or HOSED. After stopping tasklet, the only place to trigger the
timeout mechanism is the circular poll in shutdown_smi.
The poll function delays 10us and calls smi_event_handler(smi_info,10).
Smi_event_handler deducts 10us from kcs->ibf_timeout.
But the poll func is followed by schedule_timeout_uninterruptible(1).
The time consumed here is not counted in kcs->ibf_timeout.
So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has
actually passed. The waiting time has increased by more than a
hundredfold.
Now instead of calling poll(). call smi_event_handler() directly and
calculate the elapsed time.
For verification, you can directly use ebpf to check the kcs->
ibf_timeout for each call to kcs_event() when IPMI is disconnected.
Decrement at normal rate before unloading. The decrement rate becomes
very slow after unloading.
$ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n",
*(arg0+584));}'
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 36992eb6b9b83f7f9cdc8e74fb5799d7b52e83e9 ]
After the IPMI disconnect problem, the memory kept rising and we tried
to unload the driver to free the memory. However, only part of the
free memory is recovered after the driver is uninstalled. Using
ebpf to hook free functions, we find that neither ipmi_user nor
ipmi_smi_msg is free, only ipmi_recv_msg is free.
We find that the deliver_smi_err_response call in clean_smi_msgs does
the destroy processing on each message from the xmit_msg queue without
checking the return value and free ipmi_smi_msg.
deliver_smi_err_response is called only at this location. Adding the
free handling has no effect.
To verify, try using ebpf to trace the free function.
$ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc rcv
%p\n",retval);} kprobe:free_recv_msg {printf("free recv %p\n",
arg0)} kretprobe:ipmi_alloc_smi_msg {printf("alloc smi %p\n",
retval);} kprobe:free_smi_msg {printf("free smi %p\n",arg0)}'
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20221007092617.87597-4-zhangyuchen.lcr@bytedance.com>
[Fixed the comment above handle_one_recv_msg().]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit f90bc0f97f2b65af233a37b2e32fc81871a1e3cf ]
The ASPEED KCS devices don't provide a BMC-side interrupt for the host
reading the output data register (ODR). The act of the host reading ODR
clears the output buffer full (OBF) flag in the status register (STR),
informing the BMC it can transmit a subsequent byte.
On the BMC side the KCS client must enable the OBE event *and* perform a
subsequent read of STR anyway to avoid races - the polling provides a
window for the host to read ODR if data was freshly written while
minimising BMC-side latency.
Fixes: 28651e6c4237 ("ipmi: kcs_bmc: Allow clients to control KCS IRQ state")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-Id: <20220812144741.240315-1-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2ebaf18a0b7fb764bba6c806af99fe868cee93de ]
The was it was wouldn't work in some situations, simplify it. What was
there was unnecessary complexity.
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 7602b957e2404e5f98d9a40b68f1fd27f0028712 ]
Even though it's not possible to get into the SSIF_GETTING_MESSAGES and
SSIF_GETTING_EVENTS states without a valid message in the msg field,
it's probably best to be defensive here and check and print a log, since
that means something else went wrong.
Also add a default clause to that switch statement to release the lock
and print a log, in case the state variable gets messed up somehow.
Reported-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 75d70d76cb7b927cace2cb34265d68ebb3306b13 upstream.
If the workqueue allocation fails, the driver is marked as not initialized,
and timer and panic_notifier will be left registered.
Instead of removing those when workqueue allocation fails, do the workqueue
initialization before doing it, and cleanup srcu_struct if it fails.
Fixes: 1d49eb91e86e ("ipmi: Move remove_work to dedicated workqueue")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Corey Minyard <cminyard@mvista.com>
Cc: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com>
Cc: stable@vger.kernel.org
Message-Id: <20211217154410.1228673-2-cascardo@canonical.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 34f35f8f14bc406efc06ee4ff73202c6fd245d15 upstream.
During probe ssif_info->client is dereferenced in error path. However,
it is set when some of the error checking has already been done. This
causes following kernel crash if an error path is taken:
[ 30.645593][ T674] ipmi_ssif 0-000e: ipmi_ssif: Not probing, Interface already present
[ 30.657616][ T674] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000088
...
[ 30.657723][ T674] pc : __dev_printk+0x28/0xa0
[ 30.657732][ T674] lr : _dev_err+0x7c/0xa0
...
[ 30.657772][ T674] Call trace:
[ 30.657775][ T674] __dev_printk+0x28/0xa0
[ 30.657778][ T674] _dev_err+0x7c/0xa0
[ 30.657781][ T674] ssif_probe+0x548/0x900 [ipmi_ssif 62ce4b08badc1458fd896206d9ef69a3c31f3d3e]
[ 30.657791][ T674] i2c_device_probe+0x37c/0x3c0
...
Initialize ssif_info->client before any error path can be taken. Clear
i2c_client data in the error path to prevent the dangling pointer from
leaking.
Fixes: c4436c9149c5 ("ipmi_ssif: avoid registering duplicate ssif interface")
Cc: stable@vger.kernel.org # 5.4.x
Suggested-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Mian Yousaf Kaukab <ykaukab@suse.de>
Message-Id: <20211208093239.4432-1-ykaukab@suse.de>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 2b5160b12091285c5aca45980f100a9294af7b04 upstream.
In case, init_srcu_struct fails (because of memory allocation failure), we
might proceed with the driver initialization despite srcu_struct not being
entirely initialized.
Fixes: 913a89f009d9 ("ipmi: Don't initialize anything in the core until something uses it")
Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Cc: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
Message-Id: <20211217154410.1228673-1-cascardo@canonical.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit ffb76a86f8096a8206be03b14adda6092e18e275 ]
Hi,
When testing install and uninstall of ipmi_si.ko and ipmi_msghandler.ko,
the system crashed.
The log as follows:
[ 141.087026] BUG: unable to handle kernel paging request at ffffffffc09b3a5a
[ 141.087241] PGD 8fe4c0d067 P4D 8fe4c0d067 PUD 8fe4c0f067 PMD 103ad89067 PTE 0
[ 141.087464] Oops: 0010 [#1] SMP NOPTI
[ 141.087580] CPU: 67 PID: 668 Comm: kworker/67:1 Kdump: loaded Not tainted 4.18.0.x86_64 #47
[ 141.088009] Workqueue: events 0xffffffffc09b3a40
[ 141.088009] RIP: 0010:0xffffffffc09b3a5a
[ 141.088009] Code: Bad RIP value.
[ 141.088009] RSP: 0018:ffffb9094e2c3e88 EFLAGS: 00010246
[ 141.088009] RAX: 0000000000000000 RBX: ffff9abfdb1f04a0 RCX: 0000000000000000
[ 141.088009] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
[ 141.088009] RBP: 0000000000000000 R08: ffff9abfffee3cb8 R09: 00000000000002e1
[ 141.088009] R10: ffffb9094cb73d90 R11: 00000000000f4240 R12: ffff9abfffee8700
[ 141.088009] R13: 0000000000000000 R14: ffff9abfdb1f04a0 R15: ffff9abfdb1f04a8
[ 141.088009] FS: 0000000000000000(0000) GS:ffff9abfffec0000(0000) knlGS:0000000000000000
[ 141.088009] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 141.088009] CR2: ffffffffc09b3a30 CR3: 0000008fe4c0a001 CR4: 00000000007606e0
[ 141.088009] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 141.088009] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 141.088009] PKRU: 55555554
[ 141.088009] Call Trace:
[ 141.088009] ? process_one_work+0x195/0x390
[ 141.088009] ? worker_thread+0x30/0x390
[ 141.088009] ? process_one_work+0x390/0x390
[ 141.088009] ? kthread+0x10d/0x130
[ 141.088009] ? kthread_flush_work_fn+0x10/0x10
[ 141.088009] ? ret_from_fork+0x35/0x40] BUG: unable to handle kernel paging request at ffffffffc0b28a5a
[ 200.223240] PGD 97fe00d067 P4D 97fe00d067 PUD 97fe00f067 PMD a580cbf067 PTE 0
[ 200.223464] Oops: 0010 [#1] SMP NOPTI
[ 200.223579] CPU: 63 PID: 664 Comm: kworker/63:1 Kdump: loaded Not tainted 4.18.0.x86_64 #46
[ 200.224008] Workqueue: events 0xffffffffc0b28a40
[ 200.224008] RIP: 0010:0xffffffffc0b28a5a
[ 200.224008] Code: Bad RIP value.
[ 200.224008] RSP: 0018:ffffbf3c8e2a3e88 EFLAGS: 00010246
[ 200.224008] RAX: 0000000000000000 RBX: ffffa0799ad6bca0 RCX: 0000000000000000
[ 200.224008] RDX: 0000000000000000 RSI: 0000000000000246 RDI: 0000000000000246
[ 200.224008] RBP: 0000000000000000 R08: ffff9fe43fde3cb8 R09: 00000000000000d5
[ 200.224008] R10: ffffbf3c8cb53d90 R11: 00000000000f4240 R12: ffff9fe43fde8700
[ 200.224008] R13: 0000000000000000 R14: ffffa0799ad6bca0 R15: ffffa0799ad6bca8
[ 200.224008] FS: 0000000000000000(0000) GS:ffff9fe43fdc0000(0000) knlGS:0000000000000000
[ 200.224008] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 200.224008] CR2: ffffffffc0b28a30 CR3: 00000097fe00a002 CR4: 00000000007606e0
[ 200.224008] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 200.224008] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 200.224008] PKRU: 55555554
[ 200.224008] Call Trace:
[ 200.224008] ? process_one_work+0x195/0x390
[ 200.224008] ? worker_thread+0x30/0x390
[ 200.224008] ? process_one_work+0x390/0x390
[ 200.224008] ? kthread+0x10d/0x130
[ 200.224008] ? kthread_flush_work_fn+0x10/0x10
[ 200.224008] ? ret_from_fork+0x35/0x40
[ 200.224008] kernel fault(0x1) notification starting on CPU 63
[ 200.224008] kernel fault(0x1) notification finished on CPU 63
[ 200.224008] CR2: ffffffffc0b28a5a
[ 200.224008] ---[ end trace c82a412d93f57412 ]---
The reason is as follows:
T1: rmmod ipmi_si.
->ipmi_unregister_smi()
-> ipmi_bmc_unregister()
-> __ipmi_bmc_unregister()
-> kref_put(&bmc->usecount, cleanup_bmc_device);
-> schedule_work(&bmc->remove_work);
T2: rmmod ipmi_msghandler.
ipmi_msghander module uninstalled, and the module space
will be freed.
T3: bmc->remove_work doing cleanup the bmc resource.
-> cleanup_bmc_work()
-> platform_device_unregister(&bmc->pdev);
-> platform_device_del(pdev);
-> device_del(&pdev->dev);
-> kobject_uevent(&dev->kobj, KOBJ_REMOVE);
-> kobject_uevent_env()
-> dev_uevent()
-> if (dev->type && dev->type->name)
'dev->type'(bmc_device_type) pointer space has freed when uninstall
ipmi_msghander module, 'dev->type->name' cause the system crash.
drivers/char/ipmi/ipmi_msghandler.c:
2820 static const struct device_type bmc_device_type = {
2821 .groups = bmc_dev_attr_groups,
2822 };
Steps to reproduce:
Add a time delay in cleanup_bmc_work() function,
and uninstall ipmi_si and ipmi_msghandler module.
2910 static void cleanup_bmc_work(struct work_struct *work)
2911 {
2912 struct bmc_device *bmc = container_of(work, struct bmc_device,
2913 remove_work);
2914 int id = bmc->pdev.id; /* Unregister overwrites id */
2915
2916 msleep(3000); <---
2917 platform_device_unregister(&bmc->pdev);
2918 ida_simple_remove(&ipmi_bmc_ida, id);
2919 }
Use 'remove_work_wq' instead of 'system_wq' to solve this issues.
Fixes: b2cfd8ab4add ("ipmi: Rework device id and guid handling to catch changing BMCs")
Signed-off-by: Wu Bo <wubo40@huawei.com>
Message-Id: <1640070034-56671-1-git-send-email-wubo40@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit 5a3ba99b62d8486de0316334e72ac620d4b94fdd upstream.
The sparse tool complains as follows:
drivers/char/ipmi/ipmi_msghandler.c:194:25: warning:
symbol 'remove_work_wq' was not declared. Should it be static?
This symbol is not used outside of ipmi_msghandler.c, so
marks it static.
Fixes: 1d49eb91e86e ("ipmi: Move remove_work to dedicated workqueue")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Message-Id: <20211123083618.2366808-1-weiyongjun1@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 1d49eb91e86e8c1c1614c72e3e958b6b7e2472a9 upstream.
Currently when removing an ipmi_user the removal is deferred as a work on
the system's workqueue. Although this guarantees the free operation will
occur in non atomic context, it can race with the ipmi_msghandler module
removal (see [1]) . In case a remove_user work is scheduled for removal
and shortly after ipmi_msghandler module is removed we can end up in a
situation where the module is removed fist and when the work is executed
the system crashes with :
BUG: unable to handle page fault for address: ffffffffc05c3450
PF: supervisor instruction fetch in kernel mode
PF: error_code(0x0010) - not-present page
because the pages of the module are gone. In cleanup_ipmi() there is no
easy way to detect if there are any pending works to flush them before
removing the module. This patch creates a separate workqueue and schedules
the remove_work works on it. When removing the module the workqueue is
drained when destroyed to avoid the race.
[1] https://bugs.launchpad.net/bugs/1950666
Cc: stable@vger.kernel.org # 5.1
Fixes: 3b9a907223d7 (ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier)
Signed-off-by: Ioanna Alifieraki <ioanna-maria.alifieraki@canonical.com>
Message-Id: <20211115131645.25116-1-ioanna-maria.alifieraki@canonical.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
'kcs_bmc_serio_add_device()'
[ Upstream commit f281d010b87454e72475b668ad66e34961f744e0 ]
In the unlikely event where 'devm_kzalloc()' fails and 'kzalloc()'
succeeds, 'port' would be leaking.
Test each allocation separately to avoid the leak.
Fixes: 3a3d2f6a4c64 ("ipmi: kcs_bmc: Add serio adaptor")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Message-Id: <ecbfa15e94e64f4b878ecab1541ea46c74807670.1631048724.git.christophe.jaillet@wanadoo.fr>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit b36eb5e7b75a756baa64909a176dd4269ee05a8b ]
Don't do kfree or other risky things when oops_in_progress is set.
It's easy enough to avoid doing them
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit db05ddf7f321634c5659a0cf7ea56594e22365f7 upstream.
You will get two decrements when the messages on a panic are sent, not
one, since commit 2033f6858970 ("ipmi: Free receive messages when in an
oops") was added, but the watchdog code had a bug where it didn't set
the value properly.
Reported-by: Anton Lundin <glance@acc.umu.se>
Cc: <Stable@vger.kernel.org> # v5.4+
Fixes: 2033f6858970 ("ipmi: Free receive messages when in an oops")
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
Pull IPMI updates from Corey Minyard:
"A couple of very minor fixes for style and rate limiting.
Nothing big, but probably needs to go in"
* tag 'for-linus-5.15-1' of git://github.com/cminyard/linux-ipmi:
char: ipmi: use DEVICE_ATTR helper macro
ipmi: rate limit ipmi smi_event failure message
|
|
The caller of this function (parisc_driver_remove() in
arch/parisc/kernel/drivers.c) ignores the return value, so better don't
return any value at all to not wake wrong expectations in driver authors.
The only function that could return a non-zero value before was
ipmi_parisc_remove() which returns the return value of
ipmi_si_remove_by_dev(). Make this function return void, too, as for all
other callers the value is ignored, too.
Also fold in a small checkpatch fix for:
WARNING: Unnecessary space before function pointer arguments
+ void (*remove) (struct parisc_device *dev);
Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> (for drivers/input)
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com>
Acked-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Helge Deller <deller@gmx.de>
|
|
Instead of open coding DEVICE_ATTR, use the helper macro
DEVICE_ATTR_RO to replace DEVICE_ATTR with 0444 octal
permissions.
This was detected as a part of checkpatch evaluation
investigating all reports of DEVICE_ATTR_RO warning
type.
Signed-off-by: Dwaipayan Ray <dwaipayanray1@gmail.com>
Message-Id: <20210730062951.84876-1-dwaipayanray1@gmail.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Sometimes we can't get a valid si_sm_data, and we print an error
message accordingly. But the ipmi module seem to like retrying a lot,
in which case we flood the kernel log with a lot of messages, eg:
[46318019.164726] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318020.109700] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318021.158677] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318022.212598] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318023.258564] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318024.210455] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318025.260473] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318026.308445] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318027.356389] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318028.298288] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
[46318029.363302] ipmi_si IPI0001:00: Could not set the global enables: 0xc1.
Signed-off-by: Wen Yang <wenyang@linux.alibaba.com>
Cc: Baoyou Xie <baoyou.xie@alibaba-inc.com>
Cc: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Message-Id: <20210729093228.77098-1-wenyang@linux.alibaba.com>
[Added a missing comma]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Merge more updates from Andrew Morton:
"190 patches.
Subsystems affected by this patch series: mm (hugetlb, userfaultfd,
vmscan, kconfig, proc, z3fold, zbud, ras, mempolicy, memblock,
migration, thp, nommu, kconfig, madvise, memory-hotplug, zswap,
zsmalloc, zram, cleanups, kfence, and hmm), procfs, sysctl, misc,
core-kernel, lib, lz4, checkpatch, init, kprobes, nilfs2, hfs,
signals, exec, kcov, selftests, compress/decompress, and ipc"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (190 commits)
ipc/util.c: use binary search for max_idx
ipc/sem.c: use READ_ONCE()/WRITE_ONCE() for use_global_lock
ipc: use kmalloc for msg_queue and shmid_kernel
ipc sem: use kvmalloc for sem_undo allocation
lib/decompressors: remove set but not used variabled 'level'
selftests/vm/pkeys: exercise x86 XSAVE init state
selftests/vm/pkeys: refill shadow register after implicit kernel write
selftests/vm/pkeys: handle negative sys_pkey_alloc() return code
selftests/vm/pkeys: fix alloc_random_pkey() to make it really, really random
kcov: add __no_sanitize_coverage to fix noinstr for all architectures
exec: remove checks in __register_bimfmt()
x86: signal: don't do sas_ss_reset() until we are certain that sigframe won't be abandoned
hfsplus: report create_date to kstat.btime
hfsplus: remove unnecessary oom message
nilfs2: remove redundant continue statement in a while-loop
kprobes: remove duplicated strong free_insn_page in x86 and s390
init: print out unknown kernel parameters
checkpatch: do not complain about positive return values starting with EPOLL
checkpatch: improve the indented label test
checkpatch: scripts/spdxcheck.py now requires python3
...
|
|
kernel.h is being used as a dump for all kinds of stuff for a long time.
Here is the attempt to start cleaning it up by splitting out panic and
oops helpers.
There are several purposes of doing this:
- dropping dependency in bug.h
- dropping a loop by moving out panic_notifier.h
- unload kernel.h from something which has its own domain
At the same time convert users tree-wide to use new headers, although for
the time being include new header back to kernel.h to avoid twisted
indirected includes for existing users.
[akpm@linux-foundation.org: thread_info.h needs limits.h]
[andriy.shevchenko@linux.intel.com: ia64 fix]
Link: https://lkml.kernel.org/r/20210520130557.55277-1-andriy.shevchenko@linux.intel.com
Link: https://lkml.kernel.org/r/20210511074137.33666-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>
Co-developed-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Mike Rapoport <rppt@linux.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Wei Liu <wei.liu@kernel.org>
Acked-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Sebastian Reichel <sre@kernel.org>
Acked-by: Luis Chamberlain <mcgrof@kernel.org>
Acked-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Acked-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The comparisons of the unsigned int hw_type to less than zero always
false because it is unsigned. Fix this by using an int for the
assignment and less than zero check.
Addresses-Coverity: ("Unsigned compared against 0")
Fixes: 9d2df9a0ad80 ("ipmi: kcs_bmc_aspeed: Implement KCS SerIRQ configuration")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Message-Id: <20210616162913.15259-1-colin.king@canonical.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Some Aspeed KCS devices can derive the status register address from the
address of the data register. As such, the address of the status
register can be implicit in the configuration if desired. On the other
hand, sometimes address schemes might be requested that are incompatible
with the default addressing scheme. Allow these requests where possible
if the devicetree specifies the status register address.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
Message-Id: <20210608104757.582199-17-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Input Buffer Full Interrupt Enable (IBFIE) is typoed as IBFIF for some
registers in the datasheet. Fix the driver to use the sensible acronym.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-16-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Apply the SerIRQ ID and level/sense behaviours from the devicetree if
provided.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-Id: <20210608104757.582199-15-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
kcs_bmc_serio acts as a bridge between the KCS drivers in the IPMI
subsystem and the existing userspace interfaces available through the
serio subsystem. This is useful when userspace would like to make use of
the BMC KCS devices for purposes that aren't IPMI.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-Id: <20210608104757.582199-12-andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
This way devices don't get delivered IRQs when no-one is interested.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-Id: <20210608104757.582199-11-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Add a mechanism for controlling whether the client associated with a
KCS device will receive Input Buffer Full (IBF) and Output Buffer Empty
(OBE) events. This enables an abstract implementation of poll() for KCS
devices.
A wart in the implementation is that the ASPEED KCS devices don't
support an OBE interrupt for the BMC. Instead we pretend it has one by
polling the status register waiting for the Output Buffer Full (OBF) bit
to clear, and generating an event when OBE is observed.
Cc: CS20 KWLiu <KWLIU@nuvoton.com>
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-10-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Now that we have untangled the data-structures, split the userspace
interface out into its own module. Userspace interfaces and drivers are
registered to the KCS BMC core to support arbitrary binding of either.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-Id: <20210608104757.582199-9-andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Move all client-private data out of `struct kcs_bmc` into the KCS client
implementation.
With this change the KCS BMC core code now only concerns itself with
abstract `struct kcs_bmc` and `struct kcs_bmc_client` types, achieving
expected separation of concerns. Further, the change clears the path for
implementation of alternative userspace interfaces.
The chardev data-structures are rearranged in the same manner applied to
the KCS device driver data-structures in an earlier patch - `struct
kcs_bmc_client` is embedded in the client's private data and we exploit
container_of() to translate as required.
Finally, now that it is free of client data, `struct kcs_bmc` is renamed
to `struct kcs_bmc_device` to contrast `struct kcs_bmc_client`.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-8-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Strengthen the distinction between code that abstracts the
implementation of the KCS behaviours (device drivers) and code that
exploits KCS behaviours (clients). Neither needs to know about the APIs
required by the other, so provide separate headers.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-Id: <20210608104757.582199-7-andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Make the KCS device drivers responsible for allocating their own memory.
Until now the private data for the device driver was allocated internal
to the private data for the chardev interface. This coupling required
the slightly awkward API of passing through the struct size for the
driver private data to the chardev constructor, and then retrieving a
pointer to the driver private data from the allocated chardev memory.
In addition to being awkward, the arrangement prevents the
implementation of alternative userspace interfaces as the device driver
private data is not independent.
Peel a layer off the onion and turn the data-structures inside out by
exploiting container_of() and embedding `struct kcs_device` in the
driver private data.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-6-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Take steps towards defining a coherent API to separate the KCS device
drivers from the userspace interface. Decreasing the coupling will
improve the separation of concerns and enable the introduction of
alternative userspace interfaces.
For now, simply split the chardev logic out to a separate file. The code
continues to build into the same module.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-5-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Rename the functions in preparation for separating the IPMI chardev out
from the KCS BMC core.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-4-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Enable more efficient implementation of read-modify-write sequences.
Both device drivers for the KCS BMC stack use regmaps. The new callback
allows us to exploit regmap_update_bits().
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Zev Weiss <zweiss@equinix.com>
Message-Id: <20210608104757.582199-3-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|