| Age | Commit message (Collapse) | Author | Files | Lines |
|
[ Upstream commit 6bd30d8fc523fb880b4be548e8501bc0fe8f42d4 ]
channel_handler() sets intf->channels_ready to true but never
clears it, so __scan_channels() skips any rescan. When the BMC
firmware changes a rescan is required. Allow it by clearing
the flag before starting a new scan.
Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-3-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 936750fdba4c45e13bbd17f261bb140dd55f5e93 ]
The race window between __scan_channels() and deliver_response() causes
the parameters of some channels to be set to 0.
1.[CPUA] __scan_channels() issues an IPMI request and waits with
wait_event() until all channels have been scanned.
wait_event() internally calls might_sleep(), which might
yield the CPU. (Moreover, an interrupt can preempt
wait_event() and force the task to yield the CPU.)
2.[CPUB] deliver_response() is invoked when the CPU receives the
IPMI response. After processing a IPMI response,
deliver_response() directly assigns intf->wchannels to
intf->channel_list and sets intf->channels_ready to true.
However, not all channels are actually ready for use.
3.[CPUA] Since intf->channels_ready is already true, wait_event()
never enters __wait_event(). __scan_channels() immediately
clears intf->null_user_handler and exits.
4.[CPUB] Once intf->null_user_handler is set to NULL, deliver_response()
ignores further IPMI responses, leaving the remaining
channels zero-initialized and unusable.
CPUA CPUB
------------------------------- -----------------------------
__scan_channels()
intf->null_user_handler
= channel_handler;
send_channel_info_cmd(intf,
0);
wait_event(intf->waitq,
intf->channels_ready);
do {
might_sleep();
deliver_response()
channel_handler()
intf->channel_list =
intf->wchannels + set;
intf->channels_ready = true;
send_channel_info_cmd(intf,
intf->curr_channel);
if (condition)
break;
__wait_event(wq_head,
condition);
} while(0)
intf->null_user_handler
= NULL;
deliver_response()
if (!msg->user)
if (intf->null_user_handler)
rv = -EINVAL;
return rv;
------------------------------- -----------------------------
Fix the race between __scan_channels() and deliver_response() by
deferring both the assignment intf->channel_list = intf->wchannels
and the flag intf->channels_ready = true until all channels have
been successfully scanned or until the IPMI request has failed.
Signed-off-by: Jinhui Guo <guojinhui.liam@bytedance.com>
Message-ID: <20250930074239.2353-2-guojinhui.liam@bytedance.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit e2c69490dda5d4c9f1bfbb2898989c8f3530e354 upstream
Prior to commit b52da4054ee0 ("ipmi: Rework user message limit handling"),
i_ipmi_request() used to increase the user reference counter if the receive
message is provided by the caller of IPMI API functions. This is no longer
the case. However, ipmi_free_recv_msg() is still called and decreases the
reference counter. This results in the reference counter reaching zero,
the user data pointer is released, and all kinds of interesting crashes are
seen.
Fix the problem by increasing user reference counter if the receive message
has been provided by the caller.
Fixes: b52da4054ee0 ("ipmi: Rework user message limit handling")
Reported-by: Eric Dumazet <edumazet@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Thelen <gthelen@google.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-ID: <20251006201857.3433837-1-linux@roeck-us.net>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit b52da4054ee0bf9ecb44996f2c83236ff50b3812 upstream
This patch required quite a bit of work to backport due to a number
of unrelated changes that do not make sense to backport. This has
been run against my test suite and passes all tests.
The limit on the number of user messages had a number of issues,
improper counting in some cases and a use after free.
Restructure how this is all done to handle more in the receive message
allocation routine, so all refcouting and user message limit counts
are done in that routine. It's a lot cleaner and safer.
Reported-by: Gilles BULOZ <gilles.buloz@kontron.com>
Closes: https://lore.kernel.org/lkml/aLsw6G0GyqfpKs2S@mail.minyard.net/
Fixes: 8e76741c3d8b ("ipmi: Add a limit on the number of users that may use IPMI")
Cc: <stable@vger.kernel.org> # 4.19
Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Gilles BULOZ <gilles.buloz@kontron.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
commit 5d09ee1bec870263f4ace439402ea840503b503b upstream.
This reverts commit c608966f3f9c2dca596967501d00753282b395fc.
This patch has a subtle bug that can cause the IPMI driver to go into an
infinite loop if the BMC misbehaves in a certain way. Apparently
certain BMCs do misbehave this way because several reports have come in
recently about this.
Signed-off-by: Corey Minyard <corey@minyard.net>
Tested-by: Eric Hagberg <ehagberg@janestreet.com>
Cc: <stable@vger.kernel.org> # 6.2
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 8ffcb7560b4a15faf821df95e3ab532b2b020f8c ]
The source and destination of some strcpy operations was the same.
Split out the part of the operations that needed to be done for those
particular calls so the unnecessary copy wasn't done.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202506140756.EFXXvIP4-lkp@intel.com/
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit ec50ec378e3fd83bde9b3d622ceac3509a60b6b5 ]
During BMC firmware upgrades on live systems, the ipmi_msghandler
generates excessive "BMC returned incorrect response" warnings
while the BMC is temporarily offline. This can flood system logs
in large deployments.
Replace dev_warn() with dev_warn_ratelimited() to throttle these
warnings and prevent log spam during BMC maintenance operations.
Signed-off-by: Breno Leitao <leitao@debian.org>
Message-ID: <20250710-ipmi_ratelimit-v1-1-6d417015ebe9@debian.org>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
commit fa332f5dc6fc662ad7d3200048772c96b861cf6b upstream.
The "intf" list iterator is an invalid pointer if the correct
"intf->intf_num" is not found. Calling atomic_dec(&intf->nr_users) on
and invalid pointer will lead to memory corruption.
We don't really need to call atomic_dec() if we haven't called
atomic_add_return() so update the if (intf->in_shutdown) path as well.
Fixes: 8e76741c3d8b ("ipmi: Add a limit on the number of users that may use IPMI")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Message-ID: <aBjMZ8RYrOt6NOgi@stanley.mountain>
Signed-off-by: Corey Minyard <corey@minyard.net>
[ - Dropped change to the `if (intf->in_shutdown)` block since that logic
doesn't exist yet.
- Modified out_unlock to release the srcu lock instead of the mutex
since we don't have the mutex here yet. ]
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
[ Upstream commit 83d8c79aa958e37724ed9c14dc7d0f66a48ad864 ]
Cosmo found that when there is a new request comes in while BMC is
ready for a response, the complete_response(), which is called to
complete the pending response, would accidentally clear out that new
request and force ssif_bmc to move back to abort state again.
This commit is to address that issue.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Reported-by: Cosmo Chou <chou.cosmo@gmail.com>
Closes: https://lore.kernel.org/lkml/20250101165431.2113407-1-chou.cosmo@gmail.com/
Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
Message-ID: <20250107034734.1842247-1-quan@os.amperecomputing.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 2378bd0b264ad3a1f76bd957caf33ee0c7945351 ]
devm_kasprintf() can return a NULL pointer on failure but this
returned value is not checked.
Fixes: 51bd6f291583 ("Add support for IPMB driver")
Signed-off-by: Charles Han <hanchunchao@inspur.com>
Message-ID: <20240926094419.25900-1-hanchunchao@inspur.com>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
[ Upstream commit 0627cef36145c9ff9845bdfd7ddf485bbac1f981 ]
There are actually two bugs here. First, we need to ensure that count
is at least sizeof(u32) or msg.len will be uninitialized data.
The "msg.len" variable is a u32 that comes from the user. On 32bit
systems the "sizeof_field(struct ipmi_ssif_msg, len) + msg.len"
addition can overflow if "msg.len" is greater than U32_MAX - 4.
Valid lengths for "msg.len" are 1-254. Add a check for that to
prevent the integer overflow.
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Message-Id: <1431ca2e-4e9c-4520-bfc0-6879313c30e9@moroto.mountain>
Signed-off-by: Corey Minyard <corey@minyard.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
|
|
Pull IPMI updates from Corey Minyard:
"Minor fixes for IPMI
Lots of small unconnected things, memory leaks on error, a possible
(though unlikely) deadlock, changes for updates to other things that
have changed. Nothing earth-shattering, but things that need update"
* tag 'for-linus-6.6-1' of https://github.com/cminyard/linux-ipmi:
ipmi_si: fix -Wvoid-pointer-to-enum-cast warning
ipmi: fix potential deadlock on &kcs_bmc->lock
ipmi_si: fix a memleak in try_smi_init()
ipmi: Change request_module to request_module_nowait
ipmi: make ipmi_class a static const structure
ipmi:ssif: Fix a memory leak when scanning for an adapter
ipmi:ssif: Add check for kstrdup
dt-bindings: ipmi: aspeed,ast2400-kcs-bmc: drop unneeded quotes
ipmi: Switch i2c drivers back to use .probe()
ipmi_watchdog: Fix read syscall not responding to signals during sleep
|
|
The DT of_device.h and of_platform.h date back to the separate
of_platform_bus_type before it was merged into the regular platform bus.
As part of that merge prepping Arm DT support 13 years ago, they
"temporarily" include each other. They also include platform_device.h
and of.h. As a result, there's a pretty much random mix of those include
files used throughout the tree. In order to detangle these headers and
replace the implicit includes with struct declarations, users need to
explicitly include the correct includes.
Link: https://lore.kernel.org/r/20230728134819.3224045-1-robh@kernel.org
Signed-off-by: Rob Herring <robh@kernel.org>
|
|
With W=1 we see the following warning:
| drivers/char/ipmi/ipmi_si_platform.c:272:15: error: \
| cast to smaller integer type 'enum si_type' from \
| 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast]
| 272 | io.si_type = (enum si_type) match->data;
| | ^~~~~~~~~~~~~~~~~~~~~~~~~~
This is due to the fact that the `si_type` enum members are int-width
and a cast from pointer-width down to int will cause truncation and
possible data loss. Although in this case `si_type` has only a few
enumerated fields and thus there is likely no data loss occurring.
Nonetheless, this patch is necessary to the goal of promoting this
warning out of W=1.
Link: https://github.com/ClangBuiltLinux/linux/issues/1902
Link: https://lore.kernel.org/llvm/202308081000.tTL1ElTr-lkp@intel.com/
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Justin Stitt <justinstitt@google.com>
Message-Id: <20230809-cbl-1902-v1-1-92def12d1dea@google.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
As kcs_bmc_handle_event() is executed inside both a timer and a hardirq,
it should disable irq before lock acquisition otherwise deadlock could
happen if the timmer is preemtped by the irq.
Possible deadlock scenario:
aspeed_kcs_check_obe() (timer)
-> kcs_bmc_handle_event()
-> spin_lock(&kcs_bmc->lock)
<irq interruption>
-> aspeed_kcs_irq()
-> kcs_bmc_handle_event()
-> spin_lock(&kcs_bmc->lock) (deadlock here)
This flaw was found using an experimental static analysis tool we are
developing for irq-related deadlock.
The tentative patch fix the potential deadlock by spin_lock_irqsave()
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Message-Id: <20230627152449.36093-1-dg573847474@gmail.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
Kmemleak reported the following leak info in try_smi_init():
unreferenced object 0xffff00018ecf9400 (size 1024):
comm "modprobe", pid 2707763, jiffies 4300851415 (age 773.308s)
backtrace:
[<000000004ca5b312>] __kmalloc+0x4b8/0x7b0
[<00000000953b1072>] try_smi_init+0x148/0x5dc [ipmi_si]
[<000000006460d325>] 0xffff800081b10148
[<0000000039206ea5>] do_one_initcall+0x64/0x2a4
[<00000000601399ce>] do_init_module+0x50/0x300
[<000000003c12ba3c>] load_module+0x7a8/0x9e0
[<00000000c246fffe>] __se_sys_init_module+0x104/0x180
[<00000000eea99093>] __arm64_sys_init_module+0x24/0x30
[<0000000021b1ef87>] el0_svc_common.constprop.0+0x94/0x250
[<0000000070f4f8b7>] do_el0_svc+0x48/0xe0
[<000000005a05337f>] el0_svc+0x24/0x3c
[<000000005eb248d6>] el0_sync_handler+0x160/0x164
[<0000000030a59039>] el0_sync+0x160/0x180
The problem was that when an error occurred before handlers registration
and after allocating `new_smi->si_sm`, the variable wouldn't be freed in
the error handling afterwards since `shutdown_smi()` hadn't been
registered yet. Fix it by adding a `kfree()` in the error handling path
in `try_smi_init()`.
Cc: stable@vger.kernel.org # 4.19+
Fixes: 7960f18a5647 ("ipmi_si: Convert over to a shutdown handler")
Signed-off-by: Yi Yang <yiyang13@huawei.com>
Co-developed-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
Signed-off-by: GONG, Ruiqi <gongruiqi@huaweicloud.com>
Message-Id: <20230629123328.2402075-1-gongruiqi@huaweicloud.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
When probing for an ACPI-specified IPMI device, the code would request
that the acpi_ipmi module be loaded ACPI operations through IPMI can be
performed. This could happen through module load context, for instance,
if an I2C module is loaded that caused the IPMI interface to be probed.
This is not allowed because a synchronous module load in this context
can result in an deadlock, and I was getting a warning:
[ 23.967853] WARNING: CPU: 0 PID: 21 at kernel/module/kmod.c:144 __request_module+0x1de/0x2d0
[ 23.968852] Modules linked in: i2c_i801 ipmi_ssif
The IPMI driver is not dependent on acpi_ipmi, so just change the called
to request_module_nowait to make the load asynchronous.
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
Now that the driver core allows for struct class to be in read-only
memory, move the ipmi_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.
Cc: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ivan Orlov <ivan.orlov0322@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Message-Id: <20230620143701.577657-2-gregkh@linuxfoundation.org>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
The adapter scan ssif_info_find() sets info->adapter_name if the adapter
info came from SMBIOS, as it's not set in that case. However, this
function can be called more than once, and it will leak the adapter name
if it had already been set. So check for NULL before setting it.
Fixes: c4436c9149c5 ("ipmi_ssif: avoid registering duplicate ssif interface")
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
Add check for the return value of kstrdup() and return the error
if it fails in order to avoid NULL pointer dereference.
Fixes: c4436c9149c5 ("ipmi_ssif: avoid registering duplicate ssif interface")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Message-Id: <20230619092802.35384-1-jiasheng@iscas.ac.cn>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
After commit b8a1a4cd5a98 ("i2c: Provide a temporary .probe_new()
call-back type"), all drivers being converted to .probe_new() and then
03c835f498b5 ("i2c: Switch .probe() to not take an id parameter") convert
back to (the new) .probe() to be able to eventually drop .probe_new() from
struct i2c_driver.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Message-Id: <20230525204021.696858-1-u.kleine-koenig@pengutronix.de>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
Read syscall cannot response to sigals when data_to_read remains at 0
and the while loop cannot break. Check signal_pending in the loop.
Reported-by: Zhen Ni <zhen.ni@easystack.cn>
Message-Id: <20230517085412.367022-1-zhen.ni@easystack.cn>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core updates from Greg KH:
"Here is the large set of driver core changes for 6.4-rc1.
Once again, a busy development cycle, with lots of changes happening
in the driver core in the quest to be able to move "struct bus" and
"struct class" into read-only memory, a task now complete with these
changes.
This will make the future rust interactions with the driver core more
"provably correct" as well as providing more obvious lifetime rules
for all busses and classes in the kernel.
The changes required for this did touch many individual classes and
busses as many callbacks were changed to take const * parameters
instead. All of these changes have been submitted to the various
subsystem maintainers, giving them plenty of time to review, and most
of them actually did so.
Other than those changes, included in here are a small set of other
things:
- kobject logging improvements
- cacheinfo improvements and updates
- obligatory fw_devlink updates and fixes
- documentation updates
- device property cleanups and const * changes
- firwmare loader dependency fixes.
All of these have been in linux-next for a while with no reported
problems"
* tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits)
device property: make device_property functions take const device *
driver core: update comments in device_rename()
driver core: Don't require dynamic_debug for initcall_debug probe timing
firmware_loader: rework crypto dependencies
firmware_loader: Strip off \n from customized path
zram: fix up permission for the hot_add sysfs file
cacheinfo: Add use_arch[|_cache]_info field/function
arch_topology: Remove early cacheinfo error message if -ENOENT
cacheinfo: Check cache properties are present in DT
cacheinfo: Check sib_leaf in cache_leaves_are_shared()
cacheinfo: Allow early level detection when DT/ACPI info is missing/broken
cacheinfo: Add arm64 early level initializer implementation
cacheinfo: Add arch specific early level initializer
tty: make tty_class a static const structure
driver core: class: remove struct class_interface * from callbacks
driver core: class: mark the struct class in struct class_interface constant
driver core: class: make class_register() take a const *
driver core: class: mark class_release() as taking a const *
driver core: remove incorrect comment for device_create*
MIPS: vpe-cmp: remove module owner pointer from struct class usage.
...
|
|
For both variants (platform and i2c driver) after a successful bind
(i.e. .probe completed without error) driver data is set to a non-NULL
value.
So the return value of i2c_get_clientdata and dev_get_drvdata
respectively are not NULL and so the if blocks are never executed. (And
if you fear they might, they shouldn't return silently and yield a
resource leak.)
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Message-Id: <20221230124431.202474-1-u.kleine-koenig@pengutronix.de>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
The ipmi communication is not restored after a specific version of BMC is
upgraded on our server.
The ipmi driver does not respond after printing the following log:
ipmi_ssif: Invalid response getting flags: 1c 1
I found that after entering this branch, ssif_info->ssif_state always
holds SSIF_GETTING_FLAGS and never return to IDLE.
As a result, the driver cannot be loaded, because the driver status is
checked during the unload process and must be IDLE in shutdown_ssif():
while (ssif_info->ssif_state != SSIF_IDLE)
schedule_timeout(1);
The process trigger this problem is:
1. One msg timeout and next msg start send, and call
ssif_set_need_watch().
2. ssif_set_need_watch()->watch_timeout()->start_flag_fetch() change
ssif_state to SSIF_GETTING_FLAGS.
3. In msg_done_handler() ssif_state == SSIF_GETTING_FLAGS, if an error
message is received, the second branch does not modify the ssif_state.
4. All retry action need IS_SSIF_IDLE() == True. Include retry action in
watch_timeout(), msg_done_handler(). Sending msg does not work either.
SSIF_IDLE is also checked in start_next_msg().
5. The only thing that can be triggered in the SSIF driver is
watch_timeout(), after destory_user(), this timer will stop too.
So, if enter this branch, the ssif_state will remain SSIF_GETTING_FLAGS
and can't send msg, no timer started, can't unload.
We did a comparative test before and after adding this patch, and the
result is effective.
Fixes: 259307074bfc ("ipmi: Add SMBus interface driver (SSIF)")
Cc: stable@vger.kernel.org
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20230412074907.80046-1-zhangyuchen.lcr@bytedance.com>
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
A recent change removed an increment of send_retries, re-add it.
Fixes: 95767ed78a18 ipmi:ssif: resend_msg() cannot fail
Reported-by: Pavel Machek <pavel@denx.de>
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <minyard@acm.org>
|
|
The module pointer in class_create() never actually did anything, and it
shouldn't have been requred to be set as a parameter even if it did
something. So just remove it and fix up all callers of the function in
the kernel tree at the same time.
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
|
|
There is a spelling mistake in the comment information. Fix it.
Signed-off-by: zipeng zhang <zhangzipeng0@foxmail.com>
Message-Id: <tencent_F0BFF85BC7C1FC84E440A7B7D364D2ED4209@qq.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
register_sysctl_table() is a deprecated compatibility wrapper.
register_sysctl() can do the directory creation for you so just use
that.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Message-Id: <20230302204612.782387-3-mcgrof@kernel.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
REGMAP is a hidden (not user visible) symbol. Users cannot set it
directly thru "make *config", so drivers should select it instead of
depending on it if they need it.
Consistently using "select" or "depends on" can also help reduce
Kconfig circular dependency issues.
Therefore, change the use of "depends on REGMAP_MMIO" to
"select REGMAP_MMIO", which will also set REGMAP.
Fixes: eb994594bc22 ("ipmi: bt-bmc: Use a regmap for register access")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Andrew Jeffery <andrew@aj.id.au>
Cc: Corey Minyard <minyard@acm.org>
Cc: openipmi-developer@lists.sourceforge.net
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Message-Id: <20230226053953.4681-2-rdunlap@infradead.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
'This should be 'retry_time_ms' instead of 'max_retries'.
Fixes: 63c4eb347164 ("ipmi:ipmb: Add initial support for IPMI over IPMB")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Message-Id: <0d8670cff2c656e99a832a249e77dc90578f67de.1675591429.git.christophe.jaillet@wanadoo.fr>
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The IPMI spec has a time (T6) specified between request retries. Add
the handling for that.
Reported by: Tony Camuso <tcamuso@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
It was cruft left over from older handling of run to completion.
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Rename the SSIF_IDLE() to IS_SSIF_IDLE(), since that is more clear, and
rename SSIF_NORMAL to SSIF_IDLE, since that's more accurate.
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The resend_msg() function cannot fail, but there was error handling
around using it. Rework the handling of the error, and fix the out of
retries debug reporting that was wrong around this, too.
Cc: stable@vger.kernel.org
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Xu Panda <xu.panda@zte.com.cn>
The implementation of strscpy() is more robust and safer.
That's now the recommended way to copy NUL terminated strings.
Signed-off-by: Xu Panda <xu.panda@zte.com.cn>
Signed-off-by: Yang Yang <yang.yang29@zte.com>
Message-Id: <202212051936400309332@zte.com.cn>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The probe function doesn't make use of the i2c_device_id * parameter so it
can be trivially converted.
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Message-Id: <20221118224540.619276-606-uwe@kleine-koenig.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The intf_free() function frees the "intf" pointer so we cannot
dereference it again on the next line.
Fixes: cbb79863fc31 ("ipmi: Don't allow device module unload when in use")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Message-Id: <Y3M8xa1drZv4CToE@kili>
Cc: <stable@vger.kernel.org> # 5.5+
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The kstrto<something>() functions have been moved from kernel.h to
kstrtox.h.
So, in order to eventually remove <linux/kernel.h> from <linux/watchdog.h>,
include the latter directly in the appropriate files.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Message-Id: <37daa028845d90ee77f1e547121a051a983fec2e.1667647002.git.christophe.jaillet@wanadoo.fr>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The spec states that the minimum message retry time is 60ms, but it was
set to 20ms. Correct it.
Reported by: Tony Camuso <tcamuso@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The current code provokes some kernel-doc warnings:
drivers/char/ipmi/ipmi_msghandler.c:618: warning: This comment starts with '/**', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst
Signed-off-by: Bo Liu <liubo03@inspur.com>
Message-Id: <20221025060436.4372-1-liubo03@inspur.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
This fixes the following sparse warning:
sparse warnings: (new ones prefixed by >>)
>> drivers/char/ipmi/ssif_bmc.c:254:22: sparse: sparse: invalid assignment: |=
>> drivers/char/ipmi/ssif_bmc.c:254:22: sparse: left side has type restricted __poll_t
>> drivers/char/ipmi/ssif_bmc.c:254:22: sparse: right side has type int
Fixes: dd2bc5cc9e25 ("ipmi: ssif_bmc: Add SSIF BMC driver")
Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/all/202210181103.ontD9tRT-lkp@intel.com/
Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
Message-Id: <20221024075956.3312552-1-quan@os.amperecomputing.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
If you continue to access and send messages at a high frequency (once
every 55s) when the IPMI is disconnected, messages will accumulate in
intf->[hp_]xmit_msg. If it lasts long enough, it takes up a lot of
memory.
The reason is that if IPMI is disconnected, each message will be set to
IDLE after it returns to HOSED through IDLE->ERROR0->HOSED. The next
message goes through the same process when it comes in. This process
needs to wait for IBF_TIMEOUT * (MAX_ERROR_RETRIES + 1) = 55s.
Each message takes 55S to destroy. This results in a continuous increase
in memory.
I find that if I wait 5 seconds after the first message fails, the
status changes to ERROR0 in smi_timeout(). The next message will return
the error code IPMI_NOT_IN_MY_STATE_ERR directly without wait.
This is more in line with our needs.
So instead of setting each message state to IDLE after it reaches the
state HOSED, set state to ERROR0.
After testing, the problem has been solved, no matter how many
consecutive sends, will not cause continuous memory growth. It also
returns to normal immediately after the IPMI is restored.
In addition, the HOSED state should also count as invalid. So the HOSED
is removed from the invalid judgment in start_kcs_transaction().
The verification operations are as follows:
1. Use BPF to record the ipmi_alloc/free_smi_msg().
$ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
%p\n",retval);} kprobe:free_recv_msg {printf("free %p\n",arg0)}'
2. Exec `date; time for x in $(seq 1 2); do ipmitool mc info; done`.
3. Record the output of `time` and when free all msgs.
Before:
`time` takes 120s, This is because `ipmitool mc info` send 4 msgs and
waits only 15 seconds for each message. Last msg is free after 440s.
$ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
%p\n",retval);} kprobe:free_recv_msg {printf("free %p\n",arg0)}'
Oct 05 11:40:55 Attaching 2 probes...
Oct 05 11:41:12 alloc 0xffff9558a05f0c00
Oct 05 11:41:27 alloc 0xffff9558a05f1a00
Oct 05 11:41:42 alloc 0xffff9558a05f0000
Oct 05 11:41:57 alloc 0xffff9558a05f1400
Oct 05 11:42:07 free 0xffff9558a05f0c00
Oct 05 11:42:07 alloc 0xffff9558a05f7000
Oct 05 11:42:22 alloc 0xffff9558a05f2a00
Oct 05 11:42:37 alloc 0xffff9558a05f5a00
Oct 05 11:42:52 alloc 0xffff9558a05f3a00
Oct 05 11:43:02 free 0xffff9558a05f1a00
Oct 05 11:43:57 free 0xffff9558a05f0000
Oct 05 11:44:52 free 0xffff9558a05f1400
Oct 05 11:45:47 free 0xffff9558a05f7000
Oct 05 11:46:42 free 0xffff9558a05f2a00
Oct 05 11:47:37 free 0xffff9558a05f5a00
Oct 05 11:48:32 free 0xffff9558a05f3a00
$ root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
ipmitool mc info; done
Wed Oct 5 11:41:12 CST 2022
No data available
Get Device ID command failed
No data available
No data available
No valid response received
Get Device ID command failed: Unspecified error
No data available
Get Device ID command failed
No data available
No data available
No valid response received
No data available
Get Device ID command failed
real 1m55.052s
user 0m0.001s
sys 0m0.001s
After:
`time` takes 55s, all msgs is returned and free after 55s.
$ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc
%p\n",retval);} kprobe:free_recv_msg {printf("free %p\n",arg0)}'
Oct 07 16:30:35 Attaching 2 probes...
Oct 07 16:30:45 alloc 0xffff955943aa9800
Oct 07 16:31:00 alloc 0xffff955943aacc00
Oct 07 16:31:15 alloc 0xffff955943aa8c00
Oct 07 16:31:30 alloc 0xffff955943aaf600
Oct 07 16:31:40 free 0xffff955943aa9800
Oct 07 16:31:40 free 0xffff955943aacc00
Oct 07 16:31:40 free 0xffff955943aa8c00
Oct 07 16:31:40 free 0xffff955943aaf600
Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
Oct 07 16:31:40 free 0xffff9558ec8f7e00
Oct 07 16:31:40 alloc 0xffff9558ec8f7800
Oct 07 16:31:40 free 0xffff9558ec8f7800
Oct 07 16:31:40 alloc 0xffff9558ec8f7e00
Oct 07 16:31:40 free 0xffff9558ec8f7e00
Oct 07 16:31:40 alloc 0xffff9558ec8f7800
Oct 07 16:31:40 free 0xffff9558ec8f7800
root@dc00-pb003-t106-n078:~# date;time for x in $(seq 1 2); do
ipmitool mc info; done
Fri Oct 7 16:30:45 CST 2022
No data available
Get Device ID command failed
No data available
No data available
No valid response received
Get Device ID command failed: Unspecified error
Get Device ID command failed: 0xd5 Command not supported in present state
Get Device ID command failed: Command not supported in present state
real 0m55.038s
user 0m0.001s
sys 0m0.001s
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20221009091811.40240-2-zhangyuchen.lcr@bytedance.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
After the IPMI disconnect problem, the memory kept rising and we tried
to unload the driver to free the memory. However, only part of the
free memory is recovered after the driver is uninstalled. Using
ebpf to hook free functions, we find that neither ipmi_user nor
ipmi_smi_msg is free, only ipmi_recv_msg is free.
We find that the deliver_smi_err_response call in clean_smi_msgs does
the destroy processing on each message from the xmit_msg queue without
checking the return value and free ipmi_smi_msg.
deliver_smi_err_response is called only at this location. Adding the
free handling has no effect.
To verify, try using ebpf to trace the free function.
$ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc rcv
%p\n",retval);} kprobe:free_recv_msg {printf("free recv %p\n",
arg0)} kretprobe:ipmi_alloc_smi_msg {printf("alloc smi %p\n",
retval);} kprobe:free_smi_msg {printf("free smi %p\n",arg0)}'
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20221007092617.87597-4-zhangyuchen.lcr@bytedance.com>
[Fixed the comment above handle_one_recv_msg().]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
When fixing the problem mentioned in PATCH1, we also found
the following problem:
If the IPMI is disconnected and in the sending process, the
uninstallation driver will be stuck for a long time.
The main problem is that uninstalling the driver waits for curr_msg to
be sent or HOSED. After stopping tasklet, the only place to trigger the
timeout mechanism is the circular poll in shutdown_smi.
The poll function delays 10us and calls smi_event_handler(smi_info,10).
Smi_event_handler deducts 10us from kcs->ibf_timeout.
But the poll func is followed by schedule_timeout_uninterruptible(1).
The time consumed here is not counted in kcs->ibf_timeout.
So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has
actually passed. The waiting time has increased by more than a
hundredfold.
Now instead of calling poll(). call smi_event_handler() directly and
calculate the elapsed time.
For verification, you can directly use ebpf to check the kcs->
ibf_timeout for each call to kcs_event() when IPMI is disconnected.
Decrement at normal rate before unloading. The decrement rate becomes
very slow after unloading.
$ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n",
*(arg0+584));}'
Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com>
Message-Id: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Cc: stable@vger.kernel.org
|
|
The ASPEED KCS devices don't provide a BMC-side interrupt for the host
reading the output data register (ODR). The act of the host reading ODR
clears the output buffer full (OBF) flag in the status register (STR),
informing the BMC it can transmit a subsequent byte.
On the BMC side the KCS client must enable the OBE event *and* perform a
subsequent read of STR anyway to avoid races - the polling provides a
window for the host to read ODR if data was freshly written while
minimising BMC-side latency.
Fixes: 28651e6c4237 ("ipmi: kcs_bmc: Allow clients to control KCS IRQ state")
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-Id: <20220812144741.240315-1-andrew@aj.id.au>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
The SMBus system interface (SSIF) IPMI BMC driver can be used to perform
in-band IPMI communication with their host in management (BMC) side.
Thanks Dan for the copy_from_user() fix in the link below.
Link: https://lore.kernel.org/linux-arm-kernel/20220310114119.13736-4-quan@os.amperecomputing.com/
Signed-off-by: Quan Nguyen <quan@os.amperecomputing.com>
Message-Id: <20221004093106.1653317-2-quan@os.amperecomputing.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Pull IPMI updates from Corey Minyard:
"Fix a bunch of little problems in IPMI
This is mostly just doc, config, and little tweaks. Nothing big, which
is why there was nothing for 6.0. There is one crash fix, but it's not
something that I think anyone is using yet"
* tag 'for-linus-6.1-1' of https://github.com/cminyard/linux-ipmi:
ipmi: Remove unused struct watcher_entry
ipmi: kcs: aspeed: Update port address comments
ipmi: Add __init/__exit annotations to module init/exit funcs
ipmi:ipmb: Don't call ipmi_unregister_smi() on a register failure
ipmi:ipmb: Fix a vague comment and a typo
dt-binding: ipmi: add fallback to npcm845 compatible
ipmi: Fix comment typo
char: ipmi: modify NPCM KCS configuration
dt-bindings: ipmi: Add npcm845 compatible
|
|
After commit e86ee2d44b44("ipmi: Rework locking and shutdown for hot remove"),
no one use struct watcher_entry, so remove it.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Message-Id: <20220927133814.98929-1-yuancan@huawei.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|
|
Remove AST_usrGuide_KCS.pdf as it is no longer maintained.
Add more descriptions as the driver now supports the I/O
address configurations for both the KCS Data and Cmd/Status
interface registers.
Signed-off-by: Chia-Wei Wang <chiawei_wang@aspeedtech.com>
Message-Id: <20220920020333.601-1-chiawei_wang@aspeedtech.com>
[I don't like removing documentation, but the document in question
was a personal note by an employee and nothing official and not
necessarily guaranteed to be accurate in the future. So go
ahead and remove it.]
Signed-off-by: Corey Minyard <cminyard@mvista.com>
|