<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/edac, branch v5.4.113</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v5.4.113</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v5.4.113'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2020-12-30T10:51:36+00:00</updated>
<entry>
<title>EDAC/amd64: Fix PCI component registration</title>
<updated>2020-12-30T10:51:36+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2020-11-22T14:57:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=066d115fdd29ef4eda11758c2387fd5aa273f6cf'/>
<id>urn:sha1:066d115fdd29ef4eda11758c2387fd5aa273f6cf</id>
<content type='text'>
commit 706657b1febf446a9ba37dc51b89f46604f57ee9 upstream.

In order to setup its PCI component, the driver needs any node private
instance in order to get a reference to the PCI device and hand that
into edac_pci_create_generic_ctl(). For convenience, it uses the 0th
memory controller descriptor under the assumption that if any, the 0th
will be always present.

However, this assumption goes wrong when the 0th node doesn't have
memory and the driver doesn't initialize an instance for it:

  EDAC amd64: F17h detected (node 0).
  ...
  EDAC amd64: Node 0: No DIMMs detected.

But looking up node instances is not really needed - all one needs is
the pointer to the proper device which gets discovered during instance
init.

So stash that pointer into a variable and use it when setting up the
EDAC PCI component.

Clear that variable when the driver needs to unwind due to some
instances failing init to avoid any registration imbalance.

Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20201122150815.13808-1-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>EDAC/i10nm: Use readl() to access MMIO registers</title>
<updated>2020-12-30T10:51:36+00:00</updated>
<author>
<name>Qiuxu Zhuo</name>
<email>qiuxu.zhuo@intel.com</email>
</author>
<published>2020-11-17T12:49:51+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f4ce4a53c4e4f110b1c0a21b0cd9c003274132e8'/>
<id>urn:sha1:f4ce4a53c4e4f110b1c0a21b0cd9c003274132e8</id>
<content type='text'>
commit 83ff51c4e3fecf6b8587ce4d46f6eac59f5d7c5a upstream.

Instead of raw access, use readl() to access MMIO registers of
memory controller to avoid possible compiler re-ordering.

Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Qiuxu Zhuo &lt;qiuxu.zhuo@intel.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>EDAC/mce_amd: Use struct cpuinfo_x86.cpu_die_id for AMD NodeId</title>
<updated>2020-12-30T10:51:10+00:00</updated>
<author>
<name>Yazen Ghannam</name>
<email>yazen.ghannam@amd.com</email>
</author>
<published>2020-11-09T21:06:58+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3e08a61b2f94ee88099b82d66114a37343c29758'/>
<id>urn:sha1:3e08a61b2f94ee88099b82d66114a37343c29758</id>
<content type='text'>
[ Upstream commit 8de0c9917cc1297bc5543b61992d5bdee4ce621a ]

The edac_mce_amd module calls decode_dram_ecc() on AMD Family17h and
later systems. This function is used in amd64_edac_mod to do
system-specific decoding for DRAM ECC errors. The function takes a
"NodeId" as a parameter.

In AMD documentation, NodeId is used to identify a physical die in a
system. This can be used to identify a node in the AMD_NB code and also
it is used with umc_normaddr_to_sysaddr().

However, the input used for decode_dram_ecc() is currently the NUMA node
of a logical CPU. In the default configuration, the NUMA node and
physical die will be equivalent, so this doesn't have an impact.

But the NUMA node configuration can be adjusted with optional memory
interleaving modes. This will cause the NUMA node enumeration to not
match the physical die enumeration. The mismatch will cause the address
translation function to fail or report incorrect results.

Use struct cpuinfo_x86.cpu_die_id for the node_id parameter to ensure the
physical ID is used.

Fixes: fbe63acf62f5 ("EDAC, mce_amd: Use cpu_to_node() to find the node ID")
Signed-off-by: Yazen Ghannam &lt;yazen.ghannam@amd.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20201109210659.754018-4-Yazen.Ghannam@amd.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/ti: Fix handling of platform_get_irq() error</title>
<updated>2020-10-29T08:57:29+00:00</updated>
<author>
<name>Krzysztof Kozlowski</name>
<email>krzk@kernel.org</email>
</author>
<published>2020-08-27T07:07:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9aee8216556e93b166c6039d571a6a89a5fb6f36'/>
<id>urn:sha1:9aee8216556e93b166c6039d571a6a89a5fb6f36</id>
<content type='text'>
[ Upstream commit 66077adb70a2a9e92540155b2ace33ec98299c90 ]

platform_get_irq() returns a negative error number on error. In such a
case, comparison to 0 would pass the check therefore check the return
value properly, whether it is negative.

 [ bp: Massage commit message. ]

Fixes: 86a18ee21e5e ("EDAC, ti: Add support for TI keystone and DRA7xx EDAC")
Signed-off-by: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Tero Kristo &lt;t-kristo@ti.com&gt;
Link: https://lkml.kernel.org/r/20200827070743.26628-2-krzk@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/aspeed: Fix handling of platform_get_irq() error</title>
<updated>2020-10-29T08:57:29+00:00</updated>
<author>
<name>Krzysztof Kozlowski</name>
<email>krzk@kernel.org</email>
</author>
<published>2020-08-27T07:07:42+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=64a9f5a30fbb0adcf11321eedfc3c41e7b4fa32b'/>
<id>urn:sha1:64a9f5a30fbb0adcf11321eedfc3c41e7b4fa32b</id>
<content type='text'>
[ Upstream commit afce6996943be265fa39240b67025cfcb1bcdfb1 ]

platform_get_irq() returns a negative error number on error. In such a
case, comparison to 0 would pass the check therefore check the return
value properly, whether it is negative.

 [ bp: Massage commit message. ]

Fixes: 9b7e6242ee4e ("EDAC, aspeed: Add an Aspeed AST2500 EDAC driver")
Signed-off-by: Krzysztof Kozlowski &lt;krzk@kernel.org&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Reviewed-by: Stefan Schaeckeler &lt;schaecsn@gmx.net&gt;
Link: https://lkml.kernel.org/r/20200827070743.26628-1-krzk@kernel.org
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/i5100: Fix error handling order in i5100_init_one()</title>
<updated>2020-10-29T08:57:29+00:00</updated>
<author>
<name>Dinghao Liu</name>
<email>dinghao.liu@zju.edu.cn</email>
</author>
<published>2020-08-26T12:14:37+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4d86328e42c3e6f1aecd4e32c4980def0b717227'/>
<id>urn:sha1:4d86328e42c3e6f1aecd4e32c4980def0b717227</id>
<content type='text'>
[ Upstream commit 857a3139bd8be4f702c030c8ca06f3fd69c1741a ]

When pci_get_device_func() fails, the driver doesn't need to execute
pci_dev_put(). mci should still be freed, though, to prevent a memory
leak. When pci_enable_device() fails, the error injection PCI device
"einj" doesn't need to be disabled either.

 [ bp: Massage commit message, rename label to "bail_mc_free". ]

Fixes: 52608ba205461 ("i5100_edac: probe for device 19 function 0")
Signed-off-by: Dinghao Liu &lt;dinghao.liu@zju.edu.cn&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200826121437.31606-1-dinghao.liu@zju.edu.cn
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/ghes: Check whether the driver is on the safe list correctly</title>
<updated>2020-10-01T11:18:14+00:00</updated>
<author>
<name>Borislav Petkov</name>
<email>bp@suse.de</email>
</author>
<published>2020-09-11T16:17:30+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b11f2d6b80f4515893db949c0b12e4655f7fba9d'/>
<id>urn:sha1:b11f2d6b80f4515893db949c0b12e4655f7fba9d</id>
<content type='text'>
[ Upstream commit 251c54ea26fa6029b01a76161a37a12fde5124e4 ]

With CONFIG_DEBUG_TEST_DRIVER_REMOVE=y, a system would try to probe,
unregister and probe again a driver.

When ghes_edac is attempted to be loaded on a system which is not on
the safe platforms list, ghes_edac_register() would return early. The
unregister counterpart ghes_edac_unregister() would still attempt to
unregister and exit early at the refcount test, leading to the refcount
underflow below.

In order to not do *anything* on the unregister path too, reuse the
force_load parameter and check it on that path too, before fumbling with
the refcount.

  ghes_edac: ghes_edac_register: entry
  ghes_edac: ghes_edac_register: return -ENODEV
  ------------[ cut here ]------------
  refcount_t: underflow; use-after-free.
  WARNING: CPU: 10 PID: 1 at lib/refcount.c:28 refcount_warn_saturate+0xb9/0x100
  Modules linked in:
  CPU: 10 PID: 1 Comm: swapper/0 Not tainted 5.9.0-rc4+ #12
  Hardware name: GIGABYTE MZ01-CE1-00/MZ01-CE1-00, BIOS F02 08/29/2018
  RIP: 0010:refcount_warn_saturate+0xb9/0x100
  Code: 82 e8 fb 8f 4d 00 90 0f 0b 90 90 c3 80 3d 55 4c f5 00 00 75 88 c6 05 4c 4c f5 00 01 90 48 c7 c7 d0 8a 10 82 e8 d8 8f 4d 00 90 &lt;0f&gt; 0b 90 90 c3 80 3d 30 4c f5 00 00 0f 85 61 ff ff ff c6 05 23 4c
  RSP: 0018:ffffc90000037d58 EFLAGS: 00010292
  RAX: 0000000000000026 RBX: ffff88840b8da000 RCX: 0000000000000000
  RDX: 0000000000000001 RSI: ffffffff8216b24f RDI: 00000000ffffffff
  RBP: ffff88840c662e00 R08: 0000000000000001 R09: 0000000000000001
  R10: 0000000000000001 R11: 0000000000000046 R12: 0000000000000000
  R13: 0000000000000001 R14: 0000000000000000 R15: 0000000000000000
  FS:  0000000000000000(0000) GS:ffff88840ee80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000800002211000 CR4: 00000000003506e0
  Call Trace:
   ghes_edac_unregister
   ghes_remove
   platform_drv_remove
   really_probe
   driver_probe_device
   device_driver_attach
   __driver_attach
   ? device_driver_attach
   ? device_driver_attach
   bus_for_each_dev
   bus_add_driver
   driver_register
   ? bert_init
   ghes_init
   do_one_initcall
   ? rcu_read_lock_sched_held
   kernel_init_freeable
   ? rest_init
   kernel_init
   ret_from_fork
   ...
  ghes_edac: ghes_edac_unregister: FALSE, refcount: -1073741824

Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Link: https://lkml.kernel.org/r/20200911164950.GB19320@zn.tnic
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC/{i7core,sb,pnd2,skx}: Fix error event severity</title>
<updated>2020-09-03T09:26:53+00:00</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2020-07-07T19:43:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6d11320bed41fb0234c23eb31e03c2f37933180f'/>
<id>urn:sha1:6d11320bed41fb0234c23eb31e03c2f37933180f</id>
<content type='text'>
[ Upstream commit 45bc6098a3e279d8e391d22428396687562797e2 ]

IA32_MCG_STATUS.RIPV indicates whether the return RIP value pushed onto
the stack as part of machine check delivery is valid or not.

Various drivers copied a code fragment that uses the RIPV bit to
determine the severity of the error as either HW_EVENT_ERR_UNCORRECTED
or HW_EVENT_ERR_FATAL, but this check is reversed (marking errors where
RIPV is set as "FATAL").

Reverse the tests so that the error is marked fatal when RIPV is not set.

Reported-by: Gabriele Paoloni &lt;gabriele.paoloni@intel.com&gt;
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20200707194324.14884-1-tony.luck@intel.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC: skx_common: get rid of unused type var</title>
<updated>2020-09-03T09:26:52+00:00</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab+samsung@kernel.org</email>
</author>
<published>2019-09-13T14:23:08+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=87cc96bb11b9ea887bc58df253e138d78f1c5b63'/>
<id>urn:sha1:87cc96bb11b9ea887bc58df253e138d78f1c5b63</id>
<content type='text'>
[ Upstream commit f05390d30e20cccd8f8de981dee42bcdd8d2d137 ]

	drivers/edac/skx_common.c: In function ‘skx_mce_output_error’:
	drivers/edac/skx_common.c:478:8: warning: variable ‘type’ set but not used [-Wunused-but-set-variable]
	  478 |  char *type, *optype;
	      |        ^~~~

Acked-by: Borislav Petkov &lt;bp@alien8.de&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+samsung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>EDAC: sb_edac: get rid of unused vars</title>
<updated>2020-09-03T09:26:52+00:00</updated>
<author>
<name>Mauro Carvalho Chehab</name>
<email>mchehab+samsung@kernel.org</email>
</author>
<published>2019-09-13T14:17:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=3bf42b2e8d67d4ea3bb6bcce5b8eb66568dc6eb6'/>
<id>urn:sha1:3bf42b2e8d67d4ea3bb6bcce5b8eb66568dc6eb6</id>
<content type='text'>
[ Upstream commit 323014d85d2699b2879ecb15cd06a15408e3e801 ]

There are several vars unused on this driver, probably because
it was a modified copy of another driver. Get rid of them.

	drivers/edac/sb_edac.c: In function ‘knl_get_dimm_capacity’:
	drivers/edac/sb_edac.c:1343:16: warning: variable ‘sad_size’ set but not used [-Wunused-but-set-variable]
	 1343 |  u64 sad_base, sad_size, sad_limit = 0;
	      |                ^~~~~~~~
	drivers/edac/sb_edac.c: In function ‘sbridge_mce_output_error’:
	drivers/edac/sb_edac.c:2955:8: warning: variable ‘type’ set but not used [-Wunused-but-set-variable]
	 2955 |  char *type, *optype, msg[256];
	      |        ^~~~
	drivers/edac/sb_edac.c: In function ‘sbridge_unregister_mci’:
	drivers/edac/sb_edac.c:3203:22: warning: variable ‘pvt’ set but not used [-Wunused-but-set-variable]
	 3203 |  struct sbridge_pvt *pvt;
	      |                      ^~~
	At top level:
	drivers/edac/sb_edac.c:266:18: warning: ‘correrrthrsld’ defined but not used [-Wunused-const-variable=]
	  266 | static const u32 correrrthrsld[] = {
	      |                  ^~~~~~~~~~~~~
	drivers/edac/sb_edac.c:257:18: warning: ‘correrrcnt’ defined but not used [-Wunused-const-variable=]
	  257 | static const u32 correrrcnt[] = {
	      |                  ^~~~~~~~~~

Acked-by: Borislav Petkov &lt;bp@alien8.de&gt;
Acked-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+samsung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
