<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/ras, branch v6.12.80</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v6.12.80'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2025-12-18T12:55:04+00:00</updated>
<entry>
<title>RAS: Report all ARM processor CPER information to userspace</title>
<updated>2025-12-18T12:55:04+00:00</updated>
<author>
<name>Jason Tian</name>
<email>jason@os.amperecomputing.com</email>
</author>
<published>2025-08-14T16:52:52+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2599ad5e33b629a78a14a463a51afa134e9c5b15'/>
<id>urn:sha1:2599ad5e33b629a78a14a463a51afa134e9c5b15</id>
<content type='text'>
[ Upstream commit 05954511b73e748d0370549ad9dd9cd95297d97a ]

The ARM processor CPER record was added in UEFI v2.6 and remained
unchanged up to v2.10.

Yet, the original arm_event trace code added by

  e9279e83ad1f ("trace, ras: add ARM processor error trace event")

is incomplete, as it only traces some fields of UAPI 2.6 table N.16, not
exporting any information from tables N.17 to N.29 of the record.

This is not enough for the user to be able to figure out what has
exactly happened or to take appropriate action.

According to the UEFI v2.9 specification chapter N2.4.4, the ARM
processor error section includes:

- several (ERR_INFO_NUM) ARM processor error information structures
  (Tables N.17 to N.20);
- several (CONTEXT_INFO_NUM) ARM processor context information
  structures (Tables N.21 to N.29);
- several vendor specific error information structures. The
  size is given by Section Length minus the size of the other
  fields.

In addition, it also exports two fields that are parsed by the GHES
driver when firmware reports it, e.g.:

- error severity
- CPU logical index

Report all of these information to userspace via a the ARM tracepoint so
that userspace can properly record the error and take decisions related
to CPU core isolation according to error severity and other info.

The updated ARM trace event now contains the following fields:

======================================  =============================
UEFI field on table N.16                ARM Processor trace fields
======================================  =============================
Validation                              handled when filling data for
                                        affinity MPIDR and running
                                        state.
ERR_INFO_NUM                            pei_len
CONTEXT_INFO_NUM                        ctx_len
Section Length                          indirectly reported by
                                        pei_len, ctx_len and oem_len
Error affinity level                    affinity
MPIDR_EL1                               mpidr
MIDR_EL1                                midr
Running State                           running_state
PSCI State                              psci_state
Processor Error Information Structure   pei_err - count at pei_len
Processor Context                       ctx_err- count at ctx_len
Vendor Specific Error Info              oem - count at oem_len
======================================  =============================

It should be noted that decoding of tables N.17 to N.29, if needed, will
be handled in userspace. That gives more flexibility, as there won't be
any need to flood the kernel with micro-architecture specific error
decoding.

Also, decoding the other fields require a complex logic, and should be
done for each of the several values inside the record field.  So, let
userspace daemons like rasdaemon decode them, parsing such tables and
having vendor-specific micro-architecture-specific decoders.

 [mchehab: modified description, solved merge conflicts and fixed coding style]

Signed-off-by: Jason Tian &lt;jason@os.amperecomputing.com&gt;
Co-developed-by: Shengwei Luo &lt;luoshengwei@huawei.com&gt;
Signed-off-by: Shengwei Luo &lt;luoshengwei@huawei.com&gt;
Signed-off-by: Mauro Carvalho Chehab &lt;mchehab+huawei@kernel.org&gt;
Signed-off-by: Daniel Ferguson &lt;danielf@os.amperecomputing.com&gt; # rebased
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Tested-by: Shiju Jose &lt;shiju.jose@huawei.com&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Fixes: e9279e83ad1f ("trace, ras: add ARM processor error trace event")
Link: https://uefi.org/specs/UEFI/2.10/Apx_N_Common_Platform_Error_Record.html#arm-processor-error-section
Signed-off-by: Ard Biesheuvel &lt;ardb@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure: remove obsolete MF_MSG_DIFFERENT_COMPOUND</title>
<updated>2024-07-12T22:52:22+00:00</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2024-07-08T03:05:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8a78882dac1c8c464e047a29415edb50421651ce'/>
<id>urn:sha1:8a78882dac1c8c464e047a29415edb50421651ce</id>
<content type='text'>
The page cannot become compound pages again just after a folio is split as
an extra refcnt is held.  So the MF_MSG_DIFFERENT_COMPOUND case is
obsolete and can be removed to get rid of this false assumption and code
burden.  But add one WARN_ON() here to keep the situation clear.

Link: https://lkml.kernel.org/r/20240708030544.196919-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: Naoya Horiguchi &lt;nao.horiguchi@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure: remove MF_MSG_SLAB</title>
<updated>2024-07-04T02:30:10+00:00</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2024-06-12T07:18:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ceb32d6aa93ce3282a724f3a0828b4b84d5035f1'/>
<id>urn:sha1:ceb32d6aa93ce3282a724f3a0828b4b84d5035f1</id>
<content type='text'>
Since commit 46df8e73a4a3 ("mm: free up PG_slab"), MF_MSG_SLAB becomes
unused.  Remove it.  No functional change intended.

Link: https://lkml.kernel.org/r/20240612071835.157004-3-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Cc: Naoya Horiguchi &lt;nao.horiguchi@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure: improve memory failure action_result messages</title>
<updated>2024-07-04T02:29:57+00:00</updated>
<author>
<name>Jane Chu</name>
<email>jane.chu@oracle.com</email>
</author>
<published>2024-05-24T21:53:04+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b8b9488d50b7150bd4830dfff487e8d4ef6589ba'/>
<id>urn:sha1:b8b9488d50b7150bd4830dfff487e8d4ef6589ba</id>
<content type='text'>
Added two explicit MF_MSG messages describing failure in
get_hwpoison_page.  Attemped to document the definition of various action
names, and made a few adjustment to the action_result() calls.

Link: https://lkml.kernel.org/r/20240524215306.2705454-4-jane.chu@oracle.com
Signed-off-by: Jane Chu &lt;jane.chu@oracle.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Naoya Horiguchi &lt;nao.horiguchi@gmail.com&gt;
Cc: Oscar Salvador &lt;oalvador@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>tracing/treewide: Remove second parameter of __assign_str()</title>
<updated>2024-05-23T00:14:47+00:00</updated>
<author>
<name>Steven Rostedt (Google)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2024-05-16T17:34:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2c92ca849fcc6ee7d0c358e9959abc9f58661aea'/>
<id>urn:sha1:2c92ca849fcc6ee7d0c358e9959abc9f58661aea</id>
<content type='text'>
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.

This means that with:

  __string(field, mystring)

Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.

There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:

  git grep -l __assign_str | while read a ; do
      sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a &gt; /tmp/test-file;
      mv /tmp/test-file $a;
  done

I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.

Note, the same updates will need to be done for:

  __assign_str_len()
  __assign_rel_str()
  __assign_rel_str_len()

I tested this with both an allmodconfig and an allyesconfig (build only for both).

[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/

Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home

Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Julia Lawall &lt;Julia.Lawall@inria.fr&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Acked-by: Jani Nikula &lt;jani.nikula@intel.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt; for the amdgpu parts.
Acked-by: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt; #for
Acked-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt; # for thermal
Acked-by: Takashi Iwai &lt;tiwai@suse.de&gt;
Acked-by: Darrick J. Wong &lt;djwong@kernel.org&gt;	# xfs
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
</content>
</entry>
<entry>
<title>PCI/AER: Generalize TLP Header Log reading</title>
<updated>2024-03-08T21:26:46+00:00</updated>
<author>
<name>Ilpo Järvinen</name>
<email>ilpo.jarvinen@linux.intel.com</email>
</author>
<published>2024-02-06T13:57:15+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=0a5a46a6a61be7b63c12c18495d427f91f3662a9'/>
<id>urn:sha1:0a5a46a6a61be7b63c12c18495d427f91f3662a9</id>
<content type='text'>
Both AER and DPC RP PIO provide TLP Header Log registers (PCIe r6.1 secs
7.8.4 &amp; 7.9.14) to convey error diagnostics but the struct is named after
AER as the struct aer_header_log_regs. Also, not all places that handle TLP
Header Log use the struct and the struct members are named individually.

Generalize the struct name and members, and use it consistently where TLP
Header Log is being handled so that a pcie_read_tlp_log() helper can be
easily added.

Link: https://lore.kernel.org/r/20240206135717.8565-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
[bhelgaas: drop ixgbe changes for now, tidy whitespace]
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
<entry>
<title>mm, hwpoison: enable memory error handling on 1GB hugepage</title>
<updated>2022-08-09T01:06:44+00:00</updated>
<author>
<name>Naoya Horiguchi</name>
<email>naoya.horiguchi@nec.com</email>
</author>
<published>2022-07-14T04:24:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=6f4614886baa59b6ae014093300482c1da4d3c93'/>
<id>urn:sha1:6f4614886baa59b6ae014093300482c1da4d3c93</id>
<content type='text'>
Now error handling code is prepared, so remove the blocking code and
enable memory error handling on 1GB hugepage.

Link: https://lkml.kernel.org/r/20220714042420.1847125-9-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Cc: Liu Shixin &lt;liushixin2@huawei.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm/memory-failure.c: fix race with changing page compound again"</title>
<updated>2022-04-29T06:16:02+00:00</updated>
<author>
<name>Naoya Horiguchi</name>
<email>naoya.horiguchi@nec.com</email>
</author>
<published>2022-04-29T06:16:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2ba2b008a8bf5fd268a43d03ba79e0ad464d6836'/>
<id>urn:sha1:2ba2b008a8bf5fd268a43d03ba79e0ad464d6836</id>
<content type='text'>
Reverts commit 888af2701db7 ("mm/memory-failure.c: fix race with changing
page compound again") because now we fetch the page refcount under
hugetlb_lock in try_memory_failure_hugetlb() so that the race check is no
longer necessary.

Link: https://lkml.kernel.org/r/20220408135323.1559401-4-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Suggested-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure.c: fix race with changing page compound again</title>
<updated>2022-03-22T22:57:07+00:00</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2022-03-22T21:44:44+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=888af2701db79b9b27c7e37f9ede528a5ca53b76'/>
<id>urn:sha1:888af2701db79b9b27c7e37f9ede528a5ca53b76</id>
<content type='text'>
Patch series "A few fixup patches for memory failure", v2.

This series contains a few patches to fix the race with changing page
compound page, make non-LRU movable pages unhandlable and so on.  More
details can be found in the respective changelogs.

There is a race window where we got the compound_head, the hugetlb page
could be freed to buddy, or even changed to another compound page just
before we try to get hwpoison page.  Think about the below race window:

  CPU 1					  CPU 2
  memory_failure_hugetlb
  struct page *head = compound_head(p);
					  hugetlb page might be freed to
					  buddy, or even changed to another
					  compound page.

  get_hwpoison_page -- page is not what we want now...

If this race happens, just bail out.  Also MF_MSG_DIFFERENT_PAGE_SIZE is
introduced to record this event.

[akpm@linux-foundation.org: s@/**@/*@, per Naoya Horiguchi]

Link: https://lkml.kernel.org/r/20220312074613.4798-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220312074613.4798-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Acked-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hwpoison: remove MF_MSG_BUDDY_2ND and MF_MSG_POISONED_HUGE</title>
<updated>2022-01-15T14:30:31+00:00</updated>
<author>
<name>Naoya Horiguchi</name>
<email>naoya.horiguchi@nec.com</email>
</author>
<published>2022-01-14T22:09:06+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=c9fdc4d5487a16bd1f003fc8b66e91f88efb50e6'/>
<id>urn:sha1:c9fdc4d5487a16bd1f003fc8b66e91f88efb50e6</id>
<content type='text'>
These action_page_types are no longer used, so remove them.

Link: https://lkml.kernel.org/r/20211115084006.3728254-3-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Acked-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: "Aneesh Kumar K.V" &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Ding Hui &lt;dinghui@sangfor.com.cn&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
