From 1be01d4a57142ded23bdb9e0c8d9369e693b26cc Mon Sep 17 00:00:00 2001
From: Geert Uytterhoeven
Date: Thu, 14 Mar 2019 12:13:50 +0100
Subject: driver: base: Disable CONFIG_UEVENT_HELPER by default

Since commit 7934779a69f1184f ("Driver-Core: disable /sbin/hotplug by
default"), the help text for the /sbin/hotplug fork-bomb says
"This should not be used today [...] creates a high system load, or
[...] out-of-memory situations during bootup". The rationale for this
was that no recent mainstream system used this anymore (in 2010!).

A few years later, the complete uevent helper support was made optional
in commit 86d56134f1b67d0c ("kobject: Make support for uevent_helper
optional."). However, it was still left enabled by default, to support
ancient userland.

Time passed by, and nothing should use this anymore, so it can be
disabled by default.

Signed-off-by: Geert Uytterhoeven
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/Kconfig | 1 -
 1 file changed, 1 deletion(-)

(limited to 'drivers/base')

diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 059700ea3521..9fb2c4c92340 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -3,7 +3,6 @@ menu "Generic Driver Options"
 
 config UEVENT_HELPER
 	bool "Support for uevent helper"
-	default y
 	help
 	  The uevent helper program is forked by the kernel for
 	  every uevent.
-- 
cgit v1.2.3

From 08d9dbe72b1f899468b2b34f9309e88a84f440f2 Mon Sep 17 00:00:00 2001
From: Keith Busch
Date: Mon, 11 Mar 2019 14:56:00 -0600
Subject: node: Link memory nodes to their compute nodes

Systems may be constructed with various specialized nodes. Some nodes may
provide memory, some provide compute devices that access and use that
memory, and others may provide both. Nodes that provide memory are
referred to as memory targets, and nodes that can initiate memory access
are referred to as memory initiators.

Memory targets will often have varying access characteristics from
different initiators, and platforms may have ways to express those
relationships. In preparation for these systems, provide interfaces for
the kernel to export the memory relationship among different nodes memory
targets and their initiators with symlinks to each other.

If a system provides access locality for each initiator-target pair, nodes
may be grouped into ranked access classes relative to other nodes. The
new interface allows a subsystem to register relationships of varying
classes if available and desired to be exported.

A memory initiator may have multiple memory targets in the same access
class. The target memory's initiators in a given class indicate the
nodes access characteristics share the same performance relative to other
linked initiator nodes. Each target within an initiator's access class,
though, does not necessarily perform the same as each other.

A memory target node may have multiple memory initiators. All linked
initiators in a target's class have the same access characteristics to
that target.

The following example shows the nodes' new sysfs hierarchy for a memory
target node 'Y' with access class 0 from initiator node 'X':

  # symlinks -v /sys/devices/system/node/nodeX/access0/
  relative: /sys/devices/system/node/nodeX/access0/targets/nodeY -> ../../nodeY

  # symlinks -v /sys/devices/system/node/nodeY/access0/
  relative: /sys/devices/system/node/nodeY/access0/initiators/nodeX -> ../../nodeX

The new attributes are added to the sysfs stable documentation.

Reviewed-by: Jonathan Cameron
Signed-off-by: Keith Busch
Reviewed-by: Rafael J. Wysocki
Tested-by: Brice Goglin
Signed-off-by: Greg Kroah-Hartman
---
 Documentation/ABI/stable/sysfs-devices-node | 25 ++++-
 drivers/base/node.c | 142 +++++++++++++++++++++++++++-
 include/linux/node.h | 6 ++
 3 files changed, 171 insertions(+), 2 deletions(-)

(limited to 'drivers/base')

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index 3e90e1f3bf0a..433bcc04e542 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -90,4 +90,27 @@ Date:		December 2009
 Contact:	Lee Schermerhorn
 Description:
 		The node's huge page size control/query attributes.
-		See Documentation/admin-guide/mm/hugetlbpage.rst
\ No newline at end of file
+		See Documentation/admin-guide/mm/hugetlbpage.rst
+
+What:		/sys/devices/system/node/nodeX/accessY/
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The node's relationship to other nodes for access class "Y".
+
+What:		/sys/devices/system/node/nodeX/accessY/initiators/
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The directory containing symlinks to memory initiator
+		nodes that have class "Y" access to this target node's
+		memory. CPUs and other memory initiators in nodes not in
+		the list accessing this node's memory may have different
+		performance.
+
+What:		/sys/devices/system/node/nodeX/accessY/targets/
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The directory containing symlinks to memory targets that
+		this initiator node has class "Y" access.
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 86d6cd92ce3d..6f4097680580 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -59,6 +60,94 @@ static inline ssize_t node_read_cpulist(struct device *dev,
 static DEVICE_ATTR(cpumap, S_IRUGO, node_read_cpumask, NULL);
 static DEVICE_ATTR(cpulist, S_IRUGO, node_read_cpulist, NULL);
 
+/**
+ * struct node_access_nodes - Access class device to hold user visible
+ * 			      relationships to other nodes.
+ * @dev:	Device for this memory access class
+ * @list_node:	List element in the node's access list
+ * @access:	The access class rank
+ */
+struct node_access_nodes {
+	struct device		dev;
+	struct list_head	list_node;
+	unsigned		access;
+};
+#define to_access_nodes(dev) container_of(dev, struct node_access_nodes, dev)
+
+static struct attribute *node_init_access_node_attrs[] = {
+	NULL,
+};
+
+static struct attribute *node_targ_access_node_attrs[] = {
+	NULL,
+};
+
+static const struct attribute_group initiators = {
+	.name	= "initiators",
+	.attrs	= node_init_access_node_attrs,
+};
+
+static const struct attribute_group targets = {
+	.name	= "targets",
+	.attrs	= node_targ_access_node_attrs,
+};
+
+static const struct attribute_group *node_access_node_groups[] = {
+	&initiators,
+	&targets,
+	NULL,
+};
+
+static void node_remove_accesses(struct node *node)
+{
+	struct node_access_nodes *c, *cnext;
+
+	list_for_each_entry_safe(c, cnext, &node->access_list, list_node) {
+		list_del(&c->list_node);
+		device_unregister(&c->dev);
+	}
+}
+
+static void node_access_release(struct device *dev)
+{
+	kfree(to_access_nodes(dev));
+}
+
+static struct node_access_nodes *node_init_node_access(struct node *node,
+						       unsigned access)
+{
+	struct node_access_nodes *access_node;
+	struct device *dev;
+
+	list_for_each_entry(access_node, &node->access_list, list_node)
+		if (access_node->access == access)
+			return access_node;
+
+	access_node = kzalloc(sizeof(*access_node), GFP_KERNEL);
+	if (!access_node)
+		return NULL;
+
+	access_node->access = access;
+	dev = &access_node->dev;
+	dev->parent = &node->dev;
+	dev->release = node_access_release;
+	dev->groups = node_access_node_groups;
+	if (dev_set_name(dev, "access%u", access))
+		goto free;
+
+	if (device_register(dev))
+		goto free_name;
+
+	pm_runtime_no_callbacks(dev);
+	list_add_tail(&access_node->list_node, &node->access_list);
+	return access_node;
+free_name:
+	kfree_const(dev->kobj.name);
+free:
+	kfree(access_node);
+	return NULL;
+}
+
 #define K(x) ((x) << (PAGE_SHIFT - 10))
 static ssize_t node_read_meminfo(struct device *dev,
 			struct device_attribute *attr, char *buf)
@@ -340,7 +429,7 @@ static int register_node(struct node *node, int num)
 void unregister_node(struct node *node)
 {
 	hugetlb_unregister_node(node);		/* no-op, if memoryless node */
-
+	node_remove_accesses(node);
 	device_unregister(&node->dev);
 }
 
@@ -372,6 +461,56 @@ int register_cpu_under_node(unsigned int cpu, unsigned int nid)
 				kobject_name(&node_devices[nid]->dev.kobj));
 }
 
+/**
+ * register_memory_node_under_compute_node - link memory node to its compute
+ *					     node for a given access class.
+ * @mem_node:	Memory node number
+ * @cpu_node:	Cpu node number
+ * @access:	Access class to register
+ *
+ * Description:
+ * 	For use with platforms that may have separate memory and compute nodes.
+ * 	This function will export node relationships linking which memory
+ * 	initiator nodes can access memory targets at a given ranked access
+ * 	class.
+ */
+int register_memory_node_under_compute_node(unsigned int mem_nid,
+					    unsigned int cpu_nid,
+					    unsigned access)
+{
+	struct node *init_node, *targ_node;
+	struct node_access_nodes *initiator, *target;
+	int ret;
+
+	if (!node_online(cpu_nid) || !node_online(mem_nid))
+		return -ENODEV;
+
+	init_node = node_devices[cpu_nid];
+	targ_node = node_devices[mem_nid];
+	initiator = node_init_node_access(init_node, access);
+	target = node_init_node_access(targ_node, access);
+	if (!initiator || !target)
+		return -ENOMEM;
+
+	ret = sysfs_add_link_to_group(&initiator->dev.kobj, "targets",
+				      &targ_node->dev.kobj,
+				      dev_name(&targ_node->dev));
+	if (ret)
+		return ret;
+
+	ret = sysfs_add_link_to_group(&target->dev.kobj, "initiators",
+				      &init_node->dev.kobj,
+				      dev_name(&init_node->dev));
+	if (ret)
+		goto err;
+
+	return 0;
+ err:
+	sysfs_remove_link_from_group(&initiator->dev.kobj, "targets",
+				     dev_name(&targ_node->dev));
+	return ret;
+}
+
 int unregister_cpu_under_node(unsigned int cpu, unsigned int nid)
 {
 	struct device *obj;
@@ -580,6 +719,7 @@ int __register_one_node(int nid)
 			register_cpu_under_node(cpu, nid);
 	}
 
+	INIT_LIST_HEAD(&node_devices[nid]->access_list);
 	/* initialize work queue for memory hot plug */
 	init_node_hugetlb_work(nid);
 
diff --git a/include/linux/node.h b/include/linux/node.h
index 257bb3d6d014..bb288817ed33 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -17,10 +17,12 @@
 
 #include
 #include
+#include
 #include
 
 struct node {
 	struct device	dev;
+	struct list_head access_list;
 
 #if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
 	struct work_struct	node_work;
@@ -75,6 +77,10 @@ extern int register_mem_sect_under_node(struct memory_block *mem_blk,
 extern int unregister_mem_sect_under_nodes(struct memory_block *mem_blk,
 					   unsigned long phys_index);
 
+extern int register_memory_node_under_compute_node(unsigned int mem_nid,
+						   unsigned int cpu_nid,
+						   unsigned access);
+
 #ifdef CONFIG_HUGETLBFS
 extern void register_hugetlbfs_with_node(node_registration_func_t doregister,
 					 node_registration_func_t unregister);
-- 
cgit v1.2.3

From e1cf33aafb8462c7d0a0e6349925870316f040ee Mon Sep 17 00:00:00 2001
From: Keith Busch
Date: Mon, 11 Mar 2019 14:56:01 -0600
Subject: node: Add heterogenous memory access attributes

Heterogeneous memory systems provide memory nodes with different latency
and bandwidth performance attributes. Provide a new kernel interface
for subsystems to register the attributes under the memory target
node's initiator access class. If the system provides this information,
applications may query these attributes when deciding which node to
request memory.

The following example shows the new sysfs hierarchy for a node exporting
performance attributes:

  # tree -P "read*|write*" /sys/devices/system/node/nodeY/accessZ/initiators/
  /sys/devices/system/node/nodeY/accessZ/initiators/
  |-- read_bandwidth
  |-- read_latency
  |-- write_bandwidth
  `-- write_latency

The bandwidth is exported as MB/s and latency is reported in
nanoseconds. The values are taken from the platform as reported by the
manufacturer.

Memory accesses from an initiator node that is not one of the memory's
access "Z" initiator nodes linked in the same directory may observe
different performance than reported here. When a subsystem makes use
of this interface, initiators of a different access number may not have
the same performance relative to initiators in other access numbers, or
omitted from any access class's initiators.

Descriptions for memory access initiator performance access attributes
are added to sysfs stable documentation.

Acked-by: Jonathan Cameron
Tested-by: Jonathan Cameron
Signed-off-by: Keith Busch
Reviewed-by: Rafael J. Wysocki
Tested-by: Brice Goglin
Signed-off-by: Greg Kroah-Hartman
---
 Documentation/ABI/stable/sysfs-devices-node | 28 ++++++++++++++
 drivers/base/Kconfig | 8 ++++
 drivers/base/node.c | 59 +++++++++++++++++++++++++++++
 include/linux/node.h | 26 +++++++++++++
 4 files changed, 121 insertions(+)

(limited to 'drivers/base')

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index 433bcc04e542..735a40a3f9b2 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -114,3 +114,31 @@ Contact:	Keith Busch
 Description:
 		The directory containing symlinks to memory targets that
 		this initiator node has class "Y" access.
+
+What:		/sys/devices/system/node/nodeX/accessY/initiators/read_bandwidth
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		This node's read bandwidth in MB/s when accessed from
+		nodes found in this access class's linked initiators.
+
+What:		/sys/devices/system/node/nodeX/accessY/initiators/read_latency
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		This node's read latency in nanoseconds when accessed
+		from nodes found in this access class's linked initiators.
+
+What:		/sys/devices/system/node/nodeX/accessY/initiators/write_bandwidth
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		This node's write bandwidth in MB/s when accessed from
+		found in this access class's linked initiators.
+
+What:		/sys/devices/system/node/nodeX/accessY/initiators/write_latency
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		This node's write latency in nanoseconds when access
+		from nodes found in this class's linked initiators.
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index 9fb2c4c92340..1d47ab987413 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -148,6 +148,14 @@ config DEBUG_TEST_DRIVER_REMOVE
 	  unusable. You should say N here unless you are explicitly looking to
 	  test this functionality.
 
+config HMEM_REPORTING
+	bool
+	default n
+	depends on NUMA
+	help
+	  Enable reporting for heterogenous memory access attributes under
+	  their non-uniform memory nodes.
+
 source "drivers/base/test/Kconfig"
 
 config SYS_HYPERVISOR
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 6f4097680580..2de546a040a5 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -71,6 +71,9 @@ struct node_access_nodes {
 	struct device		dev;
 	struct list_head	list_node;
 	unsigned		access;
+#ifdef CONFIG_HMEM_REPORTING
+	struct node_hmem_attrs	hmem_attrs;
+#endif
 };
 #define to_access_nodes(dev) container_of(dev, struct node_access_nodes, dev)
 
@@ -148,6 +151,62 @@ free:
 	return NULL;
 }
 
+#ifdef CONFIG_HMEM_REPORTING
+#define ACCESS_ATTR(name) 						   \
+static ssize_t name##_show(struct device *dev,				   \
+			   struct device_attribute *attr,		   \
+			   char *buf)					   \
+{									   \
+	return sprintf(buf, "%u\n", to_access_nodes(dev)->hmem_attrs.name); \
+}									   \
+static DEVICE_ATTR_RO(name);
+
+ACCESS_ATTR(read_bandwidth)
+ACCESS_ATTR(read_latency)
+ACCESS_ATTR(write_bandwidth)
+ACCESS_ATTR(write_latency)
+
+static struct attribute *access_attrs[] = {
+	&dev_attr_read_bandwidth.attr,
+	&dev_attr_read_latency.attr,
+	&dev_attr_write_bandwidth.attr,
+	&dev_attr_write_latency.attr,
+	NULL,
+};
+
+/**
+ * node_set_perf_attrs - Set the performance values for given access class
+ * @nid: Node identifier to be set
+ * @hmem_attrs: Heterogeneous memory performance attributes
+ * @access: The access class the for the given attributes
+ */
+void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
+			 unsigned access)
+{
+	struct node_access_nodes *c;
+	struct node *node;
+	int i;
+
+	if (WARN_ON_ONCE(!node_online(nid)))
+		return;
+
+	node = node_devices[nid];
+	c = node_init_node_access(node, access);
+	if (!c)
+		return;
+
+	c->hmem_attrs = *hmem_attrs;
+	for (i = 0; access_attrs[i] != NULL; i++) {
+		if (sysfs_add_file_to_group(&c->dev.kobj, access_attrs[i],
+					    "initiators")) {
+			pr_info("failed to add performance attribute to node %d\n",
+				nid);
+			break;
+		}
+	}
+}
+#endif
+
 #define K(x) ((x) << (PAGE_SHIFT - 10))
 static ssize_t node_read_meminfo(struct device *dev,
 			struct device_attribute *attr, char *buf)
diff --git a/include/linux/node.h b/include/linux/node.h
index bb288817ed33..4139d728f8b3 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -20,6 +20,32 @@
 #include
 #include
 
+/**
+ * struct node_hmem_attrs - heterogeneous memory performance attributes
+ *
+ * @read_bandwidth:	Read bandwidth in MB/s
+ * @write_bandwidth:	Write bandwidth in MB/s
+ * @read_latency:	Read latency in nanoseconds
+ * @write_latency:	Write latency in nanoseconds
+ */
+struct node_hmem_attrs {
+	unsigned int read_bandwidth;
+	unsigned int write_bandwidth;
+	unsigned int read_latency;
+	unsigned int write_latency;
+};
+
+#ifdef CONFIG_HMEM_REPORTING
+void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
+			 unsigned access);
+#else
+static inline void node_set_perf_attrs(unsigned int nid,
+				       struct node_hmem_attrs *hmem_attrs,
+				       unsigned access)
+{
+}
+#endif
+
 struct node {
 	struct device	dev;
 	struct list_head access_list;
-- 
cgit v1.2.3

From acc02a109b0497e917c83f986a89c51e47d0022c Mon Sep 17 00:00:00 2001
From: Keith Busch
Date: Mon, 11 Mar 2019 14:56:02 -0600
Subject: node: Add memory-side caching attributes

System memory may have caches to help improve access speed to frequently
requested address ranges. While the system provided cache is transparent
to the software accessing these memory ranges, applications can optimize
their own access based on cache attributes.

Provide a new API for the kernel to register these memory-side caches
under the memory node that provides it.

The new sysfs representation is modeled from the existing cpu cacheinfo
attributes, as seen from /sys/devices/system/cpu//cache/. Unlike CPU
cacheinfo though, the node cache level is reported from the view of the
memory.
A higher level number is nearer to the CPU, while lower levels
are closer to the last level memory.

The exported attributes are the cache size, the line size, associativity
indexing, and write back policy, and add the attributes for the system
memory caches to sysfs stable documentation.

Signed-off-by: Keith Busch
Reviewed-by: Rafael J. Wysocki
Reviewed-by: Brice Goglin
Tested-by: Brice Goglin
Signed-off-by: Greg Kroah-Hartman
---
 Documentation/ABI/stable/sysfs-devices-node | 34 +++++++
 drivers/base/node.c | 151 ++++++++++++++++++++++++++++
 include/linux/node.h | 39 +++++++
 3 files changed, 224 insertions(+)

(limited to 'drivers/base')

diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index 735a40a3f9b2..f7ce68fbd4b9 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -142,3 +142,37 @@ Contact:	Keith Busch
 Description:
 		This node's write latency in nanoseconds when access
 		from nodes found in this class's linked initiators.
+
+What:		/sys/devices/system/node/nodeX/memory_side_cache/indexY/
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The directory containing attributes for the memory-side cache
+		level 'Y'.
+
+What:		/sys/devices/system/node/nodeX/memory_side_cache/indexY/indexing
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The caches associativity indexing: 0 for direct mapped,
+		non-zero if indexed.
+
+What:		/sys/devices/system/node/nodeX/memory_side_cache/indexY/line_size
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The number of bytes accessed from the next cache level on a
+		cache miss.
+
+What:		/sys/devices/system/node/nodeX/memory_side_cache/indexY/size
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The size of this memory side cache in bytes.
+
+What:		/sys/devices/system/node/nodeX/memory_side_cache/indexY/write_policy
+Date:		December 2018
+Contact:	Keith Busch
+Description:
+		The cache write policy: 0 for write-back, 1 for write-through,
+		other or unknown.
diff --git a/drivers/base/node.c b/drivers/base/node.c
index 2de546a040a5..8598fcbd2a17 100644
--- a/drivers/base/node.c
+++ b/drivers/base/node.c
@@ -205,6 +205,155 @@ void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
 		}
 	}
 }
+
+/**
+ * struct node_cache_info - Internal tracking for memory node caches
+ * @dev:	Device represeting the cache level
+ * @node:	List element for tracking in the node
+ * @cache_attrs:Attributes for this cache level
+ */
+struct node_cache_info {
+	struct device dev;
+	struct list_head node;
+	struct node_cache_attrs cache_attrs;
+};
+#define to_cache_info(device) container_of(device, struct node_cache_info, dev)
+
+#define CACHE_ATTR(name, fmt) 						\
+static ssize_t name##_show(struct device *dev,				\
+			   struct device_attribute *attr,		\
+			   char *buf)					\
+{									\
+	return sprintf(buf, fmt "\n", to_cache_info(dev)->cache_attrs.name);\
+}									\
+DEVICE_ATTR_RO(name);
+
+CACHE_ATTR(size, "%llu")
+CACHE_ATTR(line_size, "%u")
+CACHE_ATTR(indexing, "%u")
+CACHE_ATTR(write_policy, "%u")
+
+static struct attribute *cache_attrs[] = {
+	&dev_attr_indexing.attr,
+	&dev_attr_size.attr,
+	&dev_attr_line_size.attr,
+	&dev_attr_write_policy.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(cache);
+
+static void node_cache_release(struct device *dev)
+{
+	kfree(dev);
+}
+
+static void node_cacheinfo_release(struct device *dev)
+{
+	struct node_cache_info *info = to_cache_info(dev);
+	kfree(info);
+}
+
+static void node_init_cache_dev(struct node *node)
+{
+	struct device *dev;
+
+	dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+	if (!dev)
+		return;
+
+	dev->parent = &node->dev;
+	dev->release = node_cache_release;
+	if (dev_set_name(dev, "memory_side_cache"))
+		goto free_dev;
+
+	if (device_register(dev))
+		goto free_name;
+
+	pm_runtime_no_callbacks(dev);
+	node->cache_dev = dev;
+	return;
+free_name:
+	kfree_const(dev->kobj.name);
+free_dev:
+	kfree(dev);
+}
+
+/**
+ * node_add_cache() - add cache attribute to a memory node
+ * @nid: Node identifier that has new cache attributes
+ * @cache_attrs: Attributes for the cache being added
+ */
+void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs)
+{
+	struct node_cache_info *info;
+	struct device *dev;
+	struct node *node;
+
+	if (!node_online(nid) || !node_devices[nid])
+		return;
+
+	node = node_devices[nid];
+	list_for_each_entry(info, &node->cache_attrs, node) {
+		if (info->cache_attrs.level == cache_attrs->level) {
+			dev_warn(&node->dev,
+				"attempt to add duplicate cache level:%d\n",
+				cache_attrs->level);
+			return;
+		}
+	}
+
+	if (!node->cache_dev)
+		node_init_cache_dev(node);
+	if (!node->cache_dev)
+		return;
+
+	info = kzalloc(sizeof(*info), GFP_KERNEL);
+	if (!info)
+		return;
+
+	dev = &info->dev;
+	dev->parent = node->cache_dev;
+	dev->release = node_cacheinfo_release;
+	dev->groups = cache_groups;
+	if (dev_set_name(dev, "index%d", cache_attrs->level))
+		goto free_cache;
+
+	info->cache_attrs = *cache_attrs;
+	if (device_register(dev)) {
+		dev_warn(&node->dev, "failed to add cache level:%d\n",
+			 cache_attrs->level);
+		goto free_name;
+	}
+	pm_runtime_no_callbacks(dev);
+	list_add_tail(&info->node, &node->cache_attrs);
+	return;
+free_name:
+	kfree_const(dev->kobj.name);
+free_cache:
+	kfree(info);
+}
+
+static void node_remove_caches(struct node *node)
+{
+	struct node_cache_info *info, *next;
+
+	if (!node->cache_dev)
+		return;
+
+	list_for_each_entry_safe(info, next, &node->cache_attrs, node) {
+		list_del(&info->node);
+		device_unregister(&info->dev);
+	}
+	device_unregister(node->cache_dev);
+}
+
+static void node_init_caches(unsigned int nid)
+{
+	INIT_LIST_HEAD(&node_devices[nid]->cache_attrs);
+}
+#else
+static void node_init_caches(unsigned int nid) { }
+static void node_remove_caches(struct node *node) { }
 #endif
 
 #define K(x) ((x) << (PAGE_SHIFT - 10))
@@ -489,6 +638,7 @@ void unregister_node(struct node *node)
 {
 	hugetlb_unregister_node(node);		/* no-op, if memoryless node */
 	node_remove_accesses(node);
+	node_remove_caches(node);
 	device_unregister(&node->dev);
 }
 
@@ -781,6 +931,7 @@ int __register_one_node(int nid)
 	INIT_LIST_HEAD(&node_devices[nid]->access_list);
 	/* initialize work queue for memory hot plug */
 	init_node_hugetlb_work(nid);
+	node_init_caches(nid);
 
 	return error;
 }
diff --git a/include/linux/node.h b/include/linux/node.h
index 4139d728f8b3..1a557c589ecb 100644
--- a/include/linux/node.h
+++ b/include/linux/node.h
@@ -35,10 +35,45 @@ struct node_hmem_attrs {
 	unsigned int write_latency;
 };
 
+enum cache_indexing {
+	NODE_CACHE_DIRECT_MAP,
+	NODE_CACHE_INDEXED,
+	NODE_CACHE_OTHER,
+};
+
+enum cache_write_policy {
+	NODE_CACHE_WRITE_BACK,
+	NODE_CACHE_WRITE_THROUGH,
+	NODE_CACHE_WRITE_OTHER,
+};
+
+/**
+ * struct node_cache_attrs - system memory caching attributes
+ *
+ * @indexing:		The ways memory blocks may be placed in cache
+ * @write_policy:	Write back or write through policy
+ * @size:		Total size of cache in bytes
+ * @line_size:		Number of bytes fetched on a cache miss
+ * @level:		The cache hierarchy level
+ */
+struct node_cache_attrs {
+	enum cache_indexing indexing;
+	enum cache_write_policy write_policy;
+	u64 size;
+	u16 line_size;
+	u8 level;
+};
+
 #ifdef CONFIG_HMEM_REPORTING
+void node_add_cache(unsigned int nid, struct node_cache_attrs *cache_attrs);
 void node_set_perf_attrs(unsigned int nid, struct node_hmem_attrs *hmem_attrs,
 			 unsigned access);
 #else
+static inline void node_add_cache(unsigned int nid,
+				  struct node_cache_attrs *cache_attrs)
+{
+}
+
 static inline void node_set_perf_attrs(unsigned int nid,
 				       struct node_hmem_attrs *hmem_attrs,
 				       unsigned access)
@@ -53,6 +88,10 @@ struct node {
 #if defined(CONFIG_MEMORY_HOTPLUG_SPARSE) && defined(CONFIG_HUGETLBFS)
 	struct work_struct	node_work;
 #endif
+#ifdef CONFIG_HMEM_REPORTING
+	struct list_head cache_attrs;
+	struct device *cache_dev;
+#endif
 };
 
 struct memory_block;
-- 
cgit v1.2.3

From 5d777b185f6db92d8e201a7402f7b242958aafad Mon Sep 17 00:00:00 2001
From: Lingutla Chandrasekhar
Date: Mon, 1 Apr 2019 09:54:41 +0530
Subject: arch_topology: Make cpu_capacity sysfs node as read-only

If user updates any cpu's cpu_capacity, then the new value is going to
be applied to all its online sibling cpus. But this need not always be
correct, as sibling cpus (in ARM, same micro architecture cpus) would have
different cpu_capacity with different performance characteristics.
So, updating the user supplied cpu_capacity to all cpu siblings
is not correct.

And another problem is, current code assumes that 'all cpus in a cluster
or with same package_id (core_siblings), would have same cpu_capacity'.
But with commit '5bdd2b3f0f8 ("arm64: topology: add support to remove
cpu topology sibling masks")', when a cpu is hotplugged out, the cpu
information gets cleared in its sibling cpus. So, user supplied
cpu_capacity would be applied to only online sibling cpus at the time.
After that, if any cpu is hotplugged in, it would have different
cpu_capacity than its siblings, which breaks the above assumption.

So, instead of mucking around the core sibling mask for user supplied
value, use device-tree to set cpu capacity. And make the cpu_capacity
node as read-only to know the asymmetry between cpus in the system.
While at it, remove cpu_scale_mutex usage, which was used for sysfs write
protection.

Tested-by: Dietmar Eggemann
Tested-by: Quentin Perret
Reviewed-by: Quentin Perret
Acked-by: Sudeep Holla
Signed-off-by: Lingutla Chandrasekhar
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/arch_topology.c | 36 +-----------------------------------
 1 file changed, 1 insertion(+), 35 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index edfcf8d982e4..1739d7e1952a 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -7,7 +7,6 @@
  */
 
 #include
-#include
 #include
 #include
 #include
@@ -31,7 +30,6 @@ void arch_set_freq_scale(struct cpumask *cpus, unsigned long cur_freq,
 		per_cpu(freq_scale, i) = scale;
 }
 
-static DEFINE_MUTEX(cpu_scale_mutex);
 DEFINE_PER_CPU(unsigned long, cpu_scale) = SCHED_CAPACITY_SCALE;
 
 void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity)
@@ -51,37 +49,7 @@ static ssize_t cpu_capacity_show(struct device *dev,
 static void update_topology_flags_workfn(struct work_struct *work);
 static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
 
-static ssize_t cpu_capacity_store(struct device *dev,
-				  struct device_attribute *attr,
-				  const char *buf,
-				  size_t count)
-{
-	struct cpu *cpu = container_of(dev, struct cpu, dev);
-	int this_cpu = cpu->dev.id;
-	int i;
-	unsigned long new_capacity;
-	ssize_t ret;
-
-	if (!count)
-		return 0;
-
-	ret = kstrtoul(buf, 0, &new_capacity);
-	if (ret)
-		return ret;
-	if (new_capacity > SCHED_CAPACITY_SCALE)
-		return -EINVAL;
-
-	mutex_lock(&cpu_scale_mutex);
-	for_each_cpu(i, &cpu_topology[this_cpu].core_sibling)
-		topology_set_cpu_scale(i, new_capacity);
-	mutex_unlock(&cpu_scale_mutex);
-
-	schedule_work(&update_topology_flags_work);
-
-	return count;
-}
-
-static DEVICE_ATTR_RW(cpu_capacity);
+static DEVICE_ATTR_RO(cpu_capacity);
 
 static int register_cpu_capacity_sysctl(void)
 {
@@ -141,7 +109,6 @@ void topology_normalize_cpu_scale(void)
 		return;
 
 	pr_debug("cpu_capacity: capacity_scale=%u\n", capacity_scale);
-	mutex_lock(&cpu_scale_mutex);
 	for_each_possible_cpu(cpu) {
 		pr_debug("cpu_capacity: cpu=%d raw_capacity=%u\n",
 			 cpu, raw_capacity[cpu]);
@@ -151,7 +118,6 @@ void topology_normalize_cpu_scale(void)
 		pr_debug("cpu_capacity: CPU%d cpu_capacity=%lu\n",
 			cpu, topology_get_cpu_scale(NULL, cpu));
 	}
-	mutex_unlock(&cpu_scale_mutex);
 }
 
 bool __init topology_parse_cpu_capacity(struct device_node *cpu_node, int cpu)
-- 
cgit v1.2.3

From 47bcc18c7e76adfa0b0d9fe99c78f0cbc0ca6b9c Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Tue, 2 Apr 2019 15:32:03 +0200
Subject: drivers: base: test: add proper SPDX identifier to Makefile

The Makefile in the drivers/base/test/ directory did not have a SPDX
identifier on it, so fix that up.

Cc: "Rafael J. Wysocki"
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/test/Makefile | 1 +
 1 file changed, 1 insertion(+)

(limited to 'drivers/base')

diff --git a/drivers/base/test/Makefile b/drivers/base/test/Makefile
index 90477c5fd9f9..0f1f7277a013 100644
--- a/drivers/base/test/Makefile
+++ b/drivers/base/test/Makefile
@@ -1 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_TEST_ASYNC_DRIVER_PROBE)	+= test_async_driver_probe.o
-- 
cgit v1.2.3

From 50f86aedfa96c53498971a416d1b34cf1b13282e Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Tue, 2 Apr 2019 15:32:02 +0200
Subject: drivers: base: firmware_loader: add proper SPDX identifiers on files that did not have them.

There were two files in the firmware_loader code that did not have SPDX
identifiers on them, so fix that up.

Cc: Luis Chamberlain
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/firmware_loader/Kconfig | 1 +
 drivers/base/firmware_loader/builtin/.gitignore | 1 +
 2 files changed, 2 insertions(+)

(limited to 'drivers/base')

diff --git a/drivers/base/firmware_loader/Kconfig b/drivers/base/firmware_loader/Kconfig
index eb15d976a9ea..38f2da6f5c2b 100644
--- a/drivers/base/firmware_loader/Kconfig
+++ b/drivers/base/firmware_loader/Kconfig
@@ -1,3 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
 menu "Firmware loader"
 
 config FW_LOADER
diff --git a/drivers/base/firmware_loader/builtin/.gitignore b/drivers/base/firmware_loader/builtin/.gitignore
index 9c8bdb9fdcc3..166f76b43049 100644
--- a/drivers/base/firmware_loader/builtin/.gitignore
+++ b/drivers/base/firmware_loader/builtin/.gitignore
@@ -1 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0
 *.gen.S
-- 
cgit v1.2.3

From 5de363b66a37a0193e28a2de64fa4996159bd5ee Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Tue, 2 Apr 2019 15:32:01 +0200
Subject: drivers: base: power: add proper SPDX identifiers on files that did not have them.

There were a few files in the driver core power code that did not have
SPDX identifiers on them, so fix that up. At the same time, remove the
"free form" text that specified the license of the file, as that is
impossible for any tool to properly parse.

Cc: "Rafael J. Wysocki"
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/power/clock_ops.c | 3 +--
 drivers/base/power/common.c | 4 +---
 drivers/base/power/domain.c | 4 +---
 drivers/base/power/domain_governor.c | 4 +---
 drivers/base/power/generic_ops.c | 4 +---
 drivers/base/power/main.c | 4 +---
 drivers/base/power/qos.c | 6 +-----
 drivers/base/power/runtime.c | 4 +---
 drivers/base/power/sysfs.c | 6 ++----
 drivers/base/power/trace.c | 2 +-
 drivers/base/power/wakeirq.c | 15 ++-------------
 drivers/base/power/wakeup.c | 4 +---
 12 files changed, 14 insertions(+), 46 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/power/clock_ops.c b/drivers/base/power/clock_ops.c
index 365ad751ce0f..59d19dd64928 100644
--- a/drivers/base/power/clock_ops.c
+++ b/drivers/base/power/clock_ops.c
@@ -1,9 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/clock_ops.c - Generic clock manipulation PM callbacks
  *
  * Copyright (c) 2011 Rafael J. Wysocki , Renesas Electronics Corp.
- *
- * This file is released under the GPLv2.
  */
 
 #include
diff --git a/drivers/base/power/common.c b/drivers/base/power/common.c
index 22aedb28aad7..8db98a1f83dc 100644
--- a/drivers/base/power/common.c
+++ b/drivers/base/power/common.c
@@ -1,11 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/common.c - Common device power management code.
  *
  * Copyright (C) 2011 Rafael J. Wysocki , Renesas Electronics Corp.
- *
- * This file is released under the GPLv2.
  */
-
 #include
 #include
 #include
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 96a6dc9d305c..c98ac27d6443 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1,11 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/domain.c - Common code related to device power domains.
  *
  * Copyright (C) 2011 Rafael J. Wysocki , Renesas Electronics Corp.
- *
- * This file is released under the GPLv2.
  */
-
 #define pr_fmt(fmt) "PM: " fmt
 
 #include
diff --git a/drivers/base/power/domain_governor.c b/drivers/base/power/domain_governor.c
index 4d07e38a8247..35925b174d0d 100644
--- a/drivers/base/power/domain_governor.c
+++ b/drivers/base/power/domain_governor.c
@@ -1,11 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/domain_governor.c - Governors for device PM domains.
  *
  * Copyright (C) 2011 Rafael J. Wysocki , Renesas Electronics Corp.
- *
- * This file is released under the GPLv2.
  */
-
 #include
 #include
 #include
diff --git a/drivers/base/power/generic_ops.c b/drivers/base/power/generic_ops.c
index b2ed606265a8..4fa525668cb7 100644
--- a/drivers/base/power/generic_ops.c
+++ b/drivers/base/power/generic_ops.c
@@ -1,11 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/generic_ops.c - Generic PM callbacks for subsystems
  *
  * Copyright (c) 2010 Rafael J. Wysocki , Novell Inc.
- *
- * This file is released under the GPLv2.
  */
-
 #include
 #include
 #include
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index f80d298de3fa..bd5bb5a09723 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -1,12 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/main.c - Where the driver meets power management.
  *
  * Copyright (c) 2003 Patrick Mochel
  * Copyright (c) 2003 Open Source Development Lab
  *
- * This file is released under the GPLv2
- *
- *
  * The driver model core calls device_pm_add() when a device is registered.
  * This will initialize the embedded device_pm_info object in the device
  * and add it to the list of power-controlled devices. sysfs entries for
diff --git a/drivers/base/power/qos.c b/drivers/base/power/qos.c
index f80e402ef778..6c91f8df1d59 100644
--- a/drivers/base/power/qos.c
+++ b/drivers/base/power/qos.c
@@ -1,13 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Devices PM QoS constraints management
  *
  * Copyright (C) 2011 Texas Instruments, Inc.
  *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- *
  * This module exposes the interface to kernel space for specifying
  * per-device PM QoS dependencies. It provides infrastructure for registration
  * of:
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 977db40378b0..952a1e7057c7 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1,12 +1,10 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/runtime.c - Helper functions for device runtime PM
  *
  * Copyright (c) 2009 Rafael J. Wysocki , Novell Inc.
  * Copyright (C) 2010 Alan Stern
- *
- * This file is released under the GPLv2.
  */
-
 #include
 #include
 #include
diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
index 1226e441ddfe..1b9c281cbe41 100644
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -1,7 +1,5 @@
-/*
- * drivers/base/power/sysfs.c - sysfs entries for device PM
- */
-
+// SPDX-License-Identifier: GPL-2.0
+/* sysfs entries for device PM */
 #include
 #include
 #include
diff --git a/drivers/base/power/trace.c b/drivers/base/power/trace.c
index 2bd9d2c744ca..977d27bd1a22 100644
--- a/drivers/base/power/trace.c
+++ b/drivers/base/power/trace.c
@@ -1,3 +1,4 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/trace.c
  *
@@ -6,7 +7,6 @@
  * Trace facility for suspend/resume problems, when none of the
  * devices may be working.
  */
-
 #define pr_fmt(fmt) "PM: " fmt
 
 #include
diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c
index b8fa5c0f2d13..5ce77d1ef9fc 100644
--- a/drivers/base/power/wakeirq.c
+++ b/drivers/base/power/wakeirq.c
@@ -1,16 +1,5 @@
-/*
- * wakeirq.c - Device wakeirq helper functions
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed "as is" WITHOUT ANY WARRANTY of any
- * kind, whether express or implied; without even the implied warranty
- * of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- */
-
+// SPDX-License-Identifier: GPL-2.0
+/* Device wakeirq helper functions */
 #include
 #include
 #include
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index bb1ae175fae1..9066b2dfc3e1 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -1,11 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * drivers/base/power/wakeup.c - System wakeup events framework
  *
  * Copyright (c) 2010 Rafael J. Wysocki , Novell Inc.
- *
- * This file is released under the GPLv2.
  */
-
 #define pr_fmt(fmt) "PM: " fmt
 
 #include
-- 
cgit v1.2.3

From affada726cad2402804ec29fc000276c7dc23b95 Mon Sep 17 00:00:00 2001
From: Borislav Petkov
Date: Thu, 18 Apr 2019 19:41:56 +0200
Subject: driver core: Clarify which counterparts to use to device_add()

It is not absolutely clear from the docs how the cleanup path after
device_add() should look like so spell it out explicitly. No functional
changes, just documentation.

Signed-off-by: Borislav Petkov
Reviewed-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/core.c | 5 +++++
 1 file changed, 5 insertions(+)

(limited to 'drivers/base')

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 4aeaa0c92bda..fd7511e04e62 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -1999,6 +1999,11 @@ static int device_private_init(struct device *dev)
  * NOTE: _Never_ directly free @dev after calling this function, even
  * if it returned an error! Always use put_device() to give up your
  * reference instead.
+ *
+ * Rule of thumb is: if device_add() succeeds, you should call
+ * device_del() when you want to get rid of it. If device_add() has
+ * *not* succeeded, use *only* put_device() to drop the reference
+ * count.
  */
 int device_add(struct device *dev)
 {
-- 
cgit v1.2.3

From d2ab99403ee00d8014e651728a4702ea1ae5e52c Mon Sep 17 00:00:00 2001
From: zhong jiang
Date: Mon, 8 Apr 2019 12:07:17 +0800
Subject: mm/memory_hotplug: Do not unlock when fails to take the device_hotplug_lock

When adding the memory by probing memory block in sysfs interface, there
is an obvious issue that we will unlock the device_hotplug_lock when
fails to takes it.

That issue was introduced in Commit 8df1d0e4a265
("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")

We should drop out in time when fails to take the device_hotplug_lock.
Fixes: 8df1d0e4a265 ("mm/memory_hotplug: make add_memory() take the device_hotplug_lock")
Reported-by: Yang yingliang
Signed-off-by: zhong jiang
Reviewed-by: Oscar Salvador
Reviewed-by: David Hildenbrand
Acked-by: Michal Hocko
Cc: stable
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/memory.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'drivers/base')

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index cb8347500ce2..e49028a60429 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -506,7 +506,7 @@ static ssize_t probe_store(struct device *dev, struct device_attribute *attr,
 
 	ret = lock_device_hotplug_sysfs();
 	if (ret)
-		goto out;
+		return ret;
 
 	nid = memory_add_physaddr_to_nid(phys_addr);
 	ret = __add_memory(nid, phys_addr,
-- 
cgit v1.2.3

From 7067c96ee8d2d77039aeb49670acfe160f484ef9 Mon Sep 17 00:00:00 2001
From: Bartosz Golaszewski
Date: Mon, 1 Apr 2019 10:16:35 +0200
Subject: drivers: fix a typo in the kernel doc for devm_platform_ioremap_resource()

It should have been 'management' not 'managemend'.

Fixes: 7945f929f1a7 ("drivers: provide devm_platform_ioremap_resource()")
Signed-off-by: Bartosz Golaszewski
Reviewed-by: Mukesh Ojha
Reviewed-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/platform.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'drivers/base')

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index dab0a5abc391..09c00d91094c 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -84,7 +84,7 @@ EXPORT_SYMBOL_GPL(platform_get_resource);
  * device
  *
  * @pdev: platform device to use both for memory resource lookup as well as
- *        resource managemend
+ *        resource management
  * @index: resource index
  */
 #ifdef CONFIG_HAS_IOMEM
-- 
cgit v1.2.3

From 25ebcb7dc84db59514b7409ad009d8d67833e091 Mon Sep 17 00:00:00 2001
From: Andy Shevchenko
Date: Thu, 4 Apr 2019 11:11:58 +0300
Subject: driver core: platform: Propagate error from insert_resource()

Since insert_resource() might return an error we don't need to shadow
its error code and would safely propagate to the user.

Signed-off-by: Andy Shevchenko
Reviewed-by: Rafael J. Wysocki
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/platform.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 09c00d91094c..4d1729853d1a 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -438,10 +438,12 @@ int platform_device_add(struct platform_device *pdev)
 				p = &ioport_resource;
 		}
 
-		if (p && insert_resource(p, r)) {
-			dev_err(&pdev->dev, "failed to claim resource %d: %pR\n", i, r);
-			ret = -EBUSY;
-			goto failed;
+		if (p) {
+			ret = insert_resource(p, r);
+			if (ret) {
+				dev_err(&pdev->dev, "failed to claim resource %d: %pR\n", i, r);
+				goto failed;
+			}
 		}
 	}
 
-- 
cgit v1.2.3

From 0b777eee88d712256ba8232a9429edb17c4f9ceb Mon Sep 17 00:00:00 2001
From: John Garry
Date: Thu, 28 Mar 2019 18:08:05 +0800
Subject: driver core: Postpone DMA tear-down until after devres release for probe failure

In commit 376991db4b64 ("driver core: Postpone DMA tear-down until after
devres release"), we changed the
ordering of tearing down the device DMA ops and releasing all the
device's resources; this was because the DMA ops should be maintained
until we release the device's managed DMA memories.

However, we have seen another crash on an arm64 system when a device
driver probe fails:

hisi_sas_v3_hw 0000:74:02.0: Adding to iommu group 2
scsi host1: hisi_sas_v3_hw
BUG: Bad page state in process swapper/0 pfn:313f5
page:ffff7e0000c4fd40 count:1 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0xfffe00000001000(reserved)
raw: 0fffe00000001000 ffff7e0000c4fd48 ffff7e0000c4fd48 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
bad because of flags: 0x1000(reserved)
Modules linked in:
CPU: 49 PID: 1 Comm: swapper/0 Not tainted 5.1.0-rc1-43081-g22d97fd-dirty #1433
Hardware name: Huawei D06/D06, BIOS Hisilicon D06 UEFI RC0 - V1.12.01 01/29/2019
Call trace:
dump_backtrace+0x0/0x118
show_stack+0x14/0x1c
dump_stack+0xa4/0xc8
bad_page+0xe4/0x13c
free_pages_check_bad+0x4c/0xc0
__free_pages_ok+0x30c/0x340
__free_pages+0x30/0x44
__dma_direct_free_pages+0x30/0x38
dma_direct_free+0x24/0x38
dma_free_attrs+0x9c/0xd8
dmam_release+0x20/0x28
release_nodes+0x17c/0x220
devres_release_all+0x34/0x54
really_probe+0xc4/0x2c8
driver_probe_device+0x58/0xfc
device_driver_attach+0x68/0x70
__driver_attach+0x94/0xdc
bus_for_each_dev+0x5c/0xb4
driver_attach+0x20/0x28
bus_add_driver+0x14c/0x200
driver_register+0x6c/0x124
__pci_register_driver+0x48/0x50
sas_v3_pci_driver_init+0x20/0x28
do_one_initcall+0x40/0x25c
kernel_init_freeable+0x2b8/0x3c0
kernel_init+0x10/0x100
ret_from_fork+0x10/0x18
Disabling lock debugging due to kernel taint
BUG: Bad page state in process swapper/0 pfn:313f6
page:ffff7e0000c4fd80 count:1 mapcount:0 mapping:0000000000000000 index:0x0
[ 89.322983] flags: 0xfffe00000001000(reserved)
raw: 0fffe00000001000 ffff7e0000c4fd88 ffff7e0000c4fd88 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000

The crash occurs for the same reason.

In this case, on the really_probe() failure path, we are still clearing
the DMA ops prior to releasing the device's managed memories.

This patch fixes this issue by reordering the DMA ops teardown and the
call to devres_release_all() on the failure path.

Reported-by: Xiang Chen
Tested-by: Xiang Chen
Signed-off-by: John Garry
Reviewed-by: Robin Murphy
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/dd.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index a823f469e53f..0df9b4461766 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -490,7 +490,7 @@ re_probe:
 	if (dev->bus->dma_configure) {
 		ret = dev->bus->dma_configure(dev);
 		if (ret)
-			goto dma_failed;
+			goto probe_failed;
 	}
 
 	if (driver_sysfs_add(dev)) {
@@ -546,14 +546,13 @@ re_probe:
 	goto done;
 
 probe_failed:
-	arch_teardown_dma_ops(dev);
-dma_failed:
 	if (dev->bus)
 		blocking_notifier_call_chain(&dev->bus->p->bus_notifier,
 					     BUS_NOTIFY_DRIVER_NOT_BOUND, dev);
 pinctrl_bind_failed:
 	device_links_no_driver(dev);
 	devres_release_all(dev);
+	arch_teardown_dma_ops(dev);
 	driver_sysfs_remove(dev);
 	dev->driver = NULL;
 	dev_set_drvdata(dev, NULL);
-- 
cgit v1.2.3

From edb16da34b084c66763f29bee42b4e6bb33c3d66 Mon Sep 17 00:00:00 2001
From: Venkata Narendra Kumar Gutta
Date: Mon, 22 Apr 2019 17:16:29 -0700
Subject: driver core: platform: Fix the usage of platform device name(pdev->name)

Platform core is using pdev->name as the platform device name to do the
binding of the devices with the drivers. But, when the platform driver
overrides the platform device name with dev_set_name(), the pdev->name
is pointing to a location which is freed and becomes an invalid
parameter to do the binding match.
use-after-free instance:

[ 33.325013] BUG: KASAN: use-after-free in strcmp+0x8c/0xb0
[ 33.330646] Read of size 1 at addr ffffffc10beae600 by task modprobe
[ 33.339068] CPU: 5 PID: 518 Comm: modprobe Tainted: G S W O 4.19.30+ #3
[ 33.346835] Hardware name: MTP (DT)
[ 33.350419] Call trace:
[ 33.352941] dump_backtrace+0x0/0x3b8
[ 33.356713] show_stack+0x24/0x30
[ 33.360119] dump_stack+0x160/0x1d8
[ 33.363709] print_address_description+0x84/0x2e0
[ 33.368549] kasan_report+0x26c/0x2d0
[ 33.372322] __asan_report_load1_noabort+0x2c/0x38
[ 33.377248] strcmp+0x8c/0xb0
[ 33.380306] platform_match+0x70/0x1f8
[ 33.384168] __driver_attach+0x78/0x3a0
[ 33.388111] bus_for_each_dev+0x13c/0x1b8
[ 33.392237] driver_attach+0x4c/0x58
[ 33.395910] bus_add_driver+0x350/0x560
[ 33.399854] driver_register+0x23c/0x328
[ 33.403886] __platform_driver_register+0xd0/0xe0

So, use dev_name(&pdev->dev), which fetches the platform device name
from the kobject(dev->kobj->name) of the device instead of the
pdev->name.

Signed-off-by: Venkata Narendra Kumar Gutta
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/platform.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 4d1729853d1a..df76e40c1a83 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -890,7 +890,7 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *a,
 	if (len != -ENODEV)
 		return len;
 
-	len = snprintf(buf, PAGE_SIZE, "platform:%s\n", pdev->name);
+	len = snprintf(buf, PAGE_SIZE, "platform:%s\n", dev_name(&pdev->dev));
 
 	return (len >= PAGE_SIZE) ? (PAGE_SIZE - 1) : len;
 }
@@ -966,7 +966,7 @@ static int platform_uevent(struct device *dev, struct kobj_uevent_env *env)
 		return rc;
 
 	add_uevent_var(env, "MODALIAS=%s%s", PLATFORM_MODULE_PREFIX,
-			pdev->name);
+			dev_name(&pdev->dev));
 	return 0;
 }
 
@@ -975,7 +975,7 @@ static const struct platform_device_id *platform_match_id(
 			struct platform_device *pdev)
 {
 	while (id->name[0]) {
-		if (strcmp(pdev->name, id->name) == 0) {
+		if (strcmp(dev_name(&pdev->dev), id->name) == 0) {
 			pdev->id_entry = id;
 			return id;
 		}
@@ -1019,7 +1019,7 @@ static int platform_match(struct device *dev, struct device_driver *drv)
 		return platform_match_id(pdrv->id_table, pdev) != NULL;
 
 	/* fall-back to driver name match */
-	return (strcmp(pdev->name, drv->name) == 0);
+	return (strcmp(dev_name(&pdev->dev), drv->name) == 0);
 }
 
 #ifdef CONFIG_PM_SLEEP
-- 
cgit v1.2.3

From 391c0325cc5f9e2daf9117825714d777b3595a42 Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman
Date: Mon, 29 Apr 2019 19:49:21 +0200
Subject: Revert "driver core: platform: Fix the usage of platform device name(pdev->name)"

This reverts commit edb16da34b084c66763f29bee42b4e6bb33c3d66 as it
breaks existing systems as reported by Krzysztof.

Reported-by: Krzysztof Kozlowski
Cc: Venkata Narendra Kumar Gutta
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/platform.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index df76e40c1a83..4d1729853d1a 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -890,7 +890,7 @@ static ssize_t modalias_show(struct device *dev, struct device_attribute *a,
 	if (len != -ENODEV)
 		return len;
 
-	len = snprintf(buf, PAGE_SIZE, "platform:%s\n", dev_name(&pdev->dev));
+	len = snprintf(buf, PAGE_SIZE, "platform:%s\n", pdev->name);
 
 	return (len >= PAGE_SIZE) ? (PAGE_SIZE - 1) : len;
 }
@@ -966,7 +966,7 @@ static int platform_uevent(struct device *dev, struct kobj_uevent_env *env)
 		return rc;
 
 	add_uevent_var(env, "MODALIAS=%s%s", PLATFORM_MODULE_PREFIX,
-			dev_name(&pdev->dev));
+			pdev->name);
 	return 0;
 }
 
@@ -975,7 +975,7 @@ static const struct platform_device_id *platform_match_id(
 			struct platform_device *pdev)
 {
 	while (id->name[0]) {
-		if (strcmp(dev_name(&pdev->dev), id->name) == 0) {
+		if (strcmp(pdev->name, id->name) == 0) {
 			pdev->id_entry = id;
 			return id;
 		}
@@ -1019,7 +1019,7 @@ static int platform_match(struct device *dev, struct device_driver *drv)
 		return platform_match_id(pdrv->id_table, pdev) != NULL;
 
 	/* fall-back to driver name match */
-	return (strcmp(dev_name(&pdev->dev), drv->name) == 0);
+	return (strcmp(pdev->name, drv->name) == 0);
 }
 
 #ifdef CONFIG_PM_SLEEP
-- 
cgit v1.2.3

From bbabc3fb2b6344577ee1a43d28355bf788e9b4a2 Mon Sep 17 00:00:00 2001
From: Jonathan Neuschäfer
Date: Tue, 30 Apr 2019 16:56:10 +0200
Subject: firmware_loader: Fix a typo ("syfs" -> "sysfs")
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

"sysfs" was misspelled in a comment and a log message.

Signed-off-by: Jonathan Neuschäfer
Reviewed-by: Mukesh Ojha
Signed-off-by: Greg Kroah-Hartman
---
 drivers/base/firmware_loader/fallback.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

(limited to 'drivers/base')

diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c
index b5c865fe263b..f962488546b6 100644
--- a/drivers/base/firmware_loader/fallback.c
+++ b/drivers/base/firmware_loader/fallback.c
@@ -674,8 +674,8 @@ static bool fw_run_sysfs_fallback(enum fw_opt opt_flags)
  *
  * This function is called if direct lookup for the firmware failed, it enables
  * a fallback mechanism through userspace by exposing a sysfs loading
- * interface. Userspace is in charge of loading the firmware through the syfs
- * loading interface. This syfs fallback mechanism may be disabled completely
+ * interface. Userspace is in charge of loading the firmware through the sysfs
+ * loading interface. This sysfs fallback mechanism may be disabled completely
  * on a system by setting the proc sysctl value ignore_sysfs_fallback to true.
  * If this false we check if the internal API caller set the @FW_OPT_NOFALLBACK
  * flag, if so it would also disable the fallback mechanism. A system may want
@@ -693,7 +693,7 @@ int firmware_fallback_sysfs(struct firmware *fw, const char *name,
 		return ret;
 
 	if (!(opt_flags & FW_OPT_NO_WARN))
-		dev_warn(device, "Falling back to syfs fallback for: %s\n",
+		dev_warn(device, "Falling back to sysfs fallback for: %s\n",
 				 name);
 	else
 		dev_dbg(device, "Falling back to sysfs fallback for: %s\n",
-- 
cgit v1.2.3
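
Editorial illustration, not part of any patch in this series: the probe_store() change in commit d2ab99403ee0 (return directly instead of jumping to the unlock label when the lock was never taken) can be modelled in user space roughly as follows. This is a minimal sketch under stated assumptions: struct toy_lock, toy_lock_sysfs(), toy_unlock(), probe_store_before() and probe_store_after() are hypothetical names invented here, and -EAGAIN is a placeholder error code, not what the kernel helper returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for device_hotplug_lock. */
struct toy_lock {
	bool held;		/* true while the lock is owned */
	bool fail_next;		/* force the next lock attempt to fail */
	int bad_unlocks;	/* unlocks issued while the lock was not held */
};

/* Models a lock helper that returns 0 on success, negative errno on failure. */
int toy_lock_sysfs(struct toy_lock *l)
{
	if (l->fail_next) {
		l->fail_next = false;
		return -EAGAIN;	/* placeholder error code for this sketch */
	}
	l->held = true;
	return 0;
}

/* Models the unlock helper and records unbalanced calls. */
void toy_unlock(struct toy_lock *l)
{
	if (!l->held)
		l->bad_unlocks++;
	l->held = false;
}

/* Pre-fix shape: a failed lock attempt still reaches the unlock at "out". */
int probe_store_before(struct toy_lock *l)
{
	int ret = toy_lock_sysfs(l);

	if (ret)
		goto out;
	/* ... work that needs the lock would go here ... */
out:
	toy_unlock(l);
	return ret;
}

/* Post-fix shape: return immediately when the lock was never taken. */
int probe_store_after(struct toy_lock *l)
{
	int ret = toy_lock_sysfs(l);

	if (ret)
		return ret;
	/* ... work that needs the lock would go here ... */
	toy_unlock(l);
	return ret;
}
```

Calling probe_store_before() on a lock with fail_next set bumps bad_unlocks, while probe_store_after() leaves it at zero; that unbalanced unlock is the behaviour the one-line change removes.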