diff options
40 files changed, 1261 insertions, 772 deletions
diff --git a/Documentation/driver-api/index.rst b/Documentation/driver-api/index.rst index d26308af6036..0dbaa987aa11 100644 --- a/Documentation/driver-api/index.rst +++ b/Documentation/driver-api/index.rst @@ -42,6 +42,7 @@ available subsections can be seen below. target mtdnand miscellaneous + mei/index w1 rapidio s390-drivers diff --git a/Documentation/driver-api/mei/hdcp.rst b/Documentation/driver-api/mei/hdcp.rst new file mode 100644 index 000000000000..e85a065b1cdc --- /dev/null +++ b/Documentation/driver-api/mei/hdcp.rst @@ -0,0 +1,32 @@ +.. SPDX-License-Identifier: GPL-2.0 + +HDCP: +===== + +ME FW as a security engine provides the capability for setting up +HDCP2.2 protocol negotiation between the Intel graphics device and +an HDC2.2 sink. + +ME FW prepares HDCP2.2 negotiation parameters, signs and encrypts them +according the HDCP 2.2 spec. The Intel graphics sends the created blob +to the HDCP2.2 sink. + +Similarly, the HDCP2.2 sink's response is transferred to ME FW +for decryption and verification. + +Once all the steps of HDCP2.2 negotiation are completed, +upon request ME FW will configure the port as authenticated and supply +the HDCP encryption keys to Intel graphics hardware. + + +mei_hdcp driver +--------------- +.. kernel-doc:: drivers/misc/mei/hdcp/mei_hdcp.c + :doc: MEI_HDCP Client Driver + +mei_hdcp api +------------ + +.. kernel-doc:: drivers/misc/mei/hdcp/mei_hdcp.c + :functions: + diff --git a/Documentation/driver-api/mei/iamt.rst b/Documentation/driver-api/mei/iamt.rst new file mode 100644 index 000000000000..6ef3e613684b --- /dev/null +++ b/Documentation/driver-api/mei/iamt.rst @@ -0,0 +1,101 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Intel(R) Active Management Technology (Intel AMT) +================================================= + +Prominent usage of the Intel ME Interface is to communicate with Intel(R) +Active Management Technology (Intel AMT) implemented in firmware running on +the Intel ME. + +Intel AMT provides the ability to manage a host remotely out-of-band (OOB) +even when the operating system running on the host processor has crashed or +is in a sleep state. + +Some examples of Intel AMT usage are: + - Monitoring hardware state and platform components + - Remote power off/on (useful for green computing or overnight IT + maintenance) + - OS updates + - Storage of useful platform information such as software assets + - Built-in hardware KVM + - Selective network isolation of Ethernet and IP protocol flows based + on policies set by a remote management console + - IDE device redirection from remote management console + +Intel AMT (OOB) communication is based on SOAP (deprecated +starting with Release 6.0) over HTTP/S or WS-Management protocol over +HTTP/S that are received from a remote management console application. + +For more information about Intel AMT: +https://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide/default.htm + + +Intel AMT Applications +---------------------- + + 1) Intel Local Management Service (Intel LMS) + + Applications running locally on the platform communicate with Intel AMT Release + 2.0 and later releases in the same way that network applications do via SOAP + over HTTP (deprecated starting with Release 6.0) or with WS-Management over + SOAP over HTTP. This means that some Intel AMT features can be accessed from a + local application using the same network interface as a remote application + communicating with Intel AMT over the network. + + When a local application sends a message addressed to the local Intel AMT host + name, the Intel LMS, which listens for traffic directed to the host name, + intercepts the message and routes it to the Intel MEI. + For more information: + https://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide/default.htm + Under "About Intel AMT" => "Local Access" + + For downloading Intel LMS: + https://github.com/intel/lms + + The Intel LMS opens a connection using the Intel MEI driver to the Intel LMS + firmware feature using a defined GUID and then communicates with the feature + using a protocol called Intel AMT Port Forwarding Protocol (Intel APF protocol). + The protocol is used to maintain multiple sessions with Intel AMT from a + single application. + + See the protocol specification in the Intel AMT Software Development Kit (SDK) + https://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide/default.htm + Under "SDK Resources" => "Intel(R) vPro(TM) Gateway (MPS)" + => "Information for Intel(R) vPro(TM) Gateway Developers" + => "Description of the Intel AMT Port Forwarding (APF) Protocol" + + 2) Intel AMT Remote configuration using a Local Agent + + A Local Agent enables IT personnel to configure Intel AMT out-of-the-box + without requiring installing additional data to enable setup. The remote + configuration process may involve an ISV-developed remote configuration + agent that runs on the host. + For more information: + https://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide/default.htm + Under "Setup and Configuration of Intel AMT" => + "SDK Tools Supporting Setup and Configuration" => + "Using the Local Agent Sample" + +Intel AMT OS Health Watchdog +---------------------------- + +The Intel AMT Watchdog is an OS Health (Hang/Crash) watchdog. +Whenever the OS hangs or crashes, Intel AMT will send an event +to any subscriber to this event. This mechanism means that +IT knows when a platform crashes even when there is a hard failure on the host. + +The Intel AMT Watchdog is composed of two parts: + 1) Firmware feature - receives the heartbeats + and sends an event when the heartbeats stop. + 2) Intel MEI iAMT watchdog driver - connects to the watchdog feature, + configures the watchdog and sends the heartbeats. + +The Intel iAMT watchdog MEI driver uses the kernel watchdog API to configure +the Intel AMT Watchdog and to send heartbeats to it. The default timeout of the +watchdog is 120 seconds. + +If the Intel AMT is not enabled in the firmware then the watchdog client won't enumerate +on the me client bus and watchdog devices won't be exposed. + +--- +linux-mei@linux.intel.com diff --git a/Documentation/driver-api/mei/index.rst b/Documentation/driver-api/mei/index.rst new file mode 100644 index 000000000000..3a22b522ee78 --- /dev/null +++ b/Documentation/driver-api/mei/index.rst @@ -0,0 +1,23 @@ +.. SPDX-License-Identifier: GPL-2.0 + +.. include:: <isonum.txt> + +=================================================== +Intel(R) Management Engine Interface (Intel(R) MEI) +=================================================== + +**Copyright** |copy| 2019 Intel Corporation + + +.. only:: html + + .. class:: toc-title + + Table of Contents + +.. toctree:: + :maxdepth: 3 + + mei + mei-client-bus + iamt diff --git a/Documentation/driver-api/mei/mei-client-bus.rst b/Documentation/driver-api/mei/mei-client-bus.rst new file mode 100644 index 000000000000..f242b3f8d6aa --- /dev/null +++ b/Documentation/driver-api/mei/mei-client-bus.rst @@ -0,0 +1,168 @@ +.. SPDX-License-Identifier: GPL-2.0 + +============================================== +Intel(R) Management Engine (ME) Client bus API +============================================== + + +Rationale +========= + +The MEI character device is useful for dedicated applications to send and receive +data to the many FW appliance found in Intel's ME from the user space. +However, for some of the ME functionalities it makes sense to leverage existing software +stack and expose them through existing kernel subsystems. + +In order to plug seamlessly into the kernel device driver model we add kernel virtual +bus abstraction on top of the MEI driver. This allows implementing Linux kernel drivers +for the various MEI features as a stand alone entities found in their respective subsystem. +Existing device drivers can even potentially be re-used by adding an MEI CL bus layer to +the existing code. + + +MEI CL bus API +============== + +A driver implementation for an MEI Client is very similar to any other existing bus +based device drivers. The driver registers itself as an MEI CL bus driver through +the ``struct mei_cl_driver`` structure defined in :file:`include/linux/mei_cl_bus.c` + +.. code-block:: C + + struct mei_cl_driver { + struct device_driver driver; + const char *name; + + const struct mei_cl_device_id *id_table; + + int (*probe)(struct mei_cl_device *dev, const struct mei_cl_id *id); + int (*remove)(struct mei_cl_device *dev); + }; + + + +The mei_cl_device_id structure defined in :file:`include/linux/mod_devicetable.h` allows a +driver to bind itself against a device name. + +.. code-block:: C + + struct mei_cl_device_id { + char name[MEI_CL_NAME_SIZE]; + uuid_le uuid; + __u8 version; + kernel_ulong_t driver_info; + }; + +To actually register a driver on the ME Client bus one must call the :c:func:`mei_cl_add_driver` +API. This is typically called at module initialization time. + +Once the driver is registered and bound to the device, a driver will typically +try to do some I/O on this bus and this should be done through the :c:func:`mei_cl_send` +and :c:func:`mei_cl_recv` functions. More detailed information is in :ref:`api` section. + +In order for a driver to be notified about pending traffic or event, the driver +should register a callback via :c:func:`mei_cl_devev_register_rx_cb` and +:c:func:`mei_cldev_register_notify_cb` function respectively. + +.. _api: + +API: +---- +.. kernel-doc:: drivers/misc/mei/bus.c + :export: drivers/misc/mei/bus.c + + + +Example +======= + +As a theoretical example let's pretend the ME comes with a "contact" NFC IP. +The driver init and exit routines for this device would look like: + +.. code-block:: C + + #define CONTACT_DRIVER_NAME "contact" + + static struct mei_cl_device_id contact_mei_cl_tbl[] = { + { CONTACT_DRIVER_NAME, }, + + /* required last entry */ + { } + }; + MODULE_DEVICE_TABLE(mei_cl, contact_mei_cl_tbl); + + static struct mei_cl_driver contact_driver = { + .id_table = contact_mei_tbl, + .name = CONTACT_DRIVER_NAME, + + .probe = contact_probe, + .remove = contact_remove, + }; + + static int contact_init(void) + { + int r; + + r = mei_cl_driver_register(&contact_driver); + if (r) { + pr_err(CONTACT_DRIVER_NAME ": driver registration failed\n"); + return r; + } + + return 0; + } + + static void __exit contact_exit(void) + { + mei_cl_driver_unregister(&contact_driver); + } + + module_init(contact_init); + module_exit(contact_exit); + +And the driver's simplified probe routine would look like that: + +.. code-block:: C + + int contact_probe(struct mei_cl_device *dev, struct mei_cl_device_id *id) + { + [...] + mei_cldev_enable(dev); + + mei_cldev_register_rx_cb(dev, contact_rx_cb); + + return 0; + } + +In the probe routine the driver first enable the MEI device and then registers +an rx handler which is as close as it can get to registering a threaded IRQ handler. +The handler implementation will typically call :c:func:`mei_cldev_recv` and then +process received data. + +.. code-block:: C + + #define MAX_PAYLOAD 128 + #define HDR_SIZE 4 + static void conntact_rx_cb(struct mei_cl_device *cldev) + { + struct contact *c = mei_cldev_get_drvdata(cldev); + unsigned char payload[MAX_PAYLOAD]; + ssize_t payload_sz; + + payload_sz = mei_cldev_recv(cldev, payload, MAX_PAYLOAD) + if (reply_size < HDR_SIZE) { + return; + } + + c->process_rx(payload); + + } + +MEI Client Bus Drivers +====================== + +.. toctree:: + :maxdepth: 2 + + hdcp + nfc diff --git a/Documentation/driver-api/mei/mei.rst b/Documentation/driver-api/mei/mei.rst new file mode 100644 index 000000000000..c800d8e5f422 --- /dev/null +++ b/Documentation/driver-api/mei/mei.rst @@ -0,0 +1,176 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Introduction +============ + +The Intel Management Engine (Intel ME) is an isolated and protected computing +resource (Co-processor) residing inside certain Intel chipsets. The Intel ME +provides support for computer/IT management and security features. +The actual feature set depends on the Intel chipset SKU. + +The Intel Management Engine Interface (Intel MEI, previously known as HECI) +is the interface between the Host and Intel ME. This interface is exposed +to the host as a PCI device, actually multiple PCI devices might be exposed. +The Intel MEI Driver is in charge of the communication channel between +a host application and the Intel ME features. + +Each Intel ME feature, or Intel ME Client is addressed by a unique GUID and +each client has its own protocol. The protocol is message-based with a +header and payload up to maximal number of bytes advertised by the client, +upon connection. + +Intel MEI Driver +================ + +The driver exposes a character device with device nodes /dev/meiX. + +An application maintains communication with an Intel ME feature while +/dev/meiX is open. The binding to a specific feature is performed by calling +:c:macro:`MEI_CONNECT_CLIENT_IOCTL`, which passes the desired GUID. +The number of instances of an Intel ME feature that can be opened +at the same time depends on the Intel ME feature, but most of the +features allow only a single instance. + +The driver is transparent to data that are passed between firmware feature +and host application. + +Because some of the Intel ME features can change the system +configuration, the driver by default allows only a privileged +user to access it. + +The session is terminated calling :c:func:`close(int fd)`. + +A code snippet for an application communicating with Intel AMTHI client: + +.. code-block:: C + + struct mei_connect_client_data data; + fd = open(MEI_DEVICE); + + data.d.in_client_uuid = AMTHI_GUID; + + ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &data); + + printf("Ver=%d, MaxLen=%ld\n", + data.d.in_client_uuid.protocol_version, + data.d.in_client_uuid.max_msg_length); + + [...] + + write(fd, amthi_req_data, amthi_req_data_len); + + [...] + + read(fd, &amthi_res_data, amthi_res_data_len); + + [...] + close(fd); + + +User space API + +IOCTLs: +======= + +The Intel MEI Driver supports the following IOCTL commands: + +IOCTL_MEI_CONNECT_CLIENT +------------------------- +Connect to firmware Feature/Client. + +.. code-block:: none + + Usage: + + struct mei_connect_client_data client_data; + + ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &client_data); + + Inputs: + + struct mei_connect_client_data - contain the following + Input field: + + in_client_uuid - GUID of the FW Feature that needs + to connect to. + Outputs: + out_client_properties - Client Properties: MTU and Protocol Version. + + Error returns: + + ENOTTY No such client (i.e. wrong GUID) or connection is not allowed. + EINVAL Wrong IOCTL Number + ENODEV Device or Connection is not initialized or ready. + ENOMEM Unable to allocate memory to client internal data. + EFAULT Fatal Error (e.g. Unable to access user input data) + EBUSY Connection Already Open + +:Note: + max_msg_length (MTU) in client properties describes the maximum + data that can be sent or received. (e.g. if MTU=2K, can send + requests up to bytes 2k and received responses up to 2k bytes). + + +IOCTL_MEI_NOTIFY_SET +--------------------- +Enable or disable event notifications. + + +.. code-block:: none + + Usage: + + uint32_t enable; + + ioctl(fd, IOCTL_MEI_NOTIFY_SET, &enable); + + + uint32_t enable = 1; + or + uint32_t enable[disable] = 0; + + Error returns: + + + EINVAL Wrong IOCTL Number + ENODEV Device is not initialized or the client not connected + ENOMEM Unable to allocate memory to client internal data. + EFAULT Fatal Error (e.g. Unable to access user input data) + EOPNOTSUPP if the device doesn't support the feature + +:Note: + The client must be connected in order to enable notification events + + +IOCTL_MEI_NOTIFY_GET +-------------------- +Retrieve event + +.. code-block:: none + + Usage: + uint32_t event; + ioctl(fd, IOCTL_MEI_NOTIFY_GET, &event); + + Outputs: + 1 - if an event is pending + 0 - if there is no even pending + + Error returns: + EINVAL Wrong IOCTL Number + ENODEV Device is not initialized or the client not connected + ENOMEM Unable to allocate memory to client internal data. + EFAULT Fatal Error (e.g. Unable to access user input data) + EOPNOTSUPP if the device doesn't support the feature + +:Note: + The client must be connected and event notification has to be enabled + in order to receive an event + + + +Supported Chipsets +================== +82X38/X48 Express and newer + +linux-mei@linux.intel.com diff --git a/Documentation/driver-api/mei/nfc.rst b/Documentation/driver-api/mei/nfc.rst new file mode 100644 index 000000000000..b5b6fc96f85e --- /dev/null +++ b/Documentation/driver-api/mei/nfc.rst @@ -0,0 +1,28 @@ +.. SPDX-License-Identifier: GPL-2.0 + +MEI NFC +------- + +Some Intel 8 and 9 Serieses chipsets supports NFC devices connected behind +the Intel Management Engine controller. +MEI client bus exposes the NFC chips as NFC phy devices and enables +binding with Microread and NXP PN544 NFC device driver from the Linux NFC +subsystem. + +.. kernel-render:: DOT + :alt: MEI NFC digraph + :caption: **MEI NFC** Stack + + digraph NFC { + cl_nfc -> me_cl_nfc; + "drivers/nfc/mei_phy" -> cl_nfc [lhead=bus]; + "drivers/nfc/microread/mei" -> cl_nfc; + "drivers/nfc/microread/mei" -> "drivers/nfc/mei_phy"; + "drivers/nfc/pn544/mei" -> cl_nfc; + "drivers/nfc/pn544/mei" -> "drivers/nfc/mei_phy"; + "net/nfc" -> "drivers/nfc/microread/mei"; + "net/nfc" -> "drivers/nfc/pn544/mei"; + "neard" -> "net/nfc"; + cl_nfc [label="mei/bus(nfc)"]; + me_cl_nfc [label="me fw (nfc)"]; + } diff --git a/Documentation/misc-devices/mei/mei-client-bus.txt b/Documentation/misc-devices/mei/mei-client-bus.txt deleted file mode 100644 index 743be4ec8989..000000000000 --- a/Documentation/misc-devices/mei/mei-client-bus.txt +++ /dev/null @@ -1,141 +0,0 @@ -Intel(R) Management Engine (ME) Client bus API -============================================== - - -Rationale -========= - -MEI misc character device is useful for dedicated applications to send and receive -data to the many FW appliance found in Intel's ME from the user space. -However for some of the ME functionalities it make sense to leverage existing software -stack and expose them through existing kernel subsystems. - -In order to plug seamlessly into the kernel device driver model we add kernel virtual -bus abstraction on top of the MEI driver. This allows implementing linux kernel drivers -for the various MEI features as a stand alone entities found in their respective subsystem. -Existing device drivers can even potentially be re-used by adding an MEI CL bus layer to -the existing code. - - -MEI CL bus API -============== - -A driver implementation for an MEI Client is very similar to existing bus -based device drivers. The driver registers itself as an MEI CL bus driver through -the mei_cl_driver structure: - -struct mei_cl_driver { - struct device_driver driver; - const char *name; - - const struct mei_cl_device_id *id_table; - - int (*probe)(struct mei_cl_device *dev, const struct mei_cl_id *id); - int (*remove)(struct mei_cl_device *dev); -}; - -struct mei_cl_id { - char name[MEI_NAME_SIZE]; - kernel_ulong_t driver_info; -}; - -The mei_cl_id structure allows the driver to bind itself against a device name. - -To actually register a driver on the ME Client bus one must call the mei_cl_add_driver() -API. This is typically called at module init time. - -Once registered on the ME Client bus, a driver will typically try to do some I/O on -this bus and this should be done through the mei_cl_send() and mei_cl_recv() -routines. The latter is synchronous (blocks and sleeps until data shows up). -In order for drivers to be notified of pending events waiting for them (e.g. -an Rx event) they can register an event handler through the -mei_cl_register_event_cb() routine. Currently only the MEI_EVENT_RX event -will trigger an event handler call and the driver implementation is supposed -to call mei_recv() from the event handler in order to fetch the pending -received buffers. - - -Example -======= - -As a theoretical example let's pretend the ME comes with a "contact" NFC IP. -The driver init and exit routines for this device would look like: - -#define CONTACT_DRIVER_NAME "contact" - -static struct mei_cl_device_id contact_mei_cl_tbl[] = { - { CONTACT_DRIVER_NAME, }, - - /* required last entry */ - { } -}; -MODULE_DEVICE_TABLE(mei_cl, contact_mei_cl_tbl); - -static struct mei_cl_driver contact_driver = { - .id_table = contact_mei_tbl, - .name = CONTACT_DRIVER_NAME, - - .probe = contact_probe, - .remove = contact_remove, -}; - -static int contact_init(void) -{ - int r; - - r = mei_cl_driver_register(&contact_driver); - if (r) { - pr_err(CONTACT_DRIVER_NAME ": driver registration failed\n"); - return r; - } - - return 0; -} - -static void __exit contact_exit(void) -{ - mei_cl_driver_unregister(&contact_driver); -} - -module_init(contact_init); -module_exit(contact_exit); - -And the driver's simplified probe routine would look like that: - -int contact_probe(struct mei_cl_device *dev, struct mei_cl_device_id *id) -{ - struct contact_driver *contact; - - [...] - mei_cl_enable_device(dev); - - mei_cl_register_event_cb(dev, contact_event_cb, contact); - - return 0; -} - -In the probe routine the driver first enable the MEI device and then registers -an ME bus event handler which is as close as it can get to registering a -threaded IRQ handler. -The handler implementation will typically call some I/O routine depending on -the pending events: - -#define MAX_NFC_PAYLOAD 128 - -static void contact_event_cb(struct mei_cl_device *dev, u32 events, - void *context) -{ - struct contact_driver *contact = context; - - if (events & BIT(MEI_EVENT_RX)) { - u8 payload[MAX_NFC_PAYLOAD]; - int payload_size; - - payload_size = mei_recv(dev, payload, MAX_NFC_PAYLOAD); - if (payload_size <= 0) - return; - - /* Hook to the NFC subsystem */ - nfc_hci_recv_frame(contact->hdev, payload, payload_size); - } -} diff --git a/Documentation/misc-devices/mei/mei.txt b/Documentation/misc-devices/mei/mei.txt deleted file mode 100644 index 2b80a0cd621f..000000000000 --- a/Documentation/misc-devices/mei/mei.txt +++ /dev/null @@ -1,266 +0,0 @@ -Intel(R) Management Engine Interface (Intel(R) MEI) -=================================================== - -Introduction -============ - -The Intel Management Engine (Intel ME) is an isolated and protected computing -resource (Co-processor) residing inside certain Intel chipsets. The Intel ME -provides support for computer/IT management features. The feature set -depends on the Intel chipset SKU. - -The Intel Management Engine Interface (Intel MEI, previously known as HECI) -is the interface between the Host and Intel ME. This interface is exposed -to the host as a PCI device. The Intel MEI Driver is in charge of the -communication channel between a host application and the Intel ME feature. - -Each Intel ME feature (Intel ME Client) is addressed by a GUID/UUID and -each client has its own protocol. The protocol is message-based with a -header and payload up to 512 bytes. - -Prominent usage of the Intel ME Interface is to communicate with Intel(R) -Active Management Technology (Intel AMT) implemented in firmware running on -the Intel ME. - -Intel AMT provides the ability to manage a host remotely out-of-band (OOB) -even when the operating system running on the host processor has crashed or -is in a sleep state. - -Some examples of Intel AMT usage are: - - Monitoring hardware state and platform components - - Remote power off/on (useful for green computing or overnight IT - maintenance) - - OS updates - - Storage of useful platform information such as software assets - - Built-in hardware KVM - - Selective network isolation of Ethernet and IP protocol flows based - on policies set by a remote management console - - IDE device redirection from remote management console - -Intel AMT (OOB) communication is based on SOAP (deprecated -starting with Release 6.0) over HTTP/S or WS-Management protocol over -HTTP/S that are received from a remote management console application. - -For more information about Intel AMT: -http://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide - - -Intel MEI Driver -================ - -The driver exposes a misc device called /dev/mei. - -An application maintains communication with an Intel ME feature while -/dev/mei is open. The binding to a specific feature is performed by calling -MEI_CONNECT_CLIENT_IOCTL, which passes the desired UUID. -The number of instances of an Intel ME feature that can be opened -at the same time depends on the Intel ME feature, but most of the -features allow only a single instance. - -The Intel AMT Host Interface (Intel AMTHI) feature supports multiple -simultaneous user connected applications. The Intel MEI driver -handles this internally by maintaining request queues for the applications. - -The driver is transparent to data that are passed between firmware feature -and host application. - -Because some of the Intel ME features can change the system -configuration, the driver by default allows only a privileged -user to access it. - -A code snippet for an application communicating with Intel AMTHI client: - - struct mei_connect_client_data data; - fd = open(MEI_DEVICE); - - data.d.in_client_uuid = AMTHI_UUID; - - ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &data); - - printf("Ver=%d, MaxLen=%ld\n", - data.d.in_client_uuid.protocol_version, - data.d.in_client_uuid.max_msg_length); - - [...] - - write(fd, amthi_req_data, amthi_req_data_len); - - [...] - - read(fd, &amthi_res_data, amthi_res_data_len); - - [...] - close(fd); - - -IOCTL -===== - -The Intel MEI Driver supports the following IOCTL commands: - IOCTL_MEI_CONNECT_CLIENT Connect to firmware Feature (client). - - usage: - struct mei_connect_client_data clientData; - ioctl(fd, IOCTL_MEI_CONNECT_CLIENT, &clientData); - - inputs: - mei_connect_client_data struct contain the following - input field: - - in_client_uuid - UUID of the FW Feature that needs - to connect to. - outputs: - out_client_properties - Client Properties: MTU and Protocol Version. - - error returns: - EINVAL Wrong IOCTL Number - ENODEV Device or Connection is not initialized or ready. - (e.g. Wrong UUID) - ENOMEM Unable to allocate memory to client internal data. - EFAULT Fatal Error (e.g. Unable to access user input data) - EBUSY Connection Already Open - - Notes: - max_msg_length (MTU) in client properties describes the maximum - data that can be sent or received. (e.g. if MTU=2K, can send - requests up to bytes 2k and received responses up to 2k bytes). - - IOCTL_MEI_NOTIFY_SET: enable or disable event notifications - - Usage: - uint32_t enable; - ioctl(fd, IOCTL_MEI_NOTIFY_SET, &enable); - - Inputs: - uint32_t enable = 1; - or - uint32_t enable[disable] = 0; - - Error returns: - EINVAL Wrong IOCTL Number - ENODEV Device is not initialized or the client not connected - ENOMEM Unable to allocate memory to client internal data. - EFAULT Fatal Error (e.g. Unable to access user input data) - EOPNOTSUPP if the device doesn't support the feature - - Notes: - The client must be connected in order to enable notification events - - - IOCTL_MEI_NOTIFY_GET : retrieve event - - Usage: - uint32_t event; - ioctl(fd, IOCTL_MEI_NOTIFY_GET, &event); - - Outputs: - 1 - if an event is pending - 0 - if there is no even pending - - Error returns: - EINVAL Wrong IOCTL Number - ENODEV Device is not initialized or the client not connected - ENOMEM Unable to allocate memory to client internal data. - EFAULT Fatal Error (e.g. Unable to access user input data) - EOPNOTSUPP if the device doesn't support the feature - - Notes: - The client must be connected and event notification has to be enabled - in order to receive an event - - -Intel ME Applications -===================== - - 1) Intel Local Management Service (Intel LMS) - - Applications running locally on the platform communicate with Intel AMT Release - 2.0 and later releases in the same way that network applications do via SOAP - over HTTP (deprecated starting with Release 6.0) or with WS-Management over - SOAP over HTTP. This means that some Intel AMT features can be accessed from a - local application using the same network interface as a remote application - communicating with Intel AMT over the network. - - When a local application sends a message addressed to the local Intel AMT host - name, the Intel LMS, which listens for traffic directed to the host name, - intercepts the message and routes it to the Intel MEI. - For more information: - http://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide - Under "About Intel AMT" => "Local Access" - - For downloading Intel LMS: - http://software.intel.com/en-us/articles/download-the-latest-intel-amt-open-source-drivers/ - - The Intel LMS opens a connection using the Intel MEI driver to the Intel LMS - firmware feature using a defined UUID and then communicates with the feature - using a protocol called Intel AMT Port Forwarding Protocol (Intel APF protocol). - The protocol is used to maintain multiple sessions with Intel AMT from a - single application. - - See the protocol specification in the Intel AMT Software Development Kit (SDK) - http://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide - Under "SDK Resources" => "Intel(R) vPro(TM) Gateway (MPS)" - => "Information for Intel(R) vPro(TM) Gateway Developers" - => "Description of the Intel AMT Port Forwarding (APF) Protocol" - - 2) Intel AMT Remote configuration using a Local Agent - - A Local Agent enables IT personnel to configure Intel AMT out-of-the-box - without requiring installing additional data to enable setup. The remote - configuration process may involve an ISV-developed remote configuration - agent that runs on the host. - For more information: - http://software.intel.com/sites/manageability/AMT_Implementation_and_Reference_Guide - Under "Setup and Configuration of Intel AMT" => - "SDK Tools Supporting Setup and Configuration" => - "Using the Local Agent Sample" - - An open source Intel AMT configuration utility, implementing a local agent - that accesses the Intel MEI driver, can be found here: - http://software.intel.com/en-us/articles/download-the-latest-intel-amt-open-source-drivers/ - - -Intel AMT OS Health Watchdog -============================ - -The Intel AMT Watchdog is an OS Health (Hang/Crash) watchdog. -Whenever the OS hangs or crashes, Intel AMT will send an event -to any subscriber to this event. This mechanism means that -IT knows when a platform crashes even when there is a hard failure on the host. - -The Intel AMT Watchdog is composed of two parts: - 1) Firmware feature - receives the heartbeats - and sends an event when the heartbeats stop. - 2) Intel MEI iAMT watchdog driver - connects to the watchdog feature, - configures the watchdog and sends the heartbeats. - -The Intel iAMT watchdog MEI driver uses the kernel watchdog API to configure -the Intel AMT Watchdog and to send heartbeats to it. The default timeout of the -watchdog is 120 seconds. - -If the Intel AMT is not enabled in the firmware then the watchdog client won't enumerate -on the me client bus and watchdog devices won't be exposed. - - -Supported Chipsets -================== - -7 Series Chipset Family -6 Series Chipset Family -5 Series Chipset Family -4 Series Chipset Family -Mobile 4 Series Chipset Family -ICH9 -82946GZ/GL -82G35 Express -82Q963/Q965 -82P965/G965 -Mobile PM965/GM965 -Mobile GME965/GLE960 -82Q35 Express -82G33/G31/P35/P31 Express -82Q33 Express -82X38/X48 Express - ---- -linux-mei@linux.intel.com diff --git a/MAINTAINERS b/MAINTAINERS index 57f496cff999..a8faac7b0778 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -8034,7 +8034,7 @@ F: include/uapi/linux/mei.h F: include/linux/mei_cl_bus.h F: drivers/misc/mei/* F: drivers/watchdog/mei_wdt.c -F: Documentation/misc-devices/mei/* +F: Documentation/driver-api/mei/* F: samples/mei/* INTEL MENLOW THERMAL DRIVER diff --git a/drivers/char/bsr.c b/drivers/char/bsr.c index 167e7370d43a..e5e5333f302d 100644 --- a/drivers/char/bsr.c +++ b/drivers/char/bsr.c @@ -134,7 +134,7 @@ static int bsr_mmap(struct file *filp, struct vm_area_struct *vma) return 0; } -static int bsr_open(struct inode * inode, struct file * filp) +static int bsr_open(struct inode *inode, struct file *filp) { struct cdev *cdev = inode->i_cdev; struct bsr_dev *dev = container_of(cdev, struct bsr_dev, bsr_cdev); @@ -309,7 +309,8 @@ static int __init bsr_init(void) goto out_err_2; } - if ((ret = bsr_create_devs(np)) < 0) { + ret = bsr_create_devs(np); + if (ret < 0) { np = NULL; goto out_err_3; } diff --git a/drivers/char/misc.c b/drivers/char/misc.c index 53cfe574d8d4..f6a147427029 100644 --- a/drivers/char/misc.c +++ b/drivers/char/misc.c @@ -226,6 +226,7 @@ int misc_register(struct miscdevice *misc) mutex_unlock(&misc_mtx); return err; } +EXPORT_SYMBOL(misc_register); /** * misc_deregister - unregister a miscellaneous device @@ -249,8 +250,6 @@ void misc_deregister(struct miscdevice *misc) clear_bit(i, misc_minors); mutex_unlock(&misc_mtx); } - -EXPORT_SYMBOL(misc_register); EXPORT_SYMBOL(misc_deregister); static char *misc_devnode(struct device *dev, umode_t *mode) diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c index 4fa2931dcb7b..00b113f4b958 100644 --- a/drivers/counter/104-quad-8.c +++ b/drivers/counter/104-quad-8.c @@ -833,7 +833,7 @@ static int quad8_action_get(struct counter_device *counter, return 0; } -const struct counter_ops quad8_ops = { +static const struct counter_ops quad8_ops = { .signal_read = quad8_signal_read, .count_read = quad8_count_read, .count_write = quad8_count_write, diff --git a/drivers/firmware/google/coreboot_table.h b/drivers/firmware/google/coreboot_table.h index 1f63b3034b17..7b7b4a6eedda 100644 --- a/drivers/firmware/google/coreboot_table.h +++ b/drivers/firmware/google/coreboot_table.h @@ -12,7 +12,7 @@ #ifndef __COREBOOT_TABLE_H #define __COREBOOT_TABLE_H -#include <linux/io.h> +#include <linux/device.h> /* Coreboot table header structure */ struct coreboot_table_header { @@ -83,4 +83,13 @@ int coreboot_driver_register(struct coreboot_driver *driver); /* Unregister a driver that uses the data from a coreboot table. */ void coreboot_driver_unregister(struct coreboot_driver *driver); +/* module_coreboot_driver() - Helper macro for drivers that don't do + * anything special in module init/exit. This eliminates a lot of + * boilerplate. Each module may only use this macro once, and + * calling it replaces module_init() and module_exit() + */ +#define module_coreboot_driver(__coreboot_driver) \ + module_driver(__coreboot_driver, coreboot_driver_register, \ + coreboot_driver_unregister) + #endif /* __COREBOOT_TABLE_H */ diff --git a/drivers/firmware/google/framebuffer-coreboot.c b/drivers/firmware/google/framebuffer-coreboot.c index 7e67b651e4ac..916f26adc595 100644 --- a/drivers/firmware/google/framebuffer-coreboot.c +++ b/drivers/firmware/google/framebuffer-coreboot.c @@ -89,19 +89,7 @@ static struct coreboot_driver framebuffer_driver = { }, .tag = CB_TAG_FRAMEBUFFER, }; - -static int __init coreboot_framebuffer_init(void) -{ - return coreboot_driver_register(&framebuffer_driver); -} - -static void coreboot_framebuffer_exit(void) -{ - coreboot_driver_unregister(&framebuffer_driver); -} - -module_init(coreboot_framebuffer_init); -module_exit(coreboot_framebuffer_exit); +module_coreboot_driver(framebuffer_driver); MODULE_AUTHOR("Samuel Holland <samuel@sholland.org>"); MODULE_LICENSE("GPL"); diff --git a/drivers/firmware/google/memconsole-coreboot.c b/drivers/firmware/google/memconsole-coreboot.c index ac90e8536565..fd7f0fbec07e 100644 --- a/drivers/firmware/google/memconsole-coreboot.c +++ b/drivers/firmware/google/memconsole-coreboot.c @@ -8,6 +8,7 @@ */ #include <linux/device.h> +#include <linux/io.h> #include <linux/kernel.h> #include <linux/module.h> @@ -26,7 +27,7 @@ struct cbmem_cons { #define CURSOR_MASK ((1 << 28) - 1) #define OVERFLOW (1 << 31) -static struct cbmem_cons __iomem *cbmem_console; +static struct cbmem_cons *cbmem_console; static u32 cbmem_console_size; /* @@ -67,7 +68,7 @@ static ssize_t memconsole_coreboot_read(char *buf, loff_t pos, size_t count) static int memconsole_probe(struct coreboot_device *dev) { - struct cbmem_cons __iomem *tmp_cbmc; + struct cbmem_cons *tmp_cbmc; tmp_cbmc = memremap(dev->cbmem_ref.cbmem_addr, sizeof(*tmp_cbmc), MEMREMAP_WB); @@ -77,13 +78,13 @@ static int memconsole_probe(struct coreboot_device *dev) /* Read size only once to prevent overrun attack through /dev/mem. */ cbmem_console_size = tmp_cbmc->size_dont_access_after_boot; - cbmem_console = memremap(dev->cbmem_ref.cbmem_addr, + cbmem_console = devm_memremap(&dev->dev, dev->cbmem_ref.cbmem_addr, cbmem_console_size + sizeof(*cbmem_console), MEMREMAP_WB); memunmap(tmp_cbmc); - if (!cbmem_console) - return -ENOMEM; + if (IS_ERR(cbmem_console)) + return PTR_ERR(cbmem_console); memconsole_setup(memconsole_coreboot_read); @@ -94,9 +95,6 @@ static int memconsole_remove(struct coreboot_device *dev) { memconsole_exit(); - if (cbmem_console) - memunmap(cbmem_console); - return 0; } @@ -108,19 +106,7 @@ static struct coreboot_driver memconsole_driver = { }, .tag = CB_TAG_CBMEM_CONSOLE, }; - -static void coreboot_memconsole_exit(void) -{ - coreboot_driver_unregister(&memconsole_driver); -} - -static int __init coreboot_memconsole_init(void) -{ - return coreboot_driver_register(&memconsole_driver); -} - -module_exit(coreboot_memconsole_exit); -module_init(coreboot_memconsole_init); +module_coreboot_driver(memconsole_driver); MODULE_AUTHOR("Google, Inc."); MODULE_LICENSE("GPL"); diff --git a/drivers/firmware/google/memconsole.c b/drivers/firmware/google/memconsole.c index fe5aa740c34d..44d314ad69e4 100644 --- a/drivers/firmware/google/memconsole.c +++ b/drivers/firmware/google/memconsole.c @@ -7,21 +7,22 @@ * Copyright 2017 Google Inc. */ -#include <linux/init.h> #include <linux/sysfs.h> #include <linux/kobject.h> #include <linux/module.h> #include "memconsole.h" -static ssize_t (*memconsole_read_func)(char *, loff_t, size_t); - static ssize_t memconsole_read(struct file *filp, struct kobject *kobp, struct bin_attribute *bin_attr, char *buf, loff_t pos, size_t count) { + ssize_t (*memconsole_read_func)(char *, loff_t, size_t); + + memconsole_read_func = bin_attr->private; if (WARN_ON_ONCE(!memconsole_read_func)) return -EIO; + return memconsole_read_func(buf, pos, count); } @@ -32,7 +33,7 @@ static struct bin_attribute memconsole_bin_attr = { void memconsole_setup(ssize_t (*read_func)(char *, loff_t, size_t)) { - memconsole_read_func = read_func; + memconsole_bin_attr.private = read_func; } EXPORT_SYMBOL(memconsole_setup); diff --git a/drivers/firmware/google/vpd.c b/drivers/firmware/google/vpd.c index fd5212c395c0..0739f3b70347 100644 --- a/drivers/firmware/google/vpd.c +++ b/drivers/firmware/google/vpd.c @@ -316,19 +316,7 @@ static struct coreboot_driver vpd_driver = { }, .tag = CB_TAG_VPD, }; - -static int __init coreboot_vpd_init(void) -{ - return coreboot_driver_register(&vpd_driver); -} - -static void __exit coreboot_vpd_exit(void) -{ - coreboot_driver_unregister(&vpd_driver); -} - -module_init(coreboot_vpd_init); -module_exit(coreboot_vpd_exit); +module_coreboot_driver(vpd_driver); MODULE_AUTHOR("Google, Inc."); MODULE_LICENSE("GPL"); diff --git a/drivers/firmware/google/vpd_decode.c b/drivers/firmware/google/vpd_decode.c index c62fa7063a7c..92e3258552fc 100644 --- a/drivers/firmware/google/vpd_decode.c +++ b/drivers/firmware/google/vpd_decode.c @@ -7,8 +7,6 @@ * Copyright 2017 Google Inc. */ -#include <linux/export.h> - #include "vpd_decode.h" static int vpd_decode_len(const s32 max_len, const u8 *in, diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig index 85fc77148d19..8ae6956e5d80 100644 --- a/drivers/misc/Kconfig +++ b/drivers/misc/Kconfig @@ -9,7 +9,6 @@ config SENSORS_LIS3LV02D tristate depends on INPUT select INPUT_POLLDEV - default n config AD525X_DPOT tristate "Analog Devices Digital Potentiometers" @@ -62,7 +61,6 @@ config ATMEL_TCLIB config DUMMY_IRQ tristate "Dummy IRQ handler" - default n ---help--- This module accepts a single 'irq' parameter, which it should register for. The sole purpose of this module is to help with debugging of systems on @@ -118,7 +116,6 @@ config PHANTOM config INTEL_MID_PTI tristate "Parallel Trace Interface for MIPI P1149.7 cJTAG standard" depends on PCI && TTY && (X86_INTEL_MID || COMPILE_TEST) - default n help The PTI (Parallel Trace Interface) driver directs trace data routed from various parts in the system out @@ -194,7 +191,6 @@ config ATMEL_SSC config ENCLOSURE_SERVICES tristate "Enclosure Services" - default n help Provides support for intelligent enclosures (bays which contain storage devices). You also need either a host @@ -218,7 +214,6 @@ config SGI_XP config CS5535_MFGPT tristate "CS5535/CS5536 Geode Multi-Function General Purpose Timer (MFGPT) support" depends on MFD_CS5535 - default n help This driver provides access to MFGPT functionality for other drivers that need timers. MFGPTs are available in the CS5535 and @@ -251,7 +246,6 @@ config CS5535_CLOCK_EVENT_SRC config HP_ILO tristate "Channel interface driver for the HP iLO processor" depends on PCI - default n help The channel interface driver allows applications to communicate with iLO management processors present on HP ProLiant servers. @@ -286,7 +280,6 @@ config QCOM_FASTRPC config SGI_GRU tristate "SGI GRU driver" depends on X86_UV && SMP - default n select MMU_NOTIFIER ---help--- The GRU is a hardware resource located in the system chipset. The GRU @@ -301,7 +294,6 @@ config SGI_GRU config SGI_GRU_DEBUG bool "SGI GRU driver debug" depends on SGI_GRU - default n ---help--- This option enables additional debugging code for the SGI GRU driver. If you are unsure, say N. @@ -359,7 +351,6 @@ config SENSORS_BH1770 config SENSORS_APDS990X tristate "APDS990X combined als and proximity sensors" depends on I2C - default n ---help--- Say Y here if you want to build a driver for Avago APDS990x combined ambient light and proximity sensor chip. @@ -387,7 +378,6 @@ config DS1682 config SPEAR13XX_PCIE_GADGET bool "PCIe gadget support for SPEAr13XX platform" depends on ARCH_SPEAR13XX && BROKEN - default n help This option enables gadget support for PCIe controller. If board file defines any controller as PCIe endpoint then a sysfs @@ -397,6 +387,7 @@ config SPEAR13XX_PCIE_GADGET config VMWARE_BALLOON tristate "VMware Balloon Driver" depends on VMWARE_VMCI && X86 && HYPERVISOR_GUEST + select MEMORY_BALLOON help This is VMware physical memory management driver which acts like a "balloon" that can be inflated to reclaim physical pages diff --git a/drivers/misc/altera-stapl/Kconfig b/drivers/misc/altera-stapl/Kconfig index b34863544365..6c4c6575ec31 100644 --- a/drivers/misc/altera-stapl/Kconfig +++ b/drivers/misc/altera-stapl/Kconfig @@ -5,6 +5,5 @@ comment "Altera FPGA firmware download module (requires I2C)" config ALTERA_STAPL tristate "Altera FPGA firmware download module" depends on I2C - default n help An Altera FPGA module. Say Y when you want to support this tool. diff --git a/drivers/misc/c2port/Kconfig b/drivers/misc/c2port/Kconfig index 192e25094bd4..e20516ffd91e 100644 --- a/drivers/misc/c2port/Kconfig +++ b/drivers/misc/c2port/Kconfig @@ -5,7 +5,6 @@ menuconfig C2PORT tristate "Silicon Labs C2 port support" - default n help This option enables support for Silicon Labs C2 port used to program Silicon micro controller chips (and other 8051 compatible). @@ -24,7 +23,6 @@ if C2PORT config C2PORT_DURAMAR_2150 tristate "C2 port support for Eurotech's Duramar 2150" depends on X86 - default n help This option enables C2 support for the Eurotech's Duramar 2150 on board micro controller. diff --git a/drivers/misc/cb710/Kconfig b/drivers/misc/cb710/Kconfig index 3c7356d55423..a696d7509024 100644 --- a/drivers/misc/cb710/Kconfig +++ b/drivers/misc/cb710/Kconfig @@ -15,7 +15,6 @@ config CB710_CORE config CB710_DEBUG bool "Enable driver debugging" depends on CB710_CORE != n - default n help This is an option for use by developers; most people should say N here. This adds a lot of debugging output to dmesg. diff --git a/drivers/misc/cxl/Kconfig b/drivers/misc/cxl/Kconfig index f1d9a843e361..39eec9031487 100644 --- a/drivers/misc/cxl/Kconfig +++ b/drivers/misc/cxl/Kconfig @@ -5,16 +5,13 @@ config CXL_BASE bool - default n select PPC_COPRO_BASE config CXL_AFU_DRIVER_OPS bool - default n config CXL_LIB bool - default n config CXL tristate "Support for IBM Coherent Accelerators (CXL)" diff --git a/drivers/misc/echo/Kconfig b/drivers/misc/echo/Kconfig index 39656413e70d..be70b263e271 100644 --- a/drivers/misc/echo/Kconfig +++ b/drivers/misc/echo/Kconfig @@ -1,7 +1,6 @@ # SPDX-License-Identifier: GPL-2.0-only config ECHO tristate "Line Echo Canceller support" - default n ---help--- This driver provides line echo cancelling support for mISDN and Zaptel drivers. diff --git a/drivers/misc/eeprom/ee1004.c b/drivers/misc/eeprom/ee1004.c index be3263df278a..6f00c33cfe22 100644 --- a/drivers/misc/eeprom/ee1004.c +++ b/drivers/misc/eeprom/ee1004.c @@ -2,7 +2,7 @@ /* * ee1004 - driver for DDR4 SPD EEPROMs * - * Copyright (C) 2017 Jean Delvare + * Copyright (C) 2017-2019 Jean Delvare * * Based on the at24 driver: * Copyright (C) 2005-2007 David Brownell @@ -53,6 +53,24 @@ MODULE_DEVICE_TABLE(i2c, ee1004_ids); /*-------------------------------------------------------------------------*/ +static int ee1004_get_current_page(void) +{ + int err; + + err = i2c_smbus_read_byte(ee1004_set_page[0]); + if (err == -ENXIO) { + /* Nack means page 1 is selected */ + return 1; + } + if (err < 0) { + /* Anything else is a real error, bail out */ + return err; + } + + /* Ack means page 0 is selected, returned value meaningless */ + return 0; +} + static ssize_t ee1004_eeprom_read(struct i2c_client *client, char *buf, unsigned int offset, size_t count) { @@ -102,6 +120,16 @@ static ssize_t ee1004_read(struct file *filp, struct kobject *kobj, /* Data is ignored */ status = i2c_smbus_write_byte(ee1004_set_page[page], 0x00); + if (status == -ENXIO) { + /* + * Don't give up just yet. Some memory + * modules will select the page but not + * ack the command. Check which page is + * selected now. + */ + if (ee1004_get_current_page() == page) + status = 0; + } if (status < 0) { dev_err(dev, "Failed to select page %d (%d)\n", page, status); @@ -186,17 +214,10 @@ static int ee1004_probe(struct i2c_client *client, } /* Remember current page to avoid unneeded page select */ - err = i2c_smbus_read_byte(ee1004_set_page[0]); - if (err == -ENXIO) { - /* Nack means page 1 is selected */ - ee1004_current_page = 1; - } else if (err < 0) { - /* Anything else is a real error, bail out */ + err = ee1004_get_current_page(); + if (err < 0) goto err_clients; - } else { - /* Ack means page 0 is selected, returned value meaningless */ - ee1004_current_page = 0; - } + ee1004_current_page = err; dev_dbg(&client->dev, "Currently selected page: %d\n", ee1004_current_page); mutex_unlock(&ee1004_bus_lock); diff --git a/drivers/misc/genwqe/Kconfig b/drivers/misc/genwqe/Kconfig index a8a608713d26..97f64bcf9fe0 100644 --- a/drivers/misc/genwqe/Kconfig +++ b/drivers/misc/genwqe/Kconfig @@ -7,7 +7,6 @@ menuconfig GENWQE tristate "GenWQE PCIe Accelerator" depends on PCI && 64BIT select CRC_ITU_T - default n help Enables PCIe card driver for IBM GenWQE accelerators. The user-space interface is described in diff --git a/drivers/misc/lis3lv02d/Kconfig b/drivers/misc/lis3lv02d/Kconfig index 4cfad45229c8..bb2fec4b5880 100644 --- a/drivers/misc/lis3lv02d/Kconfig +++ b/drivers/misc/lis3lv02d/Kconfig @@ -7,7 +7,6 @@ config SENSORS_LIS3_SPI tristate "STMicroeletronics LIS3LV02Dx three-axis digital accelerometer (SPI)" depends on !ACPI && SPI_MASTER && INPUT select SENSORS_LIS3LV02D - default n help This driver provides support for the LIS3LV02Dx accelerometer connected via SPI. The accelerometer data is readable via @@ -24,7 +23,6 @@ config SENSORS_LIS3_I2C tristate "STMicroeletronics LIS3LV02Dx three-axis digital accelerometer (I2C)" depends on I2C && INPUT select SENSORS_LIS3LV02D - default n help This driver provides support for the LIS3LV02Dx accelerometer connected via I2C. The accelerometer data is readable via diff --git a/drivers/misc/lkdtm/Makefile b/drivers/misc/lkdtm/Makefile index 951c984de61a..fb10eafe9bde 100644 --- a/drivers/misc/lkdtm/Makefile +++ b/drivers/misc/lkdtm/Makefile @@ -15,8 +15,7 @@ KCOV_INSTRUMENT_rodata.o := n OBJCOPYFLAGS := OBJCOPYFLAGS_rodata_objcopy.o := \ - --set-section-flags .text=alloc,readonly \ - --rename-section .text=.rodata + --rename-section .text=.rodata,alloc,readonly,load targets += rodata.o rodata_objcopy.o $(obj)/rodata_objcopy.o: $(obj)/rodata.o FORCE $(call if_changed,objcopy) diff --git a/drivers/misc/mei/debugfs.c b/drivers/misc/mei/debugfs.c index 0970142bcace..47cfd5005e1b 100644 --- a/drivers/misc/mei/debugfs.c +++ b/drivers/misc/mei/debugfs.c @@ -8,6 +8,7 @@ #include <linux/kernel.h> #include <linux/device.h> #include <linux/debugfs.h> +#include <linux/seq_file.h> #include <linux/mei.h> @@ -15,104 +16,56 @@ #include "client.h" #include "hw.h" -static ssize_t mei_dbgfs_read_meclients(struct file *fp, char __user *ubuf, - size_t cnt, loff_t *ppos) +static int mei_dbgfs_meclients_show(struct seq_file *m, void *unused) { - struct mei_device *dev = fp->private_data; + struct mei_device *dev = m->private; struct mei_me_client *me_cl; - size_t bufsz = 1; - char *buf; int i = 0; - int pos = 0; - int ret; -#define HDR \ -" |id|fix| UUID |con|msg len|sb|refc|\n" + if (!dev) + return -ENODEV; down_read(&dev->me_clients_rwsem); - list_for_each_entry(me_cl, &dev->me_clients, list) - bufsz++; - bufsz *= sizeof(HDR) + 1; - buf = kzalloc(bufsz, GFP_KERNEL); - if (!buf) { - up_read(&dev->me_clients_rwsem); - return -ENOMEM; - } - - pos += scnprintf(buf + pos, bufsz - pos, HDR); -#undef HDR + seq_puts(m, " |id|fix| UUID |con|msg len|sb|refc|\n"); /* if the driver is not enabled the list won't be consistent */ if (dev->dev_state != MEI_DEV_ENABLED) goto out; list_for_each_entry(me_cl, &dev->me_clients, list) { - - if (mei_me_cl_get(me_cl)) { - pos += scnprintf(buf + pos, bufsz - pos, - "%2d|%2d|%3d|%pUl|%3d|%7d|%2d|%4d|\n", - i++, me_cl->client_id, - me_cl->props.fixed_address, - &me_cl->props.protocol_name, - me_cl->props.max_number_of_connections, - me_cl->props.max_msg_length, - me_cl->props.single_recv_buf, - kref_read(&me_cl->refcnt)); - - mei_me_cl_put(me_cl); - } + if (!mei_me_cl_get(me_cl)) + continue; + + seq_printf(m, "%2d|%2d|%3d|%pUl|%3d|%7d|%2d|%4d|\n", + i++, me_cl->client_id, + me_cl->props.fixed_address, + &me_cl->props.protocol_name, + me_cl->props.max_number_of_connections, + me_cl->props.max_msg_length, + me_cl->props.single_recv_buf, + kref_read(&me_cl->refcnt)); + mei_me_cl_put(me_cl); } out: up_read(&dev->me_clients_rwsem); - ret = simple_read_from_buffer(ubuf, cnt, ppos, buf, pos); - kfree(buf); - return ret; + return 0; } +DEFINE_SHOW_ATTRIBUTE(mei_dbgfs_meclients); -static const struct file_operations mei_dbgfs_fops_meclients = { - .open = simple_open, - .read = mei_dbgfs_read_meclients, - .llseek = generic_file_llseek, -}; - -static ssize_t mei_dbgfs_read_active(struct file *fp, char __user *ubuf, - size_t cnt, loff_t *ppos) +static int mei_dbgfs_active_show(struct seq_file *m, void *unused) { - struct mei_device *dev = fp->private_data; + struct mei_device *dev = m->private; struct mei_cl *cl; - size_t bufsz = 1; - char *buf; int i = 0; - int pos = 0; - int ret; - -#define HDR " |me|host|state|rd|wr|wrq\n" if (!dev) return -ENODEV; mutex_lock(&dev->device_lock); - /* - * if the driver is not enabled the list won't be consistent, - * we output empty table - */ - if (dev->dev_state == MEI_DEV_ENABLED) - list_for_each_entry(cl, &dev->file_list, link) - bufsz++; - - bufsz *= sizeof(HDR) + 1; - - buf = kzalloc(bufsz, GFP_KERNEL); - if (!buf) { - mutex_unlock(&dev->device_lock); - return -ENOMEM; - } - - pos += scnprintf(buf + pos, bufsz - pos, HDR); -#undef HDR + seq_puts(m, " |me|host|state|rd|wr|wrq\n"); /* if the driver is not enabled the list won't be consistent */ if (dev->dev_state != MEI_DEV_ENABLED) @@ -120,76 +73,44 @@ static ssize_t mei_dbgfs_read_active(struct file *fp, char __user *ubuf, list_for_each_entry(cl, &dev->file_list, link) { - pos += scnprintf(buf + pos, bufsz - pos, - "%3d|%2d|%4d|%5d|%2d|%2d|%3u\n", - i, mei_cl_me_id(cl), cl->host_client_id, cl->state, - !list_empty(&cl->rd_completed), cl->writing_state, - cl->tx_cb_queued); + seq_printf(m, "%3d|%2d|%4d|%5d|%2d|%2d|%3u\n", + i, mei_cl_me_id(cl), cl->host_client_id, cl->state, + !list_empty(&cl->rd_completed), cl->writing_state, + cl->tx_cb_queued); i++; } out: mutex_unlock(&dev->device_lock); - ret = simple_read_from_buffer(ubuf, cnt, ppos, buf, pos); - kfree(buf); - return ret; + return 0; } +DEFINE_SHOW_ATTRIBUTE(mei_dbgfs_active); -static const struct file_operations mei_dbgfs_fops_active = { - .open = simple_open, - .read = mei_dbgfs_read_active, - .llseek = generic_file_llseek, -}; - -static ssize_t mei_dbgfs_read_devstate(struct file *fp, char __user *ubuf, - size_t cnt, loff_t *ppos) +static int mei_dbgfs_devstate_show(struct seq_file *m, void *unused) { - struct mei_device *dev = fp->private_data; - const size_t bufsz = 1024; - char *buf = kzalloc(bufsz, GFP_KERNEL); - int pos = 0; - int ret; - - if (!buf) - return -ENOMEM; + struct mei_device *dev = m->private; - pos += scnprintf(buf + pos, bufsz - pos, "dev: %s\n", - mei_dev_state_str(dev->dev_state)); - pos += scnprintf(buf + pos, bufsz - pos, "hbm: %s\n", - mei_hbm_state_str(dev->hbm_state)); + seq_printf(m, "dev: %s\n", mei_dev_state_str(dev->dev_state)); + seq_printf(m, "hbm: %s\n", mei_hbm_state_str(dev->hbm_state)); if (dev->hbm_state >= MEI_HBM_ENUM_CLIENTS && dev->hbm_state <= MEI_HBM_STARTED) { - pos += scnprintf(buf + pos, bufsz - pos, "hbm features:\n"); - pos += scnprintf(buf + pos, bufsz - pos, "\tPG: %01d\n", - dev->hbm_f_pg_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tDC: %01d\n", - dev->hbm_f_dc_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tIE: %01d\n", - dev->hbm_f_ie_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tDOT: %01d\n", - dev->hbm_f_dot_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tEV: %01d\n", - dev->hbm_f_ev_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tFA: %01d\n", - dev->hbm_f_fa_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tOS: %01d\n", - dev->hbm_f_os_supported); - pos += scnprintf(buf + pos, bufsz - pos, "\tDR: %01d\n", - dev->hbm_f_dr_supported); + seq_puts(m, "hbm features:\n"); + seq_printf(m, "\tPG: %01d\n", dev->hbm_f_pg_supported); + seq_printf(m, "\tDC: %01d\n", dev->hbm_f_dc_supported); + seq_printf(m, "\tIE: %01d\n", dev->hbm_f_ie_supported); + seq_printf(m, "\tDOT: %01d\n", dev->hbm_f_dot_supported); + seq_printf(m, "\tEV: %01d\n", dev->hbm_f_ev_supported); + seq_printf(m, "\tFA: %01d\n", dev->hbm_f_fa_supported); + seq_printf(m, "\tOS: %01d\n", dev->hbm_f_os_supported); + seq_printf(m, "\tDR: %01d\n", dev->hbm_f_dr_supported); } - pos += scnprintf(buf + pos, bufsz - pos, "pg: %s, %s\n", - mei_pg_is_enabled(dev) ? "ENABLED" : "DISABLED", - mei_pg_state_str(mei_pg_state(dev))); - ret = simple_read_from_buffer(ubuf, cnt, ppos, buf, pos); - kfree(buf); - return ret; + seq_printf(m, "pg: %s, %s\n", + mei_pg_is_enabled(dev) ? "ENABLED" : "DISABLED", + mei_pg_state_str(mei_pg_state(dev))); + return 0; } -static const struct file_operations mei_dbgfs_fops_devstate = { - .open = simple_open, - .read = mei_dbgfs_read_devstate, - .llseek = generic_file_llseek, -}; +DEFINE_SHOW_ATTRIBUTE(mei_dbgfs_devstate); static ssize_t mei_dbgfs_write_allow_fa(struct file *file, const char __user *user_buf, @@ -208,7 +129,7 @@ static ssize_t mei_dbgfs_write_allow_fa(struct file *file, return ret; } -static const struct file_operations mei_dbgfs_fops_allow_fa = { +static const struct file_operations mei_dbgfs_allow_fa_fops = { .open = simple_open, .read = debugfs_read_file_bool, .write = mei_dbgfs_write_allow_fa, @@ -247,26 +168,26 @@ int mei_dbgfs_register(struct mei_device *dev, const char *name) dev->dbgfs_dir = dir; f = debugfs_create_file("meclients", S_IRUSR, dir, - dev, &mei_dbgfs_fops_meclients); + dev, &mei_dbgfs_meclients_fops); if (!f) { dev_err(dev->dev, "meclients: registration failed\n"); goto err; } f = debugfs_create_file("active", S_IRUSR, dir, - dev, &mei_dbgfs_fops_active); + dev, &mei_dbgfs_active_fops); if (!f) { dev_err(dev->dev, "active: registration failed\n"); goto err; } f = debugfs_create_file("devstate", S_IRUSR, dir, - dev, &mei_dbgfs_fops_devstate); + dev, &mei_dbgfs_devstate_fops); if (!f) { dev_err(dev->dev, "devstate: registration failed\n"); goto err; } f = debugfs_create_file("allow_fixed_address", S_IRUSR | S_IWUSR, dir, &dev->allow_fixed_address, - &mei_dbgfs_fops_allow_fa); + &mei_dbgfs_allow_fa_fops); if (!f) { dev_err(dev->dev, "allow_fixed_address: registration failed\n"); goto err; @@ -276,4 +197,3 @@ err: mei_dbgfs_deregister(dev); return -ENODEV; } - diff --git a/drivers/misc/mei/hdcp/mei_hdcp.c b/drivers/misc/mei/hdcp/mei_hdcp.c index b07000202d4a..ed816939fb32 100644 --- a/drivers/misc/mei/hdcp/mei_hdcp.c +++ b/drivers/misc/mei/hdcp/mei_hdcp.c @@ -2,7 +2,7 @@ /* * Copyright © 2019 Intel Corporation * - * Mei_hdcp.c: HDCP client driver for mei bus + * mei_hdcp.c: HDCP client driver for mei bus * * Author: * Ramalingam C <ramalingam.c@intel.com> @@ -11,12 +11,9 @@ /** * DOC: MEI_HDCP Client Driver * - * This is a client driver to the mei_bus to make the HDCP2.2 services of - * ME FW available for the interested consumers like I915. - * - * This module will act as a translation layer between HDCP protocol - * implementor(I915) and ME FW by translating HDCP2.2 authentication - * messages to ME FW command payloads and vice versa. + * The mei_hdcp driver acts as a translation layer between HDCP 2.2 + * protocol implementer (I915) and ME FW by translating HDCP2.2 + * negotiation messages to ME FW command payloads and vice versa. */ #include <linux/module.h> diff --git a/drivers/misc/ocxl/Kconfig b/drivers/misc/ocxl/Kconfig index 7fb6d39d4c5a..1916fa65f2f2 100644 --- a/drivers/misc/ocxl/Kconfig +++ b/drivers/misc/ocxl/Kconfig @@ -5,7 +5,6 @@ config OCXL_BASE bool - default n select PPC_COPRO_BASE config OCXL diff --git a/drivers/misc/sgi-xp/xpc_partition.c b/drivers/misc/sgi-xp/xpc_partition.c index 3eba1c420cc0..782ce95d3f17 100644 --- a/drivers/misc/sgi-xp/xpc_partition.c +++ b/drivers/misc/sgi-xp/xpc_partition.c @@ -70,7 +70,7 @@ xpc_get_rsvd_page_pa(int nasid) unsigned long rp_pa = nasid; /* seed with nasid */ size_t len = 0; size_t buf_len = 0; - void *buf = buf; + void *buf = NULL; void *buf_base = NULL; enum xp_retval (*get_partition_rsvd_page_pa) (void *, u64 *, unsigned long *, size_t *) = diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c index ad807d5a3141..043eed845246 100644 --- a/drivers/misc/vmw_balloon.c +++ b/drivers/misc/vmw_balloon.c @@ -28,6 +28,8 @@ #include <linux/rwsem.h> #include <linux/slab.h> #include <linux/spinlock.h> +#include <linux/mount.h> +#include <linux/balloon_compaction.h> #include <linux/vmw_vmci_defs.h> #include <linux/vmw_vmci_api.h> #include <asm/hypervisor.h> @@ -38,25 +40,20 @@ MODULE_ALIAS("dmi:*:svnVMware*:*"); MODULE_ALIAS("vmware_vmmemctl"); MODULE_LICENSE("GPL"); -/* - * Use __GFP_HIGHMEM to allow pages from HIGHMEM zone. We don't allow wait - * (__GFP_RECLAIM) for huge page allocations. Use __GFP_NOWARN, to suppress page - * allocation failure warnings. Disallow access to emergency low-memory pools. - */ -#define VMW_HUGE_PAGE_ALLOC_FLAGS (__GFP_HIGHMEM|__GFP_NOWARN| \ - __GFP_NOMEMALLOC) +static bool __read_mostly vmwballoon_shrinker_enable; +module_param(vmwballoon_shrinker_enable, bool, 0444); +MODULE_PARM_DESC(vmwballoon_shrinker_enable, + "Enable non-cooperative out-of-memory protection. Disabled by default as it may degrade performance."); -/* - * Use __GFP_HIGHMEM to allow pages from HIGHMEM zone. We allow lightweight - * reclamation (__GFP_NORETRY). Use __GFP_NOWARN, to suppress page allocation - * failure warnings. Disallow access to emergency low-memory pools. - */ -#define VMW_PAGE_ALLOC_FLAGS (__GFP_HIGHMEM|__GFP_NOWARN| \ - __GFP_NOMEMALLOC|__GFP_NORETRY) +/* Delay in seconds after shrink before inflation. */ +#define VMBALLOON_SHRINK_DELAY (5) /* Maximum number of refused pages we accumulate during inflation cycle */ #define VMW_BALLOON_MAX_REFUSED 16 +/* Magic number for the balloon mount-point */ +#define BALLOON_VMW_MAGIC 0x0ba11007 + /* * Hypervisor communication port definitions. */ @@ -229,29 +226,26 @@ enum vmballoon_stat_general { VMW_BALLOON_STAT_TIMER, VMW_BALLOON_STAT_DOORBELL, VMW_BALLOON_STAT_RESET, - VMW_BALLOON_STAT_LAST = VMW_BALLOON_STAT_RESET + VMW_BALLOON_STAT_SHRINK, + VMW_BALLOON_STAT_SHRINK_FREE, + VMW_BALLOON_STAT_LAST = VMW_BALLOON_STAT_SHRINK_FREE }; #define VMW_BALLOON_STAT_NUM (VMW_BALLOON_STAT_LAST + 1) - static DEFINE_STATIC_KEY_TRUE(vmw_balloon_batching); static DEFINE_STATIC_KEY_FALSE(balloon_stat_enabled); struct vmballoon_ctl { struct list_head pages; struct list_head refused_pages; + struct list_head prealloc_pages; unsigned int n_refused_pages; unsigned int n_pages; enum vmballoon_page_size_type page_size; enum vmballoon_op op; }; -struct vmballoon_page_size { - /* list of reserved physical pages */ - struct list_head pages; -}; - /** * struct vmballoon_batch_entry - a batch entry for lock or unlock. * @@ -266,8 +260,6 @@ struct vmballoon_batch_entry { } __packed; struct vmballoon { - struct vmballoon_page_size page_sizes[VMW_BALLOON_NUM_PAGE_SIZES]; - /** * @max_page_size: maximum supported page size for ballooning. * @@ -340,6 +332,15 @@ struct vmballoon { */ struct page *page; + /** + * @shrink_timeout: timeout until the next inflation. + * + * After an shrink event, indicates the time in jiffies after which + * inflation is allowed again. Can be written concurrently with reads, + * so must use READ_ONCE/WRITE_ONCE when accessing. + */ + unsigned long shrink_timeout; + /* statistics */ struct vmballoon_stats *stats; @@ -348,9 +349,21 @@ struct vmballoon { struct dentry *dbg_entry; #endif + /** + * @b_dev_info: balloon device information descriptor. + */ + struct balloon_dev_info b_dev_info; + struct delayed_work dwork; /** + * @huge_pages - list of the inflated 2MB pages. + * + * Protected by @b_dev_info.pages_lock . + */ + struct list_head huge_pages; + + /** * @vmci_doorbell. * * Protected by @conf_sem. @@ -368,6 +381,20 @@ struct vmballoon { * Lock ordering: @conf_sem -> @comm_lock . */ spinlock_t comm_lock; + + /** + * @shrinker: shrinker interface that is used to avoid over-inflation. + */ + struct shrinker shrinker; + + /** + * @shrinker_registered: whether the shrinker was registered. + * + * The shrinker interface does not handle gracefully the removal of + * shrinker that was not registered before. This indication allows to + * simplify the unregistration process. + */ + bool shrinker_registered; }; static struct vmballoon balloon; @@ -642,15 +669,25 @@ static int vmballoon_alloc_page_list(struct vmballoon *b, unsigned int i; for (i = 0; i < req_n_pages; i++) { - if (ctl->page_size == VMW_BALLOON_2M_PAGE) - page = alloc_pages(VMW_HUGE_PAGE_ALLOC_FLAGS, - VMW_BALLOON_2M_ORDER); - else - page = alloc_page(VMW_PAGE_ALLOC_FLAGS); - - /* Update statistics */ - vmballoon_stats_page_inc(b, VMW_BALLOON_PAGE_STAT_ALLOC, - ctl->page_size); + /* + * First check if we happen to have pages that were allocated + * before. This happens when 2MB page rejected during inflation + * by the hypervisor, and then split into 4KB pages. + */ + if (!list_empty(&ctl->prealloc_pages)) { + page = list_first_entry(&ctl->prealloc_pages, + struct page, lru); + list_del(&page->lru); + } else { + if (ctl->page_size == VMW_BALLOON_2M_PAGE) + page = alloc_pages(__GFP_HIGHMEM|__GFP_NOWARN| + __GFP_NOMEMALLOC, VMW_BALLOON_2M_ORDER); + else + page = balloon_page_alloc(); + + vmballoon_stats_page_inc(b, VMW_BALLOON_PAGE_STAT_ALLOC, + ctl->page_size); + } if (page) { vmballoon_mark_page_offline(page, ctl->page_size); @@ -896,7 +933,8 @@ static void vmballoon_release_page_list(struct list_head *page_list, __free_pages(page, vmballoon_page_order(page_size)); } - *n_pages = 0; + if (n_pages) + *n_pages = 0; } @@ -942,6 +980,10 @@ static int64_t vmballoon_change(struct vmballoon *b) size - target < vmballoon_page_in_frames(VMW_BALLOON_2M_PAGE)) return 0; + /* If an out-of-memory recently occurred, inflation is disallowed. */ + if (target > size && time_before(jiffies, READ_ONCE(b->shrink_timeout))) + return 0; + return target - size; } @@ -961,9 +1003,22 @@ static void vmballoon_enqueue_page_list(struct vmballoon *b, unsigned int *n_pages, enum vmballoon_page_size_type page_size) { - struct vmballoon_page_size *page_size_info = &b->page_sizes[page_size]; + unsigned long flags; + + if (page_size == VMW_BALLOON_4K_PAGE) { + balloon_page_list_enqueue(&b->b_dev_info, pages); + } else { + /* + * Keep the huge pages in a local list which is not available + * for the balloon compaction mechanism. + */ + spin_lock_irqsave(&b->b_dev_info.pages_lock, flags); + list_splice_init(pages, &b->huge_pages); + __count_vm_events(BALLOON_INFLATE, *n_pages * + vmballoon_page_in_frames(VMW_BALLOON_2M_PAGE)); + spin_unlock_irqrestore(&b->b_dev_info.pages_lock, flags); + } - list_splice_init(pages, &page_size_info->pages); *n_pages = 0; } @@ -986,19 +1041,58 @@ static void vmballoon_dequeue_page_list(struct vmballoon *b, enum vmballoon_page_size_type page_size, unsigned int n_req_pages) { - struct vmballoon_page_size *page_size_info = &b->page_sizes[page_size]; struct page *page, *tmp; unsigned int i = 0; + unsigned long flags; + + /* In the case of 4k pages, use the compaction infrastructure */ + if (page_size == VMW_BALLOON_4K_PAGE) { + *n_pages = balloon_page_list_dequeue(&b->b_dev_info, pages, + n_req_pages); + return; + } - list_for_each_entry_safe(page, tmp, &page_size_info->pages, lru) { + /* 2MB pages */ + spin_lock_irqsave(&b->b_dev_info.pages_lock, flags); + list_for_each_entry_safe(page, tmp, &b->huge_pages, lru) { list_move(&page->lru, pages); if (++i == n_req_pages) break; } + + __count_vm_events(BALLOON_DEFLATE, + i * vmballoon_page_in_frames(VMW_BALLOON_2M_PAGE)); + spin_unlock_irqrestore(&b->b_dev_info.pages_lock, flags); *n_pages = i; } /** + * vmballoon_split_refused_pages() - Split the 2MB refused pages to 4k. + * + * If inflation of 2MB pages was denied by the hypervisor, it is likely to be + * due to one or few 4KB pages. These 2MB pages may keep being allocated and + * then being refused. To prevent this case, this function splits the refused + * pages into 4KB pages and adds them into @prealloc_pages list. + * + * @ctl: pointer for the %struct vmballoon_ctl, which defines the operation. + */ +static void vmballoon_split_refused_pages(struct vmballoon_ctl *ctl) +{ + struct page *page, *tmp; + unsigned int i, order; + + order = vmballoon_page_order(ctl->page_size); + + list_for_each_entry_safe(page, tmp, &ctl->refused_pages, lru) { + list_del(&page->lru); + split_page(page, order); + for (i = 0; i < (1 << order); i++) + list_add(&page[i].lru, &ctl->prealloc_pages); + } + ctl->n_refused_pages = 0; +} + +/** * vmballoon_inflate() - Inflate the balloon towards its target size. * * @b: pointer to the balloon. @@ -1009,6 +1103,7 @@ static void vmballoon_inflate(struct vmballoon *b) struct vmballoon_ctl ctl = { .pages = LIST_HEAD_INIT(ctl.pages), .refused_pages = LIST_HEAD_INIT(ctl.refused_pages), + .prealloc_pages = LIST_HEAD_INIT(ctl.prealloc_pages), .page_size = b->max_page_size, .op = VMW_BALLOON_INFLATE }; @@ -1056,10 +1151,10 @@ static void vmballoon_inflate(struct vmballoon *b) break; /* - * Ignore errors from locking as we now switch to 4k - * pages and we might get different errors. + * Split the refused pages to 4k. This will also empty + * the refused pages list. */ - vmballoon_release_refused_pages(b, &ctl); + vmballoon_split_refused_pages(&ctl); ctl.page_size--; } @@ -1073,6 +1168,8 @@ static void vmballoon_inflate(struct vmballoon *b) */ if (ctl.n_refused_pages != 0) vmballoon_release_refused_pages(b, &ctl); + + vmballoon_release_page_list(&ctl.prealloc_pages, NULL, ctl.page_size); } /** @@ -1411,6 +1508,90 @@ static void vmballoon_work(struct work_struct *work) } +/** + * vmballoon_shrinker_scan() - deflate the balloon due to memory pressure. + * @shrinker: pointer to the balloon shrinker. + * @sc: page reclaim information. + * + * Returns: number of pages that were freed during deflation. + */ +static unsigned long vmballoon_shrinker_scan(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct vmballoon *b = &balloon; + unsigned long deflated_frames; + + pr_debug("%s - size: %llu", __func__, atomic64_read(&b->size)); + + vmballoon_stats_gen_inc(b, VMW_BALLOON_STAT_SHRINK); + + /* + * If the lock is also contended for read, we cannot easily reclaim and + * we bail out. + */ + if (!down_read_trylock(&b->conf_sem)) + return 0; + + deflated_frames = vmballoon_deflate(b, sc->nr_to_scan, true); + + vmballoon_stats_gen_add(b, VMW_BALLOON_STAT_SHRINK_FREE, + deflated_frames); + + /* + * Delay future inflation for some time to mitigate the situations in + * which balloon continuously grows and shrinks. Use WRITE_ONCE() since + * the access is asynchronous. + */ + WRITE_ONCE(b->shrink_timeout, jiffies + HZ * VMBALLOON_SHRINK_DELAY); + + up_read(&b->conf_sem); + + return deflated_frames; +} + +/** + * vmballoon_shrinker_count() - return the number of ballooned pages. + * @shrinker: pointer to the balloon shrinker. + * @sc: page reclaim information. + * + * Returns: number of 4k pages that are allocated for the balloon and can + * therefore be reclaimed under pressure. + */ +static unsigned long vmballoon_shrinker_count(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct vmballoon *b = &balloon; + + return atomic64_read(&b->size); +} + +static void vmballoon_unregister_shrinker(struct vmballoon *b) +{ + if (b->shrinker_registered) + unregister_shrinker(&b->shrinker); + b->shrinker_registered = false; +} + +static int vmballoon_register_shrinker(struct vmballoon *b) +{ + int r; + + /* Do nothing if the shrinker is not enabled */ + if (!vmwballoon_shrinker_enable) + return 0; + + b->shrinker.scan_objects = vmballoon_shrinker_scan; + b->shrinker.count_objects = vmballoon_shrinker_count; + b->shrinker.seeks = DEFAULT_SEEKS; + + r = register_shrinker(&b->shrinker); + + if (r == 0) + b->shrinker_registered = true; + + return r; +} + /* * DEBUGFS Interface */ @@ -1428,6 +1609,8 @@ static const char * const vmballoon_stat_names[] = { [VMW_BALLOON_STAT_TIMER] = "timer", [VMW_BALLOON_STAT_DOORBELL] = "doorbell", [VMW_BALLOON_STAT_RESET] = "reset", + [VMW_BALLOON_STAT_SHRINK] = "shrink", + [VMW_BALLOON_STAT_SHRINK_FREE] = "shrinkFree" }; static int vmballoon_enable_stats(struct vmballoon *b) @@ -1552,9 +1735,204 @@ static inline void vmballoon_debugfs_exit(struct vmballoon *b) #endif /* CONFIG_DEBUG_FS */ + +#ifdef CONFIG_BALLOON_COMPACTION + +static struct dentry *vmballoon_mount(struct file_system_type *fs_type, + int flags, const char *dev_name, + void *data) +{ + static const struct dentry_operations ops = { + .d_dname = simple_dname, + }; + + return mount_pseudo(fs_type, "balloon-vmware:", NULL, &ops, + BALLOON_VMW_MAGIC); +} + +static struct file_system_type vmballoon_fs = { + .name = "balloon-vmware", + .mount = vmballoon_mount, + .kill_sb = kill_anon_super, +}; + +static struct vfsmount *vmballoon_mnt; + +/** + * vmballoon_migratepage() - migrates a balloon page. + * @b_dev_info: balloon device information descriptor. + * @newpage: the page to which @page should be migrated. + * @page: a ballooned page that should be migrated. + * @mode: migration mode, ignored. + * + * This function is really open-coded, but that is according to the interface + * that balloon_compaction provides. + * + * Return: zero on success, -EAGAIN when migration cannot be performed + * momentarily, and -EBUSY if migration failed and should be retried + * with that specific page. + */ +static int vmballoon_migratepage(struct balloon_dev_info *b_dev_info, + struct page *newpage, struct page *page, + enum migrate_mode mode) +{ + unsigned long status, flags; + struct vmballoon *b; + int ret; + + b = container_of(b_dev_info, struct vmballoon, b_dev_info); + + /* + * If the semaphore is taken, there is ongoing configuration change + * (i.e., balloon reset), so try again. + */ + if (!down_read_trylock(&b->conf_sem)) + return -EAGAIN; + + spin_lock(&b->comm_lock); + /* + * We must start by deflating and not inflating, as otherwise the + * hypervisor may tell us that it has enough memory and the new page is + * not needed. Since the old page is isolated, we cannot use the list + * interface to unlock it, as the LRU field is used for isolation. + * Instead, we use the native interface directly. + */ + vmballoon_add_page(b, 0, page); + status = vmballoon_lock_op(b, 1, VMW_BALLOON_4K_PAGE, + VMW_BALLOON_DEFLATE); + + if (status == VMW_BALLOON_SUCCESS) + status = vmballoon_status_page(b, 0, &page); + + /* + * If a failure happened, let the migration mechanism know that it + * should not retry. + */ + if (status != VMW_BALLOON_SUCCESS) { + spin_unlock(&b->comm_lock); + ret = -EBUSY; + goto out_unlock; + } + + /* + * The page is isolated, so it is safe to delete it without holding + * @pages_lock . We keep holding @comm_lock since we will need it in a + * second. + */ + balloon_page_delete(page); + + put_page(page); + + /* Inflate */ + vmballoon_add_page(b, 0, newpage); + status = vmballoon_lock_op(b, 1, VMW_BALLOON_4K_PAGE, + VMW_BALLOON_INFLATE); + + if (status == VMW_BALLOON_SUCCESS) + status = vmballoon_status_page(b, 0, &newpage); + + spin_unlock(&b->comm_lock); + + if (status != VMW_BALLOON_SUCCESS) { + /* + * A failure happened. While we can deflate the page we just + * inflated, this deflation can also encounter an error. Instead + * we will decrease the size of the balloon to reflect the + * change and report failure. + */ + atomic64_dec(&b->size); + ret = -EBUSY; + } else { + /* + * Success. Take a reference for the page, and we will add it to + * the list after acquiring the lock. + */ + get_page(newpage); + ret = MIGRATEPAGE_SUCCESS; + } + + /* Update the balloon list under the @pages_lock */ + spin_lock_irqsave(&b->b_dev_info.pages_lock, flags); + + /* + * On inflation success, we already took a reference for the @newpage. + * If we succeed just insert it to the list and update the statistics + * under the lock. + */ + if (ret == MIGRATEPAGE_SUCCESS) { + balloon_page_insert(&b->b_dev_info, newpage); + __count_vm_event(BALLOON_MIGRATE); + } + + /* + * We deflated successfully, so regardless to the inflation success, we + * need to reduce the number of isolated_pages. + */ + b->b_dev_info.isolated_pages--; + spin_unlock_irqrestore(&b->b_dev_info.pages_lock, flags); + +out_unlock: + up_read(&b->conf_sem); + return ret; +} + +/** + * vmballoon_compaction_deinit() - removes compaction related data. + * + * @b: pointer to the balloon. + */ +static void vmballoon_compaction_deinit(struct vmballoon *b) +{ + if (!IS_ERR(b->b_dev_info.inode)) + iput(b->b_dev_info.inode); + + b->b_dev_info.inode = NULL; + kern_unmount(vmballoon_mnt); + vmballoon_mnt = NULL; +} + +/** + * vmballoon_compaction_init() - initialized compaction for the balloon. + * + * @b: pointer to the balloon. + * + * If during the initialization a failure occurred, this function does not + * perform cleanup. The caller must call vmballoon_compaction_deinit() in this + * case. + * + * Return: zero on success or error code on failure. + */ +static __init int vmballoon_compaction_init(struct vmballoon *b) +{ + vmballoon_mnt = kern_mount(&vmballoon_fs); + if (IS_ERR(vmballoon_mnt)) + return PTR_ERR(vmballoon_mnt); + + b->b_dev_info.migratepage = vmballoon_migratepage; + b->b_dev_info.inode = alloc_anon_inode(vmballoon_mnt->mnt_sb); + + if (IS_ERR(b->b_dev_info.inode)) + return PTR_ERR(b->b_dev_info.inode); + + b->b_dev_info.inode->i_mapping->a_ops = &balloon_aops; + return 0; +} + +#else /* CONFIG_BALLOON_COMPACTION */ + +static void vmballoon_compaction_deinit(struct vmballoon *b) +{ +} + +static int vmballoon_compaction_init(struct vmballoon *b) +{ + return 0; +} + +#endif /* CONFIG_BALLOON_COMPACTION */ + static int __init vmballoon_init(void) { - enum vmballoon_page_size_type page_size; int error; /* @@ -1564,17 +1942,26 @@ static int __init vmballoon_init(void) if (x86_hyper_type != X86_HYPER_VMWARE) return -ENODEV; - for (page_size = VMW_BALLOON_4K_PAGE; - page_size <= VMW_BALLOON_LAST_SIZE; page_size++) - INIT_LIST_HEAD(&balloon.page_sizes[page_size].pages); - - INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work); + error = vmballoon_register_shrinker(&balloon); + if (error) + goto fail; + error = vmballoon_debugfs_init(&balloon); if (error) - return error; + goto fail; + + /* + * Initialization of compaction must be done after the call to + * balloon_devinfo_init() . + */ + balloon_devinfo_init(&balloon.b_dev_info); + error = vmballoon_compaction_init(&balloon); + if (error) + goto fail; + INIT_LIST_HEAD(&balloon.huge_pages); spin_lock_init(&balloon.comm_lock); init_rwsem(&balloon.conf_sem); balloon.vmci_doorbell = VMCI_INVALID_HANDLE; @@ -1585,6 +1972,10 @@ static int __init vmballoon_init(void) queue_delayed_work(system_freezable_wq, &balloon.dwork, 0); return 0; +fail: + vmballoon_unregister_shrinker(&balloon); + vmballoon_compaction_deinit(&balloon); + return error; } /* @@ -1597,6 +1988,7 @@ late_initcall(vmballoon_init); static void __exit vmballoon_exit(void) { + vmballoon_unregister_shrinker(&balloon); vmballoon_vmci_cleanup(&balloon); cancel_delayed_work_sync(&balloon.dwork); @@ -1609,5 +2001,8 @@ static void __exit vmballoon_exit(void) */ vmballoon_send_start(&balloon, 0); vmballoon_pop(&balloon); + + /* Only once we popped the balloon, compaction can be deinit */ + vmballoon_compaction_deinit(&balloon); } module_exit(vmballoon_exit); diff --git a/drivers/w1/slaves/w1_ds2413.c b/drivers/w1/slaves/w1_ds2413.c index 492e3d010321..3364ad276b15 100644 --- a/drivers/w1/slaves/w1_ds2413.c +++ b/drivers/w1/slaves/w1_ds2413.c @@ -24,12 +24,17 @@ #define W1_F3A_FUNC_PIO_ACCESS_READ 0xF5 #define W1_F3A_FUNC_PIO_ACCESS_WRITE 0x5A #define W1_F3A_SUCCESS_CONFIRM_BYTE 0xAA +#define W1_F3A_INVALID_PIO_STATE 0xFF static ssize_t state_read(struct file *filp, struct kobject *kobj, struct bin_attribute *bin_attr, char *buf, loff_t off, size_t count) { struct w1_slave *sl = kobj_to_w1_slave(kobj); + unsigned int retries = W1_F3A_RETRIES; + ssize_t bytes_read = -EIO; + u8 state; + dev_dbg(&sl->dev, "Reading %s kobj: %p, off: %0#10x, count: %zu, buff addr: %p", bin_attr->attr.name, kobj, (unsigned int)off, count, buf); @@ -42,22 +47,37 @@ static ssize_t state_read(struct file *filp, struct kobject *kobj, mutex_lock(&sl->master->bus_mutex); dev_dbg(&sl->dev, "mutex locked"); - if (w1_reset_select_slave(sl)) { - mutex_unlock(&sl->master->bus_mutex); - return -EIO; - } +next: + if (w1_reset_select_slave(sl)) + goto out; + + while (retries--) { + w1_write_8(sl->master, W1_F3A_FUNC_PIO_ACCESS_READ); + + state = w1_read_8(sl->master); + if ((state & 0x0F) == ((~state >> 4) & 0x0F)) { + /* complement is correct */ + *buf = state; + bytes_read = 1; + goto out; + } else if (state == W1_F3A_INVALID_PIO_STATE) { + /* slave didn't respond, try to select it again */ + dev_warn(&sl->dev, "slave device did not respond to PIO_ACCESS_READ, " \ + "reselecting, retries left: %d\n", retries); + goto next; + } - w1_write_8(sl->master, W1_F3A_FUNC_PIO_ACCESS_READ); - *buf = w1_read_8(sl->master); + if (w1_reset_resume_command(sl->master)) + goto out; /* unrecoverable error */ - mutex_unlock(&sl->master->bus_mutex); - dev_dbg(&sl->dev, "mutex unlocked"); + dev_warn(&sl->dev, "PIO_ACCESS_READ error, retries left: %d\n", retries); + } - /* check for correct complement */ - if ((*buf & 0x0F) != ((~*buf >> 4) & 0x0F)) - return -EIO; - else - return 1; +out: + mutex_unlock(&sl->master->bus_mutex); + dev_dbg(&sl->dev, "%s, mutex unlocked, retries: %d\n", + (bytes_read > 0) ? "succeeded" : "error", retries); + return bytes_read; } static BIN_ATTR_RO(state, 1); @@ -69,6 +89,7 @@ static ssize_t output_write(struct file *filp, struct kobject *kobj, struct w1_slave *sl = kobj_to_w1_slave(kobj); u8 w1_buf[3]; unsigned int retries = W1_F3A_RETRIES; + ssize_t bytes_written = -EIO; if (count != 1 || off != 0) return -EFAULT; @@ -78,7 +99,7 @@ static ssize_t output_write(struct file *filp, struct kobject *kobj, dev_dbg(&sl->dev, "mutex locked"); if (w1_reset_select_slave(sl)) - goto error; + goto out; /* according to the DS2413 datasheet the most significant 6 bits should be set to "1"s, so do it now */ @@ -91,18 +112,20 @@ static ssize_t output_write(struct file *filp, struct kobject *kobj, w1_write_block(sl->master, w1_buf, 3); if (w1_read_8(sl->master) == W1_F3A_SUCCESS_CONFIRM_BYTE) { - mutex_unlock(&sl->master->bus_mutex); - dev_dbg(&sl->dev, "mutex unlocked, retries:%d", retries); - return 1; + bytes_written = 1; + goto out; } if (w1_reset_resume_command(sl->master)) - goto error; + goto out; /* unrecoverable error */ + + dev_warn(&sl->dev, "PIO_ACCESS_WRITE error, retries left: %d\n", retries); } -error: +out: mutex_unlock(&sl->master->bus_mutex); - dev_dbg(&sl->dev, "mutex unlocked in error, retries:%d", retries); - return -EIO; + dev_dbg(&sl->dev, "%s, mutex unlocked, retries: %d\n", + (bytes_written > 0) ? "succeeded" : "error", retries); + return bytes_written; } static BIN_ATTR(output, S_IRUGO | S_IWUSR | S_IWGRP, NULL, output_write, 1); diff --git a/drivers/w1/slaves/w1_ds2805.c b/drivers/w1/slaves/w1_ds2805.c index 29348d283a65..ab349604531a 100644 --- a/drivers/w1/slaves/w1_ds2805.c +++ b/drivers/w1/slaves/w1_ds2805.c @@ -288,7 +288,7 @@ static struct w1_family_ops w1_f0d_fops = { .remove_slave = w1_f0d_remove_slave, }; -static struct w1_family w1_family_2d = { +static struct w1_family w1_family_0d = { .fid = W1_EEPROM_DS2805, .fops = &w1_f0d_fops, }; @@ -296,13 +296,13 @@ static struct w1_family w1_family_2d = { static int __init w1_f0d_init(void) { pr_info("%s()\n", __func__); - return w1_register_family(&w1_family_2d); + return w1_register_family(&w1_family_0d); } static void __exit w1_f0d_fini(void) { pr_info("%s()\n", __func__); - w1_unregister_family(&w1_family_2d); + w1_unregister_family(&w1_family_0d); } module_init(w1_f0d_init); diff --git a/fs/char_dev.c b/fs/char_dev.c index d18cad28c1c3..00dfe17871ac 100644 --- a/fs/char_dev.c +++ b/fs/char_dev.c @@ -98,7 +98,7 @@ __register_chrdev_region(unsigned int major, unsigned int baseminor, int minorct, const char *name) { struct char_device_struct *cd, *curr, *prev = NULL; - int ret = -EBUSY; + int ret; int i; if (major >= CHRDEV_MAJOR_MAX) { @@ -129,6 +129,7 @@ __register_chrdev_region(unsigned int major, unsigned int baseminor, major = ret; } + ret = -EBUSY; i = major_to_index(major); for (curr = chrdevs[i]; curr; prev = curr, curr = curr->next) { if (curr->major < major) diff --git a/include/linux/balloon_compaction.h b/include/linux/balloon_compaction.h index f31521dcb09a..338aa27e4773 100644 --- a/include/linux/balloon_compaction.h +++ b/include/linux/balloon_compaction.h @@ -64,6 +64,10 @@ extern struct page *balloon_page_alloc(void); extern void balloon_page_enqueue(struct balloon_dev_info *b_dev_info, struct page *page); extern struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info); +extern size_t balloon_page_list_enqueue(struct balloon_dev_info *b_dev_info, + struct list_head *pages); +extern size_t balloon_page_list_dequeue(struct balloon_dev_info *b_dev_info, + struct list_head *pages, size_t n_req_pages); static inline void balloon_devinfo_init(struct balloon_dev_info *balloon) { diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h index 77ac9c7b9483..606504bf376a 100644 --- a/include/linux/vmw_vmci_defs.h +++ b/include/linux/vmw_vmci_defs.h @@ -430,8 +430,8 @@ enum { struct vmci_queue_header { /* All fields are 64bit and aligned. */ struct vmci_handle handle; /* Identifier. */ - atomic64_t producer_tail; /* Offset in this queue. */ - atomic64_t consumer_head; /* Offset in peer queue. */ + u64 producer_tail; /* Offset in this queue. */ + u64 consumer_head; /* Offset in peer queue. */ }; /* @@ -732,13 +732,9 @@ static inline void *vmci_event_data_payload(struct vmci_event_data *ev_data) * prefix will be used, so correctness isn't an issue, but using a * 64bit operation still adds unnecessary overhead. */ -static inline u64 vmci_q_read_pointer(atomic64_t *var) +static inline u64 vmci_q_read_pointer(u64 *var) { -#if defined(CONFIG_X86_32) - return atomic_read((atomic_t *)var); -#else - return atomic64_read(var); -#endif + return READ_ONCE(*(unsigned long *)var); } /* @@ -747,23 +743,17 @@ static inline u64 vmci_q_read_pointer(atomic64_t *var) * never exceeds a 32bit value in this case. On 32bit SMP, using a * locked cmpxchg8b adds unnecessary overhead. */ -static inline void vmci_q_set_pointer(atomic64_t *var, - u64 new_val) +static inline void vmci_q_set_pointer(u64 *var, u64 new_val) { -#if defined(CONFIG_X86_32) - return atomic_set((atomic_t *)var, (u32)new_val); -#else - return atomic64_set(var, new_val); -#endif + /* XXX buggered on big-endian */ + WRITE_ONCE(*(unsigned long *)var, (unsigned long)new_val); } /* * Helper to add a given offset to a head or tail pointer. Wraps the * value of the pointer around the max size of the queue. */ -static inline void vmci_qp_add_pointer(atomic64_t *var, - size_t add, - u64 size) +static inline void vmci_qp_add_pointer(u64 *var, size_t add, u64 size) { u64 new_val = vmci_q_read_pointer(var); @@ -840,8 +830,8 @@ static inline void vmci_q_header_init(struct vmci_queue_header *q_header, const struct vmci_handle handle) { q_header->handle = handle; - atomic64_set(&q_header->producer_tail, 0); - atomic64_set(&q_header->consumer_head, 0); + q_header->producer_tail = 0; + q_header->consumer_head = 0; } /* diff --git a/mm/balloon_compaction.c b/mm/balloon_compaction.c index ba739b76e6c5..83a7b614061f 100644 --- a/mm/balloon_compaction.c +++ b/mm/balloon_compaction.c @@ -11,6 +11,105 @@ #include <linux/export.h> #include <linux/balloon_compaction.h> +static void balloon_page_enqueue_one(struct balloon_dev_info *b_dev_info, + struct page *page) +{ + /* + * Block others from accessing the 'page' when we get around to + * establishing additional references. We should be the only one + * holding a reference to the 'page' at this point. If we are not, then + * memory corruption is possible and we should stop execution. + */ + BUG_ON(!trylock_page(page)); + list_del(&page->lru); + balloon_page_insert(b_dev_info, page); + unlock_page(page); + __count_vm_event(BALLOON_INFLATE); +} + +/** + * balloon_page_list_enqueue() - inserts a list of pages into the balloon page + * list. + * @b_dev_info: balloon device descriptor where we will insert a new page to + * @pages: pages to enqueue - allocated using balloon_page_alloc. + * + * Driver must call it to properly enqueue a balloon pages before definitively + * removing it from the guest system. + * + * Return: number of pages that were enqueued. + */ +size_t balloon_page_list_enqueue(struct balloon_dev_info *b_dev_info, + struct list_head *pages) +{ + struct page *page, *tmp; + unsigned long flags; + size_t n_pages = 0; + + spin_lock_irqsave(&b_dev_info->pages_lock, flags); + list_for_each_entry_safe(page, tmp, pages, lru) { + balloon_page_enqueue_one(b_dev_info, page); + n_pages++; + } + spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); + return n_pages; +} +EXPORT_SYMBOL_GPL(balloon_page_list_enqueue); + +/** + * balloon_page_list_dequeue() - removes pages from balloon's page list and + * returns a list of the pages. + * @b_dev_info: balloon device decriptor where we will grab a page from. + * @pages: pointer to the list of pages that would be returned to the caller. + * @n_req_pages: number of requested pages. + * + * Driver must call this function to properly de-allocate a previous enlisted + * balloon pages before definetively releasing it back to the guest system. + * This function tries to remove @n_req_pages from the ballooned pages and + * return them to the caller in the @pages list. + * + * Note that this function may fail to dequeue some pages temporarily empty due + * to compaction isolated pages. + * + * Return: number of pages that were added to the @pages list. + */ +size_t balloon_page_list_dequeue(struct balloon_dev_info *b_dev_info, + struct list_head *pages, size_t n_req_pages) +{ + struct page *page, *tmp; + unsigned long flags; + size_t n_pages = 0; + + spin_lock_irqsave(&b_dev_info->pages_lock, flags); + list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) { + if (n_pages == n_req_pages) + break; + + /* + * Block others from accessing the 'page' while we get around to + * establishing additional references and preparing the 'page' + * to be released by the balloon driver. + */ + if (!trylock_page(page)) + continue; + + if (IS_ENABLED(CONFIG_BALLOON_COMPACTION) && + PageIsolated(page)) { + /* raced with isolation */ + unlock_page(page); + continue; + } + balloon_page_delete(page); + __count_vm_event(BALLOON_DEFLATE); + list_add(&page->lru, pages); + unlock_page(page); + n_pages++; + } + spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); + + return n_pages; +} +EXPORT_SYMBOL_GPL(balloon_page_list_dequeue); + /* * balloon_page_alloc - allocates a new page for insertion into the balloon * page list. @@ -44,17 +143,9 @@ void balloon_page_enqueue(struct balloon_dev_info *b_dev_info, { unsigned long flags; - /* - * Block others from accessing the 'page' when we get around to - * establishing additional references. We should be the only one - * holding a reference to the 'page' at this point. - */ - BUG_ON(!trylock_page(page)); spin_lock_irqsave(&b_dev_info->pages_lock, flags); - balloon_page_insert(b_dev_info, page); - __count_vm_event(BALLOON_INFLATE); + balloon_page_enqueue_one(b_dev_info, page); spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); - unlock_page(page); } EXPORT_SYMBOL_GPL(balloon_page_enqueue); @@ -71,36 +162,13 @@ EXPORT_SYMBOL_GPL(balloon_page_enqueue); */ struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info) { - struct page *page, *tmp; unsigned long flags; - bool dequeued_page; + LIST_HEAD(pages); + int n_pages; - dequeued_page = false; - spin_lock_irqsave(&b_dev_info->pages_lock, flags); - list_for_each_entry_safe(page, tmp, &b_dev_info->pages, lru) { - /* - * Block others from accessing the 'page' while we get around - * establishing additional references and preparing the 'page' - * to be released by the balloon driver. - */ - if (trylock_page(page)) { -#ifdef CONFIG_BALLOON_COMPACTION - if (PageIsolated(page)) { - /* raced with isolation */ - unlock_page(page); - continue; - } -#endif - balloon_page_delete(page); - __count_vm_event(BALLOON_DEFLATE); - unlock_page(page); - dequeued_page = true; - break; - } - } - spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); + n_pages = balloon_page_list_dequeue(b_dev_info, &pages, 1); - if (!dequeued_page) { + if (n_pages != 1) { /* * If we are unable to dequeue a balloon page because the page * list is empty and there is no isolated pages, then something @@ -113,9 +181,9 @@ struct page *balloon_page_dequeue(struct balloon_dev_info *b_dev_info) !b_dev_info->isolated_pages)) BUG(); spin_unlock_irqrestore(&b_dev_info->pages_lock, flags); - page = NULL; + return NULL; } - return page; + return list_first_entry(&pages, struct page, lru); } EXPORT_SYMBOL_GPL(balloon_page_dequeue); |