From e7096c131e5161fa3b8e52a650d7719d2857adfd Mon Sep 17 00:00:00 2001
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
Date: Mon, 9 Dec 2019 00:27:34 +0100
Subject: net: WireGuard secure network tunnel

WireGuard is a layer 3 secure networking tunnel made specifically for
the kernel, that aims to be much simpler and easier to audit than IPsec.
Extensive documentation and description of the protocol and
considerations, along with formal proofs of the cryptography, are
available at:

  * https://www.wireguard.com/
  * https://www.wireguard.com/papers/wireguard.pdf

This commit implements WireGuard as a simple network device driver,
accessible in the usual RTNL way used by virtual network drivers. It
makes use of the udp_tunnel APIs, GRO, GSO, NAPI, and the usual set of
networking subsystem APIs. It has a somewhat novel multicore queueing
system designed for maximum throughput and minimal latency of encryption
operations, but it is implemented modestly using workqueues and NAPI.
Configuration is done via generic Netlink, and following a review from
the Netlink maintainer a year ago, several high profile userspace tools
have already implemented the API.

This commit also comes with several different tests, both in-kernel
tests and out-of-kernel tests based on network namespaces, taking
advantage of the fact that sockets used by WireGuard intentionally stay
in the namespace in which the WireGuard interface was originally
created, exactly like the semantics of userspace tun devices. See
wireguard.com/netns/ for pictures and examples.

The source code is fairly short, but rather than combining everything
into a single file, WireGuard is developed as cleanly separable files,
making auditing and comprehension easier. Things are laid out as
follows:

  * noise.[ch], cookie.[ch], messages.h: These implement the bulk of the
    cryptographic aspects of the protocol, and are mostly data-only in
    nature, taking in buffers of bytes and spitting out buffers of
    bytes. They also handle reference counting for their various shared
    pieces of data, like keys and key lists.

  * ratelimiter.[ch]: Used as an integral part of cookie.[ch] for
    ratelimiting certain types of cryptographic operations in accordance
    with particular WireGuard semantics.

  * allowedips.[ch], peerlookup.[ch]: The main lookup structures of
    WireGuard, the former being trie-like with particular semantics, an
    integral part of the design of the protocol, and the latter just
    being nice helper functions around the various hashtables we use.

  * device.[ch]: Implementation of functions for the netdevice and for
    rtnl, responsible for maintaining the life of a given interface and
    wiring it up to the rest of WireGuard.

  * peer.[ch]: Each interface has a list of peers, with helper functions
    available here for creation, destruction, and reference counting.

  * socket.[ch]: Implementation of functions related to udp_socket and
    the general set of kernel socket APIs, for sending and receiving
    ciphertext UDP packets, and taking care of WireGuard-specific sticky
    socket routing semantics for the automatic roaming.

  * netlink.[ch]: Userspace API entry point for configuring WireGuard
    peers and devices. The API has been implemented by several userspace
    tools and network management utilities, and the WireGuard project
    distributes the basic wg(8) tool.

  * queueing.[ch]: Shared functions on the rx and tx paths for handling
    the various queues used in the multicore algorithms.

  * send.c: Handles encrypting outgoing packets in parallel on
    multiple cores, before sending them in order on a single core, via
    workqueues and ring buffers. Also handles sending handshake and cookie
    messages as part of the protocol, in parallel.

  * receive.c: Handles decrypting incoming packets in parallel on
    multiple cores, before passing them off in order to be ingested via
    the rest of the networking subsystem with GRO via the typical NAPI
    poll function. Also handles receiving handshake and cookie messages
    as part of the protocol, in parallel.

  * timers.[ch]: Uses the timer wheel to implement protocol-specific
    event timeouts, and gives a set of very simple event-driven entry
    point functions for callers.

  * main.c, version.h: Initialization and deinitialization of the module.

  * selftest/*.h: Runtime unit tests for some of the most security
    sensitive functions.

  * tools/testing/selftests/wireguard/netns.sh: Aforementioned testing
    script using network namespaces.
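
As an illustration of the lookup that allowedips.[ch] serves, here is a
minimal userspace sketch of longest-prefix matching over a peer's allowed
IPs. It is not the driver's code: it swaps the trie for a plain linear
scan, handles IPv4 only, and the names allowed_ip, prefix_mask and
lookup_peer are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for this sketch; the driver itself uses a trie. */
struct allowed_ip {
	uint32_t addr;   /* IPv4 network address, host byte order */
	uint8_t cidr;    /* prefix length, assumed to be in 0..32 */
	int peer_id;     /* stand-in for a reference to a peer */
};

/* Build the netmask for a prefix length; /0 matches everything. */
static uint32_t prefix_mask(uint8_t cidr)
{
	return cidr == 0 ? 0 : ~(uint32_t)0 << (32 - cidr);
}

/* Return the peer owning the most specific prefix covering ip, or -1. */
static int lookup_peer(const struct allowed_ip *table, size_t n, uint32_t ip)
{
	int best_peer = -1;
	int best_cidr = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		uint32_t mask = prefix_mask(table[i].cidr);

		/* A longer matching prefix always wins over a shorter one. */
		if ((ip & mask) == (table[i].addr & mask) &&
		    table[i].cidr > best_cidr) {
			best_cidr = table[i].cidr;
			best_peer = table[i].peer_id;
		}
	}
	return best_peer;
}
```

With 10.0.0.0/8 assigned to one peer and 10.1.0.0/16 to another, an
address such as 10.1.2.3 resolves to the second peer, since its prefix
is more specific.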

This commit aims to be as self-contained as possible, implementing
WireGuard as a standalone module not needing much special handling or
coordination from the network subsystem. I expect future optimizations
to the network stack to improve WireGuard, and vice versa, but for the
time being, this is intentionally standalone.

We introduce a menu option for CONFIG_WIREGUARD, as well as providing a
verbose debug log and self-tests via CONFIG_WIREGUARD_DEBUG.
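
The documentation in include/uapi/linux/wireguard.h notes that a
WG_CMD_GET_DEVICE dump may split one peer across consecutive messages,
leaving the receiver to coalesce adjacent entries for the same peer. A
rough sketch of that merge step follows. It assumes the netlink
attributes have already been parsed into simple structs; peer_fragment,
peer, coalesce_peers, MAX_IPS and the integer stand-ins for allowed IPs
are all hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WG_KEY_LEN 32
#define MAX_IPS 16 /* arbitrary cap chosen for the sketch */

/* Hypothetical parsed view of one peer entry from one dump message. */
struct peer_fragment {
	unsigned char public_key[WG_KEY_LEN];
	size_t n_allowedips;         /* assumed to be at most 4 here */
	unsigned int allowedips[4];  /* stand-ins for WGALLOWEDIP_A_* sets */
};

/* Hypothetical merged peer as the receiver wants to see it. */
struct peer {
	unsigned char public_key[WG_KEY_LEN];
	size_t n_allowedips;
	unsigned int allowedips[MAX_IPS];
};

/*
 * Merge adjacent fragments that carry the same public key.
 * Returns the number of peers written, or -1 if out or a peer overflows.
 */
static int coalesce_peers(const struct peer_fragment *frags, size_t n_frags,
			  struct peer *out, size_t max_peers)
{
	size_t n_peers = 0, i, j;

	for (i = 0; i < n_frags; i++) {
		struct peer *p;

		if (n_peers > 0 &&
		    !memcmp(out[n_peers - 1].public_key, frags[i].public_key,
			    WG_KEY_LEN)) {
			/* Continuation of the previous peer. */
			p = &out[n_peers - 1];
		} else {
			if (n_peers == max_peers)
				return -1;
			p = &out[n_peers++];
			memcpy(p->public_key, frags[i].public_key, WG_KEY_LEN);
			p->n_allowedips = 0;
		}

		for (j = 0; j < frags[i].n_allowedips; j++) {
			if (p->n_allowedips == MAX_IPS)
				return -1;
			p->allowedips[p->n_allowedips++] = frags[i].allowedips[j];
		}
	}
	return (int)n_peers;
}
```

Only adjacent fragments are merged, which matches the documented
behaviour that a split peer is repeated in the following message.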

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Cc: David Miller <davem@davemloft.net>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: linux-crypto@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/wireguard.h | 196 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 196 insertions(+)
 create mode 100644 include/uapi/linux/wireguard.h

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h
new file mode 100644
index 000000000000..dd8a47c4ad11
--- /dev/null
+++ b/include/uapi/linux/wireguard.h
@@ -0,0 +1,196 @@
+/* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) OR MIT */
+/*
+ * Copyright (C) 2015-2019 Jason A. Donenfeld <Jason@zx2c4.com>. All Rights Reserved.
+ *
+ * Documentation
+ * =============
+ *
+ * The below enums and macros are for interfacing with WireGuard, using generic
+ * netlink, with family WG_GENL_NAME and version WG_GENL_VERSION. It defines two
+ * methods: get and set. Note that while they share many common attributes,
+ * these two functions actually accept a slightly different set of inputs and
+ * outputs.
+ *
+ * WG_CMD_GET_DEVICE
+ * -----------------
+ *
+ * May only be called via NLM_F_REQUEST | NLM_F_DUMP. The command should contain
+ * one but not both of:
+ *
+ *    WGDEVICE_A_IFINDEX: NLA_U32
+ *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1
+ *
+ * The kernel will then return several messages (NLM_F_MULTI) containing the
+ * following tree of nested items:
+ *
+ *    WGDEVICE_A_IFINDEX: NLA_U32
+ *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1
+ *    WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
+ *    WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
+ *    WGDEVICE_A_LISTEN_PORT: NLA_U16
+ *    WGDEVICE_A_FWMARK: NLA_U32
+ *    WGDEVICE_A_PEERS: NLA_NESTED
+ *        0: NLA_NESTED
+ *            WGPEER_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
+ *            WGPEER_A_PRESHARED_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
+ *            WGPEER_A_ENDPOINT: NLA_MIN_LEN(struct sockaddr), struct sockaddr_in or struct sockaddr_in6
+ *            WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16
+ *            WGPEER_A_LAST_HANDSHAKE_TIME: NLA_EXACT_LEN, struct __kernel_timespec
+ *            WGPEER_A_RX_BYTES: NLA_U64
+ *            WGPEER_A_TX_BYTES: NLA_U64
+ *            WGPEER_A_ALLOWEDIPS: NLA_NESTED
+ *                0: NLA_NESTED
+ *                    WGALLOWEDIP_A_FAMILY: NLA_U16
+ *                    WGALLOWEDIP_A_IPADDR: NLA_MIN_LEN(struct in_addr), struct in_addr or struct in6_addr
+ *                    WGALLOWEDIP_A_CIDR_MASK: NLA_U8
+ *                0: NLA_NESTED
+ *                    ...
+ *                0: NLA_NESTED
+ *                    ...
+ *                ...
+ *            WGPEER_A_PROTOCOL_VERSION: NLA_U32
+ *        0: NLA_NESTED
+ *            ...
+ *        ...
+ *
+ * It is possible that all of the allowed IPs of a single peer will not
+ * fit within a single netlink message. In that case, the same peer will
+ * be written in the following message, except it will only contain
+ * WGPEER_A_PUBLIC_KEY and WGPEER_A_ALLOWEDIPS. This may occur several
+ * times in a row for the same peer. It is then up to the receiver to
+ * coalesce adjacent peers. Likewise, it is possible that all peers will
+ * not fit within a single message. So, subsequent peers will be sent
+ * in following messages, except those will only contain WGDEVICE_A_IFNAME
+ * and WGDEVICE_A_PEERS. It is then up to the receiver to coalesce these
+ * messages to form the complete list of peers.
+ *
+ * Since this is an NLM_F_DUMP command, the final message will always be
+ * NLMSG_DONE, even if an error occurs. However, this NLMSG_DONE message
+ * contains an integer error code. It is either zero or a negative error
+ * code corresponding to the errno.
+ *
+ * WG_CMD_SET_DEVICE
+ * -----------------
+ *
+ * May only be called via NLM_F_REQUEST. The command should contain the
+ * following tree of nested items, containing one but not both of
+ * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME:
+ *
+ *    WGDEVICE_A_IFINDEX: NLA_U32
+ *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1
+ *    WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current
+ *                      peers should be removed prior to adding the list below.
+ *    WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove
+ *    WGDEVICE_A_LISTEN_PORT: NLA_U16, 0 to choose randomly
+ *    WGDEVICE_A_FWMARK: NLA_U32, 0 to disable
+ *    WGDEVICE_A_PEERS: NLA_NESTED
+ *        0: NLA_NESTED
+ *            WGPEER_A_PUBLIC_KEY: len WG_KEY_LEN
+ *            WGPEER_A_FLAGS: NLA_U32, 0 and/or WGPEER_F_REMOVE_ME if the
+ *                            specified peer should not exist at the end of the
+ *                            operation, rather than added/updated and/or
+ *                            WGPEER_F_REPLACE_ALLOWEDIPS if all current allowed
+ *                            IPs of this peer should be removed prior to adding
+ *                            the list below and/or WGPEER_F_UPDATE_ONLY if the
+ *                            peer should only be set if it already exists.
+ *            WGPEER_A_PRESHARED_KEY: len WG_KEY_LEN, all zeros to remove
+ *            WGPEER_A_ENDPOINT: struct sockaddr_in or struct sockaddr_in6
+ *            WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL: NLA_U16, 0 to disable
+ *            WGPEER_A_ALLOWEDIPS: NLA_NESTED
+ *                0: NLA_NESTED
+ *                    WGALLOWEDIP_A_FAMILY: NLA_U16
+ *                    WGALLOWEDIP_A_IPADDR: struct in_addr or struct in6_addr
+ *                    WGALLOWEDIP_A_CIDR_MASK: NLA_U8
+ *                0: NLA_NESTED
+ *                    ...
+ *                0: NLA_NESTED
+ *                    ...
+ *                ...
+ *            WGPEER_A_PROTOCOL_VERSION: NLA_U32, should not be set or used at
+ *                                       all by most users of this API, as the
+ *                                       most recent protocol will be used when
+ *                                       this is unset. Otherwise, must be set
+ *                                       to 1.
+ *        0: NLA_NESTED
+ *            ...
+ *        ...
+ *
+ * It is possible that the amount of configuration data exceeds that of
+ * the maximum message length accepted by the kernel. In that case, several
+ * messages should be sent one after another, with each successive one
+ * filling in information not contained in the prior. Note that if
+ * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably
+ * should not be specified in fragments that come after, so that the list
+ * of peers is only cleared the first time but appended after. Likewise for
+ * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message
+ * of a peer, it likely should not be specified in subsequent fragments.
+ *
+ * If an error occurs, the NLMSG_ERROR reply will contain an errno.
+ */
+
+#ifndef _WG_UAPI_WIREGUARD_H
+#define _WG_UAPI_WIREGUARD_H
+
+#define WG_GENL_NAME "wireguard"
+#define WG_GENL_VERSION 1
+
+#define WG_KEY_LEN 32
+
+enum wg_cmd {
+	WG_CMD_GET_DEVICE,
+	WG_CMD_SET_DEVICE,
+	__WG_CMD_MAX
+};
+#define WG_CMD_MAX (__WG_CMD_MAX - 1)
+
+enum wgdevice_flag {
+	WGDEVICE_F_REPLACE_PEERS = 1U << 0,
+	__WGDEVICE_F_ALL = WGDEVICE_F_REPLACE_PEERS
+};
+enum wgdevice_attribute {
+	WGDEVICE_A_UNSPEC,
+	WGDEVICE_A_IFINDEX,
+	WGDEVICE_A_IFNAME,
+	WGDEVICE_A_PRIVATE_KEY,
+	WGDEVICE_A_PUBLIC_KEY,
+	WGDEVICE_A_FLAGS,
+	WGDEVICE_A_LISTEN_PORT,
+	WGDEVICE_A_FWMARK,
+	WGDEVICE_A_PEERS,
+	__WGDEVICE_A_LAST
+};
+#define WGDEVICE_A_MAX (__WGDEVICE_A_LAST - 1)
+
+enum wgpeer_flag {
+	WGPEER_F_REMOVE_ME = 1U << 0,
+	WGPEER_F_REPLACE_ALLOWEDIPS = 1U << 1,
+	WGPEER_F_UPDATE_ONLY = 1U << 2,
+	__WGPEER_F_ALL = WGPEER_F_REMOVE_ME | WGPEER_F_REPLACE_ALLOWEDIPS |
+			 WGPEER_F_UPDATE_ONLY
+};
+enum wgpeer_attribute {
+	WGPEER_A_UNSPEC,
+	WGPEER_A_PUBLIC_KEY,
+	WGPEER_A_PRESHARED_KEY,
+	WGPEER_A_FLAGS,
+	WGPEER_A_ENDPOINT,
+	WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
+	WGPEER_A_LAST_HANDSHAKE_TIME,
+	WGPEER_A_RX_BYTES,
+	WGPEER_A_TX_BYTES,
+	WGPEER_A_ALLOWEDIPS,
+	WGPEER_A_PROTOCOL_VERSION,
+	__WGPEER_A_LAST
+};
+#define WGPEER_A_MAX (__WGPEER_A_LAST - 1)
+
+enum wgallowedip_attribute {
+	WGALLOWEDIP_A_UNSPEC,
+	WGALLOWEDIP_A_FAMILY,
+	WGALLOWEDIP_A_IPADDR,
+	WGALLOWEDIP_A_CIDR_MASK,
+	__WGALLOWEDIP_A_LAST
+};
+#define WGALLOWEDIP_A_MAX (__WGALLOWEDIP_A_LAST - 1)
+
+#endif /* _WG_UAPI_WIREGUARD_H */
-- 
cgit v1.2.3


From e27cca96cd68fa2c6814c90f9a1cfd36bb68c593 Mon Sep 17 00:00:00 2001
From: Sabrina Dubroca <sd@queasysnail.net>
Date: Mon, 25 Nov 2019 14:49:02 +0100
Subject: xfrm: add espintcp (RFC 8229)

TCP encapsulation of IKE and IPsec messages (RFC 8229) is implemented
as a TCP ULP, overriding in particular the sendmsg and recvmsg
operations. A Stream Parser is used to extract messages out of the TCP
stream using the first 2 bytes as a length marker. Received IKE messages
are put on "ike_queue", waiting to be dequeued by the custom recvmsg
implementation. Received ESP messages are sent to XFRM, like with UDP
encapsulation.
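
The framing described above can be sketched in userspace C. This is
only an illustration, not the kernel code: frame_kind, frame,
espintcp_next_frame, espintcp_put_prefix and SKETCH_MAX_PAYLOAD are
hypothetical names, and the buffer handling is simplified to a flat
byte array. The checks mirror what the patch does in espintcp_parse()
(a length below 6 is rejected) and espintcp_rcv() (a zero 4-byte marker
means a non-ESP, i.e. IKE, message).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Largest payload: the 16-bit length also counts the 2-byte prefix. */
#define SKETCH_MAX_PAYLOAD (((1 << 16) - 1) - 2)

enum frame_kind { FRAME_INCOMPLETE, FRAME_INVALID, FRAME_IKE, FRAME_ESP };

struct frame {
	enum frame_kind kind;
	const uint8_t *payload; /* starts at the non-ESP marker or SPI */
	size_t payload_len;     /* message length minus the prefix */
	size_t consumed;        /* bytes to drop from the stream */
};

/* Try to extract one message from the start of a TCP byte stream. */
static struct frame espintcp_next_frame(const uint8_t *buf, size_t avail)
{
	struct frame f = { FRAME_INCOMPLETE, NULL, 0, 0 };
	uint32_t marker;
	size_t len;

	if (avail < 2)
		return f; /* length prefix not fully received yet */

	/* Big-endian length that includes the prefix itself. */
	len = ((size_t)buf[0] << 8) | buf[1];
	if (len < 6) {
		/* Needs the prefix plus a 4-byte marker or SPI. */
		f.kind = FRAME_INVALID;
		return f;
	}
	if (avail < len)
		return f; /* wait for the rest of the message */

	memcpy(&marker, buf + 2, sizeof(marker));
	f.kind = marker == 0 ? FRAME_IKE : FRAME_ESP;
	f.payload = buf + 2;
	f.payload_len = len - 2;
	f.consumed = len;
	return f;
}

/* Write the 2-byte prefix for a payload; returns -1 if it is too large. */
static int espintcp_put_prefix(uint8_t out[2], size_t payload_len)
{
	size_t total = payload_len + 2;

	if (payload_len > SKETCH_MAX_PAYLOAD)
		return -1;
	out[0] = (uint8_t)(total >> 8);
	out[1] = (uint8_t)(total & 0xff);
	return 0;
}
```

A caller would loop over the receive buffer, advancing by consumed
after each complete frame and stopping on FRAME_INCOMPLETE until more
data arrives.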

Some of this code is taken from the original submission by Herbert
Xu. Currently, only IPv4 is supported, like for UDP encapsulation.

Co-developed-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/espintcp.h   |  39 ++++
 include/net/xfrm.h       |   1 +
 include/uapi/linux/udp.h |   1 +
 net/ipv4/Kconfig         |  11 +
 net/ipv4/esp4.c          | 191 +++++++++++++++++-
 net/xfrm/Makefile        |   1 +
 net/xfrm/espintcp.c      | 509 +++++++++++++++++++++++++++++++++++++++++++++++
 net/xfrm/xfrm_policy.c   |   7 +
 net/xfrm/xfrm_state.c    |   3 +
 9 files changed, 760 insertions(+), 3 deletions(-)
 create mode 100644 include/net/espintcp.h
 create mode 100644 net/xfrm/espintcp.c

diff --git a/include/net/espintcp.h b/include/net/espintcp.h
new file mode 100644
index 000000000000..dd7026a00066
--- /dev/null
+++ b/include/net/espintcp.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _NET_ESPINTCP_H
+#define _NET_ESPINTCP_H
+
+#include <net/strparser.h>
+#include <linux/skmsg.h>
+
+void __init espintcp_init(void);
+
+int espintcp_push_skb(struct sock *sk, struct sk_buff *skb);
+int espintcp_queue_out(struct sock *sk, struct sk_buff *skb);
+bool tcp_is_ulp_esp(struct sock *sk);
+
+struct espintcp_msg {
+	struct sk_buff *skb;
+	struct sk_msg skmsg;
+	int offset;
+	int len;
+};
+
+struct espintcp_ctx {
+	struct strparser strp;
+	struct sk_buff_head ike_queue;
+	struct sk_buff_head out_queue;
+	struct espintcp_msg partial;
+	void (*saved_data_ready)(struct sock *sk);
+	void (*saved_write_space)(struct sock *sk);
+	struct work_struct work;
+	bool tx_running;
+};
+
+static inline struct espintcp_ctx *espintcp_getctx(const struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+
+	/* RCU is only needed for diag */
+	return (__force void *)icsk->icsk_ulp_data;
+}
+#endif
diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 56ff86621bb4..8f71c111e65a 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -193,6 +193,7 @@ struct xfrm_state {
 
 	/* Data for encapsulator */
 	struct xfrm_encap_tmpl	*encap;
+	struct sock __rcu	*encap_sk;
 
 	/* Data for care-of address */
 	xfrm_address_t	*coaddr;
diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
index 30baccb6c9c4..4828794efcf8 100644
--- a/include/uapi/linux/udp.h
+++ b/include/uapi/linux/udp.h
@@ -42,5 +42,6 @@ struct udphdr {
 #define UDP_ENCAP_GTP0		4 /* GSM TS 09.60 */
 #define UDP_ENCAP_GTP1U		5 /* 3GPP TS 29.060 */
 #define UDP_ENCAP_RXRPC		6
+#define TCP_ENCAP_ESPINTCP	7 /* Yikes, this is really xfrm encap types. */
 
 #endif /* _UAPI_LINUX_UDP_H */
diff --git a/net/ipv4/Kconfig b/net/ipv4/Kconfig
index fc816b187170..f96bd489b362 100644
--- a/net/ipv4/Kconfig
+++ b/net/ipv4/Kconfig
@@ -378,6 +378,17 @@ config INET_ESP_OFFLOAD
 
 	  If unsure, say N.
 
+config INET_ESPINTCP
+	bool "IP: ESP in TCP encapsulation (RFC 8229)"
+	depends on XFRM && INET_ESP
+	select STREAM_PARSER
+	select NET_SOCK_MSG
+	help
+	  Support for RFC 8229 encapsulation of ESP and IKE over
+	  TCP/IPv4 sockets.
+
+	  If unsure, say N.
+
 config INET_IPCOMP
 	tristate "IP: IPComp transformation"
 	select INET_XFRM_TUNNEL
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 033c61d27148..103c7d599a3c 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -18,6 +18,8 @@
 #include <net/icmp.h>
 #include <net/protocol.h>
 #include <net/udp.h>
+#include <net/tcp.h>
+#include <net/espintcp.h>
 
 #include <linux/highmem.h>
 
@@ -117,6 +119,132 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
 			put_page(sg_page(sg));
 }
 
+#ifdef CONFIG_INET_ESPINTCP
+struct esp_tcp_sk {
+	struct sock *sk;
+	struct rcu_head rcu;
+};
+
+static void esp_free_tcp_sk(struct rcu_head *head)
+{
+	struct esp_tcp_sk *esk = container_of(head, struct esp_tcp_sk, rcu);
+
+	sock_put(esk->sk);
+	kfree(esk);
+}
+
+static struct sock *esp_find_tcp_sk(struct xfrm_state *x)
+{
+	struct xfrm_encap_tmpl *encap = x->encap;
+	struct esp_tcp_sk *esk;
+	__be16 sport, dport;
+	struct sock *nsk;
+	struct sock *sk;
+
+	sk = rcu_dereference(x->encap_sk);
+	if (sk && sk->sk_state == TCP_ESTABLISHED)
+		return sk;
+
+	spin_lock_bh(&x->lock);
+	sport = encap->encap_sport;
+	dport = encap->encap_dport;
+	nsk = rcu_dereference_protected(x->encap_sk,
+					lockdep_is_held(&x->lock));
+	if (sk && sk == nsk) {
+		esk = kmalloc(sizeof(*esk), GFP_ATOMIC);
+		if (!esk) {
+			spin_unlock_bh(&x->lock);
+			return ERR_PTR(-ENOMEM);
+		}
+		RCU_INIT_POINTER(x->encap_sk, NULL);
+		esk->sk = sk;
+		call_rcu(&esk->rcu, esp_free_tcp_sk);
+	}
+	spin_unlock_bh(&x->lock);
+
+	sk = inet_lookup_established(xs_net(x), &tcp_hashinfo, x->id.daddr.a4,
+				     dport, x->props.saddr.a4, sport, 0);
+	if (!sk)
+		return ERR_PTR(-ENOENT);
+
+	if (!tcp_is_ulp_esp(sk)) {
+		sock_put(sk);
+		return ERR_PTR(-EINVAL);
+	}
+
+	spin_lock_bh(&x->lock);
+	nsk = rcu_dereference_protected(x->encap_sk,
+					lockdep_is_held(&x->lock));
+	if (encap->encap_sport != sport ||
+	    encap->encap_dport != dport) {
+		sock_put(sk);
+		sk = nsk ?: ERR_PTR(-EREMCHG);
+	} else if (sk == nsk) {
+		sock_put(sk);
+	} else {
+		rcu_assign_pointer(x->encap_sk, sk);
+	}
+	spin_unlock_bh(&x->lock);
+
+	return sk;
+}
+
+static int esp_output_tcp_finish(struct xfrm_state *x, struct sk_buff *skb)
+{
+	struct sock *sk;
+	int err;
+
+	rcu_read_lock();
+
+	sk = esp_find_tcp_sk(x);
+	err = PTR_ERR_OR_ZERO(sk);
+	if (err)
+		goto out;
+
+	bh_lock_sock(sk);
+	if (sock_owned_by_user(sk))
+		err = espintcp_queue_out(sk, skb);
+	else
+		err = espintcp_push_skb(sk, skb);
+	bh_unlock_sock(sk);
+
+out:
+	rcu_read_unlock();
+	return err;
+}
+
+static int esp_output_tcp_encap_cb(struct net *net, struct sock *sk,
+				   struct sk_buff *skb)
+{
+	struct dst_entry *dst = skb_dst(skb);
+	struct xfrm_state *x = dst->xfrm;
+
+	return esp_output_tcp_finish(x, skb);
+}
+
+static int esp_output_tail_tcp(struct xfrm_state *x, struct sk_buff *skb)
+{
+	int err;
+
+	local_bh_disable();
+	err = xfrm_trans_queue_net(xs_net(x), skb, esp_output_tcp_encap_cb);
+	local_bh_enable();
+
+	/* EINPROGRESS just happens to do the right thing.  It
+	 * actually means that the skb has been consumed and
+	 * isn't coming back.
+	 */
+	return err ?: -EINPROGRESS;
+}
+#else
+static int esp_output_tail_tcp(struct xfrm_state *x, struct sk_buff *skb)
+{
+	kfree_skb(skb);
+
+	return -EOPNOTSUPP;
+}
+#endif
+
 static void esp_output_done(struct crypto_async_request *base, int err)
 {
 	struct sk_buff *skb = base->data;
@@ -147,7 +275,11 @@ static void esp_output_done(struct crypto_async_request *base, int err)
 		secpath_reset(skb);
 		xfrm_dev_resume(skb);
 	} else {
-		xfrm_output_resume(skb, err);
+		if (!err &&
+		    x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP)
+			esp_output_tail_tcp(x, skb);
+		else
+			xfrm_output_resume(skb, err);
 	}
 }
 
@@ -236,7 +368,7 @@ static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
 	unsigned int len;
 
 	len = skb->len + esp->tailen - skb_transport_offset(skb);
-	if (len + sizeof(struct iphdr) >= IP_MAX_MTU)
+	if (len + sizeof(struct iphdr) > IP_MAX_MTU)
 		return ERR_PTR(-EMSGSIZE);
 
 	uh = (struct udphdr *)esp->esph;
@@ -256,6 +388,41 @@ static struct ip_esp_hdr *esp_output_udp_encap(struct sk_buff *skb,
 	return (struct ip_esp_hdr *)(uh + 1);
 }
 
+#ifdef CONFIG_INET_ESPINTCP
+static struct ip_esp_hdr *esp_output_tcp_encap(struct xfrm_state *x,
+						    struct sk_buff *skb,
+						    struct esp_info *esp)
+{
+	__be16 *lenp = (void *)esp->esph;
+	struct ip_esp_hdr *esph;
+	unsigned int len;
+	struct sock *sk;
+
+	len = skb->len + esp->tailen - skb_transport_offset(skb);
+	if (len > IP_MAX_MTU)
+		return ERR_PTR(-EMSGSIZE);
+
+	rcu_read_lock();
+	sk = esp_find_tcp_sk(x);
+	rcu_read_unlock();
+
+	if (IS_ERR(sk))
+		return ERR_CAST(sk);
+
+	*lenp = htons(len);
+	esph = (struct ip_esp_hdr *)(lenp + 1);
+
+	return esph;
+}
+#else
+static struct ip_esp_hdr *esp_output_tcp_encap(struct xfrm_state *x,
+						    struct sk_buff *skb,
+						    struct esp_info *esp)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+#endif
+
 static int esp_output_encap(struct xfrm_state *x, struct sk_buff *skb,
 			    struct esp_info *esp)
 {
@@ -276,6 +443,9 @@ static int esp_output_encap(struct xfrm_state *x, struct sk_buff *skb,
 	case UDP_ENCAP_ESPINUDP_NON_IKE:
 		esph = esp_output_udp_encap(skb, encap_type, esp, sport, dport);
 		break;
+	case TCP_ENCAP_ESPINTCP:
+		esph = esp_output_tcp_encap(x, skb, esp);
+		break;
 	}
 
 	if (IS_ERR(esph))
@@ -296,7 +466,7 @@ int esp_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 	struct sk_buff *trailer;
 	int tailen = esp->tailen;
 
-	/* this is non-NULL only with UDP Encapsulation */
+	/* this is non-NULL only with TCP/UDP Encapsulation */
 	if (x->encap) {
 		int err = esp_output_encap(x, skb, esp);
 
@@ -491,6 +661,9 @@ int esp_output_tail(struct xfrm_state *x, struct sk_buff *skb, struct esp_info *
 	if (sg != dsg)
 		esp_ssg_unref(x, tmp);
 
+	if (!err && x->encap && x->encap->encap_type == TCP_ENCAP_ESPINTCP)
+		err = esp_output_tail_tcp(x, skb);
+
 error_free:
 	kfree(tmp);
 error:
@@ -617,10 +790,14 @@ int esp_input_done2(struct sk_buff *skb, int err)
 
 	if (x->encap) {
 		struct xfrm_encap_tmpl *encap = x->encap;
+		struct tcphdr *th = (void *)(skb_network_header(skb) + ihl);
 		struct udphdr *uh = (void *)(skb_network_header(skb) + ihl);
 		__be16 source;
 
 		switch (x->encap->encap_type) {
+		case TCP_ENCAP_ESPINTCP:
+			source = th->source;
+			break;
 		case UDP_ENCAP_ESPINUDP:
 		case UDP_ENCAP_ESPINUDP_NON_IKE:
 			source = uh->source;
@@ -1017,6 +1194,14 @@ static int esp_init_state(struct xfrm_state *x)
 		case UDP_ENCAP_ESPINUDP_NON_IKE:
 			x->props.header_len += sizeof(struct udphdr) + 2 * sizeof(u32);
 			break;
+#ifdef CONFIG_INET_ESPINTCP
+		case TCP_ENCAP_ESPINTCP:
+			/* only the length field, TCP encap is done by
+			 * the socket
+			 */
+			x->props.header_len += 2;
+			break;
+#endif
 		}
 	}
 
diff --git a/net/xfrm/Makefile b/net/xfrm/Makefile
index fbc4552d17b8..212a4fcb4a88 100644
--- a/net/xfrm/Makefile
+++ b/net/xfrm/Makefile
@@ -11,3 +11,4 @@ obj-$(CONFIG_XFRM_ALGO) += xfrm_algo.o
 obj-$(CONFIG_XFRM_USER) += xfrm_user.o
 obj-$(CONFIG_XFRM_IPCOMP) += xfrm_ipcomp.o
 obj-$(CONFIG_XFRM_INTERFACE) += xfrm_interface.o
+obj-$(CONFIG_INET_ESPINTCP) += espintcp.o
diff --git a/net/xfrm/espintcp.c b/net/xfrm/espintcp.c
new file mode 100644
index 000000000000..f15d6a564b0e
--- /dev/null
+++ b/net/xfrm/espintcp.c
@@ -0,0 +1,509 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <net/tcp.h>
+#include <net/strparser.h>
+#include <net/xfrm.h>
+#include <net/esp.h>
+#include <net/espintcp.h>
+#include <linux/skmsg.h>
+#include <net/inet_common.h>
+
+static void handle_nonesp(struct espintcp_ctx *ctx, struct sk_buff *skb,
+			  struct sock *sk)
+{
+	if (atomic_read(&sk->sk_rmem_alloc) >= sk->sk_rcvbuf ||
+	    !sk_rmem_schedule(sk, skb, skb->truesize)) {
+		kfree_skb(skb);
+		return;
+	}
+
+	skb_set_owner_r(skb, sk);
+
+	memset(skb->cb, 0, sizeof(skb->cb));
+	skb_queue_tail(&ctx->ike_queue, skb);
+	ctx->saved_data_ready(sk);
+}
+
+static void handle_esp(struct sk_buff *skb, struct sock *sk)
+{
+	skb_reset_transport_header(skb);
+	memset(skb->cb, 0, sizeof(skb->cb));
+
+	rcu_read_lock();
+	skb->dev = dev_get_by_index_rcu(sock_net(sk), skb->skb_iif);
+	local_bh_disable();
+	xfrm4_rcv_encap(skb, IPPROTO_ESP, 0, TCP_ENCAP_ESPINTCP);
+	local_bh_enable();
+	rcu_read_unlock();
+}
+
+static void espintcp_rcv(struct strparser *strp, struct sk_buff *skb)
+{
+	struct espintcp_ctx *ctx = container_of(strp, struct espintcp_ctx,
+						strp);
+	struct strp_msg *rxm = strp_msg(skb);
+	u32 nonesp_marker;
+	int err;
+
+	err = skb_copy_bits(skb, rxm->offset + 2, &nonesp_marker,
+			    sizeof(nonesp_marker));
+	if (err < 0) {
+		kfree_skb(skb);
+		return;
+	}
+
+	/* remove header, leave non-ESP marker/SPI */
+	if (!__pskb_pull(skb, rxm->offset + 2)) {
+		kfree_skb(skb);
+		return;
+	}
+
+	if (pskb_trim(skb, rxm->full_len - 2) != 0) {
+		kfree_skb(skb);
+		return;
+	}
+
+	if (nonesp_marker == 0)
+		handle_nonesp(ctx, skb, strp->sk);
+	else
+		handle_esp(skb, strp->sk);
+}
+
+static int espintcp_parse(struct strparser *strp, struct sk_buff *skb)
+{
+	struct strp_msg *rxm = strp_msg(skb);
+	__be16 blen;
+	u16 len;
+	int err;
+
+	if (skb->len < rxm->offset + 2)
+		return 0;
+
+	err = skb_copy_bits(skb, rxm->offset, &blen, sizeof(blen));
+	if (err < 0)
+		return err;
+
+	len = be16_to_cpu(blen);
+	if (len < 6)
+		return -EINVAL;
+
+	return len;
+}
+
+static int espintcp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len,
+			    int nonblock, int flags, int *addr_len)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+	struct sk_buff *skb;
+	int err = 0;
+	int copied;
+	int off = 0;
+
+	flags |= nonblock ? MSG_DONTWAIT : 0;
+
+	skb = __skb_recv_datagram(sk, &ctx->ike_queue, flags, NULL, &off, &err);
+	if (!skb)
+		return err;
+
+	copied = len;
+	if (copied > skb->len)
+		copied = skb->len;
+	else if (copied < skb->len)
+		msg->msg_flags |= MSG_TRUNC;
+
+	err = skb_copy_datagram_msg(skb, 0, msg, copied);
+	if (unlikely(err)) {
+		kfree_skb(skb);
+		return err;
+	}
+
+	if (flags & MSG_TRUNC)
+		copied = skb->len;
+	kfree_skb(skb);
+	return copied;
+}
+
+int espintcp_queue_out(struct sock *sk, struct sk_buff *skb)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+
+	if (skb_queue_len(&ctx->out_queue) >= netdev_max_backlog)
+		return -ENOBUFS;
+
+	__skb_queue_tail(&ctx->out_queue, skb);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(espintcp_queue_out);
+
+/* espintcp length field is 2B and length includes the length field's size */
+#define MAX_ESPINTCP_MSG (((1 << 16) - 1) - 2)
+
+static int espintcp_sendskb_locked(struct sock *sk, struct espintcp_msg *emsg,
+				   int flags)
+{
+	do {
+		int ret;
+
+		ret = skb_send_sock_locked(sk, emsg->skb,
+					   emsg->offset, emsg->len);
+		if (ret < 0)
+			return ret;
+
+		emsg->len -= ret;
+		emsg->offset += ret;
+	} while (emsg->len > 0);
+
+	kfree_skb(emsg->skb);
+	memset(emsg, 0, sizeof(*emsg));
+
+	return 0;
+}
+
+static int espintcp_sendskmsg_locked(struct sock *sk,
+				     struct espintcp_msg *emsg, int flags)
+{
+	struct sk_msg *skmsg = &emsg->skmsg;
+	struct scatterlist *sg;
+	int done = 0;
+	int ret;
+
+	flags |= MSG_SENDPAGE_NOTLAST;
+	sg = &skmsg->sg.data[skmsg->sg.start];
+	do {
+		size_t size = sg->length - emsg->offset;
+		int offset = sg->offset + emsg->offset;
+		struct page *p;
+
+		emsg->offset = 0;
+
+		if (sg_is_last(sg))
+			flags &= ~MSG_SENDPAGE_NOTLAST;
+
+		p = sg_page(sg);
+retry:
+		ret = do_tcp_sendpages(sk, p, offset, size, flags);
+		if (ret < 0) {
+			emsg->offset = offset - sg->offset;
+			skmsg->sg.start += done;
+			return ret;
+		}
+
+		if (ret != size) {
+			offset += ret;
+			size -= ret;
+			goto retry;
+		}
+
+		done++;
+		put_page(p);
+		sk_mem_uncharge(sk, sg->length);
+		sg = sg_next(sg);
+	} while (sg);
+
+	memset(emsg, 0, sizeof(*emsg));
+
+	return 0;
+}
+
+static int espintcp_push_msgs(struct sock *sk)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+	struct espintcp_msg *emsg = &ctx->partial;
+	int err;
+
+	if (!emsg->len)
+		return 0;
+
+	if (ctx->tx_running)
+		return -EAGAIN;
+	ctx->tx_running = 1;
+
+	if (emsg->skb)
+		err = espintcp_sendskb_locked(sk, emsg, 0);
+	else
+		err = espintcp_sendskmsg_locked(sk, emsg, 0);
+	if (err == -EAGAIN) {
+		ctx->tx_running = 0;
+		return 0;
+	}
+	if (!err)
+		memset(emsg, 0, sizeof(*emsg));
+
+	ctx->tx_running = 0;
+
+	return err;
+}
+
+int espintcp_push_skb(struct sock *sk, struct sk_buff *skb)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+	struct espintcp_msg *emsg = &ctx->partial;
+	unsigned int len;
+	int offset;
+
+	if (sk->sk_state != TCP_ESTABLISHED) {
+		kfree_skb(skb);
+		return -ECONNRESET;
+	}
+
+	offset = skb_transport_offset(skb);
+	len = skb->len - offset;
+
+	espintcp_push_msgs(sk);
+
+	if (emsg->len) {
+		kfree_skb(skb);
+		return -ENOBUFS;
+	}
+
+	skb_set_owner_w(skb, sk);
+
+	emsg->offset = offset;
+	emsg->len = len;
+	emsg->skb = skb;
+
+	espintcp_push_msgs(sk);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(espintcp_push_skb);
+
+static int espintcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+{
+	long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+	struct espintcp_msg *emsg = &ctx->partial;
+	struct iov_iter pfx_iter;
+	struct kvec pfx_iov = {};
+	size_t msglen = size + 2;
+	char buf[2] = {0};
+	int err, end;
+
+	if (msg->msg_flags)
+		return -EOPNOTSUPP;
+
+	if (size > MAX_ESPINTCP_MSG)
+		return -EMSGSIZE;
+
+	if (msg->msg_controllen)
+		return -EOPNOTSUPP;
+
+	lock_sock(sk);
+
+	err = espintcp_push_msgs(sk);
+	if (err < 0) {
+		err = -ENOBUFS;
+		goto unlock;
+	}
+
+	sk_msg_init(&emsg->skmsg);
+	while (1) {
+		/* only -ENOMEM is possible since we don't coalesce */
+		err = sk_msg_alloc(sk, &emsg->skmsg, msglen, 0);
+		if (!err)
+			break;
+
+		err = sk_stream_wait_memory(sk, &timeo);
+		if (err)
+			goto fail;
+	}
+
+	*((__be16 *)buf) = cpu_to_be16(msglen);
+	pfx_iov.iov_base = buf;
+	pfx_iov.iov_len = sizeof(buf);
+	iov_iter_kvec(&pfx_iter, WRITE, &pfx_iov, 1, pfx_iov.iov_len);
+
+	err = sk_msg_memcopy_from_iter(sk, &pfx_iter, &emsg->skmsg,
+				       pfx_iov.iov_len);
+	if (err < 0)
+		goto fail;
+
+	err = sk_msg_memcopy_from_iter(sk, &msg->msg_iter, &emsg->skmsg, size);
+	if (err < 0)
+		goto fail;
+
+	end = emsg->skmsg.sg.end;
+	emsg->len = size;
+	sk_msg_iter_var_prev(end);
+	sg_mark_end(sk_msg_elem(&emsg->skmsg, end));
+
+	tcp_rate_check_app_limited(sk);
+
+	err = espintcp_push_msgs(sk);
+	/* this message could be partially sent, keep it */
+	if (err < 0)
+		goto unlock;
+	release_sock(sk);
+
+	return size;
+
+fail:
+	sk_msg_free(sk, &emsg->skmsg);
+	memset(emsg, 0, sizeof(*emsg));
+unlock:
+	release_sock(sk);
+	return err;
+}
+
+static struct proto espintcp_prot __ro_after_init;
+static struct proto_ops espintcp_ops __ro_after_init;
+
+static void espintcp_data_ready(struct sock *sk)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+
+	strp_data_ready(&ctx->strp);
+}
+
+static void espintcp_tx_work(struct work_struct *work)
+{
+	struct espintcp_ctx *ctx = container_of(work,
+						struct espintcp_ctx, work);
+	struct sock *sk = ctx->strp.sk;
+
+	lock_sock(sk);
+	if (!ctx->tx_running)
+		espintcp_push_msgs(sk);
+	release_sock(sk);
+}
+
+static void espintcp_write_space(struct sock *sk)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+
+	schedule_work(&ctx->work);
+	ctx->saved_write_space(sk);
+}
+
+static void espintcp_destruct(struct sock *sk)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+
+	kfree(ctx);
+}
+
+bool tcp_is_ulp_esp(struct sock *sk)
+{
+	return sk->sk_prot == &espintcp_prot;
+}
+EXPORT_SYMBOL_GPL(tcp_is_ulp_esp);
+
+static int espintcp_init_sk(struct sock *sk)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+	struct strp_callbacks cb = {
+		.rcv_msg = espintcp_rcv,
+		.parse_msg = espintcp_parse,
+	};
+	struct espintcp_ctx *ctx;
+	int err;
+
+	/* sockmap is not compatible with espintcp */
+	if (sk->sk_user_data)
+		return -EBUSY;
+
+	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+	if (!ctx)
+		return -ENOMEM;
+
+	err = strp_init(&ctx->strp, sk, &cb);
+	if (err)
+		goto free;
+
+	__sk_dst_reset(sk);
+
+	strp_check_rcv(&ctx->strp);
+	skb_queue_head_init(&ctx->ike_queue);
+	skb_queue_head_init(&ctx->out_queue);
+	sk->sk_prot = &espintcp_prot;
+	sk->sk_socket->ops = &espintcp_ops;
+	ctx->saved_data_ready = sk->sk_data_ready;
+	ctx->saved_write_space = sk->sk_write_space;
+	sk->sk_data_ready = espintcp_data_ready;
+	sk->sk_write_space = espintcp_write_space;
+	sk->sk_destruct = espintcp_destruct;
+	rcu_assign_pointer(icsk->icsk_ulp_data, ctx);
+	INIT_WORK(&ctx->work, espintcp_tx_work);
+
+	/* avoid using task_frag */
+	sk->sk_allocation = GFP_ATOMIC;
+
+	return 0;
+
+free:
+	kfree(ctx);
+	return err;
+}
+
+static void espintcp_release(struct sock *sk)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+	struct sk_buff_head queue;
+	struct sk_buff *skb;
+
+	__skb_queue_head_init(&queue);
+	skb_queue_splice_init(&ctx->out_queue, &queue);
+
+	while ((skb = __skb_dequeue(&queue)))
+		espintcp_push_skb(sk, skb);
+
+	tcp_release_cb(sk);
+}
+
+static void espintcp_close(struct sock *sk, long timeout)
+{
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+	struct espintcp_msg *emsg = &ctx->partial;
+
+	strp_stop(&ctx->strp);
+
+	sk->sk_prot = &tcp_prot;
+	barrier();
+
+	cancel_work_sync(&ctx->work);
+	strp_done(&ctx->strp);
+
+	skb_queue_purge(&ctx->out_queue);
+	skb_queue_purge(&ctx->ike_queue);
+
+	if (emsg->len) {
+		if (emsg->skb)
+			kfree_skb(emsg->skb);
+		else
+			sk_msg_free(sk, &emsg->skmsg);
+	}
+
+	tcp_close(sk, timeout);
+}
+
+static __poll_t espintcp_poll(struct file *file, struct socket *sock,
+			      poll_table *wait)
+{
+	__poll_t mask = datagram_poll(file, sock, wait);
+	struct sock *sk = sock->sk;
+	struct espintcp_ctx *ctx = espintcp_getctx(sk);
+
+	if (!skb_queue_empty(&ctx->ike_queue))
+		mask |= EPOLLIN | EPOLLRDNORM;
+
+	return mask;
+}
+
+static struct tcp_ulp_ops espintcp_ulp __read_mostly = {
+	.name = "espintcp",
+	.owner = THIS_MODULE,
+	.init = espintcp_init_sk,
+};
+
+void __init espintcp_init(void)
+{
+	memcpy(&espintcp_prot, &tcp_prot, sizeof(tcp_prot));
+	memcpy(&espintcp_ops, &inet_stream_ops, sizeof(inet_stream_ops));
+	espintcp_prot.sendmsg = espintcp_sendmsg;
+	espintcp_prot.recvmsg = espintcp_recvmsg;
+	espintcp_prot.close = espintcp_close;
+	espintcp_prot.release_cb = espintcp_release;
+	espintcp_ops.poll = espintcp_poll;
+
+	tcp_register_ulp(&espintcp_ulp);
+}
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index f2d1e573ea55..297d1eb79e5c 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -39,6 +39,9 @@
 #ifdef CONFIG_XFRM_STATISTICS
 #include <net/snmp.h>
 #endif
+#ifdef CONFIG_INET_ESPINTCP
+#include <net/espintcp.h>
+#endif
 
 #include "xfrm_hash.h"
 
@@ -4157,6 +4160,10 @@ void __init xfrm_init(void)
 	seqcount_init(&xfrm_policy_hash_generation);
 	xfrm_input_init();
 
+#ifdef CONFIG_INET_ESPINTCP
+	espintcp_init();
+#endif
+
 	RCU_INIT_POINTER(xfrm_if_cb, NULL);
 	synchronize_rcu();
 }
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index f3423562d933..170d6e7f31d3 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -670,6 +670,9 @@ int __xfrm_state_delete(struct xfrm_state *x)
 		net->xfrm.state_num--;
 		spin_unlock(&net->xfrm.xfrm_state_lock);
 
+		if (x->encap_sk)
+			sock_put(rcu_dereference_raw(x->encap_sk));
+
 		xfrm_dev_state_delete(x);
 
 		/* All xfrm_state objects are created by xfrm_state_alloc.
-- 
cgit v1.2.3


From c02a81fba74fe3488ad6b08bfb5a1329005418f8 Mon Sep 17 00:00:00 2001
From: "Andrew F. Davis" <afd@ti.com>
Date: Tue, 3 Dec 2019 17:26:37 +0000
Subject: dma-buf: Add dma-buf heaps framework

This framework allows a unified userspace interface for dma-buf
exporters, allowing userland to allocate specific types of memory
for use in dma-buf sharing.

Each heap is given its own device node, from which a user can allocate
a dma-buf fd using the DMA_HEAP_IOC_ALLOC ioctl.

This code is an evolution of the Android ION implementation,
and a big thanks is due to its authors/maintainers over time
for their effort:
  Rebecca Schultz Zavin, Colin Cross, Benjamin Gaignard,
  Laura Abbott, and many other contributors!

Cc: Laura Abbott <labbott@redhat.com>
Cc: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Pratik Patel <pratikp@codeaurora.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Vincent Donnefort <Vincent.Donnefort@arm.com>
Cc: Sudipto Paul <Sudipto.Paul@arm.com>
Cc: Andrew F. Davis <afd@ti.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chenbo Feng <fengc@google.com>
Cc: Alistair Strachan <astrachan@google.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Dave Airlie <airlied@gmail.com>
Cc: dri-devel@lists.freedesktop.org
Reviewed-by: Brian Starkey <brian.starkey@arm.com>
Acked-by: Sandeep Patil <sspatil@android.com>
Signed-off-by: Andrew F. Davis <afd@ti.com>
Signed-off-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191203172641.66642-2-john.stultz@linaro.org
---
 MAINTAINERS                   |  18 +++
 drivers/dma-buf/Kconfig       |   9 ++
 drivers/dma-buf/Makefile      |   1 +
 drivers/dma-buf/dma-heap.c    | 297 ++++++++++++++++++++++++++++++++++++++++++
 include/linux/dma-heap.h      |  59 +++++++++
 include/uapi/linux/dma-heap.h |  53 ++++++++
 6 files changed, 437 insertions(+)
 create mode 100644 drivers/dma-buf/dma-heap.c
 create mode 100644 include/linux/dma-heap.h
 create mode 100644 include/uapi/linux/dma-heap.h


diff --git a/MAINTAINERS b/MAINTAINERS
index c2b89453805f..248938405c01 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4942,6 +4942,24 @@ F:	include/linux/*fence.h
 F:	Documentation/driver-api/dma-buf.rst
 T:	git git://anongit.freedesktop.org/drm/drm-misc
 
+DMA-BUF HEAPS FRAMEWORK
+M:	Sumit Semwal <sumit.semwal@linaro.org>
+R:	Andrew F. Davis <afd@ti.com>
+R:	Benjamin Gaignard <benjamin.gaignard@linaro.org>
+R:	Liam Mark <lmark@codeaurora.org>
+R:	Laura Abbott <labbott@redhat.com>
+R:	Brian Starkey <Brian.Starkey@arm.com>
+R:	John Stultz <john.stultz@linaro.org>
+S:	Maintained
+L:	linux-media@vger.kernel.org
+L:	dri-devel@lists.freedesktop.org
+L:	linaro-mm-sig@lists.linaro.org (moderated for non-subscribers)
+F:	include/uapi/linux/dma-heap.h
+F:	include/linux/dma-heap.h
+F:	drivers/dma-buf/dma-heap.c
+F:	drivers/dma-buf/heaps/*
+T:	git git://anongit.freedesktop.org/drm/drm-misc
+
 DMA GENERIC OFFLOAD ENGINE SUBSYSTEM
 M:	Vinod Koul <vkoul@kernel.org>
 L:	dmaengine@vger.kernel.org
diff --git a/drivers/dma-buf/Kconfig b/drivers/dma-buf/Kconfig
index a23b6752d11a..bffa58fc3e6e 100644
--- a/drivers/dma-buf/Kconfig
+++ b/drivers/dma-buf/Kconfig
@@ -44,4 +44,13 @@ config DMABUF_SELFTESTS
 	default n
 	depends on DMA_SHARED_BUFFER
 
+menuconfig DMABUF_HEAPS
+	bool "DMA-BUF Userland Memory Heaps"
+	select DMA_SHARED_BUFFER
+	help
+	  Choose this option to enable the DMA-BUF userland memory heaps.
+	  This option creates per-heap chardevs in /dev/dma_heap/ which
+	  allow userspace to allocate dma-bufs that can be shared
+	  between drivers.
+
 endmenu
diff --git a/drivers/dma-buf/Makefile b/drivers/dma-buf/Makefile
index 03479da06422..caee5eb3d351 100644
--- a/drivers/dma-buf/Makefile
+++ b/drivers/dma-buf/Makefile
@@ -1,6 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 obj-y := dma-buf.o dma-fence.o dma-fence-array.o dma-fence-chain.o \
 	 dma-resv.o seqno-fence.o
+obj-$(CONFIG_DMABUF_HEAPS)	+= dma-heap.o
 obj-$(CONFIG_SYNC_FILE)		+= sync_file.o
 obj-$(CONFIG_SW_SYNC)		+= sw_sync.o sync_debug.o
 obj-$(CONFIG_UDMABUF)		+= udmabuf.o
diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
new file mode 100644
index 000000000000..4f04d104ae61
--- /dev/null
+++ b/drivers/dma-buf/dma-heap.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Framework for userspace DMA-BUF allocations
+ *
+ * Copyright (C) 2011 Google, Inc.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#include <linux/cdev.h>
+#include <linux/debugfs.h>
+#include <linux/device.h>
+#include <linux/dma-buf.h>
+#include <linux/err.h>
+#include <linux/xarray.h>
+#include <linux/list.h>
+#include <linux/slab.h>
+#include <linux/uaccess.h>
+#include <linux/syscalls.h>
+#include <linux/dma-heap.h>
+#include <uapi/linux/dma-heap.h>
+
+#define DEVNAME "dma_heap"
+
+#define NUM_HEAP_MINORS 128
+
+/**
+ * struct dma_heap - represents a dmabuf heap in the system
+ * @name:		used for debugging/device-node name
+ * @ops:		ops struct for this heap
+ * @heap_devt:		heap device node
+ * @list:		list head connecting to list of heaps
+ * @heap_cdev:		heap char device
+ *
+ * Represents a heap of memory from which buffers can be made.
+ */
+struct dma_heap {
+	const char *name;
+	const struct dma_heap_ops *ops;
+	void *priv;
+	dev_t heap_devt;
+	struct list_head list;
+	struct cdev heap_cdev;
+};
+
+static LIST_HEAD(heap_list);
+static DEFINE_MUTEX(heap_list_lock);
+static dev_t dma_heap_devt;
+static struct class *dma_heap_class;
+static DEFINE_XARRAY_ALLOC(dma_heap_minors);
+
+static int dma_heap_buffer_alloc(struct dma_heap *heap, size_t len,
+				 unsigned int fd_flags,
+				 unsigned int heap_flags)
+{
+	/*
+	 * Allocations from all heaps have to begin
+	 * and end on page boundaries.
+	 */
+	len = PAGE_ALIGN(len);
+	if (!len)
+		return -EINVAL;
+
+	return heap->ops->allocate(heap, len, fd_flags, heap_flags);
+}
+
+static int dma_heap_open(struct inode *inode, struct file *file)
+{
+	struct dma_heap *heap;
+
+	heap = xa_load(&dma_heap_minors, iminor(inode));
+	if (!heap) {
+		pr_err("dma_heap: minor %d unknown.\n", iminor(inode));
+		return -ENODEV;
+	}
+
+	/* instance data as context */
+	file->private_data = heap;
+	nonseekable_open(inode, file);
+
+	return 0;
+}
+
+static long dma_heap_ioctl_allocate(struct file *file, void *data)
+{
+	struct dma_heap_allocation_data *heap_allocation = data;
+	struct dma_heap *heap = file->private_data;
+	int fd;
+
+	if (heap_allocation->fd)
+		return -EINVAL;
+
+	if (heap_allocation->fd_flags & ~DMA_HEAP_VALID_FD_FLAGS)
+		return -EINVAL;
+
+	if (heap_allocation->heap_flags & ~DMA_HEAP_VALID_HEAP_FLAGS)
+		return -EINVAL;
+
+	fd = dma_heap_buffer_alloc(heap, heap_allocation->len,
+				   heap_allocation->fd_flags,
+				   heap_allocation->heap_flags);
+	if (fd < 0)
+		return fd;
+
+	heap_allocation->fd = fd;
+
+	return 0;
+}
+
+static const unsigned int dma_heap_ioctl_cmds[] = {
+	DMA_HEAP_IOC_ALLOC,
+};
+
+static long dma_heap_ioctl(struct file *file, unsigned int ucmd,
+			   unsigned long arg)
+{
+	char stack_kdata[128];
+	char *kdata = stack_kdata;
+	unsigned int kcmd;
+	unsigned int in_size, out_size, drv_size, ksize;
+	int nr = _IOC_NR(ucmd);
+	int ret = 0;
+
+	if (nr >= ARRAY_SIZE(dma_heap_ioctl_cmds))
+		return -EINVAL;
+
+	/* Get the kernel ioctl cmd that matches */
+	kcmd = dma_heap_ioctl_cmds[nr];
+
+	/* Figure out the delta between user cmd size and kernel cmd size */
+	drv_size = _IOC_SIZE(kcmd);
+	out_size = _IOC_SIZE(ucmd);
+	in_size = out_size;
+	if ((ucmd & kcmd & IOC_IN) == 0)
+		in_size = 0;
+	if ((ucmd & kcmd & IOC_OUT) == 0)
+		out_size = 0;
+	ksize = max(max(in_size, out_size), drv_size);
+
+	/* If necessary, allocate buffer for ioctl argument */
+	if (ksize > sizeof(stack_kdata)) {
+		kdata = kmalloc(ksize, GFP_KERNEL);
+		if (!kdata)
+			return -ENOMEM;
+	}
+
+	if (copy_from_user(kdata, (void __user *)arg, in_size) != 0) {
+		ret = -EFAULT;
+		goto err;
+	}
+
+	/* zero out any difference between the kernel/user structure size */
+	if (ksize > in_size)
+		memset(kdata + in_size, 0, ksize - in_size);
+
+	switch (kcmd) {
+	case DMA_HEAP_IOC_ALLOC:
+		ret = dma_heap_ioctl_allocate(file, kdata);
+		break;
+	default:
+		return -ENOTTY;
+	}
+
+	if (copy_to_user((void __user *)arg, kdata, out_size) != 0)
+		ret = -EFAULT;
+err:
+	if (kdata != stack_kdata)
+		kfree(kdata);
+	return ret;
+}
+
+static const struct file_operations dma_heap_fops = {
+	.owner          = THIS_MODULE,
+	.open		= dma_heap_open,
+	.unlocked_ioctl = dma_heap_ioctl,
+#ifdef CONFIG_COMPAT
+	.compat_ioctl	= dma_heap_ioctl,
+#endif
+};
+
+/**
+ * dma_heap_get_drvdata() - get per-subdriver data for the heap
+ * @heap: DMA-Heap to retrieve private data for
+ *
+ * Returns:
+ * The per-subdriver data for the heap.
+ */
+void *dma_heap_get_drvdata(struct dma_heap *heap)
+{
+	return heap->priv;
+}
+
+struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info)
+{
+	struct dma_heap *heap, *h, *err_ret;
+	struct device *dev_ret;
+	unsigned int minor;
+	int ret;
+
+	if (!exp_info->name || !strcmp(exp_info->name, "")) {
+		pr_err("dma_heap: Cannot add heap without a name\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	if (!exp_info->ops || !exp_info->ops->allocate) {
+		pr_err("dma_heap: Cannot add heap with invalid ops struct\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	/* check the name is unique */
+	mutex_lock(&heap_list_lock);
+	list_for_each_entry(h, &heap_list, list) {
+		if (!strcmp(h->name, exp_info->name)) {
+			mutex_unlock(&heap_list_lock);
+			pr_err("dma_heap: Already registered heap named %s\n",
+			       exp_info->name);
+			return ERR_PTR(-EINVAL);
+		}
+	}
+	mutex_unlock(&heap_list_lock);
+
+	heap = kzalloc(sizeof(*heap), GFP_KERNEL);
+	if (!heap)
+		return ERR_PTR(-ENOMEM);
+
+	heap->name = exp_info->name;
+	heap->ops = exp_info->ops;
+	heap->priv = exp_info->priv;
+
+	/* Find unused minor number */
+	ret = xa_alloc(&dma_heap_minors, &minor, heap,
+		       XA_LIMIT(0, NUM_HEAP_MINORS - 1), GFP_KERNEL);
+	if (ret < 0) {
+		pr_err("dma_heap: Unable to get minor number for heap\n");
+		err_ret = ERR_PTR(ret);
+		goto err0;
+	}
+
+	/* Create device */
+	heap->heap_devt = MKDEV(MAJOR(dma_heap_devt), minor);
+
+	cdev_init(&heap->heap_cdev, &dma_heap_fops);
+	ret = cdev_add(&heap->heap_cdev, heap->heap_devt, 1);
+	if (ret < 0) {
+		pr_err("dma_heap: Unable to add char device\n");
+		err_ret = ERR_PTR(ret);
+		goto err1;
+	}
+
+	dev_ret = device_create(dma_heap_class,
+				NULL,
+				heap->heap_devt,
+				NULL,
+				heap->name);
+	if (IS_ERR(dev_ret)) {
+		pr_err("dma_heap: Unable to create device\n");
+		err_ret = ERR_CAST(dev_ret);
+		goto err2;
+	}
+	/* Add heap to the list */
+	mutex_lock(&heap_list_lock);
+	list_add(&heap->list, &heap_list);
+	mutex_unlock(&heap_list_lock);
+
+	return heap;
+
+err2:
+	cdev_del(&heap->heap_cdev);
+err1:
+	xa_erase(&dma_heap_minors, minor);
+err0:
+	kfree(heap);
+	return err_ret;
+}
+
+static char *dma_heap_devnode(struct device *dev, umode_t *mode)
+{
+	return kasprintf(GFP_KERNEL, "dma_heap/%s", dev_name(dev));
+}
+
+static int dma_heap_init(void)
+{
+	int ret;
+
+	ret = alloc_chrdev_region(&dma_heap_devt, 0, NUM_HEAP_MINORS, DEVNAME);
+	if (ret)
+		return ret;
+
+	dma_heap_class = class_create(THIS_MODULE, DEVNAME);
+	if (IS_ERR(dma_heap_class)) {
+		unregister_chrdev_region(dma_heap_devt, NUM_HEAP_MINORS);
+		return PTR_ERR(dma_heap_class);
+	}
+	dma_heap_class->devnode = dma_heap_devnode;
+
+	return 0;
+}
+subsys_initcall(dma_heap_init);
diff --git a/include/linux/dma-heap.h b/include/linux/dma-heap.h
new file mode 100644
index 000000000000..454e354d1ffb
--- /dev/null
+++ b/include/linux/dma-heap.h
@@ -0,0 +1,59 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * DMABUF Heaps Allocation Infrastructure
+ *
+ * Copyright (C) 2011 Google, Inc.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+
+#ifndef _DMA_HEAPS_H
+#define _DMA_HEAPS_H
+
+#include <linux/cdev.h>
+#include <linux/types.h>
+
+struct dma_heap;
+
+/**
+ * struct dma_heap_ops - ops to operate on a given heap
+ * @allocate:		allocate dmabuf and return fd
+ *
+ * allocate returns dmabuf fd on success, -errno on error.
+ */
+struct dma_heap_ops {
+	int (*allocate)(struct dma_heap *heap,
+			unsigned long len,
+			unsigned long fd_flags,
+			unsigned long heap_flags);
+};
+
+/**
+ * struct dma_heap_export_info - information needed to export a new dmabuf heap
+ * @name:	used for debugging/device-node name
+ * @ops:	ops struct for this heap
+ * @priv:	heap exporter private data
+ *
+ * Information needed to export a new dmabuf heap.
+ */
+struct dma_heap_export_info {
+	const char *name;
+	const struct dma_heap_ops *ops;
+	void *priv;
+};
+
+/**
+ * dma_heap_get_drvdata() - get per-heap driver data
+ * @heap: DMA-Heap to retrieve private data for
+ *
+ * Returns:
+ * The per-heap data for the heap.
+ */
+void *dma_heap_get_drvdata(struct dma_heap *heap);
+
+/**
+ * dma_heap_add - adds a heap to dmabuf heaps
+ * @exp_info:		information needed to register this heap
+ */
+struct dma_heap *dma_heap_add(const struct dma_heap_export_info *exp_info);
+
+#endif /* _DMA_HEAPS_H */
diff --git a/include/uapi/linux/dma-heap.h b/include/uapi/linux/dma-heap.h
new file mode 100644
index 000000000000..73e7f66c1cae
--- /dev/null
+++ b/include/uapi/linux/dma-heap.h
@@ -0,0 +1,53 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * DMABUF Heaps Userspace API
+ *
+ * Copyright (C) 2011 Google, Inc.
+ * Copyright (C) 2019 Linaro Ltd.
+ */
+#ifndef _UAPI_LINUX_DMABUF_POOL_H
+#define _UAPI_LINUX_DMABUF_POOL_H
+
+#include <linux/ioctl.h>
+#include <linux/types.h>
+
+/**
+ * DOC: DMABUF Heaps Userspace API
+ */
+
+/* Valid FD_FLAGS are O_CLOEXEC, O_RDONLY, O_WRONLY, O_RDWR */
+#define DMA_HEAP_VALID_FD_FLAGS (O_CLOEXEC | O_ACCMODE)
+
+/* Currently no heap flags */
+#define DMA_HEAP_VALID_HEAP_FLAGS (0)
+
+/**
+ * struct dma_heap_allocation_data - metadata passed from userspace for
+ *                                      allocations
+ * @len:		size of the allocation
+ * @fd:			will be populated with a fd which provides the
+ *			handle to the allocated dma-buf
+ * @fd_flags:		file descriptor flags used when allocating
+ * @heap_flags:		flags passed to heap
+ *
+ * Provided by userspace as an argument to the ioctl
+ */
+struct dma_heap_allocation_data {
+	__u64 len;
+	__u32 fd;
+	__u32 fd_flags;
+	__u64 heap_flags;
+};
+
+#define DMA_HEAP_IOC_MAGIC		'H'
+
+/**
+ * DOC: DMA_HEAP_IOC_ALLOC - allocate memory from pool
+ *
+ * Takes a dma_heap_allocation_data struct and returns it with the fd field
+ * populated with the dmabuf handle of the allocation.
+ */
+#define DMA_HEAP_IOC_ALLOC	_IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\
+				      struct dma_heap_allocation_data)
+
+#endif /* _UAPI_LINUX_DMABUF_POOL_H */
-- 
cgit v1.2.3


From f10870b05d5edc0652701c6a92eafcab5044795f Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Tue, 10 Dec 2019 21:59:15 +0100
Subject: staging: remove isdn capi drivers

As described in drivers/staging/isdn/TODO, the drivers are all
assumed to be unmaintained and unused now, with gigaset being the
last one to stop being maintained after Paul Bolle lost access
to an ISDN network.

The CAPI subsystem remains for now, as it is still required by
bluetooth/cmtp.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20191210210455.3475361-1-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/isdn/avmb1.rst                       |  246 --
 Documentation/isdn/gigaset.rst                     |  465 ----
 Documentation/isdn/hysdn.rst                       |  196 --
 Documentation/isdn/index.rst                       |    3 -
 Documentation/userspace-api/ioctl/ioctl-number.rst |    1 -
 MAINTAINERS                                        |    3 +-
 drivers/staging/Kconfig                            |    2 -
 drivers/staging/Makefile                           |    1 -
 drivers/staging/isdn/Kconfig                       |   12 -
 drivers/staging/isdn/Makefile                      |    8 -
 drivers/staging/isdn/TODO                          |   22 -
 drivers/staging/isdn/avm/Kconfig                   |   65 -
 drivers/staging/isdn/avm/Makefile                  |   12 -
 drivers/staging/isdn/avm/avm_cs.c                  |  166 --
 drivers/staging/isdn/avm/avmcard.h                 |  581 -----
 drivers/staging/isdn/avm/b1.c                      |  819 ------
 drivers/staging/isdn/avm/b1dma.c                   |  981 -------
 drivers/staging/isdn/avm/b1isa.c                   |  243 --
 drivers/staging/isdn/avm/b1pci.c                   |  416 ---
 drivers/staging/isdn/avm/b1pcmcia.c                |  224 --
 drivers/staging/isdn/avm/c4.c                      | 1317 ----------
 drivers/staging/isdn/avm/t1isa.c                   |  594 -----
 drivers/staging/isdn/avm/t1pci.c                   |  259 --
 drivers/staging/isdn/gigaset/Kconfig               |   62 -
 drivers/staging/isdn/gigaset/Makefile              |   17 -
 drivers/staging/isdn/gigaset/asyncdata.c           |  606 -----
 drivers/staging/isdn/gigaset/bas-gigaset.c         | 2672 --------------------
 drivers/staging/isdn/gigaset/capi.c                | 2517 ------------------
 drivers/staging/isdn/gigaset/common.c              | 1153 ---------
 drivers/staging/isdn/gigaset/dummyll.c             |   74 -
 drivers/staging/isdn/gigaset/ev-layer.c            | 1910 --------------
 drivers/staging/isdn/gigaset/gigaset.h             |  827 ------
 drivers/staging/isdn/gigaset/interface.c           |  613 -----
 drivers/staging/isdn/gigaset/isocdata.c            | 1006 --------
 drivers/staging/isdn/gigaset/proc.c                |   77 -
 drivers/staging/isdn/gigaset/ser-gigaset.c         |  796 ------
 drivers/staging/isdn/gigaset/usb-gigaset.c         |  946 -------
 drivers/staging/isdn/hysdn/Kconfig                 |   15 -
 drivers/staging/isdn/hysdn/Makefile                |   12 -
 drivers/staging/isdn/hysdn/boardergo.c             |  445 ----
 drivers/staging/isdn/hysdn/boardergo.h             |  100 -
 drivers/staging/isdn/hysdn/hycapi.c                |  785 ------
 drivers/staging/isdn/hysdn/hysdn_boot.c            |  400 ---
 drivers/staging/isdn/hysdn/hysdn_defs.h            |  282 ---
 drivers/staging/isdn/hysdn/hysdn_init.c            |  213 --
 drivers/staging/isdn/hysdn/hysdn_net.c             |  330 ---
 drivers/staging/isdn/hysdn/hysdn_pof.h             |   78 -
 drivers/staging/isdn/hysdn/hysdn_procconf.c        |  411 ---
 drivers/staging/isdn/hysdn/hysdn_proclog.c         |  357 ---
 drivers/staging/isdn/hysdn/hysdn_sched.c           |  197 --
 drivers/staging/isdn/hysdn/ince1pc.h               |  134 -
 include/linux/b1pcmcia.h                           |   21 -
 include/uapi/linux/gigaset_dev.h                   |   39 -
 include/uapi/linux/hysdn_if.h                      |   34 -
 54 files changed, 1 insertion(+), 23764 deletions(-)
 delete mode 100644 Documentation/isdn/avmb1.rst
 delete mode 100644 Documentation/isdn/gigaset.rst
 delete mode 100644 Documentation/isdn/hysdn.rst
 delete mode 100644 drivers/staging/isdn/Kconfig
 delete mode 100644 drivers/staging/isdn/Makefile
 delete mode 100644 drivers/staging/isdn/TODO
 delete mode 100644 drivers/staging/isdn/avm/Kconfig
 delete mode 100644 drivers/staging/isdn/avm/Makefile
 delete mode 100644 drivers/staging/isdn/avm/avm_cs.c
 delete mode 100644 drivers/staging/isdn/avm/avmcard.h
 delete mode 100644 drivers/staging/isdn/avm/b1.c
 delete mode 100644 drivers/staging/isdn/avm/b1dma.c
 delete mode 100644 drivers/staging/isdn/avm/b1isa.c
 delete mode 100644 drivers/staging/isdn/avm/b1pci.c
 delete mode 100644 drivers/staging/isdn/avm/b1pcmcia.c
 delete mode 100644 drivers/staging/isdn/avm/c4.c
 delete mode 100644 drivers/staging/isdn/avm/t1isa.c
 delete mode 100644 drivers/staging/isdn/avm/t1pci.c
 delete mode 100644 drivers/staging/isdn/gigaset/Kconfig
 delete mode 100644 drivers/staging/isdn/gigaset/Makefile
 delete mode 100644 drivers/staging/isdn/gigaset/asyncdata.c
 delete mode 100644 drivers/staging/isdn/gigaset/bas-gigaset.c
 delete mode 100644 drivers/staging/isdn/gigaset/capi.c
 delete mode 100644 drivers/staging/isdn/gigaset/common.c
 delete mode 100644 drivers/staging/isdn/gigaset/dummyll.c
 delete mode 100644 drivers/staging/isdn/gigaset/ev-layer.c
 delete mode 100644 drivers/staging/isdn/gigaset/gigaset.h
 delete mode 100644 drivers/staging/isdn/gigaset/interface.c
 delete mode 100644 drivers/staging/isdn/gigaset/isocdata.c
 delete mode 100644 drivers/staging/isdn/gigaset/proc.c
 delete mode 100644 drivers/staging/isdn/gigaset/ser-gigaset.c
 delete mode 100644 drivers/staging/isdn/gigaset/usb-gigaset.c
 delete mode 100644 drivers/staging/isdn/hysdn/Kconfig
 delete mode 100644 drivers/staging/isdn/hysdn/Makefile
 delete mode 100644 drivers/staging/isdn/hysdn/boardergo.c
 delete mode 100644 drivers/staging/isdn/hysdn/boardergo.h
 delete mode 100644 drivers/staging/isdn/hysdn/hycapi.c
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_boot.c
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_defs.h
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_init.c
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_net.c
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_pof.h
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_procconf.c
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_proclog.c
 delete mode 100644 drivers/staging/isdn/hysdn/hysdn_sched.c
 delete mode 100644 drivers/staging/isdn/hysdn/ince1pc.h
 delete mode 100644 include/linux/b1pcmcia.h
 delete mode 100644 include/uapi/linux/gigaset_dev.h
 delete mode 100644 include/uapi/linux/hysdn_if.h


diff --git a/Documentation/isdn/avmb1.rst b/Documentation/isdn/avmb1.rst
deleted file mode 100644
index de3961e67553..000000000000
--- a/Documentation/isdn/avmb1.rst
+++ /dev/null
@@ -1,246 +0,0 @@
-================================
-Driver for active AVM Controller
-================================
-
-The driver provides a kernel capi2.0 Interface (kernelcapi) and
-on top of this a User-Level-CAPI2.0-interface (capi)
-and a driver to connect isdn4linux with CAPI2.0 (capidrv).
-The lowlevel interface can be used to implement a CAPI2.0
-also for passive cards since July 1999.
-
-The author can be reached at calle@calle.in-berlin.de.
-The command avmcapictrl is part of the isdn4k-utils.
-t4-files can be found at ftp://ftp.avm.de/cardware/b1/linux/firmware
-
-Currently supported cards:
-
-	- B1 ISA (all versions)
-	- B1 PCI
-	- T1/T1B (HEMA card)
-	- M1
-	- M2
-	- B1 PCMCIA
-
-Installing
-----------
-
-You need at least /dev/capi20 to load the firmware.
-
-::
-
-    mknod /dev/capi20 c 68 0
-    mknod /dev/capi20.00 c 68 1
-    mknod /dev/capi20.01 c 68 2
-    .
-    .
-    .
-    mknod /dev/capi20.19 c 68 20
-
-Running
--------
-
-To use the card you need the t4-files to download the firmware.
-AVM GmbH provides several t4-files for the different D-channel
-protocols (b1.t4 for Euro-ISDN). Install these file in /lib/isdn.
-
-if you configure as modules load the modules this way::
-
-    insmod /lib/modules/current/misc/capiutil.o
-    insmod /lib/modules/current/misc/b1.o
-    insmod /lib/modules/current/misc/kernelcapi.o
-    insmod /lib/modules/current/misc/capidrv.o
-    insmod /lib/modules/current/misc/capi.o
-
-if you have an B1-PCI card load the module b1pci.o::
-
-    insmod /lib/modules/current/misc/b1pci.o
-
-and load the firmware with::
-
-    avmcapictrl load /lib/isdn/b1.t4 1
-
-if you have an B1-ISA card load the module b1isa.o
-and add the card by calling::
-
-    avmcapictrl add 0x150 15
-
-and load the firmware by calling::
-
-    avmcapictrl load /lib/isdn/b1.t4 1
-
-if you have an T1-ISA card load the module t1isa.o
-and add the card by calling::
-
-    avmcapictrl add 0x450 15 T1 0
-
-and load the firmware by calling::
-
-    avmcapictrl load /lib/isdn/t1.t4 1
-
-if you have an PCMCIA card (B1/M1/M2) load the module b1pcmcia.o
-before you insert the card.
-
-Leased Lines with B1
---------------------
-
-Init card and load firmware.
-
-For an D64S use "FV: 1" as phone number
-
-For an D64S2 use "FV: 1" and "FV: 2" for multilink
-or "FV: 1,2" to use CAPI channel bundling.
-
-/proc-Interface
------------------
-
-/proc/capi::
-
-  dr-xr-xr-x   2 root     root            0 Jul  1 14:03 .
-  dr-xr-xr-x  82 root     root            0 Jun 30 19:08 ..
-  -r--r--r--   1 root     root            0 Jul  1 14:03 applications
-  -r--r--r--   1 root     root            0 Jul  1 14:03 applstats
-  -r--r--r--   1 root     root            0 Jul  1 14:03 capi20
-  -r--r--r--   1 root     root            0 Jul  1 14:03 capidrv
-  -r--r--r--   1 root     root            0 Jul  1 14:03 controller
-  -r--r--r--   1 root     root            0 Jul  1 14:03 contrstats
-  -r--r--r--   1 root     root            0 Jul  1 14:03 driver
-  -r--r--r--   1 root     root            0 Jul  1 14:03 ncci
-  -r--r--r--   1 root     root            0 Jul  1 14:03 users
-
-/proc/capi/applications:
-   applid level3cnt datablkcnt datablklen ncci-cnt recvqueuelen
-	level3cnt:
-	    capi_register parameter
-	datablkcnt:
-	    capi_register parameter
-	ncci-cnt:
-	    current number of nccis (connections)
-	recvqueuelen:
-	    number of messages on receive queue
-
-   for example::
-
-	1 -2 16 2048 1 0
-	2 2 7 2048 1 0
-
-/proc/capi/applstats:
-   applid recvctlmsg nrecvdatamsg nsentctlmsg nsentdatamsg
-	recvctlmsg:
-	    capi messages received without DATA_B3_IND
-	recvdatamsg:
-	    capi DATA_B3_IND received
-	sentctlmsg:
-	    capi messages sent without DATA_B3_REQ
-	sentdatamsg:
-	    capi DATA_B3_REQ sent
-
-   for example::
-
-	1 2057 1699 1721 1699
-
-/proc/capi/capi20: statistics of capi.o (/dev/capi20)
-    minor nopen nrecvdropmsg nrecvctlmsg nrecvdatamsg sentctlmsg sentdatamsg
-	minor:
-	    minor device number of capi device
-	nopen:
-	    number of calls to devices open
-	nrecvdropmsg:
-	    capi messages dropped (messages in recvqueue in close)
-	nrecvctlmsg:
-	    capi messages received without DATA_B3_IND
-	nrecvdatamsg:
-	    capi DATA_B3_IND received
-	nsentctlmsg:
-	    capi messages sent without DATA_B3_REQ
-	nsentdatamsg:
-	    capi DATA_B3_REQ sent
-
-   for example::
-
-	1 2 18 0 16 2
-
-/proc/capi/capidrv: statistics of capidrv.o (capi messages)
-    nrecvctlmsg nrecvdatamsg sentctlmsg sentdatamsg
-	nrecvctlmsg:
-	    capi messages received without DATA_B3_IND
-	nrecvdatamsg:
-	    capi DATA_B3_IND received
-	nsentctlmsg:
-	    capi messages sent without DATA_B3_REQ
-	nsentdatamsg:
-	    capi DATA_B3_REQ sent
-
-   for example:
-	2780 2226 2256 2226
-
-/proc/capi/controller:
-   controller drivername state cardname   controllerinfo
-
-   for example::
-
-	1 b1pci      running  b1pci-e000       B1 3.07-01 0xe000 19
-	2 t1isa      running  t1isa-450        B1 3.07-01 0x450 11 0
-	3 b1pcmcia   running  m2-150           B1 3.07-01 0x150 5
-
-/proc/capi/contrstats:
-    controller nrecvctlmsg nrecvdatamsg nsentctlmsg nsentdatamsg
-	nrecvctlmsg:
-	    capi messages received without DATA_B3_IND
-	nrecvdatamsg:
-	    capi DATA_B3_IND received
-	nsentctlmsg:
-	    capi messages sent without DATA_B3_REQ
-	nsentdatamsg:
-	    capi DATA_B3_REQ sent
-
-   for example::
-
-	1 2845 2272 2310 2274
-	2 2 0 2 0
-	3 2 0 2 0
-
-/proc/capi/driver:
-   drivername ncontroller
-
-   for example::
-
-	b1pci                            1
-	t1isa                            1
-	b1pcmcia                         1
-	b1isa                            0
-
-/proc/capi/ncci:
-   applid ncci winsize sendwindow
-
-   for example::
-
-	1 0x10101 8 0
-
-/proc/capi/users: kernel modules that use kernelcapi.
-   name
-
-   for example::
-
-	capidrv
-	capi20
-
-Questions
----------
-
-Check out the FAQ (ftp.isdn4linux.de) or subscribe to the
-linux-avmb1@calle.in-berlin.de mailing list by sending
-a mail to majordomo@calle.in-berlin.de with
-subscribe linux-avmb1
-in the body.
-
-German documentation and several scripts can be found at
-ftp://ftp.avm.de/cardware/b1/linux/
-
-Bugs
-----
-
-If you find any please let me know.
-
-Enjoy,
-
-Carsten Paeth (calle@calle.in-berlin.de)
diff --git a/Documentation/isdn/gigaset.rst b/Documentation/isdn/gigaset.rst
deleted file mode 100644
index 98b4ec521c51..000000000000
--- a/Documentation/isdn/gigaset.rst
+++ /dev/null
@@ -1,465 +0,0 @@
-==========================
-GigaSet 307x Device Driver
-==========================
-
-1.   Requirements
-=================
-
-1.1. Hardware
--------------
-
-     This driver supports the connection of the Gigaset 307x/417x family of
-     ISDN DECT bases via Gigaset M101 Data, Gigaset M105 Data or direct USB
-     connection. The following devices are reported to be compatible:
-
-     Bases:
-       - Siemens Gigaset 3070/3075 isdn
-       - Siemens Gigaset 4170/4175 isdn
-       - Siemens Gigaset SX205/255
-       - Siemens Gigaset SX353
-       - T-Com Sinus 45 [AB] isdn
-       - T-Com Sinus 721X[A] [SE]
-       - Vox Chicago 390 ISDN (KPN Telecom)
-
-     RS232 data boxes:
-       - Siemens Gigaset M101 Data
-       - T-Com Sinus 45 Data 1
-
-     USB data boxes:
-       - Siemens Gigaset M105 Data
-       - Siemens Gigaset USB Adapter DECT
-       - T-Com Sinus 45 Data 2
-       - T-Com Sinus 721 data
-       - Chicago 390 USB (KPN)
-
-     See also http://www.erbze.info/sinus_gigaset.htm
-       (archived at https://web.archive.org/web/20100717020421/http://www.erbze.info:80/sinus_gigaset.htm ) and
-	http://gigaset307x.sourceforge.net/
-
-     We have also had reports from users of Gigaset M105 who could use the drivers
-     with SX 100 and CX 100 ISDN bases (only in unimodem mode, see section 2.5.)
-     If you have another device that works with our driver, please let us know.
-
-     Chances of getting a USB device to work are good if the output of::
-
-	lsusb
-
-     at the command line contains one of the following::
-
-	ID 0681:0001
-	ID 0681:0002
-	ID 0681:0009
-	ID 0681:0021
-	ID 0681:0022
-
-1.2. Software
--------------
-
-     The driver works with the Kernel CAPI subsystem and can be used with any
-     software which is able to use CAPI 2.0 for ISDN connections (voice or data).
-
-     There are some user space tools available at
-     https://sourceforge.net/projects/gigaset307x/
-     which provide access to additional device specific functions like SMS,
-     phonebook or call journal.
-
-
-2.   How to use the driver
-==========================
-
-2.1. Modules
-------------
-
-     For the devices to work, the proper kernel modules have to be loaded.
-     This normally happens automatically when the system detects the USB
-     device (base, M105) or when the line discipline is attached (M101). It
-     can also be triggered manually using the modprobe(8) command, for example
-     for troubleshooting or to pass module parameters.
-
-     The module ser_gigaset provides a serial line discipline N_GIGASET_M101
-     which uses the regular serial port driver to access the device, and must
-     therefore be attached to the serial device to which the M101 is connected.
-     The ldattach(8) command (included in util-linux-ng release 2.14 or later)
-     can be used for that purpose, for example::
-
-	ldattach GIGASET_M101 /dev/ttyS1
-
-     This will open the device file, attach the line discipline to it, and
-     then sleep in the background, keeping the device open so that the line
-     discipline remains active. To deactivate it, kill the daemon, for example
-     with::
-
-	killall ldattach
-
-     before disconnecting the device. To have this happen automatically at
-     system startup/shutdown on an LSB compatible system, create and activate
-     an appropriate LSB startup script /etc/init.d/gigaset. (The init name
-     'gigaset' is officially assigned to this project by LANANA.)
-     Alternatively, just add the 'ldattach' command line to /etc/rc.local.
-
-     The modules accept the following parameters:
-
-	=============== ========== ==========================================
-	Module		Parameter  Meaning
-
-	gigaset		debug	   debug level (see section 3.2.)
-
-			startmode  initial operation mode (see section 2.5.):
-	bas_gigaset )		   1=CAPI (default), 0=Unimodem
-	ser_gigaset )
-	usb_gigaset )	cidmode    initial Call-ID mode setting (see section
-				   2.6.): 1=on (default), 0=off
-
-	=============== ========== ==========================================
-
-     Depending on your distribution you may want to create a separate module
-     configuration file like /etc/modprobe.d/gigaset.conf for these.
-
-2.2. Device nodes for user space programs
------------------------------------------
-
-     The device can be accessed from user space (e.g. by the user space tools
-     mentioned in 1.2.) through the device nodes:
-
-     - /dev/ttyGS0 for M101 (RS232 data boxes)
-     - /dev/ttyGU0 for M105 (USB data boxes)
-     - /dev/ttyGB0 for the base driver (direct USB connection)
-
-     If you connect more than one device of a type, they will get consecutive
-     device nodes, e.g. /dev/ttyGU1 for a second M105.
-
-     You can also set a "default device" for the user space tools to use when
-     no device node is given as parameter, by creating a symlink /dev/ttyG to
-     one of them, e.g.::
-
-	ln -s /dev/ttyGB0 /dev/ttyG
-
-     The devices accept the following device specific ioctl calls
-     (defined in gigaset_dev.h):
-
-     ``ioctl(int fd, GIGASET_REDIR, int *cmd);``
-
-     If cmd==1, the device is set to be controlled exclusively through the
-     character device node; access from the ISDN subsystem is blocked.
-
-     If cmd==0, the device is set to be used from the ISDN subsystem and does
-     not communicate through the character device node.
-
-     ``ioctl(int fd, GIGASET_CONFIG, int *cmd);``
-
-     (ser_gigaset and usb_gigaset only)
-
-     If cmd==1, the device is set to adapter configuration mode where commands
-     are interpreted by the M10x DECT adapter itself instead of being
-     forwarded to the base station. In this mode, the device accepts the
-     commands described in Siemens document "AT-Kommando Alignment M10x Data"
-     for setting the operation mode, associating with a base station and
-     querying parameters like field strength and signal quality.
-
-     Note that there is no ioctl command for leaving adapter configuration
-     mode and returning to regular operation. In order to leave adapter
-     configuration mode, write the command ATO to the device.
-
-     ``ioctl(int fd, GIGASET_BRKCHARS, unsigned char brkchars[6]);``
-
-     (usb_gigaset only)
-
-     Set the break characters on an M105's internal serial adapter to the six
-     bytes stored in brkchars[]. Unused bytes should be set to zero.
-
-     ioctl(int fd, GIGASET_VERSION, unsigned version[4]);
-     Retrieve version information from the driver. version[0] must be set to
-     one of:
-
-     - GIGVER_DRIVER: retrieve driver version
-     - GIGVER_COMPAT: retrieve interface compatibility version
-     - GIGVER_FWBASE: retrieve the firmware version of the base
-
-     Upon return, version[] is filled with the requested version information.
-
-2.3. CAPI
----------
-
-     The devices will show up as CAPI controllers as soon as the
-     corresponding driver module is loaded, and can then be used with
-     CAPI 2.0 kernel and user space applications. For user space access,
-     the module capi.ko must be loaded.
-
-     Most distributions handle loading and unloading of the various CAPI
-     modules automatically via the command capiinit(1) from the capi4k-utils
-     package or a similar mechanism. Note that capiinit(1) cannot unload the
-     Gigaset drivers because it doesn't support more than one module per
-     driver.
-
-2.5. Unimodem mode
-------------------
-
-     In this mode the device works like a modem connected to a serial port
-     (the /dev/ttyGU0, ... mentioned above) which understands the commands::
-
-	 ATZ                 init, reset
-	     => OK or ERROR
-	 ATD
-	 ATDT                dial
-	     => OK, CONNECT,
-		BUSY,
-		NO DIAL TONE,
-		NO CARRIER,
-		NO ANSWER
-	 <pause>+++<pause>   change to command mode when connected
-	 ATH                 hangup
-
-     You can use some configuration tool of your distribution to configure this
-     "modem" or configure pppd/wvdial manually. There are some example ppp
-     configuration files and chat scripts in the gigaset-VERSION/ppp directory
-     in the driver packages from https://sourceforge.net/projects/gigaset307x/.
-     Please note that the USB drivers are not able to change the state of the
-     control lines. This means you must use "Stupid Mode" if you are using
-     wvdial or you should use the nocrtscts option of pppd.
-     You must also ensure that the ppp_async module is loaded with the parameter
-     flag_time=0. You can do this e.g. by adding a line like::
-
-	options ppp_async flag_time=0
-
-     to an appropriate module configuration file, like::
-
-	/etc/modprobe.d/gigaset.conf.
-
-     Unimodem mode is needed for making some devices [e.g. SX100] work which
-     do not support the regular Gigaset command set. If debug output (see
-     section 3.2.) shows something like this when dialing::
-
-	 CMD Received: ERROR
-	 Available Params: 0
-	 Connection State: 0, Response: -1
-	 gigaset_process_response: resp_code -1 in ConState 0 !
-	 Timeout occurred
-
-     then switching to unimodem mode may help.
-
-     If you have installed the command line tool gigacontr, you can enter
-     unimodem mode using::
-
-	 gigacontr --mode unimodem
-
-     You can switch back using::
-
-	 gigacontr --mode isdn
-
-     You can also put the driver directly into Unimodem mode when it's loaded,
-     by passing the module parameter startmode=0 to the hardware specific
-     module, e.g.::
-
-	modprobe usb_gigaset startmode=0
-
-     or by adding a line like::
-
-	options usb_gigaset startmode=0
-
-     to an appropriate module configuration file, like::
-
-	/etc/modprobe.d/gigaset.conf
-
-2.6. Call-ID (CID) mode
------------------------
-
-     Call-IDs are numbers used to tag commands to, and responses from, the
-     Gigaset base in order to support the simultaneous handling of multiple
-     ISDN calls. Their use can be enabled ("CID mode") or disabled ("Unimodem
-     mode"). Without Call-IDs (in Unimodem mode), only a very limited set of
-     functions is available. It allows outgoing data connections only, but
-     does not signal incoming calls or other base events.
-
-     DECT cordless data devices (M10x) permanently occupy the cordless
-     connection to the base while Call-IDs are activated. As the Gigaset
-     bases only support one DECT data connection at a time, this prevents
-     other DECT cordless data devices from accessing the base.
-
-     During active operation, the driver switches to the necessary mode
-     automatically. However, for the reasons above, the mode chosen when
-     the device is not in use (idle) can be selected by the user.
-
-     - If you want to receive incoming calls, you can use the default
-       settings (CID mode).
-     - If you have several DECT data devices (M10x) which you want to use
-       in turn, select Unimodem mode by passing the parameter "cidmode=0" to
-       the appropriate driver module (ser_gigaset or usb_gigaset).
-
-     If you want both of these at once, you are out of luck.
-
-     You can also use the tty class parameter "cidmode" of the device to
-     change its CID mode while the driver is loaded, e.g.::
-
-	echo 0 > /sys/class/tty/ttyGU0/cidmode
-
-2.7. Dialing Numbers
---------------------
-     The called party number provided by an application for dialing out must
-     be a public network number according to the local dialing plan, without
-     any dial prefix for getting an outside line.
-
-     Internal calls can be made by providing an internal extension number
-     prefixed with ``**`` (two asterisks) as the called party number. So to dial
-     e.g. the first registered DECT handset, give ``**11`` as the called party
-     number. Dialing ``***`` (three asterisks) calls all extensions
-     simultaneously (global call).
-
-     Unimodem mode does not support internal calls.
-
-2.8. Unregistered Wireless Devices (M101/M105)
-----------------------------------------------
-
-     The main purpose of the ser_gigaset and usb_gigaset drivers is to allow
-     the M101 and M105 wireless devices to be used as ISDN devices for ISDN
-     connections through a Gigaset base. Therefore they assume that the device
-     is registered to a DECT base.
-
-     If the M101/M105 device is not registered to a base, initialization of
-     the device fails, and a corresponding error message is logged by the
-     driver. In that situation, a restricted set of functions is available
-     which includes, in particular, those necessary for registering the device
-     to a base or for switching it between Fixed Part and Portable Part
-     modes. See the gigacontr(8) manpage for details.
-
-3.   Troubleshooting
-====================
-
-3.1. Solutions to frequently reported problems
-----------------------------------------------
-
-     Problem:
-	You have a slow provider and isdn4linux gives up dialing too early.
-     Solution:
-	Load the isdn module using the dialtimeout option. You can do this e.g.
-	by adding a line like::
-
-	   options isdn dialtimeout=15
-
-	to /etc/modprobe.d/gigaset.conf or a similar file.
-
-     Problem:
-	The isdnlog program emits error messages or just doesn't work.
-     Solution:
-	Isdnlog supports only the HiSax driver. Do not attempt to use it with
-	other drivers such as Gigaset.
-
-     Problem:
-	You have two or more DECT data adapters (M101/M105) and only the
-	first one you turn on works.
-     Solution:
-	Select Unimodem mode for all DECT data adapters. (see section 2.5.)
-
-     Problem:
-	Messages like this::
-
-	    usb_gigaset 3-2:1.0: Could not initialize the device.
-
-	appear in your syslog.
-     Solution:
-	Check whether your M10x wireless device is correctly registered to the
-	Gigaset base. (see section 2.8.)
-
-3.2. Telling the driver to provide more information
----------------------------------------------------
-     Building the driver with the "Gigaset debugging" kernel configuration
-     option (CONFIG_GIGASET_DEBUG) gives it the ability to produce additional
-     information useful for debugging.
-
-     You can control the amount of debugging information the driver produces by
-     writing an appropriate value to /sys/module/gigaset/parameters/debug,
-     e.g.::
-
-	echo 0 > /sys/module/gigaset/parameters/debug
-
-     switches off debugging output completely,
-
-     ::
-
-	echo 0x302020 > /sys/module/gigaset/parameters/debug
-
-     enables a reasonable set of debugging output messages. These values are
-     bit patterns where every bit controls a certain type of debugging output.
-     See the constants DEBUG_* in the source file gigaset.h for details.
-
-     The initial value can be set using the debug parameter when loading the
-     module "gigaset", e.g. by adding a line::
-
-	options gigaset debug=0
-
-     to your module configuration file, e.g. /etc/modprobe.d/gigaset.conf
-
-     Generated debugging information can be found
-     - as output of the command::
-
-	 dmesg
-
-     - in system log files written by your syslog daemon, usually
-       in /var/log/, e.g. /var/log/messages.
-
-3.3. Reporting problems and bugs
---------------------------------
-     If you can't solve problems with the driver on your own, feel free to
-     use one of the forums, bug trackers, or mailing lists on
-
-	 https://sourceforge.net/projects/gigaset307x
-
-     or write an electronic mail to the maintainers.
-
-     Try to provide as much information as possible, such as
-
-     - distribution
-     - kernel version (uname -r)
-     - gcc version (gcc --version)
-     - hardware architecture (uname -m, ...)
-     - type and firmware version of your device (base and wireless module,
-       if any)
-     - output of "lsusb -v" (if using a USB device)
-     - error messages
-     - relevant system log messages (it would help if you activate debug
-       output as described in 3.2.)
-
-     For help with general configuration problems not specific to our driver,
-     such as isdn4linux and network configuration issues, please refer to the
-     appropriate forums and newsgroups.
-
-3.4. Reporting problem solutions
---------------------------------
-     If you solved a problem with our drivers, wrote startup scripts for your
-     distribution, ... feel free to contact us (using one of the places
-     mentioned in 3.3.). We'd like to add scripts, hints, documentation
-     to the driver and/or the project web page.
-
-
-4.   Links, other software
-==========================
-
-     - Sourceforge project developing this driver and associated tools
-	 https://sourceforge.net/projects/gigaset307x
-     - Yahoo! Group on the Siemens Gigaset family of devices
-	 https://de.groups.yahoo.com/group/Siemens-Gigaset
-     - Siemens Gigaset/T-Sinus compatibility table
-	 http://www.erbze.info/sinus_gigaset.htm
-	    (archived at https://web.archive.org/web/20100717020421/http://www.erbze.info:80/sinus_gigaset.htm )
-
-
-5.   Credits
-============
-
-     Thanks to
-
-     Karsten Keil
-	for his help with isdn4linux
-     Deti Fliegl
-	for his base driver code
-     Dennis Dietrich
-	for his kernel 2.6 patches
-     Andreas Rummel
-	for his work and logs to get unimodem mode working
-     Andreas Degert
-	for his logs and patches to get cx 100 working
-     Dietrich Feist
-	for his generous donation of one M105 and two M101 cordless adapters
-     Christoph Schweers
-	for his generous donation of an M34 device
-
-     and all the other people who sent logs and other information.
diff --git a/Documentation/isdn/hysdn.rst b/Documentation/isdn/hysdn.rst
deleted file mode 100644
index 0a168d1cbffc..000000000000
--- a/Documentation/isdn/hysdn.rst
+++ /dev/null
@@ -1,196 +0,0 @@
-============
-Hysdn Driver
-============
-
-The hysdn driver has been written by
-Werner Cornelius (werner@isdn4linux.de or werner@titro.de)
-for Hypercope GmbH Aachen Germany. Hypercope agreed to publish this driver
-under the GNU General Public License.
-
-The CAPI 2.0-support was added by Ulrich Albrecht (ualbrecht@hypercope.de)
-for Hypercope GmbH Aachen, Germany.
-
-
-    This program is free software; you can redistribute it and/or modify
-    it under the terms of the GNU General Public License as published by
-    the Free Software Foundation; either version 2 of the License, or
-    (at your option) any later version.
-
-    This program is distributed in the hope that it will be useful,
-    but WITHOUT ANY WARRANTY; without even the implied warranty of
-    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-    GNU General Public License for more details.
-
-    You should have received a copy of the GNU General Public License
-    along with this program; if not, write to the Free Software
-    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
-
-.. Table of contents
-
-    1. About the driver
-
-    2. Loading/Unloading the driver
-
-    3. Entries in the /proc filesystem
-
-    4. The /proc/net/hysdn/cardconfX file
-
-    5. The /proc/net/hysdn/cardlogX file
-
-    6. Where to get additional info and help
-
-
-1. About the driver
-===================
-
-   The drivers/isdn/hysdn subdir contains a driver for HYPERCOPEs active
-   PCI ISDN cards Champ, Ergo and Metro. To enable support for these cards
-   enable ISDN support in the kernel config and support for HYSDN cards in
-   the active cards submenu. The driver may only be compiled and used if
-   support for loadable modules and the process filesystem have been enabled.
-
-   These cards provide two different interfaces to the kernel. Without the
-   optional CAPI 2.0 support, they register as an Ethernet card. IP routing
-   to an ISDN destination is performed on the card itself. All necessary
-   handlers for various protocols like ppp and others as well as config info
-   and firmware may be fetched from Hypercopes WWW-Site www.hypercope.de.
-
-   With CAPI 2.0 support enabled, the card can also be used as a CAPI 2.0
-   compliant device with either CAPI 2.0 applications
-   (check isdn4k-utils) or -using the capidrv module- as a regular
-   isdn4linux device. This is done via the same mechanism as with the
-   active AVM cards and in fact uses the same module.
-
-
-2. Loading/Unloading the driver
-===============================
-
-   The module has no command line parameters and auto detects up to 10 cards
-   in the id-range 0-9.
-   Before a loaded driver can be unloaded, all open files in the /proc/net/hysdn
-   subdir must be closed and all ethernet interfaces allocated by this
-   driver must be shut down. Otherwise the module use counter will prevent the
-   unload.
-
-   If you are using the CAPI 2.0-interface, make sure to load/modprobe the
-   kernelcapi-module first.
-
-   If you plan to use the capidrv-link to isdn4linux, make sure to load
-   capidrv.o after all modules using this driver (i.e. after hysdn and
-   any avm-specific modules).
-
-3. Entries in the /proc filesystem
-==================================
-
-   When the module has been loaded it adds the directory hysdn in the
-   /proc/net tree. This directory contains exactly 2 file entries for each
-   card. One is called cardconfX and the other cardlogX, where X is the
-   card id number from 0 to 9.
-   The cards are numbered in the order found in the PCI config data.
-
-4. The /proc/net/hysdn/cardconfX file
-=====================================
-
-   This file may be read by everyone to get info about the card's type,
-   current state, available features and used resources.
-   The first 3 entries (id, bus and slot) are PCI info fields, the following
-   type field gives the information about the card's type:
-
-   - 4 -> Ergo card (server card with 2 b-chans)
-   - 5 -> Metro card (server card with 4 or 8 b-chans)
-   - 6 -> Champ card (client card with 2 b-chans)
-
-   The following 3 fields show the hardware assignments for irq, iobase and the
-   dual ported memory (dp-mem).
-
-   The fields b-chans and fax-chans announce the available card resources of
-   these types for the user.
-
-   The state variable indicates the current driver state for this card with the
-   following assignments.
-
-   - 0 -> card has not been booted since driver load
-   - 1 -> card booting is currently in progress
-   - 2 -> card is in an error state due to a previous boot failure
-   - 3 -> card is booted and active
-
-   And the last field (device) shows the name of the ethernet device assigned
-   to this card. Up to the first successful boot this field only shows a -
-   to tell that no net device has been allocated up to now. Once a net device
-   has been allocated it remains assigned to this card, even if a card is
-   rebooted and a boot error occurs.
-
-   Writing to the cardconfX file boots the card or transfers config lines to
-   the card's firmware. The type of data is automatically detected when the
-   first data is written. Only root has write access to this file.
-   The firmware boot files are normally called hyclient.pof for client cards
-   and hyserver.pof for server cards.
-   After successfully writing the boot file, complete config files or single
-   config lines may be copied to this file.
-   If an error occurs the return value given to the writing process has the
-   following additional codes (decimal):
-
-   ==== ============================================
-   1000 Another process is currently booting the card
-   1001 Invalid firmware header
-   1002 Boards dual-port RAM test failed
-   1003 Internal firmware handler error
-   1004 Boot image size invalid
-   1005 First boot stage (bootstrap loader) failed
-   1006 Second boot stage failure
-   1007 Timeout waiting for card ready during boot
-   1008 Operation only allowed in booted state
-   1009 Config line too long
-   1010 Invalid channel number
-   1011 Timeout sending config data
-   ==== ============================================
-
-   Additional info about error reasons may be fetched from the log output.
-
-5. The /proc/net/hysdn/cardlogX file
-====================================
-
-   The cardlogX file entry may be opened multiple times for reading by everyone
-   to get the card's and driver's log data. Card messages always start with the
-   keyword LOG. All other lines are output from the driver.
-   The driver log data may be redirected to the syslog by selecting the
-   appropriate bitmask. The card's log messages will always be sent to this
-   interface but never to the syslog.
-
-   A root user may write a decimal or hex (with 0x) value to this file to select
-   desired output options. As mentioned above, the card's log data is always
-   written to the cardlog file, independent of the following options, which are
-   only used to check and debug the driver itself:
-
-   For example::
-
-	echo "0x34560078" > /proc/net/hysdn/cardlog0
-
-   to output the hex log mask 34560078 for card 0.
-
-   The written value is interpreted as an unsigned 32-bit value; OR together the
-   bits for the desired output. The following bits are already assigned:
-
-   ==========   ============================================================
-   0x80000000   All driver log data is alternatively sent via syslog
-   0x00000001   Log memory allocation errors
-   0x00000010   Firmware load start and close are logged
-   0x00000020   Log firmware record parser
-   0x00000040   Log every firmware write action
-   0x00000080   Log all card related boot messages
-   0x00000100   Output all config data sent for debugging purposes
-   0x00000200   Only non comment config lines are shown with channel
-   0x00000400   Additional conf log output
-   0x00001000   Log the asynchronous scheduler actions (config and log)
-   0x00100000   Log all open and close actions to /proc/net/hysdn/card files
-   0x00200000   Log all actions from /proc file entries
-   0x00010000   Log network interface init and deinit
-   ==========   ============================================================
-
-6. Where to get additional info and help
-========================================
-
-   If you have any problems concerning the driver or configuration contact
-   the Hypercope support team (support@hypercope.de) and/or the authors
-   Werner Cornelius (werner@isdn4linux or cornelius@titro.de) or
-   Ulrich Albrecht (ualbrecht@hypercope.de).
diff --git a/Documentation/isdn/index.rst b/Documentation/isdn/index.rst
index 407e74b78372..9622939fa526 100644
--- a/Documentation/isdn/index.rst
+++ b/Documentation/isdn/index.rst
@@ -9,9 +9,6 @@ ISDN
 
    interface_capi
 
-   avmb1
-   gigaset
-   hysdn
    m_isdn
 
    credits
diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 4ef86433bd67..2e91370dc159 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -132,7 +132,6 @@ Code  Seq#    Include File                                           Comments
 'F'   80-8F  linux/arcfb.h                                           conflict!
 'F'   DD     video/sstfb.h                                           conflict!
 'G'   00-3F  drivers/misc/sgi-gru/grulib.h                           conflict!
-'G'   00-0F  linux/gigaset_dev.h                                     conflict!
 'H'   00-7F  linux/hiddev.h                                          conflict!
 'H'   00-0F  linux/hidraw.h                                          conflict!
 'H'   01     linux/mei.h                                             conflict!
diff --git a/MAINTAINERS b/MAINTAINERS
index bd5847e802de..e53e862bf2c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8804,7 +8804,7 @@ S:	Maintained
 F:	drivers/isdn/mISDN
 F:	drivers/isdn/hardware
 
-ISDN/CAPI SUBSYSTEM
+ISDN/CMTP OVER BLUETOOTH
 M:	Karsten Keil <isdn@linux-pingi.de>
 L:	isdn4linux@listserv.isdn4linux.de (subscribers-only)
 L:	netdev@vger.kernel.org
@@ -8812,7 +8812,6 @@ W:	http://www.isdn4linux.de
 S:	Odd Fixes
 F:	Documentation/isdn/
 F:	drivers/isdn/capi/
-F:	drivers/staging/isdn/
 F:	net/bluetooth/cmtp/
 F:	include/linux/isdn/
 F:	include/uapi/linux/isdn/
diff --git a/drivers/staging/Kconfig b/drivers/staging/Kconfig
index eaf753b70ec5..3a334a9dd453 100644
--- a/drivers/staging/Kconfig
+++ b/drivers/staging/Kconfig
@@ -116,8 +116,6 @@ source "drivers/staging/fieldbus/Kconfig"
 
 source "drivers/staging/kpc2000/Kconfig"
 
-source "drivers/staging/isdn/Kconfig"
-
 source "drivers/staging/wusbcore/Kconfig"
 source "drivers/staging/uwb/Kconfig"
 
diff --git a/drivers/staging/Makefile b/drivers/staging/Makefile
index 0a4396c9067b..fb0c518ce07c 100644
--- a/drivers/staging/Makefile
+++ b/drivers/staging/Makefile
@@ -48,7 +48,6 @@ obj-$(CONFIG_STAGING_GASKET_FRAMEWORK)	+= gasket/
 obj-$(CONFIG_XIL_AXIS_FIFO)	+= axis-fifo/
 obj-$(CONFIG_FIELDBUS_DEV)     += fieldbus/
 obj-$(CONFIG_KPC2000)		+= kpc2000/
-obj-$(CONFIG_ISDN_CAPI)		+= isdn/
 obj-$(CONFIG_UWB)		+= uwb/
 obj-$(CONFIG_USB_WUSB)		+= wusbcore/
 obj-$(CONFIG_EXFAT_FS)		+= exfat/
diff --git a/drivers/staging/isdn/Kconfig b/drivers/staging/isdn/Kconfig
deleted file mode 100644
index faaf63887094..000000000000
--- a/drivers/staging/isdn/Kconfig
+++ /dev/null
@@ -1,12 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-menu "ISDN CAPI drivers"
-	depends on ISDN_CAPI
-
-source "drivers/staging/isdn/avm/Kconfig"
-
-source "drivers/staging/isdn/gigaset/Kconfig"
-
-source "drivers/staging/isdn/hysdn/Kconfig"
-
-endmenu
-
diff --git a/drivers/staging/isdn/Makefile b/drivers/staging/isdn/Makefile
deleted file mode 100644
index 025504bae5df..000000000000
--- a/drivers/staging/isdn/Makefile
+++ /dev/null
@@ -1,8 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# Makefile for the kernel ISDN subsystem and device drivers.
-
-# Object files in subdirectories
-
-obj-$(CONFIG_CAPI_AVM)			+= avm/
-obj-$(CONFIG_HYSDN)			+= hysdn/
-obj-$(CONFIG_ISDN_DRV_GIGASET)		+= gigaset/
diff --git a/drivers/staging/isdn/TODO b/drivers/staging/isdn/TODO
deleted file mode 100644
index 9210d11eb68b..000000000000
--- a/drivers/staging/isdn/TODO
+++ /dev/null
@@ -1,22 +0,0 @@
-TODO: Remove in late 2019 unless there are users
-
-
-I tried to find any indication of whether the capi drivers are
-still in use, and have not found anything  from a long time ago.
-
-With public ISDN networks almost completely shut down over the past 12
-months, there is very little you can actually do with this hardware. The
-main remaining use case would be to connect ISDN voice phones to an
-in-house installation with Asterisk or LCR, but anyone trying this in
-turn seems to be using either the mISDN driver stack, or out-of-tree
-drivers from the hardware vendors.
-
-I may of course have missed something, so I would suggest moving
-these into drivers/staging/ just in case someone still uses one
-of the three remaining in-kernel drivers (avm, hysdn, gigaset).
-
-If nobody complains, we can remove them entirely in six months,
-or otherwise move the core code and any drivers that are still
-needed back into drivers/isdn.
-
-  Arnd Bergmann <arnd@arndb.de>
diff --git a/drivers/staging/isdn/avm/Kconfig b/drivers/staging/isdn/avm/Kconfig
deleted file mode 100644
index 81483db067bb..000000000000
--- a/drivers/staging/isdn/avm/Kconfig
+++ /dev/null
@@ -1,65 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-#
-# ISDN AVM drivers
-#
-
-menuconfig CAPI_AVM
-	bool "Active AVM cards"
-	help
-	  Enable support for AVM active ISDN cards.
-
-if CAPI_AVM
-
-config ISDN_DRV_AVMB1_B1ISA
-	tristate "AVM B1 ISA support"
-	depends on ISA
-	help
-	  Enable support for the ISA version of the AVM B1 card.
-
-config ISDN_DRV_AVMB1_B1PCI
-	tristate "AVM B1 PCI support"
-	depends on PCI
-	help
-	  Enable support for the PCI version of the AVM B1 card.
-
-config ISDN_DRV_AVMB1_B1PCIV4
-	bool "AVM B1 PCI V4 support"
-	depends on ISDN_DRV_AVMB1_B1PCI
-	help
-	  Enable support for the V4 version of AVM B1 PCI card.
-
-config ISDN_DRV_AVMB1_T1ISA
-	tristate "AVM T1/T1-B ISA support"
-	depends on ISA
-	help
-	  Enable support for the AVM T1 T1B card.
-	  Note: This is a PRI card and handle 30 B-channels.
-
-config ISDN_DRV_AVMB1_B1PCMCIA
-	tristate "AVM B1/M1/M2 PCMCIA support"
-	depends on PCMCIA
-	help
-	  Enable support for the PCMCIA version of the AVM B1 card.
-
-config ISDN_DRV_AVMB1_AVM_CS
-	tristate "AVM B1/M1/M2 PCMCIA cs module"
-	depends on ISDN_DRV_AVMB1_B1PCMCIA
-	help
-	  Enable the PCMCIA client driver for the AVM B1/M1/M2
-	  PCMCIA cards.
-
-config ISDN_DRV_AVMB1_T1PCI
-	tristate "AVM T1/T1-B PCI support"
-	depends on PCI
-	help
-	  Enable support for the AVM T1 T1B card.
-	  Note: This is a PRI card and handle 30 B-channels.
-
-config ISDN_DRV_AVMB1_C4
-	tristate "AVM C4/C2 support"
-	depends on PCI
-	help
-	  Enable support for the AVM C4/C2 PCI cards.
-	  These cards handle 4/2 BRI ISDN lines (8/4 channels).
-
-endif # CAPI_AVM
diff --git a/drivers/staging/isdn/avm/Makefile b/drivers/staging/isdn/avm/Makefile
deleted file mode 100644
index 3830a0573fcc..000000000000
--- a/drivers/staging/isdn/avm/Makefile
+++ /dev/null
@@ -1,12 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-# Makefile for the AVM ISDN device drivers
-
-# Each configuration option enables a list of files.
-
-obj-$(CONFIG_ISDN_DRV_AVMB1_B1ISA)	+= b1isa.o b1.o
-obj-$(CONFIG_ISDN_DRV_AVMB1_B1PCI)	+= b1pci.o b1.o b1dma.o
-obj-$(CONFIG_ISDN_DRV_AVMB1_B1PCMCIA)	+= b1pcmcia.o b1.o
-obj-$(CONFIG_ISDN_DRV_AVMB1_AVM_CS)	+= avm_cs.o
-obj-$(CONFIG_ISDN_DRV_AVMB1_T1ISA)	+= t1isa.o b1.o
-obj-$(CONFIG_ISDN_DRV_AVMB1_T1PCI)	+= t1pci.o b1.o b1dma.o
-obj-$(CONFIG_ISDN_DRV_AVMB1_C4)		+= c4.o b1.o
diff --git a/drivers/staging/isdn/avm/avm_cs.c b/drivers/staging/isdn/avm/avm_cs.c
deleted file mode 100644
index 62b8030ee331..000000000000
--- a/drivers/staging/isdn/avm/avm_cs.c
+++ /dev/null
@@ -1,166 +0,0 @@
-/* $Id: avm_cs.c,v 1.4.6.3 2001/09/23 22:24:33 kai Exp $
- *
- * A PCMCIA client driver for AVM B1/M1/M2
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/init.h>
-#include <linux/ptrace.h>
-#include <linux/string.h>
-#include <linux/tty.h>
-#include <linux/serial.h>
-#include <linux/major.h>
-#include <asm/io.h>
-
-#include <pcmcia/cistpl.h>
-#include <pcmcia/ciscode.h>
-#include <pcmcia/ds.h>
-#include <pcmcia/cisreg.h>
-
-#include <linux/skbuff.h>
-#include <linux/capi.h>
-#include <linux/b1lli.h>
-#include <linux/b1pcmcia.h>
-
-/*====================================================================*/
-
-MODULE_DESCRIPTION("CAPI4Linux: PCMCIA client driver for AVM B1/M1/M2");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/*====================================================================*/
-
-static int avmcs_config(struct pcmcia_device *link);
-static void avmcs_release(struct pcmcia_device *link);
-static void avmcs_detach(struct pcmcia_device *p_dev);
-
-static int avmcs_probe(struct pcmcia_device *p_dev)
-{
-	/* General socket configuration */
-	p_dev->config_flags |= CONF_ENABLE_IRQ | CONF_AUTO_SET_IO;
-	p_dev->config_index = 1;
-	p_dev->config_regs = PRESENT_OPTION;
-
-	return avmcs_config(p_dev);
-} /* avmcs_attach */
-
-
-static void avmcs_detach(struct pcmcia_device *link)
-{
-	avmcs_release(link);
-} /* avmcs_detach */
-
-static int avmcs_configcheck(struct pcmcia_device *p_dev, void *priv_data)
-{
-	p_dev->resource[0]->end = 16;
-	p_dev->resource[0]->flags &= ~IO_DATA_PATH_WIDTH;
-	p_dev->resource[0]->flags |= IO_DATA_PATH_WIDTH_8;
-
-	return pcmcia_request_io(p_dev);
-}
-
-static int avmcs_config(struct pcmcia_device *link)
-{
-	int i = -1;
-	char devname[128];
-	int cardtype;
-	int (*addcard)(unsigned int port, unsigned irq);
-
-	devname[0] = 0;
-	if (link->prod_id[1])
-		strlcpy(devname, link->prod_id[1], sizeof(devname));
-
-	/*
-	 * find IO port
-	 */
-	if (pcmcia_loop_config(link, avmcs_configcheck, NULL))
-		return -ENODEV;
-
-	do {
-		if (!link->irq) {
-			/* undo */
-			pcmcia_disable_device(link);
-			break;
-		}
-
-		/*
-		 * configure the PCMCIA socket
-		 */
-		i = pcmcia_enable_device(link);
-		if (i != 0) {
-			pcmcia_disable_device(link);
-			break;
-		}
-
-	} while (0);
-
-	if (devname[0]) {
-		char *s = strrchr(devname, ' ');
-		if (!s)
-			s = devname;
-		else s++;
-		if (strcmp("M1", s) == 0) {
-			cardtype = AVM_CARDTYPE_M1;
-		} else if (strcmp("M2", s) == 0) {
-			cardtype = AVM_CARDTYPE_M2;
-		} else {
-			cardtype = AVM_CARDTYPE_B1;
-		}
-	} else
-		cardtype = AVM_CARDTYPE_B1;
-
-	/* If any step failed, release any partially configured state */
-	if (i != 0) {
-		avmcs_release(link);
-		return -ENODEV;
-	}
-
-
-	switch (cardtype) {
-	case AVM_CARDTYPE_M1: addcard = b1pcmcia_addcard_m1; break;
-	case AVM_CARDTYPE_M2: addcard = b1pcmcia_addcard_m2; break;
-	default:
-	case AVM_CARDTYPE_B1: addcard = b1pcmcia_addcard_b1; break;
-	}
-	if ((i = (*addcard)(link->resource[0]->start, link->irq)) < 0) {
-		dev_err(&link->dev,
-			"avm_cs: failed to add AVM-Controller at i/o %#x, irq %d\n",
-			(unsigned int) link->resource[0]->start, link->irq);
-		avmcs_release(link);
-		return -ENODEV;
-	}
-	return 0;
-
-} /* avmcs_config */
-
-
-static void avmcs_release(struct pcmcia_device *link)
-{
-	b1pcmcia_delcard(link->resource[0]->start, link->irq);
-	pcmcia_disable_device(link);
-} /* avmcs_release */
-
-
-static const struct pcmcia_device_id avmcs_ids[] = {
-	PCMCIA_DEVICE_PROD_ID12("AVM", "ISDN-Controller B1", 0x95d42008, 0x845dc335),
-	PCMCIA_DEVICE_PROD_ID12("AVM", "Mobile ISDN-Controller M1", 0x95d42008, 0x81e10430),
-	PCMCIA_DEVICE_PROD_ID12("AVM", "Mobile ISDN-Controller M2", 0x95d42008, 0x18e8558a),
-	PCMCIA_DEVICE_NULL
-};
-MODULE_DEVICE_TABLE(pcmcia, avmcs_ids);
-
-static struct pcmcia_driver avmcs_driver = {
-	.owner	= THIS_MODULE,
-	.name		= "avm_cs",
-	.probe = avmcs_probe,
-	.remove	= avmcs_detach,
-	.id_table = avmcs_ids,
-};
-module_pcmcia_driver(avmcs_driver);
diff --git a/drivers/staging/isdn/avm/avmcard.h b/drivers/staging/isdn/avm/avmcard.h
deleted file mode 100644
index cdfa89c71997..000000000000
--- a/drivers/staging/isdn/avm/avmcard.h
+++ /dev/null
@@ -1,581 +0,0 @@
-/* $Id: avmcard.h,v 1.1.4.1.2.1 2001/12/21 15:00:17 kai Exp $
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#ifndef _AVMCARD_H_
-#define _AVMCARD_H_
-
-#include <linux/spinlock.h>
-#include <linux/list.h>
-#include <linux/interrupt.h>
-
-#define	AVMB1_PORTLEN		0x1f
-#define AVM_MAXVERSION		8
-#define AVM_NCCI_PER_CHANNEL	4
-
-/*
- * Versions
- */
-
-#define	VER_DRIVER	0
-#define	VER_CARDTYPE	1
-#define	VER_HWID	2
-#define	VER_SERIAL	3
-#define	VER_OPTION	4
-#define	VER_PROTO	5
-#define	VER_PROFILE	6
-#define	VER_CAPI	7
-
-enum avmcardtype {
-	avm_b1isa,
-	avm_b1pci,
-	avm_b1pcmcia,
-	avm_m1,
-	avm_m2,
-	avm_t1isa,
-	avm_t1pci,
-	avm_c4,
-	avm_c2
-};
-
-typedef struct avmcard_dmabuf {
-	long        size;
-	u8       *dmabuf;
-	dma_addr_t  dmaaddr;
-} avmcard_dmabuf;
-
-typedef struct avmcard_dmainfo {
-	u32                recvlen;
-	avmcard_dmabuf       recvbuf;
-
-	avmcard_dmabuf       sendbuf;
-	struct sk_buff_head  send_queue;
-
-	struct pci_dev      *pcidev;
-} avmcard_dmainfo;
-
-typedef	struct avmctrl_info {
-	char cardname[32];
-
-	int versionlen;
-	char versionbuf[1024];
-	char *version[AVM_MAXVERSION];
-
-	char infobuf[128];	/* for function procinfo */
-
-	struct avmcard  *card;
-	struct capi_ctr  capi_ctrl;
-
-	struct list_head ncci_head;
-} avmctrl_info;
-
-typedef struct avmcard {
-	char name[32];
-
-	spinlock_t lock;
-	unsigned int port;
-	unsigned irq;
-	unsigned long membase;
-	enum avmcardtype cardtype;
-	unsigned char revision;
-	unsigned char class;
-	int cardnr; /* for t1isa */
-
-	char msgbuf[128];	/* capimsg msg part */
-	char databuf[2048];	/* capimsg data part */
-
-	void __iomem *mbase;
-	volatile u32 csr;
-	avmcard_dmainfo *dma;
-
-	struct avmctrl_info *ctrlinfo;
-
-	u_int nr_controllers;
-	u_int nlogcontr;
-	struct list_head list;
-} avmcard;
-
-extern int b1_irq_table[16];
-
-/*
- * LLI Messages to the ISDN-ControllerISDN Controller
- */
-
-#define	SEND_POLL		0x72	/*
-					 * after load <- RECEIVE_POLL
-					 */
-#define SEND_INIT		0x11	/*
-					 * first message <- RECEIVE_INIT
-					 * int32 NumApplications  int32
-					 * NumNCCIs int32 BoardNumber
-					 */
-#define SEND_REGISTER		0x12	/*
-					 * register an application int32
-					 * ApplIDId int32 NumMessages
-					 * int32 NumB3Connections int32
-					 * NumB3Blocks int32 B3Size
-					 *
-					 * AnzB3Connection != 0 &&
-					 * AnzB3Blocks >= 1 && B3Size >= 1
-					 */
-#define SEND_RELEASE		0x14	/*
-					 * deregister an application int32
-					 * ApplID
-					 */
-#define SEND_MESSAGE		0x15	/*
-					 * send capi-message int32 length
-					 * capi-data ...
-					 */
-#define SEND_DATA_B3_REQ	0x13	/*
-					 * send capi-data-message int32
-					 * MsgLength capi-data ... int32
-					 * B3Length data ....
-					 */
-
-#define SEND_CONFIG		0x21    /*
-					 */
-
-#define SEND_POLLACK		0x73    /* T1 Watchdog */
-
-/*
- * LLI Messages from the ISDN-ControllerISDN Controller
- */
-
-#define RECEIVE_POLL		0x32	/*
-					 * <- after SEND_POLL
-					 */
-#define RECEIVE_INIT		0x27	/*
-					 * <- after SEND_INIT int32 length
-					 * byte total length b1struct board
-					 * driver revision b1struct card
-					 * type b1struct reserved b1struct
-					 * serial number b1struct driver
-					 * capability b1struct d-channel
-					 * protocol b1struct CAPI-2.0
-					 * profile b1struct capi version
-					 */
-#define RECEIVE_MESSAGE		0x21	/*
-					 * <- after SEND_MESSAGE int32
-					 * AppllID int32 Length capi-data
-					 * ....
-					 */
-#define RECEIVE_DATA_B3_IND	0x22	/*
-					 * received data int32 AppllID
-					 * int32 Length capi-data ...
-					 * int32 B3Length data ...
-					 */
-#define RECEIVE_START		0x23	/*
-					 * Handshake
-					 */
-#define RECEIVE_STOP		0x24	/*
-					 * Handshake
-					 */
-#define RECEIVE_NEW_NCCI	0x25	/*
-					 * int32 AppllID int32 NCCI int32
-					 * WindowSize
-					 */
-#define RECEIVE_FREE_NCCI	0x26	/*
-					 * int32 AppllID int32 NCCI
-					 */
-#define RECEIVE_RELEASE		0x26	/*
-					 * int32 AppllID int32 0xffffffff
-					 */
-#define RECEIVE_TASK_READY	0x31	/*
-					 * int32 tasknr
-					 * int32 Length Taskname ...
-					 */
-#define RECEIVE_DEBUGMSG	0x71	/*
-					 * int32 Length message
-					 *
-					 */
-#define RECEIVE_POLLDWORD	0x75	/* t1pci in dword mode */
-
-#define WRITE_REGISTER		0x00
-#define READ_REGISTER		0x01
-
-/*
- * port offsets
- */
-
-#define B1_READ			0x00
-#define B1_WRITE		0x01
-#define B1_INSTAT		0x02
-#define B1_OUTSTAT		0x03
-#define B1_ANALYSE		0x04
-#define B1_REVISION		0x05
-#define B1_RESET		0x10
-
-
-#define B1_STAT0(cardtype)  ((cardtype) == avm_m1 ? 0x81200000l : 0x80A00000l)
-#define B1_STAT1(cardtype)  (0x80E00000l)
-
-/* ---------------------------------------------------------------- */
-
-static inline unsigned char b1outp(unsigned int base,
-				   unsigned short offset,
-				   unsigned char value)
-{
-	outb(value, base + offset);
-	return inb(base + B1_ANALYSE);
-}
-
-
-static inline int b1_rx_full(unsigned int base)
-{
-	return inb(base + B1_INSTAT) & 0x1;
-}
-
-static inline unsigned char b1_get_byte(unsigned int base)
-{
-	unsigned long stop = jiffies + 1 * HZ;	/* maximum wait time 1 sec */
-	while (!b1_rx_full(base) && time_before(jiffies, stop));
-	if (b1_rx_full(base))
-		return inb(base + B1_READ);
-	printk(KERN_CRIT "b1lli(0x%x): rx not full after 1 second\n", base);
-	return 0;
-}
-
-static inline unsigned int b1_get_word(unsigned int base)
-{
-	unsigned int val = 0;
-	val |= b1_get_byte(base);
-	val |= (b1_get_byte(base) << 8);
-	val |= (b1_get_byte(base) << 16);
-	val |= (b1_get_byte(base) << 24);
-	return val;
-}
-
-static inline int b1_tx_empty(unsigned int base)
-{
-	return inb(base + B1_OUTSTAT) & 0x1;
-}
-
-static inline void b1_put_byte(unsigned int base, unsigned char val)
-{
-	while (!b1_tx_empty(base));
-	b1outp(base, B1_WRITE, val);
-}
-
-static inline int b1_save_put_byte(unsigned int base, unsigned char val)
-{
-	unsigned long stop = jiffies + 2 * HZ;
-	while (!b1_tx_empty(base) && time_before(jiffies, stop));
-	if (!b1_tx_empty(base)) return -1;
-	b1outp(base, B1_WRITE, val);
-	return 0;
-}
-
-static inline void b1_put_word(unsigned int base, unsigned int val)
-{
-	b1_put_byte(base, val & 0xff);
-	b1_put_byte(base, (val >> 8) & 0xff);
-	b1_put_byte(base, (val >> 16) & 0xff);
-	b1_put_byte(base, (val >> 24) & 0xff);
-}
-
-static inline unsigned int b1_get_slice(unsigned int base,
-					unsigned char *dp)
-{
-	unsigned int len, i;
-
-	len = i = b1_get_word(base);
-	while (i-- > 0) *dp++ = b1_get_byte(base);
-	return len;
-}
-
-static inline void b1_put_slice(unsigned int base,
-				unsigned char *dp, unsigned int len)
-{
-	unsigned i = len;
-	b1_put_word(base, i);
-	while (i-- > 0)
-		b1_put_byte(base, *dp++);
-}
-
-static void b1_wr_reg(unsigned int base,
-		      unsigned int reg,
-		      unsigned int value)
-{
-	b1_put_byte(base, WRITE_REGISTER);
-	b1_put_word(base, reg);
-	b1_put_word(base, value);
-}
-
-static inline unsigned int b1_rd_reg(unsigned int base,
-				     unsigned int reg)
-{
-	b1_put_byte(base, READ_REGISTER);
-	b1_put_word(base, reg);
-	return b1_get_word(base);
-
-}
-
-static inline void b1_reset(unsigned int base)
-{
-	b1outp(base, B1_RESET, 0);
-	mdelay(55 * 2);	/* 2 TIC's */
-
-	b1outp(base, B1_RESET, 1);
-	mdelay(55 * 2);	/* 2 TIC's */
-
-	b1outp(base, B1_RESET, 0);
-	mdelay(55 * 2);	/* 2 TIC's */
-}
-
-static inline unsigned char b1_disable_irq(unsigned int base)
-{
-	return b1outp(base, B1_INSTAT, 0x00);
-}
-
-/* ---------------------------------------------------------------- */
-
-static inline void b1_set_test_bit(unsigned int base,
-				   enum avmcardtype cardtype,
-				   int onoff)
-{
-	b1_wr_reg(base, B1_STAT0(cardtype), onoff ? 0x21 : 0x20);
-}
-
-static inline int b1_get_test_bit(unsigned int base,
-				  enum avmcardtype cardtype)
-{
-	return (b1_rd_reg(base, B1_STAT0(cardtype)) & 0x01) != 0;
-}
-
-/* ---------------------------------------------------------------- */
-
-#define T1_FASTLINK		0x00
-#define T1_SLOWLINK		0x08
-
-#define T1_READ			B1_READ
-#define T1_WRITE		B1_WRITE
-#define T1_INSTAT		B1_INSTAT
-#define T1_OUTSTAT		B1_OUTSTAT
-#define T1_IRQENABLE		0x05
-#define T1_FIFOSTAT		0x06
-#define T1_RESETLINK		0x10
-#define T1_ANALYSE		0x11
-#define T1_IRQMASTER		0x12
-#define T1_IDENT		0x17
-#define T1_RESETBOARD		0x1f
-
-#define	T1F_IREADY		0x01
-#define	T1F_IHALF		0x02
-#define	T1F_IFULL		0x04
-#define	T1F_IEMPTY		0x08
-#define	T1F_IFLAGS		0xF0
-
-#define	T1F_OREADY		0x10
-#define	T1F_OHALF		0x20
-#define	T1F_OEMPTY		0x40
-#define	T1F_OFULL		0x80
-#define	T1F_OFLAGS		0xF0
-
-/* there are HEMA cards with 1k and 4k FIFO out */
-#define FIFO_OUTBSIZE		256
-#define FIFO_INPBSIZE		512
-
-#define HEMA_VERSION_ID		0
-#define HEMA_PAL_ID		0
-
-static inline void t1outp(unsigned int base,
-			  unsigned short offset,
-			  unsigned char value)
-{
-	outb(value, base + offset);
-}
-
-static inline unsigned char t1inp(unsigned int base,
-				  unsigned short offset)
-{
-	return inb(base + offset);
-}
-
-static inline int t1_isfastlink(unsigned int base)
-{
-	return (inb(base + T1_IDENT) & ~0x82) == 1;
-}
-
-static inline unsigned char t1_fifostatus(unsigned int base)
-{
-	return inb(base + T1_FIFOSTAT);
-}
-
-static inline unsigned int t1_get_slice(unsigned int base,
-					unsigned char *dp)
-{
-	unsigned int len, i;
-#ifdef FASTLINK_DEBUG
-	unsigned wcnt = 0, bcnt = 0;
-#endif
-
-	len = i = b1_get_word(base);
-	if (t1_isfastlink(base)) {
-		int status;
-		while (i > 0) {
-			status = t1_fifostatus(base) & (T1F_IREADY | T1F_IHALF);
-			if (i >= FIFO_INPBSIZE) status |= T1F_IFULL;
-
-			switch (status) {
-			case T1F_IREADY | T1F_IHALF | T1F_IFULL:
-				insb(base + B1_READ, dp, FIFO_INPBSIZE);
-				dp += FIFO_INPBSIZE;
-				i -= FIFO_INPBSIZE;
-#ifdef FASTLINK_DEBUG
-				wcnt += FIFO_INPBSIZE;
-#endif
-				break;
-			case T1F_IREADY | T1F_IHALF:
-				insb(base + B1_READ, dp, i);
-#ifdef FASTLINK_DEBUG
-				wcnt += i;
-#endif
-				dp += i;
-				i = 0;
-				break;
-			default:
-				*dp++ = b1_get_byte(base);
-				i--;
-#ifdef FASTLINK_DEBUG
-				bcnt++;
-#endif
-				break;
-			}
-		}
-#ifdef FASTLINK_DEBUG
-		if (wcnt)
-			printk(KERN_DEBUG "b1lli(0x%x): get_slice l=%d w=%d b=%d\n",
-			       base, len, wcnt, bcnt);
-#endif
-	} else {
-		while (i-- > 0)
-			*dp++ = b1_get_byte(base);
-	}
-	return len;
-}
-
-static inline void t1_put_slice(unsigned int base,
-				unsigned char *dp, unsigned int len)
-{
-	unsigned i = len;
-	b1_put_word(base, i);
-	if (t1_isfastlink(base)) {
-		int status;
-		while (i > 0) {
-			status = t1_fifostatus(base) & (T1F_OREADY | T1F_OHALF);
-			if (i >= FIFO_OUTBSIZE) status |= T1F_OEMPTY;
-			switch (status) {
-			case T1F_OREADY | T1F_OHALF | T1F_OEMPTY:
-				outsb(base + B1_WRITE, dp, FIFO_OUTBSIZE);
-				dp += FIFO_OUTBSIZE;
-				i -= FIFO_OUTBSIZE;
-				break;
-			case T1F_OREADY | T1F_OHALF:
-				outsb(base + B1_WRITE, dp, i);
-				dp += i;
-				i = 0;
-				break;
-			default:
-				b1_put_byte(base, *dp++);
-				i--;
-				break;
-			}
-		}
-	} else {
-		while (i-- > 0)
-			b1_put_byte(base, *dp++);
-	}
-}
-
-static inline void t1_disable_irq(unsigned int base)
-{
-	t1outp(base, T1_IRQMASTER, 0x00);
-}
-
-static inline void t1_reset(unsigned int base)
-{
-	/* reset T1 Controller */
-	b1_reset(base);
-	/* disable irq on HEMA */
-	t1outp(base, B1_INSTAT, 0x00);
-	t1outp(base, B1_OUTSTAT, 0x00);
-	t1outp(base, T1_IRQMASTER, 0x00);
-	/* reset HEMA board configuration */
-	t1outp(base, T1_RESETBOARD, 0xf);
-}
-
-static inline void b1_setinterrupt(unsigned int base, unsigned irq,
-				   enum avmcardtype cardtype)
-{
-	switch (cardtype) {
-	case avm_t1isa:
-		t1outp(base, B1_INSTAT, 0x00);
-		t1outp(base, B1_INSTAT, 0x02);
-		t1outp(base, T1_IRQMASTER, 0x08);
-		break;
-	case avm_b1isa:
-		b1outp(base, B1_INSTAT, 0x00);
-		b1outp(base, B1_RESET, b1_irq_table[irq]);
-		b1outp(base, B1_INSTAT, 0x02);
-		break;
-	default:
-	case avm_m1:
-	case avm_m2:
-	case avm_b1pci:
-		b1outp(base, B1_INSTAT, 0x00);
-		b1outp(base, B1_RESET, 0xf0);
-		b1outp(base, B1_INSTAT, 0x02);
-		break;
-	case avm_c4:
-	case avm_t1pci:
-		b1outp(base, B1_RESET, 0xf0);
-		break;
-	}
-}
-
-/* b1.c */
-avmcard *b1_alloc_card(int nr_controllers);
-void b1_free_card(avmcard *card);
-int b1_detect(unsigned int base, enum avmcardtype cardtype);
-void b1_getrevision(avmcard *card);
-int b1_load_t4file(avmcard *card, capiloaddatapart *t4file);
-int b1_load_config(avmcard *card, capiloaddatapart *config);
-int b1_loaded(avmcard *card);
-
-int b1_load_firmware(struct capi_ctr *ctrl, capiloaddata *data);
-void b1_reset_ctr(struct capi_ctr *ctrl);
-void b1_register_appl(struct capi_ctr *ctrl, u16 appl,
-		      capi_register_params *rp);
-void b1_release_appl(struct capi_ctr *ctrl, u16 appl);
-u16  b1_send_message(struct capi_ctr *ctrl, struct sk_buff *skb);
-void b1_parse_version(avmctrl_info *card);
-irqreturn_t b1_interrupt(int interrupt, void *devptr);
-
-int b1_proc_show(struct seq_file *m, void *v);
-
-avmcard_dmainfo *avmcard_dma_alloc(char *name, struct pci_dev *,
-				   long rsize, long ssize);
-void avmcard_dma_free(avmcard_dmainfo *);
-
-/* b1dma.c */
-int b1pciv4_detect(avmcard *card);
-int t1pci_detect(avmcard *card);
-void b1dma_reset(avmcard *card);
-irqreturn_t b1dma_interrupt(int interrupt, void *devptr);
-
-int b1dma_load_firmware(struct capi_ctr *ctrl, capiloaddata *data);
-void b1dma_reset_ctr(struct capi_ctr *ctrl);
-void b1dma_remove_ctr(struct capi_ctr *ctrl);
-void b1dma_register_appl(struct capi_ctr *ctrl,
-			 u16 appl,
-			 capi_register_params *rp);
-void b1dma_release_appl(struct capi_ctr *ctrl, u16 appl);
-u16  b1dma_send_message(struct capi_ctr *ctrl, struct sk_buff *skb);
-int b1dma_proc_show(struct seq_file *m, void *v);
-
-#endif /* _AVMCARD_H_ */
diff --git a/drivers/staging/isdn/avm/b1.c b/drivers/staging/isdn/avm/b1.c
deleted file mode 100644
index 32ec8cf31fd0..000000000000
--- a/drivers/staging/isdn/avm/b1.c
+++ /dev/null
@@ -1,819 +0,0 @@
-/* $Id: b1.c,v 1.1.2.2 2004/01/16 21:09:27 keil Exp $
- *
- * Common module for AVM B1 cards.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/pci.h>
-#include <linux/proc_fs.h>
-#include <linux/seq_file.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/capi.h>
-#include <linux/kernelcapi.h>
-#include <linux/slab.h>
-#include <asm/io.h>
-#include <linux/init.h>
-#include <linux/uaccess.h>
-#include <linux/netdevice.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-
-static char *revision = "$Revision: 1.1.2.2 $";
-
-/* ------------------------------------------------------------- */
-
-MODULE_DESCRIPTION("CAPI4Linux: Common support for active AVM cards");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/* ------------------------------------------------------------- */
-
-int b1_irq_table[16] =
-{0,
- 0,
- 0,
- 192,				/* irq 3 */
- 32,				/* irq 4 */
- 160,				/* irq 5 */
- 96,				/* irq 6 */
- 224,				/* irq 7 */
- 0,
- 64,				/* irq 9 */
- 80,				/* irq 10 */
- 208,				/* irq 11 */
- 48,				/* irq 12 */
- 0,
- 0,
- 112,				/* irq 15 */
-};
-
-/* ------------------------------------------------------------- */
-
-avmcard *b1_alloc_card(int nr_controllers)
-{
-	avmcard *card;
-	avmctrl_info *cinfo;
-	int i;
-
-	card = kzalloc(sizeof(*card), GFP_KERNEL);
-	if (!card)
-		return NULL;
-
-	cinfo = kcalloc(nr_controllers, sizeof(*cinfo), GFP_KERNEL);
-	if (!cinfo) {
-		kfree(card);
-		return NULL;
-	}
-
-	card->ctrlinfo = cinfo;
-	for (i = 0; i < nr_controllers; i++) {
-		INIT_LIST_HEAD(&cinfo[i].ncci_head);
-		cinfo[i].card = card;
-	}
-	spin_lock_init(&card->lock);
-	card->nr_controllers = nr_controllers;
-
-	return card;
-}
-
-/* ------------------------------------------------------------- */
-
-void b1_free_card(avmcard *card)
-{
-	kfree(card->ctrlinfo);
-	kfree(card);
-}
-
-/* ------------------------------------------------------------- */
-
-int b1_detect(unsigned int base, enum avmcardtype cardtype)
-{
-	int onoff, i;
-
-	/*
-	 * Statusregister 0000 00xx
-	 */
-	if ((inb(base + B1_INSTAT) & 0xfc)
-	    || (inb(base + B1_OUTSTAT) & 0xfc))
-		return 1;
-	/*
-	 * Statusregister 0000 001x
-	 */
-	b1outp(base, B1_INSTAT, 0x2);	/* enable irq */
-	/* b1outp(base, B1_OUTSTAT, 0x2); */
-	if ((inb(base + B1_INSTAT) & 0xfe) != 0x2
-	    /* || (inb(base + B1_OUTSTAT) & 0xfe) != 0x2 */)
-		return 2;
-	/*
-	 * Statusregister 0000 000x
-	 */
-	b1outp(base, B1_INSTAT, 0x0);	/* disable irq */
-	b1outp(base, B1_OUTSTAT, 0x0);
-	if ((inb(base + B1_INSTAT) & 0xfe)
-	    || (inb(base + B1_OUTSTAT) & 0xfe))
-		return 3;
-
-	for (onoff = !0, i = 0; i < 10; i++) {
-		b1_set_test_bit(base, cardtype, onoff);
-		if (b1_get_test_bit(base, cardtype) != onoff)
-			return 4;
-		onoff = !onoff;
-	}
-
-	if (cardtype == avm_m1)
-		return 0;
-
-	if ((b1_rd_reg(base, B1_STAT1(cardtype)) & 0x0f) != 0x01)
-		return 5;
-
-	return 0;
-}
-
-void b1_getrevision(avmcard *card)
-{
-	card->class = inb(card->port + B1_ANALYSE);
-	card->revision = inb(card->port + B1_REVISION);
-}
-
-#define FWBUF_SIZE	256
-int b1_load_t4file(avmcard *card, capiloaddatapart *t4file)
-{
-	unsigned char buf[FWBUF_SIZE];
-	unsigned char *dp;
-	int i, left;
-	unsigned int base = card->port;
-
-	dp = t4file->data;
-	left = t4file->len;
-	while (left > FWBUF_SIZE) {
-		if (t4file->user) {
-			if (copy_from_user(buf, dp, FWBUF_SIZE))
-				return -EFAULT;
-		} else {
-			memcpy(buf, dp, FWBUF_SIZE);
-		}
-		for (i = 0; i < FWBUF_SIZE; i++)
-			if (b1_save_put_byte(base, buf[i]) < 0) {
-				printk(KERN_ERR "%s: corrupted firmware file ?\n",
-				       card->name);
-				return -EIO;
-			}
-		left -= FWBUF_SIZE;
-		dp += FWBUF_SIZE;
-	}
-	if (left) {
-		if (t4file->user) {
-			if (copy_from_user(buf, dp, left))
-				return -EFAULT;
-		} else {
-			memcpy(buf, dp, left);
-		}
-		for (i = 0; i < left; i++)
-			if (b1_save_put_byte(base, buf[i]) < 0) {
-				printk(KERN_ERR "%s: corrupted firmware file ?\n",
-				       card->name);
-				return -EIO;
-			}
-	}
-	return 0;
-}
-
-int b1_load_config(avmcard *card, capiloaddatapart *config)
-{
-	unsigned char buf[FWBUF_SIZE];
-	unsigned char *dp;
-	unsigned int base = card->port;
-	int i, j, left;
-
-	dp = config->data;
-	left = config->len;
-	if (left) {
-		b1_put_byte(base, SEND_CONFIG);
-		b1_put_word(base, 1);
-		b1_put_byte(base, SEND_CONFIG);
-		b1_put_word(base, left);
-	}
-	while (left > FWBUF_SIZE) {
-		if (config->user) {
-			if (copy_from_user(buf, dp, FWBUF_SIZE))
-				return -EFAULT;
-		} else {
-			memcpy(buf, dp, FWBUF_SIZE);
-		}
-		for (i = 0; i < FWBUF_SIZE; ) {
-			b1_put_byte(base, SEND_CONFIG);
-			for (j = 0; j < 4; j++) {
-				b1_put_byte(base, buf[i++]);
-			}
-		}
-		left -= FWBUF_SIZE;
-		dp += FWBUF_SIZE;
-	}
-	if (left) {
-		if (config->user) {
-			if (copy_from_user(buf, dp, left))
-				return -EFAULT;
-		} else {
-			memcpy(buf, dp, left);
-		}
-		for (i = 0; i < left; ) {
-			b1_put_byte(base, SEND_CONFIG);
-			for (j = 0; j < 4; j++) {
-				if (i < left)
-					b1_put_byte(base, buf[i++]);
-				else
-					b1_put_byte(base, 0);
-			}
-		}
-	}
-	return 0;
-}
-
-int b1_loaded(avmcard *card)
-{
-	unsigned int base = card->port;
-	unsigned long stop;
-	unsigned char ans;
-	unsigned long tout = 2;
-
-	for (stop = jiffies + tout * HZ; time_before(jiffies, stop);) {
-		if (b1_tx_empty(base))
-			break;
-	}
-	if (!b1_tx_empty(base)) {
-		printk(KERN_ERR "%s: b1_loaded: tx err, corrupted t4 file ?\n",
-		       card->name);
-		return 0;
-	}
-	b1_put_byte(base, SEND_POLL);
-	for (stop = jiffies + tout * HZ; time_before(jiffies, stop);) {
-		if (b1_rx_full(base)) {
-			ans = b1_get_byte(base);
-			if (ans == RECEIVE_POLL)
-				return 1;
-
-			printk(KERN_ERR "%s: b1_loaded: got 0x%x, firmware not running\n",
-			       card->name, ans);
-			return 0;
-		}
-	}
-	printk(KERN_ERR "%s: b1_loaded: firmware not running\n", card->name);
-	return 0;
-}
-
-/* ------------------------------------------------------------- */
-
-int b1_load_firmware(struct capi_ctr *ctrl, capiloaddata *data)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-	int retval;
-
-	b1_reset(port);
-	retval = b1_load_t4file(card, &data->firmware);
-
-	if (retval) {
-		b1_reset(port);
-		printk(KERN_ERR "%s: failed to load t4file!!\n",
-		       card->name);
-		return retval;
-	}
-
-	b1_disable_irq(port);
-
-	if (data->configuration.len > 0 && data->configuration.data) {
-		retval = b1_load_config(card, &data->configuration);
-		if (retval) {
-			b1_reset(port);
-			printk(KERN_ERR "%s: failed to load config!!\n",
-			       card->name);
-			return retval;
-		}
-	}
-
-	if (!b1_loaded(card)) {
-		printk(KERN_ERR "%s: failed to load t4file.\n", card->name);
-		return -EIO;
-	}
-
-	spin_lock_irqsave(&card->lock, flags);
-	b1_setinterrupt(port, card->irq, card->cardtype);
-	b1_put_byte(port, SEND_INIT);
-	b1_put_word(port, CAPI_MAXAPPL);
-	b1_put_word(port, AVM_NCCI_PER_CHANNEL * 2);
-	b1_put_word(port, ctrl->cnr - 1);
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	return 0;
-}
-
-void b1_reset_ctr(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-
-	b1_reset(port);
-	b1_reset(port);
-
-	memset(cinfo->version, 0, sizeof(cinfo->version));
-	spin_lock_irqsave(&card->lock, flags);
-	capilib_release(&cinfo->ncci_head);
-	spin_unlock_irqrestore(&card->lock, flags);
-	capi_ctr_down(ctrl);
-}
-
-void b1_register_appl(struct capi_ctr *ctrl,
-		      u16 appl,
-		      capi_register_params *rp)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-	int nconn, want = rp->level3cnt;
-
-	if (want > 0) nconn = want;
-	else nconn = ctrl->profile.nbchannel * -want;
-	if (nconn == 0) nconn = ctrl->profile.nbchannel;
-
-	spin_lock_irqsave(&card->lock, flags);
-	b1_put_byte(port, SEND_REGISTER);
-	b1_put_word(port, appl);
-	b1_put_word(port, 1024 * (nconn + 1));
-	b1_put_word(port, nconn);
-	b1_put_word(port, rp->datablkcnt);
-	b1_put_word(port, rp->datablklen);
-	spin_unlock_irqrestore(&card->lock, flags);
-}
-
-void b1_release_appl(struct capi_ctr *ctrl, u16 appl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-	capilib_release_appl(&cinfo->ncci_head, appl);
-	b1_put_byte(port, SEND_RELEASE);
-	b1_put_word(port, appl);
-	spin_unlock_irqrestore(&card->lock, flags);
-}
-
-u16 b1_send_message(struct capi_ctr *ctrl, struct sk_buff *skb)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-	u16 len = CAPIMSG_LEN(skb->data);
-	u8 cmd = CAPIMSG_COMMAND(skb->data);
-	u8 subcmd = CAPIMSG_SUBCOMMAND(skb->data);
-	u16 dlen, retval;
-
-	spin_lock_irqsave(&card->lock, flags);
-	if (CAPICMD(cmd, subcmd) == CAPI_DATA_B3_REQ) {
-		retval = capilib_data_b3_req(&cinfo->ncci_head,
-					     CAPIMSG_APPID(skb->data),
-					     CAPIMSG_NCCI(skb->data),
-					     CAPIMSG_MSGID(skb->data));
-		if (retval != CAPI_NOERROR) {
-			spin_unlock_irqrestore(&card->lock, flags);
-			return retval;
-		}
-
-		dlen = CAPIMSG_DATALEN(skb->data);
-
-		b1_put_byte(port, SEND_DATA_B3_REQ);
-		b1_put_slice(port, skb->data, len);
-		b1_put_slice(port, skb->data + len, dlen);
-	} else {
-		b1_put_byte(port, SEND_MESSAGE);
-		b1_put_slice(port, skb->data, len);
-	}
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	dev_kfree_skb_any(skb);
-	return CAPI_NOERROR;
-}
-
-/* ------------------------------------------------------------- */
-
-void b1_parse_version(avmctrl_info *cinfo)
-{
-	struct capi_ctr *ctrl = &cinfo->capi_ctrl;
-	avmcard *card = cinfo->card;
-	capi_profile *profp;
-	u8 *dversion;
-	u8 flag;
-	int i, j;
-
-	for (j = 0; j < AVM_MAXVERSION; j++)
-		cinfo->version[j] = "";
-	for (i = 0, j = 0;
-	     j < AVM_MAXVERSION && i < cinfo->versionlen;
-	     j++, i += cinfo->versionbuf[i] + 1)
-		cinfo->version[j] = &cinfo->versionbuf[i + 1];
-
-	strlcpy(ctrl->serial, cinfo->version[VER_SERIAL], sizeof(ctrl->serial));
-	memcpy(&ctrl->profile, cinfo->version[VER_PROFILE], sizeof(capi_profile));
-	strlcpy(ctrl->manu, "AVM GmbH", sizeof(ctrl->manu));
-	dversion = cinfo->version[VER_DRIVER];
-	ctrl->version.majorversion = 2;
-	ctrl->version.minorversion = 0;
-	ctrl->version.majormanuversion = (((dversion[0] - '0') & 0xf) << 4);
-	ctrl->version.majormanuversion |= ((dversion[2] - '0') & 0xf);
-	ctrl->version.minormanuversion = (dversion[3] - '0') << 4;
-	ctrl->version.minormanuversion |=
-		(dversion[5] - '0') * 10 + ((dversion[6] - '0') & 0xf);
-
-	profp = &ctrl->profile;
-
-	flag = ((u8 *)(profp->manu))[1];
-	switch (flag) {
-	case 0: if (cinfo->version[VER_CARDTYPE])
-			strcpy(cinfo->cardname, cinfo->version[VER_CARDTYPE]);
-		else strcpy(cinfo->cardname, "B1");
-		break;
-	case 3: strcpy(cinfo->cardname, "PCMCIA B"); break;
-	case 4: strcpy(cinfo->cardname, "PCMCIA M1"); break;
-	case 5: strcpy(cinfo->cardname, "PCMCIA M2"); break;
-	case 6: strcpy(cinfo->cardname, "B1 V3.0"); break;
-	case 7: strcpy(cinfo->cardname, "B1 PCI"); break;
-	default: sprintf(cinfo->cardname, "AVM?%u", (unsigned int)flag); break;
-	}
-	printk(KERN_NOTICE "%s: card %d \"%s\" ready.\n",
-	       card->name, ctrl->cnr, cinfo->cardname);
-
-	flag = ((u8 *)(profp->manu))[3];
-	if (flag)
-		printk(KERN_NOTICE "%s: card %d Protocol:%s%s%s%s%s%s%s\n",
-		       card->name,
-		       ctrl->cnr,
-		       (flag & 0x01) ? " DSS1" : "",
-		       (flag & 0x02) ? " CT1" : "",
-		       (flag & 0x04) ? " VN3" : "",
-		       (flag & 0x08) ? " NI1" : "",
-		       (flag & 0x10) ? " AUSTEL" : "",
-		       (flag & 0x20) ? " ESS" : "",
-		       (flag & 0x40) ? " 1TR6" : ""
-			);
-
-	flag = ((u8 *)(profp->manu))[5];
-	if (flag)
-		printk(KERN_NOTICE "%s: card %d Linetype:%s%s%s%s\n",
-		       card->name,
-		       ctrl->cnr,
-		       (flag & 0x01) ? " point to point" : "",
-		       (flag & 0x02) ? " point to multipoint" : "",
-		       (flag & 0x08) ? " leased line without D-channel" : "",
-		       (flag & 0x04) ? " leased line with D-channel" : ""
-			);
-}
-
-/* ------------------------------------------------------------- */
-
-irqreturn_t b1_interrupt(int interrupt, void *devptr)
-{
-	avmcard *card = devptr;
-	avmctrl_info *cinfo = &card->ctrlinfo[0];
-	struct capi_ctr *ctrl = &cinfo->capi_ctrl;
-	unsigned char b1cmd;
-	struct sk_buff *skb;
-
-	unsigned ApplId;
-	unsigned MsgLen;
-	unsigned DataB3Len;
-	unsigned NCCI;
-	unsigned WindowSize;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-
-	if (!b1_rx_full(card->port)) {
-		spin_unlock_irqrestore(&card->lock, flags);
-		return IRQ_NONE;
-	}
-
-	b1cmd = b1_get_byte(card->port);
-
-	switch (b1cmd) {
-
-	case RECEIVE_DATA_B3_IND:
-
-		ApplId = (unsigned) b1_get_word(card->port);
-		MsgLen = b1_get_slice(card->port, card->msgbuf);
-		DataB3Len = b1_get_slice(card->port, card->databuf);
-		spin_unlock_irqrestore(&card->lock, flags);
-
-		if (MsgLen < 30) { /* not CAPI 64Bit */
-			memset(card->msgbuf + MsgLen, 0, 30-MsgLen);
-			MsgLen = 30;
-			CAPIMSG_SETLEN(card->msgbuf, 30);
-		}
-
-		skb = alloc_skb(DataB3Len + MsgLen, GFP_ATOMIC);
-		if (!skb) {
-			printk(KERN_ERR "%s: incoming packet dropped\n",
-			       card->name);
-		} else {
-			skb_put_data(skb, card->msgbuf, MsgLen);
-			skb_put_data(skb, card->databuf, DataB3Len);
-			capi_ctr_handle_message(ctrl, ApplId, skb);
-		}
-		break;
-
-	case RECEIVE_MESSAGE:
-
-		ApplId = (unsigned) b1_get_word(card->port);
-		MsgLen = b1_get_slice(card->port, card->msgbuf);
-		skb = alloc_skb(MsgLen, GFP_ATOMIC);
-
-		if (!skb) {
-			printk(KERN_ERR "%s: incoming packet dropped\n",
-			       card->name);
-			spin_unlock_irqrestore(&card->lock, flags);
-		} else {
-			skb_put_data(skb, card->msgbuf, MsgLen);
-			if (CAPIMSG_CMD(skb->data) == CAPI_DATA_B3_CONF)
-				capilib_data_b3_conf(&cinfo->ncci_head, ApplId,
-						     CAPIMSG_NCCI(skb->data),
-						     CAPIMSG_MSGID(skb->data));
-			spin_unlock_irqrestore(&card->lock, flags);
-			capi_ctr_handle_message(ctrl, ApplId, skb);
-		}
-		break;
-
-	case RECEIVE_NEW_NCCI:
-
-		ApplId = b1_get_word(card->port);
-		NCCI = b1_get_word(card->port);
-		WindowSize = b1_get_word(card->port);
-		capilib_new_ncci(&cinfo->ncci_head, ApplId, NCCI, WindowSize);
-		spin_unlock_irqrestore(&card->lock, flags);
-		break;
-
-	case RECEIVE_FREE_NCCI:
-
-		ApplId = b1_get_word(card->port);
-		NCCI = b1_get_word(card->port);
-		if (NCCI != 0xffffffff)
-			capilib_free_ncci(&cinfo->ncci_head, ApplId, NCCI);
-		spin_unlock_irqrestore(&card->lock, flags);
-		break;
-
-	case RECEIVE_START:
-		/* b1_put_byte(card->port, SEND_POLLACK); */
-		spin_unlock_irqrestore(&card->lock, flags);
-		capi_ctr_resume_output(ctrl);
-		break;
-
-	case RECEIVE_STOP:
-		spin_unlock_irqrestore(&card->lock, flags);
-		capi_ctr_suspend_output(ctrl);
-		break;
-
-	case RECEIVE_INIT:
-
-		cinfo->versionlen = b1_get_slice(card->port, cinfo->versionbuf);
-		spin_unlock_irqrestore(&card->lock, flags);
-		b1_parse_version(cinfo);
-		printk(KERN_INFO "%s: %s-card (%s) now active\n",
-		       card->name,
-		       cinfo->version[VER_CARDTYPE],
-		       cinfo->version[VER_DRIVER]);
-		capi_ctr_ready(ctrl);
-		break;
-
-	case RECEIVE_TASK_READY:
-		ApplId = (unsigned) b1_get_word(card->port);
-		MsgLen = b1_get_slice(card->port, card->msgbuf);
-		spin_unlock_irqrestore(&card->lock, flags);
-		card->msgbuf[MsgLen] = 0;
-		while (MsgLen > 0
-		       && (card->msgbuf[MsgLen - 1] == '\n'
-			   || card->msgbuf[MsgLen - 1] == '\r')) {
-			card->msgbuf[MsgLen - 1] = 0;
-			MsgLen--;
-		}
-		printk(KERN_INFO "%s: task %d \"%s\" ready.\n",
-		       card->name, ApplId, card->msgbuf);
-		break;
-
-	case RECEIVE_DEBUGMSG:
-		MsgLen = b1_get_slice(card->port, card->msgbuf);
-		spin_unlock_irqrestore(&card->lock, flags);
-		card->msgbuf[MsgLen] = 0;
-		while (MsgLen > 0
-		       && (card->msgbuf[MsgLen - 1] == '\n'
-			   || card->msgbuf[MsgLen - 1] == '\r')) {
-			card->msgbuf[MsgLen - 1] = 0;
-			MsgLen--;
-		}
-		printk(KERN_INFO "%s: DEBUG: %s\n", card->name, card->msgbuf);
-		break;
-
-	case 0xff:
-		spin_unlock_irqrestore(&card->lock, flags);
-		printk(KERN_ERR "%s: card removed ?\n", card->name);
-		return IRQ_NONE;
-	default:
-		spin_unlock_irqrestore(&card->lock, flags);
-		printk(KERN_ERR "%s: b1_interrupt: 0x%x ???\n",
-		       card->name, b1cmd);
-		return IRQ_HANDLED;
-	}
-	return IRQ_HANDLED;
-}
-
-/* ------------------------------------------------------------- */
-int b1_proc_show(struct seq_file *m, void *v)
-{
-	struct capi_ctr *ctrl = m->private;
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	u8 flag;
-	char *s;
-
-	seq_printf(m, "%-16s %s\n", "name", card->name);
-	seq_printf(m, "%-16s 0x%x\n", "io", card->port);
-	seq_printf(m, "%-16s %d\n", "irq", card->irq);
-	switch (card->cardtype) {
-	case avm_b1isa: s = "B1 ISA"; break;
-	case avm_b1pci: s = "B1 PCI"; break;
-	case avm_b1pcmcia: s = "B1 PCMCIA"; break;
-	case avm_m1: s = "M1"; break;
-	case avm_m2: s = "M2"; break;
-	case avm_t1isa: s = "T1 ISA (HEMA)"; break;
-	case avm_t1pci: s = "T1 PCI"; break;
-	case avm_c4: s = "C4"; break;
-	case avm_c2: s = "C2"; break;
-	default: s = "???"; break;
-	}
-	seq_printf(m, "%-16s %s\n", "type", s);
-	if (card->cardtype == avm_t1isa)
-		seq_printf(m, "%-16s %d\n", "cardnr", card->cardnr);
-
-	s = cinfo->version[VER_DRIVER];
-	if (s)
-		seq_printf(m, "%-16s %s\n", "ver_driver", s);
-
-	s = cinfo->version[VER_CARDTYPE];
-	if (s)
-		seq_printf(m, "%-16s %s\n", "ver_cardtype", s);
-
-	s = cinfo->version[VER_SERIAL];
-	if (s)
-		seq_printf(m, "%-16s %s\n", "ver_serial", s);
-
-	if (card->cardtype != avm_m1) {
-		flag = ((u8 *)(ctrl->profile.manu))[3];
-		if (flag)
-			seq_printf(m, "%-16s%s%s%s%s%s%s%s\n",
-				   "protocol",
-				   (flag & 0x01) ? " DSS1" : "",
-				   (flag & 0x02) ? " CT1" : "",
-				   (flag & 0x04) ? " VN3" : "",
-				   (flag & 0x08) ? " NI1" : "",
-				   (flag & 0x10) ? " AUSTEL" : "",
-				   (flag & 0x20) ? " ESS" : "",
-				   (flag & 0x40) ? " 1TR6" : ""
-				);
-	}
-	if (card->cardtype != avm_m1) {
-		flag = ((u8 *)(ctrl->profile.manu))[5];
-		if (flag)
-			seq_printf(m, "%-16s%s%s%s%s\n",
-				   "linetype",
-				   (flag & 0x01) ? " point to point" : "",
-				   (flag & 0x02) ? " point to multipoint" : "",
-				   (flag & 0x08) ? " leased line without D-channel" : "",
-				   (flag & 0x04) ? " leased line with D-channel" : ""
-				);
-	}
-	seq_printf(m, "%-16s %s\n", "cardname", cinfo->cardname);
-
-	return 0;
-}
-EXPORT_SYMBOL(b1_proc_show);
-
-/* ------------------------------------------------------------- */
-
-#ifdef CONFIG_PCI
-
-avmcard_dmainfo *
-avmcard_dma_alloc(char *name, struct pci_dev *pdev, long rsize, long ssize)
-{
-	avmcard_dmainfo *p;
-	void *buf;
-
-	p = kzalloc(sizeof(avmcard_dmainfo), GFP_KERNEL);
-	if (!p) {
-		printk(KERN_WARNING "%s: no memory.\n", name);
-		goto err;
-	}
-
-	p->recvbuf.size = rsize;
-	buf = pci_alloc_consistent(pdev, rsize, &p->recvbuf.dmaaddr);
-	if (!buf) {
-		printk(KERN_WARNING "%s: allocation of receive dma buffer failed.\n", name);
-		goto err_kfree;
-	}
-	p->recvbuf.dmabuf = buf;
-
-	p->sendbuf.size = ssize;
-	buf = pci_alloc_consistent(pdev, ssize, &p->sendbuf.dmaaddr);
-	if (!buf) {
-		printk(KERN_WARNING "%s: allocation of send dma buffer failed.\n", name);
-		goto err_free_consistent;
-	}
-
-	p->sendbuf.dmabuf = buf;
-	skb_queue_head_init(&p->send_queue);
-
-	return p;
-
-err_free_consistent:
-	pci_free_consistent(p->pcidev, p->recvbuf.size,
-			    p->recvbuf.dmabuf, p->recvbuf.dmaaddr);
-err_kfree:
-	kfree(p);
-err:
-	return NULL;
-}
-
-void avmcard_dma_free(avmcard_dmainfo *p)
-{
-	pci_free_consistent(p->pcidev, p->recvbuf.size,
-			    p->recvbuf.dmabuf, p->recvbuf.dmaaddr);
-	pci_free_consistent(p->pcidev, p->sendbuf.size,
-			    p->sendbuf.dmabuf, p->sendbuf.dmaaddr);
-	skb_queue_purge(&p->send_queue);
-	kfree(p);
-}
-
-EXPORT_SYMBOL(avmcard_dma_alloc);
-EXPORT_SYMBOL(avmcard_dma_free);
-
-#endif
-
-EXPORT_SYMBOL(b1_irq_table);
-
-EXPORT_SYMBOL(b1_alloc_card);
-EXPORT_SYMBOL(b1_free_card);
-EXPORT_SYMBOL(b1_detect);
-EXPORT_SYMBOL(b1_getrevision);
-EXPORT_SYMBOL(b1_load_t4file);
-EXPORT_SYMBOL(b1_load_config);
-EXPORT_SYMBOL(b1_loaded);
-EXPORT_SYMBOL(b1_load_firmware);
-EXPORT_SYMBOL(b1_reset_ctr);
-EXPORT_SYMBOL(b1_register_appl);
-EXPORT_SYMBOL(b1_release_appl);
-EXPORT_SYMBOL(b1_send_message);
-
-EXPORT_SYMBOL(b1_parse_version);
-EXPORT_SYMBOL(b1_interrupt);
-
-static int __init b1_init(void)
-{
-	char *p;
-	char rev[32];
-
-	p = strchr(revision, ':');
-	if (p && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		p = strchr(rev, '$');
-		if (p && p > rev)
-			*(p - 1) = 0;
-	} else {
-		strcpy(rev, "1.0");
-	}
-	printk(KERN_INFO "b1: revision %s\n", rev);
-
-	return 0;
-}
-
-static void __exit b1_exit(void)
-{
-}
-
-module_init(b1_init);
-module_exit(b1_exit);
diff --git a/drivers/staging/isdn/avm/b1dma.c b/drivers/staging/isdn/avm/b1dma.c
deleted file mode 100644
index 6a3dc9937ce5..000000000000
--- a/drivers/staging/isdn/avm/b1dma.c
+++ /dev/null
@@ -1,981 +0,0 @@
-/* $Id: b1dma.c,v 1.1.2.3 2004/02/10 01:07:12 keil Exp $
- *
- * Common module for AVM B1 cards that support dma with AMCC
- *
- * Copyright 2000 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/proc_fs.h>
-#include <linux/seq_file.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/capi.h>
-#include <linux/kernelcapi.h>
-#include <linux/gfp.h>
-#include <asm/io.h>
-#include <linux/init.h>
-#include <linux/uaccess.h>
-#include <linux/netdevice.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-
-static char *revision = "$Revision: 1.1.2.3 $";
-
-#undef AVM_B1DMA_DEBUG
-
-/* ------------------------------------------------------------- */
-
-MODULE_DESCRIPTION("CAPI4Linux: DMA support for active AVM cards");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-static bool suppress_pollack = 0;
-module_param(suppress_pollack, bool, 0);
-
-/* ------------------------------------------------------------- */
-
-static void b1dma_dispatch_tx(avmcard *card);
-
-/* ------------------------------------------------------------- */
-
-/* S5933 */
-
-#define	AMCC_RXPTR	0x24
-#define	AMCC_RXLEN	0x28
-#define	AMCC_TXPTR	0x2c
-#define	AMCC_TXLEN	0x30
-
-#define	AMCC_INTCSR	0x38
-#	define EN_READ_TC_INT		0x00008000L
-#	define EN_WRITE_TC_INT		0x00004000L
-#	define EN_TX_TC_INT		EN_READ_TC_INT
-#	define EN_RX_TC_INT		EN_WRITE_TC_INT
-#	define AVM_FLAG			0x30000000L
-
-#	define ANY_S5933_INT		0x00800000L
-#	define READ_TC_INT		0x00080000L
-#	define WRITE_TC_INT		0x00040000L
-#	define	TX_TC_INT		READ_TC_INT
-#	define	RX_TC_INT		WRITE_TC_INT
-#	define MASTER_ABORT_INT		0x00100000L
-#	define TARGET_ABORT_INT		0x00200000L
-#	define BUS_MASTER_INT		0x00200000L
-#	define ALL_INT			0x000C0000L
-
-#define	AMCC_MCSR	0x3c
-#	define A2P_HI_PRIORITY		0x00000100L
-#	define EN_A2P_TRANSFERS		0x00000400L
-#	define P2A_HI_PRIORITY		0x00001000L
-#	define EN_P2A_TRANSFERS		0x00004000L
-#	define RESET_A2P_FLAGS		0x04000000L
-#	define RESET_P2A_FLAGS		0x02000000L
-
-/* ------------------------------------------------------------- */
-
-static inline void b1dma_writel(avmcard *card, u32 value, int off)
-{
-	writel(value, card->mbase + off);
-}
-
-static inline u32 b1dma_readl(avmcard *card, int off)
-{
-	return readl(card->mbase + off);
-}
-
-/* ------------------------------------------------------------- */
-
-static inline int b1dma_tx_empty(unsigned int port)
-{
-	return inb(port + 0x03) & 0x1;
-}
-
-static inline int b1dma_rx_full(unsigned int port)
-{
-	return inb(port + 0x02) & 0x1;
-}
-
-static int b1dma_tolink(avmcard *card, void *buf, unsigned int len)
-{
-	unsigned long stop = jiffies + 1 * HZ;	/* maximum wait time 1 sec */
-	unsigned char *s = (unsigned char *)buf;
-	while (len--) {
-		while (!b1dma_tx_empty(card->port)
-		       && time_before(jiffies, stop));
-		if (!b1dma_tx_empty(card->port))
-			return -1;
-		t1outp(card->port, 0x01, *s++);
-	}
-	return 0;
-}
-
-static int b1dma_fromlink(avmcard *card, void *buf, unsigned int len)
-{
-	unsigned long stop = jiffies + 1 * HZ;	/* maximum wait time 1 sec */
-	unsigned char *s = (unsigned char *)buf;
-	while (len--) {
-		while (!b1dma_rx_full(card->port)
-		       && time_before(jiffies, stop));
-		if (!b1dma_rx_full(card->port))
-			return -1;
-		*s++ = t1inp(card->port, 0x00);
-	}
-	return 0;
-}
-
-static int WriteReg(avmcard *card, u32 reg, u8 val)
-{
-	u8 cmd = 0x00;
-	if (b1dma_tolink(card, &cmd, 1) == 0
-	    && b1dma_tolink(card, &reg, 4) == 0) {
-		u32 tmp = val;
-		return b1dma_tolink(card, &tmp, 4);
-	}
-	return -1;
-}
-
-static u8 ReadReg(avmcard *card, u32 reg)
-{
-	u8 cmd = 0x01;
-	if (b1dma_tolink(card, &cmd, 1) == 0
-	    && b1dma_tolink(card, &reg, 4) == 0) {
-		u32 tmp;
-		if (b1dma_fromlink(card, &tmp, 4) == 0)
-			return (u8)tmp;
-	}
-	return 0xff;
-}
-
-/* ------------------------------------------------------------- */
-
-static inline void _put_byte(void **pp, u8 val)
-{
-	u8 *s = *pp;
-	*s++ = val;
-	*pp = s;
-}
-
-static inline void _put_word(void **pp, u32 val)
-{
-	u8 *s = *pp;
-	*s++ = val & 0xff;
-	*s++ = (val >> 8) & 0xff;
-	*s++ = (val >> 16) & 0xff;
-	*s++ = (val >> 24) & 0xff;
-	*pp = s;
-}
-
-static inline void _put_slice(void **pp, unsigned char *dp, unsigned int len)
-{
-	unsigned i = len;
-	_put_word(pp, i);
-	while (i-- > 0)
-		_put_byte(pp, *dp++);
-}
-
-static inline u8 _get_byte(void **pp)
-{
-	u8 *s = *pp;
-	u8 val;
-	val = *s++;
-	*pp = s;
-	return val;
-}
-
-static inline u32 _get_word(void **pp)
-{
-	u8 *s = *pp;
-	u32 val;
-	val = *s++;
-	val |= (*s++ << 8);
-	val |= (*s++ << 16);
-	val |= (*s++ << 24);
-	*pp = s;
-	return val;
-}
-
-static inline u32 _get_slice(void **pp, unsigned char *dp)
-{
-	unsigned int len, i;
-
-	len = i = _get_word(pp);
-	while (i-- > 0) *dp++ = _get_byte(pp);
-	return len;
-}
-
-/* ------------------------------------------------------------- */
-
-void b1dma_reset(avmcard *card)
-{
-	card->csr = 0x0;
-	b1dma_writel(card, card->csr, AMCC_INTCSR);
-	b1dma_writel(card, 0, AMCC_MCSR);
-	b1dma_writel(card, 0, AMCC_RXLEN);
-	b1dma_writel(card, 0, AMCC_TXLEN);
-
-	t1outp(card->port, 0x10, 0x00);
-	t1outp(card->port, 0x07, 0x00);
-
-	b1dma_writel(card, 0, AMCC_MCSR);
-	mdelay(10);
-	b1dma_writel(card, 0x0f000000, AMCC_MCSR); /* reset all */
-	mdelay(10);
-	b1dma_writel(card, 0, AMCC_MCSR);
-	if (card->cardtype == avm_t1pci)
-		mdelay(42);
-	else
-		mdelay(10);
-}
-
-/* ------------------------------------------------------------- */
-
-static int b1dma_detect(avmcard *card)
-{
-	b1dma_writel(card, 0, AMCC_MCSR);
-	mdelay(10);
-	b1dma_writel(card, 0x0f000000, AMCC_MCSR); /* reset all */
-	mdelay(10);
-	b1dma_writel(card, 0, AMCC_MCSR);
-	mdelay(42);
-
-	b1dma_writel(card, 0, AMCC_RXLEN);
-	b1dma_writel(card, 0, AMCC_TXLEN);
-	card->csr = 0x0;
-	b1dma_writel(card, card->csr, AMCC_INTCSR);
-
-	if (b1dma_readl(card, AMCC_MCSR) != 0x000000E6)
-		return 1;
-
-	b1dma_writel(card, 0xffffffff, AMCC_RXPTR);
-	b1dma_writel(card, 0xffffffff, AMCC_TXPTR);
-	if (b1dma_readl(card, AMCC_RXPTR) != 0xfffffffc
-	    || b1dma_readl(card, AMCC_TXPTR) != 0xfffffffc)
-		return 2;
-
-	b1dma_writel(card, 0x0, AMCC_RXPTR);
-	b1dma_writel(card, 0x0, AMCC_TXPTR);
-	if (b1dma_readl(card, AMCC_RXPTR) != 0x0
-	    || b1dma_readl(card, AMCC_TXPTR) != 0x0)
-		return 3;
-
-	t1outp(card->port, 0x10, 0x00);
-	t1outp(card->port, 0x07, 0x00);
-
-	t1outp(card->port, 0x02, 0x02);
-	t1outp(card->port, 0x03, 0x02);
-
-	if ((t1inp(card->port, 0x02) & 0xFE) != 0x02
-	    || t1inp(card->port, 0x3) != 0x03)
-		return 4;
-
-	t1outp(card->port, 0x02, 0x00);
-	t1outp(card->port, 0x03, 0x00);
-
-	if ((t1inp(card->port, 0x02) & 0xFE) != 0x00
-	    || t1inp(card->port, 0x3) != 0x01)
-		return 5;
-
-	return 0;
-}
-
-int t1pci_detect(avmcard *card)
-{
-	int ret;
-
-	if ((ret = b1dma_detect(card)) != 0)
-		return ret;
-
-	/* Transputer test */
-
-	if (WriteReg(card, 0x80001000, 0x11) != 0
-	    || WriteReg(card, 0x80101000, 0x22) != 0
-	    || WriteReg(card, 0x80201000, 0x33) != 0
-	    || WriteReg(card, 0x80301000, 0x44) != 0)
-		return 6;
-
-	if (ReadReg(card, 0x80001000) != 0x11
-	    || ReadReg(card, 0x80101000) != 0x22
-	    || ReadReg(card, 0x80201000) != 0x33
-	    || ReadReg(card, 0x80301000) != 0x44)
-		return 7;
-
-	if (WriteReg(card, 0x80001000, 0x55) != 0
-	    || WriteReg(card, 0x80101000, 0x66) != 0
-	    || WriteReg(card, 0x80201000, 0x77) != 0
-	    || WriteReg(card, 0x80301000, 0x88) != 0)
-		return 8;
-
-	if (ReadReg(card, 0x80001000) != 0x55
-	    || ReadReg(card, 0x80101000) != 0x66
-	    || ReadReg(card, 0x80201000) != 0x77
-	    || ReadReg(card, 0x80301000) != 0x88)
-		return 9;
-
-	return 0;
-}
-
-int b1pciv4_detect(avmcard *card)
-{
-	int ret, i;
-
-	if ((ret = b1dma_detect(card)) != 0)
-		return ret;
-
-	for (i = 0; i < 5; i++) {
-		if (WriteReg(card, 0x80A00000, 0x21) != 0)
-			return 6;
-		if ((ReadReg(card, 0x80A00000) & 0x01) != 0x01)
-			return 7;
-	}
-	for (i = 0; i < 5; i++) {
-		if (WriteReg(card, 0x80A00000, 0x20) != 0)
-			return 8;
-		if ((ReadReg(card, 0x80A00000) & 0x01) != 0x00)
-			return 9;
-	}
-
-	return 0;
-}
-
-static void b1dma_queue_tx(avmcard *card, struct sk_buff *skb)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-
-	skb_queue_tail(&card->dma->send_queue, skb);
-
-	if (!(card->csr & EN_TX_TC_INT)) {
-		b1dma_dispatch_tx(card);
-		b1dma_writel(card, card->csr, AMCC_INTCSR);
-	}
-
-	spin_unlock_irqrestore(&card->lock, flags);
-}
-
-/* ------------------------------------------------------------- */
-
-static void b1dma_dispatch_tx(avmcard *card)
-{
-	avmcard_dmainfo *dma = card->dma;
-	struct sk_buff *skb;
-	u8 cmd, subcmd;
-	u16 len;
-	u32 txlen;
-	void *p;
-
-	skb = skb_dequeue(&dma->send_queue);
-
-	len = CAPIMSG_LEN(skb->data);
-
-	if (len) {
-		cmd = CAPIMSG_COMMAND(skb->data);
-		subcmd = CAPIMSG_SUBCOMMAND(skb->data);
-
-		p = dma->sendbuf.dmabuf;
-
-		if (CAPICMD(cmd, subcmd) == CAPI_DATA_B3_REQ) {
-			u16 dlen = CAPIMSG_DATALEN(skb->data);
-			_put_byte(&p, SEND_DATA_B3_REQ);
-			_put_slice(&p, skb->data, len);
-			_put_slice(&p, skb->data + len, dlen);
-		} else {
-			_put_byte(&p, SEND_MESSAGE);
-			_put_slice(&p, skb->data, len);
-		}
-		txlen = (u8 *)p - (u8 *)dma->sendbuf.dmabuf;
-#ifdef AVM_B1DMA_DEBUG
-		printk(KERN_DEBUG "tx: put msg len=%d\n", txlen);
-#endif
-	} else {
-		txlen = skb->len - 2;
-#ifdef AVM_B1DMA_POLLDEBUG
-		if (skb->data[2] == SEND_POLLACK)
-			printk(KERN_INFO "%s: send ack\n", card->name);
-#endif
-#ifdef AVM_B1DMA_DEBUG
-		printk(KERN_DEBUG "tx: put 0x%x len=%d\n",
-		       skb->data[2], txlen);
-#endif
-		skb_copy_from_linear_data_offset(skb, 2, dma->sendbuf.dmabuf,
-						 skb->len - 2);
-	}
-	txlen = (txlen + 3) & ~3;
-
-	b1dma_writel(card, dma->sendbuf.dmaaddr, AMCC_TXPTR);
-	b1dma_writel(card, txlen, AMCC_TXLEN);
-
-	card->csr |= EN_TX_TC_INT;
-
-	dev_kfree_skb_any(skb);
-}
-
-/* ------------------------------------------------------------- */
-
-static void queue_pollack(avmcard *card)
-{
-	struct sk_buff *skb;
-	void *p;
-
-	skb = alloc_skb(3, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, lost poll ack\n",
-		       card->name);
-		return;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_POLLACK);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	b1dma_queue_tx(card, skb);
-}
-
-/* ------------------------------------------------------------- */
-
-static void b1dma_handle_rx(avmcard *card)
-{
-	avmctrl_info *cinfo = &card->ctrlinfo[0];
-	avmcard_dmainfo *dma = card->dma;
-	struct capi_ctr *ctrl = &cinfo->capi_ctrl;
-	struct sk_buff *skb;
-	void *p = dma->recvbuf.dmabuf + 4;
-	u32 ApplId, MsgLen, DataB3Len, NCCI, WindowSize;
-	u8 b1cmd =  _get_byte(&p);
-
-#ifdef AVM_B1DMA_DEBUG
-	printk(KERN_DEBUG "rx: 0x%x %lu\n", b1cmd, (unsigned long)dma->recvlen);
-#endif
-
-	switch (b1cmd) {
-	case RECEIVE_DATA_B3_IND:
-
-		ApplId = (unsigned) _get_word(&p);
-		MsgLen = _get_slice(&p, card->msgbuf);
-		DataB3Len = _get_slice(&p, card->databuf);
-
-		if (MsgLen < 30) { /* not CAPI 64Bit */
-			memset(card->msgbuf + MsgLen, 0, 30 - MsgLen);
-			MsgLen = 30;
-			CAPIMSG_SETLEN(card->msgbuf, 30);
-		}
-		if (!(skb = alloc_skb(DataB3Len + MsgLen, GFP_ATOMIC))) {
-			printk(KERN_ERR "%s: incoming packet dropped\n",
-			       card->name);
-		} else {
-			skb_put_data(skb, card->msgbuf, MsgLen);
-			skb_put_data(skb, card->databuf, DataB3Len);
-			capi_ctr_handle_message(ctrl, ApplId, skb);
-		}
-		break;
-
-	case RECEIVE_MESSAGE:
-
-		ApplId = (unsigned) _get_word(&p);
-		MsgLen = _get_slice(&p, card->msgbuf);
-		if (!(skb = alloc_skb(MsgLen, GFP_ATOMIC))) {
-			printk(KERN_ERR "%s: incoming packet dropped\n",
-			       card->name);
-		} else {
-			skb_put_data(skb, card->msgbuf, MsgLen);
-			if (CAPIMSG_CMD(skb->data) == CAPI_DATA_B3_CONF) {
-				spin_lock(&card->lock);
-				capilib_data_b3_conf(&cinfo->ncci_head, ApplId,
-						     CAPIMSG_NCCI(skb->data),
-						     CAPIMSG_MSGID(skb->data));
-				spin_unlock(&card->lock);
-			}
-			capi_ctr_handle_message(ctrl, ApplId, skb);
-		}
-		break;
-
-	case RECEIVE_NEW_NCCI:
-
-		ApplId = _get_word(&p);
-		NCCI = _get_word(&p);
-		WindowSize = _get_word(&p);
-		spin_lock(&card->lock);
-		capilib_new_ncci(&cinfo->ncci_head, ApplId, NCCI, WindowSize);
-		spin_unlock(&card->lock);
-		break;
-
-	case RECEIVE_FREE_NCCI:
-
-		ApplId = _get_word(&p);
-		NCCI = _get_word(&p);
-
-		if (NCCI != 0xffffffff) {
-			spin_lock(&card->lock);
-			capilib_free_ncci(&cinfo->ncci_head, ApplId, NCCI);
-			spin_unlock(&card->lock);
-		}
-		break;
-
-	case RECEIVE_START:
-#ifdef AVM_B1DMA_POLLDEBUG
-		printk(KERN_INFO "%s: receive poll\n", card->name);
-#endif
-		if (!suppress_pollack)
-			queue_pollack(card);
-		capi_ctr_resume_output(ctrl);
-		break;
-
-	case RECEIVE_STOP:
-		capi_ctr_suspend_output(ctrl);
-		break;
-
-	case RECEIVE_INIT:
-
-		cinfo->versionlen = _get_slice(&p, cinfo->versionbuf);
-		b1_parse_version(cinfo);
-		printk(KERN_INFO "%s: %s-card (%s) now active\n",
-		       card->name,
-		       cinfo->version[VER_CARDTYPE],
-		       cinfo->version[VER_DRIVER]);
-		capi_ctr_ready(ctrl);
-		break;
-
-	case RECEIVE_TASK_READY:
-		ApplId = (unsigned) _get_word(&p);
-		MsgLen = _get_slice(&p, card->msgbuf);
-		card->msgbuf[MsgLen] = 0;
-		while (MsgLen > 0
-		       && (card->msgbuf[MsgLen - 1] == '\n'
-			   || card->msgbuf[MsgLen - 1] == '\r')) {
-			card->msgbuf[MsgLen - 1] = 0;
-			MsgLen--;
-		}
-		printk(KERN_INFO "%s: task %d \"%s\" ready.\n",
-		       card->name, ApplId, card->msgbuf);
-		break;
-
-	case RECEIVE_DEBUGMSG:
-		MsgLen = _get_slice(&p, card->msgbuf);
-		card->msgbuf[MsgLen] = 0;
-		while (MsgLen > 0
-		       && (card->msgbuf[MsgLen - 1] == '\n'
-			   || card->msgbuf[MsgLen - 1] == '\r')) {
-			card->msgbuf[MsgLen - 1] = 0;
-			MsgLen--;
-		}
-		printk(KERN_INFO "%s: DEBUG: %s\n", card->name, card->msgbuf);
-		break;
-
-	default:
-		printk(KERN_ERR "%s: b1dma_interrupt: 0x%x ???\n",
-		       card->name, b1cmd);
-		return;
-	}
-}
-
-/* ------------------------------------------------------------- */
-
-static void b1dma_handle_interrupt(avmcard *card)
-{
-	u32 status;
-	u32 newcsr;
-
-	spin_lock(&card->lock);
-
-	status = b1dma_readl(card, AMCC_INTCSR);
-	if ((status & ANY_S5933_INT) == 0) {
-		spin_unlock(&card->lock);
-		return;
-	}
-
-	newcsr = card->csr | (status & ALL_INT);
-	if (status & TX_TC_INT) newcsr &= ~EN_TX_TC_INT;
-	if (status & RX_TC_INT) newcsr &= ~EN_RX_TC_INT;
-	b1dma_writel(card, newcsr, AMCC_INTCSR);
-
-	if ((status & RX_TC_INT) != 0) {
-		struct avmcard_dmainfo *dma = card->dma;
-		u32 rxlen;
-		if (card->dma->recvlen == 0) {
-			rxlen = b1dma_readl(card, AMCC_RXLEN);
-			if (rxlen == 0) {
-				dma->recvlen = *((u32 *)dma->recvbuf.dmabuf);
-				rxlen = (dma->recvlen + 3) & ~3;
-				b1dma_writel(card, dma->recvbuf.dmaaddr + 4, AMCC_RXPTR);
-				b1dma_writel(card, rxlen, AMCC_RXLEN);
-#ifdef AVM_B1DMA_DEBUG
-			} else {
-				printk(KERN_ERR "%s: rx not complete (%d).\n",
-				       card->name, rxlen);
-#endif
-			}
-		} else {
-			spin_unlock(&card->lock);
-			b1dma_handle_rx(card);
-			dma->recvlen = 0;
-			spin_lock(&card->lock);
-			b1dma_writel(card, dma->recvbuf.dmaaddr, AMCC_RXPTR);
-			b1dma_writel(card, 4, AMCC_RXLEN);
-		}
-	}
-
-	if ((status & TX_TC_INT) != 0) {
-		if (skb_queue_empty(&card->dma->send_queue))
-			card->csr &= ~EN_TX_TC_INT;
-		else
-			b1dma_dispatch_tx(card);
-	}
-	b1dma_writel(card, card->csr, AMCC_INTCSR);
-
-	spin_unlock(&card->lock);
-}
-
-irqreturn_t b1dma_interrupt(int interrupt, void *devptr)
-{
-	avmcard *card = devptr;
-
-	b1dma_handle_interrupt(card);
-	return IRQ_HANDLED;
-}
-
-/* ------------------------------------------------------------- */
-
-static int b1dma_loaded(avmcard *card)
-{
-	unsigned long stop;
-	unsigned char ans;
-	unsigned long tout = 2;
-	unsigned int base = card->port;
-
-	for (stop = jiffies + tout * HZ; time_before(jiffies, stop);) {
-		if (b1_tx_empty(base))
-			break;
-	}
-	if (!b1_tx_empty(base)) {
-		printk(KERN_ERR "%s: b1dma_loaded: tx err, corrupted t4 file ?\n",
-		       card->name);
-		return 0;
-	}
-	b1_put_byte(base, SEND_POLLACK);
-	for (stop = jiffies + tout * HZ; time_before(jiffies, stop);) {
-		if (b1_rx_full(base)) {
-			if ((ans = b1_get_byte(base)) == RECEIVE_POLLDWORD) {
-				return 1;
-			}
-			printk(KERN_ERR "%s: b1dma_loaded: got 0x%x, firmware not running in dword mode\n", card->name, ans);
-			return 0;
-		}
-	}
-	printk(KERN_ERR "%s: b1dma_loaded: firmware not running\n", card->name);
-	return 0;
-}
-
-/* ------------------------------------------------------------- */
-
-static void b1dma_send_init(avmcard *card)
-{
-	struct sk_buff *skb;
-	void *p;
-
-	skb = alloc_skb(15, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, lost register appl.\n",
-		       card->name);
-		return;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_INIT);
-	_put_word(&p, CAPI_MAXAPPL);
-	_put_word(&p, AVM_NCCI_PER_CHANNEL * 30);
-	_put_word(&p, card->cardnr - 1);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	b1dma_queue_tx(card, skb);
-}
-
-int b1dma_load_firmware(struct capi_ctr *ctrl, capiloaddata *data)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	int retval;
-
-	b1dma_reset(card);
-
-	if ((retval = b1_load_t4file(card, &data->firmware))) {
-		b1dma_reset(card);
-		printk(KERN_ERR "%s: failed to load t4file!!\n",
-		       card->name);
-		return retval;
-	}
-
-	if (data->configuration.len > 0 && data->configuration.data) {
-		if ((retval = b1_load_config(card, &data->configuration))) {
-			b1dma_reset(card);
-			printk(KERN_ERR "%s: failed to load config!!\n",
-			       card->name);
-			return retval;
-		}
-	}
-
-	if (!b1dma_loaded(card)) {
-		b1dma_reset(card);
-		printk(KERN_ERR "%s: failed to load t4file.\n", card->name);
-		return -EIO;
-	}
-
-	card->csr = AVM_FLAG;
-	b1dma_writel(card, card->csr, AMCC_INTCSR);
-	b1dma_writel(card, EN_A2P_TRANSFERS | EN_P2A_TRANSFERS | A2P_HI_PRIORITY |
-		     P2A_HI_PRIORITY | RESET_A2P_FLAGS | RESET_P2A_FLAGS,
-		     AMCC_MCSR);
-	t1outp(card->port, 0x07, 0x30);
-	t1outp(card->port, 0x10, 0xF0);
-
-	card->dma->recvlen = 0;
-	b1dma_writel(card, card->dma->recvbuf.dmaaddr, AMCC_RXPTR);
-	b1dma_writel(card, 4, AMCC_RXLEN);
-	card->csr |= EN_RX_TC_INT;
-	b1dma_writel(card, card->csr, AMCC_INTCSR);
-
-	b1dma_send_init(card);
-
-	return 0;
-}
-
-void b1dma_reset_ctr(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-	b1dma_reset(card);
-
-	memset(cinfo->version, 0, sizeof(cinfo->version));
-	capilib_release(&cinfo->ncci_head);
-	spin_unlock_irqrestore(&card->lock, flags);
-	capi_ctr_down(ctrl);
-}
-
-/* ------------------------------------------------------------- */
-
-void b1dma_register_appl(struct capi_ctr *ctrl,
-			 u16 appl,
-			 capi_register_params *rp)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	struct sk_buff *skb;
-	int want = rp->level3cnt;
-	int nconn;
-	void *p;
-
-	if (want > 0) nconn = want;
-	else nconn = ctrl->profile.nbchannel * -want;
-	if (nconn == 0) nconn = ctrl->profile.nbchannel;
-
-	skb = alloc_skb(23, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, lost register appl.\n",
-		       card->name);
-		return;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_REGISTER);
-	_put_word(&p, appl);
-	_put_word(&p, 1024 * (nconn + 1));
-	_put_word(&p, nconn);
-	_put_word(&p, rp->datablkcnt);
-	_put_word(&p, rp->datablklen);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	b1dma_queue_tx(card, skb);
-}
-
-/* ------------------------------------------------------------- */
-
-void b1dma_release_appl(struct capi_ctr *ctrl, u16 appl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	struct sk_buff *skb;
-	void *p;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-	capilib_release_appl(&cinfo->ncci_head, appl);
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	skb = alloc_skb(7, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, lost release appl.\n",
-		       card->name);
-		return;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_RELEASE);
-	_put_word(&p, appl);
-
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	b1dma_queue_tx(card, skb);
-}
-
-/* ------------------------------------------------------------- */
-
-u16 b1dma_send_message(struct capi_ctr *ctrl, struct sk_buff *skb)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	u16 retval = CAPI_NOERROR;
-
-	if (CAPIMSG_CMD(skb->data) == CAPI_DATA_B3_REQ) {
-		unsigned long flags;
-		spin_lock_irqsave(&card->lock, flags);
-		retval = capilib_data_b3_req(&cinfo->ncci_head,
-					     CAPIMSG_APPID(skb->data),
-					     CAPIMSG_NCCI(skb->data),
-					     CAPIMSG_MSGID(skb->data));
-		spin_unlock_irqrestore(&card->lock, flags);
-	}
-	if (retval == CAPI_NOERROR)
-		b1dma_queue_tx(card, skb);
-
-	return retval;
-}
-
-/* ------------------------------------------------------------- */
-
-int b1dma_proc_show(struct seq_file *m, void *v)
-{
-	struct capi_ctr *ctrl = m->private;
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	u8 flag;
-	char *s;
-	u32 txoff, txlen, rxoff, rxlen, csr;
-	unsigned long flags;
-
-	seq_printf(m, "%-16s %s\n", "name", card->name);
-	seq_printf(m, "%-16s 0x%x\n", "io", card->port);
-	seq_printf(m, "%-16s %d\n", "irq", card->irq);
-	seq_printf(m, "%-16s 0x%lx\n", "membase", card->membase);
-	switch (card->cardtype) {
-	case avm_b1isa: s = "B1 ISA"; break;
-	case avm_b1pci: s = "B1 PCI"; break;
-	case avm_b1pcmcia: s = "B1 PCMCIA"; break;
-	case avm_m1: s = "M1"; break;
-	case avm_m2: s = "M2"; break;
-	case avm_t1isa: s = "T1 ISA (HEMA)"; break;
-	case avm_t1pci: s = "T1 PCI"; break;
-	case avm_c4: s = "C4"; break;
-	case avm_c2: s = "C2"; break;
-	default: s = "???"; break;
-	}
-	seq_printf(m, "%-16s %s\n", "type", s);
-	if ((s = cinfo->version[VER_DRIVER]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_driver", s);
-	if ((s = cinfo->version[VER_CARDTYPE]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_cardtype", s);
-	if ((s = cinfo->version[VER_SERIAL]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_serial", s);
-
-	if (card->cardtype != avm_m1) {
-		flag = ((u8 *)(ctrl->profile.manu))[3];
-		if (flag)
-			seq_printf(m, "%-16s%s%s%s%s%s%s%s\n",
-				   "protocol",
-				   (flag & 0x01) ? " DSS1" : "",
-				   (flag & 0x02) ? " CT1" : "",
-				   (flag & 0x04) ? " VN3" : "",
-				   (flag & 0x08) ? " NI1" : "",
-				   (flag & 0x10) ? " AUSTEL" : "",
-				   (flag & 0x20) ? " ESS" : "",
-				   (flag & 0x40) ? " 1TR6" : ""
-				);
-	}
-	if (card->cardtype != avm_m1) {
-		flag = ((u8 *)(ctrl->profile.manu))[5];
-		if (flag)
-			seq_printf(m, "%-16s%s%s%s%s\n",
-				   "linetype",
-				   (flag & 0x01) ? " point to point" : "",
-				   (flag & 0x02) ? " point to multipoint" : "",
-				   (flag & 0x08) ? " leased line without D-channel" : "",
-				   (flag & 0x04) ? " leased line with D-channel" : ""
-				);
-	}
-	seq_printf(m, "%-16s %s\n", "cardname", cinfo->cardname);
-
-
-	spin_lock_irqsave(&card->lock, flags);
-
-	txoff = (dma_addr_t)b1dma_readl(card, AMCC_TXPTR)-card->dma->sendbuf.dmaaddr;
-	txlen = b1dma_readl(card, AMCC_TXLEN);
-
-	rxoff = (dma_addr_t)b1dma_readl(card, AMCC_RXPTR)-card->dma->recvbuf.dmaaddr;
-	rxlen = b1dma_readl(card, AMCC_RXLEN);
-
-	csr  = b1dma_readl(card, AMCC_INTCSR);
-
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	seq_printf(m, "%-16s 0x%lx\n", "csr (cached)", (unsigned long)card->csr);
-	seq_printf(m, "%-16s 0x%lx\n", "csr", (unsigned long)csr);
-	seq_printf(m, "%-16s %lu\n", "txoff", (unsigned long)txoff);
-	seq_printf(m, "%-16s %lu\n", "txlen", (unsigned long)txlen);
-	seq_printf(m, "%-16s %lu\n", "rxoff", (unsigned long)rxoff);
-	seq_printf(m, "%-16s %lu\n", "rxlen", (unsigned long)rxlen);
-
-	return 0;
-}
-EXPORT_SYMBOL(b1dma_proc_show);
-
-/* ------------------------------------------------------------- */
-
-EXPORT_SYMBOL(b1dma_reset);
-EXPORT_SYMBOL(t1pci_detect);
-EXPORT_SYMBOL(b1pciv4_detect);
-EXPORT_SYMBOL(b1dma_interrupt);
-
-EXPORT_SYMBOL(b1dma_load_firmware);
-EXPORT_SYMBOL(b1dma_reset_ctr);
-EXPORT_SYMBOL(b1dma_register_appl);
-EXPORT_SYMBOL(b1dma_release_appl);
-EXPORT_SYMBOL(b1dma_send_message);
-
-static int __init b1dma_init(void)
-{
-	char *p;
-	char rev[32];
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, sizeof(rev));
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-	printk(KERN_INFO "b1dma: revision %s\n", rev);
-
-	return 0;
-}
-
-static void __exit b1dma_exit(void)
-{
-}
-
-module_init(b1dma_init);
-module_exit(b1dma_exit);
diff --git a/drivers/staging/isdn/avm/b1isa.c b/drivers/staging/isdn/avm/b1isa.c
deleted file mode 100644
index cdfea72e0ef6..000000000000
--- a/drivers/staging/isdn/avm/b1isa.c
+++ /dev/null
@@ -1,243 +0,0 @@
-/* $Id: b1isa.c,v 1.1.2.3 2004/02/10 01:07:12 keil Exp $
- *
- * Module for AVM B1 ISA-card.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/capi.h>
-#include <linux/init.h>
-#include <linux/pci.h>
-#include <asm/io.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-
-/* ------------------------------------------------------------- */
-
-static char *revision = "$Revision: 1.1.2.3 $";
-
-/* ------------------------------------------------------------- */
-
-MODULE_DESCRIPTION("CAPI4Linux: Driver for AVM B1 ISA card");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/* ------------------------------------------------------------- */
-
-static void b1isa_remove(struct pci_dev *pdev)
-{
-	avmctrl_info *cinfo = pci_get_drvdata(pdev);
-	avmcard *card;
-
-	if (!cinfo)
-		return;
-
-	card = cinfo->card;
-
-	b1_reset(card->port);
-	b1_reset(card->port);
-
-	detach_capi_ctr(&cinfo->capi_ctrl);
-	free_irq(card->irq, card);
-	release_region(card->port, AVMB1_PORTLEN);
-	b1_free_card(card);
-}
-
-/* ------------------------------------------------------------- */
-
-static char *b1isa_procinfo(struct capi_ctr *ctrl);
-
-static int b1isa_probe(struct pci_dev *pdev)
-{
-	avmctrl_info *cinfo;
-	avmcard *card;
-	int retval;
-
-	card = b1_alloc_card(1);
-	if (!card) {
-		printk(KERN_WARNING "b1isa: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-
-	cinfo = card->ctrlinfo;
-
-	card->port = pci_resource_start(pdev, 0);
-	card->irq = pdev->irq;
-	card->cardtype = avm_b1isa;
-	sprintf(card->name, "b1isa-%x", card->port);
-
-	if (card->port != 0x150 && card->port != 0x250
-	    && card->port != 0x300 && card->port != 0x340) {
-		printk(KERN_WARNING "b1isa: invalid port 0x%x.\n", card->port);
-		retval = -EINVAL;
-		goto err_free;
-	}
-	if (b1_irq_table[card->irq & 0xf] == 0) {
-		printk(KERN_WARNING "b1isa: irq %d not valid.\n", card->irq);
-		retval = -EINVAL;
-		goto err_free;
-	}
-	if (!request_region(card->port, AVMB1_PORTLEN, card->name)) {
-		printk(KERN_WARNING "b1isa: ports 0x%03x-0x%03x in use.\n",
-		       card->port, card->port + AVMB1_PORTLEN);
-		retval = -EBUSY;
-		goto err_free;
-	}
-	retval = request_irq(card->irq, b1_interrupt, 0, card->name, card);
-	if (retval) {
-		printk(KERN_ERR "b1isa: unable to get IRQ %d.\n", card->irq);
-		goto err_release_region;
-	}
-	b1_reset(card->port);
-	if ((retval = b1_detect(card->port, card->cardtype)) != 0) {
-		printk(KERN_NOTICE "b1isa: NO card at 0x%x (%d)\n",
-		       card->port, retval);
-		retval = -ENODEV;
-		goto err_free_irq;
-	}
-	b1_reset(card->port);
-	b1_getrevision(card);
-
-	cinfo->capi_ctrl.owner = THIS_MODULE;
-	cinfo->capi_ctrl.driver_name   = "b1isa";
-	cinfo->capi_ctrl.driverdata    = cinfo;
-	cinfo->capi_ctrl.register_appl = b1_register_appl;
-	cinfo->capi_ctrl.release_appl  = b1_release_appl;
-	cinfo->capi_ctrl.send_message  = b1_send_message;
-	cinfo->capi_ctrl.load_firmware = b1_load_firmware;
-	cinfo->capi_ctrl.reset_ctr     = b1_reset_ctr;
-	cinfo->capi_ctrl.procinfo      = b1isa_procinfo;
-	cinfo->capi_ctrl.proc_show     = b1_proc_show;
-	strcpy(cinfo->capi_ctrl.name, card->name);
-
-	retval = attach_capi_ctr(&cinfo->capi_ctrl);
-	if (retval) {
-		printk(KERN_ERR "b1isa: attach controller failed.\n");
-		goto err_free_irq;
-	}
-
-	printk(KERN_INFO "b1isa: AVM B1 ISA at i/o %#x, irq %d, revision %d\n",
-	       card->port, card->irq, card->revision);
-
-	pci_set_drvdata(pdev, cinfo);
-	return 0;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_release_region:
-	release_region(card->port, AVMB1_PORTLEN);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-}
-
-static char *b1isa_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d r%d",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->revision : 0
-		);
-	return cinfo->infobuf;
-}
-
-/* ------------------------------------------------------------- */
-
-#define MAX_CARDS 4
-static struct pci_dev isa_dev[MAX_CARDS];
-static int io[MAX_CARDS];
-static int irq[MAX_CARDS];
-
-module_param_hw_array(io, int, ioport, NULL, 0);
-module_param_hw_array(irq, int, irq, NULL, 0);
-MODULE_PARM_DESC(io, "I/O base address(es)");
-MODULE_PARM_DESC(irq, "IRQ number(s) (assigned)");
-
-static int b1isa_add_card(struct capi_driver *driver, capicardparams *data)
-{
-	int i;
-
-	for (i = 0; i < MAX_CARDS; i++) {
-		if (isa_dev[i].resource[0].start)
-			continue;
-
-		isa_dev[i].resource[0].start = data->port;
-		isa_dev[i].irq = data->irq;
-
-		if (b1isa_probe(&isa_dev[i]) == 0)
-			return 0;
-	}
-	return -ENODEV;
-}
-
-static struct capi_driver capi_driver_b1isa = {
-	.name		= "b1isa",
-	.revision	= "1.0",
-	.add_card       = b1isa_add_card,
-};
-
-static int __init b1isa_init(void)
-{
-	char *p;
-	char rev[32];
-	int i;
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-	for (i = 0; i < MAX_CARDS; i++) {
-		if (!io[i])
-			break;
-
-		isa_dev[i].resource[0].start = io[i];
-		isa_dev[i].irq = irq[i];
-
-		if (b1isa_probe(&isa_dev[i]) != 0)
-			return -ENODEV;
-	}
-
-	strlcpy(capi_driver_b1isa.revision, rev, 32);
-	register_capi_driver(&capi_driver_b1isa);
-	printk(KERN_INFO "b1isa: revision %s\n", rev);
-
-	return 0;
-}
-
-static void __exit b1isa_exit(void)
-{
-	int i;
-
-	for (i = 0; i < MAX_CARDS; i++) {
-		if (isa_dev[i].resource[0].start)
-			b1isa_remove(&isa_dev[i]);
-	}
-	unregister_capi_driver(&capi_driver_b1isa);
-}
-
-module_init(b1isa_init);
-module_exit(b1isa_exit);
diff --git a/drivers/staging/isdn/avm/b1pci.c b/drivers/staging/isdn/avm/b1pci.c
deleted file mode 100644
index b76b57a82c02..000000000000
--- a/drivers/staging/isdn/avm/b1pci.c
+++ /dev/null
@@ -1,416 +0,0 @@
-/* $Id: b1pci.c,v 1.1.2.2 2004/01/16 21:09:27 keil Exp $
- *
- * Module for AVM B1 PCI-card.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/pci.h>
-#include <linux/capi.h>
-#include <asm/io.h>
-#include <linux/init.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-
-/* ------------------------------------------------------------- */
-
-static char *revision = "$Revision: 1.1.2.2 $";
-
-/* ------------------------------------------------------------- */
-
-static struct pci_device_id b1pci_pci_tbl[] = {
-	{ PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_B1, PCI_ANY_ID, PCI_ANY_ID },
-	{ }				/* Terminating entry */
-};
-
-MODULE_DEVICE_TABLE(pci, b1pci_pci_tbl);
-MODULE_DESCRIPTION("CAPI4Linux: Driver for AVM B1 PCI card");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/* ------------------------------------------------------------- */
-
-static char *b1pci_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d r%d",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->revision : 0
-		);
-	return cinfo->infobuf;
-}
-
-/* ------------------------------------------------------------- */
-
-static int b1pci_probe(struct capicardparams *p, struct pci_dev *pdev)
-{
-	avmcard *card;
-	avmctrl_info *cinfo;
-	int retval;
-
-	card = b1_alloc_card(1);
-	if (!card) {
-		printk(KERN_WARNING "b1pci: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-
-	cinfo = card->ctrlinfo;
-	sprintf(card->name, "b1pci-%x", p->port);
-	card->port = p->port;
-	card->irq = p->irq;
-	card->cardtype = avm_b1pci;
-
-	if (!request_region(card->port, AVMB1_PORTLEN, card->name)) {
-		printk(KERN_WARNING "b1pci: ports 0x%03x-0x%03x in use.\n",
-		       card->port, card->port + AVMB1_PORTLEN);
-		retval = -EBUSY;
-		goto err_free;
-	}
-	b1_reset(card->port);
-	retval = b1_detect(card->port, card->cardtype);
-	if (retval) {
-		printk(KERN_NOTICE "b1pci: NO card at 0x%x (%d)\n",
-		       card->port, retval);
-		retval = -ENODEV;
-		goto err_release_region;
-	}
-	b1_reset(card->port);
-	b1_getrevision(card);
-
-	retval = request_irq(card->irq, b1_interrupt, IRQF_SHARED, card->name, card);
-	if (retval) {
-		printk(KERN_ERR "b1pci: unable to get IRQ %d.\n", card->irq);
-		retval = -EBUSY;
-		goto err_release_region;
-	}
-
-	cinfo->capi_ctrl.driver_name   = "b1pci";
-	cinfo->capi_ctrl.driverdata    = cinfo;
-	cinfo->capi_ctrl.register_appl = b1_register_appl;
-	cinfo->capi_ctrl.release_appl  = b1_release_appl;
-	cinfo->capi_ctrl.send_message  = b1_send_message;
-	cinfo->capi_ctrl.load_firmware = b1_load_firmware;
-	cinfo->capi_ctrl.reset_ctr     = b1_reset_ctr;
-	cinfo->capi_ctrl.procinfo      = b1pci_procinfo;
-	cinfo->capi_ctrl.proc_show     = b1_proc_show;
-	strcpy(cinfo->capi_ctrl.name, card->name);
-	cinfo->capi_ctrl.owner         = THIS_MODULE;
-
-	retval = attach_capi_ctr(&cinfo->capi_ctrl);
-	if (retval) {
-		printk(KERN_ERR "b1pci: attach controller failed.\n");
-		goto err_free_irq;
-	}
-
-	if (card->revision >= 4) {
-		printk(KERN_INFO "b1pci: AVM B1 PCI V4 at i/o %#x, irq %d, revision %d (no dma)\n",
-		       card->port, card->irq, card->revision);
-	} else {
-		printk(KERN_INFO "b1pci: AVM B1 PCI at i/o %#x, irq %d, revision %d\n",
-		       card->port, card->irq, card->revision);
-	}
-
-	pci_set_drvdata(pdev, card);
-	return 0;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_release_region:
-	release_region(card->port, AVMB1_PORTLEN);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-}
-
-static void b1pci_remove(struct pci_dev *pdev)
-{
-	avmcard *card = pci_get_drvdata(pdev);
-	avmctrl_info *cinfo = card->ctrlinfo;
-	unsigned int port = card->port;
-
-	b1_reset(port);
-	b1_reset(port);
-
-	detach_capi_ctr(&cinfo->capi_ctrl);
-	free_irq(card->irq, card);
-	release_region(card->port, AVMB1_PORTLEN);
-	b1_free_card(card);
-}
-
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-/* ------------------------------------------------------------- */
-
-static char *b1pciv4_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d 0x%lx r%d",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->membase : 0,
-		cinfo->card ? cinfo->card->revision : 0
-		);
-	return cinfo->infobuf;
-}
-
-/* ------------------------------------------------------------- */
-
-static int b1pciv4_probe(struct capicardparams *p, struct pci_dev *pdev)
-{
-	avmcard *card;
-	avmctrl_info *cinfo;
-	int retval;
-
-	card = b1_alloc_card(1);
-	if (!card) {
-		printk(KERN_WARNING "b1pci: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-
-	card->dma = avmcard_dma_alloc("b1pci", pdev, 2048 + 128, 2048 + 128);
-	if (!card->dma) {
-		printk(KERN_WARNING "b1pci: dma alloc.\n");
-		retval = -ENOMEM;
-		goto err_free;
-	}
-
-	cinfo = card->ctrlinfo;
-	sprintf(card->name, "b1pciv4-%x", p->port);
-	card->port = p->port;
-	card->irq = p->irq;
-	card->membase = p->membase;
-	card->cardtype = avm_b1pci;
-
-	if (!request_region(card->port, AVMB1_PORTLEN, card->name)) {
-		printk(KERN_WARNING "b1pci: ports 0x%03x-0x%03x in use.\n",
-		       card->port, card->port + AVMB1_PORTLEN);
-		retval = -EBUSY;
-		goto err_free_dma;
-	}
-
-	card->mbase = ioremap(card->membase, 64);
-	if (!card->mbase) {
-		printk(KERN_NOTICE "b1pci: can't remap memory at 0x%lx\n",
-		       card->membase);
-		retval = -ENOMEM;
-		goto err_release_region;
-	}
-
-	b1dma_reset(card);
-
-	retval = b1pciv4_detect(card);
-	if (retval) {
-		printk(KERN_NOTICE "b1pci: NO card at 0x%x (%d)\n",
-		       card->port, retval);
-		retval = -ENODEV;
-		goto err_unmap;
-	}
-	b1dma_reset(card);
-	b1_getrevision(card);
-
-	retval = request_irq(card->irq, b1dma_interrupt, IRQF_SHARED, card->name, card);
-	if (retval) {
-		printk(KERN_ERR "b1pci: unable to get IRQ %d.\n",
-		       card->irq);
-		retval = -EBUSY;
-		goto err_unmap;
-	}
-
-	cinfo->capi_ctrl.owner         = THIS_MODULE;
-	cinfo->capi_ctrl.driver_name   = "b1pciv4";
-	cinfo->capi_ctrl.driverdata    = cinfo;
-	cinfo->capi_ctrl.register_appl = b1dma_register_appl;
-	cinfo->capi_ctrl.release_appl  = b1dma_release_appl;
-	cinfo->capi_ctrl.send_message  = b1dma_send_message;
-	cinfo->capi_ctrl.load_firmware = b1dma_load_firmware;
-	cinfo->capi_ctrl.reset_ctr     = b1dma_reset_ctr;
-	cinfo->capi_ctrl.procinfo      = b1pciv4_procinfo;
-	cinfo->capi_ctrl.proc_show     = b1dma_proc_show;
-	strcpy(cinfo->capi_ctrl.name, card->name);
-
-	retval = attach_capi_ctr(&cinfo->capi_ctrl);
-	if (retval) {
-		printk(KERN_ERR "b1pci: attach controller failed.\n");
-		goto err_free_irq;
-	}
-	card->cardnr = cinfo->capi_ctrl.cnr;
-
-	printk(KERN_INFO "b1pci: AVM B1 PCI V4 at i/o %#x, irq %d, mem %#lx, revision %d (dma)\n",
-	       card->port, card->irq, card->membase, card->revision);
-
-	pci_set_drvdata(pdev, card);
-	return 0;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_unmap:
-	iounmap(card->mbase);
-err_release_region:
-	release_region(card->port, AVMB1_PORTLEN);
-err_free_dma:
-	avmcard_dma_free(card->dma);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-
-}
-
-static void b1pciv4_remove(struct pci_dev *pdev)
-{
-	avmcard *card = pci_get_drvdata(pdev);
-	avmctrl_info *cinfo = card->ctrlinfo;
-
-	b1dma_reset(card);
-
-	detach_capi_ctr(&cinfo->capi_ctrl);
-	free_irq(card->irq, card);
-	iounmap(card->mbase);
-	release_region(card->port, AVMB1_PORTLEN);
-	avmcard_dma_free(card->dma);
-	b1_free_card(card);
-}
-
-#endif /* CONFIG_ISDN_DRV_AVMB1_B1PCIV4 */
-
-static int b1pci_pci_probe(struct pci_dev *pdev,
-			   const struct pci_device_id *ent)
-{
-	struct capicardparams param;
-	int retval;
-
-	if (pci_enable_device(pdev) < 0) {
-		printk(KERN_ERR "b1pci: failed to enable AVM-B1\n");
-		return -ENODEV;
-	}
-	param.irq = pdev->irq;
-
-	if (pci_resource_start(pdev, 2)) { /* B1 PCI V4 */
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-		pci_set_master(pdev);
-#endif
-		param.membase = pci_resource_start(pdev, 0);
-		param.port = pci_resource_start(pdev, 2);
-
-		printk(KERN_INFO "b1pci: PCI BIOS reports AVM-B1 V4 at i/o %#x, irq %d, mem %#x\n",
-		       param.port, param.irq, param.membase);
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-		retval = b1pciv4_probe(&param, pdev);
-#else
-		retval = b1pci_probe(&param, pdev);
-#endif
-		if (retval != 0) {
-			printk(KERN_ERR "b1pci: no AVM-B1 V4 at i/o %#x, irq %d, mem %#x detected\n",
-			       param.port, param.irq, param.membase);
-		}
-	} else {
-		param.membase = 0;
-		param.port = pci_resource_start(pdev, 1);
-
-		printk(KERN_INFO "b1pci: PCI BIOS reports AVM-B1 at i/o %#x, irq %d\n",
-		       param.port, param.irq);
-		retval = b1pci_probe(&param, pdev);
-		if (retval != 0) {
-			printk(KERN_ERR "b1pci: no AVM-B1 at i/o %#x, irq %d detected\n",
-			       param.port, param.irq);
-		}
-	}
-	return retval;
-}
-
-static void b1pci_pci_remove(struct pci_dev *pdev)
-{
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-	avmcard *card = pci_get_drvdata(pdev);
-
-	if (card->dma)
-		b1pciv4_remove(pdev);
-	else
-		b1pci_remove(pdev);
-#else
-	b1pci_remove(pdev);
-#endif
-}
-
-static struct pci_driver b1pci_pci_driver = {
-	.name		= "b1pci",
-	.id_table	= b1pci_pci_tbl,
-	.probe		= b1pci_pci_probe,
-	.remove		= b1pci_pci_remove,
-};
-
-static struct capi_driver capi_driver_b1pci = {
-	.name		= "b1pci",
-	.revision	= "1.0",
-};
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-static struct capi_driver capi_driver_b1pciv4 = {
-	.name		= "b1pciv4",
-	.revision	= "1.0",
-};
-#endif
-
-static int __init b1pci_init(void)
-{
-	char *p;
-	char rev[32];
-	int err;
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-
-	err = pci_register_driver(&b1pci_pci_driver);
-	if (!err) {
-		strlcpy(capi_driver_b1pci.revision, rev, 32);
-		register_capi_driver(&capi_driver_b1pci);
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-		strlcpy(capi_driver_b1pciv4.revision, rev, 32);
-		register_capi_driver(&capi_driver_b1pciv4);
-#endif
-		printk(KERN_INFO "b1pci: revision %s\n", rev);
-	}
-	return err;
-}
-
-static void __exit b1pci_exit(void)
-{
-	unregister_capi_driver(&capi_driver_b1pci);
-#ifdef CONFIG_ISDN_DRV_AVMB1_B1PCIV4
-	unregister_capi_driver(&capi_driver_b1pciv4);
-#endif
-	pci_unregister_driver(&b1pci_pci_driver);
-}
-
-module_init(b1pci_init);
-module_exit(b1pci_exit);
diff --git a/drivers/staging/isdn/avm/b1pcmcia.c b/drivers/staging/isdn/avm/b1pcmcia.c
deleted file mode 100644
index 3aca16e62902..000000000000
--- a/drivers/staging/isdn/avm/b1pcmcia.c
+++ /dev/null
@@ -1,224 +0,0 @@
-/* $Id: b1pcmcia.c,v 1.1.2.2 2004/01/16 21:09:27 keil Exp $
- *
- * Module for AVM B1/M1/M2 PCMCIA-card.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/init.h>
-#include <asm/io.h>
-#include <linux/capi.h>
-#include <linux/b1pcmcia.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-
-/* ------------------------------------------------------------- */
-
-static char *revision = "$Revision: 1.1.2.2 $";
-
-/* ------------------------------------------------------------- */
-
-MODULE_DESCRIPTION("CAPI4Linux: Driver for AVM PCMCIA cards");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/* ------------------------------------------------------------- */
-
-static void b1pcmcia_remove_ctr(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-
-	b1_reset(port);
-	b1_reset(port);
-
-	detach_capi_ctr(ctrl);
-	free_irq(card->irq, card);
-	b1_free_card(card);
-}
-
-/* ------------------------------------------------------------- */
-
-static LIST_HEAD(cards);
-
-static char *b1pcmcia_procinfo(struct capi_ctr *ctrl);
-
-static int b1pcmcia_add_card(unsigned int port, unsigned irq,
-			     enum avmcardtype cardtype)
-{
-	avmctrl_info *cinfo;
-	avmcard *card;
-	char *cardname;
-	int retval;
-
-	card = b1_alloc_card(1);
-	if (!card) {
-		printk(KERN_WARNING "b1pcmcia: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-	cinfo = card->ctrlinfo;
-
-	switch (cardtype) {
-	case avm_m1: sprintf(card->name, "m1-%x", port); break;
-	case avm_m2: sprintf(card->name, "m2-%x", port); break;
-	default: sprintf(card->name, "b1pcmcia-%x", port); break;
-	}
-	card->port = port;
-	card->irq = irq;
-	card->cardtype = cardtype;
-
-	retval = request_irq(card->irq, b1_interrupt, IRQF_SHARED, card->name, card);
-	if (retval) {
-		printk(KERN_ERR "b1pcmcia: unable to get IRQ %d.\n",
-		       card->irq);
-		retval = -EBUSY;
-		goto err_free;
-	}
-	b1_reset(card->port);
-	if ((retval = b1_detect(card->port, card->cardtype)) != 0) {
-		printk(KERN_NOTICE "b1pcmcia: NO card at 0x%x (%d)\n",
-		       card->port, retval);
-		retval = -ENODEV;
-		goto err_free_irq;
-	}
-	b1_reset(card->port);
-	b1_getrevision(card);
-
-	cinfo->capi_ctrl.owner         = THIS_MODULE;
-	cinfo->capi_ctrl.driver_name   = "b1pcmcia";
-	cinfo->capi_ctrl.driverdata    = cinfo;
-	cinfo->capi_ctrl.register_appl = b1_register_appl;
-	cinfo->capi_ctrl.release_appl  = b1_release_appl;
-	cinfo->capi_ctrl.send_message  = b1_send_message;
-	cinfo->capi_ctrl.load_firmware = b1_load_firmware;
-	cinfo->capi_ctrl.reset_ctr     = b1_reset_ctr;
-	cinfo->capi_ctrl.procinfo      = b1pcmcia_procinfo;
-	cinfo->capi_ctrl.proc_show     = b1_proc_show;
-	strcpy(cinfo->capi_ctrl.name, card->name);
-
-	retval = attach_capi_ctr(&cinfo->capi_ctrl);
-	if (retval) {
-		printk(KERN_ERR "b1pcmcia: attach controller failed.\n");
-		goto err_free_irq;
-	}
-	switch (cardtype) {
-	case avm_m1: cardname = "M1"; break;
-	case avm_m2: cardname = "M2"; break;
-	default: cardname = "B1 PCMCIA"; break;
-	}
-
-	printk(KERN_INFO "b1pcmcia: AVM %s at i/o %#x, irq %d, revision %d\n",
-	       cardname, card->port, card->irq, card->revision);
-
-	list_add(&card->list, &cards);
-	return cinfo->capi_ctrl.cnr;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-}
-
-/* ------------------------------------------------------------- */
-
-static char *b1pcmcia_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d r%d",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->revision : 0
-		);
-	return cinfo->infobuf;
-}
-
-/* ------------------------------------------------------------- */
-
-int b1pcmcia_addcard_b1(unsigned int port, unsigned irq)
-{
-	return b1pcmcia_add_card(port, irq, avm_b1pcmcia);
-}
-
-int b1pcmcia_addcard_m1(unsigned int port, unsigned irq)
-{
-	return b1pcmcia_add_card(port, irq, avm_m1);
-}
-
-int b1pcmcia_addcard_m2(unsigned int port, unsigned irq)
-{
-	return b1pcmcia_add_card(port, irq, avm_m2);
-}
-
-int b1pcmcia_delcard(unsigned int port, unsigned irq)
-{
-	struct list_head *l;
-	avmcard *card;
-
-	list_for_each(l, &cards) {
-		card = list_entry(l, avmcard, list);
-		if (card->port == port && card->irq == irq) {
-			b1pcmcia_remove_ctr(&card->ctrlinfo[0].capi_ctrl);
-			return 0;
-		}
-	}
-	return -ESRCH;
-}
-
-EXPORT_SYMBOL(b1pcmcia_addcard_b1);
-EXPORT_SYMBOL(b1pcmcia_addcard_m1);
-EXPORT_SYMBOL(b1pcmcia_addcard_m2);
-EXPORT_SYMBOL(b1pcmcia_delcard);
-
-static struct capi_driver capi_driver_b1pcmcia = {
-	.name		= "b1pcmcia",
-	.revision	= "1.0",
-};
-
-static int __init b1pcmcia_init(void)
-{
-	char *p;
-	char rev[32];
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-	strlcpy(capi_driver_b1pcmcia.revision, rev, 32);
-	register_capi_driver(&capi_driver_b1pcmcia);
-	printk(KERN_INFO "b1pci: revision %s\n", rev);
-
-	return 0;
-}
-
-static void __exit b1pcmcia_exit(void)
-{
-	unregister_capi_driver(&capi_driver_b1pcmcia);
-}
-
-module_init(b1pcmcia_init);
-module_exit(b1pcmcia_exit);
diff --git a/drivers/staging/isdn/avm/c4.c b/drivers/staging/isdn/avm/c4.c
deleted file mode 100644
index ac72cd204c4d..000000000000
--- a/drivers/staging/isdn/avm/c4.c
+++ /dev/null
@@ -1,1317 +0,0 @@
-/* $Id: c4.c,v 1.1.2.2 2004/01/16 21:09:27 keil Exp $
- *
- * Module for AVM C4 & C2 card.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/proc_fs.h>
-#include <linux/seq_file.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/pci.h>
-#include <linux/capi.h>
-#include <linux/kernelcapi.h>
-#include <linux/init.h>
-#include <linux/gfp.h>
-#include <asm/io.h>
-#include <linux/uaccess.h>
-#include <linux/netdevice.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-
-#undef AVM_C4_DEBUG
-#undef AVM_C4_POLLDEBUG
-
-/* ------------------------------------------------------------- */
-
-static char *revision = "$Revision: 1.1.2.2 $";
-
-/* ------------------------------------------------------------- */
-
-static bool suppress_pollack;
-
-static const struct pci_device_id c4_pci_tbl[] = {
-	{ PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_C4, 0, 0, (unsigned long)4 },
-	{ PCI_VENDOR_ID_DEC, PCI_DEVICE_ID_DEC_21285, PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_C2, 0, 0, (unsigned long)2 },
-	{ }			/* Terminating entry */
-};
-
-MODULE_DEVICE_TABLE(pci, c4_pci_tbl);
-MODULE_DESCRIPTION("CAPI4Linux: Driver for AVM C2/C4 cards");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-module_param(suppress_pollack, bool, 0);
-
-/* ------------------------------------------------------------- */
-
-static void c4_dispatch_tx(avmcard *card);
-
-/* ------------------------------------------------------------- */
-
-#define DC21285_DRAM_A0MR	0x40000000
-#define DC21285_DRAM_A1MR	0x40004000
-#define DC21285_DRAM_A2MR	0x40008000
-#define DC21285_DRAM_A3MR	0x4000C000
-
-#define	CAS_OFFSET	0x88
-
-#define DC21285_ARMCSR_BASE	0x42000000
-
-#define	PCI_OUT_INT_STATUS	0x30
-#define	PCI_OUT_INT_MASK	0x34
-#define	MAILBOX_0		0x50
-#define	MAILBOX_1		0x54
-#define	MAILBOX_2		0x58
-#define	MAILBOX_3		0x5C
-#define	DOORBELL		0x60
-#define	DOORBELL_SETUP		0x64
-
-#define CHAN_1_CONTROL		0x90
-#define CHAN_2_CONTROL		0xB0
-#define DRAM_TIMING		0x10C
-#define DRAM_ADDR_SIZE_0	0x110
-#define DRAM_ADDR_SIZE_1	0x114
-#define DRAM_ADDR_SIZE_2	0x118
-#define DRAM_ADDR_SIZE_3	0x11C
-#define	SA_CONTROL		0x13C
-#define	XBUS_CYCLE		0x148
-#define	XBUS_STROBE		0x14C
-#define	DBELL_PCI_MASK		0x150
-#define DBELL_SA_MASK		0x154
-
-#define SDRAM_SIZE		0x1000000
-
-/* ------------------------------------------------------------- */
-
-#define	MBOX_PEEK_POKE		MAILBOX_0
-
-#define DBELL_ADDR		0x01
-#define DBELL_DATA		0x02
-#define DBELL_RNWR		0x40
-#define DBELL_INIT		0x80
-
-/* ------------------------------------------------------------- */
-
-#define	MBOX_UP_ADDR		MAILBOX_0
-#define	MBOX_UP_LEN		MAILBOX_1
-#define	MBOX_DOWN_ADDR		MAILBOX_2
-#define	MBOX_DOWN_LEN		MAILBOX_3
-
-#define	DBELL_UP_HOST		0x00000100
-#define	DBELL_UP_ARM		0x00000200
-#define	DBELL_DOWN_HOST		0x00000400
-#define	DBELL_DOWN_ARM		0x00000800
-#define	DBELL_RESET_HOST	0x40000000
-#define	DBELL_RESET_ARM		0x80000000
-
-/* ------------------------------------------------------------- */
-
-#define	DRAM_TIMING_DEF		0x001A01A5
-#define DRAM_AD_SZ_DEF0		0x00000045
-#define DRAM_AD_SZ_NULL		0x00000000
-
-#define SA_CTL_ALLRIGHT		0x64AA0271
-
-#define	INIT_XBUS_CYCLE		0x100016DB
-#define	INIT_XBUS_STROBE	0xF1F1F1F1
-
-/* ------------------------------------------------------------- */
-
-#define	RESET_TIMEOUT		(15 * HZ)	/* 15 sec */
-#define	PEEK_POKE_TIMEOUT	(HZ / 10)	/* 0.1 sec */
-
-/* ------------------------------------------------------------- */
-
-#define c4outmeml(addr, value)	writel(value, addr)
-#define c4inmeml(addr)	readl(addr)
-#define c4outmemw(addr, value)	writew(value, addr)
-#define c4inmemw(addr)	readw(addr)
-#define c4outmemb(addr, value)	writeb(value, addr)
-#define c4inmemb(addr)	readb(addr)
-
-/* ------------------------------------------------------------- */
-
-static inline int wait_for_doorbell(avmcard *card, unsigned long t)
-{
-	unsigned long stop;
-
-	stop = jiffies + t;
-	while (c4inmeml(card->mbase + DOORBELL) != 0xffffffff) {
-		if (!time_before(jiffies, stop))
-			return -1;
-		mb();
-	}
-	return 0;
-}
-
-static int c4_poke(avmcard *card,  unsigned long off, unsigned long value)
-{
-
-	if (wait_for_doorbell(card, HZ / 10) < 0)
-		return -1;
-
-	c4outmeml(card->mbase + MBOX_PEEK_POKE, off);
-	c4outmeml(card->mbase + DOORBELL, DBELL_ADDR);
-
-	if (wait_for_doorbell(card, HZ / 10) < 0)
-		return -1;
-
-	c4outmeml(card->mbase + MBOX_PEEK_POKE, value);
-	c4outmeml(card->mbase + DOORBELL, DBELL_DATA | DBELL_ADDR);
-
-	return 0;
-}
-
-static int c4_peek(avmcard *card,  unsigned long off, unsigned long *valuep)
-{
-	if (wait_for_doorbell(card, HZ / 10) < 0)
-		return -1;
-
-	c4outmeml(card->mbase + MBOX_PEEK_POKE, off);
-	c4outmeml(card->mbase + DOORBELL, DBELL_RNWR | DBELL_ADDR);
-
-	if (wait_for_doorbell(card, HZ / 10) < 0)
-		return -1;
-
-	*valuep = c4inmeml(card->mbase + MBOX_PEEK_POKE);
-
-	return 0;
-}
-
-/* ------------------------------------------------------------- */
-
-static int c4_load_t4file(avmcard *card, capiloaddatapart *t4file)
-{
-	u32 val;
-	unsigned char *dp;
-	u_int left;
-	u32 loadoff = 0;
-
-	dp = t4file->data;
-	left = t4file->len;
-	while (left >= sizeof(u32)) {
-		if (t4file->user) {
-			if (copy_from_user(&val, dp, sizeof(val)))
-				return -EFAULT;
-		} else {
-			memcpy(&val, dp, sizeof(val));
-		}
-		if (c4_poke(card, loadoff, val)) {
-			printk(KERN_ERR "%s: corrupted firmware file ?\n",
-			       card->name);
-			return -EIO;
-		}
-		left -= sizeof(u32);
-		dp += sizeof(u32);
-		loadoff += sizeof(u32);
-	}
-	if (left) {
-		val = 0;
-		if (t4file->user) {
-			if (copy_from_user(&val, dp, left))
-				return -EFAULT;
-		} else {
-			memcpy(&val, dp, left);
-		}
-		if (c4_poke(card, loadoff, val)) {
-			printk(KERN_ERR "%s: corrupted firmware file ?\n",
-			       card->name);
-			return -EIO;
-		}
-	}
-	return 0;
-}
-
-/* ------------------------------------------------------------- */
-
-static inline void _put_byte(void **pp, u8 val)
-{
-	u8 *s = *pp;
-	*s++ = val;
-	*pp = s;
-}
-
-static inline void _put_word(void **pp, u32 val)
-{
-	u8 *s = *pp;
-	*s++ = val & 0xff;
-	*s++ = (val >> 8) & 0xff;
-	*s++ = (val >> 16) & 0xff;
-	*s++ = (val >> 24) & 0xff;
-	*pp = s;
-}
-
-static inline void _put_slice(void **pp, unsigned char *dp, unsigned int len)
-{
-	unsigned i = len;
-	_put_word(pp, i);
-	while (i-- > 0)
-		_put_byte(pp, *dp++);
-}
-
-static inline u8 _get_byte(void **pp)
-{
-	u8 *s = *pp;
-	u8 val;
-	val = *s++;
-	*pp = s;
-	return val;
-}
-
-static inline u32 _get_word(void **pp)
-{
-	u8 *s = *pp;
-	u32 val;
-	val = *s++;
-	val |= (*s++ << 8);
-	val |= (*s++ << 16);
-	val |= (*s++ << 24);
-	*pp = s;
-	return val;
-}
-
-static inline u32 _get_slice(void **pp, unsigned char *dp)
-{
-	unsigned int len, i;
-
-	len = i = _get_word(pp);
-	while (i-- > 0) *dp++ = _get_byte(pp);
-	return len;
-}
-
-/* ------------------------------------------------------------- */
-
-static void c4_reset(avmcard *card)
-{
-	unsigned long stop;
-
-	c4outmeml(card->mbase + DOORBELL, DBELL_RESET_ARM);
-
-	stop = jiffies + HZ * 10;
-	while (c4inmeml(card->mbase + DOORBELL) != 0xffffffff) {
-		if (!time_before(jiffies, stop))
-			return;
-		c4outmeml(card->mbase + DOORBELL, DBELL_ADDR);
-		mb();
-	}
-
-	c4_poke(card, DC21285_ARMCSR_BASE + CHAN_1_CONTROL, 0);
-	c4_poke(card, DC21285_ARMCSR_BASE + CHAN_2_CONTROL, 0);
-}
-
-/* ------------------------------------------------------------- */
-
-static int c4_detect(avmcard *card)
-{
-	unsigned long stop, dummy;
-
-	c4outmeml(card->mbase + PCI_OUT_INT_MASK, 0x0c);
-	if (c4inmeml(card->mbase + PCI_OUT_INT_MASK) != 0x0c)
-		return	1;
-
-	c4outmeml(card->mbase + DOORBELL, DBELL_RESET_ARM);
-
-	stop = jiffies + HZ * 10;
-	while (c4inmeml(card->mbase + DOORBELL) != 0xffffffff) {
-		if (!time_before(jiffies, stop))
-			return 2;
-		c4outmeml(card->mbase + DOORBELL, DBELL_ADDR);
-		mb();
-	}
-
-	c4_poke(card, DC21285_ARMCSR_BASE + CHAN_1_CONTROL, 0);
-	c4_poke(card, DC21285_ARMCSR_BASE + CHAN_2_CONTROL, 0);
-
-	c4outmeml(card->mbase + MAILBOX_0, 0x55aa55aa);
-	if (c4inmeml(card->mbase + MAILBOX_0) != 0x55aa55aa) return 3;
-
-	c4outmeml(card->mbase + MAILBOX_0, 0xaa55aa55);
-	if (c4inmeml(card->mbase + MAILBOX_0) != 0xaa55aa55) return 4;
-
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DBELL_SA_MASK, 0)) return 5;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DBELL_PCI_MASK, 0)) return 6;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + SA_CONTROL, SA_CTL_ALLRIGHT))
-		return 7;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + XBUS_CYCLE, INIT_XBUS_CYCLE))
-		return 8;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + XBUS_STROBE, INIT_XBUS_STROBE))
-		return 8;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DRAM_TIMING, 0)) return 9;
-
-	mdelay(1);
-
-	if (c4_peek(card, DC21285_DRAM_A0MR, &dummy)) return 10;
-	if (c4_peek(card, DC21285_DRAM_A1MR, &dummy)) return 11;
-	if (c4_peek(card, DC21285_DRAM_A2MR, &dummy)) return 12;
-	if (c4_peek(card, DC21285_DRAM_A3MR, &dummy)) return 13;
-
-	if (c4_poke(card, DC21285_DRAM_A0MR + CAS_OFFSET, 0)) return 14;
-	if (c4_poke(card, DC21285_DRAM_A1MR + CAS_OFFSET, 0)) return 15;
-	if (c4_poke(card, DC21285_DRAM_A2MR + CAS_OFFSET, 0)) return 16;
-	if (c4_poke(card, DC21285_DRAM_A3MR + CAS_OFFSET, 0)) return 17;
-
-	mdelay(1);
-
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DRAM_TIMING, DRAM_TIMING_DEF))
-		return 18;
-
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DRAM_ADDR_SIZE_0, DRAM_AD_SZ_DEF0))
-		return 19;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DRAM_ADDR_SIZE_1, DRAM_AD_SZ_NULL))
-		return 20;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DRAM_ADDR_SIZE_2, DRAM_AD_SZ_NULL))
-		return 21;
-	if (c4_poke(card, DC21285_ARMCSR_BASE + DRAM_ADDR_SIZE_3, DRAM_AD_SZ_NULL))
-		return 22;
-
-	/* Transputer test */
-
-	if (c4_poke(card, 0x000000, 0x11111111)
-	    || c4_poke(card, 0x400000, 0x22222222)
-	       || c4_poke(card, 0x800000, 0x33333333)
-	       || c4_poke(card, 0xC00000, 0x44444444))
-		return 23;
-
-	if (c4_peek(card, 0x000000, &dummy) || dummy != 0x11111111
-	    || c4_peek(card, 0x400000, &dummy) || dummy != 0x22222222
-	       || c4_peek(card, 0x800000, &dummy) || dummy != 0x33333333
-	       || c4_peek(card, 0xC00000, &dummy) || dummy != 0x44444444)
-		return 24;
-
-	if (c4_poke(card, 0x000000, 0x55555555)
-	    || c4_poke(card, 0x400000, 0x66666666)
-	       || c4_poke(card, 0x800000, 0x77777777)
-	       || c4_poke(card, 0xC00000, 0x88888888))
-		return 25;
-
-	if (c4_peek(card, 0x000000, &dummy) || dummy != 0x55555555
-	    || c4_peek(card, 0x400000, &dummy) || dummy != 0x66666666
-	       || c4_peek(card, 0x800000, &dummy) || dummy != 0x77777777
-	       || c4_peek(card, 0xC00000, &dummy) || dummy != 0x88888888)
-		return 26;
-
-	return 0;
-}
-
-/* ------------------------------------------------------------- */
-
-static void c4_dispatch_tx(avmcard *card)
-{
-	avmcard_dmainfo *dma = card->dma;
-	struct sk_buff *skb;
-	u8 cmd, subcmd;
-	u16 len;
-	u32 txlen;
-	void *p;
-
-
-	if (card->csr & DBELL_DOWN_ARM) { /* tx busy */
-		return;
-	}
-
-	skb = skb_dequeue(&dma->send_queue);
-	if (!skb) {
-#ifdef AVM_C4_DEBUG
-		printk(KERN_DEBUG "%s: tx underrun\n", card->name);
-#endif
-		return;
-	}
-
-	len = CAPIMSG_LEN(skb->data);
-
-	if (len) {
-		cmd = CAPIMSG_COMMAND(skb->data);
-		subcmd = CAPIMSG_SUBCOMMAND(skb->data);
-
-		p = dma->sendbuf.dmabuf;
-
-		if (CAPICMD(cmd, subcmd) == CAPI_DATA_B3_REQ) {
-			u16 dlen = CAPIMSG_DATALEN(skb->data);
-			_put_byte(&p, SEND_DATA_B3_REQ);
-			_put_slice(&p, skb->data, len);
-			_put_slice(&p, skb->data + len, dlen);
-		} else {
-			_put_byte(&p, SEND_MESSAGE);
-			_put_slice(&p, skb->data, len);
-		}
-		txlen = (u8 *)p - (u8 *)dma->sendbuf.dmabuf;
-#ifdef AVM_C4_DEBUG
-		printk(KERN_DEBUG "%s: tx put msg len=%d\n", card->name, txlen);
-#endif
-	} else {
-		txlen = skb->len - 2;
-#ifdef AVM_C4_POLLDEBUG
-		if (skb->data[2] == SEND_POLLACK)
-			printk(KERN_INFO "%s: ack to c4\n", card->name);
-#endif
-#ifdef AVM_C4_DEBUG
-		printk(KERN_DEBUG "%s: tx put 0x%x len=%d\n",
-		       card->name, skb->data[2], txlen);
-#endif
-		skb_copy_from_linear_data_offset(skb, 2, dma->sendbuf.dmabuf,
-						 skb->len - 2);
-	}
-	txlen = (txlen + 3) & ~3;
-
-	c4outmeml(card->mbase + MBOX_DOWN_ADDR, dma->sendbuf.dmaaddr);
-	c4outmeml(card->mbase + MBOX_DOWN_LEN, txlen);
-
-	card->csr |= DBELL_DOWN_ARM;
-
-	c4outmeml(card->mbase + DOORBELL, DBELL_DOWN_ARM);
-
-	dev_kfree_skb_any(skb);
-}
-
-/* ------------------------------------------------------------- */
-
-static void queue_pollack(avmcard *card)
-{
-	struct sk_buff *skb;
-	void *p;
-
-	skb = alloc_skb(3, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, lost poll ack\n",
-		       card->name);
-		return;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_POLLACK);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	skb_queue_tail(&card->dma->send_queue, skb);
-	c4_dispatch_tx(card);
-}
-
-/* ------------------------------------------------------------- */
-
-static void c4_handle_rx(avmcard *card)
-{
-	avmcard_dmainfo *dma = card->dma;
-	struct capi_ctr *ctrl;
-	avmctrl_info *cinfo;
-	struct sk_buff *skb;
-	void *p = dma->recvbuf.dmabuf;
-	u32 ApplId, MsgLen, DataB3Len, NCCI, WindowSize;
-	u8 b1cmd =  _get_byte(&p);
-	u32 cidx;
-
-
-#ifdef AVM_C4_DEBUG
-	printk(KERN_DEBUG "%s: rx 0x%x len=%lu\n", card->name,
-	       b1cmd, (unsigned long)dma->recvlen);
-#endif
-
-	switch (b1cmd) {
-	case RECEIVE_DATA_B3_IND:
-
-		ApplId = (unsigned) _get_word(&p);
-		MsgLen = _get_slice(&p, card->msgbuf);
-		DataB3Len = _get_slice(&p, card->databuf);
-		cidx = CAPIMSG_CONTROLLER(card->msgbuf)-card->cardnr;
-		if (cidx >= card->nlogcontr) cidx = 0;
-		ctrl = &card->ctrlinfo[cidx].capi_ctrl;
-
-		if (MsgLen < 30) { /* not CAPI 64Bit */
-			memset(card->msgbuf + MsgLen, 0, 30 - MsgLen);
-			MsgLen = 30;
-			CAPIMSG_SETLEN(card->msgbuf, 30);
-		}
-		if (!(skb = alloc_skb(DataB3Len + MsgLen, GFP_ATOMIC))) {
-			printk(KERN_ERR "%s: incoming packet dropped\n",
-			       card->name);
-		} else {
-			skb_put_data(skb, card->msgbuf, MsgLen);
-			skb_put_data(skb, card->databuf, DataB3Len);
-			capi_ctr_handle_message(ctrl, ApplId, skb);
-		}
-		break;
-
-	case RECEIVE_MESSAGE:
-
-		ApplId = (unsigned) _get_word(&p);
-		MsgLen = _get_slice(&p, card->msgbuf);
-		cidx = CAPIMSG_CONTROLLER(card->msgbuf)-card->cardnr;
-		if (cidx >= card->nlogcontr) cidx = 0;
-		cinfo = &card->ctrlinfo[cidx];
-		ctrl = &card->ctrlinfo[cidx].capi_ctrl;
-
-		if (!(skb = alloc_skb(MsgLen, GFP_ATOMIC))) {
-			printk(KERN_ERR "%s: incoming packet dropped\n",
-			       card->name);
-		} else {
-			skb_put_data(skb, card->msgbuf, MsgLen);
-			if (CAPIMSG_CMD(skb->data) == CAPI_DATA_B3_CONF)
-				capilib_data_b3_conf(&cinfo->ncci_head, ApplId,
-						     CAPIMSG_NCCI(skb->data),
-						     CAPIMSG_MSGID(skb->data));
-
-			capi_ctr_handle_message(ctrl, ApplId, skb);
-		}
-		break;
-
-	case RECEIVE_NEW_NCCI:
-
-		ApplId = _get_word(&p);
-		NCCI = _get_word(&p);
-		WindowSize = _get_word(&p);
-		cidx = (NCCI & 0x7f) - card->cardnr;
-		if (cidx >= card->nlogcontr) cidx = 0;
-
-		capilib_new_ncci(&card->ctrlinfo[cidx].ncci_head, ApplId, NCCI, WindowSize);
-
-		break;
-
-	case RECEIVE_FREE_NCCI:
-
-		ApplId = _get_word(&p);
-		NCCI = _get_word(&p);
-
-		if (NCCI != 0xffffffff) {
-			cidx = (NCCI & 0x7f) - card->cardnr;
-			if (cidx >= card->nlogcontr) cidx = 0;
-			capilib_free_ncci(&card->ctrlinfo[cidx].ncci_head, ApplId, NCCI);
-		}
-		break;
-
-	case RECEIVE_START:
-#ifdef AVM_C4_POLLDEBUG
-		printk(KERN_INFO "%s: poll from c4\n", card->name);
-#endif
-		if (!suppress_pollack)
-			queue_pollack(card);
-		for (cidx = 0; cidx < card->nr_controllers; cidx++) {
-			ctrl = &card->ctrlinfo[cidx].capi_ctrl;
-			capi_ctr_resume_output(ctrl);
-		}
-		break;
-
-	case RECEIVE_STOP:
-		for (cidx = 0; cidx < card->nr_controllers; cidx++) {
-			ctrl = &card->ctrlinfo[cidx].capi_ctrl;
-			capi_ctr_suspend_output(ctrl);
-		}
-		break;
-
-	case RECEIVE_INIT:
-
-		cidx = card->nlogcontr;
-		if (cidx >= card->nr_controllers) {
-			printk(KERN_ERR "%s: card with %d controllers ??\n",
-			       card->name, cidx + 1);
-			break;
-		}
-		card->nlogcontr++;
-		cinfo = &card->ctrlinfo[cidx];
-		ctrl = &cinfo->capi_ctrl;
-		cinfo->versionlen = _get_slice(&p, cinfo->versionbuf);
-		b1_parse_version(cinfo);
-		printk(KERN_INFO "%s: %s-card (%s) now active\n",
-		       card->name,
-		       cinfo->version[VER_CARDTYPE],
-		       cinfo->version[VER_DRIVER]);
-		capi_ctr_ready(&cinfo->capi_ctrl);
-		break;
-
-	case RECEIVE_TASK_READY:
-		ApplId = (unsigned) _get_word(&p);
-		MsgLen = _get_slice(&p, card->msgbuf);
-		card->msgbuf[MsgLen] = 0;
-		while (MsgLen > 0
-		       && (card->msgbuf[MsgLen - 1] == '\n'
-			   || card->msgbuf[MsgLen - 1] == '\r')) {
-			card->msgbuf[MsgLen - 1] = 0;
-			MsgLen--;
-		}
-		printk(KERN_INFO "%s: task %d \"%s\" ready.\n",
-		       card->name, ApplId, card->msgbuf);
-		break;
-
-	case RECEIVE_DEBUGMSG:
-		MsgLen = _get_slice(&p, card->msgbuf);
-		card->msgbuf[MsgLen] = 0;
-		while (MsgLen > 0
-		       && (card->msgbuf[MsgLen - 1] == '\n'
-			   || card->msgbuf[MsgLen - 1] == '\r')) {
-			card->msgbuf[MsgLen - 1] = 0;
-			MsgLen--;
-		}
-		printk(KERN_INFO "%s: DEBUG: %s\n", card->name, card->msgbuf);
-		break;
-
-	default:
-		printk(KERN_ERR "%s: c4_interrupt: 0x%x ???\n",
-		       card->name, b1cmd);
-		return;
-	}
-}
-
-/* ------------------------------------------------------------- */
-
-static irqreturn_t c4_handle_interrupt(avmcard *card)
-{
-	unsigned long flags;
-	u32 status;
-
-	spin_lock_irqsave(&card->lock, flags);
-	status = c4inmeml(card->mbase + DOORBELL);
-
-	if (status & DBELL_RESET_HOST) {
-		u_int i;
-		c4outmeml(card->mbase + PCI_OUT_INT_MASK, 0x0c);
-		spin_unlock_irqrestore(&card->lock, flags);
-		if (card->nlogcontr == 0)
-			return IRQ_HANDLED;
-		printk(KERN_ERR "%s: unexpected reset\n", card->name);
-		for (i = 0; i < card->nr_controllers; i++) {
-			avmctrl_info *cinfo = &card->ctrlinfo[i];
-			memset(cinfo->version, 0, sizeof(cinfo->version));
-			spin_lock_irqsave(&card->lock, flags);
-			capilib_release(&cinfo->ncci_head);
-			spin_unlock_irqrestore(&card->lock, flags);
-			capi_ctr_down(&cinfo->capi_ctrl);
-		}
-		card->nlogcontr = 0;
-		return IRQ_HANDLED;
-	}
-
-	status &= (DBELL_UP_HOST | DBELL_DOWN_HOST);
-	if (!status) {
-		spin_unlock_irqrestore(&card->lock, flags);
-		return IRQ_HANDLED;
-	}
-	c4outmeml(card->mbase + DOORBELL, status);
-
-	if ((status & DBELL_UP_HOST) != 0) {
-		card->dma->recvlen = c4inmeml(card->mbase + MBOX_UP_LEN);
-		c4outmeml(card->mbase + MBOX_UP_LEN, 0);
-		c4_handle_rx(card);
-		card->dma->recvlen = 0;
-		c4outmeml(card->mbase + MBOX_UP_LEN, card->dma->recvbuf.size);
-		c4outmeml(card->mbase + DOORBELL, DBELL_UP_ARM);
-	}
-
-	if ((status & DBELL_DOWN_HOST) != 0) {
-		card->csr &= ~DBELL_DOWN_ARM;
-		c4_dispatch_tx(card);
-	} else if (card->csr & DBELL_DOWN_HOST) {
-		if (c4inmeml(card->mbase + MBOX_DOWN_LEN) == 0) {
-			card->csr &= ~DBELL_DOWN_ARM;
-			c4_dispatch_tx(card);
-		}
-	}
-	spin_unlock_irqrestore(&card->lock, flags);
-	return IRQ_HANDLED;
-}
-
-static irqreturn_t c4_interrupt(int interrupt, void *devptr)
-{
-	avmcard *card = devptr;
-
-	return c4_handle_interrupt(card);
-}
-
-/* ------------------------------------------------------------- */
-
-static void c4_send_init(avmcard *card)
-{
-	struct sk_buff *skb;
-	void *p;
-	unsigned long flags;
-
-	skb = alloc_skb(15, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, lost register appl.\n",
-		       card->name);
-		return;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_INIT);
-	_put_word(&p, CAPI_MAXAPPL);
-	_put_word(&p, AVM_NCCI_PER_CHANNEL * 30);
-	_put_word(&p, card->cardnr - 1);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	skb_queue_tail(&card->dma->send_queue, skb);
-	spin_lock_irqsave(&card->lock, flags);
-	c4_dispatch_tx(card);
-	spin_unlock_irqrestore(&card->lock, flags);
-}
-
-static int queue_sendconfigword(avmcard *card, u32 val)
-{
-	struct sk_buff *skb;
-	unsigned long flags;
-	void *p;
-
-	skb = alloc_skb(3 + 4, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, send config\n",
-		       card->name);
-		return -ENOMEM;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_CONFIG);
-	_put_word(&p, val);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	skb_queue_tail(&card->dma->send_queue, skb);
-	spin_lock_irqsave(&card->lock, flags);
-	c4_dispatch_tx(card);
-	spin_unlock_irqrestore(&card->lock, flags);
-	return 0;
-}
-
-static int queue_sendconfig(avmcard *card, char cval[4])
-{
-	struct sk_buff *skb;
-	unsigned long flags;
-	void *p;
-
-	skb = alloc_skb(3 + 4, GFP_ATOMIC);
-	if (!skb) {
-		printk(KERN_CRIT "%s: no memory, send config\n",
-		       card->name);
-		return -ENOMEM;
-	}
-	p = skb->data;
-	_put_byte(&p, 0);
-	_put_byte(&p, 0);
-	_put_byte(&p, SEND_CONFIG);
-	_put_byte(&p, cval[0]);
-	_put_byte(&p, cval[1]);
-	_put_byte(&p, cval[2]);
-	_put_byte(&p, cval[3]);
-	skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-	skb_queue_tail(&card->dma->send_queue, skb);
-
-	spin_lock_irqsave(&card->lock, flags);
-	c4_dispatch_tx(card);
-	spin_unlock_irqrestore(&card->lock, flags);
-	return 0;
-}
-
-static int c4_send_config(avmcard *card, capiloaddatapart *config)
-{
-	u8 val[4];
-	unsigned char *dp;
-	u_int left;
-	int retval;
-
-	if ((retval = queue_sendconfigword(card, 1)) != 0)
-		return retval;
-	if ((retval = queue_sendconfigword(card, config->len)) != 0)
-		return retval;
-
-	dp = config->data;
-	left = config->len;
-	while (left >= sizeof(u32)) {
-		if (config->user) {
-			if (copy_from_user(val, dp, sizeof(val)))
-				return -EFAULT;
-		} else {
-			memcpy(val, dp, sizeof(val));
-		}
-		if ((retval = queue_sendconfig(card, val)) != 0)
-			return retval;
-		left -= sizeof(val);
-		dp += sizeof(val);
-	}
-	if (left) {
-		memset(val, 0, sizeof(val));
-		if (config->user) {
-			if (copy_from_user(&val, dp, left))
-				return -EFAULT;
-		} else {
-			memcpy(&val, dp, left);
-		}
-		if ((retval = queue_sendconfig(card, val)) != 0)
-			return retval;
-	}
-
-	return 0;
-}
-
-static int c4_load_firmware(struct capi_ctr *ctrl, capiloaddata *data)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	int retval;
-
-	if ((retval = c4_load_t4file(card, &data->firmware))) {
-		printk(KERN_ERR "%s: failed to load t4file!!\n",
-		       card->name);
-		c4_reset(card);
-		return retval;
-	}
-
-	card->csr = 0;
-	c4outmeml(card->mbase + MBOX_UP_LEN, 0);
-	c4outmeml(card->mbase + MBOX_DOWN_LEN, 0);
-	c4outmeml(card->mbase + DOORBELL, DBELL_INIT);
-	mdelay(1);
-	c4outmeml(card->mbase + DOORBELL,
-		  DBELL_UP_HOST | DBELL_DOWN_HOST | DBELL_RESET_HOST);
-
-	c4outmeml(card->mbase + PCI_OUT_INT_MASK, 0x08);
-
-	card->dma->recvlen = 0;
-	c4outmeml(card->mbase + MBOX_UP_ADDR, card->dma->recvbuf.dmaaddr);
-	c4outmeml(card->mbase + MBOX_UP_LEN, card->dma->recvbuf.size);
-	c4outmeml(card->mbase + DOORBELL, DBELL_UP_ARM);
-
-	if (data->configuration.len > 0 && data->configuration.data) {
-		retval = c4_send_config(card, &data->configuration);
-		if (retval) {
-			printk(KERN_ERR "%s: failed to set config!!\n",
-			       card->name);
-			c4_reset(card);
-			return retval;
-		}
-	}
-
-	c4_send_init(card);
-
-	return 0;
-}
-
-
-static void c4_reset_ctr(struct capi_ctr *ctrl)
-{
-	avmcard *card = ((avmctrl_info *)(ctrl->driverdata))->card;
-	avmctrl_info *cinfo;
-	u_int i;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-
-	c4_reset(card);
-
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	for (i = 0; i < card->nr_controllers; i++) {
-		cinfo = &card->ctrlinfo[i];
-		memset(cinfo->version, 0, sizeof(cinfo->version));
-		capi_ctr_down(&cinfo->capi_ctrl);
-	}
-	card->nlogcontr = 0;
-}
-
-static void c4_remove(struct pci_dev *pdev)
-{
-	avmcard *card = pci_get_drvdata(pdev);
-	avmctrl_info *cinfo;
-	u_int i;
-
-	if (!card)
-		return;
-
-	c4_reset(card);
-
-	for (i = 0; i < card->nr_controllers; i++) {
-		cinfo = &card->ctrlinfo[i];
-		detach_capi_ctr(&cinfo->capi_ctrl);
-	}
-
-	free_irq(card->irq, card);
-	iounmap(card->mbase);
-	release_region(card->port, AVMB1_PORTLEN);
-	avmcard_dma_free(card->dma);
-	pci_set_drvdata(pdev, NULL);
-	b1_free_card(card);
-}
-
-/* ------------------------------------------------------------- */
-
-
-static void c4_register_appl(struct capi_ctr *ctrl,
-			     u16 appl,
-			     capi_register_params *rp)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	struct sk_buff *skb;
-	int want = rp->level3cnt;
-	unsigned long flags;
-	int nconn;
-	void *p;
-
-	if (ctrl->cnr == card->cardnr) {
-
-		if (want > 0) nconn = want;
-		else nconn = ctrl->profile.nbchannel * 4 * -want;
-		if (nconn == 0) nconn = ctrl->profile.nbchannel * 4;
-
-		skb = alloc_skb(23, GFP_ATOMIC);
-		if (!skb) {
-			printk(KERN_CRIT "%s: no memory, lost register appl.\n",
-			       card->name);
-			return;
-		}
-		p = skb->data;
-		_put_byte(&p, 0);
-		_put_byte(&p, 0);
-		_put_byte(&p, SEND_REGISTER);
-		_put_word(&p, appl);
-		_put_word(&p, 1024 * (nconn + 1));
-		_put_word(&p, nconn);
-		_put_word(&p, rp->datablkcnt);
-		_put_word(&p, rp->datablklen);
-		skb_put(skb, (u8 *)p - (u8 *)skb->data);
-
-		skb_queue_tail(&card->dma->send_queue, skb);
-
-		spin_lock_irqsave(&card->lock, flags);
-		c4_dispatch_tx(card);
-		spin_unlock_irqrestore(&card->lock, flags);
-	}
-}
-
-/* ------------------------------------------------------------- */
-
-static void c4_release_appl(struct capi_ctr *ctrl, u16 appl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned long flags;
-	struct sk_buff *skb;
-	void *p;
-
-	spin_lock_irqsave(&card->lock, flags);
-	capilib_release_appl(&cinfo->ncci_head, appl);
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	if (ctrl->cnr == card->cardnr) {
-		skb = alloc_skb(7, GFP_ATOMIC);
-		if (!skb) {
-			printk(KERN_CRIT "%s: no memory, lost release appl.\n",
-			       card->name);
-			return;
-		}
-		p = skb->data;
-		_put_byte(&p, 0);
-		_put_byte(&p, 0);
-		_put_byte(&p, SEND_RELEASE);
-		_put_word(&p, appl);
-
-		skb_put(skb, (u8 *)p - (u8 *)skb->data);
-		skb_queue_tail(&card->dma->send_queue, skb);
-		spin_lock_irqsave(&card->lock, flags);
-		c4_dispatch_tx(card);
-		spin_unlock_irqrestore(&card->lock, flags);
-	}
-}
-
-/* ------------------------------------------------------------- */
-
-
-static u16 c4_send_message(struct capi_ctr *ctrl, struct sk_buff *skb)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	u16 retval = CAPI_NOERROR;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-	if (CAPIMSG_CMD(skb->data) == CAPI_DATA_B3_REQ) {
-		retval = capilib_data_b3_req(&cinfo->ncci_head,
-					     CAPIMSG_APPID(skb->data),
-					     CAPIMSG_NCCI(skb->data),
-					     CAPIMSG_MSGID(skb->data));
-	}
-	if (retval == CAPI_NOERROR) {
-		skb_queue_tail(&card->dma->send_queue, skb);
-		c4_dispatch_tx(card);
-	}
-	spin_unlock_irqrestore(&card->lock, flags);
-	return retval;
-}
-
-/* ------------------------------------------------------------- */
-
-static char *c4_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d 0x%lx",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->membase : 0
-		);
-	return cinfo->infobuf;
-}
-
-static int c4_proc_show(struct seq_file *m, void *v)
-{
-	struct capi_ctr *ctrl = m->private;
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	u8 flag;
-	char *s;
-
-	seq_printf(m, "%-16s %s\n", "name", card->name);
-	seq_printf(m, "%-16s 0x%x\n", "io", card->port);
-	seq_printf(m, "%-16s %d\n", "irq", card->irq);
-	seq_printf(m, "%-16s 0x%lx\n", "membase", card->membase);
-	switch (card->cardtype) {
-	case avm_b1isa: s = "B1 ISA"; break;
-	case avm_b1pci: s = "B1 PCI"; break;
-	case avm_b1pcmcia: s = "B1 PCMCIA"; break;
-	case avm_m1: s = "M1"; break;
-	case avm_m2: s = "M2"; break;
-	case avm_t1isa: s = "T1 ISA (HEMA)"; break;
-	case avm_t1pci: s = "T1 PCI"; break;
-	case avm_c4: s = "C4"; break;
-	case avm_c2: s = "C2"; break;
-	default: s = "???"; break;
-	}
-	seq_printf(m, "%-16s %s\n", "type", s);
-	if ((s = cinfo->version[VER_DRIVER]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_driver", s);
-	if ((s = cinfo->version[VER_CARDTYPE]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_cardtype", s);
-	if ((s = cinfo->version[VER_SERIAL]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_serial", s);
-
-	if (card->cardtype != avm_m1) {
-		flag = ((u8 *)(ctrl->profile.manu))[3];
-		if (flag)
-			seq_printf(m, "%-16s%s%s%s%s%s%s%s\n",
-				   "protocol",
-				   (flag & 0x01) ? " DSS1" : "",
-				   (flag & 0x02) ? " CT1" : "",
-				   (flag & 0x04) ? " VN3" : "",
-				   (flag & 0x08) ? " NI1" : "",
-				   (flag & 0x10) ? " AUSTEL" : "",
-				   (flag & 0x20) ? " ESS" : "",
-				   (flag & 0x40) ? " 1TR6" : ""
-				);
-	}
-	if (card->cardtype != avm_m1) {
-		flag = ((u8 *)(ctrl->profile.manu))[5];
-		if (flag)
-			seq_printf(m, "%-16s%s%s%s%s\n",
-				   "linetype",
-				   (flag & 0x01) ? " point to point" : "",
-				   (flag & 0x02) ? " point to multipoint" : "",
-				   (flag & 0x08) ? " leased line without D-channel" : "",
-				   (flag & 0x04) ? " leased line with D-channel" : ""
-				);
-	}
-	seq_printf(m, "%-16s %s\n", "cardname", cinfo->cardname);
-
-	return 0;
-}
-
-/* ------------------------------------------------------------- */
-
-static int c4_add_card(struct capicardparams *p, struct pci_dev *dev,
-		       int nr_controllers)
-{
-	avmcard *card;
-	avmctrl_info *cinfo;
-	int retval;
-	int i;
-
-	card = b1_alloc_card(nr_controllers);
-	if (!card) {
-		printk(KERN_WARNING "c4: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-	card->dma = avmcard_dma_alloc("c4", dev, 2048 + 128, 2048 + 128);
-	if (!card->dma) {
-		printk(KERN_WARNING "c4: no memory.\n");
-		retval = -ENOMEM;
-		goto err_free;
-	}
-
-	sprintf(card->name, "c%d-%x", nr_controllers, p->port);
-	card->port = p->port;
-	card->irq = p->irq;
-	card->membase = p->membase;
-	card->cardtype = (nr_controllers == 4) ? avm_c4 : avm_c2;
-
-	if (!request_region(card->port, AVMB1_PORTLEN, card->name)) {
-		printk(KERN_WARNING "c4: ports 0x%03x-0x%03x in use.\n",
-		       card->port, card->port + AVMB1_PORTLEN);
-		retval = -EBUSY;
-		goto err_free_dma;
-	}
-
-	card->mbase = ioremap(card->membase, 128);
-	if (card->mbase == NULL) {
-		printk(KERN_NOTICE "c4: can't remap memory at 0x%lx\n",
-		       card->membase);
-		retval = -EIO;
-		goto err_release_region;
-	}
-
-	retval = c4_detect(card);
-	if (retval != 0) {
-		printk(KERN_NOTICE "c4: NO card at 0x%x error(%d)\n",
-		       card->port, retval);
-		retval = -EIO;
-		goto err_unmap;
-	}
-	c4_reset(card);
-
-	retval = request_irq(card->irq, c4_interrupt, IRQF_SHARED, card->name, card);
-	if (retval) {
-		printk(KERN_ERR "c4: unable to get IRQ %d.\n", card->irq);
-		retval = -EBUSY;
-		goto err_unmap;
-	}
-
-	for (i = 0; i < nr_controllers; i++) {
-		cinfo = &card->ctrlinfo[i];
-		cinfo->capi_ctrl.owner = THIS_MODULE;
-		cinfo->capi_ctrl.driver_name   = "c4";
-		cinfo->capi_ctrl.driverdata    = cinfo;
-		cinfo->capi_ctrl.register_appl = c4_register_appl;
-		cinfo->capi_ctrl.release_appl  = c4_release_appl;
-		cinfo->capi_ctrl.send_message  = c4_send_message;
-		cinfo->capi_ctrl.load_firmware = c4_load_firmware;
-		cinfo->capi_ctrl.reset_ctr     = c4_reset_ctr;
-		cinfo->capi_ctrl.procinfo      = c4_procinfo;
-		cinfo->capi_ctrl.proc_show     = c4_proc_show;
-		strcpy(cinfo->capi_ctrl.name, card->name);
-
-		retval = attach_capi_ctr(&cinfo->capi_ctrl);
-		if (retval) {
-			printk(KERN_ERR "c4: attach controller failed (%d).\n", i);
-			for (i--; i >= 0; i--) {
-				cinfo = &card->ctrlinfo[i];
-				detach_capi_ctr(&cinfo->capi_ctrl);
-			}
-			goto err_free_irq;
-		}
-		if (i == 0)
-			card->cardnr = cinfo->capi_ctrl.cnr;
-	}
-
-	printk(KERN_INFO "c4: AVM C%d at i/o %#x, irq %d, mem %#lx\n",
-	       nr_controllers, card->port, card->irq,
-	       card->membase);
-	pci_set_drvdata(dev, card);
-	return 0;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_unmap:
-	iounmap(card->mbase);
-err_release_region:
-	release_region(card->port, AVMB1_PORTLEN);
-err_free_dma:
-	avmcard_dma_free(card->dma);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-}
-
-/* ------------------------------------------------------------- */
-
-static int c4_probe(struct pci_dev *dev, const struct pci_device_id *ent)
-{
-	int nr = ent->driver_data;
-	int retval = 0;
-	struct capicardparams param;
-
-	if (pci_enable_device(dev) < 0) {
-		printk(KERN_ERR "c4: failed to enable AVM-C%d\n", nr);
-		return -ENODEV;
-	}
-	pci_set_master(dev);
-
-	param.port = pci_resource_start(dev, 1);
-	param.irq = dev->irq;
-	param.membase = pci_resource_start(dev, 0);
-
-	printk(KERN_INFO "c4: PCI BIOS reports AVM-C%d at i/o %#x, irq %d, mem %#x\n",
-	       nr, param.port, param.irq, param.membase);
-
-	retval = c4_add_card(&param, dev, nr);
-	if (retval != 0) {
-		printk(KERN_ERR "c4: no AVM-C%d at i/o %#x, irq %d detected, mem %#x\n",
-		       nr, param.port, param.irq, param.membase);
-		pci_disable_device(dev);
-		return -ENODEV;
-	}
-	return 0;
-}
-
-static struct pci_driver c4_pci_driver = {
-	.name           = "c4",
-	.id_table       = c4_pci_tbl,
-	.probe          = c4_probe,
-	.remove         = c4_remove,
-};
-
-static struct capi_driver capi_driver_c2 = {
-	.name		= "c2",
-	.revision	= "1.0",
-};
-
-static struct capi_driver capi_driver_c4 = {
-	.name		= "c4",
-	.revision	= "1.0",
-};
-
-static int __init c4_init(void)
-{
-	char *p;
-	char rev[32];
-	int err;
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-	err = pci_register_driver(&c4_pci_driver);
-	if (!err) {
-		strlcpy(capi_driver_c2.revision, rev, 32);
-		register_capi_driver(&capi_driver_c2);
-		strlcpy(capi_driver_c4.revision, rev, 32);
-		register_capi_driver(&capi_driver_c4);
-		printk(KERN_INFO "c4: revision %s\n", rev);
-	}
-	return err;
-}
-
-static void __exit c4_exit(void)
-{
-	unregister_capi_driver(&capi_driver_c2);
-	unregister_capi_driver(&capi_driver_c4);
-	pci_unregister_driver(&c4_pci_driver);
-}
-
-module_init(c4_init);
-module_exit(c4_exit);
diff --git a/drivers/staging/isdn/avm/t1isa.c b/drivers/staging/isdn/avm/t1isa.c
deleted file mode 100644
index 2153619c5b31..000000000000
--- a/drivers/staging/isdn/avm/t1isa.c
+++ /dev/null
@@ -1,594 +0,0 @@
-/* $Id: t1isa.c,v 1.1.2.3 2004/02/10 01:07:12 keil Exp $
- *
- * Module for AVM T1 HEMA-card.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/capi.h>
-#include <linux/netdevice.h>
-#include <linux/kernelcapi.h>
-#include <linux/init.h>
-#include <linux/pci.h>
-#include <linux/gfp.h>
-#include <asm/io.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-
-/* ------------------------------------------------------------- */
-
-static char *revision = "$Revision: 1.1.2.3 $";
-
-/* ------------------------------------------------------------- */
-
-MODULE_DESCRIPTION("CAPI4Linux: Driver for AVM T1 HEMA ISA card");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/* ------------------------------------------------------------- */
-
-static int hema_irq_table[16] =
-{0,
- 0,
- 0,
- 0x80,				/* irq 3 */
- 0,
- 0x90,				/* irq 5 */
- 0,
- 0xA0,				/* irq 7 */
- 0,
- 0xB0,				/* irq 9 */
- 0xC0,				/* irq 10 */
- 0xD0,				/* irq 11 */
- 0xE0,				/* irq 12 */
- 0,
- 0,
- 0xF0,				/* irq 15 */
-};
-
-static int t1_detectandinit(unsigned int base, unsigned irq, int cardnr)
-{
-	unsigned char cregs[8];
-	unsigned char reverse_cardnr;
-	unsigned char dummy;
-	int i;
-
-	reverse_cardnr =   ((cardnr & 0x01) << 3) | ((cardnr & 0x02) << 1)
-		| ((cardnr & 0x04) >> 1) | ((cardnr & 0x08) >> 3);
-	cregs[0] = (HEMA_VERSION_ID << 4) | (reverse_cardnr & 0xf);
-	cregs[1] = 0x00; /* fast & slow link connected to CON1 */
-	cregs[2] = 0x05; /* fast link 20MBit, slow link 20 MBit */
-	cregs[3] = 0;
-	cregs[4] = 0x11; /* zero wait state */
-	cregs[5] = hema_irq_table[irq & 0xf];
-	cregs[6] = 0;
-	cregs[7] = 0;
-
-	/*
-	 * no one else should use the ISA bus in this moment,
-	 * but no function there to prevent this :-(
-	 * save_flags(flags); cli();
-	 */
-
-	/* board reset */
-	t1outp(base, T1_RESETBOARD, 0xf);
-	mdelay(100);
-	dummy = t1inp(base, T1_FASTLINK + T1_OUTSTAT); /* first read */
-
-	/* write config */
-	dummy = (base >> 4) & 0xff;
-	for (i = 1; i <= 0xf; i++) t1outp(base, i, dummy);
-	t1outp(base, HEMA_PAL_ID & 0xf, dummy);
-	t1outp(base, HEMA_PAL_ID >> 4, cregs[0]);
-	for (i = 1; i < 7; i++) t1outp(base, 0, cregs[i]);
-	t1outp(base, ((base >> 4)) & 0x3, cregs[7]);
-	/* restore_flags(flags); */
-
-	mdelay(100);
-	t1outp(base, T1_FASTLINK + T1_RESETLINK, 0);
-	t1outp(base, T1_SLOWLINK + T1_RESETLINK, 0);
-	mdelay(10);
-	t1outp(base, T1_FASTLINK + T1_RESETLINK, 1);
-	t1outp(base, T1_SLOWLINK + T1_RESETLINK, 1);
-	mdelay(100);
-	t1outp(base, T1_FASTLINK + T1_RESETLINK, 0);
-	t1outp(base, T1_SLOWLINK + T1_RESETLINK, 0);
-	mdelay(10);
-	t1outp(base, T1_FASTLINK + T1_ANALYSE, 0);
-	mdelay(5);
-	t1outp(base, T1_SLOWLINK + T1_ANALYSE, 0);
-
-	if (t1inp(base, T1_FASTLINK + T1_OUTSTAT) != 0x1) /* tx empty */
-		return 1;
-	if (t1inp(base, T1_FASTLINK + T1_INSTAT) != 0x0) /* rx empty */
-		return 2;
-	if (t1inp(base, T1_FASTLINK + T1_IRQENABLE) != 0x0)
-		return 3;
-	if ((t1inp(base, T1_FASTLINK + T1_FIFOSTAT) & 0xf0) != 0x70)
-		return 4;
-	if ((t1inp(base, T1_FASTLINK + T1_IRQMASTER) & 0x0e) != 0)
-		return 5;
-	if ((t1inp(base, T1_FASTLINK + T1_IDENT) & 0x7d) != 1)
-		return 6;
-	if (t1inp(base, T1_SLOWLINK + T1_OUTSTAT) != 0x1) /* tx empty */
-		return 7;
-	if ((t1inp(base, T1_SLOWLINK + T1_IRQMASTER) & 0x0e) != 0)
-		return 8;
-	if ((t1inp(base, T1_SLOWLINK + T1_IDENT) & 0x7d) != 0)
-		return 9;
-	return 0;
-}
-
-static irqreturn_t t1isa_interrupt(int interrupt, void *devptr)
-{
-	avmcard *card = devptr;
-	avmctrl_info *cinfo = &card->ctrlinfo[0];
-	struct capi_ctr *ctrl = &cinfo->capi_ctrl;
-	unsigned char b1cmd;
-	struct sk_buff *skb;
-
-	unsigned ApplId;
-	unsigned MsgLen;
-	unsigned DataB3Len;
-	unsigned NCCI;
-	unsigned WindowSize;
-	unsigned long flags;
-
-	spin_lock_irqsave(&card->lock, flags);
-
-	while (b1_rx_full(card->port)) {
-
-		b1cmd = b1_get_byte(card->port);
-
-		switch (b1cmd) {
-
-		case RECEIVE_DATA_B3_IND:
-
-			ApplId = (unsigned) b1_get_word(card->port);
-			MsgLen = t1_get_slice(card->port, card->msgbuf);
-			DataB3Len = t1_get_slice(card->port, card->databuf);
-			spin_unlock_irqrestore(&card->lock, flags);
-
-			if (MsgLen < 30) { /* not CAPI 64Bit */
-				memset(card->msgbuf + MsgLen, 0, 30 - MsgLen);
-				MsgLen = 30;
-				CAPIMSG_SETLEN(card->msgbuf, 30);
-			}
-			if (!(skb = alloc_skb(DataB3Len + MsgLen, GFP_ATOMIC))) {
-				printk(KERN_ERR "%s: incoming packet dropped\n",
-				       card->name);
-			} else {
-				skb_put_data(skb, card->msgbuf, MsgLen);
-				skb_put_data(skb, card->databuf, DataB3Len);
-				capi_ctr_handle_message(ctrl, ApplId, skb);
-			}
-			break;
-
-		case RECEIVE_MESSAGE:
-
-			ApplId = (unsigned) b1_get_word(card->port);
-			MsgLen = t1_get_slice(card->port, card->msgbuf);
-			if (!(skb = alloc_skb(MsgLen, GFP_ATOMIC))) {
-				spin_unlock_irqrestore(&card->lock, flags);
-				printk(KERN_ERR "%s: incoming packet dropped\n",
-				       card->name);
-			} else {
-				skb_put_data(skb, card->msgbuf, MsgLen);
-				if (CAPIMSG_CMD(skb->data) == CAPI_DATA_B3)
-					capilib_data_b3_conf(&cinfo->ncci_head, ApplId,
-							     CAPIMSG_NCCI(skb->data),
-							     CAPIMSG_MSGID(skb->data));
-				spin_unlock_irqrestore(&card->lock, flags);
-				capi_ctr_handle_message(ctrl, ApplId, skb);
-			}
-			break;
-
-		case RECEIVE_NEW_NCCI:
-
-			ApplId = b1_get_word(card->port);
-			NCCI = b1_get_word(card->port);
-			WindowSize = b1_get_word(card->port);
-			capilib_new_ncci(&cinfo->ncci_head, ApplId, NCCI, WindowSize);
-			spin_unlock_irqrestore(&card->lock, flags);
-			break;
-
-		case RECEIVE_FREE_NCCI:
-
-			ApplId = b1_get_word(card->port);
-			NCCI = b1_get_word(card->port);
-			if (NCCI != 0xffffffff)
-				capilib_free_ncci(&cinfo->ncci_head, ApplId, NCCI);
-			spin_unlock_irqrestore(&card->lock, flags);
-			break;
-
-		case RECEIVE_START:
-			b1_put_byte(card->port, SEND_POLLACK);
-			spin_unlock_irqrestore(&card->lock, flags);
-			capi_ctr_resume_output(ctrl);
-			break;
-
-		case RECEIVE_STOP:
-			spin_unlock_irqrestore(&card->lock, flags);
-			capi_ctr_suspend_output(ctrl);
-			break;
-
-		case RECEIVE_INIT:
-
-			cinfo->versionlen = t1_get_slice(card->port, cinfo->versionbuf);
-			spin_unlock_irqrestore(&card->lock, flags);
-			b1_parse_version(cinfo);
-			printk(KERN_INFO "%s: %s-card (%s) now active\n",
-			       card->name,
-			       cinfo->version[VER_CARDTYPE],
-			       cinfo->version[VER_DRIVER]);
-			capi_ctr_ready(ctrl);
-			break;
-
-		case RECEIVE_TASK_READY:
-			ApplId = (unsigned) b1_get_word(card->port);
-			MsgLen = t1_get_slice(card->port, card->msgbuf);
-			spin_unlock_irqrestore(&card->lock, flags);
-			card->msgbuf[MsgLen] = 0;
-			while (MsgLen > 0
-			       && (card->msgbuf[MsgLen - 1] == '\n'
-				   || card->msgbuf[MsgLen - 1] == '\r')) {
-				card->msgbuf[MsgLen - 1] = 0;
-				MsgLen--;
-			}
-			printk(KERN_INFO "%s: task %d \"%s\" ready.\n",
-			       card->name, ApplId, card->msgbuf);
-			break;
-
-		case RECEIVE_DEBUGMSG:
-			MsgLen = t1_get_slice(card->port, card->msgbuf);
-			spin_unlock_irqrestore(&card->lock, flags);
-			card->msgbuf[MsgLen] = 0;
-			while (MsgLen > 0
-			       && (card->msgbuf[MsgLen - 1] == '\n'
-				   || card->msgbuf[MsgLen - 1] == '\r')) {
-				card->msgbuf[MsgLen - 1] = 0;
-				MsgLen--;
-			}
-			printk(KERN_INFO "%s: DEBUG: %s\n", card->name, card->msgbuf);
-			break;
-
-
-		case 0xff:
-			spin_unlock_irqrestore(&card->lock, flags);
-			printk(KERN_ERR "%s: card reseted ?\n", card->name);
-			return IRQ_HANDLED;
-		default:
-			spin_unlock_irqrestore(&card->lock, flags);
-			printk(KERN_ERR "%s: b1_interrupt: 0x%x ???\n",
-			       card->name, b1cmd);
-			return IRQ_NONE;
-		}
-	}
-	return IRQ_HANDLED;
-}
-
-/* ------------------------------------------------------------- */
-
-static int t1isa_load_firmware(struct capi_ctr *ctrl, capiloaddata *data)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-	int retval;
-
-	t1_disable_irq(port);
-	b1_reset(port);
-
-	if ((retval = b1_load_t4file(card, &data->firmware))) {
-		b1_reset(port);
-		printk(KERN_ERR "%s: failed to load t4file!!\n",
-		       card->name);
-		return retval;
-	}
-
-	if (data->configuration.len > 0 && data->configuration.data) {
-		if ((retval = b1_load_config(card, &data->configuration))) {
-			b1_reset(port);
-			printk(KERN_ERR "%s: failed to load config!!\n",
-			       card->name);
-			return retval;
-		}
-	}
-
-	if (!b1_loaded(card)) {
-		printk(KERN_ERR "%s: failed to load t4file.\n", card->name);
-		return -EIO;
-	}
-
-	spin_lock_irqsave(&card->lock, flags);
-	b1_setinterrupt(port, card->irq, card->cardtype);
-	b1_put_byte(port, SEND_INIT);
-	b1_put_word(port, CAPI_MAXAPPL);
-	b1_put_word(port, AVM_NCCI_PER_CHANNEL * 30);
-	b1_put_word(port, ctrl->cnr - 1);
-	spin_unlock_irqrestore(&card->lock, flags);
-
-	return 0;
-}
-
-static void t1isa_reset_ctr(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-
-	t1_disable_irq(port);
-	b1_reset(port);
-	b1_reset(port);
-
-	memset(cinfo->version, 0, sizeof(cinfo->version));
-	spin_lock_irqsave(&card->lock, flags);
-	capilib_release(&cinfo->ncci_head);
-	spin_unlock_irqrestore(&card->lock, flags);
-	capi_ctr_down(ctrl);
-}
-
-static void t1isa_remove(struct pci_dev *pdev)
-{
-	avmctrl_info *cinfo = pci_get_drvdata(pdev);
-	avmcard *card;
-
-	if (!cinfo)
-		return;
-
-	card = cinfo->card;
-
-	t1_disable_irq(card->port);
-	b1_reset(card->port);
-	b1_reset(card->port);
-	t1_reset(card->port);
-
-	detach_capi_ctr(&cinfo->capi_ctrl);
-	free_irq(card->irq, card);
-	release_region(card->port, AVMB1_PORTLEN);
-	b1_free_card(card);
-}
-
-/* ------------------------------------------------------------- */
-
-static u16 t1isa_send_message(struct capi_ctr *ctrl, struct sk_buff *skb);
-static char *t1isa_procinfo(struct capi_ctr *ctrl);
-
-static int t1isa_probe(struct pci_dev *pdev, int cardnr)
-{
-	avmctrl_info *cinfo;
-	avmcard *card;
-	int retval;
-
-	card = b1_alloc_card(1);
-	if (!card) {
-		printk(KERN_WARNING "t1isa: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-
-	cinfo = card->ctrlinfo;
-	card->port = pci_resource_start(pdev, 0);
-	card->irq = pdev->irq;
-	card->cardtype = avm_t1isa;
-	card->cardnr = cardnr;
-	sprintf(card->name, "t1isa-%x", card->port);
-
-	if (!(((card->port & 0x7) == 0) && ((card->port & 0x30) != 0x30))) {
-		printk(KERN_WARNING "t1isa: invalid port 0x%x.\n", card->port);
-		retval = -EINVAL;
-		goto err_free;
-	}
-	if (hema_irq_table[card->irq & 0xf] == 0) {
-		printk(KERN_WARNING "t1isa: irq %d not valid.\n", card->irq);
-		retval = -EINVAL;
-		goto err_free;
-	}
-	if (!request_region(card->port, AVMB1_PORTLEN, card->name)) {
-		printk(KERN_INFO "t1isa: ports 0x%03x-0x%03x in use.\n",
-		       card->port, card->port + AVMB1_PORTLEN);
-		retval = -EBUSY;
-		goto err_free;
-	}
-	retval = request_irq(card->irq, t1isa_interrupt, 0, card->name, card);
-	if (retval) {
-		printk(KERN_INFO "t1isa: unable to get IRQ %d.\n", card->irq);
-		retval = -EBUSY;
-		goto err_release_region;
-	}
-
-	if ((retval = t1_detectandinit(card->port, card->irq, card->cardnr)) != 0) {
-		printk(KERN_INFO "t1isa: NO card at 0x%x (%d)\n",
-		       card->port, retval);
-		retval = -ENODEV;
-		goto err_free_irq;
-	}
-	t1_disable_irq(card->port);
-	b1_reset(card->port);
-
-	cinfo->capi_ctrl.owner = THIS_MODULE;
-	cinfo->capi_ctrl.driver_name   = "t1isa";
-	cinfo->capi_ctrl.driverdata    = cinfo;
-	cinfo->capi_ctrl.register_appl = b1_register_appl;
-	cinfo->capi_ctrl.release_appl  = b1_release_appl;
-	cinfo->capi_ctrl.send_message  = t1isa_send_message;
-	cinfo->capi_ctrl.load_firmware = t1isa_load_firmware;
-	cinfo->capi_ctrl.reset_ctr     = t1isa_reset_ctr;
-	cinfo->capi_ctrl.procinfo      = t1isa_procinfo;
-	cinfo->capi_ctrl.proc_show     = b1_proc_show;
-	strcpy(cinfo->capi_ctrl.name, card->name);
-
-	retval = attach_capi_ctr(&cinfo->capi_ctrl);
-	if (retval) {
-		printk(KERN_INFO "t1isa: attach controller failed.\n");
-		goto err_free_irq;
-	}
-
-	printk(KERN_INFO "t1isa: AVM T1 ISA at i/o %#x, irq %d, card %d\n",
-	       card->port, card->irq, card->cardnr);
-
-	pci_set_drvdata(pdev, cinfo);
-	return 0;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_release_region:
-	release_region(card->port, AVMB1_PORTLEN);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-}
-
-static u16 t1isa_send_message(struct capi_ctr *ctrl, struct sk_buff *skb)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-	avmcard *card = cinfo->card;
-	unsigned int port = card->port;
-	unsigned long flags;
-	u16 len = CAPIMSG_LEN(skb->data);
-	u8 cmd = CAPIMSG_COMMAND(skb->data);
-	u8 subcmd = CAPIMSG_SUBCOMMAND(skb->data);
-	u16 dlen, retval;
-
-	spin_lock_irqsave(&card->lock, flags);
-	if (CAPICMD(cmd, subcmd) == CAPI_DATA_B3_REQ) {
-		retval = capilib_data_b3_req(&cinfo->ncci_head,
-					     CAPIMSG_APPID(skb->data),
-					     CAPIMSG_NCCI(skb->data),
-					     CAPIMSG_MSGID(skb->data));
-		if (retval != CAPI_NOERROR) {
-			spin_unlock_irqrestore(&card->lock, flags);
-			return retval;
-		}
-		dlen = CAPIMSG_DATALEN(skb->data);
-
-		b1_put_byte(port, SEND_DATA_B3_REQ);
-		t1_put_slice(port, skb->data, len);
-		t1_put_slice(port, skb->data + len, dlen);
-	} else {
-		b1_put_byte(port, SEND_MESSAGE);
-		t1_put_slice(port, skb->data, len);
-	}
-	spin_unlock_irqrestore(&card->lock, flags);
-	dev_kfree_skb_any(skb);
-	return CAPI_NOERROR;
-}
-/* ------------------------------------------------------------- */
-
-static char *t1isa_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d %d",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->cardnr : 0
-		);
-	return cinfo->infobuf;
-}
-
-
-/* ------------------------------------------------------------- */
-
-#define MAX_CARDS 4
-static struct pci_dev isa_dev[MAX_CARDS];
-static int io[MAX_CARDS];
-static int irq[MAX_CARDS];
-static int cardnr[MAX_CARDS];
-
-module_param_hw_array(io, int, ioport, NULL, 0);
-module_param_hw_array(irq, int, irq, NULL, 0);
-module_param_array(cardnr, int, NULL, 0);
-MODULE_PARM_DESC(io, "I/O base address(es)");
-MODULE_PARM_DESC(irq, "IRQ number(s) (assigned)");
-MODULE_PARM_DESC(cardnr, "Card number(s) (as jumpered)");
-
-static int t1isa_add_card(struct capi_driver *driver, capicardparams *data)
-{
-	int i;
-
-	for (i = 0; i < MAX_CARDS; i++) {
-		if (isa_dev[i].resource[0].start)
-			continue;
-
-		isa_dev[i].resource[0].start = data->port;
-		isa_dev[i].irq = data->irq;
-
-		if (t1isa_probe(&isa_dev[i], data->cardnr) == 0)
-			return 0;
-	}
-	return -ENODEV;
-}
-
-static struct capi_driver capi_driver_t1isa = {
-	.name		= "t1isa",
-	.revision	= "1.0",
-	.add_card       = t1isa_add_card,
-};
-
-static int __init t1isa_init(void)
-{
-	char rev[32];
-	char *p;
-	int i;
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-	for (i = 0; i < MAX_CARDS; i++) {
-		if (!io[i])
-			break;
-
-		isa_dev[i].resource[0].start = io[i];
-		isa_dev[i].irq = irq[i];
-
-		if (t1isa_probe(&isa_dev[i], cardnr[i]) != 0)
-			return -ENODEV;
-	}
-
-	strlcpy(capi_driver_t1isa.revision, rev, 32);
-	register_capi_driver(&capi_driver_t1isa);
-	printk(KERN_INFO "t1isa: revision %s\n", rev);
-
-	return 0;
-}
-
-static void __exit t1isa_exit(void)
-{
-	int i;
-
-	unregister_capi_driver(&capi_driver_t1isa);
-	for (i = 0; i < MAX_CARDS; i++) {
-		if (!io[i])
-			break;
-
-		t1isa_remove(&isa_dev[i]);
-	}
-}
-
-module_init(t1isa_init);
-module_exit(t1isa_exit);
diff --git a/drivers/staging/isdn/avm/t1pci.c b/drivers/staging/isdn/avm/t1pci.c
deleted file mode 100644
index f5ed1d5004c9..000000000000
--- a/drivers/staging/isdn/avm/t1pci.c
+++ /dev/null
@@ -1,259 +0,0 @@
-/* $Id: t1pci.c,v 1.1.2.2 2004/01/16 21:09:27 keil Exp $
- *
- * Module for AVM T1 PCI-card.
- *
- * Copyright 1999 by Carsten Paeth <calle@calle.de>
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/kernel.h>
-#include <linux/skbuff.h>
-#include <linux/delay.h>
-#include <linux/mm.h>
-#include <linux/interrupt.h>
-#include <linux/ioport.h>
-#include <linux/pci.h>
-#include <linux/capi.h>
-#include <linux/init.h>
-#include <asm/io.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-#include "avmcard.h"
-
-#undef CONFIG_T1PCI_DEBUG
-#undef CONFIG_T1PCI_POLLDEBUG
-
-/* ------------------------------------------------------------- */
-static char *revision = "$Revision: 1.1.2.2 $";
-/* ------------------------------------------------------------- */
-
-static struct pci_device_id t1pci_pci_tbl[] = {
-	{ PCI_VENDOR_ID_AVM, PCI_DEVICE_ID_AVM_T1, PCI_ANY_ID, PCI_ANY_ID },
-	{ }				/* Terminating entry */
-};
-
-MODULE_DEVICE_TABLE(pci, t1pci_pci_tbl);
-MODULE_DESCRIPTION("CAPI4Linux: Driver for AVM T1 PCI card");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
-
-/* ------------------------------------------------------------- */
-
-static char *t1pci_procinfo(struct capi_ctr *ctrl);
-
-static int t1pci_add_card(struct capicardparams *p, struct pci_dev *pdev)
-{
-	avmcard *card;
-	avmctrl_info *cinfo;
-	int retval;
-
-	card = b1_alloc_card(1);
-	if (!card) {
-		printk(KERN_WARNING "t1pci: no memory.\n");
-		retval = -ENOMEM;
-		goto err;
-	}
-
-	card->dma = avmcard_dma_alloc("t1pci", pdev, 2048 + 128, 2048 + 128);
-	if (!card->dma) {
-		printk(KERN_WARNING "t1pci: no memory.\n");
-		retval = -ENOMEM;
-		goto err_free;
-	}
-
-	cinfo = card->ctrlinfo;
-	sprintf(card->name, "t1pci-%x", p->port);
-	card->port = p->port;
-	card->irq = p->irq;
-	card->membase = p->membase;
-	card->cardtype = avm_t1pci;
-
-	if (!request_region(card->port, AVMB1_PORTLEN, card->name)) {
-		printk(KERN_WARNING "t1pci: ports 0x%03x-0x%03x in use.\n",
-		       card->port, card->port + AVMB1_PORTLEN);
-		retval = -EBUSY;
-		goto err_free_dma;
-	}
-
-	card->mbase = ioremap(card->membase, 64);
-	if (!card->mbase) {
-		printk(KERN_NOTICE "t1pci: can't remap memory at 0x%lx\n",
-		       card->membase);
-		retval = -EIO;
-		goto err_release_region;
-	}
-
-	b1dma_reset(card);
-
-	retval = t1pci_detect(card);
-	if (retval != 0) {
-		if (retval < 6)
-			printk(KERN_NOTICE "t1pci: NO card at 0x%x (%d)\n",
-			       card->port, retval);
-		else
-			printk(KERN_NOTICE "t1pci: card at 0x%x, but cable not connected or T1 has no power (%d)\n",
-			       card->port, retval);
-		retval = -EIO;
-		goto err_unmap;
-	}
-	b1dma_reset(card);
-
-	retval = request_irq(card->irq, b1dma_interrupt, IRQF_SHARED, card->name, card);
-	if (retval) {
-		printk(KERN_ERR "t1pci: unable to get IRQ %d.\n", card->irq);
-		retval = -EBUSY;
-		goto err_unmap;
-	}
-
-	cinfo->capi_ctrl.owner         = THIS_MODULE;
-	cinfo->capi_ctrl.driver_name   = "t1pci";
-	cinfo->capi_ctrl.driverdata    = cinfo;
-	cinfo->capi_ctrl.register_appl = b1dma_register_appl;
-	cinfo->capi_ctrl.release_appl  = b1dma_release_appl;
-	cinfo->capi_ctrl.send_message  = b1dma_send_message;
-	cinfo->capi_ctrl.load_firmware = b1dma_load_firmware;
-	cinfo->capi_ctrl.reset_ctr     = b1dma_reset_ctr;
-	cinfo->capi_ctrl.procinfo      = t1pci_procinfo;
-	cinfo->capi_ctrl.proc_show     = b1dma_proc_show;
-	strcpy(cinfo->capi_ctrl.name, card->name);
-
-	retval = attach_capi_ctr(&cinfo->capi_ctrl);
-	if (retval) {
-		printk(KERN_ERR "t1pci: attach controller failed.\n");
-		retval = -EBUSY;
-		goto err_free_irq;
-	}
-	card->cardnr = cinfo->capi_ctrl.cnr;
-
-	printk(KERN_INFO "t1pci: AVM T1 PCI at i/o %#x, irq %d, mem %#lx\n",
-	       card->port, card->irq, card->membase);
-
-	pci_set_drvdata(pdev, card);
-	return 0;
-
-err_free_irq:
-	free_irq(card->irq, card);
-err_unmap:
-	iounmap(card->mbase);
-err_release_region:
-	release_region(card->port, AVMB1_PORTLEN);
-err_free_dma:
-	avmcard_dma_free(card->dma);
-err_free:
-	b1_free_card(card);
-err:
-	return retval;
-}
-
-/* ------------------------------------------------------------- */
-
-static void t1pci_remove(struct pci_dev *pdev)
-{
-	avmcard *card = pci_get_drvdata(pdev);
-	avmctrl_info *cinfo = card->ctrlinfo;
-
-	b1dma_reset(card);
-
-	detach_capi_ctr(&cinfo->capi_ctrl);
-	free_irq(card->irq, card);
-	iounmap(card->mbase);
-	release_region(card->port, AVMB1_PORTLEN);
-	avmcard_dma_free(card->dma);
-	b1_free_card(card);
-}
-
-/* ------------------------------------------------------------- */
-
-static char *t1pci_procinfo(struct capi_ctr *ctrl)
-{
-	avmctrl_info *cinfo = (avmctrl_info *)(ctrl->driverdata);
-
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d 0x%lx",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->port : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		cinfo->card ? cinfo->card->membase : 0
-		);
-	return cinfo->infobuf;
-}
-
-/* ------------------------------------------------------------- */
-
-static int t1pci_probe(struct pci_dev *dev, const struct pci_device_id *ent)
-{
-	struct capicardparams param;
-	int retval;
-
-	if (pci_enable_device(dev) < 0) {
-		printk(KERN_ERR	"t1pci: failed to enable AVM-T1-PCI\n");
-		return -ENODEV;
-	}
-	pci_set_master(dev);
-
-	param.port = pci_resource_start(dev, 1);
-	param.irq = dev->irq;
-	param.membase = pci_resource_start(dev, 0);
-
-	printk(KERN_INFO "t1pci: PCI BIOS reports AVM-T1-PCI at i/o %#x, irq %d, mem %#x\n",
-	       param.port, param.irq, param.membase);
-
-	retval = t1pci_add_card(&param, dev);
-	if (retval != 0) {
-		printk(KERN_ERR "t1pci: no AVM-T1-PCI at i/o %#x, irq %d detected, mem %#x\n",
-		       param.port, param.irq, param.membase);
-		pci_disable_device(dev);
-		return -ENODEV;
-	}
-	return 0;
-}
-
-static struct pci_driver t1pci_pci_driver = {
-	.name           = "t1pci",
-	.id_table       = t1pci_pci_tbl,
-	.probe          = t1pci_probe,
-	.remove         = t1pci_remove,
-};
-
-static struct capi_driver capi_driver_t1pci = {
-	.name		= "t1pci",
-	.revision	= "1.0",
-};
-
-static int __init t1pci_init(void)
-{
-	char *p;
-	char rev[32];
-	int err;
-
-	if ((p = strchr(revision, ':')) != NULL && p[1]) {
-		strlcpy(rev, p + 2, 32);
-		if ((p = strchr(rev, '$')) != NULL && p > rev)
-			*(p - 1) = 0;
-	} else
-		strcpy(rev, "1.0");
-
-	err = pci_register_driver(&t1pci_pci_driver);
-	if (!err) {
-		strlcpy(capi_driver_t1pci.revision, rev, 32);
-		register_capi_driver(&capi_driver_t1pci);
-		printk(KERN_INFO "t1pci: revision %s\n", rev);
-	}
-	return err;
-}
-
-static void __exit t1pci_exit(void)
-{
-	unregister_capi_driver(&capi_driver_t1pci);
-	pci_unregister_driver(&t1pci_pci_driver);
-}
-
-module_init(t1pci_init);
-module_exit(t1pci_exit);
diff --git a/drivers/staging/isdn/gigaset/Kconfig b/drivers/staging/isdn/gigaset/Kconfig
deleted file mode 100644
index c593105b3600..000000000000
--- a/drivers/staging/isdn/gigaset/Kconfig
+++ /dev/null
@@ -1,62 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-menuconfig ISDN_DRV_GIGASET
-	tristate "Siemens Gigaset support"
-	depends on TTY
-	select CRC_CCITT
-	select BITREVERSE
-	help
-	  This driver supports the Siemens Gigaset SX205/255 family of
-	  ISDN DECT bases, including the predecessors Gigaset 3070/3075
-	  and 4170/4175 and their T-Com versions Sinus 45isdn and Sinus
-	  721X.
-	  If you have one of these devices, say M here and for at least
-	  one of the connection specific parts that follow.
-	  This will build a module called "gigaset".
-	  Note: If you build your ISDN subsystem (ISDN_CAPI or ISDN_I4L)
-	  as a module, you have to build this driver as a module too,
-	  otherwise the Gigaset device won't show up as an ISDN device.
-
-if ISDN_DRV_GIGASET
-
-config GIGASET_CAPI
-	bool "Gigaset CAPI support"
-	depends on ISDN_CAPI='y'||(ISDN_CAPI='m'&&ISDN_DRV_GIGASET='m')
-	default 'y'
-	help
-	  Build the Gigaset driver as a CAPI 2.0 driver interfacing with
-	  the Kernel CAPI subsystem. To use it with the old ISDN4Linux
-	  subsystem you'll have to enable the capidrv glue driver.
-	  (select ISDN_CAPI_CAPIDRV.)
-	  Say N to build the old native ISDN4Linux variant.
-	  If unsure, say Y.
-
-config GIGASET_BASE
-	tristate "Gigaset base station support"
-	depends on USB
-	help
-	  Say M here if you want to use the USB interface of the Gigaset
-	  base for connection to your system.
-	  This will build a module called "bas_gigaset".
-
-config GIGASET_M105
-	tristate "Gigaset M105 support"
-	depends on USB
-	help
-	  Say M here if you want to connect to the Gigaset base via DECT
-	  using a Gigaset M105 (Sinus 45 Data 2) USB DECT device.
-	  This will build a module called "usb_gigaset".
-
-config GIGASET_M101
-	tristate "Gigaset M101 support"
-	help
-	  Say M here if you want to connect to the Gigaset base via DECT
-	  using a Gigaset M101 (Sinus 45 Data 1) RS232 DECT device.
-	  This will build a module called "ser_gigaset".
-
-config GIGASET_DEBUG
-	bool "Gigaset debugging"
-	help
-	  This enables debugging code in the Gigaset drivers.
-	  If in doubt, say yes.
-
-endif # ISDN_DRV_GIGASET
diff --git a/drivers/staging/isdn/gigaset/Makefile b/drivers/staging/isdn/gigaset/Makefile
deleted file mode 100644
index 9c010891dcd7..000000000000
--- a/drivers/staging/isdn/gigaset/Makefile
+++ /dev/null
@@ -1,17 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-gigaset-y := common.o interface.o proc.o ev-layer.o asyncdata.o
-
-ifdef CONFIG_GIGASET_CAPI
-gigaset-y += capi.o
-else
-gigaset-y += dummyll.o
-endif
-
-usb_gigaset-y := usb-gigaset.o
-ser_gigaset-y := ser-gigaset.o
-bas_gigaset-y := bas-gigaset.o isocdata.o
-
-obj-$(CONFIG_ISDN_DRV_GIGASET) += gigaset.o
-obj-$(CONFIG_GIGASET_M105) += usb_gigaset.o
-obj-$(CONFIG_GIGASET_BASE) += bas_gigaset.o
-obj-$(CONFIG_GIGASET_M101) += ser_gigaset.o
diff --git a/drivers/staging/isdn/gigaset/asyncdata.c b/drivers/staging/isdn/gigaset/asyncdata.c
deleted file mode 100644
index a34b3c9d8a71..000000000000
--- a/drivers/staging/isdn/gigaset/asyncdata.c
+++ /dev/null
@@ -1,606 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Common data handling layer for ser_gigaset and usb_gigaset
- *
- * Copyright (c) 2005 by Tilman Schmidt <tilman@imap.cc>,
- *                       Hansjoerg Lipp <hjlipp@web.de>,
- *                       Stefan Eilers.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/crc-ccitt.h>
-#include <linux/bitrev.h>
-#include <linux/export.h>
-
-/* check if byte must be stuffed/escaped
- * I'm not sure which data should be encoded.
- * Therefore I will go the hard way and encode every value
- * less than 0x20, the flag sequence and the control escape char.
- */
-static inline int muststuff(unsigned char c)
-{
-	if (c < PPP_TRANS) return 1;
-	if (c == PPP_FLAG) return 1;
-	if (c == PPP_ESCAPE) return 1;
-	/* other possible candidates: */
-	/* 0x91: XON with parity set */
-	/* 0x93: XOFF with parity set */
-	return 0;
-}
-
-/* == data input =========================================================== */
-
-/* process a block of received bytes in command mode
- * (mstate != MS_LOCKED && (inputstate & INS_command))
- * Append received bytes to the command response buffer and forward them
- * line by line to the response handler. Exit whenever a mode/state change
- * might have occurred.
- * Note: Received lines may be terminated by CR, LF, or CR LF, which will be
- * removed before passing the line to the response handler.
- * Return value:
- *	number of processed bytes
- */
-static unsigned cmd_loop(unsigned numbytes, struct inbuf_t *inbuf)
-{
-	unsigned char *src = inbuf->data + inbuf->head;
-	struct cardstate *cs = inbuf->cs;
-	unsigned cbytes = cs->cbytes;
-	unsigned procbytes = 0;
-	unsigned char c;
-
-	while (procbytes < numbytes) {
-		c = *src++;
-		procbytes++;
-
-		switch (c) {
-		case '\n':
-			if (cbytes == 0 && cs->respdata[0] == '\r') {
-				/* collapse LF with preceding CR */
-				cs->respdata[0] = 0;
-				break;
-			}
-			/* fall through */
-		case '\r':
-			/* end of message line, pass to response handler */
-			if (cbytes >= MAX_RESP_SIZE) {
-				dev_warn(cs->dev, "response too large (%d)\n",
-					 cbytes);
-				cbytes = MAX_RESP_SIZE;
-			}
-			cs->cbytes = cbytes;
-			gigaset_dbg_buffer(DEBUG_TRANSCMD, "received response",
-					   cbytes, cs->respdata);
-			gigaset_handle_modem_response(cs);
-			cbytes = 0;
-
-			/* store EOL byte for CRLF collapsing */
-			cs->respdata[0] = c;
-
-			/* cs->dle may have changed */
-			if (cs->dle && !(inbuf->inputstate & INS_DLE_command))
-				inbuf->inputstate &= ~INS_command;
-
-			/* return for reevaluating state */
-			goto exit;
-
-		case DLE_FLAG:
-			if (inbuf->inputstate & INS_DLE_char) {
-				/* quoted DLE: clear quote flag */
-				inbuf->inputstate &= ~INS_DLE_char;
-			} else if (cs->dle ||
-				   (inbuf->inputstate & INS_DLE_command)) {
-				/* DLE escape, pass up for handling */
-				inbuf->inputstate |= INS_DLE_char;
-				goto exit;
-			}
-			/* quoted or not in DLE mode: treat as regular data */
-			/* fall through */
-		default:
-			/* append to line buffer if possible */
-			if (cbytes < MAX_RESP_SIZE)
-				cs->respdata[cbytes] = c;
-			cbytes++;
-		}
-	}
-exit:
-	cs->cbytes = cbytes;
-	return procbytes;
-}
-
-/* process a block of received bytes in lock mode
- * All received bytes are passed unmodified to the tty i/f.
- * Return value:
- *	number of processed bytes
- */
-static unsigned lock_loop(unsigned numbytes, struct inbuf_t *inbuf)
-{
-	unsigned char *src = inbuf->data + inbuf->head;
-
-	gigaset_dbg_buffer(DEBUG_LOCKCMD, "received response", numbytes, src);
-	gigaset_if_receive(inbuf->cs, src, numbytes);
-	return numbytes;
-}
-
-/* process a block of received bytes in HDLC data mode
- * (mstate != MS_LOCKED && !(inputstate & INS_command) && proto2 == L2_HDLC)
- * Collect HDLC frames, undoing byte stuffing and watching for DLE escapes.
- * When a frame is complete, check the FCS and pass valid frames to the LL.
- * If DLE is encountered, return immediately to let the caller handle it.
- * Return value:
- *	number of processed bytes
- */
-static unsigned hdlc_loop(unsigned numbytes, struct inbuf_t *inbuf)
-{
-	struct cardstate *cs = inbuf->cs;
-	struct bc_state *bcs = cs->bcs;
-	int inputstate = bcs->inputstate;
-	__u16 fcs = bcs->rx_fcs;
-	struct sk_buff *skb = bcs->rx_skb;
-	unsigned char *src = inbuf->data + inbuf->head;
-	unsigned procbytes = 0;
-	unsigned char c;
-
-	if (inputstate & INS_byte_stuff) {
-		if (!numbytes)
-			return 0;
-		inputstate &= ~INS_byte_stuff;
-		goto byte_stuff;
-	}
-
-	while (procbytes < numbytes) {
-		c = *src++;
-		procbytes++;
-		if (c == DLE_FLAG) {
-			if (inputstate & INS_DLE_char) {
-				/* quoted DLE: clear quote flag */
-				inputstate &= ~INS_DLE_char;
-			} else if (cs->dle || (inputstate & INS_DLE_command)) {
-				/* DLE escape, pass up for handling */
-				inputstate |= INS_DLE_char;
-				break;
-			}
-		}
-
-		if (c == PPP_ESCAPE) {
-			/* byte stuffing indicator: pull in next byte */
-			if (procbytes >= numbytes) {
-				/* end of buffer, save for later processing */
-				inputstate |= INS_byte_stuff;
-				break;
-			}
-byte_stuff:
-			c = *src++;
-			procbytes++;
-			if (c == DLE_FLAG) {
-				if (inputstate & INS_DLE_char) {
-					/* quoted DLE: clear quote flag */
-					inputstate &= ~INS_DLE_char;
-				} else if (cs->dle ||
-					   (inputstate & INS_DLE_command)) {
-					/* DLE escape, pass up for handling */
-					inputstate |=
-						INS_DLE_char | INS_byte_stuff;
-					break;
-				}
-			}
-			c ^= PPP_TRANS;
-#ifdef CONFIG_GIGASET_DEBUG
-			if (!muststuff(c))
-				gig_dbg(DEBUG_HDLC, "byte stuffed: 0x%02x", c);
-#endif
-		} else if (c == PPP_FLAG) {
-			/* end of frame: process content if any */
-			if (inputstate & INS_have_data) {
-				gig_dbg(DEBUG_HDLC,
-					"7e----------------------------");
-
-				/* check and pass received frame */
-				if (!skb) {
-					/* skipped frame */
-					gigaset_isdn_rcv_err(bcs);
-				} else if (skb->len < 2) {
-					/* frame too short for FCS */
-					dev_warn(cs->dev,
-						 "short frame (%d)\n",
-						 skb->len);
-					gigaset_isdn_rcv_err(bcs);
-					dev_kfree_skb_any(skb);
-				} else if (fcs != PPP_GOODFCS) {
-					/* frame check error */
-					dev_err(cs->dev,
-						"Checksum failed, %u bytes corrupted!\n",
-						skb->len);
-					gigaset_isdn_rcv_err(bcs);
-					dev_kfree_skb_any(skb);
-				} else {
-					/* good frame */
-					__skb_trim(skb, skb->len - 2);
-					gigaset_skb_rcvd(bcs, skb);
-				}
-
-				/* prepare reception of next frame */
-				inputstate &= ~INS_have_data;
-				skb = gigaset_new_rx_skb(bcs);
-			} else {
-				/* empty frame (7E 7E) */
-#ifdef CONFIG_GIGASET_DEBUG
-				++bcs->emptycount;
-#endif
-				if (!skb) {
-					/* skipped (?) */
-					gigaset_isdn_rcv_err(bcs);
-					skb = gigaset_new_rx_skb(bcs);
-				}
-			}
-
-			fcs = PPP_INITFCS;
-			continue;
-#ifdef CONFIG_GIGASET_DEBUG
-		} else if (muststuff(c)) {
-			/* Should not happen. Possible after ZDLE=1<CR><LF>. */
-			gig_dbg(DEBUG_HDLC, "not byte stuffed: 0x%02x", c);
-#endif
-		}
-
-		/* regular data byte, append to skb */
-#ifdef CONFIG_GIGASET_DEBUG
-		if (!(inputstate & INS_have_data)) {
-			gig_dbg(DEBUG_HDLC, "7e (%d x) ================",
-				bcs->emptycount);
-			bcs->emptycount = 0;
-		}
-#endif
-		inputstate |= INS_have_data;
-		if (skb) {
-			if (skb->len >= bcs->rx_bufsize) {
-				dev_warn(cs->dev, "received packet too long\n");
-				dev_kfree_skb_any(skb);
-				/* skip remainder of packet */
-				bcs->rx_skb = skb = NULL;
-			} else {
-				__skb_put_u8(skb, c);
-				fcs = crc_ccitt_byte(fcs, c);
-			}
-		}
-	}
-
-	bcs->inputstate = inputstate;
-	bcs->rx_fcs = fcs;
-	return procbytes;
-}
-
-/* process a block of received bytes in transparent data mode
- * (mstate != MS_LOCKED && !(inputstate & INS_command) && proto2 != L2_HDLC)
- * Invert bytes, undoing byte stuffing and watching for DLE escapes.
- * If DLE is encountered, return immediately to let the caller handle it.
- * Return value:
- *	number of processed bytes
- */
-static unsigned iraw_loop(unsigned numbytes, struct inbuf_t *inbuf)
-{
-	struct cardstate *cs = inbuf->cs;
-	struct bc_state *bcs = cs->bcs;
-	int inputstate = bcs->inputstate;
-	struct sk_buff *skb = bcs->rx_skb;
-	unsigned char *src = inbuf->data + inbuf->head;
-	unsigned procbytes = 0;
-	unsigned char c;
-
-	if (!skb) {
-		/* skip this block */
-		gigaset_new_rx_skb(bcs);
-		return numbytes;
-	}
-
-	while (procbytes < numbytes && skb->len < bcs->rx_bufsize) {
-		c = *src++;
-		procbytes++;
-
-		if (c == DLE_FLAG) {
-			if (inputstate & INS_DLE_char) {
-				/* quoted DLE: clear quote flag */
-				inputstate &= ~INS_DLE_char;
-			} else if (cs->dle || (inputstate & INS_DLE_command)) {
-				/* DLE escape, pass up for handling */
-				inputstate |= INS_DLE_char;
-				break;
-			}
-		}
-
-		/* regular data byte: append to current skb */
-		inputstate |= INS_have_data;
-		__skb_put_u8(skb, bitrev8(c));
-	}
-
-	/* pass data up */
-	if (inputstate & INS_have_data) {
-		gigaset_skb_rcvd(bcs, skb);
-		inputstate &= ~INS_have_data;
-		gigaset_new_rx_skb(bcs);
-	}
-
-	bcs->inputstate = inputstate;
-	return procbytes;
-}
-
-/* process DLE escapes
- * Called whenever a DLE sequence might be encountered in the input stream.
- * Either processes the entire DLE sequence or, if that isn't possible,
- * notes the fact that an initial DLE has been received in the INS_DLE_char
- * inputstate flag and resumes processing of the sequence on the next call.
- */
-static void handle_dle(struct inbuf_t *inbuf)
-{
-	struct cardstate *cs = inbuf->cs;
-
-	if (cs->mstate == MS_LOCKED)
-		return;		/* no DLE processing in lock mode */
-
-	if (!(inbuf->inputstate & INS_DLE_char)) {
-		/* no DLE pending */
-		if (inbuf->data[inbuf->head] == DLE_FLAG &&
-		    (cs->dle || inbuf->inputstate & INS_DLE_command)) {
-			/* start of DLE sequence */
-			inbuf->head++;
-			if (inbuf->head == inbuf->tail ||
-			    inbuf->head == RBUFSIZE) {
-				/* end of buffer, save for later processing */
-				inbuf->inputstate |= INS_DLE_char;
-				return;
-			}
-		} else {
-			/* regular data byte */
-			return;
-		}
-	}
-
-	/* consume pending DLE */
-	inbuf->inputstate &= ~INS_DLE_char;
-
-	switch (inbuf->data[inbuf->head]) {
-	case 'X':	/* begin of event message */
-		if (inbuf->inputstate & INS_command)
-			dev_notice(cs->dev,
-				   "received <DLE>X in command mode\n");
-		inbuf->inputstate |= INS_command | INS_DLE_command;
-		inbuf->head++;	/* byte consumed */
-		break;
-	case '.':	/* end of event message */
-		if (!(inbuf->inputstate & INS_DLE_command))
-			dev_notice(cs->dev,
-				   "received <DLE>. without <DLE>X\n");
-		inbuf->inputstate &= ~INS_DLE_command;
-		/* return to data mode if in DLE mode */
-		if (cs->dle)
-			inbuf->inputstate &= ~INS_command;
-		inbuf->head++;	/* byte consumed */
-		break;
-	case DLE_FLAG:	/* DLE in data stream */
-		/* mark as quoted */
-		inbuf->inputstate |= INS_DLE_char;
-		if (!(cs->dle || inbuf->inputstate & INS_DLE_command))
-			dev_notice(cs->dev,
-				   "received <DLE><DLE> not in DLE mode\n");
-		break;	/* quoted byte left in buffer */
-	default:
-		dev_notice(cs->dev, "received <DLE><%02x>\n",
-			   inbuf->data[inbuf->head]);
-		/* quoted byte left in buffer */
-	}
-}
-
-/**
- * gigaset_m10x_input() - process a block of data received from the device
- * @inbuf:	received data and device descriptor structure.
- *
- * Called by hardware module {ser,usb}_gigaset with a block of received
- * bytes. Separates the bytes received over the serial data channel into
- * user data and command replies (locked/unlocked) according to the
- * current state of the interface.
- */
-void gigaset_m10x_input(struct inbuf_t *inbuf)
-{
-	struct cardstate *cs = inbuf->cs;
-	unsigned numbytes, procbytes;
-
-	gig_dbg(DEBUG_INTR, "buffer state: %u -> %u", inbuf->head, inbuf->tail);
-
-	while (inbuf->head != inbuf->tail) {
-		/* check for DLE escape */
-		handle_dle(inbuf);
-
-		/* process a contiguous block of bytes */
-		numbytes = (inbuf->head > inbuf->tail ?
-			    RBUFSIZE : inbuf->tail) - inbuf->head;
-		gig_dbg(DEBUG_INTR, "processing %u bytes", numbytes);
-		/*
-		 * numbytes may be 0 if handle_dle() ate the last byte.
-		 * This does no harm, *_loop() will just return 0 immediately.
-		 */
-
-		if (cs->mstate == MS_LOCKED)
-			procbytes = lock_loop(numbytes, inbuf);
-		else if (inbuf->inputstate & INS_command)
-			procbytes = cmd_loop(numbytes, inbuf);
-		else if (cs->bcs->proto2 == L2_HDLC)
-			procbytes = hdlc_loop(numbytes, inbuf);
-		else
-			procbytes = iraw_loop(numbytes, inbuf);
-		inbuf->head += procbytes;
-
-		/* check for buffer wraparound */
-		if (inbuf->head >= RBUFSIZE)
-			inbuf->head = 0;
-
-		gig_dbg(DEBUG_INTR, "head set to %u", inbuf->head);
-	}
-}
-EXPORT_SYMBOL_GPL(gigaset_m10x_input);
-
-
-/* == data output ========================================================== */
-
-/*
- * Encode a data packet into an octet stuffed HDLC frame with FCS,
- * opening and closing flags, preserving headroom data.
- * parameters:
- *	skb		skb containing original packet (freed upon return)
- * Return value:
- *	pointer to newly allocated skb containing the result frame
- *	and the original link layer header, NULL on error
- */
-static struct sk_buff *HDLC_Encode(struct sk_buff *skb)
-{
-	struct sk_buff *hdlc_skb;
-	__u16 fcs;
-	unsigned char c;
-	unsigned char *cp;
-	int len;
-	unsigned int stuf_cnt;
-
-	stuf_cnt = 0;
-	fcs = PPP_INITFCS;
-	cp = skb->data;
-	len = skb->len;
-	while (len--) {
-		if (muststuff(*cp))
-			stuf_cnt++;
-		fcs = crc_ccitt_byte(fcs, *cp++);
-	}
-	fcs ^= 0xffff;			/* complement */
-
-	/* size of new buffer: original size + number of stuffing bytes
-	 * + 2 bytes FCS + 2 stuffing bytes for FCS (if needed) + 2 flag bytes
-	 * + room for link layer header
-	 */
-	hdlc_skb = dev_alloc_skb(skb->len + stuf_cnt + 6 + skb->mac_len);
-	if (!hdlc_skb) {
-		dev_kfree_skb_any(skb);
-		return NULL;
-	}
-
-	/* Copy link layer header into new skb */
-	skb_reset_mac_header(hdlc_skb);
-	skb_reserve(hdlc_skb, skb->mac_len);
-	memcpy(skb_mac_header(hdlc_skb), skb_mac_header(skb), skb->mac_len);
-	hdlc_skb->mac_len = skb->mac_len;
-
-	/* Add flag sequence in front of everything.. */
-	skb_put_u8(hdlc_skb, PPP_FLAG);
-
-	/* Perform byte stuffing while copying data. */
-	while (skb->len--) {
-		if (muststuff(*skb->data)) {
-			skb_put_u8(hdlc_skb, PPP_ESCAPE);
-			skb_put_u8(hdlc_skb, (*skb->data++) ^ PPP_TRANS);
-		} else
-			skb_put_u8(hdlc_skb, *skb->data++);
-	}
-
-	/* Finally add FCS (byte stuffed) and flag sequence */
-	c = (fcs & 0x00ff);	/* least significant byte first */
-	if (muststuff(c)) {
-		skb_put_u8(hdlc_skb, PPP_ESCAPE);
-		c ^= PPP_TRANS;
-	}
-	skb_put_u8(hdlc_skb, c);
-
-	c = ((fcs >> 8) & 0x00ff);
-	if (muststuff(c)) {
-		skb_put_u8(hdlc_skb, PPP_ESCAPE);
-		c ^= PPP_TRANS;
-	}
-	skb_put_u8(hdlc_skb, c);
-
-	skb_put_u8(hdlc_skb, PPP_FLAG);
-
-	dev_kfree_skb_any(skb);
-	return hdlc_skb;
-}
-
-/*
- * Encode a data packet into an octet stuffed raw bit inverted frame,
- * preserving headroom data.
- * parameters:
- *	skb		skb containing original packet (freed upon return)
- * Return value:
- *	pointer to newly allocated skb containing the result frame
- *	and the original link layer header, NULL on error
- */
-static struct sk_buff *iraw_encode(struct sk_buff *skb)
-{
-	struct sk_buff *iraw_skb;
-	unsigned char c;
-	unsigned char *cp;
-	int len;
-
-	/* size of new buffer (worst case = every byte must be stuffed):
-	 * 2 * original size + room for link layer header
-	 */
-	iraw_skb = dev_alloc_skb(2 * skb->len + skb->mac_len);
-	if (!iraw_skb) {
-		dev_kfree_skb_any(skb);
-		return NULL;
-	}
-
-	/* copy link layer header into new skb */
-	skb_reset_mac_header(iraw_skb);
-	skb_reserve(iraw_skb, skb->mac_len);
-	memcpy(skb_mac_header(iraw_skb), skb_mac_header(skb), skb->mac_len);
-	iraw_skb->mac_len = skb->mac_len;
-
-	/* copy and stuff data */
-	cp = skb->data;
-	len = skb->len;
-	while (len--) {
-		c = bitrev8(*cp++);
-		if (c == DLE_FLAG)
-			skb_put_u8(iraw_skb, c);
-		skb_put_u8(iraw_skb, c);
-	}
-	dev_kfree_skb_any(skb);
-	return iraw_skb;
-}
-
-/**
- * gigaset_m10x_send_skb() - queue an skb for sending
- * @bcs:	B channel descriptor structure.
- * @skb:	data to send.
- *
- * Called by LL to encode and queue an skb for sending, and start
- * transmission if necessary.
- * Once the payload data has been transmitted completely, gigaset_skb_sent()
- * will be called with the skb's link layer header preserved.
- *
- * Return value:
- *	number of bytes accepted for sending (skb->len) if ok,
- *	error code < 0 (eg. -ENOMEM) on error
- */
-int gigaset_m10x_send_skb(struct bc_state *bcs, struct sk_buff *skb)
-{
-	struct cardstate *cs = bcs->cs;
-	unsigned len = skb->len;
-	unsigned long flags;
-
-	if (bcs->proto2 == L2_HDLC)
-		skb = HDLC_Encode(skb);
-	else
-		skb = iraw_encode(skb);
-	if (!skb) {
-		dev_err(cs->dev,
-			"unable to allocate memory for encoding!\n");
-		return -ENOMEM;
-	}
-
-	skb_queue_tail(&bcs->squeue, skb);
-	spin_lock_irqsave(&cs->lock, flags);
-	if (cs->connected)
-		tasklet_schedule(&cs->write_tasklet);
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	return len;	/* ok so far */
-}
-EXPORT_SYMBOL_GPL(gigaset_m10x_send_skb);
diff --git a/drivers/staging/isdn/gigaset/bas-gigaset.c b/drivers/staging/isdn/gigaset/bas-gigaset.c
deleted file mode 100644
index c334525a5f63..000000000000
--- a/drivers/staging/isdn/gigaset/bas-gigaset.c
+++ /dev/null
@@ -1,2672 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * USB driver for Gigaset 307x base via direct USB connection.
- *
- * Copyright (c) 2001 by Hansjoerg Lipp <hjlipp@web.de>,
- *                       Tilman Schmidt <tilman@imap.cc>,
- *                       Stefan Eilers.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/usb.h>
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-
-/* Version Information */
-#define DRIVER_AUTHOR "Tilman Schmidt <tilman@imap.cc>, Hansjoerg Lipp <hjlipp@web.de>, Stefan Eilers"
-#define DRIVER_DESC "USB Driver for Gigaset 307x"
-
-
-/* Module parameters */
-
-static int startmode = SM_ISDN;
-static int cidmode = 1;
-
-module_param(startmode, int, S_IRUGO);
-module_param(cidmode, int, S_IRUGO);
-MODULE_PARM_DESC(startmode, "start in isdn4linux mode");
-MODULE_PARM_DESC(cidmode, "Call-ID mode");
-
-#define GIGASET_MINORS     1
-#define GIGASET_MINOR      16
-#define GIGASET_MODULENAME "bas_gigaset"
-#define GIGASET_DEVNAME    "ttyGB"
-
-/* length limit according to Siemens 3070usb-protokoll.doc ch. 2.1 */
-#define IF_WRITEBUF 264
-
-/* interrupt pipe message size according to ibid. ch. 2.2 */
-#define IP_MSGSIZE 3
-
-/* Values for the Gigaset 307x */
-#define USB_GIGA_VENDOR_ID      0x0681
-#define USB_3070_PRODUCT_ID     0x0001
-#define USB_3075_PRODUCT_ID     0x0002
-#define USB_SX303_PRODUCT_ID    0x0021
-#define USB_SX353_PRODUCT_ID    0x0022
-
-/* table of devices that work with this driver */
-static const struct usb_device_id gigaset_table[] = {
-	{ USB_DEVICE(USB_GIGA_VENDOR_ID, USB_3070_PRODUCT_ID) },
-	{ USB_DEVICE(USB_GIGA_VENDOR_ID, USB_3075_PRODUCT_ID) },
-	{ USB_DEVICE(USB_GIGA_VENDOR_ID, USB_SX303_PRODUCT_ID) },
-	{ USB_DEVICE(USB_GIGA_VENDOR_ID, USB_SX353_PRODUCT_ID) },
-	{ } /* Terminating entry */
-};
-
-MODULE_DEVICE_TABLE(usb, gigaset_table);
-
-/*======================= local function prototypes ==========================*/
-
-/* function called if a new device belonging to this driver is connected */
-static int gigaset_probe(struct usb_interface *interface,
-			 const struct usb_device_id *id);
-
-/* Function will be called if the device is unplugged */
-static void gigaset_disconnect(struct usb_interface *interface);
-
-/* functions called before/after suspend */
-static int gigaset_suspend(struct usb_interface *intf, pm_message_t message);
-static int gigaset_resume(struct usb_interface *intf);
-
-/* functions called before/after device reset */
-static int gigaset_pre_reset(struct usb_interface *intf);
-static int gigaset_post_reset(struct usb_interface *intf);
-
-static int atread_submit(struct cardstate *, int);
-static void stopurbs(struct bas_bc_state *);
-static int req_submit(struct bc_state *, int, int, int);
-static int atwrite_submit(struct cardstate *, unsigned char *, int);
-static int start_cbsend(struct cardstate *);
-
-/*============================================================================*/
-
-struct bas_cardstate {
-	struct usb_device	*udev;		/* USB device pointer */
-	struct cardstate	*cs;
-	struct usb_interface	*interface;	/* interface for this device */
-	unsigned char		minor;		/* starting minor number */
-
-	struct urb		*urb_ctrl;	/* control pipe default URB */
-	struct usb_ctrlrequest	dr_ctrl;
-	struct timer_list	timer_ctrl;	/* control request timeout */
-	int			retry_ctrl;
-
-	struct timer_list	timer_atrdy;	/* AT command ready timeout */
-	struct urb		*urb_cmd_out;	/* for sending AT commands */
-	struct usb_ctrlrequest	dr_cmd_out;
-	int			retry_cmd_out;
-
-	struct urb		*urb_cmd_in;	/* for receiving AT replies */
-	struct usb_ctrlrequest	dr_cmd_in;
-	struct timer_list	timer_cmd_in;	/* receive request timeout */
-	unsigned char		*rcvbuf;	/* AT reply receive buffer */
-
-	struct urb		*urb_int_in;	/* URB for interrupt pipe */
-	unsigned char		*int_in_buf;
-	struct work_struct	int_in_wq;	/* for usb_clear_halt() */
-	struct timer_list	timer_int_in;	/* int read retry delay */
-	int			retry_int_in;
-
-	spinlock_t		lock;		/* locks all following */
-	int			basstate;	/* bitmap (BS_*) */
-	int			pending;	/* uncompleted base request */
-	wait_queue_head_t	waitqueue;
-	int			rcvbuf_size;	/* size of AT receive buffer */
-						/* 0: no receive in progress */
-	int			retry_cmd_in;	/* receive req retry count */
-};
-
-/* status of direct USB connection to 307x base (bits in basstate) */
-#define BS_ATOPEN	0x001	/* AT channel open */
-#define BS_B1OPEN	0x002	/* B channel 1 open */
-#define BS_B2OPEN	0x004	/* B channel 2 open */
-#define BS_ATREADY	0x008	/* base ready for AT command */
-#define BS_INIT		0x010	/* base has signalled INIT_OK */
-#define BS_ATTIMER	0x020	/* waiting for HD_READY_SEND_ATDATA */
-#define BS_ATRDPEND	0x040	/* urb_cmd_in in use */
-#define BS_ATWRPEND	0x080	/* urb_cmd_out in use */
-#define BS_SUSPEND	0x100	/* USB port suspended */
-#define BS_RESETTING	0x200	/* waiting for HD_RESET_INTERRUPT_PIPE_ACK */
-
-
-static struct gigaset_driver *driver;
-
-/* usb specific object needed to register this driver with the usb subsystem */
-static struct usb_driver gigaset_usb_driver = {
-	.name =         GIGASET_MODULENAME,
-	.probe =        gigaset_probe,
-	.disconnect =   gigaset_disconnect,
-	.id_table =     gigaset_table,
-	.suspend =	gigaset_suspend,
-	.resume =	gigaset_resume,
-	.reset_resume =	gigaset_post_reset,
-	.pre_reset =	gigaset_pre_reset,
-	.post_reset =	gigaset_post_reset,
-	.disable_hub_initiated_lpm = 1,
-};
-
-/* get message text for usb_submit_urb return code
- */
-static char *get_usb_rcmsg(int rc)
-{
-	static char unkmsg[28];
-
-	switch (rc) {
-	case 0:
-		return "success";
-	case -ENOMEM:
-		return "out of memory";
-	case -ENODEV:
-		return "device not present";
-	case -ENOENT:
-		return "endpoint not present";
-	case -ENXIO:
-		return "URB type not supported";
-	case -EINVAL:
-		return "invalid argument";
-	case -EAGAIN:
-		return "start frame too early or too much scheduled";
-	case -EFBIG:
-		return "too many isoc frames requested";
-	case -EPIPE:
-		return "endpoint stalled";
-	case -EMSGSIZE:
-		return "invalid packet size";
-	case -ENOSPC:
-		return "would overcommit USB bandwidth";
-	case -ESHUTDOWN:
-		return "device shut down";
-	case -EPERM:
-		return "reject flag set";
-	case -EHOSTUNREACH:
-		return "device suspended";
-	default:
-		snprintf(unkmsg, sizeof(unkmsg), "unknown error %d", rc);
-		return unkmsg;
-	}
-}
-
-/* get message text for USB status code
- */
-static char *get_usb_statmsg(int status)
-{
-	static char unkmsg[28];
-
-	switch (status) {
-	case 0:
-		return "success";
-	case -ENOENT:
-		return "unlinked (sync)";
-	case -EINPROGRESS:
-		return "URB still pending";
-	case -EPROTO:
-		return "bitstuff error, timeout, or unknown USB error";
-	case -EILSEQ:
-		return "CRC mismatch, timeout, or unknown USB error";
-	case -ETIME:
-		return "USB response timeout";
-	case -EPIPE:
-		return "endpoint stalled";
-	case -ECOMM:
-		return "IN buffer overrun";
-	case -ENOSR:
-		return "OUT buffer underrun";
-	case -EOVERFLOW:
-		return "endpoint babble";
-	case -EREMOTEIO:
-		return "short packet";
-	case -ENODEV:
-		return "device removed";
-	case -EXDEV:
-		return "partial isoc transfer";
-	case -EINVAL:
-		return "ISO madness";
-	case -ECONNRESET:
-		return "unlinked (async)";
-	case -ESHUTDOWN:
-		return "device shut down";
-	default:
-		snprintf(unkmsg, sizeof(unkmsg), "unknown status %d", status);
-		return unkmsg;
-	}
-}
-
-/* usb_pipetype_str
- * retrieve string representation of USB pipe type
- */
-static inline char *usb_pipetype_str(int pipe)
-{
-	if (usb_pipeisoc(pipe))
-		return "Isoc";
-	if (usb_pipeint(pipe))
-		return "Int";
-	if (usb_pipecontrol(pipe))
-		return "Ctrl";
-	if (usb_pipebulk(pipe))
-		return "Bulk";
-	return "?";
-}
-
-/* dump_urb
- * write content of URB to syslog for debugging
- */
-static inline void dump_urb(enum debuglevel level, const char *tag,
-			    struct urb *urb)
-{
-#ifdef CONFIG_GIGASET_DEBUG
-	int i;
-	gig_dbg(level, "%s urb(0x%08lx)->{", tag, (unsigned long) urb);
-	if (urb) {
-		gig_dbg(level,
-			"  dev=0x%08lx, pipe=%s:EP%d/DV%d:%s, "
-			"hcpriv=0x%08lx, transfer_flags=0x%x,",
-			(unsigned long) urb->dev,
-			usb_pipetype_str(urb->pipe),
-			usb_pipeendpoint(urb->pipe), usb_pipedevice(urb->pipe),
-			usb_pipein(urb->pipe) ? "in" : "out",
-			(unsigned long) urb->hcpriv,
-			urb->transfer_flags);
-		gig_dbg(level,
-			"  transfer_buffer=0x%08lx[%d], actual_length=%d, "
-			"setup_packet=0x%08lx,",
-			(unsigned long) urb->transfer_buffer,
-			urb->transfer_buffer_length, urb->actual_length,
-			(unsigned long) urb->setup_packet);
-		gig_dbg(level,
-			"  start_frame=%d, number_of_packets=%d, interval=%d, "
-			"error_count=%d,",
-			urb->start_frame, urb->number_of_packets, urb->interval,
-			urb->error_count);
-		gig_dbg(level,
-			"  context=0x%08lx, complete=0x%08lx, "
-			"iso_frame_desc[]={",
-			(unsigned long) urb->context,
-			(unsigned long) urb->complete);
-		for (i = 0; i < urb->number_of_packets; i++) {
-			struct usb_iso_packet_descriptor *pifd
-				= &urb->iso_frame_desc[i];
-			gig_dbg(level,
-				"    {offset=%u, length=%u, actual_length=%u, "
-				"status=%u}",
-				pifd->offset, pifd->length, pifd->actual_length,
-				pifd->status);
-		}
-	}
-	gig_dbg(level, "}}");
-#endif
-}
-
-/* read/set modem control bits etc. (m10x only) */
-static int gigaset_set_modem_ctrl(struct cardstate *cs, unsigned old_state,
-				  unsigned new_state)
-{
-	return -EINVAL;
-}
-
-static int gigaset_baud_rate(struct cardstate *cs, unsigned cflag)
-{
-	return -EINVAL;
-}
-
-static int gigaset_set_line_ctrl(struct cardstate *cs, unsigned cflag)
-{
-	return -EINVAL;
-}
-
-/* set/clear bits in base connection state, return previous state
- */
-static inline int update_basstate(struct bas_cardstate *ucs,
-				  int set, int clear)
-{
-	unsigned long flags;
-	int state;
-
-	spin_lock_irqsave(&ucs->lock, flags);
-	state = ucs->basstate;
-	ucs->basstate = (state & ~clear) | set;
-	spin_unlock_irqrestore(&ucs->lock, flags);
-	return state;
-}
-
-/* error_hangup
- * hang up any existing connection because of an unrecoverable error
- * This function may be called from any context and takes care of scheduling
- * the necessary actions for execution outside of interrupt context.
- * cs->lock must not be held.
- * argument:
- *	B channel control structure
- */
-static inline void error_hangup(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-
-	gigaset_add_event(cs, &bcs->at_state, EV_HUP, NULL, 0, NULL);
-	gigaset_schedule_event(cs);
-}
-
-/* error_reset
- * reset Gigaset device because of an unrecoverable error
- * This function may be called from any context, and takes care of
- * scheduling the necessary actions for execution outside of interrupt context.
- * cs->hw.bas->lock must not be held.
- * argument:
- *	controller state structure
- */
-static inline void error_reset(struct cardstate *cs)
-{
-	/* reset interrupt pipe to recover (ignore errors) */
-	update_basstate(cs->hw.bas, BS_RESETTING, 0);
-	if (req_submit(cs->bcs, HD_RESET_INTERRUPT_PIPE, 0, BAS_TIMEOUT))
-		/* submission failed, escalate to USB port reset */
-		usb_queue_reset_device(cs->hw.bas->interface);
-}
-
-/* check_pending
- * check for completion of pending control request
- * parameter:
- *	ucs	hardware specific controller state structure
- */
-static void check_pending(struct bas_cardstate *ucs)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&ucs->lock, flags);
-	switch (ucs->pending) {
-	case 0:
-		break;
-	case HD_OPEN_ATCHANNEL:
-		if (ucs->basstate & BS_ATOPEN)
-			ucs->pending = 0;
-		break;
-	case HD_OPEN_B1CHANNEL:
-		if (ucs->basstate & BS_B1OPEN)
-			ucs->pending = 0;
-		break;
-	case HD_OPEN_B2CHANNEL:
-		if (ucs->basstate & BS_B2OPEN)
-			ucs->pending = 0;
-		break;
-	case HD_CLOSE_ATCHANNEL:
-		if (!(ucs->basstate & BS_ATOPEN))
-			ucs->pending = 0;
-		break;
-	case HD_CLOSE_B1CHANNEL:
-		if (!(ucs->basstate & BS_B1OPEN))
-			ucs->pending = 0;
-		break;
-	case HD_CLOSE_B2CHANNEL:
-		if (!(ucs->basstate & BS_B2OPEN))
-			ucs->pending = 0;
-		break;
-	case HD_DEVICE_INIT_ACK:		/* no reply expected */
-		ucs->pending = 0;
-		break;
-	case HD_RESET_INTERRUPT_PIPE:
-		if (!(ucs->basstate & BS_RESETTING))
-			ucs->pending = 0;
-		break;
-	/*
-	 * HD_READ_ATMESSAGE and HD_WRITE_ATMESSAGE are handled separately
-	 * and should never end up here
-	 */
-	default:
-		dev_warn(&ucs->interface->dev,
-			 "unknown pending request 0x%02x cleared\n",
-			 ucs->pending);
-		ucs->pending = 0;
-	}
-
-	if (!ucs->pending)
-		del_timer(&ucs->timer_ctrl);
-
-	spin_unlock_irqrestore(&ucs->lock, flags);
-}
-
-/* cmd_in_timeout
- * timeout routine for command input request
- * argument:
- *	controller state structure
- */
-static void cmd_in_timeout(struct timer_list *t)
-{
-	struct bas_cardstate *ucs = from_timer(ucs, t, timer_cmd_in);
-	struct cardstate *cs = ucs->cs;
-	int rc;
-
-	if (!ucs->rcvbuf_size) {
-		gig_dbg(DEBUG_USBREQ, "%s: no receive in progress", __func__);
-		return;
-	}
-
-	if (ucs->retry_cmd_in++ >= BAS_RETRY) {
-		dev_err(cs->dev,
-			"control read: timeout, giving up after %d tries\n",
-			ucs->retry_cmd_in);
-		kfree(ucs->rcvbuf);
-		ucs->rcvbuf = NULL;
-		ucs->rcvbuf_size = 0;
-		error_reset(cs);
-		return;
-	}
-
-	gig_dbg(DEBUG_USBREQ, "%s: timeout, retry %d",
-		__func__, ucs->retry_cmd_in);
-	rc = atread_submit(cs, BAS_TIMEOUT);
-	if (rc < 0) {
-		kfree(ucs->rcvbuf);
-		ucs->rcvbuf = NULL;
-		ucs->rcvbuf_size = 0;
-		if (rc != -ENODEV)
-			error_reset(cs);
-	}
-}
-
-/* read_ctrl_callback
- * USB completion handler for control pipe input
- * called by the USB subsystem in interrupt context
- * parameter:
- *	urb	USB request block
- *		urb->context = inbuf structure for controller state
- */
-static void read_ctrl_callback(struct urb *urb)
-{
-	struct inbuf_t *inbuf = urb->context;
-	struct cardstate *cs = inbuf->cs;
-	struct bas_cardstate *ucs = cs->hw.bas;
-	int status = urb->status;
-	unsigned numbytes;
-	int rc;
-
-	update_basstate(ucs, 0, BS_ATRDPEND);
-	wake_up(&ucs->waitqueue);
-	del_timer(&ucs->timer_cmd_in);
-
-	switch (status) {
-	case 0:				/* normal completion */
-		numbytes = urb->actual_length;
-		if (unlikely(numbytes != ucs->rcvbuf_size)) {
-			dev_warn(cs->dev,
-				 "control read: received %d chars, expected %d\n",
-				 numbytes, ucs->rcvbuf_size);
-			if (numbytes > ucs->rcvbuf_size)
-				numbytes = ucs->rcvbuf_size;
-		}
-
-		/* copy received bytes to inbuf, notify event layer */
-		if (gigaset_fill_inbuf(inbuf, ucs->rcvbuf, numbytes)) {
-			gig_dbg(DEBUG_INTR, "%s-->BH", __func__);
-			gigaset_schedule_event(cs);
-		}
-		break;
-
-	case -ENOENT:			/* cancelled */
-	case -ECONNRESET:		/* cancelled (async) */
-	case -EINPROGRESS:		/* pending */
-	case -ENODEV:			/* device removed */
-	case -ESHUTDOWN:		/* device shut down */
-		/* no further action necessary */
-		gig_dbg(DEBUG_USBREQ, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		break;
-
-	default:			/* other errors: retry */
-		if (ucs->retry_cmd_in++ < BAS_RETRY) {
-			gig_dbg(DEBUG_USBREQ, "%s: %s, retry %d", __func__,
-				get_usb_statmsg(status), ucs->retry_cmd_in);
-			rc = atread_submit(cs, BAS_TIMEOUT);
-			if (rc >= 0)
-				/* successfully resubmitted, skip freeing */
-				return;
-			if (rc == -ENODEV)
-				/* disconnect, no further action necessary */
-				break;
-		}
-		dev_err(cs->dev, "control read: %s, giving up after %d tries\n",
-			get_usb_statmsg(status), ucs->retry_cmd_in);
-		error_reset(cs);
-	}
-
-	/* read finished, free buffer */
-	kfree(ucs->rcvbuf);
-	ucs->rcvbuf = NULL;
-	ucs->rcvbuf_size = 0;
-}
-
-/* atread_submit
- * submit an HD_READ_ATMESSAGE command URB and optionally start a timeout
- * parameters:
- *	cs	controller state structure
- *	timeout	timeout in 1/10 sec., 0: none
- * return value:
- *	0 on success
- *	-EBUSY if another request is pending
- *	any URB submission error code
- */
-static int atread_submit(struct cardstate *cs, int timeout)
-{
-	struct bas_cardstate *ucs = cs->hw.bas;
-	int basstate;
-	int ret;
-
-	gig_dbg(DEBUG_USBREQ, "-------> HD_READ_ATMESSAGE (%d)",
-		ucs->rcvbuf_size);
-
-	basstate = update_basstate(ucs, BS_ATRDPEND, 0);
-	if (basstate & BS_ATRDPEND) {
-		dev_err(cs->dev,
-			"could not submit HD_READ_ATMESSAGE: URB busy\n");
-		return -EBUSY;
-	}
-
-	if (basstate & BS_SUSPEND) {
-		dev_notice(cs->dev,
-			   "HD_READ_ATMESSAGE not submitted, "
-			   "suspend in progress\n");
-		update_basstate(ucs, 0, BS_ATRDPEND);
-		/* treat like disconnect */
-		return -ENODEV;
-	}
-
-	ucs->dr_cmd_in.bRequestType = IN_VENDOR_REQ;
-	ucs->dr_cmd_in.bRequest = HD_READ_ATMESSAGE;
-	ucs->dr_cmd_in.wValue = 0;
-	ucs->dr_cmd_in.wIndex = 0;
-	ucs->dr_cmd_in.wLength = cpu_to_le16(ucs->rcvbuf_size);
-	usb_fill_control_urb(ucs->urb_cmd_in, ucs->udev,
-			     usb_rcvctrlpipe(ucs->udev, 0),
-			     (unsigned char *) &ucs->dr_cmd_in,
-			     ucs->rcvbuf, ucs->rcvbuf_size,
-			     read_ctrl_callback, cs->inbuf);
-
-	ret = usb_submit_urb(ucs->urb_cmd_in, GFP_ATOMIC);
-	if (ret != 0) {
-		update_basstate(ucs, 0, BS_ATRDPEND);
-		dev_err(cs->dev, "could not submit HD_READ_ATMESSAGE: %s\n",
-			get_usb_rcmsg(ret));
-		return ret;
-	}
-
-	if (timeout > 0) {
-		gig_dbg(DEBUG_USBREQ, "setting timeout of %d/10 secs", timeout);
-		mod_timer(&ucs->timer_cmd_in, jiffies + timeout * HZ / 10);
-	}
-	return 0;
-}
-
-/* int_in_work
- * workqueue routine to clear halt on interrupt in endpoint
- */
-
-static void int_in_work(struct work_struct *work)
-{
-	struct bas_cardstate *ucs =
-		container_of(work, struct bas_cardstate, int_in_wq);
-	struct urb *urb = ucs->urb_int_in;
-	struct cardstate *cs = urb->context;
-	int rc;
-
-	/* clear halt condition */
-	rc = usb_clear_halt(ucs->udev, urb->pipe);
-	gig_dbg(DEBUG_USBREQ, "clear_halt: %s", get_usb_rcmsg(rc));
-	if (rc == 0)
-		/* success, resubmit interrupt read URB */
-		rc = usb_submit_urb(urb, GFP_ATOMIC);
-
-	switch (rc) {
-	case 0:		/* success */
-	case -ENODEV:	/* device gone */
-	case -EINVAL:	/* URB already resubmitted, or terminal badness */
-		break;
-	default:	/* failure: try to recover by resetting the device */
-		dev_err(cs->dev, "clear halt failed: %s\n", get_usb_rcmsg(rc));
-		rc = usb_lock_device_for_reset(ucs->udev, ucs->interface);
-		if (rc == 0) {
-			rc = usb_reset_device(ucs->udev);
-			usb_unlock_device(ucs->udev);
-		}
-	}
-	ucs->retry_int_in = 0;
-}
-
-/* int_in_resubmit
- * timer routine for interrupt read delayed resubmit
- * argument:
- *	controller state structure
- */
-static void int_in_resubmit(struct timer_list *t)
-{
-	struct bas_cardstate *ucs = from_timer(ucs, t, timer_int_in);
-	struct cardstate *cs = ucs->cs;
-	int rc;
-
-	if (ucs->retry_int_in++ >= BAS_RETRY) {
-		dev_err(cs->dev, "interrupt read: giving up after %d tries\n",
-			ucs->retry_int_in);
-		usb_queue_reset_device(ucs->interface);
-		return;
-	}
-
-	gig_dbg(DEBUG_USBREQ, "%s: retry %d", __func__, ucs->retry_int_in);
-	rc = usb_submit_urb(ucs->urb_int_in, GFP_ATOMIC);
-	if (rc != 0 && rc != -ENODEV) {
-		dev_err(cs->dev, "could not resubmit interrupt URB: %s\n",
-			get_usb_rcmsg(rc));
-		usb_queue_reset_device(ucs->interface);
-	}
-}
-
-/* read_int_callback
- * USB completion handler for interrupt pipe input
- * called by the USB subsystem in interrupt context
- * parameter:
- *	urb	USB request block
- *		urb->context = controller state structure
- */
-static void read_int_callback(struct urb *urb)
-{
-	struct cardstate *cs = urb->context;
-	struct bas_cardstate *ucs = cs->hw.bas;
-	struct bc_state *bcs;
-	int status = urb->status;
-	unsigned long flags;
-	int rc;
-	unsigned l;
-	int channel;
-
-	switch (status) {
-	case 0:			/* success */
-		ucs->retry_int_in = 0;
-		break;
-	case -EPIPE:			/* endpoint stalled */
-		schedule_work(&ucs->int_in_wq);
-		/* fall through */
-	case -ENOENT:			/* cancelled */
-	case -ECONNRESET:		/* cancelled (async) */
-	case -EINPROGRESS:		/* pending */
-	case -ENODEV:			/* device removed */
-	case -ESHUTDOWN:		/* device shut down */
-		/* no further action necessary */
-		gig_dbg(DEBUG_USBREQ, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		return;
-	case -EPROTO:			/* protocol error or unplug */
-	case -EILSEQ:
-	case -ETIME:
-		/* resubmit after delay */
-		gig_dbg(DEBUG_USBREQ, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		mod_timer(&ucs->timer_int_in, jiffies + HZ / 10);
-		return;
-	default:		/* other errors: just resubmit */
-		dev_warn(cs->dev, "interrupt read: %s\n",
-			 get_usb_statmsg(status));
-		goto resubmit;
-	}
-
-	/* drop incomplete packets even if the missing bytes wouldn't matter */
-	if (unlikely(urb->actual_length < IP_MSGSIZE)) {
-		dev_warn(cs->dev, "incomplete interrupt packet (%d bytes)\n",
-			 urb->actual_length);
-		goto resubmit;
-	}
-
-	l = (unsigned) ucs->int_in_buf[1] +
-		(((unsigned) ucs->int_in_buf[2]) << 8);
-
-	gig_dbg(DEBUG_USBREQ, "<-------%d: 0x%02x (%u [0x%02x 0x%02x])",
-		urb->actual_length, (int)ucs->int_in_buf[0], l,
-		(int)ucs->int_in_buf[1], (int)ucs->int_in_buf[2]);
-
-	channel = 0;
-
-	switch (ucs->int_in_buf[0]) {
-	case HD_DEVICE_INIT_OK:
-		update_basstate(ucs, BS_INIT, 0);
-		break;
-
-	case HD_READY_SEND_ATDATA:
-		del_timer(&ucs->timer_atrdy);
-		update_basstate(ucs, BS_ATREADY, BS_ATTIMER);
-		start_cbsend(cs);
-		break;
-
-	case HD_OPEN_B2CHANNEL_ACK:
-		++channel;
-		/* fall through */
-	case HD_OPEN_B1CHANNEL_ACK:
-		bcs = cs->bcs + channel;
-		update_basstate(ucs, BS_B1OPEN << channel, 0);
-		gigaset_bchannel_up(bcs);
-		break;
-
-	case HD_OPEN_ATCHANNEL_ACK:
-		update_basstate(ucs, BS_ATOPEN, 0);
-		start_cbsend(cs);
-		break;
-
-	case HD_CLOSE_B2CHANNEL_ACK:
-		++channel;
-		/* fall through */
-	case HD_CLOSE_B1CHANNEL_ACK:
-		bcs = cs->bcs + channel;
-		update_basstate(ucs, 0, BS_B1OPEN << channel);
-		stopurbs(bcs->hw.bas);
-		gigaset_bchannel_down(bcs);
-		break;
-
-	case HD_CLOSE_ATCHANNEL_ACK:
-		update_basstate(ucs, 0, BS_ATOPEN);
-		break;
-
-	case HD_B2_FLOW_CONTROL:
-		++channel;
-		/* fall through */
-	case HD_B1_FLOW_CONTROL:
-		bcs = cs->bcs + channel;
-		atomic_add((l - BAS_NORMFRAME) * BAS_CORRFRAMES,
-			   &bcs->hw.bas->corrbytes);
-		gig_dbg(DEBUG_ISO,
-			"Flow control (channel %d, sub %d): 0x%02x => %d",
-			channel, bcs->hw.bas->numsub, l,
-			atomic_read(&bcs->hw.bas->corrbytes));
-		break;
-
-	case HD_RECEIVEATDATA_ACK:	/* AT response ready to be received */
-		if (!l) {
-			dev_warn(cs->dev,
-				 "HD_RECEIVEATDATA_ACK with length 0 ignored\n");
-			break;
-		}
-		spin_lock_irqsave(&cs->lock, flags);
-		if (ucs->basstate & BS_ATRDPEND) {
-			spin_unlock_irqrestore(&cs->lock, flags);
-			dev_warn(cs->dev,
-				 "HD_RECEIVEATDATA_ACK(%d) during HD_READ_ATMESSAGE(%d) ignored\n",
-				 l, ucs->rcvbuf_size);
-			break;
-		}
-		if (ucs->rcvbuf_size) {
-			/* throw away previous buffer - we have no queue */
-			dev_err(cs->dev,
-				"receive AT data overrun, %d bytes lost\n",
-				ucs->rcvbuf_size);
-			kfree(ucs->rcvbuf);
-			ucs->rcvbuf_size = 0;
-		}
-		ucs->rcvbuf = kmalloc(l, GFP_ATOMIC);
-		if (ucs->rcvbuf == NULL) {
-			spin_unlock_irqrestore(&cs->lock, flags);
-			dev_err(cs->dev, "out of memory receiving AT data\n");
-			break;
-		}
-		ucs->rcvbuf_size = l;
-		ucs->retry_cmd_in = 0;
-		rc = atread_submit(cs, BAS_TIMEOUT);
-		if (rc < 0) {
-			kfree(ucs->rcvbuf);
-			ucs->rcvbuf = NULL;
-			ucs->rcvbuf_size = 0;
-		}
-		spin_unlock_irqrestore(&cs->lock, flags);
-		if (rc < 0 && rc != -ENODEV)
-			error_reset(cs);
-		break;
-
-	case HD_RESET_INTERRUPT_PIPE_ACK:
-		update_basstate(ucs, 0, BS_RESETTING);
-		dev_notice(cs->dev, "interrupt pipe reset\n");
-		break;
-
-	case HD_SUSPEND_END:
-		gig_dbg(DEBUG_USBREQ, "HD_SUSPEND_END");
-		break;
-
-	default:
-		dev_warn(cs->dev,
-			 "unknown Gigaset signal 0x%02x (%u) ignored\n",
-			 (int) ucs->int_in_buf[0], l);
-	}
-
-	check_pending(ucs);
-	wake_up(&ucs->waitqueue);
-
-resubmit:
-	rc = usb_submit_urb(urb, GFP_ATOMIC);
-	if (unlikely(rc != 0 && rc != -ENODEV)) {
-		dev_err(cs->dev, "could not resubmit interrupt URB: %s\n",
-			get_usb_rcmsg(rc));
-		error_reset(cs);
-	}
-}
-
-/* read_iso_callback
- * USB completion handler for B channel isochronous input
- * called by the USB subsystem in interrupt context
- * parameter:
- *	urb	USB request block of completed request
- *		urb->context = bc_state structure
- */
-static void read_iso_callback(struct urb *urb)
-{
-	struct bc_state *bcs;
-	struct bas_bc_state *ubc;
-	int status = urb->status;
-	unsigned long flags;
-	int i, rc;
-
-	/* status codes not worth bothering the tasklet with */
-	if (unlikely(status == -ENOENT ||
-		     status == -ECONNRESET ||
-		     status == -EINPROGRESS ||
-		     status == -ENODEV ||
-		     status == -ESHUTDOWN)) {
-		gig_dbg(DEBUG_ISO, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		return;
-	}
-
-	bcs = urb->context;
-	ubc = bcs->hw.bas;
-
-	spin_lock_irqsave(&ubc->isoinlock, flags);
-	if (likely(ubc->isoindone == NULL)) {
-		/* pass URB to tasklet */
-		ubc->isoindone = urb;
-		ubc->isoinstatus = status;
-		tasklet_hi_schedule(&ubc->rcvd_tasklet);
-	} else {
-		/* tasklet still busy, drop data and resubmit URB */
-		gig_dbg(DEBUG_ISO, "%s: overrun", __func__);
-		ubc->loststatus = status;
-		for (i = 0; i < BAS_NUMFRAMES; i++) {
-			ubc->isoinlost += urb->iso_frame_desc[i].actual_length;
-			if (unlikely(urb->iso_frame_desc[i].status != 0 &&
-				     urb->iso_frame_desc[i].status != -EINPROGRESS))
-				ubc->loststatus = urb->iso_frame_desc[i].status;
-			urb->iso_frame_desc[i].status = 0;
-			urb->iso_frame_desc[i].actual_length = 0;
-		}
-		if (likely(ubc->running)) {
-			/* urb->dev is clobbered by USB subsystem */
-			urb->dev = bcs->cs->hw.bas->udev;
-			urb->transfer_flags = URB_ISO_ASAP;
-			urb->number_of_packets = BAS_NUMFRAMES;
-			rc = usb_submit_urb(urb, GFP_ATOMIC);
-			if (unlikely(rc != 0 && rc != -ENODEV)) {
-				dev_err(bcs->cs->dev,
-					"could not resubmit isoc read URB: %s\n",
-					get_usb_rcmsg(rc));
-				dump_urb(DEBUG_ISO, "isoc read", urb);
-				error_hangup(bcs);
-			}
-		}
-	}
-	spin_unlock_irqrestore(&ubc->isoinlock, flags);
-}
-
-/* write_iso_callback
- * USB completion handler for B channel isochronous output
- * called by the USB subsystem in interrupt context
- * parameter:
- *	urb	USB request block of completed request
- *		urb->context = isow_urbctx_t structure
- */
-static void write_iso_callback(struct urb *urb)
-{
-	struct isow_urbctx_t *ucx;
-	struct bas_bc_state *ubc;
-	int status = urb->status;
-	unsigned long flags;
-
-	/* status codes not worth bothering the tasklet with */
-	if (unlikely(status == -ENOENT ||
-		     status == -ECONNRESET ||
-		     status == -EINPROGRESS ||
-		     status == -ENODEV ||
-		     status == -ESHUTDOWN)) {
-		gig_dbg(DEBUG_ISO, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		return;
-	}
-
-	/* pass URB context to tasklet */
-	ucx = urb->context;
-	ubc = ucx->bcs->hw.bas;
-	ucx->status = status;
-
-	spin_lock_irqsave(&ubc->isooutlock, flags);
-	ubc->isooutovfl = ubc->isooutdone;
-	ubc->isooutdone = ucx;
-	spin_unlock_irqrestore(&ubc->isooutlock, flags);
-	tasklet_hi_schedule(&ubc->sent_tasklet);
-}
-
-/* starturbs
- * prepare and submit USB request blocks for isochronous input and output
- * argument:
- *	B channel control structure
- * return value:
- *	0 on success
- *	< 0 on error (no URBs submitted)
- */
-static int starturbs(struct bc_state *bcs)
-{
-	struct usb_device *udev = bcs->cs->hw.bas->udev;
-	struct bas_bc_state *ubc = bcs->hw.bas;
-	struct urb *urb;
-	int j, k;
-	int rc;
-
-	/* initialize L2 reception */
-	if (bcs->proto2 == L2_HDLC)
-		bcs->inputstate |= INS_flag_hunt;
-
-	/* submit all isochronous input URBs */
-	ubc->running = 1;
-	for (k = 0; k < BAS_INURBS; k++) {
-		urb = ubc->isoinurbs[k];
-		if (!urb) {
-			rc = -EFAULT;
-			goto error;
-		}
-		usb_fill_int_urb(urb, udev,
-				 usb_rcvisocpipe(udev, 3 + 2 * bcs->channel),
-				 ubc->isoinbuf + k * BAS_INBUFSIZE,
-				 BAS_INBUFSIZE, read_iso_callback, bcs,
-				 BAS_FRAMETIME);
-
-		urb->transfer_flags = URB_ISO_ASAP;
-		urb->number_of_packets = BAS_NUMFRAMES;
-		for (j = 0; j < BAS_NUMFRAMES; j++) {
-			urb->iso_frame_desc[j].offset = j * BAS_MAXFRAME;
-			urb->iso_frame_desc[j].length = BAS_MAXFRAME;
-			urb->iso_frame_desc[j].status = 0;
-			urb->iso_frame_desc[j].actual_length = 0;
-		}
-
-		dump_urb(DEBUG_ISO, "Initial isoc read", urb);
-		rc = usb_submit_urb(urb, GFP_ATOMIC);
-		if (rc != 0)
-			goto error;
-	}
-
-	/* initialize L2 transmission */
-	gigaset_isowbuf_init(ubc->isooutbuf, PPP_FLAG);
-
-	/* set up isochronous output URBs for flag idling */
-	for (k = 0; k < BAS_OUTURBS; ++k) {
-		urb = ubc->isoouturbs[k].urb;
-		if (!urb) {
-			rc = -EFAULT;
-			goto error;
-		}
-		usb_fill_int_urb(urb, udev,
-				 usb_sndisocpipe(udev, 4 + 2 * bcs->channel),
-				 ubc->isooutbuf->data,
-				 sizeof(ubc->isooutbuf->data),
-				 write_iso_callback, &ubc->isoouturbs[k],
-				 BAS_FRAMETIME);
-
-		urb->transfer_flags = URB_ISO_ASAP;
-		urb->number_of_packets = BAS_NUMFRAMES;
-		for (j = 0; j < BAS_NUMFRAMES; ++j) {
-			urb->iso_frame_desc[j].offset = BAS_OUTBUFSIZE;
-			urb->iso_frame_desc[j].length = BAS_NORMFRAME;
-			urb->iso_frame_desc[j].status = 0;
-			urb->iso_frame_desc[j].actual_length = 0;
-		}
-		ubc->isoouturbs[k].limit = -1;
-	}
-
-	/* keep one URB free, submit the others */
-	for (k = 0; k < BAS_OUTURBS - 1; ++k) {
-		dump_urb(DEBUG_ISO, "Initial isoc write", urb);
-		rc = usb_submit_urb(ubc->isoouturbs[k].urb, GFP_ATOMIC);
-		if (rc != 0)
-			goto error;
-	}
-	dump_urb(DEBUG_ISO, "Initial isoc write (free)", urb);
-	ubc->isooutfree = &ubc->isoouturbs[BAS_OUTURBS - 1];
-	ubc->isooutdone = ubc->isooutovfl = NULL;
-	return 0;
-error:
-	stopurbs(ubc);
-	return rc;
-}
-
-/* stopurbs
- * cancel the USB request blocks for isochronous input and output
- * errors are silently ignored
- * argument:
- *	hardware specific B channel state structure
- */
-static void stopurbs(struct bas_bc_state *ubc)
-{
-	int k, rc;
-
-	ubc->running = 0;
-
-	for (k = 0; k < BAS_INURBS; ++k) {
-		rc = usb_unlink_urb(ubc->isoinurbs[k]);
-		gig_dbg(DEBUG_ISO,
-			"%s: isoc input URB %d unlinked, result = %s",
-			__func__, k, get_usb_rcmsg(rc));
-	}
-
-	for (k = 0; k < BAS_OUTURBS; ++k) {
-		rc = usb_unlink_urb(ubc->isoouturbs[k].urb);
-		gig_dbg(DEBUG_ISO,
-			"%s: isoc output URB %d unlinked, result = %s",
-			__func__, k, get_usb_rcmsg(rc));
-	}
-}
-
-/* Isochronous Write - Bottom Half */
-/* =============================== */
-
-/* submit_iso_write_urb
- * fill and submit the next isochronous write URB
- * parameters:
- *	ucx	context structure containing URB
- * return value:
- *	number of frames submitted in URB
- *	0 if URB not submitted because no data available (isooutbuf busy)
- *	error code < 0 on error
- */
-static int submit_iso_write_urb(struct isow_urbctx_t *ucx)
-{
-	struct urb *urb = ucx->urb;
-	struct bas_bc_state *ubc = ucx->bcs->hw.bas;
-	struct usb_iso_packet_descriptor *ifd;
-	int corrbytes, nframe, rc;
-
-	/* urb->dev is clobbered by USB subsystem */
-	urb->dev = ucx->bcs->cs->hw.bas->udev;
-	urb->transfer_flags = URB_ISO_ASAP;
-	urb->transfer_buffer = ubc->isooutbuf->data;
-	urb->transfer_buffer_length = sizeof(ubc->isooutbuf->data);
-
-	for (nframe = 0; nframe < BAS_NUMFRAMES; nframe++) {
-		ifd = &urb->iso_frame_desc[nframe];
-
-		/* compute frame length according to flow control */
-		ifd->length = BAS_NORMFRAME;
-		corrbytes = atomic_read(&ubc->corrbytes);
-		if (corrbytes != 0) {
-			gig_dbg(DEBUG_ISO, "%s: corrbytes=%d",
-				__func__, corrbytes);
-			if (corrbytes > BAS_HIGHFRAME - BAS_NORMFRAME)
-				corrbytes = BAS_HIGHFRAME - BAS_NORMFRAME;
-			else if (corrbytes < BAS_LOWFRAME - BAS_NORMFRAME)
-				corrbytes = BAS_LOWFRAME - BAS_NORMFRAME;
-			ifd->length += corrbytes;
-			atomic_add(-corrbytes, &ubc->corrbytes);
-		}
-
-		/* retrieve block of data to send */
-		rc = gigaset_isowbuf_getbytes(ubc->isooutbuf, ifd->length);
-		if (rc < 0) {
-			if (rc == -EBUSY) {
-				gig_dbg(DEBUG_ISO,
-					"%s: buffer busy at frame %d",
-					__func__, nframe);
-				/* tasklet will be restarted from
-				   gigaset_isoc_send_skb() */
-			} else {
-				dev_err(ucx->bcs->cs->dev,
-					"%s: buffer error %d at frame %d\n",
-					__func__, rc, nframe);
-				return rc;
-			}
-			break;
-		}
-		ifd->offset = rc;
-		ucx->limit = ubc->isooutbuf->nextread;
-		ifd->status = 0;
-		ifd->actual_length = 0;
-	}
-	if (unlikely(nframe == 0))
-		return 0;	/* no data to send */
-	urb->number_of_packets = nframe;
-
-	rc = usb_submit_urb(urb, GFP_ATOMIC);
-	if (unlikely(rc)) {
-		if (rc == -ENODEV)
-			/* device removed - give up silently */
-			gig_dbg(DEBUG_ISO, "%s: disconnected", __func__);
-		else
-			dev_err(ucx->bcs->cs->dev,
-				"could not submit isoc write URB: %s\n",
-				get_usb_rcmsg(rc));
-		return rc;
-	}
-	++ubc->numsub;
-	return nframe;
-}
-
-/* write_iso_tasklet
- * tasklet scheduled when an isochronous output URB from the Gigaset device
- * has completed
- * parameter:
- *	data	B channel state structure
- */
-static void write_iso_tasklet(unsigned long data)
-{
-	struct bc_state *bcs = (struct bc_state *) data;
-	struct bas_bc_state *ubc = bcs->hw.bas;
-	struct cardstate *cs = bcs->cs;
-	struct isow_urbctx_t *done, *next, *ovfl;
-	struct urb *urb;
-	int status;
-	struct usb_iso_packet_descriptor *ifd;
-	unsigned long flags;
-	int i;
-	struct sk_buff *skb;
-	int len;
-	int rc;
-
-	/* loop while completed URBs arrive in time */
-	for (;;) {
-		if (unlikely(!(ubc->running))) {
-			gig_dbg(DEBUG_ISO, "%s: not running", __func__);
-			return;
-		}
-
-		/* retrieve completed URBs */
-		spin_lock_irqsave(&ubc->isooutlock, flags);
-		done = ubc->isooutdone;
-		ubc->isooutdone = NULL;
-		ovfl = ubc->isooutovfl;
-		ubc->isooutovfl = NULL;
-		spin_unlock_irqrestore(&ubc->isooutlock, flags);
-		if (ovfl) {
-			dev_err(cs->dev, "isoc write underrun\n");
-			error_hangup(bcs);
-			break;
-		}
-		if (!done)
-			break;
-
-		/* submit free URB if available */
-		spin_lock_irqsave(&ubc->isooutlock, flags);
-		next = ubc->isooutfree;
-		ubc->isooutfree = NULL;
-		spin_unlock_irqrestore(&ubc->isooutlock, flags);
-		if (next) {
-			rc = submit_iso_write_urb(next);
-			if (unlikely(rc <= 0 && rc != -ENODEV)) {
-				/* could not submit URB, put it back */
-				spin_lock_irqsave(&ubc->isooutlock, flags);
-				if (ubc->isooutfree == NULL) {
-					ubc->isooutfree = next;
-					next = NULL;
-				}
-				spin_unlock_irqrestore(&ubc->isooutlock, flags);
-				if (next) {
-					/* couldn't put it back */
-					dev_err(cs->dev,
-						"losing isoc write URB\n");
-					error_hangup(bcs);
-				}
-			}
-		}
-
-		/* process completed URB */
-		urb = done->urb;
-		status = done->status;
-		switch (status) {
-		case -EXDEV:			/* partial completion */
-			gig_dbg(DEBUG_ISO, "%s: URB partially completed",
-				__func__);
-			/* fall through - what's the difference anyway? */
-		case 0:				/* normal completion */
-			/* inspect individual frames
-			 * assumptions (for lack of documentation):
-			 * - actual_length bytes of first frame in error are
-			 *   successfully sent
-			 * - all following frames are not sent at all
-			 */
-			for (i = 0; i < BAS_NUMFRAMES; i++) {
-				ifd = &urb->iso_frame_desc[i];
-				if (ifd->status ||
-				    ifd->actual_length != ifd->length) {
-					dev_warn(cs->dev,
-						 "isoc write: frame %d[%d/%d]: %s\n",
-						 i, ifd->actual_length,
-						 ifd->length,
-						 get_usb_statmsg(ifd->status));
-					break;
-				}
-			}
-			break;
-		case -EPIPE:			/* stall - probably underrun */
-			dev_err(cs->dev, "isoc write: stalled\n");
-			error_hangup(bcs);
-			break;
-		default:			/* other errors */
-			dev_warn(cs->dev, "isoc write: %s\n",
-				 get_usb_statmsg(status));
-		}
-
-		/* mark the write buffer area covered by this URB as free */
-		if (done->limit >= 0)
-			ubc->isooutbuf->read = done->limit;
-
-		/* mark URB as free */
-		spin_lock_irqsave(&ubc->isooutlock, flags);
-		next = ubc->isooutfree;
-		ubc->isooutfree = done;
-		spin_unlock_irqrestore(&ubc->isooutlock, flags);
-		if (next) {
-			/* only one URB still active - resubmit one */
-			rc = submit_iso_write_urb(next);
-			if (unlikely(rc <= 0 && rc != -ENODEV)) {
-				/* couldn't submit */
-				error_hangup(bcs);
-			}
-		}
-	}
-
-	/* process queued SKBs */
-	while ((skb = skb_dequeue(&bcs->squeue))) {
-		/* copy to output buffer, doing L2 encapsulation */
-		len = skb->len;
-		if (gigaset_isoc_buildframe(bcs, skb->data, len) == -EAGAIN) {
-			/* insufficient buffer space, push back onto queue */
-			skb_queue_head(&bcs->squeue, skb);
-			gig_dbg(DEBUG_ISO, "%s: skb requeued, qlen=%d",
-				__func__, skb_queue_len(&bcs->squeue));
-			break;
-		}
-		skb_pull(skb, len);
-		gigaset_skb_sent(bcs, skb);
-		dev_kfree_skb_any(skb);
-	}
-}
-
-/* Isochronous Read - Bottom Half */
-/* ============================== */
-
-/* read_iso_tasklet
- * tasklet scheduled when an isochronous input URB from the Gigaset device
- * has completed
- * parameter:
- *	data	B channel state structure
- */
-static void read_iso_tasklet(unsigned long data)
-{
-	struct bc_state *bcs = (struct bc_state *) data;
-	struct bas_bc_state *ubc = bcs->hw.bas;
-	struct cardstate *cs = bcs->cs;
-	struct urb *urb;
-	int status;
-	struct usb_iso_packet_descriptor *ifd;
-	char *rcvbuf;
-	unsigned long flags;
-	int totleft, numbytes, offset, frame, rc;
-
-	/* loop while more completed URBs arrive in the meantime */
-	for (;;) {
-		/* retrieve URB */
-		spin_lock_irqsave(&ubc->isoinlock, flags);
-		urb = ubc->isoindone;
-		if (!urb) {
-			spin_unlock_irqrestore(&ubc->isoinlock, flags);
-			return;
-		}
-		status = ubc->isoinstatus;
-		ubc->isoindone = NULL;
-		if (unlikely(ubc->loststatus != -EINPROGRESS)) {
-			dev_warn(cs->dev,
-				 "isoc read overrun, URB dropped (status: %s, %d bytes)\n",
-				 get_usb_statmsg(ubc->loststatus),
-				 ubc->isoinlost);
-			ubc->loststatus = -EINPROGRESS;
-		}
-		spin_unlock_irqrestore(&ubc->isoinlock, flags);
-
-		if (unlikely(!(ubc->running))) {
-			gig_dbg(DEBUG_ISO,
-				"%s: channel not running, "
-				"dropped URB with status: %s",
-				__func__, get_usb_statmsg(status));
-			return;
-		}
-
-		switch (status) {
-		case 0:				/* normal completion */
-			break;
-		case -EXDEV:			/* inspect individual frames
-						   (we do that anyway) */
-			gig_dbg(DEBUG_ISO, "%s: URB partially completed",
-				__func__);
-			break;
-		case -ENOENT:
-		case -ECONNRESET:
-		case -EINPROGRESS:
-			gig_dbg(DEBUG_ISO, "%s: %s",
-				__func__, get_usb_statmsg(status));
-			continue;		/* -> skip */
-		case -EPIPE:
-			dev_err(cs->dev, "isoc read: stalled\n");
-			error_hangup(bcs);
-			continue;		/* -> skip */
-		default:			/* other error */
-			dev_warn(cs->dev, "isoc read: %s\n",
-				 get_usb_statmsg(status));
-			goto error;
-		}
-
-		rcvbuf = urb->transfer_buffer;
-		totleft = urb->actual_length;
-		for (frame = 0; totleft > 0 && frame < BAS_NUMFRAMES; frame++) {
-			ifd = &urb->iso_frame_desc[frame];
-			numbytes = ifd->actual_length;
-			switch (ifd->status) {
-			case 0:			/* success */
-				break;
-			case -EPROTO:		/* protocol error or unplug */
-			case -EILSEQ:
-			case -ETIME:
-				/* probably just disconnected, ignore */
-				gig_dbg(DEBUG_ISO,
-					"isoc read: frame %d[%d]: %s\n",
-					frame, numbytes,
-					get_usb_statmsg(ifd->status));
-				break;
-			default:		/* other error */
-				/* report, assume transferred bytes are ok */
-				dev_warn(cs->dev,
-					 "isoc read: frame %d[%d]: %s\n",
-					 frame, numbytes,
-					 get_usb_statmsg(ifd->status));
-			}
-			if (unlikely(numbytes > BAS_MAXFRAME))
-				dev_warn(cs->dev,
-					 "isoc read: frame %d[%d]: %s\n",
-					 frame, numbytes,
-					 "exceeds max frame size");
-			if (unlikely(numbytes > totleft)) {
-				dev_warn(cs->dev,
-					 "isoc read: frame %d[%d]: %s\n",
-					 frame, numbytes,
-					 "exceeds total transfer length");
-				numbytes = totleft;
-			}
-			offset = ifd->offset;
-			if (unlikely(offset + numbytes > BAS_INBUFSIZE)) {
-				dev_warn(cs->dev,
-					 "isoc read: frame %d[%d]: %s\n",
-					 frame, numbytes,
-					 "exceeds end of buffer");
-				numbytes = BAS_INBUFSIZE - offset;
-			}
-			gigaset_isoc_receive(rcvbuf + offset, numbytes, bcs);
-			totleft -= numbytes;
-		}
-		if (unlikely(totleft > 0))
-			dev_warn(cs->dev, "isoc read: %d data bytes missing\n",
-				 totleft);
-
-error:
-		/* URB processed, resubmit */
-		for (frame = 0; frame < BAS_NUMFRAMES; frame++) {
-			urb->iso_frame_desc[frame].status = 0;
-			urb->iso_frame_desc[frame].actual_length = 0;
-		}
-		/* urb->dev is clobbered by USB subsystem */
-		urb->dev = bcs->cs->hw.bas->udev;
-		urb->transfer_flags = URB_ISO_ASAP;
-		urb->number_of_packets = BAS_NUMFRAMES;
-		rc = usb_submit_urb(urb, GFP_ATOMIC);
-		if (unlikely(rc != 0 && rc != -ENODEV)) {
-			dev_err(cs->dev,
-				"could not resubmit isoc read URB: %s\n",
-				get_usb_rcmsg(rc));
-			dump_urb(DEBUG_ISO, "resubmit isoc read", urb);
-			error_hangup(bcs);
-		}
-	}
-}
-
-/* Channel Operations */
-/* ================== */
-
-/* req_timeout
- * timeout routine for control output request
- * argument:
- *	controller state structure
- */
-static void req_timeout(struct timer_list *t)
-{
-	struct bas_cardstate *ucs = from_timer(ucs, t, timer_ctrl);
-	struct cardstate *cs = ucs->cs;
-	int pending;
-	unsigned long flags;
-
-	check_pending(ucs);
-
-	spin_lock_irqsave(&ucs->lock, flags);
-	pending = ucs->pending;
-	ucs->pending = 0;
-	spin_unlock_irqrestore(&ucs->lock, flags);
-
-	switch (pending) {
-	case 0:					/* no pending request */
-		gig_dbg(DEBUG_USBREQ, "%s: no request pending", __func__);
-		break;
-
-	case HD_OPEN_ATCHANNEL:
-		dev_err(cs->dev, "timeout opening AT channel\n");
-		error_reset(cs);
-		break;
-
-	case HD_OPEN_B1CHANNEL:
-		dev_err(cs->dev, "timeout opening channel 1\n");
-		error_hangup(&cs->bcs[0]);
-		break;
-
-	case HD_OPEN_B2CHANNEL:
-		dev_err(cs->dev, "timeout opening channel 2\n");
-		error_hangup(&cs->bcs[1]);
-		break;
-
-	case HD_CLOSE_ATCHANNEL:
-		dev_err(cs->dev, "timeout closing AT channel\n");
-		error_reset(cs);
-		break;
-
-	case HD_CLOSE_B1CHANNEL:
-		dev_err(cs->dev, "timeout closing channel 1\n");
-		error_reset(cs);
-		break;
-
-	case HD_CLOSE_B2CHANNEL:
-		dev_err(cs->dev, "timeout closing channel 2\n");
-		error_reset(cs);
-		break;
-
-	case HD_RESET_INTERRUPT_PIPE:
-		/* error recovery escalation */
-		dev_err(cs->dev,
-			"reset interrupt pipe timeout, attempting USB reset\n");
-		usb_queue_reset_device(ucs->interface);
-		break;
-
-	default:
-		dev_warn(cs->dev, "request 0x%02x timed out, clearing\n",
-			 pending);
-	}
-
-	wake_up(&ucs->waitqueue);
-}
-
-/* write_ctrl_callback
- * USB completion handler for control pipe output
- * called by the USB subsystem in interrupt context
- * parameter:
- *	urb	USB request block of completed request
- *		urb->context = hardware specific controller state structure
- */
-static void write_ctrl_callback(struct urb *urb)
-{
-	struct bas_cardstate *ucs = urb->context;
-	int status = urb->status;
-	int rc;
-	unsigned long flags;
-
-	/* check status */
-	switch (status) {
-	case 0:					/* normal completion */
-		spin_lock_irqsave(&ucs->lock, flags);
-		switch (ucs->pending) {
-		case HD_DEVICE_INIT_ACK:	/* no reply expected */
-			del_timer(&ucs->timer_ctrl);
-			ucs->pending = 0;
-			break;
-		}
-		spin_unlock_irqrestore(&ucs->lock, flags);
-		return;
-
-	case -ENOENT:			/* cancelled */
-	case -ECONNRESET:		/* cancelled (async) */
-	case -EINPROGRESS:		/* pending */
-	case -ENODEV:			/* device removed */
-	case -ESHUTDOWN:		/* device shut down */
-		/* ignore silently */
-		gig_dbg(DEBUG_USBREQ, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		break;
-
-	default:				/* any failure */
-		/* don't retry if suspend requested */
-		if (++ucs->retry_ctrl > BAS_RETRY ||
-		    (ucs->basstate & BS_SUSPEND)) {
-			dev_err(&ucs->interface->dev,
-				"control request 0x%02x failed: %s\n",
-				ucs->dr_ctrl.bRequest,
-				get_usb_statmsg(status));
-			break;		/* give up */
-		}
-		dev_notice(&ucs->interface->dev,
-			   "control request 0x%02x: %s, retry %d\n",
-			   ucs->dr_ctrl.bRequest, get_usb_statmsg(status),
-			   ucs->retry_ctrl);
-		/* urb->dev is clobbered by USB subsystem */
-		urb->dev = ucs->udev;
-		rc = usb_submit_urb(urb, GFP_ATOMIC);
-		if (unlikely(rc)) {
-			dev_err(&ucs->interface->dev,
-				"could not resubmit request 0x%02x: %s\n",
-				ucs->dr_ctrl.bRequest, get_usb_rcmsg(rc));
-			break;
-		}
-		/* resubmitted */
-		return;
-	}
-
-	/* failed, clear pending request */
-	spin_lock_irqsave(&ucs->lock, flags);
-	del_timer(&ucs->timer_ctrl);
-	ucs->pending = 0;
-	spin_unlock_irqrestore(&ucs->lock, flags);
-	wake_up(&ucs->waitqueue);
-}
-
-/* req_submit
- * submit a control output request without message buffer to the Gigaset base
- * and optionally start a timeout
- * parameters:
- *	bcs	B channel control structure
- *	req	control request code (HD_*)
- *	val	control request parameter value (set to 0 if unused)
- *	timeout	timeout in 1/10 seconds (0: no timeout)
- * return value:
- *	0 on success
- *	-EBUSY if another request is pending
- *	any URB submission error code
- */
-static int req_submit(struct bc_state *bcs, int req, int val, int timeout)
-{
-	struct bas_cardstate *ucs = bcs->cs->hw.bas;
-	int ret;
-	unsigned long flags;
-
-	gig_dbg(DEBUG_USBREQ, "-------> 0x%02x (%d)", req, val);
-
-	spin_lock_irqsave(&ucs->lock, flags);
-	if (ucs->pending) {
-		spin_unlock_irqrestore(&ucs->lock, flags);
-		dev_err(bcs->cs->dev,
-			"submission of request 0x%02x failed: "
-			"request 0x%02x still pending\n",
-			req, ucs->pending);
-		return -EBUSY;
-	}
-
-	ucs->dr_ctrl.bRequestType = OUT_VENDOR_REQ;
-	ucs->dr_ctrl.bRequest = req;
-	ucs->dr_ctrl.wValue = cpu_to_le16(val);
-	ucs->dr_ctrl.wIndex = 0;
-	ucs->dr_ctrl.wLength = 0;
-	usb_fill_control_urb(ucs->urb_ctrl, ucs->udev,
-			     usb_sndctrlpipe(ucs->udev, 0),
-			     (unsigned char *) &ucs->dr_ctrl, NULL, 0,
-			     write_ctrl_callback, ucs);
-	ucs->retry_ctrl = 0;
-	ret = usb_submit_urb(ucs->urb_ctrl, GFP_ATOMIC);
-	if (unlikely(ret)) {
-		dev_err(bcs->cs->dev, "could not submit request 0x%02x: %s\n",
-			req, get_usb_rcmsg(ret));
-		spin_unlock_irqrestore(&ucs->lock, flags);
-		return ret;
-	}
-	ucs->pending = req;
-
-	if (timeout > 0) {
-		gig_dbg(DEBUG_USBREQ, "setting timeout of %d/10 secs", timeout);
-		mod_timer(&ucs->timer_ctrl, jiffies + timeout * HZ / 10);
-	}
-
-	spin_unlock_irqrestore(&ucs->lock, flags);
-	return 0;
-}
-
-/* gigaset_init_bchannel
- * called by common.c to connect a B channel
- * initialize isochronous I/O and tell the Gigaset base to open the channel
- * argument:
- *	B channel control structure
- * return value:
- *	0 on success, error code < 0 on error
- */
-static int gigaset_init_bchannel(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-	int req, ret;
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (unlikely(!cs->connected)) {
-		gig_dbg(DEBUG_USBREQ, "%s: not connected", __func__);
-		spin_unlock_irqrestore(&cs->lock, flags);
-		return -ENODEV;
-	}
-
-	if (cs->hw.bas->basstate & BS_SUSPEND) {
-		dev_notice(cs->dev,
-			   "not starting isoc I/O, suspend in progress\n");
-		spin_unlock_irqrestore(&cs->lock, flags);
-		return -EHOSTUNREACH;
-	}
-
-	ret = starturbs(bcs);
-	if (ret < 0) {
-		spin_unlock_irqrestore(&cs->lock, flags);
-		dev_err(cs->dev,
-			"could not start isoc I/O for channel B%d: %s\n",
-			bcs->channel + 1,
-			ret == -EFAULT ? "null URB" : get_usb_rcmsg(ret));
-		if (ret != -ENODEV)
-			error_hangup(bcs);
-		return ret;
-	}
-
-	req = bcs->channel ? HD_OPEN_B2CHANNEL : HD_OPEN_B1CHANNEL;
-	ret = req_submit(bcs, req, 0, BAS_TIMEOUT);
-	if (ret < 0) {
-		dev_err(cs->dev, "could not open channel B%d\n",
-			bcs->channel + 1);
-		stopurbs(bcs->hw.bas);
-	}
-
-	spin_unlock_irqrestore(&cs->lock, flags);
-	if (ret < 0 && ret != -ENODEV)
-		error_hangup(bcs);
-	return ret;
-}
-
-/* gigaset_close_bchannel
- * called by common.c to disconnect a B channel
- * tell the Gigaset base to close the channel
- * stopping isochronous I/O and LL notification will be done when the
- * acknowledgement for the close arrives
- * argument:
- *	B channel control structure
- * return value:
- *	0 on success, error code < 0 on error
- */
-static int gigaset_close_bchannel(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-	int req, ret;
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (unlikely(!cs->connected)) {
-		spin_unlock_irqrestore(&cs->lock, flags);
-		gig_dbg(DEBUG_USBREQ, "%s: not connected", __func__);
-		return -ENODEV;
-	}
-
-	if (!(cs->hw.bas->basstate & (bcs->channel ? BS_B2OPEN : BS_B1OPEN))) {
-		/* channel not running: just signal common.c */
-		spin_unlock_irqrestore(&cs->lock, flags);
-		gigaset_bchannel_down(bcs);
-		return 0;
-	}
-
-	/* channel running: tell device to close it */
-	req = bcs->channel ? HD_CLOSE_B2CHANNEL : HD_CLOSE_B1CHANNEL;
-	ret = req_submit(bcs, req, 0, BAS_TIMEOUT);
-	if (ret < 0)
-		dev_err(cs->dev, "closing channel B%d failed\n",
-			bcs->channel + 1);
-
-	spin_unlock_irqrestore(&cs->lock, flags);
-	return ret;
-}
-
-/* Device Operations */
-/* ================= */
-
-/* complete_cb
- * unqueue first command buffer from queue, waking any sleepers
- * must be called with cs->cmdlock held
- * parameter:
- *	cs	controller state structure
- */
-static void complete_cb(struct cardstate *cs)
-{
-	struct cmdbuf_t *cb = cs->cmdbuf;
-
-	/* unqueue completed buffer */
-	cs->cmdbytes -= cs->curlen;
-	gig_dbg(DEBUG_OUTPUT, "write_command: sent %u bytes, %u left",
-		cs->curlen, cs->cmdbytes);
-	if (cb->next != NULL) {
-		cs->cmdbuf = cb->next;
-		cs->cmdbuf->prev = NULL;
-		cs->curlen = cs->cmdbuf->len;
-	} else {
-		cs->cmdbuf = NULL;
-		cs->lastcmdbuf = NULL;
-		cs->curlen = 0;
-	}
-
-	if (cb->wake_tasklet)
-		tasklet_schedule(cb->wake_tasklet);
-
-	kfree(cb);
-}
-
-/* write_command_callback
- * USB completion handler for AT command transmission
- * called by the USB subsystem in interrupt context
- * parameter:
- *	urb	USB request block of completed request
- *		urb->context = controller state structure
- */
-static void write_command_callback(struct urb *urb)
-{
-	struct cardstate *cs = urb->context;
-	struct bas_cardstate *ucs = cs->hw.bas;
-	int status = urb->status;
-	unsigned long flags;
-
-	update_basstate(ucs, 0, BS_ATWRPEND);
-	wake_up(&ucs->waitqueue);
-
-	/* check status */
-	switch (status) {
-	case 0:					/* normal completion */
-		break;
-	case -ENOENT:			/* cancelled */
-	case -ECONNRESET:		/* cancelled (async) */
-	case -EINPROGRESS:		/* pending */
-	case -ENODEV:			/* device removed */
-	case -ESHUTDOWN:		/* device shut down */
-		/* ignore silently */
-		gig_dbg(DEBUG_USBREQ, "%s: %s",
-			__func__, get_usb_statmsg(status));
-		return;
-	default:				/* any failure */
-		if (++ucs->retry_cmd_out > BAS_RETRY) {
-			dev_warn(cs->dev,
-				 "command write: %s, "
-				 "giving up after %d retries\n",
-				 get_usb_statmsg(status),
-				 ucs->retry_cmd_out);
-			break;
-		}
-		if (ucs->basstate & BS_SUSPEND) {
-			dev_warn(cs->dev,
-				 "command write: %s, "
-				 "won't retry - suspend requested\n",
-				 get_usb_statmsg(status));
-			break;
-		}
-		if (cs->cmdbuf == NULL) {
-			dev_warn(cs->dev,
-				 "command write: %s, "
-				 "cannot retry - cmdbuf gone\n",
-				 get_usb_statmsg(status));
-			break;
-		}
-		dev_notice(cs->dev, "command write: %s, retry %d\n",
-			   get_usb_statmsg(status), ucs->retry_cmd_out);
-		if (atwrite_submit(cs, cs->cmdbuf->buf, cs->cmdbuf->len) >= 0)
-			/* resubmitted - bypass regular exit block */
-			return;
-		/* command send failed, assume base still waiting */
-		update_basstate(ucs, BS_ATREADY, 0);
-	}
-
-	spin_lock_irqsave(&cs->cmdlock, flags);
-	if (cs->cmdbuf != NULL)
-		complete_cb(cs);
-	spin_unlock_irqrestore(&cs->cmdlock, flags);
-}
-
-/* atrdy_timeout
- * timeout routine for AT command transmission
- * argument:
- *	controller state structure
- */
-static void atrdy_timeout(struct timer_list *t)
-{
-	struct bas_cardstate *ucs = from_timer(ucs, t, timer_atrdy);
-	struct cardstate *cs = ucs->cs;
-
-	dev_warn(cs->dev, "timeout waiting for HD_READY_SEND_ATDATA\n");
-
-	/* fake the missing signal - what else can I do? */
-	update_basstate(ucs, BS_ATREADY, BS_ATTIMER);
-	start_cbsend(cs);
-}
-
-/* atwrite_submit
- * submit an HD_WRITE_ATMESSAGE command URB
- * parameters:
- *	cs	controller state structure
- *	buf	buffer containing command to send
- *	len	length of command to send
- * return value:
- *	0 on success
- *	-EBUSY if another request is pending
- *	any URB submission error code
- */
-static int atwrite_submit(struct cardstate *cs, unsigned char *buf, int len)
-{
-	struct bas_cardstate *ucs = cs->hw.bas;
-	int rc;
-
-	gig_dbg(DEBUG_USBREQ, "-------> HD_WRITE_ATMESSAGE (%d)", len);
-
-	if (update_basstate(ucs, BS_ATWRPEND, 0) & BS_ATWRPEND) {
-		dev_err(cs->dev,
-			"could not submit HD_WRITE_ATMESSAGE: URB busy\n");
-		return -EBUSY;
-	}
-
-	ucs->dr_cmd_out.bRequestType = OUT_VENDOR_REQ;
-	ucs->dr_cmd_out.bRequest = HD_WRITE_ATMESSAGE;
-	ucs->dr_cmd_out.wValue = 0;
-	ucs->dr_cmd_out.wIndex = 0;
-	ucs->dr_cmd_out.wLength = cpu_to_le16(len);
-	usb_fill_control_urb(ucs->urb_cmd_out, ucs->udev,
-			     usb_sndctrlpipe(ucs->udev, 0),
-			     (unsigned char *) &ucs->dr_cmd_out, buf, len,
-			     write_command_callback, cs);
-	rc = usb_submit_urb(ucs->urb_cmd_out, GFP_ATOMIC);
-	if (unlikely(rc)) {
-		update_basstate(ucs, 0, BS_ATWRPEND);
-		dev_err(cs->dev, "could not submit HD_WRITE_ATMESSAGE: %s\n",
-			get_usb_rcmsg(rc));
-		return rc;
-	}
-
-	/* submitted successfully, start timeout if necessary */
-	if (!(update_basstate(ucs, BS_ATTIMER, BS_ATREADY) & BS_ATTIMER)) {
-		gig_dbg(DEBUG_OUTPUT, "setting ATREADY timeout of %d/10 secs",
-			ATRDY_TIMEOUT);
-		mod_timer(&ucs->timer_atrdy, jiffies + ATRDY_TIMEOUT * HZ / 10);
-	}
-	return 0;
-}
-
-/* start_cbsend
- * start transmission of AT command queue if necessary
- * parameter:
- *	cs		controller state structure
- * return value:
- *	0 on success
- *	error code < 0 on error
- */
-static int start_cbsend(struct cardstate *cs)
-{
-	struct cmdbuf_t *cb;
-	struct bas_cardstate *ucs = cs->hw.bas;
-	unsigned long flags;
-	int rc;
-	int retval = 0;
-
-	/* check if suspend requested */
-	if (ucs->basstate & BS_SUSPEND) {
-		gig_dbg(DEBUG_OUTPUT, "suspending");
-		return -EHOSTUNREACH;
-	}
-
-	/* check if AT channel is open */
-	if (!(ucs->basstate & BS_ATOPEN)) {
-		gig_dbg(DEBUG_OUTPUT, "AT channel not open");
-		rc = req_submit(cs->bcs, HD_OPEN_ATCHANNEL, 0, BAS_TIMEOUT);
-		if (rc < 0) {
-			/* flush command queue */
-			spin_lock_irqsave(&cs->cmdlock, flags);
-			while (cs->cmdbuf != NULL)
-				complete_cb(cs);
-			spin_unlock_irqrestore(&cs->cmdlock, flags);
-		}
-		return rc;
-	}
-
-	/* try to send first command in queue */
-	spin_lock_irqsave(&cs->cmdlock, flags);
-
-	while ((cb = cs->cmdbuf) != NULL && (ucs->basstate & BS_ATREADY)) {
-		ucs->retry_cmd_out = 0;
-		rc = atwrite_submit(cs, cb->buf, cb->len);
-		if (unlikely(rc)) {
-			retval = rc;
-			complete_cb(cs);
-		}
-	}
-
-	spin_unlock_irqrestore(&cs->cmdlock, flags);
-	return retval;
-}
-
-/* gigaset_write_cmd
- * This function is called by the device independent part of the driver
- * to transmit an AT command string to the Gigaset device.
- * It encapsulates the device specific method for transmission over the
- * direct USB connection to the base.
- * The command string is added to the queue of commands to send, and
- * USB transmission is started if necessary.
- * parameters:
- *	cs		controller state structure
- *	cb		command buffer structure
- * return value:
- *	number of bytes queued on success
- *	error code < 0 on error
- */
-static int gigaset_write_cmd(struct cardstate *cs, struct cmdbuf_t *cb)
-{
-	unsigned long flags;
-	int rc;
-
-	gigaset_dbg_buffer(cs->mstate != MS_LOCKED ?
-			   DEBUG_TRANSCMD : DEBUG_LOCKCMD,
-			   "CMD Transmit", cb->len, cb->buf);
-
-	/* translate "+++" escape sequence sent as a single separate command
-	 * into "close AT channel" command for error recovery
-	 * The next command will reopen the AT channel automatically.
-	 */
-	if (cb->len == 3 && !memcmp(cb->buf, "+++", 3)) {
-		/* If an HD_RECEIVEATDATA_ACK message remains unhandled
-		 * because of an error, the base never sends another one.
-		 * The response channel is thus effectively blocked.
-		 * Closing and reopening the AT channel does *not* clear
-		 * this condition.
-		 * As a stopgap measure, submit a zero-length AT read
-		 * before closing the AT channel. This has the undocumented
-		 * effect of triggering a new HD_RECEIVEATDATA_ACK message
-		 * from the base if necessary.
-		 * The subsequent AT channel close then discards any pending
-		 * messages.
-		 */
-		spin_lock_irqsave(&cs->lock, flags);
-		if (!(cs->hw.bas->basstate & BS_ATRDPEND)) {
-			kfree(cs->hw.bas->rcvbuf);
-			cs->hw.bas->rcvbuf = NULL;
-			cs->hw.bas->rcvbuf_size = 0;
-			cs->hw.bas->retry_cmd_in = 0;
-			atread_submit(cs, 0);
-		}
-		spin_unlock_irqrestore(&cs->lock, flags);
-
-		rc = req_submit(cs->bcs, HD_CLOSE_ATCHANNEL, 0, BAS_TIMEOUT);
-		if (cb->wake_tasklet)
-			tasklet_schedule(cb->wake_tasklet);
-		if (!rc)
-			rc = cb->len;
-		kfree(cb);
-		return rc;
-	}
-
-	spin_lock_irqsave(&cs->cmdlock, flags);
-	cb->prev = cs->lastcmdbuf;
-	if (cs->lastcmdbuf)
-		cs->lastcmdbuf->next = cb;
-	else {
-		cs->cmdbuf = cb;
-		cs->curlen = cb->len;
-	}
-	cs->cmdbytes += cb->len;
-	cs->lastcmdbuf = cb;
-	spin_unlock_irqrestore(&cs->cmdlock, flags);
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (unlikely(!cs->connected)) {
-		spin_unlock_irqrestore(&cs->lock, flags);
-		gig_dbg(DEBUG_USBREQ, "%s: not connected", __func__);
-		/* flush command queue */
-		spin_lock_irqsave(&cs->cmdlock, flags);
-		while (cs->cmdbuf != NULL)
-			complete_cb(cs);
-		spin_unlock_irqrestore(&cs->cmdlock, flags);
-		return -ENODEV;
-	}
-	rc = start_cbsend(cs);
-	spin_unlock_irqrestore(&cs->lock, flags);
-	return rc < 0 ? rc : cb->len;
-}
-
-/* gigaset_write_room
- * tty_driver.write_room interface routine
- * return number of characters the driver will accept to be written via
- * gigaset_write_cmd
- * parameter:
- *	controller state structure
- * return value:
- *	number of characters
- */
-static int gigaset_write_room(struct cardstate *cs)
-{
-	return IF_WRITEBUF;
-}
-
-/* gigaset_chars_in_buffer
- * tty_driver.chars_in_buffer interface routine
- * return number of characters waiting to be sent
- * parameter:
- *	controller state structure
- * return value:
- *	number of characters
- */
-static int gigaset_chars_in_buffer(struct cardstate *cs)
-{
-	return cs->cmdbytes;
-}
-
-/* gigaset_brkchars
- * implementation of ioctl(GIGASET_BRKCHARS)
- * parameter:
- *	controller state structure
- * return value:
- *	-EINVAL (unimplemented function)
- */
-static int gigaset_brkchars(struct cardstate *cs, const unsigned char buf[6])
-{
-	return -EINVAL;
-}
-
-
-/* Device Initialization/Shutdown */
-/* ============================== */
-
-/* Free hardware dependent part of the B channel structure
- * parameter:
- *	bcs	B channel structure
- */
-static void gigaset_freebcshw(struct bc_state *bcs)
-{
-	struct bas_bc_state *ubc = bcs->hw.bas;
-	int i;
-
-	if (!ubc)
-		return;
-
-	/* kill URBs and tasklets before freeing - better safe than sorry */
-	ubc->running = 0;
-	gig_dbg(DEBUG_INIT, "%s: killing isoc URBs", __func__);
-	for (i = 0; i < BAS_OUTURBS; ++i) {
-		usb_kill_urb(ubc->isoouturbs[i].urb);
-		usb_free_urb(ubc->isoouturbs[i].urb);
-	}
-	for (i = 0; i < BAS_INURBS; ++i) {
-		usb_kill_urb(ubc->isoinurbs[i]);
-		usb_free_urb(ubc->isoinurbs[i]);
-	}
-	tasklet_kill(&ubc->sent_tasklet);
-	tasklet_kill(&ubc->rcvd_tasklet);
-	kfree(ubc->isooutbuf);
-	kfree(ubc);
-	bcs->hw.bas = NULL;
-}
-
-/* Initialize hardware dependent part of the B channel structure
- * parameter:
- *	bcs	B channel structure
- * return value:
- *	0 on success, error code < 0 on failure
- */
-static int gigaset_initbcshw(struct bc_state *bcs)
-{
-	int i;
-	struct bas_bc_state *ubc;
-
-	bcs->hw.bas = ubc = kmalloc(sizeof(struct bas_bc_state), GFP_KERNEL);
-	if (!ubc) {
-		pr_err("out of memory\n");
-		return -ENOMEM;
-	}
-
-	ubc->running = 0;
-	atomic_set(&ubc->corrbytes, 0);
-	spin_lock_init(&ubc->isooutlock);
-	for (i = 0; i < BAS_OUTURBS; ++i) {
-		ubc->isoouturbs[i].urb = NULL;
-		ubc->isoouturbs[i].bcs = bcs;
-	}
-	ubc->isooutdone = ubc->isooutfree = ubc->isooutovfl = NULL;
-	ubc->numsub = 0;
-	ubc->isooutbuf = kmalloc(sizeof(struct isowbuf_t), GFP_KERNEL);
-	if (!ubc->isooutbuf) {
-		pr_err("out of memory\n");
-		kfree(ubc);
-		bcs->hw.bas = NULL;
-		return -ENOMEM;
-	}
-	tasklet_init(&ubc->sent_tasklet,
-		     write_iso_tasklet, (unsigned long) bcs);
-
-	spin_lock_init(&ubc->isoinlock);
-	for (i = 0; i < BAS_INURBS; ++i)
-		ubc->isoinurbs[i] = NULL;
-	ubc->isoindone = NULL;
-	ubc->loststatus = -EINPROGRESS;
-	ubc->isoinlost = 0;
-	ubc->seqlen = 0;
-	ubc->inbyte = 0;
-	ubc->inbits = 0;
-	ubc->goodbytes = 0;
-	ubc->alignerrs = 0;
-	ubc->fcserrs = 0;
-	ubc->frameerrs = 0;
-	ubc->giants = 0;
-	ubc->runts = 0;
-	ubc->aborts = 0;
-	ubc->shared0s = 0;
-	ubc->stolen0s = 0;
-	tasklet_init(&ubc->rcvd_tasklet,
-		     read_iso_tasklet, (unsigned long) bcs);
-	return 0;
-}
-
-static void gigaset_reinitbcshw(struct bc_state *bcs)
-{
-	struct bas_bc_state *ubc = bcs->hw.bas;
-
-	bcs->hw.bas->running = 0;
-	atomic_set(&bcs->hw.bas->corrbytes, 0);
-	bcs->hw.bas->numsub = 0;
-	spin_lock_init(&ubc->isooutlock);
-	spin_lock_init(&ubc->isoinlock);
-	ubc->loststatus = -EINPROGRESS;
-}
-
-static void gigaset_freecshw(struct cardstate *cs)
-{
-	/* timers, URBs and rcvbuf are disposed of in disconnect */
-	kfree(cs->hw.bas->int_in_buf);
-	kfree(cs->hw.bas);
-	cs->hw.bas = NULL;
-}
-
-/* Initialize hardware dependent part of the cardstate structure
- * parameter:
- *	cs	cardstate structure
- * return value:
- *	0 on success, error code < 0 on failure
- */
-static int gigaset_initcshw(struct cardstate *cs)
-{
-	struct bas_cardstate *ucs;
-
-	cs->hw.bas = ucs = kzalloc(sizeof(*ucs), GFP_KERNEL);
-	if (!ucs) {
-		pr_err("out of memory\n");
-		return -ENOMEM;
-	}
-	ucs->int_in_buf = kmalloc(IP_MSGSIZE, GFP_KERNEL);
-	if (!ucs->int_in_buf) {
-		kfree(ucs);
-		pr_err("out of memory\n");
-		return -ENOMEM;
-	}
-
-	spin_lock_init(&ucs->lock);
-	ucs->cs = cs;
-	timer_setup(&ucs->timer_ctrl, req_timeout, 0);
-	timer_setup(&ucs->timer_atrdy, atrdy_timeout, 0);
-	timer_setup(&ucs->timer_cmd_in, cmd_in_timeout, 0);
-	timer_setup(&ucs->timer_int_in, int_in_resubmit, 0);
-	init_waitqueue_head(&ucs->waitqueue);
-	INIT_WORK(&ucs->int_in_wq, int_in_work);
-
-	return 0;
-}
-
-/* freeurbs
- * unlink and deallocate all URBs unconditionally
- * caller must make sure that no commands are still in progress
- * parameter:
- *	cs	controller state structure
- */
-static void freeurbs(struct cardstate *cs)
-{
-	struct bas_cardstate *ucs = cs->hw.bas;
-	struct bas_bc_state *ubc;
-	int i, j;
-
-	gig_dbg(DEBUG_INIT, "%s: killing URBs", __func__);
-	for (j = 0; j < BAS_CHANNELS; ++j) {
-		ubc = cs->bcs[j].hw.bas;
-		for (i = 0; i < BAS_OUTURBS; ++i) {
-			usb_kill_urb(ubc->isoouturbs[i].urb);
-			usb_free_urb(ubc->isoouturbs[i].urb);
-			ubc->isoouturbs[i].urb = NULL;
-		}
-		for (i = 0; i < BAS_INURBS; ++i) {
-			usb_kill_urb(ubc->isoinurbs[i]);
-			usb_free_urb(ubc->isoinurbs[i]);
-			ubc->isoinurbs[i] = NULL;
-		}
-	}
-	usb_kill_urb(ucs->urb_int_in);
-	usb_free_urb(ucs->urb_int_in);
-	ucs->urb_int_in = NULL;
-	usb_kill_urb(ucs->urb_cmd_out);
-	usb_free_urb(ucs->urb_cmd_out);
-	ucs->urb_cmd_out = NULL;
-	usb_kill_urb(ucs->urb_cmd_in);
-	usb_free_urb(ucs->urb_cmd_in);
-	ucs->urb_cmd_in = NULL;
-	usb_kill_urb(ucs->urb_ctrl);
-	usb_free_urb(ucs->urb_ctrl);
-	ucs->urb_ctrl = NULL;
-}
-
-/* gigaset_probe
- * This function is called when a new USB device is connected.
- * It checks whether the new device is handled by this driver.
- */
-static int gigaset_probe(struct usb_interface *interface,
-			 const struct usb_device_id *id)
-{
-	struct usb_host_interface *hostif;
-	struct usb_device *udev = interface_to_usbdev(interface);
-	struct cardstate *cs = NULL;
-	struct bas_cardstate *ucs = NULL;
-	struct bas_bc_state *ubc;
-	struct usb_endpoint_descriptor *endpoint;
-	int i, j;
-	int rc;
-
-	gig_dbg(DEBUG_INIT,
-		"%s: Check if device matches .. (Vendor: 0x%x, Product: 0x%x)",
-		__func__, le16_to_cpu(udev->descriptor.idVendor),
-		le16_to_cpu(udev->descriptor.idProduct));
-
-	/* set required alternate setting */
-	hostif = interface->cur_altsetting;
-	if (hostif->desc.bAlternateSetting != 3) {
-		gig_dbg(DEBUG_INIT,
-			"%s: wrong alternate setting %d - trying to switch",
-			__func__, hostif->desc.bAlternateSetting);
-		if (usb_set_interface(udev, hostif->desc.bInterfaceNumber, 3)
-		    < 0) {
-			dev_warn(&udev->dev, "usb_set_interface failed, "
-				 "device %d interface %d altsetting %d\n",
-				 udev->devnum, hostif->desc.bInterfaceNumber,
-				 hostif->desc.bAlternateSetting);
-			return -ENODEV;
-		}
-		hostif = interface->cur_altsetting;
-	}
-
-	/* Reject interfaces that are not vendor specific (class 255)
-	 */
-	if (hostif->desc.bInterfaceClass != 255) {
-		dev_warn(&udev->dev, "%s: bInterfaceClass == %d\n",
-			 __func__, hostif->desc.bInterfaceClass);
-		return -ENODEV;
-	}
-
-	if (hostif->desc.bNumEndpoints < 1)
-		return -ENODEV;
-
-	dev_info(&udev->dev,
-		 "%s: Device matched (Vendor: 0x%x, Product: 0x%x)\n",
-		 __func__, le16_to_cpu(udev->descriptor.idVendor),
-		 le16_to_cpu(udev->descriptor.idProduct));
-
-	/* allocate memory for our device state and initialize it */
-	cs = gigaset_initcs(driver, BAS_CHANNELS, 0, 0, cidmode,
-			    GIGASET_MODULENAME);
-	if (!cs)
-		return -ENODEV;
-	ucs = cs->hw.bas;
-
-	/* save off device structure ptrs for later use */
-	usb_get_dev(udev);
-	ucs->udev = udev;
-	ucs->interface = interface;
-	cs->dev = &interface->dev;
-
-	/* allocate URBs:
-	 * - one for the interrupt pipe
-	 * - three for the different uses of the default control pipe
-	 * - three for each isochronous pipe
-	 */
-	if (!(ucs->urb_int_in = usb_alloc_urb(0, GFP_KERNEL)) ||
-	    !(ucs->urb_cmd_in = usb_alloc_urb(0, GFP_KERNEL)) ||
-	    !(ucs->urb_cmd_out = usb_alloc_urb(0, GFP_KERNEL)) ||
-	    !(ucs->urb_ctrl = usb_alloc_urb(0, GFP_KERNEL)))
-		goto allocerr;
-
-	for (j = 0; j < BAS_CHANNELS; ++j) {
-		ubc = cs->bcs[j].hw.bas;
-		for (i = 0; i < BAS_OUTURBS; ++i)
-			if (!(ubc->isoouturbs[i].urb =
-			      usb_alloc_urb(BAS_NUMFRAMES, GFP_KERNEL)))
-				goto allocerr;
-		for (i = 0; i < BAS_INURBS; ++i)
-			if (!(ubc->isoinurbs[i] =
-			      usb_alloc_urb(BAS_NUMFRAMES, GFP_KERNEL)))
-				goto allocerr;
-	}
-
-	ucs->rcvbuf = NULL;
-	ucs->rcvbuf_size = 0;
-
-	/* Fill the interrupt urb and send it to the core */
-	endpoint = &hostif->endpoint[0].desc;
-	usb_fill_int_urb(ucs->urb_int_in, udev,
-			 usb_rcvintpipe(udev,
-					usb_endpoint_num(endpoint)),
-			 ucs->int_in_buf, IP_MSGSIZE, read_int_callback, cs,
-			 endpoint->bInterval);
-	rc = usb_submit_urb(ucs->urb_int_in, GFP_KERNEL);
-	if (rc != 0) {
-		dev_err(cs->dev, "could not submit interrupt URB: %s\n",
-			get_usb_rcmsg(rc));
-		goto error;
-	}
-	ucs->retry_int_in = 0;
-
-	/* tell the device that the driver is ready */
-	rc = req_submit(cs->bcs, HD_DEVICE_INIT_ACK, 0, 0);
-	if (rc != 0)
-		goto error;
-
-	/* tell common part that the device is ready */
-	if (startmode == SM_LOCKED)
-		cs->mstate = MS_LOCKED;
-
-	/* save address of controller structure */
-	usb_set_intfdata(interface, cs);
-
-	rc = gigaset_start(cs);
-	if (rc < 0)
-		goto error;
-
-	return 0;
-
-allocerr:
-	dev_err(cs->dev, "could not allocate URBs\n");
-	rc = -ENOMEM;
-error:
-	freeurbs(cs);
-	usb_set_intfdata(interface, NULL);
-	usb_put_dev(udev);
-	gigaset_freecs(cs);
-	return rc;
-}
-
-/* gigaset_disconnect
- * This function is called when the Gigaset base is unplugged.
- */
-static void gigaset_disconnect(struct usb_interface *interface)
-{
-	struct cardstate *cs;
-	struct bas_cardstate *ucs;
-	int j;
-
-	cs = usb_get_intfdata(interface);
-
-	ucs = cs->hw.bas;
-
-	dev_info(cs->dev, "disconnecting Gigaset base\n");
-
-	/* mark base as not ready, all channels disconnected */
-	ucs->basstate = 0;
-
-	/* tell LL all channels are down */
-	for (j = 0; j < BAS_CHANNELS; ++j)
-		gigaset_bchannel_down(cs->bcs + j);
-
-	/* stop driver (common part) */
-	gigaset_stop(cs);
-
-	/* stop delayed work and URBs, free resources */
-	del_timer_sync(&ucs->timer_ctrl);
-	del_timer_sync(&ucs->timer_atrdy);
-	del_timer_sync(&ucs->timer_cmd_in);
-	del_timer_sync(&ucs->timer_int_in);
-	cancel_work_sync(&ucs->int_in_wq);
-	freeurbs(cs);
-	usb_set_intfdata(interface, NULL);
-	kfree(ucs->rcvbuf);
-	ucs->rcvbuf = NULL;
-	ucs->rcvbuf_size = 0;
-	usb_put_dev(ucs->udev);
-	ucs->interface = NULL;
-	ucs->udev = NULL;
-	cs->dev = NULL;
-	gigaset_freecs(cs);
-}
-
-/* gigaset_suspend
- * This function is called before the USB connection is suspended
- * or before the USB device is reset.
- * In the latter case, message == PMSG_ON.
- */
-static int gigaset_suspend(struct usb_interface *intf, pm_message_t message)
-{
-	struct cardstate *cs = usb_get_intfdata(intf);
-	struct bas_cardstate *ucs = cs->hw.bas;
-	int rc;
-
-	/* set suspend flag; this stops AT command/response traffic */
-	if (update_basstate(ucs, BS_SUSPEND, 0) & BS_SUSPEND) {
-		gig_dbg(DEBUG_SUSPEND, "already suspended");
-		return 0;
-	}
-
-	/* wait a bit for blocking conditions to go away */
-	rc = wait_event_timeout(ucs->waitqueue,
-				!(ucs->basstate &
-				  (BS_B1OPEN | BS_B2OPEN | BS_ATRDPEND | BS_ATWRPEND)),
-				BAS_TIMEOUT * HZ / 10);
-	gig_dbg(DEBUG_SUSPEND, "wait_event_timeout() -> %d", rc);
-
-	/* check for conditions preventing suspend */
-	if (ucs->basstate & (BS_B1OPEN | BS_B2OPEN | BS_ATRDPEND | BS_ATWRPEND)) {
-		dev_warn(cs->dev, "cannot suspend:\n");
-		if (ucs->basstate & BS_B1OPEN)
-			dev_warn(cs->dev, " B channel 1 open\n");
-		if (ucs->basstate & BS_B2OPEN)
-			dev_warn(cs->dev, " B channel 2 open\n");
-		if (ucs->basstate & BS_ATRDPEND)
-			dev_warn(cs->dev, " receiving AT reply\n");
-		if (ucs->basstate & BS_ATWRPEND)
-			dev_warn(cs->dev, " sending AT command\n");
-		update_basstate(ucs, 0, BS_SUSPEND);
-		return -EBUSY;
-	}
-
-	/* close AT channel if open */
-	if (ucs->basstate & BS_ATOPEN) {
-		gig_dbg(DEBUG_SUSPEND, "closing AT channel");
-		rc = req_submit(cs->bcs, HD_CLOSE_ATCHANNEL, 0, 0);
-		if (rc) {
-			update_basstate(ucs, 0, BS_SUSPEND);
-			return rc;
-		}
-		wait_event_timeout(ucs->waitqueue, !ucs->pending,
-				   BAS_TIMEOUT * HZ / 10);
-		/* in case of timeout, proceed anyway */
-	}
-
-	/* kill all URBs and delayed work that might still be pending */
-	usb_kill_urb(ucs->urb_ctrl);
-	usb_kill_urb(ucs->urb_int_in);
-	del_timer_sync(&ucs->timer_ctrl);
-	del_timer_sync(&ucs->timer_atrdy);
-	del_timer_sync(&ucs->timer_cmd_in);
-	del_timer_sync(&ucs->timer_int_in);
-
-	/* don't try to cancel int_in_wq from within reset as it
-	 * might be the one requesting the reset
-	 */
-	if (message.event != PM_EVENT_ON)
-		cancel_work_sync(&ucs->int_in_wq);
-
-	gig_dbg(DEBUG_SUSPEND, "suspend complete");
-	return 0;
-}
-
-/* gigaset_resume
- * This function is called after the USB connection has been resumed.
- */
-static int gigaset_resume(struct usb_interface *intf)
-{
-	struct cardstate *cs = usb_get_intfdata(intf);
-	struct bas_cardstate *ucs = cs->hw.bas;
-	int rc;
-
-	/* resubmit interrupt URB for spontaneous messages from base */
-	rc = usb_submit_urb(ucs->urb_int_in, GFP_KERNEL);
-	if (rc) {
-		dev_err(cs->dev, "could not resubmit interrupt URB: %s\n",
-			get_usb_rcmsg(rc));
-		return rc;
-	}
-	ucs->retry_int_in = 0;
-
-	/* clear suspend flag to reallow activity */
-	update_basstate(ucs, 0, BS_SUSPEND);
-
-	gig_dbg(DEBUG_SUSPEND, "resume complete");
-	return 0;
-}
-
-/* gigaset_pre_reset
- * This function is called before the USB connection is reset.
- */
-static int gigaset_pre_reset(struct usb_interface *intf)
-{
-	/* handle just like suspend */
-	return gigaset_suspend(intf, PMSG_ON);
-}
-
-/* gigaset_post_reset
- * This function is called after the USB connection has been reset.
- */
-static int gigaset_post_reset(struct usb_interface *intf)
-{
-	/* FIXME: send HD_DEVICE_INIT_ACK? */
-
-	/* resume operations */
-	return gigaset_resume(intf);
-}
-
-
-static const struct gigaset_ops gigops = {
-	.write_cmd = gigaset_write_cmd,
-	.write_room = gigaset_write_room,
-	.chars_in_buffer = gigaset_chars_in_buffer,
-	.brkchars = gigaset_brkchars,
-	.init_bchannel = gigaset_init_bchannel,
-	.close_bchannel = gigaset_close_bchannel,
-	.initbcshw = gigaset_initbcshw,
-	.freebcshw = gigaset_freebcshw,
-	.reinitbcshw = gigaset_reinitbcshw,
-	.initcshw = gigaset_initcshw,
-	.freecshw = gigaset_freecshw,
-	.set_modem_ctrl = gigaset_set_modem_ctrl,
-	.baud_rate = gigaset_baud_rate,
-	.set_line_ctrl = gigaset_set_line_ctrl,
-	.send_skb = gigaset_isoc_send_skb,
-	.handle_input = gigaset_isoc_input,
-};
-
-/* bas_gigaset_init
- * This function is called after the kernel module is loaded.
- */
-static int __init bas_gigaset_init(void)
-{
-	int result;
-
-	/* allocate memory for our driver state and initialize it */
-	driver = gigaset_initdriver(GIGASET_MINOR, GIGASET_MINORS,
-				    GIGASET_MODULENAME, GIGASET_DEVNAME,
-				    &gigops, THIS_MODULE);
-	if (driver == NULL)
-		goto error;
-
-	/* register this driver with the USB subsystem */
-	result = usb_register(&gigaset_usb_driver);
-	if (result < 0) {
-		pr_err("error %d registering USB driver\n", -result);
-		goto error;
-	}
-
-	pr_info(DRIVER_DESC "\n");
-	return 0;
-
-error:
-	if (driver)
-		gigaset_freedriver(driver);
-	driver = NULL;
-	return -1;
-}
-
-/* bas_gigaset_exit
- * This function is called before the kernel module is unloaded.
- */
-static void __exit bas_gigaset_exit(void)
-{
-	struct bas_cardstate *ucs;
-	int i;
-
-	gigaset_blockdriver(driver); /* => probe will fail
-				      * => no gigaset_start any more
-				      */
-
-	/* stop all connected devices */
-	for (i = 0; i < driver->minors; i++) {
-		if (gigaset_shutdown(driver->cs + i) < 0)
-			continue;		/* no device */
-		/* from now on, no isdn callback should be possible */
-
-		/* close all still open channels */
-		ucs = driver->cs[i].hw.bas;
-		if (ucs->basstate & BS_B1OPEN) {
-			gig_dbg(DEBUG_INIT, "closing B1 channel");
-			usb_control_msg(ucs->udev,
-					usb_sndctrlpipe(ucs->udev, 0),
-					HD_CLOSE_B1CHANNEL, OUT_VENDOR_REQ,
-					0, 0, NULL, 0, BAS_TIMEOUT);
-		}
-		if (ucs->basstate & BS_B2OPEN) {
-			gig_dbg(DEBUG_INIT, "closing B2 channel");
-			usb_control_msg(ucs->udev,
-					usb_sndctrlpipe(ucs->udev, 0),
-					HD_CLOSE_B2CHANNEL, OUT_VENDOR_REQ,
-					0, 0, NULL, 0, BAS_TIMEOUT);
-		}
-		if (ucs->basstate & BS_ATOPEN) {
-			gig_dbg(DEBUG_INIT, "closing AT channel");
-			usb_control_msg(ucs->udev,
-					usb_sndctrlpipe(ucs->udev, 0),
-					HD_CLOSE_ATCHANNEL, OUT_VENDOR_REQ,
-					0, 0, NULL, 0, BAS_TIMEOUT);
-		}
-		ucs->basstate = 0;
-	}
-
-	/* deregister this driver with the USB subsystem */
-	usb_deregister(&gigaset_usb_driver);
-	/* this will call the disconnect-callback */
-	/* from now on, no disconnect/probe callback should be running */
-
-	gigaset_freedriver(driver);
-	driver = NULL;
-}
-
-
-module_init(bas_gigaset_init);
-module_exit(bas_gigaset_exit);
-
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_LICENSE("GPL");
diff --git a/drivers/staging/isdn/gigaset/capi.c b/drivers/staging/isdn/gigaset/capi.c
deleted file mode 100644
index 83d7dd48c61d..000000000000
--- a/drivers/staging/isdn/gigaset/capi.c
+++ /dev/null
@@ -1,2517 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Kernel CAPI interface for the Gigaset driver
- *
- * Copyright (c) 2009 by Tilman Schmidt <tilman@imap.cc>.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/proc_fs.h>
-#include <linux/seq_file.h>
-#include <linux/ratelimit.h>
-#include <linux/isdn/capilli.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/export.h>
-
-/* missing from kernelcapi.h */
-#define CapiNcpiNotSupportedByProtocol	0x0001
-#define CapiFlagsNotSupportedByProtocol	0x0002
-#define CapiAlertAlreadySent		0x0003
-#define CapiFacilitySpecificFunctionNotSupported	0x3011
-
-/* missing from capicmd.h */
-#define CAPI_CONNECT_IND_BASELEN	(CAPI_MSG_BASELEN + 4 + 2 + 8 * 1)
-#define CAPI_CONNECT_ACTIVE_IND_BASELEN	(CAPI_MSG_BASELEN + 4 + 3 * 1)
-#define CAPI_CONNECT_B3_IND_BASELEN	(CAPI_MSG_BASELEN + 4 + 1)
-#define CAPI_CONNECT_B3_ACTIVE_IND_BASELEN	(CAPI_MSG_BASELEN + 4 + 1)
-#define CAPI_DATA_B3_REQ_LEN64		(CAPI_MSG_BASELEN + 4 + 4 + 2 + 2 + 2 + 8)
-#define CAPI_DATA_B3_CONF_LEN		(CAPI_MSG_BASELEN + 4 + 2 + 2)
-#define CAPI_DISCONNECT_IND_LEN		(CAPI_MSG_BASELEN + 4 + 2)
-#define CAPI_DISCONNECT_B3_IND_BASELEN	(CAPI_MSG_BASELEN + 4 + 2 + 1)
-#define CAPI_FACILITY_CONF_BASELEN	(CAPI_MSG_BASELEN + 4 + 2 + 2 + 1)
-/* most _CONF messages contain only Controller/PLCI/NCCI and Info parameters */
-#define CAPI_STDCONF_LEN		(CAPI_MSG_BASELEN + 4 + 2)
-
-#define CAPI_FACILITY_HANDSET	0x0000
-#define CAPI_FACILITY_DTMF	0x0001
-#define CAPI_FACILITY_V42BIS	0x0002
-#define CAPI_FACILITY_SUPPSVC	0x0003
-#define CAPI_FACILITY_WAKEUP	0x0004
-#define CAPI_FACILITY_LI	0x0005
-
-#define CAPI_SUPPSVC_GETSUPPORTED	0x0000
-#define CAPI_SUPPSVC_LISTEN		0x0001
-
-/* missing from capiutil.h */
-#define CAPIMSG_PLCI_PART(m)	CAPIMSG_U8(m, 9)
-#define CAPIMSG_NCCI_PART(m)	CAPIMSG_U16(m, 10)
-#define CAPIMSG_HANDLE_REQ(m)	CAPIMSG_U16(m, 18) /* DATA_B3_REQ/_IND only! */
-#define CAPIMSG_FLAGS(m)	CAPIMSG_U16(m, 20)
-#define CAPIMSG_SETCONTROLLER(m, contr)	capimsg_setu8(m, 8, contr)
-#define CAPIMSG_SETPLCI_PART(m, plci)	capimsg_setu8(m, 9, plci)
-#define CAPIMSG_SETNCCI_PART(m, ncci)	capimsg_setu16(m, 10, ncci)
-#define CAPIMSG_SETFLAGS(m, flags)	capimsg_setu16(m, 20, flags)
-
-/* parameters with differing location in DATA_B3_CONF/_RESP: */
-#define CAPIMSG_SETHANDLE_CONF(m, handle)	capimsg_setu16(m, 12, handle)
-#define	CAPIMSG_SETINFO_CONF(m, info)		capimsg_setu16(m, 14, info)
-
-/* Flags (DATA_B3_REQ/_IND) */
-#define CAPI_FLAGS_DELIVERY_CONFIRMATION	0x04
-#define CAPI_FLAGS_RESERVED			(~0x1f)
-
-/* buffer sizes */
-#define MAX_BC_OCTETS 11
-#define MAX_HLC_OCTETS 3
-#define MAX_NUMBER_DIGITS 20
-#define MAX_FMT_IE_LEN 20
-
-/* values for bcs->apconnstate */
-#define APCONN_NONE	0	/* inactive/listening */
-#define APCONN_SETUP	1	/* connecting */
-#define APCONN_ACTIVE	2	/* B channel up */
-
-/* registered application data structure */
-struct gigaset_capi_appl {
-	struct list_head ctrlist;
-	struct gigaset_capi_appl *bcnext;
-	u16 id;
-	struct capi_register_params rp;
-	u16 nextMessageNumber;
-	u32 listenInfoMask;
-	u32 listenCIPmask;
-};
-
-/* CAPI specific controller data structure */
-struct gigaset_capi_ctr {
-	struct capi_ctr ctr;
-	struct list_head appls;
-	struct sk_buff_head sendqueue;
-	atomic_t sendqlen;
-	/* two _cmsg structures possibly used concurrently: */
-	_cmsg hcmsg;	/* for message composition triggered from hardware */
-	_cmsg acmsg;	/* for dissection of messages sent from application */
-	u8 bc_buf[MAX_BC_OCTETS + 1];
-	u8 hlc_buf[MAX_HLC_OCTETS + 1];
-	u8 cgpty_buf[MAX_NUMBER_DIGITS + 3];
-	u8 cdpty_buf[MAX_NUMBER_DIGITS + 2];
-};
-
-/* CIP Value table (from CAPI 2.0 standard, ch. 6.1) */
-static struct {
-	u8 *bc;
-	u8 *hlc;
-} cip2bchlc[] = {
-	[1] = { "8090A3", NULL },	/* Speech (A-law) */
-	[2] = { "8890", NULL },		/* Unrestricted digital information */
-	[3] = { "8990", NULL },		/* Restricted digital information */
-	[4] = { "9090A3", NULL },	/* 3,1 kHz audio (A-law) */
-	[5] = { "9190", NULL },		/* 7 kHz audio */
-	[6] = { "9890", NULL },		/* Video */
-	[7] = { "88C0C6E6", NULL },	/* Packet mode */
-	[8] = { "8890218F", NULL },	/* 56 kbit/s rate adaptation */
-	[9] = { "9190A5", NULL },	/* Unrestricted digital information
-					 * with tones/announcements */
-	[16] = { "8090A3", "9181" },	/* Telephony */
-	[17] = { "9090A3", "9184" },	/* Group 2/3 facsimile */
-	[18] = { "8890", "91A1" },	/* Group 4 facsimile Class 1 */
-	[19] = { "8890", "91A4" },	/* Teletex service basic and mixed mode
-					 * and Group 4 facsimile service
-					 * Classes II and III */
-	[20] = { "8890", "91A8" },	/* Teletex service basic and
-					 * processable mode */
-	[21] = { "8890", "91B1" },	/* Teletex service basic mode */
-	[22] = { "8890", "91B2" },	/* International interworking for
-					 * Videotex */
-	[23] = { "8890", "91B5" },	/* Telex */
-	[24] = { "8890", "91B8" },	/* Message Handling Systems
-					 * in accordance with X.400 */
-	[25] = { "8890", "91C1" },	/* OSI application
-					 * in accordance with X.200 */
-	[26] = { "9190A5", "9181" },	/* 7 kHz telephony */
-	[27] = { "9190A5", "916001" },	/* Video telephony, first connection */
-	[28] = { "8890", "916002" },	/* Video telephony, second connection */
-};
-
-/*
- * helper functions
- * ================
- */
-
-/*
- * emit unsupported parameter warning
- */
-static inline void ignore_cstruct_param(struct cardstate *cs, _cstruct param,
-					char *msgname, char *paramname)
-{
-	if (param && *param)
-		dev_warn(cs->dev, "%s: ignoring unsupported parameter: %s\n",
-			 msgname, paramname);
-}
-
-/*
- * convert an IE from Gigaset hex string to ETSI binary representation
- * including length byte
- * return value: result length, -1 on error
- */
-static int encode_ie(char *in, u8 *out, int maxlen)
-{
-	int l = 0;
-	while (*in) {
-		if (!isxdigit(in[0]) || !isxdigit(in[1]) || l >= maxlen)
-			return -1;
-		out[++l] = (hex_to_bin(in[0]) << 4) + hex_to_bin(in[1]);
-		in += 2;
-	}
-	out[0] = l;
-	return l;
-}
-
-/*
- * convert an IE from ETSI binary representation including length byte
- * to Gigaset hex string
- */
-static void decode_ie(u8 *in, char *out)
-{
-	int i = *in;
-	while (i-- > 0) {
-		/* ToDo: conversion to upper case necessary? */
-		*out++ = toupper(hex_asc_hi(*++in));
-		*out++ = toupper(hex_asc_lo(*in));
-	}
-}
-
-/*
- * retrieve application data structure for an application ID
- */
-static inline struct gigaset_capi_appl *
-get_appl(struct gigaset_capi_ctr *iif, u16 appl)
-{
-	struct gigaset_capi_appl *ap;
-
-	list_for_each_entry(ap, &iif->appls, ctrlist)
-		if (ap->id == appl)
-			return ap;
-	return NULL;
-}
-
-/*
- * dump CAPI message to kernel messages for debugging
- */
-static inline void dump_cmsg(enum debuglevel level, const char *tag, _cmsg *p)
-{
-#ifdef CONFIG_GIGASET_DEBUG
-	/* dump at most 20 messages in 20 secs */
-	static DEFINE_RATELIMIT_STATE(msg_dump_ratelimit, 20 * HZ, 20);
-	_cdebbuf *cdb;
-
-	if (!(gigaset_debuglevel & level))
-		return;
-	if (!___ratelimit(&msg_dump_ratelimit, tag))
-		return;
-
-	cdb = capi_cmsg2str(p);
-	if (cdb) {
-		gig_dbg(level, "%s: [%d] %s", tag, p->ApplId, cdb->buf);
-		cdebbuf_free(cdb);
-	} else {
-		gig_dbg(level, "%s: [%d] %s", tag, p->ApplId,
-			capi_cmd2str(p->Command, p->Subcommand));
-	}
-#endif
-}
-
-static inline void dump_rawmsg(enum debuglevel level, const char *tag,
-			       unsigned char *data)
-{
-#ifdef CONFIG_GIGASET_DEBUG
-	char *dbgline;
-	int i, l;
-
-	if (!(gigaset_debuglevel & level))
-		return;
-
-	l = CAPIMSG_LEN(data);
-	if (l < 12) {
-		gig_dbg(level, "%s: ??? LEN=%04d", tag, l);
-		return;
-	}
-	gig_dbg(level, "%s: 0x%02x:0x%02x: ID=%03d #0x%04x LEN=%04d NCCI=0x%x",
-		tag, CAPIMSG_COMMAND(data), CAPIMSG_SUBCOMMAND(data),
-		CAPIMSG_APPID(data), CAPIMSG_MSGID(data), l,
-		CAPIMSG_CONTROL(data));
-	l -= 12;
-	if (l <= 0)
-		return;
-	if (l > 64)
-		l = 64; /* arbitrary limit */
-	dbgline = kmalloc_array(3, l, GFP_ATOMIC);
-	if (!dbgline)
-		return;
-	for (i = 0; i < l; i++) {
-		dbgline[3 * i] = hex_asc_hi(data[12 + i]);
-		dbgline[3 * i + 1] = hex_asc_lo(data[12 + i]);
-		dbgline[3 * i + 2] = ' ';
-	}
-	dbgline[3 * l - 1] = '\0';
-	gig_dbg(level, "  %s", dbgline);
-	kfree(dbgline);
-	if (CAPIMSG_COMMAND(data) == CAPI_DATA_B3 &&
-	    (CAPIMSG_SUBCOMMAND(data) == CAPI_REQ ||
-	     CAPIMSG_SUBCOMMAND(data) == CAPI_IND)) {
-		l = CAPIMSG_DATALEN(data);
-		gig_dbg(level, "   DataLength=%d", l);
-		if (l <= 0 || !(gigaset_debuglevel & DEBUG_LLDATA))
-			return;
-		if (l > 64)
-			l = 64; /* arbitrary limit */
-		dbgline = kmalloc_array(3, l, GFP_ATOMIC);
-		if (!dbgline)
-			return;
-		data += CAPIMSG_LEN(data);
-		for (i = 0; i < l; i++) {
-			dbgline[3 * i] = hex_asc_hi(data[i]);
-			dbgline[3 * i + 1] = hex_asc_lo(data[i]);
-			dbgline[3 * i + 2] = ' ';
-		}
-		dbgline[3 * l - 1] = '\0';
-		gig_dbg(level, "  %s", dbgline);
-		kfree(dbgline);
-	}
-#endif
-}
-
-/*
- * format CAPI IE as string
- */
-
-#ifdef CONFIG_GIGASET_DEBUG
-static const char *format_ie(const char *ie)
-{
-	static char result[3 * MAX_FMT_IE_LEN];
-	int len, count;
-	char *pout = result;
-
-	if (!ie)
-		return "NULL";
-
-	count = len = ie[0];
-	if (count > MAX_FMT_IE_LEN)
-		count = MAX_FMT_IE_LEN - 1;
-	while (count--) {
-		*pout++ = hex_asc_hi(*++ie);
-		*pout++ = hex_asc_lo(*ie);
-		*pout++ = ' ';
-	}
-	if (len > MAX_FMT_IE_LEN) {
-		*pout++ = '.';
-		*pout++ = '.';
-		*pout++ = '.';
-	}
-	*--pout = 0;
-	return result;
-}
-#endif
-
-/*
- * emit DATA_B3_CONF message
- */
-static void send_data_b3_conf(struct cardstate *cs, struct capi_ctr *ctr,
-			      u16 appl, u16 msgid, int channel,
-			      u16 handle, u16 info)
-{
-	struct sk_buff *cskb;
-	u8 *msg;
-
-	cskb = alloc_skb(CAPI_DATA_B3_CONF_LEN, GFP_ATOMIC);
-	if (!cskb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	/* frequent message, avoid _cmsg overhead */
-	msg = __skb_put(cskb, CAPI_DATA_B3_CONF_LEN);
-	CAPIMSG_SETLEN(msg, CAPI_DATA_B3_CONF_LEN);
-	CAPIMSG_SETAPPID(msg, appl);
-	CAPIMSG_SETCOMMAND(msg, CAPI_DATA_B3);
-	CAPIMSG_SETSUBCOMMAND(msg,  CAPI_CONF);
-	CAPIMSG_SETMSGID(msg, msgid);
-	CAPIMSG_SETCONTROLLER(msg, ctr->cnr);
-	CAPIMSG_SETPLCI_PART(msg, channel);
-	CAPIMSG_SETNCCI_PART(msg, 1);
-	CAPIMSG_SETHANDLE_CONF(msg, handle);
-	CAPIMSG_SETINFO_CONF(msg, info);
-
-	/* emit message */
-	dump_rawmsg(DEBUG_MCMD, __func__, msg);
-	capi_ctr_handle_message(ctr, appl, cskb);
-}
-
-
-/*
- * driver interface functions
- * ==========================
- */
-
-/**
- * gigaset_skb_sent() - acknowledge transmission of outgoing skb
- * @bcs:	B channel descriptor structure.
- * @dskb:	sent data.
- *
- * Called by hardware module {bas,ser,usb}_gigaset when the data in a
- * skb has been successfully sent, for signalling completion to the LL.
- */
-void gigaset_skb_sent(struct bc_state *bcs, struct sk_buff *dskb)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct gigaset_capi_appl *ap = bcs->ap;
-	unsigned char *req = skb_mac_header(dskb);
-	u16 flags;
-
-	/* update statistics */
-	++bcs->trans_up;
-
-	if (!ap) {
-		gig_dbg(DEBUG_MCMD, "%s: application gone", __func__);
-		return;
-	}
-
-	/* don't send further B3 messages if disconnected */
-	if (bcs->apconnstate < APCONN_ACTIVE) {
-		gig_dbg(DEBUG_MCMD, "%s: disconnected", __func__);
-		return;
-	}
-
-	/*
-	 * send DATA_B3_CONF if "delivery confirmation" bit was set in request;
-	 * otherwise it has already been sent by do_data_b3_req()
-	 */
-	flags = CAPIMSG_FLAGS(req);
-	if (flags & CAPI_FLAGS_DELIVERY_CONFIRMATION)
-		send_data_b3_conf(cs, &iif->ctr, ap->id, CAPIMSG_MSGID(req),
-				  bcs->channel + 1, CAPIMSG_HANDLE_REQ(req),
-				  (flags & ~CAPI_FLAGS_DELIVERY_CONFIRMATION) ?
-				  CapiFlagsNotSupportedByProtocol :
-				  CAPI_NOERROR);
-}
-EXPORT_SYMBOL_GPL(gigaset_skb_sent);
-
-/**
- * gigaset_skb_rcvd() - pass received skb to LL
- * @bcs:	B channel descriptor structure.
- * @skb:	received data.
- *
- * Called by hardware module {bas,ser,usb}_gigaset when user data has
- * been successfully received, for passing to the LL.
- * Warning: skb must not be accessed anymore!
- */
-void gigaset_skb_rcvd(struct bc_state *bcs, struct sk_buff *skb)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct gigaset_capi_appl *ap = bcs->ap;
-	int len = skb->len;
-
-	/* update statistics */
-	bcs->trans_down++;
-
-	if (!ap) {
-		gig_dbg(DEBUG_MCMD, "%s: application gone", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-
-	/* don't send further B3 messages if disconnected */
-	if (bcs->apconnstate < APCONN_ACTIVE) {
-		gig_dbg(DEBUG_MCMD, "%s: disconnected", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-
-	/*
-	 * prepend DATA_B3_IND message to payload
-	 * Parameters: NCCI = 1, all others 0/unused
-	 * frequent message, avoid _cmsg overhead
-	 */
-	skb_push(skb, CAPI_DATA_B3_REQ_LEN);
-	CAPIMSG_SETLEN(skb->data, CAPI_DATA_B3_REQ_LEN);
-	CAPIMSG_SETAPPID(skb->data, ap->id);
-	CAPIMSG_SETCOMMAND(skb->data, CAPI_DATA_B3);
-	CAPIMSG_SETSUBCOMMAND(skb->data,  CAPI_IND);
-	CAPIMSG_SETMSGID(skb->data, ap->nextMessageNumber++);
-	CAPIMSG_SETCONTROLLER(skb->data, iif->ctr.cnr);
-	CAPIMSG_SETPLCI_PART(skb->data, bcs->channel + 1);
-	CAPIMSG_SETNCCI_PART(skb->data, 1);
-	/* Data parameter not used */
-	CAPIMSG_SETDATALEN(skb->data, len);
-	/* Data handle parameter not used */
-	CAPIMSG_SETFLAGS(skb->data, 0);
-	/* Data64 parameter not present */
-
-	/* emit message */
-	dump_rawmsg(DEBUG_MCMD, __func__, skb->data);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-EXPORT_SYMBOL_GPL(gigaset_skb_rcvd);
-
-/**
- * gigaset_isdn_rcv_err() - signal receive error
- * @bcs:	B channel descriptor structure.
- *
- * Called by hardware module {bas,ser,usb}_gigaset when a receive error
- * has occurred, for signalling to the LL.
- */
-void gigaset_isdn_rcv_err(struct bc_state *bcs)
-{
-	/* if currently ignoring packets, just count down */
-	if (bcs->ignore) {
-		bcs->ignore--;
-		return;
-	}
-
-	/* update statistics */
-	bcs->corrupted++;
-
-	/* ToDo: signal error -> LL */
-}
-EXPORT_SYMBOL_GPL(gigaset_isdn_rcv_err);
-
-/**
- * gigaset_isdn_icall() - signal incoming call
- * @at_state:	connection state structure.
- *
- * Called by main module at tasklet level to notify the LL that an incoming
- * call has been received. @at_state contains the parameters of the call.
- *
- * Return value: call disposition (ICALL_*)
- */
-int gigaset_isdn_icall(struct at_state_t *at_state)
-{
-	struct cardstate *cs = at_state->cs;
-	struct bc_state *bcs = at_state->bcs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct gigaset_capi_appl *ap;
-	u32 actCIPmask;
-	struct sk_buff *skb;
-	unsigned int msgsize;
-	unsigned long flags;
-	int i;
-
-	/*
-	 * ToDo: signal calls without a free B channel, too
-	 * (requires a u8 handle for the at_state structure that can
-	 * be stored in the PLCI and used in the CONNECT_RESP message
-	 * handler to retrieve it)
-	 */
-	if (!bcs)
-		return ICALL_IGNORE;
-
-	/* prepare CONNECT_IND message, using B channel number as PLCI */
-	capi_cmsg_header(&iif->hcmsg, 0, CAPI_CONNECT, CAPI_IND, 0,
-			 iif->ctr.cnr | ((bcs->channel + 1) << 8));
-
-	/* minimum size, all structs empty */
-	msgsize = CAPI_CONNECT_IND_BASELEN;
-
-	/* Bearer Capability (mandatory) */
-	if (at_state->str_var[STR_ZBC]) {
-		/* pass on BC from Gigaset */
-		if (encode_ie(at_state->str_var[STR_ZBC], iif->bc_buf,
-			      MAX_BC_OCTETS) < 0) {
-			dev_warn(cs->dev, "RING ignored - bad BC %s\n",
-				 at_state->str_var[STR_ZBC]);
-			return ICALL_IGNORE;
-		}
-
-		/* look up corresponding CIP value */
-		iif->hcmsg.CIPValue = 0;	/* default if nothing found */
-		for (i = 0; i < ARRAY_SIZE(cip2bchlc); i++)
-			if (cip2bchlc[i].bc != NULL &&
-			    cip2bchlc[i].hlc == NULL &&
-			    !strcmp(cip2bchlc[i].bc,
-				    at_state->str_var[STR_ZBC])) {
-				iif->hcmsg.CIPValue = i;
-				break;
-			}
-	} else {
-		/* no BC (internal call): assume CIP 1 (speech, A-law) */
-		iif->hcmsg.CIPValue = 1;
-		encode_ie(cip2bchlc[1].bc, iif->bc_buf, MAX_BC_OCTETS);
-	}
-	iif->hcmsg.BC = iif->bc_buf;
-	msgsize += iif->hcmsg.BC[0];
-
-	/* High Layer Compatibility (optional) */
-	if (at_state->str_var[STR_ZHLC]) {
-		/* pass on HLC from Gigaset */
-		if (encode_ie(at_state->str_var[STR_ZHLC], iif->hlc_buf,
-			      MAX_HLC_OCTETS) < 0) {
-			dev_warn(cs->dev, "RING ignored - bad HLC %s\n",
-				 at_state->str_var[STR_ZHLC]);
-			return ICALL_IGNORE;
-		}
-		iif->hcmsg.HLC = iif->hlc_buf;
-		msgsize += iif->hcmsg.HLC[0];
-
-		/* look up corresponding CIP value */
-		/* keep BC based CIP value if none found */
-		if (at_state->str_var[STR_ZBC])
-			for (i = 0; i < ARRAY_SIZE(cip2bchlc); i++)
-				if (cip2bchlc[i].hlc != NULL &&
-				    !strcmp(cip2bchlc[i].hlc,
-					    at_state->str_var[STR_ZHLC]) &&
-				    !strcmp(cip2bchlc[i].bc,
-					    at_state->str_var[STR_ZBC])) {
-					iif->hcmsg.CIPValue = i;
-					break;
-				}
-	}
-
-	/* Called Party Number (optional) */
-	if (at_state->str_var[STR_ZCPN]) {
-		i = strlen(at_state->str_var[STR_ZCPN]);
-		if (i > MAX_NUMBER_DIGITS) {
-			dev_warn(cs->dev, "RING ignored - bad number %s\n",
-				 at_state->str_var[STR_ZCPN]);
-			return ICALL_IGNORE;
-		}
-		iif->cdpty_buf[0] = i + 1;
-		iif->cdpty_buf[1] = 0x80; /* type / numbering plan unknown */
-		memcpy(iif->cdpty_buf + 2, at_state->str_var[STR_ZCPN], i);
-		iif->hcmsg.CalledPartyNumber = iif->cdpty_buf;
-		msgsize += iif->hcmsg.CalledPartyNumber[0];
-	}
-
-	/* Calling Party Number (optional) */
-	if (at_state->str_var[STR_NMBR]) {
-		i = strlen(at_state->str_var[STR_NMBR]);
-		if (i > MAX_NUMBER_DIGITS) {
-			dev_warn(cs->dev, "RING ignored - bad number %s\n",
-				 at_state->str_var[STR_NMBR]);
-			return ICALL_IGNORE;
-		}
-		iif->cgpty_buf[0] = i + 2;
-		iif->cgpty_buf[1] = 0x00; /* type / numbering plan unknown */
-		iif->cgpty_buf[2] = 0x80; /* pres. allowed, not screened */
-		memcpy(iif->cgpty_buf + 3, at_state->str_var[STR_NMBR], i);
-		iif->hcmsg.CallingPartyNumber = iif->cgpty_buf;
-		msgsize += iif->hcmsg.CallingPartyNumber[0];
-	}
-
-	/* remaining parameters (not supported, always left NULL):
-	 * - CalledPartySubaddress
-	 * - CallingPartySubaddress
-	 * - AdditionalInfo
-	 *   - BChannelinformation
-	 *   - Keypadfacility
-	 *   - Useruserdata
-	 *   - Facilitydataarray
-	 */
-
-	gig_dbg(DEBUG_CMD, "icall: PLCI %x CIP %d BC %s",
-		iif->hcmsg.adr.adrPLCI, iif->hcmsg.CIPValue,
-		format_ie(iif->hcmsg.BC));
-	gig_dbg(DEBUG_CMD, "icall: HLC %s",
-		format_ie(iif->hcmsg.HLC));
-	gig_dbg(DEBUG_CMD, "icall: CgPty %s",
-		format_ie(iif->hcmsg.CallingPartyNumber));
-	gig_dbg(DEBUG_CMD, "icall: CdPty %s",
-		format_ie(iif->hcmsg.CalledPartyNumber));
-
-	/* scan application list for matching listeners */
-	spin_lock_irqsave(&bcs->aplock, flags);
-	if (bcs->ap != NULL || bcs->apconnstate != APCONN_NONE) {
-		dev_warn(cs->dev, "%s: channel not properly cleared (%p/%d)\n",
-			 __func__, bcs->ap, bcs->apconnstate);
-		bcs->ap = NULL;
-		bcs->apconnstate = APCONN_NONE;
-	}
-	spin_unlock_irqrestore(&bcs->aplock, flags);
-	actCIPmask = 1 | (1 << iif->hcmsg.CIPValue);
-	list_for_each_entry(ap, &iif->appls, ctrlist)
-		if (actCIPmask & ap->listenCIPmask) {
-			/* build CONNECT_IND message for this application */
-			iif->hcmsg.ApplId = ap->id;
-			iif->hcmsg.Messagenumber = ap->nextMessageNumber++;
-
-			skb = alloc_skb(msgsize, GFP_ATOMIC);
-			if (!skb) {
-				dev_err(cs->dev, "%s: out of memory\n",
-					__func__);
-				break;
-			}
-			if (capi_cmsg2message(&iif->hcmsg,
-					      __skb_put(skb, msgsize))) {
-				dev_err(cs->dev, "%s: message parser failure\n",
-					__func__);
-				dev_kfree_skb_any(skb);
-				break;
-			}
-			dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg);
-
-			/* add to listeners on this B channel, update state */
-			spin_lock_irqsave(&bcs->aplock, flags);
-			ap->bcnext = bcs->ap;
-			bcs->ap = ap;
-			bcs->chstate |= CHS_NOTIFY_LL;
-			bcs->apconnstate = APCONN_SETUP;
-			spin_unlock_irqrestore(&bcs->aplock, flags);
-
-			/* emit message */
-			capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-		}
-
-	/*
-	 * Return "accept" if any listeners.
-	 * Gigaset will send ALERTING.
-	 * There doesn't seem to be a way to avoid this.
-	 */
-	return bcs->ap ? ICALL_ACCEPT : ICALL_IGNORE;
-}
-
-/*
- * send a DISCONNECT_IND message to an application
- * does not sleep, clobbers the controller's hcmsg structure
- */
-static void send_disconnect_ind(struct bc_state *bcs,
-				struct gigaset_capi_appl *ap, u16 reason)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct sk_buff *skb;
-
-	if (bcs->apconnstate == APCONN_NONE)
-		return;
-
-	capi_cmsg_header(&iif->hcmsg, ap->id, CAPI_DISCONNECT, CAPI_IND,
-			 ap->nextMessageNumber++,
-			 iif->ctr.cnr | ((bcs->channel + 1) << 8));
-	iif->hcmsg.Reason = reason;
-	skb = alloc_skb(CAPI_DISCONNECT_IND_LEN, GFP_ATOMIC);
-	if (!skb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	if (capi_cmsg2message(&iif->hcmsg,
-			      __skb_put(skb, CAPI_DISCONNECT_IND_LEN))) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-
-/*
- * send a DISCONNECT_B3_IND message to an application
- * Parameters: NCCI = 1, NCPI empty, Reason_B3 = 0
- * does not sleep, clobbers the controller's hcmsg structure
- */
-static void send_disconnect_b3_ind(struct bc_state *bcs,
-				   struct gigaset_capi_appl *ap)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct sk_buff *skb;
-
-	/* nothing to do if no logical connection active */
-	if (bcs->apconnstate < APCONN_ACTIVE)
-		return;
-	bcs->apconnstate = APCONN_SETUP;
-
-	capi_cmsg_header(&iif->hcmsg, ap->id, CAPI_DISCONNECT_B3, CAPI_IND,
-			 ap->nextMessageNumber++,
-			 iif->ctr.cnr | ((bcs->channel + 1) << 8) | (1 << 16));
-	skb = alloc_skb(CAPI_DISCONNECT_B3_IND_BASELEN, GFP_ATOMIC);
-	if (!skb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	if (capi_cmsg2message(&iif->hcmsg,
-			  __skb_put(skb, CAPI_DISCONNECT_B3_IND_BASELEN))) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-
-/**
- * gigaset_isdn_connD() - signal D channel connect
- * @bcs:	B channel descriptor structure.
- *
- * Called by main module at tasklet level to notify the LL that the D channel
- * connection has been established.
- */
-void gigaset_isdn_connD(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct gigaset_capi_appl *ap;
-	struct sk_buff *skb;
-	unsigned int msgsize;
-	unsigned long flags;
-
-	spin_lock_irqsave(&bcs->aplock, flags);
-	ap = bcs->ap;
-	if (!ap) {
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		gig_dbg(DEBUG_CMD, "%s: application gone", __func__);
-		return;
-	}
-	if (bcs->apconnstate == APCONN_NONE) {
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		dev_warn(cs->dev, "%s: application %u not connected\n",
-			 __func__, ap->id);
-		return;
-	}
-	spin_unlock_irqrestore(&bcs->aplock, flags);
-	while (ap->bcnext) {
-		/* this should never happen */
-		dev_warn(cs->dev, "%s: dropping extra application %u\n",
-			 __func__, ap->bcnext->id);
-		send_disconnect_ind(bcs, ap->bcnext,
-				    CapiCallGivenToOtherApplication);
-		ap->bcnext = ap->bcnext->bcnext;
-	}
-
-	/* prepare CONNECT_ACTIVE_IND message
-	 * Note: LLC not supported by device
-	 */
-	capi_cmsg_header(&iif->hcmsg, ap->id, CAPI_CONNECT_ACTIVE, CAPI_IND,
-			 ap->nextMessageNumber++,
-			 iif->ctr.cnr | ((bcs->channel + 1) << 8));
-
-	/* minimum size, all structs empty */
-	msgsize = CAPI_CONNECT_ACTIVE_IND_BASELEN;
-
-	/* ToDo: set parameter: Connected number
-	 * (requires ev-layer state machine extension to collect
-	 * ZCON device reply)
-	 */
-
-	/* build and emit CONNECT_ACTIVE_IND message */
-	skb = alloc_skb(msgsize, GFP_ATOMIC);
-	if (!skb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	if (capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize))) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-
-/**
- * gigaset_isdn_hupD() - signal D channel hangup
- * @bcs:	B channel descriptor structure.
- *
- * Called by main module at tasklet level to notify the LL that the D channel
- * connection has been shut down.
- */
-void gigaset_isdn_hupD(struct bc_state *bcs)
-{
-	struct gigaset_capi_appl *ap;
-	unsigned long flags;
-
-	/*
-	 * ToDo: pass on reason code reported by device
-	 * (requires ev-layer state machine extension to collect
-	 * ZCAU device reply)
-	 */
-	spin_lock_irqsave(&bcs->aplock, flags);
-	while (bcs->ap != NULL) {
-		ap = bcs->ap;
-		bcs->ap = ap->bcnext;
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		send_disconnect_b3_ind(bcs, ap);
-		send_disconnect_ind(bcs, ap, 0);
-		spin_lock_irqsave(&bcs->aplock, flags);
-	}
-	bcs->apconnstate = APCONN_NONE;
-	spin_unlock_irqrestore(&bcs->aplock, flags);
-}
-
-/**
- * gigaset_isdn_connB() - signal B channel connect
- * @bcs:	B channel descriptor structure.
- *
- * Called by main module at tasklet level to notify the LL that the B channel
- * connection has been established.
- */
-void gigaset_isdn_connB(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_ctr *iif = cs->iif;
-	struct gigaset_capi_appl *ap;
-	struct sk_buff *skb;
-	unsigned long flags;
-	unsigned int msgsize;
-	u8 command;
-
-	spin_lock_irqsave(&bcs->aplock, flags);
-	ap = bcs->ap;
-	if (!ap) {
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		gig_dbg(DEBUG_CMD, "%s: application gone", __func__);
-		return;
-	}
-	if (!bcs->apconnstate) {
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		dev_warn(cs->dev, "%s: application %u not connected\n",
-			 __func__, ap->id);
-		return;
-	}
-
-	/*
-	 * emit CONNECT_B3_ACTIVE_IND if we already got CONNECT_B3_REQ;
-	 * otherwise we have to emit CONNECT_B3_IND first, and follow up with
-	 * CONNECT_B3_ACTIVE_IND in reply to CONNECT_B3_RESP
-	 * Parameters in both cases always: NCCI = 1, NCPI empty
-	 */
-	if (bcs->apconnstate >= APCONN_ACTIVE) {
-		command = CAPI_CONNECT_B3_ACTIVE;
-		msgsize = CAPI_CONNECT_B3_ACTIVE_IND_BASELEN;
-	} else {
-		command = CAPI_CONNECT_B3;
-		msgsize = CAPI_CONNECT_B3_IND_BASELEN;
-	}
-	bcs->apconnstate = APCONN_ACTIVE;
-
-	spin_unlock_irqrestore(&bcs->aplock, flags);
-
-	while (ap->bcnext) {
-		/* this should never happen */
-		dev_warn(cs->dev, "%s: dropping extra application %u\n",
-			 __func__, ap->bcnext->id);
-		send_disconnect_ind(bcs, ap->bcnext,
-				    CapiCallGivenToOtherApplication);
-		ap->bcnext = ap->bcnext->bcnext;
-	}
-
-	capi_cmsg_header(&iif->hcmsg, ap->id, command, CAPI_IND,
-			 ap->nextMessageNumber++,
-			 iif->ctr.cnr | ((bcs->channel + 1) << 8) | (1 << 16));
-	skb = alloc_skb(msgsize, GFP_ATOMIC);
-	if (!skb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	if (capi_cmsg2message(&iif->hcmsg, __skb_put(skb, msgsize))) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->hcmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-
-/**
- * gigaset_isdn_hupB() - signal B channel hangup
- * @bcs:	B channel descriptor structure.
- *
- * Called by main module to notify the LL that the B channel connection has
- * been shut down.
- */
-void gigaset_isdn_hupB(struct bc_state *bcs)
-{
-	struct gigaset_capi_appl *ap = bcs->ap;
-
-	/* ToDo: assure order of DISCONNECT_B3_IND and DISCONNECT_IND ? */
-
-	if (!ap) {
-		gig_dbg(DEBUG_CMD, "%s: application gone", __func__);
-		return;
-	}
-
-	send_disconnect_b3_ind(bcs, ap);
-}
-
-/**
- * gigaset_isdn_start() - signal device availability
- * @cs:		device descriptor structure.
- *
- * Called by main module to notify the LL that the device is available for
- * use.
- */
-void gigaset_isdn_start(struct cardstate *cs)
-{
-	struct gigaset_capi_ctr *iif = cs->iif;
-
-	/* fill profile data: manufacturer name */
-	strcpy(iif->ctr.manu, "Siemens");
-	/* CAPI and device version */
-	iif->ctr.version.majorversion = 2;		/* CAPI 2.0 */
-	iif->ctr.version.minorversion = 0;
-	/* ToDo: check/assert cs->gotfwver? */
-	iif->ctr.version.majormanuversion = cs->fwver[0];
-	iif->ctr.version.minormanuversion = cs->fwver[1];
-	/* number of B channels supported */
-	iif->ctr.profile.nbchannel = cs->channels;
-	/* global options: internal controller, supplementary services */
-	iif->ctr.profile.goptions = 0x11;
-	/* B1 protocols: 64 kbit/s HDLC or transparent */
-	iif->ctr.profile.support1 =  0x03;
-	/* B2 protocols: transparent only */
-	/* ToDo: X.75 SLP ? */
-	iif->ctr.profile.support2 =  0x02;
-	/* B3 protocols: transparent only */
-	iif->ctr.profile.support3 =  0x01;
-	/* no serial number */
-	strcpy(iif->ctr.serial, "0");
-	capi_ctr_ready(&iif->ctr);
-}
-
-/**
- * gigaset_isdn_stop() - signal device unavailability
- * @cs:		device descriptor structure.
- *
- * Called by main module to notify the LL that the device is no longer
- * available for use.
- */
-void gigaset_isdn_stop(struct cardstate *cs)
-{
-	struct gigaset_capi_ctr *iif = cs->iif;
-	capi_ctr_down(&iif->ctr);
-}
-
-/*
- * kernel CAPI callback methods
- * ============================
- */
-
-/*
- * register CAPI application
- */
-static void gigaset_register_appl(struct capi_ctr *ctr, u16 appl,
-				  capi_register_params *rp)
-{
-	struct gigaset_capi_ctr *iif
-		= container_of(ctr, struct gigaset_capi_ctr, ctr);
-	struct cardstate *cs = ctr->driverdata;
-	struct gigaset_capi_appl *ap;
-
-	gig_dbg(DEBUG_CMD, "%s [%u] l3cnt=%u blkcnt=%u blklen=%u",
-		__func__, appl, rp->level3cnt, rp->datablkcnt, rp->datablklen);
-
-	list_for_each_entry(ap, &iif->appls, ctrlist)
-		if (ap->id == appl) {
-			dev_notice(cs->dev,
-				   "application %u already registered\n", appl);
-			return;
-		}
-
-	ap = kzalloc(sizeof(*ap), GFP_KERNEL);
-	if (!ap) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	ap->id = appl;
-	ap->rp = *rp;
-
-	list_add(&ap->ctrlist, &iif->appls);
-	dev_info(cs->dev, "application %u registered\n", ap->id);
-}
-
-/*
- * remove CAPI application from channel
- * helper function to keep indentation levels down and stay in 80 columns
- */
-
-static inline void remove_appl_from_channel(struct bc_state *bcs,
-					    struct gigaset_capi_appl *ap)
-{
-	struct cardstate *cs = bcs->cs;
-	struct gigaset_capi_appl *bcap;
-	unsigned long flags;
-	int prevconnstate;
-
-	spin_lock_irqsave(&bcs->aplock, flags);
-	bcap = bcs->ap;
-	if (bcap == NULL) {
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		return;
-	}
-
-	/* check first application on channel */
-	if (bcap == ap) {
-		bcs->ap = ap->bcnext;
-		if (bcs->ap != NULL) {
-			spin_unlock_irqrestore(&bcs->aplock, flags);
-			return;
-		}
-
-		/* none left, clear channel state */
-		prevconnstate = bcs->apconnstate;
-		bcs->apconnstate = APCONN_NONE;
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-
-		if (prevconnstate == APCONN_ACTIVE) {
-			dev_notice(cs->dev, "%s: hanging up channel %u\n",
-				   __func__, bcs->channel);
-			gigaset_add_event(cs, &bcs->at_state,
-					  EV_HUP, NULL, 0, NULL);
-			gigaset_schedule_event(cs);
-		}
-		return;
-	}
-
-	/* check remaining list */
-	do {
-		if (bcap->bcnext == ap) {
-			bcap->bcnext = bcap->bcnext->bcnext;
-			spin_unlock_irqrestore(&bcs->aplock, flags);
-			return;
-		}
-		bcap = bcap->bcnext;
-	} while (bcap != NULL);
-	spin_unlock_irqrestore(&bcs->aplock, flags);
-}
-
-/*
- * release CAPI application
- */
-static void gigaset_release_appl(struct capi_ctr *ctr, u16 appl)
-{
-	struct gigaset_capi_ctr *iif
-		= container_of(ctr, struct gigaset_capi_ctr, ctr);
-	struct cardstate *cs = iif->ctr.driverdata;
-	struct gigaset_capi_appl *ap, *tmp;
-	unsigned ch;
-
-	gig_dbg(DEBUG_CMD, "%s [%u]", __func__, appl);
-
-	list_for_each_entry_safe(ap, tmp, &iif->appls, ctrlist)
-		if (ap->id == appl) {
-			/* remove from any channels */
-			for (ch = 0; ch < cs->channels; ch++)
-				remove_appl_from_channel(&cs->bcs[ch], ap);
-
-			/* remove from registration list */
-			list_del(&ap->ctrlist);
-			kfree(ap);
-			dev_info(cs->dev, "application %u released\n", appl);
-		}
-}
-
-/*
- * =====================================================================
- * outgoing CAPI message handler
- * =====================================================================
- */
-
-/*
- * helper function: emit reply message with given Info value
- */
-static void send_conf(struct gigaset_capi_ctr *iif,
-		      struct gigaset_capi_appl *ap,
-		      struct sk_buff *skb,
-		      u16 info)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-
-	/*
-	 * _CONF replies always only have NCCI and Info parameters
-	 * so they'll fit into the _REQ message skb
-	 */
-	capi_cmsg_answer(&iif->acmsg);
-	iif->acmsg.Info = info;
-	if (capi_cmsg2message(&iif->acmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	__skb_trim(skb, CAPI_STDCONF_LEN);
-	dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-
-/*
- * process FACILITY_REQ message
- */
-static void do_facility_req(struct gigaset_capi_ctr *iif,
-			    struct gigaset_capi_appl *ap,
-			    struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct sk_buff *cskb;
-	u8 *pparam;
-	unsigned int msgsize = CAPI_FACILITY_CONF_BASELEN;
-	u16 function, info;
-	static u8 confparam[10];	/* max. 9 octets + length byte */
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-
-	/*
-	 * Facility Request Parameter is not decoded by capi_message2cmsg()
-	 * encoding depends on Facility Selector
-	 */
-	switch (cmsg->FacilitySelector) {
-	case CAPI_FACILITY_DTMF:	/* ToDo */
-		info = CapiFacilityNotSupported;
-		confparam[0] = 2;	/* length */
-		/* DTMF information: Unknown DTMF request */
-		capimsg_setu16(confparam, 1, 2);
-		break;
-
-	case CAPI_FACILITY_V42BIS:	/* not supported */
-		info = CapiFacilityNotSupported;
-		confparam[0] = 2;	/* length */
-		/* V.42 bis information: not available */
-		capimsg_setu16(confparam, 1, 1);
-		break;
-
-	case CAPI_FACILITY_SUPPSVC:
-		/* decode Function parameter */
-		pparam = cmsg->FacilityRequestParameter;
-		if (pparam == NULL || pparam[0] < 2) {
-			dev_notice(cs->dev, "%s: %s missing\n", "FACILITY_REQ",
-				   "Facility Request Parameter");
-			send_conf(iif, ap, skb, CapiIllMessageParmCoding);
-			return;
-		}
-		function = CAPIMSG_U16(pparam, 1);
-		switch (function) {
-		case CAPI_SUPPSVC_GETSUPPORTED:
-			info = CapiSuccess;
-			/* Supplementary Service specific parameter */
-			confparam[3] = 6;	/* length */
-			/* Supplementary services info: Success */
-			capimsg_setu16(confparam, 4, CapiSuccess);
-			/* Supported Services: none */
-			capimsg_setu32(confparam, 6, 0);
-			break;
-		case CAPI_SUPPSVC_LISTEN:
-			if (pparam[0] < 7 || pparam[3] < 4) {
-				dev_notice(cs->dev, "%s: %s missing\n",
-					   "FACILITY_REQ", "Notification Mask");
-				send_conf(iif, ap, skb,
-					  CapiIllMessageParmCoding);
-				return;
-			}
-			if (CAPIMSG_U32(pparam, 4) != 0) {
-				dev_notice(cs->dev,
-					   "%s: unsupported supplementary service notification mask 0x%x\n",
-					   "FACILITY_REQ", CAPIMSG_U32(pparam, 4));
-				info = CapiFacilitySpecificFunctionNotSupported;
-				confparam[3] = 2;	/* length */
-				capimsg_setu16(confparam, 4,
-					       CapiSupplementaryServiceNotSupported);
-				break;
-			}
-			info = CapiSuccess;
-			confparam[3] = 2;	/* length */
-			capimsg_setu16(confparam, 4, CapiSuccess);
-			break;
-
-		/* ToDo: add supported services */
-
-		default:
-			dev_notice(cs->dev,
-				   "%s: unsupported supplementary service function 0x%04x\n",
-				   "FACILITY_REQ", function);
-			info = CapiFacilitySpecificFunctionNotSupported;
-			/* Supplementary Service specific parameter */
-			confparam[3] = 2;	/* length */
-			/* Supplementary services info: not supported */
-			capimsg_setu16(confparam, 4,
-				       CapiSupplementaryServiceNotSupported);
-		}
-
-		/* Facility confirmation parameter */
-		confparam[0] = confparam[3] + 3;	/* total length */
-		/* Function: copy from _REQ message */
-		capimsg_setu16(confparam, 1, function);
-		/* Supplementary Service specific parameter already set above */
-		break;
-
-	case CAPI_FACILITY_WAKEUP:	/* ToDo */
-		info = CapiFacilityNotSupported;
-		confparam[0] = 2;	/* length */
-		/* Number of accepted awake request parameters: 0 */
-		capimsg_setu16(confparam, 1, 0);
-		break;
-
-	default:
-		info = CapiFacilityNotSupported;
-		confparam[0] = 0;	/* empty struct */
-	}
-
-	/* send FACILITY_CONF with given Info and confirmation parameter */
-	dev_kfree_skb_any(skb);
-	capi_cmsg_answer(cmsg);
-	cmsg->Info = info;
-	cmsg->FacilityConfirmationParameter = confparam;
-	msgsize += confparam[0];	/* length */
-	cskb = alloc_skb(msgsize, GFP_ATOMIC);
-	if (!cskb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	if (capi_cmsg2message(cmsg, __skb_put(cskb, msgsize))) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(cskb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, cskb);
-}
-
-
-/*
- * process LISTEN_REQ message
- * just store the masks in the application data structure
- */
-static void do_listen_req(struct gigaset_capi_ctr *iif,
-			  struct gigaset_capi_appl *ap,
-			  struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-
-	/* decode message */
-	if (capi_message2cmsg(&iif->acmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg);
-
-	/* store listening parameters */
-	ap->listenInfoMask = iif->acmsg.InfoMask;
-	ap->listenCIPmask = iif->acmsg.CIPmask;
-	send_conf(iif, ap, skb, CapiSuccess);
-}
-
-/*
- * process ALERT_REQ message
- * nothing to do, Gigaset always alerts anyway
- */
-static void do_alert_req(struct gigaset_capi_ctr *iif,
-			 struct gigaset_capi_appl *ap,
-			 struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-
-	/* decode message */
-	if (capi_message2cmsg(&iif->acmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg);
-	send_conf(iif, ap, skb, CapiAlertAlreadySent);
-}
-
-/*
- * process CONNECT_REQ message
- * allocate a B channel, prepare dial commands, queue a DIAL event,
- * emit CONNECT_CONF reply
- */
-static void do_connect_req(struct gigaset_capi_ctr *iif,
-			   struct gigaset_capi_appl *ap,
-			   struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct bc_state *bcs;
-	char **commands;
-	char *s;
-	u8 *pp;
-	unsigned long flags;
-	int i, l, lbc, lhlc;
-	u16 info;
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-
-	/* get free B channel & construct PLCI */
-	bcs = gigaset_get_free_channel(cs);
-	if (!bcs) {
-		dev_notice(cs->dev, "%s: no B channel available\n",
-			   "CONNECT_REQ");
-		send_conf(iif, ap, skb, CapiNoPlciAvailable);
-		return;
-	}
-	spin_lock_irqsave(&bcs->aplock, flags);
-	if (bcs->ap != NULL || bcs->apconnstate != APCONN_NONE)
-		dev_warn(cs->dev, "%s: channel not properly cleared (%p/%d)\n",
-			 __func__, bcs->ap, bcs->apconnstate);
-	ap->bcnext = NULL;
-	bcs->ap = ap;
-	bcs->apconnstate = APCONN_SETUP;
-	spin_unlock_irqrestore(&bcs->aplock, flags);
-
-	bcs->rx_bufsize = ap->rp.datablklen;
-	dev_kfree_skb(bcs->rx_skb);
-	gigaset_new_rx_skb(bcs);
-	cmsg->adr.adrPLCI |= (bcs->channel + 1) << 8;
-
-	/* build command table */
-	commands = kcalloc(AT_NUM, sizeof(*commands), GFP_KERNEL);
-	if (!commands)
-		goto oom;
-
-	/* encode parameter: Called party number */
-	pp = cmsg->CalledPartyNumber;
-	if (pp == NULL || *pp == 0) {
-		dev_notice(cs->dev, "%s: %s missing\n",
-			   "CONNECT_REQ", "Called party number");
-		info = CapiIllMessageParmCoding;
-		goto error;
-	}
-	l = *pp++;
-	/* check type of number/numbering plan byte */
-	switch (*pp) {
-	case 0x80:	/* unknown type / unknown numbering plan */
-	case 0x81:	/* unknown type / ISDN/Telephony numbering plan */
-		break;
-	default:	/* others: warn about potential misinterpretation */
-		dev_notice(cs->dev, "%s: %s type/plan 0x%02x unsupported\n",
-			   "CONNECT_REQ", "Called party number", *pp);
-	}
-	pp++;
-	l--;
-	/* translate "**" internal call prefix to CTP value */
-	if (l >= 2 && pp[0] == '*' && pp[1] == '*') {
-		s = "^SCTP=0\r";
-		pp += 2;
-		l -= 2;
-	} else {
-		s = "^SCTP=1\r";
-	}
-	commands[AT_TYPE] = kstrdup(s, GFP_KERNEL);
-	if (!commands[AT_TYPE])
-		goto oom;
-	commands[AT_DIAL] = kmalloc(l + 3, GFP_KERNEL);
-	if (!commands[AT_DIAL])
-		goto oom;
-	snprintf(commands[AT_DIAL], l + 3, "D%.*s\r", l, pp);
-
-	/* encode parameter: Calling party number */
-	pp = cmsg->CallingPartyNumber;
-	if (pp != NULL && *pp > 0) {
-		l = *pp++;
-
-		/* check type of number/numbering plan byte */
-		/* ToDo: allow for/handle Ext=1? */
-		switch (*pp) {
-		case 0x00:	/* unknown type / unknown numbering plan */
-		case 0x01:	/* unknown type / ISDN/Telephony num. plan */
-			break;
-		default:
-			dev_notice(cs->dev,
-				   "%s: %s type/plan 0x%02x unsupported\n",
-				   "CONNECT_REQ", "Calling party number", *pp);
-		}
-		pp++;
-		l--;
-
-		/* check presentation indicator */
-		if (!l) {
-			dev_notice(cs->dev, "%s: %s IE truncated\n",
-				   "CONNECT_REQ", "Calling party number");
-			info = CapiIllMessageParmCoding;
-			goto error;
-		}
-		switch (*pp & 0xfc) { /* ignore Screening indicator */
-		case 0x80:	/* Presentation allowed */
-			s = "^SCLIP=1\r";
-			break;
-		case 0xa0:	/* Presentation restricted */
-			s = "^SCLIP=0\r";
-			break;
-		default:
-			dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-				   "CONNECT_REQ",
-				   "Presentation/Screening indicator",
-				   *pp);
-			s = "^SCLIP=1\r";
-		}
-		commands[AT_CLIP] = kstrdup(s, GFP_KERNEL);
-		if (!commands[AT_CLIP])
-			goto oom;
-		pp++;
-		l--;
-
-		if (l) {
-			/* number */
-			commands[AT_MSN] = kmalloc(l + 8, GFP_KERNEL);
-			if (!commands[AT_MSN])
-				goto oom;
-			snprintf(commands[AT_MSN], l + 8, "^SMSN=%*s\r", l, pp);
-		}
-	}
-
-	/* check parameter: CIP Value */
-	if (cmsg->CIPValue >= ARRAY_SIZE(cip2bchlc) ||
-	    (cmsg->CIPValue > 0 && cip2bchlc[cmsg->CIPValue].bc == NULL)) {
-		dev_notice(cs->dev, "%s: unknown CIP value %d\n",
-			   "CONNECT_REQ", cmsg->CIPValue);
-		info = CapiCipValueUnknown;
-		goto error;
-	}
-
-	/*
-	 * check/encode parameters: BC & HLC
-	 * must be encoded together as device doesn't accept HLC separately
-	 * explicit parameters override values derived from CIP
-	 */
-
-	/* determine lengths */
-	if (cmsg->BC && cmsg->BC[0])		/* BC specified explicitly */
-		lbc = 2 * cmsg->BC[0];
-	else if (cip2bchlc[cmsg->CIPValue].bc)	/* BC derived from CIP */
-		lbc = strlen(cip2bchlc[cmsg->CIPValue].bc);
-	else					/* no BC */
-		lbc = 0;
-	if (cmsg->HLC && cmsg->HLC[0])		/* HLC specified explicitly */
-		lhlc = 2 * cmsg->HLC[0];
-	else if (cip2bchlc[cmsg->CIPValue].hlc)	/* HLC derived from CIP */
-		lhlc = strlen(cip2bchlc[cmsg->CIPValue].hlc);
-	else					/* no HLC */
-		lhlc = 0;
-
-	if (lbc) {
-		/* have BC: allocate and assemble command string */
-		l = lbc + 7;		/* "^SBC=" + value + "\r" + null byte */
-		if (lhlc)
-			l += lhlc + 7;	/* ";^SHLC=" + value */
-		commands[AT_BC] = kmalloc(l, GFP_KERNEL);
-		if (!commands[AT_BC])
-			goto oom;
-		strcpy(commands[AT_BC], "^SBC=");
-		if (cmsg->BC && cmsg->BC[0])	/* BC specified explicitly */
-			decode_ie(cmsg->BC, commands[AT_BC] + 5);
-		else				/* BC derived from CIP */
-			strcpy(commands[AT_BC] + 5,
-			       cip2bchlc[cmsg->CIPValue].bc);
-		if (lhlc) {
-			strcpy(commands[AT_BC] + lbc + 5, ";^SHLC=");
-			if (cmsg->HLC && cmsg->HLC[0])
-				/* HLC specified explicitly */
-				decode_ie(cmsg->HLC,
-					  commands[AT_BC] + lbc + 12);
-			else	/* HLC derived from CIP */
-				strcpy(commands[AT_BC] + lbc + 12,
-				       cip2bchlc[cmsg->CIPValue].hlc);
-		}
-		strcpy(commands[AT_BC] + l - 2, "\r");
-	} else {
-		/* no BC */
-		if (lhlc) {
-			dev_notice(cs->dev, "%s: cannot set HLC without BC\n",
-				   "CONNECT_REQ");
-			info = CapiIllMessageParmCoding; /* ? */
-			goto error;
-		}
-	}
-
-	/* check/encode parameter: B Protocol */
-	if (cmsg->BProtocol == CAPI_DEFAULT) {
-		bcs->proto2 = L2_HDLC;
-		dev_warn(cs->dev,
-			 "B2 Protocol X.75 SLP unsupported, using Transparent\n");
-	} else {
-		switch (cmsg->B1protocol) {
-		case 0:
-			bcs->proto2 = L2_HDLC;
-			break;
-		case 1:
-			bcs->proto2 = L2_VOICE;
-			break;
-		default:
-			dev_warn(cs->dev,
-				 "B1 Protocol %u unsupported, using Transparent\n",
-				 cmsg->B1protocol);
-			bcs->proto2 = L2_VOICE;
-		}
-		if (cmsg->B2protocol != 1)
-			dev_warn(cs->dev,
-				 "B2 Protocol %u unsupported, using Transparent\n",
-				 cmsg->B2protocol);
-		if (cmsg->B3protocol != 0)
-			dev_warn(cs->dev,
-				 "B3 Protocol %u unsupported, using Transparent\n",
-				 cmsg->B3protocol);
-		ignore_cstruct_param(cs, cmsg->B1configuration,
-				     "CONNECT_REQ", "B1 Configuration");
-		ignore_cstruct_param(cs, cmsg->B2configuration,
-				     "CONNECT_REQ", "B2 Configuration");
-		ignore_cstruct_param(cs, cmsg->B3configuration,
-				     "CONNECT_REQ", "B3 Configuration");
-	}
-	commands[AT_PROTO] = kmalloc(9, GFP_KERNEL);
-	if (!commands[AT_PROTO])
-		goto oom;
-	snprintf(commands[AT_PROTO], 9, "^SBPR=%u\r", bcs->proto2);
-
-	/* ToDo: check/encode remaining parameters */
-	ignore_cstruct_param(cs, cmsg->CalledPartySubaddress,
-			     "CONNECT_REQ", "Called pty subaddr");
-	ignore_cstruct_param(cs, cmsg->CallingPartySubaddress,
-			     "CONNECT_REQ", "Calling pty subaddr");
-	ignore_cstruct_param(cs, cmsg->LLC,
-			     "CONNECT_REQ", "LLC");
-	if (cmsg->AdditionalInfo != CAPI_DEFAULT) {
-		ignore_cstruct_param(cs, cmsg->BChannelinformation,
-				     "CONNECT_REQ", "B Channel Information");
-		ignore_cstruct_param(cs, cmsg->Keypadfacility,
-				     "CONNECT_REQ", "Keypad Facility");
-		ignore_cstruct_param(cs, cmsg->Useruserdata,
-				     "CONNECT_REQ", "User-User Data");
-		ignore_cstruct_param(cs, cmsg->Facilitydataarray,
-				     "CONNECT_REQ", "Facility Data Array");
-	}
-
-	/* encode parameter: B channel to use */
-	commands[AT_ISO] = kmalloc(9, GFP_KERNEL);
-	if (!commands[AT_ISO])
-		goto oom;
-	snprintf(commands[AT_ISO], 9, "^SISO=%u\r",
-		 (unsigned) bcs->channel + 1);
-
-	/* queue & schedule EV_DIAL event */
-	if (!gigaset_add_event(cs, &bcs->at_state, EV_DIAL, commands,
-			       bcs->at_state.seq_index, NULL)) {
-		info = CAPI_MSGOSRESOURCEERR;
-		goto error;
-	}
-	gigaset_schedule_event(cs);
-	send_conf(iif, ap, skb, CapiSuccess);
-	return;
-
-oom:
-	dev_err(cs->dev, "%s: out of memory\n", __func__);
-	info = CAPI_MSGOSRESOURCEERR;
-error:
-	if (commands)
-		for (i = 0; i < AT_NUM; i++)
-			kfree(commands[i]);
-	kfree(commands);
-	gigaset_free_channel(bcs);
-	send_conf(iif, ap, skb, info);
-}
-
-/*
- * process CONNECT_RESP message
- * checks protocol parameters and queues an ACCEPT or HUP event
- */
-static void do_connect_resp(struct gigaset_capi_ctr *iif,
-			    struct gigaset_capi_appl *ap,
-			    struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct bc_state *bcs;
-	struct gigaset_capi_appl *oap;
-	unsigned long flags;
-	int channel;
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-	dev_kfree_skb_any(skb);
-
-	/* extract and check channel number from PLCI */
-	channel = (cmsg->adr.adrPLCI >> 8) & 0xff;
-	if (!channel || channel > cs->channels) {
-		dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-			   "CONNECT_RESP", "PLCI", cmsg->adr.adrPLCI);
-		return;
-	}
-	bcs = cs->bcs + channel - 1;
-
-	switch (cmsg->Reject) {
-	case 0:		/* Accept */
-		/* drop all competing applications, keep only this one */
-		spin_lock_irqsave(&bcs->aplock, flags);
-		while (bcs->ap != NULL) {
-			oap = bcs->ap;
-			bcs->ap = oap->bcnext;
-			if (oap != ap) {
-				spin_unlock_irqrestore(&bcs->aplock, flags);
-				send_disconnect_ind(bcs, oap,
-						    CapiCallGivenToOtherApplication);
-				spin_lock_irqsave(&bcs->aplock, flags);
-			}
-		}
-		ap->bcnext = NULL;
-		bcs->ap = ap;
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-
-		bcs->rx_bufsize = ap->rp.datablklen;
-		dev_kfree_skb(bcs->rx_skb);
-		gigaset_new_rx_skb(bcs);
-		bcs->chstate |= CHS_NOTIFY_LL;
-
-		/* check/encode B channel protocol */
-		if (cmsg->BProtocol == CAPI_DEFAULT) {
-			bcs->proto2 = L2_HDLC;
-			dev_warn(cs->dev,
-				 "B2 Protocol X.75 SLP unsupported, using Transparent\n");
-		} else {
-			switch (cmsg->B1protocol) {
-			case 0:
-				bcs->proto2 = L2_HDLC;
-				break;
-			case 1:
-				bcs->proto2 = L2_VOICE;
-				break;
-			default:
-				dev_warn(cs->dev,
-					 "B1 Protocol %u unsupported, using Transparent\n",
-					 cmsg->B1protocol);
-				bcs->proto2 = L2_VOICE;
-			}
-			if (cmsg->B2protocol != 1)
-				dev_warn(cs->dev,
-					 "B2 Protocol %u unsupported, using Transparent\n",
-					 cmsg->B2protocol);
-			if (cmsg->B3protocol != 0)
-				dev_warn(cs->dev,
-					 "B3 Protocol %u unsupported, using Transparent\n",
-					 cmsg->B3protocol);
-			ignore_cstruct_param(cs, cmsg->B1configuration,
-					     "CONNECT_RESP", "B1 Configuration");
-			ignore_cstruct_param(cs, cmsg->B2configuration,
-					     "CONNECT_RESP", "B2 Configuration");
-			ignore_cstruct_param(cs, cmsg->B3configuration,
-					     "CONNECT_RESP", "B3 Configuration");
-		}
-
-		/* ToDo: check/encode remaining parameters */
-		ignore_cstruct_param(cs, cmsg->ConnectedNumber,
-				     "CONNECT_RESP", "Connected Number");
-		ignore_cstruct_param(cs, cmsg->ConnectedSubaddress,
-				     "CONNECT_RESP", "Connected Subaddress");
-		ignore_cstruct_param(cs, cmsg->LLC,
-				     "CONNECT_RESP", "LLC");
-		if (cmsg->AdditionalInfo != CAPI_DEFAULT) {
-			ignore_cstruct_param(cs, cmsg->BChannelinformation,
-					     "CONNECT_RESP", "BChannel Information");
-			ignore_cstruct_param(cs, cmsg->Keypadfacility,
-					     "CONNECT_RESP", "Keypad Facility");
-			ignore_cstruct_param(cs, cmsg->Useruserdata,
-					     "CONNECT_RESP", "User-User Data");
-			ignore_cstruct_param(cs, cmsg->Facilitydataarray,
-					     "CONNECT_RESP", "Facility Data Array");
-		}
-
-		/* Accept call */
-		if (!gigaset_add_event(cs, &cs->bcs[channel - 1].at_state,
-				       EV_ACCEPT, NULL, 0, NULL))
-			return;
-		gigaset_schedule_event(cs);
-		return;
-
-	case 1:			/* Ignore */
-		/* send DISCONNECT_IND to this application */
-		send_disconnect_ind(bcs, ap, 0);
-
-		/* remove it from the list of listening apps */
-		spin_lock_irqsave(&bcs->aplock, flags);
-		if (bcs->ap == ap) {
-			bcs->ap = ap->bcnext;
-			if (bcs->ap == NULL) {
-				/* last one: stop ev-layer hupD notifications */
-				bcs->apconnstate = APCONN_NONE;
-				bcs->chstate &= ~CHS_NOTIFY_LL;
-			}
-			spin_unlock_irqrestore(&bcs->aplock, flags);
-			return;
-		}
-		for (oap = bcs->ap; oap != NULL; oap = oap->bcnext) {
-			if (oap->bcnext == ap) {
-				oap->bcnext = oap->bcnext->bcnext;
-				spin_unlock_irqrestore(&bcs->aplock, flags);
-				return;
-			}
-		}
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-		dev_err(cs->dev, "%s: application %u not found\n",
-			__func__, ap->id);
-		return;
-
-	default:		/* Reject */
-		/* drop all competing applications, keep only this one */
-		spin_lock_irqsave(&bcs->aplock, flags);
-		while (bcs->ap != NULL) {
-			oap = bcs->ap;
-			bcs->ap = oap->bcnext;
-			if (oap != ap) {
-				spin_unlock_irqrestore(&bcs->aplock, flags);
-				send_disconnect_ind(bcs, oap,
-						    CapiCallGivenToOtherApplication);
-				spin_lock_irqsave(&bcs->aplock, flags);
-			}
-		}
-		ap->bcnext = NULL;
-		bcs->ap = ap;
-		spin_unlock_irqrestore(&bcs->aplock, flags);
-
-		/* reject call - will trigger DISCONNECT_IND for this app */
-		dev_info(cs->dev, "%s: Reject=%x\n",
-			 "CONNECT_RESP", cmsg->Reject);
-		if (!gigaset_add_event(cs, &cs->bcs[channel - 1].at_state,
-				       EV_HUP, NULL, 0, NULL))
-			return;
-		gigaset_schedule_event(cs);
-		return;
-	}
-}
-
-/*
- * process CONNECT_B3_REQ message
- * build NCCI and emit CONNECT_B3_CONF reply
- */
-static void do_connect_b3_req(struct gigaset_capi_ctr *iif,
-			      struct gigaset_capi_appl *ap,
-			      struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct bc_state *bcs;
-	int channel;
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-
-	/* extract and check channel number from PLCI */
-	channel = (cmsg->adr.adrPLCI >> 8) & 0xff;
-	if (!channel || channel > cs->channels) {
-		dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-			   "CONNECT_B3_REQ", "PLCI", cmsg->adr.adrPLCI);
-		send_conf(iif, ap, skb, CapiIllContrPlciNcci);
-		return;
-	}
-	bcs = &cs->bcs[channel - 1];
-
-	/* mark logical connection active */
-	bcs->apconnstate = APCONN_ACTIVE;
-
-	/* build NCCI: always 1 (one B3 connection only) */
-	cmsg->adr.adrNCCI |= 1 << 16;
-
-	/* NCPI parameter: not applicable for B3 Transparent */
-	ignore_cstruct_param(cs, cmsg->NCPI, "CONNECT_B3_REQ", "NCPI");
-	send_conf(iif, ap, skb,
-		  (cmsg->NCPI && cmsg->NCPI[0]) ?
-		  CapiNcpiNotSupportedByProtocol : CapiSuccess);
-}
-
-/*
- * process CONNECT_B3_RESP message
- * Depending on the Reject parameter, either emit CONNECT_B3_ACTIVE_IND
- * or queue EV_HUP and emit DISCONNECT_B3_IND.
- * The emitted message is always shorter than the received one,
- * allowing to reuse the skb.
- */
-static void do_connect_b3_resp(struct gigaset_capi_ctr *iif,
-			       struct gigaset_capi_appl *ap,
-			       struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct bc_state *bcs;
-	int channel;
-	unsigned int msgsize;
-	u8 command;
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-
-	/* extract and check channel number and NCCI */
-	channel = (cmsg->adr.adrNCCI >> 8) & 0xff;
-	if (!channel || channel > cs->channels ||
-	    ((cmsg->adr.adrNCCI >> 16) & 0xffff) != 1) {
-		dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-			   "CONNECT_B3_RESP", "NCCI", cmsg->adr.adrNCCI);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	bcs = &cs->bcs[channel - 1];
-
-	if (cmsg->Reject) {
-		/* Reject: clear B3 connect received flag */
-		bcs->apconnstate = APCONN_SETUP;
-
-		/* trigger hangup, causing eventual DISCONNECT_IND */
-		if (!gigaset_add_event(cs, &bcs->at_state,
-				       EV_HUP, NULL, 0, NULL)) {
-			dev_kfree_skb_any(skb);
-			return;
-		}
-		gigaset_schedule_event(cs);
-
-		/* emit DISCONNECT_B3_IND */
-		command = CAPI_DISCONNECT_B3;
-		msgsize = CAPI_DISCONNECT_B3_IND_BASELEN;
-	} else {
-		/*
-		 * Accept: emit CONNECT_B3_ACTIVE_IND immediately, as
-		 * we only send CONNECT_B3_IND if the B channel is up
-		 */
-		command = CAPI_CONNECT_B3_ACTIVE;
-		msgsize = CAPI_CONNECT_B3_ACTIVE_IND_BASELEN;
-	}
-	capi_cmsg_header(cmsg, ap->id, command, CAPI_IND,
-			 ap->nextMessageNumber++, cmsg->adr.adrNCCI);
-	__skb_trim(skb, msgsize);
-	if (capi_cmsg2message(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-	capi_ctr_handle_message(&iif->ctr, ap->id, skb);
-}
-
-/*
- * process DISCONNECT_REQ message
- * schedule EV_HUP and emit DISCONNECT_B3_IND if necessary,
- * emit DISCONNECT_CONF reply
- */
-static void do_disconnect_req(struct gigaset_capi_ctr *iif,
-			      struct gigaset_capi_appl *ap,
-			      struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct bc_state *bcs;
-	_cmsg *b3cmsg;
-	struct sk_buff *b3skb;
-	int channel;
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-
-	/* extract and check channel number from PLCI */
-	channel = (cmsg->adr.adrPLCI >> 8) & 0xff;
-	if (!channel || channel > cs->channels) {
-		dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-			   "DISCONNECT_REQ", "PLCI", cmsg->adr.adrPLCI);
-		send_conf(iif, ap, skb, CapiIllContrPlciNcci);
-		return;
-	}
-	bcs = cs->bcs + channel - 1;
-
-	/* ToDo: process parameter: Additional info */
-	if (cmsg->AdditionalInfo != CAPI_DEFAULT) {
-		ignore_cstruct_param(cs, cmsg->BChannelinformation,
-				     "DISCONNECT_REQ", "B Channel Information");
-		ignore_cstruct_param(cs, cmsg->Keypadfacility,
-				     "DISCONNECT_REQ", "Keypad Facility");
-		ignore_cstruct_param(cs, cmsg->Useruserdata,
-				     "DISCONNECT_REQ", "User-User Data");
-		ignore_cstruct_param(cs, cmsg->Facilitydataarray,
-				     "DISCONNECT_REQ", "Facility Data Array");
-	}
-
-	/* skip if DISCONNECT_IND already sent */
-	if (!bcs->apconnstate)
-		return;
-
-	/* check for active logical connection */
-	if (bcs->apconnstate >= APCONN_ACTIVE) {
-		/* clear it */
-		bcs->apconnstate = APCONN_SETUP;
-
-		/*
-		 * emit DISCONNECT_B3_IND with cause 0x3301
-		 * use separate cmsg structure, as the content of iif->acmsg
-		 * is still needed for creating the _CONF message
-		 */
-		b3cmsg = kmalloc(sizeof(*b3cmsg), GFP_KERNEL);
-		if (!b3cmsg) {
-			dev_err(cs->dev, "%s: out of memory\n", __func__);
-			send_conf(iif, ap, skb, CAPI_MSGOSRESOURCEERR);
-			return;
-		}
-		capi_cmsg_header(b3cmsg, ap->id, CAPI_DISCONNECT_B3, CAPI_IND,
-				 ap->nextMessageNumber++,
-				 cmsg->adr.adrPLCI | (1 << 16));
-		b3cmsg->Reason_B3 = CapiProtocolErrorLayer1;
-		b3skb = alloc_skb(CAPI_DISCONNECT_B3_IND_BASELEN, GFP_KERNEL);
-		if (b3skb == NULL) {
-			dev_err(cs->dev, "%s: out of memory\n", __func__);
-			send_conf(iif, ap, skb, CAPI_MSGOSRESOURCEERR);
-			kfree(b3cmsg);
-			return;
-		}
-		if (capi_cmsg2message(b3cmsg,
-				      __skb_put(b3skb, CAPI_DISCONNECT_B3_IND_BASELEN))) {
-			dev_err(cs->dev, "%s: message parser failure\n",
-				__func__);
-			kfree(b3cmsg);
-			dev_kfree_skb_any(b3skb);
-			return;
-		}
-		dump_cmsg(DEBUG_CMD, __func__, b3cmsg);
-		kfree(b3cmsg);
-		capi_ctr_handle_message(&iif->ctr, ap->id, b3skb);
-	}
-
-	/* trigger hangup, causing eventual DISCONNECT_IND */
-	if (!gigaset_add_event(cs, &bcs->at_state, EV_HUP, NULL, 0, NULL)) {
-		send_conf(iif, ap, skb, CAPI_MSGOSRESOURCEERR);
-		return;
-	}
-	gigaset_schedule_event(cs);
-
-	/* emit reply */
-	send_conf(iif, ap, skb, CapiSuccess);
-}
-
-/*
- * process DISCONNECT_B3_REQ message
- * schedule EV_HUP and emit DISCONNECT_B3_CONF reply
- */
-static void do_disconnect_b3_req(struct gigaset_capi_ctr *iif,
-				 struct gigaset_capi_appl *ap,
-				 struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	_cmsg *cmsg = &iif->acmsg;
-	struct bc_state *bcs;
-	int channel;
-
-	/* decode message */
-	if (capi_message2cmsg(cmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, cmsg);
-
-	/* extract and check channel number and NCCI */
-	channel = (cmsg->adr.adrNCCI >> 8) & 0xff;
-	if (!channel || channel > cs->channels ||
-	    ((cmsg->adr.adrNCCI >> 16) & 0xffff) != 1) {
-		dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-			   "DISCONNECT_B3_REQ", "NCCI", cmsg->adr.adrNCCI);
-		send_conf(iif, ap, skb, CapiIllContrPlciNcci);
-		return;
-	}
-	bcs = &cs->bcs[channel - 1];
-
-	/* reject if logical connection not active */
-	if (bcs->apconnstate < APCONN_ACTIVE) {
-		send_conf(iif, ap, skb,
-			  CapiMessageNotSupportedInCurrentState);
-		return;
-	}
-
-	/* trigger hangup, causing eventual DISCONNECT_B3_IND */
-	if (!gigaset_add_event(cs, &bcs->at_state, EV_HUP, NULL, 0, NULL)) {
-		send_conf(iif, ap, skb, CAPI_MSGOSRESOURCEERR);
-		return;
-	}
-	gigaset_schedule_event(cs);
-
-	/* NCPI parameter: not applicable for B3 Transparent */
-	ignore_cstruct_param(cs, cmsg->NCPI,
-			     "DISCONNECT_B3_REQ", "NCPI");
-	send_conf(iif, ap, skb,
-		  (cmsg->NCPI && cmsg->NCPI[0]) ?
-		  CapiNcpiNotSupportedByProtocol : CapiSuccess);
-}
-
-/*
- * process DATA_B3_REQ message
- */
-static void do_data_b3_req(struct gigaset_capi_ctr *iif,
-			   struct gigaset_capi_appl *ap,
-			   struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-	struct bc_state *bcs;
-	int channel = CAPIMSG_PLCI_PART(skb->data);
-	u16 ncci = CAPIMSG_NCCI_PART(skb->data);
-	u16 msglen = CAPIMSG_LEN(skb->data);
-	u16 datalen = CAPIMSG_DATALEN(skb->data);
-	u16 flags = CAPIMSG_FLAGS(skb->data);
-	u16 msgid = CAPIMSG_MSGID(skb->data);
-	u16 handle = CAPIMSG_HANDLE_REQ(skb->data);
-
-	/* frequent message, avoid _cmsg overhead */
-	dump_rawmsg(DEBUG_MCMD, __func__, skb->data);
-
-	/* check parameters */
-	if (channel == 0 || channel > cs->channels || ncci != 1) {
-		dev_notice(cs->dev, "%s: invalid %s 0x%02x\n",
-			   "DATA_B3_REQ", "NCCI", CAPIMSG_NCCI(skb->data));
-		send_conf(iif, ap, skb, CapiIllContrPlciNcci);
-		return;
-	}
-	bcs = &cs->bcs[channel - 1];
-	if (msglen != CAPI_DATA_B3_REQ_LEN && msglen != CAPI_DATA_B3_REQ_LEN64)
-		dev_notice(cs->dev, "%s: unexpected length %d\n",
-			   "DATA_B3_REQ", msglen);
-	if (msglen + datalen != skb->len)
-		dev_notice(cs->dev, "%s: length mismatch (%d+%d!=%d)\n",
-			   "DATA_B3_REQ", msglen, datalen, skb->len);
-	if (msglen + datalen > skb->len) {
-		/* message too short for announced data length */
-		send_conf(iif, ap, skb, CapiIllMessageParmCoding); /* ? */
-		return;
-	}
-	if (flags & CAPI_FLAGS_RESERVED) {
-		dev_notice(cs->dev, "%s: reserved flags set (%x)\n",
-			   "DATA_B3_REQ", flags);
-		send_conf(iif, ap, skb, CapiIllMessageParmCoding);
-		return;
-	}
-
-	/* reject if logical connection not active */
-	if (bcs->apconnstate < APCONN_ACTIVE) {
-		send_conf(iif, ap, skb, CapiMessageNotSupportedInCurrentState);
-		return;
-	}
-
-	/* pull CAPI message into link layer header */
-	skb_reset_mac_header(skb);
-	skb->mac_len = msglen;
-	skb_pull(skb, msglen);
-
-	/* pass to device-specific module */
-	if (cs->ops->send_skb(bcs, skb) < 0) {
-		send_conf(iif, ap, skb, CAPI_MSGOSRESOURCEERR);
-		return;
-	}
-
-	/*
-	 * DATA_B3_CONF will be sent by gigaset_skb_sent() only if "delivery
-	 * confirmation" bit is set; otherwise we have to send it now
-	 */
-	if (!(flags & CAPI_FLAGS_DELIVERY_CONFIRMATION))
-		send_data_b3_conf(cs, &iif->ctr, ap->id, msgid, channel, handle,
-				  flags ? CapiFlagsNotSupportedByProtocol
-				  : CAPI_NOERROR);
-}
-
-/*
- * process RESET_B3_REQ message
- * just always reply "not supported by current protocol"
- */
-static void do_reset_b3_req(struct gigaset_capi_ctr *iif,
-			    struct gigaset_capi_appl *ap,
-			    struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-
-	/* decode message */
-	if (capi_message2cmsg(&iif->acmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg);
-	send_conf(iif, ap, skb,
-		  CapiResetProcedureNotSupportedByCurrentProtocol);
-}
-
-/*
- * unsupported CAPI message handler
- */
-static void do_unsupported(struct gigaset_capi_ctr *iif,
-			   struct gigaset_capi_appl *ap,
-			   struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-
-	/* decode message */
-	if (capi_message2cmsg(&iif->acmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg);
-	send_conf(iif, ap, skb, CapiMessageNotSupportedInCurrentState);
-}
-
-/*
- * CAPI message handler: no-op
- */
-static void do_nothing(struct gigaset_capi_ctr *iif,
-		       struct gigaset_capi_appl *ap,
-		       struct sk_buff *skb)
-{
-	struct cardstate *cs = iif->ctr.driverdata;
-
-	/* decode message */
-	if (capi_message2cmsg(&iif->acmsg, skb->data)) {
-		dev_err(cs->dev, "%s: message parser failure\n", __func__);
-		dev_kfree_skb_any(skb);
-		return;
-	}
-	dump_cmsg(DEBUG_CMD, __func__, &iif->acmsg);
-	dev_kfree_skb_any(skb);
-}
-
-static void do_data_b3_resp(struct gigaset_capi_ctr *iif,
-			    struct gigaset_capi_appl *ap,
-			    struct sk_buff *skb)
-{
-	dump_rawmsg(DEBUG_MCMD, __func__, skb->data);
-	dev_kfree_skb_any(skb);
-}
-
-/* table of outgoing CAPI message handlers with lookup function */
-typedef void (*capi_send_handler_t)(struct gigaset_capi_ctr *,
-				    struct gigaset_capi_appl *,
-				    struct sk_buff *);
-
-static struct {
-	u16 cmd;
-	capi_send_handler_t handler;
-} capi_send_handler_table[] = {
-	/* most frequent messages first for faster lookup */
-	{ CAPI_DATA_B3_REQ, do_data_b3_req },
-	{ CAPI_DATA_B3_RESP, do_data_b3_resp },
-
-	{ CAPI_ALERT_REQ, do_alert_req },
-	{ CAPI_CONNECT_ACTIVE_RESP, do_nothing },
-	{ CAPI_CONNECT_B3_ACTIVE_RESP, do_nothing },
-	{ CAPI_CONNECT_B3_REQ, do_connect_b3_req },
-	{ CAPI_CONNECT_B3_RESP, do_connect_b3_resp },
-	{ CAPI_CONNECT_B3_T90_ACTIVE_RESP, do_nothing },
-	{ CAPI_CONNECT_REQ, do_connect_req },
-	{ CAPI_CONNECT_RESP, do_connect_resp },
-	{ CAPI_DISCONNECT_B3_REQ, do_disconnect_b3_req },
-	{ CAPI_DISCONNECT_B3_RESP, do_nothing },
-	{ CAPI_DISCONNECT_REQ, do_disconnect_req },
-	{ CAPI_DISCONNECT_RESP, do_nothing },
-	{ CAPI_FACILITY_REQ, do_facility_req },
-	{ CAPI_FACILITY_RESP, do_nothing },
-	{ CAPI_LISTEN_REQ, do_listen_req },
-	{ CAPI_SELECT_B_PROTOCOL_REQ, do_unsupported },
-	{ CAPI_RESET_B3_REQ, do_reset_b3_req },
-	{ CAPI_RESET_B3_RESP, do_nothing },
-
-	/*
-	 * ToDo: support overlap sending (requires ev-layer state
-	 * machine extension to generate additional ATD commands)
-	 */
-	{ CAPI_INFO_REQ, do_unsupported },
-	{ CAPI_INFO_RESP, do_nothing },
-
-	/*
-	 * ToDo: what's the proper response for these?
-	 */
-	{ CAPI_MANUFACTURER_REQ, do_nothing },
-	{ CAPI_MANUFACTURER_RESP, do_nothing },
-};
-
-/* look up handler */
-static inline capi_send_handler_t lookup_capi_send_handler(const u16 cmd)
-{
-	size_t i;
-
-	for (i = 0; i < ARRAY_SIZE(capi_send_handler_table); i++)
-		if (capi_send_handler_table[i].cmd == cmd)
-			return capi_send_handler_table[i].handler;
-	return NULL;
-}
-
-
-/**
- * gigaset_send_message() - accept a CAPI message from an application
- * @ctr:	controller descriptor structure.
- * @skb:	CAPI message.
- *
- * Return value: CAPI error code
- * Note: capidrv (and probably others, too) only uses the return value to
- * decide whether it has to free the skb (only if result != CAPI_NOERROR (0))
- */
-static u16 gigaset_send_message(struct capi_ctr *ctr, struct sk_buff *skb)
-{
-	struct gigaset_capi_ctr *iif
-		= container_of(ctr, struct gigaset_capi_ctr, ctr);
-	struct cardstate *cs = ctr->driverdata;
-	struct gigaset_capi_appl *ap;
-	capi_send_handler_t handler;
-
-	/* can only handle linear sk_buffs */
-	if (skb_linearize(skb) < 0) {
-		dev_warn(cs->dev, "%s: skb_linearize failed\n", __func__);
-		return CAPI_MSGOSRESOURCEERR;
-	}
-
-	/* retrieve application data structure */
-	ap = get_appl(iif, CAPIMSG_APPID(skb->data));
-	if (!ap) {
-		dev_notice(cs->dev, "%s: application %u not registered\n",
-			   __func__, CAPIMSG_APPID(skb->data));
-		return CAPI_ILLAPPNR;
-	}
-
-	/* look up command */
-	handler = lookup_capi_send_handler(CAPIMSG_CMD(skb->data));
-	if (!handler) {
-		/* unknown/unsupported message type */
-		if (printk_ratelimit())
-			dev_notice(cs->dev, "%s: unsupported message %u\n",
-				   __func__, CAPIMSG_CMD(skb->data));
-		return CAPI_ILLCMDORSUBCMDORMSGTOSMALL;
-	}
-
-	/* serialize */
-	if (atomic_add_return(1, &iif->sendqlen) > 1) {
-		/* queue behind other messages */
-		skb_queue_tail(&iif->sendqueue, skb);
-		return CAPI_NOERROR;
-	}
-
-	/* process message */
-	handler(iif, ap, skb);
-
-	/* process other messages arrived in the meantime */
-	while (atomic_sub_return(1, &iif->sendqlen) > 0) {
-		skb = skb_dequeue(&iif->sendqueue);
-		if (!skb) {
-			/* should never happen */
-			dev_err(cs->dev, "%s: send queue empty\n", __func__);
-			continue;
-		}
-		ap = get_appl(iif, CAPIMSG_APPID(skb->data));
-		if (!ap) {
-			/* could that happen? */
-			dev_warn(cs->dev, "%s: application %u vanished\n",
-				 __func__, CAPIMSG_APPID(skb->data));
-			continue;
-		}
-		handler = lookup_capi_send_handler(CAPIMSG_CMD(skb->data));
-		if (!handler) {
-			/* should never happen */
-			dev_err(cs->dev, "%s: handler %x vanished\n",
-				__func__, CAPIMSG_CMD(skb->data));
-			continue;
-		}
-		handler(iif, ap, skb);
-	}
-
-	return CAPI_NOERROR;
-}
-
-/**
- * gigaset_procinfo() - build single line description for controller
- * @ctr:	controller descriptor structure.
- *
- * Return value: pointer to generated string (null terminated)
- */
-static char *gigaset_procinfo(struct capi_ctr *ctr)
-{
-	return ctr->name;	/* ToDo: more? */
-}
-
-static int gigaset_proc_show(struct seq_file *m, void *v)
-{
-	struct capi_ctr *ctr = m->private;
-	struct cardstate *cs = ctr->driverdata;
-	char *s;
-	int i;
-
-	seq_printf(m, "%-16s %s\n", "name", ctr->name);
-	seq_printf(m, "%-16s %s %s\n", "dev",
-		   dev_driver_string(cs->dev), dev_name(cs->dev));
-	seq_printf(m, "%-16s %d\n", "id", cs->myid);
-	if (cs->gotfwver)
-		seq_printf(m, "%-16s %d.%d.%d.%d\n", "firmware",
-			   cs->fwver[0], cs->fwver[1], cs->fwver[2], cs->fwver[3]);
-	seq_printf(m, "%-16s %d\n", "channels", cs->channels);
-	seq_printf(m, "%-16s %s\n", "onechannel", cs->onechannel ? "yes" : "no");
-
-	switch (cs->mode) {
-	case M_UNKNOWN:
-		s = "unknown";
-		break;
-	case M_CONFIG:
-		s = "config";
-		break;
-	case M_UNIMODEM:
-		s = "Unimodem";
-		break;
-	case M_CID:
-		s = "CID";
-		break;
-	default:
-		s = "??";
-	}
-	seq_printf(m, "%-16s %s\n", "mode", s);
-
-	switch (cs->mstate) {
-	case MS_UNINITIALIZED:
-		s = "uninitialized";
-		break;
-	case MS_INIT:
-		s = "init";
-		break;
-	case MS_LOCKED:
-		s = "locked";
-		break;
-	case MS_SHUTDOWN:
-		s = "shutdown";
-		break;
-	case MS_RECOVER:
-		s = "recover";
-		break;
-	case MS_READY:
-		s = "ready";
-		break;
-	default:
-		s = "??";
-	}
-	seq_printf(m, "%-16s %s\n", "mstate", s);
-
-	seq_printf(m, "%-16s %s\n", "running", cs->running ? "yes" : "no");
-	seq_printf(m, "%-16s %s\n", "connected", cs->connected ? "yes" : "no");
-	seq_printf(m, "%-16s %s\n", "isdn_up", cs->isdn_up ? "yes" : "no");
-	seq_printf(m, "%-16s %s\n", "cidmode", cs->cidmode ? "yes" : "no");
-
-	for (i = 0; i < cs->channels; i++) {
-		seq_printf(m, "[%d]%-13s %d\n", i, "corrupted",
-			   cs->bcs[i].corrupted);
-		seq_printf(m, "[%d]%-13s %d\n", i, "trans_down",
-			   cs->bcs[i].trans_down);
-		seq_printf(m, "[%d]%-13s %d\n", i, "trans_up",
-			   cs->bcs[i].trans_up);
-		seq_printf(m, "[%d]%-13s %d\n", i, "chstate",
-			   cs->bcs[i].chstate);
-		switch (cs->bcs[i].proto2) {
-		case L2_BITSYNC:
-			s = "bitsync";
-			break;
-		case L2_HDLC:
-			s = "HDLC";
-			break;
-		case L2_VOICE:
-			s = "voice";
-			break;
-		default:
-			s = "??";
-		}
-		seq_printf(m, "[%d]%-13s %s\n", i, "proto2", s);
-	}
-	return 0;
-}
-
-/**
- * gigaset_isdn_regdev() - register device to LL
- * @cs:		device descriptor structure.
- * @isdnid:	device name.
- *
- * Return value: 0 on success, error code < 0 on failure
- */
-int gigaset_isdn_regdev(struct cardstate *cs, const char *isdnid)
-{
-	struct gigaset_capi_ctr *iif;
-	int rc;
-
-	iif = kzalloc(sizeof(*iif), GFP_KERNEL);
-	if (!iif) {
-		pr_err("%s: out of memory\n", __func__);
-		return -ENOMEM;
-	}
-
-	/* prepare controller structure */
-	iif->ctr.owner         = THIS_MODULE;
-	iif->ctr.driverdata    = cs;
-	strncpy(iif->ctr.name, isdnid, sizeof(iif->ctr.name) - 1);
-	iif->ctr.driver_name   = "gigaset";
-	iif->ctr.load_firmware = NULL;
-	iif->ctr.reset_ctr     = NULL;
-	iif->ctr.register_appl = gigaset_register_appl;
-	iif->ctr.release_appl  = gigaset_release_appl;
-	iif->ctr.send_message  = gigaset_send_message;
-	iif->ctr.procinfo      = gigaset_procinfo;
-	iif->ctr.proc_show     = gigaset_proc_show,
-	INIT_LIST_HEAD(&iif->appls);
-	skb_queue_head_init(&iif->sendqueue);
-	atomic_set(&iif->sendqlen, 0);
-
-	/* register controller with CAPI */
-	rc = attach_capi_ctr(&iif->ctr);
-	if (rc) {
-		pr_err("attach_capi_ctr failed (%d)\n", rc);
-		kfree(iif);
-		return rc;
-	}
-
-	cs->iif = iif;
-	cs->hw_hdr_len = CAPI_DATA_B3_REQ_LEN;
-	return 0;
-}
-
-/**
- * gigaset_isdn_unregdev() - unregister device from LL
- * @cs:		device descriptor structure.
- */
-void gigaset_isdn_unregdev(struct cardstate *cs)
-{
-	struct gigaset_capi_ctr *iif = cs->iif;
-
-	detach_capi_ctr(&iif->ctr);
-	kfree(iif);
-	cs->iif = NULL;
-}
-
-static struct capi_driver capi_driver_gigaset = {
-	.name		= "gigaset",
-	.revision	= "1.0",
-};
-
-/**
- * gigaset_isdn_regdrv() - register driver to LL
- */
-void gigaset_isdn_regdrv(void)
-{
-	pr_info("Kernel CAPI interface\n");
-	register_capi_driver(&capi_driver_gigaset);
-}
-
-/**
- * gigaset_isdn_unregdrv() - unregister driver from LL
- */
-void gigaset_isdn_unregdrv(void)
-{
-	unregister_capi_driver(&capi_driver_gigaset);
-}
diff --git a/drivers/staging/isdn/gigaset/common.c b/drivers/staging/isdn/gigaset/common.c
deleted file mode 100644
index 3bb8092858ab..000000000000
--- a/drivers/staging/isdn/gigaset/common.c
+++ /dev/null
@@ -1,1153 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Stuff used by all variants of the driver
- *
- * Copyright (c) 2001 by Stefan Eilers,
- *                       Hansjoerg Lipp <hjlipp@web.de>,
- *                       Tilman Schmidt <tilman@imap.cc>.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-
-/* Version Information */
-#define DRIVER_AUTHOR "Hansjoerg Lipp <hjlipp@web.de>, Tilman Schmidt <tilman@imap.cc>, Stefan Eilers"
-#define DRIVER_DESC "Driver for Gigaset 307x"
-
-#ifdef CONFIG_GIGASET_DEBUG
-#define DRIVER_DESC_DEBUG " (debug build)"
-#else
-#define DRIVER_DESC_DEBUG ""
-#endif
-
-/* Module parameters */
-int gigaset_debuglevel;
-EXPORT_SYMBOL_GPL(gigaset_debuglevel);
-module_param_named(debug, gigaset_debuglevel, int, S_IRUGO | S_IWUSR);
-MODULE_PARM_DESC(debug, "debug level");
-
-/* driver state flags */
-#define VALID_MINOR	0x01
-#define VALID_ID	0x02
-
-/**
- * gigaset_dbg_buffer() - dump data in ASCII and hex for debugging
- * @level:	debugging level.
- * @msg:	message prefix.
- * @len:	number of bytes to dump.
- * @buf:	data to dump.
- *
- * If the current debugging level includes one of the bits set in @level,
- * @len bytes starting at @buf are logged to dmesg at KERN_DEBUG prio,
- * prefixed by the text @msg.
- */
-void gigaset_dbg_buffer(enum debuglevel level, const unsigned char *msg,
-			size_t len, const unsigned char *buf)
-{
-	unsigned char outbuf[80];
-	unsigned char c;
-	size_t space = sizeof outbuf - 1;
-	unsigned char *out = outbuf;
-	size_t numin = len;
-
-	while (numin--) {
-		c = *buf++;
-		if (c == '~' || c == '^' || c == '\\') {
-			if (!space--)
-				break;
-			*out++ = '\\';
-		}
-		if (c & 0x80) {
-			if (!space--)
-				break;
-			*out++ = '~';
-			c ^= 0x80;
-		}
-		if (c < 0x20 || c == 0x7f) {
-			if (!space--)
-				break;
-			*out++ = '^';
-			c ^= 0x40;
-		}
-		if (!space--)
-			break;
-		*out++ = c;
-	}
-	*out = 0;
-
-	gig_dbg(level, "%s (%u bytes): %s", msg, (unsigned) len, outbuf);
-}
-EXPORT_SYMBOL_GPL(gigaset_dbg_buffer);
-
-static int setflags(struct cardstate *cs, unsigned flags, unsigned delay)
-{
-	int r;
-
-	r = cs->ops->set_modem_ctrl(cs, cs->control_state, flags);
-	cs->control_state = flags;
-	if (r < 0)
-		return r;
-
-	if (delay) {
-		set_current_state(TASK_INTERRUPTIBLE);
-		schedule_timeout(delay * HZ / 1000);
-	}
-
-	return 0;
-}
-
-int gigaset_enterconfigmode(struct cardstate *cs)
-{
-	int i, r;
-
-	cs->control_state = TIOCM_RTS;
-
-	r = setflags(cs, TIOCM_DTR, 200);
-	if (r < 0)
-		goto error;
-	r = setflags(cs, 0, 200);
-	if (r < 0)
-		goto error;
-	for (i = 0; i < 5; ++i) {
-		r = setflags(cs, TIOCM_RTS, 100);
-		if (r < 0)
-			goto error;
-		r = setflags(cs, 0, 100);
-		if (r < 0)
-			goto error;
-	}
-	r = setflags(cs, TIOCM_RTS | TIOCM_DTR, 800);
-	if (r < 0)
-		goto error;
-
-	return 0;
-
-error:
-	dev_err(cs->dev, "error %d on setuartbits\n", -r);
-	cs->control_state = TIOCM_RTS | TIOCM_DTR;
-	cs->ops->set_modem_ctrl(cs, 0, TIOCM_RTS | TIOCM_DTR);
-
-	return -1;
-}
-
-static int test_timeout(struct at_state_t *at_state)
-{
-	if (!at_state->timer_expires)
-		return 0;
-
-	if (--at_state->timer_expires) {
-		gig_dbg(DEBUG_MCMD, "decreased timer of %p to %lu",
-			at_state, at_state->timer_expires);
-		return 0;
-	}
-
-	gigaset_add_event(at_state->cs, at_state, EV_TIMEOUT, NULL,
-			  at_state->timer_index, NULL);
-	return 1;
-}
-
-static void timer_tick(struct timer_list *t)
-{
-	struct cardstate *cs = from_timer(cs, t, timer);
-	unsigned long flags;
-	unsigned channel;
-	struct at_state_t *at_state;
-	int timeout = 0;
-
-	spin_lock_irqsave(&cs->lock, flags);
-
-	for (channel = 0; channel < cs->channels; ++channel)
-		if (test_timeout(&cs->bcs[channel].at_state))
-			timeout = 1;
-
-	if (test_timeout(&cs->at_state))
-		timeout = 1;
-
-	list_for_each_entry(at_state, &cs->temp_at_states, list)
-		if (test_timeout(at_state))
-			timeout = 1;
-
-	if (cs->running) {
-		mod_timer(&cs->timer, jiffies + msecs_to_jiffies(GIG_TICK));
-		if (timeout) {
-			gig_dbg(DEBUG_EVENT, "scheduling timeout");
-			tasklet_schedule(&cs->event_tasklet);
-		}
-	}
-
-	spin_unlock_irqrestore(&cs->lock, flags);
-}
-
-int gigaset_get_channel(struct bc_state *bcs)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&bcs->cs->lock, flags);
-	if (bcs->use_count || !try_module_get(bcs->cs->driver->owner)) {
-		gig_dbg(DEBUG_CHANNEL, "could not allocate channel %d",
-			bcs->channel);
-		spin_unlock_irqrestore(&bcs->cs->lock, flags);
-		return -EBUSY;
-	}
-	++bcs->use_count;
-	bcs->busy = 1;
-	gig_dbg(DEBUG_CHANNEL, "allocated channel %d", bcs->channel);
-	spin_unlock_irqrestore(&bcs->cs->lock, flags);
-	return 0;
-}
-
-struct bc_state *gigaset_get_free_channel(struct cardstate *cs)
-{
-	unsigned long flags;
-	int i;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (!try_module_get(cs->driver->owner)) {
-		gig_dbg(DEBUG_CHANNEL,
-			"could not get module for allocating channel");
-		spin_unlock_irqrestore(&cs->lock, flags);
-		return NULL;
-	}
-	for (i = 0; i < cs->channels; ++i)
-		if (!cs->bcs[i].use_count) {
-			++cs->bcs[i].use_count;
-			cs->bcs[i].busy = 1;
-			spin_unlock_irqrestore(&cs->lock, flags);
-			gig_dbg(DEBUG_CHANNEL, "allocated channel %d", i);
-			return cs->bcs + i;
-		}
-	module_put(cs->driver->owner);
-	spin_unlock_irqrestore(&cs->lock, flags);
-	gig_dbg(DEBUG_CHANNEL, "no free channel");
-	return NULL;
-}
-
-void gigaset_free_channel(struct bc_state *bcs)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&bcs->cs->lock, flags);
-	if (!bcs->busy) {
-		gig_dbg(DEBUG_CHANNEL, "could not free channel %d",
-			bcs->channel);
-		spin_unlock_irqrestore(&bcs->cs->lock, flags);
-		return;
-	}
-	--bcs->use_count;
-	bcs->busy = 0;
-	module_put(bcs->cs->driver->owner);
-	gig_dbg(DEBUG_CHANNEL, "freed channel %d", bcs->channel);
-	spin_unlock_irqrestore(&bcs->cs->lock, flags);
-}
-
-int gigaset_get_channels(struct cardstate *cs)
-{
-	unsigned long flags;
-	int i;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	for (i = 0; i < cs->channels; ++i)
-		if (cs->bcs[i].use_count) {
-			spin_unlock_irqrestore(&cs->lock, flags);
-			gig_dbg(DEBUG_CHANNEL,
-				"could not allocate all channels");
-			return -EBUSY;
-		}
-	for (i = 0; i < cs->channels; ++i)
-		++cs->bcs[i].use_count;
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	gig_dbg(DEBUG_CHANNEL, "allocated all channels");
-
-	return 0;
-}
-
-void gigaset_free_channels(struct cardstate *cs)
-{
-	unsigned long flags;
-	int i;
-
-	gig_dbg(DEBUG_CHANNEL, "unblocking all channels");
-	spin_lock_irqsave(&cs->lock, flags);
-	for (i = 0; i < cs->channels; ++i)
-		--cs->bcs[i].use_count;
-	spin_unlock_irqrestore(&cs->lock, flags);
-}
-
-void gigaset_block_channels(struct cardstate *cs)
-{
-	unsigned long flags;
-	int i;
-
-	gig_dbg(DEBUG_CHANNEL, "blocking all channels");
-	spin_lock_irqsave(&cs->lock, flags);
-	for (i = 0; i < cs->channels; ++i)
-		++cs->bcs[i].use_count;
-	spin_unlock_irqrestore(&cs->lock, flags);
-}
-
-static void clear_events(struct cardstate *cs)
-{
-	struct event_t *ev;
-	unsigned head, tail;
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->ev_lock, flags);
-
-	head = cs->ev_head;
-	tail = cs->ev_tail;
-
-	while (tail != head) {
-		ev = cs->events + head;
-		kfree(ev->ptr);
-		head = (head + 1) % MAX_EVENTS;
-	}
-
-	cs->ev_head = tail;
-
-	spin_unlock_irqrestore(&cs->ev_lock, flags);
-}
-
-/**
- * gigaset_add_event() - add event to device event queue
- * @cs:		device descriptor structure.
- * @at_state:	connection state structure.
- * @type:	event type.
- * @ptr:	pointer parameter for event.
- * @parameter:	integer parameter for event.
- * @arg:	pointer parameter for event.
- *
- * Allocate an event queue entry from the device's event queue, and set it up
- * with the parameters given.
- *
- * Return value: added event
- */
-struct event_t *gigaset_add_event(struct cardstate *cs,
-				  struct at_state_t *at_state, int type,
-				  void *ptr, int parameter, void *arg)
-{
-	unsigned long flags;
-	unsigned next, tail;
-	struct event_t *event = NULL;
-
-	gig_dbg(DEBUG_EVENT, "queueing event %d", type);
-
-	spin_lock_irqsave(&cs->ev_lock, flags);
-
-	tail = cs->ev_tail;
-	next = (tail + 1) % MAX_EVENTS;
-	if (unlikely(next == cs->ev_head))
-		dev_err(cs->dev, "event queue full\n");
-	else {
-		event = cs->events + tail;
-		event->type = type;
-		event->at_state = at_state;
-		event->cid = -1;
-		event->ptr = ptr;
-		event->arg = arg;
-		event->parameter = parameter;
-		cs->ev_tail = next;
-	}
-
-	spin_unlock_irqrestore(&cs->ev_lock, flags);
-
-	return event;
-}
-EXPORT_SYMBOL_GPL(gigaset_add_event);
-
-static void clear_at_state(struct at_state_t *at_state)
-{
-	int i;
-
-	for (i = 0; i < STR_NUM; ++i) {
-		kfree(at_state->str_var[i]);
-		at_state->str_var[i] = NULL;
-	}
-}
-
-static void dealloc_temp_at_states(struct cardstate *cs)
-{
-	struct at_state_t *cur, *next;
-
-	list_for_each_entry_safe(cur, next, &cs->temp_at_states, list) {
-		list_del(&cur->list);
-		clear_at_state(cur);
-		kfree(cur);
-	}
-}
-
-static void gigaset_freebcs(struct bc_state *bcs)
-{
-	int i;
-
-	gig_dbg(DEBUG_INIT, "freeing bcs[%d]->hw", bcs->channel);
-	bcs->cs->ops->freebcshw(bcs);
-
-	gig_dbg(DEBUG_INIT, "clearing bcs[%d]->at_state", bcs->channel);
-	clear_at_state(&bcs->at_state);
-	gig_dbg(DEBUG_INIT, "freeing bcs[%d]->skb", bcs->channel);
-	dev_kfree_skb(bcs->rx_skb);
-	bcs->rx_skb = NULL;
-
-	for (i = 0; i < AT_NUM; ++i) {
-		kfree(bcs->commands[i]);
-		bcs->commands[i] = NULL;
-	}
-}
-
-static struct cardstate *alloc_cs(struct gigaset_driver *drv)
-{
-	unsigned long flags;
-	unsigned i;
-	struct cardstate *cs;
-	struct cardstate *ret = NULL;
-
-	spin_lock_irqsave(&drv->lock, flags);
-	if (drv->blocked)
-		goto exit;
-	for (i = 0; i < drv->minors; ++i) {
-		cs = drv->cs + i;
-		if (!(cs->flags & VALID_MINOR)) {
-			cs->flags = VALID_MINOR;
-			ret = cs;
-			break;
-		}
-	}
-exit:
-	spin_unlock_irqrestore(&drv->lock, flags);
-	return ret;
-}
-
-static void free_cs(struct cardstate *cs)
-{
-	cs->flags = 0;
-}
-
-static void make_valid(struct cardstate *cs, unsigned mask)
-{
-	unsigned long flags;
-	struct gigaset_driver *drv = cs->driver;
-	spin_lock_irqsave(&drv->lock, flags);
-	cs->flags |= mask;
-	spin_unlock_irqrestore(&drv->lock, flags);
-}
-
-static void make_invalid(struct cardstate *cs, unsigned mask)
-{
-	unsigned long flags;
-	struct gigaset_driver *drv = cs->driver;
-	spin_lock_irqsave(&drv->lock, flags);
-	cs->flags &= ~mask;
-	spin_unlock_irqrestore(&drv->lock, flags);
-}
-
-/**
- * gigaset_freecs() - free all associated ressources of a device
- * @cs:		device descriptor structure.
- *
- * Stops all tasklets and timers, unregisters the device from all
- * subsystems it was registered to, deallocates the device structure
- * @cs and all structures referenced from it.
- * Operations on the device should be stopped before calling this.
- */
-void gigaset_freecs(struct cardstate *cs)
-{
-	int i;
-	unsigned long flags;
-
-	if (!cs)
-		return;
-
-	mutex_lock(&cs->mutex);
-
-	spin_lock_irqsave(&cs->lock, flags);
-	cs->running = 0;
-	spin_unlock_irqrestore(&cs->lock, flags); /* event handler and timer are
-						     not rescheduled below */
-
-	tasklet_kill(&cs->event_tasklet);
-	del_timer_sync(&cs->timer);
-
-	switch (cs->cs_init) {
-	default:
-		/* clear B channel structures */
-		for (i = 0; i < cs->channels; ++i) {
-			gig_dbg(DEBUG_INIT, "clearing bcs[%d]", i);
-			gigaset_freebcs(cs->bcs + i);
-		}
-
-		/* clear device sysfs */
-		gigaset_free_dev_sysfs(cs);
-
-		gigaset_if_free(cs);
-
-		gig_dbg(DEBUG_INIT, "clearing hw");
-		cs->ops->freecshw(cs);
-
-		/* fall through */
-	case 2: /* error in initcshw */
-		/* Deregister from LL */
-		make_invalid(cs, VALID_ID);
-		gigaset_isdn_unregdev(cs);
-
-		/* fall through */
-	case 1: /* error when registering to LL */
-		gig_dbg(DEBUG_INIT, "clearing at_state");
-		clear_at_state(&cs->at_state);
-		dealloc_temp_at_states(cs);
-		clear_events(cs);
-		tty_port_destroy(&cs->port);
-
-		/* fall through */
-	case 0:	/* error in basic setup */
-		gig_dbg(DEBUG_INIT, "freeing inbuf");
-		kfree(cs->inbuf);
-		kfree(cs->bcs);
-	}
-
-	mutex_unlock(&cs->mutex);
-	free_cs(cs);
-}
-EXPORT_SYMBOL_GPL(gigaset_freecs);
-
-void gigaset_at_init(struct at_state_t *at_state, struct bc_state *bcs,
-		     struct cardstate *cs, int cid)
-{
-	int i;
-
-	INIT_LIST_HEAD(&at_state->list);
-	at_state->waiting = 0;
-	at_state->getstring = 0;
-	at_state->pending_commands = 0;
-	at_state->timer_expires = 0;
-	at_state->timer_active = 0;
-	at_state->timer_index = 0;
-	at_state->seq_index = 0;
-	at_state->ConState = 0;
-	for (i = 0; i < STR_NUM; ++i)
-		at_state->str_var[i] = NULL;
-	at_state->int_var[VAR_ZDLE] = 0;
-	at_state->int_var[VAR_ZCTP] = -1;
-	at_state->int_var[VAR_ZSAU] = ZSAU_NULL;
-	at_state->cs = cs;
-	at_state->bcs = bcs;
-	at_state->cid = cid;
-	if (!cid)
-		at_state->replystruct = cs->tabnocid;
-	else
-		at_state->replystruct = cs->tabcid;
-}
-
-
-static void gigaset_inbuf_init(struct inbuf_t *inbuf, struct cardstate *cs)
-/* inbuf->read must be allocated before! */
-{
-	inbuf->head = 0;
-	inbuf->tail = 0;
-	inbuf->cs = cs;
-	inbuf->inputstate = INS_command;
-}
-
-/**
- * gigaset_fill_inbuf() - append received data to input buffer
- * @inbuf:	buffer structure.
- * @src:	received data.
- * @numbytes:	number of bytes received.
- *
- * Return value: !=0 if some data was appended
- */
-int gigaset_fill_inbuf(struct inbuf_t *inbuf, const unsigned char *src,
-		       unsigned numbytes)
-{
-	unsigned n, head, tail, bytesleft;
-
-	gig_dbg(DEBUG_INTR, "received %u bytes", numbytes);
-
-	if (!numbytes)
-		return 0;
-
-	bytesleft = numbytes;
-	tail = inbuf->tail;
-	head = inbuf->head;
-	gig_dbg(DEBUG_INTR, "buffer state: %u -> %u", head, tail);
-
-	while (bytesleft) {
-		if (head > tail)
-			n = head - 1 - tail;
-		else if (head == 0)
-			n = (RBUFSIZE - 1) - tail;
-		else
-			n = RBUFSIZE - tail;
-		if (!n) {
-			dev_err(inbuf->cs->dev,
-				"buffer overflow (%u bytes lost)\n",
-				bytesleft);
-			break;
-		}
-		if (n > bytesleft)
-			n = bytesleft;
-		memcpy(inbuf->data + tail, src, n);
-		bytesleft -= n;
-		tail = (tail + n) % RBUFSIZE;
-		src += n;
-	}
-	gig_dbg(DEBUG_INTR, "setting tail to %u", tail);
-	inbuf->tail = tail;
-	return numbytes != bytesleft;
-}
-EXPORT_SYMBOL_GPL(gigaset_fill_inbuf);
-
-/* Initialize the b-channel structure */
-static int gigaset_initbcs(struct bc_state *bcs, struct cardstate *cs,
-			   int channel)
-{
-	int i;
-
-	bcs->tx_skb = NULL;
-
-	skb_queue_head_init(&bcs->squeue);
-
-	bcs->corrupted = 0;
-	bcs->trans_down = 0;
-	bcs->trans_up = 0;
-
-	gig_dbg(DEBUG_INIT, "setting up bcs[%d]->at_state", channel);
-	gigaset_at_init(&bcs->at_state, bcs, cs, -1);
-
-#ifdef CONFIG_GIGASET_DEBUG
-	bcs->emptycount = 0;
-#endif
-
-	bcs->rx_bufsize = 0;
-	bcs->rx_skb = NULL;
-	bcs->rx_fcs = PPP_INITFCS;
-	bcs->inputstate = 0;
-	bcs->channel = channel;
-	bcs->cs = cs;
-
-	bcs->chstate = 0;
-	bcs->use_count = 1;
-	bcs->busy = 0;
-	bcs->ignore = cs->ignoreframes;
-
-	for (i = 0; i < AT_NUM; ++i)
-		bcs->commands[i] = NULL;
-
-	spin_lock_init(&bcs->aplock);
-	bcs->ap = NULL;
-	bcs->apconnstate = 0;
-
-	gig_dbg(DEBUG_INIT, "  setting up bcs[%d]->hw", channel);
-	return cs->ops->initbcshw(bcs);
-}
-
-/**
- * gigaset_initcs() - initialize device structure
- * @drv:	hardware driver the device belongs to
- * @channels:	number of B channels supported by device
- * @onechannel:	!=0 if B channel data and AT commands share one
- *		    communication channel (M10x),
- *		==0 if B channels have separate communication channels (base)
- * @ignoreframes:	number of frames to ignore after setting up B channel
- * @cidmode:	!=0: start in CallID mode
- * @modulename:	name of driver module for LL registration
- *
- * Allocate and initialize cardstate structure for Gigaset driver
- * Calls hardware dependent gigaset_initcshw() function
- * Calls B channel initialization function gigaset_initbcs() for each B channel
- *
- * Return value:
- *	pointer to cardstate structure
- */
-struct cardstate *gigaset_initcs(struct gigaset_driver *drv, int channels,
-				 int onechannel, int ignoreframes,
-				 int cidmode, const char *modulename)
-{
-	struct cardstate *cs;
-	unsigned long flags;
-	int i;
-
-	gig_dbg(DEBUG_INIT, "allocating cs");
-	cs = alloc_cs(drv);
-	if (!cs) {
-		pr_err("maximum number of devices exceeded\n");
-		return NULL;
-	}
-
-	cs->cs_init = 0;
-	cs->channels = channels;
-	cs->onechannel = onechannel;
-	cs->ignoreframes = ignoreframes;
-	INIT_LIST_HEAD(&cs->temp_at_states);
-	cs->running = 0;
-	timer_setup(&cs->timer, timer_tick, 0);
-	spin_lock_init(&cs->ev_lock);
-	cs->ev_tail = 0;
-	cs->ev_head = 0;
-
-	tasklet_init(&cs->event_tasklet, gigaset_handle_event,
-		     (unsigned long) cs);
-	tty_port_init(&cs->port);
-	cs->commands_pending = 0;
-	cs->cur_at_seq = 0;
-	cs->gotfwver = -1;
-	cs->dev = NULL;
-	cs->tty_dev = NULL;
-	cs->cidmode = cidmode != 0;
-	cs->tabnocid = gigaset_tab_nocid;
-	cs->tabcid = gigaset_tab_cid;
-
-	init_waitqueue_head(&cs->waitqueue);
-	cs->waiting = 0;
-
-	cs->mode = M_UNKNOWN;
-	cs->mstate = MS_UNINITIALIZED;
-
-	cs->bcs = kmalloc_array(channels, sizeof(struct bc_state), GFP_KERNEL);
-	cs->inbuf = kmalloc(sizeof(struct inbuf_t), GFP_KERNEL);
-	if (!cs->bcs || !cs->inbuf) {
-		pr_err("out of memory\n");
-		goto error;
-	}
-	++cs->cs_init;
-
-	gig_dbg(DEBUG_INIT, "setting up at_state");
-	spin_lock_init(&cs->lock);
-	gigaset_at_init(&cs->at_state, NULL, cs, 0);
-	cs->dle = 0;
-	cs->cbytes = 0;
-
-	gig_dbg(DEBUG_INIT, "setting up inbuf");
-	gigaset_inbuf_init(cs->inbuf, cs);
-
-	cs->connected = 0;
-	cs->isdn_up = 0;
-
-	gig_dbg(DEBUG_INIT, "setting up cmdbuf");
-	cs->cmdbuf = cs->lastcmdbuf = NULL;
-	spin_lock_init(&cs->cmdlock);
-	cs->curlen = 0;
-	cs->cmdbytes = 0;
-
-	gig_dbg(DEBUG_INIT, "setting up iif");
-	if (gigaset_isdn_regdev(cs, modulename) < 0) {
-		pr_err("error registering ISDN device\n");
-		goto error;
-	}
-
-	make_valid(cs, VALID_ID);
-	++cs->cs_init;
-	gig_dbg(DEBUG_INIT, "setting up hw");
-	if (cs->ops->initcshw(cs) < 0)
-		goto error;
-
-	++cs->cs_init;
-
-	/* set up character device */
-	gigaset_if_init(cs);
-
-	/* set up device sysfs */
-	gigaset_init_dev_sysfs(cs);
-
-	/* set up channel data structures */
-	for (i = 0; i < channels; ++i) {
-		gig_dbg(DEBUG_INIT, "setting up bcs[%d]", i);
-		if (gigaset_initbcs(cs->bcs + i, cs, i) < 0) {
-			pr_err("could not allocate channel %d data\n", i);
-			goto error;
-		}
-	}
-
-	spin_lock_irqsave(&cs->lock, flags);
-	cs->running = 1;
-	spin_unlock_irqrestore(&cs->lock, flags);
-	cs->timer.expires = jiffies + msecs_to_jiffies(GIG_TICK);
-	add_timer(&cs->timer);
-
-	gig_dbg(DEBUG_INIT, "cs initialized");
-	return cs;
-
-error:
-	gig_dbg(DEBUG_INIT, "failed");
-	gigaset_freecs(cs);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(gigaset_initcs);
-
-/* ReInitialize the b-channel structure on hangup */
-void gigaset_bcs_reinit(struct bc_state *bcs)
-{
-	struct sk_buff *skb;
-	struct cardstate *cs = bcs->cs;
-	unsigned long flags;
-
-	while ((skb = skb_dequeue(&bcs->squeue)) != NULL)
-		dev_kfree_skb(skb);
-
-	spin_lock_irqsave(&cs->lock, flags);
-	clear_at_state(&bcs->at_state);
-	bcs->at_state.ConState = 0;
-	bcs->at_state.timer_active = 0;
-	bcs->at_state.timer_expires = 0;
-	bcs->at_state.cid = -1;			/* No CID defined */
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	bcs->inputstate = 0;
-
-#ifdef CONFIG_GIGASET_DEBUG
-	bcs->emptycount = 0;
-#endif
-
-	bcs->rx_fcs = PPP_INITFCS;
-	bcs->chstate = 0;
-
-	bcs->ignore = cs->ignoreframes;
-	dev_kfree_skb(bcs->rx_skb);
-	bcs->rx_skb = NULL;
-
-	cs->ops->reinitbcshw(bcs);
-}
-
-static void cleanup_cs(struct cardstate *cs)
-{
-	struct cmdbuf_t *cb, *tcb;
-	int i;
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->lock, flags);
-
-	cs->mode = M_UNKNOWN;
-	cs->mstate = MS_UNINITIALIZED;
-
-	clear_at_state(&cs->at_state);
-	dealloc_temp_at_states(cs);
-	gigaset_at_init(&cs->at_state, NULL, cs, 0);
-
-	cs->inbuf->inputstate = INS_command;
-	cs->inbuf->head = 0;
-	cs->inbuf->tail = 0;
-
-	cb = cs->cmdbuf;
-	while (cb) {
-		tcb = cb;
-		cb = cb->next;
-		kfree(tcb);
-	}
-	cs->cmdbuf = cs->lastcmdbuf = NULL;
-	cs->curlen = 0;
-	cs->cmdbytes = 0;
-	cs->gotfwver = -1;
-	cs->dle = 0;
-	cs->cur_at_seq = 0;
-	cs->commands_pending = 0;
-	cs->cbytes = 0;
-
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	for (i = 0; i < cs->channels; ++i) {
-		gigaset_freebcs(cs->bcs + i);
-		if (gigaset_initbcs(cs->bcs + i, cs, i) < 0)
-			pr_err("could not allocate channel %d data\n", i);
-	}
-
-	if (cs->waiting) {
-		cs->cmd_result = -ENODEV;
-		cs->waiting = 0;
-		wake_up_interruptible(&cs->waitqueue);
-	}
-}
-
-
-/**
- * gigaset_start() - start device operations
- * @cs:		device descriptor structure.
- *
- * Prepares the device for use by setting up communication parameters,
- * scheduling an EV_START event to initiate device initialization, and
- * waiting for completion of the initialization.
- *
- * Return value:
- *	0 on success, error code < 0 on failure
- */
-int gigaset_start(struct cardstate *cs)
-{
-	unsigned long flags;
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -EBUSY;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	cs->connected = 1;
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	if (cs->mstate != MS_LOCKED) {
-		cs->ops->set_modem_ctrl(cs, 0, TIOCM_DTR | TIOCM_RTS);
-		cs->ops->baud_rate(cs, B115200);
-		cs->ops->set_line_ctrl(cs, CS8);
-		cs->control_state = TIOCM_DTR | TIOCM_RTS;
-	}
-
-	cs->waiting = 1;
-
-	if (!gigaset_add_event(cs, &cs->at_state, EV_START, NULL, 0, NULL)) {
-		cs->waiting = 0;
-		goto error;
-	}
-	gigaset_schedule_event(cs);
-
-	wait_event(cs->waitqueue, !cs->waiting);
-
-	mutex_unlock(&cs->mutex);
-	return 0;
-
-error:
-	mutex_unlock(&cs->mutex);
-	return -ENOMEM;
-}
-EXPORT_SYMBOL_GPL(gigaset_start);
-
-/**
- * gigaset_shutdown() - shut down device operations
- * @cs:		device descriptor structure.
- *
- * Deactivates the device by scheduling an EV_SHUTDOWN event and
- * waiting for completion of the shutdown.
- *
- * Return value:
- *	0 - success, -ENODEV - error (no device associated)
- */
-int gigaset_shutdown(struct cardstate *cs)
-{
-	mutex_lock(&cs->mutex);
-
-	if (!(cs->flags & VALID_MINOR)) {
-		mutex_unlock(&cs->mutex);
-		return -ENODEV;
-	}
-
-	cs->waiting = 1;
-
-	if (!gigaset_add_event(cs, &cs->at_state, EV_SHUTDOWN, NULL, 0, NULL))
-		goto exit;
-	gigaset_schedule_event(cs);
-
-	wait_event(cs->waitqueue, !cs->waiting);
-
-	cleanup_cs(cs);
-
-exit:
-	mutex_unlock(&cs->mutex);
-	return 0;
-}
-EXPORT_SYMBOL_GPL(gigaset_shutdown);
-
-/**
- * gigaset_stop() - stop device operations
- * @cs:		device descriptor structure.
- *
- * Stops operations on the device by scheduling an EV_STOP event and
- * waiting for completion of the shutdown.
- */
-void gigaset_stop(struct cardstate *cs)
-{
-	mutex_lock(&cs->mutex);
-
-	cs->waiting = 1;
-
-	if (!gigaset_add_event(cs, &cs->at_state, EV_STOP, NULL, 0, NULL))
-		goto exit;
-	gigaset_schedule_event(cs);
-
-	wait_event(cs->waitqueue, !cs->waiting);
-
-	cleanup_cs(cs);
-
-exit:
-	mutex_unlock(&cs->mutex);
-}
-EXPORT_SYMBOL_GPL(gigaset_stop);
-
-static LIST_HEAD(drivers);
-static DEFINE_SPINLOCK(driver_lock);
-
-struct cardstate *gigaset_get_cs_by_id(int id)
-{
-	unsigned long flags;
-	struct cardstate *ret = NULL;
-	struct cardstate *cs;
-	struct gigaset_driver *drv;
-	unsigned i;
-
-	spin_lock_irqsave(&driver_lock, flags);
-	list_for_each_entry(drv, &drivers, list) {
-		spin_lock(&drv->lock);
-		for (i = 0; i < drv->minors; ++i) {
-			cs = drv->cs + i;
-			if ((cs->flags & VALID_ID) && cs->myid == id) {
-				ret = cs;
-				break;
-			}
-		}
-		spin_unlock(&drv->lock);
-		if (ret)
-			break;
-	}
-	spin_unlock_irqrestore(&driver_lock, flags);
-	return ret;
-}
-
-static struct cardstate *gigaset_get_cs_by_minor(unsigned minor)
-{
-	unsigned long flags;
-	struct cardstate *ret = NULL;
-	struct gigaset_driver *drv;
-	unsigned index;
-
-	spin_lock_irqsave(&driver_lock, flags);
-	list_for_each_entry(drv, &drivers, list) {
-		if (minor < drv->minor || minor >= drv->minor + drv->minors)
-			continue;
-		index = minor - drv->minor;
-		spin_lock(&drv->lock);
-		if (drv->cs[index].flags & VALID_MINOR)
-			ret = drv->cs + index;
-		spin_unlock(&drv->lock);
-		if (ret)
-			break;
-	}
-	spin_unlock_irqrestore(&driver_lock, flags);
-	return ret;
-}
-
-struct cardstate *gigaset_get_cs_by_tty(struct tty_struct *tty)
-{
-	return gigaset_get_cs_by_minor(tty->index + tty->driver->minor_start);
-}
-
-/**
- * gigaset_freedriver() - free all associated ressources of a driver
- * @drv:	driver descriptor structure.
- *
- * Unregisters the driver from the system and deallocates the driver
- * structure @drv and all structures referenced from it.
- * All devices should be shut down before calling this.
- */
-void gigaset_freedriver(struct gigaset_driver *drv)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&driver_lock, flags);
-	list_del(&drv->list);
-	spin_unlock_irqrestore(&driver_lock, flags);
-
-	gigaset_if_freedriver(drv);
-
-	kfree(drv->cs);
-	kfree(drv);
-}
-EXPORT_SYMBOL_GPL(gigaset_freedriver);
-
-/**
- * gigaset_initdriver() - initialize driver structure
- * @minor:	First minor number
- * @minors:	Number of minors this driver can handle
- * @procname:	Name of the driver
- * @devname:	Name of the device files (prefix without minor number)
- *
- * Allocate and initialize gigaset_driver structure. Initialize interface.
- *
- * Return value:
- *	Pointer to the gigaset_driver structure on success, NULL on failure.
- */
-struct gigaset_driver *gigaset_initdriver(unsigned minor, unsigned minors,
-					  const char *procname,
-					  const char *devname,
-					  const struct gigaset_ops *ops,
-					  struct module *owner)
-{
-	struct gigaset_driver *drv;
-	unsigned long flags;
-	unsigned i;
-
-	drv = kmalloc(sizeof *drv, GFP_KERNEL);
-	if (!drv)
-		return NULL;
-
-	drv->have_tty = 0;
-	drv->minor = minor;
-	drv->minors = minors;
-	spin_lock_init(&drv->lock);
-	drv->blocked = 0;
-	drv->ops = ops;
-	drv->owner = owner;
-	INIT_LIST_HEAD(&drv->list);
-
-	drv->cs = kmalloc_array(minors, sizeof(*drv->cs), GFP_KERNEL);
-	if (!drv->cs)
-		goto error;
-
-	for (i = 0; i < minors; ++i) {
-		drv->cs[i].flags = 0;
-		drv->cs[i].driver = drv;
-		drv->cs[i].ops = drv->ops;
-		drv->cs[i].minor_index = i;
-		mutex_init(&drv->cs[i].mutex);
-	}
-
-	gigaset_if_initdriver(drv, procname, devname);
-
-	spin_lock_irqsave(&driver_lock, flags);
-	list_add(&drv->list, &drivers);
-	spin_unlock_irqrestore(&driver_lock, flags);
-
-	return drv;
-
-error:
-	kfree(drv);
-	return NULL;
-}
-EXPORT_SYMBOL_GPL(gigaset_initdriver);
-
-/**
- * gigaset_blockdriver() - block driver
- * @drv:	driver descriptor structure.
- *
- * Prevents the driver from attaching new devices, in preparation for
- * deregistration.
- */
-void gigaset_blockdriver(struct gigaset_driver *drv)
-{
-	drv->blocked = 1;
-}
-EXPORT_SYMBOL_GPL(gigaset_blockdriver);
-
-static int __init gigaset_init_module(void)
-{
-	/* in accordance with the principle of least astonishment,
-	 * setting the 'debug' parameter to 1 activates a sensible
-	 * set of default debug levels
-	 */
-	if (gigaset_debuglevel == 1)
-		gigaset_debuglevel = DEBUG_DEFAULT;
-
-	pr_info(DRIVER_DESC DRIVER_DESC_DEBUG "\n");
-	gigaset_isdn_regdrv();
-	return 0;
-}
-
-static void __exit gigaset_exit_module(void)
-{
-	gigaset_isdn_unregdrv();
-}
-
-module_init(gigaset_init_module);
-module_exit(gigaset_exit_module);
-
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-
-MODULE_LICENSE("GPL");
diff --git a/drivers/staging/isdn/gigaset/dummyll.c b/drivers/staging/isdn/gigaset/dummyll.c
deleted file mode 100644
index 4b9637e5da6e..000000000000
--- a/drivers/staging/isdn/gigaset/dummyll.c
+++ /dev/null
@@ -1,74 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Dummy LL interface for the Gigaset driver
- *
- * Copyright (c) 2009 by Tilman Schmidt <tilman@imap.cc>.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include <linux/export.h>
-#include "gigaset.h"
-
-void gigaset_skb_sent(struct bc_state *bcs, struct sk_buff *skb)
-{
-}
-EXPORT_SYMBOL_GPL(gigaset_skb_sent);
-
-void gigaset_skb_rcvd(struct bc_state *bcs, struct sk_buff *skb)
-{
-}
-EXPORT_SYMBOL_GPL(gigaset_skb_rcvd);
-
-void gigaset_isdn_rcv_err(struct bc_state *bcs)
-{
-}
-EXPORT_SYMBOL_GPL(gigaset_isdn_rcv_err);
-
-int gigaset_isdn_icall(struct at_state_t *at_state)
-{
-	return ICALL_IGNORE;
-}
-
-void gigaset_isdn_connD(struct bc_state *bcs)
-{
-}
-
-void gigaset_isdn_hupD(struct bc_state *bcs)
-{
-}
-
-void gigaset_isdn_connB(struct bc_state *bcs)
-{
-}
-
-void gigaset_isdn_hupB(struct bc_state *bcs)
-{
-}
-
-void gigaset_isdn_start(struct cardstate *cs)
-{
-}
-
-void gigaset_isdn_stop(struct cardstate *cs)
-{
-}
-
-int gigaset_isdn_regdev(struct cardstate *cs, const char *isdnid)
-{
-	return 0;
-}
-
-void gigaset_isdn_unregdev(struct cardstate *cs)
-{
-}
-
-void gigaset_isdn_regdrv(void)
-{
-	pr_info("no ISDN subsystem interface\n");
-}
-
-void gigaset_isdn_unregdrv(void)
-{
-}
diff --git a/drivers/staging/isdn/gigaset/ev-layer.c b/drivers/staging/isdn/gigaset/ev-layer.c
deleted file mode 100644
index f8bb1869c600..000000000000
--- a/drivers/staging/isdn/gigaset/ev-layer.c
+++ /dev/null
@@ -1,1910 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Stuff used by all variants of the driver
- *
- * Copyright (c) 2001 by Stefan Eilers,
- *                       Hansjoerg Lipp <hjlipp@web.de>,
- *                       Tilman Schmidt <tilman@imap.cc>.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include <linux/export.h>
-#include "gigaset.h"
-
-/* ========================================================== */
-/* bit masks for pending commands */
-#define PC_DIAL		0x001
-#define PC_HUP		0x002
-#define PC_INIT		0x004
-#define PC_DLE0		0x008
-#define PC_DLE1		0x010
-#define PC_SHUTDOWN	0x020
-#define PC_ACCEPT	0x040
-#define PC_CID		0x080
-#define PC_NOCID	0x100
-#define PC_CIDMODE	0x200
-#define PC_UMMODE	0x400
-
-/* types of modem responses */
-#define RT_NOTHING	0
-#define RT_ZSAU		1
-#define RT_RING		2
-#define RT_NUMBER	3
-#define RT_STRING	4
-#define RT_ZCAU		6
-
-/* Possible ASCII responses */
-#define RSP_OK		0
-#define RSP_ERROR	1
-#define RSP_ZGCI	3
-#define RSP_RING	4
-#define RSP_ZVLS	5
-#define RSP_ZCAU	6
-
-/* responses with values to store in at_state */
-/* - numeric */
-#define RSP_VAR		100
-#define RSP_ZSAU	(RSP_VAR + VAR_ZSAU)
-#define RSP_ZDLE	(RSP_VAR + VAR_ZDLE)
-#define RSP_ZCTP	(RSP_VAR + VAR_ZCTP)
-/* - string */
-#define RSP_STR		(RSP_VAR + VAR_NUM)
-#define RSP_NMBR	(RSP_STR + STR_NMBR)
-#define RSP_ZCPN	(RSP_STR + STR_ZCPN)
-#define RSP_ZCON	(RSP_STR + STR_ZCON)
-#define RSP_ZBC		(RSP_STR + STR_ZBC)
-#define RSP_ZHLC	(RSP_STR + STR_ZHLC)
-
-#define RSP_WRONG_CID	-2	/* unknown cid in cmd */
-#define RSP_INVAL	-6	/* invalid response   */
-#define RSP_NODEV	-9	/* device not connected */
-
-#define RSP_NONE	-19
-#define RSP_STRING	-20
-#define RSP_NULL	-21
-#define RSP_INIT	-27
-#define RSP_ANY		-26
-#define RSP_LAST	-28
-
-/* actions for process_response */
-#define ACT_NOTHING		0
-#define ACT_SETDLE1		1
-#define ACT_SETDLE0		2
-#define ACT_FAILINIT		3
-#define ACT_HUPMODEM		4
-#define ACT_CONFIGMODE		5
-#define ACT_INIT		6
-#define ACT_DLE0		7
-#define ACT_DLE1		8
-#define ACT_FAILDLE0		9
-#define ACT_FAILDLE1		10
-#define ACT_RING		11
-#define ACT_CID			12
-#define ACT_FAILCID		13
-#define ACT_SDOWN		14
-#define ACT_FAILSDOWN		15
-#define ACT_DEBUG		16
-#define ACT_WARN		17
-#define ACT_DIALING		18
-#define ACT_ABORTDIAL		19
-#define ACT_DISCONNECT		20
-#define ACT_CONNECT		21
-#define ACT_REMOTEREJECT	22
-#define ACT_CONNTIMEOUT		23
-#define ACT_REMOTEHUP		24
-#define ACT_ABORTHUP		25
-#define ACT_ICALL		26
-#define ACT_ACCEPTED		27
-#define ACT_ABORTACCEPT		28
-#define ACT_TIMEOUT		29
-#define ACT_GETSTRING		30
-#define ACT_SETVER		31
-#define ACT_FAILVER		32
-#define ACT_GOTVER		33
-#define ACT_TEST		34
-#define ACT_ERROR		35
-#define ACT_ABORTCID		36
-#define ACT_ZCAU		37
-#define ACT_NOTIFY_BC_DOWN	38
-#define ACT_NOTIFY_BC_UP	39
-#define ACT_DIAL		40
-#define ACT_ACCEPT		41
-#define ACT_HUP			43
-#define ACT_IF_LOCK		44
-#define ACT_START		45
-#define ACT_STOP		46
-#define ACT_FAKEDLE0		47
-#define ACT_FAKEHUP		48
-#define ACT_FAKESDOWN		49
-#define ACT_SHUTDOWN		50
-#define ACT_PROC_CIDMODE	51
-#define ACT_UMODESET		52
-#define ACT_FAILUMODE		53
-#define ACT_CMODESET		54
-#define ACT_FAILCMODE		55
-#define ACT_IF_VER		56
-#define ACT_CMD			100
-
-/* at command sequences */
-#define SEQ_NONE	0
-#define SEQ_INIT	100
-#define SEQ_DLE0	200
-#define SEQ_DLE1	250
-#define SEQ_CID		300
-#define SEQ_NOCID	350
-#define SEQ_HUP		400
-#define SEQ_DIAL	600
-#define SEQ_ACCEPT	720
-#define SEQ_SHUTDOWN	500
-#define SEQ_CIDMODE	10
-#define SEQ_UMMODE	11
-
-
-/* 100: init, 200: dle0, 250:dle1, 300: get cid (dial), 350: "hup" (no cid),
- * 400: hup, 500: reset, 600: dial, 700: ring */
-struct reply_t gigaset_tab_nocid[] =
-{
-/* resp_code, min_ConState, max_ConState, parameter, new_ConState, timeout,
- * action, command */
-
-/* initialize device, set cid mode if possible */
-	{RSP_INIT,	 -1,  -1, SEQ_INIT,	100,  1, {ACT_TIMEOUT} },
-
-	{EV_TIMEOUT,	100, 100, -1,		101,  3, {0},	"Z\r"},
-	{RSP_OK,	101, 103, -1,		120,  5, {ACT_GETSTRING},
-								"+GMR\r"},
-
-	{EV_TIMEOUT,	101, 101, -1,		102,  5, {0},	"Z\r"},
-	{RSP_ERROR,	101, 101, -1,		102,  5, {0},	"Z\r"},
-
-	{EV_TIMEOUT,	102, 102, -1,		108,  5, {ACT_SETDLE1},
-								"^SDLE=0\r"},
-	{RSP_OK,	108, 108, -1,		104, -1},
-	{RSP_ZDLE,	104, 104,  0,		103,  5, {0},	"Z\r"},
-	{EV_TIMEOUT,	104, 104, -1,		  0,  0, {ACT_FAILINIT} },
-	{RSP_ERROR,	108, 108, -1,		  0,  0, {ACT_FAILINIT} },
-
-	{EV_TIMEOUT,	108, 108, -1,		105,  2, {ACT_SETDLE0,
-							  ACT_HUPMODEM,
-							  ACT_TIMEOUT} },
-	{EV_TIMEOUT,	105, 105, -1,		103,  5, {0},	"Z\r"},
-
-	{RSP_ERROR,	102, 102, -1,		107,  5, {0},	"^GETPRE\r"},
-	{RSP_OK,	107, 107, -1,		  0,  0, {ACT_CONFIGMODE} },
-	{RSP_ERROR,	107, 107, -1,		  0,  0, {ACT_FAILINIT} },
-	{EV_TIMEOUT,	107, 107, -1,		  0,  0, {ACT_FAILINIT} },
-
-	{RSP_ERROR,	103, 103, -1,		  0,  0, {ACT_FAILINIT} },
-	{EV_TIMEOUT,	103, 103, -1,		  0,  0, {ACT_FAILINIT} },
-
-	{RSP_STRING,	120, 120, -1,		121, -1, {ACT_SETVER} },
-
-	{EV_TIMEOUT,	120, 121, -1,		  0,  0, {ACT_FAILVER,
-							  ACT_INIT} },
-	{RSP_ERROR,	120, 121, -1,		  0,  0, {ACT_FAILVER,
-							  ACT_INIT} },
-	{RSP_OK,	121, 121, -1,		  0,  0, {ACT_GOTVER,
-							  ACT_INIT} },
-	{RSP_NONE,	121, 121, -1,		120,  0, {ACT_GETSTRING} },
-
-/* leave dle mode */
-	{RSP_INIT,	  0,   0, SEQ_DLE0,	201,  5, {0},	"^SDLE=0\r"},
-	{RSP_OK,	201, 201, -1,		202, -1},
-	{RSP_ZDLE,	202, 202,  0,		  0,  0, {ACT_DLE0} },
-	{RSP_NODEV,	200, 249, -1,		  0,  0, {ACT_FAKEDLE0} },
-	{RSP_ERROR,	200, 249, -1,		  0,  0, {ACT_FAILDLE0} },
-	{EV_TIMEOUT,	200, 249, -1,		  0,  0, {ACT_FAILDLE0} },
-
-/* enter dle mode */
-	{RSP_INIT,	  0,   0, SEQ_DLE1,	251,  5, {0},	"^SDLE=1\r"},
-	{RSP_OK,	251, 251, -1,		252, -1},
-	{RSP_ZDLE,	252, 252,  1,		  0,  0, {ACT_DLE1} },
-	{RSP_ERROR,	250, 299, -1,		  0,  0, {ACT_FAILDLE1} },
-	{EV_TIMEOUT,	250, 299, -1,		  0,  0, {ACT_FAILDLE1} },
-
-/* incoming call */
-	{RSP_RING,	 -1,  -1, -1,		 -1, -1, {ACT_RING} },
-
-/* get cid */
-	{RSP_INIT,	  0,   0, SEQ_CID,	301,  5, {0},	"^SGCI?\r"},
-	{RSP_OK,	301, 301, -1,		302, -1},
-	{RSP_ZGCI,	302, 302, -1,		  0,  0, {ACT_CID} },
-	{RSP_ERROR,	301, 349, -1,		  0,  0, {ACT_FAILCID} },
-	{EV_TIMEOUT,	301, 349, -1,		  0,  0, {ACT_FAILCID} },
-
-/* enter cid mode */
-	{RSP_INIT,	  0,   0, SEQ_CIDMODE,	150,  5, {0},	"^SGCI=1\r"},
-	{RSP_OK,	150, 150, -1,		  0,  0, {ACT_CMODESET} },
-	{RSP_ERROR,	150, 150, -1,		  0,  0, {ACT_FAILCMODE} },
-	{EV_TIMEOUT,	150, 150, -1,		  0,  0, {ACT_FAILCMODE} },
-
-/* leave cid mode */
-	{RSP_INIT,	  0,   0, SEQ_UMMODE,	160,  5, {0},	"Z\r"},
-	{RSP_OK,	160, 160, -1,		  0,  0, {ACT_UMODESET} },
-	{RSP_ERROR,	160, 160, -1,		  0,  0, {ACT_FAILUMODE} },
-	{EV_TIMEOUT,	160, 160, -1,		  0,  0, {ACT_FAILUMODE} },
-
-/* abort getting cid */
-	{RSP_INIT,	  0,   0, SEQ_NOCID,	  0,  0, {ACT_ABORTCID} },
-
-/* reset */
-	{RSP_INIT,	  0,   0, SEQ_SHUTDOWN,	504,  5, {0},	"Z\r"},
-	{RSP_OK,	504, 504, -1,		  0,  0, {ACT_SDOWN} },
-	{RSP_ERROR,	501, 599, -1,		  0,  0, {ACT_FAILSDOWN} },
-	{EV_TIMEOUT,	501, 599, -1,		  0,  0, {ACT_FAILSDOWN} },
-	{RSP_NODEV,	501, 599, -1,		  0,  0, {ACT_FAKESDOWN} },
-
-	{EV_PROC_CIDMODE, -1, -1, -1,		 -1, -1, {ACT_PROC_CIDMODE} },
-	{EV_IF_LOCK,	 -1,  -1, -1,		 -1, -1, {ACT_IF_LOCK} },
-	{EV_IF_VER,	 -1,  -1, -1,		 -1, -1, {ACT_IF_VER} },
-	{EV_START,	 -1,  -1, -1,		 -1, -1, {ACT_START} },
-	{EV_STOP,	 -1,  -1, -1,		 -1, -1, {ACT_STOP} },
-	{EV_SHUTDOWN,	 -1,  -1, -1,		 -1, -1, {ACT_SHUTDOWN} },
-
-/* misc. */
-	{RSP_ERROR,	 -1,  -1, -1,		 -1, -1, {ACT_ERROR} },
-	{RSP_ZCAU,	 -1,  -1, -1,		 -1, -1, {ACT_ZCAU} },
-	{RSP_NONE,	 -1,  -1, -1,		 -1, -1, {ACT_DEBUG} },
-	{RSP_ANY,	 -1,  -1, -1,		 -1, -1, {ACT_WARN} },
-	{RSP_LAST}
-};
-
-/* 600: start dialing, 650: dial in progress, 800: connection is up, 700: ring,
- * 400: hup, 750: accepted icall */
-struct reply_t gigaset_tab_cid[] =
-{
-/* resp_code, min_ConState, max_ConState, parameter, new_ConState, timeout,
- * action, command */
-
-/* dial */
-	{EV_DIAL,	 -1,  -1, -1,		 -1, -1, {ACT_DIAL} },
-	{RSP_INIT,	  0,   0, SEQ_DIAL,	601,  5, {ACT_CMD + AT_BC} },
-	{RSP_OK,	601, 601, -1,		603,  5, {ACT_CMD + AT_PROTO} },
-	{RSP_OK,	603, 603, -1,		604,  5, {ACT_CMD + AT_TYPE} },
-	{RSP_OK,	604, 604, -1,		605,  5, {ACT_CMD + AT_MSN} },
-	{RSP_NULL,	605, 605, -1,		606,  5, {ACT_CMD + AT_CLIP} },
-	{RSP_OK,	605, 605, -1,		606,  5, {ACT_CMD + AT_CLIP} },
-	{RSP_NULL,	606, 606, -1,		607,  5, {ACT_CMD + AT_ISO} },
-	{RSP_OK,	606, 606, -1,		607,  5, {ACT_CMD + AT_ISO} },
-	{RSP_OK,	607, 607, -1,		608,  5, {0},	"+VLS=17\r"},
-	{RSP_OK,	608, 608, -1,		609, -1},
-	{RSP_ZSAU,	609, 609, ZSAU_PROCEEDING, 610, 5, {ACT_CMD + AT_DIAL} },
-	{RSP_OK,	610, 610, -1,		650,  0, {ACT_DIALING} },
-
-	{RSP_ERROR,	601, 610, -1,		  0,  0, {ACT_ABORTDIAL} },
-	{EV_TIMEOUT,	601, 610, -1,		  0,  0, {ACT_ABORTDIAL} },
-
-/* optional dialing responses */
-	{EV_BC_OPEN,	650, 650, -1,		651, -1},
-	{RSP_ZVLS,	609, 651, 17,		 -1, -1, {ACT_DEBUG} },
-	{RSP_ZCTP,	610, 651, -1,		 -1, -1, {ACT_DEBUG} },
-	{RSP_ZCPN,	610, 651, -1,		 -1, -1, {ACT_DEBUG} },
-	{RSP_ZSAU,	650, 651, ZSAU_CALL_DELIVERED, -1, -1, {ACT_DEBUG} },
-
-/* connect */
-	{RSP_ZSAU,	650, 650, ZSAU_ACTIVE,	800, -1, {ACT_CONNECT} },
-	{RSP_ZSAU,	651, 651, ZSAU_ACTIVE,	800, -1, {ACT_CONNECT,
-							  ACT_NOTIFY_BC_UP} },
-	{RSP_ZSAU,	750, 750, ZSAU_ACTIVE,	800, -1, {ACT_CONNECT} },
-	{RSP_ZSAU,	751, 751, ZSAU_ACTIVE,	800, -1, {ACT_CONNECT,
-							  ACT_NOTIFY_BC_UP} },
-	{EV_BC_OPEN,	800, 800, -1,		800, -1, {ACT_NOTIFY_BC_UP} },
-
-/* remote hangup */
-	{RSP_ZSAU,	650, 651, ZSAU_DISCONNECT_IND, 0, 0, {ACT_REMOTEREJECT} },
-	{RSP_ZSAU,	750, 751, ZSAU_DISCONNECT_IND, 0, 0, {ACT_REMOTEHUP} },
-	{RSP_ZSAU,	800, 800, ZSAU_DISCONNECT_IND, 0, 0, {ACT_REMOTEHUP} },
-
-/* hangup */
-	{EV_HUP,	 -1,  -1, -1,		 -1, -1, {ACT_HUP} },
-	{RSP_INIT,	 -1,  -1, SEQ_HUP,	401,  5, {0},	"+VLS=0\r"},
-	{RSP_OK,	401, 401, -1,		402,  5},
-	{RSP_ZVLS,	402, 402,  0,		403,  5},
-	{RSP_ZSAU,	403, 403, ZSAU_DISCONNECT_REQ, -1, -1, {ACT_DEBUG} },
-	{RSP_ZSAU,	403, 403, ZSAU_NULL,	  0,  0, {ACT_DISCONNECT} },
-	{RSP_NODEV,	401, 403, -1,		  0,  0, {ACT_FAKEHUP} },
-	{RSP_ERROR,	401, 401, -1,		  0,  0, {ACT_ABORTHUP} },
-	{EV_TIMEOUT,	401, 403, -1,		  0,  0, {ACT_ABORTHUP} },
-
-	{EV_BC_CLOSED,	  0,   0, -1,		  0, -1, {ACT_NOTIFY_BC_DOWN} },
-
-/* ring */
-	{RSP_ZBC,	700, 700, -1,		 -1, -1, {0} },
-	{RSP_ZHLC,	700, 700, -1,		 -1, -1, {0} },
-	{RSP_NMBR,	700, 700, -1,		 -1, -1, {0} },
-	{RSP_ZCPN,	700, 700, -1,		 -1, -1, {0} },
-	{RSP_ZCTP,	700, 700, -1,		 -1, -1, {0} },
-	{EV_TIMEOUT,	700, 700, -1,		720, 720, {ACT_ICALL} },
-	{EV_BC_CLOSED,	720, 720, -1,		  0, -1, {ACT_NOTIFY_BC_DOWN} },
-
-/*accept icall*/
-	{EV_ACCEPT,	 -1,  -1, -1,		 -1, -1, {ACT_ACCEPT} },
-	{RSP_INIT,	720, 720, SEQ_ACCEPT,	721,  5, {ACT_CMD + AT_PROTO} },
-	{RSP_OK,	721, 721, -1,		722,  5, {ACT_CMD + AT_ISO} },
-	{RSP_OK,	722, 722, -1,		723,  5, {0},	"+VLS=17\r"},
-	{RSP_OK,	723, 723, -1,		724,  5, {0} },
-	{RSP_ZVLS,	724, 724, 17,		750, 50, {ACT_ACCEPTED} },
-	{RSP_ERROR,	721, 729, -1,		  0,  0, {ACT_ABORTACCEPT} },
-	{EV_TIMEOUT,	721, 729, -1,		  0,  0, {ACT_ABORTACCEPT} },
-	{RSP_ZSAU,	700, 729, ZSAU_NULL,	  0,  0, {ACT_ABORTACCEPT} },
-	{RSP_ZSAU,	700, 729, ZSAU_ACTIVE,	  0,  0, {ACT_ABORTACCEPT} },
-	{RSP_ZSAU,	700, 729, ZSAU_DISCONNECT_IND, 0, 0, {ACT_ABORTACCEPT} },
-
-	{EV_BC_OPEN,	750, 750, -1,		751, -1},
-	{EV_TIMEOUT,	750, 751, -1,		  0,  0, {ACT_CONNTIMEOUT} },
-
-/* B channel closed (general case) */
-	{EV_BC_CLOSED,	 -1,  -1, -1,		 -1, -1, {ACT_NOTIFY_BC_DOWN} },
-
-/* misc. */
-	{RSP_ZCON,	 -1,  -1, -1,		 -1, -1, {ACT_DEBUG} },
-	{RSP_ZCAU,	 -1,  -1, -1,		 -1, -1, {ACT_ZCAU} },
-	{RSP_NONE,	 -1,  -1, -1,		 -1, -1, {ACT_DEBUG} },
-	{RSP_ANY,	 -1,  -1, -1,		 -1, -1, {ACT_WARN} },
-	{RSP_LAST}
-};
-
-
-static const struct resp_type_t {
-	char	*response;
-	int	resp_code;
-	int	type;
-}
-resp_type[] =
-{
-	{"OK",		RSP_OK,		RT_NOTHING},
-	{"ERROR",	RSP_ERROR,	RT_NOTHING},
-	{"ZSAU",	RSP_ZSAU,	RT_ZSAU},
-	{"ZCAU",	RSP_ZCAU,	RT_ZCAU},
-	{"RING",	RSP_RING,	RT_RING},
-	{"ZGCI",	RSP_ZGCI,	RT_NUMBER},
-	{"ZVLS",	RSP_ZVLS,	RT_NUMBER},
-	{"ZCTP",	RSP_ZCTP,	RT_NUMBER},
-	{"ZDLE",	RSP_ZDLE,	RT_NUMBER},
-	{"ZHLC",	RSP_ZHLC,	RT_STRING},
-	{"ZBC",		RSP_ZBC,	RT_STRING},
-	{"NMBR",	RSP_NMBR,	RT_STRING},
-	{"ZCPN",	RSP_ZCPN,	RT_STRING},
-	{"ZCON",	RSP_ZCON,	RT_STRING},
-	{NULL,		0,		0}
-};
-
-static const struct zsau_resp_t {
-	char	*str;
-	int	code;
-}
-zsau_resp[] =
-{
-	{"OUTGOING_CALL_PROCEEDING",	ZSAU_PROCEEDING},
-	{"CALL_DELIVERED",		ZSAU_CALL_DELIVERED},
-	{"ACTIVE",			ZSAU_ACTIVE},
-	{"DISCONNECT_IND",		ZSAU_DISCONNECT_IND},
-	{"NULL",			ZSAU_NULL},
-	{"DISCONNECT_REQ",		ZSAU_DISCONNECT_REQ},
-	{NULL,				ZSAU_UNKNOWN}
-};
-
-/* check for and remove fixed string prefix
- * If s starts with prefix terminated by a non-alphanumeric character,
- * return pointer to the first character after that, otherwise return NULL.
- */
-static char *skip_prefix(char *s, const char *prefix)
-{
-	while (*prefix)
-		if (*s++ != *prefix++)
-			return NULL;
-	if (isalnum(*s))
-		return NULL;
-	return s;
-}
-
-/* queue event with CID */
-static void add_cid_event(struct cardstate *cs, int cid, int type,
-			  void *ptr, int parameter)
-{
-	unsigned long flags;
-	unsigned next, tail;
-	struct event_t *event;
-
-	gig_dbg(DEBUG_EVENT, "queueing event %d for cid %d", type, cid);
-
-	spin_lock_irqsave(&cs->ev_lock, flags);
-
-	tail = cs->ev_tail;
-	next = (tail + 1) % MAX_EVENTS;
-	if (unlikely(next == cs->ev_head)) {
-		dev_err(cs->dev, "event queue full\n");
-		kfree(ptr);
-	} else {
-		event = cs->events + tail;
-		event->type = type;
-		event->cid = cid;
-		event->ptr = ptr;
-		event->arg = NULL;
-		event->parameter = parameter;
-		event->at_state = NULL;
-		cs->ev_tail = next;
-	}
-
-	spin_unlock_irqrestore(&cs->ev_lock, flags);
-}
-
-/**
- * gigaset_handle_modem_response() - process received modem response
- * @cs:		device descriptor structure.
- *
- * Called by asyncdata/isocdata if a block of data received from the
- * device must be processed as a modem command response. The data is
- * already in the cs structure.
- */
-void gigaset_handle_modem_response(struct cardstate *cs)
-{
-	char *eoc, *psep, *ptr;
-	const struct resp_type_t *rt;
-	const struct zsau_resp_t *zr;
-	int cid, parameter;
-	u8 type, value;
-
-	if (!cs->cbytes) {
-		/* ignore additional LFs/CRs (M10x config mode or cx100) */
-		gig_dbg(DEBUG_MCMD, "skipped EOL [%02X]", cs->respdata[0]);
-		return;
-	}
-	cs->respdata[cs->cbytes] = 0;
-
-	if (cs->at_state.getstring) {
-		/* state machine wants next line verbatim */
-		cs->at_state.getstring = 0;
-		ptr = kstrdup(cs->respdata, GFP_ATOMIC);
-		gig_dbg(DEBUG_EVENT, "string==%s", ptr ? ptr : "NULL");
-		add_cid_event(cs, 0, RSP_STRING, ptr, 0);
-		return;
-	}
-
-	/* look up response type */
-	for (rt = resp_type; rt->response; ++rt) {
-		eoc = skip_prefix(cs->respdata, rt->response);
-		if (eoc)
-			break;
-	}
-	if (!rt->response) {
-		add_cid_event(cs, 0, RSP_NONE, NULL, 0);
-		gig_dbg(DEBUG_EVENT, "unknown modem response: '%s'\n",
-			cs->respdata);
-		return;
-	}
-
-	/* check for CID */
-	psep = strrchr(cs->respdata, ';');
-	if (psep &&
-	    !kstrtoint(psep + 1, 10, &cid) &&
-	    cid >= 1 && cid <= 65535) {
-		/* valid CID: chop it off */
-		*psep = 0;
-	} else {
-		/* no valid CID: leave unchanged */
-		cid = 0;
-	}
-
-	gig_dbg(DEBUG_EVENT, "CMD received: %s", cs->respdata);
-	if (cid)
-		gig_dbg(DEBUG_EVENT, "CID: %d", cid);
-
-	switch (rt->type) {
-	case RT_NOTHING:
-		/* check parameter separator */
-		if (*eoc)
-			goto bad_param;	/* extra parameter */
-
-		add_cid_event(cs, cid, rt->resp_code, NULL, 0);
-		break;
-
-	case RT_RING:
-		/* check parameter separator */
-		if (!*eoc)
-			eoc = NULL;	/* no parameter */
-		else if (*eoc++ != ',')
-			goto bad_param;
-
-		add_cid_event(cs, 0, rt->resp_code, NULL, cid);
-
-		/* process parameters as individual responses */
-		while (eoc) {
-			/* look up parameter type */
-			psep = NULL;
-			for (rt = resp_type; rt->response; ++rt) {
-				psep = skip_prefix(eoc, rt->response);
-				if (psep)
-					break;
-			}
-
-			/* all legal parameters are of type RT_STRING */
-			if (!psep || rt->type != RT_STRING) {
-				dev_warn(cs->dev,
-					 "illegal RING parameter: '%s'\n",
-					 eoc);
-				return;
-			}
-
-			/* skip parameter value separator */
-			if (*psep++ != '=')
-				goto bad_param;
-
-			/* look up end of parameter */
-			eoc = strchr(psep, ',');
-			if (eoc)
-				*eoc++ = 0;
-
-			/* retrieve parameter value */
-			ptr = kstrdup(psep, GFP_ATOMIC);
-
-			/* queue event */
-			add_cid_event(cs, cid, rt->resp_code, ptr, 0);
-		}
-		break;
-
-	case RT_ZSAU:
-		/* check parameter separator */
-		if (!*eoc) {
-			/* no parameter */
-			add_cid_event(cs, cid, rt->resp_code, NULL, ZSAU_NONE);
-			break;
-		}
-		if (*eoc++ != '=')
-			goto bad_param;
-
-		/* look up parameter value */
-		for (zr = zsau_resp; zr->str; ++zr)
-			if (!strcmp(eoc, zr->str))
-				break;
-		if (!zr->str)
-			goto bad_param;
-
-		add_cid_event(cs, cid, rt->resp_code, NULL, zr->code);
-		break;
-
-	case RT_STRING:
-		/* check parameter separator */
-		if (*eoc++ != '=')
-			goto bad_param;
-
-		/* retrieve parameter value */
-		ptr = kstrdup(eoc, GFP_ATOMIC);
-
-		/* queue event */
-		add_cid_event(cs, cid, rt->resp_code, ptr, 0);
-		break;
-
-	case RT_ZCAU:
-		/* check parameter separators */
-		if (*eoc++ != '=')
-			goto bad_param;
-		psep = strchr(eoc, ',');
-		if (!psep)
-			goto bad_param;
-		*psep++ = 0;
-
-		/* decode parameter values */
-		if (kstrtou8(eoc, 16, &type) || kstrtou8(psep, 16, &value)) {
-			*--psep = ',';
-			goto bad_param;
-		}
-		parameter = (type << 8) | value;
-
-		add_cid_event(cs, cid, rt->resp_code, NULL, parameter);
-		break;
-
-	case RT_NUMBER:
-		/* check parameter separator */
-		if (*eoc++ != '=')
-			goto bad_param;
-
-		/* decode parameter value */
-		if (kstrtoint(eoc, 10, &parameter))
-			goto bad_param;
-
-		/* special case ZDLE: set flag before queueing event */
-		if (rt->resp_code == RSP_ZDLE)
-			cs->dle = parameter;
-
-		add_cid_event(cs, cid, rt->resp_code, NULL, parameter);
-		break;
-
-bad_param:
-		/* parameter unexpected, incomplete or malformed */
-		dev_warn(cs->dev, "bad parameter in response '%s'\n",
-			 cs->respdata);
-		add_cid_event(cs, cid, rt->resp_code, NULL, -1);
-		break;
-
-	default:
-		dev_err(cs->dev, "%s: internal error on '%s'\n",
-			__func__, cs->respdata);
-	}
-}
-EXPORT_SYMBOL_GPL(gigaset_handle_modem_response);
-
-/* disconnect_nobc
- * process closing of connection associated with given AT state structure
- * without B channel
- */
-static void disconnect_nobc(struct at_state_t **at_state_p,
-			    struct cardstate *cs)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	++(*at_state_p)->seq_index;
-
-	/* revert to selected idle mode */
-	if (!cs->cidmode) {
-		cs->at_state.pending_commands |= PC_UMMODE;
-		gig_dbg(DEBUG_EVENT, "Scheduling PC_UMMODE");
-		cs->commands_pending = 1;
-	}
-
-	/* check for and deallocate temporary AT state */
-	if (!list_empty(&(*at_state_p)->list)) {
-		list_del(&(*at_state_p)->list);
-		kfree(*at_state_p);
-		*at_state_p = NULL;
-	}
-
-	spin_unlock_irqrestore(&cs->lock, flags);
-}
-
-/* disconnect_bc
- * process closing of connection associated with given AT state structure
- * and B channel
- */
-static void disconnect_bc(struct at_state_t *at_state,
-			  struct cardstate *cs, struct bc_state *bcs)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	++at_state->seq_index;
-
-	/* revert to selected idle mode */
-	if (!cs->cidmode) {
-		cs->at_state.pending_commands |= PC_UMMODE;
-		gig_dbg(DEBUG_EVENT, "Scheduling PC_UMMODE");
-		cs->commands_pending = 1;
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	/* invoke hardware specific handler */
-	cs->ops->close_bchannel(bcs);
-
-	/* notify LL */
-	if (bcs->chstate & (CHS_D_UP | CHS_NOTIFY_LL)) {
-		bcs->chstate &= ~(CHS_D_UP | CHS_NOTIFY_LL);
-		gigaset_isdn_hupD(bcs);
-	}
-}
-
-/* get_free_channel
- * get a free AT state structure: either one of those associated with the
- * B channels of the Gigaset device, or if none of those is available,
- * a newly allocated one with bcs=NULL
- * The structure should be freed by calling disconnect_nobc() after use.
- */
-static inline struct at_state_t *get_free_channel(struct cardstate *cs,
-						  int cid)
-/* cids: >0: siemens-cid
- *        0: without cid
- *       -1: no cid assigned yet
- */
-{
-	unsigned long flags;
-	int i;
-	struct at_state_t *ret;
-
-	for (i = 0; i < cs->channels; ++i)
-		if (gigaset_get_channel(cs->bcs + i) >= 0) {
-			ret = &cs->bcs[i].at_state;
-			ret->cid = cid;
-			return ret;
-		}
-
-	spin_lock_irqsave(&cs->lock, flags);
-	ret = kmalloc(sizeof(struct at_state_t), GFP_ATOMIC);
-	if (ret) {
-		gigaset_at_init(ret, NULL, cs, cid);
-		list_add(&ret->list, &cs->temp_at_states);
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-	return ret;
-}
-
-static void init_failed(struct cardstate *cs, int mode)
-{
-	int i;
-	struct at_state_t *at_state;
-
-	cs->at_state.pending_commands &= ~PC_INIT;
-	cs->mode = mode;
-	cs->mstate = MS_UNINITIALIZED;
-	gigaset_free_channels(cs);
-	for (i = 0; i < cs->channels; ++i) {
-		at_state = &cs->bcs[i].at_state;
-		if (at_state->pending_commands & PC_CID) {
-			at_state->pending_commands &= ~PC_CID;
-			at_state->pending_commands |= PC_NOCID;
-			cs->commands_pending = 1;
-		}
-	}
-}
-
-static void schedule_init(struct cardstate *cs, int state)
-{
-	if (cs->at_state.pending_commands & PC_INIT) {
-		gig_dbg(DEBUG_EVENT, "not scheduling PC_INIT again");
-		return;
-	}
-	cs->mstate = state;
-	cs->mode = M_UNKNOWN;
-	gigaset_block_channels(cs);
-	cs->at_state.pending_commands |= PC_INIT;
-	gig_dbg(DEBUG_EVENT, "Scheduling PC_INIT");
-	cs->commands_pending = 1;
-}
-
-/* send an AT command
- * adding the "AT" prefix, cid and DLE encapsulation as appropriate
- */
-static void send_command(struct cardstate *cs, const char *cmd,
-			 struct at_state_t *at_state)
-{
-	int cid = at_state->cid;
-	struct cmdbuf_t *cb;
-	size_t buflen;
-
-	buflen = strlen(cmd) + 12; /* DLE ( A T 1 2 3 4 5 <cmd> DLE ) \0 */
-	cb = kmalloc(sizeof(struct cmdbuf_t) + buflen, GFP_ATOMIC);
-	if (!cb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		return;
-	}
-	if (cid > 0 && cid <= 65535)
-		cb->len = snprintf(cb->buf, buflen,
-				   cs->dle ? "\020(AT%d%s\020)" : "AT%d%s",
-				   cid, cmd);
-	else
-		cb->len = snprintf(cb->buf, buflen,
-				   cs->dle ? "\020(AT%s\020)" : "AT%s",
-				   cmd);
-	cb->offset = 0;
-	cb->next = NULL;
-	cb->wake_tasklet = NULL;
-	cs->ops->write_cmd(cs, cb);
-}
-
-static struct at_state_t *at_state_from_cid(struct cardstate *cs, int cid)
-{
-	struct at_state_t *at_state;
-	int i;
-	unsigned long flags;
-
-	if (cid == 0)
-		return &cs->at_state;
-
-	for (i = 0; i < cs->channels; ++i)
-		if (cid == cs->bcs[i].at_state.cid)
-			return &cs->bcs[i].at_state;
-
-	spin_lock_irqsave(&cs->lock, flags);
-
-	list_for_each_entry(at_state, &cs->temp_at_states, list)
-		if (cid == at_state->cid) {
-			spin_unlock_irqrestore(&cs->lock, flags);
-			return at_state;
-		}
-
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	return NULL;
-}
-
-static void bchannel_down(struct bc_state *bcs)
-{
-	if (bcs->chstate & CHS_B_UP) {
-		bcs->chstate &= ~CHS_B_UP;
-		gigaset_isdn_hupB(bcs);
-	}
-
-	if (bcs->chstate & (CHS_D_UP | CHS_NOTIFY_LL)) {
-		bcs->chstate &= ~(CHS_D_UP | CHS_NOTIFY_LL);
-		gigaset_isdn_hupD(bcs);
-	}
-
-	gigaset_free_channel(bcs);
-
-	gigaset_bcs_reinit(bcs);
-}
-
-static void bchannel_up(struct bc_state *bcs)
-{
-	if (bcs->chstate & CHS_B_UP) {
-		dev_notice(bcs->cs->dev, "%s: B channel already up\n",
-			   __func__);
-		return;
-	}
-
-	bcs->chstate |= CHS_B_UP;
-	gigaset_isdn_connB(bcs);
-}
-
-static void start_dial(struct at_state_t *at_state, void *data,
-		       unsigned seq_index)
-{
-	struct bc_state *bcs = at_state->bcs;
-	struct cardstate *cs = at_state->cs;
-	char **commands = data;
-	unsigned long flags;
-	int i;
-
-	bcs->chstate |= CHS_NOTIFY_LL;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (at_state->seq_index != seq_index) {
-		spin_unlock_irqrestore(&cs->lock, flags);
-		goto error;
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	for (i = 0; i < AT_NUM; ++i) {
-		kfree(bcs->commands[i]);
-		bcs->commands[i] = commands[i];
-	}
-
-	at_state->pending_commands |= PC_CID;
-	gig_dbg(DEBUG_EVENT, "Scheduling PC_CID");
-	cs->commands_pending = 1;
-	return;
-
-error:
-	for (i = 0; i < AT_NUM; ++i) {
-		kfree(commands[i]);
-		commands[i] = NULL;
-	}
-	at_state->pending_commands |= PC_NOCID;
-	gig_dbg(DEBUG_EVENT, "Scheduling PC_NOCID");
-	cs->commands_pending = 1;
-	return;
-}
-
-static void start_accept(struct at_state_t *at_state)
-{
-	struct cardstate *cs = at_state->cs;
-	struct bc_state *bcs = at_state->bcs;
-	int i;
-
-	for (i = 0; i < AT_NUM; ++i) {
-		kfree(bcs->commands[i]);
-		bcs->commands[i] = NULL;
-	}
-
-	bcs->commands[AT_PROTO] = kmalloc(9, GFP_ATOMIC);
-	bcs->commands[AT_ISO] = kmalloc(9, GFP_ATOMIC);
-	if (!bcs->commands[AT_PROTO] || !bcs->commands[AT_ISO]) {
-		dev_err(at_state->cs->dev, "out of memory\n");
-		/* error reset */
-		at_state->pending_commands |= PC_HUP;
-		gig_dbg(DEBUG_EVENT, "Scheduling PC_HUP");
-		cs->commands_pending = 1;
-		return;
-	}
-
-	snprintf(bcs->commands[AT_PROTO], 9, "^SBPR=%u\r", bcs->proto2);
-	snprintf(bcs->commands[AT_ISO], 9, "^SISO=%u\r", bcs->channel + 1);
-
-	at_state->pending_commands |= PC_ACCEPT;
-	gig_dbg(DEBUG_EVENT, "Scheduling PC_ACCEPT");
-	cs->commands_pending = 1;
-}
-
-static void do_start(struct cardstate *cs)
-{
-	gigaset_free_channels(cs);
-
-	if (cs->mstate != MS_LOCKED)
-		schedule_init(cs, MS_INIT);
-
-	cs->isdn_up = 1;
-	gigaset_isdn_start(cs);
-
-	cs->waiting = 0;
-	wake_up(&cs->waitqueue);
-}
-
-static void finish_shutdown(struct cardstate *cs)
-{
-	if (cs->mstate != MS_LOCKED) {
-		cs->mstate = MS_UNINITIALIZED;
-		cs->mode = M_UNKNOWN;
-	}
-
-	/* Tell the LL that the device is not available .. */
-	if (cs->isdn_up) {
-		cs->isdn_up = 0;
-		gigaset_isdn_stop(cs);
-	}
-
-	/* The rest is done by cleanup_cs() in process context. */
-
-	cs->cmd_result = -ENODEV;
-	cs->waiting = 0;
-	wake_up(&cs->waitqueue);
-}
-
-static void do_shutdown(struct cardstate *cs)
-{
-	gigaset_block_channels(cs);
-
-	if (cs->mstate == MS_READY) {
-		cs->mstate = MS_SHUTDOWN;
-		cs->at_state.pending_commands |= PC_SHUTDOWN;
-		gig_dbg(DEBUG_EVENT, "Scheduling PC_SHUTDOWN");
-		cs->commands_pending = 1;
-	} else
-		finish_shutdown(cs);
-}
-
-static void do_stop(struct cardstate *cs)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	cs->connected = 0;
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	do_shutdown(cs);
-}
-
-/* Entering cid mode or getting a cid failed:
- * try to initialize the device and try again.
- *
- * channel >= 0: getting cid for the channel failed
- * channel < 0:  entering cid mode failed
- *
- * returns 0 on success, <0 on failure
- */
-static int reinit_and_retry(struct cardstate *cs, int channel)
-{
-	int i;
-
-	if (--cs->retry_count <= 0)
-		return -EFAULT;
-
-	for (i = 0; i < cs->channels; ++i)
-		if (cs->bcs[i].at_state.cid > 0)
-			return -EBUSY;
-
-	if (channel < 0)
-		dev_warn(cs->dev,
-			 "Could not enter cid mode. Reinit device and try again.\n");
-	else {
-		dev_warn(cs->dev,
-			 "Could not get a call id. Reinit device and try again.\n");
-		cs->bcs[channel].at_state.pending_commands |= PC_CID;
-	}
-	schedule_init(cs, MS_INIT);
-	return 0;
-}
-
-static int at_state_invalid(struct cardstate *cs,
-			    struct at_state_t *test_ptr)
-{
-	unsigned long flags;
-	unsigned channel;
-	struct at_state_t *at_state;
-	int retval = 0;
-
-	spin_lock_irqsave(&cs->lock, flags);
-
-	if (test_ptr == &cs->at_state)
-		goto exit;
-
-	list_for_each_entry(at_state, &cs->temp_at_states, list)
-		if (at_state == test_ptr)
-			goto exit;
-
-	for (channel = 0; channel < cs->channels; ++channel)
-		if (&cs->bcs[channel].at_state == test_ptr)
-			goto exit;
-
-	retval = 1;
-exit:
-	spin_unlock_irqrestore(&cs->lock, flags);
-	return retval;
-}
-
-static void handle_icall(struct cardstate *cs, struct bc_state *bcs,
-			 struct at_state_t *at_state)
-{
-	int retval;
-
-	retval = gigaset_isdn_icall(at_state);
-	switch (retval) {
-	case ICALL_ACCEPT:
-		break;
-	default:
-		dev_err(cs->dev, "internal error: disposition=%d\n", retval);
-		/* fall through */
-	case ICALL_IGNORE:
-	case ICALL_REJECT:
-		/* hang up actively
-		 * Device doc says that would reject the call.
-		 * In fact it doesn't.
-		 */
-		at_state->pending_commands |= PC_HUP;
-		cs->commands_pending = 1;
-		break;
-	}
-}
-
-static int do_lock(struct cardstate *cs)
-{
-	int mode;
-	int i;
-
-	switch (cs->mstate) {
-	case MS_UNINITIALIZED:
-	case MS_READY:
-		if (cs->cur_at_seq || !list_empty(&cs->temp_at_states) ||
-		    cs->at_state.pending_commands)
-			return -EBUSY;
-
-		for (i = 0; i < cs->channels; ++i)
-			if (cs->bcs[i].at_state.pending_commands)
-				return -EBUSY;
-
-		if (gigaset_get_channels(cs) < 0)
-			return -EBUSY;
-
-		break;
-	case MS_LOCKED:
-		break;
-	default:
-		return -EBUSY;
-	}
-
-	mode = cs->mode;
-	cs->mstate = MS_LOCKED;
-	cs->mode = M_UNKNOWN;
-
-	return mode;
-}
-
-static int do_unlock(struct cardstate *cs)
-{
-	if (cs->mstate != MS_LOCKED)
-		return -EINVAL;
-
-	cs->mstate = MS_UNINITIALIZED;
-	cs->mode = M_UNKNOWN;
-	gigaset_free_channels(cs);
-	if (cs->connected)
-		schedule_init(cs, MS_INIT);
-
-	return 0;
-}
-
-static void do_action(int action, struct cardstate *cs,
-		      struct bc_state *bcs,
-		      struct at_state_t **p_at_state, char **pp_command,
-		      int *p_genresp, int *p_resp_code,
-		      struct event_t *ev)
-{
-	struct at_state_t *at_state = *p_at_state;
-	struct bc_state *bcs2;
-	unsigned long flags;
-
-	int channel;
-
-	unsigned char *s, *e;
-	int i;
-	unsigned long val;
-
-	switch (action) {
-	case ACT_NOTHING:
-		break;
-	case ACT_TIMEOUT:
-		at_state->waiting = 1;
-		break;
-	case ACT_INIT:
-		cs->at_state.pending_commands &= ~PC_INIT;
-		cs->cur_at_seq = SEQ_NONE;
-		cs->mode = M_UNIMODEM;
-		spin_lock_irqsave(&cs->lock, flags);
-		if (!cs->cidmode) {
-			spin_unlock_irqrestore(&cs->lock, flags);
-			gigaset_free_channels(cs);
-			cs->mstate = MS_READY;
-			break;
-		}
-		spin_unlock_irqrestore(&cs->lock, flags);
-		cs->at_state.pending_commands |= PC_CIDMODE;
-		gig_dbg(DEBUG_EVENT, "Scheduling PC_CIDMODE");
-		cs->commands_pending = 1;
-		break;
-	case ACT_FAILINIT:
-		dev_warn(cs->dev, "Could not initialize the device.\n");
-		cs->dle = 0;
-		init_failed(cs, M_UNKNOWN);
-		cs->cur_at_seq = SEQ_NONE;
-		break;
-	case ACT_CONFIGMODE:
-		init_failed(cs, M_CONFIG);
-		cs->cur_at_seq = SEQ_NONE;
-		break;
-	case ACT_SETDLE1:
-		cs->dle = 1;
-		/* cs->inbuf[0].inputstate |= INS_command | INS_DLE_command; */
-		cs->inbuf[0].inputstate &=
-			~(INS_command | INS_DLE_command);
-		break;
-	case ACT_SETDLE0:
-		cs->dle = 0;
-		cs->inbuf[0].inputstate =
-			(cs->inbuf[0].inputstate & ~INS_DLE_command)
-			| INS_command;
-		break;
-	case ACT_CMODESET:
-		if (cs->mstate == MS_INIT || cs->mstate == MS_RECOVER) {
-			gigaset_free_channels(cs);
-			cs->mstate = MS_READY;
-		}
-		cs->mode = M_CID;
-		cs->cur_at_seq = SEQ_NONE;
-		break;
-	case ACT_UMODESET:
-		cs->mode = M_UNIMODEM;
-		cs->cur_at_seq = SEQ_NONE;
-		break;
-	case ACT_FAILCMODE:
-		cs->cur_at_seq = SEQ_NONE;
-		if (cs->mstate == MS_INIT || cs->mstate == MS_RECOVER) {
-			init_failed(cs, M_UNKNOWN);
-			break;
-		}
-		if (reinit_and_retry(cs, -1) < 0)
-			schedule_init(cs, MS_RECOVER);
-		break;
-	case ACT_FAILUMODE:
-		cs->cur_at_seq = SEQ_NONE;
-		schedule_init(cs, MS_RECOVER);
-		break;
-	case ACT_HUPMODEM:
-		/* send "+++" (hangup in unimodem mode) */
-		if (cs->connected) {
-			struct cmdbuf_t *cb;
-
-			cb = kmalloc(sizeof(struct cmdbuf_t) + 3, GFP_ATOMIC);
-			if (!cb) {
-				dev_err(cs->dev, "%s: out of memory\n",
-					__func__);
-				return;
-			}
-			memcpy(cb->buf, "+++", 3);
-			cb->len = 3;
-			cb->offset = 0;
-			cb->next = NULL;
-			cb->wake_tasklet = NULL;
-			cs->ops->write_cmd(cs, cb);
-		}
-		break;
-	case ACT_RING:
-		/* get fresh AT state structure for new CID */
-		at_state = get_free_channel(cs, ev->parameter);
-		if (!at_state) {
-			dev_warn(cs->dev,
-				 "RING ignored: could not allocate channel structure\n");
-			break;
-		}
-
-		/* initialize AT state structure
-		 * note that bcs may be NULL if no B channel is free
-		 */
-		at_state->ConState = 700;
-		for (i = 0; i < STR_NUM; ++i) {
-			kfree(at_state->str_var[i]);
-			at_state->str_var[i] = NULL;
-		}
-		at_state->int_var[VAR_ZCTP] = -1;
-
-		spin_lock_irqsave(&cs->lock, flags);
-		at_state->timer_expires = RING_TIMEOUT;
-		at_state->timer_active = 1;
-		spin_unlock_irqrestore(&cs->lock, flags);
-		break;
-	case ACT_ICALL:
-		handle_icall(cs, bcs, at_state);
-		break;
-	case ACT_FAILSDOWN:
-		dev_warn(cs->dev, "Could not shut down the device.\n");
-		/* fall through */
-	case ACT_FAKESDOWN:
-	case ACT_SDOWN:
-		cs->cur_at_seq = SEQ_NONE;
-		finish_shutdown(cs);
-		break;
-	case ACT_CONNECT:
-		if (cs->onechannel) {
-			at_state->pending_commands |= PC_DLE1;
-			cs->commands_pending = 1;
-			break;
-		}
-		bcs->chstate |= CHS_D_UP;
-		gigaset_isdn_connD(bcs);
-		cs->ops->init_bchannel(bcs);
-		break;
-	case ACT_DLE1:
-		cs->cur_at_seq = SEQ_NONE;
-		bcs = cs->bcs + cs->curchannel;
-
-		bcs->chstate |= CHS_D_UP;
-		gigaset_isdn_connD(bcs);
-		cs->ops->init_bchannel(bcs);
-		break;
-	case ACT_FAKEHUP:
-		at_state->int_var[VAR_ZSAU] = ZSAU_NULL;
-		/* fall through */
-	case ACT_DISCONNECT:
-		cs->cur_at_seq = SEQ_NONE;
-		at_state->cid = -1;
-		if (!bcs) {
-			disconnect_nobc(p_at_state, cs);
-		} else if (cs->onechannel && cs->dle) {
-			/* Check for other open channels not needed:
-			 * DLE only used for M10x with one B channel.
-			 */
-			at_state->pending_commands |= PC_DLE0;
-			cs->commands_pending = 1;
-		} else {
-			disconnect_bc(at_state, cs, bcs);
-		}
-		break;
-	case ACT_FAKEDLE0:
-		at_state->int_var[VAR_ZDLE] = 0;
-		cs->dle = 0;
-		/* fall through */
-	case ACT_DLE0:
-		cs->cur_at_seq = SEQ_NONE;
-		bcs2 = cs->bcs + cs->curchannel;
-		disconnect_bc(&bcs2->at_state, cs, bcs2);
-		break;
-	case ACT_ABORTHUP:
-		cs->cur_at_seq = SEQ_NONE;
-		dev_warn(cs->dev, "Could not hang up.\n");
-		at_state->cid = -1;
-		if (!bcs)
-			disconnect_nobc(p_at_state, cs);
-		else if (cs->onechannel)
-			at_state->pending_commands |= PC_DLE0;
-		else
-			disconnect_bc(at_state, cs, bcs);
-		schedule_init(cs, MS_RECOVER);
-		break;
-	case ACT_FAILDLE0:
-		cs->cur_at_seq = SEQ_NONE;
-		dev_warn(cs->dev, "Error leaving DLE mode.\n");
-		cs->dle = 0;
-		bcs2 = cs->bcs + cs->curchannel;
-		disconnect_bc(&bcs2->at_state, cs, bcs2);
-		schedule_init(cs, MS_RECOVER);
-		break;
-	case ACT_FAILDLE1:
-		cs->cur_at_seq = SEQ_NONE;
-		dev_warn(cs->dev,
-			 "Could not enter DLE mode. Trying to hang up.\n");
-		channel = cs->curchannel;
-		cs->bcs[channel].at_state.pending_commands |= PC_HUP;
-		cs->commands_pending = 1;
-		break;
-
-	case ACT_CID: /* got cid; start dialing */
-		cs->cur_at_seq = SEQ_NONE;
-		channel = cs->curchannel;
-		if (ev->parameter > 0 && ev->parameter <= 65535) {
-			cs->bcs[channel].at_state.cid = ev->parameter;
-			cs->bcs[channel].at_state.pending_commands |=
-				PC_DIAL;
-			cs->commands_pending = 1;
-			break;
-		}
-		/* fall through - bad cid */
-	case ACT_FAILCID:
-		cs->cur_at_seq = SEQ_NONE;
-		channel = cs->curchannel;
-		if (reinit_and_retry(cs, channel) < 0) {
-			dev_warn(cs->dev,
-				 "Could not get a call ID. Cannot dial.\n");
-			bcs2 = cs->bcs + channel;
-			disconnect_bc(&bcs2->at_state, cs, bcs2);
-		}
-		break;
-	case ACT_ABORTCID:
-		cs->cur_at_seq = SEQ_NONE;
-		bcs2 = cs->bcs + cs->curchannel;
-		disconnect_bc(&bcs2->at_state, cs, bcs2);
-		break;
-
-	case ACT_DIALING:
-	case ACT_ACCEPTED:
-		cs->cur_at_seq = SEQ_NONE;
-		break;
-
-	case ACT_ABORTACCEPT:	/* hangup/error/timeout during ICALL procssng */
-		if (bcs)
-			disconnect_bc(at_state, cs, bcs);
-		else
-			disconnect_nobc(p_at_state, cs);
-		break;
-
-	case ACT_ABORTDIAL:	/* error/timeout during dial preparation */
-		cs->cur_at_seq = SEQ_NONE;
-		at_state->pending_commands |= PC_HUP;
-		cs->commands_pending = 1;
-		break;
-
-	case ACT_REMOTEREJECT:	/* DISCONNECT_IND after dialling */
-	case ACT_CONNTIMEOUT:	/* timeout waiting for ZSAU=ACTIVE */
-	case ACT_REMOTEHUP:	/* DISCONNECT_IND with established connection */
-		at_state->pending_commands |= PC_HUP;
-		cs->commands_pending = 1;
-		break;
-	case ACT_GETSTRING: /* warning: RING, ZDLE, ...
-			       are not handled properly anymore */
-		at_state->getstring = 1;
-		break;
-	case ACT_SETVER:
-		if (!ev->ptr) {
-			*p_genresp = 1;
-			*p_resp_code = RSP_ERROR;
-			break;
-		}
-		s = ev->ptr;
-
-		if (!strcmp(s, "OK")) {
-			/* OK without version string: assume old response */
-			*p_genresp = 1;
-			*p_resp_code = RSP_NONE;
-			break;
-		}
-
-		for (i = 0; i < 4; ++i) {
-			val = simple_strtoul(s, (char **) &e, 10);
-			if (val > INT_MAX || e == s)
-				break;
-			if (i == 3) {
-				if (*e)
-					break;
-			} else if (*e != '.')
-				break;
-			else
-				s = e + 1;
-			cs->fwver[i] = val;
-		}
-		if (i != 4) {
-			*p_genresp = 1;
-			*p_resp_code = RSP_ERROR;
-			break;
-		}
-		cs->gotfwver = 0;
-		break;
-	case ACT_GOTVER:
-		if (cs->gotfwver == 0) {
-			cs->gotfwver = 1;
-			gig_dbg(DEBUG_EVENT,
-				"firmware version %02d.%03d.%02d.%02d",
-				cs->fwver[0], cs->fwver[1],
-				cs->fwver[2], cs->fwver[3]);
-			break;
-		}
-		/* fall through */
-	case ACT_FAILVER:
-		cs->gotfwver = -1;
-		dev_err(cs->dev, "could not read firmware version.\n");
-		break;
-	case ACT_ERROR:
-		gig_dbg(DEBUG_ANY, "%s: ERROR response in ConState %d",
-			__func__, at_state->ConState);
-		cs->cur_at_seq = SEQ_NONE;
-		break;
-	case ACT_DEBUG:
-		gig_dbg(DEBUG_ANY, "%s: resp_code %d in ConState %d",
-			__func__, ev->type, at_state->ConState);
-		break;
-	case ACT_WARN:
-		dev_warn(cs->dev, "%s: resp_code %d in ConState %d!\n",
-			 __func__, ev->type, at_state->ConState);
-		break;
-	case ACT_ZCAU:
-		dev_warn(cs->dev, "cause code %04x in connection state %d.\n",
-			 ev->parameter, at_state->ConState);
-		break;
-
-	/* events from the LL */
-
-	case ACT_DIAL:
-		if (!ev->ptr) {
-			*p_genresp = 1;
-			*p_resp_code = RSP_ERROR;
-			break;
-		}
-		start_dial(at_state, ev->ptr, ev->parameter);
-		break;
-	case ACT_ACCEPT:
-		start_accept(at_state);
-		break;
-	case ACT_HUP:
-		at_state->pending_commands |= PC_HUP;
-		gig_dbg(DEBUG_EVENT, "Scheduling PC_HUP");
-		cs->commands_pending = 1;
-		break;
-
-	/* hotplug events */
-
-	case ACT_STOP:
-		do_stop(cs);
-		break;
-	case ACT_START:
-		do_start(cs);
-		break;
-
-	/* events from the interface */
-
-	case ACT_IF_LOCK:
-		cs->cmd_result = ev->parameter ? do_lock(cs) : do_unlock(cs);
-		cs->waiting = 0;
-		wake_up(&cs->waitqueue);
-		break;
-	case ACT_IF_VER:
-		if (ev->parameter != 0)
-			cs->cmd_result = -EINVAL;
-		else if (cs->gotfwver != 1) {
-			cs->cmd_result = -ENOENT;
-		} else {
-			memcpy(ev->arg, cs->fwver, sizeof cs->fwver);
-			cs->cmd_result = 0;
-		}
-		cs->waiting = 0;
-		wake_up(&cs->waitqueue);
-		break;
-
-	/* events from the proc file system */
-
-	case ACT_PROC_CIDMODE:
-		spin_lock_irqsave(&cs->lock, flags);
-		if (ev->parameter != cs->cidmode) {
-			cs->cidmode = ev->parameter;
-			if (ev->parameter) {
-				cs->at_state.pending_commands |= PC_CIDMODE;
-				gig_dbg(DEBUG_EVENT, "Scheduling PC_CIDMODE");
-			} else {
-				cs->at_state.pending_commands |= PC_UMMODE;
-				gig_dbg(DEBUG_EVENT, "Scheduling PC_UMMODE");
-			}
-			cs->commands_pending = 1;
-		}
-		spin_unlock_irqrestore(&cs->lock, flags);
-		cs->waiting = 0;
-		wake_up(&cs->waitqueue);
-		break;
-
-	/* events from the hardware drivers */
-
-	case ACT_NOTIFY_BC_DOWN:
-		bchannel_down(bcs);
-		break;
-	case ACT_NOTIFY_BC_UP:
-		bchannel_up(bcs);
-		break;
-	case ACT_SHUTDOWN:
-		do_shutdown(cs);
-		break;
-
-
-	default:
-		if (action >= ACT_CMD && action < ACT_CMD + AT_NUM) {
-			*pp_command = at_state->bcs->commands[action - ACT_CMD];
-			if (!*pp_command) {
-				*p_genresp = 1;
-				*p_resp_code = RSP_NULL;
-			}
-		} else
-			dev_err(cs->dev, "%s: action==%d!\n", __func__, action);
-	}
-}
-
-/* State machine to do the calling and hangup procedure */
-static void process_event(struct cardstate *cs, struct event_t *ev)
-{
-	struct bc_state *bcs;
-	char *p_command = NULL;
-	struct reply_t *rep;
-	int rcode;
-	int genresp = 0;
-	int resp_code = RSP_ERROR;
-	struct at_state_t *at_state;
-	int index;
-	int curact;
-	unsigned long flags;
-
-	if (ev->cid >= 0) {
-		at_state = at_state_from_cid(cs, ev->cid);
-		if (!at_state) {
-			gig_dbg(DEBUG_EVENT, "event %d for invalid cid %d",
-				ev->type, ev->cid);
-			gigaset_add_event(cs, &cs->at_state, RSP_WRONG_CID,
-					  NULL, 0, NULL);
-			return;
-		}
-	} else {
-		at_state = ev->at_state;
-		if (at_state_invalid(cs, at_state)) {
-			gig_dbg(DEBUG_EVENT, "event for invalid at_state %p",
-				at_state);
-			return;
-		}
-	}
-
-	gig_dbg(DEBUG_EVENT, "connection state %d, event %d",
-		at_state->ConState, ev->type);
-
-	bcs = at_state->bcs;
-
-	/* Setting the pointer to the dial array */
-	rep = at_state->replystruct;
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (ev->type == EV_TIMEOUT) {
-		if (ev->parameter != at_state->timer_index
-		    || !at_state->timer_active) {
-			ev->type = RSP_NONE; /* old timeout */
-			gig_dbg(DEBUG_EVENT, "old timeout");
-		} else {
-			if (at_state->waiting)
-				gig_dbg(DEBUG_EVENT, "stopped waiting");
-			else
-				gig_dbg(DEBUG_EVENT, "timeout occurred");
-		}
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	/* if the response belongs to a variable in at_state->int_var[VAR_XXXX]
-	   or at_state->str_var[STR_XXXX], set it */
-	if (ev->type >= RSP_VAR && ev->type < RSP_VAR + VAR_NUM) {
-		index = ev->type - RSP_VAR;
-		at_state->int_var[index] = ev->parameter;
-	} else if (ev->type >= RSP_STR && ev->type < RSP_STR + STR_NUM) {
-		index = ev->type - RSP_STR;
-		kfree(at_state->str_var[index]);
-		at_state->str_var[index] = ev->ptr;
-		ev->ptr = NULL; /* prevent process_events() from
-				   deallocating ptr */
-	}
-
-	if (ev->type == EV_TIMEOUT || ev->type == RSP_STRING)
-		at_state->getstring = 0;
-
-	/* Search row in dial array which matches modem response and current
-	   constate */
-	for (;; rep++) {
-		rcode = rep->resp_code;
-		if (rcode == RSP_LAST) {
-			/* found nothing...*/
-			dev_warn(cs->dev, "%s: rcode=RSP_LAST: "
-				 "resp_code %d in ConState %d!\n",
-				 __func__, ev->type, at_state->ConState);
-			return;
-		}
-		if ((rcode == RSP_ANY || rcode == ev->type)
-		    && ((int) at_state->ConState >= rep->min_ConState)
-		    && (rep->max_ConState < 0
-			|| (int) at_state->ConState <= rep->max_ConState)
-		    && (rep->parameter < 0 || rep->parameter == ev->parameter))
-			break;
-	}
-
-	p_command = rep->command;
-
-	at_state->waiting = 0;
-	for (curact = 0; curact < MAXACT; ++curact) {
-		/* The row tells us what we should do  ..
-		 */
-		do_action(rep->action[curact], cs, bcs, &at_state, &p_command,
-			  &genresp, &resp_code, ev);
-		if (!at_state)
-			/* at_state destroyed by disconnect */
-			return;
-	}
-
-	/* Jump to the next con-state regarding the array */
-	if (rep->new_ConState >= 0)
-		at_state->ConState = rep->new_ConState;
-
-	if (genresp) {
-		spin_lock_irqsave(&cs->lock, flags);
-		at_state->timer_expires = 0;
-		at_state->timer_active = 0;
-		spin_unlock_irqrestore(&cs->lock, flags);
-		gigaset_add_event(cs, at_state, resp_code, NULL, 0, NULL);
-	} else {
-		/* Send command to modem if not NULL... */
-		if (p_command) {
-			if (cs->connected)
-				send_command(cs, p_command, at_state);
-			else
-				gigaset_add_event(cs, at_state, RSP_NODEV,
-						  NULL, 0, NULL);
-		}
-
-		spin_lock_irqsave(&cs->lock, flags);
-		if (!rep->timeout) {
-			at_state->timer_expires = 0;
-			at_state->timer_active = 0;
-		} else if (rep->timeout > 0) { /* new timeout */
-			at_state->timer_expires = rep->timeout * 10;
-			at_state->timer_active = 1;
-			++at_state->timer_index;
-		}
-		spin_unlock_irqrestore(&cs->lock, flags);
-	}
-}
-
-static void schedule_sequence(struct cardstate *cs,
-			      struct at_state_t *at_state, int sequence)
-{
-	cs->cur_at_seq = sequence;
-	gigaset_add_event(cs, at_state, RSP_INIT, NULL, sequence, NULL);
-}
-
-static void process_command_flags(struct cardstate *cs)
-{
-	struct at_state_t *at_state = NULL;
-	struct bc_state *bcs;
-	int i;
-	int sequence;
-	unsigned long flags;
-
-	cs->commands_pending = 0;
-
-	if (cs->cur_at_seq) {
-		gig_dbg(DEBUG_EVENT, "not searching scheduled commands: busy");
-		return;
-	}
-
-	gig_dbg(DEBUG_EVENT, "searching scheduled commands");
-
-	sequence = SEQ_NONE;
-
-	/* clear pending_commands and hangup channels on shutdown */
-	if (cs->at_state.pending_commands & PC_SHUTDOWN) {
-		cs->at_state.pending_commands &= ~PC_CIDMODE;
-		for (i = 0; i < cs->channels; ++i) {
-			bcs = cs->bcs + i;
-			at_state = &bcs->at_state;
-			at_state->pending_commands &=
-				~(PC_DLE1 | PC_ACCEPT | PC_DIAL);
-			if (at_state->cid > 0)
-				at_state->pending_commands |= PC_HUP;
-			if (at_state->pending_commands & PC_CID) {
-				at_state->pending_commands |= PC_NOCID;
-				at_state->pending_commands &= ~PC_CID;
-			}
-		}
-	}
-
-	/* clear pending_commands and hangup channels on reset */
-	if (cs->at_state.pending_commands & PC_INIT) {
-		cs->at_state.pending_commands &= ~PC_CIDMODE;
-		for (i = 0; i < cs->channels; ++i) {
-			bcs = cs->bcs + i;
-			at_state = &bcs->at_state;
-			at_state->pending_commands &=
-				~(PC_DLE1 | PC_ACCEPT | PC_DIAL);
-			if (at_state->cid > 0)
-				at_state->pending_commands |= PC_HUP;
-			if (cs->mstate == MS_RECOVER) {
-				if (at_state->pending_commands & PC_CID) {
-					at_state->pending_commands |= PC_NOCID;
-					at_state->pending_commands &= ~PC_CID;
-				}
-			}
-		}
-	}
-
-	/* only switch back to unimodem mode if no commands are pending and
-	 * no channels are up */
-	spin_lock_irqsave(&cs->lock, flags);
-	if (cs->at_state.pending_commands == PC_UMMODE
-	    && !cs->cidmode
-	    && list_empty(&cs->temp_at_states)
-	    && cs->mode == M_CID) {
-		sequence = SEQ_UMMODE;
-		at_state = &cs->at_state;
-		for (i = 0; i < cs->channels; ++i) {
-			bcs = cs->bcs + i;
-			if (bcs->at_state.pending_commands ||
-			    bcs->at_state.cid > 0) {
-				sequence = SEQ_NONE;
-				break;
-			}
-		}
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-	cs->at_state.pending_commands &= ~PC_UMMODE;
-	if (sequence != SEQ_NONE) {
-		schedule_sequence(cs, at_state, sequence);
-		return;
-	}
-
-	for (i = 0; i < cs->channels; ++i) {
-		bcs = cs->bcs + i;
-		if (bcs->at_state.pending_commands & PC_HUP) {
-			if (cs->dle) {
-				cs->curchannel = bcs->channel;
-				schedule_sequence(cs, &cs->at_state, SEQ_DLE0);
-				return;
-			}
-			bcs->at_state.pending_commands &= ~PC_HUP;
-			if (bcs->at_state.pending_commands & PC_CID) {
-				/* not yet dialing: PC_NOCID is sufficient */
-				bcs->at_state.pending_commands |= PC_NOCID;
-				bcs->at_state.pending_commands &= ~PC_CID;
-			} else {
-				schedule_sequence(cs, &bcs->at_state, SEQ_HUP);
-				return;
-			}
-		}
-		if (bcs->at_state.pending_commands & PC_NOCID) {
-			bcs->at_state.pending_commands &= ~PC_NOCID;
-			cs->curchannel = bcs->channel;
-			schedule_sequence(cs, &cs->at_state, SEQ_NOCID);
-			return;
-		} else if (bcs->at_state.pending_commands & PC_DLE0) {
-			bcs->at_state.pending_commands &= ~PC_DLE0;
-			cs->curchannel = bcs->channel;
-			schedule_sequence(cs, &cs->at_state, SEQ_DLE0);
-			return;
-		}
-	}
-
-	list_for_each_entry(at_state, &cs->temp_at_states, list)
-		if (at_state->pending_commands & PC_HUP) {
-			at_state->pending_commands &= ~PC_HUP;
-			schedule_sequence(cs, at_state, SEQ_HUP);
-			return;
-		}
-
-	if (cs->at_state.pending_commands & PC_INIT) {
-		cs->at_state.pending_commands &= ~PC_INIT;
-		cs->dle = 0;
-		cs->inbuf->inputstate = INS_command;
-		schedule_sequence(cs, &cs->at_state, SEQ_INIT);
-		return;
-	}
-	if (cs->at_state.pending_commands & PC_SHUTDOWN) {
-		cs->at_state.pending_commands &= ~PC_SHUTDOWN;
-		schedule_sequence(cs, &cs->at_state, SEQ_SHUTDOWN);
-		return;
-	}
-	if (cs->at_state.pending_commands & PC_CIDMODE) {
-		cs->at_state.pending_commands &= ~PC_CIDMODE;
-		if (cs->mode == M_UNIMODEM) {
-			cs->retry_count = 1;
-			schedule_sequence(cs, &cs->at_state, SEQ_CIDMODE);
-			return;
-		}
-	}
-
-	for (i = 0; i < cs->channels; ++i) {
-		bcs = cs->bcs + i;
-		if (bcs->at_state.pending_commands & PC_DLE1) {
-			bcs->at_state.pending_commands &= ~PC_DLE1;
-			cs->curchannel = bcs->channel;
-			schedule_sequence(cs, &cs->at_state, SEQ_DLE1);
-			return;
-		}
-		if (bcs->at_state.pending_commands & PC_ACCEPT) {
-			bcs->at_state.pending_commands &= ~PC_ACCEPT;
-			schedule_sequence(cs, &bcs->at_state, SEQ_ACCEPT);
-			return;
-		}
-		if (bcs->at_state.pending_commands & PC_DIAL) {
-			bcs->at_state.pending_commands &= ~PC_DIAL;
-			schedule_sequence(cs, &bcs->at_state, SEQ_DIAL);
-			return;
-		}
-		if (bcs->at_state.pending_commands & PC_CID) {
-			switch (cs->mode) {
-			case M_UNIMODEM:
-				cs->at_state.pending_commands |= PC_CIDMODE;
-				gig_dbg(DEBUG_EVENT, "Scheduling PC_CIDMODE");
-				cs->commands_pending = 1;
-				return;
-			case M_UNKNOWN:
-				schedule_init(cs, MS_INIT);
-				return;
-			}
-			bcs->at_state.pending_commands &= ~PC_CID;
-			cs->curchannel = bcs->channel;
-			cs->retry_count = 2;
-			schedule_sequence(cs, &cs->at_state, SEQ_CID);
-			return;
-		}
-	}
-}
-
-static void process_events(struct cardstate *cs)
-{
-	struct event_t *ev;
-	unsigned head, tail;
-	int i;
-	int check_flags = 0;
-	int was_busy;
-	unsigned long flags;
-
-	spin_lock_irqsave(&cs->ev_lock, flags);
-	head = cs->ev_head;
-
-	for (i = 0; i < 2 * MAX_EVENTS; ++i) {
-		tail = cs->ev_tail;
-		if (tail == head) {
-			if (!check_flags && !cs->commands_pending)
-				break;
-			check_flags = 0;
-			spin_unlock_irqrestore(&cs->ev_lock, flags);
-			process_command_flags(cs);
-			spin_lock_irqsave(&cs->ev_lock, flags);
-			tail = cs->ev_tail;
-			if (tail == head) {
-				if (!cs->commands_pending)
-					break;
-				continue;
-			}
-		}
-
-		ev = cs->events + head;
-		was_busy = cs->cur_at_seq != SEQ_NONE;
-		spin_unlock_irqrestore(&cs->ev_lock, flags);
-		process_event(cs, ev);
-		spin_lock_irqsave(&cs->ev_lock, flags);
-		kfree(ev->ptr);
-		ev->ptr = NULL;
-		if (was_busy && cs->cur_at_seq == SEQ_NONE)
-			check_flags = 1;
-
-		head = (head + 1) % MAX_EVENTS;
-		cs->ev_head = head;
-	}
-
-	spin_unlock_irqrestore(&cs->ev_lock, flags);
-
-	if (i == 2 * MAX_EVENTS) {
-		dev_err(cs->dev,
-			"infinite loop in process_events; aborting.\n");
-	}
-}
-
-/* tasklet scheduled on any event received from the Gigaset device
- * parameter:
- *	data	ISDN controller state structure
- */
-void gigaset_handle_event(unsigned long data)
-{
-	struct cardstate *cs = (struct cardstate *) data;
-
-	/* handle incoming data on control/common channel */
-	if (cs->inbuf->head != cs->inbuf->tail) {
-		gig_dbg(DEBUG_INTR, "processing new data");
-		cs->ops->handle_input(cs->inbuf);
-	}
-
-	process_events(cs);
-}
diff --git a/drivers/staging/isdn/gigaset/gigaset.h b/drivers/staging/isdn/gigaset/gigaset.h
deleted file mode 100644
index 0ecc2b5ea553..000000000000
--- a/drivers/staging/isdn/gigaset/gigaset.h
+++ /dev/null
@@ -1,827 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-or-later */
-/*
- * Siemens Gigaset 307x driver
- * Common header file for all connection variants
- *
- * Written by Stefan Eilers
- *        and Hansjoerg Lipp <hjlipp@web.de>
- *
- * =====================================================================
- * =====================================================================
- */
-
-#ifndef GIGASET_H
-#define GIGASET_H
-
-/* define global prefix for pr_ macros in linux/kernel.h */
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/kernel.h>
-#include <linux/sched.h>
-#include <linux/compiler.h>
-#include <linux/types.h>
-#include <linux/ctype.h>
-#include <linux/slab.h>
-#include <linux/spinlock.h>
-#include <linux/skbuff.h>
-#include <linux/netdevice.h>
-#include <linux/ppp_defs.h>
-#include <linux/timer.h>
-#include <linux/interrupt.h>
-#include <linux/tty.h>
-#include <linux/tty_driver.h>
-#include <linux/list.h>
-#include <linux/atomic.h>
-
-#define GIG_VERSION {0, 5, 0, 0}
-#define GIG_COMPAT  {0, 4, 0, 0}
-
-#define MAX_REC_PARAMS 10	/* Max. number of params in response string */
-#define MAX_RESP_SIZE 511	/* Max. size of a response string */
-
-#define MAX_EVENTS 64		/* size of event queue */
-
-#define RBUFSIZE 8192
-
-#define GIG_TICK 100		/* in milliseconds */
-
-/* timeout values (unit: 1 sec) */
-#define INIT_TIMEOUT 1
-
-/* timeout values (unit: 0.1 sec) */
-#define RING_TIMEOUT 3		/* for additional parameters to RING */
-#define BAS_TIMEOUT 20		/* for response to Base USB ops */
-#define ATRDY_TIMEOUT 3		/* for HD_READY_SEND_ATDATA */
-
-#define BAS_RETRY 3		/* max. retries for base USB ops */
-
-#define MAXACT 3
-
-extern int gigaset_debuglevel;	/* "needs" cast to (enum debuglevel) */
-
-/* debug flags, combine by adding/bitwise OR */
-enum debuglevel {
-	DEBUG_INTR	  = 0x00008, /* interrupt processing */
-	DEBUG_CMD	  = 0x00020, /* sent/received LL commands */
-	DEBUG_STREAM	  = 0x00040, /* application data stream I/O events */
-	DEBUG_STREAM_DUMP = 0x00080, /* application data stream content */
-	DEBUG_LLDATA	  = 0x00100, /* sent/received LL data */
-	DEBUG_EVENT	  = 0x00200, /* event processing */
-	DEBUG_HDLC	  = 0x00800, /* M10x HDLC processing */
-	DEBUG_CHANNEL	  = 0x01000, /* channel allocation/deallocation */
-	DEBUG_TRANSCMD	  = 0x02000, /* AT-COMMANDS+RESPONSES */
-	DEBUG_MCMD	  = 0x04000, /* COMMANDS THAT ARE SENT VERY OFTEN */
-	DEBUG_INIT	  = 0x08000, /* (de)allocation+initialization of data
-					structures */
-	DEBUG_SUSPEND	  = 0x10000, /* suspend/resume processing */
-	DEBUG_OUTPUT	  = 0x20000, /* output to device */
-	DEBUG_ISO	  = 0x40000, /* isochronous transfers */
-	DEBUG_IF	  = 0x80000, /* character device operations */
-	DEBUG_USBREQ	  = 0x100000, /* USB communication (except payload
-					 data) */
-	DEBUG_LOCKCMD	  = 0x200000, /* AT commands and responses when
-					 MS_LOCKED */
-
-	DEBUG_ANY	  = 0x3fffff, /* print message if any of the others is
-					 activated */
-};
-
-#ifdef CONFIG_GIGASET_DEBUG
-
-#define gig_dbg(level, format, arg...)					\
-	do {								\
-		if (unlikely(((enum debuglevel)gigaset_debuglevel) & (level))) \
-			printk(KERN_DEBUG KBUILD_MODNAME ": " format "\n", \
-			       ## arg);					\
-	} while (0)
-#define DEBUG_DEFAULT (DEBUG_TRANSCMD | DEBUG_CMD | DEBUG_USBREQ)
-
-#else
-
-#define gig_dbg(level, format, arg...) do {} while (0)
-#define DEBUG_DEFAULT 0
-
-#endif
-
-void gigaset_dbg_buffer(enum debuglevel level, const unsigned char *msg,
-			size_t len, const unsigned char *buf);
-
-/* connection state */
-#define ZSAU_NONE			0
-#define ZSAU_PROCEEDING			1
-#define ZSAU_CALL_DELIVERED		2
-#define ZSAU_ACTIVE			3
-#define ZSAU_DISCONNECT_IND		4
-#define ZSAU_NULL			5
-#define ZSAU_DISCONNECT_REQ		6
-#define ZSAU_UNKNOWN			-1
-
-/* USB control transfer requests */
-#define OUT_VENDOR_REQ	(USB_DIR_OUT | USB_TYPE_VENDOR | USB_RECIP_ENDPOINT)
-#define IN_VENDOR_REQ	(USB_DIR_IN | USB_TYPE_VENDOR | USB_RECIP_ENDPOINT)
-
-/* interrupt pipe messages */
-#define HD_B1_FLOW_CONTROL		0x80
-#define HD_B2_FLOW_CONTROL		0x81
-#define HD_RECEIVEATDATA_ACK		(0x35)		/* 3070 */
-#define HD_READY_SEND_ATDATA		(0x36)		/* 3070 */
-#define HD_OPEN_ATCHANNEL_ACK		(0x37)		/* 3070 */
-#define HD_CLOSE_ATCHANNEL_ACK		(0x38)		/* 3070 */
-#define HD_DEVICE_INIT_OK		(0x11)		/* ISurf USB + 3070 */
-#define HD_OPEN_B1CHANNEL_ACK		(0x51)		/* ISurf USB + 3070 */
-#define HD_OPEN_B2CHANNEL_ACK		(0x52)		/* ISurf USB + 3070 */
-#define HD_CLOSE_B1CHANNEL_ACK		(0x53)		/* ISurf USB + 3070 */
-#define HD_CLOSE_B2CHANNEL_ACK		(0x54)		/* ISurf USB + 3070 */
-#define HD_SUSPEND_END			(0x61)		/* ISurf USB */
-#define HD_RESET_INTERRUPT_PIPE_ACK	(0xFF)		/* ISurf USB + 3070 */
-
-/* control requests */
-#define	HD_OPEN_B1CHANNEL		(0x23)		/* ISurf USB + 3070 */
-#define	HD_CLOSE_B1CHANNEL		(0x24)		/* ISurf USB + 3070 */
-#define	HD_OPEN_B2CHANNEL		(0x25)		/* ISurf USB + 3070 */
-#define	HD_CLOSE_B2CHANNEL		(0x26)		/* ISurf USB + 3070 */
-#define HD_RESET_INTERRUPT_PIPE		(0x27)		/* ISurf USB + 3070 */
-#define	HD_DEVICE_INIT_ACK		(0x34)		/* ISurf USB + 3070 */
-#define	HD_WRITE_ATMESSAGE		(0x12)		/* 3070 */
-#define	HD_READ_ATMESSAGE		(0x13)		/* 3070 */
-#define	HD_OPEN_ATCHANNEL		(0x28)		/* 3070 */
-#define	HD_CLOSE_ATCHANNEL		(0x29)		/* 3070 */
-
-/* number of B channels supported by base driver */
-#define BAS_CHANNELS	2
-
-/* USB frames for isochronous transfer */
-#define BAS_FRAMETIME	1	/* number of milliseconds between frames */
-#define BAS_NUMFRAMES	8	/* number of frames per URB */
-#define BAS_MAXFRAME	16	/* allocated bytes per frame */
-#define BAS_NORMFRAME	8	/* send size without flow control */
-#define BAS_HIGHFRAME	10	/* "    "    with positive flow control */
-#define BAS_LOWFRAME	5	/* "    "    with negative flow control */
-#define BAS_CORRFRAMES	4	/* flow control multiplicator */
-
-#define BAS_INBUFSIZE	(BAS_MAXFRAME * BAS_NUMFRAMES)	/* size of isoc in buf
-							 * per URB */
-#define BAS_OUTBUFSIZE	4096		/* size of common isoc out buffer */
-#define BAS_OUTBUFPAD	BAS_MAXFRAME	/* size of pad area for isoc out buf */
-
-#define BAS_INURBS	3
-#define BAS_OUTURBS	3
-
-/* variable commands in struct bc_state */
-#define AT_ISO		0
-#define AT_DIAL		1
-#define AT_MSN		2
-#define AT_BC		3
-#define AT_PROTO	4
-#define AT_TYPE		5
-#define AT_CLIP		6
-/* total number */
-#define AT_NUM		7
-
-/* variables in struct at_state_t */
-/* - numeric */
-#define VAR_ZSAU	0
-#define VAR_ZDLE	1
-#define VAR_ZCTP	2
-/* total number */
-#define VAR_NUM		3
-/* - string */
-#define STR_NMBR	0
-#define STR_ZCPN	1
-#define STR_ZCON	2
-#define STR_ZBC		3
-#define STR_ZHLC	4
-/* total number */
-#define STR_NUM		5
-
-/* event types */
-#define EV_TIMEOUT	-105
-#define EV_IF_VER	-106
-#define EV_PROC_CIDMODE	-107
-#define EV_SHUTDOWN	-108
-#define EV_START	-110
-#define EV_STOP		-111
-#define EV_IF_LOCK	-112
-#define EV_ACCEPT	-114
-#define EV_DIAL		-115
-#define EV_HUP		-116
-#define EV_BC_OPEN	-117
-#define EV_BC_CLOSED	-118
-
-/* input state */
-#define INS_command	0x0001	/* receiving messages (not payload data) */
-#define INS_DLE_char	0x0002	/* DLE flag received (in DLE mode) */
-#define INS_byte_stuff	0x0004
-#define INS_have_data	0x0008
-#define INS_DLE_command	0x0020	/* DLE message start (<DLE> X) received */
-#define INS_flag_hunt	0x0040
-
-/* channel state */
-#define CHS_D_UP	0x01
-#define CHS_B_UP	0x02
-#define CHS_NOTIFY_LL	0x04
-
-#define ICALL_REJECT	0
-#define ICALL_ACCEPT	1
-#define ICALL_IGNORE	2
-
-/* device state */
-#define MS_UNINITIALIZED	0
-#define MS_INIT			1
-#define MS_LOCKED		2
-#define MS_SHUTDOWN		3
-#define MS_RECOVER		4
-#define MS_READY		5
-
-/* mode */
-#define M_UNKNOWN	0
-#define M_CONFIG	1
-#define M_UNIMODEM	2
-#define M_CID		3
-
-/* start mode */
-#define SM_LOCKED	0
-#define SM_ISDN		1 /* default */
-
-/* layer 2 protocols (AT^SBPR=...) */
-#define L2_BITSYNC	0
-#define L2_HDLC		1
-#define L2_VOICE	2
-
-struct gigaset_ops;
-struct gigaset_driver;
-
-struct usb_cardstate;
-struct ser_cardstate;
-struct bas_cardstate;
-
-struct bc_state;
-struct usb_bc_state;
-struct ser_bc_state;
-struct bas_bc_state;
-
-struct reply_t {
-	int	resp_code;	/* RSP_XXXX */
-	int	min_ConState;	/* <0 => ignore */
-	int	max_ConState;	/* <0 => ignore */
-	int	parameter;	/* e.g. ZSAU_XXXX <0: ignore*/
-	int	new_ConState;	/* <0 => ignore */
-	int	timeout;	/* >0 => *HZ; <=0 => TOUT_XXXX*/
-	int	action[MAXACT];	/* ACT_XXXX */
-	char	*command;	/* NULL==none */
-};
-
-extern struct reply_t gigaset_tab_cid[];
-extern struct reply_t gigaset_tab_nocid[];
-
-struct inbuf_t {
-	struct cardstate	*cs;
-	int			inputstate;
-	int			head, tail;
-	unsigned char		data[RBUFSIZE];
-};
-
-/* isochronous write buffer structure
- * circular buffer with pad area for extraction of complete USB frames
- * - data[read..nextread-1] is valid data already submitted to the USB subsystem
- * - data[nextread..write-1] is valid data yet to be sent
- * - data[write] is the next byte to write to
- *   - in byte-oriented L2 procotols, it is completely free
- *   - in bit-oriented L2 procotols, it may contain a partial byte of valid data
- * - data[write+1..read-1] is free
- * - wbits is the number of valid data bits in data[write], starting at the LSB
- * - writesem is the semaphore for writing to the buffer:
- *   if writesem <= 0, data[write..read-1] is currently being written to
- * - idle contains the byte value to repeat when the end of valid data is
- *   reached; if nextread==write (buffer contains no data to send), either the
- *   BAS_OUTBUFPAD bytes immediately before data[write] (if
- *   write>=BAS_OUTBUFPAD) or those of the pad area (if write<BAS_OUTBUFPAD)
- *   are also filled with that value
- */
-struct isowbuf_t {
-	int		read;
-	int		nextread;
-	int		write;
-	atomic_t	writesem;
-	int		wbits;
-	unsigned char	data[BAS_OUTBUFSIZE + BAS_OUTBUFPAD];
-	unsigned char	idle;
-};
-
-/* isochronous write URB context structure
- * data to be stored along with the URB and retrieved when it is returned
- * as completed by the USB subsystem
- * - urb: pointer to the URB itself
- * - bcs: pointer to the B Channel control structure
- * - limit: end of write buffer area covered by this URB
- * - status: URB completion status
- */
-struct isow_urbctx_t {
-	struct urb *urb;
-	struct bc_state *bcs;
-	int limit;
-	int status;
-};
-
-/* AT state structure
- * data associated with the state of an ISDN connection, whether or not
- * it is currently assigned a B channel
- */
-struct at_state_t {
-	struct list_head	list;
-	int			waiting;
-	int			getstring;
-	unsigned		timer_index;
-	unsigned long		timer_expires;
-	int			timer_active;
-	unsigned int		ConState;	/* State of connection */
-	struct reply_t		*replystruct;
-	int			cid;
-	int			int_var[VAR_NUM];	/* see VAR_XXXX */
-	char			*str_var[STR_NUM];	/* see STR_XXXX */
-	unsigned		pending_commands;	/* see PC_XXXX */
-	unsigned		seq_index;
-
-	struct cardstate	*cs;
-	struct bc_state		*bcs;
-};
-
-struct event_t {
-	int type;
-	void *ptr, *arg;
-	int parameter;
-	int cid;
-	struct at_state_t *at_state;
-};
-
-/* This buffer holds all information about the used B-Channel */
-struct bc_state {
-	struct sk_buff *tx_skb;		/* Current transfer buffer to modem */
-	struct sk_buff_head squeue;	/* B-Channel send Queue */
-
-	/* Variables for debugging .. */
-	int corrupted;			/* Counter for corrupted packages */
-	int trans_down;			/* Counter of packages (downstream) */
-	int trans_up;			/* Counter of packages (upstream) */
-
-	struct at_state_t at_state;
-
-	/* receive buffer */
-	unsigned rx_bufsize;		/* max size accepted by application */
-	struct sk_buff *rx_skb;
-	__u16 rx_fcs;
-	int inputstate;			/* see INS_XXXX */
-
-	int channel;
-
-	struct cardstate *cs;
-
-	unsigned chstate;		/* bitmap (CHS_*) */
-	int ignore;
-	unsigned proto2;		/* layer 2 protocol (L2_*) */
-	char *commands[AT_NUM];		/* see AT_XXXX */
-
-#ifdef CONFIG_GIGASET_DEBUG
-	int emptycount;
-#endif
-	int busy;
-	int use_count;
-
-	/* private data of hardware drivers */
-	union {
-		struct ser_bc_state *ser;	/* serial hardware driver */
-		struct usb_bc_state *usb;	/* usb hardware driver (m105) */
-		struct bas_bc_state *bas;	/* usb hardware driver (base) */
-	} hw;
-
-	void *ap;			/* associated LL application */
-	int apconnstate;		/* LL application connection state */
-	spinlock_t aplock;
-};
-
-struct cardstate {
-	struct gigaset_driver *driver;
-	unsigned minor_index;
-	struct device *dev;
-	struct device *tty_dev;
-	unsigned flags;
-
-	const struct gigaset_ops *ops;
-
-	/* Stuff to handle communication */
-	wait_queue_head_t waitqueue;
-	int waiting;
-	int mode;			/* see M_XXXX */
-	int mstate;			/* Modem state: see MS_XXXX */
-					/* only changed by the event layer */
-	int cmd_result;
-
-	int channels;
-	struct bc_state *bcs;		/* Array of struct bc_state */
-
-	int onechannel;			/* data and commands transmitted in one
-					   stream (M10x) */
-
-	spinlock_t lock;
-	struct at_state_t at_state;	/* at_state_t for cid == 0 */
-	struct list_head temp_at_states;/* list of temporary "struct
-					   at_state_t"s without B channel */
-
-	struct inbuf_t *inbuf;
-
-	struct cmdbuf_t *cmdbuf, *lastcmdbuf;
-	spinlock_t cmdlock;
-	unsigned curlen, cmdbytes;
-
-	struct tty_port port;
-	struct tasklet_struct if_wake_tasklet;
-	unsigned control_state;
-
-	unsigned fwver[4];
-	int gotfwver;
-
-	unsigned running;		/* !=0 if events are handled */
-	unsigned connected;		/* !=0 if hardware is connected */
-	unsigned isdn_up;		/* !=0 after gigaset_isdn_start() */
-
-	unsigned cidmode;
-
-	int myid;			/* id for communication with LL */
-	void *iif;			/* LL interface structure */
-	unsigned short hw_hdr_len;	/* headroom needed in data skbs */
-
-	struct reply_t *tabnocid;
-	struct reply_t *tabcid;
-	int cs_init;
-	int ignoreframes;		/* frames to ignore after setting up the
-					   B channel */
-	struct mutex mutex;		/* locks this structure:
-					 *   connected is not changed,
-					 *   hardware_up is not changed,
-					 *   MState is not changed to or from
-					 *   MS_LOCKED */
-
-	struct timer_list timer;
-	int retry_count;
-	int dle;			/* !=0 if DLE mode is active
-					   (ZDLE=1 received -- M10x only) */
-	int cur_at_seq;			/* sequence of AT commands being
-					   processed */
-	int curchannel;			/* channel those commands are meant
-					   for */
-	int commands_pending;		/* flag(s) in xxx.commands_pending have
-					   been set */
-	struct tasklet_struct
-		event_tasklet;		/* tasklet for serializing AT commands.
-					 * Scheduled
-					 *   -> for modem responses (and
-					 *      incoming data for M10x)
-					 *   -> on timeout
-					 *   -> after setting bits in
-					 *      xxx.at_state.pending_commands
-					 *      (e.g. command from LL) */
-	struct tasklet_struct
-		write_tasklet;		/* tasklet for serial output
-					 * (not used in base driver) */
-
-	/* event queue */
-	struct event_t events[MAX_EVENTS];
-	unsigned ev_tail, ev_head;
-	spinlock_t ev_lock;
-
-	/* current modem response */
-	unsigned char respdata[MAX_RESP_SIZE + 1];
-	unsigned cbytes;
-
-	/* private data of hardware drivers */
-	union {
-		struct usb_cardstate *usb; /* USB hardware driver (m105) */
-		struct ser_cardstate *ser; /* serial hardware driver */
-		struct bas_cardstate *bas; /* USB hardware driver (base) */
-	} hw;
-};
-
-struct gigaset_driver {
-	struct list_head list;
-	spinlock_t lock;		/* locks minor tables and blocked */
-	struct tty_driver *tty;
-	unsigned have_tty;
-	unsigned minor;
-	unsigned minors;
-	struct cardstate *cs;
-	int blocked;
-
-	const struct gigaset_ops *ops;
-	struct module *owner;
-};
-
-struct cmdbuf_t {
-	struct cmdbuf_t *next, *prev;
-	int len, offset;
-	struct tasklet_struct *wake_tasklet;
-	unsigned char buf[0];
-};
-
-struct bas_bc_state {
-	/* isochronous output state */
-	int		running;
-	atomic_t	corrbytes;
-	spinlock_t	isooutlock;
-	struct isow_urbctx_t	isoouturbs[BAS_OUTURBS];
-	struct isow_urbctx_t	*isooutdone, *isooutfree, *isooutovfl;
-	struct isowbuf_t	*isooutbuf;
-	unsigned numsub;		/* submitted URB counter
-					   (for diagnostic messages only) */
-	struct tasklet_struct	sent_tasklet;
-
-	/* isochronous input state */
-	spinlock_t isoinlock;
-	struct urb *isoinurbs[BAS_INURBS];
-	unsigned char isoinbuf[BAS_INBUFSIZE * BAS_INURBS];
-	struct urb *isoindone;		/* completed isoc read URB */
-	int isoinstatus;		/* status of completed URB */
-	int loststatus;			/* status of dropped URB */
-	unsigned isoinlost;		/* number of bytes lost */
-	/* state of bit unstuffing algorithm
-	   (in addition to BC_state.inputstate) */
-	unsigned seqlen;		/* number of '1' bits not yet
-					   unstuffed */
-	unsigned inbyte, inbits;	/* collected bits for next byte */
-	/* statistics */
-	unsigned goodbytes;		/* bytes correctly received */
-	unsigned alignerrs;		/* frames with incomplete byte at end */
-	unsigned fcserrs;		/* FCS errors */
-	unsigned frameerrs;		/* framing errors */
-	unsigned giants;		/* long frames */
-	unsigned runts;			/* short frames */
-	unsigned aborts;		/* HDLC aborts */
-	unsigned shared0s;		/* '0' bits shared between flags */
-	unsigned stolen0s;		/* '0' stuff bits also serving as
-					   leading flag bits */
-	struct tasklet_struct rcvd_tasklet;
-};
-
-struct gigaset_ops {
-	/* Called from ev-layer.c/interface.c for sending AT commands to the
-	   device */
-	int (*write_cmd)(struct cardstate *cs, struct cmdbuf_t *cb);
-
-	/* Called from interface.c for additional device control */
-	int (*write_room)(struct cardstate *cs);
-	int (*chars_in_buffer)(struct cardstate *cs);
-	int (*brkchars)(struct cardstate *cs, const unsigned char buf[6]);
-
-	/* Called from ev-layer.c after setting up connection
-	 * Should call gigaset_bchannel_up(), when finished. */
-	int (*init_bchannel)(struct bc_state *bcs);
-
-	/* Called from ev-layer.c after hanging up
-	 * Should call gigaset_bchannel_down(), when finished. */
-	int (*close_bchannel)(struct bc_state *bcs);
-
-	/* Called by gigaset_initcs() for setting up bcs->hw.xxx */
-	int (*initbcshw)(struct bc_state *bcs);
-
-	/* Called by gigaset_freecs() for freeing bcs->hw.xxx */
-	void (*freebcshw)(struct bc_state *bcs);
-
-	/* Called by gigaset_bchannel_down() for resetting bcs->hw.xxx */
-	void (*reinitbcshw)(struct bc_state *bcs);
-
-	/* Called by gigaset_initcs() for setting up cs->hw.xxx */
-	int (*initcshw)(struct cardstate *cs);
-
-	/* Called by gigaset_freecs() for freeing cs->hw.xxx */
-	void (*freecshw)(struct cardstate *cs);
-
-	/* Called from common.c/interface.c for additional serial port
-	   control */
-	int (*set_modem_ctrl)(struct cardstate *cs, unsigned old_state,
-			      unsigned new_state);
-	int (*baud_rate)(struct cardstate *cs, unsigned cflag);
-	int (*set_line_ctrl)(struct cardstate *cs, unsigned cflag);
-
-	/* Called from LL interface to put an skb into the send-queue.
-	 * After sending is completed, gigaset_skb_sent() must be called
-	 * with the skb's link layer header preserved. */
-	int (*send_skb)(struct bc_state *bcs, struct sk_buff *skb);
-
-	/* Called from ev-layer.c to process a block of data
-	 * received through the common/control channel. */
-	void (*handle_input)(struct inbuf_t *inbuf);
-
-};
-
-/* = Common structures and definitions =======================================
- */
-
-/* Parser states for DLE-Event:
- * <DLE-EVENT>: <DLE_FLAG> "X" <EVENT> <DLE_FLAG> "."
- * <DLE_FLAG>:  0x10
- * <EVENT>:     ((a-z)* | (A-Z)* | (0-10)*)+
- */
-#define DLE_FLAG	0x10
-
-/* ===========================================================================
- *  Functions implemented in asyncdata.c
- */
-
-/* Called from LL interface to put an skb into the send queue. */
-int gigaset_m10x_send_skb(struct bc_state *bcs, struct sk_buff *skb);
-
-/* Called from ev-layer.c to process a block of data
- * received through the common/control channel. */
-void gigaset_m10x_input(struct inbuf_t *inbuf);
-
-/* ===========================================================================
- *  Functions implemented in isocdata.c
- */
-
-/* Called from LL interface to put an skb into the send queue. */
-int gigaset_isoc_send_skb(struct bc_state *bcs, struct sk_buff *skb);
-
-/* Called from ev-layer.c to process a block of data
- * received through the common/control channel. */
-void gigaset_isoc_input(struct inbuf_t *inbuf);
-
-/* Called from bas-gigaset.c to process a block of data
- * received through the isochronous channel */
-void gigaset_isoc_receive(unsigned char *src, unsigned count,
-			  struct bc_state *bcs);
-
-/* Called from bas-gigaset.c to put a block of data
- * into the isochronous output buffer */
-int gigaset_isoc_buildframe(struct bc_state *bcs, unsigned char *in, int len);
-
-/* Called from bas-gigaset.c to initialize the isochronous output buffer */
-void gigaset_isowbuf_init(struct isowbuf_t *iwb, unsigned char idle);
-
-/* Called from bas-gigaset.c to retrieve a block of bytes for sending */
-int gigaset_isowbuf_getbytes(struct isowbuf_t *iwb, int size);
-
-/* ===========================================================================
- *  Functions implemented in LL interface
- */
-
-/* Called from common.c for setting up/shutting down with the ISDN subsystem */
-void gigaset_isdn_regdrv(void);
-void gigaset_isdn_unregdrv(void);
-int gigaset_isdn_regdev(struct cardstate *cs, const char *isdnid);
-void gigaset_isdn_unregdev(struct cardstate *cs);
-
-/* Called from hardware module to indicate completion of an skb */
-void gigaset_skb_sent(struct bc_state *bcs, struct sk_buff *skb);
-void gigaset_skb_rcvd(struct bc_state *bcs, struct sk_buff *skb);
-void gigaset_isdn_rcv_err(struct bc_state *bcs);
-
-/* Called from common.c/ev-layer.c to indicate events relevant to the LL */
-void gigaset_isdn_start(struct cardstate *cs);
-void gigaset_isdn_stop(struct cardstate *cs);
-int gigaset_isdn_icall(struct at_state_t *at_state);
-void gigaset_isdn_connD(struct bc_state *bcs);
-void gigaset_isdn_hupD(struct bc_state *bcs);
-void gigaset_isdn_connB(struct bc_state *bcs);
-void gigaset_isdn_hupB(struct bc_state *bcs);
-
-/* ===========================================================================
- *  Functions implemented in ev-layer.c
- */
-
-/* tasklet called from common.c to process queued events */
-void gigaset_handle_event(unsigned long data);
-
-/* called from isocdata.c / asyncdata.c
- * when a complete modem response line has been received */
-void gigaset_handle_modem_response(struct cardstate *cs);
-
-/* ===========================================================================
- *  Functions implemented in proc.c
- */
-
-/* initialize sysfs for device */
-void gigaset_init_dev_sysfs(struct cardstate *cs);
-void gigaset_free_dev_sysfs(struct cardstate *cs);
-
-/* ===========================================================================
- *  Functions implemented in common.c/gigaset.h
- */
-
-void gigaset_bcs_reinit(struct bc_state *bcs);
-void gigaset_at_init(struct at_state_t *at_state, struct bc_state *bcs,
-		     struct cardstate *cs, int cid);
-int gigaset_get_channel(struct bc_state *bcs);
-struct bc_state *gigaset_get_free_channel(struct cardstate *cs);
-void gigaset_free_channel(struct bc_state *bcs);
-int gigaset_get_channels(struct cardstate *cs);
-void gigaset_free_channels(struct cardstate *cs);
-void gigaset_block_channels(struct cardstate *cs);
-
-/* Allocate and initialize driver structure. */
-struct gigaset_driver *gigaset_initdriver(unsigned minor, unsigned minors,
-					  const char *procname,
-					  const char *devname,
-					  const struct gigaset_ops *ops,
-					  struct module *owner);
-
-/* Deallocate driver structure. */
-void gigaset_freedriver(struct gigaset_driver *drv);
-
-struct cardstate *gigaset_get_cs_by_tty(struct tty_struct *tty);
-struct cardstate *gigaset_get_cs_by_id(int id);
-void gigaset_blockdriver(struct gigaset_driver *drv);
-
-/* Allocate and initialize card state. Calls hardware dependent
-   gigaset_init[b]cs(). */
-struct cardstate *gigaset_initcs(struct gigaset_driver *drv, int channels,
-				 int onechannel, int ignoreframes,
-				 int cidmode, const char *modulename);
-
-/* Free card state. Calls hardware dependent gigaset_free[b]cs(). */
-void gigaset_freecs(struct cardstate *cs);
-
-/* Tell common.c that hardware and driver are ready. */
-int gigaset_start(struct cardstate *cs);
-
-/* Tell common.c that the device is not present any more. */
-void gigaset_stop(struct cardstate *cs);
-
-/* Tell common.c that the driver is being unloaded. */
-int gigaset_shutdown(struct cardstate *cs);
-
-/* Append event to the queue.
- * Returns NULL on failure or a pointer to the event on success.
- * ptr must be kmalloc()ed (and not be freed by the caller).
- */
-struct event_t *gigaset_add_event(struct cardstate *cs,
-				  struct at_state_t *at_state, int type,
-				  void *ptr, int parameter, void *arg);
-
-/* Called on CONFIG1 command from frontend. */
-int gigaset_enterconfigmode(struct cardstate *cs);
-
-/* cs->lock must not be locked */
-static inline void gigaset_schedule_event(struct cardstate *cs)
-{
-	unsigned long flags;
-	spin_lock_irqsave(&cs->lock, flags);
-	if (cs->running)
-		tasklet_schedule(&cs->event_tasklet);
-	spin_unlock_irqrestore(&cs->lock, flags);
-}
-
-/* Tell common.c that B channel has been closed. */
-/* cs->lock must not be locked */
-static inline void gigaset_bchannel_down(struct bc_state *bcs)
-{
-	gigaset_add_event(bcs->cs, &bcs->at_state, EV_BC_CLOSED, NULL, 0, NULL);
-	gigaset_schedule_event(bcs->cs);
-}
-
-/* Tell common.c that B channel has been opened. */
-/* cs->lock must not be locked */
-static inline void gigaset_bchannel_up(struct bc_state *bcs)
-{
-	gigaset_add_event(bcs->cs, &bcs->at_state, EV_BC_OPEN, NULL, 0, NULL);
-	gigaset_schedule_event(bcs->cs);
-}
-
-/* set up next receive skb for data mode */
-static inline struct sk_buff *gigaset_new_rx_skb(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-	unsigned short hw_hdr_len = cs->hw_hdr_len;
-
-	if (bcs->ignore) {
-		bcs->rx_skb = NULL;
-	} else {
-		bcs->rx_skb = dev_alloc_skb(bcs->rx_bufsize + hw_hdr_len);
-		if (bcs->rx_skb == NULL)
-			dev_warn(cs->dev, "could not allocate skb\n");
-		else
-			skb_reserve(bcs->rx_skb, hw_hdr_len);
-	}
-	return bcs->rx_skb;
-}
-
-/* append received bytes to inbuf */
-int gigaset_fill_inbuf(struct inbuf_t *inbuf, const unsigned char *src,
-		       unsigned numbytes);
-
-/* ===========================================================================
- *  Functions implemented in interface.c
- */
-
-/* initialize interface */
-void gigaset_if_initdriver(struct gigaset_driver *drv, const char *procname,
-			   const char *devname);
-/* release interface */
-void gigaset_if_freedriver(struct gigaset_driver *drv);
-/* add minor */
-void gigaset_if_init(struct cardstate *cs);
-/* remove minor */
-void gigaset_if_free(struct cardstate *cs);
-/* device received data */
-void gigaset_if_receive(struct cardstate *cs,
-			unsigned char *buffer, size_t len);
-
-#endif
diff --git a/drivers/staging/isdn/gigaset/interface.c b/drivers/staging/isdn/gigaset/interface.c
deleted file mode 100644
index 9ddadd07e707..000000000000
--- a/drivers/staging/isdn/gigaset/interface.c
+++ /dev/null
@@ -1,613 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * interface to user space for the gigaset driver
- *
- * Copyright (c) 2004 by Hansjoerg Lipp <hjlipp@web.de>
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/gigaset_dev.h>
-#include <linux/tty_flip.h>
-#include <linux/module.h>
-
-/*** our ioctls ***/
-
-static int if_lock(struct cardstate *cs, int *arg)
-{
-	int cmd = *arg;
-
-	gig_dbg(DEBUG_IF, "%u: if_lock (%d)", cs->minor_index, cmd);
-
-	if (cmd > 1)
-		return -EINVAL;
-
-	if (cmd < 0) {
-		*arg = cs->mstate == MS_LOCKED;
-		return 0;
-	}
-
-	if (!cmd && cs->mstate == MS_LOCKED && cs->connected) {
-		cs->ops->set_modem_ctrl(cs, 0, TIOCM_DTR | TIOCM_RTS);
-		cs->ops->baud_rate(cs, B115200);
-		cs->ops->set_line_ctrl(cs, CS8);
-		cs->control_state = TIOCM_DTR | TIOCM_RTS;
-	}
-
-	cs->waiting = 1;
-	if (!gigaset_add_event(cs, &cs->at_state, EV_IF_LOCK,
-			       NULL, cmd, NULL)) {
-		cs->waiting = 0;
-		return -ENOMEM;
-	}
-	gigaset_schedule_event(cs);
-
-	wait_event(cs->waitqueue, !cs->waiting);
-
-	if (cs->cmd_result >= 0) {
-		*arg = cs->cmd_result;
-		return 0;
-	}
-
-	return cs->cmd_result;
-}
-
-static int if_version(struct cardstate *cs, unsigned arg[4])
-{
-	static const unsigned version[4] = GIG_VERSION;
-	static const unsigned compat[4] = GIG_COMPAT;
-	unsigned cmd = arg[0];
-
-	gig_dbg(DEBUG_IF, "%u: if_version (%d)", cs->minor_index, cmd);
-
-	switch (cmd) {
-	case GIGVER_DRIVER:
-		memcpy(arg, version, sizeof version);
-		return 0;
-	case GIGVER_COMPAT:
-		memcpy(arg, compat, sizeof compat);
-		return 0;
-	case GIGVER_FWBASE:
-		cs->waiting = 1;
-		if (!gigaset_add_event(cs, &cs->at_state, EV_IF_VER,
-				       NULL, 0, arg)) {
-			cs->waiting = 0;
-			return -ENOMEM;
-		}
-		gigaset_schedule_event(cs);
-
-		wait_event(cs->waitqueue, !cs->waiting);
-
-		if (cs->cmd_result >= 0)
-			return 0;
-
-		return cs->cmd_result;
-	default:
-		return -EINVAL;
-	}
-}
-
-static int if_config(struct cardstate *cs, int *arg)
-{
-	gig_dbg(DEBUG_IF, "%u: if_config (%d)", cs->minor_index, *arg);
-
-	if (*arg != 1)
-		return -EINVAL;
-
-	if (cs->mstate != MS_LOCKED)
-		return -EBUSY;
-
-	if (!cs->connected) {
-		pr_err("%s: not connected\n", __func__);
-		return -ENODEV;
-	}
-
-	*arg = 0;
-	return gigaset_enterconfigmode(cs);
-}
-
-/*** the terminal driver ***/
-
-static int if_open(struct tty_struct *tty, struct file *filp)
-{
-	struct cardstate *cs;
-
-	gig_dbg(DEBUG_IF, "%d+%d: %s()",
-		tty->driver->minor_start, tty->index, __func__);
-
-	cs = gigaset_get_cs_by_tty(tty);
-	if (!cs || !try_module_get(cs->driver->owner))
-		return -ENODEV;
-
-	if (mutex_lock_interruptible(&cs->mutex)) {
-		module_put(cs->driver->owner);
-		return -ERESTARTSYS;
-	}
-	tty->driver_data = cs;
-
-	++cs->port.count;
-
-	if (cs->port.count == 1) {
-		tty_port_tty_set(&cs->port, tty);
-		cs->port.low_latency = 1;
-	}
-
-	mutex_unlock(&cs->mutex);
-	return 0;
-}
-
-static void if_close(struct tty_struct *tty, struct file *filp)
-{
-	struct cardstate *cs = tty->driver_data;
-
-	if (!cs) { /* happens if we didn't find cs in open */
-		gig_dbg(DEBUG_IF, "%s: no cardstate", __func__);
-		return;
-	}
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	mutex_lock(&cs->mutex);
-
-	if (!cs->connected)
-		gig_dbg(DEBUG_IF, "not connected");	/* nothing to do */
-	else if (!cs->port.count)
-		dev_warn(cs->dev, "%s: device not opened\n", __func__);
-	else if (!--cs->port.count)
-		tty_port_tty_set(&cs->port, NULL);
-
-	mutex_unlock(&cs->mutex);
-
-	module_put(cs->driver->owner);
-}
-
-static int if_ioctl(struct tty_struct *tty,
-		    unsigned int cmd, unsigned long arg)
-{
-	struct cardstate *cs = tty->driver_data;
-	int retval = -ENODEV;
-	int int_arg;
-	unsigned char buf[6];
-	unsigned version[4];
-
-	gig_dbg(DEBUG_IF, "%u: %s(0x%x)", cs->minor_index, __func__, cmd);
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -ERESTARTSYS;
-
-	if (!cs->connected) {
-		gig_dbg(DEBUG_IF, "not connected");
-		retval = -ENODEV;
-	} else {
-		retval = 0;
-		switch (cmd) {
-		case GIGASET_REDIR:
-			retval = get_user(int_arg, (int __user *) arg);
-			if (retval >= 0)
-				retval = if_lock(cs, &int_arg);
-			if (retval >= 0)
-				retval = put_user(int_arg, (int __user *) arg);
-			break;
-		case GIGASET_CONFIG:
-			retval = get_user(int_arg, (int __user *) arg);
-			if (retval >= 0)
-				retval = if_config(cs, &int_arg);
-			if (retval >= 0)
-				retval = put_user(int_arg, (int __user *) arg);
-			break;
-		case GIGASET_BRKCHARS:
-			retval = copy_from_user(&buf,
-						(const unsigned char __user *) arg, 6)
-				? -EFAULT : 0;
-			if (retval >= 0) {
-				gigaset_dbg_buffer(DEBUG_IF, "GIGASET_BRKCHARS",
-						   6, buf);
-				retval = cs->ops->brkchars(cs, buf);
-			}
-			break;
-		case GIGASET_VERSION:
-			retval = copy_from_user(version,
-						(unsigned __user *) arg, sizeof version)
-				? -EFAULT : 0;
-			if (retval >= 0)
-				retval = if_version(cs, version);
-			if (retval >= 0)
-				retval = copy_to_user((unsigned __user *) arg,
-						      version, sizeof version)
-					? -EFAULT : 0;
-			break;
-		default:
-			gig_dbg(DEBUG_IF, "%s: arg not supported - 0x%04x",
-				__func__, cmd);
-			retval = -ENOIOCTLCMD;
-		}
-	}
-
-	mutex_unlock(&cs->mutex);
-
-	return retval;
-}
-
-#ifdef CONFIG_COMPAT
-static long if_compat_ioctl(struct tty_struct *tty,
-		    unsigned int cmd, unsigned long arg)
-{
-	return if_ioctl(tty, cmd, (unsigned long)compat_ptr(arg));
-}
-#endif
-
-static int if_tiocmget(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->driver_data;
-	int retval;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -ERESTARTSYS;
-
-	retval = cs->control_state & (TIOCM_RTS | TIOCM_DTR);
-
-	mutex_unlock(&cs->mutex);
-
-	return retval;
-}
-
-static int if_tiocmset(struct tty_struct *tty,
-		       unsigned int set, unsigned int clear)
-{
-	struct cardstate *cs = tty->driver_data;
-	int retval;
-	unsigned mc;
-
-	gig_dbg(DEBUG_IF, "%u: %s(0x%x, 0x%x)",
-		cs->minor_index, __func__, set, clear);
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -ERESTARTSYS;
-
-	if (!cs->connected) {
-		gig_dbg(DEBUG_IF, "not connected");
-		retval = -ENODEV;
-	} else {
-		mc = (cs->control_state | set) & ~clear & (TIOCM_RTS | TIOCM_DTR);
-		retval = cs->ops->set_modem_ctrl(cs, cs->control_state, mc);
-		cs->control_state = mc;
-	}
-
-	mutex_unlock(&cs->mutex);
-
-	return retval;
-}
-
-static int if_write(struct tty_struct *tty, const unsigned char *buf, int count)
-{
-	struct cardstate *cs = tty->driver_data;
-	struct cmdbuf_t *cb;
-	int retval;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -ERESTARTSYS;
-
-	if (!cs->connected) {
-		gig_dbg(DEBUG_IF, "not connected");
-		retval = -ENODEV;
-		goto done;
-	}
-	if (cs->mstate != MS_LOCKED) {
-		dev_warn(cs->dev, "can't write to unlocked device\n");
-		retval = -EBUSY;
-		goto done;
-	}
-	if (count <= 0) {
-		/* nothing to do */
-		retval = 0;
-		goto done;
-	}
-
-	cb = kmalloc(sizeof(struct cmdbuf_t) + count, GFP_KERNEL);
-	if (!cb) {
-		dev_err(cs->dev, "%s: out of memory\n", __func__);
-		retval = -ENOMEM;
-		goto done;
-	}
-
-	memcpy(cb->buf, buf, count);
-	cb->len = count;
-	cb->offset = 0;
-	cb->next = NULL;
-	cb->wake_tasklet = &cs->if_wake_tasklet;
-	retval = cs->ops->write_cmd(cs, cb);
-done:
-	mutex_unlock(&cs->mutex);
-	return retval;
-}
-
-static int if_write_room(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->driver_data;
-	int retval;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -ERESTARTSYS;
-
-	if (!cs->connected) {
-		gig_dbg(DEBUG_IF, "not connected");
-		retval = -ENODEV;
-	} else if (cs->mstate != MS_LOCKED) {
-		dev_warn(cs->dev, "can't write to unlocked device\n");
-		retval = -EBUSY;
-	} else
-		retval = cs->ops->write_room(cs);
-
-	mutex_unlock(&cs->mutex);
-
-	return retval;
-}
-
-static int if_chars_in_buffer(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->driver_data;
-	int retval = 0;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	mutex_lock(&cs->mutex);
-
-	if (!cs->connected)
-		gig_dbg(DEBUG_IF, "not connected");
-	else if (cs->mstate != MS_LOCKED)
-		dev_warn(cs->dev, "can't write to unlocked device\n");
-	else
-		retval = cs->ops->chars_in_buffer(cs);
-
-	mutex_unlock(&cs->mutex);
-
-	return retval;
-}
-
-static void if_throttle(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->driver_data;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	mutex_lock(&cs->mutex);
-
-	if (!cs->connected)
-		gig_dbg(DEBUG_IF, "not connected");	/* nothing to do */
-	else
-		gig_dbg(DEBUG_IF, "%s: not implemented\n", __func__);
-
-	mutex_unlock(&cs->mutex);
-}
-
-static void if_unthrottle(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->driver_data;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	mutex_lock(&cs->mutex);
-
-	if (!cs->connected)
-		gig_dbg(DEBUG_IF, "not connected");	/* nothing to do */
-	else
-		gig_dbg(DEBUG_IF, "%s: not implemented\n", __func__);
-
-	mutex_unlock(&cs->mutex);
-}
-
-static void if_set_termios(struct tty_struct *tty, struct ktermios *old)
-{
-	struct cardstate *cs = tty->driver_data;
-	unsigned int iflag;
-	unsigned int cflag;
-	unsigned int old_cflag;
-	unsigned int control_state, new_state;
-
-	gig_dbg(DEBUG_IF, "%u: %s()", cs->minor_index, __func__);
-
-	mutex_lock(&cs->mutex);
-
-	if (!cs->connected) {
-		gig_dbg(DEBUG_IF, "not connected");
-		goto out;
-	}
-
-	iflag = tty->termios.c_iflag;
-	cflag = tty->termios.c_cflag;
-	old_cflag = old ? old->c_cflag : cflag;
-	gig_dbg(DEBUG_IF, "%u: iflag %x cflag %x old %x",
-		cs->minor_index, iflag, cflag, old_cflag);
-
-	/* get a local copy of the current port settings */
-	control_state = cs->control_state;
-
-	/*
-	 * Update baud rate.
-	 * Do not attempt to cache old rates and skip settings,
-	 * disconnects screw such tricks up completely.
-	 * Premature optimization is the root of all evil.
-	 */
-
-	/* reassert DTR and (maybe) RTS on transition from B0 */
-	if ((old_cflag & CBAUD) == B0) {
-		new_state = control_state | TIOCM_DTR;
-		/* don't set RTS if using hardware flow control */
-		if (!(old_cflag & CRTSCTS))
-			new_state |= TIOCM_RTS;
-		gig_dbg(DEBUG_IF, "%u: from B0 - set DTR%s",
-			cs->minor_index,
-			(new_state & TIOCM_RTS) ? "/RTS" : " only");
-		cs->ops->set_modem_ctrl(cs, control_state, new_state);
-		control_state = new_state;
-	}
-
-	cs->ops->baud_rate(cs, cflag & CBAUD);
-
-	if ((cflag & CBAUD) == B0) {
-		/* Drop RTS and DTR */
-		gig_dbg(DEBUG_IF, "%u: to B0 - drop DTR/RTS", cs->minor_index);
-		new_state = control_state & ~(TIOCM_DTR | TIOCM_RTS);
-		cs->ops->set_modem_ctrl(cs, control_state, new_state);
-		control_state = new_state;
-	}
-
-	/*
-	 * Update line control register (LCR)
-	 */
-
-	cs->ops->set_line_ctrl(cs, cflag);
-
-	/* save off the modified port settings */
-	cs->control_state = control_state;
-
-out:
-	mutex_unlock(&cs->mutex);
-}
-
-static const struct tty_operations if_ops = {
-	.open =			if_open,
-	.close =		if_close,
-	.ioctl =		if_ioctl,
-#ifdef CONFIG_COMPAT
-	.compat_ioctl =		if_compat_ioctl,
-#endif
-	.write =		if_write,
-	.write_room =		if_write_room,
-	.chars_in_buffer =	if_chars_in_buffer,
-	.set_termios =		if_set_termios,
-	.throttle =		if_throttle,
-	.unthrottle =		if_unthrottle,
-	.tiocmget =		if_tiocmget,
-	.tiocmset =		if_tiocmset,
-};
-
-
-/* wakeup tasklet for the write operation */
-static void if_wake(unsigned long data)
-{
-	struct cardstate *cs = (struct cardstate *)data;
-
-	tty_port_tty_wakeup(&cs->port);
-}
-
-/*** interface to common ***/
-
-void gigaset_if_init(struct cardstate *cs)
-{
-	struct gigaset_driver *drv;
-
-	drv = cs->driver;
-	if (!drv->have_tty)
-		return;
-
-	tasklet_init(&cs->if_wake_tasklet, if_wake, (unsigned long) cs);
-
-	mutex_lock(&cs->mutex);
-	cs->tty_dev = tty_port_register_device(&cs->port, drv->tty,
-			cs->minor_index, NULL);
-
-	if (!IS_ERR(cs->tty_dev))
-		dev_set_drvdata(cs->tty_dev, cs);
-	else {
-		pr_warn("could not register device to the tty subsystem\n");
-		cs->tty_dev = NULL;
-	}
-	mutex_unlock(&cs->mutex);
-}
-
-void gigaset_if_free(struct cardstate *cs)
-{
-	struct gigaset_driver *drv;
-
-	drv = cs->driver;
-	if (!drv->have_tty)
-		return;
-
-	tasklet_disable(&cs->if_wake_tasklet);
-	tasklet_kill(&cs->if_wake_tasklet);
-	cs->tty_dev = NULL;
-	tty_unregister_device(drv->tty, cs->minor_index);
-}
-
-/**
- * gigaset_if_receive() - pass a received block of data to the tty device
- * @cs:		device descriptor structure.
- * @buffer:	received data.
- * @len:	number of bytes received.
- *
- * Called by asyncdata/isocdata if a block of data received from the
- * device must be sent to userspace through the ttyG* device.
- */
-void gigaset_if_receive(struct cardstate *cs,
-			unsigned char *buffer, size_t len)
-{
-	tty_insert_flip_string(&cs->port, buffer, len);
-	tty_flip_buffer_push(&cs->port);
-}
-EXPORT_SYMBOL_GPL(gigaset_if_receive);
-
-/* gigaset_if_initdriver
- * Initialize tty interface.
- * parameters:
- *	drv		Driver
- *	procname	Name of the driver (e.g. for /proc/tty/drivers)
- *	devname		Name of the device files (prefix without minor number)
- */
-void gigaset_if_initdriver(struct gigaset_driver *drv, const char *procname,
-			   const char *devname)
-{
-	int ret;
-	struct tty_driver *tty;
-
-	drv->have_tty = 0;
-
-	drv->tty = tty = alloc_tty_driver(drv->minors);
-	if (tty == NULL)
-		goto enomem;
-
-	tty->type =		TTY_DRIVER_TYPE_SERIAL;
-	tty->subtype =		SERIAL_TYPE_NORMAL;
-	tty->flags =		TTY_DRIVER_REAL_RAW | TTY_DRIVER_DYNAMIC_DEV;
-
-	tty->driver_name =	procname;
-	tty->name =		devname;
-	tty->minor_start =	drv->minor;
-
-	tty->init_termios          = tty_std_termios;
-	tty->init_termios.c_cflag  = B9600 | CS8 | CREAD | HUPCL | CLOCAL;
-	tty_set_operations(tty, &if_ops);
-
-	ret = tty_register_driver(tty);
-	if (ret < 0) {
-		pr_err("error %d registering tty driver\n", ret);
-		goto error;
-	}
-	gig_dbg(DEBUG_IF, "tty driver initialized");
-	drv->have_tty = 1;
-	return;
-
-enomem:
-	pr_err("out of memory\n");
-error:
-	if (drv->tty)
-		put_tty_driver(drv->tty);
-}
-
-void gigaset_if_freedriver(struct gigaset_driver *drv)
-{
-	if (!drv->have_tty)
-		return;
-
-	drv->have_tty = 0;
-	tty_unregister_driver(drv->tty);
-	put_tty_driver(drv->tty);
-}
diff --git a/drivers/staging/isdn/gigaset/isocdata.c b/drivers/staging/isdn/gigaset/isocdata.c
deleted file mode 100644
index 3ecf6e33ed15..000000000000
--- a/drivers/staging/isdn/gigaset/isocdata.c
+++ /dev/null
@@ -1,1006 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Common data handling layer for bas_gigaset
- *
- * Copyright (c) 2005 by Tilman Schmidt <tilman@imap.cc>,
- *                       Hansjoerg Lipp <hjlipp@web.de>.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/crc-ccitt.h>
-#include <linux/bitrev.h>
-
-/* access methods for isowbuf_t */
-/* ============================ */
-
-/* initialize buffer structure
- */
-void gigaset_isowbuf_init(struct isowbuf_t *iwb, unsigned char idle)
-{
-	iwb->read = 0;
-	iwb->nextread = 0;
-	iwb->write = 0;
-	atomic_set(&iwb->writesem, 1);
-	iwb->wbits = 0;
-	iwb->idle = idle;
-	memset(iwb->data + BAS_OUTBUFSIZE, idle, BAS_OUTBUFPAD);
-}
-
-/* compute number of bytes which can be appended to buffer
- * so that there is still room to append a maximum frame of flags
- */
-static inline int isowbuf_freebytes(struct isowbuf_t *iwb)
-{
-	int read, write, freebytes;
-
-	read = iwb->read;
-	write = iwb->write;
-	freebytes = read - write;
-	if (freebytes > 0) {
-		/* no wraparound: need padding space within regular area */
-		return freebytes - BAS_OUTBUFPAD;
-	} else if (read < BAS_OUTBUFPAD) {
-		/* wraparound: can use space up to end of regular area */
-		return BAS_OUTBUFSIZE - write;
-	} else {
-		/* following the wraparound yields more space */
-		return freebytes + BAS_OUTBUFSIZE - BAS_OUTBUFPAD;
-	}
-}
-
-/* start writing
- * acquire the write semaphore
- * return 0 if acquired, <0 if busy
- */
-static inline int isowbuf_startwrite(struct isowbuf_t *iwb)
-{
-	if (!atomic_dec_and_test(&iwb->writesem)) {
-		atomic_inc(&iwb->writesem);
-		gig_dbg(DEBUG_ISO, "%s: couldn't acquire iso write semaphore",
-			__func__);
-		return -EBUSY;
-	}
-	gig_dbg(DEBUG_ISO,
-		"%s: acquired iso write semaphore, data[write]=%02x, nbits=%d",
-		__func__, iwb->data[iwb->write], iwb->wbits);
-	return 0;
-}
-
-/* finish writing
- * release the write semaphore
- * returns the current write position
- */
-static inline int isowbuf_donewrite(struct isowbuf_t *iwb)
-{
-	int write = iwb->write;
-	atomic_inc(&iwb->writesem);
-	return write;
-}
-
-/* append bits to buffer without any checks
- * - data contains bits to append, starting at LSB
- * - nbits is number of bits to append (0..24)
- * must be called with the write semaphore held
- * If more than nbits bits are set in data, the extraneous bits are set in the
- * buffer too, but the write position is only advanced by nbits.
- */
-static inline void isowbuf_putbits(struct isowbuf_t *iwb, u32 data, int nbits)
-{
-	int write = iwb->write;
-	data <<= iwb->wbits;
-	data |= iwb->data[write];
-	nbits += iwb->wbits;
-	while (nbits >= 8) {
-		iwb->data[write++] = data & 0xff;
-		write %= BAS_OUTBUFSIZE;
-		data >>= 8;
-		nbits -= 8;
-	}
-	iwb->wbits = nbits;
-	iwb->data[write] = data & 0xff;
-	iwb->write = write;
-}
-
-/* put final flag on HDLC bitstream
- * also sets the idle fill byte to the correspondingly shifted flag pattern
- * must be called with the write semaphore held
- */
-static inline void isowbuf_putflag(struct isowbuf_t *iwb)
-{
-	int write;
-
-	/* add two flags, thus reliably covering one byte */
-	isowbuf_putbits(iwb, 0x7e7e, 8);
-	/* recover the idle flag byte */
-	write = iwb->write;
-	iwb->idle = iwb->data[write];
-	gig_dbg(DEBUG_ISO, "idle fill byte %02x", iwb->idle);
-	/* mask extraneous bits in buffer */
-	iwb->data[write] &= (1 << iwb->wbits) - 1;
-}
-
-/* retrieve a block of bytes for sending
- * The requested number of bytes is provided as a contiguous block.
- * If necessary, the frame is filled to the requested number of bytes
- * with the idle value.
- * returns offset to frame, < 0 on busy or error
- */
-int gigaset_isowbuf_getbytes(struct isowbuf_t *iwb, int size)
-{
-	int read, write, limit, src, dst;
-	unsigned char pbyte;
-
-	read = iwb->nextread;
-	write = iwb->write;
-	if (likely(read == write)) {
-		/* return idle frame */
-		return read < BAS_OUTBUFPAD ?
-			BAS_OUTBUFSIZE : read - BAS_OUTBUFPAD;
-	}
-
-	limit = read + size;
-	gig_dbg(DEBUG_STREAM, "%s: read=%d write=%d limit=%d",
-		__func__, read, write, limit);
-#ifdef CONFIG_GIGASET_DEBUG
-	if (unlikely(size < 0 || size > BAS_OUTBUFPAD)) {
-		pr_err("invalid size %d\n", size);
-		return -EINVAL;
-	}
-#endif
-
-	if (read < write) {
-		/* no wraparound in valid data */
-		if (limit >= write) {
-			/* append idle frame */
-			if (isowbuf_startwrite(iwb) < 0)
-				return -EBUSY;
-			/* write position could have changed */
-			write = iwb->write;
-			if (limit >= write) {
-				pbyte = iwb->data[write]; /* save
-							     partial byte */
-				limit = write + BAS_OUTBUFPAD;
-				gig_dbg(DEBUG_STREAM,
-					"%s: filling %d->%d with %02x",
-					__func__, write, limit, iwb->idle);
-				if (write + BAS_OUTBUFPAD < BAS_OUTBUFSIZE)
-					memset(iwb->data + write, iwb->idle,
-					       BAS_OUTBUFPAD);
-				else {
-					/* wraparound, fill entire pad area */
-					memset(iwb->data + write, iwb->idle,
-					       BAS_OUTBUFSIZE + BAS_OUTBUFPAD
-					       - write);
-					limit = 0;
-				}
-				gig_dbg(DEBUG_STREAM,
-					"%s: restoring %02x at %d",
-					__func__, pbyte, limit);
-				iwb->data[limit] = pbyte; /* restore
-							     partial byte */
-				iwb->write = limit;
-			}
-			isowbuf_donewrite(iwb);
-		}
-	} else {
-		/* valid data wraparound */
-		if (limit >= BAS_OUTBUFSIZE) {
-			/* copy wrapped part into pad area */
-			src = 0;
-			dst = BAS_OUTBUFSIZE;
-			while (dst < limit && src < write)
-				iwb->data[dst++] = iwb->data[src++];
-			if (dst <= limit) {
-				/* fill pad area with idle byte */
-				memset(iwb->data + dst, iwb->idle,
-				       BAS_OUTBUFSIZE + BAS_OUTBUFPAD - dst);
-			}
-			limit = src;
-		}
-	}
-	iwb->nextread = limit;
-	return read;
-}
-
-/* dump_bytes
- * write hex bytes to syslog for debugging
- */
-static inline void dump_bytes(enum debuglevel level, const char *tag,
-			      unsigned char *bytes, int count)
-{
-#ifdef CONFIG_GIGASET_DEBUG
-	unsigned char c;
-	static char dbgline[3 * 32 + 1];
-	int i = 0;
-
-	if (!(gigaset_debuglevel & level))
-		return;
-
-	while (count-- > 0) {
-		if (i > sizeof(dbgline) - 4) {
-			dbgline[i] = '\0';
-			gig_dbg(level, "%s:%s", tag, dbgline);
-			i = 0;
-		}
-		c = *bytes++;
-		dbgline[i] = (i && !(i % 12)) ? '-' : ' ';
-		i++;
-		dbgline[i++] = hex_asc_hi(c);
-		dbgline[i++] = hex_asc_lo(c);
-	}
-	dbgline[i] = '\0';
-	gig_dbg(level, "%s:%s", tag, dbgline);
-#endif
-}
-
-/*============================================================================*/
-
-/* bytewise HDLC bitstuffing via table lookup
- * lookup table: 5 subtables for 0..4 preceding consecutive '1' bits
- * index: 256*(number of preceding '1' bits) + (next byte to stuff)
- * value: bit  9.. 0 = result bits
- *        bit 12..10 = number of trailing '1' bits in result
- *        bit 14..13 = number of bits added by stuffing
- */
-static const u16 stufftab[5 * 256] = {
-/* previous 1s = 0: */
-	0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x000f,
-	0x0010, 0x0011, 0x0012, 0x0013, 0x0014, 0x0015, 0x0016, 0x0017, 0x0018, 0x0019, 0x001a, 0x001b, 0x001c, 0x001d, 0x001e, 0x201f,
-	0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002a, 0x002b, 0x002c, 0x002d, 0x002e, 0x002f,
-	0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003a, 0x003b, 0x003c, 0x003d, 0x203e, 0x205f,
-	0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048, 0x0049, 0x004a, 0x004b, 0x004c, 0x004d, 0x004e, 0x004f,
-	0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057, 0x0058, 0x0059, 0x005a, 0x005b, 0x005c, 0x005d, 0x005e, 0x209f,
-	0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006a, 0x006b, 0x006c, 0x006d, 0x006e, 0x006f,
-	0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007a, 0x007b, 0x207c, 0x207d, 0x20be, 0x20df,
-	0x0480, 0x0481, 0x0482, 0x0483, 0x0484, 0x0485, 0x0486, 0x0487, 0x0488, 0x0489, 0x048a, 0x048b, 0x048c, 0x048d, 0x048e, 0x048f,
-	0x0490, 0x0491, 0x0492, 0x0493, 0x0494, 0x0495, 0x0496, 0x0497, 0x0498, 0x0499, 0x049a, 0x049b, 0x049c, 0x049d, 0x049e, 0x251f,
-	0x04a0, 0x04a1, 0x04a2, 0x04a3, 0x04a4, 0x04a5, 0x04a6, 0x04a7, 0x04a8, 0x04a9, 0x04aa, 0x04ab, 0x04ac, 0x04ad, 0x04ae, 0x04af,
-	0x04b0, 0x04b1, 0x04b2, 0x04b3, 0x04b4, 0x04b5, 0x04b6, 0x04b7, 0x04b8, 0x04b9, 0x04ba, 0x04bb, 0x04bc, 0x04bd, 0x253e, 0x255f,
-	0x08c0, 0x08c1, 0x08c2, 0x08c3, 0x08c4, 0x08c5, 0x08c6, 0x08c7, 0x08c8, 0x08c9, 0x08ca, 0x08cb, 0x08cc, 0x08cd, 0x08ce, 0x08cf,
-	0x08d0, 0x08d1, 0x08d2, 0x08d3, 0x08d4, 0x08d5, 0x08d6, 0x08d7, 0x08d8, 0x08d9, 0x08da, 0x08db, 0x08dc, 0x08dd, 0x08de, 0x299f,
-	0x0ce0, 0x0ce1, 0x0ce2, 0x0ce3, 0x0ce4, 0x0ce5, 0x0ce6, 0x0ce7, 0x0ce8, 0x0ce9, 0x0cea, 0x0ceb, 0x0cec, 0x0ced, 0x0cee, 0x0cef,
-	0x10f0, 0x10f1, 0x10f2, 0x10f3, 0x10f4, 0x10f5, 0x10f6, 0x10f7, 0x20f8, 0x20f9, 0x20fa, 0x20fb, 0x257c, 0x257d, 0x29be, 0x2ddf,
-
-/* previous 1s = 1: */
-	0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x0007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x200f,
-	0x0010, 0x0011, 0x0012, 0x0013, 0x0014, 0x0015, 0x0016, 0x0017, 0x0018, 0x0019, 0x001a, 0x001b, 0x001c, 0x001d, 0x001e, 0x202f,
-	0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002a, 0x002b, 0x002c, 0x002d, 0x002e, 0x204f,
-	0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003a, 0x003b, 0x003c, 0x003d, 0x203e, 0x206f,
-	0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048, 0x0049, 0x004a, 0x004b, 0x004c, 0x004d, 0x004e, 0x208f,
-	0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057, 0x0058, 0x0059, 0x005a, 0x005b, 0x005c, 0x005d, 0x005e, 0x20af,
-	0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006a, 0x006b, 0x006c, 0x006d, 0x006e, 0x20cf,
-	0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007a, 0x007b, 0x207c, 0x207d, 0x20be, 0x20ef,
-	0x0480, 0x0481, 0x0482, 0x0483, 0x0484, 0x0485, 0x0486, 0x0487, 0x0488, 0x0489, 0x048a, 0x048b, 0x048c, 0x048d, 0x048e, 0x250f,
-	0x0490, 0x0491, 0x0492, 0x0493, 0x0494, 0x0495, 0x0496, 0x0497, 0x0498, 0x0499, 0x049a, 0x049b, 0x049c, 0x049d, 0x049e, 0x252f,
-	0x04a0, 0x04a1, 0x04a2, 0x04a3, 0x04a4, 0x04a5, 0x04a6, 0x04a7, 0x04a8, 0x04a9, 0x04aa, 0x04ab, 0x04ac, 0x04ad, 0x04ae, 0x254f,
-	0x04b0, 0x04b1, 0x04b2, 0x04b3, 0x04b4, 0x04b5, 0x04b6, 0x04b7, 0x04b8, 0x04b9, 0x04ba, 0x04bb, 0x04bc, 0x04bd, 0x253e, 0x256f,
-	0x08c0, 0x08c1, 0x08c2, 0x08c3, 0x08c4, 0x08c5, 0x08c6, 0x08c7, 0x08c8, 0x08c9, 0x08ca, 0x08cb, 0x08cc, 0x08cd, 0x08ce, 0x298f,
-	0x08d0, 0x08d1, 0x08d2, 0x08d3, 0x08d4, 0x08d5, 0x08d6, 0x08d7, 0x08d8, 0x08d9, 0x08da, 0x08db, 0x08dc, 0x08dd, 0x08de, 0x29af,
-	0x0ce0, 0x0ce1, 0x0ce2, 0x0ce3, 0x0ce4, 0x0ce5, 0x0ce6, 0x0ce7, 0x0ce8, 0x0ce9, 0x0cea, 0x0ceb, 0x0cec, 0x0ced, 0x0cee, 0x2dcf,
-	0x10f0, 0x10f1, 0x10f2, 0x10f3, 0x10f4, 0x10f5, 0x10f6, 0x10f7, 0x20f8, 0x20f9, 0x20fa, 0x20fb, 0x257c, 0x257d, 0x29be, 0x31ef,
-
-/* previous 1s = 2: */
-	0x0000, 0x0001, 0x0002, 0x0003, 0x0004, 0x0005, 0x0006, 0x2007, 0x0008, 0x0009, 0x000a, 0x000b, 0x000c, 0x000d, 0x000e, 0x2017,
-	0x0010, 0x0011, 0x0012, 0x0013, 0x0014, 0x0015, 0x0016, 0x2027, 0x0018, 0x0019, 0x001a, 0x001b, 0x001c, 0x001d, 0x001e, 0x2037,
-	0x0020, 0x0021, 0x0022, 0x0023, 0x0024, 0x0025, 0x0026, 0x2047, 0x0028, 0x0029, 0x002a, 0x002b, 0x002c, 0x002d, 0x002e, 0x2057,
-	0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x2067, 0x0038, 0x0039, 0x003a, 0x003b, 0x003c, 0x003d, 0x203e, 0x2077,
-	0x0040, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x2087, 0x0048, 0x0049, 0x004a, 0x004b, 0x004c, 0x004d, 0x004e, 0x2097,
-	0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x20a7, 0x0058, 0x0059, 0x005a, 0x005b, 0x005c, 0x005d, 0x005e, 0x20b7,
-	0x0060, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x20c7, 0x0068, 0x0069, 0x006a, 0x006b, 0x006c, 0x006d, 0x006e, 0x20d7,
-	0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x20e7, 0x0078, 0x0079, 0x007a, 0x007b, 0x207c, 0x207d, 0x20be, 0x20f7,
-	0x0480, 0x0481, 0x0482, 0x0483, 0x0484, 0x0485, 0x0486, 0x2507, 0x0488, 0x0489, 0x048a, 0x048b, 0x048c, 0x048d, 0x048e, 0x2517,
-	0x0490, 0x0491, 0x0492, 0x0493, 0x0494, 0x0495, 0x0496, 0x2527, 0x0498, 0x0499, 0x049a, 0x049b, 0x049c, 0x049d, 0x049e, 0x2537,
-	0x04a0, 0x04a1, 0x04a2, 0x04a3, 0x04a4, 0x04a5, 0x04a6, 0x2547, 0x04a8, 0x04a9, 0x04aa, 0x04ab, 0x04ac, 0x04ad, 0x04ae, 0x2557,
-	0x04b0, 0x04b1, 0x04b2, 0x04b3, 0x04b4, 0x04b5, 0x04b6, 0x2567, 0x04b8, 0x04b9, 0x04ba, 0x04bb, 0x04bc, 0x04bd, 0x253e, 0x2577,
-	0x08c0, 0x08c1, 0x08c2, 0x08c3, 0x08c4, 0x08c5, 0x08c6, 0x2987, 0x08c8, 0x08c9, 0x08ca, 0x08cb, 0x08cc, 0x08cd, 0x08ce, 0x2997,
-	0x08d0, 0x08d1, 0x08d2, 0x08d3, 0x08d4, 0x08d5, 0x08d6, 0x29a7, 0x08d8, 0x08d9, 0x08da, 0x08db, 0x08dc, 0x08dd, 0x08de, 0x29b7,
-	0x0ce0, 0x0ce1, 0x0ce2, 0x0ce3, 0x0ce4, 0x0ce5, 0x0ce6, 0x2dc7, 0x0ce8, 0x0ce9, 0x0cea, 0x0ceb, 0x0cec, 0x0ced, 0x0cee, 0x2dd7,
-	0x10f0, 0x10f1, 0x10f2, 0x10f3, 0x10f4, 0x10f5, 0x10f6, 0x31e7, 0x20f8, 0x20f9, 0x20fa, 0x20fb, 0x257c, 0x257d, 0x29be, 0x41f7,
-
-/* previous 1s = 3: */
-	0x0000, 0x0001, 0x0002, 0x2003, 0x0004, 0x0005, 0x0006, 0x200b, 0x0008, 0x0009, 0x000a, 0x2013, 0x000c, 0x000d, 0x000e, 0x201b,
-	0x0010, 0x0011, 0x0012, 0x2023, 0x0014, 0x0015, 0x0016, 0x202b, 0x0018, 0x0019, 0x001a, 0x2033, 0x001c, 0x001d, 0x001e, 0x203b,
-	0x0020, 0x0021, 0x0022, 0x2043, 0x0024, 0x0025, 0x0026, 0x204b, 0x0028, 0x0029, 0x002a, 0x2053, 0x002c, 0x002d, 0x002e, 0x205b,
-	0x0030, 0x0031, 0x0032, 0x2063, 0x0034, 0x0035, 0x0036, 0x206b, 0x0038, 0x0039, 0x003a, 0x2073, 0x003c, 0x003d, 0x203e, 0x207b,
-	0x0040, 0x0041, 0x0042, 0x2083, 0x0044, 0x0045, 0x0046, 0x208b, 0x0048, 0x0049, 0x004a, 0x2093, 0x004c, 0x004d, 0x004e, 0x209b,
-	0x0050, 0x0051, 0x0052, 0x20a3, 0x0054, 0x0055, 0x0056, 0x20ab, 0x0058, 0x0059, 0x005a, 0x20b3, 0x005c, 0x005d, 0x005e, 0x20bb,
-	0x0060, 0x0061, 0x0062, 0x20c3, 0x0064, 0x0065, 0x0066, 0x20cb, 0x0068, 0x0069, 0x006a, 0x20d3, 0x006c, 0x006d, 0x006e, 0x20db,
-	0x0070, 0x0071, 0x0072, 0x20e3, 0x0074, 0x0075, 0x0076, 0x20eb, 0x0078, 0x0079, 0x007a, 0x20f3, 0x207c, 0x207d, 0x20be, 0x40fb,
-	0x0480, 0x0481, 0x0482, 0x2503, 0x0484, 0x0485, 0x0486, 0x250b, 0x0488, 0x0489, 0x048a, 0x2513, 0x048c, 0x048d, 0x048e, 0x251b,
-	0x0490, 0x0491, 0x0492, 0x2523, 0x0494, 0x0495, 0x0496, 0x252b, 0x0498, 0x0499, 0x049a, 0x2533, 0x049c, 0x049d, 0x049e, 0x253b,
-	0x04a0, 0x04a1, 0x04a2, 0x2543, 0x04a4, 0x04a5, 0x04a6, 0x254b, 0x04a8, 0x04a9, 0x04aa, 0x2553, 0x04ac, 0x04ad, 0x04ae, 0x255b,
-	0x04b0, 0x04b1, 0x04b2, 0x2563, 0x04b4, 0x04b5, 0x04b6, 0x256b, 0x04b8, 0x04b9, 0x04ba, 0x2573, 0x04bc, 0x04bd, 0x253e, 0x257b,
-	0x08c0, 0x08c1, 0x08c2, 0x2983, 0x08c4, 0x08c5, 0x08c6, 0x298b, 0x08c8, 0x08c9, 0x08ca, 0x2993, 0x08cc, 0x08cd, 0x08ce, 0x299b,
-	0x08d0, 0x08d1, 0x08d2, 0x29a3, 0x08d4, 0x08d5, 0x08d6, 0x29ab, 0x08d8, 0x08d9, 0x08da, 0x29b3, 0x08dc, 0x08dd, 0x08de, 0x29bb,
-	0x0ce0, 0x0ce1, 0x0ce2, 0x2dc3, 0x0ce4, 0x0ce5, 0x0ce6, 0x2dcb, 0x0ce8, 0x0ce9, 0x0cea, 0x2dd3, 0x0cec, 0x0ced, 0x0cee, 0x2ddb,
-	0x10f0, 0x10f1, 0x10f2, 0x31e3, 0x10f4, 0x10f5, 0x10f6, 0x31eb, 0x20f8, 0x20f9, 0x20fa, 0x41f3, 0x257c, 0x257d, 0x29be, 0x46fb,
-
-/* previous 1s = 4: */
-	0x0000, 0x2001, 0x0002, 0x2005, 0x0004, 0x2009, 0x0006, 0x200d, 0x0008, 0x2011, 0x000a, 0x2015, 0x000c, 0x2019, 0x000e, 0x201d,
-	0x0010, 0x2021, 0x0012, 0x2025, 0x0014, 0x2029, 0x0016, 0x202d, 0x0018, 0x2031, 0x001a, 0x2035, 0x001c, 0x2039, 0x001e, 0x203d,
-	0x0020, 0x2041, 0x0022, 0x2045, 0x0024, 0x2049, 0x0026, 0x204d, 0x0028, 0x2051, 0x002a, 0x2055, 0x002c, 0x2059, 0x002e, 0x205d,
-	0x0030, 0x2061, 0x0032, 0x2065, 0x0034, 0x2069, 0x0036, 0x206d, 0x0038, 0x2071, 0x003a, 0x2075, 0x003c, 0x2079, 0x203e, 0x407d,
-	0x0040, 0x2081, 0x0042, 0x2085, 0x0044, 0x2089, 0x0046, 0x208d, 0x0048, 0x2091, 0x004a, 0x2095, 0x004c, 0x2099, 0x004e, 0x209d,
-	0x0050, 0x20a1, 0x0052, 0x20a5, 0x0054, 0x20a9, 0x0056, 0x20ad, 0x0058, 0x20b1, 0x005a, 0x20b5, 0x005c, 0x20b9, 0x005e, 0x20bd,
-	0x0060, 0x20c1, 0x0062, 0x20c5, 0x0064, 0x20c9, 0x0066, 0x20cd, 0x0068, 0x20d1, 0x006a, 0x20d5, 0x006c, 0x20d9, 0x006e, 0x20dd,
-	0x0070, 0x20e1, 0x0072, 0x20e5, 0x0074, 0x20e9, 0x0076, 0x20ed, 0x0078, 0x20f1, 0x007a, 0x20f5, 0x207c, 0x40f9, 0x20be, 0x417d,
-	0x0480, 0x2501, 0x0482, 0x2505, 0x0484, 0x2509, 0x0486, 0x250d, 0x0488, 0x2511, 0x048a, 0x2515, 0x048c, 0x2519, 0x048e, 0x251d,
-	0x0490, 0x2521, 0x0492, 0x2525, 0x0494, 0x2529, 0x0496, 0x252d, 0x0498, 0x2531, 0x049a, 0x2535, 0x049c, 0x2539, 0x049e, 0x253d,
-	0x04a0, 0x2541, 0x04a2, 0x2545, 0x04a4, 0x2549, 0x04a6, 0x254d, 0x04a8, 0x2551, 0x04aa, 0x2555, 0x04ac, 0x2559, 0x04ae, 0x255d,
-	0x04b0, 0x2561, 0x04b2, 0x2565, 0x04b4, 0x2569, 0x04b6, 0x256d, 0x04b8, 0x2571, 0x04ba, 0x2575, 0x04bc, 0x2579, 0x253e, 0x467d,
-	0x08c0, 0x2981, 0x08c2, 0x2985, 0x08c4, 0x2989, 0x08c6, 0x298d, 0x08c8, 0x2991, 0x08ca, 0x2995, 0x08cc, 0x2999, 0x08ce, 0x299d,
-	0x08d0, 0x29a1, 0x08d2, 0x29a5, 0x08d4, 0x29a9, 0x08d6, 0x29ad, 0x08d8, 0x29b1, 0x08da, 0x29b5, 0x08dc, 0x29b9, 0x08de, 0x29bd,
-	0x0ce0, 0x2dc1, 0x0ce2, 0x2dc5, 0x0ce4, 0x2dc9, 0x0ce6, 0x2dcd, 0x0ce8, 0x2dd1, 0x0cea, 0x2dd5, 0x0cec, 0x2dd9, 0x0cee, 0x2ddd,
-	0x10f0, 0x31e1, 0x10f2, 0x31e5, 0x10f4, 0x31e9, 0x10f6, 0x31ed, 0x20f8, 0x41f1, 0x20fa, 0x41f5, 0x257c, 0x46f9, 0x29be, 0x4b7d
-};
-
-/* hdlc_bitstuff_byte
- * perform HDLC bitstuffing for one input byte (8 bits, LSB first)
- * parameters:
- *	cin	input byte
- *	ones	number of trailing '1' bits in result before this step
- *	iwb	pointer to output buffer structure
- *		(write semaphore must be held)
- * return value:
- *	number of trailing '1' bits in result after this step
- */
-
-static inline int hdlc_bitstuff_byte(struct isowbuf_t *iwb, unsigned char cin,
-				     int ones)
-{
-	u16 stuff;
-	int shiftinc, newones;
-
-	/* get stuffing information for input byte
-	 * value: bit  9.. 0 = result bits
-	 *        bit 12..10 = number of trailing '1' bits in result
-	 *        bit 14..13 = number of bits added by stuffing
-	 */
-	stuff = stufftab[256 * ones + cin];
-	shiftinc = (stuff >> 13) & 3;
-	newones = (stuff >> 10) & 7;
-	stuff &= 0x3ff;
-
-	/* append stuffed byte to output stream */
-	isowbuf_putbits(iwb, stuff, 8 + shiftinc);
-	return newones;
-}
-
-/* hdlc_buildframe
- * Perform HDLC framing with bitstuffing on a byte buffer
- * The input buffer is regarded as a sequence of bits, starting with the least
- * significant bit of the first byte and ending with the most significant bit
- * of the last byte. A 16 bit FCS is appended as defined by RFC 1662.
- * Whenever five consecutive '1' bits appear in the resulting bit sequence, a
- * '0' bit is inserted after them.
- * The resulting bit string and a closing flag pattern (PPP_FLAG, '01111110')
- * are appended to the output buffer starting at the given bit position, which
- * is assumed to already contain a leading flag.
- * The output buffer must have sufficient free space; count + count/5 + 6
- * bytes are sufficient, and their availability is checked before writing.
- * parameters:
- *	in	input buffer
- *	count	number of bytes in input buffer
- *	iwb	pointer to output buffer structure
- *		(write semaphore must be held)
- * return value:
- *	position of end of packet in output buffer on success,
- *	-EAGAIN if write semaphore busy or buffer full
- */
-
-static inline int hdlc_buildframe(struct isowbuf_t *iwb,
-				  unsigned char *in, int count)
-{
-	int ones;
-	u16 fcs;
-	int end;
-	unsigned char c;
-
-	if (isowbuf_freebytes(iwb) < count + count / 5 + 6 ||
-	    isowbuf_startwrite(iwb) < 0) {
-		gig_dbg(DEBUG_ISO, "%s: %d bytes free -> -EAGAIN",
-			__func__, isowbuf_freebytes(iwb));
-		return -EAGAIN;
-	}
-
-	dump_bytes(DEBUG_STREAM_DUMP, "snd data", in, count);
-
-	/* bitstuff and checksum input data */
-	fcs = PPP_INITFCS;
-	ones = 0;
-	while (count-- > 0) {
-		c = *in++;
-		ones = hdlc_bitstuff_byte(iwb, c, ones);
-		fcs = crc_ccitt_byte(fcs, c);
-	}
-
-	/* bitstuff and append FCS
-	 * (complemented, least significant byte first) */
-	fcs ^= 0xffff;
-	ones = hdlc_bitstuff_byte(iwb, fcs & 0x00ff, ones);
-	ones = hdlc_bitstuff_byte(iwb, (fcs >> 8) & 0x00ff, ones);
-
-	/* put closing flag and repeat byte for flag idle */
-	isowbuf_putflag(iwb);
-	end = isowbuf_donewrite(iwb);
-	return end;
-}
-
-/* trans_buildframe
- * Append a block of 'transparent' data to the output buffer,
- * inverting the bytes.
- * The output buffer must have sufficient length; count bytes
- * starting at *out are safe and are verified to be present.
- * parameters:
- *	in	input buffer
- *	count	number of bytes in input buffer
- *	iwb	pointer to output buffer structure
- *		(write semaphore must be held)
- * return value:
- *	position of end of packet in output buffer on success,
- *	-EAGAIN if write semaphore busy or buffer full
- */
-
-static inline int trans_buildframe(struct isowbuf_t *iwb,
-				   unsigned char *in, int count)
-{
-	int write;
-	unsigned char c;
-
-	if (unlikely(count <= 0))
-		return iwb->write;
-
-	if (isowbuf_freebytes(iwb) < count ||
-	    isowbuf_startwrite(iwb) < 0) {
-		gig_dbg(DEBUG_ISO, "can't put %d bytes", count);
-		return -EAGAIN;
-	}
-
-	gig_dbg(DEBUG_STREAM, "put %d bytes", count);
-	dump_bytes(DEBUG_STREAM_DUMP, "snd data", in, count);
-
-	write = iwb->write;
-	do {
-		c = bitrev8(*in++);
-		iwb->data[write++] = c;
-		write %= BAS_OUTBUFSIZE;
-	} while (--count > 0);
-	iwb->write = write;
-	iwb->idle = c;
-
-	return isowbuf_donewrite(iwb);
-}
-
-int gigaset_isoc_buildframe(struct bc_state *bcs, unsigned char *in, int len)
-{
-	int result;
-
-	switch (bcs->proto2) {
-	case L2_HDLC:
-		result = hdlc_buildframe(bcs->hw.bas->isooutbuf, in, len);
-		gig_dbg(DEBUG_ISO, "%s: %d bytes HDLC -> %d",
-			__func__, len, result);
-		break;
-	default:			/* assume transparent */
-		result = trans_buildframe(bcs->hw.bas->isooutbuf, in, len);
-		gig_dbg(DEBUG_ISO, "%s: %d bytes trans -> %d",
-			__func__, len, result);
-	}
-	return result;
-}
-
-/* hdlc_putbyte
- * append byte c to current skb of B channel structure *bcs, updating fcs
- */
-static inline void hdlc_putbyte(unsigned char c, struct bc_state *bcs)
-{
-	bcs->rx_fcs = crc_ccitt_byte(bcs->rx_fcs, c);
-	if (bcs->rx_skb == NULL)
-		/* skipping */
-		return;
-	if (bcs->rx_skb->len >= bcs->rx_bufsize) {
-		dev_warn(bcs->cs->dev, "received oversized packet discarded\n");
-		bcs->hw.bas->giants++;
-		dev_kfree_skb_any(bcs->rx_skb);
-		bcs->rx_skb = NULL;
-		return;
-	}
-	__skb_put_u8(bcs->rx_skb, c);
-}
-
-/* hdlc_flush
- * drop partial HDLC data packet
- */
-static inline void hdlc_flush(struct bc_state *bcs)
-{
-	/* clear skb or allocate new if not skipping */
-	if (bcs->rx_skb != NULL)
-		skb_trim(bcs->rx_skb, 0);
-	else
-		gigaset_new_rx_skb(bcs);
-
-	/* reset packet state */
-	bcs->rx_fcs = PPP_INITFCS;
-}
-
-/* hdlc_done
- * process completed HDLC data packet
- */
-static inline void hdlc_done(struct bc_state *bcs)
-{
-	struct cardstate *cs = bcs->cs;
-	struct sk_buff *procskb;
-	unsigned int len;
-
-	if (unlikely(bcs->ignore)) {
-		bcs->ignore--;
-		hdlc_flush(bcs);
-		return;
-	}
-	procskb = bcs->rx_skb;
-	if (procskb == NULL) {
-		/* previous error */
-		gig_dbg(DEBUG_ISO, "%s: skb=NULL", __func__);
-		gigaset_isdn_rcv_err(bcs);
-	} else if (procskb->len < 2) {
-		dev_notice(cs->dev, "received short frame (%d octets)\n",
-			   procskb->len);
-		bcs->hw.bas->runts++;
-		dev_kfree_skb_any(procskb);
-		gigaset_isdn_rcv_err(bcs);
-	} else if (bcs->rx_fcs != PPP_GOODFCS) {
-		dev_notice(cs->dev, "frame check error\n");
-		bcs->hw.bas->fcserrs++;
-		dev_kfree_skb_any(procskb);
-		gigaset_isdn_rcv_err(bcs);
-	} else {
-		len = procskb->len;
-		__skb_trim(procskb, len -= 2);	/* subtract FCS */
-		gig_dbg(DEBUG_ISO, "%s: good frame (%d octets)", __func__, len);
-		dump_bytes(DEBUG_STREAM_DUMP,
-			   "rcv data", procskb->data, len);
-		bcs->hw.bas->goodbytes += len;
-		gigaset_skb_rcvd(bcs, procskb);
-	}
-	gigaset_new_rx_skb(bcs);
-	bcs->rx_fcs = PPP_INITFCS;
-}
-
-/* hdlc_frag
- * drop HDLC data packet with non-integral last byte
- */
-static inline void hdlc_frag(struct bc_state *bcs, unsigned inbits)
-{
-	if (unlikely(bcs->ignore)) {
-		bcs->ignore--;
-		hdlc_flush(bcs);
-		return;
-	}
-
-	dev_notice(bcs->cs->dev, "received partial byte (%d bits)\n", inbits);
-	bcs->hw.bas->alignerrs++;
-	gigaset_isdn_rcv_err(bcs);
-	__skb_trim(bcs->rx_skb, 0);
-	bcs->rx_fcs = PPP_INITFCS;
-}
-
-/* bit counts lookup table for HDLC bit unstuffing
- * index: input byte
- * value: bit 0..3 = number of consecutive '1' bits starting from LSB
- *        bit 4..6 = number of consecutive '1' bits starting from MSB
- *		     (replacing 8 by 7 to make it fit; the algorithm won't care)
- *        bit 7 set if there are 5 or more "interior" consecutive '1' bits
- */
-static const unsigned char bitcounts[256] = {
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x04,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x05,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x04,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x80, 0x06,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x04,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x05,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x04,
-	0x00, 0x01, 0x00, 0x02, 0x00, 0x01, 0x00, 0x03, 0x00, 0x01, 0x00, 0x02, 0x80, 0x81, 0x80, 0x07,
-	0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x13, 0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x14,
-	0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x13, 0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x15,
-	0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x13, 0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x14,
-	0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x10, 0x13, 0x10, 0x11, 0x10, 0x12, 0x10, 0x11, 0x90, 0x16,
-	0x20, 0x21, 0x20, 0x22, 0x20, 0x21, 0x20, 0x23, 0x20, 0x21, 0x20, 0x22, 0x20, 0x21, 0x20, 0x24,
-	0x20, 0x21, 0x20, 0x22, 0x20, 0x21, 0x20, 0x23, 0x20, 0x21, 0x20, 0x22, 0x20, 0x21, 0x20, 0x25,
-	0x30, 0x31, 0x30, 0x32, 0x30, 0x31, 0x30, 0x33, 0x30, 0x31, 0x30, 0x32, 0x30, 0x31, 0x30, 0x34,
-	0x40, 0x41, 0x40, 0x42, 0x40, 0x41, 0x40, 0x43, 0x50, 0x51, 0x50, 0x52, 0x60, 0x61, 0x70, 0x78
-};
-
-/* hdlc_unpack
- * perform HDLC frame processing (bit unstuffing, flag detection, FCS
- * calculation) on a sequence of received data bytes (8 bits each, LSB first)
- * pass on successfully received, complete frames as SKBs via gigaset_skb_rcvd
- * notify of errors via gigaset_isdn_rcv_err
- * tally frames, errors etc. in BC structure counters
- * parameters:
- *	src	received data
- *	count	number of received bytes
- *	bcs	receiving B channel structure
- */
-static inline void hdlc_unpack(unsigned char *src, unsigned count,
-			       struct bc_state *bcs)
-{
-	struct bas_bc_state *ubc = bcs->hw.bas;
-	int inputstate;
-	unsigned seqlen, inbyte, inbits;
-
-	/* load previous state:
-	 * inputstate = set of flag bits:
-	 * - INS_flag_hunt: no complete opening flag received since connection
-	 *                  setup or last abort
-	 * - INS_have_data: at least one complete data byte received since last
-	 *                  flag
-	 * seqlen = number of consecutive '1' bits in last 7 input stream bits
-	 *          (0..7)
-	 * inbyte = accumulated partial data byte (if !INS_flag_hunt)
-	 * inbits = number of valid bits in inbyte, starting at LSB (0..6)
-	 */
-	inputstate = bcs->inputstate;
-	seqlen = ubc->seqlen;
-	inbyte = ubc->inbyte;
-	inbits = ubc->inbits;
-
-	/* bit unstuffing a byte at a time
-	 * Take your time to understand this; it's straightforward but tedious.
-	 * The "bitcounts" lookup table is used to speed up the counting of
-	 * leading and trailing '1' bits.
-	 */
-	while (count--) {
-		unsigned char c = *src++;
-		unsigned char tabentry = bitcounts[c];
-		unsigned lead1 = tabentry & 0x0f;
-		unsigned trail1 = (tabentry >> 4) & 0x0f;
-
-		seqlen += lead1;
-
-		if (unlikely(inputstate & INS_flag_hunt)) {
-			if (c == PPP_FLAG) {
-				/* flag-in-one */
-				inputstate &= ~(INS_flag_hunt | INS_have_data);
-				inbyte = 0;
-				inbits = 0;
-			} else if (seqlen == 6 && trail1 != 7) {
-				/* flag completed & not followed by abort */
-				inputstate &= ~(INS_flag_hunt | INS_have_data);
-				inbyte = c >> (lead1 + 1);
-				inbits = 7 - lead1;
-				if (trail1 >= 8) {
-					/* interior stuffing:
-					 * omitting the MSB handles most cases,
-					 * correct the incorrectly handled
-					 * cases individually */
-					inbits--;
-					switch (c) {
-					case 0xbe:
-						inbyte = 0x3f;
-						break;
-					}
-				}
-			}
-			/* else: continue flag-hunting */
-		} else if (likely(seqlen < 5 && trail1 < 7)) {
-			/* streamlined case: 8 data bits, no stuffing */
-			inbyte |= c << inbits;
-			hdlc_putbyte(inbyte & 0xff, bcs);
-			inputstate |= INS_have_data;
-			inbyte >>= 8;
-			/* inbits unchanged */
-		} else if (likely(seqlen == 6 && inbits == 7 - lead1 &&
-				  trail1 + 1 == inbits &&
-				  !(inputstate & INS_have_data))) {
-			/* streamlined case: flag idle - state unchanged */
-		} else if (unlikely(seqlen > 6)) {
-			/* abort sequence */
-			ubc->aborts++;
-			hdlc_flush(bcs);
-			inputstate |= INS_flag_hunt;
-		} else if (seqlen == 6) {
-			/* closing flag, including (6 - lead1) '1's
-			 * and one '0' from inbits */
-			if (inbits > 7 - lead1) {
-				hdlc_frag(bcs, inbits + lead1 - 7);
-				inputstate &= ~INS_have_data;
-			} else {
-				if (inbits < 7 - lead1)
-					ubc->stolen0s++;
-				if (inputstate & INS_have_data) {
-					hdlc_done(bcs);
-					inputstate &= ~INS_have_data;
-				}
-			}
-
-			if (c == PPP_FLAG) {
-				/* complete flag, LSB overlaps preceding flag */
-				ubc->shared0s++;
-				inbits = 0;
-				inbyte = 0;
-			} else if (trail1 != 7) {
-				/* remaining bits */
-				inbyte = c >> (lead1 + 1);
-				inbits = 7 - lead1;
-				if (trail1 >= 8) {
-					/* interior stuffing:
-					 * omitting the MSB handles most cases,
-					 * correct the incorrectly handled
-					 * cases individually */
-					inbits--;
-					switch (c) {
-					case 0xbe:
-						inbyte = 0x3f;
-						break;
-					}
-				}
-			} else {
-				/* abort sequence follows,
-				 * skb already empty anyway */
-				ubc->aborts++;
-				inputstate |= INS_flag_hunt;
-			}
-		} else { /* (seqlen < 6) && (seqlen == 5 || trail1 >= 7) */
-
-			if (c == PPP_FLAG) {
-				/* complete flag */
-				if (seqlen == 5)
-					ubc->stolen0s++;
-				if (inbits) {
-					hdlc_frag(bcs, inbits);
-					inbits = 0;
-					inbyte = 0;
-				} else if (inputstate & INS_have_data)
-					hdlc_done(bcs);
-				inputstate &= ~INS_have_data;
-			} else if (trail1 == 7) {
-				/* abort sequence */
-				ubc->aborts++;
-				hdlc_flush(bcs);
-				inputstate |= INS_flag_hunt;
-			} else {
-				/* stuffed data */
-				if (trail1 < 7) { /* => seqlen == 5 */
-					/* stuff bit at position lead1,
-					 * no interior stuffing */
-					unsigned char mask = (1 << lead1) - 1;
-					c = (c & mask) | ((c & ~mask) >> 1);
-					inbyte |= c << inbits;
-					inbits += 7;
-				} else if (seqlen < 5) { /* trail1 >= 8 */
-					/* interior stuffing:
-					 * omitting the MSB handles most cases,
-					 * correct the incorrectly handled
-					 * cases individually */
-					switch (c) {
-					case 0xbe:
-						c = 0x7e;
-						break;
-					}
-					inbyte |= c << inbits;
-					inbits += 7;
-				} else { /* seqlen == 5 && trail1 >= 8 */
-
-					/* stuff bit at lead1 *and* interior
-					 * stuffing -- unstuff individually */
-					switch (c) {
-					case 0x7d:
-						c = 0x3f;
-						break;
-					case 0xbe:
-						c = 0x3f;
-						break;
-					case 0x3e:
-						c = 0x1f;
-						break;
-					case 0x7c:
-						c = 0x3e;
-						break;
-					}
-					inbyte |= c << inbits;
-					inbits += 6;
-				}
-				if (inbits >= 8) {
-					inbits -= 8;
-					hdlc_putbyte(inbyte & 0xff, bcs);
-					inputstate |= INS_have_data;
-					inbyte >>= 8;
-				}
-			}
-		}
-		seqlen = trail1 & 7;
-	}
-
-	/* save new state */
-	bcs->inputstate = inputstate;
-	ubc->seqlen = seqlen;
-	ubc->inbyte = inbyte;
-	ubc->inbits = inbits;
-}
-
-/* trans_receive
- * pass on received USB frame transparently as SKB via gigaset_skb_rcvd
- * invert bytes
- * tally frames, errors etc. in BC structure counters
- * parameters:
- *	src	received data
- *	count	number of received bytes
- *	bcs	receiving B channel structure
- */
-static inline void trans_receive(unsigned char *src, unsigned count,
-				 struct bc_state *bcs)
-{
-	struct sk_buff *skb;
-	int dobytes;
-	unsigned char *dst;
-
-	if (unlikely(bcs->ignore)) {
-		bcs->ignore--;
-		return;
-	}
-	skb = bcs->rx_skb;
-	if (skb == NULL) {
-		skb = gigaset_new_rx_skb(bcs);
-		if (skb == NULL)
-			return;
-	}
-	dobytes = bcs->rx_bufsize - skb->len;
-	while (count > 0) {
-		dst = skb_put(skb, count < dobytes ? count : dobytes);
-		while (count > 0 && dobytes > 0) {
-			*dst++ = bitrev8(*src++);
-			count--;
-			dobytes--;
-		}
-		if (dobytes == 0) {
-			dump_bytes(DEBUG_STREAM_DUMP,
-				   "rcv data", skb->data, skb->len);
-			bcs->hw.bas->goodbytes += skb->len;
-			gigaset_skb_rcvd(bcs, skb);
-			skb = gigaset_new_rx_skb(bcs);
-			if (skb == NULL)
-				return;
-			dobytes = bcs->rx_bufsize;
-		}
-	}
-}
-
-void gigaset_isoc_receive(unsigned char *src, unsigned count,
-			  struct bc_state *bcs)
-{
-	switch (bcs->proto2) {
-	case L2_HDLC:
-		hdlc_unpack(src, count, bcs);
-		break;
-	default:		/* assume transparent */
-		trans_receive(src, count, bcs);
-	}
-}
-
-/* == data input =========================================================== */
-
-/* process a block of received bytes in command mode (mstate != MS_LOCKED)
- * Append received bytes to the command response buffer and forward them
- * line by line to the response handler.
- * Note: Received lines may be terminated by CR, LF, or CR LF, which will be
- * removed before passing the line to the response handler.
- */
-static void cmd_loop(unsigned char *src, int numbytes, struct inbuf_t *inbuf)
-{
-	struct cardstate *cs = inbuf->cs;
-	unsigned cbytes      = cs->cbytes;
-	unsigned char c;
-
-	while (numbytes--) {
-		c = *src++;
-		switch (c) {
-		case '\n':
-			if (cbytes == 0 && cs->respdata[0] == '\r') {
-				/* collapse LF with preceding CR */
-				cs->respdata[0] = 0;
-				break;
-			}
-			/* fall through */
-		case '\r':
-			/* end of message line, pass to response handler */
-			if (cbytes >= MAX_RESP_SIZE) {
-				dev_warn(cs->dev, "response too large (%d)\n",
-					 cbytes);
-				cbytes = MAX_RESP_SIZE;
-			}
-			cs->cbytes = cbytes;
-			gigaset_dbg_buffer(DEBUG_TRANSCMD, "received response",
-					   cbytes, cs->respdata);
-			gigaset_handle_modem_response(cs);
-			cbytes = 0;
-
-			/* store EOL byte for CRLF collapsing */
-			cs->respdata[0] = c;
-			break;
-		default:
-			/* append to line buffer if possible */
-			if (cbytes < MAX_RESP_SIZE)
-				cs->respdata[cbytes] = c;
-			cbytes++;
-		}
-	}
-
-	/* save state */
-	cs->cbytes = cbytes;
-}
-
-
-/* process a block of data received through the control channel
- */
-void gigaset_isoc_input(struct inbuf_t *inbuf)
-{
-	struct cardstate *cs = inbuf->cs;
-	unsigned tail, head, numbytes;
-	unsigned char *src;
-
-	head = inbuf->head;
-	while (head != (tail = inbuf->tail)) {
-		gig_dbg(DEBUG_INTR, "buffer state: %u -> %u", head, tail);
-		if (head > tail)
-			tail = RBUFSIZE;
-		src = inbuf->data + head;
-		numbytes = tail - head;
-		gig_dbg(DEBUG_INTR, "processing %u bytes", numbytes);
-
-		if (cs->mstate == MS_LOCKED) {
-			gigaset_dbg_buffer(DEBUG_LOCKCMD, "received response",
-					   numbytes, src);
-			gigaset_if_receive(inbuf->cs, src, numbytes);
-		} else {
-			cmd_loop(src, numbytes, inbuf);
-		}
-
-		head += numbytes;
-		if (head == RBUFSIZE)
-			head = 0;
-		gig_dbg(DEBUG_INTR, "setting head to %u", head);
-		inbuf->head = head;
-	}
-}
-
-
-/* == data output ========================================================== */
-
-/**
- * gigaset_isoc_send_skb() - queue an skb for sending
- * @bcs:	B channel descriptor structure.
- * @skb:	data to send.
- *
- * Called by LL to queue an skb for sending, and start transmission if
- * necessary.
- * Once the payload data has been transmitted completely, gigaset_skb_sent()
- * will be called with the skb's link layer header preserved.
- *
- * Return value:
- *	number of bytes accepted for sending (skb->len) if ok,
- *	error code < 0 (eg. -ENODEV) on error
- */
-int gigaset_isoc_send_skb(struct bc_state *bcs, struct sk_buff *skb)
-{
-	int len = skb->len;
-	unsigned long flags;
-
-	spin_lock_irqsave(&bcs->cs->lock, flags);
-	if (!bcs->cs->connected) {
-		spin_unlock_irqrestore(&bcs->cs->lock, flags);
-		return -ENODEV;
-	}
-
-	skb_queue_tail(&bcs->squeue, skb);
-	gig_dbg(DEBUG_ISO, "%s: skb queued, qlen=%d",
-		__func__, skb_queue_len(&bcs->squeue));
-
-	/* tasklet submits URB if necessary */
-	tasklet_schedule(&bcs->hw.bas->sent_tasklet);
-	spin_unlock_irqrestore(&bcs->cs->lock, flags);
-
-	return len;	/* ok so far */
-}
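
The command-mode receive path removed above (cmd_loop) splits an incoming
byte stream into lines terminated by CR, LF, or CR LF, and it has to cope
with a CR LF pair that is split across two receive calls. The following is
a minimal user-space sketch of that technique, not the driver code itself.
The names line_splitter, splitter_feed, MAX_LINE and MAX_LINES are
hypothetical, and the sketch keeps the last end-of-line byte in a separate
field, whereas the driver stored it in respdata[0].

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_LINE  64	/* hypothetical stand-in for MAX_RESP_SIZE */
#define MAX_LINES 8	/* hypothetical capacity of the result array */

struct line_splitter {
	char buf[MAX_LINE];	/* bytes of the line being assembled */
	size_t len;		/* bytes seen so far, may exceed MAX_LINE */
	char last_eol;		/* '\r' if the previous line ended in CR */
	char lines[MAX_LINES][MAX_LINE + 1];	/* completed lines */
	size_t nlines;
};

/* Store the current line (truncated to MAX_LINE) and start a new one. */
static void splitter_emit(struct line_splitter *s)
{
	size_t n = s->len < MAX_LINE ? s->len : MAX_LINE;

	assert(s->nlines < MAX_LINES);
	memcpy(s->lines[s->nlines], s->buf, n);
	s->lines[s->nlines][n] = '\0';
	s->nlines++;
	s->len = 0;
}

/* Feed a block of received bytes; state carries over between calls. */
static void splitter_feed(struct line_splitter *s, const char *src,
			  size_t count)
{
	while (count--) {
		char c = *src++;

		if (c == '\n' && s->len == 0 && s->last_eol == '\r') {
			/* collapse LF with the preceding CR */
			s->last_eol = 0;
			continue;
		}
		if (c == '\n' || c == '\r') {
			/* end of line; empty lines are emitted as well */
			splitter_emit(s);
			s->last_eol = c;
			continue;
		}
		/* append if there is room, but keep counting regardless */
		if (s->len < MAX_LINE)
			s->buf[s->len] = c;
		s->len++;
	}
}
```

Feeding "OK\r" and then "\nRING\rX\n" in two separate calls yields the
three lines "OK", "RING" and "X"; the LF that opens the second block is
recognised as the tail of the earlier CR and dropped.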
diff --git a/drivers/staging/isdn/gigaset/proc.c b/drivers/staging/isdn/gigaset/proc.c
deleted file mode 100644
index 8914439a4237..000000000000
--- a/drivers/staging/isdn/gigaset/proc.c
+++ /dev/null
@@ -1,77 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Stuff used by all variants of the driver
- *
- * Copyright (c) 2001 by Stefan Eilers,
- *                       Hansjoerg Lipp <hjlipp@web.de>,
- *                       Tilman Schmidt <tilman@imap.cc>.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-
-static ssize_t show_cidmode(struct device *dev,
-			    struct device_attribute *attr, char *buf)
-{
-	struct cardstate *cs = dev_get_drvdata(dev);
-
-	return sprintf(buf, "%u\n", cs->cidmode);
-}
-
-static ssize_t set_cidmode(struct device *dev, struct device_attribute *attr,
-			   const char *buf, size_t count)
-{
-	struct cardstate *cs = dev_get_drvdata(dev);
-	long int value;
-	char *end;
-
-	value = simple_strtol(buf, &end, 0);
-	while (*end)
-		if (!isspace(*end++))
-			return -EINVAL;
-	if (value < 0 || value > 1)
-		return -EINVAL;
-
-	if (mutex_lock_interruptible(&cs->mutex))
-		return -ERESTARTSYS;
-
-	cs->waiting = 1;
-	if (!gigaset_add_event(cs, &cs->at_state, EV_PROC_CIDMODE,
-			       NULL, value, NULL)) {
-		cs->waiting = 0;
-		mutex_unlock(&cs->mutex);
-		return -ENOMEM;
-	}
-	gigaset_schedule_event(cs);
-
-	wait_event(cs->waitqueue, !cs->waiting);
-
-	mutex_unlock(&cs->mutex);
-
-	return count;
-}
-
-static DEVICE_ATTR(cidmode, S_IRUGO | S_IWUSR, show_cidmode, set_cidmode);
-
-/* free sysfs for device */
-void gigaset_free_dev_sysfs(struct cardstate *cs)
-{
-	if (!cs->tty_dev)
-		return;
-
-	gig_dbg(DEBUG_INIT, "removing sysfs entries");
-	device_remove_file(cs->tty_dev, &dev_attr_cidmode);
-}
-
-/* initialize sysfs for device */
-void gigaset_init_dev_sysfs(struct cardstate *cs)
-{
-	if (!cs->tty_dev)
-		return;
-
-	gig_dbg(DEBUG_INIT, "setting up sysfs");
-	if (device_create_file(cs->tty_dev, &dev_attr_cidmode))
-		pr_err("could not create sysfs attribute\n");
-}
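
The cidmode sysfs store handler removed above accepts a number followed
only by whitespace and rejects anything outside 0..1. A rough user-space
sketch of the same validation is shown here, assuming strtol() in place of
the kernel's simple_strtol(). The function name parse_flag is hypothetical,
and unlike the removed code the sketch also rejects input that contains no
digits at all.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a textual 0/1 flag; returns 0 or 1, or -EINVAL on bad input. */
static int parse_flag(const char *buf)
{
	char *end;
	long value;

	assert(buf != NULL);

	/* base 0 accepts decimal, octal and hex, as in the removed code */
	value = strtol(buf, &end, 0);
	if (end == buf)
		return -EINVAL;	/* no digits: stricter than the original */

	/* only whitespace (typically the trailing newline) may follow */
	while (*end)
		if (!isspace((unsigned char)*end++))
			return -EINVAL;

	if (value < 0 || value > 1)
		return -EINVAL;
	return (int)value;
}
```

A write such as "1\n" from a shell echo therefore parses to 1, while "2"
or "1x" is rejected with -EINVAL.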
diff --git a/drivers/staging/isdn/gigaset/ser-gigaset.c b/drivers/staging/isdn/gigaset/ser-gigaset.c
deleted file mode 100644
index 5587e9e7fc73..000000000000
--- a/drivers/staging/isdn/gigaset/ser-gigaset.c
+++ /dev/null
@@ -1,796 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/* This is the serial hardware link layer (HLL) for the Gigaset 307x isdn
- * DECT base (aka Sinus 45 isdn) using the RS232 DECT data module M101,
- * written as a line discipline.
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-#include <linux/platform_device.h>
-#include <linux/completion.h>
-
-/* Version Information */
-#define DRIVER_AUTHOR "Tilman Schmidt"
-#define DRIVER_DESC "Serial Driver for Gigaset 307x using Siemens M101"
-
-#define GIGASET_MINORS     1
-#define GIGASET_MINOR      0
-#define GIGASET_MODULENAME "ser_gigaset"
-#define GIGASET_DEVNAME    "ttyGS"
-
-/* length limit according to Siemens 3070usb-protokoll.doc ch. 2.1 */
-#define IF_WRITEBUF 264
-
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-MODULE_LICENSE("GPL");
-MODULE_ALIAS_LDISC(N_GIGASET_M101);
-
-static int startmode = SM_ISDN;
-module_param(startmode, int, S_IRUGO);
-MODULE_PARM_DESC(startmode, "initial operation mode");
-static int cidmode = 1;
-module_param(cidmode, int, S_IRUGO);
-MODULE_PARM_DESC(cidmode, "stay in CID mode when idle");
-
-static struct gigaset_driver *driver;
-
-struct ser_cardstate {
-	struct platform_device	dev;
-	struct tty_struct	*tty;
-	atomic_t		refcnt;
-	struct completion	dead_cmp;
-};
-
-static struct platform_driver device_driver = {
-	.driver = {
-		.name = GIGASET_MODULENAME,
-	},
-};
-
-static void flush_send_queue(struct cardstate *);
-
-/* transmit data from current open skb
- * result: number of bytes sent or error code < 0
- */
-static int write_modem(struct cardstate *cs)
-{
-	struct tty_struct *tty = cs->hw.ser->tty;
-	struct bc_state *bcs = &cs->bcs[0];	/* only one channel */
-	struct sk_buff *skb = bcs->tx_skb;
-	int sent = -EOPNOTSUPP;
-
-	WARN_ON(!tty || !tty->ops || !skb);
-
-	if (!skb->len) {
-		dev_kfree_skb_any(skb);
-		bcs->tx_skb = NULL;
-		return -EINVAL;
-	}
-
-	set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
-	if (tty->ops->write)
-		sent = tty->ops->write(tty, skb->data, skb->len);
-	gig_dbg(DEBUG_OUTPUT, "write_modem: sent %d", sent);
-	if (sent < 0) {
-		/* error */
-		flush_send_queue(cs);
-		return sent;
-	}
-	skb_pull(skb, sent);
-	if (!skb->len) {
-		/* skb sent completely */
-		gigaset_skb_sent(bcs, skb);
-
-		gig_dbg(DEBUG_INTR, "kfree skb (Adr: %lx)!",
-			(unsigned long) skb);
-		dev_kfree_skb_any(skb);
-		bcs->tx_skb = NULL;
-	}
-	return sent;
-}
-
-/*
- * transmit first queued command buffer
- * result: number of bytes sent or error code < 0
- */
-static int send_cb(struct cardstate *cs)
-{
-	struct tty_struct *tty = cs->hw.ser->tty;
-	struct cmdbuf_t *cb, *tcb;
-	unsigned long flags;
-	int sent = 0;
-
-	WARN_ON(!tty || !tty->ops);
-
-	cb = cs->cmdbuf;
-	if (!cb)
-		return 0;	/* nothing to do */
-
-	if (cb->len) {
-		set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
-		sent = tty->ops->write(tty, cb->buf + cb->offset, cb->len);
-		if (sent < 0) {
-			/* error */
-			gig_dbg(DEBUG_OUTPUT, "send_cb: write error %d", sent);
-			flush_send_queue(cs);
-			return sent;
-		}
-		cb->offset += sent;
-		cb->len -= sent;
-		gig_dbg(DEBUG_OUTPUT, "send_cb: sent %d, left %u, queued %u",
-			sent, cb->len, cs->cmdbytes);
-	}
-
-	while (cb && !cb->len) {
-		spin_lock_irqsave(&cs->cmdlock, flags);
-		cs->cmdbytes -= cs->curlen;
-		tcb = cb;
-		cs->cmdbuf = cb = cb->next;
-		if (cb) {
-			cb->prev = NULL;
-			cs->curlen = cb->len;
-		} else {
-			cs->lastcmdbuf = NULL;
-			cs->curlen = 0;
-		}
-		spin_unlock_irqrestore(&cs->cmdlock, flags);
-
-		if (tcb->wake_tasklet)
-			tasklet_schedule(tcb->wake_tasklet);
-		kfree(tcb);
-	}
-	return sent;
-}
-
-/*
- * send queue tasklet
- * If an skb is already open, put its data into the transfer buffer
- * by calling "write_modem".
- * Otherwise take a new skb out of the queue.
- */
-static void gigaset_modem_fill(unsigned long data)
-{
-	struct cardstate *cs = (struct cardstate *) data;
-	struct bc_state *bcs;
-	struct sk_buff *nextskb;
-	int sent = 0;
-
-	if (!cs) {
-		gig_dbg(DEBUG_OUTPUT, "%s: no cardstate", __func__);
-		return;
-	}
-	bcs = cs->bcs;
-	if (!bcs) {
-		gig_dbg(DEBUG_OUTPUT, "%s: no B channel state", __func__);
-		return;
-	}
-	if (!bcs->tx_skb) {
-		/* no skb is being sent; send command if any */
-		sent = send_cb(cs);
-		gig_dbg(DEBUG_OUTPUT, "%s: send_cb -> %d", __func__, sent);
-		if (sent)
-			/* something sent or error */
-			return;
-
-		/* no command to send; get skb */
-		nextskb = skb_dequeue(&bcs->squeue);
-		if (!nextskb)
-			/* no skb either, nothing to do */
-			return;
-		bcs->tx_skb = nextskb;
-
-		gig_dbg(DEBUG_INTR, "Dequeued skb (Adr: %lx)",
-			(unsigned long) bcs->tx_skb);
-	}
-
-	/* send skb */
-	gig_dbg(DEBUG_OUTPUT, "%s: tx_skb", __func__);
-	if (write_modem(cs) < 0)
-		gig_dbg(DEBUG_OUTPUT, "%s: write_modem failed", __func__);
-}
-
-/*
- * throw away all data queued for sending
- */
-static void flush_send_queue(struct cardstate *cs)
-{
-	struct sk_buff *skb;
-	struct cmdbuf_t *cb;
-	unsigned long flags;
-
-	/* command queue */
-	spin_lock_irqsave(&cs->cmdlock, flags);
-	while ((cb = cs->cmdbuf) != NULL) {
-		cs->cmdbuf = cb->next;
-		if (cb->wake_tasklet)
-			tasklet_schedule(cb->wake_tasklet);
-		kfree(cb);
-	}
-	cs->cmdbuf = cs->lastcmdbuf = NULL;
-	cs->cmdbytes = cs->curlen = 0;
-	spin_unlock_irqrestore(&cs->cmdlock, flags);
-
-	/* data queue */
-	if (cs->bcs->tx_skb)
-		dev_kfree_skb_any(cs->bcs->tx_skb);
-	while ((skb = skb_dequeue(&cs->bcs->squeue)) != NULL)
-		dev_kfree_skb_any(skb);
-}
-
-
-/* Gigaset Driver Interface */
-/* ======================== */
-
-/*
- * queue an AT command string for transmission to the Gigaset device
- * parameters:
- *	cs		controller state structure
- *	cb		command buffer holding the string to send (cb->buf),
- *			the number of characters to send (cb->len), and the
- *			tasklet to run when transmission is complete, or NULL
- * return value:
- *	number of bytes queued, or error code < 0
- */
-static int gigaset_write_cmd(struct cardstate *cs, struct cmdbuf_t *cb)
-{
-	unsigned long flags;
-
-	gigaset_dbg_buffer(cs->mstate != MS_LOCKED ?
-			   DEBUG_TRANSCMD : DEBUG_LOCKCMD,
-			   "CMD Transmit", cb->len, cb->buf);
-
-	spin_lock_irqsave(&cs->cmdlock, flags);
-	cb->prev = cs->lastcmdbuf;
-	if (cs->lastcmdbuf)
-		cs->lastcmdbuf->next = cb;
-	else {
-		cs->cmdbuf = cb;
-		cs->curlen = cb->len;
-	}
-	cs->cmdbytes += cb->len;
-	cs->lastcmdbuf = cb;
-	spin_unlock_irqrestore(&cs->cmdlock, flags);
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (cs->connected)
-		tasklet_schedule(&cs->write_tasklet);
-	spin_unlock_irqrestore(&cs->lock, flags);
-	return cb->len;
-}
-
-/*
- * tty_driver.write_room interface routine
- * return number of characters the driver will accept to be written
- * parameter:
- *	controller state structure
- * return value:
- *	number of characters
- */
-static int gigaset_write_room(struct cardstate *cs)
-{
-	unsigned bytes;
-
-	bytes = cs->cmdbytes;
-	return bytes < IF_WRITEBUF ? IF_WRITEBUF - bytes : 0;
-}
-
-/*
- * tty_driver.chars_in_buffer interface routine
- * return number of characters waiting to be sent
- * parameter:
- *	controller state structure
- * return value:
- *	number of characters
- */
-static int gigaset_chars_in_buffer(struct cardstate *cs)
-{
-	return cs->cmdbytes;
-}
-
-/*
- * implementation of ioctl(GIGASET_BRKCHARS)
- * parameter:
- *	controller state structure
- * return value:
- *	-EINVAL (unimplemented function)
- */
-static int gigaset_brkchars(struct cardstate *cs, const unsigned char buf[6])
-{
-	/* not implemented */
-	return -EINVAL;
-}
-
-/*
- * Open B channel
- * Called by "do_action" in ev-layer.c
- */
-static int gigaset_init_bchannel(struct bc_state *bcs)
-{
-	/* nothing to do for M10x */
-	gigaset_bchannel_up(bcs);
-	return 0;
-}
-
-/*
- * Close B channel
- * Called by "do_action" in ev-layer.c
- */
-static int gigaset_close_bchannel(struct bc_state *bcs)
-{
-	/* nothing to do for M10x */
-	gigaset_bchannel_down(bcs);
-	return 0;
-}
-
-/*
- * Set up B channel structure
- * This is called by "gigaset_initcs" in common.c
- */
-static int gigaset_initbcshw(struct bc_state *bcs)
-{
-	/* unused */
-	bcs->hw.ser = NULL;
-	return 0;
-}
-
-/*
- * Free B channel structure
- * Called by "gigaset_freebcs" in common.c
- */
-static void gigaset_freebcshw(struct bc_state *bcs)
-{
-	/* unused */
-}
-
-/*
- * Reinitialize B channel structure
- * This is called by "bcs_reinit" in common.c
- */
-static void gigaset_reinitbcshw(struct bc_state *bcs)
-{
-	/* nothing to do for M10x */
-}
-
-/*
- * Free hardware specific device data
- * This will be called by "gigaset_freecs" in common.c
- */
-static void gigaset_freecshw(struct cardstate *cs)
-{
-	tasklet_kill(&cs->write_tasklet);
-	if (!cs->hw.ser)
-		return;
-	platform_device_unregister(&cs->hw.ser->dev);
-}
-
-static void gigaset_device_release(struct device *dev)
-{
-	kfree(container_of(dev, struct ser_cardstate, dev.dev));
-}
-
-/*
- * Set up hardware specific device data
- * This is called by "gigaset_initcs" in common.c
- */
-static int gigaset_initcshw(struct cardstate *cs)
-{
-	int rc;
-	struct ser_cardstate *scs;
-
-	scs = kzalloc(sizeof(struct ser_cardstate), GFP_KERNEL);
-	if (!scs) {
-		pr_err("out of memory\n");
-		return -ENOMEM;
-	}
-	cs->hw.ser = scs;
-
-	cs->hw.ser->dev.name = GIGASET_MODULENAME;
-	cs->hw.ser->dev.id = cs->minor_index;
-	cs->hw.ser->dev.dev.release = gigaset_device_release;
-	rc = platform_device_register(&cs->hw.ser->dev);
-	if (rc != 0) {
-		pr_err("error %d registering platform device\n", rc);
-		kfree(cs->hw.ser);
-		cs->hw.ser = NULL;
-		return rc;
-	}
-
-	tasklet_init(&cs->write_tasklet,
-		     gigaset_modem_fill, (unsigned long) cs);
-	return 0;
-}
-
-/*
- * set modem control lines
- * Parameters:
- *	card state structure
- *	modem control line state ([TIOCM_DTR]|[TIOCM_RTS])
- * Called by "gigaset_start" and "gigaset_enterconfigmode" in common.c
- * and by "if_lock" and "if_termios" in interface.c
- */
-static int gigaset_set_modem_ctrl(struct cardstate *cs, unsigned old_state,
-				  unsigned new_state)
-{
-	struct tty_struct *tty = cs->hw.ser->tty;
-	unsigned int set, clear;
-
-	WARN_ON(!tty || !tty->ops);
-	/* tiocmset is an optional tty driver method */
-	if (!tty->ops->tiocmset)
-		return -EINVAL;
-	set = new_state & ~old_state;
-	clear = old_state & ~new_state;
-	if (!set && !clear)
-		return 0;
-	gig_dbg(DEBUG_IF, "tiocmset set %x clear %x", set, clear);
-	return tty->ops->tiocmset(tty, set, clear);
-}
-
-static int gigaset_baud_rate(struct cardstate *cs, unsigned cflag)
-{
-	return -EINVAL;
-}
-
-static int gigaset_set_line_ctrl(struct cardstate *cs, unsigned cflag)
-{
-	return -EINVAL;
-}
-
-static const struct gigaset_ops ops = {
-	.write_cmd = gigaset_write_cmd,
-	.write_room = gigaset_write_room,
-	.chars_in_buffer = gigaset_chars_in_buffer,
-	.brkchars = gigaset_brkchars,
-	.init_bchannel = gigaset_init_bchannel,
-	.close_bchannel = gigaset_close_bchannel,
-	.initbcshw = gigaset_initbcshw,
-	.freebcshw = gigaset_freebcshw,
-	.reinitbcshw = gigaset_reinitbcshw,
-	.initcshw = gigaset_initcshw,
-	.freecshw = gigaset_freecshw,
-	.set_modem_ctrl = gigaset_set_modem_ctrl,
-	.baud_rate = gigaset_baud_rate,
-	.set_line_ctrl = gigaset_set_line_ctrl,
-	.send_skb = gigaset_m10x_send_skb,	/* asyncdata.c */
-	.handle_input = gigaset_m10x_input,	/* asyncdata.c */
-};
-
-
-/* Line Discipline Interface */
-/* ========================= */
-
-/* helper functions for cardstate refcounting */
-static struct cardstate *cs_get(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->disc_data;
-
-	if (!cs || !cs->hw.ser) {
-		gig_dbg(DEBUG_ANY, "%s: no cardstate", __func__);
-		return NULL;
-	}
-	atomic_inc(&cs->hw.ser->refcnt);
-	return cs;
-}
-
-static void cs_put(struct cardstate *cs)
-{
-	if (atomic_dec_and_test(&cs->hw.ser->refcnt))
-		complete(&cs->hw.ser->dead_cmp);
-}
-
-/*
- * Called by the tty driver when the line discipline is pushed onto the tty.
- * Called in process context.
- */
-static int
-gigaset_tty_open(struct tty_struct *tty)
-{
-	struct cardstate *cs;
-	int rc;
-
-	gig_dbg(DEBUG_INIT, "Starting HLL for Gigaset M101");
-
-	pr_info(DRIVER_DESC "\n");
-
-	if (!driver) {
-		pr_err("%s: no driver structure\n", __func__);
-		return -ENODEV;
-	}
-
-	/* allocate memory for our device state and initialize it */
-	cs = gigaset_initcs(driver, 1, 1, 0, cidmode, GIGASET_MODULENAME);
-	if (!cs) {
-		rc = -ENODEV;
-		goto error;
-	}
-
-	cs->dev = &cs->hw.ser->dev.dev;
-	cs->hw.ser->tty = tty;
-	atomic_set(&cs->hw.ser->refcnt, 1);
-	init_completion(&cs->hw.ser->dead_cmp);
-	tty->disc_data = cs;
-
-	/* Set the amount of data we're willing to receive per call
-	 * from the hardware driver to half of the input buffer size
-	 * to leave some reserve.
-	 * Note: We don't do flow control towards the hardware driver.
-	 * If more data is received than will fit into the input buffer,
-	 * it will be dropped and an error will be logged. This should
-	 * never happen as the device is slow and the buffer size ample.
-	 */
-	tty->receive_room = RBUFSIZE/2;
-
-	/* Initialization of the data structures and the hardware is done.
-	 * Now start up the system and notify the LL that we are ready to run.
-	 */
-	if (startmode == SM_LOCKED)
-		cs->mstate = MS_LOCKED;
-	rc = gigaset_start(cs);
-	if (rc < 0) {
-		tasklet_kill(&cs->write_tasklet);
-		goto error;
-	}
-
-	gig_dbg(DEBUG_INIT, "Startup of HLL done");
-	return 0;
-
-error:
-	gig_dbg(DEBUG_INIT, "Startup of HLL failed");
-	tty->disc_data = NULL;
-	gigaset_freecs(cs);
-	return rc;
-}
-
-/*
- * Called by the tty driver when the line discipline is removed.
- * Called from process context.
- */
-static void
-gigaset_tty_close(struct tty_struct *tty)
-{
-	struct cardstate *cs = tty->disc_data;
-
-	gig_dbg(DEBUG_INIT, "Stopping HLL for Gigaset M101");
-
-	if (!cs) {
-		gig_dbg(DEBUG_INIT, "%s: no cardstate", __func__);
-		return;
-	}
-
-	/* prevent other callers from entering ldisc methods */
-	tty->disc_data = NULL;
-
-	if (!cs->hw.ser)
-		pr_err("%s: no hw cardstate\n", __func__);
-	else {
-		/* wait for running methods to finish */
-		if (!atomic_dec_and_test(&cs->hw.ser->refcnt))
-			wait_for_completion(&cs->hw.ser->dead_cmp);
-	}
-
-	/* stop operations */
-	gigaset_stop(cs);
-	tasklet_kill(&cs->write_tasklet);
-	flush_send_queue(cs);
-	cs->dev = NULL;
-	gigaset_freecs(cs);
-
-	gig_dbg(DEBUG_INIT, "Shutdown of HLL done");
-}
-
-/*
- * Called by the tty driver when the tty line is hung up.
- * Wait for I/O to driver to complete and unregister ISDN device.
- * This is already done by the close routine, so just call that.
- * Called from process context.
- */
-static int gigaset_tty_hangup(struct tty_struct *tty)
-{
-	gigaset_tty_close(tty);
-	return 0;
-}
-
-/*
- * Ioctl on the tty.
- * Called in process context only.
- * May be re-entered by multiple ioctl calling threads.
- */
-static int
-gigaset_tty_ioctl(struct tty_struct *tty, struct file *file,
-		  unsigned int cmd, unsigned long arg)
-{
-	struct cardstate *cs = cs_get(tty);
-	int rc, val;
-	int __user *p = (int __user *)arg;
-
-	if (!cs)
-		return -ENXIO;
-
-	switch (cmd) {
-
-	case FIONREAD:
-		/* unused, always return zero */
-		val = 0;
-		rc = put_user(val, p);
-		break;
-
-	case TCFLSH:
-		/* flush our buffers and the serial port's buffer */
-		switch (arg) {
-		case TCIFLUSH:
-			/* no own input buffer to flush */
-			break;
-		case TCIOFLUSH:
-		case TCOFLUSH:
-			flush_send_queue(cs);
-			break;
-		}
-		/* fall through */
-
-	default:
-		/* pass through to underlying serial device */
-		rc = n_tty_ioctl_helper(tty, file, cmd, arg);
-		break;
-	}
-	cs_put(cs);
-	return rc;
-}
-
-/*
- * Called by the tty driver when a block of data has been received.
- * Will not be re-entered while running but other ldisc functions
- * may be called in parallel.
- * Can be called from hard interrupt level as well as soft interrupt
- * level or mainline.
- * Parameters:
- *	tty	tty structure
- *	buf	buffer containing received characters
- *	cflags	buffer containing error flags for received characters (ignored)
- *	count	number of received characters
- */
-static void
-gigaset_tty_receive(struct tty_struct *tty, const unsigned char *buf,
-		    char *cflags, int count)
-{
-	struct cardstate *cs = cs_get(tty);
-	unsigned tail, head, n;
-	struct inbuf_t *inbuf;
-
-	if (!cs)
-		return;
-	inbuf = cs->inbuf;
-	if (!inbuf) {
-		dev_err(cs->dev, "%s: no inbuf\n", __func__);
-		cs_put(cs);
-		return;
-	}
-
-	tail = inbuf->tail;
-	head = inbuf->head;
-	gig_dbg(DEBUG_INTR, "buffer state: %u -> %u, receive %u bytes",
-		head, tail, count);
-
-	if (head <= tail) {
-		/* possible buffer wraparound */
-		n = min_t(unsigned, count, RBUFSIZE - tail);
-		memcpy(inbuf->data + tail, buf, n);
-		tail = (tail + n) % RBUFSIZE;
-		buf += n;
-		count -= n;
-	}
-
-	if (count > 0) {
-		/* tail < head and some data left */
-		n = head - tail - 1;
-		if (count > n) {
-			dev_err(cs->dev,
-				"inbuf overflow, discarding %d bytes\n",
-				count - n);
-			count = n;
-		}
-		memcpy(inbuf->data + tail, buf, count);
-		tail += count;
-	}
-
-	gig_dbg(DEBUG_INTR, "setting tail to %u", tail);
-	inbuf->tail = tail;
-
-	/* Everything was received .. Push data into handler */
-	gig_dbg(DEBUG_INTR, "%s-->BH", __func__);
-	gigaset_schedule_event(cs);
-	cs_put(cs);
-}
-
-/*
- * Called by the tty driver when there's room for more data to send.
- */
-static void
-gigaset_tty_wakeup(struct tty_struct *tty)
-{
-	struct cardstate *cs = cs_get(tty);
-
-	clear_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
-	if (!cs)
-		return;
-	tasklet_schedule(&cs->write_tasklet);
-	cs_put(cs);
-}
-
-static struct tty_ldisc_ops gigaset_ldisc = {
-	.owner		= THIS_MODULE,
-	.magic		= TTY_LDISC_MAGIC,
-	.name		= "ser_gigaset",
-	.open		= gigaset_tty_open,
-	.close		= gigaset_tty_close,
-	.hangup		= gigaset_tty_hangup,
-	.ioctl		= gigaset_tty_ioctl,
-	.receive_buf	= gigaset_tty_receive,
-	.write_wakeup	= gigaset_tty_wakeup,
-};
-
-
-/* Initialization / Shutdown */
-/* ========================= */
-
-static int __init ser_gigaset_init(void)
-{
-	int rc;
-
-	gig_dbg(DEBUG_INIT, "%s", __func__);
-	rc = platform_driver_register(&device_driver);
-	if (rc != 0) {
-		pr_err("error %d registering platform driver\n", rc);
-		return rc;
-	}
-
-	/* allocate memory for our driver state and initialize it */
-	driver = gigaset_initdriver(GIGASET_MINOR, GIGASET_MINORS,
-				    GIGASET_MODULENAME, GIGASET_DEVNAME,
-				    &ops, THIS_MODULE);
-	if (!driver) {
-		rc = -ENOMEM;
-		goto error;
-	}
-
-	rc = tty_register_ldisc(N_GIGASET_M101, &gigaset_ldisc);
-	if (rc != 0) {
-		pr_err("error %d registering line discipline\n", rc);
-		goto error;
-	}
-
-	return 0;
-
-error:
-	if (driver) {
-		gigaset_freedriver(driver);
-		driver = NULL;
-	}
-	platform_driver_unregister(&device_driver);
-	return rc;
-}
-
-static void __exit ser_gigaset_exit(void)
-{
-	int rc;
-
-	gig_dbg(DEBUG_INIT, "%s", __func__);
-
-	if (driver) {
-		gigaset_freedriver(driver);
-		driver = NULL;
-	}
-
-	rc = tty_unregister_ldisc(N_GIGASET_M101);
-	if (rc != 0)
-		pr_err("error %d unregistering line discipline\n", rc);
-
-	platform_driver_unregister(&device_driver);
-}
-
-module_init(ser_gigaset_init);
-module_exit(ser_gigaset_exit);
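
The receive callback removed above (gigaset_tty_receive) copies incoming
bytes into a ring buffer in at most two pieces: first up to the end of the
array, then, after wrapping, up to one slot before the read position. The
following is a simplified user-space sketch of that producer side with a
byte-wise consumer added for illustration. The names ring, ring_put,
ring_get and RING_SIZE are hypothetical. One deliberate difference from the
removed code: the sketch also keeps a slot free when head is 0, so that the
tail can never wrap onto the head.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RING_SIZE 16	/* hypothetical; the driver used RBUFSIZE */

struct ring {
	unsigned char data[RING_SIZE];
	unsigned head;	/* next byte to read (consumer side) */
	unsigned tail;	/* next byte to write (producer side) */
};

/* Append up to count bytes; returns how many were actually stored. */
static unsigned ring_put(struct ring *r, const unsigned char *buf,
			 unsigned count)
{
	unsigned head, tail, n, stored = 0;

	assert(r != NULL && buf != NULL);
	head = r->head;
	tail = r->tail;

	if (head <= tail) {
		/* free space runs to the end of the array */
		n = RING_SIZE - tail;
		if (head == 0)
			n--;	/* keep one slot free before head */
		if (n > count)
			n = count;
		memcpy(r->data + tail, buf, n);
		tail = (tail + n) % RING_SIZE;
		buf += n;
		count -= n;
		stored += n;
	}

	if (count > 0 && tail < head) {
		/* wrapped: free space ends one slot before head */
		n = head - tail - 1;
		if (n > count)
			n = count;	/* excess bytes are discarded */
		memcpy(r->data + tail, buf, n);
		tail += n;
		stored += n;
	}

	r->tail = tail;
	return stored;
}

/* Remove up to max bytes; returns how many were copied to out. */
static unsigned ring_get(struct ring *r, unsigned char *out, unsigned max)
{
	unsigned n = 0;

	while (r->head != r->tail && n < max) {
		out[n++] = r->data[r->head];
		r->head = (r->head + 1) % RING_SIZE;
	}
	return n;
}
```

With this layout a buffer of RING_SIZE bytes holds at most RING_SIZE - 1
bytes, which is what lets head == tail mean "empty" without a separate
fill counter.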
diff --git a/drivers/staging/isdn/gigaset/usb-gigaset.c b/drivers/staging/isdn/gigaset/usb-gigaset.c
deleted file mode 100644
index 1b9b43659bdf..000000000000
--- a/drivers/staging/isdn/gigaset/usb-gigaset.c
+++ /dev/null
@@ -1,946 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * USB driver for Gigaset 307x directly or using M105 Data.
- *
- * Copyright (c) 2001 by Stefan Eilers
- *                   and Hansjoerg Lipp <hjlipp@web.de>.
- *
- * This driver was derived from the USB skeleton driver by
- * Greg Kroah-Hartman <greg@kroah.com>
- *
- * =====================================================================
- * =====================================================================
- */
-
-#include "gigaset.h"
-#include <linux/usb.h>
-#include <linux/module.h>
-#include <linux/moduleparam.h>
-
-/* Version Information */
-#define DRIVER_AUTHOR "Hansjoerg Lipp <hjlipp@web.de>, Stefan Eilers"
-#define DRIVER_DESC "USB Driver for Gigaset 307x using M105"
-
-/* Module parameters */
-
-static int startmode = SM_ISDN;
-static int cidmode = 1;
-
-module_param(startmode, int, S_IRUGO);
-module_param(cidmode, int, S_IRUGO);
-MODULE_PARM_DESC(startmode, "start in isdn4linux mode");
-MODULE_PARM_DESC(cidmode, "Call-ID mode");
-
-#define GIGASET_MINORS     1
-#define GIGASET_MINOR      8
-#define GIGASET_MODULENAME "usb_gigaset"
-#define GIGASET_DEVNAME    "ttyGU"
-
-/* length limit according to Siemens 3070usb-protokoll.doc ch. 2.1 */
-#define IF_WRITEBUF 264
-
-/* Values for the Gigaset M105 Data */
-#define USB_M105_VENDOR_ID	0x0681
-#define USB_M105_PRODUCT_ID	0x0009
-
-/* table of devices that work with this driver */
-static const struct usb_device_id gigaset_table[] = {
-	{ USB_DEVICE(USB_M105_VENDOR_ID, USB_M105_PRODUCT_ID) },
-	{ }					/* Terminating entry */
-};
-
-MODULE_DEVICE_TABLE(usb, gigaset_table);
-
-/*
- * Control requests (empty fields: 00)
- *
- *       RT|RQ|VALUE|INDEX|LEN  |DATA
- * In:
- *       C1 08             01
- *            Get flags (1 byte). Bits: 0=dtr,1=rts,3-7:?
- *       C1 0F             ll ll
- *            Get device information/status (llll: 0x200 and 0x40 seen).
- *            Real size: I only saw MIN(llll,0x64).
- *            Contents: seems to be always the same...
- *              offset 0x00: Length of this structure (0x64) (len: 1,2,3 bytes)
- *              offset 0x3c: String (16 bit chars): "MCCI USB Serial V2.0"
- *              rest:        ?
- * Out:
- *       41 11
- *            Initialize/reset device ?
- *       41 00 xx 00
- *            ? (xx=00 or 01; 01 on start, 00 on close)
- *       41 07 vv mm
- *            Set/clear flags vv=value, mm=mask (see RQ 08)
- *       41 12 xx
- *            Used before the following configuration requests are issued
- *            (with xx=0x0f). I've seen other values<0xf, though.
- *       41 01 xx xx
- *            Set baud rate. xxxx=ceil(0x384000/rate)=trunc(0x383fff/rate)+1.
- *       41 03 ps bb
- *            Set byte size and parity. p:  0x20=even,0x10=odd,0x00=no parity
- *                                     [    0x30: m, 0x40: s           ]
- *                                     [s:  0: 1 stop bit; 1: 1.5; 2: 2]
- *                                      bb: bits/byte (seen 7 and 8)
- *       41 13 -- -- -- -- 10 00 ww 00 00 00 xx 00 00 00 yy 00 00 00 zz 00 00 00
- *            ??
- *            Initialization: 01, 40, 00, 00
- *            Open device:    00  40, 00, 00
- *            yy and zz seem to be equal, either 0x00 or 0x0a
- *            (ww,xx) pairs seen: (00,00), (00,40), (01,40), (09,80), (19,80)
- *       41 19 -- -- -- -- 06 00 00 00 00 xx 11 13
- *            Used after every "configuration sequence" (RQ 12, RQs 01/03/13).
- *            xx is usually 0x00 but was 0x7e before starting data transfer
- *            in unimodem mode. So, this might be an array of characters that
- *            need special treatment ("commit all buffered data"?), 11=^Q, 13=^S.
- *
- * Unimodem mode: use "modprobe ppp_async flag_time=0" as the device _needs_ two
- * flags per packet.
- */
-
-/* functions called if a device of this driver is connected/disconnected */
-static int gigaset_probe(struct usb_interface *interface,
-			 const struct usb_device_id *id);
-static void gigaset_disconnect(struct usb_interface *interface);
-
-/* functions called before/after suspend */
-static int gigaset_suspend(struct usb_interface *intf, pm_message_t message);
-static int gigaset_resume(struct usb_interface *intf);
-static int gigaset_pre_reset(struct usb_interface *intf);
-
-static struct gigaset_driver *driver;
-
-/* usb specific object needed to register this driver with the usb subsystem */
-static struct usb_driver gigaset_usb_driver = {
-	.name =		GIGASET_MODULENAME,
-	.probe =	gigaset_probe,
-	.disconnect =	gigaset_disconnect,
-	.id_table =	gigaset_table,
-	.suspend =	gigaset_suspend,
-	.resume =	gigaset_resume,
-	.reset_resume =	gigaset_resume,
-	.pre_reset =	gigaset_pre_reset,
-	.post_reset =	gigaset_resume,
-	.disable_hub_initiated_lpm = 1,
-};
-
-struct usb_cardstate {
-	struct usb_device	*udev;		/* usb device pointer */
-	struct usb_interface	*interface;	/* interface for this device */
-	int			busy;		/* bulk output in progress */
-
-	/* Output buffer */
-	unsigned char		*bulk_out_buffer;
-	int			bulk_out_size;
-	int			bulk_out_epnum;
-	struct urb		*bulk_out_urb;
-
-	/* Input buffer */
-	unsigned char		*rcvbuf;
-	int			rcvbuf_size;
-	struct urb		*read_urb;
-
-	char			bchars[6];		/* for request 0x19 */
-};
-
-static inline unsigned tiocm_to_gigaset(unsigned state)
-{
-	return ((state & TIOCM_DTR) ? 1 : 0) | ((state & TIOCM_RTS) ? 2 : 0);
-}
-
-static int gigaset_set_modem_ctrl(struct cardstate *cs, unsigned old_state,
-				  unsigned new_state)
-{
-	struct usb_device *udev = cs->hw.usb->udev;
-	unsigned mask, val;
-	int r;
-
-	mask = tiocm_to_gigaset(old_state ^ new_state);
-	val = tiocm_to_gigaset(new_state);
-
-	gig_dbg(DEBUG_USBREQ, "set flags 0x%02x with mask 0x%02x", val, mask);
-	r = usb_control_msg(udev, usb_sndctrlpipe(udev, 0), 7, 0x41,
-			    (val & 0xff) | ((mask & 0xff) << 8), 0,
-			    NULL, 0, 2000 /* timeout? */);
-	if (r < 0)
-		return r;
-	return 0;
-}
-
-/*
- * Set M105 configuration value
- * using undocumented device commands reverse engineered from USB traces
- * of the Siemens Windows driver
- */
-static int set_value(struct cardstate *cs, u8 req, u16 val)
-{
-	struct usb_device *udev = cs->hw.usb->udev;
-	int r, r2;
-
-	gig_dbg(DEBUG_USBREQ, "request %02x (%04x)",
-		(unsigned)req, (unsigned)val);
-	r = usb_control_msg(udev, usb_sndctrlpipe(udev, 0), 0x12, 0x41,
-			    0xf /*?*/, 0, NULL, 0, 2000 /*?*/);
-	/* no idea what this does */
-	if (r < 0) {
-		dev_err(&udev->dev, "error %d on request 0x12\n", -r);
-		return r;
-	}
-
-	r = usb_control_msg(udev, usb_sndctrlpipe(udev, 0), req, 0x41,
-			    val, 0, NULL, 0, 2000 /*?*/);
-	if (r < 0)
-		dev_err(&udev->dev, "error %d on request 0x%02x\n",
-			-r, (unsigned)req);
-
-	r2 = usb_control_msg(udev, usb_sndctrlpipe(udev, 0), 0x19, 0x41,
-			     0, 0, cs->hw.usb->bchars, 6, 2000 /*?*/);
-	if (r2 < 0)
-		dev_err(&udev->dev, "error %d on request 0x19\n", -r2);
-
-	return r < 0 ? r : (r2 < 0 ? r2 : 0);
-}
-
-/*
- * set the baud rate on the internal serial adapter
- * using the undocumented parameter setting command
- */
-static int gigaset_baud_rate(struct cardstate *cs, unsigned cflag)
-{
-	u16 val;
-	u32 rate;
-
-	cflag &= CBAUD;
-
-	switch (cflag) {
-	case    B300: rate =     300; break;
-	case    B600: rate =     600; break;
-	case   B1200: rate =    1200; break;
-	case   B2400: rate =    2400; break;
-	case   B4800: rate =    4800; break;
-	case   B9600: rate =    9600; break;
-	case  B19200: rate =   19200; break;
-	case  B38400: rate =   38400; break;
-	case  B57600: rate =   57600; break;
-	case B115200: rate =  115200; break;
-	default:
-		rate =  9600;
-		dev_err(cs->dev, "unsupported baudrate request 0x%x,"
-			" using default of B9600\n", cflag);
-	}
-
-	val = 0x383fff / rate + 1;
-
-	return set_value(cs, 1, val);
-}
-
-/*
- * set the line format on the internal serial adapter
- * using the undocumented parameter setting command
- */
-static int gigaset_set_line_ctrl(struct cardstate *cs, unsigned cflag)
-{
-	u16 val = 0;
-
-	/* set the parity */
-	if (cflag & PARENB)
-		val |= (cflag & PARODD) ? 0x10 : 0x20;
-
-	/* set the number of data bits */
-	switch (cflag & CSIZE) {
-	case CS5:
-		val |= 5 << 8; break;
-	case CS6:
-		val |= 6 << 8; break;
-	case CS7:
-		val |= 7 << 8; break;
-	case CS8:
-		val |= 8 << 8; break;
-	default:
-		dev_err(cs->dev, "CSIZE was not CS5-CS8, using default of 8\n");
-		val |= 8 << 8;
-		break;
-	}
-
-	/* set the number of stop bits */
-	if (cflag & CSTOPB) {
-		if ((cflag & CSIZE) == CS5)
-			val |= 1; /* 1.5 stop bits */
-		else
-			val |= 2; /* 2 stop bits */
-	}
-
-	return set_value(cs, 3, val);
-}
-
-
-/*============================================================================*/
-static int gigaset_init_bchannel(struct bc_state *bcs)
-{
-	/* nothing to do for M10x */
-	gigaset_bchannel_up(bcs);
-	return 0;
-}
-
-static int gigaset_close_bchannel(struct bc_state *bcs)
-{
-	/* nothing to do for M10x */
-	gigaset_bchannel_down(bcs);
-	return 0;
-}
-
-static int write_modem(struct cardstate *cs);
-static int send_cb(struct cardstate *cs);
-
-
-/* Write tasklet handler: Continue sending current skb, or send command, or
- * start sending an skb from the send queue.
- */
-static void gigaset_modem_fill(unsigned long data)
-{
-	struct cardstate *cs = (struct cardstate *) data;
-	struct bc_state *bcs = &cs->bcs[0]; /* only one channel */
-
-	gig_dbg(DEBUG_OUTPUT, "modem_fill");
-
-	if (cs->hw.usb->busy) {
-		gig_dbg(DEBUG_OUTPUT, "modem_fill: busy");
-		return;
-	}
-
-again:
-	if (!bcs->tx_skb) {	/* no skb is being sent */
-		if (cs->cmdbuf) {	/* commands to send? */
-			gig_dbg(DEBUG_OUTPUT, "modem_fill: cb");
-			if (send_cb(cs) < 0) {
-				gig_dbg(DEBUG_OUTPUT,
-					"modem_fill: send_cb failed");
-				goto again; /* no callback will be called! */
-			}
-			return;
-		}
-
-		/* skbs to send? */
-		bcs->tx_skb = skb_dequeue(&bcs->squeue);
-		if (!bcs->tx_skb)
-			return;
-
-		gig_dbg(DEBUG_INTR, "Dequeued skb (Adr: %lx)!",
-			(unsigned long) bcs->tx_skb);
-	}
-
-	gig_dbg(DEBUG_OUTPUT, "modem_fill: tx_skb");
-	if (write_modem(cs) < 0) {
-		gig_dbg(DEBUG_OUTPUT, "modem_fill: write_modem failed");
-		goto again;	/* no callback will be called! */
-	}
-}
-
-/*
- * Interrupt Input URB completion routine
- */
-static void gigaset_read_int_callback(struct urb *urb)
-{
-	struct cardstate *cs = urb->context;
-	struct inbuf_t *inbuf = cs->inbuf;
-	int status = urb->status;
-	int r;
-	unsigned numbytes;
-	unsigned char *src;
-	unsigned long flags;
-
-	if (!status) {
-		numbytes = urb->actual_length;
-
-		if (numbytes) {
-			src = cs->hw.usb->rcvbuf;
-			if (unlikely(*src))
-				dev_warn(cs->dev,
-					 "%s: There was no leading 0, but 0x%02x!\n",
-					 __func__, (unsigned) *src);
-			++src; /* skip leading 0x00 */
-			--numbytes;
-			if (gigaset_fill_inbuf(inbuf, src, numbytes)) {
-				gig_dbg(DEBUG_INTR, "%s-->BH", __func__);
-				gigaset_schedule_event(inbuf->cs);
-			}
-		} else
-			gig_dbg(DEBUG_INTR, "Received zero block length");
-	} else {
-		/* The urb might have been killed. */
-		gig_dbg(DEBUG_ANY, "%s - nonzero status received: %d",
-			__func__, status);
-		if (status == -ENOENT || status == -ESHUTDOWN)
-			/* killed or endpoint shutdown: don't resubmit */
-			return;
-	}
-
-	/* resubmit URB */
-	spin_lock_irqsave(&cs->lock, flags);
-	if (!cs->connected) {
-		spin_unlock_irqrestore(&cs->lock, flags);
-		pr_err("%s: disconnected\n", __func__);
-		return;
-	}
-	r = usb_submit_urb(urb, GFP_ATOMIC);
-	spin_unlock_irqrestore(&cs->lock, flags);
-	if (r)
-		dev_err(cs->dev, "error %d resubmitting URB\n", -r);
-}
-
-
-/* This callback routine is called when data was transmitted to the device. */
-static void gigaset_write_bulk_callback(struct urb *urb)
-{
-	struct cardstate *cs = urb->context;
-	int status = urb->status;
-	unsigned long flags;
-
-	switch (status) {
-	case 0:			/* normal completion */
-		break;
-	case -ENOENT:		/* killed */
-		gig_dbg(DEBUG_ANY, "%s: killed", __func__);
-		cs->hw.usb->busy = 0;
-		return;
-	default:
-		dev_err(cs->dev, "bulk transfer failed (status %d)\n",
-			-status);
-		/* That's all we can do. Communication problems
-		   are handled by timeouts or network protocols. */
-	}
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (!cs->connected) {
-		pr_err("%s: disconnected\n", __func__);
-	} else {
-		cs->hw.usb->busy = 0;
-		tasklet_schedule(&cs->write_tasklet);
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-}
-
-static int send_cb(struct cardstate *cs)
-{
-	struct cmdbuf_t *cb = cs->cmdbuf;
-	unsigned long flags;
-	int count;
-	int status = -ENOENT;
-	struct usb_cardstate *ucs = cs->hw.usb;
-
-	do {
-		if (!cb->len) {
-			spin_lock_irqsave(&cs->cmdlock, flags);
-			cs->cmdbytes -= cs->curlen;
-			gig_dbg(DEBUG_OUTPUT, "send_cb: sent %u bytes, %u left",
-				cs->curlen, cs->cmdbytes);
-			cs->cmdbuf = cb->next;
-			if (cs->cmdbuf) {
-				cs->cmdbuf->prev = NULL;
-				cs->curlen = cs->cmdbuf->len;
-			} else {
-				cs->lastcmdbuf = NULL;
-				cs->curlen = 0;
-			}
-			spin_unlock_irqrestore(&cs->cmdlock, flags);
-
-			if (cb->wake_tasklet)
-				tasklet_schedule(cb->wake_tasklet);
-			kfree(cb);
-
-			cb = cs->cmdbuf;
-		}
-
-		if (cb) {
-			count = min(cb->len, ucs->bulk_out_size);
-			gig_dbg(DEBUG_OUTPUT, "send_cb: send %d bytes", count);
-
-			usb_fill_bulk_urb(ucs->bulk_out_urb, ucs->udev,
-					  usb_sndbulkpipe(ucs->udev,
-							  ucs->bulk_out_epnum),
-					  cb->buf + cb->offset, count,
-					  gigaset_write_bulk_callback, cs);
-
-			cb->offset += count;
-			cb->len -= count;
-			ucs->busy = 1;
-
-			spin_lock_irqsave(&cs->lock, flags);
-			status = cs->connected ?
-				usb_submit_urb(ucs->bulk_out_urb, GFP_ATOMIC) :
-				-ENODEV;
-			spin_unlock_irqrestore(&cs->lock, flags);
-
-			if (status) {
-				ucs->busy = 0;
-				dev_err(cs->dev,
-					"could not submit urb (error %d)\n",
-					-status);
-				cb->len = 0; /* skip urb => remove cb+wakeup
-						in next loop cycle */
-			}
-		}
-	} while (cb && status); /* next command on error */
-
-	return status;
-}
-
-/* Send command to device. */
-static int gigaset_write_cmd(struct cardstate *cs, struct cmdbuf_t *cb)
-{
-	unsigned long flags;
-	int len;
-
-	gigaset_dbg_buffer(cs->mstate != MS_LOCKED ?
-			   DEBUG_TRANSCMD : DEBUG_LOCKCMD,
-			   "CMD Transmit", cb->len, cb->buf);
-
-	spin_lock_irqsave(&cs->cmdlock, flags);
-	cb->prev = cs->lastcmdbuf;
-	if (cs->lastcmdbuf)
-		cs->lastcmdbuf->next = cb;
-	else {
-		cs->cmdbuf = cb;
-		cs->curlen = cb->len;
-	}
-	cs->cmdbytes += cb->len;
-	cs->lastcmdbuf = cb;
-	spin_unlock_irqrestore(&cs->cmdlock, flags);
-
-	spin_lock_irqsave(&cs->lock, flags);
-	len = cb->len;
-	if (cs->connected)
-		tasklet_schedule(&cs->write_tasklet);
-	spin_unlock_irqrestore(&cs->lock, flags);
-	return len;
-}
-
-static int gigaset_write_room(struct cardstate *cs)
-{
-	unsigned bytes;
-
-	bytes = cs->cmdbytes;
-	return bytes < IF_WRITEBUF ? IF_WRITEBUF - bytes : 0;
-}
-
-static int gigaset_chars_in_buffer(struct cardstate *cs)
-{
-	return cs->cmdbytes;
-}
-
-/*
- * set the break characters on the internal serial adapter
- * using undocumented device commands reverse engineered from USB traces
- * of the Siemens Windows driver
- */
-static int gigaset_brkchars(struct cardstate *cs, const unsigned char buf[6])
-{
-	struct usb_device *udev = cs->hw.usb->udev;
-
-	gigaset_dbg_buffer(DEBUG_USBREQ, "brkchars", 6, buf);
-	memcpy(cs->hw.usb->bchars, buf, 6);
-	return usb_control_msg(udev, usb_sndctrlpipe(udev, 0), 0x19, 0x41,
-			       0, 0, &buf, 6, 2000);
-}
-
-static void gigaset_freebcshw(struct bc_state *bcs)
-{
-	/* unused */
-}
-
-/* Initialize the b-channel structure */
-static int gigaset_initbcshw(struct bc_state *bcs)
-{
-	/* unused */
-	bcs->hw.usb = NULL;
-	return 0;
-}
-
-static void gigaset_reinitbcshw(struct bc_state *bcs)
-{
-	/* nothing to do for M10x */
-}
-
-static void gigaset_freecshw(struct cardstate *cs)
-{
-	tasklet_kill(&cs->write_tasklet);
-	kfree(cs->hw.usb);
-}
-
-static int gigaset_initcshw(struct cardstate *cs)
-{
-	struct usb_cardstate *ucs;
-
-	cs->hw.usb = ucs =
-		kmalloc(sizeof(struct usb_cardstate), GFP_KERNEL);
-	if (!ucs) {
-		pr_err("out of memory\n");
-		return -ENOMEM;
-	}
-
-	ucs->bchars[0] = 0;
-	ucs->bchars[1] = 0;
-	ucs->bchars[2] = 0;
-	ucs->bchars[3] = 0;
-	ucs->bchars[4] = 0x11;
-	ucs->bchars[5] = 0x13;
-	ucs->bulk_out_buffer = NULL;
-	ucs->bulk_out_urb = NULL;
-	ucs->read_urb = NULL;
-	tasklet_init(&cs->write_tasklet,
-		     gigaset_modem_fill, (unsigned long) cs);
-
-	return 0;
-}
-
-/* Send data from current skb to the device. */
-static int write_modem(struct cardstate *cs)
-{
-	int ret = 0;
-	int count;
-	struct bc_state *bcs = &cs->bcs[0]; /* only one channel */
-	struct usb_cardstate *ucs = cs->hw.usb;
-	unsigned long flags;
-
-	gig_dbg(DEBUG_OUTPUT, "len: %d...", bcs->tx_skb->len);
-
-	if (!bcs->tx_skb->len) {
-		dev_kfree_skb_any(bcs->tx_skb);
-		bcs->tx_skb = NULL;
-		return -EINVAL;
-	}
-
-	/* Copy data to bulk out buffer and transmit data */
-	count = min(bcs->tx_skb->len, (unsigned) ucs->bulk_out_size);
-	skb_copy_from_linear_data(bcs->tx_skb, ucs->bulk_out_buffer, count);
-	skb_pull(bcs->tx_skb, count);
-	ucs->busy = 1;
-	gig_dbg(DEBUG_OUTPUT, "write_modem: send %d bytes", count);
-
-	spin_lock_irqsave(&cs->lock, flags);
-	if (cs->connected) {
-		usb_fill_bulk_urb(ucs->bulk_out_urb, ucs->udev,
-				  usb_sndbulkpipe(ucs->udev,
-						  ucs->bulk_out_epnum),
-				  ucs->bulk_out_buffer, count,
-				  gigaset_write_bulk_callback, cs);
-		ret = usb_submit_urb(ucs->bulk_out_urb, GFP_ATOMIC);
-	} else {
-		ret = -ENODEV;
-	}
-	spin_unlock_irqrestore(&cs->lock, flags);
-
-	if (ret) {
-		dev_err(cs->dev, "could not submit urb (error %d)\n", -ret);
-		ucs->busy = 0;
-	}
-
-	if (!bcs->tx_skb->len) {
-		/* skb sent completely */
-		gigaset_skb_sent(bcs, bcs->tx_skb);
-
-		gig_dbg(DEBUG_INTR, "kfree skb (Adr: %lx)!",
-			(unsigned long) bcs->tx_skb);
-		dev_kfree_skb_any(bcs->tx_skb);
-		bcs->tx_skb = NULL;
-	}
-
-	return ret;
-}
-
-static int gigaset_probe(struct usb_interface *interface,
-			 const struct usb_device_id *id)
-{
-	int retval;
-	struct usb_device *udev = interface_to_usbdev(interface);
-	struct usb_host_interface *hostif = interface->cur_altsetting;
-	struct cardstate *cs = NULL;
-	struct usb_cardstate *ucs = NULL;
-	struct usb_endpoint_descriptor *endpoint;
-	int buffer_size;
-
-	gig_dbg(DEBUG_ANY, "%s: Check if device matches ...", __func__);
-
-	/* See if the device offered us matches what we can accept */
-	if ((le16_to_cpu(udev->descriptor.idVendor)  != USB_M105_VENDOR_ID) ||
-	    (le16_to_cpu(udev->descriptor.idProduct) != USB_M105_PRODUCT_ID)) {
-		gig_dbg(DEBUG_ANY, "device ID (0x%x, 0x%x) not for me - skip",
-			le16_to_cpu(udev->descriptor.idVendor),
-			le16_to_cpu(udev->descriptor.idProduct));
-		return -ENODEV;
-	}
-	if (hostif->desc.bInterfaceNumber != 0) {
-		gig_dbg(DEBUG_ANY, "interface %d not for me - skip",
-			hostif->desc.bInterfaceNumber);
-		return -ENODEV;
-	}
-	if (hostif->desc.bAlternateSetting != 0) {
-		dev_notice(&udev->dev, "unsupported altsetting %d - skip",
-			   hostif->desc.bAlternateSetting);
-		return -ENODEV;
-	}
-	if (hostif->desc.bInterfaceClass != 255) {
-		dev_notice(&udev->dev, "unsupported interface class %d - skip",
-			   hostif->desc.bInterfaceClass);
-		return -ENODEV;
-	}
-
-	dev_info(&udev->dev, "%s: Device matched ... !\n", __func__);
-
-	/* allocate memory for our device state and initialize it */
-	cs = gigaset_initcs(driver, 1, 1, 0, cidmode, GIGASET_MODULENAME);
-	if (!cs)
-		return -ENODEV;
-	ucs = cs->hw.usb;
-
-	/* save off device structure ptrs for later use */
-	usb_get_dev(udev);
-	ucs->udev = udev;
-	ucs->interface = interface;
-	cs->dev = &interface->dev;
-
-	/* save address of controller structure */
-	usb_set_intfdata(interface, cs);
-
-	endpoint = &hostif->endpoint[0].desc;
-
-	buffer_size = le16_to_cpu(endpoint->wMaxPacketSize);
-	ucs->bulk_out_size = buffer_size;
-	ucs->bulk_out_epnum = usb_endpoint_num(endpoint);
-	ucs->bulk_out_buffer = kmalloc(buffer_size, GFP_KERNEL);
-	if (!ucs->bulk_out_buffer) {
-		dev_err(cs->dev, "Couldn't allocate bulk_out_buffer\n");
-		retval = -ENOMEM;
-		goto error;
-	}
-
-	ucs->bulk_out_urb = usb_alloc_urb(0, GFP_KERNEL);
-	if (!ucs->bulk_out_urb) {
-		dev_err(cs->dev, "Couldn't allocate bulk_out_urb\n");
-		retval = -ENOMEM;
-		goto error;
-	}
-
-	endpoint = &hostif->endpoint[1].desc;
-
-	ucs->busy = 0;
-
-	ucs->read_urb = usb_alloc_urb(0, GFP_KERNEL);
-	if (!ucs->read_urb) {
-		dev_err(cs->dev, "No free urbs available\n");
-		retval = -ENOMEM;
-		goto error;
-	}
-	buffer_size = le16_to_cpu(endpoint->wMaxPacketSize);
-	ucs->rcvbuf_size = buffer_size;
-	ucs->rcvbuf = kmalloc(buffer_size, GFP_KERNEL);
-	if (!ucs->rcvbuf) {
-		dev_err(cs->dev, "Couldn't allocate rcvbuf\n");
-		retval = -ENOMEM;
-		goto error;
-	}
-	/* Fill the interrupt urb and send it to the core */
-	usb_fill_int_urb(ucs->read_urb, udev,
-			 usb_rcvintpipe(udev, usb_endpoint_num(endpoint)),
-			 ucs->rcvbuf, buffer_size,
-			 gigaset_read_int_callback,
-			 cs, endpoint->bInterval);
-
-	retval = usb_submit_urb(ucs->read_urb, GFP_KERNEL);
-	if (retval) {
-		dev_err(cs->dev, "Could not submit URB (error %d)\n", -retval);
-		goto error;
-	}
-
-	/* tell common part that the device is ready */
-	if (startmode == SM_LOCKED)
-		cs->mstate = MS_LOCKED;
-
-	retval = gigaset_start(cs);
-	if (retval < 0) {
-		tasklet_kill(&cs->write_tasklet);
-		goto error;
-	}
-	return 0;
-
-error:
-	usb_kill_urb(ucs->read_urb);
-	kfree(ucs->bulk_out_buffer);
-	usb_free_urb(ucs->bulk_out_urb);
-	kfree(ucs->rcvbuf);
-	usb_free_urb(ucs->read_urb);
-	usb_set_intfdata(interface, NULL);
-	ucs->read_urb = ucs->bulk_out_urb = NULL;
-	ucs->rcvbuf = ucs->bulk_out_buffer = NULL;
-	usb_put_dev(ucs->udev);
-	ucs->udev = NULL;
-	ucs->interface = NULL;
-	gigaset_freecs(cs);
-	return retval;
-}
-
-static void gigaset_disconnect(struct usb_interface *interface)
-{
-	struct cardstate *cs;
-	struct usb_cardstate *ucs;
-
-	cs = usb_get_intfdata(interface);
-	ucs = cs->hw.usb;
-
-	dev_info(cs->dev, "disconnecting Gigaset USB adapter\n");
-
-	usb_kill_urb(ucs->read_urb);
-
-	gigaset_stop(cs);
-
-	usb_set_intfdata(interface, NULL);
-	tasklet_kill(&cs->write_tasklet);
-
-	usb_kill_urb(ucs->bulk_out_urb);
-
-	kfree(ucs->bulk_out_buffer);
-	usb_free_urb(ucs->bulk_out_urb);
-	kfree(ucs->rcvbuf);
-	usb_free_urb(ucs->read_urb);
-	ucs->read_urb = ucs->bulk_out_urb = NULL;
-	ucs->rcvbuf = ucs->bulk_out_buffer = NULL;
-
-	usb_put_dev(ucs->udev);
-	ucs->interface = NULL;
-	ucs->udev = NULL;
-	cs->dev = NULL;
-	gigaset_freecs(cs);
-}
-
-/* gigaset_suspend
- * This function is called before the USB connection is suspended or reset.
- */
-static int gigaset_suspend(struct usb_interface *intf, pm_message_t message)
-{
-	struct cardstate *cs = usb_get_intfdata(intf);
-
-	/* stop activity */
-	cs->connected = 0;	/* prevent rescheduling */
-	usb_kill_urb(cs->hw.usb->read_urb);
-	tasklet_kill(&cs->write_tasklet);
-	usb_kill_urb(cs->hw.usb->bulk_out_urb);
-
-	gig_dbg(DEBUG_SUSPEND, "suspend complete");
-	return 0;
-}
-
-/* gigaset_resume
- * This function is called after the USB connection has been resumed or reset.
- */
-static int gigaset_resume(struct usb_interface *intf)
-{
-	struct cardstate *cs = usb_get_intfdata(intf);
-	int rc;
-
-	/* resubmit interrupt URB */
-	cs->connected = 1;
-	rc = usb_submit_urb(cs->hw.usb->read_urb, GFP_KERNEL);
-	if (rc) {
-		dev_err(cs->dev, "Could not submit read URB (error %d)\n", -rc);
-		return rc;
-	}
-
-	gig_dbg(DEBUG_SUSPEND, "resume complete");
-	return 0;
-}
-
-/* gigaset_pre_reset
- * This function is called before the USB connection is reset.
- */
-static int gigaset_pre_reset(struct usb_interface *intf)
-{
-	/* same as suspend */
-	return gigaset_suspend(intf, PMSG_ON);
-}
-
-static const struct gigaset_ops ops = {
-	.write_cmd = gigaset_write_cmd,
-	.write_room = gigaset_write_room,
-	.chars_in_buffer = gigaset_chars_in_buffer,
-	.brkchars = gigaset_brkchars,
-	.init_bchannel = gigaset_init_bchannel,
-	.close_bchannel = gigaset_close_bchannel,
-	.initbcshw = gigaset_initbcshw,
-	.freebcshw = gigaset_freebcshw,
-	.reinitbcshw = gigaset_reinitbcshw,
-	.initcshw = gigaset_initcshw,
-	.freecshw = gigaset_freecshw,
-	.set_modem_ctrl = gigaset_set_modem_ctrl,
-	.baud_rate = gigaset_baud_rate,
-	.set_line_ctrl = gigaset_set_line_ctrl,
-	.send_skb = gigaset_m10x_send_skb,
-	.handle_input = gigaset_m10x_input,
-};
-
-/*
- * This function is called while kernel-module is loaded
- */
-static int __init usb_gigaset_init(void)
-{
-	int result;
-
-	/* allocate memory for our driver state and initialize it */
-	driver = gigaset_initdriver(GIGASET_MINOR, GIGASET_MINORS,
-				    GIGASET_MODULENAME, GIGASET_DEVNAME,
-				    &ops, THIS_MODULE);
-	if (driver == NULL) {
-		result = -ENOMEM;
-		goto error;
-	}
-
-	/* register this driver with the USB subsystem */
-	result = usb_register(&gigaset_usb_driver);
-	if (result < 0) {
-		pr_err("error %d registering USB driver\n", -result);
-		goto error;
-	}
-
-	pr_info(DRIVER_DESC "\n");
-	return 0;
-
-error:
-	if (driver)
-		gigaset_freedriver(driver);
-	driver = NULL;
-	return result;
-}
-
-/*
- * This function is called while unloading the kernel-module
- */
-static void __exit usb_gigaset_exit(void)
-{
-	int i;
-
-	gigaset_blockdriver(driver); /* => probe will fail
-				      * => no gigaset_start any more
-				      */
-
-	/* stop all connected devices */
-	for (i = 0; i < driver->minors; i++)
-		gigaset_shutdown(driver->cs + i);
-
-	/* from now on, no isdn callback should be possible */
-
-	/* deregister this driver with the USB subsystem */
-	usb_deregister(&gigaset_usb_driver);
-	/* this will call the disconnect-callback */
-	/* from now on, no disconnect/probe callback should be running */
-
-	gigaset_freedriver(driver);
-	driver = NULL;
-}
-
-
-module_init(usb_gigaset_init);
-module_exit(usb_gigaset_exit);
-
-MODULE_AUTHOR(DRIVER_AUTHOR);
-MODULE_DESCRIPTION(DRIVER_DESC);
-
-MODULE_LICENSE("GPL");
diff --git a/drivers/staging/isdn/hysdn/Kconfig b/drivers/staging/isdn/hysdn/Kconfig
deleted file mode 100644
index 4c8a9283b9dd..000000000000
--- a/drivers/staging/isdn/hysdn/Kconfig
+++ /dev/null
@@ -1,15 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-config HYSDN
-	tristate "Hypercope HYSDN cards (Champ, Ergo, Metro) support (module only)"
-	depends on m && PROC_FS && PCI
-	help
-	  Say Y here if you have one of Hypercope's active PCI ISDN cards
-	  Champ, Ergo and Metro. You will then get a module called hysdn.
-	  Please read the file <file:Documentation/isdn/hysdn.rst> for more
-	  information.
-
-config HYSDN_CAPI
-	bool "HYSDN CAPI 2.0 support"
-	depends on HYSDN && ISDN_CAPI
-	help
-	  Say Y here if you like to use Hypercope's CAPI 2.0 interface.
diff --git a/drivers/staging/isdn/hysdn/Makefile b/drivers/staging/isdn/hysdn/Makefile
deleted file mode 100644
index e01f17f22ebb..000000000000
--- a/drivers/staging/isdn/hysdn/Makefile
+++ /dev/null
@@ -1,12 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0-only
-# Makefile for the hysdn ISDN device driver
-
-# Each configuration option enables a list of files.
-
-obj-$(CONFIG_HYSDN)		+= hysdn.o
-
-# Multipart objects.
-
-hysdn-y				:= hysdn_procconf.o hysdn_proclog.o boardergo.o \
-				   hysdn_boot.o hysdn_sched.o hysdn_net.o hysdn_init.o
-hysdn-$(CONFIG_HYSDN_CAPI)	+= hycapi.o
diff --git a/drivers/staging/isdn/hysdn/boardergo.c b/drivers/staging/isdn/hysdn/boardergo.c
deleted file mode 100644
index 2aa2a0e08247..000000000000
--- a/drivers/staging/isdn/hysdn/boardergo.c
+++ /dev/null
@@ -1,445 +0,0 @@
-/* $Id: boardergo.c,v 1.5.6.7 2001/11/06 21:58:19 kai Exp $
- *
- * Linux driver for HYSDN cards, specific routines for ergo type boards.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- * As all Linux supported cards Champ2, Ergo and Metro2/4 use the same
- * DPRAM interface and layout with only minor differences all related
- * stuff is done here, not in separate modules.
- *
- */
-
-#include <linux/signal.h>
-#include <linux/kernel.h>
-#include <linux/ioport.h>
-#include <linux/interrupt.h>
-#include <linux/vmalloc.h>
-#include <linux/delay.h>
-#include <asm/io.h>
-
-#include "hysdn_defs.h"
-#include "boardergo.h"
-
-#define byteout(addr, val) outb(val, addr)
-#define bytein(addr) inb(addr)
-
-/***************************************************/
-/* The cards interrupt handler. Called from system */
-/***************************************************/
-static irqreturn_t
-ergo_interrupt(int intno, void *dev_id)
-{
-	hysdn_card *card = dev_id;	/* parameter from irq */
-	tErgDpram *dpr;
-	unsigned long flags;
-	unsigned char volatile b;
-
-	if (!card)
-		return IRQ_NONE;		/* error -> spurious interrupt */
-	if (!card->irq_enabled)
-		return IRQ_NONE;		/* other device interrupting or irq switched off */
-
-	spin_lock_irqsave(&card->hysdn_lock, flags); /* no further irqs allowed */
-
-	if (!(bytein(card->iobase + PCI9050_INTR_REG) & PCI9050_INTR_REG_STAT1)) {
-		spin_unlock_irqrestore(&card->hysdn_lock, flags);	/* restore old state */
-		return IRQ_NONE;		/* no interrupt requested by E1 */
-	}
-	/* clear any pending ints on the board */
-	dpr = card->dpram;
-	b = dpr->ToPcInt;	/* clear for ergo */
-	b |= dpr->ToPcIntMetro;	/* same for metro */
-	b |= dpr->ToHyInt;	/* and for champ */
-
-	/* start kernel task immediately after leaving all interrupts */
-	if (!card->hw_lock)
-		schedule_work(&card->irq_queue);
-	spin_unlock_irqrestore(&card->hysdn_lock, flags);
-	return IRQ_HANDLED;
-}				/* ergo_interrupt */
-
-/******************************************************************************/
-/* ergo_irq_bh will be called as part of the kernel clearing its shared work  */
-/* queue sometime after a call to schedule_work has been made passing our     */
-/* work_struct. This task is the only one handling data transfer from or to   */
-/* the card after booting. The task may be queued from everywhere             */
-/* (interrupts included).                                                     */
-/******************************************************************************/
-static void
-ergo_irq_bh(struct work_struct *ugli_api)
-{
-	hysdn_card *card = container_of(ugli_api, hysdn_card, irq_queue);
-	tErgDpram *dpr;
-	int again;
-	unsigned long flags;
-
-	if (card->state != CARD_STATE_RUN)
-		return;		/* invalid call */
-
-	dpr = card->dpram;	/* point to DPRAM */
-
-	spin_lock_irqsave(&card->hysdn_lock, flags);
-	if (card->hw_lock) {
-		spin_unlock_irqrestore(&card->hysdn_lock, flags);	/* hardware currently unavailable */
-		return;
-	}
-	card->hw_lock = 1;	/* we now lock the hardware */
-
-	do {
-		again = 0;	/* assume loop not to be repeated */
-
-		if (!dpr->ToHyFlag) {
-			/* we are able to send a buffer */
-
-			if (hysdn_sched_tx(card, dpr->ToHyBuf, &dpr->ToHySize, &dpr->ToHyChannel,
-					   ERG_TO_HY_BUF_SIZE)) {
-				dpr->ToHyFlag = 1;	/* enable tx */
-				again = 1;	/* restart loop */
-			}
-		}		/* we are able to send a buffer */
-		if (dpr->ToPcFlag) {
-			/* a message has arrived for us, handle it */
-
-			if (hysdn_sched_rx(card, dpr->ToPcBuf, dpr->ToPcSize, dpr->ToPcChannel)) {
-				dpr->ToPcFlag = 0;	/* we worked the data */
-				again = 1;	/* restart loop */
-			}
-		}		/* a message has arrived for us */
-		if (again) {
-			dpr->ToHyInt = 1;
-			dpr->ToPcInt = 1;	/* interrupt to E1 for all cards */
-		} else
-			card->hw_lock = 0;	/* free hardware again */
-	} while (again);	/* until nothing more to do */
-
-	spin_unlock_irqrestore(&card->hysdn_lock, flags);
-}				/* ergo_irq_bh */
-
-
-/*********************************************************/
-/* stop the card (hardware reset) and disable interrupts */
-/*********************************************************/
-static void
-ergo_stopcard(hysdn_card *card)
-{
-	unsigned long flags;
-	unsigned char val;
-
-	hysdn_net_release(card);	/* first release the net device if existing */
-#ifdef CONFIG_HYSDN_CAPI
-	hycapi_capi_stop(card);
-#endif /* CONFIG_HYSDN_CAPI */
-	spin_lock_irqsave(&card->hysdn_lock, flags);
-	val = bytein(card->iobase + PCI9050_INTR_REG);	/* get actual value */
-	val &= ~(PCI9050_INTR_REG_ENPCI | PCI9050_INTR_REG_EN1);	/* mask irq */
-	byteout(card->iobase + PCI9050_INTR_REG, val);
-	card->irq_enabled = 0;
-	byteout(card->iobase + PCI9050_USER_IO, PCI9050_E1_RESET);	/* reset E1 processor */
-	card->state = CARD_STATE_UNUSED;
-	card->err_log_state = ERRLOG_STATE_OFF;		/* currently no log active */
-
-	spin_unlock_irqrestore(&card->hysdn_lock, flags);
-}				/* ergo_stopcard */
-
-/**************************************************************************/
-/* enable or disable the cards error log. The event is queued if possible */
-/**************************************************************************/
-static void
-ergo_set_errlog_state(hysdn_card *card, int on)
-{
-	unsigned long flags;
-
-	if (card->state != CARD_STATE_RUN) {
-		card->err_log_state = ERRLOG_STATE_OFF;		/* must be off */
-		return;
-	}
-	spin_lock_irqsave(&card->hysdn_lock, flags);
-
-	if (((card->err_log_state == ERRLOG_STATE_OFF) && !on) ||
-	    ((card->err_log_state == ERRLOG_STATE_ON) && on)) {
-		spin_unlock_irqrestore(&card->hysdn_lock, flags);
-		return;		/* nothing to do */
-	}
-	if (on)
-		card->err_log_state = ERRLOG_STATE_START;	/* request start */
-	else
-		card->err_log_state = ERRLOG_STATE_STOP;	/* request stop */
-
-	spin_unlock_irqrestore(&card->hysdn_lock, flags);
-	schedule_work(&card->irq_queue);
-}				/* ergo_set_errlog_state */
-
-/******************************************/
-/* test the cards RAM and return 0 if ok. */
-/******************************************/
-static const char TestText[36] = "This Message is filler, why read it";
-
-static int
-ergo_testram(hysdn_card *card)
-{
-	tErgDpram *dpr = card->dpram;
-
-	memset(dpr->TrapTable, 0, sizeof(dpr->TrapTable));	/* clear all Traps */
-	dpr->ToHyInt = 1;	/* E1 INTR state forced */
-
-	memcpy(&dpr->ToHyBuf[ERG_TO_HY_BUF_SIZE - sizeof(TestText)], TestText,
-	       sizeof(TestText));
-	if (memcmp(&dpr->ToHyBuf[ERG_TO_HY_BUF_SIZE - sizeof(TestText)], TestText,
-		   sizeof(TestText)))
-		return (-1);
-
-	memcpy(&dpr->ToPcBuf[ERG_TO_PC_BUF_SIZE - sizeof(TestText)], TestText,
-	       sizeof(TestText));
-	if (memcmp(&dpr->ToPcBuf[ERG_TO_PC_BUF_SIZE - sizeof(TestText)], TestText,
-		   sizeof(TestText)))
-		return (-1);
-
-	return (0);
-}				/* ergo_testram */
-
-/*****************************************************************************/
-/* this function is intended to write stage 1 boot image to the cards buffer */
-/* this is done in two steps. First the 1024 hi-words are written (offs=0),  */
-/* then the 1024 lo-bytes are written. The remaining DPRAM is cleared, the   */
-/* PCI-write-buffers flushed and the card is taken out of reset.             */
-/* The function then waits for a reaction of the E1 processor or a timeout.  */
-/* Negative return values are interpreted as errors.                         */
-/*****************************************************************************/
-static int
-ergo_writebootimg(struct HYSDN_CARD *card, unsigned char *buf,
-		  unsigned long offs)
-{
-	unsigned char *dst;
-	tErgDpram *dpram;
-	int cnt = (BOOT_IMG_SIZE >> 2);		/* number of words to move and swap (byte order!) */
-
-	if (card->debug_flags & LOG_POF_CARD)
-		hysdn_addlog(card, "ERGO: write bootldr offs=0x%lx ", offs);
-
-	dst = card->dpram;	/* pointer to start of DPRAM */
-	dst += (offs + ERG_DPRAM_FILL_SIZE);	/* offset in the DPRAM */
-	while (cnt--) {
-		*dst++ = *(buf + 1);	/* high byte */
-		*dst++ = *buf;	/* low byte */
-		dst += 2;	/* point to next longword */
-		buf += 2;	/* buffer only filled with words */
-	}
-
-	/* if low words (offs = 2) have been written, clear the rest of the DPRAM, */
-	/* flush the PCI-write-buffer and take the E1 out of reset */
-	if (offs) {
-		memset(card->dpram, 0, ERG_DPRAM_FILL_SIZE);	/* fill the DPRAM still not cleared */
-		dpram = card->dpram;	/* get pointer to dpram structure */
-		dpram->ToHyNoDpramErrLog = 0xFF;	/* write a dpram register */
-		while (!dpram->ToHyNoDpramErrLog);	/* reread volatile register to flush PCI */
-
-		byteout(card->iobase + PCI9050_USER_IO, PCI9050_E1_RUN);	/* start E1 processor */
-		/* the interrupts are still masked */
-
-		msleep_interruptible(20);		/* Timeout 20ms */
-
-		if (((tDpramBootSpooler *) card->dpram)->Len != DPRAM_SPOOLER_DATA_SIZE) {
-			if (card->debug_flags & LOG_POF_CARD)
-				hysdn_addlog(card, "ERGO: write bootldr no answer");
-			return (-ERR_BOOTIMG_FAIL);
-		}
-	}			/* start_boot_img */
-	return (0);		/* successful */
-}				/* ergo_writebootimg */
-
-/********************************************************************************/
-/* ergo_writebootseq writes the buffer containing len bytes to the E1 processor */
-/* using the boot spool mechanism. If everything works fine 0 is returned. In   */
-/* case of errors a negative error value is returned.                           */
-/********************************************************************************/
-static int
-ergo_writebootseq(struct HYSDN_CARD *card, unsigned char *buf, int len)
-{
-	tDpramBootSpooler *sp = (tDpramBootSpooler *) card->dpram;
-	unsigned char *dst;
-	unsigned char buflen;
-	int nr_write;
-	unsigned char tmp_rdptr;
-	unsigned char wr_mirror;
-	int i;
-
-	if (card->debug_flags & LOG_POF_CARD)
-		hysdn_addlog(card, "ERGO: write boot seq len=%d ", len);
-
-	dst = sp->Data;		/* point to data in spool structure */
-	buflen = sp->Len;	/* maximum len of spooled data */
-	wr_mirror = sp->WrPtr;	/* only once read */
-
-	/* try until all bytes written or error */
-	i = 0x1000;		/* timeout value */
-	while (len) {
-
-		/* first determine the number of bytes that may be buffered */
-		do {
-			tmp_rdptr = sp->RdPtr;	/* first read the pointer */
-			i--;	/* decrement timeout */
-		} while (i && (tmp_rdptr != sp->RdPtr));	/* wait for stable pointer */
-
-		if (!i) {
-			if (card->debug_flags & LOG_POF_CARD)
-				hysdn_addlog(card, "ERGO: write boot seq timeout");
-			return (-ERR_BOOTSEQ_FAIL);	/* value not stable -> timeout */
-		}
-		if ((nr_write = tmp_rdptr - wr_mirror - 1) < 0)
-			nr_write += buflen;	/* now we got number of free bytes - 1 in buffer */
-
-		if (!nr_write)
-			continue;	/* no free bytes in buffer */
-
-		if (nr_write > len)
-			nr_write = len;		/* limit if last few bytes */
-		i = 0x1000;	/* reset timeout value */
-
-		/* now we know how much bytes we may put in the puffer */
-		len -= nr_write;	/* we savely could adjust len before output */
-		while (nr_write--) {
-			*(dst + wr_mirror) = *buf++;	/* output one byte */
-			if (++wr_mirror >= buflen)
-				wr_mirror = 0;
-			sp->WrPtr = wr_mirror;	/* announce the next byte to E1 */
-		}		/* while (nr_write) */
-
-	}			/* while (len) */
-	return (0);
-}				/* ergo_writebootseq */
-
-/***********************************************************************************/
-/* ergo_waitpofready waits for a maximum of 10 seconds for the completition of the */
-/* boot process. If the process has been successful 0 is returned otherwise a     */
-/* negative error code is returned.                                                */
-/***********************************************************************************/
-static int
-ergo_waitpofready(struct HYSDN_CARD *card)
-{
-	tErgDpram *dpr = card->dpram;	/* pointer to DPRAM structure */
-	int timecnt = 10000 / 50;	/* timeout is 10 secs max. */
-	unsigned long flags;
-	int msg_size;
-	int i;
-
-	if (card->debug_flags & LOG_POF_CARD)
-		hysdn_addlog(card, "ERGO: waiting for pof ready");
-	while (timecnt--) {
-		/* wait until timeout  */
-
-		if (dpr->ToPcFlag) {
-			/* data has arrived */
-
-			if ((dpr->ToPcChannel != CHAN_SYSTEM) ||
-			    (dpr->ToPcSize < MIN_RDY_MSG_SIZE) ||
-			    (dpr->ToPcSize > MAX_RDY_MSG_SIZE) ||
-			    ((*(unsigned long *) dpr->ToPcBuf) != RDY_MAGIC))
-				break;	/* an error occurred */
-
-			/* Check for additional data delivered during SysReady */
-			msg_size = dpr->ToPcSize - RDY_MAGIC_SIZE;
-			if (msg_size > 0)
-				if (EvalSysrTokData(card, dpr->ToPcBuf + RDY_MAGIC_SIZE, msg_size))
-					break;
-
-			if (card->debug_flags & LOG_POF_RECORD)
-				hysdn_addlog(card, "ERGO: pof boot success");
-			spin_lock_irqsave(&card->hysdn_lock, flags);
-
-			card->state = CARD_STATE_RUN;	/* now card is running */
-			/* enable the cards interrupt */
-			byteout(card->iobase + PCI9050_INTR_REG,
-				bytein(card->iobase + PCI9050_INTR_REG) |
-				(PCI9050_INTR_REG_ENPCI | PCI9050_INTR_REG_EN1));
-			card->irq_enabled = 1;	/* we are ready to receive interrupts */
-
-			dpr->ToPcFlag = 0;	/* reset data indicator */
-			dpr->ToHyInt = 1;
-			dpr->ToPcInt = 1;	/* interrupt to E1 for all cards */
-
-			spin_unlock_irqrestore(&card->hysdn_lock, flags);
-			if ((hynet_enable & (1 << card->myid))
-			    && (i = hysdn_net_create(card)))
-			{
-				ergo_stopcard(card);
-				card->state = CARD_STATE_BOOTERR;
-				return (i);
-			}
-#ifdef CONFIG_HYSDN_CAPI
-			if ((i = hycapi_capi_create(card))) {
-				printk(KERN_WARNING "HYSDN: failed to create capi-interface.\n");
-			}
-#endif /* CONFIG_HYSDN_CAPI */
-			return (0);	/* success */
-		}		/* data has arrived */
-		msleep_interruptible(50);		/* Timeout 50ms */
-	}			/* wait until timeout */
-
-	if (card->debug_flags & LOG_POF_CARD)
-		hysdn_addlog(card, "ERGO: pof boot ready timeout");
-	return (-ERR_POF_TIMEOUT);
-}				/* ergo_waitpofready */
-
-
-
-/************************************************************************************/
-/* release the cards hardware. Before releasing do a interrupt disable and hardware */
-/* reset. Also unmap dpram.                                                         */
-/* Use only during module release.                                                  */
-/************************************************************************************/
-static void
-ergo_releasehardware(hysdn_card *card)
-{
-	ergo_stopcard(card);	/* first stop the card if not already done */
-	free_irq(card->irq, card);	/* release interrupt */
-	release_region(card->iobase + PCI9050_INTR_REG, 1);	/* release all io ports */
-	release_region(card->iobase + PCI9050_USER_IO, 1);
-	iounmap(card->dpram);
-	card->dpram = NULL;	/* release shared mem */
-}				/* ergo_releasehardware */
-
-
-/*********************************************************************************/
-/* acquire the needed hardware ports and map dpram. If an error occurs a nonzero */
-/* value is returned.                                                            */
-/* Use only during module init.                                                  */
-/*********************************************************************************/
-int
-ergo_inithardware(hysdn_card *card)
-{
-	if (!request_region(card->iobase + PCI9050_INTR_REG, 1, "HYSDN"))
-		return (-1);
-	if (!request_region(card->iobase + PCI9050_USER_IO, 1, "HYSDN")) {
-		release_region(card->iobase + PCI9050_INTR_REG, 1);
-		return (-1);	/* ports already in use */
-	}
-	card->memend = card->membase + ERG_DPRAM_PAGE_SIZE - 1;
-	if (!(card->dpram = ioremap(card->membase, ERG_DPRAM_PAGE_SIZE))) {
-		release_region(card->iobase + PCI9050_INTR_REG, 1);
-		release_region(card->iobase + PCI9050_USER_IO, 1);
-		return (-1);
-	}
-
-	ergo_stopcard(card);	/* disable interrupts */
-	if (request_irq(card->irq, ergo_interrupt, IRQF_SHARED, "HYSDN", card)) {
-		ergo_releasehardware(card); /* return the acquired hardware */
-		return (-1);
-	}
-	/* success, now setup the function pointers */
-	card->stopcard = ergo_stopcard;
-	card->releasehardware = ergo_releasehardware;
-	card->testram = ergo_testram;
-	card->writebootimg = ergo_writebootimg;
-	card->writebootseq = ergo_writebootseq;
-	card->waitpofready = ergo_waitpofready;
-	card->set_errlog_state = ergo_set_errlog_state;
-	INIT_WORK(&card->irq_queue, ergo_irq_bh);
-	spin_lock_init(&card->hysdn_lock);
-
-	return (0);
-}				/* ergo_inithardware */
diff --git a/drivers/staging/isdn/hysdn/boardergo.h b/drivers/staging/isdn/hysdn/boardergo.h
deleted file mode 100644
index e99bd81c4034..000000000000
--- a/drivers/staging/isdn/hysdn/boardergo.h
+++ /dev/null
@@ -1,100 +0,0 @@
-/* $Id: boardergo.h,v 1.2.6.1 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, definitions for ergo type boards (buffers..).
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-
-/************************************************/
-/* defines for the dual port memory of the card */
-/************************************************/
-#define ERG_DPRAM_PAGE_SIZE 0x2000	/* DPRAM occupies a 8K page */
-#define BOOT_IMG_SIZE 4096
-#define ERG_DPRAM_FILL_SIZE (ERG_DPRAM_PAGE_SIZE - BOOT_IMG_SIZE)
-
-#define ERG_TO_HY_BUF_SIZE  0x0E00	/* 3072 bytes buffer size to card */
-#define ERG_TO_PC_BUF_SIZE  0x0E00	/* 3072 bytes to PC, too */
-
-/* following DPRAM layout copied from OS2-driver boarderg.h */
-typedef struct ErgDpram_tag {
-	/*0000 */ unsigned char ToHyBuf[ERG_TO_HY_BUF_SIZE];
-	/*0E00 */ unsigned char ToPcBuf[ERG_TO_PC_BUF_SIZE];
-
-	/*1C00 */ unsigned char bSoftUart[SIZE_RSV_SOFT_UART];
-	/* size 0x1B0 */
-
-	/*1DB0 *//* tErrLogEntry */ unsigned char volatile ErrLogMsg[64];
-	/* size 64 bytes */
-	/*1DB0  unsigned long ulErrType;               */
-	/*1DB4  unsigned long ulErrSubtype;            */
-	/*1DB8  unsigned long ucTextSize;              */
-	/*1DB9  unsigned long ucText[ERRLOG_TEXT_SIZE]; *//* ASCIIZ of len ucTextSize-1 */
-	/*1DF0 */
-
-	/*1DF0 */ unsigned short volatile ToHyChannel;
-	/*1DF2 */ unsigned short volatile ToHySize;
-	/*1DF4 */ unsigned char volatile ToHyFlag;
-	/* !=0: msg for Hy waiting */
-	/*1DF5 */ unsigned char volatile ToPcFlag;
-	/* !=0: msg for PC waiting */
-	/*1DF6 */ unsigned short volatile ToPcChannel;
-	/*1DF8 */ unsigned short volatile ToPcSize;
-	/*1DFA */ unsigned char bRes1DBA[0x1E00 - 0x1DFA];
-	/* 6 bytes */
-
-	/*1E00 */ unsigned char bRestOfEntryTbl[0x1F00 - 0x1E00];
-	/*1F00 */ unsigned long TrapTable[62];
-	/*1FF8 */ unsigned char bRes1FF8[0x1FFB - 0x1FF8];
-	/* low part of reset vetor */
-	/*1FFB */ unsigned char ToPcIntMetro;
-	/* notes:
-	 * - metro has 32-bit boot ram - accessing
-	 *   ToPcInt and ToHyInt would be the same;
-	 *   so we moved ToPcInt to 1FFB.
-	 *   Because on the PC side both vars are
-	 *   readonly (reseting on int from E1 to PC),
-	 *   we can read both vars on both cards
-	 *   without destroying anything.
-	 * - 1FFB is the high byte of the reset vector,
-	 *   so E1 side should NOT change this byte
-	 *   when writing!
-	 */
-	/*1FFC */ unsigned char volatile ToHyNoDpramErrLog;
-	/* note: ToHyNoDpramErrLog is used to inform
-	 *       boot loader, not to use DPRAM based
-	 *       ErrLog; when DOS driver is rewritten
-	 *       this becomes obsolete
-	 */
-	/*1FFD */ unsigned char bRes1FFD;
-	/*1FFE */ unsigned char ToPcInt;
-	/* E1_intclear; on CHAMP2: E1_intset   */
-	/*1FFF */ unsigned char ToHyInt;
-	/* E1_intset;   on CHAMP2: E1_intclear */
-} tErgDpram;
-
-/**********************************************/
-/* PCI9050 controller local register offsets: */
-/* copied from boarderg.c                     */
-/**********************************************/
-#define PCI9050_INTR_REG    0x4C	/* Interrupt register */
-#define PCI9050_USER_IO     0x51	/* User I/O  register */
-
-/* bitmask for PCI9050_INTR_REG: */
-#define PCI9050_INTR_REG_EN1    0x01	/* 1= enable (def.), 0= disable */
-#define PCI9050_INTR_REG_POL1   0x02	/* 1= active high (def.), 0= active low */
-#define PCI9050_INTR_REG_STAT1  0x04	/* 1= intr. active, 0= intr. not active (def.) */
-#define PCI9050_INTR_REG_ENPCI  0x40	/* 1= PCI interrupts enable (def.) */
-
-/* bitmask for PCI9050_USER_IO: */
-#define PCI9050_USER_IO_EN3     0x02	/* 1= disable      , 0= enable (def.) */
-#define PCI9050_USER_IO_DIR3    0x04	/* 1= output (def.), 0= input         */
-#define PCI9050_USER_IO_DAT3    0x08	/* 1= high (def.)  , 0= low           */
-
-#define PCI9050_E1_RESET    (PCI9050_USER_IO_DIR3)		/* 0x04 */
-#define PCI9050_E1_RUN      (PCI9050_USER_IO_DAT3 | PCI9050_USER_IO_DIR3)		/* 0x0C */
diff --git a/drivers/staging/isdn/hysdn/hycapi.c b/drivers/staging/isdn/hysdn/hycapi.c
deleted file mode 100644
index a2c15cd7bf67..000000000000
--- a/drivers/staging/isdn/hysdn/hycapi.c
+++ /dev/null
@@ -1,785 +0,0 @@
-/* $Id: hycapi.c,v 1.8.6.4 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, CAPI2.0-Interface.
- *
- * Author    Ulrich Albrecht <u.albrecht@hypercope.de> for Hypercope GmbH
- * Copyright 2000 by Hypercope GmbH
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/proc_fs.h>
-#include <linux/seq_file.h>
-#include <linux/signal.h>
-#include <linux/kernel.h>
-#include <linux/skbuff.h>
-#include <linux/netdevice.h>
-#include <linux/slab.h>
-
-#define	VER_DRIVER	0
-#define	VER_CARDTYPE	1
-#define	VER_HWID	2
-#define	VER_SERIAL	3
-#define	VER_OPTION	4
-#define	VER_PROTO	5
-#define	VER_PROFILE	6
-#define	VER_CAPI	7
-
-#include "hysdn_defs.h"
-#include <linux/kernelcapi.h>
-
-static char hycapi_revision[] = "$Revision: 1.8.6.4 $";
-
-unsigned int hycapi_enable = 0xffffffff;
-module_param(hycapi_enable, uint, 0);
-
-typedef struct _hycapi_appl {
-	unsigned int ctrl_mask;
-	capi_register_params rp;
-	struct sk_buff *listen_req[CAPI_MAXCONTR];
-} hycapi_appl;
-
-static hycapi_appl hycapi_applications[CAPI_MAXAPPL];
-
-static u16 hycapi_send_message(struct capi_ctr *ctrl, struct sk_buff *skb);
-
-static inline int _hycapi_appCheck(int app_id, int ctrl_no)
-{
-	if ((ctrl_no <= 0) || (ctrl_no > CAPI_MAXCONTR) || (app_id <= 0) ||
-	   (app_id > CAPI_MAXAPPL))
-	{
-		printk(KERN_ERR "HYCAPI: Invalid request app_id %d for controller %d", app_id, ctrl_no);
-		return -1;
-	}
-	return ((hycapi_applications[app_id - 1].ctrl_mask & (1 << (ctrl_no-1))) != 0);
-}
-
-/******************************
-Kernel-Capi callback reset_ctr
-******************************/
-
-static void
-hycapi_reset_ctr(struct capi_ctr *ctrl)
-{
-	hycapictrl_info *cinfo = ctrl->driverdata;
-
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "HYCAPI hycapi_reset_ctr\n");
-#endif
-	capilib_release(&cinfo->ncci_head);
-	capi_ctr_down(ctrl);
-}
-
-/******************************
-Kernel-Capi callback remove_ctr
-******************************/
-
-static void
-hycapi_remove_ctr(struct capi_ctr *ctrl)
-{
-	int i;
-	hycapictrl_info *cinfo = NULL;
-	hysdn_card *card = NULL;
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "HYCAPI hycapi_remove_ctr\n");
-#endif
-	cinfo = (hycapictrl_info *)(ctrl->driverdata);
-	if (!cinfo) {
-		printk(KERN_ERR "No hycapictrl_info set!");
-		return;
-	}
-	card = cinfo->card;
-	capi_ctr_suspend_output(ctrl);
-	for (i = 0; i < CAPI_MAXAPPL; i++) {
-		if (hycapi_applications[i].listen_req[ctrl->cnr - 1]) {
-			kfree_skb(hycapi_applications[i].listen_req[ctrl->cnr - 1]);
-			hycapi_applications[i].listen_req[ctrl->cnr - 1] = NULL;
-		}
-	}
-	detach_capi_ctr(ctrl);
-	ctrl->driverdata = NULL;
-	kfree(card->hyctrlinfo);
-
-
-	card->hyctrlinfo = NULL;
-}
-
-/***********************************************************
-
-Queue a CAPI-message to the controller.
-
-***********************************************************/
-
-static void
-hycapi_sendmsg_internal(struct capi_ctr *ctrl, struct sk_buff *skb)
-{
-	hycapictrl_info *cinfo = (hycapictrl_info *)(ctrl->driverdata);
-	hysdn_card *card = cinfo->card;
-
-	spin_lock_irq(&cinfo->lock);
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_send_message\n");
-#endif
-	cinfo->skbs[cinfo->in_idx++] = skb;	/* add to buffer list */
-	if (cinfo->in_idx >= HYSDN_MAX_CAPI_SKB)
-		cinfo->in_idx = 0;	/* wrap around */
-	cinfo->sk_count++;		/* adjust counter */
-	if (cinfo->sk_count >= HYSDN_MAX_CAPI_SKB) {
-		/* inform upper layers we're full */
-		printk(KERN_ERR "HYSDN Card%d: CAPI-buffer overrun!\n",
-		       card->myid);
-		capi_ctr_suspend_output(ctrl);
-	}
-	cinfo->tx_skb = skb;
-	spin_unlock_irq(&cinfo->lock);
-	schedule_work(&card->irq_queue);
-}
-
-/***********************************************************
-hycapi_register_internal
-
-Send down the CAPI_REGISTER-Command to the controller.
-This functions will also be used if the adapter has been rebooted to
-re-register any applications in the private list.
-
-************************************************************/
-
-static void
-hycapi_register_internal(struct capi_ctr *ctrl, __u16 appl,
-			 capi_register_params *rp)
-{
-	char ExtFeatureDefaults[] = "49  /0/0/0/0,*/1,*/2,*/3,*/4,*/5,*/6,*/7,*/8,*/9,*";
-	hycapictrl_info *cinfo = (hycapictrl_info *)(ctrl->driverdata);
-	hysdn_card *card = cinfo->card;
-	struct sk_buff *skb;
-	__u16 len;
-	__u8 _command = 0xa0, _subcommand = 0x80;
-	__u16 MessageNumber = 0x0000;
-	__u16 MessageBufferSize = 0;
-	int slen = strlen(ExtFeatureDefaults);
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_register_appl\n");
-#endif
-	MessageBufferSize = rp->level3cnt * rp->datablkcnt * rp->datablklen;
-
-	len = CAPI_MSG_BASELEN + 8 + slen + 1;
-	if (!(skb = alloc_skb(len, GFP_ATOMIC))) {
-		printk(KERN_ERR "HYSDN card%d: memory squeeze in hycapi_register_appl\n",
-		       card->myid);
-		return;
-	}
-	skb_put_data(skb, &len, sizeof(__u16));
-	skb_put_data(skb, &appl, sizeof(__u16));
-	skb_put_data(skb, &_command, sizeof(__u8));
-	skb_put_data(skb, &_subcommand, sizeof(__u8));
-	skb_put_data(skb, &MessageNumber, sizeof(__u16));
-	skb_put_data(skb, &MessageBufferSize, sizeof(__u16));
-	skb_put_data(skb, &(rp->level3cnt), sizeof(__u16));
-	skb_put_data(skb, &(rp->datablkcnt), sizeof(__u16));
-	skb_put_data(skb, &(rp->datablklen), sizeof(__u16));
-	skb_put_data(skb, ExtFeatureDefaults, slen);
-	hycapi_applications[appl - 1].ctrl_mask |= (1 << (ctrl->cnr - 1));
-	hycapi_send_message(ctrl, skb);
-}
-
-/************************************************************
-hycapi_restart_internal
-
-After an adapter has been rebootet, re-register all applications and
-send a LISTEN_REQ (if there has been such a thing )
-
-*************************************************************/
-
-static void hycapi_restart_internal(struct capi_ctr *ctrl)
-{
-	int i;
-	struct sk_buff *skb;
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_WARNING "HYSDN: hycapi_restart_internal");
-#endif
-	for (i = 0; i < CAPI_MAXAPPL; i++) {
-		if (_hycapi_appCheck(i + 1, ctrl->cnr) == 1) {
-			hycapi_register_internal(ctrl, i + 1,
-						 &hycapi_applications[i].rp);
-			if (hycapi_applications[i].listen_req[ctrl->cnr - 1]) {
-				skb = skb_copy(hycapi_applications[i].listen_req[ctrl->cnr - 1], GFP_ATOMIC);
-				hycapi_sendmsg_internal(ctrl, skb);
-			}
-		}
-	}
-}
-
-/*************************************************************
-Register an application.
-Error-checking is done for CAPI-compliance.
-
-The application is recorded in the internal list.
-*************************************************************/
-
-static void
-hycapi_register_appl(struct capi_ctr *ctrl, __u16 appl,
-		     capi_register_params *rp)
-{
-	int MaxLogicalConnections = 0, MaxBDataBlocks = 0, MaxBDataLen = 0;
-	hycapictrl_info *cinfo = (hycapictrl_info *)(ctrl->driverdata);
-	hysdn_card *card = cinfo->card;
-	int chk = _hycapi_appCheck(appl, ctrl->cnr);
-	if (chk < 0) {
-		return;
-	}
-	if (chk == 1) {
-		printk(KERN_INFO "HYSDN: apl %d already registered\n", appl);
-		return;
-	}
-	MaxBDataBlocks = rp->datablkcnt > CAPI_MAXDATAWINDOW ? CAPI_MAXDATAWINDOW : rp->datablkcnt;
-	rp->datablkcnt = MaxBDataBlocks;
-	MaxBDataLen = rp->datablklen < 1024 ? 1024 : rp->datablklen;
-	rp->datablklen = MaxBDataLen;
-
-	MaxLogicalConnections = rp->level3cnt;
-	if (MaxLogicalConnections < 0) {
-		MaxLogicalConnections = card->bchans * -MaxLogicalConnections;
-	}
-	if (MaxLogicalConnections == 0) {
-		MaxLogicalConnections = card->bchans;
-	}
-
-	rp->level3cnt = MaxLogicalConnections;
-	memcpy(&hycapi_applications[appl - 1].rp,
-	       rp, sizeof(capi_register_params));
-}
-
-/*********************************************************************
-
-hycapi_release_internal
-
-Send down a CAPI_RELEASE to the controller.
-*********************************************************************/
-
-static void hycapi_release_internal(struct capi_ctr *ctrl, __u16 appl)
-{
-	hycapictrl_info *cinfo = (hycapictrl_info *)(ctrl->driverdata);
-	hysdn_card *card = cinfo->card;
-	struct sk_buff *skb;
-	__u16 len;
-	__u8 _command = 0xa1, _subcommand = 0x80;
-	__u16 MessageNumber = 0x0000;
-
-	capilib_release_appl(&cinfo->ncci_head, appl);
-
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_release_appl\n");
-#endif
-	len = CAPI_MSG_BASELEN;
-	if (!(skb = alloc_skb(len, GFP_ATOMIC))) {
-		printk(KERN_ERR "HYSDN card%d: memory squeeze in hycapi_register_appl\n",
-		       card->myid);
-		return;
-	}
-	skb_put_data(skb, &len, sizeof(__u16));
-	skb_put_data(skb, &appl, sizeof(__u16));
-	skb_put_data(skb, &_command, sizeof(__u8));
-	skb_put_data(skb, &_subcommand, sizeof(__u8));
-	skb_put_data(skb, &MessageNumber, sizeof(__u16));
-	hycapi_send_message(ctrl, skb);
-	hycapi_applications[appl - 1].ctrl_mask &= ~(1 << (ctrl->cnr - 1));
-}
-
-/******************************************************************
-hycapi_release_appl
-
-Release the application from the internal list an remove it's
-registration at controller-level
-******************************************************************/
-
-static void
-hycapi_release_appl(struct capi_ctr *ctrl, __u16 appl)
-{
-	int chk;
-
-	chk = _hycapi_appCheck(appl, ctrl->cnr);
-	if (chk < 0) {
-		printk(KERN_ERR "HYCAPI: Releasing invalid appl %d on controller %d\n", appl, ctrl->cnr);
-		return;
-	}
-	if (hycapi_applications[appl - 1].listen_req[ctrl->cnr - 1]) {
-		kfree_skb(hycapi_applications[appl - 1].listen_req[ctrl->cnr - 1]);
-		hycapi_applications[appl - 1].listen_req[ctrl->cnr - 1] = NULL;
-	}
-	if (chk == 1)
-	{
-		hycapi_release_internal(ctrl, appl);
-	}
-}
-
-
-/**************************************************************
-Kill a single controller.
-**************************************************************/
-
-int hycapi_capi_release(hysdn_card *card)
-{
-	hycapictrl_info *cinfo = card->hyctrlinfo;
-	struct capi_ctr *ctrl;
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_capi_release\n");
-#endif
-	if (cinfo) {
-		ctrl = &cinfo->capi_ctrl;
-		hycapi_remove_ctr(ctrl);
-	}
-	return 0;
-}
-
-/**************************************************************
-hycapi_capi_stop
-
-Stop CAPI-Output on a card. (e.g. during reboot)
-***************************************************************/
-
-int hycapi_capi_stop(hysdn_card *card)
-{
-	hycapictrl_info *cinfo = card->hyctrlinfo;
-	struct capi_ctr *ctrl;
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_capi_stop\n");
-#endif
-	if (cinfo) {
-		ctrl = &cinfo->capi_ctrl;
-/*		ctrl->suspend_output(ctrl); */
-		capi_ctr_down(ctrl);
-	}
-	return 0;
-}
-
-/***************************************************************
-hycapi_send_message
-
-Send a message to the controller.
-
-Messages are parsed for their Command/Subcommand-type, and appropriate
-action's are performed.
-
-Note that we have to muck around with a 64Bit-DATA_REQ as there are
-firmware-releases that do not check the MsgLen-Indication!
-
-***************************************************************/
-
-static u16 hycapi_send_message(struct capi_ctr *ctrl, struct sk_buff *skb)
-{
-	__u16 appl_id;
-	int _len, _len2;
-	__u8 msghead[64];
-	hycapictrl_info *cinfo = ctrl->driverdata;
-	u16 retval = CAPI_NOERROR;
-
-	appl_id = CAPIMSG_APPID(skb->data);
-	switch (_hycapi_appCheck(appl_id, ctrl->cnr))
-	{
-	case 0:
-/*			printk(KERN_INFO "Need to register\n"); */
-		hycapi_register_internal(ctrl,
-					 appl_id,
-					 &(hycapi_applications[appl_id - 1].rp));
-		break;
-	case 1:
-		break;
-	default:
-		printk(KERN_ERR "HYCAPI: Controller mixup!\n");
-		retval = CAPI_ILLAPPNR;
-		goto out;
-	}
-	switch (CAPIMSG_CMD(skb->data)) {
-	case CAPI_DISCONNECT_B3_RESP:
-		capilib_free_ncci(&cinfo->ncci_head, appl_id,
-				  CAPIMSG_NCCI(skb->data));
-		break;
-	case CAPI_DATA_B3_REQ:
-		_len = CAPIMSG_LEN(skb->data);
-		if (_len > 22) {
-			_len2 = _len - 22;
-			skb_copy_from_linear_data(skb, msghead, 22);
-			skb_copy_to_linear_data_offset(skb, _len2,
-						       msghead, 22);
-			skb_pull(skb, _len2);
-			CAPIMSG_SETLEN(skb->data, 22);
-			retval = capilib_data_b3_req(&cinfo->ncci_head,
-						     CAPIMSG_APPID(skb->data),
-						     CAPIMSG_NCCI(skb->data),
-						     CAPIMSG_MSGID(skb->data));
-		}
-		break;
-	case CAPI_LISTEN_REQ:
-		if (hycapi_applications[appl_id - 1].listen_req[ctrl->cnr - 1])
-		{
-			kfree_skb(hycapi_applications[appl_id - 1].listen_req[ctrl->cnr - 1]);
-			hycapi_applications[appl_id - 1].listen_req[ctrl->cnr - 1] = NULL;
-		}
-		if (!(hycapi_applications[appl_id  -1].listen_req[ctrl->cnr - 1] = skb_copy(skb, GFP_ATOMIC)))
-		{
-			printk(KERN_ERR "HYSDN: memory squeeze in private_listen\n");
-		}
-		break;
-	default:
-		break;
-	}
-out:
-	if (retval == CAPI_NOERROR)
-		hycapi_sendmsg_internal(ctrl, skb);
-	else
-		dev_kfree_skb_any(skb);
-
-	return retval;
-}
-
-static int hycapi_proc_show(struct seq_file *m, void *v)
-{
-	struct capi_ctr *ctrl = m->private;
-	hycapictrl_info *cinfo = (hycapictrl_info *)(ctrl->driverdata);
-	hysdn_card *card = cinfo->card;
-	char *s;
-
-	seq_printf(m, "%-16s %s\n", "name", cinfo->cardname);
-	seq_printf(m, "%-16s 0x%x\n", "io", card->iobase);
-	seq_printf(m, "%-16s %d\n", "irq", card->irq);
-
-	switch (card->brdtype) {
-	case BD_PCCARD:  s = "HYSDN Hycard"; break;
-	case BD_ERGO: s = "HYSDN Ergo2"; break;
-	case BD_METRO: s = "HYSDN Metro4"; break;
-	case BD_CHAMP2: s = "HYSDN Champ2";	break;
-	case BD_PLEXUS: s = "HYSDN Plexus30"; break;
-	default: s = "???"; break;
-	}
-	seq_printf(m, "%-16s %s\n", "type", s);
-	if ((s = cinfo->version[VER_DRIVER]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_driver", s);
-	if ((s = cinfo->version[VER_CARDTYPE]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_cardtype", s);
-	if ((s = cinfo->version[VER_SERIAL]) != NULL)
-		seq_printf(m, "%-16s %s\n", "ver_serial", s);
-
-	seq_printf(m, "%-16s %s\n", "cardname", cinfo->cardname);
-
-	return 0;
-}
-
-/**************************************************************
-hycapi_load_firmware
-
-This does NOT load any firmware, but the callback somehow is needed
-on capi-interface registration.
-
-**************************************************************/
-
-static int hycapi_load_firmware(struct capi_ctr *ctrl, capiloaddata *data)
-{
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_load_firmware\n");
-#endif
-	return 0;
-}
-
-
-static char *hycapi_procinfo(struct capi_ctr *ctrl)
-{
-	hycapictrl_info *cinfo = (hycapictrl_info *)(ctrl->driverdata);
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "%s\n", __func__);
-#endif
-	if (!cinfo)
-		return "";
-	sprintf(cinfo->infobuf, "%s %s 0x%x %d %s",
-		cinfo->cardname[0] ? cinfo->cardname : "-",
-		cinfo->version[VER_DRIVER] ? cinfo->version[VER_DRIVER] : "-",
-		cinfo->card ? cinfo->card->iobase : 0x0,
-		cinfo->card ? cinfo->card->irq : 0,
-		hycapi_revision
-		);
-	return cinfo->infobuf;
-}
-
-/******************************************************************
-hycapi_rx_capipkt
-
-Receive a capi-message.
-
-All B3_DATA_IND are converted to 64K-extension compatible format.
-New nccis are created if necessary.
-*******************************************************************/
-
-void
-hycapi_rx_capipkt(hysdn_card *card, unsigned char *buf, unsigned short len)
-{
-	struct sk_buff *skb;
-	hycapictrl_info *cinfo = card->hyctrlinfo;
-	struct capi_ctr *ctrl;
-	__u16 ApplId;
-	__u16 MsgLen, info;
-	__u16 len2, CapiCmd;
-	__u32 CP64[2] = {0, 0};
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_rx_capipkt\n");
-#endif
-	if (!cinfo) {
-		return;
-	}
-	ctrl = &cinfo->capi_ctrl;
-	if (len < CAPI_MSG_BASELEN) {
-		printk(KERN_ERR "HYSDN Card%d: invalid CAPI-message, length %d!\n",
-		       card->myid, len);
-		return;
-	}
-	MsgLen = CAPIMSG_LEN(buf);
-	ApplId = CAPIMSG_APPID(buf);
-	CapiCmd = CAPIMSG_CMD(buf);
-
-	if ((CapiCmd == CAPI_DATA_B3_IND) && (MsgLen < 30)) {
-		len2 = len + (30 - MsgLen);
-		if (!(skb = alloc_skb(len2, GFP_ATOMIC))) {
-			printk(KERN_ERR "HYSDN Card%d: incoming packet dropped\n",
-			       card->myid);
-			return;
-		}
-		skb_put_data(skb, buf, MsgLen);
-		skb_put_data(skb, CP64, 2 * sizeof(__u32));
-		skb_put_data(skb, buf + MsgLen, len - MsgLen);
-		CAPIMSG_SETLEN(skb->data, 30);
-	} else {
-		if (!(skb = alloc_skb(len, GFP_ATOMIC))) {
-			printk(KERN_ERR "HYSDN Card%d: incoming packet dropped\n",
-			       card->myid);
-			return;
-		}
-		skb_put_data(skb, buf, len);
-	}
-	switch (CAPIMSG_CMD(skb->data))
-	{
-	case CAPI_CONNECT_B3_CONF:
-/* Check info-field for error-indication: */
-		info = CAPIMSG_U16(skb->data, 12);
-		switch (info)
-		{
-		case 0:
-			capilib_new_ncci(&cinfo->ncci_head, ApplId, CAPIMSG_NCCI(skb->data),
-					 hycapi_applications[ApplId - 1].rp.datablkcnt);
-
-			break;
-		case 0x0001:
-			printk(KERN_ERR "HYSDN Card%d: NCPI not supported by current "
-			       "protocol. NCPI ignored.\n", card->myid);
-			break;
-		case 0x2001:
-			printk(KERN_ERR "HYSDN Card%d: Message not supported in"
-			       " current state\n", card->myid);
-			break;
-		case 0x2002:
-			printk(KERN_ERR "HYSDN Card%d: invalid PLCI\n", card->myid);
-			break;
-		case 0x2004:
-			printk(KERN_ERR "HYSDN Card%d: out of NCCI\n", card->myid);
-			break;
-		case 0x3008:
-			printk(KERN_ERR "HYSDN Card%d: NCPI not supported\n",
-			       card->myid);
-			break;
-		default:
-			printk(KERN_ERR "HYSDN Card%d: Info in CONNECT_B3_CONF: %d\n",
-			       card->myid, info);
-			break;
-		}
-		break;
-	case CAPI_CONNECT_B3_IND:
-		capilib_new_ncci(&cinfo->ncci_head, ApplId,
-				 CAPIMSG_NCCI(skb->data),
-				 hycapi_applications[ApplId - 1].rp.datablkcnt);
-		break;
-	case CAPI_DATA_B3_CONF:
-		capilib_data_b3_conf(&cinfo->ncci_head, ApplId,
-				     CAPIMSG_NCCI(skb->data),
-				     CAPIMSG_MSGID(skb->data));
-		break;
-	default:
-		break;
-	}
-	capi_ctr_handle_message(ctrl, ApplId, skb);
-}
-
-/******************************************************************
-hycapi_tx_capiack
-
-Internally acknowledge a msg sent. This will remove the msg from the
-internal queue.
-
-*******************************************************************/
-
-void hycapi_tx_capiack(hysdn_card *card)
-{
-	hycapictrl_info *cinfo = card->hyctrlinfo;
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_tx_capiack\n");
-#endif
-	if (!cinfo) {
-		return;
-	}
-	spin_lock_irq(&cinfo->lock);
-	kfree_skb(cinfo->skbs[cinfo->out_idx]);		/* free skb */
-	cinfo->skbs[cinfo->out_idx++] = NULL;
-	if (cinfo->out_idx >= HYSDN_MAX_CAPI_SKB)
-		cinfo->out_idx = 0;	/* wrap around */
-
-	if (cinfo->sk_count-- == HYSDN_MAX_CAPI_SKB)	/* dec usage count */
-		capi_ctr_resume_output(&cinfo->capi_ctrl);
-	spin_unlock_irq(&cinfo->lock);
-}
-
-/***************************************************************
-hycapi_tx_capiget(hysdn_card *card)
-
-This is called when polling for messages to SEND.
-
-****************************************************************/
-
-struct sk_buff *
-hycapi_tx_capiget(hysdn_card *card)
-{
-	hycapictrl_info *cinfo = card->hyctrlinfo;
-	if (!cinfo) {
-		return (struct sk_buff *)NULL;
-	}
-	if (!cinfo->sk_count)
-		return (struct sk_buff *)NULL;	/* nothing available */
-
-	return (cinfo->skbs[cinfo->out_idx]);		/* next packet to send */
-}
-
-
-/**********************************************************
-int hycapi_init()
-
-attach the capi-driver to the kernel-capi.
-
-***********************************************************/
-
-int hycapi_init(void)
-{
-	int i;
-	for (i = 0; i < CAPI_MAXAPPL; i++) {
-		memset(&(hycapi_applications[i]), 0, sizeof(hycapi_appl));
-	}
-	return (0);
-}
-
-/**************************************************************
-hycapi_cleanup(void)
-
-detach the capi-driver to the kernel-capi. Actually this should
-free some more ressources. Do that later.
-**************************************************************/
-
-void
-hycapi_cleanup(void)
-{
-}
-
-/********************************************************************
-hycapi_capi_create(hysdn_card *card)
-
-Attach the card with its capi-ctrl.
-*********************************************************************/
-
-static void hycapi_fill_profile(hysdn_card *card)
-{
-	hycapictrl_info *cinfo = NULL;
-	struct capi_ctr *ctrl = NULL;
-	cinfo = card->hyctrlinfo;
-	if (!cinfo) return;
-	ctrl = &cinfo->capi_ctrl;
-	strcpy(ctrl->manu, "Hypercope");
-	ctrl->version.majorversion = 2;
-	ctrl->version.minorversion = 0;
-	ctrl->version.majormanuversion = 3;
-	ctrl->version.minormanuversion = 2;
-	ctrl->profile.ncontroller = card->myid;
-	ctrl->profile.nbchannel = card->bchans;
-	ctrl->profile.goptions = GLOBAL_OPTION_INTERNAL_CONTROLLER |
-		GLOBAL_OPTION_B_CHANNEL_OPERATION;
-	ctrl->profile.support1 =  B1_PROT_64KBIT_HDLC |
-		(card->faxchans ? B1_PROT_T30 : 0) |
-		B1_PROT_64KBIT_TRANSPARENT;
-	ctrl->profile.support2 = B2_PROT_ISO7776 |
-		(card->faxchans ? B2_PROT_T30 : 0) |
-		B2_PROT_TRANSPARENT;
-	ctrl->profile.support3 = B3_PROT_TRANSPARENT |
-		B3_PROT_T90NL |
-		(card->faxchans ? B3_PROT_T30 : 0) |
-		(card->faxchans ? B3_PROT_T30EXT : 0) |
-		B3_PROT_ISO8208;
-}
-
-int
-hycapi_capi_create(hysdn_card *card)
-{
-	hycapictrl_info *cinfo = NULL;
-	struct capi_ctr *ctrl = NULL;
-	int retval;
-#ifdef HYCAPI_PRINTFNAMES
-	printk(KERN_NOTICE "hycapi_capi_create\n");
-#endif
-	if ((hycapi_enable & (1 << card->myid)) == 0) {
-		return 1;
-	}
-	if (!card->hyctrlinfo) {
-		cinfo = kzalloc(sizeof(hycapictrl_info), GFP_ATOMIC);
-		if (!cinfo) {
-			printk(KERN_WARNING "HYSDN: no memory for capi-ctrl.\n");
-			return -ENOMEM;
-		}
-		card->hyctrlinfo = cinfo;
-		cinfo->card = card;
-		spin_lock_init(&cinfo->lock);
-		INIT_LIST_HEAD(&cinfo->ncci_head);
-
-		switch (card->brdtype) {
-		case BD_PCCARD:  strcpy(cinfo->cardname, "HYSDN Hycard"); break;
-		case BD_ERGO: strcpy(cinfo->cardname, "HYSDN Ergo2"); break;
-		case BD_METRO: strcpy(cinfo->cardname, "HYSDN Metro4"); break;
-		case BD_CHAMP2: strcpy(cinfo->cardname, "HYSDN Champ2"); break;
-		case BD_PLEXUS: strcpy(cinfo->cardname, "HYSDN Plexus30"); break;
-		default: strcpy(cinfo->cardname, "HYSDN ???"); break;
-		}
-
-		ctrl = &cinfo->capi_ctrl;
-		ctrl->driver_name   = "hycapi";
-		ctrl->driverdata    = cinfo;
-		ctrl->register_appl = hycapi_register_appl;
-		ctrl->release_appl  = hycapi_release_appl;
-		ctrl->send_message  = hycapi_send_message;
-		ctrl->load_firmware = hycapi_load_firmware;
-		ctrl->reset_ctr     = hycapi_reset_ctr;
-		ctrl->procinfo      = hycapi_procinfo;
-		ctrl->proc_show     = hycapi_proc_show;
-		strcpy(ctrl->name, cinfo->cardname);
-		ctrl->owner = THIS_MODULE;
-
-		retval = attach_capi_ctr(ctrl);
-		if (retval) {
-			printk(KERN_ERR "hycapi: attach controller failed.\n");
-			return -EBUSY;
-		}
-		/* fill in the blanks: */
-		hycapi_fill_profile(card);
-		capi_ctr_ready(ctrl);
-	} else {
-		/* resume output on stopped ctrl */
-		ctrl = &card->hyctrlinfo->capi_ctrl;
-		hycapi_fill_profile(card);
-		capi_ctr_ready(ctrl);
-		hycapi_restart_internal(ctrl);
-/*		ctrl->resume_output(ctrl); */
-	}
-	return 0;
-}
diff --git a/drivers/staging/isdn/hysdn/hysdn_boot.c b/drivers/staging/isdn/hysdn/hysdn_boot.c
deleted file mode 100644
index ba177c3a621b..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_boot.c
+++ /dev/null
@@ -1,400 +0,0 @@
-/* $Id: hysdn_boot.c,v 1.4.6.4 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards
- * specific routines for booting and pof handling
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/vmalloc.h>
-#include <linux/slab.h>
-#include <linux/uaccess.h>
-
-#include "hysdn_defs.h"
-#include "hysdn_pof.h"
-
-/********************************/
-/* defines for pof read handler */
-/********************************/
-#define POF_READ_FILE_HEAD  0
-#define POF_READ_TAG_HEAD   1
-#define POF_READ_TAG_DATA   2
-
-/************************************************************/
-/* definition of boot specific data area. This data is only */
-/* needed during boot and so allocated dynamically.         */
-/************************************************************/
-struct boot_data {
-	unsigned short Cryptor;	/* for use with Decrypt function */
-	unsigned short Nrecs;	/* records remaining in file */
-	unsigned char pof_state;/* actual state of read handler */
-	unsigned char is_crypted;/* card data is crypted */
-	int BufSize;		/* actual number of bytes buffered */
-	int last_error;		/* last occurred error */
-	unsigned short pof_recid;/* actual pof recid */
-	unsigned long pof_reclen;/* total length of pof record data */
-	unsigned long pof_recoffset;/* actual offset inside pof record */
-	union {
-		unsigned char BootBuf[BOOT_BUF_SIZE];/* buffer as byte count */
-		tPofRecHdr PofRecHdr;	/* header for actual record/chunk */
-		tPofFileHdr PofFileHdr;		/* header from POF file */
-		tPofTimeStamp PofTime;	/* time information */
-	} buf;
-};
-
-/*****************************************************/
-/*  start decryption of successive POF file chunks.   */
-/*                                                   */
-/*  to be called at start of POF file reading,       */
-/*  before starting any decryption on any POF record. */
-/*****************************************************/
-static void
-StartDecryption(struct boot_data *boot)
-{
-	boot->Cryptor = CRYPT_STARTTERM;
-}				/* StartDecryption */
-
-
-/***************************************************************/
-/* decrypt complete BootBuf                                    */
-/* NOTE: decryption must be applied to all or none boot tags - */
-/*       to HI and LO boot loader and (all) seq tags, because  */
-/*       global Cryptor is started for whole POF.              */
-/***************************************************************/
-static void
-DecryptBuf(struct boot_data *boot, int cnt)
-{
-	unsigned char *bufp = boot->buf.BootBuf;
-
-	while (cnt--) {
-		boot->Cryptor = (boot->Cryptor >> 1) ^ ((boot->Cryptor & 1U) ? CRYPT_FEEDTERM : 0);
-		*bufp++ ^= (unsigned char)boot->Cryptor;
-	}
-}				/* DecryptBuf */
-
-/********************************************************************************/
-/* pof_handle_data executes the required actions dependent on the active record */
-/* id. If successful 0 is returned, a negative value shows an error.           */
-/********************************************************************************/
-static int
-pof_handle_data(hysdn_card *card, int datlen)
-{
-	struct boot_data *boot = card->boot;	/* pointer to boot specific data */
-	long l;
-	unsigned char *imgp;
-	int img_len;
-
-	/* handle the different record types */
-	switch (boot->pof_recid) {
-
-	case TAG_TIMESTMP:
-		if (card->debug_flags & LOG_POF_RECORD)
-			hysdn_addlog(card, "POF created %s", boot->buf.PofTime.DateTimeText);
-		break;
-
-	case TAG_CBOOTDTA:
-		DecryptBuf(boot, datlen);	/* we need to decrypt the buffer */
-		/* fall through */
-	case TAG_BOOTDTA:
-		if (card->debug_flags & LOG_POF_RECORD)
-			hysdn_addlog(card, "POF got %s len=%d offs=0x%lx",
-				     (boot->pof_recid == TAG_CBOOTDTA) ? "CBOOTDATA" : "BOOTDTA",
-				     datlen, boot->pof_recoffset);
-
-		if (boot->pof_reclen != POF_BOOT_LOADER_TOTAL_SIZE) {
-			boot->last_error = EPOF_BAD_IMG_SIZE;	/* invalid length */
-			return (boot->last_error);
-		}
-		imgp = boot->buf.BootBuf;	/* start of buffer */
-		img_len = datlen;	/* maximum length to transfer */
-
-		l = POF_BOOT_LOADER_OFF_IN_PAGE -
-			(boot->pof_recoffset & (POF_BOOT_LOADER_PAGE_SIZE - 1));
-		if (l > 0) {
-			/* buffer needs to be truncated */
-			imgp += l;	/* advance pointer */
-			img_len -= l;	/* adjust len */
-		}
-		/* at this point no special handling for data wrapping over buffer */
-		/* is necessary, because the boot image always will be adjusted to */
-		/* match a page boundary inside the buffer.                        */
-		/* The buffer for the boot image on the card is filled in 2 cycles */
-		/* first the 1024 hi-words are put in the buffer, then the low 1024 */
-		/* word are handled in the same way with different offset.         */
-
-		if (img_len > 0) {
-			/* data available for copy */
-			if ((boot->last_error =
-			     card->writebootimg(card, imgp,
-						(boot->pof_recoffset > POF_BOOT_LOADER_PAGE_SIZE) ? 2 : 0)) < 0)
-				return (boot->last_error);
-		}
-		break;	/* end of case boot image hi/lo */
-
-	case TAG_CABSDATA:
-		DecryptBuf(boot, datlen);	/* we need to decrypt the buffer */
-		/* fall through */
-	case TAG_ABSDATA:
-		if (card->debug_flags & LOG_POF_RECORD)
-			hysdn_addlog(card, "POF got %s len=%d offs=0x%lx",
-				     (boot->pof_recid == TAG_CABSDATA) ? "CABSDATA" : "ABSDATA",
-				     datlen, boot->pof_recoffset);
-
-		if ((boot->last_error = card->writebootseq(card, boot->buf.BootBuf, datlen)) < 0)
-			return (boot->last_error);	/* error writing data */
-
-		if (boot->pof_recoffset + datlen >= boot->pof_reclen)
-			return (card->waitpofready(card));	/* data completely spooled, wait for ready */
-
-		break;	/* end of case boot seq data */
-
-	default:
-		if (card->debug_flags & LOG_POF_RECORD)
-			hysdn_addlog(card, "POF got data(id=0x%lx) len=%d offs=0x%lx", boot->pof_recid,
-				     datlen, boot->pof_recoffset);
-
-		break;	/* simply skip record */
-	}			/* switch boot->pof_recid */
-
-	return (0);
-}				/* pof_handle_data */
-
-
-/******************************************************************************/
-/* pof_write_buffer is called when the buffer has been filled with the needed */
-/* number of data bytes. The number delivered is additionally supplied for    */
-/* verification. The functions handles the data and returns the needed number */
-/* of bytes for the next action. If the returned value is 0 or less an error  */
-/* occurred and booting must be aborted.                                       */
-/******************************************************************************/
-int
-pof_write_buffer(hysdn_card *card, int datlen)
-{
-	struct boot_data *boot = card->boot;	/* pointer to boot specific data */
-
-	if (!boot)
-		return (-EFAULT);	/* invalid call */
-	if (boot->last_error < 0)
-		return (boot->last_error);	/* repeated error */
-
-	if (card->debug_flags & LOG_POF_WRITE)
-		hysdn_addlog(card, "POF write: got %d bytes ", datlen);
-
-	switch (boot->pof_state) {
-	case POF_READ_FILE_HEAD:
-		if (card->debug_flags & LOG_POF_WRITE)
-			hysdn_addlog(card, "POF write: checking file header");
-
-		if (datlen != sizeof(tPofFileHdr)) {
-			boot->last_error = -EPOF_INTERNAL;
-			break;
-		}
-		if (boot->buf.PofFileHdr.Magic != TAGFILEMAGIC) {
-			boot->last_error = -EPOF_BAD_MAGIC;
-			break;
-		}
-		/* Setup the new state and vars */
-		boot->Nrecs = (unsigned short)(boot->buf.PofFileHdr.N_PofRecs);	/* limited to 65535 */
-		boot->pof_state = POF_READ_TAG_HEAD;	/* now start with single tags */
-		boot->last_error = sizeof(tPofRecHdr);	/* new length */
-		break;
-
-	case POF_READ_TAG_HEAD:
-		if (card->debug_flags & LOG_POF_WRITE)
-			hysdn_addlog(card, "POF write: checking tag header");
-
-		if (datlen != sizeof(tPofRecHdr)) {
-			boot->last_error = -EPOF_INTERNAL;
-			break;
-		}
-		boot->pof_recid = boot->buf.PofRecHdr.PofRecId;		/* actual pof recid */
-		boot->pof_reclen = boot->buf.PofRecHdr.PofRecDataLen;	/* total length */
-		boot->pof_recoffset = 0;	/* no starting offset */
-
-		if (card->debug_flags & LOG_POF_RECORD)
-			hysdn_addlog(card, "POF: got record id=0x%lx length=%ld ",
-				     boot->pof_recid, boot->pof_reclen);
-
-		boot->pof_state = POF_READ_TAG_DATA;	/* now start with tag data */
-		if (boot->pof_reclen < BOOT_BUF_SIZE)
-			boot->last_error = boot->pof_reclen;	/* limit size */
-		else
-			boot->last_error = BOOT_BUF_SIZE;	/* maximum */
-
-		if (!boot->last_error) {	/* no data inside record */
-			boot->pof_state = POF_READ_TAG_HEAD;	/* now start with single tags */
-			boot->last_error = sizeof(tPofRecHdr);	/* new length */
-		}
-		break;
-
-	case POF_READ_TAG_DATA:
-		if (card->debug_flags & LOG_POF_WRITE)
-			hysdn_addlog(card, "POF write: getting tag data");
-
-		if (datlen != boot->last_error) {
-			boot->last_error = -EPOF_INTERNAL;
-			break;
-		}
-		if ((boot->last_error = pof_handle_data(card, datlen)) < 0)
-			return (boot->last_error);	/* an error occurred */
-		boot->pof_recoffset += datlen;
-		if (boot->pof_recoffset >= boot->pof_reclen) {
-			boot->pof_state = POF_READ_TAG_HEAD;	/* now start with single tags */
-			boot->last_error = sizeof(tPofRecHdr);	/* new length */
-		} else {
-			if (boot->pof_reclen - boot->pof_recoffset < BOOT_BUF_SIZE)
-				boot->last_error = boot->pof_reclen - boot->pof_recoffset;	/* limit size */
-			else
-				boot->last_error = BOOT_BUF_SIZE;	/* maximum */
-		}
-		break;
-
-	default:
-		boot->last_error = -EPOF_INTERNAL;	/* unknown state */
-		break;
-	}			/* switch (boot->pof_state) */
-
-	return (boot->last_error);
-}				/* pof_write_buffer */
-
-
-/*******************************************************************************/
-/* pof_write_open is called when an open for boot on the cardlog device occurs. */
-/* The function returns the needed number of bytes for the next operation. If  */
-/* the returned number is less or equal 0 an error specified by this code      */
-/* occurred. Additionally the pointer to the buffer data area is set on success */
-/*******************************************************************************/
-int
-pof_write_open(hysdn_card *card, unsigned char **bufp)
-{
-	struct boot_data *boot;	/* pointer to boot specific data */
-
-	if (card->boot) {
-		if (card->debug_flags & LOG_POF_OPEN)
-			hysdn_addlog(card, "POF open: already opened for boot");
-		return (-ERR_ALREADY_BOOT);	/* boot already active */
-	}
-	/* error no mem available */
-	if (!(boot = kzalloc(sizeof(struct boot_data), GFP_KERNEL))) {
-		if (card->debug_flags & LOG_MEM_ERR)
-			hysdn_addlog(card, "POF open: unable to allocate mem");
-		return (-EFAULT);
-	}
-	card->boot = boot;
-	card->state = CARD_STATE_BOOTING;
-
-	card->stopcard(card);	/* first stop the card */
-	if (card->testram(card)) {
-		if (card->debug_flags & LOG_POF_OPEN)
-			hysdn_addlog(card, "POF open: DPRAM test failure");
-		boot->last_error = -ERR_BOARD_DPRAM;
-		card->state = CARD_STATE_BOOTERR;	/* show boot error */
-		return (boot->last_error);
-	}
-	boot->BufSize = 0;	/* Buffer is empty */
-	boot->pof_state = POF_READ_FILE_HEAD;	/* read file header */
-	StartDecryption(boot);	/* if POF File should be encrypted */
-
-	if (card->debug_flags & LOG_POF_OPEN)
-		hysdn_addlog(card, "POF open: success");
-
-	*bufp = boot->buf.BootBuf;	/* point to buffer */
-	return (sizeof(tPofFileHdr));
-}				/* pof_write_open */
-
-/********************************************************************************/
-/* pof_write_close is called when an close of boot on the cardlog device occurs. */
-/* The return value must be 0 if everything has happened as desired.            */
-/********************************************************************************/
-int
-pof_write_close(hysdn_card *card)
-{
-	struct boot_data *boot = card->boot;	/* pointer to boot specific data */
-
-	if (!boot)
-		return (-EFAULT);	/* invalid call */
-
-	card->boot = NULL;	/* no boot active */
-	kfree(boot);
-
-	if (card->state == CARD_STATE_RUN)
-		card->set_errlog_state(card, 1);	/* activate error log */
-
-	if (card->debug_flags & LOG_POF_OPEN)
-		hysdn_addlog(card, "POF close: success");
-
-	return (0);
-}				/* pof_write_close */
-
-/*********************************************************************************/
-/* EvalSysrTokData checks additional records delivered with the Sysready Message */
-/* when POF has been booted. A return value of 0 is used if no error occurred.    */
-/*********************************************************************************/
-int
-EvalSysrTokData(hysdn_card *card, unsigned char *cp, int len)
-{
-	u_char *p;
-	u_char crc;
-
-	if (card->debug_flags & LOG_POF_RECORD)
-		hysdn_addlog(card, "SysReady Token data length %d", len);
-
-	if (len < 2) {
-		hysdn_addlog(card, "SysReady Token Data to short");
-		return (1);
-	}
-	for (p = cp, crc = 0; p < (cp + len - 2); p++)
-		if ((crc & 0x80))
-			crc = (((u_char) (crc << 1)) + 1) + *p;
-		else
-			crc = ((u_char) (crc << 1)) + *p;
-	crc = ~crc;
-	if (crc != *(cp + len - 1)) {
-		hysdn_addlog(card, "SysReady Token Data invalid CRC");
-		return (1);
-	}
-	len--;			/* don't check CRC byte */
-	while (len > 0) {
-
-		if (*cp == SYSR_TOK_END)
-			return (0);	/* End of Token stream */
-
-		if (len < (*(cp + 1) + 2)) {
-			hysdn_addlog(card, "token 0x%x invalid length %d", *cp, *(cp + 1));
-			return (1);
-		}
-		switch (*cp) {
-		case SYSR_TOK_B_CHAN:	/* 1 */
-			if (*(cp + 1) != 1)
-				return (1);	/* length invalid */
-			card->bchans = *(cp + 2);
-			break;
-
-		case SYSR_TOK_FAX_CHAN:	/* 2 */
-			if (*(cp + 1) != 1)
-				return (1);	/* length invalid */
-			card->faxchans = *(cp + 2);
-			break;
-
-		case SYSR_TOK_MAC_ADDR:	/* 3 */
-			if (*(cp + 1) != 6)
-				return (1);	/* length invalid */
-			memcpy(card->mac_addr, cp + 2, 6);
-			break;
-
-		default:
-			hysdn_addlog(card, "unknown token 0x%02x length %d", *cp, *(cp + 1));
-			break;
-		}
-		len -= (*(cp + 1) + 2);		/* adjust len */
-		cp += (*(cp + 1) + 2);	/* and pointer */
-	}
-
-	hysdn_addlog(card, "no end token found");
-	return (1);
-}				/* EvalSysrTokData */
diff --git a/drivers/staging/isdn/hysdn/hysdn_defs.h b/drivers/staging/isdn/hysdn/hysdn_defs.h
deleted file mode 100644
index cdac46a21692..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_defs.h
+++ /dev/null
@@ -1,282 +0,0 @@
-/* $Id: hysdn_defs.h,v 1.5.6.3 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards
- * global definitions and exported vars and functions.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#ifndef HYSDN_DEFS_H
-#define HYSDN_DEFS_H
-
-#include <linux/hysdn_if.h>
-#include <linux/interrupt.h>
-#include <linux/workqueue.h>
-#include <linux/skbuff.h>
-
-#include "ince1pc.h"
-
-#ifdef CONFIG_HYSDN_CAPI
-#include <linux/capi.h>
-#include <linux/isdn/capicmd.h>
-#include <linux/isdn/capiutil.h>
-#include <linux/isdn/capilli.h>
-
-/***************************/
-/*   CAPI-Profile values.  */
-/***************************/
-
-#define GLOBAL_OPTION_INTERNAL_CONTROLLER 0x0001
-#define GLOBAL_OPTION_EXTERNAL_CONTROLLER 0x0002
-#define GLOBAL_OPTION_HANDSET             0x0004
-#define GLOBAL_OPTION_DTMF                0x0008
-#define GLOBAL_OPTION_SUPPL_SERVICES      0x0010
-#define GLOBAL_OPTION_CHANNEL_ALLOCATION  0x0020
-#define GLOBAL_OPTION_B_CHANNEL_OPERATION 0x0040
-
-#define B1_PROT_64KBIT_HDLC        0x0001
-#define B1_PROT_64KBIT_TRANSPARENT 0x0002
-#define B1_PROT_V110_ASYNCH        0x0004
-#define B1_PROT_V110_SYNCH         0x0008
-#define B1_PROT_T30                0x0010
-#define B1_PROT_64KBIT_INV_HDLC    0x0020
-#define B1_PROT_56KBIT_TRANSPARENT 0x0040
-
-#define B2_PROT_ISO7776            0x0001
-#define B2_PROT_TRANSPARENT        0x0002
-#define B2_PROT_SDLC               0x0004
-#define B2_PROT_LAPD               0x0008
-#define B2_PROT_T30                0x0010
-#define B2_PROT_PPP                0x0020
-#define B2_PROT_TRANSPARENT_IGNORE_B1_FRAMING_ERRORS 0x0040
-
-#define B3_PROT_TRANSPARENT        0x0001
-#define B3_PROT_T90NL              0x0002
-#define B3_PROT_ISO8208            0x0004
-#define B3_PROT_X25_DCE            0x0008
-#define B3_PROT_T30                0x0010
-#define B3_PROT_T30EXT             0x0020
-
-#define HYSDN_MAXVERSION		8
-
-/* Number of sendbuffers in CAPI-queue */
-#define HYSDN_MAX_CAPI_SKB             20
-
-#endif /* CONFIG_HYSDN_CAPI*/
-
-/************************************************/
-/* constants and bits for debugging/log outputs */
-/************************************************/
-#define LOG_MAX_LINELEN 120
-#define DEB_OUT_SYSLOG  0x80000000	/* output to syslog instead of proc fs */
-#define LOG_MEM_ERR     0x00000001	/* log memory errors like kmalloc failure */
-#define LOG_POF_OPEN    0x00000010	/* log pof open and close activities */
-#define LOG_POF_RECORD  0x00000020	/* log pof record parser */
-#define LOG_POF_WRITE   0x00000040	/* log detailed pof write operation */
-#define LOG_POF_CARD    0x00000080	/* log pof related card functions */
-#define LOG_CNF_LINE    0x00000100	/* all conf lines are put to procfs */
-#define LOG_CNF_DATA    0x00000200	/* non comment conf lines are shown with channel */
-#define LOG_CNF_MISC    0x00000400	/* additional conf line debug outputs */
-#define LOG_SCHED_ASYN  0x00001000	/* debug schedulers async tx routines */
-#define LOG_PROC_OPEN   0x00100000	/* open and close from procfs are logged */
-#define LOG_PROC_ALL    0x00200000	/* all actions from procfs are logged */
-#define LOG_NET_INIT    0x00010000	/* network init and deinit logging */
-
-#define DEF_DEB_FLAGS   0x7fff000f	/* everything is logged to procfs */
-
-/**********************************/
-/* proc filesystem name constants */
-/**********************************/
-#define PROC_SUBDIR_NAME "hysdn"
-#define PROC_CONF_BASENAME "cardconf"
-#define PROC_LOG_BASENAME "cardlog"
-
-/***********************************/
-/* PCI 32 bit parms for IO and MEM */
-/***********************************/
-#define PCI_REG_PLX_MEM_BASE    0
-#define PCI_REG_PLX_IO_BASE     1
-#define PCI_REG_MEMORY_BASE     3
-
-/**************/
-/* card types */
-/**************/
-#define BD_NONE         0U
-#define BD_PERFORMANCE  1U
-#define BD_VALUE        2U
-#define BD_PCCARD       3U
-#define BD_ERGO         4U
-#define BD_METRO        5U
-#define BD_CHAMP2       6U
-#define BD_PLEXUS       7U
-
-/******************************************************/
-/* defined states for cards shown by reading cardconf */
-/******************************************************/
-#define CARD_STATE_UNUSED   0	/* never been used or booted */
-#define CARD_STATE_BOOTING  1	/* booting is in progress */
-#define CARD_STATE_BOOTERR  2	/* a previous boot was aborted */
-#define CARD_STATE_RUN      3	/* card is active */
-
-/*******************************/
-/* defines for error_log_state */
-/*******************************/
-#define ERRLOG_STATE_OFF   0	/* error log is switched off, nothing to do */
-#define ERRLOG_STATE_ON    1	/* error log is switched on, wait for data */
-#define ERRLOG_STATE_START 2	/* start error logging */
-#define ERRLOG_STATE_STOP  3	/* stop error logging */
-
-/*******************************/
-/* data structure for one card */
-/*******************************/
-typedef struct HYSDN_CARD {
-
-	/* general variables for the cards */
-	int myid;		/* own driver card id */
-	unsigned char bus;	/* pci bus the card is connected to */
-	unsigned char devfn;	/* slot+function bit encoded */
-	unsigned short subsysid;/* PCI subsystem id */
-	unsigned char brdtype;	/* type of card */
-	unsigned int bchans;	/* number of available B-channels */
-	unsigned int faxchans;	/* number of available fax-channels */
-	unsigned char mac_addr[6];/* MAC Address read from card */
-	unsigned int irq;	/* interrupt number */
-	unsigned int iobase;	/* IO-port base address */
-	unsigned long plxbase;	/* PLX memory base */
-	unsigned long membase;	/* DPRAM memory base */
-	unsigned long memend;	/* DPRAM memory end */
-	void *dpram;		/* mapped dpram */
-	int state;		/* actual state of card -> CARD_STATE_** */
-	struct HYSDN_CARD *next;	/* pointer to next card */
-
-	/* data areas for the /proc file system */
-	void *proclog;		/* pointer to proclog filesystem specific data */
-	void *procconf;		/* pointer to procconf filesystem specific data */
-
-	/* debugging and logging */
-	unsigned char err_log_state;/* actual error log state of the card */
-	unsigned long debug_flags;/* tells what should be debugged and where */
-	void (*set_errlog_state) (struct HYSDN_CARD *, int);
-
-	/* interrupt handler + interrupt synchronisation */
-	struct work_struct irq_queue;	/* interrupt task queue */
-	unsigned char volatile irq_enabled;/* interrupt enabled if != 0 */
-	unsigned char volatile hw_lock;/* hardware is currently locked -> no access */
-
-	/* boot process */
-	void *boot;		/* pointer to boot private data */
-	int (*writebootimg) (struct HYSDN_CARD *, unsigned char *, unsigned long);
-	int (*writebootseq) (struct HYSDN_CARD *, unsigned char *, int);
-	int (*waitpofready) (struct HYSDN_CARD *);
-	int (*testram) (struct HYSDN_CARD *);
-
-	/* scheduler for data transfer (only async parts) */
-	unsigned char async_data[256];/* async data to be sent (normally for config) */
-	unsigned short volatile async_len;/* length of data to send */
-	unsigned short volatile async_channel;/* channel number for async transfer */
-	int volatile async_busy;	/* flag != 0 sending in progress */
-	int volatile net_tx_busy;	/* a network packet tx is in progress */
-
-	/* network interface */
-	void *netif;		/* pointer to network structure */
-
-	/* init and deinit stopcard for booting, too */
-	void (*stopcard) (struct HYSDN_CARD *);
-	void (*releasehardware) (struct HYSDN_CARD *);
-
-	spinlock_t hysdn_lock;
-#ifdef CONFIG_HYSDN_CAPI
-	struct hycapictrl_info {
-		char cardname[32];
-		spinlock_t lock;
-		int versionlen;
-		char versionbuf[1024];
-		char *version[HYSDN_MAXVERSION];
-
-		char infobuf[128];	/* for function procinfo */
-
-		struct HYSDN_CARD  *card;
-		struct capi_ctr capi_ctrl;
-		struct sk_buff *skbs[HYSDN_MAX_CAPI_SKB];
-		int in_idx, out_idx;	/* indexes to buffer ring */
-		int sk_count;		/* number of buffers currently in ring */
-		struct sk_buff *tx_skb;	/* buffer for tx operation */
-
-		struct list_head ncci_head;
-	} *hyctrlinfo;
-#endif /* CONFIG_HYSDN_CAPI */
-} hysdn_card;
-
-#ifdef CONFIG_HYSDN_CAPI
-typedef struct hycapictrl_info hycapictrl_info;
-#endif /* CONFIG_HYSDN_CAPI */
-
-
-/*****************/
-/* exported vars */
-/*****************/
-extern hysdn_card *card_root;	/* pointer to first card */
-
-
-
-/*************************/
-/* im/exported functions */
-/*************************/
-
-/* hysdn_procconf.c */
-extern int hysdn_procconf_init(void);	/* init proc config filesys */
-extern void hysdn_procconf_release(void);	/* deinit proc config filesys */
-
-/* hysdn_proclog.c */
-extern int hysdn_proclog_init(hysdn_card *);	/* init proc log entry */
-extern void hysdn_proclog_release(hysdn_card *);	/* deinit proc log entry */
-extern void hysdn_addlog(hysdn_card *, char *, ...);	/* output data to log */
-extern void hysdn_card_errlog(hysdn_card *, tErrLogEntry *, int);	/* output card log */
-
-/* boardergo.c */
-extern int ergo_inithardware(hysdn_card *card);	/* get hardware -> module init */
-
-/* hysdn_boot.c */
-extern int pof_write_close(hysdn_card *);	/* close proc file after writing pof */
-extern int pof_write_open(hysdn_card *, unsigned char **);	/* open proc file for writing pof */
-extern int pof_write_buffer(hysdn_card *, int);		/* write boot data to card */
-extern int EvalSysrTokData(hysdn_card *, unsigned char *, int);		/* Check Sysready Token Data */
-
-/* hysdn_sched.c */
-extern int hysdn_sched_tx(hysdn_card *, unsigned char *,
-			  unsigned short volatile *, unsigned short volatile *,
-			  unsigned short);
-extern int hysdn_sched_rx(hysdn_card *, unsigned char *, unsigned short,
-			  unsigned short);
-extern int hysdn_tx_cfgline(hysdn_card *, unsigned char *,
-			    unsigned short);	/* send one cfg line */
-
-/* hysdn_net.c */
-extern unsigned int hynet_enable;
-extern int hysdn_net_create(hysdn_card *);	/* create a new net device */
-extern int hysdn_net_release(hysdn_card *);	/* delete the device */
-extern char *hysdn_net_getname(hysdn_card *);	/* get name of net interface */
-extern void hysdn_tx_netack(hysdn_card *);	/* acknowledge a packet tx */
-extern struct sk_buff *hysdn_tx_netget(hysdn_card *);	/* get next network packet */
-extern void hysdn_rx_netpkt(hysdn_card *, unsigned char *,
-			    unsigned short);	/* rxed packet from network */
-
-#ifdef CONFIG_HYSDN_CAPI
-extern unsigned int hycapi_enable;
-extern int hycapi_capi_create(hysdn_card *);	/* create a new capi device */
-extern int hycapi_capi_release(hysdn_card *);	/* delete the device */
-extern int hycapi_capi_stop(hysdn_card *card);   /* suspend */
-extern void hycapi_rx_capipkt(hysdn_card *card, unsigned char *buf,
-			      unsigned short len);
-extern void hycapi_tx_capiack(hysdn_card *card);
-extern struct sk_buff *hycapi_tx_capiget(hysdn_card *card);
-extern int hycapi_init(void);
-extern void hycapi_cleanup(void);
-#endif /* CONFIG_HYSDN_CAPI */
-
-#endif /* HYSDN_DEFS_H */
diff --git a/drivers/staging/isdn/hysdn/hysdn_init.c b/drivers/staging/isdn/hysdn/hysdn_init.c
deleted file mode 100644
index 0db2f7506250..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_init.c
+++ /dev/null
@@ -1,213 +0,0 @@
-/* $Id: hysdn_init.c,v 1.6.6.6 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, init functions.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/init.h>
-#include <linux/poll.h>
-#include <linux/vmalloc.h>
-#include <linux/slab.h>
-#include <linux/pci.h>
-
-#include "hysdn_defs.h"
-
-static struct pci_device_id hysdn_pci_tbl[] = {
-	{ PCI_VENDOR_ID_HYPERCOPE, PCI_DEVICE_ID_HYPERCOPE_PLX,
-	  PCI_ANY_ID, PCI_SUBDEVICE_ID_HYPERCOPE_METRO, 0, 0, BD_METRO },
-	{ PCI_VENDOR_ID_HYPERCOPE, PCI_DEVICE_ID_HYPERCOPE_PLX,
-	  PCI_ANY_ID, PCI_SUBDEVICE_ID_HYPERCOPE_CHAMP2, 0, 0, BD_CHAMP2 },
-	{ PCI_VENDOR_ID_HYPERCOPE, PCI_DEVICE_ID_HYPERCOPE_PLX,
-	  PCI_ANY_ID, PCI_SUBDEVICE_ID_HYPERCOPE_ERGO, 0, 0, BD_ERGO },
-	{ PCI_VENDOR_ID_HYPERCOPE, PCI_DEVICE_ID_HYPERCOPE_PLX,
-	  PCI_ANY_ID, PCI_SUBDEVICE_ID_HYPERCOPE_OLD_ERGO, 0, 0, BD_ERGO },
-
-	{ }				/* Terminating entry */
-};
-MODULE_DEVICE_TABLE(pci, hysdn_pci_tbl);
-MODULE_DESCRIPTION("ISDN4Linux: Driver for HYSDN cards");
-MODULE_AUTHOR("Werner Cornelius");
-MODULE_LICENSE("GPL");
-
-static int cardmax;		/* number of found cards */
-hysdn_card *card_root = NULL;	/* pointer to first card */
-static hysdn_card *card_last = NULL;	/* pointer to last card */
-
-
-/****************************************************************************/
-/* The module startup and shutdown code. Only compiled when used as module. */
-/* Using the driver as module is always advisable, because the booting      */
-/* image becomes smaller and the driver code is only loaded when needed.    */
-/* Additionally newer versions may be activated without rebooting.          */
-/****************************************************************************/
-
-/****************************************************************************/
-/* init_module is called once when the module is loaded to do all necessary */
-/* things like autodetect...                                                */
-/* If the return value of this function is 0 the init has been successful   */
-/* and the module is added to the list in /proc/modules, otherwise an error */
-/* is assumed and the module will not be kept in memory.                    */
-/****************************************************************************/
-
-static int hysdn_pci_init_one(struct pci_dev *akt_pcidev,
-			      const struct pci_device_id *ent)
-{
-	hysdn_card *card;
-	int rc;
-
-	rc = pci_enable_device(akt_pcidev);
-	if (rc)
-		return rc;
-
-	if (!(card = kzalloc(sizeof(hysdn_card), GFP_KERNEL))) {
-		printk(KERN_ERR "HYSDN: unable to alloc device mem \n");
-		rc = -ENOMEM;
-		goto err_out;
-	}
-	card->myid = cardmax;	/* set own id */
-	card->bus = akt_pcidev->bus->number;
-	card->devfn = akt_pcidev->devfn;	/* slot + function */
-	card->subsysid = akt_pcidev->subsystem_device;
-	card->irq = akt_pcidev->irq;
-	card->iobase = pci_resource_start(akt_pcidev, PCI_REG_PLX_IO_BASE);
-	card->plxbase = pci_resource_start(akt_pcidev, PCI_REG_PLX_MEM_BASE);
-	card->membase = pci_resource_start(akt_pcidev, PCI_REG_MEMORY_BASE);
-	card->brdtype = BD_NONE;	/* unknown */
-	card->debug_flags = DEF_DEB_FLAGS;	/* set default debug */
-	card->faxchans = 0;	/* default no fax channels */
-	card->bchans = 2;	/* and 2 b-channels */
-	card->brdtype = ent->driver_data;
-
-	if (ergo_inithardware(card)) {
-		printk(KERN_WARNING "HYSDN: card at io 0x%04x already in use\n", card->iobase);
-		rc = -EBUSY;
-		goto err_out_card;
-	}
-
-	cardmax++;
-	card->next = NULL;	/*end of chain */
-	if (card_last)
-		card_last->next = card;		/* pointer to next card */
-	else
-		card_root = card;
-	card_last = card;	/* new chain end */
-
-	pci_set_drvdata(akt_pcidev, card);
-	return 0;
-
-err_out_card:
-	kfree(card);
-err_out:
-	pci_disable_device(akt_pcidev);
-	return rc;
-}
-
-static void hysdn_pci_remove_one(struct pci_dev *akt_pcidev)
-{
-	hysdn_card *card = pci_get_drvdata(akt_pcidev);
-
-	pci_set_drvdata(akt_pcidev, NULL);
-
-	if (card->stopcard)
-		card->stopcard(card);
-
-#ifdef CONFIG_HYSDN_CAPI
-	hycapi_capi_release(card);
-#endif
-
-	if (card->releasehardware)
-		card->releasehardware(card);   /* free all hardware resources */
-
-	if (card == card_root) {
-		card_root = card_root->next;
-		if (!card_root)
-			card_last = NULL;
-	} else {
-		hysdn_card *tmp = card_root;
-		while (tmp) {
-			if (tmp->next == card)
-				tmp->next = card->next;
-			card_last = tmp;
-			tmp = tmp->next;
-		}
-	}
-
-	kfree(card);
-	pci_disable_device(akt_pcidev);
-}
-
-static struct pci_driver hysdn_pci_driver = {
-	.name		= "hysdn",
-	.id_table	= hysdn_pci_tbl,
-	.probe		= hysdn_pci_init_one,
-	.remove		= hysdn_pci_remove_one,
-};
-
-static int hysdn_have_procfs;
-
-static int __init
-hysdn_init(void)
-{
-	int rc;
-
-	printk(KERN_NOTICE "HYSDN: module loaded\n");
-
-	rc = pci_register_driver(&hysdn_pci_driver);
-	if (rc)
-		return rc;
-
-	printk(KERN_INFO "HYSDN: %d card(s) found.\n", cardmax);
-
-	if (!hysdn_procconf_init())
-		hysdn_have_procfs = 1;
-
-#ifdef CONFIG_HYSDN_CAPI
-	if (cardmax > 0) {
-		if (hycapi_init()) {
-			printk(KERN_ERR "HYCAPI: init failed\n");
-
-			if (hysdn_have_procfs)
-				hysdn_procconf_release();
-
-			pci_unregister_driver(&hysdn_pci_driver);
-			return -ESPIPE;
-		}
-	}
-#endif /* CONFIG_HYSDN_CAPI */
-
-	return 0;		/* no error */
-}				/* init_module */
-
-
-/***********************************************************************/
-/* cleanup_module is called when the module is released by the kernel. */
-/* The routine is only called if init_module has been successful and   */
-/* the module counter has a value of 0. Otherwise this function will   */
-/* not be called. This function must release all resources still allo- */
-/* cated as after the return from this function the module code will   */
-/* be removed from memory.                                             */
-/***********************************************************************/
-static void __exit
-hysdn_exit(void)
-{
-	if (hysdn_have_procfs)
-		hysdn_procconf_release();
-
-	pci_unregister_driver(&hysdn_pci_driver);
-
-#ifdef CONFIG_HYSDN_CAPI
-	hycapi_cleanup();
-#endif /* CONFIG_HYSDN_CAPI */
-
-	printk(KERN_NOTICE "HYSDN: module unloaded\n");
-}				/* cleanup_module */
-
-module_init(hysdn_init);
-module_exit(hysdn_exit);
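
Note on the removed probe/remove pair: the driver keeps its cards on a singly linked chain with a head pointer (card_root) and a tail pointer (card_last), and the remove path has to repair both. The following is a minimal userspace sketch of that bookkeeping, not the driver's code. struct node, struct chain, chain_append() and chain_remove() are hypothetical names introduced only for illustration, and the pointer-to-link walk is a substitute for the loop in hysdn_pci_remove_one().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for hysdn_card: only the chain link is modelled. */
struct node {
	int id;
	struct node *next;
};

struct chain {
	struct node *root;	/* first element, like card_root */
	struct node *last;	/* last element, like card_last */
};

/* Append n at the end of the chain, as the probe routine does. */
static void chain_append(struct chain *c, struct node *n)
{
	assert(c && n);
	n->next = NULL;
	if (c->last)
		c->last->next = n;	/* link behind the current tail */
	else
		c->root = n;		/* chain was empty */
	c->last = n;
}

/* Unlink n and repair the tail pointer; returns 1 if n was found. */
static int chain_remove(struct chain *c, struct node *n)
{
	struct node **link = &c->root;
	struct node *prev = NULL;

	/* Walk the links until the one pointing at n is found. */
	while (*link && *link != n) {
		prev = *link;
		link = &prev->next;
	}
	if (!*link)
		return 0;		/* n is not on the chain */
	*link = n->next;
	if (c->last == n)
		c->last = prev;		/* NULL when the chain became empty */
	n->next = NULL;
	return 1;
}
```

Unlike the removed loop, this sketch touches the tail pointer only when the removed element was the tail.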
diff --git a/drivers/staging/isdn/hysdn/hysdn_net.c b/drivers/staging/isdn/hysdn/hysdn_net.c
deleted file mode 100644
index dcb9ef7a2651..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_net.c
+++ /dev/null
@@ -1,330 +0,0 @@
-/* $Id: hysdn_net.c,v 1.8.6.4 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, net (ethernet type) handling routines.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- * This net module has been inspired by the skeleton driver from
- * Donald Becker (becker@CESDIS.gsfc.nasa.gov)
- *
- */
-
-#include <linux/module.h>
-#include <linux/signal.h>
-#include <linux/kernel.h>
-#include <linux/netdevice.h>
-#include <linux/etherdevice.h>
-#include <linux/skbuff.h>
-#include <linux/inetdevice.h>
-
-#include "hysdn_defs.h"
-
-unsigned int hynet_enable = 0xffffffff;
-module_param(hynet_enable, uint, 0);
-
-#define MAX_SKB_BUFFERS 20	/* number of buffers for keeping TX-data */
-
-/****************************************************************************/
-/* structure containing the complete network data. The structure is aligned */
-/* in a way that both, the device and statistics are kept inside it.        */
-/* for proper access, the device structure MUST be the first var/struct     */
-/* inside the definition.                                                   */
-/****************************************************************************/
-struct net_local {
-	/* Tx control lock.  This protects the transmit buffer ring
-	 * state along with the "tx full" state of the driver.  This
-	 * means all netif_queue flow control actions are protected
-	 * by this lock as well.
-	 */
-	struct net_device *dev;
-	spinlock_t lock;
-	struct sk_buff *skbs[MAX_SKB_BUFFERS];	/* pointers to tx-skbs */
-	int in_idx, out_idx;	/* indexes to buffer ring */
-	int sk_count;		/* number of buffers currently in ring */
-};				/* net_local */
-
-
-
-/*********************************************************************/
-/* Open/initialize the board. This is called (in the current kernel) */
-/* sometime after booting when the 'ifconfig' program is run.        */
-/* This routine should set everything up anew at each open, even     */
-/* registers that "should" only need to be set once at boot, so that */
-/* there is non-reboot way to recover if something goes wrong.       */
-/*********************************************************************/
-static int
-net_open(struct net_device *dev)
-{
-	struct in_device *in_dev;
-	hysdn_card *card = dev->ml_priv;
-	int i;
-
-	netif_start_queue(dev);	/* start tx-queueing */
-
-	/* Fill in the MAC-level header (if not already set) */
-	if (!card->mac_addr[0]) {
-		for (i = 0; i < ETH_ALEN; i++)
-			dev->dev_addr[i] = 0xfc;
-		if ((in_dev = dev->ip_ptr) != NULL) {
-			const struct in_ifaddr *ifa;
-
-			rcu_read_lock();
-			ifa = rcu_dereference(in_dev->ifa_list);
-			if (ifa != NULL)
-				memcpy(dev->dev_addr + (ETH_ALEN - sizeof(ifa->ifa_local)), &ifa->ifa_local, sizeof(ifa->ifa_local));
-			rcu_read_unlock();
-		}
-	} else
-		memcpy(dev->dev_addr, card->mac_addr, ETH_ALEN);
-
-	return (0);
-}				/* net_open */
-
-/*******************************************/
-/* flush the currently occupied tx-buffers */
-/* must only be called when device closed  */
-/*******************************************/
-static void
-flush_tx_buffers(struct net_local *nl)
-{
-
-	while (nl->sk_count) {
-		dev_kfree_skb(nl->skbs[nl->out_idx++]);		/* free skb */
-		if (nl->out_idx >= MAX_SKB_BUFFERS)
-			nl->out_idx = 0;	/* wrap around */
-		nl->sk_count--;
-	}
-}				/* flush_tx_buffers */
-
-
-/*********************************************************************/
-/* close/decativate the device. The device is not removed, but only  */
-/* deactivated.                                                      */
-/*********************************************************************/
-static int
-net_close(struct net_device *dev)
-{
-
-	netif_stop_queue(dev);	/* disable queueing */
-
-	flush_tx_buffers((struct net_local *) dev);
-
-	return (0);		/* success */
-}				/* net_close */
-
-/************************************/
-/* send a packet on this interface. */
-/* new style for kernel >= 2.3.33   */
-/************************************/
-static netdev_tx_t
-net_send_packet(struct sk_buff *skb, struct net_device *dev)
-{
-	struct net_local *lp = (struct net_local *) dev;
-
-	spin_lock_irq(&lp->lock);
-
-	lp->skbs[lp->in_idx++] = skb;	/* add to buffer list */
-	if (lp->in_idx >= MAX_SKB_BUFFERS)
-		lp->in_idx = 0;	/* wrap around */
-	lp->sk_count++;		/* adjust counter */
-	netif_trans_update(dev);
-
-	/* If we just used up the very last entry in the
-	 * TX ring on this device, tell the queueing
-	 * layer to send no more.
-	 */
-	if (lp->sk_count >= MAX_SKB_BUFFERS)
-		netif_stop_queue(dev);
-
-	/* When the TX completion hw interrupt arrives, this
-	 * is when the transmit statistics are updated.
-	 */
-
-	spin_unlock_irq(&lp->lock);
-
-	if (lp->sk_count <= 3) {
-		schedule_work(&((hysdn_card *) dev->ml_priv)->irq_queue);
-	}
-	return NETDEV_TX_OK;	/* success */
-}				/* net_send_packet */
-
-
-
-/***********************************************************************/
-/* acknowlegde a packet send. The network layer will be informed about */
-/* completion                                                          */
-/***********************************************************************/
-void
-hysdn_tx_netack(hysdn_card *card)
-{
-	struct net_local *lp = card->netif;
-
-	if (!lp)
-		return;		/* non existing device */
-
-
-	if (!lp->sk_count)
-		return;		/* error condition */
-
-	lp->dev->stats.tx_packets++;
-	lp->dev->stats.tx_bytes += lp->skbs[lp->out_idx]->len;
-
-	dev_kfree_skb(lp->skbs[lp->out_idx++]);		/* free skb */
-	if (lp->out_idx >= MAX_SKB_BUFFERS)
-		lp->out_idx = 0;	/* wrap around */
-
-	if (lp->sk_count-- == MAX_SKB_BUFFERS)	/* dec usage count */
-		netif_start_queue((struct net_device *) lp);
-}				/* hysdn_tx_netack */
-
-/*****************************************************/
-/* we got a packet from the network, go and queue it */
-/*****************************************************/
-void
-hysdn_rx_netpkt(hysdn_card *card, unsigned char *buf, unsigned short len)
-{
-	struct net_local *lp = card->netif;
-	struct net_device *dev;
-	struct sk_buff *skb;
-
-	if (!lp)
-		return;		/* non existing device */
-
-	dev = lp->dev;
-	dev->stats.rx_bytes += len;
-
-	skb = dev_alloc_skb(len);
-	if (skb == NULL) {
-		printk(KERN_NOTICE "%s: Memory squeeze, dropping packet.\n",
-		       dev->name);
-		dev->stats.rx_dropped++;
-		return;
-	}
-	/* copy the data */
-	skb_put_data(skb, buf, len);
-
-	/* determine the used protocol */
-	skb->protocol = eth_type_trans(skb, dev);
-
-	dev->stats.rx_packets++;	/* adjust packet count */
-
-	netif_rx(skb);
-}				/* hysdn_rx_netpkt */
-
-/*****************************************************/
-/* return the pointer to a network packet to be send */
-/*****************************************************/
-struct sk_buff *
-hysdn_tx_netget(hysdn_card *card)
-{
-	struct net_local *lp = card->netif;
-
-	if (!lp)
-		return (NULL);	/* non existing device */
-
-	if (!lp->sk_count)
-		return (NULL);	/* nothing available */
-
-	return (lp->skbs[lp->out_idx]);		/* next packet to send */
-}				/* hysdn_tx_netget */
-
-static const struct net_device_ops hysdn_netdev_ops = {
-	.ndo_open		= net_open,
-	.ndo_stop		= net_close,
-	.ndo_start_xmit		= net_send_packet,
-	.ndo_set_mac_address	= eth_mac_addr,
-	.ndo_validate_addr	= eth_validate_addr,
-};
-
-
-/*****************************************************************************/
-/* hysdn_net_create creates a new net device for the given card. If a device */
-/* already exists, it will be deleted and created a new one. The return value */
-/* 0 announces success, else a negative error code will be returned.         */
-/*****************************************************************************/
-int
-hysdn_net_create(hysdn_card *card)
-{
-	struct net_device *dev;
-	int i;
-	struct net_local *lp;
-
-	if (!card) {
-		printk(KERN_WARNING "No card-pt in hysdn_net_create!\n");
-		return (-ENOMEM);
-	}
-	hysdn_net_release(card);	/* release an existing net device */
-
-	dev = alloc_etherdev(sizeof(struct net_local));
-	if (!dev) {
-		printk(KERN_WARNING "HYSDN: unable to allocate mem\n");
-		return (-ENOMEM);
-	}
-
-	lp = netdev_priv(dev);
-	lp->dev = dev;
-
-	dev->netdev_ops = &hysdn_netdev_ops;
-	spin_lock_init(&((struct net_local *) dev)->lock);
-
-	/* initialise necessary or informing fields */
-	dev->base_addr = card->iobase;	/* IO address */
-	dev->irq = card->irq;	/* irq */
-
-	dev->netdev_ops = &hysdn_netdev_ops;
-	if ((i = register_netdev(dev))) {
-		printk(KERN_WARNING "HYSDN: unable to create network device\n");
-		free_netdev(dev);
-		return (i);
-	}
-	dev->ml_priv = card;	/* remember pointer to own data structure */
-	card->netif = dev;	/* setup the local pointer */
-
-	if (card->debug_flags & LOG_NET_INIT)
-		hysdn_addlog(card, "network device created");
-	return 0;		/* and return success */
-}				/* hysdn_net_create */
-
-/***************************************************************************/
-/* hysdn_net_release deletes the net device for the given card. The return */
-/* value 0 announces success, else a negative error code will be returned. */
-/***************************************************************************/
-int
-hysdn_net_release(hysdn_card *card)
-{
-	struct net_device *dev = card->netif;
-
-	if (!dev)
-		return (0);	/* non existing */
-
-	card->netif = NULL;	/* clear out pointer */
-	net_close(dev);
-
-	flush_tx_buffers((struct net_local *) dev);	/* empty buffers */
-
-	unregister_netdev(dev);	/* release the device */
-	free_netdev(dev);	/* release the memory allocated */
-	if (card->debug_flags & LOG_NET_INIT)
-		hysdn_addlog(card, "network device deleted");
-
-	return (0);		/* always successful */
-}				/* hysdn_net_release */
-
-/*****************************************************************************/
-/* hysdn_net_getname returns a pointer to the name of the network interface. */
-/* if the interface is not existing, a "-" is returned.                      */
-/*****************************************************************************/
-char *
-hysdn_net_getname(hysdn_card *card)
-{
-	struct net_device *dev = card->netif;
-
-	if (!dev)
-		return ("-");	/* non existing */
-
-	return (dev->name);
-}				/* hysdn_net_getname */
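
Note on the removed tx path: net_send_packet(), hysdn_tx_netget() and hysdn_tx_netack() together implement a fixed-size ring of MAX_SKB_BUFFERS slots with an in index, an out index and an occupancy count. The queue is stopped when the last slot is filled and restarted when an acknowledge frees a slot of a full ring. The following is a minimal userspace sketch of that ring, assuming plain pointers in place of skbs and no locking. struct tx_ring, ring_push(), ring_peek() and ring_ack() are hypothetical names, and the explicit "already full" check in ring_push() is an addition that the driver does not have (it relies on the stopped queue instead).

```c
#include <assert.h>
#include <stddef.h>

#define RING_SLOTS 20	/* mirrors MAX_SKB_BUFFERS */

/* Hypothetical userspace model of the tx ring; void * stands in for skbs. */
struct tx_ring {
	void *slot[RING_SLOTS];
	int in_idx;	/* next slot to fill */
	int out_idx;	/* oldest slot not yet acknowledged */
	int count;	/* slots currently occupied */
};

/*
 * Queue one item. Returns -1 if the ring was already full, 1 if this push
 * filled the last slot (the driver stops the queue here), 0 otherwise.
 */
static int ring_push(struct tx_ring *r, void *item)
{
	assert(r && item);
	if (r->count >= RING_SLOTS)
		return -1;
	r->slot[r->in_idx++] = item;
	if (r->in_idx >= RING_SLOTS)
		r->in_idx = 0;	/* wrap around */
	r->count++;
	return r->count == RING_SLOTS;
}

/* Oldest queued item, or NULL if the ring is empty (like hysdn_tx_netget). */
static void *ring_peek(const struct tx_ring *r)
{
	return r->count ? r->slot[r->out_idx] : NULL;
}

/*
 * Acknowledge the oldest item. Returns -1 if the ring was empty, 1 if the
 * ring was full before this call (the driver restarts the queue here),
 * 0 otherwise.
 */
static int ring_ack(struct tx_ring *r)
{
	int was_full;

	if (!r->count)
		return -1;
	was_full = (r->count == RING_SLOTS);
	r->slot[r->out_idx++] = NULL;
	if (r->out_idx >= RING_SLOTS)
		r->out_idx = 0;	/* wrap around */
	r->count--;
	return was_full;
}
```

A caller would treat a return value of 1 from ring_push() as "stop the queue" and a return value of 1 from ring_ack() as "restart the queue".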
diff --git a/drivers/staging/isdn/hysdn/hysdn_pof.h b/drivers/staging/isdn/hysdn/hysdn_pof.h
deleted file mode 100644
index f63f5fa59d7e..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_pof.h
+++ /dev/null
@@ -1,78 +0,0 @@
-/* $Id: hysdn_pof.h,v 1.2.6.1 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, definitions used for handling pof-files.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-/************************/
-/* POF specific defines */
-/************************/
-#define BOOT_BUF_SIZE   0x1000	/* =4096, maybe moved to other h file */
-#define CRYPT_FEEDTERM  0x8142
-#define CRYPT_STARTTERM 0x81a5
-/*  max. timeout time in seconds
- *  from end of booting to POF is ready
- */
-#define POF_READY_TIME_OUT_SEC  10
-
-/**********************************/
-/* defines for 1.stage boot image */
-/**********************************/
-
-/*  the POF file record containing the boot loader image
- *  has 2 pages a 16KB:
- *  1. page contains the high 16-bit part of the 32-bit E1 words
- *  2. page contains the low  16-bit part of the 32-bit E1 words
- *
- *  In each 16KB page we assume the start of the boot loader code
- *  in the highest 2KB part (at offset 0x3800);
- *  the rest (0x0000..0x37FF) is assumed to contain 0 bytes.
- */
-
-#define POF_BOOT_LOADER_PAGE_SIZE   0x4000	/* =16384U */
-#define POF_BOOT_LOADER_TOTAL_SIZE  (2U * POF_BOOT_LOADER_PAGE_SIZE)
-
-#define POF_BOOT_LOADER_CODE_SIZE   0x0800	/* =2KB =2048U */
-
-/* offset in boot page, where loader code may start */
-/* =0x3800= 14336U */
-#define POF_BOOT_LOADER_OFF_IN_PAGE (POF_BOOT_LOADER_PAGE_SIZE-POF_BOOT_LOADER_CODE_SIZE)
-
-
-/*--------------------------------------POF file record structs------------*/
-typedef struct PofFileHdr_tag {	/* Pof file header */
-	/*00 */ unsigned long Magic __attribute__((packed));
-	/*04 */ unsigned long N_PofRecs __attribute__((packed));
-/*08 */
-} tPofFileHdr;
-
-typedef struct PofRecHdr_tag {	/* Pof record header */
-	/*00 */ unsigned short PofRecId __attribute__((packed));
-	/*02 */ unsigned long PofRecDataLen __attribute__((packed));
-/*06 */
-} tPofRecHdr;
-
-typedef struct PofTimeStamp_tag {
-	/*00 */ unsigned long UnixTime __attribute__((packed));
-	/*04 */ unsigned char DateTimeText[0x28];
-	/* =40 */
-/*2C */
-} tPofTimeStamp;
-
-/* tPofFileHdr.Magic value: */
-#define TAGFILEMAGIC 0x464F501AUL
-/* tPofRecHdr.PofRecId values: */
-#define TAG_ABSDATA  0x1000	/* abs. data */
-#define TAG_BOOTDTA  0x1001	/* boot data */
-#define TAG_COMMENT  0x0020
-#define TAG_SYSCALL  0x0021
-#define TAG_FLOWCTRL 0x0022
-#define TAG_TIMESTMP 0x0010	/* date/time stamp of version */
-#define TAG_CABSDATA 0x1100	/* crypted abs. data */
-#define TAG_CBOOTDTA 0x1101	/* crypted boot data */
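
Note on the removed POF record structs: the offset comments in the header (0x08, 0x06 and 0x2C as end offsets) only hold if unsigned long is 4 bytes wide, which is not the case on LP64 targets such as 64-bit Linux. The following is a minimal sketch of how the same layouts could be pinned down with fixed-width types and compile-time checks. The struct and member names are hypothetical, the packed attribute assumes GCC or Clang, and static_assert assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width variants of the POF records; names are illustrative. */
struct pof_file_hdr {
	uint32_t magic;		/* offset 0x00 */
	uint32_t n_pof_recs;	/* offset 0x04 */
} __attribute__((packed));

struct pof_rec_hdr {
	uint16_t rec_id;	/* offset 0x00 */
	uint32_t data_len;	/* offset 0x02 */
} __attribute__((packed));

struct pof_time_stamp {
	uint32_t unix_time;		/* offset 0x00 */
	unsigned char text[0x28];	/* offset 0x04, 40 bytes */
} __attribute__((packed));

/* Compile-time checks against the offsets noted in the header comments. */
static_assert(sizeof(struct pof_file_hdr) == 0x08, "file header is 8 bytes");
static_assert(offsetof(struct pof_rec_hdr, data_len) == 0x02, "len at 0x02");
static_assert(sizeof(struct pof_rec_hdr) == 0x06, "record header is 6 bytes");
static_assert(sizeof(struct pof_time_stamp) == 0x2C, "timestamp is 44 bytes");
```

With the checks in place, a layout mismatch fails at build time instead of misparsing a firmware file at run time.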
diff --git a/drivers/staging/isdn/hysdn/hysdn_procconf.c b/drivers/staging/isdn/hysdn/hysdn_procconf.c
deleted file mode 100644
index 48afd9f5316e..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_procconf.c
+++ /dev/null
@@ -1,411 +0,0 @@
-/* $Id: hysdn_procconf.c,v 1.8.6.4 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, /proc/net filesystem dir and conf functions.
- *
- * written by Werner Cornelius (werner@titro.de) for Hypercope GmbH
- *
- * Copyright 1999  by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/cred.h>
-#include <linux/module.h>
-#include <linux/poll.h>
-#include <linux/proc_fs.h>
-#include <linux/pci.h>
-#include <linux/slab.h>
-#include <linux/mutex.h>
-#include <net/net_namespace.h>
-
-#include "hysdn_defs.h"
-
-static DEFINE_MUTEX(hysdn_conf_mutex);
-
-#define INFO_OUT_LEN 80		/* length of info line including lf */
-
-/********************************************************/
-/* defines and data structure for conf write operations */
-/********************************************************/
-#define CONF_STATE_DETECT 0	/* waiting for detect */
-#define CONF_STATE_CONF   1	/* writing config data */
-#define CONF_STATE_POF    2	/* writing pof data */
-#define CONF_LINE_LEN   255	/* 255 chars max */
-
-struct conf_writedata {
-	hysdn_card *card;	/* card the device is connected to */
-	int buf_size;		/* actual number of bytes in the buffer */
-	int needed_size;	/* needed size when reading pof */
-	int state;		/* actual interface states from above constants */
-	unsigned char conf_line[CONF_LINE_LEN];	/* buffered conf line */
-	unsigned short channel;		/* active channel number */
-	unsigned char *pof_buffer;	/* buffer when writing pof */
-};
-
-/***********************************************************************/
-/* process_line parses one config line and transfers it to the card if */
-/* necessary.                                                          */
-/* if the return value is negative an error occurred.                   */
-/***********************************************************************/
-static int
-process_line(struct conf_writedata *cnf)
-{
-	unsigned char *cp = cnf->conf_line;
-	int i;
-
-	if (cnf->card->debug_flags & LOG_CNF_LINE)
-		hysdn_addlog(cnf->card, "conf line: %s", cp);
-
-	if (*cp == '-') {	/* option */
-		cp++;		/* point to option char */
-
-		if (*cp++ != 'c')
-			return (0);	/* option unknown or used */
-		i = 0;		/* start value for channel */
-		while ((*cp <= '9') && (*cp >= '0'))
-			i = i * 10 + *cp++ - '0';	/* get decimal number */
-		if (i > 65535) {
-			if (cnf->card->debug_flags & LOG_CNF_MISC)
-				hysdn_addlog(cnf->card, "conf channel invalid  %d", i);
-			return (-ERR_INV_CHAN);		/* invalid channel */
-		}
-		cnf->channel = i & 0xFFFF;	/* set new channel number */
-		return (0);	/* success */
-	}			/* option */
-	if (*cp == '*') {	/* line to send */
-		if (cnf->card->debug_flags & LOG_CNF_DATA)
-			hysdn_addlog(cnf->card, "conf chan=%d %s", cnf->channel, cp);
-		return (hysdn_tx_cfgline(cnf->card, cnf->conf_line + 1,
-					 cnf->channel));	/* send the line without * */
-	}			/* line to send */
-	return (0);
-}				/* process_line */
-
-/***********************************/
-/* conf file operations and tables */
-/***********************************/
-
-/****************************************************/
-/* write conf file -> boot or send cfg line to card */
-/****************************************************/
-static ssize_t
-hysdn_conf_write(struct file *file, const char __user *buf, size_t count, loff_t *off)
-{
-	struct conf_writedata *cnf;
-	int i;
-	unsigned char ch, *cp;
-
-	if (!count)
-		return (0);	/* nothing to handle */
-
-	if (!(cnf = file->private_data))
-		return (-EFAULT);	/* should never happen */
-
-	if (cnf->state == CONF_STATE_DETECT) {	/* auto detect cnf or pof data */
-		if (copy_from_user(&ch, buf, 1))	/* get first char for detect */
-			return (-EFAULT);
-
-		if (ch == 0x1A) {
-			/* we detected a pof file */
-			if ((cnf->needed_size = pof_write_open(cnf->card, &cnf->pof_buffer)) <= 0)
-				return (cnf->needed_size);	/* an error occurred -> exit */
-			cnf->buf_size = 0;	/* buffer is empty */
-			cnf->state = CONF_STATE_POF;	/* new state */
-		} else {
-			/* conf data has been detected */
-			cnf->buf_size = 0;	/* buffer is empty */
-			cnf->state = CONF_STATE_CONF;	/* requested conf data write */
-			if (cnf->card->state != CARD_STATE_RUN)
-				return (-ERR_NOT_BOOTED);
-			cnf->conf_line[CONF_LINE_LEN - 1] = 0;	/* limit string length */
-			cnf->channel = 4098;	/* default channel for output */
-		}
-	}			/* state was auto detect */
-	if (cnf->state == CONF_STATE_POF) {	/* pof write active */
-		i = cnf->needed_size - cnf->buf_size;	/* bytes still missing for write */
-		if (i <= 0)
-			return (-EINVAL);	/* size error handling pof */
-
-		if (i < count)
-			count = i;	/* limit requested number of bytes */
-		if (copy_from_user(cnf->pof_buffer + cnf->buf_size, buf, count))
-			return (-EFAULT);	/* error while copying */
-		cnf->buf_size += count;
-
-		if (cnf->needed_size == cnf->buf_size) {
-			cnf->needed_size = pof_write_buffer(cnf->card, cnf->buf_size);	/* write data */
-			if (cnf->needed_size <= 0) {
-				cnf->card->state = CARD_STATE_BOOTERR;	/* show boot error */
-				return (cnf->needed_size);	/* an error occurred */
-			}
-			cnf->buf_size = 0;	/* buffer is empty again */
-		}
-	}
-	/* pof write active */
-	else {			/* conf write active */
-
-		if (cnf->card->state != CARD_STATE_RUN) {
-			if (cnf->card->debug_flags & LOG_CNF_MISC)
-				hysdn_addlog(cnf->card, "cnf write denied -> not booted");
-			return (-ERR_NOT_BOOTED);
-		}
-		i = (CONF_LINE_LEN - 1) - cnf->buf_size;	/* bytes available in buffer */
-		if (i > 0) {
-			/* copy remaining bytes into buffer */
-
-			if (count > i)
-				count = i;	/* limit transfer */
-			if (copy_from_user(cnf->conf_line + cnf->buf_size, buf, count))
-				return (-EFAULT);	/* error while copying */
-
-			i = count;	/* number of chars in buffer */
-			cp = cnf->conf_line + cnf->buf_size;
-			while (i) {
-				/* search for end of line */
-				if ((*cp < ' ') && (*cp != 9))
-					break;	/* end of line found */
-				cp++;
-				i--;
-			}	/* search for end of line */
-
-			if (i) {
-				/* delimiter found */
-				*cp++ = 0;	/* string termination */
-				count -= (i - 1);	/* subtract remaining bytes from count */
-				while ((i) && (*cp < ' ') && (*cp != 9)) {
-					i--;	/* discard next char */
-					count++;	/* mark as read */
-					cp++;	/* next char */
-				}
-				cnf->buf_size = 0;	/* buffer is empty after transfer */
-				if ((i = process_line(cnf)) < 0)	/* handle the line */
-					count = i;	/* return the error */
-			}
-			/* delimiter found */
-			else {
-				cnf->buf_size += count;		/* add chars to string */
-				if (cnf->buf_size >= CONF_LINE_LEN - 1) {
-					if (cnf->card->debug_flags & LOG_CNF_MISC)
-						hysdn_addlog(cnf->card, "cnf line too long %d chars pos %d", cnf->buf_size, count);
-					return (-ERR_CONF_LONG);
-				}
-			}	/* not delimited */
-
-		}
-		/* copy remaining bytes into buffer */
-		else {
-			if (cnf->card->debug_flags & LOG_CNF_MISC)
-				hysdn_addlog(cnf->card, "cnf line too long");
-			return (-ERR_CONF_LONG);
-		}
-	}			/* conf write active */
-
-	return (count);
-}				/* hysdn_conf_write */
-
-/*******************************************/
-/* read conf file -> output card info data */
-/*******************************************/
-static ssize_t
-hysdn_conf_read(struct file *file, char __user *buf, size_t count, loff_t *off)
-{
-	char *cp;
-
-	if (!(file->f_mode & FMODE_READ))
-		return -EPERM;	/* no permission to read */
-
-	if (!(cp = file->private_data))
-		return -EFAULT;	/* should never happen */
-
-	return simple_read_from_buffer(buf, count, off, cp, strlen(cp));
-}				/* hysdn_conf_read */
-
-/******************/
-/* open conf file */
-/******************/
-static int
-hysdn_conf_open(struct inode *ino, struct file *filep)
-{
-	hysdn_card *card;
-	struct conf_writedata *cnf;
-	char *cp, *tmp;
-
-	/* now search the addressed card */
-	mutex_lock(&hysdn_conf_mutex);
-	card = PDE_DATA(ino);
-	if (card->debug_flags & (LOG_PROC_OPEN | LOG_PROC_ALL))
-		hysdn_addlog(card, "config open for uid=%d gid=%d mode=0x%x",
-			     filep->f_cred->fsuid, filep->f_cred->fsgid,
-			     filep->f_mode);
-
-	if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_WRITE) {
-		/* write only access -> write boot file or conf line */
-
-		if (!(cnf = kmalloc(sizeof(struct conf_writedata), GFP_KERNEL))) {
-			mutex_unlock(&hysdn_conf_mutex);
-			return (-EFAULT);
-		}
-		cnf->card = card;
-		cnf->buf_size = 0;	/* nothing buffered */
-		cnf->state = CONF_STATE_DETECT;		/* start auto detect */
-		filep->private_data = cnf;
-
-	} else if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
-		/* read access -> output card info data */
-
-		if (!(tmp = kmalloc(INFO_OUT_LEN * 2 + 2, GFP_KERNEL))) {
-			mutex_unlock(&hysdn_conf_mutex);
-			return (-EFAULT);	/* out of memory */
-		}
-		filep->private_data = tmp;	/* start of string */
-
-		/* first output a headline */
-		sprintf(tmp, "id bus slot type irq iobase dp-mem     b-chans fax-chans state device");
-		cp = tmp;	/* start of string */
-		while (*cp)
-			cp++;
-		while (((cp - tmp) % (INFO_OUT_LEN + 1)) != INFO_OUT_LEN)
-			*cp++ = ' ';
-		*cp++ = '\n';
-
-		/* and now the data */
-		sprintf(cp, "%d  %3d %4d %4d %3d 0x%04x 0x%08lx %7d %9d %3d   %s",
-			card->myid,
-			card->bus,
-			PCI_SLOT(card->devfn),
-			card->brdtype,
-			card->irq,
-			card->iobase,
-			card->membase,
-			card->bchans,
-			card->faxchans,
-			card->state,
-			hysdn_net_getname(card));
-		while (*cp)
-			cp++;
-		while (((cp - tmp) % (INFO_OUT_LEN + 1)) != INFO_OUT_LEN)
-			*cp++ = ' ';
-		*cp++ = '\n';
-		*cp = 0;	/* end of string */
-	} else {		/* simultaneous read/write access forbidden ! */
-		mutex_unlock(&hysdn_conf_mutex);
-		return (-EPERM);	/* no permission this time */
-	}
-	mutex_unlock(&hysdn_conf_mutex);
-	return nonseekable_open(ino, filep);
-}				/* hysdn_conf_open */
-
-/***************************/
-/* close a config file.    */
-/***************************/
-static int
-hysdn_conf_close(struct inode *ino, struct file *filep)
-{
-	hysdn_card *card;
-	struct conf_writedata *cnf;
-	int retval = 0;
-
-	mutex_lock(&hysdn_conf_mutex);
-	card = PDE_DATA(ino);
-	if (card->debug_flags & (LOG_PROC_OPEN | LOG_PROC_ALL))
-		hysdn_addlog(card, "config close for uid=%d gid=%d mode=0x%x",
-			     filep->f_cred->fsuid, filep->f_cred->fsgid,
-			     filep->f_mode);
-
-	if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_WRITE) {
-		/* write only access -> write boot file or conf line */
-		if (filep->private_data) {
-			cnf = filep->private_data;
-
-			if (cnf->state == CONF_STATE_POF)
-				retval = pof_write_close(cnf->card);	/* close the pof write */
-			kfree(filep->private_data);	/* free allocated memory for buffer */
-
-		}		/* handle write private data */
-	} else if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
-		/* read access -> output card info data */
-
-		kfree(filep->private_data);	/* release memory */
-	}
-	mutex_unlock(&hysdn_conf_mutex);
-	return (retval);
-}				/* hysdn_conf_close */
-
-/******************************************************/
-/* table for conf filesystem functions defined above. */
-/******************************************************/
-static const struct file_operations conf_fops =
-{
-	.owner		= THIS_MODULE,
-	.llseek         = no_llseek,
-	.read           = hysdn_conf_read,
-	.write          = hysdn_conf_write,
-	.open           = hysdn_conf_open,
-	.release        = hysdn_conf_close,
-};
-
-/*****************************/
-/* hysdn subdir in /proc/net */
-/*****************************/
-struct proc_dir_entry *hysdn_proc_entry = NULL;
-
-/*******************************************************************************/
-/* hysdn_procconf_init is called when the module is loaded and after the cards */
-/* have been detected. The needed proc dir and card config files are created.  */
-/* The log init is called at last.                                             */
-/*******************************************************************************/
-int
-hysdn_procconf_init(void)
-{
-	hysdn_card *card;
-	unsigned char conf_name[20];
-
-	hysdn_proc_entry = proc_mkdir(PROC_SUBDIR_NAME, init_net.proc_net);
-	if (!hysdn_proc_entry) {
-		printk(KERN_ERR "HYSDN: unable to create hysdn subdir\n");
-		return (-1);
-	}
-	card = card_root;	/* point to first card */
-	while (card) {
-
-		sprintf(conf_name, "%s%d", PROC_CONF_BASENAME, card->myid);
-		if ((card->procconf = (void *) proc_create_data(conf_name,
-							   S_IFREG | S_IRUGO | S_IWUSR,
-							   hysdn_proc_entry,
-							   &conf_fops,
-							   card)) != NULL) {
-			hysdn_proclog_init(card);	/* init the log file entry */
-		}
-		card = card->next;	/* next entry */
-	}
-
-	printk(KERN_NOTICE "HYSDN: procfs initialised\n");
-	return 0;
-}				/* hysdn_procconf_init */
-
-/*************************************************************************************/
-/* hysdn_procconf_release is called when the module is unloaded and before the cards */
-/* resources are released. The module counter is assumed to be 0 !                   */
-/*************************************************************************************/
-void
-hysdn_procconf_release(void)
-{
-	hysdn_card *card;
-	unsigned char conf_name[20];
-
-	card = card_root;	/* start with first card */
-	while (card) {
-
-		sprintf(conf_name, "%s%d", PROC_CONF_BASENAME, card->myid);
-		if (card->procconf)
-			remove_proc_entry(conf_name, hysdn_proc_entry);
-
-		hysdn_proclog_release(card);	/* init the log file entry */
-
-		card = card->next;	/* point to next card */
-	}
-
-	remove_proc_entry(PROC_SUBDIR_NAME, init_net.proc_net);
-}
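
Note on the removed config-line handling: process_line() accepts two kinds of lines. A line of the form "-c<digits>" selects the channel for later output, a line starting with "*" is sent to the card without the leading "*", and everything else is ignored. The following is a minimal userspace sketch of that classification. enum conf_kind and classify_line() are hypothetical names, and the range check is done inside the digit loop so the accumulator cannot overflow, which differs from the removed code (it checked only after the loop).

```c
#include <assert.h>
#include <stddef.h>

enum conf_kind { CONF_IGNORED, CONF_CHANNEL, CONF_SEND, CONF_BAD_CHANNEL };

/*
 * Classify one NUL-terminated config line.
 * "-c<digits>" selects a channel (stored in *channel),
 * "*<text>" is a line to send (*payload points past the '*'),
 * anything else is ignored.
 */
static enum conf_kind classify_line(const char *line, unsigned short *channel,
				    const char **payload)
{
	long n = 0;

	assert(line && channel && payload);
	if (line[0] == '-') {
		const char *cp = line + 1;

		if (*cp++ != 'c')
			return CONF_IGNORED;	/* unknown option */
		while (*cp >= '0' && *cp <= '9') {
			n = n * 10 + (*cp++ - '0');
			if (n > 65535)
				return CONF_BAD_CHANNEL; /* stop before overflow */
		}
		*channel = (unsigned short)n;
		return CONF_CHANNEL;
	}
	if (line[0] == '*') {
		*payload = line + 1;	/* text without the leading '*' */
		return CONF_SEND;
	}
	return CONF_IGNORED;
}
```

As in the removed code, "-c" with no digits selects channel 0, and an out-of-range channel leaves the previous selection untouched.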
diff --git a/drivers/staging/isdn/hysdn/hysdn_proclog.c b/drivers/staging/isdn/hysdn/hysdn_proclog.c
deleted file mode 100644
index 6e898b90e86e..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_proclog.c
+++ /dev/null
@@ -1,357 +0,0 @@
-/* $Id: hysdn_proclog.c,v 1.9.6.3 2001/09/23 22:24:54 kai Exp $
- *
- * Linux driver for HYSDN cards, /proc/net filesystem log functions.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/module.h>
-#include <linux/poll.h>
-#include <linux/proc_fs.h>
-#include <linux/sched.h>
-#include <linux/slab.h>
-#include <linux/mutex.h>
-#include <linux/kernel.h>
-
-#include "hysdn_defs.h"
-
-/* the proc subdir for the interface is defined in the procconf module */
-extern struct proc_dir_entry *hysdn_proc_entry;
-
-static DEFINE_MUTEX(hysdn_log_mutex);
-static void put_log_buffer(hysdn_card *card, char *cp);
-
-/*************************************************/
-/* structure keeping ascii log for device output */
-/*************************************************/
-struct log_data {
-	struct log_data *next;
-	unsigned long usage_cnt;/* number of files still to work */
-	void *proc_ctrl;	/* pointer to own control procdata structure */
-	char log_start[2];	/* log string start (final len aligned by size) */
-};
-
-/**********************************************/
-/* structure holding proc entrys for one card */
-/**********************************************/
-struct procdata {
-	struct proc_dir_entry *log;	/* log entry */
-	char log_name[15];	/* log filename */
-	struct log_data *log_head, *log_tail;	/* head and tail for queue */
-	int if_used;		/* open count for interface */
-	unsigned char logtmp[LOG_MAX_LINELEN];
-	wait_queue_head_t rd_queue;
-};
-
-
-/**********************************************/
-/* log function for cards error log interface */
-/**********************************************/
-void
-hysdn_card_errlog(hysdn_card *card, tErrLogEntry *logp, int maxsize)
-{
-	char buf[ERRLOG_TEXT_SIZE + 40];
-
-	sprintf(buf, "LOG 0x%08lX 0x%08lX : %s\n", logp->ulErrType, logp->ulErrSubtype, logp->ucText);
-	put_log_buffer(card, buf);	/* output the string */
-}				/* hysdn_card_errlog */
-
-/***************************************************/
-/* Log function using format specifiers for output */
-/***************************************************/
-void
-hysdn_addlog(hysdn_card *card, char *fmt, ...)
-{
-	struct procdata *pd = card->proclog;
-	char *cp;
-	va_list args;
-
-	if (!pd)
-		return;		/* log structure non existent */
-
-	cp = pd->logtmp;
-	cp += sprintf(cp, "HYSDN: card %d ", card->myid);
-
-	va_start(args, fmt);
-	cp += vsprintf(cp, fmt, args);
-	va_end(args);
-	*cp++ = '\n';
-	*cp = 0;
-
-	if (card->debug_flags & DEB_OUT_SYSLOG)
-		printk(KERN_INFO "%s", pd->logtmp);
-	else
-		put_log_buffer(card, pd->logtmp);
-
-}				/* hysdn_addlog */
-
-/********************************************/
-/* put an log buffer into the log queue.    */
-/* This buffer will be kept until all files */
-/* opened for read got the contents.        */
-/* Flushes buffers not longer in use.       */
-/********************************************/
-static void
-put_log_buffer(hysdn_card *card, char *cp)
-{
-	struct log_data *ib;
-	struct procdata *pd = card->proclog;
-	unsigned long flags;
-
-	if (!pd)
-		return;
-	if (!cp)
-		return;
-	if (!*cp)
-		return;
-	if (pd->if_used <= 0)
-		return;		/* no open file for read */
-
-	if (!(ib = kmalloc(sizeof(struct log_data) + strlen(cp), GFP_ATOMIC)))
-		return;	/* no memory */
-	strcpy(ib->log_start, cp);	/* set output string */
-	ib->next = NULL;
-	ib->proc_ctrl = pd;	/* point to own control structure */
-	spin_lock_irqsave(&card->hysdn_lock, flags);
-	ib->usage_cnt = pd->if_used;
-	if (!pd->log_head)
-		pd->log_head = ib;	/* new head */
-	else
-		pd->log_tail->next = ib;	/* follows existing messages */
-	pd->log_tail = ib;	/* new tail */
-
-	/* delete old entrys */
-	while (pd->log_head->next) {
-		if ((pd->log_head->usage_cnt <= 0) &&
-		    (pd->log_head->next->usage_cnt <= 0)) {
-			ib = pd->log_head;
-			pd->log_head = pd->log_head->next;
-			kfree(ib);
-		} else {
-			break;
-		}
-	}		/* pd->log_head->next */
-
-	spin_unlock_irqrestore(&card->hysdn_lock, flags);
-
-	wake_up_interruptible(&(pd->rd_queue));		/* announce new entry */
-}				/* put_log_buffer */
-
-
-/******************************/
-/* file operations and tables */
-/******************************/
-
-/****************************************/
-/* write log file -> set log level bits */
-/****************************************/
-static ssize_t
-hysdn_log_write(struct file *file, const char __user *buf, size_t count, loff_t *off)
-{
-	int rc;
-	hysdn_card *card = file->private_data;
-
-	rc = kstrtoul_from_user(buf, count, 0, &card->debug_flags);
-	if (rc < 0)
-		return rc;
-	hysdn_addlog(card, "debug set to 0x%lx", card->debug_flags);
-	return (count);
-}				/* hysdn_log_write */
-
-/******************/
-/* read log file */
-/******************/
-static ssize_t
-hysdn_log_read(struct file *file, char __user *buf, size_t count, loff_t *off)
-{
-	struct log_data *inf;
-	int len;
-	hysdn_card *card = PDE_DATA(file_inode(file));
-
-	if (!(inf = *((struct log_data **) file->private_data))) {
-		struct procdata *pd = card->proclog;
-		if (file->f_flags & O_NONBLOCK)
-			return (-EAGAIN);
-
-		wait_event_interruptible(pd->rd_queue, (inf =
-				*((struct log_data **) file->private_data)));
-	}
-	if (!inf)
-		return (0);
-
-	inf->usage_cnt--;	/* new usage count */
-	file->private_data = &inf->next;	/* next structure */
-	if ((len = strlen(inf->log_start)) <= count) {
-		if (copy_to_user(buf, inf->log_start, len))
-			return -EFAULT;
-		*off += len;
-		return (len);
-	}
-	return (0);
-}				/* hysdn_log_read */
-
-/******************/
-/* open log file */
-/******************/
-static int
-hysdn_log_open(struct inode *ino, struct file *filep)
-{
-	hysdn_card *card = PDE_DATA(ino);
-
-	mutex_lock(&hysdn_log_mutex);
-	if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_WRITE) {
-		/* write only access -> write log level only */
-		filep->private_data = card;	/* remember our own card */
-	} else if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_READ) {
-		struct procdata *pd = card->proclog;
-		unsigned long flags;
-
-		/* read access -> log/debug read */
-		spin_lock_irqsave(&card->hysdn_lock, flags);
-		pd->if_used++;
-		if (pd->log_head)
-			filep->private_data = &pd->log_tail->next;
-		else
-			filep->private_data = &pd->log_head;
-		spin_unlock_irqrestore(&card->hysdn_lock, flags);
-	} else {		/* simultaneous read/write access forbidden ! */
-		mutex_unlock(&hysdn_log_mutex);
-		return (-EPERM);	/* no permission this time */
-	}
-	mutex_unlock(&hysdn_log_mutex);
-	return nonseekable_open(ino, filep);
-}				/* hysdn_log_open */
-
-/*******************************************************************************/
-/* close a cardlog file. If the file has been opened for exclusive write it is */
-/* assumed as pof data input and the pof loader is noticed about.              */
-/* Otherwise file is handled as log output. In this case the interface usage   */
-/* count is decremented and all buffers are noticed of closing. If this file   */
-/* was the last one to be closed, all buffers are freed.                       */
-/*******************************************************************************/
-static int
-hysdn_log_close(struct inode *ino, struct file *filep)
-{
-	struct log_data *inf;
-	struct procdata *pd;
-	hysdn_card *card;
-	int retval = 0;
-
-	mutex_lock(&hysdn_log_mutex);
-	if ((filep->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_WRITE) {
-		/* write only access -> write debug level written */
-		retval = 0;	/* success */
-	} else {
-		/* read access -> log/debug read, mark one further file as closed */
-
-		inf = *((struct log_data **) filep->private_data);	/* get first log entry */
-		if (inf)
-			pd = (struct procdata *) inf->proc_ctrl;	/* still entries there */
-		else {
-			/* no info available -> search card */
-			card = PDE_DATA(file_inode(filep));
-			pd = card->proclog;	/* pointer to procfs log */
-		}
-		if (pd)
-			pd->if_used--;	/* decrement interface usage count by one */
-
-		while (inf) {
-			inf->usage_cnt--;	/* decrement usage count for buffers */
-			inf = inf->next;
-		}
-
-		if (pd)
-			if (pd->if_used <= 0)	/* delete buffers if last file closed */
-				while (pd->log_head) {
-					inf = pd->log_head;
-					pd->log_head = pd->log_head->next;
-					kfree(inf);
-				}
-	}			/* read access */
-	mutex_unlock(&hysdn_log_mutex);
-
-	return (retval);
-}				/* hysdn_log_close */
-
-/*************************************************/
-/* select/poll routine to be able using select() */
-/*************************************************/
-static __poll_t
-hysdn_log_poll(struct file *file, poll_table *wait)
-{
-	__poll_t mask = 0;
-	hysdn_card *card = PDE_DATA(file_inode(file));
-	struct procdata *pd = card->proclog;
-
-	if ((file->f_mode & (FMODE_READ | FMODE_WRITE)) == FMODE_WRITE)
-		return (mask);	/* no polling for write supported */
-
-	poll_wait(file, &(pd->rd_queue), wait);
-
-	if (*((struct log_data **) file->private_data))
-		mask |= EPOLLIN | EPOLLRDNORM;
-
-	return mask;
-}				/* hysdn_log_poll */
-
-/**************************************************/
-/* table for log filesystem functions defined above. */
-/**************************************************/
-static const struct file_operations log_fops =
-{
-	.owner		= THIS_MODULE,
-	.llseek         = no_llseek,
-	.read           = hysdn_log_read,
-	.write          = hysdn_log_write,
-	.poll           = hysdn_log_poll,
-	.open           = hysdn_log_open,
-	.release        = hysdn_log_close,
-};
-
-
-/***********************************************************************************/
-/* hysdn_proclog_init is called when the module is loaded after creating the cards */
-/* conf files.                                                                     */
-/***********************************************************************************/
-int
-hysdn_proclog_init(hysdn_card *card)
-{
-	struct procdata *pd;
-
-	/* create a cardlog proc entry */
-
-	if ((pd = kzalloc(sizeof(struct procdata), GFP_KERNEL)) != NULL) {
-		sprintf(pd->log_name, "%s%d", PROC_LOG_BASENAME, card->myid);
-		pd->log = proc_create_data(pd->log_name,
-				      S_IFREG | S_IRUGO | S_IWUSR, hysdn_proc_entry,
-				      &log_fops, card);
-
-		init_waitqueue_head(&(pd->rd_queue));
-
-		card->proclog = (void *) pd;	/* remember procfs structure */
-	}
-	return (0);
-}				/* hysdn_proclog_init */
-
-/************************************************************************************/
-/* hysdn_proclog_release is called when the module is unloaded and before the cards */
-/* conf file is released                                                            */
-/* The module counter is assumed to be 0 !                                          */
-/************************************************************************************/
-void
-hysdn_proclog_release(hysdn_card *card)
-{
-	struct procdata *pd;
-
-	if ((pd = (struct procdata *) card->proclog) != NULL) {
-		if (pd->log)
-			remove_proc_entry(pd->log_name, hysdn_proc_entry);
-		kfree(pd);	/* release memory */
-		card->proclog = NULL;
-	}
-}				/* hysdn_proclog_release */
diff --git a/drivers/staging/isdn/hysdn/hysdn_sched.c b/drivers/staging/isdn/hysdn/hysdn_sched.c
deleted file mode 100644
index 31d7c1415543..000000000000
--- a/drivers/staging/isdn/hysdn/hysdn_sched.c
+++ /dev/null
@@ -1,197 +0,0 @@
-/* $Id: hysdn_sched.c,v 1.5.6.4 2001/11/06 21:58:19 kai Exp $
- *
- * Linux driver for HYSDN cards
- * scheduler routines for handling exchange card <-> pc.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#include <linux/signal.h>
-#include <linux/kernel.h>
-#include <linux/ioport.h>
-#include <linux/interrupt.h>
-#include <linux/delay.h>
-#include <asm/io.h>
-
-#include "hysdn_defs.h"
-
-/*****************************************************************************/
-/* hysdn_sched_rx is called from the cards handler to announce new data is   */
-/* available from the card. The routine has to handle the data and return    */
-/* with a nonzero code if the data could be worked (or even thrown away), if */
-/* no room to buffer the data is available a zero return tells the card      */
-/* to keep the data until later.                                             */
-/*****************************************************************************/
-int
-hysdn_sched_rx(hysdn_card *card, unsigned char *buf, unsigned short len,
-	       unsigned short chan)
-{
-
-	switch (chan) {
-	case CHAN_NDIS_DATA:
-		if (hynet_enable & (1 << card->myid)) {
-			/* give packet to network handler */
-			hysdn_rx_netpkt(card, buf, len);
-		}
-		break;
-
-	case CHAN_ERRLOG:
-		hysdn_card_errlog(card, (tErrLogEntry *) buf, len);
-		if (card->err_log_state == ERRLOG_STATE_ON)
-			card->err_log_state = ERRLOG_STATE_START;	/* start new fetch */
-		break;
-#ifdef CONFIG_HYSDN_CAPI
-	case CHAN_CAPI:
-/* give packet to CAPI handler */
-		if (hycapi_enable & (1 << card->myid)) {
-			hycapi_rx_capipkt(card, buf, len);
-		}
-		break;
-#endif /* CONFIG_HYSDN_CAPI */
-	default:
-		printk(KERN_INFO "irq message channel %d len %d unhandled \n", chan, len);
-		break;
-
-	}			/* switch rx channel */
-
-	return (1);		/* always handled */
-}				/* hysdn_sched_rx */
-
-/*****************************************************************************/
-/* hysdn_sched_tx is called from the cards handler to announce that there is */
-/* room in the tx-buffer to the card and data may be sent if needed.         */
-/* If the routine wants to send data it must fill buf, len and chan with the */
-/* appropriate data and return a nonzero value. With a zero return no new    */
-/* data to send is assumed. maxlen specifies the buffer size available for   */
-/* sending.                                                                  */
-/*****************************************************************************/
-int
-hysdn_sched_tx(hysdn_card *card, unsigned char *buf,
-	       unsigned short volatile *len, unsigned short volatile *chan,
-	       unsigned short maxlen)
-{
-	struct sk_buff *skb;
-
-	if (card->net_tx_busy) {
-		card->net_tx_busy = 0;	/* reset flag */
-		hysdn_tx_netack(card);	/* acknowledge packet send */
-	}			/* a network packet has completely been transferred */
-	/* first of all async requests are handled */
-	if (card->async_busy) {
-		if (card->async_len <= maxlen) {
-			memcpy(buf, card->async_data, card->async_len);
-			*len = card->async_len;
-			*chan = card->async_channel;
-			card->async_busy = 0;	/* reset request */
-			return (1);
-		}
-		card->async_busy = 0;	/* in case of length error */
-	}			/* async request */
-	if ((card->err_log_state == ERRLOG_STATE_START) &&
-	    (maxlen >= ERRLOG_CMD_REQ_SIZE)) {
-		strcpy(buf, ERRLOG_CMD_REQ);	/* copy the command */
-		*len = ERRLOG_CMD_REQ_SIZE;	/* buffer length */
-		*chan = CHAN_ERRLOG;	/* and channel */
-		card->err_log_state = ERRLOG_STATE_ON;	/* new state is on */
-		return (1);	/* tell that data should be send */
-	}			/* error log start and able to send */
-	if ((card->err_log_state == ERRLOG_STATE_STOP) &&
-	    (maxlen >= ERRLOG_CMD_STOP_SIZE)) {
-		strcpy(buf, ERRLOG_CMD_STOP);	/* copy the command */
-		*len = ERRLOG_CMD_STOP_SIZE;	/* buffer length */
-		*chan = CHAN_ERRLOG;	/* and channel */
-		card->err_log_state = ERRLOG_STATE_OFF;		/* new state is off */
-		return (1);	/* tell that data should be send */
-	}			/* error log start and able to send */
-	/* now handle network interface packets */
-	if ((hynet_enable & (1 << card->myid)) &&
-	    (skb = hysdn_tx_netget(card)) != NULL)
-	{
-		if (skb->len <= maxlen) {
-			/* copy the packet to the buffer */
-			skb_copy_from_linear_data(skb, buf, skb->len);
-			*len = skb->len;
-			*chan = CHAN_NDIS_DATA;
-			card->net_tx_busy = 1;	/* we are busy sending network data */
-			return (1);	/* go and send the data */
-		} else
-			hysdn_tx_netack(card);	/* aknowledge packet -> throw away */
-	}			/* send a network packet if available */
-#ifdef CONFIG_HYSDN_CAPI
-	if (((hycapi_enable & (1 << card->myid))) &&
-	    ((skb = hycapi_tx_capiget(card)) != NULL))
-	{
-		if (skb->len <= maxlen) {
-			skb_copy_from_linear_data(skb, buf, skb->len);
-			*len = skb->len;
-			*chan = CHAN_CAPI;
-			hycapi_tx_capiack(card);
-			return (1);	/* go and send the data */
-		}
-	}
-#endif /* CONFIG_HYSDN_CAPI */
-	return (0);		/* nothing to send */
-}				/* hysdn_sched_tx */
-
-
-/*****************************************************************************/
-/* send one config line to the card and return 0 if successful, otherwise a */
-/* negative error code.                                                      */
-/* The function works with timeouts perhaps not giving the greatest speed    */
-/* sending the line, but this should be meaningless because only some lines  */
-/* are to be sent and this happens very seldom.                              */
-/*****************************************************************************/
-int
-hysdn_tx_cfgline(hysdn_card *card, unsigned char *line, unsigned short chan)
-{
-	int cnt = 50;		/* timeout intervalls */
-	unsigned long flags;
-
-	if (card->debug_flags & LOG_SCHED_ASYN)
-		hysdn_addlog(card, "async tx-cfg chan=%d len=%d", chan, strlen(line) + 1);
-
-	while (card->async_busy) {
-
-		if (card->debug_flags & LOG_SCHED_ASYN)
-			hysdn_addlog(card, "async tx-cfg delayed");
-
-		msleep_interruptible(20);		/* Timeout 20ms */
-		if (!--cnt)
-			return (-ERR_ASYNC_TIME);	/* timed out */
-	}			/* wait for buffer to become free */
-
-	spin_lock_irqsave(&card->hysdn_lock, flags);
-	strcpy(card->async_data, line);
-	card->async_len = strlen(line) + 1;
-	card->async_channel = chan;
-	card->async_busy = 1;	/* request transfer */
-
-	/* now queue the task */
-	schedule_work(&card->irq_queue);
-	spin_unlock_irqrestore(&card->hysdn_lock, flags);
-
-	if (card->debug_flags & LOG_SCHED_ASYN)
-		hysdn_addlog(card, "async tx-cfg data queued");
-
-	cnt++;			/* short delay */
-
-	while (card->async_busy) {
-
-		if (card->debug_flags & LOG_SCHED_ASYN)
-			hysdn_addlog(card, "async tx-cfg waiting for tx-ready");
-
-		msleep_interruptible(20);		/* Timeout 20ms */
-		if (!--cnt)
-			return (-ERR_ASYNC_TIME);	/* timed out */
-	}			/* wait for buffer to become free again */
-
-	if (card->debug_flags & LOG_SCHED_ASYN)
-		hysdn_addlog(card, "async tx-cfg data send");
-
-	return (0);		/* line send correctly */
-}				/* hysdn_tx_cfgline */
diff --git a/drivers/staging/isdn/hysdn/ince1pc.h b/drivers/staging/isdn/hysdn/ince1pc.h
deleted file mode 100644
index cab68361de65..000000000000
--- a/drivers/staging/isdn/hysdn/ince1pc.h
+++ /dev/null
@@ -1,134 +0,0 @@
-/*
- * Linux driver for HYSDN cards
- * common definitions for both sides of the bus:
- * - conventions both spoolers must know
- * - channel numbers agreed upon
- *
- * Author    M. Steinkopf
- * Copyright 1999 by M. Steinkopf
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#ifndef __INCE1PC_H__
-#define __INCE1PC_H__
-
-/*  basic scalar definitions have same meanning,
- *  but their declaration location depends on environment
- */
-
-/*--------------------------------------channel numbers---------------------*/
-#define CHAN_SYSTEM     0x0001      /* system channel (spooler to spooler) */
-#define CHAN_ERRLOG     0x0005      /* error logger */
-#define CHAN_CAPI       0x0064      /* CAPI interface */
-#define CHAN_NDIS_DATA  0x1001      /* NDIS data transfer */
-
-/*--------------------------------------POF ready msg-----------------------*/
-/* NOTE: after booting POF sends system ready message to PC: */
-#define RDY_MAGIC       0x52535953UL    /* 'SYSR' reversed */
-#define RDY_MAGIC_SIZE  4               /* size in bytes */
-
-#define MAX_N_TOK_BYTES 255
-
-#define MIN_RDY_MSG_SIZE    RDY_MAGIC_SIZE
-#define MAX_RDY_MSG_SIZE    (RDY_MAGIC_SIZE + MAX_N_TOK_BYTES)
-
-#define SYSR_TOK_END            0
-#define SYSR_TOK_B_CHAN         1   /* nr. of B-Channels;   DataLen=1; def: 2 */
-#define SYSR_TOK_FAX_CHAN       2   /* nr. of FAX Channels; DataLen=1; def: 0 */
-#define SYSR_TOK_MAC_ADDR       3   /* MAC-Address; DataLen=6; def: auto */
-#define SYSR_TOK_ESC            255 /* undefined data size yet */
-/* default values, if not corrected by token: */
-#define SYSR_TOK_B_CHAN_DEF     2   /* assume 2 B-Channels */
-#define SYSR_TOK_FAX_CHAN_DEF   1   /* assume 1 FAX Channel */
-
-/*  syntax of new SYSR token stream:
- *  channel: CHAN_SYSTEM
- *  msgsize: MIN_RDY_MSG_SIZE <= x <= MAX_RDY_MSG_SIZE
- *           RDY_MAGIC_SIZE   <= x <= (RDY_MAGIC_SIZE+MAX_N_TOK_BYTES)
- *  msg    : 0 1 2 3 {4 5 6 ..}
- *           S Y S R  MAX_N_TOK_BYTES bytes of TokenStream
- *
- *  TokenStream     :=   empty
- *                     | {NonEndTokenChunk} EndToken RotlCRC
- *  NonEndTokenChunk:= NonEndTokenId DataLen [Data]
- *  NonEndTokenId   := 0x01 .. 0xFE                 1 BYTE
- *  DataLen         := 0x00 .. 0xFF                 1 BYTE
- *  Data            := DataLen bytes
- *  EndToken        := 0x00
- *  RotlCRC         := special 1 byte CRC over all NonEndTokenChunk bytes
- *                     s. RotlCRC algorithm
- *
- *  RotlCRC algorithm:
- *      ucSum= 0                        1 unsigned char
- *      for all NonEndTokenChunk bytes:
- *          ROTL(ucSum,1)               rotate left by 1
- *          ucSum += Char;              add current byte with swap around
- *      RotlCRC= ~ucSum;                invert all bits for result
- *
- *  note:
- *  - for 16-bit FIFO add padding 0 byte to achieve even token data bytes!
- */
-
-/*--------------------------------------error logger------------------------*/
-/* note: pof needs final 0 ! */
-#define ERRLOG_CMD_REQ          "ERRLOG ON"
-#define ERRLOG_CMD_REQ_SIZE     10              /* with final 0 byte ! */
-#define ERRLOG_CMD_STOP         "ERRLOG OFF"
-#define ERRLOG_CMD_STOP_SIZE    11              /* with final 0 byte ! */
-
-#define ERRLOG_ENTRY_SIZE       64      /* sizeof(tErrLogEntry) */
-					/* remaining text size = 55 */
-#define ERRLOG_TEXT_SIZE    (ERRLOG_ENTRY_SIZE - 2 * 4 - 1)
-
-typedef struct ErrLogEntry_tag {
-
-	/*00 */ unsigned long ulErrType;
-
-	/*04 */ unsigned long ulErrSubtype;
-
-	/*08 */ unsigned char ucTextSize;
-
-	/*09 */ unsigned char ucText[ERRLOG_TEXT_SIZE];
-	/* ASCIIZ of len ucTextSize-1 */
-
-/*40 */
-} tErrLogEntry;
-
-
-#if defined(__TURBOC__)
-#if sizeof(tErrLogEntry) != ERRLOG_ENTRY_SIZE
-#error size of tErrLogEntry != ERRLOG_ENTRY_SIZE
-#endif				/*  */
-#endif				/*  */
-
-/*--------------------------------------DPRAM boot spooler------------------*/
-/*  this is the struture used between pc and
- *  hyperstone to exchange boot data
- */
-#define DPRAM_SPOOLER_DATA_SIZE 0x20
-typedef struct DpramBootSpooler_tag {
-
-	/*00 */ unsigned char Len;
-
-	/*01 */ volatile unsigned char RdPtr;
-
-	/*02 */ unsigned char WrPtr;
-
-	/*03 */ unsigned char Data[DPRAM_SPOOLER_DATA_SIZE];
-
-/*23 */
-} tDpramBootSpooler;
-
-
-#define DPRAM_SPOOLER_MIN_SIZE  5       /* Len+RdPtr+Wrptr+2*data */
-#define DPRAM_SPOOLER_DEF_SIZE  0x23    /* current default size   */
-
-/*--------------------------------------HYCARD/ERGO DPRAM SoftUart----------*/
-/* at DPRAM offset 0x1C00: */
-#define SIZE_RSV_SOFT_UART  0x1B0   /* 432 bytes reserved for SoftUart */
-
-
-#endif	/* __INCE1PC_H__ */
diff --git a/include/linux/b1pcmcia.h b/include/linux/b1pcmcia.h
deleted file mode 100644
index 12a867c6061e..000000000000
--- a/include/linux/b1pcmcia.h
+++ /dev/null
@@ -1,21 +0,0 @@
-/* $Id: b1pcmcia.h,v 1.1.8.2 2001/09/23 22:25:05 kai Exp $
- *
- * Exported functions of module b1pcmcia to be called by
- * avm_cs card services module.
- *
- * Copyright 1999 by Carsten Paeth (calle@calle.in-berlin.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#ifndef _B1PCMCIA_H_
-#define _B1PCMCIA_H_
-
-int b1pcmcia_addcard_b1(unsigned int port, unsigned irq);
-int b1pcmcia_addcard_m1(unsigned int port, unsigned irq);
-int b1pcmcia_addcard_m2(unsigned int port, unsigned irq);
-int b1pcmcia_delcard(unsigned int port, unsigned irq);
-
-#endif	/* _B1PCMCIA_H_ */
diff --git a/include/uapi/linux/gigaset_dev.h b/include/uapi/linux/gigaset_dev.h
deleted file mode 100644
index 279551af8ebf..000000000000
--- a/include/uapi/linux/gigaset_dev.h
+++ /dev/null
@@ -1,39 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
-/*
- * interface to user space for the gigaset driver
- *
- * Copyright (c) 2004 by Hansjoerg Lipp <hjlipp@web.de>
- *
- * =====================================================================
- *    This program is free software; you can redistribute it and/or
- *    modify it under the terms of the GNU General Public License as
- *    published by the Free Software Foundation; either version 2 of
- *    the License, or (at your option) any later version.
- * =====================================================================
- */
-
-#ifndef GIGASET_INTERFACE_H
-#define GIGASET_INTERFACE_H
-
-#include <linux/ioctl.h>
-
-/* The magic IOCTL value for this interface. */
-#define GIGASET_IOCTL 0x47
-
-/* enable/disable device control via character device (lock out ISDN subsys) */
-#define GIGASET_REDIR    _IOWR(GIGASET_IOCTL, 0, int)
-
-/* enable adapter configuration mode (M10x only) */
-#define GIGASET_CONFIG   _IOWR(GIGASET_IOCTL, 1, int)
-
-/* set break characters (M105 only) */
-#define GIGASET_BRKCHARS _IOW(GIGASET_IOCTL, 2, unsigned char[6])
-
-/* get version information selected by arg[0] */
-#define GIGASET_VERSION  _IOWR(GIGASET_IOCTL, 3, unsigned[4])
-/* values for GIGASET_VERSION arg[0] */
-#define GIGVER_DRIVER 0		/* get driver version */
-#define GIGVER_COMPAT 1		/* get interface compatibility version */
-#define GIGVER_FWBASE 2		/* get base station firmware version */
-
-#endif
diff --git a/include/uapi/linux/hysdn_if.h b/include/uapi/linux/hysdn_if.h
deleted file mode 100644
index 99f77c517e6d..000000000000
--- a/include/uapi/linux/hysdn_if.h
+++ /dev/null
@@ -1,34 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
-/* $Id: hysdn_if.h,v 1.1.8.3 2001/09/23 22:25:05 kai Exp $
- *
- * Linux driver for HYSDN cards
- * ioctl definitions shared by hynetmgr and driver.
- *
- * Author    Werner Cornelius (werner@titro.de) for Hypercope GmbH
- * Copyright 1999 by Werner Cornelius (werner@titro.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-/****************/
-/* error values */
-/****************/
-#define ERR_NONE             0 /* no error occurred */
-#define ERR_ALREADY_BOOT  1000 /* we are already booting */
-#define EPOF_BAD_MAGIC    1001 /* bad magic in POF header */
-#define ERR_BOARD_DPRAM   1002 /* board DPRAM failed */
-#define EPOF_INTERNAL     1003 /* internal POF handler error */
-#define EPOF_BAD_IMG_SIZE 1004 /* POF boot image size invalid */
-#define ERR_BOOTIMG_FAIL  1005 /* 1. stage boot image did not start */
-#define ERR_BOOTSEQ_FAIL  1006 /* 2. stage boot seq handshake timeout */
-#define ERR_POF_TIMEOUT   1007 /* timeout waiting for card pof ready */
-#define ERR_NOT_BOOTED    1008 /* operation only allowed when booted */
-#define ERR_CONF_LONG     1009 /* conf line is too long */ 
-#define ERR_INV_CHAN      1010 /* invalid channel number */ 
-#define ERR_ASYNC_TIME    1011 /* timeout sending async data */ 
-
-
-
-


From f59aba2f75795e5b6a4f1aa31f3e20d7b71ca804 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Tue, 10 Dec 2019 21:59:16 +0100
Subject: isdn: capi: dead code removal

The staging isdn drivers are gone, and CONFIG_BT_CMTP is now
the only user. This means a lot of the code in the subsystem
has no remaining callers and can be removed.

Change the capi user space front-end to be part of kernelcapi,
and the combined module to only be compiled if BT_CMTP is
also enabled, then remove the interfaces that have no remaining
callers.

As the notifier list and the capi_drivers list have no callers
outside of kcapi.c, the implementation gets much simpler.

Some definitions from the include/linux/*.h headers are only
needed internally and are moved to kcapi.h.

Acked-by: David Miller <davem@davemloft.net>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Link: https://lore.kernel.org/r/20191210210455.3475361-2-arnd@arndb.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 Documentation/isdn/interface_capi.rst |  71 ------
 drivers/isdn/Makefile                 |   2 +-
 drivers/isdn/capi/Kconfig             |  32 +--
 drivers/isdn/capi/Makefile            |  18 +-
 drivers/isdn/capi/capi.c              |  14 +-
 drivers/isdn/capi/capilib.c           | 202 ---------------
 drivers/isdn/capi/capiutil.c          | 231 +----------------
 drivers/isdn/capi/kcapi.c             | 409 +-----------------------------
 drivers/isdn/capi/kcapi.h             | 149 ++++++++++-
 drivers/isdn/capi/kcapi_proc.c        |  34 +--
 include/linux/isdn/capilli.h          |  18 --
 include/linux/isdn/capiutil.h         | 456 ----------------------------------
 include/linux/kernelcapi.h            |  75 ------
 include/uapi/linux/b1lli.h            |  74 ------
 14 files changed, 179 insertions(+), 1606 deletions(-)
 delete mode 100644 drivers/isdn/capi/capilib.c
 delete mode 100644 include/uapi/linux/b1lli.h

diff --git a/Documentation/isdn/interface_capi.rst b/Documentation/isdn/interface_capi.rst
index 01a4b5ade9a4..fe2421444b76 100644
--- a/Documentation/isdn/interface_capi.rst
+++ b/Documentation/isdn/interface_capi.rst
@@ -26,13 +26,6 @@ This standard is freely available from https://www.capi.org.
 2. Driver and Device Registration
 =================================
 
-CAPI drivers optionally register themselves with Kernel CAPI by calling the
-Kernel CAPI function register_capi_driver() with a pointer to a struct
-capi_driver. This structure must be filled with the name and revision of the
-driver, and optionally a pointer to a callback function, add_card(). The
-registration can be revoked by calling the function unregister_capi_driver()
-with a pointer to the same struct capi_driver.
-
 CAPI drivers must register each of the ISDN devices they control with Kernel
 CAPI by calling the Kernel CAPI function attach_capi_ctr() with a pointer to a
 struct capi_ctr before they can be used. This structure must be filled with
@@ -89,9 +82,6 @@ register_capi_driver():
 	the name of the driver, as a zero-terminated ASCII string
 ``char revision[32]``
 	the revision number of the driver, as a zero-terminated ASCII string
-``int (*add_card)(struct capi_driver *driver, capicardparams *data)``
-	a callback function pointer (may be NULL)
-
 
 4.2 struct capi_ctr
 -------------------
@@ -178,12 +168,6 @@ to be set by the driver before calling attach_capi_ctr():
 	pointer to a callback function returning the entry for the device in
 	the CAPI controller info table, /proc/capi/controller
 
-``const struct file_operations *proc_fops``
-	pointers to callback functions for the device's proc file
-	system entry, /proc/capi/controllers/<n>; pointer to the device's
-	capi_ctr structure is available from struct proc_dir_entry::data
-	which is available from struct inode.
-
 Note:
   Callback functions except send_message() are never called in interrupt
   context.
@@ -267,25 +251,10 @@ _cmstruct   alternative representation for CAPI parameters of type 'struct'
 	    _cmsg structure members.
 =========== =================================================================
 
-Functions capi_cmsg2message() and capi_message2cmsg() are provided to convert
-messages between their transport encoding described in the CAPI 2.0 standard
-and their _cmsg structure representation. Note that capi_cmsg2message() does
-not know or check the size of its destination buffer. The caller must make
-sure it is big enough to accommodate the resulting CAPI message.
-
 
 5. Lower Layer Interface Functions
 ==================================
 
-(declared in <linux/isdn/capilli.h>)
-
-::
-
-  void register_capi_driver(struct capi_driver *drvr)
-  void unregister_capi_driver(struct capi_driver *drvr)
-
-register/unregister a driver with Kernel CAPI
-
 ::
 
   int attach_capi_ctr(struct capi_ctr *ctrlr)
@@ -300,13 +269,6 @@ register/unregister a device (controller) with Kernel CAPI
 
 signal controller ready/not ready
 
-::
-
-  void capi_ctr_suspend_output(struct capi_ctr *ctrlr)
-  void capi_ctr_resume_output(struct capi_ctr *ctrlr)
-
-signal suspend/resume
-
 ::
 
   void capi_ctr_handle_message(struct capi_ctr * ctrlr, u16 applid,
@@ -319,21 +281,6 @@ for forwarding to the specified application
 6. Helper Functions and Macros
 ==============================
 
-Library functions (from <linux/isdn/capilli.h>):
-
-::
-
-  void capilib_new_ncci(struct list_head *head, u16 applid,
-			u32 ncci, u32 winsize)
-  void capilib_free_ncci(struct list_head *head, u16 applid, u32 ncci)
-  void capilib_release_appl(struct list_head *head, u16 applid)
-  void capilib_release(struct list_head *head)
-  void capilib_data_b3_conf(struct list_head *head, u16 applid,
-			u32 ncci, u16 msgid)
-  u16  capilib_data_b3_req(struct list_head *head, u16 applid,
-			u32 ncci, u16 msgid)
-
-
 Macros to extract/set element values from/in a CAPI message header
 (from <linux/isdn/capiutil.h>):
 
@@ -357,24 +304,6 @@ CAPIMSG_DATALEN(m)	CAPIMSG_SETDATALEN(m, len)	Data Length (u16)
 Library functions for working with _cmsg structures
 (from <linux/isdn/capiutil.h>):
 
-``unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg)``
-	Assembles a CAPI 2.0 message from the parameters in ``*cmsg``,
-	storing the result in ``*msg``.
-
-``unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg)``
-	Disassembles the CAPI 2.0 message in ``*msg``, storing the parameters
-	in ``*cmsg``.
-
-``unsigned capi_cmsg_header(_cmsg *cmsg, u16 ApplId, u8 Command, u8 Subcommand, u16 Messagenumber, u32 Controller)``
-	Fills the header part and address field of the _cmsg structure ``*cmsg``
-	with the given values, zeroing the remainder of the structure so only
-	parameters with non-default values need to be changed before sending
-	the message.
-
-``void capi_cmsg_answer(_cmsg *cmsg)``
-	Sets the low bit of the Subcommand field in ``*cmsg``, thereby
-	converting ``_REQ`` to ``_CONF`` and ``_IND`` to ``_RESP``.
-
 ``char *capi_cmd2str(u8 Command, u8 Subcommand)``
 	Returns the CAPI 2.0 message name corresponding to the given command
 	and subcommand values, as a static ASCII string. The return value may
diff --git a/drivers/isdn/Makefile b/drivers/isdn/Makefile
index 63baf27a2c79..d14334f4007e 100644
--- a/drivers/isdn/Makefile
+++ b/drivers/isdn/Makefile
@@ -3,6 +3,6 @@
 
 # Object files in subdirectories
 
-obj-$(CONFIG_ISDN_CAPI)			+= capi/
+obj-$(CONFIG_BT_CMTP)			+= capi/
 obj-$(CONFIG_MISDN)			+= mISDN/
 obj-$(CONFIG_ISDN)			+= hardware/
diff --git a/drivers/isdn/capi/Kconfig b/drivers/isdn/capi/Kconfig
index 573fea5500ce..cc408ad9aafb 100644
--- a/drivers/isdn/capi/Kconfig
+++ b/drivers/isdn/capi/Kconfig
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
-menuconfig ISDN_CAPI
-	tristate "CAPI 2.0 subsystem"
+config ISDN_CAPI
+	def_bool ISDN && BT
 	help
 	  This provides CAPI (the Common ISDN Application Programming
 	  Interface) Version 2.0, a standard making it easy for programs to
@@ -15,42 +15,18 @@ menuconfig ISDN_CAPI
 	  See CONFIG_BT_CMTP for the last remaining regular driver
 	  in the kernel that uses the CAPI subsystem.
 
-if ISDN_CAPI
-
 config CAPI_TRACE
-	bool "CAPI trace support"
-	default y
+	def_bool BT_CMTP
 	help
 	  If you say Y here, the kernelcapi driver can make verbose traces
 	  of CAPI messages. This feature can be enabled/disabled via IOCTL for
 	  every controller (default disabled).
-	  This will increase the size of the kernelcapi module by 20 KB.
-	  If unsure, say Y.
-
-config ISDN_CAPI_CAPI20
-	tristate "CAPI2.0 /dev/capi20 support"
-	help
-	  This option will provide the CAPI 2.0 interface to userspace
-	  applications via /dev/capi20. Applications should use the
-	  standardized libcapi20 to access this functionality.  You should say
-	  Y/M here.
 
 config ISDN_CAPI_MIDDLEWARE
-	bool "CAPI2.0 Middleware support"
-	depends on ISDN_CAPI_CAPI20 && TTY
+	def_bool BT_CMTP && TTY
 	help
 	  This option will enhance the capabilities of the /dev/capi20
 	  interface.  It will provide a means of moving a data connection,
 	  established via the usual /dev/capi20 interface to a special tty
 	  device.  If you want to use pppd with pppdcapiplugin to dial up to
 	  your ISP, say Y here.
-
-config ISDN_CAPI_CAPIDRV_VERBOSE
-	bool "Verbose reason code reporting"
-	depends on ISDN_CAPI_CAPIDRV
-	help
-	  If you say Y here, the capidrv interface will give verbose reasons
-	  for disconnecting. This will increase the size of the kernel by 7 KB.
-	  If unsure, say N.
-
-endif
diff --git a/drivers/isdn/capi/Makefile b/drivers/isdn/capi/Makefile
index d299f3e75f89..352217ebabd8 100644
--- a/drivers/isdn/capi/Makefile
+++ b/drivers/isdn/capi/Makefile
@@ -1,17 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-# Makefile for the CAPI subsystem.
+# Makefile for the CAPI subsystem used by BT_CMTP
 
-# Ordering constraints: kernelcapi.o first
-
-# Each configuration option enables a list of files.
-
-obj-$(CONFIG_ISDN_CAPI)			+= kernelcapi.o
-obj-$(CONFIG_ISDN_CAPI_CAPI20)		+= capi.o 
-obj-$(CONFIG_ISDN_CAPI_CAPIDRV)		+= capidrv.o
-
-# Multipart objects.
-
-kernelcapi-y				:= kcapi.o capiutil.o capilib.o
-kernelcapi-$(CONFIG_PROC_FS)		+= kcapi_proc.o
-
-ccflags-y += -I$(srctree)/$(src)/../include -I$(srctree)/$(src)/../include/uapi
+obj-$(CONFIG_BT_CMTP)			+= kernelcapi.o
+kernelcapi-y				:= kcapi.o capiutil.o capi.o kcapi_proc.o
diff --git a/drivers/isdn/capi/capi.c b/drivers/isdn/capi/capi.c
index 1675da34239b..85767f52fe3c 100644
--- a/drivers/isdn/capi/capi.c
+++ b/drivers/isdn/capi/capi.c
@@ -39,7 +39,9 @@
 #include <linux/isdn/capiutil.h>
 #include <linux/isdn/capicmd.h>
 
-MODULE_DESCRIPTION("CAPI4Linux: Userspace /dev/capi20 interface");
+#include "kcapi.h"
+
+MODULE_DESCRIPTION("CAPI4Linux: kernel CAPI layer and /dev/capi20 interface");
 MODULE_AUTHOR("Carsten Paeth");
 MODULE_LICENSE("GPL");
 
@@ -1412,15 +1414,22 @@ static int __init capi_init(void)
 {
 	const char *compileinfo;
 	int major_ret;
+	int ret;
+
+	ret = kcapi_init();
+	if (ret)
+		return ret;
 
 	major_ret = register_chrdev(capi_major, "capi20", &capi_fops);
 	if (major_ret < 0) {
 		printk(KERN_ERR "capi20: unable to get major %d\n", capi_major);
+		kcapi_exit();
 		return major_ret;
 	}
 	capi_class = class_create(THIS_MODULE, "capi");
 	if (IS_ERR(capi_class)) {
 		unregister_chrdev(capi_major, "capi20");
+		kcapi_exit();
 		return PTR_ERR(capi_class);
 	}
 
@@ -1430,6 +1439,7 @@ static int __init capi_init(void)
 		device_destroy(capi_class, MKDEV(capi_major, 0));
 		class_destroy(capi_class);
 		unregister_chrdev(capi_major, "capi20");
+		kcapi_exit();
 		return -ENOMEM;
 	}
 
@@ -1455,6 +1465,8 @@ static void __exit capi_exit(void)
 	unregister_chrdev(capi_major, "capi20");
 
 	capinc_tty_exit();
+
+	kcapi_exit();
 }
 
 module_init(capi_init);
diff --git a/drivers/isdn/capi/capilib.c b/drivers/isdn/capi/capilib.c
deleted file mode 100644
index a39ad3796bba..000000000000
--- a/drivers/isdn/capi/capilib.c
+++ /dev/null
@@ -1,202 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-
-#include <linux/slab.h>
-#include <linux/kernel.h>
-#include <linux/module.h>
-#include <linux/isdn/capilli.h>
-
-#define DBG(format, arg...) do {					\
-		printk(KERN_DEBUG "%s: " format "\n" , __func__ , ## arg); \
-	} while (0)
-
-struct capilib_msgidqueue {
-	struct capilib_msgidqueue *next;
-	u16 msgid;
-};
-
-struct capilib_ncci {
-	struct list_head list;
-	u16 applid;
-	u32 ncci;
-	u32 winsize;
-	int   nmsg;
-	struct capilib_msgidqueue *msgidqueue;
-	struct capilib_msgidqueue *msgidlast;
-	struct capilib_msgidqueue *msgidfree;
-	struct capilib_msgidqueue msgidpool[CAPI_MAXDATAWINDOW];
-};
-
-// ---------------------------------------------------------------------------
-// NCCI Handling
-
-static inline void mq_init(struct capilib_ncci *np)
-{
-	u_int i;
-	np->msgidqueue = NULL;
-	np->msgidlast = NULL;
-	np->nmsg = 0;
-	memset(np->msgidpool, 0, sizeof(np->msgidpool));
-	np->msgidfree = &np->msgidpool[0];
-	for (i = 1; i < np->winsize; i++) {
-		np->msgidpool[i].next = np->msgidfree;
-		np->msgidfree = &np->msgidpool[i];
-	}
-}
-
-static inline int mq_enqueue(struct capilib_ncci *np, u16 msgid)
-{
-	struct capilib_msgidqueue *mq;
-	if ((mq = np->msgidfree) == NULL)
-		return 0;
-	np->msgidfree = mq->next;
-	mq->msgid = msgid;
-	mq->next = NULL;
-	if (np->msgidlast)
-		np->msgidlast->next = mq;
-	np->msgidlast = mq;
-	if (!np->msgidqueue)
-		np->msgidqueue = mq;
-	np->nmsg++;
-	return 1;
-}
-
-static inline int mq_dequeue(struct capilib_ncci *np, u16 msgid)
-{
-	struct capilib_msgidqueue **pp;
-	for (pp = &np->msgidqueue; *pp; pp = &(*pp)->next) {
-		if ((*pp)->msgid == msgid) {
-			struct capilib_msgidqueue *mq = *pp;
-			*pp = mq->next;
-			if (mq == np->msgidlast)
-				np->msgidlast = NULL;
-			mq->next = np->msgidfree;
-			np->msgidfree = mq;
-			np->nmsg--;
-			return 1;
-		}
-	}
-	return 0;
-}
-
-void capilib_new_ncci(struct list_head *head, u16 applid, u32 ncci, u32 winsize)
-{
-	struct capilib_ncci *np;
-
-	np = kmalloc(sizeof(*np), GFP_ATOMIC);
-	if (!np) {
-		printk(KERN_WARNING "capilib_new_ncci: no memory.\n");
-		return;
-	}
-	if (winsize > CAPI_MAXDATAWINDOW) {
-		printk(KERN_ERR "capi_new_ncci: winsize %d too big\n",
-		       winsize);
-		winsize = CAPI_MAXDATAWINDOW;
-	}
-	np->applid = applid;
-	np->ncci = ncci;
-	np->winsize = winsize;
-	mq_init(np);
-	list_add_tail(&np->list, head);
-	DBG("kcapi: appl %d ncci 0x%x up", applid, ncci);
-}
-
-EXPORT_SYMBOL(capilib_new_ncci);
-
-void capilib_free_ncci(struct list_head *head, u16 applid, u32 ncci)
-{
-	struct list_head *l;
-	struct capilib_ncci *np;
-
-	list_for_each(l, head) {
-		np = list_entry(l, struct capilib_ncci, list);
-		if (np->applid != applid)
-			continue;
-		if (np->ncci != ncci)
-			continue;
-		printk(KERN_INFO "kcapi: appl %d ncci 0x%x down\n", applid, ncci);
-		list_del(&np->list);
-		kfree(np);
-		return;
-	}
-	printk(KERN_ERR "capilib_free_ncci: ncci 0x%x not found\n", ncci);
-}
-
-EXPORT_SYMBOL(capilib_free_ncci);
-
-void capilib_release_appl(struct list_head *head, u16 applid)
-{
-	struct list_head *l, *n;
-	struct capilib_ncci *np;
-
-	list_for_each_safe(l, n, head) {
-		np = list_entry(l, struct capilib_ncci, list);
-		if (np->applid != applid)
-			continue;
-		printk(KERN_INFO "kcapi: appl %d ncci 0x%x forced down\n", applid, np->ncci);
-		list_del(&np->list);
-		kfree(np);
-	}
-}
-
-EXPORT_SYMBOL(capilib_release_appl);
-
-void capilib_release(struct list_head *head)
-{
-	struct list_head *l, *n;
-	struct capilib_ncci *np;
-
-	list_for_each_safe(l, n, head) {
-		np = list_entry(l, struct capilib_ncci, list);
-		printk(KERN_INFO "kcapi: appl %d ncci 0x%x forced down\n", np->applid, np->ncci);
-		list_del(&np->list);
-		kfree(np);
-	}
-}
-
-EXPORT_SYMBOL(capilib_release);
-
-u16 capilib_data_b3_req(struct list_head *head, u16 applid, u32 ncci, u16 msgid)
-{
-	struct list_head *l;
-	struct capilib_ncci *np;
-
-	list_for_each(l, head) {
-		np = list_entry(l, struct capilib_ncci, list);
-		if (np->applid != applid)
-			continue;
-		if (np->ncci != ncci)
-			continue;
-
-		if (mq_enqueue(np, msgid) == 0)
-			return CAPI_SENDQUEUEFULL;
-
-		return CAPI_NOERROR;
-	}
-	printk(KERN_ERR "capilib_data_b3_req: ncci 0x%x not found\n", ncci);
-	return CAPI_NOERROR;
-}
-
-EXPORT_SYMBOL(capilib_data_b3_req);
-
-void capilib_data_b3_conf(struct list_head *head, u16 applid, u32 ncci, u16 msgid)
-{
-	struct list_head *l;
-	struct capilib_ncci *np;
-
-	list_for_each(l, head) {
-		np = list_entry(l, struct capilib_ncci, list);
-		if (np->applid != applid)
-			continue;
-		if (np->ncci != ncci)
-			continue;
-
-		if (mq_dequeue(np, msgid) == 0) {
-			printk(KERN_ERR "kcapi: msgid %hu ncci 0x%x not on queue\n",
-			       msgid, ncci);
-		}
-		return;
-	}
-	printk(KERN_ERR "capilib_data_b3_conf: ncci 0x%x not found\n", ncci);
-}
-
-EXPORT_SYMBOL(capilib_data_b3_conf);
diff --git a/drivers/isdn/capi/capiutil.c b/drivers/isdn/capi/capiutil.c
index 9846d82eb097..f26bf3c66d7e 100644
--- a/drivers/isdn/capi/capiutil.c
+++ b/drivers/isdn/capi/capiutil.c
@@ -20,6 +20,8 @@
 #include <linux/isdn/capiutil.h>
 #include <linux/slab.h>
 
+#include "kcapi.h"
+
 /* from CAPI2.0 DDK AVM Berlin GmbH */
 
 typedef struct {
@@ -245,190 +247,6 @@ static void jumpcstruct(_cmsg *cmsg)
 		}
 	}
 }
-/*-------------------------------------------------------*/
-static void pars_2_message(_cmsg *cmsg)
-{
-
-	for (; TYP != _CEND; cmsg->p++) {
-		switch (TYP) {
-		case _CBYTE:
-			byteTLcpy(cmsg->m + cmsg->l, OFF);
-			cmsg->l++;
-			break;
-		case _CWORD:
-			wordTLcpy(cmsg->m + cmsg->l, OFF);
-			cmsg->l += 2;
-			break;
-		case _CDWORD:
-			dwordTLcpy(cmsg->m + cmsg->l, OFF);
-			cmsg->l += 4;
-			break;
-		case _CSTRUCT:
-			if (*(u8 **) OFF == NULL) {
-				*(cmsg->m + cmsg->l) = '\0';
-				cmsg->l++;
-			} else if (**(_cstruct *) OFF != 0xff) {
-				structTLcpy(cmsg->m + cmsg->l, *(_cstruct *) OFF, 1 + **(_cstruct *) OFF);
-				cmsg->l += 1 + **(_cstruct *) OFF;
-			} else {
-				_cstruct s = *(_cstruct *) OFF;
-				structTLcpy(cmsg->m + cmsg->l, s, 3 + *(u16 *) (s + 1));
-				cmsg->l += 3 + *(u16 *) (s + 1);
-			}
-			break;
-		case _CMSTRUCT:
-/*----- Metastruktur 0 -----*/
-			if (*(_cmstruct *) OFF == CAPI_DEFAULT) {
-				*(cmsg->m + cmsg->l) = '\0';
-				cmsg->l++;
-				jumpcstruct(cmsg);
-			}
-/*----- Metastruktur wird composed -----*/
-			else {
-				unsigned _l = cmsg->l;
-				unsigned _ls;
-				cmsg->l++;
-				cmsg->p++;
-				pars_2_message(cmsg);
-				_ls = cmsg->l - _l - 1;
-				if (_ls < 255)
-					(cmsg->m + _l)[0] = (u8) _ls;
-				else {
-					structTLcpyovl(cmsg->m + _l + 3, cmsg->m + _l + 1, _ls);
-					(cmsg->m + _l)[0] = 0xff;
-					wordTLcpy(cmsg->m + _l + 1, &_ls);
-				}
-			}
-			break;
-		}
-	}
-}
-
-/**
- * capi_cmsg2message() - assemble CAPI 2.0 message from _cmsg structure
- * @cmsg:	_cmsg structure
- * @msg:	buffer for assembled message
- *
- * Return value: 0 for success
- */
-
-unsigned capi_cmsg2message(_cmsg *cmsg, u8 *msg)
-{
-	cmsg->m = msg;
-	cmsg->l = 8;
-	cmsg->p = 0;
-	cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand);
-	if (!cmsg->par)
-		return 1;	/* invalid command/subcommand */
-
-	pars_2_message(cmsg);
-
-	wordTLcpy(msg + 0, &cmsg->l);
-	byteTLcpy(cmsg->m + 4, &cmsg->Command);
-	byteTLcpy(cmsg->m + 5, &cmsg->Subcommand);
-	wordTLcpy(cmsg->m + 2, &cmsg->ApplId);
-	wordTLcpy(cmsg->m + 6, &cmsg->Messagenumber);
-
-	return 0;
-}
-
-/*-------------------------------------------------------*/
-static void message_2_pars(_cmsg *cmsg)
-{
-	for (; TYP != _CEND; cmsg->p++) {
-
-		switch (TYP) {
-		case _CBYTE:
-			byteTRcpy(cmsg->m + cmsg->l, OFF);
-			cmsg->l++;
-			break;
-		case _CWORD:
-			wordTRcpy(cmsg->m + cmsg->l, OFF);
-			cmsg->l += 2;
-			break;
-		case _CDWORD:
-			dwordTRcpy(cmsg->m + cmsg->l, OFF);
-			cmsg->l += 4;
-			break;
-		case _CSTRUCT:
-			*(u8 **) OFF = cmsg->m + cmsg->l;
-
-			if (cmsg->m[cmsg->l] != 0xff)
-				cmsg->l += 1 + cmsg->m[cmsg->l];
-			else
-				cmsg->l += 3 + *(u16 *) (cmsg->m + cmsg->l + 1);
-			break;
-		case _CMSTRUCT:
-/*----- Metastruktur 0 -----*/
-			if (cmsg->m[cmsg->l] == '\0') {
-				*(_cmstruct *) OFF = CAPI_DEFAULT;
-				cmsg->l++;
-				jumpcstruct(cmsg);
-			} else {
-				unsigned _l = cmsg->l;
-				*(_cmstruct *) OFF = CAPI_COMPOSE;
-				cmsg->l = (cmsg->m + _l)[0] == 255 ? cmsg->l + 3 : cmsg->l + 1;
-				cmsg->p++;
-				message_2_pars(cmsg);
-			}
-			break;
-		}
-	}
-}
-
-/**
- * capi_message2cmsg() - disassemble CAPI 2.0 message into _cmsg structure
- * @cmsg:	_cmsg structure
- * @msg:	buffer for assembled message
- *
- * Return value: 0 for success
- */
-
-unsigned capi_message2cmsg(_cmsg *cmsg, u8 *msg)
-{
-	memset(cmsg, 0, sizeof(_cmsg));
-	cmsg->m = msg;
-	cmsg->l = 8;
-	cmsg->p = 0;
-	byteTRcpy(cmsg->m + 4, &cmsg->Command);
-	byteTRcpy(cmsg->m + 5, &cmsg->Subcommand);
-	cmsg->par = capi_cmd2par(cmsg->Command, cmsg->Subcommand);
-	if (!cmsg->par)
-		return 1;	/* invalid command/subcommand */
-
-	message_2_pars(cmsg);
-
-	wordTRcpy(msg + 0, &cmsg->l);
-	wordTRcpy(cmsg->m + 2, &cmsg->ApplId);
-	wordTRcpy(cmsg->m + 6, &cmsg->Messagenumber);
-
-	return 0;
-}
-
-/**
- * capi_cmsg_header() - initialize header part of _cmsg structure
- * @cmsg:	_cmsg structure
- * @_ApplId:	ApplID field value
- * @_Command:	Command field value
- * @_Subcommand:	Subcommand field value
- * @_Messagenumber:	Message Number field value
- * @_Controller:	Controller/PLCI/NCCI field value
- *
- * Return value: 0 for success
- */
-
-unsigned capi_cmsg_header(_cmsg *cmsg, u16 _ApplId,
-			  u8 _Command, u8 _Subcommand,
-			  u16 _Messagenumber, u32 _Controller)
-{
-	memset(cmsg, 0, sizeof(_cmsg));
-	cmsg->ApplId = _ApplId;
-	cmsg->Command = _Command;
-	cmsg->Subcommand = _Subcommand;
-	cmsg->Messagenumber = _Messagenumber;
-	cmsg->adr.adrController = _Controller;
-	return 0;
-}
 
 /*-------------------------------------------------------*/
 
@@ -561,8 +379,6 @@ static char *pnames[] =
 	/*2f */ "Useruserdata"
 };
 
-
-
 #include <stdarg.h>
 
 /*-------------------------------------------------------*/
@@ -800,37 +616,6 @@ _cdebbuf *capi_message2str(u8 *msg)
 	return cdb;
 }
 
-/**
- * capi_cmsg2str() - format _cmsg structure for printing
- * @cmsg:	_cmsg structure
- *
- * Allocates a CAPI debug buffer and fills it with a printable representation
- * of the CAPI 2.0 message stored in @cmsg by a previous call to
- * capi_cmsg2message() or capi_message2cmsg().
- * Return value: allocated debug buffer, NULL on error
- * The returned buffer should be freed by a call to cdebbuf_free() after use.
- */
-
-_cdebbuf *capi_cmsg2str(_cmsg *cmsg)
-{
-	_cdebbuf *cdb;
-
-	if (!cmsg->m)
-		return NULL;	/* no message */
-	cdb = cdebbuf_alloc();
-	if (!cdb)
-		return NULL;
-	cmsg->l = 8;
-	cmsg->p = 0;
-	cdb = bufprint(cdb, "%s ID=%03d #0x%04x LEN=%04d\n",
-		       capi_cmd2str(cmsg->Command, cmsg->Subcommand),
-		       ((u16 *) cmsg->m)[1],
-		       ((u16 *) cmsg->m)[3],
-		       ((u16 *) cmsg->m)[0]);
-	cdb = protocol_message_2_pars(cdb, cmsg, 1);
-	return cdb;
-}
-
 int __init cdebug_init(void)
 {
 	g_cmsg = kmalloc(sizeof(_cmsg), GFP_KERNEL);
@@ -854,7 +639,7 @@ int __init cdebug_init(void)
 	return 0;
 }
 
-void __exit cdebug_exit(void)
+void cdebug_exit(void)
 {
 	if (g_debbuf)
 		kfree(g_debbuf->buf);
@@ -885,16 +670,8 @@ int __init cdebug_init(void)
 	return 0;
 }
 
-void __exit cdebug_exit(void)
+void cdebug_exit(void)
 {
 }
 
 #endif
-
-EXPORT_SYMBOL(cdebbuf_free);
-EXPORT_SYMBOL(capi_cmsg2message);
-EXPORT_SYMBOL(capi_message2cmsg);
-EXPORT_SYMBOL(capi_cmsg_header);
-EXPORT_SYMBOL(capi_cmd2str);
-EXPORT_SYMBOL(capi_cmsg2str);
-EXPORT_SYMBOL(capi_message2str);
diff --git a/drivers/isdn/capi/kcapi.c b/drivers/isdn/capi/kcapi.c
index a4ceb61c5b60..7168778fbbe1 100644
--- a/drivers/isdn/capi/kcapi.c
+++ b/drivers/isdn/capi/kcapi.c
@@ -10,8 +10,6 @@
  *
  */
 
-#define AVMB1_COMPAT
-
 #include "kcapi.h"
 #include <linux/module.h>
 #include <linux/mm.h>
@@ -31,18 +29,12 @@
 #include <linux/uaccess.h>
 #include <linux/isdn/capicmd.h>
 #include <linux/isdn/capiutil.h>
-#ifdef AVMB1_COMPAT
-#include <linux/b1lli.h>
-#endif
 #include <linux/mutex.h>
 #include <linux/rcupdate.h>
 
 static int showcapimsgs = 0;
 static struct workqueue_struct *kcapi_wq;
 
-MODULE_DESCRIPTION("CAPI4Linux: kernel CAPI layer");
-MODULE_AUTHOR("Carsten Paeth");
-MODULE_LICENSE("GPL");
 module_param(showcapimsgs, uint, 0);
 
 /* ------------------------------------------------------------- */
@@ -61,9 +53,6 @@ static char capi_manufakturer[64] = "AVM Berlin";
 
 #define NCCI2CTRL(ncci)    (((ncci) >> 24) & 0x7f)
 
-LIST_HEAD(capi_drivers);
-DEFINE_MUTEX(capi_drivers_lock);
-
 struct capi_ctr *capi_controller[CAPI_MAXCONTR];
 DEFINE_MUTEX(capi_controller_lock);
 
@@ -71,8 +60,6 @@ struct capi20_appl *capi_applications[CAPI_MAXAPPL];
 
 static int ncontrollers;
 
-static BLOCKING_NOTIFIER_HEAD(ctr_notifier_list);
-
 /* -------- controller ref counting -------------------------------------- */
 
 static inline struct capi_ctr *
@@ -200,8 +187,6 @@ static void notify_up(u32 contr)
 			if (ap)
 				register_appl(ctr, applid, &ap->rparam);
 		}
-
-		wake_up_interruptible_all(&ctr->state_wait_queue);
 	} else
 		printk(KERN_WARNING "%s: invalid contr %d\n", __func__, contr);
 
@@ -229,8 +214,6 @@ static void ctr_down(struct capi_ctr *ctr, int new_state)
 		if (ap)
 			capi_ctr_put(ctr);
 	}
-
-	wake_up_interruptible_all(&ctr->state_wait_queue);
 }
 
 static void notify_down(u32 contr)
@@ -251,36 +234,23 @@ static void notify_down(u32 contr)
 	mutex_unlock(&capi_controller_lock);
 }
 
-static int
-notify_handler(struct notifier_block *nb, unsigned long val, void *v)
+static void do_notify_work(struct work_struct *work)
 {
-	u32 contr = (long)v;
+	struct capictr_event *event =
+		container_of(work, struct capictr_event, work);
 
-	switch (val) {
+	switch (event->type) {
 	case CAPICTR_UP:
-		notify_up(contr);
+		notify_up(event->controller);
 		break;
 	case CAPICTR_DOWN:
-		notify_down(contr);
+		notify_down(event->controller);
 		break;
 	}
-	return NOTIFY_OK;
-}
 
-static void do_notify_work(struct work_struct *work)
-{
-	struct capictr_event *event =
-		container_of(work, struct capictr_event, work);
-
-	blocking_notifier_call_chain(&ctr_notifier_list, event->type,
-				     (void *)(long)event->controller);
 	kfree(event);
 }
 
-/*
- * The notifier will result in adding/deleteing of devices. Devices can
- * only removed in user process, not in bh.
- */
 static int notify_push(unsigned int event_type, u32 controller)
 {
 	struct capictr_event *event = kmalloc(sizeof(*event), GFP_ATOMIC);
@@ -296,18 +266,6 @@ static int notify_push(unsigned int event_type, u32 controller)
 	return 0;
 }
 
-int register_capictr_notifier(struct notifier_block *nb)
-{
-	return blocking_notifier_chain_register(&ctr_notifier_list, nb);
-}
-EXPORT_SYMBOL_GPL(register_capictr_notifier);
-
-int unregister_capictr_notifier(struct notifier_block *nb)
-{
-	return blocking_notifier_chain_unregister(&ctr_notifier_list, nb);
-}
-EXPORT_SYMBOL_GPL(unregister_capictr_notifier);
-
 /* -------- Receiver ------------------------------------------ */
 
 static void recv_handler(struct work_struct *work)
@@ -454,48 +412,6 @@ void capi_ctr_down(struct capi_ctr *ctr)
 
 EXPORT_SYMBOL(capi_ctr_down);
 
-/**
- * capi_ctr_suspend_output() - suspend controller
- * @ctr:	controller descriptor structure.
- *
- * Called by hardware driver to stop data flow.
- *
- * Note: The caller is responsible for synchronizing concurrent state changes
- * as well as invocations of capi_ctr_handle_message.
- */
-
-void capi_ctr_suspend_output(struct capi_ctr *ctr)
-{
-	if (!ctr->blocked) {
-		printk(KERN_DEBUG "kcapi: controller [%03d] suspend\n",
-		       ctr->cnr);
-		ctr->blocked = 1;
-	}
-}
-
-EXPORT_SYMBOL(capi_ctr_suspend_output);
-
-/**
- * capi_ctr_resume_output() - resume controller
- * @ctr:	controller descriptor structure.
- *
- * Called by hardware driver to resume data flow.
- *
- * Note: The caller is responsible for synchronizing concurrent state changes
- * as well as invocations of capi_ctr_handle_message.
- */
-
-void capi_ctr_resume_output(struct capi_ctr *ctr)
-{
-	if (ctr->blocked) {
-		printk(KERN_DEBUG "kcapi: controller [%03d] resumed\n",
-		       ctr->cnr);
-		ctr->blocked = 0;
-	}
-}
-
-EXPORT_SYMBOL(capi_ctr_resume_output);
-
 /* ------------------------------------------------------------- */
 
 /**
@@ -531,7 +447,6 @@ int attach_capi_ctr(struct capi_ctr *ctr)
 	ctr->state = CAPI_CTR_DETECTED;
 	ctr->blocked = 0;
 	ctr->traceflag = showcapimsgs;
-	init_waitqueue_head(&ctr->state_wait_queue);
 
 	sprintf(ctr->procfn, "capi/controllers/%d", ctr->cnr);
 	ctr->procent = proc_create_single_data(ctr->procfn, 0, NULL,
@@ -586,38 +501,6 @@ unlock_out:
 
 EXPORT_SYMBOL(detach_capi_ctr);
 
-/**
- * register_capi_driver() - register CAPI driver
- * @driver:	driver descriptor structure.
- *
- * Called by hardware driver to register itself with the CAPI subsystem.
- */
-
-void register_capi_driver(struct capi_driver *driver)
-{
-	mutex_lock(&capi_drivers_lock);
-	list_add_tail(&driver->list, &capi_drivers);
-	mutex_unlock(&capi_drivers_lock);
-}
-
-EXPORT_SYMBOL(register_capi_driver);
-
-/**
- * unregister_capi_driver() - unregister CAPI driver
- * @driver:	driver descriptor structure.
- *
- * Called by hardware driver to unregister itself from the CAPI subsystem.
- */
-
-void unregister_capi_driver(struct capi_driver *driver)
-{
-	mutex_lock(&capi_drivers_lock);
-	list_del(&driver->list);
-	mutex_unlock(&capi_drivers_lock);
-}
-
-EXPORT_SYMBOL(unregister_capi_driver);
-
 /* ------------------------------------------------------------- */
 /* -------- CAPI2.0 Interface ---------------------------------- */
 /* ------------------------------------------------------------- */
@@ -648,8 +531,6 @@ u16 capi20_isinstalled(void)
 	return ret;
 }
 
-EXPORT_SYMBOL(capi20_isinstalled);
-
 /**
  * capi20_register() - CAPI 2.0 operation CAPI_REGISTER
  * @ap:		CAPI application descriptor structure.
@@ -711,8 +592,6 @@ u16 capi20_register(struct capi20_appl *ap)
 	return CAPI_NOERROR;
 }
 
-EXPORT_SYMBOL(capi20_register);
-
 /**
  * capi20_release() - CAPI 2.0 operation CAPI_RELEASE
  * @ap:		CAPI application descriptor structure.
@@ -755,8 +634,6 @@ u16 capi20_release(struct capi20_appl *ap)
 	return CAPI_NOERROR;
 }
 
-EXPORT_SYMBOL(capi20_release);
-
 /**
  * capi20_put_message() - CAPI 2.0 operation CAPI_PUT_MESSAGE
  * @ap:		CAPI application descriptor structure.
@@ -834,8 +711,6 @@ u16 capi20_put_message(struct capi20_appl *ap, struct sk_buff *skb)
 	return ctr->send_message(ctr, skb);
 }
 
-EXPORT_SYMBOL(capi20_put_message);
-
 /**
  * capi20_get_manufacturer() - CAPI 2.0 operation CAPI_GET_MANUFACTURER
  * @contr:	controller number.
@@ -869,8 +744,6 @@ u16 capi20_get_manufacturer(u32 contr, u8 *buf)
 	return ret;
 }
 
-EXPORT_SYMBOL(capi20_get_manufacturer);
-
 /**
  * capi20_get_version() - CAPI 2.0 operation CAPI_GET_VERSION
  * @contr:	controller number.
@@ -904,8 +777,6 @@ u16 capi20_get_version(u32 contr, struct capi_version *verp)
 	return ret;
 }
 
-EXPORT_SYMBOL(capi20_get_version);
-
 /**
  * capi20_get_serial() - CAPI 2.0 operation CAPI_GET_SERIAL_NUMBER
  * @contr:	controller number.
@@ -939,8 +810,6 @@ u16 capi20_get_serial(u32 contr, u8 *serial)
 	return ret;
 }
 
-EXPORT_SYMBOL(capi20_get_serial);
-
 /**
  * capi20_get_profile() - CAPI 2.0 operation CAPI_GET_PROFILE
  * @contr:	controller number.
@@ -974,209 +843,6 @@ u16 capi20_get_profile(u32 contr, struct capi_profile *profp)
 	return ret;
 }
 
-EXPORT_SYMBOL(capi20_get_profile);
-
-/* Must be called with capi_controller_lock held. */
-static int wait_on_ctr_state(struct capi_ctr *ctr, unsigned int state)
-{
-	DEFINE_WAIT(wait);
-	int retval = 0;
-
-	ctr = capi_ctr_get(ctr);
-	if (!ctr)
-		return -ESRCH;
-
-	for (;;) {
-		prepare_to_wait(&ctr->state_wait_queue, &wait,
-				TASK_INTERRUPTIBLE);
-
-		if (ctr->state == state)
-			break;
-		if (ctr->state == CAPI_CTR_DETACHED) {
-			retval = -ESRCH;
-			break;
-		}
-		if (signal_pending(current)) {
-			retval = -EINTR;
-			break;
-		}
-
-		mutex_unlock(&capi_controller_lock);
-		schedule();
-		mutex_lock(&capi_controller_lock);
-	}
-	finish_wait(&ctr->state_wait_queue, &wait);
-
-	capi_ctr_put(ctr);
-
-	return retval;
-}
-
-#ifdef AVMB1_COMPAT
-static int old_capi_manufacturer(unsigned int cmd, void __user *data)
-{
-	avmb1_loadandconfigdef ldef;
-	avmb1_extcarddef cdef;
-	avmb1_resetdef rdef;
-	capicardparams cparams;
-	struct capi_ctr *ctr;
-	struct capi_driver *driver = NULL;
-	capiloaddata ldata;
-	struct list_head *l;
-	int retval;
-
-	switch (cmd) {
-	case AVMB1_ADDCARD:
-	case AVMB1_ADDCARD_WITH_TYPE:
-		if (cmd == AVMB1_ADDCARD) {
-			if ((retval = copy_from_user(&cdef, data,
-						     sizeof(avmb1_carddef))))
-				return -EFAULT;
-			cdef.cardtype = AVM_CARDTYPE_B1;
-			cdef.cardnr = 0;
-		} else {
-			if ((retval = copy_from_user(&cdef, data,
-						     sizeof(avmb1_extcarddef))))
-				return -EFAULT;
-		}
-		cparams.port = cdef.port;
-		cparams.irq = cdef.irq;
-		cparams.cardnr = cdef.cardnr;
-
-		mutex_lock(&capi_drivers_lock);
-
-		switch (cdef.cardtype) {
-		case AVM_CARDTYPE_B1:
-			list_for_each(l, &capi_drivers) {
-				driver = list_entry(l, struct capi_driver, list);
-				if (strcmp(driver->name, "b1isa") == 0)
-					break;
-			}
-			break;
-		case AVM_CARDTYPE_T1:
-			list_for_each(l, &capi_drivers) {
-				driver = list_entry(l, struct capi_driver, list);
-				if (strcmp(driver->name, "t1isa") == 0)
-					break;
-			}
-			break;
-		default:
-			driver = NULL;
-			break;
-		}
-		if (!driver) {
-			printk(KERN_ERR "kcapi: driver not loaded.\n");
-			retval = -EIO;
-		} else if (!driver->add_card) {
-			printk(KERN_ERR "kcapi: driver has no add card function.\n");
-			retval = -EIO;
-		} else
-			retval = driver->add_card(driver, &cparams);
-
-		mutex_unlock(&capi_drivers_lock);
-		return retval;
-
-	case AVMB1_LOAD:
-	case AVMB1_LOAD_AND_CONFIG:
-
-		if (cmd == AVMB1_LOAD) {
-			if (copy_from_user(&ldef, data,
-					   sizeof(avmb1_loaddef)))
-				return -EFAULT;
-			ldef.t4config.len = 0;
-			ldef.t4config.data = NULL;
-		} else {
-			if (copy_from_user(&ldef, data,
-					   sizeof(avmb1_loadandconfigdef)))
-				return -EFAULT;
-		}
-
-		mutex_lock(&capi_controller_lock);
-
-		ctr = get_capi_ctr_by_nr(ldef.contr);
-		if (!ctr) {
-			retval = -EINVAL;
-			goto load_unlock_out;
-		}
-
-		if (ctr->load_firmware == NULL) {
-			printk(KERN_DEBUG "kcapi: load: no load function\n");
-			retval = -ESRCH;
-			goto load_unlock_out;
-		}
-
-		if (ldef.t4file.len <= 0) {
-			printk(KERN_DEBUG "kcapi: load: invalid parameter: length of t4file is %d ?\n", ldef.t4file.len);
-			retval = -EINVAL;
-			goto load_unlock_out;
-		}
-		if (ldef.t4file.data == NULL) {
-			printk(KERN_DEBUG "kcapi: load: invalid parameter: dataptr is 0\n");
-			retval = -EINVAL;
-			goto load_unlock_out;
-		}
-
-		ldata.firmware.user = 1;
-		ldata.firmware.data = ldef.t4file.data;
-		ldata.firmware.len = ldef.t4file.len;
-		ldata.configuration.user = 1;
-		ldata.configuration.data = ldef.t4config.data;
-		ldata.configuration.len = ldef.t4config.len;
-
-		if (ctr->state != CAPI_CTR_DETECTED) {
-			printk(KERN_INFO "kcapi: load: contr=%d not in detect state\n", ldef.contr);
-			retval = -EBUSY;
-			goto load_unlock_out;
-		}
-		ctr->state = CAPI_CTR_LOADING;
-
-		retval = ctr->load_firmware(ctr, &ldata);
-		if (retval) {
-			ctr->state = CAPI_CTR_DETECTED;
-			goto load_unlock_out;
-		}
-
-		retval = wait_on_ctr_state(ctr, CAPI_CTR_RUNNING);
-
-	load_unlock_out:
-		mutex_unlock(&capi_controller_lock);
-		return retval;
-
-	case AVMB1_RESETCARD:
-		if (copy_from_user(&rdef, data, sizeof(avmb1_resetdef)))
-			return -EFAULT;
-
-		retval = 0;
-
-		mutex_lock(&capi_controller_lock);
-
-		ctr = get_capi_ctr_by_nr(rdef.contr);
-		if (!ctr) {
-			retval = -ESRCH;
-			goto reset_unlock_out;
-		}
-
-		if (ctr->state == CAPI_CTR_DETECTED)
-			goto reset_unlock_out;
-
-		if (ctr->reset_ctr == NULL) {
-			printk(KERN_DEBUG "kcapi: reset: no reset function\n");
-			retval = -ESRCH;
-			goto reset_unlock_out;
-		}
-
-		ctr->reset_ctr(ctr);
-
-		retval = wait_on_ctr_state(ctr, CAPI_CTR_DETECTED);
-
-	reset_unlock_out:
-		mutex_unlock(&capi_controller_lock);
-		return retval;
-	}
-	return -EINVAL;
-}
-#endif
-
 /**
  * capi20_manufacturer() - CAPI 2.0 operation CAPI_MANUFACTURER
  * @cmd:	command.
@@ -1192,14 +858,6 @@ int capi20_manufacturer(unsigned long cmd, void __user *data)
 	int retval;
 
 	switch (cmd) {
-#ifdef AVMB1_COMPAT
-	case AVMB1_LOAD:
-	case AVMB1_LOAD_AND_CONFIG:
-	case AVMB1_RESETCARD:
-	case AVMB1_GET_CARDINFO:
-	case AVMB1_REMOVECARD:
-		return old_capi_manufacturer(cmd, data);
-#endif
 	case KCAPI_CMD_TRACE:
 	{
 		kcapi_flagdef fdef;
@@ -1222,43 +880,6 @@ int capi20_manufacturer(unsigned long cmd, void __user *data)
 
 		return retval;
 	}
-	case KCAPI_CMD_ADDCARD:
-	{
-		struct list_head *l;
-		struct capi_driver *driver = NULL;
-		capicardparams cparams;
-		kcapi_carddef cdef;
-
-		if ((retval = copy_from_user(&cdef, data, sizeof(cdef))))
-			return -EFAULT;
-
-		cparams.port = cdef.port;
-		cparams.irq = cdef.irq;
-		cparams.membase = cdef.membase;
-		cparams.cardnr = cdef.cardnr;
-		cparams.cardtype = 0;
-		cdef.driver[sizeof(cdef.driver) - 1] = 0;
-
-		mutex_lock(&capi_drivers_lock);
-
-		list_for_each(l, &capi_drivers) {
-			driver = list_entry(l, struct capi_driver, list);
-			if (strcmp(driver->name, cdef.driver) == 0)
-				break;
-		}
-		if (driver == NULL) {
-			printk(KERN_ERR "kcapi: driver \"%s\" not loaded.\n",
-			       cdef.driver);
-			retval = -ESRCH;
-		} else if (!driver->add_card) {
-			printk(KERN_ERR "kcapi: driver \"%s\" has no add card function.\n", cdef.driver);
-			retval = -EIO;
-		} else
-			retval = driver->add_card(driver, &cparams);
-
-		mutex_unlock(&capi_drivers_lock);
-		return retval;
-	}
 
 	default:
 		printk(KERN_ERR "kcapi: manufacturer command %lu unknown.\n",
@@ -1269,8 +890,6 @@ int capi20_manufacturer(unsigned long cmd, void __user *data)
 	return -EINVAL;
 }
 
-EXPORT_SYMBOL(capi20_manufacturer);
-
 /* ------------------------------------------------------------- */
 /* -------- Init & Cleanup ------------------------------------- */
 /* ------------------------------------------------------------- */
@@ -1279,12 +898,7 @@ EXPORT_SYMBOL(capi20_manufacturer);
  * init / exit functions
  */
 
-static struct notifier_block capictr_nb = {
-	.notifier_call = notify_handler,
-	.priority = INT_MAX,
-};
-
-static int __init kcapi_init(void)
+int __init kcapi_init(void)
 {
 	int err;
 
@@ -1292,11 +906,8 @@ static int __init kcapi_init(void)
 	if (!kcapi_wq)
 		return -ENOMEM;
 
-	register_capictr_notifier(&capictr_nb);
-
 	err = cdebug_init();
 	if (err) {
-		unregister_capictr_notifier(&capictr_nb);
 		destroy_workqueue(kcapi_wq);
 		return err;
 	}
@@ -1305,14 +916,10 @@ static int __init kcapi_init(void)
 	return 0;
 }
 
-static void __exit kcapi_exit(void)
+void kcapi_exit(void)
 {
 	kcapi_proc_exit();
 
-	unregister_capictr_notifier(&capictr_nb);
 	cdebug_exit();
 	destroy_workqueue(kcapi_wq);
 }
-
-module_init(kcapi_init);
-module_exit(kcapi_exit);
diff --git a/drivers/isdn/capi/kcapi.h b/drivers/isdn/capi/kcapi.h
index 6d439f9a76b2..479623e1db2a 100644
--- a/drivers/isdn/capi/kcapi.h
+++ b/drivers/isdn/capi/kcapi.h
@@ -30,22 +30,153 @@ enum {
 	CAPI_CTR_RUNNING  = 3,
 };
 
-extern struct list_head capi_drivers;
-extern struct mutex capi_drivers_lock;
-
 extern struct capi_ctr *capi_controller[CAPI_MAXCONTR];
 extern struct mutex capi_controller_lock;
 
 extern struct capi20_appl *capi_applications[CAPI_MAXAPPL];
 
-#ifdef CONFIG_PROC_FS
-
 void kcapi_proc_init(void);
 void kcapi_proc_exit(void);
 
-#else
+struct capi20_appl {
+	u16 applid;
+	capi_register_params rparam;
+	void (*recv_message)(struct capi20_appl *ap, struct sk_buff *skb);
+	void *private;
 
-static inline void kcapi_proc_init(void) { };
-static inline void kcapi_proc_exit(void) { };
+	/* internal to kernelcapi.o */
+	unsigned long nrecvctlpkt;
+	unsigned long nrecvdatapkt;
+	unsigned long nsentctlpkt;
+	unsigned long nsentdatapkt;
+	struct mutex recv_mtx;
+	struct sk_buff_head recv_queue;
+	struct work_struct recv_work;
+	int release_in_progress;
+};
 
-#endif
+u16 capi20_isinstalled(void);
+u16 capi20_register(struct capi20_appl *ap);
+u16 capi20_release(struct capi20_appl *ap);
+u16 capi20_put_message(struct capi20_appl *ap, struct sk_buff *skb);
+u16 capi20_get_manufacturer(u32 contr, u8 buf[CAPI_MANUFACTURER_LEN]);
+u16 capi20_get_version(u32 contr, struct capi_version *verp);
+u16 capi20_get_serial(u32 contr, u8 serial[CAPI_SERIAL_LEN]);
+u16 capi20_get_profile(u32 contr, struct capi_profile *profp);
+int capi20_manufacturer(unsigned long cmd, void __user *data);
+
+#define CAPICTR_UP			0
+#define CAPICTR_DOWN			1
+
+int kcapi_init(void);
+void kcapi_exit(void);
+
+/*----- basic-type definitions -----*/
+
+typedef __u8 *_cstruct;
+
+typedef enum {
+	CAPI_COMPOSE,
+	CAPI_DEFAULT
+} _cmstruct;
+
+/*
+   The _cmsg structure contains all possible CAPI 2.0 parameter.
+   All parameters are stored here first. The function CAPI_CMSG_2_MESSAGE
+   assembles the parameter and builds CAPI2.0 conform messages.
+   CAPI_MESSAGE_2_CMSG disassembles CAPI 2.0 messages and stores the
+   parameter in the _cmsg structure
+ */
+
+typedef struct {
+	/* Header */
+	__u16 ApplId;
+	__u8 Command;
+	__u8 Subcommand;
+	__u16 Messagenumber;
+
+	/* Parameter */
+	union {
+		__u32 adrController;
+		__u32 adrPLCI;
+		__u32 adrNCCI;
+	} adr;
+
+	_cmstruct AdditionalInfo;
+	_cstruct B1configuration;
+	__u16 B1protocol;
+	_cstruct B2configuration;
+	__u16 B2protocol;
+	_cstruct B3configuration;
+	__u16 B3protocol;
+	_cstruct BC;
+	_cstruct BChannelinformation;
+	_cmstruct BProtocol;
+	_cstruct CalledPartyNumber;
+	_cstruct CalledPartySubaddress;
+	_cstruct CallingPartyNumber;
+	_cstruct CallingPartySubaddress;
+	__u32 CIPmask;
+	__u32 CIPmask2;
+	__u16 CIPValue;
+	__u32 Class;
+	_cstruct ConnectedNumber;
+	_cstruct ConnectedSubaddress;
+	__u32 Data;
+	__u16 DataHandle;
+	__u16 DataLength;
+	_cstruct FacilityConfirmationParameter;
+	_cstruct Facilitydataarray;
+	_cstruct FacilityIndicationParameter;
+	_cstruct FacilityRequestParameter;
+	__u16 FacilitySelector;
+	__u16 Flags;
+	__u32 Function;
+	_cstruct HLC;
+	__u16 Info;
+	_cstruct InfoElement;
+	__u32 InfoMask;
+	__u16 InfoNumber;
+	_cstruct Keypadfacility;
+	_cstruct LLC;
+	_cstruct ManuData;
+	__u32 ManuID;
+	_cstruct NCPI;
+	__u16 Reason;
+	__u16 Reason_B3;
+	__u16 Reject;
+	_cstruct Useruserdata;
+
+	/* intern */
+	unsigned l, p;
+	unsigned char *par;
+	__u8 *m;
+
+	/* buffer to construct message */
+	__u8 buf[180];
+
+} _cmsg;
+
+/*-----------------------------------------------------------------------*/
+
+/*
+ * Debugging / Tracing functions
+ */
+
+char *capi_cmd2str(__u8 cmd, __u8 subcmd);
+
+typedef struct {
+	u_char	*buf;
+	u_char	*p;
+	size_t	size;
+	size_t	pos;
+} _cdebbuf;
+
+#define	CDEBUG_SIZE	1024
+#define	CDEBUG_GSIZE	4096
+
+void cdebbuf_free(_cdebbuf *cdb);
+int cdebug_init(void);
+void cdebug_exit(void);
+
+_cdebbuf *capi_message2str(__u8 *msg);
diff --git a/drivers/isdn/capi/kcapi_proc.c b/drivers/isdn/capi/kcapi_proc.c
index c94bd12c0f7c..2bffbb8bf271 100644
--- a/drivers/isdn/capi/kcapi_proc.c
+++ b/drivers/isdn/capi/kcapi_proc.c
@@ -192,37 +192,15 @@ static const struct seq_operations seq_applstats_ops = {
 
 // ---------------------------------------------------------------------------
 
-static void *capi_driver_start(struct seq_file *seq, loff_t *pos)
-	__acquires(&capi_drivers_lock)
+/* /proc/capi/drivers is always empty */
+static ssize_t empty_read(struct file *file, char __user *buf,
+			  size_t size, loff_t *off)
 {
-	mutex_lock(&capi_drivers_lock);
-	return seq_list_start(&capi_drivers, *pos);
-}
-
-static void *capi_driver_next(struct seq_file *seq, void *v, loff_t *pos)
-{
-	return seq_list_next(v, &capi_drivers, pos);
-}
-
-static void capi_driver_stop(struct seq_file *seq, void *v)
-	__releases(&capi_drivers_lock)
-{
-	mutex_unlock(&capi_drivers_lock);
-}
-
-static int capi_driver_show(struct seq_file *seq, void *v)
-{
-	struct capi_driver *drv = list_entry(v, struct capi_driver, list);
-
-	seq_printf(seq, "%-32s %s\n", drv->name, drv->revision);
 	return 0;
 }
 
-static const struct seq_operations seq_capi_driver_ops = {
-	.start	= capi_driver_start,
-	.next	= capi_driver_next,
-	.stop	= capi_driver_stop,
-	.show	= capi_driver_show,
+static const struct file_operations empty_fops = {
+	.read	= empty_read,
 };
 
 // ---------------------------------------------------------------------------
@@ -236,7 +214,7 @@ kcapi_proc_init(void)
 	proc_create_seq("capi/contrstats",   0, NULL, &seq_contrstats_ops);
 	proc_create_seq("capi/applications", 0, NULL, &seq_applications_ops);
 	proc_create_seq("capi/applstats",    0, NULL, &seq_applstats_ops);
-	proc_create_seq("capi/driver",       0, NULL, &seq_capi_driver_ops);
+	proc_create("capi/driver",           0, NULL, &empty_fops);
 }
 
 void __exit
diff --git a/include/linux/isdn/capilli.h b/include/linux/isdn/capilli.h
index d75e1ad72964..12be09b6883b 100644
--- a/include/linux/isdn/capilli.h
+++ b/include/linux/isdn/capilli.h
@@ -69,7 +69,6 @@ struct capi_ctr {
 	unsigned short state;			/* controller state */
 	int blocked;				/* output blocked */
 	int traceflag;				/* capi trace */
-	wait_queue_head_t state_wait_queue;
 
 	struct proc_dir_entry *procent;
         char procfn[128];
@@ -80,8 +79,6 @@ int detach_capi_ctr(struct capi_ctr *);
 
 void capi_ctr_ready(struct capi_ctr * card);
 void capi_ctr_down(struct capi_ctr * card);
-void capi_ctr_suspend_output(struct capi_ctr * card);
-void capi_ctr_resume_output(struct capi_ctr * card);
 void capi_ctr_handle_message(struct capi_ctr * card, u16 appl, struct sk_buff *skb);
 
 // ---------------------------------------------------------------------------
@@ -91,23 +88,8 @@ struct capi_driver {
 	char name[32];				/* driver name */
 	char revision[32];
 
-	int (*add_card)(struct capi_driver *driver, capicardparams *data);
-
 	/* management information for kcapi */
 	struct list_head list; 
 };
 
-void register_capi_driver(struct capi_driver *driver);
-void unregister_capi_driver(struct capi_driver *driver);
-
-// ---------------------------------------------------------------------------
-// library functions for use by hardware controller drivers
-
-void capilib_new_ncci(struct list_head *head, u16 applid, u32 ncci, u32 winsize);
-void capilib_free_ncci(struct list_head *head, u16 applid, u32 ncci);
-void capilib_release_appl(struct list_head *head, u16 applid);
-void capilib_release(struct list_head *head);
-void capilib_data_b3_conf(struct list_head *head, u16 applid, u32 ncci, u16 msgid);
-u16  capilib_data_b3_req(struct list_head *head, u16 applid, u32 ncci, u16 msgid);
-
 #endif				/* __CAPILLI_H__ */
diff --git a/include/linux/isdn/capiutil.h b/include/linux/isdn/capiutil.h
index 44bd6046e6e2..953fd500dff7 100644
--- a/include/linux/isdn/capiutil.h
+++ b/include/linux/isdn/capiutil.h
@@ -57,460 +57,4 @@ static inline void capimsg_setu32(void *m, int off, __u32 val)
 #define	CAPIMSG_SETCONTROL(m, contr)	capimsg_setu32(m, 8, contr)
 #define	CAPIMSG_SETDATALEN(m, len)	capimsg_setu16(m, 16, len)
 
-/*----- basic-type definitions -----*/
-
-typedef __u8 *_cstruct;
-
-typedef enum {
-	CAPI_COMPOSE,
-	CAPI_DEFAULT
-} _cmstruct;
-
-/*
-   The _cmsg structure contains all possible CAPI 2.0 parameter.
-   All parameters are stored here first. The function CAPI_CMSG_2_MESSAGE
-   assembles the parameter and builds CAPI2.0 conform messages.
-   CAPI_MESSAGE_2_CMSG disassembles CAPI 2.0 messages and stores the
-   parameter in the _cmsg structure
- */
-
-typedef struct {
-	/* Header */
-	__u16 ApplId;
-	__u8 Command;
-	__u8 Subcommand;
-	__u16 Messagenumber;
-
-	/* Parameter */
-	union {
-		__u32 adrController;
-		__u32 adrPLCI;
-		__u32 adrNCCI;
-	} adr;
-
-	_cmstruct AdditionalInfo;
-	_cstruct B1configuration;
-	__u16 B1protocol;
-	_cstruct B2configuration;
-	__u16 B2protocol;
-	_cstruct B3configuration;
-	__u16 B3protocol;
-	_cstruct BC;
-	_cstruct BChannelinformation;
-	_cmstruct BProtocol;
-	_cstruct CalledPartyNumber;
-	_cstruct CalledPartySubaddress;
-	_cstruct CallingPartyNumber;
-	_cstruct CallingPartySubaddress;
-	__u32 CIPmask;
-	__u32 CIPmask2;
-	__u16 CIPValue;
-	__u32 Class;
-	_cstruct ConnectedNumber;
-	_cstruct ConnectedSubaddress;
-	__u32 Data;
-	__u16 DataHandle;
-	__u16 DataLength;
-	_cstruct FacilityConfirmationParameter;
-	_cstruct Facilitydataarray;
-	_cstruct FacilityIndicationParameter;
-	_cstruct FacilityRequestParameter;
-	__u16 FacilitySelector;
-	__u16 Flags;
-	__u32 Function;
-	_cstruct HLC;
-	__u16 Info;
-	_cstruct InfoElement;
-	__u32 InfoMask;
-	__u16 InfoNumber;
-	_cstruct Keypadfacility;
-	_cstruct LLC;
-	_cstruct ManuData;
-	__u32 ManuID;
-	_cstruct NCPI;
-	__u16 Reason;
-	__u16 Reason_B3;
-	__u16 Reject;
-	_cstruct Useruserdata;
-
-	/* intern */
-	unsigned l, p;
-	unsigned char *par;
-	__u8 *m;
-
-	/* buffer to construct message */
-	__u8 buf[180];
-
-} _cmsg;
-
-/*
- * capi_cmsg2message() assembles the parameter from _cmsg to a CAPI 2.0
- * conform message
- */
-unsigned capi_cmsg2message(_cmsg * cmsg, __u8 * msg);
-
-/*
- *  capi_message2cmsg disassembles a CAPI message an writes the parameter
- *  into _cmsg for easy access
- */
-unsigned capi_message2cmsg(_cmsg * cmsg, __u8 * msg);
-
-/*
- * capi_cmsg_header() fills the _cmsg structure with default values, so only
- * parameter with non default values must be changed before sending the
- * message.
- */
-unsigned capi_cmsg_header(_cmsg * cmsg, __u16 _ApplId,
-			  __u8 _Command, __u8 _Subcommand,
-			  __u16 _Messagenumber, __u32 _Controller);
-
-/*-----------------------------------------------------------------------*/
-
-/*
- * Debugging / Tracing functions
- */
-
-char *capi_cmd2str(__u8 cmd, __u8 subcmd);
-
-typedef struct {
-	u_char	*buf;
-	u_char	*p;
-	size_t	size;
-	size_t	pos;
-} _cdebbuf;
-
-#define	CDEBUG_SIZE	1024
-#define	CDEBUG_GSIZE	4096
-
-void cdebbuf_free(_cdebbuf *cdb);
-int cdebug_init(void);
-void cdebug_exit(void);
-
-_cdebbuf *capi_cmsg2str(_cmsg *cmsg);
-_cdebbuf *capi_message2str(__u8 *msg);
-
-/*-----------------------------------------------------------------------*/
-
-static inline void capi_cmsg_answer(_cmsg * cmsg)
-{
-	cmsg->Subcommand |= 0x01;
-}
-
-/*-----------------------------------------------------------------------*/
-
-static inline void capi_fill_CONNECT_B3_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					    __u32 adr,
-					    _cstruct NCPI)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x82, 0x80, Messagenumber, adr);
-	cmsg->NCPI = NCPI;
-}
-
-static inline void capi_fill_FACILITY_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					  __u32 adr,
-					  __u16 FacilitySelector,
-				       _cstruct FacilityRequestParameter)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x80, 0x80, Messagenumber, adr);
-	cmsg->FacilitySelector = FacilitySelector;
-	cmsg->FacilityRequestParameter = FacilityRequestParameter;
-}
-
-static inline void capi_fill_INFO_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-				      __u32 adr,
-				      _cstruct CalledPartyNumber,
-				      _cstruct BChannelinformation,
-				      _cstruct Keypadfacility,
-				      _cstruct Useruserdata,
-				      _cstruct Facilitydataarray)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x08, 0x80, Messagenumber, adr);
-	cmsg->CalledPartyNumber = CalledPartyNumber;
-	cmsg->BChannelinformation = BChannelinformation;
-	cmsg->Keypadfacility = Keypadfacility;
-	cmsg->Useruserdata = Useruserdata;
-	cmsg->Facilitydataarray = Facilitydataarray;
-}
-
-static inline void capi_fill_LISTEN_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					__u32 adr,
-					__u32 InfoMask,
-					__u32 CIPmask,
-					__u32 CIPmask2,
-					_cstruct CallingPartyNumber,
-					_cstruct CallingPartySubaddress)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x05, 0x80, Messagenumber, adr);
-	cmsg->InfoMask = InfoMask;
-	cmsg->CIPmask = CIPmask;
-	cmsg->CIPmask2 = CIPmask2;
-	cmsg->CallingPartyNumber = CallingPartyNumber;
-	cmsg->CallingPartySubaddress = CallingPartySubaddress;
-}
-
-static inline void capi_fill_ALERT_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-				       __u32 adr,
-				       _cstruct BChannelinformation,
-				       _cstruct Keypadfacility,
-				       _cstruct Useruserdata,
-				       _cstruct Facilitydataarray)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x01, 0x80, Messagenumber, adr);
-	cmsg->BChannelinformation = BChannelinformation;
-	cmsg->Keypadfacility = Keypadfacility;
-	cmsg->Useruserdata = Useruserdata;
-	cmsg->Facilitydataarray = Facilitydataarray;
-}
-
-static inline void capi_fill_CONNECT_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					 __u32 adr,
-					 __u16 CIPValue,
-					 _cstruct CalledPartyNumber,
-					 _cstruct CallingPartyNumber,
-					 _cstruct CalledPartySubaddress,
-					 _cstruct CallingPartySubaddress,
-					 __u16 B1protocol,
-					 __u16 B2protocol,
-					 __u16 B3protocol,
-					 _cstruct B1configuration,
-					 _cstruct B2configuration,
-					 _cstruct B3configuration,
-					 _cstruct BC,
-					 _cstruct LLC,
-					 _cstruct HLC,
-					 _cstruct BChannelinformation,
-					 _cstruct Keypadfacility,
-					 _cstruct Useruserdata,
-					 _cstruct Facilitydataarray)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x02, 0x80, Messagenumber, adr);
-	cmsg->CIPValue = CIPValue;
-	cmsg->CalledPartyNumber = CalledPartyNumber;
-	cmsg->CallingPartyNumber = CallingPartyNumber;
-	cmsg->CalledPartySubaddress = CalledPartySubaddress;
-	cmsg->CallingPartySubaddress = CallingPartySubaddress;
-	cmsg->B1protocol = B1protocol;
-	cmsg->B2protocol = B2protocol;
-	cmsg->B3protocol = B3protocol;
-	cmsg->B1configuration = B1configuration;
-	cmsg->B2configuration = B2configuration;
-	cmsg->B3configuration = B3configuration;
-	cmsg->BC = BC;
-	cmsg->LLC = LLC;
-	cmsg->HLC = HLC;
-	cmsg->BChannelinformation = BChannelinformation;
-	cmsg->Keypadfacility = Keypadfacility;
-	cmsg->Useruserdata = Useruserdata;
-	cmsg->Facilitydataarray = Facilitydataarray;
-}
-
-static inline void capi_fill_DATA_B3_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					 __u32 adr,
-					 __u32 Data,
-					 __u16 DataLength,
-					 __u16 DataHandle,
-					 __u16 Flags)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x86, 0x80, Messagenumber, adr);
-	cmsg->Data = Data;
-	cmsg->DataLength = DataLength;
-	cmsg->DataHandle = DataHandle;
-	cmsg->Flags = Flags;
-}
-
-static inline void capi_fill_DISCONNECT_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					    __u32 adr,
-					    _cstruct BChannelinformation,
-					    _cstruct Keypadfacility,
-					    _cstruct Useruserdata,
-					    _cstruct Facilitydataarray)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x04, 0x80, Messagenumber, adr);
-	cmsg->BChannelinformation = BChannelinformation;
-	cmsg->Keypadfacility = Keypadfacility;
-	cmsg->Useruserdata = Useruserdata;
-	cmsg->Facilitydataarray = Facilitydataarray;
-}
-
-static inline void capi_fill_DISCONNECT_B3_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					       __u32 adr,
-					       _cstruct NCPI)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x84, 0x80, Messagenumber, adr);
-	cmsg->NCPI = NCPI;
-}
-
-static inline void capi_fill_MANUFACTURER_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					      __u32 adr,
-					      __u32 ManuID,
-					      __u32 Class,
-					      __u32 Function,
-					      _cstruct ManuData)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0xff, 0x80, Messagenumber, adr);
-	cmsg->ManuID = ManuID;
-	cmsg->Class = Class;
-	cmsg->Function = Function;
-	cmsg->ManuData = ManuData;
-}
-
-static inline void capi_fill_RESET_B3_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					  __u32 adr,
-					  _cstruct NCPI)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x87, 0x80, Messagenumber, adr);
-	cmsg->NCPI = NCPI;
-}
-
-static inline void capi_fill_SELECT_B_PROTOCOL_REQ(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-						   __u32 adr,
-						   __u16 B1protocol,
-						   __u16 B2protocol,
-						   __u16 B3protocol,
-						_cstruct B1configuration,
-						_cstruct B2configuration,
-						_cstruct B3configuration)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x41, 0x80, Messagenumber, adr);
-	cmsg->B1protocol = B1protocol;
-	cmsg->B2protocol = B2protocol;
-	cmsg->B3protocol = B3protocol;
-	cmsg->B1configuration = B1configuration;
-	cmsg->B2configuration = B2configuration;
-	cmsg->B3configuration = B3configuration;
-}
-
-static inline void capi_fill_CONNECT_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					  __u32 adr,
-					  __u16 Reject,
-					  __u16 B1protocol,
-					  __u16 B2protocol,
-					  __u16 B3protocol,
-					  _cstruct B1configuration,
-					  _cstruct B2configuration,
-					  _cstruct B3configuration,
-					  _cstruct ConnectedNumber,
-					  _cstruct ConnectedSubaddress,
-					  _cstruct LLC,
-					  _cstruct BChannelinformation,
-					  _cstruct Keypadfacility,
-					  _cstruct Useruserdata,
-					  _cstruct Facilitydataarray)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x02, 0x83, Messagenumber, adr);
-	cmsg->Reject = Reject;
-	cmsg->B1protocol = B1protocol;
-	cmsg->B2protocol = B2protocol;
-	cmsg->B3protocol = B3protocol;
-	cmsg->B1configuration = B1configuration;
-	cmsg->B2configuration = B2configuration;
-	cmsg->B3configuration = B3configuration;
-	cmsg->ConnectedNumber = ConnectedNumber;
-	cmsg->ConnectedSubaddress = ConnectedSubaddress;
-	cmsg->LLC = LLC;
-	cmsg->BChannelinformation = BChannelinformation;
-	cmsg->Keypadfacility = Keypadfacility;
-	cmsg->Useruserdata = Useruserdata;
-	cmsg->Facilitydataarray = Facilitydataarray;
-}
-
-static inline void capi_fill_CONNECT_ACTIVE_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-						 __u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x03, 0x83, Messagenumber, adr);
-}
-
-static inline void capi_fill_CONNECT_B3_ACTIVE_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-						    __u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x83, 0x83, Messagenumber, adr);
-}
-
-static inline void capi_fill_CONNECT_B3_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					     __u32 adr,
-					     __u16 Reject,
-					     _cstruct NCPI)
-{
-	capi_cmsg_header(cmsg, ApplId, 0x82, 0x83, Messagenumber, adr);
-	cmsg->Reject = Reject;
-	cmsg->NCPI = NCPI;
-}
-
-static inline void capi_fill_CONNECT_B3_T90_ACTIVE_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-							__u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x88, 0x83, Messagenumber, adr);
-}
-
-static inline void capi_fill_DATA_B3_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					  __u32 adr,
-					  __u16 DataHandle)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x86, 0x83, Messagenumber, adr);
-	cmsg->DataHandle = DataHandle;
-}
-
-static inline void capi_fill_DISCONNECT_B3_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-						__u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x84, 0x83, Messagenumber, adr);
-}
-
-static inline void capi_fill_DISCONNECT_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					     __u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x04, 0x83, Messagenumber, adr);
-}
-
-static inline void capi_fill_FACILITY_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					   __u32 adr,
-					   __u16 FacilitySelector)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x80, 0x83, Messagenumber, adr);
-	cmsg->FacilitySelector = FacilitySelector;
-}
-
-static inline void capi_fill_INFO_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-				       __u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x08, 0x83, Messagenumber, adr);
-}
-
-static inline void capi_fill_MANUFACTURER_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					       __u32 adr,
-					       __u32 ManuID,
-					       __u32 Class,
-					       __u32 Function,
-					       _cstruct ManuData)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0xff, 0x83, Messagenumber, adr);
-	cmsg->ManuID = ManuID;
-	cmsg->Class = Class;
-	cmsg->Function = Function;
-	cmsg->ManuData = ManuData;
-}
-
-static inline void capi_fill_RESET_B3_RESP(_cmsg * cmsg, __u16 ApplId, __u16 Messagenumber,
-					   __u32 adr)
-{
-
-	capi_cmsg_header(cmsg, ApplId, 0x87, 0x83, Messagenumber, adr);
-}
-
 #endif				/* __CAPIUTIL_H__ */
diff --git a/include/linux/kernelcapi.h b/include/linux/kernelcapi.h
index 075fab5f92e1..94ba42bf9da1 100644
--- a/include/linux/kernelcapi.h
+++ b/include/linux/kernelcapi.h
@@ -10,46 +10,12 @@
 #ifndef __KERNELCAPI_H__
 #define __KERNELCAPI_H__
 
-
 #include <linux/list.h>
 #include <linux/skbuff.h>
 #include <linux/workqueue.h>
 #include <linux/notifier.h>
 #include <uapi/linux/kernelcapi.h>
 
-struct capi20_appl {
-	u16 applid;
-	capi_register_params rparam;
-	void (*recv_message)(struct capi20_appl *ap, struct sk_buff *skb);
-	void *private;
-
-	/* internal to kernelcapi.o */
-	unsigned long nrecvctlpkt;
-	unsigned long nrecvdatapkt;
-	unsigned long nsentctlpkt;
-	unsigned long nsentdatapkt;
-	struct mutex recv_mtx;
-	struct sk_buff_head recv_queue;
-	struct work_struct recv_work;
-	int release_in_progress;
-};
-
-u16 capi20_isinstalled(void);
-u16 capi20_register(struct capi20_appl *ap);
-u16 capi20_release(struct capi20_appl *ap);
-u16 capi20_put_message(struct capi20_appl *ap, struct sk_buff *skb);
-u16 capi20_get_manufacturer(u32 contr, u8 buf[CAPI_MANUFACTURER_LEN]);
-u16 capi20_get_version(u32 contr, struct capi_version *verp);
-u16 capi20_get_serial(u32 contr, u8 serial[CAPI_SERIAL_LEN]);
-u16 capi20_get_profile(u32 contr, struct capi_profile *profp);
-int capi20_manufacturer(unsigned long cmd, void __user *data);
-
-#define CAPICTR_UP			0
-#define CAPICTR_DOWN			1
-
-int register_capictr_notifier(struct notifier_block *nb);
-int unregister_capictr_notifier(struct notifier_block *nb);
-
 #define CAPI_NOERROR                      0x0000
 
 #define CAPI_TOOMANYAPPLS		  0x1001
@@ -76,45 +42,4 @@ int unregister_capictr_notifier(struct notifier_block *nb);
 #define CAPI_MSGCTRLERNOTSUPPORTEXTEQUIP  0x110a
 #define CAPI_MSGCTRLERONLYSUPPORTEXTEQUIP 0x110b
 
-typedef enum {
-        CapiMessageNotSupportedInCurrentState = 0x2001,
-        CapiIllContrPlciNcci                  = 0x2002,
-        CapiNoPlciAvailable                   = 0x2003,
-        CapiNoNcciAvailable                   = 0x2004,
-        CapiNoListenResourcesAvailable        = 0x2005,
-        CapiNoFaxResourcesAvailable           = 0x2006,
-        CapiIllMessageParmCoding              = 0x2007,
-} RESOURCE_CODING_PROBLEM;
-
-typedef enum {
-        CapiB1ProtocolNotSupported                      = 0x3001,
-        CapiB2ProtocolNotSupported                      = 0x3002,
-        CapiB3ProtocolNotSupported                      = 0x3003,
-        CapiB1ProtocolParameterNotSupported             = 0x3004,
-        CapiB2ProtocolParameterNotSupported             = 0x3005,
-        CapiB3ProtocolParameterNotSupported             = 0x3006,
-        CapiBProtocolCombinationNotSupported            = 0x3007,
-        CapiNcpiNotSupported                            = 0x3008,
-        CapiCipValueUnknown                             = 0x3009,
-        CapiFlagsNotSupported                           = 0x300a,
-        CapiFacilityNotSupported                        = 0x300b,
-        CapiDataLengthNotSupportedByCurrentProtocol     = 0x300c,
-        CapiResetProcedureNotSupportedByCurrentProtocol = 0x300d,
-        CapiTeiAssignmentFailed                         = 0x300e,
-} REQUESTED_SERVICES_PROBLEM;
-
-typedef enum {
-	CapiSuccess                                     = 0x0000,
-	CapiSupplementaryServiceNotSupported            = 0x300e,
-	CapiRequestNotAllowedInThisState                = 0x3010,
-} SUPPLEMENTARY_SERVICE_INFO;
-
-typedef enum {
-	CapiProtocolErrorLayer1                         = 0x3301,
-	CapiProtocolErrorLayer2                         = 0x3302,
-	CapiProtocolErrorLayer3                         = 0x3303,
-	CapiTimeOut                                     = 0x3303, // SuppServiceReason
-	CapiCallGivenToOtherApplication                 = 0x3304,
-} CAPI_REASON;
-
 #endif				/* __KERNELCAPI_H__ */
diff --git a/include/uapi/linux/b1lli.h b/include/uapi/linux/b1lli.h
deleted file mode 100644
index 4ae6ac9950df..000000000000
--- a/include/uapi/linux/b1lli.h
+++ /dev/null
@@ -1,74 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
-/* $Id: b1lli.h,v 1.8.8.3 2001/09/23 22:25:05 kai Exp $
- *
- * ISDN lowlevel-module for AVM B1-card.
- *
- * Copyright 1996 by Carsten Paeth (calle@calle.in-berlin.de)
- *
- * This software may be used and distributed according to the terms
- * of the GNU General Public License, incorporated herein by reference.
- *
- */
-
-#ifndef _B1LLI_H_
-#define _B1LLI_H_
-/*
- * struct for loading t4 file 
- */
-typedef struct avmb1_t4file {
-	int len;
-	unsigned char *data;
-} avmb1_t4file;
-
-typedef struct avmb1_loaddef {
-	int contr;
-	avmb1_t4file t4file;
-} avmb1_loaddef;
-
-typedef struct avmb1_loadandconfigdef {
-	int contr;
-	avmb1_t4file t4file;
-        avmb1_t4file t4config; 
-} avmb1_loadandconfigdef;
-
-typedef struct avmb1_resetdef {
-	int contr;
-} avmb1_resetdef;
-
-typedef struct avmb1_getdef {
-	int contr;
-	int cardtype;
-	int cardstate;
-} avmb1_getdef;
-
-/*
- * struct for adding new cards 
- */
-typedef struct avmb1_carddef {
-	int port;
-	int irq;
-} avmb1_carddef;
-
-#define AVM_CARDTYPE_B1		0
-#define AVM_CARDTYPE_T1		1
-#define AVM_CARDTYPE_M1		2
-#define AVM_CARDTYPE_M2		3
-
-typedef struct avmb1_extcarddef {
-	int port;
-	int irq;
-        int cardtype;
-        int cardnr;  /* for HEMA/T1 */
-} avmb1_extcarddef;
-
-#define	AVMB1_LOAD		0	/* load image to card */
-#define AVMB1_ADDCARD		1	/* add a new card - OBSOLETE */
-#define AVMB1_RESETCARD		2	/* reset a card */
-#define	AVMB1_LOAD_AND_CONFIG	3	/* load image and config to card */
-#define	AVMB1_ADDCARD_WITH_TYPE	4	/* add a new card, with cardtype */
-#define AVMB1_GET_CARDINFO	5	/* get cardtype */
-#define AVMB1_REMOVECARD	6	/* remove a card - OBSOLETE */
-
-#define	AVMB1_REGISTERCARD_IS_OBSOLETE
-
-#endif				/* _B1LLI_H_ */
-- 
cgit v1.2.3


From 2f48865db3322f8e74f51ff93b91e5c2916dd259 Mon Sep 17 00:00:00 2001
From: Marcel Holtmann <marcel@holtmann.org>
Date: Wed, 4 Dec 2019 04:41:09 +0100
Subject: HID: hidraw: add support uniq ioctl

Add the HIDIOCGRAWUNIQ ioctl for reading the uniq string of the
underlying HID device. Depending on the transport, this is for example
the iSerialNumber for USB or the BD_ADDR for Bluetooth.

Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
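Note for reviewers: a minimal userspace sketch of how the new ioctl
might be called. It is not part of this patch. The helper name
read_hidraw_uniq() and the device path handed to it are assumptions made
for illustration. Because the kernel copies at most _IOC_SIZE(cmd)
bytes, a truncated result is not NUL-terminated, so the sketch
terminates the buffer itself.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ioctl.h>
#include <linux/hidraw.h>

/*
 * Installed headers may predate this patch; the fallback mirrors the
 * definition this patch adds to include/uapi/linux/hidraw.h.
 */
#ifndef HIDIOCGRAWUNIQ
#define HIDIOCGRAWUNIQ(len)     _IOC(_IOC_READ, 'H', 0x08, len)
#endif

/*
 * Hypothetical helper: read the uniq string of the hidraw node at @path
 * into @buf. Returns the number of bytes the kernel copied (including
 * the NUL when the string fit), or -1 on any error.
 */
int read_hidraw_uniq(const char *path, char *buf, size_t size)
{
	int fd, ret;

	/* The length must fit into the size field of the ioctl number. */
	if (size == 0 || size >= (1u << _IOC_SIZEBITS))
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, HIDIOCGRAWUNIQ((unsigned int)size), buf);
	close(fd);
	if (ret < 0)
		return -1;

	/* A truncated copy carries no terminator, so add one. */
	buf[size - 1] = '\0';
	return ret;
}
```

A caller would pass a node that exists on its system, for instance
read_hidraw_uniq("/dev/hidraw0", buf, sizeof(buf)), and treat -1 as
"no uniq available".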
 drivers/hid/hidraw.c        | 9 +++++++++
 include/uapi/linux/hidraw.h | 1 +
 2 files changed, 10 insertions(+)

diff --git a/drivers/hid/hidraw.c b/drivers/hid/hidraw.c
index c3fc0ceb8096..5dfe80208932 100644
--- a/drivers/hid/hidraw.c
+++ b/drivers/hid/hidraw.c
@@ -450,6 +450,15 @@ static long hidraw_ioctl(struct file *file, unsigned int cmd,
 						-EFAULT : len;
 					break;
 				}
+
+				if (_IOC_NR(cmd) == _IOC_NR(HIDIOCGRAWUNIQ(0))) {
+					int len = strlen(hid->uniq) + 1;
+					if (len > _IOC_SIZE(cmd))
+						len = _IOC_SIZE(cmd);
+					ret = copy_to_user(user_arg, hid->uniq, len) ?
+						-EFAULT : len;
+					break;
+				}
 			}
 
 		ret = -ENOTTY;
diff --git a/include/uapi/linux/hidraw.h b/include/uapi/linux/hidraw.h
index 98e2c493de85..4913539e5bcc 100644
--- a/include/uapi/linux/hidraw.h
+++ b/include/uapi/linux/hidraw.h
@@ -39,6 +39,7 @@ struct hidraw_devinfo {
 /* The first byte of SFEATURE and GFEATURE is the report number */
 #define HIDIOCSFEATURE(len)    _IOC(_IOC_WRITE|_IOC_READ, 'H', 0x06, len)
 #define HIDIOCGFEATURE(len)    _IOC(_IOC_WRITE|_IOC_READ, 'H', 0x07, len)
+#define HIDIOCGRAWUNIQ(len)     _IOC(_IOC_READ, 'H', 0x08, len)
 
 #define HIDRAW_FIRST_MINOR 0
 #define HIDRAW_MAX_DEVICES 64
-- 
cgit v1.2.3


From bae141f54be83b06652c1d47e50e4e75ed4e9c7e Mon Sep 17 00:00:00 2001
From: Daniel Borkmann <daniel@iogearbox.net>
Date: Fri, 6 Dec 2019 22:49:34 +0100
Subject: bpf: Emit audit messages upon successful prog load and unload

Allow audit messages to be emitted upon BPF program load and unload
so that a timeline of events is available. The load itself happens
in syscall context, so additional info about the process initiating
the BPF prog creation can be logged and later correlated directly
with the unload event.

The only info really needed from the BPF side is the globally unique
prog ID. With it, audit user space tooling can query / dump all the
info it needs about the specific BPF program right at the load event
and enrich the record, so the changes needed here can be kept small
and non-intrusive to the core.

Raw example output:

  # auditctl -D
  # auditctl -a always,exit -F arch=x86_64 -S bpf
  # ausearch --start recent -m 1334
  ...
  ----
  time->Wed Nov 27 16:04:13 2019
  type=PROCTITLE msg=audit(1574867053.120:84664): proctitle="./bpf"
  type=SYSCALL msg=audit(1574867053.120:84664): arch=c000003e syscall=321   \
    success=yes exit=3 a0=5 a1=7ffea484fbe0 a2=70 a3=0 items=0 ppid=7477    \
    pid=12698 auid=1001 uid=1001 gid=1001 euid=1001 suid=1001 fsuid=1001    \
    egid=1001 sgid=1001 fsgid=1001 tty=pts2 ses=4 comm="bpf"                \
    exe="/home/jolsa/auditd/audit-testsuite/tests/bpf/bpf"                  \
    subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 key=(null)
  type=UNKNOWN[1334] msg=audit(1574867053.120:84664): prog-id=76 op=LOAD
  ----
  time->Wed Nov 27 16:04:13 2019
  type=UNKNOWN[1334] msg=audit(1574867053.120:84665): prog-id=76 op=UNLOAD
  ...

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Co-developed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/bpf/20191206214934.11319-1-jolsa@kernel.org
---
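Note for reviewers: a minimal sketch of how user space tooling might
pick the two fields out of an AUDIT_BPF record like the ones in the raw
example output. It is not part of this patch. parse_bpf_audit() and
enum bpf_audit_op are hypothetical names, and the sketch assumes the
"prog-id=%u op=%s" layout that bpf_audit_prog() emits in this patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical userspace mirror of the two operations logged here. */
enum bpf_audit_op {
	BPF_OP_LOAD,
	BPF_OP_UNLOAD,
};

/*
 * Extract prog-id and op from one audit record line. Returns 0 on
 * success and -1 if the line carries no well-formed AUDIT_BPF payload.
 * *prog_id may already be written when -1 is returned for a bad op.
 */
int parse_bpf_audit(const char *line, unsigned int *prog_id,
		    enum bpf_audit_op *op)
{
	const char *p = strstr(line, "prog-id=");
	char name[16];

	if (!p || sscanf(p, "prog-id=%u op=%15s", prog_id, name) != 2)
		return -1;

	if (strcmp(name, "LOAD") == 0)
		*op = BPF_OP_LOAD;
	else if (strcmp(name, "UNLOAD") == 0)
		*op = BPF_OP_UNLOAD;
	else
		return -1;

	return 0;
}
```

With the prog ID in hand, tooling could then look up the program while
it is still loaded and attach further details to the record.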
 include/uapi/linux/audit.h |  1 +
 kernel/bpf/syscall.c       | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 34 insertions(+)

diff --git a/include/uapi/linux/audit.h b/include/uapi/linux/audit.h
index 3ad935527177..a534d71e689a 100644
--- a/include/uapi/linux/audit.h
+++ b/include/uapi/linux/audit.h
@@ -116,6 +116,7 @@
 #define AUDIT_FANOTIFY		1331	/* Fanotify access decision */
 #define AUDIT_TIME_INJOFFSET	1332	/* Timekeeping offset injected */
 #define AUDIT_TIME_ADJNTPVAL	1333	/* NTP value adjustment */
+#define AUDIT_BPF		1334	/* BPF subsystem */
 
 #define AUDIT_AVC		1400	/* SE Linux avc denial or grant */
 #define AUDIT_SELINUX_ERR	1401	/* Internal SE Linux Errors */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e3461ec59570..66b90eaf99fe 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -23,6 +23,7 @@
 #include <linux/timekeeping.h>
 #include <linux/ctype.h>
 #include <linux/nospec.h>
+#include <linux/audit.h>
 #include <uapi/linux/btf.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
@@ -1306,6 +1307,36 @@ static int find_prog_type(enum bpf_prog_type type, struct bpf_prog *prog)
 	return 0;
 }
 
+enum bpf_audit {
+	BPF_AUDIT_LOAD,
+	BPF_AUDIT_UNLOAD,
+	BPF_AUDIT_MAX,
+};
+
+static const char * const bpf_audit_str[BPF_AUDIT_MAX] = {
+	[BPF_AUDIT_LOAD]   = "LOAD",
+	[BPF_AUDIT_UNLOAD] = "UNLOAD",
+};
+
+static void bpf_audit_prog(const struct bpf_prog *prog, unsigned int op)
+{
+	struct audit_context *ctx = NULL;
+	struct audit_buffer *ab;
+
+	if (WARN_ON_ONCE(op >= BPF_AUDIT_MAX))
+		return;
+	if (audit_enabled == AUDIT_OFF)
+		return;
+	if (op == BPF_AUDIT_LOAD)
+		ctx = audit_context();
+	ab = audit_log_start(ctx, GFP_ATOMIC, AUDIT_BPF);
+	if (unlikely(!ab))
+		return;
+	audit_log_format(ab, "prog-id=%u op=%s",
+			 prog->aux->id, bpf_audit_str[op]);
+	audit_log_end(ab);
+}
+
 int __bpf_prog_charge(struct user_struct *user, u32 pages)
 {
 	unsigned long memlock_limit = rlimit(RLIMIT_MEMLOCK) >> PAGE_SHIFT;
@@ -1421,6 +1452,7 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
 {
 	if (atomic64_dec_and_test(&prog->aux->refcnt)) {
 		perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_UNLOAD, 0);
+		bpf_audit_prog(prog, BPF_AUDIT_UNLOAD);
 		/* bpf_prog_free_id() must be called first */
 		bpf_prog_free_id(prog, do_idr_lock);
 		__bpf_prog_put_noref(prog, true);
@@ -1830,6 +1862,7 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	 */
 	bpf_prog_kallsyms_add(prog);
 	perf_event_bpf_event(prog, PERF_BPF_EVENT_PROG_LOAD, 0);
+	bpf_audit_prog(prog, BPF_AUDIT_LOAD);
 
 	err = bpf_prog_new_fd(prog);
 	if (err < 0)
-- 
cgit v1.2.3


From ef343b35d46667668a099655fca4a5b2e43a5dfe Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Tue, 10 Dec 2019 11:43:03 +0100
Subject: vsock: add VMADDR_CID_LOCAL definition

VMADDR_CID_RESERVED (1) was used by VMCI, but it is not used
anymore, so we can reuse it for local communication (loopback)
by adding the new well-known CID: VMADDR_CID_LOCAL.
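
A minimal user space sketch of filling in an address with the new CID
is shown below. The helper name vsock_local_addr is hypothetical, and
the #ifndef fallback is only there for headers that predate this
patch. Whether a connect() to such an address succeeds depends on a
loopback-capable transport being available, which this patch alone
does not provide.

```c
#include <string.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

/* Older uapi headers predate this patch; the value 1 is taken from it. */
#ifndef VMADDR_CID_LOCAL
#define VMADDR_CID_LOCAL 1
#endif

/* Hypothetical helper: build an AF_VSOCK address for local communication. */
static struct sockaddr_vm vsock_local_addr(unsigned int port)
{
	struct sockaddr_vm addr;

	memset(&addr, 0, sizeof(addr));
	addr.svm_family = AF_VSOCK;
	addr.svm_cid = VMADDR_CID_LOCAL;
	addr.svm_port = port;
	return addr;
}
```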

Cc: Jorgen Hansen <jhansen@vmware.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jorgen Hansen <jhansen@vmware.com>
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/vm_sockets.h | 8 +++++---
 net/vmw_vsock/vmci_transport.c  | 2 +-
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/uapi/linux/vm_sockets.h b/include/uapi/linux/vm_sockets.h
index 68d57c5e99bc..fd0ed7221645 100644
--- a/include/uapi/linux/vm_sockets.h
+++ b/include/uapi/linux/vm_sockets.h
@@ -99,11 +99,13 @@
 
 #define VMADDR_CID_HYPERVISOR 0
 
-/* This CID is specific to VMCI and can be considered reserved (even VMCI
- * doesn't use it anymore, it's a legacy value from an older release).
+/* Use this as the destination CID in an address when referring to the
+ * local communication (loopback).
+ * (This was VMADDR_CID_RESERVED, but even VMCI doesn't use it anymore,
+ * it was a legacy value from an older release).
  */
 
-#define VMADDR_CID_RESERVED 1
+#define VMADDR_CID_LOCAL 1
 
 /* Use this as the destination CID in an address when referring to the host
  * (any process other than the hypervisor).  VMCI relies on it being 2, but
diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 644d32e43d23..4b8b1150a738 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -648,7 +648,7 @@ static int vmci_transport_recv_dgram_cb(void *data, struct vmci_datagram *dg)
 static bool vmci_transport_stream_allow(u32 cid, u32 port)
 {
 	static const u32 non_socket_contexts[] = {
-		VMADDR_CID_RESERVED,
+		VMADDR_CID_LOCAL,
 	};
 	int i;
 
-- 
cgit v1.2.3


From f74877a5457d34d604dba6dbbb13c4c05bac8b93 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Wed, 11 Dec 2019 10:58:14 +0100
Subject: rtnetlink: provide permanent hardware address in RTM_NEWLINK

The permanent hardware address of a network device was traditionally
provided via the ethtool ioctl interface, but as Jiri Pirko pointed out in
a review of the ethtool netlink interface, rtnetlink is much more suitable
for it, so let's add it to the RTM_NEWLINK message.

Add IFLA_PERM_ADDRESS attribute to RTM_NEWLINK messages unless the
permanent address is all zeros (i.e. device driver did not fill it). As
permanent address is not modifiable, reject userspace requests containing
IFLA_PERM_ADDRESS attribute.

Note: we already provide permanent hardware address for bond slaves;
unfortunately we cannot drop that attribute for backward compatibility
reasons.

v5 -> v6: only add the attribute if permanent address is not zero
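
The "not all zeros" condition is checked in the kernel with
memchr_inv(). A user space stand-in for the same test could look like
the sketch below; the function name perm_addr_is_set is hypothetical
and only meant to show the logic.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Equivalent of memchr_inv(addr, 0, len) != NULL: true as soon as any
 * byte of the address differs from zero, false for an all-zero address.
 */
static bool perm_addr_is_set(const unsigned char *addr, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (addr[i] != 0)
			return true;
	return false;
}
```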

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Acked-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/if_link.h | 1 +
 net/core/rtnetlink.c         | 5 +++++
 2 files changed, 6 insertions(+)

diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 8aec8769d944..1d69f637c5d6 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -169,6 +169,7 @@ enum {
 	IFLA_MAX_MTU,
 	IFLA_PROP_LIST,
 	IFLA_ALT_IFNAME, /* Alternative ifname */
+	IFLA_PERM_ADDRESS,
 	__IFLA_MAX
 };
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 02916f43bf63..20bc406f3871 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1041,6 +1041,7 @@ static noinline size_t if_nlmsg_size(const struct net_device *dev,
 	       + nla_total_size(4)  /* IFLA_MIN_MTU */
 	       + nla_total_size(4)  /* IFLA_MAX_MTU */
 	       + rtnl_prop_list_size(dev)
+	       + nla_total_size(MAX_ADDR_LEN) /* IFLA_PERM_ADDRESS */
 	       + 0;
 }
 
@@ -1757,6 +1758,9 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb,
 	    nla_put_s32(skb, IFLA_NEW_IFINDEX, new_ifindex) < 0)
 		goto nla_put_failure;
 
+	if (memchr_inv(dev->perm_addr, '\0', dev->addr_len) &&
+	    nla_put(skb, IFLA_PERM_ADDRESS, dev->addr_len, dev->perm_addr))
+		goto nla_put_failure;
 
 	rcu_read_lock();
 	if (rtnl_fill_link_af(skb, dev, ext_filter_mask))
@@ -1822,6 +1826,7 @@ static const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 	[IFLA_PROP_LIST]	= { .type = NLA_NESTED },
 	[IFLA_ALT_IFNAME]	= { .type = NLA_STRING,
 				    .len = ALTIFNAMSIZ - 1 },
+	[IFLA_PERM_ADDRESS]	= { .type = NLA_REJECT },
 };
 
 static const struct nla_policy ifla_info_policy[IFLA_INFO_MAX+1] = {
-- 
cgit v1.2.3


From 428c122f5f6b54f40bd51c47495104b534b5a57c Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Wed, 11 Dec 2019 10:58:34 +0100
Subject: ethtool: provide link mode names as a string set

Unlike e.g. netdev features, the ethtool ioctl interface requires the link
mode table to be in sync between kernel and userspace for userspace to be
able to display and set all link modes supported by the kernel. With the way
arbitrary length bitsets are implemented in the netlink interface, this will
no longer be needed.

To allow userspace to access all link modes the running kernel supports, add
a table of ethernet link mode names and make it available as a string set to
userspace GET_STRSET requests. Add a build time check to make sure names
are defined for all modes declared in enum ethtool_link_mode_bit_indices.

Once the string set is available, make it also accessible via ioctl.
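
The name table relies on preprocessor stringification and token
pasting. The stand-alone sketch below reproduces the idea with a
three-entry table; all DEMO_* identifiers are made up for the
illustration and do not exist in the kernel.

```c
#include <assert.h>
#include <string.h>

#define DEMO_GSTRING_LEN 32	/* same size as ETH_GSTRING_LEN */

enum demo_link_mode_bit {
	DEMO_LINK_MODE_10baseT_Half_BIT,
	DEMO_LINK_MODE_1000baseT_Full_BIT,
	DEMO_LINK_MODE_Autoneg_BIT,
	DEMO_LINK_MODE_NBITS,
};

/* compose the enum constant from speed, type and duplex */
#define DEMO_LINK_MODE(speed, type, duplex) \
	DEMO_LINK_MODE_ ## speed ## base ## type ## _ ## duplex ## _BIT
/* compose the name: adjacent string literals are concatenated */
#define DEMO_LINK_MODE_NAME(speed, type, duplex) \
	#speed "base" #type "/" #duplex
#define DEMO_DEFINE_LINK_MODE_NAME(speed, type, duplex) \
	[DEMO_LINK_MODE(speed, type, duplex)] = \
	DEMO_LINK_MODE_NAME(speed, type, duplex)
#define DEMO_DEFINE_SPECIAL_MODE_NAME(_mode, _name) \
	[DEMO_LINK_MODE_ ## _mode ## _BIT] = _name

static const char demo_link_mode_names[][DEMO_GSTRING_LEN] = {
	DEMO_DEFINE_LINK_MODE_NAME(10, T, Half),
	DEMO_DEFINE_LINK_MODE_NAME(1000, T, Full),
	DEMO_DEFINE_SPECIAL_MODE_NAME(Autoneg, "Autoneg"),
};
/* the array is sized by its highest initialized index */
static_assert(sizeof(demo_link_mode_names) / sizeof(demo_link_mode_names[0]) ==
	      DEMO_LINK_MODE_NBITS, "table size must match the enum");
```

Note that a size check of this kind catches a missing name only when
the omitted entry is the last one in the enum; a gap in the middle
would leave an empty string in the table instead.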

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/ethtool.h |  2 ++
 net/ethtool/common.c         | 86 ++++++++++++++++++++++++++++++++++++++++++++
 net/ethtool/common.h         |  5 +++
 net/ethtool/ioctl.c          |  6 ++++
 4 files changed, 99 insertions(+)

diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index d4591792f0b4..f44155840b07 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -593,6 +593,7 @@ struct ethtool_pauseparam {
  * @ETH_SS_RSS_HASH_FUNCS: RSS hush function names
  * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS
  * @ETH_SS_PHY_TUNABLES: PHY tunable names
+ * @ETH_SS_LINK_MODES: link mode names
  */
 enum ethtool_stringset {
 	ETH_SS_TEST		= 0,
@@ -604,6 +605,7 @@ enum ethtool_stringset {
 	ETH_SS_TUNABLES,
 	ETH_SS_PHY_STATS,
 	ETH_SS_PHY_TUNABLES,
+	ETH_SS_LINK_MODES,
 };
 
 /**
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 8b5e11e7e0a6..0a8728565356 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -83,3 +83,89 @@ phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN] = {
 	[ETHTOOL_PHY_FAST_LINK_DOWN] = "phy-fast-link-down",
 	[ETHTOOL_PHY_EDPD]	= "phy-energy-detect-power-down",
 };
+
+#define __LINK_MODE_NAME(speed, type, duplex) \
+	#speed "base" #type "/" #duplex
+#define __DEFINE_LINK_MODE_NAME(speed, type, duplex) \
+	[ETHTOOL_LINK_MODE(speed, type, duplex)] = \
+	__LINK_MODE_NAME(speed, type, duplex)
+#define __DEFINE_SPECIAL_MODE_NAME(_mode, _name) \
+	[ETHTOOL_LINK_MODE_ ## _mode ## _BIT] = _name
+
+const char link_mode_names[][ETH_GSTRING_LEN] = {
+	__DEFINE_LINK_MODE_NAME(10, T, Half),
+	__DEFINE_LINK_MODE_NAME(10, T, Full),
+	__DEFINE_LINK_MODE_NAME(100, T, Half),
+	__DEFINE_LINK_MODE_NAME(100, T, Full),
+	__DEFINE_LINK_MODE_NAME(1000, T, Half),
+	__DEFINE_LINK_MODE_NAME(1000, T, Full),
+	__DEFINE_SPECIAL_MODE_NAME(Autoneg, "Autoneg"),
+	__DEFINE_SPECIAL_MODE_NAME(TP, "TP"),
+	__DEFINE_SPECIAL_MODE_NAME(AUI, "AUI"),
+	__DEFINE_SPECIAL_MODE_NAME(MII, "MII"),
+	__DEFINE_SPECIAL_MODE_NAME(FIBRE, "FIBRE"),
+	__DEFINE_SPECIAL_MODE_NAME(BNC, "BNC"),
+	__DEFINE_LINK_MODE_NAME(10000, T, Full),
+	__DEFINE_SPECIAL_MODE_NAME(Pause, "Pause"),
+	__DEFINE_SPECIAL_MODE_NAME(Asym_Pause, "Asym_Pause"),
+	__DEFINE_LINK_MODE_NAME(2500, X, Full),
+	__DEFINE_SPECIAL_MODE_NAME(Backplane, "Backplane"),
+	__DEFINE_LINK_MODE_NAME(1000, KX, Full),
+	__DEFINE_LINK_MODE_NAME(10000, KX4, Full),
+	__DEFINE_LINK_MODE_NAME(10000, KR, Full),
+	__DEFINE_SPECIAL_MODE_NAME(10000baseR_FEC, "10000baseR_FEC"),
+	__DEFINE_LINK_MODE_NAME(20000, MLD2, Full),
+	__DEFINE_LINK_MODE_NAME(20000, KR2, Full),
+	__DEFINE_LINK_MODE_NAME(40000, KR4, Full),
+	__DEFINE_LINK_MODE_NAME(40000, CR4, Full),
+	__DEFINE_LINK_MODE_NAME(40000, SR4, Full),
+	__DEFINE_LINK_MODE_NAME(40000, LR4, Full),
+	__DEFINE_LINK_MODE_NAME(56000, KR4, Full),
+	__DEFINE_LINK_MODE_NAME(56000, CR4, Full),
+	__DEFINE_LINK_MODE_NAME(56000, SR4, Full),
+	__DEFINE_LINK_MODE_NAME(56000, LR4, Full),
+	__DEFINE_LINK_MODE_NAME(25000, CR, Full),
+	__DEFINE_LINK_MODE_NAME(25000, KR, Full),
+	__DEFINE_LINK_MODE_NAME(25000, SR, Full),
+	__DEFINE_LINK_MODE_NAME(50000, CR2, Full),
+	__DEFINE_LINK_MODE_NAME(50000, KR2, Full),
+	__DEFINE_LINK_MODE_NAME(100000, KR4, Full),
+	__DEFINE_LINK_MODE_NAME(100000, SR4, Full),
+	__DEFINE_LINK_MODE_NAME(100000, CR4, Full),
+	__DEFINE_LINK_MODE_NAME(100000, LR4_ER4, Full),
+	__DEFINE_LINK_MODE_NAME(50000, SR2, Full),
+	__DEFINE_LINK_MODE_NAME(1000, X, Full),
+	__DEFINE_LINK_MODE_NAME(10000, CR, Full),
+	__DEFINE_LINK_MODE_NAME(10000, SR, Full),
+	__DEFINE_LINK_MODE_NAME(10000, LR, Full),
+	__DEFINE_LINK_MODE_NAME(10000, LRM, Full),
+	__DEFINE_LINK_MODE_NAME(10000, ER, Full),
+	__DEFINE_LINK_MODE_NAME(2500, T, Full),
+	__DEFINE_LINK_MODE_NAME(5000, T, Full),
+	__DEFINE_SPECIAL_MODE_NAME(FEC_NONE, "None"),
+	__DEFINE_SPECIAL_MODE_NAME(FEC_RS, "RS"),
+	__DEFINE_SPECIAL_MODE_NAME(FEC_BASER, "BASER"),
+	__DEFINE_LINK_MODE_NAME(50000, KR, Full),
+	__DEFINE_LINK_MODE_NAME(50000, SR, Full),
+	__DEFINE_LINK_MODE_NAME(50000, CR, Full),
+	__DEFINE_LINK_MODE_NAME(50000, LR_ER_FR, Full),
+	__DEFINE_LINK_MODE_NAME(50000, DR, Full),
+	__DEFINE_LINK_MODE_NAME(100000, KR2, Full),
+	__DEFINE_LINK_MODE_NAME(100000, SR2, Full),
+	__DEFINE_LINK_MODE_NAME(100000, CR2, Full),
+	__DEFINE_LINK_MODE_NAME(100000, LR2_ER2_FR2, Full),
+	__DEFINE_LINK_MODE_NAME(100000, DR2, Full),
+	__DEFINE_LINK_MODE_NAME(200000, KR4, Full),
+	__DEFINE_LINK_MODE_NAME(200000, SR4, Full),
+	__DEFINE_LINK_MODE_NAME(200000, LR4_ER4_FR4, Full),
+	__DEFINE_LINK_MODE_NAME(200000, DR4, Full),
+	__DEFINE_LINK_MODE_NAME(200000, CR4, Full),
+	__DEFINE_LINK_MODE_NAME(100, T1, Full),
+	__DEFINE_LINK_MODE_NAME(1000, T1, Full),
+	__DEFINE_LINK_MODE_NAME(400000, KR8, Full),
+	__DEFINE_LINK_MODE_NAME(400000, SR8, Full),
+	__DEFINE_LINK_MODE_NAME(400000, LR8_ER8_FR8, Full),
+	__DEFINE_LINK_MODE_NAME(400000, DR8, Full),
+	__DEFINE_LINK_MODE_NAME(400000, CR8, Full),
+};
+static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS);
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index 336566430be4..bbb788908cb1 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -5,6 +5,10 @@
 
 #include <linux/ethtool.h>
 
+/* compose link mode index from speed, type and duplex */
+#define ETHTOOL_LINK_MODE(speed, type, duplex) \
+	ETHTOOL_LINK_MODE_ ## speed ## base ## type ## _ ## duplex ## _BIT
+
 extern const char
 netdev_features_strings[NETDEV_FEATURE_COUNT][ETH_GSTRING_LEN];
 extern const char
@@ -13,5 +17,6 @@ extern const char
 tunable_strings[__ETHTOOL_TUNABLE_COUNT][ETH_GSTRING_LEN];
 extern const char
 phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN];
+extern const char link_mode_names[][ETH_GSTRING_LEN];
 
 #endif /* _ETHTOOL_COMMON_H */
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index b262db5a1d91..aed2c2cf1623 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -154,6 +154,9 @@ static int __ethtool_get_sset_count(struct net_device *dev, int sset)
 	    !ops->get_ethtool_phy_stats)
 		return phy_ethtool_get_sset_count(dev->phydev);
 
+	if (sset == ETH_SS_LINK_MODES)
+		return __ETHTOOL_LINK_MODE_MASK_NBITS;
+
 	if (ops->get_sset_count && ops->get_strings)
 		return ops->get_sset_count(dev, sset);
 	else
@@ -178,6 +181,9 @@ static void __ethtool_get_strings(struct net_device *dev,
 	else if (stringset == ETH_SS_PHY_STATS && dev->phydev &&
 		 !ops->get_ethtool_phy_stats)
 		phy_ethtool_get_strings(dev->phydev, data);
+	else if (stringset == ETH_SS_LINK_MODES)
+		memcpy(data, link_mode_names,
+		       __ETHTOOL_LINK_MODE_MASK_NBITS * ETH_GSTRING_LEN);
 	else
 		/* ops->get_strings is valid because checked earlier */
 		ops->get_strings(dev, stringset, data);
-- 
cgit v1.2.3


From 826f66b30c2e3d4df533e4224cb8c734afa0a17c Mon Sep 17 00:00:00 2001
From: Andy Roulin <aroulin@cumulusnetworks.com>
Date: Wed, 11 Dec 2019 14:30:58 -0800
Subject: bonding: move 802.3ad port state flags to uapi

The bond slave actor/partner operating state is exported as a
bitfield to userspace, which lacks a way to interpret it; e.g.,
iproute2 only prints the state as a number:

ad_actor_oper_port_state 15

For userspace to interpret the bitfield, the bitfield definitions
should be part of the uapi. The bitfield itself is defined in the
802.3ad standard.

This commit moves the 802.3ad bitfield definitions to uapi.

Related iproute2 patches, soon to be posted upstream, use the new uapi
headers to pretty-print bond slave state, e.g., with ip -d link show

ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync>
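
A sketch of how user space could decode the bitfield with these
definitions follows. The function name ad_state_to_str is hypothetical
and this is not the iproute2 code. The first four names come from the
example output above; the remaining four are assumed for the sake of
the illustration.

```c
#include <stdio.h>
#include <string.h>

/* Values as moved to include/uapi/linux/if_bonding.h by this patch. */
#define AD_STATE_LACP_ACTIVITY   0x1
#define AD_STATE_LACP_TIMEOUT    0x2
#define AD_STATE_AGGREGATION     0x4
#define AD_STATE_SYNCHRONIZATION 0x8
#define AD_STATE_COLLECTING      0x10
#define AD_STATE_DISTRIBUTING    0x20
#define AD_STATE_DEFAULTED       0x40
#define AD_STATE_EXPIRED         0x80

static const struct {
	unsigned int mask;
	const char *name;
} ad_state_names[] = {
	{ AD_STATE_LACP_ACTIVITY,   "active" },
	{ AD_STATE_LACP_TIMEOUT,    "short_timeout" },
	{ AD_STATE_AGGREGATION,     "aggregating" },
	{ AD_STATE_SYNCHRONIZATION, "in_sync" },
	{ AD_STATE_COLLECTING,      "collecting" },	/* assumed name */
	{ AD_STATE_DISTRIBUTING,    "distributing" },	/* assumed name */
	{ AD_STATE_DEFAULTED,       "defaulted" },	/* assumed name */
	{ AD_STATE_EXPIRED,         "expired" },	/* assumed name */
};

/* Write a comma-separated list of set flags to buf; return its length. */
static size_t ad_state_to_str(unsigned int state, char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return 0;
	buf[0] = '\0';
	for (size_t i = 0; i < sizeof(ad_state_names) / sizeof(ad_state_names[0]); i++) {
		int n;

		if (!(state & ad_state_names[i].mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? "," : "", ad_state_names[i].name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* buffer full, output is truncated */
		used += (size_t)n;
	}
	return used;
}
```

For the value 15 from the example above, this yields
"active,short_timeout,aggregating,in_sync".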

Signed-off-by: Andy Roulin <aroulin@cumulusnetworks.com>
Acked-by: Roopa Prabhu <roopa@cumulusnetworks.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/bonding/bond_3ad.c  | 10 ----------
 include/uapi/linux/if_bonding.h | 10 ++++++++++
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index e3b25f310936..34bfe99641a3 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -31,16 +31,6 @@
 #define AD_CHURN_DETECTION_TIME    60
 #define AD_AGGREGATE_WAIT_TIME     2
 
-/* Port state definitions (43.4.2.2 in the 802.3ad standard) */
-#define AD_STATE_LACP_ACTIVITY   0x1
-#define AD_STATE_LACP_TIMEOUT    0x2
-#define AD_STATE_AGGREGATION     0x4
-#define AD_STATE_SYNCHRONIZATION 0x8
-#define AD_STATE_COLLECTING      0x10
-#define AD_STATE_DISTRIBUTING    0x20
-#define AD_STATE_DEFAULTED       0x40
-#define AD_STATE_EXPIRED         0x80
-
 /* Port Variables definitions used by the State Machines (43.4.7 in the
  * 802.3ad standard)
  */
diff --git a/include/uapi/linux/if_bonding.h b/include/uapi/linux/if_bonding.h
index 790585f0e61b..6829213a54c5 100644
--- a/include/uapi/linux/if_bonding.h
+++ b/include/uapi/linux/if_bonding.h
@@ -95,6 +95,16 @@
 #define BOND_XMIT_POLICY_ENCAP23	3 /* encapsulated layer 2+3 */
 #define BOND_XMIT_POLICY_ENCAP34	4 /* encapsulated layer 3+4 */
 
+/* 802.3ad port state definitions (43.4.2.2 in the 802.3ad standard) */
+#define AD_STATE_LACP_ACTIVITY   0x1
+#define AD_STATE_LACP_TIMEOUT    0x2
+#define AD_STATE_AGGREGATION     0x4
+#define AD_STATE_SYNCHRONIZATION 0x8
+#define AD_STATE_COLLECTING      0x10
+#define AD_STATE_DISTRIBUTING    0x20
+#define AD_STATE_DEFAULTED       0x40
+#define AD_STATE_EXPIRED         0x80
+
 typedef struct ifbond {
 	__s32 bond_mode;
 	__s32 num_slaves;
-- 
cgit v1.2.3


From de1799667b002331609e8ee6b924ccfd51968d99 Mon Sep 17 00:00:00 2001
From: Vivien Didelot <vivien.didelot@gmail.com>
Date: Wed, 11 Dec 2019 20:07:10 -0500
Subject: net: bridge: add STP xstats

This adds rx_bpdu, tx_bpdu, rx_tcn, tx_tcn, transition_blk,
transition_fwd xstats counters to the bridge ports copied over via
netlink, providing useful information for STP.
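
On the receiving side, the counters arrive as the payload of a
BRIDGE_XSTATS_STP attribute. A minimal sketch of copying that payload
out is given below; struct stp_xstats is a local mirror of struct
bridge_stp_xstats and copy_stp_xstats is a hypothetical helper, so
neither name exists in the kernel or in any netlink library.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of struct bridge_stp_xstats (__u64 becomes uint64_t). */
struct stp_xstats {
	uint64_t transition_blk;
	uint64_t transition_fwd;
	uint64_t rx_bpdu;
	uint64_t tx_bpdu;
	uint64_t rx_tcn;
	uint64_t tx_tcn;
};

/*
 * Copy the attribute payload into *out. Returns -1 if the payload is
 * shorter than the structure, 0 otherwise.
 */
static int copy_stp_xstats(const void *payload, size_t len,
			   struct stp_xstats *out)
{
	if (len < sizeof(*out))
		return -1;
	memcpy(out, payload, sizeof(*out));
	return 0;
}
```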

Signed-off-by: Vivien Didelot <vivien.didelot@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/uapi/linux/if_bridge.h | 10 ++++++++++
 net/bridge/br_netlink.c        | 13 +++++++++++++
 net/bridge/br_private.h        |  2 ++
 net/bridge/br_stp.c            | 15 +++++++++++++++
 net/bridge/br_stp_bpdu.c       |  4 ++++
 5 files changed, 44 insertions(+)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 1b3c2b643a02..4a58e3d7de46 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -156,6 +156,15 @@ struct bridge_vlan_xstats {
 	__u32 pad2;
 };
 
+struct bridge_stp_xstats {
+	__u64 transition_blk;
+	__u64 transition_fwd;
+	__u64 rx_bpdu;
+	__u64 tx_bpdu;
+	__u64 rx_tcn;
+	__u64 tx_tcn;
+};
+
 /* Bridge multicast database attributes
  * [MDBA_MDB] = {
  *     [MDBA_MDB_ENTRY] = {
@@ -262,6 +271,7 @@ enum {
 	BRIDGE_XSTATS_VLAN,
 	BRIDGE_XSTATS_MCAST,
 	BRIDGE_XSTATS_PAD,
+	BRIDGE_XSTATS_STP,
 	__BRIDGE_XSTATS_MAX
 };
 #define BRIDGE_XSTATS_MAX (__BRIDGE_XSTATS_MAX - 1)
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index a0a54482aabc..60136575aea4 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1607,6 +1607,19 @@ static int br_fill_linkxstats(struct sk_buff *skb,
 		br_multicast_get_stats(br, p, nla_data(nla));
 	}
 #endif
+
+	if (p) {
+		nla = nla_reserve_64bit(skb, BRIDGE_XSTATS_STP,
+					sizeof(p->stp_xstats),
+					BRIDGE_XSTATS_PAD);
+		if (!nla)
+			goto nla_put_failure;
+
+		spin_lock_bh(&br->lock);
+		memcpy(nla_data(nla), &p->stp_xstats, sizeof(p->stp_xstats));
+		spin_unlock_bh(&br->lock);
+	}
+
 	nla_nest_end(skb, nest);
 	*prividx = 0;
 
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 36b0367ca1e0..f540f3bdf294 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -283,6 +283,8 @@ struct net_bridge_port {
 #endif
 	u16				group_fwd_mask;
 	u16				backup_redirected_cnt;
+
+	struct bridge_stp_xstats	stp_xstats;
 };
 
 #define kobj_to_brport(obj)	container_of(obj, struct net_bridge_port, kobj)
diff --git a/net/bridge/br_stp.c b/net/bridge/br_stp.c
index 1f1410f8d312..6856a6d9282b 100644
--- a/net/bridge/br_stp.c
+++ b/net/bridge/br_stp.c
@@ -45,6 +45,17 @@ void br_set_state(struct net_bridge_port *p, unsigned int state)
 		br_info(p->br, "port %u(%s) entered %s state\n",
 				(unsigned int) p->port_no, p->dev->name,
 				br_port_state_names[p->state]);
+
+	if (p->br->stp_enabled == BR_KERNEL_STP) {
+		switch (p->state) {
+		case BR_STATE_BLOCKING:
+			p->stp_xstats.transition_blk++;
+			break;
+		case BR_STATE_FORWARDING:
+			p->stp_xstats.transition_fwd++;
+			break;
+		}
+	}
 }
 
 /* called under bridge lock */
@@ -484,6 +495,8 @@ void br_received_config_bpdu(struct net_bridge_port *p,
 	struct net_bridge *br;
 	int was_root;
 
+	p->stp_xstats.rx_bpdu++;
+
 	br = p->br;
 	was_root = br_is_root_bridge(br);
 
@@ -517,6 +530,8 @@ void br_received_config_bpdu(struct net_bridge_port *p,
 /* called under bridge lock */
 void br_received_tcn_bpdu(struct net_bridge_port *p)
 {
+	p->stp_xstats.rx_tcn++;
+
 	if (br_is_designated_port(p)) {
 		br_info(p->br, "port %u(%s) received tcn bpdu\n",
 			(unsigned int) p->port_no, p->dev->name);
diff --git a/net/bridge/br_stp_bpdu.c b/net/bridge/br_stp_bpdu.c
index 7796dd9d42d7..0e4572f31330 100644
--- a/net/bridge/br_stp_bpdu.c
+++ b/net/bridge/br_stp_bpdu.c
@@ -118,6 +118,8 @@ void br_send_config_bpdu(struct net_bridge_port *p, struct br_config_bpdu *bpdu)
 	br_set_ticks(buf+33, bpdu->forward_delay);
 
 	br_send_bpdu(p, buf, 35);
+
+	p->stp_xstats.tx_bpdu++;
 }
 
 /* called under bridge lock */
@@ -133,6 +135,8 @@ void br_send_tcn_bpdu(struct net_bridge_port *p)
 	buf[2] = 0;
 	buf[3] = BPDU_TYPE_TCN;
 	br_send_bpdu(p, buf, 4);
+
+	p->stp_xstats.tx_tcn++;
 }
 
 /*
-- 
cgit v1.2.3


From 166750bc1dd256b2184b22588fb9fe6d3fbb93ae Mon Sep 17 00:00:00 2001
From: Andrii Nakryiko <andriin@fb.com>
Date: Fri, 13 Dec 2019 17:47:08 -0800
Subject: libbpf: Support libbpf-provided extern variables

Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of the kernel in which the BPF program is
    executing, follows the KERNEL_VERSION() macro convention, can be 4 or
    8 bytes long;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, strings, and integer values are supported.
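
For reference, the KERNEL_VERSION() convention packs major, minor and
patch level into a single integer. The function below is a sketch of
the classic encoding written as plain C for illustration; the name
kernel_version is made up here.

```c
#include <stdint.h>

/* Classic KERNEL_VERSION(a, b, c) encoding: (a << 16) + (b << 8) + c. */
static uint32_t kernel_version(uint32_t major, uint32_t minor, uint32_t patch)
{
	return (major << 16) + (minor << 8) + patch;
}
```

Because of this layout, two encoded versions can be compared with
ordinary integer comparison.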

Set of possible values is determined by declared type of extern variable.
Supported types of variables are:
- Tristate values. Are represented as `enum libbpf_tristate`. Accepted values
  are **strictly** 'y', 'n', or 'm', which are represented as TRI_YES, TRI_NO,
  or TRI_MODULE, respectively.
- Boolean values. Are represented as bool (_Bool) types. Accepted values are
  'y' and 'n' only, turning into true/false values, respectively.
- Single-character values. Can be used either as a substitute for
  bool/tristate or as a small-range integer:
  - 'y'/'n'/'m' are represented as is, as characters 'y', 'n', or 'm';
  - integers in a range [-128, 127] or [0, 255] (depending on signedness of
    char in target architecture) are recognized and represented with
    respective values of char type.
- Strings. String values are declared as fixed-length char arrays. A string of
  up to that length will be accepted and put in the first N bytes of the char
  array, with the rest of the bytes zeroed out. If the config string value is
  longer than the space allotted, it will be truncated and a warning message
  emitted. The char array is always zero terminated. String literals in config
  have to be enclosed in double quotes, just like C-style string literals.
- Integers. 8-, 16-, 32-, and 64-bit integers are supported, both signed and
  unsigned variants. Libbpf enforces that the parsed config value is in the
  supported range of the corresponding integer type. Integer values in config
  can be:
  - decimal integers, with optional + and - signs;
  - hexadecimal integers, prefixed with 0x or 0X;
  - octal integers, starting with 0.
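
The tristate and integer rules above can be sketched in user space as
follows. This is not the libbpf implementation: parse_tristate and
parse_int are hypothetical names, and the integer helper is limited to
1-, 2- and 4-byte targets to keep the range check simple. It relies on
strtoll() with base 0, which accepts the same decimal, 0x/0X and
leading-0 octal forms.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

enum libbpf_tristate {
	TRI_NO = 0,
	TRI_YES = 1,
	TRI_MODULE = 2,
};

/* Accept strictly "y", "n" or "m"; anything else is rejected. */
static int parse_tristate(const char *val, enum libbpf_tristate *out)
{
	if (val[0] == '\0' || val[1] != '\0')
		return -1;
	switch (val[0]) {
	case 'y': *out = TRI_YES; return 0;
	case 'n': *out = TRI_NO; return 0;
	case 'm': *out = TRI_MODULE; return 0;
	default: return -1;
	}
}

/* Parse an integer and enforce the range of an sz-byte target type. */
static int parse_int(const char *val, int sz, bool is_signed, long long *out)
{
	long long v, min, max;
	char *end;

	if (sz != 1 && sz != 2 && sz != 4)
		return -1;	/* 8-byte targets are left out of this sketch */
	errno = 0;
	v = strtoll(val, &end, 0);	/* base 0: decimal, 0x hex, 0 octal */
	if (errno || end == val || *end != '\0')
		return -1;
	if (is_signed) {
		max = (1LL << (sz * 8 - 1)) - 1;
		min = -max - 1;
	} else {
		max = (1LL << (sz * 8)) - 1;
		min = 0;
	}
	if (v < min || v > max)
		return -1;
	*out = v;
	return 0;
}
```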

The config file itself is searched for at /boot/config-$(uname -r), with a
fallback to /proc/config.gz, unless the config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds an explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already depends
on zlib.
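
The first lookup location can be derived from uname(2). The helper
below is a small sketch of that step only, under the assumption that
the release string is used verbatim; default_kconfig_path is a
hypothetical name, and the fallback to /proc/config.gz and the gzip
handling are not shown.

```c
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Build "/boot/config-<release>" for the running kernel.
 * Returns 0 on success, -1 on uname() failure or if buf is too small.
 */
static int default_kconfig_path(char *buf, size_t len)
{
	struct utsname uts;
	int n;

	if (uname(&uts))
		return -1;
	n = snprintf(buf, len, "/boot/config-%s", uts.release);
	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```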

All detected extern variables are put into a separate .extern internal map.
Similarly to the .rodata map, it is marked as read-only from the BPF program
side and is frozen on load. This allows the BPF verifier to track extern values
as constants and perform enhanced branch prediction and dead code elimination.
This can be relied upon for doing kernel version/feature detection and using
potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF
program, while still having a single version of the BPF program running on old
and new kernels. Selftests validate this explicitly for a non-existent BPF
helper.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20191214014710.3449601-3-andriin@fb.com
---
 include/uapi/linux/btf.h             |   3 +-
 tools/include/uapi/linux/btf.h       |   7 +-
 tools/lib/bpf/Makefile               |  15 +-
 tools/lib/bpf/bpf_helpers.h          |   9 +
 tools/lib/bpf/btf.c                  |   9 +-
 tools/lib/bpf/libbpf.c               | 738 ++++++++++++++++++++++++++++++++---
 tools/lib/bpf/libbpf.h               |  12 +-
 tools/testing/selftests/bpf/Makefile |   2 +-
 8 files changed, 729 insertions(+), 66 deletions(-)

diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index c02dec97e1ce..1a2898c482ee 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -142,7 +142,8 @@ struct btf_param {
 
 enum {
 	BTF_VAR_STATIC = 0,
-	BTF_VAR_GLOBAL_ALLOCATED,
+	BTF_VAR_GLOBAL_ALLOCATED = 1,
+	BTF_VAR_GLOBAL_EXTERN = 2,
 };
 
 /* BTF_KIND_VAR is followed by a single "struct btf_var" to describe
diff --git a/tools/include/uapi/linux/btf.h b/tools/include/uapi/linux/btf.h
index 63ae4a39e58b..1a2898c482ee 100644
--- a/tools/include/uapi/linux/btf.h
+++ b/tools/include/uapi/linux/btf.h
@@ -22,9 +22,9 @@ struct btf_header {
 };
 
 /* Max # of type identifier */
-#define BTF_MAX_TYPE	0x0000ffff
+#define BTF_MAX_TYPE	0x000fffff
 /* Max offset into the string section */
-#define BTF_MAX_NAME_OFFSET	0x0000ffff
+#define BTF_MAX_NAME_OFFSET	0x00ffffff
 /* Max # of struct/union/enum members or func args */
 #define BTF_MAX_VLEN	0xffff
 
@@ -142,7 +142,8 @@ struct btf_param {
 
 enum {
 	BTF_VAR_STATIC = 0,
-	BTF_VAR_GLOBAL_ALLOCATED,
+	BTF_VAR_GLOBAL_ALLOCATED = 1,
+	BTF_VAR_GLOBAL_EXTERN = 2,
 };
 
 /* BTF_KIND_VAR is followed by a single "struct btf_var" to describe
diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index 23ae06c43d08..a3718cb275f2 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -56,8 +56,8 @@ ifndef VERBOSE
 endif
 
 FEATURE_USER = .libbpf
-FEATURE_TESTS = libelf libelf-mmap bpf reallocarray
-FEATURE_DISPLAY = libelf bpf
+FEATURE_TESTS = libelf libelf-mmap zlib bpf reallocarray
+FEATURE_DISPLAY = libelf zlib bpf
 
 INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(ARCH)/include/uapi -I$(srctree)/tools/include/uapi
 FEATURE_CHECK_CFLAGS-bpf = $(INCLUDES)
@@ -160,7 +160,7 @@ all: fixdep
 
 all_cmd: $(CMD_TARGETS) check
 
-$(BPF_IN_SHARED): force elfdep bpfdep bpf_helper_defs.h
+$(BPF_IN_SHARED): force elfdep zdep bpfdep bpf_helper_defs.h
 	@(test -f ../../include/uapi/linux/bpf.h -a -f ../../../include/uapi/linux/bpf.h && ( \
 	(diff -B ../../include/uapi/linux/bpf.h ../../../include/uapi/linux/bpf.h >/dev/null) || \
 	echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'" >&2 )) || true
@@ -178,7 +178,7 @@ $(BPF_IN_SHARED): force elfdep bpfdep bpf_helper_defs.h
 	echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h'" >&2 )) || true
 	$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(SHARED_OBJDIR) CFLAGS="$(CFLAGS) $(SHLIB_FLAGS)"
 
-$(BPF_IN_STATIC): force elfdep bpfdep bpf_helper_defs.h
+$(BPF_IN_STATIC): force elfdep zdep bpfdep bpf_helper_defs.h
 	$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(STATIC_OBJDIR)
 
 bpf_helper_defs.h: $(srctree)/tools/include/uapi/linux/bpf.h
@@ -190,7 +190,7 @@ $(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION)
 $(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN_SHARED)
 	$(QUIET_LINK)$(CC) $(LDFLAGS) \
 		--shared -Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
-		-Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -o $@
+		-Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -lz -o $@
 	@ln -sf $(@F) $(OUTPUT)libbpf.so
 	@ln -sf $(@F) $(OUTPUT)libbpf.so.$(LIBBPF_MAJOR_VERSION)
 
@@ -279,12 +279,15 @@ clean:
 
 
-PHONY += force elfdep bpfdep cscope tags
+PHONY += force elfdep zdep bpfdep cscope tags
 force:
 
 elfdep:
 	@if [ "$(feature-libelf)" != "1" ]; then echo "No libelf found"; exit 1 ; fi
 
+zdep:
+	@if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi
+
 bpfdep:
 	@if [ "$(feature-bpf)" != "1" ]; then echo "BPF API too old"; exit 1 ; fi
 
diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index 0c7d28292898..aa46700075e1 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -25,6 +25,9 @@
 #ifndef __always_inline
 #define __always_inline __attribute__((always_inline))
 #endif
+#ifndef __weak
+#define __weak __attribute__((weak))
+#endif
 
 /*
  * Helper structure used by eBPF C program
@@ -44,4 +47,10 @@ enum libbpf_pin_type {
 	LIBBPF_PIN_BY_NAME,
 };
 
+enum libbpf_tristate {
+	TRI_NO = 0,
+	TRI_YES = 1,
+	TRI_MODULE = 2,
+};
+
 #endif
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 84fe82f27bef..520021939d81 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -578,6 +578,12 @@ static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf,
 		return -ENOENT;
 	}
 
+	/* .extern datasec size and var offsets were set correctly during
+	 * extern collection step, so just skip straight to sorting variables
+	 */
+	if (t->size)
+		goto sort_vars;
+
 	ret = bpf_object__section_size(obj, name, &size);
 	if (ret || !size || (t->size && t->size != size)) {
 		pr_debug("Invalid size for section %s: %u bytes\n", name, size);
@@ -614,7 +620,8 @@ static int btf_fixup_datasec(struct bpf_object *obj, struct btf *btf,
 		vsi->offset = off;
 	}
 
-	qsort(t + 1, vars, sizeof(*vsi), compare_vsi_off);
+sort_vars:
+	qsort(btf_var_secinfos(t), vars, sizeof(*vsi), compare_vsi_off);
 	return 0;
 }
 
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index f6c135869701..26144706ba0b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -44,6 +44,7 @@
 #include <tools/libc_compat.h>
 #include <libelf.h>
 #include <gelf.h>
+#include <zlib.h>
 
 #include "libbpf.h"
 #include "bpf.h"
@@ -139,6 +140,20 @@ struct bpf_capabilities {
 	__u32 array_mmap:1;
 };
 
+enum reloc_type {
+	RELO_LD64,
+	RELO_CALL,
+	RELO_DATA,
+	RELO_EXTERN,
+};
+
+struct reloc_desc {
+	enum reloc_type type;
+	int insn_idx;
+	int map_idx;
+	int sym_off;
+};
+
 /*
  * bpf_prog should be a better name but it has been used in
  * linux/filter.h.
@@ -157,16 +172,7 @@ struct bpf_program {
 	size_t insns_cnt, main_prog_cnt;
 	enum bpf_prog_type type;
 
-	struct reloc_desc {
-		enum {
-			RELO_LD64,
-			RELO_CALL,
-			RELO_DATA,
-		} type;
-		int insn_idx;
-		int map_idx;
-		int sym_off;
-	} *reloc_desc;
+	struct reloc_desc *reloc_desc;
 	int nr_reloc;
 	int log_level;
 
@@ -198,18 +204,21 @@ struct bpf_program {
 #define DATA_SEC ".data"
 #define BSS_SEC ".bss"
 #define RODATA_SEC ".rodata"
+#define EXTERN_SEC ".extern"
 
 enum libbpf_map_type {
 	LIBBPF_MAP_UNSPEC,
 	LIBBPF_MAP_DATA,
 	LIBBPF_MAP_BSS,
 	LIBBPF_MAP_RODATA,
+	LIBBPF_MAP_EXTERN,
 };
 
 static const char * const libbpf_type_to_btf_name[] = {
 	[LIBBPF_MAP_DATA]	= DATA_SEC,
 	[LIBBPF_MAP_BSS]	= BSS_SEC,
 	[LIBBPF_MAP_RODATA]	= RODATA_SEC,
+	[LIBBPF_MAP_EXTERN]	= EXTERN_SEC,
 };
 
 struct bpf_map {
@@ -231,6 +240,28 @@ struct bpf_map {
 	bool reused;
 };
 
+enum extern_type {
+	EXT_UNKNOWN,
+	EXT_CHAR,
+	EXT_BOOL,
+	EXT_INT,
+	EXT_TRISTATE,
+	EXT_CHAR_ARR,
+};
+
+struct extern_desc {
+	const char *name;
+	int sym_idx;
+	int btf_id;
+	enum extern_type type;
+	int sz;
+	int align;
+	int data_off;
+	bool is_signed;
+	bool is_weak;
+	bool is_set;
+};
+
 static LIST_HEAD(bpf_objects_list);
 
 struct bpf_object {
@@ -244,6 +275,11 @@ struct bpf_object {
 	size_t nr_maps;
 	size_t maps_cap;
 
+	char *kconfig_path;
+	struct extern_desc *externs;
+	int nr_extern;
+	int extern_map_idx;
+
 	bool loaded;
 	bool has_pseudo_calls;
 	bool relaxed_core_relocs;
@@ -271,6 +307,7 @@ struct bpf_object {
 		int maps_shndx;
 		int btf_maps_shndx;
 		int text_shndx;
+		int symbols_shndx;
 		int data_shndx;
 		int rodata_shndx;
 		int bss_shndx;
@@ -542,6 +579,7 @@ static struct bpf_object *bpf_object__new(const char *path,
 	obj->efile.data_shndx = -1;
 	obj->efile.rodata_shndx = -1;
 	obj->efile.bss_shndx = -1;
+	obj->extern_map_idx = -1;
 
 	obj->kern_version = get_kernel_version();
 	obj->loaded = false;
@@ -866,7 +904,8 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 	def->key_size = sizeof(int);
 	def->value_size = data_sz;
 	def->max_entries = 1;
-	def->map_flags = type == LIBBPF_MAP_RODATA ? BPF_F_RDONLY_PROG : 0;
+	def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_EXTERN
+			 ? BPF_F_RDONLY_PROG : 0;
 	def->map_flags |= BPF_F_MMAPABLE;
 
 	pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n",
@@ -883,7 +922,7 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 		return err;
 	}
 
-	if (type != LIBBPF_MAP_BSS)
+	if (data)
 		memcpy(map->mmaped, data, data_sz);
 
 	pr_debug("map %td is \"%s\"\n", map - obj->maps, map->name);
@@ -924,6 +963,271 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 	return 0;
 }
 
+
+static struct extern_desc *find_extern_by_name(const struct bpf_object *obj,
+					       const void *name)
+{
+	int i;
+
+	for (i = 0; i < obj->nr_extern; i++) {
+		if (strcmp(obj->externs[i].name, name) == 0)
+			return &obj->externs[i];
+	}
+	return NULL;
+}
+
+static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
+			     char value)
+{
+	switch (ext->type) {
+	case EXT_BOOL:
+		if (value == 'm') {
+			pr_warn("extern %s=%c should be tristate or char\n",
+				ext->name, value);
+			return -EINVAL;
+		}
+		*(bool *)ext_val = value == 'y' ? true : false;
+		break;
+	case EXT_TRISTATE:
+		if (value == 'y')
+			*(enum libbpf_tristate *)ext_val = TRI_YES;
+		else if (value == 'm')
+			*(enum libbpf_tristate *)ext_val = TRI_MODULE;
+		else /* value == 'n' */
+			*(enum libbpf_tristate *)ext_val = TRI_NO;
+		break;
+	case EXT_CHAR:
+		*(char *)ext_val = value;
+		break;
+	case EXT_UNKNOWN:
+	case EXT_INT:
+	case EXT_CHAR_ARR:
+	default:
+		pr_warn("extern %s=%c should be bool, tristate, or char\n",
+			ext->name, value);
+		return -EINVAL;
+	}
+	ext->is_set = true;
+	return 0;
+}
+
+static int set_ext_value_str(struct extern_desc *ext, char *ext_val,
+			     const char *value)
+{
+	size_t len;
+
+	if (ext->type != EXT_CHAR_ARR) {
+		pr_warn("extern %s=%s should be char array\n", ext->name, value);
+		return -EINVAL;
+	}
+
+	len = strlen(value);
+	if (value[len - 1] != '"') {
+		pr_warn("extern '%s': invalid string config '%s'\n",
+			ext->name, value);
+		return -EINVAL;
+	}
+
+	/* strip quotes */
+	len -= 2;
+	if (len >= ext->sz) {
+		pr_warn("extern '%s': long string config %s of (%zu bytes) truncated to %d bytes\n",
+			ext->name, value, len, ext->sz - 1);
+		len = ext->sz - 1;
+	}
+	memcpy(ext_val, value + 1, len);
+	ext_val[len] = '\0';
+	ext->is_set = true;
+	return 0;
+}
+
+static int parse_u64(const char *value, __u64 *res)
+{
+	char *value_end;
+	int err;
+
+	errno = 0;
+	*res = strtoull(value, &value_end, 0);
+	if (errno) {
+		err = -errno;
+		pr_warn("failed to parse '%s' as integer: %d\n", value, err);
+		return err;
+	}
+	if (*value_end) {
+		pr_warn("failed to parse '%s' as integer completely\n", value);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static bool is_ext_value_in_range(const struct extern_desc *ext, __u64 v)
+{
+	int bit_sz = ext->sz * 8;
+
+	if (ext->sz == 8)
+		return true;
+
+	/* Validate that value stored in u64 fits in integer of `ext->sz`
+	 * bytes size without any loss of information. If the target integer
+	 * is signed, we rely on the following limits of integer type of
+	 * Y bits and subsequent transformation:
+	 *
+	 *     -2^(Y-1) <= X           <= 2^(Y-1) - 1
+	 *            0 <= X + 2^(Y-1) <= 2^Y - 1
+	 *            0 <= X + 2^(Y-1) <  2^Y
+	 *
+	 *  For unsigned target integer, check that all the (64 - Y) bits are
+	 *  zero.
+	 */
+	if (ext->is_signed)
+		return v + (1ULL << (bit_sz - 1)) < (1ULL << bit_sz);
+	else
+		return (v >> bit_sz) == 0;
+}
+
+static int set_ext_value_num(struct extern_desc *ext, void *ext_val,
+			     __u64 value)
+{
+	if (ext->type != EXT_INT && ext->type != EXT_CHAR) {
+		pr_warn("extern %s=%llu should be integer\n",
+			ext->name, value);
+		return -EINVAL;
+	}
+	if (!is_ext_value_in_range(ext, value)) {
+		pr_warn("extern %s=%llu value doesn't fit in %d bytes\n",
+			ext->name, value, ext->sz);
+		return -ERANGE;
+	}
+	switch (ext->sz) {
+		case 1: *(__u8 *)ext_val = value; break;
+		case 2: *(__u16 *)ext_val = value; break;
+		case 4: *(__u32 *)ext_val = value; break;
+		case 8: *(__u64 *)ext_val = value; break;
+		default:
+			return -EINVAL;
+	}
+	ext->is_set = true;
+	return 0;
+}
+
+static int bpf_object__read_kernel_config(struct bpf_object *obj,
+					  const char *config_path,
+					  void *data)
+{
+	char buf[PATH_MAX], *sep, *value;
+	struct extern_desc *ext;
+	int len, err = 0;
+	void *ext_val;
+	__u64 num;
+	gzFile file;
+
+	if (config_path) {
+		file = gzopen(config_path, "r");
+	} else {
+		struct utsname uts;
+
+		uname(&uts);
+		len = snprintf(buf, PATH_MAX, "/boot/config-%s", uts.release);
+		if (len < 0)
+			return -EINVAL;
+		else if (len >= PATH_MAX)
+			return -ENAMETOOLONG;
+		/* gzopen also accepts uncompressed files. */
+		file = gzopen(buf, "r");
+		if (!file)
+			file = gzopen("/proc/config.gz", "r");
+	}
+	if (!file) {
+		pr_warn("failed to read kernel config at '%s'\n", config_path);
+		return -ENOENT;
+	}
+
+	while (gzgets(file, buf, sizeof(buf))) {
+		if (strncmp(buf, "CONFIG_", 7))
+			continue;
+
+		sep = strchr(buf, '=');
+		if (!sep) {
+			err = -EINVAL;
+			pr_warn("failed to parse '%s': no separator\n", buf);
+			goto out;
+		}
+		/* Trim ending '\n' */
+		len = strlen(buf);
+		if (buf[len - 1] == '\n')
+			buf[len - 1] = '\0';
+		/* Split on '=' and ensure that a value is present. */
+		*sep = '\0';
+		if (!sep[1]) {
+			err = -EINVAL;
+			*sep = '=';
+			pr_warn("failed to parse '%s': no value\n", buf);
+			goto out;
+		}
+
+		ext = find_extern_by_name(obj, buf);
+		if (!ext)
+			continue;
+		if (ext->is_set) {
+			err = -EINVAL;
+			pr_warn("re-defining extern '%s' not allowed\n", buf);
+			goto out;
+		}
+
+		ext_val = data + ext->data_off;
+		value = sep + 1;
+
+		switch (*value) {
+		case 'y': case 'n': case 'm':
+			err = set_ext_value_tri(ext, ext_val, *value);
+			break;
+		case '"':
+			err = set_ext_value_str(ext, ext_val, value);
+			break;
+		default:
+			/* assume integer */
+			err = parse_u64(value, &num);
+			if (err) {
+				pr_warn("extern %s=%s should be integer\n",
+					ext->name, value);
+				goto out;
+			}
+			err = set_ext_value_num(ext, ext_val, num);
+			break;
+		}
+		if (err)
+			goto out;
+		pr_debug("extern %s=%s\n", ext->name, value);
+	}
+
+out:
+	gzclose(file);
+	return err;
+}
+
+static int bpf_object__init_extern_map(struct bpf_object *obj)
+{
+	struct extern_desc *last_ext;
+	size_t map_sz;
+	int err;
+
+	if (obj->nr_extern == 0)
+		return 0;
+
+	last_ext = &obj->externs[obj->nr_extern - 1];
+	map_sz = last_ext->data_off + last_ext->sz;
+
+	err = bpf_object__init_internal_map(obj, LIBBPF_MAP_EXTERN,
+					    obj->efile.symbols_shndx,
+					    NULL, map_sz);
+	if (err)
+		return err;
+
+	obj->extern_map_idx = obj->nr_maps - 1;
+
+	return 0;
+}
+
 static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
 {
 	Elf_Data *symbols = obj->efile.symbols;
@@ -1401,19 +1705,17 @@ static int bpf_object__init_user_btf_maps(struct bpf_object *obj, bool strict,
 static int bpf_object__init_maps(struct bpf_object *obj,
 				 const struct bpf_object_open_opts *opts)
 {
-	const char *pin_root_path = OPTS_GET(opts, pin_root_path, NULL);
-	bool strict = !OPTS_GET(opts, relaxed_maps, false);
+	const char *pin_root_path;
+	bool strict;
 	int err;
 
-	err = bpf_object__init_user_maps(obj, strict);
-	if (err)
-		return err;
+	strict = !OPTS_GET(opts, relaxed_maps, false);
+	pin_root_path = OPTS_GET(opts, pin_root_path, NULL);
 
-	err = bpf_object__init_user_btf_maps(obj, strict, pin_root_path);
-	if (err)
-		return err;
-
-	err = bpf_object__init_global_data_maps(obj);
+	err = bpf_object__init_user_maps(obj, strict);
+	err = err ?: bpf_object__init_user_btf_maps(obj, strict, pin_root_path);
+	err = err ?: bpf_object__init_global_data_maps(obj);
+	err = err ?: bpf_object__init_extern_map(obj);
 	if (err)
 		return err;
 
@@ -1532,11 +1834,6 @@ static int bpf_object__init_btf(struct bpf_object *obj,
 				BTF_ELF_SEC, err);
 			goto out;
 		}
-		err = btf__finalize_data(obj, obj->btf);
-		if (err) {
-			pr_warn("Error finalizing %s: %d.\n", BTF_ELF_SEC, err);
-			goto out;
-		}
 	}
 	if (btf_ext_data) {
 		if (!obj->btf) {
@@ -1570,6 +1867,30 @@ out:
 	return 0;
 }
 
+static int bpf_object__finalize_btf(struct bpf_object *obj)
+{
+	int err;
+
+	if (!obj->btf)
+		return 0;
+
+	err = btf__finalize_data(obj, obj->btf);
+	if (!err)
+		return 0;
+
+	pr_warn("Error finalizing %s: %d.\n", BTF_ELF_SEC, err);
+	btf__free(obj->btf);
+	obj->btf = NULL;
+	btf_ext__free(obj->btf_ext);
+	obj->btf_ext = NULL;
+
+	if (bpf_object__is_btf_mandatory(obj)) {
+		pr_warn("BTF is required, but is missing or corrupted.\n");
+		return -ENOENT;
+	}
+	return 0;
+}
+
 static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 {
 	int err = 0;
@@ -1670,6 +1991,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 				return -LIBBPF_ERRNO__FORMAT;
 			}
 			obj->efile.symbols = data;
+			obj->efile.symbols_shndx = idx;
 			obj->efile.strtabidx = sh.sh_link;
 		} else if (sh.sh_type == SHT_PROGBITS && data->d_size > 0) {
 			if (sh.sh_flags & SHF_EXECINSTR) {
@@ -1737,6 +2059,216 @@ static int bpf_object__elf_collect(struct bpf_object *obj)
 	return bpf_object__init_btf(obj, btf_data, btf_ext_data);
 }
 
+static bool sym_is_extern(const GElf_Sym *sym)
+{
+	int bind = GELF_ST_BIND(sym->st_info);
+	/* externs are symbols w/ type=NOTYPE, bind=GLOBAL|WEAK, section=UND */
+	return sym->st_shndx == SHN_UNDEF &&
+	       (bind == STB_GLOBAL || bind == STB_WEAK) &&
+	       GELF_ST_TYPE(sym->st_info) == STT_NOTYPE;
+}
+
+static int find_extern_btf_id(const struct btf *btf, const char *ext_name)
+{
+	const struct btf_type *t;
+	const char *var_name;
+	int i, n;
+
+	if (!btf)
+		return -ESRCH;
+
+	n = btf__get_nr_types(btf);
+	for (i = 1; i <= n; i++) {
+		t = btf__type_by_id(btf, i);
+
+		if (!btf_is_var(t))
+			continue;
+
+		var_name = btf__name_by_offset(btf, t->name_off);
+		if (strcmp(var_name, ext_name))
+			continue;
+
+		if (btf_var(t)->linkage != BTF_VAR_GLOBAL_EXTERN)
+			return -EINVAL;
+
+		return i;
+	}
+
+	return -ENOENT;
+}
+
+static enum extern_type find_extern_type(const struct btf *btf, int id,
+					 bool *is_signed)
+{
+	const struct btf_type *t;
+	const char *name;
+
+	t = skip_mods_and_typedefs(btf, id, NULL);
+	name = btf__name_by_offset(btf, t->name_off);
+
+	if (is_signed)
+		*is_signed = false;
+	switch (btf_kind(t)) {
+	case BTF_KIND_INT: {
+		int enc = btf_int_encoding(t);
+
+		if (enc & BTF_INT_BOOL)
+			return t->size == 1 ? EXT_BOOL : EXT_UNKNOWN;
+		if (is_signed)
+			*is_signed = enc & BTF_INT_SIGNED;
+		if (t->size == 1)
+			return EXT_CHAR;
+		if (t->size < 1 || t->size > 8 || (t->size & (t->size - 1)))
+			return EXT_UNKNOWN;
+		return EXT_INT;
+	}
+	case BTF_KIND_ENUM:
+		if (t->size != 4)
+			return EXT_UNKNOWN;
+		if (strcmp(name, "libbpf_tristate"))
+			return EXT_UNKNOWN;
+		return EXT_TRISTATE;
+	case BTF_KIND_ARRAY:
+		if (btf_array(t)->nelems == 0)
+			return EXT_UNKNOWN;
+		if (find_extern_type(btf, btf_array(t)->type, NULL) != EXT_CHAR)
+			return EXT_UNKNOWN;
+		return EXT_CHAR_ARR;
+	default:
+		return EXT_UNKNOWN;
+	}
+}
+
+static int cmp_externs(const void *_a, const void *_b)
+{
+	const struct extern_desc *a = _a;
+	const struct extern_desc *b = _b;
+
+	/* descending order by alignment requirements */
+	if (a->align != b->align)
+		return a->align > b->align ? -1 : 1;
+	/* ascending order by size, within same alignment class */
+	if (a->sz != b->sz)
+		return a->sz < b->sz ? -1 : 1;
+	/* resolve ties by name */
+	return strcmp(a->name, b->name);
+}
+
+static int bpf_object__collect_externs(struct bpf_object *obj)
+{
+	const struct btf_type *t;
+	struct extern_desc *ext;
+	int i, n, off, btf_id;
+	struct btf_type *sec;
+	const char *ext_name;
+	Elf_Scn *scn;
+	GElf_Shdr sh;
+
+	if (!obj->efile.symbols)
+		return 0;
+
+	scn = elf_getscn(obj->efile.elf, obj->efile.symbols_shndx);
+	if (!scn)
+		return -LIBBPF_ERRNO__FORMAT;
+	if (gelf_getshdr(scn, &sh) != &sh)
+		return -LIBBPF_ERRNO__FORMAT;
+	n = sh.sh_size / sh.sh_entsize;
+
+	pr_debug("looking for externs among %d symbols...\n", n);
+	for (i = 0; i < n; i++) {
+		GElf_Sym sym;
+
+		if (!gelf_getsym(obj->efile.symbols, i, &sym))
+			return -LIBBPF_ERRNO__FORMAT;
+		if (!sym_is_extern(&sym))
+			continue;
+		ext_name = elf_strptr(obj->efile.elf, obj->efile.strtabidx,
+				      sym.st_name);
+		if (!ext_name || !ext_name[0])
+			continue;
+
+		ext = obj->externs;
+		ext = reallocarray(ext, obj->nr_extern + 1, sizeof(*ext));
+		if (!ext)
+			return -ENOMEM;
+		obj->externs = ext;
+		ext = &ext[obj->nr_extern];
+		memset(ext, 0, sizeof(*ext));
+		obj->nr_extern++;
+
+		ext->btf_id = find_extern_btf_id(obj->btf, ext_name);
+		if (ext->btf_id <= 0) {
+			pr_warn("failed to find BTF for extern '%s': %d\n",
+				ext_name, ext->btf_id);
+			return ext->btf_id;
+		}
+		t = btf__type_by_id(obj->btf, ext->btf_id);
+		ext->name = btf__name_by_offset(obj->btf, t->name_off);
+		ext->sym_idx = i;
+		ext->is_weak = GELF_ST_BIND(sym.st_info) == STB_WEAK;
+		ext->sz = btf__resolve_size(obj->btf, t->type);
+		if (ext->sz <= 0) {
+			pr_warn("failed to resolve size of extern '%s': %d\n",
+				ext_name, ext->sz);
+			return ext->sz;
+		}
+		ext->align = btf__align_of(obj->btf, t->type);
+		if (ext->align <= 0) {
+			pr_warn("failed to determine alignment of extern '%s': %d\n",
+				ext_name, ext->align);
+			return -EINVAL;
+		}
+		ext->type = find_extern_type(obj->btf, t->type,
+					     &ext->is_signed);
+		if (ext->type == EXT_UNKNOWN) {
+			pr_warn("extern '%s' type is unsupported\n", ext_name);
+			return -ENOTSUP;
+		}
+	}
+	pr_debug("collected %d externs total\n", obj->nr_extern);
+
+	if (!obj->nr_extern)
+		return 0;
+
+	/* sort externs by (alignment, size, name) and calculate their offsets
+	 * within a map */
+	qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_externs);
+	off = 0;
+	for (i = 0; i < obj->nr_extern; i++) {
+		ext = &obj->externs[i];
+		ext->data_off = roundup(off, ext->align);
+		off = ext->data_off + ext->sz;
+		pr_debug("extern #%d: symbol %d, off %u, name %s\n",
+			 i, ext->sym_idx, ext->data_off, ext->name);
+	}
+
+	btf_id = btf__find_by_name(obj->btf, EXTERN_SEC);
+	if (btf_id <= 0) {
+		pr_warn("no BTF info found for '%s' datasec\n", EXTERN_SEC);
+		return -ESRCH;
+	}
+
+	sec = (struct btf_type *)btf__type_by_id(obj->btf, btf_id);
+	sec->size = off;
+	n = btf_vlen(sec);
+	for (i = 0; i < n; i++) {
+		struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i;
+
+		t = btf__type_by_id(obj->btf, vs->type);
+		ext_name = btf__name_by_offset(obj->btf, t->name_off);
+		ext = find_extern_by_name(obj, ext_name);
+		if (!ext) {
+			pr_warn("failed to find extern definition for BTF var '%s'\n",
+				ext_name);
+			return -ESRCH;
+		}
+		vs->offset = ext->data_off;
+		btf_var(t)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
+	}
+
+	return 0;
+}
+
 static struct bpf_program *
 bpf_object__find_prog_by_idx(struct bpf_object *obj, int idx)
 {
@@ -1801,6 +2333,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 		return LIBBPF_MAP_BSS;
 	else if (shndx == obj->efile.rodata_shndx)
 		return LIBBPF_MAP_RODATA;
+	else if (shndx == obj->efile.symbols_shndx)
+		return LIBBPF_MAP_EXTERN;
 	else
 		return LIBBPF_MAP_UNSPEC;
 }
@@ -1845,6 +2379,30 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 			insn_idx, insn->code);
 		return -LIBBPF_ERRNO__RELOC;
 	}
+
+	if (sym_is_extern(sym)) {
+		int sym_idx = GELF_R_SYM(rel->r_info);
+		int i, n = obj->nr_extern;
+		struct extern_desc *ext;
+
+		for (i = 0; i < n; i++) {
+			ext = &obj->externs[i];
+			if (ext->sym_idx == sym_idx)
+				break;
+		}
+		if (i >= n) {
+			pr_warn("extern relo failed to find extern for sym %d\n",
+				sym_idx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+		pr_debug("found extern #%d '%s' (sym %d, off %u) for insn %u\n",
+			 i, ext->name, ext->sym_idx, ext->data_off, insn_idx);
+		reloc_desc->type = RELO_EXTERN;
+		reloc_desc->insn_idx = insn_idx;
+		reloc_desc->sym_off = ext->data_off;
+		return 0;
+	}
+
 	if (!shdr_idx || shdr_idx >= SHN_LORESERVE) {
 		pr_warn("invalid relo for \'%s\' in special section 0x%x; forgot to initialize global var?..\n",
 			name, shdr_idx);
@@ -2306,11 +2864,12 @@ bpf_object__reuse_map(struct bpf_map *map)
 static int
 bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 {
+	enum libbpf_map_type map_type = map->libbpf_type;
 	char *cp, errmsg[STRERR_BUFSIZE];
 	int err, zero = 0;
 
-	/* Nothing to do here since kernel already zero-initializes .bss map. */
-	if (map->libbpf_type == LIBBPF_MAP_BSS)
+	/* kernel already zero-initializes .bss map. */
+	if (map_type == LIBBPF_MAP_BSS)
 		return 0;
 
 	err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
@@ -2322,8 +2881,8 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 		return err;
 	}
 
-	/* Freeze .rodata map as read-only from syscall side. */
-	if (map->libbpf_type == LIBBPF_MAP_RODATA) {
+	/* Freeze .rodata and .extern map as read-only from syscall side. */
+	if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_EXTERN) {
 		err = bpf_map_freeze(map->fd);
 		if (err) {
 			err = -errno;
@@ -3572,9 +4131,6 @@ bpf_program__reloc_text(struct bpf_program *prog, struct bpf_object *obj,
 	size_t new_cnt;
 	int err;
 
-	if (relo->type != RELO_CALL)
-		return -LIBBPF_ERRNO__RELOC;
-
 	if (prog->idx == obj->efile.text_shndx) {
 		pr_warn("relo in .text insn %d into off %d (insn #%d)\n",
 			relo->insn_idx, relo->sym_off, relo->sym_off / 8);
@@ -3636,27 +4192,37 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 
 	for (i = 0; i < prog->nr_reloc; i++) {
 		struct reloc_desc *relo = &prog->reloc_desc[i];
+		struct bpf_insn *insn = &prog->insns[relo->insn_idx];
 
-		if (relo->type == RELO_LD64 || relo->type == RELO_DATA) {
-			struct bpf_insn *insn = &prog->insns[relo->insn_idx];
-
-			if (relo->insn_idx + 1 >= (int)prog->insns_cnt) {
-				pr_warn("relocation out of range: '%s'\n",
-					prog->section_name);
-				return -LIBBPF_ERRNO__RELOC;
-			}
+		if (relo->insn_idx + 1 >= (int)prog->insns_cnt) {
+			pr_warn("relocation out of range: '%s'\n",
+				prog->section_name);
+			return -LIBBPF_ERRNO__RELOC;
+		}
 
-			if (relo->type != RELO_DATA) {
-				insn[0].src_reg = BPF_PSEUDO_MAP_FD;
-			} else {
-				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
-				insn[1].imm = insn[0].imm + relo->sym_off;
-			}
+		switch (relo->type) {
+		case RELO_LD64:
+			insn[0].src_reg = BPF_PSEUDO_MAP_FD;
+			insn[0].imm = obj->maps[relo->map_idx].fd;
+			break;
+		case RELO_DATA:
+			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+			insn[1].imm = insn[0].imm + relo->sym_off;
 			insn[0].imm = obj->maps[relo->map_idx].fd;
-		} else if (relo->type == RELO_CALL) {
+			break;
+		case RELO_EXTERN:
+			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+			insn[0].imm = obj->maps[obj->extern_map_idx].fd;
+			insn[1].imm = relo->sym_off;
+			break;
+		case RELO_CALL:
 			err = bpf_program__reloc_text(prog, obj, relo);
 			if (err)
 				return err;
+			break;
+		default:
+			pr_warn("relo #%d: bad relo type %d\n", i, relo->type);
+			return -EINVAL;
 		}
 	}
 
@@ -3936,11 +4502,10 @@ static struct bpf_object *
 __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 		   const struct bpf_object_open_opts *opts)
 {
+	const char *obj_name, *kconfig_path;
 	struct bpf_program *prog;
 	struct bpf_object *obj;
-	const char *obj_name;
 	char tmp_name[64];
-	__u32 attach_prog_fd;
 	int err;
 
 	if (elf_version(EV_CURRENT) == EV_NONE) {
@@ -3969,11 +4534,18 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 		return obj;
 
 	obj->relaxed_core_relocs = OPTS_GET(opts, relaxed_core_relocs, false);
-	attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0);
+	kconfig_path = OPTS_GET(opts, kconfig_path, NULL);
+	if (kconfig_path) {
+		obj->kconfig_path = strdup(kconfig_path);
+		if (!obj->kconfig_path)
+			return ERR_PTR(-ENOMEM);
+	}
 
 	err = bpf_object__elf_init(obj);
 	err = err ? : bpf_object__check_endianness(obj);
 	err = err ? : bpf_object__elf_collect(obj);
+	err = err ? : bpf_object__collect_externs(obj);
+	err = err ? : bpf_object__finalize_btf(obj);
 	err = err ? : bpf_object__init_maps(obj, opts);
 	err = err ? : bpf_object__init_prog_names(obj);
 	err = err ? : bpf_object__collect_reloc(obj);
@@ -3996,7 +4568,7 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 		bpf_program__set_type(prog, prog_type);
 		bpf_program__set_expected_attach_type(prog, attach_type);
 		if (prog_type == BPF_PROG_TYPE_TRACING)
-			prog->attach_prog_fd = attach_prog_fd;
+			prog->attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0);
 	}
 
 	return obj;
@@ -4107,6 +4679,61 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
 	return 0;
 }
 
+static int bpf_object__resolve_externs(struct bpf_object *obj,
+				       const char *config_path)
+{
+	bool need_config = false;
+	struct extern_desc *ext;
+	int err, i;
+	void *data;
+
+	if (obj->nr_extern == 0)
+		return 0;
+
+	data = obj->maps[obj->extern_map_idx].mmaped;
+
+	for (i = 0; i < obj->nr_extern; i++) {
+		ext = &obj->externs[i];
+
+		if (strcmp(ext->name, "LINUX_KERNEL_VERSION") == 0) {
+			void *ext_val = data + ext->data_off;
+			__u32 kver = get_kernel_version();
+
+			if (!kver) {
+				pr_warn("failed to get kernel version\n");
+				return -EINVAL;
+			}
+			err = set_ext_value_num(ext, ext_val, kver);
+			if (err)
+				return err;
+			pr_debug("extern %s=0x%x\n", ext->name, kver);
+		} else if (strncmp(ext->name, "CONFIG_", 7) == 0) {
+			need_config = true;
+		} else {
+			pr_warn("unrecognized extern '%s'\n", ext->name);
+			return -EINVAL;
+		}
+	}
+	if (need_config) {
+		err = bpf_object__read_kernel_config(obj, config_path, data);
+		if (err)
+			return -EINVAL;
+	}
+	for (i = 0; i < obj->nr_extern; i++) {
+		ext = &obj->externs[i];
+
+		if (!ext->is_set && !ext->is_weak) {
+			pr_warn("extern %s (strong) not resolved\n", ext->name);
+			return -ESRCH;
+		} else if (!ext->is_set) {
+			pr_debug("extern %s (weak) not resolved, defaulting to zero\n",
+				 ext->name);
+		}
+	}
+
+	return 0;
+}
+
 int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 {
 	struct bpf_object *obj;
@@ -4126,6 +4753,7 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 	obj->loaded = true;
 
 	err = bpf_object__probe_caps(obj);
+	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig_path);
 	err = err ? : bpf_object__sanitize_and_load_btf(obj);
 	err = err ? : bpf_object__sanitize_maps(obj);
 	err = err ? : bpf_object__create_maps(obj);
@@ -4719,6 +5347,10 @@ void bpf_object__close(struct bpf_object *obj)
 		zfree(&map->pin_path);
 	}
 
+	zfree(&obj->kconfig_path);
+	zfree(&obj->externs);
+	obj->nr_extern = 0;
+
 	zfree(&obj->maps);
 	obj->nr_maps = 0;
 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 623191e71415..6340823871e2 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -85,8 +85,12 @@ struct bpf_object_open_opts {
 	 */
 	const char *pin_root_path;
 	__u32 attach_prog_fd;
+	/* kernel config file path override (for CONFIG_ externs); can point
+	 * to either uncompressed text file or .gz file
+	 */
+	const char *kconfig_path;
 };
-#define bpf_object_open_opts__last_field attach_prog_fd
+#define bpf_object_open_opts__last_field kconfig_path
 
 LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
 LIBBPF_API struct bpf_object *
@@ -669,6 +673,12 @@ LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s);
 LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s);
 LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
 
+enum libbpf_tristate {
+	TRI_NO = 0,
+	TRI_YES = 1,
+	TRI_MODULE = 2,
+};
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index f70c8e735120..ba90621617a8 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -24,7 +24,7 @@ CFLAGS += -g -Wall -O2 $(GENFLAGS) -I$(APIDIR) -I$(LIBDIR) -I$(BPFDIR)	\
 	  -I$(GENDIR) -I$(TOOLSINCDIR) -I$(CURDIR)			\
 	  -Dbpf_prog_load=bpf_prog_test_load				\
 	  -Dbpf_load_program=bpf_test_load_program
-LDLIBS += -lcap -lelf -lrt -lpthread
+LDLIBS += -lcap -lelf -lz -lrt -lpthread
 
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
-- 
cgit v1.2.3
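
The range check described in the comment of is_ext_value_in_range() above
can be tried in isolation. The following is a minimal standalone sketch,
not the libbpf code itself: the function name fits_in_int() and its
parameters are hypothetical stand-ins for the ext->sz and ext->is_signed
fields, and it assumes sz is one of 1, 2, 4 or 8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does the 64-bit pattern v fit, without loss of
 * information, in an integer of sz bytes (sz assumed to be 1, 2, 4 or 8)?
 */
static bool fits_in_int(uint64_t v, int sz, bool is_signed)
{
	int bit_sz = sz * 8;

	assert(sz == 1 || sz == 2 || sz == 4 || sz == 8);

	/* Any 64-bit pattern fits a 64-bit target. */
	if (sz == 8)
		return true;

	/* Signed: shift the range [-2^(Y-1), 2^(Y-1) - 1] up by 2^(Y-1),
	 * so that a single unsigned comparison against 2^Y suffices.
	 * Negative inputs arrive as wrapped-around u64 values.
	 */
	if (is_signed)
		return v + (1ULL << (bit_sz - 1)) < (1ULL << bit_sz);

	/* Unsigned: all bits above the low Y bits must be zero. */
	return (v >> bit_sz) == 0;
}
```

For a one-byte signed target, 127 and the u64 encoding of -128 are
accepted, while 128 and the u64 encoding of -129 are rejected.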


From a2ec8b5706944d228181c8b91d815f41d6dd8e7b Mon Sep 17 00:00:00 2001
From: Josh Soref <jsoref@gmail.com>
Date: Sun, 15 Dec 2019 22:08:02 +0100
Subject: wireguard: global: fix spelling mistakes in comments

This fixes two spelling errors in source code comments.

Signed-off-by: Josh Soref <jsoref@gmail.com>
[Jason: rewrote commit message]
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/wireguard/receive.c | 2 +-
 include/uapi/linux/wireguard.h  | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/net/wireguard/receive.c b/drivers/net/wireguard/receive.c
index 7e675f541491..9c6bab9c981f 100644
--- a/drivers/net/wireguard/receive.c
+++ b/drivers/net/wireguard/receive.c
@@ -380,7 +380,7 @@ static void wg_packet_consume_data_done(struct wg_peer *peer,
 	/* We've already verified the Poly1305 auth tag, which means this packet
 	 * was not modified in transit. We can therefore tell the networking
 	 * stack that all checksums of every layer of encapsulation have already
-	 * been checked "by the hardware" and therefore is unneccessary to check
+	 * been checked "by the hardware" and therefore is unnecessary to check
 	 * again in software.
 	 */
 	skb->ip_summed = CHECKSUM_UNNECESSARY;
diff --git a/include/uapi/linux/wireguard.h b/include/uapi/linux/wireguard.h
index dd8a47c4ad11..ae88be14c947 100644
--- a/include/uapi/linux/wireguard.h
+++ b/include/uapi/linux/wireguard.h
@@ -18,13 +18,13 @@
  * one but not both of:
  *
  *    WGDEVICE_A_IFINDEX: NLA_U32
- *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1
+ *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1
  *
  * The kernel will then return several messages (NLM_F_MULTI) containing the
  * following tree of nested items:
  *
  *    WGDEVICE_A_IFINDEX: NLA_U32
- *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1
+ *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1
  *    WGDEVICE_A_PRIVATE_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
  *    WGDEVICE_A_PUBLIC_KEY: NLA_EXACT_LEN, len WG_KEY_LEN
  *    WGDEVICE_A_LISTEN_PORT: NLA_U16
@@ -77,7 +77,7 @@
  * WGDEVICE_A_IFINDEX and WGDEVICE_A_IFNAME:
  *
  *    WGDEVICE_A_IFINDEX: NLA_U32
- *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMESIZ - 1
+ *    WGDEVICE_A_IFNAME: NLA_NUL_STRING, maxlen IFNAMSIZ - 1
  *    WGDEVICE_A_FLAGS: NLA_U32, 0 or WGDEVICE_F_REPLACE_PEERS if all current
  *                      peers should be removed prior to adding the list below.
  *    WGDEVICE_A_PRIVATE_KEY: len WG_KEY_LEN, all zeros to remove
@@ -121,7 +121,7 @@
  * filling in information not contained in the prior. Note that if
  * WGDEVICE_F_REPLACE_PEERS is specified in the first message, it probably
  * should not be specified in fragments that come after, so that the list
- * of peers is only cleared the first time but appened after. Likewise for
+ * of peers is only cleared the first time but appended after. Likewise for
  * peers, if WGPEER_F_REPLACE_ALLOWEDIPS is specified in the first message
  * of a peer, it likely should not be specified in subsequent fragments.
  *
-- 
cgit v1.2.3


From b3b4346544b571c96d46be615b9db69a601ce4c8 Mon Sep 17 00:00:00 2001
From: "Andrew F. Davis" <afd@ti.com>
Date: Mon, 16 Dec 2019 08:34:04 -0500
Subject: dma-buf: heaps: Use _IOCTL_ for userspace IOCTL identifier

This is more consistent with the convention used by the DMA and DRM
frameworks. This patch is only a name change; no logic is changed.
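
For illustration, a userspace caller would use the renamed identifier
roughly as sketched below. This is a minimal sketch under assumptions, not
part of the patch: the helper name heap_alloc() is hypothetical, and the
heap device path (for example /dev/dma_heap/system) depends on which heaps
the running kernel exposes.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-heap.h>

/* Hypothetical helper: allocate len bytes from the heap device at
 * heap_path and store the resulting dma-buf fd in *dmabuf_fd.
 * Returns 0 on success or a negative errno value on failure.
 */
static int heap_alloc(const char *heap_path, size_t len, int *dmabuf_fd)
{
	struct dma_heap_allocation_data data = {
		.len = len,
		.fd_flags = O_RDWR,
	};
	int heap_fd, ret = 0;

	heap_fd = open(heap_path, O_RDONLY);
	if (heap_fd < 0)
		return -errno;

	/* The kernel fills in data.fd with the dma-buf handle. */
	if (ioctl(heap_fd, DMA_HEAP_IOCTL_ALLOC, &data) < 0)
		ret = -errno;
	else
		*dmabuf_fd = (int)data.fd;

	close(heap_fd);
	return ret;
}
```

Code written against the old DMA_HEAP_IOC_ALLOC name only needs the
identifier replaced; the ioctl number and the argument struct are the same.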

Signed-off-by: Andrew F. Davis <afd@ti.com>
Acked-by: John Stultz <john.stultz@linaro.org>
Signed-off-by: Sumit Semwal <sumit.semwal@linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/20191216133405.1001-2-afd@ti.com
---
 drivers/dma-buf/dma-heap.c                         | 4 ++--
 include/uapi/linux/dma-heap.h                      | 4 ++--
 tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 2 +-
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/dma-buf/dma-heap.c b/drivers/dma-buf/dma-heap.c
index 4f04d104ae61..a24721496114 100644
--- a/drivers/dma-buf/dma-heap.c
+++ b/drivers/dma-buf/dma-heap.c
@@ -107,7 +107,7 @@ static long dma_heap_ioctl_allocate(struct file *file, void *data)
 }
 
 unsigned int dma_heap_ioctl_cmds[] = {
-	DMA_HEAP_IOC_ALLOC,
+	DMA_HEAP_IOCTL_ALLOC,
 };
 
 static long dma_heap_ioctl(struct file *file, unsigned int ucmd,
@@ -153,7 +153,7 @@ static long dma_heap_ioctl(struct file *file, unsigned int ucmd,
 		memset(kdata + in_size, 0, ksize - in_size);
 
 	switch (kcmd) {
-	case DMA_HEAP_IOC_ALLOC:
+	case DMA_HEAP_IOCTL_ALLOC:
 		ret = dma_heap_ioctl_allocate(file, kdata);
 		break;
 	default:
diff --git a/include/uapi/linux/dma-heap.h b/include/uapi/linux/dma-heap.h
index 73e7f66c1cae..6f84fa08e074 100644
--- a/include/uapi/linux/dma-heap.h
+++ b/include/uapi/linux/dma-heap.h
@@ -42,12 +42,12 @@ struct dma_heap_allocation_data {
 #define DMA_HEAP_IOC_MAGIC		'H'
 
 /**
- * DOC: DMA_HEAP_IOC_ALLOC - allocate memory from pool
+ * DOC: DMA_HEAP_IOCTL_ALLOC - allocate memory from pool
  *
  * Takes a dma_heap_allocation_data struct and returns it with the fd field
  * populated with the dmabuf handle of the allocation.
  */
-#define DMA_HEAP_IOC_ALLOC	_IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\
+#define DMA_HEAP_IOCTL_ALLOC	_IOWR(DMA_HEAP_IOC_MAGIC, 0x0,\
 				      struct dma_heap_allocation_data)
 
 #endif /* _UAPI_LINUX_DMABUF_POOL_H */
diff --git a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
index 3e53ad331bdc..cd5e1f602ac9 100644
--- a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
+++ b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c
@@ -116,7 +116,7 @@ static int dmabuf_heap_alloc_fdflags(int fd, size_t len, unsigned int fd_flags,
 	if (!dmabuf_fd)
 		return -EINVAL;
 
-	ret = ioctl(fd, DMA_HEAP_IOC_ALLOC, &data);
+	ret = ioctl(fd, DMA_HEAP_IOCTL_ALLOC, &data);
 	if (ret < 0)
 		return ret;
 	*dmabuf_fd = (int)data.fd;
-- 
cgit v1.2.3


From 3431ca4837bf7f2b816e4c439db5bb66e7586746 Mon Sep 17 00:00:00 2001
From: Alexandre Belloni <alexandre.belloni@bootlin.com>
Date: Sat, 14 Dec 2019 23:02:43 +0100
Subject: rtc: define RTC_VL_READ values

Currently, the meaning of the value returned by RTC_VL_READ is undocumented
and left to the driver implementation. In order to get more meaningful
values, define a set of flags that make clear to userspace the status of
the various voltages feeding the RTC.

Link: https://lore.kernel.org/r/20191214220259.621996-2-alexandre.belloni@bootlin.com
Signed-off-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
---
 include/uapi/linux/rtc.h | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')
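
A minimal userspace sketch of how the new bits might be decoded, assuming a
driver that reports them as defined in the diff below. The EX_RTC_VL_* values
are local copies (BIT(n) written out as 1u << n, since BIT() is not available
to userspace), and rtc_vl_describe() is a hypothetical helper, not part of any
kernel or libc API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Local copies of the flag values from this patch (assumption: BIT(n) == 1u << n). */
#define EX_RTC_VL_DATA_INVALID	(1u << 0)
#define EX_RTC_VL_BACKUP_LOW	(1u << 1)
#define EX_RTC_VL_BACKUP_EMPTY	(1u << 2)
#define EX_RTC_VL_ACCURACY_LOW	(1u << 3)

/*
 * Hypothetical helper: write a comma-separated list of the set flags into
 * buf (at most len bytes including the terminator) and return buf.
 * Writes "ok" when no flag is set.
 */
const char *rtc_vl_describe(unsigned int flags, char *buf, size_t len)
{
	static const struct {
		unsigned int bit;
		const char *name;
	} names[] = {
		{ EX_RTC_VL_DATA_INVALID, "data-invalid" },
		{ EX_RTC_VL_BACKUP_LOW,   "backup-low" },
		{ EX_RTC_VL_BACKUP_EMPTY, "backup-empty" },
		{ EX_RTC_VL_ACCURACY_LOW, "accuracy-low" },
	};
	size_t i;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
		if (!(flags & names[i].bit))
			continue;
		if (buf[0] != '\0')
			strncat(buf, ",", len - strlen(buf) - 1);
		strncat(buf, names[i].name, len - strlen(buf) - 1);
	}
	if (buf[0] == '\0')
		strncat(buf, "ok", len - 1);
	return buf;
}
```

In a real program the flags word would come from ioctl(fd, RTC_VL_READ, &flags)
on an open RTC device; that call is left out here so the sketch stays
independent of the installed headers.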

diff --git a/include/uapi/linux/rtc.h b/include/uapi/linux/rtc.h
index 2ad1788968d0..095af360326a 100644
--- a/include/uapi/linux/rtc.h
+++ b/include/uapi/linux/rtc.h
@@ -92,7 +92,12 @@ struct rtc_pll_info {
 #define RTC_PLL_GET	_IOR('p', 0x11, struct rtc_pll_info)  /* Get PLL correction */
 #define RTC_PLL_SET	_IOW('p', 0x12, struct rtc_pll_info)  /* Set PLL correction */
 
-#define RTC_VL_READ	_IOR('p', 0x13, int)	/* Voltage low detector */
+#define RTC_VL_DATA_INVALID	BIT(0) /* Voltage too low, RTC data is invalid */
+#define RTC_VL_BACKUP_LOW	BIT(1) /* Backup voltage is low */
+#define RTC_VL_BACKUP_EMPTY	BIT(2) /* Backup empty or not present */
+#define RTC_VL_ACCURACY_LOW	BIT(3) /* Voltage is low, RTC accuracy is reduced */
+
+#define RTC_VL_READ	_IOR('p', 0x13, unsigned int)	/* Voltage low detection */
 #define RTC_VL_CLR	_IO('p', 0x14)		/* Clear voltage low information */
 
 /* interrupt flags */
-- 
cgit v1.2.3


From 2d602bf28316e2f61a553f13d279f3d74c2e5189 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Thu, 24 Oct 2019 16:34:25 +0200
Subject: acct: stop using get_seconds()

In 'struct acct', 'struct acct_v3', and 'struct taskstats' we have
a 32-bit 'ac_btime' field containing an absolute time value, which
will overflow in year 2106.

There are two possible ways to deal with it:

a) let it overflow and have user space code deal with reconstructing
   the data based on the current time, or
b) truncate the times based on the range of the u32 type.

Neither of them solves the actual problem. Pick the second
one to best document what the issue is, and have someone
fix it in a future version.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/uapi/linux/acct.h      | 2 ++
 include/uapi/linux/taskstats.h | 1 +
 kernel/acct.c                  | 4 +++-
 kernel/tsacct.c                | 8 +++++---
 4 files changed, 11 insertions(+), 4 deletions(-)

(limited to 'include/uapi/linux')
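
Option (b) above can be sketched in plain C as follows. This is an
illustration of the clamping only, assuming a signed 64-bit seconds value as
input; clamp_btime_u32() is a hypothetical stand-in for the
clamp_t(time64_t, btime, 0, U32_MAX) expression used in the diff below.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace equivalent of
 *	clamp_t(time64_t, btime, 0, U32_MAX)
 * Values before 1970 saturate at 0; values past the u32 range
 * (early 2106) saturate at UINT32_MAX instead of wrapping.
 */
uint32_t clamp_btime_u32(int64_t btime)
{
	if (btime < 0)
		return 0;
	if (btime > (int64_t)UINT32_MAX)
		return UINT32_MAX;
	/* Invariant at this point: the value fits in 32 bits. */
	assert(btime >= 0 && btime <= (int64_t)UINT32_MAX);
	return (uint32_t)btime;
}
```

With saturation, a reader of the 32-bit field sees a fixed maximum rather
than a small wrapped value, which makes the out-of-range case detectable.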

diff --git a/include/uapi/linux/acct.h b/include/uapi/linux/acct.h
index 0e72172cd23a..985b89068591 100644
--- a/include/uapi/linux/acct.h
+++ b/include/uapi/linux/acct.h
@@ -49,6 +49,7 @@ struct acct
 	__u16		ac_uid16;		/* LSB of Real User ID */
 	__u16		ac_gid16;		/* LSB of Real Group ID */
 	__u16		ac_tty;			/* Control Terminal */
+	/* __u32 range means times from 1970 to 2106 */
 	__u32		ac_btime;		/* Process Creation Time */
 	comp_t		ac_utime;		/* User Time */
 	comp_t		ac_stime;		/* System Time */
@@ -81,6 +82,7 @@ struct acct_v3
 	__u32		ac_gid;			/* Real Group ID */
 	__u32		ac_pid;			/* Process ID */
 	__u32		ac_ppid;		/* Parent Process ID */
+	/* __u32 range means times from 1970 to 2106 */
 	__u32		ac_btime;		/* Process Creation Time */
 #ifdef __KERNEL__
 	__u32		ac_etime;		/* Elapsed Time */
diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h
index 5e8ca16a9079..7d3ea366e93b 100644
--- a/include/uapi/linux/taskstats.h
+++ b/include/uapi/linux/taskstats.h
@@ -112,6 +112,7 @@ struct taskstats {
 	__u32	ac_gid;			/* Group ID */
 	__u32	ac_pid;			/* Process ID */
 	__u32	ac_ppid;		/* Parent process ID */
+	/* __u32 range means times from 1970 to 2106 */
 	__u32	ac_btime;		/* Begin time [sec since 1970] */
 	__u64	ac_etime __attribute__((aligned(8)));
 					/* Elapsed time [usec] */
diff --git a/kernel/acct.c b/kernel/acct.c
index 81f9831a7859..11ff4a596d6b 100644
--- a/kernel/acct.c
+++ b/kernel/acct.c
@@ -416,6 +416,7 @@ static void fill_ac(acct_t *ac)
 {
 	struct pacct_struct *pacct = &current->signal->pacct;
 	u64 elapsed, run_time;
+	time64_t btime;
 	struct tty_struct *tty;
 
 	/*
@@ -448,7 +449,8 @@ static void fill_ac(acct_t *ac)
 	}
 #endif
 	do_div(elapsed, AHZ);
-	ac->ac_btime = get_seconds() - elapsed;
+	btime = ktime_get_real_seconds() - elapsed;
+	ac->ac_btime = clamp_t(time64_t, btime, 0, U32_MAX);
 #if ACCT_VERSION==2
 	ac->ac_ahz = AHZ;
 #endif
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index 7be3e7530841..ab12616ee6fb 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -24,6 +24,7 @@ void bacct_add_tsk(struct user_namespace *user_ns,
 	const struct cred *tcred;
 	u64 utime, stime, utimescaled, stimescaled;
 	u64 delta;
+	time64_t btime;
 
 	BUILD_BUG_ON(TS_COMM_LEN < TASK_COMM_LEN);
 
@@ -32,9 +33,10 @@ void bacct_add_tsk(struct user_namespace *user_ns,
 	/* Convert to micro seconds */
 	do_div(delta, NSEC_PER_USEC);
 	stats->ac_etime = delta;
-	/* Convert to seconds for btime */
-	do_div(delta, USEC_PER_SEC);
-	stats->ac_btime = get_seconds() - delta;
+	/* Convert to seconds for btime (note y2106 limit) */
+	btime = ktime_get_real_seconds() - div_u64(delta, USEC_PER_SEC);
+	stats->ac_btime = clamp_t(time64_t, btime, 0, U32_MAX);
+
 	if (thread_group_leader(tsk)) {
 		stats->ac_exitcode = tsk->exit_code;
 		if (tsk->flags & PF_FORKNOEXEC)
-- 
cgit v1.2.3


From 352c912b0a525977a8e6fa1f87c15d9f71943642 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Thu, 24 Oct 2019 16:47:56 +0200
Subject: tsacct: add 64-bit btime field

As there is only a 32-bit ac_btime field in struct taskstats and
we should handle dates after the overflow, add a new field
with the same information but 64-bit width that can hold
a full time64_t.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/uapi/linux/taskstats.h | 5 ++++-
 kernel/tsacct.c                | 1 +
 2 files changed, 5 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')
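
A sketch of how a userspace consumer might pick between the two fields. The
struct here is a hypothetical cut-down stand-in containing only the members
needed for the example (real code would use struct taskstats from
<linux/taskstats.h>), and ex_begin_time() is an illustrative helper. The
version threshold of 10 is taken from the TASKSTATS_VERSION bump in the diff
below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of struct taskstats, for illustration only. */
struct ex_taskstats {
	uint16_t version;	/* TASKSTATS_VERSION reported by the kernel */
	uint32_t ac_btime;	/* clamped to the u32 range */
	uint64_t ac_btime64;	/* only present from version 10 on */
};

/* Return the begin time in seconds, preferring the 64-bit field. */
int64_t ex_begin_time(const struct ex_taskstats *ts)
{
	assert(ts != NULL);
	if (ts->version >= 10)
		return (int64_t)ts->ac_btime64;
	/* Older kernels: only the 32-bit field carries data. */
	return (int64_t)ts->ac_btime;
}
```

Checking the version first matters because an older kernel does not fill in
the new field at all.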

diff --git a/include/uapi/linux/taskstats.h b/include/uapi/linux/taskstats.h
index 7d3ea366e93b..ccbd08709321 100644
--- a/include/uapi/linux/taskstats.h
+++ b/include/uapi/linux/taskstats.h
@@ -34,7 +34,7 @@
  */
 
 
-#define TASKSTATS_VERSION	9
+#define TASKSTATS_VERSION	10
 #define TS_COMM_LEN		32	/* should be >= TASK_COMM_LEN
 					 * in linux/sched.h */
 
@@ -169,6 +169,9 @@ struct taskstats {
 	/* Delay waiting for thrashing page */
 	__u64	thrashing_count;
 	__u64	thrashing_delay_total;
+
+	/* v10: 64-bit btime to avoid overflow */
+	__u64	ac_btime64;		/* 64-bit begin time */
 };
 
 
diff --git a/kernel/tsacct.c b/kernel/tsacct.c
index ab12616ee6fb..257ffb993ea2 100644
--- a/kernel/tsacct.c
+++ b/kernel/tsacct.c
@@ -36,6 +36,7 @@ void bacct_add_tsk(struct user_namespace *user_ns,
 	/* Convert to seconds for btime (note y2106 limit) */
 	btime = ktime_get_real_seconds() - div_u64(delta, USEC_PER_SEC);
 	stats->ac_btime = clamp_t(time64_t, btime, 0, U32_MAX);
+	stats->ac_btime64 = btime;
 
 	if (thread_group_leader(tsk)) {
 		stats->ac_exitcode = tsk->exit_code;
-- 
cgit v1.2.3


From 4f9fbd893fe83a1193adceca41c8f7aa6c7382a1 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Fri, 15 Nov 2019 15:53:29 +0100
Subject: y2038: rename itimerval to __kernel_old_itimerval

Take the renaming of timeval and timespec one level further,
also renaming itimerval to __kernel_old_itimerval, to avoid
namespace conflicts with the user-space structure that may
use 64-bit time_t members.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/syscalls.h        |  9 ++++-----
 include/uapi/linux/time_types.h |  5 +++++
 kernel/time/itimer.c            | 18 +++++++++---------
 3 files changed, 18 insertions(+), 14 deletions(-)

(limited to 'include/uapi/linux')
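
The conversion performed by put_itimerval() in the diff below can be sketched
in userspace as follows. All ex_* types are hypothetical local stand-ins; the
layout of the old timeval as two longs is an assumption made for this
example, which is why tv_sec is explicitly narrowed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EX_NSEC_PER_USEC 1000L

/* Hypothetical stand-ins for the kernel-internal 64-bit types. */
struct ex_timespec64 { int64_t tv_sec; long tv_nsec; };
struct ex_itimerspec64 { struct ex_timespec64 it_interval, it_value; };

/* Hypothetical stand-ins for the "old" userspace ABI types. */
struct ex_old_timeval { long tv_sec; long tv_usec; };
struct ex_old_itimerval { struct ex_old_timeval it_interval, it_value; };

/* Convert nanosecond-resolution values to the microsecond-based old layout. */
void ex_put_itimerval(struct ex_old_itimerval *o,
		      const struct ex_itimerspec64 *i)
{
	assert(o != NULL && i != NULL);
	/* On targets with a 32-bit long, tv_sec is narrowed here. */
	o->it_interval.tv_sec = (long)i->it_interval.tv_sec;
	o->it_interval.tv_usec = i->it_interval.tv_nsec / EX_NSEC_PER_USEC;
	o->it_value.tv_sec = (long)i->it_value.tv_sec;
	o->it_value.tv_usec = i->it_value.tv_nsec / EX_NSEC_PER_USEC;
}
```

The integer division truncates sub-microsecond remainders, so 1999 ns becomes
1 us.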

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index d0391cc2dae9..27245fec2a8a 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -16,8 +16,7 @@ struct inode;
 struct iocb;
 struct io_event;
 struct iovec;
-struct itimerspec;
-struct itimerval;
+struct __kernel_old_itimerval;
 struct kexec_segment;
 struct linux_dirent;
 struct linux_dirent64;
@@ -591,10 +590,10 @@ asmlinkage long sys_nanosleep_time32(struct old_timespec32 __user *rqtp,
 				     struct old_timespec32 __user *rmtp);
 
 /* kernel/itimer.c */
-asmlinkage long sys_getitimer(int which, struct itimerval __user *value);
+asmlinkage long sys_getitimer(int which, struct __kernel_old_itimerval __user *value);
 asmlinkage long sys_setitimer(int which,
-				struct itimerval __user *value,
-				struct itimerval __user *ovalue);
+				struct __kernel_old_itimerval __user *value,
+				struct __kernel_old_itimerval __user *ovalue);
 
 /* kernel/kexec.c */
 asmlinkage long sys_kexec_load(unsigned long entry, unsigned long nr_segments,
diff --git a/include/uapi/linux/time_types.h b/include/uapi/linux/time_types.h
index 074e391d73a1..bcc0002115d3 100644
--- a/include/uapi/linux/time_types.h
+++ b/include/uapi/linux/time_types.h
@@ -33,6 +33,11 @@ struct __kernel_old_timespec {
 	long			tv_nsec;	/* nanoseconds */
 };
 
+struct __kernel_old_itimerval {
+	struct __kernel_old_timeval it_interval;/* timer interval */
+	struct __kernel_old_timeval it_value;	/* current value */
+};
+
 struct __kernel_sock_timeval {
 	__s64 tv_sec;
 	__s64 tv_usec;
diff --git a/kernel/time/itimer.c b/kernel/time/itimer.c
index 9e59c9ea92aa..ca4e6d57d68b 100644
--- a/kernel/time/itimer.c
+++ b/kernel/time/itimer.c
@@ -97,20 +97,20 @@ static int do_getitimer(int which, struct itimerspec64 *value)
 	return 0;
 }
 
-static int put_itimerval(struct itimerval __user *o,
+static int put_itimerval(struct __kernel_old_itimerval __user *o,
 			 const struct itimerspec64 *i)
 {
-	struct itimerval v;
+	struct __kernel_old_itimerval v;
 
 	v.it_interval.tv_sec = i->it_interval.tv_sec;
 	v.it_interval.tv_usec = i->it_interval.tv_nsec / NSEC_PER_USEC;
 	v.it_value.tv_sec = i->it_value.tv_sec;
 	v.it_value.tv_usec = i->it_value.tv_nsec / NSEC_PER_USEC;
-	return copy_to_user(o, &v, sizeof(struct itimerval)) ? -EFAULT : 0;
+	return copy_to_user(o, &v, sizeof(struct __kernel_old_itimerval)) ? -EFAULT : 0;
 }
 
 
-SYSCALL_DEFINE2(getitimer, int, which, struct itimerval __user *, value)
+SYSCALL_DEFINE2(getitimer, int, which, struct __kernel_old_itimerval __user *, value)
 {
 	struct itimerspec64 get_buffer;
 	int error = do_getitimer(which, &get_buffer);
@@ -314,11 +314,11 @@ SYSCALL_DEFINE1(alarm, unsigned int, seconds)
 
 #endif
 
-static int get_itimerval(struct itimerspec64 *o, const struct itimerval __user *i)
+static int get_itimerval(struct itimerspec64 *o, const struct __kernel_old_itimerval __user *i)
 {
-	struct itimerval v;
+	struct __kernel_old_itimerval v;
 
-	if (copy_from_user(&v, i, sizeof(struct itimerval)))
+	if (copy_from_user(&v, i, sizeof(struct __kernel_old_itimerval)))
 		return -EFAULT;
 
 	/* Validate the timevals in value. */
@@ -333,8 +333,8 @@ static int get_itimerval(struct itimerspec64 *o, const struct itimerval __user *
 	return 0;
 }
 
-SYSCALL_DEFINE3(setitimer, int, which, struct itimerval __user *, value,
-		struct itimerval __user *, ovalue)
+SYSCALL_DEFINE3(setitimer, int, which, struct __kernel_old_itimerval __user *, value,
+		struct __kernel_old_itimerval __user *, ovalue)
 {
 	struct itimerspec64 set_buffer, get_buffer;
 	int error;
-- 
cgit v1.2.3


From 251ec1c159e4874fbede0c3c586e317e177c0c9b Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Wed, 11 Dec 2019 21:07:23 +0100
Subject: y2038: sparc: remove use of struct timex

'struct timex' is one of the last users of 'struct timeval' and is
now referenced in only one place in the kernel, to convert the
user space timex into the kernel-internal version on sparc64, with a
different tv_usec member type.

As a preparation for hiding the time_t definition and everything
using that in the kernel, change the implementation once more
to only convert the timeval member, and then enclose the
struct definition in an #ifdef.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
---
 arch/sparc/kernel/sys_sparc_64.c | 33 +++++++++++++++++----------------
 include/uapi/linux/timex.h       |  2 ++
 2 files changed, 19 insertions(+), 16 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/arch/sparc/kernel/sys_sparc_64.c b/arch/sparc/kernel/sys_sparc_64.c
index 9f41a6f5a032..34917617f258 100644
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -548,34 +548,35 @@ out_unlock:
 	return err;
 }
 
-SYSCALL_DEFINE1(sparc_adjtimex, struct timex __user *, txc_p)
+SYSCALL_DEFINE1(sparc_adjtimex, struct __kernel_timex __user *, txc_p)
 {
-	struct timex txc;		/* Local copy of parameter */
-	struct __kernel_timex *kt = (void *)&txc;
+	struct __kernel_timex txc;
+	struct __kernel_old_timeval *tv = (void *)&txc_p->time;
 	int ret;
 
 	/* Copy the user data space into the kernel copy
 	 * structure. But bear in mind that the structures
 	 * may change
 	 */
-	if (copy_from_user(&txc, txc_p, sizeof(struct timex)))
+	if (copy_from_user(&txc, txc_p, sizeof(txc)))
 		return -EFAULT;
 
 	/*
 	 * override for sparc64 specific timeval type: tv_usec
 	 * is 32 bit wide instead of 64-bit in __kernel_timex
 	 */
-	kt->time.tv_usec = txc.time.tv_usec;
-	ret = do_adjtimex(kt);
-	txc.time.tv_usec = kt->time.tv_usec;
+	txc.time.tv_usec = tv->tv_usec;
+	ret = do_adjtimex(&txc);
+	tv->tv_usec = txc.time.tv_usec;
 
-	return copy_to_user(txc_p, &txc, sizeof(struct timex)) ? -EFAULT : ret;
+	return copy_to_user(txc_p, &txc, sizeof(txc)) ? -EFAULT : ret;
 }
 
-SYSCALL_DEFINE2(sparc_clock_adjtime, const clockid_t, which_clock,struct timex __user *, txc_p)
+SYSCALL_DEFINE2(sparc_clock_adjtime, const clockid_t, which_clock,
+		struct __kernel_timex __user *, txc_p)
 {
-	struct timex txc;		/* Local copy of parameter */
-	struct __kernel_timex *kt = (void *)&txc;
+	struct __kernel_timex txc;
+	struct __kernel_old_timeval *tv = (void *)&txc_p->time;
 	int ret;
 
 	if (!IS_ENABLED(CONFIG_POSIX_TIMERS)) {
@@ -590,18 +591,18 @@ SYSCALL_DEFINE2(sparc_clock_adjtime, const clockid_t, which_clock,struct timex _
 	 * structure. But bear in mind that the structures
 	 * may change
 	 */
-	if (copy_from_user(&txc, txc_p, sizeof(struct timex)))
+	if (copy_from_user(&txc, txc_p, sizeof(txc)))
 		return -EFAULT;
 
 	/*
 	 * override for sparc64 specific timeval type: tv_usec
 	 * is 32 bit wide instead of 64-bit in __kernel_timex
 	 */
-	kt->time.tv_usec = txc.time.tv_usec;
-	ret = do_clock_adjtime(which_clock, kt);
-	txc.time.tv_usec = kt->time.tv_usec;
+	txc.time.tv_usec = tv->tv_usec;
+	ret = do_clock_adjtime(which_clock, &txc);
+	tv->tv_usec = txc.time.tv_usec;
 
-	return copy_to_user(txc_p, &txc, sizeof(struct timex)) ? -EFAULT : ret;
+	return copy_to_user(txc_p, &txc, sizeof(txc)) ? -EFAULT : ret;
 }
 
 SYSCALL_DEFINE5(utrap_install, utrap_entry_t, type,
diff --git a/include/uapi/linux/timex.h b/include/uapi/linux/timex.h
index 9f517f9010bb..bd627c368d09 100644
--- a/include/uapi/linux/timex.h
+++ b/include/uapi/linux/timex.h
@@ -57,6 +57,7 @@
 
 #define NTP_API		4	/* NTP API version */
 
+#ifndef __KERNEL__
 /*
  * syscall interface - used (mainly by NTP daemon)
  * to discipline kernel clock oscillator
@@ -91,6 +92,7 @@ struct timex {
 	int  :32; int  :32; int  :32; int  :32;
 	int  :32; int  :32; int  :32;
 };
+#endif
 
 struct __kernel_timex_timeval {
 	__kernel_time64_t       tv_sec;
-- 
cgit v1.2.3


From dcc68b4d8084e1ac9af0d4022d6b1aff6a139a33 Mon Sep 17 00:00:00 2001
From: Petr Machata <petrm@mellanox.com>
Date: Wed, 18 Dec 2019 14:55:13 +0000
Subject: net: sch_ets: Add a new Qdisc

Introduce a new Qdisc, ETS, based on the wording of 802.1Q-2014. It is
PRIO-like in how it is configured: one needs to specify how many bands
there are, how many are strict and how many are DWRR, quanta for the
latter, and the priomap.

The new Qdisc operates like the PRIO / DRR combo would when configured as
per the standard. The strict classes, if any, are tried for traffic first.
When there's no traffic in any of the strict queues, the ETS ones (if any)
are treated in the same way as in DRR.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/pkt_sched.h |  17 +
 net/sched/Kconfig              |  17 +
 net/sched/Makefile             |   1 +
 net/sched/sch_ets.c            | 733 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 768 insertions(+)
 create mode 100644 net/sched/sch_ets.c

(limited to 'include/uapi/linux')
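
The scheduling order described in the commit message above can be
illustrated with a small userspace simulation. This is a sketch under
simplifying assumptions, not the kernel implementation: packets are
represented only by their lengths, each band has a fixed-size FIFO that does
not wrap, and every ex_* name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define EX_MAX_BANDS 4
#define EX_MAX_PKTS  8

struct ex_band {
	unsigned int pkts[EX_MAX_PKTS];	/* queued packet lengths, FIFO order */
	size_t head, tail;		/* head == tail means the band is empty */
	unsigned int quantum;		/* only meaningful for DRR bands */
	unsigned int deficit;
};

struct ex_ets {
	struct ex_band bands[EX_MAX_BANDS];
	size_t nbands, nstrict;
	size_t active[EX_MAX_BANDS];	/* non-empty DRR bands, head at index 0 */
	size_t nactive;
};

void ex_init(struct ex_ets *q, size_t nbands, size_t nstrict)
{
	assert(nbands >= 1 && nbands <= EX_MAX_BANDS && nstrict <= nbands);
	memset(q, 0, sizeof(*q));
	q->nbands = nbands;
	q->nstrict = nstrict;
}

/* Remove and return the band at the head of the active list. */
static size_t ex_active_pop(struct ex_ets *q)
{
	size_t band = q->active[0];

	q->nactive--;
	memmove(&q->active[0], &q->active[1],
		q->nactive * sizeof(q->active[0]));
	return band;
}

int ex_enqueue(struct ex_ets *q, size_t band, unsigned int len)
{
	struct ex_band *b;

	if (band >= q->nbands)
		return -1;
	b = &q->bands[band];
	if (b->tail == EX_MAX_PKTS)
		return -1;	/* sketch only: the FIFO does not wrap */
	if (b->head == b->tail && band >= q->nstrict) {
		/* First packet on an idle DRR band: activate it. */
		assert(b->quantum > 0);	/* zero would never make progress */
		q->active[q->nactive++] = band;
		b->deficit = b->quantum;
	}
	b->pkts[b->tail++] = len;
	return 0;
}

/* Returns the band a packet was taken from (length in *len), or -1 if idle. */
int ex_dequeue(struct ex_ets *q, unsigned int *len)
{
	size_t band;

	/* Strict bands are served first, lowest index first. */
	for (band = 0; band < q->nstrict; band++) {
		struct ex_band *b = &q->bands[band];

		if (b->head != b->tail) {
			*len = b->pkts[b->head++];
			return (int)band;
		}
	}

	while (q->nactive > 0) {
		struct ex_band *b;

		band = q->active[0];
		b = &q->bands[band];
		if (b->pkts[b->head] <= b->deficit) {
			*len = b->pkts[b->head++];
			b->deficit -= *len;
			if (b->head == b->tail)
				ex_active_pop(q);	/* drained */
			return (int)band;
		}
		/* Not enough credit: top up and move to the tail. */
		b->deficit += b->quantum;
		band = ex_active_pop(q);
		q->active[q->nactive++] = band;
	}
	return -1;
}
```

With one strict band and two bandwidth-sharing bands of equal quantum, a
packet on the strict band is always served first, after which the two
remaining bands alternate as their deficit counters allow.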

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 9f1a72876212..bf5a5b1dfb0b 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -1187,4 +1187,21 @@ enum {
 
 #define TCA_TAPRIO_ATTR_MAX (__TCA_TAPRIO_ATTR_MAX - 1)
 
+/* ETS */
+
+#define TCQ_ETS_MAX_BANDS 16
+
+enum {
+	TCA_ETS_UNSPEC,
+	TCA_ETS_NBANDS,		/* u8 */
+	TCA_ETS_NSTRICT,	/* u8 */
+	TCA_ETS_QUANTA,		/* nested TCA_ETS_QUANTA_BAND */
+	TCA_ETS_QUANTA_BAND,	/* u32 */
+	TCA_ETS_PRIOMAP,	/* nested TCA_ETS_PRIOMAP_BAND */
+	TCA_ETS_PRIOMAP_BAND,	/* u8 */
+	__TCA_ETS_MAX,
+};
+
+#define TCA_ETS_MAX (__TCA_ETS_MAX - 1)
+
 #endif
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index 2985509147a2..b1e7ec726958 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -409,6 +409,23 @@ config NET_SCH_PLUG
 	  To compile this code as a module, choose M here: the
 	  module will be called sch_plug.
 
+config NET_SCH_ETS
+	tristate "Enhanced transmission selection scheduler (ETS)"
+	help
+          The Enhanced Transmission Selection scheduler is a classful
+          queuing discipline that merges functionality of PRIO and DRR
+          qdiscs in one scheduler. ETS makes it easy to configure a set of
+          strict and bandwidth-sharing bands to implement the transmission
+          selection described in 802.1Qaz.
+
+	  Say Y here if you want to use the ETS packet scheduling
+	  algorithm.
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called sch_ets.
+
+	  If unsure, say N.
+
 menuconfig NET_SCH_DEFAULT
 	bool "Allow override default queue discipline"
 	---help---
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 415d1e1f237e..bc8856b865ff 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -48,6 +48,7 @@ obj-$(CONFIG_NET_SCH_ATM)	+= sch_atm.o
 obj-$(CONFIG_NET_SCH_NETEM)	+= sch_netem.o
 obj-$(CONFIG_NET_SCH_DRR)	+= sch_drr.o
 obj-$(CONFIG_NET_SCH_PLUG)	+= sch_plug.o
+obj-$(CONFIG_NET_SCH_ETS)	+= sch_ets.o
 obj-$(CONFIG_NET_SCH_MQPRIO)	+= sch_mqprio.o
 obj-$(CONFIG_NET_SCH_SKBPRIO)	+= sch_skbprio.o
 obj-$(CONFIG_NET_SCH_CHOKE)	+= sch_choke.o
diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c
new file mode 100644
index 000000000000..e6194b23e9b0
--- /dev/null
+++ b/net/sched/sch_ets.c
@@ -0,0 +1,733 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * net/sched/sch_ets.c         Enhanced Transmission Selection scheduler
+ *
+ * Description
+ * -----------
+ *
+ * The Enhanced Transmission Selection scheduler is a classful queuing
+ * discipline that merges functionality of PRIO and DRR qdiscs in one scheduler.
+ * ETS makes it easy to configure a set of strict and bandwidth-sharing bands to
+ * implement the transmission selection described in 802.1Qaz.
+ *
+ * Although ETS is technically classful, it's not possible to add and remove
+ * classes at will. Instead one specifies number of classes, how many are
+ * PRIO-like and how many DRR-like, and quanta for the latter.
+ *
+ * Algorithm
+ * ---------
+ *
+ * The strict classes, if any, are tried for traffic first: first band 0, if it
+ * has no traffic then band 1, etc.
+ *
+ * When there is no traffic in any of the strict queues, the bandwidth-sharing
+ * ones are tried next. Each band is assigned a deficit counter, initialized to
+ * "quantum" of that band. ETS maintains a list of active bandwidth-sharing
+ * bands whose qdiscs are non-empty. A packet is dequeued from the band at the
+ * head of the list if the packet size is smaller than or equal to the deficit
+ * counter. If the counter is too small, it is increased by "quantum" and the
+ * scheduler moves on to the next band in the active list.
+ */
+
+#include <linux/module.h>
+#include <net/gen_stats.h>
+#include <net/netlink.h>
+#include <net/pkt_cls.h>
+#include <net/pkt_sched.h>
+#include <net/sch_generic.h>
+
+struct ets_class {
+	struct list_head alist; /* In struct ets_sched.active. */
+	struct Qdisc *qdisc;
+	u32 quantum;
+	u32 deficit;
+	struct gnet_stats_basic_packed bstats;
+	struct gnet_stats_queue qstats;
+};
+
+struct ets_sched {
+	struct list_head active;
+	struct tcf_proto __rcu *filter_list;
+	struct tcf_block *block;
+	unsigned int nbands;
+	unsigned int nstrict;
+	u8 prio2band[TC_PRIO_MAX + 1];
+	struct ets_class classes[TCQ_ETS_MAX_BANDS];
+};
+
+static const struct nla_policy ets_policy[TCA_ETS_MAX + 1] = {
+	[TCA_ETS_NBANDS] = { .type = NLA_U8 },
+	[TCA_ETS_NSTRICT] = { .type = NLA_U8 },
+	[TCA_ETS_QUANTA] = { .type = NLA_NESTED },
+	[TCA_ETS_PRIOMAP] = { .type = NLA_NESTED },
+};
+
+static const struct nla_policy ets_priomap_policy[TCA_ETS_MAX + 1] = {
+	[TCA_ETS_PRIOMAP_BAND] = { .type = NLA_U8 },
+};
+
+static const struct nla_policy ets_quanta_policy[TCA_ETS_MAX + 1] = {
+	[TCA_ETS_QUANTA_BAND] = { .type = NLA_U32 },
+};
+
+static const struct nla_policy ets_class_policy[TCA_ETS_MAX + 1] = {
+	[TCA_ETS_QUANTA_BAND] = { .type = NLA_U32 },
+};
+
+static int ets_quantum_parse(struct Qdisc *sch, const struct nlattr *attr,
+			     unsigned int *quantum,
+			     struct netlink_ext_ack *extack)
+{
+	*quantum = nla_get_u32(attr);
+	if (!*quantum) {
+		NL_SET_ERR_MSG(extack, "ETS quantum cannot be zero");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static struct ets_class *
+ets_class_from_arg(struct Qdisc *sch, unsigned long arg)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+
+	return &q->classes[arg - 1];
+}
+
+static u32 ets_class_id(struct Qdisc *sch, const struct ets_class *cl)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	int band = cl - q->classes;
+
+	return TC_H_MAKE(sch->handle, band + 1);
+}
+
+static bool ets_class_is_strict(struct ets_sched *q, const struct ets_class *cl)
+{
+	unsigned int band = cl - q->classes;
+
+	return band < q->nstrict;
+}
+
+static int ets_class_change(struct Qdisc *sch, u32 classid, u32 parentid,
+			    struct nlattr **tca, unsigned long *arg,
+			    struct netlink_ext_ack *extack)
+{
+	struct ets_class *cl = ets_class_from_arg(sch, *arg);
+	struct ets_sched *q = qdisc_priv(sch);
+	struct nlattr *opt = tca[TCA_OPTIONS];
+	struct nlattr *tb[TCA_ETS_MAX + 1];
+	unsigned int quantum;
+	int err;
+
+	/* Classes can be added and removed only through Qdisc_ops.change
+	 * interface.
+	 */
+	if (!cl) {
+		NL_SET_ERR_MSG(extack, "Fine-grained class addition and removal is not supported");
+		return -EOPNOTSUPP;
+	}
+
+	if (!opt) {
+		NL_SET_ERR_MSG(extack, "ETS options are required for this operation");
+		return -EINVAL;
+	}
+
+	err = nla_parse_nested(tb, TCA_ETS_MAX, opt, ets_class_policy, extack);
+	if (err < 0)
+		return err;
+
+	if (!tb[TCA_ETS_QUANTA_BAND])
+		/* Nothing to configure. */
+		return 0;
+
+	if (ets_class_is_strict(q, cl)) {
+		NL_SET_ERR_MSG(extack, "Strict bands do not have a configurable quantum");
+		return -EINVAL;
+	}
+
+	err = ets_quantum_parse(sch, tb[TCA_ETS_QUANTA_BAND], &quantum,
+				extack);
+	if (err)
+		return err;
+
+	sch_tree_lock(sch);
+	cl->quantum = quantum;
+	sch_tree_unlock(sch);
+	return 0;
+}
+
+static int ets_class_graft(struct Qdisc *sch, unsigned long arg,
+			   struct Qdisc *new, struct Qdisc **old,
+			   struct netlink_ext_ack *extack)
+{
+	struct ets_class *cl = ets_class_from_arg(sch, arg);
+
+	if (!new) {
+		new = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
+					ets_class_id(sch, cl), NULL);
+		if (!new)
+			new = &noop_qdisc;
+		else
+			qdisc_hash_add(new, true);
+	}
+
+	*old = qdisc_replace(sch, new, &cl->qdisc);
+	return 0;
+}
+
+static struct Qdisc *ets_class_leaf(struct Qdisc *sch, unsigned long arg)
+{
+	struct ets_class *cl = ets_class_from_arg(sch, arg);
+
+	return cl->qdisc;
+}
+
+static unsigned long ets_class_find(struct Qdisc *sch, u32 classid)
+{
+	unsigned long band = TC_H_MIN(classid);
+	struct ets_sched *q = qdisc_priv(sch);
+
+	if (band - 1 >= q->nbands)
+		return 0;
+	return band;
+}
+
+static void ets_class_qlen_notify(struct Qdisc *sch, unsigned long arg)
+{
+	struct ets_class *cl = ets_class_from_arg(sch, arg);
+	struct ets_sched *q = qdisc_priv(sch);
+
+	/* We get notified about zero-length child Qdiscs as well if they are
+	 * offloaded. Those aren't on the active list though, so don't attempt
+	 * to remove them.
+	 */
+	if (!ets_class_is_strict(q, cl) && sch->q.qlen)
+		list_del(&cl->alist);
+}
+
+static int ets_class_dump(struct Qdisc *sch, unsigned long arg,
+			  struct sk_buff *skb, struct tcmsg *tcm)
+{
+	struct ets_class *cl = ets_class_from_arg(sch, arg);
+	struct ets_sched *q = qdisc_priv(sch);
+	struct nlattr *nest;
+
+	tcm->tcm_parent = TC_H_ROOT;
+	tcm->tcm_handle = ets_class_id(sch, cl);
+	tcm->tcm_info = cl->qdisc->handle;
+
+	nest = nla_nest_start_noflag(skb, TCA_OPTIONS);
+	if (!nest)
+		goto nla_put_failure;
+	if (!ets_class_is_strict(q, cl)) {
+		if (nla_put_u32(skb, TCA_ETS_QUANTA_BAND, cl->quantum))
+			goto nla_put_failure;
+	}
+	return nla_nest_end(skb, nest);
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -EMSGSIZE;
+}
+
+static int ets_class_dump_stats(struct Qdisc *sch, unsigned long arg,
+				struct gnet_dump *d)
+{
+	struct ets_class *cl = ets_class_from_arg(sch, arg);
+	struct Qdisc *cl_q = cl->qdisc;
+
+	if (gnet_stats_copy_basic(qdisc_root_sleeping_running(sch),
+				  d, NULL, &cl_q->bstats) < 0 ||
+	    qdisc_qstats_copy(d, cl_q) < 0)
+		return -1;
+
+	return 0;
+}
+
+static void ets_qdisc_walk(struct Qdisc *sch, struct qdisc_walker *arg)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	int i;
+
+	if (arg->stop)
+		return;
+
+	for (i = 0; i < q->nbands; i++) {
+		if (arg->count < arg->skip) {
+			arg->count++;
+			continue;
+		}
+		if (arg->fn(sch, i + 1, arg) < 0) {
+			arg->stop = 1;
+			break;
+		}
+		arg->count++;
+	}
+}
+
+static struct tcf_block *
+ets_qdisc_tcf_block(struct Qdisc *sch, unsigned long cl,
+		    struct netlink_ext_ack *extack)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+
+	if (cl) {
+		NL_SET_ERR_MSG(extack, "ETS classid must be zero");
+		return NULL;
+	}
+
+	return q->block;
+}
+
+static unsigned long ets_qdisc_bind_tcf(struct Qdisc *sch, unsigned long parent,
+					u32 classid)
+{
+	return ets_class_find(sch, classid);
+}
+
+static void ets_qdisc_unbind_tcf(struct Qdisc *sch, unsigned long arg)
+{
+}
+
+static struct ets_class *ets_classify(struct sk_buff *skb, struct Qdisc *sch,
+				      int *qerr)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	u32 band = skb->priority;
+	struct tcf_result res;
+	struct tcf_proto *fl;
+	int err;
+
+	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
+	if (TC_H_MAJ(skb->priority) != sch->handle) {
+		fl = rcu_dereference_bh(q->filter_list);
+		err = tcf_classify(skb, fl, &res, false);
+#ifdef CONFIG_NET_CLS_ACT
+		switch (err) {
+		case TC_ACT_STOLEN:
+		case TC_ACT_QUEUED:
+		case TC_ACT_TRAP:
+			*qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
+			/* fall through */
+		case TC_ACT_SHOT:
+			return NULL;
+		}
+#endif
+		if (!fl || err < 0) {
+			if (TC_H_MAJ(band))
+				band = 0;
+			return &q->classes[q->prio2band[band & TC_PRIO_MAX]];
+		}
+		band = res.classid;
+	}
+	band = TC_H_MIN(band) - 1;
+	if (band >= q->nbands)
+		return &q->classes[q->prio2band[0]];
+	return &q->classes[band];
+}
+
+static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch,
+			     struct sk_buff **to_free)
+{
+	unsigned int len = qdisc_pkt_len(skb);
+	struct ets_sched *q = qdisc_priv(sch);
+	struct ets_class *cl;
+	int err = 0;
+	bool first;
+
+	cl = ets_classify(skb, sch, &err);
+	if (!cl) {
+		if (err & __NET_XMIT_BYPASS)
+			qdisc_qstats_drop(sch);
+		__qdisc_drop(skb, to_free);
+		return err;
+	}
+
+	first = !cl->qdisc->q.qlen;
+	err = qdisc_enqueue(skb, cl->qdisc, to_free);
+	if (unlikely(err != NET_XMIT_SUCCESS)) {
+		if (net_xmit_drop_count(err)) {
+			cl->qstats.drops++;
+			qdisc_qstats_drop(sch);
+		}
+		return err;
+	}
+
+	if (first && !ets_class_is_strict(q, cl)) {
+		list_add_tail(&cl->alist, &q->active);
+		cl->deficit = cl->quantum;
+	}
+
+	sch->qstats.backlog += len;
+	sch->q.qlen++;
+	return err;
+}
+
+static struct sk_buff *
+ets_qdisc_dequeue_skb(struct Qdisc *sch, struct sk_buff *skb)
+{
+	qdisc_bstats_update(sch, skb);
+	qdisc_qstats_backlog_dec(sch, skb);
+	sch->q.qlen--;
+	return skb;
+}
+
+static struct sk_buff *ets_qdisc_dequeue(struct Qdisc *sch)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	struct ets_class *cl;
+	struct sk_buff *skb;
+	unsigned int band;
+	unsigned int len;
+
+	while (1) {
+		for (band = 0; band < q->nstrict; band++) {
+			cl = &q->classes[band];
+			skb = qdisc_dequeue_peeked(cl->qdisc);
+			if (skb)
+				return ets_qdisc_dequeue_skb(sch, skb);
+		}
+
+		if (list_empty(&q->active))
+			goto out;
+
+		cl = list_first_entry(&q->active, struct ets_class, alist);
+		skb = cl->qdisc->ops->peek(cl->qdisc);
+		if (!skb) {
+			qdisc_warn_nonwc(__func__, cl->qdisc);
+			goto out;
+		}
+
+		len = qdisc_pkt_len(skb);
+		if (len <= cl->deficit) {
+			cl->deficit -= len;
+			skb = qdisc_dequeue_peeked(cl->qdisc);
+			if (unlikely(!skb))
+				goto out;
+			if (cl->qdisc->q.qlen == 0)
+				list_del(&cl->alist);
+			return ets_qdisc_dequeue_skb(sch, skb);
+		}
+
+		cl->deficit += cl->quantum;
+		list_move_tail(&cl->alist, &q->active);
+	}
+out:
+	return NULL;
+}
+
+static int ets_qdisc_priomap_parse(struct nlattr *priomap_attr,
+				   unsigned int nbands, u8 *priomap,
+				   struct netlink_ext_ack *extack)
+{
+	const struct nlattr *attr;
+	int prio = 0;
+	u8 band;
+	int rem;
+	int err;
+
+	err = __nla_validate_nested(priomap_attr, TCA_ETS_MAX,
+				    ets_priomap_policy, NL_VALIDATE_STRICT,
+				    extack);
+	if (err)
+		return err;
+
+	nla_for_each_nested(attr, priomap_attr, rem) {
+		switch (nla_type(attr)) {
+		case TCA_ETS_PRIOMAP_BAND:
+			if (prio > TC_PRIO_MAX) {
+				NL_SET_ERR_MSG_MOD(extack, "Too many priorities in ETS priomap");
+				return -EINVAL;
+			}
+			band = nla_get_u8(attr);
+			if (band >= nbands) {
+				NL_SET_ERR_MSG_MOD(extack, "Invalid band number in ETS priomap");
+				return -EINVAL;
+			}
+			priomap[prio++] = band;
+			break;
+		default:
+			WARN_ON_ONCE(1); /* Validate should have caught this. */
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static int ets_qdisc_quanta_parse(struct Qdisc *sch, struct nlattr *quanta_attr,
+				  unsigned int nbands, unsigned int nstrict,
+				  unsigned int *quanta,
+				  struct netlink_ext_ack *extack)
+{
+	const struct nlattr *attr;
+	int band = nstrict;
+	int rem;
+	int err;
+
+	err = __nla_validate_nested(quanta_attr, TCA_ETS_MAX,
+				    ets_quanta_policy, NL_VALIDATE_STRICT,
+				    extack);
+	if (err < 0)
+		return err;
+
+	nla_for_each_nested(attr, quanta_attr, rem) {
+		switch (nla_type(attr)) {
+		case TCA_ETS_QUANTA_BAND:
+			if (band >= nbands) {
+				NL_SET_ERR_MSG_MOD(extack, "ETS quanta has more values than bands");
+				return -EINVAL;
+			}
+			err = ets_quantum_parse(sch, attr, &quanta[band++],
+						extack);
+			if (err)
+				return err;
+			break;
+		default:
+			WARN_ON_ONCE(1); /* Validate should have caught this. */
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
+			    struct netlink_ext_ack *extack)
+{
+	unsigned int quanta[TCQ_ETS_MAX_BANDS] = {0};
+	struct Qdisc *queues[TCQ_ETS_MAX_BANDS];
+	struct ets_sched *q = qdisc_priv(sch);
+	struct nlattr *tb[TCA_ETS_MAX + 1];
+	unsigned int oldbands = q->nbands;
+	u8 priomap[TC_PRIO_MAX + 1];
+	unsigned int nstrict = 0;
+	unsigned int nbands;
+	unsigned int i;
+	int err;
+
+	if (!opt) {
+		NL_SET_ERR_MSG(extack, "ETS options are required for this operation");
+		return -EINVAL;
+	}
+
+	err = nla_parse_nested(tb, TCA_ETS_MAX, opt, ets_policy, extack);
+	if (err < 0)
+		return err;
+
+	if (!tb[TCA_ETS_NBANDS]) {
+		NL_SET_ERR_MSG_MOD(extack, "Number of bands is a required argument");
+		return -EINVAL;
+	}
+	nbands = nla_get_u8(tb[TCA_ETS_NBANDS]);
+	if (nbands < 1 || nbands > TCQ_ETS_MAX_BANDS) {
+		NL_SET_ERR_MSG_MOD(extack, "Invalid number of bands");
+		return -EINVAL;
+	}
+	/* Unless overridden, traffic goes to the last band. */
+	memset(priomap, nbands - 1, sizeof(priomap));
+
+	if (tb[TCA_ETS_NSTRICT]) {
+		nstrict = nla_get_u8(tb[TCA_ETS_NSTRICT]);
+		if (nstrict > nbands) {
+			NL_SET_ERR_MSG_MOD(extack, "Invalid number of strict bands");
+			return -EINVAL;
+		}
+	}
+
+	if (tb[TCA_ETS_PRIOMAP]) {
+		err = ets_qdisc_priomap_parse(tb[TCA_ETS_PRIOMAP],
+					      nbands, priomap, extack);
+		if (err)
+			return err;
+	}
+
+	if (tb[TCA_ETS_QUANTA]) {
+		err = ets_qdisc_quanta_parse(sch, tb[TCA_ETS_QUANTA],
+					     nbands, nstrict, quanta, extack);
+		if (err)
+			return err;
+	}
+	/* If there are more bands than strict + quanta provided, the remaining
+	 * ones are ETS with quantum of MTU. Initialize the missing values here.
+	 */
+	for (i = nstrict; i < nbands; i++) {
+		if (!quanta[i])
+			quanta[i] = psched_mtu(qdisc_dev(sch));
+	}
+
+	/* Before commit, make sure we can allocate all new qdiscs */
+	for (i = oldbands; i < nbands; i++) {
+		queues[i] = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
+					      ets_class_id(sch, &q->classes[i]),
+					      extack);
+		if (!queues[i]) {
+			while (i > oldbands)
+				qdisc_put(queues[--i]);
+			return -ENOMEM;
+		}
+	}
+
+	sch_tree_lock(sch);
+
+	q->nbands = nbands;
+	q->nstrict = nstrict;
+	memcpy(q->prio2band, priomap, sizeof(priomap));
+
+	for (i = q->nbands; i < oldbands; i++)
+		qdisc_tree_flush_backlog(q->classes[i].qdisc);
+
+	for (i = 0; i < q->nbands; i++)
+		q->classes[i].quantum = quanta[i];
+
+	for (i = oldbands; i < q->nbands; i++) {
+		q->classes[i].qdisc = queues[i];
+		if (q->classes[i].qdisc != &noop_qdisc)
+			qdisc_hash_add(q->classes[i].qdisc, true);
+	}
+
+	sch_tree_unlock(sch);
+
+	for (i = q->nbands; i < oldbands; i++) {
+		qdisc_put(q->classes[i].qdisc);
+		memset(&q->classes[i], 0, sizeof(q->classes[i]));
+	}
+	return 0;
+}
+
+static int ets_qdisc_init(struct Qdisc *sch, struct nlattr *opt,
+			  struct netlink_ext_ack *extack)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	int err;
+
+	if (!opt)
+		return -EINVAL;
+
+	err = tcf_block_get(&q->block, &q->filter_list, sch, extack);
+	if (err)
+		return err;
+
+	INIT_LIST_HEAD(&q->active);
+	return ets_qdisc_change(sch, opt, extack);
+}
+
+static void ets_qdisc_reset(struct Qdisc *sch)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	int band;
+
+	for (band = q->nstrict; band < q->nbands; band++) {
+		if (q->classes[band].qdisc->q.qlen)
+			list_del(&q->classes[band].alist);
+	}
+	for (band = 0; band < q->nbands; band++)
+		qdisc_reset(q->classes[band].qdisc);
+	sch->qstats.backlog = 0;
+	sch->q.qlen = 0;
+}
+
+static void ets_qdisc_destroy(struct Qdisc *sch)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	int band;
+
+	tcf_block_put(q->block);
+	for (band = 0; band < q->nbands; band++)
+		qdisc_put(q->classes[band].qdisc);
+}
+
+static int ets_qdisc_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct ets_sched *q = qdisc_priv(sch);
+	struct nlattr *opts;
+	struct nlattr *nest;
+	int band;
+	int prio;
+
+	opts = nla_nest_start_noflag(skb, TCA_OPTIONS);
+	if (!opts)
+		goto nla_err;
+
+	if (nla_put_u8(skb, TCA_ETS_NBANDS, q->nbands))
+		goto nla_err;
+
+	if (q->nstrict &&
+	    nla_put_u8(skb, TCA_ETS_NSTRICT, q->nstrict))
+		goto nla_err;
+
+	if (q->nbands > q->nstrict) {
+		nest = nla_nest_start(skb, TCA_ETS_QUANTA);
+		if (!nest)
+			goto nla_err;
+
+		for (band = q->nstrict; band < q->nbands; band++) {
+			if (nla_put_u32(skb, TCA_ETS_QUANTA_BAND,
+					q->classes[band].quantum))
+				goto nla_err;
+		}
+
+		nla_nest_end(skb, nest);
+	}
+
+	nest = nla_nest_start(skb, TCA_ETS_PRIOMAP);
+	if (!nest)
+		goto nla_err;
+
+	for (prio = 0; prio <= TC_PRIO_MAX; prio++) {
+		if (nla_put_u8(skb, TCA_ETS_PRIOMAP_BAND, q->prio2band[prio]))
+			goto nla_err;
+	}
+
+	nla_nest_end(skb, nest);
+
+	return nla_nest_end(skb, opts);
+
+nla_err:
+	nla_nest_cancel(skb, opts);
+	return -EMSGSIZE;
+}
+
+static const struct Qdisc_class_ops ets_class_ops = {
+	.change		= ets_class_change,
+	.graft		= ets_class_graft,
+	.leaf		= ets_class_leaf,
+	.find		= ets_class_find,
+	.qlen_notify	= ets_class_qlen_notify,
+	.dump		= ets_class_dump,
+	.dump_stats	= ets_class_dump_stats,
+	.walk		= ets_qdisc_walk,
+	.tcf_block	= ets_qdisc_tcf_block,
+	.bind_tcf	= ets_qdisc_bind_tcf,
+	.unbind_tcf	= ets_qdisc_unbind_tcf,
+};
+
+static struct Qdisc_ops ets_qdisc_ops __read_mostly = {
+	.cl_ops		= &ets_class_ops,
+	.id		= "ets",
+	.priv_size	= sizeof(struct ets_sched),
+	.enqueue	= ets_qdisc_enqueue,
+	.dequeue	= ets_qdisc_dequeue,
+	.peek		= qdisc_peek_dequeued,
+	.change		= ets_qdisc_change,
+	.init		= ets_qdisc_init,
+	.reset		= ets_qdisc_reset,
+	.destroy	= ets_qdisc_destroy,
+	.dump		= ets_qdisc_dump,
+	.owner		= THIS_MODULE,
+};
+
+static int __init ets_init(void)
+{
+	return register_qdisc(&ets_qdisc_ops);
+}
+
+static void __exit ets_exit(void)
+{
+	unregister_qdisc(&ets_qdisc_ops);
+}
+
+module_init(ets_init);
+module_exit(ets_exit);
+MODULE_LICENSE("GPL");
-- 
cgit v1.2.3
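
Note on the dequeue logic: ets_qdisc_dequeue() in the patch above serves
strict bands first and then shares the remaining bandwidth by deficit
round robin. The following is only a rough userspace model of that idea,
not the kernel code. The struct names, the fixed-size arrays and the
round-robin cursor (used here in place of the kernel's active list) are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NBANDS 3 /* illustrative band count */
#define QCAP   8 /* illustrative per-band queue capacity */

struct band {
	unsigned int pkts[QCAP]; /* queued packet lengths, FIFO order */
	size_t head, count;
	unsigned int quantum;    /* credit added per round, must be > 0 */
	unsigned int deficit;    /* credit currently available */
};

struct ets_model {
	struct band bands[NBANDS];
	size_t nstrict; /* bands [0, nstrict) are strict priority */
	size_t cur;     /* round-robin cursor, starts at nstrict */
};

static void band_push(struct band *b, unsigned int len)
{
	assert(b->count < QCAP);
	b->pkts[(b->head + b->count) % QCAP] = len;
	b->count++;
}

static unsigned int band_pop(struct band *b)
{
	unsigned int len = b->pkts[b->head];

	b->head = (b->head + 1) % QCAP;
	b->count--;
	return len;
}

static void next_band(struct ets_model *q)
{
	if (++q->cur == NBANDS)
		q->cur = q->nstrict;
}

/* Returns the band a packet was taken from, or -1 if nothing is queued. */
int ets_model_dequeue(struct ets_model *q, unsigned int *len)
{
	size_t i, skipped;

	/* Strict bands always win, lowest index first. */
	for (i = 0; i < q->nstrict; i++) {
		if (q->bands[i].count) {
			*len = band_pop(&q->bands[i]);
			return (int)i;
		}
	}

	for (;;) {
		struct band *b;

		/* Skip empty bands; give up after one full lap. */
		for (skipped = 0; skipped < NBANDS - q->nstrict; skipped++) {
			if (q->bands[q->cur].count)
				break;
			next_band(q);
		}
		if (skipped == NBANDS - q->nstrict)
			return -1;

		b = &q->bands[q->cur];
		if (b->pkts[b->head] <= b->deficit) {
			b->deficit -= b->pkts[b->head];
			*len = band_pop(b);
			return (int)q->cur;
		}
		/* Not enough credit: top up and move to the next band. */
		b->deficit += b->quantum;
		next_band(q);
	}
}
```

In this model a caller is expected to set each band's deficit to its
quantum before the first dequeue. With equal packet sizes, a band whose
quantum is twice as large is then served about twice as often.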


From 7dd68b3279f1792103d12e69933db3128c6d416e Mon Sep 17 00:00:00 2001
From: Andrey Ignatov <rdna@fb.com>
Date: Wed, 18 Dec 2019 23:44:35 -0800
Subject: bpf: Support replacing cgroup-bpf program in MULTI mode

The common use-case in production is to have multiple cgroup-bpf
programs per attach type that cover multiple use-cases. Such programs
are attached with BPF_F_ALLOW_MULTI and can be maintained by different
people.

The order of programs usually matters. For example, imagine two egress
programs: the first one drops packets and the second one counts packets.
If they are swapped, the result of the counting program will be different.

This brings operational challenges when updating cgroup-bpf program(s)
attached with BPF_F_ALLOW_MULTI, since there is no way to replace a
program:

* One way to update is to detach all programs first and then attach the
  new version(s) again in the right order. This introduces an
  interruption in the work a program is doing and may not be acceptable
  (e.g. if it is an egress firewall);

* Another way is to attach the new version of a program first and only
  then detach the old version. This introduces a time interval during
  which two versions of the same program are running, which may not be
  acceptable if a program is not idempotent. It also imposes an
  additional burden on program developers to make sure that two versions
  of their program can co-exist.

Solve the problem by introducing a "replace" mode in the BPF_PROG_ATTACH
command for cgroup-bpf programs attached with the BPF_F_ALLOW_MULTI
flag. This mode is enabled by the newly introduced BPF_F_REPLACE attach
flag together with the bpf_attr.replace_bpf_fd attribute, which passes the
fd of the old program to replace.

That way the user can replace any program among those attached with the
BPF_F_ALLOW_MULTI flag without the problems described above.

Details of the new API:

* If BPF_F_REPLACE is set but replace_bpf_fd does not hold a valid
  descriptor of a BPF program, BPF_PROG_ATTACH will return the
  corresponding error (EINVAL or EBADF).

* If replace_bpf_fd holds a valid descriptor of a BPF program but that
  program is not attached to the specified cgroup, BPF_PROG_ATTACH will
  return ENOENT.

BPF_F_REPLACE is introduced to make the user intent clear, since
replace_bpf_fd alone can't be used for this (its default value, 0, is a
valid fd). BPF_F_REPLACE also makes it possible to extend the API in the
future (e.g. add BPF_F_BEFORE and BPF_F_AFTER if needed).
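
The flag checks and the in-place replacement described above can be
sketched as a small userspace model. This is only an illustration under
stated assumptions: programs are plain integer ids instead of struct
bpf_prog pointers, the fd lookup (where EINVAL or EBADF would come from)
is left out, and the cgroup hierarchy and saved-flags checks are omitted.
The names cgroup_model, model_attach and MAX_PROGS are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Same bit values as the BPF_F_* attach flags in this patch. */
#define F_ALLOW_OVERRIDE (1U << 0)
#define F_ALLOW_MULTI    (1U << 1)
#define F_REPLACE        (1U << 2)

#define MAX_PROGS 8         /* model-only array bound */
#define NO_SLOT   MAX_PROGS /* marker: nothing to replace */

struct cgroup_model {
	int progs[MAX_PROGS]; /* program ids in execution order */
	size_t nprogs;
};

/* Returns 0 on success or a negative errno value. */
int model_attach(struct cgroup_model *cg, int prog, int replace_prog,
		 unsigned int flags)
{
	size_t i, slot = NO_SLOT;

	assert(cg != NULL);

	/* OVERRIDE and MULTI exclude each other; REPLACE needs MULTI. */
	if (((flags & F_ALLOW_OVERRIDE) && (flags & F_ALLOW_MULTI)) ||
	    ((flags & F_REPLACE) && !(flags & F_ALLOW_MULTI)))
		return -EINVAL;

	if (flags & F_ALLOW_MULTI) {
		for (i = 0; i < cg->nprogs; i++) {
			if (cg->progs[i] == prog)
				return -EINVAL; /* same prog twice */
			if ((flags & F_REPLACE) &&
			    cg->progs[i] == replace_prog)
				slot = i;
		}
		if ((flags & F_REPLACE) && slot == NO_SLOT)
			return -ENOENT; /* prog to replace not attached */
	} else if (cg->nprogs > 0) {
		slot = 0; /* single-program mode replaces the only entry */
	}

	if (slot != NO_SLOT) {
		cg->progs[slot] = prog; /* position in the list is kept */
		return 0;
	}
	if (cg->nprogs == MAX_PROGS)
		return -ENOSPC; /* model-only limit */
	cg->progs[cg->nprogs++] = prog; /* plain MULTI attach appends */
	return 0;
}
```

For example, attaching ids 10 and 20 with F_ALLOW_MULTI and then calling
model_attach(&cg, 30, 10, F_ALLOW_MULTI | F_REPLACE) leaves the list as
{30, 20}: the new program takes over the slot of the old one, so the
order relative to the other programs does not change.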

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Acked-by: Andrii Narkyiko <andriin@fb.com>
Link: https://lore.kernel.org/bpf/30cd850044a0057bdfcaaf154b7d2f39850ba813.1576741281.git.rdna@fb.com
---
 include/linux/bpf-cgroup.h     |  4 +++-
 include/uapi/linux/bpf.h       | 10 ++++++++++
 kernel/bpf/cgroup.c            | 30 ++++++++++++++++++++++++++----
 kernel/bpf/syscall.c           |  4 ++--
 kernel/cgroup/cgroup.c         |  5 +++--
 tools/include/uapi/linux/bpf.h | 10 ++++++++++
 6 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h
index 169fd25f6bc2..18f6a6da7c3c 100644
--- a/include/linux/bpf-cgroup.h
+++ b/include/linux/bpf-cgroup.h
@@ -85,6 +85,7 @@ int cgroup_bpf_inherit(struct cgroup *cgrp);
 void cgroup_bpf_offline(struct cgroup *cgrp);
 
 int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
+			struct bpf_prog *replace_prog,
 			enum bpf_attach_type type, u32 flags);
 int __cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 			enum bpf_attach_type type);
@@ -93,7 +94,8 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 
 /* Wrapper for __cgroup_bpf_*() protected by cgroup_mutex */
 int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
-		      enum bpf_attach_type type, u32 flags);
+		      struct bpf_prog *replace_prog, enum bpf_attach_type type,
+		      u32 flags);
 int cgroup_bpf_detach(struct cgroup *cgrp, struct bpf_prog *prog,
 		      enum bpf_attach_type type, u32 flags);
 int cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index dbbcf0b02970..7df436da542d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -231,6 +231,11 @@ enum bpf_attach_type {
  * When children program makes decision (like picking TCP CA or sock bind)
  * parent program has a chance to override it.
  *
+ * With BPF_F_ALLOW_MULTI a new program is added to the end of the list of
+ * programs for a cgroup. It is also possible to replace an old program at
+ * any position by specifying the BPF_F_REPLACE flag and passing the fd of
+ * that program in the replace_bpf_fd attribute. The old program is released.
+ *
  * A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups.
  * A cgroup with NONE doesn't allow any programs in sub-cgroups.
  * Ex1:
@@ -249,6 +254,7 @@ enum bpf_attach_type {
  */
 #define BPF_F_ALLOW_OVERRIDE	(1U << 0)
 #define BPF_F_ALLOW_MULTI	(1U << 1)
+#define BPF_F_REPLACE		(1U << 2)
 
 /* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the
  * verifier will perform strict alignment checking as if the kernel
@@ -442,6 +448,10 @@ union bpf_attr {
 		__u32		attach_bpf_fd;	/* eBPF program to attach */
 		__u32		attach_type;
 		__u32		attach_flags;
+		__u32		replace_bpf_fd;	/* previously attached eBPF
+						 * program to replace if
+						 * BPF_F_REPLACE is used
+						 */
 	};
 
 	struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */
diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c
index 283efe3ce052..45346c79613a 100644
--- a/kernel/bpf/cgroup.c
+++ b/kernel/bpf/cgroup.c
@@ -282,14 +282,17 @@ cleanup:
  *                         propagate the change to descendants
  * @cgrp: The cgroup which descendants to traverse
  * @prog: A program to attach
+ * @replace_prog: Previously attached program to replace if BPF_F_REPLACE is set
  * @type: Type of attach operation
  * @flags: Option flags
  *
  * Must be called with cgroup_mutex held.
  */
 int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
+			struct bpf_prog *replace_prog,
 			enum bpf_attach_type type, u32 flags)
 {
+	u32 saved_flags = (flags & (BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI));
 	struct list_head *progs = &cgrp->bpf.progs[type];
 	struct bpf_prog *old_prog = NULL;
 	struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE],
@@ -298,14 +301,15 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
 	enum bpf_cgroup_storage_type stype;
 	int err;
 
-	if ((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI))
+	if (((flags & BPF_F_ALLOW_OVERRIDE) && (flags & BPF_F_ALLOW_MULTI)) ||
+	    ((flags & BPF_F_REPLACE) && !(flags & BPF_F_ALLOW_MULTI)))
 		/* invalid combination */
 		return -EINVAL;
 
 	if (!hierarchy_allows_attach(cgrp, type))
 		return -EPERM;
 
-	if (!list_empty(progs) && cgrp->bpf.flags[type] != flags)
+	if (!list_empty(progs) && cgrp->bpf.flags[type] != saved_flags)
 		/* Disallow attaching non-overridable on top
 		 * of existing overridable in this cgroup.
 		 * Disallow attaching multi-prog if overridable or none
@@ -320,7 +324,12 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
 			if (pl->prog == prog)
 				/* disallow attaching the same prog twice */
 				return -EINVAL;
+			if (pl->prog == replace_prog)
+				replace_pl = pl;
 		}
+		if ((flags & BPF_F_REPLACE) && !replace_pl)
+			/* prog to replace not found for cgroup */
+			return -ENOENT;
 	} else if (!list_empty(progs)) {
 		replace_pl = list_first_entry(progs, typeof(*pl), node);
 	}
@@ -356,7 +365,7 @@ int __cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
 	for_each_cgroup_storage_type(stype)
 		pl->storage[stype] = storage[stype];
 
-	cgrp->bpf.flags[type] = flags;
+	cgrp->bpf.flags[type] = saved_flags;
 
 	err = update_effective_progs(cgrp, type);
 	if (err)
@@ -522,6 +531,7 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr,
 int cgroup_bpf_prog_attach(const union bpf_attr *attr,
 			   enum bpf_prog_type ptype, struct bpf_prog *prog)
 {
+	struct bpf_prog *replace_prog = NULL;
 	struct cgroup *cgrp;
 	int ret;
 
@@ -529,8 +539,20 @@ int cgroup_bpf_prog_attach(const union bpf_attr *attr,
 	if (IS_ERR(cgrp))
 		return PTR_ERR(cgrp);
 
-	ret = cgroup_bpf_attach(cgrp, prog, attr->attach_type,
+	if ((attr->attach_flags & BPF_F_ALLOW_MULTI) &&
+	    (attr->attach_flags & BPF_F_REPLACE)) {
+		replace_prog = bpf_prog_get_type(attr->replace_bpf_fd, ptype);
+		if (IS_ERR(replace_prog)) {
+			cgroup_put(cgrp);
+			return PTR_ERR(replace_prog);
+		}
+	}
+
+	ret = cgroup_bpf_attach(cgrp, prog, replace_prog, attr->attach_type,
 				attr->attach_flags);
+
+	if (replace_prog)
+		bpf_prog_put(replace_prog);
 	cgroup_put(cgrp);
 	return ret;
 }
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index b08c362f4e02..81ee8595dfee 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2073,10 +2073,10 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog,
 	}
 }
 
-#define BPF_PROG_ATTACH_LAST_FIELD attach_flags
+#define BPF_PROG_ATTACH_LAST_FIELD replace_bpf_fd
 
 #define BPF_F_ATTACH_MASK \
-	(BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI)
+	(BPF_F_ALLOW_OVERRIDE | BPF_F_ALLOW_MULTI | BPF_F_REPLACE)
 
 static int bpf_prog_attach(const union bpf_attr *attr)
 {
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 735af8f15f95..725365df066d 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -6288,12 +6288,13 @@ void cgroup_sk_free(struct sock_cgroup_data *skcd)
 
 #ifdef CONFIG_CGROUP_BPF
 int cgroup_bpf_attach(struct cgroup *cgrp, struct bpf_prog *prog,
-		      enum bpf_attach_type type, u32 flags)
+		      struct bpf_prog *replace_prog, enum bpf_attach_type type,
+		      u32 flags)
 {
 	int ret;
 
 	mutex_lock(&cgroup_mutex);
-	ret = __cgroup_bpf_attach(cgrp, prog, type, flags);
+	ret = __cgroup_bpf_attach(cgrp, prog, replace_prog, type, flags);
 	mutex_unlock(&cgroup_mutex);
 	return ret;
 }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index dbbcf0b02970..7df436da542d 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -231,6 +231,11 @@ enum bpf_attach_type {
  * When children program makes decision (like picking TCP CA or sock bind)
  * parent program has a chance to override it.
  *
+ * With BPF_F_ALLOW_MULTI a new program is added to the end of the list of
+ * programs for a cgroup. It is also possible to replace an old program at
+ * any position by specifying the BPF_F_REPLACE flag and passing the fd of
+ * that program in the replace_bpf_fd attribute. The old program is released.
+ *
  * A cgroup with MULTI or OVERRIDE flag allows any attach flags in sub-cgroups.
  * A cgroup with NONE doesn't allow any programs in sub-cgroups.
  * Ex1:
@@ -249,6 +254,7 @@ enum bpf_attach_type {
  */
 #define BPF_F_ALLOW_OVERRIDE	(1U << 0)
 #define BPF_F_ALLOW_MULTI	(1U << 1)
+#define BPF_F_REPLACE		(1U << 2)
 
 /* If BPF_F_STRICT_ALIGNMENT is used in BPF_PROG_LOAD command, the
  * verifier will perform strict alignment checking as if the kernel
@@ -442,6 +448,10 @@ union bpf_attr {
 		__u32		attach_bpf_fd;	/* eBPF program to attach */
 		__u32		attach_type;
 		__u32		attach_flags;
+		__u32		replace_bpf_fd;	/* previously attached eBPF
+						 * program to replace if
+						 * BPF_F_REPLACE is used
+						 */
 	};
 
 	struct { /* anonymous struct used by BPF_PROG_TEST_RUN command */
-- 
cgit v1.2.3


From e1b5e598e5a51b453328879682b178b4acc15105 Mon Sep 17 00:00:00 2001
From: John Rutherford <john.rutherford@dektech.com.au>
Date: Thu, 19 Dec 2019 16:03:57 +1100
Subject: tipc: make legacy address flag readable over netlink

To enable iproute2/tipc to generate backwards compatible
printouts and validate command parameters for nodes using a
<z.c.n> node address, it needs to be able to read the legacy
address flag from the kernel.  The legacy address flag records
the way in which the node identity was originally specified.

The legacy address flag is requested by the netlink message
TIPC_NL_ADDR_LEGACY_GET.  If the flag is set, the attribute
TIPC_NLA_NET_ADDR_LEGACY is included in the reply message; otherwise it
is omitted.
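
Since the flag is conveyed purely by the presence of the attribute, a
reader of the reply only has to look for it inside the TIPC_NLA_NET
nest. The sketch below walks a raw attribute payload by hand, assuming
the usual netlink attribute layout (16-bit length, 16-bit type, 4-byte
alignment). The helper name has_flag_attr, the local attr_hdr struct and
the numeric value used for LEGACY_FLAG_ATTR are illustrative assumptions;
real code would take the constant from linux/tipc_netlink.h and would
normally use a netlink library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the netlink attribute header layout. */
struct attr_hdr {
	uint16_t len;  /* header plus payload, unpadded */
	uint16_t type; /* attribute type, top two bits are flags */
};

#define ATTR_ALIGN(n)    (((n) + 3U) & ~3U)
#define ATTR_TYPE_MASK   0x3fffU /* drops the nested/byteorder bits */
#define LEGACY_FLAG_ATTR 5       /* assumed TIPC_NLA_NET_ADDR_LEGACY */

/* Returns 1 if an attribute of type 'wanted' is present, else 0. */
int has_flag_attr(const unsigned char *payload, size_t payload_len,
		  uint16_t wanted)
{
	size_t off = 0;

	assert(payload != NULL || payload_len == 0);

	while (off + sizeof(struct attr_hdr) <= payload_len) {
		struct attr_hdr h;

		memcpy(&h, payload + off, sizeof(h));
		if (h.len < sizeof(h) || h.len > payload_len - off)
			return 0; /* malformed attribute, stop parsing */
		if ((h.type & ATTR_TYPE_MASK) == wanted)
			return 1; /* flag attributes carry no payload */
		off += ATTR_ALIGN(h.len);
	}
	return 0;
}
```

A caller would pass the payload of the TIPC_NLA_NET nest and treat a
return value of 1 as "legacy address format in use".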

Signed-off-by: John Rutherford <john.rutherford@dektech.com.au>
Acked-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/tipc_netlink.h |  2 ++
 net/tipc/net.c                    | 56 +++++++++++++++++++++++++++++++++++++++
 net/tipc/net.h                    |  1 +
 net/tipc/netlink.c                |  6 +++++
 4 files changed, 65 insertions(+)

diff --git a/include/uapi/linux/tipc_netlink.h b/include/uapi/linux/tipc_netlink.h
index 6c2194ab745b..dc0d23a50e69 100644
--- a/include/uapi/linux/tipc_netlink.h
+++ b/include/uapi/linux/tipc_netlink.h
@@ -65,6 +65,7 @@ enum {
 	TIPC_NL_UDP_GET_REMOTEIP,
 	TIPC_NL_KEY_SET,
 	TIPC_NL_KEY_FLUSH,
+	TIPC_NL_ADDR_LEGACY_GET,
 
 	__TIPC_NL_CMD_MAX,
 	TIPC_NL_CMD_MAX = __TIPC_NL_CMD_MAX - 1
@@ -176,6 +177,7 @@ enum {
 	TIPC_NLA_NET_ADDR,		/* u32 */
 	TIPC_NLA_NET_NODEID,		/* u64 */
 	TIPC_NLA_NET_NODEID_W1,		/* u64 */
+	TIPC_NLA_NET_ADDR_LEGACY,	/* flag */
 
 	__TIPC_NLA_NET_MAX,
 	TIPC_NLA_NET_MAX = __TIPC_NLA_NET_MAX - 1
diff --git a/net/tipc/net.c b/net/tipc/net.c
index 2de3cec9929d..85400e4242de 100644
--- a/net/tipc/net.c
+++ b/net/tipc/net.c
@@ -302,3 +302,59 @@ int tipc_nl_net_set(struct sk_buff *skb, struct genl_info *info)
 
 	return err;
 }
+
+static int __tipc_nl_addr_legacy_get(struct net *net, struct tipc_nl_msg *msg)
+{
+	struct tipc_net *tn = tipc_net(net);
+	struct nlattr *attrs;
+	void *hdr;
+
+	hdr = genlmsg_put(msg->skb, msg->portid, msg->seq, &tipc_genl_family,
+			  0, TIPC_NL_ADDR_LEGACY_GET);
+	if (!hdr)
+		return -EMSGSIZE;
+
+	attrs = nla_nest_start(msg->skb, TIPC_NLA_NET);
+	if (!attrs)
+		goto msg_full;
+
+	if (tn->legacy_addr_format)
+		if (nla_put_flag(msg->skb, TIPC_NLA_NET_ADDR_LEGACY))
+			goto attr_msg_full;
+
+	nla_nest_end(msg->skb, attrs);
+	genlmsg_end(msg->skb, hdr);
+
+	return 0;
+
+attr_msg_full:
+	nla_nest_cancel(msg->skb, attrs);
+msg_full:
+	genlmsg_cancel(msg->skb, hdr);
+
+	return -EMSGSIZE;
+}
+
+int tipc_nl_net_addr_legacy_get(struct sk_buff *skb, struct genl_info *info)
+{
+	struct net *net = sock_net(skb->sk);
+	struct tipc_nl_msg msg;
+	struct sk_buff *rep;
+	int err;
+
+	rep = nlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
+	if (!rep)
+		return -ENOMEM;
+
+	msg.skb = rep;
+	msg.portid = info->snd_portid;
+	msg.seq = info->snd_seq;
+
+	err = __tipc_nl_addr_legacy_get(net, &msg);
+	if (err) {
+		nlmsg_free(msg.skb);
+		return err;
+	}
+
+	return genlmsg_reply(msg.skb, info);
+}
diff --git a/net/tipc/net.h b/net/tipc/net.h
index b7f2e364eb99..6740d97c706e 100644
--- a/net/tipc/net.h
+++ b/net/tipc/net.h
@@ -47,5 +47,6 @@ void tipc_net_stop(struct net *net);
 int tipc_nl_net_dump(struct sk_buff *skb, struct netlink_callback *cb);
 int tipc_nl_net_set(struct sk_buff *skb, struct genl_info *info);
 int __tipc_nl_net_set(struct sk_buff *skb, struct genl_info *info);
+int tipc_nl_net_addr_legacy_get(struct sk_buff *skb, struct genl_info *info);
 
 #endif
diff --git a/net/tipc/netlink.c b/net/tipc/netlink.c
index e53231bd23b4..7c35094c20b8 100644
--- a/net/tipc/netlink.c
+++ b/net/tipc/netlink.c
@@ -83,6 +83,7 @@ const struct nla_policy tipc_nl_net_policy[TIPC_NLA_NET_MAX + 1] = {
 	[TIPC_NLA_NET_ADDR]		= { .type = NLA_U32 },
 	[TIPC_NLA_NET_NODEID]		= { .type = NLA_U64 },
 	[TIPC_NLA_NET_NODEID_W1]	= { .type = NLA_U64 },
+	[TIPC_NLA_NET_ADDR_LEGACY]	= { .type = NLA_FLAG }
 };
 
 const struct nla_policy tipc_nl_link_policy[TIPC_NLA_LINK_MAX + 1] = {
@@ -273,6 +274,11 @@ static const struct genl_ops tipc_genl_v2_ops[] = {
 		.doit	= tipc_nl_node_flush_key,
 	},
 #endif
+	{
+		.cmd	= TIPC_NL_ADDR_LEGACY_GET,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit	= tipc_nl_net_addr_legacy_get,
+	},
 };
 
 struct genl_family tipc_genl_family __ro_after_init = {
-- 
cgit v1.2.3


From f66b53fdbb22ced1a323b22b9de84a61aacd8d18 Mon Sep 17 00:00:00 2001
From: Martin Varghese <martin.varghese@nokia.com>
Date: Sat, 21 Dec 2019 08:50:46 +0530
Subject: openvswitch: New MPLS actions for layer 2 tunnelling

The existing PUSH MPLS action inserts the MPLS header between the ethernet
header and the IP header. Though this behaviour is fine for L3 VPN, where an
IP packet is encapsulated inside an MPLS tunnel, it does not satisfy the L2
VPN (L2 tunnelling) requirements. In L2 VPN the MPLS header should
encapsulate the ethernet packet.

The new MPLS action ADD_MPLS inserts the MPLS header at the start of the
packet or at the start of the L3 header, depending on the value of the L3
tunnel flag in the ADD_MPLS arguments.

The POP_MPLS action is extended to support ethertype 0x6558 (ETH_P_TEB).
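
The choice of insertion point made by ADD_MPLS can be sketched on a
plain byte buffer. This is only a model under stated assumptions: it
works on a flat array instead of an skb, it does not rewrite the
ethertype of the outer ethernet header, and the helper names
add_mpls_offset and insert_lse are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define L3_TUNNEL_FLAG (1U << 0) /* same bit as OVS_MPLS_L3_TUNNEL_FLAG_MASK */
#define MPLS_HLEN      4         /* one label stack entry */

/* Flag clear: start of packet. Flag set: right after the MAC header. */
size_t add_mpls_offset(uint16_t tun_flags, uint16_t mac_len)
{
	return (tun_flags & L3_TUNNEL_FLAG) ? mac_len : 0;
}

/*
 * Insert one label stack entry at 'off'. Returns the new packet length,
 * or 0 if the offset is out of range or the buffer is too small.
 */
size_t insert_lse(unsigned char *buf, size_t len, size_t cap, size_t off,
		  const unsigned char lse[MPLS_HLEN])
{
	assert(buf != NULL && lse != NULL);

	if (off > len || cap < len + MPLS_HLEN)
		return 0;
	/* Shift everything from 'off' onwards to make room. */
	memmove(buf + off + MPLS_HLEN, buf + off, len - off);
	memcpy(buf + off, lse, MPLS_HLEN);
	return len + MPLS_HLEN;
}
```

With a 14-byte ethernet header, the L3 tunnel flag yields offset 14, so
the label sits between the ethernet header and the payload. Without the
flag the offset is 0 and the whole ethernet frame ends up behind the
label, which is what an L2 tunnel needs.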

Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Acked-by: Pravin B Shelar <pshelar@ovn.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/openvswitch.h | 31 +++++++++++++++++++++++++++++++
 net/openvswitch/actions.c        | 30 ++++++++++++++++++++++++------
 net/openvswitch/flow_netlink.c   | 34 ++++++++++++++++++++++++++++++++++
 3 files changed, 89 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index a87b44cd5590..ae2bff14e7e1 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -672,6 +672,32 @@ struct ovs_action_push_mpls {
 	__be16 mpls_ethertype; /* Either %ETH_P_MPLS_UC or %ETH_P_MPLS_MC */
 };
 
+/**
+ * struct ovs_action_add_mpls - %OVS_ACTION_ATTR_ADD_MPLS action
+ * argument.
+ * @mpls_lse: MPLS label stack entry to push.
+ * @mpls_ethertype: Ethertype to set in the encapsulating ethernet frame.
+ * @tun_flags: MPLS tunnel attributes.
+ *
+ * The only values @mpls_ethertype should ever be given are %ETH_P_MPLS_UC and
+ * %ETH_P_MPLS_MC, indicating MPLS unicast or multicast. Others are rejected.
+ */
+struct ovs_action_add_mpls {
+	__be32 mpls_lse;
+	__be16 mpls_ethertype; /* Either %ETH_P_MPLS_UC or %ETH_P_MPLS_MC */
+	__u16 tun_flags;
+};
+
+#define OVS_MPLS_L3_TUNNEL_FLAG_MASK  (1 << 0) /* Flag to specify the place of
+						* insertion of MPLS header.
+						* When false, the MPLS header
+						* will be inserted at the start
+						* of the packet.
+						* When true, the MPLS header
+						* will be inserted at the start
+						* of the l3 header.
+						*/
+
 /**
  * struct ovs_action_push_vlan - %OVS_ACTION_ATTR_PUSH_VLAN action argument.
  * @vlan_tpid: Tag protocol identifier (TPID) to push.
@@ -892,6 +918,10 @@ struct check_pkt_len_arg {
  * @OVS_ACTION_ATTR_CHECK_PKT_LEN: Check the packet length and execute a set
  * of actions if greater than the specified packet length, else execute
  * another set of actions.
+ * @OVS_ACTION_ATTR_ADD_MPLS: Push a new MPLS label stack entry at the
+ * start of the packet or at the start of the l3 header depending on the value
+ * of l3 tunnel flag in the tun_flags field of OVS_ACTION_ATTR_ADD_MPLS
+ * argument.
  *
  * Only a single header can be set with a single %OVS_ACTION_ATTR_SET.  Not all
  * fields within a header are modifiable, e.g. the IPv4 protocol and fragment
@@ -927,6 +957,7 @@ enum ovs_action_attr {
 	OVS_ACTION_ATTR_METER,        /* u32 meter ID. */
 	OVS_ACTION_ATTR_CLONE,        /* Nested OVS_CLONE_ATTR_*.  */
 	OVS_ACTION_ATTR_CHECK_PKT_LEN, /* Nested OVS_CHECK_PKT_LEN_ATTR_*. */
+	OVS_ACTION_ATTR_ADD_MPLS,     /* struct ovs_action_add_mpls. */
 
 	__OVS_ACTION_ATTR_MAX,	      /* Nothing past this will be accepted
 				       * from userspace. */
diff --git a/net/openvswitch/actions.c b/net/openvswitch/actions.c
index 4c8395462303..7fbfe2adfffa 100644
--- a/net/openvswitch/actions.c
+++ b/net/openvswitch/actions.c
@@ -161,16 +161,17 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 			      const struct nlattr *attr, int len);
 
 static int push_mpls(struct sk_buff *skb, struct sw_flow_key *key,
-		     const struct ovs_action_push_mpls *mpls)
+		     __be32 mpls_lse, __be16 mpls_ethertype, __u16 mac_len)
 {
 	int err;
 
-	err = skb_mpls_push(skb, mpls->mpls_lse, mpls->mpls_ethertype,
-			    skb->mac_len,
-			    ovs_key_mac_proto(key) == MAC_PROTO_ETHERNET);
+	err = skb_mpls_push(skb, mpls_lse, mpls_ethertype, mac_len, !!mac_len);
 	if (err)
 		return err;
 
+	if (!mac_len)
+		key->mac_proto = MAC_PROTO_NONE;
+
 	invalidate_flow_key(key);
 	return 0;
 }
@@ -185,6 +186,9 @@ static int pop_mpls(struct sk_buff *skb, struct sw_flow_key *key,
 	if (err)
 		return err;
 
+	if (ethertype == htons(ETH_P_TEB))
+		key->mac_proto = MAC_PROTO_ETHERNET;
+
 	invalidate_flow_key(key);
 	return 0;
 }
@@ -1229,10 +1233,24 @@ static int do_execute_actions(struct datapath *dp, struct sk_buff *skb,
 			execute_hash(skb, key, a);
 			break;
 
-		case OVS_ACTION_ATTR_PUSH_MPLS:
-			err = push_mpls(skb, key, nla_data(a));
+		case OVS_ACTION_ATTR_PUSH_MPLS: {
+			struct ovs_action_push_mpls *mpls = nla_data(a);
+
+			err = push_mpls(skb, key, mpls->mpls_lse,
+					mpls->mpls_ethertype, skb->mac_len);
 			break;
+		}
+		case OVS_ACTION_ATTR_ADD_MPLS: {
+			struct ovs_action_add_mpls *mpls = nla_data(a);
+			__u16 mac_len = 0;
+
+			if (mpls->tun_flags & OVS_MPLS_L3_TUNNEL_FLAG_MASK)
+				mac_len = skb->mac_len;
 
+			err = push_mpls(skb, key, mpls->mpls_lse,
+					mpls->mpls_ethertype, mac_len);
+			break;
+		}
 		case OVS_ACTION_ATTR_POP_MPLS:
 			err = pop_mpls(skb, key, nla_get_be16(a));
 			break;
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 65c2e3458ff5..7da4230627f5 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -79,6 +79,7 @@ static bool actions_may_change_flow(const struct nlattr *actions)
 		case OVS_ACTION_ATTR_SET_MASKED:
 		case OVS_ACTION_ATTR_METER:
 		case OVS_ACTION_ATTR_CHECK_PKT_LEN:
+		case OVS_ACTION_ATTR_ADD_MPLS:
 		default:
 			return true;
 		}
@@ -3005,6 +3006,7 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			[OVS_ACTION_ATTR_METER] = sizeof(u32),
 			[OVS_ACTION_ATTR_CLONE] = (u32)-1,
 			[OVS_ACTION_ATTR_CHECK_PKT_LEN] = (u32)-1,
+			[OVS_ACTION_ATTR_ADD_MPLS] = sizeof(struct ovs_action_add_mpls),
 		};
 		const struct ovs_action_push_vlan *vlan;
 		int type = nla_type(a);
@@ -3072,6 +3074,33 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 		case OVS_ACTION_ATTR_RECIRC:
 			break;
 
+		case OVS_ACTION_ATTR_ADD_MPLS: {
+			const struct ovs_action_add_mpls *mpls = nla_data(a);
+
+			if (!eth_p_mpls(mpls->mpls_ethertype))
+				return -EINVAL;
+
+			if (mpls->tun_flags & OVS_MPLS_L3_TUNNEL_FLAG_MASK) {
+				if (vlan_tci & htons(VLAN_CFI_MASK) ||
+				    (eth_type != htons(ETH_P_IP) &&
+				     eth_type != htons(ETH_P_IPV6) &&
+				     eth_type != htons(ETH_P_ARP) &&
+				     eth_type != htons(ETH_P_RARP) &&
+				     !eth_p_mpls(eth_type)))
+					return -EINVAL;
+				mpls_label_count++;
+			} else {
+				if (mac_proto == MAC_PROTO_ETHERNET) {
+					mpls_label_count = 1;
+					mac_proto = MAC_PROTO_NONE;
+				} else {
+					mpls_label_count++;
+				}
+			}
+			eth_type = mpls->mpls_ethertype;
+			break;
+		}
+
 		case OVS_ACTION_ATTR_PUSH_MPLS: {
 			const struct ovs_action_push_mpls *mpls = nla_data(a);
 
@@ -3109,6 +3138,11 @@ static int __ovs_nla_copy_actions(struct net *net, const struct nlattr *attr,
 			 * recirculation.
 			 */
 			proto = nla_get_be16(a);
+
+			if (proto == htons(ETH_P_TEB) &&
+			    mac_proto != MAC_PROTO_NONE)
+				return -EINVAL;
+
 			mpls_label_count--;
 
 			if (!eth_p_mpls(proto) || !mpls_label_count)
-- 
cgit v1.2.3


From b6fd7b96366769651ab23988607ce9c5c9042cdb Mon Sep 17 00:00:00 2001
From: Richard Cochran <richardcochran@gmail.com>
Date: Wed, 25 Dec 2019 18:16:19 -0800
Subject: net: Introduce peer to peer one step PTP time stamping.

The 1588 standard defines one step operation for both Sync and
PDelay_Resp messages.  Up until now, hardware with P2P one step has
been rare, and kernel support was lacking.  This patch adds support for
this mode in anticipation of new hardware developments.

Signed-off-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c   | 1 +
 drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c | 1 +
 drivers/net/ethernet/microchip/lan743x_ptp.c       | 3 +++
 drivers/net/ethernet/qlogic/qede/qede_ptp.c        | 1 +
 include/uapi/linux/net_tstamp.h                    | 8 ++++++++
 net/core/dev_ioctl.c                               | 1 +
 6 files changed, 15 insertions(+)

diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
index cff64e43bdd8..741d865e4afc 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_main.c
@@ -15410,6 +15410,7 @@ int bnx2x_configure_ptp_filters(struct bnx2x *bp)
 		REG_WR(bp, rule, BNX2X_PTP_TX_ON_RULE_MASK);
 		break;
 	case HWTSTAMP_TX_ONESTEP_SYNC:
+	case HWTSTAMP_TX_ONESTEP_P2P:
 		BNX2X_ERR("One-step timestamping is not supported\n");
 		return -ERANGE;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
index ec2ff3d7f41c..4aaaa4937b1a 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_ptp.c
@@ -920,6 +920,7 @@ static int mlxsw_sp_ptp_get_message_types(const struct hwtstamp_config *config,
 		egr_types = 0xff;
 		break;
 	case HWTSTAMP_TX_ONESTEP_SYNC:
+	case HWTSTAMP_TX_ONESTEP_P2P:
 		return -ERANGE;
 	}
 
diff --git a/drivers/net/ethernet/microchip/lan743x_ptp.c b/drivers/net/ethernet/microchip/lan743x_ptp.c
index afe52463dc57..9399f6a98748 100644
--- a/drivers/net/ethernet/microchip/lan743x_ptp.c
+++ b/drivers/net/ethernet/microchip/lan743x_ptp.c
@@ -1265,6 +1265,9 @@ int lan743x_ptp_ioctl(struct net_device *netdev, struct ifreq *ifr, int cmd)
 
 		lan743x_ptp_set_sync_ts_insert(adapter, true);
 		break;
+	case HWTSTAMP_TX_ONESTEP_P2P:
+		ret = -ERANGE;
+		break;
 	default:
 		netif_warn(adapter, drv, adapter->netdev,
 			   "  tx_type = %d, UNKNOWN\n", config.tx_type);
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ptp.c b/drivers/net/ethernet/qlogic/qede/qede_ptp.c
index f815435cf106..4c7f7a7fc151 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ptp.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ptp.c
@@ -247,6 +247,7 @@ static int qede_ptp_cfg_filters(struct qede_dev *edev)
 		break;
 
 	case HWTSTAMP_TX_ONESTEP_SYNC:
+	case HWTSTAMP_TX_ONESTEP_P2P:
 		DP_ERR(edev, "One-step timestamping is not supported\n");
 		return -ERANGE;
 	}
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index e5b39721c6e4..f96e650d0af9 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -90,6 +90,14 @@ enum hwtstamp_tx_types {
 	 * queue.
 	 */
 	HWTSTAMP_TX_ONESTEP_SYNC,
+
+	/*
+	 * Same as HWTSTAMP_TX_ONESTEP_SYNC, but also enables time
+	 * stamp insertion directly into PDelay_Resp packets. In this
+	 * case, neither transmitted Sync nor PDelay_Resp packets will
+	 * receive a time stamp via the socket error queue.
+	 */
+	HWTSTAMP_TX_ONESTEP_P2P,
 };
 
 /* possible values for hwtstamp_config->rx_filter */
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 5163d900bb4f..dbaebbe573f0 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -187,6 +187,7 @@ static int net_hwtstamp_validate(struct ifreq *ifr)
 	case HWTSTAMP_TX_OFF:
 	case HWTSTAMP_TX_ON:
 	case HWTSTAMP_TX_ONESTEP_SYNC:
+	case HWTSTAMP_TX_ONESTEP_P2P:
 		tx_type_valid = 1;
 		break;
 	}
-- 
cgit v1.2.3


From c14ceb0ec727187f71a487a592ffa91767fed66e Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw@strlen.de>
Date: Wed, 18 Dec 2019 12:05:21 +0100
Subject: netfilter: nft_meta: add support for slave device ifindex matching

Allow matching on the vrf slave device ifindex or name.

In case there was no slave interface involved, store 0 in the
destination register just like existing iif/oif matching.

sdif(name) is restricted to the ipv4/ipv6 input and forward hooks,
as it depends on ip(6) stack parsing/storing info in skb->cb[].

Cc: Martin Willi <martin@strongswan.org>
Cc: David Ahern <dsahern@kernel.org>
Cc: Shrijeet Mukherjee <shrijeet@gmail.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
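
The hook restriction described above comes down to a bitmask test. The
following standalone sketch models it outside the kernel; the DEMO_ and
demo_ names are hypothetical, and the enum only mirrors the ordering of the
inet hook numbers rather than including the real header.

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of the inet hook numbering; names are illustrative only. */
enum demo_inet_hook {
	DEMO_PRE_ROUTING,
	DEMO_LOCAL_IN,
	DEMO_FORWARD,
	DEMO_LOCAL_OUT,
	DEMO_POST_ROUTING,
	DEMO_NUMHOOKS,
};

/* Hooks where sdif(name) is permitted: input and forward. */
static unsigned int demo_sdif_hook_mask(void)
{
	return (1u << DEMO_LOCAL_IN) | (1u << DEMO_FORWARD);
}

/* True if a base chain attached to 'hook' may use sdif(name). */
static bool demo_sdif_allowed(enum demo_inet_hook hook)
{
	assert(hook < DEMO_NUMHOOKS);
	return (demo_sdif_hook_mask() & (1u << hook)) != 0;
}
```

In this model a chain on the output or prerouting hook is refused, which
matches the intent that the ip(6) stack must already have stored its
information in skb->cb[].
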
 include/uapi/linux/netfilter/nf_tables.h |  4 ++
 net/netfilter/nft_meta.c                 | 76 +++++++++++++++++++++++++++++---
 2 files changed, 73 insertions(+), 7 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index bb9b049310df..e237ecbdcd8a 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -805,6 +805,8 @@ enum nft_exthdr_attributes {
  * @NFT_META_TIME_NS: time since epoch (in nanoseconds)
  * @NFT_META_TIME_DAY: day of week (from 0 = Sunday to 6 = Saturday)
  * @NFT_META_TIME_HOUR: hour of day (in seconds)
+ * @NFT_META_SDIF: slave device interface index
+ * @NFT_META_SDIFNAME: slave device interface name
  */
 enum nft_meta_keys {
 	NFT_META_LEN,
@@ -840,6 +842,8 @@ enum nft_meta_keys {
 	NFT_META_TIME_NS,
 	NFT_META_TIME_DAY,
 	NFT_META_TIME_HOUR,
+	NFT_META_SDIF,
+	NFT_META_SDIFNAME,
 };
 
 /**
diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c
index fb1a571db924..951b6e87ed5d 100644
--- a/net/netfilter/nft_meta.c
+++ b/net/netfilter/nft_meta.c
@@ -17,6 +17,7 @@
 #include <linux/smp.h>
 #include <linux/static_key.h>
 #include <net/dst.h>
+#include <net/ip.h>
 #include <net/sock.h>
 #include <net/tcp_states.h> /* for TCP_TIME_WAIT */
 #include <net/netfilter/nf_tables.h>
@@ -287,6 +288,28 @@ nft_meta_get_eval_rtclassid(const struct sk_buff *skb, u32 *dest)
 }
 #endif
 
+static noinline u32 nft_meta_get_eval_sdif(const struct nft_pktinfo *pkt)
+{
+	switch (nft_pf(pkt)) {
+	case NFPROTO_IPV4:
+		return inet_sdif(pkt->skb);
+	case NFPROTO_IPV6:
+		return inet6_sdif(pkt->skb);
+	}
+
+	return 0;
+}
+
+static noinline void
+nft_meta_get_eval_sdifname(u32 *dest, const struct nft_pktinfo *pkt)
+{
+	u32 sdif = nft_meta_get_eval_sdif(pkt);
+	const struct net_device *dev;
+
+	dev = sdif ? dev_get_by_index_rcu(nft_net(pkt), sdif) : NULL;
+	nft_meta_store_ifname(dest, dev);
+}
+
 void nft_meta_get_eval(const struct nft_expr *expr,
 		       struct nft_regs *regs,
 		       const struct nft_pktinfo *pkt)
@@ -379,6 +402,12 @@ void nft_meta_get_eval(const struct nft_expr *expr,
 	case NFT_META_TIME_HOUR:
 		nft_meta_get_eval_time(priv->key, dest);
 		break;
+	case NFT_META_SDIF:
+		*dest = nft_meta_get_eval_sdif(pkt);
+		break;
+	case NFT_META_SDIFNAME:
+		nft_meta_get_eval_sdifname(dest, pkt);
+		break;
 	default:
 		WARN_ON(1);
 		goto err;
@@ -459,6 +488,7 @@ int nft_meta_get_init(const struct nft_ctx *ctx,
 	case NFT_META_MARK:
 	case NFT_META_IIF:
 	case NFT_META_OIF:
+	case NFT_META_SDIF:
 	case NFT_META_SKUID:
 	case NFT_META_SKGID:
 #ifdef CONFIG_IP_ROUTE_CLASSID
@@ -480,6 +510,7 @@ int nft_meta_get_init(const struct nft_ctx *ctx,
 	case NFT_META_OIFNAME:
 	case NFT_META_IIFKIND:
 	case NFT_META_OIFKIND:
+	case NFT_META_SDIFNAME:
 		len = IFNAMSIZ;
 		break;
 	case NFT_META_PRANDOM:
@@ -510,16 +541,28 @@ int nft_meta_get_init(const struct nft_ctx *ctx,
 }
 EXPORT_SYMBOL_GPL(nft_meta_get_init);
 
-static int nft_meta_get_validate(const struct nft_ctx *ctx,
-				 const struct nft_expr *expr,
-				 const struct nft_data **data)
+static int nft_meta_get_validate_sdif(const struct nft_ctx *ctx)
 {
-#ifdef CONFIG_XFRM
-	const struct nft_meta *priv = nft_expr_priv(expr);
 	unsigned int hooks;
 
-	if (priv->key != NFT_META_SECPATH)
-		return 0;
+	switch (ctx->family) {
+	case NFPROTO_IPV4:
+	case NFPROTO_IPV6:
+	case NFPROTO_INET:
+		hooks = (1 << NF_INET_LOCAL_IN) |
+			(1 << NF_INET_FORWARD);
+		break;
+	default:
+		return -EOPNOTSUPP;
+	}
+
+	return nft_chain_validate_hooks(ctx->chain, hooks);
+}
+
+static int nft_meta_get_validate_xfrm(const struct nft_ctx *ctx)
+{
+#ifdef CONFIG_XFRM
+	unsigned int hooks;
 
 	switch (ctx->family) {
 	case NFPROTO_NETDEV:
@@ -542,6 +585,25 @@ static int nft_meta_get_validate(const struct nft_ctx *ctx,
 #endif
 }
 
+static int nft_meta_get_validate(const struct nft_ctx *ctx,
+				 const struct nft_expr *expr,
+				 const struct nft_data **data)
+{
+	const struct nft_meta *priv = nft_expr_priv(expr);
+
+	switch (priv->key) {
+	case NFT_META_SECPATH:
+		return nft_meta_get_validate_xfrm(ctx);
+	case NFT_META_SDIF:
+	case NFT_META_SDIFNAME:
+		return nft_meta_get_validate_sdif(ctx);
+	default:
+		break;
+	}
+
+	return 0;
+}
+
 int nft_meta_set_validate(const struct nft_ctx *ctx,
 			  const struct nft_expr *expr,
 			  const struct nft_data **data)
-- 
cgit v1.2.3


From c1e469902640ed0a82d0b8df5951199c17a49e3c Mon Sep 17 00:00:00 2001
From: Andy Roulin <aroulin@cumulusnetworks.com>
Date: Thu, 26 Dec 2019 05:41:57 -0800
Subject: bonding: rename AD_STATE_* to LACP_STATE_*

As the LACP actor/partner state is now part of the uapi, rename the
3ad state defines to use the LACP prefix. The LACP prefix is preferred
over BOND_3AD as the LACP standard moved to 802.1AX.

Fixes: 826f66b30c2e3 ("bonding: move 802.3ad port state flags to uapi")
Signed-off-by: Andy Roulin <aroulin@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
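
A small sketch of how userspace might decode a port state byte with the
renamed flags. The flag values are copied from the if_bonding.h hunk of this
patch and are guarded so that they do not clash with a header that already
provides them; lacp_state_str() and the short flag names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Values as in this patch; only used if the uapi header is not included. */
#ifndef LACP_STATE_LACP_ACTIVITY
#define LACP_STATE_LACP_ACTIVITY   0x1
#define LACP_STATE_LACP_TIMEOUT    0x2
#define LACP_STATE_AGGREGATION     0x4
#define LACP_STATE_SYNCHRONIZATION 0x8
#define LACP_STATE_COLLECTING      0x10
#define LACP_STATE_DISTRIBUTING    0x20
#define LACP_STATE_DEFAULTED       0x40
#define LACP_STATE_EXPIRED         0x80
#endif

static const struct {
	unsigned char bit;
	const char *name;
} lacp_flags[] = {
	{ LACP_STATE_LACP_ACTIVITY,   "activity" },
	{ LACP_STATE_LACP_TIMEOUT,    "timeout" },
	{ LACP_STATE_AGGREGATION,     "aggregation" },
	{ LACP_STATE_SYNCHRONIZATION, "synchronization" },
	{ LACP_STATE_COLLECTING,      "collecting" },
	{ LACP_STATE_DISTRIBUTING,    "distributing" },
	{ LACP_STATE_DEFAULTED,       "defaulted" },
	{ LACP_STATE_EXPIRED,         "expired" },
};

/* Write the set flags as a comma-separated list into buf; returns buf. */
static const char *lacp_state_str(unsigned char state, char *buf, size_t len)
{
	size_t i, off = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(lacp_flags) / sizeof(lacp_flags[0]); i++) {
		int n;

		if (!(state & lacp_flags[i].bit))
			continue;
		n = snprintf(buf + off, len - off, "%s%s",
			     off ? "," : "", lacp_flags[i].name);
		if (n < 0 || (size_t)n >= len - off)
			break;	/* output truncated, stop here */
		off += (size_t)n;
	}
	return buf;
}
```

For example, the initial actor state set in ad_initialize_port()
(LACP_STATE_AGGREGATION | LACP_STATE_LACP_ACTIVITY) has both of those bits
set and no others.
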
 drivers/net/bonding/bond_3ad.c  | 112 ++++++++++++++++++++--------------------
 include/uapi/linux/if_bonding.h |  16 +++---
 2 files changed, 64 insertions(+), 64 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c
index 34bfe99641a3..31e43a2197a3 100644
--- a/drivers/net/bonding/bond_3ad.c
+++ b/drivers/net/bonding/bond_3ad.c
@@ -447,8 +447,8 @@ static void __choose_matched(struct lacpdu *lacpdu, struct port *port)
 	     MAC_ADDRESS_EQUAL(&(lacpdu->partner_system), &(port->actor_system)) &&
 	     (ntohs(lacpdu->partner_system_priority) == port->actor_system_priority) &&
 	     (ntohs(lacpdu->partner_key) == port->actor_oper_port_key) &&
-	     ((lacpdu->partner_state & AD_STATE_AGGREGATION) == (port->actor_oper_port_state & AD_STATE_AGGREGATION))) ||
-	    ((lacpdu->actor_state & AD_STATE_AGGREGATION) == 0)
+	     ((lacpdu->partner_state & LACP_STATE_AGGREGATION) == (port->actor_oper_port_state & LACP_STATE_AGGREGATION))) ||
+	    ((lacpdu->actor_state & LACP_STATE_AGGREGATION) == 0)
 		) {
 		port->sm_vars |= AD_PORT_MATCHED;
 	} else {
@@ -482,18 +482,18 @@ static void __record_pdu(struct lacpdu *lacpdu, struct port *port)
 		partner->port_state = lacpdu->actor_state;
 
 		/* set actor_oper_port_state.defaulted to FALSE */
-		port->actor_oper_port_state &= ~AD_STATE_DEFAULTED;
+		port->actor_oper_port_state &= ~LACP_STATE_DEFAULTED;
 
 		/* set the partner sync. to on if the partner is sync,
 		 * and the port is matched
 		 */
 		if ((port->sm_vars & AD_PORT_MATCHED) &&
-		    (lacpdu->actor_state & AD_STATE_SYNCHRONIZATION)) {
-			partner->port_state |= AD_STATE_SYNCHRONIZATION;
+		    (lacpdu->actor_state & LACP_STATE_SYNCHRONIZATION)) {
+			partner->port_state |= LACP_STATE_SYNCHRONIZATION;
 			slave_dbg(port->slave->bond->dev, port->slave->dev,
 				  "partner sync=1\n");
 		} else {
-			partner->port_state &= ~AD_STATE_SYNCHRONIZATION;
+			partner->port_state &= ~LACP_STATE_SYNCHRONIZATION;
 			slave_dbg(port->slave->bond->dev, port->slave->dev,
 				  "partner sync=0\n");
 		}
@@ -516,7 +516,7 @@ static void __record_default(struct port *port)
 		       sizeof(struct port_params));
 
 		/* set actor_oper_port_state.defaulted to true */
-		port->actor_oper_port_state |= AD_STATE_DEFAULTED;
+		port->actor_oper_port_state |= LACP_STATE_DEFAULTED;
 	}
 }
 
@@ -546,7 +546,7 @@ static void __update_selected(struct lacpdu *lacpdu, struct port *port)
 		    !MAC_ADDRESS_EQUAL(&lacpdu->actor_system, &partner->system) ||
 		    ntohs(lacpdu->actor_system_priority) != partner->system_priority ||
 		    ntohs(lacpdu->actor_key) != partner->key ||
-		    (lacpdu->actor_state & AD_STATE_AGGREGATION) != (partner->port_state & AD_STATE_AGGREGATION)) {
+		    (lacpdu->actor_state & LACP_STATE_AGGREGATION) != (partner->port_state & LACP_STATE_AGGREGATION)) {
 			port->sm_vars &= ~AD_PORT_SELECTED;
 		}
 	}
@@ -578,8 +578,8 @@ static void __update_default_selected(struct port *port)
 		    !MAC_ADDRESS_EQUAL(&admin->system, &oper->system) ||
 		    admin->system_priority != oper->system_priority ||
 		    admin->key != oper->key ||
-		    (admin->port_state & AD_STATE_AGGREGATION)
-			!= (oper->port_state & AD_STATE_AGGREGATION)) {
+		    (admin->port_state & LACP_STATE_AGGREGATION)
+			!= (oper->port_state & LACP_STATE_AGGREGATION)) {
 			port->sm_vars &= ~AD_PORT_SELECTED;
 		}
 	}
@@ -609,10 +609,10 @@ static void __update_ntt(struct lacpdu *lacpdu, struct port *port)
 		    !MAC_ADDRESS_EQUAL(&(lacpdu->partner_system), &(port->actor_system)) ||
 		    (ntohs(lacpdu->partner_system_priority) != port->actor_system_priority) ||
 		    (ntohs(lacpdu->partner_key) != port->actor_oper_port_key) ||
-		    ((lacpdu->partner_state & AD_STATE_LACP_ACTIVITY) != (port->actor_oper_port_state & AD_STATE_LACP_ACTIVITY)) ||
-		    ((lacpdu->partner_state & AD_STATE_LACP_TIMEOUT) != (port->actor_oper_port_state & AD_STATE_LACP_TIMEOUT)) ||
-		    ((lacpdu->partner_state & AD_STATE_SYNCHRONIZATION) != (port->actor_oper_port_state & AD_STATE_SYNCHRONIZATION)) ||
-		    ((lacpdu->partner_state & AD_STATE_AGGREGATION) != (port->actor_oper_port_state & AD_STATE_AGGREGATION))
+		    ((lacpdu->partner_state & LACP_STATE_LACP_ACTIVITY) != (port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY)) ||
+		    ((lacpdu->partner_state & LACP_STATE_LACP_TIMEOUT) != (port->actor_oper_port_state & LACP_STATE_LACP_TIMEOUT)) ||
+		    ((lacpdu->partner_state & LACP_STATE_SYNCHRONIZATION) != (port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) ||
+		    ((lacpdu->partner_state & LACP_STATE_AGGREGATION) != (port->actor_oper_port_state & LACP_STATE_AGGREGATION))
 		   ) {
 			port->ntt = true;
 		}
@@ -968,7 +968,7 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 			 * edable port will take place only after this timer)
 			 */
 			if ((port->sm_vars & AD_PORT_SELECTED) &&
-			    (port->partner_oper.port_state & AD_STATE_SYNCHRONIZATION) &&
+			    (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) &&
 			    !__check_agg_selection_timer(port)) {
 				if (port->aggregator->is_active)
 					port->sm_mux_state =
@@ -986,14 +986,14 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 				port->sm_mux_state = AD_MUX_DETACHED;
 			} else if (port->aggregator->is_active) {
 				port->actor_oper_port_state |=
-				    AD_STATE_SYNCHRONIZATION;
+				    LACP_STATE_SYNCHRONIZATION;
 			}
 			break;
 		case AD_MUX_COLLECTING_DISTRIBUTING:
 			if (!(port->sm_vars & AD_PORT_SELECTED) ||
 			    (port->sm_vars & AD_PORT_STANDBY) ||
-			    !(port->partner_oper.port_state & AD_STATE_SYNCHRONIZATION) ||
-			    !(port->actor_oper_port_state & AD_STATE_SYNCHRONIZATION)) {
+			    !(port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) ||
+			    !(port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION)) {
 				port->sm_mux_state = AD_MUX_ATTACHED;
 			} else {
 				/* if port state hasn't changed make
@@ -1022,11 +1022,11 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 			  port->sm_mux_state);
 		switch (port->sm_mux_state) {
 		case AD_MUX_DETACHED:
-			port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION;
+			port->actor_oper_port_state &= ~LACP_STATE_SYNCHRONIZATION;
 			ad_disable_collecting_distributing(port,
 							   update_slave_arr);
-			port->actor_oper_port_state &= ~AD_STATE_COLLECTING;
-			port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING;
+			port->actor_oper_port_state &= ~LACP_STATE_COLLECTING;
+			port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
 			port->ntt = true;
 			break;
 		case AD_MUX_WAITING:
@@ -1035,20 +1035,20 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr)
 		case AD_MUX_ATTACHED:
 			if (port->aggregator->is_active)
 				port->actor_oper_port_state |=
-				    AD_STATE_SYNCHRONIZATION;
+				    LACP_STATE_SYNCHRONIZATION;
 			else
 				port->actor_oper_port_state &=
-				    ~AD_STATE_SYNCHRONIZATION;
-			port->actor_oper_port_state &= ~AD_STATE_COLLECTING;
-			port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING;
+				    ~LACP_STATE_SYNCHRONIZATION;
+			port->actor_oper_port_state &= ~LACP_STATE_COLLECTING;
+			port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
 			ad_disable_collecting_distributing(port,
 							   update_slave_arr);
 			port->ntt = true;
 			break;
 		case AD_MUX_COLLECTING_DISTRIBUTING:
-			port->actor_oper_port_state |= AD_STATE_COLLECTING;
-			port->actor_oper_port_state |= AD_STATE_DISTRIBUTING;
-			port->actor_oper_port_state |= AD_STATE_SYNCHRONIZATION;
+			port->actor_oper_port_state |= LACP_STATE_COLLECTING;
+			port->actor_oper_port_state |= LACP_STATE_DISTRIBUTING;
+			port->actor_oper_port_state |= LACP_STATE_SYNCHRONIZATION;
 			ad_enable_collecting_distributing(port,
 							  update_slave_arr);
 			port->ntt = true;
@@ -1146,7 +1146,7 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 				port->sm_vars |= AD_PORT_LACP_ENABLED;
 			port->sm_vars &= ~AD_PORT_SELECTED;
 			__record_default(port);
-			port->actor_oper_port_state &= ~AD_STATE_EXPIRED;
+			port->actor_oper_port_state &= ~LACP_STATE_EXPIRED;
 			port->sm_rx_state = AD_RX_PORT_DISABLED;
 
 			/* Fall Through */
@@ -1156,9 +1156,9 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 		case AD_RX_LACP_DISABLED:
 			port->sm_vars &= ~AD_PORT_SELECTED;
 			__record_default(port);
-			port->partner_oper.port_state &= ~AD_STATE_AGGREGATION;
+			port->partner_oper.port_state &= ~LACP_STATE_AGGREGATION;
 			port->sm_vars |= AD_PORT_MATCHED;
-			port->actor_oper_port_state &= ~AD_STATE_EXPIRED;
+			port->actor_oper_port_state &= ~LACP_STATE_EXPIRED;
 			break;
 		case AD_RX_EXPIRED:
 			/* Reset of the Synchronization flag (Standard 43.4.12)
@@ -1167,19 +1167,19 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 			 * case of EXPIRED even if LINK_DOWN didn't arrive for
 			 * the port.
 			 */
-			port->partner_oper.port_state &= ~AD_STATE_SYNCHRONIZATION;
+			port->partner_oper.port_state &= ~LACP_STATE_SYNCHRONIZATION;
 			port->sm_vars &= ~AD_PORT_MATCHED;
-			port->partner_oper.port_state |= AD_STATE_LACP_TIMEOUT;
-			port->partner_oper.port_state |= AD_STATE_LACP_ACTIVITY;
+			port->partner_oper.port_state |= LACP_STATE_LACP_TIMEOUT;
+			port->partner_oper.port_state |= LACP_STATE_LACP_ACTIVITY;
 			port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(AD_SHORT_TIMEOUT));
-			port->actor_oper_port_state |= AD_STATE_EXPIRED;
+			port->actor_oper_port_state |= LACP_STATE_EXPIRED;
 			port->sm_vars |= AD_PORT_CHURNED;
 			break;
 		case AD_RX_DEFAULTED:
 			__update_default_selected(port);
 			__record_default(port);
 			port->sm_vars |= AD_PORT_MATCHED;
-			port->actor_oper_port_state &= ~AD_STATE_EXPIRED;
+			port->actor_oper_port_state &= ~LACP_STATE_EXPIRED;
 			break;
 		case AD_RX_CURRENT:
 			/* detect loopback situation */
@@ -1192,8 +1192,8 @@ static void ad_rx_machine(struct lacpdu *lacpdu, struct port *port)
 			__update_selected(lacpdu, port);
 			__update_ntt(lacpdu, port);
 			__record_pdu(lacpdu, port);
-			port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(port->actor_oper_port_state & AD_STATE_LACP_TIMEOUT));
-			port->actor_oper_port_state &= ~AD_STATE_EXPIRED;
+			port->sm_rx_timer_counter = __ad_timer_to_ticks(AD_CURRENT_WHILE_TIMER, (u16)(port->actor_oper_port_state & LACP_STATE_LACP_TIMEOUT));
+			port->actor_oper_port_state &= ~LACP_STATE_EXPIRED;
 			break;
 		default:
 			break;
@@ -1221,7 +1221,7 @@ static void ad_churn_machine(struct port *port)
 	if (port->sm_churn_actor_timer_counter &&
 	    !(--port->sm_churn_actor_timer_counter) &&
 	    port->sm_churn_actor_state == AD_CHURN_MONITOR) {
-		if (port->actor_oper_port_state & AD_STATE_SYNCHRONIZATION) {
+		if (port->actor_oper_port_state & LACP_STATE_SYNCHRONIZATION) {
 			port->sm_churn_actor_state = AD_NO_CHURN;
 		} else {
 			port->churn_actor_count++;
@@ -1231,7 +1231,7 @@ static void ad_churn_machine(struct port *port)
 	if (port->sm_churn_partner_timer_counter &&
 	    !(--port->sm_churn_partner_timer_counter) &&
 	    port->sm_churn_partner_state == AD_CHURN_MONITOR) {
-		if (port->partner_oper.port_state & AD_STATE_SYNCHRONIZATION) {
+		if (port->partner_oper.port_state & LACP_STATE_SYNCHRONIZATION) {
 			port->sm_churn_partner_state = AD_NO_CHURN;
 		} else {
 			port->churn_partner_count++;
@@ -1288,7 +1288,7 @@ static void ad_periodic_machine(struct port *port)
 
 	/* check if port was reinitialized */
 	if (((port->sm_vars & AD_PORT_BEGIN) || !(port->sm_vars & AD_PORT_LACP_ENABLED) || !port->is_enabled) ||
-	    (!(port->actor_oper_port_state & AD_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & AD_STATE_LACP_ACTIVITY))
+	    (!(port->actor_oper_port_state & LACP_STATE_LACP_ACTIVITY) && !(port->partner_oper.port_state & LACP_STATE_LACP_ACTIVITY))
 	   ) {
 		port->sm_periodic_state = AD_NO_PERIODIC;
 	}
@@ -1305,11 +1305,11 @@ static void ad_periodic_machine(struct port *port)
 			switch (port->sm_periodic_state) {
 			case AD_FAST_PERIODIC:
 				if (!(port->partner_oper.port_state
-				      & AD_STATE_LACP_TIMEOUT))
+				      & LACP_STATE_LACP_TIMEOUT))
 					port->sm_periodic_state = AD_SLOW_PERIODIC;
 				break;
 			case AD_SLOW_PERIODIC:
-				if ((port->partner_oper.port_state & AD_STATE_LACP_TIMEOUT)) {
+				if ((port->partner_oper.port_state & LACP_STATE_LACP_TIMEOUT)) {
 					port->sm_periodic_timer_counter = 0;
 					port->sm_periodic_state = AD_PERIODIC_TX;
 				}
@@ -1325,7 +1325,7 @@ static void ad_periodic_machine(struct port *port)
 			break;
 		case AD_PERIODIC_TX:
 			if (!(port->partner_oper.port_state &
-			    AD_STATE_LACP_TIMEOUT))
+			    LACP_STATE_LACP_TIMEOUT))
 				port->sm_periodic_state = AD_SLOW_PERIODIC;
 			else
 				port->sm_periodic_state = AD_FAST_PERIODIC;
@@ -1532,7 +1532,7 @@ static void ad_port_selection_logic(struct port *port, bool *update_slave_arr)
 	ad_agg_selection_logic(aggregator, update_slave_arr);
 
 	if (!port->aggregator->is_active)
-		port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION;
+		port->actor_oper_port_state &= ~LACP_STATE_SYNCHRONIZATION;
 }
 
 /* Decide if "agg" is a better choice for the new active aggregator that
@@ -1838,13 +1838,13 @@ static void ad_initialize_port(struct port *port, int lacp_fast)
 		port->actor_port_priority = 0xff;
 		port->actor_port_aggregator_identifier = 0;
 		port->ntt = false;
-		port->actor_admin_port_state = AD_STATE_AGGREGATION |
-					       AD_STATE_LACP_ACTIVITY;
-		port->actor_oper_port_state  = AD_STATE_AGGREGATION |
-					       AD_STATE_LACP_ACTIVITY;
+		port->actor_admin_port_state = LACP_STATE_AGGREGATION |
+					       LACP_STATE_LACP_ACTIVITY;
+		port->actor_oper_port_state  = LACP_STATE_AGGREGATION |
+					       LACP_STATE_LACP_ACTIVITY;
 
 		if (lacp_fast)
-			port->actor_oper_port_state |= AD_STATE_LACP_TIMEOUT;
+			port->actor_oper_port_state |= LACP_STATE_LACP_TIMEOUT;
 
 		memcpy(&port->partner_admin, &tmpl, sizeof(tmpl));
 		memcpy(&port->partner_oper, &tmpl, sizeof(tmpl));
@@ -2095,10 +2095,10 @@ void bond_3ad_unbind_slave(struct slave *slave)
 		  aggregator->aggregator_identifier);
 
 	/* Tell the partner that this port is not suitable for aggregation */
-	port->actor_oper_port_state &= ~AD_STATE_SYNCHRONIZATION;
-	port->actor_oper_port_state &= ~AD_STATE_COLLECTING;
-	port->actor_oper_port_state &= ~AD_STATE_DISTRIBUTING;
-	port->actor_oper_port_state &= ~AD_STATE_AGGREGATION;
+	port->actor_oper_port_state &= ~LACP_STATE_SYNCHRONIZATION;
+	port->actor_oper_port_state &= ~LACP_STATE_COLLECTING;
+	port->actor_oper_port_state &= ~LACP_STATE_DISTRIBUTING;
+	port->actor_oper_port_state &= ~LACP_STATE_AGGREGATION;
 	__update_lacpdu_from_port(port);
 	ad_lacpdu_send(port);
 
@@ -2685,9 +2685,9 @@ void bond_3ad_update_lacp_rate(struct bonding *bond)
 	bond_for_each_slave(bond, slave, iter) {
 		port = &(SLAVE_AD_INFO(slave)->port);
 		if (lacp_fast)
-			port->actor_oper_port_state |= AD_STATE_LACP_TIMEOUT;
+			port->actor_oper_port_state |= LACP_STATE_LACP_TIMEOUT;
 		else
-			port->actor_oper_port_state &= ~AD_STATE_LACP_TIMEOUT;
+			port->actor_oper_port_state &= ~LACP_STATE_LACP_TIMEOUT;
 	}
 	spin_unlock_bh(&bond->mode_lock);
 }
diff --git a/include/uapi/linux/if_bonding.h b/include/uapi/linux/if_bonding.h
index 6829213a54c5..45f3750aa861 100644
--- a/include/uapi/linux/if_bonding.h
+++ b/include/uapi/linux/if_bonding.h
@@ -96,14 +96,14 @@
 #define BOND_XMIT_POLICY_ENCAP34	4 /* encapsulated layer 3+4 */
 
 /* 802.3ad port state definitions (43.4.2.2 in the 802.3ad standard) */
-#define AD_STATE_LACP_ACTIVITY   0x1
-#define AD_STATE_LACP_TIMEOUT    0x2
-#define AD_STATE_AGGREGATION     0x4
-#define AD_STATE_SYNCHRONIZATION 0x8
-#define AD_STATE_COLLECTING      0x10
-#define AD_STATE_DISTRIBUTING    0x20
-#define AD_STATE_DEFAULTED       0x40
-#define AD_STATE_EXPIRED         0x80
+#define LACP_STATE_LACP_ACTIVITY   0x1
+#define LACP_STATE_LACP_TIMEOUT    0x2
+#define LACP_STATE_AGGREGATION     0x4
+#define LACP_STATE_SYNCHRONIZATION 0x8
+#define LACP_STATE_COLLECTING      0x10
+#define LACP_STATE_DISTRIBUTING    0x20
+#define LACP_STATE_DEFAULTED       0x40
+#define LACP_STATE_EXPIRED         0x80
 
 typedef struct ifbond {
 	__s32 bond_mode;
-- 
cgit v1.2.3


From 2b4a8990b7df55875745a80a609a1ceaaf51f322 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:18 +0100
Subject: ethtool: introduce ethtool netlink interface

Add the basic genetlink and init infrastructure for the netlink interface
and register the genetlink family "ethtool". Add the CONFIG_ETHTOOL_NETLINK
Kconfig option to make the build optional. Add an initial overall interface
description in Documentation/networking/ethtool-netlink.rst; further patches
will add more detailed information.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
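
The "bool" convention described in the Conventions section of the new
ethtool-netlink.rst can be sketched as follows. This is a standalone model
rather than code from the patch: the demo_ names are hypothetical, and the
attribute payload is represented as a plain pointer that is NULL when the
attribute was not present in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Three states an NLA_U8 "bool" attribute can express. */
enum demo_tristate {
	DEMO_NOT_PRESENT = -1,
	DEMO_OFF = 0,
	DEMO_ON = 1,
};

/* payload: NLA_U8 payload of the attribute, or NULL if it was absent. */
static enum demo_tristate demo_get_bool(const uint8_t *payload)
{
	if (!payload)
		return DEMO_NOT_PRESENT;
	/* Any non-zero value is understood as "true" by the recipient. */
	return *payload ? DEMO_ON : DEMO_OFF;
}

/* "set" semantics: an absent attribute leaves the current value alone. */
static void demo_apply_bool(int *value, const uint8_t *payload)
{
	assert(value != NULL);
	if (payload)
		*value = (*payload != 0);
}
```

A sender following the convention would still put the number 1 on the wire
for "true"; only the receiving side is lenient about other non-zero values.
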
 Documentation/networking/ethtool-netlink.rst | 216 +++++++++++++++++++++++++++
 Documentation/networking/index.rst           |   1 +
 include/linux/ethtool_netlink.h              |   9 ++
 include/uapi/linux/ethtool_netlink.h         |  36 +++++
 net/Kconfig                                  |   8 +
 net/ethtool/Makefile                         |   6 +-
 net/ethtool/netlink.c                        |  33 ++++
 net/ethtool/netlink.h                        |  10 ++
 8 files changed, 318 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/networking/ethtool-netlink.rst
 create mode 100644 include/linux/ethtool_netlink.h
 create mode 100644 include/uapi/linux/ethtool_netlink.h
 create mode 100644 net/ethtool/netlink.c
 create mode 100644 net/ethtool/netlink.h

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
new file mode 100644
index 000000000000..9448442ad293
--- /dev/null
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -0,0 +1,216 @@
+=============================
+Netlink interface for ethtool
+=============================
+
+
+Basic information
+=================
+
+The netlink interface for ethtool uses the generic netlink family ``ethtool``
+(userspace applications should use the macros ``ETHTOOL_GENL_NAME`` and
+``ETHTOOL_GENL_VERSION`` defined in the ``<linux/ethtool_netlink.h>`` uapi
+header). This family does not use a specific header; all information in
+requests and replies is passed using netlink attributes.
+
+The ethtool netlink interface uses extended ACK for error and warning
+reporting; userspace application developers are encouraged to make these
+messages available to the user in a suitable way.
+
+Requests can be divided into three categories: "get" (retrieving information),
+"set" (setting parameters) and "action" (invoking an action).
+
+All "set" and "action" type requests require admin privileges
+(``CAP_NET_ADMIN`` in the namespace). Most "get" type requests are allowed for
+anyone but there are exceptions (where the response contains sensitive
+information). In some cases, the request as such is allowed for anyone but
+unprivileged users have attributes with sensitive information (e.g.
+wake-on-lan password) omitted.
+
+
+Conventions
+===========
+
+Attributes which represent a boolean value usually use NLA_U8 type so that we
+can distinguish three states: "on", "off" and "not present" (meaning the
+information is not available in "get" requests or value is not to be changed
+in "set" requests). For these attributes, the "true" value should be passed as
+number 1 but any non-zero value should be understood as "true" by recipient.
+In the tables below, "bool" denotes NLA_U8 attributes interpreted in this way.
+
+In the message structure descriptions below, if an attribute name is suffixed
+with "+", parent nest can contain multiple attributes of the same type. This
+implements an array of entries.
+
+
+Request header
+==============
+
+Each request or reply message contains a nested attribute with a common header.
+The structure of this header is:
+
+  ==============================  ======  =============================
+  ``ETHTOOL_A_HEADER_DEV_INDEX``  u32     device ifindex
+  ``ETHTOOL_A_HEADER_DEV_NAME``   string  device name
+  ``ETHTOOL_A_HEADER_FLAGS``      u32     flags common for all requests
+  ==============================  ======  =============================
+
+``ETHTOOL_A_HEADER_DEV_INDEX`` and ``ETHTOOL_A_HEADER_DEV_NAME`` identify the
+device the message relates to. One of them is sufficient in requests; if both
+are used, they must identify the same device. Some requests, e.g. global string
+sets, do not require device identification. Most ``GET`` requests also allow
+dump requests without device identification to query the same information for
+all devices providing it (each device in a separate message).
+
+``ETHTOOL_A_HEADER_FLAGS`` is a bitmap of request flags common for all request
+types. The interpretation of these flags is the same for all request types but
+a given flag may not apply to every request type. Recognized flags are:
+
+  =================================  ===================================
+  ``ETHTOOL_FLAG_COMPACT_BITSETS``   use compact format bitsets in reply
+  ``ETHTOOL_FLAG_OMIT_REPLY``        omit optional reply (_SET and _ACT)
+  =================================  ===================================
+
+New request flags should follow the general idea that if the flag is not set,
+the behaviour is backward compatible, i.e. requests from old clients not aware
+of the flag should be interpreted the way the client expects. A client must
+not set flags it does not understand.
+
+
+List of message types
+=====================
+
+All constants identifying message types use ``ETHTOOL_CMD_`` prefix and suffix
+according to message purpose:
+
+  ==============    ======================================
+  ``_GET``          userspace request to retrieve data
+  ``_SET``          userspace request to set data
+  ``_ACT``          userspace request to perform an action
+  ``_GET_REPLY``    kernel reply to a ``GET`` request
+  ``_SET_REPLY``    kernel reply to a ``SET`` request
+  ``_ACT_REPLY``    kernel reply to an ``ACT`` request
+  ``_NTF``          kernel notification
+  ==============    ======================================
+
+``GET`` requests are sent by userspace applications to retrieve device
+information. They usually do not contain any message specific attributes. The
+kernel replies with the corresponding ``GET_REPLY`` message. For most types, a
+``GET`` request with ``NLM_F_DUMP`` and no device identification can be used to
+query the information for all devices supporting the request.
+
+If the data can also be modified, a corresponding ``SET`` message with the same
+layout as the corresponding ``GET_REPLY`` is used to request changes. Only
+attributes where a change is requested are included in such a request (also,
+not all attributes may be changed). Replies to most ``SET`` requests consist
+only of an error code and extack; if the kernel provides additional data, it is
+sent in the form of a corresponding ``SET_REPLY`` message, which can be
+suppressed by setting the ``ETHTOOL_FLAG_OMIT_REPLY`` flag in the request header.
+
+Data modification also triggers sending an ``NTF`` message with a notification.
+These usually bear only the subset of attributes which were affected by the
+change. The same notification is issued if the data is modified using other
+means (mostly ioctl ethtool interface). Unlike notifications from ethtool
+netlink code which are only sent if something actually changed, notifications
+triggered by ioctl interface may be sent even if the request did not actually
+change any data.
+
+``ACT`` messages request kernel (driver) to perform a specific action. If some
+information is reported by kernel (which can be suppressed by setting
+``ETHTOOL_FLAG_OMIT_REPLY`` flag in request header), the reply takes form of
+an ``ACT_REPLY`` message. Performing an action also triggers a notification
+(``NTF`` message).
+
+Later sections describe the format and semantics of these messages.
+
+
+Request translation
+===================
+
+The following table maps ioctl commands to netlink commands providing their
+functionality. Entries with "n/a" in right column are commands which do not
+have their netlink replacement yet.
+
+  =================================== =====================================
+  ioctl command                       netlink command
+  =================================== =====================================
+  ``ETHTOOL_GSET``                    n/a
+  ``ETHTOOL_SSET``                    n/a
+  ``ETHTOOL_GDRVINFO``                n/a
+  ``ETHTOOL_GREGS``                   n/a
+  ``ETHTOOL_GWOL``                    n/a
+  ``ETHTOOL_SWOL``                    n/a
+  ``ETHTOOL_GMSGLVL``                 n/a
+  ``ETHTOOL_SMSGLVL``                 n/a
+  ``ETHTOOL_NWAY_RST``                n/a
+  ``ETHTOOL_GLINK``                   n/a
+  ``ETHTOOL_GEEPROM``                 n/a
+  ``ETHTOOL_SEEPROM``                 n/a
+  ``ETHTOOL_GCOALESCE``               n/a
+  ``ETHTOOL_SCOALESCE``               n/a
+  ``ETHTOOL_GRINGPARAM``              n/a
+  ``ETHTOOL_SRINGPARAM``              n/a
+  ``ETHTOOL_GPAUSEPARAM``             n/a
+  ``ETHTOOL_SPAUSEPARAM``             n/a
+  ``ETHTOOL_GRXCSUM``                 n/a
+  ``ETHTOOL_SRXCSUM``                 n/a
+  ``ETHTOOL_GTXCSUM``                 n/a
+  ``ETHTOOL_STXCSUM``                 n/a
+  ``ETHTOOL_GSG``                     n/a
+  ``ETHTOOL_SSG``                     n/a
+  ``ETHTOOL_TEST``                    n/a
+  ``ETHTOOL_GSTRINGS``                n/a
+  ``ETHTOOL_PHYS_ID``                 n/a
+  ``ETHTOOL_GSTATS``                  n/a
+  ``ETHTOOL_GTSO``                    n/a
+  ``ETHTOOL_STSO``                    n/a
+  ``ETHTOOL_GPERMADDR``               rtnetlink ``RTM_GETLINK``
+  ``ETHTOOL_GUFO``                    n/a
+  ``ETHTOOL_SUFO``                    n/a
+  ``ETHTOOL_GGSO``                    n/a
+  ``ETHTOOL_SGSO``                    n/a
+  ``ETHTOOL_GFLAGS``                  n/a
+  ``ETHTOOL_SFLAGS``                  n/a
+  ``ETHTOOL_GPFLAGS``                 n/a
+  ``ETHTOOL_SPFLAGS``                 n/a
+  ``ETHTOOL_GRXFH``                   n/a
+  ``ETHTOOL_SRXFH``                   n/a
+  ``ETHTOOL_GGRO``                    n/a
+  ``ETHTOOL_SGRO``                    n/a
+  ``ETHTOOL_GRXRINGS``                n/a
+  ``ETHTOOL_GRXCLSRLCNT``             n/a
+  ``ETHTOOL_GRXCLSRULE``              n/a
+  ``ETHTOOL_GRXCLSRLALL``             n/a
+  ``ETHTOOL_SRXCLSRLDEL``             n/a
+  ``ETHTOOL_SRXCLSRLINS``             n/a
+  ``ETHTOOL_FLASHDEV``                n/a
+  ``ETHTOOL_RESET``                   n/a
+  ``ETHTOOL_SRXNTUPLE``               n/a
+  ``ETHTOOL_GRXNTUPLE``               n/a
+  ``ETHTOOL_GSSET_INFO``              n/a
+  ``ETHTOOL_GRXFHINDIR``              n/a
+  ``ETHTOOL_SRXFHINDIR``              n/a
+  ``ETHTOOL_GFEATURES``               n/a
+  ``ETHTOOL_SFEATURES``               n/a
+  ``ETHTOOL_GCHANNELS``               n/a
+  ``ETHTOOL_SCHANNELS``               n/a
+  ``ETHTOOL_SET_DUMP``                n/a
+  ``ETHTOOL_GET_DUMP_FLAG``           n/a
+  ``ETHTOOL_GET_DUMP_DATA``           n/a
+  ``ETHTOOL_GET_TS_INFO``             n/a
+  ``ETHTOOL_GMODULEINFO``             n/a
+  ``ETHTOOL_GMODULEEEPROM``           n/a
+  ``ETHTOOL_GEEE``                    n/a
+  ``ETHTOOL_SEEE``                    n/a
+  ``ETHTOOL_GRSSH``                   n/a
+  ``ETHTOOL_SRSSH``                   n/a
+  ``ETHTOOL_GTUNABLE``                n/a
+  ``ETHTOOL_STUNABLE``                n/a
+  ``ETHTOOL_GPHYSTATS``               n/a
+  ``ETHTOOL_PERQUEUE``                n/a
+  ``ETHTOOL_GLINKSETTINGS``           n/a
+  ``ETHTOOL_SLINKSETTINGS``           n/a
+  ``ETHTOOL_PHY_GTUNABLE``            n/a
+  ``ETHTOOL_PHY_STUNABLE``            n/a
+  ``ETHTOOL_GFECPARAM``               n/a
+  ``ETHTOOL_SFECPARAM``               n/a
+  =================================== =====================================
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 5acab1290e03..bee73be7af93 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -16,6 +16,7 @@ Contents:
    devlink-info-versions
    devlink-trap
    devlink-trap-netdevsim
+   ethtool-netlink
    ieee802154
    j1939
    kapi
diff --git a/include/linux/ethtool_netlink.h b/include/linux/ethtool_netlink.h
new file mode 100644
index 000000000000..f27e92b5f344
--- /dev/null
+++ b/include/linux/ethtool_netlink.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _LINUX_ETHTOOL_NETLINK_H_
+#define _LINUX_ETHTOOL_NETLINK_H_
+
+#include <uapi/linux/ethtool_netlink.h>
+#include <linux/ethtool.h>
+
+#endif /* _LINUX_ETHTOOL_NETLINK_H_ */
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
new file mode 100644
index 000000000000..3c93276ba066
--- /dev/null
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-only WITH Linux-syscall-note */
+/*
+ * include/uapi/linux/ethtool_netlink.h - netlink interface for ethtool
+ *
+ * See Documentation/networking/ethtool-netlink.rst in kernel source tree for
+ * documentation of the interface.
+ */
+
+#ifndef _UAPI_LINUX_ETHTOOL_NETLINK_H_
+#define _UAPI_LINUX_ETHTOOL_NETLINK_H_
+
+#include <linux/ethtool.h>
+
+/* message types - userspace to kernel */
+enum {
+	ETHTOOL_MSG_USER_NONE,
+
+	/* add new constants above here */
+	__ETHTOOL_MSG_USER_CNT,
+	ETHTOOL_MSG_USER_MAX = __ETHTOOL_MSG_USER_CNT - 1
+};
+
+/* message types - kernel to userspace */
+enum {
+	ETHTOOL_MSG_KERNEL_NONE,
+
+	/* add new constants above here */
+	__ETHTOOL_MSG_KERNEL_CNT,
+	ETHTOOL_MSG_KERNEL_MAX = __ETHTOOL_MSG_KERNEL_CNT - 1
+};
+
+/* generic netlink info */
+#define ETHTOOL_GENL_NAME "ethtool"
+#define ETHTOOL_GENL_VERSION 1
+
+#endif /* _UAPI_LINUX_ETHTOOL_NETLINK_H_ */
diff --git a/net/Kconfig b/net/Kconfig
index 52af65e5d28c..54916b7adb9b 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -449,6 +449,14 @@ config FAILOVER
 	  migration of VMs with direct attached VFs by failing over to the
 	  paravirtual datapath when the VF is unplugged.
 
+config ETHTOOL_NETLINK
+	bool "Netlink interface for ethtool"
+	default y
+	help
+	  An alternative userspace interface for ethtool based on generic
+	  netlink. It provides better extensibility and some new features,
+	  e.g. notification messages.
+
 endif   # if NET
 
 # Used by archs to tell that they support BPF JIT compiler plus which flavour.
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index f68387618973..59d5ee230c29 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -1,3 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
-obj-y		+= ioctl.o common.o
+obj-y				+= ioctl.o common.o
+
+obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
+
+ethtool_nl-y	:= netlink.o
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
new file mode 100644
index 000000000000..59e1ebde2f15
--- /dev/null
+++ b/net/ethtool/netlink.c
@@ -0,0 +1,33 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/ethtool_netlink.h>
+#include "netlink.h"
+
+/* genetlink setup */
+
+static const struct genl_ops ethtool_genl_ops[] = {
+};
+
+static struct genl_family ethtool_genl_family = {
+	.name		= ETHTOOL_GENL_NAME,
+	.version	= ETHTOOL_GENL_VERSION,
+	.netnsok	= true,
+	.parallel_ops	= true,
+	.ops		= ethtool_genl_ops,
+	.n_ops		= ARRAY_SIZE(ethtool_genl_ops),
+};
+
+/* module setup */
+
+static int __init ethnl_init(void)
+{
+	int ret;
+
+	ret = genl_register_family(&ethtool_genl_family);
+	if (WARN(ret < 0, "ethtool: genetlink family registration failed"))
+		return ret;
+
+	return 0;
+}
+
+subsys_initcall(ethnl_init);
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
new file mode 100644
index 000000000000..e4220780d368
--- /dev/null
+++ b/net/ethtool/netlink.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _NET_ETHTOOL_NETLINK_H
+#define _NET_ETHTOOL_NETLINK_H
+
+#include <linux/ethtool_netlink.h>
+#include <linux/netdevice.h>
+#include <net/genetlink.h>
+
+#endif /* _NET_ETHTOOL_NETLINK_H */
-- 
cgit v1.2.3


From 041b1c5d4a53e97fc9e029ae32469552ca12cb9b Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:23 +0100
Subject: ethtool: helper functions for netlink interface

Add common request/reply header definition and helpers to parse request
header and fill reply header. Provide ethnl_update_* helpers to update
structure members from request attributes (to be used for *_SET requests).
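
The update-with-modification-tracking pattern used by the ethnl_update_*
helpers can be sketched in plain userspace C as follows. This is only an
illustration: struct bitfield32 and update_bitfield32() are hypothetical
stand-ins for struct nla_bitfield32 and ethnl_update_bitfield32(), and a
NULL pointer plays the role of an absent attribute.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct nla_bitfield32. */
struct bitfield32 {
	uint32_t value;		/* new values of the selected bits */
	uint32_t selector;	/* which bits the request wants to touch */
};

/* Update *dst from an optional change request. *mod is set to true only
 * if the stored value really changed; it is never reset to false, so one
 * flag can accumulate the result of several updates.
 */
static void update_bitfield32(uint32_t *dst, const struct bitfield32 *change,
			      bool *mod)
{
	uint32_t newval;

	assert(dst && mod);
	if (!change)		/* attribute absent: nothing was requested */
		return;
	newval = (*dst & ~change->selector) |
		 (change->value & change->selector);
	if (*dst == newval)	/* request would not change anything */
		return;

	*dst = newval;
	*mod = true;
}
```

A caller would typically run several such updates against one "mod" flag
and only apply the new settings (and send a notification) if the flag
ended up true.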

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/ethtool_netlink.h |  21 ++++
 net/ethtool/netlink.c                | 166 ++++++++++++++++++++++++++++
 net/ethtool/netlink.h                | 205 +++++++++++++++++++++++++++++++++++
 3 files changed, 392 insertions(+)

diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 3c93276ba066..82fc3b5f41c9 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -29,6 +29,27 @@ enum {
 	ETHTOOL_MSG_KERNEL_MAX = __ETHTOOL_MSG_KERNEL_CNT - 1
 };
 
+/* request header */
+
+/* use compact bitsets in reply */
+#define ETHTOOL_FLAG_COMPACT_BITSETS	(1 << 0)
+/* omit optional reply for SET or ACT requests */
+#define ETHTOOL_FLAG_OMIT_REPLY	(1 << 1)
+
+#define ETHTOOL_FLAG_ALL (ETHTOOL_FLAG_COMPACT_BITSETS | \
+			  ETHTOOL_FLAG_OMIT_REPLY)
+
+enum {
+	ETHTOOL_A_HEADER_UNSPEC,
+	ETHTOOL_A_HEADER_DEV_INDEX,		/* u32 */
+	ETHTOOL_A_HEADER_DEV_NAME,		/* string */
+	ETHTOOL_A_HEADER_FLAGS,			/* u32 - ETHTOOL_FLAG_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_HEADER_CNT,
+	ETHTOOL_A_HEADER_MAX = __ETHTOOL_A_HEADER_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 59e1ebde2f15..aef882e0c3f5 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -1,8 +1,174 @@
 // SPDX-License-Identifier: GPL-2.0-only
 
+#include <net/sock.h>
 #include <linux/ethtool_netlink.h>
 #include "netlink.h"
 
+static struct genl_family ethtool_genl_family;
+
+static const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_MAX + 1] = {
+	[ETHTOOL_A_HEADER_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_HEADER_DEV_INDEX]	= { .type = NLA_U32 },
+	[ETHTOOL_A_HEADER_DEV_NAME]	= { .type = NLA_NUL_STRING,
+					    .len = ALTIFNAMSIZ - 1 },
+	[ETHTOOL_A_HEADER_FLAGS]	= { .type = NLA_U32 },
+};
+
+/**
+ * ethnl_parse_header() - parse request header
+ * @req_info:    structure to put results into
+ * @header:      nest attribute with request header
+ * @net:         request netns
+ * @extack:      netlink extack for error reporting
+ * @require_dev: fail if no device identified in header
+ *
+ * Parse request header in nested attribute @header and put results into
+ * the structure pointed to by @req_info. @extack is used for error
+ * reporting. If req_info->dev is not null on return, reference to it has
+ * been taken. If error is returned, *req_info is left unmodified and no
+ * reference is held.
+ *
+ * Return: 0 on success or negative error code
+ */
+int ethnl_parse_header(struct ethnl_req_info *req_info,
+		       const struct nlattr *header, struct net *net,
+		       struct netlink_ext_ack *extack, bool require_dev)
+{
+	struct nlattr *tb[ETHTOOL_A_HEADER_MAX + 1];
+	const struct nlattr *devname_attr;
+	struct net_device *dev = NULL;
+	int ret;
+
+	if (!header) {
+		NL_SET_ERR_MSG(extack, "request header missing");
+		return -EINVAL;
+	}
+	ret = nla_parse_nested(tb, ETHTOOL_A_HEADER_MAX, header,
+			       ethnl_header_policy, extack);
+	if (ret < 0)
+		return ret;
+	devname_attr = tb[ETHTOOL_A_HEADER_DEV_NAME];
+
+	if (tb[ETHTOOL_A_HEADER_DEV_INDEX]) {
+		u32 ifindex = nla_get_u32(tb[ETHTOOL_A_HEADER_DEV_INDEX]);
+
+		dev = dev_get_by_index(net, ifindex);
+		if (!dev) {
+			NL_SET_ERR_MSG_ATTR(extack,
+					    tb[ETHTOOL_A_HEADER_DEV_INDEX],
+					    "no device matches ifindex");
+			return -ENODEV;
+		}
+		/* if both ifindex and ifname are passed, they must match */
+		if (devname_attr &&
+		    strncmp(dev->name, nla_data(devname_attr), IFNAMSIZ)) {
+			dev_put(dev);
+			NL_SET_ERR_MSG_ATTR(extack, header,
+					    "ifindex and name do not match");
+			return -ENODEV;
+		}
+	} else if (devname_attr) {
+		dev = dev_get_by_name(net, nla_data(devname_attr));
+		if (!dev) {
+			NL_SET_ERR_MSG_ATTR(extack, devname_attr,
+					    "no device matches name");
+			return -ENODEV;
+		}
+	} else if (require_dev) {
+		NL_SET_ERR_MSG_ATTR(extack, header,
+				    "neither ifindex nor name specified");
+		return -EINVAL;
+	}
+
+	if (dev && !netif_device_present(dev)) {
+		dev_put(dev);
+		NL_SET_ERR_MSG(extack, "device not present");
+		return -ENODEV;
+	}
+
+	req_info->dev = dev;
+	if (tb[ETHTOOL_A_HEADER_FLAGS])
+		req_info->flags = nla_get_u32(tb[ETHTOOL_A_HEADER_FLAGS]);
+
+	return 0;
+}
+
+/**
+ * ethnl_fill_reply_header() - Put common header into a reply message
+ * @skb:      skb with the message
+ * @dev:      network device to describe in header
+ * @attrtype: attribute type to use for the nest
+ *
+ * Create a nested attribute with attributes describing given network device.
+ *
+ * Return: 0 on success, error value (-EMSGSIZE only) on error
+ */
+int ethnl_fill_reply_header(struct sk_buff *skb, struct net_device *dev,
+			    u16 attrtype)
+{
+	struct nlattr *nest;
+
+	if (!dev)
+		return 0;
+	nest = nla_nest_start(skb, attrtype);
+	if (!nest)
+		return -EMSGSIZE;
+
+	if (nla_put_u32(skb, ETHTOOL_A_HEADER_DEV_INDEX, (u32)dev->ifindex) ||
+	    nla_put_string(skb, ETHTOOL_A_HEADER_DEV_NAME, dev->name))
+		goto nla_put_failure;
+	/* If more attributes are put into reply header,
+	 * ethnl_reply_header_size() must be updated to account for them.
+	 */
+
+	nla_nest_end(skb, nest);
+	return 0;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -EMSGSIZE;
+}
+
+/**
+ * ethnl_reply_init() - Create skb for a reply and fill device identification
+ * @payload: payload length (without netlink and genetlink header)
+ * @dev:     device the reply is about (may be null)
+ * @cmd:     ETHTOOL_MSG_* message type for reply
+ * @info:    genetlink info of the received packet we respond to
+ * @ehdrp:   place to store payload pointer returned by genlmsg_put_reply()
+ *
+ * Return: pointer to allocated skb on success, NULL on error
+ */
+struct sk_buff *ethnl_reply_init(size_t payload, struct net_device *dev, u8 cmd,
+				 u16 hdr_attrtype, struct genl_info *info,
+				 void **ehdrp)
+{
+	struct sk_buff *skb;
+
+	skb = genlmsg_new(payload, GFP_KERNEL);
+	if (!skb)
+		goto err;
+	*ehdrp = genlmsg_put_reply(skb, info, &ethtool_genl_family, 0, cmd);
+	if (!*ehdrp)
+		goto err_free;
+
+	if (dev) {
+		int ret;
+
+		ret = ethnl_fill_reply_header(skb, dev, hdr_attrtype);
+		if (ret < 0)
+			goto err_free;
+	}
+	return skb;
+
+err_free:
+	nlmsg_free(skb);
+err:
+	if (info)
+		GENL_SET_ERR_MSG(info, "failed to setup reply message");
+	return NULL;
+}
+
 /* genetlink setup */
 
 static const struct genl_ops ethtool_genl_ops[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index e4220780d368..05d7183da894 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -6,5 +6,210 @@
 #include <linux/ethtool_netlink.h>
 #include <linux/netdevice.h>
 #include <net/genetlink.h>
+#include <net/sock.h>
+
+struct ethnl_req_info;
+
+int ethnl_parse_header(struct ethnl_req_info *req_info,
+		       const struct nlattr *header, struct net *net,
+		       struct netlink_ext_ack *extack, bool require_dev);
+int ethnl_fill_reply_header(struct sk_buff *skb, struct net_device *dev,
+			    u16 attrtype);
+struct sk_buff *ethnl_reply_init(size_t payload, struct net_device *dev, u8 cmd,
+				 u16 hdr_attrtype, struct genl_info *info,
+				 void **ehdrp);
+
+/**
+ * ethnl_strz_size() - calculate attribute length for fixed size string
+ * @s: ETH_GSTRING_LEN sized string (may not be null terminated)
+ *
+ * Return: total length of an attribute with null terminated string from @s
+ */
+static inline int ethnl_strz_size(const char *s)
+{
+	return nla_total_size(strnlen(s, ETH_GSTRING_LEN) + 1);
+}
+
+/**
+ * ethnl_put_strz() - put string attribute with fixed size string
+ * @skb:      skb with the message
+ * @attrtype: attribute type
+ * @s:        ETH_GSTRING_LEN sized string (may not be null terminated)
+ *
+ * Puts an attribute with null terminated string from @s into the message.
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+static inline int ethnl_put_strz(struct sk_buff *skb, u16 attrtype,
+				 const char *s)
+{
+	unsigned int len = strnlen(s, ETH_GSTRING_LEN);
+	struct nlattr *attr;
+
+	attr = nla_reserve(skb, attrtype, len + 1);
+	if (!attr)
+		return -EMSGSIZE;
+
+	memcpy(nla_data(attr), s, len);
+	((char *)nla_data(attr))[len] = '\0';
+	return 0;
+}
+
+/**
+ * ethnl_update_u32() - update u32 value from NLA_U32 attribute
+ * @dst:  value to update
+ * @attr: netlink attribute with new value or null
+ * @mod:  pointer to bool for modification tracking
+ *
+ * Copy the u32 value from NLA_U32 netlink attribute @attr into variable
+ * pointed to by @dst; do nothing if @attr is null. Bool pointed to by @mod
+ * is set to true if this function changed the value of *dst, otherwise it
+ * is left as is.
+ */
+static inline void ethnl_update_u32(u32 *dst, const struct nlattr *attr,
+				    bool *mod)
+{
+	u32 val;
+
+	if (!attr)
+		return;
+	val = nla_get_u32(attr);
+	if (*dst == val)
+		return;
+
+	*dst = val;
+	*mod = true;
+}
+
+/**
+ * ethnl_update_u8() - update u8 value from NLA_U8 attribute
+ * @dst:  value to update
+ * @attr: netlink attribute with new value or null
+ * @mod:  pointer to bool for modification tracking
+ *
+ * Copy the u8 value from NLA_U8 netlink attribute @attr into variable
+ * pointed to by @dst; do nothing if @attr is null. Bool pointed to by @mod
+ * is set to true if this function changed the value of *dst, otherwise it
+ * is left as is.
+ */
+static inline void ethnl_update_u8(u8 *dst, const struct nlattr *attr,
+				   bool *mod)
+{
+	u8 val;
+
+	if (!attr)
+		return;
+	val = nla_get_u8(attr);
+	if (*dst == val)
+		return;
+
+	*dst = val;
+	*mod = true;
+}
+
+/**
+ * ethnl_update_bool32() - update u32 used as bool from NLA_U8 attribute
+ * @dst:  value to update
+ * @attr: netlink attribute with new value or null
+ * @mod:  pointer to bool for modification tracking
+ *
+ * Use the u8 value from NLA_U8 netlink attribute @attr to set u32 variable
+ * pointed to by @dst to 0 (if zero) or 1 (if not); do nothing if @attr is
+ * null. Bool pointed to by @mod is set to true if this function changed the
+ * logical value of *dst, otherwise it is left as is.
+ */
+static inline void ethnl_update_bool32(u32 *dst, const struct nlattr *attr,
+				       bool *mod)
+{
+	u8 val;
+
+	if (!attr)
+		return;
+	val = !!nla_get_u8(attr);
+	if (!!*dst == val)
+		return;
+
+	*dst = val;
+	*mod = true;
+}
+
+/**
+ * ethnl_update_binary() - update binary data from NLA_BINARY attribute
+ * @dst:  value to update
+ * @len:  destination buffer length
+ * @attr: netlink attribute with new value or null
+ * @mod:  pointer to bool for modification tracking
+ *
+ * Use the payload of NLA_BINARY netlink attribute @attr to rewrite data block
+ * of length @len at @dst (at most the payload length is copied); do nothing
+ * if @attr is null. Bool pointed to by @mod is set to true if this function
+ * changed the data at @dst, otherwise it is left as is.
+ */
+static inline void ethnl_update_binary(void *dst, unsigned int len,
+				       const struct nlattr *attr, bool *mod)
+{
+	if (!attr)
+		return;
+	if (nla_len(attr) < len)
+		len = nla_len(attr);
+	if (!memcmp(dst, nla_data(attr), len))
+		return;
+
+	memcpy(dst, nla_data(attr), len);
+	*mod = true;
+}
+
+/**
+ * ethnl_update_bitfield32() - update u32 value from NLA_BITFIELD32 attribute
+ * @dst:  value to update
+ * @attr: netlink attribute with new value or null
+ * @mod:  pointer to bool for modification tracking
+ *
+ * Update bits in u32 value which are set in attribute's mask to values from
+ * attribute's value. Do nothing if @attr is null or the value wouldn't change;
+ * otherwise, set bool pointed to by @mod to true.
+ */
+static inline void ethnl_update_bitfield32(u32 *dst, const struct nlattr *attr,
+					   bool *mod)
+{
+	struct nla_bitfield32 change;
+	u32 newval;
+
+	if (!attr)
+		return;
+	change = nla_get_bitfield32(attr);
+	newval = (*dst & ~change.selector) | (change.value & change.selector);
+	if (*dst == newval)
+		return;
+
+	*dst = newval;
+	*mod = true;
+}
+
+/**
+ * ethnl_reply_header_size() - total size of reply header
+ *
+ * This is an upper estimate so that we do not need to hold RTNL lock longer
+ * than necessary (to prevent rename between size estimate and composing the
+ * message). Accounts only for device ifindex and name as those are the only
+ * attributes ethnl_fill_reply_header() puts into the reply header.
+ */
+static inline unsigned int ethnl_reply_header_size(void)
+{
+	return nla_total_size(nla_total_size(sizeof(u32)) +
+			      nla_total_size(IFNAMSIZ));
+}
+
+/**
+ * struct ethnl_req_info - base type of request information for GET requests
+ * @dev:   network device the request is for (may be null)
+ * @flags: request flags common for all request types
+ *
+ * This is a common base, additional members may follow after this structure.
+ */
+struct ethnl_req_info {
+	struct net_device	*dev;
+	u32			flags;
+};
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
-- 
cgit v1.2.3


From 10b518d4e6dd5390e40f7d8de0f08753c1195a7e Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:28 +0100
Subject: ethtool: netlink bitset handling

The ethtool netlink code uses a common framework for passing arbitrary
length bit sets to allow future extensions. A bitset can be a list (only
one bitmap) or can consist of a value and mask pair (used e.g. when a
client wants to modify only some bits). A bitset can use one of two
formats: verbose (bit by bit) or compact.

Verbose format consists of bitset size (number of bits), list flag and
an array of bit nests, telling which bits are part of the list or which
bits are in the mask and which of them are to be set. In requests, bits
can be identified by index (position) or by name. In replies, kernel
provides both index and name. Verbose format is suitable for "one shot"
applications like standard ethtool command as it avoids the need to
either keep bit names (e.g. link modes) in sync with kernel or having to
add an extra roundtrip for string set request (e.g. for private flags).

Compact format uses one (list) or two (value/mask) arrays of 32-bit
words to store the bitmap(s). It is more suitable for long running
applications (ethtool in monitor mode or network management daemons)
which can retrieve the names once and then pass only compact bitmaps to
save space.
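
As a rough userspace illustration of the compact value/mask semantics (not
the code added by this patch), the sketch below computes how many 32-bit
words a bitset of a given size needs and applies a value/mask pair to a u32
bitmap. The names bitset_words() and bitset_apply() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 32-bit words needed to hold nbits bits (rounded up). */
static unsigned int bitset_words(unsigned int nbits)
{
	return (nbits + 31) / 32;
}

/* Apply a value/mask pair to a u32 bitmap: bits selected by the mask take
 * their value from the value bitmap, all other bits are preserved. Only the
 * first nbits bits are treated as significant; mask bits above that in the
 * last word are ignored.
 */
static void bitset_apply(uint32_t *dst, const uint32_t *value,
			 const uint32_t *mask, unsigned int nbits)
{
	unsigned int nwords = bitset_words(nbits);
	unsigned int i;

	assert(dst && value && mask);
	for (i = 0; i < nwords; i++) {
		uint32_t m = mask[i];

		/* trim the mask in a partially used last word */
		if (i == nwords - 1 && nbits % 32)
			m &= ~(uint32_t)0 >> (32 - nbits % 32);
		dst[i] = (dst[i] & ~m) | (value[i] & m);
	}
}
```

For a list-style bitset (no mask) the same layout applies, with the single
bitmap simply replacing the destination.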

Userspace requests can use either format; ETHTOOL_FLAG_COMPACT_BITSETS
flag in request header tells kernel which format to use in reply.
Notifications always use compact format.

As some code uses arrays of unsigned long for internal representation and
some arrays of u32 (or even a single u32), two sets of parse/compose
helpers are introduced. To avoid code duplication, helpers for unsigned
long arrays are implemented as wrappers around helpers for u32 arrays.
There are two reasons for this choice: (1) u32 arrays are more frequent in
ethtool code and (2) an unsigned long array can always be interpreted as a
u32 array on little endian 64-bit and all 32-bit architectures while we
would need special handling for odd number of u32 words in the opposite
direction.
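
To illustrate point (2), here is a small userspace sketch of how 32-bit
word number idx of an unsigned long bitmap can be obtained in a way that
does not depend on endianness or on the size of unsigned long. It is an
assumption-laden example, not the implementation in bitset.c; the helpers
ulong_bitmap_set() and ulong_bitmap_word32() are hypothetical. On little
endian 64-bit and on 32-bit architectures the result matches what a plain
cast of the array to u32 * would give; on big endian 64-bit the two halves
of each long would be swapped by such a cast, which is why shifts are used
here.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define BITS_PER_ULONG	(sizeof(unsigned long) * CHAR_BIT)

/* Set bit nr in an unsigned long based bitmap. */
static void ulong_bitmap_set(unsigned long *bitmap, unsigned int nr)
{
	assert(bitmap);
	bitmap[nr / BITS_PER_ULONG] |= 1UL << (nr % BITS_PER_ULONG);
}

/* Return bits 32 * idx .. 32 * idx + 31 of an unsigned long based bitmap
 * as one u32 word, using shifts rather than a pointer cast.
 */
static uint32_t ulong_bitmap_word32(const unsigned long *bitmap,
				    unsigned int idx)
{
	unsigned int bit = idx * 32;

	assert(bitmap);
	return (uint32_t)(bitmap[bit / BITS_PER_ULONG] >>
			  (bit % BITS_PER_ULONG));
}
```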

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst |  84 +++
 include/uapi/linux/ethtool_netlink.h         |  35 ++
 net/ethtool/Makefile                         |   2 +-
 net/ethtool/bitset.c                         | 735 +++++++++++++++++++++++++++
 net/ethtool/bitset.h                         |  28 +
 5 files changed, 883 insertions(+), 1 deletion(-)
 create mode 100644 net/ethtool/bitset.c
 create mode 100644 net/ethtool/bitset.h

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 9448442ad293..7797f1237472 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -76,6 +76,90 @@ of the flag should be interpreted the way the client expects. A client must
 not set flags it does not understand.
 
 
+Bit sets
+========
+
+For short bitmaps of (reasonably) fixed length, standard ``NLA_BITFIELD32``
+type is used. For arbitrary length bitmaps, ethtool netlink uses a nested
+attribute with contents of one of two forms: compact (two binary bitmaps
+representing bit values and mask of affected bits) and bit-by-bit (list of
+bits identified by either index or name).
+
+Verbose (bit-by-bit) bitsets allow sending symbolic names for bits together
+with their values which saves a round trip (when the bitset is passed in a
+request) or at least a second request (when the bitset is in a reply). This is
+useful for one shot applications like traditional ethtool command. On the
+other hand, long running applications like ethtool monitor (displaying
+notifications) or network management daemons may prefer fetching the names
+only once and using compact form to save message size. Notifications from
+ethtool netlink interface always use compact form for bitsets.
+
+A bitset can represent either a value/mask pair (``ETHTOOL_A_BITSET_NOMASK``
+not set) or a single bitmap (``ETHTOOL_A_BITSET_NOMASK`` set). In requests
+modifying a bitmap, the former changes the bits set in mask to values set in
+value and preserves the rest; the latter sets the bits set in the bitmap and
+clears the rest.
+
+Compact form: nested (bitset) attribute contents:
+
+  ============================  ======  ============================
+  ``ETHTOOL_A_BITSET_NOMASK``   flag    no mask, only a list
+  ``ETHTOOL_A_BITSET_SIZE``     u32     number of significant bits
+  ``ETHTOOL_A_BITSET_VALUE``    binary  bitmap of bit values
+  ``ETHTOOL_A_BITSET_MASK``     binary  bitmap of valid bits
+  ============================  ======  ============================
+
+Value and mask must have length at least ``ETHTOOL_A_BITSET_SIZE`` bits
+rounded up to a multiple of 32 bits. They consist of 32-bit words in host byte
+order, words ordered from least significant to most significant (i.e. the same
+way as bitmaps are passed with ioctl interface).
+
+For compact form, ``ETHTOOL_A_BITSET_SIZE`` and ``ETHTOOL_A_BITSET_VALUE`` are
+mandatory. ``ETHTOOL_A_BITSET_MASK`` attribute is mandatory if
+``ETHTOOL_A_BITSET_NOMASK`` is not set (bitset represents a value/mask pair);
+if ``ETHTOOL_A_BITSET_NOMASK`` is set, ``ETHTOOL_A_BITSET_MASK`` is not
+allowed (bitset represents a single bitmap).
+
+Kernel bit set length may differ from userspace length if older application is
+used on newer kernel or vice versa. If userspace bitmap is longer, an error is
+issued only if the request actually tries to set values of some bits not
+recognized by kernel.
+
+Bit-by-bit form: nested (bitset) attribute contents:
+
+ +------------------------------------+--------+-----------------------------+
+ | ``ETHTOOL_A_BITSET_NOMASK``        | flag   | no mask, only a list        |
+ +------------------------------------+--------+-----------------------------+
+ | ``ETHTOOL_A_BITSET_SIZE``          | u32    | number of significant bits  |
+ +------------------------------------+--------+-----------------------------+
+ | ``ETHTOOL_A_BITSET_BITS``          | nested | array of bits               |
+ +-+----------------------------------+--------+-----------------------------+
+ | | ``ETHTOOL_A_BITSET_BITS_BIT+``   | nested | one bit                     |
+ +-+-+--------------------------------+--------+-----------------------------+
+ | | | ``ETHTOOL_A_BITSET_BIT_INDEX`` | u32    | bit index (0 for LSB)       |
+ +-+-+--------------------------------+--------+-----------------------------+
+ | | | ``ETHTOOL_A_BITSET_BIT_NAME``  | string | bit name                    |
+ +-+-+--------------------------------+--------+-----------------------------+
+ | | | ``ETHTOOL_A_BITSET_BIT_VALUE`` | flag   | present if bit is set       |
+ +-+-+--------------------------------+--------+-----------------------------+
+
+Bit size is optional for bit-by-bit form. ``ETHTOOL_A_BITSET_BITS`` nest can
+only contain ``ETHTOOL_A_BITSET_BITS_BIT`` attributes but there can be an
+arbitrary number of them.  A bit may be identified by its index or by its
+name. When used in requests, listed bits are set to 0 or 1 according to
+``ETHTOOL_A_BITSET_BIT_VALUE``, the rest is preserved. A request fails if
+index exceeds kernel bit length or if name is not recognized.
+
+When ``ETHTOOL_A_BITSET_NOMASK`` flag is present, bitset is interpreted as
+a simple bitmap. ``ETHTOOL_A_BITSET_BIT_VALUE`` attributes are not used in
+such case. Such bitset represents a bitmap with listed bits set and the rest
+zero.
+
+In requests, application can use either form. Form used by kernel in reply is
+determined by ``ETHTOOL_FLAG_COMPACT_BITSETS`` flag in flags field of request
+header. Semantics of value and mask depend on the attribute.
+
+
 List of message types
 =====================
 
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 82fc3b5f41c9..951203049615 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -50,6 +50,41 @@ enum {
 	ETHTOOL_A_HEADER_MAX = __ETHTOOL_A_HEADER_CNT - 1
 };
 
+/* bit sets */
+
+enum {
+	ETHTOOL_A_BITSET_BIT_UNSPEC,
+	ETHTOOL_A_BITSET_BIT_INDEX,		/* u32 */
+	ETHTOOL_A_BITSET_BIT_NAME,		/* string */
+	ETHTOOL_A_BITSET_BIT_VALUE,		/* flag */
+
+	/* add new constants above here */
+	__ETHTOOL_A_BITSET_BIT_CNT,
+	ETHTOOL_A_BITSET_BIT_MAX = __ETHTOOL_A_BITSET_BIT_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_BITSET_BITS_UNSPEC,
+	ETHTOOL_A_BITSET_BITS_BIT,		/* nest - _A_BITSET_BIT_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_BITSET_BITS_CNT,
+	ETHTOOL_A_BITSET_BITS_MAX = __ETHTOOL_A_BITSET_BITS_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_BITSET_UNSPEC,
+	ETHTOOL_A_BITSET_NOMASK,		/* flag */
+	ETHTOOL_A_BITSET_SIZE,			/* u32 */
+	ETHTOOL_A_BITSET_BITS,			/* nest - _A_BITSET_BITS_* */
+	ETHTOOL_A_BITSET_VALUE,			/* binary */
+	ETHTOOL_A_BITSET_MASK,			/* binary */
+
+	/* add new constants above here */
+	__ETHTOOL_A_BITSET_CNT,
+	ETHTOOL_A_BITSET_MAX = __ETHTOOL_A_BITSET_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 59d5ee230c29..a7e6c2c85db9 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -4,4 +4,4 @@ obj-y				+= ioctl.o common.o
 
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
-ethtool_nl-y	:= netlink.o
+ethtool_nl-y	:= netlink.o bitset.o
diff --git a/net/ethtool/bitset.c b/net/ethtool/bitset.c
new file mode 100644
index 000000000000..fce45dac4205
--- /dev/null
+++ b/net/ethtool/bitset.c
@@ -0,0 +1,735 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/ethtool_netlink.h>
+#include <linux/bitmap.h>
+#include "netlink.h"
+#include "bitset.h"
+
+/* Some bitmaps are internally represented as an array of unsigned long, some
+ * as an array of u32 (some even as single u32 for now). To avoid the need of
+ * wrappers on caller side, we provide two sets of functions: those with "32"
+ * suffix in their names expect u32 based bitmaps, those without it expect
+ * unsigned long bitmaps.
+ */
+
+static u32 ethnl_lower_bits(unsigned int n)
+{
+	return ~(u32)0 >> (32 - n % 32);
+}
+
+static u32 ethnl_upper_bits(unsigned int n)
+{
+	return ~(u32)0 << (n % 32);
+}
+
+/**
+ * ethnl_bitmap32_clear() - Clear u32 based bitmap
+ * @dst:   bitmap to clear
+ * @start: beginning of the interval
+ * @end:   end of the interval
+ * @mod:   set if bitmap was modified
+ *
+ * Clear bits of a bitmap with indices @start <= i < @end
+ */
+static void ethnl_bitmap32_clear(u32 *dst, unsigned int start, unsigned int end,
+				 bool *mod)
+{
+	unsigned int start_word = start / 32;
+	unsigned int end_word = end / 32;
+	unsigned int i;
+	u32 mask;
+
+	if (end <= start)
+		return;
+
+	if (start % 32) {
+		mask = ethnl_upper_bits(start);
+		if (end_word == start_word) {
+			mask &= ethnl_lower_bits(end);
+			if (dst[start_word] & mask) {
+				dst[start_word] &= ~mask;
+				*mod = true;
+			}
+			return;
+		}
+		if (dst[start_word] & mask) {
+			dst[start_word] &= ~mask;
+			*mod = true;
+		}
+		start_word++;
+	}
+
+	for (i = start_word; i < end_word; i++) {
+		if (dst[i]) {
+			dst[i] = 0;
+			*mod = true;
+		}
+	}
+	if (end % 32) {
+		mask = ethnl_lower_bits(end);
+		if (dst[end_word] & mask) {
+			dst[end_word] &= ~mask;
+			*mod = true;
+		}
+	}
+}
+
+/**
+ * ethnl_bitmap32_not_zero() - Check if any bit is set in an interval
+ * @map:   bitmap to test
+ * @start: beginning of the interval
+ * @end:   end of the interval
+ *
+ * Return: true if there is a non-zero bit with index @start <= i < @end,
+ *         false if the whole interval is zero
+ */
+static bool ethnl_bitmap32_not_zero(const u32 *map, unsigned int start,
+				    unsigned int end)
+{
+	unsigned int start_word = start / 32;
+	unsigned int end_word = end / 32;
+	u32 mask;
+
+	if (end <= start)
+		return false;
+
+	if (start % 32) {
+		mask = ethnl_upper_bits(start);
+		if (end_word == start_word) {
+			mask &= ethnl_lower_bits(end);
+			return map[start_word] & mask;
+		}
+		if (map[start_word] & mask)
+			return true;
+		start_word++;
+	}
+
+	if (memchr_inv(map + start_word, '\0',
+		       (end_word - start_word) * sizeof(u32)))
+		return true;
+	if (end % 32 == 0)
+		return false;
+	return map[end_word] & ethnl_lower_bits(end);
+}
+
+/**
+ * ethnl_bitmap32_update() - Modify u32 based bitmap according to value/mask
+ *			     pair
+ * @dst:   bitmap to update
+ * @nbits: bit size of the bitmap
+ * @value: values to set
+ * @mask:  mask of bits to set
+ * @mod:   set to true if bitmap is modified, preserve if not
+ *
+ * Set bits in @dst bitmap which are set in @mask to values from @value, leave
+ * the rest untouched. If destination bitmap was modified, set @mod to true,
+ * leave as it is if not.
+ */
+static void ethnl_bitmap32_update(u32 *dst, unsigned int nbits,
+				  const u32 *value, const u32 *mask, bool *mod)
+{
+	while (nbits > 0) {
+		u32 real_mask = mask ? *mask : ~(u32)0;
+		u32 new_value;
+
+		if (nbits < 32)
+			real_mask &= ethnl_lower_bits(nbits);
+		new_value = (*dst & ~real_mask) | (*value & real_mask);
+		if (new_value != *dst) {
+			*dst = new_value;
+			*mod = true;
+		}
+
+		if (nbits <= 32)
+			break;
+		dst++;
+		nbits -= 32;
+		value++;
+		if (mask)
+			mask++;
+	}
+}
+
+static bool ethnl_bitmap32_test_bit(const u32 *map, unsigned int index)
+{
+	return map[index / 32] & (1U << (index % 32));
+}
+
+/**
+ * ethnl_bitset32_size() - Calculate size of bitset nested attribute
+ * @val:     value bitmap (u32 based)
+ * @mask:    mask bitmap (u32 based, optional)
+ * @nbits:   bit length of the bitset
+ * @names:   array of bit names (optional)
+ * @compact: assume compact format for output
+ *
+ * Estimate length of the netlink attribute composed by a later
+ * ethnl_put_bitset32() call with the same arguments.
+ *
+ * Return: negative error code or attribute length estimate
+ */
+int ethnl_bitset32_size(const u32 *val, const u32 *mask, unsigned int nbits,
+			ethnl_string_array_t names, bool compact)
+{
+	unsigned int len = 0;
+
+	/* list flag */
+	if (!mask)
+		len += nla_total_size(sizeof(u32));
+	/* size */
+	len += nla_total_size(sizeof(u32));
+
+	if (compact) {
+		unsigned int nwords = DIV_ROUND_UP(nbits, 32);
+
+		/* value, mask */
+		len += (mask ? 2 : 1) * nla_total_size(nwords * sizeof(u32));
+	} else {
+		unsigned int bits_len = 0;
+		unsigned int bit_len, i;
+
+		for (i = 0; i < nbits; i++) {
+			const char *name = names ? names[i] : NULL;
+
+			if (!ethnl_bitmap32_test_bit(mask ?: val, i))
+				continue;
+			/* index */
+			bit_len = nla_total_size(sizeof(u32));
+			/* name */
+			if (name)
+				bit_len += ethnl_strz_size(name);
+			/* value */
+			if (mask && ethnl_bitmap32_test_bit(val, i))
+				bit_len += nla_total_size(0);
+
+			/* bit nest */
+			bits_len += nla_total_size(bit_len);
+		}
+		/* bits nest */
+		len += nla_total_size(bits_len);
+	}
+
+	/* outermost nest */
+	return nla_total_size(len);
+}
+
+/**
+ * ethnl_put_bitset32() - Put a bitset nest into a message
+ * @skb:      skb with the message
+ * @attrtype: attribute type for the bitset nest
+ * @val:      value bitmap (u32 based)
+ * @mask:     mask bitmap (u32 based, optional)
+ * @nbits:    bit length of the bitset
+ * @names:    array of bit names (optional)
+ * @compact:  use compact format for the output
+ *
+ * Compose a nested attribute representing a bitset. If @mask is null, simple
+ * bitmap (bit list) is created, if @mask is provided, represent a value/mask
+ * pair. Bit names are only used in verbose mode and when provided by caller.
+ *
+ * Return: 0 on success, negative error value on error
+ */
+int ethnl_put_bitset32(struct sk_buff *skb, int attrtype, const u32 *val,
+		       const u32 *mask, unsigned int nbits,
+		       ethnl_string_array_t names, bool compact)
+{
+	struct nlattr *nest;
+	struct nlattr *attr;
+
+	nest = nla_nest_start(skb, attrtype);
+	if (!nest)
+		return -EMSGSIZE;
+
+	if (!mask && nla_put_flag(skb, ETHTOOL_A_BITSET_NOMASK))
+		goto nla_put_failure;
+	if (nla_put_u32(skb, ETHTOOL_A_BITSET_SIZE, nbits))
+		goto nla_put_failure;
+	if (compact) {
+		unsigned int nwords = DIV_ROUND_UP(nbits, 32);
+		unsigned int nbytes = nwords * sizeof(u32);
+		u32 *dst;
+
+		attr = nla_reserve(skb, ETHTOOL_A_BITSET_VALUE, nbytes);
+		if (!attr)
+			goto nla_put_failure;
+		dst = nla_data(attr);
+		memcpy(dst, val, nbytes);
+		if (nbits % 32)
+			dst[nwords - 1] &= ethnl_lower_bits(nbits);
+
+		if (mask) {
+			attr = nla_reserve(skb, ETHTOOL_A_BITSET_MASK, nbytes);
+			if (!attr)
+				goto nla_put_failure;
+			dst = nla_data(attr);
+			memcpy(dst, mask, nbytes);
+			if (nbits % 32)
+				dst[nwords - 1] &= ethnl_lower_bits(nbits);
+		}
+	} else {
+		struct nlattr *bits;
+		unsigned int i;
+
+		bits = nla_nest_start(skb, ETHTOOL_A_BITSET_BITS);
+		if (!bits)
+			goto nla_put_failure;
+		for (i = 0; i < nbits; i++) {
+			const char *name = names ? names[i] : NULL;
+
+			if (!ethnl_bitmap32_test_bit(mask ?: val, i))
+				continue;
+			attr = nla_nest_start(skb, ETHTOOL_A_BITSET_BITS_BIT);
+			if (!attr)
+				goto nla_put_failure;
+			if (nla_put_u32(skb, ETHTOOL_A_BITSET_BIT_INDEX, i))
+				goto nla_put_failure;
+			if (name &&
+			    ethnl_put_strz(skb, ETHTOOL_A_BITSET_BIT_NAME, name))
+				goto nla_put_failure;
+			if (mask && ethnl_bitmap32_test_bit(val, i) &&
+			    nla_put_flag(skb, ETHTOOL_A_BITSET_BIT_VALUE))
+				goto nla_put_failure;
+			nla_nest_end(skb, attr);
+		}
+		nla_nest_end(skb, bits);
+	}
+
+	nla_nest_end(skb, nest);
+	return 0;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return -EMSGSIZE;
+}
+
+static const struct nla_policy bitset_policy[ETHTOOL_A_BITSET_MAX + 1] = {
+	[ETHTOOL_A_BITSET_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_BITSET_NOMASK]	= { .type = NLA_FLAG },
+	[ETHTOOL_A_BITSET_SIZE]		= { .type = NLA_U32 },
+	[ETHTOOL_A_BITSET_BITS]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_BITSET_VALUE]	= { .type = NLA_BINARY },
+	[ETHTOOL_A_BITSET_MASK]		= { .type = NLA_BINARY },
+};
+
+static const struct nla_policy bit_policy[ETHTOOL_A_BITSET_BIT_MAX + 1] = {
+	[ETHTOOL_A_BITSET_BIT_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_BITSET_BIT_INDEX]	= { .type = NLA_U32 },
+	[ETHTOOL_A_BITSET_BIT_NAME]	= { .type = NLA_NUL_STRING },
+	[ETHTOOL_A_BITSET_BIT_VALUE]	= { .type = NLA_FLAG },
+};
+
+/**
+ * ethnl_bitset_is_compact() - check if bitset attribute represents a compact
+ *			       bitset
+ * @bitset:  nested attribute representing a bitset
+ * @compact: pointer for return value
+ *
+ * Return: 0 on success, negative error code on failure
+ */
+int ethnl_bitset_is_compact(const struct nlattr *bitset, bool *compact)
+{
+	struct nlattr *tb[ETHTOOL_A_BITSET_MAX + 1];
+	int ret;
+
+	ret = nla_parse_nested(tb, ETHTOOL_A_BITSET_MAX, bitset,
+			       bitset_policy, NULL);
+	if (ret < 0)
+		return ret;
+
+	if (tb[ETHTOOL_A_BITSET_BITS]) {
+		if (tb[ETHTOOL_A_BITSET_VALUE] || tb[ETHTOOL_A_BITSET_MASK])
+			return -EINVAL;
+		*compact = false;
+		return 0;
+	}
+	if (!tb[ETHTOOL_A_BITSET_SIZE] || !tb[ETHTOOL_A_BITSET_VALUE])
+		return -EINVAL;
+
+	*compact = true;
+	return 0;
+}
+
+/**
+ * ethnl_name_to_idx() - look up string index for a name
+ * @names:   array of ETH_GSTRING_LEN sized strings
+ * @n_names: number of strings in the array
+ * @name:    name to look up
+ *
+ * Return: index of the string if found, -ENOENT if not found
+ */
+static int ethnl_name_to_idx(ethnl_string_array_t names, unsigned int n_names,
+			     const char *name)
+{
+	unsigned int i;
+
+	if (!names)
+		return -ENOENT;
+
+	for (i = 0; i < n_names; i++) {
+		/* names[i] may not be null terminated */
+		if (!strncmp(names[i], name, ETH_GSTRING_LEN) &&
+		    strlen(name) <= ETH_GSTRING_LEN)
+			return i;
+	}
+
+	return -ENOENT;
+}
+
+static int ethnl_parse_bit(unsigned int *index, bool *val, unsigned int nbits,
+			   const struct nlattr *bit_attr, bool no_mask,
+			   ethnl_string_array_t names,
+			   struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[ETHTOOL_A_BITSET_BIT_MAX + 1];
+	int ret, idx;
+
+	ret = nla_parse_nested(tb, ETHTOOL_A_BITSET_BIT_MAX, bit_attr,
+			       bit_policy, extack);
+	if (ret < 0)
+		return ret;
+
+	if (tb[ETHTOOL_A_BITSET_BIT_INDEX]) {
+		const char *name;
+
+		idx = nla_get_u32(tb[ETHTOOL_A_BITSET_BIT_INDEX]);
+		if (idx >= nbits) {
+			NL_SET_ERR_MSG_ATTR(extack,
+					    tb[ETHTOOL_A_BITSET_BIT_INDEX],
+					    "bit index too high");
+			return -EOPNOTSUPP;
+		}
+		name = names ? names[idx] : NULL;
+		if (tb[ETHTOOL_A_BITSET_BIT_NAME] && name &&
+		    strncmp(nla_data(tb[ETHTOOL_A_BITSET_BIT_NAME]), name,
+			    nla_len(tb[ETHTOOL_A_BITSET_BIT_NAME]))) {
+			NL_SET_ERR_MSG_ATTR(extack, bit_attr,
+					    "bit index and name mismatch");
+			return -EINVAL;
+		}
+	} else if (tb[ETHTOOL_A_BITSET_BIT_NAME]) {
+		idx = ethnl_name_to_idx(names, nbits,
+					nla_data(tb[ETHTOOL_A_BITSET_BIT_NAME]));
+		if (idx < 0) {
+			NL_SET_ERR_MSG_ATTR(extack,
+					    tb[ETHTOOL_A_BITSET_BIT_NAME],
+					    "bit name not found");
+			return -EOPNOTSUPP;
+		}
+	} else {
+		NL_SET_ERR_MSG_ATTR(extack, bit_attr,
+				    "neither bit index nor name specified");
+		return -EINVAL;
+	}
+
+	*index = idx;
+	*val = no_mask || tb[ETHTOOL_A_BITSET_BIT_VALUE];
+	return 0;
+}
+
+static int
+ethnl_update_bitset32_verbose(u32 *bitmap, unsigned int nbits,
+			      const struct nlattr *attr, struct nlattr **tb,
+			      ethnl_string_array_t names,
+			      struct netlink_ext_ack *extack, bool *mod)
+{
+	struct nlattr *bit_attr;
+	bool no_mask;
+	int rem;
+	int ret;
+
+	if (tb[ETHTOOL_A_BITSET_VALUE]) {
+		NL_SET_ERR_MSG_ATTR(extack, tb[ETHTOOL_A_BITSET_VALUE],
+				    "value only allowed in compact bitset");
+		return -EINVAL;
+	}
+	if (tb[ETHTOOL_A_BITSET_MASK]) {
+		NL_SET_ERR_MSG_ATTR(extack, tb[ETHTOOL_A_BITSET_MASK],
+				    "mask only allowed in compact bitset");
+		return -EINVAL;
+	}
+	no_mask = tb[ETHTOOL_A_BITSET_NOMASK];
+
+	nla_for_each_nested(bit_attr, tb[ETHTOOL_A_BITSET_BITS], rem) {
+		bool old_val, new_val;
+		unsigned int idx;
+
+		if (nla_type(bit_attr) != ETHTOOL_A_BITSET_BITS_BIT) {
+			NL_SET_ERR_MSG_ATTR(extack, bit_attr,
+					    "only ETHTOOL_A_BITSET_BITS_BIT allowed in ETHTOOL_A_BITSET_BITS");
+			return -EINVAL;
+		}
+		ret = ethnl_parse_bit(&idx, &new_val, nbits, bit_attr, no_mask,
+				      names, extack);
+		if (ret < 0)
+			return ret;
+		old_val = bitmap[idx / 32] & ((u32)1 << (idx % 32));
+		if (new_val != old_val) {
+			if (new_val)
+				bitmap[idx / 32] |= ((u32)1 << (idx % 32));
+			else
+				bitmap[idx / 32] &= ~((u32)1 << (idx % 32));
+			*mod = true;
+		}
+	}
+
+	return 0;
+}
+
+static int ethnl_compact_sanity_checks(unsigned int nbits,
+				       const struct nlattr *nest,
+				       struct nlattr **tb,
+				       struct netlink_ext_ack *extack)
+{
+	bool no_mask = tb[ETHTOOL_A_BITSET_NOMASK];
+	unsigned int attr_nbits, attr_nwords;
+	const struct nlattr *test_attr;
+
+	if (no_mask && tb[ETHTOOL_A_BITSET_MASK]) {
+		NL_SET_ERR_MSG_ATTR(extack, tb[ETHTOOL_A_BITSET_MASK],
+				    "mask not allowed in list bitset");
+		return -EINVAL;
+	}
+	if (!tb[ETHTOOL_A_BITSET_SIZE]) {
+		NL_SET_ERR_MSG_ATTR(extack, nest,
+				    "missing size in compact bitset");
+		return -EINVAL;
+	}
+	if (!tb[ETHTOOL_A_BITSET_VALUE]) {
+		NL_SET_ERR_MSG_ATTR(extack, nest,
+				    "missing value in compact bitset");
+		return -EINVAL;
+	}
+	if (!no_mask && !tb[ETHTOOL_A_BITSET_MASK]) {
+		NL_SET_ERR_MSG_ATTR(extack, nest,
+				    "missing mask in compact nonlist bitset");
+		return -EINVAL;
+	}
+
+	attr_nbits = nla_get_u32(tb[ETHTOOL_A_BITSET_SIZE]);
+	attr_nwords = DIV_ROUND_UP(attr_nbits, 32);
+	if (nla_len(tb[ETHTOOL_A_BITSET_VALUE]) != attr_nwords * sizeof(u32)) {
+		NL_SET_ERR_MSG_ATTR(extack, tb[ETHTOOL_A_BITSET_VALUE],
+				    "bitset value length does not match size");
+		return -EINVAL;
+	}
+	if (tb[ETHTOOL_A_BITSET_MASK] &&
+	    nla_len(tb[ETHTOOL_A_BITSET_MASK]) != attr_nwords * sizeof(u32)) {
+		NL_SET_ERR_MSG_ATTR(extack, tb[ETHTOOL_A_BITSET_MASK],
+				    "bitset mask length does not match size");
+		return -EINVAL;
+	}
+	if (attr_nbits <= nbits)
+		return 0;
+
+	test_attr = no_mask ? tb[ETHTOOL_A_BITSET_VALUE] :
+			      tb[ETHTOOL_A_BITSET_MASK];
+	if (ethnl_bitmap32_not_zero(nla_data(test_attr), nbits, attr_nbits)) {
+		NL_SET_ERR_MSG_ATTR(extack, test_attr,
+				    "cannot modify bits past kernel bitset size");
+		return -EINVAL;
+	}
+	return 0;
+}
+
+/**
+ * ethnl_update_bitset32() - Apply a bitset nest to a u32 based bitmap
+ * @bitmap:  bitmap to update
+ * @nbits:   size of the updated bitmap in bits
+ * @attr:    nest attribute to parse and apply
+ * @names:   array of bit names; may be null for compact format
+ * @extack:  extack for error reporting
+ * @mod:     set this to true if bitmap is modified, leave as it is if not
+ *
+ * Apply bitset nested attribute to a bitmap. If the attribute represents
+ * a bit list, @bitmap is set to its contents; otherwise, bits in mask are
+ * set to values from value. Bitmaps in the attribute may be longer than
+ * @nbits but the message must not request modifying any bits past @nbits.
+ *
+ * Return: negative error code on failure, 0 on success
+ */
+int ethnl_update_bitset32(u32 *bitmap, unsigned int nbits,
+			  const struct nlattr *attr, ethnl_string_array_t names,
+			  struct netlink_ext_ack *extack, bool *mod)
+{
+	struct nlattr *tb[ETHTOOL_A_BITSET_MAX + 1];
+	unsigned int change_bits;
+	bool no_mask;
+	int ret;
+
+	if (!attr)
+		return 0;
+	ret = nla_parse_nested(tb, ETHTOOL_A_BITSET_MAX, attr, bitset_policy,
+			       extack);
+	if (ret < 0)
+		return ret;
+
+	if (tb[ETHTOOL_A_BITSET_BITS])
+		return ethnl_update_bitset32_verbose(bitmap, nbits, attr, tb,
+						     names, extack, mod);
+	ret = ethnl_compact_sanity_checks(nbits, attr, tb, extack);
+	if (ret < 0)
+		return ret;
+
+	no_mask = tb[ETHTOOL_A_BITSET_NOMASK];
+	change_bits = min_t(unsigned int,
+			    nla_get_u32(tb[ETHTOOL_A_BITSET_SIZE]), nbits);
+	ethnl_bitmap32_update(bitmap, change_bits,
+			      nla_data(tb[ETHTOOL_A_BITSET_VALUE]),
+			      no_mask ? NULL :
+					nla_data(tb[ETHTOOL_A_BITSET_MASK]),
+			      mod);
+	if (no_mask && change_bits < nbits)
+		ethnl_bitmap32_clear(bitmap, change_bits, nbits, mod);
+
+	return 0;
+}
+
+#if BITS_PER_LONG == 64 && defined(__BIG_ENDIAN)
+
+/* 64-bit big endian architectures are the only case when u32 based bitmaps
+ * and unsigned long based bitmaps have different memory layout so that we
+ * cannot simply cast the latter to the former and need actual wrappers
+ * converting the latter to the former.
+ *
+ * To reduce the number of slab allocations, the wrappers use fixed size local
+ * variables for bitmaps up to ETHNL_SMALL_BITMAP_BITS bits which is the
+ * majority of bitmaps used by ethtool.
+ */
+#define ETHNL_SMALL_BITMAP_BITS 128
+#define ETHNL_SMALL_BITMAP_WORDS DIV_ROUND_UP(ETHNL_SMALL_BITMAP_BITS, 32)
+
+int ethnl_bitset_size(const unsigned long *val, const unsigned long *mask,
+		      unsigned int nbits, ethnl_string_array_t names,
+		      bool compact)
+{
+	u32 small_mask32[ETHNL_SMALL_BITMAP_WORDS];
+	u32 small_val32[ETHNL_SMALL_BITMAP_WORDS];
+	u32 *mask32;
+	u32 *val32;
+	int ret;
+
+	if (nbits > ETHNL_SMALL_BITMAP_BITS) {
+		unsigned int nwords = DIV_ROUND_UP(nbits, 32);
+
+		val32 = kmalloc_array(2 * nwords, sizeof(u32), GFP_KERNEL);
+		if (!val32)
+			return -ENOMEM;
+		mask32 = val32 + nwords;
+	} else {
+		val32 = small_val32;
+		mask32 = small_mask32;
+	}
+
+	bitmap_to_arr32(val32, val, nbits);
+	if (mask)
+		bitmap_to_arr32(mask32, mask, nbits);
+	else
+		mask32 = NULL;
+	ret = ethnl_bitset32_size(val32, mask32, nbits, names, compact);
+
+	if (nbits > ETHNL_SMALL_BITMAP_BITS)
+		kfree(val32);
+
+	return ret;
+}
+
+int ethnl_put_bitset(struct sk_buff *skb, int attrtype,
+		     const unsigned long *val, const unsigned long *mask,
+		     unsigned int nbits, ethnl_string_array_t names,
+		     bool compact)
+{
+	u32 small_mask32[ETHNL_SMALL_BITMAP_WORDS];
+	u32 small_val32[ETHNL_SMALL_BITMAP_WORDS];
+	u32 *mask32;
+	u32 *val32;
+	int ret;
+
+	if (nbits > ETHNL_SMALL_BITMAP_BITS) {
+		unsigned int nwords = DIV_ROUND_UP(nbits, 32);
+
+		val32 = kmalloc_array(2 * nwords, sizeof(u32), GFP_KERNEL);
+		if (!val32)
+			return -ENOMEM;
+		mask32 = val32 + nwords;
+	} else {
+		val32 = small_val32;
+		mask32 = small_mask32;
+	}
+
+	bitmap_to_arr32(val32, val, nbits);
+	if (mask)
+		bitmap_to_arr32(mask32, mask, nbits);
+	else
+		mask32 = NULL;
+	ret = ethnl_put_bitset32(skb, attrtype, val32, mask32, nbits, names,
+				 compact);
+
+	if (nbits > ETHNL_SMALL_BITMAP_BITS)
+		kfree(val32);
+
+	return ret;
+}
+
+int ethnl_update_bitset(unsigned long *bitmap, unsigned int nbits,
+			const struct nlattr *attr, ethnl_string_array_t names,
+			struct netlink_ext_ack *extack, bool *mod)
+{
+	u32 small_bitmap32[ETHNL_SMALL_BITMAP_WORDS];
+	u32 *bitmap32 = small_bitmap32;
+	bool u32_mod = false;
+	int ret;
+
+	if (nbits > ETHNL_SMALL_BITMAP_BITS) {
+		unsigned int dst_words = DIV_ROUND_UP(nbits, 32);
+
+		bitmap32 = kmalloc_array(dst_words, sizeof(u32), GFP_KERNEL);
+		if (!bitmap32)
+			return -ENOMEM;
+	}
+
+	bitmap_to_arr32(bitmap32, bitmap, nbits);
+	ret = ethnl_update_bitset32(bitmap32, nbits, attr, names, extack,
+				    &u32_mod);
+	if (u32_mod) {
+		bitmap_from_arr32(bitmap, bitmap32, nbits);
+		*mod = true;
+	}
+
+	if (nbits > ETHNL_SMALL_BITMAP_BITS)
+		kfree(bitmap32);
+
+	return ret;
+}
+
+#else
+
+/* On little endian 64-bit and all 32-bit architectures, an unsigned long
+ * based bitmap can be interpreted as u32 based one using a simple cast.
+ */
+
+int ethnl_bitset_size(const unsigned long *val, const unsigned long *mask,
+		      unsigned int nbits, ethnl_string_array_t names,
+		      bool compact)
+{
+	return ethnl_bitset32_size((const u32 *)val, (const u32 *)mask, nbits,
+				   names, compact);
+}
+
+int ethnl_put_bitset(struct sk_buff *skb, int attrtype,
+		     const unsigned long *val, const unsigned long *mask,
+		     unsigned int nbits, ethnl_string_array_t names,
+		     bool compact)
+{
+	return ethnl_put_bitset32(skb, attrtype, (const u32 *)val,
+				  (const u32 *)mask, nbits, names, compact);
+}
+
+int ethnl_update_bitset(unsigned long *bitmap, unsigned int nbits,
+			const struct nlattr *attr, ethnl_string_array_t names,
+			struct netlink_ext_ack *extack, bool *mod)
+{
+	return ethnl_update_bitset32((u32 *)bitmap, nbits, attr, names, extack,
+				     mod);
+}
+
+#endif /* BITS_PER_LONG == 64 && defined(__BIG_ENDIAN) */
diff --git a/net/ethtool/bitset.h b/net/ethtool/bitset.h
new file mode 100644
index 000000000000..b8247e34109d
--- /dev/null
+++ b/net/ethtool/bitset.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef _NET_ETHTOOL_BITSET_H
+#define _NET_ETHTOOL_BITSET_H
+
+typedef const char (*const ethnl_string_array_t)[ETH_GSTRING_LEN];
+
+int ethnl_bitset_is_compact(const struct nlattr *bitset, bool *compact);
+int ethnl_bitset_size(const unsigned long *val, const unsigned long *mask,
+		      unsigned int nbits, ethnl_string_array_t names,
+		      bool compact);
+int ethnl_bitset32_size(const u32 *val, const u32 *mask, unsigned int nbits,
+			ethnl_string_array_t names, bool compact);
+int ethnl_put_bitset(struct sk_buff *skb, int attrtype,
+		     const unsigned long *val, const unsigned long *mask,
+		     unsigned int nbits, ethnl_string_array_t names,
+		     bool compact);
+int ethnl_put_bitset32(struct sk_buff *skb, int attrtype, const u32 *val,
+		       const u32 *mask, unsigned int nbits,
+		       ethnl_string_array_t names, bool compact);
+int ethnl_update_bitset(unsigned long *bitmap, unsigned int nbits,
+			const struct nlattr *attr, ethnl_string_array_t names,
+			struct netlink_ext_ack *extack, bool *mod);
+int ethnl_update_bitset32(u32 *bitmap, unsigned int nbits,
+			  const struct nlattr *attr, ethnl_string_array_t names,
+			  struct netlink_ext_ack *extack, bool *mod);
+
+#endif /* _NET_ETHTOOL_BITSET_H */
-- 
cgit v1.2.3


From 6b08d6c146f4c5ed451c45339c10feb06d619db2 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:33 +0100
Subject: ethtool: support for netlink notifications

Add infrastructure for ethtool netlink notifications. There is only one
multicast group "monitor" which is used to notify userspace about changes
and actions performed. Notification messages (types using suffix _NTF)
share the format with replies to GET requests.

Notifications are supposed to be broadcast on every configuration change,
whether it is done using the netlink interface or the ioctl one. Netlink SET
requests only trigger a notification if some data is actually changed.

To trigger an ethtool notification, both ethtool netlink and external code
use ethtool_notify() helper. This helper requires RTNL to be held and may
sleep. Handlers sending messages for specific notification message types
are registered in ethnl_notify_handlers array. As notifications can be
triggered from other code, ethnl_ok flag is used to prevent an attempt to
send notification before genetlink family is registered.
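
For illustration only, the dispatch described above can be sketched in plain
userspace C. This is a minimal sketch under stated assumptions, not the kernel
code: every demo_* name is hypothetical, the RTNL assertion is left out, and
the function returns a bool instead of warning so the behaviour can be checked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of these names exist in the kernel. */
struct demo_dev { const char *name; };

typedef void (*demo_notify_handler_t)(struct demo_dev *dev, unsigned int cmd,
				      const void *data);

enum { DEMO_MSG_NONE, DEMO_MSG_FOO_NTF, DEMO_MSG_CNT };

static bool demo_ok;		/* set once "registration" has succeeded */
static unsigned int demo_sent;	/* number of handlers actually invoked */

static void demo_foo_notify(struct demo_dev *dev, unsigned int cmd,
			    const void *data)
{
	(void)data;
	printf("notification %u for %s\n", cmd, dev->name);
	demo_sent++;
}

/* Handlers indexed by message type; unlisted entries stay NULL. */
static const demo_notify_handler_t demo_handlers[DEMO_MSG_CNT] = {
	[DEMO_MSG_FOO_NTF] = demo_foo_notify,
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Returns true if a handler ran, false if the call was dropped. */
static bool demo_notify(struct demo_dev *dev, unsigned int cmd,
			const void *data)
{
	if (!demo_ok)
		return false;	/* family not registered yet */
	if (cmd >= DEMO_ARRAY_SIZE(demo_handlers) || !demo_handlers[cmd])
		return false;	/* the kernel helper warns once here */
	demo_handlers[cmd](dev, cmd, data);
	return true;
}
```

Calls made before demo_ok is set, or with a message type that has no handler,
are dropped without reaching any handler.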

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/ethtool_netlink.h      |  5 +++++
 include/linux/netdevice.h            |  9 +++++++++
 include/uapi/linux/ethtool_netlink.h |  2 ++
 net/ethtool/netlink.c                | 32 ++++++++++++++++++++++++++++++++
 4 files changed, 48 insertions(+)

diff --git a/include/linux/ethtool_netlink.h b/include/linux/ethtool_netlink.h
index f27e92b5f344..c98f6852c8eb 100644
--- a/include/linux/ethtool_netlink.h
+++ b/include/linux/ethtool_netlink.h
@@ -5,5 +5,10 @@
 
 #include <uapi/linux/ethtool_netlink.h>
 #include <linux/ethtool.h>
+#include <linux/netdevice.h>
+
+enum ethtool_multicast_groups {
+	ETHNL_MCGRP_MONITOR,
+};
 
 #endif /* _LINUX_ETHTOOL_NETLINK_H_ */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 469a297b58c0..f007155ae8f4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -4393,6 +4393,15 @@ struct netdev_notifier_bonding_info {
 void netdev_bonding_info_change(struct net_device *dev,
 				struct netdev_bonding_info *bonding_info);
 
+#if IS_ENABLED(CONFIG_ETHTOOL_NETLINK)
+void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data);
+#else
+static inline void ethtool_notify(struct net_device *dev, unsigned int cmd,
+				  const void *data)
+{
+}
+#endif
+
 static inline
 struct sk_buff *skb_gso_segment(struct sk_buff *skb, netdev_features_t features)
 {
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 951203049615..d530ccb6f7e6 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -89,4 +89,6 @@ enum {
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
 
+#define ETHTOOL_MCGRP_MONITOR_NAME "monitor"
+
 #endif /* _UAPI_LINUX_ETHTOOL_NETLINK_H_ */
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index aef882e0c3f5..c0f25c8f3565 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -6,6 +6,8 @@
 
 static struct genl_family ethtool_genl_family;
 
+static bool ethnl_ok __read_mostly;
+
 static const struct nla_policy ethnl_header_policy[ETHTOOL_A_HEADER_MAX + 1] = {
 	[ETHTOOL_A_HEADER_UNSPEC]	= { .type = NLA_REJECT },
 	[ETHTOOL_A_HEADER_DEV_INDEX]	= { .type = NLA_U32 },
@@ -169,11 +171,38 @@ err:
 	return NULL;
 }
 
+/* notifications */
+
+typedef void (*ethnl_notify_handler_t)(struct net_device *dev, unsigned int cmd,
+				       const void *data);
+
+static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
+};
+
+void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
+{
+	if (unlikely(!ethnl_ok))
+		return;
+	ASSERT_RTNL();
+
+	if (likely(cmd < ARRAY_SIZE(ethnl_notify_handlers) &&
+		   ethnl_notify_handlers[cmd]))
+		ethnl_notify_handlers[cmd](dev, cmd, data);
+	else
+		WARN_ONCE(1, "notification %u not implemented (dev=%s)\n",
+			  cmd, netdev_name(dev));
+}
+EXPORT_SYMBOL(ethtool_notify);
+
 /* genetlink setup */
 
 static const struct genl_ops ethtool_genl_ops[] = {
 };
 
+static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
+	[ETHNL_MCGRP_MONITOR] = { .name = ETHTOOL_MCGRP_MONITOR_NAME },
+};
+
 static struct genl_family ethtool_genl_family = {
 	.name		= ETHTOOL_GENL_NAME,
 	.version	= ETHTOOL_GENL_VERSION,
@@ -181,6 +210,8 @@ static struct genl_family ethtool_genl_family = {
 	.parallel_ops	= true,
 	.ops		= ethtool_genl_ops,
 	.n_ops		= ARRAY_SIZE(ethtool_genl_ops),
+	.mcgrps		= ethtool_nl_mcgrps,
+	.n_mcgrps	= ARRAY_SIZE(ethtool_nl_mcgrps),
 };
 
 /* module setup */
@@ -192,6 +223,7 @@ static int __init ethnl_init(void)
 	ret = genl_register_family(&ethtool_genl_family);
 	if (WARN(ret < 0, "ethtool: genetlink family registration failed"))
 		return ret;
+	ethnl_ok = true;
 
 	return 0;
 }
-- 
cgit v1.2.3


From 71921690f9745fef60a2bad425f30adf8cdc9da0 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:43 +0100
Subject: ethtool: provide string sets with STRSET_GET request

Requests contents of one or more string sets, i.e. indexed arrays of
strings; this information is provided by the ETHTOOL_GSSET_INFO and
ETHTOOL_GSTRINGS commands of the ioctl interface. Unlike the ioctl interface,
all information can be retrieved with one request and multiple string sets
can be requested at once.

There are three types of requests:

  - no NLM_F_DUMP, no device: get "global" stringsets
  - no NLM_F_DUMP, with device: get string sets related to the device
  - NLM_F_DUMP, no device: get device related string sets for all devices

A client can request either all string sets of a given type (global or device
related) or only specific sets. With the ETHTOOL_A_STRSET_COUNTS_ONLY flag
set, only set sizes (numbers of strings) are returned.
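
The three request types listed above amount to a small decision on two inputs.
A minimal sketch, assuming hypothetical demo_* names that are not part of this
patch; the text above does not say how NLM_F_DUMP combined with a device is
handled, so that case is only marked as unspecified here:

```c
#include <stdbool.h>

/* Hypothetical classification of a STRSET_GET request. */
enum demo_strset_req {
	DEMO_STRSET_GLOBAL,		/* no dump, no device */
	DEMO_STRSET_ONE_DEVICE,		/* no dump, with device */
	DEMO_STRSET_ALL_DEVICES,	/* dump, no device */
	DEMO_STRSET_UNSPECIFIED,	/* dump with device: not covered above */
};

/* dump: NLM_F_DUMP was set; has_dev: the header identified a device. */
static enum demo_strset_req demo_strset_classify(bool dump, bool has_dev)
{
	if (!dump)
		return has_dev ? DEMO_STRSET_ONE_DEVICE : DEMO_STRSET_GLOBAL;
	return has_dev ? DEMO_STRSET_UNSPECIFIED : DEMO_STRSET_ALL_DEVICES;
}
```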

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst |  75 ++++-
 include/uapi/linux/ethtool.h                 |   3 +
 include/uapi/linux/ethtool_netlink.h         |  56 ++++
 net/ethtool/Makefile                         |   2 +-
 net/ethtool/netlink.c                        |   8 +
 net/ethtool/netlink.h                        |   4 +
 net/ethtool/strset.c                         | 425 +++++++++++++++++++++++++++
 7 files changed, 570 insertions(+), 3 deletions(-)
 create mode 100644 net/ethtool/strset.c

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 7797f1237472..3912cb0eb9c6 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -176,6 +176,18 @@ according to message purpose:
   ``_NTF``          kernel notification
   ==============    ======================================
 
+Userspace to kernel:
+
+  ===================================== ================================
+  ``ETHTOOL_MSG_STRSET_GET``            get string set
+  ===================================== ================================
+
+Kernel to userspace:
+
+  ===================================== ================================
+  ``ETHTOOL_MSG_STRSET_GET_REPLY``      string set contents
+  ===================================== ================================
+
 ``GET`` requests are sent by userspace applications to retrieve device
 information. They usually do not contain any message specific attributes.
 Kernel replies with corresponding "GET_REPLY" message. For most types, ``GET``
@@ -207,6 +219,65 @@ an ``ACT_REPLY`` message. Performing an action also triggers a notification
 Later sections describe the format and semantics of these messages.
 
 
+STRSET_GET
+==========
+
+Requests contents of a string set as provided by ioctl commands
+``ETHTOOL_GSSET_INFO`` and ``ETHTOOL_GSTRINGS``. String sets are not user
+writeable so that the corresponding ``STRSET_SET`` message is only used in
+kernel replies. There are two types of string sets: global (independent of
+a device, e.g. device feature names) and device specific (e.g. device private
+flags).
+
+Request contents:
+
+ +---------------------------------------+--------+------------------------+
+ | ``ETHTOOL_A_STRSET_HEADER``           | nested | request header         |
+ +---------------------------------------+--------+------------------------+
+ | ``ETHTOOL_A_STRSET_STRINGSETS``       | nested | string set to request  |
+ +-+-------------------------------------+--------+------------------------+
+ | | ``ETHTOOL_A_STRINGSETS_STRINGSET+`` | nested | one string set         |
+ +-+-+-----------------------------------+--------+------------------------+
+ | | | ``ETHTOOL_A_STRINGSET_ID``        | u32    | set id                 |
+ +-+-+-----------------------------------+--------+------------------------+
+ | ``ETHTOOL_A_STRSET_COUNTS_ONLY``      | flag   | return only counts     |
+ +---------------------------------------+--------+------------------------+
+
+Kernel response contents:
+
+ +---------------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_STRSET_HEADER``           | nested | reply header          |
+ +---------------------------------------+--------+-----------------------+
+ | ``ETHTOOL_A_STRSET_STRINGSETS``       | nested | array of string sets  |
+ +-+-------------------------------------+--------+-----------------------+
+ | | ``ETHTOOL_A_STRINGSETS_STRINGSET+`` | nested | one string set        |
+ +-+-+-----------------------------------+--------+-----------------------+
+ | | | ``ETHTOOL_A_STRINGSET_ID``        | u32    | set id                |
+ +-+-+-----------------------------------+--------+-----------------------+
+ | | | ``ETHTOOL_A_STRINGSET_COUNT``     | u32    | number of strings     |
+ +-+-+-----------------------------------+--------+-----------------------+
+ | | | ``ETHTOOL_A_STRINGSET_STRINGS``   | nested | array of strings      |
+ +-+-+-+---------------------------------+--------+-----------------------+
+ | | | | ``ETHTOOL_A_STRINGS_STRING+``   | nested | one string            |
+ +-+-+-+-+-------------------------------+--------+-----------------------+
+ | | | | | ``ETHTOOL_A_STRING_INDEX``    | u32    | string index          |
+ +-+-+-+-+-------------------------------+--------+-----------------------+
+ | | | | | ``ETHTOOL_A_STRING_VALUE``    | string | string value          |
+ +-+-+-+-+-------------------------------+--------+-----------------------+
+
+Device identification in request header is optional. Depending on its presence
+and the ``NLM_F_DUMP`` flag, there are three types of ``STRSET_GET`` requests:
+
+ - no ``NLM_F_DUMP``, no device: get "global" string sets
+ - no ``NLM_F_DUMP``, with device: get string sets related to the device
+ - ``NLM_F_DUMP``, no device: get device related string sets for all devices
+
+If there is no ``ETHTOOL_A_STRSET_STRINGSETS`` array, all string sets of
+requested type are returned, otherwise only those specified in the request.
+Flag ``ETHTOOL_A_STRSET_COUNTS_ONLY`` tells kernel to only return string
+counts of the sets, not the actual strings.
+
+
 Request translation
 ===================
 
@@ -242,7 +313,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GSG``                     n/a
   ``ETHTOOL_SSG``                     n/a
   ``ETHTOOL_TEST``                    n/a
-  ``ETHTOOL_GSTRINGS``                n/a
+  ``ETHTOOL_GSTRINGS``                ``ETHTOOL_MSG_STRSET_GET``
   ``ETHTOOL_PHYS_ID``                 n/a
   ``ETHTOOL_GSTATS``                  n/a
   ``ETHTOOL_GTSO``                    n/a
@@ -270,7 +341,7 @@ have their netlink replacement yet.
   ``ETHTOOL_RESET``                   n/a
   ``ETHTOOL_SRXNTUPLE``               n/a
   ``ETHTOOL_GRXNTUPLE``               n/a
-  ``ETHTOOL_GSSET_INFO``              n/a
+  ``ETHTOOL_GSSET_INFO``              ``ETHTOOL_MSG_STRSET_GET``
   ``ETHTOOL_GRXFHINDIR``              n/a
   ``ETHTOOL_SRXFHINDIR``              n/a
   ``ETHTOOL_GFEATURES``               n/a
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index f44155840b07..116bcbf09c74 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -606,6 +606,9 @@ enum ethtool_stringset {
 	ETH_SS_PHY_STATS,
 	ETH_SS_PHY_TUNABLES,
 	ETH_SS_LINK_MODES,
+
+	/* add new constants above here */
+	ETH_SS_COUNT
 };
 
 /**
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index d530ccb6f7e6..cabef1fec42a 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -14,6 +14,7 @@
 /* message types - userspace to kernel */
 enum {
 	ETHTOOL_MSG_USER_NONE,
+	ETHTOOL_MSG_STRSET_GET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -23,6 +24,7 @@ enum {
 /* message types - kernel to userspace */
 enum {
 	ETHTOOL_MSG_KERNEL_NONE,
+	ETHTOOL_MSG_STRSET_GET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -85,6 +87,60 @@ enum {
 	ETHTOOL_A_BITSET_MAX = __ETHTOOL_A_BITSET_CNT - 1
 };
 
+/* string sets */
+
+enum {
+	ETHTOOL_A_STRING_UNSPEC,
+	ETHTOOL_A_STRING_INDEX,			/* u32 */
+	ETHTOOL_A_STRING_VALUE,			/* string */
+
+	/* add new constants above here */
+	__ETHTOOL_A_STRING_CNT,
+	ETHTOOL_A_STRING_MAX = __ETHTOOL_A_STRING_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_STRINGS_UNSPEC,
+	ETHTOOL_A_STRINGS_STRING,		/* nest - _A_STRING_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_STRINGS_CNT,
+	ETHTOOL_A_STRINGS_MAX = __ETHTOOL_A_STRINGS_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_STRINGSET_UNSPEC,
+	ETHTOOL_A_STRINGSET_ID,			/* u32 */
+	ETHTOOL_A_STRINGSET_COUNT,		/* u32 */
+	ETHTOOL_A_STRINGSET_STRINGS,		/* nest - _A_STRINGS_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_STRINGSET_CNT,
+	ETHTOOL_A_STRINGSET_MAX = __ETHTOOL_A_STRINGSET_CNT - 1
+};
+
+enum {
+	ETHTOOL_A_STRINGSETS_UNSPEC,
+	ETHTOOL_A_STRINGSETS_STRINGSET,		/* nest - _A_STRINGSET_* */
+
+	/* add new constants above here */
+	__ETHTOOL_A_STRINGSETS_CNT,
+	ETHTOOL_A_STRINGSETS_MAX = __ETHTOOL_A_STRINGSETS_CNT - 1
+};
+
+/* STRSET */
+
+enum {
+	ETHTOOL_A_STRSET_UNSPEC,
+	ETHTOOL_A_STRSET_HEADER,		/* nest - _A_HEADER_* */
+	ETHTOOL_A_STRSET_STRINGSETS,		/* nest - _A_STRINGSETS_* */
+	ETHTOOL_A_STRSET_COUNTS_ONLY,		/* flag */
+
+	/* add new constants above here */
+	__ETHTOOL_A_STRSET_CNT,
+	ETHTOOL_A_STRSET_MAX = __ETHTOOL_A_STRSET_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index a7e6c2c85db9..efcc42c34d62 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -4,4 +4,4 @@ obj-y				+= ioctl.o common.o
 
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
-ethtool_nl-y	:= netlink.o bitset.o
+ethtool_nl-y	:= netlink.o bitset.o strset.o
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index ae63e8423072..7dc082bde670 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -194,6 +194,7 @@ struct ethnl_dump_ctx {
 
 static const struct ethnl_request_ops *
 ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
+	[ETHTOOL_MSG_STRSET_GET]	= &ethnl_strset_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -518,6 +519,13 @@ EXPORT_SYMBOL(ethtool_notify);
 /* genetlink setup */
 
 static const struct genl_ops ethtool_genl_ops[] = {
+	{
+		.cmd	= ETHTOOL_MSG_STRSET_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 6ec0dd06277f..44e9f63aefb7 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -326,4 +326,8 @@ struct ethnl_request_ops {
 	void (*cleanup_data)(struct ethnl_reply_data *reply_data);
 };
 
+/* request handlers */
+
+extern const struct ethnl_request_ops ethnl_strset_request_ops;
+
 #endif /* _NET_ETHTOOL_NETLINK_H */
diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c
new file mode 100644
index 000000000000..9f2243329015
--- /dev/null
+++ b/net/ethtool/strset.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/ethtool.h>
+#include <linux/phy.h>
+#include "netlink.h"
+#include "common.h"
+
+struct strset_info {
+	bool per_dev;
+	bool free_strings;
+	unsigned int count;
+	const char (*strings)[ETH_GSTRING_LEN];
+};
+
+static const struct strset_info info_template[] = {
+	[ETH_SS_TEST] = {
+		.per_dev	= true,
+	},
+	[ETH_SS_STATS] = {
+		.per_dev	= true,
+	},
+	[ETH_SS_PRIV_FLAGS] = {
+		.per_dev	= true,
+	},
+	[ETH_SS_FEATURES] = {
+		.per_dev	= false,
+		.count		= ARRAY_SIZE(netdev_features_strings),
+		.strings	= netdev_features_strings,
+	},
+	[ETH_SS_RSS_HASH_FUNCS] = {
+		.per_dev	= false,
+		.count		= ARRAY_SIZE(rss_hash_func_strings),
+		.strings	= rss_hash_func_strings,
+	},
+	[ETH_SS_TUNABLES] = {
+		.per_dev	= false,
+		.count		= ARRAY_SIZE(tunable_strings),
+		.strings	= tunable_strings,
+	},
+	[ETH_SS_PHY_STATS] = {
+		.per_dev	= true,
+	},
+	[ETH_SS_PHY_TUNABLES] = {
+		.per_dev	= false,
+		.count		= ARRAY_SIZE(phy_tunable_strings),
+		.strings	= phy_tunable_strings,
+	},
+	[ETH_SS_LINK_MODES] = {
+		.per_dev	= false,
+		.count		= __ETHTOOL_LINK_MODE_MASK_NBITS,
+		.strings	= link_mode_names,
+	},
+};
+
+struct strset_req_info {
+	struct ethnl_req_info		base;
+	u32				req_ids;
+	bool				counts_only;
+};
+
+#define STRSET_REQINFO(__req_base) \
+	container_of(__req_base, struct strset_req_info, base)
+
+struct strset_reply_data {
+	struct ethnl_reply_data		base;
+	struct strset_info		sets[ETH_SS_COUNT];
+};
+
+#define STRSET_REPDATA(__reply_base) \
+	container_of(__reply_base, struct strset_reply_data, base)
+
+static const struct nla_policy strset_get_policy[ETHTOOL_A_STRSET_MAX + 1] = {
+	[ETHTOOL_A_STRSET_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_STRSET_HEADER]	= { .type = NLA_NESTED },
+	[ETHTOOL_A_STRSET_STRINGSETS]	= { .type = NLA_NESTED },
+};
+
+static const struct nla_policy
+get_stringset_policy[ETHTOOL_A_STRINGSET_MAX + 1] = {
+	[ETHTOOL_A_STRINGSET_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_STRINGSET_ID]	= { .type = NLA_U32 },
+	[ETHTOOL_A_STRINGSET_COUNT]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_STRINGSET_STRINGS]	= { .type = NLA_REJECT },
+};
+
+/**
+ * strset_include() - test if a string set should be included in reply
+ * @data: pointer to request data structure
+ * @id:   id of string set to check (ETH_SS_* constants)
+ */
+static bool strset_include(const struct strset_req_info *info,
+			   const struct strset_reply_data *data, u32 id)
+{
+	bool per_dev;
+
+	BUILD_BUG_ON(ETH_SS_COUNT >= BITS_PER_BYTE * sizeof(info->req_ids));
+
+	if (info->req_ids)
+		return info->req_ids & (1U << id);
+	per_dev = data->sets[id].per_dev;
+	if (!per_dev && !data->sets[id].strings)
+		return false;
+
+	return data->base.dev ? per_dev : !per_dev;
+}
+
+static int strset_get_id(const struct nlattr *nest, u32 *val,
+			 struct netlink_ext_ack *extack)
+{
+	struct nlattr *tb[ETHTOOL_A_STRINGSET_MAX + 1];
+	int ret;
+
+	ret = nla_parse_nested(tb, ETHTOOL_A_STRINGSET_MAX, nest,
+			       get_stringset_policy, extack);
+	if (ret < 0)
+		return ret;
+	if (!tb[ETHTOOL_A_STRINGSET_ID])
+		return -EINVAL;
+
+	*val = nla_get_u32(tb[ETHTOOL_A_STRINGSET_ID]);
+	return 0;
+}
+
+static const struct nla_policy
+strset_stringsets_policy[ETHTOOL_A_STRINGSETS_MAX + 1] = {
+	[ETHTOOL_A_STRINGSETS_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_STRINGSETS_STRINGSET]	= { .type = NLA_NESTED },
+};
+
+static int strset_parse_request(struct ethnl_req_info *req_base,
+				struct nlattr **tb,
+				struct netlink_ext_ack *extack)
+{
+	struct strset_req_info *req_info = STRSET_REQINFO(req_base);
+	struct nlattr *nest = tb[ETHTOOL_A_STRSET_STRINGSETS];
+	struct nlattr *attr;
+	int rem, ret;
+
+	if (!nest)
+		return 0;
+	ret = nla_validate_nested(nest, ETHTOOL_A_STRINGSETS_MAX,
+				  strset_stringsets_policy, extack);
+	if (ret < 0)
+		return ret;
+
+	req_info->counts_only = tb[ETHTOOL_A_STRSET_COUNTS_ONLY];
+	nla_for_each_nested(attr, nest, rem) {
+		u32 id;
+
+		if (WARN_ONCE(nla_type(attr) != ETHTOOL_A_STRINGSETS_STRINGSET,
+			      "unexpected attrtype %u in ETHTOOL_A_STRSET_STRINGSETS\n",
+			      nla_type(attr)))
+			return -EINVAL;
+
+		ret = strset_get_id(attr, &id, extack);
+		if (ret < 0)
+			return ret;
+		if (id >= ETH_SS_COUNT) {
+			NL_SET_ERR_MSG_ATTR(extack, attr,
+					    "unknown string set id");
+			return -EOPNOTSUPP;
+		}
+
+		req_info->req_ids |= (1U << id);
+	}
+
+	return 0;
+}
+
+static void strset_cleanup_data(struct ethnl_reply_data *reply_base)
+{
+	struct strset_reply_data *data = STRSET_REPDATA(reply_base);
+	unsigned int i;
+
+	for (i = 0; i < ETH_SS_COUNT; i++)
+		if (data->sets[i].free_strings) {
+			kfree(data->sets[i].strings);
+			data->sets[i].strings = NULL;
+			data->sets[i].free_strings = false;
+		}
+}
+
+static int strset_prepare_set(struct strset_info *info, struct net_device *dev,
+			      unsigned int id, bool counts_only)
+{
+	const struct ethtool_ops *ops = dev->ethtool_ops;
+	void *strings;
+	int count, ret;
+
+	if (id == ETH_SS_PHY_STATS && dev->phydev &&
+	    !ops->get_ethtool_phy_stats)
+		ret = phy_ethtool_get_sset_count(dev->phydev);
+	else if (ops->get_sset_count && ops->get_strings)
+		ret = ops->get_sset_count(dev, id);
+	else
+		ret = -EOPNOTSUPP;
+	if (ret <= 0) {
+		info->count = 0;
+		return 0;
+	}
+
+	count = ret;
+	if (!counts_only) {
+		strings = kcalloc(count, ETH_GSTRING_LEN, GFP_KERNEL);
+		if (!strings)
+			return -ENOMEM;
+		if (id == ETH_SS_PHY_STATS && dev->phydev &&
+		    !ops->get_ethtool_phy_stats)
+			phy_ethtool_get_strings(dev->phydev, strings);
+		else
+			ops->get_strings(dev, id, strings);
+		info->strings = strings;
+		info->free_strings = true;
+	}
+	info->count = count;
+
+	return 0;
+}
+
+static int strset_prepare_data(const struct ethnl_req_info *req_base,
+			       struct ethnl_reply_data *reply_base,
+			       struct genl_info *info)
+{
+	const struct strset_req_info *req_info = STRSET_REQINFO(req_base);
+	struct strset_reply_data *data = STRSET_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	unsigned int i;
+	int ret;
+
+	BUILD_BUG_ON(ARRAY_SIZE(info_template) != ETH_SS_COUNT);
+	memcpy(&data->sets, &info_template, sizeof(data->sets));
+
+	if (!dev) {
+		for (i = 0; i < ETH_SS_COUNT; i++) {
+			if ((req_info->req_ids & (1U << i)) &&
+			    data->sets[i].per_dev) {
+				if (info)
+					GENL_SET_ERR_MSG(info, "requested per device strings without dev");
+				return -EINVAL;
+			}
+		}
+	}
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto err_strset;
+	for (i = 0; i < ETH_SS_COUNT; i++) {
+		if (!strset_include(req_info, data, i) ||
+		    !data->sets[i].per_dev)
+			continue;
+
+		ret = strset_prepare_set(&data->sets[i], dev, i,
+					 req_info->counts_only);
+		if (ret < 0)
+			goto err_ops;
+	}
+	ethnl_ops_complete(dev);
+
+	return 0;
+err_ops:
+	ethnl_ops_complete(dev);
+err_strset:
+	strset_cleanup_data(reply_base);
+	return ret;
+}
+
+/* calculate size of ETHTOOL_A_STRINGSETS_STRINGSET nest for one string set */
+static int strset_set_size(const struct strset_info *info, bool counts_only)
+{
+	unsigned int len = 0;
+	unsigned int i;
+
+	if (info->count == 0)
+		return 0;
+	if (counts_only)
+		return nla_total_size(2 * nla_total_size(sizeof(u32)));
+
+	for (i = 0; i < info->count; i++) {
+		const char *str = info->strings[i];
+
+		/* ETHTOOL_A_STRING_INDEX, ETHTOOL_A_STRING_VALUE, nest */
+		len += nla_total_size(nla_total_size(sizeof(u32)) +
+				      ethnl_strz_size(str));
+	}
+	/* ETHTOOL_A_STRINGSET_ID, ETHTOOL_A_STRINGSET_COUNT */
+	len = 2 * nla_total_size(sizeof(u32)) + nla_total_size(len);
+
+	return nla_total_size(len);
+}
+
+static int strset_reply_size(const struct ethnl_req_info *req_base,
+			     const struct ethnl_reply_data *reply_base)
+{
+	const struct strset_req_info *req_info = STRSET_REQINFO(req_base);
+	const struct strset_reply_data *data = STRSET_REPDATA(reply_base);
+	unsigned int i;
+	int len = 0;
+	int ret;
+
+	len += ethnl_reply_header_size();
+	for (i = 0; i < ETH_SS_COUNT; i++) {
+		const struct strset_info *set_info = &data->sets[i];
+
+		if (!strset_include(req_info, data, i))
+			continue;
+
+		ret = strset_set_size(set_info, req_info->counts_only);
+		if (ret < 0)
+			return ret;
+		len += ret;
+	}
+
+	return len;
+}
+
+/* fill one string into reply */
+static int strset_fill_string(struct sk_buff *skb,
+			      const struct strset_info *set_info, u32 idx)
+{
+	struct nlattr *string_attr;
+	const char *value;
+
+	value = set_info->strings[idx];
+
+	string_attr = nla_nest_start(skb, ETHTOOL_A_STRINGS_STRING);
+	if (!string_attr)
+		return -EMSGSIZE;
+	if (nla_put_u32(skb, ETHTOOL_A_STRING_INDEX, idx) ||
+	    ethnl_put_strz(skb, ETHTOOL_A_STRING_VALUE, value))
+		goto nla_put_failure;
+	nla_nest_end(skb, string_attr);
+
+	return 0;
+nla_put_failure:
+	nla_nest_cancel(skb, string_attr);
+	return -EMSGSIZE;
+}
+
+/* fill one string set into reply */
+static int strset_fill_set(struct sk_buff *skb,
+			   const struct strset_info *set_info, u32 id,
+			   bool counts_only)
+{
+	struct nlattr *stringset_attr;
+	struct nlattr *strings_attr;
+	unsigned int i;
+
+	if (!set_info->per_dev && !set_info->strings)
+		return -EOPNOTSUPP;
+	if (set_info->count == 0)
+		return 0;
+	stringset_attr = nla_nest_start(skb, ETHTOOL_A_STRINGSETS_STRINGSET);
+	if (!stringset_attr)
+		return -EMSGSIZE;
+
+	if (nla_put_u32(skb, ETHTOOL_A_STRINGSET_ID, id) ||
+	    nla_put_u32(skb, ETHTOOL_A_STRINGSET_COUNT, set_info->count))
+		goto nla_put_failure;
+
+	if (!counts_only) {
+		strings_attr = nla_nest_start(skb, ETHTOOL_A_STRINGSET_STRINGS);
+		if (!strings_attr)
+			goto nla_put_failure;
+		for (i = 0; i < set_info->count; i++) {
+			if (strset_fill_string(skb, set_info, i) < 0)
+				goto nla_put_failure;
+		}
+		nla_nest_end(skb, strings_attr);
+	}
+
+	nla_nest_end(skb, stringset_attr);
+	return 0;
+
+nla_put_failure:
+	nla_nest_cancel(skb, stringset_attr);
+	return -EMSGSIZE;
+}
+
+static int strset_fill_reply(struct sk_buff *skb,
+			     const struct ethnl_req_info *req_base,
+			     const struct ethnl_reply_data *reply_base)
+{
+	const struct strset_req_info *req_info = STRSET_REQINFO(req_base);
+	const struct strset_reply_data *data = STRSET_REPDATA(reply_base);
+	struct nlattr *nest;
+	unsigned int i;
+	int ret;
+
+	nest = nla_nest_start(skb, ETHTOOL_A_STRSET_STRINGSETS);
+	if (!nest)
+		return -EMSGSIZE;
+
+	for (i = 0; i < ETH_SS_COUNT; i++) {
+		if (strset_include(req_info, data, i)) {
+			ret = strset_fill_set(skb, &data->sets[i], i,
+					      req_info->counts_only);
+			if (ret < 0)
+				goto nla_put_failure;
+		}
+	}
+
+	nla_nest_end(skb, nest);
+	return 0;
+
+nla_put_failure:
+	nla_nest_cancel(skb, nest);
+	return ret;
+}
+
+const struct ethnl_request_ops ethnl_strset_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_STRSET_GET,
+	.reply_cmd		= ETHTOOL_MSG_STRSET_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_STRSET_HEADER,
+	.max_attr		= ETHTOOL_A_STRSET_MAX,
+	.req_info_size		= sizeof(struct strset_req_info),
+	.reply_data_size	= sizeof(struct strset_reply_data),
+	.request_policy		= strset_get_policy,
+	.allow_nodev_do		= true,
+
+	.parse_request		= strset_parse_request,
+	.prepare_data		= strset_prepare_data,
+	.reply_size		= strset_reply_size,
+	.fill_reply		= strset_fill_reply,
+	.cleanup_data		= strset_cleanup_data,
+};
-- 
cgit v1.2.3
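
Reviewer note, not part of the patch: the STRSET_GET documentation above
distinguishes three request types by the presence of a device and of
NLM_F_DUMP, and strset_include() decides per string set whether it goes
into a reply. The userspace sketch below models only that decision. The
names SS_*, set_model, model_sets and model_include are hypothetical
stand-ins chosen for illustration; they are not kernel symbols, and the
three-entry table is an assumption, not the real ETH_SS_* list.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's ETH_SS_* ids; only the
 * per-device versus global split matters for this sketch.
 */
enum { SS_TEST, SS_STATS, SS_FEATURES, SS_COUNT };

struct set_model {
	bool per_dev;		/* strings are provided by the driver */
	bool has_strings;	/* a global string table is available */
};

static const struct set_model model_sets[SS_COUNT] = {
	[SS_TEST]	= { .per_dev = true },
	[SS_STATS]	= { .per_dev = true },
	[SS_FEATURES]	= { .per_dev = false, .has_strings = true },
};

/* Model of the selection rule: an explicit request bitmap wins;
 * otherwise per-device sets are chosen when a device was given and
 * global sets when it was not.
 */
static bool model_include(uint32_t req_ids, bool have_dev, unsigned int id)
{
	const struct set_model *s = &model_sets[id];

	if (req_ids)
		return (req_ids & (UINT32_C(1) << id)) != 0;
	if (!s->per_dev && !s->has_strings)
		return false;
	return have_dev ? s->per_dev : !s->per_dev;
}
```

With an empty bitmap and no device, only the global set is selected; with
a device, only the per-device sets are. A caller must keep id below
SS_COUNT, which is the same bound the request parser has to enforce
before shifting by the id.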


From 459e0b81b37043545d90629fdfb243444151e77d Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:48 +0100
Subject: ethtool: provide link settings with LINKINFO_GET request

Implement LINKINFO_GET netlink request to get basic link settings provided
by ETHTOOL_GLINKSETTINGS and ETHTOOL_GSET ioctl commands.

This request provides settings not directly related to autonegotiation and
link mode selection: physical port, phy MDIO address, MDI(-X) status,
MDI(-X) control and transceiver.

LINKINFO_GET request can be used with NLM_F_DUMP (without device
identification) to request the information for all devices in current
network namespace providing the data.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 37 ++++++++++-
 include/uapi/linux/ethtool_netlink.h         | 18 ++++++
 net/ethtool/Makefile                         |  2 +-
 net/ethtool/common.c                         | 48 ++++++++++++++
 net/ethtool/common.h                         |  4 ++
 net/ethtool/ioctl.c                          | 48 --------------
 net/ethtool/linkinfo.c                       | 94 ++++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |  8 +++
 net/ethtool/netlink.h                        |  1 +
 9 files changed, 209 insertions(+), 51 deletions(-)
 create mode 100644 net/ethtool/linkinfo.c
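
Reviewer note, not part of the patch: linkinfo_reply_size() sums
nla_total_size(sizeof(u8)) for the five u8 attributes of the reply. As a
rough check of what that comes to, the sketch below redoes the netlink
attribute alignment arithmetic in userspace. The sk_* helpers are
hypothetical local copies written for this note, assuming the usual
4-byte attribute header and 4-byte alignment; they are not the kernel
functions themselves.

```c
#include <stddef.h>

/* Round up to the 4-byte netlink attribute alignment. */
static size_t sk_nla_align(size_t len)
{
	return (len + 3) & ~(size_t)3;
}

/* Attribute header (u16 length + u16 type) plus payload, padded. */
static size_t sk_nla_total_size(size_t payload)
{
	return sk_nla_align(4 + payload);
}

/* PORT, PHYADDR, TP_MDIX, TP_MDIX_CTRL, TRANSCEIVER: five u8 values. */
static size_t sk_linkinfo_reply_size(void)
{
	return 5 * sk_nla_total_size(1);
}
```

Under these assumptions each u8 attribute occupies 8 bytes on the wire,
so the attribute payload of one reply is 40 bytes, excluding the reply
header nest that the common code accounts for separately.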

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 3912cb0eb9c6..60684786a92c 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -180,12 +180,14 @@ Userspace to kernel:
 
   ===================================== ================================
   ``ETHTOOL_MSG_STRSET_GET``            get string set
+  ``ETHTOOL_MSG_LINKINFO_GET``          get link settings
   ===================================== ================================
 
 Kernel to userspace:
 
   ===================================== ================================
   ``ETHTOOL_MSG_STRSET_GET_REPLY``      string set contents
+  ``ETHTOOL_MSG_LINKINFO_GET_REPLY``    link settings
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -278,6 +280,37 @@ Flag ``ETHTOOL_A_STRSET_COUNTS_ONLY`` tells kernel to only return string
 counts of the sets, not the actual strings.
 
 
+LINKINFO_GET
+============
+
+Requests link settings as provided by ``ETHTOOL_GLINKSETTINGS`` except for
+link modes and autonegotiation related information. The request does not use
+any attributes other than the request header.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKINFO_HEADER``         nested  request header
+  ====================================  ======  ==========================
+
+Kernel response contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKINFO_HEADER``         nested  reply header
+  ``ETHTOOL_A_LINKINFO_PORT``           u8      physical port
+  ``ETHTOOL_A_LINKINFO_PHYADDR``        u8      phy MDIO address
+  ``ETHTOOL_A_LINKINFO_TP_MDIX``        u8      MDI(-X) status
+  ``ETHTOOL_A_LINKINFO_TP_MDIX_CTRL``   u8      MDI(-X) control
+  ``ETHTOOL_A_LINKINFO_TRANSCEIVER``    u8      transceiver
+  ====================================  ======  ==========================
+
+Attributes and their values have the same meaning as matching members of the
+corresponding ioctl structures.
+
+``LINKINFO_GET`` allows dump requests (kernel returns a reply message for
+each device supporting the request).
+
+
 Request translation
 ===================
 
@@ -288,7 +321,7 @@ have their netlink replacement yet.
   =================================== =====================================
   ioctl command                       netlink command
   =================================== =====================================
-  ``ETHTOOL_GSET``                    n/a
+  ``ETHTOOL_GSET``                    ``ETHTOOL_MSG_LINKINFO_GET``
   ``ETHTOOL_SSET``                    n/a
   ``ETHTOOL_GDRVINFO``                n/a
   ``ETHTOOL_GREGS``                   n/a
@@ -362,7 +395,7 @@ have their netlink replacement yet.
   ``ETHTOOL_STUNABLE``                n/a
   ``ETHTOOL_GPHYSTATS``               n/a
   ``ETHTOOL_PERQUEUE``                n/a
-  ``ETHTOOL_GLINKSETTINGS``           n/a
+  ``ETHTOOL_GLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_GET``
   ``ETHTOOL_SLINKSETTINGS``           n/a
   ``ETHTOOL_PHY_GTUNABLE``            n/a
   ``ETHTOOL_PHY_STUNABLE``            n/a
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index cabef1fec42a..1966532993e5 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -15,6 +15,7 @@
 enum {
 	ETHTOOL_MSG_USER_NONE,
 	ETHTOOL_MSG_STRSET_GET,
+	ETHTOOL_MSG_LINKINFO_GET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -25,6 +26,7 @@ enum {
 enum {
 	ETHTOOL_MSG_KERNEL_NONE,
 	ETHTOOL_MSG_STRSET_GET_REPLY,
+	ETHTOOL_MSG_LINKINFO_GET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -141,6 +143,22 @@ enum {
 	ETHTOOL_A_STRSET_MAX = __ETHTOOL_A_STRSET_CNT - 1
 };
 
+/* LINKINFO */
+
+enum {
+	ETHTOOL_A_LINKINFO_UNSPEC,
+	ETHTOOL_A_LINKINFO_HEADER,		/* nest - _A_HEADER_* */
+	ETHTOOL_A_LINKINFO_PORT,		/* u8 */
+	ETHTOOL_A_LINKINFO_PHYADDR,		/* u8 */
+	ETHTOOL_A_LINKINFO_TP_MDIX,		/* u8 */
+	ETHTOOL_A_LINKINFO_TP_MDIX_CTRL,	/* u8 */
+	ETHTOOL_A_LINKINFO_TRANSCEIVER,		/* u8 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_LINKINFO_CNT,
+	ETHTOOL_A_LINKINFO_MAX = __ETHTOOL_A_LINKINFO_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index efcc42c34d62..765736ec52c0 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -4,4 +4,4 @@ obj-y				+= ioctl.o common.o
 
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
-ethtool_nl-y	:= netlink.o bitset.o strset.o
+ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 0a8728565356..1d4a0aeff2cb 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -169,3 +169,51 @@ const char link_mode_names[][ETH_GSTRING_LEN] = {
 	__DEFINE_LINK_MODE_NAME(400000, CR8, Full),
 };
 static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS);
+
+/* return false if legacy contained non-0 deprecated fields
+ * maxtxpkt/maxrxpkt. rest of ksettings always updated
+ */
+bool
+convert_legacy_settings_to_link_ksettings(
+	struct ethtool_link_ksettings *link_ksettings,
+	const struct ethtool_cmd *legacy_settings)
+{
+	bool retval = true;
+
+	memset(link_ksettings, 0, sizeof(*link_ksettings));
+
+	/* This is used to tell users that driver is still using these
+	 * deprecated legacy fields, and they should not use
+	 * %ETHTOOL_GLINKSETTINGS/%ETHTOOL_SLINKSETTINGS
+	 */
+	if (legacy_settings->maxtxpkt ||
+	    legacy_settings->maxrxpkt)
+		retval = false;
+
+	ethtool_convert_legacy_u32_to_link_mode(
+		link_ksettings->link_modes.supported,
+		legacy_settings->supported);
+	ethtool_convert_legacy_u32_to_link_mode(
+		link_ksettings->link_modes.advertising,
+		legacy_settings->advertising);
+	ethtool_convert_legacy_u32_to_link_mode(
+		link_ksettings->link_modes.lp_advertising,
+		legacy_settings->lp_advertising);
+	link_ksettings->base.speed
+		= ethtool_cmd_speed(legacy_settings);
+	link_ksettings->base.duplex
+		= legacy_settings->duplex;
+	link_ksettings->base.port
+		= legacy_settings->port;
+	link_ksettings->base.phy_address
+		= legacy_settings->phy_address;
+	link_ksettings->base.autoneg
+		= legacy_settings->autoneg;
+	link_ksettings->base.mdio_support
+		= legacy_settings->mdio_support;
+	link_ksettings->base.eth_tp_mdix
+		= legacy_settings->eth_tp_mdix;
+	link_ksettings->base.eth_tp_mdix_ctrl
+		= legacy_settings->eth_tp_mdix_ctrl;
+	return retval;
+}
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index bbb788908cb1..c8a237402729 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -19,4 +19,8 @@ extern const char
 phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN];
 extern const char link_mode_names[][ETH_GSTRING_LEN];
 
+bool convert_legacy_settings_to_link_ksettings(
+	struct ethtool_link_ksettings *link_ksettings,
+	const struct ethtool_cmd *legacy_settings);
+
 #endif /* _ETHTOOL_COMMON_H */
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 88f7cddf5a6f..8a0a13b478e0 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -358,54 +358,6 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
 }
 EXPORT_SYMBOL(ethtool_convert_link_mode_to_legacy_u32);
 
-/* return false if legacy contained non-0 deprecated fields
- * maxtxpkt/maxrxpkt. rest of ksettings always updated
- */
-static bool
-convert_legacy_settings_to_link_ksettings(
-	struct ethtool_link_ksettings *link_ksettings,
-	const struct ethtool_cmd *legacy_settings)
-{
-	bool retval = true;
-
-	memset(link_ksettings, 0, sizeof(*link_ksettings));
-
-	/* This is used to tell users that driver is still using these
-	 * deprecated legacy fields, and they should not use
-	 * %ETHTOOL_GLINKSETTINGS/%ETHTOOL_SLINKSETTINGS
-	 */
-	if (legacy_settings->maxtxpkt ||
-	    legacy_settings->maxrxpkt)
-		retval = false;
-
-	ethtool_convert_legacy_u32_to_link_mode(
-		link_ksettings->link_modes.supported,
-		legacy_settings->supported);
-	ethtool_convert_legacy_u32_to_link_mode(
-		link_ksettings->link_modes.advertising,
-		legacy_settings->advertising);
-	ethtool_convert_legacy_u32_to_link_mode(
-		link_ksettings->link_modes.lp_advertising,
-		legacy_settings->lp_advertising);
-	link_ksettings->base.speed
-		= ethtool_cmd_speed(legacy_settings);
-	link_ksettings->base.duplex
-		= legacy_settings->duplex;
-	link_ksettings->base.port
-		= legacy_settings->port;
-	link_ksettings->base.phy_address
-		= legacy_settings->phy_address;
-	link_ksettings->base.autoneg
-		= legacy_settings->autoneg;
-	link_ksettings->base.mdio_support
-		= legacy_settings->mdio_support;
-	link_ksettings->base.eth_tp_mdix
-		= legacy_settings->eth_tp_mdix;
-	link_ksettings->base.eth_tp_mdix_ctrl
-		= legacy_settings->eth_tp_mdix_ctrl;
-	return retval;
-}
-
 /* return false if ksettings link modes had higher bits
  * set. legacy_settings always updated (best effort)
  */
diff --git a/net/ethtool/linkinfo.c b/net/ethtool/linkinfo.c
new file mode 100644
index 000000000000..f52a31af6fff
--- /dev/null
+++ b/net/ethtool/linkinfo.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+
+struct linkinfo_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct linkinfo_reply_data {
+	struct ethnl_reply_data		base;
+	struct ethtool_link_ksettings	ksettings;
+	struct ethtool_link_settings	*lsettings;
+};
+
+#define LINKINFO_REPDATA(__reply_base) \
+	container_of(__reply_base, struct linkinfo_reply_data, base)
+
+static const struct nla_policy
+linkinfo_get_policy[ETHTOOL_A_LINKINFO_MAX + 1] = {
+	[ETHTOOL_A_LINKINFO_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_LINKINFO_PORT]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_PHYADDR]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_TP_MDIX]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_TP_MDIX_CTRL]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_TRANSCEIVER]	= { .type = NLA_REJECT },
+};
+
+static int linkinfo_prepare_data(const struct ethnl_req_info *req_base,
+				 struct ethnl_reply_data *reply_base,
+				 struct genl_info *info)
+{
+	struct linkinfo_reply_data *data = LINKINFO_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	data->lsettings = &data->ksettings.base;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+	ret = __ethtool_get_link_ksettings(dev, &data->ksettings);
+	if (ret < 0 && info)
+		GENL_SET_ERR_MSG(info, "failed to retrieve link settings");
+	ethnl_ops_complete(dev);
+
+	return ret;
+}
+
+static int linkinfo_reply_size(const struct ethnl_req_info *req_base,
+			       const struct ethnl_reply_data *reply_base)
+{
+	return nla_total_size(sizeof(u8)) /* LINKINFO_PORT */
+		+ nla_total_size(sizeof(u8)) /* LINKINFO_PHYADDR */
+		+ nla_total_size(sizeof(u8)) /* LINKINFO_TP_MDIX */
+		+ nla_total_size(sizeof(u8)) /* LINKINFO_TP_MDIX_CTRL */
+		+ nla_total_size(sizeof(u8)) /* LINKINFO_TRANSCEIVER */
+		+ 0;
+}
+
+static int linkinfo_fill_reply(struct sk_buff *skb,
+			       const struct ethnl_req_info *req_base,
+			       const struct ethnl_reply_data *reply_base)
+{
+	const struct linkinfo_reply_data *data = LINKINFO_REPDATA(reply_base);
+
+	if (nla_put_u8(skb, ETHTOOL_A_LINKINFO_PORT, data->lsettings->port) ||
+	    nla_put_u8(skb, ETHTOOL_A_LINKINFO_PHYADDR,
+		       data->lsettings->phy_address) ||
+	    nla_put_u8(skb, ETHTOOL_A_LINKINFO_TP_MDIX,
+		       data->lsettings->eth_tp_mdix) ||
+	    nla_put_u8(skb, ETHTOOL_A_LINKINFO_TP_MDIX_CTRL,
+		       data->lsettings->eth_tp_mdix_ctrl) ||
+	    nla_put_u8(skb, ETHTOOL_A_LINKINFO_TRANSCEIVER,
+		       data->lsettings->transceiver))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_linkinfo_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_LINKINFO_GET,
+	.reply_cmd		= ETHTOOL_MSG_LINKINFO_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_LINKINFO_HEADER,
+	.max_attr		= ETHTOOL_A_LINKINFO_MAX,
+	.req_info_size		= sizeof(struct linkinfo_req_info),
+	.reply_data_size	= sizeof(struct linkinfo_reply_data),
+	.request_policy		= linkinfo_get_policy,
+
+	.prepare_data		= linkinfo_prepare_data,
+	.reply_size		= linkinfo_reply_size,
+	.fill_reply		= linkinfo_fill_reply,
+};
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 7dc082bde670..ce86d97c922e 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -195,6 +195,7 @@ struct ethnl_dump_ctx {
 static const struct ethnl_request_ops *
 ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_STRSET_GET]	= &ethnl_strset_request_ops,
+	[ETHTOOL_MSG_LINKINFO_GET]	= &ethnl_linkinfo_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -526,6 +527,13 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.dumpit	= ethnl_default_dumpit,
 		.done	= ethnl_default_done,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_LINKINFO_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 44e9f63aefb7..9fc8f94d8dce 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -329,5 +329,6 @@ struct ethnl_request_ops {
 /* request handlers */
 
 extern const struct ethnl_request_ops ethnl_strset_request_ops;
+extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
-- 
cgit v1.2.3


From a53f3d41e4d3df46aba254ba51e7fbe45470fece Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:55:53 +0100
Subject: ethtool: set link settings with LINKINFO_SET request

Implement LINKINFO_SET netlink request to set link settings queried by
LINKINFO_GET message.

Only physical port, phy MDIO address and MDI(-X) control can be set; an
attempt to modify MDI(-X) status or transceiver is rejected.
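
The split between settable and rejected attributes can be pictured with a
small standalone userspace sketch. This is not the kernel code: the enum
and function names below are hypothetical stand-ins for the
ETHTOOL_A_LINKINFO_* attributes and the NLA_U8 / NLA_REJECT policy entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for the ETHTOOL_A_LINKINFO_* attributes. */
enum linkinfo_attr {
	ATTR_UNSPEC,
	ATTR_HEADER,		/* identifies the device, not a setting */
	ATTR_PORT,
	ATTR_PHYADDR,
	ATTR_TP_MDIX,		/* status, read-only */
	ATTR_TP_MDIX_CTRL,
	ATTR_TRANSCEIVER,	/* read-only */
	ATTR_CNT
};

/* true for attributes whose value a SET request may change */
static const bool linkinfo_settable[ATTR_CNT] = {
	[ATTR_PORT]		= true,
	[ATTR_PHYADDR]		= true,
	[ATTR_TP_MDIX_CTRL]	= true,
};

/* Out-of-range attribute numbers are treated as not settable. */
bool linkinfo_attr_settable(int attr)
{
	return attr >= 0 && attr < ATTR_CNT && linkinfo_settable[attr];
}

/* Usage example: the three settable attributes versus the rejected ones. */
void linkinfo_selfcheck(void)
{
	assert(linkinfo_attr_settable(ATTR_PORT));
	assert(linkinfo_attr_settable(ATTR_PHYADDR));
	assert(linkinfo_attr_settable(ATTR_TP_MDIX_CTRL));
	assert(!linkinfo_attr_settable(ATTR_TP_MDIX));
	assert(!linkinfo_attr_settable(ATTR_TRANSCEIVER));
}
```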

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 24 +++++++++-
 include/uapi/linux/ethtool_netlink.h         |  1 +
 net/ethtool/linkinfo.c                       | 71 ++++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |  5 ++
 net/ethtool/netlink.h                        |  2 +
 5 files changed, 101 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 60684786a92c..39c4aba324c3 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -181,6 +181,7 @@ Userspace to kernel:
   ===================================== ================================
   ``ETHTOOL_MSG_STRSET_GET``            get string set
   ``ETHTOOL_MSG_LINKINFO_GET``          get link settings
+  ``ETHTOOL_MSG_LINKINFO_SET``          set link settings
   ===================================== ================================
 
 Kernel to userspace:
@@ -311,6 +312,25 @@ corresponding ioctl structures.
 devices supporting the request).
 
 
+LINKINFO_SET
+============
+
+``LINKINFO_SET`` request allows setting some of the attributes reported by
+``LINKINFO_GET``.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKINFO_HEADER``         nested  request header
+  ``ETHTOOL_A_LINKINFO_PORT``           u8      physical port
+  ``ETHTOOL_A_LINKINFO_PHYADDR``        u8      phy MDIO address
+  ``ETHTOOL_A_LINKINFO_TP_MDIX_CTRL``   u8      MDI(-X) control
+  ====================================  ======  ==========================
+
+MDI(-X) status and transceiver cannot be set, request with the corresponding
+attributes is rejected.
+
+
 Request translation
 ===================
 
@@ -322,7 +342,7 @@ have their netlink replacement yet.
   ioctl command                       netlink command
   =================================== =====================================
   ``ETHTOOL_GSET``                    ``ETHTOOL_MSG_LINKINFO_GET``
-  ``ETHTOOL_SSET``                    n/a
+  ``ETHTOOL_SSET``                    ``ETHTOOL_MSG_LINKINFO_SET``
   ``ETHTOOL_GDRVINFO``                n/a
   ``ETHTOOL_GREGS``                   n/a
   ``ETHTOOL_GWOL``                    n/a
@@ -396,7 +416,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GPHYSTATS``               n/a
   ``ETHTOOL_PERQUEUE``                n/a
   ``ETHTOOL_GLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_GET``
-  ``ETHTOOL_SLINKSETTINGS``           n/a
+  ``ETHTOOL_SLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_SET``
   ``ETHTOOL_PHY_GTUNABLE``            n/a
   ``ETHTOOL_PHY_STUNABLE``            n/a
   ``ETHTOOL_GFECPARAM``               n/a
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 1966532993e5..5b7806a5bef8 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -16,6 +16,7 @@ enum {
 	ETHTOOL_MSG_USER_NONE,
 	ETHTOOL_MSG_STRSET_GET,
 	ETHTOOL_MSG_LINKINFO_GET,
+	ETHTOOL_MSG_LINKINFO_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
diff --git a/net/ethtool/linkinfo.c b/net/ethtool/linkinfo.c
index f52a31af6fff..8a5f68f92425 100644
--- a/net/ethtool/linkinfo.c
+++ b/net/ethtool/linkinfo.c
@@ -92,3 +92,74 @@ const struct ethnl_request_ops ethnl_linkinfo_request_ops = {
 	.reply_size		= linkinfo_reply_size,
 	.fill_reply		= linkinfo_fill_reply,
 };
+
+/* LINKINFO_SET */
+
+static const struct nla_policy
+linkinfo_set_policy[ETHTOOL_A_LINKINFO_MAX + 1] = {
+	[ETHTOOL_A_LINKINFO_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_LINKINFO_PORT]		= { .type = NLA_U8 },
+	[ETHTOOL_A_LINKINFO_PHYADDR]		= { .type = NLA_U8 },
+	[ETHTOOL_A_LINKINFO_TP_MDIX]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKINFO_TP_MDIX_CTRL]	= { .type = NLA_U8 },
+	[ETHTOOL_A_LINKINFO_TRANSCEIVER]	= { .type = NLA_REJECT },
+};
+
+int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info)
+{
+	struct nlattr *tb[ETHTOOL_A_LINKINFO_MAX + 1];
+	struct ethtool_link_ksettings ksettings = {};
+	struct ethtool_link_settings *lsettings;
+	struct ethnl_req_info req_info = {};
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = nlmsg_parse(info->nlhdr, GENL_HDRLEN, tb,
+			  ETHTOOL_A_LINKINFO_MAX, linkinfo_set_policy,
+			  info->extack);
+	if (ret < 0)
+		return ret;
+	ret = ethnl_parse_header(&req_info, tb[ETHTOOL_A_LINKINFO_HEADER],
+				 genl_info_net(info), info->extack, true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+	if (!dev->ethtool_ops->get_link_ksettings ||
+	    !dev->ethtool_ops->set_link_ksettings)
+		return -EOPNOTSUPP;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	ret = __ethtool_get_link_ksettings(dev, &ksettings);
+	if (ret < 0) {
+		if (info)
+			GENL_SET_ERR_MSG(info, "failed to retrieve link settings");
+		goto out_ops;
+	}
+	lsettings = &ksettings.base;
+
+	ethnl_update_u8(&lsettings->port, tb[ETHTOOL_A_LINKINFO_PORT], &mod);
+	ethnl_update_u8(&lsettings->phy_address, tb[ETHTOOL_A_LINKINFO_PHYADDR],
+			&mod);
+	ethnl_update_u8(&lsettings->eth_tp_mdix_ctrl,
+			tb[ETHTOOL_A_LINKINFO_TP_MDIX_CTRL], &mod);
+	ret = 0;
+	if (!mod)
+		goto out_ops;
+
+	ret = dev->ethtool_ops->set_link_ksettings(dev, &ksettings);
+	if (ret < 0)
+		GENL_SET_ERR_MSG(info, "link settings update failed");
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+	dev_put(dev);
+	return ret;
+}
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index ce86d97c922e..7867425956f6 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -534,6 +534,11 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.dumpit	= ethnl_default_dumpit,
 		.done	= ethnl_default_done,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_LINKINFO_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_linkinfo,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 9fc8f94d8dce..bbe5fe60a023 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -331,4 +331,6 @@ struct ethnl_request_ops {
 extern const struct ethnl_request_ops ethnl_strset_request_ops;
 extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
 
+int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
+
 #endif /* _NET_ETHTOOL_NETLINK_H */
-- 
cgit v1.2.3


From 73286734c1b0d009fd317c2a74e173cb22233dad Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:56:03 +0100
Subject: ethtool: add LINKINFO_NTF notification

Send ETHTOOL_MSG_LINKINFO_NTF notification message whenever device link
settings are modified using ETHTOOL_MSG_LINKINFO_SET netlink message or
ETHTOOL_SLINKSETTINGS or ETHTOOL_SSET ioctl commands.

The notification message has the same format as the reply to a LINKINFO_GET
request. The ETHTOOL_MSG_LINKINFO_SET netlink request only triggers the
notification if a setting actually changes; the ioctl command handlers do
not check for an actual change and trigger the notification whenever the
command succeeds.
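
The "notify only on change" behaviour of the netlink path can be sketched in
plain userspace C. This is a simplified model, not the kernel code: the
struct and function names are hypothetical, and a NULL pointer stands in for
an attribute that is absent from the request.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the settings a LINKINFO_SET request may change. */
struct sk_link_settings {
	uint8_t port;
	uint8_t phy_address;
	uint8_t eth_tp_mdix_ctrl;
};

/*
 * Copy *src to *dst if an attribute was supplied (src != NULL) and its
 * value differs; record the change in *mod. *mod is never reset to false,
 * so one flag can accumulate over several calls.
 */
void sk_update_u8(uint8_t *dst, const uint8_t *src, bool *mod)
{
	if (!src || *dst == *src)
		return;
	*dst = *src;
	*mod = true;
}

/* Returns true if a notification would be sent, i.e. something changed. */
bool sk_set_linkinfo(struct sk_link_settings *cur, const uint8_t *port,
		     const uint8_t *phy_address, const uint8_t *mdix_ctrl)
{
	bool mod = false;

	sk_update_u8(&cur->port, port, &mod);
	sk_update_u8(&cur->phy_address, phy_address, &mod);
	sk_update_u8(&cur->eth_tp_mdix_ctrl, mdix_ctrl, &mod);
	return mod;
}

/* Usage example: rewriting the current value does not count as a change. */
void sk_set_linkinfo_selfcheck(void)
{
	struct sk_link_settings s = { .port = 1, .phy_address = 2 };
	uint8_t same = 1, other = 3;

	assert(!sk_set_linkinfo(&s, &same, NULL, NULL));
	assert(sk_set_linkinfo(&s, &other, NULL, NULL));
	assert(s.port == 3);
}
```

The ioctl handlers have no such flag, which is why they notify on every
successful call.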

As all work is done by ethnl_default_notify() handler and callback
functions introduced to handle LINKINFO_GET requests, all that remains is
adding entries for ETHTOOL_MSG_LINKINFO_NTF into ethnl_notify_handlers and
ethnl_default_notify_ops lookup tables and calls to ethtool_notify() where
needed.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst |  1 +
 include/uapi/linux/ethtool_netlink.h         |  1 +
 net/ethtool/ioctl.c                          | 12 ++++++++++--
 net/ethtool/linkinfo.c                       |  2 ++
 net/ethtool/netlink.c                        |  2 ++
 5 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 39c4aba324c3..34254482d295 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -189,6 +189,7 @@ Kernel to userspace:
   ===================================== ================================
   ``ETHTOOL_MSG_STRSET_GET_REPLY``      string set contents
   ``ETHTOOL_MSG_LINKINFO_GET_REPLY``    link settings
+  ``ETHTOOL_MSG_LINKINFO_NTF``          link settings notification
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 5b7806a5bef8..d530fa30de36 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -28,6 +28,7 @@ enum {
 	ETHTOOL_MSG_KERNEL_NONE,
 	ETHTOOL_MSG_STRSET_GET_REPLY,
 	ETHTOOL_MSG_LINKINFO_GET_REPLY,
+	ETHTOOL_MSG_LINKINFO_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 8a0a13b478e0..11a467294a33 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -26,6 +26,7 @@
 #include <net/devlink.h>
 #include <net/xdp_sock.h>
 #include <net/flow_offload.h>
+#include <linux/ethtool_netlink.h>
 
 #include "common.h"
 
@@ -571,7 +572,10 @@ static int ethtool_set_link_ksettings(struct net_device *dev,
 	    != link_ksettings.base.link_mode_masks_nwords)
 		return -EINVAL;
 
-	return dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
+	err = dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
+	if (err >= 0)
+		ethtool_notify(dev, ETHTOOL_MSG_LINKINFO_NTF, NULL);
+	return err;
 }
 
 /* Query device for its ethtool_cmd settings.
@@ -620,6 +624,7 @@ static int ethtool_set_settings(struct net_device *dev, void __user *useraddr)
 {
 	struct ethtool_link_ksettings link_ksettings;
 	struct ethtool_cmd cmd;
+	int ret;
 
 	ASSERT_RTNL();
 
@@ -632,7 +637,10 @@ static int ethtool_set_settings(struct net_device *dev, void __user *useraddr)
 		return -EINVAL;
 	link_ksettings.base.link_mode_masks_nwords =
 		__ETHTOOL_LINK_MODE_MASK_NU32;
-	return dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
+	ret = dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
+	if (ret >= 0)
+		ethtool_notify(dev, ETHTOOL_MSG_LINKINFO_NTF, NULL);
+	return ret;
 }
 
 static noinline_for_stack int ethtool_get_drvinfo(struct net_device *dev,
diff --git a/net/ethtool/linkinfo.c b/net/ethtool/linkinfo.c
index 8a5f68f92425..5d16cb4e8693 100644
--- a/net/ethtool/linkinfo.c
+++ b/net/ethtool/linkinfo.c
@@ -155,6 +155,8 @@ int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info)
 	ret = dev->ethtool_ops->set_link_ksettings(dev, &ksettings);
 	if (ret < 0)
 		GENL_SET_ERR_MSG(info, "link settings update failed");
+	else
+		ethtool_notify(dev, ETHTOOL_MSG_LINKINFO_NTF, NULL);
 
 out_ops:
 	ethnl_ops_complete(dev);
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 057b67f8ba8c..942da4ebdfe9 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -509,6 +509,7 @@ static int ethnl_default_done(struct netlink_callback *cb)
 
 static const struct ethnl_request_ops *
 ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
+	[ETHTOOL_MSG_LINKINFO_NTF]	= &ethnl_linkinfo_request_ops,
 };
 
 /* default notification handler */
@@ -589,6 +590,7 @@ typedef void (*ethnl_notify_handler_t)(struct net_device *dev, unsigned int cmd,
 				       const void *data);
 
 static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
+	[ETHTOOL_MSG_LINKINFO_NTF]	= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
-- 
cgit v1.2.3


From f625aa9be8c10f2e4dc677837e240730a25feda7 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:56:08 +0100
Subject: ethtool: provide link mode information with LINKMODES_GET request

Implement LINKMODES_GET netlink request to get link modes related
information provided by ETHTOOL_GLINKSETTINGS and ETHTOOL_GSET ioctl
commands.

This request provides supported, advertised and peer advertised link modes,
autonegotiation flag, speed and duplex.
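
Two details of the reply are easy to picture in isolation: the link mode
mask is handled as an array of 32-bit words, and the peer bitset is left
out of the reply when no peer mode is set. The following userspace sketch
shows both; the names are hypothetical and it only models the idea, it is
not the kernel's bitmap implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Number of 32-bit words needed for nbits bits (rounded up). */
#define SK_MASK_NWORDS(nbits)	(((nbits) + 31) / 32)

/*
 * Return true if none of the first nbits bits is set. Bits beyond nbits in
 * the last word are ignored.
 */
bool sk_mask_empty(const uint32_t *words, unsigned int nbits)
{
	unsigned int full = nbits / 32;
	unsigned int rest = nbits % 32;
	unsigned int i;

	for (i = 0; i < full; i++)
		if (words[i])
			return false;
	if (rest && (words[full] & ((UINT32_C(1) << rest) - 1)))
		return false;
	return true;
}

/* Usage example with a hypothetical 40-bit mask stored in two words. */
void sk_mask_selfcheck(void)
{
	uint32_t none[2] = { 0, 0 };
	uint32_t bit39[2] = { 0, 0x80 };	/* last valid bit */

	assert(SK_MASK_NWORDS(40) == 2);
	assert(sk_mask_empty(none, 40));	/* peer bitset would be omitted */
	assert(!sk_mask_empty(bit39, 40));	/* peer bitset would be sent */
}
```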

LINKMODES_GET request can be used with NLM_F_DUMP (without device
identification) to request the information for all devices in current
network namespace providing the data.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst |  36 +++++++
 include/linux/ethtool_netlink.h              |   3 +
 include/uapi/linux/ethtool_netlink.h         |  18 ++++
 net/ethtool/Makefile                         |   2 +-
 net/ethtool/linkmodes.c                      | 140 +++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |   8 ++
 net/ethtool/netlink.h                        |   1 +
 7 files changed, 207 insertions(+), 1 deletion(-)
 create mode 100644 net/ethtool/linkmodes.c

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 34254482d295..04c55be0264c 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -182,6 +182,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_STRSET_GET``            get string set
   ``ETHTOOL_MSG_LINKINFO_GET``          get link settings
   ``ETHTOOL_MSG_LINKINFO_SET``          set link settings
+  ``ETHTOOL_MSG_LINKMODES_GET``         get link modes info
   ===================================== ================================
 
 Kernel to userspace:
@@ -190,6 +191,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_STRSET_GET_REPLY``      string set contents
   ``ETHTOOL_MSG_LINKINFO_GET_REPLY``    link settings
   ``ETHTOOL_MSG_LINKINFO_NTF``          link settings notification
+  ``ETHTOOL_MSG_LINKMODES_GET_REPLY``   link modes info
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -332,6 +334,38 @@ MDI(-X) status and transceiver cannot be set, request with the corresponding
 attributes is rejected.
 
 
+LINKMODES_GET
+=============
+
+Requests link modes (supported, advertised and peer advertised) and related
+information (autonegotiation status, link speed and duplex) as provided by
+``ETHTOOL_GLINKSETTINGS``. The request does not use any attributes.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKMODES_HEADER``        nested  request header
+  ====================================  ======  ==========================
+
+Kernel response contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKMODES_HEADER``        nested  reply header
+  ``ETHTOOL_A_LINKMODES_AUTONEG``       u8      autonegotiation status
+  ``ETHTOOL_A_LINKMODES_OURS``          bitset  advertised link modes
+  ``ETHTOOL_A_LINKMODES_PEER``          bitset  partner link modes
+  ``ETHTOOL_A_LINKMODES_SPEED``         u32     link speed (Mb/s)
+  ``ETHTOOL_A_LINKMODES_DUPLEX``        u8      duplex mode
+  ====================================  ======  ==========================
+
+For ``ETHTOOL_A_LINKMODES_OURS``, value represents advertised modes and mask
+represents supported modes. ``ETHTOOL_A_LINKMODES_PEER`` in the reply is a bit
+list.
+
+``LINKMODES_GET`` allows dump requests (kernel returns reply messages for all
+devices supporting the request).
+
+
 Request translation
 ===================
 
@@ -343,6 +377,7 @@ have their netlink replacement yet.
   ioctl command                       netlink command
   =================================== =====================================
   ``ETHTOOL_GSET``                    ``ETHTOOL_MSG_LINKINFO_GET``
+                                      ``ETHTOOL_MSG_LINKMODES_GET``
   ``ETHTOOL_SSET``                    ``ETHTOOL_MSG_LINKINFO_SET``
   ``ETHTOOL_GDRVINFO``                n/a
   ``ETHTOOL_GREGS``                   n/a
@@ -417,6 +452,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GPHYSTATS``               n/a
   ``ETHTOOL_PERQUEUE``                n/a
   ``ETHTOOL_GLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_GET``
+                                      ``ETHTOOL_MSG_LINKMODES_GET``
   ``ETHTOOL_SLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_SET``
   ``ETHTOOL_PHY_GTUNABLE``            n/a
   ``ETHTOOL_PHY_STUNABLE``            n/a
diff --git a/include/linux/ethtool_netlink.h b/include/linux/ethtool_netlink.h
index c98f6852c8eb..d01b77887f82 100644
--- a/include/linux/ethtool_netlink.h
+++ b/include/linux/ethtool_netlink.h
@@ -7,6 +7,9 @@
 #include <linux/ethtool.h>
 #include <linux/netdevice.h>
 
+#define __ETHTOOL_LINK_MODE_MASK_NWORDS \
+	DIV_ROUND_UP(__ETHTOOL_LINK_MODE_MASK_NBITS, 32)
+
 enum ethtool_multicast_groups {
 	ETHNL_MCGRP_MONITOR,
 };
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index d530fa30de36..dc1cae052eee 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -17,6 +17,7 @@ enum {
 	ETHTOOL_MSG_STRSET_GET,
 	ETHTOOL_MSG_LINKINFO_GET,
 	ETHTOOL_MSG_LINKINFO_SET,
+	ETHTOOL_MSG_LINKMODES_GET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -29,6 +30,7 @@ enum {
 	ETHTOOL_MSG_STRSET_GET_REPLY,
 	ETHTOOL_MSG_LINKINFO_GET_REPLY,
 	ETHTOOL_MSG_LINKINFO_NTF,
+	ETHTOOL_MSG_LINKMODES_GET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -161,6 +163,22 @@ enum {
 	ETHTOOL_A_LINKINFO_MAX = __ETHTOOL_A_LINKINFO_CNT - 1
 };
 
+/* LINKMODES */
+
+enum {
+	ETHTOOL_A_LINKMODES_UNSPEC,
+	ETHTOOL_A_LINKMODES_HEADER,		/* nest - _A_HEADER_* */
+	ETHTOOL_A_LINKMODES_AUTONEG,		/* u8 */
+	ETHTOOL_A_LINKMODES_OURS,		/* bitset */
+	ETHTOOL_A_LINKMODES_PEER,		/* bitset */
+	ETHTOOL_A_LINKMODES_SPEED,		/* u32 */
+	ETHTOOL_A_LINKMODES_DUPLEX,		/* u8 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_LINKMODES_CNT,
+	ETHTOOL_A_LINKMODES_MAX = __ETHTOOL_A_LINKMODES_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 765736ec52c0..8023da6672ce 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -4,4 +4,4 @@ obj-y				+= ioctl.o common.o
 
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
-ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o
+ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o
diff --git a/net/ethtool/linkmodes.c b/net/ethtool/linkmodes.c
new file mode 100644
index 000000000000..81856fa1e632
--- /dev/null
+++ b/net/ethtool/linkmodes.c
@@ -0,0 +1,140 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+#include "bitset.h"
+
+struct linkmodes_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct linkmodes_reply_data {
+	struct ethnl_reply_data		base;
+	struct ethtool_link_ksettings	ksettings;
+	struct ethtool_link_settings	*lsettings;
+	bool				peer_empty;
+};
+
+#define LINKMODES_REPDATA(__reply_base) \
+	container_of(__reply_base, struct linkmodes_reply_data, base)
+
+static const struct nla_policy
+linkmodes_get_policy[ETHTOOL_A_LINKMODES_MAX + 1] = {
+	[ETHTOOL_A_LINKMODES_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_LINKMODES_AUTONEG]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_OURS]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_PEER]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_REJECT },
+};
+
+static int linkmodes_prepare_data(const struct ethnl_req_info *req_base,
+				  struct ethnl_reply_data *reply_base,
+				  struct genl_info *info)
+{
+	struct linkmodes_reply_data *data = LINKMODES_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	data->lsettings = &data->ksettings.base;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+
+	ret = __ethtool_get_link_ksettings(dev, &data->ksettings);
+	if (ret < 0 && info) {
+		GENL_SET_ERR_MSG(info, "failed to retrieve link settings");
+		goto out;
+	}
+
+	data->peer_empty =
+		bitmap_empty(data->ksettings.link_modes.lp_advertising,
+			     __ETHTOOL_LINK_MODE_MASK_NBITS);
+
+out:
+	ethnl_ops_complete(dev);
+	return ret;
+}
+
+static int linkmodes_reply_size(const struct ethnl_req_info *req_base,
+				const struct ethnl_reply_data *reply_base)
+{
+	const struct linkmodes_reply_data *data = LINKMODES_REPDATA(reply_base);
+	const struct ethtool_link_ksettings *ksettings = &data->ksettings;
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	int len, ret;
+
+	len = nla_total_size(sizeof(u8)) /* LINKMODES_AUTONEG */
+		+ nla_total_size(sizeof(u32)) /* LINKMODES_SPEED */
+		+ nla_total_size(sizeof(u8)) /* LINKMODES_DUPLEX */
+		+ 0;
+	ret = ethnl_bitset_size(ksettings->link_modes.advertising,
+				ksettings->link_modes.supported,
+				__ETHTOOL_LINK_MODE_MASK_NBITS,
+				link_mode_names, compact);
+	if (ret < 0)
+		return ret;
+	len += ret;
+	if (!data->peer_empty) {
+		ret = ethnl_bitset_size(ksettings->link_modes.lp_advertising,
+					NULL, __ETHTOOL_LINK_MODE_MASK_NBITS,
+					link_mode_names, compact);
+		if (ret < 0)
+			return ret;
+		len += ret;
+	}
+
+	return len;
+}
+
+static int linkmodes_fill_reply(struct sk_buff *skb,
+				const struct ethnl_req_info *req_base,
+				const struct ethnl_reply_data *reply_base)
+{
+	const struct linkmodes_reply_data *data = LINKMODES_REPDATA(reply_base);
+	const struct ethtool_link_ksettings *ksettings = &data->ksettings;
+	const struct ethtool_link_settings *lsettings = &ksettings->base;
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	int ret;
+
+	if (nla_put_u8(skb, ETHTOOL_A_LINKMODES_AUTONEG, lsettings->autoneg))
+		return -EMSGSIZE;
+
+	ret = ethnl_put_bitset(skb, ETHTOOL_A_LINKMODES_OURS,
+			       ksettings->link_modes.advertising,
+			       ksettings->link_modes.supported,
+			       __ETHTOOL_LINK_MODE_MASK_NBITS, link_mode_names,
+			       compact);
+	if (ret < 0)
+		return -EMSGSIZE;
+	if (!data->peer_empty) {
+		ret = ethnl_put_bitset(skb, ETHTOOL_A_LINKMODES_PEER,
+				       ksettings->link_modes.lp_advertising,
+				       NULL, __ETHTOOL_LINK_MODE_MASK_NBITS,
+				       link_mode_names, compact);
+		if (ret < 0)
+			return -EMSGSIZE;
+	}
+
+	if (nla_put_u32(skb, ETHTOOL_A_LINKMODES_SPEED, lsettings->speed) ||
+	    nla_put_u8(skb, ETHTOOL_A_LINKMODES_DUPLEX, lsettings->duplex))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_linkmodes_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_LINKMODES_GET,
+	.reply_cmd		= ETHTOOL_MSG_LINKMODES_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_LINKMODES_HEADER,
+	.max_attr		= ETHTOOL_A_LINKMODES_MAX,
+	.req_info_size		= sizeof(struct linkmodes_req_info),
+	.reply_data_size	= sizeof(struct linkmodes_reply_data),
+	.request_policy		= linkmodes_get_policy,
+
+	.prepare_data		= linkmodes_prepare_data,
+	.reply_size		= linkmodes_reply_size,
+	.fill_reply		= linkmodes_fill_reply,
+};
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 942da4ebdfe9..703ff3a227a4 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -209,6 +209,7 @@ static const struct ethnl_request_ops *
 ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_STRSET_GET]	= &ethnl_strset_request_ops,
 	[ETHTOOL_MSG_LINKINFO_GET]	= &ethnl_linkinfo_request_ops,
+	[ETHTOOL_MSG_LINKMODES_GET]	= &ethnl_linkmodes_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -630,6 +631,13 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.flags	= GENL_UNS_ADMIN_PERM,
 		.doit	= ethnl_set_linkinfo,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_LINKMODES_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 5d56d7779a06..256d38972d1e 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -332,6 +332,7 @@ struct ethnl_request_ops {
 
 extern const struct ethnl_request_ops ethnl_strset_request_ops;
 extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
+extern const struct ethnl_request_ops ethnl_linkmodes_request_ops;
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 
-- 
cgit v1.2.3


From bfbcfe2032e70bd8598d680d39ac177d507e39ac Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:56:13 +0100
Subject: ethtool: set link modes related data with LINKMODES_SET request

Implement LINKMODES_SET netlink request to set advertised link modes and
related attributes as ETHTOOL_SLINKSETTINGS and ETHTOOL_SSET commands do.

The request allows setting autonegotiation flag, speed, duplex and
advertised link modes.
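
When autonegotiation is on and only speed and/or duplex are given, the
advertised modes are derived from the supported ones. A minimal userspace
sketch of that selection follows, assuming a tiny six-entry mode table and a
single 32-bit mask; all names and the "any" sentinels are hypothetical and
this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SK_SPEED_ANY	(-1)	/* speed not specified in the request */
#define SK_DUPLEX_HALF	0
#define SK_DUPLEX_FULL	1
#define SK_DUPLEX_ANY	0xff	/* duplex not specified in the request */

struct sk_mode_info {
	int	speed;		/* Mb/s */
	uint8_t	duplex;
};

/* Hypothetical table: the array index is the bit number in the mode mask. */
static const struct sk_mode_info sk_modes[] = {
	{ 10,   SK_DUPLEX_HALF },	/* bit 0 */
	{ 10,   SK_DUPLEX_FULL },	/* bit 1 */
	{ 100,  SK_DUPLEX_HALF },	/* bit 2 */
	{ 100,  SK_DUPLEX_FULL },	/* bit 3 */
	{ 1000, SK_DUPLEX_HALF },	/* bit 4 */
	{ 1000, SK_DUPLEX_FULL },	/* bit 5 */
};

#define SK_NMODES	(sizeof(sk_modes) / sizeof(sk_modes[0]))

/* Advertise every supported mode matching the requested speed and duplex. */
uint32_t sk_auto_advertise(uint32_t supported, int speed, uint8_t duplex)
{
	uint32_t adv = 0;
	unsigned int i;

	for (i = 0; i < SK_NMODES; i++) {
		if (!(supported & (UINT32_C(1) << i)))
			continue;
		if (speed != SK_SPEED_ANY && sk_modes[i].speed != speed)
			continue;
		if (duplex != SK_DUPLEX_ANY && sk_modes[i].duplex != duplex)
			continue;
		adv |= UINT32_C(1) << i;
	}
	return adv;
}

/* Usage example: all six modes supported, only speed 100 requested. */
void sk_auto_advertise_selfcheck(void)
{
	assert(sk_auto_advertise(0x3f, 100, SK_DUPLEX_ANY) == 0x0c);
}
```

A request naming a speed the device does not support yields an empty mask
in this model; how such a case is reported is left to the real handler.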

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst |  27 +++
 include/uapi/linux/ethtool_netlink.h         |   1 +
 net/ethtool/linkmodes.c                      | 235 +++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |   5 +
 net/ethtool/netlink.h                        |   1 +
 5 files changed, 269 insertions(+)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 04c55be0264c..625c80183563 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -183,6 +183,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_LINKINFO_GET``          get link settings
   ``ETHTOOL_MSG_LINKINFO_SET``          set link settings
   ``ETHTOOL_MSG_LINKMODES_GET``         get link modes info
+  ``ETHTOOL_MSG_LINKMODES_SET``         set link modes info
   ===================================== ================================
 
 Kernel to userspace:
@@ -366,6 +367,30 @@ list.
 devices supporting the request).
 
 
+LINKMODES_SET
+=============
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKMODES_HEADER``        nested  request header
+  ``ETHTOOL_A_LINKMODES_AUTONEG``       u8      autonegotiation status
+  ``ETHTOOL_A_LINKMODES_OURS``          bitset  advertised link modes
+  ``ETHTOOL_A_LINKMODES_PEER``          bitset  partner link modes
+  ``ETHTOOL_A_LINKMODES_SPEED``         u32     link speed (Mb/s)
+  ``ETHTOOL_A_LINKMODES_DUPLEX``        u8      duplex mode
+  ====================================  ======  ==========================
+
+``ETHTOOL_A_LINKMODES_OURS`` bit set allows setting advertised link modes. If
+autonegotiation is on (either set now or kept from before), advertised modes
+are not changed (no ``ETHTOOL_A_LINKMODES_OURS`` attribute) and at least one
+of speed and duplex is specified, kernel adjusts advertised modes to all
+supported modes matching speed, duplex or both (whatever is specified). This
+autoselection is done on ethtool side with ioctl interface, netlink interface
+is supposed to allow requesting changes without knowing what exactly kernel
+supports.
+
+
 Request translation
 ===================
 
@@ -379,6 +404,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GSET``                    ``ETHTOOL_MSG_LINKINFO_GET``
                                       ``ETHTOOL_MSG_LINKMODES_GET``
   ``ETHTOOL_SSET``                    ``ETHTOOL_MSG_LINKINFO_SET``
+                                      ``ETHTOOL_MSG_LINKMODES_SET``
   ``ETHTOOL_GDRVINFO``                n/a
   ``ETHTOOL_GREGS``                   n/a
   ``ETHTOOL_GWOL``                    n/a
@@ -454,6 +480,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_GET``
                                       ``ETHTOOL_MSG_LINKMODES_GET``
   ``ETHTOOL_SLINKSETTINGS``           ``ETHTOOL_MSG_LINKINFO_SET``
+                                      ``ETHTOOL_MSG_LINKMODES_SET``
   ``ETHTOOL_PHY_GTUNABLE``            n/a
   ``ETHTOOL_PHY_STUNABLE``            n/a
   ``ETHTOOL_GFECPARAM``               n/a
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index dc1cae052eee..cddf978b98df 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -18,6 +18,7 @@ enum {
 	ETHTOOL_MSG_LINKINFO_GET,
 	ETHTOOL_MSG_LINKINFO_SET,
 	ETHTOOL_MSG_LINKMODES_GET,
+	ETHTOOL_MSG_LINKMODES_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
diff --git a/net/ethtool/linkmodes.c b/net/ethtool/linkmodes.c
index 81856fa1e632..790b60771d0e 100644
--- a/net/ethtool/linkmodes.c
+++ b/net/ethtool/linkmodes.c
@@ -138,3 +138,238 @@ const struct ethnl_request_ops ethnl_linkmodes_request_ops = {
 	.reply_size		= linkmodes_reply_size,
 	.fill_reply		= linkmodes_fill_reply,
 };
+
+/* LINKMODES_SET */
+
+struct link_mode_info {
+	int				speed;
+	u8				duplex;
+};
+
+#define __DEFINE_LINK_MODE_PARAMS(_speed, _type, _duplex) \
+	[ETHTOOL_LINK_MODE(_speed, _type, _duplex)] = { \
+		.speed	= SPEED_ ## _speed, \
+		.duplex	= __DUPLEX_ ## _duplex \
+	}
+#define __DUPLEX_Half DUPLEX_HALF
+#define __DUPLEX_Full DUPLEX_FULL
+#define __DEFINE_SPECIAL_MODE_PARAMS(_mode) \
+	[ETHTOOL_LINK_MODE_ ## _mode ## _BIT] = { \
+		.speed	= SPEED_UNKNOWN, \
+		.duplex	= DUPLEX_UNKNOWN, \
+	}
+
+static const struct link_mode_info link_mode_params[] = {
+	__DEFINE_LINK_MODE_PARAMS(10, T, Half),
+	__DEFINE_LINK_MODE_PARAMS(10, T, Full),
+	__DEFINE_LINK_MODE_PARAMS(100, T, Half),
+	__DEFINE_LINK_MODE_PARAMS(100, T, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, T, Half),
+	__DEFINE_LINK_MODE_PARAMS(1000, T, Full),
+	__DEFINE_SPECIAL_MODE_PARAMS(Autoneg),
+	__DEFINE_SPECIAL_MODE_PARAMS(TP),
+	__DEFINE_SPECIAL_MODE_PARAMS(AUI),
+	__DEFINE_SPECIAL_MODE_PARAMS(MII),
+	__DEFINE_SPECIAL_MODE_PARAMS(FIBRE),
+	__DEFINE_SPECIAL_MODE_PARAMS(BNC),
+	__DEFINE_LINK_MODE_PARAMS(10000, T, Full),
+	__DEFINE_SPECIAL_MODE_PARAMS(Pause),
+	__DEFINE_SPECIAL_MODE_PARAMS(Asym_Pause),
+	__DEFINE_LINK_MODE_PARAMS(2500, X, Full),
+	__DEFINE_SPECIAL_MODE_PARAMS(Backplane),
+	__DEFINE_LINK_MODE_PARAMS(1000, KX, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, KX4, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, KR, Full),
+	[ETHTOOL_LINK_MODE_10000baseR_FEC_BIT] = {
+		.speed	= SPEED_10000,
+		.duplex = DUPLEX_FULL,
+	},
+	__DEFINE_LINK_MODE_PARAMS(20000, MLD2, Full),
+	__DEFINE_LINK_MODE_PARAMS(20000, KR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, KR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, CR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, SR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(40000, LR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, KR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, CR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, SR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(56000, LR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(25000, CR, Full),
+	__DEFINE_LINK_MODE_PARAMS(25000, KR, Full),
+	__DEFINE_LINK_MODE_PARAMS(25000, SR, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, CR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, KR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, KR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, SR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, CR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, LR4_ER4, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, SR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, X, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, CR, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, SR, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, LR, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, LRM, Full),
+	__DEFINE_LINK_MODE_PARAMS(10000, ER, Full),
+	__DEFINE_LINK_MODE_PARAMS(2500, T, Full),
+	__DEFINE_LINK_MODE_PARAMS(5000, T, Full),
+	__DEFINE_SPECIAL_MODE_PARAMS(FEC_NONE),
+	__DEFINE_SPECIAL_MODE_PARAMS(FEC_RS),
+	__DEFINE_SPECIAL_MODE_PARAMS(FEC_BASER),
+	__DEFINE_LINK_MODE_PARAMS(50000, KR, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, SR, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, CR, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, LR_ER_FR, Full),
+	__DEFINE_LINK_MODE_PARAMS(50000, DR, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, KR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, SR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, CR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, LR2_ER2_FR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(100000, DR2, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, KR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, SR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, LR4_ER4_FR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, DR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(200000, CR4, Full),
+	__DEFINE_LINK_MODE_PARAMS(100, T1, Full),
+	__DEFINE_LINK_MODE_PARAMS(1000, T1, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, KR8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, SR8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, LR8_ER8_FR8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, DR8, Full),
+	__DEFINE_LINK_MODE_PARAMS(400000, CR8, Full),
+};
+
+static const struct nla_policy
+linkmodes_set_policy[ETHTOOL_A_LINKMODES_MAX + 1] = {
+	[ETHTOOL_A_LINKMODES_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_LINKMODES_AUTONEG]		= { .type = NLA_U8 },
+	[ETHTOOL_A_LINKMODES_OURS]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_LINKMODES_PEER]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKMODES_SPEED]		= { .type = NLA_U32 },
+	[ETHTOOL_A_LINKMODES_DUPLEX]		= { .type = NLA_U8 },
+};
+
+/* Set advertised link modes to all supported modes matching requested speed
+ * and duplex values. Called when autonegotiation is on and speed or duplex is
+ * requested without an explicit link mode change. With the ioctl() interface
+ * this is done in userspace; for netlink it is moved into the kernel.
+ * Returns true if advertised modes bitmap was modified.
+ */
+static bool ethnl_auto_linkmodes(struct ethtool_link_ksettings *ksettings,
+				 bool req_speed, bool req_duplex)
+{
+	unsigned long *advertising = ksettings->link_modes.advertising;
+	unsigned long *supported = ksettings->link_modes.supported;
+	DECLARE_BITMAP(old_adv, __ETHTOOL_LINK_MODE_MASK_NBITS);
+	unsigned int i;
+
+	BUILD_BUG_ON(ARRAY_SIZE(link_mode_params) !=
+		     __ETHTOOL_LINK_MODE_MASK_NBITS);
+
+	bitmap_copy(old_adv, advertising, __ETHTOOL_LINK_MODE_MASK_NBITS);
+
+	for (i = 0; i < __ETHTOOL_LINK_MODE_MASK_NBITS; i++) {
+		const struct link_mode_info *info = &link_mode_params[i];
+
+		if (info->speed == SPEED_UNKNOWN)
+			continue;
+		if (test_bit(i, supported) &&
+		    (!req_speed || info->speed == ksettings->base.speed) &&
+		    (!req_duplex || info->duplex == ksettings->base.duplex))
+			set_bit(i, advertising);
+		else
+			clear_bit(i, advertising);
+	}
+
+	return !bitmap_equal(old_adv, advertising,
+			     __ETHTOOL_LINK_MODE_MASK_NBITS);
+}
+
+static int ethnl_update_linkmodes(struct genl_info *info, struct nlattr **tb,
+				  struct ethtool_link_ksettings *ksettings,
+				  bool *mod)
+{
+	struct ethtool_link_settings *lsettings = &ksettings->base;
+	bool req_speed, req_duplex;
+	int ret;
+
+	*mod = false;
+	req_speed = tb[ETHTOOL_A_LINKMODES_SPEED];
+	req_duplex = tb[ETHTOOL_A_LINKMODES_DUPLEX];
+
+	ethnl_update_u8(&lsettings->autoneg, tb[ETHTOOL_A_LINKMODES_AUTONEG],
+			mod);
+	ret = ethnl_update_bitset(ksettings->link_modes.advertising,
+				  __ETHTOOL_LINK_MODE_MASK_NBITS,
+				  tb[ETHTOOL_A_LINKMODES_OURS], link_mode_names,
+				  info->extack, mod);
+	if (ret < 0)
+		return ret;
+	ethnl_update_u32(&lsettings->speed, tb[ETHTOOL_A_LINKMODES_SPEED],
+			 mod);
+	ethnl_update_u8(&lsettings->duplex, tb[ETHTOOL_A_LINKMODES_DUPLEX],
+			mod);
+
+	if (!tb[ETHTOOL_A_LINKMODES_OURS] && lsettings->autoneg &&
+	    (req_speed || req_duplex) &&
+	    ethnl_auto_linkmodes(ksettings, req_speed, req_duplex))
+		*mod = true;
+
+	return 0;
+}
+
+int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info)
+{
+	struct nlattr *tb[ETHTOOL_A_LINKMODES_MAX + 1];
+	struct ethtool_link_ksettings ksettings = {};
+	struct ethtool_link_settings *lsettings;
+	struct ethnl_req_info req_info = {};
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = nlmsg_parse(info->nlhdr, GENL_HDRLEN, tb,
+			  ETHTOOL_A_LINKMODES_MAX, linkmodes_set_policy,
+			  info->extack);
+	if (ret < 0)
+		return ret;
+	ret = ethnl_parse_header(&req_info, tb[ETHTOOL_A_LINKMODES_HEADER],
+				 genl_info_net(info), info->extack, true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+	if (!dev->ethtool_ops->get_link_ksettings ||
+	    !dev->ethtool_ops->set_link_ksettings)
+		return -EOPNOTSUPP;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	ret = __ethtool_get_link_ksettings(dev, &ksettings);
+	if (ret < 0) {
+		if (info)
+			GENL_SET_ERR_MSG(info, "failed to retrieve link settings");
+		goto out_ops;
+	}
+	lsettings = &ksettings.base;
+
+	ret = ethnl_update_linkmodes(info, tb, &ksettings, &mod);
+	if (ret < 0)
+		goto out_ops;
+
+	if (mod) {
+		ret = dev->ethtool_ops->set_link_ksettings(dev, &ksettings);
+		if (ret < 0)
+			GENL_SET_ERR_MSG(info, "link settings update failed");
+	}
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+	dev_put(dev);
+	return ret;
+}
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 703ff3a227a4..5f28f3cb022d 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -638,6 +638,11 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.dumpit	= ethnl_default_dumpit,
 		.done	= ethnl_default_done,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_LINKMODES_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_linkmodes,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 256d38972d1e..1269cca8a002 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -335,5 +335,6 @@ extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
 extern const struct ethnl_request_ops ethnl_linkmodes_request_ops;
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
+int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
-- 
cgit v1.2.3


From 1b1b1847c8505df2e1dd8804838526bed22a8bd4 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:56:18 +0100
Subject: ethtool: add LINKMODES_NTF notification

Send ETHTOOL_MSG_LINKMODES_NTF notification message whenever device link
settings or advertised modes are modified using ETHTOOL_MSG_LINKMODES_SET
netlink message or ETHTOOL_SLINKSETTINGS or ETHTOOL_SSET ioctl commands.

The notification message has the same format as the reply to a
LINKMODES_GET request. An ETHTOOL_MSG_LINKMODES_SET netlink request
triggers the notification only if something actually changes. The ioctl
command handlers do not check for an actual change and trigger the
notification whenever the command succeeds.

As all work is done by ethnl_default_notify() handler and callback
functions introduced to handle LINKMODES_GET requests, all that remains is
adding entries for ETHTOOL_MSG_LINKMODES_NTF into ethnl_notify_handlers and
ethnl_default_notify_ops lookup tables and calls to ethtool_notify() where
needed.
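
The difference between the two notification policies can be modelled in
plain userspace C. This is only a sketch under stated assumptions, not the
kernel code: struct mock_settings, mock_notify(), mock_netlink_set() and
mock_ioctl_set() are hypothetical names, and the counter stands in for
ethtool_notify(dev, ETHTOOL_MSG_LINKMODES_NTF, NULL).

```c
#include <stdbool.h>

/* Hypothetical, reduced stand-in for struct ethtool_link_settings. */
struct mock_settings {
	unsigned int speed;
	unsigned char duplex;
	unsigned char autoneg;
};

/* Counts how many LINKMODES_NTF-style notifications would be sent. */
static unsigned int mock_ntf_count;

static void mock_notify(void)
{
	mock_ntf_count++;
}

/* Netlink-style SET: apply and notify only if something changes. */
static void mock_netlink_set(struct mock_settings *cur,
			     const struct mock_settings *req)
{
	bool mod = cur->speed != req->speed ||
		   cur->duplex != req->duplex ||
		   cur->autoneg != req->autoneg;

	if (!mod)
		return;
	*cur = *req;
	mock_notify();
}

/* ioctl-style SET: no change detection, notify on every successful call. */
static void mock_ioctl_set(struct mock_settings *cur,
			   const struct mock_settings *req)
{
	*cur = *req;
	mock_notify();
}
```

Repeating an identical request through mock_netlink_set() leaves the
counter untouched, while every mock_ioctl_set() call increments it.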

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 1 +
 include/uapi/linux/ethtool_netlink.h         | 1 +
 net/ethtool/ioctl.c                          | 8 ++++++--
 net/ethtool/linkmodes.c                      | 2 ++
 net/ethtool/netlink.c                        | 2 ++
 5 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 625c80183563..9d96d51e9360 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -193,6 +193,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_LINKINFO_GET_REPLY``    link settings
   ``ETHTOOL_MSG_LINKINFO_NTF``          link settings notification
   ``ETHTOOL_MSG_LINKMODES_GET_REPLY``   link modes info
+  ``ETHTOOL_MSG_LINKMODES_NTF``         link modes notification
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index cddf978b98df..35948df6d6e3 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -32,6 +32,7 @@ enum {
 	ETHTOOL_MSG_LINKINFO_GET_REPLY,
 	ETHTOOL_MSG_LINKINFO_NTF,
 	ETHTOOL_MSG_LINKMODES_GET_REPLY,
+	ETHTOOL_MSG_LINKMODES_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 11a467294a33..36e2ef2d900d 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -573,8 +573,10 @@ static int ethtool_set_link_ksettings(struct net_device *dev,
 		return -EINVAL;
 
 	err = dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
-	if (err >= 0)
+	if (err >= 0) {
 		ethtool_notify(dev, ETHTOOL_MSG_LINKINFO_NTF, NULL);
+		ethtool_notify(dev, ETHTOOL_MSG_LINKMODES_NTF, NULL);
+	}
 	return err;
 }
 
@@ -638,8 +640,10 @@ static int ethtool_set_settings(struct net_device *dev, void __user *useraddr)
 	link_ksettings.base.link_mode_masks_nwords =
 		__ETHTOOL_LINK_MODE_MASK_NU32;
 	ret = dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
-	if (ret >= 0)
+	if (ret >= 0) {
 		ethtool_notify(dev, ETHTOOL_MSG_LINKINFO_NTF, NULL);
+		ethtool_notify(dev, ETHTOOL_MSG_LINKMODES_NTF, NULL);
+	}
 	return ret;
 }
 
diff --git a/net/ethtool/linkmodes.c b/net/ethtool/linkmodes.c
index 790b60771d0e..0b99f494ad3b 100644
--- a/net/ethtool/linkmodes.c
+++ b/net/ethtool/linkmodes.c
@@ -364,6 +364,8 @@ int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info)
 		ret = dev->ethtool_ops->set_link_ksettings(dev, &ksettings);
 		if (ret < 0)
 			GENL_SET_ERR_MSG(info, "link settings update failed");
+		else
+			ethtool_notify(dev, ETHTOOL_MSG_LINKMODES_NTF, NULL);
 	}
 
 out_ops:
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 5f28f3cb022d..1b5e1bd26504 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -511,6 +511,7 @@ static int ethnl_default_done(struct netlink_callback *cb)
 static const struct ethnl_request_ops *
 ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_LINKINFO_NTF]	= &ethnl_linkinfo_request_ops,
+	[ETHTOOL_MSG_LINKMODES_NTF]	= &ethnl_linkmodes_request_ops,
 };
 
 /* default notification handler */
@@ -592,6 +593,7 @@ typedef void (*ethnl_notify_handler_t)(struct net_device *dev, unsigned int cmd,
 
 static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_LINKINFO_NTF]	= ethnl_default_notify,
+	[ETHTOOL_MSG_LINKMODES_NTF]	= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
-- 
cgit v1.2.3


From 3d2b847fb99cf2b28aa046e486636e555bc6ed1c Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Fri, 27 Dec 2019 15:56:23 +0100
Subject: ethtool: provide link state with LINKSTATE_GET request

Implement LINKSTATE_GET netlink request to get link state information.

At the moment, only the link up flag, as provided by the ETHTOOL_GLINK
ioctl command, is returned.

A LINKSTATE_GET request can be used with NLM_F_DUMP (without device
identification) to request the information for all devices in the current
network namespace that provide the data.
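
The link lookup shared by the ioctl and netlink paths can be modelled in
plain userspace C as follows. This is a sketch, not the kernel code:
struct mock_dev, mock_carrier(), mock_get_link() and mock_reply_has_link()
are hypothetical stand-ins for struct net_device, a driver's get_link
handler, __ethtool_get_link() and the check in linkstate_fill_reply().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of struct net_device used here. */
struct mock_dev {
	bool running;					/* models netif_running() */
	int (*get_link)(const struct mock_dev *dev);	/* models ethtool_ops->get_link */
	int carrier;					/* value a driver would report */
};

/* Example driver handler: just reports the stored carrier value. */
static int mock_carrier(const struct mock_dev *dev)
{
	return dev->carrier;
}

/* Models __ethtool_get_link(): negative errno, or 0/1 for link down/up. */
static int mock_get_link(const struct mock_dev *dev)
{
	if (!dev->get_link)
		return -EOPNOTSUPP;

	return dev->running && dev->get_link(dev);
}

/* Models linkstate_fill_reply(): the LINK attribute is put only if link >= 0. */
static bool mock_reply_has_link(int link)
{
	return link >= 0;
}
```

In this model a device without a get_link handler still gets a reply, but
the reply carries no link attribute; a device that is not running reports
link down regardless of what the handler would return.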

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 33 ++++++++++++-
 include/uapi/linux/ethtool_netlink.h         | 14 ++++++
 net/ethtool/Makefile                         |  3 +-
 net/ethtool/common.c                         |  8 +++
 net/ethtool/common.h                         |  3 ++
 net/ethtool/ioctl.c                          |  8 +--
 net/ethtool/linkstate.c                      | 74 ++++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |  8 +++
 net/ethtool/netlink.h                        |  1 +
 9 files changed, 146 insertions(+), 6 deletions(-)
 create mode 100644 net/ethtool/linkstate.c

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 9d96d51e9360..c60afba69e3c 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -184,6 +184,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_LINKINFO_SET``          set link settings
   ``ETHTOOL_MSG_LINKMODES_GET``         get link modes info
   ``ETHTOOL_MSG_LINKMODES_SET``         set link modes info
+  ``ETHTOOL_MSG_LINKSTATE_GET``         get link state
   ===================================== ================================
 
 Kernel to userspace:
@@ -194,6 +195,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_LINKINFO_NTF``          link settings notification
   ``ETHTOOL_MSG_LINKMODES_GET_REPLY``   link modes info
   ``ETHTOOL_MSG_LINKMODES_NTF``         link modes notification
+  ``ETHTOOL_MSG_LINKSTATE_GET_REPLY``   link state info
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -392,6 +394,35 @@ is supposed to allow requesting changes without knowing what exactly kernel
 supports.
 
 
+LINKSTATE_GET
+=============
+
+Requests link state information. At the moment, only link up/down flag (as
+provided by ``ETHTOOL_GLINK`` ioctl command) is provided but some future
+extensions are planned (e.g. link down reason). This request does not have any
+attributes.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKSTATE_HEADER``        nested  request header
+  ====================================  ======  ==========================
+
+Kernel response contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_LINKSTATE_HEADER``        nested  reply header
+  ``ETHTOOL_A_LINKSTATE_LINK``          bool    link state (up/down)
+  ====================================  ======  ==========================
+
+For most NIC drivers, the value of ``ETHTOOL_A_LINKSTATE_LINK`` is the carrier
+flag provided by ``netif_carrier_ok()``, but some drivers define their own
+handler.
+
+``LINKSTATE_GET`` allows dump requests (kernel returns reply messages for all
+devices supporting the request).
+
+
 Request translation
 ===================
 
@@ -413,7 +444,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GMSGLVL``                 n/a
   ``ETHTOOL_SMSGLVL``                 n/a
   ``ETHTOOL_NWAY_RST``                n/a
-  ``ETHTOOL_GLINK``                   n/a
+  ``ETHTOOL_GLINK``                   ``ETHTOOL_MSG_LINKSTATE_GET``
   ``ETHTOOL_GEEPROM``                 n/a
   ``ETHTOOL_SEEPROM``                 n/a
   ``ETHTOOL_GCOALESCE``               n/a
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 35948df6d6e3..02f82f42a889 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -19,6 +19,7 @@ enum {
 	ETHTOOL_MSG_LINKINFO_SET,
 	ETHTOOL_MSG_LINKMODES_GET,
 	ETHTOOL_MSG_LINKMODES_SET,
+	ETHTOOL_MSG_LINKSTATE_GET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -33,6 +34,7 @@ enum {
 	ETHTOOL_MSG_LINKINFO_NTF,
 	ETHTOOL_MSG_LINKMODES_GET_REPLY,
 	ETHTOOL_MSG_LINKMODES_NTF,
+	ETHTOOL_MSG_LINKSTATE_GET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -181,6 +183,18 @@ enum {
 	ETHTOOL_A_LINKMODES_MAX = __ETHTOOL_A_LINKMODES_CNT - 1
 };
 
+/* LINKSTATE */
+
+enum {
+	ETHTOOL_A_LINKSTATE_UNSPEC,
+	ETHTOOL_A_LINKSTATE_HEADER,		/* nest - _A_HEADER_* */
+	ETHTOOL_A_LINKSTATE_LINK,		/* u8 */
+
+	/* add new constants above here */
+	__ETHTOOL_A_LINKSTATE_CNT,
+	ETHTOOL_A_LINKSTATE_MAX = __ETHTOOL_A_LINKSTATE_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 8023da6672ce..9a1332fb0cc6 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -4,4 +4,5 @@ obj-y				+= ioctl.o common.o
 
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
-ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o
+ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
+		   linkstate.o
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index 1d4a0aeff2cb..e621b1694d2f 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -217,3 +217,11 @@ convert_legacy_settings_to_link_ksettings(
 		= legacy_settings->eth_tp_mdix_ctrl;
 	return retval;
 }
+
+int __ethtool_get_link(struct net_device *dev)
+{
+	if (!dev->ethtool_ops->get_link)
+		return -EOPNOTSUPP;
+
+	return netif_running(dev) && dev->ethtool_ops->get_link(dev);
+}
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index c8a237402729..5c5f7dc90cd4 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -3,6 +3,7 @@
 #ifndef _ETHTOOL_COMMON_H
 #define _ETHTOOL_COMMON_H
 
+#include <linux/netdevice.h>
 #include <linux/ethtool.h>
 
 /* compose link mode index from speed, type and duplex */
@@ -19,6 +20,8 @@ extern const char
 phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN];
 extern const char link_mode_names[][ETH_GSTRING_LEN];
 
+int __ethtool_get_link(struct net_device *dev);
+
 bool convert_legacy_settings_to_link_ksettings(
 	struct ethtool_link_ksettings *link_ksettings,
 	const struct ethtool_cmd *legacy_settings);
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 36e2ef2d900d..182bffbffa78 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1365,12 +1365,12 @@ static int ethtool_nway_reset(struct net_device *dev)
 static int ethtool_get_link(struct net_device *dev, char __user *useraddr)
 {
 	struct ethtool_value edata = { .cmd = ETHTOOL_GLINK };
+	int link = __ethtool_get_link(dev);
 
-	if (!dev->ethtool_ops->get_link)
-		return -EOPNOTSUPP;
-
-	edata.data = netif_running(dev) && dev->ethtool_ops->get_link(dev);
+	if (link < 0)
+		return link;
 
+	edata.data = link;
 	if (copy_to_user(useraddr, &edata, sizeof(edata)))
 		return -EFAULT;
 	return 0;
diff --git a/net/ethtool/linkstate.c b/net/ethtool/linkstate.c
new file mode 100644
index 000000000000..2740cde0a182
--- /dev/null
+++ b/net/ethtool/linkstate.c
@@ -0,0 +1,74 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+
+struct linkstate_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct linkstate_reply_data {
+	struct ethnl_reply_data		base;
+	int				link;
+};
+
+#define LINKSTATE_REPDATA(__reply_base) \
+	container_of(__reply_base, struct linkstate_reply_data, base)
+
+static const struct nla_policy
+linkstate_get_policy[ETHTOOL_A_LINKSTATE_MAX + 1] = {
+	[ETHTOOL_A_LINKSTATE_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_LINKSTATE_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_LINKSTATE_LINK]		= { .type = NLA_REJECT },
+};
+
+static int linkstate_prepare_data(const struct ethnl_req_info *req_base,
+				  struct ethnl_reply_data *reply_base,
+				  struct genl_info *info)
+{
+	struct linkstate_reply_data *data = LINKSTATE_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+	data->link = __ethtool_get_link(dev);
+	ethnl_ops_complete(dev);
+
+	return 0;
+}
+
+static int linkstate_reply_size(const struct ethnl_req_info *req_base,
+				const struct ethnl_reply_data *reply_base)
+{
+	return nla_total_size(sizeof(u8)) /* LINKSTATE_LINK */
+		+ 0;
+}
+
+static int linkstate_fill_reply(struct sk_buff *skb,
+				const struct ethnl_req_info *req_base,
+				const struct ethnl_reply_data *reply_base)
+{
+	struct linkstate_reply_data *data = LINKSTATE_REPDATA(reply_base);
+
+	if (data->link >= 0 &&
+	    nla_put_u8(skb, ETHTOOL_A_LINKSTATE_LINK, !!data->link))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_linkstate_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_LINKSTATE_GET,
+	.reply_cmd		= ETHTOOL_MSG_LINKSTATE_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_LINKSTATE_HEADER,
+	.max_attr		= ETHTOOL_A_LINKSTATE_MAX,
+	.req_info_size		= sizeof(struct linkstate_req_info),
+	.reply_data_size	= sizeof(struct linkstate_reply_data),
+	.request_policy		= linkstate_get_policy,
+
+	.prepare_data		= linkstate_prepare_data,
+	.reply_size		= linkstate_reply_size,
+	.fill_reply		= linkstate_fill_reply,
+};
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 1b5e1bd26504..4ca96c7b86b3 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -210,6 +210,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_STRSET_GET]	= &ethnl_strset_request_ops,
 	[ETHTOOL_MSG_LINKINFO_GET]	= &ethnl_linkinfo_request_ops,
 	[ETHTOOL_MSG_LINKMODES_GET]	= &ethnl_linkmodes_request_ops,
+	[ETHTOOL_MSG_LINKSTATE_GET]	= &ethnl_linkstate_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -645,6 +646,13 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.flags	= GENL_UNS_ADMIN_PERM,
 		.doit	= ethnl_set_linkmodes,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_LINKSTATE_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 1269cca8a002..da9d6521a4eb 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -333,6 +333,7 @@ struct ethnl_request_ops {
 extern const struct ethnl_request_ops ethnl_strset_request_ops;
 extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
 extern const struct ethnl_request_ops ethnl_linkmodes_request_ops;
+extern const struct ethnl_request_ops ethnl_linkstate_request_ops;
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
-- 
cgit v1.2.3


From 93edd392cad7485097c2e9068c764ae30083bb05 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers@google.com>
Date: Tue, 19 Nov 2019 14:24:47 -0800
Subject: fscrypt: support passing a keyring key to FS_IOC_ADD_ENCRYPTION_KEY

Extend the FS_IOC_ADD_ENCRYPTION_KEY ioctl to allow the raw key to be
specified by a Linux keyring key, rather than specified directly.

This is useful because fscrypt keys belong to a particular filesystem
instance, so they are destroyed when that filesystem is unmounted.
Usually this is desired.  But in some cases, userspace may need to
unmount and re-mount the filesystem while keeping the keys, e.g. during
a system update.  This requires keeping the keys somewhere else too.

The keys could be kept in memory in a userspace daemon.  But depending
on the security architecture and assumptions, it can be preferable to
keep them only in kernel memory, where they are unreadable by userspace.

We also can't solve this by going back to the original fscrypt API
(where for each file, the master key was looked up in the process's
keyring hierarchy) because that caused lots of problems of its own.

Therefore, add the ability for FS_IOC_ADD_ENCRYPTION_KEY to accept a
Linux keyring key.  This solves the problem by allowing userspace to (if
needed) save the keys securely in a Linux keyring for re-provisioning,
while still using the new fscrypt key management ioctls.

This is analogous to how dm-crypt accepts a Linux keyring key, but the
key is then stored internally in the dm-crypt data structures rather
than being looked up again each time the dm-crypt device is accessed.

Use a custom key type "fscrypt-provisioning" rather than one of the
existing key types such as "logon".  This is strongly desired because it
enforces that these keys are only usable for a particular purpose: for
fscrypt as input to a particular KDF.  Otherwise, the keys could also be
passed to any kernel API that accepts a "logon" key with any service
prefix, e.g. dm-crypt, UBIFS, or (recently proposed) AF_ALG.  This would
risk leaking information about the raw key despite it ostensibly being
unreadable.  Of course, this mistake has already been made for multiple
kernel APIs; but since this is a new API, let's do it right.
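
As an illustration, userspace could assemble the payload of such a key
roughly as follows. This is a sketch under stated assumptions, not part of
the patch: the sketch_* names are hypothetical, the struct mirrors the
layout of struct fscrypt_provisioning_key_payload, the two type values
mirror FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR and
FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER, and the 16..64 byte bounds are assumed
values for FSCRYPT_MIN_KEY_SIZE and FSCRYPT_MAX_KEY_SIZE.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed to match the key specifier types in <linux/fscrypt.h>. */
#define SKETCH_KEY_SPEC_TYPE_DESCRIPTOR	1
#define SKETCH_KEY_SPEC_TYPE_IDENTIFIER	2

/* Assumed raw key size bounds checked by the kernel's preparse hook. */
#define SKETCH_MIN_KEY_SIZE	16
#define SKETCH_MAX_KEY_SIZE	64

/* Hypothetical mirror of struct fscrypt_provisioning_key_payload. */
struct sketch_provisioning_payload {
	uint32_t type;
	uint32_t reserved;	/* must be zero */
	uint8_t raw[];
};

/*
 * Build a payload buffer of sizeof(header) + raw_size bytes.
 * Returns NULL if the type or size would be rejected, or on allocation
 * failure. The caller frees the result.
 */
static struct sketch_provisioning_payload *
sketch_build_payload(uint32_t type, const uint8_t *raw, size_t raw_size,
		     size_t *len_out)
{
	struct sketch_provisioning_payload *p;

	if (type != SKETCH_KEY_SPEC_TYPE_DESCRIPTOR &&
	    type != SKETCH_KEY_SPEC_TYPE_IDENTIFIER)
		return NULL;
	if (raw_size < SKETCH_MIN_KEY_SIZE || raw_size > SKETCH_MAX_KEY_SIZE)
		return NULL;

	p = calloc(1, sizeof(*p) + raw_size);	/* zeroes the reserved field */
	if (!p)
		return NULL;

	p->type = type;
	memcpy(p->raw, raw, raw_size);
	*len_out = sizeof(*p) + raw_size;
	return p;
}
```

The resulting buffer would then be the payload of a key of type
"fscrypt-provisioning" (for example via add_key(2)), and the ID of that
key would be passed as key_id to FS_IOC_ADD_ENCRYPTION_KEY with raw_size
set to 0.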

This patch has been tested using an xfstest which I wrote to test it.

Link: https://lore.kernel.org/r/20191119222447.226853-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 Documentation/filesystems/fscrypt.rst |  35 ++++++++-
 fs/crypto/keyring.c                   | 132 +++++++++++++++++++++++++++++++---
 include/uapi/linux/fscrypt.h          |  13 +++-
 3 files changed, 168 insertions(+), 12 deletions(-)

diff --git a/Documentation/filesystems/fscrypt.rst b/Documentation/filesystems/fscrypt.rst
index 68c2bc8275cf..9c53336d06a4 100644
--- a/Documentation/filesystems/fscrypt.rst
+++ b/Documentation/filesystems/fscrypt.rst
@@ -638,7 +638,8 @@ follows::
     struct fscrypt_add_key_arg {
             struct fscrypt_key_specifier key_spec;
             __u32 raw_size;
-            __u32 __reserved[9];
+            __u32 key_id;
+            __u32 __reserved[8];
             __u8 raw[];
     };
 
@@ -655,6 +656,12 @@ follows::
             } u;
     };
 
+    struct fscrypt_provisioning_key_payload {
+            __u32 type;
+            __u32 __reserved;
+            __u8 raw[];
+    };
+
 :c:type:`struct fscrypt_add_key_arg` must be zeroed, then initialized
 as follows:
 
@@ -677,9 +684,26 @@ as follows:
   ``Documentation/security/keys/core.rst``).
 
 - ``raw_size`` must be the size of the ``raw`` key provided, in bytes.
+  Alternatively, if ``key_id`` is nonzero, this field must be 0, since
+  in that case the size is implied by the specified Linux keyring key.
+
+- ``key_id`` is 0 if the raw key is given directly in the ``raw``
+  field.  Otherwise ``key_id`` is the ID of a Linux keyring key of
+  type "fscrypt-provisioning" whose payload is a :c:type:`struct
+  fscrypt_provisioning_key_payload` whose ``raw`` field contains the
+  raw key and whose ``type`` field matches ``key_spec.type``.  Since
+  ``raw`` is variable-length, the total size of this key's payload
+  must be ``sizeof(struct fscrypt_provisioning_key_payload)`` plus the
+  raw key size.  The process must have Search permission on this key.
+
+  Most users should leave this 0 and specify the raw key directly.
+  The support for specifying a Linux keyring key is intended mainly to
+  allow re-adding keys after a filesystem is unmounted and re-mounted,
+  without having to store the raw keys in userspace memory.
 
 - ``raw`` is a variable-length field which must contain the actual
-  key, ``raw_size`` bytes long.
+  key, ``raw_size`` bytes long.  Alternatively, if ``key_id`` is
+  nonzero, then this field is unused.
 
 For v2 policy keys, the kernel keeps track of which user (identified
 by effective user ID) added the key, and only allows the key to be
@@ -701,11 +725,16 @@ FS_IOC_ADD_ENCRYPTION_KEY can fail with the following errors:
 
 - ``EACCES``: FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR was specified, but the
   caller does not have the CAP_SYS_ADMIN capability in the initial
-  user namespace
+  user namespace; or the raw key was specified by Linux key ID but the
+  process lacks Search permission on the key.
 - ``EDQUOT``: the key quota for this user would be exceeded by adding
   the key
 - ``EINVAL``: invalid key size or key specifier type, or reserved bits
   were set
+- ``EKEYREJECTED``: the raw key was specified by Linux key ID, but the
+  key has the wrong type
+- ``ENOKEY``: the raw key was specified by Linux key ID, but no key
+  exists with that ID
 - ``ENOTTY``: this type of filesystem does not implement encryption
 - ``EOPNOTSUPP``: the kernel was not configured with encryption
   support for this filesystem, or the filesystem superblock has not
diff --git a/fs/crypto/keyring.c b/fs/crypto/keyring.c
index 40cca351273f..098ff2e0f0bb 100644
--- a/fs/crypto/keyring.c
+++ b/fs/crypto/keyring.c
@@ -465,6 +465,109 @@ out_unlock:
 	return err;
 }
 
+static int fscrypt_provisioning_key_preparse(struct key_preparsed_payload *prep)
+{
+	const struct fscrypt_provisioning_key_payload *payload = prep->data;
+
+	if (prep->datalen < sizeof(*payload) + FSCRYPT_MIN_KEY_SIZE ||
+	    prep->datalen > sizeof(*payload) + FSCRYPT_MAX_KEY_SIZE)
+		return -EINVAL;
+
+	if (payload->type != FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR &&
+	    payload->type != FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER)
+		return -EINVAL;
+
+	if (payload->__reserved)
+		return -EINVAL;
+
+	prep->payload.data[0] = kmemdup(payload, prep->datalen, GFP_KERNEL);
+	if (!prep->payload.data[0])
+		return -ENOMEM;
+
+	prep->quotalen = prep->datalen;
+	return 0;
+}
+
+static void fscrypt_provisioning_key_free_preparse(
+					struct key_preparsed_payload *prep)
+{
+	kzfree(prep->payload.data[0]);
+}
+
+static void fscrypt_provisioning_key_describe(const struct key *key,
+					      struct seq_file *m)
+{
+	seq_puts(m, key->description);
+	if (key_is_positive(key)) {
+		const struct fscrypt_provisioning_key_payload *payload =
+			key->payload.data[0];
+
+		seq_printf(m, ": %u [%u]", key->datalen, payload->type);
+	}
+}
+
+static void fscrypt_provisioning_key_destroy(struct key *key)
+{
+	kzfree(key->payload.data[0]);
+}
+
+static struct key_type key_type_fscrypt_provisioning = {
+	.name			= "fscrypt-provisioning",
+	.preparse		= fscrypt_provisioning_key_preparse,
+	.free_preparse		= fscrypt_provisioning_key_free_preparse,
+	.instantiate		= generic_key_instantiate,
+	.describe		= fscrypt_provisioning_key_describe,
+	.destroy		= fscrypt_provisioning_key_destroy,
+};
+
+/*
+ * Retrieve the raw key from the Linux keyring key specified by 'key_id', and
+ * store it into 'secret'.
+ *
+ * The key must be of type "fscrypt-provisioning" and must have the field
+ * fscrypt_provisioning_key_payload::type set to 'type', indicating that it's
+ * only usable with fscrypt with the particular KDF version identified by
+ * 'type'.  We don't use the "logon" key type because there's no way to
+ * completely restrict the use of such keys; they can be used by any kernel API
+ * that accepts "logon" keys and doesn't require a specific service prefix.
+ *
+ * The ability to specify the key via Linux keyring key is intended for cases
+ * where userspace needs to re-add keys after the filesystem is unmounted and
+ * re-mounted.  Most users should just provide the raw key directly instead.
+ */
+static int get_keyring_key(u32 key_id, u32 type,
+			   struct fscrypt_master_key_secret *secret)
+{
+	key_ref_t ref;
+	struct key *key;
+	const struct fscrypt_provisioning_key_payload *payload;
+	int err;
+
+	ref = lookup_user_key(key_id, 0, KEY_NEED_SEARCH);
+	if (IS_ERR(ref))
+		return PTR_ERR(ref);
+	key = key_ref_to_ptr(ref);
+
+	if (key->type != &key_type_fscrypt_provisioning)
+		goto bad_key;
+	payload = key->payload.data[0];
+
+	/* Don't allow fscrypt v1 keys to be used as v2 keys and vice versa. */
+	if (payload->type != type)
+		goto bad_key;
+
+	secret->size = key->datalen - sizeof(*payload);
+	memcpy(secret->raw, payload->raw, secret->size);
+	err = 0;
+	goto out_put;
+
+bad_key:
+	err = -EKEYREJECTED;
+out_put:
+	key_ref_put(ref);
+	return err;
+}
+
 /*
  * Add a master encryption key to the filesystem, causing all files which were
  * encrypted with it to appear "unlocked" (decrypted) when accessed.
@@ -503,18 +606,25 @@ int fscrypt_ioctl_add_key(struct file *filp, void __user *_uarg)
 	if (!valid_key_spec(&arg.key_spec))
 		return -EINVAL;
 
-	if (arg.raw_size < FSCRYPT_MIN_KEY_SIZE ||
-	    arg.raw_size > FSCRYPT_MAX_KEY_SIZE)
-		return -EINVAL;
-
 	if (memchr_inv(arg.__reserved, 0, sizeof(arg.__reserved)))
 		return -EINVAL;
 
 	memset(&secret, 0, sizeof(secret));
-	secret.size = arg.raw_size;
-	err = -EFAULT;
-	if (copy_from_user(secret.raw, uarg->raw, secret.size))
-		goto out_wipe_secret;
+	if (arg.key_id) {
+		if (arg.raw_size != 0)
+			return -EINVAL;
+		err = get_keyring_key(arg.key_id, arg.key_spec.type, &secret);
+		if (err)
+			goto out_wipe_secret;
+	} else {
+		if (arg.raw_size < FSCRYPT_MIN_KEY_SIZE ||
+		    arg.raw_size > FSCRYPT_MAX_KEY_SIZE)
+			return -EINVAL;
+		secret.size = arg.raw_size;
+		err = -EFAULT;
+		if (copy_from_user(secret.raw, uarg->raw, secret.size))
+			goto out_wipe_secret;
+	}
 
 	switch (arg.key_spec.type) {
 	case FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR:
@@ -978,8 +1088,14 @@ int __init fscrypt_init_keyring(void)
 	if (err)
 		goto err_unregister_fscrypt;
 
+	err = register_key_type(&key_type_fscrypt_provisioning);
+	if (err)
+		goto err_unregister_fscrypt_user;
+
 	return 0;
 
+err_unregister_fscrypt_user:
+	unregister_key_type(&key_type_fscrypt_user);
 err_unregister_fscrypt:
 	unregister_key_type(&key_type_fscrypt);
 	return err;
diff --git a/include/uapi/linux/fscrypt.h b/include/uapi/linux/fscrypt.h
index 1beb174ad950..d5112a24e8b9 100644
--- a/include/uapi/linux/fscrypt.h
+++ b/include/uapi/linux/fscrypt.h
@@ -109,11 +109,22 @@ struct fscrypt_key_specifier {
 	} u;
 };
 
+/*
+ * Payload of Linux keyring key of type "fscrypt-provisioning", referenced by
+ * fscrypt_add_key_arg::key_id as an alternative to fscrypt_add_key_arg::raw.
+ */
+struct fscrypt_provisioning_key_payload {
+	__u32 type;
+	__u32 __reserved;
+	__u8 raw[];
+};
+
 /* Struct passed to FS_IOC_ADD_ENCRYPTION_KEY */
 struct fscrypt_add_key_arg {
 	struct fscrypt_key_specifier key_spec;
 	__u32 raw_size;
-	__u32 __reserved[9];
+	__u32 key_id;
+	__u32 __reserved[8];
 	__u8 raw[];
 };
 
-- 
cgit v1.2.3
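
Note on the patch above: a userspace sketch of the payload layout that
fscrypt_provisioning_key_preparse() accepts. This is an illustration only,
not part of the patch. The struct is a local mirror of
fscrypt_provisioning_key_payload, build_provisioning_payload() is a
hypothetical helper, and the numeric constants are assumed values of the
kernel's FSCRYPT_KEY_SPEC_TYPE_* and FSCRYPT_{MIN,MAX}_KEY_SIZE, which this
patch does not show.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumed values; the patch uses these names but does not define them. */
#define KEY_SPEC_TYPE_DESCRIPTOR 1u
#define KEY_SPEC_TYPE_IDENTIFIER 2u
#define MIN_KEY_SIZE 16u
#define MAX_KEY_SIZE 64u

/* Local mirror of struct fscrypt_provisioning_key_payload. */
struct provisioning_payload {
	uint32_t type;      /* key specifier type the key is bound to */
	uint32_t reserved;  /* must be zero, or preparse returns -EINVAL */
	uint8_t raw[];      /* raw key bytes follow the 8-byte header */
};

/*
 * Hypothetical helper: build a payload that would pass the same checks as
 * fscrypt_provisioning_key_preparse(). Returns NULL if the size or type
 * would be rejected, or if allocation fails. The caller frees the result.
 */
static struct provisioning_payload *
build_provisioning_payload(uint32_t type, const uint8_t *raw,
			   size_t raw_size, size_t *payload_len)
{
	struct provisioning_payload *p;

	assert(raw != NULL && payload_len != NULL);

	if (raw_size < MIN_KEY_SIZE || raw_size > MAX_KEY_SIZE)
		return NULL;
	if (type != KEY_SPEC_TYPE_DESCRIPTOR &&
	    type != KEY_SPEC_TYPE_IDENTIFIER)
		return NULL;

	*payload_len = sizeof(*p) + raw_size;
	p = calloc(1, *payload_len);	/* zeroes the reserved field */
	if (!p)
		return NULL;

	p->type = type;
	memcpy(p->raw, raw, raw_size);
	return p;
}
```

The resulting buffer would be handed to add_key(2) with the key type
"fscrypt-provisioning". The returned key serial then goes into
fscrypt_add_key_arg::key_id, with raw_size left at 0, since the ioctl
rejects a nonzero raw_size when key_id is set.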


From e933adde6f97c4b51d39250c1dad0e89bd3fb1c7 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers@google.com>
Date: Thu, 19 Dec 2019 10:56:24 -0800
Subject: fscrypt: include <linux/ioctl.h> in UAPI header

<linux/fscrypt.h> defines ioctl numbers using macros such as _IOWR(),
which are defined in <linux/ioctl.h>, so <linux/ioctl.h> should be
included as a prerequisite, as it is in many other kernel headers.

In practice this doesn't really matter since anyone referencing these
ioctl numbers will almost certainly include <sys/ioctl.h> too in order
to actually call ioctl().  But we might as well fix this.

Link: https://lore.kernel.org/r/20191219185624.21251-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
---
 include/uapi/linux/fscrypt.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/uapi/linux/fscrypt.h b/include/uapi/linux/fscrypt.h
index d5112a24e8b9..0d8a6f47711c 100644
--- a/include/uapi/linux/fscrypt.h
+++ b/include/uapi/linux/fscrypt.h
@@ -8,6 +8,7 @@
 #ifndef _UAPI_LINUX_FSCRYPT_H
 #define _UAPI_LINUX_FSCRYPT_H
 
+#include <linux/ioctl.h>
 #include <linux/types.h>
 
 /* Encryption policy flags */
-- 
cgit v1.2.3
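
Note on the patch above: a minimal sketch of why a UAPI header that defines
ioctl numbers needs <linux/ioctl.h>. The struct, magic character, and ioctl
number below are hypothetical and exist only for illustration; they are not
fscrypt definitions. The point is that _IOWR() and the _IOC_*() decoding
macros resolve with only <linux/ioctl.h> included, without <sys/ioctl.h>.

```c
#include <linux/ioctl.h>	/* _IOWR(), _IOC_SIZE(), _IOC_NR(), ... */
#include <linux/types.h>	/* __u32 */

/* Hypothetical argument struct, for illustration only. */
struct example_arg {
	__u32 in;
	__u32 out;
};

/* Hypothetical ioctl number built the same way UAPI headers build theirs. */
#define EXAMPLE_IOC_MAGIC	'x'
#define EXAMPLE_IOC_DO_THING	_IOWR(EXAMPLE_IOC_MAGIC, 1, struct example_arg)

/* The argument size is encoded into the ioctl number by _IOWR(). */
static unsigned int example_ioc_arg_size(void)
{
	return _IOC_SIZE(EXAMPLE_IOC_DO_THING);
}
```

A program that actually issues the ioctl would still include <sys/ioctl.h>
for the ioctl() prototype, which is why the missing include rarely matters
in practice.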


From 68e039f966cb577c91649a02591646ac3919f8c9 Mon Sep 17 00:00:00 2001
From: Sven Eckelmann <sven@narfation.org>
Date: Wed, 1 Jan 2020 00:00:01 +0100
Subject: batman-adv: Update copyright years for 2020

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
 include/uapi/linux/batadv_packet.h     | 2 +-
 include/uapi/linux/batman_adv.h        | 2 +-
 net/batman-adv/Kconfig                 | 2 +-
 net/batman-adv/Makefile                | 2 +-
 net/batman-adv/bat_algo.c              | 2 +-
 net/batman-adv/bat_algo.h              | 2 +-
 net/batman-adv/bat_iv_ogm.c            | 2 +-
 net/batman-adv/bat_iv_ogm.h            | 2 +-
 net/batman-adv/bat_v.c                 | 2 +-
 net/batman-adv/bat_v.h                 | 2 +-
 net/batman-adv/bat_v_elp.c             | 2 +-
 net/batman-adv/bat_v_elp.h             | 2 +-
 net/batman-adv/bat_v_ogm.c             | 2 +-
 net/batman-adv/bat_v_ogm.h             | 2 +-
 net/batman-adv/bitarray.c              | 2 +-
 net/batman-adv/bitarray.h              | 2 +-
 net/batman-adv/bridge_loop_avoidance.c | 2 +-
 net/batman-adv/bridge_loop_avoidance.h | 2 +-
 net/batman-adv/debugfs.c               | 2 +-
 net/batman-adv/debugfs.h               | 2 +-
 net/batman-adv/distributed-arp-table.c | 2 +-
 net/batman-adv/distributed-arp-table.h | 2 +-
 net/batman-adv/fragmentation.c         | 2 +-
 net/batman-adv/fragmentation.h         | 2 +-
 net/batman-adv/gateway_client.c        | 2 +-
 net/batman-adv/gateway_client.h        | 2 +-
 net/batman-adv/gateway_common.c        | 2 +-
 net/batman-adv/gateway_common.h        | 2 +-
 net/batman-adv/hard-interface.c        | 2 +-
 net/batman-adv/hard-interface.h        | 2 +-
 net/batman-adv/hash.c                  | 2 +-
 net/batman-adv/hash.h                  | 2 +-
 net/batman-adv/icmp_socket.c           | 2 +-
 net/batman-adv/icmp_socket.h           | 2 +-
 net/batman-adv/log.c                   | 2 +-
 net/batman-adv/log.h                   | 2 +-
 net/batman-adv/main.c                  | 2 +-
 net/batman-adv/main.h                  | 2 +-
 net/batman-adv/multicast.c             | 2 +-
 net/batman-adv/multicast.h             | 2 +-
 net/batman-adv/netlink.c               | 2 +-
 net/batman-adv/netlink.h               | 2 +-
 net/batman-adv/network-coding.c        | 2 +-
 net/batman-adv/network-coding.h        | 2 +-
 net/batman-adv/originator.c            | 2 +-
 net/batman-adv/originator.h            | 2 +-
 net/batman-adv/routing.c               | 2 +-
 net/batman-adv/routing.h               | 2 +-
 net/batman-adv/send.c                  | 2 +-
 net/batman-adv/send.h                  | 2 +-
 net/batman-adv/soft-interface.c        | 2 +-
 net/batman-adv/soft-interface.h        | 2 +-
 net/batman-adv/sysfs.c                 | 2 +-
 net/batman-adv/sysfs.h                 | 2 +-
 net/batman-adv/tp_meter.c              | 2 +-
 net/batman-adv/tp_meter.h              | 2 +-
 net/batman-adv/trace.c                 | 2 +-
 net/batman-adv/trace.h                 | 2 +-
 net/batman-adv/translation-table.c     | 2 +-
 net/batman-adv/translation-table.h     | 2 +-
 net/batman-adv/tvlv.c                  | 2 +-
 net/batman-adv/tvlv.h                  | 2 +-
 net/batman-adv/types.h                 | 2 +-
 63 files changed, 63 insertions(+), 63 deletions(-)

diff --git a/include/uapi/linux/batadv_packet.h b/include/uapi/linux/batadv_packet.h
index 2a15f01c2243..0ae34c85ef9e 100644
--- a/include/uapi/linux/batadv_packet.h
+++ b/include/uapi/linux/batadv_packet.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: (GPL-2.0 WITH Linux-syscall-note) */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/include/uapi/linux/batman_adv.h b/include/uapi/linux/batman_adv.h
index 67f4636758af..617c180ff0c8 100644
--- a/include/uapi/linux/batman_adv.h
+++ b/include/uapi/linux/batman_adv.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: MIT */
-/* Copyright (C) 2016-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2016-2020  B.A.T.M.A.N. contributors:
  *
  * Matthias Schiffer
  */
diff --git a/net/batman-adv/Kconfig b/net/batman-adv/Kconfig
index d5028af750d5..045b2b183213 100644
--- a/net/batman-adv/Kconfig
+++ b/net/batman-adv/Kconfig
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-# Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+# Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
 #
 # Marek Lindner, Simon Wunderlich
 
diff --git a/net/batman-adv/Makefile b/net/batman-adv/Makefile
index fd63e116d9ff..daa49af7ff40 100644
--- a/net/batman-adv/Makefile
+++ b/net/batman-adv/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-# Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+# Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
 #
 # Marek Lindner, Simon Wunderlich
 
diff --git a/net/batman-adv/bat_algo.c b/net/batman-adv/bat_algo.c
index fa39eaaab9d7..382fbe51fd34 100644
--- a/net/batman-adv/bat_algo.c
+++ b/net/batman-adv/bat_algo.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/bat_algo.h b/net/batman-adv/bat_algo.h
index 37898da8ad48..686a60bc9492 100644
--- a/net/batman-adv/bat_algo.h
+++ b/net/batman-adv/bat_algo.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Linus Lüssing
  */
diff --git a/net/batman-adv/bat_iv_ogm.c b/net/batman-adv/bat_iv_ogm.c
index 5b0b20e6da95..f0209505e41a 100644
--- a/net/batman-adv/bat_iv_ogm.c
+++ b/net/batman-adv/bat_iv_ogm.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/bat_iv_ogm.h b/net/batman-adv/bat_iv_ogm.h
index c7a9ba305bfc..0c57c1000c64 100644
--- a/net/batman-adv/bat_iv_ogm.h
+++ b/net/batman-adv/bat_iv_ogm.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/bat_v.c b/net/batman-adv/bat_v.c
index 4ff6cf1ecae7..0ecaf1bb0068 100644
--- a/net/batman-adv/bat_v.c
+++ b/net/batman-adv/bat_v.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2013-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2013-2020  B.A.T.M.A.N. contributors:
  *
  * Linus Lüssing, Marek Lindner
  */
diff --git a/net/batman-adv/bat_v.h b/net/batman-adv/bat_v.h
index 37833db098e6..5e0be10bc84e 100644
--- a/net/batman-adv/bat_v.h
+++ b/net/batman-adv/bat_v.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Linus Lüssing
  */
diff --git a/net/batman-adv/bat_v_elp.c b/net/batman-adv/bat_v_elp.c
index 6536e8cc53a0..1e3172db7492 100644
--- a/net/batman-adv/bat_v_elp.c
+++ b/net/batman-adv/bat_v_elp.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Linus Lüssing, Marek Lindner
  */
diff --git a/net/batman-adv/bat_v_elp.h b/net/batman-adv/bat_v_elp.h
index 1a29505f4f66..4358d436be2a 100644
--- a/net/batman-adv/bat_v_elp.h
+++ b/net/batman-adv/bat_v_elp.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2013-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2013-2020  B.A.T.M.A.N. contributors:
  *
  * Linus Lüssing, Marek Lindner
  */
diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index 714ce56cfcc8..969466218999 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2013-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2013-2020  B.A.T.M.A.N. contributors:
  *
  * Antonio Quartulli
  */
diff --git a/net/batman-adv/bat_v_ogm.h b/net/batman-adv/bat_v_ogm.h
index bf16d040461d..0ae2575f70bb 100644
--- a/net/batman-adv/bat_v_ogm.h
+++ b/net/batman-adv/bat_v_ogm.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2013-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2013-2020  B.A.T.M.A.N. contributors:
  *
  * Antonio Quartulli
  */
diff --git a/net/batman-adv/bitarray.c b/net/batman-adv/bitarray.c
index 7f04a6acf14e..4bc695cda397 100644
--- a/net/batman-adv/bitarray.c
+++ b/net/batman-adv/bitarray.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2006-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2006-2020  B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich, Marek Lindner
  */
diff --git a/net/batman-adv/bitarray.h b/net/batman-adv/bitarray.h
index 84ad2d2b6ac9..533c6d44cb58 100644
--- a/net/batman-adv/bitarray.h
+++ b/net/batman-adv/bitarray.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2006-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2006-2020  B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich, Marek Lindner
  */
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c
index 0eff33580f95..41cc87f06b14 100644
--- a/net/batman-adv/bridge_loop_avoidance.c
+++ b/net/batman-adv/bridge_loop_avoidance.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich
  */
diff --git a/net/batman-adv/bridge_loop_avoidance.h b/net/batman-adv/bridge_loop_avoidance.h
index 02b24a861a85..41edb2c4a327 100644
--- a/net/batman-adv/bridge_loop_avoidance.h
+++ b/net/batman-adv/bridge_loop_avoidance.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich
  */
diff --git a/net/batman-adv/debugfs.c b/net/batman-adv/debugfs.c
index 38c4d8e51155..452856c27d20 100644
--- a/net/batman-adv/debugfs.c
+++ b/net/batman-adv/debugfs.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/debugfs.h b/net/batman-adv/debugfs.h
index 1c5afd301ce9..7e2e8f586f42 100644
--- a/net/batman-adv/debugfs.h
+++ b/net/batman-adv/debugfs.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/distributed-arp-table.c b/net/batman-adv/distributed-arp-table.c
index 5004e38fe792..906011b60d66 100644
--- a/net/batman-adv/distributed-arp-table.c
+++ b/net/batman-adv/distributed-arp-table.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Antonio Quartulli
  */
diff --git a/net/batman-adv/distributed-arp-table.h b/net/batman-adv/distributed-arp-table.h
index 67c7729add55..2bff2f4a325c 100644
--- a/net/batman-adv/distributed-arp-table.h
+++ b/net/batman-adv/distributed-arp-table.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2011-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2011-2020  B.A.T.M.A.N. contributors:
  *
  * Antonio Quartulli
  */
diff --git a/net/batman-adv/fragmentation.c b/net/batman-adv/fragmentation.c
index 385fccdcf69d..7cad97644d05 100644
--- a/net/batman-adv/fragmentation.c
+++ b/net/batman-adv/fragmentation.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2013-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2013-2020  B.A.T.M.A.N. contributors:
  *
  * Martin Hundebøll <martin@hundeboll.net>
  */
diff --git a/net/batman-adv/fragmentation.h b/net/batman-adv/fragmentation.h
index abfe8c6556de..881ef328b6cd 100644
--- a/net/batman-adv/fragmentation.h
+++ b/net/batman-adv/fragmentation.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2013-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2013-2020  B.A.T.M.A.N. contributors:
  *
  * Martin Hundebøll <martin@hundeboll.net>
  */
diff --git a/net/batman-adv/gateway_client.c b/net/batman-adv/gateway_client.c
index 47df4c678988..e22e49289677 100644
--- a/net/batman-adv/gateway_client.c
+++ b/net/batman-adv/gateway_client.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2009-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2009-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/gateway_client.h b/net/batman-adv/gateway_client.h
index 0be8e7178ec7..88b5dba84354 100644
--- a/net/batman-adv/gateway_client.h
+++ b/net/batman-adv/gateway_client.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2009-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2009-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/gateway_common.c b/net/batman-adv/gateway_common.c
index fc55750542e4..16cd9450ceb1 100644
--- a/net/batman-adv/gateway_common.c
+++ b/net/batman-adv/gateway_common.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2009-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2009-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/gateway_common.h b/net/batman-adv/gateway_common.h
index 211b14b37db8..c3a0c5a7f7e9 100644
--- a/net/batman-adv/gateway_common.h
+++ b/net/batman-adv/gateway_common.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2009-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2009-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/hard-interface.c b/net/batman-adv/hard-interface.c
index afb52282d5bd..c7e98a40dd33 100644
--- a/net/batman-adv/hard-interface.c
+++ b/net/batman-adv/hard-interface.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/hard-interface.h b/net/batman-adv/hard-interface.h
index bbb8a6f18d6b..bad2e50135e8 100644
--- a/net/batman-adv/hard-interface.h
+++ b/net/batman-adv/hard-interface.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/hash.c b/net/batman-adv/hash.c
index a9d4e176f4de..68638e0450a6 100644
--- a/net/batman-adv/hash.c
+++ b/net/batman-adv/hash.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2006-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2006-2020  B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich, Marek Lindner
  */
diff --git a/net/batman-adv/hash.h b/net/batman-adv/hash.h
index 57877f0b78e0..91ae9f32b580 100644
--- a/net/batman-adv/hash.h
+++ b/net/batman-adv/hash.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2006-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2006-2020  B.A.T.M.A.N. contributors:
  *
  * Simon Wunderlich, Marek Lindner
  */
diff --git a/net/batman-adv/icmp_socket.c b/net/batman-adv/icmp_socket.c
index 0a70b66e8770..ccb535c77e5d 100644
--- a/net/batman-adv/icmp_socket.c
+++ b/net/batman-adv/icmp_socket.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/icmp_socket.h b/net/batman-adv/icmp_socket.h
index 27fafff586df..6abd0f4742ef 100644
--- a/net/batman-adv/icmp_socket.h
+++ b/net/batman-adv/icmp_socket.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/log.c b/net/batman-adv/log.c
index 11941cf1adcc..a67b2b091447 100644
--- a/net/batman-adv/log.c
+++ b/net/batman-adv/log.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/log.h b/net/batman-adv/log.h
index 41fd900121c5..f9884dc56cf3 100644
--- a/net/batman-adv/log.h
+++ b/net/batman-adv/log.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/main.c b/net/batman-adv/main.c
index 4a89177def64..0b840e1e3dfa 100644
--- a/net/batman-adv/main.c
+++ b/net/batman-adv/main.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/main.h b/net/batman-adv/main.h
index fd8c0728ddc7..692306df7b6f 100644
--- a/net/batman-adv/main.h
+++ b/net/batman-adv/main.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/multicast.c b/net/batman-adv/multicast.c
index f9ec8e7507b6..9ebdc1e864b9 100644
--- a/net/batman-adv/multicast.c
+++ b/net/batman-adv/multicast.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2014-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2014-2020  B.A.T.M.A.N. contributors:
  *
  * Linus Lüssing
  */
diff --git a/net/batman-adv/multicast.h b/net/batman-adv/multicast.h
index 5d9e2bb29c97..ebf825991ecd 100644
--- a/net/batman-adv/multicast.h
+++ b/net/batman-adv/multicast.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2014-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2014-2020  B.A.T.M.A.N. contributors:
  *
  * Linus Lüssing
  */
diff --git a/net/batman-adv/netlink.c b/net/batman-adv/netlink.c
index 7e052d6f759b..02ed073f95a9 100644
--- a/net/batman-adv/netlink.c
+++ b/net/batman-adv/netlink.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2016-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2016-2020  B.A.T.M.A.N. contributors:
  *
  * Matthias Schiffer
  */
diff --git a/net/batman-adv/netlink.h b/net/batman-adv/netlink.h
index ddc674e47dbb..7ee48f916997 100644
--- a/net/batman-adv/netlink.h
+++ b/net/batman-adv/netlink.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2016-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2016-2020  B.A.T.M.A.N. contributors:
  *
  * Matthias Schiffer
  */
diff --git a/net/batman-adv/network-coding.c b/net/batman-adv/network-coding.c
index 580609389f0f..8f0717c3f7b5 100644
--- a/net/batman-adv/network-coding.c
+++ b/net/batman-adv/network-coding.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2012-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2012-2020  B.A.T.M.A.N. contributors:
  *
  * Martin Hundebøll, Jeppe Ledet-Pedersen
  */
diff --git a/net/batman-adv/network-coding.h b/net/batman-adv/network-coding.h
index 753fa49723cf..334289084127 100644
--- a/net/batman-adv/network-coding.h
+++ b/net/batman-adv/network-coding.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2012-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2012-2020  B.A.T.M.A.N. contributors:
  *
  * Martin Hundebøll, Jeppe Ledet-Pedersen
  */
diff --git a/net/batman-adv/originator.c b/net/batman-adv/originator.c
index 38613487fb1b..5b0c2fffc214 100644
--- a/net/batman-adv/originator.c
+++ b/net/batman-adv/originator.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2009-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2009-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/originator.h b/net/batman-adv/originator.h
index 512a1f99dd75..7bc01c138b3a 100644
--- a/net/batman-adv/originator.h
+++ b/net/batman-adv/originator.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/routing.c b/net/batman-adv/routing.c
index f0f864820dea..3632bd976c56 100644
--- a/net/batman-adv/routing.c
+++ b/net/batman-adv/routing.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/routing.h b/net/batman-adv/routing.h
index c20feac95107..2ed49db6eff5 100644
--- a/net/batman-adv/routing.h
+++ b/net/batman-adv/routing.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/send.c b/net/batman-adv/send.c
index 3ce5f7bad369..7f8ade04e08e 100644
--- a/net/batman-adv/send.c
+++ b/net/batman-adv/send.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/send.h b/net/batman-adv/send.h
index 5fc0fd1e5d08..0d36e15589f6 100644
--- a/net/batman-adv/send.h
+++ b/net/batman-adv/send.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/soft-interface.c b/net/batman-adv/soft-interface.c
index 832e156c519e..5f05a728f347 100644
--- a/net/batman-adv/soft-interface.c
+++ b/net/batman-adv/soft-interface.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/soft-interface.h b/net/batman-adv/soft-interface.h
index 29139ad769fe..534e08d6ad91 100644
--- a/net/batman-adv/soft-interface.h
+++ b/net/batman-adv/soft-interface.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/sysfs.c b/net/batman-adv/sysfs.c
index e5bbc28ed12c..c45962d8527b 100644
--- a/net/batman-adv/sysfs.c
+++ b/net/batman-adv/sysfs.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/sysfs.h b/net/batman-adv/sysfs.h
index 5e466093dfa5..d987f8b30a98 100644
--- a/net/batman-adv/sysfs.h
+++ b/net/batman-adv/sysfs.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner
  */
diff --git a/net/batman-adv/tp_meter.c b/net/batman-adv/tp_meter.c
index dd6a9a40dbb9..bd2ac570c42c 100644
--- a/net/batman-adv/tp_meter.c
+++ b/net/batman-adv/tp_meter.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2012-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2012-2020  B.A.T.M.A.N. contributors:
  *
  * Edo Monticelli, Antonio Quartulli
  */
diff --git a/net/batman-adv/tp_meter.h b/net/batman-adv/tp_meter.h
index 78d310da0ad3..140105215aa2 100644
--- a/net/batman-adv/tp_meter.h
+++ b/net/batman-adv/tp_meter.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2012-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2012-2020  B.A.T.M.A.N. contributors:
  *
  * Edo Monticelli, Antonio Quartulli
  */
diff --git a/net/batman-adv/trace.c b/net/batman-adv/trace.c
index 3cedd2c36528..3444d9e4e90d 100644
--- a/net/batman-adv/trace.c
+++ b/net/batman-adv/trace.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Sven Eckelmann
  */
diff --git a/net/batman-adv/trace.h b/net/batman-adv/trace.h
index d8f764521c0b..f631b1e01b89 100644
--- a/net/batman-adv/trace.h
+++ b/net/batman-adv/trace.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2010-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2010-2020  B.A.T.M.A.N. contributors:
  *
  * Sven Eckelmann
  */
diff --git a/net/batman-adv/translation-table.c b/net/batman-adv/translation-table.c
index 8a482c5ec67b..852932838ddc 100644
--- a/net/batman-adv/translation-table.c
+++ b/net/batman-adv/translation-table.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich, Antonio Quartulli
  */
diff --git a/net/batman-adv/translation-table.h b/net/batman-adv/translation-table.h
index 4a98860d7f0e..b24d35b9226a 100644
--- a/net/batman-adv/translation-table.h
+++ b/net/batman-adv/translation-table.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich, Antonio Quartulli
  */
diff --git a/net/batman-adv/tvlv.c b/net/batman-adv/tvlv.c
index aae63f0d21eb..0963a43ad996 100644
--- a/net/batman-adv/tvlv.c
+++ b/net/batman-adv/tvlv.c
@@ -1,5 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/tvlv.h b/net/batman-adv/tvlv.h
index 36985000a0a8..d509d00c7a23 100644
--- a/net/batman-adv/tvlv.h
+++ b/net/batman-adv/tvlv.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
diff --git a/net/batman-adv/types.h b/net/batman-adv/types.h
index bdf9827b5f63..4a17a66cc572 100644
--- a/net/batman-adv/types.h
+++ b/net/batman-adv/types.h
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-/* Copyright (C) 2007-2019  B.A.T.M.A.N. contributors:
+/* Copyright (C) 2007-2020  B.A.T.M.A.N. contributors:
  *
  * Marek Lindner, Simon Wunderlich
  */
-- 
cgit v1.2.3


From 6b102db50cdde3ba2f78631ed21222edf3a5fb51 Mon Sep 17 00:00:00 2001
From: David Ahern <dsahern@gmail.com>
Date: Mon, 30 Dec 2019 14:14:29 -0800
Subject: net: Add device index to tcp_md5sig

Add support for userspace to specify a device index to limit the scope
of an entry via the TCP_MD5SIG_EXT setsockopt. The existing __tcpm_pad
is renamed to tcpm_ifindex and the new field is only checked if the new
TCP_MD5SIG_FLAG_IFINDEX is set in tcpm_flags. For now, the device index
must point to an L3 master device (e.g., VRF). The API and error
handling are set up to allow the constraint to be relaxed in the future
to any device index.
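
A minimal userspace sketch of how the new field might be used, assuming
uapi headers that already carry tcpm_ifindex and TCP_MD5SIG_FLAG_IFINDEX.
The helper names md5sig_fill_scoped() and md5sig_install_scoped() are
hypothetical and are not part of this patch:

```c
#include <arpa/inet.h>
#include <linux/tcp.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: build an MD5 key entry for an IPv4 peer that is
 * scoped to one device index (currently this must be an L3 master). */
static int md5sig_fill_scoped(struct tcp_md5sig *md5, const char *peer_ipv4,
			      const void *key, size_t keylen, int ifindex)
{
	struct sockaddr_in *sin = (struct sockaddr_in *)&md5->tcpm_addr;

	if (keylen == 0 || keylen > TCP_MD5SIG_MAXKEYLEN)
		return -1;

	memset(md5, 0, sizeof(*md5));
	sin->sin_family = AF_INET;
	if (inet_pton(AF_INET, peer_ipv4, &sin->sin_addr) != 1)
		return -1;

	/* tcpm_ifindex is only looked at when this flag is set */
	md5->tcpm_flags = TCP_MD5SIG_FLAG_IFINDEX;
	md5->tcpm_ifindex = ifindex;
	md5->tcpm_keylen = keylen;
	memcpy(md5->tcpm_key, key, keylen);
	return 0;
}

/* Hypothetical helper: install the entry with the extended setsockopt. */
static int md5sig_install_scoped(int fd, const char *peer_ipv4,
				 const void *key, size_t keylen, int ifindex)
{
	struct tcp_md5sig md5;

	if (md5sig_fill_scoped(&md5, peer_ipv4, key, keylen, ifindex))
		return -1;

	return setsockopt(fd, IPPROTO_TCP, TCP_MD5SIG_EXT, &md5, sizeof(md5));
}
```

With this patch applied, the setsockopt is expected to fail with EINVAL
when the index does not name an L3 master device in the socket's network
namespace.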

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/tcp.h |  5 +++--
 net/ipv4/tcp_ipv4.c      | 18 ++++++++++++++++++
 net/ipv6/tcp_ipv6.c      | 20 +++++++++++++++++++-
 3 files changed, 40 insertions(+), 3 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index 74af1f759cee..d87184e673ca 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -317,14 +317,15 @@ enum {
 #define TCP_MD5SIG_MAXKEYLEN	80
 
 /* tcp_md5sig extension flags for TCP_MD5SIG_EXT */
-#define TCP_MD5SIG_FLAG_PREFIX		1	/* address prefix length */
+#define TCP_MD5SIG_FLAG_PREFIX		0x1	/* address prefix length */
+#define TCP_MD5SIG_FLAG_IFINDEX		0x2	/* ifindex set */
 
 struct tcp_md5sig {
 	struct __kernel_sockaddr_storage tcpm_addr;	/* address associated */
 	__u8	tcpm_flags;				/* extension flags */
 	__u8	tcpm_prefixlen;				/* address prefix */
 	__u16	tcpm_keylen;				/* key length */
-	__u32	__tcpm_pad;				/* zero */
+	int	tcpm_ifindex;				/* device index for scope */
 	__u8	tcpm_key[TCP_MD5SIG_MAXKEYLEN];		/* key (binary) */
 };
 
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 30b3f19d6301..4adac9c75343 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1196,6 +1196,24 @@ static int tcp_v4_parse_md5_keys(struct sock *sk, int optname,
 			return -EINVAL;
 	}
 
+	if (optname == TCP_MD5SIG_EXT &&
+	    cmd.tcpm_flags & TCP_MD5SIG_FLAG_IFINDEX) {
+		struct net_device *dev;
+
+		rcu_read_lock();
+		dev = dev_get_by_index_rcu(sock_net(sk), cmd.tcpm_ifindex);
+		if (dev && netif_is_l3_master(dev))
+			l3index = dev->ifindex;
+
+		rcu_read_unlock();
+
+		/* ok to reference set/not set outside of rcu;
+		 * right now device MUST be an L3 master
+		 */
+		if (!dev || !l3index)
+			return -EINVAL;
+	}
+
 	addr = (union tcp_md5_addr *)&sin->sin_addr.s_addr;
 
 	if (!cmd.tcpm_keylen)
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 71ad7d89be0f..95e4e1e95db2 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -578,10 +578,28 @@ static int tcp_v6_parse_md5_keys(struct sock *sk, int optname,
 		prefixlen = ipv6_addr_v4mapped(&sin6->sin6_addr) ? 32 : 128;
 	}
 
+	if (optname == TCP_MD5SIG_EXT &&
+	    cmd.tcpm_flags & TCP_MD5SIG_FLAG_IFINDEX) {
+		struct net_device *dev;
+
+		rcu_read_lock();
+		dev = dev_get_by_index_rcu(sock_net(sk), cmd.tcpm_ifindex);
+		if (dev && netif_is_l3_master(dev))
+			l3index = dev->ifindex;
+		rcu_read_unlock();
+
+		/* ok to reference set/not set outside of rcu;
+		 * right now device MUST be an L3 master
+		 */
+		if (!dev || !l3index)
+			return -EINVAL;
+	}
+
 	if (!cmd.tcpm_keylen) {
 		if (ipv6_addr_v4mapped(&sin6->sin6_addr))
 			return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr.s6_addr32[3],
-					      AF_INET, prefixlen, l3index);
+					      AF_INET, prefixlen,
+					      l3index);
 		return tcp_md5_do_del(sk, (union tcp_md5_addr *)&sin6->sin6_addr,
 				      AF_INET6, prefixlen, l3index);
 	}
-- 
cgit v1.2.3


From 77cdffcb0bfb87fe3645894335cb8cb94917e6ac Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Mon, 16 Dec 2019 15:15:00 +0100
Subject: media: v4l2: abstract timeval handling in v4l2_buffer

As a preparation for adding 64-bit time_t support in the uapi,
change the drivers to no longer care about the format of the
timestamp field in struct v4l2_buffer.

The v4l2_timeval_to_ns() function is no longer needed in the
kernel after this, but there is userspace code relying on
it to be part of the uapi header.
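
The conversion done by the new accessors can be sketched in plain C as
below. This is an illustration only: struct demo_timeval, struct
demo_buffer and the demo_* functions are stand-ins invented for the
sketch, not the kernel types or helpers added by this patch.

```c
#include <stdint.h>

#define DEMO_NSEC_PER_SEC	1000000000ULL
#define DEMO_NSEC_PER_USEC	1000ULL

/* Stand-in for the timestamp member of struct v4l2_buffer */
struct demo_timeval {
	int64_t tv_sec;
	int64_t tv_usec;
};

struct demo_buffer {
	struct demo_timeval timestamp;
};

/* Seconds + microseconds to nanoseconds. tv_usec is truncated to 32 bits
 * so that garbage in its upper half does not leak into the result. */
static uint64_t demo_buffer_get_timestamp(const struct demo_buffer *buf)
{
	return (uint64_t)buf->timestamp.tv_sec * DEMO_NSEC_PER_SEC +
	       (uint32_t)buf->timestamp.tv_usec * DEMO_NSEC_PER_USEC;
}

/* Nanoseconds to seconds + microseconds; sub-microsecond precision is
 * dropped. */
static void demo_buffer_set_timestamp(struct demo_buffer *buf, uint64_t ns)
{
	buf->timestamp.tv_sec = (int64_t)(ns / DEMO_NSEC_PER_SEC);
	buf->timestamp.tv_usec =
		(int64_t)((ns % DEMO_NSEC_PER_SEC) / DEMO_NSEC_PER_USEC);
}
```

Keeping every access behind two helpers like these is what lets the
layout of the timestamp field change later without touching the drivers
again.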

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
[hverkuil-cisco@xs4all.nl: replace spaces by tabs]
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 drivers/media/common/videobuf2/videobuf2-v4l2.c |  4 ++--
 drivers/media/pci/meye/meye.c                   |  4 ++--
 drivers/media/usb/cpia2/cpia2_v4l.c             |  4 ++--
 drivers/media/usb/stkwebcam/stk-webcam.c        |  2 +-
 drivers/media/usb/usbvision/usbvision-video.c   |  4 ++--
 drivers/media/v4l2-core/videobuf-core.c         |  5 +++--
 include/media/v4l2-common.h                     | 21 +++++++++++++++++++++
 include/trace/events/v4l2.h                     |  2 +-
 include/uapi/linux/videodev2.h                  |  2 ++
 9 files changed, 36 insertions(+), 12 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/media/common/videobuf2/videobuf2-v4l2.c b/drivers/media/common/videobuf2/videobuf2-v4l2.c
index e652f4318284..eb5d5db96552 100644
--- a/drivers/media/common/videobuf2/videobuf2-v4l2.c
+++ b/drivers/media/common/videobuf2/videobuf2-v4l2.c
@@ -146,7 +146,7 @@ static void __copy_timestamp(struct vb2_buffer *vb, const void *pb)
 		 * and the timecode field and flag if needed.
 		 */
 		if (q->copy_timestamp)
-			vb->timestamp = v4l2_timeval_to_ns(&b->timestamp);
+			vb->timestamp = v4l2_buffer_get_timestamp(b);
 		vbuf->flags |= b->flags & V4L2_BUF_FLAG_TIMECODE;
 		if (b->flags & V4L2_BUF_FLAG_TIMECODE)
 			vbuf->timecode = b->timecode;
@@ -482,7 +482,7 @@ static void __fill_v4l2_buffer(struct vb2_buffer *vb, void *pb)
 
 	b->flags = vbuf->flags;
 	b->field = vbuf->field;
-	b->timestamp = ns_to_timeval(vb->timestamp);
+	v4l2_buffer_set_timestamp(b, vb->timestamp);
 	b->timecode = vbuf->timecode;
 	b->sequence = vbuf->sequence;
 	b->reserved2 = 0;
diff --git a/drivers/media/pci/meye/meye.c b/drivers/media/pci/meye/meye.c
index 0e61c81356ef..3a4c29bc0ba5 100644
--- a/drivers/media/pci/meye/meye.c
+++ b/drivers/media/pci/meye/meye.c
@@ -1266,7 +1266,7 @@ static int vidioc_querybuf(struct file *file, void *fh, struct v4l2_buffer *buf)
 		buf->flags |= V4L2_BUF_FLAG_DONE;
 
 	buf->field = V4L2_FIELD_NONE;
-	buf->timestamp = ns_to_timeval(meye.grab_buffer[index].ts);
+	v4l2_buffer_set_timestamp(buf, meye.grab_buffer[index].ts);
 	buf->sequence = meye.grab_buffer[index].sequence;
 	buf->memory = V4L2_MEMORY_MMAP;
 	buf->m.offset = index * gbufsize;
@@ -1332,7 +1332,7 @@ static int vidioc_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
 	buf->bytesused = meye.grab_buffer[reqnr].size;
 	buf->flags = V4L2_BUF_FLAG_MAPPED | V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
 	buf->field = V4L2_FIELD_NONE;
-	buf->timestamp = ns_to_timeval(meye.grab_buffer[reqnr].ts);
+	v4l2_buffer_set_timestamp(buf, meye.grab_buffer[reqnr].ts);
 	buf->sequence = meye.grab_buffer[reqnr].sequence;
 	buf->memory = V4L2_MEMORY_MMAP;
 	buf->m.offset = reqnr * gbufsize;
diff --git a/drivers/media/usb/cpia2/cpia2_v4l.c b/drivers/media/usb/cpia2/cpia2_v4l.c
index 626264a56517..9d3d05125d7b 100644
--- a/drivers/media/usb/cpia2/cpia2_v4l.c
+++ b/drivers/media/usb/cpia2/cpia2_v4l.c
@@ -800,7 +800,7 @@ static int cpia2_querybuf(struct file *file, void *fh, struct v4l2_buffer *buf)
 		break;
 	case FRAME_READY:
 		buf->bytesused = cam->buffers[buf->index].length;
-		buf->timestamp = ns_to_timeval(cam->buffers[buf->index].ts);
+		v4l2_buffer_set_timestamp(buf, cam->buffers[buf->index].ts);
 		buf->sequence = cam->buffers[buf->index].seq;
 		buf->flags = V4L2_BUF_FLAG_DONE;
 		break;
@@ -907,7 +907,7 @@ static int cpia2_dqbuf(struct file *file, void *fh, struct v4l2_buffer *buf)
 	buf->flags = V4L2_BUF_FLAG_MAPPED | V4L2_BUF_FLAG_DONE
 		| V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
 	buf->field = V4L2_FIELD_NONE;
-	buf->timestamp = ns_to_timeval(cam->buffers[buf->index].ts);
+	v4l2_buffer_set_timestamp(buf, cam->buffers[buf->index].ts);
 	buf->sequence = cam->buffers[buf->index].seq;
 	buf->m.offset = cam->buffers[buf->index].data - cam->frame_buffer;
 	buf->length = cam->frame_size;
diff --git a/drivers/media/usb/stkwebcam/stk-webcam.c b/drivers/media/usb/stkwebcam/stk-webcam.c
index 21f90a887485..b22501f76b78 100644
--- a/drivers/media/usb/stkwebcam/stk-webcam.c
+++ b/drivers/media/usb/stkwebcam/stk-webcam.c
@@ -1125,7 +1125,7 @@ static int stk_vidioc_dqbuf(struct file *filp,
 	sbuf->v4lbuf.flags &= ~V4L2_BUF_FLAG_QUEUED;
 	sbuf->v4lbuf.flags |= V4L2_BUF_FLAG_DONE;
 	sbuf->v4lbuf.sequence = ++dev->sequence;
-	sbuf->v4lbuf.timestamp = ns_to_timeval(ktime_get_ns());
+	v4l2_buffer_set_timestamp(&sbuf->v4lbuf, ktime_get_ns());
 
 	*buf = sbuf->v4lbuf;
 	return 0;
diff --git a/drivers/media/usb/usbvision/usbvision-video.c b/drivers/media/usb/usbvision/usbvision-video.c
index 93d36aab824f..5ca2c2f35fe2 100644
--- a/drivers/media/usb/usbvision/usbvision-video.c
+++ b/drivers/media/usb/usbvision/usbvision-video.c
@@ -696,7 +696,7 @@ static int vidioc_querybuf(struct file *file,
 	vb->length = usbvision->curwidth *
 		usbvision->curheight *
 		usbvision->palette.bytes_per_pixel;
-	vb->timestamp = ns_to_timeval(usbvision->frame[vb->index].ts);
+	v4l2_buffer_set_timestamp(vb, usbvision->frame[vb->index].ts);
 	vb->sequence = usbvision->frame[vb->index].sequence;
 	return 0;
 }
@@ -765,7 +765,7 @@ static int vidioc_dqbuf(struct file *file, void *priv, struct v4l2_buffer *vb)
 		V4L2_BUF_FLAG_TIMESTAMP_MONOTONIC;
 	vb->index = f->index;
 	vb->sequence = f->sequence;
-	vb->timestamp = ns_to_timeval(f->ts);
+	v4l2_buffer_set_timestamp(vb, f->ts);
 	vb->field = V4L2_FIELD_NONE;
 	vb->bytesused = f->scanlength;
 
diff --git a/drivers/media/v4l2-core/videobuf-core.c b/drivers/media/v4l2-core/videobuf-core.c
index 939fc11cf080..2686f03b322e 100644
--- a/drivers/media/v4l2-core/videobuf-core.c
+++ b/drivers/media/v4l2-core/videobuf-core.c
@@ -19,6 +19,7 @@
 #include <linux/interrupt.h>
 
 #include <media/videobuf-core.h>
+#include <media/v4l2-common.h>
 
 #define MAGIC_BUFFER 0x20070728
 #define MAGIC_CHECK(is, should)						\
@@ -364,7 +365,7 @@ static void videobuf_status(struct videobuf_queue *q, struct v4l2_buffer *b,
 	}
 
 	b->field     = vb->field;
-	b->timestamp = ns_to_timeval(vb->ts);
+	v4l2_buffer_set_timestamp(b, vb->ts);
 	b->bytesused = vb->size;
 	b->sequence  = vb->field_count >> 1;
 }
@@ -578,7 +579,7 @@ int videobuf_qbuf(struct videobuf_queue *q, struct v4l2_buffer *b)
 		    || q->type == V4L2_BUF_TYPE_SDR_OUTPUT) {
 			buf->size = b->bytesused;
 			buf->field = b->field;
-			buf->ts = v4l2_timeval_to_ns(&b->timestamp);
+			buf->ts = v4l2_buffer_get_timestamp(b);
 		}
 		break;
 	case V4L2_MEMORY_USERPTR:
diff --git a/include/media/v4l2-common.h b/include/media/v4l2-common.h
index d8c29e089000..150ee16ebd81 100644
--- a/include/media/v4l2-common.h
+++ b/include/media/v4l2-common.h
@@ -14,6 +14,7 @@
 #ifndef V4L2_COMMON_H_
 #define V4L2_COMMON_H_
 
+#include <linux/time.h>
 #include <media/v4l2-dev.h>
 
 /* Common printk constructs for v4l-i2c drivers. These macros create a unique
@@ -518,4 +519,24 @@ int v4l2_fill_pixfmt(struct v4l2_pix_format *pixfmt, u32 pixelformat,
 int v4l2_fill_pixfmt_mp(struct v4l2_pix_format_mplane *pixfmt, u32 pixelformat,
 			u32 width, u32 height);
 
+static inline u64 v4l2_buffer_get_timestamp(const struct v4l2_buffer *buf)
+{
+	/*
+	 * When the timestamp comes from 32-bit user space, there may be
+	 * uninitialized data in tv_usec, so cast it to u32.
+	 * Otherwise allow invalid input for backwards compatibility.
+	 */
+	return buf->timestamp.tv_sec * NSEC_PER_SEC +
+		(u32)buf->timestamp.tv_usec * NSEC_PER_USEC;
+}
+
+static inline void v4l2_buffer_set_timestamp(struct v4l2_buffer *buf,
+					     u64 timestamp)
+{
+	struct timespec64 ts = ns_to_timespec64(timestamp);
+
+	buf->timestamp.tv_sec  = ts.tv_sec;
+	buf->timestamp.tv_usec = ts.tv_nsec / NSEC_PER_USEC;
+}
+
 #endif /* V4L2_COMMON_H_ */
diff --git a/include/trace/events/v4l2.h b/include/trace/events/v4l2.h
index 83860de120e3..248bc09bfc99 100644
--- a/include/trace/events/v4l2.h
+++ b/include/trace/events/v4l2.h
@@ -130,7 +130,7 @@ DECLARE_EVENT_CLASS(v4l2_event_class,
 		__entry->bytesused = buf->bytesused;
 		__entry->flags = buf->flags;
 		__entry->field = buf->field;
-		__entry->timestamp = timeval_to_ns(&buf->timestamp);
+		__entry->timestamp = v4l2_buffer_get_timestamp(buf);
 		__entry->timecode_type = buf->timecode.type;
 		__entry->timecode_flags = buf->timecode.flags;
 		__entry->timecode_frames = buf->timecode.frames;
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 04481c717fee..6ef4a5b787a4 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -1017,6 +1017,7 @@ struct v4l2_buffer {
 	};
 };
 
+#ifndef __KERNEL__
 /**
  * v4l2_timeval_to_ns - Convert timeval to nanoseconds
  * @ts:		pointer to the timeval variable to be converted
@@ -1028,6 +1029,7 @@ static inline __u64 v4l2_timeval_to_ns(const struct timeval *tv)
 {
 	return (__u64)tv->tv_sec * 1000000000ULL + tv->tv_usec * 1000;
 }
+#endif
 
 /*  Flags for 'flags' field */
 /* Buffer is mapped (flag) */
-- 
cgit v1.2.3


From 1a6c0b36dd19c51cdd76895d009c5deba2286ebb Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Mon, 16 Dec 2019 15:15:03 +0100
Subject: media: v4l2-core: fix VIDIOC_DQEVENT for time64 ABI

The v4l2_event structure contains a 'struct timespec' member that is
defined by the user space C library, creating an ABI incompatibility
when that gets updated to a 64-bit time_t.

While passing a 32-bit time_t here would be sufficient for CLOCK_MONOTONIC
timestamps, simply redefining the structure to use the kernel's
__kernel_old_timespec would not work for any library that uses a copy
of the linux/videodev2.h header file rather than including the copy from
the latest kernel headers.

This means the kernel has to be changed to handle both versions of the
structure layout on a 32-bit architecture. The easiest way to do this
is during the copy from/to user space.
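
A rough userspace sketch of the idea, assuming simplified stand-in
structures (struct demo_event and struct demo_event_time32 are made up
for this illustration and are not the real v4l2_event layouts). Because
the ioctl number encodes the argument size, the two layouts yield two
distinct command codes that can be told apart and translated:

```c
#include <linux/ioctl.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the old layout with a 32-bit timespec */
struct demo_event_time32 {
	uint32_t type;
	uint8_t  data[64];
	uint32_t pending;
	uint32_t sequence;
	struct { int32_t tv_sec; int32_t tv_nsec; } timestamp;
	uint32_t id;
	uint32_t reserved[8];
};

/* Stand-in for the layout with a 64-bit timespec */
struct demo_event {
	uint32_t type;
	uint8_t  data[64];
	uint32_t pending;
	uint32_t sequence;
	struct { int64_t tv_sec; int64_t tv_nsec; } timestamp;
	uint32_t id;
	uint32_t reserved[8];
};

/* Same magic and number, different size, hence different command codes */
#define DEMO_DQEVENT		_IOR('V', 89, struct demo_event)
#define DEMO_DQEVENT_TIME32	_IOR('V', 89, struct demo_event_time32)

/* Map the old command onto the one the handlers implement */
static unsigned int demo_translate_cmd(unsigned int cmd)
{
	switch (cmd) {
	case DEMO_DQEVENT_TIME32:
		return DEMO_DQEVENT;
	}

	return cmd;
}

/* Convert field by field on the way back to the caller */
static void demo_event_to_time32(struct demo_event_time32 *dst,
				 const struct demo_event *src)
{
	*dst = (struct demo_event_time32) {
		.type		= src->type,
		.pending	= src->pending,
		.sequence	= src->sequence,
		.timestamp.tv_sec  = (int32_t)src->timestamp.tv_sec,
		.timestamp.tv_nsec = (int32_t)src->timestamp.tv_nsec,
		.id		= src->id,
	};

	memcpy(dst->data, src->data, sizeof(src->data));
	memcpy(dst->reserved, src->reserved, sizeof(src->reserved));
}
```

The handlers themselves only ever see the 64-bit layout; the narrowing
happens once, at the user space boundary.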

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 drivers/media/v4l2-core/v4l2-event.c  |  5 ++++-
 drivers/media/v4l2-core/v4l2-ioctl.c  | 29 ++++++++++++++++++++++++++++-
 drivers/media/v4l2-core/v4l2-subdev.c | 26 +++++++++++++++++++++++++-
 include/media/v4l2-ioctl.h            | 25 +++++++++++++++++++++++++
 include/uapi/linux/videodev2.h        |  4 ++++
 5 files changed, 86 insertions(+), 3 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/media/v4l2-core/v4l2-event.c b/drivers/media/v4l2-core/v4l2-event.c
index 9d673d113d7a..290c6b213179 100644
--- a/drivers/media/v4l2-core/v4l2-event.c
+++ b/drivers/media/v4l2-core/v4l2-event.c
@@ -27,6 +27,7 @@ static unsigned sev_pos(const struct v4l2_subscribed_event *sev, unsigned idx)
 static int __v4l2_event_dequeue(struct v4l2_fh *fh, struct v4l2_event *event)
 {
 	struct v4l2_kevent *kev;
+	struct timespec64 ts;
 	unsigned long flags;
 
 	spin_lock_irqsave(&fh->vdev->fh_lock, flags);
@@ -44,7 +45,9 @@ static int __v4l2_event_dequeue(struct v4l2_fh *fh, struct v4l2_event *event)
 
 	kev->event.pending = fh->navailable;
 	*event = kev->event;
-	event->timestamp = ns_to_timespec(kev->ts);
+	ts = ns_to_timespec64(kev->ts);
+	event->timestamp.tv_sec = ts.tv_sec;
+	event->timestamp.tv_nsec = ts.tv_nsec;
 	kev->sev->first = sev_pos(kev->sev, 1);
 	kev->sev->in_use--;
 
diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index 1f22b1355778..f13b232e6c31 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -821,7 +821,7 @@ static void v4l_print_event(const void *arg, bool write_only)
 	const struct v4l2_event *p = arg;
 	const struct v4l2_event_ctrl *c;
 
-	pr_cont("type=0x%x, pending=%u, sequence=%u, id=%u, timestamp=%lu.%9.9lu\n",
+	pr_cont("type=0x%x, pending=%u, sequence=%u, id=%u, timestamp=%llu.%9.9llu\n",
 			p->type, p->pending, p->sequence, p->id,
 			p->timestamp.tv_sec, p->timestamp.tv_nsec);
 	switch (p->type) {
@@ -3025,6 +3025,13 @@ static int check_array_args(unsigned int cmd, void *parg, size_t *array_size,
 
 static unsigned int video_translate_cmd(unsigned int cmd)
 {
+	switch (cmd) {
+#ifdef CONFIG_COMPAT_32BIT_TIME
+	case VIDIOC_DQEVENT_TIME32:
+		return VIDIOC_DQEVENT;
+#endif
+	}
+
 	return cmd;
 }
 
@@ -3074,6 +3081,26 @@ static int video_put_user(void __user *arg, void *parg, unsigned int cmd)
 		return 0;
 
 	switch (cmd) {
+#ifdef CONFIG_COMPAT_32BIT_TIME
+	case VIDIOC_DQEVENT_TIME32: {
+		struct v4l2_event *ev = parg;
+		struct v4l2_event_time32 ev32 = {
+			.type		= ev->type,
+			.pending	= ev->pending,
+			.sequence	= ev->sequence,
+			.timestamp.tv_sec  = ev->timestamp.tv_sec,
+			.timestamp.tv_nsec = ev->timestamp.tv_nsec,
+			.id		= ev->id,
+		};
+
+		memcpy(&ev32.u, &ev->u, sizeof(ev->u));
+		memcpy(&ev32.reserved, &ev->reserved, sizeof(ev->reserved));
+
+		if (copy_to_user(arg, &ev32, sizeof(ev32)))
+			return -EFAULT;
+		break;
+	}
+#endif
 	default:
 		/*  Copy results into user buffer  */
 		if (copy_to_user(arg, parg, _IOC_SIZE(cmd)))
diff --git a/drivers/media/v4l2-core/v4l2-subdev.c b/drivers/media/v4l2-core/v4l2-subdev.c
index 9e987c0f840e..de926e311348 100644
--- a/drivers/media/v4l2-core/v4l2-subdev.c
+++ b/drivers/media/v4l2-core/v4l2-subdev.c
@@ -331,8 +331,8 @@ static long subdev_do_ioctl(struct file *file, unsigned int cmd, void *arg)
 	struct v4l2_fh *vfh = file->private_data;
 #if defined(CONFIG_VIDEO_V4L2_SUBDEV_API)
 	struct v4l2_subdev_fh *subdev_fh = to_v4l2_subdev_fh(vfh);
-	int rval;
 #endif
+	int rval;
 
 	switch (cmd) {
 	case VIDIOC_QUERYCTRL:
@@ -392,6 +392,30 @@ static long subdev_do_ioctl(struct file *file, unsigned int cmd, void *arg)
 
 		return v4l2_event_dequeue(vfh, arg, file->f_flags & O_NONBLOCK);
 
+	case VIDIOC_DQEVENT_TIME32: {
+		struct v4l2_event_time32 *ev32 = arg;
+		struct v4l2_event ev;
+
+		if (!(sd->flags & V4L2_SUBDEV_FL_HAS_EVENTS))
+			return -ENOIOCTLCMD;
+
+		rval = v4l2_event_dequeue(vfh, &ev, file->f_flags & O_NONBLOCK);
+
+		*ev32 = (struct v4l2_event_time32) {
+			.type		= ev.type,
+			.pending	= ev.pending,
+			.sequence	= ev.sequence,
+			.timestamp.tv_sec  = ev.timestamp.tv_sec,
+			.timestamp.tv_nsec = ev.timestamp.tv_nsec,
+			.id		= ev.id,
+		};
+
+		memcpy(&ev32->u, &ev.u, sizeof(ev.u));
+		memcpy(&ev32->reserved, &ev.reserved, sizeof(ev.reserved));
+
+		return rval;
+	}
+
 	case VIDIOC_SUBSCRIBE_EVENT:
 		return v4l2_subdev_call(sd, core, subscribe_event, vfh, arg);
 
diff --git a/include/media/v4l2-ioctl.h b/include/media/v4l2-ioctl.h
index 4bba65a59d46..05c1ec93a911 100644
--- a/include/media/v4l2-ioctl.h
+++ b/include/media/v4l2-ioctl.h
@@ -724,4 +724,29 @@ long int video_usercopy(struct file *file, unsigned int cmd,
 long int video_ioctl2(struct file *file,
 		      unsigned int cmd, unsigned long int arg);
 
+/*
+ * The user space interpretation of the 'v4l2_event' differs
+ * based on the 'time_t' definition on 32-bit architectures, so
+ * the kernel has to handle both.
+ * This is the old version for 32-bit architectures.
+ */
+struct v4l2_event_time32 {
+	__u32				type;
+	union {
+		struct v4l2_event_vsync		vsync;
+		struct v4l2_event_ctrl		ctrl;
+		struct v4l2_event_frame_sync	frame_sync;
+		struct v4l2_event_src_change	src_change;
+		struct v4l2_event_motion_det	motion_det;
+		__u8				data[64];
+	} u;
+	__u32				pending;
+	__u32				sequence;
+	struct old_timespec32		timestamp;
+	__u32				id;
+	__u32				reserved[8];
+};
+
+#define	VIDIOC_DQEVENT_TIME32	 _IOR('V', 89, struct v4l2_event_time32)
+
 #endif /* _V4L2_IOCTL_H */
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index 6ef4a5b787a4..caf156d45842 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -2341,7 +2341,11 @@ struct v4l2_event {
 	} u;
 	__u32				pending;
 	__u32				sequence;
+#ifdef __KERNEL__
+	struct __kernel_timespec	timestamp;
+#else
 	struct timespec			timestamp;
+#endif
 	__u32				id;
 	__u32				reserved[8];
 };
-- 
cgit v1.2.3


From 577c89b0ce726e44c08c396d14f84a00070a57b7 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Mon, 16 Dec 2019 15:15:04 +0100
Subject: media: v4l2-core: fix v4l2_buffer handling for time64 ABI

The v4l2_buffer structure contains a 'struct timeval' member that is
defined by the user space C library, creating an ABI incompatibility
when that gets updated to a 64-bit time_t.

As in v4l2_event, handle this with a special case in video_put_user()
and video_get_user() to replace the memcpy there.

Since the structure also contains a pointer, there are now two
native versions (on 32-bit systems) as well as two compat versions
(on 64-bit systems), which unfortunately complicates the compat
handler quite a bit.

Duplicating the existing handlers for the new types is a safe
conversion for now, but unfortunately this may turn into a
maintenance burden later. A larger-scale rework of the
compat code might be a better alternative, but is out of scope
of the y2038 work.

Sparc64 needs a special case because of its special suseconds_t
definition.
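
The two timeval layouts can be compared with a small sketch. The struct
names below are hypothetical, and the checks assume an ABI with a 4-byte
int and an 8-byte long long:

```c
#include <stddef.h>

/* Layout used by everyone else: two 64-bit words */
struct demo_timeval_generic {
	long long	tv_sec;
	long long	tv_usec;
};

/* sparc64-style layout: 32-bit microseconds in the first half of the
 * second 64-bit word, followed by padding */
struct demo_timeval_sparc64 {
	long long	tv_sec;
	int		tv_usec;
	int		pad;
};

/* Both variants occupy the same 16 bytes; only the position and width
 * of the microseconds field differ. */
_Static_assert(sizeof(struct demo_timeval_generic) == 16,
	       "generic layout is two 64-bit words");
_Static_assert(sizeof(struct demo_timeval_sparc64) == 16,
	       "sparc64 layout has the same size");
_Static_assert(offsetof(struct demo_timeval_sparc64, tv_usec) == 8,
	       "tv_usec starts the second 64-bit word");

/* Returns 1 when both layouts have the same size */
static int demo_layouts_same_size(void)
{
	return sizeof(struct demo_timeval_generic) ==
	       sizeof(struct demo_timeval_sparc64);
}
```

Since the overall size is identical, the enclosing structure keeps the
same size either way and only the timeval member needs the special case.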

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
---
 drivers/media/v4l2-core/v4l2-ioctl.c | 74 ++++++++++++++++++++++++++++++++++--
 include/media/v4l2-ioctl.h           | 30 +++++++++++++++
 include/uapi/linux/videodev2.h       | 23 +++++++++++
 3 files changed, 123 insertions(+), 4 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/media/v4l2-core/v4l2-ioctl.c b/drivers/media/v4l2-core/v4l2-ioctl.c
index f13b232e6c31..15035cbf616e 100644
--- a/drivers/media/v4l2-core/v4l2-ioctl.c
+++ b/drivers/media/v4l2-core/v4l2-ioctl.c
@@ -474,10 +474,10 @@ static void v4l_print_buffer(const void *arg, bool write_only)
 	const struct v4l2_plane *plane;
 	int i;
 
-	pr_cont("%02ld:%02d:%02d.%08ld index=%d, type=%s, request_fd=%d, flags=0x%08x, field=%s, sequence=%d, memory=%s",
-			p->timestamp.tv_sec / 3600,
-			(int)(p->timestamp.tv_sec / 60) % 60,
-			(int)(p->timestamp.tv_sec % 60),
+	pr_cont("%02d:%02d:%02d.%09ld index=%d, type=%s, request_fd=%d, flags=0x%08x, field=%s, sequence=%d, memory=%s",
+			(int)p->timestamp.tv_sec / 3600,
+			((int)p->timestamp.tv_sec / 60) % 60,
+			((int)p->timestamp.tv_sec % 60),
 			(long)p->timestamp.tv_usec,
 			p->index,
 			prt_names(p->type, v4l2_type_names), p->request_fd,
@@ -3029,6 +3029,14 @@ static unsigned int video_translate_cmd(unsigned int cmd)
 #ifdef CONFIG_COMPAT_32BIT_TIME
 	case VIDIOC_DQEVENT_TIME32:
 		return VIDIOC_DQEVENT;
+	case VIDIOC_QUERYBUF_TIME32:
+		return VIDIOC_QUERYBUF;
+	case VIDIOC_QBUF_TIME32:
+		return VIDIOC_QBUF;
+	case VIDIOC_DQBUF_TIME32:
+		return VIDIOC_DQBUF;
+	case VIDIOC_PREPARE_BUF_TIME32:
+		return VIDIOC_PREPARE_BUF;
 #endif
 	}
 
@@ -3047,6 +3055,39 @@ static int video_get_user(void __user *arg, void *parg, unsigned int cmd,
 	}
 
 	switch (cmd) {
+#ifdef CONFIG_COMPAT_32BIT_TIME
+	case VIDIOC_QUERYBUF_TIME32:
+	case VIDIOC_QBUF_TIME32:
+	case VIDIOC_DQBUF_TIME32:
+	case VIDIOC_PREPARE_BUF_TIME32: {
+		struct v4l2_buffer_time32 vb32;
+		struct v4l2_buffer *vb = parg;
+
+		if (copy_from_user(&vb32, arg, sizeof(vb32)))
+			return -EFAULT;
+
+		*vb = (struct v4l2_buffer) {
+			.index		= vb32.index,
+			.type		= vb32.type,
+			.bytesused	= vb32.bytesused,
+			.flags		= vb32.flags,
+			.field		= vb32.field,
+			.timestamp.tv_sec	= vb32.timestamp.tv_sec,
+			.timestamp.tv_usec	= vb32.timestamp.tv_usec,
+			.timecode	= vb32.timecode,
+			.sequence	= vb32.sequence,
+			.memory		= vb32.memory,
+			.m.userptr	= vb32.m.userptr,
+			.length		= vb32.length,
+			.request_fd	= vb32.request_fd,
+		};
+
+		if (cmd == VIDIOC_QUERYBUF_TIME32)
+			vb->request_fd = 0;
+
+		break;
+	}
+#endif
 	default:
 		/*
 		 * In some cases, only a few fields are used as input,
@@ -3100,6 +3141,31 @@ static int video_put_user(void __user *arg, void *parg, unsigned int cmd)
 			return -EFAULT;
 		break;
 	}
+	case VIDIOC_QUERYBUF_TIME32:
+	case VIDIOC_QBUF_TIME32:
+	case VIDIOC_DQBUF_TIME32:
+	case VIDIOC_PREPARE_BUF_TIME32: {
+		struct v4l2_buffer *vb = parg;
+		struct v4l2_buffer_time32 vb32 = {
+			.index		= vb->index,
+			.type		= vb->type,
+			.bytesused	= vb->bytesused,
+			.flags		= vb->flags,
+			.field		= vb->field,
+			.timestamp.tv_sec	= vb->timestamp.tv_sec,
+			.timestamp.tv_usec	= vb->timestamp.tv_usec,
+			.timecode	= vb->timecode,
+			.sequence	= vb->sequence,
+			.memory		= vb->memory,
+			.m.userptr	= vb->m.userptr,
+			.length		= vb->length,
+			.request_fd	= vb->request_fd,
+		};
+
+		if (copy_to_user(arg, &vb32, sizeof(vb32)))
+			return -EFAULT;
+		break;
+	}
 #endif
 	default:
 		/*  Copy results into user buffer  */
diff --git a/include/media/v4l2-ioctl.h b/include/media/v4l2-ioctl.h
index 05c1ec93a911..86878fba332b 100644
--- a/include/media/v4l2-ioctl.h
+++ b/include/media/v4l2-ioctl.h
@@ -749,4 +749,34 @@ struct v4l2_event_time32 {
 
 #define	VIDIOC_DQEVENT_TIME32	 _IOR('V', 89, struct v4l2_event_time32)
 
+struct v4l2_buffer_time32 {
+	__u32			index;
+	__u32			type;
+	__u32			bytesused;
+	__u32			flags;
+	__u32			field;
+	struct old_timeval32	timestamp;
+	struct v4l2_timecode	timecode;
+	__u32			sequence;
+
+	/* memory location */
+	__u32			memory;
+	union {
+		__u32           offset;
+		unsigned long   userptr;
+		struct v4l2_plane *planes;
+		__s32		fd;
+	} m;
+	__u32			length;
+	__u32			reserved2;
+	union {
+		__s32		request_fd;
+		__u32		reserved;
+	};
+};
+#define VIDIOC_QUERYBUF_TIME32	_IOWR('V',  9, struct v4l2_buffer_time32)
+#define VIDIOC_QBUF_TIME32	_IOWR('V', 15, struct v4l2_buffer_time32)
+#define VIDIOC_DQBUF_TIME32	_IOWR('V', 17, struct v4l2_buffer_time32)
+#define VIDIOC_PREPARE_BUF_TIME32 _IOWR('V', 93, struct v4l2_buffer_time32)
+
 #endif /* _V4L2_IOCTL_H */
diff --git a/include/uapi/linux/videodev2.h b/include/uapi/linux/videodev2.h
index caf156d45842..5f9357dcb060 100644
--- a/include/uapi/linux/videodev2.h
+++ b/include/uapi/linux/videodev2.h
@@ -912,6 +912,25 @@ struct v4l2_jpegcompression {
 /*
  *	M E M O R Y - M A P P I N G   B U F F E R S
  */
+
+#ifdef __KERNEL__
+/*
+ * This corresponds to the user space version of timeval
+ * for 64-bit time_t. sparc64 is different from everyone
+ * else, using the microseconds in the wrong half of the
+ * second 64-bit word.
+ */
+struct __kernel_v4l2_timeval {
+	long long	tv_sec;
+#if defined(__sparc__) && defined(__arch64__)
+	int		tv_usec;
+	int		__pad;
+#else
+	long long	tv_usec;
+#endif
+};
+#endif
+
 struct v4l2_requestbuffers {
 	__u32			count;
 	__u32			type;		/* enum v4l2_buf_type */
@@ -997,7 +1016,11 @@ struct v4l2_buffer {
 	__u32			bytesused;
 	__u32			flags;
 	__u32			field;
+#ifdef __KERNEL__
+	struct __kernel_v4l2_timeval timestamp;
+#else
 	struct timeval		timestamp;
+#endif
 	struct v4l2_timecode	timecode;
 	__u32			sequence;
 
-- 
cgit v1.2.3


From 757cc3e9ff1d72d014096399d6e2bf03974d9da1 Mon Sep 17 00:00:00 2001
From: Rijo Thomas <Rijo-john.Thomas@amd.com>
Date: Fri, 27 Dec 2019 10:54:01 +0530
Subject: tee: add AMD-TEE driver

Adds AMD-TEE driver.
* targets AMD APUs that have an AMD Secure Processor with software-based
  Trusted Execution Environment (TEE) support
* registers with TEE subsystem
* defines tee_driver_ops function callbacks
* kernel allocated memory is used as shared memory between normal
  world and secure world.
* acts as REE (Rich Execution Environment) communication agent, which
  uses the services of AMD Secure Processor driver to submit commands
  for processing in TEE environment
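
As a hedged illustration of the registration with the TEE subsystem,
user space could probe a TEE device node and look at the reported
implementation ID. The sketch below assumes that the one-line uapi
change adds TEE_IMPL_ID_AMDTEE with the value 2 (defined here only as a
fallback) and that the device shows up as /dev/tee0; the helper name
tee_query_impl_id() is hypothetical:

```c
#include <fcntl.h>
#include <linux/tee.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

#ifndef TEE_IMPL_ID_AMDTEE
#define TEE_IMPL_ID_AMDTEE 2	/* assumption: value from the uapi header */
#endif

/* Hypothetical helper: ask the TEE subsystem which implementation backs
 * the given device node. Returns 0 on success, -1 on any error. */
static int tee_query_impl_id(const char *path, uint32_t *impl_id)
{
	struct tee_ioctl_version_data vers;
	int fd, ret;

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	ret = ioctl(fd, TEE_IOC_VERSION, &vers);
	close(fd);
	if (ret < 0)
		return -1;

	*impl_id = vers.impl_id;
	return 0;
}
```

A caller might invoke tee_query_impl_id("/dev/tee0", &id) and compare id
against TEE_IMPL_ID_AMDTEE to decide whether the AMD-TEE driver is the
one behind the node.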

Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Acked-by: Jens Wiklander <jens.wiklander@linaro.org>
Co-developed-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Devaraj Rangasamy <Devaraj.Rangasamy@amd.com>
Signed-off-by: Rijo Thomas <Rijo-john.Thomas@amd.com>
Reviewed-by: Gary R Hook <gary.hook@amd.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
---
 drivers/tee/Kconfig                 |   2 +-
 drivers/tee/Makefile                |   1 +
 drivers/tee/amdtee/Kconfig          |   8 +
 drivers/tee/amdtee/Makefile         |   5 +
 drivers/tee/amdtee/amdtee_if.h      | 183 +++++++++++++
 drivers/tee/amdtee/amdtee_private.h | 159 +++++++++++
 drivers/tee/amdtee/call.c           | 373 ++++++++++++++++++++++++++
 drivers/tee/amdtee/core.c           | 510 ++++++++++++++++++++++++++++++++++++
 drivers/tee/amdtee/shm_pool.c       |  93 +++++++
 include/uapi/linux/tee.h            |   1 +
 10 files changed, 1334 insertions(+), 1 deletion(-)
 create mode 100644 drivers/tee/amdtee/Kconfig
 create mode 100644 drivers/tee/amdtee/Makefile
 create mode 100644 drivers/tee/amdtee/amdtee_if.h
 create mode 100644 drivers/tee/amdtee/amdtee_private.h
 create mode 100644 drivers/tee/amdtee/call.c
 create mode 100644 drivers/tee/amdtee/core.c
 create mode 100644 drivers/tee/amdtee/shm_pool.c

(limited to 'include/uapi/linux')

diff --git a/drivers/tee/Kconfig b/drivers/tee/Kconfig
index 4f3197de3133..8da63f38e6bd 100644
--- a/drivers/tee/Kconfig
+++ b/drivers/tee/Kconfig
@@ -14,7 +14,7 @@ if TEE
 menu "TEE drivers"
 
 source "drivers/tee/optee/Kconfig"
-
+source "drivers/tee/amdtee/Kconfig"
 endmenu
 
 endif
diff --git a/drivers/tee/Makefile b/drivers/tee/Makefile
index 21f51fd88b07..68da044afbfa 100644
--- a/drivers/tee/Makefile
+++ b/drivers/tee/Makefile
@@ -4,3 +4,4 @@ tee-objs += tee_core.o
 tee-objs += tee_shm.o
 tee-objs += tee_shm_pool.o
 obj-$(CONFIG_OPTEE) += optee/
+obj-$(CONFIG_AMDTEE) += amdtee/
diff --git a/drivers/tee/amdtee/Kconfig b/drivers/tee/amdtee/Kconfig
new file mode 100644
index 000000000000..4e32b6413b41
--- /dev/null
+++ b/drivers/tee/amdtee/Kconfig
@@ -0,0 +1,8 @@
+# SPDX-License-Identifier: MIT
+# AMD-TEE Trusted Execution Environment Configuration
+config AMDTEE
+	tristate "AMD-TEE"
+	default m
+	depends on CRYPTO_DEV_SP_PSP
+	help
+	  This implements AMD's Trusted Execution Environment (TEE) driver.
diff --git a/drivers/tee/amdtee/Makefile b/drivers/tee/amdtee/Makefile
new file mode 100644
index 000000000000..ff1485266117
--- /dev/null
+++ b/drivers/tee/amdtee/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: MIT
+obj-$(CONFIG_AMDTEE) += amdtee.o
+amdtee-objs += core.o
+amdtee-objs += call.o
+amdtee-objs += shm_pool.o
diff --git a/drivers/tee/amdtee/amdtee_if.h b/drivers/tee/amdtee/amdtee_if.h
new file mode 100644
index 000000000000..ff48c3e47375
--- /dev/null
+++ b/drivers/tee/amdtee/amdtee_if.h
@@ -0,0 +1,183 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+
+/*
+ * This file has definitions related to Host and AMD-TEE Trusted OS interface.
+ * These definitions must match the definitions on the TEE side.
+ */
+
+#ifndef AMDTEE_IF_H
+#define AMDTEE_IF_H
+
+#include <linux/types.h>
+
+/*****************************************************************************
+ ** TEE Param
+ ******************************************************************************/
+#define TEE_MAX_PARAMS		4
+
+/**
+ * struct memref - memory reference structure
+ * @buf_id:    buffer ID of the buffer mapped by TEE_CMD_ID_MAP_SHARED_MEM
+ * @offset:    offset in bytes from beginning of the buffer
+ * @size:      data size in bytes
+ */
+struct memref {
+	u32 buf_id;
+	u32 offset;
+	u32 size;
+};
+
+struct value {
+	u32 a;
+	u32 b;
+};
+
+/*
+ * Parameters passed to open_session or invoke_command
+ */
+union tee_op_param {
+	struct memref mref;
+	struct value val;
+};
+
+struct tee_operation {
+	u32 param_types;
+	union tee_op_param params[TEE_MAX_PARAMS];
+};
+
+/* Must be same as in GP TEE specification */
+#define TEE_OP_PARAM_TYPE_NONE                  0
+#define TEE_OP_PARAM_TYPE_VALUE_INPUT           1
+#define TEE_OP_PARAM_TYPE_VALUE_OUTPUT          2
+#define TEE_OP_PARAM_TYPE_VALUE_INOUT           3
+#define TEE_OP_PARAM_TYPE_INVALID               4
+#define TEE_OP_PARAM_TYPE_MEMREF_INPUT          5
+#define TEE_OP_PARAM_TYPE_MEMREF_OUTPUT         6
+#define TEE_OP_PARAM_TYPE_MEMREF_INOUT          7
+
+#define TEE_PARAM_TYPE_GET(t, i)        (((t) >> ((i) * 4)) & 0xF)
+#define TEE_PARAM_TYPES(t0, t1, t2, t3) \
+	((t0) | ((t1) << 4) | ((t2) << 8) | ((t3) << 12))
+
+/*****************************************************************************
+ ** TEE Commands
+ *****************************************************************************/
+
+/*
+ * The shared memory between the rich world and the secure world may be
+ * physically non-contiguous. The structures below describe a shared memory
+ * region via a scatter/gather (sg) list.
+ */
+
+/**
+ * struct tee_sg_desc - sg descriptor for a physically contiguous buffer
+ * @low_addr: [in] bits[31:0] of buffer's physical address. Must be 4KB aligned
+ * @hi_addr:  [in] bits[63:32] of the buffer's physical address
+ * @size:     [in] size in bytes (must be multiple of 4KB)
+ */
+struct tee_sg_desc {
+	u32 low_addr;
+	u32 hi_addr;
+	u32 size;
+};
+
+/**
+ * struct tee_sg_list - structure describing a scatter/gather list
+ * @count:   [in] number of sg descriptors
+ * @size:    [in] total size of all buffers in the list. Must be multiple of 4KB
+ * @buf:     [in] list of sg buffer descriptors
+ */
+#define TEE_MAX_SG_DESC 64
+struct tee_sg_list {
+	u32 count;
+	u32 size;
+	struct tee_sg_desc buf[TEE_MAX_SG_DESC];
+};
+
+/**
+ * struct tee_cmd_map_shared_mem - command to map shared memory
+ * @buf_id:    [out] return buffer ID value
+ * @sg_list:   [in] list describing memory to be mapped
+ */
+struct tee_cmd_map_shared_mem {
+	u32 buf_id;
+	struct tee_sg_list sg_list;
+};
+
+/**
+ * struct tee_cmd_unmap_shared_mem - command to unmap shared memory
+ * @buf_id:    [in] buffer ID of memory to be unmapped
+ */
+struct tee_cmd_unmap_shared_mem {
+	u32 buf_id;
+};
+
+/**
+ * struct tee_cmd_load_ta - load Trusted Application (TA) binary into TEE
+ * @low_addr:    [in] bits [31:0] of the physical address of the TA binary
+ * @hi_addr:     [in] bits [63:32] of the physical address of the TA binary
+ * @size:        [in] size of TA binary in bytes
+ * @ta_handle:   [out] return handle of the loaded TA
+ */
+struct tee_cmd_load_ta {
+	u32 low_addr;
+	u32 hi_addr;
+	u32 size;
+	u32 ta_handle;
+};
+
+/**
+ * struct tee_cmd_unload_ta - command to unload TA binary from TEE environment
+ * @ta_handle:    [in] handle of the loaded TA to be unloaded
+ */
+struct tee_cmd_unload_ta {
+	u32 ta_handle;
+};
+
+/**
+ * struct tee_cmd_open_session - command to call TA_OpenSessionEntryPoint in TA
+ * @ta_handle:      [in] handle of the loaded TA
+ * @session_info:   [out] pointer to TA allocated session data
+ * @op:             [in/out] operation parameters
+ * @return_origin:  [out] origin of return code after TEE processing
+ */
+struct tee_cmd_open_session {
+	u32 ta_handle;
+	u32 session_info;
+	struct tee_operation op;
+	u32 return_origin;
+};
+
+/**
+ * struct tee_cmd_close_session - command to call TA_CloseSessionEntryPoint()
+ *                                in TA
+ * @ta_handle:      [in] handle of the loaded TA
+ * @session_info:   [in] pointer to TA allocated session data
+ */
+struct tee_cmd_close_session {
+	u32 ta_handle;
+	u32 session_info;
+};
+
+/**
+ * struct tee_cmd_invoke_cmd - command to call TA_InvokeCommandEntryPoint() in
+ *                             TA
+ * @ta_handle:     [in] handle of the loaded TA
+ * @cmd_id:        [in] TA command ID
+ * @session_info:  [in] pointer to TA allocated session data
+ * @op:            [in/out] operation parameters
+ * @return_origin: [out] origin of return code after TEE processing
+ */
+struct tee_cmd_invoke_cmd {
+	u32 ta_handle;
+	u32 cmd_id;
+	u32 session_info;
+	struct tee_operation op;
+	u32 return_origin;
+};
+
+#endif /*AMDTEE_IF_H*/
diff --git a/drivers/tee/amdtee/amdtee_private.h b/drivers/tee/amdtee/amdtee_private.h
new file mode 100644
index 000000000000..d7f798c3394b
--- /dev/null
+++ b/drivers/tee/amdtee/amdtee_private.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: MIT */
+
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+
+#ifndef AMDTEE_PRIVATE_H
+#define AMDTEE_PRIVATE_H
+
+#include <linux/mutex.h>
+#include <linux/spinlock.h>
+#include <linux/tee_drv.h>
+#include <linux/kref.h>
+#include <linux/types.h>
+#include "amdtee_if.h"
+
+#define DRIVER_NAME	"amdtee"
+#define DRIVER_AUTHOR   "AMD-TEE Linux driver team"
+
+/* Some GlobalPlatform error codes used in this driver */
+#define TEEC_SUCCESS			0x00000000
+#define TEEC_ERROR_GENERIC		0xFFFF0000
+#define TEEC_ERROR_BAD_PARAMETERS	0xFFFF0006
+#define TEEC_ERROR_COMMUNICATION	0xFFFF000E
+
+#define TEEC_ORIGIN_COMMS		0x00000002
+
+/* Maximum number of sessions which can be opened with a Trusted Application */
+#define TEE_NUM_SESSIONS			32
+
+#define TA_LOAD_PATH				"/amdtee"
+#define TA_PATH_MAX				60
+
+/**
+ * struct amdtee - main service struct
+ * @teedev:		client device
+ * @pool:		shared memory pool
+ */
+struct amdtee {
+	struct tee_device *teedev;
+	struct tee_shm_pool *pool;
+};
+
+/**
+ * struct amdtee_session - Trusted Application (TA) session related information.
+ * @ta_handle:     handle to Trusted Application (TA) loaded in TEE environment
+ * @refcount:      counter to keep track of sessions opened for the TA instance
+ * @session_info:  an array pointing to TA allocated session data.
+ * @sess_mask:     session usage bit-mask. If a particular bit is set, then the
+ *                 corresponding @session_info entry is in use or valid.
+ *
+ * Session structure is updated on open_session and this information is used for
+ * subsequent operations with the Trusted Application.
+ */
+struct amdtee_session {
+	struct list_head list_node;
+	u32 ta_handle;
+	struct kref refcount;
+	u32 session_info[TEE_NUM_SESSIONS];
+	DECLARE_BITMAP(sess_mask, TEE_NUM_SESSIONS);
+	spinlock_t lock;	/* synchronizes access to @sess_mask */
+};
+
+/**
+ * struct amdtee_context_data - AMD-TEE driver context data
+ * @sess_list:    Keeps track of sessions opened in current TEE context
+ */
+struct amdtee_context_data {
+	struct list_head sess_list;
+};
+
+struct amdtee_driver_data {
+	struct amdtee *amdtee;
+};
+
+struct shmem_desc {
+	void *kaddr;
+	u64 size;
+};
+
+/**
+ * struct amdtee_shm_data - Shared memory data
+ * @kaddr:	Kernel virtual address of shared memory
+ * @buf_id:	Buffer id of memory mapped by TEE_CMD_ID_MAP_SHARED_MEM
+ */
+struct amdtee_shm_data {
+	struct  list_head shm_node;
+	void    *kaddr;
+	u32     buf_id;
+};
+
+struct amdtee_shm_context {
+	struct list_head shmdata_list;
+};
+
+#define LOWER_TWO_BYTE_MASK	0x0000FFFF
+
+/**
+ * set_session_id() - Sets the session identifier.
+ * @ta_handle:      [in] handle of the loaded Trusted Application (TA)
+ * @session_index:  [in] Session index. Range: 0 to (TEE_NUM_SESSIONS - 1).
+ * @session:        [out] Pointer to session id
+ *
+ * The lower two bytes of the session identifier represent the TA handle and
+ * the upper two bytes represent the session index.
+ */
+static inline void set_session_id(u32 ta_handle, u32 session_index,
+				  u32 *session)
+{
+	*session = (session_index << 16) | (LOWER_TWO_BYTE_MASK & ta_handle);
+}
+
+static inline u32 get_ta_handle(u32 session)
+{
+	return session & LOWER_TWO_BYTE_MASK;
+}
+
+static inline u32 get_session_index(u32 session)
+{
+	return (session >> 16) & LOWER_TWO_BYTE_MASK;
+}
+
+int amdtee_open_session(struct tee_context *ctx,
+			struct tee_ioctl_open_session_arg *arg,
+			struct tee_param *param);
+
+int amdtee_close_session(struct tee_context *ctx, u32 session);
+
+int amdtee_invoke_func(struct tee_context *ctx,
+		       struct tee_ioctl_invoke_arg *arg,
+		       struct tee_param *param);
+
+int amdtee_cancel_req(struct tee_context *ctx, u32 cancel_id, u32 session);
+
+int amdtee_map_shmem(struct tee_shm *shm);
+
+void amdtee_unmap_shmem(struct tee_shm *shm);
+
+int handle_load_ta(void *data, u32 size,
+		   struct tee_ioctl_open_session_arg *arg);
+
+int handle_unload_ta(u32 ta_handle);
+
+int handle_open_session(struct tee_ioctl_open_session_arg *arg, u32 *info,
+			struct tee_param *p);
+
+int handle_close_session(u32 ta_handle, u32 info);
+
+int handle_map_shmem(u32 count, struct shmem_desc *start, u32 *buf_id);
+
+void handle_unmap_shmem(u32 buf_id);
+
+int handle_invoke_cmd(struct tee_ioctl_invoke_arg *arg, u32 sinfo,
+		      struct tee_param *p);
+
+struct tee_shm_pool *amdtee_config_shm(void);
+
+u32 get_buffer_id(struct tee_shm *shm);
+#endif /*AMDTEE_PRIVATE_H*/
diff --git a/drivers/tee/amdtee/call.c b/drivers/tee/amdtee/call.c
new file mode 100644
index 000000000000..87ccad256686
--- /dev/null
+++ b/drivers/tee/amdtee/call.c
@@ -0,0 +1,373 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+
+#include <linux/device.h>
+#include <linux/tee.h>
+#include <linux/tee_drv.h>
+#include <linux/psp-tee.h>
+#include <linux/slab.h>
+#include <linux/psp-sev.h>
+#include "amdtee_if.h"
+#include "amdtee_private.h"
+
+static int tee_params_to_amd_params(struct tee_param *tee, u32 count,
+				    struct tee_operation *amd)
+{
+	int i, ret = 0;
+	u32 type;
+
+	if (!count)
+		return 0;
+
+	if (!tee || !amd || count > TEE_MAX_PARAMS)
+		return -EINVAL;
+
+	amd->param_types = 0;
+	for (i = 0; i < count; i++) {
+		/* AMD TEE does not support meta parameter */
+		if (tee[i].attr > TEE_IOCTL_PARAM_ATTR_TYPE_MEMREF_INOUT)
+			return -EINVAL;
+
+		amd->param_types |= ((tee[i].attr & 0xF) << i * 4);
+	}
+
+	for (i = 0; i < count; i++) {
+		type = TEE_PARAM_TYPE_GET(amd->param_types, i);
+		pr_debug("%s: type[%d] = 0x%x\n", __func__, i, type);
+
+		if (type == TEE_OP_PARAM_TYPE_INVALID)
+			return -EINVAL;
+
+		if (type == TEE_OP_PARAM_TYPE_NONE)
+			continue;
+
+		/* It is assumed that all values are within 2^32-1 */
+		if (type > TEE_OP_PARAM_TYPE_VALUE_INOUT) {
+			u32 buf_id = get_buffer_id(tee[i].u.memref.shm);
+
+			amd->params[i].mref.buf_id = buf_id;
+			amd->params[i].mref.offset = tee[i].u.memref.shm_offs;
+			amd->params[i].mref.size = tee[i].u.memref.size;
+			pr_debug("%s: bufid[%d] = 0x%x, offset[%d] = 0x%x, size[%d] = 0x%x\n",
+				 __func__,
+				 i, amd->params[i].mref.buf_id,
+				 i, amd->params[i].mref.offset,
+				 i, amd->params[i].mref.size);
+		} else {
+			if (tee[i].u.value.c)
+				pr_warn("%s: Discarding value c\n", __func__);
+
+			amd->params[i].val.a = tee[i].u.value.a;
+			amd->params[i].val.b = tee[i].u.value.b;
+			pr_debug("%s: a[%d] = 0x%x, b[%d] = 0x%x\n", __func__,
+				 i, amd->params[i].val.a,
+				 i, amd->params[i].val.b);
+		}
+	}
+	return ret;
+}
+
+static int amd_params_to_tee_params(struct tee_param *tee, u32 count,
+				    struct tee_operation *amd)
+{
+	int i, ret = 0;
+	u32 type;
+
+	if (!count)
+		return 0;
+
+	if (!tee || !amd || count > TEE_MAX_PARAMS)
+		return -EINVAL;
+
+	/* Assumes amd->param_types is valid */
+	for (i = 0; i < count; i++) {
+		type = TEE_PARAM_TYPE_GET(amd->param_types, i);
+		pr_debug("%s: type[%d] = 0x%x\n", __func__, i, type);
+
+		if (type == TEE_OP_PARAM_TYPE_INVALID ||
+		    type > TEE_OP_PARAM_TYPE_MEMREF_INOUT)
+			return -EINVAL;
+
+		if (type == TEE_OP_PARAM_TYPE_NONE ||
+		    type == TEE_OP_PARAM_TYPE_VALUE_INPUT ||
+		    type == TEE_OP_PARAM_TYPE_MEMREF_INPUT)
+			continue;
+
+		/*
+		 * It is assumed that buf_id remains unchanged for
+		 * both open_session and invoke_cmd call
+		 */
+		if (type > TEE_OP_PARAM_TYPE_MEMREF_INPUT) {
+			tee[i].u.memref.shm_offs = amd->params[i].mref.offset;
+			tee[i].u.memref.size = amd->params[i].mref.size;
+			pr_debug("%s: bufid[%d] = 0x%x, offset[%d] = 0x%x, size[%d] = 0x%x\n",
+				 __func__,
+				 i, amd->params[i].mref.buf_id,
+				 i, amd->params[i].mref.offset,
+				 i, amd->params[i].mref.size);
+		} else {
+			/* field 'c' not supported by AMD TEE */
+			tee[i].u.value.a = amd->params[i].val.a;
+			tee[i].u.value.b = amd->params[i].val.b;
+			tee[i].u.value.c = 0;
+			pr_debug("%s: a[%d] = 0x%x, b[%d] = 0x%x\n",
+				 __func__,
+				 i, amd->params[i].val.a,
+				 i, amd->params[i].val.b);
+		}
+	}
+	return ret;
+}
+
+int handle_unload_ta(u32 ta_handle)
+{
+	struct tee_cmd_unload_ta cmd = {0};
+	int ret = 0;
+	u32 status;
+
+	if (!ta_handle)
+		return -EINVAL;
+
+	cmd.ta_handle = ta_handle;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_UNLOAD_TA, (void *)&cmd,
+				  sizeof(cmd), &status);
+	if (!ret && status != 0) {
+		pr_err("unload ta: status = 0x%x\n", status);
+		ret = -EBUSY;
+	}
+
+	return ret;
+}
+
+int handle_close_session(u32 ta_handle, u32 info)
+{
+	struct tee_cmd_close_session cmd = {0};
+	int ret = 0;
+	u32 status;
+
+	if (ta_handle == 0)
+		return -EINVAL;
+
+	cmd.ta_handle = ta_handle;
+	cmd.session_info = info;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_CLOSE_SESSION, (void *)&cmd,
+				  sizeof(cmd), &status);
+	if (!ret && status != 0) {
+		pr_err("close session: status = 0x%x\n", status);
+		ret = -EBUSY;
+	}
+
+	return ret;
+}
+
+void handle_unmap_shmem(u32 buf_id)
+{
+	struct tee_cmd_unmap_shared_mem cmd = {0};
+	int ret = 0;
+	u32 status;
+
+	cmd.buf_id = buf_id;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_UNMAP_SHARED_MEM, (void *)&cmd,
+				  sizeof(cmd), &status);
+	if (!ret)
+		pr_debug("unmap shared memory: buf_id %u status = 0x%x\n",
+			 buf_id, status);
+}
+
+int handle_invoke_cmd(struct tee_ioctl_invoke_arg *arg, u32 sinfo,
+		      struct tee_param *p)
+{
+	struct tee_cmd_invoke_cmd cmd = {0};
+	int ret = 0;
+
+	if (!arg || (!p && arg->num_params))
+		return -EINVAL;
+
+	arg->ret_origin = TEEC_ORIGIN_COMMS;
+
+	if (arg->session == 0) {
+		arg->ret = TEEC_ERROR_BAD_PARAMETERS;
+		return -EINVAL;
+	}
+
+	ret = tee_params_to_amd_params(p, arg->num_params, &cmd.op);
+	if (ret) {
+		pr_err("invalid Params. Abort invoke command\n");
+		arg->ret = TEEC_ERROR_BAD_PARAMETERS;
+		return ret;
+	}
+
+	cmd.ta_handle = get_ta_handle(arg->session);
+	cmd.cmd_id = arg->func;
+	cmd.session_info = sinfo;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_INVOKE_CMD, (void *)&cmd,
+				  sizeof(cmd), &arg->ret);
+	if (ret) {
+		arg->ret = TEEC_ERROR_COMMUNICATION;
+	} else {
+		ret = amd_params_to_tee_params(p, arg->num_params, &cmd.op);
+		if (unlikely(ret)) {
+			pr_err("invoke command: failed to copy output\n");
+			arg->ret = TEEC_ERROR_GENERIC;
+			return ret;
+		}
+		arg->ret_origin = cmd.return_origin;
+		pr_debug("invoke command: RO = 0x%x ret = 0x%x\n",
+			 arg->ret_origin, arg->ret);
+	}
+
+	return ret;
+}
+
+int handle_map_shmem(u32 count, struct shmem_desc *start, u32 *buf_id)
+{
+	struct tee_cmd_map_shared_mem *cmd;
+	phys_addr_t paddr;
+	int ret = 0, i;
+	u32 status;
+
+	if (!count || count > TEE_MAX_SG_DESC || !start || !buf_id)
+		return -EINVAL;
+
+	cmd = kzalloc(sizeof(*cmd), GFP_KERNEL);
+	if (!cmd)
+		return -ENOMEM;
+
+	/* Size must be page aligned */
+	for (i = 0; i < count ; i++) {
+		if (!start[i].kaddr || (start[i].size & (PAGE_SIZE - 1))) {
+			ret = -EINVAL;
+			goto free_cmd;
+		}
+
+		if ((u64)start[i].kaddr & (PAGE_SIZE - 1)) {
+			pr_err("map shared memory: page unaligned. addr 0x%llx\n",
+			       (u64)start[i].kaddr);
+			ret = -EINVAL;
+			goto free_cmd;
+		}
+	}
+
+	cmd->sg_list.count = count;
+
+	/* Create buffer list */
+	for (i = 0; i < count ; i++) {
+		paddr = __psp_pa(start[i].kaddr);
+		cmd->sg_list.buf[i].hi_addr = upper_32_bits(paddr);
+		cmd->sg_list.buf[i].low_addr = lower_32_bits(paddr);
+		cmd->sg_list.buf[i].size = start[i].size;
+		cmd->sg_list.size += cmd->sg_list.buf[i].size;
+
+		pr_debug("buf[%d]:hi addr = 0x%x\n", i,
+			 cmd->sg_list.buf[i].hi_addr);
+		pr_debug("buf[%d]:low addr = 0x%x\n", i,
+			 cmd->sg_list.buf[i].low_addr);
+		pr_debug("buf[%d]:size = 0x%x\n", i, cmd->sg_list.buf[i].size);
+		pr_debug("list size = 0x%x\n", cmd->sg_list.size);
+	}
+
+	*buf_id = 0;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_MAP_SHARED_MEM, (void *)cmd,
+				  sizeof(*cmd), &status);
+	if (!ret && !status) {
+		*buf_id = cmd->buf_id;
+		pr_debug("mapped buffer ID = 0x%x\n", *buf_id);
+	} else {
+		pr_err("map shared memory: status = 0x%x\n", status);
+		ret = -ENOMEM;
+	}
+
+free_cmd:
+	kfree(cmd);
+
+	return ret;
+}
+
+int handle_open_session(struct tee_ioctl_open_session_arg *arg, u32 *info,
+			struct tee_param *p)
+{
+	struct tee_cmd_open_session cmd = {0};
+	int ret = 0;
+
+	if (!arg || !info || (!p && arg->num_params))
+		return -EINVAL;
+
+	arg->ret_origin = TEEC_ORIGIN_COMMS;
+
+	if (arg->session == 0) {
+		arg->ret = TEEC_ERROR_GENERIC;
+		return -EINVAL;
+	}
+
+	ret = tee_params_to_amd_params(p, arg->num_params, &cmd.op);
+	if (ret) {
+		pr_err("invalid Params. Abort open session\n");
+		arg->ret = TEEC_ERROR_BAD_PARAMETERS;
+		return ret;
+	}
+
+	cmd.ta_handle = get_ta_handle(arg->session);
+	*info = 0;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_OPEN_SESSION, (void *)&cmd,
+				  sizeof(cmd), &arg->ret);
+	if (ret) {
+		arg->ret = TEEC_ERROR_COMMUNICATION;
+	} else {
+		ret = amd_params_to_tee_params(p, arg->num_params, &cmd.op);
+		if (unlikely(ret)) {
+			pr_err("open session: failed to copy output\n");
+			arg->ret = TEEC_ERROR_GENERIC;
+			return ret;
+		}
+		arg->ret_origin = cmd.return_origin;
+		*info = cmd.session_info;
+		pr_debug("open session: session info = 0x%x\n", *info);
+	}
+
+	pr_debug("open session: ret = 0x%x RO = 0x%x\n", arg->ret,
+		 arg->ret_origin);
+
+	return ret;
+}
+
+int handle_load_ta(void *data, u32 size, struct tee_ioctl_open_session_arg *arg)
+{
+	struct tee_cmd_load_ta cmd = {0};
+	phys_addr_t blob;
+	int ret = 0;
+
+	if (size == 0 || !data || !arg)
+		return -EINVAL;
+
+	blob = __psp_pa(data);
+	if (blob & (PAGE_SIZE - 1)) {
+		pr_err("load TA: page unaligned. blob 0x%llx\n", blob);
+		return -EINVAL;
+	}
+
+	cmd.hi_addr = upper_32_bits(blob);
+	cmd.low_addr = lower_32_bits(blob);
+	cmd.size = size;
+
+	ret = psp_tee_process_cmd(TEE_CMD_ID_LOAD_TA, (void *)&cmd,
+				  sizeof(cmd), &arg->ret);
+	if (ret) {
+		arg->ret_origin = TEEC_ORIGIN_COMMS;
+		arg->ret = TEEC_ERROR_COMMUNICATION;
+	} else {
+		set_session_id(cmd.ta_handle, 0, &arg->session);
+	}
+
+	pr_debug("load TA: TA handle = 0x%x, RO = 0x%x, ret = 0x%x\n",
+		 cmd.ta_handle, arg->ret_origin, arg->ret);
+
+	return 0;
+}
diff --git a/drivers/tee/amdtee/core.c b/drivers/tee/amdtee/core.c
new file mode 100644
index 000000000000..dd360f378ef0
--- /dev/null
+++ b/drivers/tee/amdtee/core.c
@@ -0,0 +1,510 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+
+#include <linux/errno.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/string.h>
+#include <linux/device.h>
+#include <linux/tee_drv.h>
+#include <linux/types.h>
+#include <linux/mm.h>
+#include <linux/uaccess.h>
+#include <linux/firmware.h>
+#include "amdtee_private.h"
+#include "../tee_private.h"
+
+static struct amdtee_driver_data *drv_data;
+static DEFINE_MUTEX(session_list_mutex);
+static struct amdtee_shm_context shmctx;
+
+static void amdtee_get_version(struct tee_device *teedev,
+			       struct tee_ioctl_version_data *vers)
+{
+	struct tee_ioctl_version_data v = {
+		.impl_id = TEE_IMPL_ID_AMDTEE,
+		.impl_caps = 0,
+		.gen_caps = TEE_GEN_CAP_GP,
+	};
+	*vers = v;
+}
+
+static int amdtee_open(struct tee_context *ctx)
+{
+	struct amdtee_context_data *ctxdata;
+
+	ctxdata = kzalloc(sizeof(*ctxdata), GFP_KERNEL);
+	if (!ctxdata)
+		return -ENOMEM;
+
+	INIT_LIST_HEAD(&ctxdata->sess_list);
+	INIT_LIST_HEAD(&shmctx.shmdata_list);
+
+	ctx->data = ctxdata;
+	return 0;
+}
+
+static void release_session(struct amdtee_session *sess)
+{
+	int i = 0;
+
+	/* Close any open session */
+	for (i = 0; i < TEE_NUM_SESSIONS; ++i) {
+		/* Check if session entry 'i' is valid */
+		if (!test_bit(i, sess->sess_mask))
+			continue;
+
+		handle_close_session(sess->ta_handle, sess->session_info[i]);
+	}
+
+	/* Unload Trusted Application once all sessions are closed */
+	handle_unload_ta(sess->ta_handle);
+	kfree(sess);
+}
+
+static void amdtee_release(struct tee_context *ctx)
+{
+	struct amdtee_context_data *ctxdata = ctx->data;
+
+	if (!ctxdata)
+		return;
+
+	while (true) {
+		struct amdtee_session *sess;
+
+		sess = list_first_entry_or_null(&ctxdata->sess_list,
+						struct amdtee_session,
+						list_node);
+
+		if (!sess)
+			break;
+
+		list_del(&sess->list_node);
+		release_session(sess);
+	}
+	kfree(ctxdata);
+
+	ctx->data = NULL;
+}
+
+/**
+ * alloc_session() - Allocate a session structure
+ * @ctxdata:    TEE Context data structure
+ * @session:    Session ID for which 'struct amdtee_session' structure is to be
+ *              allocated.
+ *
+ * Scans the TEE context's session list to check if the TA is already loaded
+ * into the TEE. If so, returns the 'session' structure for that TA. Otherwise
+ * allocates and initializes a new 'session' structure and adds it to the list.
+ *
+ * The caller must hold session_list_mutex.
+ *
+ * Returns:
+ * 'struct amdtee_session *' on success and NULL on failure.
+ */
+static struct amdtee_session *alloc_session(struct amdtee_context_data *ctxdata,
+					    u32 session)
+{
+	struct amdtee_session *sess;
+	u32 ta_handle = get_ta_handle(session);
+
+	/* Scan session list to check if TA is already loaded into the TEE */
+	list_for_each_entry(sess, &ctxdata->sess_list, list_node)
+		if (sess->ta_handle == ta_handle) {
+			kref_get(&sess->refcount);
+			return sess;
+		}
+
+	/* Allocate a new session and add to list */
+	sess = kzalloc(sizeof(*sess), GFP_KERNEL);
+	if (sess) {
+		sess->ta_handle = ta_handle;
+		kref_init(&sess->refcount);
+		spin_lock_init(&sess->lock);
+		list_add(&sess->list_node, &ctxdata->sess_list);
+	}
+
+	return sess;
+}
+
+/* Requires mutex to be held */
+static struct amdtee_session *find_session(struct amdtee_context_data *ctxdata,
+					   u32 session)
+{
+	u32 ta_handle = get_ta_handle(session);
+	u32 index = get_session_index(session);
+	struct amdtee_session *sess;
+
+	list_for_each_entry(sess, &ctxdata->sess_list, list_node)
+		if (ta_handle == sess->ta_handle && index < TEE_NUM_SESSIONS &&
+		    test_bit(index, sess->sess_mask))
+			return sess;
+
+	return NULL;
+}
+
+u32 get_buffer_id(struct tee_shm *shm)
+{
+	u32 buf_id = 0;
+	struct amdtee_shm_data *shmdata;
+
+	list_for_each_entry(shmdata, &shmctx.shmdata_list, shm_node)
+		if (shmdata->kaddr == shm->kaddr) {
+			buf_id = shmdata->buf_id;
+			break;
+		}
+
+	return buf_id;
+}
+
+static DEFINE_MUTEX(drv_mutex);
+static int copy_ta_binary(struct tee_context *ctx, void *ptr, void **ta,
+			  size_t *ta_size)
+{
+	const struct firmware *fw;
+	char fw_name[TA_PATH_MAX];
+	struct {
+		u32 lo;
+		u16 mid;
+		u16 hi_ver;
+		u8 seq_n[8];
+	} *uuid = ptr;
+	int n = 0, rc = 0;
+
+	n = snprintf(fw_name, TA_PATH_MAX,
+		     "%s/%08x-%04x-%04x-%02x%02x%02x%02x%02x%02x%02x%02x.bin",
+		     TA_LOAD_PATH, uuid->lo, uuid->mid, uuid->hi_ver,
+		     uuid->seq_n[0], uuid->seq_n[1],
+		     uuid->seq_n[2], uuid->seq_n[3],
+		     uuid->seq_n[4], uuid->seq_n[5],
+		     uuid->seq_n[6], uuid->seq_n[7]);
+	if (n < 0 || n >= TA_PATH_MAX) {
+		pr_err("failed to get firmware name\n");
+		return -EINVAL;
+	}
+
+	mutex_lock(&drv_mutex);
+	n = request_firmware(&fw, fw_name, &ctx->teedev->dev);
+	if (n) {
+		pr_err("failed to load firmware %s\n", fw_name);
+		rc = -ENOMEM;
+		goto unlock;
+	}
+
+	*ta_size = roundup(fw->size, PAGE_SIZE);
+	*ta = (void *)__get_free_pages(GFP_KERNEL, get_order(*ta_size));
+	if (!*ta) {
+		pr_err("%s: get_free_pages failed for %zu bytes\n", __func__,
+		       *ta_size);
+		rc = -ENOMEM;
+		goto rel_fw;
+	}
+
+	memcpy(*ta, fw->data, fw->size);
+rel_fw:
+	release_firmware(fw);
+unlock:
+	mutex_unlock(&drv_mutex);
+	return rc;
+}
+
+int amdtee_open_session(struct tee_context *ctx,
+			struct tee_ioctl_open_session_arg *arg,
+			struct tee_param *param)
+{
+	struct amdtee_context_data *ctxdata = ctx->data;
+	struct amdtee_session *sess = NULL;
+	u32 session_info;
+	void *ta = NULL;
+	size_t ta_size;
+	int rc = 0, i;
+
+	if (arg->clnt_login != TEE_IOCTL_LOGIN_PUBLIC) {
+		pr_err("unsupported client login method\n");
+		return -EINVAL;
+	}
+
+	rc = copy_ta_binary(ctx, &arg->uuid[0], &ta, &ta_size);
+	if (rc) {
+		pr_err("failed to copy TA binary\n");
+		return rc;
+	}
+
+	/* Load the TA binary into TEE environment */
+	handle_load_ta(ta, ta_size, arg);
+	if (arg->ret == TEEC_SUCCESS) {
+		mutex_lock(&session_list_mutex);
+		sess = alloc_session(ctxdata, arg->session);
+		mutex_unlock(&session_list_mutex);
+	}
+
+	if (arg->ret != TEEC_SUCCESS)
+		goto out;
+
+	if (!sess) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	/* Find an empty session index for the given TA */
+	spin_lock(&sess->lock);
+	i = find_first_zero_bit(sess->sess_mask, TEE_NUM_SESSIONS);
+	if (i < TEE_NUM_SESSIONS)
+		set_bit(i, sess->sess_mask);
+	spin_unlock(&sess->lock);
+
+	if (i >= TEE_NUM_SESSIONS) {
+		pr_err("reached maximum session count %d\n", TEE_NUM_SESSIONS);
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	/* Open session with loaded TA */
+	handle_open_session(arg, &session_info, param);
+
+	if (arg->ret == TEEC_SUCCESS) {
+		sess->session_info[i] = session_info;
+		set_session_id(sess->ta_handle, i, &arg->session);
+	} else {
+		pr_err("open_session failed 0x%x\n", arg->ret);
+		spin_lock(&sess->lock);
+		clear_bit(i, sess->sess_mask);
+		spin_unlock(&sess->lock);
+	}
+out:
+	free_pages((u64)ta, get_order(ta_size));
+	return rc;
+}
+
+static void destroy_session(struct kref *ref)
+{
+	struct amdtee_session *sess = container_of(ref, struct amdtee_session,
+						   refcount);
+
+	/* Unload the TA from TEE */
+	handle_unload_ta(sess->ta_handle);
+	mutex_lock(&session_list_mutex);
+	list_del(&sess->list_node);
+	mutex_unlock(&session_list_mutex);
+	kfree(sess);
+}
+
+int amdtee_close_session(struct tee_context *ctx, u32 session)
+{
+	struct amdtee_context_data *ctxdata = ctx->data;
+	u32 i, ta_handle, session_info;
+	struct amdtee_session *sess;
+
+	pr_debug("%s: sid = 0x%x\n", __func__, session);
+
+	/*
+	 * Check that the session is valid and clear the session
+	 * usage bit
+	 */
+	mutex_lock(&session_list_mutex);
+	sess = find_session(ctxdata, session);
+	if (sess) {
+		ta_handle = get_ta_handle(session);
+		i = get_session_index(session);
+		session_info = sess->session_info[i];
+		spin_lock(&sess->lock);
+		clear_bit(i, sess->sess_mask);
+		spin_unlock(&sess->lock);
+	}
+	mutex_unlock(&session_list_mutex);
+
+	if (!sess)
+		return -EINVAL;
+
+	/* Close the session */
+	handle_close_session(ta_handle, session_info);
+
+	kref_put(&sess->refcount, destroy_session);
+
+	return 0;
+}
+
+int amdtee_map_shmem(struct tee_shm *shm)
+{
+	struct shmem_desc shmem;
+	struct amdtee_shm_data *shmnode;
+	int rc, count;
+	u32 buf_id;
+
+	if (!shm)
+		return -EINVAL;
+
+	shmnode = kmalloc(sizeof(*shmnode), GFP_KERNEL);
+	if (!shmnode)
+		return -ENOMEM;
+
+	count = 1;
+	shmem.kaddr = shm->kaddr;
+	shmem.size = shm->size;
+
+	/*
+	 * Send a MAP command to TEE and get the corresponding
+	 * buffer Id
+	 */
+	rc = handle_map_shmem(count, &shmem, &buf_id);
+	if (rc) {
+		pr_err("map_shmem failed: ret = %d\n", rc);
+		kfree(shmnode);
+		return rc;
+	}
+
+	shmnode->kaddr = shm->kaddr;
+	shmnode->buf_id = buf_id;
+	list_add(&shmnode->shm_node, &shmctx.shmdata_list);
+
+	pr_debug("buf_id :[%x] kaddr[%p]\n", shmnode->buf_id, shmnode->kaddr);
+
+	return 0;
+}
+
+void amdtee_unmap_shmem(struct tee_shm *shm)
+{
+	u32 buf_id;
+	struct amdtee_shm_data *shmnode = NULL;
+
+	if (!shm)
+		return;
+
+	buf_id = get_buffer_id(shm);
+	/* Unmap the shared memory from TEE */
+	handle_unmap_shmem(buf_id);
+
+	list_for_each_entry(shmnode, &shmctx.shmdata_list, shm_node)
+		if (buf_id == shmnode->buf_id) {
+			list_del(&shmnode->shm_node);
+			kfree(shmnode);
+			break;
+		}
+}
+
+int amdtee_invoke_func(struct tee_context *ctx,
+		       struct tee_ioctl_invoke_arg *arg,
+		       struct tee_param *param)
+{
+	struct amdtee_context_data *ctxdata = ctx->data;
+	struct amdtee_session *sess;
+	u32 i, session_info;
+
+	/* Check that the session is valid */
+	mutex_lock(&session_list_mutex);
+	sess = find_session(ctxdata, arg->session);
+	if (sess) {
+		i = get_session_index(arg->session);
+		session_info = sess->session_info[i];
+	}
+	mutex_unlock(&session_list_mutex);
+
+	if (!sess)
+		return -EINVAL;
+
+	handle_invoke_cmd(arg, session_info, param);
+
+	return 0;
+}
+
+int amdtee_cancel_req(struct tee_context *ctx, u32 cancel_id, u32 session)
+{
+	return -EINVAL;
+}
+
+static const struct tee_driver_ops amdtee_ops = {
+	.get_version = amdtee_get_version,
+	.open = amdtee_open,
+	.release = amdtee_release,
+	.open_session = amdtee_open_session,
+	.close_session = amdtee_close_session,
+	.invoke_func = amdtee_invoke_func,
+	.cancel_req = amdtee_cancel_req,
+};
+
+static const struct tee_desc amdtee_desc = {
+	.name = DRIVER_NAME "-clnt",
+	.ops = &amdtee_ops,
+	.owner = THIS_MODULE,
+};
+
+static int __init amdtee_driver_init(void)
+{
+	struct amdtee *amdtee = NULL;
+	struct tee_device *teedev;
+	struct tee_shm_pool *pool = ERR_PTR(-EINVAL);
+	int rc;
+
+	drv_data = kzalloc(sizeof(*drv_data), GFP_KERNEL);
+	if (!drv_data)
+		return -ENOMEM;
+
+	amdtee = kzalloc(sizeof(*amdtee), GFP_KERNEL);
+	if (!amdtee) {
+		rc = -ENOMEM;
+		goto err_kfree_drv_data;
+	}
+
+	pool = amdtee_config_shm();
+	if (IS_ERR(pool)) {
+		pr_err("shared pool configuration error\n");
+		rc = PTR_ERR(pool);
+		goto err_kfree_amdtee;
+	}
+
+	teedev = tee_device_alloc(&amdtee_desc, NULL, pool, amdtee);
+	if (IS_ERR(teedev)) {
+		rc = PTR_ERR(teedev);
+		goto err;
+	}
+	amdtee->teedev = teedev;
+
+	rc = tee_device_register(amdtee->teedev);
+	if (rc)
+		goto err;
+
+	amdtee->pool = pool;
+
+	drv_data->amdtee = amdtee;
+
+	pr_info("amd-tee driver initialization successful\n");
+	return 0;
+
+err:
+	tee_device_unregister(amdtee->teedev);
+	if (pool)
+		tee_shm_pool_free(pool);
+
+err_kfree_amdtee:
+	kfree(amdtee);
+
+err_kfree_drv_data:
+	kfree(drv_data);
+	drv_data = NULL;
+
+	pr_err("amd-tee driver initialization failed\n");
+	return rc;
+}
+module_init(amdtee_driver_init);
+
+static void __exit amdtee_driver_exit(void)
+{
+	struct amdtee *amdtee;
+
+	if (!drv_data || !drv_data->amdtee)
+		return;
+
+	amdtee = drv_data->amdtee;
+
+	tee_device_unregister(amdtee->teedev);
+	tee_shm_pool_free(amdtee->pool);
+}
+module_exit(amdtee_driver_exit);
+
+MODULE_AUTHOR(DRIVER_AUTHOR);
+MODULE_DESCRIPTION("AMD-TEE driver");
+MODULE_VERSION("1.0");
+MODULE_LICENSE("Dual MIT/GPL");
diff --git a/drivers/tee/amdtee/shm_pool.c b/drivers/tee/amdtee/shm_pool.c
new file mode 100644
index 000000000000..065854e2db18
--- /dev/null
+++ b/drivers/tee/amdtee/shm_pool.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: MIT
+/*
+ * Copyright 2019 Advanced Micro Devices, Inc.
+ */
+
+#include <linux/slab.h>
+#include <linux/tee_drv.h>
+#include <linux/psp-sev.h>
+#include "amdtee_private.h"
+
+static int pool_op_alloc(struct tee_shm_pool_mgr *poolm, struct tee_shm *shm,
+			 size_t size)
+{
+	unsigned int order = get_order(size);
+	unsigned long va;
+	int rc;
+
+	va = __get_free_pages(GFP_KERNEL | __GFP_ZERO, order);
+	if (!va)
+		return -ENOMEM;
+
+	shm->kaddr = (void *)va;
+	shm->paddr = __psp_pa((void *)va);
+	shm->size = PAGE_SIZE << order;
+
+	/* Map the allocated memory into the TEE */
+	rc = amdtee_map_shmem(shm);
+	if (rc) {
+		free_pages(va, order);
+		shm->kaddr = NULL;
+		return rc;
+	}
+
+	return 0;
+}
+
+static void pool_op_free(struct tee_shm_pool_mgr *poolm, struct tee_shm *shm)
+{
+	/* Unmap the shared memory from TEE */
+	amdtee_unmap_shmem(shm);
+	free_pages((unsigned long)shm->kaddr, get_order(shm->size));
+	shm->kaddr = NULL;
+}
+
+static void pool_op_destroy_poolmgr(struct tee_shm_pool_mgr *poolm)
+{
+	kfree(poolm);
+}
+
+static const struct tee_shm_pool_mgr_ops pool_ops = {
+	.alloc = pool_op_alloc,
+	.free = pool_op_free,
+	.destroy_poolmgr = pool_op_destroy_poolmgr,
+};
+
+static struct tee_shm_pool_mgr *pool_mem_mgr_alloc(void)
+{
+	struct tee_shm_pool_mgr *mgr = kzalloc(sizeof(*mgr), GFP_KERNEL);
+
+	if (!mgr)
+		return ERR_PTR(-ENOMEM);
+
+	mgr->ops = &pool_ops;
+
+	return mgr;
+}
+
+struct tee_shm_pool *amdtee_config_shm(void)
+{
+	struct tee_shm_pool_mgr *priv_mgr;
+	struct tee_shm_pool_mgr *dmabuf_mgr;
+	void *rc;
+
+	rc = pool_mem_mgr_alloc();
+	if (IS_ERR(rc))
+		return rc;
+	priv_mgr = rc;
+
+	rc = pool_mem_mgr_alloc();
+	if (IS_ERR(rc)) {
+		tee_shm_pool_mgr_destroy(priv_mgr);
+		return rc;
+	}
+	dmabuf_mgr = rc;
+
+	rc = tee_shm_pool_alloc(priv_mgr, dmabuf_mgr);
+	if (IS_ERR(rc)) {
+		tee_shm_pool_mgr_destroy(priv_mgr);
+		tee_shm_pool_mgr_destroy(dmabuf_mgr);
+	}
+
+	return rc;
+}
diff --git a/include/uapi/linux/tee.h b/include/uapi/linux/tee.h
index 4b9eb064d7e7..6596f3a09e54 100644
--- a/include/uapi/linux/tee.h
+++ b/include/uapi/linux/tee.h
@@ -56,6 +56,7 @@
  * TEE Implementation ID
  */
 #define TEE_IMPL_ID_OPTEE	1
+#define TEE_IMPL_ID_AMDTEE	2
 
 /*
  * OP-TEE specific capabilities
-- 
cgit v1.2.3


From 6c930994503d9b5bd34b1329427dd7d3d6d37cd4 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Mon, 6 Jan 2020 03:34:09 +0200
Subject: mii: Add helpers for parsing SGMII auto-negotiation

Typically a MAC PCS auto-configures itself after it receives the
negotiated copper-side link settings from the PHY, but some MAC devices
are more special and need manual interpretation of the SGMII AN result.

In other cases, the PCS exposes the entire tx_config_reg base page as it
is transmitted on the wire during auto-negotiation, so it makes sense to
be able to decode the equivalent lp_advertised bit mask from the raw u16
(of course, "lp" considering the PCS to be the local PHY).

Therefore, add the bit definitions for the SGMII registers 4 and 5
(local device ability, link partner ability), as well as a link_mode
conversion helper that can be used to feed the AN results into
phy_resolve_aneg_linkmode.
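
As a rough illustration of the decoding described above, the following
userspace sketch interprets a raw tx_config_reg value using the bit
definitions added by this patch. The struct sgmii_an_result type and the
sgmii_decode() name are hypothetical and exist only for this example;
the in-kernel helper fills an ethtool link mode bitmap instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions copied from the include/uapi/linux/mii.h hunk. */
#define LPA_SGMII_DPX_SPD_MASK	0x1C00
#define LPA_SGMII_10HALF	0x0000
#define LPA_SGMII_10FULL	0x1000
#define LPA_SGMII_100HALF	0x0400
#define LPA_SGMII_100FULL	0x1400
#define LPA_SGMII_1000HALF	0x0800
#define LPA_SGMII_1000FULL	0x1800
#define LPA_SGMII_LINK		0x8000

/* Hypothetical result type, for illustration only. */
struct sgmii_an_result {
	int speed;		/* in Mbps, or -1 for a reserved encoding */
	bool full_duplex;
	bool link_up;
};

/* Decode the speed/duplex field and the link bit of a raw u16. */
static struct sgmii_an_result sgmii_decode(uint16_t tx_config_reg)
{
	struct sgmii_an_result res = { .speed = -1 };

	res.link_up = (tx_config_reg & LPA_SGMII_LINK) != 0;

	switch (tx_config_reg & LPA_SGMII_DPX_SPD_MASK) {
	case LPA_SGMII_10HALF:
		res.speed = 10;
		break;
	case LPA_SGMII_10FULL:
		res.speed = 10;
		res.full_duplex = true;
		break;
	case LPA_SGMII_100HALF:
		res.speed = 100;
		break;
	case LPA_SGMII_100FULL:
		res.speed = 100;
		res.full_duplex = true;
		break;
	case LPA_SGMII_1000HALF:
		res.speed = 1000;
		break;
	case LPA_SGMII_1000FULL:
		res.speed = 1000;
		res.full_duplex = true;
		break;
	default:
		/* Speed bits 0b11 are not covered by the definitions above. */
		break;
	}

	return res;
}
```

Because the speed and duplex bits form one field, exactly one of the
six encodings can match, which is why the helper in this patch compares
the masked value for equality rather than testing individual bits.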

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/mii.h      | 50 ++++++++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/mii.h | 12 ++++++++++++
 2 files changed, 62 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/include/linux/mii.h b/include/linux/mii.h
index 4ce8901a1af6..18c6208f56fc 100644
--- a/include/linux/mii.h
+++ b/include/linux/mii.h
@@ -372,6 +372,56 @@ static inline u32 mii_lpa_to_ethtool_lpa_x(u32 lpa)
 	return result | mii_adv_to_ethtool_adv_x(lpa);
 }
 
+/**
+ * mii_lpa_mod_linkmode_lpa_sgmii
+ * @lp_advertising: pointer to destination link mode.
+ * @lpa: value of the MII_LPA register
+ *
+ * A small helper function that translates MII_LPA bits to
+ * linkmode advertisement settings for SGMII.
+ * Leaves other bits unchanged.
+ */
+static inline void
+mii_lpa_mod_linkmode_lpa_sgmii(unsigned long *lp_advertising, u32 lpa)
+{
+	u32 speed_duplex = lpa & LPA_SGMII_DPX_SPD_MASK;
+
+	linkmode_mod_bit(ETHTOOL_LINK_MODE_1000baseT_Half_BIT, lp_advertising,
+			 speed_duplex == LPA_SGMII_1000HALF);
+
+	linkmode_mod_bit(ETHTOOL_LINK_MODE_1000baseT_Full_BIT, lp_advertising,
+			 speed_duplex == LPA_SGMII_1000FULL);
+
+	linkmode_mod_bit(ETHTOOL_LINK_MODE_100baseT_Half_BIT, lp_advertising,
+			 speed_duplex == LPA_SGMII_100HALF);
+
+	linkmode_mod_bit(ETHTOOL_LINK_MODE_100baseT_Full_BIT, lp_advertising,
+			 speed_duplex == LPA_SGMII_100FULL);
+
+	linkmode_mod_bit(ETHTOOL_LINK_MODE_10baseT_Half_BIT, lp_advertising,
+			 speed_duplex == LPA_SGMII_10HALF);
+
+	linkmode_mod_bit(ETHTOOL_LINK_MODE_10baseT_Full_BIT, lp_advertising,
+			 speed_duplex == LPA_SGMII_10FULL);
+}
+
+/**
+ * mii_lpa_to_linkmode_lpa_sgmii
+ * @lp_advertising: pointer to destination link mode.
+ * @lpa: value of the MII_LPA register
+ *
+ * A small helper function that translates MII_LPA bits
+ * to linkmode advertisement settings when in SGMII mode.
+ * Clears the old value of lp_advertising.
+ */
+static inline void mii_lpa_to_linkmode_lpa_sgmii(unsigned long *lp_advertising,
+						 u32 lpa)
+{
+	linkmode_zero(lp_advertising);
+
+	mii_lpa_mod_linkmode_lpa_sgmii(lp_advertising, lpa);
+}
+
 /**
  * mii_adv_mod_linkmode_adv_t
  * @advertising:pointer to destination link mode.
diff --git a/include/uapi/linux/mii.h b/include/uapi/linux/mii.h
index 51b48e4be1f2..0b9c3beda345 100644
--- a/include/uapi/linux/mii.h
+++ b/include/uapi/linux/mii.h
@@ -131,6 +131,18 @@
 #define NWAYTEST_LOOPBACK	0x0100	/* Enable loopback for N-way   */
 #define NWAYTEST_RESV2		0xfe00	/* Unused...                   */
 
+/* MAC and PHY tx_config_Reg[15:0] for SGMII in-band auto-negotiation.*/
+#define ADVERTISE_SGMII		0x0001	/* MAC can do SGMII            */
+#define LPA_SGMII		0x0001	/* PHY can do SGMII            */
+#define LPA_SGMII_DPX_SPD_MASK	0x1C00	/* SGMII duplex and speed bits */
+#define LPA_SGMII_10HALF	0x0000	/* Can do 10mbps half-duplex   */
+#define LPA_SGMII_10FULL	0x1000	/* Can do 10mbps full-duplex   */
+#define LPA_SGMII_100HALF	0x0400	/* Can do 100mbps half-duplex  */
+#define LPA_SGMII_100FULL	0x1400	/* Can do 100mbps full-duplex  */
+#define LPA_SGMII_1000HALF	0x0800	/* Can do 1000mbps half-duplex */
+#define LPA_SGMII_1000FULL	0x1800	/* Can do 1000mbps full-duplex */
+#define LPA_SGMII_LINK		0x8000	/* PHY link with copper-side partner */
+
 /* 1000BASE-T Control register */
 #define ADVERTISE_1000FULL	0x0200  /* Advertise 1000BASE-T full duplex */
 #define ADVERTISE_1000HALF	0x0100  /* Advertise 1000BASE-T half duplex */
-- 
cgit v1.2.3


From 75551dbf112c992bc6c99a972990b3f272247e23 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto@kernel.org>
Date: Mon, 23 Dec 2019 00:20:46 -0800
Subject: random: add GRND_INSECURE to return best-effort non-cryptographic
 bytes
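
The flag checks added to getrandom(2) by this patch can be modelled in
userspace roughly as below. This is only a sketch of the validation
logic visible in the diff: grnd_flags_valid() is a hypothetical name,
and the flag values are copied from the include/uapi/linux/random.h
hunk rather than taken from a system header.

```c
#include <stdbool.h>

/* Flag values copied from the include/uapi/linux/random.h hunk. */
#define GRND_NONBLOCK	0x0001
#define GRND_RANDOM	0x0002
#define GRND_INSECURE	0x0004

/*
 * Hypothetical userspace model of the checks at the top of the
 * syscall: returns false where the kernel would return -EINVAL.
 */
static bool grnd_flags_valid(unsigned int flags)
{
	/* Unknown flag bits are rejected. */
	if (flags & ~(GRND_NONBLOCK | GRND_RANDOM | GRND_INSECURE))
		return false;

	/* Insecure and blocking randomness are mutually exclusive. */
	if ((flags & (GRND_INSECURE | GRND_RANDOM)) ==
	    (GRND_INSECURE | GRND_RANDOM))
		return false;

	return true;
}
```

With GRND_INSECURE set and the flags otherwise valid, the syscall skips
the crng_ready() wait, so it neither blocks nor returns -EAGAIN before
the CRNG is initialized.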

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/d5473b56cf1fa900ca4bd2b3fc1e5b8874399919.1577088521.git.luto@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 drivers/char/random.c       | 11 +++++++++--
 include/uapi/linux/random.h |  2 ++
 2 files changed, 11 insertions(+), 2 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 91954c0091a5..b7e2ad7eafca 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -2194,7 +2194,14 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 {
 	int ret;
 
-	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM))
+	if (flags & ~(GRND_NONBLOCK|GRND_RANDOM|GRND_INSECURE))
+		return -EINVAL;
+
+	/*
+	 * Requesting insecure and blocking randomness at the same time makes
+	 * no sense.
+	 */
+	if ((flags & (GRND_INSECURE|GRND_RANDOM)) == (GRND_INSECURE|GRND_RANDOM))
 		return -EINVAL;
 
 	if (count > INT_MAX)
@@ -2203,7 +2210,7 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (flags & GRND_RANDOM)
 		return _random_read(flags & GRND_NONBLOCK, buf, count);
 
-	if (!crng_ready()) {
+	if (!(flags & GRND_INSECURE) && !crng_ready()) {
 		if (flags & GRND_NONBLOCK)
 			return -EAGAIN;
 		ret = wait_for_random_bytes();
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index 26ee91300e3e..c092d20088d3 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -49,8 +49,10 @@ struct rand_pool_info {
  *
  * GRND_NONBLOCK	Don't block and return EAGAIN instead
  * GRND_RANDOM		Use the /dev/random pool instead of /dev/urandom
+ * GRND_INSECURE	Return non-cryptographic random bytes
  */
 #define GRND_NONBLOCK	0x0001
 #define GRND_RANDOM	0x0002
+#define GRND_INSECURE	0x0004
 
 #endif /* _UAPI_LINUX_RANDOM_H */
-- 
cgit v1.2.3


From 48446f198f9adcb499b30332488dfd5bc3f176f6 Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto@kernel.org>
Date: Mon, 23 Dec 2019 00:20:47 -0800
Subject: random: ignore GRND_RANDOM in getrandom(2)

The separate blocking pool is going away.  Start by ignoring
GRND_RANDOM in getrandom(2).

This should not materially break any API.  Any code that worked
without this change should work at least as well with this change.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
Link: https://lore.kernel.org/r/705c5a091b63cc5da70c99304bb97e0109be0a26.1577088521.git.luto@kernel.org
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 drivers/char/random.c       | 3 ---
 include/uapi/linux/random.h | 2 +-
 2 files changed, 1 insertion(+), 4 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/char/random.c b/drivers/char/random.c
index b7e2ad7eafca..a92dfea038ce 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -2207,9 +2207,6 @@ SYSCALL_DEFINE3(getrandom, char __user *, buf, size_t, count,
 	if (count > INT_MAX)
 		count = INT_MAX;
 
-	if (flags & GRND_RANDOM)
-		return _random_read(flags & GRND_NONBLOCK, buf, count);
-
 	if (!(flags & GRND_INSECURE) && !crng_ready()) {
 		if (flags & GRND_NONBLOCK)
 			return -EAGAIN;
diff --git a/include/uapi/linux/random.h b/include/uapi/linux/random.h
index c092d20088d3..dcc1b3e6106f 100644
--- a/include/uapi/linux/random.h
+++ b/include/uapi/linux/random.h
@@ -48,7 +48,7 @@ struct rand_pool_info {
  * Flags for getrandom(2)
  *
  * GRND_NONBLOCK	Don't block and return EAGAIN instead
- * GRND_RANDOM		Use the /dev/random pool instead of /dev/urandom
+ * GRND_RANDOM		No effect
  * GRND_INSECURE	Return non-cryptographic random bytes
  */
 #define GRND_NONBLOCK	0x0001
-- 
cgit v1.2.3


From ed22aaaede44f647477a5048e62855c0ed49c9bd Mon Sep 17 00:00:00 2001
From: Dilip Kota <eswara.kota@linux.intel.com>
Date: Mon, 9 Dec 2019 11:20:05 +0800
Subject: PCI: dwc: intel: PCIe RC controller driver

Add support for the PCIe RC controller on Intel Gateway SoCs.
The PCIe controller is based on the Synopsys DesignWare PCIe core.

The Intel PCIe driver requires Upconfigure support, Fast Training
Sequence and link speed configuration, so add the respective
helper functions to the PCIe DesignWare framework.
The driver also programs hardware autonomous speed during speed
configuration, so define the corresponding bit in pci_regs.h.

Also, make the Intel PCIe driver depend on the MSI IRQ domain,
as the Synopsys DesignWare framework depends on
PCI_MSI_IRQ_DOMAIN.
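
The new helpers all follow the same read-modify-write pattern on a
register field. A minimal userspace sketch of the N_FTS update, with
the register passed in as a plain value instead of being read through
the DBI accessors, might look like this (set_n_fts() is a hypothetical
name used only here):

```c
#include <stdint.h>

/* GENMASK(7, 0), as defined in the pcie-designware.h hunk. */
#define PORT_LOGIC_N_FTS_MASK	0x000000FFu

/*
 * Hypothetical stand-in for the body of dw_pcie_link_set_n_fts():
 * clear the N_FTS field, then insert the new value, leaving every
 * other bit of the register untouched.
 */
static uint32_t set_n_fts(uint32_t reg, uint32_t n_fts)
{
	reg &= ~PORT_LOGIC_N_FTS_MASK;
	reg |= n_fts & PORT_LOGIC_N_FTS_MASK;

	return reg;
}
```

Masking n_fts before OR-ing it in means an out-of-range value is
silently truncated to the eight-bit field rather than corrupting the
neighbouring bits.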

Signed-off-by: Dilip Kota <eswara.kota@linux.intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Andrew Murray <andrew.murray@arm.com>
Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Acked-by: Gustavo Pimentel <gustavo.pimentel@synopsys.com>
---
 drivers/pci/controller/dwc/Kconfig           |  11 +
 drivers/pci/controller/dwc/Makefile          |   1 +
 drivers/pci/controller/dwc/pcie-designware.c |  56 +++
 drivers/pci/controller/dwc/pcie-designware.h |  12 +
 drivers/pci/controller/dwc/pcie-intel-gw.c   | 545 +++++++++++++++++++++++++++
 include/uapi/linux/pci_regs.h                |   1 +
 6 files changed, 626 insertions(+)
 create mode 100644 drivers/pci/controller/dwc/pcie-intel-gw.c

(limited to 'include/uapi/linux')

diff --git a/drivers/pci/controller/dwc/Kconfig b/drivers/pci/controller/dwc/Kconfig
index 625a031b2193..0830dfcfa43a 100644
--- a/drivers/pci/controller/dwc/Kconfig
+++ b/drivers/pci/controller/dwc/Kconfig
@@ -209,6 +209,17 @@ config PCIE_ARTPEC6_EP
 	  Enables support for the PCIe controller in the ARTPEC-6 SoC to work in
 	  endpoint mode. This uses the DesignWare core.
 
+config PCIE_INTEL_GW
+	bool "Intel Gateway PCIe host controller support"
+	depends on OF && (X86 || COMPILE_TEST)
+	depends on PCI_MSI_IRQ_DOMAIN
+	select PCIE_DW_HOST
+	help
+	  Say 'Y' here to enable PCIe Host controller support on Intel
+	  Gateway SoCs.
+	  The PCIe controller uses the DesignWare core plus Intel-specific
+	  hardware wrappers.
+
 config PCIE_KIRIN
 	depends on OF && (ARM64 || COMPILE_TEST)
 	bool "HiSilicon Kirin series SoCs PCIe controllers"
diff --git a/drivers/pci/controller/dwc/Makefile b/drivers/pci/controller/dwc/Makefile
index 69faff371f11..8a637cfcf6e9 100644
--- a/drivers/pci/controller/dwc/Makefile
+++ b/drivers/pci/controller/dwc/Makefile
@@ -13,6 +13,7 @@ obj-$(CONFIG_PCI_LAYERSCAPE_EP) += pci-layerscape-ep.o
 obj-$(CONFIG_PCIE_QCOM) += pcie-qcom.o
 obj-$(CONFIG_PCIE_ARMADA_8K) += pcie-armada8k.o
 obj-$(CONFIG_PCIE_ARTPEC6) += pcie-artpec6.o
+obj-$(CONFIG_PCIE_INTEL_GW) += pcie-intel-gw.o
 obj-$(CONFIG_PCIE_KIRIN) += pcie-kirin.o
 obj-$(CONFIG_PCIE_HISI_STB) += pcie-histb.o
 obj-$(CONFIG_PCI_MESON) += pci-meson.o
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 820488dfeaed..681548c88282 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -12,6 +12,7 @@
 #include <linux/of.h>
 #include <linux/types.h>
 
+#include "../../pci.h"
 #include "pcie-designware.h"
 
 /*
@@ -474,6 +475,61 @@ int dw_pcie_link_up(struct dw_pcie *pci)
 		(!(val & PCIE_PORT_DEBUG1_LINK_IN_TRAINING)));
 }
 
+void dw_pcie_upconfig_setup(struct dw_pcie *pci)
+{
+	u32 val;
+
+	val = dw_pcie_readl_dbi(pci, PCIE_PORT_MULTI_LANE_CTRL);
+	val |= PORT_MLTI_UPCFG_SUPPORT;
+	dw_pcie_writel_dbi(pci, PCIE_PORT_MULTI_LANE_CTRL, val);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_upconfig_setup);
+
+void dw_pcie_link_set_max_speed(struct dw_pcie *pci, u32 link_gen)
+{
+	u32 reg, val;
+	u8 offset = dw_pcie_find_capability(pci, PCI_CAP_ID_EXP);
+
+	reg = dw_pcie_readl_dbi(pci, offset + PCI_EXP_LNKCTL2);
+	reg &= ~PCI_EXP_LNKCTL2_TLS;
+
+	switch (pcie_link_speed[link_gen]) {
+	case PCIE_SPEED_2_5GT:
+		reg |= PCI_EXP_LNKCTL2_TLS_2_5GT;
+		break;
+	case PCIE_SPEED_5_0GT:
+		reg |= PCI_EXP_LNKCTL2_TLS_5_0GT;
+		break;
+	case PCIE_SPEED_8_0GT:
+		reg |= PCI_EXP_LNKCTL2_TLS_8_0GT;
+		break;
+	case PCIE_SPEED_16_0GT:
+		reg |= PCI_EXP_LNKCTL2_TLS_16_0GT;
+		break;
+	default:
+		/* Use hardware capability */
+		val = dw_pcie_readl_dbi(pci, offset + PCI_EXP_LNKCAP);
+		val = FIELD_GET(PCI_EXP_LNKCAP_SLS, val);
+		reg &= ~PCI_EXP_LNKCTL2_HASD;
+		reg |= FIELD_PREP(PCI_EXP_LNKCTL2_TLS, val);
+		break;
+	}
+
+	dw_pcie_writel_dbi(pci, offset + PCI_EXP_LNKCTL2, reg);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_link_set_max_speed);
+
+void dw_pcie_link_set_n_fts(struct dw_pcie *pci, u32 n_fts)
+{
+	u32 val;
+
+	val = dw_pcie_readl_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL);
+	val &= ~PORT_LOGIC_N_FTS_MASK;
+	val |= n_fts & PORT_LOGIC_N_FTS_MASK;
+	dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val);
+}
+EXPORT_SYMBOL_GPL(dw_pcie_link_set_n_fts);
+
 static u8 dw_pcie_iatu_unroll_enabled(struct dw_pcie *pci)
 {
 	u32 val;
diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 5accdd6bc388..a22ea5982817 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -30,7 +30,12 @@
 #define LINK_WAIT_IATU			9
 
 /* Synopsys-specific PCIe configuration registers */
+#define PCIE_PORT_AFR			0x70C
+#define PORT_AFR_N_FTS_MASK		GENMASK(15, 8)
+#define PORT_AFR_CC_N_FTS_MASK		GENMASK(23, 16)
+
 #define PCIE_PORT_LINK_CONTROL		0x710
+#define PORT_LINK_DLL_LINK_EN		BIT(5)
 #define PORT_LINK_MODE_MASK		GENMASK(21, 16)
 #define PORT_LINK_MODE(n)		FIELD_PREP(PORT_LINK_MODE_MASK, n)
 #define PORT_LINK_MODE_1_LANES		PORT_LINK_MODE(0x1)
@@ -46,6 +51,7 @@
 #define PCIE_PORT_DEBUG1_LINK_IN_TRAINING	BIT(29)
 
 #define PCIE_LINK_WIDTH_SPEED_CONTROL	0x80C
+#define PORT_LOGIC_N_FTS_MASK		GENMASK(7, 0)
 #define PORT_LOGIC_SPEED_CHANGE		BIT(17)
 #define PORT_LOGIC_LINK_WIDTH_MASK	GENMASK(12, 8)
 #define PORT_LOGIC_LINK_WIDTH(n)	FIELD_PREP(PORT_LOGIC_LINK_WIDTH_MASK, n)
@@ -60,6 +66,9 @@
 #define PCIE_MSI_INTR0_MASK		0x82C
 #define PCIE_MSI_INTR0_STATUS		0x830
 
+#define PCIE_PORT_MULTI_LANE_CTRL	0x8C0
+#define PORT_MLTI_UPCFG_SUPPORT		BIT(7)
+
 #define PCIE_ATU_VIEWPORT		0x900
 #define PCIE_ATU_REGION_INBOUND		BIT(31)
 #define PCIE_ATU_REGION_OUTBOUND	0
@@ -273,6 +282,9 @@ void dw_pcie_write_dbi2(struct dw_pcie *pci, u32 reg, size_t size, u32 val);
 u32 dw_pcie_read_atu(struct dw_pcie *pci, u32 reg, size_t size);
 void dw_pcie_write_atu(struct dw_pcie *pci, u32 reg, size_t size, u32 val);
 int dw_pcie_link_up(struct dw_pcie *pci);
+void dw_pcie_upconfig_setup(struct dw_pcie *pci);
+void dw_pcie_link_set_max_speed(struct dw_pcie *pci, u32 link_gen);
+void dw_pcie_link_set_n_fts(struct dw_pcie *pci, u32 n_fts);
 int dw_pcie_wait_for_link(struct dw_pcie *pci);
 void dw_pcie_prog_outbound_atu(struct dw_pcie *pci, int index,
 			       int type, u64 cpu_addr, u64 pci_addr,
diff --git a/drivers/pci/controller/dwc/pcie-intel-gw.c b/drivers/pci/controller/dwc/pcie-intel-gw.c
new file mode 100644
index 000000000000..fc2a12212dec
--- /dev/null
+++ b/drivers/pci/controller/dwc/pcie-intel-gw.c
@@ -0,0 +1,545 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PCIe host controller driver for Intel Gateway SoCs
+ *
+ * Copyright (c) 2019 Intel Corporation.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/clk.h>
+#include <linux/gpio/consumer.h>
+#include <linux/iopoll.h>
+#include <linux/pci_regs.h>
+#include <linux/phy/phy.h>
+#include <linux/platform_device.h>
+#include <linux/reset.h>
+
+#include "../../pci.h"
+#include "pcie-designware.h"
+
+#define PORT_AFR_N_FTS_GEN12_DFT	(SZ_128 - 1)
+#define PORT_AFR_N_FTS_GEN3		180
+#define PORT_AFR_N_FTS_GEN4		196
+
+/* PCIe Application logic Registers */
+#define PCIE_APP_CCR			0x10
+#define PCIE_APP_CCR_LTSSM_ENABLE	BIT(0)
+
+#define PCIE_APP_MSG_CR			0x30
+#define PCIE_APP_MSG_XMT_PM_TURNOFF	BIT(0)
+
+#define PCIE_APP_PMC			0x44
+#define PCIE_APP_PMC_IN_L2		BIT(20)
+
+#define PCIE_APP_IRNEN			0xF4
+#define PCIE_APP_IRNCR			0xF8
+#define PCIE_APP_IRN_AER_REPORT		BIT(0)
+#define PCIE_APP_IRN_PME		BIT(2)
+#define PCIE_APP_IRN_RX_VDM_MSG		BIT(4)
+#define PCIE_APP_IRN_PM_TO_ACK		BIT(9)
+#define PCIE_APP_IRN_LINK_AUTO_BW_STAT	BIT(11)
+#define PCIE_APP_IRN_BW_MGT		BIT(12)
+#define PCIE_APP_IRN_MSG_LTR		BIT(18)
+#define PCIE_APP_IRN_SYS_ERR_RC		BIT(29)
+#define PCIE_APP_INTX_OFST		12
+
+#define PCIE_APP_IRN_INT \
+	(PCIE_APP_IRN_AER_REPORT | PCIE_APP_IRN_PME | \
+	PCIE_APP_IRN_RX_VDM_MSG | PCIE_APP_IRN_SYS_ERR_RC | \
+	PCIE_APP_IRN_PM_TO_ACK | PCIE_APP_IRN_MSG_LTR | \
+	PCIE_APP_IRN_BW_MGT | PCIE_APP_IRN_LINK_AUTO_BW_STAT | \
+	BIT(PCIE_APP_INTX_OFST + PCI_INTERRUPT_INTA) | \
+	BIT(PCIE_APP_INTX_OFST + PCI_INTERRUPT_INTB) | \
+	BIT(PCIE_APP_INTX_OFST + PCI_INTERRUPT_INTC) | \
+	BIT(PCIE_APP_INTX_OFST + PCI_INTERRUPT_INTD))
+
+#define BUS_IATU_OFFSET			SZ_256M
+#define RESET_INTERVAL_MS		100
+
+struct intel_pcie_soc {
+	unsigned int	pcie_ver;
+	unsigned int	pcie_atu_offset;
+	u32		num_viewport;
+};
+
+struct intel_pcie_port {
+	struct dw_pcie		pci;
+	void __iomem		*app_base;
+	struct gpio_desc	*reset_gpio;
+	u32			rst_intrvl;
+	u32			max_speed;
+	u32			link_gen;
+	u32			max_width;
+	u32			n_fts;
+	struct clk		*core_clk;
+	struct reset_control	*core_rst;
+	struct phy		*phy;
+	u8			pcie_cap_ofst;
+};
+
+static void pcie_update_bits(void __iomem *base, u32 ofs, u32 mask, u32 val)
+{
+	u32 old;
+
+	old = readl(base + ofs);
+	val = (old & ~mask) | (val & mask);
+
+	if (val != old)
+		writel(val, base + ofs);
+}
+
+static inline u32 pcie_app_rd(struct intel_pcie_port *lpp, u32 ofs)
+{
+	return readl(lpp->app_base + ofs);
+}
+
+static inline void pcie_app_wr(struct intel_pcie_port *lpp, u32 ofs, u32 val)
+{
+	writel(val, lpp->app_base + ofs);
+}
+
+static void pcie_app_wr_mask(struct intel_pcie_port *lpp, u32 ofs,
+			     u32 mask, u32 val)
+{
+	pcie_update_bits(lpp->app_base, ofs, mask, val);
+}
+
+static inline u32 pcie_rc_cfg_rd(struct intel_pcie_port *lpp, u32 ofs)
+{
+	return dw_pcie_readl_dbi(&lpp->pci, ofs);
+}
+
+static inline void pcie_rc_cfg_wr(struct intel_pcie_port *lpp, u32 ofs, u32 val)
+{
+	dw_pcie_writel_dbi(&lpp->pci, ofs, val);
+}
+
+static void pcie_rc_cfg_wr_mask(struct intel_pcie_port *lpp, u32 ofs,
+				u32 mask, u32 val)
+{
+	pcie_update_bits(lpp->pci.dbi_base, ofs, mask, val);
+}
+
+static void intel_pcie_ltssm_enable(struct intel_pcie_port *lpp)
+{
+	pcie_app_wr_mask(lpp, PCIE_APP_CCR, PCIE_APP_CCR_LTSSM_ENABLE,
+			 PCIE_APP_CCR_LTSSM_ENABLE);
+}
+
+static void intel_pcie_ltssm_disable(struct intel_pcie_port *lpp)
+{
+	pcie_app_wr_mask(lpp, PCIE_APP_CCR, PCIE_APP_CCR_LTSSM_ENABLE, 0);
+}
+
+static void intel_pcie_link_setup(struct intel_pcie_port *lpp)
+{
+	u32 val;
+	u8 offset = lpp->pcie_cap_ofst;
+
+	val = pcie_rc_cfg_rd(lpp, offset + PCI_EXP_LNKCAP);
+	lpp->max_speed = FIELD_GET(PCI_EXP_LNKCAP_SLS, val);
+	lpp->max_width = FIELD_GET(PCI_EXP_LNKCAP_MLW, val);
+
+	val = pcie_rc_cfg_rd(lpp, offset + PCI_EXP_LNKCTL);
+
+	val &= ~(PCI_EXP_LNKCTL_LD | PCI_EXP_LNKCTL_ASPMC);
+	pcie_rc_cfg_wr(lpp, offset + PCI_EXP_LNKCTL, val);
+}
+
+static void intel_pcie_port_logic_setup(struct intel_pcie_port *lpp)
+{
+	u32 val, mask;
+
+	switch (pcie_link_speed[lpp->max_speed]) {
+	case PCIE_SPEED_8_0GT:
+		lpp->n_fts = PORT_AFR_N_FTS_GEN3;
+		break;
+	case PCIE_SPEED_16_0GT:
+		lpp->n_fts = PORT_AFR_N_FTS_GEN4;
+		break;
+	default:
+		lpp->n_fts = PORT_AFR_N_FTS_GEN12_DFT;
+		break;
+	}
+
+	mask = PORT_AFR_N_FTS_MASK | PORT_AFR_CC_N_FTS_MASK;
+	val = FIELD_PREP(PORT_AFR_N_FTS_MASK, lpp->n_fts) |
+	       FIELD_PREP(PORT_AFR_CC_N_FTS_MASK, lpp->n_fts);
+	pcie_rc_cfg_wr_mask(lpp, PCIE_PORT_AFR, mask, val);
+
+	/* Port Link Control Register */
+	pcie_rc_cfg_wr_mask(lpp, PCIE_PORT_LINK_CONTROL, PORT_LINK_DLL_LINK_EN,
+			    PORT_LINK_DLL_LINK_EN);
+}
+
+static void intel_pcie_rc_setup(struct intel_pcie_port *lpp)
+{
+	intel_pcie_ltssm_disable(lpp);
+	intel_pcie_link_setup(lpp);
+	dw_pcie_setup_rc(&lpp->pci.pp);
+	dw_pcie_upconfig_setup(&lpp->pci);
+	intel_pcie_port_logic_setup(lpp);
+	dw_pcie_link_set_max_speed(&lpp->pci, lpp->link_gen);
+	dw_pcie_link_set_n_fts(&lpp->pci, lpp->n_fts);
+}
+
+static int intel_pcie_ep_rst_init(struct intel_pcie_port *lpp)
+{
+	struct device *dev = lpp->pci.dev;
+	int ret;
+
+	lpp->reset_gpio = devm_gpiod_get(dev, "reset", GPIOD_OUT_LOW);
+	if (IS_ERR(lpp->reset_gpio)) {
+		ret = PTR_ERR(lpp->reset_gpio);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to request PCIe GPIO: %d\n", ret);
+		return ret;
+	}
+
+	/* Make initial reset last for 100us */
+	usleep_range(100, 200);
+
+	return 0;
+}
+
+static void intel_pcie_core_rst_assert(struct intel_pcie_port *lpp)
+{
+	reset_control_assert(lpp->core_rst);
+}
+
+static void intel_pcie_core_rst_deassert(struct intel_pcie_port *lpp)
+{
+	/*
+	 * One microsecond delay to make sure the reset pulse is
+	 * wide enough for a clean core reset.
+	 */
+	udelay(1);
+	reset_control_deassert(lpp->core_rst);
+
+	/*
+	 * On some SoCs the core reset also resets the PHY, so more
+	 * delay is needed to make sure the reset process is done.
+	 */
+	usleep_range(1000, 2000);
+}
+
+static void intel_pcie_device_rst_assert(struct intel_pcie_port *lpp)
+{
+	gpiod_set_value_cansleep(lpp->reset_gpio, 1);
+}
+
+static void intel_pcie_device_rst_deassert(struct intel_pcie_port *lpp)
+{
+	msleep(lpp->rst_intrvl);
+	gpiod_set_value_cansleep(lpp->reset_gpio, 0);
+}
+
+static int intel_pcie_app_logic_setup(struct intel_pcie_port *lpp)
+{
+	intel_pcie_device_rst_deassert(lpp);
+	intel_pcie_ltssm_enable(lpp);
+
+	return dw_pcie_wait_for_link(&lpp->pci);
+}
+
+static void intel_pcie_core_irq_disable(struct intel_pcie_port *lpp)
+{
+	pcie_app_wr(lpp, PCIE_APP_IRNEN, 0);
+	pcie_app_wr(lpp, PCIE_APP_IRNCR, PCIE_APP_IRN_INT);
+}
+
+static int intel_pcie_get_resources(struct platform_device *pdev)
+{
+	struct intel_pcie_port *lpp = platform_get_drvdata(pdev);
+	struct dw_pcie *pci = &lpp->pci;
+	struct device *dev = pci->dev;
+	struct resource *res;
+	int ret;
+
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "dbi");
+	pci->dbi_base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(pci->dbi_base))
+		return PTR_ERR(pci->dbi_base);
+
+	lpp->core_clk = devm_clk_get(dev, NULL);
+	if (IS_ERR(lpp->core_clk)) {
+		ret = PTR_ERR(lpp->core_clk);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to get clks: %d\n", ret);
+		return ret;
+	}
+
+	lpp->core_rst = devm_reset_control_get(dev, NULL);
+	if (IS_ERR(lpp->core_rst)) {
+		ret = PTR_ERR(lpp->core_rst);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Failed to get resets: %d\n", ret);
+		return ret;
+	}
+
+	ret = device_property_match_string(dev, "device_type", "pci");
+	if (ret) {
+		dev_err(dev, "Failed to find pci device type: %d\n", ret);
+		return ret;
+	}
+
+	ret = device_property_read_u32(dev, "reset-assert-ms",
+				       &lpp->rst_intrvl);
+	if (ret)
+		lpp->rst_intrvl = RESET_INTERVAL_MS;
+
+	ret = of_pci_get_max_link_speed(dev->of_node);
+	lpp->link_gen = ret < 0 ? 0 : ret;
+
+	res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "app");
+	lpp->app_base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(lpp->app_base))
+		return PTR_ERR(lpp->app_base);
+
+	lpp->phy = devm_phy_get(dev, "pcie");
+	if (IS_ERR(lpp->phy)) {
+		ret = PTR_ERR(lpp->phy);
+		if (ret != -EPROBE_DEFER)
+			dev_err(dev, "Couldn't get pcie-phy: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static void intel_pcie_deinit_phy(struct intel_pcie_port *lpp)
+{
+	phy_exit(lpp->phy);
+}
+
+static int intel_pcie_wait_l2(struct intel_pcie_port *lpp)
+{
+	u32 value;
+	int ret;
+
+	if (pcie_link_speed[lpp->max_speed] < PCIE_SPEED_8_0GT)
+		return 0;
+
+	/* Send PME_TURN_OFF message */
+	pcie_app_wr_mask(lpp, PCIE_APP_MSG_CR, PCIE_APP_MSG_XMT_PM_TURNOFF,
+			 PCIE_APP_MSG_XMT_PM_TURNOFF);
+
+	/* Read PMC status and wait for falling into L2 link state */
+	ret = readl_poll_timeout(lpp->app_base + PCIE_APP_PMC, value,
+				 value & PCIE_APP_PMC_IN_L2, 20,
+				 jiffies_to_usecs(5 * HZ));
+	if (ret)
+		dev_err(lpp->pci.dev, "PCIe link enter L2 timeout!\n");
+
+	return ret;
+}
+
+static void intel_pcie_turn_off(struct intel_pcie_port *lpp)
+{
+	if (dw_pcie_link_up(&lpp->pci))
+		intel_pcie_wait_l2(lpp);
+
+	/* Put endpoint device in reset state */
+	intel_pcie_device_rst_assert(lpp);
+	pcie_rc_cfg_wr_mask(lpp, PCI_COMMAND, PCI_COMMAND_MEMORY, 0);
+}
+
+static int intel_pcie_host_setup(struct intel_pcie_port *lpp)
+{
+	struct device *dev = lpp->pci.dev;
+	int ret;
+
+	intel_pcie_core_rst_assert(lpp);
+	intel_pcie_device_rst_assert(lpp);
+
+	ret = phy_init(lpp->phy);
+	if (ret)
+		return ret;
+
+	intel_pcie_core_rst_deassert(lpp);
+
+	ret = clk_prepare_enable(lpp->core_clk);
+	if (ret) {
+		dev_err(lpp->pci.dev, "Core clock enable failed: %d\n", ret);
+		goto clk_err;
+	}
+
+	if (!lpp->pcie_cap_ofst) {
+		ret = dw_pcie_find_capability(&lpp->pci, PCI_CAP_ID_EXP);
+		if (!ret) {
+			ret = -ENXIO;
+			dev_err(dev, "Invalid PCIe capability offset\n");
+			goto app_init_err;
+		}
+
+		lpp->pcie_cap_ofst = ret;
+	}
+
+	intel_pcie_rc_setup(lpp);
+	ret = intel_pcie_app_logic_setup(lpp);
+	if (ret)
+		goto app_init_err;
+
+	/* Enable integrated interrupts */
+	pcie_app_wr_mask(lpp, PCIE_APP_IRNEN, PCIE_APP_IRN_INT,
+			 PCIE_APP_IRN_INT);
+
+	return 0;
+
+app_init_err:
+	clk_disable_unprepare(lpp->core_clk);
+clk_err:
+	intel_pcie_core_rst_assert(lpp);
+	intel_pcie_deinit_phy(lpp);
+
+	return ret;
+}
+
+static void __intel_pcie_remove(struct intel_pcie_port *lpp)
+{
+	intel_pcie_core_irq_disable(lpp);
+	intel_pcie_turn_off(lpp);
+	clk_disable_unprepare(lpp->core_clk);
+	intel_pcie_core_rst_assert(lpp);
+	intel_pcie_deinit_phy(lpp);
+}
+
+static int intel_pcie_remove(struct platform_device *pdev)
+{
+	struct intel_pcie_port *lpp = platform_get_drvdata(pdev);
+	struct pcie_port *pp = &lpp->pci.pp;
+
+	dw_pcie_host_deinit(pp);
+	__intel_pcie_remove(lpp);
+
+	return 0;
+}
+
+static int __maybe_unused intel_pcie_suspend_noirq(struct device *dev)
+{
+	struct intel_pcie_port *lpp = dev_get_drvdata(dev);
+	int ret;
+
+	intel_pcie_core_irq_disable(lpp);
+	ret = intel_pcie_wait_l2(lpp);
+	if (ret)
+		return ret;
+
+	intel_pcie_deinit_phy(lpp);
+	clk_disable_unprepare(lpp->core_clk);
+	return ret;
+}
+
+static int __maybe_unused intel_pcie_resume_noirq(struct device *dev)
+{
+	struct intel_pcie_port *lpp = dev_get_drvdata(dev);
+
+	return intel_pcie_host_setup(lpp);
+}
+
+static int intel_pcie_rc_init(struct pcie_port *pp)
+{
+	struct dw_pcie *pci = to_dw_pcie_from_pp(pp);
+	struct intel_pcie_port *lpp = dev_get_drvdata(pci->dev);
+
+	return intel_pcie_host_setup(lpp);
+}
+
+/*
+ * Dummy function so that DW core doesn't configure MSI
+ */
+static int intel_pcie_msi_init(struct pcie_port *pp)
+{
+	return 0;
+}
+
+static u64 intel_pcie_cpu_addr(struct dw_pcie *pcie, u64 cpu_addr)
+{
+	return cpu_addr + BUS_IATU_OFFSET;
+}
+
+static const struct dw_pcie_ops intel_pcie_ops = {
+	.cpu_addr_fixup = intel_pcie_cpu_addr,
+};
+
+static const struct dw_pcie_host_ops intel_pcie_dw_ops = {
+	.host_init =		intel_pcie_rc_init,
+	.msi_host_init =	intel_pcie_msi_init,
+};
+
+static const struct intel_pcie_soc pcie_data = {
+	.pcie_ver =		0x520A,
+	.pcie_atu_offset =	0xC0000,
+	.num_viewport =		3,
+};
+
+static int intel_pcie_probe(struct platform_device *pdev)
+{
+	const struct intel_pcie_soc *data;
+	struct device *dev = &pdev->dev;
+	struct intel_pcie_port *lpp;
+	struct pcie_port *pp;
+	struct dw_pcie *pci;
+	int ret;
+
+	lpp = devm_kzalloc(dev, sizeof(*lpp), GFP_KERNEL);
+	if (!lpp)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, lpp);
+	pci = &lpp->pci;
+	pci->dev = dev;
+	pp = &pci->pp;
+
+	ret = intel_pcie_get_resources(pdev);
+	if (ret)
+		return ret;
+
+	ret = intel_pcie_ep_rst_init(lpp);
+	if (ret)
+		return ret;
+
+	data = device_get_match_data(dev);
+	if (!data)
+		return -ENODEV;
+
+	pci->ops = &intel_pcie_ops;
+	pci->version = data->pcie_ver;
+	pci->atu_base = pci->dbi_base + data->pcie_atu_offset;
+	pp->ops = &intel_pcie_dw_ops;
+
+	ret = dw_pcie_host_init(pp);
+	if (ret) {
+		dev_err(dev, "Cannot initialize host\n");
+		return ret;
+	}
+
+	/*
+	 * Intel PCIe doesn't configure IO region, so set viewport
+	 * to not perform IO region access.
+	 */
+	pci->num_viewport = data->num_viewport;
+
+	return 0;
+}
+
+static const struct dev_pm_ops intel_pcie_pm_ops = {
+	SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(intel_pcie_suspend_noirq,
+				      intel_pcie_resume_noirq)
+};
+
+static const struct of_device_id of_intel_pcie_match[] = {
+	{ .compatible = "intel,lgm-pcie", .data = &pcie_data },
+	{}
+};
+
+static struct platform_driver intel_pcie_driver = {
+	.probe = intel_pcie_probe,
+	.remove = intel_pcie_remove,
+	.driver = {
+		.name = "intel-gw-pcie",
+		.of_match_table = of_intel_pcie_match,
+		.pm = &intel_pcie_pm_ops,
+	},
+};
+builtin_platform_driver(intel_pcie_driver);
diff --git a/include/uapi/linux/pci_regs.h b/include/uapi/linux/pci_regs.h
index acb7d2bdb419..5437690483cd 100644
--- a/include/uapi/linux/pci_regs.h
+++ b/include/uapi/linux/pci_regs.h
@@ -676,6 +676,7 @@
 #define  PCI_EXP_LNKCTL2_TLS_32_0GT	0x0005 /* Supported Speed 32GT/s */
 #define  PCI_EXP_LNKCTL2_ENTER_COMP	0x0010 /* Enter Compliance */
 #define  PCI_EXP_LNKCTL2_TX_MARGIN	0x0380 /* Transmit Margin */
+#define  PCI_EXP_LNKCTL2_HASD		0x0020 /* HW Autonomous Speed Disable */
 #define PCI_EXP_LNKSTA2		50	/* Link Status 2 */
 #define PCI_CAP_EXP_ENDPOINT_SIZEOF_V2	52	/* v2 endpoints with link end here */
 #define PCI_EXP_SLTCAP2		52	/* Slot Capabilities 2 */
-- 
cgit v1.2.3


From 27ae7997a66174cb8afd6a75b3989f5e0c1b9e5a Mon Sep 17 00:00:00 2001
From: Martin KaFai Lau <kafai@fb.com>
Date: Wed, 8 Jan 2020 16:35:03 -0800
Subject: bpf: Introduce BPF_PROG_TYPE_STRUCT_OPS

This patch allows the kernel's struct ops (i.e. func ptr) to be
implemented in BPF.  The first use case in this series is the
"struct tcp_congestion_ops" which will be introduced in a
later patch.

This patch introduces a new prog type BPF_PROG_TYPE_STRUCT_OPS.
The BPF_PROG_TYPE_STRUCT_OPS prog is verified against a particular
func ptr of a kernel struct.  The attr->attach_btf_id is the btf id
of a kernel struct.  The attr->expected_attach_type is the member
"index" of that kernel struct.  The first member of a struct starts
with member index 0.  That will avoid ambiguity when a kernel struct
has multiple func ptrs with the same func signature.

For example, a BPF_PROG_TYPE_STRUCT_OPS prog is written
to implement the "init" func ptr of the "struct tcp_congestion_ops".
The attr->attach_btf_id is the btf id of the "struct tcp_congestion_ops"
of the _running_ kernel.  The attr->expected_attach_type is 3.
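
As a rough illustration of the member index (this is not kernel code),
the sketch below hard-codes the member names of
"struct tcp_congestion_ops" in the order shown by the bpftool dump in
the BPF_MAP_TYPE_STRUCT_OPS patch of this series, and looks up a
member's index by name.  The table and the helper member_index() are
assumptions made for this example; the real index has to come from the
BTF of the running kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, hard-coded member order, for illustration only. */
static const char *const tcp_ca_members[] = {
	"list", "key", "flags", "init", "release", "ssthresh",
	"cong_avoid", "set_state", "cwnd_event", "in_ack_event",
	"undo_cwnd", "pkts_acked", "min_tso_segs", "sndbuf_expand",
	"cong_control", "get_info", "name", "owner",
};

/* Return the 0-based member index, or -1 if the name is unknown. */
static int member_index(const char *name)
{
	size_t i;

	assert(name != NULL);
	for (i = 0; i < sizeof(tcp_ca_members) / sizeof(tcp_ca_members[0]); i++) {
		if (strcmp(tcp_ca_members[i], name) == 0)
			return (int)i;
	}
	return -1;
}
```

With this table, member_index("init") yields 3, which matches the
attr->expected_attach_type in the example above.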

The ctx of BPF_PROG_TYPE_STRUCT_OPS is an array of u64 args saved
by arch_prepare_bpf_trampoline.  That will be done in the next
patch when introducing BPF_MAP_TYPE_STRUCT_OPS.

"struct bpf_struct_ops" is introduced as a common interface for the kernel
struct that supports BPF_PROG_TYPE_STRUCT_OPS prog.  The supporting kernel
struct will need to implement an instance of the "struct bpf_struct_ops".

The supporting kernel struct also needs to implement a bpf_verifier_ops.
During BPF_PROG_LOAD, bpf_struct_ops_find() will find the right
bpf_verifier_ops by searching the attr->attach_btf_id.

A new "btf_struct_access" is also added to the bpf_verifier_ops such
that the supporting kernel struct can optionally provide its own specific
check on accessing the func arg (e.g. provide limited write access).

After btf_vmlinux is parsed, the new bpf_struct_ops_init() is called
to initialize some values (e.g. the btf id of the supporting kernel
struct) and it can only be done once the btf_vmlinux is available.

The R0 check at BPF_EXIT is skipped for the BPF_PROG_TYPE_STRUCT_OPS prog
if the return type of the prog->aux->attach_func_proto is "void".

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003503.3855825-1-kafai@fb.com
---
 include/linux/bpf.h               |  30 +++++++++
 include/linux/bpf_types.h         |   4 ++
 include/linux/btf.h               |  34 ++++++++++
 include/uapi/linux/bpf.h          |   1 +
 kernel/bpf/Makefile               |   3 +
 kernel/bpf/bpf_struct_ops.c       | 121 ++++++++++++++++++++++++++++++++++
 kernel/bpf/bpf_struct_ops_types.h |   4 ++
 kernel/bpf/btf.c                  |  88 +++++++++++++++++--------
 kernel/bpf/syscall.c              |  17 +++--
 kernel/bpf/verifier.c             | 134 +++++++++++++++++++++++++++++---------
 10 files changed, 373 insertions(+), 63 deletions(-)
 create mode 100644 kernel/bpf/bpf_struct_ops.c
 create mode 100644 kernel/bpf/bpf_struct_ops_types.h

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b14e51d56a82..50f3b20ae284 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -349,6 +349,10 @@ struct bpf_verifier_ops {
 				  const struct bpf_insn *src,
 				  struct bpf_insn *dst,
 				  struct bpf_prog *prog, u32 *target_size);
+	int (*btf_struct_access)(struct bpf_verifier_log *log,
+				 const struct btf_type *t, int off, int size,
+				 enum bpf_access_type atype,
+				 u32 *next_btf_id);
 };
 
 struct bpf_prog_offload_ops {
@@ -668,6 +672,32 @@ struct bpf_array_aux {
 	struct work_struct work;
 };
 
+struct btf_type;
+struct btf_member;
+
+#define BPF_STRUCT_OPS_MAX_NR_MEMBERS 64
+struct bpf_struct_ops {
+	const struct bpf_verifier_ops *verifier_ops;
+	int (*init)(struct btf *btf);
+	int (*check_member)(const struct btf_type *t,
+			    const struct btf_member *member);
+	const struct btf_type *type;
+	const char *name;
+	struct btf_func_model func_models[BPF_STRUCT_OPS_MAX_NR_MEMBERS];
+	u32 type_id;
+};
+
+#if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
+const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id);
+void bpf_struct_ops_init(struct btf *btf);
+#else
+static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
+{
+	return NULL;
+}
+static inline void bpf_struct_ops_init(struct btf *btf) { }
+#endif
+
 struct bpf_array {
 	struct bpf_map map;
 	u32 elem_size;
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 93740b3614d7..fadd243ffa2d 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -65,6 +65,10 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LIRC_MODE2, lirc_mode2,
 BPF_PROG_TYPE(BPF_PROG_TYPE_SK_REUSEPORT, sk_reuseport,
 	      struct sk_reuseport_md, struct sk_reuseport_kern)
 #endif
+#if defined(CONFIG_BPF_JIT)
+BPF_PROG_TYPE(BPF_PROG_TYPE_STRUCT_OPS, bpf_struct_ops,
+	      void *, void *)
+#endif
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 79d4abc2556a..f74a09a7120b 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -53,6 +53,18 @@ bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
 			   u32 expected_offset, u32 expected_size);
 int btf_find_spin_lock(const struct btf *btf, const struct btf_type *t);
 bool btf_type_is_void(const struct btf_type *t);
+s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind);
+const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
+					       u32 id, u32 *res_id);
+const struct btf_type *btf_type_resolve_ptr(const struct btf *btf,
+					    u32 id, u32 *res_id);
+const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf,
+						 u32 id, u32 *res_id);
+
+#define for_each_member(i, struct_type, member)			\
+	for (i = 0, member = btf_type_member(struct_type);	\
+	     i < btf_type_vlen(struct_type);			\
+	     i++, member++)
 
 static inline bool btf_type_is_ptr(const struct btf_type *t)
 {
@@ -84,6 +96,28 @@ static inline bool btf_type_is_func_proto(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
 }
 
+static inline u16 btf_type_vlen(const struct btf_type *t)
+{
+	return BTF_INFO_VLEN(t->info);
+}
+
+static inline bool btf_type_kflag(const struct btf_type *t)
+{
+	return BTF_INFO_KFLAG(t->info);
+}
+
+static inline u32 btf_member_bitfield_size(const struct btf_type *struct_type,
+					   const struct btf_member *member)
+{
+	return btf_type_kflag(struct_type) ? BTF_MEMBER_BITFIELD_SIZE(member->offset)
+					   : 0;
+}
+
+static inline const struct btf_member *btf_type_member(const struct btf_type *t)
+{
+	return (const struct btf_member *)(t + 1);
+}
+
 #ifdef CONFIG_BPF_SYSCALL
 const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 7df436da542d..c1eeb3e0e116 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -174,6 +174,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE,
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
 	BPF_PROG_TYPE_TRACING,
+	BPF_PROG_TYPE_STRUCT_OPS,
 };
 
 enum bpf_attach_type {
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index d4f330351f87..046ce5d98033 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -27,3 +27,6 @@ endif
 ifeq ($(CONFIG_SYSFS),y)
 obj-$(CONFIG_DEBUG_INFO_BTF) += sysfs_btf.o
 endif
+ifeq ($(CONFIG_BPF_JIT),y)
+obj-$(CONFIG_BPF_SYSCALL) += bpf_struct_ops.o
+endif
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
new file mode 100644
index 000000000000..2ea68fe34c33
--- /dev/null
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2019 Facebook */
+
+#include <linux/bpf.h>
+#include <linux/bpf_verifier.h>
+#include <linux/btf.h>
+#include <linux/filter.h>
+#include <linux/slab.h>
+#include <linux/numa.h>
+#include <linux/seq_file.h>
+#include <linux/refcount.h>
+
+#define BPF_STRUCT_OPS_TYPE(_name)				\
+extern struct bpf_struct_ops bpf_##_name;
+#include "bpf_struct_ops_types.h"
+#undef BPF_STRUCT_OPS_TYPE
+
+enum {
+#define BPF_STRUCT_OPS_TYPE(_name) BPF_STRUCT_OPS_TYPE_##_name,
+#include "bpf_struct_ops_types.h"
+#undef BPF_STRUCT_OPS_TYPE
+	__NR_BPF_STRUCT_OPS_TYPE,
+};
+
+static struct bpf_struct_ops * const bpf_struct_ops[] = {
+#define BPF_STRUCT_OPS_TYPE(_name)				\
+	[BPF_STRUCT_OPS_TYPE_##_name] = &bpf_##_name,
+#include "bpf_struct_ops_types.h"
+#undef BPF_STRUCT_OPS_TYPE
+};
+
+const struct bpf_verifier_ops bpf_struct_ops_verifier_ops = {
+};
+
+const struct bpf_prog_ops bpf_struct_ops_prog_ops = {
+};
+
+void bpf_struct_ops_init(struct btf *btf)
+{
+	const struct btf_member *member;
+	struct bpf_struct_ops *st_ops;
+	struct bpf_verifier_log log = {};
+	const struct btf_type *t;
+	const char *mname;
+	s32 type_id;
+	u32 i, j;
+
+	for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
+		st_ops = bpf_struct_ops[i];
+
+		type_id = btf_find_by_name_kind(btf, st_ops->name,
+						BTF_KIND_STRUCT);
+		if (type_id < 0) {
+			pr_warn("Cannot find struct %s in btf_vmlinux\n",
+				st_ops->name);
+			continue;
+		}
+		t = btf_type_by_id(btf, type_id);
+		if (btf_type_vlen(t) > BPF_STRUCT_OPS_MAX_NR_MEMBERS) {
+			pr_warn("Cannot support #%u members in struct %s\n",
+				btf_type_vlen(t), st_ops->name);
+			continue;
+		}
+
+		for_each_member(j, t, member) {
+			const struct btf_type *func_proto;
+
+			mname = btf_name_by_offset(btf, member->name_off);
+			if (!*mname) {
+				pr_warn("anon member in struct %s is not supported\n",
+					st_ops->name);
+				break;
+			}
+
+			if (btf_member_bitfield_size(t, member)) {
+				pr_warn("bit field member %s in struct %s is not supported\n",
+					mname, st_ops->name);
+				break;
+			}
+
+			func_proto = btf_type_resolve_func_ptr(btf,
+							       member->type,
+							       NULL);
+			if (func_proto &&
+			    btf_distill_func_proto(&log, btf,
+						   func_proto, mname,
+						   &st_ops->func_models[j])) {
+				pr_warn("Error in parsing func ptr %s in struct %s\n",
+					mname, st_ops->name);
+				break;
+			}
+		}
+
+		if (j == btf_type_vlen(t)) {
+			if (st_ops->init(btf)) {
+				pr_warn("Error in init bpf_struct_ops %s\n",
+					st_ops->name);
+			} else {
+				st_ops->type_id = type_id;
+				st_ops->type = t;
+			}
+		}
+	}
+}
+
+extern struct btf *btf_vmlinux;
+
+const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
+{
+	unsigned int i;
+
+	if (!type_id || !btf_vmlinux)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
+		if (bpf_struct_ops[i]->type_id == type_id)
+			return bpf_struct_ops[i];
+	}
+
+	return NULL;
+}
diff --git a/kernel/bpf/bpf_struct_ops_types.h b/kernel/bpf/bpf_struct_ops_types.h
new file mode 100644
index 000000000000..7bb13ff49ec2
--- /dev/null
+++ b/kernel/bpf/bpf_struct_ops_types.h
@@ -0,0 +1,4 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* internal file - do not include directly */
+
+/* To be filled in a later patch */
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 48bbde2e1c1e..12af4a1bb1a4 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -180,11 +180,6 @@
  */
 #define BTF_MAX_SIZE (16 * 1024 * 1024)
 
-#define for_each_member(i, struct_type, member)			\
-	for (i = 0, member = btf_type_member(struct_type);	\
-	     i < btf_type_vlen(struct_type);			\
-	     i++, member++)
-
 #define for_each_member_from(i, from, struct_type, member)		\
 	for (i = from, member = btf_type_member(struct_type) + from;	\
 	     i < btf_type_vlen(struct_type);				\
@@ -382,6 +377,65 @@ static bool btf_type_is_datasec(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
 }
 
+s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
+{
+	const struct btf_type *t;
+	const char *tname;
+	u32 i;
+
+	for (i = 1; i <= btf->nr_types; i++) {
+		t = btf->types[i];
+		if (BTF_INFO_KIND(t->info) != kind)
+			continue;
+
+		tname = btf_name_by_offset(btf, t->name_off);
+		if (!strcmp(tname, name))
+			return i;
+	}
+
+	return -ENOENT;
+}
+
+const struct btf_type *btf_type_skip_modifiers(const struct btf *btf,
+					       u32 id, u32 *res_id)
+{
+	const struct btf_type *t = btf_type_by_id(btf, id);
+
+	while (btf_type_is_modifier(t)) {
+		id = t->type;
+		t = btf_type_by_id(btf, t->type);
+	}
+
+	if (res_id)
+		*res_id = id;
+
+	return t;
+}
+
+const struct btf_type *btf_type_resolve_ptr(const struct btf *btf,
+					    u32 id, u32 *res_id)
+{
+	const struct btf_type *t;
+
+	t = btf_type_skip_modifiers(btf, id, NULL);
+	if (!btf_type_is_ptr(t))
+		return NULL;
+
+	return btf_type_skip_modifiers(btf, t->type, res_id);
+}
+
+const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf,
+						 u32 id, u32 *res_id)
+{
+	const struct btf_type *ptype;
+
+	ptype = btf_type_resolve_ptr(btf, id, res_id);
+	if (ptype && btf_type_is_func_proto(ptype))
+		return ptype;
+
+	return NULL;
+}
+
 /* Types that act only as a source, not sink or intermediate
  * type when resolving.
  */
@@ -446,16 +500,6 @@ static const char *btf_int_encoding_str(u8 encoding)
 		return "UNKN";
 }
 
-static u16 btf_type_vlen(const struct btf_type *t)
-{
-	return BTF_INFO_VLEN(t->info);
-}
-
-static bool btf_type_kflag(const struct btf_type *t)
-{
-	return BTF_INFO_KFLAG(t->info);
-}
-
 static u32 btf_member_bit_offset(const struct btf_type *struct_type,
 			     const struct btf_member *member)
 {
@@ -463,13 +507,6 @@ static u32 btf_member_bit_offset(const struct btf_type *struct_type,
 					   : member->offset;
 }
 
-static u32 btf_member_bitfield_size(const struct btf_type *struct_type,
-				    const struct btf_member *member)
-{
-	return btf_type_kflag(struct_type) ? BTF_MEMBER_BITFIELD_SIZE(member->offset)
-					   : 0;
-}
-
 static u32 btf_type_int(const struct btf_type *t)
 {
 	return *(u32 *)(t + 1);
@@ -480,11 +517,6 @@ static const struct btf_array *btf_type_array(const struct btf_type *t)
 	return (const struct btf_array *)(t + 1);
 }
 
-static const struct btf_member *btf_type_member(const struct btf_type *t)
-{
-	return (const struct btf_member *)(t + 1);
-}
-
 static const struct btf_enum *btf_type_enum(const struct btf_type *t)
 {
 	return (const struct btf_enum *)(t + 1);
@@ -3605,6 +3637,8 @@ struct btf *btf_parse_vmlinux(void)
 		goto errout;
 	}
 
+	bpf_struct_ops_init(btf);
+
 	btf_verifier_env_free(env);
 	refcount_set(&btf->refcnt, 1);
 	return btf;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 81ee8595dfee..03a02ef4c496 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1672,17 +1672,22 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 			   enum bpf_attach_type expected_attach_type,
 			   u32 btf_id, u32 prog_fd)
 {
-	switch (prog_type) {
-	case BPF_PROG_TYPE_TRACING:
+	if (btf_id) {
 		if (btf_id > BTF_MAX_TYPE)
 			return -EINVAL;
-		break;
-	default:
-		if (btf_id || prog_fd)
+
+		switch (prog_type) {
+		case BPF_PROG_TYPE_TRACING:
+		case BPF_PROG_TYPE_STRUCT_OPS:
+			break;
+		default:
 			return -EINVAL;
-		break;
+		}
 	}
 
+	if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING)
+		return -EINVAL;
+
 	switch (prog_type) {
 	case BPF_PROG_TYPE_CGROUP_SOCK:
 		switch (expected_attach_type) {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index d433d70022fd..586ed3c94b80 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2859,11 +2859,6 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
 	u32 btf_id;
 	int ret;
 
-	if (atype != BPF_READ) {
-		verbose(env, "only read is supported\n");
-		return -EACCES;
-	}
-
 	if (off < 0) {
 		verbose(env,
 			"R%d is ptr_%s invalid negative access: off=%d\n",
@@ -2880,17 +2875,32 @@ static int check_ptr_to_btf_access(struct bpf_verifier_env *env,
 		return -EACCES;
 	}
 
-	ret = btf_struct_access(&env->log, t, off, size, atype, &btf_id);
+	if (env->ops->btf_struct_access) {
+		ret = env->ops->btf_struct_access(&env->log, t, off, size,
+						  atype, &btf_id);
+	} else {
+		if (atype != BPF_READ) {
+			verbose(env, "only read is supported\n");
+			return -EACCES;
+		}
+
+		ret = btf_struct_access(&env->log, t, off, size, atype,
+					&btf_id);
+	}
+
 	if (ret < 0)
 		return ret;
 
-	if (ret == SCALAR_VALUE) {
-		mark_reg_unknown(env, regs, value_regno);
-		return 0;
+	if (atype == BPF_READ) {
+		if (ret == SCALAR_VALUE) {
+			mark_reg_unknown(env, regs, value_regno);
+			return 0;
+		}
+		mark_reg_known_zero(env, regs, value_regno);
+		regs[value_regno].type = PTR_TO_BTF_ID;
+		regs[value_regno].btf_id = btf_id;
 	}
-	mark_reg_known_zero(env, regs, value_regno);
-	regs[value_regno].type = PTR_TO_BTF_ID;
-	regs[value_regno].btf_id = btf_id;
+
 	return 0;
 }
 
@@ -6349,8 +6359,30 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 static int check_return_code(struct bpf_verifier_env *env)
 {
 	struct tnum enforce_attach_type_range = tnum_unknown;
+	const struct bpf_prog *prog = env->prog;
 	struct bpf_reg_state *reg;
 	struct tnum range = tnum_range(0, 1);
+	int err;
+
+	/* The struct_ops func-ptr's return type could be "void" */
+	if (env->prog->type == BPF_PROG_TYPE_STRUCT_OPS &&
+	    !prog->aux->attach_func_proto->type)
+		return 0;
+
+	/* eBPF calling convetion is such that R0 is used
+	 * to return the value from eBPF program.
+	 * Make sure that it's readable at this time
+	 * of bpf_exit, which means that program wrote
+	 * something into it earlier
+	 */
+	err = check_reg_arg(env, BPF_REG_0, SRC_OP);
+	if (err)
+		return err;
+
+	if (is_pointer_value(env, BPF_REG_0)) {
+		verbose(env, "R0 leaks addr as return value\n");
+		return -EACCES;
+	}
 
 	switch (env->prog->type) {
 	case BPF_PROG_TYPE_CGROUP_SOCK_ADDR:
@@ -8016,21 +8048,6 @@ static int do_check(struct bpf_verifier_env *env)
 				if (err)
 					return err;
 
-				/* eBPF calling convetion is such that R0 is used
-				 * to return the value from eBPF program.
-				 * Make sure that it's readable at this time
-				 * of bpf_exit, which means that program wrote
-				 * something into it earlier
-				 */
-				err = check_reg_arg(env, BPF_REG_0, SRC_OP);
-				if (err)
-					return err;
-
-				if (is_pointer_value(env, BPF_REG_0)) {
-					verbose(env, "R0 leaks addr as return value\n");
-					return -EACCES;
-				}
-
 				err = check_return_code(env);
 				if (err)
 					return err;
@@ -8829,12 +8846,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 			convert_ctx_access = bpf_xdp_sock_convert_ctx_access;
 			break;
 		case PTR_TO_BTF_ID:
-			if (type == BPF_WRITE) {
+			if (type == BPF_READ) {
+				insn->code = BPF_LDX | BPF_PROBE_MEM |
+					BPF_SIZE((insn)->code);
+				env->prog->aux->num_exentries++;
+			} else if (env->prog->type != BPF_PROG_TYPE_STRUCT_OPS) {
 				verbose(env, "Writes through BTF pointers are not allowed\n");
 				return -EINVAL;
 			}
-			insn->code = BPF_LDX | BPF_PROBE_MEM | BPF_SIZE((insn)->code);
-			env->prog->aux->num_exentries++;
 			continue;
 		default:
 			continue;
@@ -9502,6 +9521,58 @@ static void print_verification_stats(struct bpf_verifier_env *env)
 		env->peak_states, env->longest_mark_read_walk);
 }
 
+static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
+{
+	const struct btf_type *t, *func_proto;
+	const struct bpf_struct_ops *st_ops;
+	const struct btf_member *member;
+	struct bpf_prog *prog = env->prog;
+	u32 btf_id, member_idx;
+	const char *mname;
+
+	btf_id = prog->aux->attach_btf_id;
+	st_ops = bpf_struct_ops_find(btf_id);
+	if (!st_ops) {
+		verbose(env, "attach_btf_id %u is not a supported struct\n",
+			btf_id);
+		return -ENOTSUPP;
+	}
+
+	t = st_ops->type;
+	member_idx = prog->expected_attach_type;
+	if (member_idx >= btf_type_vlen(t)) {
+		verbose(env, "attach to invalid member idx %u of struct %s\n",
+			member_idx, st_ops->name);
+		return -EINVAL;
+	}
+
+	member = &btf_type_member(t)[member_idx];
+	mname = btf_name_by_offset(btf_vmlinux, member->name_off);
+	func_proto = btf_type_resolve_func_ptr(btf_vmlinux, member->type,
+					       NULL);
+	if (!func_proto) {
+		verbose(env, "attach to invalid member %s(@idx %u) of struct %s\n",
+			mname, member_idx, st_ops->name);
+		return -EINVAL;
+	}
+
+	if (st_ops->check_member) {
+		int err = st_ops->check_member(t, member);
+
+		if (err) {
+			verbose(env, "attach to unsupported member %s of struct %s\n",
+				mname, st_ops->name);
+			return err;
+		}
+	}
+
+	prog->aux->attach_func_proto = func_proto;
+	prog->aux->attach_func_name = mname;
+	env->ops = st_ops->verifier_ops;
+
+	return 0;
+}
+
 static int check_attach_btf_id(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog;
@@ -9517,6 +9588,9 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	long addr;
 	u64 key;
 
+	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS)
+		return check_struct_ops_btf_id(env);
+
 	if (prog->type != BPF_PROG_TYPE_TRACING)
 		return 0;
 
-- 
cgit v1.2.3


From 85d33df357b634649ddbe0a20fd2d0fc5732c3cb Mon Sep 17 00:00:00 2001
From: Martin KaFai Lau <kafai@fb.com>
Date: Wed, 8 Jan 2020 16:35:05 -0800
Subject: bpf: Introduce BPF_MAP_TYPE_STRUCT_OPS

The patch introduces BPF_MAP_TYPE_STRUCT_OPS.  The map value
is a kernel struct with its func ptr implemented in bpf prog.
This new map is the interface to register/unregister/introspect
a bpf implemented kernel struct.

The kernel struct is actually embedded inside another new struct
(called the "value" struct in the code).  For example,
"struct tcp_congestion_ops" is embedded in:
struct bpf_struct_ops_tcp_congestion_ops {
	refcount_t refcnt;
	enum bpf_struct_ops_state state;
	struct tcp_congestion_ops data;  /* <-- kernel subsystem struct here */
}
The map value is "struct bpf_struct_ops_tcp_congestion_ops".
The "bpftool map dump" will then be able to show the
state ("inuse"/"tobefree") and the number of subsystem's refcnt (e.g.
number of tcp_sock in the tcp_congestion_ops case).  This "value" struct
is created automatically by a macro.  Having a separate "value" struct
will also make extending "struct bpf_struct_ops_XYZ" easier (e.g. adding
"void (*init)(void)" to "struct bpf_struct_ops_XYZ" to do some
initialization work before registering the struct_ops to the kernel
subsystem).  libbpf will take care of finding and populating the
"struct bpf_struct_ops_XYZ" from "struct XYZ".

Register a struct_ops to a kernel subsystem:
1. Load all needed BPF_PROG_TYPE_STRUCT_OPS prog(s)
2. Create a BPF_MAP_TYPE_STRUCT_OPS with attr->btf_vmlinux_value_type_id
   set to the btf id "struct bpf_struct_ops_tcp_congestion_ops" of the
   running kernel.
   Instead of reusing the attr->btf_value_type_id,
   btf_vmlinux_value_type_id is added such that attr->btf_fd can still be
   used as the "user" btf which could store other useful sysadmin/debug
   info that may be introduced in the future,
   e.g. creation-date/compiler-details/map-creator...etc.
3. Create a "struct bpf_struct_ops_tcp_congestion_ops" object as described
   in the running kernel btf.  Populate the value of this object.
   The function ptr should be populated with the prog fds.
4. Call BPF_MAP_UPDATE with the object created in (3) as
   the map value.  The key is always "0".

During BPF_MAP_UPDATE, the code that saves the kernel-func-ptr's
args as an array of u64 is generated.  BPF_MAP_UPDATE also allows
the specific struct_ops to do some final checks in "st_ops->init_member()"
(e.g. ensure all mandatory func ptrs are implemented).
If everything looks good, it will register this kernel struct
to the kernel subsystem.  The map will not allow further update
from this point.

Unregister a struct_ops from the kernel subsystem:
BPF_MAP_DELETE with key "0".

Introspect a struct_ops:
BPF_MAP_LOOKUP_ELEM with key "0".  The map value returned will
have the prog _id_ populated as the func ptr.

The map value state (enum bpf_struct_ops_state) transitions as follows:
INIT (map created) =>
INUSE (map updated, i.e. reg) =>
TOBEFREE (map value deleted, i.e. unreg)

The kernel subsystem needs to call bpf_struct_ops_get() and
bpf_struct_ops_put() to manage the "refcnt" in the
"struct bpf_struct_ops_XYZ".  This patch uses a separate refcnt
for the purpose of tracking the subsystem usage.  Another approach
is to reuse the map->refcnt and then "show" (i.e. during map_lookup)
the subsystem's usage by doing map->refcnt - map->usercnt to filter out
the map-fd/pinned-map usage.  However, that will also tie down the
future semantics of map->refcnt and map->usercnt.

The very first subsystem's refcnt (during reg()) holds one
count to map->refcnt.  When the very last subsystem's refcnt
is gone, it will also release the map->refcnt.  All bpf_prog will be
freed when the map->refcnt reaches 0 (i.e. during map_free()).

Here is what the bpftool map command output looks like:
[root@arch-fb-vm1 bpf]# bpftool map show
6: struct_ops  name dctcp  flags 0x0
	key 4B  value 256B  max_entries 1  memlock 4096B
	btf_id 6
[root@arch-fb-vm1 bpf]# bpftool map dump id 6
[{
        "value": {
            "refcnt": {
                "refs": {
                    "counter": 1
                }
            },
            "state": 1,
            "data": {
                "list": {
                    "next": 0,
                    "prev": 0
                },
                "key": 0,
                "flags": 2,
                "init": 24,
                "release": 0,
                "ssthresh": 25,
                "cong_avoid": 30,
                "set_state": 27,
                "cwnd_event": 28,
                "in_ack_event": 26,
                "undo_cwnd": 29,
                "pkts_acked": 0,
                "min_tso_segs": 0,
                "sndbuf_expand": 0,
                "cong_control": 0,
                "get_info": 0,
                "name": [98,112,102,95,100,99,116,99,112,0,0,0,0,0,0,0
                ],
                "owner": 0
            }
        }
    }
]
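
The "name" member is dumped as raw bytes.  A minimal sketch for turning
those bytes back into a string is shown below; dump_name and
decode_name() are names made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* The 16 bytes shown for "name" in the dump above. */
static const unsigned char dump_name[16] = {
	98, 112, 102, 95, 100, 99, 116, 99, 112, 0, 0, 0, 0, 0, 0, 0
};

/*
 * Copy bytes up to the first NUL (or until out is full) into out,
 * always NUL-terminate, and return the number of chars copied.
 */
static size_t decode_name(const unsigned char *bytes, size_t len,
			  char *out, size_t out_len)
{
	size_t i;

	assert(bytes != NULL && out != NULL && out_len > 0);
	for (i = 0; i < len && i + 1 < out_len && bytes[i] != 0; i++)
		out[i] = (char)bytes[i];
	out[i] = '\0';
	return i;
}
```

Decoding dump_name this way gives the 9-character string "bpf_dctcp".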

Misc Notes:
* bpf_struct_ops_map_sys_lookup_elem() is added for syscall lookup.
  It does an in-place update on "*value" instead of returning a pointer
  to syscall.c.  Otherwise, it needs a separate copy of "zero" value
  for the BPF_STRUCT_OPS_STATE_INIT to avoid races.

* The bpf_struct_ops_map_delete_elem() is also called without
  preempt_disable() from map_delete_elem().  This is because
  "->unreg()" may require a sleepable context, e.g. in
  "tcp_unregister_congestion_control()".

* "const" is added to some of the existing "struct btf_func_model *"
  function arg to avoid a compiler warning caused by this patch.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109003505.3855919-1-kafai@fb.com
---
 arch/x86/net/bpf_jit_comp.c |  18 +-
 include/linux/bpf.h         |  49 ++++-
 include/linux/bpf_types.h   |   3 +
 include/linux/btf.h         |  13 ++
 include/uapi/linux/bpf.h    |   7 +-
 kernel/bpf/bpf_struct_ops.c | 511 +++++++++++++++++++++++++++++++++++++++++++-
 kernel/bpf/btf.c            |  20 +-
 kernel/bpf/map_in_map.c     |   3 +-
 kernel/bpf/syscall.c        |  52 +++--
 kernel/bpf/trampoline.c     |   8 +-
 kernel/bpf/verifier.c       |   5 +
 11 files changed, 642 insertions(+), 47 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 4c8a2d1f8470..9ba08e9abc09 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1328,7 +1328,7 @@ emit_jmp:
 	return proglen;
 }
 
-static void save_regs(struct btf_func_model *m, u8 **prog, int nr_args,
+static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 		      int stack_size)
 {
 	int i;
@@ -1344,7 +1344,7 @@ static void save_regs(struct btf_func_model *m, u8 **prog, int nr_args,
 			 -(stack_size - i * 8));
 }
 
-static void restore_regs(struct btf_func_model *m, u8 **prog, int nr_args,
+static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 			 int stack_size)
 {
 	int i;
@@ -1361,7 +1361,7 @@ static void restore_regs(struct btf_func_model *m, u8 **prog, int nr_args,
 			 -(stack_size - i * 8));
 }
 
-static int invoke_bpf(struct btf_func_model *m, u8 **pprog,
+static int invoke_bpf(const struct btf_func_model *m, u8 **pprog,
 		      struct bpf_prog **progs, int prog_cnt, int stack_size)
 {
 	u8 *prog = *pprog;
@@ -1456,7 +1456,8 @@ static int invoke_bpf(struct btf_func_model *m, u8 **pprog,
  * add rsp, 8                      // skip eth_type_trans's frame
  * ret                             // return to its caller
  */
-int arch_prepare_bpf_trampoline(void *image, struct btf_func_model *m, u32 flags,
+int arch_prepare_bpf_trampoline(void *image, void *image_end,
+				const struct btf_func_model *m, u32 flags,
 				struct bpf_prog **fentry_progs, int fentry_cnt,
 				struct bpf_prog **fexit_progs, int fexit_cnt,
 				void *orig_call)
@@ -1523,13 +1524,10 @@ int arch_prepare_bpf_trampoline(void *image, struct btf_func_model *m, u32 flags
 		/* skip our return address and return to parent */
 		EMIT4(0x48, 0x83, 0xC4, 8); /* add rsp, 8 */
 	EMIT1(0xC3); /* ret */
-	/* One half of the page has active running trampoline.
-	 * Another half is an area for next trampoline.
-	 * Make sure the trampoline generation logic doesn't overflow.
-	 */
-	if (WARN_ON_ONCE(prog - (u8 *)image > PAGE_SIZE / 2 - BPF_INSN_SAFETY))
+	/* Make sure the trampoline generation logic doesn't overflow */
+	if (WARN_ON_ONCE(prog > (u8 *)image_end - BPF_INSN_SAFETY))
 		return -EFAULT;
-	return 0;
+	return prog - (u8 *)image;
 }
 
 static int emit_cond_near_jump(u8 **pprog, void *func, void *ip, u8 jmp_cond)
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 50f3b20ae284..a7bfe8a388c6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -17,6 +17,7 @@
 #include <linux/u64_stats_sync.h>
 #include <linux/refcount.h>
 #include <linux/mutex.h>
+#include <linux/module.h>
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
@@ -106,6 +107,7 @@ struct bpf_map {
 	struct btf *btf;
 	struct bpf_map_memory memory;
 	char name[BPF_OBJ_NAME_LEN];
+	u32 btf_vmlinux_value_type_id;
 	bool unpriv_array;
 	bool frozen; /* write-once; write-protected by freeze_mutex */
 	/* 22 bytes hole */
@@ -183,7 +185,8 @@ static inline bool bpf_map_offload_neutral(const struct bpf_map *map)
 
 static inline bool bpf_map_support_seq_show(const struct bpf_map *map)
 {
-	return map->btf && map->ops->map_seq_show_elem;
+	return (map->btf_value_type_id || map->btf_vmlinux_value_type_id) &&
+		map->ops->map_seq_show_elem;
 }
 
 int map_check_no_btf(const struct bpf_map *map,
@@ -441,7 +444,8 @@ struct btf_func_model {
  *      fentry = a set of program to run before calling original function
  *      fexit = a set of program to run after original function
  */
-int arch_prepare_bpf_trampoline(void *image, struct btf_func_model *m, u32 flags,
+int arch_prepare_bpf_trampoline(void *image, void *image_end,
+				const struct btf_func_model *m, u32 flags,
 				struct bpf_prog **fentry_progs, int fentry_cnt,
 				struct bpf_prog **fexit_progs, int fexit_cnt,
 				void *orig_call);
@@ -672,6 +676,7 @@ struct bpf_array_aux {
 	struct work_struct work;
 };
 
+struct bpf_struct_ops_value;
 struct btf_type;
 struct btf_member;
 
@@ -681,21 +686,61 @@ struct bpf_struct_ops {
 	int (*init)(struct btf *btf);
 	int (*check_member)(const struct btf_type *t,
 			    const struct btf_member *member);
+	int (*init_member)(const struct btf_type *t,
+			   const struct btf_member *member,
+			   void *kdata, const void *udata);
+	int (*reg)(void *kdata);
+	void (*unreg)(void *kdata);
 	const struct btf_type *type;
+	const struct btf_type *value_type;
 	const char *name;
 	struct btf_func_model func_models[BPF_STRUCT_OPS_MAX_NR_MEMBERS];
 	u32 type_id;
+	u32 value_id;
 };
 
 #if defined(CONFIG_BPF_JIT) && defined(CONFIG_BPF_SYSCALL)
+#define BPF_MODULE_OWNER ((void *)((0xeB9FUL << 2) + POISON_POINTER_DELTA))
 const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id);
 void bpf_struct_ops_init(struct btf *btf);
+bool bpf_struct_ops_get(const void *kdata);
+void bpf_struct_ops_put(const void *kdata);
+int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, void *key,
+				       void *value);
+static inline bool bpf_try_module_get(const void *data, struct module *owner)
+{
+	if (owner == BPF_MODULE_OWNER)
+		return bpf_struct_ops_get(data);
+	else
+		return try_module_get(owner);
+}
+static inline void bpf_module_put(const void *data, struct module *owner)
+{
+	if (owner == BPF_MODULE_OWNER)
+		bpf_struct_ops_put(data);
+	else
+		module_put(owner);
+}
 #else
 static inline const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
 {
 	return NULL;
 }
 static inline void bpf_struct_ops_init(struct btf *btf) { }
+static inline bool bpf_try_module_get(const void *data, struct module *owner)
+{
+	return try_module_get(owner);
+}
+static inline void bpf_module_put(const void *data, struct module *owner)
+{
+	module_put(owner);
+}
+static inline int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map,
+						     void *key,
+						     void *value)
+{
+	return -EINVAL;
+}
 #endif
 
 struct bpf_array {
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index fadd243ffa2d..9f326e6ef885 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -109,3 +109,6 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_REUSEPORT_SOCKARRAY, reuseport_array_ops)
 #endif
 BPF_MAP_TYPE(BPF_MAP_TYPE_QUEUE, queue_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_STACK, stack_map_ops)
+#if defined(CONFIG_BPF_JIT)
+BPF_MAP_TYPE(BPF_MAP_TYPE_STRUCT_OPS, bpf_struct_ops_map_ops)
+#endif
diff --git a/include/linux/btf.h b/include/linux/btf.h
index f74a09a7120b..881e9b76ef49 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -7,6 +7,8 @@
 #include <linux/types.h>
 #include <uapi/linux/btf.h>
 
+#define BTF_TYPE_EMIT(type) ((void)(type *)0)
+
 struct btf;
 struct btf_member;
 struct btf_type;
@@ -60,6 +62,10 @@ const struct btf_type *btf_type_resolve_ptr(const struct btf *btf,
 					    u32 id, u32 *res_id);
 const struct btf_type *btf_type_resolve_func_ptr(const struct btf *btf,
 						 u32 id, u32 *res_id);
+const struct btf_type *
+btf_resolve_size(const struct btf *btf, const struct btf_type *type,
+		 u32 *type_size, const struct btf_type **elem_type,
+		 u32 *total_nelems);
 
 #define for_each_member(i, struct_type, member)			\
 	for (i = 0, member = btf_type_member(struct_type);	\
@@ -106,6 +112,13 @@ static inline bool btf_type_kflag(const struct btf_type *t)
 	return BTF_INFO_KFLAG(t->info);
 }
 
+static inline u32 btf_member_bit_offset(const struct btf_type *struct_type,
+					const struct btf_member *member)
+{
+	return btf_type_kflag(struct_type) ? BTF_MEMBER_BIT_OFFSET(member->offset)
+					   : member->offset;
+}
+
 static inline u32 btf_member_bitfield_size(const struct btf_type *struct_type,
 					   const struct btf_member *member)
 {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c1eeb3e0e116..38059880963e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -136,6 +136,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_STACK,
 	BPF_MAP_TYPE_SK_STORAGE,
 	BPF_MAP_TYPE_DEVMAP_HASH,
+	BPF_MAP_TYPE_STRUCT_OPS,
 };
 
 /* Note that tracing related programs such as
@@ -398,6 +399,10 @@ union bpf_attr {
 		__u32	btf_fd;		/* fd pointing to a BTF type data */
 		__u32	btf_key_type_id;	/* BTF type_id of the key */
 		__u32	btf_value_type_id;	/* BTF type_id of the value */
+		__u32	btf_vmlinux_value_type_id;/* BTF type_id of a kernel-
+						   * struct stored as the
+						   * map value
+						   */
 	};
 
 	struct { /* anonymous struct used by BPF_MAP_*_ELEM commands */
@@ -3350,7 +3355,7 @@ struct bpf_map_info {
 	__u32 map_flags;
 	char  name[BPF_OBJ_NAME_LEN];
 	__u32 ifindex;
-	__u32 :32;
+	__u32 btf_vmlinux_value_type_id;
 	__u64 netns_dev;
 	__u64 netns_ino;
 	__u32 btf_id;
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 2ea68fe34c33..ddf48f49914b 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -9,9 +9,68 @@
 #include <linux/numa.h>
 #include <linux/seq_file.h>
 #include <linux/refcount.h>
+#include <linux/mutex.h>
 
+enum bpf_struct_ops_state {
+	BPF_STRUCT_OPS_STATE_INIT,
+	BPF_STRUCT_OPS_STATE_INUSE,
+	BPF_STRUCT_OPS_STATE_TOBEFREE,
+};
+
+#define BPF_STRUCT_OPS_COMMON_VALUE			\
+	refcount_t refcnt;				\
+	enum bpf_struct_ops_state state
+
+struct bpf_struct_ops_value {
+	BPF_STRUCT_OPS_COMMON_VALUE;
+	char data[0] ____cacheline_aligned_in_smp;
+};
+
+struct bpf_struct_ops_map {
+	struct bpf_map map;
+	const struct bpf_struct_ops *st_ops;
+	/* protect map_update */
+	struct mutex lock;
+	/* progs holds all the bpf_progs that are populated
+	 * into the func ptrs of the kernel's struct
+	 * (in kvalue.data).
+	 */
+	struct bpf_prog **progs;
+	/* image is a page that holds all the trampolines
+	 * that store the func args before calling the bpf_prog.
+	 * A PAGE_SIZE "image" is enough to store all trampolines for
+	 * "progs[]".
+	 */
+	void *image;
+	/* uvalue->data stores the kernel struct
+	 * (e.g. tcp_congestion_ops) that is more useful
+	 * to userspace than the kvalue.  For example,
+	 * the bpf_prog's id is stored instead of the kernel
+	 * address of a func ptr.
+	 */
+	struct bpf_struct_ops_value *uvalue;
+	/* kvalue.data stores the actual kernel's struct
+	 * (e.g. tcp_congestion_ops) that will be
+	 * registered to the kernel subsystem.
+	 */
+	struct bpf_struct_ops_value kvalue;
+};
+
+#define VALUE_PREFIX "bpf_struct_ops_"
+#define VALUE_PREFIX_LEN (sizeof(VALUE_PREFIX) - 1)
+
+/* bpf_struct_ops_##_name (e.g. bpf_struct_ops_tcp_congestion_ops) is
+ * the map's value exposed to the userspace and its btf-type-id is
+ * stored at the map->btf_vmlinux_value_type_id.
+ *
+ */
 #define BPF_STRUCT_OPS_TYPE(_name)				\
-extern struct bpf_struct_ops bpf_##_name;
+extern struct bpf_struct_ops bpf_##_name;			\
+								\
+struct bpf_struct_ops_##_name {						\
+	BPF_STRUCT_OPS_COMMON_VALUE;				\
+	struct _name data ____cacheline_aligned_in_smp;		\
+};
 #include "bpf_struct_ops_types.h"
 #undef BPF_STRUCT_OPS_TYPE
 
@@ -35,19 +94,50 @@ const struct bpf_verifier_ops bpf_struct_ops_verifier_ops = {
 const struct bpf_prog_ops bpf_struct_ops_prog_ops = {
 };
 
+static const struct btf_type *module_type;
+
 void bpf_struct_ops_init(struct btf *btf)
 {
+	s32 type_id, value_id, module_id;
 	const struct btf_member *member;
 	struct bpf_struct_ops *st_ops;
 	struct bpf_verifier_log log = {};
 	const struct btf_type *t;
+	char value_name[128];
 	const char *mname;
-	s32 type_id;
 	u32 i, j;
 
+	/* Ensure BTF type is emitted for "struct bpf_struct_ops_##_name" */
+#define BPF_STRUCT_OPS_TYPE(_name) BTF_TYPE_EMIT(struct bpf_struct_ops_##_name);
+#include "bpf_struct_ops_types.h"
+#undef BPF_STRUCT_OPS_TYPE
+
+	module_id = btf_find_by_name_kind(btf, "module", BTF_KIND_STRUCT);
+	if (module_id < 0) {
+		pr_warn("Cannot find struct module in btf_vmlinux\n");
+		return;
+	}
+	module_type = btf_type_by_id(btf, module_id);
+
 	for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
 		st_ops = bpf_struct_ops[i];
 
+		if (strlen(st_ops->name) + VALUE_PREFIX_LEN >=
+		    sizeof(value_name)) {
+			pr_warn("struct_ops name %s is too long\n",
+				st_ops->name);
+			continue;
+		}
+		sprintf(value_name, "%s%s", VALUE_PREFIX, st_ops->name);
+
+		value_id = btf_find_by_name_kind(btf, value_name,
+						 BTF_KIND_STRUCT);
+		if (value_id < 0) {
+			pr_warn("Cannot find struct %s in btf_vmlinux\n",
+				value_name);
+			continue;
+		}
+
 		type_id = btf_find_by_name_kind(btf, st_ops->name,
 						BTF_KIND_STRUCT);
 		if (type_id < 0) {
@@ -98,6 +188,9 @@ void bpf_struct_ops_init(struct btf *btf)
 			} else {
 				st_ops->type_id = type_id;
 				st_ops->type = t;
+				st_ops->value_id = value_id;
+				st_ops->value_type = btf_type_by_id(btf,
+								    value_id);
 			}
 		}
 	}
@@ -105,6 +198,22 @@ void bpf_struct_ops_init(struct btf *btf)
 
 extern struct btf *btf_vmlinux;
 
+static const struct bpf_struct_ops *
+bpf_struct_ops_find_value(u32 value_id)
+{
+	unsigned int i;
+
+	if (!value_id || !btf_vmlinux)
+		return NULL;
+
+	for (i = 0; i < ARRAY_SIZE(bpf_struct_ops); i++) {
+		if (bpf_struct_ops[i]->value_id == value_id)
+			return bpf_struct_ops[i];
+	}
+
+	return NULL;
+}
+
 const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
 {
 	unsigned int i;
@@ -119,3 +228,401 @@ const struct bpf_struct_ops *bpf_struct_ops_find(u32 type_id)
 
 	return NULL;
 }
+
+static int bpf_struct_ops_map_get_next_key(struct bpf_map *map, void *key,
+					   void *next_key)
+{
+	if (key && *(u32 *)key == 0)
+		return -ENOENT;
+
+	*(u32 *)next_key = 0;
+	return 0;
+}
+
+int bpf_struct_ops_map_sys_lookup_elem(struct bpf_map *map, void *key,
+				       void *value)
+{
+	struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
+	struct bpf_struct_ops_value *uvalue, *kvalue;
+	enum bpf_struct_ops_state state;
+
+	if (unlikely(*(u32 *)key != 0))
+		return -ENOENT;
+
+	kvalue = &st_map->kvalue;
+	/* Pair with smp_store_release() during map_update */
+	state = smp_load_acquire(&kvalue->state);
+	if (state == BPF_STRUCT_OPS_STATE_INIT) {
+		memset(value, 0, map->value_size);
+		return 0;
+	}
+
+	/* No lock is needed.  state and refcnt do not need
+	 * to be updated together under atomic context.
+	 */
+	uvalue = (struct bpf_struct_ops_value *)value;
+	memcpy(uvalue, st_map->uvalue, map->value_size);
+	uvalue->state = state;
+	refcount_set(&uvalue->refcnt, refcount_read(&kvalue->refcnt));
+
+	return 0;
+}
+
+static void *bpf_struct_ops_map_lookup_elem(struct bpf_map *map, void *key)
+{
+	return ERR_PTR(-EINVAL);
+}
+
+static void bpf_struct_ops_map_put_progs(struct bpf_struct_ops_map *st_map)
+{
+	const struct btf_type *t = st_map->st_ops->type;
+	u32 i;
+
+	for (i = 0; i < btf_type_vlen(t); i++) {
+		if (st_map->progs[i]) {
+			bpf_prog_put(st_map->progs[i]);
+			st_map->progs[i] = NULL;
+		}
+	}
+}
+
+static int check_zero_holes(const struct btf_type *t, void *data)
+{
+	const struct btf_member *member;
+	u32 i, moff, msize, prev_mend = 0;
+	const struct btf_type *mtype;
+
+	for_each_member(i, t, member) {
+		moff = btf_member_bit_offset(t, member) / 8;
+		if (moff > prev_mend &&
+		    memchr_inv(data + prev_mend, 0, moff - prev_mend))
+			return -EINVAL;
+
+		mtype = btf_type_by_id(btf_vmlinux, member->type);
+		mtype = btf_resolve_size(btf_vmlinux, mtype, &msize,
+					 NULL, NULL);
+		if (IS_ERR(mtype))
+			return PTR_ERR(mtype);
+		prev_mend = moff + msize;
+	}
+
+	if (t->size > prev_mend &&
+	    memchr_inv(data + prev_mend, 0, t->size - prev_mend))
+		return -EINVAL;
+
+	return 0;
+}
+
+static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
+					  void *value, u64 flags)
+{
+	struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
+	const struct bpf_struct_ops *st_ops = st_map->st_ops;
+	struct bpf_struct_ops_value *uvalue, *kvalue;
+	const struct btf_member *member;
+	const struct btf_type *t = st_ops->type;
+	void *udata, *kdata;
+	int prog_fd, err = 0;
+	void *image;
+	u32 i;
+
+	if (flags)
+		return -EINVAL;
+
+	if (*(u32 *)key != 0)
+		return -E2BIG;
+
+	err = check_zero_holes(st_ops->value_type, value);
+	if (err)
+		return err;
+
+	uvalue = (struct bpf_struct_ops_value *)value;
+	err = check_zero_holes(t, uvalue->data);
+	if (err)
+		return err;
+
+	if (uvalue->state || refcount_read(&uvalue->refcnt))
+		return -EINVAL;
+
+	uvalue = (struct bpf_struct_ops_value *)st_map->uvalue;
+	kvalue = (struct bpf_struct_ops_value *)&st_map->kvalue;
+
+	mutex_lock(&st_map->lock);
+
+	if (kvalue->state != BPF_STRUCT_OPS_STATE_INIT) {
+		err = -EBUSY;
+		goto unlock;
+	}
+
+	memcpy(uvalue, value, map->value_size);
+
+	udata = &uvalue->data;
+	kdata = &kvalue->data;
+	image = st_map->image;
+
+	for_each_member(i, t, member) {
+		const struct btf_type *mtype, *ptype;
+		struct bpf_prog *prog;
+		u32 moff;
+
+		moff = btf_member_bit_offset(t, member) / 8;
+		ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL);
+		if (ptype == module_type) {
+			if (*(void **)(udata + moff))
+				goto reset_unlock;
+			*(void **)(kdata + moff) = BPF_MODULE_OWNER;
+			continue;
+		}
+
+		err = st_ops->init_member(t, member, kdata, udata);
+		if (err < 0)
+			goto reset_unlock;
+
+		/* The ->init_member() has handled this member */
+		if (err > 0)
+			continue;
+
+		/* If st_ops->init_member does not handle it,
+		 * we will only handle func ptrs and zero-ed members
+		 * here.  Reject everything else.
+		 */
+
+		/* All non func ptr member must be 0 */
+		if (!ptype || !btf_type_is_func_proto(ptype)) {
+			u32 msize;
+
+			mtype = btf_type_by_id(btf_vmlinux, member->type);
+			mtype = btf_resolve_size(btf_vmlinux, mtype, &msize,
+						 NULL, NULL);
+			if (IS_ERR(mtype)) {
+				err = PTR_ERR(mtype);
+				goto reset_unlock;
+			}
+
+			if (memchr_inv(udata + moff, 0, msize)) {
+				err = -EINVAL;
+				goto reset_unlock;
+			}
+
+			continue;
+		}
+
+		prog_fd = (int)(*(unsigned long *)(udata + moff));
+		/* Similar check as the attr->attach_prog_fd */
+		if (!prog_fd)
+			continue;
+
+		prog = bpf_prog_get(prog_fd);
+		if (IS_ERR(prog)) {
+			err = PTR_ERR(prog);
+			goto reset_unlock;
+		}
+		st_map->progs[i] = prog;
+
+		if (prog->type != BPF_PROG_TYPE_STRUCT_OPS ||
+		    prog->aux->attach_btf_id != st_ops->type_id ||
+		    prog->expected_attach_type != i) {
+			err = -EINVAL;
+			goto reset_unlock;
+		}
+
+		err = arch_prepare_bpf_trampoline(image,
+						  st_map->image + PAGE_SIZE,
+						  &st_ops->func_models[i], 0,
+						  &prog, 1, NULL, 0, NULL);
+		if (err < 0)
+			goto reset_unlock;
+
+		*(void **)(kdata + moff) = image;
+		image += err;
+
+		/* put prog_id to udata */
+		*(unsigned long *)(udata + moff) = prog->aux->id;
+	}
+
+	refcount_set(&kvalue->refcnt, 1);
+	bpf_map_inc(map);
+
+	set_memory_ro((long)st_map->image, 1);
+	set_memory_x((long)st_map->image, 1);
+	err = st_ops->reg(kdata);
+	if (likely(!err)) {
+		/* Pair with smp_load_acquire() during lookup_elem().
+		 * It ensures the above udata updates (e.g. prog->aux->id)
+		 * can be seen once BPF_STRUCT_OPS_STATE_INUSE is set.
+		 */
+		smp_store_release(&kvalue->state, BPF_STRUCT_OPS_STATE_INUSE);
+		goto unlock;
+	}
+
+	/* Error during st_ops->reg().  It is very unlikely since
+	 * the above init_member() should have caught it earlier
+	 * before reg().  The only possibility is if there was a race
+	 * in registering the struct_ops (under the same name) to
+	 * a sub-system through different struct_ops's maps.
+	 */
+	set_memory_nx((long)st_map->image, 1);
+	set_memory_rw((long)st_map->image, 1);
+	bpf_map_put(map);
+
+reset_unlock:
+	bpf_struct_ops_map_put_progs(st_map);
+	memset(uvalue, 0, map->value_size);
+	memset(kvalue, 0, map->value_size);
+unlock:
+	mutex_unlock(&st_map->lock);
+	return err;
+}
+
+static int bpf_struct_ops_map_delete_elem(struct bpf_map *map, void *key)
+{
+	enum bpf_struct_ops_state prev_state;
+	struct bpf_struct_ops_map *st_map;
+
+	st_map = (struct bpf_struct_ops_map *)map;
+	prev_state = cmpxchg(&st_map->kvalue.state,
+			     BPF_STRUCT_OPS_STATE_INUSE,
+			     BPF_STRUCT_OPS_STATE_TOBEFREE);
+	if (prev_state == BPF_STRUCT_OPS_STATE_INUSE) {
+		st_map->st_ops->unreg(&st_map->kvalue.data);
+		if (refcount_dec_and_test(&st_map->kvalue.refcnt))
+			bpf_map_put(map);
+	}
+
+	return 0;
+}
+
+static void bpf_struct_ops_map_seq_show_elem(struct bpf_map *map, void *key,
+					     struct seq_file *m)
+{
+	void *value;
+
+	value = bpf_struct_ops_map_lookup_elem(map, key);
+	if (!value)
+		return;
+
+	btf_type_seq_show(btf_vmlinux, map->btf_vmlinux_value_type_id,
+			  value, m);
+	seq_puts(m, "\n");
+}
+
+static void bpf_struct_ops_map_free(struct bpf_map *map)
+{
+	struct bpf_struct_ops_map *st_map = (struct bpf_struct_ops_map *)map;
+
+	if (st_map->progs)
+		bpf_struct_ops_map_put_progs(st_map);
+	bpf_map_area_free(st_map->progs);
+	bpf_jit_free_exec(st_map->image);
+	bpf_map_area_free(st_map->uvalue);
+	bpf_map_area_free(st_map);
+}
+
+static int bpf_struct_ops_map_alloc_check(union bpf_attr *attr)
+{
+	if (attr->key_size != sizeof(unsigned int) || attr->max_entries != 1 ||
+	    attr->map_flags || !attr->btf_vmlinux_value_type_id)
+		return -EINVAL;
+	return 0;
+}
+
+static struct bpf_map *bpf_struct_ops_map_alloc(union bpf_attr *attr)
+{
+	const struct bpf_struct_ops *st_ops;
+	size_t map_total_size, st_map_size;
+	struct bpf_struct_ops_map *st_map;
+	const struct btf_type *t, *vt;
+	struct bpf_map_memory mem;
+	struct bpf_map *map;
+	int err;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return ERR_PTR(-EPERM);
+
+	st_ops = bpf_struct_ops_find_value(attr->btf_vmlinux_value_type_id);
+	if (!st_ops)
+		return ERR_PTR(-ENOTSUPP);
+
+	vt = st_ops->value_type;
+	if (attr->value_size != vt->size)
+		return ERR_PTR(-EINVAL);
+
+	t = st_ops->type;
+
+	st_map_size = sizeof(*st_map) +
+		/* kvalue stores the
+		 * struct bpf_struct_ops_tcp_congestion_ops
+		 */
+		(vt->size - sizeof(struct bpf_struct_ops_value));
+	map_total_size = st_map_size +
+		/* uvalue */
+		vt->size +
+		/* struct bpf_progs **progs */
+		 btf_type_vlen(t) * sizeof(struct bpf_prog *);
+	err = bpf_map_charge_init(&mem, map_total_size);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	st_map = bpf_map_area_alloc(st_map_size, NUMA_NO_NODE);
+	if (!st_map) {
+		bpf_map_charge_finish(&mem);
+		return ERR_PTR(-ENOMEM);
+	}
+	st_map->st_ops = st_ops;
+	map = &st_map->map;
+
+	st_map->uvalue = bpf_map_area_alloc(vt->size, NUMA_NO_NODE);
+	st_map->progs =
+		bpf_map_area_alloc(btf_type_vlen(t) * sizeof(struct bpf_prog *),
+				   NUMA_NO_NODE);
+	st_map->image = bpf_jit_alloc_exec(PAGE_SIZE);
+	if (!st_map->uvalue || !st_map->progs || !st_map->image) {
+		bpf_struct_ops_map_free(map);
+		bpf_map_charge_finish(&mem);
+		return ERR_PTR(-ENOMEM);
+	}
+
+	mutex_init(&st_map->lock);
+	set_vm_flush_reset_perms(st_map->image);
+	bpf_map_init_from_attr(map, attr);
+	bpf_map_charge_move(&map->memory, &mem);
+
+	return map;
+}
+
+const struct bpf_map_ops bpf_struct_ops_map_ops = {
+	.map_alloc_check = bpf_struct_ops_map_alloc_check,
+	.map_alloc = bpf_struct_ops_map_alloc,
+	.map_free = bpf_struct_ops_map_free,
+	.map_get_next_key = bpf_struct_ops_map_get_next_key,
+	.map_lookup_elem = bpf_struct_ops_map_lookup_elem,
+	.map_delete_elem = bpf_struct_ops_map_delete_elem,
+	.map_update_elem = bpf_struct_ops_map_update_elem,
+	.map_seq_show_elem = bpf_struct_ops_map_seq_show_elem,
+};
+
+/* "const void *" because some subsystem is
+ * passing a const (e.g. const struct tcp_congestion_ops *)
+ */
+bool bpf_struct_ops_get(const void *kdata)
+{
+	struct bpf_struct_ops_value *kvalue;
+
+	kvalue = container_of(kdata, struct bpf_struct_ops_value, data);
+
+	return refcount_inc_not_zero(&kvalue->refcnt);
+}
+
+void bpf_struct_ops_put(const void *kdata)
+{
+	struct bpf_struct_ops_value *kvalue;
+
+	kvalue = container_of(kdata, struct bpf_struct_ops_value, data);
+	if (refcount_dec_and_test(&kvalue->refcnt)) {
+		struct bpf_struct_ops_map *st_map;
+
+		st_map = container_of(kvalue, struct bpf_struct_ops_map,
+				      kvalue);
+		bpf_map_put(&st_map->map);
+	}
+}
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 12af4a1bb1a4..81d9cf75cacd 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -500,13 +500,6 @@ static const char *btf_int_encoding_str(u8 encoding)
 		return "UNKN";
 }
 
-static u32 btf_member_bit_offset(const struct btf_type *struct_type,
-			     const struct btf_member *member)
-{
-	return btf_type_kflag(struct_type) ? BTF_MEMBER_BIT_OFFSET(member->offset)
-					   : member->offset;
-}
-
 static u32 btf_type_int(const struct btf_type *t)
 {
 	return *(u32 *)(t + 1);
@@ -1089,7 +1082,7 @@ static const struct resolve_vertex *env_stack_peak(struct btf_verifier_env *env)
  * *elem_type: same as return type ("struct X")
  * *total_nelems: 1
  */
-static const struct btf_type *
+const struct btf_type *
 btf_resolve_size(const struct btf *btf, const struct btf_type *type,
 		 u32 *type_size, const struct btf_type **elem_type,
 		 u32 *total_nelems)
@@ -1143,8 +1136,10 @@ resolved:
 		return ERR_PTR(-EINVAL);
 
 	*type_size = nelems * size;
-	*total_nelems = nelems;
-	*elem_type = type;
+	if (total_nelems)
+		*total_nelems = nelems;
+	if (elem_type)
+		*elem_type = type;
 
 	return array_type ? : type;
 }
@@ -1858,7 +1853,10 @@ static void btf_modifier_seq_show(const struct btf *btf,
 				  u32 type_id, void *data,
 				  u8 bits_offset, struct seq_file *m)
 {
-	t = btf_type_id_resolve(btf, &type_id);
+	if (btf->resolved_ids)
+		t = btf_type_id_resolve(btf, &type_id);
+	else
+		t = btf_type_skip_modifiers(btf, type_id, NULL);
 
 	btf_type_ops(t)->seq_show(btf, t, type_id, data, bits_offset, m);
 }
diff --git a/kernel/bpf/map_in_map.c b/kernel/bpf/map_in_map.c
index 5e9366b33f0f..b3c48d1533cb 100644
--- a/kernel/bpf/map_in_map.c
+++ b/kernel/bpf/map_in_map.c
@@ -22,7 +22,8 @@ struct bpf_map *bpf_map_meta_alloc(int inner_map_ufd)
 	 */
 	if (inner_map->map_type == BPF_MAP_TYPE_PROG_ARRAY ||
 	    inner_map->map_type == BPF_MAP_TYPE_CGROUP_STORAGE ||
-	    inner_map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
+	    inner_map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE ||
+	    inner_map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
 		fdput(f);
 		return ERR_PTR(-ENOTSUPP);
 	}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 03a02ef4c496..f9db72a96ec0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -628,7 +628,7 @@ static int map_check_btf(struct bpf_map *map, const struct btf *btf,
 	return ret;
 }
 
-#define BPF_MAP_CREATE_LAST_FIELD btf_value_type_id
+#define BPF_MAP_CREATE_LAST_FIELD btf_vmlinux_value_type_id
 /* called via syscall */
 static int map_create(union bpf_attr *attr)
 {
@@ -642,6 +642,14 @@ static int map_create(union bpf_attr *attr)
 	if (err)
 		return -EINVAL;
 
+	if (attr->btf_vmlinux_value_type_id) {
+		if (attr->map_type != BPF_MAP_TYPE_STRUCT_OPS ||
+		    attr->btf_key_type_id || attr->btf_value_type_id)
+			return -EINVAL;
+	} else if (attr->btf_key_type_id && !attr->btf_value_type_id) {
+		return -EINVAL;
+	}
+
 	f_flags = bpf_get_file_flag(attr->map_flags);
 	if (f_flags < 0)
 		return f_flags;
@@ -664,32 +672,35 @@ static int map_create(union bpf_attr *attr)
 	atomic64_set(&map->usercnt, 1);
 	mutex_init(&map->freeze_mutex);
 
-	if (attr->btf_key_type_id || attr->btf_value_type_id) {
+	map->spin_lock_off = -EINVAL;
+	if (attr->btf_key_type_id || attr->btf_value_type_id ||
+	    /* Even if the map's value is a kernel struct,
+	     * the bpf_prog.o must have BTF to begin with
+	     * to figure out the corresponding kernel
+	     * counterpart.  Thus, attr->btf_fd has
+	     * to be valid as well.
+	     */
+	    attr->btf_vmlinux_value_type_id) {
 		struct btf *btf;
 
-		if (!attr->btf_value_type_id) {
-			err = -EINVAL;
-			goto free_map;
-		}
-
 		btf = btf_get_by_fd(attr->btf_fd);
 		if (IS_ERR(btf)) {
 			err = PTR_ERR(btf);
 			goto free_map;
 		}
+		map->btf = btf;
 
-		err = map_check_btf(map, btf, attr->btf_key_type_id,
-				    attr->btf_value_type_id);
-		if (err) {
-			btf_put(btf);
-			goto free_map;
+		if (attr->btf_value_type_id) {
+			err = map_check_btf(map, btf, attr->btf_key_type_id,
+					    attr->btf_value_type_id);
+			if (err)
+				goto free_map;
 		}
 
-		map->btf = btf;
 		map->btf_key_type_id = attr->btf_key_type_id;
 		map->btf_value_type_id = attr->btf_value_type_id;
-	} else {
-		map->spin_lock_off = -EINVAL;
+		map->btf_vmlinux_value_type_id =
+			attr->btf_vmlinux_value_type_id;
 	}
 
 	err = security_bpf_map_alloc(map);
@@ -888,6 +899,9 @@ static int map_lookup_elem(union bpf_attr *attr)
 	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
 		   map->map_type == BPF_MAP_TYPE_STACK) {
 		err = map->ops->map_peek_elem(map, value);
+	} else if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
+		/* struct_ops map requires directly updating "value" */
+		err = bpf_struct_ops_map_sys_lookup_elem(map, key, value);
 	} else {
 		rcu_read_lock();
 		if (map->ops->map_lookup_elem_sys_only)
@@ -1003,7 +1017,8 @@ static int map_update_elem(union bpf_attr *attr)
 		goto out;
 	} else if (map->map_type == BPF_MAP_TYPE_CPUMAP ||
 		   map->map_type == BPF_MAP_TYPE_SOCKHASH ||
-		   map->map_type == BPF_MAP_TYPE_SOCKMAP) {
+		   map->map_type == BPF_MAP_TYPE_SOCKMAP ||
+		   map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
 		err = map->ops->map_update_elem(map, key, value, attr->flags);
 		goto out;
 	} else if (IS_FD_PROG_ARRAY(map)) {
@@ -1092,7 +1107,9 @@ static int map_delete_elem(union bpf_attr *attr)
 	if (bpf_map_is_dev_bound(map)) {
 		err = bpf_map_offload_delete_elem(map, key);
 		goto out;
-	} else if (IS_FD_PROG_ARRAY(map)) {
+	} else if (IS_FD_PROG_ARRAY(map) ||
+		   map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
+		/* These maps require sleepable context */
 		err = map->ops->map_delete_elem(map, key);
 		goto out;
 	}
@@ -2822,6 +2839,7 @@ static int bpf_map_get_info_by_fd(struct bpf_map *map,
 		info.btf_key_type_id = map->btf_key_type_id;
 		info.btf_value_type_id = map->btf_value_type_id;
 	}
+	info.btf_vmlinux_value_type_id = map->btf_vmlinux_value_type_id;
 
 	if (bpf_map_is_dev_bound(map)) {
 		err = bpf_map_offload_info_fill(&info, map);
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 505f4e4b31d2..79a04417050d 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -160,11 +160,12 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
 	if (fexit_cnt)
 		flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
 
-	err = arch_prepare_bpf_trampoline(new_image, &tr->func.model, flags,
+	err = arch_prepare_bpf_trampoline(new_image, new_image + PAGE_SIZE / 2,
+					  &tr->func.model, flags,
 					  fentry, fentry_cnt,
 					  fexit, fexit_cnt,
 					  tr->func.addr);
-	if (err)
+	if (err < 0)
 		goto out;
 
 	if (tr->selector)
@@ -296,7 +297,8 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start)
 }
 
 int __weak
-arch_prepare_bpf_trampoline(void *image, struct btf_func_model *m, u32 flags,
+arch_prepare_bpf_trampoline(void *image, void *image_end,
+			    const struct btf_func_model *m, u32 flags,
 			    struct bpf_prog **fentry_progs, int fentry_cnt,
 			    struct bpf_prog **fexit_progs, int fexit_cnt,
 			    void *orig_call)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 586ed3c94b80..f5af759a8a5f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8155,6 +8155,11 @@ static int check_map_prog_compatibility(struct bpf_verifier_env *env,
 		return -EINVAL;
 	}
 
+	if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
+		verbose(env, "bpf_struct_ops map cannot be used in prog\n");
+		return -EINVAL;
+	}
+
 	return 0;
 }
 
-- 
cgit v1.2.3
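
Note on the struct_ops map lifecycle in the patch above: the state
handling (INIT -> INUSE -> TOBEFREE) and the reference counting behind
bpf_struct_ops_get()/bpf_struct_ops_put() can be sketched in userspace C.
This is only a minimal model under stated assumptions: C11 atomics stand
in for the kernel's cmpxchg() and refcount_t, updates are assumed to be
serialized by the caller (the patch uses a mutex), and the names toy_map,
toy_update, toy_get, toy_put and toy_delete are hypothetical. Subsystem
registration and the map's own refcount are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum toy_state { TOY_INIT, TOY_INUSE, TOY_TOBEFREE };

struct toy_map {
	_Atomic int state;   /* models kvalue.state */
	atomic_uint refcnt;  /* models kvalue.refcnt */
};

/* Models map_update: only a map still in INIT may be populated.
 * Assumes a single updater; the patch holds st_map->lock here.
 */
static int toy_update(struct toy_map *m)
{
	if (atomic_load(&m->state) != TOY_INIT)
		return -1;   /* the patch returns -EBUSY in this case */
	atomic_store(&m->refcnt, 1);
	/* Release store: readers that observe INUSE also see refcnt. */
	atomic_store_explicit(&m->state, TOY_INUSE, memory_order_release);
	return 0;
}

/* Models bpf_struct_ops_get(): take a reference unless it is already 0. */
static bool toy_get(struct toy_map *m)
{
	unsigned int old = atomic_load(&m->refcnt);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&m->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Models bpf_struct_ops_put(): true when the last reference is dropped. */
static bool toy_put(struct toy_map *m)
{
	unsigned int prev = atomic_fetch_sub(&m->refcnt, 1);

	assert(prev > 0);
	return prev == 1;
}

/* Models map_delete: only the caller that wins the INUSE -> TOBEFREE
 * transition drops the initial reference. Returns true for that caller.
 */
static bool toy_delete(struct toy_map *m)
{
	int expected = TOY_INUSE;

	if (!atomic_compare_exchange_strong(&m->state, &expected,
					    TOY_TOBEFREE))
		return false;
	toy_put(m);
	return true;
}
```

The compare-and-exchange in toy_delete() is what makes a repeated delete
harmless: a second caller sees TOBEFREE and does nothing, so the initial
reference is dropped exactly once.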


From 206057fe020ac5c037d5e2dd6562a9bd216ec765 Mon Sep 17 00:00:00 2001
From: Martin KaFai Lau <kafai@fb.com>
Date: Wed, 8 Jan 2020 16:45:51 -0800
Subject: bpf: Add BPF_FUNC_tcp_send_ack helper

Add a helper to send out a TCP ACK.  It will be used by the later
bpf_dctcp implementation, which needs to send out an ACK when the
CE state changes.
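
For context, the situation the helper targets can be sketched in plain C.
This is a simplified userspace model and not the bpf_dctcp code: it
assumes the DCTCP-style rule that a delayed ACK must be flushed with the
previous rcv_nxt before a new CE state is reported. The names toy_sock,
toy_send_ack() and toy_ce_update() are hypothetical, and toy_send_ack()
only records the sequence number at the point where a BPF program would
call bpf_tcp_send_ack(tp, rcv_nxt).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct toy_sock {
	uint32_t rcv_nxt;        /* next sequence number expected */
	uint32_t prior_rcv_nxt;  /* rcv_nxt at the previous update */
	bool ce_state;           /* last observed CE mark */
	bool ack_pending;        /* a delayed ACK is outstanding */
	uint32_t last_ack_seq;   /* sequence recorded by toy_send_ack() */
	unsigned int acks_sent;  /* number of ACKs recorded so far */
};

/* Stand-in for the helper: record the ACK instead of sending it. */
static void toy_send_ack(struct toy_sock *sk, uint32_t rcv_nxt)
{
	/* In this model an ACK is only forced while one is delayed. */
	assert(sk->ack_pending);
	sk->last_ack_seq = rcv_nxt;
	sk->acks_sent++;
	sk->ack_pending = false;
}

/* Called for every incoming segment with its CE mark. */
static void toy_ce_update(struct toy_sock *sk, bool ce_now)
{
	/* CE state changed: flush the delayed ACK for the old state first,
	 * acknowledging only data received before the change.
	 */
	if (sk->ce_state != ce_now && sk->ack_pending)
		toy_send_ack(sk, sk->prior_rcv_nxt);

	sk->prior_rcv_nxt = sk->rcv_nxt;
	sk->ce_state = ce_now;
}
```

The second argument matters in this model: the flushed ACK carries the
rcv_nxt saved before the CE change, not the current one, which is why the
helper takes an explicit sequence number.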

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20200109004551.3900448-1-kafai@fb.com
---
 include/uapi/linux/bpf.h | 11 ++++++++++-
 net/ipv4/bpf_tcp_ca.c    | 24 +++++++++++++++++++++++-
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 38059880963e..2d6a2e572f56 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2837,6 +2837,14 @@ union bpf_attr {
  * 	Return
  * 		On success, the strictly positive length of the string,	including
  * 		the trailing NUL character. On error, a negative value.
+ *
+ * int bpf_tcp_send_ack(void *tp, u32 rcv_nxt)
+ *	Description
+ *		Send out a tcp-ack. *tp* is the in-kernel struct tcp_sock.
+ *		*rcv_nxt* is the ack_seq to be sent out.
+ *	Return
+ *		0 on success, or a negative error in case of failure.
+ *
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2954,7 +2962,8 @@ union bpf_attr {
 	FN(probe_read_user),		\
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
-	FN(probe_read_kernel_str),
+	FN(probe_read_kernel_str),	\
+	FN(tcp_send_ack),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 9c7745b958d7..574972bc7299 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -143,11 +143,33 @@ static int bpf_tcp_ca_btf_struct_access(struct bpf_verifier_log *log,
 	return NOT_INIT;
 }
 
+BPF_CALL_2(bpf_tcp_send_ack, struct tcp_sock *, tp, u32, rcv_nxt)
+{
+	/* bpf_tcp_ca prog cannot have NULL tp */
+	__tcp_send_ack((struct sock *)tp, rcv_nxt);
+	return 0;
+}
+
+static const struct bpf_func_proto bpf_tcp_send_ack_proto = {
+	.func		= bpf_tcp_send_ack,
+	.gpl_only	= false,
+	/* In case we want to report error later */
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_BTF_ID,
+	.arg2_type	= ARG_ANYTHING,
+	.btf_id		= &tcp_sock_id,
+};
+
 static const struct bpf_func_proto *
 bpf_tcp_ca_get_func_proto(enum bpf_func_id func_id,
 			  const struct bpf_prog *prog)
 {
-	return bpf_base_func_proto(func_id);
+	switch (func_id) {
+	case BPF_FUNC_tcp_send_ack:
+		return &bpf_tcp_send_ack_proto;
+	default:
+		return bpf_base_func_proto(func_id);
+	}
 }
 
 static const struct bpf_verifier_ops bpf_tcp_ca_verifier_ops = {
-- 
cgit v1.2.3


From f5bfcd953d811dbb8913de36b96b38da6bb62135 Mon Sep 17 00:00:00 2001
From: Andrey Ignatov <rdna@fb.com>
Date: Tue, 7 Jan 2020 17:40:06 -0800
Subject: bpf: Document BPF_F_QUERY_EFFECTIVE flag

Document the BPF_F_QUERY_EFFECTIVE flag, mostly to clarify how it
affects attach_flags, which may not be obvious and may lead to confusion.

Specifically, attach_flags is returned only for target_fd. If programs
are inherited from an ancestor cgroup, the attach_flags returned for the
current cgroup may be confusing. For example, two effective programs of
the same attach_type can be returned without BPF_F_ALLOW_MULTI in
attach_flags.

Simple repro:
  # bpftool c s /sys/fs/cgroup/path/to/task
  ID       AttachType      AttachFlags     Name
  # bpftool c s /sys/fs/cgroup/path/to/task effective
  ID       AttachType      AttachFlags     Name
  95043    ingress                         tw_ipt_ingress
  95048    ingress                         tw_ingress
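
The behaviour in the repro can be illustrated with a toy model in C. It
is a sketch under assumptions, not the kernel's query path: toy_cgroup
and toy_query() are hypothetical, the TOY_F_* constants only mirror the
UAPI flag bits, override semantics are ignored, and the order of the
returned IDs is arbitrary. The point is that the program list may include
inherited programs while attach_flags always describes the queried
cgroup alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_F_QUERY_EFFECTIVE (1U << 0)  /* mirrors BPF_F_QUERY_EFFECTIVE */
#define TOY_F_ALLOW_MULTI     (1U << 1)  /* mirrors BPF_F_ALLOW_MULTI */
#define TOY_MAX_PROGS 4

struct toy_cgroup {
	const struct toy_cgroup *parent;   /* NULL for the root */
	uint32_t attach_flags;             /* flags of direct attachments */
	uint32_t prog_ids[TOY_MAX_PROGS];  /* directly attached programs */
	size_t nr_progs;
};

/* Writes up to cap program IDs into ids[] and returns how many. */
static size_t toy_query(const struct toy_cgroup *cg, uint32_t query_flags,
			uint32_t *ids, size_t cap, uint32_t *attach_flags)
{
	const struct toy_cgroup *p;
	size_t n = 0, i;

	assert(cg && attach_flags);

	/* attach_flags never reflect ancestors, whatever the query flags. */
	*attach_flags = cg->attach_flags;

	for (p = cg; p; p = p->parent) {
		for (i = 0; i < p->nr_progs && n < cap; i++)
			ids[n++] = p->prog_ids[i];
		if (!(query_flags & TOY_F_QUERY_EFFECTIVE))
			break;   /* directly attached programs only */
	}
	return n;
}
```

With a parent that holds two programs under TOY_F_ALLOW_MULTI and a child
with no direct attachments, an effective query on the child returns both
IDs together with attach_flags of 0, matching the empty AttachFlags
column in the repro.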

Signed-off-by: Andrey Ignatov <rdna@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200108014006.938363-1-rdna@fb.com
---
 include/uapi/linux/bpf.h       | 7 ++++++-
 tools/include/uapi/linux/bpf.h | 7 ++++++-
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2d6a2e572f56..52966e758fe5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -359,7 +359,12 @@ enum bpf_attach_type {
 /* Enable memory-mapping BPF map */
 #define BPF_F_MMAPABLE		(1U << 10)
 
-/* flags for BPF_PROG_QUERY */
+/* Flags for BPF_PROG_QUERY. */
+
+/* Query effective (directly attached + inherited from ancestor cgroups)
+ * programs that will be executed for events within a cgroup.
+ * attach_flags with this flag are returned only for directly attached programs.
+ */
 #define BPF_F_QUERY_EFFECTIVE	(1U << 0)
 
 enum bpf_stack_build_id_status {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 2d6a2e572f56..52966e758fe5 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -359,7 +359,12 @@ enum bpf_attach_type {
 /* Enable memory-mapping BPF map */
 #define BPF_F_MMAPABLE		(1U << 10)
 
-/* flags for BPF_PROG_QUERY */
+/* Flags for BPF_PROG_QUERY. */
+
+/* Query effective (directly attached + inherited from ancestor cgroups)
+ * programs that will be executed for events within a cgroup.
+ * attach_flags with this flag are returned only for directly attached programs.
+ */
 #define BPF_F_QUERY_EFFECTIVE	(1U << 0)
 
 enum bpf_stack_build_id_status {
-- 
cgit v1.2.3


From faf391c3826cd29feae02078ca2022d2f912f7cc Mon Sep 17 00:00:00 2001
From: Mat Martineau <mathew.j.martineau@linux.intel.com>
Date: Thu, 9 Jan 2020 07:59:16 -0800
Subject: tcp: Define IPPROTO_MPTCP

To open an MPTCP socket with socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP),
IPPROTO_MPTCP needs a value that differs from IPPROTO_TCP. The existing
IPPROTO numbers mostly map directly to IANA-specified protocol numbers.
MPTCP does not have a protocol number allocated because MPTCP packets
use the TCP protocol number. Use a private number that is not used OTA.
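
A minimal userspace sketch of the socket() call described above. The helper
name open_stream_socket() and the fallback to plain TCP are assumptions made
for illustration; this patch only defines the constant.

```c
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Older libc headers may not define IPPROTO_MPTCP yet; 262 is the value
 * added to include/uapi/linux/in.h by this patch.
 */
#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

/* Hypothetical helper: try MPTCP first and fall back to plain TCP when the
 * kernel does not support it (an illustrative policy, not something this
 * patch mandates). Returns a socket fd, or -1 with errno set.
 */
static int open_stream_socket(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);

	if (fd < 0 && (errno == EPROTONOSUPPORT || errno == EINVAL ||
		       errno == ENOPROTOOPT))
		fd = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
	return fd;
}
```

Because the value lies outside the 8-bit protocol field of the IP header, it
can only ever select a socket type and cannot collide with a protocol number
carried in a packet.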

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/trace/events/sock.h   | 3 ++-
 include/uapi/linux/in.h       | 2 ++
 tools/include/uapi/linux/in.h | 2 ++
 3 files changed, 6 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/include/trace/events/sock.h b/include/trace/events/sock.h
index 3ff12b90048d..a966d4b5ab37 100644
--- a/include/trace/events/sock.h
+++ b/include/trace/events/sock.h
@@ -19,7 +19,8 @@
 #define inet_protocol_names		\
 		EM(IPPROTO_TCP)			\
 		EM(IPPROTO_DCCP)		\
-		EMe(IPPROTO_SCTP)
+		EM(IPPROTO_SCTP)		\
+		EMe(IPPROTO_MPTCP)
 
 #define tcp_state_names			\
 		EM(TCP_ESTABLISHED)		\
diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index e7ad9d350a28..1521073b6348 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -76,6 +76,8 @@ enum {
 #define IPPROTO_MPLS		IPPROTO_MPLS
   IPPROTO_RAW = 255,		/* Raw IP packets			*/
 #define IPPROTO_RAW		IPPROTO_RAW
+  IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
+#define IPPROTO_MPTCP		IPPROTO_MPTCP
   IPPROTO_MAX
 };
 #endif
diff --git a/tools/include/uapi/linux/in.h b/tools/include/uapi/linux/in.h
index e7ad9d350a28..1521073b6348 100644
--- a/tools/include/uapi/linux/in.h
+++ b/tools/include/uapi/linux/in.h
@@ -76,6 +76,8 @@ enum {
 #define IPPROTO_MPLS		IPPROTO_MPLS
   IPPROTO_RAW = 255,		/* Raw IP packets			*/
 #define IPPROTO_RAW		IPPROTO_RAW
+  IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
+#define IPPROTO_MPTCP		IPPROTO_MPTCP
   IPPROTO_MAX
 };
 #endif
-- 
cgit v1.2.3


From 51c39bb1d5d105a02e29aa7960f0a395086e6342 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast@kernel.org>
Date: Thu, 9 Jan 2020 22:41:20 -0800
Subject: bpf: Introduce function-by-function verification

New llvm and old llvm with libbpf help produce BTF that distinguishes global and
static functions. Unlike the arguments of a static function, the arguments of
global functions cannot be removed or optimized away by llvm. The compiler has
to use exactly the arguments specified in a function prototype. The argument
type information allows the verifier to validate each global function
independently. For now the only supported argument types are pointer to context
and scalars. In the future pointers to structures, sizes, and pointers to packet
data can be supported as well. Consider the following example:

static int f1(int ...)
{
  ...
}

int f3(int b);

int f2(int a)
{
  f1(a) + f3(a);
}

int f3(int b)
{
  ...
}

int main(...)
{
  f1(...) + f2(...) + f3(...);
}

The verifier will start its safety checks from the first global function f2().
It will recursively descend into f1() because it's static. Then it will check
that arguments match for the f3() invocation inside f2(). It will not descend
into f3(). It will finish f2() that has to be successfully verified for all
possible values of 'a'. Then it will proceed with f3(). That function also has
to be safe for all possible values of 'b'. Then it will start subprog 0 (which
is the main() function). It will recursively descend into f1() and will skip full
check of f2() and f3(), since they are global. The order of processing global
functions doesn't affect safety, since all global functions must be proven safe
based on their arguments only.
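
The order of independent verification passes described above can be modelled
with a small standalone sketch. It is not kernel code: struct subprog,
enum func_linkage and verification_order() are hypothetical names, and the
subprog indices used with it are assumed rather than taken from a real
program layout.

```c
#include <stddef.h>

/* Mirrors the values of BTF_FUNC_STATIC and BTF_FUNC_GLOBAL. */
enum func_linkage { FUNC_STATIC = 0, FUNC_GLOBAL = 1 };

struct subprog {
	const char *name;
	enum func_linkage linkage;
};

/* Fill order[] with the indices of subprogs that get their own verification
 * pass: every global subprog with index >= 1 first, then subprog 0 (main).
 * Static subprogs get no pass of their own; they are only checked while
 * descending from a caller. Returns the number of passes.
 */
static size_t verification_order(const struct subprog *progs, size_t cnt,
				 size_t *order)
{
	size_t i, n = 0;

	for (i = 1; i < cnt; i++)
		if (progs[i].linkage == FUNC_GLOBAL)
			order[n++] = i;
	if (cnt > 0)
		order[n++] = 0;
	return n;
}
```

For the example above, assuming the layout main, f1, f2, f3, this yields
three passes: f2(), f3(), and finally main().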

Such function-by-function verification can drastically improve the speed of
verification and reduce its complexity.

Note that the stack limit of 512 bytes still applies to the call chain regardless
of whether functions are static or global. The nesting level of 8 also still
applies. The same recursion prevention checks are in place as well.

The type information and static/global kind is preserved after the verification,
hence in the above example the global functions f2() and f3() can be replaced
later by equivalent functions with the same types that are loaded and verified
later without affecting the safety of this main() program. Such replacement
(re-linking) of global functions is a subject of future patches.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20200110064124.1760511-3-ast@kernel.org
---
 include/linux/bpf.h          |   7 +-
 include/linux/bpf_verifier.h |  10 +-
 include/uapi/linux/btf.h     |   6 ++
 kernel/bpf/btf.c             | 175 ++++++++++++++++++++++++------
 kernel/bpf/verifier.c        | 252 ++++++++++++++++++++++++++++++++++---------
 5 files changed, 366 insertions(+), 84 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a7bfe8a388c6..aed2bc39d72b 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -566,6 +566,7 @@ static inline void bpf_dispatcher_change_prog(struct bpf_dispatcher *d,
 #endif
 
 struct bpf_func_info_aux {
+	u16 linkage;
 	bool unreliable;
 };
 
@@ -1081,7 +1082,11 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 			   const char *func_name,
 			   struct btf_func_model *m);
 
-int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog);
+struct bpf_reg_state;
+int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
+			     struct bpf_reg_state *regs);
+int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
+			  struct bpf_reg_state *reg);
 
 struct bpf_prog *bpf_prog_by_id(u32 id);
 
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 26e40de9ef55..5406e6e96585 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -304,11 +304,13 @@ struct bpf_insn_aux_data {
 	u64 map_key_state; /* constant (32 bit) key tracking for maps */
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
 	int sanitize_stack_off; /* stack slot to be cleared */
-	bool seen; /* this insn was processed by the verifier */
+	u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
 	bool zext_dst; /* this insn zero extends dst reg */
 	u8 alu_state; /* used in combination with alu_limit */
-	bool prune_point;
+
+	/* below fields are initialized once */
 	unsigned int orig_idx; /* original instruction index */
+	bool prune_point;
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
@@ -379,6 +381,7 @@ struct bpf_verifier_env {
 		int *insn_stack;
 		int cur_stack;
 	} cfg;
+	u32 pass_cnt; /* number of times do_check() was called */
 	u32 subprog_cnt;
 	/* number of instructions analyzed by the verifier */
 	u32 prev_insn_processed, insn_processed;
@@ -428,4 +431,7 @@ bpf_prog_offload_replace_insn(struct bpf_verifier_env *env, u32 off,
 void
 bpf_prog_offload_remove_insns(struct bpf_verifier_env *env, u32 off, u32 cnt);
 
+int check_ctx_reg(struct bpf_verifier_env *env,
+		  const struct bpf_reg_state *reg, int regno);
+
 #endif /* _LINUX_BPF_VERIFIER_H */
diff --git a/include/uapi/linux/btf.h b/include/uapi/linux/btf.h
index 1a2898c482ee..5a667107ad2c 100644
--- a/include/uapi/linux/btf.h
+++ b/include/uapi/linux/btf.h
@@ -146,6 +146,12 @@ enum {
 	BTF_VAR_GLOBAL_EXTERN = 2,
 };
 
+enum btf_func_linkage {
+	BTF_FUNC_STATIC = 0,
+	BTF_FUNC_GLOBAL = 1,
+	BTF_FUNC_EXTERN = 2,
+};
+
 /* BTF_KIND_VAR is followed by a single "struct btf_var" to describe
  * additional information related to the variable such as its linkage.
  */
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 81d9cf75cacd..832b5d7fd892 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -2651,8 +2651,8 @@ static s32 btf_func_check_meta(struct btf_verifier_env *env,
 		return -EINVAL;
 	}
 
-	if (btf_type_vlen(t)) {
-		btf_verifier_log_type(env, t, "vlen != 0");
+	if (btf_type_vlen(t) > BTF_FUNC_GLOBAL) {
+		btf_verifier_log_type(env, t, "Invalid func linkage");
 		return -EINVAL;
 	}
 
@@ -3506,7 +3506,8 @@ static u8 bpf_ctx_convert_map[] = {
 
 static const struct btf_member *
 btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,
-		      const struct btf_type *t, enum bpf_prog_type prog_type)
+		      const struct btf_type *t, enum bpf_prog_type prog_type,
+		      int arg)
 {
 	const struct btf_type *conv_struct;
 	const struct btf_type *ctx_struct;
@@ -3527,12 +3528,13 @@ btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,
 		 * is not supported yet.
 		 * BPF_PROG_TYPE_RAW_TRACEPOINT is fine.
 		 */
-		bpf_log(log, "BPF program ctx type is not a struct\n");
+		if (log->level & BPF_LOG_LEVEL)
+			bpf_log(log, "arg#%d type is not a struct\n", arg);
 		return NULL;
 	}
 	tname = btf_name_by_offset(btf, t->name_off);
 	if (!tname) {
-		bpf_log(log, "BPF program ctx struct doesn't have a name\n");
+		bpf_log(log, "arg#%d struct doesn't have a name\n", arg);
 		return NULL;
 	}
 	/* prog_type is valid bpf program type. No need for bounds check. */
@@ -3565,11 +3567,12 @@ btf_get_prog_ctx_type(struct bpf_verifier_log *log, struct btf *btf,
 static int btf_translate_to_vmlinux(struct bpf_verifier_log *log,
 				     struct btf *btf,
 				     const struct btf_type *t,
-				     enum bpf_prog_type prog_type)
+				     enum bpf_prog_type prog_type,
+				     int arg)
 {
 	const struct btf_member *prog_ctx_type, *kern_ctx_type;
 
-	prog_ctx_type = btf_get_prog_ctx_type(log, btf, t, prog_type);
+	prog_ctx_type = btf_get_prog_ctx_type(log, btf, t, prog_type, arg);
 	if (!prog_ctx_type)
 		return -ENOENT;
 	kern_ctx_type = prog_ctx_type + 1;
@@ -3731,7 +3734,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 	info->reg_type = PTR_TO_BTF_ID;
 
 	if (tgt_prog) {
-		ret = btf_translate_to_vmlinux(log, btf, t, tgt_prog->type);
+		ret = btf_translate_to_vmlinux(log, btf, t, tgt_prog->type, arg);
 		if (ret > 0) {
 			info->btf_id = ret;
 			return true;
@@ -4112,11 +4115,16 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 	return 0;
 }
 
-int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog)
+/* Compare BTF of a function with given bpf_reg_state.
+ * Returns:
+ * EFAULT - there is a verifier bug. Abort verification.
+ * EINVAL - there is a type mismatch or BTF is not available.
+ * 0 - BTF matches with what bpf_reg_state expects.
+ * Only PTR_TO_CTX and SCALAR_VALUE states are recognized.
+ */
+int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
+			     struct bpf_reg_state *reg)
 {
-	struct bpf_verifier_state *st = env->cur_state;
-	struct bpf_func_state *func = st->frame[st->curframe];
-	struct bpf_reg_state *reg = func->regs;
 	struct bpf_verifier_log *log = &env->log;
 	struct bpf_prog *prog = env->prog;
 	struct btf *btf = prog->aux->btf;
@@ -4126,27 +4134,30 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog)
 	const char *tname;
 
 	if (!prog->aux->func_info)
-		return 0;
+		return -EINVAL;
 
 	btf_id = prog->aux->func_info[subprog].type_id;
 	if (!btf_id)
-		return 0;
+		return -EFAULT;
 
 	if (prog->aux->func_info_aux[subprog].unreliable)
-		return 0;
+		return -EINVAL;
 
 	t = btf_type_by_id(btf, btf_id);
 	if (!t || !btf_type_is_func(t)) {
-		bpf_log(log, "BTF of subprog %d doesn't point to KIND_FUNC\n",
+		/* These checks were already done by the verifier while loading
+		 * struct bpf_func_info
+		 */
+		bpf_log(log, "BTF of func#%d doesn't point to KIND_FUNC\n",
 			subprog);
-		return -EINVAL;
+		return -EFAULT;
 	}
 	tname = btf_name_by_offset(btf, t->name_off);
 
 	t = btf_type_by_id(btf, t->type);
 	if (!t || !btf_type_is_func_proto(t)) {
-		bpf_log(log, "Invalid type of func %s\n", tname);
-		return -EINVAL;
+		bpf_log(log, "Invalid BTF of func %s\n", tname);
+		return -EFAULT;
 	}
 	args = (const struct btf_param *)(t + 1);
 	nargs = btf_type_vlen(t);
@@ -4172,25 +4183,127 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog)
 				bpf_log(log, "R%d is not a pointer\n", i + 1);
 				goto out;
 			}
-			/* If program is passing PTR_TO_CTX into subprogram
-			 * check that BTF type matches.
+			/* If function expects ctx type in BTF check that caller
+			 * is passing PTR_TO_CTX.
 			 */
-			if (reg[i + 1].type == PTR_TO_CTX &&
-			    !btf_get_prog_ctx_type(log, btf, t, prog->type))
-				goto out;
-			/* All other pointers are ok */
-			continue;
+			if (btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {
+				if (reg[i + 1].type != PTR_TO_CTX) {
+					bpf_log(log,
+						"arg#%d expected pointer to ctx, but got %s\n",
+						i, btf_kind_str[BTF_INFO_KIND(t->info)]);
+					goto out;
+				}
+				if (check_ctx_reg(env, &reg[i + 1], i + 1))
+					goto out;
+				continue;
+			}
 		}
-		bpf_log(log, "Unrecognized argument type %s\n",
-			btf_kind_str[BTF_INFO_KIND(t->info)]);
+		bpf_log(log, "Unrecognized arg#%d type %s\n",
+			i, btf_kind_str[BTF_INFO_KIND(t->info)]);
 		goto out;
 	}
 	return 0;
 out:
-	/* LLVM optimizations can remove arguments from static functions. */
-	bpf_log(log,
-		"Type info disagrees with actual arguments due to compiler optimizations\n");
+	/* Compiler optimizations can remove arguments from static functions
+	 * or mismatched type can be passed into a global function.
+	 * In such cases mark the function as unreliable from BTF point of view.
+	 */
 	prog->aux->func_info_aux[subprog].unreliable = true;
+	return -EINVAL;
+}
+
+/* Convert BTF of a function into bpf_reg_state if possible
+ * Returns:
+ * EFAULT - there is a verifier bug. Abort verification.
+ * EINVAL - cannot convert BTF.
+ * 0 - Successfully converted BTF into bpf_reg_state
+ * (either PTR_TO_CTX or SCALAR_VALUE).
+ */
+int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
+			  struct bpf_reg_state *reg)
+{
+	struct bpf_verifier_log *log = &env->log;
+	struct bpf_prog *prog = env->prog;
+	struct btf *btf = prog->aux->btf;
+	const struct btf_param *args;
+	const struct btf_type *t;
+	u32 i, nargs, btf_id;
+	const char *tname;
+
+	if (!prog->aux->func_info ||
+	    prog->aux->func_info_aux[subprog].linkage != BTF_FUNC_GLOBAL) {
+		bpf_log(log, "Verifier bug\n");
+		return -EFAULT;
+	}
+
+	btf_id = prog->aux->func_info[subprog].type_id;
+	if (!btf_id) {
+		bpf_log(log, "Global functions need valid BTF\n");
+		return -EFAULT;
+	}
+
+	t = btf_type_by_id(btf, btf_id);
+	if (!t || !btf_type_is_func(t)) {
+		/* These checks were already done by the verifier while loading
+		 * struct bpf_func_info
+		 */
+		bpf_log(log, "BTF of func#%d doesn't point to KIND_FUNC\n",
+			subprog);
+		return -EFAULT;
+	}
+	tname = btf_name_by_offset(btf, t->name_off);
+
+	if (log->level & BPF_LOG_LEVEL)
+		bpf_log(log, "Validating %s() func#%d...\n",
+			tname, subprog);
+
+	if (prog->aux->func_info_aux[subprog].unreliable) {
+		bpf_log(log, "Verifier bug in function %s()\n", tname);
+		return -EFAULT;
+	}
+
+	t = btf_type_by_id(btf, t->type);
+	if (!t || !btf_type_is_func_proto(t)) {
+		bpf_log(log, "Invalid type of function %s()\n", tname);
+		return -EFAULT;
+	}
+	args = (const struct btf_param *)(t + 1);
+	nargs = btf_type_vlen(t);
+	if (nargs > 5) {
+		bpf_log(log, "Global function %s() with %d > 5 args. Buggy compiler.\n",
+			tname, nargs);
+		return -EINVAL;
+	}
+	/* check that function returns int */
+	t = btf_type_by_id(btf, t->type);
+	while (btf_type_is_modifier(t))
+		t = btf_type_by_id(btf, t->type);
+	if (!btf_type_is_int(t) && !btf_type_is_enum(t)) {
+		bpf_log(log,
+			"Global function %s() doesn't return scalar. Only those are supported.\n",
+			tname);
+		return -EINVAL;
+	}
+	/* Convert BTF function arguments into verifier types.
+	 * Only PTR_TO_CTX and SCALAR are supported atm.
+	 */
+	for (i = 0; i < nargs; i++) {
+		t = btf_type_by_id(btf, args[i].type);
+		while (btf_type_is_modifier(t))
+			t = btf_type_by_id(btf, t->type);
+		if (btf_type_is_int(t) || btf_type_is_enum(t)) {
+			reg[i + 1].type = SCALAR_VALUE;
+			continue;
+		}
+		if (btf_type_is_ptr(t) &&
+		    btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {
+			reg[i + 1].type = PTR_TO_CTX;
+			continue;
+		}
+		bpf_log(log, "Arg#%d type %s in %s() is not supported yet.\n",
+			i, btf_kind_str[BTF_INFO_KIND(t->info)], tname);
+		return -EINVAL;
+	}
 	return 0;
 }
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index f5af759a8a5f..ca17dccc17ba 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -1122,10 +1122,6 @@ static void init_reg_state(struct bpf_verifier_env *env,
 	regs[BPF_REG_FP].type = PTR_TO_STACK;
 	mark_reg_known_zero(env, regs, BPF_REG_FP);
 	regs[BPF_REG_FP].frameno = state->frameno;
-
-	/* 1st arg to a function */
-	regs[BPF_REG_1].type = PTR_TO_CTX;
-	mark_reg_known_zero(env, regs, BPF_REG_1);
 }
 
 #define BPF_MAIN_FUNC (-1)
@@ -2739,8 +2735,8 @@ static int get_callee_stack_depth(struct bpf_verifier_env *env,
 }
 #endif
 
-static int check_ctx_reg(struct bpf_verifier_env *env,
-			 const struct bpf_reg_state *reg, int regno)
+int check_ctx_reg(struct bpf_verifier_env *env,
+		  const struct bpf_reg_state *reg, int regno)
 {
 	/* Access to ctx or passing it to a helper is only allowed in
 	 * its original, unmodified form.
@@ -3956,12 +3952,26 @@ static int release_reference(struct bpf_verifier_env *env,
 	return 0;
 }
 
+static void clear_caller_saved_regs(struct bpf_verifier_env *env,
+				    struct bpf_reg_state *regs)
+{
+	int i;
+
+	/* after the call registers r0 - r5 were scratched */
+	for (i = 0; i < CALLER_SAVED_REGS; i++) {
+		mark_reg_not_init(env, regs, caller_saved[i]);
+		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
+	}
+}
+
 static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			   int *insn_idx)
 {
 	struct bpf_verifier_state *state = env->cur_state;
+	struct bpf_func_info_aux *func_info_aux;
 	struct bpf_func_state *caller, *callee;
 	int i, err, subprog, target_insn;
+	bool is_global = false;
 
 	if (state->curframe + 1 >= MAX_CALL_FRAMES) {
 		verbose(env, "the call stack of %d frames is too deep\n",
@@ -3984,6 +3994,32 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 		return -EFAULT;
 	}
 
+	func_info_aux = env->prog->aux->func_info_aux;
+	if (func_info_aux)
+		is_global = func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;
+	err = btf_check_func_arg_match(env, subprog, caller->regs);
+	if (err == -EFAULT)
+		return err;
+	if (is_global) {
+		if (err) {
+			verbose(env, "Caller passes invalid args into func#%d\n",
+				subprog);
+			return err;
+		} else {
+			if (env->log.level & BPF_LOG_LEVEL)
+				verbose(env,
+					"Func#%d is global and valid. Skipping.\n",
+					subprog);
+			clear_caller_saved_regs(env, caller->regs);
+
+			/* All global functions return SCALAR_VALUE */
+			mark_reg_unknown(env, caller->regs, BPF_REG_0);
+
+			/* continue with next insn after call */
+			return 0;
+		}
+	}
+
 	callee = kzalloc(sizeof(*callee), GFP_KERNEL);
 	if (!callee)
 		return -ENOMEM;
@@ -4010,18 +4046,11 @@ static int check_func_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 	for (i = BPF_REG_1; i <= BPF_REG_5; i++)
 		callee->regs[i] = caller->regs[i];
 
-	/* after the call registers r0 - r5 were scratched */
-	for (i = 0; i < CALLER_SAVED_REGS; i++) {
-		mark_reg_not_init(env, caller->regs, caller_saved[i]);
-		check_reg_arg(env, caller_saved[i], DST_OP_NO_MARK);
-	}
+	clear_caller_saved_regs(env, caller->regs);
 
 	/* only increment it after check_reg_arg() finished */
 	state->curframe++;
 
-	if (btf_check_func_arg_match(env, subprog))
-		return -EINVAL;
-
 	/* and go analyze first insn of the callee */
 	*insn_idx = target_insn;
 
@@ -6771,12 +6800,13 @@ static int check_btf_func(struct bpf_verifier_env *env,
 
 		/* check type_id */
 		type = btf_type_by_id(btf, krecord[i].type_id);
-		if (!type || BTF_INFO_KIND(type->info) != BTF_KIND_FUNC) {
+		if (!type || !btf_type_is_func(type)) {
 			verbose(env, "invalid type id %d in func info",
 				krecord[i].type_id);
 			ret = -EINVAL;
 			goto err_free;
 		}
+		info_aux[i].linkage = BTF_INFO_VLEN(type->info);
 		prev_offset = krecord[i].insn_off;
 		urecord += urec_size;
 	}
@@ -7756,35 +7786,13 @@ static bool reg_type_mismatch(enum bpf_reg_type src, enum bpf_reg_type prev)
 
 static int do_check(struct bpf_verifier_env *env)
 {
-	struct bpf_verifier_state *state;
+	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_insn *insns = env->prog->insnsi;
 	struct bpf_reg_state *regs;
 	int insn_cnt = env->prog->len;
 	bool do_print_state = false;
 	int prev_insn_idx = -1;
 
-	env->prev_linfo = NULL;
-
-	state = kzalloc(sizeof(struct bpf_verifier_state), GFP_KERNEL);
-	if (!state)
-		return -ENOMEM;
-	state->curframe = 0;
-	state->speculative = false;
-	state->branches = 1;
-	state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL);
-	if (!state->frame[0]) {
-		kfree(state);
-		return -ENOMEM;
-	}
-	env->cur_state = state;
-	init_func_state(env, state->frame[0],
-			BPF_MAIN_FUNC /* callsite */,
-			0 /* frameno */,
-			0 /* subprogno, zero == main subprog */);
-
-	if (btf_check_func_arg_match(env, 0))
-		return -EINVAL;
-
 	for (;;) {
 		struct bpf_insn *insn;
 		u8 class;
@@ -7862,7 +7870,7 @@ static int do_check(struct bpf_verifier_env *env)
 		}
 
 		regs = cur_regs(env);
-		env->insn_aux_data[env->insn_idx].seen = true;
+		env->insn_aux_data[env->insn_idx].seen = env->pass_cnt;
 		prev_insn_idx = env->insn_idx;
 
 		if (class == BPF_ALU || class == BPF_ALU64) {
@@ -8082,7 +8090,7 @@ process_bpf_exit:
 					return err;
 
 				env->insn_idx++;
-				env->insn_aux_data[env->insn_idx].seen = true;
+				env->insn_aux_data[env->insn_idx].seen = env->pass_cnt;
 			} else {
 				verbose(env, "invalid BPF_LD mode\n");
 				return -EINVAL;
@@ -8095,7 +8103,6 @@ process_bpf_exit:
 		env->insn_idx++;
 	}
 
-	env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
 	return 0;
 }
 
@@ -8372,7 +8379,7 @@ static int adjust_insn_aux_data(struct bpf_verifier_env *env,
 	memcpy(new_data + off + cnt - 1, old_data + off,
 	       sizeof(struct bpf_insn_aux_data) * (prog_len - off - cnt + 1));
 	for (i = off; i < off + cnt - 1; i++) {
-		new_data[i].seen = true;
+		new_data[i].seen = env->pass_cnt;
 		new_data[i].zext_dst = insn_has_def32(env, insn + i);
 	}
 	env->insn_aux_data = new_data;
@@ -9484,6 +9491,7 @@ static void free_states(struct bpf_verifier_env *env)
 		kfree(sl);
 		sl = sln;
 	}
+	env->free_list = NULL;
 
 	if (!env->explored_states)
 		return;
@@ -9497,11 +9505,159 @@ static void free_states(struct bpf_verifier_env *env)
 			kfree(sl);
 			sl = sln;
 		}
+		env->explored_states[i] = NULL;
 	}
+}
 
-	kvfree(env->explored_states);
+/* The verifier is using insn_aux_data[] to store temporary data during
+ * verification and to store information for passes that run after the
+ * verification like dead code sanitization. do_check_common() for subprogram N
+ * may analyze many other subprograms. sanitize_insn_aux_data() clears all
+ * temporary data after do_check_common() finds that subprogram N cannot be
+ * verified independently. pass_cnt counts the number of times
+ * do_check_common() was run and insn->aux->seen tells the pass number
+ * insn_aux_data was touched. These variables are compared to clear temporary
+ * data from failed pass. For testing and experiments do_check_common() can be
+ * run multiple times even when prior attempt to verify is unsuccessful.
+ */
+static void sanitize_insn_aux_data(struct bpf_verifier_env *env)
+{
+	struct bpf_insn *insn = env->prog->insnsi;
+	struct bpf_insn_aux_data *aux;
+	int i, class;
+
+	for (i = 0; i < env->prog->len; i++) {
+		class = BPF_CLASS(insn[i].code);
+		if (class != BPF_LDX && class != BPF_STX)
+			continue;
+		aux = &env->insn_aux_data[i];
+		if (aux->seen != env->pass_cnt)
+			continue;
+		memset(aux, 0, offsetof(typeof(*aux), orig_idx));
+	}
 }
 
+static int do_check_common(struct bpf_verifier_env *env, int subprog)
+{
+	struct bpf_verifier_state *state;
+	struct bpf_reg_state *regs;
+	int ret, i;
+
+	env->prev_linfo = NULL;
+	env->pass_cnt++;
+
+	state = kzalloc(sizeof(struct bpf_verifier_state), GFP_KERNEL);
+	if (!state)
+		return -ENOMEM;
+	state->curframe = 0;
+	state->speculative = false;
+	state->branches = 1;
+	state->frame[0] = kzalloc(sizeof(struct bpf_func_state), GFP_KERNEL);
+	if (!state->frame[0]) {
+		kfree(state);
+		return -ENOMEM;
+	}
+	env->cur_state = state;
+	init_func_state(env, state->frame[0],
+			BPF_MAIN_FUNC /* callsite */,
+			0 /* frameno */,
+			subprog);
+
+	regs = state->frame[state->curframe]->regs;
+	if (subprog) {
+		ret = btf_prepare_func_args(env, subprog, regs);
+		if (ret)
+			goto out;
+		for (i = BPF_REG_1; i <= BPF_REG_5; i++) {
+			if (regs[i].type == PTR_TO_CTX)
+				mark_reg_known_zero(env, regs, i);
+			else if (regs[i].type == SCALAR_VALUE)
+				mark_reg_unknown(env, regs, i);
+		}
+	} else {
+		/* 1st arg to a function */
+		regs[BPF_REG_1].type = PTR_TO_CTX;
+		mark_reg_known_zero(env, regs, BPF_REG_1);
+		ret = btf_check_func_arg_match(env, subprog, regs);
+		if (ret == -EFAULT)
+			/* unlikely verifier bug. abort.
+			 * ret == 0 and ret < 0 are sadly acceptable for
+			 * main() function due to backward compatibility.
+			 * Like socket filter program may be written as:
+			 * int bpf_prog(struct pt_regs *ctx)
+			 * and never dereference that ctx in the program.
+			 * 'struct pt_regs' is a type mismatch for socket
+			 * filter that should be using 'struct __sk_buff'.
+			 */
+			goto out;
+	}
+
+	ret = do_check(env);
+out:
+	free_verifier_state(env->cur_state, true);
+	env->cur_state = NULL;
+	while (!pop_stack(env, NULL, NULL));
+	free_states(env);
+	if (ret)
+		/* clean aux data in case subprog was rejected */
+		sanitize_insn_aux_data(env);
+	return ret;
+}
+
+/* Verify all global functions in a BPF program one by one based on their BTF.
+ * All global functions must pass verification. Otherwise the whole program is rejected.
+ * Consider:
+ * int bar(int);
+ * int foo(int f)
+ * {
+ *    return bar(f);
+ * }
+ * int bar(int b)
+ * {
+ *    ...
+ * }
+ * foo() will be verified first for R1=any_scalar_value. During verification it
+ * will be assumed that bar() already verified successfully and call to bar()
+ * from foo() will be checked for type match only. Later bar() will be verified
+ * independently to check that it's safe for R1=any_scalar_value.
+ */
+static int do_check_subprogs(struct bpf_verifier_env *env)
+{
+	struct bpf_prog_aux *aux = env->prog->aux;
+	int i, ret;
+
+	if (!aux->func_info)
+		return 0;
+
+	for (i = 1; i < env->subprog_cnt; i++) {
+		if (aux->func_info_aux[i].linkage != BTF_FUNC_GLOBAL)
+			continue;
+		env->insn_idx = env->subprog_info[i].start;
+		WARN_ON_ONCE(env->insn_idx == 0);
+		ret = do_check_common(env, i);
+		if (ret) {
+			return ret;
+		} else if (env->log.level & BPF_LOG_LEVEL) {
+			verbose(env,
+				"Func#%d is safe for any args that match its prototype\n",
+				i);
+		}
+	}
+	return 0;
+}
+
+static int do_check_main(struct bpf_verifier_env *env)
+{
+	int ret;
+
+	env->insn_idx = 0;
+	ret = do_check_common(env, 0);
+	if (!ret)
+		env->prog->aux->stack_depth = env->subprog_info[0].stack_depth;
+	return ret;
+}
+
+
 static void print_verification_stats(struct bpf_verifier_env *env)
 {
 	int i;
@@ -9849,18 +10005,14 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret < 0)
 		goto skip_full_check;
 
-	ret = do_check(env);
-	if (env->cur_state) {
-		free_verifier_state(env->cur_state, true);
-		env->cur_state = NULL;
-	}
+	ret = do_check_subprogs(env);
+	ret = ret ?: do_check_main(env);
 
 	if (ret == 0 && bpf_prog_is_dev_bound(env->prog->aux))
 		ret = bpf_prog_offload_finalize(env);
 
 skip_full_check:
-	while (!pop_stack(env, NULL, NULL));
-	free_states(env);
+	kvfree(env->explored_states);
 
 	if (ret == 0)
 		ret = check_max_stack_depth(env);
-- 
cgit v1.2.3


From 769071ac9f20b6a447410c7eaa55d1a5233ef40c Mon Sep 17 00:00:00 2001
From: Andrei Vagin <avagin@openvz.org>
Date: Tue, 12 Nov 2019 01:26:52 +0000
Subject: ns: Introduce Time Namespace

Time Namespace isolates clock values.

The kernel provides access to several clocks: CLOCK_REALTIME,
CLOCK_MONOTONIC, CLOCK_BOOTTIME, etc.

CLOCK_REALTIME
      System-wide clock that measures real (i.e., wall-clock) time.

CLOCK_MONOTONIC
      Clock that cannot be set and represents monotonic time since
      some unspecified starting point.

CLOCK_BOOTTIME
      Identical to CLOCK_MONOTONIC, except it also includes any time
      that the system is suspended.

For many users, the time namespace means the ability to change the date and
time in a container (CLOCK_REALTIME). Providing per-namespace notions of
CLOCK_REALTIME would be complex with a massive overhead, and would be of
dubious value.

But in the context of checkpoint/restore functionality, monotonic and
boottime clocks become interesting. Both clocks are monotonic with
unspecified starting points. These clocks are widely used to measure time
slices and set timers. After restoring or migrating processes, it has to be
guaranteed that they never go backward. In an ideal case, the behavior of
these clocks should be the same as for a case when a whole system is
suspended. All this means that it is required to set CLOCK_MONOTONIC and
CLOCK_BOOTTIME clocks, which can be achieved by adding per-namespace
offsets for clocks.

A time namespace is similar to a pid namespace in the way it is created: the
unshare(CLONE_NEWTIME) system call creates a new time namespace, but does not
move the current process into it. All children of the process will then be
born in the new time namespace, or a process can use the setns() system call
to join a namespace.

This scheme allows setting clock offsets for a namespace, before any
processes appear in it.

All available clone flags have been used, so CLONE_NEWTIME uses the highest
bit of CSIGNAL. It means that it can be used only with the unshare() and
the clone3() system calls.
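
A minimal userspace sketch of the creation step described above. The helper
name unshare_time_ns() is hypothetical, and the raw syscall is used only so
that the sketch does not depend on libc feature-test macros.

```c
#include <sys/syscall.h>
#include <unistd.h>

/* Value from include/uapi/linux/sched.h: the highest bit of CSIGNAL (0xff).
 * Defined here in case the installed headers predate time namespaces.
 */
#ifndef CLONE_NEWTIME
#define CLONE_NEWTIME 0x00000080
#endif

/* Hypothetical helper: create a new time namespace for future children.
 * The caller itself stays in its current time namespace; processes it forks
 * afterwards are born in the new one.
 * Returns 0 on success, -1 with errno set on failure (for example when the
 * kernel lacks CONFIG_TIME_NS or the caller lacks the required privileges).
 */
static int unshare_time_ns(void)
{
	return (int)syscall(SYS_unshare, CLONE_NEWTIME);
}
```

Because the calling process is not moved, it still has a window to configure
the new namespace before forking the first child into it.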

[ tglx: Adjusted paragraph about clone3() to reality and massaged the
  changelog a bit. ]

Co-developed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Andrei Vagin <avagin@gmail.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://criu.org/Time_namespace
Link: https://lists.openvz.org/pipermail/criu/2018-June/041504.html
Link: https://lore.kernel.org/r/20191112012724.250792-4-dima@arista.com
---
 MAINTAINERS                    |   2 +
 fs/proc/namespaces.c           |   4 +
 include/linux/nsproxy.h        |   2 +
 include/linux/proc_ns.h        |   3 +
 include/linux/time_namespace.h |  71 ++++++++++++++
 include/linux/user_namespace.h |   1 +
 include/uapi/linux/sched.h     |   6 ++
 init/Kconfig                   |   7 ++
 kernel/fork.c                  |  16 ++-
 kernel/nsproxy.c               |  41 ++++++--
 kernel/time/Makefile           |   1 +
 kernel/time/namespace.c        | 217 +++++++++++++++++++++++++++++++++++++++++
 12 files changed, 361 insertions(+), 10 deletions(-)
 create mode 100644 include/linux/time_namespace.h
 create mode 100644 kernel/time/namespace.c

(limited to 'include/uapi/linux')

diff --git a/MAINTAINERS b/MAINTAINERS
index 8982c6e013b3..f6d00023e56f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -13214,6 +13214,8 @@ T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git timers/core
 S:	Maintained
 F:	fs/timerfd.c
 F:	include/linux/timer*
+F:	include/linux/time_namespace.h
+F:	kernel/time/namespace.c
 F:	kernel/time/*timer*
 
 POWER MANAGEMENT CORE
diff --git a/fs/proc/namespaces.c b/fs/proc/namespaces.c
index dd2b35f78b09..8b5c720fe5d7 100644
--- a/fs/proc/namespaces.c
+++ b/fs/proc/namespaces.c
@@ -33,6 +33,10 @@ static const struct proc_ns_operations *ns_entries[] = {
 #ifdef CONFIG_CGROUPS
 	&cgroupns_operations,
 #endif
+#ifdef CONFIG_TIME_NS
+	&timens_operations,
+	&timens_for_children_operations,
+#endif
 };
 
 static const char *proc_ns_get_link(struct dentry *dentry,
diff --git a/include/linux/nsproxy.h b/include/linux/nsproxy.h
index 2ae1b1a4d84d..074f395b9ad2 100644
--- a/include/linux/nsproxy.h
+++ b/include/linux/nsproxy.h
@@ -35,6 +35,8 @@ struct nsproxy {
 	struct mnt_namespace *mnt_ns;
 	struct pid_namespace *pid_ns_for_children;
 	struct net 	     *net_ns;
+	struct time_namespace *time_ns;
+	struct time_namespace *time_ns_for_children;
 	struct cgroup_namespace *cgroup_ns;
 };
 extern struct nsproxy init_nsproxy;
diff --git a/include/linux/proc_ns.h b/include/linux/proc_ns.h
index d31cb6215905..d312e6281e69 100644
--- a/include/linux/proc_ns.h
+++ b/include/linux/proc_ns.h
@@ -32,6 +32,8 @@ extern const struct proc_ns_operations pidns_for_children_operations;
 extern const struct proc_ns_operations userns_operations;
 extern const struct proc_ns_operations mntns_operations;
 extern const struct proc_ns_operations cgroupns_operations;
+extern const struct proc_ns_operations timens_operations;
+extern const struct proc_ns_operations timens_for_children_operations;
 
 /*
  * We always define these enumerators
@@ -43,6 +45,7 @@ enum {
 	PROC_USER_INIT_INO	= 0xEFFFFFFDU,
 	PROC_PID_INIT_INO	= 0xEFFFFFFCU,
 	PROC_CGROUP_INIT_INO	= 0xEFFFFFFBU,
+	PROC_TIME_INIT_INO	= 0xEFFFFFFAU,
 };
 
 #ifdef CONFIG_PROC_FS
diff --git a/include/linux/time_namespace.h b/include/linux/time_namespace.h
new file mode 100644
index 000000000000..8c74cc12ad24
--- /dev/null
+++ b/include/linux/time_namespace.h
@@ -0,0 +1,71 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_TIMENS_H
+#define _LINUX_TIMENS_H
+
+
+#include <linux/sched.h>
+#include <linux/kref.h>
+#include <linux/nsproxy.h>
+#include <linux/ns_common.h>
+#include <linux/err.h>
+
+struct user_namespace;
+extern struct user_namespace init_user_ns;
+
+struct time_namespace {
+	struct kref		kref;
+	struct user_namespace	*user_ns;
+	struct ucounts		*ucounts;
+	struct ns_common	ns;
+} __randomize_layout;
+
+extern struct time_namespace init_time_ns;
+
+#ifdef CONFIG_TIME_NS
+static inline struct time_namespace *get_time_ns(struct time_namespace *ns)
+{
+	kref_get(&ns->kref);
+	return ns;
+}
+
+struct time_namespace *copy_time_ns(unsigned long flags,
+				    struct user_namespace *user_ns,
+				    struct time_namespace *old_ns);
+void free_time_ns(struct kref *kref);
+int timens_on_fork(struct nsproxy *nsproxy, struct task_struct *tsk);
+
+static inline void put_time_ns(struct time_namespace *ns)
+{
+	kref_put(&ns->kref, free_time_ns);
+}
+
+#else
+static inline struct time_namespace *get_time_ns(struct time_namespace *ns)
+{
+	return NULL;
+}
+
+static inline void put_time_ns(struct time_namespace *ns)
+{
+}
+
+static inline
+struct time_namespace *copy_time_ns(unsigned long flags,
+				    struct user_namespace *user_ns,
+				    struct time_namespace *old_ns)
+{
+	if (flags & CLONE_NEWTIME)
+		return ERR_PTR(-EINVAL);
+
+	return old_ns;
+}
+
+static inline int timens_on_fork(struct nsproxy *nsproxy,
+				 struct task_struct *tsk)
+{
+	return 0;
+}
+
+#endif
+
+#endif /* _LINUX_TIMENS_H */
diff --git a/include/linux/user_namespace.h b/include/linux/user_namespace.h
index fb9f4f799554..6ef1c7109fc4 100644
--- a/include/linux/user_namespace.h
+++ b/include/linux/user_namespace.h
@@ -45,6 +45,7 @@ enum ucount_type {
 	UCOUNT_NET_NAMESPACES,
 	UCOUNT_MNT_NAMESPACES,
 	UCOUNT_CGROUP_NAMESPACES,
+	UCOUNT_TIME_NAMESPACES,
 #ifdef CONFIG_INOTIFY_USER
 	UCOUNT_INOTIFY_INSTANCES,
 	UCOUNT_INOTIFY_WATCHES,
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 4a0217832464..2e3bc22c6f20 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -36,6 +36,12 @@
 /* Flags for the clone3() syscall. */
 #define CLONE_CLEAR_SIGHAND 0x100000000ULL /* Clear any signal handler and reset to SIG_DFL. */
 
+/*
+ * cloning flags intersect with CSIGNAL so can be used with unshare and clone3
+ * syscalls only:
+ */
+#define CLONE_NEWTIME	0x00000080	/* New time namespace */
+
 #ifndef __ASSEMBLY__
 /**
  * struct clone_args - arguments for the clone3 syscall
diff --git a/init/Kconfig b/init/Kconfig
index a34064a031a5..b34314fc75f7 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1080,6 +1080,13 @@ config UTS_NS
 	  In this namespace tasks see different info provided with the
 	  uname() system call
 
+config TIME_NS
+	bool "TIME namespace"
+	default y
+	help
+	  In this namespace boottime and monotonic clocks can be set.
+	  The time will keep going at the same pace.
+
 config IPC_NS
 	bool "IPC namespace"
 	depends on (SYSVIPC || POSIX_MQUEUE)
diff --git a/kernel/fork.c b/kernel/fork.c
index 2508a4f238a3..363595815144 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1832,6 +1832,7 @@ static __latent_entropy struct task_struct *copy_process(
 	struct multiprocess_signals delayed;
 	struct file *pidfile = NULL;
 	u64 clone_flags = args->flags;
+	struct nsproxy *nsp = current->nsproxy;
 
 	/*
 	 * Don't allow sharing the root directory with processes in a different
@@ -1874,8 +1875,16 @@ static __latent_entropy struct task_struct *copy_process(
 	 */
 	if (clone_flags & CLONE_THREAD) {
 		if ((clone_flags & (CLONE_NEWUSER | CLONE_NEWPID)) ||
-		    (task_active_pid_ns(current) !=
-				current->nsproxy->pid_ns_for_children))
+		    (task_active_pid_ns(current) != nsp->pid_ns_for_children))
+			return ERR_PTR(-EINVAL);
+	}
+
+	/*
+	 * If the new process will be in a different time namespace
+	 * do not allow it to share VM or a thread group with the forking task.
+	 */
+	if (clone_flags & (CLONE_THREAD | CLONE_VM)) {
+		if (nsp->time_ns != nsp->time_ns_for_children)
 			return ERR_PTR(-EINVAL);
 	}
 
@@ -2811,7 +2820,8 @@ static int check_unshare_flags(unsigned long unshare_flags)
 	if (unshare_flags & ~(CLONE_THREAD|CLONE_FS|CLONE_NEWNS|CLONE_SIGHAND|
 				CLONE_VM|CLONE_FILES|CLONE_SYSVSEM|
 				CLONE_NEWUTS|CLONE_NEWIPC|CLONE_NEWNET|
-				CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWCGROUP))
+				CLONE_NEWUSER|CLONE_NEWPID|CLONE_NEWCGROUP|
+				CLONE_NEWTIME))
 		return -EINVAL;
 	/*
 	 * Not implemented, but pretend it works if there is nothing
diff --git a/kernel/nsproxy.c b/kernel/nsproxy.c
index c815f58e6bc0..ed9882108cd2 100644
--- a/kernel/nsproxy.c
+++ b/kernel/nsproxy.c
@@ -18,6 +18,7 @@
 #include <linux/pid_namespace.h>
 #include <net/net_namespace.h>
 #include <linux/ipc_namespace.h>
+#include <linux/time_namespace.h>
 #include <linux/proc_ns.h>
 #include <linux/file.h>
 #include <linux/syscalls.h>
@@ -40,6 +41,10 @@ struct nsproxy init_nsproxy = {
 #ifdef CONFIG_CGROUPS
 	.cgroup_ns		= &init_cgroup_ns,
 #endif
+#ifdef CONFIG_TIME_NS
+	.time_ns		= &init_time_ns,
+	.time_ns_for_children	= &init_time_ns,
+#endif
 };
 
 static inline struct nsproxy *create_nsproxy(void)
@@ -106,8 +111,18 @@ static struct nsproxy *create_new_namespaces(unsigned long flags,
 		goto out_net;
 	}
 
+	new_nsp->time_ns_for_children = copy_time_ns(flags, user_ns,
+					tsk->nsproxy->time_ns_for_children);
+	if (IS_ERR(new_nsp->time_ns_for_children)) {
+		err = PTR_ERR(new_nsp->time_ns_for_children);
+		goto out_time;
+	}
+	new_nsp->time_ns = get_time_ns(tsk->nsproxy->time_ns);
+
 	return new_nsp;
 
+out_time:
+	put_net(new_nsp->net_ns);
 out_net:
 	put_cgroup_ns(new_nsp->cgroup_ns);
 out_cgroup:
@@ -136,15 +151,16 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
 	struct nsproxy *old_ns = tsk->nsproxy;
 	struct user_namespace *user_ns = task_cred_xxx(tsk, user_ns);
 	struct nsproxy *new_ns;
+	int ret;
 
 	if (likely(!(flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
 			      CLONE_NEWPID | CLONE_NEWNET |
-			      CLONE_NEWCGROUP)))) {
-		get_nsproxy(old_ns);
-		return 0;
-	}
-
-	if (!ns_capable(user_ns, CAP_SYS_ADMIN))
+			      CLONE_NEWCGROUP | CLONE_NEWTIME)))) {
+		if (likely(old_ns->time_ns_for_children == old_ns->time_ns)) {
+			get_nsproxy(old_ns);
+			return 0;
+		}
+	} else if (!ns_capable(user_ns, CAP_SYS_ADMIN))
 		return -EPERM;
 
 	/*
@@ -162,6 +178,12 @@ int copy_namespaces(unsigned long flags, struct task_struct *tsk)
 	if (IS_ERR(new_ns))
 		return  PTR_ERR(new_ns);
 
+	ret = timens_on_fork(new_ns, tsk);
+	if (ret) {
+		free_nsproxy(new_ns);
+		return ret;
+	}
+
 	tsk->nsproxy = new_ns;
 	return 0;
 }
@@ -176,6 +198,10 @@ void free_nsproxy(struct nsproxy *ns)
 		put_ipc_ns(ns->ipc_ns);
 	if (ns->pid_ns_for_children)
 		put_pid_ns(ns->pid_ns_for_children);
+	if (ns->time_ns)
+		put_time_ns(ns->time_ns);
+	if (ns->time_ns_for_children)
+		put_time_ns(ns->time_ns_for_children);
 	put_cgroup_ns(ns->cgroup_ns);
 	put_net(ns->net_ns);
 	kmem_cache_free(nsproxy_cachep, ns);
@@ -192,7 +218,8 @@ int unshare_nsproxy_namespaces(unsigned long unshare_flags,
 	int err = 0;
 
 	if (!(unshare_flags & (CLONE_NEWNS | CLONE_NEWUTS | CLONE_NEWIPC |
-			       CLONE_NEWNET | CLONE_NEWPID | CLONE_NEWCGROUP)))
+			       CLONE_NEWNET | CLONE_NEWPID | CLONE_NEWCGROUP |
+			       CLONE_NEWTIME)))
 		return 0;
 
 	user_ns = new_cred ? new_cred->user_ns : current_user_ns();
diff --git a/kernel/time/Makefile b/kernel/time/Makefile
index 1867044800bb..c8f00168afe8 100644
--- a/kernel/time/Makefile
+++ b/kernel/time/Makefile
@@ -19,3 +19,4 @@ obj-$(CONFIG_TICK_ONESHOT)			+= tick-oneshot.o tick-sched.o
 obj-$(CONFIG_HAVE_GENERIC_VDSO)			+= vsyscall.o
 obj-$(CONFIG_DEBUG_FS)				+= timekeeping_debug.o
 obj-$(CONFIG_TEST_UDELAY)			+= test_udelay.o
+obj-$(CONFIG_TIME_NS)				+= namespace.o
diff --git a/kernel/time/namespace.c b/kernel/time/namespace.c
new file mode 100644
index 000000000000..2662a69e0382
--- /dev/null
+++ b/kernel/time/namespace.c
@@ -0,0 +1,217 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Author: Andrei Vagin <avagin@openvz.org>
+ * Author: Dmitry Safonov <dima@arista.com>
+ */
+
+#include <linux/time_namespace.h>
+#include <linux/user_namespace.h>
+#include <linux/sched/signal.h>
+#include <linux/sched/task.h>
+#include <linux/proc_ns.h>
+#include <linux/export.h>
+#include <linux/time.h>
+#include <linux/slab.h>
+#include <linux/cred.h>
+#include <linux/err.h>
+
+static struct ucounts *inc_time_namespaces(struct user_namespace *ns)
+{
+	return inc_ucount(ns, current_euid(), UCOUNT_TIME_NAMESPACES);
+}
+
+static void dec_time_namespaces(struct ucounts *ucounts)
+{
+	dec_ucount(ucounts, UCOUNT_TIME_NAMESPACES);
+}
+
+/**
+ * clone_time_ns - Clone a time namespace
+ * @user_ns:	User namespace which owns a new namespace.
+ * @old_ns:	Namespace to clone
+ *
+ * Clone @old_ns and set the clone refcount to 1
+ *
+ * Return: The new namespace or ERR_PTR.
+ */
+static struct time_namespace *clone_time_ns(struct user_namespace *user_ns,
+					  struct time_namespace *old_ns)
+{
+	struct time_namespace *ns;
+	struct ucounts *ucounts;
+	int err;
+
+	err = -ENOSPC;
+	ucounts = inc_time_namespaces(user_ns);
+	if (!ucounts)
+		goto fail;
+
+	err = -ENOMEM;
+	ns = kmalloc(sizeof(*ns), GFP_KERNEL);
+	if (!ns)
+		goto fail_dec;
+
+	kref_init(&ns->kref);
+
+	err = ns_alloc_inum(&ns->ns);
+	if (err)
+		goto fail_free;
+
+	ns->ucounts = ucounts;
+	ns->ns.ops = &timens_operations;
+	ns->user_ns = get_user_ns(user_ns);
+	return ns;
+
+fail_free:
+	kfree(ns);
+fail_dec:
+	dec_time_namespaces(ucounts);
+fail:
+	return ERR_PTR(err);
+}
+
+/**
+ * copy_time_ns - Create timens_for_children from @old_ns
+ * @flags:	Cloning flags
+ * @user_ns:	User namespace which owns a new namespace.
+ * @old_ns:	Namespace to clone
+ *
+ * If CLONE_NEWTIME is specified in @flags, creates a new timens_for_children;
+ * takes a reference on @old_ns otherwise.
+ *
+ * Return: timens_for_children namespace or ERR_PTR.
+ */
+struct time_namespace *copy_time_ns(unsigned long flags,
+	struct user_namespace *user_ns, struct time_namespace *old_ns)
+{
+	if (!(flags & CLONE_NEWTIME))
+		return get_time_ns(old_ns);
+
+	return clone_time_ns(user_ns, old_ns);
+}
+
+void free_time_ns(struct kref *kref)
+{
+	struct time_namespace *ns;
+
+	ns = container_of(kref, struct time_namespace, kref);
+	dec_time_namespaces(ns->ucounts);
+	put_user_ns(ns->user_ns);
+	ns_free_inum(&ns->ns);
+	kfree(ns);
+}
+
+static struct time_namespace *to_time_ns(struct ns_common *ns)
+{
+	return container_of(ns, struct time_namespace, ns);
+}
+
+static struct ns_common *timens_get(struct task_struct *task)
+{
+	struct time_namespace *ns = NULL;
+	struct nsproxy *nsproxy;
+
+	task_lock(task);
+	nsproxy = task->nsproxy;
+	if (nsproxy) {
+		ns = nsproxy->time_ns;
+		get_time_ns(ns);
+	}
+	task_unlock(task);
+
+	return ns ? &ns->ns : NULL;
+}
+
+static struct ns_common *timens_for_children_get(struct task_struct *task)
+{
+	struct time_namespace *ns = NULL;
+	struct nsproxy *nsproxy;
+
+	task_lock(task);
+	nsproxy = task->nsproxy;
+	if (nsproxy) {
+		ns = nsproxy->time_ns_for_children;
+		get_time_ns(ns);
+	}
+	task_unlock(task);
+
+	return ns ? &ns->ns : NULL;
+}
+
+static void timens_put(struct ns_common *ns)
+{
+	put_time_ns(to_time_ns(ns));
+}
+
+static int timens_install(struct nsproxy *nsproxy, struct ns_common *new)
+{
+	struct time_namespace *ns = to_time_ns(new);
+
+	if (!current_is_single_threaded())
+		return -EUSERS;
+
+	if (!ns_capable(ns->user_ns, CAP_SYS_ADMIN) ||
+	    !ns_capable(current_user_ns(), CAP_SYS_ADMIN))
+		return -EPERM;
+
+	get_time_ns(ns);
+	put_time_ns(nsproxy->time_ns);
+	nsproxy->time_ns = ns;
+
+	get_time_ns(ns);
+	put_time_ns(nsproxy->time_ns_for_children);
+	nsproxy->time_ns_for_children = ns;
+	return 0;
+}
+
+int timens_on_fork(struct nsproxy *nsproxy, struct task_struct *tsk)
+{
+	struct ns_common *nsc = &nsproxy->time_ns_for_children->ns;
+	struct time_namespace *ns = to_time_ns(nsc);
+
+	/* create_new_namespaces() already incremented the ref counter */
+	if (nsproxy->time_ns == nsproxy->time_ns_for_children)
+		return 0;
+
+	get_time_ns(ns);
+	put_time_ns(nsproxy->time_ns);
+	nsproxy->time_ns = ns;
+
+	return 0;
+}
+
+static struct user_namespace *timens_owner(struct ns_common *ns)
+{
+	return to_time_ns(ns)->user_ns;
+}
+
+const struct proc_ns_operations timens_operations = {
+	.name		= "time",
+	.type		= CLONE_NEWTIME,
+	.get		= timens_get,
+	.put		= timens_put,
+	.install	= timens_install,
+	.owner		= timens_owner,
+};
+
+const struct proc_ns_operations timens_for_children_operations = {
+	.name		= "time_for_children",
+	.type		= CLONE_NEWTIME,
+	.get		= timens_for_children_get,
+	.put		= timens_put,
+	.install	= timens_install,
+	.owner		= timens_owner,
+};
+
+struct time_namespace init_time_ns = {
+	.kref		= KREF_INIT(3),
+	.user_ns	= &init_user_ns,
+	.ns.inum	= PROC_TIME_INIT_INO,
+	.ns.ops		= &timens_operations,
+};
+
+static int __init time_ns_init(void)
+{
+	return 0;
+}
+subsys_initcall(time_ns_init);
-- 
cgit v1.2.3


From 76564261a7db80c5f5c624e0122a28787f266bdf Mon Sep 17 00:00:00 2001
From: Antoine Tenart <antoine.tenart@bootlin.com>
Date: Mon, 13 Jan 2020 23:31:40 +0100
Subject: net: macsec: introduce the macsec_context structure

This patch introduces the macsec_context structure. It will be used
in the kernel to exchange information between the common MACsec
implementation (macsec.c) and the MACsec hardware offloading
implementations. This structure contains pointers to MACsec specific
structures which contain the actual MACsec configuration, and to the
underlying device (phydev for now).

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/phy.h                |  2 ++
 include/net/macsec.h               | 21 +++++++++++++++++++++
 include/uapi/linux/if_link.h       |  7 +++++++
 tools/include/uapi/linux/if_link.h |  7 +++++++
 4 files changed, 37 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/include/linux/phy.h b/include/linux/phy.h
index 3a70b756ac1a..be079a7bb40a 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -332,6 +332,8 @@ struct phy_c45_device_ids {
 	u32 device_ids[8];
 };
 
+struct macsec_context;
+
 /* phy_device: An instance of a PHY
  *
  * drv: Pointer to the driver for this PHY instance
diff --git a/include/net/macsec.h b/include/net/macsec.h
index e7b41c1043f6..0b98803f92ec 100644
--- a/include/net/macsec.h
+++ b/include/net/macsec.h
@@ -174,4 +174,25 @@ struct macsec_secy {
 	struct macsec_rx_sc __rcu *rx_sc;
 };
 
+/**
+ * struct macsec_context - MACsec context for hardware offloading
+ */
+struct macsec_context {
+	struct phy_device *phydev;
+	enum macsec_offload offload;
+
+	struct macsec_secy *secy;
+	struct macsec_rx_sc *rx_sc;
+	struct {
+		unsigned char assoc_num;
+		u8 key[MACSEC_KEYID_LEN];
+		union {
+			struct macsec_rx_sa *rx_sa;
+			struct macsec_tx_sa *tx_sa;
+		};
+	} sa;
+
+	u8 prepare:1;
+};
+
 #endif /* _NET_MACSEC_H_ */
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 1d69f637c5d6..024af2d1d0af 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -486,6 +486,13 @@ enum macsec_validation_type {
 	MACSEC_VALIDATE_MAX = __MACSEC_VALIDATE_END - 1,
 };
 
+enum macsec_offload {
+	MACSEC_OFFLOAD_OFF = 0,
+	MACSEC_OFFLOAD_PHY = 1,
+	__MACSEC_OFFLOAD_END,
+	MACSEC_OFFLOAD_MAX = __MACSEC_OFFLOAD_END - 1,
+};
+
 /* IPVLAN section */
 enum {
 	IFLA_IPVLAN_UNSPEC,
diff --git a/tools/include/uapi/linux/if_link.h b/tools/include/uapi/linux/if_link.h
index 8aec8769d944..42efdb84d189 100644
--- a/tools/include/uapi/linux/if_link.h
+++ b/tools/include/uapi/linux/if_link.h
@@ -485,6 +485,13 @@ enum macsec_validation_type {
 	MACSEC_VALIDATE_MAX = __MACSEC_VALIDATE_END - 1,
 };
 
+enum macsec_offload {
+	MACSEC_OFFLOAD_OFF = 0,
+	MACSEC_OFFLOAD_PHY = 1,
+	__MACSEC_OFFLOAD_END,
+	MACSEC_OFFLOAD_MAX = __MACSEC_OFFLOAD_END - 1,
+};
+
 /* IPVLAN section */
 enum {
 	IFLA_IPVLAN_UNSPEC,
-- 
cgit v1.2.3


From dcb780fb279514f268826f2e9f4df3bc75610703 Mon Sep 17 00:00:00 2001
From: Antoine Tenart <antoine.tenart@bootlin.com>
Date: Mon, 13 Jan 2020 23:31:44 +0100
Subject: net: macsec: add nla support for changing the offloading selection

MACsec offloading to underlying hardware devices is disabled by default
(the software implementation is used). This patch adds support for
changing this setting through the MACsec netlink interface. Many checks
are done when enabling offloading on a given MACsec interface as there
are limitations (it must be supported by the hardware, only a single
interface can be offloaded on a given physical device at a time, rules
can't be moved for now).
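
The order of the checks listed above can be modelled in plain C as shown
below. This is a simplified sketch and not the driver code: the struct, its
fields and model_upd_offload() are invented for illustration, and a flat
array stands in for walking all net devices in all namespaces.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum model_offload { MODEL_OFFLOAD_OFF = 0, MODEL_OFFLOAD_PHY = 1 };

/* Hypothetical stand-in for a MACsec interface and its state. */
struct model_macsec_dev {
	int real_dev;			/* id of the underlying device */
	enum model_offload offload;	/* current offloading mode */
	bool hw_supports_offload;	/* hardware can offload MACsec */
	bool running;			/* interface is up */
	bool has_rules;			/* RX SCs or TX SAs are configured */
};

/* Try to switch devs[idx] to @want; returns 0 or a negative errno. */
static int model_upd_offload(struct model_macsec_dev *devs, size_t n,
			     size_t idx, enum model_offload want)
{
	struct model_macsec_dev *dev = &devs[idx];
	size_t i;

	if (dev->offload == want)
		return 0;

	if (want != MODEL_OFFLOAD_OFF) {
		/* The hardware must support the requested mode. */
		if (!dev->hw_supports_offload)
			return -EOPNOTSUPP;

		/* Only one interface per physical device may be offloaded. */
		for (i = 0; i < n; i++)
			if (devs[i].real_dev == dev->real_dev &&
			    devs[i].offload != MODEL_OFFLOAD_OFF)
				return -EBUSY;
	}

	/* The interface must be down and must not carry any rules. */
	if (dev->running || dev->has_rules)
		return -EBUSY;

	dev->offload = want;
	return 0;
}
```

With two interfaces on the same physical device, enabling offloading on
the second one fails with -EBUSY until the first one is switched back to
the software implementation.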

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/macsec.c           | 145 ++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/if_macsec.h |  11 ++++
 2 files changed, 153 insertions(+), 3 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c
index 36b0416381bf..e515919e8687 100644
--- a/drivers/net/macsec.c
+++ b/drivers/net/macsec.c
@@ -1484,6 +1484,7 @@ static const struct nla_policy macsec_genl_policy[NUM_MACSEC_ATTR] = {
 	[MACSEC_ATTR_IFINDEX] = { .type = NLA_U32 },
 	[MACSEC_ATTR_RXSC_CONFIG] = { .type = NLA_NESTED },
 	[MACSEC_ATTR_SA_CONFIG] = { .type = NLA_NESTED },
+	[MACSEC_ATTR_OFFLOAD] = { .type = NLA_NESTED },
 };
 
 static const struct nla_policy macsec_genl_rxsc_policy[NUM_MACSEC_RXSC_ATTR] = {
@@ -1501,6 +1502,10 @@ static const struct nla_policy macsec_genl_sa_policy[NUM_MACSEC_SA_ATTR] = {
 				 .len = MACSEC_MAX_KEY_LEN, },
 };
 
+static const struct nla_policy macsec_genl_offload_policy[NUM_MACSEC_OFFLOAD_ATTR] = {
+	[MACSEC_OFFLOAD_ATTR_TYPE] = { .type = NLA_U8 },
+};
+
 /* Offloads an operation to a device driver */
 static int macsec_offload(int (* const func)(struct macsec_context *),
 			  struct macsec_context *ctx)
@@ -2329,6 +2334,126 @@ cleanup:
 	return ret;
 }
 
+static bool macsec_is_configured(struct macsec_dev *macsec)
+{
+	struct macsec_secy *secy = &macsec->secy;
+	struct macsec_tx_sc *tx_sc = &secy->tx_sc;
+	int i;
+
+	if (secy->n_rx_sc > 0)
+		return true;
+
+	for (i = 0; i < MACSEC_NUM_AN; i++)
+		if (tx_sc->sa[i])
+			return true;
+
+	return false;
+}
+
+static int macsec_upd_offload(struct sk_buff *skb, struct genl_info *info)
+{
+	struct nlattr *tb_offload[MACSEC_OFFLOAD_ATTR_MAX + 1];
+	enum macsec_offload offload, prev_offload;
+	int (*func)(struct macsec_context *ctx);
+	struct nlattr **attrs = info->attrs;
+	struct net_device *dev, *loop_dev;
+	const struct macsec_ops *ops;
+	struct macsec_context ctx;
+	struct macsec_dev *macsec;
+	struct net *loop_net;
+	int ret;
+
+	if (!attrs[MACSEC_ATTR_IFINDEX])
+		return -EINVAL;
+
+	if (!attrs[MACSEC_ATTR_OFFLOAD])
+		return -EINVAL;
+
+	if (nla_parse_nested_deprecated(tb_offload, MACSEC_OFFLOAD_ATTR_MAX,
+					attrs[MACSEC_ATTR_OFFLOAD],
+					macsec_genl_offload_policy, NULL))
+		return -EINVAL;
+
+	dev = get_dev_from_nl(genl_info_net(info), attrs);
+	if (IS_ERR(dev))
+		return PTR_ERR(dev);
+	macsec = macsec_priv(dev);
+
+	offload = nla_get_u8(tb_offload[MACSEC_OFFLOAD_ATTR_TYPE]);
+	if (macsec->offload == offload)
+		return 0;
+
+	/* Check if the offloading mode is supported by the underlying layers */
+	if (offload != MACSEC_OFFLOAD_OFF &&
+	    !macsec_check_offload(offload, macsec))
+		return -EOPNOTSUPP;
+
+	if (offload == MACSEC_OFFLOAD_OFF)
+		goto skip_limitation;
+
+	/* Check the physical interface isn't offloading another interface
+	 * first.
+	 */
+	for_each_net(loop_net) {
+		for_each_netdev(loop_net, loop_dev) {
+			struct macsec_dev *priv;
+
+			if (!netif_is_macsec(loop_dev))
+				continue;
+
+			priv = macsec_priv(loop_dev);
+
+			if (priv->real_dev == macsec->real_dev &&
+			    priv->offload != MACSEC_OFFLOAD_OFF)
+				return -EBUSY;
+		}
+	}
+
+skip_limitation:
+	/* Check if the net device is busy. */
+	if (netif_running(dev))
+		return -EBUSY;
+
+	rtnl_lock();
+
+	prev_offload = macsec->offload;
+	macsec->offload = offload;
+
+	/* Check if the device already has rules configured: we do not support
+	 * rules migration.
+	 */
+	if (macsec_is_configured(macsec)) {
+		ret = -EBUSY;
+		goto rollback;
+	}
+
+	ops = __macsec_get_ops(offload == MACSEC_OFFLOAD_OFF ? prev_offload : offload,
+			       macsec, &ctx);
+	if (!ops) {
+		ret = -EOPNOTSUPP;
+		goto rollback;
+	}
+
+	if (prev_offload == MACSEC_OFFLOAD_OFF)
+		func = ops->mdo_add_secy;
+	else
+		func = ops->mdo_del_secy;
+
+	ctx.secy = &macsec->secy;
+	ret = macsec_offload(func, &ctx);
+	if (ret)
+		goto rollback;
+
+	rtnl_unlock();
+	return 0;
+
+rollback:
+	macsec->offload = prev_offload;
+
+	rtnl_unlock();
+	return ret;
+}
+
 static int copy_tx_sa_stats(struct sk_buff *skb,
 			    struct macsec_tx_sa_stats __percpu *pstats)
 {
@@ -2590,12 +2715,13 @@ static noinline_for_stack int
 dump_secy(struct macsec_secy *secy, struct net_device *dev,
 	  struct sk_buff *skb, struct netlink_callback *cb)
 {
-	struct macsec_rx_sc *rx_sc;
+	struct macsec_dev *macsec = netdev_priv(dev);
 	struct macsec_tx_sc *tx_sc = &secy->tx_sc;
 	struct nlattr *txsa_list, *rxsc_list;
-	int i, j;
-	void *hdr;
+	struct macsec_rx_sc *rx_sc;
 	struct nlattr *attr;
+	void *hdr;
+	int i, j;
 
 	hdr = genlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
 			  &macsec_fam, NLM_F_MULTI, MACSEC_CMD_GET_TXSC);
@@ -2607,6 +2733,13 @@ dump_secy(struct macsec_secy *secy, struct net_device *dev,
 	if (nla_put_u32(skb, MACSEC_ATTR_IFINDEX, dev->ifindex))
 		goto nla_put_failure;
 
+	attr = nla_nest_start_noflag(skb, MACSEC_ATTR_OFFLOAD);
+	if (!attr)
+		goto nla_put_failure;
+	if (nla_put_u8(skb, MACSEC_OFFLOAD_ATTR_TYPE, macsec->offload))
+		goto nla_put_failure;
+	nla_nest_end(skb, attr);
+
 	if (nla_put_secy(secy, skb))
 		goto nla_put_failure;
 
@@ -2872,6 +3005,12 @@ static const struct genl_ops macsec_genl_ops[] = {
 		.doit = macsec_upd_rxsa,
 		.flags = GENL_ADMIN_PERM,
 	},
+	{
+		.cmd = MACSEC_CMD_UPD_OFFLOAD,
+		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
+		.doit = macsec_upd_offload,
+		.flags = GENL_ADMIN_PERM,
+	},
 };
 
 static struct genl_family macsec_fam __ro_after_init = {
diff --git a/include/uapi/linux/if_macsec.h b/include/uapi/linux/if_macsec.h
index 98e4d5d7c45c..1d63c43c38cc 100644
--- a/include/uapi/linux/if_macsec.h
+++ b/include/uapi/linux/if_macsec.h
@@ -45,6 +45,7 @@ enum macsec_attrs {
 	MACSEC_ATTR_RXSC_LIST,   /* dump, nested, macsec_rxsc_attrs for each RXSC */
 	MACSEC_ATTR_TXSC_STATS,  /* dump, nested, macsec_txsc_stats_attr */
 	MACSEC_ATTR_SECY_STATS,  /* dump, nested, macsec_secy_stats_attr */
+	MACSEC_ATTR_OFFLOAD,     /* config, nested, macsec_offload_attrs */
 	__MACSEC_ATTR_END,
 	NUM_MACSEC_ATTR = __MACSEC_ATTR_END,
 	MACSEC_ATTR_MAX = __MACSEC_ATTR_END - 1,
@@ -97,6 +98,15 @@ enum macsec_sa_attrs {
 	MACSEC_SA_ATTR_MAX = __MACSEC_SA_ATTR_END - 1,
 };
 
+enum macsec_offload_attrs {
+	MACSEC_OFFLOAD_ATTR_UNSPEC,
+	MACSEC_OFFLOAD_ATTR_TYPE, /* config/dump, u8 0..2 */
+	MACSEC_OFFLOAD_ATTR_PAD,
+	__MACSEC_OFFLOAD_ATTR_END,
+	NUM_MACSEC_OFFLOAD_ATTR = __MACSEC_OFFLOAD_ATTR_END,
+	MACSEC_OFFLOAD_ATTR_MAX = __MACSEC_OFFLOAD_ATTR_END - 1,
+};
+
 enum macsec_nl_commands {
 	MACSEC_CMD_GET_TXSC,
 	MACSEC_CMD_ADD_RXSC,
@@ -108,6 +118,7 @@ enum macsec_nl_commands {
 	MACSEC_CMD_ADD_RXSA,
 	MACSEC_CMD_DEL_RXSA,
 	MACSEC_CMD_UPD_RXSA,
+	MACSEC_CMD_UPD_OFFLOAD,
 };
 
 /* u64 per-RXSC stats */
-- 
cgit v1.2.3


From 90b93f1b31f86dcde2fa3c57f1ae33d28d87a1f8 Mon Sep 17 00:00:00 2001
From: Ido Schimmel <idosch@mellanox.com>
Date: Tue, 14 Jan 2020 13:23:11 +0200
Subject: ipv4: Add "offload" and "trap" indications to routes

When performing L3 offload, routes and nexthops are usually programmed
into two different tables in the underlying device. Therefore, the fact
that a nexthop resides in hardware does not necessarily mean that all
the associated routes also reside in hardware and vice-versa.

While the kernel can signal to user space the presence of a nexthop in
hardware (via 'RTNH_F_OFFLOAD'), it does not have a corresponding flag
for routes. In addition, the fact that a route resides in hardware does
not necessarily mean that the traffic is offloaded. For example,
unreachable routes (i.e., 'RTN_UNREACHABLE') are programmed to trap
packets to the CPU so that the kernel will be able to generate the
appropriate ICMP error packet.

This patch adds "offload" and "trap" indications to IPv4 routes, so
that users will have better visibility into the offload process.

'struct fib_alias' is extended with two new fields that indicate whether
the route resides in hardware and whether it is offloading traffic from
the kernel or trapping packets to it. Note that the new fields are added
in the 6-byte hole and therefore the struct still fits in a single
cache line [1].
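
The effect of placing the bits in existing padding can be reproduced with a
userspace model. The structs below are stand-ins written for this example
(they are not the kernel definitions), and u8 bit-fields rely on the
GCC/Clang extension that the kernel also uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for hlist_node and rcu_head, for illustration only. */
struct model_hlist_node { void *next; void **pprev; };
struct model_rcu_head {
	void *next;
	void (*func)(void *);
} __attribute__((aligned(8)));

/* Layout modelled on fib_alias before the patch. */
struct model_alias_before {
	struct model_hlist_node fa_list;
	void *fa_info;
	uint8_t fa_tos, fa_type, fa_state, fa_slen;
	uint32_t tb_id;
	int16_t fa_default;
	struct model_rcu_head rcu;
};

/* Same layout with the three bit-fields placed in the padding hole. */
struct model_alias_after {
	struct model_hlist_node fa_list;
	void *fa_info;
	uint8_t fa_tos, fa_type, fa_state, fa_slen;
	uint32_t tb_id;
	int16_t fa_default;
	uint8_t offload:1, trap:1, unused:6;
	struct model_rcu_head rcu;
};

static size_t model_alias_size_before(void)
{
	return sizeof(struct model_alias_before);
}

static size_t model_alias_size_after(void)
{
	return sizeof(struct model_alias_after);
}
```

Both sizes should be equal, because the extra byte lands in the padding
that the 8-byte alignment of the last member already forces. On a typical
64-bit build this is expected to give 56 bytes in both cases, in line with
the pahole output in [1].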

Capable drivers are expected to invoke fib_alias_hw_flags_set() with the
route's key in order to set the flags.

The indications are dumped to user space via new flags (i.e.,
'RTM_F_OFFLOAD' and 'RTM_F_TRAP') in the 'rtm_flags' field in the
ancillary header.
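
A user space consumer could test the new bits roughly as follows. This is
only a sketch: the helper names are made up for this example, and the flag
values are defined locally (taken from this patch) in case the installed
UAPI headers predate it.

```c
#include <stdbool.h>

#ifndef RTM_F_OFFLOAD
#define RTM_F_OFFLOAD	0x4000	/* route is offloaded */
#endif
#ifndef RTM_F_TRAP
#define RTM_F_TRAP	0x8000	/* route is trapping packets */
#endif

/* True if the dumped route is marked as offloading traffic. */
static bool route_is_offloaded(unsigned int rtm_flags)
{
	return (rtm_flags & RTM_F_OFFLOAD) != 0;
}

/* True if the dumped route is marked as trapping packets to the CPU. */
static bool route_is_trapping(unsigned int rtm_flags)
{
	return (rtm_flags & RTM_F_TRAP) != 0;
}
```

The argument is the 'rtm_flags' value taken from the 'struct rtmsg' of an
RTM_NEWROUTE message. Other bits in the field, such as RTM_F_FIB_MATCH, do
not affect the result.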

v2:
* Make use of 'struct fib_rt_info' in fib_alias_hw_flags_set()

[1]
struct fib_alias {
        struct hlist_node  fa_list;                      /*     0    16 */
        struct fib_info *          fa_info;              /*    16     8 */
        u8                         fa_tos;               /*    24     1 */
        u8                         fa_type;              /*    25     1 */
        u8                         fa_state;             /*    26     1 */
        u8                         fa_slen;              /*    27     1 */
        u32                        tb_id;                /*    28     4 */
        s16                        fa_default;           /*    32     2 */
        u8                         offload:1;            /*    34: 0  1 */
        u8                         trap:1;               /*    34: 1  1 */
        u8                         unused:6;             /*    34: 2  1 */

        /* XXX 5 bytes hole, try to pack */

        struct callback_head rcu __attribute__((__aligned__(8))); /*    40    16 */

        /* size: 56, cachelines: 1, members: 12 */
        /* sum members: 50, holes: 1, sum holes: 5 */
        /* sum bitfield members: 8 bits (1 bytes) */
        /* forced alignments: 1, forced holes: 1, sum forced holes: 5 */
        /* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/ip_fib.h           |  4 ++++
 include/uapi/linux/rtnetlink.h |  2 ++
 net/ipv4/fib_lookup.h          |  3 +++
 net/ipv4/fib_semantics.c       |  7 ++++++
 net/ipv4/fib_trie.c            | 52 ++++++++++++++++++++++++++++++++++++++++++
 net/ipv4/route.c               | 19 +++++++++++++++
 6 files changed, 87 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/include/net/ip_fib.h b/include/net/ip_fib.h
index 0c071c820e33..6a1ae49809de 100644
--- a/include/net/ip_fib.h
+++ b/include/net/ip_fib.h
@@ -211,6 +211,9 @@ struct fib_rt_info {
 	int			dst_len;
 	u8			tos;
 	u8			type;
+	u8			offload:1,
+				trap:1,
+				unused:6;
 };
 
 struct fib_entry_notifier_info {
@@ -473,6 +476,7 @@ int fib_nh_common_init(struct fib_nh_common *nhc, struct nlattr *fc_encap,
 void fib_nh_common_release(struct fib_nh_common *nhc);
 
 /* Exported by fib_trie.c */
+void fib_alias_hw_flags_set(struct net *net, const struct fib_rt_info *fri);
 void fib_trie_init(void);
 struct fib_table *fib_trie_table(u32 id, struct fib_table *alias);
 
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 1418a8362bb7..cd43321d20dd 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -309,6 +309,8 @@ enum rt_scope_t {
 #define RTM_F_PREFIX		0x800	/* Prefix addresses		*/
 #define RTM_F_LOOKUP_TABLE	0x1000	/* set rtm_table to FIB lookup result */
 #define RTM_F_FIB_MATCH	        0x2000	/* return full fib lookup match */
+#define RTM_F_OFFLOAD		0x4000	/* route is offloaded */
+#define RTM_F_TRAP		0x8000	/* route is trapping packets */
 
 /* Reserved table identifiers */
 
diff --git a/net/ipv4/fib_lookup.h b/net/ipv4/fib_lookup.h
index a4b829358bfa..c092e9a55790 100644
--- a/net/ipv4/fib_lookup.h
+++ b/net/ipv4/fib_lookup.h
@@ -16,6 +16,9 @@ struct fib_alias {
 	u8			fa_slen;
 	u32			tb_id;
 	s16			fa_default;
+	u8			offload:1,
+				trap:1,
+				unused:6;
 	struct rcu_head		rcu;
 };
 
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index 3ed1349be428..a803cdd9400a 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -519,6 +519,8 @@ void rtmsg_fib(int event, __be32 key, struct fib_alias *fa,
 	fri.dst_len = dst_len;
 	fri.tos = fa->fa_tos;
 	fri.type = fa->fa_type;
+	fri.offload = fa->offload;
+	fri.trap = fa->trap;
 	err = fib_dump_info(skb, info->portid, seq, event, &fri, nlm_flags);
 	if (err < 0) {
 		/* -EMSGSIZE implies BUG in fib_nlmsg_size() */
@@ -1801,6 +1803,11 @@ int fib_dump_info(struct sk_buff *skb, u32 portid, u32 seq, int event,
 			goto nla_put_failure;
 	}
 
+	if (fri->offload)
+		rtm->rtm_flags |= RTM_F_OFFLOAD;
+	if (fri->trap)
+		rtm->rtm_flags |= RTM_F_TRAP;
+
 	nlmsg_end(skb, nlh);
 	return 0;
 
diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 75af3f8ae50e..6ce1f2bbffd0 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1012,6 +1012,52 @@ static struct fib_alias *fib_find_alias(struct hlist_head *fah, u8 slen,
 	return NULL;
 }
 
+static struct fib_alias *
+fib_find_matching_alias(struct net *net, const struct fib_rt_info *fri)
+{
+	u8 slen = KEYLENGTH - fri->dst_len;
+	struct key_vector *l, *tp;
+	struct fib_table *tb;
+	struct fib_alias *fa;
+	struct trie *t;
+
+	tb = fib_get_table(net, fri->tb_id);
+	if (!tb)
+		return NULL;
+
+	t = (struct trie *)tb->tb_data;
+	l = fib_find_node(t, &tp, be32_to_cpu(fri->dst));
+	if (!l)
+		return NULL;
+
+	hlist_for_each_entry_rcu(fa, &l->leaf, fa_list) {
+		if (fa->fa_slen == slen && fa->tb_id == fri->tb_id &&
+		    fa->fa_tos == fri->tos && fa->fa_info == fri->fi &&
+		    fa->fa_type == fri->type)
+			return fa;
+	}
+
+	return NULL;
+}
+
+void fib_alias_hw_flags_set(struct net *net, const struct fib_rt_info *fri)
+{
+	struct fib_alias *fa_match;
+
+	rcu_read_lock();
+
+	fa_match = fib_find_matching_alias(net, fri);
+	if (!fa_match)
+		goto out;
+
+	fa_match->offload = fri->offload;
+	fa_match->trap = fri->trap;
+
+out:
+	rcu_read_unlock();
+}
+EXPORT_SYMBOL_GPL(fib_alias_hw_flags_set);
+
 static void trie_rebalance(struct trie *t, struct key_vector *tn)
 {
 	while (!IS_TRIE(tn))
@@ -1220,6 +1266,8 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
 			new_fa->fa_slen = fa->fa_slen;
 			new_fa->tb_id = tb->tb_id;
 			new_fa->fa_default = -1;
+			new_fa->offload = 0;
+			new_fa->trap = 0;
 
 			hlist_replace_rcu(&fa->fa_list, &new_fa->fa_list);
 
@@ -1278,6 +1326,8 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
 	new_fa->fa_slen = slen;
 	new_fa->tb_id = tb->tb_id;
 	new_fa->fa_default = -1;
+	new_fa->offload = 0;
+	new_fa->trap = 0;
 
 	/* Insert new entry to the list. */
 	err = fib_insert_alias(t, tp, l, new_fa, fa, key);
@@ -2202,6 +2252,8 @@ static int fn_trie_dump_leaf(struct key_vector *l, struct fib_table *tb,
 				fri.dst_len = KEYLENGTH - fa->fa_slen;
 				fri.tos = fa->fa_tos;
 				fri.type = fa->fa_type;
+				fri.offload = fa->offload;
+				fri.trap = fa->trap;
 				err = fib_dump_info(skb,
 						    NETLINK_CB(cb->skb).portid,
 						    cb->nlh->nlmsg_seq,
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 167a7357d12a..2010888e68ca 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -3237,6 +3237,25 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 		fri.dst_len = res.prefixlen;
 		fri.tos = fl4.flowi4_tos;
 		fri.type = rt->rt_type;
+		fri.offload = 0;
+		fri.trap = 0;
+		if (res.fa_head) {
+			struct fib_alias *fa;
+
+			hlist_for_each_entry_rcu(fa, res.fa_head, fa_list) {
+				u8 slen = 32 - fri.dst_len;
+
+				if (fa->fa_slen == slen &&
+				    fa->tb_id == fri.tb_id &&
+				    fa->fa_tos == fri.tos &&
+				    fa->fa_info == res.fi &&
+				    fa->fa_type == fri.type) {
+					fri.offload = fa->offload;
+					fri.trap = fa->trap;
+					break;
+				}
+			}
+		}
 		err = fib_dump_info(skb, NETLINK_CB(in_skb).portid,
 				    nlh->nlmsg_seq, RTM_NEWROUTE, &fri, 0);
 	} else {
-- 
cgit v1.2.3


From 8dcea187088bce5d2f1149294ad109f022653547 Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Tue, 14 Jan 2020 19:56:09 +0200
Subject: net: bridge: vlan: add rtm definitions and dump support

This patch adds vlan rtm definitions:
 - NEWVLAN: to be used for creating vlans, setting options and
   notifications
 - DELVLAN: to be used for deleting vlans
 - GETVLAN: used for dumping vlan information

Dumping vlans, which can span multiple messages, is added now with basic
information (vid and flags). We use nlmsg_parse() to validate the header
length so that the message can be extended with filtering attributes
later.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/if_bridge.h |  28 ++++++++
 include/uapi/linux/rtnetlink.h |   7 ++
 net/bridge/br_netlink.c        |   2 +
 net/bridge/br_private.h        |  14 ++++
 net/bridge/br_vlan.c           | 148 +++++++++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |   5 +-
 6 files changed, 203 insertions(+), 1 deletion(-)
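
Note: the selinux BUILD_BUG_ON() update follows from the RTM_MAX
rounding rule. The sketch here reproduces that arithmetic with local
EX_ prefixed names (hypothetical, chosen to avoid clashing with the real
uapi header); only the value 112 and the RTM_MAX formula come from this
patch and its context lines.

```c
#include <assert.h>

/* Tail of the RTM_* enum as it looks after this patch. */
enum {
	EX_RTM_NEWVLAN = 112,
	EX_RTM_DELVLAN,		/* 113 */
	EX_RTM_GETVLAN,		/* 114 */
	EX__RTM_MAX,		/* 115 */
};

/* Same rounding as the uapi definition: round __RTM_MAX up to a
 * multiple of 4, then subtract 1.
 */
#define EX_RTM_MAX	(((EX__RTM_MAX + 3) & ~3) - 1)

/* Mirrors the updated check in selinux_nlmsg_lookup(). */
_Static_assert(EX_RTM_MAX == EX_RTM_NEWVLAN + 3,
	       "RTM_MAX and the last RTM_NEW* message are out of sync");

/* Run-time view of the same constant, handy for printing. */
static int ex_rtm_max(void)
{
	return EX_RTM_MAX;
}
```

With __RTM_MAX at 115, (115 + 3) & ~3 is 116, so RTM_MAX is 115, which
equals RTM_NEWVLAN + 3.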

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 4a58e3d7de46..4da04f77d9ee 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -165,6 +165,34 @@ struct bridge_stp_xstats {
 	__u64 tx_tcn;
 };
 
+/* Bridge vlan RTM header */
+struct br_vlan_msg {
+	__u8 family;
+	__u8 reserved1;
+	__u16 reserved2;
+	__u32 ifindex;
+};
+
+/* Bridge vlan RTM attributes
+ * [BRIDGE_VLANDB_ENTRY] = {
+ *     [BRIDGE_VLANDB_ENTRY_INFO]
+ *     ...
+ * }
+ */
+enum {
+	BRIDGE_VLANDB_UNSPEC,
+	BRIDGE_VLANDB_ENTRY,
+	__BRIDGE_VLANDB_MAX,
+};
+#define BRIDGE_VLANDB_MAX (__BRIDGE_VLANDB_MAX - 1)
+
+enum {
+	BRIDGE_VLANDB_ENTRY_UNSPEC,
+	BRIDGE_VLANDB_ENTRY_INFO,
+	__BRIDGE_VLANDB_ENTRY_MAX,
+};
+#define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1)
+
 /* Bridge multicast database attributes
  * [MDBA_MDB] = {
  *     [MDBA_MDB_ENTRY] = {
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index cd43321d20dd..b333cff14071 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -171,6 +171,13 @@ enum {
 	RTM_GETLINKPROP,
 #define RTM_GETLINKPROP	RTM_GETLINKPROP
 
+	RTM_NEWVLAN = 112,
+#define RTM_NEWNVLAN	RTM_NEWVLAN
+	RTM_DELVLAN,
+#define RTM_DELVLAN	RTM_DELVLAN
+	RTM_GETVLAN,
+#define RTM_GETVLAN	RTM_GETVLAN
+
 	__RTM_MAX,
 #define RTM_MAX		(((__RTM_MAX + 3) & ~3) - 1)
 };
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c
index 40942cece51a..75a7ecf95d7f 100644
--- a/net/bridge/br_netlink.c
+++ b/net/bridge/br_netlink.c
@@ -1657,6 +1657,7 @@ int __init br_netlink_init(void)
 	int err;
 
 	br_mdb_init();
+	br_vlan_rtnl_init();
 	rtnl_af_register(&br_af_ops);
 
 	err = rtnl_link_register(&br_link_ops);
@@ -1674,6 +1675,7 @@ out_af:
 void br_netlink_fini(void)
 {
 	br_mdb_uninit();
+	br_vlan_rtnl_uninit();
 	rtnl_af_unregister(&br_af_ops);
 	rtnl_link_unregister(&br_link_ops);
 }
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index a7dddc5d7790..1c00411ae938 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -958,6 +958,8 @@ void br_vlan_get_stats(const struct net_bridge_vlan *v,
 void br_vlan_port_event(struct net_bridge_port *p, unsigned long event);
 int br_vlan_bridge_event(struct net_device *dev, unsigned long event,
 			 void *ptr);
+void br_vlan_rtnl_init(void);
+void br_vlan_rtnl_uninit(void);
 
 static inline struct net_bridge_vlan_group *br_vlan_group(
 					const struct net_bridge *br)
@@ -1009,6 +1011,10 @@ static inline u16 br_get_pvid(const struct net_bridge_vlan_group *vg)
 	return vg->pvid;
 }
 
+static inline u16 br_vlan_flags(const struct net_bridge_vlan *v, u16 pvid)
+{
+	return v->vid == pvid ? v->flags | BRIDGE_VLAN_INFO_PVID : v->flags;
+}
 #else
 static inline bool br_allowed_ingress(const struct net_bridge *br,
 				      struct net_bridge_vlan_group *vg,
@@ -1152,6 +1158,14 @@ static inline int br_vlan_bridge_event(struct net_device *dev,
 {
 	return 0;
 }
+
+static inline void br_vlan_rtnl_init(void)
+{
+}
+
+static inline void br_vlan_rtnl_uninit(void)
+{
+}
 #endif
 
 struct nf_br_ops {
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index bb98984cd27d..5f2ac4f244f5 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -1505,3 +1505,151 @@ void br_vlan_port_event(struct net_bridge_port *p, unsigned long event)
 		break;
 	}
 }
+
+static bool br_vlan_fill_vids(struct sk_buff *skb, u16 vid, u16 flags)
+{
+	struct bridge_vlan_info info;
+	struct nlattr *nest;
+
+	nest = nla_nest_start(skb, BRIDGE_VLANDB_ENTRY);
+	if (!nest)
+		return false;
+
+	memset(&info, 0, sizeof(info));
+	info.vid = vid;
+	if (flags & BRIDGE_VLAN_INFO_UNTAGGED)
+		info.flags |= BRIDGE_VLAN_INFO_UNTAGGED;
+	if (flags & BRIDGE_VLAN_INFO_PVID)
+		info.flags |= BRIDGE_VLAN_INFO_PVID;
+
+	if (nla_put(skb, BRIDGE_VLANDB_ENTRY_INFO, sizeof(info), &info))
+		goto out_err;
+
+	nla_nest_end(skb, nest);
+
+	return true;
+
+out_err:
+	nla_nest_cancel(skb, nest);
+	return false;
+}
+
+static int br_vlan_dump_dev(const struct net_device *dev,
+			    struct sk_buff *skb,
+			    struct netlink_callback *cb)
+{
+	struct net_bridge_vlan_group *vg;
+	int idx = 0, s_idx = cb->args[1];
+	struct nlmsghdr *nlh = NULL;
+	struct net_bridge_vlan *v;
+	struct net_bridge_port *p;
+	struct br_vlan_msg *bvm;
+	struct net_bridge *br;
+	int err = 0;
+	u16 pvid;
+
+	if (!netif_is_bridge_master(dev) && !netif_is_bridge_port(dev))
+		return -EINVAL;
+
+	if (netif_is_bridge_master(dev)) {
+		br = netdev_priv(dev);
+		vg = br_vlan_group_rcu(br);
+		p = NULL;
+	} else {
+		p = br_port_get_rcu(dev);
+		if (WARN_ON(!p))
+			return -EINVAL;
+		vg = nbp_vlan_group_rcu(p);
+		br = p->br;
+	}
+
+	if (!vg)
+		return 0;
+
+	nlh = nlmsg_put(skb, NETLINK_CB(cb->skb).portid, cb->nlh->nlmsg_seq,
+			RTM_NEWVLAN, sizeof(*bvm), NLM_F_MULTI);
+	if (!nlh)
+		return -EMSGSIZE;
+	bvm = nlmsg_data(nlh);
+	memset(bvm, 0, sizeof(*bvm));
+	bvm->family = PF_BRIDGE;
+	bvm->ifindex = dev->ifindex;
+	pvid = br_get_pvid(vg);
+
+	list_for_each_entry_rcu(v, &vg->vlan_list, vlist) {
+		if (!br_vlan_should_use(v))
+			continue;
+		if (idx < s_idx)
+			goto skip;
+		if (!br_vlan_fill_vids(skb, v->vid, br_vlan_flags(v, pvid))) {
+			err = -EMSGSIZE;
+			break;
+		}
+skip:
+		idx++;
+	}
+	if (err)
+		cb->args[1] = idx;
+	else
+		cb->args[1] = 0;
+	nlmsg_end(skb, nlh);
+
+	return err;
+}
+
+static int br_vlan_rtm_dump(struct sk_buff *skb, struct netlink_callback *cb)
+{
+	int idx = 0, err = 0, s_idx = cb->args[0];
+	struct net *net = sock_net(skb->sk);
+	struct br_vlan_msg *bvm;
+	struct net_device *dev;
+
+	err = nlmsg_parse(cb->nlh, sizeof(*bvm), NULL, 0, NULL, cb->extack);
+	if (err < 0)
+		return err;
+
+	bvm = nlmsg_data(cb->nlh);
+
+	rcu_read_lock();
+	if (bvm->ifindex) {
+		dev = dev_get_by_index_rcu(net, bvm->ifindex);
+		if (!dev) {
+			err = -ENODEV;
+			goto out_err;
+		}
+		err = br_vlan_dump_dev(dev, skb, cb);
+		if (err && err != -EMSGSIZE)
+			goto out_err;
+	} else {
+		for_each_netdev_rcu(net, dev) {
+			if (idx < s_idx)
+				goto skip;
+
+			err = br_vlan_dump_dev(dev, skb, cb);
+			if (err == -EMSGSIZE)
+				break;
+skip:
+			idx++;
+		}
+	}
+	cb->args[0] = idx;
+	rcu_read_unlock();
+
+	return skb->len;
+
+out_err:
+	rcu_read_unlock();
+
+	return err;
+}
+
+void br_vlan_rtnl_init(void)
+{
+	rtnl_register_module(THIS_MODULE, PF_BRIDGE, RTM_GETVLAN, NULL,
+			     br_vlan_rtm_dump, 0);
+}
+
+void br_vlan_rtnl_uninit(void)
+{
+	rtnl_unregister(PF_BRIDGE, RTM_GETVLAN);
+}
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index c97fdae8f71b..b69231918686 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -85,6 +85,9 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_GETNEXTHOP,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 	{ RTM_NEWLINKPROP,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELLINKPROP,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+	{ RTM_NEWVLAN,		NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+	{ RTM_DELVLAN,		NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
+	{ RTM_GETVLAN,		NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 };
 
 static const struct nlmsg_perm nlmsg_tcpdiag_perms[] =
@@ -168,7 +171,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm)
 		 * structures at the top of this file with the new mappings
 		 * before updating the BUILD_BUG_ON() macro!
 		 */
-		BUILD_BUG_ON(RTM_MAX != (RTM_NEWLINKPROP + 3));
+		BUILD_BUG_ON(RTM_MAX != (RTM_NEWVLAN + 3));
 		err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms,
 				 sizeof(nlmsg_route_perms));
 		break;
-- 
cgit v1.2.3


From 0ab558795184f87b532363e262357160d25f5d64 Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Tue, 14 Jan 2020 19:56:12 +0200
Subject: net: bridge: vlan: add rtm range support

Add a new vlandb nl attribute - BRIDGE_VLANDB_ENTRY_RANGE which causes
RTM_NEWVLAN/DELVLAN to act on a range. Dumps now automatically compress
similar vlans into ranges. This will also be used when per-vlan options
are introduced: vlans whose options match will be put into a single
range which is encapsulated in one netlink attribute. We need to run
checks similar to those in br_process_vlan_info() because these ranges
will be used for options setting and they'll be able to skip
br_process_vlan_info().

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/if_bridge.h |  1 +
 net/bridge/br_vlan.c           | 86 +++++++++++++++++++++++++++++++++++-------
 2 files changed, 73 insertions(+), 14 deletions(-)
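
Note: the range compression described in the commit message can be
sketched in plain C as follows. This is a simplified userspace model,
not the kernel code: the struct and function names are hypothetical,
the input is assumed to be sorted by ascending vid, and each entry is
assumed to carry its effective flags (PVID bit included), so a pvid
entry never merges with its neighbours.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one vlan entry: id plus effective flags. */
struct ex_vlan {
	uint16_t vid;
	uint16_t flags;
};

/* One compressed range, start == end for a single vlan. */
struct ex_range {
	uint16_t start;
	uint16_t end;
	uint16_t flags;
};

/* Compress in[0..n) into ranges of consecutive vids with equal flags.
 * Returns the number of ranges written to out[], at most out_len.
 */
static size_t ex_compress_vlans(const struct ex_vlan *in, size_t n,
				struct ex_range *out, size_t out_len)
{
	size_t i, nr = 0;

	for (i = 0; i < n; i++) {
		/* same test as "can enter range": next vid, same flags */
		if (nr > 0 &&
		    in[i].vid - out[nr - 1].end == 1 &&
		    in[i].flags == out[nr - 1].flags) {
			out[nr - 1].end = in[i].vid;
			continue;
		}
		if (nr == out_len)
			break;
		out[nr].start = in[i].vid;
		out[nr].end = in[i].vid;
		out[nr].flags = in[i].flags;
		nr++;
	}

	return nr;
}
```

For the input vids 1, 2, 3, 5 (flags 0) and 6, 7 (flags 4) this yields
three ranges: 1-3, 5-5 and 6-7.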

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index 4da04f77d9ee..ac38f0b674b8 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -189,6 +189,7 @@ enum {
 enum {
 	BRIDGE_VLANDB_ENTRY_UNSPEC,
 	BRIDGE_VLANDB_ENTRY_INFO,
+	BRIDGE_VLANDB_ENTRY_RANGE,
 	__BRIDGE_VLANDB_ENTRY_MAX,
 };
 #define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1)
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 89d5fa75c575..9d64a86f2cbd 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -1506,7 +1506,8 @@ void br_vlan_port_event(struct net_bridge_port *p, unsigned long event)
 	}
 }
 
-static bool br_vlan_fill_vids(struct sk_buff *skb, u16 vid, u16 flags)
+static bool br_vlan_fill_vids(struct sk_buff *skb, u16 vid, u16 vid_range,
+			      u16 flags)
 {
 	struct bridge_vlan_info info;
 	struct nlattr *nest;
@@ -1525,6 +1526,11 @@ static bool br_vlan_fill_vids(struct sk_buff *skb, u16 vid, u16 flags)
 	if (nla_put(skb, BRIDGE_VLANDB_ENTRY_INFO, sizeof(info), &info))
 		goto out_err;
 
+	if (vid_range && vid < vid_range &&
+	    !(flags & BRIDGE_VLAN_INFO_PVID) &&
+	    nla_put_u16(skb, BRIDGE_VLANDB_ENTRY_RANGE, vid_range))
+		goto out_err;
+
 	nla_nest_end(skb, nest);
 
 	return true;
@@ -1534,14 +1540,22 @@ out_err:
 	return false;
 }
 
+/* check if v_curr can enter a range ending in range_end */
+static bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr,
+				    const struct net_bridge_vlan *range_end)
+{
+	return v_curr->vid - range_end->vid == 1 &&
+	       range_end->flags == v_curr->flags;
+}
+
 static int br_vlan_dump_dev(const struct net_device *dev,
 			    struct sk_buff *skb,
 			    struct netlink_callback *cb)
 {
+	struct net_bridge_vlan *v, *range_start = NULL, *range_end = NULL;
 	struct net_bridge_vlan_group *vg;
 	int idx = 0, s_idx = cb->args[1];
 	struct nlmsghdr *nlh = NULL;
-	struct net_bridge_vlan *v;
 	struct net_bridge_port *p;
 	struct br_vlan_msg *bvm;
 	struct net_bridge *br;
@@ -1576,22 +1590,49 @@ static int br_vlan_dump_dev(const struct net_device *dev,
 	bvm->ifindex = dev->ifindex;
 	pvid = br_get_pvid(vg);
 
+	/* idx must stay at range's beginning until it is filled in */
 	list_for_each_entry_rcu(v, &vg->vlan_list, vlist) {
 		if (!br_vlan_should_use(v))
 			continue;
-		if (idx < s_idx)
-			goto skip;
-		if (!br_vlan_fill_vids(skb, v->vid, br_vlan_flags(v, pvid))) {
-			err = -EMSGSIZE;
-			break;
+		if (idx < s_idx) {
+			idx++;
+			continue;
 		}
-skip:
-		idx++;
+
+		if (!range_start) {
+			range_start = v;
+			range_end = v;
+			continue;
+		}
+
+		if (v->vid == pvid || !br_vlan_can_enter_range(v, range_end)) {
+			u16 flags = br_vlan_flags(range_start, pvid);
+
+			if (!br_vlan_fill_vids(skb, range_start->vid,
+					       range_end->vid, flags)) {
+				err = -EMSGSIZE;
+				break;
+			}
+			/* advance number of filled vlans */
+			idx += range_end->vid - range_start->vid + 1;
+
+			range_start = v;
+		}
+		range_end = v;
 	}
-	if (err)
-		cb->args[1] = idx;
-	else
-		cb->args[1] = 0;
+
+	/* err will be 0 and range_start will be set in 3 cases here:
+	 * - first vlan (range_start == range_end)
+	 * - last vlan (range_start == range_end, not in range)
+	 * - last vlan range (range_start != range_end, in range)
+	 */
+	if (!err && range_start &&
+	    !br_vlan_fill_vids(skb, range_start->vid, range_end->vid,
+			       br_vlan_flags(range_start, pvid)))
+		err = -EMSGSIZE;
+
+	cb->args[1] = err ? idx : 0;
+
 	nlmsg_end(skb, nlh);
 
 	return err;
@@ -1646,13 +1687,14 @@ out_err:
 static const struct nla_policy br_vlan_db_policy[BRIDGE_VLANDB_ENTRY_MAX + 1] = {
 	[BRIDGE_VLANDB_ENTRY_INFO]	= { .type = NLA_EXACT_LEN,
 					    .len = sizeof(struct bridge_vlan_info) },
+	[BRIDGE_VLANDB_ENTRY_RANGE]	= { .type = NLA_U16 },
 };
 
 static int br_vlan_rtm_process_one(struct net_device *dev,
 				   const struct nlattr *attr,
 				   int cmd, struct netlink_ext_ack *extack)
 {
-	struct bridge_vlan_info *vinfo, *vinfo_last = NULL;
+	struct bridge_vlan_info *vinfo, vrange_end, *vinfo_last = NULL;
 	struct nlattr *tb[BRIDGE_VLANDB_ENTRY_MAX + 1];
 	struct net_bridge_vlan_group *vg;
 	struct net_bridge_port *p = NULL;
@@ -1683,6 +1725,7 @@ static int br_vlan_rtm_process_one(struct net_device *dev,
 		NL_SET_ERR_MSG_MOD(extack, "Missing vlan entry info");
 		return -EINVAL;
 	}
+	memset(&vrange_end, 0, sizeof(vrange_end));
 
 	vinfo = nla_data(tb[BRIDGE_VLANDB_ENTRY_INFO]);
 	if (vinfo->flags & (BRIDGE_VLAN_INFO_RANGE_BEGIN |
@@ -1693,6 +1736,21 @@ static int br_vlan_rtm_process_one(struct net_device *dev,
 	if (!br_vlan_valid_id(vinfo->vid, extack))
 		return -EINVAL;
 
+	if (tb[BRIDGE_VLANDB_ENTRY_RANGE]) {
+		vrange_end.vid = nla_get_u16(tb[BRIDGE_VLANDB_ENTRY_RANGE]);
+		/* validate user-provided flags without RANGE_BEGIN */
+		vrange_end.flags = BRIDGE_VLAN_INFO_RANGE_END | vinfo->flags;
+		vinfo->flags |= BRIDGE_VLAN_INFO_RANGE_BEGIN;
+
+		/* vinfo_last is the range start, vinfo the range end */
+		vinfo_last = vinfo;
+		vinfo = &vrange_end;
+
+		if (!br_vlan_valid_id(vinfo->vid, extack) ||
+		    !br_vlan_valid_range(vinfo, vinfo_last, extack))
+			return -EINVAL;
+	}
+
 	switch (cmd) {
 	case RTM_NEWVLAN:
 		cmdmap = RTM_SETLINK;
-- 
cgit v1.2.3


From cf5bddb95cbe5340e486e84d46cf2f0bb324078c Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Tue, 14 Jan 2020 19:56:13 +0200
Subject: net: bridge: vlan: add rtnetlink group and notify support

Add a new rtnetlink group for bridge vlan notifications - RTNLGRP_BRVLAN
and add support for sending vlan notifications (both single and ranges).
No functional changes intended; the notification support will be used by
later patches.

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/rtnetlink.h |  2 ++
 net/bridge/br_private.h        | 11 ++++++
 net/bridge/br_vlan.c           | 79 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 92 insertions(+)
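
Note: the payload size computed by rtnl_vlan_nlmsg_size() can be worked
out by hand. The sketch here redoes the sum with local macros that
follow the usual netlink rules (4-byte alignment, 4-byte attribute
header). The EX_ names and the two mirror structs are hypothetical; the
struct layouts are assumed to match br_vlan_msg and bridge_vlan_info.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_ALIGN4(len)		((((size_t)(len)) + 3) & ~(size_t)3)
#define EX_NLA_HDRLEN		4	/* aligned size of struct nlattr */
#define EX_NLA_TOTAL_SIZE(n)	EX_ALIGN4(EX_NLA_HDRLEN + (n))

/* Assumed mirror of struct br_vlan_msg (8 bytes). */
struct ex_br_vlan_msg {
	uint8_t family;
	uint8_t reserved1;
	uint16_t reserved2;
	uint32_t ifindex;
};

/* Assumed mirror of struct bridge_vlan_info (4 bytes). */
struct ex_bridge_vlan_info {
	uint16_t flags;
	uint16_t vid;
};

/* Same four terms as rtnl_vlan_nlmsg_size() in this patch. */
static size_t ex_vlan_nlmsg_size(void)
{
	return EX_ALIGN4(sizeof(struct ex_br_vlan_msg))
		+ EX_NLA_TOTAL_SIZE(0)	/* nest header */
		+ EX_NLA_TOTAL_SIZE(sizeof(uint16_t))	/* range */
		+ EX_NLA_TOTAL_SIZE(sizeof(struct ex_bridge_vlan_info));
}
```

Under these assumptions the terms are 8 + 4 + 8 + 8, so one
notification needs 28 bytes of payload on top of the netlink header.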

diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index b333cff14071..4a8c5b745157 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -730,6 +730,8 @@ enum rtnetlink_groups {
 #define RTNLGRP_IPV6_MROUTE_R	RTNLGRP_IPV6_MROUTE_R
 	RTNLGRP_NEXTHOP,
 #define RTNLGRP_NEXTHOP		RTNLGRP_NEXTHOP
+	RTNLGRP_BRVLAN,
+#define RTNLGRP_BRVLAN		RTNLGRP_BRVLAN
 	__RTNLGRP_MAX
 };
 #define RTNLGRP_MAX	(__RTNLGRP_MAX - 1)
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index ee3871dea68f..ba162c8197da 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -960,6 +960,10 @@ int br_vlan_bridge_event(struct net_device *dev, unsigned long event,
 			 void *ptr);
 void br_vlan_rtnl_init(void);
 void br_vlan_rtnl_uninit(void);
+void br_vlan_notify(const struct net_bridge *br,
+		    const struct net_bridge_port *p,
+		    u16 vid, u16 vid_range,
+		    int cmd);
 
 static inline struct net_bridge_vlan_group *br_vlan_group(
 					const struct net_bridge *br)
@@ -1166,6 +1170,13 @@ static inline void br_vlan_rtnl_init(void)
 static inline void br_vlan_rtnl_uninit(void)
 {
 }
+
+static inline void br_vlan_notify(const struct net_bridge *br,
+				  const struct net_bridge_port *p,
+				  u16 vid, u16 vid_range,
+				  int cmd)
+{
+}
 #endif
 
 struct nf_br_ops {
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 9d64a86f2cbd..5d52a2604547 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -1540,6 +1540,85 @@ out_err:
 	return false;
 }
 
+static size_t rtnl_vlan_nlmsg_size(void)
+{
+	return NLMSG_ALIGN(sizeof(struct br_vlan_msg))
+		+ nla_total_size(0) /* BRIDGE_VLANDB_ENTRY */
+		+ nla_total_size(sizeof(u16)) /* BRIDGE_VLANDB_ENTRY_RANGE */
+		+ nla_total_size(sizeof(struct bridge_vlan_info)); /* BRIDGE_VLANDB_ENTRY_INFO */
+}
+
+void br_vlan_notify(const struct net_bridge *br,
+		    const struct net_bridge_port *p,
+		    u16 vid, u16 vid_range,
+		    int cmd)
+{
+	struct net_bridge_vlan_group *vg;
+	struct net_bridge_vlan *v;
+	struct br_vlan_msg *bvm;
+	struct nlmsghdr *nlh;
+	struct sk_buff *skb;
+	int err = -ENOBUFS;
+	struct net *net;
+	u16 flags = 0;
+	int ifindex;
+
+	/* right now notifications are done only with rtnl held */
+	ASSERT_RTNL();
+
+	if (p) {
+		ifindex = p->dev->ifindex;
+		vg = nbp_vlan_group(p);
+		net = dev_net(p->dev);
+	} else {
+		ifindex = br->dev->ifindex;
+		vg = br_vlan_group(br);
+		net = dev_net(br->dev);
+	}
+
+	skb = nlmsg_new(rtnl_vlan_nlmsg_size(), GFP_KERNEL);
+	if (!skb)
+		goto out_err;
+
+	err = -EMSGSIZE;
+	nlh = nlmsg_put(skb, 0, 0, cmd, sizeof(*bvm), 0);
+	if (!nlh)
+		goto out_err;
+	bvm = nlmsg_data(nlh);
+	memset(bvm, 0, sizeof(*bvm));
+	bvm->family = AF_BRIDGE;
+	bvm->ifindex = ifindex;
+
+	switch (cmd) {
+	case RTM_NEWVLAN:
+		/* need to find the vlan due to flags/options */
+		v = br_vlan_find(vg, vid);
+		if (!v || !br_vlan_should_use(v))
+			goto out_kfree;
+
+		flags = v->flags;
+		if (br_get_pvid(vg) == v->vid)
+			flags |= BRIDGE_VLAN_INFO_PVID;
+		break;
+	case RTM_DELVLAN:
+		break;
+	default:
+		goto out_kfree;
+	}
+
+	if (!br_vlan_fill_vids(skb, vid, vid_range, flags))
+		goto out_err;
+
+	nlmsg_end(skb, nlh);
+	rtnl_notify(skb, net, 0, RTNLGRP_BRVLAN, NULL, GFP_KERNEL);
+	return;
+
+out_err:
+	rtnl_set_sk_err(net, RTNLGRP_BRVLAN, err);
+out_kfree:
+	kfree_skb(skb);
+}
+
 /* check if v_curr can enter a range ending in range_end */
 static bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr,
 				    const struct net_bridge_vlan *range_end)
-- 
cgit v1.2.3


From a6b0ef9a7d03bb78d37c420753741ef8a082160b Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Mon, 6 Jan 2020 12:03:28 -0700
Subject: PCI/switchtec: Add support for Intercomm Notify and Upstream Error
 Containment

Add support for the Inter Fabric Manager Communication (Intercomm) Notify
event in PAX variants of Switchtec hardware and the Upstream Error
Containment port in the MR1 release of Gen3 firmware.

Link: https://lore.kernel.org/r/20200106190337.2428-4-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/switch/switchtec.c       | 3 +++
 include/linux/switchtec.h            | 7 +++++--
 include/uapi/linux/switchtec_ioctl.h | 4 +++-
 3 files changed, 11 insertions(+), 3 deletions(-)
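
Note: both register structs gain new fields by carving them out of
reserved space, so the offsets of all later registers stay put. The
sketch here checks that with local, hypothetical mirror structs that
cover only the changed tail of each register block.

```c
#include <assert.h>
#include <stdint.h>

/* Tail of part_cfg_regs before and after the patch. */
struct ex_part_tail_old {
	uint32_t reserved4[159];
};

struct ex_part_tail_new {
	uint32_t intercomm_notify_hdr;
	uint32_t intercomm_notify_data[5];
	uint32_t reserved4[153];
};

/* Changed slice of pff_csr_regs before and after the patch. */
struct ex_pff_slice_old {
	uint32_t reserved3[6];
};

struct ex_pff_slice_new {
	uint32_t uec_hdr;
	uint32_t uec_data[5];
};

/* 1 + 5 + 153 == 159 and 1 + 5 == 6, so neither layout moves. */
_Static_assert(sizeof(struct ex_part_tail_old) ==
	       sizeof(struct ex_part_tail_new),
	       "part_cfg_regs tail changed size");
_Static_assert(sizeof(struct ex_pff_slice_old) ==
	       sizeof(struct ex_pff_slice_new),
	       "pff_csr_regs slice changed size");
```

If either assertion failed, every register after the new fields would
be read from the wrong MMIO offset.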

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index 9c3ad09d3022..218e67428cf9 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -751,10 +751,13 @@ static const struct event_reg {
 	EV_PAR(SWITCHTEC_IOCTL_EVENT_MRPC_COMP, mrpc_comp_hdr),
 	EV_PAR(SWITCHTEC_IOCTL_EVENT_MRPC_COMP_ASYNC, mrpc_comp_async_hdr),
 	EV_PAR(SWITCHTEC_IOCTL_EVENT_DYN_PART_BIND_COMP, dyn_binding_hdr),
+	EV_PAR(SWITCHTEC_IOCTL_EVENT_INTERCOMM_REQ_NOTIFY,
+	       intercomm_notify_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_AER_IN_P2P, aer_in_p2p_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_AER_IN_VEP, aer_in_vep_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_DPC, dpc_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_CTS, cts_hdr),
+	EV_PFF(SWITCHTEC_IOCTL_EVENT_UEC, uec_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_HOTPLUG, hotplug_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_IER, ier_hdr),
 	EV_PFF(SWITCHTEC_IOCTL_EVENT_THRESH, threshold_hdr),
diff --git a/include/linux/switchtec.h b/include/linux/switchtec.h
index e295515bc3f3..b4ba3a38f30f 100644
--- a/include/linux/switchtec.h
+++ b/include/linux/switchtec.h
@@ -196,7 +196,9 @@ struct part_cfg_regs {
 	u32 mrpc_comp_async_data[5];
 	u32 dyn_binding_hdr;
 	u32 dyn_binding_data[5];
-	u32 reserved4[159];
+	u32 intercomm_notify_hdr;
+	u32 intercomm_notify_data[5];
+	u32 reserved4[153];
 } __packed;
 
 enum {
@@ -320,7 +322,8 @@ struct pff_csr_regs {
 	u32 dpc_data[5];
 	u32 cts_hdr;
 	u32 cts_data[5];
-	u32 reserved3[6];
+	u32 uec_hdr;
+	u32 uec_data[5];
 	u32 hotplug_hdr;
 	u32 hotplug_data[5];
 	u32 ier_hdr;
diff --git a/include/uapi/linux/switchtec_ioctl.h b/include/uapi/linux/switchtec_ioctl.h
index c912b5a678e4..e8db938985ca 100644
--- a/include/uapi/linux/switchtec_ioctl.h
+++ b/include/uapi/linux/switchtec_ioctl.h
@@ -98,7 +98,9 @@ struct switchtec_ioctl_event_summary {
 #define SWITCHTEC_IOCTL_EVENT_CREDIT_TIMEOUT		27
 #define SWITCHTEC_IOCTL_EVENT_LINK_STATE		28
 #define SWITCHTEC_IOCTL_EVENT_GFMS			29
-#define SWITCHTEC_IOCTL_MAX_EVENTS			30
+#define SWITCHTEC_IOCTL_EVENT_INTERCOMM_REQ_NOTIFY	30
+#define SWITCHTEC_IOCTL_EVENT_UEC			31
+#define SWITCHTEC_IOCTL_MAX_EVENTS			32
 
 #define SWITCHTEC_IOCTL_EVENT_LOCAL_PART_IDX -1
 #define SWITCHTEC_IOCTL_EVENT_IDX_ALL -2
-- 
cgit v1.2.3


From fcccd282b633ab9fc7d53ff8ccf82ab5c30a0985 Mon Sep 17 00:00:00 2001
From: Logan Gunthorpe <logang@deltatee.com>
Date: Tue, 14 Jan 2020 20:56:42 -0700
Subject: PCI/switchtec: Rename generation-specific constants

Gen4 hardware will have different values for the SWITCHTEC_X_RUNNING and
SWITCHTEC_IOCTL_NUM_PARTITIONS constants, so rename them with GEN3 in
their names.

No functional changes intended.

Link: https://lore.kernel.org/r/20200115035648.2578-2-logang@deltatee.com
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/switch/switchtec.c       | 10 +++++-----
 include/linux/switchtec.h            |  8 ++++----
 include/uapi/linux/switchtec_ioctl.h |  5 ++++-
 3 files changed, 13 insertions(+), 10 deletions(-)
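
Note: the compatibility define in switchtec_ioctl.h is meant to keep
existing userspace building. A minimal sketch of that, where the two
macros are copied from the hunk and the legacy table plus its helper
are hypothetical stand-ins for old userspace code:

```c
#include <stddef.h>

/* Copied from the switchtec_ioctl.h hunk of this patch. */
#define SWITCHTEC_NUM_PARTITIONS_GEN3	13
/* obsolete: for compatibility with old userspace software */
#define SWITCHTEC_IOCTL_NUM_PARTITIONS	SWITCHTEC_NUM_PARTITIONS_GEN3

/* Hypothetical legacy table, still sized with the old name. */
static int ex_legacy_part_seen[SWITCHTEC_IOCTL_NUM_PARTITIONS];

/* Number of slots in the legacy table. */
static size_t ex_legacy_table_len(void)
{
	return sizeof(ex_legacy_part_seen) /
	       sizeof(ex_legacy_part_seen[0]);
}
```

Old code keeps seeing 13 partitions, while new code can pick the
constant that matches the hardware generation.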

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index 05d4cb49219b..e4d9291864bc 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -569,7 +569,7 @@ static int ioctl_flash_info(struct switchtec_dev *stdev,
 	struct flash_info_regs __iomem *fi = stdev->mmio_flash_info;
 
 	info.flash_length = ioread32(&fi->flash_length);
-	info.num_partitions = SWITCHTEC_IOCTL_NUM_PARTITIONS;
+	info.num_partitions = SWITCHTEC_NUM_PARTITIONS_GEN3;
 
 	if (copy_to_user(uinfo, &info, sizeof(info)))
 		return -EFAULT;
@@ -599,25 +599,25 @@ static int ioctl_flash_part_info(struct switchtec_dev *stdev,
 	case SWITCHTEC_IOCTL_PART_CFG0:
 		active_addr = ioread32(&fi->active_cfg);
 		set_fw_info_part(&info, &fi->cfg0);
-		if (ioread16(&si->cfg_running) == SWITCHTEC_CFG0_RUNNING)
+		if (ioread16(&si->cfg_running) == SWITCHTEC_GEN3_CFG0_RUNNING)
 			info.active |= SWITCHTEC_IOCTL_PART_RUNNING;
 		break;
 	case SWITCHTEC_IOCTL_PART_CFG1:
 		active_addr = ioread32(&fi->active_cfg);
 		set_fw_info_part(&info, &fi->cfg1);
-		if (ioread16(&si->cfg_running) == SWITCHTEC_CFG1_RUNNING)
+		if (ioread16(&si->cfg_running) == SWITCHTEC_GEN3_CFG1_RUNNING)
 			info.active |= SWITCHTEC_IOCTL_PART_RUNNING;
 		break;
 	case SWITCHTEC_IOCTL_PART_IMG0:
 		active_addr = ioread32(&fi->active_img);
 		set_fw_info_part(&info, &fi->img0);
-		if (ioread16(&si->img_running) == SWITCHTEC_IMG0_RUNNING)
+		if (ioread16(&si->img_running) == SWITCHTEC_GEN3_IMG0_RUNNING)
 			info.active |= SWITCHTEC_IOCTL_PART_RUNNING;
 		break;
 	case SWITCHTEC_IOCTL_PART_IMG1:
 		active_addr = ioread32(&fi->active_img);
 		set_fw_info_part(&info, &fi->img1);
-		if (ioread16(&si->img_running) == SWITCHTEC_IMG1_RUNNING)
+		if (ioread16(&si->img_running) == SWITCHTEC_GEN3_IMG1_RUNNING)
 			info.active |= SWITCHTEC_IOCTL_PART_RUNNING;
 		break;
 	case SWITCHTEC_IOCTL_PART_NVLOG:
diff --git a/include/linux/switchtec.h b/include/linux/switchtec.h
index b4ba3a38f30f..4ee450487fe4 100644
--- a/include/linux/switchtec.h
+++ b/include/linux/switchtec.h
@@ -98,10 +98,10 @@ struct sw_event_regs {
 } __packed;
 
 enum {
-	SWITCHTEC_CFG0_RUNNING = 0x04,
-	SWITCHTEC_CFG1_RUNNING = 0x05,
-	SWITCHTEC_IMG0_RUNNING = 0x03,
-	SWITCHTEC_IMG1_RUNNING = 0x07,
+	SWITCHTEC_GEN3_CFG0_RUNNING = 0x04,
+	SWITCHTEC_GEN3_CFG1_RUNNING = 0x05,
+	SWITCHTEC_GEN3_IMG0_RUNNING = 0x03,
+	SWITCHTEC_GEN3_IMG1_RUNNING = 0x07,
 };
 
 struct sys_info_regs {
diff --git a/include/uapi/linux/switchtec_ioctl.h b/include/uapi/linux/switchtec_ioctl.h
index e8db938985ca..4d09cfa2e9e6 100644
--- a/include/uapi/linux/switchtec_ioctl.h
+++ b/include/uapi/linux/switchtec_ioctl.h
@@ -32,7 +32,10 @@
 #define SWITCHTEC_IOCTL_PART_VENDOR5	10
 #define SWITCHTEC_IOCTL_PART_VENDOR6	11
 #define SWITCHTEC_IOCTL_PART_VENDOR7	12
-#define SWITCHTEC_IOCTL_NUM_PARTITIONS	13
+#define SWITCHTEC_NUM_PARTITIONS_GEN3	13
+
+/* obsolete: for compatibility with old userspace software */
+#define SWITCHTEC_IOCTL_NUM_PARTITIONS	SWITCHTEC_NUM_PARTITIONS_GEN3
 
 struct switchtec_ioctl_flash_info {
 	__u64 flash_length;
-- 
cgit v1.2.3


From 4efa1d2e36976d7b26f2e67f4c838330fbc91299 Mon Sep 17 00:00:00 2001
From: Kelvin Cao <kelvin.cao@microchip.com>
Date: Tue, 14 Jan 2020 20:56:47 -0700
Subject: PCI/switchtec: Add Gen4 flash information interface support

Add the new flash_info registers struct and the implementation of
ioctl_flash_part_info() for the new Gen4 hardware.

[logang@deltatee.com: rewrote commit message]
Link: https://lore.kernel.org/r/20200115035648.2578-7-logang@deltatee.com
Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com>
Signed-off-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
---
 drivers/pci/switch/switchtec.c       | 111 +++++++++++++++++++++++++++++++++++
 include/linux/switchtec.h            |  52 ++++++++++++++++
 include/uapi/linux/switchtec_ioctl.h |   8 +++
 3 files changed, 171 insertions(+)
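
Note: for the key, bl2 and cfg partitions, flash_part_info_gen4()
repeats one pattern: compare an "active" register and a "running"
register against per-partition constants and OR the matching status
bits together. The sketch here isolates that pattern. The helper name
and the EX_ status bit values are placeholders; the real
SWITCHTEC_IOCTL_PART_* and SWITCHTEC_GEN4_* values are not shown in
this patch excerpt and are not assumed here.

```c
#include <stdint.h>

/* Placeholder status bits, standing in for the real
 * SWITCHTEC_IOCTL_PART_ACTIVE and SWITCHTEC_IOCTL_PART_RUNNING.
 */
#define EX_PART_ACTIVE	0x1
#define EX_PART_RUNNING	0x2

/* Hypothetical helper: active_reg and running_reg are the values read
 * from the device, active_val and running_val are the constants that
 * identify the partition being queried.
 */
static uint32_t ex_part_status(uint8_t active_reg, uint8_t active_val,
			       uint16_t running_reg, uint16_t running_val)
{
	uint32_t status = 0;

	if (active_reg == active_val)
		status |= EX_PART_ACTIVE;
	if (running_reg == running_val)
		status |= EX_PART_RUNNING;

	return status;
}
```

A partition can be active without running, for example after a new
image has been selected but before the next reset, which is why the two
bits are tracked separately.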

diff --git a/drivers/pci/switch/switchtec.c b/drivers/pci/switch/switchtec.c
index b09c3f53a552..af85d232d200 100644
--- a/drivers/pci/switch/switchtec.c
+++ b/drivers/pci/switch/switchtec.c
@@ -600,6 +600,9 @@ static int ioctl_flash_info(struct switchtec_dev *stdev,
 	if (stdev->gen == SWITCHTEC_GEN3) {
 		info.flash_length = ioread32(&fi->gen3.flash_length);
 		info.num_partitions = SWITCHTEC_NUM_PARTITIONS_GEN3;
+	} else if (stdev->gen == SWITCHTEC_GEN4) {
+		info.flash_length = ioread32(&fi->gen4.flash_length);
+		info.num_partitions = SWITCHTEC_NUM_PARTITIONS_GEN4;
 	} else {
 		return -ENOTSUPP;
 	}
@@ -687,6 +690,110 @@ static int flash_part_info_gen3(struct switchtec_dev *stdev,
 	return 0;
 }
 
+static int flash_part_info_gen4(struct switchtec_dev *stdev,
+		struct switchtec_ioctl_flash_part_info *info)
+{
+	struct flash_info_regs_gen4 __iomem *fi = &stdev->mmio_flash_info->gen4;
+	struct sys_info_regs_gen4 __iomem *si = &stdev->mmio_sys_info->gen4;
+	struct active_partition_info_gen4 __iomem *af = &fi->active_flag;
+
+	switch (info->flash_partition) {
+	case SWITCHTEC_IOCTL_PART_MAP_0:
+		set_fw_info_part(info, &fi->map0);
+		break;
+	case SWITCHTEC_IOCTL_PART_MAP_1:
+		set_fw_info_part(info, &fi->map1);
+		break;
+	case SWITCHTEC_IOCTL_PART_KEY_0:
+		set_fw_info_part(info, &fi->key0);
+		if (ioread8(&af->key) == SWITCHTEC_GEN4_KEY0_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->key_running) == SWITCHTEC_GEN4_KEY0_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_KEY_1:
+		set_fw_info_part(info, &fi->key1);
+		if (ioread8(&af->key) == SWITCHTEC_GEN4_KEY1_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->key_running) == SWITCHTEC_GEN4_KEY1_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_BL2_0:
+		set_fw_info_part(info, &fi->bl2_0);
+		if (ioread8(&af->bl2) == SWITCHTEC_GEN4_BL2_0_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->bl2_running) == SWITCHTEC_GEN4_BL2_0_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_BL2_1:
+		set_fw_info_part(info, &fi->bl2_1);
+		if (ioread8(&af->bl2) == SWITCHTEC_GEN4_BL2_1_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->bl2_running) == SWITCHTEC_GEN4_BL2_1_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_CFG0:
+		set_fw_info_part(info, &fi->cfg0);
+		if (ioread8(&af->cfg) == SWITCHTEC_GEN4_CFG0_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->cfg_running) == SWITCHTEC_GEN4_CFG0_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_CFG1:
+		set_fw_info_part(info, &fi->cfg1);
+		if (ioread8(&af->cfg) == SWITCHTEC_GEN4_CFG1_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->cfg_running) == SWITCHTEC_GEN4_CFG1_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_IMG0:
+		set_fw_info_part(info, &fi->img0);
+		if (ioread8(&af->img) == SWITCHTEC_GEN4_IMG0_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->img_running) == SWITCHTEC_GEN4_IMG0_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_IMG1:
+		set_fw_info_part(info, &fi->img1);
+		if (ioread8(&af->img) == SWITCHTEC_GEN4_IMG1_ACTIVE)
+			info->active |= SWITCHTEC_IOCTL_PART_ACTIVE;
+		if (ioread16(&si->img_running) == SWITCHTEC_GEN4_IMG1_RUNNING)
+			info->active |= SWITCHTEC_IOCTL_PART_RUNNING;
+		break;
+	case SWITCHTEC_IOCTL_PART_NVLOG:
+		set_fw_info_part(info, &fi->nvlog);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR0:
+		set_fw_info_part(info, &fi->vendor[0]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR1:
+		set_fw_info_part(info, &fi->vendor[1]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR2:
+		set_fw_info_part(info, &fi->vendor[2]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR3:
+		set_fw_info_part(info, &fi->vendor[3]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR4:
+		set_fw_info_part(info, &fi->vendor[4]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR5:
+		set_fw_info_part(info, &fi->vendor[5]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR6:
+		set_fw_info_part(info, &fi->vendor[6]);
+		break;
+	case SWITCHTEC_IOCTL_PART_VENDOR7:
+		set_fw_info_part(info, &fi->vendor[7]);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int ioctl_flash_part_info(struct switchtec_dev *stdev,
 		struct switchtec_ioctl_flash_part_info __user *uinfo)
 {
@@ -700,6 +807,10 @@ static int ioctl_flash_part_info(struct switchtec_dev *stdev,
 		ret = flash_part_info_gen3(stdev, &info);
 		if (ret)
 			return ret;
+	} else if (stdev->gen == SWITCHTEC_GEN4) {
+		ret = flash_part_info_gen4(stdev, &info);
+		if (ret)
+			return ret;
 	} else {
 		return -ENOTSUPP;
 	}
diff --git a/include/linux/switchtec.h b/include/linux/switchtec.h
index d012520e5cc5..e85155244135 100644
--- a/include/linux/switchtec.h
+++ b/include/linux/switchtec.h
@@ -109,6 +109,30 @@ enum {
 	SWITCHTEC_GEN3_IMG1_RUNNING = 0x07,
 };
 
+enum {
+	SWITCHTEC_GEN4_MAP0_RUNNING = 0x00,
+	SWITCHTEC_GEN4_MAP1_RUNNING = 0x01,
+	SWITCHTEC_GEN4_KEY0_RUNNING = 0x02,
+	SWITCHTEC_GEN4_KEY1_RUNNING = 0x03,
+	SWITCHTEC_GEN4_BL2_0_RUNNING = 0x04,
+	SWITCHTEC_GEN4_BL2_1_RUNNING = 0x05,
+	SWITCHTEC_GEN4_CFG0_RUNNING = 0x06,
+	SWITCHTEC_GEN4_CFG1_RUNNING = 0x07,
+	SWITCHTEC_GEN4_IMG0_RUNNING = 0x08,
+	SWITCHTEC_GEN4_IMG1_RUNNING = 0x09,
+};
+
+enum {
+	SWITCHTEC_GEN4_KEY0_ACTIVE = 0,
+	SWITCHTEC_GEN4_KEY1_ACTIVE = 1,
+	SWITCHTEC_GEN4_BL2_0_ACTIVE = 0,
+	SWITCHTEC_GEN4_BL2_1_ACTIVE = 1,
+	SWITCHTEC_GEN4_CFG0_ACTIVE = 0,
+	SWITCHTEC_GEN4_CFG1_ACTIVE = 1,
+	SWITCHTEC_GEN4_IMG0_ACTIVE = 0,
+	SWITCHTEC_GEN4_IMG1_ACTIVE = 1,
+};
+
 struct sys_info_regs_gen3 {
 	u32 reserved1;
 	u32 vendor_table_revision;
@@ -205,9 +229,37 @@ struct flash_info_regs_gen3 {
 	struct partition_info vendor[8];
 };
 
+struct flash_info_regs_gen4 {
+	u32 flash_address;
+	u32 flash_length;
+
+	struct active_partition_info_gen4 {
+		unsigned char bl2;
+		unsigned char cfg;
+		unsigned char img;
+		unsigned char key;
+	} active_flag;
+
+	u32 reserved[3];
+
+	struct partition_info map0;
+	struct partition_info map1;
+	struct partition_info key0;
+	struct partition_info key1;
+	struct partition_info bl2_0;
+	struct partition_info bl2_1;
+	struct partition_info cfg0;
+	struct partition_info cfg1;
+	struct partition_info img0;
+	struct partition_info img1;
+	struct partition_info nvlog;
+	struct partition_info vendor[8];
+};
+
 struct flash_info_regs {
 	union {
 		struct flash_info_regs_gen3 gen3;
+		struct flash_info_regs_gen4 gen4;
 	};
 };
 
diff --git a/include/uapi/linux/switchtec_ioctl.h b/include/uapi/linux/switchtec_ioctl.h
index 4d09cfa2e9e6..2c661a3557e5 100644
--- a/include/uapi/linux/switchtec_ioctl.h
+++ b/include/uapi/linux/switchtec_ioctl.h
@@ -32,7 +32,15 @@
 #define SWITCHTEC_IOCTL_PART_VENDOR5	10
 #define SWITCHTEC_IOCTL_PART_VENDOR6	11
 #define SWITCHTEC_IOCTL_PART_VENDOR7	12
+#define SWITCHTEC_IOCTL_PART_BL2_0	13
+#define SWITCHTEC_IOCTL_PART_BL2_1	14
+#define SWITCHTEC_IOCTL_PART_MAP_0	15
+#define SWITCHTEC_IOCTL_PART_MAP_1	16
+#define SWITCHTEC_IOCTL_PART_KEY_0	17
+#define SWITCHTEC_IOCTL_PART_KEY_1	18
+
 #define SWITCHTEC_NUM_PARTITIONS_GEN3	13
+#define SWITCHTEC_NUM_PARTITIONS_GEN4	19
 
 /* obsolete: for compatibility with old userspace software */
 #define SWITCHTEC_IOCTL_NUM_PARTITIONS	SWITCHTEC_NUM_PARTITIONS_GEN3
-- 
cgit v1.2.3


From 8482941f09067da42f9c3362e15bfb3f3c19d610 Mon Sep 17 00:00:00 2001
From: Yonghong Song <yhs@fb.com>
Date: Tue, 14 Jan 2020 19:50:02 -0800
Subject: bpf: Add bpf_send_signal_thread() helper

Commit 8b401f9ed244 ("bpf: implement bpf_send_signal() helper")
added the helper bpf_send_signal(), which permits a bpf program to
send a signal to the current process. The signal may be
delivered to any thread in the process.

We found a use case where sending the signal to the current
thread is preferable:
  - A bpf program will collect the stack trace and then
    send signal to the user application.
  - The user application will add some thread specific
    information to the just collected stack trace for
    later analysis.

If bpf_send_signal() is used, the user application will need
to check whether the thread receiving the signal matches
the thread collecting the stack by checking the thread id.
If not, it will need to send the signal to the other thread
through pthread_kill().

This patch adds a new helper, bpf_send_signal_thread(),
which sends the signal to the thread corresponding to
the current kernel task. This way, user space is guaranteed that
the bpf program execution context and the user space signal handling
context are the same thread.
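
As a rough user space analogy (not part of this patch), the
"same thread" property can be sketched with raise(), which in a
multithreaded program targets the calling thread. The names
on_sigusr1() and thread_directed_signal_stays_local() are made up for
illustration, and syscall(SYS_gettid) is Linux-specific:

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall() */
#endif
#include <signal.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thread id seen by the signal handler; -1 until the handler has run. */
static volatile sig_atomic_t handler_tid = -1;

static void on_sigusr1(int sig)
{
	(void)sig;
	/* Record which thread is actually running the handler. */
	handler_tid = (sig_atomic_t)syscall(SYS_gettid);
}

/*
 * Returns 1 if a thread-directed SIGUSR1 was handled on the calling
 * thread, 0 if it was not, and -1 on setup failure.
 */
static int thread_directed_signal_stays_local(void)
{
	struct sigaction sa = { 0 };
	pid_t self = (pid_t)syscall(SYS_gettid);

	sa.sa_handler = on_sigusr1;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGUSR1, &sa, NULL))
		return -1;

	handler_tid = -1;
	/* raise() returns only after the handler has finished. */
	if (raise(SIGUSR1))
		return -1;

	return handler_tid == (sig_atomic_t)self;
}
```

A process-directed signal (kill(getpid(), ...)) gives no such
guarantee once more than one thread has the signal unblocked, which
is the situation the tid check described above has to handle.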

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115035002.602336-1-yhs@fb.com
---
 include/uapi/linux/bpf.h       | 19 +++++++++++++++++--
 kernel/trace/bpf_trace.c       | 27 ++++++++++++++++++++++++---
 tools/include/uapi/linux/bpf.h | 19 +++++++++++++++++--
 3 files changed, 58 insertions(+), 7 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 52966e758fe5..a88837d4dd18 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2714,7 +2714,8 @@ union bpf_attr {
  *
  * int bpf_send_signal(u32 sig)
  *	Description
- *		Send signal *sig* to the current task.
+ *		Send signal *sig* to the process of the current task.
+ *		The signal may be delivered to any of this process's threads.
  *	Return
  *		0 on success or successfully queued.
  *
@@ -2850,6 +2851,19 @@ union bpf_attr {
  *	Return
  *		0 on success, or a negative error in case of failure.
  *
+ * int bpf_send_signal_thread(u32 sig)
+ *	Description
+ *		Send signal *sig* to the thread corresponding to the current task.
+ *	Return
+ *		0 on success or successfully queued.
+ *
+ *		**-EBUSY** if work queue under nmi is full.
+ *
+ *		**-EINVAL** if *sig* is invalid.
+ *
+ *		**-EPERM** if no permission to send the *sig*.
+ *
+ *		**-EAGAIN** if bpf program can try again.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2968,7 +2982,8 @@ union bpf_attr {
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
 	FN(probe_read_kernel_str),	\
-	FN(tcp_send_ack),
+	FN(tcp_send_ack),		\
+	FN(send_signal_thread),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index e5ef4ae9edb5..19e793aa441a 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -703,6 +703,7 @@ struct send_signal_irq_work {
 	struct irq_work irq_work;
 	struct task_struct *task;
 	u32 sig;
+	enum pid_type type;
 };
 
 static DEFINE_PER_CPU(struct send_signal_irq_work, send_signal_work);
@@ -712,10 +713,10 @@ static void do_bpf_send_signal(struct irq_work *entry)
 	struct send_signal_irq_work *work;
 
 	work = container_of(entry, struct send_signal_irq_work, irq_work);
-	group_send_sig_info(work->sig, SEND_SIG_PRIV, work->task, PIDTYPE_TGID);
+	group_send_sig_info(work->sig, SEND_SIG_PRIV, work->task, work->type);
 }
 
-BPF_CALL_1(bpf_send_signal, u32, sig)
+static int bpf_send_signal_common(u32 sig, enum pid_type type)
 {
 	struct send_signal_irq_work *work = NULL;
 
@@ -748,11 +749,17 @@ BPF_CALL_1(bpf_send_signal, u32, sig)
 		 */
 		work->task = current;
 		work->sig = sig;
+		work->type = type;
 		irq_work_queue(&work->irq_work);
 		return 0;
 	}
 
-	return group_send_sig_info(sig, SEND_SIG_PRIV, current, PIDTYPE_TGID);
+	return group_send_sig_info(sig, SEND_SIG_PRIV, current, type);
+}
+
+BPF_CALL_1(bpf_send_signal, u32, sig)
+{
+	return bpf_send_signal_common(sig, PIDTYPE_TGID);
 }
 
 static const struct bpf_func_proto bpf_send_signal_proto = {
@@ -762,6 +769,18 @@ static const struct bpf_func_proto bpf_send_signal_proto = {
 	.arg1_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_1(bpf_send_signal_thread, u32, sig)
+{
+	return bpf_send_signal_common(sig, PIDTYPE_PID);
+}
+
+static const struct bpf_func_proto bpf_send_signal_thread_proto = {
+	.func		= bpf_send_signal_thread,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -822,6 +841,8 @@ tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 #endif
 	case BPF_FUNC_send_signal:
 		return &bpf_send_signal_proto;
+	case BPF_FUNC_send_signal_thread:
+		return &bpf_send_signal_thread_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 52966e758fe5..a88837d4dd18 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -2714,7 +2714,8 @@ union bpf_attr {
  *
  * int bpf_send_signal(u32 sig)
  *	Description
- *		Send signal *sig* to the current task.
+ *		Send signal *sig* to the process of the current task.
+ *		The signal may be delivered to any of this process's threads.
  *	Return
  *		0 on success or successfully queued.
  *
@@ -2850,6 +2851,19 @@ union bpf_attr {
  *	Return
  *		0 on success, or a negative error in case of failure.
  *
+ * int bpf_send_signal_thread(u32 sig)
+ *	Description
+ *		Send signal *sig* to the thread corresponding to the current task.
+ *	Return
+ *		0 on success or successfully queued.
+ *
+ *		**-EBUSY** if work queue under nmi is full.
+ *
+ *		**-EINVAL** if *sig* is invalid.
+ *
+ *		**-EPERM** if no permission to send the *sig*.
+ *
+ *		**-EAGAIN** if bpf program can try again.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -2968,7 +2982,8 @@ union bpf_attr {
 	FN(probe_read_kernel),		\
 	FN(probe_read_user_str),	\
 	FN(probe_read_kernel_str),	\
-	FN(tcp_send_ack),
+	FN(tcp_send_ack),		\
+	FN(send_signal_thread),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
-- 
cgit v1.2.3


From cb4d03ab499d4c040f4ab6fd4389d2b49f42b5a5 Mon Sep 17 00:00:00 2001
From: Brian Vazquez <brianvv@google.com>
Date: Wed, 15 Jan 2020 10:43:01 -0800
Subject: bpf: Add generic support for lookup batch op

This commit introduces generic support for bpf_map_lookup_batch.
The implementation can be used by almost all bpf maps since its core
relies on the existing map_get_next_key and map_lookup_elem
operations. The bpf syscall subcommand introduced is:

  BPF_MAP_LOOKUP_BATCH

The UAPI attribute is:

  struct { /* struct used by BPF_MAP_*_BATCH commands */
         __aligned_u64   in_batch;       /* start batch,
                                          * NULL to start from beginning
                                          */
         __aligned_u64   out_batch;      /* output: next start batch */
         __aligned_u64   keys;
         __aligned_u64   values;
         __u32           count;          /* input/output:
                                          * input: # of key/value
                                          * elements
                                          * output: # of filled elements
                                          */
         __u32           map_fd;
         __u64           elem_flags;
         __u64           flags;
  } batch;

in_batch/out_batch are opaque values used to communicate between
user and kernel space; both must be of key_size length.

To start iterating from the beginning, in_batch must be NULL;
count is the # of key/value elements to retrieve. Note that the 'keys'
buffer must be a buffer of key_size * count size and the 'values' buffer
must be value_size * count, where value_size must be aligned to 8 bytes
by userspace if it's dealing with percpu maps. 'count' will contain the
number of keys/values successfully retrieved. Note that 'count' is an
input/output variable and it can contain a lower value after a call.

If there are no more entries to retrieve, ENOENT will be returned. In
that case count might still be > 0 if some values were copied before
the end of the map was reached.

Note that if the return code is an error and not -EFAULT,
count indicates the number of elements successfully processed.
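
The in_batch/out_batch/count contract above can be sketched with a toy
user space model. This is not the kernel code: struct toy_map,
toy_get_next() and toy_lookup_batch() are hypothetical stand-ins, keys
are plain u32 values, and copy_{to,from}_user() is replaced by direct
memory access:

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy map: parallel arrays of keys and values. */
struct toy_map {
	const uint32_t *keys;
	const uint64_t *values;
	size_t n;
};

/*
 * Stand-in for map_get_next_key(): a NULL or unknown prev yields the
 * first key; -ENOENT means there is no key after prev.
 */
static int toy_get_next(const struct toy_map *m, const uint32_t *prev,
			size_t *idx)
{
	size_t i = 0;

	if (prev) {
		while (i < m->n && m->keys[i] != *prev)
			i++;
		i = (i == m->n) ? 0 : i + 1;
	}
	if (i >= m->n)
		return -ENOENT;
	*idx = i;
	return 0;
}

/* *count is input (buffer capacity) and output (entries filled). */
static int toy_lookup_batch(const struct toy_map *m, const uint32_t *in_batch,
			    uint32_t *out_batch, uint32_t *keys,
			    uint64_t *values, uint32_t *count)
{
	uint32_t max_count = *count, cp = 0, prev = 0;
	const uint32_t *prev_key = NULL;
	size_t idx;
	int err = 0;

	*count = 0;
	if (in_batch) {
		prev = *in_batch;
		prev_key = &prev;
	}
	while (cp < max_count) {
		err = toy_get_next(m, prev_key, &idx);
		if (err)
			break;	/* -ENOENT: end of map reached */
		keys[cp] = m->keys[idx];
		values[cp] = m->values[idx];
		prev = m->keys[idx];
		prev_key = &prev;
		cp++;
	}
	*count = cp;		/* may be > 0 even when err is -ENOENT */
	if (cp)
		*out_batch = prev;	/* resume point for the next call */
	return err;
}
```

A caller would pass NULL as in_batch on the first call, feed each
out_batch back in as the next in_batch, and stop once -ENOENT is
returned, after consuming the final 'count' entries.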

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-3-brianvv@google.com
---
 include/linux/bpf.h      |   5 ++
 include/uapi/linux/bpf.h |  18 ++++++
 kernel/bpf/syscall.c     | 160 +++++++++++++++++++++++++++++++++++++++++++++--
 3 files changed, 179 insertions(+), 4 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index aed2bc39d72b..807744ecaa5a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -44,6 +44,8 @@ struct bpf_map_ops {
 	int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
 	void (*map_release_uref)(struct bpf_map *map);
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
+	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
 
 	/* funcs callable from userspace and from eBPF programs */
 	void *(*map_lookup_elem)(struct bpf_map *map, void *key);
@@ -982,6 +984,9 @@ void *bpf_map_area_alloc(u64 size, int numa_node);
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
 void bpf_map_area_free(void *base);
 void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
+int  generic_map_lookup_batch(struct bpf_map *map,
+			      const union bpf_attr *attr,
+			      union bpf_attr __user *uattr);
 
 extern int sysctl_unprivileged_bpf_disabled;
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a88837d4dd18..0721b2c6bb8c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -107,6 +107,7 @@ enum bpf_cmd {
 	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
+	BPF_MAP_LOOKUP_BATCH,
 };
 
 enum bpf_map_type {
@@ -420,6 +421,23 @@ union bpf_attr {
 		__u64		flags;
 	};
 
+	struct { /* struct used by BPF_MAP_*_BATCH commands */
+		__aligned_u64	in_batch;	/* start batch,
+						 * NULL to start from beginning
+						 */
+		__aligned_u64	out_batch;	/* output: next start batch */
+		__aligned_u64	keys;
+		__aligned_u64	values;
+		__u32		count;		/* input/output:
+						 * input: # of key/value
+						 * elements
+						 * output: # of filled elements
+						 */
+		__u32		map_fd;
+		__u64		elem_flags;
+		__u64		flags;
+	} batch;
+
 	struct { /* anonymous struct used by BPF_PROG_LOAD command */
 		__u32		prog_type;	/* one of enum bpf_prog_type */
 		__u32		insn_cnt;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 08b0b6e40454..d604ddbb1afb 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -219,10 +219,8 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
 	void *ptr;
 	int err;
 
-	if (bpf_map_is_dev_bound(map)) {
-		err =  bpf_map_offload_lookup_elem(map, key, value);
-		return err;
-	}
+	if (bpf_map_is_dev_bound(map))
+		return bpf_map_offload_lookup_elem(map, key, value);
 
 	preempt_disable();
 	this_cpu_inc(bpf_prog_active);
@@ -1220,6 +1218,109 @@ err_put:
 	return err;
 }
 
+#define MAP_LOOKUP_RETRIES 3
+
+int generic_map_lookup_batch(struct bpf_map *map,
+				    const union bpf_attr *attr,
+				    union bpf_attr __user *uattr)
+{
+	void __user *uobatch = u64_to_user_ptr(attr->batch.out_batch);
+	void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
+	void __user *values = u64_to_user_ptr(attr->batch.values);
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	void *buf, *buf_prevkey, *prev_key, *key, *value;
+	int err, retry = MAP_LOOKUP_RETRIES;
+	u32 value_size, cp, max_count;
+	bool first_key = false;
+
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map))
+		return -EINVAL;
+
+	value_size = bpf_map_value_size(map);
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	if (put_user(0, &uattr->batch.count))
+		return -EFAULT;
+
+	buf_prevkey = kmalloc(map->key_size, GFP_USER | __GFP_NOWARN);
+	if (!buf_prevkey)
+		return -ENOMEM;
+
+	buf = kmalloc(map->key_size + value_size, GFP_USER | __GFP_NOWARN);
+	if (!buf) {
+		kvfree(buf_prevkey);
+		return -ENOMEM;
+	}
+
+	err = -EFAULT;
+	first_key = false;
+	prev_key = NULL;
+	if (ubatch && copy_from_user(buf_prevkey, ubatch, map->key_size))
+		goto free_buf;
+	key = buf;
+	value = key + map->key_size;
+	if (ubatch)
+		prev_key = buf_prevkey;
+
+	for (cp = 0; cp < max_count;) {
+		rcu_read_lock();
+		err = map->ops->map_get_next_key(map, prev_key, key);
+		rcu_read_unlock();
+		if (err)
+			break;
+		err = bpf_map_copy_value(map, key, value,
+					 attr->batch.elem_flags);
+
+		if (err == -ENOENT) {
+			if (retry) {
+				retry--;
+				continue;
+			}
+			err = -EINTR;
+			break;
+		}
+
+		if (err)
+			goto free_buf;
+
+		if (copy_to_user(keys + cp * map->key_size, key,
+				 map->key_size)) {
+			err = -EFAULT;
+			goto free_buf;
+		}
+		if (copy_to_user(values + cp * value_size, value, value_size)) {
+			err = -EFAULT;
+			goto free_buf;
+		}
+
+		if (!prev_key)
+			prev_key = buf_prevkey;
+
+		swap(prev_key, key);
+		retry = MAP_LOOKUP_RETRIES;
+		cp++;
+	}
+
+	if (err == -EFAULT)
+		goto free_buf;
+
+	if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
+		    (cp && copy_to_user(uobatch, prev_key, map->key_size))))
+		err = -EFAULT;
+
+free_buf:
+	kfree(buf_prevkey);
+	kfree(buf);
+	return err;
+}
+
 #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value
 
 static int map_lookup_and_delete_elem(union bpf_attr *attr)
@@ -3076,6 +3177,54 @@ out:
 	return err;
 }
 
+#define BPF_MAP_BATCH_LAST_FIELD batch.flags
+
+#define BPF_DO_BATCH(fn)			\
+	do {					\
+		if (!fn) {			\
+			err = -ENOTSUPP;	\
+			goto err_put;		\
+		}				\
+		err = fn(map, attr, uattr);	\
+	} while (0)
+
+static int bpf_map_do_batch(const union bpf_attr *attr,
+			    union bpf_attr __user *uattr,
+			    int cmd)
+{
+	struct bpf_map *map;
+	int err, ufd;
+	struct fd f;
+
+	if (CHECK_ATTR(BPF_MAP_BATCH))
+		return -EINVAL;
+
+	ufd = attr->batch.map_fd;
+	f = fdget(ufd);
+	map = __bpf_map_get(f);
+	if (IS_ERR(map))
+		return PTR_ERR(map);
+
+	if (cmd == BPF_MAP_LOOKUP_BATCH &&
+	    !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if (cmd != BPF_MAP_LOOKUP_BATCH &&
+	    !(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if (cmd == BPF_MAP_LOOKUP_BATCH)
+		BPF_DO_BATCH(map->ops->map_lookup_batch);
+
+err_put:
+	fdput(f);
+	return err;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr = {};
@@ -3173,6 +3322,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
 		err = map_lookup_and_delete_elem(&attr);
 		break;
+	case BPF_MAP_LOOKUP_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
+		break;
 	default:
 		err = -EINVAL;
 		break;
-- 
cgit v1.2.3


From aa2e93b8e58e18442edfb2427446732415bc215e Mon Sep 17 00:00:00 2001
From: Brian Vazquez <brianvv@google.com>
Date: Wed, 15 Jan 2020 10:43:02 -0800
Subject: bpf: Add generic support for update and delete batch ops

This commit adds generic support for update and delete batch ops that
can be used for almost all the bpf maps. These commands share the same
UAPI attr that lookup and lookup_and_delete batch ops use and the
syscall commands are:

  BPF_MAP_UPDATE_BATCH
  BPF_MAP_DELETE_BATCH

The main difference between update/delete and lookup batch ops is that
for update/delete the keys/values must be supplied by userspace and,
because of that, neither in_batch nor out_batch is used.
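
The way 'count' reports partial progress for these ops can be sketched
with a toy user space model of the delete case. This is not the kernel
code; struct toy_set, toy_delete() and toy_delete_batch() are
hypothetical names, and keys are small integers indexing an array:

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_SLOTS 8

/* Hypothetical toy map: present[k] says whether key k exists. */
struct toy_set {
	bool present[TOY_SLOTS];
};

static int toy_delete(struct toy_set *s, uint32_t key)
{
	if (key >= TOY_SLOTS || !s->present[key])
		return -ENOENT;
	s->present[key] = false;
	return 0;
}

/*
 * Deletes keys[0..*count) in order and stops at the first failure.
 * On return *count holds the number of keys actually deleted.
 */
static int toy_delete_batch(struct toy_set *s, const uint32_t *keys,
			    uint32_t *count)
{
	uint32_t max_count = *count, cp;
	int err = 0;

	for (cp = 0; cp < max_count; cp++) {
		err = toy_delete(s, keys[cp]);
		if (err)
			break;
	}
	*count = cp;
	return err;
}
```

On error, a caller can use the returned count to tell which key
failed (keys[count]) and which later keys were never attempted.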

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-4-brianvv@google.com
---
 include/linux/bpf.h      |  10 +++++
 include/uapi/linux/bpf.h |   2 +
 kernel/bpf/syscall.c     | 115 +++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 807744ecaa5a..05466ad6cf1c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -46,6 +46,10 @@ struct bpf_map_ops {
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
 	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
 				union bpf_attr __user *uattr);
+	int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
+	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
 
 	/* funcs callable from userspace and from eBPF programs */
 	void *(*map_lookup_elem)(struct bpf_map *map, void *key);
@@ -987,6 +991,12 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
 int  generic_map_lookup_batch(struct bpf_map *map,
 			      const union bpf_attr *attr,
 			      union bpf_attr __user *uattr);
+int  generic_map_update_batch(struct bpf_map *map,
+			      const union bpf_attr *attr,
+			      union bpf_attr __user *uattr);
+int  generic_map_delete_batch(struct bpf_map *map,
+			      const union bpf_attr *attr,
+			      union bpf_attr __user *uattr);
 
 extern int sysctl_unprivileged_bpf_disabled;
 
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 0721b2c6bb8c..d5320b0be459 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -108,6 +108,8 @@ enum bpf_cmd {
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
 	BPF_MAP_LOOKUP_BATCH,
+	BPF_MAP_UPDATE_BATCH,
+	BPF_MAP_DELETE_BATCH,
 };
 
 enum bpf_map_type {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index d604ddbb1afb..ce8244d1ba99 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1218,6 +1218,111 @@ err_put:
 	return err;
 }
 
+int generic_map_delete_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	u32 cp, max_count;
+	int err = 0;
+	void *key;
+
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		return -EINVAL;
+	}
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	for (cp = 0; cp < max_count; cp++) {
+		key = __bpf_copy_key(keys + cp * map->key_size, map->key_size);
+		if (IS_ERR(key)) {
+			err = PTR_ERR(key);
+			break;
+		}
+
+		if (bpf_map_is_dev_bound(map)) {
+			err = bpf_map_offload_delete_elem(map, key);
+			break;
+		}
+
+		preempt_disable();
+		__this_cpu_inc(bpf_prog_active);
+		rcu_read_lock();
+		err = map->ops->map_delete_elem(map, key);
+		rcu_read_unlock();
+		__this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		maybe_wait_bpf_programs(map);
+		if (err)
+			break;
+	}
+	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
+		err = -EFAULT;
+	return err;
+}
+
+int generic_map_update_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *values = u64_to_user_ptr(attr->batch.values);
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	u32 value_size, cp, max_count;
+	int ufd = attr->map_fd;
+	void *key, *value;
+	struct fd f;
+	int err = 0;
+
+	f = fdget(ufd);
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		return -EINVAL;
+	}
+
+	value_size = bpf_map_value_size(map);
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
+	if (!value)
+		return -ENOMEM;
+
+	for (cp = 0; cp < max_count; cp++) {
+		key = __bpf_copy_key(keys + cp * map->key_size, map->key_size);
+		if (IS_ERR(key)) {
+			err = PTR_ERR(key);
+			break;
+		}
+		err = -EFAULT;
+		if (copy_from_user(value, values + cp * value_size, value_size))
+			break;
+
+		err = bpf_map_update_value(map, f, key, value,
+					   attr->batch.elem_flags);
+
+		if (err)
+			break;
+	}
+
+	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
+		err = -EFAULT;
+
+	kfree(value);
+	kfree(key);
+	return err;
+}
+
 #define MAP_LOOKUP_RETRIES 3
 
 int generic_map_lookup_batch(struct bpf_map *map,
@@ -3219,6 +3324,10 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 
 	if (cmd == BPF_MAP_LOOKUP_BATCH)
 		BPF_DO_BATCH(map->ops->map_lookup_batch);
+	else if (cmd == BPF_MAP_UPDATE_BATCH)
+		BPF_DO_BATCH(map->ops->map_update_batch);
+	else
+		BPF_DO_BATCH(map->ops->map_delete_batch);
 
 err_put:
 	fdput(f);
@@ -3325,6 +3434,12 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_BATCH:
 		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
 		break;
+	case BPF_MAP_UPDATE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
+		break;
+	case BPF_MAP_DELETE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH);
+		break;
 	default:
 		err = -EINVAL;
 		break;
-- 
cgit v1.2.3


From 057996380a42bb64ccc04383cfa9c0ace4ea11f0 Mon Sep 17 00:00:00 2001
From: Yonghong Song <yhs@fb.com>
Date: Wed, 15 Jan 2020 10:43:04 -0800
Subject: bpf: Add batch ops to all htab bpf map

htab can't use the generic batch support due to some problematic
behaviours inherent to the data structure, i.e. while iterating the bpf
map a concurrent program might delete the next entry that batch was
about to use. In that case there's no easy solution to retrieve the next
entry. The issue has been discussed multiple times (see [1] and [2]).

The only way hmap can be traversed without the problem described above
is by making sure that the map is traversed in entire buckets.
This commit implements those strict requirements for hmap. The
implementation follows the same interaction as the generic support, with
some exceptions:

 - If the keys/values buffers are not big enough to traverse a bucket,
   ENOSPC will be returned.
 - out_batch contains the value of the next bucket in the iteration, not
   the next key, but this is transparent for the user since the user
   should never use out_batch for anything other than bpf batch syscalls.
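
The whole-bucket rule can be sketched with a toy user space model that
only tracks how many entries each bucket holds. This is an
illustration under simplifying assumptions, not the htab code:
toy_bucket_batch() is a hypothetical name, and locking and the actual
key/value copies are left out:

```c
#include <errno.h>
#include <stdint.h>

/*
 * bucket_len[i] is the number of entries in bucket i.
 * *batch is the bucket to start from (in) and to resume at (out).
 * *count is the buffer capacity (in) and the entries consumed (out).
 */
static int toy_bucket_batch(const uint32_t *bucket_len, uint32_t n_buckets,
			    uint32_t *batch, uint32_t *count)
{
	uint32_t max_count = *count, total = 0, b = *batch;

	*count = 0;
	if (!max_count)
		return 0;
	if (b >= n_buckets)
		return -ENOENT;

	while (b < n_buckets) {
		/* Only whole buckets are consumed, never a partial one. */
		if (bucket_len[b] > max_count - total) {
			if (total == 0)
				return -ENOSPC;	/* buffer smaller than bucket */
			break;
		}
		total += bucket_len[b];
		b++;
	}

	*batch = b;
	*count = total;
	return b >= n_buckets ? -ENOENT : 0;
}
```

In this model a caller that gets -ENOSPC would retry with larger
buffers, and one that gets -ENOENT would consume the final 'count'
entries and stop.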

This commit implements BPF_MAP_LOOKUP_BATCH and adds support for the new
command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for update/delete
batch ops it is possible to use the generic implementations.

[1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
[2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Brian Vazquez <brianvv@google.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200115184308.162644-6-brianvv@google.com
---
 include/linux/bpf.h      |   3 +
 include/uapi/linux/bpf.h |   1 +
 kernel/bpf/hashtab.c     | 264 +++++++++++++++++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c     |   9 +-
 4 files changed, 276 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 05466ad6cf1c..3517e32149a4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -46,6 +46,9 @@ struct bpf_map_ops {
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
 	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
 				union bpf_attr __user *uattr);
+	int (*map_lookup_and_delete_batch)(struct bpf_map *map,
+					   const union bpf_attr *attr,
+					   union bpf_attr __user *uattr);
 	int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr,
 				union bpf_attr __user *uattr);
 	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d5320b0be459..033d90a2282d 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -108,6 +108,7 @@ enum bpf_cmd {
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
 	BPF_MAP_LOOKUP_BATCH,
+	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
 	BPF_MAP_UPDATE_BATCH,
 	BPF_MAP_DELETE_BATCH,
 };
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c9..2d182c4ee9d9 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -17,6 +17,16 @@
 	(BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE |	\
 	 BPF_F_ACCESS_MASK | BPF_F_ZERO_SEED)
 
+#define BATCH_OPS(_name)			\
+	.map_lookup_batch =			\
+	_name##_map_lookup_batch,		\
+	.map_lookup_and_delete_batch =		\
+	_name##_map_lookup_and_delete_batch,	\
+	.map_update_batch =			\
+	generic_map_update_batch,		\
+	.map_delete_batch =			\
+	generic_map_delete_batch
+
 struct bucket {
 	struct hlist_nulls_head head;
 	raw_spinlock_t lock;
@@ -1232,6 +1242,256 @@ static void htab_map_seq_show_elem(struct bpf_map *map, void *key,
 	rcu_read_unlock();
 }
 
+static int
+__htab_map_lookup_and_delete_batch(struct bpf_map *map,
+				   const union bpf_attr *attr,
+				   union bpf_attr __user *uattr,
+				   bool do_delete, bool is_lru_map,
+				   bool is_percpu)
+{
+	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+	u32 bucket_cnt, total, key_size, value_size, roundup_key_size;
+	void *keys = NULL, *values = NULL, *value, *dst_key, *dst_val;
+	void __user *uvalues = u64_to_user_ptr(attr->batch.values);
+	void __user *ukeys = u64_to_user_ptr(attr->batch.keys);
+	void *ubatch = u64_to_user_ptr(attr->batch.in_batch);
+	u32 batch, max_count, size, bucket_size;
+	u64 elem_map_flags, map_flags;
+	struct hlist_nulls_head *head;
+	struct hlist_nulls_node *n;
+	unsigned long flags;
+	struct htab_elem *l;
+	struct bucket *b;
+	int ret = 0;
+
+	elem_map_flags = attr->batch.elem_flags;
+	if ((elem_map_flags & ~BPF_F_LOCK) ||
+	    ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
+		return -EINVAL;
+
+	map_flags = attr->batch.flags;
+	if (map_flags)
+		return -EINVAL;
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	if (put_user(0, &uattr->batch.count))
+		return -EFAULT;
+
+	batch = 0;
+	if (ubatch && copy_from_user(&batch, ubatch, sizeof(batch)))
+		return -EFAULT;
+
+	if (batch >= htab->n_buckets)
+		return -ENOENT;
+
+	key_size = htab->map.key_size;
+	roundup_key_size = round_up(htab->map.key_size, 8);
+	value_size = htab->map.value_size;
+	size = round_up(value_size, 8);
+	if (is_percpu)
+		value_size = size * num_possible_cpus();
+	total = 0;
+	/* while experimenting with hash tables with sizes ranging from 10 to
+	 * 1000, it was observed that a bucket can have upto 5 entries.
+	 */
+	bucket_size = 5;
+
+alloc:
+	/* We cannot do copy_from_user or copy_to_user inside
+	 * the rcu_read_lock. Allocate enough space here.
+	 */
+	keys = kvmalloc(key_size * bucket_size, GFP_USER | __GFP_NOWARN);
+	values = kvmalloc(value_size * bucket_size, GFP_USER | __GFP_NOWARN);
+	if (!keys || !values) {
+		ret = -ENOMEM;
+		goto after_loop;
+	}
+
+again:
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	rcu_read_lock();
+again_nocopy:
+	dst_key = keys;
+	dst_val = values;
+	b = &htab->buckets[batch];
+	head = &b->head;
+	raw_spin_lock_irqsave(&b->lock, flags);
+
+	bucket_cnt = 0;
+	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
+		bucket_cnt++;
+
+	if (bucket_cnt > (max_count - total)) {
+		if (total == 0)
+			ret = -ENOSPC;
+		raw_spin_unlock_irqrestore(&b->lock, flags);
+		rcu_read_unlock();
+		this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		goto after_loop;
+	}
+
+	if (bucket_cnt > bucket_size) {
+		bucket_size = bucket_cnt;
+		raw_spin_unlock_irqrestore(&b->lock, flags);
+		rcu_read_unlock();
+		this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		kvfree(keys);
+		kvfree(values);
+		goto alloc;
+	}
+
+	hlist_nulls_for_each_entry_safe(l, n, head, hash_node) {
+		memcpy(dst_key, l->key, key_size);
+
+		if (is_percpu) {
+			int off = 0, cpu;
+			void __percpu *pptr;
+
+			pptr = htab_elem_get_ptr(l, map->key_size);
+			for_each_possible_cpu(cpu) {
+				bpf_long_memcpy(dst_val + off,
+						per_cpu_ptr(pptr, cpu), size);
+				off += size;
+			}
+		} else {
+			value = l->key + roundup_key_size;
+			if (elem_map_flags & BPF_F_LOCK)
+				copy_map_value_locked(map, dst_val, value,
+						      true);
+			else
+				copy_map_value(map, dst_val, value);
+			check_and_init_map_lock(map, dst_val);
+		}
+		if (do_delete) {
+			hlist_nulls_del_rcu(&l->hash_node);
+			if (is_lru_map)
+				bpf_lru_push_free(&htab->lru, &l->lru_node);
+			else
+				free_htab_elem(htab, l);
+		}
+		dst_key += key_size;
+		dst_val += value_size;
+	}
+
+	raw_spin_unlock_irqrestore(&b->lock, flags);
+	/* If we are not copying data, we can go to next bucket and avoid
+	 * unlocking the rcu.
+	 */
+	if (!bucket_cnt && (batch + 1 < htab->n_buckets)) {
+		batch++;
+		goto again_nocopy;
+	}
+
+	rcu_read_unlock();
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+	if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
+	    key_size * bucket_cnt) ||
+	    copy_to_user(uvalues + total * value_size, values,
+	    value_size * bucket_cnt))) {
+		ret = -EFAULT;
+		goto after_loop;
+	}
+
+	total += bucket_cnt;
+	batch++;
+	if (batch >= htab->n_buckets) {
+		ret = -ENOENT;
+		goto after_loop;
+	}
+	goto again;
+
+after_loop:
+	if (ret == -EFAULT)
+		goto out;
+
+	/* copy # of entries and next batch */
+	ubatch = u64_to_user_ptr(attr->batch.out_batch);
+	if (copy_to_user(ubatch, &batch, sizeof(batch)) ||
+	    put_user(total, &uattr->batch.count))
+		ret = -EFAULT;
+
+out:
+	kvfree(keys);
+	kvfree(values);
+	return ret;
+}
+
+static int
+htab_percpu_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  false, true);
+}
+
+static int
+htab_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
+					const union bpf_attr *attr,
+					union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  false, true);
+}
+
+static int
+htab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+		      union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  false, false);
+}
+
+static int
+htab_map_lookup_and_delete_batch(struct bpf_map *map,
+				 const union bpf_attr *attr,
+				 union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  false, false);
+}
+
+static int
+htab_lru_percpu_map_lookup_batch(struct bpf_map *map,
+				 const union bpf_attr *attr,
+				 union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  true, true);
+}
+
+static int
+htab_lru_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
+					    const union bpf_attr *attr,
+					    union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  true, true);
+}
+
+static int
+htab_lru_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+			  union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  true, false);
+}
+
+static int
+htab_lru_map_lookup_and_delete_batch(struct bpf_map *map,
+				     const union bpf_attr *attr,
+				     union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  true, false);
+}
+
 const struct bpf_map_ops htab_map_ops = {
 	.map_alloc_check = htab_map_alloc_check,
 	.map_alloc = htab_map_alloc,
@@ -1242,6 +1502,7 @@ const struct bpf_map_ops htab_map_ops = {
 	.map_delete_elem = htab_map_delete_elem,
 	.map_gen_lookup = htab_map_gen_lookup,
 	.map_seq_show_elem = htab_map_seq_show_elem,
+	BATCH_OPS(htab),
 };
 
 const struct bpf_map_ops htab_lru_map_ops = {
@@ -1255,6 +1516,7 @@ const struct bpf_map_ops htab_lru_map_ops = {
 	.map_delete_elem = htab_lru_map_delete_elem,
 	.map_gen_lookup = htab_lru_map_gen_lookup,
 	.map_seq_show_elem = htab_map_seq_show_elem,
+	BATCH_OPS(htab_lru),
 };
 
 /* Called from eBPF program */
@@ -1368,6 +1630,7 @@ const struct bpf_map_ops htab_percpu_map_ops = {
 	.map_update_elem = htab_percpu_map_update_elem,
 	.map_delete_elem = htab_map_delete_elem,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
+	BATCH_OPS(htab_percpu),
 };
 
 const struct bpf_map_ops htab_lru_percpu_map_ops = {
@@ -1379,6 +1642,7 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = {
 	.map_update_elem = htab_lru_percpu_map_update_elem,
 	.map_delete_elem = htab_lru_map_delete_elem,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
+	BATCH_OPS(htab_lru_percpu),
 };
 
 static int fd_htab_map_alloc_check(union bpf_attr *attr)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index ce8244d1ba99..0d94d361379c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3310,7 +3310,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 	if (IS_ERR(map))
 		return PTR_ERR(map);
 
-	if (cmd == BPF_MAP_LOOKUP_BATCH &&
+	if ((cmd == BPF_MAP_LOOKUP_BATCH ||
+	     cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH) &&
 	    !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
 		err = -EPERM;
 		goto err_put;
@@ -3324,6 +3325,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 
 	if (cmd == BPF_MAP_LOOKUP_BATCH)
 		BPF_DO_BATCH(map->ops->map_lookup_batch);
+	else if (cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH)
+		BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch);
 	else if (cmd == BPF_MAP_UPDATE_BATCH)
 		BPF_DO_BATCH(map->ops->map_update_batch);
 	else
@@ -3434,6 +3437,10 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_BATCH:
 		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
 		break;
+	case BPF_MAP_LOOKUP_AND_DELETE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr,
+				       BPF_MAP_LOOKUP_AND_DELETE_BATCH);
+		break;
 	case BPF_MAP_UPDATE_BATCH:
 		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
 		break;
-- 
cgit v1.2.3


From 4a7faaf4add3108522512c16c9ee08fb2ad4525e Mon Sep 17 00:00:00 2001
From: Jeremy Sowden <jeremy@azazel.net>
Date: Wed, 1 Jan 2020 13:41:32 +0000
Subject: netfilter: nft_bitwise: correct uapi header comment.

The comment documenting how bitwise expressions work includes a table
summarizing how the mask and xor arguments combine to express the
supported boolean operations.  However, the row for OR:

 mask    xor
 0       x

is incorrect.

  dreg = (sreg & 0) ^ x

is not equivalent to:

  dreg = sreg | x

What the code actually does is:

  dreg = (sreg & ~x) ^ x

Update the documentation to match.
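
The identity can be checked with a few lines of userspace C. This is only
an illustrative sketch: bitwise_bool(), bitwise_or() and check_or_rows()
are hypothetical helper names, not functions from nft_bitwise.c. The
reasoning is that (sreg & ~x) shares no set bits with x, so XORing the two
behaves like OR.

```c
#include <assert.h>
#include <stdint.h>

/* The generic mask-and-xor form: dreg = (sreg & mask) ^ xor. */
uint32_t bitwise_bool(uint32_t sreg, uint32_t mask, uint32_t xor_val)
{
	return (sreg & mask) ^ xor_val;
}

/* OR expressed with the corrected row: mask = ~x, xor = x. */
uint32_t bitwise_or(uint32_t sreg, uint32_t x)
{
	return bitwise_bool(sreg, ~x, x);
}

/* Compare the corrected row and the old row against a plain OR. */
void check_or_rows(uint32_t sreg, uint32_t x)
{
	/* Corrected row matches sreg | x for any inputs. */
	assert(bitwise_or(sreg, x) == (sreg | x));
	/* Old row (mask = 0) discards sreg entirely and yields just x. */
	assert(bitwise_bool(sreg, 0, x) == x);
}
```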

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/nf_tables.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index e237ecbdcd8a..dd4611767933 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -501,7 +501,7 @@ enum nft_immediate_attributes {
  *
  * 		mask	xor
  * NOT:		1	1
- * OR:		0	x
+ * OR:		~x	x
  * XOR:		1	x
  * AND:		x	0
  */
-- 
cgit v1.2.3


From 9d1f979986c2e29632b6a8f7a8ef8b3c7d24a48c Mon Sep 17 00:00:00 2001
From: Jeremy Sowden <jeremy@azazel.net>
Date: Wed, 15 Jan 2020 20:05:51 +0000
Subject: netfilter: bitwise: add NFTA_BITWISE_OP netlink attribute.

Add a new bitwise netlink attribute, NFTA_BITWISE_OP, which is set to a
value of a new enum, nft_bitwise_ops.  It describes the type of
operation an expression contains.  Currently, it only has one value:
NFT_BITWISE_BOOL.  More values will be added later to implement shifts.
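
The defaulting rule (an absent attribute means a boolean operation, an
unknown value is rejected) can be modelled in userspace as below. This is
a sketch under assumptions: the attribute is represented as a nullable
pointer to a big-endian u32 rather than a struct nlattr, and
parse_bitwise_op() and the sketch enum are hypothetical names.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum nft_bitwise_ops at this commit. */
enum bitwise_op_sketch {
	OP_SKETCH_BOOL = 0,
};

/*
 * attr_be is NULL when the attribute was not supplied; otherwise it
 * points at the u32 payload in network byte order.
 */
int parse_bitwise_op(const uint32_t *attr_be, enum bitwise_op_sketch *op)
{
	assert(op != NULL);

	if (attr_be == NULL) {
		/* Older userspace never sends the attribute. */
		*op = OP_SKETCH_BOOL;
		return 0;
	}

	switch (ntohl(*attr_be)) {
	case OP_SKETCH_BOOL:
		*op = OP_SKETCH_BOOL;
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

Defaulting to the boolean operation keeps existing userspace working,
since it predates the attribute and therefore never sets it.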

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/nf_tables.h | 12 ++++++++++++
 net/netfilter/nft_bitwise.c              | 16 ++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index dd4611767933..0cddf357281f 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -484,6 +484,16 @@ enum nft_immediate_attributes {
 };
 #define NFTA_IMMEDIATE_MAX	(__NFTA_IMMEDIATE_MAX - 1)
 
+/**
+ * enum nft_bitwise_ops - nf_tables bitwise operations
+ *
+ * @NFT_BITWISE_BOOL: mask-and-xor operation used to implement NOT, AND, OR and
+ *                    XOR boolean operations
+ */
+enum nft_bitwise_ops {
+	NFT_BITWISE_BOOL,
+};
+
 /**
  * enum nft_bitwise_attributes - nf_tables bitwise expression netlink attributes
  *
@@ -492,6 +502,7 @@ enum nft_immediate_attributes {
  * @NFTA_BITWISE_LEN: length of operands (NLA_U32)
  * @NFTA_BITWISE_MASK: mask value (NLA_NESTED: nft_data_attributes)
  * @NFTA_BITWISE_XOR: xor value (NLA_NESTED: nft_data_attributes)
+ * @NFTA_BITWISE_OP: type of operation (NLA_U32: nft_bitwise_ops)
  *
  * The bitwise expression performs the following operation:
  *
@@ -512,6 +523,7 @@ enum nft_bitwise_attributes {
 	NFTA_BITWISE_LEN,
 	NFTA_BITWISE_MASK,
 	NFTA_BITWISE_XOR,
+	NFTA_BITWISE_OP,
 	__NFTA_BITWISE_MAX
 };
 #define NFTA_BITWISE_MAX	(__NFTA_BITWISE_MAX - 1)
diff --git a/net/netfilter/nft_bitwise.c b/net/netfilter/nft_bitwise.c
index c15e9beb5243..6948df7b0587 100644
--- a/net/netfilter/nft_bitwise.c
+++ b/net/netfilter/nft_bitwise.c
@@ -18,6 +18,7 @@
 struct nft_bitwise {
 	enum nft_registers	sreg:8;
 	enum nft_registers	dreg:8;
+	enum nft_bitwise_ops	op:8;
 	u8			len;
 	struct nft_data		mask;
 	struct nft_data		xor;
@@ -41,6 +42,7 @@ static const struct nla_policy nft_bitwise_policy[NFTA_BITWISE_MAX + 1] = {
 	[NFTA_BITWISE_LEN]	= { .type = NLA_U32 },
 	[NFTA_BITWISE_MASK]	= { .type = NLA_NESTED },
 	[NFTA_BITWISE_XOR]	= { .type = NLA_NESTED },
+	[NFTA_BITWISE_OP]	= { .type = NLA_U32 },
 };
 
 static int nft_bitwise_init(const struct nft_ctx *ctx,
@@ -76,6 +78,18 @@ static int nft_bitwise_init(const struct nft_ctx *ctx,
 	if (err < 0)
 		return err;
 
+	if (tb[NFTA_BITWISE_OP]) {
+		priv->op = ntohl(nla_get_be32(tb[NFTA_BITWISE_OP]));
+		switch (priv->op) {
+		case NFT_BITWISE_BOOL:
+			break;
+		default:
+			return -EOPNOTSUPP;
+		}
+	} else {
+		priv->op = NFT_BITWISE_BOOL;
+	}
+
 	err = nft_data_init(NULL, &priv->mask, sizeof(priv->mask), &d1,
 			    tb[NFTA_BITWISE_MASK]);
 	if (err < 0)
@@ -112,6 +126,8 @@ static int nft_bitwise_dump(struct sk_buff *skb, const struct nft_expr *expr)
 		return -1;
 	if (nla_put_be32(skb, NFTA_BITWISE_LEN, htonl(priv->len)))
 		return -1;
+	if (nla_put_be32(skb, NFTA_BITWISE_OP, htonl(priv->op)))
+		return -1;
 
 	if (nft_data_dump(skb, NFTA_BITWISE_MASK, &priv->mask,
 			  NFT_DATA_VALUE, priv->len) < 0)
-- 
cgit v1.2.3


From 779f725e142cd987a69296c0eb7d416ee3ba57dd Mon Sep 17 00:00:00 2001
From: Jeremy Sowden <jeremy@azazel.net>
Date: Wed, 15 Jan 2020 20:05:56 +0000
Subject: netfilter: bitwise: add NFTA_BITWISE_DATA attribute.

Add a new bitwise netlink attribute that will be used by shift
operations to store the size of the shift.  It is not used by boolean
operations.
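
As a rough illustration of the rule that boolean operations must not carry
the new attribute, the presence check can be modelled with a bitmask. The
ATTR_* bits and check_bool_attrs() are hypothetical; the kernel tests
entries of the tb[] array instead.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical "attribute present" bits standing in for tb[] entries. */
enum {
	ATTR_MASK = 1u << 0,
	ATTR_XOR  = 1u << 1,
	ATTR_DATA = 1u << 2,
	ATTR_ALL  = ATTR_MASK | ATTR_XOR | ATTR_DATA,
};

/* A boolean operation needs both MASK and XOR and must not carry DATA. */
int check_bool_attrs(unsigned int present)
{
	assert((present & ~(unsigned int)ATTR_ALL) == 0);

	if (present & ATTR_DATA)
		return -EINVAL;
	if ((present & (ATTR_MASK | ATTR_XOR)) != (ATTR_MASK | ATTR_XOR))
		return -EINVAL;
	return 0;
}
```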

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/nf_tables.h | 3 +++
 net/netfilter/nft_bitwise.c              | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 0cddf357281f..8bef0620bc4f 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -503,6 +503,8 @@ enum nft_bitwise_ops {
  * @NFTA_BITWISE_MASK: mask value (NLA_NESTED: nft_data_attributes)
  * @NFTA_BITWISE_XOR: xor value (NLA_NESTED: nft_data_attributes)
  * @NFTA_BITWISE_OP: type of operation (NLA_U32: nft_bitwise_ops)
+ * @NFTA_BITWISE_DATA: argument for non-boolean operations
+ *                     (NLA_NESTED: nft_data_attributes)
  *
  * The bitwise expression performs the following operation:
  *
@@ -524,6 +526,7 @@ enum nft_bitwise_attributes {
 	NFTA_BITWISE_MASK,
 	NFTA_BITWISE_XOR,
 	NFTA_BITWISE_OP,
+	NFTA_BITWISE_DATA,
 	__NFTA_BITWISE_MAX
 };
 #define NFTA_BITWISE_MAX	(__NFTA_BITWISE_MAX - 1)
diff --git a/net/netfilter/nft_bitwise.c b/net/netfilter/nft_bitwise.c
index b4619d9989ea..744008a527fb 100644
--- a/net/netfilter/nft_bitwise.c
+++ b/net/netfilter/nft_bitwise.c
@@ -22,6 +22,7 @@ struct nft_bitwise {
 	u8			len;
 	struct nft_data		mask;
 	struct nft_data		xor;
+	struct nft_data		data;
 };
 
 static void nft_bitwise_eval_bool(u32 *dst, const u32 *src,
@@ -54,6 +55,7 @@ static const struct nla_policy nft_bitwise_policy[NFTA_BITWISE_MAX + 1] = {
 	[NFTA_BITWISE_MASK]	= { .type = NLA_NESTED },
 	[NFTA_BITWISE_XOR]	= { .type = NLA_NESTED },
 	[NFTA_BITWISE_OP]	= { .type = NLA_U32 },
+	[NFTA_BITWISE_DATA]	= { .type = NLA_NESTED },
 };
 
 static int nft_bitwise_init_bool(struct nft_bitwise *priv,
@@ -62,6 +64,9 @@ static int nft_bitwise_init_bool(struct nft_bitwise *priv,
 	struct nft_data_desc d1, d2;
 	int err;
 
+	if (tb[NFTA_BITWISE_DATA])
+		return -EINVAL;
+
 	if (!tb[NFTA_BITWISE_MASK] ||
 	    !tb[NFTA_BITWISE_XOR])
 		return -EINVAL;
-- 
cgit v1.2.3


From 567d746b55bc66d3800c9ae91d50f0c5deb2fd93 Mon Sep 17 00:00:00 2001
From: Jeremy Sowden <jeremy@azazel.net>
Date: Wed, 15 Jan 2020 20:05:57 +0000
Subject: netfilter: bitwise: add support for shifts.

Hitherto nft_bitwise has only supported boolean operations: NOT, AND, OR
and XOR.  Extend it to do shifts as well.
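
The shifts operate on a value spread over several 32-bit words, with word 0
treated as the most significant, so bits shifted out of one word are
carried into its neighbour. The following userspace sketch shows the idea;
lshift_words() and rshift_words() are hypothetical names, and unlike the
functions in the diff below they read each source word into a temporary
first and special-case a shift of zero, which are choices made for this
example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 32u

/* Shift left: walk from the least significant word (highest index) up. */
void lshift_words(uint32_t *dst, const uint32_t *src, size_t nwords,
		  uint32_t shift)
{
	uint32_t carry = 0, w;
	size_t i;

	assert(shift < WORD_BITS);
	for (i = nwords; i > 0; i--) {
		w = src[i - 1];		/* read first so dst may alias src */
		dst[i - 1] = (w << shift) | carry;
		/* Shifting a u32 by 32 is undefined, so guard shift == 0. */
		carry = shift ? w >> (WORD_BITS - shift) : 0;
	}
}

/* Shift right: walk from the most significant word (index 0) down. */
void rshift_words(uint32_t *dst, const uint32_t *src, size_t nwords,
		  uint32_t shift)
{
	uint32_t carry = 0, w;
	size_t i;

	assert(shift < WORD_BITS);
	for (i = 0; i < nwords; i++) {
		w = src[i];
		dst[i] = carry | (w >> shift);
		carry = shift ? w << (WORD_BITS - shift) : 0;
	}
}
```

For example, shifting the two-word value { 0x00000001, 0x80000000 } left by
one moves the top bit of the second word into the bottom of the first.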

Signed-off-by: Jeremy Sowden <jeremy@azazel.net>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/uapi/linux/netfilter/nf_tables.h |  9 +++-
 net/netfilter/nft_bitwise.c              | 77 ++++++++++++++++++++++++++++++++
 2 files changed, 84 insertions(+), 2 deletions(-)

diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 8bef0620bc4f..261864736b26 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -489,9 +489,13 @@ enum nft_immediate_attributes {
  *
  * @NFT_BITWISE_BOOL: mask-and-xor operation used to implement NOT, AND, OR and
  *                    XOR boolean operations
+ * @NFT_BITWISE_LSHIFT: left-shift operation
+ * @NFT_BITWISE_RSHIFT: right-shift operation
  */
 enum nft_bitwise_ops {
 	NFT_BITWISE_BOOL,
+	NFT_BITWISE_LSHIFT,
+	NFT_BITWISE_RSHIFT,
 };
 
 /**
@@ -506,11 +510,12 @@ enum nft_bitwise_ops {
  * @NFTA_BITWISE_DATA: argument for non-boolean operations
  *                     (NLA_NESTED: nft_data_attributes)
  *
- * The bitwise expression performs the following operation:
+ * The bitwise expression supports boolean and shift operations.  It implements
+ * the boolean operations by performing the following operation:
  *
  * dreg = (sreg & mask) ^ xor
  *
- * which allow to express all bitwise operations:
+ * with these mask and xor values:
  *
  * 		mask	xor
  * NOT:		1	1
diff --git a/net/netfilter/nft_bitwise.c b/net/netfilter/nft_bitwise.c
index 744008a527fb..0ed2281f03be 100644
--- a/net/netfilter/nft_bitwise.c
+++ b/net/netfilter/nft_bitwise.c
@@ -34,6 +34,32 @@ static void nft_bitwise_eval_bool(u32 *dst, const u32 *src,
 		dst[i] = (src[i] & priv->mask.data[i]) ^ priv->xor.data[i];
 }
 
+static void nft_bitwise_eval_lshift(u32 *dst, const u32 *src,
+				    const struct nft_bitwise *priv)
+{
+	u32 shift = priv->data.data[0];
+	unsigned int i;
+	u32 carry = 0;
+
+	for (i = DIV_ROUND_UP(priv->len, sizeof(u32)); i > 0; i--) {
+		dst[i - 1] = (src[i - 1] << shift) | carry;
+		carry = src[i - 1] >> (BITS_PER_TYPE(u32) - shift);
+	}
+}
+
+static void nft_bitwise_eval_rshift(u32 *dst, const u32 *src,
+				    const struct nft_bitwise *priv)
+{
+	u32 shift = priv->data.data[0];
+	unsigned int i;
+	u32 carry = 0;
+
+	for (i = 0; i < DIV_ROUND_UP(priv->len, sizeof(u32)); i++) {
+		dst[i] = carry | (src[i] >> shift);
+		carry = src[i] << (BITS_PER_TYPE(u32) - shift);
+	}
+}
+
 void nft_bitwise_eval(const struct nft_expr *expr,
 		      struct nft_regs *regs, const struct nft_pktinfo *pkt)
 {
@@ -45,6 +71,12 @@ void nft_bitwise_eval(const struct nft_expr *expr,
 	case NFT_BITWISE_BOOL:
 		nft_bitwise_eval_bool(dst, src, priv);
 		break;
+	case NFT_BITWISE_LSHIFT:
+		nft_bitwise_eval_lshift(dst, src, priv);
+		break;
+	case NFT_BITWISE_RSHIFT:
+		nft_bitwise_eval_rshift(dst, src, priv);
+		break;
 	}
 }
 
@@ -97,6 +129,32 @@ err1:
 	return err;
 }
 
+static int nft_bitwise_init_shift(struct nft_bitwise *priv,
+				  const struct nlattr *const tb[])
+{
+	struct nft_data_desc d;
+	int err;
+
+	if (tb[NFTA_BITWISE_MASK] ||
+	    tb[NFTA_BITWISE_XOR])
+		return -EINVAL;
+
+	if (!tb[NFTA_BITWISE_DATA])
+		return -EINVAL;
+
+	err = nft_data_init(NULL, &priv->data, sizeof(priv->data), &d,
+			    tb[NFTA_BITWISE_DATA]);
+	if (err < 0)
+		return err;
+	if (d.type != NFT_DATA_VALUE || d.len != sizeof(u32) ||
+	    priv->data.data[0] >= BITS_PER_TYPE(u32)) {
+		nft_data_release(&priv->data, d.type);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
 static int nft_bitwise_init(const struct nft_ctx *ctx,
 			    const struct nft_expr *expr,
 			    const struct nlattr * const tb[])
@@ -131,6 +189,8 @@ static int nft_bitwise_init(const struct nft_ctx *ctx,
 		priv->op = ntohl(nla_get_be32(tb[NFTA_BITWISE_OP]));
 		switch (priv->op) {
 		case NFT_BITWISE_BOOL:
+		case NFT_BITWISE_LSHIFT:
+		case NFT_BITWISE_RSHIFT:
 			break;
 		default:
 			return -EOPNOTSUPP;
@@ -143,6 +203,10 @@ static int nft_bitwise_init(const struct nft_ctx *ctx,
 	case NFT_BITWISE_BOOL:
 		err = nft_bitwise_init_bool(priv, tb);
 		break;
+	case NFT_BITWISE_LSHIFT:
+	case NFT_BITWISE_RSHIFT:
+		err = nft_bitwise_init_shift(priv, tb);
+		break;
 	}
 
 	return err;
@@ -162,6 +226,15 @@ static int nft_bitwise_dump_bool(struct sk_buff *skb,
 	return 0;
 }
 
+static int nft_bitwise_dump_shift(struct sk_buff *skb,
+				  const struct nft_bitwise *priv)
+{
+	if (nft_data_dump(skb, NFTA_BITWISE_DATA, &priv->data,
+			  NFT_DATA_VALUE, sizeof(u32)) < 0)
+		return -1;
+	return 0;
+}
+
 static int nft_bitwise_dump(struct sk_buff *skb, const struct nft_expr *expr)
 {
 	const struct nft_bitwise *priv = nft_expr_priv(expr);
@@ -180,6 +253,10 @@ static int nft_bitwise_dump(struct sk_buff *skb, const struct nft_expr *expr)
 	case NFT_BITWISE_BOOL:
 		err = nft_bitwise_dump_bool(skb, priv);
 		break;
+	case NFT_BITWISE_LSHIFT:
+	case NFT_BITWISE_RSHIFT:
+		err = nft_bitwise_dump_shift(skb, priv);
+		break;
 	}
 
 	return err;
-- 
cgit v1.2.3


From fddb5d430ad9fa91b49b1d34d0202ffe2fa0e179 Mon Sep 17 00:00:00 2001
From: Aleksa Sarai <cyphar@cyphar.com>
Date: Sat, 18 Jan 2020 23:07:59 +1100
Subject: open: introduce openat2(2) syscall

/* Background. */
For a very long time, extending openat(2) with new features has been
incredibly frustrating. This stems from the fact that openat(2) is
possibly the most famous counter-example to the mantra "don't silently
accept garbage from userspace" -- it doesn't check whether unknown flags
are present[1].

This means that (generally) the addition of new flags to openat(2) has
been fraught with backwards-compatibility issues (O_TMPFILE has to be
defined as __O_TMPFILE|O_DIRECTORY|[O_RDWR or O_WRONLY] to ensure old
kernels gave errors, since it's insecure to silently ignore the
flag[2]). All new security-related flags therefore have a tough road to
being added to openat(2).

Userspace also has a hard time figuring out whether a particular flag is
supported on a particular kernel. While it is now possible with
contemporary kernels (thanks to [3]), older kernels will expose unknown
flag bits through fcntl(F_GETFL). Giving a clear -EINVAL during
openat(2) time matches modern syscall designs and is far more
fool-proof.

In addition, the newly-added path resolution restriction LOOKUP flags
(which we would like to expose to user-space) don't feel related to the
pre-existing O_* flag set -- they affect all components of path lookup.
We'd therefore like to add a new flag argument.

Adding a new syscall allows us to finally fix the flag-ignoring problem,
and we can make it extensible enough so that we will hopefully never
need an openat3(2).

/* Syscall Prototype. */
  /*
   * open_how is an extensible structure (similar in interface to
   * clone3(2) or sched_setattr(2)). The size parameter must be set to
   * sizeof(struct open_how), to allow for future extensions. All future
   * extensions will be appended to open_how, with their zero value
   * acting as a no-op default.
   */
  struct open_how { /* ... */ };

  int openat2(int dfd, const char *pathname,
              struct open_how *how, size_t size);
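
To make the size-based extension scheme concrete, here is a userspace
sketch of how a callee might read a caller-sized structure. It assumes the
convention used by clone3(2): a shorter (older) structure is zero-extended,
and a longer (newer) one is accepted only if the bytes the callee does not
understand are all zero. struct open_how_v1, its "extra" field,
HOW_SIZE_VER0 and copy_open_how() are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical future layout: the three initial fields plus one more. */
struct open_how_v1 {
	uint64_t flags;
	uint64_t mode;
	uint64_t resolve;
	uint64_t extra;		/* hypothetical appended field, 0 == no-op */
};

/* Size of the initial three-u64 layout. */
#define HOW_SIZE_VER0 24u

int copy_open_how(struct open_how_v1 *dst, const void *src, size_t usize)
{
	const unsigned char *p = src;
	size_t ksize = sizeof(*dst);
	size_t n = usize < ksize ? usize : ksize;
	size_t i;

	assert(dst != NULL && src != NULL);

	/* Smaller than the first published layout: reject. */
	if (usize < HOW_SIZE_VER0)
		return -EINVAL;

	/* Caller knows fields we do not: they must all be zero. */
	for (i = ksize; i < usize; i++)
		if (p[i] != 0)
			return -E2BIG;

	/* Fields the caller does not know about default to zero. */
	memset(dst, 0, ksize);
	memcpy(dst, p, n);
	return 0;
}
```

Because zero always means "no-op", an old binary on a new kernel and a new
binary on an old kernel both behave sensibly without a version field.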

/* Description. */
The initial version of 'struct open_how' contains the following fields:

  flags
    Used to specify openat(2)-style flags. However, any unknown flag
    bits or otherwise incorrect flag combinations (like O_PATH|O_RDWR)
    will result in -EINVAL. In addition, this field is 64 bits wide to
    allow for more O_ flags than currently permitted with openat(2).

  mode
    The file mode for O_CREAT or O_TMPFILE.

    Must be set to zero if flags does not contain O_CREAT or O_TMPFILE.

  resolve
    Flags that restrict path resolution (in contrast to O_* flags, they
    affect all path components). The current set of flags is as follows
    (at the moment, all of the RESOLVE_ flags are implemented as just
    passing the corresponding LOOKUP_ flag):

    RESOLVE_NO_XDEV       => LOOKUP_NO_XDEV
    RESOLVE_NO_SYMLINKS   => LOOKUP_NO_SYMLINKS
    RESOLVE_NO_MAGICLINKS => LOOKUP_NO_MAGICLINKS
    RESOLVE_BENEATH       => LOOKUP_BENEATH
    RESOLVE_IN_ROOT       => LOOKUP_IN_ROOT
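
The strictness described for the flags and mode fields can be sketched as
a small validator. This is an illustration only: struct open_how_sketch,
SKETCH_VALID_FLAGS and check_open_how() are hypothetical, the accepted
flag set is a deliberately small subset, and O_TMPFILE and the resolve
field are left out to keep the example portable.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of accepted O_* bits; the real set is larger. */
#define SKETCH_VALID_FLAGS \
	((uint64_t)(O_ACCMODE | O_CREAT | O_EXCL | O_TRUNC | O_APPEND))

/* Hypothetical mirror of the three u64 fields listed above. */
struct open_how_sketch {
	uint64_t flags;
	uint64_t mode;
	uint64_t resolve;
};

int check_open_how(const struct open_how_sketch *how)
{
	assert(how != NULL);

	/* Unknown bits are an error rather than being silently ignored. */
	if (how->flags & ~SKETCH_VALID_FLAGS)
		return -EINVAL;

	/* A mode is only meaningful when a file may be created. */
	if (how->mode != 0 && !(how->flags & O_CREAT))
		return -EINVAL;

	return 0;
}
```

Rejecting unknown bits up front is what lets userspace probe for a flag by
simply trying it and checking for -EINVAL.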

open_how does not contain an embedded size field, because it is of
little benefit (userspace can figure out the kernel open_how size at
runtime fairly easily without it). It also only contains u64s (even
though ->mode arguably should be a u16) to avoid having padding fields
which are never used in the future.

Note that as a result of the new how->flags handling, O_PATH|O_TMPFILE
is no longer permitted for openat(2). As far as I can tell, this has
always been a bug and appears to not be used by userspace (and I've not
seen any problems on my machines by disallowing it). If it turns out
this breaks something, we can special-case it and only permit it for
openat(2) but not openat2(2).

After input from Florian Weimer, the new open_how and flag definitions
are inside a separate header from uapi/linux/fcntl.h, to avoid problems
that glibc has with importing that header.

/* Testing. */
In a follow-up patch there are over 200 selftests which ensure that this
syscall has the correct semantics and will correctly handle several
attack scenarios.

In addition, I've written a userspace library[4] which provides
convenient wrappers around openat2(RESOLVE_IN_ROOT) (this is necessary
because no other syscalls support RESOLVE_IN_ROOT, and thus lots of care
must be taken when using RESOLVE_IN_ROOT'd file descriptors with other
syscalls). During the development of this patch, I've run numerous
verification tests using libpathrs (showing that the API is reasonably
usable by userspace).

/* Future Work. */
Additional RESOLVE_ flags (such as one blocking auto-mount during
resolution) have been suggested during the review period. These can be
easily implemented separately.

Furthermore, there are some other proposed changes to the openat(2)
interface (the most obvious example is magic-link hardening[5]) which
would be a good opportunity to add a way for userspace to restrict how
O_PATH file descriptors can be re-opened.

Another possible avenue of future work would be some kind of
CHECK_FIELDS[6] flag which causes the kernel to indicate to userspace
which openat2(2) flags and fields are supported by the current kernel
(to avoid userspace having to go through several guesses to figure it
out).

[1]: https://lwn.net/Articles/588444/
[2]: https://lore.kernel.org/lkml/CA+55aFyyxJL1LyXZeBsf2ypriraj5ut1XkNDsunRBqgVjZU_6Q@mail.gmail.com
[3]: commit 629e014bb834 ("fs: completely ignore unknown open flags")
[4]: https://sourceware.org/bugzilla/show_bug.cgi?id=17523
[5]: https://lore.kernel.org/lkml/20190930183316.10190-2-cyphar@cyphar.com/
[6]: https://youtu.be/ggD-eb3yPVs

Suggested-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 CREDITS                                     |   4 +-
 MAINTAINERS                                 |   1 +
 arch/alpha/kernel/syscalls/syscall.tbl      |   1 +
 arch/arm/tools/syscall.tbl                  |   1 +
 arch/arm64/include/asm/unistd.h             |   2 +-
 arch/arm64/include/asm/unistd32.h           |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl       |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl       |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl     |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    |   1 +
 arch/s390/kernel/syscalls/syscall.tbl       |   1 +
 arch/sh/kernel/syscalls/syscall.tbl         |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl      |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl      |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     |   1 +
 fs/open.c                                   | 147 +++++++++++++++++++++-------
 include/linux/fcntl.h                       |  16 ++-
 include/linux/syscalls.h                    |   3 +
 include/uapi/asm-generic/unistd.h           |   5 +-
 include/uapi/linux/fcntl.h                  |   2 +-
 include/uapi/linux/openat2.h                |  39 ++++++++
 26 files changed, 198 insertions(+), 39 deletions(-)
 create mode 100644 include/uapi/linux/openat2.h

diff --git a/CREDITS b/CREDITS
index 9602b0fa1c95..a97d3280a627 100644
--- a/CREDITS
+++ b/CREDITS
@@ -3302,7 +3302,9 @@ S: France
 N: Aleksa Sarai
 E: cyphar@cyphar.com
 W: https://www.cyphar.com/
-D: `pids` cgroup subsystem
+D: /sys/fs/cgroup/pids
+D: openat2(2)
+S: Sydney, Australia
 
 N: Dipankar Sarma
 E: dipankar@in.ibm.com
diff --git a/MAINTAINERS b/MAINTAINERS
index bd5847e802de..737ada377ac3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6397,6 +6397,7 @@ F:	fs/*
 F:	include/linux/fs.h
 F:	include/linux/fs_types.h
 F:	include/uapi/linux/fs.h
+F:	include/uapi/linux/openat2.h
 
 FINTEK F75375S HARDWARE MONITOR AND FAN CONTROLLER DRIVER
 M:	Riku Voipio <riku.voipio@iki.fi>
diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 8e13b0b2928d..4d7f2ffa957c 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -475,3 +475,4 @@
 543	common	fspick				sys_fspick
 544	common	pidfd_open			sys_pidfd_open
 # 545 reserved for clone3
+547	common	openat2				sys_openat2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 6da7dc4d79cc..4ba54bc7e19a 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -449,3 +449,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 435	common	clone3				sys_clone3
+437	common	openat2				sys_openat2
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 2629a68b8724..8aa00ccb0b96 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -38,7 +38,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		436
+#define __NR_compat_syscalls		438
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 94ab29cf4f00..57f6f592d460 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -879,6 +879,8 @@ __SYSCALL(__NR_fspick, sys_fspick)
 __SYSCALL(__NR_pidfd_open, sys_pidfd_open)
 #define __NR_clone3 435
 __SYSCALL(__NR_clone3, sys_clone3)
+#define __NR_openat2 437
+__SYSCALL(__NR_openat2, sys_openat2)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 36d5faf4c86c..8d36f2e2dc89 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -356,3 +356,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 # 435 reserved for clone3
+437	common	openat2				sys_openat2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index a88a285a0e5f..2559925f1924 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -435,3 +435,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 # 435 reserved for clone3
+437	common	openat2				sys_openat2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 09b0cd7dab0a..c04385e60833 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -441,3 +441,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 435	common	clone3				sys_clone3
+437	common	openat2				sys_openat2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index e7c5ab38e403..68c9ec06851f 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -374,3 +374,4 @@
 433	n32	fspick				sys_fspick
 434	n32	pidfd_open			sys_pidfd_open
 435	n32	clone3				__sys_clone3
+437	n32	openat2				sys_openat2
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 13cd66581f3b..42a72d010050 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -350,3 +350,4 @@
 433	n64	fspick				sys_fspick
 434	n64	pidfd_open			sys_pidfd_open
 435	n64	clone3				__sys_clone3
+437	n64	openat2				sys_openat2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 353539ea4140..f114c4aed0ed 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -423,3 +423,4 @@
 433	o32	fspick				sys_fspick
 434	o32	pidfd_open			sys_pidfd_open
 435	o32	clone3				__sys_clone3
+437	o32	openat2				sys_openat2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 285ff516150c..b550ae9a7fea 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -433,3 +433,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 435	common	clone3				sys_clone3_wrapper
+437	common	openat2				sys_openat2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 43f736ed47f2..a8b5ecb5b602 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -517,3 +517,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 435	nospu	clone3				ppc_clone3
+437	common	openat2				sys_openat2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index 3054e9c035a3..16b571c06161 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -438,3 +438,4 @@
 433  common	fspick			sys_fspick			sys_fspick
 434  common	pidfd_open		sys_pidfd_open			sys_pidfd_open
 435  common	clone3			sys_clone3			sys_clone3
+437  common	openat2			sys_openat2			sys_openat2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index b5ed26c4c005..a7185cc18626 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -438,3 +438,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 # 435 reserved for clone3
+437	common	openat2				sys_openat2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 8c8cc7537fb2..b11c19552022 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -481,3 +481,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 # 435 reserved for clone3
+437	common	openat2			sys_openat2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 15908eb9b17e..d22a8b5c3fab 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -440,3 +440,4 @@
 433	i386	fspick			sys_fspick			__ia32_sys_fspick
 434	i386	pidfd_open		sys_pidfd_open			__ia32_sys_pidfd_open
 435	i386	clone3			sys_clone3			__ia32_sys_clone3
+437	i386	openat2			sys_openat2			__ia32_sys_openat2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index c29976eca4a8..9035647ef236 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -357,6 +357,7 @@
 433	common	fspick			__x64_sys_fspick
 434	common	pidfd_open		__x64_sys_pidfd_open
 435	common	clone3			__x64_sys_clone3/ptregs
+437	common	openat2			__x64_sys_openat2
 
 #
 # x32-specific system call numbers start at 512 to avoid cache impact
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 25f4de729a6d..f0a68013c038 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -406,3 +406,4 @@
 433	common	fspick				sys_fspick
 434	common	pidfd_open			sys_pidfd_open
 435	common	clone3				sys_clone3
+437	common	openat2				sys_openat2
diff --git a/fs/open.c b/fs/open.c
index b62f5c0923a8..8cdb2b675867 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -955,48 +955,84 @@ struct file *open_with_fake_path(const struct path *path, int flags,
 }
 EXPORT_SYMBOL(open_with_fake_path);
 
-static inline int build_open_flags(int flags, umode_t mode, struct open_flags *op)
+#define WILL_CREATE(flags)	(flags & (O_CREAT | __O_TMPFILE))
+#define O_PATH_FLAGS		(O_DIRECTORY | O_NOFOLLOW | O_PATH | O_CLOEXEC)
+
+static inline struct open_how build_open_how(int flags, umode_t mode)
+{
+	struct open_how how = {
+		.flags = flags & VALID_OPEN_FLAGS,
+		.mode = mode & S_IALLUGO,
+	};
+
+	/* O_PATH beats everything else. */
+	if (how.flags & O_PATH)
+		how.flags &= O_PATH_FLAGS;
+	/* Modes should only be set for create-like flags. */
+	if (!WILL_CREATE(how.flags))
+		how.mode = 0;
+	return how;
+}
+
+static inline int build_open_flags(const struct open_how *how,
+				   struct open_flags *op)
 {
+	int flags = how->flags;
 	int lookup_flags = 0;
 	int acc_mode = ACC_MODE(flags);
 
+	/* Must never be set by userspace */
+	flags &= ~(FMODE_NONOTIFY | O_CLOEXEC);
+
 	/*
-	 * Clear out all open flags we don't know about so that we don't report
-	 * them in fcntl(F_GETFD) or similar interfaces.
+	 * Older syscalls implicitly clear all of the invalid flags or argument
+	 * values before calling build_open_flags(), but openat2(2) checks all
+	 * of its arguments.
 	 */
-	flags &= VALID_OPEN_FLAGS;
+	if (flags & ~VALID_OPEN_FLAGS)
+		return -EINVAL;
+	if (how->resolve & ~VALID_RESOLVE_FLAGS)
+		return -EINVAL;
 
-	if (flags & (O_CREAT | __O_TMPFILE))
-		op->mode = (mode & S_IALLUGO) | S_IFREG;
-	else
+	/* Deal with the mode. */
+	if (WILL_CREATE(flags)) {
+		if (how->mode & ~S_IALLUGO)
+			return -EINVAL;
+		op->mode = how->mode | S_IFREG;
+	} else {
+		if (how->mode != 0)
+			return -EINVAL;
 		op->mode = 0;
-
-	/* Must never be set by userspace */
-	flags &= ~FMODE_NONOTIFY & ~O_CLOEXEC;
+	}
 
 	/*
-	 * O_SYNC is implemented as __O_SYNC|O_DSYNC.  As many places only
-	 * check for O_DSYNC if the need any syncing at all we enforce it's
-	 * always set instead of having to deal with possibly weird behaviour
-	 * for malicious applications setting only __O_SYNC.
+	 * In order to ensure programs get explicit errors when trying to use
+	 * O_TMPFILE on old kernels, O_TMPFILE is implemented such that it
+	 * looks like (O_DIRECTORY|O_RDWR & ~O_CREAT) to old kernels. But we
+	 * have to require userspace to explicitly set it.
 	 */
-	if (flags & __O_SYNC)
-		flags |= O_DSYNC;
-
 	if (flags & __O_TMPFILE) {
 		if ((flags & O_TMPFILE_MASK) != O_TMPFILE)
 			return -EINVAL;
 		if (!(acc_mode & MAY_WRITE))
 			return -EINVAL;
-	} else if (flags & O_PATH) {
-		/*
-		 * If we have O_PATH in the open flag. Then we
-		 * cannot have anything other than the below set of flags
-		 */
-		flags &= O_DIRECTORY | O_NOFOLLOW | O_PATH;
+	}
+	if (flags & O_PATH) {
+		/* O_PATH only permits certain other flags to be set. */
+		if (flags & ~O_PATH_FLAGS)
+			return -EINVAL;
 		acc_mode = 0;
 	}
 
+	/*
+	 * O_SYNC is implemented as __O_SYNC|O_DSYNC.  As many places only
+	 * check for O_DSYNC if they need any syncing at all, we enforce that
+	 * it is always set instead of having to deal with possibly weird
+	 * behaviour for malicious applications setting only __O_SYNC.
+	 */
+	if (flags & __O_SYNC)
+		flags |= O_DSYNC;
+
 	op->open_flag = flags;
 
 	/* O_TRUNC implies we need access checks for write permissions */
@@ -1022,6 +1058,18 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
 		lookup_flags |= LOOKUP_DIRECTORY;
 	if (!(flags & O_NOFOLLOW))
 		lookup_flags |= LOOKUP_FOLLOW;
+
+	if (how->resolve & RESOLVE_NO_XDEV)
+		lookup_flags |= LOOKUP_NO_XDEV;
+	if (how->resolve & RESOLVE_NO_MAGICLINKS)
+		lookup_flags |= LOOKUP_NO_MAGICLINKS;
+	if (how->resolve & RESOLVE_NO_SYMLINKS)
+		lookup_flags |= LOOKUP_NO_SYMLINKS;
+	if (how->resolve & RESOLVE_BENEATH)
+		lookup_flags |= LOOKUP_BENEATH;
+	if (how->resolve & RESOLVE_IN_ROOT)
+		lookup_flags |= LOOKUP_IN_ROOT;
+
 	op->lookup_flags = lookup_flags;
 	return 0;
 }
@@ -1040,8 +1088,11 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
 struct file *file_open_name(struct filename *name, int flags, umode_t mode)
 {
 	struct open_flags op;
-	int err = build_open_flags(flags, mode, &op);
-	return err ? ERR_PTR(err) : do_filp_open(AT_FDCWD, name, &op);
+	struct open_how how = build_open_how(flags, mode);
+	int err = build_open_flags(&how, &op);
+	if (err)
+		return ERR_PTR(err);
+	return do_filp_open(AT_FDCWD, name, &op);
 }
 
 /**
@@ -1072,17 +1123,19 @@ struct file *file_open_root(struct dentry *dentry, struct vfsmount *mnt,
 			    const char *filename, int flags, umode_t mode)
 {
 	struct open_flags op;
-	int err = build_open_flags(flags, mode, &op);
+	struct open_how how = build_open_how(flags, mode);
+	int err = build_open_flags(&how, &op);
 	if (err)
 		return ERR_PTR(err);
 	return do_file_open_root(dentry, mnt, filename, &op);
 }
 EXPORT_SYMBOL(file_open_root);
 
-long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
+static long do_sys_openat2(int dfd, const char __user *filename,
+			   struct open_how *how)
 {
 	struct open_flags op;
-	int fd = build_open_flags(flags, mode, &op);
+	int fd = build_open_flags(how, &op);
 	struct filename *tmp;
 
 	if (fd)
@@ -1092,7 +1145,7 @@ long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
 	if (IS_ERR(tmp))
 		return PTR_ERR(tmp);
 
-	fd = get_unused_fd_flags(flags);
+	fd = get_unused_fd_flags(how->flags);
 	if (fd >= 0) {
 		struct file *f = do_filp_open(dfd, tmp, &op);
 		if (IS_ERR(f)) {
@@ -1107,12 +1160,16 @@ long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
 	return fd;
 }
 
-SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
+long do_sys_open(int dfd, const char __user *filename, int flags, umode_t mode)
 {
-	if (force_o_largefile())
-		flags |= O_LARGEFILE;
+	struct open_how how = build_open_how(flags, mode);
+	return do_sys_openat2(dfd, filename, &how);
+}
 
-	return do_sys_open(AT_FDCWD, filename, flags, mode);
+
+SYSCALL_DEFINE3(open, const char __user *, filename, int, flags, umode_t, mode)
+{
+	return ksys_open(filename, flags, mode);
 }
 
 SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags,
@@ -1120,10 +1177,32 @@ SYSCALL_DEFINE4(openat, int, dfd, const char __user *, filename, int, flags,
 {
 	if (force_o_largefile())
 		flags |= O_LARGEFILE;
-
 	return do_sys_open(dfd, filename, flags, mode);
 }
 
+SYSCALL_DEFINE4(openat2, int, dfd, const char __user *, filename,
+		struct open_how __user *, how, size_t, usize)
+{
+	int err;
+	struct open_how tmp;
+
+	BUILD_BUG_ON(sizeof(struct open_how) < OPEN_HOW_SIZE_VER0);
+	BUILD_BUG_ON(sizeof(struct open_how) != OPEN_HOW_SIZE_LATEST);
+
+	if (unlikely(usize < OPEN_HOW_SIZE_VER0))
+		return -EINVAL;
+
+	err = copy_struct_from_user(&tmp, sizeof(tmp), how, usize);
+	if (err)
+		return err;
+
+	/* O_LARGEFILE is only allowed for non-O_PATH. */
+	if (!(tmp.flags & O_PATH) && force_o_largefile())
+		tmp.flags |= O_LARGEFILE;
+
+	return do_sys_openat2(dfd, filename, &tmp);
+}
+
 #ifdef CONFIG_COMPAT
 /*
  * Exactly like sys_open(), except that it doesn't set the
diff --git a/include/linux/fcntl.h b/include/linux/fcntl.h
index d019df946cb2..7bcdcf4f6ab2 100644
--- a/include/linux/fcntl.h
+++ b/include/linux/fcntl.h
@@ -2,15 +2,29 @@
 #ifndef _LINUX_FCNTL_H
 #define _LINUX_FCNTL_H
 
+#include <linux/stat.h>
 #include <uapi/linux/fcntl.h>
 
-/* list of all valid flags for the open/openat flags argument: */
+/* List of all valid flags for the open/openat flags argument: */
 #define VALID_OPEN_FLAGS \
 	(O_RDONLY | O_WRONLY | O_RDWR | O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC | \
 	 O_APPEND | O_NDELAY | O_NONBLOCK | O_NDELAY | __O_SYNC | O_DSYNC | \
 	 FASYNC	| O_DIRECT | O_LARGEFILE | O_DIRECTORY | O_NOFOLLOW | \
 	 O_NOATIME | O_CLOEXEC | O_PATH | __O_TMPFILE)
 
+/* List of all valid flags for the how->upgrade_mask argument: */
+#define VALID_UPGRADE_FLAGS \
+	(UPGRADE_NOWRITE | UPGRADE_NOREAD)
+
+/* List of all valid flags for the how->resolve argument: */
+#define VALID_RESOLVE_FLAGS \
+	(RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS | RESOLVE_NO_SYMLINKS | \
+	 RESOLVE_BENEATH | RESOLVE_IN_ROOT)
+
+/* List of all open_how "versions". */
+#define OPEN_HOW_SIZE_VER0	24 /* sizeof first published struct */
+#define OPEN_HOW_SIZE_LATEST	OPEN_HOW_SIZE_VER0
+
 #ifndef force_o_largefile
 #define force_o_largefile() (!IS_ENABLED(CONFIG_ARCH_32BIT_OFF_T))
 #endif
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index d0391cc2dae9..cd9f27cbc567 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -69,6 +69,7 @@ struct rseq;
 union bpf_attr;
 struct io_uring_params;
 struct clone_args;
+struct open_how;
 
 #include <linux/types.h>
 #include <linux/aio_abi.h>
@@ -439,6 +440,8 @@ asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
 asmlinkage long sys_openat(int dfd, const char __user *filename, int flags,
 			   umode_t mode);
+asmlinkage long sys_openat2(int dfd, const char __user *filename,
+			    struct open_how __user *how, size_t size);
 asmlinkage long sys_close(unsigned int fd);
 asmlinkage long sys_vhangup(void);
 
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 1fc8faa6e973..d4122c091472 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -851,8 +851,11 @@ __SYSCALL(__NR_pidfd_open, sys_pidfd_open)
 __SYSCALL(__NR_clone3, sys_clone3)
 #endif
 
+#define __NR_openat2 437
+__SYSCALL(__NR_openat2, sys_openat2)
+
 #undef __NR_syscalls
-#define __NR_syscalls 436
+#define __NR_syscalls 438
 
 /*
  * 32 bit systems traditionally used different
diff --git a/include/uapi/linux/fcntl.h b/include/uapi/linux/fcntl.h
index 1f97b33c840e..ca88b7bce553 100644
--- a/include/uapi/linux/fcntl.h
+++ b/include/uapi/linux/fcntl.h
@@ -3,6 +3,7 @@
 #define _UAPI_LINUX_FCNTL_H
 
 #include <asm/fcntl.h>
+#include <linux/openat2.h>
 
 #define F_SETLEASE	(F_LINUX_SPECIFIC_BASE + 0)
 #define F_GETLEASE	(F_LINUX_SPECIFIC_BASE + 1)
@@ -100,5 +101,4 @@
 
 #define AT_RECURSIVE		0x8000	/* Apply to the entire subtree */
 
-
 #endif /* _UAPI_LINUX_FCNTL_H */
diff --git a/include/uapi/linux/openat2.h b/include/uapi/linux/openat2.h
new file mode 100644
index 000000000000..58b1eb711360
--- /dev/null
+++ b/include/uapi/linux/openat2.h
@@ -0,0 +1,39 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _UAPI_LINUX_OPENAT2_H
+#define _UAPI_LINUX_OPENAT2_H
+
+#include <linux/types.h>
+
+/*
+ * Arguments for how openat2(2) should open the target path. If only @flags and
+ * @mode are non-zero, then openat2(2) operates very similarly to openat(2).
+ *
+ * However, unlike openat(2), unknown or invalid bits in @flags result in
+ * -EINVAL rather than being silently ignored. @mode must be zero unless one of
+ * {O_CREAT, O_TMPFILE} is set.
+ *
+ * @flags: O_* flags.
+ * @mode: O_CREAT/O_TMPFILE file mode.
+ * @resolve: RESOLVE_* flags.
+ */
+struct open_how {
+	__u64 flags;
+	__u64 mode;
+	__u64 resolve;
+};
+
+/* how->resolve flags for openat2(2). */
+#define RESOLVE_NO_XDEV		0x01 /* Block mount-point crossings
+					(includes bind-mounts). */
+#define RESOLVE_NO_MAGICLINKS	0x02 /* Block traversal through procfs-style
+					"magic-links". */
+#define RESOLVE_NO_SYMLINKS	0x04 /* Block traversal through all symlinks
+					(implies RESOLVE_NO_MAGICLINKS) */
+#define RESOLVE_BENEATH		0x08 /* Block "lexical" trickery like
+					"..", symlinks, and absolute
+					paths which escape the dirfd. */
+#define RESOLVE_IN_ROOT		0x10 /* Make all jumps to "/" and ".."
+					be scoped inside the dirfd
+					(similar to chroot(2)). */
+
+#endif /* _UAPI_LINUX_OPENAT2_H */
-- 
cgit v1.2.3
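
The header comment on struct open_how above says that openat2(2) rejects a
non-zero @mode without a create-like flag, whereas the older syscalls clear it
silently. A minimal userspace sketch of that difference follows. It is a model,
not kernel code: `how_model`, `model_build_how()` and `model_check_mode()` are
hypothetical names, flag masking against VALID_OPEN_FLAGS is left out, and only
O_CREAT is treated as create-like (the kernel also accepts __O_TMPFILE).

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for struct open_how; not the UAPI type. */
struct how_model {
	uint64_t flags;
	uint64_t mode;
};

/* Same value as the kernel's S_IALLUGO (rwx bits plus suid/sgid/sticky). */
#define MODEL_MODE_BITS 07777u

/* Simplification for this sketch: only O_CREAT counts as create-like. */
static int model_will_create(uint64_t flags)
{
	return (flags & O_CREAT) != 0;
}

/* Legacy path: sanitize silently, like the mode handling in build_open_how(). */
static struct how_model model_build_how(int flags, unsigned int mode)
{
	struct how_model how = {
		.flags = (uint64_t)(unsigned int)flags,
		.mode = mode & MODEL_MODE_BITS,
	};

	if (!model_will_create(how.flags))
		how.mode = 0;
	return how;
}

/* openat2 path: reject instead of sanitizing, like build_open_flags(). */
static int model_check_mode(const struct how_model *how)
{
	if (model_will_create(how->flags))
		return (how->mode & ~(uint64_t)MODEL_MODE_BITS) ? -EINVAL : 0;
	return how->mode ? -EINVAL : 0;
}
```

A value produced by `model_build_how()` always passes `model_check_mode()`,
which is why the legacy syscalls keep their old behaviour while a caller that
fills the structure by hand gets -EINVAL for the same stray mode bits.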


From 1292e972fff2b2d81e139e0c2fe5f50249e78c58 Mon Sep 17 00:00:00 2001
From: Eugene Syromiatnikov <esyr@redhat.com>
Date: Wed, 15 Jan 2020 17:35:38 +0100
Subject: io_uring: fix compat for IORING_REGISTER_FILES_UPDATE

The fds field of struct io_uring_files_update is problematic with regard
to compat user space, as pointer size is different in 32-bit, 32-on-64-bit,
and 64-bit user space.  In order to avoid custom handling of compat in
the syscall implementation, make fds __u64 and use u64_to_user_ptr in
order to retrieve it.  Also, align the field naturally and check that
no garbage is passed there.

Fixes: c3a31e605620c279 ("io_uring: add support for IORING_REGISTER_FILES_UPDATE")
Signed-off-by: Eugene Syromiatnikov <esyr@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 4 +++-
 include/uapi/linux/io_uring.h | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

(limited to 'include/uapi/linux')
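
As an illustration of why the new layout is compat-safe, the sketch below
mirrors the patched structure with fixed-width types and checks that its size
and the offset of fds do not depend on the pointer width. The names
`files_update_model` and `model_make_update()` are hypothetical, and the
`aligned(8)` attribute (GCC/Clang syntax) stands in for __aligned_u64.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the patched struct io_uring_files_update. */
struct files_update_model {
	uint32_t offset;
	uint32_t resv;	/* must be zero, the kernel returns -EINVAL otherwise */
	uint64_t fds __attribute__((aligned(8)));	/* really an __s32 pointer */
};

/* Identical on 32-bit, 32-on-64-bit and 64-bit builds. */
_Static_assert(sizeof(struct files_update_model) == 16, "unexpected size");
_Static_assert(offsetof(struct files_update_model, fds) == 8, "unexpected offset");

/* Userspace side: widen the pointer through uintptr_t before storing it. */
static struct files_update_model model_make_update(uint32_t offset,
						   const int32_t *fds)
{
	struct files_update_model up = {
		.offset = offset,
		.resv = 0,
		.fds = (uint64_t)(uintptr_t)fds,
	};
	return up;
}
```

The kernel side then recovers the pointer with u64_to_user_ptr(), so no
separate compat path is needed.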

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 187dd94fd6b1..5953d7f13690 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4463,13 +4463,15 @@ static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg,
 		return -EINVAL;
 	if (copy_from_user(&up, arg, sizeof(up)))
 		return -EFAULT;
+	if (up.resv)
+		return -EINVAL;
 	if (check_add_overflow(up.offset, nr_args, &done))
 		return -EOVERFLOW;
 	if (done > ctx->nr_user_files)
 		return -EINVAL;
 
 	done = 0;
-	fds = (__s32 __user *) up.fds;
+	fds = u64_to_user_ptr(up.fds);
 	while (nr_args) {
 		struct fixed_file_table *table;
 		unsigned index;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index a3300e1b9a01..55cfcb71606d 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -178,7 +178,8 @@ struct io_uring_params {
 
 struct io_uring_files_update {
 	__u32 offset;
-	__s32 *fds;
+	__u32 resv;
+	__aligned_u64 /* __s32 * */ fds;
 };
 
 #endif
-- 
cgit v1.2.3


From d63d1b5edb7b832210bfde587ba9e7549fa064eb Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 10 Dec 2019 10:38:56 -0700
Subject: io_uring: add support for fallocate()

This exposes fallocate(2) through io_uring.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 60 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 61 insertions(+)

(limited to 'include/uapi/linux')
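
For readers filling the SQE by hand, the field mapping used by
io_fallocate_prep() in this patch is not obvious from the names: the byte
length travels in sqe->addr and the fallocate mode in sqe->len. A small sketch
of that mapping follows. `sqe_model` is a hypothetical subset of the real
struct io_uring_sqe with a different layout, and the opcode is taken as a
parameter so that no numeric value is assumed here.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical subset of an SQE; not the real struct io_uring_sqe layout. */
struct sqe_model {
	uint8_t  opcode;
	int32_t  fd;
	uint64_t off;
	uint64_t addr;
	uint32_t len;
	uint16_t ioprio;
	uint16_t buf_index;
	uint32_t rw_flags;
};

static void model_prep_fallocate(struct sqe_model *sqe, uint8_t opcode, int fd,
				 int mode, uint64_t offset, uint64_t len)
{
	/* ioprio, buf_index and rw_flags must stay zero or prep fails. */
	memset(sqe, 0, sizeof(*sqe));
	sqe->opcode = opcode;		/* caller passes IORING_OP_FALLOCATE */
	sqe->fd = fd;
	sqe->off = offset;		/* kernel: req->sync.off  = sqe->off  */
	sqe->addr = len;		/* kernel: req->sync.len  = sqe->addr */
	sqe->len = (uint32_t)mode;	/* kernel: req->sync.mode = sqe->len  */
}
```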

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5953d7f13690..7a4e00ef02be 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -319,6 +319,7 @@ struct io_sync {
 	loff_t				len;
 	loff_t				off;
 	int				flags;
+	int				mode;
 };
 
 struct io_cancel {
@@ -2101,6 +2102,54 @@ static int io_fsync(struct io_kiocb *req, struct io_kiocb **nxt,
 	return 0;
 }
 
+static void io_fallocate_finish(struct io_wq_work **workptr)
+{
+	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
+	struct io_kiocb *nxt = NULL;
+	int ret;
+
+	ret = vfs_fallocate(req->file, req->sync.mode, req->sync.off,
+				req->sync.len);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req_find_next(req, &nxt);
+	if (nxt)
+		io_wq_assign_next(workptr, nxt);
+}
+
+static int io_fallocate_prep(struct io_kiocb *req,
+			     const struct io_uring_sqe *sqe)
+{
+	if (sqe->ioprio || sqe->buf_index || sqe->rw_flags)
+		return -EINVAL;
+
+	req->sync.off = READ_ONCE(sqe->off);
+	req->sync.len = READ_ONCE(sqe->addr);
+	req->sync.mode = READ_ONCE(sqe->len);
+	return 0;
+}
+
+static int io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt,
+			bool force_nonblock)
+{
+	struct io_wq_work *work, *old_work;
+
+	/* fallocate always requires a blocking context */
+	if (force_nonblock) {
+		io_put_req(req);
+		req->work.func = io_fallocate_finish;
+		return -EAGAIN;
+	}
+
+	work = old_work = &req->work;
+	io_fallocate_finish(&work);
+	if (work && work != old_work)
+		*nxt = container_of(work, struct io_kiocb, work);
+
+	return 0;
+}
+
 static int io_prep_sfr(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_ring_ctx *ctx = req->ctx;
@@ -3123,6 +3172,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_ACCEPT:
 		ret = io_accept_prep(req, sqe);
 		break;
+	case IORING_OP_FALLOCATE:
+		ret = io_fallocate_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3277,6 +3329,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_async_cancel(req, nxt);
 		break;
+	case IORING_OP_FALLOCATE:
+		if (sqe) {
+			ret = io_fallocate_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_fallocate(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 55cfcb71606d..ad1574f35eb3 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -76,6 +76,7 @@ enum {
 	IORING_OP_ASYNC_CANCEL,
 	IORING_OP_LINK_TIMEOUT,
 	IORING_OP_CONNECT,
+	IORING_OP_FALLOCATE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From 15b71abe7b52df214785dde0de9f581cc0216d17 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 11 Dec 2019 11:20:36 -0700
Subject: io_uring: add support for IORING_OP_OPENAT

This works just like openat(2), except it can be performed async. For
the normal case of a non-blocking path lookup this will complete
inline. If we have to do IO to perform the open, it'll be done from
async context.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 95 ++++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/io_uring.h |  2 +
 2 files changed, 95 insertions(+), 2 deletions(-)

(limited to 'include/uapi/linux')
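
Since io_openat() posts either the new descriptor or a negative errno as the
completion result, the value a caller sees should match what the synchronous
helper below returns. This is only a sketch of the result convention using
plain openat(2); `model_openat_result()` is a hypothetical name and nothing
here touches a ring.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/*
 * Synchronous equivalent of the completion value: a descriptor (>= 0) on
 * success, -errno on failure.
 */
static int model_openat_result(int dfd, const char *path, int flags,
			       mode_t mode)
{
	int fd = openat(dfd, path, flags, mode);

	return fd >= 0 ? fd : -errno;
}
```

In the patch the same four inputs arrive through sqe->fd, sqe->addr,
sqe->open_flags and sqe->len respectively.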

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 7a4e00ef02be..34cbce622fcd 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -70,6 +70,8 @@
 #include <linux/sizes.h>
 #include <linux/hugetlb.h>
 #include <linux/highmem.h>
+#include <linux/namei.h>
+#include <linux/fsnotify.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -353,6 +355,15 @@ struct io_sr_msg {
 	int				msg_flags;
 };
 
+struct io_open {
+	struct file			*file;
+	int				dfd;
+	umode_t				mode;
+	const char __user		*fname;
+	struct filename			*filename;
+	int				flags;
+};
+
 struct io_async_connect {
 	struct sockaddr_storage		address;
 };
@@ -371,12 +382,17 @@ struct io_async_rw {
 	ssize_t				size;
 };
 
+struct io_async_open {
+	struct filename			*filename;
+};
+
 struct io_async_ctx {
 	union {
 		struct io_async_rw	rw;
 		struct io_async_msghdr	msg;
 		struct io_async_connect	connect;
 		struct io_timeout_data	timeout;
+		struct io_async_open	open;
 	};
 };
 
@@ -397,6 +413,7 @@ struct io_kiocb {
 		struct io_timeout	timeout;
 		struct io_connect	connect;
 		struct io_sr_msg	sr_msg;
+		struct io_open		open;
 	};
 
 	struct io_async_ctx		*io;
@@ -2150,6 +2167,67 @@ static int io_fallocate(struct io_kiocb *req, struct io_kiocb **nxt,
 	return 0;
 }
 
+static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	int ret;
+
+	if (sqe->ioprio || sqe->buf_index)
+		return -EINVAL;
+
+	req->open.dfd = READ_ONCE(sqe->fd);
+	req->open.mode = READ_ONCE(sqe->len);
+	req->open.fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	req->open.flags = READ_ONCE(sqe->open_flags);
+
+	req->open.filename = getname(req->open.fname);
+	if (IS_ERR(req->open.filename)) {
+		ret = PTR_ERR(req->open.filename);
+		req->open.filename = NULL;
+		return ret;
+	}
+
+	return 0;
+}
+
+static int io_openat(struct io_kiocb *req, struct io_kiocb **nxt,
+		     bool force_nonblock)
+{
+	struct open_flags op;
+	struct open_how how;
+	struct file *file;
+	int ret;
+
+	if (force_nonblock) {
+		req->work.flags |= IO_WQ_WORK_NEEDS_FILES;
+		return -EAGAIN;
+	}
+
+	how = build_open_how(req->open.flags, req->open.mode);
+	ret = build_open_flags(&how, &op);
+	if (ret)
+		goto err;
+
+	ret = get_unused_fd_flags(how.flags);
+	if (ret < 0)
+		goto err;
+
+	file = do_filp_open(req->open.dfd, req->open.filename, &op);
+	if (IS_ERR(file)) {
+		put_unused_fd(ret);
+		ret = PTR_ERR(file);
+	} else {
+		fsnotify_open(file);
+		fd_install(ret, file);
+	}
+err:
+	putname(req->open.filename);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req_find_next(req, nxt);
+	return 0;
+}
+
 static int io_prep_sfr(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_ring_ctx *ctx = req->ctx;
@@ -3175,6 +3253,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_FALLOCATE:
 		ret = io_fallocate_prep(req, sqe);
 		break;
+	case IORING_OP_OPENAT:
+		ret = io_openat_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3337,6 +3418,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_fallocate(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_OPENAT:
+		if (sqe) {
+			ret = io_openat_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_openat(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -3409,7 +3498,7 @@ static bool io_req_op_valid(int op)
 	return op >= IORING_OP_NOP && op < IORING_OP_LAST;
 }
 
-static int io_req_needs_file(struct io_kiocb *req)
+static int io_req_needs_file(struct io_kiocb *req, int fd)
 {
 	switch (req->opcode) {
 	case IORING_OP_NOP:
@@ -3419,6 +3508,8 @@ static int io_req_needs_file(struct io_kiocb *req)
 	case IORING_OP_ASYNC_CANCEL:
 	case IORING_OP_LINK_TIMEOUT:
 		return 0;
+	case IORING_OP_OPENAT:
+		return fd != -1;
 	default:
 		if (io_req_op_valid(req->opcode))
 			return 1;
@@ -3448,7 +3539,7 @@ static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
 	if (flags & IOSQE_IO_DRAIN)
 		req->flags |= REQ_F_IO_DRAIN;
 
-	ret = io_req_needs_file(req);
+	ret = io_req_needs_file(req, fd);
 	if (ret <= 0)
 		return ret;
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index ad1574f35eb3..c1a7c1c65eaf 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -34,6 +34,7 @@ struct io_uring_sqe {
 		__u32		timeout_flags;
 		__u32		accept_flags;
 		__u32		cancel_flags;
+		__u32		open_flags;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	union {
@@ -77,6 +78,7 @@ enum {
 	IORING_OP_LINK_TIMEOUT,
 	IORING_OP_CONNECT,
 	IORING_OP_FALLOCATE,
+	IORING_OP_OPENAT,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From b5dba59e0cf7e2cc4d3b3b1ac5fe81ddf21959eb Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 11 Dec 2019 14:02:38 -0700
Subject: io_uring: add support for IORING_OP_CLOSE

This works just like close(2), unsurprisingly. We remove the file
descriptor and post the completion inline, then offload the actual
(potential) last file put to async context.

Mark the async part of this work as uncancellable, as we really must
guarantee that the latter part of the close is run.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 109 ++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |   1 +
 2 files changed, 110 insertions(+)

(limited to 'include/uapi/linux')
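
The checks in io_close_prep() can be summarised as: every unused SQE field
must be zero, fixed files are not accepted, and the ring's own descriptor
cannot be closed. A userspace model of those checks follows. The struct, the
function name and the value of `MODEL_IOSQE_FIXED_FILE` are hypothetical, and
the kernel's additional test against io_uring_fops is omitted.

```c
#include <errno.h>
#include <stdint.h>

#define MODEL_IOSQE_FIXED_FILE 0x1u	/* hypothetical value for this sketch */

/* Hypothetical subset of an SQE; not the real struct io_uring_sqe layout. */
struct close_sqe_model {
	uint8_t  flags;
	int32_t  fd;
	uint16_t ioprio;
	uint16_t buf_index;
	uint32_t len;
	uint32_t rw_flags;
	uint64_t off;
	uint64_t addr;
};

static int model_close_prep(const struct close_sqe_model *sqe, int ring_fd)
{
	/* Fields that close does not use must be left at zero. */
	if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
	    sqe->rw_flags || sqe->buf_index)
		return -EINVAL;
	/* Closing by fixed-file index is rejected. */
	if (sqe->flags & MODEL_IOSQE_FIXED_FILE)
		return -EINVAL;
	/* The ring must not close itself. */
	if (sqe->fd == ring_fd)
		return -EBADF;
	return 0;
}
```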

diff --git a/fs/io_uring.c b/fs/io_uring.c
index fe227650efd6..6aaff7bfe8b5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -301,6 +301,12 @@ struct io_poll_iocb {
 	struct wait_queue_entry		wait;
 };
 
+struct io_close {
+	struct file			*file;
+	struct file			*put_file;
+	int				fd;
+};
+
 struct io_timeout_data {
 	struct io_kiocb			*req;
 	struct hrtimer			timer;
@@ -414,6 +420,7 @@ struct io_kiocb {
 		struct io_connect	connect;
 		struct io_sr_msg	sr_msg;
 		struct io_open		open;
+		struct io_close		close;
 	};
 
 	struct io_async_ctx		*io;
@@ -2228,6 +2235,94 @@ err:
 	return 0;
 }
 
+static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	/*
+	 * If we queue this for async, it must not be cancellable. That would
+	 * leave the 'file' in an indeterminate state.
+	 */
+	req->work.flags |= IO_WQ_WORK_NO_CANCEL;
+
+	if (sqe->ioprio || sqe->off || sqe->addr || sqe->len ||
+	    sqe->rw_flags || sqe->buf_index)
+		return -EINVAL;
+	if (sqe->flags & IOSQE_FIXED_FILE)
+		return -EINVAL;
+
+	req->close.fd = READ_ONCE(sqe->fd);
+	if (req->file->f_op == &io_uring_fops ||
+	    req->close.fd == req->ring_fd)
+		return -EBADF;
+
+	return 0;
+}
+
+static void io_close_finish(struct io_wq_work **workptr)
+{
+	struct io_kiocb *req = container_of(*workptr, struct io_kiocb, work);
+	struct io_kiocb *nxt = NULL;
+
+	/* Invoked with files, we need to do the close */
+	if (req->work.files) {
+		int ret;
+
+		ret = filp_close(req->close.put_file, req->work.files);
+		if (ret < 0) {
+			req_set_fail_links(req);
+		}
+		io_cqring_add_event(req, ret);
+	}
+
+	fput(req->close.put_file);
+
+	/* we bypassed the re-issue, drop the submission reference */
+	io_put_req(req);
+	io_put_req_find_next(req, &nxt);
+	if (nxt)
+		io_wq_assign_next(workptr, nxt);
+}
+
+static int io_close(struct io_kiocb *req, struct io_kiocb **nxt,
+		    bool force_nonblock)
+{
+	int ret;
+
+	req->close.put_file = NULL;
+	ret = __close_fd_get_file(req->close.fd, &req->close.put_file);
+	if (ret < 0)
+		return ret;
+
+	/* if the file has a flush method, be safe and punt to async */
+	if (req->close.put_file->f_op->flush && !io_wq_current_is_worker()) {
+		req->work.flags |= IO_WQ_WORK_NEEDS_FILES;
+		goto eagain;
+	}
+
+	/*
+	 * No ->flush(), safely close from here and just punt the
+	 * fput() to async context.
+	 */
+	ret = filp_close(req->close.put_file, current->files);
+
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+
+	if (io_wq_current_is_worker()) {
+		struct io_wq_work *old_work, *work;
+
+		old_work = work = &req->work;
+		io_close_finish(&work);
+		if (work && work != old_work)
+			*nxt = container_of(work, struct io_kiocb, work);
+		return 0;
+	}
+
+eagain:
+	req->work.func = io_close_finish;
+	return -EAGAIN;
+}
+
 static int io_prep_sfr(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	struct io_ring_ctx *ctx = req->ctx;
@@ -3256,6 +3351,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_OPENAT:
 		ret = io_openat_prep(req, sqe);
 		break;
+	case IORING_OP_CLOSE:
+		ret = io_close_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3426,6 +3524,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_openat(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_CLOSE:
+		if (sqe) {
+			ret = io_close_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_close(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -3572,6 +3678,9 @@ static int io_grab_files(struct io_kiocb *req)
 	int ret = -EBADF;
 	struct io_ring_ctx *ctx = req->ctx;
 
+	if (!req->ring_file)
+		return -EBADF;
+
 	rcu_read_lock();
 	spin_lock_irq(&ctx->inflight_lock);
 	/*
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index c1a7c1c65eaf..084dea85b838 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -79,6 +79,7 @@ enum {
 	IORING_OP_CONNECT,
 	IORING_OP_FALLOCATE,
 	IORING_OP_OPENAT,
+	IORING_OP_CLOSE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From 05f3fb3c5397524feae2e73ee8e150a9090a7da2 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 9 Dec 2019 11:22:50 -0700
Subject: io_uring: avoid ring quiesce for fixed file set unregister and update

We currently fully quiesce the ring before an unregister or update of
the fixed fileset. This is very expensive, and we can be a bit smarter
about this.

Add a percpu refcount for the file tables as a whole. Grab a percpu ref
when we use a registered file, and put it on completion. This is cheap
to do. Upon removal of a file from a set, switch the ref count to atomic
mode. When we hit zero ref on the completion side, then we know we can
drop the previously registered files. When the old files have been
dropped, switch the ref back to percpu mode for normal operation.
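
As a rough illustration of the lifecycle described above, here is a
single-threaded userspace toy model. It is only a sketch: the names
(toy_file_data, toy_get, toy_put, toy_remove) are made up for this
example, a plain counter stands in for the percpu ref, and the
percpu/atomic mode switch and the workqueue punt are not modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model of the scheme; this is not kernel code. */
#define TOY_MAX_PENDING 8

struct toy_file_data {
	long refs;			/* base ref + one per in-flight request */
	int pending[TOY_MAX_PENDING];	/* removed files waiting to be dropped */
	size_t nr_pending;
	int dropped;			/* number of files actually released */
};

static void toy_init(struct toy_file_data *d)
{
	d->refs = 1;		/* base reference held by the table itself */
	d->nr_pending = 0;
	d->dropped = 0;
}

/* Last reference gone: release queued files, then re-arm the base ref. */
static void toy_ref_zero(struct toy_file_data *d)
{
	d->dropped += (int)d->nr_pending;
	d->nr_pending = 0;
	d->refs = 1;
}

/* Completion side: a request is done with its registered file. */
static void toy_put(struct toy_file_data *d)
{
	assert(d->refs > 0);
	if (--d->refs == 0)
		toy_ref_zero(d);
}

/* Submission side: a request starts using a registered file. */
static void toy_get(struct toy_file_data *d)
{
	d->refs++;
}

/* Remove a file from the set: queue it and drop the base ref once. */
static int toy_remove(struct toy_file_data *d, int file)
{
	if (d->nr_pending == TOY_MAX_PENDING)
		return -1;
	d->pending[d->nr_pending++] = file;
	if (d->nr_pending == 1)	/* only the first queued removal drops it */
		toy_put(d);
	return 0;
}
```

In this model a removed file is released immediately when nothing is in
flight, and otherwise only when the last in-flight request calls
toy_put().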

Since there's a period between doing the update and the kernel being
done with it, add an IORING_OP_FILES_UPDATE opcode that can perform the
same action. The application knows the update has completed when it gets
the CQE for it. Between doing the update and receiving this completion,
the application must continue to use the unregistered fd if submitting
IO on this particular file.
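
For illustration, the SQE fields read by the prep handler in this patch
(off, len, addr) could be filled in roughly as below. This is a sketch
under assumptions: struct sketch_sqe is a stand-in holding only the
fields of interest, not the real struct io_uring_sqe layout, and
sketch_prep_files_update() is a hypothetical helper, not an existing
API. Setting the opcode itself is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Stand-in for the few SQE fields this opcode looks at; NOT the real
 * struct io_uring_sqe layout.
 */
struct sketch_sqe {
	uint8_t  flags;
	uint16_t ioprio;
	uint64_t off;
	uint64_t addr;
	uint32_t len;
	uint32_t rw_flags;
};

/* Returns 0 on success, -1 if nr_fds is zero (the prep step rejects it). */
static int sketch_prep_files_update(struct sketch_sqe *sqe,
				    const int32_t *fds, uint32_t nr_fds,
				    uint32_t offset)
{
	assert(sqe && fds);
	if (!nr_fds)
		return -1;

	/* flags, ioprio and rw_flags must all be zero for this opcode */
	memset(sqe, 0, sizeof(*sqe));
	sqe->off = offset;	/* first slot of the fixed set to update */
	sqe->len = nr_fds;	/* number of slots to update */
	/* array of fds to install; an entry of -1 leaves the slot empty */
	sqe->addr = (uint64_t)(uintptr_t)fds;
	return 0;
}
```

The completion for such a request carries the number of updated slots,
or a negative error if none could be updated.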

This takes the runtime of test/file-register from liburing from 14s to
about 0.7s.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 485 ++++++++++++++++++++++++++++++------------
 include/uapi/linux/io_uring.h |   1 +
 2 files changed, 351 insertions(+), 135 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 6aaff7bfe8b5..4325068324b7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -179,6 +179,21 @@ struct fixed_file_table {
 	struct file		**files;
 };
 
+enum {
+	FFD_F_ATOMIC,
+};
+
+struct fixed_file_data {
+	struct fixed_file_table		*table;
+	struct io_ring_ctx		*ctx;
+
+	struct percpu_ref		refs;
+	struct llist_head		put_llist;
+	unsigned long			state;
+	struct work_struct		ref_work;
+	struct completion		done;
+};
+
 struct io_ring_ctx {
 	struct {
 		struct percpu_ref	refs;
@@ -231,7 +246,7 @@ struct io_ring_ctx {
 	 * readers must ensure that ->refs is alive as long as the file* is
 	 * used. Only updated through io_uring_register(2).
 	 */
-	struct fixed_file_table	*file_table;
+	struct fixed_file_data	*file_data;
 	unsigned		nr_user_files;
 
 	/* if used, fixed mapped user buffers */
@@ -370,6 +385,13 @@ struct io_open {
 	int				flags;
 };
 
+struct io_files_update {
+	struct file			*file;
+	u64				arg;
+	u32				nr_args;
+	u32				offset;
+};
+
 struct io_async_connect {
 	struct sockaddr_storage		address;
 };
@@ -421,6 +443,7 @@ struct io_kiocb {
 		struct io_sr_msg	sr_msg;
 		struct io_open		open;
 		struct io_close		close;
+		struct io_files_update	files_update;
 	};
 
 	struct io_async_ctx		*io;
@@ -496,6 +519,9 @@ static void io_double_put_req(struct io_kiocb *req);
 static void __io_double_put_req(struct io_kiocb *req);
 static struct io_kiocb *io_prep_linked_timeout(struct io_kiocb *req);
 static void io_queue_linked_timeout(struct io_kiocb *req);
+static int __io_sqe_files_update(struct io_ring_ctx *ctx,
+				 struct io_uring_files_update *ip,
+				 unsigned nr_args);
 
 static struct kmem_cache *req_cachep;
 
@@ -945,6 +971,7 @@ static void io_free_req_many(struct io_ring_ctx *ctx, void **reqs, int *nr)
 	if (*nr) {
 		kmem_cache_free_bulk(req_cachep, *nr, reqs);
 		percpu_ref_put_many(&ctx->refs, *nr);
+		percpu_ref_put_many(&ctx->file_data->refs, *nr);
 		*nr = 0;
 	}
 }
@@ -955,8 +982,12 @@ static void __io_free_req(struct io_kiocb *req)
 
 	if (req->io)
 		kfree(req->io);
-	if (req->file && !(req->flags & REQ_F_FIXED_FILE))
-		fput(req->file);
+	if (req->file) {
+		if (req->flags & REQ_F_FIXED_FILE)
+			percpu_ref_put(&ctx->file_data->refs);
+		else
+			fput(req->file);
+	}
 	if (req->flags & REQ_F_INFLIGHT) {
 		unsigned long flags;
 
@@ -3293,6 +3324,45 @@ static int io_async_cancel(struct io_kiocb *req, struct io_kiocb **nxt)
 	return 0;
 }
 
+static int io_files_update_prep(struct io_kiocb *req,
+				const struct io_uring_sqe *sqe)
+{
+	if (sqe->flags || sqe->ioprio || sqe->rw_flags)
+		return -EINVAL;
+
+	req->files_update.offset = READ_ONCE(sqe->off);
+	req->files_update.nr_args = READ_ONCE(sqe->len);
+	if (!req->files_update.nr_args)
+		return -EINVAL;
+	req->files_update.arg = READ_ONCE(sqe->addr);
+	return 0;
+}
+
+static int io_files_update(struct io_kiocb *req, bool force_nonblock)
+{
+	struct io_ring_ctx *ctx = req->ctx;
+	struct io_uring_files_update up;
+	int ret;
+
+	if (force_nonblock) {
+		req->work.flags |= IO_WQ_WORK_NEEDS_FILES;
+		return -EAGAIN;
+	}
+
+	up.offset = req->files_update.offset;
+	up.fds = req->files_update.arg;
+
+	mutex_lock(&ctx->uring_lock);
+	ret = __io_sqe_files_update(ctx, &up, req->files_update.nr_args);
+	mutex_unlock(&ctx->uring_lock);
+
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req(req);
+	return 0;
+}
+
 static int io_req_defer_prep(struct io_kiocb *req,
 			     const struct io_uring_sqe *sqe)
 {
@@ -3354,6 +3424,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_CLOSE:
 		ret = io_close_prep(req, sqe);
 		break;
+	case IORING_OP_FILES_UPDATE:
+		ret = io_files_update_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3532,6 +3605,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_close(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_FILES_UPDATE:
+		if (sqe) {
+			ret = io_files_update_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_files_update(req, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -3631,8 +3712,8 @@ static inline struct file *io_file_from_index(struct io_ring_ctx *ctx,
 {
 	struct fixed_file_table *table;
 
-	table = &ctx->file_table[index >> IORING_FILE_TABLE_SHIFT];
-	return table->files[index & IORING_FILE_TABLE_MASK];
+	table = &ctx->file_data->table[index >> IORING_FILE_TABLE_SHIFT];
+	return table->files[index & IORING_FILE_TABLE_MASK];
 }
 
 static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
@@ -3653,7 +3734,7 @@ static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
 		return ret;
 
 	if (flags & IOSQE_FIXED_FILE) {
-		if (unlikely(!ctx->file_table ||
+		if (unlikely(!ctx->file_data ||
 		    (unsigned) fd >= ctx->nr_user_files))
 			return -EBADF;
 		fd = array_index_nospec(fd, ctx->nr_user_files);
@@ -3661,6 +3742,7 @@ static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
 		if (!req->file)
 			return -EBADF;
 		req->flags |= REQ_F_FIXED_FILE;
+		percpu_ref_get(&ctx->file_data->refs);
 	} else {
 		if (req->needs_fixed_file)
 			return -EBADF;
@@ -4338,19 +4420,37 @@ static void __io_sqe_files_unregister(struct io_ring_ctx *ctx)
 #endif
 }
 
+static void io_file_ref_kill(struct percpu_ref *ref)
+{
+	struct fixed_file_data *data;
+
+	data = container_of(ref, struct fixed_file_data, refs);
+	complete(&data->done);
+}
+
 static int io_sqe_files_unregister(struct io_ring_ctx *ctx)
 {
+	struct fixed_file_data *data = ctx->file_data;
 	unsigned nr_tables, i;
 
-	if (!ctx->file_table)
+	if (!data)
 		return -ENXIO;
 
+	/* protect against inflight atomic switch, which drops the ref */
+	flush_work(&data->ref_work);
+	percpu_ref_get(&data->refs);
+	percpu_ref_kill_and_confirm(&data->refs, io_file_ref_kill);
+	wait_for_completion(&data->done);
+	percpu_ref_put(&data->refs);
+	percpu_ref_exit(&data->refs);
+
 	__io_sqe_files_unregister(ctx);
 	nr_tables = DIV_ROUND_UP(ctx->nr_user_files, IORING_MAX_FILES_TABLE);
 	for (i = 0; i < nr_tables; i++)
-		kfree(ctx->file_table[i].files);
-	kfree(ctx->file_table);
-	ctx->file_table = NULL;
+		kfree(data->table[i].files);
+	kfree(data->table);
+	kfree(data);
+	ctx->file_data = NULL;
 	ctx->nr_user_files = 0;
 	return 0;
 }
@@ -4381,16 +4481,6 @@ static void io_finish_async(struct io_ring_ctx *ctx)
 }
 
 #if defined(CONFIG_UNIX)
-static void io_destruct_skb(struct sk_buff *skb)
-{
-	struct io_ring_ctx *ctx = skb->sk->sk_user_data;
-
-	if (ctx->io_wq)
-		io_wq_flush(ctx->io_wq);
-
-	unix_destruct_scm(skb);
-}
-
 /*
  * Ensure the UNIX gc is aware of our file set, so we are certain that
  * the io_uring can be safely unregistered on process exit, even if we have
@@ -4438,7 +4528,7 @@ static int __io_sqe_files_scm(struct io_ring_ctx *ctx, int nr, int offset)
 		fpl->max = SCM_MAX_FD;
 		fpl->count = nr_files;
 		UNIXCB(skb).fp = fpl;
-		skb->destructor = io_destruct_skb;
+		skb->destructor = unix_destruct_scm;
 		refcount_add(skb->truesize, &sk->sk_wmem_alloc);
 		skb_queue_head(&sk->sk_receive_queue, skb);
 
@@ -4500,7 +4590,7 @@ static int io_sqe_alloc_file_tables(struct io_ring_ctx *ctx, unsigned nr_tables,
 	int i;
 
 	for (i = 0; i < nr_tables; i++) {
-		struct fixed_file_table *table = &ctx->file_table[i];
+		struct fixed_file_table *table = &ctx->file_data->table[i];
 		unsigned this_files;
 
 		this_files = min(nr_files, IORING_MAX_FILES_TABLE);
@@ -4515,36 +4605,159 @@ static int io_sqe_alloc_file_tables(struct io_ring_ctx *ctx, unsigned nr_tables,
 		return 0;
 
 	for (i = 0; i < nr_tables; i++) {
-		struct fixed_file_table *table = &ctx->file_table[i];
+		struct fixed_file_table *table = &ctx->file_data->table[i];
 		kfree(table->files);
 	}
 	return 1;
 }
 
+static void io_ring_file_put(struct io_ring_ctx *ctx, struct file *file)
+{
+#if defined(CONFIG_UNIX)
+	struct sock *sock = ctx->ring_sock->sk;
+	struct sk_buff_head list, *head = &sock->sk_receive_queue;
+	struct sk_buff *skb;
+	int i;
+
+	__skb_queue_head_init(&list);
+
+	/*
+	 * Find the skb that holds this file in its SCM_RIGHTS. When found,
+	 * remove this entry and rearrange the file array.
+	 */
+	skb = skb_dequeue(head);
+	while (skb) {
+		struct scm_fp_list *fp;
+
+		fp = UNIXCB(skb).fp;
+		for (i = 0; i < fp->count; i++) {
+			int left;
+
+			if (fp->fp[i] != file)
+				continue;
+
+			unix_notinflight(fp->user, fp->fp[i]);
+			left = fp->count - 1 - i;
+			if (left) {
+				memmove(&fp->fp[i], &fp->fp[i + 1],
+						left * sizeof(struct file *));
+			}
+			fp->count--;
+			if (!fp->count) {
+				kfree_skb(skb);
+				skb = NULL;
+			} else {
+				__skb_queue_tail(&list, skb);
+			}
+			fput(file);
+			file = NULL;
+			break;
+		}
+
+		if (!file)
+			break;
+
+		__skb_queue_tail(&list, skb);
+
+		skb = skb_dequeue(head);
+	}
+
+	if (skb_peek(&list)) {
+		spin_lock_irq(&head->lock);
+		while ((skb = __skb_dequeue(&list)) != NULL)
+			__skb_queue_tail(head, skb);
+		spin_unlock_irq(&head->lock);
+	}
+#else
+	fput(file);
+#endif
+}
+
+struct io_file_put {
+	struct llist_node llist;
+	struct file *file;
+	struct completion *done;
+};
+
+static void io_ring_file_ref_switch(struct work_struct *work)
+{
+	struct io_file_put *pfile, *tmp;
+	struct fixed_file_data *data;
+	struct llist_node *node;
+
+	data = container_of(work, struct fixed_file_data, ref_work);
+
+	while ((node = llist_del_all(&data->put_llist)) != NULL) {
+		llist_for_each_entry_safe(pfile, tmp, node, llist) {
+			io_ring_file_put(data->ctx, pfile->file);
+			if (pfile->done)
+				complete(pfile->done);
+			else
+				kfree(pfile);
+		}
+	}
+
+	percpu_ref_get(&data->refs);
+	percpu_ref_switch_to_percpu(&data->refs);
+}
+
+static void io_file_data_ref_zero(struct percpu_ref *ref)
+{
+	struct fixed_file_data *data;
+
+	data = container_of(ref, struct fixed_file_data, refs);
+
+	/* we can't safely switch from inside this context, punt to wq */
+	queue_work(system_wq, &data->ref_work);
+}
+
 static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
 				 unsigned nr_args)
 {
 	__s32 __user *fds = (__s32 __user *) arg;
 	unsigned nr_tables;
+	struct file *file;
 	int fd, ret = 0;
 	unsigned i;
 
-	if (ctx->file_table)
+	if (ctx->file_data)
 		return -EBUSY;
 	if (!nr_args)
 		return -EINVAL;
 	if (nr_args > IORING_MAX_FIXED_FILES)
 		return -EMFILE;
 
+	ctx->file_data = kzalloc(sizeof(*ctx->file_data), GFP_KERNEL);
+	if (!ctx->file_data)
+		return -ENOMEM;
+	ctx->file_data->ctx = ctx;
+	init_completion(&ctx->file_data->done);
+
 	nr_tables = DIV_ROUND_UP(nr_args, IORING_MAX_FILES_TABLE);
-	ctx->file_table = kcalloc(nr_tables, sizeof(struct fixed_file_table),
+	ctx->file_data->table = kcalloc(nr_tables,
+					sizeof(struct fixed_file_table),
 					GFP_KERNEL);
-	if (!ctx->file_table)
+	if (!ctx->file_data->table) {
+		kfree(ctx->file_data);
+		ctx->file_data = NULL;
 		return -ENOMEM;
+	}
+
+	if (percpu_ref_init(&ctx->file_data->refs, io_file_data_ref_zero,
+				PERCPU_REF_ALLOW_REINIT, GFP_KERNEL)) {
+		kfree(ctx->file_data->table);
+		kfree(ctx->file_data);
+		ctx->file_data = NULL;
+		return -ENOMEM;
+	}
+	ctx->file_data->put_llist.first = NULL;
+	INIT_WORK(&ctx->file_data->ref_work, io_ring_file_ref_switch);
 
 	if (io_sqe_alloc_file_tables(ctx, nr_tables, nr_args)) {
-		kfree(ctx->file_table);
-		ctx->file_table = NULL;
+		percpu_ref_exit(&ctx->file_data->refs);
+		kfree(ctx->file_data->table);
+		kfree(ctx->file_data);
+		ctx->file_data = NULL;
 		return -ENOMEM;
 	}
 
@@ -4561,13 +4774,14 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
 			continue;
 		}
 
-		table = &ctx->file_table[i >> IORING_FILE_TABLE_SHIFT];
+		table = &ctx->file_data->table[i >> IORING_FILE_TABLE_SHIFT];
 		index = i & IORING_FILE_TABLE_MASK;
-		table->files[index] = fget(fd);
+		file = fget(fd);
 
 		ret = -EBADF;
-		if (!table->files[index])
+		if (!file)
 			break;
+
 		/*
 		 * Don't allow io_uring instances to be registered. If UNIX
 		 * isn't enabled, then this causes a reference cycle and this
@@ -4575,26 +4789,26 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
 		 * handle it just fine, but there's still no point in allowing
 		 * a ring fd as it doesn't support regular read/write anyway.
 		 */
-		if (table->files[index]->f_op == &io_uring_fops) {
-			fput(table->files[index]);
+		if (file->f_op == &io_uring_fops) {
+			fput(file);
 			break;
 		}
 		ret = 0;
+		table->files[index] = file;
 	}
 
 	if (ret) {
 		for (i = 0; i < ctx->nr_user_files; i++) {
-			struct file *file;
-
 			file = io_file_from_index(ctx, i);
 			if (file)
 				fput(file);
 		}
 		for (i = 0; i < nr_tables; i++)
-			kfree(ctx->file_table[i].files);
+			kfree(ctx->file_data->table[i].files);
 
-		kfree(ctx->file_table);
-		ctx->file_table = NULL;
+		kfree(ctx->file_data->table);
+		kfree(ctx->file_data);
+		ctx->file_data = NULL;
 		ctx->nr_user_files = 0;
 		return ret;
 	}
@@ -4606,69 +4820,6 @@ static int io_sqe_files_register(struct io_ring_ctx *ctx, void __user *arg,
 	return ret;
 }
 
-static void io_sqe_file_unregister(struct io_ring_ctx *ctx, int index)
-{
-#if defined(CONFIG_UNIX)
-	struct file *file = io_file_from_index(ctx, index);
-	struct sock *sock = ctx->ring_sock->sk;
-	struct sk_buff_head list, *head = &sock->sk_receive_queue;
-	struct sk_buff *skb;
-	int i;
-
-	__skb_queue_head_init(&list);
-
-	/*
-	 * Find the skb that holds this file in its SCM_RIGHTS. When found,
-	 * remove this entry and rearrange the file array.
-	 */
-	skb = skb_dequeue(head);
-	while (skb) {
-		struct scm_fp_list *fp;
-
-		fp = UNIXCB(skb).fp;
-		for (i = 0; i < fp->count; i++) {
-			int left;
-
-			if (fp->fp[i] != file)
-				continue;
-
-			unix_notinflight(fp->user, fp->fp[i]);
-			left = fp->count - 1 - i;
-			if (left) {
-				memmove(&fp->fp[i], &fp->fp[i + 1],
-						left * sizeof(struct file *));
-			}
-			fp->count--;
-			if (!fp->count) {
-				kfree_skb(skb);
-				skb = NULL;
-			} else {
-				__skb_queue_tail(&list, skb);
-			}
-			fput(file);
-			file = NULL;
-			break;
-		}
-
-		if (!file)
-			break;
-
-		__skb_queue_tail(&list, skb);
-
-		skb = skb_dequeue(head);
-	}
-
-	if (skb_peek(&list)) {
-		spin_lock_irq(&head->lock);
-		while ((skb = __skb_dequeue(&list)) != NULL)
-			__skb_queue_tail(head, skb);
-		spin_unlock_irq(&head->lock);
-	}
-#else
-	fput(io_file_from_index(ctx, index));
-#endif
-}
-
 static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file,
 				int index)
 {
@@ -4712,29 +4863,65 @@ static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file,
 #endif
 }
 
-static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg,
-			       unsigned nr_args)
+static void io_atomic_switch(struct percpu_ref *ref)
 {
-	struct io_uring_files_update up;
+	struct fixed_file_data *data;
+
+	data = container_of(ref, struct fixed_file_data, refs);
+	clear_bit(FFD_F_ATOMIC, &data->state);
+}
+
+static bool io_queue_file_removal(struct fixed_file_data *data,
+				  struct file *file)
+{
+	struct io_file_put *pfile, pfile_stack;
+	DECLARE_COMPLETION_ONSTACK(done);
+
+	/*
+	 * If we fail allocating the struct we need for doing async removal
+	 * of this file, just punt to sync and wait for it.
+	 */
+	pfile = kzalloc(sizeof(*pfile), GFP_KERNEL);
+	if (!pfile) {
+		pfile = &pfile_stack;
+		pfile->done = &done;
+	}
+
+	pfile->file = file;
+	llist_add(&pfile->llist, &data->put_llist);
+
+	if (pfile == &pfile_stack) {
+		if (!test_and_set_bit(FFD_F_ATOMIC, &data->state)) {
+			percpu_ref_put(&data->refs);
+			percpu_ref_switch_to_atomic(&data->refs,
+							io_atomic_switch);
+		}
+		wait_for_completion(&done);
+		flush_work(&data->ref_work);
+		return false;
+	}
+
+	return true;
+}
+
+static int __io_sqe_files_update(struct io_ring_ctx *ctx,
+				 struct io_uring_files_update *up,
+				 unsigned nr_args)
+{
+	struct fixed_file_data *data = ctx->file_data;
+	bool ref_switch = false;
+	struct file *file;
 	__s32 __user *fds;
 	int fd, i, err;
 	__u32 done;
 
-	if (!ctx->file_table)
-		return -ENXIO;
-	if (!nr_args)
-		return -EINVAL;
-	if (copy_from_user(&up, arg, sizeof(up)))
-		return -EFAULT;
-	if (up.resv)
-		return -EINVAL;
-	if (check_add_overflow(up.offset, nr_args, &done))
+	if (check_add_overflow(up->offset, nr_args, &done))
 		return -EOVERFLOW;
 	if (done > ctx->nr_user_files)
 		return -EINVAL;
 
 	done = 0;
-	fds = u64_to_user_ptr(up.fds);
+	fds = u64_to_user_ptr(up->fds);
 	while (nr_args) {
 		struct fixed_file_table *table;
 		unsigned index;
@@ -4744,16 +4931,16 @@ static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg,
 			err = -EFAULT;
 			break;
 		}
-		i = array_index_nospec(up.offset, ctx->nr_user_files);
-		table = &ctx->file_table[i >> IORING_FILE_TABLE_SHIFT];
+		i = array_index_nospec(up->offset, ctx->nr_user_files);
+		table = &ctx->file_data->table[i >> IORING_FILE_TABLE_SHIFT];
 		index = i & IORING_FILE_TABLE_MASK;
 		if (table->files[index]) {
-			io_sqe_file_unregister(ctx, i);
+			file = table->files[index];
 			table->files[index] = NULL;
+			if (io_queue_file_removal(data, file))
+				ref_switch = true;
 		}
 		if (fd != -1) {
-			struct file *file;
-
 			file = fget(fd);
 			if (!file) {
 				err = -EBADF;
@@ -4779,11 +4966,32 @@ static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg,
 		}
 		nr_args--;
 		done++;
-		up.offset++;
+		up->offset++;
+	}
+
+	if (ref_switch && !test_and_set_bit(FFD_F_ATOMIC, &data->state)) {
+		percpu_ref_put(&data->refs);
+		percpu_ref_switch_to_atomic(&data->refs, io_atomic_switch);
 	}
 
 	return done ? done : err;
 }
+static int io_sqe_files_update(struct io_ring_ctx *ctx, void __user *arg,
+			       unsigned nr_args)
+{
+	struct io_uring_files_update up;
+
+	if (!ctx->file_data)
+		return -ENXIO;
+	if (!nr_args)
+		return -EINVAL;
+	if (copy_from_user(&up, arg, sizeof(up)))
+		return -EFAULT;
+	if (up.resv)
+		return -EINVAL;
+
+	return __io_sqe_files_update(ctx, &up, nr_args);
+}
 
 static void io_put_work(struct io_wq_work *work)
 {
@@ -5546,7 +5754,6 @@ static int io_uring_get_fd(struct io_ring_ctx *ctx)
 
 #if defined(CONFIG_UNIX)
 	ctx->ring_sock->file = file;
-	ctx->ring_sock->sk->sk_user_data = ctx;
 #endif
 	fd_install(ret, file);
 	return ret;
@@ -5710,18 +5917,22 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 	if (percpu_ref_is_dying(&ctx->refs))
 		return -ENXIO;
 
-	percpu_ref_kill(&ctx->refs);
+	if (opcode != IORING_UNREGISTER_FILES &&
+	    opcode != IORING_REGISTER_FILES_UPDATE) {
+		percpu_ref_kill(&ctx->refs);
 
-	/*
-	 * Drop uring mutex before waiting for references to exit. If another
-	 * thread is currently inside io_uring_enter() it might need to grab
-	 * the uring_lock to make progress. If we hold it here across the drain
-	 * wait, then we can deadlock. It's safe to drop the mutex here, since
-	 * no new references will come in after we've killed the percpu ref.
-	 */
-	mutex_unlock(&ctx->uring_lock);
-	wait_for_completion(&ctx->completions[0]);
-	mutex_lock(&ctx->uring_lock);
+		/*
+		 * Drop uring mutex before waiting for references to exit. If
+		 * another thread is currently inside io_uring_enter() it might
+		 * need to grab the uring_lock to make progress. If we hold it
+		 * here across the drain wait, then we can deadlock. It's safe
+		 * to drop the mutex here, since no new references will come in
+		 * after we've killed the percpu ref.
+		 */
+		mutex_unlock(&ctx->uring_lock);
+		wait_for_completion(&ctx->completions[0]);
+		mutex_lock(&ctx->uring_lock);
+	}
 
 	switch (opcode) {
 	case IORING_REGISTER_BUFFERS:
@@ -5762,9 +5973,13 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 		break;
 	}
 
-	/* bring the ctx back to life */
-	reinit_completion(&ctx->completions[0]);
-	percpu_ref_reinit(&ctx->refs);
+
+	if (opcode != IORING_UNREGISTER_FILES &&
+	    opcode != IORING_REGISTER_FILES_UPDATE) {
+		/* bring the ctx back to life */
+		reinit_completion(&ctx->completions[0]);
+		percpu_ref_reinit(&ctx->refs);
+	}
 	return ret;
 }
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 084dea85b838..ca436b9d4921 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -80,6 +80,7 @@ enum {
 	IORING_OP_FALLOCATE,
 	IORING_OP_OPENAT,
 	IORING_OP_CLOSE,
+	IORING_OP_FILES_UPDATE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From eddc7ef52a6b37b7ba3d1c8a8fbb63d5d9914f8a Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Fri, 13 Dec 2019 21:18:10 -0700
Subject: io_uring: add support for IORING_OP_STATX

This provides support for async statx(2) through io_uring.
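
As a sketch of how the statx(2) arguments map onto the SQE fields read
by io_statx_prep() in this patch: fd carries the directory fd, addr the
pathname, len the mask, addr2 the result buffer and statx_flags the
flags. struct sketch_sqe and sketch_prep_statx() below are hypothetical
stand-ins for illustration, not the real struct io_uring_sqe layout or
an existing helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in with only the fields of interest; NOT the real SQE layout. */
struct sketch_sqe {
	int32_t  fd;
	uint64_t addr;
	uint64_t addr2;
	uint32_t len;
	uint32_t statx_flags;
};

/* Fill the fields the way io_statx_prep() in this patch reads them. */
static void sketch_prep_statx(struct sketch_sqe *sqe, int dfd,
			      const char *path, uint32_t flags,
			      uint32_t mask, void *statxbuf)
{
	assert(sqe && path && statxbuf);
	memset(sqe, 0, sizeof(*sqe));	/* ioprio/buf_index must stay zero */
	sqe->fd = dfd;					/* statx() dirfd */
	sqe->addr = (uint64_t)(uintptr_t)path;		/* statx() pathname */
	sqe->len = mask;				/* statx() mask */
	sqe->addr2 = (uint64_t)(uintptr_t)statxbuf;	/* struct statx * */
	sqe->statx_flags = flags;			/* statx() flags */
}
```

Both the pathname and the result buffer are user pointers, so they must
stay valid until the completion for the request has been seen.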

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 86 ++++++++++++++++++++++++++++++++++++++++++-
 include/uapi/linux/io_uring.h |  2 +
 2 files changed, 87 insertions(+), 1 deletion(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 4325068324b7..115c7d413372 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -379,9 +379,13 @@ struct io_sr_msg {
 struct io_open {
 	struct file			*file;
 	int				dfd;
-	umode_t				mode;
+	union {
+		umode_t			mode;
+		unsigned		mask;
+	};
 	const char __user		*fname;
 	struct filename			*filename;
+	struct statx __user		*buffer;
 	int				flags;
 };
 
@@ -2266,6 +2270,74 @@ err:
 	return 0;
 }
 
+static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	unsigned lookup_flags;
+	int ret;
+
+	if (sqe->ioprio || sqe->buf_index)
+		return -EINVAL;
+
+	req->open.dfd = READ_ONCE(sqe->fd);
+	req->open.mask = READ_ONCE(sqe->len);
+	req->open.fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	req->open.buffer = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+	req->open.flags = READ_ONCE(sqe->statx_flags);
+
+	if (vfs_stat_set_lookup_flags(&lookup_flags, req->open.flags))
+		return -EINVAL;
+
+	req->open.filename = getname_flags(req->open.fname, lookup_flags, NULL);
+	if (IS_ERR(req->open.filename)) {
+		ret = PTR_ERR(req->open.filename);
+		req->open.filename = NULL;
+		return ret;
+	}
+
+	return 0;
+}
+
+static int io_statx(struct io_kiocb *req, struct io_kiocb **nxt,
+		    bool force_nonblock)
+{
+	struct io_open *ctx = &req->open;
+	unsigned lookup_flags;
+	struct path path;
+	struct kstat stat;
+	int ret;
+
+	if (force_nonblock)
+		return -EAGAIN;
+
+	if (vfs_stat_set_lookup_flags(&lookup_flags, ctx->flags))
+		return -EINVAL;
+
+retry:
+	/* filename_lookup() drops it, keep a reference */
+	ctx->filename->refcnt++;
+
+	ret = filename_lookup(ctx->dfd, ctx->filename, lookup_flags, &path,
+				NULL);
+	if (ret)
+		goto err;
+
+	ret = vfs_getattr(&path, &stat, ctx->mask, ctx->flags);
+	path_put(&path);
+	if (retry_estale(ret, lookup_flags)) {
+		lookup_flags |= LOOKUP_REVAL;
+		goto retry;
+	}
+	if (!ret)
+		ret = cp_statx(&stat, ctx->buffer);
+err:
+	putname(ctx->filename);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req_find_next(req, nxt);
+	return 0;
+}
+
 static int io_close_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	/*
@@ -3427,6 +3499,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_FILES_UPDATE:
 		ret = io_files_update_prep(req, sqe);
 		break;
+	case IORING_OP_STATX:
+		ret = io_statx_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3613,6 +3688,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_files_update(req, force_nonblock);
 		break;
+	case IORING_OP_STATX:
+		if (sqe) {
+			ret = io_statx_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_statx(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -3699,6 +3782,7 @@ static int io_req_needs_file(struct io_kiocb *req, int fd)
 	case IORING_OP_LINK_TIMEOUT:
 		return 0;
 	case IORING_OP_OPENAT:
+	case IORING_OP_STATX:
 		return fd != -1;
 	default:
 		if (io_req_op_valid(req->opcode))
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index ca436b9d4921..3f45f7c543de 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -35,6 +35,7 @@ struct io_uring_sqe {
 		__u32		accept_flags;
 		__u32		cancel_flags;
 		__u32		open_flags;
+		__u32		statx_flags;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	union {
@@ -81,6 +82,7 @@ enum {
 	IORING_OP_OPENAT,
 	IORING_OP_CLOSE,
 	IORING_OP_FILES_UPDATE,
+	IORING_OP_STATX,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From ce35a47a3a0208a77b4d31b7f2e8ed57d624093d Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 17 Dec 2019 08:04:44 -0700
Subject: io_uring: add IOSQE_ASYNC

io_uring defaults to always doing inline submissions, if at all
possible. But for larger copies, even if the data is fully cached, that
can take a long time. Add an IOSQE_ASYNC flag that the application can
set on the SQE - if set, it'll ensure that we always go async for those
kinds of requests. Use the io-wq IO_WQ_WORK_CONCURRENT flag to ensure we
get the concurrency we desire for this case.
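
The resulting submission decision can be sketched as a small userspace
function. This is a simplified model, not the kernel code: the SK_
names are made up for this example, the bit values mirror the uapi
flags (bit 0 for the fixed-file flag is taken from the uapi header, not
from this hunk), and the drain/link handling that precedes this check
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors of the SQE flag bits; the SK_ prefix marks them as sketch names. */
#define SK_IOSQE_FIXED_FILE	(1U << 0)
#define SK_IOSQE_IO_DRAIN	(1U << 1)
#define SK_IOSQE_IO_LINK	(1U << 2)
#define SK_IOSQE_IO_HARDLINK	(1U << 3)
#define SK_IOSQE_ASYNC		(1U << 4)

#define SK_VALID_FLAGS	(SK_IOSQE_FIXED_FILE | SK_IOSQE_IO_DRAIN | \
			 SK_IOSQE_IO_LINK | SK_IOSQE_IO_HARDLINK | \
			 SK_IOSQE_ASYNC)

enum sk_path {
	SK_REJECT,		/* unknown flag bits set: -EINVAL */
	SK_INLINE_FIRST,	/* try inline, punt to async only if needed */
	SK_STRAIGHT_TO_ASYNC,	/* skip the inline attempt entirely */
};

static enum sk_path sk_choose_path(uint8_t sqe_flags, bool in_worker)
{
	if (sqe_flags & ~SK_VALID_FLAGS)
		return SK_REJECT;
	/* Already running in an async worker: nothing left to punt to. */
	if ((sqe_flags & SK_IOSQE_ASYNC) && !in_worker)
		return SK_STRAIGHT_TO_ASYNC;
	return SK_INLINE_FIRST;
}
```

A request without the flag behaves as before: it is attempted inline
first and only goes async if that attempt would block.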

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 16 ++++++++++++++--
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 115c7d413372..9ffcfdc6382b 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -483,6 +483,7 @@ struct io_kiocb {
 #define REQ_F_INFLIGHT		16384	/* on inflight list */
 #define REQ_F_COMP_LOCKED	32768	/* completion under lock */
 #define REQ_F_HARDLINK		65536	/* doesn't sever on completion < 0 */
+#define REQ_F_FORCE_ASYNC	131072	/* IOSQE_ASYNC */
 	u64			user_data;
 	u32			result;
 	u32			sequence;
@@ -4017,8 +4018,17 @@ static void io_queue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 			req_set_fail_links(req);
 			io_double_put_req(req);
 		}
-	} else
+	} else if ((req->flags & REQ_F_FORCE_ASYNC) &&
+		   !io_wq_current_is_worker()) {
+		/*
+		 * Never try inline submit if IOSQE_ASYNC is set, go straight
+		 * to async execution.
+		 */
+		req->work.flags |= IO_WQ_WORK_CONCURRENT;
+		io_queue_async_work(req);
+	} else {
 		__io_queue_sqe(req, sqe);
+	}
 }
 
 static inline void io_queue_link_head(struct io_kiocb *req)
@@ -4031,7 +4041,7 @@ static inline void io_queue_link_head(struct io_kiocb *req)
 }
 
 #define SQE_VALID_FLAGS	(IOSQE_FIXED_FILE|IOSQE_IO_DRAIN|IOSQE_IO_LINK|	\
-				IOSQE_IO_HARDLINK)
+				IOSQE_IO_HARDLINK | IOSQE_ASYNC)
 
 static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			  struct io_submit_state *state, struct io_kiocb **link)
@@ -4044,6 +4054,8 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		ret = -EINVAL;
 		goto err_req;
 	}
+	if (sqe->flags & IOSQE_ASYNC)
+		req->flags |= REQ_F_FORCE_ASYNC;
 
 	ret = io_req_set_file(state, req, sqe);
 	if (unlikely(ret)) {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 3f45f7c543de..d7ec50247a3a 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -51,6 +51,7 @@ struct io_uring_sqe {
 #define IOSQE_IO_DRAIN		(1U << 1)	/* issue after inflight IO */
 #define IOSQE_IO_LINK		(1U << 2)	/* links next sqe */
 #define IOSQE_IO_HARDLINK	(1U << 3)	/* like LINK, but stronger */
+#define IOSQE_ASYNC		(1U << 4)	/* always go async */
 
 /*
  * io_uring_setup() flags
-- 
cgit v1.2.3


From 3a6820f2bb8a079975109c25a5d1f29f46bce5d2 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Sun, 22 Dec 2019 15:19:35 -0700
Subject: io_uring: add non-vectored read/write commands

For use cases that don't already naturally have an iovec, it's easier
(or more convenient) to just use a buffer address + length. This is
particularly true if the use case is from languages that want to create
a memory safe abstraction on top of io_uring, and where introducing
the need for the iovec may impose an ownership issue. For those cases,
they currently need an indirection buffer, which means allocating data
just for this purpose.

Add basic read/write that don't require the iovec.
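
The difference is the same one that exists between write(2) and
writev(2) in regular syscalls. The userspace sketch below does not use
io_uring at all; it only illustrates the extra indirection with a pipe.
The helper name sketch_compare_plain_and_vectored() is made up for this
example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send the same bytes through a pipe twice: once as buffer + length,
 * once wrapped in a one-element iovec. Returns 0 if both transfers
 * deliver identical data, -1 on any error or mismatch.
 */
static int sketch_compare_plain_and_vectored(const char *msg, size_t len)
{
	char a[64], b[64];
	struct iovec iov;
	int fds[2];
	int ret = -1;

	assert(msg);
	if (len == 0 || len > sizeof(a) || pipe(fds) < 0)
		return -1;

	/* Plain form: the caller hands over just an address and a length. */
	if (write(fds[1], msg, len) != (ssize_t)len ||
	    read(fds[0], a, len) != (ssize_t)len)
		goto out;

	/* Vectored form: the same buffer first has to be wrapped in an iovec. */
	iov.iov_base = (void *)msg;
	iov.iov_len = len;
	if (writev(fds[1], &iov, 1) != (ssize_t)len ||
	    read(fds[0], b, len) != (ssize_t)len)
		goto out;

	ret = memcmp(a, b, len) == 0 ? 0 : -1;
out:
	close(fds[0]);
	close(fds[1]);
	return ret;
}
```

The iovec in the second half is exactly the piece of extra state that a
memory-safe wrapper would otherwise have to allocate and keep alive for
the duration of the request.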

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 23 +++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  2 ++
 2 files changed, 25 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index c54a8bd37b54..407ba3388e14 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -654,6 +654,18 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.fd_non_neg		= 1,
 	},
+	{
+		/* IORING_OP_READ */
+		.needs_mm		= 1,
+		.needs_file		= 1,
+		.unbound_nonreg_file	= 1,
+	},
+	{
+		/* IORING_OP_WRITE */
+		.needs_mm		= 1,
+		.needs_file		= 1,
+		.unbound_nonreg_file	= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -1867,6 +1879,13 @@ static ssize_t io_import_iovec(int rw, struct io_kiocb *req,
 	if (req->rw.kiocb.private)
 		return -EINVAL;
 
+	if (opcode == IORING_OP_READ || opcode == IORING_OP_WRITE) {
+		ssize_t ret;
+		ret = import_single_range(rw, buf, sqe_len, *iovec, iter);
+		*iovec = NULL;
+		return ret;
+	}
+
 	if (req->io) {
 		struct io_async_rw *iorw = &req->io->rw;
 
@@ -3634,10 +3653,12 @@ static int io_req_defer_prep(struct io_kiocb *req,
 		break;
 	case IORING_OP_READV:
 	case IORING_OP_READ_FIXED:
+	case IORING_OP_READ:
 		ret = io_read_prep(req, sqe, true);
 		break;
 	case IORING_OP_WRITEV:
 	case IORING_OP_WRITE_FIXED:
+	case IORING_OP_WRITE:
 		ret = io_write_prep(req, sqe, true);
 		break;
 	case IORING_OP_POLL_ADD:
@@ -3741,6 +3762,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		break;
 	case IORING_OP_READV:
 	case IORING_OP_READ_FIXED:
+	case IORING_OP_READ:
 		if (sqe) {
 			ret = io_read_prep(req, sqe, force_nonblock);
 			if (ret < 0)
@@ -3750,6 +3772,7 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		break;
 	case IORING_OP_WRITEV:
 	case IORING_OP_WRITE_FIXED:
+	case IORING_OP_WRITE:
 		if (sqe) {
 			ret = io_write_prep(req, sqe, force_nonblock);
 			if (ret < 0)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index d7ec50247a3a..7fdf994f3313 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -84,6 +84,8 @@ enum {
 	IORING_OP_CLOSE,
 	IORING_OP_FILES_UPDATE,
 	IORING_OP_STATX,
+	IORING_OP_READ,
+	IORING_OP_WRITE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From ba04291eb66ed895f194ae5abd3748d72bf8aaea Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 25 Dec 2019 16:33:42 -0700
Subject: io_uring: allow use of offset == -1 to mean file position
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This behaves like preadv2/pwritev2 with offset == -1: it'll use (and
update) the current file position. This obviously comes with the caveat
that if the application has multiple read/writes in flight, then the
end result will not be as expected. This is similar to threads sharing
a file descriptor and doing IO using the current file position.

Since this feature isn't easily detectable by doing a read or write,
add a feature flag, IORING_FEAT_RW_CUR_POS, to allow applications to
detect the presence of this feature.
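
A minimal userspace-style sketch of both halves, under stated
assumptions: `sketch_file` and `sketch_rw` are hypothetical and only
model the position bookkeeping (they ignore stream files and short
transfers), and the feature bit value is the one added to the uapi
header by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit value taken from the uapi header change in this patch. */
#define SKETCH_FEAT_RW_CUR_POS	(1U << 3)

/* Hypothetical stand-in for the per-file position. */
struct sketch_file {
	int64_t f_pos;
};

/* True if the features word returned at ring setup advertises the flag. */
static inline bool sketch_has_rw_cur_pos(uint32_t features)
{
	return (features & SKETCH_FEAT_RW_CUR_POS) != 0;
}

/*
 * Model of a transfer of nbytes at offset off. With off == -1 the file
 * position is used and then advanced; otherwise it is left untouched.
 * Returns the offset the transfer started at.
 */
static inline int64_t sketch_rw(struct sketch_file *f, int64_t off,
				int64_t nbytes)
{
	int64_t pos = off;

	if (off == -1) {
		pos = f->f_pos;
		f->f_pos = pos + nbytes;
	}
	return pos;
}
```

Two such transfers in flight at once would both read the same starting
position, which is the caveat described above.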

Reported-by: 李通洲 <carter.li@eoitek.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 11 ++++++++++-
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 11 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 407ba3388e14..5286620e4e46 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -495,6 +495,7 @@ struct io_kiocb {
 #define REQ_F_COMP_LOCKED	32768	/* completion under lock */
 #define REQ_F_HARDLINK		65536	/* doesn't sever on completion < 0 */
 #define REQ_F_FORCE_ASYNC	131072	/* IOSQE_ASYNC */
+#define REQ_F_CUR_POS		262144	/* read/write uses file position */
 	u64			user_data;
 	u32			result;
 	u32			sequence;
@@ -1711,6 +1712,10 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		req->flags |= REQ_F_ISREG;
 
 	kiocb->ki_pos = READ_ONCE(sqe->off);
+	if (kiocb->ki_pos == -1 && !(req->file->f_mode & FMODE_STREAM)) {
+		req->flags |= REQ_F_CUR_POS;
+		kiocb->ki_pos = req->file->f_pos;
+	}
 	kiocb->ki_flags = iocb_flags(kiocb->ki_filp);
 	kiocb->ki_hint = ki_hint_validate(file_write_hint(kiocb->ki_filp));
 
@@ -1782,6 +1787,10 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 static void kiocb_done(struct kiocb *kiocb, ssize_t ret, struct io_kiocb **nxt,
 		       bool in_async)
 {
+	struct io_kiocb *req = container_of(kiocb, struct io_kiocb, rw.kiocb);
+
+	if (req->flags & REQ_F_CUR_POS)
+		req->file->f_pos = kiocb->ki_pos;
 	if (in_async && ret >= 0 && kiocb->ki_complete == io_complete_rw)
 		*nxt = __io_complete_rw(kiocb, ret);
 	else
@@ -6154,7 +6163,7 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p)
 		goto err;
 
 	p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
-			IORING_FEAT_SUBMIT_STABLE;
+			IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS;
 	trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
 	return ret;
 err:
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 7fdf994f3313..1f96136eb6ee 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -174,6 +174,7 @@ struct io_uring_params {
 #define IORING_FEAT_SINGLE_MMAP		(1U << 0)
 #define IORING_FEAT_NODROP		(1U << 1)
 #define IORING_FEAT_SUBMIT_STABLE	(1U << 2)
+#define IORING_FEAT_RW_CUR_POS		(1U << 3)
 
 /*
  * io_uring_register(2) opcodes and arguments
-- 
cgit v1.2.3


From 4840e418c2fc533d55ff6caa5b9313eed1d26cfd Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 25 Dec 2019 22:03:45 -0700
Subject: io_uring: add IORING_OP_FADVISE

This adds support for doing fadvise through io_uring. We assume that
WILLNEED doesn't block, but that DONTNEED may block.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 53 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  2 ++
 2 files changed, 55 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5286620e4e46..9ca12b900b42 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -72,6 +72,7 @@
 #include <linux/highmem.h>
 #include <linux/namei.h>
 #include <linux/fsnotify.h>
+#include <linux/fadvise.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -400,6 +401,13 @@ struct io_files_update {
 	u32				offset;
 };
 
+struct io_fadvise {
+	struct file			*file;
+	u64				offset;
+	u32				len;
+	u32				advice;
+};
+
 struct io_async_connect {
 	struct sockaddr_storage		address;
 };
@@ -452,6 +460,7 @@ struct io_kiocb {
 		struct io_open		open;
 		struct io_close		close;
 		struct io_files_update	files_update;
+		struct io_fadvise	fadvise;
 	};
 
 	struct io_async_ctx		*io;
@@ -667,6 +676,10 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
 	},
+	{
+		/* IORING_OP_FADVISE */
+		.needs_file		= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -2436,6 +2449,35 @@ err:
 	return 0;
 }
 
+static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	if (sqe->ioprio || sqe->buf_index || sqe->addr)
+		return -EINVAL;
+
+	req->fadvise.offset = READ_ONCE(sqe->off);
+	req->fadvise.len = READ_ONCE(sqe->len);
+	req->fadvise.advice = READ_ONCE(sqe->fadvise_advice);
+	return 0;
+}
+
+static int io_fadvise(struct io_kiocb *req, struct io_kiocb **nxt,
+		      bool force_nonblock)
+{
+	struct io_fadvise *fa = &req->fadvise;
+	int ret;
+
+	/* DONTNEED may block, others _should_ not */
+	if (fa->advice == POSIX_FADV_DONTNEED && force_nonblock)
+		return -EAGAIN;
+
+	ret = vfs_fadvise(req->file, fa->offset, fa->len, fa->advice);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req_find_next(req, nxt);
+	return 0;
+}
+
 static int io_statx_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	unsigned lookup_flags;
@@ -3721,6 +3763,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_STATX:
 		ret = io_statx_prep(req, sqe);
 		break;
+	case IORING_OP_FADVISE:
+		ret = io_fadvise_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3917,6 +3962,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_statx(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_FADVISE:
+		if (sqe) {
+			ret = io_fadvise_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_fadvise(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 1f96136eb6ee..f86d1c776078 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -36,6 +36,7 @@ struct io_uring_sqe {
 		__u32		cancel_flags;
 		__u32		open_flags;
 		__u32		statx_flags;
+		__u32		fadvise_advice;
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	union {
@@ -86,6 +87,7 @@ enum {
 	IORING_OP_STATX,
 	IORING_OP_READ,
 	IORING_OP_WRITE,
+	IORING_OP_FADVISE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From c1ca757bd6f4632c510714631ddcc2d13030fe1e Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 25 Dec 2019 22:18:28 -0700
Subject: io_uring: add IORING_OP_MADVISE

This adds support for doing madvise(2) through io_uring. We assume that
any operation can block, and hence punt everything async. This could be
improved, but it is hard to make bulletproof. The async punt ensures it's
safe.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 59 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 60 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 9ca12b900b42..17a199c99c7c 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -408,6 +408,13 @@ struct io_fadvise {
 	u32				advice;
 };
 
+struct io_madvise {
+	struct file			*file;
+	u64				addr;
+	u32				len;
+	u32				advice;
+};
+
 struct io_async_connect {
 	struct sockaddr_storage		address;
 };
@@ -461,6 +468,7 @@ struct io_kiocb {
 		struct io_close		close;
 		struct io_files_update	files_update;
 		struct io_fadvise	fadvise;
+		struct io_madvise	madvise;
 	};
 
 	struct io_async_ctx		*io;
@@ -680,6 +688,10 @@ static const struct io_op_def io_op_defs[] = {
 		/* IORING_OP_FADVISE */
 		.needs_file		= 1,
 	},
+	{
+		/* IORING_OP_MADVISE */
+		.needs_mm		= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -2449,6 +2461,42 @@ err:
 	return 0;
 }
 
+static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
+	if (sqe->ioprio || sqe->buf_index || sqe->off)
+		return -EINVAL;
+
+	req->madvise.addr = READ_ONCE(sqe->addr);
+	req->madvise.len = READ_ONCE(sqe->len);
+	req->madvise.advice = READ_ONCE(sqe->fadvise_advice);
+	return 0;
+#else
+	return -EOPNOTSUPP;
+#endif
+}
+
+static int io_madvise(struct io_kiocb *req, struct io_kiocb **nxt,
+		      bool force_nonblock)
+{
+#if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
+	struct io_madvise *ma = &req->madvise;
+	int ret;
+
+	if (force_nonblock)
+		return -EAGAIN;
+
+	ret = do_madvise(ma->addr, ma->len, ma->advice);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req_find_next(req, nxt);
+	return 0;
+#else
+	return -EOPNOTSUPP;
+#endif
+}
+
 static int io_fadvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 	if (sqe->ioprio || sqe->buf_index || sqe->addr)
@@ -3766,6 +3814,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_FADVISE:
 		ret = io_fadvise_prep(req, sqe);
 		break;
+	case IORING_OP_MADVISE:
+		ret = io_madvise_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -3970,6 +4021,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_fadvise(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_MADVISE:
+		if (sqe) {
+			ret = io_madvise_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_madvise(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index f86d1c776078..8ad3cece5440 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -88,6 +88,7 @@ enum {
 	IORING_OP_READ,
 	IORING_OP_WRITE,
 	IORING_OP_FADVISE,
+	IORING_OP_MADVISE,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From 8110c1a6212e430a84edd2b83fe9043def8b743e Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Sat, 28 Dec 2019 15:39:54 -0700
Subject: io_uring: add support for IORING_SETUP_CLAMP

Some applications like to start small in terms of ring size, and then
ramp up as needed. This is a bit tricky to do currently, since we don't
advertise the max ring size.

This adds IORING_SETUP_CLAMP. If set, and the values for SQ or CQ ring
size exceed what we support, then clamp them at the max values instead
of returning -EINVAL. Since we return the chosen ring sizes after setup,
no further changes are needed on the application side. io_uring already
changes the ring sizes if the application doesn't ask for power-of-two
sizes, for example.
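
The SQ-side behaviour can be modelled in a few lines. This is a sketch
under assumptions: the limit of 32768 entries is assumed here for
illustration rather than taken from this patch, and the helper name is
hypothetical. The flag bit matches the uapi definition added below.

```c
#include <errno.h>

/* Assumed limit for illustration; flag bit as defined in this patch. */
#define SKETCH_MAX_ENTRIES	32768U
#define SKETCH_SETUP_CLAMP	(1U << 4)

/*
 * Validate a requested SQ size. Returns 0, possibly lowering *entries
 * to the maximum, or -EINVAL if the request is rejected.
 */
static inline int sketch_check_entries(unsigned *entries, unsigned flags)
{
	if (!*entries)
		return -EINVAL;
	if (*entries > SKETCH_MAX_ENTRIES) {
		if (!(flags & SKETCH_SETUP_CLAMP))
			return -EINVAL;
		*entries = SKETCH_MAX_ENTRIES;
	}
	return 0;
}
```

An application can therefore ask for a deliberately oversized ring with
the flag set and read back whatever size it was actually given.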

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 17 ++++++++++++++---
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 15 insertions(+), 3 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1fbe3eb78d08..7c44b0ef10d7 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -6234,8 +6234,13 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p)
 	bool account_mem;
 	int ret;
 
-	if (!entries || entries > IORING_MAX_ENTRIES)
+	if (!entries)
 		return -EINVAL;
+	if (entries > IORING_MAX_ENTRIES) {
+		if (!(p->flags & IORING_SETUP_CLAMP))
+			return -EINVAL;
+		entries = IORING_MAX_ENTRIES;
+	}
 
 	/*
 	 * Use twice as many entries for the CQ ring. It's possible for the
@@ -6252,8 +6257,13 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p)
 		 * to a power-of-two, if it isn't already. We do NOT impose
 		 * any cq vs sq ring sizing.
 		 */
-		if (p->cq_entries < p->sq_entries || p->cq_entries > IORING_MAX_CQ_ENTRIES)
+		if (p->cq_entries < p->sq_entries)
 			return -EINVAL;
+		if (p->cq_entries > IORING_MAX_CQ_ENTRIES) {
+			if (!(p->flags & IORING_SETUP_CLAMP))
+				return -EINVAL;
+			p->cq_entries = IORING_MAX_CQ_ENTRIES;
+		}
 		p->cq_entries = roundup_pow_of_two(p->cq_entries);
 	} else {
 		p->cq_entries = 2 * p->sq_entries;
@@ -6345,7 +6355,8 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 	}
 
 	if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
-			IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE))
+			IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
+			IORING_SETUP_CLAMP))
 		return -EINVAL;
 
 	ret = io_uring_create(entries, &p);
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 8ad3cece5440..29fae13395a8 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -61,6 +61,7 @@ struct io_uring_sqe {
 #define IORING_SETUP_SQPOLL	(1U << 1)	/* SQ poll thread */
 #define IORING_SETUP_SQ_AFF	(1U << 2)	/* sq_thread_cpu is valid */
 #define IORING_SETUP_CQSIZE	(1U << 3)	/* app defines CQ size */
+#define IORING_SETUP_CLAMP	(1U << 4)	/* clamp SQ/CQ ring sizes */
 
 enum {
 	IORING_OP_NOP,
-- 
cgit v1.2.3


From fddafacee287b3140212c92464077e971401f860 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Sat, 4 Jan 2020 20:19:44 -0700
Subject: io_uring: add support for send(2) and recv(2)

This adds IORING_OP_SEND for send(2) support, and IORING_OP_RECV for
recv(2) support.
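
For reference, the plain system calls these opcodes mirror take a
buffer and a length directly, with no msghdr involved. The sketch below
is ordinary userspace code over a socketpair, not io_uring itself, and
the helper name `sketch_send_recv` is hypothetical.

```c
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send len bytes over one end of a socketpair and receive them on the
 * other. Returns the number of bytes received, or -1 on error.
 */
static inline ssize_t sketch_send_recv(const void *buf, size_t len,
				       void *out, size_t out_len)
{
	int sv[2];
	ssize_t n;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
		return -1;

	/* Buffer + length only: no iovec and no msghdr to set up. */
	n = send(sv[0], buf, len, 0);
	if (n >= 0)
		n = recv(sv[1], out, out_len, 0);

	close(sv[0]);
	close(sv[1]);
	return n;
}
```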

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 140 ++++++++++++++++++++++++++++++++++++++++--
 include/uapi/linux/io_uring.h |   2 +
 2 files changed, 137 insertions(+), 5 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0d083811ccb9..edf072b6cb8f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -377,8 +377,12 @@ struct io_connect {
 
 struct io_sr_msg {
 	struct file			*file;
-	struct user_msghdr __user	*msg;
+	union {
+		struct user_msghdr __user *msg;
+		void __user		*buf;
+	};
 	int				msg_flags;
+	size_t				len;
 };
 
 struct io_open {
@@ -692,6 +696,18 @@ static const struct io_op_def io_op_defs[] = {
 		/* IORING_OP_MADVISE */
 		.needs_mm		= 1,
 	},
+	{
+		/* IORING_OP_SEND */
+		.needs_mm		= 1,
+		.needs_file		= 1,
+		.unbound_nonreg_file	= 1,
+	},
+	{
+		/* IORING_OP_RECV */
+		.needs_mm		= 1,
+		.needs_file		= 1,
+		.unbound_nonreg_file	= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -2802,8 +2818,9 @@ static int io_sendmsg_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 
 	sr->msg_flags = READ_ONCE(sqe->msg_flags);
 	sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	sr->len = READ_ONCE(sqe->len);
 
-	if (!io)
+	if (!io || req->opcode == IORING_OP_SEND)
 		return 0;
 
 	io->msg.iov = io->msg.fast_iov;
@@ -2883,6 +2900,56 @@ static int io_sendmsg(struct io_kiocb *req, struct io_kiocb **nxt,
 #endif
 }
 
+static int io_send(struct io_kiocb *req, struct io_kiocb **nxt,
+		   bool force_nonblock)
+{
+#if defined(CONFIG_NET)
+	struct socket *sock;
+	int ret;
+
+	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+		return -EINVAL;
+
+	sock = sock_from_file(req->file, &ret);
+	if (sock) {
+		struct io_sr_msg *sr = &req->sr_msg;
+		struct msghdr msg;
+		struct iovec iov;
+		unsigned flags;
+
+		ret = import_single_range(WRITE, sr->buf, sr->len, &iov,
+						&msg.msg_iter);
+		if (ret)
+			return ret;
+
+		msg.msg_name = NULL;
+		msg.msg_control = NULL;
+		msg.msg_controllen = 0;
+		msg.msg_namelen = 0;
+
+		flags = req->sr_msg.msg_flags;
+		if (flags & MSG_DONTWAIT)
+			req->flags |= REQ_F_NOWAIT;
+		else if (force_nonblock)
+			flags |= MSG_DONTWAIT;
+
+		ret = __sys_sendmsg_sock(sock, &msg, flags);
+		if (force_nonblock && ret == -EAGAIN)
+			return -EAGAIN;
+		if (ret == -ERESTARTSYS)
+			ret = -EINTR;
+	}
+
+	io_cqring_add_event(req, ret);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_put_req_find_next(req, nxt);
+	return 0;
+#else
+	return -EOPNOTSUPP;
+#endif
+}
+
 static int io_recvmsg_prep(struct io_kiocb *req,
 			   const struct io_uring_sqe *sqe)
 {
@@ -2893,7 +2960,7 @@ static int io_recvmsg_prep(struct io_kiocb *req,
 	sr->msg_flags = READ_ONCE(sqe->msg_flags);
 	sr->msg = u64_to_user_ptr(READ_ONCE(sqe->addr));
 
-	if (!io)
+	if (!io || req->opcode == IORING_OP_RECV)
 		return 0;
 
 	io->msg.iov = io->msg.fast_iov;
@@ -2975,6 +3042,59 @@ static int io_recvmsg(struct io_kiocb *req, struct io_kiocb **nxt,
 #endif
 }
 
+static int io_recv(struct io_kiocb *req, struct io_kiocb **nxt,
+		   bool force_nonblock)
+{
+#if defined(CONFIG_NET)
+	struct socket *sock;
+	int ret;
+
+	if (unlikely(req->ctx->flags & IORING_SETUP_IOPOLL))
+		return -EINVAL;
+
+	sock = sock_from_file(req->file, &ret);
+	if (sock) {
+		struct io_sr_msg *sr = &req->sr_msg;
+		struct msghdr msg;
+		struct iovec iov;
+		unsigned flags;
+
+		ret = import_single_range(READ, sr->buf, sr->len, &iov,
+						&msg.msg_iter);
+		if (ret)
+			return ret;
+
+		msg.msg_name = NULL;
+		msg.msg_control = NULL;
+		msg.msg_controllen = 0;
+		msg.msg_namelen = 0;
+		msg.msg_iocb = NULL;
+		msg.msg_flags = 0;
+
+		flags = req->sr_msg.msg_flags;
+		if (flags & MSG_DONTWAIT)
+			req->flags |= REQ_F_NOWAIT;
+		else if (force_nonblock)
+			flags |= MSG_DONTWAIT;
+
+		ret = __sys_recvmsg_sock(sock, &msg, NULL, NULL, flags);
+		if (force_nonblock && ret == -EAGAIN)
+			return -EAGAIN;
+		if (ret == -ERESTARTSYS)
+			ret = -EINTR;
+	}
+
+	io_cqring_add_event(req, ret);
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_put_req_find_next(req, nxt);
+	return 0;
+#else
+	return -EOPNOTSUPP;
+#endif
+}
+
+
 static int io_accept_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 #if defined(CONFIG_NET)
@@ -3811,9 +3931,11 @@ static int io_req_defer_prep(struct io_kiocb *req,
 		ret = io_prep_sfr(req, sqe);
 		break;
 	case IORING_OP_SENDMSG:
+	case IORING_OP_SEND:
 		ret = io_sendmsg_prep(req, sqe);
 		break;
 	case IORING_OP_RECVMSG:
+	case IORING_OP_RECV:
 		ret = io_recvmsg_prep(req, sqe);
 		break;
 	case IORING_OP_CONNECT:
@@ -3956,20 +4078,28 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		ret = io_sync_file_range(req, nxt, force_nonblock);
 		break;
 	case IORING_OP_SENDMSG:
+	case IORING_OP_SEND:
 		if (sqe) {
 			ret = io_sendmsg_prep(req, sqe);
 			if (ret < 0)
 				break;
 		}
-		ret = io_sendmsg(req, nxt, force_nonblock);
+		if (req->opcode == IORING_OP_SENDMSG)
+			ret = io_sendmsg(req, nxt, force_nonblock);
+		else
+			ret = io_send(req, nxt, force_nonblock);
 		break;
 	case IORING_OP_RECVMSG:
+	case IORING_OP_RECV:
 		if (sqe) {
 			ret = io_recvmsg_prep(req, sqe);
 			if (ret)
 				break;
 		}
-		ret = io_recvmsg(req, nxt, force_nonblock);
+		if (req->opcode == IORING_OP_RECVMSG)
+			ret = io_recvmsg(req, nxt, force_nonblock);
+		else
+			ret = io_recv(req, nxt, force_nonblock);
 		break;
 	case IORING_OP_TIMEOUT:
 		if (sqe) {
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 29fae13395a8..0fe270ab191c 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -90,6 +90,8 @@ enum {
 	IORING_OP_WRITE,
 	IORING_OP_FADVISE,
 	IORING_OP_MADVISE,
+	IORING_OP_SEND,
+	IORING_OP_RECV,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From f2842ab5b72d7ee5f7f8385c2d4f32c133f5837b Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 8 Jan 2020 11:04:00 -0700
Subject: io_uring: enable option to only trigger eventfd for async completions

If an application is using eventfd notifications with poll to know when
new SQEs can be issued, it's expecting the following reads/writes to
complete inline. With that, it knows that there are events available,
and doesn't want spurious wakeups on the eventfd for those requests.

This adds IORING_REGISTER_EVENTFD_ASYNC, which works just like
IORING_REGISTER_EVENTFD, except it only triggers notifications for events
that happen from async completions (IRQ, or io-wq worker completions).
Any completions inline from the submission itself will not trigger
notifications.
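
The resulting decision can be written as a small truth function. This
is a sketch with hypothetical names: `async_completion` stands for "the
completion was posted from IRQ or an io-wq worker rather than inline
from the submission path".

```c
#include <stdbool.h>

/*
 * eventfd_async:    the eventfd was registered with the _ASYNC variant.
 * async_completion: completion comes from IRQ or an io-wq worker.
 */
static inline bool sketch_should_signal(bool eventfd_async,
					bool async_completion)
{
	/* Plain registration: every completion signals the eventfd. */
	if (!eventfd_async)
		return true;
	/* _ASYNC registration: inline completions stay silent. */
	return async_completion;
}
```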

Suggested-by: Mark Papadakis <markuspapadakis@icloud.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 17 ++++++++++++++++-
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 17 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 42bf83b3fbd5..70656762244f 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -206,6 +206,7 @@ struct io_ring_ctx {
 		int			account_mem: 1;
 		int			cq_overflow_flushed: 1;
 		int			drain_next: 1;
+		int			eventfd_async: 1;
 
 		/*
 		 * Ring buffer of indices into array of io_uring_sqe, which is
@@ -963,13 +964,20 @@ static struct io_uring_cqe *io_get_cqring(struct io_ring_ctx *ctx)
 	return &rings->cqes[tail & ctx->cq_mask];
 }
 
+static inline bool io_should_trigger_evfd(struct io_ring_ctx *ctx)
+{
+	if (!ctx->eventfd_async)
+		return true;
+	return io_wq_current_is_worker() || in_interrupt();
+}
+
 static void io_cqring_ev_posted(struct io_ring_ctx *ctx)
 {
 	if (waitqueue_active(&ctx->wait))
 		wake_up(&ctx->wait);
 	if (waitqueue_active(&ctx->sqo_wait))
 		wake_up(&ctx->sqo_wait);
-	if (ctx->cq_ev_fd)
+	if (ctx->cq_ev_fd && io_should_trigger_evfd(ctx))
 		eventfd_signal(ctx->cq_ev_fd, 1);
 }
 
@@ -6556,10 +6564,17 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 		ret = io_sqe_files_update(ctx, arg, nr_args);
 		break;
 	case IORING_REGISTER_EVENTFD:
+	case IORING_REGISTER_EVENTFD_ASYNC:
 		ret = -EINVAL;
 		if (nr_args != 1)
 			break;
 		ret = io_eventfd_register(ctx, arg);
+		if (ret)
+			break;
+		if (opcode == IORING_REGISTER_EVENTFD_ASYNC)
+			ctx->eventfd_async = 1;
+		else
+			ctx->eventfd_async = 0;
 		break;
 	case IORING_UNREGISTER_EVENTFD:
 		ret = -EINVAL;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 0fe270ab191c..66772a90a7f2 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -192,6 +192,7 @@ struct io_uring_params {
 #define IORING_REGISTER_EVENTFD		4
 #define IORING_UNREGISTER_EVENTFD	5
 #define IORING_REGISTER_FILES_UPDATE	6
+#define IORING_REGISTER_EVENTFD_ASYNC	7
 
 struct io_uring_files_update {
 	__u32 offset;
-- 
cgit v1.2.3


From cebdb98617ae3e842c81c73758a185248b37cfd6 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 8 Jan 2020 17:59:24 -0700
Subject: io_uring: add support for IORING_OP_OPENAT2

Add support for the new openat2(2) system call. It's trivial to do, as
we can have openat(2) just be wrapped around it.

Suggested-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 69 +++++++++++++++++++++++++++++++++++++++----
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 64 insertions(+), 6 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 3a57ea98fe3a..0b30b0cf8af5 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -707,6 +707,11 @@ static const struct io_op_def io_op_defs[] = {
 		.needs_file		= 1,
 		.unbound_nonreg_file	= 1,
 	},
+	{
+		/* IORING_OP_OPENAT2 */
+		.needs_file		= 1,
+		.fd_non_neg		= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -2487,11 +2492,46 @@ static int io_openat_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 	return 0;
 }
 
-static int io_openat(struct io_kiocb *req, struct io_kiocb **nxt,
-		     bool force_nonblock)
+static int io_openat2_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
+{
+	struct open_how __user *how;
+	const char __user *fname;
+	size_t len;
+	int ret;
+
+	if (sqe->ioprio || sqe->buf_index)
+		return -EINVAL;
+
+	req->open.dfd = READ_ONCE(sqe->fd);
+	fname = u64_to_user_ptr(READ_ONCE(sqe->addr));
+	how = u64_to_user_ptr(READ_ONCE(sqe->addr2));
+	len = READ_ONCE(sqe->len);
+
+	if (len < OPEN_HOW_SIZE_VER0)
+		return -EINVAL;
+
+	ret = copy_struct_from_user(&req->open.how, sizeof(req->open.how), how,
+					len);
+	if (ret)
+		return ret;
+
+	if (!(req->open.how.flags & O_PATH) && force_o_largefile())
+		req->open.how.flags |= O_LARGEFILE;
+
+	req->open.filename = getname(fname);
+	if (IS_ERR(req->open.filename)) {
+		ret = PTR_ERR(req->open.filename);
+		req->open.filename = NULL;
+		return ret;
+	}
+
+	return 0;
+}
+
+static int io_openat2(struct io_kiocb *req, struct io_kiocb **nxt,
+		      bool force_nonblock)
 {
 	struct open_flags op;
-	struct open_how how;
 	struct file *file;
 	int ret;
 
@@ -2500,12 +2540,11 @@ static int io_openat(struct io_kiocb *req, struct io_kiocb **nxt,
 		return -EAGAIN;
 	}
 
-	how = build_open_how(req->open.how.flags, req->open.how.mode);
-	ret = build_open_flags(&how, &op);
+	ret = build_open_flags(&req->open.how, &op);
 	if (ret)
 		goto err;
 
-	ret = get_unused_fd_flags(how.flags);
+	ret = get_unused_fd_flags(req->open.how.flags);
 	if (ret < 0)
 		goto err;
 
@@ -2526,6 +2565,13 @@ err:
 	return 0;
 }
 
+static int io_openat(struct io_kiocb *req, struct io_kiocb **nxt,
+		     bool force_nonblock)
+{
+	req->open.how = build_open_how(req->open.how.flags, req->open.how.mode);
+	return io_openat2(req, nxt, force_nonblock);
+}
+
 static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 #if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
@@ -3984,6 +4030,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_MADVISE:
 		ret = io_madvise_prep(req, sqe);
 		break;
+	case IORING_OP_OPENAT2:
+		ret = io_openat2_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -4204,6 +4253,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_madvise(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_OPENAT2:
+		if (sqe) {
+			ret = io_openat2_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_openat2(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 66772a90a7f2..fea7da182851 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -92,6 +92,7 @@ enum {
 	IORING_OP_MADVISE,
 	IORING_OP_SEND,
 	IORING_OP_RECV,
+	IORING_OP_OPENAT2,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From 66f4af93da5761d2fa05c0dc673a47003cdb9cfe Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Thu, 16 Jan 2020 15:36:52 -0700
Subject: io_uring: add support for probing opcodes

The application currently has no way of knowing if a given opcode is
supported or not without having to try and issue one and see if we get
-EINVAL or not. And even this approach is fraught with peril, as maybe
we're getting -EINVAL due to some fields being missing, or maybe it's
just not that easy to issue that particular command without doing some
other leg work in terms of setup first.

This adds IORING_REGISTER_PROBE, which fills in a structure with info
on what is supported or not. This will work even with sparse opcode
fields, which may happen in the future or even today if someone
backports specific features to older kernels.
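
On the application side, checking a filled-in probe could look roughly
like the sketch below. The `sketch_*` structs are local mirrors of the
uapi layout added by this patch, with a fixed-size array in place of
the trailing variable-length one so the example is self-contained; the
lookup helper is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_OP_SUPPORTED	(1U << 0)
#define SKETCH_MAX_OPS		256

struct sketch_probe_op {
	uint8_t  op;
	uint8_t  resv;
	uint16_t flags;
	uint32_t resv2;
};

struct sketch_probe {
	uint8_t  last_op;	/* last opcode known to the kernel */
	uint8_t  ops_len;	/* number of valid entries in ops[] */
	uint16_t resv;
	uint32_t resv2[3];
	struct sketch_probe_op ops[SKETCH_MAX_OPS];
};

/* Look the opcode up by value, so sparse opcode tables still work. */
static inline bool sketch_op_supported(const struct sketch_probe *p,
				       unsigned op)
{
	unsigned i;

	for (i = 0; i < p->ops_len; i++) {
		if (p->ops[i].op == op)
			return (p->ops[i].flags & SKETCH_OP_SUPPORTED) != 0;
	}
	return false;
}
```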

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 53 +++++++++++++++++++++++++++++++++++++++++--
 include/uapi/linux/io_uring.h | 18 +++++++++++++++
 2 files changed, 69 insertions(+), 2 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 8a645a37b4c7..7715a729271a 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -561,6 +561,8 @@ struct io_op_def {
 	unsigned		hash_reg_file : 1;
 	/* unbound wq insertion if file is a non-regular file */
 	unsigned		unbound_nonreg_file : 1;
+	/* opcode is not supported by this kernel */
+	unsigned		not_supported : 1;
 };
 
 static const struct io_op_def io_op_defs[] = {
@@ -6566,6 +6568,45 @@ SYSCALL_DEFINE2(io_uring_setup, u32, entries,
 	return io_uring_setup(entries, params);
 }
 
+static int io_probe(struct io_ring_ctx *ctx, void __user *arg, unsigned nr_args)
+{
+	struct io_uring_probe *p;
+	size_t size;
+	int i, ret;
+
+	size = struct_size(p, ops, nr_args);
+	if (size == SIZE_MAX)
+		return -EOVERFLOW;
+	p = kzalloc(size, GFP_KERNEL);
+	if (!p)
+		return -ENOMEM;
+
+	ret = -EFAULT;
+	if (copy_from_user(p, arg, size))
+		goto out;
+	ret = -EINVAL;
+	if (memchr_inv(p, 0, size))
+		goto out;
+
+	p->last_op = IORING_OP_LAST - 1;
+	if (nr_args > IORING_OP_LAST)
+		nr_args = IORING_OP_LAST;
+
+	for (i = 0; i < nr_args; i++) {
+		p->ops[i].op = i;
+		if (!io_op_defs[i].not_supported)
+			p->ops[i].flags = IO_URING_OP_SUPPORTED;
+	}
+	p->ops_len = i;
+
+	ret = 0;
+	if (copy_to_user(arg, p, size))
+		ret = -EFAULT;
+out:
+	kfree(p);
+	return ret;
+}
+
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
 	__releases(ctx->uring_lock)
@@ -6582,7 +6623,8 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 		return -ENXIO;
 
 	if (opcode != IORING_UNREGISTER_FILES &&
-	    opcode != IORING_REGISTER_FILES_UPDATE) {
+	    opcode != IORING_REGISTER_FILES_UPDATE &&
+	    opcode != IORING_REGISTER_PROBE) {
 		percpu_ref_kill(&ctx->refs);
 
 		/*
@@ -6644,6 +6686,12 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_eventfd_unregister(ctx);
 		break;
+	case IORING_REGISTER_PROBE:
+		ret = -EINVAL;
+		if (!arg || nr_args > 256)
+			break;
+		ret = io_probe(ctx, arg, nr_args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
@@ -6651,7 +6699,8 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 
 
 	if (opcode != IORING_UNREGISTER_FILES &&
-	    opcode != IORING_REGISTER_FILES_UPDATE) {
+	    opcode != IORING_REGISTER_FILES_UPDATE &&
+	    opcode != IORING_REGISTER_PROBE) {
 		/* bring the ctx back to life */
 		percpu_ref_reinit(&ctx->refs);
 out:
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index fea7da182851..955fd477e530 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -194,6 +194,7 @@ struct io_uring_params {
 #define IORING_UNREGISTER_EVENTFD	5
 #define IORING_REGISTER_FILES_UPDATE	6
 #define IORING_REGISTER_EVENTFD_ASYNC	7
+#define IORING_REGISTER_PROBE		8
 
 struct io_uring_files_update {
 	__u32 offset;
@@ -201,4 +202,21 @@ struct io_uring_files_update {
 	__aligned_u64 /* __s32 * */ fds;
 };
 
+#define IO_URING_OP_SUPPORTED	(1U << 0)
+
+struct io_uring_probe_op {
+	__u8 op;
+	__u8 resv;
+	__u16 flags;	/* IO_URING_OP_* flags */
+	__u32 resv2;
+};
+
+struct io_uring_probe {
+	__u8 last_op;	/* last opcode supported */
+	__u8 ops_len;	/* length of ops[] array below */
+	__u16 resv;
+	__u32 resv2[3];
+	struct io_uring_probe_op ops[0];
+};
+
 #endif
-- 
cgit v1.2.3


From 6b47ee6ecab142f938a40bf3b297abac74218ee2 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence@gmail.com>
Date: Sat, 18 Jan 2020 20:22:41 +0300
Subject: io_uring: optimise sqe-to-req flags translation

For each IOSQE_* flag there is a corresponding REQ_F_* flag, and the code
translates between them with the same repetitive pattern, e.g.:

  if (sqe->flags & IOSQE_<flag>) req->flags |= REQ_F_<flag>;

Give corresponding flags the same numeric values/bits and copy them with a
mask instead of translating each one by hand.
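
As a rough illustration of the idea, here is a standalone userspace model,
not the kernel code. It redefines a subset of the IOSQE_*/REQ_F_* names
locally (the real definitions live in the headers touched by this patch),
and sqe_to_req_flags() is a hypothetical helper name used only here.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the user-visible bit positions (model only). */
enum {
	IOSQE_FIXED_FILE_BIT,
	IOSQE_IO_DRAIN_BIT,
	IOSQE_IO_LINK_BIT,
	IOSQE_IO_HARDLINK_BIT,
	IOSQE_ASYNC_BIT,
};

#define IOSQE_FIXED_FILE	(1U << IOSQE_FIXED_FILE_BIT)
#define IOSQE_IO_DRAIN		(1U << IOSQE_IO_DRAIN_BIT)
#define IOSQE_IO_LINK		(1U << IOSQE_IO_LINK_BIT)
#define IOSQE_IO_HARDLINK	(1U << IOSQE_IO_HARDLINK_BIT)
#define IOSQE_ASYNC		(1U << IOSQE_ASYNC_BIT)

/* Internal flags reuse the user-visible positions; private ones follow. */
enum {
	REQ_F_IO_DRAIN_BIT	= IOSQE_IO_DRAIN_BIT,
	REQ_F_HARDLINK_BIT	= IOSQE_IO_HARDLINK_BIT,
	REQ_F_FORCE_ASYNC_BIT	= IOSQE_ASYNC_BIT,
	REQ_F_LINK_NEXT_BIT,	/* first internal-only bit */
};

#define REQ_F_IO_DRAIN		(1U << REQ_F_IO_DRAIN_BIT)
#define REQ_F_HARDLINK		(1U << REQ_F_HARDLINK_BIT)
#define REQ_F_FORCE_ASYNC	(1U << REQ_F_FORCE_ASYNC_BIT)
#define REQ_F_LINK_NEXT		(1U << REQ_F_LINK_NEXT_BIT)

/* Compile-time checks that the shared bits really line up. */
static_assert(REQ_F_IO_DRAIN == IOSQE_IO_DRAIN, "drain bits must match");
static_assert(REQ_F_HARDLINK == IOSQE_IO_HARDLINK, "hardlink bits must match");
static_assert(REQ_F_FORCE_ASYNC == IOSQE_ASYNC, "async bits must match");
static_assert(REQ_F_LINK_NEXT > IOSQE_ASYNC, "internal bits must not overlap");

/* Hypothetical helper: one masked copy replaces three if-statements. */
static inline unsigned int sqe_to_req_flags(uint8_t sqe_flags)
{
	return sqe_flags & (IOSQE_IO_DRAIN | IOSQE_IO_HARDLINK | IOSQE_ASYNC);
}
```

Flags outside the mask (IOSQE_FIXED_FILE and IOSQE_IO_LINK in this model) are
dropped by the copy, so any handling they need has to stay explicit.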

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 92 +++++++++++++++++++++++++++++--------------
 include/uapi/linux/io_uring.h | 23 ++++++++---
 2 files changed, 81 insertions(+), 34 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index ed1adeda370e..cf5bad51f752 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -46,6 +46,7 @@
 #include <linux/compat.h>
 #include <linux/refcount.h>
 #include <linux/uio.h>
+#include <linux/bits.h>
 
 #include <linux/sched/signal.h>
 #include <linux/fs.h>
@@ -452,6 +453,65 @@ struct io_async_ctx {
 	};
 };
 
+enum {
+	REQ_F_FIXED_FILE_BIT	= IOSQE_FIXED_FILE_BIT,
+	REQ_F_IO_DRAIN_BIT	= IOSQE_IO_DRAIN_BIT,
+	REQ_F_LINK_BIT		= IOSQE_IO_LINK_BIT,
+	REQ_F_HARDLINK_BIT	= IOSQE_IO_HARDLINK_BIT,
+	REQ_F_FORCE_ASYNC_BIT	= IOSQE_ASYNC_BIT,
+
+	REQ_F_LINK_NEXT_BIT,
+	REQ_F_FAIL_LINK_BIT,
+	REQ_F_INFLIGHT_BIT,
+	REQ_F_CUR_POS_BIT,
+	REQ_F_NOWAIT_BIT,
+	REQ_F_IOPOLL_COMPLETED_BIT,
+	REQ_F_LINK_TIMEOUT_BIT,
+	REQ_F_TIMEOUT_BIT,
+	REQ_F_ISREG_BIT,
+	REQ_F_MUST_PUNT_BIT,
+	REQ_F_TIMEOUT_NOSEQ_BIT,
+	REQ_F_COMP_LOCKED_BIT,
+};
+
+enum {
+	/* ctx owns file */
+	REQ_F_FIXED_FILE	= BIT(REQ_F_FIXED_FILE_BIT),
+	/* drain existing IO first */
+	REQ_F_IO_DRAIN		= BIT(REQ_F_IO_DRAIN_BIT),
+	/* linked sqes */
+	REQ_F_LINK		= BIT(REQ_F_LINK_BIT),
+	/* doesn't sever on completion < 0 */
+	REQ_F_HARDLINK		= BIT(REQ_F_HARDLINK_BIT),
+	/* IOSQE_ASYNC */
+	REQ_F_FORCE_ASYNC	= BIT(REQ_F_FORCE_ASYNC_BIT),
+
+	/* already grabbed next link */
+	REQ_F_LINK_NEXT		= BIT(REQ_F_LINK_NEXT_BIT),
+	/* fail rest of links */
+	REQ_F_FAIL_LINK		= BIT(REQ_F_FAIL_LINK_BIT),
+	/* on inflight list */
+	REQ_F_INFLIGHT		= BIT(REQ_F_INFLIGHT_BIT),
+	/* read/write uses file position */
+	REQ_F_CUR_POS		= BIT(REQ_F_CUR_POS_BIT),
+	/* must not punt to workers */
+	REQ_F_NOWAIT		= BIT(REQ_F_NOWAIT_BIT),
+	/* polled IO has completed */
+	REQ_F_IOPOLL_COMPLETED	= BIT(REQ_F_IOPOLL_COMPLETED_BIT),
+	/* has linked timeout */
+	REQ_F_LINK_TIMEOUT	= BIT(REQ_F_LINK_TIMEOUT_BIT),
+	/* timeout request */
+	REQ_F_TIMEOUT		= BIT(REQ_F_TIMEOUT_BIT),
+	/* regular file */
+	REQ_F_ISREG		= BIT(REQ_F_ISREG_BIT),
+	/* must be punted even for NONBLOCK */
+	REQ_F_MUST_PUNT		= BIT(REQ_F_MUST_PUNT_BIT),
+	/* no timeout sequence */
+	REQ_F_TIMEOUT_NOSEQ	= BIT(REQ_F_TIMEOUT_NOSEQ_BIT),
+	/* completion under lock */
+	REQ_F_COMP_LOCKED	= BIT(REQ_F_COMP_LOCKED_BIT),
+};
+
 /*
  * NOTE! Each of the iocb union members has the file pointer
  * as the first entry in their struct definition. So you can
@@ -494,23 +554,6 @@ struct io_kiocb {
 	struct list_head	link_list;
 	unsigned int		flags;
 	refcount_t		refs;
-#define REQ_F_NOWAIT		1	/* must not punt to workers */
-#define REQ_F_IOPOLL_COMPLETED	2	/* polled IO has completed */
-#define REQ_F_FIXED_FILE	4	/* ctx owns file */
-#define REQ_F_LINK_NEXT		8	/* already grabbed next link */
-#define REQ_F_IO_DRAIN		16	/* drain existing IO first */
-#define REQ_F_LINK		64	/* linked sqes */
-#define REQ_F_LINK_TIMEOUT	128	/* has linked timeout */
-#define REQ_F_FAIL_LINK		256	/* fail rest of links */
-#define REQ_F_TIMEOUT		1024	/* timeout request */
-#define REQ_F_ISREG		2048	/* regular file */
-#define REQ_F_MUST_PUNT		4096	/* must be punted even for NONBLOCK */
-#define REQ_F_TIMEOUT_NOSEQ	8192	/* no timeout sequence */
-#define REQ_F_INFLIGHT		16384	/* on inflight list */
-#define REQ_F_COMP_LOCKED	32768	/* completion under lock */
-#define REQ_F_HARDLINK		65536	/* doesn't sever on completion < 0 */
-#define REQ_F_FORCE_ASYNC	131072	/* IOSQE_ASYNC */
-#define REQ_F_CUR_POS		262144	/* read/write uses file position */
 	u64			user_data;
 	u32			result;
 	u32			sequence;
@@ -4355,9 +4398,6 @@ static int io_req_set_file(struct io_submit_state *state, struct io_kiocb *req,
 	flags = READ_ONCE(sqe->flags);
 	fd = READ_ONCE(sqe->fd);
 
-	if (flags & IOSQE_IO_DRAIN)
-		req->flags |= REQ_F_IO_DRAIN;
-
 	if (!io_req_needs_file(req, fd))
 		return 0;
 
@@ -4593,8 +4633,9 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		ret = -EINVAL;
 		goto err_req;
 	}
-	if (sqe_flags & IOSQE_ASYNC)
-		req->flags |= REQ_F_FORCE_ASYNC;
+	/* same numerical values with corresponding REQ_F_*, safe to copy */
+	req->flags |= sqe_flags & (IOSQE_IO_DRAIN|IOSQE_IO_HARDLINK|
+					IOSQE_ASYNC);
 
 	ret = io_req_set_file(state, req, sqe);
 	if (unlikely(ret)) {
@@ -4618,10 +4659,6 @@ err_req:
 			head->flags |= REQ_F_IO_DRAIN;
 			ctx->drain_next = 1;
 		}
-
-		if (sqe_flags & IOSQE_IO_HARDLINK)
-			req->flags |= REQ_F_HARDLINK;
-
 		if (io_alloc_async_ctx(req)) {
 			ret = -EAGAIN;
 			goto err_req;
@@ -4648,9 +4685,6 @@ err_req:
 		}
 		if (sqe_flags & (IOSQE_IO_LINK|IOSQE_IO_HARDLINK)) {
 			req->flags |= REQ_F_LINK;
-			if (sqe_flags & IOSQE_IO_HARDLINK)
-				req->flags |= REQ_F_HARDLINK;
-
 			INIT_LIST_HEAD(&req->link_list);
 			ret = io_req_defer_prep(req, sqe);
 			if (ret)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 955fd477e530..57d05cc5e271 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -45,14 +45,27 @@ struct io_uring_sqe {
 	};
 };
 
+enum {
+	IOSQE_FIXED_FILE_BIT,
+	IOSQE_IO_DRAIN_BIT,
+	IOSQE_IO_LINK_BIT,
+	IOSQE_IO_HARDLINK_BIT,
+	IOSQE_ASYNC_BIT,
+};
+
 /*
  * sqe->flags
  */
-#define IOSQE_FIXED_FILE	(1U << 0)	/* use fixed fileset */
-#define IOSQE_IO_DRAIN		(1U << 1)	/* issue after inflight IO */
-#define IOSQE_IO_LINK		(1U << 2)	/* links next sqe */
-#define IOSQE_IO_HARDLINK	(1U << 3)	/* like LINK, but stronger */
-#define IOSQE_ASYNC		(1U << 4)	/* always go async */
+/* use fixed fileset */
+#define IOSQE_FIXED_FILE	(1U << IOSQE_FIXED_FILE_BIT)
+/* issue after inflight IO */
+#define IOSQE_IO_DRAIN		(1U << IOSQE_IO_DRAIN_BIT)
+/* links next sqe */
+#define IOSQE_IO_LINK		(1U << IOSQE_IO_LINK_BIT)
+/* like LINK, but stronger */
+#define IOSQE_IO_HARDLINK	(1U << IOSQE_IO_HARDLINK_BIT)
+/* always go async */
+#define IOSQE_ASYNC		(1U << IOSQE_ASYNC_BIT)
 
 /*
  * io_uring_setup() flags
-- 
cgit v1.2.3


From f362e5fe0f1f9774b28c5b4c9b07b0168575cf6e Mon Sep 17 00:00:00 2001
From: Martin Schiller <ms@dev.tdt.de>
Date: Tue, 21 Jan 2020 07:00:33 +0100
Subject: wan/hdlc_x25: make lapb params configurable

This enables you to configure mode (DTE/DCE), Modulo, Window, T1, T2, N2 via
sethdlc (which needs to be patched as well).
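
The accepted parameter ranges can be summarised with a small standalone
sketch. It is a userspace model under stated assumptions: struct
x25_settings is a hypothetical mirror of the x25_hdlc_proto type added by
this patch, and both helper names are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of x25_hdlc_proto (model only). */
struct x25_settings {
	unsigned short dce;	/* 1 for DCE (network side) operation */
	unsigned int modulo;	/* 8 = basic, 128 = extended */
	unsigned int window;	/* frame window size */
	unsigned int t1;	/* timeout t1 */
	unsigned int t2;	/* timeout t2 */
	unsigned int n2;	/* frame retry counter */
};

/* Defaults applied when the caller passes no settings (size == 0). */
static inline struct x25_settings x25_default_settings(void)
{
	struct x25_settings s = {
		.dce = 0, .modulo = 8, .window = 7,
		.t1 = 3, .t2 = 1, .n2 = 10,
	};
	return s;
}

/* Timers and the retry counter must lie in 1..255. */
static inline bool x25_in_byte_range(unsigned int v)
{
	return v >= 1 && v <= 255;
}

/* Same acceptance rules as the patch, written as positive checks. */
static inline bool x25_settings_valid(const struct x25_settings *s)
{
	unsigned int max_window;

	assert(s != NULL);
	if (s->dce > 1)
		return false;
	if (s->modulo == 8)
		max_window = 7;
	else if (s->modulo == 128)
		max_window = 127;
	else
		return false;
	if (s->window < 1 || s->window > max_window)
		return false;
	return x25_in_byte_range(s->t1) && x25_in_byte_range(s->t2) &&
	       x25_in_byte_range(s->n2);
}
```

The window limit depends on the modulo because sequence numbers wrap at the
modulo, so at most modulo - 1 frames can be outstanding.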

Signed-off-by: Martin Schiller <ms@dev.tdt.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 drivers/net/wan/hdlc_x25.c      | 80 +++++++++++++++++++++++++++++++++++++++--
 include/uapi/linux/hdlc/ioctl.h |  9 +++++
 include/uapi/linux/if.h         |  1 +
 3 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wan/hdlc_x25.c b/drivers/net/wan/hdlc_x25.c
index 5643675ff724..688233c2e1ea 100644
--- a/drivers/net/wan/hdlc_x25.c
+++ b/drivers/net/wan/hdlc_x25.c
@@ -21,8 +21,17 @@
 #include <linux/skbuff.h>
 #include <net/x25device.h>
 
+struct x25_state {
+	x25_hdlc_proto settings;
+};
+
 static int x25_ioctl(struct net_device *dev, struct ifreq *ifr);
 
+static struct x25_state *state(hdlc_device *hdlc)
+{
+	return hdlc->state;
+}
+
 /* These functions are callbacks called by LAPB layer */
 
 static void x25_connect_disconnect(struct net_device *dev, int reason, int code)
@@ -131,7 +140,6 @@ static netdev_tx_t x25_xmit(struct sk_buff *skb, struct net_device *dev)
 
 static int x25_open(struct net_device *dev)
 {
-	int result;
 	static const struct lapb_register_struct cb = {
 		.connect_confirmation = x25_connected,
 		.connect_indication = x25_connected,
@@ -140,10 +148,33 @@ static int x25_open(struct net_device *dev)
 		.data_indication = x25_data_indication,
 		.data_transmit = x25_data_transmit,
 	};
+	hdlc_device *hdlc = dev_to_hdlc(dev);
+	struct lapb_parms_struct params;
+	int result;
 
 	result = lapb_register(dev, &cb);
 	if (result != LAPB_OK)
 		return result;
+
+	result = lapb_getparms(dev, &params);
+	if (result != LAPB_OK)
+		return result;
+
+	if (state(hdlc)->settings.dce)
+		params.mode = params.mode | LAPB_DCE;
+
+	if (state(hdlc)->settings.modulo == 128)
+		params.mode = params.mode | LAPB_EXTENDED;
+
+	params.window = state(hdlc)->settings.window;
+	params.t1 = state(hdlc)->settings.t1;
+	params.t2 = state(hdlc)->settings.t2;
+	params.n2 = state(hdlc)->settings.n2;
+
+	result = lapb_setparms(dev, &params);
+	if (result != LAPB_OK)
+		return result;
+
 	return 0;
 }
 
@@ -186,7 +217,10 @@ static struct hdlc_proto proto = {
 
 static int x25_ioctl(struct net_device *dev, struct ifreq *ifr)
 {
+	x25_hdlc_proto __user *x25_s = ifr->ifr_settings.ifs_ifsu.x25;
+	const size_t size = sizeof(x25_hdlc_proto);
 	hdlc_device *hdlc = dev_to_hdlc(dev);
+	x25_hdlc_proto new_settings;
 	int result;
 
 	switch (ifr->ifr_settings.type) {
@@ -194,7 +228,13 @@ static int x25_ioctl(struct net_device *dev, struct ifreq *ifr)
 		if (dev_to_hdlc(dev)->proto != &proto)
 			return -EINVAL;
 		ifr->ifr_settings.type = IF_PROTO_X25;
-		return 0; /* return protocol only, no settable parameters */
+		if (ifr->ifr_settings.size < size) {
+			ifr->ifr_settings.size = size; /* data size wanted */
+			return -ENOBUFS;
+		}
+		if (copy_to_user(x25_s, &state(hdlc)->settings, size))
+			return -EFAULT;
+		return 0;
 
 	case IF_PROTO_X25:
 		if (!capable(CAP_NET_ADMIN))
@@ -203,12 +243,46 @@ static int x25_ioctl(struct net_device *dev, struct ifreq *ifr)
 		if (dev->flags & IFF_UP)
 			return -EBUSY;
 
+		/* backward compatibility */
+		if (ifr->ifr_settings.size == 0) {
+			new_settings.dce = 0;
+			new_settings.modulo = 8;
+			new_settings.window = 7;
+			new_settings.t1 = 3;
+			new_settings.t2 = 1;
+			new_settings.n2 = 10;
+		}
+		else {
+			if (copy_from_user(&new_settings, x25_s, size))
+				return -EFAULT;
+
+			if ((new_settings.dce != 0 &&
+			new_settings.dce != 1) ||
+			(new_settings.modulo != 8 &&
+			new_settings.modulo != 128) ||
+			new_settings.window < 1 ||
+			(new_settings.modulo == 8 &&
+			new_settings.window > 7) ||
+			(new_settings.modulo == 128 &&
+			new_settings.window > 127) ||
+			new_settings.t1 < 1 ||
+			new_settings.t1 > 255 ||
+			new_settings.t2 < 1 ||
+			new_settings.t2 > 255 ||
+			new_settings.n2 < 1 ||
+			new_settings.n2 > 255)
+				return -EINVAL;
+		}
+
 		result=hdlc->attach(dev, ENCODING_NRZ,PARITY_CRC16_PR1_CCITT);
 		if (result)
 			return result;
 
-		if ((result = attach_hdlc_protocol(dev, &proto, 0)))
+		if ((result = attach_hdlc_protocol(dev, &proto,
+						   sizeof(struct x25_state))))
 			return result;
+
+		memcpy(&state(hdlc)->settings, &new_settings, size);
 		dev->type = ARPHRD_X25;
 		call_netdevice_notifiers(NETDEV_POST_TYPE_CHANGE, dev);
 		netif_dormant_off(dev);
diff --git a/include/uapi/linux/hdlc/ioctl.h b/include/uapi/linux/hdlc/ioctl.h
index 0fe4238e8246..b06341acab5e 100644
--- a/include/uapi/linux/hdlc/ioctl.h
+++ b/include/uapi/linux/hdlc/ioctl.h
@@ -79,6 +79,15 @@ typedef struct {
     unsigned int timeout;
 } cisco_proto;
 
+typedef struct {
+	unsigned short dce; /* 1 for DCE (network side) operation */
+	unsigned int modulo; /* modulo (8 = basic / 128 = extended) */
+	unsigned int window; /* frame window size */
+	unsigned int t1; /* timeout t1 */
+	unsigned int t2; /* timeout t2 */
+	unsigned int n2; /* frame retry counter */
+} x25_hdlc_proto;
+
 /* PPP doesn't need any info now - supply length = 0 to ioctl */
 
 #endif /* __ASSEMBLY__ */
diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h
index 4bf33344aab1..be714cd8c826 100644
--- a/include/uapi/linux/if.h
+++ b/include/uapi/linux/if.h
@@ -213,6 +213,7 @@ struct if_settings {
 		fr_proto		__user *fr;
 		fr_proto_pvc		__user *fr_pvc;
 		fr_proto_pvc_info	__user *fr_pvc_info;
+		x25_hdlc_proto		__user *x25;
 
 		/* interface settings */
 		sync_serial_settings	__user *sync;
-- 
cgit v1.2.3


From be8704ff07d2374bcc5c675526f95e70c6459683 Mon Sep 17 00:00:00 2001
From: Alexei Starovoitov <ast@kernel.org>
Date: Mon, 20 Jan 2020 16:53:46 -0800
Subject: bpf: Introduce dynamic program extensions
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Introduce dynamic program extensions. The users can load additional BPF
functions and replace global functions in previously loaded BPF programs while
these programs are executing.

Global functions are verified individually by the verifier based on their types
only. Hence a global function in the new program whose types match those of the
older function can safely replace that function.

This new function/program is called an 'extension' of the old program. At load
time the verifier uses the (attach_prog_fd, attach_btf_id) pair to identify the
function to be replaced. The extension program inherits its BPF program type
from the target program; technically, bpf_verifier_ops is copied from the
target program. The BPF_PROG_TYPE_EXT program type is a placeholder with empty
verifier_ops. The extension program can call the same BPF helper functions as
the target program. A single BPF_PROG_TYPE_EXT type is used to extend XDP, SKB
and all other program types. The verifier allows only one level of replacement,
meaning that an extension program cannot recursively extend another extension.
That also means that the maximum stack size increases from 512 to 1024 bytes
and the maximum function nesting level from 8 to 16. Programs don't always
consume that much: stack usage is determined by the number of on-stack
variables the program uses. The verifier could have enforced the 512 limit for
the combined original plus extension program, but that would make for a
difficult user experience. The main use case for extensions is to provide a
generic mechanism to plug external programs into a policy program or function
call chaining.
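
The type-matching rule can be pictured with a simplified standalone model.
This is only a sketch: the kernel compares BTF type records, whereas the
descriptor structs, the enum and proto_match() below are hypothetical names
invented for illustration, and return types are reduced to a single kind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_FUNC_ARGS 5	/* BPF functions take at most five arguments */

enum arg_kind { ARG_INT, ARG_ENUM, ARG_CTX_PTR, ARG_OTHER };

struct arg_desc {
	enum arg_kind kind;
	unsigned int size;	/* byte size for scalars, unused for pointers */
	const char *ctx_name;	/* struct name for ARG_CTX_PTR, else NULL */
};

struct func_desc {
	bool is_global;		/* only global functions can be replaced */
	enum arg_kind ret_kind;
	unsigned int nargs;
	struct arg_desc args[MAX_FUNC_ARGS];
};

/* Return true if 'ext' may stand in for 'tgt' under the simplified rules. */
static inline bool proto_match(const struct func_desc *ext,
			       const struct func_desc *tgt)
{
	unsigned int i;

	assert(ext != NULL && tgt != NULL);
	if (!ext->is_global || !tgt->is_global)
		return false;
	if (ext->nargs != tgt->nargs || ext->nargs > MAX_FUNC_ARGS)
		return false;
	if (ext->ret_kind != tgt->ret_kind)
		return false;

	for (i = 0; i < ext->nargs; i++) {
		const struct arg_desc *a = &ext->args[i];
		const struct arg_desc *b = &tgt->args[i];

		if (a->kind != b->kind)
			return false;
		switch (a->kind) {
		case ARG_INT:
		case ARG_ENUM:
			/* scalars must agree in size */
			if (a->size != b->size)
				return false;
			break;
		case ARG_CTX_PTR:
			/* context pointers must name the same struct */
			if (strcmp(a->ctx_name, b->ctx_name))
				return false;
			break;
		default:
			/* anything else is not supported for replacement */
			return false;
		}
	}
	return true;
}
```

A match here only says the prototypes line up; the body of the extension
still has to pass the verifier on its own.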

BPF trampoline is used to track both fentry/fexit and program extensions
because both are using the same nop slot at the beginning of every BPF
function. Attaching fentry/fexit to a function that was replaced is not
allowed. The opposite is true as well. Replacing a function that is currently
being analyzed with fentry/fexit is not allowed. The executable page allocated
by BPF trampoline is not used by program extensions. This inefficiency will be
optimized in future patches.
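
The mutual exclusion between fentry/fexit and extensions can be modelled in
a few lines. This is a userspace sketch, not the kernel implementation: the
names are local stand-ins for the bpf_trampoline fields, the text-poking
step is left out, and the MAX_TRAMP_PROGS value is an assumption of the
model.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum tramp_kind {
	TRAMP_FENTRY,
	TRAMP_FEXIT,
	TRAMP_MAX,
	TRAMP_REPLACE,	/* more than MAX: has no slot in progs_cnt[] */
};

static_assert(TRAMP_REPLACE > TRAMP_MAX, "REPLACE must not index progs_cnt");

#define MAX_TRAMP_PROGS 40	/* assumed limit for this model */

struct tramp {
	const void *extension_prog;	/* non-NULL once a function is replaced */
	int progs_cnt[TRAMP_MAX];	/* attached fentry/fexit programs */
};

/* Hypothetical model of the link step; returns 0 or a negative errno. */
static inline int tramp_link(struct tramp *tr, enum tramp_kind kind,
			     const void *prog)
{
	int cnt;

	if (kind != TRAMP_REPLACE && kind >= TRAMP_MAX)
		return -EINVAL;
	/* Nothing may attach to, or overwrite, a replaced function. */
	if (tr->extension_prog)
		return -EBUSY;

	cnt = tr->progs_cnt[TRAMP_FENTRY] + tr->progs_cnt[TRAMP_FEXIT];
	if (kind == TRAMP_REPLACE) {
		/* An extension needs the nop slot to itself. */
		if (cnt)
			return -EBUSY;
		tr->extension_prog = prog;
		return 0;
	}
	if (cnt >= MAX_TRAMP_PROGS)
		return -E2BIG;
	tr->progs_cnt[kind]++;
	return 0;
}
```

Both checks fail with -EBUSY because the two mechanisms compete for the same
nop slot at the start of the target function.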

Function-by-function verification of global functions supports scalars and
pointers to context only. Hence program extensions are supported for that class
of global functions only. In the future the verifier will be extended with
support for pointers to structures, arrays with sizes, etc.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Acked-by: Andrii Nakryiko <andriin@fb.com>
Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://lore.kernel.org/bpf/20200121005348.2769920-2-ast@kernel.org
---
 include/linux/bpf.h       |  10 ++-
 include/linux/bpf_types.h |   2 +
 include/linux/btf.h       |   5 ++
 include/uapi/linux/bpf.h  |   1 +
 kernel/bpf/btf.c          | 152 +++++++++++++++++++++++++++++++++++++++++++++-
 kernel/bpf/syscall.c      |  15 ++++-
 kernel/bpf/trampoline.c   |  41 ++++++++++++-
 kernel/bpf/verifier.c     |  85 ++++++++++++++++++++------
 8 files changed, 283 insertions(+), 28 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8e3b8f4ad183..05d16615054c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -465,7 +465,8 @@ void notrace __bpf_prog_exit(struct bpf_prog *prog, u64 start);
 enum bpf_tramp_prog_type {
 	BPF_TRAMP_FENTRY,
 	BPF_TRAMP_FEXIT,
-	BPF_TRAMP_MAX
+	BPF_TRAMP_MAX,
+	BPF_TRAMP_REPLACE, /* more than MAX */
 };
 
 struct bpf_trampoline {
@@ -480,6 +481,11 @@ struct bpf_trampoline {
 		void *addr;
 		bool ftrace_managed;
 	} func;
+	/* if !NULL this is BPF_PROG_TYPE_EXT program that extends another BPF
+	 * program by replacing one of its functions. func.addr is the address
+	 * of the function it replaced.
+	 */
+	struct bpf_prog *extension_prog;
 	/* list of BPF programs using this trampoline */
 	struct hlist_head progs_hlist[BPF_TRAMP_MAX];
 	/* Number of attached programs. A counter per kind. */
@@ -1107,6 +1113,8 @@ int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
 			     struct bpf_reg_state *regs);
 int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 			  struct bpf_reg_state *reg);
+int btf_check_type_match(struct bpf_verifier_env *env, struct bpf_prog *prog,
+			 struct btf *btf, const struct btf_type *t);
 
 struct bpf_prog *bpf_prog_by_id(u32 id);
 
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index 9f326e6ef885..c81d4ece79a4 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -68,6 +68,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_SK_REUSEPORT, sk_reuseport,
 #if defined(CONFIG_BPF_JIT)
 BPF_PROG_TYPE(BPF_PROG_TYPE_STRUCT_OPS, bpf_struct_ops,
 	      void *, void *)
+BPF_PROG_TYPE(BPF_PROG_TYPE_EXT, bpf_extension,
+	      void *, void *)
 #endif
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 881e9b76ef49..5c1ea99b480f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -107,6 +107,11 @@ static inline u16 btf_type_vlen(const struct btf_type *t)
 	return BTF_INFO_VLEN(t->info);
 }
 
+static inline u16 btf_func_linkage(const struct btf_type *t)
+{
+	return BTF_INFO_VLEN(t->info);
+}
+
 static inline bool btf_type_kflag(const struct btf_type *t)
 {
 	return BTF_INFO_KFLAG(t->info);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 033d90a2282d..e81628eb059c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -180,6 +180,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_CGROUP_SOCKOPT,
 	BPF_PROG_TYPE_TRACING,
 	BPF_PROG_TYPE_STRUCT_OPS,
+	BPF_PROG_TYPE_EXT,
 };
 
 enum bpf_attach_type {
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 832b5d7fd892..32963b6d5a9c 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -276,6 +276,11 @@ static const char * const btf_kind_str[NR_BTF_KINDS] = {
 	[BTF_KIND_DATASEC]	= "DATASEC",
 };
 
+static const char *btf_type_str(const struct btf_type *t)
+{
+	return btf_kind_str[BTF_INFO_KIND(t->info)];
+}
+
 struct btf_kind_operations {
 	s32 (*check_meta)(struct btf_verifier_env *env,
 			  const struct btf_type *t,
@@ -4115,6 +4120,148 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 	return 0;
 }
 
+/* Compare BTFs of two functions assuming only scalars and pointers to context.
+ * t1 points to BTF_KIND_FUNC in btf1
+ * t2 points to BTF_KIND_FUNC in btf2
+ * Returns:
+ * EINVAL - function prototype mismatch
+ * EFAULT - verifier bug
+ * 0 - 99% match. The last 1% is validated by the verifier.
+ */
+int btf_check_func_type_match(struct bpf_verifier_log *log,
+			      struct btf *btf1, const struct btf_type *t1,
+			      struct btf *btf2, const struct btf_type *t2)
+{
+	const struct btf_param *args1, *args2;
+	const char *fn1, *fn2, *s1, *s2;
+	u32 nargs1, nargs2, i;
+
+	fn1 = btf_name_by_offset(btf1, t1->name_off);
+	fn2 = btf_name_by_offset(btf2, t2->name_off);
+
+	if (btf_func_linkage(t1) != BTF_FUNC_GLOBAL) {
+		bpf_log(log, "%s() is not a global function\n", fn1);
+		return -EINVAL;
+	}
+	if (btf_func_linkage(t2) != BTF_FUNC_GLOBAL) {
+		bpf_log(log, "%s() is not a global function\n", fn2);
+		return -EINVAL;
+	}
+
+	t1 = btf_type_by_id(btf1, t1->type);
+	if (!t1 || !btf_type_is_func_proto(t1))
+		return -EFAULT;
+	t2 = btf_type_by_id(btf2, t2->type);
+	if (!t2 || !btf_type_is_func_proto(t2))
+		return -EFAULT;
+
+	args1 = (const struct btf_param *)(t1 + 1);
+	nargs1 = btf_type_vlen(t1);
+	args2 = (const struct btf_param *)(t2 + 1);
+	nargs2 = btf_type_vlen(t2);
+
+	if (nargs1 != nargs2) {
+		bpf_log(log, "%s() has %d args while %s() has %d args\n",
+			fn1, nargs1, fn2, nargs2);
+		return -EINVAL;
+	}
+
+	t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
+	t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
+	if (t1->info != t2->info) {
+		bpf_log(log,
+			"Return type %s of %s() doesn't match type %s of %s()\n",
+			btf_type_str(t1), fn1,
+			btf_type_str(t2), fn2);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < nargs1; i++) {
+		t1 = btf_type_skip_modifiers(btf1, args1[i].type, NULL);
+		t2 = btf_type_skip_modifiers(btf2, args2[i].type, NULL);
+
+		if (t1->info != t2->info) {
+			bpf_log(log, "arg%d in %s() is %s while %s() has %s\n",
+				i, fn1, btf_type_str(t1),
+				fn2, btf_type_str(t2));
+			return -EINVAL;
+		}
+		if (btf_type_has_size(t1) && t1->size != t2->size) {
+			bpf_log(log,
+				"arg%d in %s() has size %d while %s() has %d\n",
+				i, fn1, t1->size,
+				fn2, t2->size);
+			return -EINVAL;
+		}
+
+		/* global functions are validated with scalars and pointers
+		 * to context only. And only global functions can be replaced.
+		 * Hence type check only those types.
+		 */
+		if (btf_type_is_int(t1) || btf_type_is_enum(t1))
+			continue;
+		if (!btf_type_is_ptr(t1)) {
+			bpf_log(log,
+				"arg%d in %s() has unrecognized type\n",
+				i, fn1);
+			return -EINVAL;
+		}
+		t1 = btf_type_skip_modifiers(btf1, t1->type, NULL);
+		t2 = btf_type_skip_modifiers(btf2, t2->type, NULL);
+		if (!btf_type_is_struct(t1)) {
+			bpf_log(log,
+				"arg%d in %s() is not a pointer to context\n",
+				i, fn1);
+			return -EINVAL;
+		}
+		if (!btf_type_is_struct(t2)) {
+			bpf_log(log,
+				"arg%d in %s() is not a pointer to context\n",
+				i, fn2);
+			return -EINVAL;
+		}
+		/* This is an optional check to make program writing easier.
+		 * Compare names of structs and report an error to the user.
+		 * btf_prepare_func_args() already checked that t2 struct
+		 * is a context type. btf_prepare_func_args() will check
+		 * later that t1 struct is a context type as well.
+		 */
+		s1 = btf_name_by_offset(btf1, t1->name_off);
+		s2 = btf_name_by_offset(btf2, t2->name_off);
+		if (strcmp(s1, s2)) {
+			bpf_log(log,
+				"arg%d %s(struct %s *) doesn't match %s(struct %s *)\n",
+				i, fn1, s1, fn2, s2);
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/* Compare BTFs of given program with BTF of target program */
+int btf_check_type_match(struct bpf_verifier_env *env, struct bpf_prog *prog,
+			 struct btf *btf2, const struct btf_type *t2)
+{
+	struct btf *btf1 = prog->aux->btf;
+	const struct btf_type *t1;
+	u32 btf_id = 0;
+
+	if (!prog->aux->func_info) {
+		bpf_log(&env->log, "Program extension requires BTF\n");
+		return -EINVAL;
+	}
+
+	btf_id = prog->aux->func_info[0].type_id;
+	if (!btf_id)
+		return -EFAULT;
+
+	t1 = btf_type_by_id(btf1, btf_id);
+	if (!t1 || !btf_type_is_func(t1))
+		return -EFAULT;
+
+	return btf_check_func_type_match(&env->log, btf1, t1, btf2, t2);
+}
+
 /* Compare BTF of a function with given bpf_reg_state.
  * Returns:
  * EFAULT - there is a verifier bug. Abort verification.
@@ -4224,6 +4371,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 {
 	struct bpf_verifier_log *log = &env->log;
 	struct bpf_prog *prog = env->prog;
+	enum bpf_prog_type prog_type = prog->type;
 	struct btf *btf = prog->aux->btf;
 	const struct btf_param *args;
 	const struct btf_type *t;
@@ -4261,6 +4409,8 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 		bpf_log(log, "Verifier bug in function %s()\n", tname);
 		return -EFAULT;
 	}
+	if (prog_type == BPF_PROG_TYPE_EXT)
+		prog_type = prog->aux->linked_prog->type;
 
 	t = btf_type_by_id(btf, t->type);
 	if (!t || !btf_type_is_func_proto(t)) {
@@ -4296,7 +4446,7 @@ int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 			continue;
 		}
 		if (btf_type_is_ptr(t) &&
-		    btf_get_prog_ctx_type(log, btf, t, prog->type, i)) {
+		    btf_get_prog_ctx_type(log, btf, t, prog_type, i)) {
 			reg[i + 1].type = PTR_TO_CTX;
 			continue;
 		}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 9a840c57f6df..a91ad518c050 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1932,13 +1932,15 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		switch (prog_type) {
 		case BPF_PROG_TYPE_TRACING:
 		case BPF_PROG_TYPE_STRUCT_OPS:
+		case BPF_PROG_TYPE_EXT:
 			break;
 		default:
 			return -EINVAL;
 		}
 	}
 
-	if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING)
+	if (prog_fd && prog_type != BPF_PROG_TYPE_TRACING &&
+	    prog_type != BPF_PROG_TYPE_EXT)
 		return -EINVAL;
 
 	switch (prog_type) {
@@ -1981,6 +1983,10 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		default:
 			return -EINVAL;
 		}
+	case BPF_PROG_TYPE_EXT:
+		if (expected_attach_type)
+			return -EINVAL;
+		/* fallthrough */
 	default:
 		return 0;
 	}
@@ -2183,7 +2189,8 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog)
 	int tr_fd, err;
 
 	if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
-	    prog->expected_attach_type != BPF_TRACE_FEXIT) {
+	    prog->expected_attach_type != BPF_TRACE_FEXIT &&
+	    prog->type != BPF_PROG_TYPE_EXT) {
 		err = -EINVAL;
 		goto out_put_prog;
 	}
@@ -2250,12 +2257,14 @@ static int bpf_raw_tracepoint_open(const union bpf_attr *attr)
 
 	if (prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT &&
 	    prog->type != BPF_PROG_TYPE_TRACING &&
+	    prog->type != BPF_PROG_TYPE_EXT &&
 	    prog->type != BPF_PROG_TYPE_RAW_TRACEPOINT_WRITABLE) {
 		err = -EINVAL;
 		goto out_put_prog;
 	}
 
-	if (prog->type == BPF_PROG_TYPE_TRACING) {
+	if (prog->type == BPF_PROG_TYPE_TRACING ||
+	    prog->type == BPF_PROG_TYPE_EXT) {
 		if (attr->raw_tracepoint.name) {
 			/* The attach point for this category of programs
 			 * should be specified via btf_id during program load.
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 7657ede7aee2..eb64c245052b 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -5,6 +5,12 @@
 #include <linux/filter.h>
 #include <linux/ftrace.h>
 
+/* dummy _ops. The verifier will operate on target program's ops. */
+const struct bpf_verifier_ops bpf_extension_verifier_ops = {
+};
+const struct bpf_prog_ops bpf_extension_prog_ops = {
+};
+
 /* btf_vmlinux has ~22k attachable functions. 1k htab is enough. */
 #define TRAMPOLINE_HASH_BITS 10
 #define TRAMPOLINE_TABLE_SIZE (1 << TRAMPOLINE_HASH_BITS)
@@ -194,8 +200,10 @@ static enum bpf_tramp_prog_type bpf_attach_type_to_tramp(enum bpf_attach_type t)
 	switch (t) {
 	case BPF_TRACE_FENTRY:
 		return BPF_TRAMP_FENTRY;
-	default:
+	case BPF_TRACE_FEXIT:
 		return BPF_TRAMP_FEXIT;
+	default:
+		return BPF_TRAMP_REPLACE;
 	}
 }
 
@@ -204,12 +212,31 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog)
 	enum bpf_tramp_prog_type kind;
 	struct bpf_trampoline *tr;
 	int err = 0;
+	int cnt;
 
 	tr = prog->aux->trampoline;
 	kind = bpf_attach_type_to_tramp(prog->expected_attach_type);
 	mutex_lock(&tr->mutex);
-	if (tr->progs_cnt[BPF_TRAMP_FENTRY] + tr->progs_cnt[BPF_TRAMP_FEXIT]
-	    >= BPF_MAX_TRAMP_PROGS) {
+	if (tr->extension_prog) {
+		/* cannot attach fentry/fexit if extension prog is attached.
+		 * cannot overwrite extension prog either.
+		 */
+		err = -EBUSY;
+		goto out;
+	}
+	cnt = tr->progs_cnt[BPF_TRAMP_FENTRY] + tr->progs_cnt[BPF_TRAMP_FEXIT];
+	if (kind == BPF_TRAMP_REPLACE) {
+		/* Cannot attach extension if fentry/fexit are in use. */
+		if (cnt) {
+			err = -EBUSY;
+			goto out;
+		}
+		tr->extension_prog = prog;
+		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
+					 prog->bpf_func);
+		goto out;
+	}
+	if (cnt >= BPF_MAX_TRAMP_PROGS) {
 		err = -E2BIG;
 		goto out;
 	}
@@ -240,9 +267,17 @@ int bpf_trampoline_unlink_prog(struct bpf_prog *prog)
 	tr = prog->aux->trampoline;
 	kind = bpf_attach_type_to_tramp(prog->expected_attach_type);
 	mutex_lock(&tr->mutex);
+	if (kind == BPF_TRAMP_REPLACE) {
+		WARN_ON_ONCE(!tr->extension_prog);
+		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP,
+					 tr->extension_prog->bpf_func, NULL);
+		tr->extension_prog = NULL;
+		goto out;
+	}
 	hlist_del(&prog->aux->tramp_hlist);
 	tr->progs_cnt[kind]--;
 	err = bpf_trampoline_update(prog->aux->trampoline);
+out:
 	mutex_unlock(&tr->mutex);
 	return err;
 }
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6defbec9eb62..9fe94f64f0ec 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9564,7 +9564,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
 			subprog);
 
 	regs = state->frame[state->curframe]->regs;
-	if (subprog) {
+	if (subprog || env->prog->type == BPF_PROG_TYPE_EXT) {
 		ret = btf_prepare_func_args(env, subprog, regs);
 		if (ret)
 			goto out;
@@ -9742,6 +9742,7 @@ static int check_struct_ops_btf_id(struct bpf_verifier_env *env)
 static int check_attach_btf_id(struct bpf_verifier_env *env)
 {
 	struct bpf_prog *prog = env->prog;
+	bool prog_extension = prog->type == BPF_PROG_TYPE_EXT;
 	struct bpf_prog *tgt_prog = prog->aux->linked_prog;
 	u32 btf_id = prog->aux->attach_btf_id;
 	const char prefix[] = "btf_trace_";
@@ -9757,7 +9758,7 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	if (prog->type == BPF_PROG_TYPE_STRUCT_OPS)
 		return check_struct_ops_btf_id(env);
 
-	if (prog->type != BPF_PROG_TYPE_TRACING)
+	if (prog->type != BPF_PROG_TYPE_TRACING && !prog_extension)
 		return 0;
 
 	if (!btf_id) {
@@ -9793,8 +9794,59 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 			return -EINVAL;
 		}
 		conservative = aux->func_info_aux[subprog].unreliable;
+		if (prog_extension) {
+			if (conservative) {
+				verbose(env,
+					"Cannot replace static functions\n");
+				return -EINVAL;
+			}
+			if (!prog->jit_requested) {
+				verbose(env,
+					"Extension programs should be JITed\n");
+				return -EINVAL;
+			}
+			env->ops = bpf_verifier_ops[tgt_prog->type];
+		}
+		if (!tgt_prog->jited) {
+			verbose(env, "Can attach to only JITed progs\n");
+			return -EINVAL;
+		}
+		if (tgt_prog->type == prog->type) {
+			/* Cannot fentry/fexit another fentry/fexit program.
+			 * Cannot attach program extension to another extension.
+			 * It's ok to attach fentry/fexit to extension program.
+			 */
+			verbose(env, "Cannot recursively attach\n");
+			return -EINVAL;
+		}
+		if (tgt_prog->type == BPF_PROG_TYPE_TRACING &&
+		    prog_extension &&
+		    (tgt_prog->expected_attach_type == BPF_TRACE_FENTRY ||
+		     tgt_prog->expected_attach_type == BPF_TRACE_FEXIT)) {
+			/* Program extensions can extend all program types
+			 * except fentry/fexit. The reason is the following.
+			 * The fentry/fexit programs are used for performance
+			 * analysis, stats and can be attached to any program
+			 * type except themselves. When extension program is
+			 * replacing XDP function it is necessary to allow
+			 * performance analysis of all functions. Both original
+			 * XDP program and its program extension. Hence
+			 * attaching fentry/fexit to BPF_PROG_TYPE_EXT is
+			 * allowed. If extending of fentry/fexit was allowed it
+			 * would be possible to create long call chain
+			 * fentry->extension->fentry->extension beyond
+			 * reasonable stack size. Hence extending fentry is not
+			 * allowed.
+			 */
+			verbose(env, "Cannot extend fentry/fexit\n");
+			return -EINVAL;
+		}
 		key = ((u64)aux->id) << 32 | btf_id;
 	} else {
+		if (prog_extension) {
+			verbose(env, "Cannot replace kernel functions\n");
+			return -EINVAL;
+		}
 		key = btf_id;
 	}
 
@@ -9832,6 +9884,10 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		prog->aux->attach_func_proto = t;
 		prog->aux->attach_btf_trace = true;
 		return 0;
+	default:
+		if (!prog_extension)
+			return -EINVAL;
+		/* fallthrough */
 	case BPF_TRACE_FENTRY:
 	case BPF_TRACE_FEXIT:
 		if (!btf_type_is_func(t)) {
@@ -9839,6 +9895,9 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 				btf_id);
 			return -EINVAL;
 		}
+		if (prog_extension &&
+		    btf_check_type_match(env, prog, btf, t))
+			return -EINVAL;
 		t = btf_type_by_id(btf, t->type);
 		if (!btf_type_is_func_proto(t))
 			return -EINVAL;
@@ -9862,18 +9921,6 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		if (ret < 0)
 			goto out;
 		if (tgt_prog) {
-			if (!tgt_prog->jited) {
-				/* for now */
-				verbose(env, "Can trace only JITed BPF progs\n");
-				ret = -EINVAL;
-				goto out;
-			}
-			if (tgt_prog->type == BPF_PROG_TYPE_TRACING) {
-				/* prevent cycles */
-				verbose(env, "Cannot recursively attach\n");
-				ret = -EINVAL;
-				goto out;
-			}
 			if (subprog == 0)
 				addr = (long) tgt_prog->bpf_func;
 			else
@@ -9895,8 +9942,6 @@ out:
 		if (ret)
 			bpf_trampoline_put(tr);
 		return ret;
-	default:
-		return -EINVAL;
 	}
 }
 
@@ -9966,10 +10011,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		goto skip_full_check;
 	}
 
-	ret = check_attach_btf_id(env);
-	if (ret)
-		goto skip_full_check;
-
 	env->strict_alignment = !!(attr->prog_flags & BPF_F_STRICT_ALIGNMENT);
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS))
 		env->strict_alignment = true;
@@ -10006,6 +10047,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret < 0)
 		goto skip_full_check;
 
+	ret = check_attach_btf_id(env);
+	if (ret)
+		goto skip_full_check;
+
 	ret = check_cfg(env);
 	if (ret < 0)
 		goto skip_full_check;
-- 
cgit v1.2.3


From 5576b991e9c1a11d2cc21c4b94fc75ec27603896 Mon Sep 17 00:00:00 2001
From: Martin KaFai Lau <kafai@fb.com>
Date: Wed, 22 Jan 2020 15:36:46 -0800
Subject: bpf: Add BPF_FUNC_jiffies64

This patch adds a helper to read the 64-bit jiffies value.  It will be
used in a later patch to implement bpf_cubic.c.

The helper is inlined when jit_requested is set and BITS_PER_LONG is 64,
as is done for map_gen_lookup().  Other cases could be considered
together with map_gen_lookup() if needed.
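
As a rough illustration of how a caller might use the value, the sketch
below measures elapsed ticks between two readings.  It is a plain
userspace model, not BPF code: fake_jiffies64() is a hypothetical
stand-in for bpf_jiffies64(), and HZ_ASSUMED is an assumed tick rate
chosen only for this example.

```c
#include <stdint.h>

/* Assumed tick rate for this example only; the real HZ is a kernel
 * build-time setting.
 */
#define HZ_ASSUMED 250ULL

/* Hypothetical stand-in for bpf_jiffies64(): a counter the caller sets. */
static uint64_t fake_jiffies;

static uint64_t fake_jiffies64(void)
{
	return fake_jiffies;
}

/* Ticks elapsed since 'start'.  Unsigned subtraction stays well defined
 * even if the 64-bit counter wraps between the two readings.
 */
static uint64_t ticks_since(uint64_t start)
{
	return fake_jiffies64() - start;
}

/* Convert ticks to milliseconds under the assumed tick rate. */
static uint64_t ticks_to_ms(uint64_t ticks)
{
	return ticks * 1000ULL / HZ_ASSUMED;
}
```

Taking the difference of two readings, rather than comparing absolute
values, keeps the arithmetic correct across a counter wrap.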

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200122233646.903260-1-kafai@fb.com
---
 include/linux/bpf.h      |  1 +
 include/uapi/linux/bpf.h |  9 ++++++++-
 kernel/bpf/core.c        |  1 +
 kernel/bpf/helpers.c     | 12 ++++++++++++
 kernel/bpf/verifier.c    | 24 ++++++++++++++++++++++++
 net/core/filter.c        |  2 ++
 6 files changed, 48 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 05d16615054c..a9687861fd7e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1414,6 +1414,7 @@ extern const struct bpf_func_proto bpf_get_local_storage_proto;
 extern const struct bpf_func_proto bpf_strtol_proto;
 extern const struct bpf_func_proto bpf_strtoul_proto;
 extern const struct bpf_func_proto bpf_tcp_sock_proto;
+extern const struct bpf_func_proto bpf_jiffies64_proto;
 
 /* Shared helpers among cBPF and eBPF. */
 void bpf_user_rnd_init_once(void);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e81628eb059c..f1d74a2bd234 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -2886,6 +2886,12 @@ union bpf_attr {
  *		**-EPERM** if no permission to send the *sig*.
  *
  *		**-EAGAIN** if bpf program can try again.
+ *
+ * u64 bpf_jiffies64(void)
+ *	Description
+ *		Obtain the 64bit jiffies
+ *	Return
+ *		The 64 bit jiffies
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3005,7 +3011,8 @@ union bpf_attr {
 	FN(probe_read_user_str),	\
 	FN(probe_read_kernel_str),	\
 	FN(tcp_send_ack),		\
-	FN(send_signal_thread),
+	FN(send_signal_thread),		\
+	FN(jiffies64),
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
  * function eBPF program intends to call
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 29d47aae0dd1..973a20d49749 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2137,6 +2137,7 @@ const struct bpf_func_proto bpf_map_pop_elem_proto __weak;
 const struct bpf_func_proto bpf_map_peek_elem_proto __weak;
 const struct bpf_func_proto bpf_spin_lock_proto __weak;
 const struct bpf_func_proto bpf_spin_unlock_proto __weak;
+const struct bpf_func_proto bpf_jiffies64_proto __weak;
 
 const struct bpf_func_proto bpf_get_prandom_u32_proto __weak;
 const struct bpf_func_proto bpf_get_smp_processor_id_proto __weak;
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index cada974c9f4e..d8b7b110a1c5 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -11,6 +11,7 @@
 #include <linux/uidgid.h>
 #include <linux/filter.h>
 #include <linux/ctype.h>
+#include <linux/jiffies.h>
 
 #include "../../lib/kstrtox.h"
 
@@ -312,6 +313,17 @@ void copy_map_value_locked(struct bpf_map *map, void *dst, void *src,
 	preempt_enable();
 }
 
+BPF_CALL_0(bpf_jiffies64)
+{
+	return get_jiffies_64();
+}
+
+const struct bpf_func_proto bpf_jiffies64_proto = {
+	.func		= bpf_jiffies64,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+};
+
 #ifdef CONFIG_CGROUPS
 BPF_CALL_0(bpf_get_current_cgroup_id)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9fe94f64f0ec..734abaa02123 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9445,6 +9445,30 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 			goto patch_call_imm;
 		}
 
+		if (prog->jit_requested && BITS_PER_LONG == 64 &&
+		    insn->imm == BPF_FUNC_jiffies64) {
+			struct bpf_insn ld_jiffies_addr[2] = {
+				BPF_LD_IMM64(BPF_REG_0,
+					     (unsigned long)&jiffies),
+			};
+
+			insn_buf[0] = ld_jiffies_addr[0];
+			insn_buf[1] = ld_jiffies_addr[1];
+			insn_buf[2] = BPF_LDX_MEM(BPF_DW, BPF_REG_0,
+						  BPF_REG_0, 0);
+			cnt = 3;
+
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf,
+						       cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			continue;
+		}
+
 patch_call_imm:
 		fn = env->ops->get_func_proto(insn->imm, env->prog);
 		/* all functions that have prototype and verifier allowed
diff --git a/net/core/filter.c b/net/core/filter.c
index 17de6747d9e3..4bf3e4aa8a7a 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -5923,6 +5923,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_spin_unlock_proto;
 	case BPF_FUNC_trace_printk:
 		return bpf_get_trace_printk_proto();
+	case BPF_FUNC_jiffies64:
+		return &bpf_jiffies64_proto;
 	default:
 		return NULL;
 	}
-- 
cgit v1.2.3


From ec97ecf1ebe485a17cd8395a5f35e6b80b57665a Mon Sep 17 00:00:00 2001
From: "Mohit P. Tahiliani" <tahiliani@nitk.edu.in>
Date: Wed, 22 Jan 2020 23:52:33 +0530
Subject: net: sched: add Flow Queue PIE packet scheduler

Principles:
  - Packets are classified on flows.
  - This is a stochastic model (as we use a hash, several flows might
    be hashed to the same slot).
  - Each flow has a PIE managed queue.
  - Flows are linked onto two (Round Robin) lists,
    so that new flows have priority on old ones.
  - For a given flow, packets are not reordered.
  - Drops during enqueue only.
  - ECN capability is off by default.
  - ECN threshold (if ECN is enabled) is at 10% by default.
  - Uses timestamps to calculate queue delay by default.
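
The hashing principle above can be sketched in userspace C.  The qdisc
picks a slot by scaling skb_get_hash() with reciprocal_scale(); the
function below reproduces that multiply-and-shift arithmetic under the
hypothetical name scale_hash().  It is an illustration, not the kernel
code itself.

```c
#include <stdint.h>

/* Map a 32-bit hash onto [0, nflows) without a division, using the same
 * multiply-and-shift arithmetic as the kernel's reciprocal_scale().
 */
static uint32_t scale_hash(uint32_t hash, uint32_t nflows)
{
	return (uint32_t)(((uint64_t)hash * nflows) >> 32);
}
```

With 1024 slots, the hashes 0x80000000 and 0x80000001 both land in slot
512, which is the kind of collision the "stochastic" label refers to.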

Usage:
tc qdisc ... fq_pie [ limit PACKETS ] [ flows NUMBER ]
                    [ target TIME ] [ tupdate TIME ]
                    [ alpha NUMBER ] [ beta NUMBER ]
                    [ quantum BYTES ] [ memory_limit BYTES ]
                    [ ecnprob PERCENTAGE ] [ [no]ecn ]
                    [ [no]bytemode ] [ [no_]dq_rate_estimator ]

defaults:
  limit: 10240 packets, flows: 1024
  target: 15 ms, tupdate: 15 ms (in jiffies)
  alpha: 1/8, beta: 5/4
  quantum: device MTU, memory_limit: 32 MB
  ecnprob: 10%, ecn: off
  bytemode: off, dq_rate_estimator: off

Signed-off-by: Mohit P. Tahiliani <tahiliani@nitk.edu.in>
Signed-off-by: Sachin D. Patil <sdp.sachin@gmail.com>
Signed-off-by: V. Saicharan <vsaicharan1998@gmail.com>
Signed-off-by: Mohit Bhasi <mohitbhasi1998@gmail.com>
Signed-off-by: Leslie Monis <lesliemonis@gmail.com>
Signed-off-by: Gautam Ramakrishnan <gautamramk@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/net/pie.h              |   2 +
 include/uapi/linux/pkt_sched.h |  31 +++
 net/sched/Kconfig              |  13 +
 net/sched/Makefile             |   1 +
 net/sched/sch_fq_pie.c         | 562 +++++++++++++++++++++++++++++++++++++++++
 5 files changed, 609 insertions(+)
 create mode 100644 net/sched/sch_fq_pie.c

(limited to 'include/uapi/linux')

diff --git a/include/net/pie.h b/include/net/pie.h
index 90f5db3d29e7..fd5a37cb7993 100644
--- a/include/net/pie.h
+++ b/include/net/pie.h
@@ -81,9 +81,11 @@ struct pie_stats {
 /**
  * struct pie_skb_cb - contains private skb vars
  * @enqueue_time:	timestamp when the packet is enqueued
+ * @mem_usage:		size of the skb during enqueue
  */
 struct pie_skb_cb {
 	psched_time_t enqueue_time;
+	u32 mem_usage;
 };
 
 static inline void pie_params_init(struct pie_params *params)
diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index bf5a5b1dfb0b..bbe791b24168 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -971,6 +971,37 @@ struct tc_pie_xstats {
 	__u32 ecn_mark;			/* packets marked with ecn*/
 };
 
+/* FQ PIE */
+enum {
+	TCA_FQ_PIE_UNSPEC,
+	TCA_FQ_PIE_LIMIT,
+	TCA_FQ_PIE_FLOWS,
+	TCA_FQ_PIE_TARGET,
+	TCA_FQ_PIE_TUPDATE,
+	TCA_FQ_PIE_ALPHA,
+	TCA_FQ_PIE_BETA,
+	TCA_FQ_PIE_QUANTUM,
+	TCA_FQ_PIE_MEMORY_LIMIT,
+	TCA_FQ_PIE_ECN_PROB,
+	TCA_FQ_PIE_ECN,
+	TCA_FQ_PIE_BYTEMODE,
+	TCA_FQ_PIE_DQ_RATE_ESTIMATOR,
+	__TCA_FQ_PIE_MAX
+};
+#define TCA_FQ_PIE_MAX   (__TCA_FQ_PIE_MAX - 1)
+
+struct tc_fq_pie_xstats {
+	__u32 packets_in;	/* total number of packets enqueued */
+	__u32 dropped;		/* packets dropped due to fq_pie_action */
+	__u32 overlimit;	/* dropped due to lack of space in queue */
+	__u32 overmemory;	/* dropped due to lack of memory in queue */
+	__u32 ecn_mark;		/* packets marked with ecn */
+	__u32 new_flow_count;	/* count of new flows created by packets */
+	__u32 new_flows_len;	/* count of flows in new list */
+	__u32 old_flows_len;	/* count of flows in old list */
+	__u32 memory_usage;	/* total memory across all queues */
+};
+
 /* CBS */
 struct tc_cbs_qopt {
 	__u8 offload;
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index b1e7ec726958..edde0e519438 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -366,6 +366,19 @@ config NET_SCH_PIE
 
 	  If unsure, say N.
 
+config NET_SCH_FQ_PIE
+	depends on NET_SCH_PIE
+	tristate "Flow Queue Proportional Integral controller Enhanced (FQ-PIE)"
+	help
+	  Say Y here if you want to use the Flow Queue Proportional Integral
+	  controller Enhanced (FQ-PIE) packet scheduling algorithm.
+	  For more information, please see https://tools.ietf.org/html/rfc8033
+
+	  To compile this driver as a module, choose M here: the module
+	  will be called sch_fq_pie.
+
+	  If unsure, say N.
+
 config NET_SCH_INGRESS
 	tristate "Ingress/classifier-action Qdisc"
 	depends on NET_CLS_ACT
diff --git a/net/sched/Makefile b/net/sched/Makefile
index bc8856b865ff..31c367a6cd09 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -59,6 +59,7 @@ obj-$(CONFIG_NET_SCH_CAKE)	+= sch_cake.o
 obj-$(CONFIG_NET_SCH_FQ)	+= sch_fq.o
 obj-$(CONFIG_NET_SCH_HHF)	+= sch_hhf.o
 obj-$(CONFIG_NET_SCH_PIE)	+= sch_pie.o
+obj-$(CONFIG_NET_SCH_FQ_PIE)	+= sch_fq_pie.o
 obj-$(CONFIG_NET_SCH_CBS)	+= sch_cbs.o
 obj-$(CONFIG_NET_SCH_ETF)	+= sch_etf.o
 obj-$(CONFIG_NET_SCH_TAPRIO)	+= sch_taprio.o
diff --git a/net/sched/sch_fq_pie.c b/net/sched/sch_fq_pie.c
new file mode 100644
index 000000000000..bbd0dea6b6b9
--- /dev/null
+++ b/net/sched/sch_fq_pie.c
@@ -0,0 +1,562 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Flow Queue PIE discipline
+ *
+ * Copyright (C) 2019 Mohit P. Tahiliani <tahiliani@nitk.edu.in>
+ * Copyright (C) 2019 Sachin D. Patil <sdp.sachin@gmail.com>
+ * Copyright (C) 2019 V. Saicharan <vsaicharan1998@gmail.com>
+ * Copyright (C) 2019 Mohit Bhasi <mohitbhasi1998@gmail.com>
+ * Copyright (C) 2019 Leslie Monis <lesliemonis@gmail.com>
+ * Copyright (C) 2019 Gautam Ramakrishnan <gautamramk@gmail.com>
+ */
+
+#include <linux/jhash.h>
+#include <linux/sizes.h>
+#include <linux/vmalloc.h>
+#include <net/pkt_cls.h>
+#include <net/pie.h>
+
+/* Flow Queue PIE
+ *
+ * Principles:
+ *   - Packets are classified on flows.
+ *   - This is a Stochastic model (as we use a hash, several flows might
+ *                                 be hashed to the same slot)
+ *   - Each flow has a PIE managed queue.
+ *   - Flows are linked onto two (Round Robin) lists,
+ *     so that new flows have priority on old ones.
+ *   - For a given flow, packets are not reordered.
+ *   - Drops during enqueue only.
+ *   - ECN capability is off by default.
+ *   - ECN threshold (if ECN is enabled) is at 10% by default.
+ *   - Uses timestamps to calculate queue delay by default.
+ */
+
+/**
+ * struct fq_pie_flow - contains data for each flow
+ * @vars:	pie vars associated with the flow
+ * @deficit:	number of remaining byte credits
+ * @backlog:	size of data in the flow
+ * @qlen:	number of packets in the flow
+ * @flowchain:	flowchain for the flow
+ * @head:	first packet in the flow
+ * @tail:	last packet in the flow
+ */
+struct fq_pie_flow {
+	struct pie_vars vars;
+	s32 deficit;
+	u32 backlog;
+	u32 qlen;
+	struct list_head flowchain;
+	struct sk_buff *head;
+	struct sk_buff *tail;
+};
+
+struct fq_pie_sched_data {
+	struct tcf_proto __rcu *filter_list; /* optional external classifier */
+	struct tcf_block *block;
+	struct fq_pie_flow *flows;
+	struct Qdisc *sch;
+	struct list_head old_flows;
+	struct list_head new_flows;
+	struct pie_params p_params;
+	u32 ecn_prob;
+	u32 flows_cnt;
+	u32 quantum;
+	u32 memory_limit;
+	u32 new_flow_count;
+	u32 memory_usage;
+	u32 overmemory;
+	struct pie_stats stats;
+	struct timer_list adapt_timer;
+};
+
+static unsigned int fq_pie_hash(const struct fq_pie_sched_data *q,
+				struct sk_buff *skb)
+{
+	return reciprocal_scale(skb_get_hash(skb), q->flows_cnt);
+}
+
+static unsigned int fq_pie_classify(struct sk_buff *skb, struct Qdisc *sch,
+				    int *qerr)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	struct tcf_proto *filter;
+	struct tcf_result res;
+	int result;
+
+	if (TC_H_MAJ(skb->priority) == sch->handle &&
+	    TC_H_MIN(skb->priority) > 0 &&
+	    TC_H_MIN(skb->priority) <= q->flows_cnt)
+		return TC_H_MIN(skb->priority);
+
+	filter = rcu_dereference_bh(q->filter_list);
+	if (!filter)
+		return fq_pie_hash(q, skb) + 1;
+
+	*qerr = NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
+	result = tcf_classify(skb, filter, &res, false);
+	if (result >= 0) {
+#ifdef CONFIG_NET_CLS_ACT
+		switch (result) {
+		case TC_ACT_STOLEN:
+		case TC_ACT_QUEUED:
+		case TC_ACT_TRAP:
+			*qerr = NET_XMIT_SUCCESS | __NET_XMIT_STOLEN;
+			/* fall through */
+		case TC_ACT_SHOT:
+			return 0;
+		}
+#endif
+		if (TC_H_MIN(res.classid) <= q->flows_cnt)
+			return TC_H_MIN(res.classid);
+	}
+	return 0;
+}
+
+/* add skb to flow queue (tail add) */
+static inline void flow_queue_add(struct fq_pie_flow *flow,
+				  struct sk_buff *skb)
+{
+	if (!flow->head)
+		flow->head = skb;
+	else
+		flow->tail->next = skb;
+	flow->tail = skb;
+	skb->next = NULL;
+}
+
+static int fq_pie_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch,
+				struct sk_buff **to_free)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	struct fq_pie_flow *sel_flow;
+	int uninitialized_var(ret);
+	u8 memory_limited = false;
+	u8 enqueue = false;
+	u32 pkt_len;
+	u32 idx;
+
+	/* Classifies packet into corresponding flow */
+	idx = fq_pie_classify(skb, sch, &ret);
+	sel_flow = &q->flows[idx];
+
+	/* Checks whether adding a new packet would exceed memory limit */
+	get_pie_cb(skb)->mem_usage = skb->truesize;
+	memory_limited = q->memory_usage > q->memory_limit + skb->truesize;
+
+	/* Checks if the qdisc is full */
+	if (unlikely(qdisc_qlen(sch) >= sch->limit)) {
+		q->stats.overlimit++;
+		goto out;
+	} else if (unlikely(memory_limited)) {
+		q->overmemory++;
+	}
+
+	if (!pie_drop_early(sch, &q->p_params, &sel_flow->vars,
+			    sel_flow->backlog, skb->len)) {
+		enqueue = true;
+	} else if (q->p_params.ecn &&
+		   sel_flow->vars.prob <= (MAX_PROB / 100) * q->ecn_prob &&
+		   INET_ECN_set_ce(skb)) {
+		/* If packet is ecn capable, mark it if drop probability
+		 * is lower than the parameter ecn_prob, else drop it.
+		 */
+		q->stats.ecn_mark++;
+		enqueue = true;
+	}
+	if (enqueue) {
+		/* Set enqueue time only when dq_rate_estimator is disabled. */
+		if (!q->p_params.dq_rate_estimator)
+			pie_set_enqueue_time(skb);
+
+		pkt_len = qdisc_pkt_len(skb);
+		q->stats.packets_in++;
+		q->memory_usage += skb->truesize;
+		sch->qstats.backlog += pkt_len;
+		sch->q.qlen++;
+		flow_queue_add(sel_flow, skb);
+		if (list_empty(&sel_flow->flowchain)) {
+			list_add_tail(&sel_flow->flowchain, &q->new_flows);
+			q->new_flow_count++;
+			sel_flow->deficit = q->quantum;
+			sel_flow->qlen = 0;
+			sel_flow->backlog = 0;
+		}
+		sel_flow->qlen++;
+		sel_flow->backlog += pkt_len;
+		return NET_XMIT_SUCCESS;
+	}
+out:
+	q->stats.dropped++;
+	sel_flow->vars.accu_prob = 0;
+	sel_flow->vars.accu_prob_overflows = 0;
+	__qdisc_drop(skb, to_free);
+	qdisc_qstats_drop(sch);
+	return NET_XMIT_CN;
+}
+
+static const struct nla_policy fq_pie_policy[TCA_FQ_PIE_MAX + 1] = {
+	[TCA_FQ_PIE_LIMIT]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_FLOWS]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_TARGET]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_TUPDATE]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_ALPHA]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_BETA]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_QUANTUM]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_MEMORY_LIMIT]	= {.type = NLA_U32},
+	[TCA_FQ_PIE_ECN_PROB]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_ECN]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_BYTEMODE]		= {.type = NLA_U32},
+	[TCA_FQ_PIE_DQ_RATE_ESTIMATOR]	= {.type = NLA_U32},
+};
+
+static inline struct sk_buff *dequeue_head(struct fq_pie_flow *flow)
+{
+	struct sk_buff *skb = flow->head;
+
+	flow->head = skb->next;
+	skb->next = NULL;
+	return skb;
+}
+
+static struct sk_buff *fq_pie_qdisc_dequeue(struct Qdisc *sch)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	struct sk_buff *skb = NULL;
+	struct fq_pie_flow *flow;
+	struct list_head *head;
+	u32 pkt_len;
+
+begin:
+	head = &q->new_flows;
+	if (list_empty(head)) {
+		head = &q->old_flows;
+		if (list_empty(head))
+			return NULL;
+	}
+
+	flow = list_first_entry(head, struct fq_pie_flow, flowchain);
+	/* Flow has exhausted all its credits */
+	if (flow->deficit <= 0) {
+		flow->deficit += q->quantum;
+		list_move_tail(&flow->flowchain, &q->old_flows);
+		goto begin;
+	}
+
+	if (flow->head) {
+		skb = dequeue_head(flow);
+		pkt_len = qdisc_pkt_len(skb);
+		sch->qstats.backlog -= pkt_len;
+		sch->q.qlen--;
+		qdisc_bstats_update(sch, skb);
+	}
+
+	if (!skb) {
+		/* force a pass through old_flows to prevent starvation */
+		if (head == &q->new_flows && !list_empty(&q->old_flows))
+			list_move_tail(&flow->flowchain, &q->old_flows);
+		else
+			list_del_init(&flow->flowchain);
+		goto begin;
+	}
+
+	flow->qlen--;
+	flow->deficit -= pkt_len;
+	flow->backlog -= pkt_len;
+	q->memory_usage -= get_pie_cb(skb)->mem_usage;
+	pie_process_dequeue(skb, &q->p_params, &flow->vars, flow->backlog);
+	return skb;
+}
+
+static int fq_pie_change(struct Qdisc *sch, struct nlattr *opt,
+			 struct netlink_ext_ack *extack)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	struct nlattr *tb[TCA_FQ_PIE_MAX + 1];
+	unsigned int len_dropped = 0;
+	unsigned int num_dropped = 0;
+	int err;
+
+	if (!opt)
+		return -EINVAL;
+
+	err = nla_parse_nested(tb, TCA_FQ_PIE_MAX, opt, fq_pie_policy, extack);
+	if (err < 0)
+		return err;
+
+	sch_tree_lock(sch);
+	if (tb[TCA_FQ_PIE_LIMIT]) {
+		u32 limit = nla_get_u32(tb[TCA_FQ_PIE_LIMIT]);
+
+		q->p_params.limit = limit;
+		sch->limit = limit;
+	}
+	if (tb[TCA_FQ_PIE_FLOWS]) {
+		if (q->flows) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "Number of flows cannot be changed");
+			goto flow_error;
+		}
+		q->flows_cnt = nla_get_u32(tb[TCA_FQ_PIE_FLOWS]);
+		if (!q->flows_cnt || q->flows_cnt > 65536) {
+			NL_SET_ERR_MSG_MOD(extack,
+					   "Number of flows must range in [1..65536]");
+			goto flow_error;
+		}
+	}
+
+	/* convert from microseconds to pschedtime */
+	if (tb[TCA_FQ_PIE_TARGET]) {
+		/* target is in us */
+		u32 target = nla_get_u32(tb[TCA_FQ_PIE_TARGET]);
+
+		/* convert to pschedtime */
+		q->p_params.target =
+			PSCHED_NS2TICKS((u64)target * NSEC_PER_USEC);
+	}
+
+	/* tupdate is in jiffies */
+	if (tb[TCA_FQ_PIE_TUPDATE])
+		q->p_params.tupdate =
+			usecs_to_jiffies(nla_get_u32(tb[TCA_FQ_PIE_TUPDATE]));
+
+	if (tb[TCA_FQ_PIE_ALPHA])
+		q->p_params.alpha = nla_get_u32(tb[TCA_FQ_PIE_ALPHA]);
+
+	if (tb[TCA_FQ_PIE_BETA])
+		q->p_params.beta = nla_get_u32(tb[TCA_FQ_PIE_BETA]);
+
+	if (tb[TCA_FQ_PIE_QUANTUM])
+		q->quantum = nla_get_u32(tb[TCA_FQ_PIE_QUANTUM]);
+
+	if (tb[TCA_FQ_PIE_MEMORY_LIMIT])
+		q->memory_limit = nla_get_u32(tb[TCA_FQ_PIE_MEMORY_LIMIT]);
+
+	if (tb[TCA_FQ_PIE_ECN_PROB])
+		q->ecn_prob = nla_get_u32(tb[TCA_FQ_PIE_ECN_PROB]);
+
+	if (tb[TCA_FQ_PIE_ECN])
+		q->p_params.ecn = nla_get_u32(tb[TCA_FQ_PIE_ECN]);
+
+	if (tb[TCA_FQ_PIE_BYTEMODE])
+		q->p_params.bytemode = nla_get_u32(tb[TCA_FQ_PIE_BYTEMODE]);
+
+	if (tb[TCA_FQ_PIE_DQ_RATE_ESTIMATOR])
+		q->p_params.dq_rate_estimator =
+			nla_get_u32(tb[TCA_FQ_PIE_DQ_RATE_ESTIMATOR]);
+
+	/* Drop excess packets if new limit is lower */
+	while (sch->q.qlen > sch->limit) {
+		struct sk_buff *skb = fq_pie_qdisc_dequeue(sch);
+
+		len_dropped += qdisc_pkt_len(skb);
+		num_dropped += 1;
+		kfree_skb(skb);
+	}
+	qdisc_tree_reduce_backlog(sch, num_dropped, len_dropped);
+
+	sch_tree_unlock(sch);
+	return 0;
+
+flow_error:
+	sch_tree_unlock(sch);
+	return -EINVAL;
+}
+
+static void fq_pie_timer(struct timer_list *t)
+{
+	struct fq_pie_sched_data *q = from_timer(q, t, adapt_timer);
+	struct Qdisc *sch = q->sch;
+	spinlock_t *root_lock; /* to lock qdisc for probability calculations */
+	u32 idx;
+
+	root_lock = qdisc_lock(qdisc_root_sleeping(sch));
+	spin_lock(root_lock);
+
+	for (idx = 0; idx < q->flows_cnt; idx++)
+		pie_calculate_probability(&q->p_params, &q->flows[idx].vars,
+					  q->flows[idx].backlog);
+
+	/* reset the timer to fire after 'tupdate' jiffies. */
+	if (q->p_params.tupdate)
+		mod_timer(&q->adapt_timer, jiffies + q->p_params.tupdate);
+
+	spin_unlock(root_lock);
+}
+
+static int fq_pie_init(struct Qdisc *sch, struct nlattr *opt,
+		       struct netlink_ext_ack *extack)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	int err;
+	u32 idx;
+
+	pie_params_init(&q->p_params);
+	sch->limit = 10 * 1024;
+	q->p_params.limit = sch->limit;
+	q->quantum = psched_mtu(qdisc_dev(sch));
+	q->sch = sch;
+	q->ecn_prob = 10;
+	q->flows_cnt = 1024;
+	q->memory_limit = SZ_32M;
+
+	INIT_LIST_HEAD(&q->new_flows);
+	INIT_LIST_HEAD(&q->old_flows);
+
+	if (opt) {
+		err = fq_pie_change(sch, opt, extack);
+
+		if (err)
+			return err;
+	}
+
+	err = tcf_block_get(&q->block, &q->filter_list, sch, extack);
+	if (err)
+		goto init_failure;
+
+	q->flows = kvcalloc(q->flows_cnt, sizeof(struct fq_pie_flow),
+			    GFP_KERNEL);
+	if (!q->flows) {
+		err = -ENOMEM;
+		goto init_failure;
+	}
+	for (idx = 0; idx < q->flows_cnt; idx++) {
+		struct fq_pie_flow *flow = q->flows + idx;
+
+		INIT_LIST_HEAD(&flow->flowchain);
+		pie_vars_init(&flow->vars);
+	}
+
+	timer_setup(&q->adapt_timer, fq_pie_timer, 0);
+	mod_timer(&q->adapt_timer, jiffies + HZ / 2);
+
+	return 0;
+
+init_failure:
+	q->flows_cnt = 0;
+
+	return err;
+}
+
+static int fq_pie_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	struct nlattr *opts;
+
+	opts = nla_nest_start(skb, TCA_OPTIONS);
+	if (!opts)
+		return -EMSGSIZE;
+
+	/* convert target from pschedtime to us */
+	if (nla_put_u32(skb, TCA_FQ_PIE_LIMIT, sch->limit) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_FLOWS, q->flows_cnt) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_TARGET,
+			((u32)PSCHED_TICKS2NS(q->p_params.target)) /
+			NSEC_PER_USEC) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_TUPDATE,
+			jiffies_to_usecs(q->p_params.tupdate)) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_ALPHA, q->p_params.alpha) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_BETA, q->p_params.beta) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_QUANTUM, q->quantum) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_MEMORY_LIMIT, q->memory_limit) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_ECN_PROB, q->ecn_prob) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_ECN, q->p_params.ecn) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_BYTEMODE, q->p_params.bytemode) ||
+	    nla_put_u32(skb, TCA_FQ_PIE_DQ_RATE_ESTIMATOR,
+			q->p_params.dq_rate_estimator))
+		goto nla_put_failure;
+
+	return nla_nest_end(skb, opts);
+
+nla_put_failure:
+	nla_nest_cancel(skb, opts);
+	return -EMSGSIZE;
+}
+
+static int fq_pie_dump_stats(struct Qdisc *sch, struct gnet_dump *d)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	struct tc_fq_pie_xstats st = {
+		.packets_in	= q->stats.packets_in,
+		.overlimit	= q->stats.overlimit,
+		.overmemory	= q->overmemory,
+		.dropped	= q->stats.dropped,
+		.ecn_mark	= q->stats.ecn_mark,
+		.new_flow_count = q->new_flow_count,
+		.memory_usage   = q->memory_usage,
+	};
+	struct list_head *pos;
+
+	sch_tree_lock(sch);
+	list_for_each(pos, &q->new_flows)
+		st.new_flows_len++;
+
+	list_for_each(pos, &q->old_flows)
+		st.old_flows_len++;
+	sch_tree_unlock(sch);
+
+	return gnet_stats_copy_app(d, &st, sizeof(st));
+}
+
+static void fq_pie_reset(struct Qdisc *sch)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+	u32 idx;
+
+	INIT_LIST_HEAD(&q->new_flows);
+	INIT_LIST_HEAD(&q->old_flows);
+	for (idx = 0; idx < q->flows_cnt; idx++) {
+		struct fq_pie_flow *flow = q->flows + idx;
+
+		/* Removes all packets from flow */
+		rtnl_kfree_skbs(flow->head, flow->tail);
+		flow->head = NULL;
+
+		INIT_LIST_HEAD(&flow->flowchain);
+		pie_vars_init(&flow->vars);
+	}
+
+	sch->q.qlen = 0;
+	sch->qstats.backlog = 0;
+}
+
+static void fq_pie_destroy(struct Qdisc *sch)
+{
+	struct fq_pie_sched_data *q = qdisc_priv(sch);
+
+	tcf_block_put(q->block);
+	del_timer_sync(&q->adapt_timer);
+	kvfree(q->flows);
+}
+
+static struct Qdisc_ops fq_pie_qdisc_ops __read_mostly = {
+	.id		= "fq_pie",
+	.priv_size	= sizeof(struct fq_pie_sched_data),
+	.enqueue	= fq_pie_qdisc_enqueue,
+	.dequeue	= fq_pie_qdisc_dequeue,
+	.peek		= qdisc_peek_dequeued,
+	.init		= fq_pie_init,
+	.destroy	= fq_pie_destroy,
+	.reset		= fq_pie_reset,
+	.change		= fq_pie_change,
+	.dump		= fq_pie_dump,
+	.dump_stats	= fq_pie_dump_stats,
+	.owner		= THIS_MODULE,
+};
+
+static int __init fq_pie_module_init(void)
+{
+	return register_qdisc(&fq_pie_qdisc_ops);
+}
+
+static void __exit fq_pie_module_exit(void)
+{
+	unregister_qdisc(&fq_pie_qdisc_ops);
+}
+
+module_init(fq_pie_module_init);
+module_exit(fq_pie_module_exit);
+
+MODULE_DESCRIPTION("Flow Queue Proportional Integral controller Enhanced (FQ-PIE)");
+MODULE_AUTHOR("Mohit P. Tahiliani");
+MODULE_LICENSE("GPL");
-- 
cgit v1.2.3


From a702a692cd7559053ea573f4e2c84828f0e62824 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 24 Jan 2020 01:01:27 +0800
Subject: bcache: use a separate data structure for the on-disk super block

Split out an on-disk version of struct cache_sb with the proper endianness
annotations.  This fixes a fair chunk of sparse warnings, but there are
some left due to the way the checksum is defined.
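
A minimal userspace sketch of the idea, for illustration only: the
on-disk layout holds little-endian bytes, and each field is converted
once when it is copied into the in-memory structure.  The names
sb_disk_demo, sb_demo, load_le64() and sb_from_disk() are hypothetical
and model just two fields; the kernel code uses __le64 fields and
le64_to_cpu().

```c
#include <stdint.h>

/* Decode a little-endian 64-bit value from raw bytes, independent of the
 * host byte order (a userspace analogue of le64_to_cpu()).
 */
static uint64_t load_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/* Hypothetical, heavily trimmed on-disk layout: raw little-endian bytes. */
struct sb_disk_demo {
	uint8_t offset[8];
	uint8_t version[8];
};

/* Hypothetical in-memory counterpart: native-endian integers. */
struct sb_demo {
	uint64_t offset;
	uint64_t version;
};

/* Convert every field exactly once, at the disk/memory boundary. */
static void sb_from_disk(struct sb_demo *sb, const struct sb_disk_demo *d)
{
	sb->offset  = load_le64(d->offset);
	sb->version = load_le64(d->version);
}
```

Keeping the two layouts as distinct types means a missing conversion
shows up as a type mismatch, which is what lets sparse flag it.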

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/md/bcache/super.c   |  6 +++---
 include/uapi/linux/bcache.h | 51 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+), 3 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index a573ce1d85aa..3045f27e0d67 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -63,14 +63,14 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
 			      struct page **res)
 {
 	const char *err;
-	struct cache_sb *s;
+	struct cache_sb_disk *s;
 	struct buffer_head *bh = __bread(bdev, 1, SB_SIZE);
 	unsigned int i;
 
 	if (!bh)
 		return "IO error";
 
-	s = (struct cache_sb *) bh->b_data;
+	s = (struct cache_sb_disk *)bh->b_data;
 
 	sb->offset		= le64_to_cpu(s->offset);
 	sb->version		= le64_to_cpu(s->version);
@@ -209,7 +209,7 @@ static void write_bdev_super_endio(struct bio *bio)
 
 static void __write_super(struct cache_sb *sb, struct bio *bio)
 {
-	struct cache_sb *out = page_address(bio_first_page_all(bio));
+	struct cache_sb_disk *out = page_address(bio_first_page_all(bio));
 	unsigned int i;
 
 	bio->bi_iter.bi_sector	= SB_SECTOR;
diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
index 5d4f58e059fd..1d8b3a9fc080 100644
--- a/include/uapi/linux/bcache.h
+++ b/include/uapi/linux/bcache.h
@@ -156,6 +156,57 @@ static inline struct bkey *bkey_idx(const struct bkey *k, unsigned int nr_keys)
 
 #define BDEV_DATA_START_DEFAULT		16	/* sectors */
 
+struct cache_sb_disk {
+	__le64			csum;
+	__le64			offset;	/* sector where this sb was written */
+	__le64			version;
+
+	__u8			magic[16];
+
+	__u8			uuid[16];
+	union {
+		__u8		set_uuid[16];
+		__le64		set_magic;
+	};
+	__u8			label[SB_LABEL_SIZE];
+
+	__le64			flags;
+	__le64			seq;
+	__le64			pad[8];
+
+	union {
+	struct {
+		/* Cache devices */
+		__le64		nbuckets;	/* device size */
+
+		__le16		block_size;	/* sectors */
+		__le16		bucket_size;	/* sectors */
+
+		__le16		nr_in_set;
+		__le16		nr_this_dev;
+	};
+	struct {
+		/* Backing devices */
+		__le64		data_offset;
+
+		/*
+		 * block_size from the cache device section is still used by
+		 * backing devices, so don't add anything here until we fix
+		 * things to not need it for backing devices anymore
+		 */
+	};
+	};
+
+	__le32			last_mount;	/* time overflow in y2106 */
+
+	__le16			first_bucket;
+	union {
+		__le16		njournal_buckets;
+		__le16		keys;
+	};
+	__le64			d[SB_JOURNAL_BUCKETS];	/* journal buckets */
+};
+
 struct cache_sb {
 	__u64			csum;
 	__u64			offset;	/* sector where this sb was written */
-- 
cgit v1.2.3


From 6321bef028de43724c47cfa7f9dee69ecb783796 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Fri, 24 Jan 2020 01:01:34 +0800
Subject: bcache: use read_cache_page_gfp to read the superblock

Avoid a pointless dependency on buffer heads in bcache by simply open
coding reading a single page.  Also add a SB_OFFSET define for the
byte offset of the superblock instead of using magic numbers.
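
The offset arithmetic can be checked with a small userspace sketch.
SB_SECTOR and the 512-byte sector shift follow the patch; the page shift
is passed in as a parameter because the page size depends on the
architecture, and the helper names sb_page_index() and
sb_offset_in_page() are hypothetical.

```c
#define SB_SECTOR	8
#define SECTOR_SHIFT	9	/* 512-byte sectors */
#define SB_OFFSET	(SB_SECTOR << SECTOR_SHIFT)

/* Index of the page cache page that holds the superblock. */
static unsigned long sb_page_index(unsigned int page_shift)
{
	return (unsigned long)SB_OFFSET >> page_shift;
}

/* Byte offset of the superblock inside that page. */
static unsigned long sb_offset_in_page(unsigned int page_shift)
{
	return (unsigned long)SB_OFFSET & ((1UL << page_shift) - 1);
}
```

With 4 KiB pages (shift 12) the superblock starts at byte 0 of page 1;
with 64 KiB pages (shift 16) it sits at byte 4096 of page 0, which is
why the in-page offset has to be added to page_address().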

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Coly Li <colyli@suse.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 drivers/md/bcache/super.c   | 16 +++++++---------
 include/uapi/linux/bcache.h |  1 +
 2 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 5797c03f993e..3dea1d5acd5c 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -15,7 +15,6 @@
 #include "writeback.h"
 
 #include <linux/blkdev.h>
-#include <linux/buffer_head.h>
 #include <linux/debugfs.h>
 #include <linux/genhd.h>
 #include <linux/idr.h>
@@ -64,13 +63,14 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
 {
 	const char *err;
 	struct cache_sb_disk *s;
-	struct buffer_head *bh = __bread(bdev, 1, SB_SIZE);
+	struct page *page;
 	unsigned int i;
 
-	if (!bh)
+	page = read_cache_page_gfp(bdev->bd_inode->i_mapping,
+				   SB_OFFSET >> PAGE_SHIFT, GFP_KERNEL);
+	if (IS_ERR(page))
 		return "IO error";
-
-	s = (struct cache_sb_disk *)bh->b_data;
+	s = page_address(page) + offset_in_page(SB_OFFSET);
 
 	sb->offset		= le64_to_cpu(s->offset);
 	sb->version		= le64_to_cpu(s->version);
@@ -188,12 +188,10 @@ static const char *read_super(struct cache_sb *sb, struct block_device *bdev,
 	}
 
 	sb->last_mount = (u32)ktime_get_real_seconds();
-	err = NULL;
-
-	get_page(bh->b_page);
 	*res = s;
+	return NULL;
 err:
-	put_bh(bh);
+	put_page(page);
 	return err;
 }
 
diff --git a/include/uapi/linux/bcache.h b/include/uapi/linux/bcache.h
index 1d8b3a9fc080..9a1965c6c3d0 100644
--- a/include/uapi/linux/bcache.h
+++ b/include/uapi/linux/bcache.h
@@ -148,6 +148,7 @@ static inline struct bkey *bkey_idx(const struct bkey *k, unsigned int nr_keys)
 #define BCACHE_SB_MAX_VERSION		4
 
 #define SB_SECTOR			8
+#define SB_OFFSET			(SB_SECTOR << SECTOR_SHIFT)
 #define SB_SIZE				4096
 #define SB_LABEL_SIZE			32
 #define SB_JOURNAL_BUCKETS		256U
-- 
cgit v1.2.3


From bfe1d56091c1a404b3d4ce7e9809d745fc4453bb Mon Sep 17 00:00:00 2001
From: Dave Jiang <dave.jiang@intel.com>
Date: Tue, 21 Jan 2020 16:43:59 -0700
Subject: dmaengine: idxd: Init and probe for Intel data accelerators

The idxd driver adds support for the Intel Data Streaming Accelerator [1],
which will be available on future Intel Xeon CPUs. One of the kernel access
points for the driver is the dmaengine subsystem, through which it will
initially provide a DMA copy service to the kernel.

The main features introduced with this accelerator are shared virtual
memory (SVM) support and descriptor submission using the Intel CPU
instructions movdir64b and enqcmds. Additional accelerator devices will
share the same driver, with variations in capabilities.

This commit introduces the probe and initialization component of the
driver.
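
Two bit encodings used by the configuration code in device.c can be
sketched in plain C. This is a userspace illustration under assumptions,
not driver code: the helper names are hypothetical, and the reading of the
disable-WQ operand as "a bit within a bank of 16, plus a bank index" is
inferred from the expression in idxd_wq_disable(), not from the
specification.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Operand built in idxd_wq_disable():
 *   BIT(wq->id % 16) | ((wq->id / 16) << 16)
 * Low 16 bits: one bit per WQ in a bank of 16; upper bits: bank index.
 */
static inline uint32_t wq_disable_operand(unsigned int wq_id)
{
	return (UINT32_C(1) << (wq_id % 16)) | ((uint32_t)(wq_id / 16) << 16);
}

/*
 * Group membership as set in idxd_wqs_setup(): each group carries four
 * 64-bit words, so WQ n lives in word n / 64 at bit n % 64.
 * Returns the updated word (hypothetical convenience for callers).
 */
static inline uint64_t group_add_wq(uint64_t wqs[4], unsigned int wq_id)
{
	assert(wq_id < 4 * 64);
	wqs[wq_id / 64] |= UINT64_C(1) << (wq_id % 64);
	return wqs[wq_id / 64];
}
```

For example, WQ 17 yields the operand 0x10002 (bank 1, bit 1), and adding
WQ 65 to an empty group sets bit 1 of the second bitmap word.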

[1]: https://software.intel.com/en-us/download/intel-data-streaming-accelerator-preliminary-architecture-specification

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/157965023991.73301.6186843973135311580.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Vinod Koul <vkoul@kernel.org>
---
 MAINTAINERS                  |   8 +
 drivers/dma/Kconfig          |  13 +
 drivers/dma/Makefile         |   1 +
 drivers/dma/idxd/Makefile    |   2 +
 drivers/dma/idxd/device.c    | 657 +++++++++++++++++++++++++++++++++++++++++++
 drivers/dma/idxd/idxd.h      | 225 +++++++++++++++
 drivers/dma/idxd/init.c      | 468 ++++++++++++++++++++++++++++++
 drivers/dma/idxd/irq.c       | 156 ++++++++++
 drivers/dma/idxd/registers.h | 335 ++++++++++++++++++++++
 include/uapi/linux/idxd.h    | 228 +++++++++++++++
 10 files changed, 2093 insertions(+)
 create mode 100644 drivers/dma/idxd/Makefile
 create mode 100644 drivers/dma/idxd/device.c
 create mode 100644 drivers/dma/idxd/idxd.h
 create mode 100644 drivers/dma/idxd/init.c
 create mode 100644 drivers/dma/idxd/irq.c
 create mode 100644 drivers/dma/idxd/registers.h
 create mode 100644 include/uapi/linux/idxd.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 76713226f256..af08cfe31c0f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8375,6 +8375,14 @@ Q:	https://patchwork.kernel.org/project/linux-dmaengine/list/
 S:	Supported
 F:	drivers/dma/ioat*
 
+INTEL IDXD DRIVER
+M:	Dave Jiang <dave.jiang@intel.com>
+L:	dmaengine@vger.kernel.org
+S:	Supported
+F:	drivers/dma/idxd/*
+F:	include/uapi/linux/idxd.h
+F:	include/linux/idxd.h
+
 INTEL IDLE DRIVER
 M:	Jacob Pan <jacob.jun.pan@linux.intel.com>
 M:	Len Brown <lenb@kernel.org>
diff --git a/drivers/dma/Kconfig b/drivers/dma/Kconfig
index 312a6cc36c78..a8f8e9552885 100644
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -273,6 +273,19 @@ config INTEL_IDMA64
 	  Enable DMA support for Intel Low Power Subsystem such as found on
 	  Intel Skylake PCH.
 
+config INTEL_IDXD
+	tristate "Intel Data Accelerators support"
+	depends on PCI && X86_64
+	select DMA_ENGINE
+	select SBITMAP
+	help
+	  Enable support for the Intel(R) data accelerators present
+	  in Intel Xeon CPUs.
+
+	  Say Y if you have such a platform.
+
+	  If unsure, say N.
+
 config INTEL_IOATDMA
 	tristate "Intel I/OAT DMA support"
 	depends on PCI && X86_64
diff --git a/drivers/dma/Makefile b/drivers/dma/Makefile
index a150d1d792fd..461d77c4a839 100644
--- a/drivers/dma/Makefile
+++ b/drivers/dma/Makefile
@@ -41,6 +41,7 @@ obj-$(CONFIG_IMX_DMA) += imx-dma.o
 obj-$(CONFIG_IMX_SDMA) += imx-sdma.o
 obj-$(CONFIG_INTEL_IDMA64) += idma64.o
 obj-$(CONFIG_INTEL_IOATDMA) += ioat/
+obj-$(CONFIG_INTEL_IDXD) += idxd/
 obj-$(CONFIG_INTEL_IOP_ADMA) += iop-adma.o
 obj-$(CONFIG_INTEL_MIC_X100_DMA) += mic_x100_dma.o
 obj-$(CONFIG_K3_DMA) += k3dma.o
diff --git a/drivers/dma/idxd/Makefile b/drivers/dma/idxd/Makefile
new file mode 100644
index 000000000000..0dd1ca77513f
--- /dev/null
+++ b/drivers/dma/idxd/Makefile
@@ -0,0 +1,2 @@
+obj-$(CONFIG_INTEL_IDXD) += idxd.o
+idxd-y := init.o irq.o device.o
diff --git a/drivers/dma/idxd/device.c b/drivers/dma/idxd/device.c
new file mode 100644
index 000000000000..af2bdc18df3d
--- /dev/null
+++ b/drivers/dma/idxd/device.c
@@ -0,0 +1,657 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <uapi/linux/idxd.h>
+#include "idxd.h"
+#include "registers.h"
+
+static int idxd_cmd_wait(struct idxd_device *idxd, u32 *status, int timeout);
+static int idxd_cmd_send(struct idxd_device *idxd, int cmd_code, u32 operand);
+
+/* Interrupt control bits */
+int idxd_mask_msix_vector(struct idxd_device *idxd, int vec_id)
+{
+	struct pci_dev *pdev = idxd->pdev;
+	int msixcnt = pci_msix_vec_count(pdev);
+	union msix_perm perm;
+	u32 offset;
+
+	if (vec_id < 0 || vec_id >= msixcnt)
+		return -EINVAL;
+
+	offset = idxd->msix_perm_offset + vec_id * 8;
+	perm.bits = ioread32(idxd->reg_base + offset);
+	perm.ignore = 1;
+	iowrite32(perm.bits, idxd->reg_base + offset);
+
+	return 0;
+}
+
+void idxd_mask_msix_vectors(struct idxd_device *idxd)
+{
+	struct pci_dev *pdev = idxd->pdev;
+	int msixcnt = pci_msix_vec_count(pdev);
+	int i, rc;
+
+	for (i = 0; i < msixcnt; i++) {
+		rc = idxd_mask_msix_vector(idxd, i);
+		if (rc < 0)
+			dev_warn(&pdev->dev,
+				 "Failed disabling msix vec %d\n", i);
+	}
+}
+
+int idxd_unmask_msix_vector(struct idxd_device *idxd, int vec_id)
+{
+	struct pci_dev *pdev = idxd->pdev;
+	int msixcnt = pci_msix_vec_count(pdev);
+	union msix_perm perm;
+	u32 offset;
+
+	if (vec_id < 0 || vec_id >= msixcnt)
+		return -EINVAL;
+
+	offset = idxd->msix_perm_offset + vec_id * 8;
+	perm.bits = ioread32(idxd->reg_base + offset);
+	perm.ignore = 0;
+	iowrite32(perm.bits, idxd->reg_base + offset);
+
+	return 0;
+}
+
+void idxd_unmask_error_interrupts(struct idxd_device *idxd)
+{
+	union genctrl_reg genctrl;
+
+	genctrl.bits = ioread32(idxd->reg_base + IDXD_GENCTRL_OFFSET);
+	genctrl.softerr_int_en = 1;
+	iowrite32(genctrl.bits, idxd->reg_base + IDXD_GENCTRL_OFFSET);
+}
+
+void idxd_mask_error_interrupts(struct idxd_device *idxd)
+{
+	union genctrl_reg genctrl;
+
+	genctrl.bits = ioread32(idxd->reg_base + IDXD_GENCTRL_OFFSET);
+	genctrl.softerr_int_en = 0;
+	iowrite32(genctrl.bits, idxd->reg_base + IDXD_GENCTRL_OFFSET);
+}
+
+static void free_hw_descs(struct idxd_wq *wq)
+{
+	int i;
+
+	for (i = 0; i < wq->num_descs; i++)
+		kfree(wq->hw_descs[i]);
+
+	kfree(wq->hw_descs);
+}
+
+static int alloc_hw_descs(struct idxd_wq *wq, int num)
+{
+	struct device *dev = &wq->idxd->pdev->dev;
+	int i;
+	int node = dev_to_node(dev);
+
+	wq->hw_descs = kcalloc_node(num, sizeof(struct dsa_hw_desc *),
+				    GFP_KERNEL, node);
+	if (!wq->hw_descs)
+		return -ENOMEM;
+
+	for (i = 0; i < num; i++) {
+		wq->hw_descs[i] = kzalloc_node(sizeof(*wq->hw_descs[i]),
+					       GFP_KERNEL, node);
+		if (!wq->hw_descs[i]) {
+			free_hw_descs(wq);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+static void free_descs(struct idxd_wq *wq)
+{
+	int i;
+
+	for (i = 0; i < wq->num_descs; i++)
+		kfree(wq->descs[i]);
+
+	kfree(wq->descs);
+}
+
+static int alloc_descs(struct idxd_wq *wq, int num)
+{
+	struct device *dev = &wq->idxd->pdev->dev;
+	int i;
+	int node = dev_to_node(dev);
+
+	wq->descs = kcalloc_node(num, sizeof(struct idxd_desc *),
+				 GFP_KERNEL, node);
+	if (!wq->descs)
+		return -ENOMEM;
+
+	for (i = 0; i < num; i++) {
+		wq->descs[i] = kzalloc_node(sizeof(*wq->descs[i]),
+					    GFP_KERNEL, node);
+		if (!wq->descs[i]) {
+			free_descs(wq);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
+/* WQ control bits */
+int idxd_wq_alloc_resources(struct idxd_wq *wq)
+{
+	struct idxd_device *idxd = wq->idxd;
+	struct idxd_group *group = wq->group;
+	struct device *dev = &idxd->pdev->dev;
+	int rc, num_descs, i;
+
+	num_descs = wq->size +
+		idxd->hw.gen_cap.max_descs_per_engine * group->num_engines;
+	wq->num_descs = num_descs;
+
+	rc = alloc_hw_descs(wq, num_descs);
+	if (rc < 0)
+		return rc;
+
+	wq->compls_size = num_descs * sizeof(struct dsa_completion_record);
+	wq->compls = dma_alloc_coherent(dev, wq->compls_size,
+					&wq->compls_addr, GFP_KERNEL);
+	if (!wq->compls) {
+		rc = -ENOMEM;
+		goto fail_alloc_compls;
+	}
+
+	rc = alloc_descs(wq, num_descs);
+	if (rc < 0)
+		goto fail_alloc_descs;
+
+	rc = sbitmap_init_node(&wq->sbmap, num_descs, -1, GFP_KERNEL,
+			       dev_to_node(dev));
+	if (rc < 0)
+		goto fail_sbitmap_init;
+
+	for (i = 0; i < num_descs; i++) {
+		struct idxd_desc *desc = wq->descs[i];
+
+		desc->hw = wq->hw_descs[i];
+		desc->completion = &wq->compls[i];
+		desc->compl_dma  = wq->compls_addr +
+			sizeof(struct dsa_completion_record) * i;
+		desc->id = i;
+		desc->wq = wq;
+	}
+
+	return 0;
+
+ fail_sbitmap_init:
+	free_descs(wq);
+ fail_alloc_descs:
+	dma_free_coherent(dev, wq->compls_size, wq->compls, wq->compls_addr);
+ fail_alloc_compls:
+	free_hw_descs(wq);
+	return rc;
+}
+
+void idxd_wq_free_resources(struct idxd_wq *wq)
+{
+	struct device *dev = &wq->idxd->pdev->dev;
+
+	free_hw_descs(wq);
+	free_descs(wq);
+	dma_free_coherent(dev, wq->compls_size, wq->compls, wq->compls_addr);
+	sbitmap_free(&wq->sbmap);
+}
+
+int idxd_wq_enable(struct idxd_wq *wq)
+{
+	struct idxd_device *idxd = wq->idxd;
+	struct device *dev = &idxd->pdev->dev;
+	u32 status;
+	int rc;
+
+	lockdep_assert_held(&idxd->dev_lock);
+
+	if (wq->state == IDXD_WQ_ENABLED) {
+		dev_dbg(dev, "WQ %d already enabled\n", wq->id);
+		return -ENXIO;
+	}
+
+	rc = idxd_cmd_send(idxd, IDXD_CMD_ENABLE_WQ, wq->id);
+	if (rc < 0)
+		return rc;
+	rc = idxd_cmd_wait(idxd, &status, IDXD_REG_TIMEOUT);
+	if (rc < 0)
+		return rc;
+
+	if (status != IDXD_CMDSTS_SUCCESS &&
+	    status != IDXD_CMDSTS_ERR_WQ_ENABLED) {
+		dev_dbg(dev, "WQ enable failed: %#x\n", status);
+		return -ENXIO;
+	}
+
+	wq->state = IDXD_WQ_ENABLED;
+	dev_dbg(dev, "WQ %d enabled\n", wq->id);
+	return 0;
+}
+
+int idxd_wq_disable(struct idxd_wq *wq)
+{
+	struct idxd_device *idxd = wq->idxd;
+	struct device *dev = &idxd->pdev->dev;
+	u32 status, operand;
+	int rc;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	dev_dbg(dev, "Disabling WQ %d\n", wq->id);
+
+	if (wq->state != IDXD_WQ_ENABLED) {
+		dev_dbg(dev, "WQ %d in wrong state: %d\n", wq->id, wq->state);
+		return 0;
+	}
+
+	operand = BIT(wq->id % 16) | ((wq->id / 16) << 16);
+	rc = idxd_cmd_send(idxd, IDXD_CMD_DISABLE_WQ, operand);
+	if (rc < 0)
+		return rc;
+	rc = idxd_cmd_wait(idxd, &status, IDXD_REG_TIMEOUT);
+	if (rc < 0)
+		return rc;
+
+	if (status != IDXD_CMDSTS_SUCCESS) {
+		dev_dbg(dev, "WQ disable failed: %#x\n", status);
+		return -ENXIO;
+	}
+
+	wq->state = IDXD_WQ_DISABLED;
+	dev_dbg(dev, "WQ %d disabled\n", wq->id);
+	return 0;
+}
+
+/* Device control bits */
+static inline bool idxd_is_enabled(struct idxd_device *idxd)
+{
+	union gensts_reg gensts;
+
+	gensts.bits = ioread32(idxd->reg_base + IDXD_GENSTATS_OFFSET);
+
+	if (gensts.state == IDXD_DEVICE_STATE_ENABLED)
+		return true;
+	return false;
+}
+
+static int idxd_cmd_wait(struct idxd_device *idxd, u32 *status, int timeout)
+{
+	u32 sts, to = timeout;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	sts = ioread32(idxd->reg_base + IDXD_CMDSTS_OFFSET);
+	while (sts & IDXD_CMDSTS_ACTIVE && --to) {
+		cpu_relax();
+		sts = ioread32(idxd->reg_base + IDXD_CMDSTS_OFFSET);
+	}
+
+	if (to == 0 && sts & IDXD_CMDSTS_ACTIVE) {
+		dev_warn(&idxd->pdev->dev, "%s timed out!\n", __func__);
+		*status = 0;
+		return -EBUSY;
+	}
+
+	*status = sts;
+	return 0;
+}
+
+static int idxd_cmd_send(struct idxd_device *idxd, int cmd_code, u32 operand)
+{
+	union idxd_command_reg cmd;
+	int rc;
+	u32 status;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	rc = idxd_cmd_wait(idxd, &status, IDXD_REG_TIMEOUT);
+	if (rc < 0)
+		return rc;
+
+	memset(&cmd, 0, sizeof(cmd));
+	cmd.cmd = cmd_code;
+	cmd.operand = operand;
+	dev_dbg(&idxd->pdev->dev, "%s: sending cmd: %#x op: %#x\n",
+		__func__, cmd_code, operand);
+	iowrite32(cmd.bits, idxd->reg_base + IDXD_CMD_OFFSET);
+
+	return 0;
+}
+
+int idxd_device_enable(struct idxd_device *idxd)
+{
+	struct device *dev = &idxd->pdev->dev;
+	int rc;
+	u32 status;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	if (idxd_is_enabled(idxd)) {
+		dev_dbg(dev, "Device already enabled\n");
+		return -ENXIO;
+	}
+
+	rc = idxd_cmd_send(idxd, IDXD_CMD_ENABLE_DEVICE, 0);
+	if (rc < 0)
+		return rc;
+	rc = idxd_cmd_wait(idxd, &status, IDXD_REG_TIMEOUT);
+	if (rc < 0)
+		return rc;
+
+	/* If the command is successful or if the device was enabled */
+	if (status != IDXD_CMDSTS_SUCCESS &&
+	    status != IDXD_CMDSTS_ERR_DEV_ENABLED) {
+		dev_dbg(dev, "%s: err_code: %#x\n", __func__, status);
+		return -ENXIO;
+	}
+
+	idxd->state = IDXD_DEV_ENABLED;
+	return 0;
+}
+
+int idxd_device_disable(struct idxd_device *idxd)
+{
+	struct device *dev = &idxd->pdev->dev;
+	int rc;
+	u32 status;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	if (!idxd_is_enabled(idxd)) {
+		dev_dbg(dev, "Device is not enabled\n");
+		return 0;
+	}
+
+	rc = idxd_cmd_send(idxd, IDXD_CMD_DISABLE_DEVICE, 0);
+	if (rc < 0)
+		return rc;
+	rc = idxd_cmd_wait(idxd, &status, IDXD_REG_TIMEOUT);
+	if (rc < 0)
+		return rc;
+
+	/* If the command is successful or if the device was disabled */
+	if (status != IDXD_CMDSTS_SUCCESS &&
+	    !(status & IDXD_CMDSTS_ERR_DIS_DEV_EN)) {
+		dev_dbg(dev, "%s: err_code: %#x\n", __func__, status);
+		rc = -ENXIO;
+		return rc;
+	}
+
+	idxd->state = IDXD_DEV_CONF_READY;
+	return 0;
+}
+
+int __idxd_device_reset(struct idxd_device *idxd)
+{
+	u32 status;
+	int rc;
+
+	rc = idxd_cmd_send(idxd, IDXD_CMD_RESET_DEVICE, 0);
+	if (rc < 0)
+		return rc;
+	rc = idxd_cmd_wait(idxd, &status, IDXD_REG_TIMEOUT);
+	if (rc < 0)
+		return rc;
+
+	return 0;
+}
+
+int idxd_device_reset(struct idxd_device *idxd)
+{
+	unsigned long flags;
+	int rc;
+
+	spin_lock_irqsave(&idxd->dev_lock, flags);
+	rc = __idxd_device_reset(idxd);
+	spin_unlock_irqrestore(&idxd->dev_lock, flags);
+	return rc;
+}
+
+/* Device configuration bits */
+static void idxd_group_config_write(struct idxd_group *group)
+{
+	struct idxd_device *idxd = group->idxd;
+	struct device *dev = &idxd->pdev->dev;
+	int i;
+	u32 grpcfg_offset;
+
+	dev_dbg(dev, "Writing group %d cfg registers\n", group->id);
+
+	/* setup GRPWQCFG */
+	for (i = 0; i < 4; i++) {
+		grpcfg_offset = idxd->grpcfg_offset +
+			group->id * 64 + i * sizeof(u64);
+		iowrite64(group->grpcfg.wqs[i],
+			  idxd->reg_base + grpcfg_offset);
+		dev_dbg(dev, "GRPCFG wq[%d:%d: %#x]: %#llx\n",
+			group->id, i, grpcfg_offset,
+			ioread64(idxd->reg_base + grpcfg_offset));
+	}
+
+	/* setup GRPENGCFG */
+	grpcfg_offset = idxd->grpcfg_offset + group->id * 64 + 32;
+	iowrite64(group->grpcfg.engines, idxd->reg_base + grpcfg_offset);
+	dev_dbg(dev, "GRPCFG engs[%d: %#x]: %#llx\n", group->id,
+		grpcfg_offset, ioread64(idxd->reg_base + grpcfg_offset));
+
+	/* setup GRPFLAGS */
+	grpcfg_offset = idxd->grpcfg_offset + group->id * 64 + 40;
+	iowrite32(group->grpcfg.flags.bits, idxd->reg_base + grpcfg_offset);
+	dev_dbg(dev, "GRPFLAGS flags[%d: %#x]: %#x\n",
+		group->id, grpcfg_offset,
+		ioread32(idxd->reg_base + grpcfg_offset));
+}
+
+static int idxd_groups_config_write(struct idxd_device *idxd)
+
+{
+	union gencfg_reg reg;
+	int i;
+	struct device *dev = &idxd->pdev->dev;
+
+	/* Setup bandwidth token limit */
+	if (idxd->token_limit) {
+		reg.bits = ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET);
+		reg.token_limit = idxd->token_limit;
+		iowrite32(reg.bits, idxd->reg_base + IDXD_GENCFG_OFFSET);
+	}
+
+	dev_dbg(dev, "GENCFG(%#x): %#x\n", IDXD_GENCFG_OFFSET,
+		ioread32(idxd->reg_base + IDXD_GENCFG_OFFSET));
+
+	for (i = 0; i < idxd->max_groups; i++) {
+		struct idxd_group *group = &idxd->groups[i];
+
+		idxd_group_config_write(group);
+	}
+
+	return 0;
+}
+
+static int idxd_wq_config_write(struct idxd_wq *wq)
+{
+	struct idxd_device *idxd = wq->idxd;
+	struct device *dev = &idxd->pdev->dev;
+	u32 wq_offset;
+	int i;
+
+	if (!wq->group)
+		return 0;
+
+	memset(&wq->wqcfg, 0, sizeof(union wqcfg));
+
+	/* byte 0-3 */
+	wq->wqcfg.wq_size = wq->size;
+
+	if (wq->size == 0) {
+		dev_warn(dev, "Incorrect work queue size: 0\n");
+		return -EINVAL;
+	}
+
+	/* bytes 4-7 */
+	wq->wqcfg.wq_thresh = wq->threshold;
+
+	/* byte 8-11 */
+	wq->wqcfg.priv = 1; /* kernel, therefore priv */
+	wq->wqcfg.mode = 1;
+
+	wq->wqcfg.priority = wq->priority;
+
+	/* bytes 12-15 */
+	wq->wqcfg.max_xfer_shift = idxd->hw.gen_cap.max_xfer_shift;
+	wq->wqcfg.max_batch_shift = idxd->hw.gen_cap.max_batch_shift;
+
+	dev_dbg(dev, "WQ %d CFGs\n", wq->id);
+	for (i = 0; i < 8; i++) {
+		wq_offset = idxd->wqcfg_offset + wq->id * 32 + i * sizeof(u32);
+		iowrite32(wq->wqcfg.bits[i], idxd->reg_base + wq_offset);
+		dev_dbg(dev, "WQ[%d][%d][%#x]: %#x\n",
+			wq->id, i, wq_offset,
+			ioread32(idxd->reg_base + wq_offset));
+	}
+
+	return 0;
+}
+
+static int idxd_wqs_config_write(struct idxd_device *idxd)
+{
+	int i, rc;
+
+	for (i = 0; i < idxd->max_wqs; i++) {
+		struct idxd_wq *wq = &idxd->wqs[i];
+
+		rc = idxd_wq_config_write(wq);
+		if (rc < 0)
+			return rc;
+	}
+
+	return 0;
+}
+
+static void idxd_group_flags_setup(struct idxd_device *idxd)
+{
+	int i;
+
+	/* TC-A 0 and TC-B 1 should be defaults */
+	for (i = 0; i < idxd->max_groups; i++) {
+		struct idxd_group *group = &idxd->groups[i];
+
+		if (group->tc_a == -1)
+			group->grpcfg.flags.tc_a = 0;
+		else
+			group->grpcfg.flags.tc_a = group->tc_a;
+		if (group->tc_b == -1)
+			group->grpcfg.flags.tc_b = 1;
+		else
+			group->grpcfg.flags.tc_b = group->tc_b;
+		group->grpcfg.flags.use_token_limit = group->use_token_limit;
+		group->grpcfg.flags.tokens_reserved = group->tokens_reserved;
+		if (group->tokens_allowed)
+			group->grpcfg.flags.tokens_allowed =
+				group->tokens_allowed;
+		else
+			group->grpcfg.flags.tokens_allowed = idxd->max_tokens;
+	}
+}
+
+static int idxd_engines_setup(struct idxd_device *idxd)
+{
+	int i, engines = 0;
+	struct idxd_engine *eng;
+	struct idxd_group *group;
+
+	for (i = 0; i < idxd->max_groups; i++) {
+		group = &idxd->groups[i];
+		group->grpcfg.engines = 0;
+	}
+
+	for (i = 0; i < idxd->max_engines; i++) {
+		eng = &idxd->engines[i];
+		group = eng->group;
+
+		if (!group)
+			continue;
+
+		group->grpcfg.engines |= BIT(eng->id);
+		engines++;
+	}
+
+	if (!engines)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int idxd_wqs_setup(struct idxd_device *idxd)
+{
+	struct idxd_wq *wq;
+	struct idxd_group *group;
+	int i, j, configured = 0;
+	struct device *dev = &idxd->pdev->dev;
+
+	for (i = 0; i < idxd->max_groups; i++) {
+		group = &idxd->groups[i];
+		for (j = 0; j < 4; j++)
+			group->grpcfg.wqs[j] = 0;
+	}
+
+	for (i = 0; i < idxd->max_wqs; i++) {
+		wq = &idxd->wqs[i];
+		group = wq->group;
+
+		if (!wq->group)
+			continue;
+		if (!wq->size)
+			continue;
+
+		if (!wq_dedicated(wq)) {
+			dev_warn(dev, "No shared workqueue support.\n");
+			return -EINVAL;
+		}
+
+		group->grpcfg.wqs[wq->id / 64] |= BIT(wq->id % 64);
+		configured++;
+	}
+
+	if (configured == 0)
+		return -EINVAL;
+
+	return 0;
+}
+
+int idxd_device_config(struct idxd_device *idxd)
+{
+	int rc;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	rc = idxd_wqs_setup(idxd);
+	if (rc < 0)
+		return rc;
+
+	rc = idxd_engines_setup(idxd);
+	if (rc < 0)
+		return rc;
+
+	idxd_group_flags_setup(idxd);
+
+	rc = idxd_wqs_config_write(idxd);
+	if (rc < 0)
+		return rc;
+
+	rc = idxd_groups_config_write(idxd);
+	if (rc < 0)
+		return rc;
+
+	return 0;
+}
diff --git a/drivers/dma/idxd/idxd.h b/drivers/dma/idxd/idxd.h
new file mode 100644
index 000000000000..733484922365
--- /dev/null
+++ b/drivers/dma/idxd/idxd.h
@@ -0,0 +1,225 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */
+#ifndef _IDXD_H_
+#define _IDXD_H_
+
+#include <linux/sbitmap.h>
+#include <linux/percpu-rwsem.h>
+#include <linux/wait.h>
+#include "registers.h"
+
+#define IDXD_DRIVER_VERSION	"1.00"
+
+extern struct kmem_cache *idxd_desc_pool;
+
+#define IDXD_REG_TIMEOUT	50
+#define IDXD_DRAIN_TIMEOUT	5000
+
+enum idxd_type {
+	IDXD_TYPE_UNKNOWN = -1,
+	IDXD_TYPE_DSA = 0,
+	IDXD_TYPE_MAX
+};
+
+#define IDXD_NAME_SIZE		128
+
+struct idxd_device_driver {
+	struct device_driver drv;
+};
+
+struct idxd_irq_entry {
+	struct idxd_device *idxd;
+	int id;
+	struct llist_head pending_llist;
+	struct list_head work_list;
+};
+
+struct idxd_group {
+	struct device conf_dev;
+	struct idxd_device *idxd;
+	struct grpcfg grpcfg;
+	int id;
+	int num_engines;
+	int num_wqs;
+	bool use_token_limit;
+	u8 tokens_allowed;
+	u8 tokens_reserved;
+	int tc_a;
+	int tc_b;
+};
+
+#define IDXD_MAX_PRIORITY	0xf
+
+enum idxd_wq_state {
+	IDXD_WQ_DISABLED = 0,
+	IDXD_WQ_ENABLED,
+};
+
+enum idxd_wq_flag {
+	WQ_FLAG_DEDICATED = 0,
+};
+
+enum idxd_wq_type {
+	IDXD_WQT_NONE = 0,
+	IDXD_WQT_KERNEL,
+};
+
+#define IDXD_ALLOCATED_BATCH_SIZE	128U
+#define WQ_NAME_SIZE   1024
+#define WQ_TYPE_SIZE   10
+
+struct idxd_wq {
+	void __iomem *dportal;
+	struct device conf_dev;
+	struct idxd_device *idxd;
+	int id;
+	enum idxd_wq_type type;
+	struct idxd_group *group;
+	int client_count;
+	struct mutex wq_lock;	/* mutex for workqueue */
+	u32 size;
+	u32 threshold;
+	u32 priority;
+	enum idxd_wq_state state;
+	unsigned long flags;
+	union wqcfg wqcfg;
+	atomic_t dq_count;	/* dedicated queue flow control */
+	u32 vec_ptr;		/* interrupt steering */
+	struct dsa_hw_desc **hw_descs;
+	int num_descs;
+	struct dsa_completion_record *compls;
+	dma_addr_t compls_addr;
+	int compls_size;
+	struct idxd_desc **descs;
+	struct sbitmap sbmap;
+	struct percpu_rw_semaphore submit_lock;
+	wait_queue_head_t submit_waitq;
+	char name[WQ_NAME_SIZE + 1];
+};
+
+struct idxd_engine {
+	struct device conf_dev;
+	int id;
+	struct idxd_group *group;
+	struct idxd_device *idxd;
+};
+
+/* shadow registers */
+struct idxd_hw {
+	u32 version;
+	union gen_cap_reg gen_cap;
+	union wq_cap_reg wq_cap;
+	union group_cap_reg group_cap;
+	union engine_cap_reg engine_cap;
+	struct opcap opcap;
+};
+
+enum idxd_device_state {
+	IDXD_DEV_HALTED = -1,
+	IDXD_DEV_DISABLED = 0,
+	IDXD_DEV_CONF_READY,
+	IDXD_DEV_ENABLED,
+};
+
+enum idxd_device_flag {
+	IDXD_FLAG_CONFIGURABLE = 0,
+};
+
+struct idxd_device {
+	enum idxd_type type;
+	struct device conf_dev;
+	struct list_head list;
+	struct idxd_hw hw;
+	enum idxd_device_state state;
+	unsigned long flags;
+	int id;
+
+	struct pci_dev *pdev;
+	void __iomem *reg_base;
+
+	spinlock_t dev_lock;	/* spinlock for device */
+	struct idxd_group *groups;
+	struct idxd_wq *wqs;
+	struct idxd_engine *engines;
+
+	int num_groups;
+
+	u32 msix_perm_offset;
+	u32 wqcfg_offset;
+	u32 grpcfg_offset;
+	u32 perfmon_offset;
+
+	u64 max_xfer_bytes;
+	u32 max_batch_size;
+	int max_groups;
+	int max_engines;
+	int max_tokens;
+	int max_wqs;
+	int max_wq_size;
+	int token_limit;
+
+	union sw_err_reg sw_err;
+
+	struct msix_entry *msix_entries;
+	int num_wq_irqs;
+	struct idxd_irq_entry *irq_entries;
+};
+
+/* IDXD software descriptor */
+struct idxd_desc {
+	struct dsa_hw_desc *hw;
+	dma_addr_t desc_dma;
+	struct dsa_completion_record *completion;
+	dma_addr_t compl_dma;
+	struct llist_node llnode;
+	struct list_head list;
+	int id;
+	struct idxd_wq *wq;
+};
+
+#define confdev_to_idxd(dev) container_of(dev, struct idxd_device, conf_dev)
+#define confdev_to_wq(dev) container_of(dev, struct idxd_wq, conf_dev)
+
+static inline bool wq_dedicated(struct idxd_wq *wq)
+{
+	return test_bit(WQ_FLAG_DEDICATED, &wq->flags);
+}
+
+static inline void idxd_set_type(struct idxd_device *idxd)
+{
+	struct pci_dev *pdev = idxd->pdev;
+
+	if (pdev->device == PCI_DEVICE_ID_INTEL_DSA_SPR0)
+		idxd->type = IDXD_TYPE_DSA;
+	else
+		idxd->type = IDXD_TYPE_UNKNOWN;
+}
+
+const char *idxd_get_dev_name(struct idxd_device *idxd);
+
+/* device interrupt control */
+irqreturn_t idxd_irq_handler(int vec, void *data);
+irqreturn_t idxd_misc_thread(int vec, void *data);
+irqreturn_t idxd_wq_thread(int irq, void *data);
+void idxd_mask_error_interrupts(struct idxd_device *idxd);
+void idxd_unmask_error_interrupts(struct idxd_device *idxd);
+void idxd_mask_msix_vectors(struct idxd_device *idxd);
+int idxd_mask_msix_vector(struct idxd_device *idxd, int vec_id);
+int idxd_unmask_msix_vector(struct idxd_device *idxd, int vec_id);
+
+/* device control */
+int idxd_device_enable(struct idxd_device *idxd);
+int idxd_device_disable(struct idxd_device *idxd);
+int idxd_device_reset(struct idxd_device *idxd);
+int __idxd_device_reset(struct idxd_device *idxd);
+void idxd_device_cleanup(struct idxd_device *idxd);
+int idxd_device_config(struct idxd_device *idxd);
+void idxd_device_wqs_clear_state(struct idxd_device *idxd);
+
+/* work queue control */
+int idxd_wq_alloc_resources(struct idxd_wq *wq);
+void idxd_wq_free_resources(struct idxd_wq *wq);
+int idxd_wq_enable(struct idxd_wq *wq);
+int idxd_wq_disable(struct idxd_wq *wq);
+
+#endif
diff --git a/drivers/dma/idxd/init.c b/drivers/dma/idxd/init.c
new file mode 100644
index 000000000000..6e89a87d62b0
--- /dev/null
+++ b/drivers/dma/idxd/init.c
@@ -0,0 +1,468 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/pci.h>
+#include <linux/interrupt.h>
+#include <linux/delay.h>
+#include <linux/dma-mapping.h>
+#include <linux/workqueue.h>
+#include <linux/aer.h>
+#include <linux/fs.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <linux/device.h>
+#include <linux/idr.h>
+#include <uapi/linux/idxd.h>
+#include "registers.h"
+#include "idxd.h"
+
+MODULE_VERSION(IDXD_DRIVER_VERSION);
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Intel Corporation");
+
+#define DRV_NAME "idxd"
+
+static struct idr idxd_idrs[IDXD_TYPE_MAX];
+static struct mutex idxd_idr_lock;
+
+static struct pci_device_id idxd_pci_tbl[] = {
+	/* DSA ver 1.0 platforms */
+	{ PCI_VDEVICE(INTEL, PCI_DEVICE_ID_INTEL_DSA_SPR0) },
+	{ 0, }
+};
+MODULE_DEVICE_TABLE(pci, idxd_pci_tbl);
+
+static char *idxd_name[] = {
+	"dsa",
+};
+
+const char *idxd_get_dev_name(struct idxd_device *idxd)
+{
+	return idxd_name[idxd->type];
+}
+
+static int idxd_setup_interrupts(struct idxd_device *idxd)
+{
+	struct pci_dev *pdev = idxd->pdev;
+	struct device *dev = &pdev->dev;
+	struct msix_entry *msix;
+	struct idxd_irq_entry *irq_entry;
+	int i, msixcnt;
+	int rc = -ENODEV;	/* returned when the device has no MSI-X */
+
+	msixcnt = pci_msix_vec_count(pdev);
+	if (msixcnt < 0) {
+		dev_err(dev, "Not MSI-X interrupt capable.\n");
+		goto err_no_irq;
+	}
+
+	idxd->msix_entries = devm_kzalloc(dev, sizeof(struct msix_entry) *
+			msixcnt, GFP_KERNEL);
+	if (!idxd->msix_entries) {
+		rc = -ENOMEM;
+		goto err_no_irq;
+	}
+
+	for (i = 0; i < msixcnt; i++)
+		idxd->msix_entries[i].entry = i;
+
+	rc = pci_enable_msix_exact(pdev, idxd->msix_entries, msixcnt);
+	if (rc) {
+		dev_err(dev, "Failed enabling %d MSIX entries.\n", msixcnt);
+		goto err_no_irq;
+	}
+	dev_dbg(dev, "Enabled %d msix vectors\n", msixcnt);
+
+	/*
+	 * We implement 1 completion list per MSI-X entry except for
+	 * entry 0, which is for errors and others.
+	 */
+	idxd->irq_entries = devm_kcalloc(dev, msixcnt,
+					 sizeof(struct idxd_irq_entry),
+					 GFP_KERNEL);
+	if (!idxd->irq_entries) {
+		rc = -ENOMEM;
+		goto err_no_irq;
+	}
+
+	for (i = 0; i < msixcnt; i++) {
+		idxd->irq_entries[i].id = i;
+		idxd->irq_entries[i].idxd = idxd;
+	}
+
+	msix = &idxd->msix_entries[0];
+	irq_entry = &idxd->irq_entries[0];
+	rc = devm_request_threaded_irq(dev, msix->vector, idxd_irq_handler,
+				       idxd_misc_thread, 0, "idxd-misc",
+				       irq_entry);
+	if (rc < 0) {
+		dev_err(dev, "Failed to allocate misc interrupt.\n");
+		goto err_no_irq;
+	}
+
+	dev_dbg(dev, "Allocated idxd-misc handler on msix vector %d\n",
+		msix->vector);
+
+	/* first MSI-X entry is not for wq interrupts */
+	idxd->num_wq_irqs = msixcnt - 1;
+
+	for (i = 1; i < msixcnt; i++) {
+		msix = &idxd->msix_entries[i];
+		irq_entry = &idxd->irq_entries[i];
+
+		init_llist_head(&idxd->irq_entries[i].pending_llist);
+		INIT_LIST_HEAD(&idxd->irq_entries[i].work_list);
+		rc = devm_request_threaded_irq(dev, msix->vector,
+					       idxd_irq_handler,
+					       idxd_wq_thread, 0,
+					       "idxd-portal", irq_entry);
+		if (rc < 0) {
+			dev_err(dev, "Failed to allocate irq %d.\n",
+				msix->vector);
+			goto err_no_irq;
+		}
+		dev_dbg(dev, "Allocated idxd-msix %d for vector %d\n",
+			i, msix->vector);
+	}
+
+	idxd_unmask_error_interrupts(idxd);
+
+	return 0;
+
+ err_no_irq:
+	/* Disable error interrupt generation */
+	idxd_mask_error_interrupts(idxd);
+	pci_disable_msix(pdev);
+	dev_err(dev, "No usable interrupts\n");
+	return rc;
+}
+
+static void idxd_wqs_free_lock(struct idxd_device *idxd)
+{
+	int i;
+
+	for (i = 0; i < idxd->max_wqs; i++) {
+		struct idxd_wq *wq = &idxd->wqs[i];
+
+		percpu_free_rwsem(&wq->submit_lock);
+	}
+}
+
+static int idxd_setup_internals(struct idxd_device *idxd)
+{
+	struct device *dev = &idxd->pdev->dev;
+	int i;
+
+	idxd->groups = devm_kcalloc(dev, idxd->max_groups,
+				    sizeof(struct idxd_group), GFP_KERNEL);
+	if (!idxd->groups)
+		return -ENOMEM;
+
+	for (i = 0; i < idxd->max_groups; i++) {
+		idxd->groups[i].idxd = idxd;
+		idxd->groups[i].id = i;
+		idxd->groups[i].tc_a = -1;
+		idxd->groups[i].tc_b = -1;
+	}
+
+	idxd->wqs = devm_kcalloc(dev, idxd->max_wqs, sizeof(struct idxd_wq),
+				 GFP_KERNEL);
+	if (!idxd->wqs)
+		return -ENOMEM;
+
+	idxd->engines = devm_kcalloc(dev, idxd->max_engines,
+				     sizeof(struct idxd_engine), GFP_KERNEL);
+	if (!idxd->engines)
+		return -ENOMEM;
+
+	for (i = 0; i < idxd->max_wqs; i++) {
+		struct idxd_wq *wq = &idxd->wqs[i];
+		int rc;
+
+		wq->id = i;
+		wq->idxd = idxd;
+		mutex_init(&wq->wq_lock);
+		atomic_set(&wq->dq_count, 0);
+		init_waitqueue_head(&wq->submit_waitq);
+		rc = percpu_init_rwsem(&wq->submit_lock);
+		if (rc < 0) {
+			idxd_wqs_free_lock(idxd);
+			return rc;
+		}
+	}
+
+	for (i = 0; i < idxd->max_engines; i++) {
+		idxd->engines[i].idxd = idxd;
+		idxd->engines[i].id = i;
+	}
+
+	return 0;
+}
+
+static void idxd_read_table_offsets(struct idxd_device *idxd)
+{
+	union offsets_reg offsets;
+	struct device *dev = &idxd->pdev->dev;
+
+	offsets.bits[0] = ioread64(idxd->reg_base + IDXD_TABLE_OFFSET);
+	offsets.bits[1] = ioread64(idxd->reg_base + IDXD_TABLE_OFFSET
+			+ sizeof(u64));
+	idxd->grpcfg_offset = offsets.grpcfg * 0x100;
+	dev_dbg(dev, "IDXD Group Config Offset: %#x\n", idxd->grpcfg_offset);
+	idxd->wqcfg_offset = offsets.wqcfg * 0x100;
+	dev_dbg(dev, "IDXD Work Queue Config Offset: %#x\n",
+		idxd->wqcfg_offset);
+	idxd->msix_perm_offset = offsets.msix_perm * 0x100;
+	dev_dbg(dev, "IDXD MSIX Permission Offset: %#x\n",
+		idxd->msix_perm_offset);
+	idxd->perfmon_offset = offsets.perfmon * 0x100;
+	dev_dbg(dev, "IDXD Perfmon Offset: %#x\n", idxd->perfmon_offset);
+}
+
+static void idxd_read_caps(struct idxd_device *idxd)
+{
+	struct device *dev = &idxd->pdev->dev;
+	int i;
+
+	/* reading generic capabilities */
+	idxd->hw.gen_cap.bits = ioread64(idxd->reg_base + IDXD_GENCAP_OFFSET);
+	dev_dbg(dev, "gen_cap: %#llx\n", idxd->hw.gen_cap.bits);
+	idxd->max_xfer_bytes = 1ULL << idxd->hw.gen_cap.max_xfer_shift;
+	dev_dbg(dev, "max xfer size: %llu bytes\n", idxd->max_xfer_bytes);
+	idxd->max_batch_size = 1U << idxd->hw.gen_cap.max_batch_shift;
+	dev_dbg(dev, "max batch size: %u\n", idxd->max_batch_size);
+	if (idxd->hw.gen_cap.config_en)
+		set_bit(IDXD_FLAG_CONFIGURABLE, &idxd->flags);
+
+	/* reading group capabilities */
+	idxd->hw.group_cap.bits =
+		ioread64(idxd->reg_base + IDXD_GRPCAP_OFFSET);
+	dev_dbg(dev, "group_cap: %#llx\n", idxd->hw.group_cap.bits);
+	idxd->max_groups = idxd->hw.group_cap.num_groups;
+	dev_dbg(dev, "max groups: %u\n", idxd->max_groups);
+	idxd->max_tokens = idxd->hw.group_cap.total_tokens;
+	dev_dbg(dev, "max tokens: %u\n", idxd->max_tokens);
+
+	/* read engine capabilities */
+	idxd->hw.engine_cap.bits =
+		ioread64(idxd->reg_base + IDXD_ENGCAP_OFFSET);
+	dev_dbg(dev, "engine_cap: %#llx\n", idxd->hw.engine_cap.bits);
+	idxd->max_engines = idxd->hw.engine_cap.num_engines;
+	dev_dbg(dev, "max engines: %u\n", idxd->max_engines);
+
+	/* read workqueue capabilities */
+	idxd->hw.wq_cap.bits = ioread64(idxd->reg_base + IDXD_WQCAP_OFFSET);
+	dev_dbg(dev, "wq_cap: %#llx\n", idxd->hw.wq_cap.bits);
+	idxd->max_wq_size = idxd->hw.wq_cap.total_wq_size;
+	dev_dbg(dev, "total workqueue size: %u\n", idxd->max_wq_size);
+	idxd->max_wqs = idxd->hw.wq_cap.num_wqs;
+	dev_dbg(dev, "max workqueues: %u\n", idxd->max_wqs);
+
+	/* reading operation capabilities */
+	for (i = 0; i < 4; i++) {
+		idxd->hw.opcap.bits[i] = ioread64(idxd->reg_base +
+				IDXD_OPCAP_OFFSET + i * sizeof(u64));
+		dev_dbg(dev, "opcap[%d]: %#llx\n", i, idxd->hw.opcap.bits[i]);
+	}
+}
+
+static struct idxd_device *idxd_alloc(struct pci_dev *pdev,
+				      void __iomem * const *iomap)
+{
+	struct device *dev = &pdev->dev;
+	struct idxd_device *idxd;
+
+	idxd = devm_kzalloc(dev, sizeof(struct idxd_device), GFP_KERNEL);
+	if (!idxd)
+		return NULL;
+
+	idxd->pdev = pdev;
+	idxd->reg_base = iomap[IDXD_MMIO_BAR];
+	spin_lock_init(&idxd->dev_lock);
+
+	return idxd;
+}
+
+static int idxd_probe(struct idxd_device *idxd)
+{
+	struct pci_dev *pdev = idxd->pdev;
+	struct device *dev = &pdev->dev;
+	int rc;
+
+	dev_dbg(dev, "%s entered and resetting device\n", __func__);
+	rc = idxd_device_reset(idxd);
+	if (rc < 0)
+		return rc;
+	dev_dbg(dev, "IDXD reset complete\n");
+
+	idxd_read_caps(idxd);
+	idxd_read_table_offsets(idxd);
+
+	rc = idxd_setup_internals(idxd);
+	if (rc)
+		goto err_setup;
+
+	rc = idxd_setup_interrupts(idxd);
+	if (rc)
+		goto err_setup;
+
+	dev_dbg(dev, "IDXD interrupt setup complete.\n");
+
+	mutex_lock(&idxd_idr_lock);
+	idxd->id = idr_alloc(&idxd_idrs[idxd->type], idxd, 0, 0, GFP_KERNEL);
+	mutex_unlock(&idxd_idr_lock);
+	if (idxd->id < 0) {
+		rc = -ENOMEM;
+		goto err_idr_fail;
+	}
+
+	dev_dbg(dev, "IDXD device %d probed successfully\n", idxd->id);
+	return 0;
+
+ err_idr_fail:
+	idxd_mask_error_interrupts(idxd);
+	idxd_mask_msix_vectors(idxd);
+ err_setup:
+	return rc;
+}
+
+static int idxd_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
+{
+	void __iomem * const *iomap;
+	struct device *dev = &pdev->dev;
+	struct idxd_device *idxd;
+	int rc;
+	unsigned int mask;
+
+	rc = pcim_enable_device(pdev);
+	if (rc)
+		return rc;
+
+	dev_dbg(dev, "Mapping BARs\n");
+	mask = (1 << IDXD_MMIO_BAR);
+	rc = pcim_iomap_regions(pdev, mask, DRV_NAME);
+	if (rc)
+		return rc;
+
+	iomap = pcim_iomap_table(pdev);
+	if (!iomap)
+		return -ENOMEM;
+
+	dev_dbg(dev, "Set DMA masks\n");
+	rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(64));
+	if (rc)
+		rc = pci_set_dma_mask(pdev, DMA_BIT_MASK(32));
+	if (rc)
+		return rc;
+
+	rc = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
+	if (rc)
+		rc = pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
+	if (rc)
+		return rc;
+
+	dev_dbg(dev, "Alloc IDXD context\n");
+	idxd = idxd_alloc(pdev, iomap);
+	if (!idxd)
+		return -ENOMEM;
+
+	idxd_set_type(idxd);
+
+	dev_dbg(dev, "Set PCI master\n");
+	pci_set_master(pdev);
+	pci_set_drvdata(pdev, idxd);
+
+	idxd->hw.version = ioread32(idxd->reg_base + IDXD_VER_OFFSET);
+	rc = idxd_probe(idxd);
+	if (rc) {
+		dev_err(dev, "Intel(R) IDXD DMA Engine init failed\n");
+		return -ENODEV;
+	}
+
+	dev_info(&pdev->dev, "Intel(R) Accelerator Device (v%x)\n",
+		 idxd->hw.version);
+
+	return 0;
+}
+
+static void idxd_shutdown(struct pci_dev *pdev)
+{
+	struct idxd_device *idxd = pci_get_drvdata(pdev);
+	int rc, i;
+	struct idxd_irq_entry *irq_entry;
+	int msixcnt = pci_msix_vec_count(pdev);
+	unsigned long flags;
+
+	spin_lock_irqsave(&idxd->dev_lock, flags);
+	rc = idxd_device_disable(idxd);
+	spin_unlock_irqrestore(&idxd->dev_lock, flags);
+	if (rc)
+		dev_err(&pdev->dev, "Disabling device failed\n");
+
+	dev_dbg(&pdev->dev, "%s called\n", __func__);
+	idxd_mask_msix_vectors(idxd);
+	idxd_mask_error_interrupts(idxd);
+
+	for (i = 0; i < msixcnt; i++) {
+		irq_entry = &idxd->irq_entries[i];
+		synchronize_irq(idxd->msix_entries[i].vector);
+		if (i == 0)
+			continue;
+	}
+}
+
+static void idxd_remove(struct pci_dev *pdev)
+{
+	struct idxd_device *idxd = pci_get_drvdata(pdev);
+
+	dev_dbg(&pdev->dev, "%s called\n", __func__);
+	idxd_shutdown(pdev);
+	idxd_wqs_free_lock(idxd);
+	mutex_lock(&idxd_idr_lock);
+	idr_remove(&idxd_idrs[idxd->type], idxd->id);
+	mutex_unlock(&idxd_idr_lock);
+}
+
+static struct pci_driver idxd_pci_driver = {
+	.name		= DRV_NAME,
+	.id_table	= idxd_pci_tbl,
+	.probe		= idxd_pci_probe,
+	.remove		= idxd_remove,
+	.shutdown	= idxd_shutdown,
+};
+
+static int __init idxd_init_module(void)
+{
+	int err, i;
+
+	/*
+	 * If the CPU does not support MOVDIR64B (64-byte direct store),
+	 * there's no point in enumerating the device. We cannot utilize it.
+	 */
+	if (!boot_cpu_has(X86_FEATURE_MOVDIR64B)) {
+		pr_warn("idxd driver failed to load without MOVDIR64B.\n");
+		return -ENODEV;
+	}
+
+	pr_info("%s: Intel(R) Accelerator Devices Driver %s\n",
+		DRV_NAME, IDXD_DRIVER_VERSION);
+
+	mutex_init(&idxd_idr_lock);
+	for (i = 0; i < IDXD_TYPE_MAX; i++)
+		idr_init(&idxd_idrs[i]);
+
+	err = pci_register_driver(&idxd_pci_driver);
+	if (err)
+		return err;
+
+	return 0;
+}
+module_init(idxd_init_module);
+
+static void __exit idxd_exit_module(void)
+{
+	pci_unregister_driver(&idxd_pci_driver);
+}
+module_exit(idxd_exit_module);
diff --git a/drivers/dma/idxd/irq.c b/drivers/dma/idxd/irq.c
new file mode 100644
index 000000000000..de4b80973c2f
--- /dev/null
+++ b/drivers/dma/idxd/irq.c
@@ -0,0 +1,156 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pci.h>
+#include <linux/io-64-nonatomic-lo-hi.h>
+#include <uapi/linux/idxd.h>
+#include "idxd.h"
+#include "registers.h"
+
+void idxd_device_wqs_clear_state(struct idxd_device *idxd)
+{
+	int i;
+
+	lockdep_assert_held(&idxd->dev_lock);
+	for (i = 0; i < idxd->max_wqs; i++) {
+		struct idxd_wq *wq = &idxd->wqs[i];
+
+		wq->state = IDXD_WQ_DISABLED;
+	}
+}
+
+static int idxd_restart(struct idxd_device *idxd)
+{
+	int i, rc;
+
+	lockdep_assert_held(&idxd->dev_lock);
+
+	rc = __idxd_device_reset(idxd);
+	if (rc < 0)
+		goto out;
+
+	rc = idxd_device_config(idxd);
+	if (rc < 0)
+		goto out;
+
+	rc = idxd_device_enable(idxd);
+	if (rc < 0)
+		goto out;
+
+	for (i = 0; i < idxd->max_wqs; i++) {
+		struct idxd_wq *wq = &idxd->wqs[i];
+
+		if (wq->state == IDXD_WQ_ENABLED) {
+			rc = idxd_wq_enable(wq);
+			if (rc < 0) {
+				dev_warn(&idxd->pdev->dev,
+					 "Unable to re-enable wq %s\n",
+					 dev_name(&wq->conf_dev));
+			}
+		}
+	}
+
+	return 0;
+
+ out:
+	idxd_device_wqs_clear_state(idxd);
+	idxd->state = IDXD_DEV_HALTED;
+	return rc;
+}
+
+irqreturn_t idxd_irq_handler(int vec, void *data)
+{
+	struct idxd_irq_entry *irq_entry = data;
+	struct idxd_device *idxd = irq_entry->idxd;
+
+	idxd_mask_msix_vector(idxd, irq_entry->id);
+	return IRQ_WAKE_THREAD;
+}
+
+irqreturn_t idxd_misc_thread(int vec, void *data)
+{
+	struct idxd_irq_entry *irq_entry = data;
+	struct idxd_device *idxd = irq_entry->idxd;
+	struct device *dev = &idxd->pdev->dev;
+	union gensts_reg gensts;
+	u32 cause, val = 0;
+	int i, rc;
+	bool err = false;
+
+	cause = ioread32(idxd->reg_base + IDXD_INTCAUSE_OFFSET);
+
+	if (cause & IDXD_INTC_ERR) {
+		spin_lock_bh(&idxd->dev_lock);
+		for (i = 0; i < 4; i++)
+			idxd->sw_err.bits[i] = ioread64(idxd->reg_base +
+					IDXD_SWERR_OFFSET + i * sizeof(u64));
+		iowrite64(IDXD_SWERR_ACK, idxd->reg_base + IDXD_SWERR_OFFSET);
+		spin_unlock_bh(&idxd->dev_lock);
+		val |= IDXD_INTC_ERR;
+
+		for (i = 0; i < 4; i++)
+			dev_warn(dev, "err[%d]: %#16.16llx\n",
+				 i, idxd->sw_err.bits[i]);
+		err = true;
+	}
+
+	if (cause & IDXD_INTC_CMD) {
+		/* Driver does not use command interrupts */
+		val |= IDXD_INTC_CMD;
+	}
+
+	if (cause & IDXD_INTC_OCCUPY) {
+		/* Driver does not utilize occupancy interrupt */
+		val |= IDXD_INTC_OCCUPY;
+	}
+
+	if (cause & IDXD_INTC_PERFMON_OVFL) {
+		/*
+		 * Driver does not utilize perfmon counter overflow interrupt
+		 * yet.
+		 */
+		val |= IDXD_INTC_PERFMON_OVFL;
+	}
+
+	val ^= cause;
+	if (val)
+		dev_warn_once(dev, "Unexpected interrupt cause bits set: %#x\n",
+			      val);
+
+	iowrite32(cause, idxd->reg_base + IDXD_INTCAUSE_OFFSET);
+	if (!err)
+		return IRQ_HANDLED;
+
+	gensts.bits = ioread32(idxd->reg_base + IDXD_GENSTATS_OFFSET);
+	if (gensts.state == IDXD_DEVICE_STATE_HALT) {
+		spin_lock_bh(&idxd->dev_lock);
+		if (gensts.reset_type == IDXD_DEVICE_RESET_SOFTWARE) {
+			rc = idxd_restart(idxd);
+			if (rc < 0)
+				dev_err(&idxd->pdev->dev,
+					"idxd restart failed, device halt.\n");
+		} else {
+			idxd_device_wqs_clear_state(idxd);
+			idxd->state = IDXD_DEV_HALTED;
+			dev_err(&idxd->pdev->dev,
+				"idxd halted, need %s.\n",
+				gensts.reset_type == IDXD_DEVICE_RESET_FLR ?
+				"FLR" : "system reset");
+		}
+		spin_unlock_bh(&idxd->dev_lock);
+	}
+
+	idxd_unmask_msix_vector(idxd, irq_entry->id);
+	return IRQ_HANDLED;
+}
+
+irqreturn_t idxd_wq_thread(int irq, void *data)
+{
+	struct idxd_irq_entry *irq_entry = data;
+
+	idxd_unmask_msix_vector(irq_entry->idxd, irq_entry->id);
+
+	return IRQ_HANDLED;
+}
diff --git a/drivers/dma/idxd/registers.h b/drivers/dma/idxd/registers.h
new file mode 100644
index 000000000000..146e51f1d872
--- /dev/null
+++ b/drivers/dma/idxd/registers.h
@@ -0,0 +1,335 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */
+#ifndef _IDXD_REGISTERS_H_
+#define _IDXD_REGISTERS_H_
+
+/* PCI Config */
+#define PCI_DEVICE_ID_INTEL_DSA_SPR0	0x0b25
+
+#define IDXD_MMIO_BAR		0
+#define IDXD_WQ_BAR		2
+
+/* MMIO Device BAR0 Registers */
+#define IDXD_VER_OFFSET			0x00
+#define IDXD_VER_MAJOR_MASK		0xf0
+#define IDXD_VER_MINOR_MASK		0x0f
+#define GET_IDXD_VER_MAJOR(x)		(((x) & IDXD_VER_MAJOR_MASK) >> 4)
+#define GET_IDXD_VER_MINOR(x)		((x) & IDXD_VER_MINOR_MASK)
+
+union gen_cap_reg {
+	struct {
+		u64 block_on_fault:1;
+		u64 overlap_copy:1;
+		u64 cache_control_mem:1;
+		u64 cache_control_cache:1;
+		u64 rsvd:3;
+		u64 int_handle_req:1;
+		u64 dest_readback:1;
+		u64 drain_readback:1;
+		u64 rsvd2:6;
+		u64 max_xfer_shift:5;
+		u64 max_batch_shift:4;
+		u64 max_ims_mult:6;
+		u64 config_en:1;
+		u64 max_descs_per_engine:8;
+		u64 rsvd3:24;
+	};
+	u64 bits;
+} __packed;
+#define IDXD_GENCAP_OFFSET		0x10
+
+union wq_cap_reg {
+	struct {
+		u64 total_wq_size:16;
+		u64 num_wqs:8;
+		u64 rsvd:24;
+		u64 shared_mode:1;
+		u64 dedicated_mode:1;
+		u64 rsvd2:1;
+		u64 priority:1;
+		u64 occupancy:1;
+		u64 occupancy_int:1;
+		u64 rsvd3:10;
+	};
+	u64 bits;
+} __packed;
+#define IDXD_WQCAP_OFFSET		0x20
+
+union group_cap_reg {
+	struct {
+		u64 num_groups:8;
+		u64 total_tokens:8;
+		u64 token_en:1;
+		u64 token_limit:1;
+		u64 rsvd:46;
+	};
+	u64 bits;
+} __packed;
+#define IDXD_GRPCAP_OFFSET		0x30
+
+union engine_cap_reg {
+	struct {
+		u64 num_engines:8;
+		u64 rsvd:56;
+	};
+	u64 bits;
+} __packed;
+
+#define IDXD_ENGCAP_OFFSET		0x38
+
+#define IDXD_OPCAP_NOOP			0x0001
+#define IDXD_OPCAP_BATCH			0x0002
+#define IDXD_OPCAP_MEMMOVE		0x0008
+struct opcap {
+	u64 bits[4];
+};
+
+#define IDXD_OPCAP_OFFSET		0x40
+
+#define IDXD_TABLE_OFFSET		0x60
+union offsets_reg {
+	struct {
+		u64 grpcfg:16;
+		u64 wqcfg:16;
+		u64 msix_perm:16;
+		u64 ims:16;
+		u64 perfmon:16;
+		u64 rsvd:48;
+	};
+	u64 bits[2];
+} __packed;
+
+#define IDXD_GENCFG_OFFSET		0x80
+union gencfg_reg {
+	struct {
+		u32 token_limit:8;
+		u32 rsvd:4;
+		u32 user_int_en:1;
+		u32 rsvd2:19;
+	};
+	u32 bits;
+} __packed;
+
+#define IDXD_GENCTRL_OFFSET		0x88
+union genctrl_reg {
+	struct {
+		u32 softerr_int_en:1;
+		u32 rsvd:31;
+	};
+	u32 bits;
+} __packed;
+
+#define IDXD_GENSTATS_OFFSET		0x90
+union gensts_reg {
+	struct {
+		u32 state:2;
+		u32 reset_type:2;
+		u32 rsvd:28;
+	};
+	u32 bits;
+} __packed;
+
+enum idxd_device_status_state {
+	IDXD_DEVICE_STATE_DISABLED = 0,
+	IDXD_DEVICE_STATE_ENABLED,
+	IDXD_DEVICE_STATE_DRAIN,
+	IDXD_DEVICE_STATE_HALT,
+};
+
+enum idxd_device_reset_type {
+	IDXD_DEVICE_RESET_SOFTWARE = 0,
+	IDXD_DEVICE_RESET_FLR,
+	IDXD_DEVICE_RESET_WARM,
+	IDXD_DEVICE_RESET_COLD,
+};
+
+#define IDXD_INTCAUSE_OFFSET		0x98
+#define IDXD_INTC_ERR			0x01
+#define IDXD_INTC_CMD			0x02
+#define IDXD_INTC_OCCUPY			0x04
+#define IDXD_INTC_PERFMON_OVFL		0x08
+
+#define IDXD_CMD_OFFSET			0xa0
+union idxd_command_reg {
+	struct {
+		u32 operand:20;
+		u32 cmd:5;
+		u32 rsvd:6;
+		u32 int_req:1;
+	};
+	u32 bits;
+} __packed;
+
+enum idxd_cmd {
+	IDXD_CMD_ENABLE_DEVICE = 1,
+	IDXD_CMD_DISABLE_DEVICE,
+	IDXD_CMD_DRAIN_ALL,
+	IDXD_CMD_ABORT_ALL,
+	IDXD_CMD_RESET_DEVICE,
+	IDXD_CMD_ENABLE_WQ,
+	IDXD_CMD_DISABLE_WQ,
+	IDXD_CMD_DRAIN_WQ,
+	IDXD_CMD_ABORT_WQ,
+	IDXD_CMD_RESET_WQ,
+	IDXD_CMD_DRAIN_PASID,
+	IDXD_CMD_ABORT_PASID,
+	IDXD_CMD_REQUEST_INT_HANDLE,
+};
+
+#define IDXD_CMDSTS_OFFSET		0xa8
+union cmdsts_reg {
+	struct {
+		u8 err;
+		u16 result;
+		u8 rsvd:7;
+		u8 active:1;
+	};
+	u32 bits;
+} __packed;
+#define IDXD_CMDSTS_ACTIVE		0x80000000
+
+enum idxd_cmdsts_err {
+	IDXD_CMDSTS_SUCCESS = 0,
+	IDXD_CMDSTS_INVAL_CMD,
+	IDXD_CMDSTS_INVAL_WQIDX,
+	IDXD_CMDSTS_HW_ERR,
+	/* enable device errors */
+	IDXD_CMDSTS_ERR_DEV_ENABLED = 0x10,
+	IDXD_CMDSTS_ERR_CONFIG,
+	IDXD_CMDSTS_ERR_BUSMASTER_EN,
+	IDXD_CMDSTS_ERR_PASID_INVAL,
+	IDXD_CMDSTS_ERR_WQ_SIZE_ERANGE,
+	IDXD_CMDSTS_ERR_GRP_CONFIG,
+	IDXD_CMDSTS_ERR_GRP_CONFIG2,
+	IDXD_CMDSTS_ERR_GRP_CONFIG3,
+	IDXD_CMDSTS_ERR_GRP_CONFIG4,
+	/* enable wq errors */
+	IDXD_CMDSTS_ERR_DEV_NOTEN = 0x20,
+	IDXD_CMDSTS_ERR_WQ_ENABLED,
+	IDXD_CMDSTS_ERR_WQ_SIZE,
+	IDXD_CMDSTS_ERR_WQ_PRIOR,
+	IDXD_CMDSTS_ERR_WQ_MODE,
+	IDXD_CMDSTS_ERR_BOF_EN,
+	IDXD_CMDSTS_ERR_PASID_EN,
+	IDXD_CMDSTS_ERR_MAX_BATCH_SIZE,
+	IDXD_CMDSTS_ERR_MAX_XFER_SIZE,
+	/* disable device errors */
+	IDXD_CMDSTS_ERR_DIS_DEV_EN = 0x31,
+	/* disable WQ, drain WQ, abort WQ, reset WQ */
+	IDXD_CMDSTS_ERR_DEV_NOT_EN,
+	/* request interrupt handle */
+	IDXD_CMDSTS_ERR_INVAL_INT_IDX = 0x41,
+	IDXD_CMDSTS_ERR_NO_HANDLE,
+};
+
+#define IDXD_SWERR_OFFSET		0xc0
+#define IDXD_SWERR_VALID			0x00000001
+#define IDXD_SWERR_OVERFLOW		0x00000002
+#define IDXD_SWERR_ACK			(IDXD_SWERR_VALID | IDXD_SWERR_OVERFLOW)
+union sw_err_reg {
+	struct {
+		u64 valid:1;
+		u64 overflow:1;
+		u64 desc_valid:1;
+		u64 wq_idx_valid:1;
+		u64 batch:1;
+		u64 fault_rw:1;
+		u64 priv:1;
+		u64 rsvd:1;
+		u64 error:8;
+		u64 wq_idx:8;
+		u64 rsvd2:8;
+		u64 operation:8;
+		u64 pasid:20;
+		u64 rsvd3:4;
+
+		u64 batch_idx:16;
+		u64 rsvd4:16;
+		u64 invalid_flags:32;
+
+		u64 fault_addr;
+
+		u64 rsvd5;
+	};
+	u64 bits[4];
+} __packed;
+
+union msix_perm {
+	struct {
+		u32 rsvd:2;
+		u32 ignore:1;
+		u32 pasid_en:1;
+		u32 rsvd2:8;
+		u32 pasid:20;
+	};
+	u32 bits;
+} __packed;
+
+union group_flags {
+	struct {
+		u32 tc_a:3;
+		u32 tc_b:3;
+		u32 rsvd:1;
+		u32 use_token_limit:1;
+		u32 tokens_reserved:8;
+		u32 rsvd2:4;
+		u32 tokens_allowed:8;
+		u32 rsvd3:4;
+	};
+	u32 bits;
+} __packed;
+
+struct grpcfg {
+	u64 wqs[4];
+	u64 engines;
+	union group_flags flags;
+} __packed;
+
+union wqcfg {
+	struct {
+		/* bytes 0-3 */
+		u16 wq_size;
+		u16 rsvd;
+
+		/* bytes 4-7 */
+		u16 wq_thresh;
+		u16 rsvd1;
+
+		/* bytes 8-11 */
+		u32 mode:1;	/* shared or dedicated */
+		u32 bof:1;	/* block on fault */
+		u32 rsvd2:2;
+		u32 priority:4;
+		u32 pasid:20;
+		u32 pasid_en:1;
+		u32 priv:1;
+		u32 rsvd3:2;
+
+		/* bytes 12-15 */
+		u32 max_xfer_shift:5;
+		u32 max_batch_shift:4;
+		u32 rsvd4:23;
+
+		/* bytes 16-19 */
+		u16 occupancy_inth;
+		u16 occupancy_table_sel:1;
+		u16 rsvd5:15;
+
+		/* bytes 20-23 */
+		u16 occupancy_limit;
+		u16 occupancy_int_en:1;
+		u16 rsvd6:15;
+
+		/* bytes 24-27 */
+		u16 occupancy;
+		u16 occupancy_int:1;
+		u16 rsvd7:12;
+		u16 mode_support:1;
+		u16 wq_state:2;
+
+		/* bytes 28-31 */
+		u32 rsvd8;
+	};
+	u32 bits[8];
+} __packed;
+#endif
diff --git a/include/uapi/linux/idxd.h b/include/uapi/linux/idxd.h
new file mode 100644
index 000000000000..849ef1515d04
--- /dev/null
+++ b/include/uapi/linux/idxd.h
@@ -0,0 +1,228 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Copyright(c) 2019 Intel Corporation. All rights rsvd. */
+#ifndef _USR_IDXD_H_
+#define _USR_IDXD_H_
+
+#ifdef __KERNEL__
+#include <linux/types.h>
+#else
+#include <stdint.h>
+#endif
+
+/* Descriptor flags */
+#define IDXD_OP_FLAG_FENCE	0x0001
+#define IDXD_OP_FLAG_BOF	0x0002
+#define IDXD_OP_FLAG_CRAV	0x0004
+#define IDXD_OP_FLAG_RCR	0x0008
+#define IDXD_OP_FLAG_RCI	0x0010
+#define IDXD_OP_FLAG_CRSTS	0x0020
+#define IDXD_OP_FLAG_CR		0x0080
+#define IDXD_OP_FLAG_CC		0x0100
+#define IDXD_OP_FLAG_ADDR1_TCS	0x0200
+#define IDXD_OP_FLAG_ADDR2_TCS	0x0400
+#define IDXD_OP_FLAG_ADDR3_TCS	0x0800
+#define IDXD_OP_FLAG_CR_TCS	0x1000
+#define IDXD_OP_FLAG_STORD	0x2000
+#define IDXD_OP_FLAG_DRDBK	0x4000
+#define IDXD_OP_FLAG_DSTS	0x8000
+
+/* Opcode */
+enum dsa_opcode {
+	DSA_OPCODE_NOOP = 0,
+	DSA_OPCODE_BATCH,
+	DSA_OPCODE_DRAIN,
+	DSA_OPCODE_MEMMOVE,
+	DSA_OPCODE_MEMFILL,
+	DSA_OPCODE_COMPARE,
+	DSA_OPCODE_COMPVAL,
+	DSA_OPCODE_CR_DELTA,
+	DSA_OPCODE_AP_DELTA,
+	DSA_OPCODE_DUALCAST,
+	DSA_OPCODE_CRCGEN = 0x10,
+	DSA_OPCODE_COPY_CRC,
+	DSA_OPCODE_DIF_CHECK,
+	DSA_OPCODE_DIF_INS,
+	DSA_OPCODE_DIF_STRP,
+	DSA_OPCODE_DIF_UPDT,
+	DSA_OPCODE_CFLUSH = 0x20,
+};
+
+/* Completion record status */
+enum dsa_completion_status {
+	DSA_COMP_NONE = 0,
+	DSA_COMP_SUCCESS,
+	DSA_COMP_SUCCESS_PRED,
+	DSA_COMP_PAGE_FAULT_NOBOF,
+	DSA_COMP_PAGE_FAULT_IR,
+	DSA_COMP_BATCH_FAIL,
+	DSA_COMP_BATCH_PAGE_FAULT,
+	DSA_COMP_DR_OFFSET_NOINC,
+	DSA_COMP_DR_OFFSET_ERANGE,
+	DSA_COMP_DIF_ERR,
+	DSA_COMP_BAD_OPCODE = 0x10,
+	DSA_COMP_INVALID_FLAGS,
+	DSA_COMP_NOZERO_RESERVE,
+	DSA_COMP_XFER_ERANGE,
+	DSA_COMP_DESC_CNT_ERANGE,
+	DSA_COMP_DR_ERANGE,
+	DSA_COMP_OVERLAP_BUFFERS,
+	DSA_COMP_DCAST_ERR,
+	DSA_COMP_DESCLIST_ALIGN,
+	DSA_COMP_INT_HANDLE_INVAL,
+	DSA_COMP_CRA_XLAT,
+	DSA_COMP_CRA_ALIGN,
+	DSA_COMP_ADDR_ALIGN,
+	DSA_COMP_PRIV_BAD,
+	DSA_COMP_TRAFFIC_CLASS_CONF,
+	DSA_COMP_PFAULT_RDBA,
+	DSA_COMP_HW_ERR1,
+	DSA_COMP_HW_ERR_DRB,
+	DSA_COMP_TRANSLATION_FAIL,
+};
+
+#define DSA_COMP_STATUS_MASK		0x7f
+#define DSA_COMP_STATUS_WRITE		0x80
+
+struct dsa_batch_desc {
+	uint32_t	pasid:20;
+	uint32_t	rsvd:11;
+	uint32_t	priv:1;
+	uint32_t	flags:24;
+	uint32_t	opcode:8;
+	uint64_t	completion_addr;
+	uint64_t	desc_list_addr;
+	uint64_t	rsvd1;
+	uint32_t	desc_count;
+	uint16_t	interrupt_handle;
+	uint16_t	rsvd2;
+	uint8_t		rsvd3[24];
+} __attribute__((packed));
+
+struct dsa_hw_desc {
+	uint32_t	pasid:20;
+	uint32_t	rsvd:11;
+	uint32_t	priv:1;
+	uint32_t	flags:24;
+	uint32_t	opcode:8;
+	uint64_t	completion_addr;
+	union {
+		uint64_t	src_addr;
+		uint64_t	rdback_addr;
+		uint64_t	pattern;
+	};
+	union {
+		uint64_t	dst_addr;
+		uint64_t	rdback_addr2;
+		uint64_t	src2_addr;
+		uint64_t	comp_pattern;
+	};
+	uint32_t	xfer_size;
+	uint16_t	int_handle;
+	uint16_t	rsvd1;
+	union {
+		uint8_t		expected_res;
+		struct {
+			uint64_t	delta_addr;
+			uint32_t	max_delta_size;
+		};
+		uint32_t	delta_rec_size;
+		uint64_t	dest2;
+		/* CRC */
+		struct {
+			uint32_t	crc_seed;
+			uint32_t	crc_rsvd;
+			uint64_t	seed_addr;
+		};
+		/* DIF check or strip */
+		struct {
+			uint8_t		src_dif_flags;
+			uint8_t		dif_chk_res;
+			uint8_t		dif_chk_flags;
+			uint8_t		dif_chk_res2[5];
+			uint32_t	chk_ref_tag_seed;
+			uint16_t	chk_app_tag_mask;
+			uint16_t	chk_app_tag_seed;
+		};
+		/* DIF insert */
+		struct {
+			uint8_t		dif_ins_res;
+			uint8_t		dest_dif_flag;
+			uint8_t		dif_ins_flags;
+			uint8_t		dif_ins_res2[13];
+			uint32_t	ins_ref_tag_seed;
+			uint16_t	ins_app_tag_mask;
+			uint16_t	ins_app_tag_seed;
+		};
+		/* DIF update */
+		struct {
+			uint8_t		src_upd_flags;
+			uint8_t		upd_dest_flags;
+			uint8_t		dif_upd_flags;
+			uint8_t		dif_upd_res[5];
+			uint32_t	src_ref_tag_seed;
+			uint16_t	src_app_tag_mask;
+			uint16_t	src_app_tag_seed;
+			uint32_t	dest_ref_tag_seed;
+			uint16_t	dest_app_tag_mask;
+			uint16_t	dest_app_tag_seed;
+		};
+
+		uint8_t		op_specific[24];
+	};
+} __attribute__((packed));
+
+struct dsa_raw_desc {
+	uint64_t	field[8];
+} __attribute__((packed));
+
+/*
+ * The status field will be modified by hardware, therefore it should be
+ * volatile to prevent the compiler from optimizing away the read.
+ */
+struct dsa_completion_record {
+	volatile uint8_t	status;
+	union {
+		uint8_t		result;
+		uint8_t		dif_status;
+	};
+	uint16_t		rsvd;
+	uint32_t		bytes_completed;
+	uint64_t		fault_addr;
+	union {
+		uint16_t	delta_rec_size;
+		uint16_t	crc_val;
+
+		/* DIF check & strip */
+		struct {
+			uint32_t	dif_chk_ref_tag;
+			uint16_t	dif_chk_app_tag_mask;
+			uint16_t	dif_chk_app_tag;
+		};
+
+		/* DIF insert */
+		struct {
+			uint64_t	dif_ins_res;
+			uint32_t	dif_ins_ref_tag;
+			uint16_t	dif_ins_app_tag_mask;
+			uint16_t	dif_ins_app_tag;
+		};
+
+		/* DIF update */
+		struct {
+			uint32_t	dif_upd_src_ref_tag;
+			uint16_t	dif_upd_src_app_tag_mask;
+			uint16_t	dif_upd_src_app_tag;
+			uint32_t	dif_upd_dest_ref_tag;
+			uint16_t	dif_upd_dest_app_tag_mask;
+			uint16_t	dif_upd_dest_app_tag;
+		};
+
+		uint8_t		op_specific[16];
+	};
+} __attribute__((packed));
+
+struct dsa_raw_completion_record {
+	uint64_t	field[4];
+} __attribute__((packed));
+
+#endif
-- 
cgit v1.2.3


From a5d29ae226812ae620f58a0726525d8af9b1d751 Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Fri, 24 Jan 2020 13:40:21 +0200
Subject: net: bridge: vlan: add basic option setting support

This patch adds support for modifying the options of single vlans and
vlan ranges. Options can also be modified on their own, i.e. without
creating or deleting vlans, by setting the BRIDGE_VLAN_INFO_ONLY_OPTS
flag. When option changes apply to a range, we try to pack the
notifications as much as possible.

v2: do full port (all vlans) notification only when creating/deleting
    vlans for compatibility, rework the range detection when changing
    options, add more verbose extack errors and check if a vlan should
    be used (br_vlan_should_use checks)
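
The notification packing described above can be sketched in userspace
as follows. This is only an illustration of the idea, not the kernel
code: struct vlan, struct vid_range, can_enter_range() and
pack_notifications() are hypothetical stand-ins, the "changed" field
models what the per-vlan option handler would report, and the input is
assumed to be a vid-sorted array rather than a vlan group lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a bridge vlan entry; only what the sketch needs. */
struct vlan {
	uint16_t vid;
	uint16_t flags;
	bool changed;	/* assumed result of processing this vlan's options */
};

/* One packed notification covering vids start..end (inclusive). */
struct vid_range {
	uint16_t start;
	uint16_t end;
};

/* Simplified range test: consecutive vid and identical flags. */
static bool can_enter_range(const struct vlan *curr, const struct vlan *end)
{
	return curr->vid - end->vid == 1 && curr->flags == end->flags;
}

/*
 * Walk the vlans in vid order and emit one range per run of changed
 * vlans that can share a notification. The pvid always starts a new
 * range. "out" must have room for n entries. Returns the range count.
 */
static size_t pack_notifications(const struct vlan *v, size_t n,
				 uint16_t pvid, struct vid_range *out)
{
	const struct vlan *start = NULL, *end = NULL;
	size_t i, cnt = 0;

	assert(out != NULL);

	for (i = 0; i < n; i++) {
		if (v[i].changed) {
			if (!start) {
				/* first changed vlan opens a range */
				start = end = &v[i];
				continue;
			}
			if (v[i].vid == pvid || !can_enter_range(&v[i], end)) {
				/* flush the open range, start a new one */
				out[cnt++] = (struct vid_range){ start->vid, end->vid };
				start = &v[i];
			}
			end = &v[i];
		} else if (start) {
			/* an unchanged vlan closes the open range */
			out[cnt++] = (struct vid_range){ start->vid, end->vid };
			start = end = NULL;
		}
	}
	if (start)
		out[cnt++] = (struct vid_range){ start->vid, end->vid };

	return cnt;
}
```

With vlans 10 and 11 changed, 12 unchanged, and 13 and 14 changed but
carrying different flags, the sketch produces three notifications
(10-11, 13, 14) instead of four.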

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/if_bridge.h |  1 +
 net/bridge/br_private.h        |  8 ++++
 net/bridge/br_vlan.c           | 41 ++++++++++++++++----
 net/bridge/br_vlan_options.c   | 87 ++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 130 insertions(+), 7 deletions(-)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index ac38f0b674b8..c8b87ad52c5b 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -130,6 +130,7 @@ enum {
 #define BRIDGE_VLAN_INFO_RANGE_BEGIN	(1<<3) /* VLAN is start of vlan range */
 #define BRIDGE_VLAN_INFO_RANGE_END	(1<<4) /* VLAN is end of vlan range */
 #define BRIDGE_VLAN_INFO_BRENTRY	(1<<5) /* Global bridge VLAN entry */
+#define BRIDGE_VLAN_INFO_ONLY_OPTS	(1<<6) /* Skip create/delete/flags */
 
 struct bridge_vlan_info {
 	__u16 flags;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 403df71d2cfa..084904ee22a8 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -976,6 +976,8 @@ void br_vlan_notify(const struct net_bridge *br,
 		    const struct net_bridge_port *p,
 		    u16 vid, u16 vid_range,
 		    int cmd);
+bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr,
+			     const struct net_bridge_vlan *range_end);
 
 static inline struct net_bridge_vlan_group *br_vlan_group(
 					const struct net_bridge *br)
@@ -1197,6 +1199,12 @@ bool br_vlan_opts_eq(const struct net_bridge_vlan *v1,
 		     const struct net_bridge_vlan *v2);
 bool br_vlan_opts_fill(struct sk_buff *skb, const struct net_bridge_vlan *v);
 size_t br_vlan_opts_nl_size(void);
+int br_vlan_process_options(const struct net_bridge *br,
+			    const struct net_bridge_port *p,
+			    struct net_bridge_vlan *range_start,
+			    struct net_bridge_vlan *range_end,
+			    struct nlattr **tb,
+			    struct netlink_ext_ack *extack);
 #endif
 
 struct nf_br_ops {
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index 75ec3da92b0b..d747756eac63 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -1667,8 +1667,8 @@ out_kfree:
 }
 
 /* check if v_curr can enter a range ending in range_end */
-static bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr,
-				    const struct net_bridge_vlan *range_end)
+bool br_vlan_can_enter_range(const struct net_bridge_vlan *v_curr,
+			     const struct net_bridge_vlan *range_end)
 {
 	return v_curr->vid - range_end->vid == 1 &&
 	       range_end->flags == v_curr->flags &&
@@ -1824,11 +1824,11 @@ static int br_vlan_rtm_process_one(struct net_device *dev,
 {
 	struct bridge_vlan_info *vinfo, vrange_end, *vinfo_last = NULL;
 	struct nlattr *tb[BRIDGE_VLANDB_ENTRY_MAX + 1];
+	bool changed = false, skip_processing = false;
 	struct net_bridge_vlan_group *vg;
 	struct net_bridge_port *p = NULL;
 	int err = 0, cmdmap = 0;
 	struct net_bridge *br;
-	bool changed = false;
 
 	if (netif_is_bridge_master(dev)) {
 		br = netdev_priv(dev);
@@ -1882,16 +1882,43 @@ static int br_vlan_rtm_process_one(struct net_device *dev,
 	switch (cmd) {
 	case RTM_NEWVLAN:
 		cmdmap = RTM_SETLINK;
+		skip_processing = !!(vinfo->flags & BRIDGE_VLAN_INFO_ONLY_OPTS);
 		break;
 	case RTM_DELVLAN:
 		cmdmap = RTM_DELLINK;
 		break;
 	}
 
-	err = br_process_vlan_info(br, p, cmdmap, vinfo, &vinfo_last, &changed,
-				   extack);
-	if (changed)
-		br_ifinfo_notify(cmdmap, br, p);
+	if (!skip_processing) {
+		struct bridge_vlan_info *tmp_last = vinfo_last;
+
+		/* br_process_vlan_info may overwrite vinfo_last */
+		err = br_process_vlan_info(br, p, cmdmap, vinfo, &tmp_last,
+					   &changed, extack);
+
+		/* notify first if anything changed */
+		if (changed)
+			br_ifinfo_notify(cmdmap, br, p);
+
+		if (err)
+			return err;
+	}
+
+	/* deal with options */
+	if (cmd == RTM_NEWVLAN) {
+		struct net_bridge_vlan *range_start, *range_end;
+
+		if (vinfo_last) {
+			range_start = br_vlan_find(vg, vinfo_last->vid);
+			range_end = br_vlan_find(vg, vinfo->vid);
+		} else {
+			range_start = br_vlan_find(vg, vinfo->vid);
+			range_end = range_start;
+		}
+
+		err = br_vlan_process_options(br, p, range_start, range_end,
+					      tb, extack);
+	}
 
 	return err;
 }
diff --git a/net/bridge/br_vlan_options.c b/net/bridge/br_vlan_options.c
index 55fcdc9c380c..27275ac3e42e 100644
--- a/net/bridge/br_vlan_options.c
+++ b/net/bridge/br_vlan_options.c
@@ -23,3 +23,90 @@ size_t br_vlan_opts_nl_size(void)
 {
 	return 0;
 }
+
+static int br_vlan_process_one_opts(const struct net_bridge *br,
+				    const struct net_bridge_port *p,
+				    struct net_bridge_vlan_group *vg,
+				    struct net_bridge_vlan *v,
+				    struct nlattr **tb,
+				    bool *changed,
+				    struct netlink_ext_ack *extack)
+{
+	*changed = false;
+	return 0;
+}
+
+int br_vlan_process_options(const struct net_bridge *br,
+			    const struct net_bridge_port *p,
+			    struct net_bridge_vlan *range_start,
+			    struct net_bridge_vlan *range_end,
+			    struct nlattr **tb,
+			    struct netlink_ext_ack *extack)
+{
+	struct net_bridge_vlan *v, *curr_start = NULL, *curr_end = NULL;
+	struct net_bridge_vlan_group *vg;
+	int vid, err = 0;
+	u16 pvid;
+
+	if (p)
+		vg = nbp_vlan_group(p);
+	else
+		vg = br_vlan_group(br);
+
+	if (!range_start || !br_vlan_should_use(range_start)) {
+		NL_SET_ERR_MSG_MOD(extack, "Vlan range start doesn't exist, can't process options");
+		return -ENOENT;
+	}
+	if (!range_end || !br_vlan_should_use(range_end)) {
+		NL_SET_ERR_MSG_MOD(extack, "Vlan range end doesn't exist, can't process options");
+		return -ENOENT;
+	}
+
+	pvid = br_get_pvid(vg);
+	for (vid = range_start->vid; vid <= range_end->vid; vid++) {
+		bool changed = false;
+
+		v = br_vlan_find(vg, vid);
+		if (!v || !br_vlan_should_use(v)) {
+			NL_SET_ERR_MSG_MOD(extack, "Vlan in range doesn't exist, can't process options");
+			err = -ENOENT;
+			break;
+		}
+
+		err = br_vlan_process_one_opts(br, p, vg, v, tb, &changed,
+					       extack);
+		if (err)
+			break;
+
+		if (changed) {
+			/* vlan options changed, check for range */
+			if (!curr_start) {
+				curr_start = v;
+				curr_end = v;
+				continue;
+			}
+
+			if (v->vid == pvid ||
+			    !br_vlan_can_enter_range(v, curr_end)) {
+				br_vlan_notify(br, p, curr_start->vid,
+					       curr_end->vid, RTM_NEWVLAN);
+				curr_start = v;
+			}
+			curr_end = v;
+		} else {
+			/* nothing changed and nothing to notify yet */
+			if (!curr_start)
+				continue;
+
+			br_vlan_notify(br, p, curr_start->vid, curr_end->vid,
+				       RTM_NEWVLAN);
+			curr_start = NULL;
+			curr_end = NULL;
+		}
+	}
+	if (curr_start)
+		br_vlan_notify(br, p, curr_start->vid, curr_end->vid,
+			       RTM_NEWVLAN);
+
+	return err;
+}
-- 
cgit v1.2.3


From a580c76d534c7360ba68042b19cb255e8420e987 Mon Sep 17 00:00:00 2001
From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Fri, 24 Jan 2020 13:40:22 +0200
Subject: net: bridge: vlan: add per-vlan state

The first per-vlan option added is state; it is needed for EVPN and for
per-vlan STP. The state allows forwarding to be controlled on a per-vlan
basis. The vlan state is considered only if the port state is forwarding,
in order to avoid conflicts and stay consistent. br_allowed_egress is
called only when the state is forwarding, but the ingress case is a bit
more complicated because we may have the transition
port:BR_STATE_FORWARDING -> vlan:BR_STATE_LEARNING, which should still
allow the bridge to learn from the packet after vlan filtering; the
packet is dropped after that. Also, to optimize the pvid state check, we
keep a copy in the vlan group to avoid one lookup. The state members are
modified with *_ONCE() to annotate the lockless access.
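
The ingress rule described above can be modelled as a small decision
function. This is a simplified sketch under stated assumptions, not the
code added by this patch: enum verdict and ingress_verdict() are
hypothetical names, and the BR_STATE_* values are repeated here only so
the example is self-contained (they mirror the uapi definitions).

```c
#include <stdint.h>

/* STP states, same values as in include/uapi/linux/if_bridge.h. */
enum {
	BR_STATE_DISABLED,
	BR_STATE_LISTENING,
	BR_STATE_LEARNING,
	BR_STATE_FORWARDING,
	BR_STATE_BLOCKING,
};

/* Hypothetical outcome of the ingress check for one packet. */
enum verdict {
	VERDICT_DROP,		/* packet is dropped, nothing learned */
	VERDICT_LEARN_ONLY,	/* source address learned, packet dropped */
	VERDICT_FORWARD,	/* packet learned from and forwarded */
};

static enum verdict state_verdict(uint8_t state)
{
	switch (state) {
	case BR_STATE_FORWARDING:
		return VERDICT_FORWARD;
	case BR_STATE_LEARNING:
		return VERDICT_LEARN_ONLY;
	default:
		return VERDICT_DROP;
	}
}

/*
 * The vlan state is consulted only when the port itself is forwarding;
 * otherwise the port state alone decides what happens to the packet.
 */
static enum verdict ingress_verdict(uint8_t port_state, uint8_t vlan_state)
{
	if (port_state != BR_STATE_FORWARDING)
		return state_verdict(port_state);

	return state_verdict(vlan_state);
}
```

In this model a forwarding port with a learning vlan yields
VERDICT_LEARN_ONLY, which is the port:BR_STATE_FORWARDING ->
vlan:BR_STATE_LEARNING transition mentioned above.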

Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/if_bridge.h |  1 +
 net/bridge/br_device.c         |  3 ++-
 net/bridge/br_input.c          |  7 ++++--
 net/bridge/br_private.h        | 43 ++++++++++++++++++++++++++++++++--
 net/bridge/br_vlan.c           | 47 ++++++++++++++++++++++++++++----------
 net/bridge/br_vlan_options.c   | 52 ++++++++++++++++++++++++++++++++++++++++--
 6 files changed, 134 insertions(+), 19 deletions(-)

diff --git a/include/uapi/linux/if_bridge.h b/include/uapi/linux/if_bridge.h
index c8b87ad52c5b..42f7ca38ad80 100644
--- a/include/uapi/linux/if_bridge.h
+++ b/include/uapi/linux/if_bridge.h
@@ -191,6 +191,7 @@ enum {
 	BRIDGE_VLANDB_ENTRY_UNSPEC,
 	BRIDGE_VLANDB_ENTRY_INFO,
 	BRIDGE_VLANDB_ENTRY_RANGE,
+	BRIDGE_VLANDB_ENTRY_STATE,
 	__BRIDGE_VLANDB_ENTRY_MAX,
 };
 #define BRIDGE_VLANDB_ENTRY_MAX (__BRIDGE_VLANDB_ENTRY_MAX - 1)
diff --git a/net/bridge/br_device.c b/net/bridge/br_device.c
index fb38add21b37..dc3d2c1dd9d5 100644
--- a/net/bridge/br_device.c
+++ b/net/bridge/br_device.c
@@ -32,6 +32,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	struct net_bridge_mdb_entry *mdst;
 	struct pcpu_sw_netstats *brstats = this_cpu_ptr(br->stats);
 	const struct nf_br_ops *nf_ops;
+	u8 state = BR_STATE_FORWARDING;
 	const unsigned char *dest;
 	struct ethhdr *eth;
 	u16 vid = 0;
@@ -56,7 +57,7 @@ netdev_tx_t br_dev_xmit(struct sk_buff *skb, struct net_device *dev)
 	eth = eth_hdr(skb);
 	skb_pull(skb, ETH_HLEN);
 
-	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid))
+	if (!br_allowed_ingress(br, br_vlan_group_rcu(br), skb, &vid, &state))
 		goto out;
 
 	if (IS_ENABLED(CONFIG_INET) &&
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c
index 8944ceb47fe9..fcc260840028 100644
--- a/net/bridge/br_input.c
+++ b/net/bridge/br_input.c
@@ -76,11 +76,14 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 	bool local_rcv, mcast_hit = false;
 	struct net_bridge *br;
 	u16 vid = 0;
+	u8 state;
 
 	if (!p || p->state == BR_STATE_DISABLED)
 		goto drop;
 
-	if (!br_allowed_ingress(p->br, nbp_vlan_group_rcu(p), skb, &vid))
+	state = p->state;
+	if (!br_allowed_ingress(p->br, nbp_vlan_group_rcu(p), skb, &vid,
+				&state))
 		goto out;
 
 	nbp_switchdev_frame_mark(p, skb);
@@ -103,7 +106,7 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb
 		}
 	}
 
-	if (p->state == BR_STATE_LEARNING)
+	if (state == BR_STATE_LEARNING)
 		goto drop;
 
 	BR_INPUT_SKB_CB(skb)->brdev = br->dev;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h
index 084904ee22a8..5153ffe79a01 100644
--- a/net/bridge/br_private.h
+++ b/net/bridge/br_private.h
@@ -113,6 +113,7 @@ enum {
  * @vid: VLAN id
  * @flags: bridge vlan flags
  * @priv_flags: private (in-kernel) bridge vlan flags
+ * @state: STP state (e.g. blocking, learning, forwarding)
  * @stats: per-cpu VLAN statistics
  * @br: if MASTER flag set, this points to a bridge struct
  * @port: if MASTER flag unset, this points to a port struct
@@ -133,6 +134,7 @@ struct net_bridge_vlan {
 	u16				vid;
 	u16				flags;
 	u16				priv_flags;
+	u8				state;
 	struct br_vlan_stats __percpu	*stats;
 	union {
 		struct net_bridge	*br;
@@ -157,6 +159,7 @@ struct net_bridge_vlan {
  * @vlan_list: sorted VLAN entry list
  * @num_vlans: number of total VLAN entries
  * @pvid: PVID VLAN id
+ * @pvid_state: PVID's STP state (e.g. forwarding, learning, blocking)
  *
  * IMPORTANT: Be careful when checking if there're VLAN entries using list
  *            primitives because the bridge can have entries in its list which
@@ -170,6 +173,7 @@ struct net_bridge_vlan_group {
 	struct list_head		vlan_list;
 	u16				num_vlans;
 	u16				pvid;
+	u8				pvid_state;
 };
 
 /* bridge fdb flags */
@@ -935,7 +939,7 @@ static inline int br_multicast_igmp_type(const struct sk_buff *skb)
 #ifdef CONFIG_BRIDGE_VLAN_FILTERING
 bool br_allowed_ingress(const struct net_bridge *br,
 			struct net_bridge_vlan_group *vg, struct sk_buff *skb,
-			u16 *vid);
+			u16 *vid, u8 *state);
 bool br_allowed_egress(struct net_bridge_vlan_group *vg,
 		       const struct sk_buff *skb);
 bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid);
@@ -1037,7 +1041,7 @@ static inline u16 br_vlan_flags(const struct net_bridge_vlan *v, u16 pvid)
 static inline bool br_allowed_ingress(const struct net_bridge *br,
 				      struct net_bridge_vlan_group *vg,
 				      struct sk_buff *skb,
-				      u16 *vid)
+				      u16 *vid, u8 *state)
 {
 	return true;
 }
@@ -1205,6 +1209,41 @@ int br_vlan_process_options(const struct net_bridge *br,
 			    struct net_bridge_vlan *range_end,
 			    struct nlattr **tb,
 			    struct netlink_ext_ack *extack);
+
+/* vlan state manipulation helpers using *_ONCE to annotate lock-free access */
+static inline u8 br_vlan_get_state(const struct net_bridge_vlan *v)
+{
+	return READ_ONCE(v->state);
+}
+
+static inline void br_vlan_set_state(struct net_bridge_vlan *v, u8 state)
+{
+	WRITE_ONCE(v->state, state);
+}
+
+static inline u8 br_vlan_get_pvid_state(const struct net_bridge_vlan_group *vg)
+{
+	return READ_ONCE(vg->pvid_state);
+}
+
+static inline void br_vlan_set_pvid_state(struct net_bridge_vlan_group *vg,
+					  u8 state)
+{
+	WRITE_ONCE(vg->pvid_state, state);
+}
+
+/* learn_allow is true at ingress and false at egress */
+static inline bool br_vlan_state_allowed(u8 state, bool learn_allow)
+{
+	switch (state) {
+	case BR_STATE_LEARNING:
+		return learn_allow;
+	case BR_STATE_FORWARDING:
+		return true;
+	default:
+		return false;
+	}
+}
 #endif
 
 struct nf_br_ops {
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index d747756eac63..6b5deca08b89 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -34,13 +34,15 @@ static struct net_bridge_vlan *br_vlan_lookup(struct rhashtable *tbl, u16 vid)
 	return rhashtable_lookup_fast(tbl, &vid, br_vlan_rht_params);
 }
 
-static bool __vlan_add_pvid(struct net_bridge_vlan_group *vg, u16 vid)
+static bool __vlan_add_pvid(struct net_bridge_vlan_group *vg,
+			    const struct net_bridge_vlan *v)
 {
-	if (vg->pvid == vid)
+	if (vg->pvid == v->vid)
 		return false;
 
 	smp_wmb();
-	vg->pvid = vid;
+	br_vlan_set_pvid_state(vg, v->state);
+	vg->pvid = v->vid;
 
 	return true;
 }
@@ -69,7 +71,7 @@ static bool __vlan_add_flags(struct net_bridge_vlan *v, u16 flags)
 		vg = nbp_vlan_group(v->port);
 
 	if (flags & BRIDGE_VLAN_INFO_PVID)
-		ret = __vlan_add_pvid(vg, v->vid);
+		ret = __vlan_add_pvid(vg, v);
 	else
 		ret = __vlan_delete_pvid(vg, v->vid);
 
@@ -293,6 +295,9 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags,
 		vg->num_vlans++;
 	}
 
+	/* set the state before publishing */
+	v->state = BR_STATE_FORWARDING;
+
 	err = rhashtable_lookup_insert_fast(&vg->vlan_hash, &v->vnode,
 					    br_vlan_rht_params);
 	if (err)
@@ -466,7 +471,8 @@ out:
 /* Called under RCU */
 static bool __allowed_ingress(const struct net_bridge *br,
 			      struct net_bridge_vlan_group *vg,
-			      struct sk_buff *skb, u16 *vid)
+			      struct sk_buff *skb, u16 *vid,
+			      u8 *state)
 {
 	struct br_vlan_stats *stats;
 	struct net_bridge_vlan *v;
@@ -532,13 +538,25 @@ static bool __allowed_ingress(const struct net_bridge *br,
 			skb->vlan_tci |= pvid;
 
 		/* if stats are disabled we can avoid the lookup */
-		if (!br_opt_get(br, BROPT_VLAN_STATS_ENABLED))
-			return true;
+		if (!br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) {
+			if (*state == BR_STATE_FORWARDING) {
+				*state = br_vlan_get_pvid_state(vg);
+				return br_vlan_state_allowed(*state, true);
+			} else {
+				return true;
+			}
+		}
 	}
 	v = br_vlan_find(vg, *vid);
 	if (!v || !br_vlan_should_use(v))
 		goto drop;
 
+	if (*state == BR_STATE_FORWARDING) {
+		*state = br_vlan_get_state(v);
+		if (!br_vlan_state_allowed(*state, true))
+			goto drop;
+	}
+
 	if (br_opt_get(br, BROPT_VLAN_STATS_ENABLED)) {
 		stats = this_cpu_ptr(v->stats);
 		u64_stats_update_begin(&stats->syncp);
@@ -556,7 +574,7 @@ drop:
 
 bool br_allowed_ingress(const struct net_bridge *br,
 			struct net_bridge_vlan_group *vg, struct sk_buff *skb,
-			u16 *vid)
+			u16 *vid, u8 *state)
 {
 	/* If VLAN filtering is disabled on the bridge, all packets are
 	 * permitted.
@@ -566,7 +584,7 @@ bool br_allowed_ingress(const struct net_bridge *br,
 		return true;
 	}
 
-	return __allowed_ingress(br, vg, skb, vid);
+	return __allowed_ingress(br, vg, skb, vid, state);
 }
 
 /* Called under RCU. */
@@ -582,7 +600,8 @@ bool br_allowed_egress(struct net_bridge_vlan_group *vg,
 
 	br_vlan_get_tag(skb, &vid);
 	v = br_vlan_find(vg, vid);
-	if (v && br_vlan_should_use(v))
+	if (v && br_vlan_should_use(v) &&
+	    br_vlan_state_allowed(br_vlan_get_state(v), false))
 		return true;
 
 	return false;
@@ -593,6 +612,7 @@ bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid)
 {
 	struct net_bridge_vlan_group *vg;
 	struct net_bridge *br = p->br;
+	struct net_bridge_vlan *v;
 
 	/* If filtering was disabled at input, let it pass. */
 	if (!br_opt_get(br, BROPT_VLAN_ENABLED))
@@ -607,13 +627,15 @@ bool br_should_learn(struct net_bridge_port *p, struct sk_buff *skb, u16 *vid)
 
 	if (!*vid) {
 		*vid = br_get_pvid(vg);
-		if (!*vid)
+		if (!*vid ||
+		    !br_vlan_state_allowed(br_vlan_get_pvid_state(vg), true))
 			return false;
 
 		return true;
 	}
 
-	if (br_vlan_find(vg, *vid))
+	v = br_vlan_find(vg, *vid);
+	if (v && br_vlan_state_allowed(br_vlan_get_state(v), true))
 		return true;
 
 	return false;
@@ -1816,6 +1838,7 @@ static const struct nla_policy br_vlan_db_policy[BRIDGE_VLANDB_ENTRY_MAX + 1] =
 	[BRIDGE_VLANDB_ENTRY_INFO]	= { .type = NLA_EXACT_LEN,
 					    .len = sizeof(struct bridge_vlan_info) },
 	[BRIDGE_VLANDB_ENTRY_RANGE]	= { .type = NLA_U16 },
+	[BRIDGE_VLANDB_ENTRY_STATE]	= { .type = NLA_U8 },
 };
 
 static int br_vlan_rtm_process_one(struct net_device *dev,
diff --git a/net/bridge/br_vlan_options.c b/net/bridge/br_vlan_options.c
index 27275ac3e42e..cd2eb194eb98 100644
--- a/net/bridge/br_vlan_options.c
+++ b/net/bridge/br_vlan_options.c
@@ -11,16 +11,54 @@
 bool br_vlan_opts_eq(const struct net_bridge_vlan *v1,
 		     const struct net_bridge_vlan *v2)
 {
-	return true;
+	return v1->state == v2->state;
 }
 
 bool br_vlan_opts_fill(struct sk_buff *skb, const struct net_bridge_vlan *v)
 {
-	return true;
+	return !nla_put_u8(skb, BRIDGE_VLANDB_ENTRY_STATE,
+			   br_vlan_get_state(v));
 }
 
 size_t br_vlan_opts_nl_size(void)
 {
+	return nla_total_size(sizeof(u8)); /* BRIDGE_VLANDB_ENTRY_STATE */
+}
+
+static int br_vlan_modify_state(struct net_bridge_vlan_group *vg,
+				struct net_bridge_vlan *v,
+				u8 state,
+				bool *changed,
+				struct netlink_ext_ack *extack)
+{
+	struct net_bridge *br;
+
+	ASSERT_RTNL();
+
+	if (state > BR_STATE_BLOCKING) {
+		NL_SET_ERR_MSG_MOD(extack, "Invalid vlan state");
+		return -EINVAL;
+	}
+
+	if (br_vlan_is_brentry(v))
+		br = v->br;
+	else
+		br = v->port->br;
+
+	if (br->stp_enabled == BR_KERNEL_STP) {
+		NL_SET_ERR_MSG_MOD(extack, "Can't modify vlan state when using kernel STP");
+		return -EBUSY;
+	}
+
+	if (v->state == state)
+		return 0;
+
+	if (v->vid == br_get_pvid(vg))
+		br_vlan_set_pvid_state(vg, state);
+
+	br_vlan_set_state(v, state);
+	*changed = true;
+
 	return 0;
 }
 
@@ -32,7 +70,17 @@ static int br_vlan_process_one_opts(const struct net_bridge *br,
 				    bool *changed,
 				    struct netlink_ext_ack *extack)
 {
+	int err;
+
 	*changed = false;
+	if (tb[BRIDGE_VLANDB_ENTRY_STATE]) {
+		u8 state = nla_get_u8(tb[BRIDGE_VLANDB_ENTRY_STATE]);
+
+		err = br_vlan_modify_state(vg, v, state, changed, extack);
+		if (err)
+			return err;
+	}
+
 	return 0;
 }
 
-- 
cgit v1.2.3


From 32efcc06d2a15fa87585614d12d6c2308cc2d3f3 Mon Sep 17 00:00:00 2001
From: Abdul Kabbani <akabbani@google.com>
Date: Fri, 24 Jan 2020 16:34:02 -0500
Subject: tcp: export count for rehash attempts

Using the IPv6 flow label to swiftly route around congested or
disconnected network paths can greatly improve TCP reliability.

This patch adds SNMP counters and an OPT_STATS counter to track both
host-level and connection-level statistics. Network administrators
can use these counters to better evaluate the impact of this new ability.

Export count for rehash attempts to
1) two SNMP counters: TcpTimeoutRehash (rehash due to timeouts),
   and TcpDuplicateDataRehash (rehash due to receiving duplicate
   packets)
2) Timestamping API SOF_TIMESTAMPING_OPT_STATS.
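
As a rough illustration of how an administrator's tool might read the
two SNMP counters, the sketch below looks up a counter by name in a
header line and a value line of the paired format used by
/proc/net/netstat. The helper name netstat_lookup is hypothetical, and
the sample lines in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Walk a header line and a value line token by token and return the
 * value whose header token equals `name`.
 * Returns 0 on success, -1 if the name is not present.
 */
static int netstat_lookup(const char *hdr, const char *val,
			  const char *name, unsigned long long *out)
{
	size_t name_len = strlen(name);

	assert(out != NULL);

	for (;;) {
		size_t hlen, vlen;

		/* Skip separators in front of the next token pair. */
		hdr += strspn(hdr, " \n");
		val += strspn(val, " \n");
		if (!*hdr || !*val)
			return -1;

		hlen = strcspn(hdr, " \n");
		vlen = strcspn(val, " \n");

		if (hlen == name_len && strncmp(hdr, name, hlen) == 0) {
			*out = strtoull(val, NULL, 10);
			return 0;
		}

		hdr += hlen;
		val += vlen;
	}
}
```

Given the header "TcpExt: TcpTimeoutRehash TcpDuplicateDataRehash" and
the values "TcpExt: 7 3", a lookup of "TcpTimeoutRehash" stores 7 in
*out. The per-connection count is exported separately through
TCP_NLA_TIMEOUT_REHASH and is not covered by this sketch.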

Signed-off-by: Abdul Kabbani <akabbani@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Kevin(Yudong) Yang <yyd@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/linux/tcp.h       | 2 ++
 include/uapi/linux/snmp.h | 2 ++
 include/uapi/linux/tcp.h  | 1 +
 net/ipv4/proc.c           | 2 ++
 net/ipv4/tcp.c            | 2 ++
 net/ipv4/tcp_input.c      | 4 +++-
 net/ipv4/tcp_timer.c      | 6 ++++++
 7 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/include/linux/tcp.h b/include/linux/tcp.h
index 4e2124607d32..1cf73e6f85ca 100644
--- a/include/linux/tcp.h
+++ b/include/linux/tcp.h
@@ -386,6 +386,8 @@ struct tcp_sock {
 #define BPF_SOCK_OPS_TEST_FLAG(TP, ARG) 0
 #endif
 
+	u16 timeout_rehash;	/* Timeout-triggered rehash attempts */
+
 	u32 rcv_ooopack; /* Received out-of-order packets, for tcpinfo */
 
 /* Receiver side RTT estimation */
diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 7eee233e78d2..7d91f4debc48 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -285,6 +285,8 @@ enum
 	LINUX_MIB_TCPRCVQDROP,			/* TCPRcvQDrop */
 	LINUX_MIB_TCPWQUEUETOOBIG,		/* TCPWqueueTooBig */
 	LINUX_MIB_TCPFASTOPENPASSIVEALTKEY,	/* TCPFastOpenPassiveAltKey */
+	LINUX_MIB_TCPTIMEOUTREHASH,		/* TCPTimeoutRehash */
+	LINUX_MIB_TCPDUPLICATEDATAREHASH,	/* TCPDuplicateDataRehash */
 	__LINUX_MIB_MAX
 };
 
diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index d87184e673ca..fd9eb8f6bcae 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -311,6 +311,7 @@ enum {
 	TCP_NLA_DSACK_DUPS,	/* DSACK blocks received */
 	TCP_NLA_REORD_SEEN,	/* reordering events seen */
 	TCP_NLA_SRTT,		/* smoothed RTT in usecs */
+	TCP_NLA_TIMEOUT_REHASH, /* Timeout-triggered rehash attempts */
 };
 
 /* for TCP_MD5SIG socket option */
diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index cc90243ccf76..2580303249e2 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -289,6 +289,8 @@ static const struct snmp_mib snmp4_net_list[] = {
 	SNMP_MIB_ITEM("TCPRcvQDrop", LINUX_MIB_TCPRCVQDROP),
 	SNMP_MIB_ITEM("TCPWqueueTooBig", LINUX_MIB_TCPWQUEUETOOBIG),
 	SNMP_MIB_ITEM("TCPFastOpenPassiveAltKey", LINUX_MIB_TCPFASTOPENPASSIVEALTKEY),
+	SNMP_MIB_ITEM("TcpTimeoutRehash", LINUX_MIB_TCPTIMEOUTREHASH),
+	SNMP_MIB_ITEM("TcpDuplicateDataRehash", LINUX_MIB_TCPDUPLICATEDATAREHASH),
 	SNMP_MIB_SENTINEL
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 2857c853e2fd..484485ae74c2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3337,6 +3337,7 @@ static size_t tcp_opt_stats_get_size(void)
 		nla_total_size(sizeof(u32)) + /* TCP_NLA_DSACK_DUPS */
 		nla_total_size(sizeof(u32)) + /* TCP_NLA_REORD_SEEN */
 		nla_total_size(sizeof(u32)) + /* TCP_NLA_SRTT */
+		nla_total_size(sizeof(u16)) + /* TCP_NLA_TIMEOUT_REHASH */
 		0;
 }
 
@@ -3391,6 +3392,7 @@ struct sk_buff *tcp_get_timestamping_opt_stats(const struct sock *sk)
 	nla_put_u32(stats, TCP_NLA_DSACK_DUPS, tp->dsack_dups);
 	nla_put_u32(stats, TCP_NLA_REORD_SEEN, tp->reord_seen);
 	nla_put_u32(stats, TCP_NLA_SRTT, tp->srtt_us >> 3);
+	nla_put_u16(stats, TCP_NLA_TIMEOUT_REHASH, tp->timeout_rehash);
 
 	return stats;
 }
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4915de65d278..e8b840a4767e 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -4271,8 +4271,10 @@ static void tcp_rcv_spurious_retrans(struct sock *sk, const struct sk_buff *skb)
 	 * The receiver remembers and reflects via DSACKs. Leverage the
 	 * DSACK state and change the txhash to re-route speculatively.
 	 */
-	if (TCP_SKB_CB(skb)->seq == tcp_sk(sk)->duplicate_sack[0].start_seq)
+	if (TCP_SKB_CB(skb)->seq == tcp_sk(sk)->duplicate_sack[0].start_seq) {
 		sk_rethink_txhash(sk);
+		NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPDUPLICATEDATAREHASH);
+	}
 }
 
 static void tcp_send_dupack(struct sock *sk, const struct sk_buff *skb)
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 1097b438befe..c3f26dcd6704 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -223,6 +223,9 @@ static int tcp_write_timeout(struct sock *sk)
 			dst_negative_advice(sk);
 		} else {
 			sk_rethink_txhash(sk);
+			tp->timeout_rehash++;
+			__NET_INC_STATS(sock_net(sk),
+					LINUX_MIB_TCPTIMEOUTREHASH);
 		}
 		retry_until = icsk->icsk_syn_retries ? : net->ipv4.sysctl_tcp_syn_retries;
 		expired = icsk->icsk_retransmits >= retry_until;
@@ -234,6 +237,9 @@ static int tcp_write_timeout(struct sock *sk)
 			dst_negative_advice(sk);
 		} else {
 			sk_rethink_txhash(sk);
+			tp->timeout_rehash++;
+			__NET_INC_STATS(sock_net(sk),
+					LINUX_MIB_TCPTIMEOUTREHASH);
 		}
 
 		retry_until = net->ipv4.sysctl_tcp_retries2;
-- 
cgit v1.2.3


From 7b225d0b5c6dda5fefab578175f210c6fc7e389a Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Wed, 22 Jan 2020 00:17:52 +0100
Subject: netfilter: nf_tables: add NFTA_SET_ELEM_KEY_END attribute

Add NFTA_SET_ELEM_KEY_END attribute to convey the closing element of the
interval between kernel and userspace.

This patch also adds the NFT_SET_EXT_KEY_END extension to store the
closing element value in this interval.

v4: No changes
v3: New patch

[sbrivio: refactor error paths and labels; add corresponding
  nft_set_ext_type for new key; rebase]
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h        | 14 +++++-
 include/uapi/linux/netfilter/nf_tables.h |  2 +
 net/netfilter/nf_tables_api.c            | 85 +++++++++++++++++++++++---------
 net/netfilter/nft_dynset.c               |  2 +-
 4 files changed, 79 insertions(+), 24 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index fe7c50acc681..504c0aa93805 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -231,6 +231,7 @@ struct nft_userdata {
  *	struct nft_set_elem - generic representation of set elements
  *
  *	@key: element key
+ *	@key_end: closing element key
  *	@priv: element private data and extensions
  */
 struct nft_set_elem {
@@ -238,6 +239,10 @@ struct nft_set_elem {
 		u32		buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)];
 		struct nft_data	val;
 	} key;
+	union {
+		u32		buf[NFT_DATA_VALUE_MAXLEN / sizeof(u32)];
+		struct nft_data	val;
+	} key_end;
 	void			*priv;
 };
 
@@ -502,6 +507,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set);
  *	enum nft_set_extensions - set extension type IDs
  *
  *	@NFT_SET_EXT_KEY: element key
+ *	@NFT_SET_EXT_KEY_END: upper bound element key, for ranges
  *	@NFT_SET_EXT_DATA: mapping data
  *	@NFT_SET_EXT_FLAGS: element flags
  *	@NFT_SET_EXT_TIMEOUT: element timeout
@@ -513,6 +519,7 @@ void nf_tables_destroy_set(const struct nft_ctx *ctx, struct nft_set *set);
  */
 enum nft_set_extensions {
 	NFT_SET_EXT_KEY,
+	NFT_SET_EXT_KEY_END,
 	NFT_SET_EXT_DATA,
 	NFT_SET_EXT_FLAGS,
 	NFT_SET_EXT_TIMEOUT,
@@ -606,6 +613,11 @@ static inline struct nft_data *nft_set_ext_key(const struct nft_set_ext *ext)
 	return nft_set_ext(ext, NFT_SET_EXT_KEY);
 }
 
+static inline struct nft_data *nft_set_ext_key_end(const struct nft_set_ext *ext)
+{
+	return nft_set_ext(ext, NFT_SET_EXT_KEY_END);
+}
+
 static inline struct nft_data *nft_set_ext_data(const struct nft_set_ext *ext)
 {
 	return nft_set_ext(ext, NFT_SET_EXT_DATA);
@@ -655,7 +667,7 @@ static inline struct nft_object **nft_set_ext_obj(const struct nft_set_ext *ext)
 
 void *nft_set_elem_init(const struct nft_set *set,
 			const struct nft_set_ext_tmpl *tmpl,
-			const u32 *key, const u32 *data,
+			const u32 *key, const u32 *key_end, const u32 *data,
 			u64 timeout, u64 expiration, gfp_t gfp);
 void nft_set_elem_destroy(const struct nft_set *set, void *elem,
 			  bool destroy_expr);
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index 261864736b26..c13106496bd2 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -370,6 +370,7 @@ enum nft_set_elem_flags {
  * @NFTA_SET_ELEM_USERDATA: user data (NLA_BINARY)
  * @NFTA_SET_ELEM_EXPR: expression (NLA_NESTED: nft_expr_attributes)
  * @NFTA_SET_ELEM_OBJREF: stateful object reference (NLA_STRING)
+ * @NFTA_SET_ELEM_KEY_END: closing key value (NLA_NESTED: nft_data)
  */
 enum nft_set_elem_attributes {
 	NFTA_SET_ELEM_UNSPEC,
@@ -382,6 +383,7 @@ enum nft_set_elem_attributes {
 	NFTA_SET_ELEM_EXPR,
 	NFTA_SET_ELEM_PAD,
 	NFTA_SET_ELEM_OBJREF,
+	NFTA_SET_ELEM_KEY_END,
 	__NFTA_SET_ELEM_MAX
 };
 #define NFTA_SET_ELEM_MAX	(__NFTA_SET_ELEM_MAX - 1)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 58e3b285a3d1..5f645a85538a 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -4215,6 +4215,9 @@ const struct nft_set_ext_type nft_set_ext_types[] = {
 		.len	= sizeof(struct nft_userdata),
 		.align	= __alignof__(struct nft_userdata),
 	},
+	[NFT_SET_EXT_KEY_END]		= {
+		.align	= __alignof__(u32),
+	},
 };
 EXPORT_SYMBOL_GPL(nft_set_ext_types);
 
@@ -4233,6 +4236,7 @@ static const struct nla_policy nft_set_elem_policy[NFTA_SET_ELEM_MAX + 1] = {
 	[NFTA_SET_ELEM_EXPR]		= { .type = NLA_NESTED },
 	[NFTA_SET_ELEM_OBJREF]		= { .type = NLA_STRING,
 					    .len = NFT_OBJ_MAXNAMELEN - 1 },
+	[NFTA_SET_ELEM_KEY_END]		= { .type = NLA_NESTED },
 };
 
 static const struct nla_policy nft_set_elem_list_policy[NFTA_SET_ELEM_LIST_MAX + 1] = {
@@ -4282,6 +4286,11 @@ static int nf_tables_fill_setelem(struct sk_buff *skb,
 			  NFT_DATA_VALUE, set->klen) < 0)
 		goto nla_put_failure;
 
+	if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END) &&
+	    nft_data_dump(skb, NFTA_SET_ELEM_KEY_END, nft_set_ext_key_end(ext),
+			  NFT_DATA_VALUE, set->klen) < 0)
+		goto nla_put_failure;
+
 	if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA) &&
 	    nft_data_dump(skb, NFTA_SET_ELEM_DATA, nft_set_ext_data(ext),
 			  set->dtype == NFT_DATA_VERDICT ? NFT_DATA_VERDICT : NFT_DATA_VALUE,
@@ -4569,6 +4578,13 @@ static int nft_get_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	if (err < 0)
 		return err;
 
+	if (nla[NFTA_SET_ELEM_KEY_END]) {
+		err = nft_setelem_parse_key(ctx, set, &elem.key_end.val,
+					    nla[NFTA_SET_ELEM_KEY_END]);
+		if (err < 0)
+			return err;
+	}
+
 	priv = set->ops->get(ctx->net, set, &elem, flags);
 	if (IS_ERR(priv))
 		return PTR_ERR(priv);
@@ -4694,8 +4710,8 @@ static struct nft_trans *nft_trans_elem_alloc(struct nft_ctx *ctx,
 
 void *nft_set_elem_init(const struct nft_set *set,
 			const struct nft_set_ext_tmpl *tmpl,
-			const u32 *key, const u32 *data,
-			u64 timeout, u64 expiration, gfp_t gfp)
+			const u32 *key, const u32 *key_end,
+			const u32 *data, u64 timeout, u64 expiration, gfp_t gfp)
 {
 	struct nft_set_ext *ext;
 	void *elem;
@@ -4708,6 +4724,8 @@ void *nft_set_elem_init(const struct nft_set *set,
 	nft_set_ext_init(ext, tmpl);
 
 	memcpy(nft_set_ext_key(ext), key, set->klen);
+	if (nft_set_ext_exists(ext, NFT_SET_EXT_KEY_END))
+		memcpy(nft_set_ext_key_end(ext), key_end, set->klen);
 	if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
 		memcpy(nft_set_ext_data(ext), data, set->dlen);
 	if (nft_set_ext_exists(ext, NFT_SET_EXT_EXPIRATION)) {
@@ -4842,9 +4860,19 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	err = nft_setelem_parse_key(ctx, set, &elem.key.val,
 				    nla[NFTA_SET_ELEM_KEY]);
 	if (err < 0)
-		goto err1;
+		return err;
 
 	nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen);
+
+	if (nla[NFTA_SET_ELEM_KEY_END]) {
+		err = nft_setelem_parse_key(ctx, set, &elem.key_end.val,
+					    nla[NFTA_SET_ELEM_KEY_END]);
+		if (err < 0)
+			goto err_parse_key;
+
+		nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen);
+	}
+
 	if (timeout > 0) {
 		nft_set_ext_add(&tmpl, NFT_SET_EXT_EXPIRATION);
 		if (timeout != set->timeout)
@@ -4854,14 +4882,14 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	if (nla[NFTA_SET_ELEM_OBJREF] != NULL) {
 		if (!(set->flags & NFT_SET_OBJECT)) {
 			err = -EINVAL;
-			goto err2;
+			goto err_parse_key_end;
 		}
 		obj = nft_obj_lookup(ctx->net, ctx->table,
 				     nla[NFTA_SET_ELEM_OBJREF],
 				     set->objtype, genmask);
 		if (IS_ERR(obj)) {
 			err = PTR_ERR(obj);
-			goto err2;
+			goto err_parse_key_end;
 		}
 		nft_set_ext_add(&tmpl, NFT_SET_EXT_OBJREF);
 	}
@@ -4870,11 +4898,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 		err = nft_data_init(ctx, &data, sizeof(data), &desc,
 				    nla[NFTA_SET_ELEM_DATA]);
 		if (err < 0)
-			goto err2;
+			goto err_parse_key_end;
 
 		err = -EINVAL;
 		if (set->dtype != NFT_DATA_VERDICT && desc.len != set->dlen)
-			goto err3;
+			goto err_parse_data;
 
 		dreg = nft_type_to_reg(set->dtype);
 		list_for_each_entry(binding, &set->bindings, list) {
@@ -4892,7 +4920,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 							  &data,
 							  desc.type, desc.len);
 			if (err < 0)
-				goto err3;
+				goto err_parse_data;
 
 			if (desc.type == NFT_DATA_VERDICT &&
 			    (data.verdict.code == NFT_GOTO ||
@@ -4917,10 +4945,11 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	}
 
 	err = -ENOMEM;
-	elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, data.data,
+	elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data,
+				      elem.key_end.val.data, data.data,
 				      timeout, expiration, GFP_KERNEL);
 	if (elem.priv == NULL)
-		goto err3;
+		goto err_parse_data;
 
 	ext = nft_set_elem_ext(set, elem.priv);
 	if (flags)
@@ -4937,7 +4966,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 
 	trans = nft_trans_elem_alloc(ctx, NFT_MSG_NEWSETELEM, set);
 	if (trans == NULL)
-		goto err4;
+		goto err_trans;
 
 	ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK;
 	err = set->ops->insert(ctx->net, set, &elem, &ext2);
@@ -4948,7 +4977,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 			    nft_set_ext_exists(ext, NFT_SET_EXT_OBJREF) ^
 			    nft_set_ext_exists(ext2, NFT_SET_EXT_OBJREF)) {
 				err = -EBUSY;
-				goto err5;
+				goto err_element_clash;
 			}
 			if ((nft_set_ext_exists(ext, NFT_SET_EXT_DATA) &&
 			     nft_set_ext_exists(ext2, NFT_SET_EXT_DATA) &&
@@ -4961,33 +4990,35 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 			else if (!(nlmsg_flags & NLM_F_EXCL))
 				err = 0;
 		}
-		goto err5;
+		goto err_element_clash;
 	}
 
 	if (set->size &&
 	    !atomic_add_unless(&set->nelems, 1, set->size + set->ndeact)) {
 		err = -ENFILE;
-		goto err6;
+		goto err_set_full;
 	}
 
 	nft_trans_elem(trans) = elem;
 	list_add_tail(&trans->list, &ctx->net->nft.commit_list);
 	return 0;
 
-err6:
+err_set_full:
 	set->ops->remove(ctx->net, set, &elem);
-err5:
+err_element_clash:
 	kfree(trans);
-err4:
+err_trans:
 	if (obj)
 		obj->use--;
 	kfree(elem.priv);
-err3:
+err_parse_data:
 	if (nla[NFTA_SET_ELEM_DATA] != NULL)
 		nft_data_release(&data, desc.type);
-err2:
+err_parse_key_end:
+	nft_data_release(&elem.key_end.val, NFT_DATA_VALUE);
+err_parse_key:
 	nft_data_release(&elem.key.val, NFT_DATA_VALUE);
-err1:
+
 	return err;
 }
 
@@ -5112,9 +5143,19 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
 
 	nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY, set->klen);
 
+	if (nla[NFTA_SET_ELEM_KEY_END]) {
+		err = nft_setelem_parse_key(ctx, set, &elem.key_end.val,
+					    nla[NFTA_SET_ELEM_KEY_END]);
+		if (err < 0)
+			return err;
+
+		nft_set_ext_add_length(&tmpl, NFT_SET_EXT_KEY_END, set->klen);
+	}
+
 	err = -ENOMEM;
-	elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data, NULL, 0,
-				      0, GFP_KERNEL);
+	elem.priv = nft_set_elem_init(set, &tmpl, elem.key.val.data,
+				      elem.key_end.val.data, NULL, 0, 0,
+				      GFP_KERNEL);
 	if (elem.priv == NULL)
 		goto fail_elem;
 
diff --git a/net/netfilter/nft_dynset.c b/net/netfilter/nft_dynset.c
index 8887295414dc..683785225a3e 100644
--- a/net/netfilter/nft_dynset.c
+++ b/net/netfilter/nft_dynset.c
@@ -54,7 +54,7 @@ static void *nft_dynset_new(struct nft_set *set, const struct nft_expr *expr,
 
 	timeout = priv->timeout ? : set->timeout;
 	elem = nft_set_elem_init(set, &priv->tmpl,
-				 &regs->data[priv->sreg_key],
+				 &regs->data[priv->sreg_key], NULL,
 				 &regs->data[priv->sreg_data],
 				 timeout, 0, GFP_ATOMIC);
 	if (elem == NULL)
-- 
cgit v1.2.3


From f3a2181e16f1dcbf5446ed43f6b5d9f56c459f85 Mon Sep 17 00:00:00 2001
From: Stefano Brivio <sbrivio@redhat.com>
Date: Wed, 22 Jan 2020 00:17:53 +0100
Subject: netfilter: nf_tables: Support for sets with multiple ranged fields

Introduce a new nested netlink attribute, NFTA_SET_DESC_CONCAT, used
to specify the length of each field in a set concatenation.

This allows set implementations to support concatenation of multiple
ranged items, as they can divide the input key into matching data for
every single field. Such set implementations would be selected as
they specify support for NFT_SET_INTERVAL and allow desc->field_count
to be greater than one. Explicitly disallow this for nft_set_rbtree.

In order to specify the interval for a set entry, userspace would
include the field lengths in NFTA_SET_DESC_CONCAT attributes, and pass
the range endpoints as two separate keys, represented by attributes
NFTA_SET_ELEM_KEY and NFTA_SET_ELEM_KEY_END.

While at it, export the number of 32-bit registers available for
packet matching, as nftables will need this to know the maximum
number of field lengths that can be specified.

For example, "packets with an IPv4 address between 192.0.2.0 and
192.0.2.42, with destination port between 22 and 25", can be
expressed as two concatenated elements:

  NFTA_SET_ELEM_KEY:            192.0.2.0 . 22
  NFTA_SET_ELEM_KEY_END:        192.0.2.42 . 25

and NFTA_SET_DESC_CONCAT attribute would contain:

  NFTA_LIST_ELEM
    NFTA_SET_FIELD_LEN:		4
  NFTA_LIST_ELEM
    NFTA_SET_FIELD_LEN:		2
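
To make the example concrete, here is a userspace sketch of how a set
backend could divide a concatenated key into fields and check each one
against the range endpoints. It is not the algorithm of any in-kernel
set implementation. It assumes that every field is padded to a whole
number of 32-bit registers and that fields are stored in network byte
order; padded_len and concat_in_range are hypothetical names, and
REG32_SIZE and REG32_COUNT mirror NFT_REG32_SIZE and NFT_REG32_COUNT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REG32_SIZE	4	/* mirrors NFT_REG32_SIZE */
#define REG32_COUNT	16	/* mirrors NFT_REG32_COUNT */

/* Assumption: each field occupies a whole number of 32-bit registers. */
static size_t padded_len(uint8_t len)
{
	return ((size_t)len + REG32_SIZE - 1) / REG32_SIZE * REG32_SIZE;
}

/*
 * Return true if every field of `pkt` lies within the closed interval
 * [start, end] for that field. Fields are compared bytewise, which gives
 * numeric ordering for big-endian (network order) data.
 */
static bool concat_in_range(const uint8_t *pkt, const uint8_t *start,
			    const uint8_t *end, const uint8_t *field_len,
			    uint8_t field_count)
{
	size_t off = 0;
	uint8_t i;

	assert(field_count <= REG32_COUNT);

	for (i = 0; i < field_count; i++) {
		if (memcmp(pkt + off, start + off, field_len[i]) < 0 ||
		    memcmp(pkt + off, end + off, field_len[i]) > 0)
			return false;
		off += padded_len(field_len[i]);
	}

	return true;
}
```

With field lengths { 4, 2 }, the start key 192.0.2.0 . 22 and the end
key 192.0.2.42 . 25, a packet key of 192.0.2.7 . 23 matches, while
192.0.2.7 . 80 and 192.0.2.43 . 22 do not.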

v4: No changes
v3: Complete rework, NFTA_SET_DESC_CONCAT instead of NFTA_SET_SUBKEY
v2: No changes

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_tables.h        |  8 +++
 include/uapi/linux/netfilter/nf_tables.h | 15 ++++++
 net/netfilter/nf_tables_api.c            | 90 +++++++++++++++++++++++++++++++-
 net/netfilter/nft_set_rbtree.c           |  3 ++
 4 files changed, 115 insertions(+), 1 deletion(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 504c0aa93805..4170c033d461 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -264,11 +264,15 @@ struct nft_set_iter {
  *	@klen: key length
  *	@dlen: data length
  *	@size: number of set elements
+ *	@field_len: length of each field in concatenation, bytes
+ *	@field_count: number of concatenated fields in element
  */
 struct nft_set_desc {
 	unsigned int		klen;
 	unsigned int		dlen;
 	unsigned int		size;
+	u8			field_len[NFT_REG32_COUNT];
+	u8			field_count;
 };
 
 /**
@@ -409,6 +413,8 @@ void nft_unregister_set(struct nft_set_type *type);
  * 	@dtype: data type (verdict or numeric type defined by userspace)
  * 	@objtype: object type (see NFT_OBJECT_* definitions)
  * 	@size: maximum set size
+ *	@field_len: length of each field in concatenation, bytes
+ *	@field_count: number of concatenated fields in element
  *	@use: number of rules references to this set
  * 	@nelems: number of elements
  * 	@ndeact: number of deactivated elements queued for removal
@@ -435,6 +441,8 @@ struct nft_set {
 	u32				dtype;
 	u32				objtype;
 	u32				size;
+	u8				field_len[NFT_REG32_COUNT];
+	u8				field_count;
 	u32				use;
 	atomic_t			nelems;
 	u32				ndeact;
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index c13106496bd2..065218a20bb7 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -48,6 +48,7 @@ enum nft_registers {
 
 #define NFT_REG_SIZE	16
 #define NFT_REG32_SIZE	4
+#define NFT_REG32_COUNT	(NFT_REG32_15 - NFT_REG32_00 + 1)
 
 /**
  * enum nft_verdicts - nf_tables internal verdicts
@@ -301,14 +302,28 @@ enum nft_set_policies {
  * enum nft_set_desc_attributes - set element description
  *
  * @NFTA_SET_DESC_SIZE: number of elements in set (NLA_U32)
+ * @NFTA_SET_DESC_CONCAT: description of field concatenation (NLA_NESTED)
  */
 enum nft_set_desc_attributes {
 	NFTA_SET_DESC_UNSPEC,
 	NFTA_SET_DESC_SIZE,
+	NFTA_SET_DESC_CONCAT,
 	__NFTA_SET_DESC_MAX
 };
 #define NFTA_SET_DESC_MAX	(__NFTA_SET_DESC_MAX - 1)
 
+/**
+ * enum nft_set_field_attributes - attributes of concatenated fields
+ *
+ * @NFTA_SET_FIELD_LEN: length of single field, in bytes (NLA_U32)
+ */
+enum nft_set_field_attributes {
+	NFTA_SET_FIELD_UNSPEC,
+	NFTA_SET_FIELD_LEN,
+	__NFTA_SET_FIELD_MAX
+};
+#define NFTA_SET_FIELD_MAX	(__NFTA_SET_FIELD_MAX - 1)
+
 /**
  * enum nft_set_attributes - nf_tables set netlink attributes
  *
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 5f645a85538a..d1318bdf49ca 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3391,6 +3391,7 @@ static const struct nla_policy nft_set_policy[NFTA_SET_MAX + 1] = {
 
 static const struct nla_policy nft_set_desc_policy[NFTA_SET_DESC_MAX + 1] = {
 	[NFTA_SET_DESC_SIZE]		= { .type = NLA_U32 },
+	[NFTA_SET_DESC_CONCAT]		= { .type = NLA_NESTED },
 };
 
 static int nft_ctx_init_from_setattr(struct nft_ctx *ctx, struct net *net,
@@ -3557,6 +3558,33 @@ static __be64 nf_jiffies64_to_msecs(u64 input)
 	return cpu_to_be64(jiffies64_to_msecs(input));
 }
 
+static int nf_tables_fill_set_concat(struct sk_buff *skb,
+				     const struct nft_set *set)
+{
+	struct nlattr *concat, *field;
+	int i;
+
+	concat = nla_nest_start_noflag(skb, NFTA_SET_DESC_CONCAT);
+	if (!concat)
+		return -ENOMEM;
+
+	for (i = 0; i < set->field_count; i++) {
+		field = nla_nest_start_noflag(skb, NFTA_LIST_ELEM);
+		if (!field)
+			return -ENOMEM;
+
+		if (nla_put_be32(skb, NFTA_SET_FIELD_LEN,
+				 htonl(set->field_len[i])))
+			return -ENOMEM;
+
+		nla_nest_end(skb, field);
+	}
+
+	nla_nest_end(skb, concat);
+
+	return 0;
+}
+
 static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx,
 			      const struct nft_set *set, u16 event, u16 flags)
 {
@@ -3620,11 +3648,17 @@ static int nf_tables_fill_set(struct sk_buff *skb, const struct nft_ctx *ctx,
 		goto nla_put_failure;
 
 	desc = nla_nest_start_noflag(skb, NFTA_SET_DESC);
+
 	if (desc == NULL)
 		goto nla_put_failure;
 	if (set->size &&
 	    nla_put_be32(skb, NFTA_SET_DESC_SIZE, htonl(set->size)))
 		goto nla_put_failure;
+
+	if (set->field_count > 1 &&
+	    nf_tables_fill_set_concat(skb, set))
+		goto nla_put_failure;
+
 	nla_nest_end(skb, desc);
 
 	nlmsg_end(skb, nlh);
@@ -3797,6 +3831,53 @@ err:
 	return err;
 }
 
+static const struct nla_policy nft_concat_policy[NFTA_SET_FIELD_MAX + 1] = {
+	[NFTA_SET_FIELD_LEN]	= { .type = NLA_U32 },
+};
+
+static int nft_set_desc_concat_parse(const struct nlattr *attr,
+				     struct nft_set_desc *desc)
+{
+	struct nlattr *tb[NFTA_SET_FIELD_MAX + 1];
+	u32 len;
+	int err;
+
+	err = nla_parse_nested_deprecated(tb, NFTA_SET_FIELD_MAX, attr,
+					  nft_concat_policy, NULL);
+	if (err < 0)
+		return err;
+
+	if (!tb[NFTA_SET_FIELD_LEN])
+		return -EINVAL;
+
+	len = ntohl(nla_get_be32(tb[NFTA_SET_FIELD_LEN]));
+
+	if (len * BITS_PER_BYTE / 32 > NFT_REG32_COUNT)
+		return -E2BIG;
+
+	desc->field_len[desc->field_count++] = len;
+
+	return 0;
+}
+
+static int nft_set_desc_concat(struct nft_set_desc *desc,
+			       const struct nlattr *nla)
+{
+	struct nlattr *attr;
+	int rem, err;
+
+	nla_for_each_nested(attr, nla, rem) {
+		if (nla_type(attr) != NFTA_LIST_ELEM)
+			return -EINVAL;
+
+		err = nft_set_desc_concat_parse(attr, desc);
+		if (err < 0)
+			return err;
+	}
+
+	return 0;
+}
+
 static int nf_tables_set_desc_parse(struct nft_set_desc *desc,
 				    const struct nlattr *nla)
 {
@@ -3810,8 +3891,10 @@ static int nf_tables_set_desc_parse(struct nft_set_desc *desc,
 
 	if (da[NFTA_SET_DESC_SIZE] != NULL)
 		desc->size = ntohl(nla_get_be32(da[NFTA_SET_DESC_SIZE]));
+	if (da[NFTA_SET_DESC_CONCAT])
+		err = nft_set_desc_concat(desc, da[NFTA_SET_DESC_CONCAT]);
 
-	return 0;
+	return err;
 }
 
 static int nf_tables_newset(struct net *net, struct sock *nlsk,
@@ -3834,6 +3917,7 @@ static int nf_tables_newset(struct net *net, struct sock *nlsk,
 	unsigned char *udata;
 	u16 udlen;
 	int err;
+	int i;
 
 	if (nla[NFTA_SET_TABLE] == NULL ||
 	    nla[NFTA_SET_NAME] == NULL ||
@@ -4012,6 +4096,10 @@ static int nf_tables_newset(struct net *net, struct sock *nlsk,
 	set->gc_int = gc_int;
 	set->handle = nf_tables_alloc_handle(table);
 
+	set->field_count = desc.field_count;
+	for (i = 0; i < desc.field_count; i++)
+		set->field_len[i] = desc.field_len[i];
+
 	err = ops->init(set, &desc, nla);
 	if (err < 0)
 		goto err3;
diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index a9f804f7a04a..5000b938ab1e 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -466,6 +466,9 @@ static void nft_rbtree_destroy(const struct nft_set *set)
 static bool nft_rbtree_estimate(const struct nft_set_desc *desc, u32 features,
 				struct nft_set_estimate *est)
 {
+	if (desc->field_count > 1)
+		return false;
+
 	if (desc->size)
 		est->size = sizeof(struct nft_rbtree) +
 			    desc->size * sizeof(struct nft_rbtree_elem);
-- 
cgit v1.2.3


From 6a94b8ccf6b77f005ab1b36a878e1d81df0c033e Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Sun, 26 Jan 2020 23:11:04 +0100
Subject: ethtool: provide message mask with DEBUG_GET request

Implement DEBUG_GET request to get debugging settings for a device. At the
moment, only the message mask, corresponding to the message level reported
by the ETHTOOL_GMSGLVL ioctl request, is provided. (It is called message
level in the ioctl interface, but almost all drivers interpret it as a bit
mask.)

As part of the implementation, provide symbolic names for message mask bits
as ETH_SS_MSG_CLASSES string set.
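
The relation between bit indices, mask values and symbolic names can be
sketched as follows. This is a reduced userspace model with only four
classes; MSG_*_BIT, MSG_BIT, msg_class_names and msg_mask_format are
hypothetical names chosen for the sketch, not identifiers from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Bit indices; the enum order defines the bit position of each class. */
enum {
	MSG_DRV_BIT,
	MSG_PROBE_BIT,
	MSG_LINK_BIT,
	MSG_TIMER_BIT,
	MSG_CLASS_COUNT, /* must stay last */
};
_Static_assert(MSG_CLASS_COUNT <= 32, "mask is stored in 32 bits");

#define MSG_BIT(bit) ((uint32_t)1 << (bit))

/* Names indexed by bit position, so table and enum cannot drift apart. */
static const char *const msg_class_names[MSG_CLASS_COUNT] = {
	[MSG_DRV_BIT]   = "drv",
	[MSG_PROBE_BIT] = "probe",
	[MSG_LINK_BIT]  = "link",
	[MSG_TIMER_BIT] = "timer",
};

/* Write the names of all set bits into buf, separated by single spaces.
 * Returns the number of names written in full.
 */
int msg_mask_format(uint32_t mask, char *buf, size_t size)
{
	size_t used = 0;
	int count = 0;

	assert(buf != NULL);
	if (size == 0)
		return 0;
	buf[0] = '\0';
	for (int bit = 0; bit < MSG_CLASS_COUNT; bit++) {
		int n;

		if (!(mask & MSG_BIT(bit)))
			continue;
		n = snprintf(buf + used, size - used, "%s%s",
			     count ? " " : "", msg_class_names[bit]);
		if (n < 0 || (size_t)n >= size - used)
			break; /* output truncated, stop early */
		used += (size_t)n;
		count++;
	}
	return count;
}
```

Keeping the names in an array indexed by the bit enum is what lets a
compile-time size check catch a class added to one place but not the other.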

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 34 +++++++++++-
 include/linux/netdevice.h                    | 56 +++++++++++++------
 include/uapi/linux/ethtool.h                 |  2 +
 include/uapi/linux/ethtool_netlink.h         | 14 +++++
 net/ethtool/Makefile                         |  2 +-
 net/ethtool/common.c                         | 19 +++++++
 net/ethtool/common.h                         |  1 +
 net/ethtool/debug.c                          | 80 ++++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |  8 +++
 net/ethtool/netlink.h                        |  1 +
 net/ethtool/strset.c                         |  5 ++
 11 files changed, 205 insertions(+), 17 deletions(-)
 create mode 100644 net/ethtool/debug.c

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index c60afba69e3c..76a411d3102d 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -185,6 +185,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_LINKMODES_GET``         get link modes info
   ``ETHTOOL_MSG_LINKMODES_SET``         set link modes info
   ``ETHTOOL_MSG_LINKSTATE_GET``         get link state
+  ``ETHTOOL_MSG_DEBUG_GET``             get debugging settings
   ===================================== ================================
 
 Kernel to userspace:
@@ -196,6 +197,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_LINKMODES_GET_REPLY``   link modes info
   ``ETHTOOL_MSG_LINKMODES_NTF``         link modes notification
   ``ETHTOOL_MSG_LINKSTATE_GET_REPLY``   link state info
+  ``ETHTOOL_MSG_DEBUG_GET_REPLY``       debugging settings
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -423,6 +425,36 @@ define their own handler.
 devices supporting the request).
 
 
+DEBUG_GET
+=========
+
+Requests debugging settings of a device. At the moment, only message mask is
+provided.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_DEBUG_HEADER``            nested  request header
+  ====================================  ======  ==========================
+
+Kernel response contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_DEBUG_HEADER``            nested  reply header
+  ``ETHTOOL_A_DEBUG_MSGMASK``           bitset  message mask
+  ====================================  ======  ==========================
+
+The message mask (``ETHTOOL_A_DEBUG_MSGMASK``) is equal to message level as
+provided by ``ETHTOOL_GMSGLVL`` and set by ``ETHTOOL_SMSGLVL`` in ioctl
+interface. While it is called message level there for historical reasons, most
+drivers and almost all newer drivers use it as a mask of enabled message
+classes (represented by ``NETIF_MSG_*`` constants); therefore netlink
+interface follows its actual use in practice.
+
+``DEBUG_GET`` allows dump requests (kernel returns reply messages for all
+devices supporting the request).
+
+
 Request translation
 ===================
 
@@ -441,7 +473,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GREGS``                   n/a
   ``ETHTOOL_GWOL``                    n/a
   ``ETHTOOL_SWOL``                    n/a
-  ``ETHTOOL_GMSGLVL``                 n/a
+  ``ETHTOOL_GMSGLVL``                 ``ETHTOOL_MSG_DEBUG_GET``
   ``ETHTOOL_SMSGLVL``                 n/a
   ``ETHTOOL_NWAY_RST``                n/a
   ``ETHTOOL_GLINK``                   ``ETHTOOL_MSG_LINKSTATE_GET``
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 4626188a754b..a9c6b5c61d27 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -3913,22 +3913,48 @@ void netif_device_attach(struct net_device *dev);
  */
 
 enum {
-	NETIF_MSG_DRV		= 0x0001,
-	NETIF_MSG_PROBE		= 0x0002,
-	NETIF_MSG_LINK		= 0x0004,
-	NETIF_MSG_TIMER		= 0x0008,
-	NETIF_MSG_IFDOWN	= 0x0010,
-	NETIF_MSG_IFUP		= 0x0020,
-	NETIF_MSG_RX_ERR	= 0x0040,
-	NETIF_MSG_TX_ERR	= 0x0080,
-	NETIF_MSG_TX_QUEUED	= 0x0100,
-	NETIF_MSG_INTR		= 0x0200,
-	NETIF_MSG_TX_DONE	= 0x0400,
-	NETIF_MSG_RX_STATUS	= 0x0800,
-	NETIF_MSG_PKTDATA	= 0x1000,
-	NETIF_MSG_HW		= 0x2000,
-	NETIF_MSG_WOL		= 0x4000,
+	NETIF_MSG_DRV_BIT,
+	NETIF_MSG_PROBE_BIT,
+	NETIF_MSG_LINK_BIT,
+	NETIF_MSG_TIMER_BIT,
+	NETIF_MSG_IFDOWN_BIT,
+	NETIF_MSG_IFUP_BIT,
+	NETIF_MSG_RX_ERR_BIT,
+	NETIF_MSG_TX_ERR_BIT,
+	NETIF_MSG_TX_QUEUED_BIT,
+	NETIF_MSG_INTR_BIT,
+	NETIF_MSG_TX_DONE_BIT,
+	NETIF_MSG_RX_STATUS_BIT,
+	NETIF_MSG_PKTDATA_BIT,
+	NETIF_MSG_HW_BIT,
+	NETIF_MSG_WOL_BIT,
+
+	/* When you add a new bit above, update netif_msg_class_names array
+	 * in net/ethtool/common.c
+	 */
+	NETIF_MSG_CLASS_COUNT,
 };
+/* Both ethtool_ops interface and internal driver implementation use u32 */
+static_assert(NETIF_MSG_CLASS_COUNT <= 32);
+
+#define __NETIF_MSG_BIT(bit)	((u32)1 << (bit))
+#define __NETIF_MSG(name)	__NETIF_MSG_BIT(NETIF_MSG_ ## name ## _BIT)
+
+#define NETIF_MSG_DRV		__NETIF_MSG(DRV)
+#define NETIF_MSG_PROBE		__NETIF_MSG(PROBE)
+#define NETIF_MSG_LINK		__NETIF_MSG(LINK)
+#define NETIF_MSG_TIMER		__NETIF_MSG(TIMER)
+#define NETIF_MSG_IFDOWN	__NETIF_MSG(IFDOWN)
+#define NETIF_MSG_IFUP		__NETIF_MSG(IFUP)
+#define NETIF_MSG_RX_ERR	__NETIF_MSG(RX_ERR)
+#define NETIF_MSG_TX_ERR	__NETIF_MSG(TX_ERR)
+#define NETIF_MSG_TX_QUEUED	__NETIF_MSG(TX_QUEUED)
+#define NETIF_MSG_INTR		__NETIF_MSG(INTR)
+#define NETIF_MSG_TX_DONE	__NETIF_MSG(TX_DONE)
+#define NETIF_MSG_RX_STATUS	__NETIF_MSG(RX_STATUS)
+#define NETIF_MSG_PKTDATA	__NETIF_MSG(PKTDATA)
+#define NETIF_MSG_HW		__NETIF_MSG(HW)
+#define NETIF_MSG_WOL		__NETIF_MSG(WOL)
 
 #define netif_msg_drv(p)	((p)->msg_enable & NETIF_MSG_DRV)
 #define netif_msg_probe(p)	((p)->msg_enable & NETIF_MSG_PROBE)
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 116bcbf09c74..456fb2aa0fad 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -594,6 +594,7 @@ struct ethtool_pauseparam {
  * @ETH_SS_PHY_STATS: Statistic names, for use with %ETHTOOL_GPHYSTATS
  * @ETH_SS_PHY_TUNABLES: PHY tunable names
  * @ETH_SS_LINK_MODES: link mode names
+ * @ETH_SS_MSG_CLASSES: debug message class names
  */
 enum ethtool_stringset {
 	ETH_SS_TEST		= 0,
@@ -606,6 +607,7 @@ enum ethtool_stringset {
 	ETH_SS_PHY_STATS,
 	ETH_SS_PHY_TUNABLES,
 	ETH_SS_LINK_MODES,
+	ETH_SS_MSG_CLASSES,
 
 	/* add new constants above here */
 	ETH_SS_COUNT
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 02f82f42a889..0d39e04567cb 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -20,6 +20,7 @@ enum {
 	ETHTOOL_MSG_LINKMODES_GET,
 	ETHTOOL_MSG_LINKMODES_SET,
 	ETHTOOL_MSG_LINKSTATE_GET,
+	ETHTOOL_MSG_DEBUG_GET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -35,6 +36,7 @@ enum {
 	ETHTOOL_MSG_LINKMODES_GET_REPLY,
 	ETHTOOL_MSG_LINKMODES_NTF,
 	ETHTOOL_MSG_LINKSTATE_GET_REPLY,
+	ETHTOOL_MSG_DEBUG_GET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -195,6 +197,18 @@ enum {
 	ETHTOOL_A_LINKSTATE_MAX = __ETHTOOL_A_LINKSTATE_CNT - 1
 };
 
+/* DEBUG */
+
+enum {
+	ETHTOOL_A_DEBUG_UNSPEC,
+	ETHTOOL_A_DEBUG_HEADER,			/* nest - _A_HEADER_* */
+	ETHTOOL_A_DEBUG_MSGMASK,		/* bitset */
+
+	/* add new constants above here */
+	__ETHTOOL_A_DEBUG_CNT,
+	ETHTOOL_A_DEBUG_MAX = __ETHTOOL_A_DEBUG_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index 9a1332fb0cc6..c120c820a4f5 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -5,4 +5,4 @@ obj-y				+= ioctl.o common.o
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
 ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
-		   linkstate.o
+		   linkstate.o debug.o
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index c7b8956c3827..aa1183a65a76 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -171,6 +171,25 @@ const char link_mode_names[][ETH_GSTRING_LEN] = {
 };
 static_assert(ARRAY_SIZE(link_mode_names) == __ETHTOOL_LINK_MODE_MASK_NBITS);
 
+const char netif_msg_class_names[][ETH_GSTRING_LEN] = {
+	[NETIF_MSG_DRV_BIT]		= "drv",
+	[NETIF_MSG_PROBE_BIT]		= "probe",
+	[NETIF_MSG_LINK_BIT]		= "link",
+	[NETIF_MSG_TIMER_BIT]		= "timer",
+	[NETIF_MSG_IFDOWN_BIT]		= "ifdown",
+	[NETIF_MSG_IFUP_BIT]		= "ifup",
+	[NETIF_MSG_RX_ERR_BIT]		= "rx_err",
+	[NETIF_MSG_TX_ERR_BIT]		= "tx_err",
+	[NETIF_MSG_TX_QUEUED_BIT]	= "tx_queued",
+	[NETIF_MSG_INTR_BIT]		= "intr",
+	[NETIF_MSG_TX_DONE_BIT]		= "tx_done",
+	[NETIF_MSG_RX_STATUS_BIT]	= "rx_status",
+	[NETIF_MSG_PKTDATA_BIT]		= "pktdata",
+	[NETIF_MSG_HW_BIT]		= "hw",
+	[NETIF_MSG_WOL_BIT]		= "wol",
+};
+static_assert(ARRAY_SIZE(netif_msg_class_names) == NETIF_MSG_CLASS_COUNT);
+
 /* return false if legacy contained non-0 deprecated fields
  * maxtxpkt/maxrxpkt. rest of ksettings always updated
  */
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index 5c5f7dc90cd4..064c5c3aa990 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -19,6 +19,7 @@ tunable_strings[__ETHTOOL_TUNABLE_COUNT][ETH_GSTRING_LEN];
 extern const char
 phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN];
 extern const char link_mode_names[][ETH_GSTRING_LEN];
+extern const char netif_msg_class_names[][ETH_GSTRING_LEN];
 
 int __ethtool_get_link(struct net_device *dev);
 
diff --git a/net/ethtool/debug.c b/net/ethtool/debug.c
new file mode 100644
index 000000000000..cc121a23be5f
--- /dev/null
+++ b/net/ethtool/debug.c
@@ -0,0 +1,80 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+#include "bitset.h"
+
+struct debug_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct debug_reply_data {
+	struct ethnl_reply_data		base;
+	u32				msg_mask;
+};
+
+#define DEBUG_REPDATA(__reply_base) \
+	container_of(__reply_base, struct debug_reply_data, base)
+
+static const struct nla_policy
+debug_get_policy[ETHTOOL_A_DEBUG_MAX + 1] = {
+	[ETHTOOL_A_DEBUG_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_DEBUG_HEADER]	= { .type = NLA_NESTED },
+	[ETHTOOL_A_DEBUG_MSGMASK]	= { .type = NLA_REJECT },
+};
+
+static int debug_prepare_data(const struct ethnl_req_info *req_base,
+			      struct ethnl_reply_data *reply_base,
+			      struct genl_info *info)
+{
+	struct debug_reply_data *data = DEBUG_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	if (!dev->ethtool_ops->get_msglevel)
+		return -EOPNOTSUPP;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+	data->msg_mask = dev->ethtool_ops->get_msglevel(dev);
+	ethnl_ops_complete(dev);
+
+	return 0;
+}
+
+static int debug_reply_size(const struct ethnl_req_info *req_base,
+			    const struct ethnl_reply_data *reply_base)
+{
+	const struct debug_reply_data *data = DEBUG_REPDATA(reply_base);
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+
+	return ethnl_bitset32_size(&data->msg_mask, NULL, NETIF_MSG_CLASS_COUNT,
+				   netif_msg_class_names, compact);
+}
+
+static int debug_fill_reply(struct sk_buff *skb,
+			    const struct ethnl_req_info *req_base,
+			    const struct ethnl_reply_data *reply_base)
+{
+	const struct debug_reply_data *data = DEBUG_REPDATA(reply_base);
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+
+	return ethnl_put_bitset32(skb, ETHTOOL_A_DEBUG_MSGMASK, &data->msg_mask,
+				  NULL, NETIF_MSG_CLASS_COUNT,
+				  netif_msg_class_names, compact);
+}
+
+const struct ethnl_request_ops ethnl_debug_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_DEBUG_GET,
+	.reply_cmd		= ETHTOOL_MSG_DEBUG_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_DEBUG_HEADER,
+	.max_attr		= ETHTOOL_A_DEBUG_MAX,
+	.req_info_size		= sizeof(struct debug_req_info),
+	.reply_data_size	= sizeof(struct debug_reply_data),
+	.request_policy		= debug_get_policy,
+
+	.prepare_data		= debug_prepare_data,
+	.reply_size		= debug_reply_size,
+	.fill_reply		= debug_fill_reply,
+};
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 0af43bbdb9b2..bdaf583e392b 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -213,6 +213,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_LINKINFO_GET]	= &ethnl_linkinfo_request_ops,
 	[ETHTOOL_MSG_LINKMODES_GET]	= &ethnl_linkmodes_request_ops,
 	[ETHTOOL_MSG_LINKSTATE_GET]	= &ethnl_linkstate_request_ops,
+	[ETHTOOL_MSG_DEBUG_GET]		= &ethnl_debug_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -664,6 +665,13 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.dumpit	= ethnl_default_dumpit,
 		.done	= ethnl_default_done,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_DEBUG_GET,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index da9d6521a4eb..9bd8ef671501 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -334,6 +334,7 @@ extern const struct ethnl_request_ops ethnl_strset_request_ops;
 extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
 extern const struct ethnl_request_ops ethnl_linkmodes_request_ops;
 extern const struct ethnl_request_ops ethnl_linkstate_request_ops;
+extern const struct ethnl_request_ops ethnl_debug_request_ops;
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c
index 948d967f1eca..7a45c25355b8 100644
--- a/net/ethtool/strset.c
+++ b/net/ethtool/strset.c
@@ -50,6 +50,11 @@ static const struct strset_info info_template[] = {
 		.count		= __ETHTOOL_LINK_MODE_MASK_NBITS,
 		.strings	= link_mode_names,
 	},
+	[ETH_SS_MSG_CLASSES] = {
+		.per_dev	= false,
+		.count		= NETIF_MSG_CLASS_COUNT,
+		.strings	= netif_msg_class_names,
+	},
 };
 
 struct strset_req_info {
-- 
cgit v1.2.3


From e54d04e3afead22d8e7d6edaaac487a1205bac39 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Sun, 26 Jan 2020 23:11:07 +0100
Subject: ethtool: set message mask with DEBUG_SET request

Implement DEBUG_SET netlink request to set debugging settings for a device.
At the moment, only message mask corresponding to message level as set by
ETHTOOL_SMSGLVL ioctl request can be set. (It is called message level in
ioctl interface but almost all drivers interpret it as a bit mask.)
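
The read-modify-write pattern used for the mask can be modelled roughly as
below: only the bits selected by a request are touched, and a flag records
whether anything actually changed. update_bits32 is a hypothetical helper
written for this sketch; it is a simplified stand-in, not the bitset parser
the patch calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: update only the bits of *bitmap selected by mask,
 * taking their new state from value. Sets *mod to true when the stored
 * value changes; *mod is never reset to false here, so a caller can
 * accumulate it across several updates.
 */
void update_bits32(uint32_t *bitmap, uint32_t value, uint32_t mask,
		   bool *mod)
{
	uint32_t updated;

	assert(bitmap != NULL && mod != NULL);
	updated = (*bitmap & ~mask) | (value & mask);
	if (updated != *bitmap) {
		*bitmap = updated;
		*mod = true;
	}
}
```

A caller would read the current mask from the device, apply the update, and
write the result back only when the flag reports a change.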

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 20 ++++++++++-
 include/uapi/linux/ethtool_netlink.h         |  1 +
 net/ethtool/debug.c                          | 53 ++++++++++++++++++++++++++++
 net/ethtool/netlink.c                        |  5 +++
 net/ethtool/netlink.h                        |  1 +
 5 files changed, 79 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 76a411d3102d..58473c05319f 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -186,6 +186,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_LINKMODES_SET``         set link modes info
   ``ETHTOOL_MSG_LINKSTATE_GET``         get link state
   ``ETHTOOL_MSG_DEBUG_GET``             get debugging settings
+  ``ETHTOOL_MSG_DEBUG_SET``             set debugging settings
   ===================================== ================================
 
 Kernel to userspace:
@@ -455,6 +456,23 @@ interface follows its actual use in practice.
 devices supporting the request).
 
 
+DEBUG_SET
+=========
+
+Set or update debugging settings of a device. At the moment, only message mask
+is supported.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_DEBUG_HEADER``            nested  request header
+  ``ETHTOOL_A_DEBUG_MSGMASK``           bitset  message mask
+  ====================================  ======  ==========================
+
+The ``ETHTOOL_A_DEBUG_MSGMASK`` bitset allows setting or modifying the mask
+of enabled debugging message types for the device.
+
+
 Request translation
 ===================
 
@@ -474,7 +492,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GWOL``                    n/a
   ``ETHTOOL_SWOL``                    n/a
   ``ETHTOOL_GMSGLVL``                 ``ETHTOOL_MSG_DEBUG_GET``
-  ``ETHTOOL_SMSGLVL``                 n/a
+  ``ETHTOOL_SMSGLVL``                 ``ETHTOOL_MSG_DEBUG_SET``
   ``ETHTOOL_NWAY_RST``                n/a
   ``ETHTOOL_GLINK``                   ``ETHTOOL_MSG_LINKSTATE_GET``
   ``ETHTOOL_GEEPROM``                 n/a
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 0d39e04567cb..8f0b6fd277d5 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -21,6 +21,7 @@ enum {
 	ETHTOOL_MSG_LINKMODES_SET,
 	ETHTOOL_MSG_LINKSTATE_GET,
 	ETHTOOL_MSG_DEBUG_GET,
+	ETHTOOL_MSG_DEBUG_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
diff --git a/net/ethtool/debug.c b/net/ethtool/debug.c
index cc121a23be5f..6fc3ef8113c0 100644
--- a/net/ethtool/debug.c
+++ b/net/ethtool/debug.c
@@ -78,3 +78,56 @@ const struct ethnl_request_ops ethnl_debug_request_ops = {
 	.reply_size		= debug_reply_size,
 	.fill_reply		= debug_fill_reply,
 };
+
+/* DEBUG_SET */
+
+static const struct nla_policy
+debug_set_policy[ETHTOOL_A_DEBUG_MAX + 1] = {
+	[ETHTOOL_A_DEBUG_UNSPEC]	= { .type = NLA_REJECT },
+	[ETHTOOL_A_DEBUG_HEADER]	= { .type = NLA_NESTED },
+	[ETHTOOL_A_DEBUG_MSGMASK]	= { .type = NLA_NESTED },
+};
+
+int ethnl_set_debug(struct sk_buff *skb, struct genl_info *info)
+{
+	struct nlattr *tb[ETHTOOL_A_DEBUG_MAX + 1];
+	struct ethnl_req_info req_info = {};
+	struct net_device *dev;
+	bool mod = false;
+	u32 msg_mask;
+	int ret;
+
+	ret = nlmsg_parse(info->nlhdr, GENL_HDRLEN, tb,
+			  ETHTOOL_A_DEBUG_MAX, debug_set_policy,
+			  info->extack);
+	if (ret < 0)
+		return ret;
+	ret = ethnl_parse_header(&req_info, tb[ETHTOOL_A_DEBUG_HEADER],
+				 genl_info_net(info), info->extack, true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+	if (!dev->ethtool_ops->get_msglevel || !dev->ethtool_ops->set_msglevel)
+		return -EOPNOTSUPP;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	msg_mask = dev->ethtool_ops->get_msglevel(dev);
+	ret = ethnl_update_bitset32(&msg_mask, NETIF_MSG_CLASS_COUNT,
+				    tb[ETHTOOL_A_DEBUG_MSGMASK],
+				    netif_msg_class_names, info->extack, &mod);
+	if (ret < 0 || !mod)
+		goto out_ops;
+
+	dev->ethtool_ops->set_msglevel(dev, msg_mask);
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+	dev_put(dev);
+	return ret;
+}
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index bdaf583e392b..bcdc42e0bd14 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -672,6 +672,11 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.dumpit	= ethnl_default_dumpit,
 		.done	= ethnl_default_done,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_DEBUG_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_debug,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 9bd8ef671501..772723e536e8 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -338,5 +338,6 @@ extern const struct ethnl_request_ops ethnl_debug_request_ops;
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
+int ethnl_set_debug(struct sk_buff *skb, struct genl_info *info);
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
-- 
cgit v1.2.3


From 0bda7af39d2ba6ef83bd18e26bcb027f1630004e Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Sun, 26 Jan 2020 23:11:10 +0100
Subject: ethtool: add DEBUG_NTF notification

Send ETHTOOL_MSG_DEBUG_NTF notification message whenever the debugging
message mask for a device is modified using ETHTOOL_MSG_DEBUG_SET netlink
message or ETHTOOL_SMSGLVL ioctl request.

The notification message has the same format as reply to DEBUG_GET request.
As with other ethtool notifications, netlink requests only trigger the
notification if the mask is actually changed while ioctl requests trigger it
whenever the request results in calling the ethtool_ops handler.
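
The difference between the two trigger rules can be shown with a small
model. struct fake_dev, set_mask_netlink and set_mask_ioctl are hypothetical
names for this sketch only; the notification is reduced to a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model: a mask plus a count of notifications sent. */
struct fake_dev {
	uint32_t msg_mask;
	unsigned int notifications;
};

/* Netlink-style path: notify only if the stored mask really changes.
 * Returns true when a notification was counted.
 */
bool set_mask_netlink(struct fake_dev *dev, uint32_t new_mask)
{
	assert(dev != NULL);
	if (dev->msg_mask == new_mask)
		return false; /* nothing modified, stay silent */
	dev->msg_mask = new_mask;
	dev->notifications++;
	return true;
}

/* ioctl-style path: notify whenever the setter is called. */
void set_mask_ioctl(struct fake_dev *dev, uint32_t new_mask)
{
	assert(dev != NULL);
	dev->msg_mask = new_mask;
	dev->notifications++;
}
```

Under this model, listeners may see an ioctl-triggered notification whose
contents match the previous state, so they should not assume every
notification carries a change.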

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 1 +
 include/uapi/linux/ethtool_netlink.h         | 1 +
 net/ethtool/debug.c                          | 1 +
 net/ethtool/ioctl.c                          | 2 ++
 net/ethtool/netlink.c                        | 2 ++
 5 files changed, 7 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 58473c05319f..2263f885d18f 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -199,6 +199,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_LINKMODES_NTF``         link modes notification
   ``ETHTOOL_MSG_LINKSTATE_GET_REPLY``   link state info
   ``ETHTOOL_MSG_DEBUG_GET_REPLY``       debugging settings
+  ``ETHTOOL_MSG_DEBUG_NTF``             debugging settings notification
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 8f0b6fd277d5..67a06b94bf28 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -38,6 +38,7 @@ enum {
 	ETHTOOL_MSG_LINKMODES_NTF,
 	ETHTOOL_MSG_LINKSTATE_GET_REPLY,
 	ETHTOOL_MSG_DEBUG_GET_REPLY,
+	ETHTOOL_MSG_DEBUG_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
diff --git a/net/ethtool/debug.c b/net/ethtool/debug.c
index 6fc3ef8113c0..aaef4843e6ba 100644
--- a/net/ethtool/debug.c
+++ b/net/ethtool/debug.c
@@ -123,6 +123,7 @@ int ethnl_set_debug(struct sk_buff *skb, struct genl_info *info)
 		goto out_ops;
 
 	dev->ethtool_ops->set_msglevel(dev, msg_mask);
+	ethtool_notify(dev, ETHTOOL_MSG_DEBUG_NTF, NULL);
 
 out_ops:
 	ethnl_ops_complete(dev);
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 182bffbffa78..46e0b31782fc 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -2545,6 +2545,8 @@ int dev_ethtool(struct net *net, struct ifreq *ifr)
 	case ETHTOOL_SMSGLVL:
 		rc = ethtool_set_value_void(dev, useraddr,
 				       dev->ethtool_ops->set_msglevel);
+		if (!rc)
+			ethtool_notify(dev, ETHTOOL_MSG_DEBUG_NTF, NULL);
 		break;
 	case ETHTOOL_GEEE:
 		rc = ethtool_get_eee(dev, useraddr);
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index bcdc42e0bd14..9a0a6c2f8dbb 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -524,6 +524,7 @@ static const struct ethnl_request_ops *
 ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_LINKINFO_NTF]	= &ethnl_linkinfo_request_ops,
 	[ETHTOOL_MSG_LINKMODES_NTF]	= &ethnl_linkmodes_request_ops,
+	[ETHTOOL_MSG_DEBUG_NTF]		= &ethnl_debug_request_ops,
 };
 
 /* default notification handler */
@@ -607,6 +608,7 @@ typedef void (*ethnl_notify_handler_t)(struct net_device *dev, unsigned int cmd,
 static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_LINKINFO_NTF]	= ethnl_default_notify,
 	[ETHTOOL_MSG_LINKMODES_NTF]	= ethnl_default_notify,
+	[ETHTOOL_MSG_DEBUG_NTF]		= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
-- 
cgit v1.2.3


From 51ea22b04ea0c210f9ce87b8a600965dbe476bc2 Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Sun, 26 Jan 2020 23:11:13 +0100
Subject: ethtool: provide WoL settings with WOL_GET request

Implement WOL_GET request to get wake-on-lan settings for a device,
traditionally available via ETHTOOL_GWOL ioctl request.

As part of the implementation, provide symbolic names for wake-on-lan
modes as ETH_SS_WOL_MODES string set.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 30 ++++++++-
 include/uapi/linux/ethtool.h                 |  4 ++
 include/uapi/linux/ethtool_netlink.h         | 15 +++++
 net/ethtool/Makefile                         |  2 +-
 net/ethtool/common.c                         | 12 ++++
 net/ethtool/common.h                         |  1 +
 net/ethtool/netlink.c                        |  9 +++
 net/ethtool/netlink.h                        |  1 +
 net/ethtool/strset.c                         |  5 ++
 net/ethtool/wol.c                            | 99 ++++++++++++++++++++++++++++
 10 files changed, 176 insertions(+), 2 deletions(-)
 create mode 100644 net/ethtool/wol.c

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 2263f885d18f..5fd85e3ea96e 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -187,6 +187,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_LINKSTATE_GET``         get link state
   ``ETHTOOL_MSG_DEBUG_GET``             get debugging settings
   ``ETHTOOL_MSG_DEBUG_SET``             set debugging settings
+  ``ETHTOOL_MSG_WOL_GET``               get wake-on-lan settings
   ===================================== ================================
 
 Kernel to userspace:
@@ -200,6 +201,7 @@ Kernel to userspace:
   ``ETHTOOL_MSG_LINKSTATE_GET_REPLY``   link state info
   ``ETHTOOL_MSG_DEBUG_GET_REPLY``       debugging settings
   ``ETHTOOL_MSG_DEBUG_NTF``             debugging settings notification
+  ``ETHTOOL_MSG_WOL_GET_REPLY``         wake-on-lan settings
   ===================================== ================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
@@ -474,6 +476,32 @@ Request contents:
 enabled debugging message types for the device.
 
 
+WOL_GET
+=======
+
+Query device wake-on-lan settings. Unlike most "GET" type requests,
+``ETHTOOL_MSG_WOL_GET`` requires (netns) ``CAP_NET_ADMIN`` privileges as it
+(potentially) provides SecureOn(tm) password which is confidential.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_WOL_HEADER``              nested  request header
+  ====================================  ======  ==========================
+
+Kernel response contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_WOL_HEADER``              nested  reply header
+  ``ETHTOOL_A_WOL_MODES``               bitset  mask of enabled WoL modes
+  ``ETHTOOL_A_WOL_SOPASS``              binary  SecureOn(tm) password
+  ====================================  ======  ==========================
+
+In reply, ``ETHTOOL_A_WOL_MODES`` mask consists of modes supported by the
+device, value of modes which are enabled. ``ETHTOOL_A_WOL_SOPASS`` is only
+included in reply if ``WAKE_MAGICSECURE`` mode is supported.
+
+
 Request translation
 ===================
 
@@ -490,7 +518,7 @@ have their netlink replacement yet.
                                       ``ETHTOOL_MSG_LINKMODES_SET``
   ``ETHTOOL_GDRVINFO``                n/a
   ``ETHTOOL_GREGS``                   n/a
-  ``ETHTOOL_GWOL``                    n/a
+  ``ETHTOOL_GWOL``                    ``ETHTOOL_MSG_WOL_GET``
   ``ETHTOOL_SWOL``                    n/a
   ``ETHTOOL_GMSGLVL``                 ``ETHTOOL_MSG_DEBUG_GET``
   ``ETHTOOL_SMSGLVL``                 ``ETHTOOL_MSG_DEBUG_SET``
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 456fb2aa0fad..4295ebfa2f91 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -595,6 +595,7 @@ struct ethtool_pauseparam {
  * @ETH_SS_PHY_TUNABLES: PHY tunable names
  * @ETH_SS_LINK_MODES: link mode names
  * @ETH_SS_MSG_CLASSES: debug message class names
+ * @ETH_SS_WOL_MODES: wake-on-lan modes
  */
 enum ethtool_stringset {
 	ETH_SS_TEST		= 0,
@@ -608,6 +609,7 @@ enum ethtool_stringset {
 	ETH_SS_PHY_TUNABLES,
 	ETH_SS_LINK_MODES,
 	ETH_SS_MSG_CLASSES,
+	ETH_SS_WOL_MODES,
 
 	/* add new constants above here */
 	ETH_SS_COUNT
@@ -1695,6 +1697,8 @@ static inline int ethtool_validate_duplex(__u8 duplex)
 #define WAKE_MAGICSECURE	(1 << 6) /* only meaningful if WAKE_MAGIC */
 #define WAKE_FILTER		(1 << 7)
 
+#define WOL_MODE_COUNT		8
+
 /* L2-L4 network traffic flow types */
 #define	TCP_V4_FLOW	0x01	/* hash or spec (tcp_ip4_spec) */
 #define	UDP_V4_FLOW	0x02	/* hash or spec (udp_ip4_spec) */
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 67a06b94bf28..dcc5c32dc018 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -22,6 +22,7 @@ enum {
 	ETHTOOL_MSG_LINKSTATE_GET,
 	ETHTOOL_MSG_DEBUG_GET,
 	ETHTOOL_MSG_DEBUG_SET,
+	ETHTOOL_MSG_WOL_GET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
@@ -39,6 +40,7 @@ enum {
 	ETHTOOL_MSG_LINKSTATE_GET_REPLY,
 	ETHTOOL_MSG_DEBUG_GET_REPLY,
 	ETHTOOL_MSG_DEBUG_NTF,
+	ETHTOOL_MSG_WOL_GET_REPLY,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
@@ -211,6 +213,19 @@ enum {
 	ETHTOOL_A_DEBUG_MAX = __ETHTOOL_A_DEBUG_CNT - 1
 };
 
+/* WOL */
+
+enum {
+	ETHTOOL_A_WOL_UNSPEC,
+	ETHTOOL_A_WOL_HEADER,			/* nest - _A_HEADER_* */
+	ETHTOOL_A_WOL_MODES,			/* bitset */
+	ETHTOOL_A_WOL_SOPASS,			/* binary */
+
+	/* add new constants above here */
+	__ETHTOOL_A_WOL_CNT,
+	ETHTOOL_A_WOL_MAX = __ETHTOOL_A_WOL_CNT - 1
+};
+
 /* generic netlink info */
 #define ETHTOOL_GENL_NAME "ethtool"
 #define ETHTOOL_GENL_VERSION 1
diff --git a/net/ethtool/Makefile b/net/ethtool/Makefile
index c120c820a4f5..424545a4aaec 100644
--- a/net/ethtool/Makefile
+++ b/net/ethtool/Makefile
@@ -5,4 +5,4 @@ obj-y				+= ioctl.o common.o
 obj-$(CONFIG_ETHTOOL_NETLINK)	+= ethtool_nl.o
 
 ethtool_nl-y	:= netlink.o bitset.o strset.o linkinfo.o linkmodes.o \
-		   linkstate.o debug.o
+		   linkstate.o debug.o wol.o
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index aa1183a65a76..636ec6d5110e 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -190,6 +190,18 @@ const char netif_msg_class_names[][ETH_GSTRING_LEN] = {
 };
 static_assert(ARRAY_SIZE(netif_msg_class_names) == NETIF_MSG_CLASS_COUNT);
 
+const char wol_mode_names[][ETH_GSTRING_LEN] = {
+	[const_ilog2(WAKE_PHY)]		= "phy",
+	[const_ilog2(WAKE_UCAST)]	= "ucast",
+	[const_ilog2(WAKE_MCAST)]	= "mcast",
+	[const_ilog2(WAKE_BCAST)]	= "bcast",
+	[const_ilog2(WAKE_ARP)]		= "arp",
+	[const_ilog2(WAKE_MAGIC)]	= "magic",
+	[const_ilog2(WAKE_MAGICSECURE)]	= "magicsecure",
+	[const_ilog2(WAKE_FILTER)]	= "filter",
+};
+static_assert(ARRAY_SIZE(wol_mode_names) == WOL_MODE_COUNT);
+
 /* return false if legacy contained non-0 deprecated fields
  * maxtxpkt/maxrxpkt. rest of ksettings always updated
  */
diff --git a/net/ethtool/common.h b/net/ethtool/common.h
index 064c5c3aa990..40ba74e0b9bb 100644
--- a/net/ethtool/common.h
+++ b/net/ethtool/common.h
@@ -20,6 +20,7 @@ extern const char
 phy_tunable_strings[__ETHTOOL_PHY_TUNABLE_COUNT][ETH_GSTRING_LEN];
 extern const char link_mode_names[][ETH_GSTRING_LEN];
 extern const char netif_msg_class_names[][ETH_GSTRING_LEN];
+extern const char wol_mode_names[][ETH_GSTRING_LEN];
 
 int __ethtool_get_link(struct net_device *dev);
 
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 9a0a6c2f8dbb..eeb6d8594e1b 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -214,6 +214,7 @@ ethnl_default_requests[__ETHTOOL_MSG_USER_CNT] = {
 	[ETHTOOL_MSG_LINKMODES_GET]	= &ethnl_linkmodes_request_ops,
 	[ETHTOOL_MSG_LINKSTATE_GET]	= &ethnl_linkstate_request_ops,
 	[ETHTOOL_MSG_DEBUG_GET]		= &ethnl_debug_request_ops,
+	[ETHTOOL_MSG_WOL_GET]		= &ethnl_wol_request_ops,
 };
 
 static struct ethnl_dump_ctx *ethnl_dump_context(struct netlink_callback *cb)
@@ -679,6 +680,14 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.flags	= GENL_UNS_ADMIN_PERM,
 		.doit	= ethnl_set_debug,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_WOL_GET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_default_doit,
+		.start	= ethnl_default_start,
+		.dumpit	= ethnl_default_dumpit,
+		.done	= ethnl_default_done,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 772723e536e8..9fcd6f87b396 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -335,6 +335,7 @@ extern const struct ethnl_request_ops ethnl_linkinfo_request_ops;
 extern const struct ethnl_request_ops ethnl_linkmodes_request_ops;
 extern const struct ethnl_request_ops ethnl_linkstate_request_ops;
 extern const struct ethnl_request_ops ethnl_debug_request_ops;
+extern const struct ethnl_request_ops ethnl_wol_request_ops;
 
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
diff --git a/net/ethtool/strset.c b/net/ethtool/strset.c
index 7a45c25355b8..8e5911887b4c 100644
--- a/net/ethtool/strset.c
+++ b/net/ethtool/strset.c
@@ -55,6 +55,11 @@ static const struct strset_info info_template[] = {
 		.count		= NETIF_MSG_CLASS_COUNT,
 		.strings	= netif_msg_class_names,
 	},
+	[ETH_SS_WOL_MODES] = {
+		.per_dev	= false,
+		.count		= WOL_MODE_COUNT,
+		.strings	= wol_mode_names,
+	},
 };
 
 struct strset_req_info {
diff --git a/net/ethtool/wol.c b/net/ethtool/wol.c
new file mode 100644
index 000000000000..7c9a1ef622ce
--- /dev/null
+++ b/net/ethtool/wol.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "netlink.h"
+#include "common.h"
+#include "bitset.h"
+
+struct wol_req_info {
+	struct ethnl_req_info		base;
+};
+
+struct wol_reply_data {
+	struct ethnl_reply_data		base;
+	struct ethtool_wolinfo		wol;
+	bool				show_sopass;
+};
+
+#define WOL_REPDATA(__reply_base) \
+	container_of(__reply_base, struct wol_reply_data, base)
+
+static const struct nla_policy
+wol_get_policy[ETHTOOL_A_WOL_MAX + 1] = {
+	[ETHTOOL_A_WOL_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_WOL_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_WOL_MODES]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_WOL_SOPASS]		= { .type = NLA_REJECT },
+};
+
+static int wol_prepare_data(const struct ethnl_req_info *req_base,
+			    struct ethnl_reply_data *reply_base,
+			    struct genl_info *info)
+{
+	struct wol_reply_data *data = WOL_REPDATA(reply_base);
+	struct net_device *dev = reply_base->dev;
+	int ret;
+
+	if (!dev->ethtool_ops->get_wol)
+		return -EOPNOTSUPP;
+
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		return ret;
+	dev->ethtool_ops->get_wol(dev, &data->wol);
+	ethnl_ops_complete(dev);
+	data->show_sopass = data->wol.supported & WAKE_MAGICSECURE;
+
+	return 0;
+}
+
+static int wol_reply_size(const struct ethnl_req_info *req_base,
+			  const struct ethnl_reply_data *reply_base)
+{
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	const struct wol_reply_data *data = WOL_REPDATA(reply_base);
+	int len;
+
+	len = ethnl_bitset32_size(&data->wol.wolopts, &data->wol.supported,
+				  WOL_MODE_COUNT, wol_mode_names, compact);
+	if (len < 0)
+		return len;
+	if (data->show_sopass)
+		len += nla_total_size(sizeof(data->wol.sopass));
+
+	return len;
+}
+
+static int wol_fill_reply(struct sk_buff *skb,
+			  const struct ethnl_req_info *req_base,
+			  const struct ethnl_reply_data *reply_base)
+{
+	bool compact = req_base->flags & ETHTOOL_FLAG_COMPACT_BITSETS;
+	const struct wol_reply_data *data = WOL_REPDATA(reply_base);
+	int ret;
+
+	ret = ethnl_put_bitset32(skb, ETHTOOL_A_WOL_MODES, &data->wol.wolopts,
+				 &data->wol.supported, WOL_MODE_COUNT,
+				 wol_mode_names, compact);
+	if (ret < 0)
+		return ret;
+	if (data->show_sopass &&
+	    nla_put(skb, ETHTOOL_A_WOL_SOPASS, sizeof(data->wol.sopass),
+		    data->wol.sopass))
+		return -EMSGSIZE;
+
+	return 0;
+}
+
+const struct ethnl_request_ops ethnl_wol_request_ops = {
+	.request_cmd		= ETHTOOL_MSG_WOL_GET,
+	.reply_cmd		= ETHTOOL_MSG_WOL_GET_REPLY,
+	.hdr_attr		= ETHTOOL_A_WOL_HEADER,
+	.max_attr		= ETHTOOL_A_WOL_MAX,
+	.req_info_size		= sizeof(struct wol_req_info),
+	.reply_data_size	= sizeof(struct wol_reply_data),
+	.request_policy		= wol_get_policy,
+
+	.prepare_data		= wol_prepare_data,
+	.reply_size		= wol_reply_size,
+	.fill_reply		= wol_fill_reply,
+};
-- 
cgit v1.2.3


From 8d425b19b305a77cb1ba67b84e7c1ca547de032f Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Sun, 26 Jan 2020 23:11:16 +0100
Subject: ethtool: set wake-on-lan settings with WOL_SET request

Implement WOL_SET netlink request to set wake-on-lan settings. This is
equivalent to ETHTOOL_SWOL ioctl request.
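
The two sanity checks done before calling the driver can be modelled as
a pure function. This is only a sketch: wol_set_check() and the
SKETCH_ constant are hypothetical names, and the real code additionally
reports the offending attribute via extack.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit as WAKE_MAGICSECURE in include/uapi/linux/ethtool.h. */
#define SKETCH_WAKE_MAGICSECURE (1U << 6)

/*
 * Hypothetical model of the validation in ethnl_set_wol():
 * - every requested mode must be supported by the device;
 * - a SecureOn password is only accepted if magicsecure is supported.
 * Returns 0 if the request is acceptable, -EINVAL otherwise.
 */
static int wol_set_check(uint32_t supported, uint32_t requested,
			 bool has_sopass)
{
	if (requested & ~supported)
		return -EINVAL;
	if (has_sopass && !(supported & SKETCH_WAKE_MAGICSECURE))
		return -EINVAL;
	return 0;
}
```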

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 20 +++++++-
 include/uapi/linux/ethtool_netlink.h         |  1 +
 net/ethtool/netlink.c                        |  5 ++
 net/ethtool/netlink.h                        |  1 +
 net/ethtool/wol.c                            | 76 ++++++++++++++++++++++++++++
 5 files changed, 102 insertions(+), 1 deletion(-)

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index 5fd85e3ea96e..f16f74bbb546 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -188,6 +188,7 @@ Userspace to kernel:
   ``ETHTOOL_MSG_DEBUG_GET``             get debugging settings
   ``ETHTOOL_MSG_DEBUG_SET``             set debugging settings
   ``ETHTOOL_MSG_WOL_GET``               get wake-on-lan settings
+  ``ETHTOOL_MSG_WOL_SET``               set wake-on-lan settings
   ===================================== ================================
 
 Kernel to userspace:
@@ -502,6 +503,23 @@ device, value of modes which are enabled. ``ETHTOOL_A_WOL_SOPASS`` is only
 included in reply if ``WAKE_MAGICSECURE`` mode is supported.
 
 
+WOL_SET
+=======
+
+Set or update wake-on-lan settings.
+
+Request contents:
+
+  ====================================  ======  ==========================
+  ``ETHTOOL_A_WOL_HEADER``              nested  request header
+  ``ETHTOOL_A_WOL_MODES``               bitset  enabled WoL modes
+  ``ETHTOOL_A_WOL_SOPASS``              binary  SecureOn(tm) password
+  ====================================  ======  ==========================
+
+``ETHTOOL_A_WOL_SOPASS`` is only allowed for devices supporting
+``WAKE_MAGICSECURE`` mode.
+
+
 Request translation
 ===================
 
@@ -519,7 +537,7 @@ have their netlink replacement yet.
   ``ETHTOOL_GDRVINFO``                n/a
   ``ETHTOOL_GREGS``                   n/a
   ``ETHTOOL_GWOL``                    ``ETHTOOL_MSG_WOL_GET``
-  ``ETHTOOL_SWOL``                    n/a
+  ``ETHTOOL_SWOL``                    ``ETHTOOL_MSG_WOL_SET``
   ``ETHTOOL_GMSGLVL``                 ``ETHTOOL_MSG_DEBUG_GET``
   ``ETHTOOL_SMSGLVL``                 ``ETHTOOL_MSG_DEBUG_SET``
   ``ETHTOOL_NWAY_RST``                n/a
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index dcc5c32dc018..59de35695521 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -23,6 +23,7 @@ enum {
 	ETHTOOL_MSG_DEBUG_GET,
 	ETHTOOL_MSG_DEBUG_SET,
 	ETHTOOL_MSG_WOL_GET,
+	ETHTOOL_MSG_WOL_SET,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_USER_CNT,
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index eeb6d8594e1b..2c375f9095fe 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -688,6 +688,11 @@ static const struct genl_ops ethtool_genl_ops[] = {
 		.dumpit	= ethnl_default_dumpit,
 		.done	= ethnl_default_done,
 	},
+	{
+		.cmd	= ETHTOOL_MSG_WOL_SET,
+		.flags	= GENL_UNS_ADMIN_PERM,
+		.doit	= ethnl_set_wol,
+	},
 };
 
 static const struct genl_multicast_group ethtool_nl_mcgrps[] = {
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index 9fcd6f87b396..60efd87686ad 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -340,5 +340,6 @@ extern const struct ethnl_request_ops ethnl_wol_request_ops;
 int ethnl_set_linkinfo(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_linkmodes(struct sk_buff *skb, struct genl_info *info);
 int ethnl_set_debug(struct sk_buff *skb, struct genl_info *info);
+int ethnl_set_wol(struct sk_buff *skb, struct genl_info *info);
 
 #endif /* _NET_ETHTOOL_NETLINK_H */
diff --git a/net/ethtool/wol.c b/net/ethtool/wol.c
index 7c9a1ef622ce..a2724378fac4 100644
--- a/net/ethtool/wol.c
+++ b/net/ethtool/wol.c
@@ -97,3 +97,79 @@ const struct ethnl_request_ops ethnl_wol_request_ops = {
 	.reply_size		= wol_reply_size,
 	.fill_reply		= wol_fill_reply,
 };
+
+/* WOL_SET */
+
+static const struct nla_policy
+wol_set_policy[ETHTOOL_A_WOL_MAX + 1] = {
+	[ETHTOOL_A_WOL_UNSPEC]		= { .type = NLA_REJECT },
+	[ETHTOOL_A_WOL_HEADER]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_WOL_MODES]		= { .type = NLA_NESTED },
+	[ETHTOOL_A_WOL_SOPASS]		= { .type = NLA_BINARY,
+					    .len = SOPASS_MAX },
+};
+
+int ethnl_set_wol(struct sk_buff *skb, struct genl_info *info)
+{
+	struct ethtool_wolinfo wol = { .cmd = ETHTOOL_GWOL };
+	struct nlattr *tb[ETHTOOL_A_WOL_MAX + 1];
+	struct ethnl_req_info req_info = {};
+	struct net_device *dev;
+	bool mod = false;
+	int ret;
+
+	ret = nlmsg_parse(info->nlhdr, GENL_HDRLEN, tb, ETHTOOL_A_WOL_MAX,
+			  wol_set_policy, info->extack);
+	if (ret < 0)
+		return ret;
+	ret = ethnl_parse_header(&req_info, tb[ETHTOOL_A_WOL_HEADER],
+				 genl_info_net(info), info->extack, true);
+	if (ret < 0)
+		return ret;
+	dev = req_info.dev;
+	if (!dev->ethtool_ops->get_wol || !dev->ethtool_ops->set_wol)
+		return -EOPNOTSUPP;
+
+	rtnl_lock();
+	ret = ethnl_ops_begin(dev);
+	if (ret < 0)
+		goto out_rtnl;
+
+	dev->ethtool_ops->get_wol(dev, &wol);
+	ret = ethnl_update_bitset32(&wol.wolopts, WOL_MODE_COUNT,
+				    tb[ETHTOOL_A_WOL_MODES], wol_mode_names,
+				    info->extack, &mod);
+	if (ret < 0)
+		goto out_ops;
+	if (wol.wolopts & ~wol.supported) {
+		NL_SET_ERR_MSG_ATTR(info->extack, tb[ETHTOOL_A_WOL_MODES],
+				    "cannot enable unsupported WoL mode");
+		ret = -EINVAL;
+		goto out_ops;
+	}
+	if (tb[ETHTOOL_A_WOL_SOPASS]) {
+		if (!(wol.supported & WAKE_MAGICSECURE)) {
+			NL_SET_ERR_MSG_ATTR(info->extack,
+					    tb[ETHTOOL_A_WOL_SOPASS],
+					    "magicsecure not supported, cannot set password");
+			ret = -EINVAL;
+			goto out_ops;
+		}
+		ethnl_update_binary(wol.sopass, sizeof(wol.sopass),
+				    tb[ETHTOOL_A_WOL_SOPASS], &mod);
+	}
+
+	if (!mod)
+		goto out_ops;
+	ret = dev->ethtool_ops->set_wol(dev, &wol);
+	if (ret)
+		goto out_ops;
+	dev->wol_enabled = !!wol.wolopts;
+
+out_ops:
+	ethnl_ops_complete(dev);
+out_rtnl:
+	rtnl_unlock();
+	dev_put(dev);
+	return ret;
+}
-- 
cgit v1.2.3


From 67bffa79231f15ba372007972563cc8aa5e48bfa Mon Sep 17 00:00:00 2001
From: Michal Kubecek <mkubecek@suse.cz>
Date: Sun, 26 Jan 2020 23:11:19 +0100
Subject: ethtool: add WOL_NTF notification

Send ETHTOOL_MSG_WOL_NTF notification whenever wake-on-lan settings of
a device are modified using ETHTOOL_MSG_WOL_SET netlink message or
ETHTOOL_SWOL ioctl request.

As notifications can be received by anyone, do not include SecureOn(tm)
password in notification messages.

Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 Documentation/networking/ethtool-netlink.rst | 5 +++--
 include/uapi/linux/ethtool_netlink.h         | 1 +
 net/ethtool/ioctl.c                          | 1 +
 net/ethtool/netlink.c                        | 2 ++
 net/ethtool/wol.c                            | 4 +++-
 5 files changed, 10 insertions(+), 3 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst
index f16f74bbb546..f1f868479ceb 100644
--- a/Documentation/networking/ethtool-netlink.rst
+++ b/Documentation/networking/ethtool-netlink.rst
@@ -193,7 +193,7 @@ Userspace to kernel:
 
 Kernel to userspace:
 
-  ===================================== ================================
+  ===================================== =================================
   ``ETHTOOL_MSG_STRSET_GET_REPLY``      string set contents
   ``ETHTOOL_MSG_LINKINFO_GET_REPLY``    link settings
   ``ETHTOOL_MSG_LINKINFO_NTF``          link settings notification
@@ -203,7 +203,8 @@ Kernel to userspace:
   ``ETHTOOL_MSG_DEBUG_GET_REPLY``       debugging settings
   ``ETHTOOL_MSG_DEBUG_NTF``             debugging settings notification
   ``ETHTOOL_MSG_WOL_GET_REPLY``         wake-on-lan settings
-  ===================================== ================================
+  ``ETHTOOL_MSG_WOL_NTF``               wake-on-lan settings notification
+  ===================================== =================================
 
 ``GET`` requests are sent by userspace applications to retrieve device
 information. They usually do not contain any message specific attributes.
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index 59de35695521..7e0b460f872c 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -42,6 +42,7 @@ enum {
 	ETHTOOL_MSG_DEBUG_GET_REPLY,
 	ETHTOOL_MSG_DEBUG_NTF,
 	ETHTOOL_MSG_WOL_GET_REPLY,
+	ETHTOOL_MSG_WOL_NTF,
 
 	/* add new constants above here */
 	__ETHTOOL_MSG_KERNEL_CNT,
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 46e0b31782fc..b88dd14e41c6 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1316,6 +1316,7 @@ static int ethtool_set_wol(struct net_device *dev, char __user *useraddr)
 		return ret;
 
 	dev->wol_enabled = !!wol.wolopts;
+	ethtool_notify(dev, ETHTOOL_MSG_WOL_NTF, NULL);
 
 	return 0;
 }
diff --git a/net/ethtool/netlink.c b/net/ethtool/netlink.c
index 2c375f9095fe..180c194fab07 100644
--- a/net/ethtool/netlink.c
+++ b/net/ethtool/netlink.c
@@ -526,6 +526,7 @@ ethnl_default_notify_ops[ETHTOOL_MSG_KERNEL_MAX + 1] = {
 	[ETHTOOL_MSG_LINKINFO_NTF]	= &ethnl_linkinfo_request_ops,
 	[ETHTOOL_MSG_LINKMODES_NTF]	= &ethnl_linkmodes_request_ops,
 	[ETHTOOL_MSG_DEBUG_NTF]		= &ethnl_debug_request_ops,
+	[ETHTOOL_MSG_WOL_NTF]		= &ethnl_wol_request_ops,
 };
 
 /* default notification handler */
@@ -610,6 +611,7 @@ static const ethnl_notify_handler_t ethnl_notify_handlers[] = {
 	[ETHTOOL_MSG_LINKINFO_NTF]	= ethnl_default_notify,
 	[ETHTOOL_MSG_LINKMODES_NTF]	= ethnl_default_notify,
 	[ETHTOOL_MSG_DEBUG_NTF]		= ethnl_default_notify,
+	[ETHTOOL_MSG_WOL_NTF]		= ethnl_default_notify,
 };
 
 void ethtool_notify(struct net_device *dev, unsigned int cmd, const void *data)
diff --git a/net/ethtool/wol.c b/net/ethtool/wol.c
index a2724378fac4..e1b8a65b64c4 100644
--- a/net/ethtool/wol.c
+++ b/net/ethtool/wol.c
@@ -41,7 +41,8 @@ static int wol_prepare_data(const struct ethnl_req_info *req_base,
 		return ret;
 	dev->ethtool_ops->get_wol(dev, &data->wol);
 	ethnl_ops_complete(dev);
-	data->show_sopass = data->wol.supported & WAKE_MAGICSECURE;
+	/* do not include password in notifications */
+	data->show_sopass = info && (data->wol.supported & WAKE_MAGICSECURE);
 
 	return 0;
 }
@@ -165,6 +166,7 @@ int ethnl_set_wol(struct sk_buff *skb, struct genl_info *info)
 	if (ret)
 		goto out_ops;
 	dev->wol_enabled = !!wol.wolopts;
+	ethtool_notify(dev, ETHTOOL_MSG_WOL_NTF, NULL);
 
 out_ops:
 	ethnl_ops_complete(dev);
-- 
cgit v1.2.3


From 8d19f1c8e1937baf74e1962aae9f90fa3aeab463 Mon Sep 17 00:00:00 2001
From: Mike Christie <mchristi@redhat.com>
Date: Mon, 11 Nov 2019 18:19:00 -0600
Subject: prctl: PR_{G,S}ET_IO_FLUSHER to support controlling memory reclaim

There are several storage drivers like dm-multipath, iscsi, tcmu-runner,
and nbd that have userspace components that can run in the IO path. For
example, iscsi and nbd's userspace daemons may need to recreate a socket
and/or send IO on it, and dm-multipath's daemon multipathd may need to
send SG IO or read/write IO to figure out the state of paths and re-set
them up.

In the kernel these drivers have access to GFP_NOIO/GFP_NOFS and the
memalloc_*_save/restore functions to control the allocation behavior,
but for userspace we would end up hitting an allocation that ended up
writing data back to the same device we are trying to allocate for.
The device is then in a state of deadlock, because to execute IO the
device needs to allocate memory, but to allocate memory the memory
layers want to execute IO to the device.

Here is an example with nbd using a local userspace daemon that performs
network IO to a remote server. We are using XFS on top of the nbd device,
but it can happen with any FS or other modules layered on top of the nbd
device that can write out data to free memory.  Here a nbd daemon helper
thread, msgr-worker-1, is performing a write/sendmsg on a socket to execute
a request. This kicks off a reclaim operation which results in a WRITE to
the nbd device and the nbd thread calling back into the mm layer.

[ 1626.609191] msgr-worker-1   D    0  1026      1 0x00004000
[ 1626.609193] Call Trace:
[ 1626.609195]  ? __schedule+0x29b/0x630
[ 1626.609197]  ? wait_for_completion+0xe0/0x170
[ 1626.609198]  schedule+0x30/0xb0
[ 1626.609200]  schedule_timeout+0x1f6/0x2f0
[ 1626.609202]  ? blk_finish_plug+0x21/0x2e
[ 1626.609204]  ? _xfs_buf_ioapply+0x2e6/0x410
[ 1626.609206]  ? wait_for_completion+0xe0/0x170
[ 1626.609208]  wait_for_completion+0x108/0x170
[ 1626.609210]  ? wake_up_q+0x70/0x70
[ 1626.609212]  ? __xfs_buf_submit+0x12e/0x250
[ 1626.609214]  ? xfs_bwrite+0x25/0x60
[ 1626.609215]  xfs_buf_iowait+0x22/0xf0
[ 1626.609218]  __xfs_buf_submit+0x12e/0x250
[ 1626.609220]  xfs_bwrite+0x25/0x60
[ 1626.609222]  xfs_reclaim_inode+0x2e8/0x310
[ 1626.609224]  xfs_reclaim_inodes_ag+0x1b6/0x300
[ 1626.609227]  xfs_reclaim_inodes_nr+0x31/0x40
[ 1626.609228]  super_cache_scan+0x152/0x1a0
[ 1626.609231]  do_shrink_slab+0x12c/0x2d0
[ 1626.609233]  shrink_slab+0x9c/0x2a0
[ 1626.609235]  shrink_node+0xd7/0x470
[ 1626.609237]  do_try_to_free_pages+0xbf/0x380
[ 1626.609240]  try_to_free_pages+0xd9/0x1f0
[ 1626.609245]  __alloc_pages_slowpath+0x3a4/0xd30
[ 1626.609251]  ? ___slab_alloc+0x238/0x560
[ 1626.609254]  __alloc_pages_nodemask+0x30c/0x350
[ 1626.609259]  skb_page_frag_refill+0x97/0xd0
[ 1626.609274]  sk_page_frag_refill+0x1d/0x80
[ 1626.609279]  tcp_sendmsg_locked+0x2bb/0xdd0
[ 1626.609304]  tcp_sendmsg+0x27/0x40
[ 1626.609307]  sock_sendmsg+0x54/0x60
[ 1626.609308]  ___sys_sendmsg+0x29f/0x320
[ 1626.609313]  ? sock_poll+0x66/0xb0
[ 1626.609318]  ? ep_item_poll.isra.15+0x40/0xc0
[ 1626.609320]  ? ep_send_events_proc+0xe6/0x230
[ 1626.609322]  ? hrtimer_try_to_cancel+0x54/0xf0
[ 1626.609324]  ? ep_read_events_proc+0xc0/0xc0
[ 1626.609326]  ? _raw_write_unlock_irq+0xa/0x20
[ 1626.609327]  ? ep_scan_ready_list.constprop.19+0x218/0x230
[ 1626.609329]  ? __hrtimer_init+0xb0/0xb0
[ 1626.609331]  ? _raw_spin_unlock_irq+0xa/0x20
[ 1626.609334]  ? ep_poll+0x26c/0x4a0
[ 1626.609337]  ? tcp_tsq_write.part.54+0xa0/0xa0
[ 1626.609339]  ? release_sock+0x43/0x90
[ 1626.609341]  ? _raw_spin_unlock_bh+0xa/0x20
[ 1626.609342]  __sys_sendmsg+0x47/0x80
[ 1626.609347]  do_syscall_64+0x5f/0x1c0
[ 1626.609349]  ? prepare_exit_to_usermode+0x75/0xa0
[ 1626.609351]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This patch adds a new prctl command that daemons can use after they have
done their initial setup, and before they start to do allocations that
are in the IO path. It sets the PF_MEMALLOC_NOIO and PF_LESS_THROTTLE
flags so both userspace block and FS threads can use it to avoid the
allocation recursion and to try to avoid being throttled while
writing out data to free up memory.
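
A minimal sketch of how a daemon thread might use the new commands,
assuming a Linux libc that provides <sys/prctl.h>. The option values are
the ones from the prctl.h hunk in this patch; the helper names
io_flusher_enable() and io_flusher_query() are hypothetical.

```c
#include <errno.h>
#include <sys/prctl.h>

/*
 * Values taken from the include/uapi/linux/prctl.h hunk in this patch;
 * defined here only in case the installed headers predate it.
 */
#ifndef PR_SET_IO_FLUSHER
#define PR_SET_IO_FLUSHER 57
#endif
#ifndef PR_GET_IO_FLUSHER
#define PR_GET_IO_FLUSHER 58
#endif

/*
 * Hypothetical helper: mark the calling thread as an IO flusher.
 * Returns 0 on success or a negative errno value (-EPERM without
 * CAP_SYS_RESOURCE, -EINVAL on kernels that lack the command).
 */
static int io_flusher_enable(void)
{
	if (prctl(PR_SET_IO_FLUSHER, 1UL, 0UL, 0UL, 0UL) == -1)
		return -errno;
	return 0;
}

/* Returns 1 if the state is set, 0 if clear, or a negative errno value. */
static int io_flusher_query(void)
{
	int ret = prctl(PR_GET_IO_FLUSHER, 0UL, 0UL, 0UL, 0UL);

	return ret == -1 ? -errno : ret;
}
```

The flags live in current->flags, so the setting applies to the calling
thread; a daemon with several IO path threads would call
io_flusher_enable() in each of them after initial setup.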

Signed-off-by: Mike Christie <mchristi@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Masato Suzuki <masato.suzuki@wdc.com>
Reviewed-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Link: https://lore.kernel.org/r/20191112001900.9206-1-mchristi@redhat.com
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 include/uapi/linux/capability.h |  1 +
 include/uapi/linux/prctl.h      |  4 ++++
 kernel/sys.c                    | 25 +++++++++++++++++++++++++
 3 files changed, 30 insertions(+)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/capability.h b/include/uapi/linux/capability.h
index 240fdb9a60f6..272dc69fa080 100644
--- a/include/uapi/linux/capability.h
+++ b/include/uapi/linux/capability.h
@@ -301,6 +301,7 @@ struct vfs_ns_cap_data {
 /* Allow more than 64hz interrupts from the real-time clock */
 /* Override max number of consoles on console allocation */
 /* Override max number of keymaps */
+/* Control memory reclaim behavior */
 
 #define CAP_SYS_RESOURCE     24
 
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 7da1b37b27aa..07b4f8131e36 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -234,4 +234,8 @@ struct prctl_mm_map {
 #define PR_GET_TAGGED_ADDR_CTRL		56
 # define PR_TAGGED_ADDR_ENABLE		(1UL << 0)
 
+/* Control reclaim behavior when allocating memory */
+#define PR_SET_IO_FLUSHER		57
+#define PR_GET_IO_FLUSHER		58
+
 #endif /* _LINUX_PRCTL_H */
diff --git a/kernel/sys.c b/kernel/sys.c
index a9331f101883..f9bc5c303e3f 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2261,6 +2261,8 @@ int __weak arch_prctl_spec_ctrl_set(struct task_struct *t, unsigned long which,
 	return -EINVAL;
 }
 
+#define PR_IO_FLUSHER (PF_MEMALLOC_NOIO | PF_LESS_THROTTLE)
+
 SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 		unsigned long, arg4, unsigned long, arg5)
 {
@@ -2488,6 +2490,29 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, unsigned long, arg3,
 			return -EINVAL;
 		error = GET_TAGGED_ADDR_CTRL();
 		break;
+	case PR_SET_IO_FLUSHER:
+		if (!capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+
+		if (arg3 || arg4 || arg5)
+			return -EINVAL;
+
+		if (arg2 == 1)
+			current->flags |= PR_IO_FLUSHER;
+		else if (!arg2)
+			current->flags &= ~PR_IO_FLUSHER;
+		else
+			return -EINVAL;
+		break;
+	case PR_GET_IO_FLUSHER:
+		if (!capable(CAP_SYS_RESOURCE))
+			return -EPERM;
+
+		if (arg2 || arg3 || arg4 || arg5)
+			return -EINVAL;
+
+		error = (current->flags & PR_IO_FLUSHER) == PR_IO_FLUSHER;
+		break;
 	default:
 		error = -EINVAL;
 		break;
-- 
cgit v1.2.3


From cccf0ee834559ae0b327b40290e14f6a2a017177 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Mon, 27 Jan 2020 16:34:48 -0700
Subject: io_uring/io-wq: don't use static creds/mm assignments

We currently set up the io_wq with a static set of mm and creds. Even for
a single-use io-wq per io_uring, this is suboptimal as we may have
multiple enters of the ring. For sharing the io-wq backend, it doesn't
work at all.

Switch to passing in the creds and mm when the work item is set up. This
means that async work is no longer deferred to the io_uring mm and creds,
it is done with the current mm and creds.

Flag this behavior with IORING_FEAT_CUR_PERSONALITY, so applications know
they can rely on the current personality (mm and creds) being the same
for direct issue and async issue.
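
An application could test for the feature roughly as follows, using the
features word that io_uring_setup() returns in struct io_uring_params.
This is a sketch under assumptions: the bit value (1U << 4) is taken
from mainline include/uapi/linux/io_uring.h since that hunk is not shown
here, and the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed value: bit 4, as in mainline include/uapi/linux/io_uring.h.
 * Prefer the definition from the installed header when it is available.
 */
#define SKETCH_IORING_FEAT_CUR_PERSONALITY (1U << 4)

/*
 * Hypothetical helper: returns true if async work is issued with the
 * submitting task's current mm and creds, judging by the features word
 * reported by the kernel at ring setup time.
 */
static bool async_uses_current_personality(uint32_t features)
{
	return (features & SKETCH_IORING_FEAT_CUR_PERSONALITY) != 0;
}
```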

Reviewed-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io-wq.c                    | 68 +++++++++++++++++++++++++++++--------------
 fs/io-wq.h                    |  7 +++--
 fs/io_uring.c                 | 36 +++++++++++++++++++----
 include/uapi/linux/io_uring.h |  1 +
 4 files changed, 82 insertions(+), 30 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/fs/io-wq.c b/fs/io-wq.c
index 54e270ae12ab..7ccedc82a703 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -56,7 +56,8 @@ struct io_worker {
 
 	struct rcu_head rcu;
 	struct mm_struct *mm;
-	const struct cred *creds;
+	const struct cred *cur_creds;
+	const struct cred *saved_creds;
 	struct files_struct *restore_files;
 };
 
@@ -109,8 +110,6 @@ struct io_wq {
 
 	struct task_struct *manager;
 	struct user_struct *user;
-	const struct cred *creds;
-	struct mm_struct *mm;
 	refcount_t refs;
 	struct completion done;
 
@@ -137,9 +136,9 @@ static bool __io_worker_unuse(struct io_wqe *wqe, struct io_worker *worker)
 {
 	bool dropped_lock = false;
 
-	if (worker->creds) {
-		revert_creds(worker->creds);
-		worker->creds = NULL;
+	if (worker->saved_creds) {
+		revert_creds(worker->saved_creds);
+		worker->cur_creds = worker->saved_creds = NULL;
 	}
 
 	if (current->files != worker->restore_files) {
@@ -398,6 +397,43 @@ static struct io_wq_work *io_get_next_work(struct io_wqe *wqe, unsigned *hash)
 	return NULL;
 }
 
+static void io_wq_switch_mm(struct io_worker *worker, struct io_wq_work *work)
+{
+	if (worker->mm) {
+		unuse_mm(worker->mm);
+		mmput(worker->mm);
+		worker->mm = NULL;
+	}
+	if (!work->mm) {
+		set_fs(KERNEL_DS);
+		return;
+	}
+	if (mmget_not_zero(work->mm)) {
+		use_mm(work->mm);
+		if (!worker->mm)
+			set_fs(USER_DS);
+		worker->mm = work->mm;
+		/* hang on to this mm */
+		work->mm = NULL;
+		return;
+	}
+
+	/* failed grabbing mm, ensure work gets cancelled */
+	work->flags |= IO_WQ_WORK_CANCEL;
+}
+
+static void io_wq_switch_creds(struct io_worker *worker,
+			       struct io_wq_work *work)
+{
+	const struct cred *old_creds = override_creds(work->creds);
+
+	worker->cur_creds = work->creds;
+	if (worker->saved_creds)
+		put_cred(old_creds); /* creds set by previous switch */
+	else
+		worker->saved_creds = old_creds;
+}
+
 static void io_worker_handle_work(struct io_worker *worker)
 	__releases(wqe->lock)
 {
@@ -446,18 +482,10 @@ next:
 			current->files = work->files;
 			task_unlock(current);
 		}
-		if ((work->flags & IO_WQ_WORK_NEEDS_USER) && !worker->mm &&
-		    wq->mm) {
-			if (mmget_not_zero(wq->mm)) {
-				use_mm(wq->mm);
-				set_fs(USER_DS);
-				worker->mm = wq->mm;
-			} else {
-				work->flags |= IO_WQ_WORK_CANCEL;
-			}
-		}
-		if (!worker->creds)
-			worker->creds = override_creds(wq->creds);
+		if (work->mm != worker->mm)
+			io_wq_switch_mm(worker, work);
+		if (worker->cur_creds != work->creds)
+			io_wq_switch_creds(worker, work);
 		/*
 		 * OK to set IO_WQ_WORK_CANCEL even for uncancellable work,
 		 * the worker function will do the right thing.
@@ -1037,7 +1065,6 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 
 	/* caller must already hold a reference to this */
 	wq->user = data->user;
-	wq->creds = data->creds;
 
 	for_each_node(node) {
 		struct io_wqe *wqe;
@@ -1064,9 +1091,6 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
 
 	init_completion(&wq->done);
 
-	/* caller must have already done mmgrab() on this mm */
-	wq->mm = data->mm;
-
 	wq->manager = kthread_create(io_wq_manager, wq, "io_wq_manager");
 	if (!IS_ERR(wq->manager)) {
 		wake_up_process(wq->manager);
diff --git a/fs/io-wq.h b/fs/io-wq.h
index 1cd039af8813..167316ad447e 100644
--- a/fs/io-wq.h
+++ b/fs/io-wq.h
@@ -7,7 +7,6 @@ enum {
 	IO_WQ_WORK_CANCEL	= 1,
 	IO_WQ_WORK_HAS_MM	= 2,
 	IO_WQ_WORK_HASHED	= 4,
-	IO_WQ_WORK_NEEDS_USER	= 8,
 	IO_WQ_WORK_NEEDS_FILES	= 16,
 	IO_WQ_WORK_UNBOUND	= 32,
 	IO_WQ_WORK_INTERNAL	= 64,
@@ -74,6 +73,8 @@ struct io_wq_work {
 	};
 	void (*func)(struct io_wq_work **);
 	struct files_struct *files;
+	struct mm_struct *mm;
+	const struct cred *creds;
 	unsigned flags;
 };
 
@@ -83,15 +84,15 @@ struct io_wq_work {
 		(work)->func = _func;			\
 		(work)->flags = 0;			\
 		(work)->files = NULL;			\
+		(work)->mm = NULL;			\
+		(work)->creds = NULL;			\
 	} while (0)					\
 
 typedef void (get_work_fn)(struct io_wq_work *);
 typedef void (put_work_fn)(struct io_wq_work *);
 
 struct io_wq_data {
-	struct mm_struct *mm;
 	struct user_struct *user;
-	const struct cred *creds;
 
 	get_work_fn *get_work;
 	put_work_fn *put_work;
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1dd20305c664..0ea36911745d 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -875,6 +875,29 @@ static void __io_commit_cqring(struct io_ring_ctx *ctx)
 	}
 }
 
+static inline void io_req_work_grab_env(struct io_kiocb *req,
+					const struct io_op_def *def)
+{
+	if (!req->work.mm && def->needs_mm) {
+		mmgrab(current->mm);
+		req->work.mm = current->mm;
+	}
+	if (!req->work.creds)
+		req->work.creds = get_current_cred();
+}
+
+static inline void io_req_work_drop_env(struct io_kiocb *req)
+{
+	if (req->work.mm) {
+		mmdrop(req->work.mm);
+		req->work.mm = NULL;
+	}
+	if (req->work.creds) {
+		put_cred(req->work.creds);
+		req->work.creds = NULL;
+	}
+}
+
 static inline bool io_prep_async_work(struct io_kiocb *req,
 				      struct io_kiocb **link)
 {
@@ -888,8 +911,8 @@ static inline bool io_prep_async_work(struct io_kiocb *req,
 		if (def->unbound_nonreg_file)
 			req->work.flags |= IO_WQ_WORK_UNBOUND;
 	}
-	if (def->needs_mm)
-		req->work.flags |= IO_WQ_WORK_NEEDS_USER;
+
+	io_req_work_grab_env(req, def);
 
 	*link = io_prep_linked_timeout(req);
 	return do_hashed;
@@ -1180,6 +1203,8 @@ static void __io_req_aux_free(struct io_kiocb *req)
 		else
 			fput(req->file);
 	}
+
+	io_req_work_drop_env(req);
 }
 
 static void __io_free_req(struct io_kiocb *req)
@@ -3963,6 +3988,8 @@ static int io_req_defer_prep(struct io_kiocb *req,
 {
 	ssize_t ret = 0;
 
+	io_req_work_grab_env(req, &io_op_defs[req->opcode]);
+
 	switch (req->opcode) {
 	case IORING_OP_NOP:
 		break;
@@ -5725,9 +5752,7 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
 		goto err;
 	}
 
-	data.mm = ctx->sqo_mm;
 	data.user = ctx->user;
-	data.creds = ctx->creds;
 	data.get_work = io_get_work;
 	data.put_work = io_put_work;
 
@@ -6535,7 +6560,8 @@ static int io_uring_create(unsigned entries, struct io_uring_params *p)
 		goto err;
 
 	p->features = IORING_FEAT_SINGLE_MMAP | IORING_FEAT_NODROP |
-			IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS;
+			IORING_FEAT_SUBMIT_STABLE | IORING_FEAT_RW_CUR_POS |
+			IORING_FEAT_CUR_PERSONALITY;
 	trace_io_uring_create(ret, ctx, p->sq_entries, p->cq_entries, p->flags);
 	return ret;
 err:
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 57d05cc5e271..9988e82f858b 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -195,6 +195,7 @@ struct io_uring_params {
 #define IORING_FEAT_NODROP		(1U << 1)
 #define IORING_FEAT_SUBMIT_STABLE	(1U << 2)
 #define IORING_FEAT_RW_CUR_POS		(1U << 3)
+#define IORING_FEAT_CUR_PERSONALITY	(1U << 4)
 
 /*
  * io_uring_register(2) opcodes and arguments
-- 
cgit v1.2.3


From 24369c2e3bb06d8c4e71fd6ceaf4f8a01ae79b7c Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence@gmail.com>
Date: Tue, 28 Jan 2020 03:15:48 +0300
Subject: io_uring: add io-wq workqueue sharing

If IORING_SETUP_ATTACH_WQ is set, wq_fd in io_uring_params is expected
to be a valid io_uring fd, whose io-wq will be shared with the newly
created io_uring instance. If the flag is set but the io-wq can't be
shared, setup fails.

This allows creation of "sibling" io_urings, where we prefer to keep the
SQ/CQ private, but want to share the async backend to minimize the amount
of overhead associated with having multiple rings that belong to the same
backend.
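
A minimal sketch of how userspace might prepare the setup parameters
for a sibling ring follows. The struct is a local mirror holding only
the two fields involved, not the real uapi struct io_uring_params, and
the sketch_* names are hypothetical. The flag value is taken from the
uapi hunk of this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* IORING_SETUP_ATTACH_WQ as defined by this patch, mirrored locally. */
#define SKETCH_SETUP_ATTACH_WQ	(1U << 5)

/* Local mirror of the io_uring_params fields touched here. */
struct sketch_params {
	uint32_t flags;
	uint32_t wq_fd;
};

/* Prepare params for a ring that shares the io-wq of parent_ring_fd. */
static inline void sketch_prep_sibling(struct sketch_params *p,
				       int parent_ring_fd)
{
	assert(p != NULL && parent_ring_fd >= 0);
	memset(p, 0, sizeof(*p));
	p->flags |= SKETCH_SETUP_ATTACH_WQ;
	p->wq_fd = (uint32_t)parent_ring_fd;
}
```

Going by io_init_wq_offload() in this patch, setup returns -EBADF when
wq_fd is not an open fd and -EINVAL when it is not an io_uring fd or
its io-wq cannot be grabbed.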

Reported-by: Jens Axboe <axboe@kernel.dk>
Reported-by: Daurnimator <quae@daurnimator.com>
Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 64 +++++++++++++++++++++++++++++++++----------
 include/uapi/linux/io_uring.h |  4 ++-
 2 files changed, 53 insertions(+), 15 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0ea36911745d..275355bd3a64 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5704,11 +5704,56 @@ static void io_get_work(struct io_wq_work *work)
 	refcount_inc(&req->refs);
 }
 
+static int io_init_wq_offload(struct io_ring_ctx *ctx,
+			      struct io_uring_params *p)
+{
+	struct io_wq_data data;
+	struct fd f;
+	struct io_ring_ctx *ctx_attach;
+	unsigned int concurrency;
+	int ret = 0;
+
+	data.user = ctx->user;
+	data.get_work = io_get_work;
+	data.put_work = io_put_work;
+
+	if (!(p->flags & IORING_SETUP_ATTACH_WQ)) {
+		/* Do QD, or 4 * CPUS, whatever is smallest */
+		concurrency = min(ctx->sq_entries, 4 * num_online_cpus());
+
+		ctx->io_wq = io_wq_create(concurrency, &data);
+		if (IS_ERR(ctx->io_wq)) {
+			ret = PTR_ERR(ctx->io_wq);
+			ctx->io_wq = NULL;
+		}
+		return ret;
+	}
+
+	f = fdget(p->wq_fd);
+	if (!f.file)
+		return -EBADF;
+
+	if (f.file->f_op != &io_uring_fops) {
+		ret = -EINVAL;
+		goto out_fput;
+	}
+
+	ctx_attach = f.file->private_data;
+	/* @io_wq is protected by holding the fd */
+	if (!io_wq_get(ctx_attach->io_wq, &data)) {
+		ret = -EINVAL;
+		goto out_fput;
+	}
+
+	ctx->io_wq = ctx_attach->io_wq;
+out_fput:
+	fdput(f);
+	return ret;
+}
+
 static int io_sq_offload_start(struct io_ring_ctx *ctx,
 			       struct io_uring_params *p)
 {
-	struct io_wq_data data;
-	unsigned concurrency;
 	int ret;
 
 	init_waitqueue_head(&ctx->sqo_wait);
@@ -5752,18 +5797,9 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx,
 		goto err;
 	}
 
-	data.user = ctx->user;
-	data.get_work = io_get_work;
-	data.put_work = io_put_work;
-
-	/* Do QD, or 4 * CPUS, whatever is smallest */
-	concurrency = min(ctx->sq_entries, 4 * num_online_cpus());
-	ctx->io_wq = io_wq_create(concurrency, &data);
-	if (IS_ERR(ctx->io_wq)) {
-		ret = PTR_ERR(ctx->io_wq);
-		ctx->io_wq = NULL;
+	ret = io_init_wq_offload(ctx, p);
+	if (ret)
 		goto err;
-	}
 
 	return 0;
 err:
@@ -6589,7 +6625,7 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params)
 
 	if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL |
 			IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE |
-			IORING_SETUP_CLAMP))
+			IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ))
 		return -EINVAL;
 
 	ret = io_uring_create(entries, &p);
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 9988e82f858b..e067b92af5ad 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -75,6 +75,7 @@ enum {
 #define IORING_SETUP_SQ_AFF	(1U << 2)	/* sq_thread_cpu is valid */
 #define IORING_SETUP_CQSIZE	(1U << 3)	/* app defines CQ size */
 #define IORING_SETUP_CLAMP	(1U << 4)	/* clamp SQ/CQ ring sizes */
+#define IORING_SETUP_ATTACH_WQ	(1U << 5)	/* attach to existing wq */
 
 enum {
 	IORING_OP_NOP,
@@ -183,7 +184,8 @@ struct io_uring_params {
 	__u32 sq_thread_cpu;
 	__u32 sq_thread_idle;
 	__u32 features;
-	__u32 resv[4];
+	__u32 wq_fd;
+	__u32 resv[3];
 	struct io_sqring_offsets sq_off;
 	struct io_cqring_offsets cq_off;
 };
-- 
cgit v1.2.3


From 071698e13ac6ba786dfa22349a7b62deb5a9464d Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 28 Jan 2020 10:04:42 -0700
Subject: io_uring: allow registering credentials

If an application wants to use a ring with different kinds of
credentials, it can register them upfront. We don't look up credentials;
the credentials of the task calling IORING_REGISTER_PERSONALITY are used.

An 'id' is returned for the application to use in subsequent personality
support.
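
The calling convention implemented by __io_uring_register() in this
patch can be sketched as below. struct sketch_reg_args and the two
helpers are hypothetical; they only collect the last three arguments
that would be handed to io_uring_register(2). The opcode values are
copied from the uapi hunk.

```c
#include <assert.h>
#include <stddef.h>

/* Opcode values from the uapi hunk of this patch, mirrored locally. */
#define SKETCH_REGISTER_PERSONALITY	9
#define SKETCH_UNREGISTER_PERSONALITY	10

/* Hypothetical holder for the opcode, arg and nr_args arguments. */
struct sketch_reg_args {
	unsigned int opcode;
	void *arg;
	unsigned int nr_args;
};

/* Register: arg must be NULL and nr_args 0, else -EINVAL. */
static inline struct sketch_reg_args sketch_register_personality(void)
{
	struct sketch_reg_args a = { SKETCH_REGISTER_PERSONALITY, NULL, 0 };
	return a;
}

/* Unregister: arg must be NULL; the id travels in nr_args. */
static inline struct sketch_reg_args
sketch_unregister_personality(unsigned int id)
{
	struct sketch_reg_args a = { SKETCH_UNREGISTER_PERSONALITY, NULL, id };
	return a;
}
```

On success the register call returns the new id, which starts at 1
since idr_alloc_cyclic() is called with a lower bound of 1. Unregister
returns -EINVAL for an id that is not currently registered.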

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 75 +++++++++++++++++++++++++++++++++++++++----
 include/uapi/linux/io_uring.h |  2 ++
 2 files changed, 70 insertions(+), 7 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 275355bd3a64..d74567fc9628 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -273,6 +273,8 @@ struct io_ring_ctx {
 	struct socket		*ring_sock;
 #endif
 
+	struct idr		personality_idr;
+
 	struct {
 		unsigned		cached_cq_tail;
 		unsigned		cq_entries;
@@ -796,6 +798,7 @@ static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
 	INIT_LIST_HEAD(&ctx->cq_overflow_list);
 	init_completion(&ctx->completions[0]);
 	init_completion(&ctx->completions[1]);
+	idr_init(&ctx->personality_idr);
 	mutex_init(&ctx->uring_lock);
 	init_waitqueue_head(&ctx->wait);
 	spin_lock_init(&ctx->completion_lock);
@@ -6177,6 +6180,17 @@ static int io_uring_fasync(int fd, struct file *file, int on)
 	return fasync_helper(fd, file, on, &ctx->cq_fasync);
 }
 
+static int io_remove_personalities(int id, void *p, void *data)
+{
+	struct io_ring_ctx *ctx = data;
+	const struct cred *cred;
+
+	cred = idr_remove(&ctx->personality_idr, id);
+	if (cred)
+		put_cred(cred);
+	return 0;
+}
+
 static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
 {
 	mutex_lock(&ctx->uring_lock);
@@ -6193,6 +6207,7 @@ static void io_ring_ctx_wait_and_kill(struct io_ring_ctx *ctx)
 	/* if we failed setting up the ctx, we might not have any rings */
 	if (ctx->rings)
 		io_cqring_overflow_flush(ctx, true);
+	idr_for_each(&ctx->personality_idr, io_remove_personalities, ctx);
 	wait_for_completion(&ctx->completions[0]);
 	io_ring_ctx_free(ctx);
 }
@@ -6683,6 +6698,45 @@ out:
 	return ret;
 }
 
+static int io_register_personality(struct io_ring_ctx *ctx)
+{
+	const struct cred *creds = get_current_cred();
+	int id;
+
+	id = idr_alloc_cyclic(&ctx->personality_idr, (void *) creds, 1,
+				USHRT_MAX, GFP_KERNEL);
+	if (id < 0)
+		put_cred(creds);
+	return id;
+}
+
+static int io_unregister_personality(struct io_ring_ctx *ctx, unsigned id)
+{
+	const struct cred *old_creds;
+
+	old_creds = idr_remove(&ctx->personality_idr, id);
+	if (old_creds) {
+		put_cred(old_creds);
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+static bool io_register_op_must_quiesce(int op)
+{
+	switch (op) {
+	case IORING_UNREGISTER_FILES:
+	case IORING_REGISTER_FILES_UPDATE:
+	case IORING_REGISTER_PROBE:
+	case IORING_REGISTER_PERSONALITY:
+	case IORING_UNREGISTER_PERSONALITY:
+		return false;
+	default:
+		return true;
+	}
+}
+
 static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			       void __user *arg, unsigned nr_args)
 	__releases(ctx->uring_lock)
@@ -6698,9 +6752,7 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 	if (percpu_ref_is_dying(&ctx->refs))
 		return -ENXIO;
 
-	if (opcode != IORING_UNREGISTER_FILES &&
-	    opcode != IORING_REGISTER_FILES_UPDATE &&
-	    opcode != IORING_REGISTER_PROBE) {
+	if (io_register_op_must_quiesce(opcode)) {
 		percpu_ref_kill(&ctx->refs);
 
 		/*
@@ -6768,15 +6820,24 @@ static int __io_uring_register(struct io_ring_ctx *ctx, unsigned opcode,
 			break;
 		ret = io_probe(ctx, arg, nr_args);
 		break;
+	case IORING_REGISTER_PERSONALITY:
+		ret = -EINVAL;
+		if (arg || nr_args)
+			break;
+		ret = io_register_personality(ctx);
+		break;
+	case IORING_UNREGISTER_PERSONALITY:
+		ret = -EINVAL;
+		if (arg)
+			break;
+		ret = io_unregister_personality(ctx, nr_args);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
 	}
 
-
-	if (opcode != IORING_UNREGISTER_FILES &&
-	    opcode != IORING_REGISTER_FILES_UPDATE &&
-	    opcode != IORING_REGISTER_PROBE) {
+	if (io_register_op_must_quiesce(opcode)) {
 		/* bring the ctx back to life */
 		percpu_ref_reinit(&ctx->refs);
 out:
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index e067b92af5ad..b4ccf31db2d1 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -211,6 +211,8 @@ struct io_uring_params {
 #define IORING_REGISTER_FILES_UPDATE	6
 #define IORING_REGISTER_EVENTFD_ASYNC	7
 #define IORING_REGISTER_PROBE		8
+#define IORING_REGISTER_PERSONALITY	9
+#define IORING_UNREGISTER_PERSONALITY	10
 
 struct io_uring_files_update {
 	__u32 offset;
-- 
cgit v1.2.3


From 75c6a03904e0dd414a4d99a3072075cb5117e5bc Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Tue, 28 Jan 2020 10:15:23 -0700
Subject: io_uring: support using a registered personality for commands

For personalities previously registered via IORING_REGISTER_PERSONALITY,
allow any command to select them. This is done through setting
sqe->personality to the id returned from registration, and then flagging
sqe->flags with IOSQE_PERSONALITY.
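
To illustrate where the new field lives, here is a sketch using a
local mirror of the trailing union of struct io_uring_sqe as changed
by this patch. The union and helper names are hypothetical, and the
sketch covers only the sqe->personality field, not the flag handling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the sqe's trailing union after this patch. */
union sketch_sqe_tail {
	struct {
		/* index into fixed buffers, if used */
		uint16_t buf_index;
		/* personality to use, if used */
		uint16_t personality;
	};
	uint64_t pad2[3];
};

/* Select a registered personality; id 0 leaves the creds untouched. */
static inline void sketch_set_personality(union sketch_sqe_tail *t,
					  uint16_t id)
{
	assert(t != NULL);
	t->personality = id;
}
```

The new member reuses padding directly after buf_index, so the union
keeps its 24-byte size. In io_submit_sqe() an id of 0 skips the lookup,
while a non-zero id that is not registered fails the request with
-EINVAL.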

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 20 +++++++++++++++++++-
 include/uapi/linux/io_uring.h |  7 ++++++-
 2 files changed, 25 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index d74567fc9628..8bcf0538e2e1 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -4626,9 +4626,10 @@ static inline void io_queue_link_head(struct io_kiocb *req)
 static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 			  struct io_submit_state *state, struct io_kiocb **link)
 {
+	const struct cred *old_creds = NULL;
 	struct io_ring_ctx *ctx = req->ctx;
 	unsigned int sqe_flags;
-	int ret;
+	int ret, id;
 
 	sqe_flags = READ_ONCE(sqe->flags);
 
@@ -4637,6 +4638,19 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		ret = -EINVAL;
 		goto err_req;
 	}
+
+	id = READ_ONCE(sqe->personality);
+	if (id) {
+		const struct cred *personality_creds;
+
+		personality_creds = idr_find(&ctx->personality_idr, id);
+		if (unlikely(!personality_creds)) {
+			ret = -EINVAL;
+			goto err_req;
+		}
+		old_creds = override_creds(personality_creds);
+	}
+
 	/* same numerical values with corresponding REQ_F_*, safe to copy */
 	req->flags |= sqe_flags & (IOSQE_IO_DRAIN|IOSQE_IO_HARDLINK|
 					IOSQE_ASYNC);
@@ -4646,6 +4660,8 @@ static bool io_submit_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 err_req:
 		io_cqring_add_event(req, ret);
 		io_double_put_req(req);
+		if (old_creds)
+			revert_creds(old_creds);
 		return false;
 	}
 
@@ -4706,6 +4722,8 @@ err_req:
 		}
 	}
 
+	if (old_creds)
+		revert_creds(old_creds);
 	return true;
 }
 
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index b4ccf31db2d1..98105ff8d3e6 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -40,7 +40,12 @@ struct io_uring_sqe {
 	};
 	__u64	user_data;	/* data to be passed back at completion time */
 	union {
-		__u16	buf_index;	/* index into fixed buffers, if used */
+		struct {
+			/* index into fixed buffers, if used */
+			__u16	buf_index;
+			/* personality to use, if used */
+			__u16	personality;
+		};
 		__u64	__pad2[3];
 	};
 };
-- 
cgit v1.2.3


From 3e4827b05d2ac2d377ed136a52829ec46787bf4b Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe@kernel.dk>
Date: Wed, 8 Jan 2020 15:18:09 -0700
Subject: io_uring: add support for epoll_ctl(2)

This adds IORING_OP_EPOLL_CTL, which can perform the same work as the
epoll_ctl(2) system call.
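
The mapping from epoll_ctl(2) arguments to sqe fields, as read by
io_epoll_ctl_prep() in this patch, can be sketched as follows. struct
sketch_sqe is a local mirror with only the fields involved, not the
real uapi layout, and the helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the sqe fields that io_epoll_ctl_prep() reads. */
struct sketch_sqe {
	int32_t fd;	/* epoll instance (epfd) */
	uint64_t off;	/* target file descriptor */
	uint64_t addr;	/* user pointer to struct epoll_event */
	uint32_t len;	/* EPOLL_CTL_* operation */
};

/* Fill the fields the same way the prep handler expects them. */
static inline void sketch_prep_epoll_ctl(struct sketch_sqe *sqe, int epfd,
					 int op, int fd, const void *event)
{
	assert(sqe != NULL);
	memset(sqe, 0, sizeof(*sqe));
	sqe->fd = epfd;
	sqe->len = (uint32_t)op;
	sqe->off = (uint64_t)fd;
	sqe->addr = (uint64_t)(uintptr_t)event;
}
```

The caller would also set sqe->opcode to IORING_OP_EPOLL_CTL and must
leave ioprio and buf_index at zero, since the prep handler rejects
non-zero values with -EINVAL.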

Signed-off-by: Jens Axboe <axboe@kernel.dk>
---
 fs/io_uring.c                 | 71 +++++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/io_uring.h |  1 +
 2 files changed, 72 insertions(+)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 0d8d0e217847..c5ca84a305d3 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -74,6 +74,7 @@
 #include <linux/namei.h>
 #include <linux/fsnotify.h>
 #include <linux/fadvise.h>
+#include <linux/eventpoll.h>
 
 #define CREATE_TRACE_POINTS
 #include <trace/events/io_uring.h>
@@ -423,6 +424,14 @@ struct io_madvise {
 	u32				advice;
 };
 
+struct io_epoll {
+	struct file			*file;
+	int				epfd;
+	int				op;
+	int				fd;
+	struct epoll_event		event;
+};
+
 struct io_async_connect {
 	struct sockaddr_storage		address;
 };
@@ -536,6 +545,7 @@ struct io_kiocb {
 		struct io_files_update	files_update;
 		struct io_fadvise	fadvise;
 		struct io_madvise	madvise;
+		struct io_epoll		epoll;
 	};
 
 	struct io_async_ctx		*io;
@@ -728,6 +738,10 @@ static const struct io_op_def io_op_defs[] = {
 		.fd_non_neg		= 1,
 		.file_table		= 1,
 	},
+	[IORING_OP_EPOLL_CTL] = {
+		.unbound_nonreg_file	= 1,
+		.file_table		= 1,
+	},
 };
 
 static void io_wq_submit_work(struct io_wq_work **workptr);
@@ -2611,6 +2625,52 @@ static int io_openat(struct io_kiocb *req, struct io_kiocb **nxt,
 	return io_openat2(req, nxt, force_nonblock);
 }
 
+static int io_epoll_ctl_prep(struct io_kiocb *req,
+			     const struct io_uring_sqe *sqe)
+{
+#if defined(CONFIG_EPOLL)
+	if (sqe->ioprio || sqe->buf_index)
+		return -EINVAL;
+
+	req->epoll.epfd = READ_ONCE(sqe->fd);
+	req->epoll.op = READ_ONCE(sqe->len);
+	req->epoll.fd = READ_ONCE(sqe->off);
+
+	if (ep_op_has_event(req->epoll.op)) {
+		struct epoll_event __user *ev;
+
+		ev = u64_to_user_ptr(READ_ONCE(sqe->addr));
+		if (copy_from_user(&req->epoll.event, ev, sizeof(*ev)))
+			return -EFAULT;
+	}
+
+	return 0;
+#else
+	return -EOPNOTSUPP;
+#endif
+}
+
+static int io_epoll_ctl(struct io_kiocb *req, struct io_kiocb **nxt,
+			bool force_nonblock)
+{
+#if defined(CONFIG_EPOLL)
+	struct io_epoll *ie = &req->epoll;
+	int ret;
+
+	ret = do_epoll_ctl(ie->epfd, ie->op, ie->fd, &ie->event, force_nonblock);
+	if (force_nonblock && ret == -EAGAIN)
+		return -EAGAIN;
+
+	if (ret < 0)
+		req_set_fail_links(req);
+	io_cqring_add_event(req, ret);
+	io_put_req_find_next(req, nxt);
+	return 0;
+#else
+	return -EOPNOTSUPP;
+#endif
+}
+
 static int io_madvise_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 {
 #if defined(CONFIG_ADVISE_SYSCALLS) && defined(CONFIG_MMU)
@@ -4075,6 +4135,9 @@ static int io_req_defer_prep(struct io_kiocb *req,
 	case IORING_OP_OPENAT2:
 		ret = io_openat2_prep(req, sqe);
 		break;
+	case IORING_OP_EPOLL_CTL:
+		ret = io_epoll_ctl_prep(req, sqe);
+		break;
 	default:
 		printk_once(KERN_WARNING "io_uring: unhandled opcode %d\n",
 				req->opcode);
@@ -4303,6 +4366,14 @@ static int io_issue_sqe(struct io_kiocb *req, const struct io_uring_sqe *sqe,
 		}
 		ret = io_openat2(req, nxt, force_nonblock);
 		break;
+	case IORING_OP_EPOLL_CTL:
+		if (sqe) {
+			ret = io_epoll_ctl_prep(req, sqe);
+			if (ret)
+				break;
+		}
+		ret = io_epoll_ctl(req, nxt, force_nonblock);
+		break;
 	default:
 		ret = -EINVAL;
 		break;
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 98105ff8d3e6..3f7961c1c243 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -112,6 +112,7 @@ enum {
 	IORING_OP_SEND,
 	IORING_OP_RECV,
 	IORING_OP_OPENAT2,
+	IORING_OP_EPOLL_CTL,
 
 	/* this goes last, obviously */
 	IORING_OP_LAST,
-- 
cgit v1.2.3


From 7de3f1423ff9431f3bd5023bb78d1e062314e7f0 Mon Sep 17 00:00:00 2001
From: Janosch Frank <frankja@linux.ibm.com>
Date: Fri, 31 Jan 2020 05:02:02 -0500
Subject: KVM: s390: Add new reset vcpu API

The architecture states that we need to reset local IRQs for all CPU
resets. Because the old reset interface did not support the normal CPU
reset, we never did that on a normal reset.

Let's implement an interface for the missing normal and clear resets
and reset all local IRQs, registers and control structures as stated
in the architecture.

Userspace might already reset the registers via the vcpu run struct,
but as we need the interface for the interrupt clearing part anyway,
we implement the resets fully and don't rely on userspace to reset the
rest.
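
The nesting of the three resets can be shown with a toy model. This is
not KVM code: struct toy_vcpu and its fields are stand-ins chosen for
illustration, with one field per reset level.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for vcpu state; one field per reset level. */
struct toy_vcpu {
	int local_irqs_pending;	/* cleared by every reset */
	uint64_t psw_mask;	/* cleared from initial reset upwards */
	uint64_t gprs[16];	/* cleared by clear reset only */
};

static inline void toy_normal_reset(struct toy_vcpu *v)
{
	v->local_irqs_pending = 0;
}

/* Initial reset is a superset of the normal reset. */
static inline void toy_initial_reset(struct toy_vcpu *v)
{
	toy_normal_reset(v);
	v->psw_mask = 0;
}

/* Clear reset is a superset of the initial reset. */
static inline void toy_clear_reset(struct toy_vcpu *v)
{
	toy_initial_reset(v);
	memset(v->gprs, 0, sizeof(v->gprs));
}
```

Each level calls the one below it first, which mirrors how the three
kvm_arch_vcpu_ioctl_*_reset() functions in this patch are layered.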

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20200131100205.74720-4-frankja@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
 Documentation/virt/kvm/api.txt | 43 +++++++++++++++++++++
 arch/s390/kvm/kvm-s390.c       | 84 ++++++++++++++++++++++++++++--------------
 include/uapi/linux/kvm.h       |  5 +++
 3 files changed, 105 insertions(+), 27 deletions(-)

diff --git a/Documentation/virt/kvm/api.txt b/Documentation/virt/kvm/api.txt
index ebb37b34dcfc..73448764f544 100644
--- a/Documentation/virt/kvm/api.txt
+++ b/Documentation/virt/kvm/api.txt
@@ -4168,6 +4168,42 @@ This ioctl issues an ultravisor call to terminate the secure guest,
 unpins the VPA pages and releases all the device pages that are used to
 track the secure pages by hypervisor.
 
+4.122 KVM_S390_NORMAL_RESET
+
+Capability: KVM_CAP_S390_VCPU_RESETS
+Architectures: s390
+Type: vcpu ioctl
+Parameters: none
+Returns: 0
+
+This ioctl resets VCPU registers and control structures according to
+the cpu reset definition in the POP (Principles Of Operation).
+
+4.123 KVM_S390_INITIAL_RESET
+
+Capability: none
+Architectures: s390
+Type: vcpu ioctl
+Parameters: none
+Returns: 0
+
+This ioctl resets VCPU registers and control structures according to
+the initial cpu reset definition in the POP. However, the cpu is not
+put into ESA mode. This reset is a superset of the normal reset.
+
+4.124 KVM_S390_CLEAR_RESET
+
+Capability: KVM_CAP_S390_VCPU_RESETS
+Architectures: s390
+Type: vcpu ioctl
+Parameters: none
+Returns: 0
+
+This ioctl resets VCPU registers and control structures according to
+the clear cpu reset definition in the POP. However, the cpu is not put
+into ESA mode. This reset is a superset of the initial reset.
+
+
 5. The kvm_run structure
 ------------------------
 
@@ -5396,3 +5432,10 @@ handling by KVM (as some KVM hypercall may be mistakenly treated as TLB
 flush hypercalls by Hyper-V) so userspace should disable KVM identification
 in CPUID and only exposes Hyper-V identification. In this case, guest
 thinks it's running on Hyper-V and only use Hyper-V hypercalls.
+
+8.22 KVM_CAP_S390_VCPU_RESETS
+
+Architectures: s390
+
+This capability indicates that the KVM_S390_NORMAL_RESET and
+KVM_S390_CLEAR_RESET ioctls are available.
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index bb072866bd69..e39f6ef97b09 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -529,6 +529,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_CMMA_MIGRATION:
 	case KVM_CAP_S390_AIS:
 	case KVM_CAP_S390_AIS_MIGRATION:
+	case KVM_CAP_S390_VCPU_RESETS:
 		r = 1;
 		break;
 	case KVM_CAP_S390_HPAGE_1M:
@@ -2844,29 +2845,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 
 }
 
-static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
-{
-	/* this equals initial cpu reset in pop, but we don't switch to ESA */
-	vcpu->arch.sie_block->gpsw.mask = 0;
-	vcpu->arch.sie_block->gpsw.addr = 0;
-	kvm_s390_set_prefix(vcpu, 0);
-	kvm_s390_set_cpu_timer(vcpu, 0);
-	vcpu->arch.sie_block->ckc = 0;
-	vcpu->arch.sie_block->todpr = 0;
-	memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr));
-	vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK;
-	vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK;
-	vcpu->run->s.regs.fpc = 0;
-	vcpu->arch.sie_block->gbea = 1;
-	vcpu->arch.sie_block->pp = 0;
-	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
-	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
-	kvm_clear_async_pf_completion_queue(vcpu);
-	if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
-		kvm_s390_vcpu_stop(vcpu);
-	kvm_s390_clear_local_irqs(vcpu);
-}
-
 void kvm_arch_vcpu_postcreate(struct kvm_vcpu *vcpu)
 {
 	mutex_lock(&vcpu->kvm->lock);
@@ -3281,10 +3259,53 @@ static int kvm_arch_vcpu_ioctl_set_one_reg(struct kvm_vcpu *vcpu,
 	return r;
 }
 
-static int kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
+static void kvm_arch_vcpu_ioctl_normal_reset(struct kvm_vcpu *vcpu)
 {
-	kvm_s390_vcpu_initial_reset(vcpu);
-	return 0;
+	vcpu->arch.sie_block->gpsw.mask &= ~PSW_MASK_RI;
+	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
+	memset(vcpu->run->s.regs.riccb, 0, sizeof(vcpu->run->s.regs.riccb));
+
+	kvm_clear_async_pf_completion_queue(vcpu);
+	if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
+		kvm_s390_vcpu_stop(vcpu);
+	kvm_s390_clear_local_irqs(vcpu);
+}
+
+static void kvm_arch_vcpu_ioctl_initial_reset(struct kvm_vcpu *vcpu)
+{
+	/* Initial reset is a superset of the normal reset */
+	kvm_arch_vcpu_ioctl_normal_reset(vcpu);
+
+	/* this equals initial cpu reset in pop, but we don't switch to ESA */
+	vcpu->arch.sie_block->gpsw.mask = 0;
+	vcpu->arch.sie_block->gpsw.addr = 0;
+	kvm_s390_set_prefix(vcpu, 0);
+	kvm_s390_set_cpu_timer(vcpu, 0);
+	vcpu->arch.sie_block->ckc = 0;
+	vcpu->arch.sie_block->todpr = 0;
+	memset(vcpu->arch.sie_block->gcr, 0, sizeof(vcpu->arch.sie_block->gcr));
+	vcpu->arch.sie_block->gcr[0] = CR0_INITIAL_MASK;
+	vcpu->arch.sie_block->gcr[14] = CR14_INITIAL_MASK;
+	vcpu->run->s.regs.fpc = 0;
+	vcpu->arch.sie_block->gbea = 1;
+	vcpu->arch.sie_block->pp = 0;
+	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
+}
+
+static void kvm_arch_vcpu_ioctl_clear_reset(struct kvm_vcpu *vcpu)
+{
+	struct kvm_sync_regs *regs = &vcpu->run->s.regs;
+
+	/* Clear reset is a superset of the initial reset */
+	kvm_arch_vcpu_ioctl_initial_reset(vcpu);
+
+	memset(&regs->gprs, 0, sizeof(regs->gprs));
+	memset(&regs->vrs, 0, sizeof(regs->vrs));
+	memset(&regs->acrs, 0, sizeof(regs->acrs));
+	memset(&regs->gscb, 0, sizeof(regs->gscb));
+
+	regs->etoken = 0;
+	regs->etoken_extension = 0;
 }
 
 int kvm_arch_vcpu_ioctl_set_regs(struct kvm_vcpu *vcpu, struct kvm_regs *regs)
@@ -4357,8 +4378,17 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_arch_vcpu_ioctl_set_initial_psw(vcpu, psw);
 		break;
 	}
+	case KVM_S390_CLEAR_RESET:
+		r = 0;
+		kvm_arch_vcpu_ioctl_clear_reset(vcpu);
+		break;
 	case KVM_S390_INITIAL_RESET:
-		r = kvm_arch_vcpu_ioctl_initial_reset(vcpu);
+		r = 0;
+		kvm_arch_vcpu_ioctl_initial_reset(vcpu);
+		break;
+	case KVM_S390_NORMAL_RESET:
+		r = 0;
+		kvm_arch_vcpu_ioctl_normal_reset(vcpu);
 		break;
 	case KVM_SET_ONE_REG:
 	case KVM_GET_ONE_REG: {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index f0a16b4adbbd..4b95f9a31a2f 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1009,6 +1009,7 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_PPC_GUEST_DEBUG_SSTEP 176
 #define KVM_CAP_ARM_NISV_TO_USER 177
 #define KVM_CAP_ARM_INJECT_EXT_DABT 178
+#define KVM_CAP_S390_VCPU_RESETS 179
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1473,6 +1474,10 @@ struct kvm_enc_region {
 /* Available with KVM_CAP_ARM_SVE */
 #define KVM_ARM_VCPU_FINALIZE	  _IOW(KVMIO,  0xc2, int)
 
+/* Available with  KVM_CAP_S390_VCPU_RESETS */
+#define KVM_S390_NORMAL_RESET	_IO(KVMIO,   0xc3)
+#define KVM_S390_CLEAR_RESET	_IO(KVMIO,   0xc4)
+
 /* Secure Encrypted Virtualization command */
 enum sev_cmd_id {
 	/* Guest initialization commands */
-- 
cgit v1.2.3


From 0a3c57729768e08de690ba872dd52ee9c3c11e8b Mon Sep 17 00:00:00 2001
From: Hao Lee <haolee.swjtu@gmail.com>
Date: Thu, 30 Jan 2020 22:15:19 -0800
Subject: mm: fix comments related to node reclaim

As zone reclaim has been replaced by node reclaim, this patch fixes
related comments.

Link: http://lkml.kernel.org/r/20191126141346.GA22665@haolee.github.io
Signed-off-by: Hao Lee <haolee.swjtu@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/mmzone.h      | 2 +-
 include/uapi/linux/sysctl.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 5334ad8fc7bd..c2bc309d1634 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -758,7 +758,7 @@ typedef struct pglist_data {
 
 #ifdef CONFIG_NUMA
 	/*
-	 * zone reclaim becomes active if more unmapped pages exist.
+	 * node reclaim becomes active if more unmapped pages exist.
 	 */
 	unsigned long		min_unmapped_pages;
 	unsigned long		min_slab_pages;
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 87aa2a6d9125..27c1ed2822e6 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -195,7 +195,7 @@ enum
 	VM_MIN_UNMAPPED=32,	/* Set min percent of unmapped pages */
 	VM_PANIC_ON_OOM=33,	/* panic at out-of-memory */
 	VM_VDSO_ENABLED=34,	/* map VDSO into new processes? */
-	VM_MIN_SLAB=35,		 /* Percent pages ignored by zone reclaim */
+	VM_MIN_SLAB=35,		 /* Percent pages ignored by node reclaim */
 };
 
 
-- 
cgit v1.2.3


From d5767057c9a76a29f073dad66b7fa12a90e8c748 Mon Sep 17 00:00:00 2001
From: Yury Norov <yury.norov@gmail.com>
Date: Thu, 30 Jan 2020 22:16:40 -0800
Subject: uapi: rename ext2_swab() to swab() and share globally in swab.h

ext2_swab() is defined locally in lib/find_bit.c. However, it is not
specific to ext2, nor to bitmaps.

There are many potential users of it, so rename it to just swab() and
move it to include/uapi/linux/swab.h.

ABI guarantees that size of unsigned long corresponds to BITS_PER_LONG,
therefore drop unneeded cast.
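
As a rough userspace illustration of the helper described above (not
the kernel code itself), the sketch below picks a 64-bit or 32-bit byte
swap depending on the width of unsigned long. The name swab_ulong() and
the use of the GCC/Clang builtins __builtin_bswap64() and
__builtin_bswap32() are assumptions made for this example; the kernel
version relies on __swab64(), __swab32() and BITS_PER_LONG instead.

```c
#include <limits.h>

/* Hypothetical userspace stand-in for the kernel's __swab(). */
static inline unsigned long swab_ulong(unsigned long y)
{
#if ULONG_MAX == 0xffffffffffffffffUL
	/* unsigned long is 64 bits wide: no cast needed */
	return __builtin_bswap64(y);
#else
	/* assumption: otherwise unsigned long is 32 bits wide */
	return __builtin_bswap32(y);
#endif
}
```

Swapping twice returns the original value, which is a quick sanity
check for any such helper.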

Link: http://lkml.kernel.org/r/20200103202846.21616-1-yury.norov@gmail.com
Signed-off-by: Yury Norov <yury.norov@gmail.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Joe Perches <joe@perches.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/linux/swab.h      |  1 +
 include/uapi/linux/swab.h | 10 ++++++++++
 lib/find_bit.c            | 16 ++--------------
 3 files changed, 13 insertions(+), 14 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/linux/swab.h b/include/linux/swab.h
index e466fd159c85..bcff5149861a 100644
--- a/include/linux/swab.h
+++ b/include/linux/swab.h
@@ -7,6 +7,7 @@
 # define swab16 __swab16
 # define swab32 __swab32
 # define swab64 __swab64
+# define swab __swab
 # define swahw32 __swahw32
 # define swahb32 __swahb32
 # define swab16p __swab16p
diff --git a/include/uapi/linux/swab.h b/include/uapi/linux/swab.h
index 23cd84868cc3..fa7f97da5b76 100644
--- a/include/uapi/linux/swab.h
+++ b/include/uapi/linux/swab.h
@@ -4,6 +4,7 @@
 
 #include <linux/types.h>
 #include <linux/compiler.h>
+#include <asm/bitsperlong.h>
 #include <asm/swab.h>
 
 /*
@@ -132,6 +133,15 @@ static inline __attribute_const__ __u32 __fswahb32(__u32 val)
 	__fswab64(x))
 #endif
 
+static __always_inline unsigned long __swab(const unsigned long y)
+{
+#if BITS_PER_LONG == 64
+	return __swab64(y);
+#else /* BITS_PER_LONG == 32 */
+	return __swab32(y);
+#endif
+}
+
 /**
  * __swahw32 - return a word-swapped 32-bit value
  * @x: value to wordswap
diff --git a/lib/find_bit.c b/lib/find_bit.c
index e35a76b291e6..06e503c339f3 100644
--- a/lib/find_bit.c
+++ b/lib/find_bit.c
@@ -149,18 +149,6 @@ EXPORT_SYMBOL(find_last_bit);
 
 #ifdef __BIG_ENDIAN
 
-/* include/linux/byteorder does not support "unsigned long" type */
-static inline unsigned long ext2_swab(const unsigned long y)
-{
-#if BITS_PER_LONG == 64
-	return (unsigned long) __swab64((u64) y);
-#elif BITS_PER_LONG == 32
-	return (unsigned long) __swab32((u32) y);
-#else
-#error BITS_PER_LONG not defined
-#endif
-}
-
 #if !defined(find_next_bit_le) || !defined(find_next_zero_bit_le)
 static inline unsigned long _find_next_bit_le(const unsigned long *addr1,
 		const unsigned long *addr2, unsigned long nbits,
@@ -177,7 +165,7 @@ static inline unsigned long _find_next_bit_le(const unsigned long *addr1,
 	tmp ^= invert;
 
 	/* Handle 1st word. */
-	tmp &= ext2_swab(BITMAP_FIRST_WORD_MASK(start));
+	tmp &= swab(BITMAP_FIRST_WORD_MASK(start));
 	start = round_down(start, BITS_PER_LONG);
 
 	while (!tmp) {
@@ -191,7 +179,7 @@ static inline unsigned long _find_next_bit_le(const unsigned long *addr1,
 		tmp ^= invert;
 	}
 
-	return min(start + __ffs(ext2_swab(tmp)), nbits);
+	return min(start + __ffs(swab(tmp)), nbits);
 }
 #endif
 
-- 
cgit v1.2.3


From 8dcc1a9d90c10fa4143e5c17821082e5e60e46a1 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal@wdc.com>
Date: Wed, 25 Dec 2019 16:07:44 +0900
Subject: fs: New zonefs file system

zonefs is a very simple file system exposing each zone of a zoned block
device as a file. Unlike a regular file system with zoned block device
support (e.g. f2fs), zonefs does not hide the sequential write
constraint of zoned block devices from the user. Files representing
sequential write zones of the device must be written sequentially
starting from the end of the file (append only writes).

As such, zonefs is in essence closer to a raw block device access
interface than to a full featured POSIX file system. The goal of zonefs
is to simplify the implementation of zoned block device support in
applications by replacing raw block device file accesses with a richer
file API, avoiding relying on direct block device file ioctls which may
be more obscure to developers. One example of this approach is the
implementation of LSM (log-structured merge) tree structures (such as
used in RocksDB and LevelDB) on zoned block devices by allowing SSTables
to be stored in a zone file similarly to a regular file system rather
than as a range of sectors of a zoned device. The introduction of the
higher level construct "one file is one zone" can help reduce the
amount of changes needed in the application as well as introducing
support for different application programming languages.

Zonefs on-disk metadata is reduced to an immutable super block to
persistently store a magic number and optional feature flags and
values. On mount, zonefs uses blkdev_report_zones() to obtain the device
zone configuration and populates the mount point with a static file tree
solely based on this information. E.g. file sizes come from the device
zone type and write pointer offset managed by the device itself.
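
To make the size rule above concrete, here is a minimal sketch of how a
file size could be derived from a zone report entry. The struct
zone_info type, the enum values and zone_file_size() are hypothetical
names for this illustration only; the sketch ignores offline zones and
conventional zone aggregation.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9	/* 512-byte sectors */

enum zone_kind { ZONE_CNV, ZONE_SEQ };	/* hypothetical names */

/* Hypothetical, simplified view of one zone as reported by the device. */
struct zone_info {
	enum zone_kind kind;
	uint64_t start;	/* first sector of the zone */
	uint64_t len;	/* zone size in sectors */
	uint64_t wp;	/* write pointer position, in sectors */
};

/* File size in bytes that a zone file would show right after mount. */
static uint64_t zone_file_size(const struct zone_info *z)
{
	/* Conventional zone files always span the whole zone. */
	if (z->kind == ZONE_CNV)
		return z->len << SECTOR_SHIFT;

	/* Sequential zone files end at the zone write pointer. */
	return (z->wp - z->start) << SECTOR_SHIFT;
}
```

With 256 MB zones (524288 sectors), a sequential zone whose write
pointer sits 8 sectors past the zone start yields a 4096-byte file.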

The zone files created on mount have the following characteristics.
1) Files representing zones of the same type are grouped together
   under a common sub-directory:
     * For conventional zones, the sub-directory "cnv" is used.
     * For sequential write zones, the sub-directory "seq" is used.
  These two directories are the only directories that exist in zonefs.
  Users cannot create other directories and cannot rename nor delete
  the "cnv" and "seq" sub-directories.
2) The name of zone files is the number of the file within the zone
   type sub-directory, in order of increasing zone start sector.
3) The size of conventional zone files is fixed to the device zone size.
   Conventional zone files cannot be truncated.
4) The size of sequential zone files represents the file's zone write
   pointer position relative to the zone start sector. Truncating these
   files is allowed only down to 0, in which case, the zone is reset to
   rewind the zone write pointer position to the start of the zone, or
   up to the zone size, in which case the file's zone is transitioned
   to the FULL state (finish zone operation).
5) Read and write operations beyond the file zone size are not
   allowed. Any access exceeding the zone size fails with the -EFBIG
   error.
6) Creating, deleting, renaming or modifying any attribute of files and
   sub-directories is not allowed.
7) There are no restrictions on the type of read and write operations
   that can be issued to conventional zone files. Buffered, direct and
   mmap read & write operations are accepted. For sequential zone files,
   there are no restrictions on read operations, but all write
   operations must be direct IO append writes. mmap write of sequential
   files is not allowed.
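
The truncation and append-only rules listed above can be modelled in
plain C as follows. This is only a sketch under stated assumptions:
seq_truncate_action(), seq_write_check() and enum trunc_action are
hypothetical names, and the error codes (-EPERM for a refused
truncation, -EINVAL for a write that does not start at the end of the
file) are taken from the code in this patch, not from a stable API.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

enum trunc_action { TRUNC_RESET, TRUNC_FINISH };	/* hypothetical */

/*
 * Rules 3 and 4: only sequential zone files may be truncated, and only
 * to 0 (zone reset) or to the zone size (zone finish).
 * Returns 0 and sets *action on success, -EPERM otherwise.
 */
static int seq_truncate_action(bool is_seq, uint64_t new_size,
			       uint64_t zone_size, enum trunc_action *action)
{
	if (!is_seq)
		return -EPERM;
	if (new_size == 0)
		*action = TRUNC_RESET;
	else if (new_size == zone_size)
		*action = TRUNC_FINISH;
	else
		return -EPERM;
	return 0;
}

/*
 * Rules 5 and 7: a write to a sequential zone file must stay within
 * the zone and must start exactly at the current end of the file.
 */
static int seq_write_check(uint64_t pos, uint64_t count,
			   uint64_t file_size, uint64_t zone_size)
{
	if (pos > zone_size || count > zone_size - pos)
		return -EFBIG;
	if (pos != file_size)
		return -EINVAL;
	return 0;
}
```

The model leaves out the direct IO and block size alignment
requirements, which the real write path also enforces.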

Several optional features of zonefs can be enabled at format time.
* Conventional zone aggregation: ranges of contiguous conventional
  zones can be aggregated into a single larger file instead of the
  default one file per zone.
* File ownership: The owner UID and GID of zone files are by default 0
  (root) but can be changed to any valid UID/GID.
* File access permissions: the default 640 access permissions can be
  changed.

The mkzonefs tool is used to format zoned block devices for use with
zonefs. This tool is available on GitHub at:

git@github.com:damien-lemoal/zonefs-tools.git.

zonefs-tools also includes a test suite which can be run against any
zoned block device, including a null_blk block device created in zoned
mode.

Example: the following formats a 15TB host-managed SMR HDD with 256 MB
zones with the conventional zones aggregation feature enabled.

$ sudo mkzonefs -o aggr_cnv /dev/sdX
$ sudo mount -t zonefs /dev/sdX /mnt
$ ls -l /mnt/
total 0
dr-xr-xr-x 2 root root     1 Nov 25 13:23 cnv
dr-xr-xr-x 2 root root 55356 Nov 25 13:23 seq

The size of each zone file sub-directory indicates the number of files
existing for that zone type. In this example, there is only one
conventional zone file (all conventional zones are aggregated under a
single file).

$ ls -l /mnt/cnv
total 137101312
-rw-r----- 1 root root 140391743488 Nov 25 13:23 0

This aggregated conventional zone file can be used as a regular file.

$ sudo mkfs.ext4 /mnt/cnv/0
$ sudo mount -o loop /mnt/cnv/0 /data

The "seq" sub-directory, which groups the files for sequential write
zones, contains 55356 files in this example.

$ ls -lv /mnt/seq
total 14511243264
-rw-r----- 1 root root 0 Nov 25 13:23 0
-rw-r----- 1 root root 0 Nov 25 13:23 1
-rw-r----- 1 root root 0 Nov 25 13:23 2
...
-rw-r----- 1 root root 0 Nov 25 13:23 55354
-rw-r----- 1 root root 0 Nov 25 13:23 55355

For sequential write zone files, the file size changes as data is
appended at the end of the file, similarly to any regular file system.

$ dd if=/dev/zero of=/mnt/seq/0 bs=4K count=1 conv=notrunc oflag=direct
1+0 records in
1+0 records out
4096 bytes (4.1 kB, 4.0 KiB) copied, 0.000452219 s, 9.1 MB/s

$ ls -l /mnt/seq/0
-rw-r----- 1 root root 4096 Nov 25 13:23 /mnt/seq/0

The written file can be truncated to the zone size, preventing any
further write operation.

$ truncate -s 268435456 /mnt/seq/0
$ ls -l /mnt/seq/0
-rw-r----- 1 root root 268435456 Nov 25 13:49 /mnt/seq/0

Truncating to size 0 frees the file zone storage space and allows
append writes to the file to restart.

$ truncate -s 0 /mnt/seq/0
$ ls -l /mnt/seq/0
-rw-r----- 1 root root 0 Nov 25 13:49 /mnt/seq/0

Since files are statically mapped to zones on the disk, the number of
blocks of a file as reported by stat() and fstat() indicates the size
of the file zone.

$ stat /mnt/seq/0
  File: /mnt/seq/0
  Size: 0       Blocks: 524288     IO Block: 4096   regular empty file
Device: 870h/2160d      Inode: 50431       Links: 1
Access: (0640/-rw-r-----)  Uid: (    0/    root)   Gid: (    0/  root)
Access: 2019-11-25 13:23:57.048971997 +0900
Modify: 2019-11-25 13:52:25.553805765 +0900
Change: 2019-11-25 13:52:25.553805765 +0900
 Birth: -

The number of blocks of the file ("Blocks") in units of 512B blocks
gives the maximum file size of 524288 * 512 B = 256 MB, corresponding
to the device zone size in this example. Note that the "IO Block"
field always indicates the minimum IO size for writes and corresponds
to the device physical sector size.
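
The arithmetic above can be reproduced from an application with stat().
In the sketch below, zone_size_from_blocks() and zone_file_max_size()
are hypothetical helper names for this example; the only assumption
about the system is that st_blocks is counted in 512-byte units, as it
is on Linux.

```c
#include <stdint.h>
#include <sys/stat.h>

/* st_blocks counts 512-byte units, independently of st_blksize. */
static uint64_t zone_size_from_blocks(uint64_t st_blocks)
{
	return st_blocks * 512;
}

/*
 * Store the maximum size of the zone file at @path in *max_size.
 * Returns 0 on success, -1 if stat() fails (errno is left set).
 */
static int zone_file_max_size(const char *path, uint64_t *max_size)
{
	struct stat st;

	if (stat(path, &st) != 0)
		return -1;
	*max_size = zone_size_from_blocks((uint64_t)st.st_blocks);
	return 0;
}
```

For the file shown above, 524288 blocks give 268435456 bytes, which
matches the size used in the truncate example.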

This code contains contributions from:
* Johannes Thumshirn <jthumshirn@suse.de>,
* Darrick J. Wong <darrick.wong@oracle.com>,
* Christoph Hellwig <hch@lst.de>,
* Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com> and
* Ting Yao <tingyao@hust.edu.cn>.

Signed-off-by: Damien Le Moal <damien.lemoal@wdc.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
---
 MAINTAINERS                |    9 +
 fs/Kconfig                 |    1 +
 fs/Makefile                |    1 +
 fs/zonefs/Kconfig          |    9 +
 fs/zonefs/Makefile         |    4 +
 fs/zonefs/super.c          | 1439 ++++++++++++++++++++++++++++++++++++++++++++
 fs/zonefs/zonefs.h         |  189 ++++++
 include/uapi/linux/magic.h |    1 +
 8 files changed, 1653 insertions(+)
 create mode 100644 fs/zonefs/Kconfig
 create mode 100644 fs/zonefs/Makefile
 create mode 100644 fs/zonefs/super.c
 create mode 100644 fs/zonefs/zonefs.h

(limited to 'include/uapi/linux')

diff --git a/MAINTAINERS b/MAINTAINERS
index 56765f542244..089fd879632a 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18303,6 +18303,15 @@ L:	linux-kernel@vger.kernel.org
 S:	Maintained
 F:	arch/x86/kernel/cpu/zhaoxin.c
 
+ZONEFS FILESYSTEM
+M:	Damien Le Moal <damien.lemoal@wdc.com>
+M:	Naohiro Aota <naohiro.aota@wdc.com>
+R:	Johannes Thumshirn <jth@kernel.org>
+L:	linux-fsdevel@vger.kernel.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/dlemoal/zonefs.git
+S:	Maintained
+F:	fs/zonefs/
+
 ZPOOL COMPRESSED PAGE STORAGE API
 M:	Dan Streetman <ddstreet@ieee.org>
 L:	linux-mm@kvack.org
diff --git a/fs/Kconfig b/fs/Kconfig
index 7b623e9fc1b0..a3f97ca2bd46 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -40,6 +40,7 @@ source "fs/ocfs2/Kconfig"
 source "fs/btrfs/Kconfig"
 source "fs/nilfs2/Kconfig"
 source "fs/f2fs/Kconfig"
+source "fs/zonefs/Kconfig"
 
 config FS_DAX
 	bool "Direct Access (DAX) support"
diff --git a/fs/Makefile b/fs/Makefile
index 1148c555c4d3..527f228a5e8a 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -133,3 +133,4 @@ obj-$(CONFIG_CEPH_FS)		+= ceph/
 obj-$(CONFIG_PSTORE)		+= pstore/
 obj-$(CONFIG_EFIVAR_FS)		+= efivarfs/
 obj-$(CONFIG_EROFS_FS)		+= erofs/
+obj-$(CONFIG_ZONEFS_FS)		+= zonefs/
diff --git a/fs/zonefs/Kconfig b/fs/zonefs/Kconfig
new file mode 100644
index 000000000000..fb87ad372e29
--- /dev/null
+++ b/fs/zonefs/Kconfig
@@ -0,0 +1,9 @@
+config ZONEFS_FS
+	tristate "zonefs filesystem support"
+	depends on BLOCK
+	depends on BLK_DEV_ZONED
+	help
+	  zonefs is a simple file system which exposes zones of a zoned block
+	  device (e.g. host-managed or host-aware SMR disk drives) as files.
+
+	  If unsure, say N.
diff --git a/fs/zonefs/Makefile b/fs/zonefs/Makefile
new file mode 100644
index 000000000000..75a380aa1ae1
--- /dev/null
+++ b/fs/zonefs/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ZONEFS_FS) += zonefs.o
+
+zonefs-y	:= super.o
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
new file mode 100644
index 000000000000..8bc6ef82d693
--- /dev/null
+++ b/fs/zonefs/super.c
@@ -0,0 +1,1439 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Simple file system for zoned block devices exposing zones as files.
+ *
+ * Copyright (C) 2019 Western Digital Corporation or its affiliates.
+ */
+#include <linux/module.h>
+#include <linux/fs.h>
+#include <linux/magic.h>
+#include <linux/iomap.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/blkdev.h>
+#include <linux/statfs.h>
+#include <linux/writeback.h>
+#include <linux/quotaops.h>
+#include <linux/seq_file.h>
+#include <linux/parser.h>
+#include <linux/uio.h>
+#include <linux/mman.h>
+#include <linux/sched/mm.h>
+#include <linux/crc32.h>
+
+#include "zonefs.h"
+
+static int zonefs_iomap_begin(struct inode *inode, loff_t offset, loff_t length,
+			      unsigned int flags, struct iomap *iomap,
+			      struct iomap *srcmap)
+{
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	loff_t isize;
+
+	/* All I/Os should always be within the file maximum size */
+	if (WARN_ON_ONCE(offset + length > zi->i_max_size))
+		return -EIO;
+
+	/*
+	 * Sequential zones can only accept direct writes. This is already
+	 * checked when writes are issued, so warn if we see a page writeback
+	 * operation.
+	 */
+	if (WARN_ON_ONCE(zi->i_ztype == ZONEFS_ZTYPE_SEQ &&
+			 (flags & IOMAP_WRITE) && !(flags & IOMAP_DIRECT)))
+		return -EIO;
+
+	/*
+	 * For conventional zones, all blocks are always mapped. For sequential
+	 * zones, all blocks are always mapped below the inode size (zone
+	 * write pointer) and unwritten beyond.
+	 */
+	mutex_lock(&zi->i_truncate_mutex);
+	isize = i_size_read(inode);
+	if (offset >= isize)
+		iomap->type = IOMAP_UNWRITTEN;
+	else
+		iomap->type = IOMAP_MAPPED;
+	if (flags & IOMAP_WRITE)
+		length = zi->i_max_size - offset;
+	else
+		length = min(length, isize - offset);
+	mutex_unlock(&zi->i_truncate_mutex);
+
+	iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize);
+	iomap->length = ALIGN(offset + length, sb->s_blocksize) - iomap->offset;
+	iomap->bdev = inode->i_sb->s_bdev;
+	iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset;
+
+	return 0;
+}
+
+static const struct iomap_ops zonefs_iomap_ops = {
+	.iomap_begin	= zonefs_iomap_begin,
+};
+
+static int zonefs_readpage(struct file *unused, struct page *page)
+{
+	return iomap_readpage(page, &zonefs_iomap_ops);
+}
+
+static int zonefs_readpages(struct file *unused, struct address_space *mapping,
+			    struct list_head *pages, unsigned int nr_pages)
+{
+	return iomap_readpages(mapping, pages, nr_pages, &zonefs_iomap_ops);
+}
+
+/*
+ * Map blocks for page writeback. This is used only on conventional zone files,
+ * which implies that the page range can only be within the fixed inode size.
+ */
+static int zonefs_map_blocks(struct iomap_writepage_ctx *wpc,
+			     struct inode *inode, loff_t offset)
+{
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+
+	if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV))
+		return -EIO;
+	if (WARN_ON_ONCE(offset >= i_size_read(inode)))
+		return -EIO;
+
+	/* If the mapping is already OK, nothing needs to be done */
+	if (offset >= wpc->iomap.offset &&
+	    offset < wpc->iomap.offset + wpc->iomap.length)
+		return 0;
+
+	return zonefs_iomap_begin(inode, offset, zi->i_max_size - offset,
+				  IOMAP_WRITE, &wpc->iomap, NULL);
+}
+
+static const struct iomap_writeback_ops zonefs_writeback_ops = {
+	.map_blocks		= zonefs_map_blocks,
+};
+
+static int zonefs_writepage(struct page *page, struct writeback_control *wbc)
+{
+	struct iomap_writepage_ctx wpc = { };
+
+	return iomap_writepage(page, wbc, &wpc, &zonefs_writeback_ops);
+}
+
+static int zonefs_writepages(struct address_space *mapping,
+			     struct writeback_control *wbc)
+{
+	struct iomap_writepage_ctx wpc = { };
+
+	return iomap_writepages(mapping, wbc, &wpc, &zonefs_writeback_ops);
+}
+
+static const struct address_space_operations zonefs_file_aops = {
+	.readpage		= zonefs_readpage,
+	.readpages		= zonefs_readpages,
+	.writepage		= zonefs_writepage,
+	.writepages		= zonefs_writepages,
+	.set_page_dirty		= iomap_set_page_dirty,
+	.releasepage		= iomap_releasepage,
+	.invalidatepage		= iomap_invalidatepage,
+	.migratepage		= iomap_migrate_page,
+	.is_partially_uptodate	= iomap_is_partially_uptodate,
+	.error_remove_page	= generic_error_remove_page,
+	.direct_IO		= noop_direct_IO,
+};
+
+static void zonefs_update_stats(struct inode *inode, loff_t new_isize)
+{
+	struct super_block *sb = inode->i_sb;
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	loff_t old_isize = i_size_read(inode);
+	loff_t nr_blocks;
+
+	if (new_isize == old_isize)
+		return;
+
+	spin_lock(&sbi->s_lock);
+
+	/*
+	 * This may be called for an update after an IO error.
+	 * So beware of the values seen.
+	 */
+	if (new_isize < old_isize) {
+		nr_blocks = (old_isize - new_isize) >> sb->s_blocksize_bits;
+		if (sbi->s_used_blocks > nr_blocks)
+			sbi->s_used_blocks -= nr_blocks;
+		else
+			sbi->s_used_blocks = 0;
+	} else {
+		sbi->s_used_blocks +=
+			(new_isize - old_isize) >> sb->s_blocksize_bits;
+		if (sbi->s_used_blocks > sbi->s_blocks)
+			sbi->s_used_blocks = sbi->s_blocks;
+	}
+
+	spin_unlock(&sbi->s_lock);
+}
+
+/*
+ * Check a zone condition and adjust its file inode access permissions for
+ * offline and readonly zones. Return the inode size corresponding to the
+ * amount of readable data in the zone.
+ */
+static loff_t zonefs_check_zone_condition(struct inode *inode,
+					  struct blk_zone *zone, bool warn)
+{
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+
+	switch (zone->cond) {
+	case BLK_ZONE_COND_OFFLINE:
+		/*
+		 * Dead zone: make the inode immutable, disable all accesses
+		 * and set the file size to 0 (zone wp set to zone start).
+		 */
+		if (warn)
+			zonefs_warn(inode->i_sb, "inode %lu: offline zone\n",
+				    inode->i_ino);
+		inode->i_flags |= S_IMMUTABLE;
+		inode->i_mode &= ~0777;
+		zone->wp = zone->start;
+		return 0;
+	case BLK_ZONE_COND_READONLY:
+		/* Do not allow writes in read-only zones */
+		if (warn)
+			zonefs_warn(inode->i_sb, "inode %lu: read-only zone\n",
+				    inode->i_ino);
+		inode->i_flags |= S_IMMUTABLE;
+		inode->i_mode &= ~0222;
+		/* fallthrough */
+	default:
+		if (zi->i_ztype == ZONEFS_ZTYPE_CNV)
+			return zi->i_max_size;
+		return (zone->wp - zone->start) << SECTOR_SHIFT;
+	}
+}
+
+struct zonefs_ioerr_data {
+	struct inode	*inode;
+	bool		write;
+};
+
+static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx,
+			      void *data)
+{
+	struct zonefs_ioerr_data *err = data;
+	struct inode *inode = err->inode;
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	loff_t isize, data_size;
+
+	/*
+	 * Check the zone condition: if the zone is not "bad" (offline or
+	 * read-only), read errors are simply signaled to the IO issuer as long
+	 * as there is no inconsistency between the inode size and the amount of
+	 * data written in the zone (data_size).
+	 */
+	data_size = zonefs_check_zone_condition(inode, zone, true);
+	isize = i_size_read(inode);
+	if (zone->cond != BLK_ZONE_COND_OFFLINE &&
+	    zone->cond != BLK_ZONE_COND_READONLY &&
+	    !err->write && isize == data_size)
+		return 0;
+
+	/*
+	 * At this point, we detected either a bad zone or an inconsistency
+	 * between the inode size and the amount of data written in the zone.
+	 * For the latter case, the cause may be a write IO error or an external
+	 * action on the device. Two error patterns exist:
+	 * 1) The inode size is lower than the amount of data in the zone:
+	 *    a write operation partially failed and data was written at the end
+	 *    of the file. This can happen in the case of a large direct IO
+	 *    needing several BIOs and/or write requests to be processed.
+	 * 2) The inode size is larger than the amount of data in the zone:
+	 *    this can happen with a deferred write error with the use of the
+	 *    device side write cache after getting successful write IO
+	 *    completions. Other possibilities are (a) an external corruption,
+	 *    e.g. an application reset the zone directly, or (b) the device
+	 *    has a serious problem (e.g. firmware bug).
+	 *
+	 * In all cases, warn about inode size inconsistency and handle the
+	 * IO error according to the zone condition and to the mount options.
+	 */
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && isize != data_size)
+		zonefs_warn(sb, "inode %lu: invalid size %lld (should be %lld)\n",
+			    inode->i_ino, isize, data_size);
+
+	/*
+	 * First handle bad zones signaled by hardware. The mount options
+	 * errors=zone-ro and errors=zone-offline result in changing the
+	 * zone condition to read-only and offline respectively, as if the
+	 * condition was signaled by the hardware.
+	 */
+	if (zone->cond == BLK_ZONE_COND_OFFLINE ||
+	    sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZOL) {
+		zonefs_warn(sb, "inode %lu: read/write access disabled\n",
+			    inode->i_ino);
+		if (zone->cond != BLK_ZONE_COND_OFFLINE) {
+			zone->cond = BLK_ZONE_COND_OFFLINE;
+			data_size = zonefs_check_zone_condition(inode, zone,
+								false);
+		}
+	} else if (zone->cond == BLK_ZONE_COND_READONLY ||
+		   sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZRO) {
+		zonefs_warn(sb, "inode %lu: write access disabled\n",
+			    inode->i_ino);
+		if (zone->cond != BLK_ZONE_COND_READONLY) {
+			zone->cond = BLK_ZONE_COND_READONLY;
+			data_size = zonefs_check_zone_condition(inode, zone,
+								false);
+		}
+	}
+
+	/*
+	 * If errors=remount-ro was specified, any error results in remounting
+	 * the volume as read-only.
+	 */
+	if ((sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_RO) && !sb_rdonly(sb)) {
+		zonefs_warn(sb, "remounting filesystem read-only\n");
+		sb->s_flags |= SB_RDONLY;
+	}
+
+	/*
+	 * Update block usage stats and the inode size to prevent access to
+	 * invalid data.
+	 */
+	zonefs_update_stats(inode, data_size);
+	i_size_write(inode, data_size);
+	zi->i_wpoffset = data_size;
+
+	return 0;
+}
+
+/*
+ * When a file IO error occurs, check the file zone to see if there is a change
+ * in the zone condition (e.g. offline or read-only). For a failed write to a
+ * sequential zone, the zone write pointer position must also be checked to
+ * eventually correct the file size and zonefs inode write pointer offset
+ * (which can be out of sync with the drive due to partial write failures).
+ */
+static void zonefs_io_error(struct inode *inode, bool write)
+{
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	unsigned int noio_flag;
+	unsigned int nr_zones =
+		zi->i_max_size >> (sbi->s_zone_sectors_shift + SECTOR_SHIFT);
+	struct zonefs_ioerr_data err = {
+		.inode = inode,
+		.write = write,
+	};
+	int ret;
+
+	mutex_lock(&zi->i_truncate_mutex);
+
+	/*
+	 * Memory allocations in blkdev_report_zones() can trigger a memory
+	 * reclaim which may in turn cause a recursion into zonefs as well as
+	 * struct request allocations for the same device. The former case may
+	 * end up in a deadlock on the inode truncate mutex, while the latter
+	 * may prevent IO forward progress. Executing the report zones under
+	 * the GFP_NOIO context avoids both problems.
+	 */
+	noio_flag = memalloc_noio_save();
+	ret = blkdev_report_zones(sb->s_bdev, zi->i_zsector, nr_zones,
+				  zonefs_io_error_cb, &err);
+	if (ret != nr_zones)
+		zonefs_err(sb, "Get inode %lu zone information failed %d\n",
+			   inode->i_ino, ret);
+	memalloc_noio_restore(noio_flag);
+
+	mutex_unlock(&zi->i_truncate_mutex);
+}
+
+static int zonefs_file_truncate(struct inode *inode, loff_t isize)
+{
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	loff_t old_isize;
+	enum req_opf op;
+	int ret = 0;
+
+	/*
+	 * Only sequential zone files can be truncated and truncation is allowed
+	 * only down to a 0 size, which is equivalent to a zone reset, and to
+	 * the maximum file size, which is equivalent to a zone finish.
+	 */
+	if (zi->i_ztype != ZONEFS_ZTYPE_SEQ)
+		return -EPERM;
+
+	if (!isize)
+		op = REQ_OP_ZONE_RESET;
+	else if (isize == zi->i_max_size)
+		op = REQ_OP_ZONE_FINISH;
+	else
+		return -EPERM;
+
+	inode_dio_wait(inode);
+
+	/* Serialize against page faults */
+	down_write(&zi->i_mmap_sem);
+
+	/* Serialize against zonefs_iomap_begin() */
+	mutex_lock(&zi->i_truncate_mutex);
+
+	old_isize = i_size_read(inode);
+	if (isize == old_isize)
+		goto unlock;
+
+	ret = blkdev_zone_mgmt(inode->i_sb->s_bdev, op, zi->i_zsector,
+			       zi->i_max_size >> SECTOR_SHIFT, GFP_NOFS);
+	if (ret) {
+		zonefs_err(inode->i_sb,
+			   "Zone management operation at %llu failed %d\n",
+			   zi->i_zsector, ret);
+		goto unlock;
+	}
+
+	zonefs_update_stats(inode, isize);
+	truncate_setsize(inode, isize);
+	zi->i_wpoffset = isize;
+
+unlock:
+	mutex_unlock(&zi->i_truncate_mutex);
+	up_write(&zi->i_mmap_sem);
+
+	return ret;
+}
+
+static int zonefs_inode_setattr(struct dentry *dentry, struct iattr *iattr)
+{
+	struct inode *inode = d_inode(dentry);
+	int ret;
+
+	if (unlikely(IS_IMMUTABLE(inode)))
+		return -EPERM;
+
+	ret = setattr_prepare(dentry, iattr);
+	if (ret)
+		return ret;
+
+	/*
+	 * Since files and directories cannot be created nor deleted, do not
+	 * allow setting any write attributes on the sub-directories grouping
+	 * files by zone type.
+	 */
+	if ((iattr->ia_valid & ATTR_MODE) && S_ISDIR(inode->i_mode) &&
+	    (iattr->ia_mode & 0222))
+		return -EPERM;
+
+	if (((iattr->ia_valid & ATTR_UID) &&
+	     !uid_eq(iattr->ia_uid, inode->i_uid)) ||
+	    ((iattr->ia_valid & ATTR_GID) &&
+	     !gid_eq(iattr->ia_gid, inode->i_gid))) {
+		ret = dquot_transfer(inode, iattr);
+		if (ret)
+			return ret;
+	}
+
+	if (iattr->ia_valid & ATTR_SIZE) {
+		ret = zonefs_file_truncate(inode, iattr->ia_size);
+		if (ret)
+			return ret;
+	}
+
+	setattr_copy(inode, iattr);
+
+	return 0;
+}
+
+static const struct inode_operations zonefs_file_inode_operations = {
+	.setattr	= zonefs_inode_setattr,
+};
+
+static int zonefs_file_fsync(struct file *file, loff_t start, loff_t end,
+			     int datasync)
+{
+	struct inode *inode = file_inode(file);
+	int ret = 0;
+
+	if (unlikely(IS_IMMUTABLE(inode)))
+		return -EPERM;
+
+	/*
+	 * Since only direct writes are allowed in sequential files, page cache
+	 * flush is needed only for conventional zone files.
+	 */
+	if (ZONEFS_I(inode)->i_ztype == ZONEFS_ZTYPE_CNV)
+		ret = file_write_and_wait_range(file, start, end);
+	if (!ret)
+		ret = blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL);
+
+	if (ret)
+		zonefs_io_error(inode, true);
+
+	return ret;
+}
+
+static vm_fault_t zonefs_filemap_fault(struct vm_fault *vmf)
+{
+	struct zonefs_inode_info *zi = ZONEFS_I(file_inode(vmf->vma->vm_file));
+	vm_fault_t ret;
+
+	down_read(&zi->i_mmap_sem);
+	ret = filemap_fault(vmf);
+	up_read(&zi->i_mmap_sem);
+
+	return ret;
+}
+
+static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	vm_fault_t ret;
+
+	if (unlikely(IS_IMMUTABLE(inode)))
+		return VM_FAULT_SIGBUS;
+
+	/*
+	 * Sanity check: only conventional zone files can have shared
+	 * writeable mappings.
+	 */
+	if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV))
+		return VM_FAULT_NOPAGE;
+
+	sb_start_pagefault(inode->i_sb);
+	file_update_time(vmf->vma->vm_file);
+
+	/* Serialize against truncates */
+	down_read(&zi->i_mmap_sem);
+	ret = iomap_page_mkwrite(vmf, &zonefs_iomap_ops);
+	up_read(&zi->i_mmap_sem);
+
+	sb_end_pagefault(inode->i_sb);
+	return ret;
+}
+
+static const struct vm_operations_struct zonefs_file_vm_ops = {
+	.fault		= zonefs_filemap_fault,
+	.map_pages	= filemap_map_pages,
+	.page_mkwrite	= zonefs_filemap_page_mkwrite,
+};
+
+static int zonefs_file_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	/*
+	 * Conventional zones accept random writes, so their files can support
+	 * shared writable mappings. For sequential zone files, only read
+	 * mappings are possible since there are no guarantees for write
+	 * ordering between msync() and page cache writeback.
+	 */
+	if (ZONEFS_I(file_inode(file))->i_ztype == ZONEFS_ZTYPE_SEQ &&
+	    (vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
+		return -EINVAL;
+
+	file_accessed(file);
+	vma->vm_ops = &zonefs_file_vm_ops;
+
+	return 0;
+}
+
+static loff_t zonefs_file_llseek(struct file *file, loff_t offset, int whence)
+{
+	loff_t isize = i_size_read(file_inode(file));
+
+	/*
+	 * Seeks are limited to below the zone size for conventional zones
+	 * and below the zone write pointer for sequential zones. In both
+	 * cases, this limit is the inode size.
+	 */
+	return generic_file_llseek_size(file, offset, whence, isize, isize);
+}
+
+static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size,
+					int error, unsigned int flags)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+
+	if (error) {
+		zonefs_io_error(inode, true);
+		return error;
+	}
+
+	if (size && zi->i_ztype != ZONEFS_ZTYPE_CNV) {
+		/*
+		 * Note that we may be seeing completions out of order,
+		 * but that is not a problem since a write completed
+		 * successfully necessarily means that all preceding writes
+		 * were also successful. So we can safely increase the inode
+		 * size to the write end location.
+		 */
+		mutex_lock(&zi->i_truncate_mutex);
+		if (i_size_read(inode) < iocb->ki_pos + size) {
+			zonefs_update_stats(inode, iocb->ki_pos + size);
+			i_size_write(inode, iocb->ki_pos + size);
+		}
+		mutex_unlock(&zi->i_truncate_mutex);
+	}
+
+	return 0;
+}
+
+static const struct iomap_dio_ops zonefs_write_dio_ops = {
+	.end_io			= zonefs_file_write_dio_end_io,
+};
+
+/*
+ * Handle direct writes. For sequential zone files, this is the only possible
+ * write path. For these files, check that the user is issuing writes
+ * sequentially from the end of the file. This code assumes that the block layer
+ * delivers write requests to the device in sequential order. This is always the
+ * case if a block IO scheduler implementing the ELEVATOR_F_ZBD_SEQ_WRITE
+ * elevator feature is being used (e.g. mq-deadline). The block layer always
+ * automatically selects such an elevator for zoned block devices during the
+ * device initialization.
+ */
+static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	size_t count;
+	ssize_t ret;
+
+	/*
+	 * For async direct IOs to sequential zone files, ignore IOCB_NOWAIT
+	 * as this can cause write reordering (e.g. the first aio gets EAGAIN
+	 * on the inode lock but the second goes through but is now unaligned).
+	 */
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb)
+	    && (iocb->ki_flags & IOCB_NOWAIT))
+		iocb->ki_flags &= ~IOCB_NOWAIT;
+
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!inode_trylock(inode))
+			return -EAGAIN;
+	} else {
+		inode_lock(inode);
+	}
+
+	ret = generic_write_checks(iocb, from);
+	if (ret <= 0)
+		goto inode_unlock;
+
+	iov_iter_truncate(from, zi->i_max_size - iocb->ki_pos);
+	count = iov_iter_count(from);
+
+	if ((iocb->ki_pos | count) & (sb->s_blocksize - 1)) {
+		ret = -EINVAL;
+		goto inode_unlock;
+	}
+
+	/* Enforce sequential writes (append only) in sequential zones */
+	mutex_lock(&zi->i_truncate_mutex);
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && iocb->ki_pos != zi->i_wpoffset) {
+		mutex_unlock(&zi->i_truncate_mutex);
+		ret = -EINVAL;
+		goto inode_unlock;
+	}
+	mutex_unlock(&zi->i_truncate_mutex);
+
+	ret = iomap_dio_rw(iocb, from, &zonefs_iomap_ops,
+			   &zonefs_write_dio_ops, is_sync_kiocb(iocb));
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ &&
+	    (ret > 0 || ret == -EIOCBQUEUED)) {
+		if (ret > 0)
+			count = ret;
+		mutex_lock(&zi->i_truncate_mutex);
+		zi->i_wpoffset += count;
+		mutex_unlock(&zi->i_truncate_mutex);
+	}
+
+inode_unlock:
+	inode_unlock(inode);
+
+	return ret;
+}
+
+static ssize_t zonefs_file_buffered_write(struct kiocb *iocb,
+					  struct iov_iter *from)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	ssize_t ret;
+
+	/*
+	 * Direct IO writes are mandatory for sequential zone files so that the
+	 * write IO issuing order is preserved.
+	 */
+	if (zi->i_ztype != ZONEFS_ZTYPE_CNV)
+		return -EIO;
+
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!inode_trylock(inode))
+			return -EAGAIN;
+	} else {
+		inode_lock(inode);
+	}
+
+	ret = generic_write_checks(iocb, from);
+	if (ret <= 0)
+		goto inode_unlock;
+
+	iov_iter_truncate(from, zi->i_max_size - iocb->ki_pos);
+
+	ret = iomap_file_buffered_write(iocb, from, &zonefs_iomap_ops);
+	if (ret > 0)
+		iocb->ki_pos += ret;
+	else if (ret == -EIO)
+		zonefs_io_error(inode, true);
+
+inode_unlock:
+	inode_unlock(inode);
+	if (ret > 0)
+		ret = generic_write_sync(iocb, ret);
+
+	return ret;
+}
+
+static ssize_t zonefs_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+
+	if (unlikely(IS_IMMUTABLE(inode)))
+		return -EPERM;
+
+	if (sb_rdonly(inode->i_sb))
+		return -EROFS;
+
+	/* Write operations beyond the zone size are not allowed */
+	if (iocb->ki_pos >= ZONEFS_I(inode)->i_max_size)
+		return -EFBIG;
+
+	if (iocb->ki_flags & IOCB_DIRECT)
+		return zonefs_file_dio_write(iocb, from);
+
+	return zonefs_file_buffered_write(iocb, from);
+}
+
+static int zonefs_file_read_dio_end_io(struct kiocb *iocb, ssize_t size,
+				       int error, unsigned int flags)
+{
+	if (error) {
+		zonefs_io_error(file_inode(iocb->ki_filp), false);
+		return error;
+	}
+
+	return 0;
+}
+
+static const struct iomap_dio_ops zonefs_read_dio_ops = {
+	.end_io			= zonefs_file_read_dio_end_io,
+};
+
+static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct super_block *sb = inode->i_sb;
+	loff_t isize;
+	ssize_t ret;
+
+	/* Offline zones cannot be read */
+	if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777)))
+		return -EPERM;
+
+	if (iocb->ki_pos >= zi->i_max_size)
+		return 0;
+
+	if (iocb->ki_flags & IOCB_NOWAIT) {
+		if (!inode_trylock_shared(inode))
+			return -EAGAIN;
+	} else {
+		inode_lock_shared(inode);
+	}
+
+	/* Limit read operations to written data */
+	mutex_lock(&zi->i_truncate_mutex);
+	isize = i_size_read(inode);
+	if (iocb->ki_pos >= isize) {
+		mutex_unlock(&zi->i_truncate_mutex);
+		ret = 0;
+		goto inode_unlock;
+	}
+	iov_iter_truncate(to, isize - iocb->ki_pos);
+	mutex_unlock(&zi->i_truncate_mutex);
+
+	if (iocb->ki_flags & IOCB_DIRECT) {
+		size_t count = iov_iter_count(to);
+
+		if ((iocb->ki_pos | count) & (sb->s_blocksize - 1)) {
+			ret = -EINVAL;
+			goto inode_unlock;
+		}
+		file_accessed(iocb->ki_filp);
+		ret = iomap_dio_rw(iocb, to, &zonefs_iomap_ops,
+				   &zonefs_read_dio_ops, is_sync_kiocb(iocb));
+	} else {
+		ret = generic_file_read_iter(iocb, to);
+		if (ret == -EIO)
+			zonefs_io_error(inode, false);
+	}
+
+inode_unlock:
+	inode_unlock_shared(inode);
+
+	return ret;
+}
+
+static const struct file_operations zonefs_file_operations = {
+	.open		= generic_file_open,
+	.fsync		= zonefs_file_fsync,
+	.mmap		= zonefs_file_mmap,
+	.llseek		= zonefs_file_llseek,
+	.read_iter	= zonefs_file_read_iter,
+	.write_iter	= zonefs_file_write_iter,
+	.splice_read	= generic_file_splice_read,
+	.splice_write	= iter_file_splice_write,
+	.iopoll		= iomap_dio_iopoll,
+};
+
+static struct kmem_cache *zonefs_inode_cachep;
+
+static struct inode *zonefs_alloc_inode(struct super_block *sb)
+{
+	struct zonefs_inode_info *zi;
+
+	zi = kmem_cache_alloc(zonefs_inode_cachep, GFP_KERNEL);
+	if (!zi)
+		return NULL;
+
+	inode_init_once(&zi->i_vnode);
+	mutex_init(&zi->i_truncate_mutex);
+	init_rwsem(&zi->i_mmap_sem);
+
+	return &zi->i_vnode;
+}
+
+static void zonefs_free_inode(struct inode *inode)
+{
+	kmem_cache_free(zonefs_inode_cachep, ZONEFS_I(inode));
+}
+
+/*
+ * File system stat.
+ */
+static int zonefs_statfs(struct dentry *dentry, struct kstatfs *buf)
+{
+	struct super_block *sb = dentry->d_sb;
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	enum zonefs_ztype t;
+	u64 fsid;
+
+	buf->f_type = ZONEFS_MAGIC;
+	buf->f_bsize = sb->s_blocksize;
+	buf->f_namelen = ZONEFS_NAME_MAX;
+
+	spin_lock(&sbi->s_lock);
+
+	buf->f_blocks = sbi->s_blocks;
+	if (WARN_ON(sbi->s_used_blocks > sbi->s_blocks))
+		buf->f_bfree = 0;
+	else
+		buf->f_bfree = buf->f_blocks - sbi->s_used_blocks;
+	buf->f_bavail = buf->f_bfree;
+
+	for (t = 0; t < ZONEFS_ZTYPE_MAX; t++) {
+		if (sbi->s_nr_files[t])
+			buf->f_files += sbi->s_nr_files[t] + 1;
+	}
+	buf->f_ffree = 0;
+
+	spin_unlock(&sbi->s_lock);
+
+	fsid = le64_to_cpup((void *)sbi->s_uuid.b) ^
+		le64_to_cpup((void *)sbi->s_uuid.b + sizeof(u64));
+	buf->f_fsid.val[0] = (u32)fsid;
+	buf->f_fsid.val[1] = (u32)(fsid >> 32);
+
+	return 0;
+}
+
+enum {
+	Opt_errors_ro, Opt_errors_zro, Opt_errors_zol, Opt_errors_repair,
+	Opt_err,
+};
+
+static const match_table_t tokens = {
+	{ Opt_errors_ro,	"errors=remount-ro"},
+	{ Opt_errors_zro,	"errors=zone-ro"},
+	{ Opt_errors_zol,	"errors=zone-offline"},
+	{ Opt_errors_repair,	"errors=repair"},
+	{ Opt_err,		NULL}
+};
+
+static int zonefs_parse_options(struct super_block *sb, char *options)
+{
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	substring_t args[MAX_OPT_ARGS];
+	char *p;
+
+	if (!options)
+		return 0;
+
+	while ((p = strsep(&options, ",")) != NULL) {
+		int token;
+
+		if (!*p)
+			continue;
+
+		token = match_token(p, tokens, args);
+		switch (token) {
+		case Opt_errors_ro:
+			sbi->s_mount_opts &= ~ZONEFS_MNTOPT_ERRORS_MASK;
+			sbi->s_mount_opts |= ZONEFS_MNTOPT_ERRORS_RO;
+			break;
+		case Opt_errors_zro:
+			sbi->s_mount_opts &= ~ZONEFS_MNTOPT_ERRORS_MASK;
+			sbi->s_mount_opts |= ZONEFS_MNTOPT_ERRORS_ZRO;
+			break;
+		case Opt_errors_zol:
+			sbi->s_mount_opts &= ~ZONEFS_MNTOPT_ERRORS_MASK;
+			sbi->s_mount_opts |= ZONEFS_MNTOPT_ERRORS_ZOL;
+			break;
+		case Opt_errors_repair:
+			sbi->s_mount_opts &= ~ZONEFS_MNTOPT_ERRORS_MASK;
+			sbi->s_mount_opts |= ZONEFS_MNTOPT_ERRORS_REPAIR;
+			break;
+		default:
+			return -EINVAL;
+		}
+	}
+
+	return 0;
+}
+
+static int zonefs_show_options(struct seq_file *seq, struct dentry *root)
+{
+	struct zonefs_sb_info *sbi = ZONEFS_SB(root->d_sb);
+
+	if (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_RO)
+		seq_puts(seq, ",errors=remount-ro");
+	if (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZRO)
+		seq_puts(seq, ",errors=zone-ro");
+	if (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZOL)
+		seq_puts(seq, ",errors=zone-offline");
+	if (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_REPAIR)
+		seq_puts(seq, ",errors=repair");
+
+	return 0;
+}
+
+static int zonefs_remount(struct super_block *sb, int *flags, char *data)
+{
+	sync_filesystem(sb);
+
+	return zonefs_parse_options(sb, data);
+}
+
+static const struct super_operations zonefs_sops = {
+	.alloc_inode	= zonefs_alloc_inode,
+	.free_inode	= zonefs_free_inode,
+	.statfs		= zonefs_statfs,
+	.remount_fs	= zonefs_remount,
+	.show_options	= zonefs_show_options,
+};
+
+static const struct inode_operations zonefs_dir_inode_operations = {
+	.lookup		= simple_lookup,
+	.setattr	= zonefs_inode_setattr,
+};
+
+static void zonefs_init_dir_inode(struct inode *parent, struct inode *inode,
+				  enum zonefs_ztype type)
+{
+	struct super_block *sb = parent->i_sb;
+
+	inode->i_ino = blkdev_nr_zones(sb->s_bdev->bd_disk) + type + 1;
+	inode_init_owner(inode, parent, S_IFDIR | 0555);
+	inode->i_op = &zonefs_dir_inode_operations;
+	inode->i_fop = &simple_dir_operations;
+	set_nlink(inode, 2);
+	inc_nlink(parent);
+}
+
+static void zonefs_init_file_inode(struct inode *inode, struct blk_zone *zone,
+				   enum zonefs_ztype type)
+{
+	struct super_block *sb = inode->i_sb;
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+
+	inode->i_ino = zone->start >> sbi->s_zone_sectors_shift;
+	inode->i_mode = S_IFREG | sbi->s_perm;
+
+	zi->i_ztype = type;
+	zi->i_zsector = zone->start;
+	zi->i_max_size = min_t(loff_t, MAX_LFS_FILESIZE,
+			       zone->len << SECTOR_SHIFT);
+	zi->i_wpoffset = zonefs_check_zone_condition(inode, zone, true);
+
+	inode->i_uid = sbi->s_uid;
+	inode->i_gid = sbi->s_gid;
+	inode->i_size = zi->i_wpoffset;
+	inode->i_blocks = zone->len;
+
+	inode->i_op = &zonefs_file_inode_operations;
+	inode->i_fop = &zonefs_file_operations;
+	inode->i_mapping->a_ops = &zonefs_file_aops;
+
+	sb->s_maxbytes = max(zi->i_max_size, sb->s_maxbytes);
+	sbi->s_blocks += zi->i_max_size >> sb->s_blocksize_bits;
+	sbi->s_used_blocks += zi->i_wpoffset >> sb->s_blocksize_bits;
+}
+
+static struct dentry *zonefs_create_inode(struct dentry *parent,
+					const char *name, struct blk_zone *zone,
+					enum zonefs_ztype type)
+{
+	struct inode *dir = d_inode(parent);
+	struct dentry *dentry;
+	struct inode *inode;
+
+	dentry = d_alloc_name(parent, name);
+	if (!dentry)
+		return NULL;
+
+	inode = new_inode(parent->d_sb);
+	if (!inode)
+		goto dput;
+
+	inode->i_ctime = inode->i_mtime = inode->i_atime = dir->i_ctime;
+	if (zone)
+		zonefs_init_file_inode(inode, zone, type);
+	else
+		zonefs_init_dir_inode(dir, inode, type);
+	d_add(dentry, inode);
+	dir->i_size++;
+
+	return dentry;
+
+dput:
+	dput(dentry);
+
+	return NULL;
+}
+
+struct zonefs_zone_data {
+	struct super_block	*sb;
+	unsigned int		nr_zones[ZONEFS_ZTYPE_MAX];
+	struct blk_zone		*zones;
+};
+
+/*
+ * Create a zone group and populate it with zone files.
+ */
+static int zonefs_create_zgroup(struct zonefs_zone_data *zd,
+				enum zonefs_ztype type)
+{
+	struct super_block *sb = zd->sb;
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	struct blk_zone *zone, *next, *end;
+	const char *zgroup_name;
+	char *file_name;
+	struct dentry *dir;
+	unsigned int n = 0;
+	int ret = -ENOMEM;
+
+	/* If the group is empty, there is nothing to do */
+	if (!zd->nr_zones[type])
+		return 0;
+
+	file_name = kmalloc(ZONEFS_NAME_MAX, GFP_KERNEL);
+	if (!file_name)
+		return -ENOMEM;
+
+	if (type == ZONEFS_ZTYPE_CNV)
+		zgroup_name = "cnv";
+	else
+		zgroup_name = "seq";
+
+	dir = zonefs_create_inode(sb->s_root, zgroup_name, NULL, type);
+	if (!dir)
+		goto free;
+
+	/*
+	 * The first zone contains the super block: skip it.
+	 */
+	end = zd->zones + blkdev_nr_zones(sb->s_bdev->bd_disk);
+	for (zone = &zd->zones[1]; zone < end; zone = next) {
+
+		next = zone + 1;
+		if (zonefs_zone_type(zone) != type)
+			continue;
+
+		/*
+		 * For conventional zones, contiguous zones can be aggregated
+		 * together to form larger files. Note that this overwrites the
+		 * length of the first zone of the set of contiguous zones
+		 * aggregated together. If one offline or read-only zone is
+		 * found, assume that all zones aggregated have the same
+		 * condition.
+		 */
+		if (type == ZONEFS_ZTYPE_CNV &&
+		    (sbi->s_features & ZONEFS_F_AGGRCNV)) {
+			for (; next < end; next++) {
+				if (zonefs_zone_type(next) != type)
+					break;
+				zone->len += next->len;
+				if (next->cond == BLK_ZONE_COND_READONLY &&
+				    zone->cond != BLK_ZONE_COND_OFFLINE)
+					zone->cond = BLK_ZONE_COND_READONLY;
+				else if (next->cond == BLK_ZONE_COND_OFFLINE)
+					zone->cond = BLK_ZONE_COND_OFFLINE;
+			}
+		}
+
+		/*
+		 * Use the file number within its group as file name.
+		 */
+		snprintf(file_name, ZONEFS_NAME_MAX - 1, "%u", n);
+		if (!zonefs_create_inode(dir, file_name, zone, type))
+			goto free;
+
+		n++;
+	}
+
+	zonefs_info(sb, "Zone group \"%s\" has %u file%s\n",
+		    zgroup_name, n, n > 1 ? "s" : "");
+
+	sbi->s_nr_files[type] = n;
+	ret = 0;
+
+free:
+	kfree(file_name);
+
+	return ret;
+}
+
+static int zonefs_get_zone_info_cb(struct blk_zone *zone, unsigned int idx,
+				   void *data)
+{
+	struct zonefs_zone_data *zd = data;
+
+	/*
+	 * Count the number of usable zones: the first zone at index 0 contains
+	 * the super block and is ignored.
+	 */
+	switch (zone->type) {
+	case BLK_ZONE_TYPE_CONVENTIONAL:
+		zone->wp = zone->start + zone->len;
+		if (idx)
+			zd->nr_zones[ZONEFS_ZTYPE_CNV]++;
+		break;
+	case BLK_ZONE_TYPE_SEQWRITE_REQ:
+	case BLK_ZONE_TYPE_SEQWRITE_PREF:
+		if (idx)
+			zd->nr_zones[ZONEFS_ZTYPE_SEQ]++;
+		break;
+	default:
+		zonefs_err(zd->sb, "Unsupported zone type 0x%x\n",
+			   zone->type);
+		return -EIO;
+	}
+
+	memcpy(&zd->zones[idx], zone, sizeof(struct blk_zone));
+
+	return 0;
+}
+
+static int zonefs_get_zone_info(struct zonefs_zone_data *zd)
+{
+	struct block_device *bdev = zd->sb->s_bdev;
+	int ret;
+
+	zd->zones = kvcalloc(blkdev_nr_zones(bdev->bd_disk),
+			     sizeof(struct blk_zone), GFP_KERNEL);
+	if (!zd->zones)
+		return -ENOMEM;
+
+	/* Get zones information from the device */
+	ret = blkdev_report_zones(bdev, 0, BLK_ALL_ZONES,
+				  zonefs_get_zone_info_cb, zd);
+	if (ret < 0) {
+		zonefs_err(zd->sb, "Zone report failed %d\n", ret);
+		return ret;
+	}
+
+	if (ret != blkdev_nr_zones(bdev->bd_disk)) {
+		zonefs_err(zd->sb, "Invalid zone report (%d/%u zones)\n",
+			   ret, blkdev_nr_zones(bdev->bd_disk));
+		return -EIO;
+	}
+
+	return 0;
+}
+
+static inline void zonefs_cleanup_zone_info(struct zonefs_zone_data *zd)
+{
+	kvfree(zd->zones);
+}
+
+/*
+ * Read super block information from the device.
+ */
+static int zonefs_read_super(struct super_block *sb)
+{
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+	struct zonefs_super *super;
+	u32 crc, stored_crc;
+	struct page *page;
+	struct bio_vec bio_vec;
+	struct bio bio;
+	int ret;
+
+	page = alloc_page(GFP_KERNEL);
+	if (!page)
+		return -ENOMEM;
+
+	bio_init(&bio, &bio_vec, 1);
+	bio.bi_iter.bi_sector = 0;
+	bio.bi_opf = REQ_OP_READ;
+	bio_set_dev(&bio, sb->s_bdev);
+	bio_add_page(&bio, page, PAGE_SIZE, 0);
+
+	ret = submit_bio_wait(&bio);
+	if (ret)
+		goto free_page;
+
+	super = kmap(page);
+
+	ret = -EINVAL;
+	if (le32_to_cpu(super->s_magic) != ZONEFS_MAGIC)
+		goto unmap;
+
+	stored_crc = le32_to_cpu(super->s_crc);
+	super->s_crc = 0;
+	crc = crc32(~0U, (unsigned char *)super, sizeof(struct zonefs_super));
+	if (crc != stored_crc) {
+		zonefs_err(sb, "Invalid checksum (Expected 0x%08x, got 0x%08x)\n",
+			   crc, stored_crc);
+		goto unmap;
+	}
+
+	sbi->s_features = le64_to_cpu(super->s_features);
+	if (sbi->s_features & ~ZONEFS_F_DEFINED_FEATURES) {
+		zonefs_err(sb, "Unknown features set 0x%llx\n",
+			   sbi->s_features);
+		goto unmap;
+	}
+
+	if (sbi->s_features & ZONEFS_F_UID) {
+		sbi->s_uid = make_kuid(current_user_ns(),
+				       le32_to_cpu(super->s_uid));
+		if (!uid_valid(sbi->s_uid)) {
+			zonefs_err(sb, "Invalid UID feature\n");
+			goto unmap;
+		}
+	}
+
+	if (sbi->s_features & ZONEFS_F_GID) {
+		sbi->s_gid = make_kgid(current_user_ns(),
+				       le32_to_cpu(super->s_gid));
+		if (!gid_valid(sbi->s_gid)) {
+			zonefs_err(sb, "Invalid GID feature\n");
+			goto unmap;
+		}
+	}
+
+	if (sbi->s_features & ZONEFS_F_PERM)
+		sbi->s_perm = le32_to_cpu(super->s_perm);
+
+	if (memchr_inv(super->s_reserved, 0, sizeof(super->s_reserved))) {
+		zonefs_err(sb, "Reserved area is being used\n");
+		goto unmap;
+	}
+
+	uuid_copy(&sbi->s_uuid, (uuid_t *)super->s_uuid);
+	ret = 0;
+
+unmap:
+	kunmap(page);
+free_page:
+	__free_page(page);
+
+	return ret;
+}
+
+/*
+ * Check that the device is zoned. If it is, get the list of zones and create
+ * sub-directories and files according to the device zone configuration and
+ * format options.
+ */
+static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
+{
+	struct zonefs_zone_data zd;
+	struct zonefs_sb_info *sbi;
+	struct inode *inode;
+	enum zonefs_ztype t;
+	int ret;
+
+	if (!bdev_is_zoned(sb->s_bdev)) {
+		zonefs_err(sb, "Not a zoned block device\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * Initialize super block information: the maximum file size is updated
+	 * when the zone files are created so that the format option
+	 * ZONEFS_F_AGGRCNV which increases the maximum file size of a file
+	 * beyond the zone size is taken into account.
+	 */
+	sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
+	if (!sbi)
+		return -ENOMEM;
+
+	spin_lock_init(&sbi->s_lock);
+	sb->s_fs_info = sbi;
+	sb->s_magic = ZONEFS_MAGIC;
+	sb->s_maxbytes = 0;
+	sb->s_op = &zonefs_sops;
+	sb->s_time_gran	= 1;
+
+	/*
+	 * The block size is set to the device physical sector size to ensure
+	 * that write operations on 512e devices (512B logical block and 4KB
+	 * physical block) are always aligned to the device physical blocks,
+	 * as mandated by the ZBC/ZAC specifications.
+	 */
+	sb_set_blocksize(sb, bdev_physical_block_size(sb->s_bdev));
+	sbi->s_zone_sectors_shift = ilog2(bdev_zone_sectors(sb->s_bdev));
+	sbi->s_uid = GLOBAL_ROOT_UID;
+	sbi->s_gid = GLOBAL_ROOT_GID;
+	sbi->s_perm = 0640;
+	sbi->s_mount_opts = ZONEFS_MNTOPT_ERRORS_RO;
+
+	ret = zonefs_read_super(sb);
+	if (ret)
+		return ret;
+
+	ret = zonefs_parse_options(sb, data);
+	if (ret)
+		return ret;
+
+	memset(&zd, 0, sizeof(struct zonefs_zone_data));
+	zd.sb = sb;
+	ret = zonefs_get_zone_info(&zd);
+	if (ret)
+		goto cleanup;
+
+	zonefs_info(sb, "Mounting %u zones\n",
+		    blkdev_nr_zones(sb->s_bdev->bd_disk));
+
+	/* Create root directory inode */
+	ret = -ENOMEM;
+	inode = new_inode(sb);
+	if (!inode)
+		goto cleanup;
+
+	inode->i_ino = blkdev_nr_zones(sb->s_bdev->bd_disk);
+	inode->i_mode = S_IFDIR | 0555;
+	inode->i_ctime = inode->i_mtime = inode->i_atime = current_time(inode);
+	inode->i_op = &zonefs_dir_inode_operations;
+	inode->i_fop = &simple_dir_operations;
+	set_nlink(inode, 2);
+
+	sb->s_root = d_make_root(inode);
+	if (!sb->s_root)
+		goto cleanup;
+
+	/* Create and populate files in zone groups directories */
+	for (t = 0; t < ZONEFS_ZTYPE_MAX; t++) {
+		ret = zonefs_create_zgroup(&zd, t);
+		if (ret)
+			break;
+	}
+
+cleanup:
+	zonefs_cleanup_zone_info(&zd);
+
+	return ret;
+}
+
+static struct dentry *zonefs_mount(struct file_system_type *fs_type,
+				   int flags, const char *dev_name, void *data)
+{
+	return mount_bdev(fs_type, flags, dev_name, data, zonefs_fill_super);
+}
+
+static void zonefs_kill_super(struct super_block *sb)
+{
+	struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
+
+	if (sb->s_root)
+		d_genocide(sb->s_root);
+	kill_block_super(sb);
+	kfree(sbi);
+}
+
+/*
+ * File system definition and registration.
+ */
+static struct file_system_type zonefs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "zonefs",
+	.mount		= zonefs_mount,
+	.kill_sb	= zonefs_kill_super,
+	.fs_flags	= FS_REQUIRES_DEV,
+};
+
+static int __init zonefs_init_inodecache(void)
+{
+	zonefs_inode_cachep = kmem_cache_create("zonefs_inode_cache",
+			sizeof(struct zonefs_inode_info), 0,
+			(SLAB_RECLAIM_ACCOUNT | SLAB_MEM_SPREAD | SLAB_ACCOUNT),
+			NULL);
+	if (zonefs_inode_cachep == NULL)
+		return -ENOMEM;
+	return 0;
+}
+
+static void zonefs_destroy_inodecache(void)
+{
+	/*
+	 * Make sure all delayed rcu free inodes are flushed before we
+	 * destroy the inode cache.
+	 */
+	rcu_barrier();
+	kmem_cache_destroy(zonefs_inode_cachep);
+}
+
+static int __init zonefs_init(void)
+{
+	int ret;
+
+	BUILD_BUG_ON(sizeof(struct zonefs_super) != ZONEFS_SUPER_SIZE);
+
+	ret = zonefs_init_inodecache();
+	if (ret)
+		return ret;
+
+	ret = register_filesystem(&zonefs_type);
+	if (ret) {
+		zonefs_destroy_inodecache();
+		return ret;
+	}
+
+	return 0;
+}
+
+static void __exit zonefs_exit(void)
+{
+	zonefs_destroy_inodecache();
+	unregister_filesystem(&zonefs_type);
+}
+
+MODULE_AUTHOR("Damien Le Moal");
+MODULE_DESCRIPTION("Zone file system for zoned block devices");
+MODULE_LICENSE("GPL");
+module_init(zonefs_init);
+module_exit(zonefs_exit);
diff --git a/fs/zonefs/zonefs.h b/fs/zonefs/zonefs.h
new file mode 100644
index 000000000000..ad17fef7ce91
--- /dev/null
+++ b/fs/zonefs/zonefs.h
@@ -0,0 +1,189 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Simple zone file system for zoned block devices.
+ *
+ * Copyright (C) 2019 Western Digital Corporation or its affiliates.
+ */
+#ifndef __ZONEFS_H__
+#define __ZONEFS_H__
+
+#include <linux/fs.h>
+#include <linux/magic.h>
+#include <linux/uuid.h>
+#include <linux/mutex.h>
+#include <linux/rwsem.h>
+
+/*
+ * Maximum length of file names: this only needs to be large enough to fit
+ * the zone group directory names and a decimal zone number for file names.
+ * 16 characters is plenty.
+ */
+#define ZONEFS_NAME_MAX		16
+
+/*
+ * Zone types: ZONEFS_ZTYPE_SEQ is used for all sequential zone types
+ * defined in linux/blkzoned.h, that is, BLK_ZONE_TYPE_SEQWRITE_REQ and
+ * BLK_ZONE_TYPE_SEQWRITE_PREF.
+ */
+enum zonefs_ztype {
+	ZONEFS_ZTYPE_CNV,
+	ZONEFS_ZTYPE_SEQ,
+	ZONEFS_ZTYPE_MAX,
+};
+
+static inline enum zonefs_ztype zonefs_zone_type(struct blk_zone *zone)
+{
+	if (zone->type == BLK_ZONE_TYPE_CONVENTIONAL)
+		return ZONEFS_ZTYPE_CNV;
+	return ZONEFS_ZTYPE_SEQ;
+}
+
+/*
+ * In-memory inode data.
+ */
+struct zonefs_inode_info {
+	struct inode		i_vnode;
+
+	/* File zone type */
+	enum zonefs_ztype	i_ztype;
+
+	/* File zone start sector (512B unit) */
+	sector_t		i_zsector;
+
+	/* File zone write pointer position (sequential zones only) */
+	loff_t			i_wpoffset;
+
+	/* File maximum size */
+	loff_t			i_max_size;
+
+	/*
+	 * To serialise fully against both syscall and mmap based IO and
+	 * sequential file truncation, two locks are used. For serializing
+	 * zonefs_seq_file_truncate() against zonefs_iomap_begin(), that is,
+	 * file truncate operations against block mapping, i_truncate_mutex is
+	 * used. i_truncate_mutex also protects against concurrent accesses
+	 * and changes to the inode private data, and in particular changes to
+	 * a sequential file size on completion of direct IO writes.
+	 * Serialization of mmap read IOs with truncate and syscall IO
+	 * operations is done with i_mmap_sem in addition to i_truncate_mutex.
+	 * Only zonefs_seq_file_truncate() takes both locks (i_mmap_sem first,
+	 * i_truncate_mutex second).
+	 */
+	struct mutex		i_truncate_mutex;
+	struct rw_semaphore	i_mmap_sem;
+};
+
+static inline struct zonefs_inode_info *ZONEFS_I(struct inode *inode)
+{
+	return container_of(inode, struct zonefs_inode_info, i_vnode);
+}
+
+/*
+ * On-disk super block (block 0).
+ */
+#define ZONEFS_LABEL_LEN	64
+#define ZONEFS_UUID_SIZE	16
+#define ZONEFS_SUPER_SIZE	4096
+
+struct zonefs_super {
+
+	/* Magic number */
+	__le32		s_magic;
+
+	/* Checksum */
+	__le32		s_crc;
+
+	/* Volume label */
+	char		s_label[ZONEFS_LABEL_LEN];
+
+	/* 128-bit uuid */
+	__u8		s_uuid[ZONEFS_UUID_SIZE];
+
+	/* Features */
+	__le64		s_features;
+
+	/* UID/GID to use for files */
+	__le32		s_uid;
+	__le32		s_gid;
+
+	/* File permissions */
+	__le32		s_perm;
+
+	/* Padding to ZONEFS_SUPER_SIZE bytes */
+	__u8		s_reserved[3988];
+
+} __packed;
+
+/*
+ * Feature flags: specified in the s_features field of the on-disk super
+ * block struct zonefs_super and in-memory in the s_features field of
+ * struct zonefs_sb_info.
+ */
+enum zonefs_features {
+	/*
+	 * Aggregate contiguous conventional zones into a single file.
+	 */
+	ZONEFS_F_AGGRCNV = 1ULL << 0,
+	/*
+	 * Use super block specified UID for files instead of default 0.
+	 */
+	ZONEFS_F_UID = 1ULL << 1,
+	/*
+	 * Use super block specified GID for files instead of default 0.
+	 */
+	ZONEFS_F_GID = 1ULL << 2,
+	/*
+	 * Use super block specified file permissions instead of default 640.
+	 */
+	ZONEFS_F_PERM = 1ULL << 3,
+};
+
+#define ZONEFS_F_DEFINED_FEATURES \
+	(ZONEFS_F_AGGRCNV | ZONEFS_F_UID | ZONEFS_F_GID | ZONEFS_F_PERM)
+
+/*
+ * Mount options for zone write pointer error handling.
+ */
+#define ZONEFS_MNTOPT_ERRORS_RO		(1 << 0) /* Remount read-only */
+#define ZONEFS_MNTOPT_ERRORS_ZRO	(1 << 1) /* Make zone file readonly */
+#define ZONEFS_MNTOPT_ERRORS_ZOL	(1 << 2) /* Make zone file offline */
+#define ZONEFS_MNTOPT_ERRORS_REPAIR	(1 << 3) /* Repair zone file */
+#define ZONEFS_MNTOPT_ERRORS_MASK	\
+	(ZONEFS_MNTOPT_ERRORS_RO | ZONEFS_MNTOPT_ERRORS_ZRO | \
+	 ZONEFS_MNTOPT_ERRORS_ZOL | ZONEFS_MNTOPT_ERRORS_REPAIR)
+
+/*
+ * In-memory Super block information.
+ */
+struct zonefs_sb_info {
+
+	unsigned long		s_mount_opts;
+
+	spinlock_t		s_lock;
+
+	unsigned long long	s_features;
+	kuid_t			s_uid;
+	kgid_t			s_gid;
+	umode_t			s_perm;
+	uuid_t			s_uuid;
+	unsigned int		s_zone_sectors_shift;
+
+	unsigned int		s_nr_files[ZONEFS_ZTYPE_MAX];
+
+	loff_t			s_blocks;
+	loff_t			s_used_blocks;
+};
+
+static inline struct zonefs_sb_info *ZONEFS_SB(struct super_block *sb)
+{
+	return sb->s_fs_info;
+}
+
+#define zonefs_info(sb, format, args...)	\
+	pr_info("zonefs (%s): " format, sb->s_id, ## args)
+#define zonefs_err(sb, format, args...)		\
+	pr_err("zonefs (%s) ERROR: " format, sb->s_id, ## args)
+#define zonefs_warn(sb, format, args...)	\
+	pr_warn("zonefs (%s) WARNING: " format, sb->s_id, ## args)
+
+#endif
diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index 3ac436376d79..d78064007b17 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -87,6 +87,7 @@
 #define NSFS_MAGIC		0x6e736673
 #define BPF_FS_MAGIC		0xcafe4a11
 #define AAFS_MAGIC		0x5a3c69f0
+#define ZONEFS_MAGIC		0x5a4f4653
 
 /* Since UDF 2.01 is ISO 13346 based... */
 #define UDF_SUPER_MAGIC		0x15013346
-- 
cgit v1.2.3


From ca4b43c14cd88d28cfc6467d2fa075aad6818f1d Mon Sep 17 00:00:00 2001
From: Peter Chen <peter.chen@nxp.com>
Date: Sat, 1 Feb 2020 14:13:44 +0800
Subject: usb: charger: assign specific number for enum value

To work properly on every architecture and compiler, the enum values
need to be assigned specific numbers.
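
As a minimal sketch of the idea (not part of this patch), the enumerators
below are copied from the patched header, while the two `_Static_assert`
guards are a hypothetical addition that assumes a C11 compiler. They only
illustrate how pinned values can be checked at build time so that a later
insertion or reordering cannot silently renumber them:

```c
/* Enumerators as they appear in the patched include/uapi/linux/usb/charger.h. */
enum usb_charger_type {
	UNKNOWN_TYPE = 0,
	SDP_TYPE = 1,
	DCP_TYPE = 2,
	CDP_TYPE = 3,
	ACA_TYPE = 4,
};

enum usb_charger_state {
	USB_CHARGER_DEFAULT = 0,
	USB_CHARGER_PRESENT = 1,
	USB_CHARGER_ABSENT = 2,
};

/*
 * Hypothetical build-time guards (an assumption of this sketch, not in the
 * patch): the build fails if a value visible to userspace ever drifts.
 */
_Static_assert(ACA_TYPE == 4, "usb_charger_type values are UAPI");
_Static_assert(USB_CHARGER_ABSENT == 2, "usb_charger_state values are UAPI");
```

With explicit values, the numbers userspace depends on can be read straight
from the header instead of being counted from the position of each entry.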

Suggested-by: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Peter Chen <peter.chen@nxp.com>
Link: https://lore.kernel.org/r/1580537624-10179-1-git-send-email-peter.chen@nxp.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 include/uapi/linux/usb/charger.h | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/usb/charger.h b/include/uapi/linux/usb/charger.h
index 5f72af35b3ed..ad22079125bf 100644
--- a/include/uapi/linux/usb/charger.h
+++ b/include/uapi/linux/usb/charger.h
@@ -14,18 +14,18 @@
  * ACA (Accessory Charger Adapters)
  */
 enum usb_charger_type {
-	UNKNOWN_TYPE,
-	SDP_TYPE,
-	DCP_TYPE,
-	CDP_TYPE,
-	ACA_TYPE,
+	UNKNOWN_TYPE = 0,
+	SDP_TYPE = 1,
+	DCP_TYPE = 2,
+	CDP_TYPE = 3,
+	ACA_TYPE = 4,
 };
 
 /* USB charger state */
 enum usb_charger_state {
-	USB_CHARGER_DEFAULT,
-	USB_CHARGER_PRESENT,
-	USB_CHARGER_ABSENT,
+	USB_CHARGER_DEFAULT = 0,
+	USB_CHARGER_PRESENT = 1,
+	USB_CHARGER_ABSENT = 2,
 };
 
 #endif /* _UAPI__LINUX_USB_CHARGER_H */
-- 
cgit v1.2.3


From 6a757c07e51f80ac34325fcd558490d2d1439e1b Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw@strlen.de>
Date: Mon, 3 Feb 2020 17:37:07 +0100
Subject: netfilter: conntrack: allow insertion of clashing entries

This patch further relaxes the need to drop an skb due to a clash with
an existing conntrack entry.

Current clash resolution handles the case where the clash occurs between
two identical entries (distinct nf_conn objects with same tuples), i.e.:

                    Original                        Reply
existing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.6:5353
clashing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.6:5353

... existing handling will discard the unconfirmed clashing entry and
make skb->_nfct point to the existing one.  The skb can then be
processed normally, just as if the clash had not existed in the first
place.

For other clashes, the skb needs to be dropped.
This frequently happens with DNS resolvers that send A and AAAA queries
back-to-back when NAT rules are present that cause packets to get
different DNAT transformations applied, for example:

-m statistic --mode random ... -j DNAT --to-destination 10.0.0.6:5353
-m statistic --mode random ... -j DNAT --to-destination 10.0.0.7:5353

In this case the A or AAAA query is dropped which incurs a costly
delay during name resolution.

This patch also allows this collision type:
                       Original                   Reply
existing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.6:5353
clashing: 10.2.3.4:42 -> 10.8.8.8:53      10.2.3.4:42 <- 10.0.0.7:5353

In this case, the clash is in the original direction -- the reply
direction is still unique.

With this change, when the 2nd colliding packet is received, the
clashing conntrack is tagged with the new IPS_NAT_CLASH_BIT, gets a fixed
1 second timeout and is inserted in the reply direction only.

The entry is hidden from 'conntrack -L', it will time out quickly
and it can be early dropped because it will never progress to the
ASSURED state.
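
The three outcomes described above can be sketched in plain C. This is a
simplified illustration, not the code in nf_conntrack_core.c: `struct tuple`,
`struct conn`, `enum clash_action`, `classify_clash()` and the `example_*`
helpers are all hypothetical names, and the sketch ignores details such as
which protocols permit clash resolution.

```c
#include <stdint.h>

/* Hypothetical, simplified stand-ins; the kernel's nf_conn is far richer. */
struct tuple {
	uint32_t src_ip, dst_ip;
	uint16_t src_port, dst_port;
};

struct conn {
	struct tuple orig;	/* packet as first seen */
	struct tuple reply;	/* expected reply, after NAT */
};

enum clash_action {
	/* Identical entries: discard the new one, reuse the existing one. */
	CLASH_USE_EXISTING,
	/* Case added by this patch: insert in the reply direction only. */
	CLASH_INSERT_REPLY_ONLY,
	/* Any other clash: the skb is dropped. */
	CLASH_DROP,
};

#define IP4(a, b, c, d) \
	(((uint32_t)(a) << 24) | ((uint32_t)(b) << 16) | \
	 ((uint32_t)(c) << 8) | (uint32_t)(d))

static int tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
	       a->src_port == b->src_port && a->dst_port == b->dst_port;
}

/* Decide what happens to an unconfirmed entry that clashes with an existing one. */
static enum clash_action classify_clash(const struct conn *existing,
					const struct conn *clashing)
{
	int same_orig = tuple_equal(&existing->orig, &clashing->orig);
	int same_reply = tuple_equal(&existing->reply, &clashing->reply);

	if (same_orig && same_reply)
		return CLASH_USE_EXISTING;
	if (same_orig)
		return CLASH_INSERT_REPLY_ONLY;
	return CLASH_DROP;
}

/* Build the example from the text: 10.2.3.4:42 -> 10.8.8.8:53, DNAT to 10.0.0.<backend>:5353. */
static struct conn example_conn(uint8_t backend)
{
	struct conn ct = {
		.orig  = { IP4(10, 2, 3, 4), IP4(10, 8, 8, 8), 42, 53 },
		.reply = { IP4(10, 0, 0, backend), IP4(10, 2, 3, 4), 5353, 42 },
	};

	return ct;
}

static enum clash_action example_clash(uint8_t existing_backend,
				       uint8_t clashing_backend)
{
	struct conn existing = example_conn(existing_backend);
	struct conn clashing = example_conn(clashing_backend);

	return classify_clash(&existing, &clashing);
}
```

Under these assumptions, `example_clash(6, 6)` models the identical-entry
case and `example_clash(6, 7)` models the newly allowed case, where only the
reply tuples differ.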

To avoid special-casing the ORIGINAL hlist_nulls node in the delete code
path, a new helper, "hlist_nulls_add_fake", is added so that
hlist_nulls_del() will work.
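
Why a self-referencing node is enough can be seen in a small userspace
sketch. The types and the unlink routine below are simplified stand-ins
modelled on the kernel's hlist_nulls code (no RCU, no WRITE_ONCE), and
`hlist_nulls_unlink()` and `fake_node_unlink_is_safe()` are hypothetical
names used only for this illustration. `hlist_nulls_add_fake()` has the same
body as the helper added by this patch:

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-in for the kernel's hlist_nulls node. */
struct hlist_nulls_node {
	struct hlist_nulls_node *next, **pprev;
};

/* A "nulls" end marker is an odd pointer-sized value (low bit set). */
#define NULLS_MARKER(value) ((uintptr_t)1 | ((uintptr_t)(value) << 1))

static int is_a_nulls(const struct hlist_nulls_node *ptr)
{
	return (int)((uintptr_t)ptr & 1);
}

/* Same body as the helper added by the patch. */
static void hlist_nulls_add_fake(struct hlist_nulls_node *n)
{
	n->pprev = &n->next;
	n->next = (struct hlist_nulls_node *)NULLS_MARKER(NULL);
}

/* Simplified unlink step, modelled on the kernel's __hlist_nulls_del(). */
static void hlist_nulls_unlink(struct hlist_nulls_node *n)
{
	struct hlist_nulls_node *next = n->next;
	struct hlist_nulls_node **pprev = n->pprev;

	*pprev = next;		/* for a fake node this rewrites n->next itself */
	if (!is_a_nulls(next))
		next->pprev = pprev;
}

/* Returns 1 if unlinking a fake node only ever touches the node itself. */
static int fake_node_unlink_is_safe(void)
{
	struct hlist_nulls_node n;

	hlist_nulls_add_fake(&n);
	if (n.pprev != &n.next || !is_a_nulls(n.next))
		return 0;

	hlist_nulls_unlink(&n);
	return is_a_nulls(n.next) && n.pprev == &n.next;
}
```

Because `pprev` points at the node's own `next` field and `next` is a nulls
marker, the unlink writes only into the node being removed, so the delete
path needs no extra branch for an entry that was never hashed in the
original direction.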

Example:

      CPU A:                               CPU B:
1.  10.2.3.4:42 -> 10.8.8.8:53 (A)
2.                                         10.2.3.4:42 -> 10.8.8.8:53 (AAAA)
3.  Apply DNAT, reply changed to 10.0.0.6
4.                                         10.2.3.4:42 -> 10.8.8.8:53 (AAAA)
5.                                         Apply DNAT, reply changed to 10.0.0.7
6. confirm/commit to conntrack table, no collisions
7.                                         commit clashing entry

Reply comes in:

10.2.3.4:42 <- 10.0.0.6:5353 (A)
 -> Finds a conntrack, DNAT is reversed & packet forwarded to 10.2.3.4:42
10.2.3.4:42 <- 10.0.0.7:5353 (AAAA)
 -> Finds a conntrack, DNAT is reversed & packet forwarded to 10.2.3.4:42
    The conntrack entry is deleted from table, as it has the NAT_CLASH
    bit set.

In case of a retransmit from ORIGINAL dir, all further packets will get
the DNAT transformation to 10.0.0.6.

I tried to come up with other solutions but they all have worse
problems.

Alternatives considered were:
1.  Confirm ct entries at allocation time, not in postrouting.
 a. will cause unnecessary work when the skb that creates the
    conntrack is dropped by ruleset.
 b. in case nat is applied, ct entry would need to be moved in
    the table, which requires another spinlock pair to be taken.
 c. breaks the 'unconfirmed entry is private to cpu' assumption:
    we would need to guard all nfct->ext allocation requests with
    ct->lock spinlock.

2. Make the unconfirmed list a hash table instead of a pcpu list.
   Shares drawback c) of the first alternative.

3. Document this is expected and force users to rearrange their
   ruleset (e.g. by using "-m cluster" instead of "-m statistic").
   nft has the 'jhash' expression which can be used instead of 'numgen'.

   Major drawback: doesn't fix what I consider a bug, not very realistic
   and I believe it's reasonable to expect existing rulesets to 'just
   work'.

4. Document this is expected and force users to steer problematic
   packets to the same CPU -- this would serialize the "allocate new
   conntrack entry/nat table evaluation/perform nat/confirm entry", so
   no race can occur.  Similar drawback to 3.

Another advantage of this patch compared to 1) and 2) is that there are
no changes to the hot path; things are handled in the udp tracker and
the clash resolution path.
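
The behaviour described above can be sketched as a userspace toy model.
This is not the kernel code: struct toy_ct, toy_lookup(), toy_confirm()
and toy_reply() are hypothetical names, tuples are plain strings, and a
collision in the REPLY direction is simply dropped here instead of going
through __nf_ct_resolve_clash(). The sketch only shows that a clashing
entry is reachable via its REPLY tuple, stays invisible to ORIGINAL
direction lookups, and goes away on the first reply.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_ENTRIES 8

enum { TOY_DROP = -1, TOY_ACCEPT = 0, TOY_ACCEPT_CLASH = 1 };

/* Toy conntrack entry (hypothetical, not struct nf_conn). */
struct toy_ct {
	const char *orig;	/* tuple in ORIGINAL direction */
	const char *reply;	/* tuple in REPLY direction, after DNAT */
	bool in_orig_table;	/* false models the "fake add" for ORIGINAL */
	bool nat_clash;		/* models IPS_NAT_CLASH */
	bool used;
};

static struct toy_ct table[MAX_ENTRIES];

/* Look up a tuple in one direction; fake-added entries are skipped. */
static struct toy_ct *toy_lookup(const char *tuple, bool reply_dir)
{
	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		struct toy_ct *ct = &table[i];

		if (!ct->used)
			continue;
		if (reply_dir && strcmp(ct->reply, tuple) == 0)
			return ct;
		if (!reply_dir && ct->in_orig_table &&
		    strcmp(ct->orig, tuple) == 0)
			return ct;
	}
	return NULL;
}

/* Confirm a new entry; a clash in ORIGINAL direction alone is tolerated. */
static int toy_confirm(const char *orig, const char *reply)
{
	bool clash = toy_lookup(orig, false) != NULL;

	if (toy_lookup(reply, true))
		return TOY_DROP;	/* simplification of the real path */

	for (size_t i = 0; i < MAX_ENTRIES; i++) {
		struct toy_ct *ct = &table[i];

		if (ct->used)
			continue;
		ct->orig = orig;
		ct->reply = reply;
		ct->in_orig_table = !clash;
		ct->nat_clash = clash;
		ct->used = true;
		return clash ? TOY_ACCEPT_CLASH : TOY_ACCEPT;
	}
	return TOY_DROP;		/* table full */
}

/* Handle a reply: returns the ORIGINAL tuple to un-NAT to, or NULL. */
static const char *toy_reply(const char *tuple)
{
	struct toy_ct *ct = toy_lookup(tuple, true);

	if (!ct)
		return NULL;
	if (ct->nat_clash)
		ct->used = false;	/* removed on first reply */
	return ct->orig;
}

/* Replays the A/AAAA scenario from the diagram above. */
static void toy_demo(void)
{
	const char *query = "10.2.3.4:42->10.8.8.8:53";
	const char *r6 = "10.0.0.6:5353->10.2.3.4:42";
	const char *r7 = "10.0.0.7:5353->10.2.3.4:42";

	memset(table, 0, sizeof(table));
	assert(toy_confirm(query, r6) == TOY_ACCEPT);
	assert(toy_confirm(query, r7) == TOY_ACCEPT_CLASH);
	/* Retransmits in ORIGINAL direction only ever see the winner. */
	assert(strcmp(toy_lookup(query, false)->reply, r6) == 0);
	/* The clash entry handles exactly one reply. */
	assert(toy_reply(r7) != NULL);
	assert(toy_reply(r7) == NULL);
	/* The winner keeps handling replies. */
	assert(toy_reply(r6) != NULL);
	assert(toy_reply(r6) != NULL);
}
```

Calling toy_demo() replays steps 1-7 and both replies; the linear scan
merely stands in for the two hash chains used by the real table.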

Cc: rcu@vger.kernel.org
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Jozsef Kadlecsik <kadlec@netfilter.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/linux/rculist_nulls.h                      |  7 ++
 include/uapi/linux/netfilter/nf_conntrack_common.h | 12 +++-
 net/netfilter/nf_conntrack_core.c                  | 76 +++++++++++++++++++++-
 net/netfilter/nf_conntrack_proto_udp.c             | 20 ++++--
 4 files changed, 108 insertions(+), 7 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/linux/rculist_nulls.h b/include/linux/rculist_nulls.h
index e5b752027a03..9670b54b484a 100644
--- a/include/linux/rculist_nulls.h
+++ b/include/linux/rculist_nulls.h
@@ -145,6 +145,13 @@ static inline void hlist_nulls_add_tail_rcu(struct hlist_nulls_node *n,
 	}
 }
 
+/* after that hlist_nulls_del will work */
+static inline void hlist_nulls_add_fake(struct hlist_nulls_node *n)
+{
+	n->pprev = &n->next;
+	n->next = (struct hlist_nulls_node *)NULLS_MARKER(NULL);
+}
+
 /**
  * hlist_nulls_for_each_entry_rcu - iterate over rcu list of given type
  * @tpos:	the type * to use as a loop cursor.
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 336014bf8868..b6f0bb1dc799 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -97,6 +97,15 @@ enum ip_conntrack_status {
 	IPS_UNTRACKED_BIT = 12,
 	IPS_UNTRACKED = (1 << IPS_UNTRACKED_BIT),
 
+#ifdef __KERNEL__
+	/* Re-purposed for in-kernel use:
+	 * Tags a conntrack entry that clashed with an existing entry
+	 * on insert.
+	 */
+	IPS_NAT_CLASH_BIT = IPS_UNTRACKED_BIT,
+	IPS_NAT_CLASH = IPS_UNTRACKED,
+#endif
+
 	/* Conntrack got a helper explicitly attached via CT target. */
 	IPS_HELPER_BIT = 13,
 	IPS_HELPER = (1 << IPS_HELPER_BIT),
@@ -110,7 +119,8 @@ enum ip_conntrack_status {
 	 */
 	IPS_UNCHANGEABLE_MASK = (IPS_NAT_DONE_MASK | IPS_NAT_MASK |
 				 IPS_EXPECTED | IPS_CONFIRMED | IPS_DYING |
-				 IPS_SEQ_ADJUST | IPS_TEMPLATE | IPS_OFFLOAD),
+				 IPS_SEQ_ADJUST | IPS_TEMPLATE | IPS_UNTRACKED |
+				 IPS_OFFLOAD),
 
 	__IPS_MAX_BIT = 15,
 };
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 3f069eb0f0fc..1927fc296f95 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -940,11 +940,71 @@ static int __nf_ct_resolve_clash(struct sk_buff *skb,
 	return NF_DROP;
 }
 
+/**
+ * nf_ct_resolve_clash_harder - attempt to insert clashing conntrack entry
+ *
+ * @skb: skb that causes the collision
+ * @repl_idx: hash slot for reply direction
+ *
+ * Called when origin or reply direction had a clash.
+ * The skb can be handled without packet drop provided the reply direction
+ * is unique or there the existing entry has the identical tuple in both
+ * directions.
+ *
+ * Caller must hold conntrack table locks to prevent concurrent updates.
+ *
+ * Returns NF_DROP if the clash could not be handled.
+ */
+static int nf_ct_resolve_clash_harder(struct sk_buff *skb, u32 repl_idx)
+{
+	struct nf_conn *loser_ct = (struct nf_conn *)skb_nfct(skb);
+	const struct nf_conntrack_zone *zone;
+	struct nf_conntrack_tuple_hash *h;
+	struct hlist_nulls_node *n;
+	struct net *net;
+
+	zone = nf_ct_zone(loser_ct);
+	net = nf_ct_net(loser_ct);
+
+	/* Reply direction must never result in a clash, unless both origin
+	 * and reply tuples are identical.
+	 */
+	hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[repl_idx], hnnode) {
+		if (nf_ct_key_equal(h,
+				    &loser_ct->tuplehash[IP_CT_DIR_REPLY].tuple,
+				    zone, net))
+			return __nf_ct_resolve_clash(skb, h);
+	}
+
+	/* We want the clashing entry to go away real soon: 1 second timeout. */
+	loser_ct->timeout = nfct_time_stamp + HZ;
+
+	/* IPS_NAT_CLASH removes the entry automatically on the first
+	 * reply.  Also prevents UDP tracker from moving the entry to
+	 * ASSURED state, i.e. the entry can always be evicted under
+	 * pressure.
+	 */
+	loser_ct->status |= IPS_FIXED_TIMEOUT | IPS_NAT_CLASH;
+
+	__nf_conntrack_insert_prepare(loser_ct);
+
+	/* fake add for ORIGINAL dir: we want lookups to only find the entry
+	 * already in the table.  This also hides the clashing entry from
+	 * ctnetlink iteration, i.e. conntrack -L won't show them.
+	 */
+	hlist_nulls_add_fake(&loser_ct->tuplehash[IP_CT_DIR_ORIGINAL].hnnode);
+
+	hlist_nulls_add_head_rcu(&loser_ct->tuplehash[IP_CT_DIR_REPLY].hnnode,
+				 &nf_conntrack_hash[repl_idx]);
+	return NF_ACCEPT;
+}
+
 /**
  * nf_ct_resolve_clash - attempt to handle clash without packet drop
  *
  * @skb: skb that causes the clash
  * @h: tuplehash of the clashing entry already in table
+ * @hash_reply: hash slot for reply direction
  *
  * A conntrack entry can be inserted to the connection tracking table
  * if there is no existing entry with an identical tuple.
@@ -963,10 +1023,18 @@ static int __nf_ct_resolve_clash(struct sk_buff *skb,
  * exactly the same, only the to-be-confirmed conntrack entry is discarded
  * and @skb is associated with the conntrack entry already in the table.
  *
+ * Failing that, the new, unconfirmed conntrack is still added to the table
+ * provided that the collision only occurs in the ORIGINAL direction.
+ * The new entry will be added after the existing one in the hash list,
+ * so packets in the ORIGINAL direction will continue to match the existing
+ * entry.  The new entry will also have a fixed timeout so it expires --
+ * due to the collision, it will not see bidirectional traffic.
+ *
  * Returns NF_DROP if the clash could not be resolved.
  */
 static __cold noinline int
-nf_ct_resolve_clash(struct sk_buff *skb, struct nf_conntrack_tuple_hash *h)
+nf_ct_resolve_clash(struct sk_buff *skb, struct nf_conntrack_tuple_hash *h,
+		    u32 reply_hash)
 {
 	/* This is the conntrack entry already in hashes that won race. */
 	struct nf_conn *ct = nf_ct_tuplehash_to_ctrack(h);
@@ -987,6 +1055,10 @@ nf_ct_resolve_clash(struct sk_buff *skb, struct nf_conntrack_tuple_hash *h)
 	if (ret == NF_ACCEPT)
 		return ret;
 
+	ret = nf_ct_resolve_clash_harder(skb, reply_hash);
+	if (ret == NF_ACCEPT)
+		return ret;
+
 drop:
 	nf_ct_add_to_dying_list(loser_ct);
 	NF_CT_STAT_INC(net, drop);
@@ -1101,7 +1173,7 @@ __nf_conntrack_confirm(struct sk_buff *skb)
 	return NF_ACCEPT;
 
 out:
-	ret = nf_ct_resolve_clash(skb, h);
+	ret = nf_ct_resolve_clash(skb, h, reply_hash);
 dying:
 	nf_conntrack_double_unlock(hash, reply_hash);
 	local_bh_enable();
diff --git a/net/netfilter/nf_conntrack_proto_udp.c b/net/netfilter/nf_conntrack_proto_udp.c
index 7365b43f8f98..760ca2422816 100644
--- a/net/netfilter/nf_conntrack_proto_udp.c
+++ b/net/netfilter/nf_conntrack_proto_udp.c
@@ -81,6 +81,18 @@ static bool udp_error(struct sk_buff *skb,
 	return false;
 }
 
+static void nf_conntrack_udp_refresh_unreplied(struct nf_conn *ct,
+					       struct sk_buff *skb,
+					       enum ip_conntrack_info ctinfo,
+					       u32 extra_jiffies)
+{
+	if (unlikely(ctinfo == IP_CT_ESTABLISHED_REPLY &&
+		     ct->status & IPS_NAT_CLASH))
+		nf_ct_kill(ct);
+	else
+		nf_ct_refresh_acct(ct, ctinfo, skb, extra_jiffies);
+}
+
 /* Returns verdict for packet, and may modify conntracktype */
 int nf_conntrack_udp_packet(struct nf_conn *ct,
 			    struct sk_buff *skb,
@@ -116,8 +128,8 @@ int nf_conntrack_udp_packet(struct nf_conn *ct,
 		if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
 			nf_conntrack_event_cache(IPCT_ASSURED, ct);
 	} else {
-		nf_ct_refresh_acct(ct, ctinfo, skb,
-				   timeouts[UDP_CT_UNREPLIED]);
+		nf_conntrack_udp_refresh_unreplied(ct, skb, ctinfo,
+						   timeouts[UDP_CT_UNREPLIED]);
 	}
 	return NF_ACCEPT;
 }
@@ -198,8 +210,8 @@ int nf_conntrack_udplite_packet(struct nf_conn *ct,
 		if (!test_and_set_bit(IPS_ASSURED_BIT, &ct->status))
 			nf_conntrack_event_cache(IPCT_ASSURED, ct);
 	} else {
-		nf_ct_refresh_acct(ct, ctinfo, skb,
-				   timeouts[UDP_CT_UNREPLIED]);
+		nf_conntrack_udp_refresh_unreplied(ct, skb, ctinfo,
+						   timeouts[UDP_CT_UNREPLIED]);
 	}
 	return NF_ACCEPT;
 }
-- 
cgit v1.2.3


From f25975f42f2f8f2a01303054d6a70c7ceb1fcf54 Mon Sep 17 00:00:00 2001
From: Toke Høiland-Jørgensen <toke@redhat.com>
Date: Tue, 18 Feb 2020 14:03:34 +0100
Subject: bpf, uapi: Remove text about bpf_redirect_map() giving higher
 performance
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The performance of bpf_redirect() is now roughly the same as that of
bpf_redirect_map(). However, David Ahern pointed out that the header file
has not been updated to reflect this, and still says that a significant
performance increase is possible when using bpf_redirect_map(). Remove this
text from the bpf_redirect_map() description, and reword the description in
bpf_redirect() slightly. Also fix the 'Return' section of the
bpf_redirect_map() documentation.
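
The corrected 'Return' wording can be illustrated with a small userspace
model. This is only a sketch, not the kernel helper: struct toy_devmap
and toy_redirect_map() are hypothetical, the map is a fixed array in
which 0 marks an empty slot, and the TOY_XDP_* values are assumed to
mirror enum xdp_action from include/uapi/linux/bpf.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror enum xdp_action in the uapi header. */
enum toy_xdp_action {
	TOY_XDP_ABORTED = 0,
	TOY_XDP_DROP,
	TOY_XDP_PASS,
	TOY_XDP_TX,
	TOY_XDP_REDIRECT,
};

#define TOY_MAP_SIZE 4

/* Hypothetical stand-in for a device map: slot -> ifindex, 0 = empty. */
struct toy_devmap {
	uint32_t ifindex[TOY_MAP_SIZE];
};

/*
 * Model of the documented return value: XDP_REDIRECT when the key
 * resolves to a target, otherwise the two lower bits of flags.
 */
static int toy_redirect_map(const struct toy_devmap *map, uint32_t key,
			    uint64_t flags)
{
	if (key < TOY_MAP_SIZE && map->ifindex[key] != 0)
		return TOY_XDP_REDIRECT;

	return (int)(flags & 0x3);
}

/* Usage: pick the fallback action via flags. */
static void toy_redirect_demo(void)
{
	struct toy_devmap map = { .ifindex = { 0, 7, 0, 0 } };

	assert(toy_redirect_map(&map, 1, 0) == TOY_XDP_REDIRECT);
	/* Missing entry: caller asked to fall back to XDP_PASS. */
	assert(toy_redirect_map(&map, 2, TOY_XDP_PASS) == TOY_XDP_PASS);
	/* Missing entry with flags == 0 yields XDP_ABORTED. */
	assert(toy_redirect_map(&map, 2, 0) == TOY_XDP_ABORTED);
}
```

In this model a program can pass XDP_PASS in flags to hand the packet to
the regular stack when the lookup fails, rather than aborting.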

Fixes: 1d233886dd90 ("xdp: Use bulking for non-map XDP_REDIRECT and consolidate code paths")
Reported-by: David Ahern <dsahern@gmail.com>
Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20200218130334.29889-1-toke@redhat.com
---
 include/uapi/linux/bpf.h       | 16 +++++++---------
 tools/include/uapi/linux/bpf.h | 16 +++++++---------
 2 files changed, 14 insertions(+), 18 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index f1d74a2bd234..22f235260a3a 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1045,9 +1045,9 @@ union bpf_attr {
  * 		supports redirection to the egress interface, and accepts no
  * 		flag at all.
  *
- * 		The same effect can be attained with the more generic
- * 		**bpf_redirect_map**\ (), which requires specific maps to be
- * 		used but offers better performance.
+ * 		The same effect can also be attained with the more generic
+ * 		**bpf_redirect_map**\ (), which uses a BPF map to store the
+ * 		redirect target instead of providing it directly to the helper.
  * 	Return
  * 		For XDP, the helper returns **XDP_REDIRECT** on success or
  * 		**XDP_ABORTED** on error. For other program types, the values
@@ -1611,13 +1611,11 @@ union bpf_attr {
  * 		the caller. Any higher bits in the *flags* argument must be
  * 		unset.
  *
- * 		When used to redirect packets to net devices, this helper
- * 		provides a high performance increase over **bpf_redirect**\ ().
- * 		This is due to various implementation details of the underlying
- * 		mechanisms, one of which is the fact that **bpf_redirect_map**\
- * 		() tries to send packet as a "bulk" to the device.
+ * 		See also bpf_redirect(), which only supports redirecting to an
+ * 		ifindex, but doesn't require a map to do so.
  * 	Return
- * 		**XDP_REDIRECT** on success, or **XDP_ABORTED** on error.
+ * 		**XDP_REDIRECT** on success, or the value of the two lower bits
+ * 		of the **flags* argument on error.
  *
  * int bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32 key, u64 flags)
  * 	Description
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f1d74a2bd234..22f235260a3a 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1045,9 +1045,9 @@ union bpf_attr {
  * 		supports redirection to the egress interface, and accepts no
  * 		flag at all.
  *
- * 		The same effect can be attained with the more generic
- * 		**bpf_redirect_map**\ (), which requires specific maps to be
- * 		used but offers better performance.
+ * 		The same effect can also be attained with the more generic
+ * 		**bpf_redirect_map**\ (), which uses a BPF map to store the
+ * 		redirect target instead of providing it directly to the helper.
  * 	Return
  * 		For XDP, the helper returns **XDP_REDIRECT** on success or
  * 		**XDP_ABORTED** on error. For other program types, the values
@@ -1611,13 +1611,11 @@ union bpf_attr {
  * 		the caller. Any higher bits in the *flags* argument must be
  * 		unset.
  *
- * 		When used to redirect packets to net devices, this helper
- * 		provides a high performance increase over **bpf_redirect**\ ().
- * 		This is due to various implementation details of the underlying
- * 		mechanisms, one of which is the fact that **bpf_redirect_map**\
- * 		() tries to send packet as a "bulk" to the device.
+ * 		See also bpf_redirect(), which only supports redirecting to an
+ * 		ifindex, but doesn't require a map to do so.
  * 	Return
- * 		**XDP_REDIRECT** on success, or **XDP_ABORTED** on error.
+ * 		**XDP_REDIRECT** on success, or the value of the two lower bits
+ * 		of the **flags* argument on error.
  *
  * int bpf_sk_redirect_map(struct sk_buff *skb, struct bpf_map *map, u32 key, u64 flags)
  * 	Description
-- 
cgit v1.2.3


From c766d1472c70d25ad475cf56042af1652e792b23 Mon Sep 17 00:00:00 2001
From: Arnd Bergmann <arnd@arndb.de>
Date: Thu, 20 Feb 2020 20:03:57 -0800
Subject: y2038: hide timeval/timespec/itimerval/itimerspec types

There are no in-kernel users remaining, but there may still be users that
include linux/time.h instead of sys/time.h from user space, so leave the
types available to user space while hiding them from kernel space.

Only the __kernel_old_* versions of these types remain now.

Link: http://lkml.kernel.org/r/20200110154232.4104492-4-arnd@arndb.de
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/uapi/asm-generic/posix_types.h |  2 ++
 include/uapi/linux/time.h              | 22 ++++++++++++----------
 2 files changed, 14 insertions(+), 10 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/asm-generic/posix_types.h b/include/uapi/asm-generic/posix_types.h
index 2f9c80595ba7..b5f7594eee7a 100644
--- a/include/uapi/asm-generic/posix_types.h
+++ b/include/uapi/asm-generic/posix_types.h
@@ -87,7 +87,9 @@ typedef struct {
 typedef __kernel_long_t	__kernel_off_t;
 typedef long long	__kernel_loff_t;
 typedef __kernel_long_t	__kernel_old_time_t;
+#ifndef __KERNEL__
 typedef __kernel_long_t	__kernel_time_t;
+#endif
 typedef long long __kernel_time64_t;
 typedef __kernel_long_t	__kernel_clock_t;
 typedef int		__kernel_timer_t;
diff --git a/include/uapi/linux/time.h b/include/uapi/linux/time.h
index a655aa28dc6e..4f4b6e48e01c 100644
--- a/include/uapi/linux/time.h
+++ b/include/uapi/linux/time.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <linux/time_types.h>
 
+#ifndef __KERNEL__
 #ifndef _STRUCT_TIMESPEC
 #define _STRUCT_TIMESPEC
 struct timespec {
@@ -18,6 +19,17 @@ struct timeval {
 	__kernel_suseconds_t	tv_usec;	/* microseconds */
 };
 
+struct itimerspec {
+	struct timespec it_interval;/* timer period */
+	struct timespec it_value;	/* timer expiration */
+};
+
+struct itimerval {
+	struct timeval it_interval;/* timer interval */
+	struct timeval it_value;	/* current value */
+};
+#endif
+
 struct timezone {
 	int	tz_minuteswest;	/* minutes west of Greenwich */
 	int	tz_dsttime;	/* type of dst correction */
@@ -31,16 +43,6 @@ struct timezone {
 #define	ITIMER_VIRTUAL		1
 #define	ITIMER_PROF		2
 
-struct itimerspec {
-	struct timespec it_interval;	/* timer period */
-	struct timespec it_value;	/* timer expiration */
-};
-
-struct itimerval {
-	struct timeval it_interval;	/* timer interval */
-	struct timeval it_value;	/* current value */
-};
-
 /*
  * The IDs of the various system clocks (for POSIX.1b interval timers):
  */
-- 
cgit v1.2.3


From 467d12f5c7842896d2de3ced74e4147ee29e97c8 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger@de.ibm.com>
Date: Thu, 20 Feb 2020 20:04:03 -0800
Subject: include/uapi/linux/swab.h: fix userspace breakage, use
 __BITS_PER_LONG for swap

QEMU has a funny new build error message when I use the upstream kernel
headers:

      CC      block/file-posix.o
    In file included from /home/cborntra/REPOS/qemu/include/qemu/timer.h:4,
                     from /home/cborntra/REPOS/qemu/include/qemu/timed-average.h:29,
                     from /home/cborntra/REPOS/qemu/include/block/accounting.h:28,
                     from /home/cborntra/REPOS/qemu/include/block/block_int.h:27,
                     from /home/cborntra/REPOS/qemu/block/file-posix.c:30:
    /usr/include/linux/swab.h: In function `__swab':
    /home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:34: error: "sizeof" is not defined, evaluates to 0 [-Werror=undef]
       20 | #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
          |                                  ^~~~~~
    /home/cborntra/REPOS/qemu/include/qemu/bitops.h:20:41: error: missing binary operator before token "("
       20 | #define BITS_PER_LONG           (sizeof (unsigned long) * BITS_PER_BYTE)
          |                                         ^
    cc1: all warnings being treated as errors
    make: *** [/home/cborntra/REPOS/qemu/rules.mak:69: block/file-posix.o] Error 1
    rm tests/qemu-iotests/socket_scm_helper.o

This was triggered by commit d5767057c9a ("uapi: rename ext2_swab() to
swab() and share globally in swab.h").  That patch is doing

  #include <asm/bitsperlong.h>

but it uses BITS_PER_LONG.

The kernel file asm/bitsperlong.h provides only __BITS_PER_LONG.

Let us use the __ variant in swab.h.
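
The failure mode can be reproduced outside the kernel with a minimal
sketch. The preprocessor cannot evaluate sizeof, so a word-size macro
used in #if must expand to an integer constant. TOY_BITS_PER_LONG and
the toy_swab*() helpers below are hypothetical; the macro is derived
from ULONG_MAX only so that the example builds without kernel headers,
and it stands in for __BITS_PER_LONG.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/*
 * Stand-in for __BITS_PER_LONG: a plain integer, usable in #if.
 * Something like (sizeof(unsigned long) * 8) would not work there,
 * which is what the QEMU build error above shows.
 */
#if ULONG_MAX > 0xffffffffUL
#define TOY_BITS_PER_LONG 64
#else
#define TOY_BITS_PER_LONG 32
#endif

/* Byte-swap a 32-bit value. */
static inline uint32_t toy_swab32(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0x0000ff00U) |
	       ((x << 8) & 0x00ff0000U) | (x << 24);
}

/* Byte-swap a 64-bit value by swapping and exchanging both halves. */
static inline uint64_t toy_swab64(uint64_t x)
{
	return ((uint64_t)toy_swab32((uint32_t)x) << 32) |
	       toy_swab32((uint32_t)(x >> 32));
}

/* Byte-swap an unsigned long, selected at preprocessing time. */
static inline unsigned long toy_swab(unsigned long y)
{
#if TOY_BITS_PER_LONG == 64
	return (unsigned long)toy_swab64(y);
#else /* TOY_BITS_PER_LONG == 32 */
	return (unsigned long)toy_swab32((uint32_t)y);
#endif
}
```

Because an undefined identifier in #if silently evaluates to 0, the
original BITS_PER_LONG test would pick the 32-bit branch in user space
whenever no other header happened to define the macro.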

Link: http://lkml.kernel.org/r/20200213142147.17604-1-borntraeger@de.ibm.com
Fixes: d5767057c9a ("uapi: rename ext2_swab() to swab() and share globally in swab.h")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Cc: Allison Randal <allison@lohutok.net>
Cc: Joe Perches <joe@perches.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: William Breathitt Gray <vilhelm.gray@gmail.com>
Cc: Torsten Hilbrich <torsten.hilbrich@secunet.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
---
 include/uapi/linux/swab.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/swab.h b/include/uapi/linux/swab.h
index fa7f97da5b76..7272f85d6d6a 100644
--- a/include/uapi/linux/swab.h
+++ b/include/uapi/linux/swab.h
@@ -135,9 +135,9 @@ static inline __attribute_const__ __u32 __fswahb32(__u32 val)
 
 static __always_inline unsigned long __swab(const unsigned long y)
 {
-#if BITS_PER_LONG == 64
+#if __BITS_PER_LONG == 64
 	return __swab64(y);
-#else /* BITS_PER_LONG == 32 */
+#else /* __BITS_PER_LONG == 32 */
 	return __swab32(y);
 #endif
 }
-- 
cgit v1.2.3


From 636be4241bdd88fec273b38723e44bad4e1c4fae Mon Sep 17 00:00:00 2001
From: Mike Snitzer <snitzer@redhat.com>
Date: Thu, 27 Feb 2020 14:25:31 -0500
Subject: dm: bump version of core and various targets

Changes made during the 5.6 cycle warrant bumping the version number
for DM core and the targets modified by this commit.

It should be noted that dm-thin, dm-crypt and dm-raid already had
their target version bumped during the 5.6 merge window.

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
---
 drivers/md/dm-cache-target.c  | 2 +-
 drivers/md/dm-integrity.c     | 2 +-
 drivers/md/dm-mpath.c         | 2 +-
 drivers/md/dm-verity-target.c | 2 +-
 drivers/md/dm-writecache.c    | 2 +-
 drivers/md/dm-zoned-target.c  | 2 +-
 include/uapi/linux/dm-ioctl.h | 4 ++--
 7 files changed, 8 insertions(+), 8 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/drivers/md/dm-cache-target.c b/drivers/md/dm-cache-target.c
index f4be63671233..d3bb355819a4 100644
--- a/drivers/md/dm-cache-target.c
+++ b/drivers/md/dm-cache-target.c
@@ -3492,7 +3492,7 @@ static void cache_io_hints(struct dm_target *ti, struct queue_limits *limits)
 
 static struct target_type cache_target = {
 	.name = "cache",
-	.version = {2, 1, 0},
+	.version = {2, 2, 0},
 	.module = THIS_MODULE,
 	.ctr = cache_ctr,
 	.dtr = cache_dtr,
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index a82a9c257744..2f03fecd312d 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -4202,7 +4202,7 @@ static void dm_integrity_dtr(struct dm_target *ti)
 
 static struct target_type integrity_target = {
 	.name			= "integrity",
-	.version		= {1, 4, 0},
+	.version		= {1, 5, 0},
 	.module			= THIS_MODULE,
 	.features		= DM_TARGET_SINGLETON | DM_TARGET_INTEGRITY,
 	.ctr			= dm_integrity_ctr,
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index 2bc18c9c3abc..58fd137b6ae1 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -2053,7 +2053,7 @@ static int multipath_busy(struct dm_target *ti)
  *---------------------------------------------------------------*/
 static struct target_type multipath_target = {
 	.name = "multipath",
-	.version = {1, 13, 0},
+	.version = {1, 14, 0},
 	.features = DM_TARGET_SINGLETON | DM_TARGET_IMMUTABLE |
 		    DM_TARGET_PASSES_INTEGRITY,
 	.module = THIS_MODULE,
diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 0d61e9c67986..eec9f252e935 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -1221,7 +1221,7 @@ bad:
 
 static struct target_type verity_target = {
 	.name		= "verity",
-	.version	= {1, 5, 0},
+	.version	= {1, 6, 0},
 	.module		= THIS_MODULE,
 	.ctr		= verity_ctr,
 	.dtr		= verity_dtr,
diff --git a/drivers/md/dm-writecache.c b/drivers/md/dm-writecache.c
index d692d2a00745..a09bdc000e64 100644
--- a/drivers/md/dm-writecache.c
+++ b/drivers/md/dm-writecache.c
@@ -2320,7 +2320,7 @@ static void writecache_status(struct dm_target *ti, status_type_t type,
 
 static struct target_type writecache_target = {
 	.name			= "writecache",
-	.version		= {1, 1, 1},
+	.version		= {1, 2, 0},
 	.module			= THIS_MODULE,
 	.ctr			= writecache_ctr,
 	.dtr			= writecache_dtr,
diff --git a/drivers/md/dm-zoned-target.c b/drivers/md/dm-zoned-target.c
index b1e64cd31647..f4f83d39b3dc 100644
--- a/drivers/md/dm-zoned-target.c
+++ b/drivers/md/dm-zoned-target.c
@@ -967,7 +967,7 @@ static int dmz_iterate_devices(struct dm_target *ti,
 
 static struct target_type dmz_type = {
 	.name		 = "zoned",
-	.version	 = {1, 0, 0},
+	.version	 = {1, 1, 0},
 	.features	 = DM_TARGET_SINGLETON | DM_TARGET_ZONED_HM,
 	.module		 = THIS_MODULE,
 	.ctr		 = dmz_ctr,
diff --git a/include/uapi/linux/dm-ioctl.h b/include/uapi/linux/dm-ioctl.h
index 2df8ceca1f9b..6622912c2342 100644
--- a/include/uapi/linux/dm-ioctl.h
+++ b/include/uapi/linux/dm-ioctl.h
@@ -272,9 +272,9 @@ enum {
 #define DM_DEV_SET_GEOMETRY	_IOWR(DM_IOCTL, DM_DEV_SET_GEOMETRY_CMD, struct dm_ioctl)
 
 #define DM_VERSION_MAJOR	4
-#define DM_VERSION_MINOR	41
+#define DM_VERSION_MINOR	42
 #define DM_VERSION_PATCHLEVEL	0
-#define DM_VERSION_EXTRA	"-ioctl (2019-09-16)"
+#define DM_VERSION_EXTRA	"-ioctl (2020-02-27)"
 
 /* Status bits */
 #define DM_READONLY_FLAG	(1 << 0) /* In/Out */
-- 
cgit v1.2.3


From 2677625387056136e256c743e3285b4fe3da87bb Mon Sep 17 00:00:00 2001
From: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Date: Wed, 11 Mar 2020 17:54:06 +0100
Subject: seg6: fix SRv6 L2 tunnels to use IANA-assigned protocol number

The Internet Assigned Numbers Authority (IANA) has recently assigned
a protocol number value of 143 for Ethernet [1].

Before this assignment, encapsulation mechanisms such as Segment Routing
used the IPv6-NoNxt protocol number (59) to indicate that the encapsulated
payload is an Ethernet frame.

In this patch, we add the definition of the Ethernet protocol number to the
kernel headers and update the SRv6 L2 tunnels to use it.

[1] https://www.iana.org/assignments/protocol-numbers/protocol-numbers.xhtml
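
As an illustration of what changes on the wire, the following userspace
sketch inspects a Segment Routing Header and reports whether it
announces an Ethernet payload. It is not the kernel's
decap_and_validate(): toy_srh_payload_is_ethernet() and the TOY_*
constants are hypothetical, and only the first two SRH bytes (Next
Header, and Hdr Ext Len in 8-octet units excluding the first 8 octets)
are interpreted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_IPPROTO_ETHERNET	143	/* IANA-assigned value */
#define TOY_NEXTHDR_NONE	59	/* IPv6-NoNxt, used before */

/*
 * Returns true if the SRH at @srh announces an Ethernet payload and
 * stores the payload offset (relative to @srh) in @payload_off.
 */
static bool toy_srh_payload_is_ethernet(const uint8_t *srh, size_t len,
					size_t *payload_off)
{
	size_t srh_len;

	if (len < 8)
		return false;

	srh_len = ((size_t)srh[1] + 1) * 8;
	if (len < srh_len)
		return false;

	if (srh[0] != TOY_IPPROTO_ETHERNET)
		return false;

	*payload_off = srh_len;
	return true;
}

/* Usage: an SRH with one 16-byte segment followed by an Ethernet header. */
static void toy_srh_demo(void)
{
	uint8_t pkt[24 + 14];
	size_t off = 0;

	memset(pkt, 0, sizeof(pkt));
	pkt[0] = TOY_IPPROTO_ETHERNET;	/* Next Header */
	pkt[1] = 2;			/* (2 + 1) * 8 = 24 bytes of SRH */
	assert(toy_srh_payload_is_ethernet(pkt, sizeof(pkt), &off));
	assert(off == 24);

	/* The pre-assignment encoding is no longer recognised. */
	pkt[0] = TOY_NEXTHDR_NONE;
	assert(!toy_srh_payload_is_ethernet(pkt, sizeof(pkt), &off));
}
```

A receiver written this way does not accept frames from a sender that
still uses 59, so both tunnel endpoints would need the same behaviour.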

Signed-off-by: Paolo Lungaroni <paolo.lungaroni@cnit.it>
Reviewed-by: Andrea Mayer <andrea.mayer@uniroma2.it>
Acked-by: Ahmed Abdelsalam <ahmed.abdelsalam@gssi.it>
Signed-off-by: David S. Miller <davem@davemloft.net>
---
 include/uapi/linux/in.h  | 2 ++
 net/ipv6/seg6_iptunnel.c | 2 +-
 net/ipv6/seg6_local.c    | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

(limited to 'include/uapi/linux')

diff --git a/include/uapi/linux/in.h b/include/uapi/linux/in.h
index 1521073b6348..8533bf07450f 100644
--- a/include/uapi/linux/in.h
+++ b/include/uapi/linux/in.h
@@ -74,6 +74,8 @@ enum {
 #define IPPROTO_UDPLITE		IPPROTO_UDPLITE
   IPPROTO_MPLS = 137,		/* MPLS in IP (RFC 4023)		*/
 #define IPPROTO_MPLS		IPPROTO_MPLS
+  IPPROTO_ETHERNET = 143,	/* Ethernet-within-IPv6 Encapsulation	*/
+#define IPPROTO_ETHERNET	IPPROTO_ETHERNET
   IPPROTO_RAW = 255,		/* Raw IP packets			*/
 #define IPPROTO_RAW		IPPROTO_RAW
   IPPROTO_MPTCP = 262,		/* Multipath TCP connection		*/
diff --git a/net/ipv6/seg6_iptunnel.c b/net/ipv6/seg6_iptunnel.c
index ab7f124ff5d7..8c52efe299cc 100644
--- a/net/ipv6/seg6_iptunnel.c
+++ b/net/ipv6/seg6_iptunnel.c
@@ -268,7 +268,7 @@ static int seg6_do_srh(struct sk_buff *skb)
 		skb_mac_header_rebuild(skb);
 		skb_push(skb, skb->mac_len);
 
-		err = seg6_do_srh_encap(skb, tinfo->srh, NEXTHDR_NONE);
+		err = seg6_do_srh_encap(skb, tinfo->srh, IPPROTO_ETHERNET);
 		if (err)
 			return err;
 
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 7cbc19731997..8165802d8e05 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -282,7 +282,7 @@ static int input_action_end_dx2(struct sk_buff *skb,
 	struct net_device *odev;
 	struct ethhdr *eth;
 
-	if (!decap_and_validate(skb, NEXTHDR_NONE))
+	if (!decap_and_validate(skb, IPPROTO_ETHERNET))
 		goto drop;
 
 	if (!pskb_may_pull(skb, ETH_HLEN))
-- 
cgit v1.2.3