| author | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-15 04:36:10 +0300 |
|---|---|---|
| committer | Linus Torvalds <torvalds@linux-foundation.org> | 2026-04-15 04:36:10 +0300 |
| commit | 91a4855d6c03e770e42f17c798a36a3c46e63de2 (patch) | |
| tree | 5103bfe3aea2aab7e8b358c5c9329539508f648d /tools | |
| parent | f5ad4101009e7f5f5984ffea6923d4fcd470932a (diff) | |
| parent | 35c2c39832e569449b9192fa1afbbc4c66227af7 (diff) | |
| download | linux-91a4855d6c03e770e42f17c798a36a3c46e63de2.tar.xz | |
Merge tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core & protocols:
- Support HW queue leasing, allowing containers to be granted access
to HW queues for zero-copy operations and AF_XDP
- Number of code moves to help the compiler with inlining. Avoid
output arguments for returning drop reason where possible
- Rework drop handling within qdiscs to include more metadata about
the reason and dropping qdisc in the tracepoints
- Remove the rtnl_lock use from IP Multicast Routing
- Pack size information into the Rx Flow Steering table pointer
itself. This allows making the table itself a flat array of u32s,
thus making the table allocation size a power of two
- Report TCP delayed ack timer information via socket diag
- Add ip_local_port_step_width sysctl to allow distributing the
randomly selected ports more evenly throughout the allowed space
- Add support for per-route tunsrc in IPv6 segment routing
- Start work of switching sockopt handling to iov_iter
- Improve dynamic recvbuf sizing in MPTCP, limit burstiness and avoid
buffer size drifting up
- Support MSG_EOR in MPTCP
- Add stp_mode attribute to the bridge driver for STP mode selection.
This addresses concerns about call_usermodehelper() usage
- Remove UDP-Lite support (as announced in 2023)
- Remove support for building IPv6 as a module. Remove the now
unnecessary function calling indirection
Cross-tree stuff:
- Move Michael MIC code from generic crypto into wireless, it's
considered insecure but some WiFi networks still need it
Netfilter:
- Switch nft_fib_ipv6 module to no longer need temporary dst_entry
object allocations by using fib6_lookup() + RCU.
Florian W reports this gets us ~13% higher packet rate
- Convert IPVS's global __ip_vs_mutex to per-net service_mutex and
switch the service tables to be per-net. Convert some code that
walks the service lists to use RCU instead of the service_mutex
- Add more opinionated input validation to lower security exposure
- Make IPVS hash tables to be per-netns and resizable
Wireless:
- Finished assoc frame encryption/EPPKE/802.1X-over-auth
- Radar detection improvements
- Add 6 GHz incumbent signal detection APIs
- Multi-link support for FILS, probe response templates and client
probing
- New APIs and mac80211 support for NAN (Neighbor Aware Networking,
aka Wi-Fi Aware) so less work must be in firmware
Driver API:
- Add numerical ID for devlink instances (to avoid having to create
fake bus/device pairs just to have an ID). Support shared devlink
instances which span multiple PFs
- Add standard counters for reporting pause storm events (implement
in mlx5 and fbnic)
- Add configuration API for completion writeback buffering (implement
in mana)
- Support driver-initiated change of RSS context sizes
- Support DPLL monitoring input frequency (implement in zl3073x)
- Support per-port resources in devlink (implement in mlx5)
Misc:
- Expand the YAML spec for Netfilter
Drivers
- Software:
- macvlan: support multicast rx for bridge ports with shared
source MAC address
- team: decouple receive and transmit enablement for IEEE 802.3ad
LACP "independent control"
- Ethernet high-speed NICs:
- nVidia/Mellanox:
- support high order pages in zero-copy mode (for payload
coalescing)
- support multiple packets in a page (for systems with 64kB
pages)
- Broadcom 25-400GE (bnxt):
- implement XDP RSS hash metadata extraction
- add software fallback for UDP GSO, lowering the IOMMU cost
- Broadcom 800GE (bnge):
- add link status and configuration handling
- add various HW and SW statistics
- Marvell/Cavium:
- NPC HW block support for cn20k
- Huawei (hinic3):
- add mailbox / control queue
- add rx VLAN offload
- add driver info and link management
- Ethernet NICs:
- Marvell/Aquantia:
- support reading SFP module info on some AQC100 cards
- Realtek PCI (r8169):
- add support for RTL8125cp
- Realtek USB (r8152):
- support for the RTL8157 5Gbit chip
- add 2500baseT EEE status/configuration support
- Ethernet NICs embedded and off-the-shelf IP:
- Synopsys (stmmac):
- cleanup and reorganize SerDes handling and PCS support
- cleanup descriptor handling and per-platform data
- cleanup and consolidate MDIO defines and handling
- shrink driver memory use for internal structures
- improve Tx IRQ coalescing
- improve TCP segmentation handling
- add support for Spacemit K3
- Cadence (macb):
- support PHYs that have inband autoneg disabled with GEM
- support IEEE 802.3az EEE
- rework usrio capabilities and handling
- AMD (xgbe):
- improve power management for S0i3
- improve TX resilience for link-down handling
- Virtual:
- Google cloud vNIC:
- support larger ring sizes in DQO-QPL mode
- improve HW-GRO handling
- support UDP GSO for DQO format
- PCIe NTB:
- support queue count configuration
- Ethernet PHYs:
- automatically disable PHY autonomous EEE if MAC is in charge
- Broadcom:
- add BCM84891/BCM84892 support
- Micrel:
- support for LAN9645X internal PHY
- Realtek:
- add RTL8224 pair order support
- support PHY LEDs on RTL8211F-VD
- support spread spectrum clocking (SSC)
- Maxlinear:
- add PHY-level statistics via ethtool
- Ethernet switches:
- Maxlinear (mxl862xx):
- support for bridge offloading
- support for VLANs
- support driver statistics
- Bluetooth:
- large number of fixes and new device IDs
- Mediatek:
- support MT6639 (MT7927)
- support MT7902 SDIO
- WiFi:
- Intel (iwlwifi):
- UNII-9 and continuing UHR work
- MediaTek (mt76):
- mt7996/mt7925 MLO fixes/improvements
- mt7996 NPU support (HW eth/wifi traffic offload)
- Qualcomm (ath12k):
- monitor mode support on IPQ5332
- basic hwmon temperature reporting
- support IPQ5424
- Realtek:
- add USB RX aggregation to improve performance
- add USB TX flow control by tracking in-flight URBs
- Cellular:
- IPA v5.2 support"
* tag 'net-next-7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1561 commits)
net: pse-pd: fix kernel-doc function name for pse_control_find_by_id()
wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
wireguard: allowedips: remove redundant space
tools: ynl: add sample for wireguard
wireguard: allowedips: Use kfree_rcu() instead of call_rcu()
MAINTAINERS: Add netkit selftest files
selftests/net: Add additional test coverage in nk_qlease
selftests/net: Split netdevsim tests from HW tests in nk_qlease
tools/ynl: Make YnlFamily closeable as a context manager
net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
net: airoha: Fix VIP configuration for AN7583 SoC
net: caif: clear client service pointer on teardown
net: strparser: fix skb_head leak in strp_abort_strp()
net: usb: cdc-phonet: fix skb frags[] overflow in rx_complete()
selftests/bpf: add test for xdp_master_redirect with bond not up
net, bpf: fix null-ptr-deref in xdp_master_redirect() for down master
net: airoha: Remove PCE_MC_EN_MASK bit in REG_FE_PCE_CFG configuration
sctp: disable BH before calling udp_tunnel_xmit_skb()
sctp: fix missing encap_port propagation for GSO fragments
net: airoha: Rely on net_device pointer in ETS callbacks
...
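A note on the shortlog entry "tools/ynl: Make YnlFamily closeable as a context manager": the pattern it adds can be sketched roughly as follows. This is a minimal, self-contained illustration, not the pyynl code. `ClosableFamily` is a hypothetical stand-in, and it opens a UDP socket only so the sketch runs without netlink privileges.

```python
import socket


class ClosableFamily:
    """Hypothetical stand-in for a class that owns exactly one socket."""

    def __init__(self):
        # Assumption: a UDP socket replaces the AF_NETLINK socket so the
        # sketch runs anywhere.
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

    def close(self):
        # Idempotent: calling close() a second time is a no-op.
        if self.sock is not None:
            self.sock.close()
            self.sock = None

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Runs on normal exit and when the with-body raises.
        self.close()


with ClosableFamily() as fam:
    was_open = fam.sock.fileno() >= 0

closed_after = fam.sock is None
fam.close()  # safe to call again
```

The point of the pattern is that the socket is released deterministically at the end of the `with` block instead of whenever the object is garbage collected.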
Diffstat (limited to 'tools')
189 files changed, 12155 insertions, 1865 deletions
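The pyynl changes in the diff that follows add NlPolicy.to_dict(), which expands nested policies recursively and trims cyclic references. A minimal sketch of that trimming idea is shown here under simplifying assumptions: the table is keyed by attribute name (the real code keys by attribute ID and resolves names through the YAML spec), and `policy_to_dict` plus the sample `table` are hypothetical names used only for illustration.

```python
def policy_to_dict(table, idx, props=None, seen=None):
    """Expand policy `idx` from `table` into plain dicts, trimming cycles."""
    seen = seen or frozenset()
    result = dict(props or {})
    # Only descend if this policy index was not already visited on the
    # current path; otherwise keep just the entry's own properties.
    if idx is not None and idx not in seen:
        seen = seen | {idx}
        children = {}
        for name, entry in table.get(idx, {}).items():
            child_props = dict(entry)
            child_idx = child_props.pop('policy-idx', None)
            children[name] = policy_to_dict(table, child_idx,
                                            child_props, seen)
        if props:
            result['policy'] = children
        else:
            # Top level has no properties of its own.
            result = children
    return result


# Hypothetical policy table: policy 0 nests policy 1, which nests 0 again.
table = {
    0: {'id': {'type': 'u32', 'min-value': 1},
        'child': {'type': 'nested', 'policy-idx': 1}},
    1: {'parent': {'type': 'nested', 'policy-idx': 0}},
}
expanded = policy_to_dict(table, 0)
```

In this sketch the back-reference from policy 1 to policy 0 collapses to `{'type': 'nested'}`, so the result stays finite and JSON-serializable.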
diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h
index e0b579a1df4f..7df1056a35fd 100644
--- a/tools/include/uapi/linux/netdev.h
+++ b/tools/include/uapi/linux/netdev.h
@@ -160,6 +160,7 @@ enum {
 	NETDEV_A_QUEUE_DMABUF,
 	NETDEV_A_QUEUE_IO_URING,
 	NETDEV_A_QUEUE_XSK,
+	NETDEV_A_QUEUE_LEASE,
 
 	__NETDEV_A_QUEUE_MAX,
 	NETDEV_A_QUEUE_MAX = (__NETDEV_A_QUEUE_MAX - 1)
@@ -203,6 +204,15 @@ enum {
 };
 
 enum {
+	NETDEV_A_LEASE_IFINDEX = 1,
+	NETDEV_A_LEASE_QUEUE,
+	NETDEV_A_LEASE_NETNS_ID,
+
+	__NETDEV_A_LEASE_MAX,
+	NETDEV_A_LEASE_MAX = (__NETDEV_A_LEASE_MAX - 1)
+};
+
+enum {
 	NETDEV_A_DMABUF_IFINDEX = 1,
 	NETDEV_A_DMABUF_QUEUES,
 	NETDEV_A_DMABUF_FD,
@@ -228,6 +238,7 @@ enum {
 	NETDEV_CMD_BIND_RX,
 	NETDEV_CMD_NAPI_SET,
 	NETDEV_CMD_BIND_TX,
+	NETDEV_CMD_QUEUE_CREATE,
 
 	__NETDEV_CMD_MAX,
 	NETDEV_CMD_MAX = (__NETDEV_CMD_MAX - 1)
diff --git a/tools/net/ynl/Makefile b/tools/net/ynl/Makefile
index 9b692f368be7..d514a48dae27 100644
--- a/tools/net/ynl/Makefile
+++ b/tools/net/ynl/Makefile
@@ -14,12 +14,12 @@ includedir ?= $(prefix)/include
 
 SPECDIR=../../../Documentation/netlink/specs
 
-SUBDIRS = lib generated samples ynltool tests
+SUBDIRS = lib generated ynltool tests
 
 all: $(SUBDIRS) libynl.a
 
+tests: | lib generated libynl.a
 ynltool: | lib generated libynl.a
-samples: | lib generated
 libynl.a: | lib generated
 	@echo -e "\tAR $@"
 	@ar rcs $@ lib/ynl.o generated/*-user.o
diff --git a/tools/net/ynl/pyynl/cli.py b/tools/net/ynl/pyynl/cli.py
index 94a5ba348b69..8275a806cf73 100755
--- a/tools/net/ynl/pyynl/cli.py
+++ b/tools/net/ynl/pyynl/cli.py
@@ -78,7 +78,7 @@ class YnlEncoder(json.JSONEncoder):
         if isinstance(o, bytes):
             return bytes.hex(o)
         if isinstance(o, set):
-            return list(o)
+            return sorted(o)
         return json.JSONEncoder.default(self, o)
 
 
@@ -256,6 +256,8 @@ def main():
     schema_group.add_argument('--no-schema', action='store_true')
 
     dbg_group = parser.add_argument_group('Debug options')
+    io_group.add_argument('--policy', action='store_true',
+                          help='Query kernel policy for the operation instead of executing it')
     dbg_group.add_argument('--dbg-small-recv', default=0, const=4000,
                            action='store', nargs='?', type=int, metavar='INT',
                            help="Length of buffers used for recv()")
@@ -308,6 +310,16 @@ def main():
     if args.dbg_small_recv:
         ynl.set_recv_dbg(True)
 
+    if args.policy:
+        if args.do:
+            pol = ynl.get_policy(args.do, 'do')
+            output(pol.to_dict() if pol else None)
+            args.do = None
+        if args.dump:
+            pol = ynl.get_policy(args.dump, 'dump')
+            output(pol.to_dict() if pol else None)
+            args.dump = None
+
     if args.ntf:
         ynl.ntf_subscribe(args.ntf)
 
diff --git a/tools/net/ynl/pyynl/lib/__init__.py b/tools/net/ynl/pyynl/lib/__init__.py
index 33a96155fb3b..be741985ae4e 100644
--- a/tools/net/ynl/pyynl/lib/__init__.py
+++ b/tools/net/ynl/pyynl/lib/__init__.py
@@ -5,11 +5,12 @@
 from .nlspec import SpecAttr, SpecAttrSet, SpecEnumEntry, SpecEnumSet, \
     SpecFamily, SpecOperation, SpecSubMessage, SpecSubMessageFormat, \
     SpecException
-from .ynl import YnlFamily, Netlink, NlError, YnlException
+from .ynl import YnlFamily, Netlink, NlError, NlPolicy, YnlException
 
 from .doc_generator import YnlDocGenerator
 
 __all__ = ["SpecAttr", "SpecAttrSet", "SpecEnumEntry", "SpecEnumSet",
            "SpecFamily", "SpecOperation", "SpecSubMessage", "SpecSubMessageFormat",
            "SpecException",
-           "YnlFamily", "Netlink", "NlError", "YnlDocGenerator", "YnlException"]
+           "YnlFamily", "Netlink", "NlError", "NlPolicy", "YnlException",
+           "YnlDocGenerator"]
diff --git a/tools/net/ynl/pyynl/lib/ynl.py b/tools/net/ynl/pyynl/lib/ynl.py
index 9774005e7ad1..f63c6f828735 100644
--- a/tools/net/ynl/pyynl/lib/ynl.py
+++ b/tools/net/ynl/pyynl/lib/ynl.py
@@ -77,15 +77,22 @@ class Netlink:
 
     # nlctrl
     CTRL_CMD_GETFAMILY = 3
+    CTRL_CMD_GETPOLICY = 10
 
     CTRL_ATTR_FAMILY_ID = 1
     CTRL_ATTR_FAMILY_NAME = 2
     CTRL_ATTR_MAXATTR = 5
     CTRL_ATTR_MCAST_GROUPS = 7
+    CTRL_ATTR_POLICY = 8
+    CTRL_ATTR_OP_POLICY = 9
+    CTRL_ATTR_OP = 10
 
     CTRL_ATTR_MCAST_GRP_NAME = 1
     CTRL_ATTR_MCAST_GRP_ID = 2
 
+    CTRL_ATTR_POLICY_DO = 1
+    CTRL_ATTR_POLICY_DUMP = 2
+
     # Extack types
     NLMSGERR_ATTR_MSG = 1
     NLMSGERR_ATTR_OFFS = 2
@@ -136,6 +143,119 @@ class ConfigError(Exception):
     pass
 
 
+class NlPolicy:
+    """Kernel policy for one mode (do or dump) of one operation.
+
+    Returned by YnlFamily.get_policy(). Attributes of the policy
+    are accessible as attributes of the object. Nested policies
+    can be accessed indexing the object like a dictionary::
+
+        pol = ynl.get_policy('page-pool-stats-get', 'do')
+        pol['info'].type # 'nested'
+        pol['info']['id'].type # 'uint'
+        pol['info']['id'].min_value # 1
+
+    Each policy entry always has a 'type' attribute (e.g. u32, string,
+    nested). Optional attributes depending on the 'type': min-value,
+    max-value, min-length, max-length, mask.
+
+    Policies can form infinite nesting loops. These loops are trimmed
+    when policy is converted to a dict with pol.to_dict().
+    """
+    def __init__(self, ynl, policy_idx, policy_table, attr_set, props=None):
+        self._policy_idx = policy_idx
+        self._policy_table = policy_table
+        self._ynl = ynl
+        self._props = props or {}
+        self._entries = {}
+        self._cache = {}
+        if policy_idx is not None and policy_idx in policy_table:
+            for attr_id, decoded in policy_table[policy_idx].items():
+                if attr_set and attr_id in attr_set.attrs_by_val:
+                    spec = attr_set.attrs_by_val[attr_id]
+                    name = spec['name']
+                else:
+                    spec = None
+                    name = f'attr-{attr_id}'
+                self._entries[name] = (spec, decoded)
+
+    def __getitem__(self, name):
+        """Descend into a nested policy by attribute name."""
+        if name not in self._cache:
+            spec, decoded = self._entries[name]
+            props = dict(decoded)
+            child_idx = None
+            child_set = None
+            if 'policy-idx' in props:
+                child_idx = props.pop('policy-idx')
+                if spec and 'nested-attributes' in spec.yaml:
+                    child_set = self._ynl.attr_sets[spec.yaml['nested-attributes']]
+            self._cache[name] = NlPolicy(self._ynl, child_idx,
+                                         self._policy_table,
+                                         child_set, props)
+        return self._cache[name]
+
+    def __getattr__(self, name):
+        """Access this policy entry's own properties (type, min-value, etc.).
+
+        Underscores in the name are converted to dashes, so that
+        pol.min_value looks up "min-value".
+        """
+        key = name.replace('_', '-')
+        try:
+            # Hack for level-0 which we still want to have .type but we don't
+            # want type to pointlessly show up in the dict / JSON form.
+            if not self._props and name == "type":
+                return "nested"
+            return self._props[key]
+        except KeyError:
+            raise AttributeError(name)
+
+    def get(self, name, default=None):
+        """Look up a child policy entry by attribute name, with a default."""
+        try:
+            return self[name]
+        except KeyError:
+            return default
+
+    def __contains__(self, name):
+        return name in self._entries
+
+    def __len__(self):
+        return len(self._entries)
+
+    def __iter__(self):
+        return iter(self._entries)
+
+    def keys(self):
+        """Return attribute names accepted by this policy."""
+        return self._entries.keys()
+
+    def to_dict(self, seen=None):
+        """Convert to a plain dict, suitable for JSON serialization.
+
+        Nested NlPolicy objects are expanded recursively. Cyclic
+        references are trimmed (resolved to just {"type": "nested"}).
+        """
+        if seen is None:
+            seen = set()
+        result = dict(self._props)
+        if self._policy_idx is not None:
+            if self._policy_idx not in seen:
+                seen = seen | {self._policy_idx}
+                children = {}
+                for name in self:
+                    children[name] = self[name].to_dict(seen)
+                if self._props:
+                    result['policy'] = children
+                else:
+                    result = children
+        return result
+
+    def __repr__(self):
+        return repr(self.to_dict())
+
+
 class NlAttr:
     ScalarFormat = namedtuple('ScalarFormat', ['native', 'big', 'little'])
     type_formats = {
@@ -247,7 +367,7 @@ class NlMsg:
                 elif extack.type == Netlink.NLMSGERR_ATTR_OFFS:
                     self.extack['bad-attr-offs'] = extack.as_scalar('u32')
                 elif extack.type == Netlink.NLMSGERR_ATTR_POLICY:
-                    self.extack['policy'] = self._decode_policy(extack.raw)
+                    self.extack['policy'] = _genl_decode_policy(extack.raw)
                 else:
                     if 'unknown' not in self.extack:
                         self.extack['unknown'] = []
@@ -256,30 +376,6 @@ class NlMsg:
         if attr_space:
             self.annotate_extack(attr_space)
 
-    def _decode_policy(self, raw):
-        policy = {}
-        for attr in NlAttrs(raw):
-            if attr.type == Netlink.NL_POLICY_TYPE_ATTR_TYPE:
-                type_ = attr.as_scalar('u32')
-                policy['type'] = Netlink.AttrType(type_).name
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MIN_VALUE_S:
-                policy['min-value'] = attr.as_scalar('s64')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MAX_VALUE_S:
-                policy['max-value'] = attr.as_scalar('s64')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MIN_VALUE_U:
-                policy['min-value'] = attr.as_scalar('u64')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MAX_VALUE_U:
-                policy['max-value'] = attr.as_scalar('u64')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MIN_LENGTH:
-                policy['min-length'] = attr.as_scalar('u32')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MAX_LENGTH:
-                policy['max-length'] = attr.as_scalar('u32')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_BITFIELD32_MASK:
-                policy['bitfield32-mask'] = attr.as_scalar('u32')
-            elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MASK:
-                policy['mask'] = attr.as_scalar('u64')
-        return policy
-
     def annotate_extack(self, attr_space):
         """ Make extack more human friendly with attribute information """
 
@@ -333,6 +429,33 @@ def _genl_msg_finalize(msg):
     return struct.pack("I", len(msg) + 4) + msg
 
 
+def _genl_decode_policy(raw):
+    policy = {}
+    for attr in NlAttrs(raw):
+        if attr.type == Netlink.NL_POLICY_TYPE_ATTR_TYPE:
+            type_ = attr.as_scalar('u32')
+            policy['type'] = Netlink.AttrType(type_).name
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MIN_VALUE_S:
+            policy['min-value'] = attr.as_scalar('s64')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MAX_VALUE_S:
+            policy['max-value'] = attr.as_scalar('s64')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MIN_VALUE_U:
+            policy['min-value'] = attr.as_scalar('u64')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MAX_VALUE_U:
+            policy['max-value'] = attr.as_scalar('u64')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MIN_LENGTH:
+            policy['min-length'] = attr.as_scalar('u32')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MAX_LENGTH:
+            policy['max-length'] = attr.as_scalar('u32')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_POLICY_IDX:
+            policy['policy-idx'] = attr.as_scalar('u32')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_BITFIELD32_MASK:
+            policy['bitfield32-mask'] = attr.as_scalar('u32')
+        elif attr.type == Netlink.NL_POLICY_TYPE_ATTR_MASK:
+            policy['mask'] = attr.as_scalar('u64')
+    return policy
+
+
 # pylint: disable=too-many-nested-blocks
 def _genl_load_families():
     genl_family_name_to_id = {}
@@ -381,6 +504,52 @@ def _genl_load_families():
                     genl_family_name_to_id[fam['name']] = fam
 
 
+# pylint: disable=too-many-nested-blocks
+def _genl_policy_dump(family_id, op):
+    op_policy = {}
+    policy_table = {}
+
+    with socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, Netlink.NETLINK_GENERIC) as sock:
+        sock.setsockopt(Netlink.SOL_NETLINK, Netlink.NETLINK_CAP_ACK, 1)
+
+        msg = _genl_msg(Netlink.GENL_ID_CTRL,
+                        Netlink.NLM_F_REQUEST | Netlink.NLM_F_ACK | Netlink.NLM_F_DUMP,
+                        Netlink.CTRL_CMD_GETPOLICY, 1)
+        msg += struct.pack('HHHxx', 6, Netlink.CTRL_ATTR_FAMILY_ID, family_id)
+        msg += struct.pack('HHI', 8, Netlink.CTRL_ATTR_OP, op)
+        msg = _genl_msg_finalize(msg)
+
+        sock.send(msg, 0)
+
+        while True:
+            reply = sock.recv(128 * 1024)
+            nms = NlMsgs(reply)
+            for nl_msg in nms:
+                if nl_msg.error:
+                    raise YnlException(f"Netlink error: {nl_msg.error}")
+                if nl_msg.done:
+                    return op_policy, policy_table
+
+                gm = GenlMsg(nl_msg)
+                for attr in NlAttrs(gm.raw):
+                    if attr.type == Netlink.CTRL_ATTR_OP_POLICY:
+                        for op_attr in NlAttrs(attr.raw):
+                            for method_attr in NlAttrs(op_attr.raw):
+                                if method_attr.type == Netlink.CTRL_ATTR_POLICY_DO:
+                                    op_policy['do'] = method_attr.as_scalar('u32')
+                                elif method_attr.type == Netlink.CTRL_ATTR_POLICY_DUMP:
+                                    op_policy['dump'] = method_attr.as_scalar('u32')
+                    elif attr.type == Netlink.CTRL_ATTR_POLICY:
+                        for pidx_attr in NlAttrs(attr.raw):
+                            policy_idx = pidx_attr.type
+                            for aid_attr in NlAttrs(pidx_attr.raw):
+                                attr_id = aid_attr.type
+                                decoded = _genl_decode_policy(aid_attr.raw)
+                                if policy_idx not in policy_table:
+                                    policy_table[policy_idx] = {}
+                                policy_table[policy_idx][attr_id] = decoded
+
+
 class GenlMsg:
     def __init__(self, nl_msg):
         self.nl = nl_msg
@@ -488,6 +657,37 @@ class SpaceAttrs:
 
 
 class YnlFamily(SpecFamily):
+    """
+    YNL family -- a Netlink interface built from a YAML spec.
+
+    Primary use of the class is to execute Netlink commands:
+
+      ynl.<op_name>(attrs, ...)
+
+    By default this will execute the <op_name> as "do", pass dump=True
+    to perform a dump operation.
+
+    ynl.<op_name> is a shorthand / convenience wrapper for the following
+    methods which take the op_name as a string:
+
+      ynl.do(op_name, attrs, flags=None) -- execute a do operation
+      ynl.dump(op_name, attrs) -- execute a dump operation
+      ynl.do_multi(ops) -- batch multiple do operations
+
+    The flags argument in ynl.do() allows passing in extra NLM_F_* flags
+    which may be necessary for old families.
+
+    Notification API:
+
+      ynl.ntf_subscribe(mcast_name) -- join a multicast group
+      ynl.check_ntf() -- drain pending notifications
+      ynl.poll_ntf(duration=None) -- yield notifications
+
+    Policy introspection allows querying validation criteria from the running
+    kernel. Allows checking whether kernel supports a given attribute or value.
+
+      ynl.get_policy(op_name, mode) -- query kernel policy for an op
+    """
     def __init__(self, def_path, schema=None, process_unknown=False,
                  recv_size=0):
         super().__init__(def_path, schema)
@@ -531,6 +731,16 @@ class YnlFamily(SpecFamily):
             bound_f = functools.partial(self._op, op_name)
             setattr(self, op.ident_name, bound_f)
 
+    def close(self):
+        if self.sock is not None:
+            self.sock.close()
+            self.sock = None
+
+    def __enter__(self):
+        return self
+
+    def __exit__(self, exc_type, exc, tb):
+        self.close()
 
     def ntf_subscribe(self, mcast_name):
         mcast_id = self.nlproto.get_mcast_id(mcast_name, self.mcast_groups)
@@ -814,7 +1024,9 @@ class YnlFamily(SpecFamily):
                 continue
 
             try:
-                if attr_spec["type"] == 'nest':
+                if attr_spec["type"] == 'pad':
+                    continue
+                elif attr_spec["type"] == 'nest':
                     subdict = self._decode(NlAttrs(attr.raw),
                                            attr_spec['nested-attributes'],
                                            search_attrs)
@@ -1190,3 +1402,28 @@ class YnlFamily(SpecFamily):
 
     def do_multi(self, ops):
         return self._ops(ops)
+
+    def get_policy(self, op_name, mode):
+        """Query running kernel for the Netlink policy of an operation.
+
+        Allows checking whether kernel supports a given attribute or value.
+        This method consults the running kernel, not the YAML spec.
+
+        Args:
+            op_name: operation name as it appears in the YAML spec
+            mode: 'do' or 'dump'
+
+        Returns:
+            NlPolicy acting as a read-only dict mapping attribute names
+            to their policy properties (type, min/max, nested, etc.),
+            or None if the operation has no policy for the given mode.
+            Empty policy usually implies that the operation rejects
+            all attributes.
+        """
+        op = self.ops[op_name]
+        op_policy, policy_table = _genl_policy_dump(self.nlproto.family_id,
+                                                    op.req_value)
+        if mode not in op_policy:
+            return None
+        policy_idx = op_policy[mode]
+        return NlPolicy(self, policy_idx, policy_table, op.attr_set)
diff --git a/tools/net/ynl/samples/Makefile b/tools/net/ynl/samples/Makefile
deleted file mode 100644
index d76cbd41cbb1..000000000000
--- a/tools/net/ynl/samples/Makefile
+++ /dev/null
@@ -1,36 +0,0 @@
-# SPDX-License-Identifier: GPL-2.0
-
-include ../Makefile.deps
-
-CC=gcc
-CFLAGS += -std=gnu11 -O2 -W -Wall -Wextra -Wno-unused-parameter -Wshadow \
-	  -I../lib/ -I../generated/ -idirafter $(UAPI_PATH)
-ifeq ("$(DEBUG)","1")
-  CFLAGS += -g -fsanitize=address -fsanitize=leak -static-libasan
-endif
-
-LDLIBS=../lib/ynl.a ../generated/protos.a
-
-SRCS=$(wildcard *.c)
-BINS=$(patsubst %.c,%,${SRCS})
-
-include $(wildcard *.d)
-
-all: $(BINS)
-
-CFLAGS_page-pool=$(CFLAGS_netdev)
-CFLAGS_tc-filter-add:=$(CFLAGS_tc)
-
-$(BINS): ../lib/ynl.a ../generated/protos.a $(SRCS)
-	@echo -e '\tCC sample $@'
-	@$(COMPILE.c) $(CFLAGS_$@) $@.c -o $@.o
-	@$(LINK.c) $@.o -o $@ $(LDLIBS)
-
-clean:
-	rm -f *.o *.d *~
-
-distclean: clean
-	rm -f $(BINS)
-
-.PHONY: all clean distclean
-.DEFAULT_GOAL=all
diff --git a/tools/net/ynl/samples/devlink.c b/tools/net/ynl/samples/devlink.c
deleted file mode 100644
index ac9dfb01f280..000000000000
--- a/tools/net/ynl/samples/devlink.c
+++ /dev/null
@@ -1,61 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdio.h>
-#include <string.h>
-
-#include <ynl.h>
-
-#include "devlink-user.h"
-
-int main(int argc, char **argv)
-{
-	struct devlink_get_list *devs;
-	struct ynl_sock *ys;
-
-	ys = ynl_sock_create(&ynl_devlink_family, NULL);
-	if (!ys)
-		return 1;
-
-	devs = devlink_get_dump(ys);
-	if (!devs)
-		goto err_close;
-
-	ynl_dump_foreach(devs, d) {
-		struct devlink_info_get_req *info_req;
-		struct devlink_info_get_rsp *info_rsp;
-		unsigned i;
-
-		printf("%s/%s:\n", d->bus_name, d->dev_name);
-
-		info_req = devlink_info_get_req_alloc();
-		devlink_info_get_req_set_bus_name(info_req, d->bus_name);
-		devlink_info_get_req_set_dev_name(info_req, d->dev_name);
-
-		info_rsp = devlink_info_get(ys, info_req);
-		devlink_info_get_req_free(info_req);
-		if (!info_rsp)
-			goto err_free_devs;
-
-		if (info_rsp->_len.info_driver_name)
-			printf(" driver: %s\n", info_rsp->info_driver_name);
-		if (info_rsp->_count.info_version_running)
-			printf(" running fw:\n");
-		for (i = 0; i < info_rsp->_count.info_version_running; i++)
-			printf(" %s: %s\n",
-			       info_rsp->info_version_running[i].info_version_name,
-			       info_rsp->info_version_running[i].info_version_value);
-		printf(" ...\n");
-		devlink_info_get_rsp_free(info_rsp);
-	}
-	devlink_get_list_free(devs);
-
-	ynl_sock_destroy(ys);
-
-	return 0;
-
-err_free_devs:
-	devlink_get_list_free(devs);
-err_close:
-	fprintf(stderr, "YNL: %s\n", ys->err.msg);
-	ynl_sock_destroy(ys);
-	return 2;
-}
diff --git a/tools/net/ynl/samples/ethtool.c b/tools/net/ynl/samples/ethtool.c
deleted file mode 100644
index a7ebbd1b98db..000000000000
--- a/tools/net/ynl/samples/ethtool.c
+++ /dev/null
@@ -1,65 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdio.h>
-#include <string.h>
-
-#include <ynl.h>
-
-#include <net/if.h>
-
-#include "ethtool-user.h"
-
-int main(int argc, char **argv)
-{
-	struct ethtool_channels_get_req_dump creq = {};
-	struct ethtool_rings_get_req_dump rreq = {};
-	struct ethtool_channels_get_list *channels;
-	struct ethtool_rings_get_list *rings;
-	struct ynl_sock *ys;
-
-	ys = ynl_sock_create(&ynl_ethtool_family, NULL);
-	if (!ys)
-		return 1;
-
-	creq._present.header = 1; /* ethtool needs an empty nest, sigh */
-	channels = ethtool_channels_get_dump(ys, &creq);
-	if (!channels)
-		goto err_close;
-
-	printf("Channels:\n");
-	ynl_dump_foreach(channels, dev) {
-		printf(" %8s: ", dev->header.dev_name);
-		if (dev->_present.rx_count)
-			printf("rx %d ", dev->rx_count);
-		if (dev->_present.tx_count)
-			printf("tx %d ", dev->tx_count);
-		if (dev->_present.combined_count)
-			printf("combined %d ", dev->combined_count);
-		printf("\n");
-	}
-	ethtool_channels_get_list_free(channels);
-
-	rreq._present.header = 1; /* ethtool needs an empty nest.. */
-	rings = ethtool_rings_get_dump(ys, &rreq);
-	if (!rings)
-		goto err_close;
-
-	printf("Rings:\n");
-	ynl_dump_foreach(rings, dev) {
-		printf(" %8s: ", dev->header.dev_name);
-		if (dev->_present.rx)
-			printf("rx %d ", dev->rx);
-		if (dev->_present.tx)
-			printf("tx %d ", dev->tx);
-		printf("\n");
-	}
-	ethtool_rings_get_list_free(rings);
-
-	ynl_sock_destroy(ys);
-
-	return 0;
-
-err_close:
-	fprintf(stderr, "YNL (%d): %s\n", ys->err.code, ys->err.msg);
-	ynl_sock_destroy(ys);
-	return 2;
-}
diff --git a/tools/net/ynl/samples/netdev.c b/tools/net/ynl/samples/netdev.c
deleted file mode 100644
index 22609d44c89a..000000000000
--- a/tools/net/ynl/samples/netdev.c
+++ /dev/null
@@ -1,128 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdio.h>
-#include <string.h>
-
-#include <ynl.h>
-
-#include <net/if.h>
-
-#include "netdev-user.h"
-
-/* netdev genetlink family code sample
- * This sample shows off basics of the netdev family but also notification
- * handling, hence the somewhat odd UI. We subscribe to notifications first
- * then wait for ifc selection, so the socket may already accumulate
- * notifications as we wait. This allows us to test that YNL can handle
- * requests and notifications getting interleaved.
- */
-
-static void netdev_print_device(struct netdev_dev_get_rsp *d, unsigned int op)
-{
-	char ifname[IF_NAMESIZE];
-	const char *name;
-
-	if (!d->_present.ifindex)
-		return;
-
-	name = if_indextoname(d->ifindex, ifname);
-	if (name)
-		printf("%8s", name);
-	printf("[%d]\t", d->ifindex);
-
-	if (!d->_present.xdp_features)
-		return;
-
-	printf("xdp-features (%llx):", d->xdp_features);
-	for (int i = 0; d->xdp_features >= 1U << i; i++) {
-		if (d->xdp_features & (1U << i))
-			printf(" %s", netdev_xdp_act_str(1 << i));
-	}
-
-	printf(" xdp-rx-metadata-features (%llx):", d->xdp_rx_metadata_features);
-	for (int i = 0; d->xdp_rx_metadata_features >= 1U << i; i++) {
-		if (d->xdp_rx_metadata_features & (1U << i))
-			printf(" %s", netdev_xdp_rx_metadata_str(1 << i));
-	}
-
-	printf(" xsk-features (%llx):", d->xsk_features);
-	for (int i = 0; d->xsk_features >= 1U << i; i++) {
-		if (d->xsk_features & (1U << i))
-			printf(" %s", netdev_xsk_flags_str(1 << i));
-	}
-
-	printf(" xdp-zc-max-segs=%u", d->xdp_zc_max_segs);
-
-	name = netdev_op_str(op);
-	if (name)
-		printf(" (ntf: %s)", name);
-	printf("\n");
-}
-
-int main(int argc, char **argv)
-{
-	struct netdev_dev_get_list *devs;
-	struct ynl_ntf_base_type *ntf;
-	struct ynl_error yerr;
-	struct ynl_sock *ys;
-	int ifindex = 0;
-
-	if (argc > 1)
-		ifindex = strtol(argv[1], NULL, 0);
-
-	ys = ynl_sock_create(&ynl_netdev_family, &yerr);
-	if (!ys) {
-		fprintf(stderr, "YNL: %s\n", yerr.msg);
-		return 1;
-	}
-
-	if (ynl_subscribe(ys, "mgmt"))
-		goto err_close;
-
-	printf("Select ifc ($ifindex; or 0 = dump; or -2 ntf check): ");
-	if (scanf("%d", &ifindex) != 1) {
-		fprintf(stderr, "Error: unable to parse input\n");
-		goto err_destroy;
-	}
-
-	if (ifindex > 0) {
-		struct netdev_dev_get_req *req;
-		struct netdev_dev_get_rsp *d;
-
-		req = netdev_dev_get_req_alloc();
-		netdev_dev_get_req_set_ifindex(req, ifindex);
-
-		d = netdev_dev_get(ys, req);
-		netdev_dev_get_req_free(req);
-		if (!d)
-			goto err_close;
-
-		netdev_print_device(d, 0);
-		netdev_dev_get_rsp_free(d);
-	} else if (!ifindex) {
-		devs = netdev_dev_get_dump(ys);
-		if (!devs)
-			goto err_close;
-
-		if (ynl_dump_empty(devs))
-			fprintf(stderr, "Error: no devices reported\n");
-		ynl_dump_foreach(devs, d)
-			netdev_print_device(d, 0);
-		netdev_dev_get_list_free(devs);
-	} else if (ifindex == -2) {
-		ynl_ntf_check(ys);
-	}
-	while ((ntf = ynl_ntf_dequeue(ys))) {
-		netdev_print_device((struct netdev_dev_get_rsp *)&ntf->data,
-				    ntf->cmd);
-		ynl_ntf_free(ntf);
-	}
-
-	ynl_sock_destroy(ys);
-	return 0;
-
-err_close:
-	fprintf(stderr, "YNL: %s\n", ys->err.msg);
-err_destroy:
-	ynl_sock_destroy(ys);
-	return 2;
-}
diff --git a/tools/net/ynl/samples/ovs.c b/tools/net/ynl/samples/ovs.c
deleted file mode 100644
index 3e975c003d77..000000000000
--- a/tools/net/ynl/samples/ovs.c
+++ /dev/null
@@ -1,60 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdio.h>
-#include <string.h>
-
-#include <ynl.h>
-
-#include "ovs_datapath-user.h"
-
-int main(int argc, char **argv)
-{
-	struct ynl_sock *ys;
-	int err;
-
-	ys = ynl_sock_create(&ynl_ovs_datapath_family, NULL);
-	if (!ys)
-		return 1;
-
-	if (argc > 1) {
-		struct ovs_datapath_new_req *req;
-
-		req = ovs_datapath_new_req_alloc();
-		if (!req)
-			goto err_close;
-
-		ovs_datapath_new_req_set_upcall_pid(req, 1);
-		ovs_datapath_new_req_set_name(req, argv[1]);
-
-		err = ovs_datapath_new(ys, req);
-		ovs_datapath_new_req_free(req);
-		if (err)
-			goto err_close;
-	} else {
-		struct ovs_datapath_get_req_dump *req;
-		struct ovs_datapath_get_list *dps;
-
-		printf("Dump:\n");
-		req = ovs_datapath_get_req_dump_alloc();
-
-		dps = ovs_datapath_get_dump(ys, req);
-		ovs_datapath_get_req_dump_free(req);
-		if (!dps)
-			goto err_close;
-
-		ynl_dump_foreach(dps, dp) {
-			printf(" %s(%d): pid:%u cache:%u\n",
-			       dp->name, dp->_hdr.dp_ifindex,
-			       dp->upcall_pid, dp->masks_cache_size);
-		}
-		ovs_datapath_get_list_free(dps);
-	}
-
-	ynl_sock_destroy(ys);
-
-	return 0;
-
-err_close:
-	fprintf(stderr, "YNL (%d): %s\n", ys->err.code, ys->err.msg);
-	ynl_sock_destroy(ys);
-	return 2;
-}
diff --git a/tools/net/ynl/samples/rt-addr.c b/tools/net/ynl/samples/rt-addr.c
deleted file mode 100644
index 2edde5c36b18..000000000000
--- a/tools/net/ynl/samples/rt-addr.c
+++ /dev/null
@@ -1,80 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdio.h>
-#include <string.h>
-
-#include <ynl.h>
-
-#include <arpa/inet.h>
-#include <net/if.h>
-
-#include "rt-addr-user.h"
-
-static void rt_addr_print(struct rt_addr_getaddr_rsp *a)
-{
-	char ifname[IF_NAMESIZE];
-	char addr_str[64];
-	const char *addr;
-	const char *name;
-
-	name = if_indextoname(a->_hdr.ifa_index, ifname);
-	if (name)
-		printf("%16s: ", name);
-
-	switch (a->_len.address) {
-	case 4:
-		addr = inet_ntop(AF_INET, a->address,
-				 addr_str, sizeof(addr_str));
-		break;
-	case 16:
-		addr = inet_ntop(AF_INET6, a->address,
-				 addr_str, sizeof(addr_str));
-		break;
-	default:
-		addr = NULL;
-		break;
-	}
-	if (addr)
-		printf("%s", addr);
-	else
-		printf("[%d]", a->_len.address);
-
-	printf("\n");
-}
-
-int main(int argc, char **argv)
-{
-	struct rt_addr_getaddr_list *rsp;
-	struct rt_addr_getaddr_req *req;
-	struct ynl_error yerr;
-	struct ynl_sock *ys;
-
-	ys = ynl_sock_create(&ynl_rt_addr_family, &yerr);
-	if (!ys) {
-		fprintf(stderr, "YNL: %s\n", yerr.msg);
-		return 1;
-	}
-
-	req = rt_addr_getaddr_req_alloc();
-	if (!req)
-		goto err_destroy;
-
-	rsp = rt_addr_getaddr_dump(ys, req);
-	rt_addr_getaddr_req_free(req);
-	if (!rsp)
-		goto err_close;
-
-	if (ynl_dump_empty(rsp))
-		fprintf(stderr, "Error: no addresses reported\n");
-	ynl_dump_foreach(rsp, addr)
-		rt_addr_print(addr);
-	rt_addr_getaddr_list_free(rsp);
-
-	ynl_sock_destroy(ys);
-	return 0;
-
-err_close:
-	fprintf(stderr, "YNL: %s\n", ys->err.msg);
-err_destroy:
-	ynl_sock_destroy(ys);
-	return 2;
-}
diff --git a/tools/net/ynl/samples/rt-link.c b/tools/net/ynl/samples/rt-link.c
deleted file mode 100644
index acdd4b4a0f74..000000000000
--- a/tools/net/ynl/samples/rt-link.c
+++ /dev/null
@@ -1,184 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-#include <stdio.h>
-#include <string.h>
-
-#include <ynl.h>
-
-#include <arpa/inet.h>
-#include <net/if.h>
-
-#include "rt-link-user.h"
-
-static void rt_link_print(struct rt_link_getlink_rsp *r)
-{
-	unsigned int i;
-
-	printf("%3d: ", r->_hdr.ifi_index);
-
-	if (r->_len.ifname)
-		printf("%16s: ", r->ifname);
-
-	if (r->_present.mtu)
-		printf("mtu %5d ", r->mtu);
-
-	if (r->linkinfo._len.kind)
-		printf("kind %-8s ", r->linkinfo.kind);
-	else
-		printf(" %8s ", "");
-
-	if (r->prop_list._count.alt_ifname) {
-		printf("altname ");
-		for (i = 0; i < r->prop_list._count.alt_ifname; i++)
-			printf("%s ", r->prop_list.alt_ifname[i]->str);
-		printf(" ");
-	}
-
-	if (r->linkinfo._present.data && r->linkinfo.data._present.netkit) {
-		struct rt_link_linkinfo_netkit_attrs *netkit;
-		const char *name;
-
-		netkit = &r->linkinfo.data.netkit;
-		printf("primary %d ", netkit->primary);
-
-		name = NULL;
-		if (netkit->_present.policy)
-			name = rt_link_netkit_policy_str(netkit->policy);
-		if (name)
-			printf("policy %s ", name);
-	}
-
-	printf("\n");
-}
-
-static int rt_link_create_netkit(struct ynl_sock *ys)
-{
-	struct rt_link_getlink_ntf *ntf_gl;
-	struct rt_link_newlink_req *req;
-	struct ynl_ntf_base_type *ntf;
-	int ret;
-
-	req = rt_link_newlink_req_alloc();
-	if (!req) {
-		fprintf(stderr, "Can't alloc req\n");
-		return -1;
-	}
-
-	/* rtnetlink doesn't provide info about the created object.
-	 * It expects us to set the ECHO flag and the dig the info out
-	 * of the notifications...
- */ - rt_link_newlink_req_set_nlflags(req, NLM_F_CREATE | NLM_F_ECHO); - - rt_link_newlink_req_set_linkinfo_kind(req, "netkit"); - - /* Test error messages */ - rt_link_newlink_req_set_linkinfo_data_netkit_policy(req, 10); - ret = rt_link_newlink(ys, req); - if (ret) { - printf("Testing error message for policy being bad:\n\t%s\n", ys->err.msg); - } else { - fprintf(stderr, "Warning: unexpected success creating netkit with bad attrs\n"); - goto created; - } - - rt_link_newlink_req_set_linkinfo_data_netkit_policy(req, NETKIT_DROP); - - ret = rt_link_newlink(ys, req); -created: - rt_link_newlink_req_free(req); - if (ret) { - fprintf(stderr, "YNL: %s\n", ys->err.msg); - return -1; - } - - if (!ynl_has_ntf(ys)) { - fprintf(stderr, - "Warning: interface created but received no notification, won't delete the interface\n"); - return 0; - } - - ntf = ynl_ntf_dequeue(ys); - if (ntf->cmd != RTM_NEWLINK) { - fprintf(stderr, - "Warning: unexpected notification type, won't delete the interface\n"); - return 0; - } - ntf_gl = (void *)ntf; - ret = ntf_gl->obj._hdr.ifi_index; - ynl_ntf_free(ntf); - - return ret; -} - -static void rt_link_del(struct ynl_sock *ys, int ifindex) -{ - struct rt_link_dellink_req *req; - - req = rt_link_dellink_req_alloc(); - if (!req) { - fprintf(stderr, "Can't alloc req\n"); - return; - } - - req->_hdr.ifi_index = ifindex; - if (rt_link_dellink(ys, req)) - fprintf(stderr, "YNL: %s\n", ys->err.msg); - else - fprintf(stderr, - "Trying to delete a Netkit interface (ifindex %d)\n", - ifindex); - - rt_link_dellink_req_free(req); -} - -int main(int argc, char **argv) -{ - struct rt_link_getlink_req_dump *req; - struct rt_link_getlink_list *rsp; - struct ynl_error yerr; - struct ynl_sock *ys; - int created = 0; - - ys = ynl_sock_create(&ynl_rt_link_family, &yerr); - if (!ys) { - fprintf(stderr, "YNL: %s\n", yerr.msg); - return 1; - } - - if (argc > 1) { - fprintf(stderr, "Trying to create a Netkit interface\n"); - created = rt_link_create_netkit(ys); - if 
(created < 0) - goto err_destroy; - } - - req = rt_link_getlink_req_dump_alloc(); - if (!req) - goto err_del_ifc; - - rsp = rt_link_getlink_dump(ys, req); - rt_link_getlink_req_dump_free(req); - if (!rsp) - goto err_close; - - if (ynl_dump_empty(rsp)) - fprintf(stderr, "Error: no links reported\n"); - ynl_dump_foreach(rsp, link) - rt_link_print(link); - rt_link_getlink_list_free(rsp); - - if (created) - rt_link_del(ys, created); - - ynl_sock_destroy(ys); - return 0; - -err_close: - fprintf(stderr, "YNL: %s\n", ys->err.msg); -err_del_ifc: - if (created) - rt_link_del(ys, created); -err_destroy: - ynl_sock_destroy(ys); - return 2; -} diff --git a/tools/net/ynl/samples/rt-route.c b/tools/net/ynl/samples/rt-route.c deleted file mode 100644 index 7427104a96df..000000000000 --- a/tools/net/ynl/samples/rt-route.c +++ /dev/null @@ -1,80 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <stdio.h> -#include <string.h> - -#include <ynl.h> - -#include <arpa/inet.h> -#include <net/if.h> - -#include "rt-route-user.h" - -static void rt_route_print(struct rt_route_getroute_rsp *r) -{ - char ifname[IF_NAMESIZE]; - char route_str[64]; - const char *route; - const char *name; - - /* Ignore local */ - if (r->_hdr.rtm_table == RT_TABLE_LOCAL) - return; - - if (r->_present.oif) { - name = if_indextoname(r->oif, ifname); - if (name) - printf("oif: %-16s ", name); - } - - if (r->_len.dst) { - route = inet_ntop(r->_hdr.rtm_family, r->dst, - route_str, sizeof(route_str)); - printf("dst: %s/%d", route, r->_hdr.rtm_dst_len); - } - - if (r->_len.gateway) { - route = inet_ntop(r->_hdr.rtm_family, r->gateway, - route_str, sizeof(route_str)); - printf("gateway: %s ", route); - } - - printf("\n"); -} - -int main(int argc, char **argv) -{ - struct rt_route_getroute_req_dump *req; - struct rt_route_getroute_list *rsp; - struct ynl_error yerr; - struct ynl_sock *ys; - - ys = ynl_sock_create(&ynl_rt_route_family, &yerr); - if (!ys) { - fprintf(stderr, "YNL: %s\n", yerr.msg); - return 1; - } - - 
req = rt_route_getroute_req_dump_alloc(); - if (!req) - goto err_destroy; - - rsp = rt_route_getroute_dump(ys, req); - rt_route_getroute_req_dump_free(req); - if (!rsp) - goto err_close; - - if (ynl_dump_empty(rsp)) - fprintf(stderr, "Error: no routeesses reported\n"); - ynl_dump_foreach(rsp, route) - rt_route_print(route); - rt_route_getroute_list_free(rsp); - - ynl_sock_destroy(ys); - return 0; - -err_close: - fprintf(stderr, "YNL: %s\n", ys->err.msg); -err_destroy: - ynl_sock_destroy(ys); - return 2; -} diff --git a/tools/net/ynl/samples/tc-filter-add.c b/tools/net/ynl/samples/tc-filter-add.c deleted file mode 100644 index 97871e9e9edc..000000000000 --- a/tools/net/ynl/samples/tc-filter-add.c +++ /dev/null @@ -1,335 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <stdio.h> -#include <string.h> -#include <stdlib.h> -#include <arpa/inet.h> -#include <linux/pkt_sched.h> -#include <linux/tc_act/tc_vlan.h> -#include <linux/tc_act/tc_gact.h> -#include <linux/if_ether.h> -#include <net/if.h> - -#include <ynl.h> - -#include "tc-user.h" - -#define TC_HANDLE (0xFFFF << 16) - -const char *vlan_act_name(struct tc_vlan *p) -{ - switch (p->v_action) { - case TCA_VLAN_ACT_POP: - return "pop"; - case TCA_VLAN_ACT_PUSH: - return "push"; - case TCA_VLAN_ACT_MODIFY: - return "modify"; - default: - break; - } - - return "not supported"; -} - -const char *gact_act_name(struct tc_gact *p) -{ - switch (p->action) { - case TC_ACT_SHOT: - return "drop"; - case TC_ACT_OK: - return "ok"; - case TC_ACT_PIPE: - return "pipe"; - default: - break; - } - - return "not supported"; -} - -static void print_vlan(struct tc_act_vlan_attrs *vlan) -{ - printf("%s ", vlan_act_name(vlan->parms)); - if (vlan->_present.push_vlan_id) - printf("id %u ", vlan->push_vlan_id); - if (vlan->_present.push_vlan_protocol) - printf("protocol %#x ", ntohs(vlan->push_vlan_protocol)); - if (vlan->_present.push_vlan_priority) - printf("priority %u ", vlan->push_vlan_priority); -} - -static void print_gact(struct 
tc_act_gact_attrs *gact) -{ - struct tc_gact *p = gact->parms; - - printf("%s ", gact_act_name(p)); -} - -static void flower_print(struct tc_flower_attrs *flower, const char *kind) -{ - struct tc_act_attrs *a; - unsigned int i; - - printf("%s:\n", kind); - - if (flower->_present.key_vlan_id) - printf(" vlan_id: %u\n", flower->key_vlan_id); - if (flower->_present.key_vlan_prio) - printf(" vlan_prio: %u\n", flower->key_vlan_prio); - if (flower->_present.key_num_of_vlans) - printf(" num_of_vlans: %u\n", flower->key_num_of_vlans); - - for (i = 0; i < flower->_count.act; i++) { - a = &flower->act[i]; - printf("action order: %i %s ", i + 1, a->kind); - if (a->options._present.vlan) - print_vlan(&a->options.vlan); - else if (a->options._present.gact) - print_gact(&a->options.gact); - printf("\n"); - } - printf("\n"); -} - -static void tc_filter_print(struct tc_gettfilter_rsp *f) -{ - struct tc_options_msg *opt = &f->options; - - if (opt->_present.flower) - flower_print(&opt->flower, f->kind); - else if (f->_len.kind) - printf("%s pref %u proto: %#x\n", f->kind, - (f->_hdr.tcm_info >> 16), - ntohs(TC_H_MIN(f->_hdr.tcm_info))); -} - -static int tc_filter_add(struct ynl_sock *ys, int ifi) -{ - struct tc_newtfilter_req *req; - struct tc_act_attrs *acts; - struct tc_vlan p = { - .action = TC_ACT_PIPE, - .v_action = TCA_VLAN_ACT_PUSH - }; - __u16 flags = NLM_F_REQUEST | NLM_F_EXCL | NLM_F_CREATE; - int ret; - - req = tc_newtfilter_req_alloc(); - if (!req) { - fprintf(stderr, "tc_newtfilter_req_alloc failed\n"); - return -1; - } - memset(req, 0, sizeof(*req)); - - acts = tc_act_attrs_alloc(3); - if (!acts) { - fprintf(stderr, "tc_act_attrs_alloc\n"); - tc_newtfilter_req_free(req); - return -1; - } - memset(acts, 0, sizeof(*acts) * 3); - - req->_hdr.tcm_ifindex = ifi; - req->_hdr.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS); - req->_hdr.tcm_info = TC_H_MAKE(1 << 16, htons(ETH_P_8021Q)); - req->chain = 0; - - tc_newtfilter_req_set_nlflags(req, flags); - 
tc_newtfilter_req_set_kind(req, "flower"); - tc_newtfilter_req_set_options_flower_key_vlan_id(req, 100); - tc_newtfilter_req_set_options_flower_key_vlan_prio(req, 5); - tc_newtfilter_req_set_options_flower_key_num_of_vlans(req, 3); - - __tc_newtfilter_req_set_options_flower_act(req, acts, 3); - - /* Skip action at index 0 because in TC, the action array - * index starts at 1, with each index defining the action's - * order. In contrast, in YNL indexed arrays start at index 0. - */ - tc_act_attrs_set_kind(&acts[1], "vlan"); - tc_act_attrs_set_options_vlan_parms(&acts[1], &p, sizeof(p)); - tc_act_attrs_set_options_vlan_push_vlan_id(&acts[1], 200); - tc_act_attrs_set_kind(&acts[2], "vlan"); - tc_act_attrs_set_options_vlan_parms(&acts[2], &p, sizeof(p)); - tc_act_attrs_set_options_vlan_push_vlan_id(&acts[2], 300); - - tc_newtfilter_req_set_options_flower_flags(req, 0); - tc_newtfilter_req_set_options_flower_key_eth_type(req, htons(0x8100)); - - ret = tc_newtfilter(ys, req); - if (ret) - fprintf(stderr, "tc_newtfilter: %s\n", ys->err.msg); - - tc_newtfilter_req_free(req); - - return ret; -} - -static int tc_filter_show(struct ynl_sock *ys, int ifi) -{ - struct tc_gettfilter_req_dump *req; - struct tc_gettfilter_list *rsp; - - req = tc_gettfilter_req_dump_alloc(); - if (!req) { - fprintf(stderr, "tc_gettfilter_req_dump_alloc failed\n"); - return -1; - } - memset(req, 0, sizeof(*req)); - - req->_hdr.tcm_ifindex = ifi; - req->_hdr.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS); - req->_present.chain = 1; - req->chain = 0; - - rsp = tc_gettfilter_dump(ys, req); - tc_gettfilter_req_dump_free(req); - if (!rsp) { - fprintf(stderr, "YNL: %s\n", ys->err.msg); - return -1; - } - - if (ynl_dump_empty(rsp)) - fprintf(stderr, "Error: no filters reported\n"); - else - ynl_dump_foreach(rsp, flt) tc_filter_print(flt); - - tc_gettfilter_list_free(rsp); - - return 0; -} - -static int tc_filter_del(struct ynl_sock *ys, int ifi) -{ - struct tc_deltfilter_req *req; - __u16 flags = 
NLM_F_REQUEST; - int ret; - - req = tc_deltfilter_req_alloc(); - if (!req) { - fprintf(stderr, "tc_deltfilter_req_alloc failed\n"); - return -1; - } - memset(req, 0, sizeof(*req)); - - req->_hdr.tcm_ifindex = ifi; - req->_hdr.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS); - req->_hdr.tcm_info = TC_H_MAKE(1 << 16, htons(ETH_P_8021Q)); - tc_deltfilter_req_set_nlflags(req, flags); - - ret = tc_deltfilter(ys, req); - if (ret) - fprintf(stderr, "tc_deltfilter failed: %s\n", ys->err.msg); - - tc_deltfilter_req_free(req); - - return ret; -} - -static int tc_clsact_add(struct ynl_sock *ys, int ifi) -{ - struct tc_newqdisc_req *req; - __u16 flags = NLM_F_REQUEST | NLM_F_EXCL | NLM_F_CREATE; - int ret; - - req = tc_newqdisc_req_alloc(); - if (!req) { - fprintf(stderr, "tc_newqdisc_req_alloc failed\n"); - return -1; - } - memset(req, 0, sizeof(*req)); - - req->_hdr.tcm_ifindex = ifi; - req->_hdr.tcm_parent = TC_H_CLSACT; - req->_hdr.tcm_handle = TC_HANDLE; - tc_newqdisc_req_set_nlflags(req, flags); - tc_newqdisc_req_set_kind(req, "clsact"); - - ret = tc_newqdisc(ys, req); - if (ret) - fprintf(stderr, "tc_newqdisc failed: %s\n", ys->err.msg); - - tc_newqdisc_req_free(req); - - return ret; -} - -static int tc_clsact_del(struct ynl_sock *ys, int ifi) -{ - struct tc_delqdisc_req *req; - __u16 flags = NLM_F_REQUEST; - int ret; - - req = tc_delqdisc_req_alloc(); - if (!req) { - fprintf(stderr, "tc_delqdisc_req_alloc failed\n"); - return -1; - } - memset(req, 0, sizeof(*req)); - - req->_hdr.tcm_ifindex = ifi; - req->_hdr.tcm_parent = TC_H_CLSACT; - req->_hdr.tcm_handle = TC_HANDLE; - tc_delqdisc_req_set_nlflags(req, flags); - - ret = tc_delqdisc(ys, req); - if (ret) - fprintf(stderr, "tc_delqdisc failed: %s\n", ys->err.msg); - - tc_delqdisc_req_free(req); - - return ret; -} - -static int tc_filter_config(struct ynl_sock *ys, int ifi) -{ - int ret = 0; - - if (tc_filter_add(ys, ifi)) - return -1; - - ret = tc_filter_show(ys, ifi); - - if (tc_filter_del(ys, ifi)) - return -1; 
- - return ret; -} - -int main(int argc, char **argv) -{ - struct ynl_error yerr; - struct ynl_sock *ys; - int ifi, ret = 0; - - if (argc < 2) { - fprintf(stderr, "Usage: %s <interface_name>\n", argv[0]); - return 1; - } - ifi = if_nametoindex(argv[1]); - if (!ifi) { - perror("if_nametoindex"); - return 1; - } - - ys = ynl_sock_create(&ynl_tc_family, &yerr); - if (!ys) { - fprintf(stderr, "YNL: %s\n", yerr.msg); - return 1; - } - - if (tc_clsact_add(ys, ifi)) { - ret = 2; - goto err_destroy; - } - - if (tc_filter_config(ys, ifi)) - ret = 3; - - if (tc_clsact_del(ys, ifi)) - ret = 4; - -err_destroy: - ynl_sock_destroy(ys); - return ret; -} diff --git a/tools/net/ynl/samples/tc.c b/tools/net/ynl/samples/tc.c deleted file mode 100644 index 0bfff0fdd792..000000000000 --- a/tools/net/ynl/samples/tc.c +++ /dev/null @@ -1,80 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <stdio.h> -#include <string.h> - -#include <ynl.h> - -#include <net/if.h> - -#include "tc-user.h" - -static void tc_qdisc_print(struct tc_getqdisc_rsp *q) -{ - char ifname[IF_NAMESIZE]; - const char *name; - - name = if_indextoname(q->_hdr.tcm_ifindex, ifname); - if (name) - printf("%16s: ", name); - - if (q->_len.kind) { - printf("%s ", q->kind); - - if (q->options._present.fq_codel) { - struct tc_fq_codel_attrs *fq_codel; - struct tc_fq_codel_xstats *stats; - - fq_codel = &q->options.fq_codel; - stats = q->stats2.app.fq_codel; - - if (fq_codel->_present.limit) - printf("limit: %dp ", fq_codel->limit); - if (fq_codel->_present.target) - printf("target: %dms ", - (fq_codel->target + 500) / 1000); - if (q->stats2.app._len.fq_codel) - printf("new_flow_cnt: %d ", - stats->qdisc_stats.new_flow_count); - } - } - - printf("\n"); -} - -int main(int argc, char **argv) -{ - struct tc_getqdisc_req_dump *req; - struct tc_getqdisc_list *rsp; - struct ynl_error yerr; - struct ynl_sock *ys; - - ys = ynl_sock_create(&ynl_tc_family, &yerr); - if (!ys) { - fprintf(stderr, "YNL: %s\n", yerr.msg); - return 1; - } - 
- req = tc_getqdisc_req_dump_alloc(); - if (!req) - goto err_destroy; - - rsp = tc_getqdisc_dump(ys, req); - tc_getqdisc_req_dump_free(req); - if (!rsp) - goto err_close; - - if (ynl_dump_empty(rsp)) - fprintf(stderr, "Error: no addresses reported\n"); - ynl_dump_foreach(rsp, qdisc) - tc_qdisc_print(qdisc); - tc_getqdisc_list_free(rsp); - - ynl_sock_destroy(ys); - return 0; - -err_close: - fprintf(stderr, "YNL: %s\n", ys->err.msg); -err_destroy: - ynl_sock_destroy(ys); - return 2; -} diff --git a/tools/net/ynl/samples/.gitignore b/tools/net/ynl/tests/.gitignore index 05087ee323ba..a7832ebfdbbc 100644 --- a/tools/net/ynl/samples/.gitignore +++ b/tools/net/ynl/tests/.gitignore @@ -1,10 +1,10 @@ -ethtool devlink +ethtool netdev ovs -page-pool rt-addr rt-link rt-route tc tc-filter-add +wireguard diff --git a/tools/net/ynl/tests/Makefile b/tools/net/ynl/tests/Makefile index c1df2e001255..40827ca8e579 100644 --- a/tools/net/ynl/tests/Makefile +++ b/tools/net/ynl/tests/Makefile @@ -1,32 +1,97 @@ # SPDX-License-Identifier: GPL-2.0 # Makefile for YNL tests -TESTS := \ +include ../Makefile.deps + +CC=gcc +CFLAGS += -std=gnu11 -O2 -W -Wall -Wextra -Wno-unused-parameter -Wshadow \ + -I../lib/ -I../generated/ -I../../../testing/selftests/ \ + -idirafter $(UAPI_PATH) +ifneq ("$(NDEBUG)","1") + CFLAGS += -g -fsanitize=address -fsanitize=leak -static-libasan +endif + +LDLIBS=../lib/ynl.a ../generated/protos.a + +TEST_PROGS := \ + devlink.sh \ + ethtool.sh \ + rt-addr.sh \ + rt-route.sh \ test_ynl_cli.sh \ test_ynl_ethtool.sh \ -# end of TESTS +# end of TEST_PROGS + +TEST_GEN_PROGS := \ + netdev \ + ovs \ + rt-link \ + tc \ +# end of TEST_GEN_PROGS + +TEST_GEN_FILES := \ + devlink \ + ethtool \ + rt-addr \ + rt-route \ +# end of TEST_GEN_FILES + +TEST_FILES := \ + ethtool.py \ + ynl_nsim_lib.sh \ +# end of TEST_FILES + +CFLAGS_netdev:=$(CFLAGS_netdev) $(CFLAGS_rt-link) +CFLAGS_ovs:=$(CFLAGS_ovs_datapath) + +include $(wildcard *.d) -all: $(TESTS) +INSTALL_PATH ?= 
$(DESTDIR)/usr/share/kselftest + +all: $(TEST_GEN_PROGS) $(TEST_GEN_FILES) + +../lib/ynl.a: + @$(MAKE) -C ../lib + +../generated/protos.a: + @$(MAKE) -C ../generated + +$(TEST_GEN_PROGS) $(TEST_GEN_FILES): %: %.c ../lib/ynl.a ../generated/protos.a + @echo -e '\tCC test $@' + @$(COMPILE.c) $(CFLAGS_$@) $@.c -o $@.o + @$(LINK.c) $@.o -o $@ $(LDLIBS) run_tests: - @for test in $(TESTS); do \ + @for test in $(TEST_PROGS); do \ ./$$test; \ done -install: $(TESTS) - @mkdir -p $(DESTDIR)/usr/bin - @mkdir -p $(DESTDIR)/usr/share/kselftest - @cp ../../../testing/selftests/kselftest/ktap_helpers.sh $(DESTDIR)/usr/share/kselftest/ - @for test in $(TESTS); do \ - name=$$(basename $$test .sh); \ +install: $(TEST_GEN_PROGS) $(TEST_GEN_FILES) + @mkdir -p $(INSTALL_PATH)/ynl + @cp ../../../testing/selftests/kselftest/ktap_helpers.sh $(INSTALL_PATH)/ + @for test in $(TEST_PROGS); do \ + name=$$(basename $$test); \ sed -e 's|^ynl=.*|ynl="ynl"|' \ -e 's|^ynl_ethtool=.*|ynl_ethtool="ynl-ethtool"|' \ - -e 's|KSELFTEST_KTAP_HELPERS=.*|KSELFTEST_KTAP_HELPERS="/usr/share/kselftest/ktap_helpers.sh"|' \ - $$test > $(DESTDIR)/usr/bin/$$name; \ - chmod +x $(DESTDIR)/usr/bin/$$name; \ + -e 's|KSELFTEST_KTAP_HELPERS=.*|KSELFTEST_KTAP_HELPERS="$(INSTALL_PATH)/ktap_helpers.sh"|' \ + $$test > $(INSTALL_PATH)/ynl/$$name; \ + chmod +x $(INSTALL_PATH)/ynl/$$name; \ done + @for file in $(TEST_FILES); do \ + cp $$file $(INSTALL_PATH)/ynl/$$file; \ + done + @for bin in $(TEST_GEN_PROGS) $(TEST_GEN_FILES); do \ + cp $$bin $(INSTALL_PATH)/ynl/$$bin; \ + done + @for test in $(TEST_PROGS) $(TEST_GEN_PROGS); do \ + echo "ynl:$$test"; \ + done > $(INSTALL_PATH)/kselftest-list.txt + +clean: + rm -f *.o *.d *~ -clean distclean: - @# Nothing to clean +distclean: clean + rm -f $(TEST_GEN_PROGS) $(TEST_GEN_FILES) -.PHONY: all install clean run_tests +.PHONY: all install clean distclean run_tests +.DEFAULT_GOAL=all diff --git a/tools/net/ynl/tests/config b/tools/net/ynl/tests/config index 339f1309c03f..75c0fe72391f 
100644 --- a/tools/net/ynl/tests/config +++ b/tools/net/ynl/tests/config @@ -1,6 +1,14 @@ CONFIG_DUMMY=m CONFIG_INET_DIAG=y CONFIG_IPV6=y +CONFIG_NET_ACT_VLAN=m +CONFIG_NET_CLS_ACT=y +CONFIG_NET_CLS_FLOWER=m +CONFIG_NET_SCH_FQ_CODEL=m +CONFIG_NET_SCH_INGRESS=m CONFIG_NET_NS=y +CONFIG_NET_SCHED=y CONFIG_NETDEVSIM=m +CONFIG_NETKIT=y +CONFIG_OPENVSWITCH=m CONFIG_VETH=m diff --git a/tools/net/ynl/tests/devlink.c b/tools/net/ynl/tests/devlink.c new file mode 100644 index 000000000000..2e668bb15af1 --- /dev/null +++ b/tools/net/ynl/tests/devlink.c @@ -0,0 +1,101 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <kselftest_harness.h> + +#include "devlink-user.h" + +FIXTURE(devlink) +{ + struct ynl_sock *ys; +}; + +FIXTURE_SETUP(devlink) +{ + self->ys = ynl_sock_create(&ynl_devlink_family, NULL); + ASSERT_NE(NULL, self->ys) + TH_LOG("failed to create devlink socket"); +} + +FIXTURE_TEARDOWN(devlink) +{ + ynl_sock_destroy(self->ys); +} + +TEST_F(devlink, dump) +{ + struct devlink_get_list *devs; + + devs = devlink_get_dump(self->ys); + ASSERT_NE(NULL, devs) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + if (ynl_dump_empty(devs)) { + devlink_get_list_free(devs); + SKIP(return, "no entries in dump"); + } + + ynl_dump_foreach(devs, d) { + EXPECT_TRUE((bool)d->_len.bus_name); + EXPECT_TRUE((bool)d->_len.dev_name); + ksft_print_msg("%s/%s\n", d->bus_name, d->dev_name); + } + + devlink_get_list_free(devs); +} + +TEST_F(devlink, info) +{ + struct devlink_get_list *devs; + + devs = devlink_get_dump(self->ys); + ASSERT_NE(NULL, devs) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + if (ynl_dump_empty(devs)) { + devlink_get_list_free(devs); + SKIP(return, "no devices to query"); + } + + ynl_dump_foreach(devs, d) { + struct devlink_info_get_req *info_req; + struct devlink_info_get_rsp *info_rsp; + unsigned int i; + + EXPECT_TRUE((bool)d->_len.bus_name); + EXPECT_TRUE((bool)d->_len.dev_name); + 
ksft_print_msg("%s/%s:\n", d->bus_name, d->dev_name); + + info_req = devlink_info_get_req_alloc(); + ASSERT_NE(NULL, info_req); + devlink_info_get_req_set_bus_name(info_req, d->bus_name); + devlink_info_get_req_set_dev_name(info_req, d->dev_name); + + info_rsp = devlink_info_get(self->ys, info_req); + devlink_info_get_req_free(info_req); + ASSERT_NE(NULL, info_rsp) { + devlink_get_list_free(devs); + TH_LOG("info_get failed: %s", self->ys->err.msg); + } + + EXPECT_TRUE((bool)info_rsp->_len.info_driver_name); + if (info_rsp->_len.info_driver_name) + ksft_print_msg(" driver: %s\n", + info_rsp->info_driver_name); + if (info_rsp->_count.info_version_running) + ksft_print_msg(" running fw:\n"); + for (i = 0; i < info_rsp->_count.info_version_running; i++) + ksft_print_msg(" %s: %s\n", + info_rsp->info_version_running[i].info_version_name, + info_rsp->info_version_running[i].info_version_value); + devlink_info_get_rsp_free(info_rsp); + } + devlink_get_list_free(devs); +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/devlink.sh b/tools/net/ynl/tests/devlink.sh new file mode 100755 index 000000000000..a684c749aa5e --- /dev/null +++ b/tools/net/ynl/tests/devlink.sh @@ -0,0 +1,5 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +source "$(dirname "$(realpath "$0")")/ynl_nsim_lib.sh" +nsim_setup +"$(dirname "$(realpath "$0")")/devlink" diff --git a/tools/net/ynl/tests/ethtool.c b/tools/net/ynl/tests/ethtool.c new file mode 100644 index 000000000000..926a75d23c9b --- /dev/null +++ b/tools/net/ynl/tests/ethtool.c @@ -0,0 +1,92 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <net/if.h> + +#include <kselftest_harness.h> + +#include "ethtool-user.h" + +FIXTURE(ethtool) +{ + struct ynl_sock *ys; +}; + +FIXTURE_SETUP(ethtool) +{ + self->ys = ynl_sock_create(&ynl_ethtool_family, NULL); + ASSERT_NE(NULL, self->ys) + TH_LOG("failed to create ethtool socket"); +} + +FIXTURE_TEARDOWN(ethtool) +{ + 
ynl_sock_destroy(self->ys); +} + +TEST_F(ethtool, channels) +{ + struct ethtool_channels_get_req_dump creq = {}; + struct ethtool_channels_get_list *channels; + + creq._present.header = 1; /* ethtool needs an empty nest */ + channels = ethtool_channels_get_dump(self->ys, &creq); + ASSERT_NE(NULL, channels) { + TH_LOG("channels dump failed: %s", self->ys->err.msg); + } + + if (ynl_dump_empty(channels)) { + ethtool_channels_get_list_free(channels); + SKIP(return, "no entries in channels dump"); + } + + ynl_dump_foreach(channels, dev) { + EXPECT_TRUE((bool)dev->header._len.dev_name); + ksft_print_msg("%8s: ", dev->header.dev_name); + EXPECT_TRUE(dev->_present.rx_count || + dev->_present.tx_count || + dev->_present.combined_count); + if (dev->_present.rx_count) + printf("rx %d ", dev->rx_count); + if (dev->_present.tx_count) + printf("tx %d ", dev->tx_count); + if (dev->_present.combined_count) + printf("combined %d ", dev->combined_count); + printf("\n"); + } + ethtool_channels_get_list_free(channels); +} + +TEST_F(ethtool, rings) +{ + struct ethtool_rings_get_req_dump rreq = {}; + struct ethtool_rings_get_list *rings; + + rreq._present.header = 1; /* ethtool needs an empty nest */ + rings = ethtool_rings_get_dump(self->ys, &rreq); + ASSERT_NE(NULL, rings) { + TH_LOG("rings dump failed: %s", self->ys->err.msg); + } + + if (ynl_dump_empty(rings)) { + ethtool_rings_get_list_free(rings); + SKIP(return, "no entries in rings dump"); + } + + ynl_dump_foreach(rings, dev) { + EXPECT_TRUE((bool)dev->header._len.dev_name); + ksft_print_msg("%8s: ", dev->header.dev_name); + EXPECT_TRUE(dev->_present.rx || dev->_present.tx); + if (dev->_present.rx) + printf("rx %d ", dev->rx); + if (dev->_present.tx) + printf("tx %d ", dev->tx); + printf("\n"); + } + ethtool_rings_get_list_free(rings); +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/pyynl/ethtool.py b/tools/net/ynl/tests/ethtool.py index f1a2a2a89985..db3b62c652e7 100755 --- a/tools/net/ynl/pyynl/ethtool.py +++ 
b/tools/net/ynl/tests/ethtool.py @@ -14,7 +14,7 @@ import re import os # pylint: disable=no-name-in-module,wrong-import-position -sys.path.append(pathlib.Path(__file__).resolve().parent.as_posix()) +sys.path.append(pathlib.Path(__file__).resolve().parent.parent.joinpath('pyynl').as_posix()) # pylint: disable=import-error from cli import schema_dir, spec_dir from lib import YnlFamily @@ -84,9 +84,9 @@ def print_speed(name, value): speed = [ k for k, v in value.items() if v and speed_re.match(k) ] print(f'{name}: {" ".join(speed)}') -def doit(ynl, args, op_name): +def do_set(ynl, args, op_name): """ - Prepare request header, parse arguments and doit. + Prepare request header, parse arguments and do a set operation. """ req = { 'header': { @@ -97,26 +97,24 @@ def doit(ynl, args, op_name): args_to_req(ynl, op_name, args.args, req) ynl.do(op_name, req) -def dumpit(ynl, args, op_name, extra=None): +def do_get(ynl, args, op_name, extra=None): """ - Prepare request header, parse arguments and dumpit (filtering out the - devices we're not interested in). + Prepare request header and get info for a specific device using doit. 
""" extra = extra or {} - reply = ynl.dump(op_name, { 'header': {} } | extra) + req = {'header': {'dev-name': args.device}} + req['header'].update(extra.pop('header', {})) + req.update(extra) + + reply = ynl.do(op_name, req) if not reply: return {} - for msg in reply: - if msg['header']['dev-name'] == args.device: - if args.json: - pprint.PrettyPrinter().pprint(msg) - sys.exit(0) - msg.pop('header', None) - return msg - - print(f"Not supported for device {args.device}") - sys.exit(1) + if args.json: + pprint.PrettyPrinter().pprint(reply) + sys.exit(0) + reply.pop('header', None) + return reply def bits_to_dict(attr): """ @@ -168,12 +166,19 @@ def main(): parser.add_argument('device', metavar='device', type=str) parser.add_argument('args', metavar='args', type=str, nargs='*') + dbg_group = parser.add_argument_group('Debug options') + dbg_group.add_argument('--dbg-small-recv', default=0, const=4000, + action='store', nargs='?', type=int, metavar='INT', + help="Length of buffers used for recv()") + args = parser.parse_args() spec = os.path.join(spec_dir(), 'ethtool.yaml') schema = os.path.join(schema_dir(), 'genetlink-legacy.yaml') - ynl = YnlFamily(spec, schema) + ynl = YnlFamily(spec, schema, recv_size=args.dbg_small_recv) + if args.dbg_small_recv: + ynl.set_recv_dbg(True) if args.set_priv_flags: # TODO: parse the bitmask @@ -181,15 +186,15 @@ def main(): return if args.set_eee: - doit(ynl, args, 'eee-set') + do_set(ynl, args, 'eee-set') return if args.set_pause: - doit(ynl, args, 'pause-set') + do_set(ynl, args, 'pause-set') return if args.set_coalesce: - doit(ynl, args, 'coalesce-set') + do_set(ynl, args, 'coalesce-set') return if args.set_features: @@ -198,20 +203,20 @@ def main(): return if args.set_channels: - doit(ynl, args, 'channels-set') + do_set(ynl, args, 'channels-set') return if args.set_ring: - doit(ynl, args, 'rings-set') + do_set(ynl, args, 'rings-set') return if args.show_priv_flags: - flags = bits_to_dict(dumpit(ynl, args, 
'privflags-get')['flags']) + flags = bits_to_dict(do_get(ynl, args, 'privflags-get')['flags']) print_field(flags) return if args.show_eee: - eee = dumpit(ynl, args, 'eee-get') + eee = do_get(ynl, args, 'eee-get') ours = bits_to_dict(eee['modes-ours']) peer = bits_to_dict(eee['modes-peer']) @@ -232,18 +237,18 @@ def main(): return if args.show_pause: - print_field(dumpit(ynl, args, 'pause-get'), + print_field(do_get(ynl, args, 'pause-get'), ('autoneg', 'Autonegotiate', 'bool'), ('rx', 'RX', 'bool'), ('tx', 'TX', 'bool')) return if args.show_coalesce: - print_field(dumpit(ynl, args, 'coalesce-get')) + print_field(do_get(ynl, args, 'coalesce-get')) return if args.show_features: - reply = dumpit(ynl, args, 'features-get') + reply = do_get(ynl, args, 'features-get') available = bits_to_dict(reply['hw']) requested = bits_to_dict(reply['wanted']).keys() active = bits_to_dict(reply['active']).keys() @@ -270,7 +275,7 @@ def main(): return if args.show_channels: - reply = dumpit(ynl, args, 'channels-get') + reply = do_get(ynl, args, 'channels-get') print(f'Channel parameters for {args.device}:') print('Pre-set maximums:') @@ -290,7 +295,7 @@ def main(): return if args.show_ring: - reply = dumpit(ynl, args, 'channels-get') + reply = do_get(ynl, args, 'channels-get') print(f'Ring parameters for {args.device}:') @@ -319,7 +324,7 @@ def main(): print('NIC statistics:') # TODO: pass id? 
- strset = dumpit(ynl, args, 'strset-get') + strset = do_get(ynl, args, 'strset-get') pprint.PrettyPrinter().pprint(strset) req = { @@ -338,7 +343,7 @@ def main(): }, } - rsp = dumpit(ynl, args, 'stats-get', req) + rsp = do_get(ynl, args, 'stats-get', req) pprint.PrettyPrinter().pprint(rsp) return @@ -349,7 +354,7 @@ def main(): }, } - tsinfo = dumpit(ynl, args, 'tsinfo-get', req) + tsinfo = do_get(ynl, args, 'tsinfo-get', req) print(f'Time stamping parameters for {args.device}:') @@ -377,7 +382,7 @@ def main(): return print(f'Settings for {args.device}:') - linkmodes = dumpit(ynl, args, 'linkmodes-get') + linkmodes = do_get(ynl, args, 'linkmodes-get') ours = bits_to_dict(linkmodes['ours']) supported_ports = ('TP', 'AUI', 'BNC', 'MII', 'FIBRE', 'Backplane') @@ -425,7 +430,7 @@ def main(): 5: 'Directly Attached Copper', 0xef: 'None', } - linkinfo = dumpit(ynl, args, 'linkinfo-get') + linkinfo = do_get(ynl, args, 'linkinfo-get') print(f'Port: {ports.get(linkinfo["port"], "Other")}') print_field(linkinfo, ('phyaddr', 'PHYAD')) @@ -447,11 +452,11 @@ def main(): mdix = mdix_ctrl.get(linkinfo['tp-mdix'], 'Unknown (auto)') print(f'MDI-X: {mdix}') - debug = dumpit(ynl, args, 'debug-get') + debug = do_get(ynl, args, 'debug-get') msgmask = bits_to_dict(debug.get("msgmask", [])).keys() print(f'Current message level: {" ".join(msgmask)}') - linkstate = dumpit(ynl, args, 'linkstate-get') + linkstate = do_get(ynl, args, 'linkstate-get') detected_states = { 0: 'no', 1: 'yes', diff --git a/tools/net/ynl/tests/ethtool.sh b/tools/net/ynl/tests/ethtool.sh new file mode 100755 index 000000000000..0859ddd697e8 --- /dev/null +++ b/tools/net/ynl/tests/ethtool.sh @@ -0,0 +1,5 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +source "$(dirname "$(realpath "$0")")/ynl_nsim_lib.sh" +nsim_setup +"$(dirname "$(realpath "$0")")/ethtool" diff --git a/tools/net/ynl/tests/netdev.c b/tools/net/ynl/tests/netdev.c new file mode 100644 index 000000000000..f849e3d7f4b3 --- /dev/null +++ 
b/tools/net/ynl/tests/netdev.c @@ -0,0 +1,231 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <net/if.h> + +#include <kselftest_harness.h> + +#include "netdev-user.h" +#include "rt-link-user.h" + +static void netdev_print_device(struct __test_metadata *_metadata, + struct netdev_dev_get_rsp *d, unsigned int op) +{ + char ifname[IF_NAMESIZE]; + const char *name; + + EXPECT_TRUE((bool)d->_present.ifindex); + if (!d->_present.ifindex) + return; + + name = if_indextoname(d->ifindex, ifname); + EXPECT_TRUE((bool)name); + if (name) + ksft_print_msg("%8s[%d]\t", name, d->ifindex); + else + ksft_print_msg("[%d]\t", d->ifindex); + + EXPECT_TRUE((bool)d->_present.xdp_features); + if (!d->_present.xdp_features) + return; + + printf("xdp-features (%llx):", d->xdp_features); + for (int i = 0; d->xdp_features >= 1U << i; i++) { + if (d->xdp_features & (1U << i)) + printf(" %s", netdev_xdp_act_str(1 << i)); + } + + printf(" xdp-rx-metadata-features (%llx):", + d->xdp_rx_metadata_features); + for (int i = 0; d->xdp_rx_metadata_features >= 1U << i; i++) { + if (d->xdp_rx_metadata_features & (1U << i)) + printf(" %s", + netdev_xdp_rx_metadata_str(1 << i)); + } + + printf(" xsk-features (%llx):", d->xsk_features); + for (int i = 0; d->xsk_features >= 1U << i; i++) { + if (d->xsk_features & (1U << i)) + printf(" %s", netdev_xsk_flags_str(1 << i)); + } + + printf(" xdp-zc-max-segs=%u", d->xdp_zc_max_segs); + + name = netdev_op_str(op); + if (name) + printf(" (ntf: %s)", name); + printf("\n"); +} + +static int veth_create(struct ynl_sock *ys_link) +{ + struct rt_link_getlink_ntf *ntf_gl; + struct rt_link_newlink_req *req; + struct ynl_ntf_base_type *ntf; + int ret; + + req = rt_link_newlink_req_alloc(); + if (!req) + return -1; + + rt_link_newlink_req_set_nlflags(req, NLM_F_CREATE | NLM_F_ECHO); + rt_link_newlink_req_set_linkinfo_kind(req, "veth"); + + ret = rt_link_newlink(ys_link, req); + rt_link_newlink_req_free(req); + 
if (ret) + return -1; + + if (!ynl_has_ntf(ys_link)) + return 0; + + ntf = ynl_ntf_dequeue(ys_link); + if (!ntf || ntf->cmd != RTM_NEWLINK) { + ynl_ntf_free(ntf); + return 0; + } + ntf_gl = (void *)ntf; + ret = ntf_gl->obj._hdr.ifi_index; + ynl_ntf_free(ntf); + + return ret; +} + +static void veth_delete(struct __test_metadata *_metadata, + struct ynl_sock *ys_link, int ifindex) +{ + struct rt_link_dellink_req *req; + + req = rt_link_dellink_req_alloc(); + ASSERT_NE(NULL, req); + + req->_hdr.ifi_index = ifindex; + EXPECT_EQ(0, rt_link_dellink(ys_link, req)); + rt_link_dellink_req_free(req); +} + +FIXTURE(netdev) +{ + struct ynl_sock *ys; + struct ynl_sock *ys_link; +}; + +FIXTURE_SETUP(netdev) +{ + struct ynl_error yerr; + + self->ys = ynl_sock_create(&ynl_netdev_family, &yerr); + ASSERT_NE(NULL, self->ys) { + TH_LOG("Failed to create YNL netdev socket: %s", yerr.msg); + } +} + +FIXTURE_TEARDOWN(netdev) +{ + if (self->ys_link) + ynl_sock_destroy(self->ys_link); + ynl_sock_destroy(self->ys); +} + +TEST_F(netdev, dump) +{ + struct netdev_dev_get_list *devs; + + devs = netdev_dev_get_dump(self->ys); + ASSERT_NE(NULL, devs) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + if (ynl_dump_empty(devs)) { + netdev_dev_get_list_free(devs); + SKIP(return, "no entries in dump"); + } + + ynl_dump_foreach(devs, d) + netdev_print_device(_metadata, d, 0); + + netdev_dev_get_list_free(devs); +} + +TEST_F(netdev, get) +{ + struct netdev_dev_get_list *devs; + struct netdev_dev_get_req *req; + struct netdev_dev_get_rsp *dev; + int ifindex = 0; + + devs = netdev_dev_get_dump(self->ys); + ASSERT_NE(NULL, devs) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + ynl_dump_foreach(devs, d) { + if (d->_present.ifindex) { + ifindex = d->ifindex; + break; + } + } + netdev_dev_get_list_free(devs); + + if (!ifindex) + SKIP(return, "no device to query"); + + req = netdev_dev_get_req_alloc(); + ASSERT_NE(NULL, req); + netdev_dev_get_req_set_ifindex(req, ifindex); + + dev = 
netdev_dev_get(self->ys, req); + netdev_dev_get_req_free(req); + ASSERT_NE(NULL, dev) { + TH_LOG("dev_get failed: %s", self->ys->err.msg); + } + + netdev_print_device(_metadata, dev, 0); + netdev_dev_get_rsp_free(dev); +} + +TEST_F(netdev, ntf_check) +{ + struct ynl_ntf_base_type *ntf; + int veth_ifindex; + bool received; + int ret; + + ret = ynl_subscribe(self->ys, "mgmt"); + ASSERT_EQ(0, ret) { + TH_LOG("subscribe failed: %s", self->ys->err.msg); + } + + self->ys_link = ynl_sock_create(&ynl_rt_link_family, NULL); + ASSERT_NE(NULL, self->ys_link) + TH_LOG("failed to create rt-link socket"); + + veth_ifindex = veth_create(self->ys_link); + ASSERT_GT(veth_ifindex, 0) + TH_LOG("failed to create veth"); + + ynl_ntf_check(self->ys); + + ntf = ynl_ntf_dequeue(self->ys); + received = ntf; + if (ntf) { + netdev_print_device(_metadata, + (struct netdev_dev_get_rsp *)&ntf->data, + ntf->cmd); + ynl_ntf_free(ntf); + } + + /* Drain any remaining notifications */ + while ((ntf = ynl_ntf_dequeue(self->ys))) + ynl_ntf_free(ntf); + + veth_delete(_metadata, self->ys_link, veth_ifindex); + + ASSERT_TRUE(received) + TH_LOG("no notification received"); +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/ovs.c b/tools/net/ynl/tests/ovs.c new file mode 100644 index 000000000000..d49f5a8e647e --- /dev/null +++ b/tools/net/ynl/tests/ovs.c @@ -0,0 +1,108 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <kselftest_harness.h> + +#include "ovs_datapath-user.h" + +static void ovs_print_datapath(struct __test_metadata *_metadata, + struct ovs_datapath_get_rsp *dp) +{ + EXPECT_TRUE((bool)dp->_len.name); + if (!dp->_len.name) + return; + + EXPECT_TRUE((bool)dp->_hdr.dp_ifindex); + ksft_print_msg("%s(%d): pid:%u cache:%u\n", + dp->name, dp->_hdr.dp_ifindex, + dp->upcall_pid, dp->masks_cache_size); +} + +FIXTURE(ovs) +{ + struct ynl_sock *ys; + char *dp_name; +}; + +FIXTURE_SETUP(ovs) +{ + self->ys = 
ynl_sock_create(&ynl_ovs_datapath_family, NULL); + ASSERT_NE(NULL, self->ys) + TH_LOG("failed to create OVS datapath socket"); +} + +FIXTURE_TEARDOWN(ovs) +{ + if (self->dp_name) { + struct ovs_datapath_del_req *req; + + req = ovs_datapath_del_req_alloc(); + if (req) { + ovs_datapath_del_req_set_name(req, self->dp_name); + ovs_datapath_del(self->ys, req); + ovs_datapath_del_req_free(req); + } + } + ynl_sock_destroy(self->ys); +} + +TEST_F(ovs, crud) +{ + struct ovs_datapath_get_req_dump *dreq; + struct ovs_datapath_new_req *new_req; + struct ovs_datapath_get_list *dps; + struct ovs_datapath_get_rsp *dp; + struct ovs_datapath_get_req *req; + bool found = false; + int err; + + new_req = ovs_datapath_new_req_alloc(); + ASSERT_NE(NULL, new_req); + ovs_datapath_new_req_set_upcall_pid(new_req, 1); + ovs_datapath_new_req_set_name(new_req, "ynl-test"); + + err = ovs_datapath_new(self->ys, new_req); + ovs_datapath_new_req_free(new_req); + ASSERT_EQ(0, err) { + TH_LOG("new failed: %s", self->ys->err.msg); + } + self->dp_name = "ynl-test"; + + ksft_print_msg("get:\n"); + req = ovs_datapath_get_req_alloc(); + ASSERT_NE(NULL, req); + ovs_datapath_get_req_set_name(req, "ynl-test"); + + dp = ovs_datapath_get(self->ys, req); + ovs_datapath_get_req_free(req); + ASSERT_NE(NULL, dp) { + TH_LOG("get failed: %s", self->ys->err.msg); + } + + ovs_print_datapath(_metadata, dp); + EXPECT_STREQ("ynl-test", dp->name); + ovs_datapath_get_rsp_free(dp); + + ksft_print_msg("dump:\n"); + dreq = ovs_datapath_get_req_dump_alloc(); + ASSERT_NE(NULL, dreq); + + dps = ovs_datapath_get_dump(self->ys, dreq); + ovs_datapath_get_req_dump_free(dreq); + ASSERT_NE(NULL, dps) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + ynl_dump_foreach(dps, d) { + ovs_print_datapath(_metadata, d); + if (d->name && !strcmp(d->name, "ynl-test")) + found = true; + } + ovs_datapath_get_list_free(dps); + EXPECT_TRUE(found); +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/rt-addr.c 
b/tools/net/ynl/tests/rt-addr.c new file mode 100644 index 000000000000..f6c3715b2f20 --- /dev/null +++ b/tools/net/ynl/tests/rt-addr.c @@ -0,0 +1,111 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <arpa/inet.h> +#include <net/if.h> + +#include <kselftest_harness.h> + +#include "rt-addr-user.h" + +static void rt_addr_print(struct __test_metadata *_metadata, + struct rt_addr_getaddr_rsp *a) +{ + char ifname[IF_NAMESIZE]; + char addr_str[64]; + const char *addr; + const char *name; + + name = if_indextoname(a->_hdr.ifa_index, ifname); + EXPECT_NE(NULL, name); + if (name) + ksft_print_msg("%16s: ", name); + + EXPECT_TRUE(a->_len.address == 4 || a->_len.address == 16); + switch (a->_len.address) { + case 4: + addr = inet_ntop(AF_INET, a->address, + addr_str, sizeof(addr_str)); + break; + case 16: + addr = inet_ntop(AF_INET6, a->address, + addr_str, sizeof(addr_str)); + break; + default: + addr = NULL; + break; + } + if (addr) + printf("%s", addr); + else + printf("[%d]", a->_len.address); + + printf("\n"); +} + +FIXTURE(rt_addr) +{ + struct ynl_sock *ys; +}; + +FIXTURE_SETUP(rt_addr) +{ + struct ynl_error yerr; + + self->ys = ynl_sock_create(&ynl_rt_addr_family, &yerr); + ASSERT_NE(NULL, self->ys) + TH_LOG("failed to create rt-addr socket: %s", yerr.msg); +} + +FIXTURE_TEARDOWN(rt_addr) +{ + ynl_sock_destroy(self->ys); +} + +TEST_F(rt_addr, dump) +{ + struct rt_addr_getaddr_list *rsp; + struct rt_addr_getaddr_req *req; + struct in6_addr v6_expected; + struct in_addr v4_expected; + bool found_v4 = false; + bool found_v6 = false; + + /* The bash wrapper for this test adds these addresses on nsim0, + * make sure we can find them in the dump. 
+ */ + inet_pton(AF_INET, "192.168.1.1", &v4_expected); + inet_pton(AF_INET6, "2001:db8::1", &v6_expected); + + req = rt_addr_getaddr_req_alloc(); + ASSERT_NE(NULL, req); + + rsp = rt_addr_getaddr_dump(self->ys, req); + rt_addr_getaddr_req_free(req); + ASSERT_NE(NULL, rsp) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + ASSERT_FALSE(ynl_dump_empty(rsp)) { + rt_addr_getaddr_list_free(rsp); + TH_LOG("no addresses reported"); + } + + ynl_dump_foreach(rsp, addr) { + rt_addr_print(_metadata, addr); + + found_v4 |= addr->_len.address == 4 && + !memcmp(addr->address, &v4_expected, 4); + found_v6 |= addr->_len.address == 16 && + !memcmp(addr->address, &v6_expected, 16); + } + rt_addr_getaddr_list_free(rsp); + + EXPECT_TRUE(found_v4); + EXPECT_TRUE(found_v6); +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/rt-addr.sh b/tools/net/ynl/tests/rt-addr.sh new file mode 100755 index 000000000000..87661236d126 --- /dev/null +++ b/tools/net/ynl/tests/rt-addr.sh @@ -0,0 +1,5 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +source "$(dirname "$(realpath "$0")")/ynl_nsim_lib.sh" +nsim_setup +"$(dirname "$(realpath "$0")")/rt-addr" diff --git a/tools/net/ynl/tests/rt-link.c b/tools/net/ynl/tests/rt-link.c new file mode 100644 index 000000000000..ef619ce6143f --- /dev/null +++ b/tools/net/ynl/tests/rt-link.c @@ -0,0 +1,206 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <arpa/inet.h> +#include <net/if.h> + +#include <kselftest_harness.h> + +#include "rt-link-user.h" + +static void rt_link_print(struct __test_metadata *_metadata, + struct rt_link_getlink_rsp *r) +{ + unsigned int i; + + EXPECT_TRUE((bool)r->_hdr.ifi_index); + ksft_print_msg("%3d: ", r->_hdr.ifi_index); + + EXPECT_TRUE((bool)r->_len.ifname); + if (r->_len.ifname) + printf("%6s: ", r->ifname); + + if (r->_present.mtu) + printf("mtu %5d ", r->mtu); + + if (r->linkinfo._len.kind) + printf("kind %-8s ", r->linkinfo.kind); + else + printf(" 
%8s ", ""); + + if (r->prop_list._count.alt_ifname) { + printf("altname "); + for (i = 0; i < r->prop_list._count.alt_ifname; i++) + printf("%s ", r->prop_list.alt_ifname[i]->str); + printf(" "); + } + + if (r->linkinfo._present.data && r->linkinfo.data._present.netkit) { + struct rt_link_linkinfo_netkit_attrs *netkit; + const char *name; + + netkit = &r->linkinfo.data.netkit; + printf("primary %d ", netkit->primary); + + name = NULL; + if (netkit->_present.policy) + name = rt_link_netkit_policy_str(netkit->policy); + if (name) + printf("policy %s ", name); + } + + printf("\n"); +} + +static int netkit_create(struct ynl_sock *ys) +{ + struct rt_link_getlink_ntf *ntf_gl; + struct rt_link_newlink_req *req; + struct ynl_ntf_base_type *ntf; + int ret; + + req = rt_link_newlink_req_alloc(); + if (!req) + return -1; + + rt_link_newlink_req_set_nlflags(req, NLM_F_CREATE | NLM_F_ECHO); + rt_link_newlink_req_set_linkinfo_kind(req, "netkit"); + rt_link_newlink_req_set_linkinfo_data_netkit_policy(req, NETKIT_DROP); + + ret = rt_link_newlink(ys, req); + rt_link_newlink_req_free(req); + if (ret) + return -1; + + if (!ynl_has_ntf(ys)) + return 0; + + ntf = ynl_ntf_dequeue(ys); + if (!ntf || ntf->cmd != RTM_NEWLINK) { + ynl_ntf_free(ntf); + return 0; + } + ntf_gl = (void *)ntf; + ret = ntf_gl->obj._hdr.ifi_index; + ynl_ntf_free(ntf); + + return ret; +} + +static void netkit_delete(struct __test_metadata *_metadata, + struct ynl_sock *ys, int ifindex) +{ + struct rt_link_dellink_req *req; + + req = rt_link_dellink_req_alloc(); + ASSERT_NE(NULL, req); + + req->_hdr.ifi_index = ifindex; + EXPECT_EQ(0, rt_link_dellink(ys, req)); + rt_link_dellink_req_free(req); +} + +FIXTURE(rt_link) +{ + struct ynl_sock *ys; +}; + +FIXTURE_SETUP(rt_link) +{ + struct ynl_error yerr; + + self->ys = ynl_sock_create(&ynl_rt_link_family, &yerr); + ASSERT_NE(NULL, self->ys) { + TH_LOG("failed to create rt-link socket: %s", yerr.msg); + } +} + +FIXTURE_TEARDOWN(rt_link) +{ + ynl_sock_destroy(self->ys); +} 
+ +TEST_F(rt_link, dump) +{ + struct rt_link_getlink_req_dump *req; + struct rt_link_getlink_list *rsp; + + req = rt_link_getlink_req_dump_alloc(); + ASSERT_NE(NULL, req); + rsp = rt_link_getlink_dump(self->ys, req); + rt_link_getlink_req_dump_free(req); + ASSERT_NE(NULL, rsp) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + ASSERT_FALSE(ynl_dump_empty(rsp)); + + ynl_dump_foreach(rsp, link) + rt_link_print(_metadata, link); + + rt_link_getlink_list_free(rsp); +} + +TEST_F(rt_link, netkit) +{ + struct rt_link_getlink_req_dump *dreq; + struct rt_link_getlink_list *rsp; + bool found = false; + int netkit_ifindex; + + /* Create netkit with valid policy */ + netkit_ifindex = netkit_create(self->ys); + ASSERT_GT(netkit_ifindex, 0) + TH_LOG("failed to create netkit: %s", self->ys->err.msg); + + /* Verify it appears in a dump */ + dreq = rt_link_getlink_req_dump_alloc(); + ASSERT_NE(NULL, dreq); + rsp = rt_link_getlink_dump(self->ys, dreq); + rt_link_getlink_req_dump_free(dreq); + ASSERT_NE(NULL, rsp) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + ynl_dump_foreach(rsp, link) { + if (link->_hdr.ifi_index == netkit_ifindex) { + rt_link_print(_metadata, link); + found = true; + } + } + rt_link_getlink_list_free(rsp); + EXPECT_TRUE(found); + + netkit_delete(_metadata, self->ys, netkit_ifindex); +} + +TEST_F(rt_link, netkit_err_msg) +{ + struct rt_link_newlink_req *req; + int ret; + + /* Test creating netkit with bad policy - should fail */ + req = rt_link_newlink_req_alloc(); + ASSERT_NE(NULL, req); + rt_link_newlink_req_set_nlflags(req, NLM_F_CREATE); + rt_link_newlink_req_set_linkinfo_kind(req, "netkit"); + rt_link_newlink_req_set_linkinfo_data_netkit_policy(req, 10); + + ret = rt_link_newlink(self->ys, req); + rt_link_newlink_req_free(req); + EXPECT_NE(0, ret) { + TH_LOG("creating netkit with bad policy should fail"); + } + + /* Expect: + * Kernel error: 'Provided default xmit policy not supported' (bad attribute: .linkinfo.data(netkit).policy) + */ + 
EXPECT_NE(NULL, strstr(self->ys->err.msg, "bad attribute: .linkinfo.data(netkit).policy")) { + TH_LOG("expected extack msg not found: %s", + self->ys->err.msg); + } +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/rt-route.c b/tools/net/ynl/tests/rt-route.c new file mode 100644 index 000000000000..c9fd2bc48144 --- /dev/null +++ b/tools/net/ynl/tests/rt-route.c @@ -0,0 +1,113 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <stdio.h> +#include <string.h> + +#include <ynl.h> + +#include <arpa/inet.h> +#include <net/if.h> + +#include <kselftest_harness.h> + +#include "rt-route-user.h" + +static void rt_route_print(struct __test_metadata *_metadata, + struct rt_route_getroute_rsp *r) +{ + char ifname[IF_NAMESIZE]; + char route_str[64]; + const char *route; + const char *name; + + /* Ignore local */ + if (r->_hdr.rtm_table == RT_TABLE_LOCAL) + return; + + if (r->_present.oif) { + name = if_indextoname(r->oif, ifname); + EXPECT_NE(NULL, name); + if (name) + ksft_print_msg("oif: %-16s ", name); + } + + if (r->_len.dst) { + route = inet_ntop(r->_hdr.rtm_family, r->dst, + route_str, sizeof(route_str)); + printf("dst: %s/%d", route, r->_hdr.rtm_dst_len); + } + + if (r->_len.gateway) { + route = inet_ntop(r->_hdr.rtm_family, r->gateway, + route_str, sizeof(route_str)); + printf("gateway: %s ", route); + } + + printf("\n"); +} + +FIXTURE(rt_route) +{ + struct ynl_sock *ys; +}; + +FIXTURE_SETUP(rt_route) +{ + struct ynl_error yerr; + + self->ys = ynl_sock_create(&ynl_rt_route_family, &yerr); + ASSERT_NE(NULL, self->ys) + TH_LOG("failed to create rt-route socket: %s", yerr.msg); +} + +FIXTURE_TEARDOWN(rt_route) +{ + ynl_sock_destroy(self->ys); +} + +TEST_F(rt_route, dump) +{ + struct rt_route_getroute_req_dump *req; + struct rt_route_getroute_list *rsp; + struct in6_addr v6_expected; + struct in_addr v4_expected; + bool found_v4 = false; + bool found_v6 = false; + + /* The bash wrapper configures 192.168.1.1/24 and 2001:db8::1/64, + * make sure we can find the connected 
routes in the dump. + */ + inet_pton(AF_INET, "192.168.1.0", &v4_expected); + inet_pton(AF_INET6, "2001:db8::", &v6_expected); + + req = rt_route_getroute_req_dump_alloc(); + ASSERT_NE(NULL, req); + + rsp = rt_route_getroute_dump(self->ys, req); + rt_route_getroute_req_dump_free(req); + ASSERT_NE(NULL, rsp) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + + ASSERT_FALSE(ynl_dump_empty(rsp)) { + rt_route_getroute_list_free(rsp); + TH_LOG("no routes reported"); + } + + ynl_dump_foreach(rsp, route) { + rt_route_print(_metadata, route); + + if (route->_hdr.rtm_table == RT_TABLE_LOCAL) + continue; + + if (route->_len.dst == 4 && route->_hdr.rtm_dst_len == 24) + found_v4 |= !memcmp(route->dst, &v4_expected, 4); + if (route->_len.dst == 16 && route->_hdr.rtm_dst_len == 64) + found_v6 |= !memcmp(route->dst, &v6_expected, 16); + } + rt_route_getroute_list_free(rsp); + + EXPECT_TRUE(found_v4); + EXPECT_TRUE(found_v6); +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/rt-route.sh b/tools/net/ynl/tests/rt-route.sh new file mode 100755 index 000000000000..020338f0a238 --- /dev/null +++ b/tools/net/ynl/tests/rt-route.sh @@ -0,0 +1,5 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +source "$(dirname "$(realpath "$0")")/ynl_nsim_lib.sh" +nsim_setup +"$(dirname "$(realpath "$0")")/rt-route" diff --git a/tools/net/ynl/tests/tc.c b/tools/net/ynl/tests/tc.c new file mode 100644 index 000000000000..6ff13876578d --- /dev/null +++ b/tools/net/ynl/tests/tc.c @@ -0,0 +1,409 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include <sched.h> +#include <stdio.h> +#include <string.h> +#include <stdlib.h> +#include <arpa/inet.h> +#include <linux/pkt_sched.h> +#include <linux/tc_act/tc_vlan.h> +#include <linux/tc_act/tc_gact.h> +#include <linux/if_ether.h> +#include <net/if.h> + +#include <ynl.h> + +#include <kselftest_harness.h> + +#include "tc-user.h" + +#define TC_HANDLE (0xFFFF << 16) + +static bool tc_qdisc_print(struct __test_metadata *_metadata, + struct 
tc_getqdisc_rsp *q) +{ + bool was_fq_codel = false; + char ifname[IF_NAMESIZE]; + const char *name; + + name = if_indextoname(q->_hdr.tcm_ifindex, ifname); + EXPECT_TRUE((bool)name); + ksft_print_msg("%16s: ", name ?: "no-name"); + + if (q->_len.kind) { + printf("%s ", q->kind); + + if (q->options._present.fq_codel) { + struct tc_fq_codel_attrs *fq_codel; + struct tc_fq_codel_xstats *stats; + + fq_codel = &q->options.fq_codel; + stats = q->stats2.app.fq_codel; + + EXPECT_EQ(true, + fq_codel->_present.limit && + fq_codel->_present.target && + q->stats2.app._len.fq_codel); + + if (fq_codel->_present.limit) + printf("limit: %dp ", fq_codel->limit); + if (fq_codel->_present.target) + printf("target: %dms ", + (fq_codel->target + 500) / 1000); + if (q->stats2.app._len.fq_codel) + printf("new_flow_cnt: %d ", + stats->qdisc_stats.new_flow_count); + was_fq_codel = true; + } + } + printf("\n"); + + return was_fq_codel; +} + +static const char *vlan_act_name(struct tc_vlan *p) +{ + switch (p->v_action) { + case TCA_VLAN_ACT_POP: + return "pop"; + case TCA_VLAN_ACT_PUSH: + return "push"; + case TCA_VLAN_ACT_MODIFY: + return "modify"; + default: + break; + } + + return "not supported"; +} + +static const char *gact_act_name(struct tc_gact *p) +{ + switch (p->action) { + case TC_ACT_SHOT: + return "drop"; + case TC_ACT_OK: + return "ok"; + case TC_ACT_PIPE: + return "pipe"; + default: + break; + } + + return "not supported"; +} + +static void print_vlan(struct tc_act_vlan_attrs *vlan) +{ + printf("%s ", vlan_act_name(vlan->parms)); + if (vlan->_present.push_vlan_id) + printf("id %u ", vlan->push_vlan_id); + if (vlan->_present.push_vlan_protocol) + printf("protocol %#x ", ntohs(vlan->push_vlan_protocol)); + if (vlan->_present.push_vlan_priority) + printf("priority %u ", vlan->push_vlan_priority); +} + +static void print_gact(struct tc_act_gact_attrs *gact) +{ + struct tc_gact *p = gact->parms; + + printf("%s ", gact_act_name(p)); +} + +static void flower_print(struct 
tc_flower_attrs *flower, const char *kind) +{ + struct tc_act_attrs *a; + unsigned int i; + + ksft_print_msg("%s:\n", kind); + + if (flower->_present.key_vlan_id) + ksft_print_msg(" vlan_id: %u\n", flower->key_vlan_id); + if (flower->_present.key_vlan_prio) + ksft_print_msg(" vlan_prio: %u\n", flower->key_vlan_prio); + if (flower->_present.key_num_of_vlans) + ksft_print_msg(" num_of_vlans: %u\n", + flower->key_num_of_vlans); + + for (i = 0; i < flower->_count.act; i++) { + a = &flower->act[i]; + ksft_print_msg("action order: %i %s ", i + 1, a->kind); + if (a->options._present.vlan) + print_vlan(&a->options.vlan); + else if (a->options._present.gact) + print_gact(&a->options.gact); + printf("\n"); + } +} + +static void tc_filter_print(struct __test_metadata *_metadata, + struct tc_gettfilter_rsp *f) +{ + struct tc_options_msg *opt = &f->options; + + if (opt->_present.flower) { + EXPECT_TRUE((bool)f->_len.kind); + flower_print(&opt->flower, f->kind); + } else if (f->_len.kind) { + ksft_print_msg("%s pref %u proto: %#x\n", f->kind, + (f->_hdr.tcm_info >> 16), + ntohs(TC_H_MIN(f->_hdr.tcm_info))); + } +} + +static int tc_clsact_add(struct ynl_sock *ys, int ifi) +{ + struct tc_newqdisc_req *req; + int ret; + + req = tc_newqdisc_req_alloc(); + if (!req) + return -1; + memset(req, 0, sizeof(*req)); + + req->_hdr.tcm_ifindex = ifi; + req->_hdr.tcm_parent = TC_H_CLSACT; + req->_hdr.tcm_handle = TC_HANDLE; + tc_newqdisc_req_set_nlflags(req, + NLM_F_REQUEST | NLM_F_EXCL | NLM_F_CREATE); + tc_newqdisc_req_set_kind(req, "clsact"); + + ret = tc_newqdisc(ys, req); + tc_newqdisc_req_free(req); + + return ret; +} + +static int tc_clsact_del(struct ynl_sock *ys, int ifi) +{ + struct tc_delqdisc_req *req; + int ret; + + req = tc_delqdisc_req_alloc(); + if (!req) + return -1; + memset(req, 0, sizeof(*req)); + + req->_hdr.tcm_ifindex = ifi; + req->_hdr.tcm_parent = TC_H_CLSACT; + req->_hdr.tcm_handle = TC_HANDLE; + tc_delqdisc_req_set_nlflags(req, NLM_F_REQUEST); + + ret = 
tc_delqdisc(ys, req); + tc_delqdisc_req_free(req); + + return ret; +} + +static int tc_filter_add(struct ynl_sock *ys, int ifi) +{ + struct tc_newtfilter_req *req; + struct tc_act_attrs *acts; + struct tc_vlan p = { + .action = TC_ACT_PIPE, + .v_action = TCA_VLAN_ACT_PUSH + }; + int ret; + + req = tc_newtfilter_req_alloc(); + if (!req) + return -1; + memset(req, 0, sizeof(*req)); + + acts = tc_act_attrs_alloc(3); + if (!acts) { + tc_newtfilter_req_free(req); + return -1; + } + memset(acts, 0, sizeof(*acts) * 3); + + req->_hdr.tcm_ifindex = ifi; + req->_hdr.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS); + req->_hdr.tcm_info = TC_H_MAKE(1 << 16, htons(ETH_P_8021Q)); + req->chain = 0; + + tc_newtfilter_req_set_nlflags(req, NLM_F_REQUEST | NLM_F_EXCL | NLM_F_CREATE); + tc_newtfilter_req_set_kind(req, "flower"); + tc_newtfilter_req_set_options_flower_key_vlan_id(req, 100); + tc_newtfilter_req_set_options_flower_key_vlan_prio(req, 5); + tc_newtfilter_req_set_options_flower_key_num_of_vlans(req, 3); + + __tc_newtfilter_req_set_options_flower_act(req, acts, 3); + + /* Skip action at index 0 because in TC, the action array + * index starts at 1, with each index defining the action's + * order. In contrast, in YNL indexed arrays start at index 0. 
+ */ + tc_act_attrs_set_kind(&acts[1], "vlan"); + tc_act_attrs_set_options_vlan_parms(&acts[1], &p, sizeof(p)); + tc_act_attrs_set_options_vlan_push_vlan_id(&acts[1], 200); + tc_act_attrs_set_kind(&acts[2], "vlan"); + tc_act_attrs_set_options_vlan_parms(&acts[2], &p, sizeof(p)); + tc_act_attrs_set_options_vlan_push_vlan_id(&acts[2], 300); + + tc_newtfilter_req_set_options_flower_flags(req, 0); + tc_newtfilter_req_set_options_flower_key_eth_type(req, htons(0x8100)); + + ret = tc_newtfilter(ys, req); + tc_newtfilter_req_free(req); + + return ret; +} + +static int tc_filter_del(struct ynl_sock *ys, int ifi) +{ + struct tc_deltfilter_req *req; + int ret; + + req = tc_deltfilter_req_alloc(); + if (!req) + return -1; + memset(req, 0, sizeof(*req)); + + req->_hdr.tcm_ifindex = ifi; + req->_hdr.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS); + req->_hdr.tcm_info = TC_H_MAKE(1 << 16, htons(ETH_P_8021Q)); + tc_deltfilter_req_set_nlflags(req, NLM_F_REQUEST); + + ret = tc_deltfilter(ys, req); + tc_deltfilter_req_free(req); + + return ret; +} + +FIXTURE(tc) +{ + struct ynl_sock *ys; + int ifindex; +}; + +FIXTURE_SETUP(tc) +{ + struct ynl_error yerr; + int ret; + + ret = unshare(CLONE_NEWNET); + ASSERT_EQ(0, ret); + + self->ifindex = 1; /* loopback */ + + self->ys = ynl_sock_create(&ynl_tc_family, &yerr); + ASSERT_NE(NULL, self->ys) { + TH_LOG("failed to create tc socket: %s", yerr.msg); + } +} + +FIXTURE_TEARDOWN(tc) +{ + ynl_sock_destroy(self->ys); +} + +TEST_F(tc, qdisc) +{ + struct tc_getqdisc_req_dump *dreq; + struct tc_newqdisc_req *add_req; + struct tc_delqdisc_req *del_req; + struct tc_getqdisc_list *rsp; + bool found = false; + int ret; + + add_req = tc_newqdisc_req_alloc(); + ASSERT_NE(NULL, add_req); + memset(add_req, 0, sizeof(*add_req)); + + add_req->_hdr.tcm_ifindex = self->ifindex; + add_req->_hdr.tcm_parent = TC_H_ROOT; + tc_newqdisc_req_set_nlflags(add_req, + NLM_F_REQUEST | NLM_F_CREATE); + tc_newqdisc_req_set_kind(add_req, "fq_codel"); + + ret = 
tc_newqdisc(self->ys, add_req); + tc_newqdisc_req_free(add_req); + ASSERT_EQ(0, ret) { + TH_LOG("qdisc add failed: %s", self->ys->err.msg); + } + + dreq = tc_getqdisc_req_dump_alloc(); + ASSERT_NE(NULL, dreq); + rsp = tc_getqdisc_dump(self->ys, dreq); + tc_getqdisc_req_dump_free(dreq); + ASSERT_NE(NULL, rsp) { + TH_LOG("dump failed: %s", self->ys->err.msg); + } + ASSERT_FALSE(ynl_dump_empty(rsp)); + + ynl_dump_foreach(rsp, qdisc) { + found |= tc_qdisc_print(_metadata, qdisc); + } + tc_getqdisc_list_free(rsp); + EXPECT_TRUE(found); + + del_req = tc_delqdisc_req_alloc(); + ASSERT_NE(NULL, del_req); + memset(del_req, 0, sizeof(*del_req)); + + del_req->_hdr.tcm_ifindex = self->ifindex; + del_req->_hdr.tcm_parent = TC_H_ROOT; + tc_delqdisc_req_set_nlflags(del_req, NLM_F_REQUEST); + + ret = tc_delqdisc(self->ys, del_req); + tc_delqdisc_req_free(del_req); + EXPECT_EQ(0, ret) { + TH_LOG("qdisc del failed: %s", self->ys->err.msg); + } +} + +TEST_F(tc, flower) +{ + struct tc_gettfilter_req_dump *dreq; + struct tc_gettfilter_list *rsp; + bool found = false; + int ret; + + ret = tc_clsact_add(self->ys, self->ifindex); + if (ret) + SKIP(return, "clsact not supported: %s", self->ys->err.msg); + + ret = tc_filter_add(self->ys, self->ifindex); + ASSERT_EQ(0, ret) { + TH_LOG("filter add failed: %s", self->ys->err.msg); + } + + dreq = tc_gettfilter_req_dump_alloc(); + ASSERT_NE(NULL, dreq); + memset(dreq, 0, sizeof(*dreq)); + dreq->_hdr.tcm_ifindex = self->ifindex; + dreq->_hdr.tcm_parent = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS); + dreq->_present.chain = 1; + dreq->chain = 0; + + rsp = tc_gettfilter_dump(self->ys, dreq); + tc_gettfilter_req_dump_free(dreq); + ASSERT_NE(NULL, rsp) { + TH_LOG("filter dump failed: %s", self->ys->err.msg); + } + + ynl_dump_foreach(rsp, flt) { + tc_filter_print(_metadata, flt); + if (flt->options._present.flower) { + EXPECT_EQ(100, flt->options.flower.key_vlan_id); + EXPECT_EQ(5, flt->options.flower.key_vlan_prio); + found = true; + } + } + 
tc_gettfilter_list_free(rsp); + EXPECT_TRUE(found); + + ret = tc_filter_del(self->ys, self->ifindex); + EXPECT_EQ(0, ret) { + TH_LOG("filter del failed: %s", self->ys->err.msg); + } + + ret = tc_clsact_del(self->ys, self->ifindex); + EXPECT_EQ(0, ret) { + TH_LOG("clsact del failed: %s", self->ys->err.msg); + } +} + +TEST_HARNESS_MAIN diff --git a/tools/net/ynl/tests/test_ynl_ethtool.sh b/tools/net/ynl/tests/test_ynl_ethtool.sh index b826269017f4..b4480e9be7b7 100755 --- a/tools/net/ynl/tests/test_ynl_ethtool.sh +++ b/tools/net/ynl/tests/test_ynl_ethtool.sh @@ -8,7 +8,7 @@ KSELFTEST_KTAP_HELPERS="$(dirname "$(realpath "$0")")/../../../testing/selftests source "$KSELFTEST_KTAP_HELPERS" # Default ynl-ethtool path for direct execution, can be overridden by make install -ynl_ethtool="../pyynl/ethtool.py" +ynl_ethtool="./ethtool.py" readonly NSIM_ID="1337" readonly NSIM_DEV_NAME="nsim${NSIM_ID}" diff --git a/tools/net/ynl/tests/wireguard.c b/tools/net/ynl/tests/wireguard.c new file mode 100644 index 000000000000..df601e742c28 --- /dev/null +++ b/tools/net/ynl/tests/wireguard.c @@ -0,0 +1,106 @@ +// SPDX-License-Identifier: GPL-2.0 +#include <arpa/inet.h> +#include <string.h> +#include <stdio.h> +#include <errno.h> +#include <ynl.h> + +#include "wireguard-user.h" + +static void print_allowed_ip(const struct wireguard_wgallowedip *aip) +{ + char addr_out[INET6_ADDRSTRLEN]; + + if (!inet_ntop(aip->family, aip->ipaddr, addr_out, sizeof(addr_out))) { + addr_out[0] = '?'; + addr_out[1] = '\0'; + } + printf("\t\t\t%s/%u\n", addr_out, aip->cidr_mask); +} + +/* Only printing public key in this demo. For better key formatting, + * use the constant-time implementation as found in wireguard-tools. 
+ */ +static void print_peer_header(const struct wireguard_wgpeer *peer) +{ + unsigned int len = peer->_len.public_key; + uint8_t *key = peer->public_key; + unsigned int i; + + if (len != 32) + return; + printf("\tPeer "); + for (i = 0; i < len; i++) + printf("%02x", key[i]); + printf(":\n"); +} + +static void print_peer(const struct wireguard_wgpeer *peer) +{ + unsigned int i; + + print_peer_header(peer); + printf("\t\tData: rx: %llu / tx: %llu bytes\n", + peer->rx_bytes, peer->tx_bytes); + printf("\t\tAllowed IPs:\n"); + for (i = 0; i < peer->_count.allowedips; i++) + print_allowed_ip(&peer->allowedips[i]); +} + +static void build_request(struct wireguard_get_device_req *req, char *arg) +{ + char *endptr; + int ifindex; + + ifindex = strtol(arg, &endptr, 0); + if (endptr != arg + strlen(arg) || errno != 0) + ifindex = 0; + if (ifindex > 0) + wireguard_get_device_req_set_ifindex(req, ifindex); + else + wireguard_get_device_req_set_ifname(req, arg); +} + +int main(int argc, char **argv) +{ + struct wireguard_get_device_list *devs; + struct wireguard_get_device_req *req; + struct ynl_error yerr; + struct ynl_sock *ys; + + if (argc < 2) { + fprintf(stderr, "usage: %s <ifindex|ifname>\n", argv[0]); + return 1; + } + + ys = ynl_sock_create(&ynl_wireguard_family, &yerr); + if (!ys) { + fprintf(stderr, "YNL: %s\n", yerr.msg); + return 2; + } + + req = wireguard_get_device_req_alloc(); + build_request(req, argv[1]); + + devs = wireguard_get_device_dump(ys, req); + if (!devs) { + fprintf(stderr, "YNL (%d): %s\n", ys->err.code, ys->err.msg); + wireguard_get_device_req_free(req); + ynl_sock_destroy(ys); + return 3; + } + + ynl_dump_foreach(devs, d) { + unsigned int i; + + printf("Interface %d: %s\n", d->ifindex, d->ifname); + for (i = 0; i < d->_count.peers; i++) + print_peer(&d->peers[i]); + } + + wireguard_get_device_list_free(devs); + wireguard_get_device_req_free(req); + ynl_sock_destroy(ys); + + return 0; +} diff --git a/tools/net/ynl/tests/ynl_nsim_lib.sh 
b/tools/net/ynl/tests/ynl_nsim_lib.sh new file mode 100644 index 000000000000..98cdce44a69c --- /dev/null +++ b/tools/net/ynl/tests/ynl_nsim_lib.sh @@ -0,0 +1,35 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Shared netdevsim setup/cleanup for YNL C test wrappers + +NSIM_ID="1337" +NSIM_DEV="" +KSFT_SKIP=4 + +nsim_cleanup() { + echo "$NSIM_ID" > /sys/bus/netdevsim/del_device 2>/dev/null || true +} + +nsim_setup() { + modprobe netdevsim 2>/dev/null + if ! [ -f /sys/bus/netdevsim/new_device ]; then + echo "netdevsim module not available, skipping" >&2 + exit "$KSFT_SKIP" + fi + + trap nsim_cleanup EXIT + + echo "$NSIM_ID 1" > /sys/bus/netdevsim/new_device + udevadm settle + + NSIM_DEV=$(ls /sys/bus/netdevsim/devices/netdevsim${NSIM_ID}/net 2>/dev/null | head -1) + if [ -z "$NSIM_DEV" ]; then + echo "failed to find netdevsim device" >&2 + exit 1 + fi + + ip link set dev "$NSIM_DEV" name nsim0 + ip link set dev nsim0 up + ip addr add 192.168.1.1/24 dev nsim0 + ip addr add 2001:db8::1/64 dev nsim0 nodad +} diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 0949f370ad78..1db72e6b05b8 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -78,6 +78,7 @@ TARGETS += net/netfilter TARGETS += net/openvswitch TARGETS += net/ovpn TARGETS += net/packetdrill +TARGETS += net/ppp TARGETS += net/rds TARGETS += net/tcp_ao TARGETS += nolibc diff --git a/tools/testing/selftests/bpf/prog_tests/sock_ops_get_sk.c b/tools/testing/selftests/bpf/prog_tests/sock_ops_get_sk.c new file mode 100644 index 000000000000..343d92c4df30 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/sock_ops_get_sk.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <test_progs.h> +#include "cgroup_helpers.h" +#include "network_helpers.h" +#include "sock_ops_get_sk.skel.h" + +/* See progs/sock_ops_get_sk.c for the bug description. 
*/ +static void run_sock_ops_test(int cgroup_fd, int prog_fd) +{ + int server_fd, client_fd, err; + + err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0); + if (!ASSERT_OK(err, "prog_attach")) + return; + + server_fd = start_server(AF_INET, SOCK_STREAM, NULL, 0, 0); + if (!ASSERT_OK_FD(server_fd, "start_server")) + goto detach; + + /* Trigger TCP handshake which causes TCP_NEW_SYN_RECV state where + * is_fullsock == 0 and is_locked_tcp_sock == 0. + */ + client_fd = connect_to_fd(server_fd, 0); + if (!ASSERT_OK_FD(client_fd, "connect_to_fd")) + goto close_server; + + close(client_fd); + +close_server: + close(server_fd); +detach: + bpf_prog_detach(cgroup_fd, BPF_CGROUP_SOCK_OPS); +} + +void test_ns_sock_ops_get_sk(void) +{ + struct sock_ops_get_sk *skel; + int cgroup_fd; + + cgroup_fd = test__join_cgroup("/sock_ops_get_sk"); + if (!ASSERT_OK_FD(cgroup_fd, "join_cgroup")) + return; + + skel = sock_ops_get_sk__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel_open_load")) + goto close_cgroup; + + /* Test SOCK_OPS_GET_SK with same src/dst register */ + if (test__start_subtest("get_sk")) { + run_sock_ops_test(cgroup_fd, + bpf_program__fd(skel->progs.sock_ops_get_sk_same_reg)); + ASSERT_EQ(skel->bss->null_seen, 1, "null_seen"); + ASSERT_EQ(skel->bss->bug_detected, 0, "bug_not_detected"); + } + + /* Test SOCK_OPS_GET_FIELD with same src/dst register */ + if (test__start_subtest("get_field")) { + run_sock_ops_test(cgroup_fd, + bpf_program__fd(skel->progs.sock_ops_get_field_same_reg)); + ASSERT_EQ(skel->bss->field_null_seen, 1, "field_null_seen"); + ASSERT_EQ(skel->bss->field_bug_detected, 0, "field_bug_not_detected"); + } + + /* Test SOCK_OPS_GET_SK with different src/dst register */ + if (test__start_subtest("get_sk_diff_reg")) { + run_sock_ops_test(cgroup_fd, + bpf_program__fd(skel->progs.sock_ops_get_sk_diff_reg)); + ASSERT_EQ(skel->bss->diff_reg_null_seen, 1, "diff_reg_null_seen"); + ASSERT_EQ(skel->bss->diff_reg_bug_detected, 0, 
"diff_reg_bug_not_detected"); + } + + sock_ops_get_sk__destroy(skel); +close_cgroup: + close(cgroup_fd); +} diff --git a/tools/testing/selftests/bpf/prog_tests/test_dst_clear.c b/tools/testing/selftests/bpf/prog_tests/test_dst_clear.c new file mode 100644 index 000000000000..7c35ca6f4539 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/test_dst_clear.c @@ -0,0 +1,55 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. */ + +#include <sys/types.h> +#include <sys/socket.h> +#include <net/if.h> + +#include "test_progs.h" +#include "network_helpers.h" +#include "test_dst_clear.skel.h" + +#define IPV4_IFACE_ADDR "1.0.0.1" +#define UDP_TEST_PORT 7777 + +void test_ns_dst_clear(void) +{ + LIBBPF_OPTS(bpf_tcx_opts, tcx_opts); + struct test_dst_clear *skel; + struct sockaddr_in addr; + struct bpf_link *link; + socklen_t addrlen; + char buf[128] = {}; + int sockfd, err; + + skel = test_dst_clear__open_and_load(); + if (!ASSERT_OK_PTR(skel, "skel open_and_load")) + return; + + SYS(fail, "ip addr add %s/8 dev lo", IPV4_IFACE_ADDR); + + link = bpf_program__attach_tcx(skel->progs.dst_clear, + if_nametoindex("lo"), &tcx_opts); + if (!ASSERT_OK_PTR(link, "attach_tcx")) + goto fail; + skel->links.dst_clear = link; + + addrlen = sizeof(addr); + err = make_sockaddr(AF_INET, IPV4_IFACE_ADDR, UDP_TEST_PORT, + (void *)&addr, &addrlen); + if (!ASSERT_OK(err, "make_sockaddr")) + goto fail; + sockfd = socket(AF_INET, SOCK_DGRAM, 0); + if (!ASSERT_NEQ(sockfd, -1, "socket")) + goto fail; + err = sendto(sockfd, buf, sizeof(buf), 0, (void *)&addr, addrlen); + close(sockfd); + if (!ASSERT_EQ(err, sizeof(buf), "send")) + goto fail; + + ASSERT_TRUE(skel->bss->had_dst, "had_dst"); + ASSERT_TRUE(skel->bss->dst_cleared, "dst_cleared"); + +fail: + test_dst_clear__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c b/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c index e8ea26464349..c42488e445c2 100644 --- 
a/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c +++ b/tools/testing/selftests/bpf/prog_tests/xdp_bonding.c @@ -191,13 +191,18 @@ fail: return -1; } -static void bonding_cleanup(struct skeletons *skeletons) +static void link_cleanup(struct skeletons *skeletons) { - restore_root_netns(); while (skeletons->nlinks) { skeletons->nlinks--; bpf_link__destroy(skeletons->links[skeletons->nlinks]); } +} + +static void bonding_cleanup(struct skeletons *skeletons) +{ + restore_root_netns(); + link_cleanup(skeletons); ASSERT_OK(system("ip link delete bond1"), "delete bond1"); ASSERT_OK(system("ip link delete veth1_1"), "delete veth1_1"); ASSERT_OK(system("ip link delete veth1_2"), "delete veth1_2"); @@ -493,6 +498,90 @@ out: system("ip link del bond_nest2"); } +/* + * Test that XDP redirect via xdp_master_redirect() does not crash when + * the bond master device is not up. When bond is in round-robin mode but + * never opened, rr_tx_counter is NULL. + */ +static void test_xdp_bonding_redirect_no_up(struct skeletons *skeletons) +{ + struct nstoken *nstoken = NULL; + int xdp_pass_fd; + int veth1_ifindex; + int err; + char pkt[ETH_HLEN + 1]; + struct xdp_md ctx_in = {}; + + DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts, + .data_in = &pkt, + .data_size_in = sizeof(pkt), + .ctx_in = &ctx_in, + .ctx_size_in = sizeof(ctx_in), + .flags = BPF_F_TEST_XDP_LIVE_FRAMES, + .repeat = 1, + .batch_size = 1, + ); + + /* We can't use bonding_setup() because bond will be active */ + SYS(out, "ip netns add ns_rr_no_up"); + nstoken = open_netns("ns_rr_no_up"); + if (!ASSERT_OK_PTR(nstoken, "open ns_rr_no_up")) + goto out; + + /* bond0: active-backup, UP with slave veth0. + * Attaching native XDP to bond0 enables bpf_master_redirect_enabled_key + * globally. 
+ */ + SYS(out, "ip link add bond0 type bond mode active-backup"); + SYS(out, "ip link add veth0 type veth peer name veth0p"); + SYS(out, "ip link set veth0 master bond0"); + SYS(out, "ip link set bond0 up"); + SYS(out, "ip link set veth0p up"); + + /* bond1: round-robin, never UP -> rr_tx_counter stays NULL */ + SYS(out, "ip link add bond1 type bond mode balance-rr"); + SYS(out, "ip link add veth1 type veth peer name veth1p"); + SYS(out, "ip link set veth1 master bond1"); + + veth1_ifindex = if_nametoindex("veth1"); + if (!ASSERT_GT(veth1_ifindex, 0, "veth1_ifindex")) + goto out; + + /* Attach native XDP to bond0 -> enables global redirect key */ + if (xdp_attach(skeletons, skeletons->xdp_tx->progs.xdp_tx, "bond0")) + goto out; + + /* Attach generic XDP (XDP_TX) to veth1. + * When packets arrive at veth1 via netif_receive_skb, do_xdp_generic() + * runs this program. XDP_TX + bond slave triggers xdp_master_redirect(). + */ + err = bpf_xdp_attach(veth1_ifindex, + bpf_program__fd(skeletons->xdp_tx->progs.xdp_tx), + XDP_FLAGS_SKB_MODE, NULL); + if (!ASSERT_OK(err, "attach generic XDP to veth1")) + goto out; + + /* Run BPF_PROG_TEST_RUN with XDP_PASS live frames on veth1. + * XDP_PASS frames become SKBs with skb->dev = veth1, entering + * netif_receive_skb -> do_xdp_generic -> xdp_master_redirect. + * Without the fix, bond_rr_gen_slave_id() dereferences NULL + * rr_tx_counter and crashes. 
+ */ + xdp_pass_fd = bpf_program__fd(skeletons->xdp_dummy->progs.xdp_dummy_prog); + + memset(pkt, 0, sizeof(pkt)); + ctx_in.data_end = sizeof(pkt); + ctx_in.ingress_ifindex = veth1_ifindex; + + err = bpf_prog_test_run_opts(xdp_pass_fd, &opts); + ASSERT_OK(err, "xdp_pass test_run should not crash"); + +out: + link_cleanup(skeletons); + close_netns(nstoken); + SYS_NOFAIL("ip netns del ns_rr_no_up"); +} + static void test_xdp_bonding_features(struct skeletons *skeletons) { LIBBPF_OPTS(bpf_xdp_query_opts, query_opts); @@ -738,6 +827,9 @@ void serial_test_xdp_bonding(void) if (test__start_subtest("xdp_bonding_redirect_multi")) test_xdp_bonding_redirect_multi(&skeletons); + if (test__start_subtest("xdp_bonding_redirect_no_up")) + test_xdp_bonding_redirect_no_up(&skeletons); + out: xdp_dummy__destroy(skeletons.xdp_dummy); xdp_tx__destroy(skeletons.xdp_tx); diff --git a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c index 9af19dfe4e80..bccf677b94b6 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cc_cubic.c @@ -23,7 +23,7 @@ #define TCP_REORDERING (12) extern void cubictcp_init(struct sock *sk) __ksym; -extern void cubictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event) __ksym; +extern void cubictcp_cwnd_event_tx_start(struct sock *sk) __ksym; extern __u32 cubictcp_recalc_ssthresh(struct sock *sk) __ksym; extern void cubictcp_state(struct sock *sk, __u8 new_state) __ksym; extern __u32 tcp_reno_undo_cwnd(struct sock *sk) __ksym; @@ -108,9 +108,9 @@ void BPF_PROG(bpf_cubic_init, struct sock *sk) } SEC("struct_ops") -void BPF_PROG(bpf_cubic_cwnd_event, struct sock *sk, enum tcp_ca_event event) +void BPF_PROG(bpf_cubic_cwnd_event_tx_start, struct sock *sk) { - cubictcp_cwnd_event(sk, event); + cubictcp_cwnd_event_tx_start(sk); } SEC("struct_ops") @@ -172,7 +172,7 @@ struct tcp_congestion_ops cc_cubic = { .cong_control = (void *)bpf_cubic_cong_control, .set_state 
= (void *)bpf_cubic_state, .undo_cwnd = (void *)bpf_cubic_undo_cwnd, - .cwnd_event = (void *)bpf_cubic_cwnd_event, + .cwnd_event_tx_start = (void *)bpf_cubic_cwnd_event_tx_start, .pkts_acked = (void *)bpf_cubic_acked, .name = "bpf_cc_cubic", }; diff --git a/tools/testing/selftests/bpf/progs/bpf_cubic.c b/tools/testing/selftests/bpf/progs/bpf_cubic.c index 46fb2b37d3a7..ce18a4db813f 100644 --- a/tools/testing/selftests/bpf/progs/bpf_cubic.c +++ b/tools/testing/selftests/bpf/progs/bpf_cubic.c @@ -185,24 +185,21 @@ void BPF_PROG(bpf_cubic_init, struct sock *sk) } SEC("struct_ops") -void BPF_PROG(bpf_cubic_cwnd_event, struct sock *sk, enum tcp_ca_event event) +void BPF_PROG(bpf_cubic_cwnd_event_tx_start, struct sock *sk) { - if (event == CA_EVENT_TX_START) { - struct bpf_bictcp *ca = inet_csk_ca(sk); - __u32 now = tcp_jiffies32; - __s32 delta; - - delta = now - tcp_sk(sk)->lsndtime; - - /* We were application limited (idle) for a while. - * Shift epoch_start to keep cwnd growth to cubic curve. - */ - if (ca->epoch_start && delta > 0) { - ca->epoch_start += delta; - if (after(ca->epoch_start, now)) - ca->epoch_start = now; - } - return; + struct bpf_bictcp *ca = inet_csk_ca(sk); + __u32 now = tcp_jiffies32; + __s32 delta; + + delta = now - tcp_sk(sk)->lsndtime; + + /* We were application limited (idle) for a while. + * Shift epoch_start to keep cwnd growth to cubic curve. 
+ */ + if (ca->epoch_start && delta > 0) { + ca->epoch_start += delta; + if (after(ca->epoch_start, now)) + ca->epoch_start = now; } } @@ -537,7 +534,7 @@ struct tcp_congestion_ops cubic = { .cong_avoid = (void *)bpf_cubic_cong_avoid, .set_state = (void *)bpf_cubic_state, .undo_cwnd = (void *)bpf_cubic_undo_cwnd, - .cwnd_event = (void *)bpf_cubic_cwnd_event, + .cwnd_event_tx_start = (void *)bpf_cubic_cwnd_event_tx_start, .pkts_acked = (void *)bpf_cubic_acked, .name = "bpf_cubic", }; diff --git a/tools/testing/selftests/bpf/progs/sock_ops_get_sk.c b/tools/testing/selftests/bpf/progs/sock_ops_get_sk.c new file mode 100644 index 000000000000..3a0689f8ce7c --- /dev/null +++ b/tools/testing/selftests/bpf/progs/sock_ops_get_sk.c @@ -0,0 +1,117 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "vmlinux.h" +#include <bpf/bpf_helpers.h> +#include "bpf_misc.h" + +/* + * Test the SOCK_OPS_GET_SK() and SOCK_OPS_GET_FIELD() macros in + * sock_ops_convert_ctx_access() when dst_reg == src_reg. + * + * When dst_reg == src_reg, the macros borrow a temporary register to load + * is_fullsock / is_locked_tcp_sock, because dst_reg holds the ctx pointer + * and cannot be clobbered before ctx->sk / ctx->field is read. If + * is_fullsock == 0 (e.g., TCP_NEW_SYN_RECV with a request_sock), the macro + * must still zero dst_reg so the verifier's PTR_TO_SOCKET_OR_NULL / + * SCALAR_VALUE type is correct at runtime. A missing clear leaves a stale + * ctx pointer in dst_reg that passes NULL checks (GET_SK) or leaks a kernel + * address as a scalar (GET_FIELD). + * + * When dst_reg != src_reg, dst_reg itself is used to load is_fullsock, so + * the JEQ (dst_reg == 0) naturally leaves it zeroed on the !fullsock path. 
+ */ + +int bug_detected; +int null_seen; + +SEC("sockops") +__naked void sock_ops_get_sk_same_reg(void) +{ + asm volatile ( + "r7 = *(u32 *)(r1 + %[is_fullsock_off]);" + "r1 = *(u64 *)(r1 + %[sk_off]);" + "if r7 != 0 goto 2f;" + "if r1 == 0 goto 1f;" + "r1 = %[bug_detected] ll;" + "r2 = 1;" + "*(u32 *)(r1 + 0) = r2;" + "goto 2f;" + "1:" + "r1 = %[null_seen] ll;" + "r2 = 1;" + "*(u32 *)(r1 + 0) = r2;" + "2:" + "r0 = 1;" + "exit;" + : + : __imm_const(is_fullsock_off, offsetof(struct bpf_sock_ops, is_fullsock)), + __imm_const(sk_off, offsetof(struct bpf_sock_ops, sk)), + __imm_addr(bug_detected), + __imm_addr(null_seen) + : __clobber_all); +} + +/* SOCK_OPS_GET_FIELD: same-register, is_locked_tcp_sock == 0 path. */ +int field_bug_detected; +int field_null_seen; + +SEC("sockops") +__naked void sock_ops_get_field_same_reg(void) +{ + asm volatile ( + "r7 = *(u32 *)(r1 + %[is_fullsock_off]);" + "r1 = *(u32 *)(r1 + %[snd_cwnd_off]);" + "if r7 != 0 goto 2f;" + "if r1 == 0 goto 1f;" + "r1 = %[field_bug_detected] ll;" + "r2 = 1;" + "*(u32 *)(r1 + 0) = r2;" + "goto 2f;" + "1:" + "r1 = %[field_null_seen] ll;" + "r2 = 1;" + "*(u32 *)(r1 + 0) = r2;" + "2:" + "r0 = 1;" + "exit;" + : + : __imm_const(is_fullsock_off, offsetof(struct bpf_sock_ops, is_fullsock)), + __imm_const(snd_cwnd_off, offsetof(struct bpf_sock_ops, snd_cwnd)), + __imm_addr(field_bug_detected), + __imm_addr(field_null_seen) + : __clobber_all); +} + +/* SOCK_OPS_GET_SK: different-register, is_fullsock == 0 path. 
*/ +int diff_reg_bug_detected; +int diff_reg_null_seen; + +SEC("sockops") +__naked void sock_ops_get_sk_diff_reg(void) +{ + asm volatile ( + "r7 = r1;" + "r6 = *(u32 *)(r7 + %[is_fullsock_off]);" + "r2 = *(u64 *)(r7 + %[sk_off]);" + "if r6 != 0 goto 2f;" + "if r2 == 0 goto 1f;" + "r1 = %[diff_reg_bug_detected] ll;" + "r3 = 1;" + "*(u32 *)(r1 + 0) = r3;" + "goto 2f;" + "1:" + "r1 = %[diff_reg_null_seen] ll;" + "r3 = 1;" + "*(u32 *)(r1 + 0) = r3;" + "2:" + "r0 = 1;" + "exit;" + : + : __imm_const(is_fullsock_off, offsetof(struct bpf_sock_ops, is_fullsock)), + __imm_const(sk_off, offsetof(struct bpf_sock_ops, sk)), + __imm_addr(diff_reg_bug_detected), + __imm_addr(diff_reg_null_seen) + : __clobber_all); +} + +char _license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/bpf/progs/tcp_ca_kfunc.c b/tools/testing/selftests/bpf/progs/tcp_ca_kfunc.c index f95862f570b7..0a3e9d35bf6f 100644 --- a/tools/testing/selftests/bpf/progs/tcp_ca_kfunc.c +++ b/tools/testing/selftests/bpf/progs/tcp_ca_kfunc.c @@ -8,7 +8,7 @@ extern void bbr_init(struct sock *sk) __ksym; extern void bbr_main(struct sock *sk, u32 ack, int flag, const struct rate_sample *rs) __ksym; extern u32 bbr_sndbuf_expand(struct sock *sk) __ksym; extern u32 bbr_undo_cwnd(struct sock *sk) __ksym; -extern void bbr_cwnd_event(struct sock *sk, enum tcp_ca_event event) __ksym; +extern void bbr_cwnd_event_tx_start(struct sock *sk) __ksym; extern u32 bbr_ssthresh(struct sock *sk) __ksym; extern u32 bbr_min_tso_segs(struct sock *sk) __ksym; extern void bbr_set_state(struct sock *sk, u8 new_state) __ksym; @@ -16,6 +16,7 @@ extern void bbr_set_state(struct sock *sk, u8 new_state) __ksym; extern void dctcp_init(struct sock *sk) __ksym; extern void dctcp_update_alpha(struct sock *sk, u32 flags) __ksym; extern void dctcp_cwnd_event(struct sock *sk, enum tcp_ca_event ev) __ksym; +extern void dctcp_cwnd_event_tx_start(struct sock *sk) __ksym; extern u32 dctcp_ssthresh(struct sock *sk) __ksym; extern u32 
dctcp_cwnd_undo(struct sock *sk) __ksym; extern void dctcp_state(struct sock *sk, u8 new_state) __ksym; @@ -24,7 +25,7 @@ extern void cubictcp_init(struct sock *sk) __ksym; extern u32 cubictcp_recalc_ssthresh(struct sock *sk) __ksym; extern void cubictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked) __ksym; extern void cubictcp_state(struct sock *sk, u8 new_state) __ksym; -extern void cubictcp_cwnd_event(struct sock *sk, enum tcp_ca_event event) __ksym; +extern void cubictcp_cwnd_event_tx_start(struct sock *sk) __ksym; extern void cubictcp_acked(struct sock *sk, const struct ack_sample *sample) __ksym; SEC("struct_ops") @@ -69,9 +70,15 @@ u32 BPF_PROG(undo_cwnd, struct sock *sk) SEC("struct_ops") void BPF_PROG(cwnd_event, struct sock *sk, enum tcp_ca_event event) { - bbr_cwnd_event(sk, event); dctcp_cwnd_event(sk, event); - cubictcp_cwnd_event(sk, event); +} + +SEC("struct_ops") +void BPF_PROG(cwnd_event_tx_start, struct sock *sk) +{ + bbr_cwnd_event_tx_start(sk); + dctcp_cwnd_event_tx_start(sk); + cubictcp_cwnd_event_tx_start(sk); } SEC("struct_ops") @@ -111,6 +118,7 @@ struct tcp_congestion_ops tcp_ca_kfunc = { .sndbuf_expand = (void *)sndbuf_expand, .undo_cwnd = (void *)undo_cwnd, .cwnd_event = (void *)cwnd_event, + .cwnd_event_tx_start = (void *)cwnd_event_tx_start, .ssthresh = (void *)ssthresh, .min_tso_segs = (void *)min_tso_segs, .set_state = (void *)set_state, diff --git a/tools/testing/selftests/bpf/progs/test_dst_clear.c b/tools/testing/selftests/bpf/progs/test_dst_clear.c new file mode 100644 index 000000000000..c22a6eeb4798 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_dst_clear.c @@ -0,0 +1,57 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2026 Meta Platforms, Inc. and affiliates. 
*/ + +#include "vmlinux.h" +#include "bpf_tracing_net.h" +#include <bpf/bpf_helpers.h> +#include <bpf/bpf_endian.h> + +#define UDP_TEST_PORT 7777 + +void *bpf_cast_to_kern_ctx(void *) __ksym; + +bool had_dst = false; +bool dst_cleared = false; + +SEC("tc/egress") +int dst_clear(struct __sk_buff *skb) +{ + struct sk_buff *kskb; + struct iphdr iph; + struct udphdr udph; + int err; + + if (skb->protocol != __bpf_constant_htons(ETH_P_IP)) + return TC_ACT_OK; + + if (bpf_skb_load_bytes(skb, ETH_HLEN, &iph, sizeof(iph))) + return TC_ACT_OK; + + if (iph.protocol != IPPROTO_UDP) + return TC_ACT_OK; + + if (bpf_skb_load_bytes(skb, ETH_HLEN + sizeof(iph), &udph, sizeof(udph))) + return TC_ACT_OK; + + if (udph.dest != __bpf_constant_htons(UDP_TEST_PORT)) + return TC_ACT_OK; + + kskb = bpf_cast_to_kern_ctx(skb); + had_dst = (kskb->_skb_refdst != 0); + + /* Same-protocol encap (IPIP): protocol stays IPv4, but the dst + * from the original routing is no longer valid for the outer hdr. + */ + err = bpf_skb_adjust_room(skb, (s32)sizeof(struct iphdr), + BPF_ADJ_ROOM_MAC, + BPF_F_ADJ_ROOM_FIXED_GSO | + BPF_F_ADJ_ROOM_ENCAP_L3_IPV4); + if (err) + return TC_ACT_SHOT; + + dst_cleared = (kskb->_skb_refdst == 0); + + return TC_ACT_SHOT; +} + +char __license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/drivers/net/.gitignore b/tools/testing/selftests/drivers/net/.gitignore index 3633c7a3ed65..585ecb4d5dc4 100644 --- a/tools/testing/selftests/drivers/net/.gitignore +++ b/tools/testing/selftests/drivers/net/.gitignore @@ -1,4 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only -gro napi_id_helper psp_responder diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile index 8154d6d429d3..b72080c6d06b 100644 --- a/tools/testing/selftests/drivers/net/Makefile +++ b/tools/testing/selftests/drivers/net/Makefile @@ -6,13 +6,13 @@ TEST_INCLUDES := $(wildcard lib/py/*.py) \ ../../net/lib.sh \ TEST_GEN_FILES := \ - gro \ napi_id_helper \ # 
end of TEST_GEN_FILES TEST_PROGS := \ gro.py \ hds.py \ + macsec.py \ napi_id.py \ napi_threaded.py \ netpoll_basic.py \ diff --git a/tools/testing/selftests/drivers/net/README.rst b/tools/testing/selftests/drivers/net/README.rst index eb838ae94844..c8588436c224 100644 --- a/tools/testing/selftests/drivers/net/README.rst +++ b/tools/testing/selftests/drivers/net/README.rst @@ -26,6 +26,10 @@ The netdevice against which tests will be run must exist, be running Refer to list of :ref:`Variables` later in this file to set up running the tests against a real device. +The current support for bash tests restricts the use of the same interface name +on the local system and the remote one and will bail if this case is +encountered. + Both modes required ~~~~~~~~~~~~~~~~~~~ @@ -47,6 +51,10 @@ or:: # Variable set in a file NETIF=eth0 +Please note that the config parser is very simple, if there are +any non-alphanumeric characters in the value it needs to be in +double quotes. + Local test (which don't require endpoint for sending / receiving traffic) need only the ``NETIF`` variable. Remaining variables define the endpoint and communication method. @@ -62,6 +70,44 @@ LOCAL_V4, LOCAL_V6, REMOTE_V4, REMOTE_V6 Local and remote endpoint IP addresses. +LOCAL_PREFIX_V6 +~~~~~~~~~~~~~~~ + +Local IP prefix/subnet which can be used to allocate extra IP addresses (for +network name spaces behind macvlan, veth, netkit devices). DUT must be +reachable using these addresses from the endpoint. + +LOCAL_PREFIX_V6 must NOT match LOCAL_V6. 
+ +Example: + NETIF = "eth0" + LOCAL_V6 = "2001:db8:1::1" + REMOTE_V6 = "2001:db8:1::2" + LOCAL_PREFIX_V6 = "2001:db8:2::0/64" + + +-----------------------------+ +------------------------------+ + dst | INIT NS | | TEST NS | + 2001: | +---------------+ | | | + db8:2::2| | NETIF | | bpf | | + +---|>| 2001:db8:1::1 | |redirect| +-------------------------+ | + | | | |-----------|--------|>| Netkit | | + | | +---------------+ | _peer | | nk_guest | | + | | +-------------+ Netkit pair | | | fe80::2/64 | | + | | | Netkit |.............|........|>| 2001:db8:2::2/64 | | + | | | nk_host | | | +-------------------------+ | + | | | fe80::1/64 | | | | + | | +-------------+ | | route: | + | | | | default | + | | route: | | via fe80::1 dev nk_guest | + | | 2001:db8:2::2/128 | +------------------------------+ + | | via fe80::2 dev nk_host | + | +-----------------------------+ + | + | +---------------+ + | | REMOTE | + +---| 2001:db8:1::2 | + +---------------+ + REMOTE_TYPE ~~~~~~~~~~~ @@ -107,7 +153,7 @@ On the target machine, running the tests will use netdevsim by default:: 1..1 # timeout set to 45 # selftests: drivers/net: ping.py - # TAP version 13 + # KTAP version 1 # 1..3 # ok 1 ping.test_v4 # ok 2 ping.test_v6 @@ -128,9 +174,124 @@ Create a config with remote info:: Run the test:: [/root] # ./ksft-net-drv/drivers/net/ping.py - TAP version 13 + KTAP version 1 1..3 ok 1 ping.test_v4 ok 2 ping.test_v6 # SKIP Test requires IPv6 connectivity ok 3 ping.test_tcp # Totals: pass:2 fail:0 xfail:0 xpass:0 skip:1 error:0 + +Dependencies +~~~~~~~~~~~~ + +The tests have a handful of dependencies. For Fedora / CentOS:: + + dnf -y install netsniff-ng python-yaml socat iperf3 + +Guidance for test authors +========================= + +This section mostly applies to Python tests but some of the guidance +may be more broadly applicable. + +Kernel config +~~~~~~~~~~~~~ + +Each test directory has a ``config`` file listing which kernel +configuration options the tests depend on. 
This file must be kept +up to date, the CIs build minimal kernels for each test group. + +Adding checks inside the tests to validate that the necessary kernel +configs are enabled is discouraged. The test author may include such +checks, but standalone patches to make tests compatible e.g. with +distro kernel configs are unlikely to be accepted. + +Avoid libraries and frameworks +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Test files should be relatively self contained. The libraries should +only include very core or non-trivial code. +It may be tempting to "factor out" the common code, but fight that urge. +Library code increases the barrier of entry, and complexity in general. + +Avoid mixing test code and boilerplate +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In Python, try to avoid adding code in the ``main()`` function which +instantiates ``NetDrvEnv()`` and calls ``ksft_run()``. It's okay to +set up global resources (e.g. open an RtNetlink socket used by multiple +tests), but any complex logic, test-specific environment configuration +and validation should be done in the tests (even if it means it has to +be repeated). + +Local host is the DUT +~~~~~~~~~~~~~~~~~~~~~ + +Dual-host tests (tests with an endpoint) should be written from the DUT +perspective. IOW the local machine should be the one tested, remote is +just for traffic generation. + +Avoid modifying remote +~~~~~~~~~~~~~~~~~~~~~~ + +Avoid making configuration changes to the remote system as much as possible. +Remote system may be used concurrently by multiple DUTs. + +defer() +~~~~~~~ + +The env must be clean after test exits. Register a ``defer()`` for any +action that needs an "undo" as soon as possible. If you need to run +the cancel action as part of the test - ``defer()`` returns an object +you can ``.exec()``-ute. + +ksft_pr() +~~~~~~~~~ + +Use ``ksft_pr()`` instead of ``print()`` to avoid breaking TAP format. 
+ +ksft_disruptive +~~~~~~~~~~~~~~~ + +By default the tests are expected to be able to run on +single-interface systems. All tests which may disconnect ``NETIF`` +must be annotated with ``@ksft_disruptive``. + +ksft_variants +~~~~~~~~~~~~~ + +Use the ``@ksft_variants`` decorator to run a test with multiple sets +of inputs as separate test cases. This avoids duplicating test functions +that only differ in parameters. + +Parameters can be a single value, a tuple, or a ``KsftNamedVariant`` +(which gives an explicit name to the sub-case). The argument to the +decorator can be a list or a generator. + +Example:: + + @ksft_variants([ + KsftNamedVariant("main", False), + KsftNamedVariant("ctx", True), + ]) + def resize_periodic(cfg, create_context): + # test body receives (cfg, create_context) where create_context + # is False for the "main" variant and True for "ctx" + pass + +or:: + + def _gro_variants(): + for mode in ["sw", "hw"]: + for protocol in ["tcp4", "tcp6"]: + yield (mode, protocol) + + @ksft_variants(_gro_variants()) + def test(cfg, mode, protocol): + pass + +Running tests CI-style +====================== + +See https://github.com/linux-netdev/nipa/wiki for instructions on how +to easily run the tests using ``virtme-ng``. 
diff --git a/tools/testing/selftests/drivers/net/bonding/Makefile b/tools/testing/selftests/drivers/net/bonding/Makefile index 6c5c60adb5e8..9af5f84edd37 100644 --- a/tools/testing/selftests/drivers/net/bonding/Makefile +++ b/tools/testing/selftests/drivers/net/bonding/Makefile @@ -11,6 +11,7 @@ TEST_PROGS := \ bond_macvlan_ipvlan.sh \ bond_options.sh \ bond_passive_lacp.sh \ + bond_stacked_header_parse.sh \ dev_addr_lists.sh \ mode-1-recovery-updelay.sh \ mode-2-recovery-updelay.sh \ diff --git a/tools/testing/selftests/drivers/net/bonding/bond_stacked_header_parse.sh b/tools/testing/selftests/drivers/net/bonding/bond_stacked_header_parse.sh new file mode 100755 index 000000000000..36bcdef711b0 --- /dev/null +++ b/tools/testing/selftests/drivers/net/bonding/bond_stacked_header_parse.sh @@ -0,0 +1,72 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# Test that bond_header_parse() does not infinitely recurse with stacked bonds. +# +# When a non-Ethernet device (e.g. GRE) is enslaved to a bond that is itself +# enslaved to another bond (bond1 -> bond0 -> gre), receiving a packet via +# AF_PACKET SOCK_DGRAM triggers dev_parse_header() -> bond_header_parse(). +# Since parse() used skb->dev (always the topmost bond) instead of a passed-in +# dev pointer, it would recurse back into itself indefinitely. 
+ +# shellcheck disable=SC2034 +ALL_TESTS=" + bond_test_stacked_header_parse +" +REQUIRE_MZ=no +NUM_NETIFS=0 +lib_dir=$(dirname "$0") +source "$lib_dir"/../../../net/forwarding/lib.sh + +# shellcheck disable=SC2329 +bond_test_stacked_header_parse() +{ + local devdummy="test-dummy0" + local devgre="test-gre0" + local devbond0="test-bond0" + local devbond1="test-bond1" + + # shellcheck disable=SC2034 + RET=0 + + # Setup: dummy -> gre -> bond0 -> bond1 + ip link add name "$devdummy" type dummy + ip addr add 10.0.0.1/24 dev "$devdummy" + ip link set "$devdummy" up + + ip link add name "$devgre" type gre local 10.0.0.1 + + ip link add name "$devbond0" type bond mode active-backup + ip link add name "$devbond1" type bond mode active-backup + + ip link set "$devgre" master "$devbond0" + ip link set "$devbond0" master "$devbond1" + + ip link set "$devgre" up + ip link set "$devbond0" up + ip link set "$devbond1" up + + # tcpdump on a non-Ethernet bond uses AF_PACKET SOCK_DGRAM (cooked + # capture), which triggers dev_parse_header() -> bond_header_parse() + # on receive. With the bug, this recurses infinitely. + timeout 5 tcpdump -c 1 -i "$devbond1" >/dev/null 2>&1 & + local tcpdump_pid=$! + sleep 1 + + # Send a GRE packet to 10.0.0.1 so it arrives via gre -> bond0 -> bond1 + python3 -c "from scapy.all import *; send(IP(src='10.0.0.2', dst='10.0.0.1')/GRE()/IP()/UDP(), verbose=0)" + check_err $? 
"failed to send GRE packet (scapy installed?)" + + wait "$tcpdump_pid" 2>/dev/null + + ip link del "$devbond1" 2>/dev/null + ip link del "$devbond0" 2>/dev/null + ip link del "$devgre" 2>/dev/null + ip link del "$devdummy" 2>/dev/null + + log_test "Stacked bond header_parse does not recurse" +} + +tests_run + +exit "$EXIT_STATUS" diff --git a/tools/testing/selftests/drivers/net/bonding/config b/tools/testing/selftests/drivers/net/bonding/config index 991494376223..b62c70715293 100644 --- a/tools/testing/selftests/drivers/net/bonding/config +++ b/tools/testing/selftests/drivers/net/bonding/config @@ -14,6 +14,7 @@ CONFIG_NETCONSOLE=m CONFIG_NETCONSOLE_DYNAMIC=y CONFIG_NETCONSOLE_EXTENDED_LOG=y CONFIG_NETDEVSIM=m +CONFIG_NET_IPGRE=y CONFIG_NET_SCH_INGRESS=y CONFIG_NLMON=y CONFIG_VETH=y diff --git a/tools/testing/selftests/drivers/net/config b/tools/testing/selftests/drivers/net/config index 77ccf83d87e0..fd16994366f4 100644 --- a/tools/testing/selftests/drivers/net/config +++ b/tools/testing/selftests/drivers/net/config @@ -3,8 +3,10 @@ CONFIG_DEBUG_INFO_BTF=y CONFIG_DEBUG_INFO_BTF_MODULES=n CONFIG_INET_PSP=y CONFIG_IPV6=y +CONFIG_MACSEC=m CONFIG_NETCONSOLE=m CONFIG_NETCONSOLE_DYNAMIC=y CONFIG_NETCONSOLE_EXTENDED_LOG=y CONFIG_NETDEVSIM=m +CONFIG_VLAN_8021Q=m CONFIG_XDP_SOCKETS=y diff --git a/tools/testing/selftests/drivers/net/gro.py b/tools/testing/selftests/drivers/net/gro.py index cbc1b19dbc91..221f27e57147 100755 --- a/tools/testing/selftests/drivers/net/gro.py +++ b/tools/testing/selftests/drivers/net/gro.py @@ -11,6 +11,7 @@ coalescing behavior. 
 Test cases:
   - data_same: Same size data packets coalesce
   - data_lrg_sml: Large packet followed by smaller one coalesces
+  - data_lrg_1byte: Large packet followed by 1B one coalesces (Ethernet padding)
   - data_sml_lrg: Small packet followed by larger one doesn't coalesce
   - ack: Pure ACK packets do not coalesce
   - flags_psh: Packets with PSH flag don't coalesce
@@ -35,11 +36,18 @@ Test cases:
   - large_rem: Large packet remainder handling
 """
+import glob
 import os
+import re
 from lib.py import ksft_run, ksft_exit, ksft_pr
 from lib.py import NetDrvEpEnv, KsftXfailEx
+from lib.py import NetdevFamily, EthtoolFamily
 from lib.py import bkg, cmd, defer, ethtool, ip
-from lib.py import ksft_variants
+from lib.py import ksft_variants, KsftNamedVariant
+
+
+# gro.c uses hardcoded DPORT=8000
+GRO_DPORT = 8000
 def _resolve_dmac(cfg, ipver):
@@ -113,11 +121,103 @@ def _set_ethtool_feat(dev, current, feats, host=None):
         ksft_pr(eth_cmd)
+def _get_queue_stats(cfg, queue_id):
+    """Get stats for a specific Rx queue."""
+    cfg.wait_hw_stats_settle()
+    data = cfg.netnl.qstats_get({"ifindex": cfg.ifindex, "scope": ["queue"]},
+                                dump=True)
+    for q in data:
+        if q.get('queue-type') == 'rx' and q.get('queue-id') == queue_id:
+            return q
+    return {}
+
+
+def _setup_isolated_queue(cfg):
+    """Set up an isolated queue for testing using ntuple filter.
+
+    Remove queue 1 from the default RSS context and steer test traffic to it.
+    """
+    test_queue = 1
+
+    qcnt = len(glob.glob(f"/sys/class/net/{cfg.ifname}/queues/rx-*"))
+    if qcnt < 2:
+        raise KsftXfailEx(f"Need at least 2 queues, have {qcnt}")
+
+    # Remove queue 1 from default RSS context by setting its weight to 0
+    weights = ["1"] * qcnt
+    weights[test_queue] = "0"
+    ethtool(f"-X {cfg.ifname} weight " + " ".join(weights))
+    defer(ethtool, f"-X {cfg.ifname} default")
+
+    # Set up ntuple filter to steer our test traffic to the isolated queue
+    flow = f"flow-type tcp{cfg.addr_ipver} "
+    flow += f"dst-ip {cfg.addr} dst-port {GRO_DPORT} action {test_queue}"
+    output = ethtool(f"-N {cfg.ifname} {flow}").stdout
+    ntuple_id = int(output.split()[-1])
+    defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
+
+    return test_queue
+
+
+def _setup_queue_count(cfg, num_queues):
+    """Configure the NIC to use a specific number of queues."""
+    channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}})
+    ch_max = channels.get('combined-max', 0)
+    qcnt = channels['combined-count']
+
+    if ch_max < num_queues:
+        raise KsftXfailEx(f"Need at least {num_queues} queues, max={ch_max}")
+
+    defer(ethtool, f"-L {cfg.ifname} combined {qcnt}")
+    ethtool(f"-L {cfg.ifname} combined {num_queues}")
+
+
+def _run_gro_bin(cfg, test_name, protocol=None, num_flows=None,
+                 order_check=False, verbose=False, fail=False):
+    """Run gro binary with given test and return the process result."""
+    if not hasattr(cfg, "bin_remote"):
+        cfg.bin_local = cfg.net_lib_dir / "gro"
+        cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
+
+    if protocol is None:
+        ipver = cfg.addr_ipver
+        protocol = f"ipv{ipver}"
+    else:
+        ipver = "6" if protocol[-1] == "6" else "4"
+
+    dmac = _resolve_dmac(cfg, ipver)
+
+    base_args = [
+        f"--{protocol}",
+        f"--dmac {dmac}",
+        f"--smac {cfg.remote_dev['address']}",
+        f"--daddr {cfg.addr_v[ipver]}",
+        f"--saddr {cfg.remote_addr_v[ipver]}",
+        f"--test {test_name}",
+    ]
+    if num_flows:
+        base_args.append(f"--num-flows {num_flows}")
+    if order_check:
+        base_args.append("--order-check")
+    if verbose:
+        base_args.append("--verbose")
+
+    args = " ".join(base_args)
+
+    rx_cmd = f"{cfg.bin_local} {args} --rx --iface {cfg.ifname}"
+    tx_cmd = f"{cfg.bin_remote} {args} --iface {cfg.remote_ifname}"
+
+    with bkg(rx_cmd, ksft_ready=True, exit_wait=True, fail=fail) as rx_proc:
+        cmd(tx_cmd, host=cfg.remote)
+
+    return rx_proc
+
+
 def _setup(cfg, mode, test_name):
     """ Setup hardware loopback mode for GRO testing. """
     if not hasattr(cfg, "bin_remote"):
-        cfg.bin_local = cfg.test_dir / "gro"
+        cfg.bin_local = cfg.net_lib_dir / "gro"
         cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
     if not hasattr(cfg, "feat"):
@@ -190,7 +290,8 @@ def _gro_variants():
     # Tests that work for all protocols
     common_tests = [
-        "data_same", "data_lrg_sml", "data_sml_lrg",
+        "data_same", "data_lrg_sml", "data_sml_lrg", "data_lrg_1byte",
+        "data_burst",
         "ack", "flags_psh", "flags_syn", "flags_rst", "flags_urg", "flags_cwr",
         "tcp_csum", "tcp_seq", "tcp_ts", "tcp_opt",
@@ -200,6 +301,7 @@ def _gro_variants():
     # Tests specific to IPv4
     ipv4_tests = [
+        "ip_csum",
         "ip_ttl", "ip_opt", "ip_frag4",
         "ip_id_df1_inc", "ip_id_df1_fixed",
         "ip_id_df0_inc", "ip_id_df0_fixed",
@@ -212,7 +314,7 @@ def _gro_variants():
     ]
     for mode in ["sw", "hw", "lro"]:
-        for protocol in ["ipv4", "ipv6", "ipip"]:
+        for protocol in ["ipv4", "ipv6", "ipip", "ip6ip6"]:
             for test_name in common_tests:
                 yield mode, protocol, test_name
@@ -233,30 +335,14 @@ def test(cfg, mode, protocol, test_name):
     _setup(cfg, mode, test_name)
-    base_cmd_args = [
-        f"--{protocol}",
-        f"--dmac {_resolve_dmac(cfg, ipver)}",
-        f"--smac {cfg.remote_dev['address']}",
-        f"--daddr {cfg.addr_v[ipver]}",
-        f"--saddr {cfg.remote_addr_v[ipver]}",
-        f"--test {test_name}",
-        "--verbose"
-    ]
-    base_args = " ".join(base_cmd_args)
-
     # Each test is run 6 times to deflake, because given the receive timing,
     # not all packets that should coalesce will be considered in the same flow
     # on every try.
     max_retries = 6
     for attempt in range(max_retries):
-        rx_cmd = f"{cfg.bin_local} {base_args} --rx --iface {cfg.ifname}"
-        tx_cmd = f"{cfg.bin_remote} {base_args} --iface {cfg.remote_ifname}"
-
         fail_now = attempt >= max_retries - 1
-
-        with bkg(rx_cmd, ksft_ready=True, exit_wait=True,
-                 fail=fail_now) as rx_proc:
-            cmd(tx_cmd, host=cfg.remote)
+        rx_proc = _run_gro_bin(cfg, test_name, protocol=protocol,
+                               verbose=True, fail=fail_now)
         if rx_proc.ret == 0:
             return
@@ -270,11 +356,89 @@ def test(cfg, mode, protocol, test_name):
         ksft_pr(f"Attempt {attempt + 1}/{max_retries} failed, retrying...")
+def _capacity_variants():
+    """Generate variants for capacity test: mode x queue setup."""
+    setups = [
+        ("isolated", _setup_isolated_queue),
+        ("1q", lambda cfg: _setup_queue_count(cfg, 1)),
+        ("8q", lambda cfg: _setup_queue_count(cfg, 8)),
+    ]
+    for mode in ["sw", "hw", "lro"]:
+        for name, func in setups:
+            yield KsftNamedVariant(f"{mode}_{name}", mode, func)
+
+
+@ksft_variants(_capacity_variants())
+def test_gro_capacity(cfg, mode, setup_func):
+    """
+    Probe GRO capacity.
+
+    Start with 8 flows and increase by 2x on each successful run.
+    Retry up to 3 times on failure.
+
+    Variants combine mode (sw, hw, lro) with queue setup:
+      - isolated: Use a single queue isolated from RSS
+      - 1q: Configure NIC to use 1 queue
+      - 8q: Configure NIC to use 8 queues
+    """
+    max_retries = 3
+
+    _setup(cfg, mode, "capacity")
+    queue_id = setup_func(cfg)
+
+    num_flows = 8
+    while True:
+        success = False
+        for attempt in range(max_retries):
+            if queue_id is not None:
+                stats_before = _get_queue_stats(cfg, queue_id)
+
+            rx_proc = _run_gro_bin(cfg, "capacity", num_flows=num_flows)
+            output = rx_proc.stdout
+
+            if queue_id is not None:
+                stats_after = _get_queue_stats(cfg, queue_id)
+                qstat_pkts = (stats_after.get('rx-packets', 0) -
+                              stats_before.get('rx-packets', 0))
+                gro_pkts = (stats_after.get('rx-hw-gro-packets', 0) -
+                            stats_before.get('rx-hw-gro-packets', 0))
+                qstat_str = f" qstat={qstat_pkts} hw-gro={gro_pkts}"
+            else:
+                qstat_str = ""
+
+            # Parse and print STATS line
+            match = re.search(
+                r'STATS: received=(\d+) wire=(\d+) coalesced=(\d+)', output)
+            if match:
+                received = int(match.group(1))
+                wire = int(match.group(2))
+                coalesced = int(match.group(3))
+                status = "PASS" if received == num_flows else "MISS"
+                ksft_pr(f"flows={num_flows} attempt={attempt + 1} "
+                        f"received={received} wire={wire} "
+                        f"coalesced={coalesced}{qstat_str} [{status}]")
+                if received == num_flows:
+                    success = True
+                    break
+            else:
+                ksft_pr(rx_proc)
+                ksft_pr(f"flows={num_flows} attempt={attempt + 1}"
+                        f"{qstat_str} [FAIL - can't parse stats]")
+
+        if not success:
+            ksft_pr(f"Stopped at {num_flows} flows")
+            break
+
+        num_flows *= 2
+
+
 def main() -> None:
     """ Ksft boiler plate main """
     with NetDrvEpEnv(__file__) as cfg:
-        ksft_run(cases=[test], args=(cfg,))
+        cfg.ethnl = EthtoolFamily()
+        cfg.netnl = NetdevFamily()
+        ksft_run(cases=[test, test_gro_capacity], args=(cfg,))
     ksft_exit()
diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
index a64140333a46..85ca4d1ecf9e 100644
--- a/tools/testing/selftests/drivers/net/hw/Makefile
+++ b/tools/testing/selftests/drivers/net/hw/Makefile
@@ -26,12 +26,17 @@ TEST_PROGS = \
 	ethtool_extended_state.sh \
 	ethtool_mm.sh \
 	ethtool_rmon.sh \
+	ethtool_std_stats.sh \
+	gro_hw.py \
 	hw_stats_l3.sh \
 	hw_stats_l3_gre.sh \
 	iou-zcrx.py \
 	irq.py \
 	loopback.sh \
 	nic_timestamp.py \
+	nk_netns.py \
+	nk_qlease.py \
+	ntuple.py \
 	pp_alloc_fail.py \
 	rss_api.py \
 	rss_ctx.py \
@@ -40,6 +45,8 @@ TEST_PROGS = \
 	rss_input_xfrm.py \
 	toeplitz.py \
 	tso.py \
+	uso.py \
+	xdp_metadata.py \
 	xsk_reconfig.py \
 	#
diff --git a/tools/testing/selftests/drivers/net/hw/config b/tools/testing/selftests/drivers/net/hw/config
index 2307aa001be1..dd50cb8a7911 100644
--- a/tools/testing/selftests/drivers/net/hw/config
+++ b/tools/testing/selftests/drivers/net/hw/config
@@ -1,3 +1,4 @@
+CONFIG_BPF_SYSCALL=y
 CONFIG_FAIL_FUNCTION=y
 CONFIG_FAULT_INJECTION=y
 CONFIG_FAULT_INJECTION_DEBUG_FS=y
@@ -5,7 +6,11 @@ CONFIG_FUNCTION_ERROR_INJECTION=y
 CONFIG_IO_URING=y
 CONFIG_IPV6=y
 CONFIG_IPV6_GRE=y
+CONFIG_NET_CLS_ACT=y
+CONFIG_NET_CLS_BPF=y
 CONFIG_NET_IPGRE=y
 CONFIG_NET_IPGRE_DEMUX=y
+CONFIG_NETKIT=y
+CONFIG_NET_SCH_INGRESS=y
 CONFIG_UDMABUF=y
 CONFIG_VXLAN=y
diff --git a/tools/testing/selftests/drivers/net/hw/ethtool_rmon.sh b/tools/testing/selftests/drivers/net/hw/ethtool_rmon.sh
index 8f60c1685ad4..2ec19edddfaa 100755
--- a/tools/testing/selftests/drivers/net/hw/ethtool_rmon.sh
+++ b/tools/testing/selftests/drivers/net/hw/ethtool_rmon.sh
@@ -1,17 +1,23 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
+#shellcheck disable=SC2034 # SC does not see the global variables
+#shellcheck disable=SC2317,SC2329 # unused functions
 ALL_TESTS="
 	rmon_rx_histogram
 	rmon_tx_histogram
 "
+: "${DRIVER_TEST_CONFORMANT:=yes}"
 NUM_NETIFS=2
 lib_dir=$(dirname "$0")
 source "$lib_dir"/../../../net/forwarding/lib.sh
+source "$lib_dir"/../../../kselftest/ktap_helpers.sh
+UINT32_MAX=$((2**32 - 1))
 ETH_FCS_LEN=4
 ETH_HLEN=$((6+6+2))
+TEST_NAME=$(basename "$0" .sh)
 declare -A netif_mtu
@@ -19,11 +25,14 @@ ensure_mtu()
 {
 	local iface=$1; shift
 	local len=$1; shift
-	local current=$(ip -j link show dev $iface | jq -r '.[0].mtu')
 	local required=$((len - ETH_HLEN - ETH_FCS_LEN))
+	local current
-	if [ $current -lt $required ]; then
-		ip link set dev $iface mtu $required || return 1
+	current=$(run_on "$iface" \
+		ip -j link show dev "$iface" | jq -r '.[0].mtu')
+	if [ "$current" -lt "$required" ]; then
+		run_on "$iface" ip link set dev "$iface" mtu "$required" \
+			|| return 1
 	fi
 }
@@ -46,23 +55,24 @@ bucket_test()
 	len=$((len - ETH_FCS_LEN))
 	len=$((len > 0 ? len : 0))
-	before=$(ethtool --json -S $iface --groups rmon | \
+	before=$(run_on "$iface" ethtool --json -S "$iface" --groups rmon | \
 		jq -r ".[0].rmon[\"${set}-pktsNtoM\"][$bucket].val")
 	# Send 10k one way and 20k in the other, to detect counters
 	# mapped to the wrong direction
-	$MZ $neigh -q -c $num_rx -p $len -a own -b bcast -d 10us
-	$MZ $iface -q -c $num_tx -p $len -a own -b bcast -d 10us
+	run_on "$neigh" \
+		"$MZ" "$neigh" -q -c "$num_rx" -p "$len" -a own -b bcast -d 10us
+	run_on "$iface" \
+		"$MZ" "$iface" -q -c "$num_tx" -p "$len" -a own -b bcast -d 10us
-	after=$(ethtool --json -S $iface --groups rmon | \
+	after=$(run_on "$iface" ethtool --json -S "$iface" --groups rmon | \
 		jq -r ".[0].rmon[\"${set}-pktsNtoM\"][$bucket].val")
 	delta=$((after - before))
-	expected=$([ $set = rx ] && echo $num_rx || echo $num_tx)
+	expected=$([ "$set" = rx ] && echo "$num_rx" || echo "$num_tx")
-	# Allow some extra tolerance for other packets sent by the stack
-	[ $delta -ge $expected ] && [ $delta -le $((expected + 100)) ]
+	[ "$delta" -ge "$expected" ] && [ "$delta" -le "$UINT32_MAX" ]
 }
 rmon_histogram()
@@ -73,43 +83,40 @@ rmon_histogram()
 	local nbuckets=0
 	local step=
-	RET=0
-
 	while read -r -a bucket; do
-		step="$set-pkts${bucket[0]}to${bucket[1]} on $iface"
+		step="$set-pkts${bucket[0]}to${bucket[1]}"
-		for if in $iface $neigh; do
-			if ! ensure_mtu $if ${bucket[0]}; then
-				log_test_xfail "$if does not support the required MTU for $step"
+		for if in "$iface" "$neigh"; do
+			if ! ensure_mtu "$if" "${bucket[0]}"; then
+				ktap_print_msg "$if does not support the required MTU for $step"
+				ktap_test_xfail "$TEST_NAME.$step"
 				return
 			fi
 		done
-		if ! bucket_test $iface $neigh $set $nbuckets ${bucket[0]}; then
-			check_err 1 "$step failed"
+		if ! bucket_test "$iface" "$neigh" "$set" "$nbuckets" "${bucket[0]}"; then
+			ktap_test_fail "$TEST_NAME.$step"
 			return 1
 		fi
-		log_test "$step"
+		ktap_test_pass "$TEST_NAME.$step"
 		nbuckets=$((nbuckets + 1))
-	done < <(ethtool --json -S $iface --groups rmon | \
+	done < <(run_on "$iface" ethtool --json -S "$iface" --groups rmon | \
 		jq -r ".[0].rmon[\"${set}-pktsNtoM\"][]|[.low, .high]|@tsv" 2>/dev/null)
-	if [ $nbuckets -eq 0 ]; then
-		log_test_xfail "$iface does not support $set histogram counters"
+	if [ "$nbuckets" -eq 0 ]; then
+		ktap_print_msg "$iface does not support $set histogram counters"
 		return
 	fi
 }
 rmon_rx_histogram()
 {
-	rmon_histogram $h1 $h2 rx
-	rmon_histogram $h2 $h1 rx
+	rmon_histogram "$h1" "$h2" rx
 }
 rmon_tx_histogram()
 {
-	rmon_histogram $h1 $h2 tx
-	rmon_histogram $h2 $h1 tx
+	rmon_histogram "$h1" "$h2" tx
 }
 setup_prepare()
@@ -117,9 +124,9 @@ setup_prepare()
 	h1=${NETIFS[p1]}
 	h2=${NETIFS[p2]}
-	for iface in $h1 $h2; do
-		netif_mtu[$iface]=$(ip -j link show dev $iface | jq -r '.[0].mtu')
-		ip link set dev $iface up
+	for iface in "$h1" "$h2"; do
+		netif_mtu["$iface"]=$(run_on "$iface" \
+			ip -j link show dev "$iface" | jq -r '.[0].mtu')
 	done
 }
@@ -127,19 +134,26 @@ cleanup()
 {
 	pre_cleanup
-	for iface in $h2 $h1; do
-		ip link set dev $iface \
-			mtu ${netif_mtu[$iface]} \
-			down
+	# Do not bring down the interfaces, just configure the initial MTU
+	for iface in "$h2" "$h1"; do
+		run_on "$iface" ip link set dev "$iface" \
+			mtu "${netif_mtu[$iface]}"
 	done
 }
 check_ethtool_counter_group_support
 trap cleanup EXIT
+bucket_count=$(ethtool --json -S "${NETIFS[p1]}" --groups rmon | \
+	jq -r '.[0].rmon |
+		"\((."rx-pktsNtoM" | length) +
+		   (."tx-pktsNtoM" | length))"')
+ktap_print_header
+ktap_set_plan "$bucket_count"
+
 setup_prepare
 setup_wait
 tests_run
-exit $EXIT_STATUS
+ktap_finished
diff --git a/tools/testing/selftests/drivers/net/hw/ethtool_std_stats.sh b/tools/testing/selftests/drivers/net/hw/ethtool_std_stats.sh
new file mode 100755
index 000000000000..c085d2a4c989
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/ethtool_std_stats.sh
@@ -0,0 +1,206 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#shellcheck disable=SC2034 # SC does not see the global variables
+#shellcheck disable=SC2317,SC2329 # unused functions
+
+ALL_TESTS="
+	test_eth_ctrl_stats
+	test_eth_mac_stats
+	test_pause_stats
+"
+: "${DRIVER_TEST_CONFORMANT:=yes}"
+STABLE_MAC_ADDRS=yes
+NUM_NETIFS=2
+lib_dir=$(dirname "$0")
+# shellcheck source=./../../../net/forwarding/lib.sh
+source "$lib_dir"/../../../net/forwarding/lib.sh
+# shellcheck source=./../../../kselftest/ktap_helpers.sh
+source "$lib_dir"/../../../kselftest/ktap_helpers.sh
+
+UINT32_MAX=$((2**32 - 1))
+SUBTESTS=0
+TEST_NAME=$(basename "$0" .sh)
+
+traffic_test()
+{
+	local iface=$1; shift
+	local neigh=$1; shift
+	local num_tx=$1; shift
+	local pkt_format="$1"; shift
+	local -a counters=("$@")
+	local int grp cnt target exact_check
+	local before after delta
+	local num_rx=$((num_tx * 2))
+	local xfail_message
+	local src="aggregate"
+	local i
+
+	for i in "${!counters[@]}"; do
+		read -r int grp cnt target exact_check xfail_message \
+			<<< "${counters[$i]}"
+
+		before[i]=$(ethtool_std_stats_get "$int" "$grp" "$cnt" "$src")
+	done
+
+	# shellcheck disable=SC2086 # needs split options
+	run_on "$iface" "$MZ" "$iface" -q -c "$num_tx" $pkt_format
+
+	# shellcheck disable=SC2086 # needs split options
+	run_on "$neigh" "$MZ" "$neigh" -q -c "$num_rx" $pkt_format
+
+	for i in "${!counters[@]}"; do
+		read -r int grp cnt target exact_check xfail_message \
+			<<< "${counters[$i]}"
+
+		after[i]=$(ethtool_std_stats_get "$int" "$grp" "$cnt" "$src")
+		if [[ "${after[$i]}" == "null" ]]; then
+			ktap_test_skip "$TEST_NAME.$grp-$cnt"
+			continue;
+		fi
+
+		delta=$((after[i] - before[i]))
+
+		if [ "$exact_check" -ne 0 ]; then
+			[ "$delta" -eq "$target" ]
+		else
+			[ "$delta" -ge "$target" ] && \
+			[ "$delta" -le "$UINT32_MAX" ]
+		fi
+		err="$?"
+
+		if [[ $err != 0 ]] && [[ -n $xfail_message ]]; then
+			ktap_print_msg "$xfail_message"
+			ktap_test_xfail "$TEST_NAME.$grp-$cnt"
+			continue;
+		fi
+
+		if [[ $err != 0 ]]; then
+			ktap_print_msg "$grp-$cnt is not valid on $int (expected $target, got $delta)"
+			ktap_test_fail "$TEST_NAME.$grp-$cnt"
+		else
+			ktap_test_pass "$TEST_NAME.$grp-$cnt"
+		fi
+	done
+}
+
+test_eth_ctrl_stats()
+{
+	local pkt_format="-a own -b bcast 88:08 -p 64"
+	local num_pkts=1000
+	local -a counters
+
+	counters=("$h1 eth-ctrl MACControlFramesTransmitted $num_pkts 0")
+	traffic_test "$h1" "$h2" "$num_pkts" "$pkt_format" \
+		"${counters[@]}"
+
+	counters=("$h1 eth-ctrl MACControlFramesReceived $num_pkts 0")
+	traffic_test "$h2" "$h1" "$num_pkts" "$pkt_format" \
+		"${counters[@]}"
+}
+SUBTESTS=$((SUBTESTS + 2))
+
+test_eth_mac_stats()
+{
+	local pkt_size=100
+	local pkt_size_fcs=$((pkt_size + 4))
+	local bcast_pkt_format="-a own -b bcast -p $pkt_size"
+	local mcast_pkt_format="-a own -b 01:00:5E:00:00:01 -p $pkt_size"
+	local num_pkts=2000
+	local octets=$((pkt_size_fcs * num_pkts))
+	local -a counters error_cnt collision_cnt
+
+	# Error counters should be exactly zero
+	counters=("$h1 eth-mac FrameCheckSequenceErrors 0 1"
+		"$h1 eth-mac AlignmentErrors 0 1"
+		"$h1 eth-mac FramesLostDueToIntMACXmitError 0 1"
+		"$h1 eth-mac CarrierSenseErrors 0 1"
+		"$h1 eth-mac FramesLostDueToIntMACRcvError 0 1"
+		"$h1 eth-mac InRangeLengthErrors 0 1"
+		"$h1 eth-mac OutOfRangeLengthField 0 1"
+		"$h1 eth-mac FrameTooLongErrors 0 1"
+		"$h1 eth-mac FramesAbortedDueToXSColls 0 1")
+	traffic_test "$h1" "$h2" "$num_pkts" "$bcast_pkt_format" \
+		"${counters[@]}"
+
+	# Collision related counters should also be zero
+	counters=("$h1 eth-mac SingleCollisionFrames 0 1"
+		"$h1 eth-mac MultipleCollisionFrames 0 1"
+		"$h1 eth-mac FramesWithDeferredXmissions 0 1"
+		"$h1 eth-mac LateCollisions 0 1"
+		"$h1 eth-mac FramesWithExcessiveDeferral 0 1")
+	traffic_test "$h1" "$h2" "$num_pkts" "$bcast_pkt_format" \
+		"${counters[@]}"
+
+	counters=("$h1 eth-mac BroadcastFramesXmittedOK $num_pkts 0"
+		"$h1 eth-mac OctetsTransmittedOK $octets 0")
+	traffic_test "$h1" "$h2" "$num_pkts" "$bcast_pkt_format" \
+		"${counters[@]}"
+
+	counters=("$h1 eth-mac BroadcastFramesReceivedOK $num_pkts 0"
+		"$h1 eth-mac OctetsReceivedOK $octets 0")
+	traffic_test "$h2" "$h1" "$num_pkts" "$bcast_pkt_format" \
+		"${counters[@]}"
+
+	counters=("$h1 eth-mac FramesTransmittedOK $num_pkts 0"
+		"$h1 eth-mac MulticastFramesXmittedOK $num_pkts 0")
+	traffic_test "$h1" "$h2" "$num_pkts" "$mcast_pkt_format" \
+		"${counters[@]}"
+
+	counters=("$h1 eth-mac FramesReceivedOK $num_pkts 0"
+		"$h1 eth-mac MulticastFramesReceivedOK $num_pkts 0")
+	traffic_test "$h2" "$h1" "$num_pkts" "$mcast_pkt_format" \
+		"${counters[@]}"
+}
+SUBTESTS=$((SUBTESTS + 22))
+
+test_pause_stats()
+{
+	local pkt_format="-a own -b 01:80:c2:00:00:01 88:08:00:01:00:01"
+	local xfail_message="software sent pause frames not detected"
+	local num_pkts=2000
+	local -a counters
+	local int
+	local i
+
+	# Check that there is pause frame support
+	for ((i = 1; i <= NUM_NETIFS; ++i)); do
+		int="${NETIFS[p$i]}"
+		if ! run_on "$int" ethtool -I --json -a "$int" > /dev/null 2>&1; then
+			ktap_test_skip "$TEST_NAME.tx_pause_frames"
+			ktap_test_skip "$TEST_NAME.rx_pause_frames"
+			return
+		fi
+	done
+
+	counters=("$h1 pause tx_pause_frames $num_pkts 0 $xfail_message")
+	traffic_test "$h1" "$h2" "$num_pkts" "$pkt_format" \
+		"${counters[@]}"
+
+	counters=("$h1 pause rx_pause_frames $num_pkts 0")
+	traffic_test "$h2" "$h1" "$num_pkts" "$pkt_format" \
+		"${counters[@]}"
+}
+SUBTESTS=$((SUBTESTS + 2))
+
+setup_prepare()
+{
+	local iface
+
+	h1=${NETIFS[p1]}
+	h2=${NETIFS[p2]}
+
+	h2_mac=$(mac_get "$h2")
+}
+
+ktap_print_header
+ktap_set_plan $SUBTESTS
+
+check_ethtool_counter_group_support
+trap cleanup EXIT
+
+setup_prepare
+setup_wait
+
+tests_run
+
+ktap_finished
diff --git a/tools/testing/selftests/drivers/net/hw/gro_hw.py b/tools/testing/selftests/drivers/net/hw/gro_hw.py
new file mode 100755
index 000000000000..10e08b22ee0e
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/gro_hw.py
@@ -0,0 +1,294 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+HW GRO tests focusing on device machinery like stats, rather than protocol
+processing.
+"""
+
+import glob
+import re
+
+from lib.py import ksft_run, ksft_exit, ksft_pr
+from lib.py import ksft_eq, ksft_ge, ksft_variants
+from lib.py import NetDrvEpEnv, NetdevFamily
+from lib.py import KsftSkipEx
+from lib.py import bkg, cmd, defer, ethtool, ip
+
+
+# gro.c uses hardcoded DPORT=8000
+GRO_DPORT = 8000
+
+
+def _get_queue_stats(cfg, queue_id):
+    """Get stats for a specific Rx queue."""
+    cfg.wait_hw_stats_settle()
+    data = cfg.netnl.qstats_get({"ifindex": cfg.ifindex, "scope": ["queue"]},
+                                dump=True)
+    for q in data:
+        if q.get('queue-type') == 'rx' and q.get('queue-id') == queue_id:
+            return q
+    return {}
+
+
+def _resolve_dmac(cfg, ipver):
+    """Find the destination MAC address for sending packets."""
+    attr = "dmac" + ipver
+    if hasattr(cfg, attr):
+        return getattr(cfg, attr)
+
+    route = ip(f"-{ipver} route get {cfg.addr_v[ipver]}",
+               json=True, host=cfg.remote)[0]
+    gw = route.get("gateway")
+    if not gw:
+        setattr(cfg, attr, cfg.dev['address'])
+        return getattr(cfg, attr)
+
+    cmd(f"ping -c1 -W0 -I{cfg.remote_ifname} {gw}", host=cfg.remote)
+    neigh = ip(f"neigh get {gw} dev {cfg.remote_ifname}",
+               json=True, host=cfg.remote)[0]
+    setattr(cfg, attr, neigh['lladdr'])
+    return getattr(cfg, attr)
+
+
+def _setup_isolated_queue(cfg):
+    """Set up an isolated queue for testing using ntuple filter.
+
+    Remove queue 1 from the default RSS context and steer test traffic to it.
+    """
+    test_queue = 1
+
+    qcnt = len(glob.glob(f"/sys/class/net/{cfg.ifname}/queues/rx-*"))
+    if qcnt < 2:
+        raise KsftSkipEx(f"Need at least 2 queues, have {qcnt}")
+
+    # Remove queue 1 from default RSS context by setting its weight to 0
+    weights = ["1"] * qcnt
+    weights[test_queue] = "0"
+    ethtool(f"-X {cfg.ifname} weight " + " ".join(weights))
+    defer(ethtool, f"-X {cfg.ifname} default")
+
+    # Set up ntuple filter to steer our test traffic to the isolated queue
+    flow = f"flow-type tcp{cfg.addr_ipver} "
+    flow += f"dst-ip {cfg.addr} dst-port {GRO_DPORT} action {test_queue}"
+    output = ethtool(f"-N {cfg.ifname} {flow}").stdout
+    ntuple_id = int(output.split()[-1])
+    defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
+
+    return test_queue
+
+
+def _run_gro_test(cfg, test_name, num_flows=None, ignore_fail=False,
+                  order_check=False):
+    """Run gro binary with given test and return output."""
+    if not hasattr(cfg, "bin_remote"):
+        cfg.bin_local = cfg.net_lib_dir / "gro"
+        cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
+
+    ipver = cfg.addr_ipver
+    protocol = f"--ipv{ipver}"
+    dmac = _resolve_dmac(cfg, ipver)
+
+    base_args = [
+        protocol,
+        f"--dmac {dmac}",
+        f"--smac {cfg.remote_dev['address']}",
+        f"--daddr {cfg.addr}",
+        f"--saddr {cfg.remote_addr_v[ipver]}",
+        f"--test {test_name}",
+    ]
+    if num_flows:
+        base_args.append(f"--num-flows {num_flows}")
+    if order_check:
+        base_args.append("--order-check")
+
+    args = " ".join(base_args)
+
+    rx_cmd = f"{cfg.bin_local} {args} --rx --iface {cfg.ifname}"
+    tx_cmd = f"{cfg.bin_remote} {args} --iface {cfg.remote_ifname}"
+
+    with bkg(rx_cmd, ksft_ready=True, exit_wait=True, fail=False) as rx_proc:
+        cmd(tx_cmd, host=cfg.remote)
+
+    if not ignore_fail:
+        ksft_eq(rx_proc.ret, 0)
+        if rx_proc.ret != 0:
+            ksft_pr(rx_proc)
+
+    return rx_proc.stdout
+
+
+def _require_hw_gro_stats(cfg, queue_id):
+    """Check if device reports HW GRO stats for the queue."""
+    stats = _get_queue_stats(cfg, queue_id)
+    required = ['rx-packets', 'rx-hw-gro-packets', 'rx-hw-gro-wire-packets']
+    for stat in required:
+        if stat not in stats:
+            raise KsftSkipEx(f"Driver does not report '{stat}' via qstats")
+
+
+def _set_ethtool_feat(cfg, current, feats):
+    """Set ethtool features with defer to restore original state."""
+    s2n = {True: "on", False: "off"}
+
+    new = ["-K", cfg.ifname]
+    old = ["-K", cfg.ifname]
+    no_change = True
+    for name, state in feats.items():
+        new += [name, s2n[state]]
+        old += [name, s2n[current[name]["active"]]]
+
+        if current[name]["active"] != state:
+            no_change = False
+            if current[name]["fixed"]:
+                raise KsftSkipEx(f"Device does not support {name}")
+    if no_change:
+        return
+
+    eth_cmd = ethtool(" ".join(new))
+    defer(ethtool, " ".join(old))
+
+    # If ethtool printed something kernel must have modified some features
+    if eth_cmd.stdout:
+        ksft_pr(eth_cmd)
+
+
+def _setup_hw_gro(cfg):
+    """Enable HW GRO on the device, disabling SW GRO."""
+    feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+
+    # Try to disable SW GRO and enable HW GRO
+    _set_ethtool_feat(cfg, feat,
+                      {"generic-receive-offload": False,
+                       "rx-gro-hw": True,
+                       "large-receive-offload": False})
+
+    # Some NICs treat HW GRO as a GRO sub-feature so disabling GRO
+    # will also clear HW GRO. Use a hack of installing XDP generic
+    # to skip SW GRO, even when enabled.
+    feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+    if not feat["rx-gro-hw"]["active"]:
+        ksft_pr("Driver clears HW GRO when SW GRO is cleared, using generic XDP workaround")
+        prog = cfg.net_lib_dir / "xdp_dummy.bpf.o"
+        ip(f"link set dev {cfg.ifname} xdpgeneric obj {prog} sec xdp")
+        defer(ip, f"link set dev {cfg.ifname} xdpgeneric off")
+
+        # Attaching XDP may change features, fetch the latest state
+        feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+
+        _set_ethtool_feat(cfg, feat,
+                          {"generic-receive-offload": True,
+                           "rx-gro-hw": True,
+                           "large-receive-offload": False})
+
+
+def _check_gro_stats(cfg, test_queue, stats_before,
+                     expect_rx, expect_gro, expect_wire):
+    """Validate GRO stats against expected values."""
+    stats_after = _get_queue_stats(cfg, test_queue)
+
+    rx_delta = (stats_after.get('rx-packets', 0) -
+                stats_before.get('rx-packets', 0))
+    gro_delta = (stats_after.get('rx-hw-gro-packets', 0) -
+                 stats_before.get('rx-hw-gro-packets', 0))
+    wire_delta = (stats_after.get('rx-hw-gro-wire-packets', 0) -
+                  stats_before.get('rx-hw-gro-wire-packets', 0))
+
+    ksft_eq(rx_delta, expect_rx, comment="rx-packets")
+    ksft_eq(gro_delta, expect_gro, comment="rx-hw-gro-packets")
+    ksft_eq(wire_delta, expect_wire, comment="rx-hw-gro-wire-packets")
+
+
+def test_gro_stats_single(cfg):
+    """
+    Test that a single packet doesn't affect GRO stats.
+
+    Send a single packet that cannot be coalesced (nothing to coalesce with).
+    GRO stats should not increase since no coalescing occurred.
+    rx-packets should increase by 2 (1 data + 1 FIN).
+    """
+    _setup_hw_gro(cfg)
+
+    test_queue = _setup_isolated_queue(cfg)
+    _require_hw_gro_stats(cfg, test_queue)
+
+    stats_before = _get_queue_stats(cfg, test_queue)
+
+    _run_gro_test(cfg, "single")
+
+    # 1 data + 1 FIN = 2 rx-packets, no coalescing
+    _check_gro_stats(cfg, test_queue, stats_before,
+                     expect_rx=2, expect_gro=0, expect_wire=0)
+
+
+def test_gro_stats_full(cfg):
+    """
+    Test GRO stats when overwhelming HW GRO capacity.
+
+    Send 500 flows to exceed HW GRO flow capacity on a single queue.
+    This should result in some packets not being coalesced.
+    Validate that qstats match what gro.c observed.
+    """
+    _setup_hw_gro(cfg)
+
+    test_queue = _setup_isolated_queue(cfg)
+    _require_hw_gro_stats(cfg, test_queue)
+
+    num_flows = 500
+    stats_before = _get_queue_stats(cfg, test_queue)
+
+    # Run capacity test - will likely fail because not all packets coalesce
+    output = _run_gro_test(cfg, "capacity", num_flows=num_flows,
+                           ignore_fail=True)
+
+    # Parse gro.c output: "STATS: received=X wire=Y coalesced=Z"
+    match = re.search(r'STATS: received=(\d+) wire=(\d+) coalesced=(\d+)',
+                      output)
+    if not match:
+        raise KsftSkipEx(f"Could not parse gro.c output: {output}")
+
+    rx_frames = int(match.group(2))
+    gro_coalesced = int(match.group(3))
+
+    ksft_ge(gro_coalesced, 1,
+            comment="At least some packets should coalesce")
+
+    # received + 1 FIN, coalesced super-packets, coalesced * 2 wire packets
+    _check_gro_stats(cfg, test_queue, stats_before,
+                     expect_rx=rx_frames + 1,
+                     expect_gro=gro_coalesced,
+                     expect_wire=gro_coalesced * 2)
+
+
+@ksft_variants([4, 32, 512])
+def test_gro_order(cfg, num_flows):
+    """
+    Test that HW GRO preserves packet ordering between flows.
+
+    Packets may get delayed until the aggregate is released,
+    but reordering between aggregates and packet terminating
+    the aggregate and normal packets should not happen.
+
+    Note that this test is stricter than truly required.
+    Reordering packets between flows should not cause issues.
+    This test will also fail if traffic is run over an ECMP fabric.
+    """
+    _setup_hw_gro(cfg)
+    _setup_isolated_queue(cfg)
+
+    _run_gro_test(cfg, "capacity", num_flows=num_flows, order_check=True)
+
+
+def main() -> None:
+    """ Ksft boiler plate main """
+
+    with NetDrvEpEnv(__file__, nsim_test=False) as cfg:
+        cfg.netnl = NetdevFamily()
+        ksft_run([test_gro_stats_single,
+                  test_gro_stats_full,
+                  test_gro_order], args=(cfg,))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
index c63d6d6450d2..e81724cb5542 100755
--- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
+++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.py
@@ -2,14 +2,27 @@
 # SPDX-License-Identifier: GPL-2.0
 import re
+import time
 from os import path
 from lib.py import ksft_run, ksft_exit, KsftSkipEx, ksft_variants, KsftNamedVariant
 from lib.py import NetDrvEpEnv
 from lib.py import bkg, cmd, defer, ethtool, rand_port, wait_port_listen
-from lib.py import EthtoolFamily
+from lib.py import EthtoolFamily, NetdevFamily
 SKIP_CODE = 42
+
+def mp_clear_wait(cfg):
+    """Wait for io_uring memory providers to clear from all device queues."""
+    deadline = time.time() + 5
+    while time.time() < deadline:
+        queues = cfg.netnl.queue_get({'ifindex': cfg.ifindex}, dump=True)
+        if not any('io-uring' in q for q in queues):
+            return
+        time.sleep(0.1)
+    raise TimeoutError("Timed out waiting for memory provider to clear")
+
+
 def create_rss_ctx(cfg):
     output = ethtool(f"-X {cfg.ifname} context new start {cfg.target} equal 1").stdout
     values = re.search(r'New RSS context is (\d+)', output).group(1)
@@ -46,6 +59,7 @@ def single(cfg):
                                 'tcp-data-split': 'unknown',
                                 'hds-thresh': hds_thresh,
                                 'rx': rx_rings})
+    defer(mp_clear_wait, cfg)
     cfg.target = channels - 1
     ethtool(f"-X {cfg.ifname} equal {cfg.target}")
@@ -73,6 +87,7 @@ def rss(cfg):
                                 'tcp-data-split': 'unknown',
                                 'hds-thresh': hds_thresh,
                                 'rx': rx_rings})
+    defer(mp_clear_wait, cfg)
     cfg.target = channels - 1
     ethtool(f"-X {cfg.ifname} equal {cfg.target}")
@@ -120,36 +135,25 @@ def test_zcrx_large_chunks(cfg) -> None:
     cfg.require_ipver('6')
-    combined_chans = _get_combined_channels(cfg)
-    if combined_chans < 2:
-        raise KsftSkipEx('at least 2 combined channels required')
-    (rx_ring, hds_thresh) = _get_current_settings(cfg)
-    port = rand_port()
-
-    ethtool(f"-G {cfg.ifname} tcp-data-split on")
-    defer(ethtool, f"-G {cfg.ifname} tcp-data-split auto")
-
-    ethtool(f"-G {cfg.ifname} hds-thresh 0")
-    defer(ethtool, f"-G {cfg.ifname} hds-thresh {hds_thresh}")
-
-    ethtool(f"-G {cfg.ifname} rx 64")
-    defer(ethtool, f"-G {cfg.ifname} rx {rx_ring}")
+    hp_file = "/proc/sys/vm/nr_hugepages"
+    with open(hp_file, 'r+', encoding='utf-8') as f:
+        nr_hugepages = int(f.read().strip())
+        if nr_hugepages < 64:
+            f.seek(0)
+            f.write("64")
+            defer(lambda: open(hp_file, 'w', encoding='utf-8').write(str(nr_hugepages)))
-    ethtool(f"-X {cfg.ifname} equal {combined_chans - 1}")
-    defer(ethtool, f"-X {cfg.ifname} default")
-
-    flow_rule_id = _set_flow_rule(cfg, port, combined_chans - 1)
-    defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}")
-
-    rx_cmd = f"{cfg.bin_local} -s -p {port} -i {cfg.ifname} -q {combined_chans - 1} -x 2"
-    tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {port} -l 12840"
+    single(cfg)
+    rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.target} -x 2"
+    tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840"
     probe = cmd(rx_cmd + " -d", fail=False)
     if probe.ret == SKIP_CODE:
-        raise KsftSkipEx(probe.stdout)
+        raise KsftSkipEx(probe.stdout.strip())
+    mp_clear_wait(cfg)
     with bkg(rx_cmd, exit_wait=True):
-        wait_port_listen(port, proto="tcp")
+        wait_port_listen(cfg.port, proto="tcp")
         cmd(tx_cmd, host=cfg.remote)
@@ -159,8 +163,10 @@ def main() -> None:
         cfg.bin_remote = cfg.remote.deploy(cfg.bin_local)
         cfg.ethnl = EthtoolFamily()
+        cfg.netnl = NetdevFamily()
         cfg.port = rand_port()
-        ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot], args=(cfg, ))
+        ksft_run(globs=globals(), cases=[test_zcrx, test_zcrx_oneshot,
+                                         test_zcrx_large_chunks], args=(cfg, ))
     ksft_exit()
diff --git a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py
index d5d247eca6b7..84a4dab6c649 100644
--- a/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py
+++ b/tools/testing/selftests/drivers/net/hw/lib/py/__init__.py
@@ -3,6 +3,7 @@
 """
 Driver test environment (hardware-only tests).
 NetDrvEnv and NetDrvEpEnv are the main environment classes.
+NetDrvContEnv extends NetDrvEpEnv with netkit container support.
 Former is for local host only tests, latter creates / connects
 to a remote endpoint. See NIPA wiki for more information about
 running and writing driver tests.
@@ -19,33 +20,36 @@ try:
     # Import one by one to avoid pylint false positives
     from net.lib.py import NetNS, NetNSEnter, NetdevSimDev
     from net.lib.py import EthtoolFamily, NetdevFamily, NetshaperFamily, \
-        NlError, RtnlFamily, DevlinkFamily, PSPFamily
+        NlError, RtnlFamily, DevlinkFamily, PSPFamily, Netlink
     from net.lib.py import CmdExitFailure
     from net.lib.py import bkg, cmd, bpftool, bpftrace, defer, ethtool, \
-        fd_read_timeout, ip, rand_port, wait_port_listen, wait_file, tool
+        fd_read_timeout, ip, rand_port, rand_ports, wait_port_listen, \
+        wait_file, tool
+    from net.lib.py import bpf_map_set, bpf_map_dump, bpf_prog_map_ids
     from net.lib.py import KsftSkipEx, KsftFailEx, KsftXfailEx
     from net.lib.py import ksft_disruptive, ksft_exit, ksft_pr, ksft_run, \
         ksft_setup, ksft_variants, KsftNamedVariant
     from net.lib.py import ksft_eq, ksft_ge, ksft_in, ksft_is, ksft_lt, \
         ksft_ne, ksft_not_in, ksft_raises, ksft_true, ksft_gt, ksft_not_none
     from drivers.net.lib.py import GenerateTraffic, Remote, Iperf3Runner
-    from drivers.net.lib.py import NetDrvEnv, NetDrvEpEnv
+    from drivers.net.lib.py import NetDrvEnv, NetDrvEpEnv, NetDrvContEnv
     __all__ = ["NetNS", "NetNSEnter", "NetdevSimDev",
                "EthtoolFamily", "NetdevFamily", "NetshaperFamily",
-               "NlError", "RtnlFamily", "DevlinkFamily", "PSPFamily",
+               "NlError", "RtnlFamily", "DevlinkFamily", "PSPFamily", "Netlink",
                "CmdExitFailure",
                "bkg", "cmd", "bpftool", "bpftrace", "defer", "ethtool",
-               "fd_read_timeout", "ip", "rand_port",
+               "fd_read_timeout", "ip", "rand_port", "rand_ports",
                "wait_port_listen", "wait_file", "tool",
+               "bpf_map_set", "bpf_map_dump", "bpf_prog_map_ids",
                "KsftSkipEx", "KsftFailEx", "KsftXfailEx",
                "ksft_disruptive", "ksft_exit", "ksft_pr", "ksft_run",
                "ksft_setup", "ksft_variants", "KsftNamedVariant",
                "ksft_eq", "ksft_ge", "ksft_in", "ksft_is", "ksft_lt",
                "ksft_ne", "ksft_not_in", "ksft_raises", "ksft_true", "ksft_gt",
                "ksft_not_none", "ksft_not_none",
-               "NetDrvEnv", "NetDrvEpEnv", "GenerateTraffic", "Remote",
-               "Iperf3Runner"]
+               "NetDrvEnv", "NetDrvEpEnv", "NetDrvContEnv", "GenerateTraffic",
+               "Remote", "Iperf3Runner"]
 except ModuleNotFoundError as e:
     print("Failed importing `net` library from kernel sources")
     print(str(e))
diff --git a/tools/testing/selftests/drivers/net/hw/nk_forward.bpf.c b/tools/testing/selftests/drivers/net/hw/nk_forward.bpf.c
new file mode 100644
index 000000000000..86ebfc1445b6
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/nk_forward.bpf.c
@@ -0,0 +1,49 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <linux/pkt_cls.h>
+#include <linux/if_ether.h>
+#include <linux/ipv6.h>
+#include <linux/in6.h>
+#include <bpf/bpf_endian.h>
+#include <bpf/bpf_helpers.h>
+
+#define TC_ACT_OK 0
+#define ETH_P_IPV6 0x86DD
+
+#define ctx_ptr(field) ((void *)(long)(field))
+
+#define v6_p64_equal(a, b) (a.s6_addr32[0] == b.s6_addr32[0] && \
+			    a.s6_addr32[1] == b.s6_addr32[1])
+
+volatile __u32 netkit_ifindex;
+volatile __u8 ipv6_prefix[16];
+
+SEC("tc/ingress")
+int tc_redirect_peer(struct __sk_buff *skb)
+{
+	void *data_end = ctx_ptr(skb->data_end);
+	void *data = ctx_ptr(skb->data);
+	struct in6_addr *peer_addr;
+	struct 
ipv6hdr *ip6h; + struct ethhdr *eth; + + peer_addr = (struct in6_addr *)ipv6_prefix; + + if (skb->protocol != bpf_htons(ETH_P_IPV6)) + return TC_ACT_OK; + + eth = data; + if ((void *)(eth + 1) > data_end) + return TC_ACT_OK; + + ip6h = data + sizeof(struct ethhdr); + if ((void *)(ip6h + 1) > data_end) + return TC_ACT_OK; + + if (!v6_p64_equal(ip6h->daddr, (*peer_addr))) + return TC_ACT_OK; + + return bpf_redirect_peer(netkit_ifindex, 0); +} + +char __license[] SEC("license") = "GPL"; diff --git a/tools/testing/selftests/drivers/net/hw/nk_netns.py b/tools/testing/selftests/drivers/net/hw/nk_netns.py new file mode 100755 index 000000000000..8b7ab75aa27f --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/nk_netns.py @@ -0,0 +1,29 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +""" +Test exercising NetDrvContEnv() itself, a NetDrvContEnv() selftest. +""" + +from lib.py import ksft_run, ksft_exit +from lib.py import NetDrvContEnv +from lib.py import cmd + + +def test_ping(cfg) -> None: + """ Run ping between the container and the remote system. 
""" + cfg.require_ipver("6") + + cmd(f"ping -c 1 -W5 {cfg.nk_guest_ipv6}", host=cfg.remote) + cmd(f"ping -c 1 -W5 {cfg.remote_addr_v['6']}", ns=cfg.netns) + + +def main() -> None: + """ Ksft boiler plate main """ + with NetDrvContEnv(__file__) as cfg: + ksft_run([test_ping], args=(cfg,)) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/hw/nk_qlease.py b/tools/testing/selftests/drivers/net/hw/nk_qlease.py new file mode 100755 index 000000000000..aa83dc321328 --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/nk_qlease.py @@ -0,0 +1,265 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import re +import time +import threading +from os import path +from lib.py import ( + ksft_run, + ksft_exit, + ksft_eq, + ksft_in, + ksft_not_in, + ksft_raises, +) +from lib.py import ( + NetDrvContEnv, + NetNSEnter, + EthtoolFamily, + NetdevFamily, +) +from lib.py import ( + bkg, + cmd, + defer, + ethtool, + ip, + rand_port, + wait_port_listen, +) +from lib.py import KsftSkipEx, CmdExitFailure + + +def set_flow_rule(cfg): + output = ethtool( + f"-N {cfg.ifname} flow-type tcp6 dst-port {cfg.port} action {cfg.src_queue}" + ).stdout + values = re.search(r"ID (\d+)", output).group(1) + return int(values) + + +def test_iou_zcrx(cfg) -> None: + cfg.require_ipver("6") + ethnl = EthtoolFamily() + + rings = ethnl.rings_get({"header": {"dev-index": cfg.ifindex}}) + rx_rings = rings["rx"] + hds_thresh = rings.get("hds-thresh", 0) + + ethnl.rings_set( + { + "header": {"dev-index": cfg.ifindex}, + "tcp-data-split": "enabled", + "hds-thresh": 0, + "rx": 64, + } + ) + defer( + ethnl.rings_set, + { + "header": {"dev-index": cfg.ifindex}, + "tcp-data-split": "unknown", + "hds-thresh": hds_thresh, + "rx": rx_rings, + }, + ) + + ethtool(f"-X {cfg.ifname} equal {cfg.src_queue}") + defer(ethtool, f"-X {cfg.ifname} default") + + flow_rule_id = set_flow_rule(cfg) + defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}") + + 
rx_cmd = f"ip netns exec {cfg.netns.name} {cfg.bin_local} -s -p {cfg.port} -i {cfg._nk_guest_ifname} -q {cfg.nk_queue}" + tx_cmd = f"{cfg.bin_remote} -c -h {cfg.nk_guest_ipv6} -p {cfg.port} -l 12840" + with bkg(rx_cmd, exit_wait=True): + wait_port_listen(cfg.port, proto="tcp", ns=cfg.netns) + cmd(tx_cmd, host=cfg.remote) + + +def test_attrs(cfg) -> None: + cfg.require_ipver("6") + netdevnl = NetdevFamily() + queue_info = netdevnl.queue_get( + {"ifindex": cfg.ifindex, "id": cfg.src_queue, "type": "rx"} + ) + + ksft_eq(queue_info["id"], cfg.src_queue) + ksft_eq(queue_info["type"], "rx") + ksft_eq(queue_info["ifindex"], cfg.ifindex) + + ksft_in("lease", queue_info) + lease = queue_info["lease"] + ksft_eq(lease["ifindex"], cfg.nk_guest_ifindex) + ksft_eq(lease["queue"]["id"], cfg.nk_queue) + ksft_eq(lease["queue"]["type"], "rx") + ksft_in("netns-id", lease) + + +def test_attach_xdp_with_mp(cfg) -> None: + cfg.require_ipver("6") + ethnl = EthtoolFamily() + + rings = ethnl.rings_get({"header": {"dev-index": cfg.ifindex}}) + rx_rings = rings["rx"] + hds_thresh = rings.get("hds-thresh", 0) + + ethnl.rings_set( + { + "header": {"dev-index": cfg.ifindex}, + "tcp-data-split": "enabled", + "hds-thresh": 0, + "rx": 64, + } + ) + defer( + ethnl.rings_set, + { + "header": {"dev-index": cfg.ifindex}, + "tcp-data-split": "unknown", + "hds-thresh": hds_thresh, + "rx": rx_rings, + }, + ) + + ethtool(f"-X {cfg.ifname} equal {cfg.src_queue}") + defer(ethtool, f"-X {cfg.ifname} default") + + netdevnl = NetdevFamily() + + rx_cmd = f"ip netns exec {cfg.netns.name} {cfg.bin_local} -s -p {cfg.port} -i {cfg._nk_guest_ifname} -q {cfg.nk_queue}" + with bkg(rx_cmd): + wait_port_listen(cfg.port, proto="tcp", ns=cfg.netns) + + time.sleep(0.1) + queue_info = netdevnl.queue_get( + {"ifindex": cfg.ifindex, "id": cfg.src_queue, "type": "rx"} + ) + ksft_in("io-uring", queue_info) + + prog = cfg.net_lib_dir / "xdp_dummy.bpf.o" + with ksft_raises(CmdExitFailure): + ip(f"link set dev {cfg.ifname} xdp obj 
{prog} sec xdp.frags") + + time.sleep(0.1) + queue_info = netdevnl.queue_get( + {"ifindex": cfg.ifindex, "id": cfg.src_queue, "type": "rx"} + ) + ksft_not_in("io-uring", queue_info) + + +def test_destroy(cfg) -> None: + cfg.require_ipver("6") + ethnl = EthtoolFamily() + + rings = ethnl.rings_get({"header": {"dev-index": cfg.ifindex}}) + rx_rings = rings["rx"] + hds_thresh = rings.get("hds-thresh", 0) + + ethnl.rings_set( + { + "header": {"dev-index": cfg.ifindex}, + "tcp-data-split": "enabled", + "hds-thresh": 0, + "rx": 64, + } + ) + defer( + ethnl.rings_set, + { + "header": {"dev-index": cfg.ifindex}, + "tcp-data-split": "unknown", + "hds-thresh": hds_thresh, + "rx": rx_rings, + }, + ) + + ethtool(f"-X {cfg.ifname} equal {cfg.src_queue}") + defer(ethtool, f"-X {cfg.ifname} default") + + rx_cmd = f"ip netns exec {cfg.netns.name} {cfg.bin_local} -s -p {cfg.port} -i {cfg._nk_guest_ifname} -q {cfg.nk_queue}" + rx_proc = cmd(rx_cmd, background=True) + wait_port_listen(cfg.port, proto="tcp", ns=cfg.netns) + + netdevnl = NetdevFamily() + queue_info = netdevnl.queue_get( + {"ifindex": cfg.ifindex, "id": cfg.src_queue, "type": "rx"} + ) + ksft_in("io-uring", queue_info) + + # ip link del will wait for all refs to drop first, but iou-zcrx is holding + # onto a ref. Terminate iou-zcrx async via a thread after a delay. 
+ kill_timer = threading.Timer(1, rx_proc.proc.terminate) + kill_timer.start() + + ip(f"link del dev {cfg._nk_host_ifname}") + kill_timer.join() + cfg._nk_host_ifname = None + cfg._nk_guest_ifname = None + + queue_info = netdevnl.queue_get( + {"ifindex": cfg.ifindex, "id": cfg.src_queue, "type": "rx"} + ) + ksft_not_in("io-uring", queue_info) + + cmd(f"tc filter del dev {cfg.ifname} ingress pref {cfg._bpf_prog_pref}") + cfg._tc_attached = False + + flow_rule_id = set_flow_rule(cfg) + defer(ethtool, f"-N {cfg.ifname} delete {flow_rule_id}") + + rx_cmd = f"{cfg.bin_local} -s -p {cfg.port} -i {cfg.ifname} -q {cfg.src_queue}" + tx_cmd = f"{cfg.bin_remote} -c -h {cfg.addr_v['6']} -p {cfg.port} -l 12840" + with bkg(rx_cmd, exit_wait=True): + wait_port_listen(cfg.port, proto="tcp") + cmd(tx_cmd, host=cfg.remote) + # Short delay since iou cleanup is async and takes a bit of time. + time.sleep(0.1) + queue_info = netdevnl.queue_get( + {"ifindex": cfg.ifindex, "id": cfg.src_queue, "type": "rx"} + ) + ksft_not_in("io-uring", queue_info) + + +def main() -> None: + with NetDrvContEnv(__file__, rxqueues=2) as cfg: + cfg.bin_local = path.abspath( + path.dirname(__file__) + "/../../../drivers/net/hw/iou-zcrx" + ) + cfg.bin_remote = cfg.remote.deploy(cfg.bin_local) + cfg.port = rand_port() + + ethnl = EthtoolFamily() + channels = ethnl.channels_get({"header": {"dev-index": cfg.ifindex}}) + channels = channels["combined-count"] + if channels < 2: + raise KsftSkipEx("Test requires NETIF with at least 2 combined channels") + + cfg.src_queue = channels - 1 + + with NetNSEnter(str(cfg.netns)): + netdevnl = NetdevFamily() + bind_result = netdevnl.queue_create( + { + "ifindex": cfg.nk_guest_ifindex, + "type": "rx", + "lease": { + "ifindex": cfg.ifindex, + "queue": {"id": cfg.src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + cfg.nk_queue = bind_result["id"] + + # test_destroy must be last because it destroys the netkit devices + ksft_run( + [test_iou_zcrx, test_attrs, 
test_attach_xdp_with_mp, test_destroy], + args=(cfg,), + ) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/hw/ntuple.py b/tools/testing/selftests/drivers/net/hw/ntuple.py new file mode 100755 index 000000000000..232733142c02 --- /dev/null +++ b/tools/testing/selftests/drivers/net/hw/ntuple.py @@ -0,0 +1,162 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 +"""Test ethtool NFC (ntuple) flow steering rules.""" + +import random +from enum import Enum, auto +from lib.py import ksft_run, ksft_exit +from lib.py import ksft_eq, ksft_ge +from lib.py import ksft_variants, KsftNamedVariant +from lib.py import EthtoolFamily, NetDrvEpEnv, NetdevFamily +from lib.py import KsftSkipEx +from lib.py import cmd, ethtool, defer, rand_ports, bkg, wait_port_listen + + +class NtupleField(Enum): + SRC_IP = auto() + DST_IP = auto() + SRC_PORT = auto() + DST_PORT = auto() + + +def _require_ntuple(cfg): + features = ethtool(f"-k {cfg.ifname}", json=True)[0] + if not features["ntuple-filters"]["active"]: + raise KsftSkipEx("Ntuple filters not enabled on the device: " + str(features["ntuple-filters"])) + + +def _get_rx_cnts(cfg, prev=None): + """Get Rx packet counts for all queues, as a simple list of integers + if @prev is specified the prev counts will be subtracted""" + cfg.wait_hw_stats_settle() + data = cfg.netdevnl.qstats_get({"ifindex": cfg.ifindex, "scope": ["queue"]}, dump=True) + data = [x for x in data if x['queue-type'] == "rx"] + max_q = max([x["queue-id"] for x in data]) + queue_stats = [0] * (max_q + 1) + for q in data: + queue_stats[q["queue-id"]] = q["rx-packets"] + if prev and q["queue-id"] < len(prev): + queue_stats[q["queue-id"]] -= prev[q["queue-id"]] + return queue_stats + + +def _ntuple_rule_add(cfg, flow_spec): + """Install an NFC rule via ethtool.""" + + output = ethtool(f"-N {cfg.ifname} {flow_spec}").stdout + rule_id = int(output.split()[-1]) + defer(ethtool, f"-N {cfg.ifname} delete {rule_id}") + 
+ +def _setup_isolated_queue(cfg): + """Default all traffic to queue 0, and pick a random queue to + steer NFC traffic to.""" + + channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}}) + ch_max = channels['combined-max'] + qcnt = channels['combined-count'] + + if ch_max < 2: + raise KsftSkipEx(f"Need at least 2 combined channels, max is {ch_max}") + + desired_queues = min(ch_max, 4) + if qcnt >= desired_queues: + desired_queues = qcnt + else: + ethtool(f"-L {cfg.ifname} combined {desired_queues}") + defer(ethtool, f"-L {cfg.ifname} combined {qcnt}") + + ethtool(f"-X {cfg.ifname} equal 1") + defer(ethtool, f"-X {cfg.ifname} default") + + return random.randint(1, desired_queues - 1) + + +def _send_traffic(cfg, ipver, proto, dst_port, src_port, pkt_cnt=40): + """Generate traffic with the desired flow signature.""" + + cfg.require_cmd("socat", remote=True) + + socat_proto = proto.upper() + dst_addr = f"[{cfg.addr_v['6']}]" if ipver == '6' else cfg.addr_v['4'] + + extra_opts = ",nodelay" if proto == "tcp" else ",shut-null" + + listen_cmd = (f"socat -{ipver} -t 2 -u " + f"{socat_proto}-LISTEN:{dst_port},reuseport /dev/null") + with bkg(listen_cmd, exit_wait=True): + wait_port_listen(dst_port, proto=proto) + send_cmd = f""" + bash -c 'for i in $(seq {pkt_cnt}); do echo msg; sleep 0.02; done' | + socat -{ipver} -u - \ + {socat_proto}:{dst_addr}:{dst_port},sourceport={src_port},reuseaddr{extra_opts} + """ + cmd(send_cmd, shell=True, host=cfg.remote) + + +def _add_ntuple_rule_and_send_traffic(cfg, ipver, proto, fields, test_queue): + ports = rand_ports(2) + src_port = ports[0] + dst_port = ports[1] + flow_parts = [f"flow-type {proto}{ipver}"] + + for field in fields: + if field == NtupleField.SRC_IP: + flow_parts.append(f"src-ip {cfg.remote_addr_v[ipver]}") + elif field == NtupleField.DST_IP: + flow_parts.append(f"dst-ip {cfg.addr_v[ipver]}") + elif field == NtupleField.SRC_PORT: + flow_parts.append(f"src-port {src_port}") + elif field == 
NtupleField.DST_PORT: + flow_parts.append(f"dst-port {dst_port}") + + flow_parts.append(f"action {test_queue}") + _ntuple_rule_add(cfg, " ".join(flow_parts)) + _send_traffic(cfg, ipver, proto, dst_port=dst_port, src_port=src_port) + + +def _ntuple_variants(): + for ipver in ["4", "6"]: + for proto in ["tcp", "udp"]: + for fields in [[NtupleField.SRC_IP], + [NtupleField.DST_IP], + [NtupleField.SRC_PORT], + [NtupleField.DST_PORT], + [NtupleField.SRC_IP, NtupleField.DST_IP], + [NtupleField.SRC_IP, NtupleField.DST_IP, + NtupleField.SRC_PORT, NtupleField.DST_PORT]]: + name = ".".join(f.name.lower() for f in fields) + yield KsftNamedVariant(f"{proto}{ipver}.{name}", + ipver, proto, fields) + + +@ksft_variants(_ntuple_variants()) +def queue(cfg, ipver, proto, fields): + """Test that an NFC rule steers traffic to the correct queue.""" + + cfg.require_ipver(ipver) + _require_ntuple(cfg) + + test_queue = _setup_isolated_queue(cfg) + + cnts = _get_rx_cnts(cfg) + _add_ntuple_rule_and_send_traffic(cfg, ipver, proto, fields, test_queue) + cnts = _get_rx_cnts(cfg, prev=cnts) + + ksft_ge(cnts[test_queue], 40, f"Traffic on test queue {test_queue}: {cnts}") + sum_idle = sum(cnts) - cnts[0] - cnts[test_queue] + ksft_eq(sum_idle, 0, f"Traffic on idle queues: {cnts}") + + +def main() -> None: + """Ksft boilerplate main.""" + + with NetDrvEpEnv(__file__, nsim_test=False) as cfg: + cfg.ethnl = EthtoolFamily() + cfg.netdevnl = NetdevFamily() + ksft_run([queue], args=(cfg,)) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py index b9b7527c2c6b..51f4e7bc3e5d 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_ctx.py +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py @@ -5,14 +5,15 @@ import datetime import random import re import time +from lib.py import ksft_disruptive from lib.py import ksft_run, ksft_pr, ksft_exit from lib.py import ksft_eq, ksft_ne, ksft_ge, 
ksft_in, ksft_lt, ksft_true, ksft_raises from lib.py import NetDrvEpEnv from lib.py import EthtoolFamily, NetdevFamily from lib.py import KsftSkipEx, KsftFailEx -from lib.py import ksft_disruptive -from lib.py import rand_port -from lib.py import cmd, ethtool, ip, defer, GenerateTraffic, CmdExitFailure, wait_file +from lib.py import rand_port, rand_ports +from lib.py import cmd, ethtool, ip, defer, CmdExitFailure, wait_file +from lib.py import GenerateTraffic def _rss_key_str(key): @@ -165,9 +166,17 @@ def test_rss_key_indir(cfg): ksft_eq(1, max(data['rss-indirection-table'])) # Check we only get traffic on the first 2 queues - cnts = _get_rx_cnts(cfg) - GenerateTraffic(cfg).wait_pkts_and_stop(20000) - cnts = _get_rx_cnts(cfg, prev=cnts) + + # Retry a few times in case the flows skew to a single queue. + attempts = 3 + for attempt in range(attempts): + cnts = _get_rx_cnts(cfg) + GenerateTraffic(cfg).wait_pkts_and_stop(20000) + cnts = _get_rx_cnts(cfg, prev=cnts) + if cnts[0] >= 5000 and cnts[1] >= 5000: + break + ksft_pr(f"Skewed queue distribution, attempt {attempt + 1}/{attempts}: " + str(cnts)) + # 2 queues, 20k packets, must be at least 5k per queue ksft_ge(cnts[0], 5000, "traffic on main context (1/2): " + str(cnts)) ksft_ge(cnts[1], 5000, "traffic on main context (2/2): " + str(cnts)) @@ -177,9 +186,18 @@ def test_rss_key_indir(cfg): # Restore, and check traffic gets spread again reset_indir.exec() - cnts = _get_rx_cnts(cfg) - GenerateTraffic(cfg).wait_pkts_and_stop(20000) - cnts = _get_rx_cnts(cfg, prev=cnts) + for attempt in range(attempts): + cnts = _get_rx_cnts(cfg) + GenerateTraffic(cfg).wait_pkts_and_stop(20000) + cnts = _get_rx_cnts(cfg, prev=cnts) + if qcnt > 4: + if sum(cnts[:2]) < sum(cnts[2:]): + break + else: + if cnts[2] >= 3500: + break + ksft_pr(f"Skewed queue distribution, attempt {attempt + 1}/{attempts}: " + str(cnts)) + if qcnt > 4: # First two queues get less traffic than all the rest ksft_lt(sum(cnts[:2]), sum(cnts[2:]), @@ -356,7 +374,7 
@@ def test_hitless_key_update(cfg): tgen.wait_pkts_and_stop(5000) ksft_lt((t1 - t0).total_seconds(), 0.15) - ksft_eq(errors1 - errors1, 0) + ksft_eq(errors1 - errors0, 0) ksft_eq(carrier1 - carrier0, 0) @@ -454,7 +472,7 @@ def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None): except: raise KsftSkipEx("Not enough queues for the test") - ports = [] + ports = rand_ports(ctx_cnt) # Use queues 0 and 1 for normal traffic ethtool(f"-X {cfg.ifname} equal 2") @@ -488,7 +506,6 @@ def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None): ksft_eq(min(data['rss-indirection-table']), 2 + i * 2, "Unexpected context cfg: " + str(data)) ksft_eq(max(data['rss-indirection-table']), 2 + i * 2 + 1, "Unexpected context cfg: " + str(data)) - ports.append(rand_port()) flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}") @@ -544,7 +561,7 @@ def test_rss_context_out_of_order(cfg, ctx_cnt=4): ntuple = [] ctx = [] - ports = [] + ports = rand_ports(ctx_cnt) def remove_ctx(idx): ntuple[idx].exec() @@ -576,7 +593,6 @@ def test_rss_context_out_of_order(cfg, ctx_cnt=4): ctx_id = ethtool_create(cfg, "-X", f"context new start {2 + i * 2} equal 2") ctx.append(defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete")) - ports.append(rand_port()) flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple.append(defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")) @@ -790,9 +806,10 @@ def test_rss_default_context_rule(cfg): ethtool(f"-N {cfg.ifname} {flow_generic}") defer(ethtool, f"-N {cfg.ifname} delete 1") + ports = rand_ports(2) # Specific high-priority rule for a random port that should stay on context 0. # Assign loc 0 so it is evaluated before the generic rule. 
- port_main = rand_port() + port_main = ports[0] flow_main = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port_main} context 0 loc 0" ethtool(f"-N {cfg.ifname} {flow_main}") defer(ethtool, f"-N {cfg.ifname} delete 0") @@ -805,7 +822,7 @@ def test_rss_default_context_rule(cfg): 'empty' : (2, 3) }) # And that traffic for any other port is steered to the new context - port_other = rand_port() + port_other = ports[1] _send_traffic_check(cfg, port_other, f"context {ctx_id}", { 'target': (2, 3), 'noise' : (0, 1) }) diff --git a/tools/testing/selftests/drivers/net/hw/rss_drv.py b/tools/testing/selftests/drivers/net/hw/rss_drv.py index 2d1a33189076..bd59dace6e15 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_drv.py +++ b/tools/testing/selftests/drivers/net/hw/rss_drv.py @@ -5,9 +5,9 @@ Driver-related behavior tests for RSS. """ -from lib.py import ksft_run, ksft_exit, ksft_ge -from lib.py import ksft_variants, KsftNamedVariant, KsftSkipEx -from lib.py import defer, ethtool +from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_ge +from lib.py import ksft_variants, KsftNamedVariant, KsftSkipEx, ksft_raises +from lib.py import defer, ethtool, CmdExitFailure from lib.py import EthtoolFamily, NlError from lib.py import NetDrvEnv @@ -45,6 +45,18 @@ def _maybe_create_context(cfg, create_context): return ctx_id +def _require_dynamic_indir_size(cfg, ch_max): + """Skip if the device does not dynamically size its indirection table.""" + ethtool(f"-X {cfg.ifname} default") + ethtool(f"-L {cfg.ifname} combined 2") + small = len(_get_rss(cfg)['rss-indirection-table']) + ethtool(f"-L {cfg.ifname} combined {ch_max}") + large = len(_get_rss(cfg)['rss-indirection-table']) + + if small == large: + raise KsftSkipEx("Device does not dynamically size indirection table") + + @ksft_variants([ KsftNamedVariant("main", False), KsftNamedVariant("ctx", True), @@ -76,11 +88,224 @@ def indir_size_4x(cfg, create_context): _test_rss_indir_size(cfg, test_max, context=ctx_id) 
+@ksft_variants([ + KsftNamedVariant("main", False), + KsftNamedVariant("ctx", True), +]) +def resize_periodic(cfg, create_context): + """Test that a periodic indirection table survives channel changes. + + Set a non-default periodic table ([3, 2, 1, 0] x N) via netlink, + reduce channels to trigger a fold, then increase to trigger an + unfold. Using a reversed pattern (instead of [0, 1, 2, 3]) ensures + the test can distinguish a correct fold from a driver that silently + resets the table to defaults. Verify the exact pattern is preserved + and the size tracks the channel count. + """ + channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}}) + ch_max = channels.get('combined-max', 0) + qcnt = channels['combined-count'] + + if ch_max < 4: + raise KsftSkipEx(f"Not enough queues for the test: max={ch_max}") + + defer(ethtool, f"-L {cfg.ifname} combined {qcnt}") + + _require_dynamic_indir_size(cfg, ch_max) + + ctx_id = _maybe_create_context(cfg, create_context) + + # Set a non-default periodic pattern via netlink. + # Send only 4 entries (user_size=4) so the kernel replicates it + # to fill the device table. This allows folding down to 4 entries. 
+ rss = _get_rss(cfg, context=ctx_id) + orig_size = len(rss['rss-indirection-table']) + pattern = [3, 2, 1, 0] + req = {'header': {'dev-index': cfg.ifindex}, 'indir': pattern} + if ctx_id: + req['context'] = ctx_id + else: + defer(ethtool, f"-X {cfg.ifname} default") + cfg.ethnl.rss_set(req) + + # Shrink — should fold + ethtool(f"-L {cfg.ifname} combined 4") + rss = _get_rss(cfg, context=ctx_id) + indir = rss['rss-indirection-table'] + + ksft_ge(orig_size, len(indir), "Table did not shrink") + ksft_eq(indir, [3, 2, 1, 0] * (len(indir) // 4), + "Folded table has wrong pattern") + + # Grow back — should unfold + ethtool(f"-L {cfg.ifname} combined {ch_max}") + rss = _get_rss(cfg, context=ctx_id) + indir = rss['rss-indirection-table'] + + ksft_eq(len(indir), orig_size, "Table size not restored") + ksft_eq(indir, [3, 2, 1, 0] * (len(indir) // 4), + "Unfolded table has wrong pattern") + + +@ksft_variants([ + KsftNamedVariant("main", False), + KsftNamedVariant("ctx", True), +]) +def resize_below_user_size_reject(cfg, create_context): + """Test that shrinking below user_size is rejected. + + Send a table via netlink whose size (user_size) sits between + the small and large device table sizes. The table is periodic, + so folding would normally succeed, but the user_size floor must + prevent it. 
+ """ + channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}}) + ch_max = channels.get('combined-max', 0) + qcnt = channels['combined-count'] + + if ch_max < 4: + raise KsftSkipEx(f"Not enough queues for the test: max={ch_max}") + + defer(ethtool, f"-L {cfg.ifname} combined {qcnt}") + + _require_dynamic_indir_size(cfg, ch_max) + + ctx_id = _maybe_create_context(cfg, create_context) + + # Measure the table size at max channels + rss = _get_rss(cfg, context=ctx_id) + big_size = len(rss['rss-indirection-table']) + + # Measure the table size at reduced channels + ethtool(f"-L {cfg.ifname} combined 4") + rss = _get_rss(cfg, context=ctx_id) + small_size = len(rss['rss-indirection-table']) + ethtool(f"-L {cfg.ifname} combined {ch_max}") + + if small_size >= big_size: + raise KsftSkipEx("Table did not shrink at reduced channels") + + # Find a user_size + user_size = None + for div in [2, 4]: + candidate = big_size // div + if candidate > small_size and big_size % candidate == 0: + user_size = candidate + break + if user_size is None: + raise KsftSkipEx("No suitable user_size between small and big table") + + # Send a periodic sub-table of exactly user_size entries. + # Pattern safe for 4 channels. + pattern = [0, 1, 2, 3] * (user_size // 4) + if len(pattern) != user_size: + raise KsftSkipEx(f"user_size ({user_size}) not divisible by 4") + req = {'header': {'dev-index': cfg.ifindex}, 'indir': pattern} + if ctx_id: + req['context'] = ctx_id + else: + defer(ethtool, f"-X {cfg.ifname} default") + cfg.ethnl.rss_set(req) + + # Shrink channels — table would go to small_size < user_size. + # The table is periodic so folding would work, but user_size + # floor must reject it. + with ksft_raises(CmdExitFailure): + ethtool(f"-L {cfg.ifname} combined 4") + + +@ksft_variants([ + KsftNamedVariant("main", False), + KsftNamedVariant("ctx", True), +]) +def resize_nonperiodic_reject(cfg, create_context): + """Test that a non-periodic table blocks channel reduction. 
+ + Set equal weight across all queues so the table is not periodic + at any smaller size, then verify channel reduction is rejected. + An additional context with a periodic table is created to verify + that validation catches the non-periodic one even when others + are fine. + """ + channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}}) + ch_max = channels.get('combined-max', 0) + qcnt = channels['combined-count'] + + if ch_max < 4: + raise KsftSkipEx(f"Not enough queues for the test: max={ch_max}") + + defer(ethtool, f"-L {cfg.ifname} combined {qcnt}") + + _require_dynamic_indir_size(cfg, ch_max) + + ctx_id = _maybe_create_context(cfg, create_context) + ctx_ref = f"context {ctx_id}" if ctx_id else "" + + # Create an extra context with a periodic (foldable) table so that + # the validation must iterate all contexts to find the bad one. + extra_ctx = _maybe_create_context(cfg, True) + ethtool(f"-X {cfg.ifname} context {extra_ctx} equal 2") + + ethtool(f"-X {cfg.ifname} {ctx_ref} equal {ch_max}") + if not create_context: + defer(ethtool, f"-X {cfg.ifname} default") + + with ksft_raises(CmdExitFailure): + ethtool(f"-L {cfg.ifname} combined 2") + + +@ksft_variants([ + KsftNamedVariant("main", False), + KsftNamedVariant("ctx", True), +]) +def resize_nonperiodic_no_corruption(cfg, create_context): + """Test that a failed resize does not corrupt table or channel count. + + Set a non-periodic table, attempt a channel reduction (which must + fail), then verify both the indirection table contents and the + channel count are unchanged. 
+    """
+    channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}})
+    ch_max = channels.get('combined-max', 0)
+    qcnt = channels['combined-count']
+
+    if ch_max < 4:
+        raise KsftSkipEx(f"Not enough queues for the test: max={ch_max}")
+
+    defer(ethtool, f"-L {cfg.ifname} combined {qcnt}")
+
+    _require_dynamic_indir_size(cfg, ch_max)
+
+    ctx_id = _maybe_create_context(cfg, create_context)
+    ctx_ref = f"context {ctx_id}" if ctx_id else ""
+
+    ethtool(f"-X {cfg.ifname} {ctx_ref} equal {ch_max}")
+    if not create_context:
+        defer(ethtool, f"-X {cfg.ifname} default")
+
+    rss_before = _get_rss(cfg, context=ctx_id)
+
+    with ksft_raises(CmdExitFailure):
+        ethtool(f"-L {cfg.ifname} combined 2")
+
+    rss_after = _get_rss(cfg, context=ctx_id)
+    ksft_eq(rss_after['rss-indirection-table'],
+            rss_before['rss-indirection-table'],
+            "Indirection table corrupted after failed resize")
+
+    channels = cfg.ethnl.channels_get({'header': {'dev-index': cfg.ifindex}})
+    ksft_eq(channels['combined-count'], ch_max,
+            "Channel count changed after failed resize")
+
+
 def main() -> None:
     """ Ksft boiler plate main """
     with NetDrvEnv(__file__) as cfg:
         cfg.ethnl = EthtoolFamily()
-        ksft_run([indir_size_4x], args=(cfg, ))
+        ksft_run([indir_size_4x, resize_periodic,
+                  resize_below_user_size_reject,
+                  resize_nonperiodic_reject,
+                  resize_nonperiodic_no_corruption], args=(cfg, ))
     ksft_exit()
diff --git a/tools/testing/selftests/drivers/net/hw/tso.py b/tools/testing/selftests/drivers/net/hw/tso.py
index 0998e68ebaf0..bb675e3dac88 100755
--- a/tools/testing/selftests/drivers/net/hw/tso.py
+++ b/tools/testing/selftests/drivers/net/hw/tso.py
@@ -36,8 +36,11 @@ def tcp_sock_get_retrans(sock):
 def run_one_stream(cfg, ipver, remote_v4, remote_v6, should_lso):
     cfg.require_cmd("socat", local=False, remote=True)
+    # Set recv window clamp to avoid overwhelming receiver on debug kernels
+    # the 200k clamp should still let use reach > 15Gbps on real HW
     port = rand_port()
-    listen_cmd = f"socat -{ipver} -t 2 -u TCP-LISTEN:{port},reuseport /dev/null,ignoreeof"
+    listen_opts = f"{port},reuseport,tcp-window-clamp=200000"
+    listen_cmd = f"socat -{ipver} -t 2 -u TCP-LISTEN:{listen_opts} /dev/null,ignoreeof"
     with bkg(listen_cmd, host=cfg.remote, exit_wait=True) as nc:
         wait_port_listen(port, host=cfg.remote)
@@ -68,7 +71,7 @@ def run_one_stream(cfg, ipver, remote_v4, remote_v6, should_lso):
     # Make sure we have order of magnitude more LSO packets than
     # retransmits, in case TCP retransmitted all the LSO packets.
-    ksft_lt(tcp_sock_get_retrans(sock), total_lso_wire / 4)
+    ksft_lt(tcp_sock_get_retrans(sock), total_lso_wire / 16)
     sock.close()
     if should_lso:
diff --git a/tools/testing/selftests/drivers/net/hw/uso.py b/tools/testing/selftests/drivers/net/hw/uso.py
new file mode 100755
index 000000000000..6d61e56cab3c
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/uso.py
@@ -0,0 +1,103 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""Test USO
+
+Sends large UDP datagrams with UDP_SEGMENT and verifies that the peer
+receives the expected total payload and that the NIC transmitted at least
+the expected number of segments.
+"""
+import random
+import socket
+import string
+
+from lib.py import ksft_run, ksft_exit, KsftSkipEx
+from lib.py import ksft_eq, ksft_ge, ksft_variants, KsftNamedVariant
+from lib.py import NetDrvEpEnv
+from lib.py import bkg, defer, ethtool, ip, rand_port, wait_port_listen
+
+# python doesn't expose this constant, so we need to hardcode it to enable UDP
+# segmentation for large payloads
+UDP_SEGMENT = 103
+
+
+def _send_uso(cfg, ipver, mss, total_payload, port):
+    if ipver == "4":
+        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
+        dst = (cfg.remote_addr_v["4"], port)
+    else:
+        sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
+        dst = (cfg.remote_addr_v["6"], port)
+
+    sock.setsockopt(socket.IPPROTO_UDP, UDP_SEGMENT, mss)
+    payload = ''.join(random.choice(string.ascii_lowercase)
+                      for _ in range(total_payload))
+    sock.sendto(payload.encode(), dst)
+    sock.close()
+
+
+def _get_tx_packets(cfg):
+    stats = ip(f"-s link show dev {cfg.ifname}", json=True)[0]
+    return stats['stats64']['tx']['packets']
+
+
+def _test_uso(cfg, ipver, mss, total_payload):
+    cfg.require_ipver(ipver)
+    cfg.require_cmd("socat", remote=True)
+
+    features = ethtool(f"-k {cfg.ifname}", json=True)
+    uso_was_on = features[0]["tx-udp-segmentation"]["active"]
+
+    try:
+        ethtool(f"-K {cfg.ifname} tx-udp-segmentation on")
+    except Exception as exc:
+        raise KsftSkipEx(
+            "Device does not support tx-udp-segmentation") from exc
+    if not uso_was_on:
+        defer(ethtool, f"-K {cfg.ifname} tx-udp-segmentation off")
+
+    expected_segs = (total_payload + mss - 1) // mss
+
+    port = rand_port(stype=socket.SOCK_DGRAM)
+    rx_cmd = f"socat -{ipver} -T 2 -u UDP-LISTEN:{port},reuseport STDOUT"
+
+    tx_before = _get_tx_packets(cfg)
+
+    with bkg(rx_cmd, host=cfg.remote, exit_wait=True) as rx:
+        wait_port_listen(port, proto="udp", host=cfg.remote)
+        _send_uso(cfg, ipver, mss, total_payload, port)
+
+    ksft_eq(len(rx.stdout), total_payload,
+            comment=f"Received {len(rx.stdout)}B, expected {total_payload}B")
+
+    cfg.wait_hw_stats_settle()
+
+    tx_after = _get_tx_packets(cfg)
+    tx_delta = tx_after - tx_before
+
+    ksft_ge(tx_delta, expected_segs,
+            comment=f"Expected >= {expected_segs} tx packets, got {tx_delta}")
+
+
+def _uso_variants():
+    for ipver in ["4", "6"]:
+        yield KsftNamedVariant(f"v{ipver}_partial", ipver, 1400, 1400 * 10 + 500)
+        yield KsftNamedVariant(f"v{ipver}_exact", ipver, 1400, 1400 * 5)
+
+
+@ksft_variants(_uso_variants())
+def test_uso(cfg, ipver, mss, total_payload):
+    """Send a USO datagram and verify the peer receives the expected segments."""
+    _test_uso(cfg, ipver, mss, total_payload)
+
+
+def main() -> None:
+    """Run USO tests."""
+    with NetDrvEpEnv(__file__) as cfg:
+        ksft_run([test_uso],
+                 args=(cfg, ))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/drivers/net/hw/xdp_metadata.py b/tools/testing/selftests/drivers/net/hw/xdp_metadata.py
new file mode 100644
index 000000000000..33a1985356d9
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/hw/xdp_metadata.py
@@ -0,0 +1,146 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Tests for XDP metadata kfuncs (e.g. bpf_xdp_metadata_rx_hash).
+
+These tests load device-bound XDP programs from xdp_metadata.bpf.o
+that call metadata kfuncs, send traffic, and verify the extracted
+metadata via BPF maps.
+"""
+from lib.py import ksft_run, ksft_eq, ksft_exit, ksft_ge, ksft_ne, ksft_pr
+from lib.py import KsftNamedVariant, ksft_variants
+from lib.py import CmdExitFailure, KsftSkipEx, NetDrvEpEnv
+from lib.py import NetdevFamily
+from lib.py import bkg, cmd, rand_port, wait_port_listen
+from lib.py import ip, bpftool, defer
+from lib.py import bpf_map_set, bpf_map_dump, bpf_prog_map_ids
+
+
+def _load_xdp_metadata_prog(cfg, prog_name, bpf_file="xdp_metadata.bpf.o"):
+    """Load a device-bound XDP metadata program and return prog/map info.
+
+    Returns:
+        dict with 'id', 'name', and 'maps' (name -> map_id).
+    """
+    abs_path = cfg.net_lib_dir / bpf_file
+    pin_dir = "/sys/fs/bpf/xdp_metadata_test"
+
+    cmd(f"rm -rf {pin_dir}", shell=True, fail=False)
+    cmd(f"mkdir -p {pin_dir}", shell=True)
+
+    try:
+        bpftool(f"prog loadall {abs_path} {pin_dir} type xdp "
+                f"xdpmeta_dev {cfg.ifname}")
+    except CmdExitFailure as e:
+        cmd(f"rm -rf {pin_dir}", shell=True, fail=False)
+        raise KsftSkipEx(
+            f"Failed to load device-bound XDP program '{prog_name}'"
+        ) from e
+    defer(cmd, f"rm -rf {pin_dir}", shell=True, fail=False)
+
+    pin_path = f"{pin_dir}/{prog_name}"
+    ip(f"link set dev {cfg.ifname} xdpdrv pinned {pin_path}")
+    defer(ip, f"link set dev {cfg.ifname} xdpdrv off")
+
+    xdp_info = ip(f"-d link show dev {cfg.ifname}", json=True)[0]
+    prog_id = xdp_info["xdp"]["prog"]["id"]
+
+    return {"id": prog_id,
+            "name": xdp_info["xdp"]["prog"]["name"],
+            "maps": bpf_prog_map_ids(prog_id)}
+
+
+def _send_probe(cfg, port, proto="tcp"):
+    """Send a single payload from the remote end using socat.
+
+    Args:
+        cfg: Configuration object containing network settings.
+        port: Port number for the exchange.
+        proto: Protocol to use, either "tcp" or "udp".
+    """
+    cfg.require_cmd("socat", remote=True)
+
+    if proto == "tcp":
+        rx_cmd = f"socat -{cfg.addr_ipver} -T 2 TCP-LISTEN:{port},reuseport STDOUT"
+        tx_cmd = f"echo -n rss_hash_test | socat -t 2 -u STDIN TCP:{cfg.baddr}:{port}"
+    else:
+        rx_cmd = f"socat -{cfg.addr_ipver} -T 2 -u UDP-RECV:{port},reuseport STDOUT"
+        tx_cmd = f"echo -n rss_hash_test | socat -t 2 -u STDIN UDP:{cfg.baddr}:{port}"
+
+    with bkg(rx_cmd, exit_wait=True):
+        wait_port_listen(port, proto=proto)
+        cmd(tx_cmd, host=cfg.remote, shell=True)
+
+
+# BPF map keys matching the enums in xdp_metadata.bpf.c
+_SETUP_KEY_PORT = 1
+
+_RSS_KEY_HASH = 0
+_RSS_KEY_TYPE = 1
+_RSS_KEY_PKT_CNT = 2
+_RSS_KEY_ERR_CNT = 3
+
+XDP_RSS_L4 = 0x8  # BIT(3) from enum xdp_rss_hash_type
+
+
+@ksft_variants([
+    KsftNamedVariant("tcp", "tcp"),
+    KsftNamedVariant("udp", "udp"),
+])
+def test_xdp_rss_hash(cfg, proto):
+    """Test RSS hash metadata extraction via bpf_xdp_metadata_rx_hash().
+
+    This test will only run on devices that support xdp-rx-metadata-features.
+
+    Loads the xdp_rss_hash program from xdp_metadata, sends a packet using
+    the specified protocol, and verifies that the program extracted a non-zero
+    hash with an L4 hash type.
+    """
+    dev_info = cfg.netnl.dev_get({"ifindex": cfg.ifindex})
+    rx_meta = dev_info.get("xdp-rx-metadata-features", [])
+    if "hash" not in rx_meta:
+        raise KsftSkipEx("device does not support XDP rx hash metadata")
+
+    prog_info = _load_xdp_metadata_prog(cfg, "xdp_rss_hash")
+
+    port = rand_port()
+    bpf_map_set("map_xdp_setup", _SETUP_KEY_PORT, port)
+
+    rss_map_id = prog_info["maps"]["map_rss"]
+
+    _send_probe(cfg, port, proto=proto)
+
+    rss = bpf_map_dump(rss_map_id)
+
+    pkt_cnt = rss.get(_RSS_KEY_PKT_CNT, 0)
+    err_cnt = rss.get(_RSS_KEY_ERR_CNT, 0)
+    hash_val = rss.get(_RSS_KEY_HASH, 0)
+    hash_type = rss.get(_RSS_KEY_TYPE, 0)
+
+    ksft_ge(pkt_cnt, 1, comment="should have received at least one packet")
+    ksft_eq(err_cnt, 0, comment=f"RSS hash error count: {err_cnt}")
+
+    ksft_ne(hash_val, 0,
+            f"RSS hash should be non-zero for {proto.upper()} traffic")
+    ksft_pr(f" RSS hash: {hash_val:#010x}")
+
+    ksft_pr(f" RSS hash type: {hash_type:#06x}")
+    ksft_ne(hash_type & XDP_RSS_L4, 0,
+            f"RSS hash type should include L4 for {proto.upper()} traffic")
+
+
+def main():
+    """Run XDP metadata kfunc tests against a real device."""
+    with NetDrvEpEnv(__file__) as cfg:
+        cfg.netnl = NetdevFamily()
+        ksft_run(
+            [
+                test_xdp_rss_hash,
+            ],
+            args=(cfg,))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/drivers/net/lib/py/__init__.py b/tools/testing/selftests/drivers/net/lib/py/__init__.py
index 8b75faa9af6d..2b5ec0505672 100644
--- a/tools/testing/selftests/drivers/net/lib/py/__init__.py
+++ b/tools/testing/selftests/drivers/net/lib/py/__init__.py
@@ -3,6 +3,7 @@
 """
 Driver test environment.
 NetDrvEnv and NetDrvEpEnv are the main environment classes.
+NetDrvContEnv extends NetDrvEpEnv with netkit container support.
 Former is for local host only tests, latter creates / connects
 to a remote endpoint. See NIPA wiki for more information about
 running and writing driver tests.
@@ -19,10 +20,11 @@ try:
     # Import one by one to avoid pylint false positives
     from net.lib.py import NetNS, NetNSEnter, NetdevSimDev
     from net.lib.py import EthtoolFamily, NetdevFamily, NetshaperFamily, \
-        NlError, RtnlFamily, DevlinkFamily, PSPFamily
+        NlError, RtnlFamily, DevlinkFamily, PSPFamily, Netlink
     from net.lib.py import CmdExitFailure
     from net.lib.py import bkg, cmd, bpftool, bpftrace, defer, ethtool, \
-        fd_read_timeout, ip, rand_port, wait_port_listen, wait_file
+        fd_read_timeout, ip, rand_port, rand_ports, wait_port_listen, wait_file
+    from net.lib.py import bpf_map_set, bpf_map_dump, bpf_prog_map_ids
     from net.lib.py import KsftSkipEx, KsftFailEx, KsftXfailEx
     from net.lib.py import ksft_disruptive, ksft_exit, ksft_pr, ksft_run, \
         ksft_setup, ksft_variants, KsftNamedVariant
@@ -31,11 +33,12 @@ try:
     __all__ = ["NetNS", "NetNSEnter", "NetdevSimDev",
                "EthtoolFamily", "NetdevFamily", "NetshaperFamily",
-               "NlError", "RtnlFamily", "DevlinkFamily", "PSPFamily",
+               "NlError", "RtnlFamily", "DevlinkFamily", "PSPFamily", "Netlink",
                "CmdExitFailure",
                "bkg", "cmd", "bpftool", "bpftrace", "defer", "ethtool",
-               "fd_read_timeout", "ip", "rand_port",
+               "fd_read_timeout", "ip", "rand_port", "rand_ports",
                "wait_port_listen", "wait_file",
+               "bpf_map_set", "bpf_map_dump", "bpf_prog_map_ids",
                "KsftSkipEx", "KsftFailEx", "KsftXfailEx",
                "ksft_disruptive", "ksft_exit", "ksft_pr", "ksft_run",
                "ksft_setup", "ksft_variants", "KsftNamedVariant",
@@ -43,12 +46,12 @@ try:
                "ksft_ne", "ksft_not_in", "ksft_raises", "ksft_true", "ksft_gt",
                "ksft_not_none", "ksft_not_none"]
-    from .env import NetDrvEnv, NetDrvEpEnv
+    from .env import NetDrvEnv, NetDrvEpEnv, NetDrvContEnv
     from .load import GenerateTraffic, Iperf3Runner
     from .remote import Remote
-    __all__ += ["NetDrvEnv", "NetDrvEpEnv", "GenerateTraffic", "Remote",
-                "Iperf3Runner"]
+    __all__ += ["NetDrvEnv", "NetDrvEpEnv", "NetDrvContEnv", "GenerateTraffic",
+                "Remote", "Iperf3Runner"]
 except ModuleNotFoundError as e:
     print("Failed importing `net` library from kernel sources")
     print(str(e))
diff --git a/tools/testing/selftests/drivers/net/lib/py/env.py b/tools/testing/selftests/drivers/net/lib/py/env.py
index 41cc248ac848..24ce122abd9c 100644
--- a/tools/testing/selftests/drivers/net/lib/py/env.py
+++ b/tools/testing/selftests/drivers/net/lib/py/env.py
@@ -1,13 +1,16 @@
 # SPDX-License-Identifier: GPL-2.0
+import ipaddress
 import os
 import time
+import json
 from pathlib import Path
 from lib.py import KsftSkipEx, KsftXfailEx
 from lib.py import ksft_setup, wait_file
 from lib.py import cmd, ethtool, ip, CmdExitFailure
 from lib.py import NetNS, NetdevSimDev
 from .remote import Remote
+from . import bpftool, RtnlFamily, Netlink
 class NetDrvEnvBase:
@@ -255,6 +258,15 @@ class NetDrvEpEnv(NetDrvEnvBase):
         if nsim_test is False and self._ns is not None:
             raise KsftXfailEx("Test does not work on netdevsim")
+    def get_local_nsim_dev(self):
+        """Returns the local netdevsim device or None.
+        Using this method is discouraged, as it makes tests nsim-specific.
+        Standard interfaces available on all HW should ideally be used.
+        This method is intended for the few cases where nsim-specific
+        assertions need to be verified which cannot be verified otherwise.
+        """
+        return self._ns
+
     def _require_cmd(self, comm, key, host=None):
         cached = self._required_cmd.get(comm, {})
         if cached.get(key) is None:
@@ -285,7 +297,211 @@ class NetDrvEpEnv(NetDrvEnvBase):
                 if "Operation not supported" not in e.cmd.stderr:
                     raise
-            self._stats_settle_time = 0.025 + \
-                data.get('stats-block-usecs', 0) / 1000 / 1000
+            self._stats_settle_time = \
+                1.25 * data.get('stats-block-usecs', 20000) / 1000 / 1000
         time.sleep(self._stats_settle_time)
+
+
+class NetDrvContEnv(NetDrvEpEnv):
+    """
+    Class for an environment with a netkit pair setup for forwarding traffic
+    between the physical interface and a network namespace.
+    NETIF = "eth0"
+    LOCAL_V6 = "2001:db8:1::1"
+    REMOTE_V6 = "2001:db8:1::2"
+    LOCAL_PREFIX_V6 = "2001:db8:2::0/64"
+
+            +-----------------------------+        +------------------------------+
+    dst     | INIT NS                     |        | TEST NS                      |
+    2001:   | +---------------+           |        |                              |
+    db8:2::2| |     NETIF     |           |  bpf   |                              |
+        +---|>| 2001:db8:1::1 |           |redirect| +-------------------------+  |
+        |   | |               |-----------|--------|>|         Netkit          |  |
+        |   | +---------------+           | _peer  | |        nk_guest         |  |
+        |   | +-------------+ Netkit pair |        | |       fe80::2/64        |  |
+        |   | |   Netkit    |.............|........|>|    2001:db8:2::2/64     |  |
+        |   | |   nk_host   |             |        | +-------------------------+  |
+        |   | | fe80::1/64  |             |        |                              |
+        |   | +-------------+             |        | route:                       |
+        |   |                             |        |   default                    |
+        |   | route:                      |        |     via fe80::1 dev nk_guest |
+        |   |   2001:db8:2::2/128         |        +------------------------------+
+        |   |     via fe80::2 dev nk_host |
+        |   +-----------------------------+
+        |
+        |   +---------------+
+        |   |    REMOTE     |
+        +---| 2001:db8:1::2 |
+            +---------------+
+    """
+
+    def __init__(self, src_path, rxqueues=1, **kwargs):
+        self.netns = None
+        self._nk_host_ifname = None
+        self._nk_guest_ifname = None
+        self._tc_clsact_added = False
+        self._tc_attached = False
+        self._bpf_prog_pref = None
+        self._bpf_prog_id = None
+        self._init_ns_attached = False
+        self._old_fwd = None
+        self._old_accept_ra = None
+
+        super().__init__(src_path, **kwargs)
+
+        self.require_ipver("6")
+        local_prefix = self.env.get("LOCAL_PREFIX_V6")
+        if not local_prefix:
+            raise KsftSkipEx("LOCAL_PREFIX_V6 required")
+
+        net = ipaddress.IPv6Network(local_prefix, strict=False)
+        self.ipv6_prefix = str(net.network_address)
+        self.nk_host_ipv6 = f"{self.ipv6_prefix}2:1"
+        self.nk_guest_ipv6 = f"{self.ipv6_prefix}2:2"
+
+        local_v6 = ipaddress.IPv6Address(self.addr_v["6"])
+        if local_v6 in net:
+            raise KsftSkipEx("LOCAL_V6 must not fall within LOCAL_PREFIX_V6")
+
+        rtnl = RtnlFamily()
+        rtnl.newlink(
+            {
+                "linkinfo": {
+                    "kind": "netkit",
+                    "data": {
+                        "mode": "l2",
+                        "policy": "forward",
+                        "peer-policy": "forward",
+                    },
+                },
+                "num-rx-queues": rxqueues,
+            },
+            flags=[Netlink.NLM_F_CREATE, Netlink.NLM_F_EXCL],
+        )
+
+        all_links = ip("-d link show", json=True)
+        netkit_links = [link for link in all_links
+                        if link.get('linkinfo', {}).get('info_kind') == 'netkit'
+                        and 'UP' not in link.get('flags', [])]
+
+        if len(netkit_links) != 2:
+            raise KsftSkipEx("Failed to create netkit pair")
+
+        netkit_links.sort(key=lambda x: x['ifindex'])
+        self._nk_host_ifname = netkit_links[1]['ifname']
+        self._nk_guest_ifname = netkit_links[0]['ifname']
+        self.nk_host_ifindex = netkit_links[1]['ifindex']
+        self.nk_guest_ifindex = netkit_links[0]['ifindex']
+
+        self._setup_ns()
+        self._attach_bpf()
+
+    def __del__(self):
+        if self._tc_attached:
+            cmd(f"tc filter del dev {self.ifname} ingress pref {self._bpf_prog_pref}")
+            self._tc_attached = False
+
+        if self._tc_clsact_added:
+            cmd(f"tc qdisc del dev {self.ifname} clsact")
+            self._tc_clsact_added = False
+
+        if self._nk_host_ifname:
+            cmd(f"ip link del dev {self._nk_host_ifname}")
+            self._nk_host_ifname = None
+            self._nk_guest_ifname = None
+
+        if self._init_ns_attached:
+            cmd("ip netns del init", fail=False)
+            self._init_ns_attached = False
+
+        if self.netns:
+            del self.netns
+            self.netns = None
+
+        if self._old_fwd is not None:
+            with open("/proc/sys/net/ipv6/conf/all/forwarding", "w",
+                      encoding="utf-8") as f:
+                f.write(self._old_fwd)
+            self._old_fwd = None
+        if self._old_accept_ra is not None:
+            with open("/proc/sys/net/ipv6/conf/all/accept_ra", "w",
+                      encoding="utf-8") as f:
+                f.write(self._old_accept_ra)
+            self._old_accept_ra = None
+
+        super().__del__()
+
+    def _setup_ns(self):
+        fwd_path = "/proc/sys/net/ipv6/conf/all/forwarding"
+        ra_path = "/proc/sys/net/ipv6/conf/all/accept_ra"
+        with open(fwd_path, encoding="utf-8") as f:
+            self._old_fwd = f.read().strip()
+        with open(ra_path, encoding="utf-8") as f:
+            self._old_accept_ra = f.read().strip()
+        with open(fwd_path, "w", encoding="utf-8") as f:
+            f.write("1")
+        with open(ra_path, "w", encoding="utf-8") as f:
+            f.write("2")
+
+        self.netns = NetNS()
+        cmd("ip netns attach init 1")
+        self._init_ns_attached = True
+        ip("netns set init 0", ns=self.netns)
+        ip(f"link set dev {self._nk_guest_ifname} netns {self.netns.name}")
+        ip(f"link set dev {self._nk_host_ifname} up")
+        ip(f"-6 addr add fe80::1/64 dev {self._nk_host_ifname} nodad")
+        ip(f"-6 route add {self.nk_guest_ipv6}/128 via fe80::2 dev {self._nk_host_ifname}")
+
+        ip("link set lo up", ns=self.netns)
+        ip(f"link set dev {self._nk_guest_ifname} up", ns=self.netns)
+        ip(f"-6 addr add fe80::2/64 dev {self._nk_guest_ifname}", ns=self.netns)
+        ip(f"-6 addr add {self.nk_guest_ipv6}/64 dev {self._nk_guest_ifname} nodad", ns=self.netns)
+        ip(f"-6 route add default via fe80::1 dev {self._nk_guest_ifname}", ns=self.netns)
+
+    def _tc_ensure_clsact(self):
+        qdisc = json.loads(cmd(f"tc -j qdisc show dev {self.ifname}").stdout)
+        for q in qdisc:
+            if q['kind'] == 'clsact':
+                return
+        cmd(f"tc qdisc add dev {self.ifname} clsact")
+        self._tc_clsact_added = True
+
+    def _get_bpf_prog_ids(self):
+        filters = json.loads(cmd(f"tc -j filter show dev {self.ifname} ingress").stdout)
+        for bpf in filters:
+            if 'options' not in bpf:
+                continue
+            if bpf['options']['bpf_name'].startswith('nk_forward.bpf'):
+                return (bpf['pref'], bpf['options']['prog']['id'])
+        raise Exception("Failed to get BPF prog ID")
+
+    def _attach_bpf(self):
+        bpf_obj = self.test_dir / "nk_forward.bpf.o"
+        if not bpf_obj.exists():
+            raise KsftSkipEx("BPF prog not found")
+
+        self._tc_ensure_clsact()
+        cmd(f"tc filter add dev {self.ifname} ingress bpf obj {bpf_obj}"
+            " sec tc/ingress direct-action")
+        self._tc_attached = True
+
+        (self._bpf_prog_pref, self._bpf_prog_id) = self._get_bpf_prog_ids()
+        prog_info = bpftool(f"prog show id {self._bpf_prog_id}", json=True)
+        map_ids = prog_info.get("map_ids", [])
+
+        bss_map_id = None
+        for map_id in map_ids:
+            map_info = bpftool(f"map show id {map_id}", json=True)
+            if map_info.get("name").endswith("bss"):
+                bss_map_id = map_id
+
+        if bss_map_id is None:
+            raise Exception("Failed to find .bss map")
+
+        ipv6_addr = ipaddress.IPv6Address(self.ipv6_prefix)
+        ipv6_bytes = ipv6_addr.packed
+        ifindex_bytes = self.nk_host_ifindex.to_bytes(4, byteorder='little')
+        value = ipv6_bytes + ifindex_bytes
+        value_hex = ' '.join(f'{b:02x}' for b in value)
+        bpftool(f"map update id {bss_map_id} key hex 00 00 00 00 value hex {value_hex}")
diff --git a/tools/testing/selftests/drivers/net/lib/sh/lib_netcons.sh b/tools/testing/selftests/drivers/net/lib/sh/lib_netcons.sh
index 02dcdeb723be..a9a01a64b7b3 100644
--- a/tools/testing/selftests/drivers/net/lib/sh/lib_netcons.sh
+++ b/tools/testing/selftests/drivers/net/lib/sh/lib_netcons.sh
@@ -251,7 +251,7 @@ function listen_port_and_save_to() {
 	# Just wait for 3 seconds
 	timeout 3 ip netns exec "${NAMESPACE}" \
-		socat "${SOCAT_MODE}":"${PORT}",fork "${OUTPUT}" 2> /dev/null
+		socat "${SOCAT_MODE}":"${PORT}",fork,shut-none "${OUTPUT}" 2> /dev/null
 }
 # Only validate that the message arrived properly
@@ -360,8 +360,8 @@ function check_for_taskset() {
 # This is necessary if running multiple tests in a row
 function pkill_socat() {
-	PROCESS_NAME4="socat UDP-LISTEN:6666,fork ${OUTPUT_FILE}"
-	PROCESS_NAME6="socat UDP6-LISTEN:6666,fork ${OUTPUT_FILE}"
+	PROCESS_NAME4="socat UDP-LISTEN:6666,fork,shut-none ${OUTPUT_FILE}"
+	PROCESS_NAME6="socat UDP6-LISTEN:6666,fork,shut-none ${OUTPUT_FILE}"
 	# socat runs under timeout(1), kill it if it is still alive
 	# do not fail if socat doesn't exist anymore
 	set +e
diff --git a/tools/testing/selftests/drivers/net/macsec.py b/tools/testing/selftests/drivers/net/macsec.py
new file mode 100755
index 000000000000..9a83d9542e04
--- /dev/null
+++ b/tools/testing/selftests/drivers/net/macsec.py
@@ -0,0 +1,343 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""MACsec tests."""
+
+import os
+
+from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_raises
+from lib.py import ksft_variants, KsftNamedVariant
+from lib.py import CmdExitFailure, KsftSkipEx
+from lib.py import NetDrvEpEnv
+from lib.py import cmd, ip, defer, ethtool
+
+MACSEC_KEY = "12345678901234567890123456789012"
+MACSEC_VLAN_VID = 10
+
+# Unique prefix per run to avoid collisions in the shared netns.
+# Keep it short: IFNAMSIZ is 16 (incl. NUL), and VLAN names append ".<vid>".
+MACSEC_PFX = f"ms{os.getpid()}_"
+
+
+def _macsec_name(idx=0):
+    return f"{MACSEC_PFX}{idx}"
+
+
+def _get_macsec_offload(dev):
+    """Returns macsec offload mode string from ip -d link show."""
+    info = ip(f"-d link show dev {dev}", json=True)[0]
+    return info.get("linkinfo", {}).get("info_data", {}).get("offload")
+
+
+def _get_features(dev):
+    """Returns ethtool features dict for a device."""
+    return ethtool(f"-k {dev}", json=True)[0]
+
+
+def _require_ip_macsec(cfg):
+    """SKIP if iproute2 on local or remote lacks 'ip macsec' support."""
+    for host in [None, cfg.remote]:
+        out = cmd("ip macsec help", fail=False, host=host)
+        if "Usage" not in out.stdout + out.stderr:
+            where = "remote" if host else "local"
+            raise KsftSkipEx(f"iproute2 too old on {where},"
+                             " missing macsec support")
+
+
+def _require_ip_macsec_offload():
+    """SKIP if local iproute2 doesn't understand 'ip macsec offload'."""
+    out = cmd("ip macsec help", fail=False)
+    if "offload" not in out.stdout + out.stderr:
+        raise KsftSkipEx("iproute2 too old, missing macsec offload")
+
+
+def _require_macsec_offload(cfg):
+    """SKIP if local device doesn't support macsec-hw-offload."""
+    _require_ip_macsec_offload()
+    try:
+        feat = ethtool(f"-k {cfg.ifname}", json=True)[0]
+    except (CmdExitFailure, IndexError) as e:
+        raise KsftSkipEx(
+            f"can't query features: {e}") from e
+    if not feat.get("macsec-hw-offload", {}).get("active"):
+        raise KsftSkipEx("macsec-hw-offload not supported")
+
+
+def _get_mac(ifname, host=None):
+    """Gets MAC address of an interface."""
+    dev = ip(f"link show dev {ifname}", json=True, host=host)
+    return dev[0]["address"]
+
+
+def _setup_macsec_sa(cfg, name):
+    """Adds matching TX/RX SAs on both ends."""
+    local_mac = _get_mac(name)
+    remote_mac = _get_mac(name, host=cfg.remote)
+
+    ip(f"macsec add {name} tx sa 0 pn 1 on key 01 {MACSEC_KEY}")
+    ip(f"macsec add {name} rx port 1 address {remote_mac}")
+    ip(f"macsec add {name} rx port 1 address {remote_mac} "
+       f"sa 0 pn 1 on key 02 {MACSEC_KEY}")
+
+    ip(f"macsec add {name} tx sa 0 pn 1 on key 02 {MACSEC_KEY}",
+       host=cfg.remote)
+    ip(f"macsec add {name} rx port 1 address {local_mac}", host=cfg.remote)
+    ip(f"macsec add {name} rx port 1 address {local_mac} "
+       f"sa 0 pn 1 on key 01 {MACSEC_KEY}", host=cfg.remote)
+
+
+def _setup_macsec_devs(cfg, name, offload):
+    """Creates macsec devices on both ends.
+
+    Only the local device gets HW offload; the remote always uses software
+    MACsec since it may not support offload at all.
+    """
+    offload_arg = "mac" if offload else "off"
+
+    ip(f"link add link {cfg.ifname} {name} "
+       f"type macsec encrypt on offload {offload_arg}")
+    defer(ip, f"link del {name}")
+    ip(f"link add link {cfg.remote_ifname} {name} "
+       f"type macsec encrypt on", host=cfg.remote)
+    defer(ip, f"link del {name}", host=cfg.remote)
+
+
+def _set_offload(name, offload):
+    """Sets offload on the local macsec device only."""
+    offload_arg = "mac" if offload else "off"
+
+    ip(f"link set {name} type macsec encrypt on offload {offload_arg}")
+
+
+def _setup_vlans(cfg, name, vid):
+    """Adds VLANs on top of existing macsec devs."""
+    vlan_name = f"{name}.{vid}"
+
+    ip(f"link add link {name} {vlan_name} type vlan id {vid}")
+    defer(ip, f"link del {vlan_name}")
+    ip(f"link add link {name} {vlan_name} type vlan id {vid}", host=cfg.remote)
+    defer(ip, f"link del {vlan_name}", host=cfg.remote)
+
+
+def _setup_vlan_ips(cfg, name, vid):
+    """Adds VLANs and IPs and brings up the macsec + VLAN devices."""
+    local_ip = "198.51.100.1"
+    remote_ip = "198.51.100.2"
+    vlan_name = f"{name}.{vid}"
+
+    ip(f"addr add {local_ip}/24 dev {vlan_name}")
+    ip(f"addr add {remote_ip}/24 dev {vlan_name}", host=cfg.remote)
+    ip(f"link set {name} up")
+    ip(f"link set {name} up", host=cfg.remote)
+    ip(f"link set {vlan_name} up")
+    ip(f"link set {vlan_name} up", host=cfg.remote)
+
+    return vlan_name, remote_ip
+
+
+def test_offload_api(cfg) -> None:
+    """MACsec offload API: create SecY, add SA/rx, toggle offload."""
+
+    _require_macsec_offload(cfg)
+    ms0 = _macsec_name(0)
+    ms1 = _macsec_name(1)
+    ms2 = _macsec_name(2)
+
+    # Create 3 SecY with offload
+    ip(f"link add link {cfg.ifname} {ms0} type macsec "
+       f"port 4 encrypt on offload mac")
+    defer(ip, f"link del {ms0}")
+
+    ip(f"link add link {cfg.ifname} {ms1} type macsec "
+       f"address aa:bb:cc:dd:ee:ff port 5 encrypt on offload mac")
+    defer(ip, f"link del {ms1}")
+
+    ip(f"link add link {cfg.ifname} {ms2} type macsec "
+       f"sci abbacdde01020304 encrypt on offload mac")
+    defer(ip, f"link del {ms2}")
+
+    # Add TX SA
+    ip(f"macsec add {ms0} tx sa 0 pn 1024 on "
+       "key 01 12345678901234567890123456789012")
+
+    # Add RX SC + SA
+    ip(f"macsec add {ms0} rx port 1234 address 1c:ed:de:ad:be:ef")
+    ip(f"macsec add {ms0} rx port 1234 address 1c:ed:de:ad:be:ef "
+       "sa 0 pn 1 on key 00 0123456789abcdef0123456789abcdef")
+
+    # Can't disable offload when SAs are configured
+    with ksft_raises(CmdExitFailure):
+        ip(f"link set {ms0} type macsec offload off")
+    with ksft_raises(CmdExitFailure):
+        ip(f"macsec offload {ms0} off")
+
+    # Toggle offload via rtnetlink on SA-free device
+    ip(f"link set {ms2} type macsec offload off")
+    ip(f"link set {ms2} type macsec encrypt on offload mac")
+
+    # Toggle offload via genetlink
+    ip(f"macsec offload {ms2} off")
+    ip(f"macsec offload {ms2} mac")
+
+
+def test_max_secy(cfg) -> None:
+    """nsim-only test for max number of SecYs."""
+
+    cfg.require_nsim()
+    _require_ip_macsec_offload()
+    ms0 = _macsec_name(0)
+    ms1 = _macsec_name(1)
+    ms2 = _macsec_name(2)
+    ms3 = _macsec_name(3)
+
+    ip(f"link add link {cfg.ifname} {ms0} type macsec "
+       f"port 4 encrypt on offload mac")
+    defer(ip, f"link del {ms0}")
+
+    ip(f"link add link {cfg.ifname} {ms1} type macsec "
+       f"address aa:bb:cc:dd:ee:ff port 5 encrypt on offload mac")
+    defer(ip, f"link del {ms1}")
+
+    ip(f"link add link {cfg.ifname} {ms2} type macsec "
+       f"sci abbacdde01020304 encrypt on offload mac")
+    defer(ip, f"link del {ms2}")
+    with ksft_raises(CmdExitFailure):
+        ip(f"link add link {cfg.ifname} {ms3} "
+           f"type macsec port 8 encrypt on offload mac")
+
+
+def test_max_sc(cfg) -> None:
+    """nsim-only test for max number of SCs."""
+
+    cfg.require_nsim()
+    _require_ip_macsec_offload()
+    ms0 = _macsec_name(0)
+
+    ip(f"link add link {cfg.ifname} {ms0} type macsec "
+       f"port 4 encrypt on offload mac")
+    defer(ip, f"link del {ms0}")
+    ip(f"macsec add {ms0} rx port 1234 address 1c:ed:de:ad:be:ef")
+    with ksft_raises(CmdExitFailure):
+        ip(f"macsec add {ms0} rx port 1235 address 1c:ed:de:ad:be:ef")
+
+
+def test_offload_state(cfg) -> None:
+    """Offload state reflects configuration changes."""
+
+    _require_macsec_offload(cfg)
+    ms0 = _macsec_name(0)
+
+    # Create with offload on
+    ip(f"link add link {cfg.ifname} {ms0} type macsec "
+       f"encrypt on offload mac")
+    cleanup = defer(ip, f"link del {ms0}")
+
+    ksft_eq(_get_macsec_offload(ms0), "mac",
+            "created with offload: should be mac")
+    feats_on_1 = _get_features(ms0)
+
+    ip(f"link set {ms0} type macsec offload off")
+    ksft_eq(_get_macsec_offload(ms0), "off",
+            "offload disabled: should be off")
+    feats_off_1 = _get_features(ms0)
+
+    ip(f"link set {ms0} type macsec encrypt on offload mac")
+    ksft_eq(_get_macsec_offload(ms0), "mac",
+            "offload re-enabled: should be mac")
+    ksft_eq(_get_features(ms0), feats_on_1,
+            "features should match first offload-on snapshot")
+
+    # Delete and recreate without offload
+    cleanup.exec()
+    ip(f"link add link {cfg.ifname} {ms0} type macsec")
+    defer(ip, f"link del {ms0}")
+    ksft_eq(_get_macsec_offload(ms0), "off",
+            "created without offload: should be off")
+    ksft_eq(_get_features(ms0), feats_off_1,
+            "features should match first offload-off snapshot")
+
+    ip(f"link set {ms0} type macsec encrypt on offload mac")
+    ksft_eq(_get_macsec_offload(ms0), "mac",
+            "offload enabled after create: should be mac")
+    ksft_eq(_get_features(ms0), feats_on_1,
+            "features should match first offload-on snapshot")
+
+
+def _check_nsim_vid(cfg, vid, expected) -> None:
+    """Checks if a VLAN is present. Only works on netdevsim."""
+
+    nsim = cfg.get_local_nsim_dev()
+    if not nsim:
+        return
+
+    vlan_path = os.path.join(nsim.nsims[0].dfs_dir, "vlan")
+    with open(vlan_path, encoding="utf-8") as f:
+        vids = f.read()
+    found = f"ctag {vid}\n" in vids
+    ksft_eq(found, expected,
+            f"VLAN {vid} {'expected' if expected else 'not expected'}"
+            f" in debugfs")
+
+
+@ksft_variants([
+    KsftNamedVariant("offloaded", True),
+    KsftNamedVariant("software", False),
+])
+def test_vlan(cfg, offload) -> None:
+    """Ping through VLAN-over-macsec."""
+
+    _require_ip_macsec(cfg)
+    if offload:
+        _require_macsec_offload(cfg)
+    else:
+        _require_ip_macsec_offload()
+    name = _macsec_name()
+    _setup_macsec_devs(cfg, name, offload=offload)
+    _setup_macsec_sa(cfg, name)
+    _setup_vlans(cfg, name, MACSEC_VLAN_VID)
+    vlan_name, remote_ip = _setup_vlan_ips(cfg, name, MACSEC_VLAN_VID)
+    _check_nsim_vid(cfg, MACSEC_VLAN_VID, offload)
+    # nsim doesn't handle the data path for offloaded macsec, so skip
+    # the ping when offloaded on nsim.
+    if not offload or not cfg.get_local_nsim_dev():
+        cmd(f"ping -I {vlan_name} -c 1 -W 5 {remote_ip}")
+
+
+@ksft_variants([
+    KsftNamedVariant("on_to_off", True),
+    KsftNamedVariant("off_to_on", False),
+])
+def test_vlan_toggle(cfg, offload) -> None:
+    """Toggle offload: VLAN filters propagate/remove correctly."""
+
+    _require_ip_macsec(cfg)
+    _require_macsec_offload(cfg)
+    name = _macsec_name()
+    _setup_macsec_devs(cfg, name, offload=offload)
+    _setup_vlans(cfg, name, MACSEC_VLAN_VID)
+    _check_nsim_vid(cfg, MACSEC_VLAN_VID, offload)
+    _set_offload(name, offload=not offload)
+    _check_nsim_vid(cfg, MACSEC_VLAN_VID, not offload)
+    vlan_name, remote_ip = _setup_vlan_ips(cfg, name, MACSEC_VLAN_VID)
+    _setup_macsec_sa(cfg, name)
+    # nsim doesn't handle the data path for offloaded macsec, so skip
+    # the ping when the final state is offloaded on nsim.
+    if offload or not cfg.get_local_nsim_dev():
+        cmd(f"ping -I {vlan_name} -c 1 -W 5 {remote_ip}")
+
+
+def main() -> None:
+    """Main program."""
+    with NetDrvEpEnv(__file__) as cfg:
+        ksft_run([test_offload_api,
+                  test_max_secy,
+                  test_max_sc,
+                  test_offload_state,
+                  test_vlan,
+                  test_vlan_toggle,
+                  ], args=(cfg,))
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/drivers/net/netconsole/netcons_basic.sh b/tools/testing/selftests/drivers/net/netconsole/netcons_basic.sh
index 59cf10013ecd..7976206523b2 100755
--- a/tools/testing/selftests/drivers/net/netconsole/netcons_basic.sh
+++ b/tools/testing/selftests/drivers/net/netconsole/netcons_basic.sh
@@ -58,7 +58,11 @@ do
 	# Send the message
 	echo "${MSG}: ${TARGET}" > /dev/kmsg
 	# Wait until socat saves the file to disk
-	busywait "${BUSYWAIT_TIMEOUT}" test -s "${OUTPUT_FILE}"
+	if ! busywait "${BUSYWAIT_TIMEOUT}" test -s "${OUTPUT_FILE}"
+	then
+		echo "FAIL: Timed out waiting (${BUSYWAIT_TIMEOUT} ms) for netconsole message in ${OUTPUT_FILE}" >&2
+		exit "${ksft_fail}"
+	fi
 	# Make sure the message was received in the dst part
 	# and exit
diff --git a/tools/testing/selftests/drivers/net/netdevsim/Makefile b/tools/testing/selftests/drivers/net/netdevsim/Makefile
index 1a228c5430f5..9808c2fbae9e 100644
--- a/tools/testing/selftests/drivers/net/netdevsim/Makefile
+++ b/tools/testing/selftests/drivers/net/netdevsim/Makefile
@@ -11,7 +11,6 @@ TEST_PROGS := \
 	fib.sh \
 	fib_notifications.sh \
 	hw_stats_l3.sh \
-	macsec-offload.sh \
 	nexthop.sh \
 	peer.sh \
 	psample.sh \
diff --git a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
index 1b529ccaf050..22a626c6cde3 100755
--- a/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
+++ b/tools/testing/selftests/drivers/net/netdevsim/devlink.sh
@@ -5,7 +5,8 @@ lib_dir=$(dirname $0)/../../../net/forwarding
 ALL_TESTS="fw_flash_test params_test \
 	   params_default_test regions_test reload_test \
-	   netns_reload_test resource_test dev_info_test \
+	   netns_reload_test resource_test resource_dump_test \
+	   port_resource_doit_test dev_info_test \
 	   empty_reporter_test dummy_reporter_test rate_test"
 NUM_NETIFS=0
 source $lib_dir/lib.sh
@@ -482,6 +483,56 @@ resource_test()
 	log_test "resource test"
 }
+resource_dump_test()
+{
+	RET=0
+
+	local port_jq
+	local dev_jq
+	local dl_jq
+	local count
+
+	dl_jq="with_entries(select(.key | startswith(\"$DL_HANDLE\")))"
+	port_jq="[.[] | $dl_jq | keys |"
+	port_jq+=" map(select(test(\"/.+/\"))) | length] | add"
+	dev_jq="[.[] | $dl_jq | keys |"
+	dev_jq+=" map(select(test(\"/.+/\")|not)) | length] | add"
+
+	if ! devlink resource help 2>&1 | grep -q "scope"; then
+		echo "SKIP: devlink resource show not supported"
+		return
+	fi
+
+	devlink resource show > /dev/null 2>&1
+	check_err $? "Failed to dump all resources"
+
+	count=$(cmd_jq "devlink resource show -j" "$port_jq")
+	[ "$count" -gt "0" ]
+	check_err $? "missing port resources in resource dump"
+
+	count=$(cmd_jq "devlink resource show -j" "$dev_jq")
+	[ "$count" -gt "0" ]
+	check_err $? "missing device resources in resource dump"
+
+	count=$(cmd_jq "devlink resource show scope dev -j" "$dev_jq")
+	[ "$count" -gt "0" ]
+	check_err $? "dev scope missing device resources"
+
+	count=$(cmd_jq "devlink resource show scope dev -j" "$port_jq")
+	[ "$count" -eq "0" ]
+	check_err $? "dev scope returned port resources"
+
+	count=$(cmd_jq "devlink resource show scope port -j" "$port_jq")
+	[ "$count" -gt "0" ]
+	check_err $? "port scope missing port resources"
+
+	count=$(cmd_jq "devlink resource show scope port -j" "$dev_jq")
+	[ "$count" -eq "0" ]
+	check_err $? "port scope returned device resources"
+
+	log_test "resource dump test"
+}
+
 info_get()
 {
 	local name=$1
@@ -768,6 +819,32 @@ rate_node_del()
 	devlink port function rate del $handle
 }
+port_resource_doit_test()
+{
+	RET=0
+
+	local port_handle="${DL_HANDLE}/0"
+	local name
+	local size
+
+	if ! devlink resource help 2>&1 | grep -q "PORT_INDEX"; then
+		echo "SKIP: devlink resource show with port not supported"
+		return
+	fi
+
+	name=$(cmd_jq "devlink resource show $port_handle -j" \
+		      '.[][][].name')
+	[ "$name" == "test_resource" ]
+	check_err $? "wrong port resource name (got $name)"
+
+	size=$(cmd_jq "devlink resource show $port_handle -j" \
+		      '.[][][].size')
+	[ "$size" == "20" ]
+	check_err $? 
"wrong port resource size (got $size)" + + log_test "port resource doit test" +} + rate_test() { RET=0 diff --git a/tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh b/tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh deleted file mode 100755 index 98033e6667d2..000000000000 --- a/tools/testing/selftests/drivers/net/netdevsim/macsec-offload.sh +++ /dev/null @@ -1,117 +0,0 @@ -#!/bin/bash -# SPDX-License-Identifier: GPL-2.0-only - -source ethtool-common.sh - -NSIM_NETDEV=$(make_netdev) -MACSEC_NETDEV=macsec_nsim - -set -o pipefail - -if ! ethtool -k $NSIM_NETDEV | grep -q 'macsec-hw-offload: on'; then - echo "SKIP: netdevsim doesn't support MACsec offload" - exit 4 -fi - -if ! ip link add link $NSIM_NETDEV $MACSEC_NETDEV type macsec offload mac 2>/dev/null; then - echo "SKIP: couldn't create macsec device" - exit 4 -fi -ip link del $MACSEC_NETDEV - -# -# test macsec offload API -# - -ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}" type macsec port 4 offload mac -check $? - -ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}2" type macsec address "aa:bb:cc:dd:ee:ff" port 5 offload mac -check $? - -ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}3" type macsec sci abbacdde01020304 offload mac -check $? - -ip link add link $NSIM_NETDEV "${MACSEC_NETDEV}4" type macsec port 8 offload mac 2> /dev/null -check $? '' '' 1 - -ip macsec add "${MACSEC_NETDEV}" tx sa 0 pn 1024 on key 01 12345678901234567890123456789012 -check $? - -ip macsec add "${MACSEC_NETDEV}" rx port 1234 address "1c:ed:de:ad:be:ef" -check $? - -ip macsec add "${MACSEC_NETDEV}" rx port 1234 address "1c:ed:de:ad:be:ef" sa 0 pn 1 on \ - key 00 0123456789abcdef0123456789abcdef -check $? - -ip macsec add "${MACSEC_NETDEV}" rx port 1235 address "1c:ed:de:ad:be:ef" 2> /dev/null -check $? '' '' 1 - -# can't disable macsec offload when SAs are configured -ip link set "${MACSEC_NETDEV}" type macsec offload off 2> /dev/null -check $? 
'' '' 1 - -ip macsec offload "${MACSEC_NETDEV}" off 2> /dev/null -check $? '' '' 1 - -# toggle macsec offload via rtnetlink -ip link set "${MACSEC_NETDEV}2" type macsec offload off -check $? - -ip link set "${MACSEC_NETDEV}2" type macsec offload mac -check $? - -# toggle macsec offload via genetlink -ip macsec offload "${MACSEC_NETDEV}2" off -check $? - -ip macsec offload "${MACSEC_NETDEV}2" mac -check $? - -for dev in ${MACSEC_NETDEV}{,2,3} ; do - ip link del $dev - check $? -done - - -# -# test ethtool features when toggling offload -# - -ip link add link $NSIM_NETDEV $MACSEC_NETDEV type macsec offload mac -TMP_FEATS_ON_1="$(ethtool -k $MACSEC_NETDEV)" - -ip link set $MACSEC_NETDEV type macsec offload off -TMP_FEATS_OFF_1="$(ethtool -k $MACSEC_NETDEV)" - -ip link set $MACSEC_NETDEV type macsec offload mac -TMP_FEATS_ON_2="$(ethtool -k $MACSEC_NETDEV)" - -[ "$TMP_FEATS_ON_1" = "$TMP_FEATS_ON_2" ] -check $? - -ip link del $MACSEC_NETDEV - -ip link add link $NSIM_NETDEV $MACSEC_NETDEV type macsec -check $? - -TMP_FEATS_OFF_2="$(ethtool -k $MACSEC_NETDEV)" -[ "$TMP_FEATS_OFF_1" = "$TMP_FEATS_OFF_2" ] -check $? - -ip link set $MACSEC_NETDEV type macsec offload mac -check $? - -TMP_FEATS_ON_3="$(ethtool -k $MACSEC_NETDEV)" -[ "$TMP_FEATS_ON_1" = "$TMP_FEATS_ON_3" ] -check $? 
- - -if [ $num_errors -eq 0 ]; then - echo "PASSED all $((num_passes)) checks" - exit 0 -else - echo "FAILED $num_errors/$((num_errors+num_passes)) checks" - exit 1 -fi diff --git a/tools/testing/selftests/drivers/net/team/Makefile b/tools/testing/selftests/drivers/net/team/Makefile index 02d6f51d5a06..7c58cf82121e 100644 --- a/tools/testing/selftests/drivers/net/team/Makefile +++ b/tools/testing/selftests/drivers/net/team/Makefile @@ -2,14 +2,18 @@ # Makefile for net selftests TEST_PROGS := \ + decoupled_enablement.sh \ dev_addr_lists.sh \ non_ether_header_ops.sh \ options.sh \ propagation.sh \ refleak.sh \ + teamd_activebackup.sh \ + transmit_failover.sh \ # end of TEST_PROGS TEST_INCLUDES := \ + team_lib.sh \ ../bonding/lag_lib.sh \ ../../../net/forwarding/lib.sh \ ../../../net/in_netns.sh \ diff --git a/tools/testing/selftests/drivers/net/team/config b/tools/testing/selftests/drivers/net/team/config index 5d36a22ef080..8f04ae419c53 100644 --- a/tools/testing/selftests/drivers/net/team/config +++ b/tools/testing/selftests/drivers/net/team/config @@ -6,4 +6,8 @@ CONFIG_NETDEVSIM=m CONFIG_NET_IPGRE=y CONFIG_NET_TEAM=y CONFIG_NET_TEAM_MODE_ACTIVEBACKUP=y +CONFIG_NET_TEAM_MODE_BROADCAST=y CONFIG_NET_TEAM_MODE_LOADBALANCE=y +CONFIG_NET_TEAM_MODE_RANDOM=y +CONFIG_NET_TEAM_MODE_ROUNDROBIN=y +CONFIG_VETH=y diff --git a/tools/testing/selftests/drivers/net/team/decoupled_enablement.sh b/tools/testing/selftests/drivers/net/team/decoupled_enablement.sh new file mode 100755 index 000000000000..0d3d9c98e9f5 --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/decoupled_enablement.sh @@ -0,0 +1,249 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# These tests verify the decoupled RX and TX enablement of team driver member +# interfaces. 
+# +# Topology +# +# +---------------------+ NS1 +# | test_team1 | +# | | | +# | eth0 | +# | | | +# | | | +# +---------------------+ +# | +# +---------------------+ NS2 +# | | | +# | | | +# | eth0 | +# | | | +# | test_team2 | +# +---------------------+ + +export ALL_TESTS=" + team_test_tx_enablement + team_test_rx_enablement +" + +test_dir="$(dirname "$0")" +# shellcheck disable=SC1091 +source "${test_dir}/../../../net/lib.sh" +# shellcheck disable=SC1091 +source "${test_dir}/team_lib.sh" + +NS1="" +NS2="" +export NODAD="nodad" +PREFIX_LENGTH="64" +NS1_IP="fd00::1" +NS2_IP="fd00::2" +NS1_IP4="192.168.0.1" +NS2_IP4="192.168.0.2" +MEMBERS=("eth0") +PING_COUNT=5 +PING_TIMEOUT_S=1 +PING_INTERVAL=0.1 + +while getopts "4" opt; do + case $opt in + 4) + echo "IPv4 mode selected." + export NODAD= + PREFIX_LENGTH="24" + NS1_IP="${NS1_IP4}" + NS2_IP="${NS2_IP4}" + ;; + \?) + echo "Invalid option: -$OPTARG" >&2 + exit 1 + ;; + esac +done + +# This has to be sourced after opts are gathered... +export REQUIRE_MZ=no +export NUM_NETIFS=0 +# shellcheck disable=SC1091 +source "${test_dir}/../../../net/forwarding/lib.sh" + +# Create the network namespaces, veth pair, and team devices in the specified +# mode. +# Globals: +# RET - Used by test infra, set by `check_err` functions. +# Arguments: +# mode - The team driver mode to use for the team devices. +environment_create() +{ + trap cleanup_all_ns EXIT + setup_ns ns1 ns2 + NS1="${NS_LIST[0]}" + NS2="${NS_LIST[1]}" + + # Create the interfaces. + ip -n "${NS1}" link add eth0 type veth peer name eth0 netns "${NS2}" + ip -n "${NS1}" link add test_team1 type team + ip -n "${NS2}" link add test_team2 type team + + # Set up the receiving network namespace's team interface. + setup_team "${NS2}" test_team2 roundrobin "${NS2_IP}" \ + "${PREFIX_LENGTH}" "${MEMBERS[@]}" +} + +# Set a particular option value for team or team port. +# Arguments: +# namespace - The namespace name that has the team. +# option_name - The option name to set. 
+# option_value - The value to set the option to. +# team_name - The name of team to set the option for. +# member_name - The (optional) optional name of the member port. +set_option_value() +{ + local namespace="$1" + local option_name="$2" + local option_value="$3" + local team_name="$4" + local member_name="$5" + local port_flag="--port=${member_name}" + + ip netns exec "${namespace}" teamnl "${team_name}" setoption \ + "${option_name}" "${option_value}" "${port_flag}" + return $? +} + +# Send some pings and return the ping command return value. +try_ping() +{ + ip netns exec "${NS1}" ping -i "${PING_INTERVAL}" -c "${PING_COUNT}" \ + "${NS2_IP}" -W "${PING_TIMEOUT_S}" +} + +# Checks tcpdump output from net/forwarding lib, and checks if there are any +# ICMP(4 or 6) packets. +# Arguments: +# interface - The interface name to search for. +# ip_address - The destination IP address (4 or 6) to search for. +did_interface_receive_icmp() +{ + local interface="$1" + local ip_address="$2" + local packet_count + + packet_count=$(tcpdump_show "$interface" | grep -c \ + "> ${ip_address}: ICMP") + echo "Packet count for ${interface} was ${packet_count}" + + if [[ "$packet_count" -gt 0 ]]; then + true + else + false + fi +} + +# Test JUST tx enablement with a given mode. +# Globals: +# RET - Used by test infra, set by `check_err` functions. +# Arguments: +# mode - The mode to set the team interfaces to. +team_test_mode_tx_enablement() +{ + local mode="$1" + export RET=0 + + # Set up the sender team with the correct mode. + setup_team "${NS1}" test_team1 "${mode}" "${NS1_IP}" \ + "${PREFIX_LENGTH}" "${MEMBERS[@]}" + check_err $? "Failed to set up sender team" + + ### Scenario 1: Member interface initially enabled. + # Expect ping to pass + try_ping + check_err $? "Ping failed when TX enabled" + + ### Scenario 2: One tx-side interface disabled. + # Expect ping to fail. + set_option_value "${NS1}" tx_enabled false test_team1 eth0 + check_err $? 
"Failed to disable TX" + tcpdump_start eth0 "${NS2}" + try_ping + check_fail $? "Ping succeeded when TX disabled" + tcpdump_stop eth0 + # Expect no packets to be transmitted, since TX is disabled. + did_interface_receive_icmp eth0 "${NS2_IP}" + check_fail $? "eth0 IS transmitting when TX disabled" + tcpdump_cleanup eth0 + + ### Scenario 3: The interface has tx re-enabled. + # Expect ping to pass. + set_option_value "${NS1}" tx_enabled true test_team1 eth0 + check_err $? "Failed to reenable TX" + try_ping + check_err $? "Ping failed when TX reenabled" + + log_test "TX failover of '${mode}' test" +} + +# Test JUST rx enablement with a given mode. +# Globals: +# RET - Used by test infra, set by `check_err` functions. +# Arguments: +# mode - The mode to set the team interfaces to. +team_test_mode_rx_enablement() +{ + local mode="$1" + export RET=0 + + # Set up the sender team with the correct mode. + setup_team "${NS1}" test_team1 "${mode}" "${NS1_IP}" \ + "${PREFIX_LENGTH}" "${MEMBERS[@]}" + check_err $? "Failed to set up sender team" + + ### Scenario 1: Member interface initially enabled. + # Expect ping to pass + try_ping + check_err $? "Ping failed when RX enabled" + + ### Scenario 2: One rx-side interface disabled. + # Expect ping to fail. + set_option_value "${NS1}" rx_enabled false test_team1 eth0 + check_err $? "Failed to disable RX" + tcpdump_start eth0 "${NS2}" + try_ping + check_fail $? "Ping succeeded when RX disabled" + tcpdump_stop eth0 + # Expect packets to be transmitted, since only RX is disabled. + did_interface_receive_icmp eth0 "${NS2_IP}" + check_err $? "eth0 not transmitting when RX disabled" + tcpdump_cleanup eth0 + + ### Scenario 3: The interface has rx re-enabled. + # Expect ping to pass. + set_option_value "${NS1}" rx_enabled true test_team1 eth0 + check_err $? "Failed to reenable RX" + try_ping + check_err $? 
"Ping failed when RX reenabled" + + log_test "RX failover of '${mode}' test" +} + +team_test_tx_enablement() +{ + team_test_mode_tx_enablement broadcast + team_test_mode_tx_enablement roundrobin + team_test_mode_tx_enablement random +} + +team_test_rx_enablement() +{ + team_test_mode_rx_enablement broadcast + team_test_mode_rx_enablement roundrobin + team_test_mode_rx_enablement random +} + +require_command teamnl +require_command tcpdump +require_command ping +environment_create +tests_run +exit "${EXIT_STATUS}" diff --git a/tools/testing/selftests/drivers/net/team/options.sh b/tools/testing/selftests/drivers/net/team/options.sh index 44888f32b513..66c0cb896dad 100755 --- a/tools/testing/selftests/drivers/net/team/options.sh +++ b/tools/testing/selftests/drivers/net/team/options.sh @@ -11,10 +11,14 @@ if [[ $# -eq 0 ]]; then exit $? fi -ALL_TESTS=" +export ALL_TESTS=" team_test_options + team_test_enabled_implicit_changes + team_test_rx_enabled_implicit_changes + team_test_tx_enabled_implicit_changes " +# shellcheck disable=SC1091 source "${test_dir}/../../../net/lib.sh" TEAM_PORT="team0" @@ -176,12 +180,105 @@ team_test_options() team_test_option mcast_rejoin_count 0 5 team_test_option mcast_rejoin_interval 0 5 team_test_option enabled true false "${MEMBER_PORT}" + team_test_option rx_enabled true false "${MEMBER_PORT}" + team_test_option tx_enabled true false "${MEMBER_PORT}" team_test_option user_linkup true false "${MEMBER_PORT}" team_test_option user_linkup_enabled true false "${MEMBER_PORT}" team_test_option priority 10 20 "${MEMBER_PORT}" team_test_option queue_id 0 1 "${MEMBER_PORT}" } +team_test_enabled_implicit_changes() +{ + export RET=0 + + attach_port_if_specified "${MEMBER_PORT}" + check_err $? "Couldn't attach ${MEMBER_PORT} to master" + + # Set enabled to true. + set_and_check_get enabled true "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'enabled' to true" + + # Show that both rx enabled and tx enabled are true. 
+ get_and_check_value rx_enabled true "--port=${MEMBER_PORT}" + check_err $? "'Rx_enabled' wasn't implicitly set to true" + get_and_check_value tx_enabled true "--port=${MEMBER_PORT}" + check_err $? "'Tx_enabled' wasn't implicitly set to true" + + # Set enabled to false. + set_and_check_get enabled false "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'enabled' to false" + + # Show that both rx enabled and tx enabled are false. + get_and_check_value rx_enabled false "--port=${MEMBER_PORT}" + check_err $? "'Rx_enabled' wasn't implicitly set to false" + get_and_check_value tx_enabled false "--port=${MEMBER_PORT}" + check_err $? "'Tx_enabled' wasn't implicitly set to false" + + log_test "'Enabled' implicit changes" +} + +team_test_rx_enabled_implicit_changes() +{ + export RET=0 + + attach_port_if_specified "${MEMBER_PORT}" + check_err $? "Couldn't attach ${MEMBER_PORT} to master" + + # Set enabled to true. + set_and_check_get enabled true "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'enabled' to true" + + # Set rx_enabled to false. + set_and_check_get rx_enabled false "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'rx_enabled' to false" + + # Show that enabled is false. + get_and_check_value enabled false "--port=${MEMBER_PORT}" + check_err $? "'enabled' wasn't implicitly set to false" + + # Set rx_enabled to true. + set_and_check_get rx_enabled true "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'rx_enabled' to true" + + # Show that enabled is true. + get_and_check_value enabled true "--port=${MEMBER_PORT}" + check_err $? "'enabled' wasn't implicitly set to true" + + log_test "'Rx_enabled' implicit changes" +} + +team_test_tx_enabled_implicit_changes() +{ + export RET=0 + + attach_port_if_specified "${MEMBER_PORT}" + check_err $? "Couldn't attach ${MEMBER_PORT} to master" + + # Set enabled to true. + set_and_check_get enabled true "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'enabled' to true" + + # Set tx_enabled to false. 
+ set_and_check_get tx_enabled false "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'tx_enabled' to false" + + # Show that enabled is false. + get_and_check_value enabled false "--port=${MEMBER_PORT}" + check_err $? "'enabled' wasn't implicitly set to false" + + # Set tx_enabled to true. + set_and_check_get tx_enabled true "--port=${MEMBER_PORT}" + check_err $? "Failed to set 'tx_enabled' to true" + + # Show that enabled is true. + get_and_check_value enabled true "--port=${MEMBER_PORT}" + check_err $? "'enabled' wasn't implicitly set to true" + + log_test "'Tx_enabled' implicit changes" +} + + require_command teamnl setup tests_run diff --git a/tools/testing/selftests/drivers/net/team/settings b/tools/testing/selftests/drivers/net/team/settings new file mode 100644 index 000000000000..694d70710ff0 --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/settings @@ -0,0 +1 @@ +timeout=300 diff --git a/tools/testing/selftests/drivers/net/team/team_lib.sh b/tools/testing/selftests/drivers/net/team/team_lib.sh new file mode 100644 index 000000000000..02ef0ee02d6a --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/team_lib.sh @@ -0,0 +1,174 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +test_dir="$(dirname "$0")" +export REQUIRE_MZ=no +export NUM_NETIFS=0 +# shellcheck disable=SC1091 +source "${test_dir}/../../../net/forwarding/lib.sh" + +TCP_PORT="43434" + +# Create a team interface inside of a given network namespace with a given +# mode, members, and IP address. +# Arguments: +# namespace - Network namespace to put the team interface into. +# team - The name of the team interface to setup. +# mode - The team mode of the interface. +# ip_address - The IP address to assign to the team interface. +# prefix_length - The prefix length for the IP address subnet. +# $@ - members - The member interfaces of the aggregation. 
+setup_team() +{ + local namespace=$1 + local team=$2 + local mode=$3 + local ip_address=$4 + local prefix_length=$5 + shift 5 + local members=("$@") + + # Prerequisite: team must have no members + for member in "${members[@]}"; do + ip -n "${namespace}" link set "${member}" nomaster + done + + # Prerequisite: team must have no address in order to set it + # shellcheck disable=SC2086 + ip -n "${namespace}" addr del "${ip_address}/${prefix_length}" \ + ${NODAD} dev "${team}" + + echo "Setting team in ${namespace} to mode ${mode}" + + if ! ip -n "${namespace}" link set "${team}" down; then + echo "Failed to bring team device down" + return 1 + fi + if ! ip netns exec "${namespace}" teamnl "${team}" setoption mode \ + "${mode}"; then + echo "Failed to set ${team} mode to '${mode}'" + return 1 + fi + + # Aggregate the members into teams. + for member in "${members[@]}"; do + ip -n "${namespace}" link set "${member}" master "${team}" + done + + # Bring team devices up and give them addresses. + if ! ip -n "${namespace}" link set "${team}" up; then + echo "Failed to set ${team} up" + return 1 + fi + + # shellcheck disable=SC2086 + if ! ip -n "${namespace}" addr add "${ip_address}/${prefix_length}" \ + ${NODAD} dev "${team}"; then + echo "Failed to give ${team} IP address in ${namespace}" + return 1 + fi +} + +# This is global used to keep track of the sender's iperf3 process, so that it +# can be terminated. +declare sender_pid + +# Start sending and receiving TCP traffic with iperf3. +# Globals: +# sender_pid - The process ID of the iperf3 sender process. Used to kill it +# later. +start_listening_and_sending() +{ + ip netns exec "${NS2}" iperf3 -s -p "${TCP_PORT}" --logfile /dev/null & + # Wait for server to become reachable before starting client. 
+ slowwait 5 ip netns exec "${NS1}" iperf3 -c "${NS2_IP}" -p \ + "${TCP_PORT}" -t 1 --logfile /dev/null + ip netns exec "${NS1}" iperf3 -c "${NS2_IP}" -p "${TCP_PORT}" -b 1M -l \ + 1K -t 0 --logfile /dev/null & + sender_pid=$! +} + +# Stop sending TCP traffic with iperf3. +# Globals: +# sender_pid - The process ID of the iperf3 sender process. +stop_sending_and_listening() +{ + kill "${sender_pid}" && wait "${sender_pid}" 2>/dev/null || true +} + +# Monitor for TCP traffic with Tcpdump, save results to temp files. +# Arguments: +# namespace - The network namespace to run tcpdump inside of. +# $@ - interfaces - The interfaces to listen to. +save_tcpdump_outputs() +{ + local namespace=$1 + shift 1 + local interfaces=("$@") + + for interface in "${interfaces[@]}"; do + tcpdump_start "${interface}" "${namespace}" + done + + sleep 1 + + for interface in "${interfaces[@]}"; do + tcpdump_stop_nosleep "${interface}" + done +} + +clear_tcpdump_outputs() +{ + local interfaces=("$@") + + for interface in "${interfaces[@]}"; do + tcpdump_cleanup "${interface}" + done +} + +# Read Tcpdump output, determine packet counts. +# Arguments: +# interface - The name of the interface to count packets for. +# ip_address - The destination IP address. +did_interface_receive() +{ + local interface="$1" + local ip_address="$2" + local packet_count + + packet_count=$(tcpdump_show "$interface" | grep -c \ + "> ${ip_address}.${TCP_PORT}") + echo "Packet count for ${interface} was ${packet_count}" + + if [[ "${packet_count}" -gt 0 ]]; then + true + else + false + fi +} + +# Return true if the given interface in the given namespace does NOT receive +# traffic over a 1 second period. +# Arguments: +# interface - The name of the interface. +# ip_address - The destination IP address. +# namespace - The name of the namespace that the interface is in. 
+check_no_traffic() +{ + local interface="$1" + local ip_address="$2" + local namespace="$3" + local rc + + save_tcpdump_outputs "${namespace}" "${interface}" + did_interface_receive "${interface}" "${ip_address}" + rc=$? + + clear_tcpdump_outputs "${interface}" + + if [[ "${rc}" -eq 0 ]]; then + return 1 + else + return 0 + fi +} diff --git a/tools/testing/selftests/drivers/net/team/teamd_activebackup.sh b/tools/testing/selftests/drivers/net/team/teamd_activebackup.sh new file mode 100755 index 000000000000..2b26a697e179 --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/teamd_activebackup.sh @@ -0,0 +1,246 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# These tests verify that teamd is able to enable and disable ports via the +# active backup runner. +# +# Topology: +# +# +-------------------------+ NS1 +# | test_team1 | +# | + | +# | eth0 | eth1 | +# | +---+---+ | +# | | | | +# +-------------------------+ +# | | +# +-------------------------+ NS2 +# | | | | +# | +-------+ | +# | eth0 | eth1 | +# | + | +# | test_team2 | +# +-------------------------+ + +export ALL_TESTS="teamd_test_active_backup" + +test_dir="$(dirname "$0")" +# shellcheck disable=SC1091 +source "${test_dir}/../../../net/lib.sh" +# shellcheck disable=SC1091 +source "${test_dir}/team_lib.sh" + +NS1="" +NS2="" +export NODAD="nodad" +PREFIX_LENGTH="64" +NS1_IP="fd00::1" +NS2_IP="fd00::2" +NS1_IP4="192.168.0.1" +NS2_IP4="192.168.0.2" +NS1_TEAMD_CONF="" +NS2_TEAMD_CONF="" +NS1_TEAMD_PID="" +NS2_TEAMD_PID="" + +while getopts "4" opt; do + case $opt in + 4) + echo "IPv4 mode selected." + export NODAD= + PREFIX_LENGTH="24" + NS1_IP="${NS1_IP4}" + NS2_IP="${NS2_IP4}" + ;; + \?) 
+ echo "Invalid option: -${OPTARG}" >&2 + exit 1 + ;; + esac +done + +teamd_config_create() +{ + local runner=$1 + local dev=$2 + local conf + + conf=$(mktemp) + + cat > "${conf}" <<-EOF + { + "device": "${dev}", + "runner": {"name": "${runner}"}, + "ports": { + "eth0": {}, + "eth1": {} + } + } + EOF + echo "${conf}" +} + +# Create the network namespaces, veth pair, and team devices in the specified +# runner. +# Globals: +# RET - Used by test infra, set by `check_err` functions. +# Arguments: +# runner - The Teamd runner to use for the Team devices. +environment_create() +{ + local runner=$1 + + echo "Setting up two-link aggregation for runner ${runner}" + echo "Teamd version is: $(teamd --version)" + trap environment_destroy EXIT + + setup_ns ns1 ns2 + NS1="${NS_LIST[0]}" + NS2="${NS_LIST[1]}" + + for link in $(seq 0 1); do + ip -n "${NS1}" link add "eth${link}" type veth peer name \ + "eth${link}" netns "${NS2}" + check_err $? "Failed to create veth pair" + done + + NS1_TEAMD_CONF=$(teamd_config_create "${runner}" "test_team1") + NS2_TEAMD_CONF=$(teamd_config_create "${runner}" "test_team2") + echo "Conf files are ${NS1_TEAMD_CONF} and ${NS2_TEAMD_CONF}" + + ip netns exec "${NS1}" teamd -d -f "${NS1_TEAMD_CONF}" + check_err $? "Failed to create team device in ${NS1}" + NS1_TEAMD_PID=$(pgrep -f "teamd -d -f ${NS1_TEAMD_CONF}") + + ip netns exec "${NS2}" teamd -d -f "${NS2_TEAMD_CONF}" + check_err $? "Failed to create team device in ${NS2}" + NS2_TEAMD_PID=$(pgrep -f "teamd -d -f ${NS2_TEAMD_CONF}") + + echo "Created team devices" + echo "Teamd PIDs are ${NS1_TEAMD_PID} and ${NS2_TEAMD_PID}" + + ip -n "${NS1}" link set test_team1 up + check_err $? "Failed to set test_team1 up in ${NS1}" + ip -n "${NS2}" link set test_team2 up + check_err $? "Failed to set test_team2 up in ${NS2}" + + ip -n "${NS1}" addr add "${NS1_IP}/${PREFIX_LENGTH}" "${NODAD}" dev \ + test_team1 + check_err $? 
"Failed to add address to team device in ${NS1}" + ip -n "${NS2}" addr add "${NS2_IP}/${PREFIX_LENGTH}" "${NODAD}" dev \ + test_team2 + check_err $? "Failed to add address to team device in ${NS2}" + + slowwait 2 timeout 0.5 ip netns exec "${NS1}" ping -W 1 -c 1 "${NS2_IP}" +} + +# Tear down the environment: kill teamd and delete network namespaces. +environment_destroy() +{ + echo "Tearing down two-link aggregation" + + rm "${NS1_TEAMD_CONF}" + rm "${NS2_TEAMD_CONF}" + + # First, try graceful teamd teardown. + ip netns exec "${NS1}" teamd -k -t test_team1 + ip netns exec "${NS2}" teamd -k -t test_team2 + + # If teamd can't be killed gracefully, then sigkill. + if kill -0 "${NS1_TEAMD_PID}" 2>/dev/null; then + echo "Sending sigkill to teamd for test_team1" + kill -9 "${NS1_TEAMD_PID}" + rm -f /var/run/teamd/test_team1.{pid,sock} + fi + if kill -0 "${NS2_TEAMD_PID}" 2>/dev/null; then + echo "Sending sigkill to teamd for test_team2" + kill -9 "${NS2_TEAMD_PID}" + rm -f /var/run/teamd/test_team2.{pid,sock} + fi + cleanup_all_ns +} + +# Change the active port for an active-backup mode team. +# Arguments: +# namespace - The network namespace that the team is in. +# team - The name of the team. +# active_port - The port to make active. +set_active_port() +{ + local namespace=$1 + local team=$2 + local active_port=$3 + + ip netns exec "${namespace}" teamdctl "${team}" state item set \ + runner.active_port "${active_port}" + slowwait 2 bash -c "ip netns exec ${namespace} teamdctl ${team} state \ + item get runner.active_port | grep -q ${active_port}" +} + +# Wait for an interface to stop receiving traffic. If it keeps receiving traffic +# for the duration of the timeout, then return an error. +# Arguments: +# - namespace - The network namespace that the interface is in. +# - interface - The name of the interface. 
+wait_to_stop_receiving() +{ + local namespace=$1 + local interface=$2 + + echo "Waiting for ${interface} in ${namespace} to stop receiving" + slowwait 10 check_no_traffic "${interface}" "${NS2_IP}" \ + "${namespace}" +} + +# Test that active backup runner can change active ports. +# Globals: +# RET - Used by test infra, set by `check_err` functions. +teamd_test_active_backup() +{ + export RET=0 + + start_listening_and_sending + + ### Scenario 1: Don't manually set active port, just make sure team + # works. + save_tcpdump_outputs "${NS2}" test_team2 + did_interface_receive test_team2 "${NS2_IP}" + check_err $? "Traffic did not reach team interface in NS2." + clear_tcpdump_outputs test_team2 + + ### Scenario 2: Choose active port. + set_active_port "${NS1}" test_team1 eth1 + set_active_port "${NS2}" test_team2 eth1 + + wait_to_stop_receiving "${NS2}" eth0 + save_tcpdump_outputs "${NS2}" eth0 eth1 + did_interface_receive eth0 "${NS2_IP}" + check_fail $? "eth0 IS transmitting when inactive" + did_interface_receive eth1 "${NS2_IP}" + check_err $? "eth1 not transmitting when active" + clear_tcpdump_outputs eth0 eth1 + + ### Scenario 3: Change active port. + set_active_port "${NS1}" test_team1 eth0 + set_active_port "${NS2}" test_team2 eth0 + + wait_to_stop_receiving "${NS2}" eth1 + save_tcpdump_outputs "${NS2}" eth0 eth1 + did_interface_receive eth0 "${NS2_IP}" + check_err $? "eth0 not transmitting when active" + did_interface_receive eth1 "${NS2_IP}" + check_fail $? 
"eth1 IS transmitting when inactive" + clear_tcpdump_outputs eth0 eth1 + + log_test "teamd active backup runner test" + + stop_sending_and_listening +} + +require_command teamd +require_command teamdctl +require_command iperf3 +require_command tcpdump +environment_create activebackup +tests_run +exit "${EXIT_STATUS}" diff --git a/tools/testing/selftests/drivers/net/team/transmit_failover.sh b/tools/testing/selftests/drivers/net/team/transmit_failover.sh new file mode 100755 index 000000000000..b2bdcd27bc98 --- /dev/null +++ b/tools/testing/selftests/drivers/net/team/transmit_failover.sh @@ -0,0 +1,158 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +# These tests verify the basic failover capability of the team driver via the +# `enabled` team driver option across different team driver modes. This does not +# rely on teamd, and instead just uses teamnl to set the `enabled` option +# directly. +# +# Topology: +# +# +-------------------------+ NS1 +# | test_team1 | +# | + | +# | eth0 | eth1 | +# | +---+---+ | +# | | | | +# +-------------------------+ +# | | +# +-------------------------+ NS2 +# | | | | +# | +-------+ | +# | eth0 | eth1 | +# | + | +# | test_team2 | +# +-------------------------+ + +export ALL_TESTS="team_test_failover" + +test_dir="$(dirname "$0")" +# shellcheck disable=SC1091 +source "${test_dir}/../../../net/lib.sh" +# shellcheck disable=SC1091 +source "${test_dir}/team_lib.sh" + +NS1="" +NS2="" +export NODAD="nodad" +PREFIX_LENGTH="64" +NS1_IP="fd00::1" +NS2_IP="fd00::2" +NS1_IP4="192.168.0.1" +NS2_IP4="192.168.0.2" +MEMBERS=("eth0" "eth1") + +while getopts "4" opt; do + case $opt in + 4) + echo "IPv4 mode selected." + export NODAD= + PREFIX_LENGTH="24" + NS1_IP="${NS1_IP4}" + NS2_IP="${NS2_IP4}" + ;; + \?) + echo "Invalid option: -$OPTARG" >&2 + exit 1 + ;; + esac +done + +# Create the network namespaces, veth pair, and team devices in the specified +# mode. +# Globals: +# RET - Used by test infra, set by `check_err` functions. 
+# Arguments: +# mode - The team driver mode to use for the team devices. +environment_create() +{ + trap cleanup_all_ns EXIT + setup_ns ns1 ns2 + NS1="${NS_LIST[0]}" + NS2="${NS_LIST[1]}" + + # Create the interfaces. + ip -n "${NS1}" link add eth0 type veth peer name eth0 netns "${NS2}" + ip -n "${NS1}" link add eth1 type veth peer name eth1 netns "${NS2}" + ip -n "${NS1}" link add test_team1 type team + ip -n "${NS2}" link add test_team2 type team + + # Set up the receiving network namespace's team interface. + setup_team "${NS2}" test_team2 roundrobin "${NS2_IP}" \ + "${PREFIX_LENGTH}" "${MEMBERS[@]}" +} + + +# Check that failover works for a specific team driver mode. +# Globals: +# RET - Used by test infra, set by `check_err` functions. +# Arguments: +# mode - The mode to set the team interfaces to. +team_test_mode_failover() +{ + local mode="$1" + export RET=0 + + # Set up the sender team with the correct mode. + setup_team "${NS1}" test_team1 "${mode}" "${NS1_IP}" \ + "${PREFIX_LENGTH}" "${MEMBERS[@]}" + check_err $? "Failed to set up sender team" + + start_listening_and_sending + + ### Scenario 1: All interfaces initially enabled. + save_tcpdump_outputs "${NS2}" "${MEMBERS[@]}" + did_interface_receive eth0 "${NS2_IP}" + check_err $? "eth0 not transmitting when both links enabled" + did_interface_receive eth1 "${NS2_IP}" + check_err $? "eth1 not transmitting when both links enabled" + clear_tcpdump_outputs "${MEMBERS[@]}" + + ### Scenario 2: One tx-side interface disabled. + ip netns exec "${NS1}" teamnl test_team1 setoption enabled false \ + --port=eth1 + slowwait 2 bash -c "ip netns exec ${NS1} teamnl test_team1 getoption \ + enabled --port=eth1 | grep -q false" + + save_tcpdump_outputs "${NS2}" "${MEMBERS[@]}" + did_interface_receive eth0 "${NS2_IP}" + check_err $? "eth0 not transmitting when enabled" + did_interface_receive eth1 "${NS2_IP}" + check_fail $? 
"eth1 IS transmitting when disabled" + clear_tcpdump_outputs "${MEMBERS[@]}" + + ### Scenario 3: The interface is re-enabled. + ip netns exec "${NS1}" teamnl test_team1 setoption enabled true \ + --port=eth1 + slowwait 2 bash -c "ip netns exec ${NS1} teamnl test_team1 getoption \ + enabled --port=eth1 | grep -q true" + + save_tcpdump_outputs "${NS2}" "${MEMBERS[@]}" + did_interface_receive eth0 "${NS2_IP}" + check_err $? "eth0 not transmitting when both links enabled" + did_interface_receive eth1 "${NS2_IP}" + check_err $? "eth1 not transmitting when both links enabled" + clear_tcpdump_outputs "${MEMBERS[@]}" + + log_test "Failover of '${mode}' test" + + # Clean up + stop_sending_and_listening +} + +team_test_failover() +{ + team_test_mode_failover broadcast + team_test_mode_failover roundrobin + team_test_mode_failover random + # Don't test `activebackup` or `loadbalance` modes, since they are too + # complicated for just setting `enabled` to work. They use more than + # the `enabled` option for transmit. 
+} + +require_command teamnl +require_command iperf3 +require_command tcpdump +environment_create +tests_run +exit "${EXIT_STATUS}" diff --git a/tools/testing/selftests/drivers/net/xdp.py b/tools/testing/selftests/drivers/net/xdp.py index e54df158dfe9..2ad5932299e8 100755 --- a/tools/testing/selftests/drivers/net/xdp.py +++ b/tools/testing/selftests/drivers/net/xdp.py @@ -13,10 +13,11 @@ from enum import Enum from lib.py import ksft_run, ksft_exit, ksft_eq, ksft_ge, ksft_ne, ksft_pr from lib.py import KsftNamedVariant, ksft_variants -from lib.py import KsftFailEx, NetDrvEpEnv +from lib.py import KsftFailEx, KsftSkipEx, NetDrvEpEnv from lib.py import EthtoolFamily, NetdevFamily, NlError from lib.py import bkg, cmd, rand_port, wait_port_listen -from lib.py import ip, bpftool, defer +from lib.py import ip, defer +from lib.py import bpf_map_set, bpf_map_dump, bpf_prog_map_ids class TestConfig(Enum): @@ -69,7 +70,7 @@ def _exchg_udp(cfg, port, test_string): cfg.require_cmd("socat", remote=True) rx_udp_cmd = f"socat -{cfg.addr_ipver} -T 2 -u UDP-RECV:{port},reuseport STDOUT" - tx_udp_cmd = f"echo -n {test_string} | socat -t 2 -u STDIN UDP:{cfg.baddr}:{port}" + tx_udp_cmd = f"echo -n {test_string} | socat -t 2 -u STDIN UDP:{cfg.baddr}:{port},shut-none" with bkg(rx_udp_cmd, exit_wait=True) as nc: wait_port_listen(port, proto="udp") @@ -122,47 +123,11 @@ def _load_xdp_prog(cfg, bpf_info): xdp_info = ip(f"-d link show dev {cfg.ifname}", json=True)[0] prog_info["id"] = xdp_info["xdp"]["prog"]["id"] prog_info["name"] = xdp_info["xdp"]["prog"]["name"] - prog_id = prog_info["id"] - - map_ids = bpftool(f"prog show id {prog_id}", json=True)["map_ids"] - prog_info["maps"] = {} - for map_id in map_ids: - name = bpftool(f"map show id {map_id}", json=True)["name"] - prog_info["maps"][name] = map_id + prog_info["maps"] = bpf_prog_map_ids(prog_info["id"]) return prog_info -def format_hex_bytes(value): - """ - Helper function that converts an integer into a formatted hexadecimal byte 
string. - - Args: - value: An integer representing the number to be converted. - - Returns: - A string representing hexadecimal equivalent of value, with bytes separated by spaces. - """ - hex_str = value.to_bytes(4, byteorder='little', signed=True) - return ' '.join(f'{byte:02x}' for byte in hex_str) - - -def _set_xdp_map(map_name, key, value): - """ - Updates an XDP map with a given key-value pair using bpftool. - - Args: - map_name: The name of the XDP map to update. - key: The key to update in the map, formatted as a hexadecimal string. - value: The value to associate with the key, formatted as a hexadecimal string. - """ - key_formatted = format_hex_bytes(key) - value_formatted = format_hex_bytes(value) - bpftool( - f"map update name {map_name} key hex {key_formatted} value hex {value_formatted}" - ) - - def _get_stats(xdp_map_id): """ Retrieves and formats statistics from an XDP map. @@ -177,25 +142,11 @@ def _get_stats(xdp_map_id): Raises: KsftFailEx: If the stats retrieval fails. 
""" - stats_dump = bpftool(f"map dump id {xdp_map_id}", json=True) - if not stats_dump: + stats = bpf_map_dump(xdp_map_id) + if not stats: raise KsftFailEx(f"Failed to get stats for map {xdp_map_id}") - stats_formatted = {} - for key in range(0, 5): - val = stats_dump[key]["formatted"]["value"] - if stats_dump[key]["formatted"]["key"] == XDPStats.RX.value: - stats_formatted[XDPStats.RX.value] = val - elif stats_dump[key]["formatted"]["key"] == XDPStats.PASS.value: - stats_formatted[XDPStats.PASS.value] = val - elif stats_dump[key]["formatted"]["key"] == XDPStats.DROP.value: - stats_formatted[XDPStats.DROP.value] = val - elif stats_dump[key]["formatted"]["key"] == XDPStats.TX.value: - stats_formatted[XDPStats.TX.value] = val - elif stats_dump[key]["formatted"]["key"] == XDPStats.ABORT.value: - stats_formatted[XDPStats.ABORT.value] = val - - return stats_formatted + return stats def _test_pass(cfg, bpf_info, msg_sz): @@ -211,8 +162,8 @@ def _test_pass(cfg, bpf_info, msg_sz): prog_info = _load_xdp_prog(cfg, bpf_info) port = rand_port() - _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, XDPAction.PASS.value) - _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + bpf_map_set("map_xdp_setup", TestConfig.MODE.value, XDPAction.PASS.value) + bpf_map_set("map_xdp_setup", TestConfig.PORT.value, port) ksft_eq(_test_udp(cfg, port, msg_sz), True, "UDP packet exchange failed") stats = _get_stats(prog_info["maps"]["map_xdp_stats"]) @@ -258,8 +209,8 @@ def _test_drop(cfg, bpf_info, msg_sz): prog_info = _load_xdp_prog(cfg, bpf_info) port = rand_port() - _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, XDPAction.DROP.value) - _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + bpf_map_set("map_xdp_setup", TestConfig.MODE.value, XDPAction.DROP.value) + bpf_map_set("map_xdp_setup", TestConfig.PORT.value, port) ksft_eq(_test_udp(cfg, port, msg_sz), False, "UDP packet exchange should fail") stats = _get_stats(prog_info["maps"]["map_xdp_stats"]) @@ -305,8 +256,8 
@@ def _test_xdp_native_tx(cfg, bpf_info, payload_lens): prog_info = _load_xdp_prog(cfg, bpf_info) port = rand_port() - _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, XDPAction.TX.value) - _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + bpf_map_set("map_xdp_setup", TestConfig.MODE.value, XDPAction.TX.value) + bpf_map_set("map_xdp_setup", TestConfig.PORT.value, port) expected_pkts = 0 for payload_len in payload_lens: @@ -320,7 +271,7 @@ def _test_xdp_native_tx(cfg, bpf_info, payload_lens): # Writing zero bytes to stdin gets ignored by socat, # but with the shut-null flag socat generates a zero sized packet # when the socket is closed. - tx_cmd_suffix = ",shut-null" if payload_len == 0 else "" + tx_cmd_suffix = ",shut-null" if payload_len == 0 else ",shut-none" tx_udp = f"echo -n {test_string} | socat -t 2 " + \ f"-u STDIN UDP:{cfg.baddr}:{port}{tx_cmd_suffix}" @@ -454,15 +405,15 @@ def _test_xdp_native_tail_adjst(cfg, pkt_sz_lst, offset_lst): prog_info = _load_xdp_prog(cfg, bpf_info) # Configure the XDP map for tail adjustment - _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, XDPAction.TAIL_ADJST.value) - _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + bpf_map_set("map_xdp_setup", TestConfig.MODE.value, XDPAction.TAIL_ADJST.value) + bpf_map_set("map_xdp_setup", TestConfig.PORT.value, port) for offset in offset_lst: tag = format(random.randint(65, 90), "02x") - _set_xdp_map("map_xdp_setup", TestConfig.ADJST_OFFSET.value, offset) + bpf_map_set("map_xdp_setup", TestConfig.ADJST_OFFSET.value, offset) if offset > 0: - _set_xdp_map("map_xdp_setup", TestConfig.ADJST_TAG.value, int(tag, 16)) + bpf_map_set("map_xdp_setup", TestConfig.ADJST_TAG.value, int(tag, 16)) for pkt_sz in pkt_sz_lst: test_str = "".join(random.choice(string.ascii_lowercase) for _ in range(pkt_sz)) @@ -574,8 +525,8 @@ def _test_xdp_native_head_adjst(cfg, prog, pkt_sz_lst, offset_lst): prog_info = _load_xdp_prog(cfg, BPFProgInfo(prog, "xdp_native.bpf.o", "xdp.frags", 
9000)) port = rand_port() - _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, XDPAction.HEAD_ADJST.value) - _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + bpf_map_set("map_xdp_setup", TestConfig.MODE.value, XDPAction.HEAD_ADJST.value) + bpf_map_set("map_xdp_setup", TestConfig.PORT.value, port) hds_thresh = get_hds_thresh(cfg) for offset in offset_lst: @@ -595,11 +546,8 @@ def _test_xdp_native_head_adjst(cfg, prog, pkt_sz_lst, offset_lst): test_str = ''.join(random.choice(string.ascii_lowercase) for _ in range(pkt_sz)) tag = format(random.randint(65, 90), '02x') - _set_xdp_map("map_xdp_setup", - TestConfig.ADJST_OFFSET.value, - offset) - _set_xdp_map("map_xdp_setup", TestConfig.ADJST_TAG.value, int(tag, 16)) - _set_xdp_map("map_xdp_setup", TestConfig.ADJST_OFFSET.value, offset) + bpf_map_set("map_xdp_setup", TestConfig.ADJST_OFFSET.value, offset) + bpf_map_set("map_xdp_setup", TestConfig.ADJST_TAG.value, int(tag, 16)) recvd_str = _exchg_udp(cfg, port, test_str) @@ -691,8 +639,8 @@ def test_xdp_native_qstats(cfg, act): prog_info = _load_xdp_prog(cfg, bpf_info) port = rand_port() - _set_xdp_map("map_xdp_setup", TestConfig.MODE.value, act.value) - _set_xdp_map("map_xdp_setup", TestConfig.PORT.value, port) + bpf_map_set("map_xdp_setup", TestConfig.MODE.value, act.value) + bpf_map_set("map_xdp_setup", TestConfig.PORT.value, port) # Discard the input, but we need a listener to avoid ICMP errors rx_udp = f"socat -{cfg.addr_ipver} -T 2 -u UDP-RECV:{port},reuseport " + \ @@ -745,6 +693,34 @@ def test_xdp_native_qstats(cfg, act): ksft_ge(after['tx-packets'], before['tx-packets']) +def test_xdp_native_update_mb_to_sb(cfg): + """ + Test multi-buf to single-buf replacement with jumbo MTU. 
+ """ + obj = cfg.net_lib_dir / "xdp_dummy.bpf.o" + mtu = 9000 + + ip(f"link set dev {cfg.ifname} mtu {mtu}") + defer(ip, f"link set dev {cfg.ifname} mtu {cfg.dev['mtu']} xdpdrv off") + + attach = cmd(f"ip link set dev {cfg.ifname} xdpdrv obj {obj} sec xdp", fail=False) + if attach.ret == 0: + raise KsftSkipEx(f"device supports single-buffer XDP with mtu {mtu}") + + attach = cmd(f"ip link set dev {cfg.ifname} xdpdrv obj {obj} sec xdp.frags", fail=False) + if attach.ret != 0: + ksft_pr(attach) + raise KsftSkipEx("device does not support multi-buffer XDP") + + # Verify updating mb -> mb program works. + cmd(f"ip -force link set dev {cfg.ifname} xdpdrv obj {obj} sec xdp.frags") + + # Verify updating mb -> sb program does not work. + update = cmd(f"ip -force link set dev {cfg.ifname} xdpdrv obj {obj} sec xdp", fail=False) + if update.ret == 0: + raise KsftFailEx("device unexpectedly updates non-multi-buffer XDP") + + def main(): """ Main function to execute the XDP tests. @@ -770,6 +746,7 @@ def main(): test_xdp_native_adjst_head_grow_data, test_xdp_native_adjst_head_shrnk_data, test_xdp_native_qstats, + test_xdp_native_update_mb_to_sb, ], args=(cfg,)) ksft_exit() diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index c709523c99c6..a275ed584026 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -15,6 +15,7 @@ TEST_PROGS := \ big_tcp.sh \ bind_bhash.sh \ bpf_offload.py \ + bridge_stp_mode.sh \ bridge_vlan_dump.sh \ broadcast_ether_dst.sh \ broadcast_pmtu.sh \ @@ -45,6 +46,7 @@ TEST_PROGS := \ io_uring_zerocopy_tx.sh \ ioam6.sh \ ip6_gre_headroom.sh \ + ip6_tunnel.sh \ ip_defrag.sh \ ip_local_port_range.sh \ ipv6_flowlabel.sh \ @@ -55,6 +57,7 @@ TEST_PROGS := \ l2tp.sh \ link_netns.py \ lwt_dst_cache_ref_loop.sh \ + macvlan_mcast_shared_mac.sh \ msg_zerocopy.sh \ nat6to4.sh \ ndisc_unsolicited_na_test.sh \ @@ -62,7 +65,9 @@ TEST_PROGS := \ netdevice.sh \ netns-name.sh \ netns-sysctl.sh \ 
+ nk_qlease.py \ nl_netdev.py \ + nl_nlctrl.py \ pmtu.sh \ psock_snd.sh \ reuseaddr_ports_exhausted.sh \ @@ -121,6 +126,7 @@ TEST_PROGS := \ vrf_route_leaking.sh \ vrf_strict_mode_test.sh \ xfrm_policy.sh \ + xfrm_state.sh \ # end of TEST_PROGS TEST_PROGS_EXTENDED := \ diff --git a/tools/testing/selftests/net/af_unix/so_peek_off.c b/tools/testing/selftests/net/af_unix/so_peek_off.c index 86e7b0fb522d..f6466a717f49 100644 --- a/tools/testing/selftests/net/af_unix/so_peek_off.c +++ b/tools/testing/selftests/net/af_unix/so_peek_off.c @@ -76,6 +76,19 @@ FIXTURE_TEARDOWN(so_peek_off) ASSERT_STREQ(str, buf); \ } while (0) +#define peekoffeq(fd, expected) \ + do { \ + socklen_t optlen = sizeof(int); \ + int off = -1; \ + int ret; \ + \ + ret = getsockopt(fd, SOL_SOCKET, SO_PEEK_OFF, \ + &off, &optlen); \ + ASSERT_EQ(0, ret); \ + ASSERT_EQ((socklen_t)sizeof(off), optlen); \ + ASSERT_EQ(expected, off); \ + } while (0) + #define async \ for (pid_t pid = (pid = fork(), \ pid < 0 ? \ @@ -91,7 +104,12 @@ TEST_F(so_peek_off, single_chunk) sendeq(self->fd[0], "aaaabbbb", 0); recveq(self->fd[1], "aaaa", 4, MSG_PEEK); + peekoffeq(self->fd[1], 4); recveq(self->fd[1], "bbbb", 100, MSG_PEEK); + peekoffeq(self->fd[1], 8); + + recveq(self->fd[1], "aaaabbbb", 8, 0); + peekoffeq(self->fd[1], 0); } TEST_F(so_peek_off, two_chunks) @@ -100,7 +118,13 @@ TEST_F(so_peek_off, two_chunks) sendeq(self->fd[0], "bbbb", 0); recveq(self->fd[1], "aaaa", 4, MSG_PEEK); + peekoffeq(self->fd[1], 4); recveq(self->fd[1], "bbbb", 100, MSG_PEEK); + peekoffeq(self->fd[1], 8); + + recveq(self->fd[1], "aaaa", 4, 0); + recveq(self->fd[1], "bbbb", 4, 0); + peekoffeq(self->fd[1], 0); } TEST_F(so_peek_off, two_chunks_blocking) @@ -111,6 +135,7 @@ TEST_F(so_peek_off, two_chunks_blocking) } recveq(self->fd[1], "aaaa", 4, MSG_PEEK); + peekoffeq(self->fd[1], 4); async { usleep(1000); @@ -119,24 +144,38 @@ TEST_F(so_peek_off, two_chunks_blocking) /* goto again; -> goto redo; in unix_stream_read_generic(). 
*/ recveq(self->fd[1], "bbbb", 100, MSG_PEEK); + peekoffeq(self->fd[1], 8); + + recveq(self->fd[1], "aaaa", 4, 0); + recveq(self->fd[1], "bbbb", 4, 0); + peekoffeq(self->fd[1], 0); } TEST_F(so_peek_off, two_chunks_overlap) { sendeq(self->fd[0], "aaaa", 0); recveq(self->fd[1], "aa", 2, MSG_PEEK); + peekoffeq(self->fd[1], 2); sendeq(self->fd[0], "bbbb", 0); if (variant->type == SOCK_STREAM) { /* SOCK_STREAM tries to fill the buffer. */ recveq(self->fd[1], "aabb", 4, MSG_PEEK); + peekoffeq(self->fd[1], 6); recveq(self->fd[1], "bb", 100, MSG_PEEK); + peekoffeq(self->fd[1], 8); } else { /* SOCK_DGRAM and SOCK_SEQPACKET returns at the skb boundary. */ recveq(self->fd[1], "aa", 100, MSG_PEEK); + peekoffeq(self->fd[1], 4); recveq(self->fd[1], "bbbb", 100, MSG_PEEK); + peekoffeq(self->fd[1], 8); } + + recveq(self->fd[1], "aaaa", 4, 0); + recveq(self->fd[1], "bbbb", 4, 0); + peekoffeq(self->fd[1], 0); } TEST_F(so_peek_off, two_chunks_overlap_blocking) @@ -147,6 +186,7 @@ TEST_F(so_peek_off, two_chunks_overlap_blocking) } recveq(self->fd[1], "aa", 2, MSG_PEEK); + peekoffeq(self->fd[1], 2); async { usleep(1000); @@ -155,8 +195,14 @@ TEST_F(so_peek_off, two_chunks_overlap_blocking) /* Even SOCK_STREAM does not wait if at least one byte is read. */ recveq(self->fd[1], "aa", 100, MSG_PEEK); + peekoffeq(self->fd[1], 4); recveq(self->fd[1], "bbbb", 100, MSG_PEEK); + peekoffeq(self->fd[1], 8); + + recveq(self->fd[1], "aaaa", 4, 0); + recveq(self->fd[1], "bbbb", 4, 0); + peekoffeq(self->fd[1], 0); } TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/net/bridge_stp_mode.sh b/tools/testing/selftests/net/bridge_stp_mode.sh new file mode 100755 index 000000000000..0c81fd029d79 --- /dev/null +++ b/tools/testing/selftests/net/bridge_stp_mode.sh @@ -0,0 +1,288 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# shellcheck disable=SC2034,SC2154,SC2317,SC2329 +# +# Test for bridge STP mode selection (IFLA_BR_STP_MODE). 
+# +# Verifies that: +# - stp_mode defaults to auto on new bridges +# - stp_mode can be toggled between user, kernel, and auto +# - stp_mode change is rejected while STP is active (-EBUSY) +# - stp_mode user in a netns yields userspace STP (stp_state=2) +# - stp_mode kernel forces kernel STP (stp_state=1) +# - stp_mode auto preserves traditional fallback to kernel STP +# - stp_mode and stp_state can be set atomically in one message +# - stp_mode persists across STP disable/enable cycles + +source lib.sh + +require_command jq + +ALL_TESTS=" + test_default_auto + test_set_modes + test_reject_change_while_stp_active + test_idempotent_mode_while_stp_active + test_user_mode_in_netns + test_kernel_mode + test_auto_mode + test_atomic_mode_and_state + test_mode_persistence +" + +bridge_info_get() +{ + ip -n "$NS1" -d -j link show "$1" | \ + jq -r ".[0].linkinfo.info_data.$2" +} + +check_stp_mode() +{ + local br=$1; shift + local expected=$1; shift + local msg=$1; shift + local val + + val=$(bridge_info_get "$br" stp_mode) + [ "$val" = "$expected" ] + check_err $? "$msg: expected $expected, got $val" +} + +check_stp_state() +{ + local br=$1; shift + local expected=$1; shift + local msg=$1; shift + local val + + val=$(bridge_info_get "$br" stp_state) + [ "$val" = "$expected" ] + check_err $? "$msg: expected $expected, got $val" +} + +# Create a bridge in NS1, bring it up, and defer its deletion. +bridge_create() +{ + ip -n "$NS1" link add "$1" type bridge + ip -n "$NS1" link set "$1" up + defer ip -n "$NS1" link del "$1" +} + +setup_prepare() +{ + setup_ns NS1 +} + +cleanup() +{ + defer_scopes_cleanup + cleanup_all_ns +} + +# Check that stp_mode defaults to auto when creating a bridge. +test_default_auto() +{ + RET=0 + + ip -n "$NS1" link add br-test type bridge + defer ip -n "$NS1" link del br-test + + check_stp_mode br-test auto "stp_mode default" + + log_test "stp_mode defaults to auto" +} + +# Test setting stp_mode to user, kernel, and back to auto. 
+test_set_modes() +{ + RET=0 + + ip -n "$NS1" link add br-test type bridge + defer ip -n "$NS1" link del br-test + + ip -n "$NS1" link set dev br-test type bridge stp_mode user + check_err $? "Failed to set stp_mode to user" + check_stp_mode br-test user "after set user" + + ip -n "$NS1" link set dev br-test type bridge stp_mode kernel + check_err $? "Failed to set stp_mode to kernel" + check_stp_mode br-test kernel "after set kernel" + + ip -n "$NS1" link set dev br-test type bridge stp_mode auto + check_err $? "Failed to set stp_mode to auto" + check_stp_mode br-test auto "after set auto" + + log_test "stp_mode set user/kernel/auto" +} + +# Verify that stp_mode cannot be changed while STP is active. +test_reject_change_while_stp_active() +{ + RET=0 + + bridge_create br-test + + ip -n "$NS1" link set dev br-test type bridge stp_mode kernel + check_err $? "Failed to set stp_mode to kernel" + + ip -n "$NS1" link set dev br-test type bridge stp_state 1 + check_err $? "Failed to enable STP" + + # Changing stp_mode while STP is active should fail. + ip -n "$NS1" link set dev br-test type bridge stp_mode auto 2>/dev/null + check_fail $? "Changing stp_mode should fail while STP is active" + + check_stp_mode br-test kernel "mode unchanged after rejected change" + + # Disable STP, then change should succeed. + ip -n "$NS1" link set dev br-test type bridge stp_state 0 + check_err $? "Failed to disable STP" + + ip -n "$NS1" link set dev br-test type bridge stp_mode auto + check_err $? "Changing stp_mode should succeed after STP is disabled" + + log_test "reject stp_mode change while STP is active" +} + +# Verify that re-setting the same stp_mode while STP is active succeeds. +test_idempotent_mode_while_stp_active() +{ + RET=0 + + bridge_create br-test + + ip -n "$NS1" link set dev br-test type bridge stp_mode user stp_state 1 + check_err $? "Failed to enable STP with user mode" + + # Re-setting the same mode while STP is active should succeed. 
+ ip -n "$NS1" link set dev br-test type bridge stp_mode user + check_err $? "Idempotent stp_mode set should succeed while STP is active" + + check_stp_state br-test 2 "stp_state after idempotent set" + + # Changing mode while disabling STP in the same message should succeed. + ip -n "$NS1" link set dev br-test type bridge stp_mode auto stp_state 0 + check_err $? "Mode change with simultaneous STP disable should succeed" + + check_stp_mode br-test auto "mode changed after disable+change" + check_stp_state br-test 0 "stp_state after disable+change" + + log_test "idempotent and simultaneous mode change while STP active" +} + +# Test that stp_mode user in a non-init netns yields userspace STP +# (stp_state == 2). This is the key use case: userspace STP without +# needing /sbin/bridge-stp or being in init_net. +test_user_mode_in_netns() +{ + RET=0 + + bridge_create br-test + + ip -n "$NS1" link set dev br-test type bridge stp_mode user + check_err $? "Failed to set stp_mode to user" + + ip -n "$NS1" link set dev br-test type bridge stp_state 1 + check_err $? "Failed to enable STP" + + check_stp_state br-test 2 "stp_state with user mode" + + log_test "stp_mode user in netns yields userspace STP" +} + +# Test that stp_mode kernel forces kernel STP (stp_state == 1) +# regardless of whether /sbin/bridge-stp exists. +test_kernel_mode() +{ + RET=0 + + bridge_create br-test + + ip -n "$NS1" link set dev br-test type bridge stp_mode kernel + check_err $? "Failed to set stp_mode to kernel" + + ip -n "$NS1" link set dev br-test type bridge stp_state 1 + check_err $? "Failed to enable STP" + + check_stp_state br-test 1 "stp_state with kernel mode" + + log_test "stp_mode kernel forces kernel STP" +} + +# Test that stp_mode auto preserves traditional behavior: in a netns +# (non-init_net), bridge-stp is not called and STP falls back to +# kernel mode (stp_state == 1). +test_auto_mode() +{ + RET=0 + + bridge_create br-test + + # Auto mode is the default; enable STP in a netns. 
+ ip -n "$NS1" link set dev br-test type bridge stp_state 1 + check_err $? "Failed to enable STP" + + # In a netns with auto mode, bridge-stp is skipped (init_net only), + # so STP should fall back to kernel mode (stp_state == 1). + check_stp_state br-test 1 "stp_state with auto mode in netns" + + log_test "stp_mode auto preserves traditional behavior" +} + +# Test that stp_mode and stp_state can be set in a single netlink +# message. This is the intended atomic usage pattern. +test_atomic_mode_and_state() +{ + RET=0 + + bridge_create br-test + + # Set both stp_mode and stp_state in one command. + ip -n "$NS1" link set dev br-test type bridge stp_mode user stp_state 1 + check_err $? "Failed to set stp_mode user and stp_state 1 atomically" + + check_stp_state br-test 2 "stp_state after atomic set" + + log_test "atomic stp_mode user + stp_state 1 in single message" +} + +# Test that stp_mode persists across STP disable/enable cycles. +test_mode_persistence() +{ + RET=0 + + bridge_create br-test + + # Set user mode and enable STP. + ip -n "$NS1" link set dev br-test type bridge stp_mode user + ip -n "$NS1" link set dev br-test type bridge stp_state 1 + check_err $? "Failed to enable STP with user mode" + + # Disable STP. + ip -n "$NS1" link set dev br-test type bridge stp_state 0 + check_err $? "Failed to disable STP" + + # Verify mode is still user. + check_stp_mode br-test user "stp_mode after STP disable" + + # Re-enable STP -- should use user mode again. + ip -n "$NS1" link set dev br-test type bridge stp_state 1 + check_err $? "Failed to re-enable STP" + + check_stp_state br-test 2 "stp_state after re-enable" + + log_test "stp_mode persists across STP disable/enable cycles" +} + +# Check iproute2 support before setting up resources. +if ! 
ip link add type bridge help 2>&1 | grep -q "stp_mode"; then + echo "SKIP: iproute2 too old, missing stp_mode support" + exit "$ksft_skip" +fi + +trap cleanup EXIT + +setup_prepare +tests_run + +exit "$EXIT_STATUS" diff --git a/tools/testing/selftests/net/config b/tools/testing/selftests/net/config index cd49b7dfe216..2a390cae41bf 100644 --- a/tools/testing/selftests/net/config +++ b/tools/testing/selftests/net/config @@ -43,6 +43,8 @@ CONFIG_IPV6_ILA=m CONFIG_IPV6_IOAM6_LWTUNNEL=y CONFIG_IPV6_MROUTE=y CONFIG_IPV6_MULTIPLE_TABLES=y +CONFIG_IPV6_ROUTE_INFO=y +CONFIG_IPV6_ROUTER_PREF=y CONFIG_IPV6_RPL_LWTUNNEL=y CONFIG_IPV6_SEG6_LWTUNNEL=y CONFIG_IPV6_SIT=y diff --git a/tools/testing/selftests/net/fib_tests.sh b/tools/testing/selftests/net/fib_tests.sh index 829f72c8ee07..af64f93bb2e1 100755 --- a/tools/testing/selftests/net/fib_tests.sh +++ b/tools/testing/selftests/net/fib_tests.sh @@ -545,7 +545,7 @@ fib4_nexthop() fib6_nexthop() { local lldummy=$(get_linklocal dummy0) - local llv1=$(get_linklocal dummy0) + local llv1=$(get_linklocal veth1) if [ -z "$lldummy" ]; then echo "Failed to get linklocal address for dummy0" @@ -1589,6 +1589,23 @@ fib6_ra_to_static() log_test $ret 0 "ipv6 promote RA route to static" + # Prepare for RA route with gateway + $NS_EXEC sysctl -wq net.ipv6.conf.veth1.accept_ra_rt_info_max_plen=64 + + # Add initial route to cause ECMP merging + $IP -6 route add 2001:12::/64 via fe80::dead:beef dev veth1 + + $NS_EXEC ra6 -i veth2 -d 2001:10::1 -R 2001:12::/64#1#120 + + # Routes are not merged as RA routes are not elegible for ECMP + check_rt_num 2 "$($IP -6 route list | grep -c "2001:12::/64 via")" + + $IP -6 route append 2001:12::/64 via fe80::dead:feeb dev veth1 + + check_rt_num 2 "$($IP -6 route list | grep -c "nexthop via")" + + log_test "$ret" 0 "ipv6 RA route with nexthop do not merge into ECMP with static" + set +e cleanup &> /dev/null diff --git a/tools/testing/selftests/net/forwarding/.gitignore 
b/tools/testing/selftests/net/forwarding/.gitignore index 2dea317f12e7..418ff96c52ef 100644 --- a/tools/testing/selftests/net/forwarding/.gitignore +++ b/tools/testing/selftests/net/forwarding/.gitignore @@ -1,2 +1,3 @@ # SPDX-License-Identifier: GPL-2.0-only forwarding.config +ipmr diff --git a/tools/testing/selftests/net/forwarding/Makefile b/tools/testing/selftests/net/forwarding/Makefile index ff4a00d91a26..bbaf4d937dd8 100644 --- a/tools/testing/selftests/net/forwarding/Makefile +++ b/tools/testing/selftests/net/forwarding/Makefile @@ -133,6 +133,10 @@ TEST_FILES := \ tc_common.sh \ # end of TEST_FILES +TEST_GEN_PROGS := \ + ipmr +# end of TEST_GEN_PROGS + TEST_INCLUDES := \ $(wildcard ../lib/sh/*.sh) \ ../lib.sh \ diff --git a/tools/testing/selftests/net/forwarding/gre_multipath.sh b/tools/testing/selftests/net/forwarding/gre_multipath.sh index 57531c1d884d..ce4ae74843d9 100755 --- a/tools/testing/selftests/net/forwarding/gre_multipath.sh +++ b/tools/testing/selftests/net/forwarding/gre_multipath.sh @@ -65,7 +65,7 @@ source lib.sh h1_create() { - simple_if_init $h1 192.0.2.1/28 2001:db8:1::1/64 + simple_if_init $h1 192.0.2.1/28 ip route add vrf v$h1 192.0.2.16/28 via 192.0.2.2 } diff --git a/tools/testing/selftests/net/forwarding/gre_multipath_nh.sh b/tools/testing/selftests/net/forwarding/gre_multipath_nh.sh index 7d5b2b9cc133..c667b81da37f 100755 --- a/tools/testing/selftests/net/forwarding/gre_multipath_nh.sh +++ b/tools/testing/selftests/net/forwarding/gre_multipath_nh.sh @@ -80,7 +80,7 @@ h1_destroy() { ip route del vrf v$h1 2001:db8:2::/64 via 2001:db8:1::2 ip route del vrf v$h1 192.0.2.16/28 via 192.0.2.2 - simple_if_fini $h1 192.0.2.1/28 + simple_if_fini $h1 192.0.2.1/28 2001:db8:1::1/64 } sw1_create() diff --git a/tools/testing/selftests/net/forwarding/gre_multipath_nh_res.sh b/tools/testing/selftests/net/forwarding/gre_multipath_nh_res.sh index 370f9925302d..d04bad58a96a 100755 --- a/tools/testing/selftests/net/forwarding/gre_multipath_nh_res.sh +++ 
b/tools/testing/selftests/net/forwarding/gre_multipath_nh_res.sh @@ -80,7 +80,7 @@ h1_destroy() { ip route del vrf v$h1 2001:db8:2::/64 via 2001:db8:1::2 ip route del vrf v$h1 192.0.2.16/28 via 192.0.2.2 - simple_if_fini $h1 192.0.2.1/28 + simple_if_fini $h1 192.0.2.1/28 2001:db8:1::1/64 } sw1_create() diff --git a/tools/testing/selftests/net/forwarding/ipip_lib.sh b/tools/testing/selftests/net/forwarding/ipip_lib.sh index 01e62c4ac94d..b255646b737a 100644 --- a/tools/testing/selftests/net/forwarding/ipip_lib.sh +++ b/tools/testing/selftests/net/forwarding/ipip_lib.sh @@ -144,7 +144,7 @@ h1_create() { - simple_if_init $h1 192.0.2.1/28 2001:db8:1::1/64 + simple_if_init $h1 192.0.2.1/28 ip route add vrf v$h1 192.0.2.16/28 via 192.0.2.2 } diff --git a/tools/testing/selftests/net/forwarding/ipmr.c b/tools/testing/selftests/net/forwarding/ipmr.c new file mode 100644 index 000000000000..df870aad9ead --- /dev/null +++ b/tools/testing/selftests/net/forwarding/ipmr.c @@ -0,0 +1,455 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright 2026 Google LLC */ + +#include <linux/if.h> +#include <linux/mroute.h> +#include <linux/netlink.h> +#include <linux/rtnetlink.h> +#include <linux/socket.h> +#include <sched.h> +#include <sys/ioctl.h> +#include <sys/socket.h> + +#include "kselftest_harness.h" + +FIXTURE(ipmr) +{ + int netlink_sk; + int raw_sk; + int veth_ifindex; +}; + +FIXTURE_VARIANT(ipmr) +{ + int family; + int protocol; + int level; + int opts[MRT_MAX - MRT_BASE + 1]; +}; + +FIXTURE_VARIANT_ADD(ipmr, ipv4) +{ + .family = AF_INET, + .protocol = IPPROTO_IGMP, + .level = IPPROTO_IP, + .opts = { + MRT_INIT, + MRT_DONE, + MRT_ADD_VIF, + MRT_DEL_VIF, + MRT_ADD_MFC, + MRT_DEL_MFC, + MRT_VERSION, + MRT_ASSERT, + MRT_PIM, + MRT_TABLE, + MRT_ADD_MFC_PROXY, + MRT_DEL_MFC_PROXY, + MRT_FLUSH, + }, +}; + +struct mfc_attr { + int table; + __u32 origin; + __u32 group; + int ifindex; + bool proxy; +}; + +static struct rtattr *nl_add_rtattr(struct nlmsghdr *nlmsg, struct rtattr *rta, + int 
type, const void *data, int len) +{ + int unused = 0; + + rta->rta_type = type; + rta->rta_len = RTA_LENGTH(len); + memcpy(RTA_DATA(rta), data, len); + + nlmsg->nlmsg_len += NLMSG_ALIGN(rta->rta_len); + + return RTA_NEXT(rta, unused); +} + +static int nl_sendmsg_mfc(struct __test_metadata *_metadata, FIXTURE_DATA(ipmr) *self, + __u16 nlmsg_type, struct mfc_attr *mfc_attr) +{ + struct { + struct nlmsghdr nlmsg; + struct rtmsg rtm; + char buf[4096]; + } req = { + .nlmsg = { + .nlmsg_len = NLMSG_LENGTH(sizeof(req.rtm)), + /* ipmr does not care about NLM_F_CREATE and NLM_F_EXCL ... */ + .nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK, + .nlmsg_type = nlmsg_type, + }, + .rtm = { + /* hard requirements in rtm_to_ipmr_mfcc() */ + .rtm_family = RTNL_FAMILY_IPMR, + .rtm_dst_len = 32, + .rtm_type = RTN_MULTICAST, + .rtm_scope = RT_SCOPE_UNIVERSE, + .rtm_protocol = RTPROT_MROUTED, + }, + }; + struct nlmsghdr *nlmsg = &req.nlmsg; + struct nlmsgerr *errmsg; + struct rtattr *rta; + int err; + + rta = (struct rtattr *)&req.buf; + rta = nl_add_rtattr(nlmsg, rta, RTA_TABLE, &mfc_attr->table, sizeof(mfc_attr->table)); + rta = nl_add_rtattr(nlmsg, rta, RTA_SRC, &mfc_attr->origin, sizeof(mfc_attr->origin)); + rta = nl_add_rtattr(nlmsg, rta, RTA_DST, &mfc_attr->group, sizeof(mfc_attr->group)); + if (mfc_attr->ifindex) + rta = nl_add_rtattr(nlmsg, rta, RTA_IIF, &mfc_attr->ifindex, sizeof(mfc_attr->ifindex)); + if (mfc_attr->proxy) + rta = nl_add_rtattr(nlmsg, rta, RTA_PREFSRC, NULL, 0); + + err = send(self->netlink_sk, &req, req.nlmsg.nlmsg_len, 0); + ASSERT_EQ(err, req.nlmsg.nlmsg_len); + + memset(&req, 0, sizeof(req)); + + err = recv(self->netlink_sk, &req, sizeof(req), 0); + ASSERT_TRUE(NLMSG_OK(nlmsg, err)); + ASSERT_EQ(NLMSG_ERROR, nlmsg->nlmsg_type); + + errmsg = (struct nlmsgerr *)NLMSG_DATA(nlmsg); + return errmsg->error; +} + +FIXTURE_SETUP(ipmr) +{ + struct ifreq ifr = { + .ifr_name = "veth0", + }; + int err; + + err = unshare(CLONE_NEWNET); + ASSERT_EQ(0, err); + + self->netlink_sk 
= socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
+	ASSERT_LE(0, self->netlink_sk);
+
+	self->raw_sk = socket(variant->family, SOCK_RAW, variant->protocol);
+	ASSERT_LT(0, self->raw_sk);
+
+	err = system("ip link add veth0 type veth peer veth1");
+	ASSERT_EQ(0, err);
+
+	err = ioctl(self->raw_sk, SIOCGIFINDEX, &ifr);
+	ASSERT_EQ(0, err);
+
+	self->veth_ifindex = ifr.ifr_ifindex;
+}
+
+FIXTURE_TEARDOWN(ipmr)
+{
+	close(self->raw_sk);
+	close(self->netlink_sk);
+}
+
+TEST_F(ipmr, mrt_init)
+{
+	int err, val = 0; /* any value is ok, but size must be int for MRT_INIT. */
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_INIT - MRT_BASE],
+			 &val, sizeof(val));
+	ASSERT_EQ(0, err);
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_DONE - MRT_BASE],
+			 &val, sizeof(val));
+	ASSERT_EQ(0, err);
+}
+
+TEST_F(ipmr, mrt_add_vif_register)
+{
+	struct vifctl vif = {
+		.vifc_vifi = 0,
+		.vifc_flags = VIFF_REGISTER,
+	};
+	int err;
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+
+	err = system("cat /proc/net/ip_mr_vif | grep -q pimreg");
+	ASSERT_EQ(0, err);
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_DEL_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+}
+
+TEST_F(ipmr, mrt_del_vif_unreg)
+{
+	struct vifctl vif = {
+		.vifc_vifi = 0,
+		.vifc_flags = VIFF_USE_IFINDEX,
+		.vifc_lcl_ifindex = self->veth_ifindex,
+	};
+	int err;
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+
+	err = system("cat /proc/net/ip_mr_vif | grep -q veth0");
+	ASSERT_EQ(0, err);
+
+	/* VIF is removed along with its device. */
+	err = system("ip link del veth0");
+	ASSERT_EQ(0, err);
+
+	/* mrt->vif_table[veth_ifindex]->dev is NULL. 
*/
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_DEL_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(-1, err);
+	ASSERT_EQ(EADDRNOTAVAIL, errno);
+}
+
+TEST_F(ipmr, mrt_del_vif_netns_dismantle)
+{
+	struct vifctl vif = {
+		.vifc_vifi = 0,
+		.vifc_flags = VIFF_USE_IFINDEX,
+		.vifc_lcl_ifindex = self->veth_ifindex,
+	};
+	int err;
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+
+	/* Let cleanup_net() remove veth0 and VIF. */
+}
+
+TEST_F(ipmr, mrt_add_mfc)
+{
+	struct mfcctl mfc = {};
+	int err;
+
+	/* MRT_ADD_MFC / MRT_ADD_MFC_PROXY does not need vif to exist (unlike netlink). */
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_MFC - MRT_BASE],
+			 &mfc, sizeof(mfc));
+	ASSERT_EQ(0, err);
+
+	/* (0.0.0.0 -> 0.0.0.0) */
+	err = system("cat /proc/net/ip_mr_cache | grep -q '00000000 00000000' ");
+	ASSERT_EQ(0, err);
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_DEL_MFC - MRT_BASE],
+			 &mfc, sizeof(mfc));
+}
+
+TEST_F(ipmr, mrt_add_mfc_proxy)
+{
+	struct mfcctl mfc = {};
+	int err;
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_MFC_PROXY - MRT_BASE],
+			 &mfc, sizeof(mfc));
+	ASSERT_EQ(0, err);
+
+	err = system("cat /proc/net/ip_mr_cache | grep -q '00000000 00000000' ");
+	ASSERT_EQ(0, err);
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_DEL_MFC_PROXY - MRT_BASE],
+			 &mfc, sizeof(mfc));
+}
+
+TEST_F(ipmr, mrt_add_mfc_netlink)
+{
+	struct vifctl vif = {
+		.vifc_vifi = 0,
+		.vifc_flags = VIFF_USE_IFINDEX,
+		.vifc_lcl_ifindex = self->veth_ifindex,
+	};
+	struct mfc_attr mfc_attr = {
+		.table = RT_TABLE_DEFAULT,
+		.origin = 0,
+		.group = 0,
+		.ifindex = self->veth_ifindex,
+		.proxy = false,
+	};
+	int err;
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+
+	err = 
nl_sendmsg_mfc(_metadata, self, RTM_NEWROUTE, &mfc_attr);
+	ASSERT_EQ(0, err);
+
+	err = system("cat /proc/net/ip_mr_cache | grep -q '00000000 00000000' ");
+	ASSERT_EQ(0, err);
+
+	err = nl_sendmsg_mfc(_metadata, self, RTM_DELROUTE, &mfc_attr);
+	ASSERT_EQ(0, err);
+}
+
+TEST_F(ipmr, mrt_add_mfc_netlink_proxy)
+{
+	struct vifctl vif = {
+		.vifc_vifi = 0,
+		.vifc_flags = VIFF_USE_IFINDEX,
+		.vifc_lcl_ifindex = self->veth_ifindex,
+	};
+	struct mfc_attr mfc_attr = {
+		.table = RT_TABLE_DEFAULT,
+		.origin = 0,
+		.group = 0,
+		.ifindex = self->veth_ifindex,
+		.proxy = true,
+	};
+	int err;
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+
+	err = nl_sendmsg_mfc(_metadata, self, RTM_NEWROUTE, &mfc_attr);
+	ASSERT_EQ(0, err);
+
+	err = system("cat /proc/net/ip_mr_cache | grep -q '00000000 00000000' ");
+	ASSERT_EQ(0, err);
+
+	err = nl_sendmsg_mfc(_metadata, self, RTM_DELROUTE, &mfc_attr);
+	ASSERT_EQ(0, err);
+}
+
+TEST_F(ipmr, mrt_add_mfc_netlink_no_vif)
+{
+	struct mfc_attr mfc_attr = {
+		.table = RT_TABLE_DEFAULT,
+		.origin = 0,
+		.group = 0,
+		.proxy = false,
+	};
+	int err;
+
+	/* netlink always requires RTA_IIF of an existing vif. */
+	mfc_attr.ifindex = 0;
+	err = nl_sendmsg_mfc(_metadata, self, RTM_NEWROUTE, &mfc_attr);
+	ASSERT_EQ(-ENFILE, err);
+
+	/* netlink always requires RTA_IIF of an existing vif. 
*/
+	mfc_attr.ifindex = self->veth_ifindex;
+	err = nl_sendmsg_mfc(_metadata, self, RTM_NEWROUTE, &mfc_attr);
+	ASSERT_EQ(-ENFILE, err);
+}
+
+TEST_F(ipmr, mrt_del_mfc_netlink_netns_dismantle)
+{
+	struct vifctl vifs[2] = {
+		{
+			.vifc_vifi = 0,
+			.vifc_flags = VIFF_USE_IFINDEX,
+			.vifc_lcl_ifindex = self->veth_ifindex,
+		},
+		{
+			.vifc_vifi = 1,
+			.vifc_flags = VIFF_REGISTER,
+		}
+	};
+	struct mfc_attr mfc_attr = {
+		.table = RT_TABLE_DEFAULT,
+		.origin = 0,
+		.group = 0,
+		.ifindex = self->veth_ifindex,
+		.proxy = false,
+	};
+	int i, err;
+
+	for (i = 0; i < 2; i++) {
+		/* Create 2 VIFs just to avoid -ENFILE later. */
+		err = setsockopt(self->raw_sk,
+				 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+				 &vifs[i], sizeof(vifs[i]));
+		ASSERT_EQ(0, err);
+	}
+
+	/* Create a MFC for mrt->vif_table[0]. */
+	err = nl_sendmsg_mfc(_metadata, self, RTM_NEWROUTE, &mfc_attr);
+	ASSERT_EQ(0, err);
+
+	err = system("cat /proc/net/ip_mr_cache | grep -q '00000000 00000000' ");
+	ASSERT_EQ(0, err);
+
+	/* Remove mrt->vif_table[0]. */
+	err = system("ip link del veth0");
+	ASSERT_EQ(0, err);
+
+	/* MFC entry is NOT removed even if the tied VIF is removed... */
+	err = system("cat /proc/net/ip_mr_cache | grep -q '00000000 00000000' ");
+	ASSERT_EQ(0, err);
+
+	/* ... and netlink is not capable of removing such an entry
+	 * because netlink always requires a valid RTA_IIF ... :/
+	 */
+	err = nl_sendmsg_mfc(_metadata, self, RTM_DELROUTE, &mfc_attr);
+	ASSERT_EQ(-ENODEV, err);
+
+	/* It can be removed by setsockopt(), but let cleanup_net() remove this time. */
+}
+
+TEST_F(ipmr, mrt_table_flush)
+{
+	struct vifctl vif = {
+		.vifc_vifi = 0,
+		.vifc_flags = VIFF_USE_IFINDEX,
+		.vifc_lcl_ifindex = self->veth_ifindex,
+	};
+	struct mfc_attr mfc_attr = {
+		.origin = 0,
+		.group = 0,
+		.ifindex = self->veth_ifindex,
+		.proxy = false,
+	};
+	int table_id = 92;
+	int err, flags;
+
+	/* Set a random table id rather than RT_TABLE_DEFAULT. 
+	 * Note that /proc/net/ip_mr_{vif,cache} only supports RT_TABLE_DEFAULT.
+	 */
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_TABLE - MRT_BASE],
+			 &table_id, sizeof(table_id));
+	ASSERT_EQ(0, err);
+
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_ADD_VIF - MRT_BASE],
+			 &vif, sizeof(vif));
+	ASSERT_EQ(0, err);
+
+	mfc_attr.table = table_id;
+	err = nl_sendmsg_mfc(_metadata, self, RTM_NEWROUTE, &mfc_attr);
+	ASSERT_EQ(0, err);
+
+	/* Flush mrt->vif_table[] and all caches. */
+	flags = MRT_FLUSH_VIFS | MRT_FLUSH_VIFS_STATIC |
+		MRT_FLUSH_MFC | MRT_FLUSH_MFC_STATIC;
+	err = setsockopt(self->raw_sk,
+			 variant->level, variant->opts[MRT_FLUSH - MRT_BASE],
+			 &flags, sizeof(flags));
+	ASSERT_EQ(0, err);
+}
+
+TEST_HARNESS_MAIN
diff --git a/tools/testing/selftests/net/forwarding/lib.sh b/tools/testing/selftests/net/forwarding/lib.sh
index a9034f0bb58b..ac8358bcb22c 100644
--- a/tools/testing/selftests/net/forwarding/lib.sh
+++ b/tools/testing/selftests/net/forwarding/lib.sh
@@ -1,5 +1,6 @@
 #!/bin/bash
 # SPDX-License-Identifier: GPL-2.0
+#shellcheck disable=SC2034 # SC doesn't see our uses of global variables
 ##############################################################################
 # Topology description. p1 looped back to p2, p3 to p4 and so on.
@@ -340,17 +341,145 @@ fi
 ##############################################################################
 # Command line options handling
-count=0
+check_env() {
+	if [[ ! 
(( -n "$LOCAL_V4" && -n "$REMOTE_V4") ||
+		( -n "$LOCAL_V6" && -n "$REMOTE_V6" )) ]]; then
+		echo "SKIP: Invalid environment, missing or inconsistent LOCAL_V4/REMOTE_V4/LOCAL_V6/REMOTE_V6"
+		echo "Please see tools/testing/selftests/drivers/net/README.rst"
+		exit "$ksft_skip"
+	fi
+
+	if [[ -z "$REMOTE_TYPE" ]]; then
+		echo "SKIP: Invalid environment, missing REMOTE_TYPE"
+		exit "$ksft_skip"
+	fi
+
+	if [[ -z "$REMOTE_ARGS" ]]; then
+		echo "SKIP: Invalid environment, missing REMOTE_ARGS"
+		exit "$ksft_skip"
+	fi
+}
+
+__run_on()
+{
+	local target=$1; shift
+	local type args
+
+	IFS=':' read -r type args <<< "$target"
+
+	case "$type" in
+	netns)
+		# Execute command in network namespace
+		# args contains the namespace name
+		ip netns exec "$args" "$@"
+		;;
+	ssh)
+		# Execute command via SSH args contains user@host
+		ssh -n "$args" "$@"
+		;;
+	local|*)
+		# Execute command locally. This is also the fallback
+		# case for when the interface's target is not found in
+		# the TARGETS array.
+		"$@"
+		;;
+	esac
+}
+
+run_on()
+{
+	local iface=$1; shift
+	local target="local:"
+
+	if [ "${DRIVER_TEST_CONFORMANT}" = "yes" ]; then
+		target="${TARGETS[$iface]}"
+	fi
+
+	__run_on "$target" "$@"
+}
+
+get_ifname_by_ip()
+{
+	local target=$1; shift
+	local ip_addr=$1; shift
+
+	__run_on "$target" ip -j addr show to "$ip_addr" | jq -r '.[].ifname'
+}
+
+# Whether the test is conforming to the requirements and usage described in
+# drivers/net/README.rst.
+: "${DRIVER_TEST_CONFORMANT:=no}"
+
+declare -A TARGETS
+
+# Based on DRIVER_TEST_CONFORMANT, decide if to source drivers/net/net.config
+# or not. In the "yes" case, the test expects to pass the arguments through the
+# variables specified in drivers/net/README.rst file. If not, fallback on
+# parsing the script arguments for interface names.
+if [ "${DRIVER_TEST_CONFORMANT}" = "yes" ]; then
+	if [[ -f $net_forwarding_dir/../../drivers/net/net.config ]]; then
+		source "$net_forwarding_dir/../../drivers/net/net.config"
+	fi
+
+	if (( NUM_NETIFS > 2)); then
+		echo "SKIP: DRIVER_TEST_CONFORMANT=yes and NUM_NETIFS is bigger than 2"
+		exit "$ksft_skip"
+	fi
-while [[ $# -gt 0 ]]; do
-	if [[ "$count" -eq "0" ]]; then
+	check_env
+
+	# Populate the NETIFS and TARGETS arrays automatically based on the
+	# environment variables. The TARGETS array is indexed by the network
+	# interface name keeping track of the target on which the interface
+	# resides. Values will be strings of the following format -
+	# <type>:<args>.
+	#
+	# TARGETS[eth0]="local:" - meaning that the eth0 interface is
+	# accessible locally
+	# TARGETS[eth1]="netns:foo" - eth1 is in the foo netns
+	# TARGETS[eth2]="ssh:root@10.0.0.2" - eth2 is accessible through
+	# running the 'ssh root@10.0.0.2' command.
+
+	unset NETIFS
+	declare -A NETIFS
+
+	NETIFS[p1]="$NETIF"
+	TARGETS[$NETIF]="local:"
+
+	# Locate the name of the remote interface
+	remote_target="$REMOTE_TYPE:$REMOTE_ARGS"
+	if [[ -v REMOTE_V4 ]]; then
+		remote_netif=$(get_ifname_by_ip "$remote_target" "$REMOTE_V4")
+	else
+		remote_netif=$(get_ifname_by_ip "$remote_target" "$REMOTE_V6")
+	fi
+	if [[ ! -n "$remote_netif" ]]; then
+		echo "SKIP: cannot find remote interface"
+		exit "$ksft_skip"
+	fi
+
+	if [[ "$NETIF" == "$remote_netif" ]]; then
+		echo "SKIP: local and remote interfaces cannot have the same name"
+		exit "$ksft_skip"
+	fi
+
+	NETIFS[p2]="$remote_netif"
+	TARGETS[$remote_netif]="$REMOTE_TYPE:$REMOTE_ARGS"
+else
+	count=0
+	# Prime NETIFS from the command line, but retain if none given.
+	if [[ $# -gt 0 ]]; then
 		unset NETIFS
 		declare -A NETIFS
+
+		while [[ $# -gt 0 ]]; do
+			count=$((count + 1))
+			NETIFS[p$count]="$1"
+			TARGETS[$1]="local:"
+			shift
+		done
 	fi
-	count=$((count + 1))
-	NETIFS[p$count]="$1"
-	shift
-done
+fi
 ##############################################################################
 # Network interfaces configuration
@@ -418,10 +547,11 @@ mac_addr_prepare()
 		dev=${NETIFS[p$i]}
 		new_addr=$(printf "00:01:02:03:04:%02x" $i)
-		MAC_ADDR_ORIG["$dev"]=$(ip -j link show dev $dev | jq -e '.[].address')
+		MAC_ADDR_ORIG["$dev"]=$(run_on "$dev" \
+			ip -j link show dev "$dev" | jq -e '.[].address')
 		# Strip quotes
 		MAC_ADDR_ORIG["$dev"]=${MAC_ADDR_ORIG["$dev"]//\"/}
-		ip link set dev $dev address $new_addr
+		run_on "$dev" ip link set dev "$dev" address $new_addr
 	done
 }
@@ -431,7 +561,8 @@ mac_addr_restore()
 	for ((i = 1; i <= NUM_NETIFS; ++i)); do
 		dev=${NETIFS[p$i]}
-		ip link set dev $dev address ${MAC_ADDR_ORIG["$dev"]}
+		run_on "$dev" \
+			ip link set dev "$dev" address ${MAC_ADDR_ORIG["$dev"]}
 	done
 }
@@ -444,7 +575,9 @@ if [[ "$STABLE_MAC_ADDRS" = "yes" ]]; then
 fi
 for ((i = 1; i <= NUM_NETIFS; ++i)); do
-	ip link show dev ${NETIFS[p$i]} &> /dev/null
+	int="${NETIFS[p$i]}"
+
+	run_on "$int" ip link show dev "$int" &> /dev/null
 	if [[ $? -ne 0 ]]; then
 		echo "SKIP: could not find all required interfaces"
 		exit $ksft_skip
@@ -527,7 +660,7 @@ setup_wait_dev_with_timeout()
 	local i
 	for ((i = 1; i <= $max_iterations; ++i)); do
-		ip link show dev $dev up \
+		run_on "$dev" ip link show dev "$dev" up \
 			| grep 'state UP' &> /dev/null
 		if [[ $? 
-ne 0 ]]; then
 			sleep 1
@@ -831,8 +964,15 @@ ethtool_std_stats_get()
 	local name=$1; shift
 	local src=$1; shift
-	ethtool --json -S $dev --groups $grp -- --src $src | \
-		jq '.[]."'"$grp"'"."'$name'"'
+	if [[ "$grp" == "pause" ]]; then
+		run_on "$dev" ethtool -I --json -a "$dev" --src "$src" | \
+			jq --arg name "$name" '.[].statistics[$name]'
+		return
+	fi
+
+	run_on "$dev" \
+		ethtool --json -S "$dev" --groups "$grp" -- --src "$src" | \
+		jq --arg grp "$grp" --arg name "$name" '.[][$grp][$name]'
 }
 qdisc_stats_get()
@@ -1610,12 +1750,17 @@ tcpdump_start()
 	sleep 1
 }
-tcpdump_stop()
+tcpdump_stop_nosleep()
 {
 	local if_name=$1
 	local pid=${cappid[$if_name]}
 	$ns_cmd kill "$pid" && wait "$pid"
+}
+
+tcpdump_stop()
+{
+	tcpdump_stop_nosleep "$1"
 	sleep 1
 }
@@ -1630,7 +1775,7 @@ tcpdump_show()
 {
 	local if_name=$1
-	tcpdump -e -n -r ${capfile[$if_name]} 2>&1
+	tcpdump -e -nn -r ${capfile[$if_name]} 2>&1
 }
 # return 0 if the packet wasn't seen on host2_if or 1 if it was
diff --git a/tools/testing/selftests/net/forwarding/local_termination.sh b/tools/testing/selftests/net/forwarding/local_termination.sh
index 1f2bf6e81847..15b1a1255a41 100755
--- a/tools/testing/selftests/net/forwarding/local_termination.sh
+++ b/tools/testing/selftests/net/forwarding/local_termination.sh
@@ -57,21 +57,21 @@ PTP_1588_L2_PDELAY_REQ=" \
 PTP_1588_IPV4_SYNC=" \
 01:00:5e:00:01:81 00:00:de:ad:be:ef 08:00 45 00 \
 00 48 0a 9a 40 00 01 11 cb 88 c0 00 02 01 e0 00 \
-01 81 01 3f 01 3f 00 34 a3 c8 00 02 00 2c 00 00 \
+01 81 01 3f 01 3f 00 34 9f 41 00 02 00 2c 00 00 \
 02 00 00 00 00 00 00 00 00 00 00 00 00 00 3e 37 \
 63 ff fe cf 17 0e 00 01 00 00 00 00 00 00 00 00 \
 00 00 00 00 00 00"
 PTP_1588_IPV4_FOLLOW_UP="
 01:00:5e:00:01:81 00:00:de:ad:be:ef 08:00 45 00 \
 00 48 0a 9b 40 00 01 11 cb 87 c0 00 02 01 e0 00 \
-01 81 01 40 01 40 00 34 a3 c8 08 02 00 2c 00 00 \
+01 81 01 40 01 40 00 34 eb 8a 08 02 00 2c 00 00 \
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3e 37 \
 63 ff fe cf 17 0e 00 01 00 00 02 00 00 00 66 83 \
 c6 0f 
1d 9a 61 87"
 PTP_1588_IPV4_PDELAY_REQ=" \
 01:00:5e:00:00:6b 00:00:de:ad:be:ef 08:00 45 00 \
 00 52 35 a9 40 00 01 11 a1 85 c0 00 02 01 e0 00 \
-00 6b 01 3f 01 3f 00 3e a2 bc 02 02 00 36 00 00 \
+00 6b 01 3f 01 3f 00 3e 9a b9 02 02 00 36 00 00 \
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 3e 37 \
 63 ff fe cf 17 0e 00 01 00 01 05 7f 00 00 00 00 \
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00"
@@ -79,7 +79,7 @@ PTP_1588_IPV6_SYNC=" \
 33:33:00:00:01:81 00:00:de:ad:be:ef 86:dd 60 06 \
 7c 2f 00 36 11 01 20 01 0d b8 00 01 00 00 00 00 \
 00 00 00 00 00 01 ff 0e 00 00 00 00 00 00 00 00 \
-00 00 00 00 01 81 01 3f 01 3f 00 36 2e 92 00 02 \
+00 00 00 00 01 81 01 3f 01 3f 00 36 14 76 00 02 \
 00 2c 00 00 02 00 00 00 00 00 00 00 00 00 00 00 \
 00 00 3e 37 63 ff fe cf 17 0e 00 01 00 00 00 00 \
 00 00 00 00 00 00 00 00 00 00 00 00"
@@ -87,7 +87,7 @@ PTP_1588_IPV6_FOLLOW_UP=" \
 33:33:00:00:01:81 00:00:de:ad:be:ef 86:dd 60 0a \
 00 bc 00 36 11 01 20 01 0d b8 00 01 00 00 00 00 \
 00 00 00 00 00 01 ff 0e 00 00 00 00 00 00 00 00 \
-00 00 00 00 01 81 01 40 01 40 00 36 2e 92 08 02 \
+00 00 00 00 01 81 01 40 01 40 00 36 f0 47 08 02 \
 00 2c 00 00 00 00 00 00 00 00 00 00 00 00 00 00 \
 00 00 3e 37 63 ff fe cf 17 0e 00 01 00 00 02 00 \
 00 00 66 83 c6 2a 32 09 bd 74 00 00"
@@ -95,11 +95,20 @@ PTP_1588_IPV6_PDELAY_REQ=" \
 33:33:00:00:00:6b 00:00:de:ad:be:ef 86:dd 60 0c \
 5c fd 00 40 11 01 fe 80 00 00 00 00 00 00 3c 37 \
 63 ff fe cf 17 0e ff 02 00 00 00 00 00 00 00 00 \
-00 00 00 00 00 6b 01 3f 01 3f 00 40 b4 54 02 02 \
+00 00 00 00 00 6b 01 3f 01 3f 00 40 89 1f 02 02 \
 00 36 00 00 00 00 00 00 00 00 00 00 00 00 00 00 \
 00 00 3e 37 63 ff fe cf 17 0e 00 01 00 01 05 7f \
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 \
 00 00 00 00 00 00"
+LINK_LOCAL_STP_BPDU=" \
+01:80:c2:00:00:00 00:00:de:ad:be:ef 00 26 42 42 03 \
+00 00 00 00 00 80 00 aa bb cc dd ee ff 00 00 00 00 \
+80 00 aa bb cc dd ee ff 80 01 00 00 14 00 02 00 \
+0f 00"
+LINK_LOCAL_LLDP=" \
+01:80:c2:00:00:0e 00:00:de:ad:be:ef 88:cc 02 07 04 \
+00 
11 22 33 44 55 04 05 05 65 74 68 30 06 02 00 \
+78 00 00"
 # Disable promisc to ensure we don't receive unknown MAC DA packets
 export TCPDUMP_EXTRA_FLAGS="-pl"
@@ -213,7 +222,15 @@ run_test()
 	mc_route_destroy $rcv_if_name
 	mc_route_destroy $send_if_name
+	ip maddress add 01:80:c2:00:00:00 dev $rcv_if_name
+	send_raw $send_if_name "$LINK_LOCAL_STP_BPDU"
+	ip maddress del 01:80:c2:00:00:00 dev $rcv_if_name
+
 	if [ $skip_ptp = false ]; then
+		ip maddress add 01:80:c2:00:00:0e dev $rcv_if_name
+		send_raw $send_if_name "$LINK_LOCAL_LLDP"
+		ip maddress del 01:80:c2:00:00:0e dev $rcv_if_name
+
 		ip maddress add 01:1b:19:00:00:00 dev $rcv_if_name
 		send_raw $send_if_name "$PTP_1588_L2_SYNC"
 		send_raw $send_if_name "$PTP_1588_L2_FOLLOW_UP"
@@ -304,7 +321,15 @@ run_test()
 		"$smac > $UNKNOWN_MACV6_MC_ADDR3, ethertype IPv6 (0x86dd)" \
 		true "$test_name"
+	check_rcv $rcv_if_name "Link-local STP BPDU" \
+		"> 01:80:c2:00:00:00" \
+		true "$test_name"
+
 	if [ $skip_ptp = false ]; then
+		check_rcv $rcv_if_name "Link-local LLDP" \
+			"> 01:80:c2:00:00:0e" \
+			true "$test_name"
+
 		check_rcv $rcv_if_name "1588v2 over L2 transport, Sync" \
 			"ethertype PTP (0x88f7).* PTPv2.* msg type *: sync msg" \
 			true "$test_name"
diff --git a/tools/testing/selftests/net/fq_band_pktlimit.sh b/tools/testing/selftests/net/fq_band_pktlimit.sh
index 977070ed42b3..223f9efe4090 100755
--- a/tools/testing/selftests/net/fq_band_pktlimit.sh
+++ b/tools/testing/selftests/net/fq_band_pktlimit.sh
@@ -32,19 +32,19 @@ tc qdisc replace dev dummy0 root handle 1: fq quantum 1514 initial_quantum 1514
 DELAY=400000
 ./cmsg_sender -6 -p u -d "${DELAY}" -n 20 fdaa::2 8000
-OUT1="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+OUT1="$(tc -s qdisc show dev dummy0 | grep '^ Sent')"
 ./cmsg_sender -6 -p u -d "${DELAY}" -n 20 fdaa::2 8000
-OUT2="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+OUT2="$(tc -s qdisc show dev dummy0 | grep '^ Sent')"
 ./cmsg_sender -6 -p u -d "${DELAY}" -n 20 -P 7 fdaa::2 8000
-OUT3="$(tc -s qdisc show dev dummy0 
| grep '^\ Sent')"
+OUT3="$(tc -s qdisc show dev dummy0 | grep '^ Sent')"
 # Initial stats will report zero sent, as all packets are still
 # queued in FQ. Sleep for at least the delay period and see that
 # twenty are now sent.
 sleep 0.6
-OUT4="$(tc -s qdisc show dev dummy0 | grep '^\ Sent')"
+OUT4="$(tc -s qdisc show dev dummy0 | grep '^ Sent')"
 # Log the output after the test
 echo "${OUT1}"
@@ -53,7 +53,7 @@ echo "${OUT3}"
 echo "${OUT4}"
 # Test the output for expected values
-echo "${OUT1}" | grep -q '0\ pkt\ (dropped\ 10' || die "unexpected drop count at 1"
-echo "${OUT2}" | grep -q '0\ pkt\ (dropped\ 30' || die "unexpected drop count at 2"
-echo "${OUT3}" | grep -q '0\ pkt\ (dropped\ 40' || die "unexpected drop count at 3"
-echo "${OUT4}" | grep -q '20\ pkt\ (dropped\ 40' || die "unexpected accept count at 4"
+echo "${OUT1}" | grep -q '0 pkt (dropped 10' || die "unexpected drop count at 1"
+echo "${OUT2}" | grep -q '0 pkt (dropped 30' || die "unexpected drop count at 2"
+echo "${OUT3}" | grep -q '0 pkt (dropped 40' || die "unexpected drop count at 3"
+echo "${OUT4}" | grep -q '20 pkt (dropped 40' || die "unexpected accept count at 4"
diff --git a/tools/testing/selftests/net/io_uring_zerocopy_tx.sh b/tools/testing/selftests/net/io_uring_zerocopy_tx.sh
index 123439545013..8c3647de9b4c 100755
--- a/tools/testing/selftests/net/io_uring_zerocopy_tx.sh
+++ b/tools/testing/selftests/net/io_uring_zerocopy_tx.sh
@@ -77,9 +77,13 @@ esac
 # Start of state changes: install cleanup handler
+old_io_uring_disabled=0
 cleanup() {
 	ip netns del "${NS2}"
 	ip netns del "${NS1}"
+	if [ "$old_io_uring_disabled" -ne 0 ]; then
+		sysctl -w -q kernel.io_uring_disabled="$old_io_uring_disabled" 2>/dev/null || true
+	fi
 }
 trap cleanup EXIT
@@ -122,5 +126,10 @@ do_test() {
 	wait
 }
+old_io_uring_disabled=$(sysctl -n kernel.io_uring_disabled 2>/dev/null || echo "0")
+if [ "$old_io_uring_disabled" -ne 0 ]; then
+	sysctl -w -q kernel.io_uring_disabled=0
+fi
+
 do_test "${EXTRA_ARGS}"
 echo ok
diff 
--git a/tools/testing/selftests/net/ip6_tunnel.sh b/tools/testing/selftests/net/ip6_tunnel.sh
new file mode 100755
index 000000000000..fe081a521819
--- /dev/null
+++ b/tools/testing/selftests/net/ip6_tunnel.sh
@@ -0,0 +1,44 @@
+#!/bin/bash
+# Test that IPv4-over-IPv6 tunneling works.
+
+source lib.sh
+set -e
+
+setup_prepare() {
+	ip link add transport1 type veth peer name transport2
+
+	setup_ns ns1
+	ip link set transport1 netns $ns1
+	ip -n $ns1 address add 2001:db8::1/64 dev transport1 nodad
+	ip -n $ns1 address add 2001:db8::3/64 dev transport1 nodad
+	ip -n $ns1 link set transport1 up
+	ip -n $ns1 link add link transport1 name tunnel4 type ip6tnl mode ipip6 local 2001:db8::1 remote 2001:db8::2
+	ip -n $ns1 address add 172.0.0.1/32 peer 172.0.0.2/32 dev tunnel4
+	ip -n $ns1 link set tunnel4 up
+	ip -n $ns1 link add link transport1 name tunnel6 type ip6tnl mode ip6ip6 local 2001:db8::3 remote 2001:db8::4
+	ip -n $ns1 address add 2001:db8:6::1/64 dev tunnel6
+	ip -n $ns1 link set tunnel6 up
+
+	setup_ns ns2
+	ip link set transport2 netns $ns2
+	ip -n $ns2 address add 2001:db8::2/64 dev transport2 nodad
+	ip -n $ns2 address add 2001:db8::4/64 dev transport2 nodad
+	ip -n $ns2 link set transport2 up
+	ip -n $ns2 link add link transport2 name tunnel4 type ip6tnl mode ipip6 local 2001:db8::2 remote 2001:db8::1
+	ip -n $ns2 address add 172.0.0.2/32 peer 172.0.0.1/32 dev tunnel4
+	ip -n $ns2 link set tunnel4 up
+	ip -n $ns2 link add link transport2 name tunnel6 type ip6tnl mode ip6ip6 local 2001:db8::4 remote 2001:db8::3
+	ip -n $ns2 address add 2001:db8:6::2/64 dev tunnel6
+	ip -n $ns2 link set tunnel6 up
+}
+
+cleanup() {
+	cleanup_all_ns
+	# in case the namespaces haven't been set up yet
+	ip link delete transport1 &>/dev/null || true
+}
+
+trap cleanup EXIT
+setup_prepare
+ip netns exec $ns1 ping -q -W1 -c1 172.0.0.2 >/dev/null
+ip netns exec $ns1 ping -q -W1 -c1 2001:db8:6::2 >/dev/null
diff --git a/tools/testing/selftests/net/ipsec.c 
b/tools/testing/selftests/net/ipsec.c
index f4afef51b930..89c32c354c00 100644
--- a/tools/testing/selftests/net/ipsec.c
+++ b/tools/testing/selftests/net/ipsec.c
@@ -62,8 +62,6 @@
 #define VETH_FMT "ktst-%d"
 #define VETH_LEN 12
-#define XFRM_ALGO_NR_KEYS 29
-
 static int nsfd_parent = -1;
 static int nsfd_childa = -1;
 static int nsfd_childb = -1;
@@ -96,7 +94,6 @@ struct xfrm_key_entry xfrm_key_entries[] = {
 	{"cbc(cast5)", 128},
 	{"cbc(serpent)", 128},
 	{"hmac(sha1)", 160},
-	{"hmac(rmd160)", 160},
 	{"cbc(des3_ede)", 192},
 	{"hmac(sha256)", 256},
 	{"cbc(aes)", 256},
@@ -813,7 +810,7 @@ static int xfrm_fill_key(char *name, char *buf,
 {
 	int i;
-	for (i = 0; i < XFRM_ALGO_NR_KEYS; i++) {
+	for (i = 0; i < ARRAY_SIZE(xfrm_key_entries); i++) {
 		if (strncmp(name, xfrm_key_entries[i].algo_name, ALGO_LEN) == 0)
 			*key_len = xfrm_key_entries[i].key_len;
 	}
@@ -2061,8 +2058,7 @@ static int write_desc(int proto, int test_desc_fd,
 	int proto_list[] = { IPPROTO_AH, IPPROTO_COMP, IPPROTO_ESP };
 	char *ah_list[] = {
 		"digest_null", "hmac(md5)", "hmac(sha1)", "hmac(sha256)",
-		"hmac(sha384)", "hmac(sha512)", "hmac(rmd160)",
-		"xcbc(aes)", "cmac(aes)"
+		"hmac(sha384)", "hmac(sha512)", "xcbc(aes)", "cmac(aes)"
 	};
 	char *comp_list[] = {
 		"deflate",
diff --git a/tools/testing/selftests/net/lib.sh b/tools/testing/selftests/net/lib.sh
index b40694573f4c..b3827b43782b 100644
--- a/tools/testing/selftests/net/lib.sh
+++ b/tools/testing/selftests/net/lib.sh
@@ -224,6 +224,19 @@ setup_ns()
 	NS_LIST+=("${ns_list[@]}")
 }
+in_all_ns()
+{
+	local ret=0
+	local ns_list=("${NS_LIST[@]}")
+
+	for ns in "${ns_list[@]}"; do
+		ip netns exec "${ns}" "$@"
+		(( ret = ret || $? ))
+	done
+
+	return "${ret}"
+}
+
 # Create netdevsim with given id and net namespace.
 create_netdevsim() {
 	local id="$1"
@@ -514,7 +527,8 @@ mac_get()
 {
 	local if_name=$1
-	ip -j link show dev $if_name | jq -r '.[]["address"]'
+	run_on "$if_name" \
+		ip -j link show dev "$if_name" | jq -r '.[]["address"]'
 }
 kill_process()
@@ -670,3 +684,8 @@ cmd_jq()
 	# return success only in case of non-empty output
 	[ ! -z "$output" ]
 }
+
+run_on()
+{
+	shift; "$@"
+}
diff --git a/tools/testing/selftests/net/lib/.gitignore b/tools/testing/selftests/net/lib/.gitignore
index bbc97d6bf556..6cd2b762af5d 100644
--- a/tools/testing/selftests/net/lib/.gitignore
+++ b/tools/testing/selftests/net/lib/.gitignore
@@ -1,3 +1,4 @@
 # SPDX-License-Identifier: GPL-2.0-only
 csum
+gro
 xdp_helper
diff --git a/tools/testing/selftests/net/lib/Makefile b/tools/testing/selftests/net/lib/Makefile
index 5339f56329e1..ff83603397d0 100644
--- a/tools/testing/selftests/net/lib/Makefile
+++ b/tools/testing/selftests/net/lib/Makefile
@@ -14,6 +14,7 @@ TEST_FILES := \
 TEST_GEN_FILES := \
 	$(patsubst %.c,%.o,$(wildcard *.bpf.c)) \
 	csum \
+	gro \
 	xdp_helper \
 # end of TEST_GEN_FILES
diff --git a/tools/testing/selftests/drivers/net/gro.c b/tools/testing/selftests/net/lib/gro.c
index 3c0745b68bfa..11b16ae5f0e8 100644
--- a/tools/testing/selftests/drivers/net/gro.c
+++ b/tools/testing/selftests/net/lib/gro.c
@@ -10,8 +10,10 @@
  * packet coalesced: it can be smaller than the rest and coalesced
  * as long as it is in the same flow.
  * - data_same: same size packets coalesce
- * - data_lrg_sml: large then small coalesces
- * - data_sml_lrg: small then large doesn't coalesce
+ * - data_lrg_sml: large then small coalesces
+ * - data_lrg_1byte: large then 1 byte coalesces (Ethernet padding)
+ * - data_sml_lrg: small then large doesn't coalesce
+ * - data_burst: two bursts of two, separated by 100ms
  *
  * ack:
  * Pure ACK does not coalesce.
@@ -34,6 +36,7 @@
  * Packets with different (ECN, TTL, TOS) header, IP options or
  * IP fragments shouldn't coalesce.
  * - ip_ecn, ip_tos: shared between IPv4/IPv6
+ * - ip_csum: IPv4 only, bad IP header checksum
  * - ip_ttl, ip_opt, ip_frag4: IPv4 only
  * - ip_id_df*: IPv4 IP ID field coalescing tests
  * - ip_frag6, ip_v6ext_*: IPv6 only
@@ -43,6 +46,10 @@
  * - large_max: exceeding max size
  * - large_rem: remainder handling
  *
+ * single, capacity:
+ * Boring cases used to test coalescing machinery itself and stats
+ * more than protocol behavior.
+ *
  * MSS is defined as 4096 - header because if it is too small
  * (i.e. 1500 MTU - header), it will result in many packets,
  * increasing the "large" test case's flakiness. This is because
@@ -63,6 +70,7 @@
 #include <linux/filter.h>
 #include <linux/if_packet.h>
 #include <linux/ipv6.h>
+#include <linux/net_tstamp.h>
 #include <net/ethernet.h>
 #include <net/if.h>
 #include <netinet/in.h>
@@ -74,10 +82,11 @@
 #include <stdio.h>
 #include <stdarg.h>
 #include <string.h>
+#include <time.h>
 #include <unistd.h>
 #include "kselftest.h"
-#include "../../net/lib/ksft.h"
+#include "ksft.h"
 #define DPORT 8000
 #define SPORT 1500
@@ -86,11 +95,12 @@
 #define START_SEQ 100
 #define START_ACK 100
 #define ETH_P_NONE 0
-#define TOTAL_HDR_LEN (ETH_HLEN + sizeof(struct ipv6hdr) + sizeof(struct tcphdr))
-#define MSS (4096 - sizeof(struct tcphdr) - sizeof(struct ipv6hdr))
-#define MAX_PAYLOAD (IP_MAXPACKET - sizeof(struct tcphdr) - sizeof(struct ipv6hdr))
-#define NUM_LARGE_PKT (MAX_PAYLOAD / MSS)
-#define MAX_HDR_LEN (ETH_HLEN + sizeof(struct ipv6hdr) + sizeof(struct tcphdr))
+#define ASSUMED_MTU 4096
+#define MAX_MSS (ASSUMED_MTU - sizeof(struct iphdr) - sizeof(struct tcphdr))
+#define MAX_HDR_LEN \
+	(ETH_HLEN + sizeof(struct ipv6hdr) * 2 + sizeof(struct tcphdr))
+#define MAX_LARGE_PKT_CNT ((IP_MAXPACKET - (MAX_HDR_LEN - ETH_HLEN)) / \
+			   (ASSUMED_MTU - (MAX_HDR_LEN - ETH_HLEN)))
 #define MIN_EXTHDR_SIZE 8
 #define EXT_PAYLOAD_1 "\x00\x00\x00\x00\x00\x00"
 #define EXT_PAYLOAD_2 "\x11\x11\x11\x11\x11\x11"
@@ -123,6 +133,32 @@ static int tcp_offset = -1;
 static int 
total_hdr_len = -1;
 static int ethhdr_proto = -1;
 static bool ipip;
+static bool ip6ip6;
+static uint64_t txtime_ns;
+static int num_flows = 4;
+static bool order_check;
+
+#define CAPACITY_PAYLOAD_LEN 200
+
+#define TXTIME_DELAY_MS 5
+
+/* Max TCP payload that GRO will coalesce. The outer header overhead
+ * varies by encapsulation, reducing the effective max payload.
+ */
+static int max_payload(void)
+{
+	return IP_MAXPACKET - (total_hdr_len - ETH_HLEN);
+}
+
+static int calc_mss(void)
+{
+	return ASSUMED_MTU - (total_hdr_len - ETH_HLEN);
+}
+
+static int num_large_pkt(void)
+{
+	return max_payload() / calc_mss();
+}
 static void vlog(const char *fmt, ...)
 {
@@ -141,15 +177,13 @@ static void setup_sock_filter(int fd)
 	const int ethproto_off = offsetof(struct ethhdr, h_proto);
 	int optlen = 0;
 	int ipproto_off, opt_ipproto_off;
-	int next_off;
-	if (ipip)
-		next_off = sizeof(struct iphdr) + offsetof(struct iphdr, protocol);
-	else if (proto == PF_INET)
-		next_off = offsetof(struct iphdr, protocol);
+	if (proto == PF_INET)
+		ipproto_off = tcp_offset - sizeof(struct iphdr) +
+			      offsetof(struct iphdr, protocol);
 	else
-		next_off = offsetof(struct ipv6hdr, nexthdr);
-	ipproto_off = ETH_HLEN + next_off;
+		ipproto_off = tcp_offset - sizeof(struct ipv6hdr) +
+			      offsetof(struct ipv6hdr, nexthdr);
 	/* Overridden later if exthdrs are used: */
 	opt_ipproto_off = ipproto_off;
@@ -330,36 +364,103 @@ static void fill_transportlayer(void *buf, int seq_offset, int ack_offset,
 static void write_packet(int fd, char *buf, int len, struct sockaddr_ll *daddr)
 {
+	char control[CMSG_SPACE(sizeof(uint64_t))];
+	struct msghdr msg = {};
+	struct iovec iov = {};
+	struct cmsghdr *cm;
 	int ret = -1;
-	ret = sendto(fd, buf, len, 0, (struct sockaddr *)daddr, sizeof(*daddr));
+	iov.iov_base = buf;
+	iov.iov_len = len;
+
+	msg.msg_iov = &iov;
+	msg.msg_iovlen = 1;
+	msg.msg_name = daddr;
+	msg.msg_namelen = sizeof(*daddr);
+
+	if (txtime_ns) {
+		memset(control, 0, sizeof(control));
+		msg.msg_control = 
control;
+		msg.msg_controllen = sizeof(control);
+
+		cm = CMSG_FIRSTHDR(&msg);
+		cm->cmsg_level = SOL_SOCKET;
+		cm->cmsg_type = SCM_TXTIME;
+		cm->cmsg_len = CMSG_LEN(sizeof(uint64_t));
+		memcpy(CMSG_DATA(cm), &txtime_ns, sizeof(txtime_ns));
+	}
+
+	ret = sendmsg(fd, &msg, 0);
 	if (ret == -1)
-		error(1, errno, "sendto failure");
+		error(1, errno, "sendmsg failure");
 	if (ret != len)
-		error(1, errno, "sendto wrong length");
+		error(1, 0, "sendmsg wrong length: %d vs %d", ret, len);
 }
 static void create_packet(void *buf, int seq_offset, int ack_offset,
 			  int payload_len, int fin)
 {
+	int ip_hdr_len = (proto == PF_INET) ?
+			 sizeof(struct iphdr) : sizeof(struct ipv6hdr);
+	int inner_ip_off = tcp_offset - ip_hdr_len;
+
 	memset(buf, 0, total_hdr_len);
 	memset(buf + total_hdr_len, 'a', payload_len);
 	fill_transportlayer(buf + tcp_offset, seq_offset, ack_offset,
 			    payload_len, fin);
-	if (ipip) {
-		fill_networklayer(buf + ETH_HLEN, payload_len + sizeof(struct iphdr),
-				  IPPROTO_IPIP);
-		fill_networklayer(buf + ETH_HLEN + sizeof(struct iphdr),
-				  payload_len, IPPROTO_TCP);
-	} else {
-		fill_networklayer(buf + ETH_HLEN, payload_len, IPPROTO_TCP);
+	fill_networklayer(buf + inner_ip_off, payload_len, IPPROTO_TCP);
+	if (inner_ip_off > ETH_HLEN) {
+		int encap_proto = (proto == PF_INET) ? 
+ IPPROTO_IPIP : IPPROTO_IPV6; + + fill_networklayer(buf + ETH_HLEN, + payload_len + ip_hdr_len, encap_proto); } fill_datalinklayer(buf); } +static void create_capacity_packet(void *buf, int flow_id, int pkt_idx, int psh) +{ + int seq_offset = pkt_idx * CAPACITY_PAYLOAD_LEN; + struct tcphdr *tcph; + + create_packet(buf, seq_offset, 0, CAPACITY_PAYLOAD_LEN, 0); + + /* Customize for this flow id */ + memset(buf + total_hdr_len, 'a' + flow_id, CAPACITY_PAYLOAD_LEN); + + tcph = buf + tcp_offset; + tcph->source = htons(SPORT + flow_id); + tcph->psh = psh; + tcph->check = 0; + tcph->check = tcp_checksum(tcph, CAPACITY_PAYLOAD_LEN); +} + +/* Send a capacity test, 2 packets per flow, all first packets then all second: + * A1 B1 C1 D1 ... A2 B2 C2 D2 ... + */ +static void send_capacity(int fd, struct sockaddr_ll *daddr) +{ + static char buf[MAX_HDR_LEN + CAPACITY_PAYLOAD_LEN]; + int pkt_size = total_hdr_len + CAPACITY_PAYLOAD_LEN; + int i; + + /* Send first packet of each flow (no PSH) */ + for (i = 0; i < num_flows; i++) { + create_capacity_packet(buf, i, 0, 0); + write_packet(fd, buf, pkt_size, daddr); + } + + /* Send second packet of each flow (with PSH to flush) */ + for (i = 0; i < num_flows; i++) { + create_capacity_packet(buf, i, 1, 1); + write_packet(fd, buf, pkt_size, daddr); + } +} + #ifndef TH_CWR #define TH_CWR 0x80 #endif @@ -438,18 +539,20 @@ static void send_data_pkts(int fd, struct sockaddr_ll *daddr, */ static void send_large(int fd, struct sockaddr_ll *daddr, int remainder) { - static char pkts[NUM_LARGE_PKT][TOTAL_HDR_LEN + MSS]; - static char last[TOTAL_HDR_LEN + MSS]; - static char new_seg[TOTAL_HDR_LEN + MSS]; + static char pkts[MAX_LARGE_PKT_CNT][MAX_HDR_LEN + MAX_MSS]; + static char new_seg[MAX_HDR_LEN + MAX_MSS]; + static char last[MAX_HDR_LEN + MAX_MSS]; + const int num_pkt = num_large_pkt(); + const int mss = calc_mss(); int i; - for (i = 0; i < NUM_LARGE_PKT; i++) - create_packet(pkts[i], i * MSS, 0, MSS, 0); - create_packet(last, NUM_LARGE_PKT * 
MSS, 0, remainder, 0); - create_packet(new_seg, (NUM_LARGE_PKT + 1) * MSS, 0, remainder, 0); + for (i = 0; i < num_pkt; i++) + create_packet(pkts[i], i * mss, 0, mss, 0); + create_packet(last, num_pkt * mss, 0, remainder, 0); + create_packet(new_seg, (num_pkt + 1) * mss, 0, remainder, 0); - for (i = 0; i < NUM_LARGE_PKT; i++) - write_packet(fd, pkts[i], total_hdr_len + MSS, daddr); + for (i = 0; i < num_pkt; i++) + write_packet(fd, pkts[i], total_hdr_len + mss, daddr); write_packet(fd, last, total_hdr_len + remainder, daddr); write_packet(fd, new_seg, total_hdr_len + remainder, daddr); } @@ -469,8 +572,7 @@ static void send_ack(int fd, struct sockaddr_ll *daddr) static void recompute_packet(char *buf, char *no_ext, int extlen) { struct tcphdr *tcphdr = (struct tcphdr *)(buf + tcp_offset); - struct ipv6hdr *ip6h = (struct ipv6hdr *)(buf + ETH_HLEN); - struct iphdr *iph = (struct iphdr *)(buf + ETH_HLEN); + int off; memmove(buf, no_ext, total_hdr_len); memmove(buf + total_hdr_len + extlen, @@ -480,18 +582,22 @@ static void recompute_packet(char *buf, char *no_ext, int extlen) tcphdr->check = 0; tcphdr->check = tcp_checksum(tcphdr, PAYLOAD_LEN + extlen); if (proto == PF_INET) { - iph->tot_len = htons(ntohs(iph->tot_len) + extlen); - iph->check = 0; - iph->check = checksum_fold(iph, sizeof(struct iphdr), 0); + for (off = ETH_HLEN; off < tcp_offset; + off += sizeof(struct iphdr)) { + struct iphdr *iph = (struct iphdr *)(buf + off); - if (ipip) { - iph += 1; iph->tot_len = htons(ntohs(iph->tot_len) + extlen); iph->check = 0; iph->check = checksum_fold(iph, sizeof(struct iphdr), 0); } } else { - ip6h->payload_len = htons(ntohs(ip6h->payload_len) + extlen); + for (off = ETH_HLEN; off < tcp_offset; + off += sizeof(struct ipv6hdr)) { + struct ipv6hdr *ip6h = (struct ipv6hdr *)(buf + off); + + ip6h->payload_len = + htons(ntohs(ip6h->payload_len) + extlen); + } } } @@ -580,6 +686,24 @@ static void send_changed_checksum(int fd, struct sockaddr_ll *daddr) write_packet(fd, buf, 
pkt_size, daddr); } +/* Packets with incorrect IPv4 header checksum don't coalesce. */ +static void send_changed_ip_checksum(int fd, struct sockaddr_ll *daddr) +{ + static char buf[MAX_HDR_LEN + PAYLOAD_LEN]; + struct iphdr *iph = (struct iphdr *)(buf + ETH_HLEN); + int pkt_size = total_hdr_len + PAYLOAD_LEN; + + create_packet(buf, 0, 0, PAYLOAD_LEN, 0); + write_packet(fd, buf, pkt_size, daddr); + + create_packet(buf, PAYLOAD_LEN, 0, PAYLOAD_LEN, 0); + iph->check = iph->check - 1; + write_packet(fd, buf, pkt_size, daddr); + + create_packet(buf, PAYLOAD_LEN * 2, 0, PAYLOAD_LEN, 0); + write_packet(fd, buf, pkt_size, daddr); +} + /* Packets with non-consecutive sequence number don't coalesce.*/ static void send_changed_seq(int fd, struct sockaddr_ll *daddr) { @@ -1022,7 +1146,8 @@ static void check_recv_pkts(int fd, int *correct_payload, if (iph->version == 4) ip_ext_len = (iph->ihl - 5) * 4; - else if (ip6h->version == 6 && ip6h->nexthdr != IPPROTO_TCP) + else if (ip6h->version == 6 && !ip6ip6 && + ip6h->nexthdr != IPPROTO_TCP) ip_ext_len = MIN_EXTHDR_SIZE; tcph = (struct tcphdr *)(buffer + tcp_offset + ip_ext_len); @@ -1056,8 +1181,129 @@ static void check_recv_pkts(int fd, int *correct_payload, printf("Test succeeded\n\n"); } +static void check_capacity_pkts(int fd) +{ + static char buffer[IP_MAXPACKET + ETH_HLEN + 1]; + struct iphdr *iph = (struct iphdr *)(buffer + ETH_HLEN); + struct ipv6hdr *ip6h = (struct ipv6hdr *)(buffer + ETH_HLEN); + int num_pkt = 0, num_coal = 0, pkt_idx; + const char *fail_reason = NULL; + int flow_order[num_flows * 2]; + int coalesced[num_flows]; + struct tcphdr *tcph; + int ip_ext_len = 0; + int total_data = 0; + int pkt_size = -1; + int data_len = 0; + int flow_id; + int sport; + + memset(coalesced, 0, sizeof(coalesced)); + memset(flow_order, -1, sizeof(flow_order)); + + while (1) { + ip_ext_len = 0; + pkt_size = recv(fd, buffer, IP_MAXPACKET + ETH_HLEN + 1, 0); + if (pkt_size < 0) + recv_error(fd, errno); + + if (iph->version == 4) + 
ip_ext_len = (iph->ihl - 5) * 4; + else if (ip6h->version == 6 && !ip6ip6 && + ip6h->nexthdr != IPPROTO_TCP) + ip_ext_len = MIN_EXTHDR_SIZE; + + tcph = (struct tcphdr *)(buffer + tcp_offset + ip_ext_len); + + if (tcph->fin) + break; + + sport = ntohs(tcph->source); + flow_id = sport - SPORT; + + if (flow_id < 0 || flow_id >= num_flows) { + vlog("Invalid flow_id %d from sport %d\n", + flow_id, sport); + fail_reason = fail_reason ?: "invalid packet"; + continue; + } + + /* Calculate payload length */ + if (pkt_size == ETH_ZLEN && iph->version == 4) { + data_len = ntohs(iph->tot_len) + - sizeof(struct tcphdr) - sizeof(struct iphdr); + } else { + data_len = pkt_size - total_hdr_len - ip_ext_len; + } + + if (num_pkt < num_flows * 2) { + flow_order[num_pkt] = flow_id; + } else if (num_pkt == num_flows * 2) { + vlog("More packets than expected (%d)\n", + num_flows * 2); + fail_reason = fail_reason ?: "too many packets"; + } + coalesced[flow_id] = data_len; + + if (data_len == CAPACITY_PAYLOAD_LEN * 2) { + num_coal++; + } else { + vlog("Pkt %d: flow %d, sport %d, len %d (expected %d)\n", + num_pkt, flow_id, sport, data_len, + CAPACITY_PAYLOAD_LEN * 2); + fail_reason = fail_reason ?: "not coalesced"; + } + + num_pkt++; + total_data += data_len; + } + + /* Check flow ordering. We expect to see all non-coalesced first segs + * then interleaved coalesced and non-coalesced second frames. 
+ */ + pkt_idx = 0; + for (flow_id = 0; order_check && flow_id < num_flows; flow_id++) { + bool coaled = coalesced[flow_id] > CAPACITY_PAYLOAD_LEN; + + if (coaled) + continue; + + if (flow_order[pkt_idx] != flow_id) { + vlog("Flow order mismatch (non-coalesced) at position %d: expected flow %d, got flow %d\n", + pkt_idx, flow_id, flow_order[pkt_idx]); + fail_reason = fail_reason ?: "bad packet order (1)"; + } + pkt_idx++; + } + for (flow_id = 0; order_check && flow_id < num_flows; flow_id++) { + bool coaled = coalesced[flow_id] > CAPACITY_PAYLOAD_LEN; + + if (flow_order[pkt_idx] != flow_id) { + vlog("Flow order mismatch at position %d: expected flow %d, got flow %d, coalesced: %d\n", + pkt_idx, flow_id, flow_order[pkt_idx], coaled); + fail_reason = fail_reason ?: "bad packet order (2)"; + } + pkt_idx++; + } + + if (!fail_reason) { + vlog("All %d flows coalesced correctly\n", num_flows); + printf("Test succeeded\n\n"); + } else { + printf("FAILED\n"); + } + + /* Always print stats for external validation */ + printf("STATS: received=%d wire=%d coalesced=%d\n", + num_pkt, num_pkt + num_coal, num_coal); + + if (fail_reason) + error(1, 0, "capacity test failed %s", fail_reason); +} + static void gro_sender(void) { + int bufsize = 4 * 1024 * 1024; /* 4 MB */ const int fin_delay_us = 100 * 1000; static char fin_pkt[MAX_HDR_LEN]; struct sockaddr_ll daddr = {}; @@ -1067,6 +1313,27 @@ static void gro_sender(void) if (txfd < 0) error(1, errno, "socket creation"); + if (setsockopt(txfd, SOL_SOCKET, SO_SNDBUF, &bufsize, sizeof(bufsize))) + error(1, errno, "cannot set sndbuf size, setsockopt failed"); + + /* Enable SO_TXTIME unless test case generates more than one flow + * SO_TXTIME could result in qdisc layer sorting the packets at sender. 
+ */ + if (strcmp(testname, "single") && strcmp(testname, "capacity")) { + struct sock_txtime so_txtime = { .clockid = CLOCK_MONOTONIC, }; + struct timespec ts; + + if (setsockopt(txfd, SOL_SOCKET, SO_TXTIME, + &so_txtime, sizeof(so_txtime))) + error(1, errno, "setsockopt SO_TXTIME"); + + if (clock_gettime(CLOCK_MONOTONIC, &ts)) + error(1, errno, "clock_gettime"); + + txtime_ns = ts.tv_sec * 1000000000ULL + ts.tv_nsec; + txtime_ns += TXTIME_DELAY_MS * 1000000ULL; + } + memset(&daddr, 0, sizeof(daddr)); daddr.sll_ifindex = if_nametoindex(ifname); if (daddr.sll_ifindex == 0) @@ -1083,9 +1350,27 @@ static void gro_sender(void) } else if (strcmp(testname, "data_lrg_sml") == 0) { send_data_pkts(txfd, &daddr, PAYLOAD_LEN, PAYLOAD_LEN / 2); write_packet(txfd, fin_pkt, total_hdr_len, &daddr); + } else if (strcmp(testname, "data_lrg_1byte") == 0) { + send_data_pkts(txfd, &daddr, PAYLOAD_LEN, 1); + write_packet(txfd, fin_pkt, total_hdr_len, &daddr); } else if (strcmp(testname, "data_sml_lrg") == 0) { send_data_pkts(txfd, &daddr, PAYLOAD_LEN / 2, PAYLOAD_LEN); write_packet(txfd, fin_pkt, total_hdr_len, &daddr); + } else if (strcmp(testname, "data_burst") == 0) { + static char buf[MAX_HDR_LEN + PAYLOAD_LEN]; + + create_packet(buf, 0, 0, PAYLOAD_LEN, 0); + write_packet(txfd, buf, total_hdr_len + PAYLOAD_LEN, &daddr); + create_packet(buf, PAYLOAD_LEN, 0, PAYLOAD_LEN, 0); + write_packet(txfd, buf, total_hdr_len + PAYLOAD_LEN, &daddr); + + usleep(100 * 1000); /* 100ms */ + create_packet(buf, PAYLOAD_LEN * 2, 0, PAYLOAD_LEN, 0); + write_packet(txfd, buf, total_hdr_len + PAYLOAD_LEN, &daddr); + create_packet(buf, PAYLOAD_LEN * 3, 0, PAYLOAD_LEN, 0); + write_packet(txfd, buf, total_hdr_len + PAYLOAD_LEN, &daddr); + + write_packet(txfd, fin_pkt, total_hdr_len, &daddr); /* ack test */ } else if (strcmp(testname, "ack") == 0) { @@ -1136,6 +1421,10 @@ static void gro_sender(void) write_packet(txfd, fin_pkt, total_hdr_len, &daddr); /* ip sub-tests - IPv4 only */ + } else if 
(strcmp(testname, "ip_csum") == 0) { + send_changed_ip_checksum(txfd, &daddr); + usleep(fin_delay_us); + write_packet(txfd, fin_pkt, total_hdr_len, &daddr); } else if (strcmp(testname, "ip_ttl") == 0) { send_changed_ttl(txfd, &daddr); write_packet(txfd, fin_pkt, total_hdr_len, &daddr); @@ -1188,17 +1477,28 @@ static void gro_sender(void) /* large sub-tests */ } else if (strcmp(testname, "large_max") == 0) { - int offset = (proto == PF_INET && !ipip) ? 20 : 0; - int remainder = (MAX_PAYLOAD + offset) % MSS; + int remainder = max_payload() % calc_mss(); send_large(txfd, &daddr, remainder); write_packet(txfd, fin_pkt, total_hdr_len, &daddr); } else if (strcmp(testname, "large_rem") == 0) { - int offset = (proto == PF_INET && !ipip) ? 20 : 0; - int remainder = (MAX_PAYLOAD + offset) % MSS; + int remainder = max_payload() % calc_mss(); send_large(txfd, &daddr, remainder + 1); write_packet(txfd, fin_pkt, total_hdr_len, &daddr); + + /* machinery sub-tests */ + } else if (strcmp(testname, "single") == 0) { + static char buf[MAX_HDR_LEN + PAYLOAD_LEN]; + + create_packet(buf, 0, 0, PAYLOAD_LEN, 0); + write_packet(txfd, buf, total_hdr_len + PAYLOAD_LEN, &daddr); + write_packet(txfd, fin_pkt, total_hdr_len, &daddr); + } else if (strcmp(testname, "capacity") == 0) { + send_capacity(txfd, &daddr); + usleep(fin_delay_us); + write_packet(txfd, fin_pkt, total_hdr_len, &daddr); + } else { error(1, 0, "Unknown testcase: %s", testname); } @@ -1233,11 +1533,20 @@ static void gro_receiver(void) printf("large data packets followed by a smaller one: "); correct_payload[0] = PAYLOAD_LEN * 1.5; check_recv_pkts(rxfd, correct_payload, 1); + } else if (strcmp(testname, "data_lrg_1byte") == 0) { + printf("large data packet followed by a 1 byte one: "); + correct_payload[0] = PAYLOAD_LEN + 1; + check_recv_pkts(rxfd, correct_payload, 1); } else if (strcmp(testname, "data_sml_lrg") == 0) { printf("small data packets followed by a larger one: "); correct_payload[0] = PAYLOAD_LEN / 2; 
correct_payload[1] = PAYLOAD_LEN; check_recv_pkts(rxfd, correct_payload, 2); + } else if (strcmp(testname, "data_burst") == 0) { + printf("two bursts of two data packets: "); + correct_payload[0] = PAYLOAD_LEN * 2; + correct_payload[1] = PAYLOAD_LEN * 2; + check_recv_pkts(rxfd, correct_payload, 2); /* ack test */ } else if (strcmp(testname, "ack") == 0) { @@ -1312,6 +1621,12 @@ static void gro_receiver(void) check_recv_pkts(rxfd, correct_payload, 2); /* ip sub-tests - IPv4 only */ + } else if (strcmp(testname, "ip_csum") == 0) { + correct_payload[0] = PAYLOAD_LEN; + correct_payload[1] = PAYLOAD_LEN; + correct_payload[2] = PAYLOAD_LEN; + printf("bad ip checksum doesn't coalesce: "); + check_recv_pkts(rxfd, correct_payload, 3); } else if (strcmp(testname, "ip_ttl") == 0) { correct_payload[0] = PAYLOAD_LEN; correct_payload[1] = PAYLOAD_LEN; @@ -1377,23 +1692,30 @@ static void gro_receiver(void) /* large sub-tests */ } else if (strcmp(testname, "large_max") == 0) { - int offset = (proto == PF_INET && !ipip) ? 20 : 0; - int remainder = (MAX_PAYLOAD + offset) % MSS; + int remainder = max_payload() % calc_mss(); - correct_payload[0] = (MAX_PAYLOAD + offset); + correct_payload[0] = max_payload(); correct_payload[1] = remainder; printf("Shouldn't coalesce if exceed IP max pkt size: "); check_recv_pkts(rxfd, correct_payload, 2); } else if (strcmp(testname, "large_rem") == 0) { - int offset = (proto == PF_INET && !ipip) ? 
20 : 0; - int remainder = (MAX_PAYLOAD + offset) % MSS; + int remainder = max_payload() % calc_mss(); /* last segment sent individually, doesn't start new segment */ - correct_payload[0] = (MAX_PAYLOAD + offset) - remainder; + correct_payload[0] = max_payload() - remainder; correct_payload[1] = remainder + 1; correct_payload[2] = remainder + 1; printf("last segment sent individually: "); check_recv_pkts(rxfd, correct_payload, 3); + + /* machinery sub-tests */ + } else if (strcmp(testname, "single") == 0) { + printf("single data packet: "); + correct_payload[0] = PAYLOAD_LEN; + check_recv_pkts(rxfd, correct_payload, 1); + } else if (strcmp(testname, "capacity") == 0) { + check_capacity_pkts(rxfd); + } else { error(1, 0, "Test case error: unknown testname %s", testname); } @@ -1411,16 +1733,19 @@ static void parse_args(int argc, char **argv) { "ipv4", no_argument, NULL, '4' }, { "ipv6", no_argument, NULL, '6' }, { "ipip", no_argument, NULL, 'e' }, + { "ip6ip6", no_argument, NULL, 'E' }, + { "num-flows", required_argument, NULL, 'n' }, { "rx", no_argument, NULL, 'r' }, { "saddr", required_argument, NULL, 's' }, { "smac", required_argument, NULL, 'S' }, { "test", required_argument, NULL, 't' }, + { "order-check", no_argument, NULL, 'o' }, { "verbose", no_argument, NULL, 'v' }, { 0, 0, 0, 0 } }; int c; - while ((c = getopt_long(argc, argv, "46d:D:ei:rs:S:t:v", opts, NULL)) != -1) { + while ((c = getopt_long(argc, argv, "46d:D:eEi:n:rs:S:t:ov", opts, NULL)) != -1) { switch (c) { case '4': proto = PF_INET; @@ -1435,6 +1760,11 @@ static void parse_args(int argc, char **argv) proto = PF_INET; ethhdr_proto = htons(ETH_P_IP); break; + case 'E': + ip6ip6 = true; + proto = PF_INET6; + ethhdr_proto = htons(ETH_P_IPV6); + break; case 'd': addr4_dst = addr6_dst = optarg; break; @@ -1444,6 +1774,9 @@ static void parse_args(int argc, char **argv) case 'i': ifname = optarg; break; + case 'n': + num_flows = atoi(optarg); + break; case 'r': tx_socket = false; break; @@ -1456,6 +1789,9 
@@ static void parse_args(int argc, char **argv) case 't': testname = optarg; break; + case 'o': + order_check = true; + break; case 'v': verbose = true; break; @@ -1473,12 +1809,15 @@ int main(int argc, char **argv) if (ipip) { tcp_offset = ETH_HLEN + sizeof(struct iphdr) * 2; total_hdr_len = tcp_offset + sizeof(struct tcphdr); + } else if (ip6ip6) { + tcp_offset = ETH_HLEN + sizeof(struct ipv6hdr) * 2; + total_hdr_len = tcp_offset + sizeof(struct tcphdr); } else if (proto == PF_INET) { tcp_offset = ETH_HLEN + sizeof(struct iphdr); total_hdr_len = tcp_offset + sizeof(struct tcphdr); } else if (proto == PF_INET6) { tcp_offset = ETH_HLEN + sizeof(struct ipv6hdr); - total_hdr_len = MAX_HDR_LEN; + total_hdr_len = tcp_offset + sizeof(struct tcphdr); } else { error(1, 0, "Protocol family is not ipv4 or ipv6"); } diff --git a/tools/testing/selftests/net/lib/py/__init__.py b/tools/testing/selftests/net/lib/py/__init__.py index f528b67639de..7c81d86a7e97 100644 --- a/tools/testing/selftests/net/lib/py/__init__.py +++ b/tools/testing/selftests/net/lib/py/__init__.py @@ -13,9 +13,12 @@ from .ksft import KsftFailEx, KsftSkipEx, KsftXfailEx, ksft_pr, ksft_eq, \ from .netns import NetNS, NetNSEnter from .nsim import NetdevSim, NetdevSimDev from .utils import CmdExitFailure, fd_read_timeout, cmd, bkg, defer, \ - bpftool, ip, ethtool, bpftrace, rand_port, wait_port_listen, wait_file, tool -from .ynl import NlError, YnlFamily, EthtoolFamily, NetdevFamily, RtnlFamily, RtnlAddrFamily -from .ynl import NetshaperFamily, DevlinkFamily, PSPFamily + bpftool, ip, ethtool, bpftrace, rand_port, rand_ports, wait_port_listen, \ + wait_file, tool +from .bpf import bpf_map_set, bpf_map_dump, bpf_prog_map_ids +from .ynl import NlError, NlctrlFamily, YnlFamily, \ + EthtoolFamily, NetdevFamily, RtnlFamily, RtnlAddrFamily +from .ynl import NetshaperFamily, DevlinkFamily, PSPFamily, Netlink __all__ = ["KSRC", "KsftFailEx", "KsftSkipEx", "KsftXfailEx", "ksft_pr", "ksft_eq", @@ -25,9 +28,10 @@ __all__ 
= ["KSRC", "ksft_run", "ksft_exit", "ksft_variants", "KsftNamedVariant", "NetNS", "NetNSEnter", "CmdExitFailure", "fd_read_timeout", "cmd", "bkg", "defer", - "bpftool", "ip", "ethtool", "bpftrace", "rand_port", + "bpftool", "ip", "ethtool", "bpftrace", "rand_port", "rand_ports", "wait_port_listen", "wait_file", "tool", + "bpf_map_set", "bpf_map_dump", "bpf_prog_map_ids", "NetdevSim", "NetdevSimDev", "NetshaperFamily", "DevlinkFamily", "PSPFamily", "NlError", "YnlFamily", "EthtoolFamily", "NetdevFamily", "RtnlFamily", - "RtnlAddrFamily"] + "NlctrlFamily", "RtnlAddrFamily", "Netlink"] diff --git a/tools/testing/selftests/net/lib/py/bpf.py b/tools/testing/selftests/net/lib/py/bpf.py new file mode 100644 index 000000000000..beb6bf2896a8 --- /dev/null +++ b/tools/testing/selftests/net/lib/py/bpf.py @@ -0,0 +1,68 @@ +# SPDX-License-Identifier: GPL-2.0 + +""" +BPF helper utilities for kernel selftests. + +Provides common operations for interacting with BPF maps and programs +via bpftool, used by XDP and other BPF-based test files. +""" + +from .utils import bpftool + +def _format_hex_bytes(value): + """ + Helper function that converts an integer into a formatted hexadecimal byte string. + + Args: + value: An integer representing the number to be converted. + + Returns: + A string representing hexadecimal equivalent of value, with bytes separated by spaces. + """ + hex_str = value.to_bytes(4, byteorder='little', signed=True) + return ' '.join(f'{byte:02x}' for byte in hex_str) + + +def bpf_map_set(map_name, key, value): + """ + Updates an XDP map with a given key-value pair using bpftool. + + Args: + map_name: The name of the XDP map to update. + key: The key to update in the map, formatted as a hexadecimal string. + value: The value to associate with the key, formatted as a hexadecimal string. 
+ """ + key_formatted = _format_hex_bytes(key) + value_formatted = _format_hex_bytes(value) + bpftool( + f"map update name {map_name} key hex {key_formatted} value hex {value_formatted}" + ) + +def bpf_map_dump(map_id): + """Dump all entries of a BPF array map. + + Args: + map_id: Numeric map ID (as returned by bpftool prog show). + + Returns: + A dict mapping formatted key (int) to formatted value (int). + """ + raw = bpftool(f"map dump id {map_id}", json=True) + return {e["formatted"]["key"]: e["formatted"]["value"] for e in raw} + + +def bpf_prog_map_ids(prog_id): + """Get the map name-to-ID mapping for a loaded BPF program. + + Args: + prog_id: Numeric program ID. + + Returns: + A dict mapping map name (str) to map ID (int). + """ + map_ids = bpftool(f"prog show id {prog_id}", json=True)["map_ids"] + maps = {} + for mid in map_ids: + name = bpftool(f"map show id {mid}", json=True)["name"] + maps[name] = mid + return maps diff --git a/tools/testing/selftests/net/lib/py/ksft.py b/tools/testing/selftests/net/lib/py/ksft.py index 6cdfb8afccb5..81287c2daff0 100644 --- a/tools/testing/selftests/net/lib/py/ksft.py +++ b/tools/testing/selftests/net/lib/py/ksft.py @@ -1,7 +1,10 @@ # SPDX-License-Identifier: GPL-2.0 +import fnmatch import functools +import getopt import inspect +import os import signal import sys import time @@ -31,6 +34,45 @@ class KsftTerminate(KeyboardInterrupt): pass +class _KsftArgs: + def __init__(self): + self.list_tests = False + self.filters = [] + + try: + opts, _ = getopt.getopt(sys.argv[1:], 'hlt:T:') + except getopt.GetoptError as e: + print(e, file=sys.stderr) + sys.exit(1) + + for opt, val in opts: + if opt == '-h': + print(f"Usage: {sys.argv[0]} [-h|-l] [-t|-T name]\n" + f"\t-h print help\n" + f"\t-l list tests (filtered, if filters were specified)\n" + f"\t-t name include test\n" + f"\t-T name exclude test", + file=sys.stderr) + sys.exit(0) + elif opt == '-l': + self.list_tests = True + elif opt == '-t': + self.filters.append((True, 
val)) + elif opt == '-T': + self.filters.append((False, val)) + + +@functools.lru_cache() +def _ksft_supports_color(): + if os.environ.get("NO_COLOR") is not None: + return False + if not hasattr(sys.stdout, "isatty") or not sys.stdout.isatty(): + return False + if os.environ.get("TERM") == "dumb": + return False + return True + + def ksft_pr(*objs, **kwargs): """ Print logs to stdout. @@ -165,6 +207,14 @@ def ktap_result(ok, cnt=1, case_name="", comment=""): res += "." + case_name if comment: res += " # " + comment + if _ksft_supports_color(): + if comment.startswith(("SKIP", "XFAIL")): + color = "\033[33m" + elif ok: + color = "\033[32m" + else: + color = "\033[31m" + res = color + res + "\033[0m" print(res, flush=True) @@ -278,8 +328,26 @@ def _ksft_intr(signum, frame): ksft_pr(f"Ignoring SIGTERM (cnt: {term_cnt}), already exiting...") -def _ksft_generate_test_cases(cases, globs, case_pfx, args): - """Generate a flat list of (func, args, name) tuples""" +def _ksft_name_matches(name, pattern): + if '*' in pattern or '?' in pattern or '[' in pattern: + return fnmatch.fnmatchcase(name, pattern) + return name == pattern + + +def _ksft_test_enabled(name, filters): + has_positive = False + for include, pattern in filters: + has_positive |= include + if _ksft_name_matches(name, pattern): + return include + return not has_positive + + +def _ksft_generate_test_cases(cases, globs, case_pfx, args, cli_args): + """Generate a filtered list of (func, args, name) tuples. + + If -l is given, prints matching test names and exits. 
+ """ cases = cases or [] test_cases = [] @@ -309,11 +377,22 @@ def _ksft_generate_test_cases(cases, globs, case_pfx, args): else: test_cases.append((func, args, func.__name__)) + if cli_args.filters: + test_cases = [tc for tc in test_cases + if _ksft_test_enabled(tc[2], cli_args.filters)] + + if cli_args.list_tests: + for _, _, name in test_cases: + print(name) + sys.exit(0) + return test_cases def ksft_run(cases=None, globs=None, case_pfx=None, args=()): - test_cases = _ksft_generate_test_cases(cases, globs, case_pfx, args) + cli_args = _KsftArgs() + test_cases = _ksft_generate_test_cases(cases, globs, case_pfx, args, + cli_args) global term_cnt term_cnt = 0 @@ -321,10 +400,13 @@ def ksft_run(cases=None, globs=None, case_pfx=None, args=()): totals = {"pass": 0, "fail": 0, "skip": 0, "xfail": 0} + global KSFT_RESULT + if KSFT_RESULT is not None: + raise RuntimeError("ksft_run() can't be called multiple times.") + print("TAP version 13", flush=True) print("1.." + str(len(test_cases)), flush=True) - global KSFT_RESULT cnt = 0 stop = False for func, args, name in test_cases: diff --git a/tools/testing/selftests/net/lib/py/utils.py b/tools/testing/selftests/net/lib/py/utils.py index 85884f3e827b..6c44a3d2bbf7 100644 --- a/tools/testing/selftests/net/lib/py/utils.py +++ b/tools/testing/selftests/net/lib/py/utils.py @@ -9,9 +9,17 @@ import subprocess import time +class CmdInitFailure(Exception): + """ Command failed to start. Only raised by bkg(). """ + def __init__(self, msg, cmd_obj): + super().__init__(msg + "\n" + repr(cmd_obj)) + self.cmd = cmd_obj + + class CmdExitFailure(Exception): + """ Command failed (returned non-zero exit code). 
""" def __init__(self, msg, cmd_obj): - super().__init__(msg) + super().__init__(msg + "\n" + repr(cmd_obj)) self.cmd = cmd_obj @@ -76,30 +84,37 @@ class cmd: msg = fd_read_timeout(rfd, ksft_wait) os.close(rfd) if not msg: - raise Exception("Did not receive ready message") + terminate = self.proc.poll() is None + self._process_terminate(terminate=terminate, timeout=1) + raise CmdInitFailure("Did not receive ready message", self) if not background: self.process(terminate=False, fail=fail, timeout=timeout) - def process(self, terminate=True, fail=None, timeout=5): - if fail is None: - fail = not terminate - - if self.ksft_term_fd: - os.write(self.ksft_term_fd, b"1") + def _process_terminate(self, terminate, timeout): if terminate: self.proc.terminate() - stdout, stderr = self.proc.communicate(timeout) + stdout, stderr = self.proc.communicate(timeout=timeout) self.stdout = stdout.decode("utf-8") self.stderr = stderr.decode("utf-8") self.proc.stdout.close() self.proc.stderr.close() self.ret = self.proc.returncode + return stdout, stderr + + def process(self, terminate=True, fail=None, timeout=5): + if fail is None: + fail = not terminate + + if self.ksft_term_fd: + os.write(self.ksft_term_fd, b"1") + + stdout, stderr = self._process_terminate(terminate=terminate, + timeout=timeout) if self.proc.returncode != 0 and fail: if len(stderr) > 0 and stderr[-1] == "\n": stderr = stderr[:-1] - raise CmdExitFailure("Command failed: %s\nSTDOUT: %s\nSTDERR: %s" % - (self.proc.args, stdout, stderr), self) + raise CmdExitFailure("Command failed", self) def __repr__(self): def str_fmt(name, s): @@ -159,8 +174,11 @@ class bkg(cmd): return self def __exit__(self, ex_type, ex_value, ex_tb): - # Force termination on exception - terminate = self.terminate or (self._exit_wait and ex_type is not None) + terminate = self.terminate + # Force termination on exception, but only if bkg() didn't already exit + # since forcing termination silences failures with fail=None + if self.proc.poll() is 
None: + terminate = terminate or (self._exit_wait and ex_type is not None) return self.process(terminate=terminate, fail=self.check_fail) @@ -240,8 +258,9 @@ def bpftrace(expr, json=None, ns=None, host=None, timeout=None): cmd_arr += ['-f', 'json', '-q'] if timeout: expr += ' interval:s:' + str(timeout) + ' { exit(); }' + timeout += 20 cmd_arr += ['-e', expr] - cmd_obj = cmd(cmd_arr, ns=ns, host=host, shell=False) + cmd_obj = cmd(cmd_arr, ns=ns, host=host, shell=False, timeout=timeout) if json: # bpftrace prints objects as lines ret = {} @@ -263,9 +282,27 @@ def rand_port(stype=socket.SOCK_STREAM): """ Get a random unprivileged port. """ - with socket.socket(socket.AF_INET6, stype) as s: - s.bind(("", 0)) - return s.getsockname()[1] + return rand_ports(1, stype)[0] + + +def rand_ports(count, stype=socket.SOCK_STREAM): + """ + Get a unique set of random unprivileged ports. + """ + sockets = [] + ports = [] + + try: + for _ in range(count): + s = socket.socket(socket.AF_INET6, stype) + sockets.append(s) + s.bind(("", 0)) + ports.append(s.getsockname()[1]) + finally: + for s in sockets: + s.close() + + return ports def wait_port_listen(port, proto="tcp", ns=None, host=None, sleep=0.005, deadline=5): diff --git a/tools/testing/selftests/net/lib/py/ynl.py b/tools/testing/selftests/net/lib/py/ynl.py index 32c223e93b2c..2e567062aa6c 100644 --- a/tools/testing/selftests/net/lib/py/ynl.py +++ b/tools/testing/selftests/net/lib/py/ynl.py @@ -13,20 +13,27 @@ try: SPEC_PATH = KSFT_DIR / "net/lib/specs" sys.path.append(tools_full_path.as_posix()) - from net.lib.ynl.pyynl.lib import YnlFamily, NlError + from net.lib.ynl.pyynl.lib import YnlFamily, NlError, NlPolicy, Netlink else: # Running in tree tools_full_path = KSRC / "tools" SPEC_PATH = KSRC / "Documentation/netlink/specs" sys.path.append(tools_full_path.as_posix()) - from net.ynl.pyynl.lib import YnlFamily, NlError + from net.ynl.pyynl.lib import YnlFamily, NlError, NlPolicy, Netlink except ModuleNotFoundError as e: 
     ksft_pr("Failed importing `ynl` library from kernel sources")
     ksft_pr(str(e))
     ktap_result(True, comment="SKIP")
     sys.exit(4)
 
+__all__ = [
+    "NlError", "NlPolicy", "Netlink", "YnlFamily", "SPEC_PATH",
+    "EthtoolFamily", "RtnlFamily", "RtnlAddrFamily",
+    "NetdevFamily", "NetshaperFamily", "NlctrlFamily", "DevlinkFamily",
+    "PSPFamily",
+]
+
 #
 # Wrapper classes, loading the right specs
 # Set schema='' to avoid jsonschema validation, it's slow
@@ -57,6 +64,13 @@ class NetshaperFamily(YnlFamily):
         super().__init__((SPEC_PATH / Path('net_shaper.yaml')).as_posix(),
                          schema='', recv_size=recv_size)
 
+
+class NlctrlFamily(YnlFamily):
+    def __init__(self, recv_size=0):
+        super().__init__((SPEC_PATH / Path('nlctrl.yaml')).as_posix(),
+                         schema='', recv_size=recv_size)
+
+
 class DevlinkFamily(YnlFamily):
     def __init__(self, recv_size=0):
         super().__init__((SPEC_PATH / Path('devlink.yaml')).as_posix(),
diff --git a/tools/testing/selftests/net/lib/xdp_metadata.bpf.c b/tools/testing/selftests/net/lib/xdp_metadata.bpf.c
new file mode 100644
index 000000000000..f71f59215239
--- /dev/null
+++ b/tools/testing/selftests/net/lib/xdp_metadata.bpf.c
@@ -0,0 +1,163 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#include <stddef.h>
+#include <linux/bpf.h>
+#include <linux/in.h>
+#include <linux/if_ether.h>
+#include <linux/ip.h>
+#include <linux/ipv6.h>
+#include <linux/udp.h>
+#include <linux/tcp.h>
+#include <bpf/bpf_endian.h>
+#include <bpf/bpf_helpers.h>
+
+enum {
+	XDP_PORT = 1,
+	XDP_PROTO = 4,
+} xdp_map_setup_keys;
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 5);
+	__type(key, __u32);
+	__type(value, __s32);
+} map_xdp_setup SEC(".maps");
+
+/* RSS hash results: key 0 = hash, key 1 = hash type,
+ * key 2 = packet count, key 3 = error count.
+ */
+enum {
+	RSS_KEY_HASH = 0,
+	RSS_KEY_TYPE = 1,
+	RSS_KEY_PKT_CNT = 2,
+	RSS_KEY_ERR_CNT = 3,
+};
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__type(key, __u32);
+	__type(value, __u32);
+	__uint(max_entries, 4);
+} map_rss SEC(".maps");
+
+/* Mirror of enum xdp_rss_hash_type from include/net/xdp.h.
+ * Needed because the enum is not part of UAPI headers.
+ */
+enum xdp_rss_hash_type {
+	XDP_RSS_L3_IPV4 = 1U << 0,
+	XDP_RSS_L3_IPV6 = 1U << 1,
+	XDP_RSS_L3_DYNHDR = 1U << 2,
+	XDP_RSS_L4 = 1U << 3,
+	XDP_RSS_L4_TCP = 1U << 4,
+	XDP_RSS_L4_UDP = 1U << 5,
+	XDP_RSS_L4_SCTP = 1U << 6,
+	XDP_RSS_L4_IPSEC = 1U << 7,
+	XDP_RSS_L4_ICMP = 1U << 8,
+};
+
+extern int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, __u32 *hash,
+				    enum xdp_rss_hash_type *rss_type) __ksym;
+
+static __always_inline __u16 get_dest_port(void *l4, void *data_end,
+					   __u8 protocol)
+{
+	if (protocol == IPPROTO_UDP) {
+		struct udphdr *udp = l4;
+
+		if ((void *)(udp + 1) > data_end)
+			return 0;
+		return udp->dest;
+	} else if (protocol == IPPROTO_TCP) {
+		struct tcphdr *tcp = l4;
+
+		if ((void *)(tcp + 1) > data_end)
+			return 0;
+		return tcp->dest;
+	}
+
+	return 0;
+}
+
+SEC("xdp")
+int xdp_rss_hash(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+	enum xdp_rss_hash_type rss_type = 0;
+	struct ethhdr *eth = data;
+	__u8 l4_proto = 0;
+	__u32 hash = 0;
+	__u32 key, val;
+	void *l4 = NULL;
+	__u32 *cnt;
+	int ret;
+
+	if ((void *)(eth + 1) > data_end)
+		return XDP_PASS;
+
+	if (eth->h_proto == bpf_htons(ETH_P_IP)) {
+		struct iphdr *iph = (void *)(eth + 1);
+
+		if ((void *)(iph + 1) > data_end)
+			return XDP_PASS;
+		l4_proto = iph->protocol;
+		l4 = (void *)(iph + 1);
+	} else if (eth->h_proto == bpf_htons(ETH_P_IPV6)) {
+		struct ipv6hdr *ip6h = (void *)(eth + 1);
+
+		if ((void *)(ip6h + 1) > data_end)
+			return XDP_PASS;
+		l4_proto = ip6h->nexthdr;
+		l4 = (void *)(ip6h + 1);
+	}
+
+	if (!l4)
+		return XDP_PASS;
+
+	/* Filter on the configured protocol (map_xdp_setup key XDP_PROTO).
+	 * When set, only process packets matching the requested L4 protocol.
+	 */
+	key = XDP_PROTO;
+	__s32 *proto_cfg = bpf_map_lookup_elem(&map_xdp_setup, &key);
+
+	if (proto_cfg && *proto_cfg != 0 && l4_proto != (__u8)*proto_cfg)
+		return XDP_PASS;
+
+	/* Filter on the configured port (map_xdp_setup key XDP_PORT).
+	 * Only applies to protocols with ports (UDP, TCP).
+	 */
+	key = XDP_PORT;
+	__s32 *port_cfg = bpf_map_lookup_elem(&map_xdp_setup, &key);
+
+	if (port_cfg && *port_cfg != 0) {
+		__u16 dest = get_dest_port(l4, data_end, l4_proto);
+
+		if (!dest || bpf_ntohs(dest) != (__u16)*port_cfg)
+			return XDP_PASS;
+	}
+
+	ret = bpf_xdp_metadata_rx_hash(ctx, &hash, &rss_type);
+	if (ret < 0) {
+		key = RSS_KEY_ERR_CNT;
+		cnt = bpf_map_lookup_elem(&map_rss, &key);
+		if (cnt)
+			__sync_fetch_and_add(cnt, 1);
+		return XDP_PASS;
+	}
+
+	key = RSS_KEY_HASH;
+	bpf_map_update_elem(&map_rss, &key, &hash, BPF_ANY);
+
+	key = RSS_KEY_TYPE;
+	val = (__u32)rss_type;
+	bpf_map_update_elem(&map_rss, &key, &val, BPF_ANY);
+
+	key = RSS_KEY_PKT_CNT;
+	cnt = bpf_map_lookup_elem(&map_rss, &key);
+	if (cnt)
+		__sync_fetch_and_add(cnt, 1);
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/net/macvlan_mcast_shared_mac.sh b/tools/testing/selftests/net/macvlan_mcast_shared_mac.sh
new file mode 100755
index 000000000000..ff5b89347247
--- /dev/null
+++ b/tools/testing/selftests/net/macvlan_mcast_shared_mac.sh
@@ -0,0 +1,93 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Test multicast delivery to macvlan bridge ports when the source MAC
+# matches the macvlan's own MAC address (e.g., VRRP virtual MAC shared
+# across multiple hosts).
+#
+# Topology:
+#
+#   NS_SRC                          NS_BRIDGE
+#     veth_src (SHARED_MAC) <-----> veth_dst
+#                                     |
+#                                     +-- macvlan0 (bridge mode, SHARED_MAC)
+#
+# A multicast packet sent from NS_SRC with source MAC equal to
+# macvlan0's MAC must still be delivered to macvlan0.
+ +source lib.sh + +SHARED_MAC="00:00:5e:00:01:01" +MCAST_ADDR="239.0.0.1" + +setup() { + setup_ns NS_SRC NS_BRIDGE + + ip -net "${NS_BRIDGE}" link add veth_dst type veth \ + peer name veth_src netns "${NS_SRC}" + + ip -net "${NS_SRC}" link set veth_src address "${SHARED_MAC}" + ip -net "${NS_SRC}" link set veth_src up + ip -net "${NS_SRC}" addr add 192.168.1.1/24 dev veth_src + + ip -net "${NS_BRIDGE}" link set veth_dst up + + ip -net "${NS_BRIDGE}" link add macvlan0 link veth_dst \ + type macvlan mode bridge + ip -net "${NS_BRIDGE}" link set macvlan0 address "${SHARED_MAC}" + ip -net "${NS_BRIDGE}" link set macvlan0 up + ip -net "${NS_BRIDGE}" addr add 192.168.1.2/24 dev macvlan0 + + # Accept all multicast so the mc_filter passes for any group. + ip -net "${NS_BRIDGE}" link set macvlan0 allmulticast on +} + +cleanup() { + rm -f "${CAPFILE}" "${CAPOUT}" + cleanup_ns "${NS_SRC}" "${NS_BRIDGE}" +} + +test_macvlan_mcast_shared_mac() { + CAPFILE=$(mktemp) + CAPOUT=$(mktemp) + + echo "Testing multicast delivery to macvlan with shared source MAC" + + # Listen for one ICMP packet on macvlan0. + timeout 5s ip netns exec "${NS_BRIDGE}" \ + tcpdump -i macvlan0 -c 1 -w "${CAPFILE}" icmp &> "${CAPOUT}" & + local pid=$! + if ! slowwait 1 grep -qs "listening" "${CAPOUT}"; then + echo "[FAIL] tcpdump did not start listening" + return "${ksft_fail}" + fi + + # Send multicast ping from NS_SRC; source MAC equals macvlan0's MAC. + ip netns exec "${NS_SRC}" \ + ping -W 0.1 -c 3 -I veth_src "${MCAST_ADDR}" &> /dev/null + + wait "${pid}" + + local count + count=$(tcpdump -r "${CAPFILE}" 2>/dev/null | wc -l) + if [[ "${count}" -ge 1 ]]; then + echo "[ OK ]" + return "${ksft_pass}" + else + echo "[FAIL] expected at least 1 ICMP packet on macvlan0," \ + "got ${count}" + return "${ksft_fail}" + fi +} + +if [ ! 
-x "$(command -v tcpdump)" ]; then
+	echo "SKIP: Could not run test without tcpdump tool"
+	exit "${ksft_skip}"
+fi
+
+trap cleanup EXIT
+
+setup
+test_macvlan_mcast_shared_mac
+
+exit $?
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh
index a3144d7298a5..beec41f6662a 100755
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh
+++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh
@@ -4343,13 +4343,13 @@ endpoint_tests()
 		chk_mptcp_info add_addr_signal 2 add_addr_accepted 2
 		[ ${ipt} = 1 ] && ip netns exec "${ns1}" ${iptables} -D OUTPUT 1

-		pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal
+		pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal
 		wait_mpj 4
 		chk_subflow_nr "after re-add ID 0" 3
 		chk_mptcp_info subflows 3 subflows 3
 		chk_mptcp_info add_addr_signal 3 add_addr_accepted 2

-		pm_nl_del_endpoint $ns1 99 10.0.1.1
+		pm_nl_del_endpoint $ns1 42 10.0.1.1
 		sleep 0.5
 		chk_subflow_nr "after re-delete ID 0" 2
 		chk_mptcp_info subflows 2 subflows 2
diff --git a/tools/testing/selftests/net/netfilter/nft_tproxy_udp.sh b/tools/testing/selftests/net/netfilter/nft_tproxy_udp.sh
index d16de13fe5a7..1dc7b0450145 100755
--- a/tools/testing/selftests/net/netfilter/nft_tproxy_udp.sh
+++ b/tools/testing/selftests/net/netfilter/nft_tproxy_udp.sh
@@ -190,13 +190,13 @@ table inet filter {
 }
 EOF

-	timeout "$timeout" ip netns exec "$nsrouter" socat -u "$socat_ipproto" udp-listen:12345,fork,ip-transparent,reuseport udp:"$ns1_ip_port",ip-transparent,reuseport,bind="$ns2_ip_port" 2>/dev/null &
+	timeout "$timeout" ip netns exec "$nsrouter" socat -u "$socat_ipproto" udp-listen:12345,fork,ip-transparent,reuseport,shut-none udp:"$ns1_ip_port",ip-transparent,reuseport,bind="$ns2_ip_port",shut-none 2>/dev/null &
 	local tproxy_pid=$!
-	timeout "$timeout" ip netns exec "$ns2" socat "$socat_ipproto" udp-listen:8080,fork SYSTEM:"echo PONG_NS2" 2>/dev/null &
+	timeout "$timeout" ip netns exec "$ns2" socat "$socat_ipproto" udp-listen:8080,fork,shut-none SYSTEM:"echo PONG_NS2" 2>/dev/null &
 	local server2_pid=$!

-	timeout "$timeout" ip netns exec "$ns3" socat "$socat_ipproto" udp-listen:8080,fork SYSTEM:"echo PONG_NS3" 2>/dev/null &
+	timeout "$timeout" ip netns exec "$ns3" socat "$socat_ipproto" udp-listen:8080,fork,shut-none SYSTEM:"echo PONG_NS3" 2>/dev/null &
 	local server3_pid=$!

 	busywait "$BUSYWAIT_TIMEOUT" listener_ready "$nsrouter" 12345 "-u"
@@ -205,7 +205,7 @@ EOF

 	local result
 	# request from ns1 to ns2 (forwarded traffic)
-	result=$(echo I_M_PROXIED | ip netns exec "$ns1" socat -t 2 -T 2 STDIO udp:"$ns2_ip_port",sourceport=18888)
+	result=$(echo I_M_PROXIED | ip netns exec "$ns1" socat -t 2 -T 2 STDIO udp:"$ns2_ip_port",sourceport=18888,shut-none)
 	if [ "$result" == "$expect_ns1_ns2" ] ;then
 		echo "PASS: tproxy test $testname: ns1 got reply \"$result\" connecting to ns2"
 	else
@@ -214,7 +214,7 @@ EOF
 	fi

 	# request from ns1 to ns3 (forwarded traffic)
-	result=$(echo I_M_PROXIED | ip netns exec "$ns1" socat -t 2 -T 2 STDIO udp:"$ns3_ip_port")
+	result=$(echo I_M_PROXIED | ip netns exec "$ns1" socat -t 2 -T 2 STDIO udp:"$ns3_ip_port",shut-none)
 	if [ "$result" = "$expect_ns1_ns3" ] ;then
 		echo "PASS: tproxy test $testname: ns1 got reply \"$result\" connecting to ns3"
 	else
@@ -223,7 +223,7 @@ EOF
 	fi

 	# request from nsrouter to ns2 (localy originated traffic)
-	result=$(echo I_M_PROXIED | ip netns exec "$nsrouter" socat -t 2 -T 2 STDIO udp:"$ns2_ip_port")
+	result=$(echo I_M_PROXIED | ip netns exec "$nsrouter" socat -t 2 -T 2 STDIO udp:"$ns2_ip_port",shut-none)
 	if [ "$result" == "$expect_nsrouter_ns2" ] ;then
 		echo "PASS: tproxy test $testname: nsrouter got reply \"$result\" connecting to ns2"
 	else
@@ -232,7 +232,7 @@ EOF
 	fi

 	# request from nsrouter to ns3 (localy originated traffic)
-	result=$(echo
I_M_PROXIED | ip netns exec "$nsrouter" socat -t 2 -T 2 STDIO udp:"$ns3_ip_port") + result=$(echo I_M_PROXIED | ip netns exec "$nsrouter" socat -t 2 -T 2 STDIO udp:"$ns3_ip_port",shut-none) if [ "$result" = "$expect_nsrouter_ns3" ] ;then echo "PASS: tproxy test $testname: nsrouter got reply \"$result\" connecting to ns3" else diff --git a/tools/testing/selftests/net/nk_qlease.py b/tools/testing/selftests/net/nk_qlease.py new file mode 100755 index 000000000000..a84a73ff4eda --- /dev/null +++ b/tools/testing/selftests/net/nk_qlease.py @@ -0,0 +1,2109 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import errno +import time +from lib.py import ( + ksft_run, + ksft_exit, + ksft_eq, + ksft_ne, + ksft_in, + ksft_not_in, + ksft_raises, +) +from lib.py import ( + NetNS, + NetNSEnter, + EthtoolFamily, + NetdevFamily, + RtnlFamily, + NetdevSimDev, +) +from lib.py import ( + NlError, + Netlink, + cmd, + defer, + ip, +) + + +def wait_until(cond, timeout=2.0, interval=0.05): + deadline = time.monotonic() + timeout + while not cond(): + if time.monotonic() >= deadline: + return + time.sleep(interval) + + +def create_netkit(rxqueues, mode="l2"): + all_links = ip("-d link show", json=True) + old_idxs = { + link["ifindex"] + for link in all_links + if link.get("linkinfo", {}).get("info_kind") == "netkit" + } + + rtnl = RtnlFamily() + rtnl.newlink( + { + "linkinfo": { + "kind": "netkit", + "data": { + "mode": mode, + "policy": "forward", + "peer-policy": "forward", + }, + }, + "num-rx-queues": rxqueues, + }, + flags=[Netlink.NLM_F_CREATE, Netlink.NLM_F_EXCL], + ) + + all_links = ip("-d link show", json=True) + nk_links = [ + link + for link in all_links + if link.get("linkinfo", {}).get("info_kind") == "netkit" + and link["ifindex"] not in old_idxs + ] + nk_links.sort(key=lambda x: x["ifindex"]) + return ( + nk_links[1]["ifname"], + nk_links[1]["ifindex"], + nk_links[0]["ifname"], + nk_links[0]["ifindex"], + ) + + +def create_netkit_single(rxqueues): + rtnl = 
RtnlFamily() + rtnl.newlink( + { + "linkinfo": { + "kind": "netkit", + "data": { + "mode": "l2", + "pairing": "single", + }, + }, + "num-rx-queues": rxqueues, + }, + flags=[Netlink.NLM_F_CREATE, Netlink.NLM_F_EXCL], + ) + + all_links = ip("-d link show", json=True) + nk_links = [ + link + for link in all_links + if link.get("linkinfo", {}).get("info_kind") == "netkit" + and "UP" not in link.get("flags", []) + ] + return nk_links[0]["ifname"], nk_links[0]["ifindex"] + + +def test_remove_phys(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + src_queue = 1 + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + result = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + nk_queue_id = result["id"] + + netdevnl = NetdevFamily() + queue_info = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"} + ) + ksft_in("lease", queue_info) + ksft_eq(queue_info["lease"]["ifindex"], nk_guest_idx) + ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id) + + nsimdev.remove() + wait_until(lambda: cmd(f"ip link show dev {nk_host}", fail=False).ret != 0) + ret = cmd(f"ip link show dev {nk_host}", fail=False) + ksft_ne(ret.ret, 0) + + +def test_double_lease(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=3) + defer(cmd, f"ip link del dev {nk_host}") + + ip(f"link set dev {nk_guest} netns 
{netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + src_queue = 1 + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + result = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(result["id"], 1) + + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EBUSY) + + +def test_virtual_lessor(netns) -> None: + nk_host_a, _, nk_guest_a, nk_guest_a_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host_a}") + ip(f"link set dev {nk_host_a} up") + ip(f"link set dev {nk_guest_a} up") + + nk_host_b, _, nk_guest_b, nk_guest_b_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host_b}") + + ip(f"link set dev {nk_guest_b} netns {netns.name}") + ip(f"link set dev {nk_host_b} up") + ip(f"link set dev {nk_guest_b} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_b_idx, + "type": "rx", + "lease": { + "ifindex": nk_guest_a_idx, + "queue": {"id": 0, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_phys_lessee(_netns) -> None: + nsimdev_a = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev_a.remove) + nsim_a = nsimdev_a.nsims[0] + ip(f"link set dev {nsim_a.ifname} up") + + nsimdev_b = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev_b.remove) + nsim_b = nsimdev_b.nsims[0] + ip(f"link set dev {nsim_b.ifname} up") + + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nsim_a.ifindex, + "type": "rx", + "lease": { + 
"ifindex": nsim_b.ifindex, + "queue": {"id": 0, "type": "rx"}, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_different_lessors(netns) -> None: + nsimdev_a = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev_a.remove) + nsim_a = nsimdev_a.nsims[0] + ip(f"link set dev {nsim_a.ifname} up") + + nsimdev_b = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev_b.remove) + nsim_b = nsimdev_b.nsims[0] + ip(f"link set dev {nsim_b.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=3) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim_a.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim_b.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EOPNOTSUPP) + + +def test_queue_out_of_range(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 2, "type": "rx"}, + "netns-id": 0, + }, + } + ) + 
ksft_eq(e.exception.nl_msg.error, -errno.ERANGE) + + +def test_resize_leased(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + + ethnl = EthtoolFamily() + with ksft_raises(NlError) as e: + ethnl.channels_set({"header": {"dev-index": nsim.ifindex}, "combined-count": 1}) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_self_lease(_netns) -> None: + nk_host, _, _, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nk_guest_idx, + "queue": {"id": 0, "type": "rx"}, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_veth_queue_create(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + ip("link add veth0 type veth peer name veth1") + defer(cmd, "ip link del dev veth0", fail=False) + + all_links = ip("-d link show", json=True) + veth_peer = [ + link + for link in all_links + if link.get("ifname") == "veth1" + ] + veth_peer_idx = veth_peer[0]["ifindex"] + + ip(f"link set dev veth1 netns {netns.name}") + ip("link set dev veth0 up") + ip("link set dev veth1 up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = 
NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": veth_peer_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_create_tx_type(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "tx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_create_primary(_netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, nk_host_idx, _, _ = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_host} up") + + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_host_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EOPNOTSUPP) + + +def test_create_limit(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=1) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev 
{nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_link_flap_phys(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}") + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + src_queue = 1 + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + result = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + nk_queue_id = result["id"] + + netdevnl = NetdevFamily() + queue_info = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"} + ) + ksft_in("lease", queue_info) + ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id) + + # Link flap the physical device + ip(f"link set dev {nsim.ifname} down") + ip(f"link set dev {nsim.ifname} up") + + # Verify lease survives the flap + queue_info = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"} + ) + ksft_in("lease", queue_info) + ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id) + + +def test_queue_get_virtual(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + 
defer(cmd, f"ip link del dev {nk_host}") + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + src_queue = 1 + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + result = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + nk_queue_id = result["id"] + + # queue-get on virtual device's leased queue should not show lease + # info (lease info is only shown from the physical device's side) + queue_info = netdevnl.queue_get( + {"ifindex": nk_guest_idx, "id": nk_queue_id, "type": "rx"} + ) + ksft_eq(queue_info["id"], nk_queue_id) + ksft_eq(queue_info["ifindex"], nk_guest_idx) + ksft_not_in("lease", queue_info) + + # Default queue (not leased) also has no lease info + queue_info = netdevnl.queue_get( + {"ifindex": nk_guest_idx, "id": 0, "type": "rx"} + ) + ksft_not_in("lease", queue_info) + + +def test_remove_virt_first(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + src_queue = 1 + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + result = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": src_queue, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(result["id"], 1) + + netdevnl = NetdevFamily() + queue_info = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"} + ) + ksft_in("lease", queue_info) + ksft_eq(queue_info["lease"]["queue"]["id"], result["id"]) + + # Delete netkit (virtual device removed first, physical stays) + 
cmd(f"ip link del dev {nk_host}") + + # Verify lease is cleaned up on physical device + queue_info = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"} + ) + ksft_not_in("lease", queue_info) + + +def test_multiple_leases(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=3) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=4) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + r1 = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + r2 = netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 2, "type": "rx"}, + "netns-id": 0, + }, + } + ) + + ksft_eq(r1["id"], 1) + ksft_eq(r2["id"], 2) + + # Verify both leases visible on physical device + netdevnl = NetdevFamily() + q1 = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": 1, "type": "rx"} + ) + q2 = netdevnl.queue_get( + {"ifindex": nsim.ifindex, "id": 2, "type": "rx"} + ) + ksft_in("lease", q1) + ksft_in("lease", q2) + ksft_eq(q1["lease"]["ifindex"], nk_guest_idx) + ksft_eq(q2["lease"]["ifindex"], nk_guest_idx) + ksft_eq(q1["lease"]["queue"]["id"], r1["id"]) + ksft_eq(q2["lease"]["queue"]["id"], r2["id"]) + + +def test_lease_queue_tx_type(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + 
ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "tx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.EINVAL) + + +def test_invalid_netns(netns) -> None: + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": 1, + "queue": {"id": 0, "type": "rx"}, + "netns-id": 999, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.ENONET) + + +def test_invalid_phys_ifindex(netns) -> None: + nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host}", fail=False) + + ip(f"link set dev {nk_guest} netns {netns.name}") + ip(f"link set dev {nk_host} up") + ip(f"link set dev {nk_guest} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + with ksft_raises(NlError) as e: + netdevnl.queue_create( + { + "ifindex": nk_guest_idx, + "type": "rx", + "lease": { + "ifindex": 99999, + "queue": {"id": 0, "type": "rx"}, + "netns-id": 0, + }, + } + ) + ksft_eq(e.exception.nl_msg.error, -errno.ENODEV) + + +def test_multi_netkit_remove_phys(netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=3) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + # Create two netkit pairs, each leasing a different physical queue + nk_host_a, _, nk_guest_a, nk_guest_a_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip 
link del dev {nk_host_a}", fail=False) + + nk_host_b, _, nk_guest_b, nk_guest_b_idx = create_netkit(rxqueues=2) + defer(cmd, f"ip link del dev {nk_host_b}", fail=False) + + ip(f"link set dev {nk_guest_a} netns {netns.name}") + ip(f"link set dev {nk_host_a} up") + ip(f"link set dev {nk_guest_a} up", ns=netns) + + ip(f"link set dev {nk_guest_b} netns {netns.name}") + ip(f"link set dev {nk_host_b} up") + ip(f"link set dev {nk_guest_b} up", ns=netns) + + with NetNSEnter(str(netns)): + netdevnl = NetdevFamily() + netdevnl.queue_create( + { + "ifindex": nk_guest_a_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + "netns-id": 0, + }, + } + ) + netdevnl.queue_create( + { + "ifindex": nk_guest_b_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 2, "type": "rx"}, + "netns-id": 0, + }, + } + ) + + # Removing the physical device should take down both netkit pairs + nsimdev.remove() + wait_until(lambda: cmd(f"ip link show dev {nk_host_a}", fail=False).ret != 0 + and cmd(f"ip link show dev {nk_host_b}", fail=False).ret != 0) + ret = cmd(f"ip link show dev {nk_host_a}", fail=False) + ksft_ne(ret.ret, 0) + ret = cmd(f"ip link show dev {nk_host_b}", fail=False) + ksft_ne(ret.ret, 0) + + +def test_single_remove_phys(_netns) -> None: + nsimdev = NetdevSimDev(port_count=1, queue_count=2) + defer(nsimdev.remove) + nsim = nsimdev.nsims[0] + ip(f"link set dev {nsim.ifname} up") + + nk_name, nk_idx = create_netkit_single(rxqueues=2) + defer(cmd, f"ip link del dev {nk_name}", fail=False) + + ip(f"link set dev {nk_name} up") + + netdevnl = NetdevFamily() + netdevnl.queue_create( + { + "ifindex": nk_idx, + "type": "rx", + "lease": { + "ifindex": nsim.ifindex, + "queue": {"id": 1, "type": "rx"}, + }, + } + ) + + # Removing the physical device should take down the single netkit device + nsimdev.remove() + wait_until(lambda: cmd(f"ip link show dev {nk_name}", fail=False).ret != 0) + ret = cmd(f"ip link show dev 
{nk_name}", fail=False)
+    ksft_ne(ret.ret, 0)
+
+
+def test_link_flap_virt(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}")
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    src_queue = 1
+    with NetNSEnter(str(netns)):
+        netdevnl = NetdevFamily()
+        result = netdevnl.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        nk_queue_id = result["id"]
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id)
+
+    # Link flap the virtual (netkit) device
+    ip(f"link set dev {nk_guest} down", ns=netns)
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    # Verify lease survives the virtual device flap
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id)
+
+
+def test_phys_queue_no_lease(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}")
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)):
+        netdevnl = NetdevFamily()
+        netdevnl.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    # Physical queue 0 (not leased) should have no lease info
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 0, "type": "rx"}
+    )
+    ksft_not_in("lease", queue_info)
+
+    # Physical queue 1 (leased) should have lease info
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+
+def test_same_ns_lease(_netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_name, nk_idx = create_netkit_single(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_name}", fail=False)
+
+    ip(f"link set dev {nk_name} up")
+
+    netdevnl = NetdevFamily()
+    result = netdevnl.queue_create(
+        {
+            "ifindex": nk_idx,
+            "type": "rx",
+            "lease": {
+                "ifindex": nsim.ifindex,
+                "queue": {"id": 1, "type": "rx"},
+            },
+        }
+    )
+    ksft_eq(result["id"], 1)
+
+    # Same namespace: lease info should NOT have netns-id
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["ifindex"], nk_idx)
+    ksft_eq(queue_info["lease"]["queue"]["id"], result["id"])
+    ksft_not_in("netns-id", queue_info["lease"])
+
+
+def test_resize_after_unlease(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)):
+        netdevnl = NetdevFamily()
+        netdevnl.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    # Resize should fail while lease is active
+    ethnl = EthtoolFamily()
+    with ksft_raises(NlError) as e:
+        ethnl.channels_set({"header": {"dev-index": nsim.ifindex}, "combined-count": 1})
+    ksft_eq(e.exception.nl_msg.error, -errno.EINVAL)
+
+    # Delete netkit, clearing the lease
+    cmd(f"ip link del dev {nk_host}")
+
+    # Resize should now succeed
+    ethnl.channels_set({"header": {"dev-index": nsim.ifindex}, "combined-count": 1})
+
+
+def test_lease_queue_zero(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)):
+        netdevnl = NetdevFamily()
+        result = netdevnl.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 0, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(result["id"], 1)
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 0, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], result["id"])
+
+
+def test_release_and_reuse(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    src_queue = 1
+
+    # First lease
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)):
+        netdevnl = NetdevFamily()
+        netdevnl.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+    # Delete netkit, freeing the lease
+    cmd(f"ip link del dev {nk_host}")
+
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_not_in("lease", queue_info)
+
+    # Re-create netkit and lease the same physical queue again
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)):
+        netdevnl = NetdevFamily()
+        result = netdevnl.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(result["id"], 1)
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], result["id"])
+
+
+def test_two_netkits_same_queue(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host_a, _, nk_guest_a, nk_guest_a_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_a}", fail=False)
+
+    nk_host_b, _, nk_guest_b, nk_guest_b_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_b}", fail=False)
+
+    ip(f"link set dev {nk_guest_a} netns {netns.name}")
+    ip(f"link set dev {nk_host_a} up")
+    ip(f"link set dev {nk_guest_a} up", ns=netns)
+
+    ip(f"link set dev {nk_guest_b} netns {netns.name}")
+    ip(f"link set dev {nk_host_b} up")
+    ip(f"link set dev {nk_guest_b} up", ns=netns)
+
+    src_queue = 1
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_a_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+        with ksft_raises(NlError) as e:
+            netdevnl_ns.queue_create(
+                {
+                    "ifindex": nk_guest_b_idx,
+                    "type": "rx",
+                    "lease": {
+                        "ifindex": nsim.ifindex,
+                        "queue": {"id": src_queue, "type": "rx"},
+                        "netns-id": 0,
+                    },
+                }
+            )
+        ksft_eq(e.exception.nl_msg.error, -errno.EBUSY)
+
+
+def test_l3_mode_lease(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2, mode="l3")
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    src_queue = 1
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        result = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(result["id"], 1)
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["ifindex"], nk_guest_idx)
+    ksft_eq(queue_info["lease"]["queue"]["id"], result["id"])
+
+
+def test_single_double_lease(_netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_name, nk_idx = create_netkit_single(rxqueues=3)
+    defer(cmd, f"ip link del dev {nk_name}", fail=False)
+
+    ip(f"link set dev {nk_name} up")
+
+    netdevnl = NetdevFamily()
+    result = netdevnl.queue_create(
+        {
+            "ifindex": nk_idx,
+            "type": "rx",
+            "lease": {
+                "ifindex": nsim.ifindex,
+                "queue": {"id": 1, "type": "rx"},
+            },
+        }
+    )
+    ksft_eq(result["id"], 1)
+
+    with ksft_raises(NlError) as e:
+        netdevnl.queue_create(
+            {
+                "ifindex": nk_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                },
+            }
+        )
+    ksft_eq(e.exception.nl_msg.error, -errno.EBUSY)
+
+
+def test_single_different_lessors(_netns) -> None:
+    nsimdev_a = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev_a.remove)
+    nsim_a = nsimdev_a.nsims[0]
+    ip(f"link set dev {nsim_a.ifname} up")
+
+    nsimdev_b = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev_b.remove)
+    nsim_b = nsimdev_b.nsims[0]
+    ip(f"link set dev {nsim_b.ifname} up")
+
+    nk_name, nk_idx = create_netkit_single(rxqueues=3)
+    defer(cmd, f"ip link del dev {nk_name}", fail=False)
+
+    ip(f"link set dev {nk_name} up")
+
+    netdevnl = NetdevFamily()
+    netdevnl.queue_create(
+        {
+            "ifindex": nk_idx,
+            "type": "rx",
+            "lease": {
+                "ifindex": nsim_a.ifindex,
+                "queue": {"id": 1, "type": "rx"},
+            },
+        }
+    )
+
+    with ksft_raises(NlError) as e:
+        netdevnl.queue_create(
+            {
+                "ifindex": nk_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim_b.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                },
+            }
+        )
+    ksft_eq(e.exception.nl_msg.error, -errno.EOPNOTSUPP)
+
+
+def test_cross_ns_netns_id(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    src_queue = 1
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_in("netns-id", queue_info["lease"])
+
+
+def test_delete_guest_netns(_netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    test_ns = NetNS()
+    ip("netns set init 0", ns=test_ns)
+    ip("link set lo up", ns=test_ns)
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {test_ns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=test_ns)
+
+    src_queue = 1
+    with NetNSEnter(str(test_ns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+    del test_ns
+    wait_until(lambda: "lease" not in netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}))
+
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_not_in("lease", queue_info)
+
+    ret = cmd(f"ip link show dev {nk_host}", fail=False)
+    ksft_ne(ret.ret, 0)
+
+
+def test_move_guest_netns(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    src_queue = 1
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        result = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        nk_queue_id = result["id"]
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id)
+
+    new_ns = NetNS()
+    defer(new_ns.__del__)
+    ip(f"link set dev {nk_guest} netns {new_ns.name}", ns=netns)
+
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id)
+
+
+def test_resize_phys_no_reduction(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    ethnl = EthtoolFamily()
+    ethnl.channels_set(
+        {"header": {"dev-index": nsim.ifindex}, "combined-count": 2}
+    )
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+
+def test_delete_one_netkit_of_two(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=3)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host_a, _, nk_guest_a, nk_guest_a_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_a}", fail=False)
+
+    nk_host_b, _, nk_guest_b, nk_guest_b_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_b}", fail=False)
+
+    ip(f"link set dev {nk_guest_a} netns {netns.name}")
+    ip(f"link set dev {nk_host_a} up")
+    ip(f"link set dev {nk_guest_a} up", ns=netns)
+
+    ip(f"link set dev {nk_guest_b} netns {netns.name}")
+    ip(f"link set dev {nk_host_b} up")
+    ip(f"link set dev {nk_guest_b} up", ns=netns)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_a_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_b_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 2, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    netdevnl = NetdevFamily()
+    q1 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    q2 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_in("lease", q1)
+    ksft_in("lease", q2)
+
+    cmd(f"ip link del dev {nk_host_a}")
+
+    q1 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    q2 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_not_in("lease", q1)
+    ksft_in("lease", q2)
+
+
+def test_bind_rx_leased_phys_queue(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    netdevnl = NetdevFamily()
+    with ksft_raises(NlError) as e:
+        netdevnl.bind_rx(
+            {
+                "ifindex": nsim.ifindex,
+                "fd": 0,
+                "queues": [
+                    {"id": 0, "type": "rx"},
+                    {"id": 1, "type": "rx"},
+                ],
+            }
+        )
+    ksft_eq(e.exception.nl_msg.error, -errno.EOPNOTSUPP)
+
+
+def test_resize_phys_shrink_past_leased(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=4)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    ethnl = EthtoolFamily()
+
+    # Shrink past the leased queue — only queue 3 removed, queue 1 untouched
+    ethnl.channels_set(
+        {"header": {"dev-index": nsim.ifindex}, "combined-count": 3}
+    )
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+    # Shrink further — queue 2 removed, queue 1 still untouched
+    ethnl.channels_set(
+        {"header": {"dev-index": nsim.ifindex}, "combined-count": 2}
+    )
+
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+    # Shrink into the leased queue — queue 1 is busy, must fail
+    with ksft_raises(NlError) as e:
+        ethnl.channels_set(
+            {"header": {"dev-index": nsim.ifindex}, "combined-count": 1}
+        )
+    ksft_eq(e.exception.nl_msg.error, -errno.EINVAL)
+
+
+def test_resize_virt_not_supported(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, nk_host_idx, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    # Channel resize on the netkit host must fail — not supported
+    ethnl = EthtoolFamily()
+    with ksft_raises(NlError) as e:
+        ethnl.channels_set(
+            {"header": {"dev-index": nk_host_idx}, "combined-count": 1}
+        )
+    ksft_eq(e.exception.nl_msg.error, -errno.EOPNOTSUPP)
+
+    # Lease must be intact
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+
+def test_lease_devices_down(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+
+    # Create lease while both physical and virtual devices are down
+    src_queue = 1
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        result = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(result["id"], 1)
+
+    # Bring devices up before queue_get: netdevsim only instantiates NAPIs in
+    # ndo_open, and netdev-genl queue_get returns -ENOENT without a NAPI.
+    ip(f"link set dev {nsim.ifname} up")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+    ksft_eq(queue_info["lease"]["queue"]["id"], result["id"])
+
+
+def test_lease_capacity_exhaustion(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=4)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    # rxqueues=3 means num_rx_queues=3, real_num_rx_queues starts at 1.
+    # Can create 2 leased queues (real goes 1->2->3) but not a 3rd (3->4 > 3).
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=3)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        r1 = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(r1["id"], 1)
+
+        r2 = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 2, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(r2["id"], 2)
+
+        # Third lease fails — netkit queue capacity exhausted
+        with ksft_raises(NlError) as e:
+            netdevnl_ns.queue_create(
+                {
+                    "ifindex": nk_guest_idx,
+                    "type": "rx",
+                    "lease": {
+                        "ifindex": nsim.ifindex,
+                        "queue": {"id": 3, "type": "rx"},
+                        "netns-id": 0,
+                    },
+                }
+            )
+        ksft_eq(e.exception.nl_msg.error, -errno.EINVAL)
+
+    # Verify the two successful leases are intact
+    netdevnl = NetdevFamily()
+    q1 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    q2 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_in("lease", q1)
+    ksft_in("lease", q2)
+
+
+def test_resize_phys_up(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=3)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    # Shrink nsim first so we have room to grow
+    ethnl = EthtoolFamily()
+    ethnl.channels_set(
+        {"header": {"dev-index": nsim.ifindex}, "combined-count": 2}
+    )
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    # Grow channels — should succeed since leased queue is not removed
+    ethnl.channels_set(
+        {"header": {"dev-index": nsim.ifindex}, "combined-count": 3}
+    )
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+    # New queue 2 should exist without a lease
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_not_in("lease", queue_info)
+
+
+def test_multi_ns_lease(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=3)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    ns_b = NetNS()
+    defer(ns_b.__del__)
+    ip("netns set init 0", ns=ns_b)
+    ip("link set lo up", ns=ns_b)
+
+    # First netkit pair, guest in netns
+    nk_host_a, _, nk_guest_a, nk_guest_a_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_a}", fail=False)
+    ip(f"link set dev {nk_guest_a} netns {netns.name}")
+    ip(f"link set dev {nk_host_a} up")
+    ip(f"link set dev {nk_guest_a} up", ns=netns)
+
+    # Second netkit pair, guest in ns_b
+    nk_host_b, _, nk_guest_b, nk_guest_b_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_b}", fail=False)
+    ip(f"link set dev {nk_guest_b} netns {ns_b.name}")
+    ip(f"link set dev {nk_host_b} up")
+    ip(f"link set dev {nk_guest_b} up", ns=ns_b)
+
+    # Lease from netns
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        result = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_a_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(result["id"], 1)
+
+    # Lease from ns_b (different namespace, same physical device)
+    with NetNSEnter(str(ns_b)), NetdevFamily() as netdevnl_ns:
+        result = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_b_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 2, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+        ksft_eq(result["id"], 1)
+
+    # Verify both leases from the physical side
+    netdevnl = NetdevFamily()
+    q1 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    q2 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_in("lease", q1)
+    ksft_in("lease", q2)
+    ksft_eq(q1["lease"]["ifindex"], nk_guest_a_idx)
+    ksft_eq(q2["lease"]["ifindex"], nk_guest_b_idx)
+
+
+def test_multi_ns_delete_one(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=3)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    ns_b = NetNS()
+    ip("netns set init 0", ns=ns_b)
+    ip("link set lo up", ns=ns_b)
+
+    # First netkit pair, guest in netns (ns_a)
+    nk_host_a, _, nk_guest_a, nk_guest_a_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_a}", fail=False)
+    ip(f"link set dev {nk_guest_a} netns {netns.name}")
+    ip(f"link set dev {nk_host_a} up")
+    ip(f"link set dev {nk_guest_a} up", ns=netns)
+
+    # Second netkit pair, guest in ns_b
+    nk_host_b, _, nk_guest_b, nk_guest_b_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host_b}", fail=False)
+
+    ip(f"link set dev {nk_guest_b} netns {ns_b.name}")
+    ip(f"link set dev {nk_host_b} up")
+    ip(f"link set dev {nk_guest_b} up", ns=ns_b)
+
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_a_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 1, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    with NetNSEnter(str(ns_b)), NetdevFamily() as netdevnl_ns:
+        netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_b_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": 2, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )
+
+    netdevnl = NetdevFamily()
+    q1 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    q2 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_in("lease", q1)
+    ksft_in("lease", q2)
+
+    # Delete ns_b — destroys nk_guest_b, triggers unlease of queue 2
+    del ns_b
+    wait_until(lambda: "lease" not in netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}))
+
+    # ns_a's lease on queue 1 must survive
+    q1 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 1, "type": "rx"}
+    )
+    ksft_in("lease", q1)
+    ksft_eq(q1["lease"]["ifindex"], nk_guest_a_idx)
+
+    # ns_b's lease on queue 2 must be gone
+    q2 = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": 2, "type": "rx"}
+    )
+    ksft_not_in("lease", q2)
+
+    # nk_host_b should be gone too (phys removal cascades to netkit pair)
+    ret = cmd(f"ip link show dev {nk_host_b}", fail=False)
+    ksft_ne(ret.ret, 0)
+
+
+def test_move_phys_netns(netns) -> None:
+    nsimdev = NetdevSimDev(port_count=1, queue_count=2)
+    defer(nsimdev.remove)
+    nsim = nsimdev.nsims[0]
+    ip(f"link set dev {nsim.ifname} up")
+
+    nk_host, _, nk_guest, nk_guest_idx = create_netkit(rxqueues=2)
+    defer(cmd, f"ip link del dev {nk_host}", fail=False)
+
+    ip(f"link set dev {nk_guest} netns {netns.name}")
+    ip(f"link set dev {nk_host} up")
+    ip(f"link set dev {nk_guest} up", ns=netns)
+
+    src_queue = 1
+    with NetNSEnter(str(netns)), NetdevFamily() as netdevnl_ns:
+        nk_queue_id = netdevnl_ns.queue_create(
+            {
+                "ifindex": nk_guest_idx,
+                "type": "rx",
+                "lease": {
+                    "ifindex": nsim.ifindex,
+                    "queue": {"id": src_queue, "type": "rx"},
+                    "netns-id": 0,
+                },
+            }
+        )["id"]
+
+    netdevnl = NetdevFamily()
+    queue_info = netdevnl.queue_get(
+        {"ifindex": nsim.ifindex, "id": src_queue, "type": "rx"}
+    )
+    ksft_in("lease", queue_info)
+
+    # Move the physical device to a new namespace. Move it back to init_net
+    # on cleanup before the other defers fire (new_ns deletion, nsimdev.remove)
+    # so nsim lives in a stable namespace when they run.
+    new_ns = NetNS()
+    defer(new_ns.__del__)
+    ip(f"link set dev {nsim.ifname} netns {new_ns.name}")
+    defer(ip, f"link set dev {nsim.ifname} netns init", ns=new_ns)
+
+    # Physical device is now in new_ns — find its ifindex there
+    all_links = ip("-d link show", json=True, ns=new_ns)
+    nsim_in_new = [lnk for lnk in all_links if lnk.get("ifname") == nsim.ifname]
+    new_ifindex = nsim_in_new[0]["ifindex"]
+
+    # Moving a device across netns brings it admin-down; bring it back up so
+    # netdevsim re-creates the NAPI (netdev-genl queue_get needs it).
+    ip(f"link set dev {nsim.ifname} up", ns=new_ns)
+
+    # Verify lease survived the namespace move
+    with NetNSEnter(str(new_ns)), NetdevFamily() as netdevnl_ns:
+        queue_info = netdevnl_ns.queue_get(
+            {"ifindex": new_ifindex, "id": src_queue, "type": "rx"}
+        )
+        ksft_in("lease", queue_info)
+        ksft_eq(queue_info["lease"]["queue"]["id"], nk_queue_id)
+
+
+def main() -> None:
+    netns = NetNS()
+    cmd("ip netns attach init 1")
+    ip("netns set init 0", ns=netns)
+    ip("link set lo up", ns=netns)
+
+    ksft_run(
+        [
+            test_remove_phys,
+            test_double_lease,
+            test_virtual_lessor,
+            test_phys_lessee,
+            test_different_lessors,
+            test_queue_out_of_range,
+            test_resize_leased,
+            test_self_lease,
+            test_create_tx_type,
+            test_create_primary,
+            test_create_limit,
+            test_link_flap_phys,
+            test_queue_get_virtual,
+            test_remove_virt_first,
+            test_multiple_leases,
+            test_lease_queue_tx_type,
+            test_invalid_netns,
+            test_invalid_phys_ifindex,
+            test_multi_netkit_remove_phys,
+            test_single_remove_phys,
+            test_link_flap_virt,
+            test_phys_queue_no_lease,
+            test_same_ns_lease,
+            test_resize_after_unlease,
+            test_lease_queue_zero,
+            test_release_and_reuse,
+            test_veth_queue_create,
+            test_two_netkits_same_queue,
+            test_l3_mode_lease,
+            test_single_double_lease,
+            test_single_different_lessors,
+            test_cross_ns_netns_id,
+            test_delete_guest_netns,
+            test_move_guest_netns,
+            test_resize_phys_no_reduction,
+            test_delete_one_netkit_of_two,
+            test_bind_rx_leased_phys_queue,
+            test_resize_phys_shrink_past_leased,
+            test_resize_virt_not_supported,
+            test_lease_devices_down,
+            test_lease_capacity_exhaustion,
+            test_resize_phys_up,
+            test_multi_ns_lease,
+            test_multi_ns_delete_one,
+            test_move_phys_netns,
+        ],
+        args=(netns,),
+    )
+
+    cmd("ip netns del init", fail=False)
+    ksft_exit()
+
+
+if __name__ == "__main__":
+    main()
diff --git a/tools/testing/selftests/net/nl_netdev.py b/tools/testing/selftests/net/nl_netdev.py
index 5c66421ab8aa..eff55c64a012 100755
--- a/tools/testing/selftests/net/nl_netdev.py
+++ b/tools/testing/selftests/net/nl_netdev.py
@@ -1,11 +1,15 @@
 #!/usr/bin/env python3
 # SPDX-License-Identifier: GPL-2.0
 
-import time
+"""
+Tests for the netdev netlink family.
+"""
+
+import errno
 from os import system
-from lib.py import ksft_run, ksft_exit, ksft_pr
-from lib.py import ksft_eq, ksft_ge, ksft_ne, ksft_busy_wait
-from lib.py import NetdevFamily, NetdevSimDev, ip
+from lib.py import ksft_run, ksft_exit
+from lib.py import ksft_eq, ksft_ge, ksft_ne, ksft_raises, ksft_busy_wait
+from lib.py import NetdevFamily, NetdevSimDev, NlError, ip
 
 
 def empty_check(nf) -> None:
@@ -19,6 +23,15 @@ def lo_check(nf) -> None:
     ksft_eq(len(lo_info['xdp-rx-metadata-features']), 0)
 
 
+def dev_dump_reject_attr(nf) -> None:
+    """Test that dev-get dump rejects attributes (no dump request policy)."""
+    with ksft_raises(NlError) as cm:
+        nf.dev_get({'ifindex': 1}, dump=True)
+    ksft_eq(cm.exception.nl_msg.error, -errno.EINVAL)
+    ksft_eq(cm.exception.nl_msg.extack['msg'], 'Unknown attribute type')
+    ksft_eq(cm.exception.nl_msg.extack['bad-attr'], '.ifindex')
+
+
 def napi_list_check(nf) -> None:
     with NetdevSimDev(queue_count=100) as nsimdev:
         nsim = nsimdev.nsims[0]
@@ -243,9 +256,16 @@ def page_pool_check(nf) -> None:
 
 
 def main() -> None:
+    """ Ksft boiler plate main """
     nf = NetdevFamily()
-    ksft_run([empty_check, lo_check, page_pool_check, napi_list_check,
-              dev_set_threaded, napi_set_threaded, nsim_rxq_reset_down],
+    ksft_run([empty_check,
+              lo_check,
+              dev_dump_reject_attr,
+              napi_list_check,
+              napi_set_threaded,
+              dev_set_threaded,
+              nsim_rxq_reset_down,
+              page_pool_check],
              args=(nf, ))
     ksft_exit()
 
diff --git a/tools/testing/selftests/net/nl_nlctrl.py b/tools/testing/selftests/net/nl_nlctrl.py
new file mode 100755
index 000000000000..fe1f66dc9435
--- /dev/null
+++ b/tools/testing/selftests/net/nl_nlctrl.py
@@ -0,0 +1,131 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+
+"""
+Tests for the nlctrl genetlink family (family info and policy dumps).
+"""
+
+from lib.py import ksft_run, ksft_exit
+from lib.py import ksft_eq, ksft_ge, ksft_true, ksft_in, ksft_not_in
+from lib.py import NetdevFamily, EthtoolFamily, NlctrlFamily
+
+
+def getfamily_do(ctrl) -> None:
+    """Query a single family by name and validate its ops."""
+    fam = ctrl.getfamily({'family-name': 'netdev'})
+    ksft_eq(fam['family-name'], 'netdev')
+    ksft_true(fam['family-id'] > 0)
+
+    # The format of ops is quite odd, [{$idx: {"id"...}}, {$idx: {"id"...}}]
+    # Discard the indices and re-key by command id.
+    ops_by_id = {v['id']: v for op in fam['ops'] for v in op.values()}
+    ksft_eq(len(ops_by_id), len(fam['ops']))
+
+    # All ops should have a policy (either do or dump has one)
+    for op in ops_by_id.values():
+        ksft_in('cmd-cap-haspol', op['flags'],
+                comment=f"op {op['id']} missing haspol")
+
+    # dev-get (id 1) should support both do and dump
+    ksft_in('cmd-cap-do', ops_by_id[1]['flags'])
+    ksft_in('cmd-cap-dump', ops_by_id[1]['flags'])
+
+    # qstats-get (id 12) is dump-only
+    ksft_not_in('cmd-cap-do', ops_by_id[12]['flags'])
+    ksft_in('cmd-cap-dump', ops_by_id[12]['flags'])
+
+    # napi-set (id 14) is do-only and requires admin
+    ksft_in('cmd-cap-do', ops_by_id[14]['flags'])
+    ksft_not_in('cmd-cap-dump', ops_by_id[14]['flags'])
+    ksft_in('admin-perm', ops_by_id[14]['flags'])
+
+    # Notification-only commands (dev-add/del/change-ntf etc.) must
+    # not appear in the ops list since they have no do/dump handlers.
+ for ntf_id in [2, 3, 4, 6, 7, 8]: + ksft_not_in(ntf_id, ops_by_id, + comment=f"ntf-only cmd {ntf_id} should not be in ops") + + +def getfamily_dump(ctrl) -> None: + """Dump all families and verify expected entries.""" + families = ctrl.getfamily({}, dump=True) + ksft_ge(len(families), 2) + + names = [f['family-name'] for f in families] + ksft_in('nlctrl', names, comment="nlctrl not found in family dump") + ksft_in('netdev', names, comment="netdev not found in family dump") + + +def getpolicy_dump(_ctrl) -> None: + """Dump policies for ops using get_policy() and validate results. + + Test with netdev (split ops) where do and dump can have different + policies, and with ethtool (full ops) where they always share one. + """ + # -- netdev (split ops) -- + ndev = NetdevFamily() + + # dev-get: do has a real policy with ifindex, dump has no policy + # (only the reject-all policy with maxattr=0) + pol = ndev.get_policy('dev-get', 'do') + ksft_in('ifindex', pol, comment="dev-get do policy should have ifindex") + ksft_eq(pol['ifindex'].type, 'u32') + + pol_dump = ndev.get_policy('dev-get', 'dump') + ksft_eq(len(pol_dump), 0, comment="dev-get should not accept any attrs") + + # napi-get: both do and dump have real policies + pol_do = ndev.get_policy('napi-get', 'do') + ksft_ge(len(pol_do), 1) + + pol_dump = ndev.get_policy('napi-get', 'dump') + ksft_ge(len(pol_dump), 1) + + # -- ethtool (full ops) -- + et = EthtoolFamily() + + # strset-get (has both do and dump, full ops share policy) + pol_do = et.get_policy('strset-get', 'do') + ksft_ge(len(pol_do), 1, comment="strset-get should have a do policy") + + pol_dump = et.get_policy('strset-get', 'dump') + ksft_ge(len(pol_dump), 1, comment="strset-get should have a dump policy") + + # Same policy means same attribute names + ksft_eq(set(pol_do.keys()), set(pol_dump.keys())) + + # linkinfo-set is do-only (SET command), no dump + pol_do = et.get_policy('linkinfo-set', 'do') + ksft_ge(len(pol_do), 1, comment="linkinfo-set should 
have a do policy") + + pol_dump = et.get_policy('linkinfo-set', 'dump') + ksft_eq(pol_dump, None, + comment="linkinfo-set should not have a dump policy") + + +def getpolicy_by_op(_ctrl) -> None: + """Query policy for specific ops, check attr names are resolved.""" + ndev = NetdevFamily() + + # dev-get do policy should have named attributes from the spec + pol = ndev.get_policy('dev-get', 'do') + ksft_ge(len(pol), 1) + # All attr names should be resolved (no 'attr-N' fallbacks) + for name in pol: + ksft_true(not name.startswith('attr-'), + comment=f"unresolved attr name: {name}") + + +def main() -> None: + """ Ksft boiler plate main """ + ctrl = NlctrlFamily() + ksft_run([getfamily_do, + getfamily_dump, + getpolicy_dump, + getpolicy_by_op], + args=(ctrl, )) + ksft_exit() + + +if __name__ == "__main__": + main() diff --git a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py index b521e0dea506..848f61fdcee0 100644 --- a/tools/testing/selftests/net/openvswitch/ovs-dpctl.py +++ b/tools/testing/selftests/net/openvswitch/ovs-dpctl.py @@ -2583,7 +2583,7 @@ def main(argv): prverscheck = pyroute2.__version__.split(".") if int(prverscheck[0]) == 0 and int(prverscheck[1]) < 6: print("Need to upgrade the python pyroute2 package to >= 0.6.") - sys.exit(0) + sys.exit(1) parser = argparse.ArgumentParser() parser.add_argument( diff --git a/tools/testing/selftests/net/ovpn/Makefile b/tools/testing/selftests/net/ovpn/Makefile index dbe0388c8512..169f0464ac3a 100644 --- a/tools/testing/selftests/net/ovpn/Makefile +++ b/tools/testing/selftests/net/ovpn/Makefile @@ -2,22 +2,35 @@ # Copyright (C) 2020-2025 OpenVPN, Inc. 
# CFLAGS = -pedantic -Wextra -Wall -Wl,--no-as-needed -g -O0 -ggdb $(KHDR_INCLUDES) +CFLAGS += $(shell pkg-config --cflags mbedcrypto-3 mbedtls-3 2>/dev/null) + VAR_CFLAGS = $(shell pkg-config --cflags libnl-3.0 libnl-genl-3.0 2>/dev/null) ifeq ($(VAR_CFLAGS),) VAR_CFLAGS = -I/usr/include/libnl3 endif CFLAGS += $(VAR_CFLAGS) +MTLS_LDLIBS= $(shell pkg-config --libs mbedcrypto-3 mbedtls-3 2>/dev/null) +ifeq ($(MTLS_LDLIBS),) +MTLS_LDLIBS = -lmbedtls -lmbedcrypto +endif +LDLIBS += $(MTLS_LDLIBS) -LDLIBS = -lmbedtls -lmbedcrypto -VAR_LDLIBS = $(shell pkg-config --libs libnl-3.0 libnl-genl-3.0 2>/dev/null) -ifeq ($(VAR_LDLIBS),) -VAR_LDLIBS = -lnl-genl-3 -lnl-3 +NL_LDLIBS = $(shell pkg-config --libs libnl-3.0 libnl-genl-3.0 2>/dev/null) +ifeq ($(NL_LDLIBS),) +NL_LDLIBS = -lnl-genl-3 -lnl-3 endif -LDLIBS += $(VAR_LDLIBS) +LDLIBS += $(NL_LDLIBS) -TEST_FILES = common.sh +TEST_FILES = \ + common.sh \ + data64.key \ + json \ + tcp_peers.txt \ + udp_peers.txt \ + ../../../../net/ynl/pyynl/cli.py \ +# end of TEST_FILES TEST_PROGS := \ test-chachapoly.sh \ @@ -25,6 +38,10 @@ TEST_PROGS := \ test-close-socket.sh \ test-float.sh \ test-large-mtu.sh \ + test-mark.sh \ + test-symmetric-id-float.sh \ + test-symmetric-id-tcp.sh \ + test-symmetric-id.sh \ test-tcp.sh \ test.sh \ # end of TEST_PROGS diff --git a/tools/testing/selftests/net/ovpn/common.sh b/tools/testing/selftests/net/ovpn/common.sh index 88869c675d03..4c08f756e63a 100644 --- a/tools/testing/selftests/net/ovpn/common.sh +++ b/tools/testing/selftests/net/ovpn/common.sh @@ -7,12 +7,21 @@ UDP_PEERS_FILE=${UDP_PEERS_FILE:-udp_peers.txt} TCP_PEERS_FILE=${TCP_PEERS_FILE:-tcp_peers.txt} OVPN_CLI=${OVPN_CLI:-./ovpn-cli} +YNL_CLI=${YNL_CLI:-../../../../net/ynl/pyynl/cli.py} ALG=${ALG:-aes} PROTO=${PROTO:-UDP} FLOAT=${FLOAT:-0} +SYMMETRIC_ID=${SYMMETRIC_ID:-0} +export ID_OFFSET=$(( 9 * (SYMMETRIC_ID == 0) )) + +JQ_FILTER='map(select(.msg.peer | has("remote-ipv6") | not)) | + map(del(.msg.ifindex)) | sort_by(.msg.peer.id)[]' 
LAN_IP="11.11.11.11" +declare -A tmp_jsons=() +declare -A listener_pids=() + create_ns() { ip netns add peer${1} } @@ -48,27 +57,67 @@ setup_ns() { ip -n peer${1} link set tun${1} up } +build_capture_filter() { + # match the first four bytes of the openvpn data payload + if [ "${PROTO}" == "UDP" ]; then + # For UDP, libpcap transport indexing only works for IPv4, so + # use an explicit IPv4 or IPv6 expression based on the peer + # address. The IPv6 branch assumes there are no extension + # headers in the outer packet. + if [[ "${2}" == *:* ]]; then + printf "ip6 and ip6[6] = 17 and ip6[48:4] = %s" "${1}" + else + printf "ip and udp[8:4] = %s" "${1}" + fi + else + # openvpn over TCP prepends a 2-byte packet length ahead of the + # DATA_V2 opcode, so skip it before matching the payload header + printf "ip and tcp[(((tcp[12] & 0xf0) >> 2) + 2):4] = %s" "${1}" + fi +} + +setup_listener() { + file=$(mktemp) + PYTHONUNBUFFERED=1 ip netns exec peer${p} ${YNL_CLI} --family ovpn \ + --subscribe peers --output-json --duration 40 > ${file} & + listener_pids[$1]=$! 
+ tmp_jsons[$1]="${file}" +} + add_peer() { + labels=("ASYMM" "SYMM") + M_ID=${labels[SYMMETRIC_ID]} + if [ "${PROTO}" == "UDP" ]; then if [ ${1} -eq 0 ]; then - ip netns exec peer0 ${OVPN_CLI} new_multi_peer tun0 1 ${UDP_PEERS_FILE} + ip netns exec peer0 ${OVPN_CLI} new_multi_peer tun0 1 \ + ${M_ID} ${UDP_PEERS_FILE} for p in $(seq 1 ${NUM_PEERS}); do ip netns exec peer0 ${OVPN_CLI} new_key tun0 ${p} 1 0 ${ALG} 0 \ data64.key done else - RADDR=$(awk "NR == ${1} {print \$2}" ${UDP_PEERS_FILE}) - RPORT=$(awk "NR == ${1} {print \$3}" ${UDP_PEERS_FILE}) - LPORT=$(awk "NR == ${1} {print \$5}" ${UDP_PEERS_FILE}) - ip netns exec peer${1} ${OVPN_CLI} new_peer tun${1} ${1} ${LPORT} \ - ${RADDR} ${RPORT} - ip netns exec peer${1} ${OVPN_CLI} new_key tun${1} ${1} 1 0 ${ALG} 1 \ - data64.key + if [ "${SYMMETRIC_ID}" -eq 1 ]; then + PEER_ID=${1} + TX_ID="none" + else + PEER_ID=$(awk "NR == ${1} {print \$2}" \ + ${UDP_PEERS_FILE}) + TX_ID=${1} + fi + RADDR=$(awk "NR == ${1} {print \$3}" ${UDP_PEERS_FILE}) + RPORT=$(awk "NR == ${1} {print \$4}" ${UDP_PEERS_FILE}) + LPORT=$(awk "NR == ${1} {print \$6}" ${UDP_PEERS_FILE}) + ip netns exec peer${1} ${OVPN_CLI} new_peer tun${1} \ + ${PEER_ID} ${TX_ID} ${LPORT} ${RADDR} ${RPORT} + ip netns exec peer${1} ${OVPN_CLI} new_key tun${1} \ + ${PEER_ID} 1 0 ${ALG} 1 data64.key fi else if [ ${1} -eq 0 ]; then - (ip netns exec peer0 ${OVPN_CLI} listen tun0 1 ${TCP_PEERS_FILE} && { + (ip netns exec peer0 ${OVPN_CLI} listen tun0 1 ${M_ID} \ + ${TCP_PEERS_FILE} && { for p in $(seq 1 ${NUM_PEERS}); do ip netns exec peer0 ${OVPN_CLI} new_key tun0 ${p} 1 0 \ ${ALG} 0 data64.key @@ -76,9 +125,37 @@ add_peer() { }) & sleep 5 else - ip netns exec peer${1} ${OVPN_CLI} connect tun${1} ${1} 10.10.${1}.1 1 \ - data64.key + if [ "${SYMMETRIC_ID}" -eq 1 ]; then + PEER_ID=${1} + TX_ID="none" + else + PEER_ID=$(awk "NR == ${1} {print \$2}" \ + ${TCP_PEERS_FILE}) + TX_ID=${1} + fi + ip netns exec peer${1} ${OVPN_CLI} connect tun${1} \ + ${PEER_ID} ${TX_ID} 
10.10.${1}.1 1 data64.key + fi + fi +} + +compare_ntfs() { + if [ ${#tmp_jsons[@]} -gt 0 ]; then + suffix="" + [ "${SYMMETRIC_ID}" -eq 1 ] && suffix="${suffix}-symm" + [ "$FLOAT" == 1 ] && suffix="${suffix}-float" + expected="json/peer${1}${suffix}.json" + received="${tmp_jsons[$1]}" + + kill -TERM ${listener_pids[$1]} || true + wait ${listener_pids[$1]} || true + printf "Checking notifications for peer ${1}... " + if diff <(jq -s "${JQ_FILTER}" ${expected}) \ + <(jq -s "${JQ_FILTER}" ${received}); then + echo "OK" fi + + rm -f ${received} || true fi } @@ -104,5 +181,3 @@ if [ "${PROTO}" == "UDP" ]; then else NUM_PEERS=${NUM_PEERS:-$(wc -l ${TCP_PEERS_FILE} | awk '{print $1}')} fi - - diff --git a/tools/testing/selftests/net/ovpn/data64.key b/tools/testing/selftests/net/ovpn/data64.key index a99e88c4e290..d04febcdf5a2 100644 --- a/tools/testing/selftests/net/ovpn/data64.key +++ b/tools/testing/selftests/net/ovpn/data64.key @@ -1,5 +1 @@ -jRqMACN7d7/aFQNT8S7jkrBD8uwrgHbG5OQZP2eu4R1Y7tfpS2bf5RHv06Vi163CGoaIiTX99R3B -ia9ycAH8Wz1+9PWv51dnBLur9jbShlgZ2QHLtUc4a/gfT7zZwULXuuxdLnvR21DDeMBaTbkgbai9 -uvAa7ne1liIgGFzbv+Bas4HDVrygxIxuAnP5Qgc3648IJkZ0QEXPF+O9f0n5+QIvGCxkAUVx+5K6 -KIs+SoeWXnAopELmoGSjUpFtJbagXK82HfdqpuUxT2Tnuef0/14SzVE/vNleBNu2ZbyrSAaah8tE -BofkPJUBFY+YQcfZNM5Dgrw3i+Bpmpq/gpdg5w== +jRqMACN7d7/aFQNT8S7jkrBD8uwrgHbG5OQZP2eu4R1Y7tfpS2bf5RHv06Vi163CGoaIiTX99R3Bia9ycAH8Wz1+9PWv51dnBLur9jbShlgZ2QHLtUc4a/gfT7zZwULXuuxdLnvR21DDeMBaTbkgbai9uvAa7ne1liIgGFzbv+Bas4HDVrygxIxuAnP5Qgc3648IJkZ0QEXPF+O9f0n5+QIvGCxkAUVx+5K6KIs+SoeWXnAopELmoGSjUpFtJbagXK82HfdqpuUxT2Tnuef0/14SzVE/vNleBNu2ZbyrSAaah8tEBofkPJUBFY+YQcfZNM5Dgrw3i+Bpmpq/gpdg5w== diff --git a/tools/testing/selftests/net/ovpn/json/peer0-float.json b/tools/testing/selftests/net/ovpn/json/peer0-float.json new file mode 100644 index 000000000000..682fa58ad4ea --- /dev/null +++ b/tools/testing/selftests/net/ovpn/json/peer0-float.json @@ -0,0 +1,9 @@ +{"name": "peer-float-ntf", "msg": {"ifindex": 0, "peer": {"id": 1, 
"remote-ipv4": "10.10.1.3", "remote-port": 1}}} +{"name": "peer-float-ntf", "msg": {"ifindex": 0, "peer": {"id": 2, "remote-ipv4": "10.10.2.3", "remote-port": 1}}} +{"name": "peer-float-ntf", "msg": {"ifindex": 0, "peer": {"id": 3, "remote-ipv4": "10.10.3.3", "remote-port": 1}}} +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 1}}} +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 2}}} +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 3}}} +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 4}}} +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 5}}} +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 6}}} diff --git a/tools/testing/selftests/net/ovpn/json/peer0-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer0-symm-float.json new file mode 120000 index 000000000000..e31a5bd59863 --- /dev/null +++ b/tools/testing/selftests/net/ovpn/json/peer0-symm-float.json @@ -0,0 +1 @@ +peer0-float.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer0-symm.json b/tools/testing/selftests/net/ovpn/json/peer0-symm.json
new file mode 120000
index 000000000000..57a163048eed
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer0-symm.json
@@ -0,0 +1 @@
+peer0.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer0.json b/tools/testing/selftests/net/ovpn/json/peer0.json
new file mode 100644
index 000000000000..7c46a33d5ecd
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer0.json
@@ -0,0 +1,6 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 1}}}
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 2}}}
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 3}}}
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 4}}}
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 5}}}
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 6}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer1-float.json b/tools/testing/selftests/net/ovpn/json/peer1-float.json
new file mode 120000
index 000000000000..d28c328d1452
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer1-float.json
@@ -0,0 +1 @@
+peer1.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer1-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer1-symm-float.json
new file mode 120000
index 000000000000..b3615dcc523d
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer1-symm-float.json
@@ -0,0 +1 @@
+peer1-symm.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer1-symm.json b/tools/testing/selftests/net/ovpn/json/peer1-symm.json
new file mode 100644
index 000000000000..5da4ea9d51fb
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer1-symm.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 1}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer1.json b/tools/testing/selftests/net/ovpn/json/peer1.json
new file mode 100644
index 000000000000..1009d26dc14a
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer1.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 10}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer2-float.json b/tools/testing/selftests/net/ovpn/json/peer2-float.json
new file mode 120000
index 000000000000..b9f09980aaa0
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer2-float.json
@@ -0,0 +1 @@
+peer2.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer2-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer2-symm-float.json
new file mode 120000
index 000000000000..28a895cb5170
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer2-symm-float.json
@@ -0,0 +1 @@
+peer2-symm.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer2-symm.json b/tools/testing/selftests/net/ovpn/json/peer2-symm.json
new file mode 100644
index 000000000000..8f6db4f8c2ac
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer2-symm.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 2}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer2.json b/tools/testing/selftests/net/ovpn/json/peer2.json
new file mode 100644
index 000000000000..44e9fad2b622
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer2.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "userspace", "id": 11}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer3-float.json b/tools/testing/selftests/net/ovpn/json/peer3-float.json
new file mode 120000
index 000000000000..2700b55bcf2e
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer3-float.json
@@ -0,0 +1 @@
+peer3.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer3-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer3-symm-float.json
new file mode 120000
index 000000000000..ee8b9719c2fd
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer3-symm-float.json
@@ -0,0 +1 @@
+peer3-symm.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer3-symm.json b/tools/testing/selftests/net/ovpn/json/peer3-symm.json
new file mode 100644
index 000000000000..bdabd6fa2e64
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer3-symm.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 3}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer3.json b/tools/testing/selftests/net/ovpn/json/peer3.json
new file mode 100644
index 000000000000..d4be8ba130ae
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer3.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 12}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer4-float.json b/tools/testing/selftests/net/ovpn/json/peer4-float.json
new file mode 120000
index 000000000000..460f6c14cd60
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer4-float.json
@@ -0,0 +1 @@
+peer4.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer4-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer4-symm-float.json
new file mode 120000
index 000000000000..7d34ff7305da
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer4-symm-float.json
@@ -0,0 +1 @@
+peer4-symm.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer4-symm.json b/tools/testing/selftests/net/ovpn/json/peer4-symm.json
new file mode 100644
index 000000000000..c3734bb9251b
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer4-symm.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 4}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer4.json b/tools/testing/selftests/net/ovpn/json/peer4.json
new file mode 100644
index 000000000000..67d27e2d48ac
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer4.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 13}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer5-float.json b/tools/testing/selftests/net/ovpn/json/peer5-float.json
new file mode 120000
index 000000000000..0f725c50ce19
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer5-float.json
@@ -0,0 +1 @@
+peer5.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer5-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer5-symm-float.json
new file mode 120000
index 000000000000..afc0f5f9f13b
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer5-symm-float.json
@@ -0,0 +1 @@
+peer5-symm.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer5-symm.json b/tools/testing/selftests/net/ovpn/json/peer5-symm.json
new file mode 100644
index 000000000000..46c4a348299d
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer5-symm.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 5}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer5.json b/tools/testing/selftests/net/ovpn/json/peer5.json
new file mode 100644
index 000000000000..ecd9bd0b2f37
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer5.json
@@ -0,0 +1 @@
+{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 14}}}
diff --git a/tools/testing/selftests/net/ovpn/json/peer6-float.json b/tools/testing/selftests/net/ovpn/json/peer6-float.json
new file mode 120000
index 000000000000..4d9ded3e0a84
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer6-float.json
@@ -0,0 +1 @@
+peer6.json
\ No newline at end of file
diff --git a/tools/testing/selftests/net/ovpn/json/peer6-symm-float.json b/tools/testing/selftests/net/ovpn/json/peer6-symm-float.json
new file mode 120000
index 000000000000..e39203204d8c
--- /dev/null
+++ b/tools/testing/selftests/net/ovpn/json/peer6-symm-float.json
@@ -0,0 +1 @@
+peer6-symm.json
\ No newline at end of file diff --git a/tools/testing/selftests/net/ovpn/json/peer6-symm.json b/tools/testing/selftests/net/ovpn/json/peer6-symm.json new file mode 100644 index 000000000000..aa30f2cff625 --- /dev/null +++ b/tools/testing/selftests/net/ovpn/json/peer6-symm.json @@ -0,0 +1 @@ +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 6}}} diff --git a/tools/testing/selftests/net/ovpn/json/peer6.json b/tools/testing/selftests/net/ovpn/json/peer6.json new file mode 100644 index 000000000000..7fded29c5804 --- /dev/null +++ b/tools/testing/selftests/net/ovpn/json/peer6.json @@ -0,0 +1 @@ +{"name": "peer-del-ntf", "msg": {"ifindex": 0, "peer": {"del-reason": "expired", "id": 15}}} diff --git a/tools/testing/selftests/net/ovpn/ovpn-cli.c b/tools/testing/selftests/net/ovpn/ovpn-cli.c index 0f3babf19fd0..d40953375c86 100644 --- a/tools/testing/selftests/net/ovpn/ovpn-cli.c +++ b/tools/testing/selftests/net/ovpn/ovpn-cli.c @@ -6,6 +6,7 @@ * Author: Antonio Quartulli <antonio@openvpn.net> */ +#include <stdint.h> #include <stdio.h> #include <inttypes.h> #include <stdbool.h> @@ -103,7 +104,7 @@ struct ovpn_ctx { sa_family_t sa_family; - unsigned long peer_id; + unsigned long peer_id, tx_id; unsigned long lport; union { @@ -133,6 +134,9 @@ struct ovpn_ctx { enum ovpn_key_slot key_slot; int key_id; + uint32_t mark; + bool asymm_id; + const char *peers_file; }; @@ -521,6 +525,15 @@ static int ovpn_socket(struct ovpn_ctx *ctx, sa_family_t family, int proto) return ret; } + if (ctx->mark != 0) { + ret = setsockopt(s, SOL_SOCKET, SO_MARK, (void *)&ctx->mark, + sizeof(ctx->mark)); + if (ret < 0) { + perror("setsockopt for SO_MARK"); + return ret; + } + } + if (family == AF_INET6) { opt = 0; if (setsockopt(s, IPPROTO_IPV6, IPV6_V6ONLY, &opt, @@ -649,6 +662,8 @@ static int ovpn_new_peer(struct ovpn_ctx *ovpn, bool is_tcp) attr = nla_nest_start(ctx->nl_msg, OVPN_A_PEER); NLA_PUT_U32(ctx->nl_msg, OVPN_A_PEER_ID, ovpn->peer_id); + if 
(ovpn->asymm_id) + NLA_PUT_U32(ctx->nl_msg, OVPN_A_PEER_TX_ID, ovpn->tx_id); NLA_PUT_U32(ctx->nl_msg, OVPN_A_PEER_SOCKET, ovpn->socket); if (!is_tcp) { @@ -767,6 +782,10 @@ static int ovpn_handle_peer(struct nl_msg *msg, void (*arg)__always_unused) fprintf(stderr, "* Peer %u\n", nla_get_u32(pattrs[OVPN_A_PEER_ID])); + if (pattrs[OVPN_A_PEER_TX_ID]) + fprintf(stderr, "\tTX peer ID %u\n", + nla_get_u32(pattrs[OVPN_A_PEER_TX_ID])); + if (pattrs[OVPN_A_PEER_SOCKET_NETNSID]) fprintf(stderr, "\tsocket NetNS ID: %d\n", nla_get_s32(pattrs[OVPN_A_PEER_SOCKET_NETNSID])); @@ -1516,6 +1535,9 @@ static int ovpn_handle_msg(struct nl_msg *msg, void *arg) case OVPN_CMD_PEER_DEL_NTF: fprintf(stdout, "received CMD_PEER_DEL_NTF\n"); break; + case OVPN_CMD_PEER_FLOAT_NTF: + fprintf(stdout, "received CMD_PEER_FLOAT_NTF\n"); + break; case OVPN_CMD_KEY_SWAP_NTF: fprintf(stdout, "received CMD_KEY_SWAP_NTF\n"); break; @@ -1654,41 +1676,58 @@ static void usage(const char *cmd) fprintf(stderr, "\tiface: ovpn interface name\n"); fprintf(stderr, - "* listen <iface> <lport> <peers_file> [ipv6]: listen for incoming peer TCP connections\n"); + "* listen <iface> <lport> <id_type> <peers_file> [ipv6]: listen for incoming peer TCP connections\n"); fprintf(stderr, "\tiface: ovpn interface name\n"); fprintf(stderr, "\tlport: TCP port to listen to\n"); + fprintf(stderr, "\tid_type:\n"); + fprintf(stderr, + "\t\t- SYMM for ignoring the TX peer ID from the peers_file\n"); + fprintf(stderr, + "\t\t- ASYMM for using the TX peer ID from the peers_file\n"); fprintf(stderr, "\tpeers_file: file containing one peer per line: Line format:\n"); - fprintf(stderr, "\t\t<peer_id> <vpnaddr>\n"); + fprintf(stderr, "\t\t<peer_id> <tx_id> <vpnaddr>\n"); fprintf(stderr, "\tipv6: whether the socket should listen to the IPv6 wildcard address\n"); fprintf(stderr, - "* connect <iface> <peer_id> <raddr> <rport> [key_file]: start connecting peer of TCP-based VPN session\n"); + "* connect <iface> <peer_id> <tx_id> <raddr> 
<rport> [key_file]: start connecting peer of TCP-based VPN session\n"); fprintf(stderr, "\tiface: ovpn interface name\n"); - fprintf(stderr, "\tpeer_id: peer ID of the connecting peer\n"); + fprintf(stderr, + "\tpeer_id: peer ID found in data packets received from this peer\n"); + fprintf(stderr, + "\ttx_id: peer ID to be used when sending to this peer, 'none' for symmetric peer ID\n"); fprintf(stderr, "\traddr: peer IP address to connect to\n"); fprintf(stderr, "\trport: peer TCP port to connect to\n"); fprintf(stderr, "\tkey_file: file containing the symmetric key for encryption\n"); fprintf(stderr, - "* new_peer <iface> <peer_id> <lport> <raddr> <rport> [vpnaddr]: add new peer\n"); + "* new_peer <iface> <peer_id> <tx_id> <lport> <raddr> <rport> [vpnaddr]: add new peer\n"); fprintf(stderr, "\tiface: ovpn interface name\n"); - fprintf(stderr, "\tlport: local UDP port to bind to\n"); fprintf(stderr, - "\tpeer_id: peer ID to be used in data packets to/from this peer\n"); + "\tpeer_id: peer ID found in data packets received from this peer\n"); + fprintf(stderr, + "\ttx_id: peer ID to be used when sending to this peer, 'none' for symmetric peer ID\n"); + fprintf(stderr, "\tlport: local UDP port to bind to\n"); fprintf(stderr, "\traddr: peer IP address\n"); fprintf(stderr, "\trport: peer UDP port\n"); fprintf(stderr, "\tvpnaddr: peer VPN IP\n"); fprintf(stderr, - "* new_multi_peer <iface> <lport> <peers_file>: add multiple peers as listed in the file\n"); + "* new_multi_peer <iface> <lport> <id_type> <peers_file> [mark]: add multiple peers as listed in the file\n"); fprintf(stderr, "\tiface: ovpn interface name\n"); fprintf(stderr, "\tlport: local UDP port to bind to\n"); + fprintf(stderr, "\tid_type:\n"); + fprintf(stderr, + "\t\t- SYMM for ignoring the TX peer ID from the peers_file\n"); + fprintf(stderr, + "\t\t- ASYMM for using the TX peer ID from the peers_file\n"); fprintf(stderr, "\tpeers_file: text file containing one peer per line. 
Line format:\n"); - fprintf(stderr, "\t\t<peer_id> <raddr> <rport> <vpnaddr>\n"); + fprintf(stderr, + "\t\t<peer_id> <tx_id> <raddr> <rport> <laddr> <lport> <vpnaddr>\n"); + fprintf(stderr, "\tmark: socket FW mark value\n"); fprintf(stderr, "* set_peer <iface> <peer_id> <keepalive_interval> <keepalive_timeout>: set peer attributes\n"); @@ -1801,15 +1840,23 @@ out: } static int ovpn_parse_new_peer(struct ovpn_ctx *ovpn, const char *peer_id, - const char *raddr, const char *rport, - const char *vpnip) + const char *tx_id, const char *raddr, + const char *rport, const char *vpnip) { ovpn->peer_id = strtoul(peer_id, NULL, 10); if (errno == ERANGE || ovpn->peer_id > PEER_ID_UNDEF) { - fprintf(stderr, "peer ID value out of range\n"); + fprintf(stderr, "rx peer ID value out of range\n"); return -1; } + if (ovpn->asymm_id) { + ovpn->tx_id = strtoul(tx_id, NULL, 10); + if (errno == ERANGE || ovpn->tx_id > PEER_ID_UNDEF) { + fprintf(stderr, "tx peer ID value out of range\n"); + return -1; + } + } + return ovpn_parse_remote(ovpn, raddr, rport, vpnip); } @@ -1936,8 +1983,8 @@ static void ovpn_waitbg(void) static int ovpn_run_cmd(struct ovpn_ctx *ovpn) { - char peer_id[10], vpnip[INET6_ADDRSTRLEN], laddr[128], lport[10]; - char raddr[128], rport[10]; + char peer_id[10], tx_id[10], vpnip[INET6_ADDRSTRLEN], laddr[128]; + char lport[10], raddr[128], rport[10]; int n, ret; FILE *fp; @@ -1964,7 +2011,8 @@ static int ovpn_run_cmd(struct ovpn_ctx *ovpn) int num_peers = 0; - while ((n = fscanf(fp, "%s %s\n", peer_id, vpnip)) == 2) { + while ((n = fscanf(fp, "%s %s %s\n", peer_id, tx_id, + vpnip)) == 3) { struct ovpn_ctx peer_ctx = { 0 }; if (num_peers == MAX_PEERS) { @@ -1974,6 +2022,7 @@ static int ovpn_run_cmd(struct ovpn_ctx *ovpn) peer_ctx.ifindex = ovpn->ifindex; peer_ctx.sa_family = ovpn->sa_family; + peer_ctx.asymm_id = ovpn->asymm_id; peer_ctx.socket = ovpn_accept(ovpn); if (peer_ctx.socket < 0) { @@ -1984,8 +2033,8 @@ static int ovpn_run_cmd(struct ovpn_ctx *ovpn) /* store 
peer sockets to test TCP I/O */ ovpn->cli_sockets[num_peers] = peer_ctx.socket; - ret = ovpn_parse_new_peer(&peer_ctx, peer_id, NULL, - NULL, vpnip); + ret = ovpn_parse_new_peer(&peer_ctx, peer_id, tx_id, + NULL, NULL, vpnip); if (ret < 0) { fprintf(stderr, "error while parsing line\n"); return -1; @@ -2053,16 +2102,17 @@ static int ovpn_run_cmd(struct ovpn_ctx *ovpn) return -1; } - while ((n = fscanf(fp, "%s %s %s %s %s %s\n", peer_id, laddr, - lport, raddr, rport, vpnip)) == 6) { + while ((n = fscanf(fp, "%s %s %s %s %s %s %s\n", peer_id, tx_id, + laddr, lport, raddr, rport, vpnip)) == 7) { struct ovpn_ctx peer_ctx = { 0 }; peer_ctx.ifindex = ovpn->ifindex; peer_ctx.socket = ovpn->socket; peer_ctx.sa_family = AF_UNSPEC; + peer_ctx.asymm_id = ovpn->asymm_id; - ret = ovpn_parse_new_peer(&peer_ctx, peer_id, raddr, - rport, vpnip); + ret = ovpn_parse_new_peer(&peer_ctx, peer_id, tx_id, + raddr, rport, vpnip); if (ret < 0) { fprintf(stderr, "error while parsing line\n"); return -1; @@ -2158,7 +2208,7 @@ static int ovpn_parse_cmd_args(struct ovpn_ctx *ovpn, int argc, char *argv[]) case CMD_DEL_IFACE: break; case CMD_LISTEN: - if (argc < 5) + if (argc < 6) return -EINVAL; ovpn->lport = strtoul(argv[3], NULL, 10); @@ -2167,55 +2217,67 @@ static int ovpn_parse_cmd_args(struct ovpn_ctx *ovpn, int argc, char *argv[]) return -1; } - ovpn->peers_file = argv[4]; + if (strcmp(argv[4], "SYMM") == 0) { + ovpn->asymm_id = false; + } else if (strcmp(argv[4], "ASYMM") == 0) { + ovpn->asymm_id = true; + } else { + fprintf(stderr, "Cannot parse id type: %s\n", argv[4]); + return -1; + } + + ovpn->peers_file = argv[5]; ovpn->sa_family = AF_INET; - if (argc > 5 && !strcmp(argv[5], "ipv6")) + if (argc > 6 && !strcmp(argv[6], "ipv6")) ovpn->sa_family = AF_INET6; break; case CMD_CONNECT: - if (argc < 6) + if (argc < 7) return -EINVAL; ovpn->sa_family = AF_INET; + ovpn->asymm_id = strcmp(argv[4], "none"); ret = ovpn_parse_new_peer(ovpn, argv[3], argv[4], argv[5], - NULL); + argv[6], NULL); 
if (ret < 0) { fprintf(stderr, "Cannot parse remote peer data\n"); return -1; } - if (argc > 6) { + if (argc > 7) { ovpn->key_slot = OVPN_KEY_SLOT_PRIMARY; ovpn->key_id = 0; ovpn->cipher = OVPN_CIPHER_ALG_AES_GCM; ovpn->key_dir = KEY_DIR_OUT; - ret = ovpn_parse_key(argv[6], ovpn); + ret = ovpn_parse_key(argv[7], ovpn); if (ret) return -1; } break; case CMD_NEW_PEER: - if (argc < 7) + if (argc < 8) return -EINVAL; - ovpn->lport = strtoul(argv[4], NULL, 10); + ovpn->asymm_id = strcmp(argv[4], "none"); + + ovpn->lport = strtoul(argv[5], NULL, 10); if (errno == ERANGE || ovpn->lport > 65535) { fprintf(stderr, "lport value out of range\n"); return -1; } - const char *vpnip = (argc > 7) ? argv[7] : NULL; + const char *vpnip = (argc > 8) ? argv[8] : NULL; - ret = ovpn_parse_new_peer(ovpn, argv[3], argv[5], argv[6], - vpnip); + ret = ovpn_parse_new_peer(ovpn, argv[3], argv[4], argv[6], + argv[7], vpnip); if (ret < 0) return -1; break; case CMD_NEW_MULTI_PEER: - if (argc < 5) + if (argc < 6) return -EINVAL; ovpn->lport = strtoul(argv[3], NULL, 10); @@ -2224,7 +2286,25 @@ static int ovpn_parse_cmd_args(struct ovpn_ctx *ovpn, int argc, char *argv[]) return -1; } - ovpn->peers_file = argv[4]; + if (!strcmp(argv[4], "SYMM")) { + ovpn->asymm_id = false; + } else if (!strcmp(argv[4], "ASYMM")) { + ovpn->asymm_id = true; + } else { + fprintf(stderr, "Cannot parse id type: %s\n", argv[4]); + return -1; + } + + ovpn->peers_file = argv[5]; + + ovpn->mark = 0; + if (argc > 6) { + ovpn->mark = strtoul(argv[6], NULL, 10); + if (errno == ERANGE || ovpn->mark > UINT32_MAX) { + fprintf(stderr, "mark value out of range\n"); + return -1; + } + } break; case CMD_SET_PEER: if (argc < 6) diff --git a/tools/testing/selftests/net/ovpn/tcp_peers.txt b/tools/testing/selftests/net/ovpn/tcp_peers.txt index d753eebe8716..3cb67b560705 100644 --- a/tools/testing/selftests/net/ovpn/tcp_peers.txt +++ b/tools/testing/selftests/net/ovpn/tcp_peers.txt @@ -1,5 +1,6 @@ -1 5.5.5.2 -2 5.5.5.3 -3 5.5.5.4 -4 
5.5.5.5 -5 5.5.5.6 +1 10 5.5.5.2 +2 11 5.5.5.3 +3 12 5.5.5.4 +4 13 5.5.5.5 +5 14 5.5.5.6 +6 15 5.5.5.7 diff --git a/tools/testing/selftests/net/ovpn/test-close-socket.sh b/tools/testing/selftests/net/ovpn/test-close-socket.sh index 5e48a8b67928..0d09df14fe8e 100755 --- a/tools/testing/selftests/net/ovpn/test-close-socket.sh +++ b/tools/testing/selftests/net/ovpn/test-close-socket.sh @@ -27,7 +27,7 @@ done for p in $(seq 1 ${NUM_PEERS}); do ip netns exec peer0 ${OVPN_CLI} set_peer tun0 ${p} 60 120 - ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} ${p} 60 120 + ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} $((${p}+9)) 60 120 done sleep 1 diff --git a/tools/testing/selftests/net/ovpn/test-mark.sh b/tools/testing/selftests/net/ovpn/test-mark.sh new file mode 100755 index 000000000000..8534428ed3eb --- /dev/null +++ b/tools/testing/selftests/net/ovpn/test-mark.sh @@ -0,0 +1,96 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2020-2025 OpenVPN, Inc. +# +# Author: Ralf Lici <ralf@mandelbit.com> +# Antonio Quartulli <antonio@openvpn.net> + +#set -x +set -e + +MARK=1056 + +source ./common.sh + +cleanup + +modprobe -q ovpn || true + +for p in $(seq 0 "${NUM_PEERS}"); do + create_ns "${p}" +done + +for p in $(seq 0 3); do + setup_ns "${p}" 5.5.5.$((p + 1))/24 +done + +# add peer0 with mark +ip netns exec peer0 "${OVPN_CLI}" new_multi_peer tun0 1 ASYMM \ + "${UDP_PEERS_FILE}" \ + ${MARK} +for p in $(seq 1 3); do + ip netns exec peer0 "${OVPN_CLI}" new_key tun0 "${p}" 1 0 "${ALG}" 0 \ + data64.key +done + +for p in $(seq 1 3); do + add_peer "${p}" +done + +for p in $(seq 1 3); do + ip netns exec peer0 "${OVPN_CLI}" set_peer tun0 "${p}" 60 120 + ip netns exec peer"${p}" "${OVPN_CLI}" set_peer tun"${p}" \ + $((p + 9)) 60 120 +done + +sleep 1 + +for p in $(seq 1 3); do + ip netns exec peer0 ping -qfc 500 -w 3 5.5.5.$((p + 1)) +done + +echo "Adding an nftables drop rule based on mark value ${MARK}" +ip netns exec peer0 nft flush ruleset +ip netns exec 
peer0 nft 'add table inet filter' +ip netns exec peer0 nft 'add chain inet filter output { + type filter hook output priority 0; + policy accept; +}' +ip netns exec peer0 nft add rule inet filter output \ + meta mark == ${MARK} \ + counter drop + +DROP_COUNTER=$(ip netns exec peer0 nft list chain inet filter output \ + | sed -n 's/.*packets \([0-9]*\).*/\1/p') +sleep 1 + +# ping should fail +for p in $(seq 1 3); do + PING_OUTPUT=$(ip netns exec peer0 ping \ + -qfc 500 -w 1 5.5.5.$((p + 1)) 2>&1) && exit 1 + echo "${PING_OUTPUT}" + LOST_PACKETS=$(echo "$PING_OUTPUT" \ + | awk '/packets transmitted/ { print $1 }') + # increment the drop counter by the amount of lost packets + DROP_COUNTER=$((DROP_COUNTER + LOST_PACKETS)) +done + +# check if the final nft counter matches our counter +TOTAL_COUNT=$(ip netns exec peer0 nft list chain inet filter output \ + | sed -n 's/.*packets \([0-9]*\).*/\1/p') +if [ "${DROP_COUNTER}" -ne "${TOTAL_COUNT}" ]; then + echo "Expected ${TOTAL_COUNT} drops, got ${DROP_COUNTER}" + exit 1 +fi + +echo "Removing the drop rule" +ip netns exec peer0 nft flush ruleset +sleep 1 + +for p in $(seq 1 3); do + ip netns exec peer0 ping -qfc 500 -w 3 5.5.5.$((p + 1)) +done + +cleanup + +modprobe -r ovpn || true diff --git a/tools/testing/selftests/net/ovpn/test-symmetric-id-float.sh b/tools/testing/selftests/net/ovpn/test-symmetric-id-float.sh new file mode 100755 index 000000000000..b3711a81b463 --- /dev/null +++ b/tools/testing/selftests/net/ovpn/test-symmetric-id-float.sh @@ -0,0 +1,11 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2025 OpenVPN, Inc. 
+# +# Author: Ralf Lici <ralf@mandelbit.com> +# Antonio Quartulli <antonio@openvpn.net> + +SYMMETRIC_ID="1" +FLOAT="1" + +source test.sh diff --git a/tools/testing/selftests/net/ovpn/test-symmetric-id-tcp.sh b/tools/testing/selftests/net/ovpn/test-symmetric-id-tcp.sh new file mode 100755 index 000000000000..188cafb67b2f --- /dev/null +++ b/tools/testing/selftests/net/ovpn/test-symmetric-id-tcp.sh @@ -0,0 +1,11 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2025 OpenVPN, Inc. +# +# Author: Ralf Lici <ralf@mandelbit.com> +# Antonio Quartulli <antonio@openvpn.net> + +PROTO="TCP" +SYMMETRIC_ID=1 + +source test.sh diff --git a/tools/testing/selftests/net/ovpn/test-symmetric-id.sh b/tools/testing/selftests/net/ovpn/test-symmetric-id.sh new file mode 100755 index 000000000000..35b119c72e4f --- /dev/null +++ b/tools/testing/selftests/net/ovpn/test-symmetric-id.sh @@ -0,0 +1,10 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# Copyright (C) 2025 OpenVPN, Inc. +# +# Author: Ralf Lici <ralf@mandelbit.com> +# Antonio Quartulli <antonio@openvpn.net> + +SYMMETRIC_ID="1" + +source test.sh diff --git a/tools/testing/selftests/net/ovpn/test.sh b/tools/testing/selftests/net/ovpn/test.sh index e8acdc303307..b60e94a4094e 100755 --- a/tools/testing/selftests/net/ovpn/test.sh +++ b/tools/testing/selftests/net/ovpn/test.sh @@ -18,6 +18,10 @@ for p in $(seq 0 ${NUM_PEERS}); do done for p in $(seq 0 ${NUM_PEERS}); do + setup_listener ${p} +done + +for p in $(seq 0 ${NUM_PEERS}); do setup_ns ${p} 5.5.5.$((${p} + 1))/24 ${MTU} done @@ -27,14 +31,45 @@ done for p in $(seq 1 ${NUM_PEERS}); do ip netns exec peer0 ${OVPN_CLI} set_peer tun0 ${p} 60 120 - ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} ${p} 60 120 + ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} \ + $((${p}+ID_OFFSET)) 60 120 done sleep 1 +TCPDUMP_TIMEOUT="1.5s" for p in $(seq 1 ${NUM_PEERS}); do + # The first part of the data packet header consists of: + # - TCP only: 2 bytes for the packet 
length + # - 5 bits for opcode ("9" for DATA_V2) + # - 3 bits for key-id ("0" at this point) + # - 24 bits for peer-id: + # - with asymmetric ID: "${p}" one way and "${p} + 9" the other way + # - with symmetric ID: "${p}" both ways + HEADER1=$(printf "0x4800000%x" ${p}) + HEADER2=$(printf "0x4800000%x" $((${p} + ID_OFFSET))) + RADDR="" + if [ "${PROTO}" == "UDP" ]; then + RADDR=$(awk "NR == ${p} {print \$3}" ${UDP_PEERS_FILE}) + fi + + timeout ${TCPDUMP_TIMEOUT} ip netns exec peer${p} \ + tcpdump --immediate-mode -p -ni veth${p} -c 1 \ + "$(build_capture_filter "${HEADER1}" "${RADDR}")" \ + >/dev/null 2>&1 & + TCPDUMP_PID1=$! + timeout ${TCPDUMP_TIMEOUT} ip netns exec peer${p} \ + tcpdump --immediate-mode -p -ni veth${p} -c 1 \ + "$(build_capture_filter "${HEADER2}" "${RADDR}")" \ + >/dev/null 2>&1 & + TCPDUMP_PID2=$! + + sleep 0.3 ip netns exec peer0 ping -qfc 500 -w 3 5.5.5.$((${p} + 1)) ip netns exec peer0 ping -qfc 500 -s 3000 -w 3 5.5.5.$((${p} + 1)) + + wait ${TCPDUMP_PID1} + wait ${TCPDUMP_PID2} done # ping LAN behind client 1 @@ -57,9 +92,12 @@ ip netns exec peer1 iperf3 -Z -t 3 -c 5.5.5.1 echo "Adding secondary key and then swap:" for p in $(seq 1 ${NUM_PEERS}); do - ip netns exec peer0 ${OVPN_CLI} new_key tun0 ${p} 2 1 ${ALG} 0 data64.key - ip netns exec peer${p} ${OVPN_CLI} new_key tun${p} ${p} 2 1 ${ALG} 1 data64.key - ip netns exec peer${p} ${OVPN_CLI} swap_keys tun${p} ${p} + ip netns exec peer0 ${OVPN_CLI} new_key tun0 ${p} 2 1 ${ALG} 0 \ + data64.key + ip netns exec peer${p} ${OVPN_CLI} new_key tun${p} \ + $((${p} + ID_OFFSET)) 2 1 ${ALG} 1 data64.key + ip netns exec peer${p} ${OVPN_CLI} swap_keys tun${p} \ + $((${p} + ID_OFFSET)) done sleep 1 @@ -71,17 +109,19 @@ ip netns exec peer1 ${OVPN_CLI} get_peer tun1 echo "Querying peer 1:" ip netns exec peer0 ${OVPN_CLI} get_peer tun0 1 -echo "Querying non-existent peer 10:" -ip netns exec peer0 ${OVPN_CLI} get_peer tun0 10 || true +echo "Querying non-existent peer 20:" +ip netns exec peer0 ${OVPN_CLI} 
get_peer tun0 20 || true echo "Deleting peer 1:" ip netns exec peer0 ${OVPN_CLI} del_peer tun0 1 -ip netns exec peer1 ${OVPN_CLI} del_peer tun1 1 +ip netns exec peer1 ${OVPN_CLI} del_peer tun1 $((1 + ID_OFFSET)) echo "Querying keys:" for p in $(seq 2 ${NUM_PEERS}); do - ip netns exec peer${p} ${OVPN_CLI} get_key tun${p} ${p} 1 - ip netns exec peer${p} ${OVPN_CLI} get_key tun${p} ${p} 2 + ip netns exec peer${p} ${OVPN_CLI} get_key tun${p} \ + $((${p} + ID_OFFSET)) 1 + ip netns exec peer${p} ${OVPN_CLI} get_key tun${p} \ + $((${p} + ID_OFFSET)) 2 done echo "Deleting peer while sending traffic:" @@ -90,28 +130,36 @@ sleep 2 ip netns exec peer0 ${OVPN_CLI} del_peer tun0 2 # following command fails in TCP mode # (both ends get conn reset when one peer disconnects) -ip netns exec peer2 ${OVPN_CLI} del_peer tun2 2 || true +ip netns exec peer2 ${OVPN_CLI} del_peer tun2 $((2 + ID_OFFSET)) || true echo "Deleting keys:" for p in $(seq 3 ${NUM_PEERS}); do - ip netns exec peer${p} ${OVPN_CLI} del_key tun${p} ${p} 1 - ip netns exec peer${p} ${OVPN_CLI} del_key tun${p} ${p} 2 + ip netns exec peer${p} ${OVPN_CLI} del_key tun${p} \ + $((${p} + ID_OFFSET)) 1 + ip netns exec peer${p} ${OVPN_CLI} del_key tun${p} \ + $((${p} + ID_OFFSET)) 2 done echo "Setting timeout to 3s MP:" for p in $(seq 3 ${NUM_PEERS}); do ip netns exec peer0 ${OVPN_CLI} set_peer tun0 ${p} 3 3 || true - ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} ${p} 0 0 + ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} \ + $((${p} + ID_OFFSET)) 0 0 done # wait for peers to timeout sleep 5 echo "Setting timeout to 3s P2P:" for p in $(seq 3 ${NUM_PEERS}); do - ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} ${p} 3 3 + ip netns exec peer${p} ${OVPN_CLI} set_peer tun${p} \ + $((${p} + ID_OFFSET)) 3 3 done sleep 5 +for p in $(seq 0 ${NUM_PEERS}); do + compare_ntfs ${p} +done + cleanup modprobe -r ovpn || true diff --git a/tools/testing/selftests/net/ovpn/udp_peers.txt b/tools/testing/selftests/net/ovpn/udp_peers.txt 
index e9773ddf875c..93de6465353c 100644 --- a/tools/testing/selftests/net/ovpn/udp_peers.txt +++ b/tools/testing/selftests/net/ovpn/udp_peers.txt @@ -1,6 +1,6 @@ -1 10.10.1.1 1 10.10.1.2 1 5.5.5.2 -2 10.10.2.1 1 10.10.2.2 1 5.5.5.3 -3 10.10.3.1 1 10.10.3.2 1 5.5.5.4 -4 fd00:0:0:4::1 1 fd00:0:0:4::2 1 5.5.5.5 -5 fd00:0:0:5::1 1 fd00:0:0:5::2 1 5.5.5.6 -6 fd00:0:0:6::1 1 fd00:0:0:6::2 1 5.5.5.7 +1 10 10.10.1.1 1 10.10.1.2 1 5.5.5.2 +2 11 10.10.2.1 1 10.10.2.2 1 5.5.5.3 +3 12 10.10.3.1 1 10.10.3.2 1 5.5.5.4 +4 13 fd00:0:0:4::1 1 fd00:0:0:4::2 1 5.5.5.5 +5 14 fd00:0:0:5::1 1 fd00:0:0:5::2 1 5.5.5.6 +6 15 fd00:0:0:6::1 1 fd00:0:0:6::2 1 5.5.5.7 diff --git a/tools/testing/selftests/net/packetdrill/tcp_disorder_fin_in_FIN_WAIT.pkt b/tools/testing/selftests/net/packetdrill/tcp_disorder_fin_in_FIN_WAIT.pkt new file mode 100644 index 000000000000..336cbf7815c8 --- /dev/null +++ b/tools/testing/selftests/net/packetdrill/tcp_disorder_fin_in_FIN_WAIT.pkt @@ -0,0 +1,33 @@ +// SPDX-License-Identifier: GPL-2.0 + +// Check fix in 795a7dfbc3d9 ("net: tcp: accept old ack during closing") + +// Set up config. +`./defaults.sh` + +// Initialize a server socket. + 0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 + +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 + +0 bind(3, ..., ...) = 0 + +0 listen(3, 1) = 0 + + +0 < S 0:0(0) win 65535 <mss 1000,sackOK,nop,nop,nop,wscale 7> + * > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> + +0 < . 1:1(0) ack 1 win 257 + + +0 accept(3, ..., ...) = 4 + + +0 shutdown(4, SHUT_WR) = 0 + * > F. 1:1(0) ack 1 + +// We expect to receive one ACK. +// But what happens if a FIN was already in transit and received out-of-order? + + +0 < . 2:2(0) ack 2 win 257 + +// This FIN packet was sent before the prior ACK (see ack 1). + +0 < F. 1:1(0) ack 1 win 257 + +// Even if the FIN is received out-of-order, we should ACK it. + + * > . 
2:2(0) ack 2 diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt index 6c0f32c40f19..12882be10f2e 100644 --- a/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_big_endseq.pkt @@ -36,7 +36,7 @@ +0 read(4, ..., 100000) = 4000 -// If queue is empty, accept a packet even if its end_seq is above wup + rcv_wnd +// If queue is empty, accept a packet even if its end_seq is above rcv_mwnd_seq +0 < P. 4001:54001(50000) ack 1 win 257 * > . 1:1(0) ack 54001 win 0 diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_neg_window.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_neg_window.pkt new file mode 100644 index 000000000000..b9ab264b2a11 --- /dev/null +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_neg_window.pkt @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0 +// Test maximum advertised window limit when rcv_nxt advances past +// rcv_mwnd_seq. The "usable window" must be properly clamped to zero +// rather than becoming negative. + +--mss=1000 + +`./defaults.sh` + +// Establish a connection. + +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 + +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 + +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [20000], 4) = 0 + +0 bind(3, ..., ...) = 0 + +0 listen(3, 1) = 0 + + +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7> + +0 > S. 0:0(0) ack 1 win 18980 <mss 1460,nop,wscale 0> + +.1 < . 1:1(0) ack 1 win 257 + + +0 accept(3, ..., ...) = 4 + +// A too big packet is accepted if the receive queue is empty. It +// does not trigger an immediate ACK. + +0 < P. 1:20001(20000) ack 1 win 257 + +0 %{ assert tcpi_bytes_received == 20000, tcpi_bytes_received; }% + +// Send a RST immediately so that there is no rcv_wup/rcv_mwnd_seq update yet + +0 < R. 20001:20001(0) ack 1 win 257 + +// Verify that the RST was accepted. 
Indirectly this also verifies that no +// immediate ACK was sent for the data packet above. + +0 < . 20001:20001(0) ack 1 win 257 + +0 > R 1:1(0) diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_allowed.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_allowed.pkt new file mode 100644 index 000000000000..6af0e0eb183a --- /dev/null +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_allowed.pkt @@ -0,0 +1,40 @@ +// SPDX-License-Identifier: GPL-2.0 + +--mss=1000 + +`./defaults.sh +sysctl -q net.ipv4.tcp_shrink_window=1 +sysctl -q net.ipv4.tcp_rmem="4096 32768 $((32*1024*1024))"` + + 0 `nstat -n` + +// Establish a connection. + +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 + +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 + +0 bind(3, ..., ...) = 0 + +0 listen(3, 1) = 0 + + +0 < S 0:0(0) win 32792 <mss 1000,nop,wscale 7> + +0 > S. 0:0(0) ack 1 <mss 1460,nop,wscale 10> + +0 < . 1:1(0) ack 1 win 257 + + +0 accept(3, ..., ...) = 4 + + +0 < P. 1:10001(10000) ack 1 win 257 + * > . 1:1(0) ack 10001 win 15 + + +0 < P. 10001:11024(1023) ack 1 win 257 + * > . 1:1(0) ack 11024 win 13 + +// Max window seq advertised 10001 + 15*1024 = 25361, last advertised: 11024 + 13*1024 = 24336 + +// Segment beyond the max window is dropped + +0 < P. 11024:25362(14338) ack 1 win 257 + * > . 1:1(0) ack 11024 win 13 + +// Segment using the max window is accepted + +0 < P. 11024:25361(14337) ack 1 win 257 + * > . 
1:1(0) ack 25361 win 0 + +// Check LINUX_MIB_BEYOND_WINDOW has been incremented once + +0 `nstat | grep TcpExtBeyondWindow | grep -q " 1 "` diff --git a/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt new file mode 100644 index 000000000000..a80eb55dc69a --- /dev/null +++ b/tools/testing/selftests/net/packetdrill/tcp_rcv_wnd_shrink_nomem.pkt @@ -0,0 +1,132 @@ +// SPDX-License-Identifier: GPL-2.0 +// When tcp_receive_window() < tcp_max_receive_window(), tcp_sequence() accepts +// packets that would be dropped under normal conditions (i.e. tcp_receive_window() +// equal to tcp_max_receive_window()). +// Test that such packets are handled as expected for RWIN == 0 and for RWIN > 0. + +--mss=1000 + +`./defaults.sh` + + 0 `nstat -n` + +// Establish a connection. + +0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 + +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 + +0 setsockopt(3, SOL_SOCKET, SO_RCVBUF, [1000000], 4) = 0 + +0 bind(3, ..., ...) = 0 + +0 listen(3, 1) = 0 + + +0 < S 0:0(0) win 32792 <mss 1000,nop,nop,sackOK,nop,wscale 7> + +0 > S. 0:0(0) ack 1 win 65535 <mss 1460,nop,nop,sackOK,nop,wscale 4> + +0 < . 1:1(0) ack 1 win 257 + + +0 accept(3, ..., ...) = 4 + +// Put 1040000 bytes into the receive buffer + +0 < P. 1:65001(65000) ack 1 win 257 + * > . 1:1(0) ack 65001 + +0 < P. 65001:130001(65000) ack 1 win 257 + * > . 1:1(0) ack 130001 + +0 < P. 130001:195001(65000) ack 1 win 257 + * > . 1:1(0) ack 195001 + +0 < P. 195001:260001(65000) ack 1 win 257 + * > . 1:1(0) ack 260001 + +0 < P. 260001:325001(65000) ack 1 win 257 + * > . 1:1(0) ack 325001 + +0 < P. 325001:390001(65000) ack 1 win 257 + * > . 1:1(0) ack 390001 + +0 < P. 390001:455001(65000) ack 1 win 257 + * > . 1:1(0) ack 455001 + +0 < P. 455001:520001(65000) ack 1 win 257 + * > . 1:1(0) ack 520001 + +0 < P. 520001:585001(65000) ack 1 win 257 + * > . 1:1(0) ack 585001 + +0 < P. 
585001:650001(65000) ack 1 win 257 + * > . 1:1(0) ack 650001 + +0 < P. 650001:715001(65000) ack 1 win 257 + * > . 1:1(0) ack 715001 + +0 < P. 715001:780001(65000) ack 1 win 257 + * > . 1:1(0) ack 780001 + +0 < P. 780001:845001(65000) ack 1 win 257 + * > . 1:1(0) ack 845001 + +0 < P. 845001:910001(65000) ack 1 win 257 + * > . 1:1(0) ack 910001 + +0 < P. 910001:975001(65000) ack 1 win 257 + * > . 1:1(0) ack 975001 + +0 < P. 975001:1040001(65000) ack 1 win 257 + * > . 1:1(0) ack 1040001 + +// Trigger an extreme memory squeeze by shrinking SO_RCVBUF + +0 setsockopt(4, SOL_SOCKET, SO_RCVBUF, [16000], 4) = 0 + + +0 < P. 1040001:1105001(65000) ack 1 win 257 + * > . 1:1(0) ack 1040001 win 0 +// Check LINUX_MIB_TCPRCVQDROP has been incremented + +0 `nstat -s | grep TcpExtTCPRcvQDrop| grep -q " 1 "` + +// RWIN == 0: rcv_wup = 1040001, rcv_wnd = 0, rcv_mwnd_seq > 1105001 (significantly larger, typically ~1970000) + +// Accept pure ack with seq in max adv. window + +0 write(4, ..., 1000) = 1000 + +0 > P. 1:1001(1000) ack 1040001 win 0 + +0 < . 1105001:1105001(0) ack 1001 win 257 + +// In order segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_ZEROWINDOW) + +0 < P. 1040001:1041001(1000) ack 1001 win 257 + +0 > . 1001:1001(0) ack 1040001 win 0 +// Ooo partial segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_ZEROWINDOW) + +0 < P. 1039001:1041001(2000) ack 1001 win 257 + +0 > . 1001:1001(0) ack 1040001 win 0 <nop,nop,sack 1039001:1040001> +// Check LINUX_MIB_TCPZEROWINDOWDROP has been incremented twice + +0 `nstat -s | grep TcpExtTCPZeroWindowDrop| grep -q " 2 "` + +// Ooo segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_OVERWINDOW) + +0 < P. 1105001:1106001(1000) ack 1001 win 257 + +0 > . 1001:1001(0) ack 1040001 win 0 +// Ooo segment, beyond max adv. window -> drop (SKB_DROP_REASON_TCP_INVALID_SEQUENCE) + +0 < P. 2000001:2001001(1000) ack 1001 win 257 + +0 > . 
1001:1001(0) ack 1040001 win 0 +// Check LINUX_MIB_BEYOND_WINDOW has been incremented twice + +0 `nstat -s | grep TcpExtBeyondWindow | grep -q " 2 "` + +// Read all data + +0 read(4, ..., 2000000) = 1040000 + * > . 1001:1001(0) ack 1040001 + +// RWIN > 0: rcv_wup = 1040001, 0 < rcv_wnd < 32000, rcv_mwnd_seq > 1105001 (significantly larger, typically ~1970000) + +// Accept pure ack with seq in max adv. window, beyond adv. window + +0 write(4, ..., 1000) = 1000 + +0 > P. 1001:2001(1000) ack 1040001 + +0 < . 1105001:1105001(0) ack 2001 win 257 + +// In order segment, in max adv. window, in adv. window -> accept +// Note: This also ensures that we cannot hit the empty queue exception in tcp_sequence() in the following tests + +0 < P. 1040001:1041001(1000) ack 2001 win 257 + * > . 2001:2001(0) ack 1041001 + +// Ooo partial segment, in adv. window -> accept + +0 < P. 1040001:1042001(2000) ack 2001 win 257 + +0 > . 2001:2001(0) ack 1042001 <nop,nop,sack 1040001:1041001> + +// Ooo segment, in max adv. window, beyond adv. window -> drop (SKB_DROP_REASON_TCP_OVERWINDOW) + +0 < P. 1105001:1106001(1000) ack 2001 win 257 + +0 > . 2001:2001(0) ack 1042001 +// Ooo segment, beyond max adv. window, beyond adv. window -> drop (SKB_DROP_REASON_TCP_INVALID_SEQUENCE) + +0 < P. 2000001:2001001(1000) ack 2001 win 257 + +0 > . 2001:2001(0) ack 1042001 +// Check LINUX_MIB_BEYOND_WINDOW has been incremented twice + +0 `nstat -s | grep TcpExtBeyondWindow | grep -q " 4 "` + +// We are allowed to go beyond the window and buffer with one packet + +0 < P. 1042001:1062001(20000) ack 2001 win 257 + * > . 2001:2001(0) ack 1062001 + +0 < P. 1062001:1082001(20000) ack 2001 win 257 + * > . 2001:2001(0) ack 1082001 win 0 + +// But not more: In order segment, in max adv. window -> drop (SKB_DROP_REASON_TCP_ZEROWINDOW) + +0 < P. 1082001:1083001(1000) ack 2001 win 257 + * > . 
2001:2001(0) ack 1082001 +// Check LINUX_MIB_TCPZEROWINDOWDROP has been incremented again + +0 `nstat -s | grep TcpExtTCPZeroWindowDrop| grep -q " 3 "` diff --git a/tools/testing/selftests/net/ppp/Makefile b/tools/testing/selftests/net/ppp/Makefile new file mode 100644 index 000000000000..b39b0abadde6 --- /dev/null +++ b/tools/testing/selftests/net/ppp/Makefile @@ -0,0 +1,15 @@ +# SPDX-License-Identifier: GPL-2.0 + +top_srcdir = ../../../../.. + +TEST_PROGS := \ + ppp_async.sh \ + pppoe.sh \ +# end of TEST_PROGS + +TEST_FILES := \ + ppp_common.sh \ + pppoe-server-options \ +# end of TEST_FILES + +include ../../lib.mk diff --git a/tools/testing/selftests/net/ppp/config b/tools/testing/selftests/net/ppp/config new file mode 100644 index 000000000000..b45d25c5b970 --- /dev/null +++ b/tools/testing/selftests/net/ppp/config @@ -0,0 +1,9 @@ +CONFIG_IPV6=y +CONFIG_PACKET=y +CONFIG_PPP=m +CONFIG_PPP_ASYNC=m +CONFIG_PPP_BSDCOMP=m +CONFIG_PPP_DEFLATE=m +CONFIG_PPPOE=m +CONFIG_PPPOE_HASH_BITS_4=y +CONFIG_VETH=y diff --git a/tools/testing/selftests/net/ppp/ppp_async.sh b/tools/testing/selftests/net/ppp/ppp_async.sh new file mode 100755 index 000000000000..10f54c8dd0bc --- /dev/null +++ b/tools/testing/selftests/net/ppp/ppp_async.sh @@ -0,0 +1,43 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source ppp_common.sh + +# Temporary files for PTY symlinks +TTY_DIR=$(mktemp -d /tmp/ppp.XXXXXX) +TTY_SERVER="$TTY_DIR"/server +TTY_CLIENT="$TTY_DIR"/client + +# shellcheck disable=SC2329 +cleanup() { + cleanup_all_ns + [ -n "$SOCAT_PID" ] && kill_process "$SOCAT_PID" + rm -fr "$TTY_DIR" +} + +trap cleanup EXIT + +ppp_common_init +modprobe -q ppp_async + +# Create the virtual serial device +socat -d PTY,link="$TTY_SERVER",rawer PTY,link="$TTY_CLIENT",rawer & +SOCAT_PID=$! 
+ +# Wait for symlinks to be created +slowwait 5 [ -L "$TTY_SERVER" ] + +# Start the PPP Server +ip netns exec "$NS_SERVER" pppd "$TTY_SERVER" 115200 \ + "$IP_SERVER":"$IP_CLIENT" \ + local noauth nodefaultroute debug + +# Start the PPP Client +ip netns exec "$NS_CLIENT" pppd "$TTY_CLIENT" 115200 \ + local noauth updetach nodefaultroute debug + +ppp_test_connectivity + +log_test "PPP async" + +exit "$EXIT_STATUS" diff --git a/tools/testing/selftests/net/ppp/ppp_common.sh b/tools/testing/selftests/net/ppp/ppp_common.sh new file mode 100644 index 000000000000..40bbec317039 --- /dev/null +++ b/tools/testing/selftests/net/ppp/ppp_common.sh @@ -0,0 +1,45 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# shellcheck disable=SC2153 + +source ../lib.sh + +IP_SERVER="192.168.200.1" +IP_CLIENT="192.168.200.2" + +ppp_common_init() { + # Package requirements + require_command socat + require_command pppd + require_command iperf3 + + # Check for root privileges + if [ "$(id -u)" -ne 0 ];then + echo "SKIP: Need root privileges" + exit "$ksft_skip" + fi + + # Namespaces + setup_ns NS_SERVER NS_CLIENT +} + +ppp_check_addr() { + dev=$1 + addr=$2 + ns=$3 + ip -netns "$ns" -4 addr show dev "$dev" 2>/dev/null | grep -q "$addr" + return $? +} + +ppp_test_connectivity() { + slowwait 10 ppp_check_addr "ppp0" "$IP_CLIENT" "$NS_CLIENT" + + ip netns exec "$NS_CLIENT" ping -c 3 "$IP_SERVER" + check_err $? + + ip netns exec "$NS_SERVER" iperf3 -s -1 -D + wait_local_port_listen "$NS_SERVER" 5201 tcp + + ip netns exec "$NS_CLIENT" iperf3 -c "$IP_SERVER" -Z -t 2 + check_err $? 
+} diff --git a/tools/testing/selftests/net/ppp/pppoe-server-options b/tools/testing/selftests/net/ppp/pppoe-server-options new file mode 100644 index 000000000000..66c8c9d319e9 --- /dev/null +++ b/tools/testing/selftests/net/ppp/pppoe-server-options @@ -0,0 +1,2 @@ +noauth +noipdefault diff --git a/tools/testing/selftests/net/ppp/pppoe.sh b/tools/testing/selftests/net/ppp/pppoe.sh new file mode 100755 index 000000000000..f67b51df7490 --- /dev/null +++ b/tools/testing/selftests/net/ppp/pppoe.sh @@ -0,0 +1,65 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 + +source ppp_common.sh + +VETH_SERVER="veth-server" +VETH_CLIENT="veth-client" +PPPOE_LOG=$(mktemp /tmp/pppoe.XXXXXX) + +# shellcheck disable=SC2329 +cleanup() { + cleanup_all_ns + [ -n "$SOCAT_PID" ] && kill_process "$SOCAT_PID" + rm -f "$PPPOE_LOG" +} + +trap cleanup EXIT + +require_command pppoe-server +ppp_common_init +modprobe -q pppoe + +# Try to locate pppoe.so plugin +PPPOE_PLUGIN=$(find /usr/{lib,lib64,lib32}/pppd/ -name pppoe.so -type f -print -quit) +if [ -z "$PPPOE_PLUGIN" ]; then + log_test_skip "PPPoE: pppoe.so plugin not found" + exit "$EXIT_STATUS" +fi + +# Create the veth pair +ip link add "$VETH_SERVER" type veth peer name "$VETH_CLIENT" +ip link set "$VETH_SERVER" netns "$NS_SERVER" +ip link set "$VETH_CLIENT" netns "$NS_CLIENT" +ip -netns "$NS_SERVER" link set "$VETH_SERVER" up +ip -netns "$NS_CLIENT" link set "$VETH_CLIENT" up + +# Start socat as syslog listener +socat -v -u UNIX-RECV:/dev/log OPEN:/dev/null > "$PPPOE_LOG" 2>&1 & +SOCAT_PID=$! + +# Start the PPP Server. Note that versions before 4.0 ignore -g option and +# instead use a hardcoded plugin path, so they may fail to find the plugin. 
+ip netns exec "$NS_SERVER" pppoe-server -I "$VETH_SERVER" \ + -L "$IP_SERVER" -R "$IP_CLIENT" -N 1 -q "$(command -v pppd)" \ + -k -O "$(pwd)/pppoe-server-options" -g "$PPPOE_PLUGIN" + +# Start the PPP Client +ip netns exec "$NS_CLIENT" pppd \ + local debug updetach noipdefault noauth nodefaultroute \ + plugin "$PPPOE_PLUGIN" nic-"$VETH_CLIENT" + +ppp_test_connectivity + +log_test "PPPoE" + +# Dump syslog messages if the test failed +if [ "$RET" -ne 0 ]; then + while read -r _sign _date _time len _from _to + do len=${len##*=} + read -n "$len" -r LINE + echo "$LINE" + done < "$PPPOE_LOG" +fi + +exit "$EXIT_STATUS" diff --git a/tools/testing/selftests/net/rds/Makefile b/tools/testing/selftests/net/rds/Makefile index 762845cc973c..fe363be8e358 100644 --- a/tools/testing/selftests/net/rds/Makefile +++ b/tools/testing/selftests/net/rds/Makefile @@ -7,6 +7,7 @@ TEST_PROGS := run.sh TEST_FILES := \ include.sh \ + settings \ test.py \ # end of TEST_FILES diff --git a/tools/testing/selftests/net/rds/README.txt b/tools/testing/selftests/net/rds/README.txt index cbde2951ab13..c6fe003d503b 100644 --- a/tools/testing/selftests/net/rds/README.txt +++ b/tools/testing/selftests/net/rds/README.txt @@ -31,8 +31,11 @@ EXAMPLE: # Alternately create a gcov disabled .config tools/testing/selftests/net/rds/config.sh + # Config paths may also be specified with the -c flag + tools/testing/selftests/net/rds/config.sh -c .config.local + # build the kernel - vng --build --config tools/testing/selftests/net/config + vng --build --config .config # launch the tests in a VM vng -v --rwdir ./ --run . 
--user root --cpus 4 -- \ diff --git a/tools/testing/selftests/net/rds/config b/tools/testing/selftests/net/rds/config new file mode 100644 index 000000000000..97db7ecb892a --- /dev/null +++ b/tools/testing/selftests/net/rds/config @@ -0,0 +1,5 @@ +CONFIG_NET_NS=y +CONFIG_NET_SCH_NETEM=y +CONFIG_RDS=y +CONFIG_RDS_TCP=y +CONFIG_VETH=y diff --git a/tools/testing/selftests/net/rds/config.sh b/tools/testing/selftests/net/rds/config.sh index 791c8dbe1095..29a79314dd60 100755 --- a/tools/testing/selftests/net/rds/config.sh +++ b/tools/testing/selftests/net/rds/config.sh @@ -6,15 +6,20 @@ set -u set -x unset KBUILD_OUTPUT +CONF_FILE="" +FLAGS=() GENERATE_GCOV_REPORT=0 -while getopts "g" opt; do +while getopts "gc:" opt; do case ${opt} in g) GENERATE_GCOV_REPORT=1 ;; + c) + CONF_FILE=$OPTARG + ;; :) - echo "USAGE: config.sh [-g]" + echo "USAGE: config.sh [-g] [-c config]" exit 1 ;; ?) @@ -24,30 +29,32 @@ while getopts "g" opt; do esac done -CONF_FILE="tools/testing/selftests/net/config" +if [[ "$CONF_FILE" != "" ]]; then + FLAGS=(--file "$CONF_FILE") +fi # no modules -scripts/config --file "$CONF_FILE" --disable CONFIG_MODULES +scripts/config "${FLAGS[@]}" --disable CONFIG_MODULES # enable RDS -scripts/config --file "$CONF_FILE" --enable CONFIG_RDS -scripts/config --file "$CONF_FILE" --enable CONFIG_RDS_TCP +scripts/config "${FLAGS[@]}" --enable CONFIG_RDS +scripts/config "${FLAGS[@]}" --enable CONFIG_RDS_TCP if [ "$GENERATE_GCOV_REPORT" -eq 1 ]; then # instrument RDS and only RDS - scripts/config --file "$CONF_FILE" --enable CONFIG_GCOV_KERNEL - scripts/config --file "$CONF_FILE" --disable GCOV_PROFILE_ALL - scripts/config --file "$CONF_FILE" --enable GCOV_PROFILE_RDS + scripts/config "${FLAGS[@]}" --enable CONFIG_GCOV_KERNEL + scripts/config "${FLAGS[@]}" --disable GCOV_PROFILE_ALL + scripts/config "${FLAGS[@]}" --enable GCOV_PROFILE_RDS else - scripts/config --file "$CONF_FILE" --disable CONFIG_GCOV_KERNEL - scripts/config --file "$CONF_FILE" --disable GCOV_PROFILE_ALL 
- scripts/config --file "$CONF_FILE" --disable GCOV_PROFILE_RDS + scripts/config "${FLAGS[@]}" --disable CONFIG_GCOV_KERNEL + scripts/config "${FLAGS[@]}" --disable GCOV_PROFILE_ALL + scripts/config "${FLAGS[@]}" --disable GCOV_PROFILE_RDS fi # need network namespaces to run tests with veth network interfaces -scripts/config --file "$CONF_FILE" --enable CONFIG_NET_NS -scripts/config --file "$CONF_FILE" --enable CONFIG_VETH +scripts/config "${FLAGS[@]}" --enable CONFIG_NET_NS +scripts/config "${FLAGS[@]}" --enable CONFIG_VETH # simulate packet loss -scripts/config --file "$CONF_FILE" --enable CONFIG_NET_SCH_NETEM +scripts/config "${FLAGS[@]}" --enable CONFIG_NET_SCH_NETEM diff --git a/tools/testing/selftests/net/rds/run.sh b/tools/testing/selftests/net/rds/run.sh index 8aee244f582a..897d17d1b8db 100755 --- a/tools/testing/selftests/net/rds/run.sh +++ b/tools/testing/selftests/net/rds/run.sh @@ -19,6 +19,9 @@ if test -f "$build_include"; then build_dir="$mk_build_dir" fi +# Source settings for timeout value (also used by ksft runner) +source "$current_dir"/settings + # This test requires kernel source and the *.gcda data therein # Locate the top level of the kernel source, and the net/rds # subfolder with the appropriate *.gcno object files @@ -194,8 +197,8 @@ set +e echo running RDS tests... echo Traces will be logged to "$TRACE_FILE" rm -f "$TRACE_FILE" -strace -T -tt -o "$TRACE_FILE" python3 "$(dirname "$0")/test.py" --timeout 400 -d "$LOG_DIR" \ - -l "$PLOSS" -c "$PCORRUPT" -u "$PDUP" +strace -T -tt -o "$TRACE_FILE" python3 "$(dirname "$0")/test.py" \ + --timeout "$timeout" -d "$LOG_DIR" -l "$PLOSS" -c "$PCORRUPT" -u "$PDUP" test_rc=$? 
dmesg > "${LOG_DIR}/dmesg.out" diff --git a/tools/testing/selftests/net/rds/settings b/tools/testing/selftests/net/rds/settings new file mode 100644 index 000000000000..d2009a64589c --- /dev/null +++ b/tools/testing/selftests/net/rds/settings @@ -0,0 +1 @@ +timeout=400 diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py index 4a7178d11193..93e23e8b256c 100755 --- a/tools/testing/selftests/net/rds/test.py +++ b/tools/testing/selftests/net/rds/test.py @@ -11,9 +11,8 @@ import signal import socket import subprocess import sys -import atexit -from pwd import getpwuid -from os import stat +import tempfile +import shutil # Allow utils module to be imported from different directory this_dir = os.path.dirname(os.path.realpath(__file__)) @@ -23,45 +22,54 @@ from lib.py.utils import ip libc = ctypes.cdll.LoadLibrary('libc.so.6') setns = libc.setns -net0 = 'net0' -net1 = 'net1' +NET0 = 'net0' +NET1 = 'net1' -veth0 = 'veth0' -veth1 = 'veth1' +VETH0 = 'veth0' +VETH1 = 'veth1' # Helper function for creating a socket inside a network namespace. # We need this because otherwise RDS will detect that the two TCP # sockets are on the same interface and use the loop transport instead # of the TCP transport. 
-def netns_socket(netns, *args): +def netns_socket(netns, *sock_args): + """ + Creates sockets inside of network namespace + + :param netns: the name of the network namespace + :param sock_args: socket family and type + """ u0, u1 = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET) child = os.fork() if child == 0: # change network namespace - with open(f'/var/run/netns/{netns}') as f: + with open(f'/var/run/netns/{netns}', encoding='utf-8') as f: try: - ret = setns(f.fileno(), 0) + setns(f.fileno(), 0) except IOError as e: print(e.errno) print(e) # create socket in target namespace - s = socket.socket(*args) + sock = socket.socket(*sock_args) # send resulting socket to parent - socket.send_fds(u0, [], [s.fileno()]) + socket.send_fds(u0, [], [sock.fileno()]) sys.exit(0) # receive socket from child - _, s, _, _ = socket.recv_fds(u1, 0, 1) + _, fds, _, _ = socket.recv_fds(u1, 0, 1) os.waitpid(child, 0) u0.close() u1.close() - return socket.fromfd(s[0], *args) + return socket.fromfd(fds[0], *sock_args) -def signal_handler(sig, frame): +def signal_handler(_sig, _frame): + """ + Test timed out signal handler + """ print('Test timed out') sys.exit(1) @@ -81,13 +89,13 @@ parser.add_argument('-u', '--duplicate', help="Simulate tcp packet duplication", type=int, default=0) args = parser.parse_args() logdir=args.logdir -packet_loss=str(args.loss)+'%' -packet_corruption=str(args.corruption)+'%' -packet_duplicate=str(args.duplicate)+'%' +PACKET_LOSS=str(args.loss)+'%' +PACKET_CORRUPTION=str(args.corruption)+'%' +PACKET_DUPLICATE=str(args.duplicate)+'%' -ip(f"netns add {net0}") -ip(f"netns add {net1}") -ip(f"link add type veth") +ip(f"netns add {NET0}") +ip(f"netns add {NET1}") +ip("link add type veth") addrs = [ # we technically don't need different port numbers, but this will @@ -99,38 +107,38 @@ addrs = [ # move interfaces to separate namespaces so they can no longer be # bound directly; this prevents rds from switching over from the tcp # transport to the loop 
transport. -ip(f"link set {veth0} netns {net0} up") -ip(f"link set {veth1} netns {net1} up") +ip(f"link set {VETH0} netns {NET0} up") +ip(f"link set {VETH1} netns {NET1} up") # add addresses -ip(f"-n {net0} addr add {addrs[0][0]}/32 dev {veth0}") -ip(f"-n {net1} addr add {addrs[1][0]}/32 dev {veth1}") +ip(f"-n {NET0} addr add {addrs[0][0]}/32 dev {VETH0}") +ip(f"-n {NET1} addr add {addrs[1][0]}/32 dev {VETH1}") # add routes -ip(f"-n {net0} route add {addrs[1][0]}/32 dev {veth0}") -ip(f"-n {net1} route add {addrs[0][0]}/32 dev {veth1}") +ip(f"-n {NET0} route add {addrs[1][0]}/32 dev {VETH0}") +ip(f"-n {NET1} route add {addrs[0][0]}/32 dev {VETH1}") # sanity check that our two interfaces/addresses are correctly set up # and communicating by doing a single ping -ip(f"netns exec {net0} ping -c 1 {addrs[1][0]}") +ip(f"netns exec {NET0} ping -c 1 {addrs[1][0]}") # Start a packet capture on each network -for net in [net0, net1]: - tcpdump_pid = os.fork() - if tcpdump_pid == 0: - pcap = logdir+'/'+net+'.pcap' - subprocess.check_call(['touch', pcap]) - user = getpwuid(stat(pcap).st_uid).pw_name - ip(f"netns exec {net} /usr/sbin/tcpdump -Z {user} -i any -w {pcap}") - sys.exit(0) +tcpdump_procs = [] +for net in [NET0, NET1]: + pcap = logdir+'/'+net+'.pcap' + fd, pcap_tmp = tempfile.mkstemp(suffix=".pcap", prefix=f"{net}-", dir="/tmp") + p = subprocess.Popen( + ['ip', 'netns', 'exec', net, + '/usr/sbin/tcpdump', '-i', 'any', '-w', pcap_tmp]) + tcpdump_procs.append((p, pcap_tmp, pcap, fd)) # simulate packet loss, duplication and corruption -for net, iface in [(net0, veth0), (net1, veth1)]: +for net, iface in [(NET0, VETH0), (NET1, VETH1)]: ip(f"netns exec {net} /usr/sbin/tc qdisc add dev {iface} root netem \ - corrupt {packet_corruption} loss {packet_loss} duplicate \ - {packet_duplicate}") + corrupt {PACKET_CORRUPTION} loss {PACKET_LOSS} duplicate \ + {PACKET_DUPLICATE}") # add a timeout if args.timeout > 0: @@ -138,8 +146,8 @@ if args.timeout > 0: 
signal.signal(signal.SIGALRM, signal_handler) sockets = [ - netns_socket(net0, socket.AF_RDS, socket.SOCK_SEQPACKET), - netns_socket(net1, socket.AF_RDS, socket.SOCK_SEQPACKET), + netns_socket(NET0, socket.AF_RDS, socket.SOCK_SEQPACKET), + netns_socket(NET1, socket.AF_RDS, socket.SOCK_SEQPACKET), ] for s, addr in zip(sockets, addrs): @@ -150,9 +158,7 @@ fileno_to_socket = { s.fileno(): s for s in sockets } -addr_to_socket = { - addr: s for addr, s in zip(addrs, sockets) -} +addr_to_socket = dict(zip(addrs, sockets)) socket_to_addr = { s: addr for addr, s in zip(addrs, sockets) @@ -166,14 +172,14 @@ ep = select.epoll() for s in sockets: ep.register(s, select.EPOLLRDNORM) -n = 50000 +NUM_PACKETS = 50000 nr_send = 0 nr_recv = 0 -while nr_send < n: +while nr_send < NUM_PACKETS: # Send as much as we can without blocking print("sending...", nr_send, nr_recv) - while nr_send < n: + while nr_send < NUM_PACKETS: send_data = hashlib.sha256( f'packet {nr_send}'.encode('utf-8')).hexdigest().encode('utf-8') @@ -212,7 +218,7 @@ while nr_send < n: break # exercise net/rds/tcp.c:rds_tcp_sysctl_reset() - for net in [net0, net1]: + for net in [NET0, NET1]: ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_rcvbuf=10000") ip(f"netns exec {net} /usr/sbin/sysctl net.rds.tcp.rds_tcp_sndbuf=10000") @@ -242,7 +248,11 @@ for s in sockets: print(f"getsockopt(): {nr_success}/{nr_error}") print("Stopping network packet captures") -subprocess.check_call(['killall', '-q', 'tcpdump']) +for p, pcap_tmp, pcap, fd in tcpdump_procs: + p.terminate() + p.wait() + os.close(fd) + shutil.move(pcap_tmp, pcap) # We're done sending and receiving stuff, now let's check if what # we received is what we sent. 
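The capture handling changed in the test.py hunks above can be sketched in isolation: create a temporary file, start one `Popen` per namespace, then `terminate()` and `wait()` each process before `shutil.move()` puts the file in the log directory. This is a minimal sketch, not the patch itself. `start_capture`, `stop_captures` and the `make_argv` callable are hypothetical names introduced here for illustration; in the selftest the command line is `ip netns exec <net> /usr/sbin/tcpdump -i any -w <tmp>`.

```python
import os
import shutil
import subprocess
import sys
import tempfile


def start_capture(make_argv, final_path, tmp_dir=None):
    """Start a writer process whose output goes to a temporary file.

    make_argv is a hypothetical callable that maps the temporary path to
    the argv to execute (for the selftest, the tcpdump command line).
    Returns the same tuple layout the patch stores in tcpdump_procs.
    """
    fd, tmp_path = tempfile.mkstemp(suffix=".pcap", dir=tmp_dir)
    proc = subprocess.Popen(make_argv(tmp_path))
    return proc, tmp_path, final_path, fd


def stop_captures(captures):
    """Stop each writer, reap it, then move its output into place."""
    for proc, tmp_path, final_path, fd in captures:
        proc.terminate()   # signal only the process we started
        proc.wait()        # reap it before touching its output file
        os.close(fd)       # release the descriptor returned by mkstemp()
        shutil.move(tmp_path, final_path)
```

Tracking each `Popen` object means only the processes started by the test are stopped, whereas the removed `killall -q tcpdump` call would also signal unrelated tcpdump instances on the machine.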
diff --git a/tools/testing/selftests/net/reuseport_bpf.c b/tools/testing/selftests/net/reuseport_bpf.c index b6634d6da3d6..12e48b97b862 100644 --- a/tools/testing/selftests/net/reuseport_bpf.c +++ b/tools/testing/selftests/net/reuseport_bpf.c @@ -23,6 +23,7 @@ #include <sys/socket.h> #include <sys/resource.h> #include <unistd.h> +#include <sched.h> #include "kselftest.h" @@ -455,8 +456,18 @@ static __attribute__((destructor)) void main_dtor(void) setrlimit(RLIMIT_MEMLOCK, &rlim_old); } +static void setup_netns(void) +{ + if (unshare(CLONE_NEWNET)) + error(1, errno, "failed to unshare netns"); + if (system("ip link set lo up")) + error(1, 0, "failed to bring up lo interface in netns"); +} + int main(void) { + setup_netns(); + fprintf(stderr, "---- IPv4 UDP ----\n"); /* NOTE: UDP socket lookups traverse a different code path when there * are > 10 sockets in a group. Run the bpf test through both paths. diff --git a/tools/testing/selftests/net/reuseport_bpf_cpu.c b/tools/testing/selftests/net/reuseport_bpf_cpu.c index 2d646174729f..ddfe92f6597a 100644 --- a/tools/testing/selftests/net/reuseport_bpf_cpu.c +++ b/tools/testing/selftests/net/reuseport_bpf_cpu.c @@ -228,10 +228,20 @@ static void test(int *rcv_fd, int len, int family, int proto) close(rcv_fd[cpu]); } +static void setup_netns(void) +{ + if (unshare(CLONE_NEWNET)) + error(1, errno, "failed to unshare netns"); + if (system("ip link set lo up")) + error(1, 0, "failed to bring up lo interface in netns"); +} + int main(void) { int *rcv_fd, cpus; + setup_netns(); + cpus = sysconf(_SC_NPROCESSORS_ONLN); if (cpus <= 0) error(1, errno, "failed counting cpus"); diff --git a/tools/testing/selftests/net/reuseport_bpf_numa.c b/tools/testing/selftests/net/reuseport_bpf_numa.c index 2ffd957ffb15..8ec52fc5ef41 100644 --- a/tools/testing/selftests/net/reuseport_bpf_numa.c +++ b/tools/testing/selftests/net/reuseport_bpf_numa.c @@ -230,10 +230,20 @@ static void test(int *rcv_fd, int len, int family, int proto) 
close(rcv_fd[node]); } +static void setup_netns(void) +{ + if (unshare(CLONE_NEWNET)) + error(1, errno, "failed to unshare netns"); + if (system("ip link set lo up")) + error(1, 0, "failed to bring up lo interface in netns"); +} + int main(void) { int *rcv_fd, nodes; + setup_netns(); + if (numa_available() < 0) ksft_exit_skip("no numa api support\n"); diff --git a/tools/testing/selftests/net/reuseport_dualstack.c b/tools/testing/selftests/net/reuseport_dualstack.c index fb7a59ed759e..0eaf739d0c85 100644 --- a/tools/testing/selftests/net/reuseport_dualstack.c +++ b/tools/testing/selftests/net/reuseport_dualstack.c @@ -25,6 +25,7 @@ #include <sys/types.h> #include <sys/socket.h> #include <unistd.h> +#include <sched.h> static const int PORT = 8888; @@ -156,10 +157,20 @@ static void test(int *rcv_fds, int count, int proto) close(epfd); } +static void setup_netns(void) +{ + if (unshare(CLONE_NEWNET)) + error(1, errno, "failed to unshare netns"); + if (system("ip link set lo up")) + error(1, 0, "failed to bring up lo interface in netns"); +} + int main(void) { int rcv_fds[32], i; + setup_netns(); + fprintf(stderr, "---- UDP IPv4 created before IPv6 ----\n"); build_rcv_fd(AF_INET, SOCK_DGRAM, rcv_fds, 5); build_rcv_fd(AF_INET6, SOCK_DGRAM, &(rcv_fds[5]), 5); diff --git a/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh b/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh index 6a68c7eff1dc..cd7d061e21f8 100755 --- a/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh +++ b/tools/testing/selftests/net/srv6_hencap_red_l3vpn_test.sh @@ -193,6 +193,8 @@ ret=${ksft_skip} nsuccess=0 nfail=0 +HAS_TUNSRC=false + log_test() { local rc="$1" @@ -345,6 +347,17 @@ setup_rt_networking() ip -netns "${nsname}" addr \ add "${net_prefix}::${rt}/64" dev "${devname}" nodad + # A dedicated ::dead:<rt> address (with preferred_lft 0, i.e., + # deprecated) is added when there is support for tunsrc. 
Because + # it is deprecated, the kernel should never auto-select it as + # source with current config. Only an explicit tunsrc can place + # it in the outer header. + if $HAS_TUNSRC; then + ip -netns "${nsname}" addr \ + add "${net_prefix}::dead:${rt}/64" \ + dev "${devname}" nodad preferred_lft 0 + fi + ip -netns "${nsname}" link set "${devname}" up done @@ -420,6 +433,7 @@ setup_rt_local_sids() # to the destination host) # $5 - encap mode (full or red) # $6 - traffic type (IPv6 or IPv4) +# $7 - force tunsrc (true or false) __setup_rt_policy() { local dst="$1" @@ -428,10 +442,46 @@ __setup_rt_policy() local dec_rt="$4" local mode="$5" local traffic="$6" + local with_tunsrc="$7" local nsname local policy='' + local tunsrc='' local n + # Verify the per-route tunnel source address ("tunsrc") feature. + # If it is not supported, fallback on encap config without tunsrc. + if $with_tunsrc && $HAS_TUNSRC; then + local net_prefix + local drule + local nxt + + eval nsname=\${$(get_rtname "${dec_rt}")} + + # Next SRv6 hop: first End router if any, or the decap router + [ -z "${end_rts}" ] && nxt="${dec_rt}" || nxt="${end_rts%% *}" + + # Use the right prefix for tunsrc depending on the next SRv6 hop + net_prefix="$(get_network_prefix "${encap_rt}" "${nxt}")" + tunsrc="tunsrc ${net_prefix}::dead:${encap_rt}" + + # To verify that the outer source address matches the one + # configured with tunsrc, the decap router discards packets + # with any other source address. + ip netns exec "${nsname}" ip6tables -t raw -I PREROUTING 1 \ + -s "${net_prefix}::dead:${encap_rt}" \ + -d "${VPN_LOCATOR_SERVICE}:${dec_rt}::${DT46_FUNC}" \ + -j ACCEPT + + drule="PREROUTING \ + -d ${VPN_LOCATOR_SERVICE}:${dec_rt}::${DT46_FUNC} \ + -j DROP" + + if ! 
ip netns exec "${nsname}" \ + ip6tables -t raw -C ${drule} &>/dev/null; then + ip netns exec "${nsname}" ip6tables -t raw -A ${drule} + fi + fi + eval nsname=\${$(get_rtname "${encap_rt}")} for n in ${end_rts}; do @@ -444,7 +494,7 @@ __setup_rt_policy() if [ "${traffic}" -eq 6 ]; then ip -netns "${nsname}" -6 route \ add "${IPv6_HS_NETWORK}::${dst}" vrf "${VRF_DEVNAME}" \ - encap seg6 mode "${mode}" segs "${policy}" \ + encap seg6 mode "${mode}" ${tunsrc} segs "${policy}" \ dev "${VRF_DEVNAME}" ip -netns "${nsname}" -6 neigh \ @@ -455,7 +505,7 @@ __setup_rt_policy() # received, otherwise the proxy arp does not work. ip -netns "${nsname}" -4 route \ add "${IPv4_HS_NETWORK}.${dst}" vrf "${VRF_DEVNAME}" \ - encap seg6 mode "${mode}" segs "${policy}" \ + encap seg6 mode "${mode}" ${tunsrc} segs "${policy}" \ dev "${VRF_DEVNAME}" fi } @@ -463,13 +513,13 @@ __setup_rt_policy() # see __setup_rt_policy setup_rt_policy_ipv6() { - __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 6 + __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 6 "$6" } #see __setup_rt_policy setup_rt_policy_ipv4() { - __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 4 + __setup_rt_policy "$1" "$2" "$3" "$4" "$5" 4 "$6" } setup_hs() @@ -567,41 +617,41 @@ setup() # the network path between hs-1 and hs-2 traverses several routers # depending on the direction of traffic. # - # Direction hs-1 -> hs-2 (H.Encaps.Red) + # Direction hs-1 -> hs-2 (H.Encaps.Red + tunsrc) # - rt-3,rt-4 (SRv6 End behaviors) # - rt-2 (SRv6 End.DT46 behavior) # # Direction hs-2 -> hs-1 (H.Encaps.Red) # - rt-1 (SRv6 End.DT46 behavior) - setup_rt_policy_ipv6 2 1 "3 4" 2 encap.red - setup_rt_policy_ipv6 1 2 "" 1 encap.red + setup_rt_policy_ipv6 2 1 "3 4" 2 encap.red true + setup_rt_policy_ipv6 1 2 "" 1 encap.red false # create an IPv4 VPN between hosts hs-1 and hs-2 # the network path between hs-1 and hs-2 traverses several routers # depending on the direction of traffic. 
# - # Direction hs-1 -> hs-2 (H.Encaps.Red) + # Direction hs-1 -> hs-2 (H.Encaps.Red + tunsrc) # - rt-2 (SRv6 End.DT46 behavior) # # Direction hs-2 -> hs-1 (H.Encaps.Red) # - rt-4,rt-3 (SRv6 End behaviors) # - rt-1 (SRv6 End.DT46 behavior) - setup_rt_policy_ipv4 2 1 "" 2 encap.red - setup_rt_policy_ipv4 1 2 "4 3" 1 encap.red + setup_rt_policy_ipv4 2 1 "" 2 encap.red true + setup_rt_policy_ipv4 1 2 "4 3" 1 encap.red false # create an IPv6 VPN between hosts hs-3 and hs-4 # the network path between hs-3 and hs-4 traverses several routers # depending on the direction of traffic. # - # Direction hs-3 -> hs-4 (H.Encaps.Red) + # Direction hs-3 -> hs-4 (H.Encaps.Red + tunsrc) # - rt-2 (SRv6 End Behavior) # - rt-4 (SRv6 End.DT46 behavior) # # Direction hs-4 -> hs-3 (H.Encaps.Red) # - rt-1 (SRv6 End behavior) # - rt-3 (SRv6 End.DT46 behavior) - setup_rt_policy_ipv6 4 3 "2" 4 encap.red - setup_rt_policy_ipv6 3 4 "1" 3 encap.red + setup_rt_policy_ipv6 4 3 "2" 4 encap.red true + setup_rt_policy_ipv6 3 4 "1" 3 encap.red false # testing environment was set up successfully SETUP_ERR=0 @@ -809,6 +859,38 @@ test_vrf_or_ksft_skip() fi } +# Before enabling tunsrc tests, make sure tunsrc and ip6tables are supported. +check_tunsrc_support() +{ + setup_ns tunsrc_ns + + ip -netns "${tunsrc_ns}" link add veth0 type veth \ + peer name veth1 netns "${tunsrc_ns}" + + ip -netns "${tunsrc_ns}" link set veth0 up + + if ! ip -netns "${tunsrc_ns}" -6 route add fc00::dead:beef/128 \ + encap seg6 mode encap.red tunsrc fc00::1 segs fc00::2 \ + dev veth0 &>/dev/null; then + cleanup_ns "${tunsrc_ns}" + return + fi + + if ! ip -netns "${tunsrc_ns}" -6 route show | grep -q "tunsrc"; then + cleanup_ns "${tunsrc_ns}" + return + fi + + if ! 
ip netns exec "${tunsrc_ns}" ip6tables -t raw -A PREROUTING \ + -d fc00::dead:beef -j DROP &>/dev/null; then + cleanup_ns "${tunsrc_ns}" + return + fi + + cleanup_ns "${tunsrc_ns}" + HAS_TUNSRC=true +} + if [ "$(id -u)" -ne 0 ]; then echo "SKIP: Need root privileges" exit "${ksft_skip}" @@ -826,6 +908,7 @@ test_vrf_or_ksft_skip set -e trap cleanup EXIT +check_tunsrc_support setup set +e diff --git a/tools/testing/selftests/net/xfrm_state.sh b/tools/testing/selftests/net/xfrm_state.sh new file mode 100755 index 000000000000..f6c54a6496d7 --- /dev/null +++ b/tools/testing/selftests/net/xfrm_state.sh @@ -0,0 +1,613 @@ +#!/bin/bash -e +# SPDX-License-Identifier: GPL-2.0 +# +# xfrm/IPsec tests. +# Currently implemented: +# - ICMP error source address verification (IETF RFC 4301 section 6) +# - ICMP MTU exceeded handling over IPsec tunnels. +# +# Addresses and topology: +# IPv4 prefix 10.1.c.d IPv6 prefix fc00:c::d/64 where c is the segment number +# and d is the interface identifier. +# IPv6 uses the same c:d as IPv4, and start with IPv6 prefix instead ipv4 prefix +# +# Network topology default: ns_set_v4 or ns_set_v6 +# 1.1 1.2 2.1 2.2 3.1 3.2 4.1 4.2 5.1 5.2 6.1 6.2 +# eth0 eth1 eth0 eth1 eth0 eth1 eth0 eth1 eth0 eth1 eth0 eth1 +# a -------- r1 -------- s1 -------- r2 -------- s2 -------- r3 -------- b +# a, b = Alice and Bob hosts without IPsec. +# r1, r2, r3 routers, without IPsec +# s1, s2, IPsec gateways/routers that setup tunnel(s). + +# Network topology x: IPsec gateway that generates ICMP response - ns_set_v4x or ns_set_v6x +# 1.1 1.2 2.1 2.2 3.1 3.2 4.1 4.2 5.1 5.2 +# eth0 eth1 eth0 eth1 eth0 eth1 eth0 eth1 eth0 eth1 +# a -------- r1 -------- s1 -------- r2 -------- s2 -------- b + +. 
lib.sh + +EXIT_ON_TEST_FAIL=no +PAUSE=no +VERBOSE=${VERBOSE:-0} +DEBUG=0 + +# Name Description +tests=" + unreachable_ipv4 IPv4 unreachable from router r3 + unreachable_ipv6 IPv6 unreachable from router r3 + unreachable_gw_ipv4 IPv4 unreachable from IPsec gateway s2 + unreachable_gw_ipv6 IPv6 unreachable from IPsec gateway s2 + mtu_ipv4_s2 IPv4 MTU exceeded from IPsec gateway s2 + mtu_ipv6_s2 IPv6 MTU exceeded from IPsec gateway s2 + mtu_ipv4_r2 IPv4 MTU exceeded from ESP router r2 + mtu_ipv6_r2 IPv6 MTU exceeded from ESP router r2 + mtu_ipv4_r3 IPv4 MTU exceeded from router r3 + mtu_ipv6_r3 IPv6 MTU exceeded from router r3" + +prefix4="10.1" +prefix6="fc00" + +run_cmd_err() { + cmd="$*" + + if [ "$VERBOSE" -gt 0 ]; then + printf " COMMAND: %s\n" "$cmd" + fi + + out="$($cmd 2>&1)" && rc=0 || rc=$? + if [ "$VERBOSE" -gt 1 ] && [ -n "$out" ]; then + echo " $out" + echo + fi + return 0 +} + +run_cmd() { + run_cmd_err "$@" || exit 1 +} + +run_test() { + # If errexit is set, unset it for sub-shell and restore after test + errexit=0 + if [[ $- =~ "e" ]]; then + errexit=1 + set +e + fi + + ( + unset IFS + + # shellcheck disable=SC2030 # fail is read by trap/cleanup within this subshell + fail="yes" + + # Since cleanup() relies on variables modified by this sub shell, + # it has to run in this context. 
+ trap 'log_test_error $?; cleanup' EXIT INT TERM + + if [ "$VERBOSE" -gt 0 ]; then + printf "\n#############################################################\n\n" + fi + + ret=0 + case "${name}" in + # can't use eval and test names shell check will complain about unused code + unreachable_ipv4) test_unreachable_ipv4 ;; + unreachable_ipv6) test_unreachable_ipv6 ;; + unreachable_gw_ipv4) test_unreachable_gw_ipv4 ;; + unreachable_gw_ipv6) test_unreachable_gw_ipv6 ;; + mtu_ipv4_s2) test_mtu_ipv4_s2 ;; + mtu_ipv6_s2) test_mtu_ipv6_s2 ;; + mtu_ipv4_r2) test_mtu_ipv4_r2 ;; + mtu_ipv6_r2) test_mtu_ipv6_r2 ;; + mtu_ipv4_r3) test_mtu_ipv4_r3 ;; + mtu_ipv6_r3) test_mtu_ipv6_r3 ;; + esac + ret=$? + + if [ $ret -eq 0 ]; then + fail="no" + + if [ "$VERBOSE" -gt 1 ]; then + show_icmp_filter + fi + + printf "TEST: %-60s [ PASS ]\n" "${desc}" + elif [ $ret -eq "$ksft_skip" ]; then + fail="no" + printf "TEST: %-60s [SKIP]\n" "${desc}" + fi + + return $ret + ) + ret=$? + + [ $errexit -eq 1 ] && set -e + + case $ret in + 0) + all_skipped=false + [ "$exitcode" -eq "$ksft_skip" ] && exitcode=0 + ;; + "$ksft_skip") + [ $all_skipped = true ] && exitcode=$ksft_skip + ;; + *) + all_skipped=false + exitcode=1 + ;; + esac + + return 0 # don't trigger errexit (-e); actual status in exitcode +} + +setup_namespaces() { + local namespaces="" + + NS_A="" + NS_B="" + NS_R1="" + NS_R2="" + NS_R3="" + NS_S1="" + NS_S2="" + + for ns in ${ns_set}; do + namespaces="$namespaces NS_${ns^^}" + done + + # shellcheck disable=SC2086 # setup_ns expects unquoted list + setup_ns $namespaces + + ns_active= #ordered list of namespaces for this test. 
+ + [ -n "${NS_A}" ] && ns_a=(ip netns exec "${NS_A}") && ns_active="${ns_active} $NS_A" + [ -n "${NS_R1}" ] && ns_active="${ns_active} $NS_R1" + [ -n "${NS_S1}" ] && ns_s1=(ip netns exec "${NS_S1}") && ns_active="${ns_active} $NS_S1" + [ -n "${NS_R2}" ] && ns_r2=(ip netns exec "${NS_R2}") && ns_active="${ns_active} $NS_R2" + [ -n "${NS_S2}" ] && ns_s2=(ip netns exec "${NS_S2}") && ns_active="${ns_active} $NS_S2" + [ -n "${NS_R3}" ] && ns_r3=(ip netns exec "${NS_R3}") && ns_active="${ns_active} $NS_R3" + [ -n "${NS_B}" ] && ns_active="${ns_active} $NS_B" +} + +addr_add() { + local -a ns_cmd=(ip netns exec "$1") + local addr="$2" + local dev="$3" + + run_cmd "${ns_cmd[@]}" ip addr add "${addr}" dev "${dev}" + run_cmd "${ns_cmd[@]}" ip link set up "${dev}" +} + +veth_add() { + local ns=$2 + local pns=$1 + local -a ns_cmd=(ip netns exec "${pns}") + local ln="eth0" + local rn="eth1" + + run_cmd "${ns_cmd[@]}" ip link add "${ln}" type veth peer name "${rn}" netns "${ns}" +} + +show_icmp_filter() { + run_cmd "${ns_r2[@]}" nft list ruleset + echo "$out" +} + +setup_icmp_filter() { + run_cmd "${ns_r2[@]}" nft add table inet filter + run_cmd "${ns_r2[@]}" nft add chain inet filter FORWARD \ + '{ type filter hook forward priority filter; policy drop ; }' + run_cmd "${ns_r2[@]}" nft add rule inet filter FORWARD counter ip protocol esp \ + counter log accept + run_cmd "${ns_r2[@]}" nft add rule inet filter FORWARD counter ip protocol \ + icmp counter log drop + + if [ "$VERBOSE" -gt 0 ]; then + run_cmd "${ns_r2[@]}" nft list ruleset + echo "$out" + fi +} + +setup_icmpv6_filter() { + run_cmd "${ns_r2[@]}" nft add table inet filter + run_cmd "${ns_r2[@]}" nft add chain inet filter FORWARD \ + '{ type filter hook forward priority filter; policy drop ; }' + run_cmd "${ns_r2[@]}" nft add rule inet filter FORWARD ip6 nexthdr \ + ipv6-icmp icmpv6 type echo-request counter log drop + run_cmd "${ns_r2[@]}" nft add rule inet filter FORWARD ip6 nexthdr esp \ + counter log accept + 
run_cmd "${ns_r2[@]}" nft add rule inet filter FORWARD ip6 nexthdr \ + ipv6-icmp icmpv6 type \ + '{nd-neighbor-solicit,nd-neighbor-advert,nd-router-solicit,nd-router-advert}' \ + counter log drop + if [ "$VERBOSE" -gt 0 ]; then + run_cmd "${ns_r2[@]}" nft list ruleset + echo "$out" + fi +} + +set_xfrm_params() { + s1_src=${src} + s1_dst=${dst} + s1_src_net=${src_net} + s1_dst_net=${dst_net} +} + +setup_ns_set_v4() { + ns_set="a r1 s1 r2 s2 r3 b" # Network topology default + imax=$(echo "$ns_set" | wc -w) # number of namespaces in this topology + + src="10.1.3.1" + dst="10.1.4.2" + src_net="10.1.1.0/24" + dst_net="10.1.6.0/24" + + prefix=${prefix4} + prefix_len=24 + s="." + S="." + + set_xfrm_params +} + +setup_ns_set_v4x() { + ns_set="a r1 s1 r2 s2 b" # Network topology: x + imax=$(echo "$ns_set" | wc -w) # number of namespaces in this topology + prefix=${prefix4} + s="." + S="." + src="10.1.3.1" + dst="10.1.4.2" + src_net="10.1.1.0/24" + dst_net="10.1.5.0/24" + prefix_len=24 + + set_xfrm_params +} + +setup_ns_set_v6() { + ns_set="a r1 s1 r2 s2 r3 b" # Network topology default + imax=$(echo "$ns_set" | wc -w) # number of namespaces in this topology + prefix=${prefix6} + s=":" + S="::" + src="fc00:3::1" + dst="fc00:4::2" + src_net="fc00:1::0/64" + dst_net="fc00:6::0/64" + prefix_len=64 + + set_xfrm_params +} + +setup_ns_set_v6x() { + ns_set="a r1 s1 r2 s2 b" # Network topology: x + imax=$(echo "$ns_set" | wc -w) + prefix=${prefix6} + s=":" + S="::" + src="fc00:3::1" + dst="fc00:4::2" + src_net="fc00:1::0/64" + dst_net="fc00:5::0/64" + prefix_len=64 + + set_xfrm_params +} + +setup_network() { + # Create veths and add addresses + local -a ns_cmd + i=1 + p="" + for ns in ${ns_active}; do + ns_cmd=(ip netns exec "${ns}") + + if [ "${i}" -ne 1 ]; then + # Create veth between previous and current namespace + veth_add "${p}" "${ns}" + # Add addresses: previous gets .1 on eth0, current gets .2 on eth1 + addr_add "${p}" "${prefix}${s}$((i-1))${S}1/${prefix_len}" eth0 + 
addr_add "${ns}" "${prefix}${s}$((i-1))${S}2/${prefix_len}" eth1 + fi + + # Enable forwarding + run_cmd "${ns_cmd[@]}" sysctl -q net/ipv4/ip_forward=1 + run_cmd "${ns_cmd[@]}" sysctl -q net/ipv6/conf/all/forwarding=1 + run_cmd "${ns_cmd[@]}" sysctl -q net/ipv6/conf/default/accept_dad=0 + + p=${ns} + i=$((i + 1)) + done + + # Add routes (needs all addresses to exist first) + i=1 + for ns in ${ns_active}; do + ns_cmd=(ip netns exec "${ns}") + + # Forward routes to networks beyond this node + if [ "${i}" -ne "${imax}" ]; then + nhf="${prefix}${s}${i}${S}2" # nexthop forward + for j in $(seq $((i + 1)) "${imax}"); do + run_cmd "${ns_cmd[@]}" ip route replace \ + "${prefix}${s}${j}${S}0/${prefix_len}" via "${nhf}" + done + fi + + # Reverse routes to networks before this node + if [ "${i}" -gt 1 ]; then + nhr="${prefix}${s}$((i-1))${S}1" # nexthop reverse + for j in $(seq 1 $((i - 2))); do + run_cmd "${ns_cmd[@]}" ip route replace \ + "${prefix}${s}${j}${S}0/${prefix_len}" via "${nhr}" + done + fi + + i=$((i + 1)) + done +} + +setup_xfrm_mode() { + local MODE=${1:-tunnel} + if [ "${MODE}" != "tunnel" ] && [ "${MODE}" != "beet" ]; then + echo "xfrm mode ${MODE} not supported" + log_test_error + return 1 + fi + + run_cmd "${ns_s1[@]}" ip xfrm policy add src "${s1_src_net}" dst "${s1_dst_net}" dir out \ + tmpl src "${s1_src}" dst "${s1_dst}" proto esp reqid 1 mode "${MODE}" + + # no "input" policies. 
we are only doing forwarding so far + + run_cmd "${ns_s1[@]}" ip xfrm policy add src "${s1_dst_net}" dst "${s1_src_net}" dir fwd \ + flag icmp tmpl src "${s1_dst}" dst "${s1_src}" proto esp reqid 2 mode "${MODE}" + + run_cmd "${ns_s1[@]}" ip xfrm state add src "${s1_src}" dst "${s1_dst}" proto esp spi 1 \ + reqid 1 mode "${MODE}" aead 'rfc4106(gcm(aes))' \ + 0x1111111111111111111111111111111111111111 96 \ + sel src "${s1_src_net}" dst "${s1_dst_net}" dir out + + run_cmd "${ns_s1[@]}" ip xfrm state add src "${s1_dst}" dst "${s1_src}" proto esp spi 2 \ + reqid 2 flag icmp replay-window 8 mode "${MODE}" aead 'rfc4106(gcm(aes))' \ + 0x2222222222222222222222222222222222222222 96 \ + sel src "${s1_dst_net}" dst "${s1_src_net}" dir in + + run_cmd "${ns_s2[@]}" ip xfrm policy add src "${s1_dst_net}" dst "${s1_src_net}" dir out \ + flag icmp tmpl src "${s1_dst}" dst "${s1_src}" proto esp reqid 2 mode "${MODE}" + + run_cmd "${ns_s2[@]}" ip xfrm policy add src "${s1_src_net}" dst "${s1_dst_net}" dir fwd \ + tmpl src "${s1_src}" dst "${s1_dst}" proto esp reqid 1 mode "${MODE}" + + run_cmd "${ns_s2[@]}" ip xfrm state add src "${s1_dst}" dst "${s1_src}" proto esp spi 2 \ + reqid 2 mode "${MODE}" aead 'rfc4106(gcm(aes))' \ + 0x2222222222222222222222222222222222222222 96 \ + sel src "${s1_dst_net}" dst "${s1_src_net}" dir out + + run_cmd "${ns_s2[@]}" ip xfrm state add src "${s1_src}" dst "${s1_dst}" proto esp spi 1 \ + reqid 1 flag icmp replay-window 8 mode "${MODE}" aead 'rfc4106(gcm(aes))' \ + 0x1111111111111111111111111111111111111111 96 \ + sel src "${s1_src_net}" dst "${s1_dst_net}" dir in +} + +setup_xfrm() { + setup_xfrm_mode tunnel +} + +setup() { + [ "$(id -u)" -ne 0 ] && echo " need to run as root" && return "$ksft_skip" + + for arg; do + case "${arg}" in + ns_set_v4) setup_ns_set_v4 ;; + ns_set_v4x) setup_ns_set_v4x ;; + ns_set_v6) setup_ns_set_v6 ;; + ns_set_v6x) setup_ns_set_v6x ;; + namespaces) setup_namespaces ;; + network) setup_network ;; + xfrm) setup_xfrm ;; + 
icmp_filter) setup_icmp_filter ;; + icmpv6_filter) setup_icmpv6_filter ;; + *) echo " ${arg} not supported"; return 1 ;; + esac || return 1 + done +} + +# shellcheck disable=SC2317 # called via trap +pause() { + echo + echo "Pausing. Hit enter to continue" + read -r _ +} + +# shellcheck disable=SC2317 # called via trap +log_test_error() { + # shellcheck disable=SC2031 # fail is set in subshell, read via trap + if [ "${fail}" = "yes" ] && [ -n "${desc}" ]; then + if [ "$VERBOSE" -gt 0 ]; then + show_icmp_filter + fi + printf "TEST: %-60s [ FAIL ] %s\n" "${desc}" "${name}" + [ -n "${cmd}" ] && printf '%s\n\n' "${cmd}" + [ -n "${out}" ] && printf '%s\n\n' "${out}" + fi +} + +# shellcheck disable=SC2317 # called via trap +cleanup() { + # shellcheck disable=SC2031 # fail is set in subshell, read via trap + [[ "$PAUSE" = "always" || ( "$PAUSE" = "fail" && "$fail" = "yes" ) ]] && pause + cleanup_all_ns + # shellcheck disable=SC2031 # fail is set in subshell, read via trap + [ "${EXIT_ON_TEST_FAIL}" = "yes" ] && [ "${fail}" = "yes" ] && exit 1 +} + +test_unreachable_ipv6() { + setup ns_set_v6 namespaces network xfrm icmpv6_filter || return "$ksft_skip" + run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:6::2 + run_cmd_err "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:6::3 + rc=0 + echo -e "$out" | grep -q -E 'From fc00:5::2 icmp_seq.* Destination' || rc=1 + return "${rc}" +} + +test_unreachable_gw_ipv6() { + setup ns_set_v6x namespaces network xfrm icmpv6_filter || return "$ksft_skip" + run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:5::2 + run_cmd_err "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:5::3 + rc=0 + echo -e "$out" | grep -q -E 'From fc00:4::2 icmp_seq.* Destination' || rc=1 + return "${rc}" +} + +test_unreachable_ipv4() { + setup ns_set_v4 namespaces network icmp_filter xfrm || return "$ksft_skip" + run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.6.2 + run_cmd_err "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.6.3 + rc=0 + echo -e "$out" | grep -q -E 'From 10.1.5.2 icmp_seq.* Destination' || 
rc=1
+	return "${rc}"
+}
+
+test_unreachable_gw_ipv4() {
+	setup ns_set_v4x namespaces network icmp_filter xfrm || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.5.2
+	run_cmd_err "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.5.3
+	rc=0
+	echo -e "$out" | grep -q -E 'From 10.1.4.2 icmp_seq.* Destination' || rc=1
+	return "${rc}"
+}
+
+test_mtu_ipv4_r2() {
+	setup ns_set_v4 namespaces network icmp_filter xfrm || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.6.2
+	run_cmd "${ns_r2[@]}" ip route replace 10.1.3.0/24 dev eth1 src 10.1.3.2 mtu 1300
+	run_cmd "${ns_r2[@]}" ip route replace 10.1.4.0/24 dev eth0 src 10.1.4.1 mtu 1300
+	# shellcheck disable=SC1010 # -M do: do = dont-fragment, not shell keyword
+	run_cmd "${ns_a[@]}" ping -M do -s 1300 -W 5 -w 4 -c 1 10.1.6.2 || true
+	rc=0
+	echo -e "$out" | grep -q -E "From 10.1.2.2 icmp_seq=.* Frag needed and DF set" || rc=1
+	return "${rc}"
+}
+
+test_mtu_ipv6_r2() {
+	setup ns_set_v6 namespaces network xfrm icmpv6_filter || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:6::2
+	run_cmd "${ns_r2[@]}" ip -6 route replace fc00:3::/64 \
+		dev eth1 metric 256 src fc00:3::2 mtu 1300
+	run_cmd "${ns_r2[@]}" ip -6 route replace fc00:4::/64 \
+		dev eth0 metric 256 src fc00:4::1 mtu 1300
+	# shellcheck disable=SC1010 # -M do: do = dont-fragment, not shell keyword
+	run_cmd "${ns_a[@]}" ping -M do -s 1300 -W 5 -w 4 -c 1 fc00:6::2 || true
+	rc=0
+	echo -e "$out" | grep -q -E "From fc00:2::2 icmp_seq=.* Packet too big: mtu=1230" || rc=1
+	return "${rc}"
+}
+
+test_mtu_ipv4_r3() {
+	setup ns_set_v4 namespaces network icmp_filter xfrm || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.6.2
+	run_cmd "${ns_r3[@]}" ip route replace 10.1.6.0/24 dev eth0 mtu 1300
+	# shellcheck disable=SC1010 # -M do: do = dont-fragment, not shell keyword
+	run_cmd "${ns_a[@]}" ping -M do -s 1350 -W 5 -w 4 -c 1 10.1.6.2 || true
+	rc=0
+	echo -e "$out" | grep -q -E "From 10.1.5.2 .* Frag needed and DF set \(mtu = 1300\)" || rc=1
+	return "${rc}"
+}
+
+test_mtu_ipv4_s2() {
+	setup ns_set_v4x namespaces network icmp_filter xfrm || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 10.1.5.2
+	run_cmd "${ns_s2[@]}" ip route replace 10.1.5.0/24 dev eth0 src 10.1.5.1 mtu 1300
+	# shellcheck disable=SC1010 # -M do: do = dont-fragment, not shell keyword
+	run_cmd "${ns_a[@]}" ping -M do -s 1350 -W 5 -w 4 -c 1 10.1.5.2 || true
+	rc=0
+	echo -e "$out" | grep -q -E "From 10.1.4.2.*Frag needed and DF set \(mtu = 1300\)" || rc=1
+	return "${rc}"
+}
+
+test_mtu_ipv6_s2() {
+	setup ns_set_v6x namespaces network xfrm icmpv6_filter || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:5::2
+	run_cmd "${ns_s2[@]}" ip -6 route replace fc00:5::/64 dev eth0 metric 256 mtu 1300
+	# shellcheck disable=SC1010 # -M do: do = dont-fragment, not shell keyword
+	run_cmd "${ns_a[@]}" ping -M do -s 1350 -W 5 -w 4 -c 1 fc00:5::2 || true
+	rc=0
+	echo -e "$out" | grep -q -E "From fc00:4::2.*Packet too big: mtu=1300" || rc=1
+	return "${rc}"
+}
+
+test_mtu_ipv6_r3() {
+	setup ns_set_v6 namespaces network xfrm icmpv6_filter || return "$ksft_skip"
+	run_cmd "${ns_a[@]}" ping -W 5 -w 4 -c 1 fc00:6::2
+	run_cmd "${ns_r3[@]}" ip -6 route replace fc00:6::/64 dev eth1 metric 256 mtu 1300
+	# shellcheck disable=SC1010 # -M do: do = dont-fragment, not shell keyword
+	run_cmd "${ns_a[@]}" ping -M do -s 1300 -W 5 -w 4 -c 1 fc00:6::2 || true
+	rc=0
+	echo -e "$out" | grep -q -E "From fc00:5::2 icmp_seq=.* Packet too big: mtu=1300" || rc=1
+	return "${rc}"
+}
+
+################################################################################
+#
+usage() {
+	echo
+	echo "$0 [OPTIONS] [TEST]..."
+	echo "If no TEST argument is given, all tests will be run."
+	echo
+	echo -e "\t-p Pause on fail. Namespaces are kept for diagnostics"
+	echo -e "\t-P Pause after the test. Namespaces are kept for diagnostics"
+	echo -e "\t-v Verbose output. Show commands; -vv Show output and nft rules also"
+	echo "Available tests${tests}"
+	exit 1
+}
+
+################################################################################
+#
+exitcode=0
+all_skipped=true
+out=
+cmd=
+
+while getopts :epPv o; do
+	case $o in
+	e) EXIT_ON_TEST_FAIL=yes ;;
+	P) PAUSE=always ;;
+	p) PAUSE=fail ;;
+	v) VERBOSE=$((VERBOSE + 1)) ;;
+	*) usage ;;
+	esac
+done
+shift $((OPTIND - 1))
+
+IFS=$'\t\n'
+
+for arg; do
+	# Check first that all requested tests are available before running any
+	command -v "test_${arg}" >/dev/null || {
+		echo "=== Test ${arg} not found"
+		usage
+	}
+done
+
+name=""
+desc=""
+fail="no"
+
+for t in ${tests}; do
+	[ "${name}" = "" ] && name="${t}" && continue
+	[ "${desc}" = "" ] && desc="${t}"
+
+	run_this=1
+	for arg; do
+		[ "${arg}" = "${name}" ] && run_this=1 && break
+		run_this=0
+	done
+	if [ $run_this -eq 1 ]; then
+		run_test
+	fi
+	name=""
+	desc=""
+done
+
+exit ${exitcode}
diff --git a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
index 1e5efb2a31eb..eefadd0546d3 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/infra/qdiscs.json
@@ -570,7 +570,7 @@
         "cmdUnderTest": "$TC class change dev $DUMMY parent 3:0 classid 3:1 hfsc sc m1 5Mbit d 10ms m2 10Mbit",
         "expExitCode": "0",
         "verifyCmd": "$TC -s qdisc show dev $DUMMY",
-        "matchPattern": "qdisc hfsc 3:.*parent 1:2.*default 1",
+        "matchPattern": "qdisc hfsc 3:.*parent 1:2.*default 0x1",
         "matchCount": "1",
         "teardown": [
             "$TC qdisc del dev $DUMMY handle 1:0 root",
diff --git a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/ets.json b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/ets.json
index a5d94cdec605..ee09e6d6fdf3 100644
--- a/tools/testing/selftests/tc-testing/tc-tests/qdiscs/ets.json
+++ b/tools/testing/selftests/tc-testing/tc-tests/qdiscs/ets.json
@@ -984,5 +984,28 @@
         "matchCount": "1",
         "teardown": [
         ]
+    },
+    {
+        "id": "41f5",
+        "name": "ETS offload where the sum of quanta wraps u32",
+        "category": [
+            "qdisc",
+            "ets"
+        ],
+        "plugins": {
+            "requires": "nsPlugin"
+        },
+        "setup": [
+            "echo \"1 1 4\" > /sys/bus/netdevsim/new_device",
+            "$ETHTOOL -K $ETH hw-tc-offload on"
+        ],
+        "cmdUnderTest": "$TC qdisc add dev $ETH root ets quanta 4294967294 1 1",
+        "expExitCode": "0",
+        "verifyCmd": "$TC qdisc show dev $ETH",
+        "matchPattern": "qdisc ets .*bands 3 quanta 4294967294 1 1",
+        "matchCount": "1",
+        "teardown": [
+            "echo \"1\" > /sys/bus/netdevsim/del_device"
+        ]
     }
 ]
diff --git a/tools/testing/selftests/tc-testing/tdc.py b/tools/testing/selftests/tc-testing/tdc.py
index 4f255cec0c22..81b4ac3f050c 100755
--- a/tools/testing/selftests/tc-testing/tdc.py
+++ b/tools/testing/selftests/tc-testing/tdc.py
@@ -755,6 +755,9 @@ def check_default_settings(args, remaining, pm):
         NAMES['DEV2'] = args.device
     if 'TIMEOUT' not in NAMES:
         NAMES['TIMEOUT'] = None
+    if 'ETHTOOL' in NAMES and not os.path.isfile(NAMES['ETHTOOL']):
+        print(f"The specified ethtool path {NAMES['ETHTOOL']} does not exist.")
+        exit(1)
     if not os.path.isfile(NAMES['TC']):
         print("The specified tc path " + NAMES['TC'] + " does not exist.")
         exit(1)
diff --git a/tools/testing/selftests/tc-testing/tdc_config.py b/tools/testing/selftests/tc-testing/tdc_config.py
index ccb0f06ef9e3..9488b03cbc2c 100644
--- a/tools/testing/selftests/tc-testing/tdc_config.py
+++ b/tools/testing/selftests/tc-testing/tdc_config.py
@@ -17,6 +17,7 @@ NAMES = {
           'DEV1': 'v0p1',
           'DEV2': '',
           'DUMMY': 'dummy1',
+          'ETHTOOL': '/usr/sbin/ethtool',
           'ETH': 'eth0',
           'BATCH_FILE': './batch.txt',
           'BATCH_DIR': 'tmp',
diff --git a/tools/testing/selftests/tc-testing/tdc_helper.py b/tools/testing/selftests/tc-testing/tdc_helper.py
index bc447ca57333..adb52fe3acc1 100644
--- a/tools/testing/selftests/tc-testing/tdc_helper.py
+++ b/tools/testing/selftests/tc-testing/tdc_helper.py
@@ -16,9 +16,9 @@ def get_categorized_testlist(alltests, ucat):
 def get_unique_item(lst):
-    """ For a list, return a list of the unique items in the list. """
+    """Return unique items while preserving original order."""
     if len(lst) > 1:
-        return list(set(lst))
+        return list(dict.fromkeys(lst))
     else:
         return lst
diff --git a/tools/testing/selftests/vsock/vmtest.sh b/tools/testing/selftests/vsock/vmtest.sh
index 86e338886b33..d97913a6bdc7 100755
--- a/tools/testing/selftests/vsock/vmtest.sh
+++ b/tools/testing/selftests/vsock/vmtest.sh
@@ -42,6 +42,8 @@ readonly KERNEL_CMDLINE="\
 	virtme.ssh virtme_ssh_channel=tcp virtme_ssh_user=$USER \
 "
 readonly LOG=$(mktemp /tmp/vsock_vmtest_XXXX.log)
+readonly TEST_HOME="$(mktemp -d /tmp/vmtest_home_XXXX)"
+readonly SSH_KEY_PATH="${TEST_HOME}"/.ssh/id_ed25519
 # Namespace tests must use the ns_ prefix. This is checked in check_netns() and
 # is used to determine if a test needs namespace setup before test execution.
@@ -257,7 +259,12 @@ vm_ssh() {
 	shift
-	${ns_exec} ssh -q -o UserKnownHostsFile=/dev/null -p "${SSH_HOST_PORT}" localhost "$@"
+	${ns_exec} ssh -q \
+		-i "${SSH_KEY_PATH}" \
+		-o UserKnownHostsFile=/dev/null \
+		-o StrictHostKeyChecking=no \
+		-p "${SSH_HOST_PORT}" \
+		localhost "$@"
 	return $?
 }
@@ -265,6 +272,7 @@ vm_ssh() {
 cleanup() {
 	terminate_pidfiles "${!PIDFILES[@]}"
 	del_namespaces
+	rm -rf "${TEST_HOME}"
 }
 check_args() {
@@ -382,6 +390,12 @@ handle_build() {
 	popd &>/dev/null
 }
+setup_home() {
+	mkdir -p "$(dirname "${SSH_KEY_PATH}")"
+	ssh-keygen -t ed25519 -f "${SSH_KEY_PATH}" -N "" -q
+	cp "${VSOCK_TEST}" "${TEST_HOME}"/vsock_test
+}
+
 create_pidfile() {
 	local pidfile
@@ -415,6 +429,19 @@ terminate_pids() {
 	done
 }
+vng_dry_run() {
+	# WORKAROUND: use setsid to work around a virtme-ng bug where vng hangs
+	# when called from a background process group (e.g., under make
+	# kselftest). vng save/restores terminal settings using tcsetattr(),
+	# which is not allowed for background process groups because the
+	# controlling terminal is owned by the foreground process group. vng is
+	# stopped with SIGTTOU and hangs until kselftest's timer expires.
+	# setsid works around this by launching vng in a new session that has
+	# no controlling terminal, so tcsetattr() succeeds.
+
+	setsid -w vng --run "$@" --dry-run &>/dev/null
+}
+
 vm_start() {
 	local pidfile=$1
 	local ns=$2
@@ -441,6 +468,12 @@ vm_start() {
 	if [[ "${BUILD}" -eq 1 ]]; then
 		kernel_opt="${KERNEL_CHECKOUT}"
+	elif vng_dry_run; then
+		kernel_opt=""
+	elif vng_dry_run "${KERNEL_CHECKOUT}"; then
+		kernel_opt="${KERNEL_CHECKOUT}"
+	else
+		die "No suitable kernel found"
 	fi
 	if [[ "${ns}" != "init_ns" ]]; then
@@ -451,11 +484,14 @@ vm_start() {
 		--run \
 		${kernel_opt} \
 		${verbose_opt} \
+		--rwdir=/root="${TEST_HOME}" \
+		--force-9p \
+		--cwd /root \
 		--qemu-opts="${qemu_opts}" \
 		--qemu="${qemu}" \
 		--user root \
 		--append "${KERNEL_CMDLINE}" \
-		--rw &> ${logfile} &
+		&> ${logfile} &
 	timeout "${WAIT_QEMU}" \
 		bash -c 'while [[ ! -s '"${pidfile}"' ]]; do sleep 1; done; exit 0'
@@ -585,7 +621,7 @@ vm_vsock_test() {
 	# log output and use pipefail to respect vsock_test errors
 	set -o pipefail
 	if [[ "${host}" != server ]]; then
-		vm_ssh "${ns}" -- "${VSOCK_TEST}" \
+		vm_ssh "${ns}" -- ./vsock_test \
 			--mode=client \
 			--control-host="${host}" \
 			--peer-cid="${cid}" \
@@ -593,7 +629,7 @@ vm_vsock_test() {
 			2>&1 | log_guest
 		rc=$?
 	else
-		vm_ssh "${ns}" -- "${VSOCK_TEST}" \
+		vm_ssh "${ns}" -- ./vsock_test \
 			--mode=server \
 			--peer-cid="${cid}" \
 			--control-port="${port}" \
@@ -1532,6 +1568,7 @@ check_deps
 check_vng
 check_socat
 handle_build
+setup_home
 echo "1..${#ARGS[@]}" |
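
The tdc_helper.py hunk above swaps `list(set(lst))` for `list(dict.fromkeys(lst))`. A minimal sketch of the difference follows; the function names `unique_unordered` and `unique_ordered` and the sample list are hypothetical and are not part of the patch. It relies only on the language guarantee that dicts keep insertion order (Python 3.7 and later), whereas set iteration order is unspecified.

```python
def unique_unordered(lst):
    """Pre-change behaviour: dedupe through a set; result order is unspecified."""
    return list(set(lst)) if len(lst) > 1 else lst


def unique_ordered(lst):
    """Post-change behaviour: dict keys keep first-seen insertion order."""
    return list(dict.fromkeys(lst)) if len(lst) > 1 else lst


# Hypothetical sample input, loosely modelled on test category names.
categories = ["qdisc", "ets", "qdisc", "hfsc", "ets"]

ordered = unique_ordered(categories)
print(ordered)  # ['qdisc', 'ets', 'hfsc']

# Both variants return the same members; only the ordering guarantee differs.
print(sorted(unique_unordered(categories)) == sorted(ordered))  # True
```

With string items the set-based order can change from one interpreter run to the next because of hash randomization, so the `dict.fromkeys` form presumably keeps listings derived from this helper stable across runs.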
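
The new ets.json case is named "ETS offload where the sum of quanta wraps u32" and uses quanta `4294967294 1 1`. The arithmetic behind that name can be checked with the short sketch below. It only models an unsigned 32-bit accumulator; it does not reproduce any kernel or netdevsim code, and the names `U32_MASK`, `true_sum` and `wrapped` are illustrative assumptions.

```python
U32_MASK = 0xFFFFFFFF  # largest value representable in an unsigned 32-bit integer

# Quanta taken from the test's cmdUnderTest line.
quanta = [4294967294, 1, 1]

true_sum = sum(quanta)         # exact sum using Python's unbounded integers
wrapped = true_sum & U32_MASK  # what a u32 accumulator would hold after overflow

print(true_sum)  # 4294967296
print(wrapped)   # 0
```

Each quantum fits in a u32 on its own, but the three together add up to exactly 2^32, so a 32-bit sum wraps to zero. Any code that uses such a sum as a divisor or as a scaling base would need to handle that case, which is presumably what the test guards against.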
