diff options
Diffstat (limited to 'Documentation/networking')
-rw-r--r-- | Documentation/networking/devlink/devlink-params.rst | 3 | ||||
-rw-r--r-- | Documentation/networking/ethtool-netlink.rst | 10 | ||||
-rw-r--r-- | Documentation/networking/net_failover.rst | 111 | ||||
-rw-r--r-- | Documentation/networking/phy.rst | 5 |
4 files changed, 102 insertions, 27 deletions
diff --git a/Documentation/networking/devlink/devlink-params.rst b/Documentation/networking/devlink/devlink-params.rst index 4878907e9232..b7dfe693a332 100644 --- a/Documentation/networking/devlink/devlink-params.rst +++ b/Documentation/networking/devlink/devlink-params.rst @@ -109,6 +109,9 @@ own name. - Boolean - When enabled, the device driver will instantiate VDPA networking specific auxiliary device of the devlink device. + * - ``enable_iwarp`` + - Boolean + - Enable handling of iWARP traffic in the device. * - ``internal_err_reset`` - Boolean - When enabled, the device driver will reset the device on internal diff --git a/Documentation/networking/ethtool-netlink.rst b/Documentation/networking/ethtool-netlink.rst index 7b598c7e3912..9d98e0511249 100644 --- a/Documentation/networking/ethtool-netlink.rst +++ b/Documentation/networking/ethtool-netlink.rst @@ -849,7 +849,7 @@ Request contents: Kernel response contents: - ==================================== ====== ========================== + ==================================== ====== =========================== ``ETHTOOL_A_RINGS_HEADER`` nested reply header ``ETHTOOL_A_RINGS_RX_MAX`` u32 max size of RX ring ``ETHTOOL_A_RINGS_RX_MINI_MAX`` u32 max size of RX mini ring @@ -859,7 +859,8 @@ Kernel response contents: ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring - ==================================== ====== ========================== + ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring + ==================================== ====== =========================== RINGS_SET @@ -869,13 +870,14 @@ Sets ring sizes like ``ETHTOOL_SRINGPARAM`` ioctl request. Request contents: - ==================================== ====== ========================== + ==================================== ====== =========================== ``ETHTOOL_A_RINGS_HEADER`` nested reply header ``ETHTOOL_A_RINGS_RX`` u32 size of RX ring ``ETHTOOL_A_RINGS_RX_MINI`` u32 size of RX mini ring ``ETHTOOL_A_RINGS_RX_JUMBO`` u32 size of RX jumbo ring ``ETHTOOL_A_RINGS_TX`` u32 size of TX ring - ==================================== ====== ========================== + ``ETHTOOL_A_RINGS_RX_BUF_LEN`` u32 size of buffers on the ring + ==================================== ====== =========================== Kernel checks that requested ring sizes do not exceed limits reported by driver. Driver may impose additional constraints and may not suspport all diff --git a/Documentation/networking/net_failover.rst b/Documentation/networking/net_failover.rst index e143ab79a960..3a662f2b4d6e 100644 --- a/Documentation/networking/net_failover.rst +++ b/Documentation/networking/net_failover.rst @@ -35,7 +35,7 @@ To support this, the hypervisor needs to enable VIRTIO_NET_F_STANDBY feature on the virtio-net interface and assign the same MAC address to both virtio-net and VF interfaces. -Here is an example XML snippet that shows such configuration. +Here is an example libvirt XML snippet that shows such configuration: :: <interface type='network'> @@ -45,18 +45,32 @@ Here is an example XML snippet that shows such configuration. <model type='virtio'/> <driver name='vhost' queues='4'/> <link state='down'/> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> + <teaming type='persistent'/> + <alias name='ua-backup0'/> </interface> <interface type='hostdev' managed='yes'> <mac address='52:54:00:00:12:53'/> <source> <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/> </source> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/> + <teaming type='transient' persistent='ua-backup0'/> </interface> +In this configuration, the first device definition is for the virtio-net +interface and this acts as the 'persistent' device indicating that this +interface will always be plugged in. This is specified by the 'teaming' tag with +required attribute type having value 'persistent'. The link state for the +virtio-net device is set to 'down' to ensure that the 'failover' netdev prefers +the VF passthrough device for normal communication. The virtio-net device will +be brought UP during live migration to allow uninterrupted communication. + +The second device definition is for the VF passthrough interface. Here the +'teaming' tag is provided with type 'transient' indicating that this device may +periodically be unplugged. A second attribute - 'persistent' is provided and +points to the alias name declared for the virtio-net device. + Booting a VM with the above configuration will result in the following 3 -netdevs created in the VM. +interfaces created in the VM: :: 4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 @@ -65,13 +79,36 @@ netdevs created in the VM. valid_lft 42482sec preferred_lft 42482sec inet6 fe80::97d8:db2:8c10:b6d6/64 scope link valid_lft forever preferred_lft forever - 5: ens10nsby: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master ens10 state UP group default qlen 1000 + 5: ens10nsby: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master ens10 state DOWN group default qlen 1000 link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff 7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000 link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff -ens10 is the 'failover' master netdev, ens10nsby and ens11 are the slave -'standby' and 'primary' netdevs respectively. +Here, ens10 is the 'failover' master interface, ens10nsby is the slave 'standby' +virtio-net interface, and ens11 is the slave 'primary' VF passthrough interface. + +One point to note here is that some user space network configuration daemons +like systemd-networkd, ifupdown, etc, do not understand the 'net_failover' +device; and on the first boot, the VM might end up with both 'failover' device +and VF accquiring IP addresses (either same or different) from the DHCP server. +This will result in lack of connectivity to the VM. So some tweaks might be +needed to these network configuration daemons to make sure that an IP is +received only on the 'failover' device. + +Below is the patch snippet used with 'cloud-ifupdown-helper' script found on +Debian cloud images: + +:: + @@ -27,6 +27,8 @@ do_setup() { + local working="$cfgdir/.$INTERFACE" + local final="$cfgdir/$INTERFACE" + + + if [ -d "/sys/class/net/${INTERFACE}/master" ]; then exit 0; fi + + + if ifup --no-act "$INTERFACE" > /dev/null 2>&1; then + # interface is already known to ifupdown, no need to generate cfg + log "Skipping configuration generation for $INTERFACE" + Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode ================================================================== @@ -80,40 +117,68 @@ net_failover also enables hypervisor controlled live migration to be supported with VMs that have direct attached SR-IOV VF devices by automatic failover to the paravirtual datapath when the VF is unplugged. -Here is a sample script that shows the steps to initiate live migration on -the source hypervisor. +Here is a sample script that shows the steps to initiate live migration from +the source hypervisor. Note: It is assumed that the VM is connected to a +software bridge 'br0' which has a single VF attached to it along with the vnet +device to the VM. This is not the VF that was passthrough'd to the VM (seen in +the vf.xml file). :: - # cat vf_xml + # cat vf.xml <interface type='hostdev' managed='yes'> <mac address='52:54:00:00:12:53'/> <source> <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/> </source> - <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/> + <teaming type='transient' persistent='ua-backup0'/> </interface> - # Source Hypervisor + # Source Hypervisor migrate.sh #!/bin/bash - DOMAIN=fedora27-tap01 - PF=enp66s0f0 - VF_NUM=5 - TAP_IF=tap01 - VF_XML= + DOMAIN=vm-01 + PF=ens6np0 + VF=ens6v1 # VF attached to the bridge. + VF_NUM=1 + TAP_IF=vmtap01 # virtio-net interface in the VM. + VF_XML=vf.xml MAC=52:54:00:00:12:53 ZERO_MAC=00:00:00:00:00:00 + # Set the virtio-net interface up. virsh domif-setlink $DOMAIN $TAP_IF up - bridge fdb del $MAC dev $PF master - virsh detach-device $DOMAIN $VF_XML + + # Remove the VF that was passthrough'd to the VM. + virsh detach-device --live --config $DOMAIN $VF_XML + ip link set $PF vf $VF_NUM mac $ZERO_MAC - virsh migrate --live $DOMAIN qemu+ssh://$REMOTE_HOST/system + # Add FDB entry for traffic to continue going to the VM via + # the VF -> br0 -> vnet interface path. + bridge fdb add $MAC dev $VF + bridge fdb add $MAC dev $TAP_IF master + + # Migrate the VM + virsh migrate --live --persistent $DOMAIN qemu+ssh://$REMOTE_HOST/system + + # Clean up FDB entries after migration completes. + bridge fdb del $MAC dev $VF + bridge fdb del $MAC dev $TAP_IF master - # Destination Hypervisor +On the destination hypervisor, a shared bridge 'br0' is created before migration +starts, and a VF from the destination PF is added to the bridge. Similarly an +appropriate FDB entry is added. + +The following script is executed on the destination hypervisor once migration +completes, and it reattaches the VF to the VM and brings down the virtio-net +interface. + +:: + # reattach-vf.sh #!/bin/bash - virsh attach-device $DOMAIN $VF_XML - virsh domif-setlink $DOMAIN $TAP_IF down + bridge fdb del 52:54:00:00:12:53 dev ens36v0 + bridge fdb del 52:54:00:00:12:53 dev vmtap01 master + virsh attach-device --config --live vm01 vf.xml + virsh domif-setlink vm01 vmtap01 down diff --git a/Documentation/networking/phy.rst b/Documentation/networking/phy.rst index 571ba08386e7..d43da709bf40 100644 --- a/Documentation/networking/phy.rst +++ b/Documentation/networking/phy.rst @@ -237,6 +237,11 @@ negotiation results. Some of the interface modes are described below: +``PHY_INTERFACE_MODE_SMII`` + This is serial MII, clocked at 125MHz, supporting 100M and 10M speeds. + Some details can be found in + https://opencores.org/ocsvn/smii/smii/trunk/doc/SMII.pdf + ``PHY_INTERFACE_MODE_1000BASEX`` This defines the 1000BASE-X single-lane serdes link as defined by the 802.3 standard section 36. The link operates at a fixed bit rate of |