linux/net
John Fastabend 77162022ab net: add generic PF_BRIDGE:RTM_ FDB hooks
This adds two new flags NTF_MASTER and NTF_SELF that can
now be used to specify where PF_BRIDGE netlink commands should
be sent. NTF_MASTER sends the commands to the 'dev->master'
device for parsing. Typically this will be the Linux net/bridge
or an open-vswitch device. When no flags are set, the command is
also handled by the master device, so current user space tools
continue to work as expected.

The NTF_SELF flag will push the PF_BRIDGE commands to the
device. In the basic example below the commands are then parsed
and programmed in the embedded bridge.

Note that if both the NTF_SELF and NTF_MASTER bits are set, the
command will be sent to both 'dev->master' and 'dev'. This allows
user space to easily keep the embedded bridge and software bridge
in sync.
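
The dispatch rules above can be summarised in a short sketch. This
is an illustrative model, not the kernel code: fdb_targets() is a
hypothetical helper, and only the two flag values are taken from the
kernel's neighbour header.

```python
# Flag values as defined in the kernel's linux/neighbour.h.
NTF_SELF = 0x02
NTF_MASTER = 0x04


def fdb_targets(ndm_flags):
    """Return which devices receive a PF_BRIDGE FDB command.

    Hypothetical helper that models the rules described in the text.
    """
    targets = []
    # No flags at all behaves like NTF_MASTER, so that existing
    # user space tools keep working unchanged.
    if ndm_flags == 0 or ndm_flags & NTF_MASTER:
        targets.append("master")
    # NTF_SELF pushes the command to the device itself.
    if ndm_flags & NTF_SELF:
        targets.append("self")
    return targets
```

With both bits set the helper returns both targets, which is the case
that keeps the embedded and software bridges in sync.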

There is a slight complication when an error occurs with both
flags set. To resolve this, the rtnl handler clears the NTF_ flag
of each operation that completed successfully in the netlink ack.
The add/del handlers abort as soon as any error occurs.
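
A minimal sketch of that error handling, assuming the master is tried
before the device itself (the ordering is an assumption, as are the
fdb_add() name and its callable arguments). A flag bit that is still
set in the returned value marks a target that did not complete.

```python
NTF_SELF = 0x02
NTF_MASTER = 0x04


def fdb_add(ndm_flags, add_master, add_self):
    """Run the add handlers and return (err, remaining_flags).

    add_master and add_self are hypothetical callables that return 0
    on success or a negative errno on failure.
    """
    flags = ndm_flags
    if flags == 0 or flags & NTF_MASTER:
        err = add_master()
        if err:
            return err, flags       # abort on the first error
        flags &= ~NTF_MASTER        # master completed, clear its bit
    if flags & NTF_SELF:
        err = add_self()
        if err:
            return err, flags       # NTF_SELF stays set: not done
        flags &= ~NTF_SELF          # device completed, clear its bit
    return 0, flags
```

For example, if the master succeeds and the device fails, the caller
sees the error together with only NTF_SELF still set.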

To support this, new net device ops were added to call into
the device, and the existing bridging code was refactored
to use these. No changes should be required in user space
to support the current bridge behavior.

A basic setup with a SR-IOV enabled NIC looks like this:

          veth0  veth2
            |      |
          ------------
          |  bridge0 |   <---- software bridging
          ------------
               /
               /
  ethx.y      ethx
    VF         PF
     \         \          <---- propagate FDB entries to HW
     \         \
  --------------------
  |  Embedded Bridge |    <---- hardware offloaded switching
  --------------------

In this case the embedded bridge must be managed to allow 'veth0'
to communicate with 'ethx.y' correctly. At present drivers managing
the embedded bridge either send frames onto the network which
then get dropped by the switch OR the embedded bridge will flood
these frames. With this patch we have a mechanism to manage the
embedded bridge correctly from user space. This example is specific
to SR-IOV but replacing the VF with another PF or dropping this
into the DSA framework generates similar management issues.

Example session using the 'br'[1] tool to add, dump and then
delete a MAC address with a new "embedded" option and an ixgbe
driver with this support enabled:

# br fdb add 22:35:19:ac:60:59 dev eth3
# br fdb
port    mac addr                flags
veth0   22:35:19:ac:60:58       static
veth0   9a:5f:81:f7:f6:ec       local
eth3    00:1b:21:55:23:59       local
eth3    22:35:19:ac:60:59       static
veth0   22:35:19:ac:60:57       static
# br fdb add 22:35:19:ac:60:59 embedded dev eth3
# br fdb
port    mac addr                flags
veth0   22:35:19:ac:60:58       static
veth0   9a:5f:81:f7:f6:ec       local
eth3    00:1b:21:55:23:59       local
eth3    22:35:19:ac:60:59       static
veth0   22:35:19:ac:60:57       static
eth3    22:35:19:ac:60:59       local embedded
# br fdb del 22:35:19:ac:60:59 embedded dev eth3

The only change to 'br' was a couple of lines to set the flags
correctly. In my opinion the merit of this patch is that embedded and
SW bridges can now both be modeled correctly in user space using very
nearly the same message passing.
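
As a rough illustration of how small the user space change can be,
the sketch below packs the fixed 'struct ndmsg' header of a PF_BRIDGE
request and sets NTF_SELF when an "embedded" option is given. The
mapping of "embedded" to NTF_SELF and the build_ndmsg() helper are
assumptions made for this example, not code from the 'br' tool. The
netlink header and the attribute carrying the MAC address are left
out.

```python
import struct

PF_BRIDGE = 7          # address family used for bridge netlink
NUD_PERMANENT = 0x80   # a neighbour state value from linux/neighbour.h
NTF_SELF = 0x02
NTF_MASTER = 0x04

# struct ndmsg: family (u8), pad1 (u8), pad2 (u16), ifindex (s32),
# state (u16), flags (u8), type (u8); 12 bytes in total.
NDMSG = struct.Struct("=BBHiHBB")


def build_ndmsg(ifindex, state=NUD_PERMANENT, embedded=False, master=False):
    """Pack the ndmsg header for an FDB request (hypothetical helper)."""
    flags = 0
    if embedded:
        flags |= NTF_SELF      # assumed meaning of the "embedded" option
    if master:
        flags |= NTF_MASTER
    return NDMSG.pack(PF_BRIDGE, 0, 0, ifindex, state, flags, 0)
```

Leaving both options off yields ndm_flags of zero, which the kernel
side treats as a request for the master device.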

[1] 'br' tool was published as an RFC here and will be renamed 'bridge'
    http://patchwork.ozlabs.org/patch/117664/

Thanks to Jamal Hadi Salim, Stephen Hemminger and Ben Hutchings for
valuable feedback, suggestions, and review.

v2: fixed api descriptions and error case with both NTF_SELF and
    NTF_MASTER set plus updated patch description.

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2012-04-15 13:06:04 -04:00
9p net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
802 net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
8021q vlan: Stop using NLA_PUT*(). 2012-04-02 04:33:44 -04:00
appletalk net: remove k{un}map_skb_frag() 2012-04-05 05:36:43 -04:00
atm net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ax25 net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
batman-adv net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
bluetooth Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth 2012-04-09 15:47:49 -04:00
bridge net: add generic PF_BRIDGE:RTM_ FDB hooks 2012-04-15 13:06:04 -04:00
caif net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
can
ceph net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
core net: add generic PF_BRIDGE:RTM_ FDB hooks 2012-04-15 13:06:04 -04:00
dcb net/dcb: Add an optional max rate attribute 2012-04-05 05:08:04 -04:00
dccp net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
decnet net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dns_resolver net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
dsa
econet Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
ethernet net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ieee802154 net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ipv4 net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ipv6 net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
ipx
irda net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
iucv Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux 2012-03-22 18:15:32 -07:00
key net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
l2tp net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
lapb Remove all #inclusions of asm/system.h 2012-03-28 18:30:03 +01:00
llc net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
mac80211 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless 2012-04-12 13:49:28 -04:00
netfilter net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
netlabel netlabel: use GFP flags from caller instead of GFP_ATOMIC 2012-03-22 19:29:57 -04:00
netlink Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2012-04-10 14:30:45 -04:00
netrom net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
nfc net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
openvswitch net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
packet net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
phonet net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
rds RDS: use gfp flags from caller in conn_alloc() 2012-03-22 19:29:58 -04:00
rfkill device.h: cleanup users outside of linux/include (C files) 2012-03-11 14:27:37 -04:00
rose net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
rxrpc net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
sched pkt_sched: Stop using NLA_PUT*(). 2012-04-01 18:11:37 -04:00
sctp net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
sunrpc net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
tipc net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
unix net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
wanrouter
wimax net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
wireless net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
x25 net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
xfrm net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
compat.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
Kconfig
Makefile
nonet.c
socket.c net: cleanup unsigned to unsigned int 2012-04-15 12:44:40 -04:00
sysctl_net.c sysctl: Modify __register_sysctl_paths to take a set instead of a root and an nsproxy 2012-01-24 16:40:30 -08:00