linux/net
Ilya Dryomov 67645d7619 libceph: fix ceph_msg_revoke()
There are a number of problems with revoking a "was sending" message:

(1) We never make any attempt to revoke data - only kvecs contibute to
con->out_skip.  However, once the header (envelope) is written to the
socket, our peer learns data_len and sets itself to expect at least
data_len bytes to follow front or front+middle.  If ceph_msg_revoke()
is called while the messenger is sending message's data portion,
anything we send after that call is counted by the OSD towards the now
revoked message's data portion.  The effects vary, the most common one
is the eventual hang - higher layers get stuck waiting for the reply to
the message that was sent out after ceph_msg_revoke() returned and
treated by the OSD as a bunch of data bytes.  This is what Matt ran
into.

(2) Flat out zeroing con->out_kvec_bytes worth of bytes to handle kvecs
is wrong.  If ceph_msg_revoke() is called before the tag is sent out or
while the messenger is sending the header, we will get a connection
reset, either due to a bad tag (0 is not a valid tag) or a bad header
CRC, which kind of defeats the purpose of revoke.  Currently the kernel
client refuses to work with header CRCs disabled, but that will likely
change in the future, making this even worse.

(3) con->out_skip is not reset on connection reset, leading to one or
more spurious connection resets if we happen to get a real one between
con->out_skip is set in ceph_msg_revoke() and before it's cleared in
write_partial_skip().

Fixing (1) and (3) is trivial.  The idea behind fixing (2) is to never
zero the tag or the header, i.e. send out tag+header regardless of when
ceph_msg_revoke() is called.  That way the header is always correct, no
unnecessary resets are induced and revoke stands ready for disabled
CRCs.  Since ceph_msg_revoke() rips out con->out_msg, introduce a new
"message out temp" and copy the header into it before sending.

Cc: stable@vger.kernel.org # 4.0+
Reported-by: Matt Conner <matt.conner@keepertech.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Tested-by: Matt Conner <matt.conner@keepertech.com>
Reviewed-by: Sage Weil <sage@redhat.com>
2016-01-21 19:36:08 +01:00
..
6lowpan 6lowpan: put mcast compression in an own function 2015-10-21 00:49:25 +02:00
9p IB/cma: Add support for network namespaces 2015-10-28 12:32:48 -04:00
802
8021q vlan: Do not put vlan headers back on bridge and macvlan ports 2015-11-17 14:38:35 -05:00
appletalk
atm
ax25 net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
batman-adv batman-adv: Fix invalid stack access in batadv_dat_select_candidates 2015-12-07 22:40:21 +08:00
bluetooth bluetooth: Validate socket address length in sco_sock_bind(). 2015-12-15 15:39:08 -05:00
bridge bridge: Only call /sbin/bridge-stp for the initial network namespace 2016-01-05 16:46:17 -05:00
caif net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
can can: avoid using timeval for uapi 2015-10-13 17:42:34 +02:00
ceph libceph: fix ceph_msg_revoke() 2016-01-21 19:36:08 +01:00
core net: possible use after free in dst_release 2016-01-06 15:00:27 -05:00
dcb net/dcb: make dcbnl.c explicitly non-modular 2015-10-09 07:52:27 -07:00
dccp ipv6: kill sk_dst_lock 2015-12-03 11:32:06 -05:00
decnet net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
dns_resolver net: dns_resolver: convert time_t to time64_t 2015-11-18 16:27:46 -05:00
dsa net: dsa: use switchdev obj for VLAN add/del ops 2015-11-01 15:56:11 -05:00
ethernet net: help compiler generate better code in eth_get_headlen 2015-09-28 22:51:15 -07:00
hsr net/hsr: fix a warning message 2015-11-23 14:56:15 -05:00
ieee802154 net: fix percpu memory leaks 2015-11-02 22:47:14 -05:00
ipv4 tcp: fix zero cwnd in tcp_cwnd_reduction 2016-01-06 16:39:56 -05:00
ipv6 ipv6: honor ifindex in case we receive ll addresses in router advertisements 2015-12-23 22:03:54 -05:00
ipx
irda net: add validation for the socket syscall protocol argument 2015-12-14 16:09:30 -05:00
iucv net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
key af_key: fix two typos 2015-10-23 03:05:19 -07:00
l2tp ipv6: add complete rcu protection around np->opt 2015-12-02 23:37:16 -05:00
l3mdev net: Add netif_is_l3_slave 2015-10-07 04:27:43 -07:00
lapb
llc
mac80211 mac80211: handle width changes from opmode notification IE in beacon 2015-12-15 13:16:47 +01:00
mac802154 mac802154: llsec: use kzfree 2015-10-21 00:49:24 +02:00
mpls mpls: make via address optional for multipath routes 2015-12-12 00:43:44 -05:00
netfilter netfilter: nft_ct: include direction when dumping NFT_CT_L3PROTOCOL key 2015-12-18 14:45:45 +01:00
netlabel
netlink mm, page_alloc: distinguish between being unable to sleep, unwilling to sleep and avoiding waking kswapd 2015-11-06 17:50:42 -08:00
netrom
nfc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
openvswitch openvswitch: Fix template leak in error cases. 2015-12-29 15:27:52 -05:00
packet packet: Allow packets with only a header (but no payload) 2015-11-29 22:17:17 -05:00
phonet
rds RDS: fix race condition when sending a message on unbound socket 2015-11-24 17:20:09 -05:00
rfkill rfkill: copy the name into the rfkill struct 2015-12-10 10:37:51 +01:00
rose
rxrpc net: rename SOCK_ASYNC_NOSPACE and SOCK_ASYNC_WAITDATA 2015-12-01 15:45:05 -05:00
sched net: sched: fix missing free per cpu on qstats 2016-01-06 01:40:21 -05:00
sctp sctp: sctp should release assoc when sctp_make_abort_user return NULL in sctp_close 2015-12-30 16:57:16 -05:00
sunrpc sched/wait: Fix the signal handling fix 2015-12-13 14:30:59 -08:00
switchdev switchdev: respect SKIP_EOPNOTSUPP flag in case there is no recursion 2015-11-03 13:39:21 -05:00
tipc tipc: fix error handling of expanding buffer headroom 2015-11-24 11:26:19 -05:00
unix af_unix: Fix splice-bind deadlock 2016-01-04 23:22:49 -05:00
vmw_vsock VSOCK: call sk->sk_data_ready() on accept() 2015-11-04 22:03:10 -05:00
wimax
wireless nl80211: Fix potential memory leak in nl80211_connect 2015-12-15 13:11:26 +01:00
x25
xfrm Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec 2015-12-22 16:26:31 -05:00
compat.c
Kconfig net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
Makefile net: Introduce L3 Master device abstraction 2015-09-29 20:40:32 -07:00
socket.c net, socket, socket_wq: fix missing initialization of flags 2015-12-30 16:38:01 -05:00
sysctl_net.c net: sysctl: fix a kmemleak warning 2015-10-23 06:22:08 -07:00