<feed xmlns='http://www.w3.org/2005/Atom'>
<title>BMC/Intel-BMC/linux.git/net/netlink, branch v4.2</title>
<subtitle>Intel OpenBMC Linux kernel source tree (mirror)</subtitle>
<id>https://git.radix-linux.su/BMC/Intel-BMC/linux.git/atom?h=v4.2</id>
<link rel='self' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/atom?h=v4.2'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/'/>
<updated>2015-08-23T23:04:46+00:00</updated>
<entry>
<title>netlink: mmap: fix tx type check</title>
<updated>2015-08-23T23:04:46+00:00</updated>
<author>
<name>Ken-ichirou MATSUZAWA</name>
<email>chamaken@gmail.com</email>
</author>
<published>2015-08-20T03:43:53+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=c953e23936f902c7719166327e3113639105c981'/>
<id>urn:sha1:c953e23936f902c7719166327e3113639105c981</id>
<content type='text'>
I can't send a netlink message via an mmaped netlink socket since

    commit: a8866ff6a5bce7d0ec465a63bc482a85c09b0d39
    netlink: make the check for "send from tx_ring" deterministic

msg-&gt;msg_iter.type is set to WRITE (1) on the

    SYSCALL_DEFINE6(sendto, ...
        import_single_range(WRITE, ...
            iov_iter_init(1, WRITE, ...

call path, so the type must be checked with iter_is_iovec(), which
accepts the WRITE direction bit.
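
The effect of the direction bit can be sketched in userspace C, assuming
the earlier check compared the type for equality with the iovec kind.
The SKETCH_* constants and both helpers are hypothetical stand-ins; their
values are assumed to mirror the kernel's iov_iter type flags and are not
taken from this commit:

<![CDATA[
```c
/* Hypothetical stand-ins for the kernel's iov_iter type flags (assumed values). */
enum {
    SKETCH_WRITE      = 1, /* direction bit, as passed to iov_iter_init() */
    SKETCH_ITER_IOVEC = 0,
    SKETCH_ITER_KVEC  = 2,
    SKETCH_ITER_BVEC  = 4,
};

/* A plain equality test fails once the direction bit is OR-ed in. */
int sketch_type_equals_iovec(int type)
{
    return type == SKETCH_ITER_IOVEC;
}

/* Testing only the kind bits accepts an iovec in either direction. */
int sketch_is_iovec(int type)
{
    return !(type & (SKETCH_ITER_KVEC | SKETCH_ITER_BVEC));
}
```
]]>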

Signed-off-by: Ken-ichirou MATSUZAWA &lt;chamas@h4.dion.ne.jp&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: make sure -EBUSY won't escape from netlink_insert</title>
<updated>2015-08-10T17:59:10+00:00</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-08-06T22:26:41+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=4e7c1330689e27556de407d3fdadc65ffff5eb12'/>
<id>urn:sha1:4e7c1330689e27556de407d3fdadc65ffff5eb12</id>
<content type='text'>
Linus reports the following deadlock on rtnl_mutex; triggered only
once so far (extract):

[12236.694209] NetworkManager  D 0000000000013b80     0  1047      1 0x00000000
[12236.694218]  ffff88003f902640 0000000000000000 ffffffff815d15a9 0000000000000018
[12236.694224]  ffff880119538000 ffff88003f902640 ffffffff81a8ff84 00000000ffffffff
[12236.694230]  ffffffff81a8ff88 ffff880119c47f00 ffffffff815d133a ffffffff81a8ff80
[12236.694235] Call Trace:
[12236.694250]  [&lt;ffffffff815d15a9&gt;] ? schedule_preempt_disabled+0x9/0x10
[12236.694257]  [&lt;ffffffff815d133a&gt;] ? schedule+0x2a/0x70
[12236.694263]  [&lt;ffffffff815d15a9&gt;] ? schedule_preempt_disabled+0x9/0x10
[12236.694271]  [&lt;ffffffff815d2c3f&gt;] ? __mutex_lock_slowpath+0x7f/0xf0
[12236.694280]  [&lt;ffffffff815d2cc6&gt;] ? mutex_lock+0x16/0x30
[12236.694291]  [&lt;ffffffff814f1f90&gt;] ? rtnetlink_rcv+0x10/0x30
[12236.694299]  [&lt;ffffffff8150ce3b&gt;] ? netlink_unicast+0xfb/0x180
[12236.694309]  [&lt;ffffffff814f5ad3&gt;] ? rtnl_getlink+0x113/0x190
[12236.694319]  [&lt;ffffffff814f202a&gt;] ? rtnetlink_rcv_msg+0x7a/0x210
[12236.694331]  [&lt;ffffffff8124565c&gt;] ? sock_has_perm+0x5c/0x70
[12236.694339]  [&lt;ffffffff814f1fb0&gt;] ? rtnetlink_rcv+0x30/0x30
[12236.694346]  [&lt;ffffffff8150d62c&gt;] ? netlink_rcv_skb+0x9c/0xc0
[12236.694354]  [&lt;ffffffff814f1f9f&gt;] ? rtnetlink_rcv+0x1f/0x30
[12236.694360]  [&lt;ffffffff8150ce3b&gt;] ? netlink_unicast+0xfb/0x180
[12236.694367]  [&lt;ffffffff8150d344&gt;] ? netlink_sendmsg+0x484/0x5d0
[12236.694376]  [&lt;ffffffff810a236f&gt;] ? __wake_up+0x2f/0x50
[12236.694387]  [&lt;ffffffff814cad23&gt;] ? sock_sendmsg+0x33/0x40
[12236.694396]  [&lt;ffffffff814cb05e&gt;] ? ___sys_sendmsg+0x22e/0x240
[12236.694405]  [&lt;ffffffff814cab75&gt;] ? ___sys_recvmsg+0x135/0x1a0
[12236.694415]  [&lt;ffffffff811a9d12&gt;] ? eventfd_write+0x82/0x210
[12236.694423]  [&lt;ffffffff811a0f9e&gt;] ? fsnotify+0x32e/0x4c0
[12236.694429]  [&lt;ffffffff8108cb70&gt;] ? wake_up_q+0x60/0x60
[12236.694434]  [&lt;ffffffff814cba09&gt;] ? __sys_sendmsg+0x39/0x70
[12236.694440]  [&lt;ffffffff815d4797&gt;] ? entry_SYSCALL_64_fastpath+0x12/0x6a

The recursive call into rtnetlink_rcv() looks suspicious. One way this
could trigger is if the sender's NETLINK_CB(skb).portid was wrongly 0
(which is the rtnetlink socket), so that the answer to the rtnl_getlink()
request would be sent to the kernel instead of the actual user process,
thus grabbing rtnl_mutex twice.

One theory is that netlink_autobind(), triggered via netlink_sendmsg(),
internally overwrites an -EBUSY error with 0, where that error wrongly
originates from __netlink_insert(). That would reset the
socket's portid to 0, which is then filled into NETLINK_CB(skb).portid
later on. As commit d470e3b483dc ("[NETLINK]: Fix two socket hashing bugs.")
also puts it, -EBUSY should not be propagated from netlink_insert().

This looks very hard to reproduce. We would need to trigger the
rhashtable_insert_rehash() handler while a rehash is already in
progress (one /rare/ way would be to hit ht-&gt;elasticity limits
while the table is not filled enough to expand, but that would
require a specifically crafted bind() sequence with knowledge about
destination slots, which seems unlikely). It probably makes sense to guard
__netlink_insert() in any case and remap that error. It was suggested
that EOVERFLOW might be better than an already overloaded ENOMEM.
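
The remapping idea can be sketched in userspace C as follows. The
function name is hypothetical and this is not the patch itself; it only
shows -EBUSY being translated to -EOVERFLOW while other results pass
through unchanged:

<![CDATA[
```c
#include <errno.h>

/* Hypothetical wrapper: translate the result of an insert attempt so that
 * -EBUSY never reaches the caller; every other value passes through. */
int sketch_remap_insert_error(int err)
{
    if (err == -EBUSY)
        return -EOVERFLOW;
    return err;
}
```
]]>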

Reference: http://thread.gmane.org/gmane.linux.network/372676
Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Acked-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: don't hold mutex in rcu callback when releasing mmapd ring</title>
<updated>2015-07-22T05:22:56+00:00</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2015-07-21T14:33:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=0470eb99b4721586ccac954faac3fa4472da0845'/>
<id>urn:sha1:0470eb99b4721586ccac954faac3fa4472da0845</id>
<content type='text'>
Kirill A. Shutemov says:

This simple test case triggers a few locking asserts in the kernel:

int main(int argc, char **argv)
{
        unsigned int block_size = 16 * 4096;
        struct nl_mmap_req req = {
                .nm_block_size          = block_size,
                .nm_block_nr            = 64,
                .nm_frame_size          = 16384,
                .nm_frame_nr            = 64 * block_size / 16384,
        };
        unsigned int ring_size;
        int fd;

        fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
        if (setsockopt(fd, SOL_NETLINK, NETLINK_RX_RING, &amp;req, sizeof(req)) &lt; 0)
                exit(1);
        if (setsockopt(fd, SOL_NETLINK, NETLINK_TX_RING, &amp;req, sizeof(req)) &lt; 0)
                exit(1);

        ring_size = req.nm_block_nr * req.nm_block_size;
        mmap(NULL, 2 * ring_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
        return 0;
}

+++ exited with 0 +++
BUG: sleeping function called from invalid context at /home/kas/git/public/linux-mm/kernel/locking/mutex.c:616
in_atomic(): 1, irqs_disabled(): 0, pid: 1, name: init
3 locks held by init/1:
 #0:  (reboot_mutex){+.+...}, at: [&lt;ffffffff81080959&gt;] SyS_reboot+0xa9/0x220
 #1:  ((reboot_notifier_list).rwsem){.+.+..}, at: [&lt;ffffffff8107f379&gt;] __blocking_notifier_call_chain+0x39/0x70
 #2:  (rcu_callback){......}, at: [&lt;ffffffff810d32e0&gt;] rcu_do_batch.isra.49+0x160/0x10c0
Preemption disabled at:[&lt;ffffffff8145365f&gt;] __delay+0xf/0x20

CPU: 1 PID: 1 Comm: init Not tainted 4.1.0-00009-gbddf4c4818e0 #253
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Debian-1.8.2-1 04/01/2014
 ffff88017b3d8000 ffff88027bc03c38 ffffffff81929ceb 0000000000000102
 0000000000000000 ffff88027bc03c68 ffffffff81085a9d 0000000000000002
 ffffffff81ca2a20 0000000000000268 0000000000000000 ffff88027bc03c98
Call Trace:
 &lt;IRQ&gt;  [&lt;ffffffff81929ceb&gt;] dump_stack+0x4f/0x7b
 [&lt;ffffffff81085a9d&gt;] ___might_sleep+0x16d/0x270
 [&lt;ffffffff81085bed&gt;] __might_sleep+0x4d/0x90
 [&lt;ffffffff8192e96f&gt;] mutex_lock_nested+0x2f/0x430
 [&lt;ffffffff81932fed&gt;] ? _raw_spin_unlock_irqrestore+0x5d/0x80
 [&lt;ffffffff81464143&gt;] ? __this_cpu_preempt_check+0x13/0x20
 [&lt;ffffffff8182fc3d&gt;] netlink_set_ring+0x1ed/0x350
 [&lt;ffffffff8182e000&gt;] ? netlink_undo_bind+0x70/0x70
 [&lt;ffffffff8182fe20&gt;] netlink_sock_destruct+0x80/0x150
 [&lt;ffffffff817e484d&gt;] __sk_free+0x1d/0x160
 [&lt;ffffffff817e49a9&gt;] sk_free+0x19/0x20
[..]

Cong Wang says:

We can't hold a mutex lock in an RCU callback, [..]

Thomas Graf says:

The socket should be dead at this point. It might be simpler to
add a netlink_release_ring() function which doesn't require
locking at all.

Reported-by: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Diagnosed-by: Cong Wang &lt;cwang@twopensource.com&gt;
Suggested-by: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: Delete an unnecessary check before the function call "module_put"</title>
<updated>2015-07-03T16:27:43+00:00</updated>
<author>
<name>Markus Elfring</name>
<email>elfring@users.sourceforge.net</email>
</author>
<published>2015-07-02T16:38:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=92b80eb33c52583b48ed289bd993579b87b52d9a'/>
<id>urn:sha1:92b80eb33c52583b48ed289bd993579b87b52d9a</id>
<content type='text'>
The module_put() function tests whether its argument is NULL and then
returns immediately. Thus the test around the call is not needed.

This issue was detected by using the Coccinelle software.

Signed-off-by: Markus Elfring &lt;elfring@users.sourceforge.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: add API to retrieve all group memberships</title>
<updated>2015-06-21T17:18:18+00:00</updated>
<author>
<name>David Herrmann</name>
<email>dh.herrmann@gmail.com</email>
</author>
<published>2015-06-17T15:14:33+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=b42be38b2778eda2237fc759e55e3b698b05b315'/>
<id>urn:sha1:b42be38b2778eda2237fc759e55e3b698b05b315</id>
<content type='text'>
This patch adds getsockopt(SOL_NETLINK, NETLINK_LIST_MEMBERSHIPS) to
retrieve all groups a socket is a member of. Currently, we have to use
getsockname() and look at the nl.nl_groups bitmask. However, this mask is
limited to 32 groups. Hence, similar to NETLINK_ADD_MEMBERSHIP and
NETLINK_DROP_MEMBERSHIP, this adds a separate sockopt to manage group
IDs higher than 32.

This new NETLINK_LIST_MEMBERSHIPS option takes a pointer to __u32 and the
size of the array. The array is filled with the full membership-set of the
socket, and the required array size is returned in optlen. Hence,
user-space can retry with a properly sized array in case it was too small.
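
Once the array has been retrieved, user-space still has to interpret it.
The layout is not spelled out above; the sketch below assumes the array
is a bitmask packed into 32-bit words in which bit (group - 1) marks
membership of that group. The helper name is hypothetical:

<![CDATA[
```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return 1 if `group` is set in a membership bitmask
 * of `nwords` 32-bit words, assuming bit (group - 1) stands for `group`. */
int sketch_group_is_member(const uint32_t *words, size_t nwords,
                           unsigned int group)
{
    size_t bit, idx;

    if (group == 0)
        return 0; /* group IDs are assumed to start at 1 */

    bit = group - 1;
    idx = bit / 32;
    if (idx >= nwords)
        return 0; /* beyond the retrieved array */

    return (int)((words[idx] >> (bit % 32)) & 1u);
}
```
]]>

With such a layout, groups above 32 simply live in the second and later
words, which is what the nl_groups mask cannot express.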

Signed-off-by: David Herrmann &lt;dh.herrmann@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-05-23T05:22:35+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-05-23T05:22:35+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=36583eb54d46c36a447afd6c379839f292397429'/>
<id>urn:sha1:36583eb54d46c36a447afd6c379839f292397429</id>
<content type='text'>
Conflicts:
	drivers/net/ethernet/cadence/macb.c
	drivers/net/phy/phy.c
	include/linux/skbuff.h
	net/ipv4/tcp.c
	net/switchdev/switchdev.c

Switchdev was a case of RTNH_F_{EXTERNAL --&gt; OFFLOAD}
renaming overlapping with net-next changes of various
sorts.

phy.c was a case of two changes, one adding a local
variable to a function whilst the second was removing
one.

tcp.c overlapped a deadlock fix with the addition of new tcp_info
statistic values.

macb.c involved the addition of two zynq device entries.

skbuff.h involved adding back ipv4_daddr to nf_bridge_info
whilst net-next changes put two other existing members of
that struct into a union.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: Use random autobind rover</title>
<updated>2015-05-18T03:43:31+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2015-05-17T02:45:34+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=b9fbe709de4dbe663613ebb852f35aef2467872c'/>
<id>urn:sha1:b9fbe709de4dbe663613ebb852f35aef2467872c</id>
<content type='text'>
Currently we use a global rover to select a port ID that is unique.
This used to work consistently when it was protected with a global
lock.  However as we're now lockless, the global rover can exhibit
pathological behaviour should multiple threads all stomp on it at
the same time.

Granted this will eventually resolve itself but the process is
suboptimal.

This patch replaces the global rover with a pseudorandom starting
point to avoid this issue.
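
A minimal userspace sketch of the idea, not the patch itself: each
caller derives its own starting point from a pseudorandom value and then
steps through the candidates, so concurrent callers no longer share one
counter. The function names and the search window [INT32_MIN, -4097] are
assumptions of this sketch, and the modulo reduction is a simplification:

<![CDATA[
```c
#include <stdint.h>

/* Assumed top of the search window for this sketch: [INT32_MIN, -4097]. */
#define SKETCH_ROVER_TOP ((int32_t)-4097)

/* Map a caller-supplied pseudorandom value onto a start inside the window. */
int32_t sketch_rover_start(uint32_t rnd)
{
    const int64_t span = (int64_t)SKETCH_ROVER_TOP - (int64_t)INT32_MIN + 1;

    return (int32_t)((int64_t)INT32_MIN + (int64_t)(rnd % (uint64_t)span));
}

/* Step to the next candidate, wrapping back to the top of the window. */
int32_t sketch_rover_next(int32_t rover)
{
    return rover == INT32_MIN ? SKETCH_ROVER_TOP : rover - 1;
}
```
]]>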

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: Reset portid after netlink_insert failure</title>
<updated>2015-05-16T21:08:57+00:00</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2015-05-16T13:50:28+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=c0bb07df7d981e4091432754e30c9c720e2c0c78'/>
<id>urn:sha1:c0bb07df7d981e4091432754e30c9c720e2c0c78</id>
<content type='text'>
The commit c5adde9468b0714a051eac7f9666f23eb10b61f7 ("netlink:
eliminate nl_sk_hash_lock") breaks the autobind retry mechanism
because it doesn't reset portid after a failed netlink_insert.

This means that should autobind fail the first time around, then
the socket will be stuck in limbo as it can never be bound again
since it already has a non-zero portid.

Fixes: c5adde9468b0 ("netlink: eliminate nl_sk_hash_lock")
Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netlink: move nl_table in read_mostly section</title>
<updated>2015-05-14T21:49:06+00:00</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-05-13T00:24:50+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=91dd93f956b9ea9ecf47fd4b9acd2d2e7f980303'/>
<id>urn:sha1:91dd93f956b9ea9ecf47fd4b9acd2d2e7f980303</id>
<content type='text'>
Netlink socket creation and deletion heavily modify nl_table_users
and nl_table_lock.

If nl_table is sharing one cache line with one of them, netlink
performance is really bad on SMP.

ffffffff81ff5f00 B nl_table
ffffffff81ff5f0c b nl_table_users

Putting nl_table in the read_mostly section increased the performance
of my open/delete netlink sockets test by about 80%.
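
The same false-sharing concern can be illustrated in userspace C. This
sketch uses per-member alignment rather than section placement, so it is
a related technique and not what the patch does; the struct, the helper
and the 64-byte line size are all assumptions:

<![CDATA[
```c
#include <stddef.h>

#define SKETCH_CACHE_LINE 64 /* assumed line size; varies by CPU */

/* Hypothetical layout: the rarely written pointer and the hot counter
 * each start on their own cache line, so writes to one do not
 * invalidate the line holding the other. */
struct sketch_tables {
    _Alignas(SKETCH_CACHE_LINE) void *table;        /* read-mostly */
    _Alignas(SKETCH_CACHE_LINE) unsigned int users; /* written often */
};

/* Distance in bytes between the two members. */
size_t sketch_member_gap(void)
{
    return offsetof(struct sketch_tables, users) -
           offsetof(struct sketch_tables, table);
}
```
]]>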

This came up while diagnosing a getaddrinfo() problem.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-05-13T18:31:43+00:00</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-05-13T18:31:43+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/BMC/Intel-BMC/linux.git/commit/?id=b04096ff33a977c01c8780ca3ee129dbd641bad4'/>
<id>urn:sha1:b04096ff33a977c01c8780ca3ee129dbd641bad4</id>
<content type='text'>
Four minor merge conflicts:

1) qca_spi.c renamed the local variable used for the SPI device
   from spi_device to spi, meanwhile the spi_set_drvdata() call
   got moved further up in the probe function.

2) Two changes were both adding new members to codel params
   structure, and thus we had overlapping changes to the
   initializer function.

3) 'net' was making a fix to sk_release_kernel() which is
   completely removed in 'net-next'.

4) In net_namespace.c, the rtnl_net_fill() call for GET operations
   had the command value fixed, meanwhile 'net-next' adjusted the
   argument signature a bit.

This also matches example merge resolutions posted by Stephen
Rothwell over the past two days.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
