<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/drivers/infiniband, branch linux-2.6.35.y</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=linux-2.6.35.y</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=linux-2.6.35.y'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2011-03-31T18:58:38+00:00</updated>
<entry>
<title>IB/cm: Bump reference count on cm_id before invoking callback</title>
<updated>2011-03-31T18:58:38+00:00</updated>
<author>
<name>Sean Hefty</name>
<email>sean.hefty@intel.com</email>
</author>
<published>2011-02-23T16:17:40+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ca78068b35522c820207b36191cba1385163f198'/>
<id>urn:sha1:ca78068b35522c820207b36191cba1385163f198</id>
<content type='text'>
commit 29963437a48475036353b95ab142bf199adb909e upstream.

When processing a SIDR REQ, the ib_cm allocates a new cm_id.  The
refcount of the cm_id is initialized to 1.  However, cm_process_work
will decrement the refcount after invoking all callbacks.  The result
is that the cm_id will end up with refcount set to 0 by the end of the
sidr req handler.

If a user tries to destroy the cm_id, the destruction will proceed,
under the incorrect assumption that no other threads are referencing
the cm_id.  This can lead to a crash when the cm callback thread tries
to access the cm_id.

This problem was noticed as part of a larger investigation with kernel
crashes in the rdma_cm when running on a real time OS.

Signed-off-by: Sean Hefty &lt;sean.hefty@intel.com&gt;
Acked-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>RDMA/cma: Fix crash in request handlers</title>
<updated>2011-03-31T18:58:38+00:00</updated>
<author>
<name>Sean Hefty</name>
<email>sean.hefty@intel.com</email>
</author>
<published>2011-02-23T16:11:32+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=af8c55e566b98985bde53ce09e72afcc6db04d4f'/>
<id>urn:sha1:af8c55e566b98985bde53ce09e72afcc6db04d4f</id>
<content type='text'>
commit 25ae21a10112875763c18b385624df713a288a05 upstream.

Doug Ledford and Red Hat reported a crash when running the rdma_cm on
a real-time OS.  The crash has the following call trace:

    cm_process_work
       cma_req_handler
          cma_disable_callback
          rdma_create_id
             kzalloc
             init_completion
          cma_get_net_info
          cma_save_net_info
          cma_any_addr
             cma_zero_addr
          rdma_translate_ip
             rdma_copy_addr
          cma_acquire_dev
             rdma_addr_get_sgid
             ib_find_cached_gid
             cma_attach_to_dev
          ucma_event_handler
             kzalloc
             ib_copy_ah_attr_to_user
          cma_comp

[ preempted ]

    cma_write
        copy_from_user
        ucma_destroy_id
           copy_from_user
           _ucma_find_context
           ucma_put_ctx
           ucma_free_ctx
              rdma_destroy_id
                 cma_exch
                 cma_cancel_operation
                 rdma_node_get_transport

        rt_mutex_slowunlock
        bad_area_nosemaphore
        oops_enter

They were able to reproduce the crash multiple times with the
following details:

    Crash seems to always happen on the:
            mutex_unlock(&amp;conn_id-&gt;handler_mutex);
    as conn_id looks to have been freed during this code path.

An examination of the code shows that a race exists in the request
handlers.  When a new connection request is received, the rdma_cm
allocates a new connection identifier.  This identifier has a single
reference count on it.  If a user calls rdma_destroy_id() from another
thread after receiving a callback, rdma_destroy_id will proceed to
destroy the id and free the associated memory.  However, the request
handlers may still be in the process of running.  When control returns
to the request handlers, they can attempt to access the newly created
identifiers.

Fix this by holding a reference on the newly created rdma_cm_id until
the request handler is through accessing it.

Signed-off-by: Sean Hefty &lt;sean.hefty@intel.com&gt;
Acked-by: Doug Ledford &lt;dledford@redhat.com&gt;
Signed-off-by: Roland Dreier &lt;roland@purestorage.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

</content>
</entry>
<entry>
<title>IB/uverbs: Handle large number of entries in poll CQ</title>
<updated>2011-02-06T19:03:29+00:00</updated>
<author>
<name>Dan Carpenter</name>
<email>error27@gmail.com</email>
</author>
<published>2010-10-13T09:13:12+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=83ec6d3cccb0cd67613880b32c85fec88233d681'/>
<id>urn:sha1:83ec6d3cccb0cd67613880b32c85fec88233d681</id>
<content type='text'>
commit 7182afea8d1afd432a17c18162cc3fd441d0da93 upstream.

In ib_uverbs_poll_cq() code there is a potential integer overflow if
userspace passes in a large cmd.ne.  The calls to kmalloc() would
allocate smaller buffers than intended, leading to memory corruption.
There iss also an information leak if resp wasn't all used.
Unprivileged userspace may call this function, although only if an
RDMA device that uses this function is present.

Fix this by copying CQ entries one at a time, which avoids the
allocation entirely, and also by moving this copying into a function
that makes sure to initialize all memory copied to userspace.

Special thanks to Jason Gunthorpe &lt;jgunthorpe@obsidianresearch.com&gt;
for his help and advice.

Signed-off-by: Dan Carpenter &lt;error27@gmail.com&gt;
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;

[ Monkey around with things a bit to avoid bad code generation by gcc
  when designated initializers are used.  - Roland ]

Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>RDMA/cxgb3: Turn off RX coalescing for iWARP connections</title>
<updated>2010-10-29T04:51:11+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2010-09-19T00:38:21+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=b8a7e2b687977fa0a2d0969d8e32fe6f5a4fbff4'/>
<id>urn:sha1:b8a7e2b687977fa0a2d0969d8e32fe6f5a4fbff4</id>
<content type='text'>
commit bec658ff31453a5726b1c188674d587a5d40c482 upstream.

The HW by default has RX coalescing on.  For iWARP connections, this
causes a 100ms delay in connection establishement due to the ingress
MPA Start message being stalled in HW.  So explicitly turn RX
coalescing off when setting up iWARP connections.

This was causing very bad performance for NP64 gather operations using
Open MPI, due to the way it sets up connections on larger jobs.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>RDMA/cxgb3: Don't exceed the max HW CQ depth</title>
<updated>2010-09-20T20:36:43+00:00</updated>
<author>
<name>Steve Wise</name>
<email>swise@opengridcomputing.com</email>
</author>
<published>2010-08-28T13:35:05+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=f155ee1b5def75927fd86a839d7ab10b8fa07e79'/>
<id>urn:sha1:f155ee1b5def75927fd86a839d7ab10b8fa07e79</id>
<content type='text'>
commit dc4e96ce2dceb649224ee84f83592aac8c54c9b7 upstream.

The max depth supported by T3 is 64K entries.  This fixes a bug
introduced in commit 9918b28d ("RDMA/cxgb3: Increase the max CQ
depth") that causes stalls and possibly crashes in large MPI clusters.

Signed-off-by: Steve Wise &lt;swise@opengridcomputing.com&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@suse.de&gt;

</content>
</entry>
<entry>
<title>IB/qib: Use request_firmware() to load SD7220 firmware</title>
<updated>2010-07-08T20:27:05+00:00</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2010-07-01T20:37:20+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=ecd4b48a163b55d7eb4132617100b90d0d2768ec'/>
<id>urn:sha1:ecd4b48a163b55d7eb4132617100b90d0d2768ec</id>
<content type='text'>
Extract the microcode for the QLogic QLE7220 series IB HCA and use the
kernel microcode request facility to load the microcode.  This
supports Debian Linux's requirements to separate microcode which
doesn't have open source code available from the device driver.

Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband</title>
<updated>2010-07-08T19:20:54+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-07-08T19:20:54+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=e467e104bb7482170b79f516d2025e7cfcaaa733'/>
<id>urn:sha1:e467e104bb7482170b79f516d2025e7cfcaaa733</id>
<content type='text'>
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IPoIB: Fix world-writable child interface control sysfs attributes
  IB/qib: Clean up properly if qib_init() fails
  IB/qib: Completion queue callback needs to be single threaded
  IB/qib: Update 7322 serdes tables
  IB/qib: Clear 6120 hardware error register
  IB/qib: Clear eager buffer memory for each new process
  IB/qib: Mask hardware error during link reset
  IB/qib: Don't mark VL15 bufs as WC to avoid a rare 7322 chip problem
  RDMA/cxgb4: Derive smac_idx from port viid
  RDMA/cxgb4: Avoid false GTS CIDX_INC overflows
  RDMA/cxgb4: Don't call abort_connection() for active connect failures
  RDMA/cxgb4: Use the DMA state API instead of the pci equivalents
</content>
</entry>
<entry>
<title>Merge branches 'cxgb4', 'ipoib' and 'qib' into for-next</title>
<updated>2010-07-08T16:10:24+00:00</updated>
<author>
<name>Roland Dreier</name>
<email>rolandd@cisco.com</email>
</author>
<published>2010-07-08T16:10:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=9e770044a0f08a6dcf245152ec1575f7cb0b9631'/>
<id>urn:sha1:9e770044a0f08a6dcf245152ec1575f7cb0b9631</id>
<content type='text'>
</content>
</entry>
<entry>
<title>IPoIB: Fix world-writable child interface control sysfs attributes</title>
<updated>2010-07-06T21:23:22+00:00</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@voltaire.com</email>
</author>
<published>2010-06-06T04:59:16+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=7a52b34b07122ff5f45258d47f260f8a525518f0'/>
<id>urn:sha1:7a52b34b07122ff5f45258d47f260f8a525518f0</id>
<content type='text'>
Sumeet Lahorani &lt;sumeet.lahorani@oracle.com&gt; reported that the IPoIB
child entries are world-writable; however we don't want ordinary users
to be able to create and destroy child interfaces, so fix them to be
writable only by root.

Signed-off-by: Or Gerlitz &lt;ogerlitz@voltaire.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
</entry>
<entry>
<title>IB/qib: Clean up properly if qib_init() fails</title>
<updated>2010-07-06T21:14:04+00:00</updated>
<author>
<name>Ralph Campbell</name>
<email>ralph.campbell@qlogic.com</email>
</author>
<published>2010-07-01T20:25:45+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=756a33b8dc3ed5c27685a130339de8a894d528a7'/>
<id>urn:sha1:756a33b8dc3ed5c27685a130339de8a894d528a7</id>
<content type='text'>
If qib_init() fails, the driver fails to free memory, unregister
device files, and unregister with the PCIe framework. The driver will
unload without error but a subsequent driver load will cause the
system to panic.  This was found by changing the 7220 code to load the
serdes microcode separately and not installing the microcode file.

Signed-off-by: Ralph Campbell &lt;ralph.campbell@qlogic.com&gt;
Signed-off-by: Roland Dreier &lt;rolandd@cisco.com&gt;
</content>
</entry>
</feed>
