summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2013-05-09Merge branches 'cxgb4', 'ipoib', 'iser', 'misc', 'mlx4', 'qib' and 'srp' ↵Roland Dreier26-192/+315
into for-next
2013-05-02IB/iser: Add support for iser CM REQ additional infoOr Gerlitz2-0/+16
Annex A12 of the IBTA spec defines additional information that needs to be provided through the CM exchange relating to usage of ZBVA (Zero Based VAs) and Send With Invalidate over an iSER connection. Currently, the initiator sets both to not supported, but does provide the header so that existing iSER targets can be patched to start looking on the private data carried by the CM. This is a preparation step to enable iSER with HW drivers for which FMRs are not supported, such as mlx4 VF instances or new HW devices which might support only FRWR (Fast Registration Work-Requests) along the details of the IB_DEVICE_MEM_MGT_EXTENSIONS device capability. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-05-02IB/iser: Return error to upper layers on EAGAIN registration failuresOr Gerlitz1-1/+2
Commit 819a087316a6 ("IB/iser: Avoid error prints on EAGAIN registration failures") not only eliminated the error print on that case, but rather also modified the code such that it doesn't return any error to upper layers. As a result a wrong mapping was used. Fix this to correctly return the error in that case. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-05-02IB/iser: Move informational messages from error to info levelRoi Dayan3-25/+34
Introduce iser_info() and move informational messages that were printed as errors to use that macro. Also, cleanup printk leftovers to use the existing macros. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> [ Use pr_warn(... instead of printk(KERN_WARNING .... - Roland ] Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-05-02IB/iser: Add module versionRoi Dayan2-5/+4
Add displaying module version, update the version to 1.1, and remove the DRV_DATE define. Signed-off-by: Roi Dayan <roid@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25mlx4_core: Expose a few helpers to fill DMFS HW strucuturesHadar Hen Zion2-27/+70
Re-arrange some of code which fills DMFS HW structures so we can use it from within the core driver and from the IB driver too, e.g when verbs DMFS structures are transformed into mlx4 hardware structs. Also, add struct mlx4_flow_handle struct which will be of use by the DMFS verbs flow in the IB driver. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25mlx4_core: Directly expose fields of DMFS HW rule control segmentHadar Hen Zion2-8/+10
Some of struct mlx4_net_trans_rule_hw_ctrl fields were packed into u32 and accessed through bit field operations. Expose and access them directly as u8 or u16. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25mlx4_core: Change a few DMFS fields names to match firmare specHadar Hen Zion2-7/+7
Change struct mlx4_net_trans_rule_hw_eth :: vlan_id name to vlan_tag Change struct mlx4_net_trans_rule_hw_ib :: r_u_qpn name to l3_qpn The patch doesn't introduce any functional change or API change towards the firmware. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25mlx4: Match DMFS promiscuous field names to firmware specHadar Hen Zion4-25/+25
Align the names used by enum mlx4_net_trans_promisc_mode with the actual firmware specification. The patch doesn't introduce any functional change or API change towards the firmware. Remove MLX4_FS_PROMISC_FUNCTION_PORT which isn't of use. Add new enums MLX4_FS_{UC/MC}_SNIFFER as a preparation step for sniffer support. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25mlx4_core: Move DMFS HW structs to common header fileHadar Hen Zion2-79/+79
Move flow steering HW structures to be on the public mlx4 include directory, as a pre-step for the mlx4 IB driver to use them too. Signed-off-by: Hadar Hen Zion <hadarh@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25IB/mlx4: Set link type for RAW PACKET QPs in the QP contextEli Cohen1-0/+4
When the link type is Ethernet, setting the link type in the QP context will enable TCP/IP stateless offloads (checksum, LSO, RSS) for RAW PACKET Ethernet QPs. For IB UD QPs this worked OK since the value assumed by the firmware for IB link layer is zero. Signed-off-by: Eli Cohen <eli@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25IB/mlx4: Disable VLAN stripping for RAW PACKET QPsDotan Barak1-0/+2
Fix the asymmetric behavior w.r.t VLAN insertion/stripping for RAW PACKET QPs -- we don't insert on send and need not strip on receive. Signed-off-by: Dotan Barak <dotanb@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25mlx4_core: Reduce warning message for SRQ_LIMIT event to debug levelJack Morgenstein1-2/+2
Commit acba2420f9d2 ("mlx4_core: Add wrapper functions and comm channel and slave event support to EQs") introduced a warning printout for SRQ LIMIT events. This warning can flood the log when (correct, normally operating) apps use SRQ LIMIT events as a trigger to post WQEs to SRQs. Reduce the warning message to be a debug printout. Reported-by: Rick Warner <rick@microway.com> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-25RDMA/iwcm: Don't touch cmid after dropping referenceSteve Wise1-0/+2
The function cm_work_handler() cannot touch the cm_id after it derefs it, because it might be freed on another concurrent thread. If there are more work items queued for this cm_id, then we know there must be more references because they are added when the work items are queued. So in the while loop inside cm_work_handler(), after derefing, if the queue is empty, then exit the function. Otherwise we know it's safe to re-acquire the lock. Signed-off-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17IB/qib: Correct qib_verbs_register_sysfs() error handlingMike Marciniszyn2-2/+7
qib_verbs_register_sysfs() never cleans up from a failure. Additionally, the caller of qib_verbs_register_sysfs() doesn't return the correct "ret" value. This patch resolves both of those issues. Reported-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17IB/ipath: Correct ipath_verbs_register_sysfs() error handlingMike Marciniszyn1-9/+10
ipath_verbs_register_sysfs() never returned the correct error code from device_create_file and never cleaned up from a failure. Additionally, the caller of ipath_verbs_register_sysfs() doesn't return the correct "ret" value. This patch resolves all of these issues. Reported-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Reviewed-by: Dean Luick <dean.luick@intel.com> Signed-off-by: Mike Marciniszyn <mike.marciniszyn@intel.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17RDMA/cxgb4: Fix SQ allocation when on-chip SQ is disabledThadeu Lima de Souza Cascardo1-12/+13
Commit c079c28714e4 ("RDMA/cxgb4: Fix error handling in create_qp()") broke SQ allocation. Instead of falling back to host allocation when on-chip allocation fails, it tries to allocate both. And when it does, and we try to free the address from the genpool using the host address, we hit a BUG and the system crashes as below. We create a new function that has the previous behavior and properly propagate the error, as intended. kernel BUG at /usr/src/packages/BUILD/kernel-ppc64-3.0.68/linux-3.0/lib/genalloc.c:340! Oops: Exception in kernel mode, sig: 5 [#1] SMP NR_CPUS=1024 NUMA pSeries Modules linked in: rdma_ucm rdma_cm ib_addr ib_cm iw_cm ib_sa ib_mad ib_uverbs iw_cxgb4 ib_core ip6t_LOG xt_tcpudp xt_pkttype ipt_LOG xt_limit ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw xt_NOTRACK ipt_REJECT xt_state iptable_raw iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 ip_tables ip6table_filter ip6_tables x_tables fuse loop dm_mod ipv6 ipv6_lib sr_mod cdrom ibmveth(X) cxgb4 sg ext3 jbd mbcache sd_mod crc_t10dif scsi_dh_emc scsi_dh_hp_sw scsi_dh_alua scsi_dh_rdac scsi_dh ibmvscsic(X) scsi_transport_srp scsi_tgt scsi_mod Supported: Yes NIP: c00000000037d41c LR: d000000003913824 CTR: c00000000037d3b0 REGS: c0000001f350ae50 TRAP: 0700 Tainted: G X (3.0.68-0.9-ppc64) MSR: 8000000000029032 <EE,ME,CE,IR,DR> CR: 24042482 XER: 00000001 TASK = c0000001f6f2a840[3616] 'rping' THREAD: c0000001f3508000 CPU: 0 GPR00: c0000001f6e875c8 c0000001f350b0d0 c000000000fc9690 c0000001f6e875c0 GPR04: 00000000000c0000 0000000000010000 0000000000000000 c0000000009d482a GPR08: 000000006a170000 0000000000100000 c0000001f350b140 c0000001f6e875c8 GPR12: d000000003915dd0 c000000003f40000 000000003e3ecfa8 c0000001f350bea0 GPR16: c0000001f350bcd0 00000000003c0000 0000000000040100 c0000001f6e74a80 GPR20: d00000000399a898 c0000001f6e74ac8 c0000001fad91600 c0000001f6e74ab0 GPR24: c0000001f7d23f80 0000000000000000 0000000000000002 000000006a170000 GPR28: 000000000000000c c0000001f584c8d0 d000000003925180 c0000001f6e875c8 NIP [c00000000037d41c] .gen_pool_free+0x6c/0xf8 LR [d000000003913824] .c4iw_ocqp_pool_free+0x8c/0xd8 [iw_cxgb4] Call Trace: [c0000001f350b0d0] [c0000001f350b180] 0xc0000001f350b180 (unreliable) [c0000001f350b170] [d000000003913824] .c4iw_ocqp_pool_free+0x8c/0xd8 [iw_cxgb4] [c0000001f350b210] [d00000000390fd70] .dealloc_sq+0x90/0xb0 [iw_cxgb4] [c0000001f350b280] [d00000000390fe08] .destroy_qp+0x78/0xf8 [iw_cxgb4] [c0000001f350b310] [d000000003912738] .c4iw_destroy_qp+0x208/0x2d0 [iw_cxgb4] [c0000001f350b460] [d000000003861874] .ib_destroy_qp+0x5c/0x130 [ib_core] [c0000001f350b510] [d0000000039911bc] .ib_uverbs_cleanup_ucontext+0x174/0x4f8 [ib_uverbs] [c0000001f350b5f0] [d000000003991568] .ib_uverbs_close+0x28/0x70 [ib_uverbs] [c0000001f350b670] [c0000000001e7b2c] .__fput+0xdc/0x278 [c0000001f350b720] [c0000000001a9590] .remove_vma+0x68/0xd8 [c0000001f350b7b0] [c0000000001a9720] .exit_mmap+0x120/0x160 [c0000001f350b8d0] [c0000000000af330] .mmput+0x80/0x160 [c0000001f350b960] [c0000000000b5d0c] .exit_mm+0x1ac/0x1e8 [c0000001f350ba10] [c0000000000b8154] .do_exit+0x1b4/0x4b8 [c0000001f350bad0] [c0000000000b84b0] .do_group_exit+0x58/0xf8 [c0000001f350bb60] [c0000000000ce9f4] .get_signal_to_deliver+0x2f4/0x5d0 [c0000001f350bc60] [c000000000017ee4] .do_signal_pending+0x6c/0x3e0 [c0000001f350bdb0] [c0000000000182cc] .do_signal+0x74/0x78 [c0000001f350be30] [c000000000009e74] do_work+0x24/0x28 Signed-off-by: Thadeu Lima de Souza Cascardo <cascardo@linux.vnet.ibm.com> Cc: Emil Goode <emilgoode@gmail.com> Cc: <stable@vger.kernel.org> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17SRPT: Fix odd use of WARN_ON()Grant Grundler1-1/+1
While WARN_ON("const string") will work, it's intent is not obvious. The warning is more useful by showing the "state" value. Signed-off-by: Grant Grundler <grundler@chromium.org> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17IPoIB: Fix ipoib_hard_header() return valueDoug Ledford1-1/+1
If you have a patched up dhcp server (and dhclient), they will use AF_PACKET/SOCK_DGRAM pair to send dhcp packets over IPoIB. However, when testing an upstream kernel, this has been broken for a very long time (I tested 2.6.34, 2.6.38, 3.0, 3.1, 3.8, HEAD). It turns out that the hard_header routine in ipoib is not following the API and is returning 0 even when it pushed data onto the skb. This then causes af_packet.c to overwrite the header just pushed with data from user space. Fixing this gets DHCP working on IPoIB. Signed-off-by: Doug Ledford <dledford@redhat.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17RDMA: Rename random32() to prandom_u32()Akinobu Mita4-6/+6
Use more preferable function name which implies using a pseudo-random number generator. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Reviewed-by: Steve Wise <swise@opengridcomputing.com> Cc: Roland Dreier <roland@kernel.org> Cc: Sean Hefty <sean.hefty@intel.com> Cc: Hal Rosenstock <hal.rosenstock@gmail.com> Cc: Steve Wise <swise@chelsio.com> Cc: linux-rdma@vger.kernel.org Reviewed-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17RDMA/cxgb3: Fix uninitialized variableCong Ding1-1/+1
The variable npages might be used uninitialized. Signed-off-by: Cong Ding <dinggnu@gmail.com> Acked-by: Steve Wise <swise@opengridcomputing.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17IB/mlx4: Fetch XRC SRQ in the CQ polling codeShlomo Pongratz1-0/+21
An XRC target QP may redirect to more than one XRC SRQ. This means that for work completions associated with a XRC TGT QP, the srq field in the QP has no usage and the real XRC SRQ need to be retrived using the information from the XRCETH placed into the CQE, do that. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17mlx4_core: Implement SRQ object lookup from srqnShlomo Pongratz2-0/+17
Expose a new API mlx4_srq_lookup() to retrive a SRQ based on its number. This API is needed in the mlx4_ib driver CQ polling logic, when a work completion is associated with a XRC TGT QP. Since a target QP may redirect to more than one XRC SRQ, the srq field in the QP has no usage and the real XRC SRQ need to be retrived using the information from the XRCETH IB header which is placed in the HW CQE. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-17IB/core: Verify that QP handler is valid before dispatching eventsShlomo Pongratz1-1/+2
For QPs of type IB_QPT_XRC_TGT the IB core assigns a common event handler __ib_shared_qp_event_handler(), and the optionally supplied event handler is stored. When the common handler is called it iterates on all opened QPs and calles the original handlers without checking if they are NULL. Fix that. Signed-off-by: Shlomo Pongratz <shlomop@mellanox.com> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Roland Dreier <roland@purestorage.com>
2013-04-15Linux 3.9-rc7v3.9-rc7Linus Torvalds1-1/+1
2013-04-14Merge branch 'x86-urgent-for-linus' of ↵Linus Torvalds8-21/+33
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Ingo Molnar: "Misc fixes" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is set x86/mm/cpa/selftest: Fix false positive in CPA self test x86/mm/cpa: Convert noop to functional fix x86, mm: Patch out arch_flush_lazy_mmu_mode() when running on bare metal x86, mm, paravirt: Fix vmalloc_fault oops during lazy MMU updates
2013-04-14Merge branch 'sched-urgent-for-linus' of ↵Linus Torvalds3-4/+32
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler fixes from Ingo Molnar: "Misc fixlets" * 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: sched/cputime: Fix accounting on multi-threaded processes sched/debug: Fix sd->*_idx limit range avoiding overflow sched_clock: Prevent 64bit inatomicity on 32bit systems sched: Convert BUG_ON()s in try_to_wake_up_local() to WARN_ON_ONCE()s
2013-04-14Merge branch 'perf-urgent-for-linus' of ↵Linus Torvalds6-11/+28
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf fixes from Ingo Molnar: "Misc fixlets" * 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf: Fix error return code ftrace: Fix strncpy() use, use strlcpy() instead of strncpy() perf: Fix strncpy() use, use strlcpy() instead of strncpy() perf: Fix strncpy() use, always make sure it's NUL terminated perf: Fix ring_buffer perf_output_space() boundary calculation perf/x86: Fix uninitialized pt_regs in intel_pmu_drain_bts_buffer()
2013-04-14Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds2-3/+9
Pull drm fixes from Dave Airlie: "One fix for a hotplug locking regressions, and one fix for an oops if you unplug the monitor at an inopportune moment on the udl device." * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/fb-helper: Fix locking in drm_fb_helper_hotplug_event udl: handle EDID failure properly.
2013-04-14Merge branch 'for-linus' of ↵Linus Torvalds1-0/+20
git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu Pull m68knommu fix from Greg Ungerer: "This contains only a single compilation fix for ColdFire m68k targets that use local non-GPIOLIB support." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu: m68k: define a local gpio_request_one() function
2013-04-14Merge git://www.linux-watchdog.org/linux-watchdogLinus Torvalds1-1/+1
Pull watchdog fix from Wim Van Sebroeck: "It will fix compile errors for the at91rm9200_wdt driver" * git://www.linux-watchdog.org/linux-watchdog: watchdog: Revert the AT91RM9200_WATCHDOG dependency
2013-04-14Merge branch 'for-linus' of ↵Linus Torvalds1-6/+42
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull one more btrfs fix from Chris Mason: "This has a recent fix from Josef for our tree log replay code. It fixes problems where the inode counter for the number of bytes in the file wasn't getting updated properly during fsync replay. The commit did get rebased this morning, but it was only to clean up the subject line. The code hasn't changed." * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Btrfs: make sure nbytes are right after log replay
2013-04-14Merge tag 'trace-fixes-v3.9-rc-v3' of ↵Linus Torvalds3-20/+21
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace Pull ftrace fixes from Steven Rostedt: "Namhyung Kim found and fixed a bug that can crash the kernel by simply doing: echo 1234 | tee -a /sys/kernel/debug/tracing/set_ftrace_pid Luckily, this can only be done by root, but still is a nasty bug." * tag 'trace-fixes-v3.9-rc-v3' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE section tracing: Fix possible NULL pointer dereferences
2013-04-14Add file_ns_capable() helper function for open-time capability checkingLinus Torvalds2-0/+26
Nothing is using it yet, but this will allow us to delay the open-time checks to use time, without breaking the normal UNIX permission semantics where permissions are determined by the opener (and the file descriptor can then be passed to a different process, or the process can drop capabilities). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-14watchdog: Revert the AT91RM9200_WATCHDOG dependencyNicolas Ferre1-1/+1
Compiling the at91rm9200_wdt.c driver without at91rm9200 support was leading to several errors: drivers/built-in.o: In function `at91_wdt_close': at91_adc.c:(.text+0xc9fe4): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91_wdt_write': at91_adc.c:(.text+0xca004): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91wdt_shutdown': at91_adc.c:(.text+0xca01c): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91wdt_suspend': at91_adc.c:(.text+0xca038): undefined reference to `at91_st_base' drivers/built-in.o: In function `at91_wdt_open': at91_adc.c:(.text+0xca0cc): undefined reference to `at91_st_base' drivers/built-in.o:at91_adc.c:(.text+0xca2c8): more undefined references to `at91_st_base' follow So, reverting the modification of the "depends" Kconfig line introduced by patch a6a1bcd37 (watchdog: at91rm9200: add DT support) seems to be the good solution. Signed-off-by: Nicolas Ferre <nicolas.ferre@atmel.com> Acked-by: Guenter Roeck <linux@roeck-us.net> Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2013-04-14vfs: Revert spurious fix to spinning prevention in prune_icache_sbSuleiman Souhlal1-1/+1
Revert commit 62a3ddef6181 ("vfs: fix spinning prevention in prune_icache_sb"). This commit doesn't look right: since we are looking at the tail of the list (sb->s_inode_lru.prev) if we want to skip an inode, we should put it back at the head of the list instead of the tail, otherwise we will keep spinning on it. Discovered when investigating why prune_icache_sb came top in perf reports of a swapping load. Signed-off-by: Suleiman Souhlal <suleiman@google.com> Signed-off-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org # v3.2+ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-14kobject: fix kset_find_obj() race with concurrent last kobject_put()Linus Torvalds1-1/+8
Anatol Pomozov identified a race condition that hits module unloading and re-loading. To quote Anatol: "This is a race codition that exists between kset_find_obj() and kobject_put(). kset_find_obj() might return kobject that has refcount equal to 0 if this kobject is freeing by kobject_put() in other thread. Here is timeline for the crash in case if kset_find_obj() searches for an object tht nobody holds and other thread is doing kobject_put() on the same kobject: THREAD A (calls kset_find_obj()) THREAD B (calls kobject_put()) splin_lock() atomic_dec_return(kobj->kref), counter gets zero here ... starts kobject cleanup .... spin_lock() // WAIT thread A in kobj_kset_leave() iterate over kset->list atomic_inc(kobj->kref) (counter becomes 1) spin_unlock() spin_lock() // taken // it does not know that thread A increased counter so it remove obj from list spin_unlock() vfree(module) // frees module object with containing kobj // kobj points to freed memory area!! kobject_put(kobj) // OOPS!!!! The race above happens because module.c tries to use kset_find_obj() when somebody unloads module. The module.c code was introduced in commit 6494a93d55fa" Anatol supplied a patch specific for module.c that worked around the problem by simply not using kset_find_obj() at all, but rather than make a local band-aid, this just fixes kset_find_obj() to be thread-safe using the proper model of refusing the get a new reference if the refcount has already dropped to zero. See examples of this proper refcount handling not only in the kref documentation, but in various other equivalent uses of this pattern by grepping for atomic_inc_not_zero(). [ Side note: the module race does indicate that module loading and unloading is not properly serialized wrt sysfs information using the module mutex. That may require further thought, but this is the correct fix at the kobject layer regardless. ] Reported-analyzed-and-tested-by: Anatol Pomozov <anatol.pomozov@gmail.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-13Btrfs: make sure nbytes are right after log replayJosef Bacik1-6/+42
While trying to track down a tree log replay bug I noticed that fsck was always complaining about nbytes not being right for our fsynced file. That is because the new fsync stuff doesn't wait for ordered extents to complete, so the inodes nbytes are not necessarily updated properly when we log it. So to fix this we need to set nbytes to whatever it is on the inode that is on disk, so when we replay the extents we can just add the bytes that are being added as we replay the extent. This makes it work for the case that we have the wrong nbytes or the case that we logged everything and nbytes is actually correct. With this I'm no longer getting nbytes errors out of btrfsck. Cc: stable@vger.kernel.org Signed-off-by: Josef Bacik <jbacik@fusionio.com> Signed-off-by: Chris Mason <chris.mason@fusionio.com>
2013-04-13x86-32: Fix possible incomplete TLB invalidate with PAE pagetablesDave Hansen4-2/+15
This patch attempts to fix: https://bugzilla.kernel.org/show_bug.cgi?id=56461 The symptom is a crash and messages like this: chrome: Corrupted page table at address 34a03000 *pdpt = 0000000000000000 *pde = 0000000000000000 Bad pagetable: 000f [#1] PREEMPT SMP Ingo guesses this got introduced by commit 611ae8e3f520 ("x86/tlb: enable tlb flush range support for x86") since that code started to free unused pagetables. On x86-32 PAE kernels, that new code has the potential to free an entire PMD page and will clear one of the four page-directory-pointer-table (aka pgd_t entries). The hardware aggressively "caches" these top-level entries and invlpg does not actually affect the CPU's copy. If we clear one we *HAVE* to do a full TLB flush, otherwise we might continue using a freed pmd page. (note, we do this properly on the population side in pud_populate()). This patch tracks whenever we clear one of these entries in the 'struct mmu_gather', and ensures that we follow up with a full tlb flush. BTW, I disassembled and checked that: if (tlb->fullmm == 0) and if (!tlb->fullmm && !tlb->need_flush_all) generate essentially the same code, so there should be zero impact there to the !PAE case. Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Cc: Peter Anvin <hpa@zytor.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Artem S Tashkinov <t.artem@mailcity.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pendingLinus Torvalds2-68/+133
Pull SCSI target fixes from Nicholas Bellinger: "Here are remaining target-pending items for v3.9-rc7 code. The tcm_vhost patches are more than I'd usually include in a -rc7 pull, but are changes required for v3.9 to work correctly with the pending vhost-scsi-pci QEMU upstream series merge. (Paolo CC'ed) Plus Asias's conversion to use vhost_virtqueue->private_data + RCU for managing vhost-scsi endpoints has gotten alot of review + testing over the past weeks, and MST has ACKed the full series. Also, there is a target patch to fix a long-standing bug within control CDB handling with Standby/Offline/Transition ALUA port access states, that had been incorrectly rejecting the control CDBs required for LUN scan to work during these port group states. CC'ing to stable." * git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: target: Fix incorrect fallthrough of ALUA Standby/Offline/Transition CDBs tcm_vhost: Send bad target to guest when cmd fails tcm_vhost: Add vhost_scsi_send_bad_target() helper tcm_vhost: Fix tv_cmd leak in vhost_scsi_handle_vq tcm_vhost: Remove double check of response tcm_vhost: Initialize vq->last_used_idx when set endpoint tcm_vhost: Use vq->private_data to indicate if the endpoint is setup tcm_vhost: Use ACCESS_ONCE for vs->vs_tpg[target] access
2013-04-13Merge tag 'scsi-fixes' of ↵Linus Torvalds13-80/+38
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi Pull SCSI fixes from James Bottomley: "This is a set of ten bug fixes (and two consisting of copyright year update and version number change) pretty much all of which involve either a crash or a hang except the removal of the random sleep from the qla2xxx driver (which is a coding error so bad, we want it gone before anyone has a chance to copy it)." * tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: [SCSI] lpfc: fix potential NULL pointer dereference in lpfc_sli4_rq_put() [SCSI] libsas: fix handling vacant phy in sas_set_ex_phy() [SCSI] ibmvscsi: Fix slave_configure deadlock [SCSI] qla2xxx: Update the driver version to 8.04.00.13-k. [SCSI] qla2xxx: Remove debug code that msleeps for random duration. [SCSI] qla2xxx: Update copyright dates information in LICENSE.qla2xxx file. [SCSI] qla2xxx: Fix crash during firmware dump procedure. [SCSI] Revert "qla2xxx: Add setting of driver version string for vendor application." [SCSI] ipr: dlpar failed when adding an adapter back [SCSI] ipr: fix addition of abort command to HRRQ free queue [SCSI] st: Take additional queue ref in st_probe [SCSI] libsas: use right function to alloc smp response [SCSI] ipr: ipr_test_msi() fails when running with msi-x enabled adapter
2013-04-13Merge branch 'for-next' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-3/+13
Pull CIFS fix from Steve French: "Fixes a regression in cifs in which a password which begins with a comma is parsed incorrectly as a blank password" * 'for-next' of git://git.samba.org/sfrench/cifs-2.6: cifs: Allow passwords which begin with a delimitor
2013-04-13ftrace: Move ftrace_filter_lseek out of CONFIG_DYNAMIC_FTRACE sectionSteven Rostedt (Red Hat)2-15/+16
As ftrace_filter_lseek is now used with ftrace_pid_fops, it needs to be moved out of the #ifdef CONFIG_DYNAMIC_FTRACE section as the ftrace_pid_fops is defined when DYNAMIC_FTRACE is not. Cc: stable@vger.kernel.org Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12tracing: Fix possible NULL pointer dereferencesNamhyung Kim3-7/+7
Currently set_ftrace_pid and set_graph_function files use seq_lseek for their fops. However seq_open() is called only for FMODE_READ in the fops->open() so that if an user tries to seek one of those file when she open it for writing, it sees NULL seq_file and then panic. It can be easily reproduced with following command: $ cd /sys/kernel/debug/tracing $ echo 1234 | sudo tee -a set_ftrace_pid In this example, GNU coreutils' tee opens the file with fopen(, "a") and then the fopen() internally calls lseek(). Link: http://lkml.kernel.org/r/1365663302-2170-1-git-send-email-namhyung@kernel.org Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Namhyung Kim <namhyung.kim@lge.com> Cc: stable@vger.kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-04-12Merge tag 'sound-3.9' of ↵Linus Torvalds9-40/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "This contains a few small ASoC fixes (wm8903, wm5102, samsung-i2s, tegra, and soc-compress) and an endian fix for NI USB-audio devices, update for Mark's e-mail address. No scary changes, AFAIS." * tag 'sound-3.9' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: MAINTAINERS: Update e-mail address ASoC: wm5102: Correct lookup of arizona struct in SYSCLK event ASoC: wm8903: Fix the bypass to HP/LINEOUT when no DAC or ADC is running ALSA: usb-audio: fix endianness bug in snd_nativeinstruments_* ASoC: tegra: Don't claim to support PCM pause and resume ASoC: Samsung: set drvdata before adding secondary device ASoC: Samsung: return error if drvdata is not set ASoC: compress: Cancel delayed power down if needed ASoC: core: Fix to check return value of snd_soc_update_bits_locked()
2013-04-12Merge tag 'asoc-maintainers-v3.9-rc6' of ↵Takashi Iwai1-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus MAINTAINERS: Update e-mail address Update the e-mail address I use for subsystems.
2013-04-12MAINTAINERS: Update e-mail addressMark Brown1-4/+4
Update the e-mail address I use for subsystems. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2013-04-12Merge tag 'asoc-v3.9-rc6' of ↵Takashi Iwai501-3000/+5512
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v3.9 A few updates, more than I'd like, fixing some relatively small issues but mostly driver specific ones. Nothing wildly exciting so if it doesn't make v3.9 it won't be the end of the world but it'd be nice.
2013-04-12x86/mm: Flush lazy MMU when DEBUG_PAGEALLOC is setBoris Ostrovsky1-0/+2
When CONFIG_DEBUG_PAGEALLOC is set page table updates made by kernel_map_pages() are not made visible (via TLB flush) immediately if lazy MMU is on. In environments that support lazy MMU (e.g. Xen) this may lead to fatal page faults, for example, when zap_pte_range() needs to allocate pages in __tlb_remove_page() -> tlb_next_batch(). Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: konrad.wilk@oracle.com Link: http://lkml.kernel.org/r/1365703192-2089-1-git-send-email-boris.ostrovsky@oracle.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-04-12x86/mm/cpa/selftest: Fix false positive in CPA self testAndrea Arcangeli1-1/+1
If the pmd is not present, _PAGE_PSE will not be set anymore. Fix the false positive. Reported-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Cc: Stefan Bader <stefan.bader@canonical.com> Cc: Andy Whitcroft <apw@canonical.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Borislav Petkov <bp@alien8.de> Link: http://lkml.kernel.org/r/1365687369-30802-1-git-send-email-aarcange@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>