kernel/linux.git, branch v3.0.52

Linux 3.0.52

2012-11-17T21:14:48+00:00

ALSA: usb-audio: Fix mutex deadlock at disconnection

2012-11-17T21:14:26+00:00

commit 10e44239f67d0b6fb74006e61a7e883b8075247a upstream. The recent change for USB-audio disconnection race fixes introduced a mutex deadlock again. There is a circular dependency between chip->shutdown_rwsem and pcm->open_mutex, depicted like below, when a device is opened during the disconnection operation: A. snd_usb_audio_disconnect() -> card.c::register_mutex -> chip->shutdown_rwsem (write) -> snd_card_disconnect() -> pcm.c::register_mutex -> pcm->open_mutex B. snd_pcm_open() -> pcm->open_mutex -> snd_usb_pcm_open() -> chip->shutdown_rwsem (read) Since the chip->shutdown_rwsem protection in the case A is required only for turning on the chip->shutdown flag and it doesn't have to be taken for the whole operation, we can reduce its window in snd_usb_audio_disconnect(). Reported-by: Jiri Slaby Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman

ALSA: Fix card refcount unbalance

2012-11-17T21:14:26+00:00

commit 8bb4d9ce08b0a92ca174e41d92c180328f86173f upstream. There are uncovered cases whether the card refcount introduced by the commit a0830dbd isn't properly increased or decreased: - OSS PCM and mixer success paths - When lookup function gets NULL This patch fixes these places. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=50251 Signed-off-by: Takashi Iwai Signed-off-by: Greg Kroah-Hartman

intel-iommu: Fix AB-BA lockdep report

2012-11-17T21:14:26+00:00

commit 3e7abe2556b583e87dabda3e0e6178a67b20d06f upstream. When unbinding a device so that I could pass it through to a KVM VM, I got the lockdep report below. It looks like a legitimate lock ordering problem: - domain_context_mapping_one() takes iommu->lock and calls iommu_support_dev_iotlb(), which takes device_domain_lock (inside iommu->lock). - domain_remove_one_dev_info() starts by taking device_domain_lock then takes iommu->lock inside it (near the end of the function). So this is the classic AB-BA deadlock. It looks like a safe fix is to simply release device_domain_lock a bit earlier, since as far as I can tell, it doesn't protect any of the stuff accessed at the end of domain_remove_one_dev_info() anyway. BTW, the use of device_domain_lock looks a bit unsafe to me... it's at least not obvious to me why we aren't vulnerable to the race below: iommu_support_dev_iotlb() domain_remove_dev_info() lock device_domain_lock find info unlock device_domain_lock lock device_domain_lock find same info unlock device_domain_lock free_devinfo_mem(info) do stuff with info after it's free However I don't understand the locking here well enough to know if this is a real problem, let alone what the best fix is. Anyway here's the full lockdep output that prompted all of this: ======================================================= [ INFO: possible circular locking dependency detected ] 2.6.39.1+ #1 ------------------------------------------------------- bash/13954 is trying to acquire lock: (&(&iommu->lock)->rlock){......}, at: [] domain_remove_one_dev_info+0x121/0x230 but task is already holding lock: (device_domain_lock){-.-...}, at: [] domain_remove_one_dev_info+0x208/0x230 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (device_domain_lock){-.-...}: [] lock_acquire+0x9d/0x130 [] _raw_spin_lock_irqsave+0x55/0xa0 [] domain_context_mapping_one+0x600/0x750 [] domain_context_mapping+0x3f/0x120 [] iommu_prepare_identity_map+0x1c5/0x1e0 [] intel_iommu_init+0x88e/0xb5e [] pci_iommu_init+0x16/0x41 [] do_one_initcall+0x45/0x190 [] kernel_init+0xe3/0x168 [] kernel_thread_helper+0x4/0x10 -> #0 (&(&iommu->lock)->rlock){......}: [] __lock_acquire+0x195e/0x1e10 [] lock_acquire+0x9d/0x130 [] _raw_spin_lock_irqsave+0x55/0xa0 [] domain_remove_one_dev_info+0x121/0x230 [] device_notifier+0x72/0x90 [] notifier_call_chain+0x8c/0xc0 [] __blocking_notifier_call_chain+0x78/0xb0 [] blocking_notifier_call_chain+0x16/0x20 [] __device_release_driver+0xbc/0xe0 [] device_release_driver+0x2f/0x50 [] driver_unbind+0xa3/0xc0 [] drv_attr_store+0x2c/0x30 [] sysfs_write_file+0xe6/0x170 [] vfs_write+0xce/0x190 [] sys_write+0x54/0xa0 [] system_call_fastpath+0x16/0x1b other info that might help us debug this: 6 locks held by bash/13954: #0: (&buffer->mutex){+.+.+.}, at: [] sysfs_write_file+0x44/0x170 #1: (s_active#3){++++.+}, at: [] sysfs_write_file+0xcd/0x170 #2: (&__lockdep_no_validate__){+.+.+.}, at: [] driver_unbind+0x9b/0xc0 #3: (&__lockdep_no_validate__){+.+.+.}, at: [] device_release_driver+0x27/0x50 #4: (&(&priv->bus_notifier)->rwsem){.+.+.+}, at: [] __blocking_notifier_call_chain+0x5f/0xb0 #5: (device_domain_lock){-.-...}, at: [] domain_remove_one_dev_info+0x208/0x230 stack backtrace: Pid: 13954, comm: bash Not tainted 2.6.39.1+ #1 Call Trace: [] print_circular_bug+0xf7/0x100 [] __lock_acquire+0x195e/0x1e10 [] ? trace_hardirqs_off+0xd/0x10 [] ? trace_hardirqs_on_caller+0x13d/0x180 [] lock_acquire+0x9d/0x130 [] ? domain_remove_one_dev_info+0x121/0x230 [] _raw_spin_lock_irqsave+0x55/0xa0 [] ? domain_remove_one_dev_info+0x121/0x230 [] ? trace_hardirqs_off+0xd/0x10 [] domain_remove_one_dev_info+0x121/0x230 [] device_notifier+0x72/0x90 [] notifier_call_chain+0x8c/0xc0 [] __blocking_notifier_call_chain+0x78/0xb0 [] blocking_notifier_call_chain+0x16/0x20 [] __device_release_driver+0xbc/0xe0 [] device_release_driver+0x2f/0x50 [] driver_unbind+0xa3/0xc0 [] drv_attr_store+0x2c/0x30 [] sysfs_write_file+0xe6/0x170 [] vfs_write+0xce/0x190 [] sys_write+0x54/0xa0 [] system_call_fastpath+0x16/0x1b Signed-off-by: Roland Dreier Signed-off-by: David Woodhouse Cc: Steven Rostedt Signed-off-by: Greg Kroah-Hartman

xfs: fix reading of wrapped log data

2012-11-17T21:14:25+00:00

commit 6ce377afd1755eae5c93410ca9a1121dfead7b87 upstream. Commit 4439647 ("xfs: reset buffer pointers before freeing them") in 3.0-rc1 introduced a regression when recovering log buffers that wrapped around the end of log. The second part of the log buffer at the start of the physical log was being read into the header buffer rather than the data buffer, and hence recovery was seeing garbage in the data buffer when it got to the region of the log buffer that was incorrectly read. Reported-by: Torsten Kaiser Signed-off-by: Dave Chinner Reviewed-by: Christoph Hellwig Reviewed-by: Mark Tinguely Signed-off-by: Ben Myers Signed-off-by: Greg Kroah-Hartman

USB: mos7840: remove unused variable

2012-11-17T21:14:25+00:00

Fix warning about unused variable introduced by commit e681b66f2e19fa ("USB: mos7840: remove invalid disconnect handling") upstream. A subsequent fix which removed the disconnect function got rid of the warning but that one was only backported to v3.6. Reported-by: Jiri Slaby Signed-off-by: Johan Hovold Signed-off-by: Greg Kroah-Hartman

drm/i915: clear the entire sdvo infoframe buffer

2012-11-17T21:14:25+00:00

commit b6e0e543f75729f207b9c72b0162ae61170635b2 upstream. Like in the case of native hdmi, which is fixed already in commit adf00b26d18e1b3570451296e03bcb20e4798cdd Author: Paulo Zanoni Date: Tue Sep 25 13:23:34 2012 -0300 drm/i915: make sure we write all the DIP data bytes we need to clear the entire sdvo buffer to avoid upsetting the display. Since infoframe buffer writing is now a bit more elaborate, extract it into it's own function. This will be useful if we ever get around to properly update the ELD for sdvo. Also #define proper names for the two buffer indexes with fixed usage. v2: Cite the right commit above, spotted by Paulo Zanoni. v3: I'm too stupid to paste the right commit. v4: Ben Hutchings noticed that I've failed to handle an underflow in my loop logic, breaking it for i >= length + 8. Since I've just lost C programmer license, use his solution. Also, make the frustrated 0-base buffer size a notch more clear. Reported-and-tested-by: Jürg Billeter Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=25732 Cc: Paulo Zanoni Cc: Ben Hutchings Reviewed-by: Rodrigo Vivi Signed-off-by: Daniel Vetter Signed-off-by: Greg Kroah-Hartman

drm/i915: fixup infoframe support for sdvo

2012-11-17T21:14:25+00:00

commit 81014b9d0b55fb0b48f26cd2a943359750d532db upstream. At least the worst offenders: - SDVO specifies that the encoder should compute the ecc. Testing also shows that we must not send the ecc field, so copy the dip_infoframe struct to a temporay place and avoid the ecc field. This way the avi infoframe is exactly 17 bytes long, which agrees with what the spec mandates as a minimal storage capacity (with the ecc field it would be 18 bytes). - Only 17 when sending the avi infoframe. The SDVO spec explicitly says that sending more data than what the device announces results in undefined behaviour. - Add __attribute__((packed)) to the avi and spd infoframes, for otherwise they're wrongly aligned. Noticed because the avi infoframe ended up being 18 bytes large instead of 17. We haven't noticed this yet because we don't use the uint16_t fields yet (which are the only ones that would be wrongly aligned). This regression has been introduce by 3c17fe4b8f40a112a85758a9ab2aebf772bdd647 is the first bad commit commit 3c17fe4b8f40a112a85758a9ab2aebf772bdd647 Author: David Härdeman Date: Fri Sep 24 21:44:32 2010 +0200 i915: enable AVI infoframe for intel_hdmi.c [v4] Patch tested on my g33 with a sdvo hdmi adaptor. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=25732 Tested-by: Peter Ross (G35 SDVO-HDMI) Reviewed-by: Eugeni Dodonov Signed-Off-by: Daniel Vetter Cc: Ben Hutchings Signed-off-by: Greg Kroah-Hartman

drm/vmwgfx: Fix hibernation device reset

2012-11-17T21:14:25+00:00

commit 95e8f6a21996c4cc2c4574b231c6e858b749dce3 upstream. The device would not reset properly when resuming from hibernation. Signed-off-by: Thomas Hellstrom Reviewed-by: Brian Paul Reviewed-by: Dmitry Torokhov Cc: linux-graphics-maintainer@vmware.com Signed-off-by: Dave Airlie Signed-off-by: Greg Kroah-Hartman

futex: Handle futex_pi OWNER_DIED take over correctly

2012-11-17T21:14:25+00:00

commit 59fa6245192159ab5e1e17b8e31f15afa9cff4bf upstream. Siddhesh analyzed a failure in the take over of pi futexes in case the owner died and provided a workaround. See: http://sourceware.org/bugzilla/show_bug.cgi?id=14076 The detailed problem analysis shows: Futex F is initialized with PTHREAD_PRIO_INHERIT and PTHREAD_MUTEX_ROBUST_NP attributes. T1 lock_futex_pi(F); T2 lock_futex_pi(F); --> T2 blocks on the futex and creates pi_state which is associated to T1. T1 exits --> exit_robust_list() runs --> Futex F userspace value TID field is set to 0 and FUTEX_OWNER_DIED bit is set. T3 lock_futex_pi(F); --> Succeeds due to the check for F's userspace TID field == 0 --> Claims ownership of the futex and sets its own TID into the userspace TID field of futex F --> returns to user space T1 --> exit_pi_state_list() --> Transfers pi_state to waiter T2 and wakes T2 via rt_mutex_unlock(&pi_state->mutex) T2 --> acquires pi_state->mutex and gains real ownership of the pi_state --> Claims ownership of the futex and sets its own TID into the userspace TID field of futex F --> returns to user space T3 --> observes inconsistent state This problem is independent of UP/SMP, preemptible/non preemptible kernels, or process shared vs. private. The only difference is that certain configurations are more likely to expose it. So as Siddhesh correctly analyzed the following check in futex_lock_pi_atomic() is the culprit: if (unlikely(ownerdied || !(curval & FUTEX_TID_MASK))) { We check the userspace value for a TID value of 0 and take over the futex unconditionally if that's true. AFAICT this check is there as it is correct for a different corner case of futexes: the WAITERS bit became stale. Now the proposed change - if (unlikely(ownerdied || !(curval & FUTEX_TID_MASK))) { + if (unlikely(ownerdied || + !(curval & (FUTEX_TID_MASK | FUTEX_WAITERS)))) { solves the problem, but it's not obvious why and it wreckages the "stale WAITERS bit" case. What happens is, that due to the WAITERS bit being set (T2 is blocked on that futex) it enforces T3 to go through lookup_pi_state(), which in the above case returns an existing pi_state and therefor forces T3 to legitimately fight with T2 over the ownership of the pi_state (via pi_state->mutex). Probelm solved! Though that does not work for the "WAITERS bit is stale" problem because if lookup_pi_state() does not find existing pi_state it returns -ERSCH (due to TID == 0) which causes futex_lock_pi() to return -ESRCH to user space because the OWNER_DIED bit is not set. Now there is a different solution to that problem. Do not look at the user space value at all and enforce a lookup of possibly available pi_state. If pi_state can be found, then the new incoming locker T3 blocks on that pi_state and legitimately races with T2 to acquire the rt_mutex and the pi_state and therefor the proper ownership of the user space futex. lookup_pi_state() has the correct order of checks. It first tries to find a pi_state associated with the user space futex and only if that fails it checks for futex TID value = 0. If no pi_state is available nothing can create new state at that point because this happens with the hash bucket lock held. So the above scenario changes to: T1 lock_futex_pi(F); T2 lock_futex_pi(F); --> T2 blocks on the futex and creates pi_state which is associated to T1. T1 exits --> exit_robust_list() runs --> Futex F userspace value TID field is set to 0 and FUTEX_OWNER_DIED bit is set. T3 lock_futex_pi(F); --> Finds pi_state and blocks on pi_state->rt_mutex T1 --> exit_pi_state_list() --> Transfers pi_state to waiter T2 and wakes it via rt_mutex_unlock(&pi_state->mutex) T2 --> acquires pi_state->mutex and gains ownership of the pi_state --> Claims ownership of the futex and sets its own TID into the userspace TID field of futex F --> returns to user space This covers all gazillion points on which T3 might come in between T1's exit_robust_list() clearing the TID field and T2 fixing it up. It also solves the "WAITERS bit stale" problem by forcing the take over. Another benefit of changing the code this way is that it makes it less dependent on untrusted user space values and therefor minimizes the possible wreckage which might be inflicted. As usual after staring for too long at the futex code my brain hurts so much that I really want to ditch that whole optimization of avoiding the syscall for the non contended case for PI futexes and rip out the maze of corner case handling code. Unfortunately we can't as user space relies on that existing behaviour, but at least thinking about it helps me to preserve my mental sanity. Maybe we should nevertheless :) Reported-and-tested-by: Siddhesh Poyarekar Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1210232138540.2756@ionos Acked-by: Darren Hart Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman