summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2009-02-10ext4: Fix to read empty directory blocks correctly in 64kWei Yongjun1-1/+1
The rec_len field in the directory entry is 16 bits, so there was a problem representing rec_len for filesystems with a 64k block size in the case where the directory entry takes the entire 64k block. Unfortunately, there were two schemes that were proposed; one where all zeros meant 65536 and one where all ones (65535) meant 65536. E2fsprogs used 0, whereas the kernel used 65535. Oops. Fortunately this case happens extremely rarely, with the most common case being the lost+found directory, created by mke2fs. So we will be liberal in what we accept, and accept both encodings, but we will continue to encode 65536 as 65535. This will require a change in e2fsprogs, but with fortunately ext4 filesystems normally have the dir_index feature enabled, which precludes having a completely empty directory block. Signed-off-by: Wei Yongjun <yjwei@cn.fujitsu.com> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
2009-02-10jbd2: Avoid possible NULL dereference in jbd2_journal_begin_ordered_truncate()Jan Kara4-16/+41
If we race with commit code setting i_transaction to NULL, we could possibly dereference it. Proper locking requires the journal pointer (to access journal->j_list_lock), which we don't have. So we have to change the prototype of the function so that filesystem passes us the journal pointer. Also add a more detailed comment about why the function jbd2_journal_begin_ordered_truncate() does what it does and how it should be used. Thanks to Dan Carpenter <error27@gmail.com> for pointing to the suspitious code. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Acked-by: Joel Becker <joel.becker@oracle.com> CC: linux-ext4@vger.kernel.org CC: ocfs2-devel@oss.oracle.com CC: mfasheh@suse.de CC: Dan Carpenter <error27@gmail.com>
2009-02-10Revert "ext4: wait on all pending commits in ext4_sync_fs()"Jan Kara1-4/+7
This undoes commit 14ce0cb411c88681ab8f3a4c9caa7f42e97a3184. Since jbd2_journal_start_commit() is now fixed to return 1 when we started a transaction commit, there's some transaction waiting to be committed or there's a transaction already committing, we don't need to call ext4_force_commit() in ext4_sync_fs(). Furthermore ext4_force_commit() can unnecessarily create sync transaction which is expensive so it's worthwhile to remove it when we can. http://bugzilla.kernel.org/show_bug.cgi?id=12224 Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: Eric Sandeen <sandeen@redhat.com> Cc: linux-ext4@vger.kernel.org
2009-02-10jbd2: Fix return value of jbd2_journal_start_commit()Jan Kara1-6/+11
The function jbd2_journal_start_commit() returns 1 if either a transaction is committing or the function has queued a transaction commit. But it returns 0 if we raced with somebody queueing the transaction commit as well. This resulted in ext4_sync_fs() not functioning correctly (description from Arthur Jones): In the case of a data=ordered umount with pending long symlinks which are delayed due to a long list of other I/O on the backing block device, this causes the buffer associated with the long symlinks to not be moved to the inode dirty list in the second phase of fsync_super. Then, before they can be dirtied again, kjournald exits, seeing the UMOUNT flag and the dirty pages are never written to the backing block device, causing long symlink corruption and exposing new or previously freed block data to userspace. This can be reproduced with a script created by Eric Sandeen <sandeen@redhat.com>: #!/bin/bash umount /mnt/test2 mount /dev/sdb4 /mnt/test2 rm -f /mnt/test2/* dd if=/dev/zero of=/mnt/test2/bigfile bs=1M count=512 touch /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename ln -s /mnt/test2/thisisveryveryveryveryveryveryveryveryveryveryveryveryveryveryveryverylongfilename /mnt/test2/link umount /mnt/test2 mount /dev/sdb4 /mnt/test2 ls /mnt/test2/ This patch fixes jbd2_journal_start_commit() to always return 1 when there's a transaction committing or queued for commit. Signed-off-by: Jan Kara <jack@suse.cz> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> CC: Eric Sandeen <sandeen@redhat.com> CC: linux-ext4@vger.kernel.org
2009-02-14Linux 2.6.29-rc5v2.6.29-rc5Linus Torvalds1-1/+1
2009-02-13Merge branch 'for-linus' of ↵Linus Torvalds13-93/+112
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6: ASoC: Only register AC97 bus if it's not done already ALSA: hda - Add snd_hda_multi_out_dig_cleanup() ALSA: hda - Add missing terminator in slave dig-out array ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 model ALSA: hda - Register (new) devices at reconfig ALSA: mtpav - Fix initial value for input hwport ALSA: hda - add id for Intel IbexPeak integrated HDMI codec ALSA: hda - compute checksum in HDMI audio infoframe ALSA: hda - enable HDMI audio pin out at module loading time ALSA: hda - allow multi-channel HDMI audio playback when ELD is not present ASoC: Update SDP3430 machine driver for snd_soc_card ALSA: hda - Add quirk for Asus z37e (1043:8284) sound: Remove OSSlib stuff from linux/soundcard.h ASoC: WM8990: Fix kcontrol's private value use in put callback ASoC: TLV320AIC3X: Fix kcontrol's private value use in put callback
2009-02-13User namespaces: Only put the userns when we unhash the uidSerge E. Hallyn1-2/+1
uids in namespaces other than init don't get a sysfs entry. For those in the init namespace, while we're waiting to remove the sysfs entry for the uid the uid is still hashed, and alloc_uid() may re-grab that uid without getting a new reference to the user_ns, which we've already put in free_user before scheduling remove_user_sysfs_dir(). Reported-and-tested-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com> Signed-off-by: Serge E. Hallyn <serue@us.ibm.com> Acked-by: David Howells <dhowells@redhat.com> Tested-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-13Merge branch 'fix/asoc' into for-linusTakashi Iwai4-8/+16
2009-02-13Merge branch 'fix/hda' into for-linusTakashi Iwai7-33/+71
2009-02-13Merge branch 'fix/misc' into for-linusTakashi Iwai1-1/+2
2009-02-13Merge branch 'fix/oss-header-fix' into for-linusTakashi Iwai1-51/+23
2009-02-13ASoC: Only register AC97 bus if it's not done alreadyMark Brown1-1/+4
ASoC supports both explicit codec drivers for AC97 devices and a simple driver which uses the standard ALSA AC97 framework for codec support. When used with the generic AC97 codec support that will provide the ad hoc AC97 device for drivers like touchscreens to attach to so the core shouldn't do so. Reported-by: Manuel Lauss <mano@roarinelk.homelinux.net> Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
2009-02-13ALSA: hda - Add snd_hda_multi_out_dig_cleanup()Takashi Iwai4-2/+32
Added the helper function snd_hda_multi_out_dig_cleanup() to clean up the digital outputs with multi setup. This call is needed in cases the codec supports multiple digital outputs as slaves. Otherwise the slave widgets aren't properly cleaned up. For a single digital output (e.g. in patch_conexant.c), this call isn't needed. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-13ALSA: hda - Add missing terminator in slave dig-out arrayTakashi Iwai1-2/+2
Added the missing terminator for ad1989b_slave_dig_outs[]. Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-13ALSA: hda - Change HP dv7 (103c:30f4) quirk from hp-m4 to hp-dv5 modelHerton Ronaldo Krzesinski1-1/+1
Change HP dv7 quirk: although reported to work with hp-m4 model (https://bugzilla.novell.com/show_bug.cgi?id=445321), the original report doesn't contain info about testing of internal microphone. Recently I received a report about internal mic not working (https://qa.mandriva.com/show_bug.cgi?id=44855#c193), this must be related with the forced line in on pin 0x0e done with hp-m4 model. Thus change the current quirk from STAC_HP_M4 to STAC_HP_DV5, later reported to be fixed on a provided kernel with this change (https://qa.mandriva.com/show_bug.cgi?id=44855#c196). Signed-off-by: Herton Ronaldo Krzesinski <herton@mandriva.com.br> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-13Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6Linus Torvalds26-8901/+8767
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (32 commits) wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax iface net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2 netxen: fix compile waring "label ‘set_32_bit_mask’ defined but not used" on IA64 platform bnx2: Update version to 1.9.2 and copyright. bnx2: Fix jumbo frames error handling. bnx2: Update 5709 firmware. bnx2: Update 5706/5708 firmware. 3c505: do not set pcb->data.raw beyond its size Documentation/connector/cn_test.c: don't use gfp_any() net: don't use in_atomic() in gfp_any() IRDA: cnt is off by 1 netxen: remove pcie workaround sun3: print when lance_open() fails qlge: bugfix: Add missing rx buf clean index on early exit. qlge: bugfix: Fix RX scaling values. qlge: bugfix: Fix TSO breakage. qlge: bugfix: Add missing dev_kfree_skb_any() call. qlge: bugfix: Add missing put_page() call. qlge: bugfix: Fix fatal error recovery hang. qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb(). ...
2009-02-13wimax: fix oops in wimax_dev_get_by_genl_info() when looking up non-wimax ifaceInaky Perez-Gonzalez1-4/+5
When a non-wimax interface is looked up by the stack, a bad pointer is returned when the looked-up interface is not found in the list (of registered WiMAX interfaces). This causes an oops in the caller when trying to use the pointer. Fix by properly setting the pointer to NULL if we don't exit from the list_for_each() with a found entry. Signed-off-by: Inaky Perez-Gonzalez <inaky@linux.intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13net: 4 bytes kernel memory disclosure in SO_BSDCOMPAT gsopt try #2Clément Lecigne1-0/+2
In function sock_getsockopt() located in net/core/sock.c, optval v.val is not correctly initialized and directly returned in userland in case we have SO_BSDCOMPAT option set. This dummy code should trigger the bug: int main(void) { unsigned char buf[4] = { 0, 0, 0, 0 }; int len; int sock; sock = socket(33, 2, 2); getsockopt(sock, 1, SO_BSDCOMPAT, &buf, &len); printf("%x%x%x%x\n", buf[0], buf[1], buf[2], buf[3]); close(sock); } Here is a patch that fix this bug by initalizing v.val just after its declaration. Signed-off-by: Clément Lecigne <clement.lecigne@netasq.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13netxen: fix compile waring "label ‘set_32_bit_mask’ defined but not ↵Yang Hongyang1-1/+1
used" on IA64 platform When compile the latest kernel on IA64 platform,I got a warning: drivers/net/netxen/netxen_nic_main.c:203: warning: label ‘set_32_bit_mask’ defined but not used We do not need label ‘set_32_bit_mask’ on IA64 platform,So move it to #else. Signed-off-by: Yang Hongyang <yanghy@cn.fujitsu.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13bnx2: Update version to 1.9.2 and copyright.Michael Chan2-4/+4
Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13bnx2: Fix jumbo frames error handling.Michael Chan1-11/+19
If errors are reported on a frame descriptor, we need to account for the buffer pages that may have been used for this error packet and recycle them. Otherwise, we may get the wrong pages for the next packet. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: Matt Carlson <mcarlson@broadcom.com> Signed-off-by: Benjamin Li <benli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13bnx2: Update 5709 firmware.Michael Chan1-4432/+4363
New firmware fixes a data corruption issue when receiving and placing jumbo frames into host buffers. In some cases, the buffer descriptor is not updated correctly and this will lead to the driver linking the wrong number of pages into the SKB. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13bnx2: Update 5706/5708 firmware.Michael Chan1-4309/+4206
New firmware fixes a data corruption issue when receiving and placing jumbo frames into host buffers. In some cases, the buffer descriptor is not updated correctly and this will lead to the driver linking the wrong number of pages into the SKB. Signed-off-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-133c505: do not set pcb->data.raw beyond its sizeRoel Kluin1-10/+16
Ensure that we do not set pcb->data.raw beyond its size, print an error message and return false if we attempt to. A timout message was printed one too early. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13Documentation/connector/cn_test.c: don't use gfp_any()Andrew Morton1-4/+2
cn_test_timer_func() is a timer handler and can never use GFP_KERNEL - there's no point in using gfp_any() here. Also, use setup_timer(). Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13net: don't use in_atomic() in gfp_any()Andrew Morton1-1/+1
The problem is that in_atomic() will return false inside spinlocks if CONFIG_PREEMPT=n. This will lead to deadlockable GFP_KERNEL allocations from spinlocked regions. Secondly, if CONFIG_PREEMPT=y, this bug solves itself because networking will instead use GFP_ATOMIC from this callsite. Hence we won't get the might_sleep() debugging warnings which would have informed us of the buggy callsites. Solve both these problems by switching to in_interrupt(). Now, if someone runs a gfp_any() allocation from inside spinlock we will get the warning if CONFIG_PREEMPT=y. I reviewed all callsites and most of them were too complex for my little brain and none of them documented their interface requirements. I have no idea what this patch will do. Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13IRDA: cnt is off by 1Roel Kluin1-1/+1
If no prior break occurs, cnt reaches 101 after the loop, so we are still able to change speed when cnt has become 100. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13netxen: remove pcie workaroundDhananjay Phadke1-64/+0
Remove workaround for pcie bug in early revisions of NX3031 (rev 41 or earlier). This is taken care of during firmware init. The workaround required writing pcie config reg of every pcie function on a card, not all of which are enabled. Signed-off-by: Dhananjay Phadke <dhananjay@netxen.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13sun3: print when lance_open() failsRoel Kluin1-1/+1
With while (--i > 0) { ... } i reaches 0; print when lance_open() fails Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Add missing rx buf clean index on early exit.Ron Mercer1-0/+2
The large receive buffer queue is not properly tracking the current index in the case where an early exit occurs. This can happen when a page alloc or dma mapping fails. If this occurs the queue will get out of sync and invalid indexes can be written to the hardware. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Fix RX scaling values.Ron Mercer1-2/+2
Receive packets were only scaling across 2 of the receive queues. The value was hardcoded to 2 instead of being based on how many rx queues were running. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Fix TSO breakage.Ron Mercer1-4/+6
Moved the buffer mapping to a point after TSO logic has modified the iph->check field. We were seeing stale data on the PCIe bus. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Add missing dev_kfree_skb_any() call.Ron Mercer1-0/+2
We put the skb back if we can't get mapping for it. We don't want unmapped buffers on our receive buffer queue. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Add missing put_page() call.Ron Mercer1-0/+2
We put the page back if we can't get mapping for it. We don't want unmapped buffers on our receive buffer queue. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Fix fatal error recovery hang.Ron Mercer1-2/+11
Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13qlge: bugfix: Use netif_receive_skb() and vlan_hwaccel_receive_skb().Ron Mercer1-2/+2
Replace calls to vlan_hwaccel_rx() and netif_rx(). Thanks to Dave Miller for pointing out the the driver was making the wrong upcall for passing packets into the stack. Signed-off-by: Ron Mercer <ron.mercer@qlogic.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-13TG3: limit reaches -1Roel Kluin1-2/+2
With while (limit--) { ... } limit reaches -1, so 0 means success. Signed-off-by: Roel Kluin <roel.kluin@gmail.com> Acked-by: Michael Chan <mchan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-02-12Merge branch 'for-linus' of ↵Linus Torvalds3-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6 * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6: mm: Export symbol ksize()
2009-02-12Fix page writeback thinko, causing Berkeley DB slowdownNick Piggin1-1/+1
A bug was introduced into write_cache_pages cyclic writeout by commit 31a12666d8f0c22235297e1c1575f82061480029 ("mm: write_cache_pages cyclic fix"). The intention (and comments) is that we should cycle back and look for more dirty pages at the beginning of the file if there is no more work to be done. But the !done condition was dropped from the test. This means that any time the page writeout loop breaks (eg. due to nr_to_write == 0), we will set index to 0, then goto again. This will set done_index to index, then find done is set, so will proceed to the end of the function. When updating mapping->writeback_index for cyclic writeout, we now use done_index == 0, so we're always cycling back to 0. This seemed to be causing random mmap writes (slapadd and iozone) to start writing more pages from the LRU and writeout would slowdown, and caused bugzilla entry http://bugzilla.kernel.org/show_bug.cgi?id=12604 about Berkeley DB slowing down dramatically. With this patch, iozone random write performance is increased nearly 5x on my system (iozone -B -r 4k -s 64k -s 512m -s 1200m on ext2). Signed-off-by: Nick Piggin <npiggin@suse.de> Reported-and-tested-by: Jan Kara <jack@suse.cz> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12mm: Export symbol ksize()Kirill A. Shutemov3-0/+3
Commit 7b2cd92adc5430b0c1adeb120971852b4ea1ab08 ("crypto: api - Fix zeroing on free") added modular user of ksize(). Export that to fix crypto.ko compilation. Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Kirill A. Shutemov <kirill@shutemov.name> Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2009-02-12Merge git://git.infradead.org/users/cbou/battery-2.6.29Linus Torvalds1-1/+2
* git://git.infradead.org/users/cbou/battery-2.6.29: pcf50633_charger: Fix typo
2009-02-12ALSA: hda - Register (new) devices at reconfigTakashi Iwai1-1/+1
The devices that have been newly added during reconfig must be registered. Otherwise they won't be visible to user-space. Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-12ALSA: mtpav - Fix initial value for input hwportTakashi Iwai1-1/+2
Fix the initial value for input hwport. The old value (-1) may cause Oops when an realtime MIDI byte is received before the input port is explicitly given. Instead, now it's set to the broadcasting as default. Tested-by: Holger Dehnhardt <dehnhardt@ahdehnhardt.de> Cc: <stable@kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2009-02-12w1: w1 temp calculation overflow fixIan Dall1-1/+1
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=12646 When the temperature exceeds 32767 milli-degrees the temperature overflows to -32768 millidegrees. These are bothe well within the -55 - +125 degree range for the sensor. Fix overflow in left-shift of a u8. Signed-off-by: Ian Dall <ian@beware.dropbear.id.au> Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12nbd: fix I/O hang on disconnected nbdsPaul Clements1-0/+9
Fix a problem that causes I/O to a disconnected (or partially initialized) nbd device to hang indefinitely. To reproduce: # ioctl NBD_SET_SIZE_BLOCKS /dev/nbd23 514048 # dd if=/dev/nbd23 of=/dev/null bs=4096 count=1 ...hangs... This can also occur when an nbd device loses its nbd-client/server connection. Although we clear the queue of any outstanding I/Os after the client/server connection fails, any additional I/Os that get queued later will hang. This bug may also be the problem reported in this bug report: http://bugzilla.kernel.org/show_bug.cgi?id=12277 Testing would need to be performed to determine if the two issues are the same. This problem was introduced by the new request handling thread code ("NBD: allow nbd to be used locally", 3/2008), which entered into mainline around 2.6.25. The fix, which is fairly simple, is to restore the check for lo->sock being NULL in do_nbd_request. This causes I/O to an uninitialized nbd to immediately fail with an I/O error, as it did prior to the introduction of this bug. Signed-off-by: Paul Clements <paul.clements@steeleye.com> Reported-by: Jon Nelson <jnelson-kernel-bugzilla@jamponi.net> Acked-by: Pavel Machek <pavel@ucw.cz> Cc: <stable@kernel.org> [2.6.26.x, 2.6.27.x, 2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12mm: rearrange exit_mmap() to unlock before arch_exit_mmapJeremy Fitzhardinge1-4/+6
Christophe Saout reported [in precursor to: http://marc.info/?l=linux-kernel&m=123209902707347&w=4]: > Note that I also some a different issue with CONFIG_UNEVICTABLE_LRU. > Seems like Xen tears down current->mm early on process termination, so > that __get_user_pages in exit_mmap causes nasty messages when the > process had any mlocked pages. (in fact, it somehow manages to get into > the swapping code and produces a null pointer dereference trying to get > a swap token) Jeremy explained: Yes. In the normal case under Xen, an in-use pagetable is "pinned", meaning that it is RO to the kernel, and all updates must go via hypercall (or writes are trapped and emulated, which is much the same thing). An unpinned pagetable is not currently in use by any process, and can be directly accessed as normal RW pages. As an optimisation at process exit time, we unpin the pagetable as early as possible (switching the process to init_mm), so that all the normal pagetable teardown can happen with direct memory accesses. This happens in exit_mmap() -> arch_exit_mmap(). The munlocking happens a few lines below. The obvious thing to do would be to move arch_exit_mmap() to below the munlock code, but I think we'd want to call it even if mm->mmap is NULL, just to be on the safe side. Thus, this patch: exit_mmap() needs to unlock any locked vmas before calling arch_exit_mmap, as the latter may switch the current mm to init_mm, which would cause the former to fail. Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Signed-off-by: Lee Schermerhorn <lee.schermerhorn@hp.com> Cc: Christophe Saout <christophe@saout.de> Cc: Keir Fraser <keir.fraser@eu.citrix.com> Cc: Christophe Saout <christophe@saout.de> Cc: Alex Williamson <alex.williamson@hp.com> Cc: <stable@kernel.org> [2.6.28.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12parport: parport_serial, don't bind netmos ibm 0299Jiri Slaby1-0/+5
Since netmos 9835 with subids 0x1014(IBM):0x0299 is now bound with serial/8250_pci, because it has no parallel ports and subdevice id isn't in the expected form, return -ENODEV from probe function. This is performed in netmos preinit_hook. Signed-off-by: Jiri Slaby <jirislaby@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12writeback: fix break conditionFederico Cuello1-13/+16
Commit dcf6a79dda5cc2a2bec183e50d829030c0972aaa ("write-back: fix nr_to_write counter") fixed nr_to_write counter, but didn't set the break condition properly. If nr_to_write == 0 after being decremented it will loop one more time before setting done = 1 and breaking the loop. [akpm@linux-foundation.org: coding-style fixes] Cc: Artem Bityutskiy <Artem.Bityutskiy@nokia.com> Acked-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12syscall define: fix uml compile bugHeiko Carstens1-14/+14
With the new system call defines we get this on uml: arch/um/sys-i386/built-in.o: In function `sys_call_table': (.rodata+0x308): undefined reference to `sys_sigprocmask' Reason for this is that uml passes the preprocessor option -Dsigprocmask=kernel_sigprocmask to gcc when compiling the kernel. This causes SYSCALL_DEFINE3(sigprocmask, ...) to be expanded to SYSCALL_DEFINEx(3, kernel_sigprocmask, ...) and finally to a system call named sys_kernel_sigprocmask. However sys_sigprocmask is missing because of this. To avoid macro expansion for the system call name just concatenate the name at first define instead of carrying it through severel levels. This was pointed out by Al Viro. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Reviewed-by: WANG Cong <wangcong@zeuux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-12ext2/xip: refuse to change xip flag during remount with busy inodesCarsten Otte1-3/+6
For a reason that I was unable to understand in three months of debugging, mount ext2 -o remount stopped working properly when remounting from regular operation to xip, or the other way around. According to a git bisect search, the problem was introduced with the VM_MIXEDMAP/PTE_SPECIAL rework in the vm: commit 70688e4dd1647f0ceb502bbd5964fa344c5eb411 Author: Nick Piggin <npiggin@suse.de> Date: Mon Apr 28 02:13:02 2008 -0700 xip: support non-struct page backed memory In the failing scenario, the filesystem is mounted read only via root= kernel parameter on s390x. During remount (in rc.sysinit), the inodes of the bash binary and its libraries are busy and cannot be invalidated (the bash which is running rc.sysinit resides on subject filesystem). Afterwards, another bash process (running ifup-eth) recurses into a subshell, runs dup_mm (via fork). Some of the mappings in this bash process were created from inodes that could not be invalidated during remount. Both parent and child process crash some time later due to inconsistencies in their address spaces. The issue seems to be timing sensitive, various attempts to recreate it have failed. This patch refuses to change the xip flag during remount in case some inodes cannot be invalidated. This patch keeps users from running into that issue. [akpm@linux-foundation.org: cleanup] Signed-off-by: Carsten Otte <cotte@de.ibm.com> Cc: Nick Piggin <npiggin@suse.de> Cc: Jared Hulbert <jaredeh@gmail.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>