Age  Commit message  Author  Files  Lines
2013-11-24  block: Kill bio_segments()/bi_vcnt usage  Kent Overstreet  10  -87/+94
When we start sharing biovecs, keeping bi_vcnt accurate for splits is going to be error prone - and unnecessary, if we refactor some code. So bio_segments() has to go - but most of the existing users just needed to know if the bio had multiple segments, which is easier - add a bio_multiple_segments() for them. (Two of the current uses of bio_segments() are going to go away in a couple patches, but the current implementation of bio_segments() is unsafe as soon as we start doing driver conversions for immutable biovecs - so implement a dumb version for bisectability, it'll go away in a couple patches) Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Neil Brown <neilb@suse.de> Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
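As a rough illustration of why a "does this bio have more than one segment?" test is easier than an accurate segment count, here is a user-space sketch. The `struct seg`, `struct seg_iter` and `io_multiple_segments()` names are hypothetical stand-ins, not the kernel's types or the patch's code; the sketch only assumes that an iterator records the bytes remaining and the position inside the current segment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of a segment (biovec-like) array. */
struct seg {
	size_t len;             /* length of this segment in bytes */
};

/* Hypothetical stand-in for an iterator over that array. */
struct seg_iter {
	size_t size;            /* bytes remaining in the whole I/O */
	size_t idx;             /* index of the current segment */
	size_t done;            /* bytes already completed in the current segment */
};

/*
 * The I/O spans more than one segment exactly when more bytes remain
 * than the current segment still has to offer. No segment count is
 * consulted, so a stale count cannot give a wrong answer here.
 */
static bool io_multiple_segments(const struct seg *segs,
				 const struct seg_iter *it)
{
	assert(it->size > 0);   /* only meaningful while data remains */
	size_t left = segs[it->idx].len - it->done;
	return it->size > left;
}
```

The check needs only the iterator and the current array entry, which is the property the commit message relies on when it calls the multiple-segment question "easier".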
2013-11-24  bio-integrity: Convert to bvec_iter  Kent Overstreet  4  -126/+71
The bio integrity is also stored in a bvec array, so if we use the bvec iter code we just added, the integrity code won't need to implement its own iteration stuff (bio_integrity_mark_head(), bio_integrity_mark_tail()) Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
2013-11-24  block: Convert bio_copy_data() to bvec_iter  Kent Overstreet  1  -35/+25
Our fancy new bvec iterator makes code like this much easier to write. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk>
2013-11-24  block: Immutable bio vecs  Kent Overstreet  7  -38/+194
This adds a mechanism by which we can advance a bio by an arbitrary number of bytes without modifying the biovec: bio->bi_iter.bi_bvec_done indicates the number of bytes completed in the current bvec. Various driver code still needs to be updated to not refer to the bvec directly before we can use this for interesting things, like efficient bio splitting. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net
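The mechanism described above can be sketched in user space as follows. This is a minimal illustration under stated assumptions, not the kernel code: `struct seg`, `struct seg_iter`, `seg_iter_advance()` and `seg_iter_current()` are hypothetical names, with `done` playing the role the message gives to `bi_bvec_done`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the (never modified) segment array. */
struct seg {
	const char *base;       /* start of this segment's memory */
	size_t len;             /* length of the segment in bytes */
};

/* Hypothetical stand-in for the iterator; all progress lives here. */
struct seg_iter {
	size_t size;            /* bytes remaining in the whole I/O */
	size_t idx;             /* index of the current segment */
	size_t done;            /* bytes completed in the current segment */
};

/* Advance by an arbitrary byte count; the array is only read, never written. */
static void seg_iter_advance(const struct seg *segs, struct seg_iter *it,
			     size_t bytes)
{
	assert(bytes <= it->size);
	it->size -= bytes;
	while (bytes) {
		size_t left = segs[it->idx].len - it->done;
		size_t step = bytes < left ? bytes : left;

		bytes -= step;
		it->done += step;
		if (it->done == segs[it->idx].len) {
			/* current segment fully consumed, move to the next */
			it->idx++;
			it->done = 0;
		}
	}
}

/* Construct a view of the current segment that accounts for `done` and `size`. */
static struct seg seg_iter_current(const struct seg *segs,
				   const struct seg_iter *it)
{
	struct seg cur;
	size_t left;

	assert(it->size > 0);   /* no current segment once the I/O is finished */
	left = segs[it->idx].len - it->done;
	cur.base = segs[it->idx].base + it->done;
	cur.len = left < it->size ? left : it->size;
	return cur;
}
```

Because only the iterator changes, two iterators over the same array can describe different byte ranges of it, which is what would make the splitting mentioned in the message cheap.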
2013-11-24  block: Convert bio_for_each_segment() to bvec_iter  Kent Overstreet  39  -397/+401
More prep work for immutable biovecs - with immutable bvecs drivers won't be able to use the biovec directly, they'll need to use helpers that take into account bio->bi_iter.bi_bvec_done. This updates callers for the new usage without changing the implementation yet. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Paul Clements <Paul.Clements@steeleye.com> Cc: Jim Paris <jim@jtan.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Nagalakshmi Nandigama <Nagalakshmi.Nandigama@lsi.com> Cc: Sreekanth Reddy <Sreekanth.Reddy@lsi.com> Cc: support@lsi.com Cc: "James E.J. 
Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Tejun Heo <tj@kernel.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Matthew Wilcox <matthew.r.wilcox@intel.com> Cc: Keith Busch <keith.busch@intel.com> Cc: Stephen Hemminger <shemminger@vyatta.com> Cc: Quoc-Son Anh <quoc-sonx.anh@intel.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Jerome Marchand <jmarchan@redhat.com> Cc: Seth Jennings <sjenning@linux.vnet.ibm.com> Cc: "Martin K. Petersen" <martin.petersen@oracle.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Vivek Goyal <vgoyal@redhat.com> Cc: "Darrick J. Wong" <darrick.wong@oracle.com> Cc: Chris Metcalf <cmetcalf@tilera.com> Cc: Jan Kara <jack@suse.cz> Cc: linux-m68k@lists.linux-m68k.org Cc: linuxppc-dev@lists.ozlabs.org Cc: drbd-user@lists.linbit.com Cc: nbd-general@lists.sourceforge.net Cc: cbe-oss-dev@lists.ozlabs.org Cc: xen-devel@lists.xensource.com Cc: virtualization@lists.linux-foundation.org Cc: linux-raid@vger.kernel.org Cc: linux-s390@vger.kernel.org Cc: DL-MPTFusionLinux@lsi.com Cc: linux-scsi@vger.kernel.org Cc: devel@driverdev.osuosl.org Cc: linux-fsdevel@vger.kernel.org Cc: cluster-devel@redhat.com Cc: linux-mm@kvack.org Acked-by: Geoff Levand <geoff@infradead.org>
2013-11-24  block: Convert bio_iovec() to bvec_iter  Kent Overstreet  6  -23/+26
For immutable biovecs, we'll be introducing a new bio_iovec() that uses our new bvec iterator to construct a biovec, taking into account bvec_iter->bi_bvec_done - this patch updates existing users for the new usage. Some of the existing users really do need a pointer into the bvec array - those uses are all going to be removed, but we'll need the functionality from immutable to remove them - so for now rename the existing bio_iovec() -> __bio_iovec(), and it'll be removed in a couple patches. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Cc: "James E.J. Bottomley" <JBottomley@parallels.com>
2013-11-24  dm: Use bvec_iter for dm_bio_record()  Kent Overstreet  1  -9/+3
This patch doesn't itself have any functional changes, but immutable biovecs are going to add a bi_bvec_done member to bi_iter, which will need to be saved too here. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: dm-devel@redhat.com Reviewed-by: Mike Snitzer <snitzer@redhat.com>
2013-11-24  block: Abstract out bvec iterator  Kent Overstreet  107  -638/+700
Immutable biovecs are going to require an explicit iterator. To implement immutable bvecs, a later patch is going to add a bi_bvec_done member to this struct; for now, this patch effectively just renames things. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: "Ed L. Cashin" <ecashin@coraid.com> Cc: Nick Piggin <npiggin@kernel.dk> Cc: Lars Ellenberg <drbd-dev@lists.linbit.com> Cc: Jiri Kosina <jkosina@suse.cz> Cc: Matthew Wilcox <willy@linux.intel.com> Cc: Geoff Levand <geoff@infradead.org> Cc: Yehuda Sadeh <yehuda@inktank.com> Cc: Sage Weil <sage@inktank.com> Cc: Alex Elder <elder@inktank.com> Cc: ceph-devel@vger.kernel.org Cc: Joshua Morris <josh.h.morris@us.ibm.com> Cc: Philip Kelleher <pjk1939@linux.vnet.ibm.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Jeremy Fitzhardinge <jeremy@goop.org> Cc: Neil Brown <neilb@suse.de> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: dm-devel@redhat.com Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: linux390@de.ibm.com Cc: Boaz Harrosh <bharrosh@panasas.com> Cc: Benny Halevy <bhalevy@tonian.com> Cc: "James E.J. Bottomley" <JBottomley@parallels.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: "Nicholas A. 
Bellinger" <nab@linux-iscsi.org> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Steven Whitehouse <swhiteho@redhat.com> Cc: Dave Kleikamp <shaggy@kernel.org> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com> Cc: KONISHI Ryusuke <konishi.ryusuke@lab.ntt.co.jp> Cc: Mark Fasheh <mfasheh@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Ben Myers <bpm@sgi.com> Cc: xfs@oss.sgi.com Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Frederic Weisbecker <fweisbec@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Len Brown <len.brown@intel.com> Cc: Pavel Machek <pavel@ucw.cz> Cc: "Rafael J. Wysocki" <rjw@sisk.pl> Cc: Herton Ronaldo Krzesinski <herton.krzesinski@canonical.com> Cc: Ben Hutchings <ben@decadent.org.uk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Guo Chao <yan@linux.vnet.ibm.com> Cc: Tejun Heo <tj@kernel.org> Cc: Asai Thambi S P <asamymuthupa@micron.com> Cc: Selvan Mani <smani@micron.com> Cc: Sam Bradshaw <sbradshaw@micron.com> Cc: Wei Yongjun <yongjun_wei@trendmicro.com.cn> Cc: "Roger Pau Monné" <roger.pau@citrix.com> Cc: Jan Beulich <jbeulich@suse.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> Cc: Ian Campbell <Ian.Campbell@citrix.com> Cc: Sebastian Ott <sebott@linux.vnet.ibm.com> Cc: Christian Borntraeger <borntraeger@de.ibm.com> Cc: Minchan Kim <minchan@kernel.org> Cc: Jiang Liu <jiang.liu@huawei.com> Cc: Nitin Gupta <ngupta@vflare.org> Cc: Jerome Marchand <jmarchand@redhat.com> Cc: Joe Perches <joe@perches.com> Cc: Peng Tao <tao.peng@emc.com> Cc: Andy Adamson <andros@netapp.com> Cc: fanchaoting <fanchaoting@cn.fujitsu.com> Cc: Jie Liu <jeff.liu@oracle.com> Cc: Sunil Mushran <sunil.mushran@gmail.com> Cc: "Martin K. 
Petersen" <martin.petersen@oracle.com> Cc: Namjae Jeon <namjae.jeon@samsung.com> Cc: Pankaj Kumar <pankaj.km@samsung.com> Cc: Dan Magenheimer <dan.magenheimer@oracle.com> Cc: Mel Gorman <mgorman@suse.de>
2013-11-24  bcache: Kill unaligned bvec hack  Kent Overstreet  3  -35/+7
Bcache has a hack to avoid cloning the biovec if it's all full pages - but with immutable biovecs coming this won't be necessary anymore. For now, we remove the special case and always clone the bvec array so that the immutable biovec patches are simpler. Signed-off-by: Kent Overstreet <kmo@daterainc.com>
2013-11-24  block: Convert various code to bio_for_each_segment()  Kent Overstreet  10  -102/+67
With immutable biovecs we don't want code accessing bi_io_vec directly - the uses this patch changes weren't incorrect since they all own the bio, but it makes the code harder to audit for no good reason - also, this will help with multipage bvecs later. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Chris Mason <chris.mason@fusionio.com> Cc: Jaegeuk Kim <jaegeuk.kim@samsung.com> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Trond Myklebust <Trond.Myklebust@netapp.com>
2013-11-24  block: submit_bio_wait() conversions  Kent Overstreet  8  -118/+24
It was being open coded in a few places. Signed-off-by: Kent Overstreet <kmo@daterainc.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joern Engel <joern@logfs.org> Cc: Prasad Joshi <prasadjoshi.linux@gmail.com> Cc: Neil Brown <neilb@suse.de> Cc: Chris Mason <chris.mason@fusionio.com> Acked-by: NeilBrown <neilb@suse.de>
2013-11-22  Linux 3.13-rc1 (v3.13-rc1)  Linus Torvalds  1  -2/+2
2013-11-22  Merge tag 'ecryptfs-3.13-rc1-quiet-checkers' of ↵  Linus Torvalds  1  -6/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs Pull minor eCryptfs fix from Tyler Hicks: "Quiet static checkers by removing unneeded conditionals" * tag 'ecryptfs-3.13-rc1-quiet-checkers' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs: eCryptfs: file->private_data is always valid
2013-11-22  Merge tag 'sound-fix2-3.13-rc1' of ↵  Linus Torvalds  9  -47/+161
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull second set of sound fixes from Takashi Iwai: "A collection of small fixes in HD-audio quirks and runtime PM, ASoC rcar, ab8500 and other codecs. Most of the commits are for stable kernels, too" * tag 'sound-fix2-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda - Set current_headset_type to ALC_HEADSET_TYPE_ENUM (janitorial) ALSA: hda - Provide missing pin configs for VAIO with ALC260 ALSA: hda - Add headset quirk for Dell Inspiron 3135 ALSA: hda - Fix the headphone jack detection on Sony VAIO TX ALSA: hda - Fix missing bass speaker on ASUS N550 ALSA: hda - Fix unbalanced runtime PM notification at resume ASoC: arizona: Set FLL to free-run before disabling ALSA: hda - A casual Dell Headset quirk ASoC: rcar: fixup dma_async_issue_pending() timing ASoC: rcar: off by one in rsnd_scu_set_route() ASoC: wm5110: Add post SYSCLK register patch for rev D chip ASoC: ab8500: Revert to using custom I/O functions ALSA: hda - Also enable mute/micmute LED control for "Lenovo dock" fixup ALSA: firewire-lib: include sound/asound.h to refer to snd_pcm_format_t ALSA: hda - Select FW_LOADER from CONFIG_SND_HDA_CODEC_CA0132_DSP ALSA: hda - Enable mute/mic-mute LEDs for more Thinkpads with Realtek codec ASoC: rcar: fixup mod access before checking
2013-11-22  Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux  Linus Torvalds  54  -324/+1057
Pull DRM fixes from Dave Airlie: "I was going to leave this until post -rc1 but sysfs fixes broke hotplug in userspace, so I had to fix it harder, otherwise a set of pulls from intel, radeon and vmware, The vmware/ttm changes are bit larger but since its early and they are unlikely to break anything else I put them in, it lets vmware work with dri3" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (36 commits) drm/sysfs: fix hotplug regression since lifetime changes drm/exynos: g2d: fix memory leak to userptr drm/i915: Fix gen3 self-refresh watermarks drm/ttm: Remove set_need_resched from the ttm fault handler drm/ttm: Don't move non-existing data drm/radeon: hook up backlight functions for CI and KV family. drm/i915: Replicate BIOS eDP bpp clamping hack for hsw drm/i915: Do not enable package C8 on unsupported hardware drm/i915: Hold pc8 lock around toggling pc8.gpu_idle drm/i915: encoder->get_config is no longer optional drm/i915/tv: add ->get_config callback drm/radeon/cik: Add macrotile mode array query drm/radeon/cik: Return backend map information to userspace drm/vmwgfx: Make vmwgfx dma buffers prime aware drm/vmwgfx: Make surfaces prime-aware drm/vmwgfx: Hook up the prime ioctls drm/ttm: Add a minimal prime implementation for ttm base objects drm/vmwgfx: Fix false lockdep warning drm/ttm: Allow execbuf util reserves without ticket drm/i915: restore the early forcewake cleanup ...
2013-11-22  Merge tag 'pci-v3.13-fixes-1' of ↵  Linus Torvalds  65  -521/+518
git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Miscellaneous - Remove duplicate disable from pcie_portdrv_remove() (Yinghai Lu) - Fix whitespace, capitalization, and spelling errors (Bjorn Helgaas)" * tag 'pci-v3.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: PCI: Remove duplicate pci_disable_device() from pcie_portdrv_remove() PCI: Fix whitespace, capitalization, and spelling errors
2013-11-22  Merge branch 'for-next' of ↵  Linus Torvalds  53  -789/+1009
git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending Pull SCSI target updates from Nicholas Bellinger: "Things have been quiet this round with mostly bugfixes, percpu conversions, and other minor iscsi-target conformance testing changes. The highlights include: - Add demo_mode_discovery attribute for iscsi-target (Thomas) - Convert tcm_fc(FCoE) to use percpu-ida pre-allocation - Add send completion interrupt coalescing for ib_isert - Convert target-core to use percpu-refcounting for se_lun - Fix mutex_trylock usage bug in iscsit_increment_maxcmdsn - tcm_loop updates (Hannes) - target-core ALUA cleanups + prep for v3.14 SCSI Referrals support (Hannes) v3.14 is currently shaping to be a busy development cycle in target land, with initial support for T10 Referrals and T10 DIF currently on the roadmap" * 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending: (40 commits) iscsi-target: chap auth shouldn't match username with trailing garbage iscsi-target: fix extract_param to handle buffer length corner case iscsi-target: Expose default_erl as TPG attribute target_core_configfs: split up ALUA supported states target_core_alua: Make supported states configurable target_core_alua: Store supported ALUA states target_core_alua: Rename ALUA_ACCESS_STATE_OPTIMIZED target_core_alua: spellcheck target core: rename (ex,im)plict -> (ex,im)plicit percpu-refcount: Add percpu-refcount.o to obj-y iscsi-target: Do not reject non-immediate CmdSNs exceeding MaxCmdSN iscsi-target: Convert iscsi_session statistics to atomic_long_t target: Convert se_device statistics to atomic_long_t target: Fix delayed Task Aborted Status (TAS) handling bug iscsi-target: Reject unsupported multi PDU text command sequence ib_isert: Avoid duplicate iscsit_increment_maxcmdsn call iscsi-target: Fix mutex_trylock usage in iscsit_increment_maxcmdsn target: Core does not need blkdev.h target: Pass through I/O topology for block backstores iser-target: Avoid using FRMR for 
single dma entry requests ...
2013-11-22  Merge tag 'hwmon-for-linus' of ↵  Linus Torvalds  5  -13/+88
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging Pull hwmon fixes from Guenter Roeck: - acpi_power_meter: Fix return value check from call to acpi_bus_get_device - nct6775: Fix/improve NCT6791 support - lm75: Add support for GMT G751 * tag 'hwmon-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging: hwmon: (acpi_power_meter) Fix acpi_bus_get_device() return value check hwmon: (nct6775) NCT6791 supports weight control only for CPUFAN hwmon: (nct6775) Monitor additional temperature registers hwmon: (lm75) Add support for GMT G751 chip
2013-11-22  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net  Linus Torvalds  72  -316/+504
Pull networking fixes from David Miller: 1) Fix memory leaks and other issues in mwifiex driver, from Amitkumar Karwar. 2) skb_segment() can choke on packets using frag lists, fix from Herbert Xu with help from Eric Dumazet and others. 3) IPv4 output cached route instantiation properly handles races involving two threads trying to install the same route, but we forgot to propagate this logic to input routes as well. Fix from Alexei Starovoitov. 4) Put protections in place to make sure that recvmsg() paths never accidentally copy uninitialized memory back into userspace and also make sure that we never try to use more than sockaddr_storage for building the on-kernel-stack copy of a sockaddr. Fixes from Hannes Frederic Sowa. 5) R8152 driver transmit flow bug fixes from Hayes Wang. 6) Fix some minor fallouts from genetlink changes, from Johannes Berg and Michael Opdenacker. 7) AF_PACKET sendmsg path can race with netdevice unregister notifier, fix by using RCU to make sure the network device doesn't go away from under us. Fix from Daniel Borkmann.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (43 commits) gso: handle new frag_list of frags GRO packets genetlink: fix genl_set_err() group ID genetlink: fix genlmsg_multicast() bug packet: fix use after free race in send path when dev is released xen-netback: stop the VIF thread before unbinding IRQs wimax: remove dead code net/phy: Add the autocross feature for forced links on VSC82x4 net/phy: Add VSC8662 support net/phy: Add VSC8574 support net/phy: Add VSC8234 support net: add BUG_ON if kernel advertises msg_namelen > sizeof(struct sockaddr_storage) net: rework recvmsg handler msg_name and msg_namelen logic bridge: flush br's address entry in fdb when remove the net: core: Always propagate flag changes to interfaces ipv4: fix race in concurrent ip_route_input_slow() r8152: fix incorrect type in assignment r8152: support stopping/waking tx queue r8152: modify the tx flow r8152: fix tx/rx memory overflow netfilter: ebt_ip6: fix source and destination matching ...
2013-11-22  Merge branch 'fixes' of git://git.linaro.org/people/rmk/linux-arm  Linus Torvalds  8  -14/+39
Pull ARM fixes from Russell King: "Some small fixes for this merge window, most of them quite self explanatory - the biggest thing here is a fix for the ARMv7 LPAE suspend/resume support" * 'fixes' of git://git.linaro.org/people/rmk/linux-arm: ARM: 7894/1: kconfig: select GENERIC_CLOCKEVENTS if HAVE_ARM_ARCH_TIMER ARM: 7893/1: bitops: only emit .arch_extension mp if CONFIG_SMP ARM: 7892/1: Fix warning for V7M builds ARM: 7888/1: seccomp: not compatible with ARM OABI ARM: 7886/1: make OABI default to off ARM: 7885/1: Save/Restore 64-bit TTBR registers on LPAE suspend/resume ARM: 7884/1: mm: Fix ECC mem policy printk ARM: 7883/1: fix mov to mvn conversion in case of 64 bit phys_addr_t and BE ARM: 7882/1: mm: fix __phys_to_virt to work with 64 bit phys_addr_t in BE case ARM: 7881/1: __fixup_smp read of SCU config should do byteswap in BE case ARM: Fix nommu.c build warning
2013-11-22  Merge branch 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm  Linus Torvalds  3  -9/+32
Pull KVM fixes from Gleb Natapov. * 'next' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: kvm_clear_guest_page(): fix empty_zero_page usage kvm: mmu: delay mmu audit activation arm/arm64: KVM: Fix hyp mappings of vmalloc regions
2013-11-22  Merge git://git.kvack.org/~bcrl/aio-next  Linus Torvalds  1  -83/+51
Pull aio fixes from Benjamin LaHaise. * git://git.kvack.org/~bcrl/aio-next: aio: nullify aio->ring_pages after freeing it aio: prevent double free in ioctx_alloc aio: Fix a trinity splat
2013-11-22  Merge branch 'for-3.13' of git://linux-nfs.org/~bfields/linux  Linus Torvalds  2  -75/+101
Pull nfsd bugfixes from Bruce Fields: "A couple nfsd bugfixes" * 'for-3.13' of git://linux-nfs.org/~bfields/linux: nfsd4: fix xdr decoding of large non-write compounds nfsd: make sure to balance get/put_write_access nfsd: split up nfsd_setattr
2013-11-22  Merge tag 'gfs2-fixes' of ↵  Linus Torvalds  2  -2/+6
git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes Pull GFS2 fixes from Steven Whitehouse: "A couple of small, but important bug fixes for GFS2. The first one fixes a possible NULL pointer dereference, and the second one resolves a reference counting issue in one of the lesser used paths through atomic_open" * tag 'gfs2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/steve/gfs2-3.0-fixes: GFS2: Fix ref count bug relating to atomic_open GFS2: fix potential NULL pointer dereference
2013-11-22  Merge branch 'for-linus' of ↵  Linus Torvalds  15  -72/+73
git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs Pull btrfs fixes from Chris Mason: "Almost all of these are bug fixes. Dave Sterba's documentation update is the big exception because he removed our promises to set any machine running Btrfs on fire" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs: Documentation: filesystems: update btrfs tools section Documentation: filesystems: add new btrfs mount options btrfs: update kconfig help text btrfs: fix bio_size_ok() for max_sectors > 0xffff btrfs: Use trace condition for get_extent tracepoint btrfs: fix typo in the log message Btrfs: fix list delete warning when removing ordered root from the list Btrfs: print bytenr instead of page pointer in check-int Btrfs: remove dead codes from ctree.h Btrfs: don't wait for ordered data outside desired range Btrfs: fix lockdep error in async commit Btrfs: avoid heavy operations in btrfs_commit_super Btrfs: fix __btrfs_start_workers retval Btrfs: disable online raid-repair on ro mounts Btrfs: do not inc uncorrectable_errors counter on ro scrubs Btrfs: only drop modified extents if we logged the whole inode Btrfs: make sure to copy everything if we rename Btrfs: don't BUG_ON() if we get an error walking backrefs
2013-11-22  Merge tag 'xfs-for-linus-v3.13-rc1-2' of git://oss.sgi.com/xfs/xfs  Linus Torvalds  6  -24/+44
Pull second xfs update from Ben Myers: "There are a couple of patches that I wasn't quite sure about in time for our initial 3.13 pull request, a bugfix, and an update to add Dave to MAINTAINERS: Here we have a performance fix for inode iversion, increased inode cluster size for v5 superblock filesystems, a fix for error handling in xfs_bmap_add_attrfork, and a MAINTAINERS update to add Dave" * tag 'xfs-for-linus-v3.13-rc1-2' of git://oss.sgi.com/xfs/xfs: xfs: open code inc_inode_iversion when logging an inode xfs: increase inode cluster size for v5 filesystems xfs: fix unlock in xfs_bmap_add_attrfork xfs: update maintainers
2013-11-22  Merge branch 'slab/next' of ↵  Linus Torvalds  6  -375/+280
git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux Pull SLAB changes from Pekka Enberg: "The patches from Joonsoo Kim switch mm/slab.c to use 'struct page' for slab internals similar to mm/slub.c. This reduces memory usage and improves performance: https://lkml.org/lkml/2013/10/16/155 Rest of the changes are bug fixes from various people" * 'slab/next' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/linux: (21 commits) mm, slub: fix the typo in mm/slub.c mm, slub: fix the typo in include/linux/slub_def.h slub: Handle NULL parameter in kmem_cache_flags slab: replace non-existing 'struct freelist *' with 'void *' slab: fix to calm down kmemleak warning slub: proper kmemleak tracking if CONFIG_SLUB_DEBUG disabled slab: rename slab_bufctl to slab_freelist slab: remove useless statement for checking pfmemalloc slab: use struct page for slab management slab: replace free and inuse in struct slab with newly introduced active slab: remove SLAB_LIMIT slab: remove kmem_bufctl_t slab: change the management method of free objects of the slab slab: use __GFP_COMP flag for allocating slab pages slab: use well-defined macro, virt_to_slab() slab: overloading the RCU head over the LRU for RCU free slab: remove cachep in struct slab_rcu slab: remove nodeid in struct slab slab: remove colouroff in struct slab slab: change return type of kmem_getpages() to struct page ...
2013-11-22  Merge branch 'merge' of ↵  Linus Torvalds  20  -36/+75
git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc Pull third set of powerpc updates from Benjamin Herrenschmidt: "This is a small collection of random bug fixes and a few improvements of Oops output which I deemed valuable enough to include as well. The fixes are essentially recent build breakage and regressions, and a couple of older bugs such as the DTL log duplication, the EEH issue with PCI_COMMAND_MASTER and the problem with small contexts passed to get/set_context with VSX enabled" * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: powerpc/signals: Mark VSX not saved with small contexts powerpc/pseries: Fix SMP=n build of rng.c powerpc: Make cpu_to_chip_id() available when SMP=n powerpc/vio: Fix a dma_mask issue of vio powerpc: booke: Fix build failures powerpc: ppc64 address space capped at 32TB, mmap randomisation disabled powerpc: Only print PACATMSCRATCH in oops when TM is active powerpc/pseries: Duplicate dtl entries sometimes sent to userspace powerpc: Remove a few lines of oops output powerpc: Print DAR and DSISR on machine check oopses powerpc: Fix __get_user_pages_fast() irq handling powerpc/eeh: More accurate log powerpc/eeh: Enable PCI_COMMAND_MASTER for PCI bridges
2013-11-22  ALSA: hda - Set current_headset_type to ALC_HEADSET_TYPE_ENUM (janitorial)  David Henningsson  1  -1/+1
current_headset_type should be of the HEADSET_TYPE enum, not the HEADSET_MODE enum. Since ALC_HEADSET_TYPE_UNKNOWN and ALC_HEADSET_MODE_UNKNOWN are both 0, this patch is just janitorial. Signed-off-by: David Henningsson <david.henningsson@canonical.com> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-22  ALSA: hda - Provide missing pin configs for VAIO with ALC260  Takashi Iwai  1  -0/+20
Some models (or maybe depending on BIOS version) of Sony VAIO with ALC260 give no proper pin configurations as default, resulting in the non-working speaker, etc. Just provide the whole pin configurations via a fixup. Reported-by: Matthew Markus <mmarkus@hearit.co> Cc: <stable@vger.kernel.org> Signed-off-by: Takashi Iwai <tiwai@suse.de>
2013-11-22  Merge branch 'akpm' (fixes from Andrew)  Linus Torvalds  16  -124/+227
Merge patches from Andrew Morton: "13 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: mm: place page->pmd_huge_pte to right union MAINTAINERS: add keyboard driver to Hyper-V file list x86, mm: do not leak page->ptl for pmd page tables ipc,shm: correct error return value in shmctl (SHM_UNLOCK) mm, mempolicy: silence gcc warning block/partitions/efi.c: fix bound check ARM: drivers/rtc/rtc-at91rm9200.c: disable interrupts at shutdown mm: hugetlbfs: fix hugetlbfs optimization kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS cleanly ipc,shm: fix shm_file deletion races mm: thp: give transparent hugepage code a separate copy_page checkpatch: fix "Use of uninitialized value" warnings configfs: fix race between dentry put and lookup
2013-11-22  Merge branch 'for-linus2' of ↵  Linus Torvalds  125  -2058/+7712
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security Pull security subsystem updates from James Morris: "In this patchset, we finally get an SELinux update, with Paul Moore taking over as maintainer of that code. Also a significant update for the Keys subsystem, as well as maintenance updates to Smack, IMA, TPM, and Apparmor" and since I wanted to know more about the updates to key handling, here's the explanation from David Howells on that: "Okay. There are a number of separate bits. I'll go over the big bits and the odd important other bit, most of the smaller bits are just fixes and cleanups. If you want the small bits accounting for, I can do that too. (1) Keyring capacity expansion. KEYS: Consolidate the concept of an 'index key' for key access KEYS: Introduce a search context structure KEYS: Search for auth-key by name rather than target key ID Add a generic associative array implementation. KEYS: Expand the capacity of a keyring Several of the patches are providing an expansion of the capacity of a keyring. Currently, the maximum size of a keyring payload is one page. Subtract a small header and then divide up into pointers, that only gives you ~500 pointers on an x86_64 box. However, since the NFS idmapper uses a keyring to store ID mapping data, that has proven to be insufficient to the cause. Whatever data structure I use to handle the keyring payload, it can only store pointers to keys, not the keys themselves because several keyrings may point to a single key. This precludes inserting, say, and rb_node struct into the key struct for this purpose. I could make an rbtree of records such that each record has an rb_node and a key pointer, but that would use four words of space per key stored in the keyring. It would, however, be able to use much existing code. I selected instead a non-rebalancing radix-tree type approach as that could have a better space-used/key-pointer ratio. 
I could have used the radix tree implementation that we already have and insert keys into it by their serial numbers, but that means any sort of search must iterate over the whole radix tree. Further, its nodes are a bit on the capacious side for what I want - especially given that key serial numbers are randomly allocated, thus leaving a lot of empty space in the tree. So what I have is an associative array that internally is a radix-tree with 16 pointers per node where the index key is constructed from the key type pointer and the key description. This means that an exact lookup by type+description is very fast as this tells us how to navigate directly to the target key. I made the data structure general in lib/assoc_array.c as far as it is concerned, its index key is just a sequence of bits that leads to a pointer. It's possible that someone else will be able to make use of it also. FS-Cache might, for example. (2) Mark keys as 'trusted' and keyrings as 'trusted only'. KEYS: verify a certificate is signed by a 'trusted' key KEYS: Make the system 'trusted' keyring viewable by userspace KEYS: Add a 'trusted' flag and a 'trusted only' flag KEYS: Separate the kernel signature checking keyring from module signing These patches allow keys carrying asymmetric public keys to be marked as being 'trusted' and allow keyrings to be marked as only permitting the addition or linkage of trusted keys. Keys loaded from hardware during kernel boot or compiled into the kernel during build are marked as being trusted automatically. New keys can be loaded at runtime with add_key(). They are checked against the system keyring contents and if their signatures can be validated with keys that are already marked trusted, then they are marked trusted also and can thus be added into the master keyring. Patches from Mimi Zohar make this usable with the IMA keyrings also. (3) Remove the date checks on the key used to validate a module signature. 
X.509: Remove certificate date checks It's not reasonable to reject a signature just because the key that it was generated with is no longer valid datewise - especially if the kernel hasn't yet managed to set the system clock when the first module is loaded - so just remove those checks. (4) Make it simpler to deal with additional X.509 being loaded into the kernel. KEYS: Load *.x509 files into kernel keyring KEYS: Have make canonicalise the paths of the X.509 certs better to deduplicate The builder of the kernel now just places files with the extension ".x509" into the kernel source or build trees and they're concatenated by the kernel build and stuffed into the appropriate section. (5) Add support for userspace kerberos to use keyrings. KEYS: Add per-user_namespace registers for persistent per-UID kerberos caches KEYS: Implement a big key type that can save to tmpfs Fedora went to, by default, storing kerberos tickets and tokens in tmpfs. We looked at storing it in keyrings instead as that confers certain advantages such as tickets being automatically deleted after a certain amount of time and the ability for the kernel to get at these tokens more easily. To make this work, two things were needed: (a) A way for the tickets to persist beyond the lifetime of all a user's sessions so that cron-driven processes can still use them. The problem is that a user's session keyrings are deleted when the session that spawned them logs out and the user's user keyring is deleted when the UID is deleted (typically when the last log out happens), so neither of these places is suitable. I've added a system keyring into which a 'persistent' keyring is created for each UID on request. Each time a user requests their persistent keyring, the expiry time on it is set anew. If the user doesn't ask for it for, say, three days, the keyring is automatically expired and garbage collected using the existing gc. All the kerberos tokens it held are then also gc'd. 
(b) A key type that can hold really big tickets (up to 1MB in size). The problem is that Active Directory can return huge tickets with lots of auxiliary data attached. We don't, however, want to eat up huge tracts of unswappable kernel space for this, so if the ticket is greater than a certain size, we create a swappable shmem file and dump the contents in there and just live with the fact we then have an inode and a dentry overhead. If the ticket is smaller than that, we slap it in a kmalloc()'d buffer" * 'for-linus2' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (121 commits) KEYS: Fix keyring content gc scanner KEYS: Fix error handling in big_key instantiation KEYS: Fix UID check in keyctl_get_persistent() KEYS: The RSA public key algorithm needs to select MPILIB ima: define '_ima' as a builtin 'trusted' keyring ima: extend the measurement list to include the file signature kernel/system_certificate.S: use real contents instead of macro GLOBAL() KEYS: fix error return code in big_key_instantiate() KEYS: Fix keyring quota misaccounting on key replacement and unlink KEYS: Fix a race between negating a key and reading the error set KEYS: Make BIG_KEYS boolean apparmor: remove the "task" arg from may_change_ptraced_domain() apparmor: remove parent task info from audit logging apparmor: remove tsk field from the apparmor_audit_struct apparmor: fix capability to not use the current task, during reporting Smack: Ptrace access check mode ima: provide hash algo info in the xattr ima: enable support for larger default filedata hash algorithms ima: define kernel parameter 'ima_template=' to change configured default ima: add Kconfig default measurement list template ...
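Item (1) of the keys summary above describes an associative array that is internally a radix tree with 16 pointers per node, navigated by successive bits of an index key. The following is only a userspace sketch of that navigation idea, not the code in lib/assoc_array.c: the names `toy_node`, `toy_insert` and `toy_lookup` are hypothetical, the index key is assumed to be a fixed 32-bit value consumed 4 bits per level, and leaves are assumed to live only at the last level.

```c
#include <stdint.h>
#include <stdlib.h>

#define TOY_FANOUT 16	/* 16 pointers per node, as in the description */
#define TOY_LEVELS 8	/* assumption: 32-bit key, 4 bits consumed per level */

struct toy_node {
	void *slot[TOY_FANOUT];	/* child nodes, or leaf pointers at the last level */
};

/* Extract the 4-bit chunk of the key used at a given level (0 = top). */
static unsigned toy_chunk(uint32_t key, int level)
{
	return (key >> (4 * (TOY_LEVELS - 1 - level))) & 0xf;
}

/* Store a leaf pointer under key; returns 0 on success, -1 if allocation fails. */
int toy_insert(struct toy_node *root, uint32_t key, void *leaf)
{
	struct toy_node *n = root;
	int level;

	for (level = 0; level < TOY_LEVELS - 1; level++) {
		unsigned c = toy_chunk(key, level);

		if (!n->slot[c]) {
			n->slot[c] = calloc(1, sizeof(struct toy_node));
			if (!n->slot[c])
				return -1;
		}
		n = n->slot[c];
	}
	n->slot[toy_chunk(key, TOY_LEVELS - 1)] = leaf;
	return 0;
}

/* Exact lookup: the key bits say which slot to follow at every level. */
void *toy_lookup(const struct toy_node *root, uint32_t key)
{
	const struct toy_node *n = root;
	int level;

	for (level = 0; level < TOY_LEVELS - 1; level++) {
		n = n->slot[toy_chunk(key, level)];
		if (!n)
			return NULL;
	}
	return n->slot[toy_chunk(key, TOY_LEVELS - 1)];
}
```

The point of the sketch is that an exact lookup never scans: each 4-bit chunk of the key selects one of the 16 slots directly. The in-kernel structure derives its index key from the key type pointer and description rather than from a fixed-width integer.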
2013-11-22Merge git://git.infradead.org/users/eparis/auditLinus Torvalds12-113/+259
Pull audit updates from Eric Paris: "Nothing amazing. Formatting, small bug fixes, couple of fixes where we didn't get records due to some old VFS changes, and a change to how we collect execve info..." Fixed conflict in fs/exec.c as per Eric and linux-next. * git://git.infradead.org/users/eparis/audit: (28 commits) audit: fix type of sessionid in audit_set_loginuid() audit: call audit_bprm() only once to add AUDIT_EXECVE information audit: move audit_aux_data_execve contents into audit_context union audit: remove unused envc member of audit_aux_data_execve audit: Kill the unused struct audit_aux_data_capset audit: do not reject all AUDIT_INODE filter types audit: suppress stock memalloc failure warnings since already managed audit: log the audit_names record type audit: add child record before the create to handle case where create fails audit: use given values in tty_audit enable api audit: use nlmsg_len() to get message payload length audit: use memset instead of trying to initialize field by field audit: fix info leak in AUDIT_GET requests audit: update AUDIT_INODE filter rule to comparator function audit: audit feature to set loginuid immutable audit: audit feature to only allow unsetting the loginuid audit: allow unsetting the loginuid (with priv) audit: remove CONFIG_AUDIT_LOGINUID_IMMUTABLE audit: loginuid functions coding style selinux: apply selinux checks on new audit message types ...
2013-11-22mm: place page->pmd_huge_pte to right unionKirill A. Shutemov1-3/+3
I don't know what went wrong, mis-merge or something, but ->pmd_huge_pte is placed in the wrong union within struct page. In the original patch[1] it's placed in a union with ->lru and ->slab, but in commit e009bb30c8df ("mm: implement split page table lock for PMD level") it's in a union with ->index and ->freelist. That union also seems unused for pages with page tables and safe to re-use, but it's not what I've tested. Let's move it to the original place. It fixes indentation at least. :) [1] https://lkml.org/lkml/2013/10/7/288 Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22MAINTAINERS: add keyboard driver to Hyper-V file listHaiyang Zhang1-0/+1
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22x86, mm: do not leak page->ptl for pmd page tablesKirill A. Shutemov2-4/+6
There are two code paths by which a page with a pmd page table can be freed: pmd_free() and pmd_free_tlb(). I missed the second one and didn't add a page table destructor call there. This leaks page->ptl for pmd page tables if dynamically allocated page->ptl is in use. The patch adds the missing destructor call and modifies the documentation accordingly. Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Reported-by: Andrey Vagin <avagin@openvz.org> Tested-by: Andrey Vagin <avagin@openvz.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22ipc,shm: correct error return value in shmctl (SHM_UNLOCK)Jesper Nilsson1-3/+6
Commit 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl") restructured the ipc shm to shorten critical region, but introduced a path where the return value could be -EPERM, even if the operation actually was performed. Before the commit, the err return value was reset by the return value from security_shm_shmctl() after the if (!ns_capable(...)) statement. Now, we still exit the if statement with err set to -EPERM, and in the case of SHM_UNLOCK, it is not reset at all, and used as the return value from shmctl. To fix this, we only set err when errors occur, leaving the fallthrough case alone. Signed-off-by: Jesper Nilsson <jesper.nilsson@axis.com> Cc: Davidlohr Bueso <davidlohr@hp.com> Cc: Rik van Riel <riel@redhat.com> Cc: Michel Lespinasse <walken@google.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: <stable@vger.kernel.org> [3.12.x] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
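The control-flow problem described above can be sketched in isolation. This is not the ipc/shm.c code: `toy_unlock_buggy` and `toy_unlock_fixed` are hypothetical names, `capable` stands in for the ns_capable() check and `owner` for the uid comparison. The buggy shape sets err before the checks and never clears it, so a permitted caller still gets -EPERM back after the operation has run.

```c
#include <errno.h>
#include <stdbool.h>

/* Buggy shape: err is set on entry to the branch and survives the fallthrough. */
int toy_unlock_buggy(bool capable, bool owner, bool *did_unlock)
{
	int err = 0;

	*did_unlock = false;
	if (!capable) {
		err = -EPERM;
		if (!owner)
			goto out;
	}
	*did_unlock = true;	/* operation performed, yet err may still be -EPERM */
out:
	return err;
}

/* Fixed shape: err is only set on the path that actually fails. */
int toy_unlock_fixed(bool capable, bool owner, bool *did_unlock)
{
	int err = 0;

	*did_unlock = false;
	if (!capable) {
		if (!owner) {
			err = -EPERM;
			goto out;
		}
	}
	*did_unlock = true;
out:
	return err;
}
```

With `capable == false` and `owner == true`, the buggy variant performs the operation and still returns -EPERM, while the fixed variant returns 0.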
2013-11-22mm, mempolicy: silence gcc warningDavid Rientjes1-1/+1
Fengguang Wu reports that compiling mm/mempolicy.c results in a warning: mm/mempolicy.c: In function 'mpol_to_str': mm/mempolicy.c:2878:2: error: format not a string literal and no format arguments Kees says this is because he is using -Wformat-security. Silence the warning. Signed-off-by: David Rientjes <rientjes@google.com> Reported-by: Fengguang Wu <fengguang.wu@intel.com> Suggested-by: Kees Cook <keescook@chromium.org> Acked-by: Kees Cook <keescook@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
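For readers unfamiliar with -Wformat-security, a minimal sketch of the pattern it flags and one common way to silence it. The helper name `toy_policy_to_str` is hypothetical and this is not the mm/mempolicy.c code; the assumption is only that a non-literal string was being passed as the format argument.

```c
#include <stdio.h>

/*
 * Hypothetical helper: copy a policy name into buf.
 *
 * Writing snprintf(buf, len, name) would use 'name' as the format string,
 * which gcc reports as "format not a string literal and no format
 * arguments" under -Wformat-security. Passing it as an argument to a
 * literal "%s" format avoids that, and also keeps any '%' characters in
 * 'name' from being interpreted as conversions.
 */
int toy_policy_to_str(char *buf, size_t len, const char *name)
{
	return snprintf(buf, len, "%s", name);
}
```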
2013-11-22block/partitions/efi.c: fix bound checkAntti P Miettinen1-2/+3
Use ARRAY_SIZE instead of sizeof to get the proper maximum for the label length. Since this is just a read out of bounds it's not that bad, but the problem becomes user-visible e.g. if one tries to use DEBUG_PAGEALLOC and DEBUG_RODATA, at least with some enhancements from Hiroshi. Of course the destination array can contain garbage when we read beyond the end of the source array, so that would be another user-visible problem. Signed-off-by: Antti P Miettinen <amiettinen@nvidia.com> Reviewed-by: Hiroshi Doyu <hdoyu@nvidia.com> Tested-by: Hiroshi Doyu <hdoyu@nvidia.com> Cc: Will Drewry <wad@chromium.org> Cc: Matt Fleming <matt.fleming@intel.com> Acked-by: Davidlohr Bueso <davidlohr@hp.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
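The difference between the two bounds is easy to see with an array of 16-bit characters. The sketch below is illustrative only: `toy_entry`, `TOY_ARRAY_SIZE` and `toy_copy_label` are hypothetical names, and the 36-element layout is an assumption chosen to show the factor of two, not a quote of block/partitions/efi.c.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct toy_entry {
	uint16_t name[36];	/* assumed layout: 36 16-bit characters, 72 bytes */
};

/* Wrong bound: sizeof counts bytes, twice the number of elements here. */
size_t toy_bound_bytes(void)
{
	return sizeof(((struct toy_entry *)0)->name);
}

/* Right bound: number of elements in the array. */
size_t toy_bound_elems(void)
{
	return TOY_ARRAY_SIZE(((struct toy_entry *)0)->name);
}

/* Copy the low byte of each character into dst, bounded by element count. */
size_t toy_copy_label(char *dst, size_t dst_len, const struct toy_entry *e)
{
	size_t max = TOY_ARRAY_SIZE(e->name);
	size_t i;

	if (dst_len == 0)
		return 0;
	if (max > dst_len - 1)
		max = dst_len - 1;
	for (i = 0; i < max && e->name[i]; i++)
		dst[i] = (char)(e->name[i] & 0xff);
	dst[i] = '\0';
	return i;
}
```

Using the byte count as a loop limit over `name` would index up to element 71 of a 36-element array, which is the out-of-bounds read the commit describes.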
2013-11-22ARM: drivers/rtc/rtc-at91rm9200.c: disable interrupts at shutdownJohan Hovold1-0/+9
Make sure RTC interrupts are disabled at shutdown. As the RTC is generally powered by backup power (VDDBU), its interrupts are not disabled on wake-up, user, watchdog or software reset. This could cause trouble on other systems (e.g. older kernels) if an interrupt occurs before a handler has been installed at next boot. Let us be well-behaved and disable them on clean shutdowns at least (as does the RTT-based rtc-at91sam9 driver). Signed-off-by: Johan Hovold <jhovold@gmail.com> Acked-by: Nicolas Ferre <nicolas.ferre@atmel.com> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Andrew Victor <linux@maxim.org.za> Cc: Alessandro Zummo <a.zummo@towertech.it> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22mm: hugetlbfs: fix hugetlbfs optimizationAndrea Arcangeli3-60/+106
Commit 7cb2ef56e6a8 ("mm: fix aio performance regression for database caused by THP") can cause dereference of a dangling pointer if split_huge_page runs during PageHuge() if there are updates to the tail_page->private field. Also it is repeating compound_head twice for hugetlbfs and it is running compound_head+compound_trans_head for THP when a single one is needed in both cases. The new code within the PageSlab() check doesn't need to verify that the THP page size is never bigger than the smallest hugetlbfs page size, to avoid memory corruption. A longstanding theoretical race condition was found while fixing the above (see the change right after the skip_unlock label, that is relevant for the compound_lock path too). By re-establishing the _mapcount tail refcounting for all compound pages, this also fixes the below problem: echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages BUG: Bad page state in process bash pfn:59a01 page:ffffea000139b038 count:0 mapcount:10 mapping: (null) index:0x0 page flags: 0x1c00000000008000(tail) Modules linked in: CPU: 6 PID: 2018 Comm: bash Not tainted 3.12.0+ #25 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011 Call Trace: dump_stack+0x55/0x76 bad_page+0xd5/0x130 free_pages_prepare+0x213/0x280 __free_pages+0x36/0x80 update_and_free_page+0xc1/0xd0 free_pool_huge_page+0xc2/0xe0 set_max_huge_pages.part.58+0x14c/0x220 nr_hugepages_store_common.isra.60+0xd0/0xf0 nr_hugepages_store+0x13/0x20 kobj_attr_store+0xf/0x20 sysfs_write_file+0x189/0x1e0 vfs_write+0xc5/0x1f0 SyS_write+0x55/0xb0 system_call_fastpath+0x16/0x1b Signed-off-by: Khalid Aziz <khalid.aziz@oracle.com> Signed-off-by: Andrea Arcangeli <aarcange@redhat.com> Tested-by: Khalid Aziz <khalid.aziz@oracle.com> Cc: Pravin Shelar <pshelar@nicira.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Ben Hutchings <bhutchings@solarflare.com> Cc: Christoph Lameter <cl@linux.com> Cc: Johannes Weiner <jweiner@redhat.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel 
<riel@redhat.com> Cc: Andi Kleen <andi@firstfloor.org> Cc: Minchan Kim <minchan@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS cleanlyYuanhan Liu2-6/+6
Remove CONFIG_USE_GENERIC_SMP_HELPERS left by commit 0a06ff068f12 ("kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS"). Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com> Cc: Christoph Hellwig <hch@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22ipc,shm: fix shm_file deletion racesGreg Thelen1-5/+23
When IPC_RMID races with other shm operations there's potential for use-after-free of the shm object's associated file (shm_file).

Here's the race before this patch:

  TASK 1                     TASK 2
  ------                     ------
  shm_rmid()
    ipc_lock_object()
                             shmctl()
                             shp = shm_obtain_object_check()
    shm_destroy()
      shm_unlock()
      fput(shp->shm_file)
                             ipc_lock_object()
                             shmem_lock(shp->shm_file)
                             <OOPS>

The oops is caused because shm_destroy() calls fput() after dropping the ipc_lock. fput() clears the file's f_inode, f_path.dentry, and f_path.mnt, which causes various NULL pointer references in task 2. I reliably see the oops in task 2 if with shmlock, shmu

This patch fixes the races by:
1) setting shm_file=NULL in shm_destroy() while holding ipc_object_lock().
2) modifying at-risk operations to check shm_file while holding ipc_object_lock().

Example workloads, which each trigger oops...

Workload 1:
  while true; do
    id=$(shmget 1 4096)
    shm_rmid $id &
    shmlock $id &
    wait
  done

The oops stack shows accessing NULL f_inode due to racing fput:
  _raw_spin_lock
  shmem_lock
  SyS_shmctl

Workload 2:
  while true; do
    id=$(shmget 1 4096)
    shmat $id 4096 &
    shm_rmid $id &
    wait
  done

The oops stack is similar to workload 1 due to NULL f_inode:
  touch_atime
  shmem_mmap
  shm_mmap
  mmap_region
  do_mmap_pgoff
  do_shmat
  SyS_shmat

Workload 3:
  while true; do
    id=$(shmget 1 4096)
    shmlock $id
    shm_rmid $id &
    shmunlock $id &
    wait
  done

The oops stack shows the second fput tripping on a NULL f_inode. The first fput() completed via shm_destroy(), but a racing thread did a get_file() and queued this fput():
  locks_remove_flock
  __fput
  ____fput
  task_work_run
  do_notify_resume
  int_signal

Fixes: c2c737a0461e ("ipc,shm: shorten critical region for shmat")
Fixes: 2caacaa82a51 ("ipc,shm: shorten critical region for shmctl")
Signed-off-by: Greg Thelen <gthelen@google.com>
Cc: Davidlohr Bueso <davidlohr@hp.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Manfred Spraul <manfred@colorfullife.com>
Cc: <stable@vger.kernel.org> # 3.10.17+ 3.11.6+
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22mm: thp: give transparent hugepage code a separate copy_pageDave Hansen3-38/+48
Right now, the migration code in migrate_page_copy() uses copy_huge_page() for hugetlbfs and thp pages:

	if (PageHuge(page) || PageTransHuge(page))
		copy_huge_page(newpage, page);

So, yay for code reuse. But:

	void copy_huge_page(struct page *dst, struct page *src)
	{
		struct hstate *h = page_hstate(src);

and a non-hugetlbfs page has no page_hstate(). This works 99% of the time because page_hstate() determines the hstate from the page order alone. Since the page order of a THP page matches the default hugetlbfs page order, it works.

But, if you change the default huge page size on the boot command-line (say default_hugepagesz=1G), then we might not even *have* a 2MB hstate so page_hstate() returns null and copy_huge_page() oopses pretty fast since copy_huge_page() dereferences the hstate:

	void copy_huge_page(struct page *dst, struct page *src)
	{
		struct hstate *h = page_hstate(src);
		if (unlikely(pages_per_huge_page(h) > MAX_ORDER_NR_PAGES)) {
		...

Mel noticed that the migration code is really the only user of these functions. This moves all the copy code over to migrate.c and makes copy_huge_page() work for THP by checking for it explicitly.

I believe the bug was introduced in commit b32967ff101a ("mm: numa: Add THP migration for the NUMA working set scanning fault case")

[akpm@linux-foundation.org: fix coding-style and comment text, per Naoya Horiguchi]
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Mel Gorman <mgorman@suse.de>
Reviewed-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Hillf Danton <dhillf@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Tested-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
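The failure mode and the shape of the fix can be modelled outside the kernel. Everything below is a hypothetical sketch: `toy_hstate`, `toy_size_to_hstate` and `toy_nr_pages` are invented names, and the table holding only an order-18 entry is an assumption standing in for a system booted with a 1G default huge page size and no 2MB hstate.

```c
#include <stddef.h>

struct toy_hstate {
	unsigned order;
};

/* Assumption: only a 1G (order 18 with 4K base pages) hstate is registered. */
static const struct toy_hstate toy_hstates[] = { { 18 } };

/* Look up an hstate by page order; NULL if no such size is registered. */
const struct toy_hstate *toy_size_to_hstate(unsigned order)
{
	size_t i;

	for (i = 0; i < sizeof(toy_hstates) / sizeof(toy_hstates[0]); i++)
		if (toy_hstates[i].order == order)
			return &toy_hstates[i];
	return NULL;
}

/*
 * Number of base pages to copy. hugetlbfs pages go through the hstate;
 * THP pages are handled explicitly from the page order and never touch
 * the hstate table. Returns 0 if a hugetlbfs size is unknown.
 */
unsigned long toy_nr_pages(unsigned order, int is_hugetlbfs)
{
	if (is_hugetlbfs) {
		const struct toy_hstate *h = toy_size_to_hstate(order);

		return h ? 1UL << h->order : 0;
	}
	return 1UL << order;
}
```

In this model a 2MB THP page (order 9) has no matching hstate, so dereferencing the lookup result unconditionally would crash; taking the THP branch first avoids the lookup altogether.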
2013-11-22checkpatch: fix "Use of uninitialized value" warningsJoe Perches1-0/+1
checkpatch is currently confused about some complex macros and references undefined variables $stat and $cond. Make sure these are defined before using them. Signed-off-by: Joe Perches <joe@perches.com> Reported-by: Gerhard Sittig <gsi@denx.de> Acked-by: Andy Whitcroft <apw@canonical.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-22configfs: fix race between dentry put and lookupJunxiao Bi1-2/+14
There is a race window in configfs: it starts when a dentry is unhashed and ends before configfs_d_iput() is called. If a lookup happens in this window, a new dentry is allocated because the original dentry was unhashed, and configfs_attach_attr() then updates sd->s_dentry to the new dentry. configfs_d_iput() then triggers BUG_ON(sd->s_dentry != dentry) and the system panics.

sys_open:                     sys_close:
 ...                           fput
                                dput
                                 dentry_kill
                                  __d_drop <--- dentry unhashed here,
                                           but sd->s_dentry still points
                                           to this dentry.

 lookup_real
  configfs_lookup
   configfs_attach_attr---> update sd->s_dentry
                            to the newly allocated dentry here.

                                 d_kill
                                  configfs_d_iput <--- BUG_ON(sd->s_dentry != dentry)
                                                  triggered here.

To fix it, change configfs_d_iput() to not update sd->s_dentry if sd->s_count > 2, which means another dentry is using the sd besides the one that is going to be put. Use configfs_dirent_lock in configfs_attach_attr() to sync with configfs_d_iput().

With the following steps, you can reproduce the bug.

1. Enable ocfs2; this mounts configfs at /sys/kernel/config and populates it with configuration.
2. Run the following script.

  while [ 1 ]; do cat /sys/kernel/config/cluster/$your_cluster_name/idle_timeout_ms > /dev/null; done &
  while [ 1 ]; do cat /sys/kernel/config/cluster/$your_cluster_name/idle_timeout_ms > /dev/null; done &

Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-21gso: handle new frag_list of frags GRO packetsHerbert Xu1-25/+50
Recently GRO started generating packets with frag_lists of frags. This was not handled by GSO, thus leading to a crash. Thankfully these packets are of a regular form and are easy to handle. This patch handles them in two ways. For completely non-linear frag_list entries, we simply continue to iterate over the frag_list frags once we exhaust the normal frags. For frag_list entries with linear parts, we call pskb_trim on the first part of the frag_list skb, and then process the rest of the frags in the usual way. This patch also kills a chunk of dead frag_list code that has obviously never ever been run since it ends up generating a bogus GSO-segmented packet with a frag_list entry. Future work is planned to split super big packets into TSO ones. Fixes: 8a29111c7ca6 ("net: gro: allow to build full sized skb") Reported-by: Christoph Paasch <christoph.paasch@uclouvain.be> Reported-by: Jerry Chu <hkchu@google.com> Reported-by: Sander Eikelenboom <linux@eikelenboom.it> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Eric Dumazet <edumazet@google.com> Tested-by: Sander Eikelenboom <linux@eikelenboom.it> Tested-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-21GFS2: Fix ref count bug relating to atomic_openSteven Whitehouse1-1/+4
In the case that atomic_open calls finish_no_open() with the dentry that was supplied to gfs2_atomic_open() an extra reference count is required. This patch fixes that issue preventing a bug trap triggering at umount time. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
2013-11-21genetlink: fix genl_set_err() group IDJohannes Berg1-0/+3
Fix another really stupid bug - I introduced genl_set_err() precisely to be able to adjust the group and reject invalid ones, but then forgot to do so. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-21genetlink: fix genlmsg_multicast() bugJohannes Berg2-6/+3
Unfortunately, I introduced a tremendously stupid bug into genlmsg_multicast() when doing all those multicast group changes: it adjusts the group number, but then passes it to genlmsg_multicast_netns() which does that again. Somehow, my tests failed to catch this, so add a warning into genlmsg_multicast_netns() and remove the offending group ID adjustment. Also add a warning to the similar code in other functions so people who misuse them are more loudly warned. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>
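The double adjustment described above can be illustrated with a small model. This is a sketch under stated assumptions, not the genetlink code: `toy_family`, `toy_multicast_netns`, `toy_multicast_buggy` and `toy_multicast_fixed` are hypothetical names, a family is assumed to own `n_mcgrps` consecutive global group IDs starting at `mcgrp_offset`, and the range check stands in for the warning the commit adds.

```c
struct toy_family {
	unsigned mcgrp_offset;	/* first global group ID owned by the family */
	unsigned n_mcgrps;	/* number of groups the family registered */
};

/*
 * Inner function: takes a family-relative group index, rejects
 * out-of-range values and returns the global group ID (or -1).
 */
int toy_multicast_netns(const struct toy_family *f, unsigned group)
{
	if (group >= f->n_mcgrps)
		return -1;
	return (int)(f->mcgrp_offset + group);
}

/* Buggy wrapper: adjusts the group, then calls a function that adjusts again. */
int toy_multicast_buggy(const struct toy_family *f, unsigned group)
{
	group = f->mcgrp_offset + group;
	return toy_multicast_netns(f, group);
}

/* Fixed wrapper: leaves the adjustment to the inner function. */
int toy_multicast_fixed(const struct toy_family *f, unsigned group)
{
	return toy_multicast_netns(f, group);
}
```

With `mcgrp_offset = 10` and `n_mcgrps = 3`, the fixed wrapper maps group 1 to global ID 11, while the buggy wrapper hands 11 to the inner function as if it were a relative index and the range check rejects it.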