summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2009-09-18libata: Add pata_atp867x driver for Artop/Acard ATP867X controllersJohn(Jung-Ik) Lee1-0/+2
This is a new pata driver for ARTOP 867X 64bit 4-channel UDMA133 ATA ctrls. Based on the Atp867 data sheet rev 1.2, Acard, and in part on early ide codes from Eric Uhrhane <ericu@google.com>. Signed-off-by: John(Jung-Ik) Lee <jilee@google.com> Reviewed-by: Grant Grundler <grundler@google.com> Reviewed-by: Gwendal Gringo <gwendal@google.com> Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
2009-09-18perf_counter: Allow for a wakeup watermarkPeter Zijlstra1-2/+8
Currently we wake the mmap() consumer once every PAGE_SIZE of data and/or once event wakeup_events when specified. For high speed sampling this results in too many wakeups wrt. the buffer size, hence change this. We move the default wakeup limit to 1/4-th the buffer size, and provide for means to manually specify this limit. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-17Merge branch 'linus' into tracing/coreIngo Molnar190-2187/+3589
Merge reason: Pick up kernel/softirq.c update for dependent fix. Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-17PCI: fix VGA arbiter header fileJesse Barnes1-1/+2
Remove reference to vgaarb.c and replace it with a comment about the arbiter itself. Reported-by: Tiago Vignatti <tiago.vignatti@nokia.com> Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2009-09-17Merge branch 'for-upstream' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb * 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/dvrabel/uwb: uwb: avoid radio controller reset loops uwb: stop uwbd thread if rc->start() fails uwb: handle radio controller events with out-of-range IDs correctly
2009-09-17nfsd: return success for non-NFS4 nfs4_state_startStephen Rothwell1-1/+1
Today's linux-next build (sparc64_defconfig) failed like this: In file included from arch/sparc/kernel/sys_sparc32.c:32: include/linux/nfsd/nfsd.h: In function 'nfs4_state_start': include/linux/nfsd/nfsd.h:177: error: no return statement in function returning non-void Caused by commit 29ab23cc5d351658d01a4327d55e9106a73fd04f ("nfsd4: allow nfs4 state startup to fail"). Please, if you add code that depends on a CONFIG option, build with that option enabled and disabled. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
2009-09-17mfd: Convert WM8350 to use request_threaded_irq()Mark Brown1-1/+0
Instead of hand rolling our own variant. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17HID: consolidate connect and disconnect into core codeJiri Kosina1-0/+3
HID core registers input, hidraw and hiddev devices, but leaves unregistering it up to the individual driver, which is not really nice. Let's move all the logic to the core. Reported-by: Marcel Holtmann <marcel@holtmann.org> Reported-by: Brian Rogers <brian@xyzw.org> Acked-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
2009-09-17sched: Add new wakeup preemption mode: WAKEUP_RUNNINGPeter Zijlstra1-0/+2
Create a new wakeup preemption mode, preempt towards tasks that run shorter on avg. It sets next buddy to be sure we actually run the task we preempted for. Test results: root@twins:~# while :; do :; done & [1] 6537 root@twins:~# while :; do :; done & [2] 6538 root@twins:~# while :; do :; done & [3] 6539 root@twins:~# while :; do :; done & [4] 6540 root@twins:/home/peter# ./latt -c4 sleep 4 Entries: 48 (clients=4) Averages: ------------------------------ Max 4750 usec Avg 497 usec Stdev 737 usec root@twins:/home/peter# echo WAKEUP_RUNNING > /debug/sched_features root@twins:/home/peter# ./latt -c4 sleep 4 Entries: 48 (clients=4) Averages: ------------------------------ Max 14 usec Avg 5 usec Stdev 3 usec Disabled by default - needs more testing. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Acked-by: Mike Galbraith <efault@gmx.de> Signed-off-by: Ingo Molnar <mingo@elte.hu> LKML-Reference: <new-submission>
2009-09-17regulator: AB3100 supportLinus Walleij1-0/+28
This adds support for the regulators found in the AB3100 Mixed-Signal IC. It further also defines platform data for the ST-Ericsson U300 platform and extends the AB3100 MFD driver so that platform/board data with regulation constraints and an init function can be passed down all the way from the board to the regulators. Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Acked-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Fix ab3100-otp build failureSamuel Ortiz1-0/+1
ab3100.h should include linux/workqueue.h for otp to build properly. Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Add support for TWL4030/5030 dynamic power switchingAmit Kucheria1-10/+84
The TWL4030/5030 family of multifunction devices allows board-specific control of the the various regulators, clock and reset lines through 'scripts' that are loaded into its memory. This allows for Dynamic Power Switching (DPS). Implement board-independent core support for DPS that is then used by board-specific code to load custom DPS scripts. Signed-off-by: Amit Kucheria <amit.kucheria@verdurent.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Add Freescale MC13783 driverSascha Hauer2-0/+480
This driver provides the core Freescale MC13783 support. It registers the client platform_devices and provides access to the A/D converter. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: AB3100 accessor function cleanupsLinus Walleij1-4/+4
This adds the _interruptible suffix to the AB3100 accessor functions on par with mutex_lock_interruptible() that's used for blocking simultaneous calls to the AB3100 acessor functions. Since these accesses are slow on a 100kHz I2C bus and may line up waiting for the mutex, we need to handle interruption by system shutdown or kill signals and may just as well denote that in the function names. Signed-off-by: Linus Walleij <linus.walleij@stericsson.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17regulator: Add WM831x LDO supportMark Brown1-0/+626
The WM831x series of devices provide three types of LDO: - General purpose LDOs supporting voltages from 0.9-3.3V - High performance analogue LDOs supporting voltages from 1-3.5V - Very low power consumption LDOs intended to support always on functionality. This patch adds support for all three kinds of LDO. Each regulator is probed as an individual platform device with resources used to provide the register map location of the regulator. Mixed hardware and software control of regulators is not current supported. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17regulator: Add WM831x DC-DC buck convertor supportMark Brown1-0/+571
The WM831x series of devices all have 3 DC-DC buck convertors. This driver implements software control for these regulators via the regulator API. Use with split hardware/software control of individual regulators is not supported, though regulators not controlled by software may be controlled via the hardware control interfaces. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Liam Girdwood <lrg@slimlogic.co.uk> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17regulator: Provide mode to status conversion functionMark Brown1-0/+2
This is useful for implementing get_status() in terms of get_mode(). Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17input: Add support for the WM831x ON pinMark Brown1-0/+19
The WM831x series of PMICs support control of initial power on through the ON pin on the device with soft control of the pin at other times. Represent this to userspace as KEY_POWER. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Dmitry Torokhov <dtor@mail.ru> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17gpio: Add WM831X GPIO driverMark Brown1-0/+55
Add support for the GPIO pins on the WM831x. No direct support is currently supplied for configuring non-gpiolib functionality such as pull configuration and alternate functions, soft configuration of these will be provided in a future patch. Currently use of these pins as interrupts is not supported due to the ongoing issues with generic irq not support interrupt controllers on interrupt driven buses. Users can directly request the interrupts with the wm831x-specific APIs currently provided if required. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: David Brownell <dbrownell@users.sourceforge.net> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Export ISEL values from WM831x coreMark Brown1-0/+21
The current settings which can be used with the WM831x current sinks can't easily be mapped between register values and currents at run time without a lookup table since the values scale logarithmically to match the way the human eye interprets brightness. This lookup table is inclided in the core since several drivers need to use it. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Add basic WM831x OTP supportMark Brown1-0/+162
The WM831x series of devices use OTP (One Time Programmable, a type of PROM) to store system configuration. At run time this data is visible via registers. Currently the only explicitly supported feature is that the unique ID provided by every WM831x device is exported to user space via sysfs. Other configuration data may be read by system-specific code in the pre_init() and post_init() platform data operations. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Conditionally add WM831x backlight subdeviceMark Brown1-0/+6
The WM831x backlight driver requires at least the specification of the current sink to use and a maximum current to allow them to function and will actively interfere with other users of the regulators it uses if misconfigured so only register the subdevice for it if this platform data has been supplied. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Add WM831x AUXADC supportMark Brown2-0/+218
The WM831x contains an auxiliary ADC with a number of switchable inputs which is used to monitor some of the voltages and temperatures in the system and has some external inputs which can be used for machine specific purposes. Provide an API allowing drivers to read values from the ADC. An internal reference voltage is provided to allow callibration of the ADC. This is used to calibrate the device at startup. The hardware also supports continuous readings and digital comparators. These are not yet supported by the driver. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Add WM831x interrupt supportMark Brown2-0/+785
The WM831x includes an interrupt controller managing interrupts for the various functions on the chip. This patch adds support for the core interrupt block on the device. Ideally this would be supported by genirq, particularly for the GPIOs, but currently genirq is unable to cope with controllers on interrupt driven buses so we cut'n'paste the generic interface. Once genirq is able to cope chips like this it should be a case of filing the prefixes off the code and redoing wm831x-irq.c to move over. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Initial core support for WM831x series devicesMark Brown2-0/+354
The WM831x series of devices are register compatible processor power management subsystems, providing regulator and power path management facilities along with other services like watchdog, RTC and touch panel controllers. This patch adds very basic support, providing basic single register I2C access, handling of the security key and registration of the devices. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Allow multiple MFD cells with the same nameMark Brown1-0/+1
Provide basic support for MFDs having multiple cells of a given type with different IDs by adding an id to the mfd_cell structure and then adding that to the id passed in to mfd_add_devices(). As it stands this approach requires that MFDs using this feature deal with ensuring that there aren't any ID collisions resulting from multiple MFDs of the same type being instantiated. This needs to happen with the existing code too, but with this approach there is a knock on effect on the IDs for non-duplicated devices. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Remove VIB defines from pcap header fileDaniel Ribeiro1-4/+0
Vibrator will be accessed via the pcap-regulator driver, no need to expose its bits in the header file. Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: fix wrong define for 10bit pcf50633 ADC modePaul Fertser1-1/+2
The 10 bits definition was the 8 bits one. Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: use a dedicated workqueue for pcf50633 irq processingPaul Fertser1-0/+1
Using the default kernel "events" workqueue causes problems with synchronous adc readings if initiated from some task on the same workqueue. I had a deadlock trying to use pcf50633_adc_sync_read from a power_supply class driver because the reading was initiated from the workqueue and it waited for the irq processing to complete (to get the result) and that was put on the same workqueue. Signed-off-by: Paul Fertser <fercerpav@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17hwmon: Add WM835x PMIC hardware monitoring driverMark Brown1-0/+6
This driver provides reporting of the status supply voltage rails of the WM835x series of PMICs via the hwmon API. Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com> Acked-by: Jean Delvare <khali@linux-fr.org> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: add ezx_pcap_setbitsDaniel Ribeiro1-0/+1
Provides an atomic set_bits functions, as needed by the pcap-regulator driver. Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: add set_ts_bits for pcapDaniel Ribeiro1-0/+1
Some TS controller bits are on the same register as the ADC control, save TS specific bits and export a set_ts_bits function so the TS driver can set it with the adc_mutex lock held. Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-17mfd: Introduce irq_to_pcap()Daniel Ribeiro1-0/+1
Export an irq_to_pcap function to get pcap irq number, for the keypad driver. Signed-off-by: Daniel Ribeiro <drwyrm@gmail.com> Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2009-09-16jbd: Journal block numbers can ever be only 32-bit use unsigned int for themJan Kara1-13/+13
It does not make sense to store block number for journal as unsigned long since they can be only 32-bit (because of on-disk format limitation). So change in-memory structures and variables to use unsigned int instead. Signed-off-by: Jan Kara <jack@suse.cz>
2009-09-16Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6Linus Torvalds6-18/+37
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core-2.6: Driver Core: devtmpfs - kernel-maintained tmpfs-based /dev debugfs: Modify default debugfs directory for debugging pktcdvd. debugfs: Modified default dir of debugfs for debugging UHCI. debugfs: Change debugfs directory of IWMC3200 debugfs: Change debuhgfs directory of trace-events-sample.h debugfs: Fix mount directory of debugfs by default in events.txt hpilo: add poll f_op hpilo: add interrupt handler hpilo: staging for interrupt handling driver core: platform_device_add_data(): use kmemdup() Driver core: Add support for compatibility classes uio: add generic driver for PCI 2.3 devices driver-core: move dma-coherent.c from kernel to driver/base mem_class: fix bug mem_class: use minor as index instead of searching the array driver model: constify attribute groups UIO: remove 'default n' from Kconfig Driver core: Add accessor for device platform data Driver core: move dev_get/set_drvdata to drivers/base/dd.c Driver core: add new device to bus's list before probing
2009-09-16Merge branch 'linux-next' of ↵Linus Torvalds4-4/+229
git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6 * 'linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6: (75 commits) PCI hotplug: clean up acpi_run_hpp() PCI hotplug: acpiphp: use generic pci_configure_slot() PCI hotplug: shpchp: use generic pci_configure_slot() PCI hotplug: pciehp: use generic pci_configure_slot() PCI hotplug: add pci_configure_slot() PCI hotplug: clean up acpi_get_hp_params_from_firmware() interface PCI hotplug: acpiphp: don't cache hotplug_params in acpiphp_bridge PCI hotplug: acpiphp: remove superfluous _HPP/_HPX evaluation PCI: Clear saved_state after the state has been restored PCI PM: Return error codes from pci_pm_resume() PCI: use dev_printk in quirk messages PCI / PCIe portdrv: Fix pcie_portdrv_slot_reset() PCI Hotplug: convert acpi_pci_detect_ejectable() to take an acpi_handle PCI Hotplug: acpiphp: find bridges the easy way PCI: pcie portdrv: remove unused variable PCI / ACPI PM: Propagate wake-up enable for devices w/o ACPI support ACPI PM: Replace wakeup.prepared with reference counter PCI PM: Introduce device flag wakeup_prepared PCI / ACPI PM: Rework some debug messages PCI PM: Simplify PCI wake-up code ... Fixed up conflict in arch/powerpc/kernel/pci_64.c due to OF device tree scanning having been moved and merged for the 32- and 64-bit cases. The 'needs_freset' initialization added in 6e19314cc ("PCI/powerpc: support PCIe fundamental reset") is now in arch/powerpc/kernel/pci_of_scan.c.
2009-09-16sched: Disable wakeup balancingPeter Zijlstra1-3/+3
Sysbench thinks SD_BALANCE_WAKE is too agressive and kbuild doesn't really mind too much, SD_BALANCE_NEWIDLE picks up most of the slack. On a dual socket, quad core, dual thread nehalem system: sysbench (--num_threads=16): SD_BALANCE_WAKE-: 13982 tx/s SD_BALANCE_WAKE+: 15688 tx/s kbuild (-j16): SD_BALANCE_WAKE-: 47.648295846 seconds time elapsed ( +- 0.312% ) SD_BALANCE_WAKE+: 47.608607360 seconds time elapsed ( +- 0.026% ) (same within noise) Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-16writeback: separate starting of sync vs opportunistic writebackJens Axboe2-3/+3
bdi_start_writeback() is currently split into two paths, one for WB_SYNC_NONE and one for WB_SYNC_ALL. Add bdi_sync_writeback() for WB_SYNC_ALL writeback and let bdi_start_writeback() handle only WB_SYNC_NONE. Push down the writeback_control allocation and only accept the parameters that make sense for each function. This cleans up the API considerably. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16writeback: use RCU to protect bdi_listJens Axboe1-0/+1
Now that bdi_writeback_all() no longer handles integrity writeback, it doesn't have to block anymore. This means that we can switch bdi_list reader side protection to RCU. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16fs: Assign bdi in super_blockJens Axboe1-0/+1
We do this automatically in get_sb_bdev() from the set_bdev_super() callback. Filesystems that have their own private backing_dev_info must assign that in ->fill_super(). Note that ->s_bdi assignment is required for proper writeback! Acked-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16writeback: get rid of wbc->for_writepagesJens Axboe1-1/+0
It's only set, it's never checked. Kill it. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16fs: remove bdev->bd_inode_backing_dev_infoJens Axboe1-1/+0
It has been unused since it was introduced in: commit 520808bf20e90fdbdb320264ba7dd5cf9d47dcac Author: Andrew Morton <akpm@osdl.org> Date: Fri May 21 00:46:17 2004 -0700 [PATCH] block device layer: separate backing_dev_info infrastructure So lets just kill it. Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2009-09-16HWPOISON: The high level memory error handler in the VM v7Andi Kleen2-0/+8
Add the high level memory handler that poisons pages that got corrupted by hardware (typically by a two bit flip in a DIMM or a cache) on the Linux level. The goal is to prevent everyone from accessing these pages in the future. This done at the VM level by marking a page hwpoisoned and doing the appropriate action based on the type of page it is. The code that does this is portable and lives in mm/memory-failure.c To quote the overview comment: High level machine check handler. Handles pages reported by the hardware as being corrupted usually due to a 2bit ECC memory or cache failure. This focuses on pages detected as corrupted in the background. When the current CPU tries to consume corruption the currently running process can just be killed directly instead. This implies that if the error cannot be handled for some reason it's safe to just ignore it because no corruption has been consumed yet. Instead when that happens another machine check will happen. Handles page cache pages in various states. The tricky part here is that we can access any page asynchronous to other VM users, because memory failures could happen anytime and anywhere, possibly violating some of their assumptions. This is why this code has to be extremely careful. Generally it tries to use normal locking rules, as in get the standard locks, even if that means the error handling takes potentially a long time. Some of the operations here are somewhat inefficient and have non linear algorithmic complexity, because the data structures have not been optimized for this case. This is in particular the case for the mapping from a vma to a process. Since this case is expected to be rare we hope we can get away with this. There are in principle two strategies to kill processes on poison: - just unmap the data and wait for an actual reference before killing - kill as soon as corruption is detected. Both have advantages and disadvantages and should be used in different situations. Right now both are implemented and can be switched with a new sysctl vm.memory_failure_early_kill The default is early kill. The patch does some rmap data structure walking on its own to collect processes to kill. This is unusual because normally all rmap data structure knowledge is in rmap.c only. I put it here for now to keep everything together and rmap knowledge has been seeping out anyways Includes contributions from Johannes Weiner, Chris Mason, Fengguang Wu, Nick Piggin (who did a lot of great work) and others. Cc: npiggin@suse.de Cc: riel@redhat.com Signed-off-by: Andi Kleen <ak@linux.intel.com> Acked-by: Rik van Riel <riel@redhat.com> Reviewed-by: Hidehiro Kawai <hidehiro.kawai.ez@hitachi.com>
2009-09-16HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per processAndi Kleen2-0/+4
This allows processes to override their early/late kill behaviour on hardware memory errors. Typically applications which are memory error aware is better of with early kill (see the error as soon as possible), all others with late kill (only see the error when the error is really impacting execution) There's a global sysctl, but this way an application can set its specific policy. We're using two bits, one to signify that the process stated its intention and that I also made the prctl future proof by enforcing the unused arguments are 0. The state is inherited to children. Note this makes us officially run out of process flags on 32bit, but the next patch can easily add another field. Manpage patch will be supplied separately. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16HWPOISON: Define a new error_remove_page address space op for async truncationAndi Kleen2-0/+2
Truncating metadata pages is not safe right now before we haven't audited all file systems. To enable truncation only for data address space define a new address_space callback error_remove_page. This is used for memory_failure.c memory error handling. This can be then set to truncate_inode_page() This patch just defines the new operation and adds documentation. Callers and users come in followon patches. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16HWPOISON: Add invalidate_inode_pageWu Fengguang1-0/+2
Add a simple way to invalidate a single page This is just a refactoring of the truncate.c code. Originally from Fengguang, modified by Andi Kleen. Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16HWPOISON: Refactor truncate to allow direct truncating of page v2Nick Piggin1-0/+2
Extract out truncate_inode_page() out of the truncate path so that it can be used by memory-failure.c [AK: description, headers, fix typos] v2: Some white space changes from Fengguang Wu Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16HWPOISON: Handle hardware poisoned pages in try_to_unmapAndi Kleen1-0/+1
When a page has the poison bit set replace the PTE with a poison entry. This causes the right error handling to be done later when a process runs into it. v2: add a new flag to not do that (needed for the memory-failure handler later) (Fengguang) v3: remove unnecessary is_migration_entry() test (Fengguang, Minchan) Reviewed-by: Minchan Kim <minchan.kim@gmail.com> Reviewed-by: Wu Fengguang <fengguang.wu@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16HWPOISON: Use bitmask/action code for try_to_unmap behaviourAndi Kleen1-1/+12
try_to_unmap currently has multiple modi (migration, munlock, normal unmap) which are selected by magic flag variables. The logic is not very straight forward, because each of these flag change multiple behaviours (e.g. migration turns off aging, not only sets up migration ptes etc.) Also the different flags interact in magic ways. A later patch in this series adds another mode to try_to_unmap, so this becomes quickly unmanageable. Replace the different flags with a action code (migration, munlock, munmap) and some additional flags as modifiers (ignore mlock, ignore aging). This makes the logic more straight forward and allows easier extension to new behaviours. Change all the caller to declare what they want to do. This patch is supposed to be a nop in behaviour. If anyone can prove it is not that would be a bug. Cc: Lee.Schermerhorn@hp.com Cc: npiggin@suse.de Signed-off-by: Andi Kleen <ak@linux.intel.com>
2009-09-16HWPOISON: Add basic support for poisoned pages in fault handler v3Andi Kleen1-1/+2
- Add a new VM_FAULT_HWPOISON error code to handle_mm_fault. Right now architectures have to explicitely enable poison page support, so this is forward compatible to all architectures. They only need to add it when they enable poison page support. - Add poison page handling in swap in fault code v2: Add missing delayacct_clear_flag (Hidehiro Kawai) v3: Really use delayacct_clear_flag (Hidehiro Kawai) Signed-off-by: Andi Kleen <ak@linux.intel.com>