summaryrefslogtreecommitdiff
path: root/include/linux
AgeCommit message (Collapse)AuthorFilesLines
2014-01-14ARM: tegra: fix tegra_powergate_sequence_power_up() inlineStephen Warren1-1/+1
Remove an invalid semicolon from the inline dummy, thus solving: In file included from drivers/gpu/drm/tegra/gr3d.c:15:0: include/linux/tegra-powergate.h:119:1: error: expected identifier or '(' before '{' token Fixes: 80b28791ff04 ("ARM: tegra: pass reset to tegra_powergate_sequence_power_up()") Reported-by: Russell King <linux@arm.linux.org.uk> Signed-off-by: Stephen Warren <swarren@nvidia.com> Reviewed-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Kevin Hilman <khilman@linaro.org>
2014-01-14ARM: S3C[24|64]xx: move includes back under <mach/> scopeLinus Walleij2-200/+0
When refactoring and breaking out the includes for the machine-specific GPIO configuration, two files were created in <linux/platform_data/gpio-samsung-s3c[24|64]xx.h>, but as that namespace shall be used for defining data exchanged between machines and drivers, using it for these broad macros and config settings is wrong. Move the headers back into the machine-local <mach/gpio-samsung.h> file and think about the next step. Reported-by: Arnd Bergmann <arnd@arndb.de> Cc: Tomasz Figa <tomasz.figa@gmail.com> Cc: Sylwester Nawrocki <sylvester.nawrocki@gmail.com> Cc: Ben Dooks <ben-linux@fluff.org> Cc: Kukjin Kim <kgene.kim@samsung.com> Cc: linux-samsung-soc@vger.kernel.org Acked-by: Mark Brown <broonie@linaro.org> Acked-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Heiko Stuebner <heiko@sntech.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-01-14pinctrl: as3722: Set pin to output mode for some functionMallikarjun Kasoju1-0/+1
If pins are used for function output like pwm, clk32k, power good etc then set it as output mode default. Signed-off-by: Mallikarjun Kasoju <mkasoju@nvidia.com> Signed-off-by: Laxman Dewangan <ldewangan@nvidia.com> Acked-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2014-01-14libceph: rename ceph_msg::front_max to front_alloc_lenIlya Dryomov1-1/+1
Rename front_max field of struct ceph_msg to front_alloc_len to make its purpose more clear. Signed-off-by: Ilya Dryomov <ilya.dryomov@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
2014-01-14mtd: nand: use __packed shorthandBrian Norris1-1/+1
To be consistent with the rest of include/linux/mtd/nand.h, we should use the __packed shorthand instead of __attribute__((packed)). Signed-off-by: Brian Norris <computersforpeace@gmail.com> Acked-by: Huang Shijie <b32955@freescale.com>
2014-01-14mtd: nand: support Micron READ RETRYBrian Norris1-0/+3
Micron provides READ RETRY support via the ONFI vendor-specific parameter block (to indicate how many read-retry modes are available) and the ONFI {GET,SET}_FEATURES commands with a vendor-specific feature address (to support reading/switching the current read-retry mode). The recommended sequence is as follows: 1. Perform PAGE_READ operation 2. If no ECC error, we are done 3. Run SET_FEATURES with feature address 89h, mode 1 4. Retry PAGE_READ operation 5. If ECC error and there are remaining supported modes, increment the mode and return to step 3. Otherwise, this is a true ECC error. 6. Run SET_FEATURES with feature address 89h, mode 0, to return to the default state. This patch implements the chip->setup_read_retry() callback for Micron and fills in the chip->read_retries. Tested on Micron MT29F32G08CBADA, which supports 8 read-retry modes. The Micron vendor-specific table was checked against the datasheets for the following Micron NAND: Needs retry Cell-type Part number Vendor revision Byte 180 ----------- --------- ---------------- --------------- ------------ No SLC MT29F16G08ABABA 1 Reserved (0) No MLC MT29F32G08CBABA 1 Reserved (0) No SLC MT29F1G08AACWP 1 0 Yes MLC MT29F32G08CBADA 1 08h Yes MLC MT29F64G08CBABA 2 08h Signed-off-by: Brian Norris <computersforpeace@gmail.com> Acked-by: Huang Shijie <b32955@freescale.com>
2014-01-14mtd: nand: add generic READ RETRY supportBrian Norris1-0/+6
Modern MLC (and even SLC?) NAND can experience a large number of bitflips (beyond the recommended correctability capacity) due to drifts in the voltage threshold (Vt). These bitflips can cause ECC errors to occur well within the expected lifetime of the flash. To account for this, some manufacturers provide a mechanism for shifting the Vt threshold after a corrupted read. The generic pattern seems to be that a particular flash has N read retry modes (where N = 0, traditionally), and after an ECC failure, the host should reconfigure the flash to use the next available mode, then retry the read operation. This process repeats until all bitfips can be corrected or until the host has tried all available retry modes. This patch adds the infrastructure support for a vendor-specific/flash-specific callback, used for setting the read-retry mode (i.e., voltage threshold). For now, this patch always returns the flash to mode 0 (the default mode) after a successful read-retry, according to the flowchart found in Micron's datasheets. This may need to change in the future if it is determined that eventually, mode 0 is insufficient for the majority of the flash cells (and so for performance reasons, we should leave the flash in mode 1, 2, etc.). Signed-off-by: Brian Norris <computersforpeace@gmail.com> Acked-by: Huang Shijie <b32955@freescale.com>
2014-01-14mtd: nand: add ONFI vendor block for MicronBrian Norris1-1/+22
Signed-off-by: Brian Norris <computersforpeace@gmail.com> Acked-by: Huang Shijie <b32955@freescale.com>
2014-01-14audit: correct a type mismatch in audit_syscall_exit()AKASHI Takahiro1-1/+1
audit_syscall_exit() saves a result of regs_return_value() in intermediate "int" variable and passes it to __audit_syscall_exit(), which expects its second argument as a "long" value. This will result in truncating the value returned by a system call and making a wrong audit record. I don't know why gcc compiler doesn't complain about this, but anyway it causes a problem at runtime on arm64 (and probably most 64-bit archs). Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Eric Paris <eparis@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-14audit: convert all sessionid declaration to unsigned intEric Paris1-1/+1
Right now the sessionid value in the kernel is a combination of u32, int, and unsigned int. Just use unsigned int throughout. Signed-off-by: Eric Paris <eparis@redhat.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-14audit: refactor audit_receive_msg() to clarify AUDIT_*_RULE* casesRichard Guy Briggs1-1/+3
audit_receive_msg() needlessly contained a fallthrough case that called audit_receive_filter(), containing no common code between the cases. Separate them to make the logic clearer. Refactor AUDIT_LIST_RULES, AUDIT_ADD_RULE, AUDIT_DEL_RULE cases to create audit_rule_change(), audit_list_rules_send() functions. This should not functionally change the logic. Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-14audit: fix incorrect type of sessionidRichard Guy Briggs1-2/+2
The type of task->sessionid is unsigned int, the return type of audit_get_sessionid should be consistent with it. Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-14audit: fix netlink portid naming and typesRichard Guy Briggs1-1/+1
Normally, netlink ports use the PID of the userspace process as the port ID. If the PID is already in use by a port, the kernel will allocate another port ID to avoid conflict. Re-name all references to netlink ports from pid to portid to reflect this reality and avoid confusion with actual PIDs. Ports use the __u32 type, so re-type all portids accordingly. (This patch is very similar to ebiederman's 5deadd69) Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Signed-off-by: Gao feng <gaofeng@cn.fujitsu.com> Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-14audit: Simplify and correct audit_log_capsetEric W. Biederman1-5/+5
- Always report the current process as capset now always only works on the current process. This prevents reporting 0 or a random pid in a random pid namespace. - Don't bother to pass the pid as is available. Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> (cherry picked from commit bcc85f0af31af123e32858069eb2ad8f39f90e67) (cherry picked from commit f911cac4556a7a23e0b3ea850233d13b32328692) Signed-off-by: Richard Guy Briggs <rgb@redhat.com> [eparis: fix build error when audit disabled] Signed-off-by: Eric Paris <eparis@redhat.com>
2014-01-14PCI: Add global pci_lock_rescan_remove()Rafael J. Wysocki1-0/+3
There are multiple PCI device addition and removal code paths that may be run concurrently with the generic PCI bus rescan and device removal that can be triggered via sysfs. If that happens, it may lead to multiple different, potentially dangerous race conditions. The most straightforward way to address those problems is to run the code in question under the same lock that is used by the generic rescan/remove code in pci-sysfs.c. To prepare for those changes, move the definition of the global PCI remove/rescan lock to probe.c and provide global wrappers, pci_lock_rescan_remove() and pci_unlock_rescan_remove(), allowing drivers to manipulate that lock. Also provide pci_stop_and_remove_bus_device_locked() for the callers of pci_stop_and_remove_bus_device() who only need to hold the rescan/remove lock around it. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-14PCI: Cleanup pci.h whitespaceBjorn Helgaas1-184/+54
Put empty or trivial inline stub functions on one line when they fit. No functional change. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-14PCI: Reorder so actual code comes before stubsBjorn Helgaas1-23/+23
Consistently use the: #ifdef CONFIG_PCI_FOO int pci_foo(...); #else static inline int pci_foo(...) { return -1; } #endif pattern, instead of sometimes using "#ifndef CONFIG_PCI_FOO". No functional change. Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-14usb: chipidea: add freescale imx28 special write register methodPeter Chen1-0/+1
According to Freescale imx28 Errata, "ENGR119653 USB: ARM to USB register error issue", All USB register write operations must use the ARM SWP instruction. So, we implement special hw_write and hw_test_and_clear for imx28. Discussion for it at below: http://marc.info/?l=linux-usb&m=137996395529294&w=2 This patch is needed for stable tree 3.11+. Cc: stable@vger.kernel.org Cc: robert.hodaszi@digi.com Signed-off-by: Peter Chen <peter.chen@freescale.com> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Merge branch 'pci/dead-code' into nextBjorn Helgaas2-60/+0
* pci/dead-code: PCI: Make local functions static PCI: Remove unused alloc_pci_dev() PCI: Remove unused pci_renumber_slot() PCI: Remove unused pcie_aspm_enabled() PCI: Remove unused pci_vpd_truncate() PCI: Remove unused ID-Based Ordering support PCI: Remove unused Optimized Buffer Flush/Fill support PCI: Remove unused Latency Tolerance Reporting support PCI: Removed unused parts of Page Request Interface support Conflicts: drivers/pci/pci.c include/linux/pci.h
2014-01-14Revert "kernfs: replace kernfs_node->u.completion with ↵Greg Kroah-Hartman1-2/+2
kernfs_root->deactivate_waitq" This reverts commit ea1c472dfeada211a0100daa7976e8e8e779b858. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Revert "kernfs: remove KERNFS_ACTIVE_REF and add kernfs_lockdep()"Greg Kroah-Hartman1-0/+1
This reverts commit a69d001cfc712b96ec9d7ba44d6285702a38dabf. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Revert "kernfs: remove KERNFS_REMOVED"Greg Kroah-Hartman1-0/+1
This reverts commit ae34372eb8408b3d07e870f1939f99007a730d28. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Revert "kernfs: restructure removal path to fix possible premature return"Greg Kroah-Hartman1-1/+0
This reverts commit 45a140e587f3d32d8d424ed940dffb61e1739047. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14phylib: Add of_phy_attachAndy Fleming1-0/+10
10G PHYs don't currently support running the state machine, which is implicitly setup via of_phy_connect(). Therefore, it is necessary to implement an OF version of phy_attach(), which does everything except start the state machine. Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14phylib: Support attaching to generic 10g driverAndy Fleming1-0/+2
phy_attach_direct() may now attach to a generic 10G driver. It can also be used exactly as phy_connect_direct(), which will be useful when using of_mdio, as phy_connect (and therefore of_phy_connect) start the PHY state machine, which is currently irrelevant for 10G PHYs. Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14phylib: introduce PHY_INTERFACE_MODE_XGMII for 10G PHYAndy Fleming1-0/+1
Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Acked-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14phylib: Add Clause 45 read/write functionsAndy Fleming1-0/+39
Need an extra parameter to read or write Clause 45 PHYs, so need a different API with the extra parameter. Signed-off-by: Andy Fleming <afleming@gmail.com> Signed-off-by: Shaohui Xie <Shaohui.Xie@freescale.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-01-14Revert "kernfs: remove kernfs_addrm_cxt"Greg Kroah-Hartman1-0/+4
This reverts commit 99177a34110889a8f2c36420c34e3bcc9bfd8a70. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Revert "kernfs: implement kernfs_{de|re}activate[_self]()"Greg Kroah-Hartman1-6/+1
This reverts commit 9f010c2ad5194a4b682e747984477850fabd03be. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Revert "kernfs, sysfs, driver-core: implement kernfs_remove_self() and its ↵Greg Kroah-Hartman3-15/+0
wrappers" This reverts commit 1ae06819c77cff1ea2833c94f8c093fe8a5c79db. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Cc: Alan Stern <stern@rowland.harvard.edu> Cc: kbuild test robot <fengguang.wu@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-14Revert "sysfs, driver-core: remove unused ↵Greg Kroah-Hartman2-1/+19
{sysfs|device}_schedule_callback_owner()" This reverts commit d1ba277e79889085a2faec3b68b91ce89c63f888. Tejun writes: I'm sorry but can you please revert the whole series? get_active() waiting while a node is deactivated has potential to lead to deadlock and that deactivate/reactivate interface is something fundamentally flawed and that cgroup will have to work with the remove_self() like everybody else. IOW, I think the first posting was correct. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2014-01-13Merge branch 'for-john' of ↵John W. Linville1-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211-next
2014-01-13PCI: Make local functions staticStephen Hemminger1-5/+0
Using 'make namespacecheck' identify code which should be declared static. Checked for users in other driver/archs as well. Compile tested only. This stops exporting the following interfaces to modules: pci_target_state() pci_load_saved_state() [bhelgaas: retained pci_find_next_ext_capability() and pci_cfg_space_size()] Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-13PCI: Remove unused alloc_pci_dev()Stephen Hemminger1-1/+0
My philosophy is unused code is dead code. And dead code is subject to bit rot and is a likely source of bugs. Use it or lose it. This removes this unused and deprecated interface: alloc_pci_dev() [bhelgaas: split to separate patch] Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-13PCI: Remove unused pci_renumber_slot()Stephen Hemminger1-1/+0
My philosophy is unused code is dead code. And dead code is subject to bit rot and is a likely source of bugs. Use it or lose it. This reverts part of f46753c5e354 ("PCI: introduce pci_slot") and d25b7c8d6ba2 ("PCI: rename pci_update_slot_number to pci_renumber_slot"), removing this interface: pci_renumber_slot() [bhelgaas: split to separate patch, add historical link from Alex] Link: http://lkml.kernel.org/r/20081009043140.8678.44164.stgit@bob.kio Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Chiang <achiang@canonical.com>
2014-01-13PCI: Remove unused pcie_aspm_enabled()Stephen Hemminger1-2/+0
My philosophy is unused code is dead code. And dead code is subject to bit rot and is a likely source of bugs. Use it or lose it. This reverts part of 3e1b16002af2 ("ACPI/PCI: PCIe ASPM _OSC support capabilities called when root bridge added"), removing this interface: pcie_aspm_enabled() [bhelgaas: split to separate patch] Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> CC: Andrew Patterson <andrew.patterson@hp.com>
2014-01-13PCI: Remove unused pci_vpd_truncate()Stephen Hemminger1-1/+0
My philosophy is unused code is dead code. And dead code is subject to bit rot and is a likely source of bugs. Use it or lose it. This reverts db5679437a2b ("PCI: add interface to set visible size of VPD"), removing this interface: pci_vpd_truncate() [bhelgaas: split to separate patch, also remove prototype from pci.h] Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2014-01-13mmc: sdhci: add quirk for broken HS200 supportDavid Cohen1-0/+2
This patch defines a quirk for platforms unable to enable HS200 support. Signed-off-by: David Cohen <david.a.cohen@linux.intel.com> Reviewed-by: Chuanxiao Dong <chuanxiao.dong@intel.com> Acked-by: Dong Aisheng <b29396@freescale.com> Cc: stable <stable@vger.kernel.org> # [3.13] Signed-off-by: Chris Ball <chris@printf.net>
2014-01-13mmc: SDHI: add SoC specific workaround via HW versionKuninori Morimoto1-0/+1
One of Renesas SDHI chip needs workaround to use it, and, we can judge it based on chip version. This patch adds very quick-hack workaround method, since we still don't know how many chips need workaround in the future. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Chris Ball <cjb@laptop.org>
2014-01-13mmc: tmio: add new TMIO_MMC_HAVE_HIGH_REG flagsKuninori Morimoto1-0/+7
The accessibility checking method to the higher register was added by 69d1fe18e92afb (mmc: tmio: only access registers above 0xff, if available) But, it doesn't care 32bit register. It is impossible to calculate it from the resource size, since there is 16/32 bit register IP (e.g. VERSION is located on 0xe2 if 16bit register, but it is located on 0x1c4 if 32bit register). This patch adds new TMIO_MMC_HAVE_HIGH_REG flags, tmio_mmc driver has it, and sh_mobile_sdhi doesn't have it today. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Chris Ball <cjb@laptop.org>
2014-01-13mmc: tmio: bus_shift become tmio_mmc_data memberKuninori Morimoto1-0/+1
.bus_shift is used to 16/32bit register access offset calculation on tmio driver. tmio_mmc_xxx is used from Toshiba/Renesas now, but this bus_shift value depends on HW IP. This patch moves .bus_shift to tmio_mmc_data member and sets it on each driver. Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Signed-off-by: Chris Ball <cjb@laptop.org>
2014-01-13sched/preempt: Fix up missed PREEMPT_NEED_RESCHED foldingPeter Zijlstra2-0/+30
With various drivers wanting to inject idle time; we get people calling idle routines outside of the idle loop proper. Therefore we need to be extra careful about not missing TIF_NEED_RESCHED -> PREEMPT_NEED_RESCHED propagations. While looking at this, I also realized there's a small window in the existing idle loop where we can miss TIF_NEED_RESCHED; when it hits right after the tif_need_resched() test at the end of the loop but right before the need_resched() test at the start of the loop. So move preempt_fold_need_resched() out of the loop where we're guaranteed to have TIF_NEED_RESCHED set. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-x9jgh45oeayzajz2mjt0y7d6@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/preempt, locking: Rework local_bh_{dis,en}able()Peter Zijlstra3-4/+30
Currently local_bh_disable() is out-of-line for no apparent reason. So inline it to save a few cycles on call/return nonsense, the function body is a single add on x86 (a few loads and store extra on load/store archs). Also expose two new local_bh functions: __local_bh_{dis,en}able_ip(unsigned long ip, unsigned int cnt); Which implement the actual local_bh_{dis,en}able() behaviour. The next patch uses the exposed @cnt argument to optimize bh lock functions. With build fixes from Jacob Pan. Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13cgroup: remove stray references to css_idHugh Dickins1-3/+0
Trivial: remove the few stray references to css_id, which itself was removed in v3.13's 2ff2a7d03bbe "cgroup: kill css_id". Signed-off-by: Hugh Dickins <hughd@google.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2014-01-13sched/clock, x86: Use a static_key for sched_clock_stablePeter Zijlstra1-1/+3
In order to avoid the runtime condition and variable load turn sched_clock_stable into a static_key. Also provide a shorter implementation of local_clock() and cpu_clock(int) when sched_clock_stable==1. MAINLINE PRE POST sched_clock_stable: 1 1 1 (cold) sched_clock: 329841 221876 215295 (cold) local_clock: 301773 234692 220773 (warm) sched_clock: 38375 25602 25659 (warm) local_clock: 100371 33265 27242 (warm) rdtsc: 27340 24214 24208 sched_clock_stable: 0 0 0 (cold) sched_clock: 382634 235941 237019 (cold) local_clock: 396890 297017 294819 (warm) sched_clock: 38194 25233 25609 (warm) local_clock: 143452 71234 71232 (warm) rdtsc: 27345 24245 24243 Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-eummbdechzz37mwmpags1gjr@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/preempt: Take away preempt_enable_no_resched() from modulesPeter Zijlstra2-3/+24
Discourage drivers/modules to be creative with preemption. Sadly all is implemented in macros and inline so if they want to do evil they still can, but at least try and discourage some. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Eliezer Tamir <eliezer.tamir@linux.intel.com> Cc: rui.zhang@intel.com Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Cc: rjw@rjwysocki.net Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-fn7h6vu8wtgxk0ih402qcijx@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13locking: Optimize lock_bh functionsPeter Zijlstra4-21/+34
Currently all _bh_ lock functions do two preempt_count operations: local_bh_disable(); preempt_disable(); and for the unlock: preempt_enable_no_resched(); local_bh_enable(); Since its a waste of perfectly good cycles to modify the same variable twice when you can do it in one go; use the new __local_bh_{dis,en}able_ip() functions that allow us to provide a preempt_count value to add/sub. So define SOFTIRQ_LOCK_OFFSET as the offset a _bh_ lock needs to add/sub to be done in one go. As a bonus it gets rid of the preempt_enable_no_resched() usage. This reduces a 1000 loops of: spin_lock_bh(&bh_lock); spin_unlock_bh(&bh_lock); from 53596 cycles to 51995 cycles. I didn't do enough measurements to say for absolute sure that the result is significant but the the few runs I did for each suggest it is so. Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: jacob.jun.pan@linux.intel.com Cc: Mike Galbraith <bitbucket@online.de> Cc: hpa@zytor.com Cc: Arjan van de Ven <arjan@linux.intel.com> Cc: lenb@kernel.org Cc: rjw@rjwysocki.net Cc: rui.zhang@intel.com Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: http://lkml.kernel.org/r/20131119151338.GF3694@twins.programming.kicks-ass.net Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/deadline: Remove the sysctl_sched_dl knobsPeter Zijlstra1-13/+0
Remove the deadline specific sysctls for now. The problem with them is that the interaction with the exisiting rt knobs is nearly impossible to get right. The current (as per before this patch) situation is that the rt and dl bandwidth is completely separate and we enforce rt+dl < 100%. This is undesirable because this means that the rt default of 95% leaves us hardly any room, even though dl tasks are saver than rt tasks. Another proposed solution was (a discarted patch) to have the dl bandwidth be a fraction of the rt bandwidth. This is highly confusing imo. Furthermore neither proposal is consistent with the situation we actually want; which is rt tasks ran from a dl server. In which case the rt bandwidth is a direct subset of dl. So whichever way we go, the introduction of dl controls at this point is painful. Therefore remove them and instead share the rt budget. This means that for now the rt knobs are used for dl admission control and the dl runtime is accounted against the rt runtime. I realise that this isn't entirely desirable either; but whatever we do we appear to need to change the interface later, so better have a small interface for now. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/n/tip-zpyqbqds1r0vyxtxza1e7rdc@git.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/deadline: Add bandwidth management for SCHED_DEADLINE tasksDario Faggioli2-0/+14
In order of deadline scheduling to be effective and useful, it is important that some method of having the allocation of the available CPU bandwidth to tasks and task groups under control. This is usually called "admission control" and if it is not performed at all, no guarantee can be given on the actual scheduling of the -deadline tasks. Since when RT-throttling has been introduced each task group have a bandwidth associated to itself, calculated as a certain amount of runtime over a period. Moreover, to make it possible to manipulate such bandwidth, readable/writable controls have been added to both procfs (for system wide settings) and cgroupfs (for per-group settings). Therefore, the same interface is being used for controlling the bandwidth distrubution to -deadline tasks and task groups, i.e., new controls but with similar names, equivalent meaning and with the same usage paradigm are added. However, more discussion is needed in order to figure out how we want to manage SCHED_DEADLINE bandwidth at the task group level. Therefore, this patch adds a less sophisticated, but actually very sensible, mechanism to ensure that a certain utilization cap is not overcome per each root_domain (the single rq for !SMP configurations). Another main difference between deadline bandwidth management and RT-throttling is that -deadline tasks have bandwidth on their own (while -rt ones doesn't!), and thus we don't need an higher level throttling mechanism to enforce the desired bandwidth. This patch, therefore: - adds system wide deadline bandwidth management by means of: * /proc/sys/kernel/sched_dl_runtime_us, * /proc/sys/kernel/sched_dl_period_us, that determine (i.e., runtime / period) the total bandwidth available on each CPU of each root_domain for -deadline tasks; - couples the RT and deadline bandwidth management, i.e., enforces that the sum of how much bandwidth is being devoted to -rt -deadline tasks to stay below 100%. This means that, for a root_domain comprising M CPUs, -deadline tasks can be created until the sum of their bandwidths stay below: M * (sched_dl_runtime_us / sched_dl_period_us) It is also possible to disable this bandwidth management logic, and be thus free of oversubscribing the system up to any arbitrary level. Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-12-git-send-email-juri.lelli@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-01-13sched/deadline: Add SCHED_DEADLINE inheritance logicDario Faggioli2-1/+12
Some method to deal with rt-mutexes and make sched_dl interact with the current PI-coded is needed, raising all but trivial issues, that needs (according to us) to be solved with some restructuring of the pi-code (i.e., going toward a proxy execution-ish implementation). This is under development, in the meanwhile, as a temporary solution, what this commits does is: - ensure a pi-lock owner with waiters is never throttled down. Instead, when it runs out of runtime, it immediately gets replenished and it's deadline is postponed; - the scheduling parameters (relative deadline and default runtime) used for that replenishments --during the whole period it holds the pi-lock-- are the ones of the waiting task with earliest deadline. Acting this way, we provide some kind of boosting to the lock-owner, still by using the existing (actually, slightly modified by the previous commit) pi-architecture. We would stress the fact that this is only a surely needed, all but clean solution to the problem. In the end it's only a way to re-start discussion within the community. So, as always, comments, ideas, rants, etc.. are welcome! :-) Signed-off-by: Dario Faggioli <raistlin@linux.it> Signed-off-by: Juri Lelli <juri.lelli@gmail.com> [ Added !RT_MUTEXES build fix. ] Signed-off-by: Peter Zijlstra <peterz@infradead.org> Link: http://lkml.kernel.org/r/1383831828-15501-11-git-send-email-juri.lelli@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>