summaryrefslogtreecommitdiff
path: root/include/linux/bitops.h
AgeCommit message (Collapse)AuthorFilesLines
2019-03-08include/linux/bitops.h: set_mask_bits() to return old valueVineet Gupta1-1/+1
| > Also, set_mask_bits is used in fs quite a bit and we can possibly come up | > with a generic llsc based implementation (w/o the cmpxchg loop) | | May I also suggest changing the return value of set_mask_bits() to old. | | You can compute the new value given old, but you cannot compute the old | value given new, therefore old is the better return value. Also, no | current user seems to use the return value, so changing it is without | risk. Link: http://lkml.kernel.org/g/20150807110955.GH16853@twins.programming.kicks-ass.net Link: http://lkml.kernel.org/r/1548275584-18096-4-git-send-email-vgupta@synopsys.com Signed-off-by: Vineet Gupta <vgupta@synopsys.com> Suggested-by: Peter Zijlstra <peterz@infradead.org> Reviewed-by: Anthony Yznaga <anthony.yznaga@oracle.com> Acked-by: Will Deacon <will.deacon@arm.com> Cc: Miklos Szeredi <mszeredi@redhat.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-10-15bitops: protect variables in bit_clear_unless() macroMiklos Szeredi1-8/+8
Unprotected naming of local variables within bit_clear_unless() can easily lead to using the wrong scope. Noticed this by code review after having hit this issue in set_mask_bits() Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 85ad1d13ee9b ("md: set MD_CHANGE_PENDING in a atomic region") Cc: Guoqing Jiang <gqjiang@suse.com>
2018-10-15bitops: protect variables in set_mask_bits() macroMiklos Szeredi1-7/+7
Unprotected naming of local variables within the set_mask_bits() can easily lead to using the wrong scope. Noticed this when "set_mask_bits(&foo->bar, 0, mask)" behaved as no-op. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Fixes: 00a1a053ebe5 ("ext4: atomically set inode->i_flags in ext4_set_inode_flags()") Cc: Theodore Ts'o <tytso@mit.edu>
2018-08-22include/linux/bitops.h: introduce BITS_PER_TYPEChris Wilson1-1/+2
net_dim.h has a rather useful extension to BITS_PER_BYTE to compute the number of bits in a type (BITS_PER_BYTE * sizeof(T)), so promote the macro to bitops.h, alongside BITS_PER_BYTE, for wider usage. Link: http://lkml.kernel.org/r/20180706094458.14116-1-chris@chris-wilson.co.uk Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Jani Nikula <jani.nikula@intel.com> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Andy Gospodarek <gospo@broadcom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-06-21locking/atomics, asm-generic: Move some macros from <linux/bitops.h> to a ↵Will Deacon1-21/+1
new <linux/bits.h> file In preparation for implementing the asm-generic atomic bitops in terms of atomic_long_*(), we need to prevent <asm/atomic.h> implementations from pulling in <linux/bitops.h>. A common reason for this include is for the BITS_PER_BYTE definition, so move this and some other BIT() and masking macros into a new header file, <linux/bits.h>. Signed-off-by: Will Deacon <will.deacon@arm.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-arm-kernel@lists.infradead.org Cc: yamada.masahiro@socionext.com Link: https://lore.kernel.org/lkml/1529412794-17720-4-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-15Merge tag 'gpio-v4.15-1' of ↵Linus Torvalds1-0/+24
ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio Pull GPIO updates from Linus Walleij: "This is the bulk of GPIO changes for the v4.15 kernel cycle: Core: - Fix the semantics of raw GPIO to actually be raw. No inversion semantics as before, but also no open draining, and allow the raw operations to affect lines used for interrupts as the caller supposedly knows what they are doing if they are getting the big hammer. - Rewrote the __inner_function() notation calls to names that make more sense. I just find this kind of code disturbing. - Drop the .irq_base() field from the gpiochip since now all IRQs are mapped dynamically. This is nice. - Support for .get_multiple() in the core driver API. This allows us to read several GPIO lines with a single register read. This has high value for some usecases: it can be used to create oscilloscopes and signal analyzers and other things that rely on reading several lines at exactly the same instant. Also a generally nice optimization. This uses the new assign_bit() macro from the bitops lib that was ACKed by Andrew Morton and is implemented for two drivers, one of them being the generic MMIO driver so everyone using that will be able to benefit from this. - Do not allow requests of Open Drain and Open Source setting of a GPIO line simultaneously. If the hardware actually supports enabling both at the same time the electrical result would be disastrous. - A new interrupt chip core helper. This will be helpful to deal with "banked" GPIOs, which means GPIO controllers with several logical blocks of GPIO inside them. This is several gpiochips per device in the device model, in contrast to the case when there is a 1-to-1 relationship between a device and a gpiochip. New drivers: - Maxim MAX3191x industrial serializer, a very interesting piece of professional I/O hardware. - Uniphier GPIO driver. This is the GPIO block from the recent Socionext (ex Fujitsu and Panasonic) platform. - Tegra 186 driver. This is based on the new banked GPIO infrastructure. Other improvements: - Some documentation improvements. - Wakeup support for the DesignWare DWAPB GPIO controller. - Reset line support on the DesignWare DWAPB GPIO controller. - Several non-critical bug fixes and improvements for the Broadcom BRCMSTB driver. - Misc non-critical bug fixes like exotic errorpaths, removal of dead code etc. - Explicit comments on fall-through switch() statements" * tag 'gpio-v4.15-1' of ssh://gitolite.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio: (65 commits) gpio: tegra186: Remove tegra186_gpio_lock_class gpio: rcar: Add r8a77995 (R-Car D3) support pinctrl: bcm2835: Fix some merge fallout gpio: Fix undefined lock_dep_class gpio: Automatically add lockdep keys gpio: Introduce struct gpio_irq_chip.first gpio: Disambiguate struct gpio_irq_chip.nested gpio: Add Tegra186 support gpio: Export gpiochip_irq_{map,unmap}() gpio: Implement tighter IRQ chip integration gpio: Move lock_key into struct gpio_irq_chip gpio: Move irq_valid_mask into struct gpio_irq_chip gpio: Move irq_nested into struct gpio_irq_chip gpio: Move irq_chained_parent to struct gpio_irq_chip gpio: Move irq_default_type to struct gpio_irq_chip gpio: Move irq_handler to struct gpio_irq_chip gpio: Move irqdomain into struct gpio_irq_chip gpio: Move irqchip into struct gpio_irq_chip gpio: Introduce struct gpio_irq_chip pinctrl: armada-37xx: remove unused variable ...
2017-11-07Merge branch 'linus' into locking/core, to resolve conflictsIngo Molnar1-0/+1
Conflicts: include/linux/compiler-clang.h include/linux/compiler-gcc.h include/linux/compiler-intel.h include/uapi/linux/stddef.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman1-0/+1
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-25locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns ↵Mark Rutland1-2/+2
to READ_ONCE()/WRITE_ONCE() Please do not apply this to mainline directly, instead please re-run the coccinelle script shown below and apply its output. For several reasons, it is desirable to use {READ,WRITE}_ONCE() in preference to ACCESS_ONCE(), and new code is expected to use one of the former. So far, there's been no reason to change most existing uses of ACCESS_ONCE(), as these aren't harmful, and changing them results in churn. However, for some features, the read/write distinction is critical to correct operation. To distinguish these cases, separate read/write accessors must be used. This patch migrates (most) remaining ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following coccinelle script: ---- // Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and // WRITE_ONCE() // $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch virtual patch @ depends on patch @ expression E1, E2; @@ - ACCESS_ONCE(E1) = E2 + WRITE_ONCE(E1, E2) @ depends on patch @ expression E; @@ - ACCESS_ONCE(E) + READ_ONCE(E) ---- Signed-off-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: davem@davemloft.net Cc: linux-arch@vger.kernel.org Cc: mpe@ellerman.id.au Cc: shuah@kernel.org Cc: snitzer@redhat.com Cc: thor.thayer@linux.intel.com Cc: tj@kernel.org Cc: viro@zeniv.linux.org.uk Cc: will.deacon@arm.com Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-19bitops: Introduce assign_bit()Lukas Wunner1-0/+24
A common idiom is to assign a value to a bit with: if (value) set_bit(nr, addr); else clear_bit(nr, addr); Likewise common is the one-line expression variant: value ? set_bit(nr, addr) : clear_bit(nr, addr); Commit 9a8ac3ae682e ("dm mpath: cleanup QUEUE_IF_NO_PATH bit manipulation by introducing assign_bit()") introduced assign_bit() to the md subsystem for brevity. Make it available to others, specifically gpiolib and the upcoming driver for Maxim MAX3191x industrial serializer chips. As requested by Peter Zijlstra, change the argument order to reflect traditional "dst = src" in C, hence "assign_bit(nr, addr, value)". Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Mike Snitzer <snitzer@redhat.com> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Neil Brown <neilb@suse.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Borislav Petkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Acked-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2017-09-09bitops: avoid integer overflow in GENMASK(_ULL)Matthias Kaehlcke1-2/+3
GENMASK(_ULL) performs a left-shift of ~0UL(L), which technically results in an integer overflow. clang raises a warning if the overflow occurs in a preprocessor expression. Clear the low-order bits through a substraction instead of the left-shift to avoid the overflow. (akpm: no change in .text size in my testing) Link: http://lkml.kernel.org/r/20170803212020.24939-1-mka@chromium.org Signed-off-by: Matthias Kaehlcke <mka@chromium.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-10-08mm/vmalloc.c: fix align value calculation errorzijun_hu1-10/+26
It causes double align requirement for __get_vm_area_node() if parameter size is power of 2 and VM_IOREMAP is set in parameter flags, for example size=0x10000 -> fls_long(0x10000)=17 -> align=0x20000 get_count_order_long() is implemented and can be used instead of fls_long() for fixing the bug, for example size=0x10000 -> get_count_order_long(0x10000)=16 -> align=0x10000 [akpm@linux-foundation.org: s/get_order_long()/get_count_order_long()/] [zijun_hu@zoho.com: fixes] Link: http://lkml.kernel.org/r/57AABC8B.1040409@zoho.com [akpm@linux-foundation.org: locate get_count_order_long() next to get_count_order()] [akpm@linux-foundation.org: move get_count_order[_long] definitions to pick up fls_long()] [zijun_hu@htc.com: move out get_count_order[_long]() from __KERNEL__ scope] Link: http://lkml.kernel.org/r/57B2C4CE.80303@zoho.com Link: http://lkml.kernel.org/r/fc045ecf-20fa-0722-b3ac-9a6140488fad@zoho.com Signed-off-by: zijun_hu <zijun_hu@htc.com> Cc: Tejun Heo <tj@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Minchan Kim <minchan@kernel.org> Cc: David Rientjes <rientjes@google.com> Signed-off-by: zijun_hu <zijun_hu@htc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-05-09md: set MD_CHANGE_PENDING in a atomic regionGuoqing Jiang1-0/+16
Some code waits for a metadata update by: 1. flagging that it is needed (MD_CHANGE_DEVS or MD_CHANGE_CLEAN) 2. setting MD_CHANGE_PENDING and waking the management thread 3. waiting for MD_CHANGE_PENDING to be cleared If the first two are done without locking, the code in md_update_sb() which checks if it needs to repeat might test if an update is needed before step 1, then clear MD_CHANGE_PENDING after step 2, resulting in the wait returning early. So make sure all places that set MD_CHANGE_PENDING are atomicial, and bit_clear_unless (suggested by Neil) is introduced for the purpose. Cc: Martin Kepplinger <martink@posteo.de> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Sasha Levin <sasha.levin@oracle.com> Cc: <linux-kernel@vger.kernel.org> Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>
2015-12-09bitops.h: correctly handle rol32 with 0 byte shiftSasha Levin1-1/+1
ROL on a 32 bit integer with a shift of 32 or more is undefined and the result is arch-dependent. Avoid this by handling the trivial case of roling by 0 correctly. The trivial solution of checking if shift is 0 breaks gcc's detection of this code as a ROL instruction, which is unacceptable. This bug was reported and fixed in GCC (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157): The standard rotate idiom, (x << n) | (x >> (32 - n)) is recognized by gcc (for concreteness, I discuss only the case that x is an uint32_t here). However, this is portable C only for n in the range 0 < n < 32. For n == 0, we get x >> 32 which gives undefined behaviour according to the C standard (6.5.7, Bitwise shift operators). To portably support n == 0, one has to write the rotate as something like (x << n) | (x >> ((-n) & 31)) And this is apparently not recognized by gcc. Note that this is broken on older GCCs and will result in slower ROL. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sasha.levin@oracle.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-07bitops.h: add sign_extend64()Martin Kepplinger1-0/+11
Months back, this was discussed, see https://lkml.org/lkml/2015/1/18/289 The result was the 64-bit version being "likely fine", "valuable" and "correct". The discussion fell asleep but since there are possible users, let's add it. Signed-off-by: Martin Kepplinger <martin.kepplinger@theobroma-systems.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: George Spelvin <linux@horizon.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Maxime Coquelin <maxime.coquelin@st.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-11-07bitops.h: improve sign_extend32()'s documentationMartin Kepplinger1-0/+2
It is often overlooked that sign_extend32(), despite its name, is safe to use for 16 and 8 bit types as well. This should help prevent sign extension being done manually some other way. Signed-off-by: Martin Kepplinger <martin.kepplinger@theobroma-systems.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Ingo Molnar <mingo@redhat.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: George Spelvin <linux@horizon.com> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Maxime Coquelin <maxime.coquelin@st.com> Cc: Denys Vlasenko <dvlasenk@redhat.com> Cc: Yury Norov <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2015-08-05linux/bitmap: Force inlining of bitmap weight functionsDenys Vlasenko1-3/+3
With this config: http://busybox.net/~vda/kernel_config_OPTIMIZE_INLINING_and_Os gcc-4.7.2 generates many copies of these tiny functions: bitmap_weight (55 copies): 55 push %rbp 48 89 e5 mov %rsp,%rbp e8 3f 3a 8b 00 callq __bitmap_weight 5d pop %rbp c3 retq hweight_long (23 copies): 55 push %rbp e8 b5 65 8e 00 callq __sw_hweight64 48 89 e5 mov %rsp,%rbp 5d pop %rbp c3 retq See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122 This patch fixes this via s/inline/__always_inline/ While at it, replaced two "__inline__" with usual "inline" (the rest of the source file uses the latter). text data bss dec filename 86971357 17195880 36659200 140826437 vmlinux.before 86971120 17195912 36659200 140826232 vmlinux Signed-off-by: Denys Vlasenko <dvlasenk@redhat.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: David Rientjes <rientjes@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Thomas Graf <tgraf@suug.ch> Cc: linux-kernel@vger.kernel.org Link: http://lkml.kernel.org/r/1438697716-28121-1-git-send-email-dvlasenk@redhat.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-04-17lib: find_*_bit reimplementationYury Norov1-2/+2
This patchset does rework to find_bit function family to achieve better performance, and decrease size of text. All rework is done in patch 1. Patches 2 and 3 are about code moving and renaming. It was boot-tested on x86_64 and MIPS (big-endian) machines. Performance tests were ran on userspace with code like this: /* addr[] is filled from /dev/urandom */ start = clock(); while (ret < nbits) ret = find_next_bit(addr, nbits, ret + 1); end = clock(); printf("%ld\t", (unsigned long) end - start); On Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz measurements are: (for find_next_bit, nbits is 8M, for find_first_bit - 80K) find_next_bit: find_first_bit: new current new current 26932 43151 14777 14925 26947 43182 14521 15423 26507 43824 15053 14705 27329 43759 14473 14777 26895 43367 14847 15023 26990 43693 15103 15163 26775 43299 15067 15232 27282 42752 14544 15121 27504 43088 14644 14858 26761 43856 14699 15193 26692 43075 14781 14681 27137 42969 14451 15061 ... ... find_next_bit performance gain is 35-40%; find_first_bit - no measurable difference. On ARM machine, there is arch-specific implementation for find_bit. Thanks a lot to George Spelvin and Rasmus Villemoes for hints and helpful discussions. This patch (of 3): New implementations takes less space in source file (see diffstat) and in object. For me it's 710 vs 453 bytes of text. It also shows better performance. find_last_bit description fixed due to obvious typo. [akpm@linux-foundation.org: include linux/bitmap.h, per Rasmus] Signed-off-by: Yury Norov <yury.norov@gmail.com> Reviewed-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Reviewed-by: George Spelvin <linux@horizon.com> Cc: Alexey Klimov <klimov.linux@gmail.com> Cc: David S. Miller <davem@davemloft.net> Cc: Daniel Borkmann <dborkman@redhat.com> Cc: Hannes Frederic Sowa <hannes@stressinduktion.org> Cc: Lai Jiangshan <laijs@cn.fujitsu.com> Cc: Mark Salter <msalter@redhat.com> Cc: AKASHI Takahiro <takahiro.akashi@linaro.org> Cc: Thomas Graf <tgraf@suug.ch> Cc: Valentin Rothberg <valentinrothberg@gmail.com> Cc: Chris Wilson <chris@chris-wilson.co.uk> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-11-16bitops: Fix shift overflow in GENMASK macrosMaxime COQUELIN1-2/+5
On some 32 bits architectures, including x86, GENMASK(31, 0) returns 0 instead of the expected ~0UL. This is the same on some 64 bits architectures with GENMASK_ULL(63, 0). This is due to an overflow in the shift operand, 1 << 32 for GENMASK, 1 << 64 for GENMASK_ULL. Reported-by: Eric Paire <eric.paire@st.com> Suggested-by: Rasmus Villemoes <linux@rasmusvillemoes.dk> Signed-off-by: Maxime Coquelin <maxime.coquelin@st.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: <stable@vger.kernel.org> # v3.13+ Cc: linux@rasmusvillemoes.dk Cc: gong.chen@linux.intel.com Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Theodore Ts'o <tytso@mit.edu> Fixes: 10ef6b0dffe4 ("bitops: Introduce a more generic BITMASK macro") Link: http://lkml.kernel.org/r/1415267659-10563-1-git-send-email-maxime.coquelin@st.com Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-08-13locking: Remove deprecated smp_mb__() barriersPeter Zijlstra1-20/+0
Its been a while and there are no in-tree users left, so remove the deprecated barriers. Signed-off-by: Peter Zijlstra <peterz@infradead.org> Cc: Chen, Gong <gong.chen@linux.intel.com> Cc: Jacob Pan <jacob.jun.pan@linux.intel.com> Cc: Joe Perches <joe@perches.com> Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: Theodore Ts'o <tytso@mit.edu> Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-04-18arch: Prepare for smp_mb__{before,after}_atomic()Peter Zijlstra1-0/+20
Since the smp_mb__{before,after}*() ops are fundamentally dependent on how an arch can implement atomics it doesn't make sense to have 3 variants of them. They must all be the same. Furthermore, the 3 variants suggest they're only valid for those 3 atomic ops, while we have many more where they could be applied. So move away from smp_mb__{before,after}_{atomic,clear}_{dec,inc,bit}() and reduce the interface to just the two: smp_mb__{before,after}_atomic(). This patch prepares the way by introducing default implementations in asm-generic/barrier.h that default to a full barrier and providing __deprecated inlines for the previous 6 barriers if they're not provided by the arch. This should allow for a mostly painless transition (lots of deprecated warns in the interim). Signed-off-by: Peter Zijlstra <peterz@infradead.org> Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Link: http://lkml.kernel.org/n/tip-wr59327qdyi9mbzn6x937s4e@git.kernel.org Cc: Arnd Bergmann <arnd@arndb.de> Cc: "Chen, Gong" <gong.chen@linux.intel.com> Cc: John Sullivan <jsrhbz@kanargh.force9.co.uk> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mauro Carvalho Chehab <m.chehab@samsung.com> Cc: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2014-03-31ext4: atomically set inode->i_flags in ext4_set_inode_flags()Theodore Ts'o1-0/+15
Use cmpxchg() to atomically set i_flags instead of clearing out the S_IMMUTABLE, S_APPEND, etc. flags and then setting them from the EXT4_IMMUTABLE_FL, EXT4_APPEND_FL flags, since this opens up a race where an immutable file has the immutable flag cleared for a brief window of time. Reported-by: John Sullivan <jsrhbz@kanargh.force9.co.uk> Signed-off-by: "Theodore Ts'o" <tytso@mit.edu> Cc: stable@kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-14Merge tag 'pm+acpi-3.13-rc1' of ↵Linus Torvalds1-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI and power management updates from Rafael J Wysocki: - New power capping framework and the the Intel Running Average Power Limit (RAPL) driver using it from Srinivas Pandruvada and Jacob Pan. - Addition of the in-kernel switching feature to the arm_big_little cpufreq driver from Viresh Kumar and Nicolas Pitre. - cpufreq support for iMac G5 from Aaro Koskinen. - Baytrail processors support for intel_pstate from Dirk Brandewie. - cpufreq support for Midway/ECX-2000 from Mark Langsdorf. - ARM vexpress/TC2 cpufreq support from Sudeep KarkadaNagesha. - ACPI power management support for the I2C and SPI bus types from Mika Westerberg and Lv Zheng. - cpufreq core fixes and cleanups from Viresh Kumar, Srivatsa S Bhat, Stratos Karafotis, Xiaoguang Chen, Lan Tianyu. - cpufreq drivers updates (mostly fixes and cleanups) from Viresh Kumar, Aaro Koskinen, Jungseok Lee, Sudeep KarkadaNagesha, Lukasz Majewski, Manish Badarkhe, Hans-Christian Egtvedt, Evgeny Kapaev. - intel_pstate updates from Dirk Brandewie and Adrian Huang. - ACPICA update to version 20130927 includig fixes and cleanups and some reduction of divergences between the ACPICA code in the kernel and ACPICA upstream in order to improve the automatic ACPICA patch generation process. From Bob Moore, Lv Zheng, Tomasz Nowicki, Naresh Bhat, Bjorn Helgaas, David E Box. - ACPI IPMI driver fixes and cleanups from Lv Zheng. - ACPI hotplug fixes and cleanups from Bjorn Helgaas, Toshi Kani, Zhang Yanfei, Rafael J Wysocki. - Conversion of the ACPI AC driver to the platform bus type and multiple driver fixes and cleanups related to ACPI from Zhang Rui. - ACPI processor driver fixes and cleanups from Hanjun Guo, Jiang Liu, Bartlomiej Zolnierkiewicz, Mathieu Rhéaume, Rafael J Wysocki. - Fixes and cleanups and new blacklist entries related to the ACPI video support from Aaron Lu, Felipe Contreras, Lennart Poettering, Kirill Tkhai. - cpuidle core cleanups from Viresh Kumar and Lorenzo Pieralisi. - cpuidle drivers fixes and cleanups from Daniel Lezcano, Jingoo Han, Bartlomiej Zolnierkiewicz, Prarit Bhargava. - devfreq updates from Sachin Kamat, Dan Carpenter, Manish Badarkhe. - Operation Performance Points (OPP) core updates from Nishanth Menon. - Runtime power management core fix from Rafael J Wysocki and update from Ulf Hansson. - Hibernation fixes from Aaron Lu and Rafael J Wysocki. - Device suspend/resume lockup detection mechanism from Benoit Goby. - Removal of unused proc directories created for various ACPI drivers from Lan Tianyu. - ACPI LPSS driver fix and new device IDs for the ACPI platform scan handler from Heikki Krogerus and Jarkko Nikula. - New ACPI _OSI blacklist entry for Toshiba NB100 from Levente Kurusa. - Assorted fixes and cleanups related to ACPI from Andy Shevchenko, Al Stone, Bartlomiej Zolnierkiewicz, Colin Ian King, Dan Carpenter, Felipe Contreras, Jianguo Wu, Lan Tianyu, Yinghai Lu, Mathias Krause, Liu Chuansheng. - Assorted PM fixes and cleanups from Andy Shevchenko, Thierry Reding, Jean-Christophe Plagniol-Villard. * tag 'pm+acpi-3.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (386 commits) cpufreq: conservative: fix requested_freq reduction issue ACPI / hotplug: Consolidate deferred execution of ACPI hotplug routines PM / runtime: Use pm_runtime_put_sync() in __device_release_driver() ACPI / event: remove unneeded NULL pointer check Revert "ACPI / video: Ignore BIOS initial backlight value for HP 250 G1" ACPI / video: Quirk initial backlight level 0 ACPI / video: Fix initial level validity test intel_pstate: skip the driver if ACPI has power mgmt option PM / hibernate: Avoid overflow in hibernate_preallocate_memory() ACPI / hotplug: Do not execute "insert in progress" _OST ACPI / hotplug: Carry out PCI root eject directly ACPI / hotplug: Merge device hot-removal routines ACPI / hotplug: Make acpi_bus_hot_remove_device() internal ACPI / hotplug: Simplify device ejection routines ACPI / hotplug: Fix handle_root_bridge_removal() ACPI / hotplug: Refuse to hot-remove all objects with disabled hotplug ACPI / scan: Start matching drivers after trying scan handlers ACPI: Remove acpi_pci_slot_init() headers from internal.h ACPI / blacklist: fix name of ThinkPad Edge E530 PowerCap: Fix build error with option -Werror=format-security ... Conflicts: arch/arm/mach-omap2/opp.c drivers/Kconfig drivers/spi/spi.c
2013-10-22bitops: Introduce a more generic BITMASK macroChen, Gong1-0/+8
GENMASK is used to create a contiguous bitmask([hi:lo]). It is implemented twice in current kernel. One is in EDAC driver, the other is in SiS/XGI FB driver. Move it to a more generic place for other usage. Signed-off-by: Chen, Gong <gong.chen@linux.intel.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Winischhofer <thomas@winischhofer.net> Cc: Jean-Christophe Plagniol-Villard <plagnioj@jcrosoft.com> Cc: Tomi Valkeinen <tomi.valkeinen@ti.com> Acked-by: Borislav Petkov <bp@suse.de> Acked-by: Mauro Carvalho Chehab <m.chehab@samsung.com> Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-10-17bitops: Introduce BIT_ULLSrinivas Pandruvada1-0/+3
Adding BIT(x) equivalent for unsigned long long type, BIT_ULL(x). Also added BIT_ULL_MASK and BIT_ULL_WORD. Suggested-by: Joe Perches <joe@perches.com> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com> Signed-off-by: Jacob Pan <jacob.jun.pan@linux.intel.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2012-03-24bitops: introduce for_each_clear_bit()Akinobu Mita1-0/+11
Introduce for_each_clear_bit() and for_each_clear_bit_from(). They are similar to for_each_set_bit() and list_for_each_set_bit_from(), but they iterate over all the cleared bits in a memory region. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Stefano Panella <stefano.panella@csr.com> Cc: David Vrabel <david.vrabel@csr.com> Cc: Sergei Shtylyov <sshtylyov@mvista.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-24bitops: remove for_each_set_bit_cont()Akinobu Mita1-3/+0
Remove for_each_set_bit_cont() after confirming that no one uses for_each_set_bit_cont() anymore. [sfr@canb.auug.org.au: regmap: cope with bitops API change] Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-03-24bitops: rename for_each_set_bit_cont() in favor of analogous list.h functionAkinobu Mita1-1/+4
This renames for_each_set_bit_cont() to for_each_set_bit_from() because it is analogous to list_for_each_entry_from() in list.h rather than list_for_each_entry_continue(). This doesn't remove for_each_set_bit_cont() for now. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: Robert Richter <robert.richter@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2012-02-19Merge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6Linus Torvalds1-0/+20
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: crypto: sha512 - use standard ror64()
2012-02-16crypto: sha512 - use standard ror64()Alexey Dobriyan1-0/+20
Use standard ror64() instead of hand-written. There is no standard ror64, so create it. The difference is shift value being "unsigned int" instead of uint64_t (for which there is no reason). gcc starts to emit native ROR instructions which it doesn't do for some reason currently. This should make the code faster. Patch survives in-tree crypto test and ping flood with hmac(sha512) on. Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2011-12-06perf, x86: Implement event scheduler helper functionsRobert Richter1-2/+8
This patch introduces x86 perf scheduler code helper functions. We need this to later add more complex functionality to support overlapping counter constraints (next patch). The algorithm is modified so that the range of weight values is now generated from the constraints. There shouldn't be other functional changes. With the helper functions the scheduler is controlled. There are functions to initialize, traverse the event list, find unused counters etc. The scheduler keeps its own state. V3: * Added macro for_each_set_bit_cont(). * Changed functions interfaces of perf_sched_find_counter() and perf_sched_next_event() to use bool as return value. * Added some comments to make code better understandable. V4: * Fix broken event assignment if weight of the first event is not wmin (perf_sched_init()). Signed-off-by: Robert Richter <robert.richter@amd.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Link: http://lkml.kernel.org/r/1321616122-1533-2-git-send-email-robert.richter@amd.com Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-05-27arch: remove CONFIG_GENERIC_FIND_{NEXT_BIT,BIT_LE,LAST_BIT}Akinobu Mita1-2/+0
By the previous style change, CONFIG_GENERIC_FIND_NEXT_BIT, CONFIG_GENERIC_FIND_BIT_LE, and CONFIG_GENERIC_FIND_LAST_BIT are not used to test for existence of find bitops anymore. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Greg Ungerer <gerg@uclinux.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Russell King <linux@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2011-05-27bitops: add #ifndef for each of find bitopsAkinobu Mita1-0/+2
The style that we normally use in asm-generic is to test the macro itself for existence, so in asm-generic, do: #ifndef find_next_zero_bit_le extern unsigned long find_next_zero_bit_le(const void *addr, unsigned long size, unsigned long offset); #endif and in the architectures, write static inline unsigned long find_next_zero_bit_le(const void *addr, unsigned long size, unsigned long offset) #define find_next_zero_bit_le find_next_zero_bit_le This adds the #ifndef for each of the find bitops in the generic header and source files. Suggested-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Russell King <rmk+kernel@arm.linux.org.uk> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Greg Ungerer <gerg@uclinux.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-11-15bitops: Provide generic sign_extend32 functionAndreas Herrmann1-0/+11
This patch moves code out from wireless drivers where two different functions are defined in three code locations for the same purpose and provides a common function to sign extend a 32-bit value. Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com> Signed-off-by: John W. Linville <linville@tuxdriver.com>
2010-10-09bitops: remove duplicated extern declarationsAkinobu Mita1-23/+0
If CONFIG_GENERIC_FIND_NEXT_BIT is enabled, find_next_bit() and find_next_zero_bit() are doubly declared in asm-generic/bitops/find.h and linux/bitops.h. asm/bitops.h includes asm-generic/bitops/find.h if and only if the architecture enables CONFIG_GENERIC_FIND_NEXT_BIT. And asm/bitops.h is included by linux/bitops.h So we can just remove the extern declarations of find_next_bit() and find_next_zero_bit() in linux/bitops.h. Also we can remove unneeded #ifndef CONFIG_GENERIC_FIND_NEXT_BIT in asm-generic/bitops/find.h. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2010-10-09bitops: make asm-generic/bitops/find.h more genericAkinobu Mita1-22/+0
asm-generic/bitops/find.h has the extern declarations of find_next_bit() and find_next_zero_bit() and the macro definitions of find_first_bit() and find_first_zero_bit(). It is only usable by the architectures which enables CONFIG_GENERIC_FIND_NEXT_BIT and disables CONFIG_GENERIC_FIND_FIRST_BIT. x86 and tile enable both CONFIG_GENERIC_FIND_NEXT_BIT and CONFIG_GENERIC_FIND_FIRST_BIT. These architectures cannot include asm-generic/bitops/find.h in their asm/bitops.h. So ifdefed extern declarations of find_first_bit and find_first_zero_bit() are put in linux/bitops.h. This makes asm-generic/bitops/find.h usable by these architectures and use it. Also this change is needed for the forthcoming duplicated extern declarations cleanup. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Chris Metcalf <cmetcalf@tilera.com>
2010-05-18Merge branch 'core-hweight-for-linus' of ↵Linus Torvalds1-25/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip * 'core-hweight-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: x86, hweight: Use a 32-bit popcnt for __arch_hweight32() arch, hweight: Fix compilation errors x86: Add optimized popcnt variants bitops: Optimize hweight() by making use of compile-time evaluation
2010-05-04arch, hweight: Fix compilation errorsBorislav Petkov1-0/+5
Fix function prototype visibility issues when compiling for non-x86 architectures. Tested with crosstool (ftp://ftp.kernel.org/pub/tools/crosstool/) with alpha, ia64 and sparc targets. Signed-off-by: Borislav Petkov <borislav.petkov@amd.com> LKML-Reference: <20100503130736.GD26107@aftab> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-04-07bitops: remove temporary for_each_bit()Andrew Morton1-3/+0
Migration has been completed so remove this now. There's one straggler in linux-next's drivers/mtd/sm_ftl.c. A patch has been sent. Cc: Akinobu Mita <akinobu.mita@gmail.com> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-04-07bitops: Optimize hweight() by making use of compile-time evaluationPeter Zijlstra1-25/+0
Rename the extisting runtime hweight() implementations to __arch_hweight(), rename the compile-time versions to __const_hweight() and then have hweight() pick between them. Suggested-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <20100318111929.GB11152@aftab> Acked-by: H. Peter Anvin <hpa@zytor.com> LKML-Reference: <1265028224.24455.154.camel@laptop> Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2010-03-06bitops: rename for_each_bit() to for_each_set_bit()Akinobu Mita1-1/+3
Rename for_each_bit to for_each_set_bit in the kernel source tree. To permit for_each_clear_bit(), should that ever be added. The patch includes a macro to map the old for_each_bit() onto the new for_each_set_bit(). This is a (very) temporary thing to ease the migration. [akpm@linux-foundation.org: add temporary for_each_bit()] Suggested-by: Alexey Dobriyan <adobriyan@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Russell King <rmk@arm.linux.org.uk> Cc: David Woodhouse <dwmw2@infradead.org> Cc: Artem Bityutskiy <dedekind@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-02-04bitops: Ensure the compile time HWEIGHT is only used for suchPeter Zijlstra1-11/+22
Avoid accidental misuse by failing to compile things Suggested-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Linus Torvalds <torvalds@linux-foundation.org> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-01-29bitops: Provide compile time HWEIGHT{8,16,32,64}Peter Zijlstra1-2/+16
Provide compile time versions of hweight. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Stephane Eranian <eranian@google.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> LKML-Reference: <20100122155535.797688466@chello.nl> [ Remove some whitespace damage while we are at it ] Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-23bitops: Add __ffs64 bitopSteven Whitehouse1-0/+19
Finds the first set bit in a 64 bit word. This is required in order to fix a bug in GFS2, but I think it should be a generic function in case of future users. Signed-off-by: Steven Whitehouse <swhiteho@redhat.com> Reviewed-by: Christoph Lameter <cl@linux.com> Reviewed-by: Willy Tarreau <w@1wt.eu>
2009-01-01bitmap: find_last_bit()Rusty Russell1-1/+12
Impact: New API As the name suggests. For the moment everyone uses the generic one. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2008-04-29bitops: remove "optimizations"Thomas Gleixner1-103/+12
The mapsize optimizations which were moved from x86 to the generic code in commit 64970b68d2b3ed32b964b0b30b1b98518fde388e increased the binary size on non x86 architectures. Looking into the real effects of the "optimizations" it turned out that they are not used in find_next_bit() and find_next_zero_bit(). The ones in find_first_bit() and find_first_zero_bit() are used in a couple of places but none of them is a real hot path. Remove the "optimizations" all together and call the library functions unconditionally. Boot-tested on x86 and compile tested on every cross compiler I have. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-29Avoid divides in BITS_TO_LONGSEric Dumazet1-1/+1
BITS_PER_LONG is a signed value (32 or 64) DIV_ROUND_UP(nr, BITS_PER_LONG) performs signed arithmetic if "nr" is signed too. Converting BITS_TO_LONGS(nr) to DIV_ROUND_UP(nr, BITS_PER_BYTE * sizeof(long)) makes sure compiler can perform a right shift, even if "nr" is a signed value, instead of an expensive integer divide. Applying this patch saves 141 bytes on x86 when CONFIG_CC_OPTIMIZE_FOR_SIZE=y and speedup bitmap operations. Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-04-26x86: optimize find_first_bit for small bitmapsAlexander van Heukelum1-0/+29
Avoid a call to find_first_bit if the bitmap size is know at compile time and small enough to fit in a single long integer. Modeled after an optimization in the original x86_64-specific code. Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-26x86: generic versions of find_first_(zero_)bit, convert i386Alexander van Heukelum1-0/+34
Generic versions of __find_first_bit and __find_first_zero_bit are introduced as simplified versions of __find_next_bit and __find_next_zero_bit. Their compilation and use are guarded by a new config variable GENERIC_FIND_FIRST_BIT. The generic versions of find_first_bit and find_first_zero_bit are implemented in terms of the newly introduced __find_first_bit and __find_first_zero_bit. This patch does not remove the i386-specific implementation, but it does switch i386 to use the generic functions by setting GENERIC_FIND_FIRST_BIT=y for X86_32. Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2008-04-26x86, generic: optimize find_next_(zero_)bit for small constant-size bitmapsAlexander van Heukelum1-0/+77
This moves an optimization for searching constant-sized small bitmaps form x86_64-specific to generic code. On an i386 defconfig (the x86#testing one), the size of vmlinux hardly changes with this applied. I have observed only four places where this optimization avoids a call into find_next_bit: In the functions return_unused_surplus_pages, alloc_fresh_huge_page, and adjust_pool_surplus, this patch avoids a call for a 1-bit bitmap. In __next_cpu a call is avoided for a 32-bit bitmap. That's it. On x86_64, 52 locations are optimized with a minimal increase in code size: Current #testing defconfig: 146 x bsf, 27 x find_next_*bit text data bss dec hex filename 5392637 846592 724424 6963653 6a41c5 vmlinux After removing the x86_64 specific optimization for find_next_*bit: 94 x bsf, 79 x find_next_*bit text data bss dec hex filename 5392358 846592 724424 6963374 6a40ae vmlinux After this patch (making the optimization generic): 146 x bsf, 27 x find_next_*bit text data bss dec hex filename 5392396 846592 724424 6963412 6a40d4 vmlinux [ tglx@linutronix.de: build fixes ] Signed-off-by: Ingo Molnar <mingo@elte.hu>