summaryrefslogtreecommitdiff
AgeCommit message (Collapse)AuthorFilesLines
2016-02-26Merge branch 'foreign/zhaolei/reada' into for-chris-4.6David Sterba3-137/+141
2016-02-26Merge branch 'foreign/qu/norecovery-v7' into for-chris-4.6David Sterba4-18/+67
2016-02-26Merge branch 'dev/rename-keys' into for-chris-4.6David Sterba3-12/+64
2016-02-26Merge branch 'dev/gfp-flags' into for-chris-4.6David Sterba10-56/+60
2016-02-26Merge branch 'chandan/prep-subpage-blocksize' into for-chris-4.6David Sterba7-165/+321
# Conflicts: # fs/btrfs/file.c
2016-02-18btrfs: reada: ignore creating reada_extent for a non-existent deviceZhao Lei1-9/+8
For a non-existent device, old code bypasses adding it in dev's reada queue. And to solve problem of unfinished waitting in raid5/6, commit 5fbc7c59fd22 ("Btrfs: fix unfinished readahead thread for raid5/6 degraded mounting") adding an exception for the first stripe, in short, the first stripe will always be processed whether the device exists or not. Actually we have a better way for the above request: just bypass creation of the reada_extent for non-existent device, it will make code simple and effective. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: avoid undone reada extents in btrfs_reada_waitZhao Lei1-1/+8
Reada background works is not designed to finish all jobs completely, it will break in following case: 1: When a device reaches workload limit (MAX_IN_FLIGHT) 2: Total reads reach max limit (10000) 3: All devices don't have queued more jobs, often happened in DUP case And if all background works exit with remaining jobs, btrfs_reada_wait() will wait indefinetelly. Above problem is rarely happened in old code, because: 1: Every work queues 2x new works So many works reduced chances of undone jobs. 2: One work will continue 10000 times loop in case of no-jobs It reduced no-thread window time. But after we fixed above case, the "undone reada extents" frequently happened. Fix: Check to ensure we have at least one thread if there are undone jobs in btrfs_reada_wait(). Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: limit max works countZhao Lei3-1/+12
Reada creates 2 works for each level of tree recursively. In case of a tree having many levels, the number of created works is 2^level_of_tree. Actually we don't need so many works in parallel, this patch limits max works to BTRFS_MAX_MIRRORS * 2. The per-fs works_counter will be also used for btrfs_reada_wait() to check is there are background workers. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: simplify dev->reada_in_flight processingZhao Lei1-18/+10
No need to decrease dev->reada_in_flight in __readahead_hook()'s internal and reada_extent_put(). reada_extent_put() have no chance to decrease dev->reada_in_flight in free operation, because reada_extent have additional refcnt when scheduled to a dev. We can put inc and dec operation for dev->reada_in_flight to one place instead to make logic simple and safe, and move useless reada_extent->scheduled_for to a bool flag instead. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: Fix a debug code typoZhao Lei1-8/+3
Remove one copy of loop to fix the typo of iterate zones. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: Jump into cleanup in direct way for __readahead_hook()Zhao Lei1-19/+21
Current code set nritems to 0 to make for_loop useless to bypass it, and set generation's value which is not necessary. Jump into cleanup directly is better choise. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: Use fs_info instead of root in __readahead_hook's argumentZhao Lei3-25/+24
What __readahead_hook() need exactly is fs_info, no need to convert fs_info to root in caller and convert back in __readahead_hook() Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: Pass reada_extent into __readahead_hook directlyZhao Lei1-21/+24
reada_start_machine_dev() already have reada_extent pointer, pass it into __readahead_hook() directly instead of search radix_tree will make code run faster. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: move reada_extent_put to place after __readahead_hook()Zhao Lei1-2/+2
We can't release reada_extent earlier than __readahead_hook(), because __readahead_hook() still need to use it, it is necessary to hode a refcnt to avoid it be freed. Actually it is not a problem after my patch named: Avoid many times of empty loop It make reada_extent in above line include at least one reada_extctl, which keeps additional one refcnt for reada_extent. But we still need this patch to make the code in pretty logic. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: Remove level argument in severial functionsZhao Lei1-9/+6
level is not used in severial functions, remove them from arguments, and remove relative code for get its value. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: bypass adding extent when all zone failedZhao Lei1-0/+5
When failed adding all dev_zones for a reada_extent, the extent will have no chance to be selected to run, and keep in memory for ever. We should bypass this extent to avoid above case. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: add all reachable mirrors into reada device listZhao Lei1-11/+9
If some device is not reachable, we should bypass and continus addingb next, instead of break on bad device. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-18btrfs: reada: Move is_need_to_readahead contition earlierZhao Lei1-11/+9
Move is_need_to_readahead contition earlier to avoid useless loop to get relative data for readahead. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-16btrfs: reada: Avoid many times of empty loopZhao Lei1-1/+1
We can see following loop(10000 times) in trace_log: [ 75.416137] ZL_DEBUG: reada_start_machine_dev:730: pid=771 comm=kworker/u2:3 re->ref_cnt ffff88003741e0c0 1 -> 2 [ 75.417413] ZL_DEBUG: reada_extent_put:524: pid=771 comm=kworker/u2:3 re = ffff88003741e0c0, refcnt = 2 -> 1 [ 75.418611] ZL_DEBUG: __readahead_hook:129: pid=771 comm=kworker/u2:3 re->ref_cnt ffff88003741e0c0 1 -> 2 [ 75.419793] ZL_DEBUG: reada_extent_put:524: pid=771 comm=kworker/u2:3 re = ffff88003741e0c0, refcnt = 2 -> 1 [ 75.421016] ZL_DEBUG: reada_start_machine_dev:730: pid=771 comm=kworker/u2:3 re->ref_cnt ffff88003741e0c0 1 -> 2 [ 75.422324] ZL_DEBUG: reada_extent_put:524: pid=771 comm=kworker/u2:3 re = ffff88003741e0c0, refcnt = 2 -> 1 [ 75.423661] ZL_DEBUG: __readahead_hook:129: pid=771 comm=kworker/u2:3 re->ref_cnt ffff88003741e0c0 1 -> 2 [ 75.424882] ZL_DEBUG: reada_extent_put:524: pid=771 comm=kworker/u2:3 re = ffff88003741e0c0, refcnt = 2 -> 1 ...(10000 times) [ 124.101672] ZL_DEBUG: reada_start_machine_dev:730: pid=771 comm=kworker/u2:3 re->ref_cnt ffff88003741e0c0 1 -> 2 [ 124.102850] ZL_DEBUG: reada_extent_put:524: pid=771 comm=kworker/u2:3 re = ffff88003741e0c0, refcnt = 2 -> 1 [ 124.104008] ZL_DEBUG: __readahead_hook:129: pid=771 comm=kworker/u2:3 re->ref_cnt ffff88003741e0c0 1 -> 2 [ 124.105121] ZL_DEBUG: reada_extent_put:524: pid=771 comm=kworker/u2:3 re = ffff88003741e0c0, refcnt = 2 -> 1 Reason: If more than one user trigger reada in same extent, the first task finished setting of reada data struct and call reada_start_machine() to start, and the second task only add a ref_count but have not add reada_extctl struct completely, the reada_extent can not finished all jobs, and will be selected in __reada_start_machine() for 10000 times(total times in __reada_start_machine()). Fix: For a reada_extent without job, we don't need to run it, just return 0 to let caller break. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-16btrfs: reada: Add missed segment checking in reada_find_zoneZhao Lei1-1/+3
In rechecking zone-in-tree, we still need to check zone include our logical address. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-16btrfs: reada: reduce additional fs_info->reada_lock in reada_find_zoneZhao Lei1-8/+4
We can avoid additional locking-acquirment and one pair of kref_get/put by combine two condition. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-16btrfs: reada: Fix in-segment calculation for readaZhao Lei1-2/+2
reada_zone->end is end pos of segment: end = start + cache->key.offset - 1; So we need to use "<=" in condition to judge is a pos in the segment. The problem happened rearly, because logical pos rarely pointed to last 4k of a blockgroup, but we need to fix it to make code right in logic. Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-12btrfs: Introduce new mount option alias for nologreplayQu Wenruo1-1/+3
Introduce new mount option alias "norecovery" for nologreplay, to keep "norecovery" behavior the same with other filesystems. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-12btrfs: Introduce new mount option to disable tree log replayQu Wenruo4-7/+40
Introduce a new mount option "nologreplay" to co-operate with "ro" mount option to get real readonly mount, like "norecovery" in ext* and xfs. Since the new parse_options() need to check new flags at remount time, so add a new parameter for parse_options(). Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> Reviewed-by: Chandan Rajendra <chandan@linux.vnet.ibm.com> Tested-by: Austin S. Hemmelgarn <ahferroin7@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-12btrfs: Introduce new mount option usebackuproot to replace recoveryQu Wenruo4-11/+25
Current "recovery" mount option will only try to use backup root. However the word "recovery" is too generic and may be confusing for some users. Here introduce a new and more specific mount option, "usebackuproot" to replace "recovery" mount option. "Recovery" will be kept for compatibility reason, but will be deprecated. Also, since "usebackuproot" will only affect mount behavior and after open_ctree() it has nothing to do with the filesystem, so clear the flag after mount succeeded. This provides the basis for later unified "norecovery" mount option. Signed-off-by: Qu Wenruo <quwenruo@cn.fujitsu.com> [ dropped usebackuproot from show_mount, added note about 'recovery' to docs ] Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: teach print_leaf about temporary item subtypesDavid Sterba1-0/+11
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: teach print_leaf about permanent item subtypesDavid Sterba1-2/+10
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: switch dev stats item to the permanent item keyDavid Sterba2-5/+8
Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: introduce key type for persistent permanent itemsDavid Sterba1-3/+17
The number of distinct key types is not that big that we could waste one for something new we want to store in the tree. Similar to the temporary items, we'll introduce a new name for an existing key value and use the objectid for further extension. The victim is the BTRFS_DEV_STATS_KEY (248). The device stats are an example of a permanent item. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: switch balance item to the temporary item keyDavid Sterba1-3/+3
No visible change. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: introduce key type for persistent temporary itemsDavid Sterba1-0/+16
The number of distinct key types is not that big that we could waste one for something new we want to store in the tree. We'll introduce a new name for an existing key value and use the objectid for further extension. The victim is the BTRFS_BALANCE_ITEM_KEY (248). The nature of the balance status item is a good example of the temporary item. It exists from beginning of the balance, keeps the status until it finishes. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: switch to kcalloc in btrfs_cmp_data_prepareDavid Sterba1-2/+2
Kcalloc is functionally equivalent and does overflow checks. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: extent same: use GFP_KERNEL for page array allocationsDavid Sterba1-2/+2
We can safely use GFP_KERNEL in the functions called from the ioctl handlers. Here we can allocate up to 32k so less pressure to the allocator could help. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: device add and remove: use GFP_KERNELDavid Sterba1-4/+5
We can safely use GFP_KERNEL in the functions called from the ioctl handlers. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: readdir: use GFP_KERNELDavid Sterba1-1/+1
Readdir is initiated from userspace and is not on the critical writeback path, we don't need to use GFP_NOFS for allocations. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: fallocate: use GFP_KERNELDavid Sterba1-3/+3
Fallocate is initiated from userspace and is not on the critical writeback path, we don't need to use GFP_NOFS for allocations. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: let callers of btrfs_alloc_root pass gfp flagsDavid Sterba1-10/+11
We don't need to use GFP_NOFS in all contexts, eg. during mount or for dummy root tree, but we might for the the log tree creation. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: scrub: use GFP_KERNEL on the submission pathDavid Sterba2-12/+14
Scrub is not on the critical writeback path we don't need to use GFP_NOFS for all allocations. The failures are handled and stats passed back to userspace. Let's use GFP_KERNEL on the paths where everything is ok, ie. setup the global structures and the IO submission paths. Functions that do the repair and fixups still use GFP_NOFS as we might want to skip any other filesystem activity if we encounter an error. This could turn out to be unnecessary, but requires more review compared to the easy cases in this patch. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: reada: use GFP_KERNEL everywhereDavid Sterba1-5/+5
The readahead framework is not on the critical writeback path we don't need to use GFP_NOFS for allocations. All error paths are handled and the readahead failures are not fatal. The actual users (scrub, dev-replace) will trigger reads if the blocks are not found in cache. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-11btrfs: send: use GFP_KERNEL everywhereDavid Sterba2-19/+19
The send operation is not on the critical writeback path we don't need to use GFP_NOFS for allocations. All error paths are handled and the whole operation is restartable. Signed-off-by: David Sterba <dsterba@suse.com>
2016-02-08Linux 4.5-rc3v4.5-rc3Linus Torvalds1-1/+1
2016-02-08Merge tag 'armsoc-fixes' of ↵Linus Torvalds33-149/+230
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "The first real batch of fixes for this release cycle, so there are a few more than usual. Most of these are fixes and tweaks to board support (DT bugfixes, etc). I've also picked up a couple of small cleanups that seemed innocent enough that there was little reason to wait (const/ __initconst and Kconfig deps). Quite a bit of the changes on OMAP were due to fixes to no longer write to rodata from assembly when ARM_KERNMEM_PERMS was enabled, but there were also other fixes. Kirkwood had a bunch of gpio fixes for some boards. OMAP had RTC fixes on OMAP5, and Nomadik had changes to MMC parameters in DT. All in all, mostly the usual mix of various fixes" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (46 commits) ARM: multi_v7_defconfig: enable DW_WATCHDOG ARM: nomadik: fix up SD/MMC DT settings ARM64: tegra: Add chosen node for tegra132 norrin ARM: realview: use "depends on" instead of "if" after prompt ARM: tango: use "depends on" instead of "if" after prompt ARM: tango: use const and __initconst for smp_operations ARM: realview: use const and __initconst for smp_operations bus: uniphier-system-bus: revive tristate prompt arm64: dts: Add missing DMA Abort interrupt to Juno bus: vexpress-config: Add missing of_node_put ARM: dts: am57xx: sbc-am57x: correct Eth PHY settings ARM: dts: am57xx: cl-som-am57x: fix CPSW EMAC pinmux ARM: dts: am57xx: sbc-am57x: fix UART3 pinmux ARM: dts: am57xx: cl-som-am57x: update SPI Flash frequency ARM: dts: am57xx: cl-som-am57x: set HOST mode for USB2 ARM: dts: am57xx: sbc-am57x: fix SB-SOM EEPROM I2C address ARM: dts: LogicPD Torpedo: Revert Duplicative Entries ARM: dts: am437x: pixcir_tangoc: use correct flags for irq types ARM: dts: am4372: fix irq type for arm twd and global timer ARM: dts: at91: sama5d4 xplained: fix phy0 IRQ type ...
2016-02-08Merge branch 'mailbox-devel' of ↵Linus Torvalds2-7/+2
git://git.linaro.org/landing-teams/working/fujitsu/integration Pull mailbox fixes from Jassi Brar: - fix getting element from the pcc-channels array by simply indexing into it - prevent building mailbox-test driver for archs that don't have IOMEM * 'mailbox-devel' of git://git.linaro.org/landing-teams/working/fujitsu/integration: mailbox: Fix dependencies for !HAS_IOMEM archs mailbox: pcc: fix channel calculation in get_pcc_channel()
2016-02-07Merge tag 'usb-4.5-rc3' of ↵Linus Torvalds15-65/+131
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb Pull USB fixes from Greg KH: "Here are some USB fixes for 4.5-rc3. The usual, xhci fixes for reported issues, combined with some small gadget driver fixes, and a MAINTAINERS file update. All have been in linux-next with no reported issues" * tag 'usb-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: xhci: harden xhci_find_next_ext_cap against device removal xhci: Fix list corruption in urb dequeue at host removal usb: host: xhci-plat: fix NULL pointer in probe for device tree case usb: xhci-mtk: fix AHB bus hang up caused by roothubs polling usb: xhci-mtk: fix bpkts value of LS/HS periodic eps not behind TT usb: xhci: apply XHCI_PME_STUCK_QUIRK to Intel Broxton-M platforms usb: xhci: set SSIC port unused only if xhci_suspend succeeds usb: xhci: add a quirk bit for ssic port unused usb: xhci: handle both SSIC ports in PME stuck quirk usb: dwc3: gadget: set the OTG flag in dwc3 gadget driver. Revert "xhci: don't finish a TD if we get a short-transfer event mid TD" MAINTAINERS: fix my email address usb: dwc2: Fix probe problem on bcm2835 Revert "usb: dwc2: Move reset into dwc2_get_hwparams()" usb: musb: ux500: Fix NULL pointer dereference at system PM usb: phy: mxs: declare variable with initialized value usb: phy: msm: fix error handling in probe.
2016-02-07Merge tag 'staging-4.5-rc3' of ↵Linus Torvalds14-14/+32
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging Pull staging and IIO driver fixes from Greg KH: "Here are some IIO and staging driver fixes for 4.5-rc3. All of them, except one, are for IIO drivers, and one is for a speakup driver fix caused by some earlier patches, to resolve a reported build failure" * tag 'staging-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: Staging: speakup: Fix allyesconfig build on mn10300 iio: dht11: Use boottime iio: ade7753: avoid uninitialized data iio: pressure: mpl115: fix temperature offset sign iio: imu: Fix dependencies for !HAS_IOMEM archs staging: iio: Fix dependencies for !HAS_IOMEM archs iio: adc: Fix dependencies for !HAS_IOMEM archs iio: inkern: fix a NULL dereference on error iio:adc:ti_am335x_adc Fix buffered mode by identifying as software buffer. iio: light: acpi-als: Report data as processed iio: dac: mcp4725: set iio name property in sysfs iio: add HAS_IOMEM dependency to VF610_ADC iio: add IIO_TRIGGER dependency to STK8BA50 iio: proximity: lidar: correct return value iio-light: Use a signed return type for ltr501_match_samp_freq()
2016-02-06Merge branch 'akpm' (patches from Andrew)Linus Torvalds23-126/+166
Merge fixes from Andrew Morton: "22 fixes" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (22 commits) epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUT radix-tree: fix oops after radix_tree_iter_retry MAINTAINERS: trim the file triggers for ABI/API dax: dirty inode only if required thp: make deferred_split_scan() work again mm: replace vma_lock_anon_vma with anon_vma_lock_read/write ocfs2/dlm: clear refmap bit of recovery lock while doing local recovery cleanup um: asm/page.h: remove the pte_high member from struct pte_t mm, hugetlb: don't require CMA for runtime gigantic pages mm/hugetlb: fix gigantic page initialization/allocation mm: downgrade VM_BUG in isolate_lru_page() to warning mempolicy: do not try to queue pages from !vma_migratable() mm, vmstat: fix wrong WQ sleep when memory reclaim doesn't make any progress vmstat: make vmstat_update deferrable mm, vmstat: make quiet_vmstat lighter mm/Kconfig: correct description of DEFERRED_STRUCT_PAGE_INIT memblock: don't mark memblock_phys_mem_size() as __init dump_stack: avoid potential deadlocks mm: validate_mm browse_rb SMP race condition m32r: fix build failure due to SMP and MMU ...
2016-02-06Merge branch 'for-linus' of ↵Linus Torvalds6-17/+75
git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client Pull Ceph fixes from Sage Weil: "We have a few wire protocol compatibility fixes, ports of a few recent CRUSH mapping changes, and a couple error path fixes" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client: libceph: MOSDOpReply v7 encoding libceph: advertise support for TUNABLES5 crush: decode and initialize chooseleaf_stable crush: add chooseleaf_stable tunable crush: ensure take bucket value is valid crush: ensure bucket id is valid before indexing buckets array ceph: fix snap context leak in error path ceph: checking for IS_ERR instead of NULL
2016-02-06Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linuxLinus Torvalds28-252/+461
Pull drm fixes from Dave Airlie: "Fixes all over the place: - amdkfd: two static checker fixes - mst: a bunch of static checker and spec/hw interaction fixes - amdgpu: fix Iceland hw properly, and some fiji bugs, along with some write-combining fixes. - exynos: some regression fixes - adv7511: fix some EDID reading issues" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (38 commits) drm/dp/mst: deallocate payload on port destruction drm/dp/mst: Reverse order of MST enable and clearing VC payload table. drm/dp/mst: move GUID storage from mgr, port to only mst branch drm/dp/mst: change MST detection scheme drm/dp/mst: Calculate MST PBN with 31.32 fixed point drm: Add drm_fixp_from_fraction and drm_fixp2int_ceil drm/mst: Add range check for max_payloads during init drm/mst: Don't ignore the MST PBN self-test result drm: fix missing reference counting decrease drm/amdgpu: disable uvd and vce clockgating on Fiji drm/amdgpu: remove exp hardware support from iceland drm/amdgpu: load MEC ucode manually on iceland drm/amdgpu: don't load MEC2 on topaz drm/amdgpu: drop topaz support from gmc8 module drm/amdgpu: pull topaz gmc bits into gmc_v7 drm/amdgpu: The VI specific EXE bit should only apply to GMC v8.0 above drm/amdgpu: iceland use CI based MC IP drm/amdgpu: move gmc7 support out of CIK dependency drm/amdgpu/gfx7: enable cp inst/reg error interrupts drm/amdgpu/gfx8: enable cp inst/reg error interrupts ...
2016-02-06Merge tag 'pm+acpi-4.5-rc3' of ↵Linus Torvalds5-23/+10
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management and ACPI fixes from Rafael Wysocki: "These are: a fix for a recently introduced false-positive warnings about PM domain pointers being changed inappropriately (harmless but annoying), an MCH size workaround quirk for one more platform, a compiler warning fix (generic power domains framework), an ACPI LPSS (Intel SoCs) driver fixup and a cleanup of the ACPI CPPC core code. Specifics: - PM core fix to avoid false-positive warnings generated when the pm_domain field is cleared for a device that appears to be bound to a driver (Rafael Wysocki). - New MCH size workaround quirk for Intel Haswell-ULT (Josh Boyer). - Fix for an "unused function" compiler warning in the generic power domains framework (Ulf Hansson). - Fixup for the ACPI driver for Intel SoCs (acpi-lpss) to set the PM domain pointer of a device properly in one place that was overlooked by a recent PM core update (Andy Shevchenko). - Removal of a redundant function declaration in the ACPI CPPC core code (Timur Tabi)" * tag 'pm+acpi-4.5-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: PM: Avoid false-positive warnings in dev_pm_domain_set() PM / Domains: Silence compiler warning for an unused function ACPI / CPPC: remove redundant mbox_send_message() declaration ACPI / LPSS: set PM domain via helper setter PNP: Add Haswell-ULT to Intel MCH size workaround
2016-02-06epoll: restrict EPOLLEXCLUSIVE to POLLIN and POLLOUTJason Baron1-6/+32
In the current implementation of the EPOLLEXCLUSIVE flag (added for 4.5-rc1), if epoll waiters create different POLL* sets and register them as exclusive against the same target fd, the current implementation will stop waking any further waiters once it finds the first idle waiter. This means that waiters could miss wakeups in certain cases. For example, when we wake up a pipe for reading we do: wake_up_interruptible_sync_poll(&pipe->wait, POLLIN | POLLRDNORM); So if one epoll set or epfd is added to pipe p with POLLIN and a second set epfd2 is added to pipe p with POLLRDNORM, only epfd may receive the wakeup since the current implementation will stop after it finds any intersection of events with a waiter that is blocked in epoll_wait(). We could potentially address this by requiring all epoll waiters that are added to p be required to pass the same set of POLL* events. IE the first EPOLL_CTL_ADD that passes EPOLLEXCLUSIVE establishes the set POLL* flags to be used by any other epfds that are added as EPOLLEXCLUSIVE. However, I think it might be somewhat confusing interface as we would have to reference count the number of users for that set, and so userspace would have to keep track of that count, or we would need a more involved interface. It also adds some shared state that we'd have store somewhere. I don't think anybody will want to bloat __wait_queue_head for this. I think what we could do instead, is to simply restrict EPOLLEXCLUSIVE such that it can only be specified with EPOLLIN and/or EPOLLOUT. So that way if the wakeup includes 'POLLIN' and not 'POLLOUT', we can stop once we hit the first idle waiter that specifies the EPOLLIN bit, since any remaining waiters that only have 'POLLOUT' set wouldn't need to be woken. Likewise, we can do the same thing if 'POLLOUT' is in the wakeup bit set and not 'POLLIN'. If both 'POLLOUT' and 'POLLIN' are set in the wake bit set (there is at least one example of this I saw in fs/pipe.c), then we just wake the entire exclusive list. Having both 'POLLOUT' and 'POLLIN' both set should not be on any performance critical path, so I think that's ok (in fs/pipe.c its in pipe_release()). We also continue to include EPOLLERR and EPOLLHUP by default in any exclusive set. Thus, the user can specify EPOLLERR and/or EPOLLHUP but is not required to do so. Since epoll waiters may be interested in other events as well besides EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP, these can still be added by doing a 'dup' call on the target fd and adding that as one normally would with EPOLL_CTL_ADD. Since I think that the POLLIN and POLLOUT events are what we are interest in balancing, I think that the 'dup' thing could perhaps be added to only one of the waiter threads. However, I think that EPOLLIN, EPOLLOUT, EPOLLERR and EPOLLHUP should be sufficient for the majority of use-cases. Since EPOLLEXCLUSIVE is intended to be used with a target fd shared among multiple epfds, where between 1 and n of the epfds may receive an event, it does not satisfy the semantics of EPOLLONESHOT where only 1 epfd would get an event. Thus, it is not allowed to be specified in conjunction with EPOLLEXCLUSIVE. EPOLL_CTL_MOD is also not allowed if the fd was previously added as EPOLLEXCLUSIVE. It seems with the limited number of flags to not be as interesting, but this could be relaxed at some further point. Signed-off-by: Jason Baron <jbaron@akamai.com> Tested-by: Madars Vitolins <m@silodev.com> Cc: Michael Kerrisk <mtk.manpages@gmail.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Al Viro <viro@ftp.linux.org.uk> Cc: Eric Wong <normalperson@yhbt.net> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Andy Lutomirski <luto@amacapital.net> Cc: Hagen Paul Pfeifer <hagen@jauu.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>