starfive-tech/linux.git - StarFive Tech Linux Kernel for VisionFive (JH7110) boards (mirror)

Age	Commit message (Collapse)	Author	Files	Lines
2014-06-01	drm/exynos: add support for apb mapped phys in hdmi driver	Rahul Sharma	2	-53/+96
	Previous SoCs have hdmi phys which are accessible through dedicated i2c lines. Newer SoCs have Apb mapped hdmi phys. Hdmi driver is modified to support apb mapped phys. Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: remove unnecessary read for phy configuration values	Rahul Sharma	1	-10/+0
	Cleaning up unnecessary i2c read call after hdmiphy configuration. This check is redundant since check for hdmiphy pll lock status confirms the correct settings for phy. Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Reviewed-by: Tomasz Figa <t.figa@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: replace hdmi reset with hdmi disable	Rahul Sharma	1	-24/+16
	Before setting the core and timing generation registers, hdmi driver resets the whole hdmi hardware, which also resets the audio related registers. Hdmi reset is replaced by hdmi disable which is called just before setting the core and timing registers. It also ensure that audio settings are not changed. Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com> Signed-off-by: Shirish S <s.shirish@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: Read hpd gpio in is_connected callback	Sean Paul	1	-0/+2
	This patch adds a gpio read of hpd during the is_connected callback. This fixes the case where hdmi is off going into suspend and the cable is plugged in while suspended. In this case, the hpd interrupt does not fire and is_connected will return false. Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: hdmi: remove unnecessary memset	Daniel Kurtz	1	-2/+0
	Our resources were just zalloc'ed as part of hdata. They are already 0. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: check for null pointers in error handling	Paul Taysom	1	-1/+2
	Smatch error from arm build: drivers/gpu/drm/exynos/ exynos_hdmi.c:2374 hdmi_probe() error: potential NULL dereference 'hdata->hdmiphy_port'. Added check for hdata->hdmiphy_port that it is not NULL. Signed-off-by: Paul Taysom <taysom@chromium.org> Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: Debounce HDMI hotplug interrupts	Sean Paul	1	-2/+21
	This patch debounces hotplug interrupts generated by the HDMI hotplug gpio. The reason this is needed is that we get multiple (5) interrupts every time a monitor is inserted which causes us to needlessly enable and disable the IP block. Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: Don't reset hdmiphy on hdmi off	Sean Paul	1	-5/+0
	This patch removes the hdmiphy reset in hdmi_poweroff. The hdmiphy reset was added to take advantage of exynos clockgating, doing it would gate the entire TV domain. Unfortunately, mixer is included in the TV domain and its vsync interrupts are stopped when TV is gated. Signed-off-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Rahul Sharma <Rahul.Sharma@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: Fix double locks at PM resume	Takashi Iwai	1	-1/+1
	The recent commit [3ea87855: drm/helper: lock all around force mode restore] introduced drm_modeset_lock_all() in drm_helper_resume_force_mode() itself, while exynos driver takes this lock before calling it. Move the function call outside the lock for avoiding a deadlock. Signed-off-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: dp: Use DPCD defines of drm_dp_helper.h	Jingoo Han	3	-104/+55
	Use DPCD defines of drm_dp_helper.h; thus, duplicated DPCD defines of exynos_dp_core.h can be removed. Also, DP_TEST_EDID_CHECKSUM define is added to drm_dp_helper.h. There is no functional change. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: update phy settings for RB resolutions	Shirish S	1	-21/+12
	This patch updates phy settings of the below mentioned pixel clocks in Exynos5250 and removes support for 88.75MHz, for it is not supported. 71 MHz - 1280x800@60Hz RB 73.25 MHz - 800x600@120Hz RB 115.5 MHz - 1024x768@120Hz RB 119 MHz - 1680x1050@60Hz RB Signed-off-by: Shirish S <s.shirish@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: dp: support hotplug detection via GPIO	Andrew Bresticker	4	-15/+66
	Certain bridge chips use a GPIO to indicate the cable status instead of the I_DP_HPD pin. This adds an optional device-tree property, "samsung,hpd-gpio", to the exynos-dp controller which indicates that the specified GPIO should be used for hotplug detection. The GPIO is then set up as an edge-triggered interrupt where the rising edge indicates hotplug-in and the falling edge indicates hotplug-out. Signed-off-by: Andrew Bresticker <abrestic@chromium.org> Signed-off-by: Rahul Sharma <rahul.sharma@samsung.com> Signed-off-by: Ajay Kumar <ajaykumar.rs@samsung.com> Acked-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: dsi: remove unnecessary pm interfaces	Inki Dae	1	-31/+0
	Exynos drm driver is a single driver so pm operation for kms drivers should be done by connector->dpms at top level driver. If kms driver has its own pm interfaces, single driver model would be broken so this patch removes unnecessary pm interfaces from dsi driver. Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2014-06-01	drm/exynos: remove unnecessary runtime pm interfaces	Inki Dae	1	-29/+0
	Exyno drm driver has no real hardware device, and runtime pm operation should be done by sub drivers. Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2014-06-01	drm/exynos: separate dpi from fimd	Andrzej Hajda	3	-95/+69
	The patch separates dpi related routines from fimd. Changelog v2: - Rename ctx->dpi to ctx->display Signed-off-by: Andrzej Hajda <a.hajda@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: fix comment to exynos_drm_device_subdrv_prove call	Inki Dae	1	-1/+1
	subdrv_probe callback of virtual display driver will be called by exynos_drm_device_subdrv_probe() to create crtc and encoder/connector for virtual display driver. So it fixes comments to exynos_drm_device_subdrv probe call. Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2014-06-01	drm/exynos: dpi: fix hotplug fail issue	Inki Dae	1	-4/+1
	When connector is created, if connector->polled is DRM_CONNECTOR_POLL_CONNECT then drm_kms_helper_hotplug_event function isn't called at drm_helper_hpd_irq_event because the function will be called only in case of DRM_CONNECTOR_POLL_HPD. So this patch sets always DRM_CONNECTOR_POLL_HPD flag to connector->polled of parallel panel driver at connector creation. Signed-off-by: Inki Dae <inki.dae@samsung.com> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
2014-06-01	drm/exynos: add component framework support	Inki Dae	12	-463/+641
	This patch adds component framework support to resolve the probe order issue. Until now, exynos drm had used codes specific to exynos drm to resolve that issue so with this patch, the specific codes are removed. Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: modify goto labels to meaningful names	Inki Dae	1	-28/+27
	Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: fimd: clear channel before enabling iommu	Akshu Agrawal	1	-20/+49
	If any fimd channel was already active, initializing iommu will result in a PAGE FAULT (e.e. u-boot could have turned on the display and not disabled it before the kernel starts). This patch checks if any channel is active before initializing iommu and disables it. Signed-off-by: Akshu Agrawal <akshu.a@samsung.com> Signed-off-by: Prathyush K <prathyush.k@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: rotator: add missing braces	Jingoo Han	1	-1/+2
	In the case of that only one branch of a conditional statement is a single statement, braces are added to both branches. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: dp: remove unnecessary OOM messages	Jingoo Han	1	-6/+2
	The site-specific OOM messages are unnecessary, because they duplicate the MM subsystem generic OOM message. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: hdmi: make local symbols static	Jingoo Han	1	-2/+2
	Make local symbols static, because these are used only in this file. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: fb: make local symbol static	Jingoo Han	1	-1/+1
	Make local symbole static, because this is used only in this file. Signed-off-by: Jingoo Han <jg1.han@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos: remove DRIVER_HAVE_IRQ feature	Joonyoung Shim	1	-2/+1
	Exynos drm driver cannot support DRIVER_HAVE_IRQ feature because it uses driver specific one instead of routine of drm framework to install/uninstall irq handler. Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos/fbdev: don't set mode_config.fb_base	Daniel Kurtz	1	-1/+0
	AFAICT, the fb_base of a drm_device's mode_config is never used. It isn't accessed by core drm, it isn't used by fbmem, and it isn't exposed to user space. Furthermore, it is probably supposed to be a physical address, not the dma address mapped to the display controller, so this is just wrong. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	drm/exynos/fbdev: don't set fix.smem/mmio_{start,len}	Daniel Kurtz	1	-7/+0
	Kernel access to the eyxnos fbdev framebuffer is via its gem object's kernel mapping (kvaddr, stored in info->screen_base). User space access is provided by mmap(), read() and write() of /dev/fb/fb0. These functions also only use screen_base/screen_size(). Therefore, it is not necessary to set fix->smem_{start,len} or fix->mmio_{start,len} fields. This avoids leaking kernel, physical and dma mapped addresses to user space via the ioctls FBIOGET_VSCREENINFO and FBIOGET_FSCREENINFO. Signed-off-by: Daniel Kurtz <djkurtz@chromium.org> Signed-off-by: Inki Dae <inki.dae@samsung.com>
2014-06-01	Merge tag 'batman-adv-fix-for-davem' of git://git.open-mesh.org/linux-merge	David S. Miller	1	-3/+3
	Included changes: - prevent NULL dereference in multicast code Antonion Quartulli says: ==================== pull request net: batman-adv 20140527 here you have another very small fix intended for net/linux-3.15. It prevents some multicast functions from dereferencing a NULL pointer. (Actually it was nothing more than a typo) I hope it is not too late for such a small patch. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-31	Merge branch 'core-urgent-for-linus' of ↵	Linus Torvalds	2	-17/+67
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core futex/rtmutex fixes from Thomas Gleixner: "Three fixlets for long standing issues in the futex/rtmutex code unearthed by Dave Jones syscall fuzzer: - Add missing early deadlock detection checks in the futex code - Prevent user space from attaching a futex to kernel threads - Make the deadlock detector of rtmutex work again Looks large, but is more comments than code change" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: rtmutex: Fix deadlock detector for real futex: Prevent attaching to kernel threads futex: Add another early deadlock detection check
2014-05-31	Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux	Linus Torvalds	12	-304/+298
	Pull drm fixes from Dave Airlie: "Mostly quiet now: i915: fixing userspace visiblie issues, all stable marked radeon: one more pll fix, two crashers, one suspend/resume regression" * 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: drm/radeon: Resume fbcon last drm/radeon: only allocate necessary size for vm bo list drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission drm/radeon: avoid crash if VM command submission isn't available drm/radeon: lower the ref * post PLL maximum once more drm/i915: Prevent negative relocation deltas from wrapping drm/i915: Only copy back the modified fields to userspace from execbuffer drm/i915: Fix dynamic allocation of physical handles
2014-05-31	dcache: add missing lockdep annotation	Linus Torvalds	1	-1/+1
	lock_parent() very much on purpose does nested locking of dentries, and is careful to maintain the right order (lock parent first). But because it didn't annotate the nested locking order, lockdep thought it might be a deadlock on d_lock, and complained. Add the proper annotation for the inner locking of the child dentry to make lockdep happy. Introduced by commit 046b961b45f9 ("shrink_dentry_list(): take parent's ->d_lock earlier"). Reported-and-tested-by: Josh Boyer <jwboyer@fedoraproject.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-31	batman-adv: fix NULL pointer dereferences	Marek Lindner	1	-3/+3
	Was introduced with 4c8755d69cbde2ec464a39c932aed0a83f9ff89f ("batman-adv: Send multicast packets to nodes with a WANT_ALL flag") Reported-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Marek Lindner <mareklindner@neomailbox.ch> Acked-by: Antonio Quartulli <antonio@meshcoding.com> Signed-off-by: Linus Lüssing <linus.luessing@web.de> Signed-off-by: Antonio Quartulli <antonio@meshcoding.com>
2014-05-31	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf	David S. Miller	1	-5/+10
	Pablo Neira Ayuso says: ==================== The following patchset contains a late fix for IPVS: * Fix crash when trying to remove the transport header with non-linear skbuffs, this was introduced in 3.6-rc. Patch from Peter Christensen via the IPVS folks. I'll pass this to -stable once this hits mainstream. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-31	Merge tag 'linux-can-fixes-for-3.15-20140528' of ↵	David S. Miller	1	-0/+3
	git://gitorious.org/linux-can/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2014-05-28 here's a pull request for v3.15, hope it's not too late. Oliver Hartkopp fixed a bug in the CAN led trigger device renaming code. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-31	net/mlx4_core: Reset RoCE VF gids when guest driver goes down	Jack Morgenstein	4	-14/+127
	Reset the GIDs assigned to a VF in the port RoCE GID table when that guest goes down (either crashes or goes down cleanly). As part of this fix, we refactor the RoCE gid table driver copy, moving it to the mlx4_port_info structure (together with the MAC and VLAN tables). As with the MAC and VLAN tables, we now use a mutex per port for the GID table so that modifying the driver copy and modifying the firmware copy of a port GID table becomes an atomic operation (thus avoiding driver-copy/FW-copy mismatches). Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-31	emac: aggregation of v1-2 PLB errors for IER register	Ivan Mikhaylov	2	-16/+9
	Aggreagation of version 1-2 because of version 1 can hit PLB errors too. If it's not set so we missing events for PLB bits and driver can't process those interrupts. Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-31	emac: add missing support of 10mbit in emac/rgmii	Ivan Mikhaylov	1	-0/+3
	In chips of emac/rgmii b'000' for 0/1 channel isn't suitable which resulted in non working network interface in this mode. Signed-off-by: Ivan Mikhaylov <ivan@ru.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2014-05-31	drm/radeon: Resume fbcon last	Daniel Vetter	1	-5/+6
	So a few people complained that commit 177cf92de4aa97ec1435987e91696ed8b5023130 Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Tue Apr 1 22:14:59 2014 +0200 drm/crtc-helpers: fix dpms on logic which was merged into 3.15-rc1, broke resume on radeons. Strangely git bisect lead everyone to commit 25f397a429dfa43f22c278d0119a60a343aa568f Author: Daniel Vetter <daniel.vetter@ffwll.ch> Date: Fri Jul 19 18:57:11 2013 +0200 drm/crtc-helper: explicit DPMS on after modeset which was merged long ago and actually part of 3.14. Digging deeper I've noticed (again) that the call to drm_helper_resume_force_mode in the radeon resume handlers was a no-op previously because everything gets shut down on suspend. radeon does this with explicit calls to drm_helper_connector_dpms with DPMS_OFF. But with 177c we now force the dpms state to ON, so suddenly resume_force_mode actually forced the crtcs back on. This is the intention of the change after all, the problem is that radeon resumes the fbdev console layer _before_ restoring the display, through calling fb_set_suspend. And fbcon does an immediate ->set_par, which in turn causes the same forced mode restore to happen. Two concurrent modeset operations didn't lead to happiness. Fix this by delaying the fbcon resume until the end of the readeon resum functions. v2: Fix up a bit of the spelling fail. References: https://lkml.org/lkml/2014/5/29/1043 References: https://lkml.org/lkml/2014/5/2/388 Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=74751 Tested-by: Ken Moffat <zarniwhoop@ntlworld.com> Cc: Alex Deucher <alexdeucher@gmail.com> Cc: Ken Moffat <zarniwhoop@ntlworld.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Dave Airlie <airlied@gmail.com>
2014-05-31	Merge branch 'drm-fixes-3.15' of ↵	Dave Airlie	3	-8/+21
	git://people.freedesktop.org/~deathsimple/linux into drm-fixes this is the next pull request for stashed up radeon fixes for 3.15. This is finally calming down with only four patches in this pull request. * 'drm-fixes-3.15' of git://people.freedesktop.org/~deathsimple/linux: drm/radeon: only allocate necessary size for vm bo list drm/radeon: don't allow RADEON_GEM_DOMAIN_CPU for command submission drm/radeon: avoid crash if VM command submission isn't available drm/radeon: lower the ref * post PLL maximum once more
2014-05-30	drm/msm: update for ARCH_MSM -> ARCH_QCOM	Rob Clark	2	-2/+2
	Architecture rename/split.. ARCH_QCOM is for the non-legacy platforms (ie. device-tree, multiplatform support, etc). Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-05-30	drm/msm/hdmi: use gpio and HPD polling	Rob Clark	1	-19/+33
	The hotplug detect and irq does not seem to be reliable on all devices for some reason. For now it is more reliable to use polling, and give preference to raw gpio status if it disagrees with the debounced hpd status. Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-05-30	drm/msm/mdp5: fix crash in error/unload paths	Rob Clark	1	-1/+4
	Signed-off-by: Rob Clark <robdclark@gmail.com>
2014-05-30	Merge branch 'for-linus' of ↵	Linus Torvalds	6	-121/+61
	git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input Pull input subsystem fixes from Dmitry Torokhov: "A couple of driver/build fixups and also redone quirk for Synaptics touchpads on Lenovo boxes (now using PNP IDs instead of DMI data to limit number of quirks)" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input: Input: synaptics - change min/max quirk table to pnp-id matching Input: synaptics - add a matches_pnp_id helper function Input: synaptics - T540p - unify with other LEN0034 models Input: synaptics - add min/max quirk for the ThinkPad W540 Input: ambakmi - request a shared interrupt for AMBA KMI devices Input: pxa27x-keypad - fix generating scancode Input: atmel-wm97xx - only build for AVR32 Input: fix ps2/serio module dependency
2014-05-30	Merge tag 'firewire-fixes' of ↵	Linus Torvalds	3	-8/+11
	git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394 Pull firewire fix from Stefan Richter: "A regression fix for the IEEE 1394 subsystem: re-enable IRQ-based asynchronous request reception at addresses below 128 TB" * tag 'firewire-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394: firewire: revert to 4 GB RDMA, fix protocols using Memory Space
2014-05-30	Merge tag 'dm-3.15-fixes-3' of ↵	Linus Torvalds	4	-10/+23
	git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device-mapper fixes from Mike Snitzer: "A dm-cache stable fix to split discards on cache block boundaries because dm-cache cannot yet handle discards that span cache blocks. Really fix a dm-mpath LOCKDEP warning that was introduced in -rc1. Add a 'no_space_timeout' control to dm-thinp to restore the ability to queue IO indefinitely when no data space is available. This fixes a change in behavior that was introduced in -rc6 where the timeout couldn't be disabled" * tag 'dm-3.15-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm mpath: really fix lockdep warning dm cache: always split discards on cache block boundaries dm thin: add 'no_space_timeout' dm-thin-pool module param
2014-05-30	x86_64: expand kernel stack to 16K	Minchan Kim	1	-1/+1
	While I play inhouse patches with much memory pressure on qemu-kvm, 3.14 kernel was randomly crashed. The reason was kernel stack overflow. When I investigated the problem, the callstack was a little bit deeper by involve with reclaim functions but not direct reclaim path. I tried to diet stack size of some functions related with alloc/reclaim so did a hundred of byte but overflow was't disappeard so that I encounter overflow by another deeper callstack on reclaim/allocator path. Of course, we might sweep every sites we have found for reducing stack usage but I'm not sure how long it saves the world(surely, lots of developer start to add nice features which will use stack agains) and if we consider another more complex feature in I/O layer and/or reclaim path, it might be better to increase stack size( meanwhile, stack usage on 64bit machine was doubled compared to 32bit while it have sticked to 8K. Hmm, it's not a fair to me and arm64 already expaned to 16K. ) So, my stupid idea is just let's expand stack size and keep an eye toward stack consumption on each kernel functions via stacktrace of ftrace. For example, we can have a bar like that each funcion shouldn't exceed 200K and emit the warning when some function consumes more in runtime. Of course, it could make false positive but at least, it could make a chance to think over it. I guess this topic was discussed several time so there might be strong reason not to increase kernel stack size on x86_64, for me not knowing so Ccing x86_64 maintainers, other MM guys and virtio maintainers. Here's an example call trace using up the kernel stack: Depth Size Location (51 entries) ----- ---- -------- 0) 7696 16 lookup_address 1) 7680 16 _lookup_address_cpa.isra.3 2) 7664 24 __change_page_attr_set_clr 3) 7640 392 kernel_map_pages 4) 7248 256 get_page_from_freelist 5) 6992 352 __alloc_pages_nodemask 6) 6640 8 alloc_pages_current 7) 6632 168 new_slab 8) 6464 8 __slab_alloc 9) 6456 80 __kmalloc 10) 6376 376 vring_add_indirect 11) 6000 144 virtqueue_add_sgs 12) 5856 288 __virtblk_add_req 13) 5568 96 virtio_queue_rq 14) 5472 128 __blk_mq_run_hw_queue 15) 5344 16 blk_mq_run_hw_queue 16) 5328 96 blk_mq_insert_requests 17) 5232 112 blk_mq_flush_plug_list 18) 5120 112 blk_flush_plug_list 19) 5008 64 io_schedule_timeout 20) 4944 128 mempool_alloc 21) 4816 96 bio_alloc_bioset 22) 4720 48 get_swap_bio 23) 4672 160 __swap_writepage 24) 4512 32 swap_writepage 25) 4480 320 shrink_page_list 26) 4160 208 shrink_inactive_list 27) 3952 304 shrink_lruvec 28) 3648 80 shrink_zone 29) 3568 128 do_try_to_free_pages 30) 3440 208 try_to_free_pages 31) 3232 352 __alloc_pages_nodemask 32) 2880 8 alloc_pages_current 33) 2872 200 __page_cache_alloc 34) 2672 80 find_or_create_page 35) 2592 80 ext4_mb_load_buddy 36) 2512 176 ext4_mb_regular_allocator 37) 2336 128 ext4_mb_new_blocks 38) 2208 256 ext4_ext_map_blocks 39) 1952 160 ext4_map_blocks 40) 1792 384 ext4_writepages 41) 1408 16 do_writepages 42) 1392 96 __writeback_single_inode 43) 1296 176 writeback_sb_inodes 44) 1120 80 __writeback_inodes_wb 45) 1040 160 wb_writeback 46) 880 208 bdi_writeback_workfn 47) 672 144 process_one_work 48) 528 112 worker_thread 49) 416 240 kthread 50) 176 176 ret_from_fork [ Note: the problem is exacerbated by certain gcc versions that seem to generate much bigger stack frames due to apparently bad coalescing of temporaries and generating too many spills. Rusty saw gcc-4.6.4 using 35% more stack on the virtio path than 4.8.2 does, for example. Minchan not only uses such a bad gcc version (4.6.3 in his case), but some of the stack use is due to debugging (CONFIG_DEBUG_PAGEALLOC is what causes that kernel_map_pages() frame, for example). But we're clearly getting too close. The VM code also seems to have excessive stack frames partly for the same compiler reason, triggered by excessive inlining and lots of function arguments. We need to improve on our stack use, but in the meantime let's do this simple stack increase too. Unlike most earlier reports, there is nothing simple that stands out as being really horribly wrong here, apart from the fact that the stack frames are just bigger than they should need to be. - Linus ] Signed-off-by: Minchan Kim <minchan@kernel.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Jones <davej@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Mel Gorman <mgorman@suse.de> Cc: Rik van Riel <riel@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Hugh Dickins <hughd@google.com> Cc: Rusty Russell <rusty@rustcorp.com.au> Cc: Michael S Tsirkin <mst@redhat.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: PJ Waskiewicz <pjwaskiewicz@gmail.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-05-30	Merge branch 'for-linus-2' of ↵	Linus Torvalds	1	-46/+107
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs dcache livelock fix from Al Viro: "Fixes for livelocks in shrink_dentry_list() introduced by fixes to shrink list corruption; the root cause was that trylock of parent's ->d_lock could be disrupted by d_walk() happening on other CPUs, resulting in shrink_dentry_list() making no progress and the same d_walk() being called again and again for as long as shrink_dentry_list() doesn't get past that mess. The solution is to have shrink_dentry_list() treat that trylock failure not as 'try to do the same thing again', but 'lock them in the right order'" * 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: dentry_kill() doesn't need the second argument now dealing with the rest of shrink_dentry_list() livelock shrink_dentry_list(): take parent's ->d_lock earlier expand dentry_kill(dentry, 0) in shrink_dentry_list() split dentry_kill() lift the "already marked killed" case into shrink_dentry_list()
2014-05-30	dentry_kill() doesn't need the second argument now	Al Viro	1	-7/+4
	it's 1 in the only remaining caller. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-30	dealing with the rest of shrink_dentry_list() livelock	Al Viro	1	-2/+20
	We have the same problem with ->d_lock order in the inner loop, where we are dropping references to ancestors. Same solution, basically - instead of using dentry_kill() we use lock_parent() (introduced in the previous commit) to get that lock in a safe way, recheck ->d_count (in case if lock_parent() has ended up dropping and retaking ->d_lock and somebody managed to grab a reference during that window), trylock the inode->i_lock and use __dentry_kill() to do the rest. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2014-05-30	shrink_dentry_list(): take parent's ->d_lock earlier	Al Viro	1	-12/+41
	The cause of livelocks there is that we are taking ->d_lock on dentry and its parent in the wrong order, forcing us to use trylock on the parent's one. d_walk() takes them in the right order, and unfortunately it's not hard to create a situation when shrink_dentry_list() can't make progress since trylock keeps failing, and shrink_dcache_parent() or check_submounts_and_drop() keeps calling d_walk() disrupting the very shrink_dentry_list() it's waiting for. Solution is straightforward - if that trylock fails, let's unlock the dentry itself and take locks in the right order. We need to stabilize ->d_parent without holding ->d_lock, but that's doable using RCU. And we'd better do that in the very beginning of the loop in shrink_dentry_list(), since the checks on refcount, etc. would need to be redone anyway. That deals with a half of the problem - killing dentries on the shrink list itself. Another one (dropping their parents) is in the next commit. locking parent is interesting - it would be easy to do rcu_read_lock(), lock whatever we think is a parent, lock dentry itself and check if the parent is still the right one. Except that we need to check that before locking the dentry, or we are risking taking ->d_lock out of order. Fortunately, once the D1 is locked, we can check if D2->d_parent is equal to D1 without the need to lock D2; D2->d_parent can start or stop pointing to D1 only under D1->d_lock, so taking D1->d_lock is enough. In other words, the right solution is rcu_read_lock/lock what looks like parent right now/check if it's still our parent/rcu_read_unlock/lock the child. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>