<feed xmlns='http://www.w3.org/2005/Atom'>
<title>kernel/linux.git/include/linux/writeback.h, branch v2.6.30</title>
<subtitle>Linux kernel stable tree (mirror)</subtitle>
<id>https://git.radix-linux.su/kernel/linux.git/atom?h=v2.6.30</id>
<link rel='self' href='https://git.radix-linux.su/kernel/linux.git/atom?h=v2.6.30'/>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/'/>
<updated>2009-05-15T09:32:24+00:00</updated>
<entry>
<title>Revert "mm: add /proc controls for pdflush threads"</title>
<updated>2009-05-15T09:32:24+00:00</updated>
<author>
<name>Jens Axboe</name>
<email>jens.axboe@oracle.com</email>
</author>
<published>2009-05-15T09:32:24+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=cd17cbfda004fe5f406c01b318c6378d9895896f'/>
<id>urn:sha1:cd17cbfda004fe5f406c01b318c6378d9895896f</id>
<content type='text'>
This reverts commit fafd688e4c0c34da0f3de909881117d374e4c7af.

Work is progressing to switch away from pdflush as the process backing
for flushing out dirty data. So it seems pointless to add more knobs
to control pdflush threads. The original author of the patch did not
have any specific use cases for adding the knobs, so we can easily
revert this before 2.6.30 to avoid having to maintain this API
forever.

Signed-off-by: Jens Axboe &lt;jens.axboe@oracle.com&gt;
</content>
</entry>
<entry>
<title>mm: add /proc controls for pdflush threads</title>
<updated>2009-04-07T15:31:03+00:00</updated>
<author>
<name>Peter W Morreale</name>
<email>pmorreale@novell.com</email>
</author>
<published>2009-04-07T02:00:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=fafd688e4c0c34da0f3de909881117d374e4c7af'/>
<id>urn:sha1:fafd688e4c0c34da0f3de909881117d374e4c7af</id>
<content type='text'>
Add /proc entries to give the admin the ability to control the minimum and
maximum number of pdflush threads.  This allows finer control of pdflush
on both large and small machines.

The rationale is simply one size does not fit all.  Admins on large and/or
small systems may want to tune the min/max pdflush thread count to best
suit their needs.  Right now the min/max is hardcoded to 2/8.  While
probably a fair estimate for smaller machines, large machines with large
numbers of CPUs and large numbers of filesystems/block devices may benefit
from larger numbers of threads working on different block devices.

Even if the background flushing algorithm is radically changed, it is
still likely that multiple threads will be involved and admins would still
desire finer control on the min/max other than to have to recompile the
kernel.

The patch adds '/proc/sys/vm/nr_pdflush_threads_min' and
'/proc/sys/vm/nr_pdflush_threads_max' with r/w permissions.

The minimum value for nr_pdflush_threads_min is 1 and the maximum value is
the current value of nr_pdflush_threads_max.  This minimum is required
since additional thread creation is performed in a pdflush thread itself.

The minimum value for nr_pdflush_threads_max is the current value of
nr_pdflush_threads_min and the maximum value can be 1000.

Documentation/sysctl/vm.txt is also updated.

[akpm@linux-foundation.org: fix comment, fix whitespace, use __read_mostly]
Signed-off-by: Peter W Morreale &lt;pmorreale@novell.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix proc_dointvec_userhz_jiffies "breakage"</title>
<updated>2009-04-01T15:59:13+00:00</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2009-03-31T22:23:18+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=704503d836042d4a4c7685b7036e7de0418fbc0f'/>
<id>urn:sha1:704503d836042d4a4c7685b7036e7de0418fbc0f</id>
<content type='text'>
Addresses http://bugzilla.kernel.org/show_bug.cgi?id=9838

On i386, HZ=1000, jiffies_to_clock_t() converts time in a somewhat strange
way from the user's point of view:

	# echo 500 &gt;/proc/sys/vm/dirty_writeback_centisecs
	# cat /proc/sys/vm/dirty_writeback_centisecs
	499

So, we have 5000 jiffies converted to only 499 clock ticks and reported
back.

TICK_NSEC = 999848
ACTHZ = 256039

Keeping in-kernel variable in units passed from userspace will fix issue
of course, but this probably won't be right for every sysctl.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs: remove WB_SYNC_HOLD</title>
<updated>2009-01-06T23:59:09+00:00</updated>
<author>
<name>Nick Piggin</name>
<email>npiggin@suse.de</email>
</author>
<published>2009-01-06T22:40:25+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=4f5a99d64c17470a784a6c68064207d82e3e74a5'/>
<id>urn:sha1:4f5a99d64c17470a784a6c68064207d82e3e74a5</id>
<content type='text'>
Remove WB_SYNC_HOLD.  The primary motiviation is the design of my
anti-starvation code for fsync.  It requires taking an inode lock over the
sync operation, so we could run into lock ordering problems with multiple
inodes.  It is possible to take a single global lock to solve the ordering
problem, but then that would prevent a future nice implementation of "sync
multiple inodes" based on lock order via inode address.

Seems like a backward step to remove this, but actually it is busted
anyway: we can't use the inode lists for data integrity wait: an inode can
be taken off the dirty lists but still be under writeback.  In order to
satisfy data integrity semantics, we should wait for it to finish
writeback, but if we only search the dirty lists, we'll miss it.

It would be possible to have a "writeback" list, for sys_sync, I suppose.
But why complicate things by prematurely optimise?  For unmounting, we
could avoid the "livelock avoidance" code, which would be easier, but
again premature IMO.

Fixing the existing data integrity problem will come next.

Signed-off-by: Nick Piggin &lt;npiggin@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: add dirty_background_bytes and dirty_bytes sysctls</title>
<updated>2009-01-06T23:59:03+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-01-06T22:39:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=2da02997e08d3efe8174c7a47696e6f7cbe69ba9'/>
<id>urn:sha1:2da02997e08d3efe8174c7a47696e6f7cbe69ba9</id>
<content type='text'>
This change introduces two new sysctls to /proc/sys/vm:
dirty_background_bytes and dirty_bytes.

dirty_background_bytes is the counterpart to dirty_background_ratio and
dirty_bytes is the counterpart to dirty_ratio.

With growing memory capacities of individual machines, it's no longer
sufficient to specify dirty thresholds as a percentage of the amount of
dirtyable memory over the entire system.

dirty_background_bytes and dirty_bytes specify quantities of memory, in
bytes, that represent the dirty limits for the entire system.  If either
of these values is set, its value represents the amount of dirty memory
that is needed to commence either background or direct writeback.

When a `bytes' or `ratio' file is written, its counterpart becomes a
function of the written value.  For example, if dirty_bytes is written to
be 8096, 8K of memory is required to commence direct writeback.
dirty_ratio is then functionally equivalent to 8K / the amount of
dirtyable memory:

	dirtyable_memory = free pages + mapped pages + file cache

	dirty_background_bytes = dirty_background_ratio * dirtyable_memory
		-or-
	dirty_background_ratio = dirty_background_bytes / dirtyable_memory

		AND

	dirty_bytes = dirty_ratio * dirtyable_memory
		-or-
	dirty_ratio = dirty_bytes / dirtyable_memory

Only one of dirty_background_bytes and dirty_background_ratio may be
specified at a time, and only one of dirty_bytes and dirty_ratio may be
specified.  When one sysctl is written, the other appears as 0 when read.

The `bytes' files operate on a page size granularity since dirty limits
are compared with ZVC values, which are in page units.

Prior to this change, the minimum dirty_ratio was 5 as implemented by
get_dirty_limits() although /proc/sys/vm/dirty_ratio would show any user
written value between 0 and 100.  This restriction is maintained, but
dirty_bytes has a lower limit of only one page.

Also prior to this change, the dirty_background_ratio could not equal or
exceed dirty_ratio.  This restriction is maintained in addition to
restricting dirty_background_bytes.  If either background threshold equals
or exceeds that of the dirty threshold, it is implicitly set to half the
dirty threshold.

Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Righi &lt;righi.andrea@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: change dirty limit type specifiers to unsigned long</title>
<updated>2009-01-06T23:59:02+00:00</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-01-06T22:39:29+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=364aeb2849789b51bf4b9af2ddd02fee7285c54e'/>
<id>urn:sha1:364aeb2849789b51bf4b9af2ddd02fee7285c54e</id>
<content type='text'>
The background dirty and dirty limits are better defined with type
specifiers of unsigned long since negative writeback thresholds are not
possible.

These values, as returned by get_dirty_limits(), are normally compared
with ZVC values to determine whether writeback shall commence or be
throttled.  Such page counts cannot be negative, so declaring the page
limits as signed is unnecessary.

Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Righi &lt;righi.andrea@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vfs: Add no_nrwrite_index_update writeback control flag</title>
<updated>2008-10-16T14:09:17+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2008-10-16T14:09:17+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=17bc6c30cf6bfffd816bdc53682dd46fc34a2cf4'/>
<id>urn:sha1:17bc6c30cf6bfffd816bdc53682dd46fc34a2cf4</id>
<content type='text'>
If no_nrwrite_index_update is set we don't update nr_to_write and
address space writeback_index in write_cache_pages.  This change
enables a file system to skip these updates in write_cache_pages and do
them in the writepages() callback.  This patch will be followed by an
ext4 patch that make use of these new flags.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
CC: linux-fsdevel@vger.kernel.org
</content>
</entry>
<entry>
<title>vfs: Remove the range_cont writeback mode.</title>
<updated>2008-10-14T13:21:02+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2008-10-14T13:21:02+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=74baaaaec8b4f22e1ae279f5ecca4ff705b28912'/>
<id>urn:sha1:74baaaaec8b4f22e1ae279f5ecca4ff705b28912</id>
<content type='text'>
Ext4 was the only user of range_cont writeback mode and ext4 switched
to a different method. So remove the range_cont mode which is not used
in the kernel.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
CC: linux-fsdevel@vger.kernel.org
</content>
</entry>
<entry>
<title>Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4</title>
<updated>2008-07-15T15:36:38+00:00</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2008-07-15T15:36:38+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=8d2567a620ae8c24968a2bdc1c906c724fac1f6a'/>
<id>urn:sha1:8d2567a620ae8c24968a2bdc1c906c724fac1f6a</id>
<content type='text'>
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (61 commits)
  ext4: Documention update for new ordered mode and delayed allocation
  ext4: do not set extents feature from the kernel
  ext4: Don't allow nonextenst mount option for large filesystem
  ext4: Enable delalloc by default.
  ext4: delayed allocation i_blocks fix for stat
  ext4: fix delalloc i_disksize early update issue
  ext4: Handle page without buffers in ext4_*_writepage()
  ext4: Add ordered mode support for delalloc
  ext4: Invert lock ordering of page_lock and transaction start in delalloc
  mm: Add range_cont mode for writeback
  ext4: delayed allocation ENOSPC handling
  percpu_counter: new function percpu_counter_sum_and_set
  ext4: Add delayed allocation support in data=writeback mode
  vfs: add hooks for ext4's delayed allocation support
  jbd2: Remove data=ordered mode support using jbd buffer heads
  ext4: Use new framework for data=ordered mode in JBD2
  jbd2: Implement data=ordered mode handling via inodes
  vfs: export filemap_fdatawrite_range()
  ext4: Fix lock inversion in ext4_ext_truncate()
  ext4: Invert the locking order of page_lock and transaction start
  ...
</content>
</entry>
<entry>
<title>mm: Add range_cont mode for writeback</title>
<updated>2008-07-11T23:27:31+00:00</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2008-07-11T23:27:31+00:00</published>
<link rel='alternate' type='text/html' href='https://git.radix-linux.su/kernel/linux.git/commit/?id=06d6cf6959d22037fcec598f4f954db5db3d7356'/>
<id>urn:sha1:06d6cf6959d22037fcec598f4f954db5db3d7356</id>
<content type='text'>
Filesystems like ext4 needs to start a new transaction in
the writepages for block allocation. This happens with delayed
allocation and there is limit to how many credits we can request
from the journal layer. So we call write_cache_pages multiple
times with wbc-&gt;nr_to_write set to the maximum possible value
limitted by the max journal credits available.

Add a new mode to writeback that enables us to handle this
behaviour. In the new mode we update the wbc-&gt;range_start
to point to the new offset to be written. Next call to
call to write_cache_pages will start writeout from specified
range_start offset. In the new mode we also limit writing
to the specified wbc-&gt;range_end.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Mingming Cao &lt;cmm@us.ibm.com&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
</content>
</entry>
</feed>
