Age | Commit message (Collapse) | Author | Files | Lines |
|
This fixes CVE-2014-0102.
The following command sequence produces an oops:
keyctl new_session
i=`keyctl newring _ses @s`
keyctl link @s $i
The problem is that search_nested_keyrings() sees two keyrings that have
matching type and description, so keyring_compare_object() returns true.
s_n_k() then passes the key to the iterator function -
keyring_detect_cycle_iterator() - which *should* check to see whether this is
the keyring of interest, not just one with the same name.
Because assoc_array_find() will return one and only one match, I assumed that
the iterator function would only see an exact match or never be called - but
the iterator isn't only called from assoc_array_find()...
The oops looks something like this:
kernel BUG at /data/fs/linux-2.6-fscache/security/keys/keyring.c:1003!
invalid opcode: 0000 [#1] SMP
...
RIP: keyring_detect_cycle_iterator+0xe/0x1f
...
Call Trace:
search_nested_keyrings+0x76/0x2aa
__key_link_check_live_key+0x50/0x5f
key_link+0x4e/0x85
keyctl_keyring_link+0x60/0x81
SyS_keyctl+0x65/0xe4
tracesys+0xdd/0xe2
The fix is to make keyring_detect_cycle_iterator() check that the key it
has is the key it was actually looking for rather than calling BUG_ON().
A testcase has been included in the keyutils testsuite for this:
http://git.kernel.org/cgit/linux/kernel/git/dhowells/keyutils.git/commit/?id=891f3365d07f1996778ade0e3428f01878a1790b
Reported-by: Tommi Rantala <tt.rantala@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We have a problem where the big_key key storage implementation uses a
shmem backed inode to hold the key contents. Because of this detail of
implementation LSM checks are being done between processes trying to
read the keys and the tmpfs backed inode. The LSM checks are already
being handled on the key interface level and should not be enforced at
the inode level (since the inode is an implementation detail, not a
part of the security model)
This patch implements a new function shmem_kernel_file_setup() which
returns the equivalent to shmem_file_setup() only the underlying inode
has S_PRIVATE set. This means that all LSM checks for the inode in
question are skipped. It should only be used for kernel internal
operations where the inode is not exposed to userspace without proper
LSM checking. It is possible that some other users of
shmem_file_setup() should use the new interface, but this has not been
explored.
Reproducing this bug is a little bit difficult. The steps I used on
Fedora are:
(1) Turn off selinux enforcing:
setenforce 0
(2) Create a huge key
k=`dd if=/dev/zero bs=8192 count=1 | keyctl padd big_key test-key @s`
(3) Access the key in another context:
runcon system_u:system_r:httpd_t:s0-s0:c0.c1023 keyctl print $k >/dev/null
(4) Examine the audit logs:
ausearch -m AVC -i --subject httpd_t | audit2allow
If the last command's output includes a line that looks like:
allow httpd_t user_tmpfs_t:file { open read };
There was an inode check between httpd and the tmpfs filesystem. With
this patch no such denial will be seen. (NOTE! you should clear your
audit log if you have tested for this previously)
(Please return you box to enforcing)
Signed-off-by: Eric Paris <eparis@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Hugh Dickins <hughd@google.com>
cc: linux-mm@kvack.org
|
|
If a keyring contains more than 16 keyrings (the capacity of a single node in
the associative array) then those keyrings are split over multiple nodes
arranged as a tree.
If search_nested_keyrings() is called to search the keyring then it will
attempt to manually walk over just the 0 branch of the associative array tree
where all the keyring links are stored. This works provided the key is found
before the algorithm steps from one node containing keyrings to a child node
or if there are sufficiently few keyring links that the keyrings are all in
one node.
However, if the algorithm does need to step from a node to a child node, it
doesn't change the node pointer unless a shortcut also gets transited. This
means that the algorithm will keep scanning the same node over and over again
without terminating and without returning.
To fix this, move the internal-pointer-to-node translation from inside the
shortcut transit handler so that it applies it to node arrival as well.
This can be tested by:
r=`keyctl newring sandbox @s`
for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done
for ((i=0; i<=16; i++)); do keyctl add user a$i a %:ring$i; done
for ((i=0; i<=16; i++)); do keyctl search $r user a$i; done
for ((i=17; i<=20; i++)); do keyctl search $r user a$i; done
The searches should all complete successfully (or with an error for 17-20),
but instead one or more of them will hang.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
If sufficient keys (or keyrings) are added into a keyring such that a node in
the associative array's tree overflows (each node has a capacity N, currently
16) and such that all N+1 keys have the same index key segment for that level
of the tree (the level'th nibble of the index key), then assoc_array_insert()
calls ops->diff_objects() to indicate at which bit position the two index keys
vary.
However, __key_link_begin() passes a NULL object to assoc_array_insert() with
the intention of supplying the correct pointer later before we commit the
change. This means that keyring_diff_objects() is given a NULL pointer as one
of its arguments which it does not expect. This results in an oops like the
attached.
With the previous patch to fix the keyring hash function, this can be forced
much more easily by creating a keyring and only adding keyrings to it. Add any
other sort of key and a different insertion path is taken - all 16+1 objects
must want to cluster in the same node slot.
This can be tested by:
r=`keyctl newring sandbox @s`
for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done
This should work fine, but oopses when the 17th keyring is added.
Since ops->diff_objects() is always called with the first pointer pointing to
the object to be inserted (ie. the NULL pointer), we can fix the problem by
changing the to-be-inserted object pointer to point to the index key passed
into assoc_array_insert() instead.
Whilst we're at it, we also switch the arguments so that they are the same as
for ->compare_object().
BUG: unable to handle kernel NULL pointer dereference at 0000000000000088
IP: [<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
RIP: 0010:[<ffffffff81191ee4>] hash_key_type_and_desc+0x18/0xb0
...
Call Trace:
[<ffffffff81191f9d>] keyring_diff_objects+0x21/0xd2
[<ffffffff811f09ef>] assoc_array_insert+0x3b6/0x908
[<ffffffff811929a7>] __key_link_begin+0x78/0xe5
[<ffffffff81191a2e>] key_create_or_update+0x17d/0x36a
[<ffffffff81192e0a>] SyS_add_key+0x123/0x183
[<ffffffff81400ddb>] tracesys+0xdd/0xe2
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
The keyring hash function (used by the associative array) is supposed to clear
the bottommost nibble of the index key (where the hash value resides) for
keyrings and make sure it is non-zero for non-keyrings. This is done to make
keyrings cluster together on one branch of the tree separately to other keys.
Unfortunately, the wrong mask is used, so only the bottom two bits are
examined and cleared and not the whole bottom nibble. This means that keys
and keyrings can still be successfully searched for under most circumstances
as the hash is consistent in its miscalculation, but if a keyring's
associative array bottom node gets filled up then approx 75% of the keyrings
will not be put into the 0 branch.
The consequence of this is that a key in a keyring linked to by another
keyring, ie.
keyring A -> keyring B -> key
may not be found if the search starts at keyring A and then descends into
keyring B because search_nested_keyrings() only searches up the 0 branch (as it
"knows" all keyrings must be there and not elsewhere in the tree).
The fix is to use the right mask.
This can be tested with:
r=`keyctl newring sandbox @s`
for ((i=0; i<=16; i++)); do keyctl newring ring$i $r; done
for ((i=0; i<=16; i++)); do keyctl add user a$i a %:ring$i; done
for ((i=0; i<=16; i++)); do keyctl search $r user a$i; done
This creates a sandbox keyring, then creates 17 keyrings therein (labelled
ring0..ring16). This causes the root node of the sandbox's associative array
to overflow and for the tree to have extra nodes inserted.
Each keyring then is given a user key (labelled aN for ringN) for us to search
for.
We then search for the user keys we added, starting from the sandbox. If
working correctly, it should return the same ordered list of key IDs as
for...keyctl add... did. Without this patch, it reports ENOKEY "Required key
not available" for some of the keys. Just which keys get this depends as the
kernel pointer to the key type forms part of the hash function.
Reported-by: Nalin Dahyabhai <nalin@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
The second word of key->payload does not get initialised in key_alloc(), but
the big_key type is relying on it having been cleared. The problem comes when
big_key fails to instantiate a large key and doesn't then set the payload. The
big_key_destroy() op is called from the garbage collector and this assumes that
the dentry pointer stored in the second word will be NULL if instantiation did
not complete.
Therefore just pre-clear the entire struct key on allocation rather than trying
to be clever and only initialising to 0 only those bits that aren't otherwise
initialised.
The lack of initialisation can lead to a bug report like the following if
big_key failed to initialise its file:
general protection fault: 0000 [#1] SMP
Modules linked in: ...
CPU: 0 PID: 51 Comm: kworker/0:1 Not tainted 3.10.0-53.el7.x86_64 #1
Hardware name: Dell Inc. PowerEdge 1955/0HC513, BIOS 1.4.4 12/09/2008
Workqueue: events key_garbage_collector
task: ffff8801294f5680 ti: ffff8801296e2000 task.ti: ffff8801296e2000
RIP: 0010:[<ffffffff811b4a51>] dput+0x21/0x2d0
...
Call Trace:
[<ffffffff811a7b06>] path_put+0x16/0x30
[<ffffffff81235604>] big_key_destroy+0x44/0x60
[<ffffffff8122dc4b>] key_gc_unused_keys.constprop.2+0x5b/0xe0
[<ffffffff8122df2f>] key_garbage_collector+0x1df/0x3c0
[<ffffffff8107759b>] process_one_work+0x17b/0x460
[<ffffffff8107834b>] worker_thread+0x11b/0x400
[<ffffffff81078230>] ? rescuer_thread+0x3e0/0x3e0
[<ffffffff8107eb00>] kthread+0xc0/0xd0
[<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110
[<ffffffff815c4bec>] ret_from_fork+0x7c/0xb0
[<ffffffff8107ea40>] ? kthread_create_on_node+0x110/0x110
Reported-by: Patrik Kis <pkis@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
Key pointers stored in the keyring are marked in bit 1 to indicate if they
point to a keyring. We need to strip off this bit before using the pointer
when iterating over the keyring for the purpose of looking for links to garbage
collect.
This means that expirable keyrings aren't correctly expiring because the
checker is seeing their key pointer with 2 added to it.
Since the fix for this involves knowing about the internals of the keyring,
key_gc_keyring() is moved to keyring.c and merged into keyring_gc().
This can be tested by:
echo 2 >/proc/sys/kernel/keys/gc_delay
keyctl timeout `keyctl add keyring qwerty "" @s` 2
cat /proc/keys
sleep 5; cat /proc/keys
which should see a keyring called "qwerty" appear in the session keyring and
then disappear after it expires, and:
echo 2 >/proc/sys/kernel/keys/gc_delay
a=`keyctl get_persistent @s`
b=`keyctl add keyring 0 "" $a`
keyctl add user a a $b
keyctl timeout $b 2
cat /proc/keys
sleep 5; cat /proc/keys
which should see a keyring called "0" with a key called "a" in it appear in the
user's persistent keyring (which will be attached to the session keyring) and
then both the "0" keyring and the "a" key should disappear when the "0" keyring
expires.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Simo Sorce <simo@redhat.com>
|
|
In the big_key_instantiate() function we return 0 if kernel_write() returns us
an error rather than returning an error. This can potentially lead to
dentry_open() giving a BUG when called from big_key_read() with an unset
tmpfile path.
------------[ cut here ]------------
kernel BUG at fs/open.c:798!
...
RIP: 0010:[<ffffffff8119bbd1>] dentry_open+0xd1/0xe0
...
Call Trace:
[<ffffffff812350c5>] big_key_read+0x55/0x100
[<ffffffff81231084>] keyctl_read_key+0xb4/0xe0
[<ffffffff81231e58>] SyS_keyctl+0xf8/0x1d0
[<ffffffff815bb799>] system_call_fastpath+0x16/0x1b
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
If the UID is specified by userspace when calling the KEYCTL_GET_PERSISTENT
function and the process does not have the CAP_SETUID capability, then the
function will return -EPERM if the current process's uid, suid, euid and fsuid
all match the requested UID. This is incorrect.
Fix it such that when a non-privileged caller requests a persistent keyring by
a specific UID they can only request their own (ie. the specified UID matches
either then process's UID or the process's EUID).
This can be tested by logging in as the user and doing:
keyctl get_persistent @p
keyctl get_persistent @p `id -u`
keyctl get_persistent @p 0
The first two should successfully print the same key ID. The third should do
the same if called by UID 0 or indicate Operation Not Permitted otherwise.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Stephen Gallagher <sgallagh@redhat.com>
|
|
Fix to return a negative error code from the error handling
case instead of 0, as done elsewhere in this function.
Signed-off-by: Wei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
If a key is displaced from a keyring by a matching one, then four more bytes
of quota are allocated to the keyring - despite the fact that the keyring does
not change in size.
Further, when a key is unlinked from a keyring, the four bytes of quota
allocated the link isn't recovered and returned to the user's pool.
The first can be tested by repeating:
keyctl add big_key a fred @s
cat /proc/key-users
(Don't put it in a shell loop otherwise the garbage collector won't have time
to clear the displaced keys, thus affecting the result).
This was causing the kerberos keyring to run out of room fairly quickly.
The second can be tested by:
cat /proc/key-users
a=`keyctl add user a a @s`
cat /proc/key-users
keyctl unlink $a
sleep 1 # Give RCU a chance to delete the key
cat /proc/key-users
assuming no system activity that otherwise adds/removes keys, the amount of
key data allocated should go up (say 40/20000 -> 47/20000) and then return to
the original value at the end.
Reported-by: Stephen Gallagher <sgallagh@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
key_reject_and_link() marking a key as negative and setting the error with
which it was negated races with keyring searches and other things that read
that error.
The fix is to switch the order in which the assignments are done in
key_reject_and_link() and to use memory barriers.
Kudos to Dave Wysochanski <dwysocha@redhat.com> and Scott Mayhew
<smayhew@redhat.com> for tracking this down.
This may be the cause of:
BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
IP: [<ffffffff81219011>] wait_for_key_construction+0x31/0x80
PGD c6b2c3067 PUD c59879067 PMD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/system/cpu/cpu3/cache/index2/shared_cpu_map
CPU 0
Modules linked in: ...
Pid: 13359, comm: amqzxma0 Not tainted 2.6.32-358.20.1.el6.x86_64 #1 IBM System x3650 M3 -[7945PSJ]-/00J6159
RIP: 0010:[<ffffffff81219011>] wait_for_key_construction+0x31/0x80
RSP: 0018:ffff880c6ab33758 EFLAGS: 00010246
RAX: ffffffff81219080 RBX: 0000000000000000 RCX: 0000000000000002
RDX: ffffffff81219060 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff880c6ab33768 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff880adfcbce40
R13: ffffffffa03afb84 R14: ffff880adfcbce40 R15: ffff880adfcbce43
FS: 00007f29b8042700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000070 CR3: 0000000c613dc000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process amqzxma0 (pid: 13359, threadinfo ffff880c6ab32000, task ffff880c610deae0)
Stack:
ffff880adfcbce40 0000000000000000 ffff880c6ab337b8 ffffffff81219695
<d> 0000000000000000 ffff880a000000d0 ffff880c6ab337a8 000000000000000f
<d> ffffffffa03afb93 000000000000000f ffff88186c7882c0 0000000000000014
Call Trace:
[<ffffffff81219695>] request_key+0x65/0xa0
[<ffffffffa03a0885>] nfs_idmap_request_key+0xc5/0x170 [nfs]
[<ffffffffa03a0eb4>] nfs_idmap_lookup_id+0x34/0x80 [nfs]
[<ffffffffa03a1255>] nfs_map_group_to_gid+0x75/0xa0 [nfs]
[<ffffffffa039a9ad>] decode_getfattr_attrs+0xbdd/0xfb0 [nfs]
[<ffffffff81057310>] ? __dequeue_entity+0x30/0x50
[<ffffffff8100988e>] ? __switch_to+0x26e/0x320
[<ffffffffa039ae03>] decode_getfattr+0x83/0xe0 [nfs]
[<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs]
[<ffffffffa039b69f>] nfs4_xdr_dec_getattr+0x8f/0xa0 [nfs]
[<ffffffffa02dada4>] rpcauth_unwrap_resp+0x84/0xb0 [sunrpc]
[<ffffffffa039b610>] ? nfs4_xdr_dec_getattr+0x0/0xa0 [nfs]
[<ffffffffa02cf923>] call_decode+0x1b3/0x800 [sunrpc]
[<ffffffff81096de0>] ? wake_bit_function+0x0/0x50
[<ffffffffa02cf770>] ? call_decode+0x0/0x800 [sunrpc]
[<ffffffffa02d99a7>] __rpc_execute+0x77/0x350 [sunrpc]
[<ffffffff81096c67>] ? bit_waitqueue+0x17/0xd0
[<ffffffffa02d9ce1>] rpc_execute+0x61/0xa0 [sunrpc]
[<ffffffffa02d03a5>] rpc_run_task+0x75/0x90 [sunrpc]
[<ffffffffa02d04c2>] rpc_call_sync+0x42/0x70 [sunrpc]
[<ffffffffa038ff80>] _nfs4_call_sync+0x30/0x40 [nfs]
[<ffffffffa038836c>] _nfs4_proc_getattr+0xac/0xc0 [nfs]
[<ffffffff810aac87>] ? futex_wait+0x227/0x380
[<ffffffffa038b856>] nfs4_proc_getattr+0x56/0x80 [nfs]
[<ffffffffa0371403>] __nfs_revalidate_inode+0xe3/0x220 [nfs]
[<ffffffffa037158e>] nfs_revalidate_mapping+0x4e/0x170 [nfs]
[<ffffffffa036f147>] nfs_file_read+0x77/0x130 [nfs]
[<ffffffff811811aa>] do_sync_read+0xfa/0x140
[<ffffffff81096da0>] ? autoremove_wake_function+0x0/0x40
[<ffffffff8100bb8e>] ? apic_timer_interrupt+0xe/0x20
[<ffffffff8100b9ce>] ? common_interrupt+0xe/0x13
[<ffffffff81228ffb>] ? selinux_file_permission+0xfb/0x150
[<ffffffff8121bed6>] ? security_file_permission+0x16/0x20
[<ffffffff81181a95>] vfs_read+0xb5/0x1a0
[<ffffffff81181bd1>] sys_read+0x51/0x90
[<ffffffff810dc685>] ? __audit_syscall_exit+0x265/0x290
[<ffffffff8100b072>] system_call_fastpath+0x16/0x1b
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Dave Wysochanski <dwysocha@redhat.com>
cc: Scott Mayhew <smayhew@redhat.com>
|
|
Having the big_keys functionality as a module is very marginally useful.
The userspace code that would use this functionality will get odd error
messages from the keys layer if the module isn't loaded. The code itself
is fairly small, so just have this as a boolean option and not a tristate.
Signed-off-by: Josh Boyer <jwboyer@fedoraproject.org>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
In order to create the integrity keyrings (eg. _evm, _ima), root's
uid and session keyrings need to be initialized early.
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Add KEY_FLAG_TRUSTED to indicate that a key either comes from a trusted source
or had a cryptographic signature chain that led back to a trusted key the
kernel already possessed.
Add KEY_FLAGS_TRUSTED_ONLY to indicate that a keyring will only accept links to
keys marked with KEY_FLAGS_TRUSTED.
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
|
|
Add support for per-user_namespace registers of persistent per-UID kerberos
caches held within the kernel.
This allows the kerberos cache to be retained beyond the life of all a user's
processes so that the user's cron jobs can work.
The kerberos cache is envisioned as a keyring/key tree looking something like:
struct user_namespace
\___ .krb_cache keyring - The register
\___ _krb.0 keyring - Root's Kerberos cache
\___ _krb.5000 keyring - User 5000's Kerberos cache
\___ _krb.5001 keyring - User 5001's Kerberos cache
\___ tkt785 big_key - A ccache blob
\___ tkt12345 big_key - Another ccache blob
Or possibly:
struct user_namespace
\___ .krb_cache keyring - The register
\___ _krb.0 keyring - Root's Kerberos cache
\___ _krb.5000 keyring - User 5000's Kerberos cache
\___ _krb.5001 keyring - User 5001's Kerberos cache
\___ tkt785 keyring - A ccache
\___ krbtgt/REDHAT.COM@REDHAT.COM big_key
\___ http/REDHAT.COM@REDHAT.COM user
\___ afs/REDHAT.COM@REDHAT.COM user
\___ nfs/REDHAT.COM@REDHAT.COM user
\___ krbtgt/KERNEL.ORG@KERNEL.ORG big_key
\___ http/KERNEL.ORG@KERNEL.ORG big_key
What goes into a particular Kerberos cache is entirely up to userspace. Kernel
support is limited to giving you the Kerberos cache keyring that you want.
The user asks for their Kerberos cache by:
krb_cache = keyctl_get_krbcache(uid, dest_keyring);
The uid is -1 or the user's own UID for the user's own cache or the uid of some
other user's cache (requires CAP_SETUID). This permits rpc.gssd or whatever to
mess with the cache.
The cache returned is a keyring named "_krb.<uid>" that the possessor can read,
search, clear, invalidate, unlink from and add links to. Active LSMs get a
chance to rule on whether the caller is permitted to make a link.
Each uid's cache keyring is created when it first accessed and is given a
timeout that is extended each time this function is called so that the keyring
goes away after a while. The timeout is configurable by sysctl but defaults to
three days.
Each user_namespace struct gets a lazily-created keyring that serves as the
register. The cache keyrings are added to it. This means that standard key
search and garbage collection facilities are available.
The user_namespace struct's register goes away when it does and anything left
in it is then automatically gc'd.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Simo Sorce <simo@redhat.com>
cc: Serge E. Hallyn <serge.hallyn@ubuntu.com>
cc: Eric W. Biederman <ebiederm@xmission.com>
|
|
Implement a big key type that can save its contents to tmpfs and thus
swapspace when memory is tight. This is useful for Kerberos ticket caches.
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Simo Sorce <simo@redhat.com>
|
|
Expand the capacity of a keyring to be able to hold a lot more keys by using
the previously added associative array implementation. Currently the maximum
capacity is:
(PAGE_SIZE - sizeof(header)) / sizeof(struct key *)
which, on a 64-bit system, is a little more 500. However, since this is being
used for the NFS uid mapper, we need more than that. The new implementation
gives us effectively unlimited capacity.
With some alterations, the keyutils testsuite runs successfully to completion
after this patch is applied. The alterations are because (a) keyrings that
are simply added to no longer appear ordered and (b) some of the errors have
changed a bit.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Drop the permissions argument from __keyring_search_one() as the only caller
passes 0 here - which causes all checks to be skipped.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Define a __key_get() wrapper to use rather than atomic_inc() on the key usage
count as this makes it easier to hook in refcount error debugging.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Search for auth-key by name rather than by target key ID as, in a future
patch, we'll by searching directly by index key in preference to iteration
over all keys.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Search functions pass around a bunch of arguments, each of which gets copied
with each call. Introduce a search context structure to hold these.
Whilst we're at it, create a search flag that indicates whether the search
should be directly to the description or whether it should iterate through all
keys looking for a non-description match.
This will be useful when keyrings use a generic data struct with generic
routines to manage their content as the search terms can just be passed
through to the iterator callback function.
Also, for future use, the data to be supplied to the match function is
separated from the description pointer in the search context. This makes it
clear which is being supplied.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Consolidate the concept of an 'index key' for accessing keys. The index key
is the search term needed to find a key directly - basically the key type and
the key description. We can add to that the description length.
This will be useful when turning a keyring into an associative array rather
than just a pointer block.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
key_is_dead() should take a const key pointer argument as it doesn't modify
what it points to.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Make make_key_ref() take a bool possession parameter and make
is_key_possessed() return a bool.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Skip key state checks (invalidation, revocation and expiration) when checking
for possession. Without this, keys that have been marked invalid, revoked
keys and expired keys are not given a possession attribute - which means the
possessor is not granted any possession permits and cannot do anything with
them unless they also have one a user, group or other permit.
This causes failures in the keyutils test suite's revocation and expiration
tests now that commit 96b5c8fea6c0861621051290d705ec2e971963f1 reduced the
initial permissions granted to a key.
The failures are due to accesses to revoked and expired keys being given
EACCES instead of EKEYREVOKED or EKEYEXPIRED.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Faster kernel compiles by way of fewer unnecessary includes.
[akpm@linux-foundation.org: fix fallout]
[akpm@linux-foundation.org: fix build]
Signed-off-by: Kent Overstreet <koverstreet@google.com>
Cc: Zach Brown <zab@redhat.com>
Cc: Felipe Balbi <balbi@ti.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Mark Fasheh <mfasheh@suse.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Asai Thambi S P <asamymuthupa@micron.com>
Cc: Selvan Mani <smani@micron.com>
Cc: Sam Bradshaw <sbradshaw@micron.com>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin LaHaise <bcrl@kvack.org>
Reviewed-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Use call_usermodehelper_setup() + call_usermodehelper_exec() instead of
calling call_usermodehelper_fns(). In case there's an OOM in this last
function the cleanup function may not be called - in this case we would
miss a call to key_put().
Signed-off-by: Lucas De Marchi <lucas.demarchi@profusion.mobi>
Cc: Oleg Nesterov <oleg@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: James Morris <james.l.morris@oracle.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Tejun Heo <tj@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
security keys
Looking at mm/process_vm_access.c:process_vm_rw() and comparing it to
compat_process_vm_rw() shows that the compatibility code requires an
explicit "access_ok()" check before calling
compat_rw_copy_check_uvector(). The same difference seems to appear when
we compare fs/read_write.c:do_readv_writev() to
fs/compat.c:compat_do_readv_writev().
This subtle difference between the compat and non-compat requirements
should probably be debated, as it seems to be error-prone. In fact,
there are two others sites that use this function in the Linux kernel,
and they both seem to get it wrong:
Now shifting our attention to fs/aio.c, we see that aio_setup_iocb()
also ends up calling compat_rw_copy_check_uvector() through
aio_setup_vectored_rw(). Unfortunately, the access_ok() check appears to
be missing. Same situation for
security/keys/compat.c:compat_keyctl_instantiate_key_iov().
I propose that we add the access_ok() check directly into
compat_rw_copy_check_uvector(), so callers don't have to worry about it,
and it therefore makes the compat call code similar to its non-compat
counterpart. Place the access_ok() check in the same location where
copy_from_user() can trigger a -EFAULT error in the non-compat code, so
the ABI behaviors are alike on both compat and non-compat.
While we are here, fix compat_do_readv_writev() so it checks for
compat_rw_copy_check_uvector() negative return values.
And also, fix a memory leak in compat_keyctl_instantiate_key_iov() error
handling.
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This fixes CVE-2013-1792.
There is a race in install_user_keyrings() that can cause a NULL pointer
dereference when called concurrently for the same user if the uid and
uid-session keyrings are not yet created. It might be possible for an
unprivileged user to trigger this by calling keyctl() from userspace in
parallel immediately after logging in.
Assume that we have two threads both executing lookup_user_key(), both
looking for KEY_SPEC_USER_SESSION_KEYRING.
THREAD A THREAD B
=============================== ===============================
==>call install_user_keyrings();
if (!cred->user->session_keyring)
==>call install_user_keyrings()
...
user->uid_keyring = uid_keyring;
if (user->uid_keyring)
return 0;
<==
key = cred->user->session_keyring [== NULL]
user->session_keyring = session_keyring;
atomic_inc(&key->usage); [oops]
At the point thread A dereferences cred->user->session_keyring, thread B
hasn't updated user->session_keyring yet, but thread A assumes it is
populated because install_user_keyrings() returned ok.
The race window is really small but can be exploited if, for example,
thread B is interrupted or preempted after initializing uid_keyring, but
before doing setting session_keyring.
This couldn't be reproduced on a stock kernel. However, after placing
systemtap probe on 'user->session_keyring = session_keyring;' that
introduced some delay, the kernel could be crashed reliably.
Fix this by checking both pointers before deciding whether to return.
Alternatively, the test could be done away with entirely as it is checked
inside the mutex - but since the mutex is global, that may not be the best
way.
Signed-off-by: David Howells <dhowells@redhat.com>
Reported-by: Mateusz Guzik <mguzik@redhat.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: James Morris <james.l.morris@oracle.com>
|
|
Dave Jones <davej@redhat.com> writes:
> Just hit this on Linus' current tree.
>
> [ 89.621770] BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
> [ 89.623111] IP: [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [ 89.624062] PGD 122bfd067 PUD 122bfe067 PMD 0
> [ 89.624901] Oops: 0000 [#1] PREEMPT SMP
> [ 89.625678] Modules linked in: caif_socket caif netrom bridge hidp 8021q garp stp mrp rose llc2 af_rxrpc phonet af_key binfmt_misc bnep l2tp_ppp can_bcm l2tp_core pppoe pppox can_raw scsi_transport_iscsi ppp_generic slhc nfnetlink can ipt_ULOG ax25 decnet irda nfc rds x25 crc_ccitt appletalk atm ipx p8023 psnap p8022 llc lockd sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_conntrack nf_conntrack ip6table_filter ip6_tables btusb bluetooth snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_pcm vhost_net snd_page_alloc snd_timer tun macvtap usb_debug snd rfkill microcode macvlan edac_core pcspkr serio_raw kvm_amd soundcore kvm r8169 mii
> [ 89.637846] CPU 2
> [ 89.638175] Pid: 782, comm: trinity-main Not tainted 3.8.0+ #63 Gigabyte Technology Co., Ltd. GA-MA78GM-S2H/GA-MA78GM-S2H
> [ 89.639850] RIP: 0010:[<ffffffff810784b0>] [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [ 89.641161] RSP: 0018:ffff880115657eb8 EFLAGS: 00010207
> [ 89.641984] RAX: 00000000000003e8 RBX: ffff88012688b000 RCX: 0000000000000000
> [ 89.643069] RDX: 0000000000000000 RSI: ffffffff81c32960 RDI: ffff880105839600
> [ 89.644167] RBP: ffff880115657ed8 R08: 0000000000000000 R09: 0000000000000000
> [ 89.645254] R10: 0000000000000001 R11: 0000000000000246 R12: ffff880105839600
> [ 89.646340] R13: ffff88011beea490 R14: ffff88011beea490 R15: 0000000000000000
> [ 89.647431] FS: 00007f3ac063b740(0000) GS:ffff88012b200000(0000) knlGS:0000000000000000
> [ 89.648660] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 89.649548] CR2: 00000000000000c8 CR3: 0000000122bfc000 CR4: 00000000000007e0
> [ 89.650635] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 89.651723] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 89.652812] Process trinity-main (pid: 782, threadinfo ffff880115656000, task ffff88011beea490)
> [ 89.654128] Stack:
> [ 89.654433] 0000000000000000 ffff8801058396a0 ffff880105839600 ffff88011beeaa78
> [ 89.655769] ffff880115657ef8 ffffffff812c7d9b ffffffff82079be0 0000000000000000
> [ 89.657073] ffff880115657f28 ffffffff8106c665 0000000000000002 ffff880115657f58
> [ 89.658399] Call Trace:
> [ 89.658822] [<ffffffff812c7d9b>] key_change_session_keyring+0xfb/0x140
> [ 89.659845] [<ffffffff8106c665>] task_work_run+0xa5/0xd0
> [ 89.660698] [<ffffffff81002911>] do_notify_resume+0x71/0xb0
> [ 89.661581] [<ffffffff816c9a4a>] int_signal+0x12/0x17
> [ 89.662385] Code: 24 90 00 00 00 48 8b b3 90 00 00 00 49 8b 4c 24 40 48 39 f2 75 08 e9 83 00 00 00 48 89 ca 48 81 fa 60 29 c3 81 0f 84 41 fe ff ff <48> 8b 8a c8 00 00 00 48 39 ce 75 e4 3b 82 d0 00 00 00 0f 84 4b
> [ 89.667778] RIP [<ffffffff810784b0>] commit_creds+0x250/0x2f0
> [ 89.668733] RSP <ffff880115657eb8>
> [ 89.669301] CR2: 00000000000000c8
>
> My fastest trinity induced oops yet!
>
>
> Appears to be..
>
> if ((set_ns == subset_ns->parent) &&
> 850: 48 8b 8a c8 00 00 00 mov 0xc8(%rdx),%rcx
>
> from the inlined cred_cap_issubset
By historical accident we have been reading trying to set new->user_ns
from new->user_ns. Which is totally silly as new->user_ns is NULL (as
is every other field in new except session_keyring at that point).
The intent is clearly to copy all of the fields from old to new so copy
old->user_ns into into new->user_ns.
Cc: stable@vger.kernel.org
Reported-by: Dave Jones <davej@redhat.com>
Tested-by: Dave Jones <davej@redhat.com>
Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
|
|
A patch to fix some unreachable code in search_my_process_keyrings() got
applied twice by two different routes upstream as commits e67eab39bee2
and b010520ab3d2 (both "fix unreachable code").
Unfortunately, the second application removed something it shouldn't
have and this wasn't detected by GIT. This is due to the patch not
having sufficient lines of context to distinguish the two places of
application.
The effect of this is relatively minor: inside the kernel, the keyring
search routines may search multiple keyrings and then prioritise the
errors if no keys or negative keys are found in any of them. With the
extra deletion, the presence of a negative key in the thread keyring
(causing ENOKEY) is incorrectly overridden by an error searching the
process keyring.
So revert the second application of the patch.
Signed-off-by: David Howells <dhowells@redhat.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
We set ret to NULL then test it. Remove the bogus test
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
"A quiet cycle for the security subsystem with just a few maintenance
updates."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
Smack: create a sysfs mount point for smackfs
Smack: use select not depends in Kconfig
Yama: remove locking from delete path
Yama: add RCU to drop read locking
drivers/char/tpm: remove tasklet and cleanup
KEYS: Use keyring_alloc() to create special keyrings
KEYS: Reduce initial permissions on keys
KEYS: Make the session and process keyrings per-thread
seccomp: Make syscall skipping and nr changes more consistent
key: Fix resource leak
keys: Fix unreachable code
KEYS: Add payload preparsing opportunity prior to key instantiate or update
|
|
Sync up with Linus' tree to be able to apply Cesar's patch
against newer version of the code.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
We set ret to NULL then test it. Remove the bogus test
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux
Pull module signing support from Rusty Russell:
"module signing is the highlight, but it's an all-over David Howells frenzy..."
Hmm "Magrathea: Glacier signing key". Somebody has been reading too much HHGTTG.
* 'modules-next' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux: (37 commits)
X.509: Fix indefinite length element skip error handling
X.509: Convert some printk calls to pr_devel
asymmetric keys: fix printk format warning
MODSIGN: Fix 32-bit overflow in X.509 certificate validity date checking
MODSIGN: Make mrproper should remove generated files.
MODSIGN: Use utf8 strings in signer's name in autogenerated X.509 certs
MODSIGN: Use the same digest for the autogen key sig as for the module sig
MODSIGN: Sign modules during the build process
MODSIGN: Provide a script for generating a key ID from an X.509 cert
MODSIGN: Implement module signature checking
MODSIGN: Provide module signing public keys to the kernel
MODSIGN: Automatically generate module signing keys if missing
MODSIGN: Provide Kconfig options
MODSIGN: Provide gitignore and make clean rules for extra files
MODSIGN: Add FIPS policy
module: signature checking hook
X.509: Add a crypto key parser for binary (DER) X.509 certificates
MPILIB: Provide a function to read raw data into an MPI
X.509: Add an ASN.1 decoder
X.509: Add simple ASN.1 grammar compiler
...
|
|
Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called. This is done with the
provision of two new key type operations:
int (*preparse)(struct key_preparsed_payload *prep);
void (*free_preparse)(struct key_preparsed_payload *prep);
If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases). The second operation is called to clean up if the first
was called.
preparse() is given the opportunity to fill in the following structure:
struct key_preparsed_payload {
char *description;
void *type_data[2];
void *payload;
const void *data;
size_t datalen;
size_t quotalen;
};
Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.
The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.
The preparser may also propose a description for the key by attaching it as a
string to the description field. This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function. This cannot work with request_key() as that required the description
to tell the upcall about the key to be created.
This, for example permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.
The instantiate() and update() operations are then modified to look like this:
int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
int (*update)(struct key *key, struct key_preparsed_payload *prep);
and the new payload data is passed in *prep, whether or not it was preparsed.
Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
Pull security subsystem updates from James Morris:
"Highlights:
- Integrity: add local fs integrity verification to detect offline
attacks
- Integrity: add digital signature verification
- Simple stacking of Yama with other LSMs (per LSS discussions)
- IBM vTPM support on ppc64
- Add new driver for Infineon I2C TIS TPM
- Smack: add rule revocation for subject labels"
Fixed conflicts with the user namespace support in kernel/auditsc.c and
security/integrity/ima/ima_policy.c.
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security: (39 commits)
Documentation: Update git repository URL for Smack userland tools
ima: change flags container data type
Smack: setprocattr memory leak fix
Smack: implement revoking all rules for a subject label
Smack: remove task_wait() hook.
ima: audit log hashes
ima: generic IMA action flag handling
ima: rename ima_must_appraise_or_measure
audit: export audit_log_task_info
tpm: fix tpm_acpi sparse warning on different address spaces
samples/seccomp: fix 31 bit build on s390
ima: digital signature verification support
ima: add support for different security.ima data types
ima: add ima_inode_setxattr/removexattr function and calls
ima: add inode_post_setattr call
ima: replace iint spinblock with rwlock/read_lock
ima: allocating iint improvements
ima: add appraise action keywords and default rules
ima: integrity appraisal extension
vfs: move ima_file_free before releasing the file
...
|
|
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Use keyring_alloc() to create special keyrings now that it has a permissions
parameter rather than using key_alloc() + key_instantiate_and_link().
Also document and export keyring_alloc() so that modules can use it too.
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Reduce the initial permissions on new keys to grant the possessor everything,
view permission only to the user (so the keys can be seen in /proc/keys) and
nothing else.
This gives the creator a chance to adjust the permissions mask before other
processes can access the new key or create a link to it.
To aid with this, keyring_alloc() now takes a permission argument rather than
setting the permissions itself.
The following permissions are now set:
(1) The user and user-session keyrings grant the user that owns them full
permissions and grant a possessor everything bar SETATTR.
(2) The process and thread keyrings grant the possessor full permissions but
only grant the user VIEW. This permits the user to see them in
/proc/keys, but not to do anything with them.
(3) Anonymous session keyrings grant the possessor full permissions, but only
grant the user VIEW and READ. This means that the user can see them in
/proc/keys and can list them, but nothing else. Possibly READ shouldn't
be provided either.
(4) Named session keyrings grant everything an anonymous session keyring does,
plus they grant the user LINK permission. The whole point of named
session keyrings is that others can also subscribe to them. Possibly this
should be a separate permission to LINK.
(5) The temporary session keyring created by call_sbin_request_key() gets the
same permissions as an anonymous session keyring.
(6) Keys created by add_key() get VIEW, SEARCH, LINK and SETATTR for the
possessor, plus READ and/or WRITE if the key type supports them. The used
only gets VIEW now.
(7) Keys created by request_key() now get the same as those created by
add_key().
Reported-by: Lennart Poettering <lennart@poettering.net>
Reported-by: Stef Walter <stefw@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
Make the session keyring per-thread rather than per-process, but still
inherited from the parent thread to solve a problem with PAM and gdm.
The problem is that join_session_keyring() will reject attempts to change the
session keyring of a multithreaded program but gdm is now multithreaded before
it gets to the point of starting PAM and running pam_keyinit to create the
session keyring. See:
https://bugs.freedesktop.org/show_bug.cgi?id=49211
The reason that join_session_keyring() will only change the session keyring
under a single-threaded environment is that it's hard to alter the other
thread's credentials to effect the change in a multi-threaded program. The
problems are such as:
(1) How to prevent two threads both running join_session_keyring() from
racing.
(2) Another thread's credentials may not be modified directly by this process.
(3) The number of threads is uncertain whilst we're not holding the
appropriate spinlock, making preallocation slightly tricky.
(4) We could use TIF_NOTIFY_RESUME and key_replace_session_keyring() to get
another thread to replace its keyring, but that means preallocating for
each thread.
A reasonable way around this is to make the session keyring per-thread rather
than per-process and just document that if you want a common session keyring,
you must get it before you spawn any threads - which is the current situation
anyway.
Whilst we're at it, we can the process keyring behave in the same way. This
means we can clean up some of the ickyness in the creds code.
Basically, after this patch, the session, process and thread keyrings are about
inheritance rules only and not about sharing changes of keyring.
Reported-by: Mantas M. <grawity@gmail.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Tested-by: Ray Strode <rstrode@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull user namespace changes from Eric Biederman:
"This is a mostly modest set of changes to enable basic user namespace
support. This allows the code to code to compile with user namespaces
enabled and removes the assumption there is only the initial user
namespace. Everything is converted except for the most complex of the
filesystems: autofs4, 9p, afs, ceph, cifs, coda, fuse, gfs2, ncpfs,
nfs, ocfs2 and xfs as those patches need a bit more review.
The strategy is to push kuid_t and kgid_t values are far down into
subsystems and filesystems as reasonable. Leaving the make_kuid and
from_kuid operations to happen at the edge of userspace, as the values
come off the disk, and as the values come in from the network.
Letting compile type incompatible compile errors (present when user
namespaces are enabled) guide me to find the issues.
The most tricky areas have been the places where we had an implicit
union of uid and gid values and were storing them in an unsigned int.
Those places were converted into explicit unions. I made certain to
handle those places with simple trivial patches.
Out of that work I discovered we have generic interfaces for storing
quota by projid. I had never heard of the project identifiers before.
Adding full user namespace support for project identifiers accounts
for most of the code size growth in my git tree.
Ultimately there will be work to relax privlige checks from
"capable(FOO)" to "ns_capable(user_ns, FOO)" where it is safe allowing
root in a user names to do those things that today we only forbid to
non-root users because it will confuse suid root applications.
While I was pushing kuid_t and kgid_t changes deep into the audit code
I made a few other cleanups. I capitalized on the fact we process
netlink messages in the context of the message sender. I removed
usage of NETLINK_CRED, and started directly using current->tty.
Some of these patches have also made it into maintainer trees, with no
problems from identical code from different trees showing up in
linux-next.
After reading through all of this code I feel like I might be able to
win a game of kernel trivial pursuit."
Fix up some fairly trivial conflicts in netfilter uid/git logging code.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (107 commits)
userns: Convert the ufs filesystem to use kuid/kgid where appropriate
userns: Convert the udf filesystem to use kuid/kgid where appropriate
userns: Convert ubifs to use kuid/kgid
userns: Convert squashfs to use kuid/kgid where appropriate
userns: Convert reiserfs to use kuid and kgid where appropriate
userns: Convert jfs to use kuid/kgid where appropriate
userns: Convert jffs2 to use kuid and kgid where appropriate
userns: Convert hpfs to use kuid and kgid where appropriate
userns: Convert btrfs to use kuid/kgid where appropriate
userns: Convert bfs to use kuid/kgid where appropriate
userns: Convert affs to use kuid/kgid wherwe appropriate
userns: On alpha modify linux_to_osf_stat to use convert from kuids and kgids
userns: On ia64 deal with current_uid and current_gid being kuid and kgid
userns: On ppc convert current_uid from a kuid before printing.
userns: Convert s390 getting uid and gid system calls to use kuid and kgid
userns: Convert s390 hypfs to use kuid and kgid where appropriate
userns: Convert binder ipc to use kuids
userns: Teach security_path_chown to take kuids and kgids
userns: Add user namespace support to IMA
userns: Convert EVM to deal with kuids and kgids in it's hmac computation
...
|
|
Pull workqueue changes from Tejun Heo:
"This is workqueue updates for v3.7-rc1. A lot of activities this
round including considerable API and behavior cleanups.
* delayed_work combines a timer and a work item. The handling of the
timer part has always been a bit clunky leading to confusing
cancelation API with weird corner-case behaviors. delayed_work is
updated to use new IRQ safe timer and cancelation now works as
expected.
* Another deficiency of delayed_work was lack of the counterpart of
mod_timer() which led to cancel+queue combinations or open-coded
timer+work usages. mod_delayed_work[_on]() are added.
These two delayed_work changes make delayed_work provide interface
and behave like timer which is executed with process context.
* A work item could be executed concurrently on multiple CPUs, which
is rather unintuitive and made flush_work() behavior confusing and
half-broken under certain circumstances. This problem doesn't
exist for non-reentrant workqueues. While non-reentrancy check
isn't free, the overhead is incurred only when a work item bounces
across different CPUs and even in simulated pathological scenario
the overhead isn't too high.
All workqueues are made non-reentrant. This removes the
distinction between flush_[delayed_]work() and
flush_[delayed_]_work_sync(). The former is now as strong as the
latter and the specified work item is guaranteed to have finished
execution of any previous queueing on return.
* In addition to the various bug fixes, Lai redid and simplified CPU
hotplug handling significantly.
* Joonsoo introduced system_highpri_wq and used it during CPU
hotplug.
There are two merge commits - one to pull in IRQ safe timer from
tip/timers/core and the other to pull in CPU hotplug fixes from
wq/for-3.6-fixes as Lai's hotplug restructuring depended on them."
Fixed a number of trivial conflicts, but the more interesting conflicts
were silent ones where the deprecated interfaces had been used by new
code in the merge window, and thus didn't cause any real data conflicts.
Tejun pointed out a few of them, I fixed a couple more.
* 'for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: (46 commits)
workqueue: remove spurious WARN_ON_ONCE(in_irq()) from try_to_grab_pending()
workqueue: use cwq_set_max_active() helper for workqueue_set_max_active()
workqueue: introduce cwq_set_max_active() helper for thaw_workqueues()
workqueue: remove @delayed from cwq_dec_nr_in_flight()
workqueue: fix possible stall on try_to_grab_pending() of a delayed work item
workqueue: use hotcpu_notifier() for workqueue_cpu_down_callback()
workqueue: use __cpuinit instead of __devinit for cpu callbacks
workqueue: rename manager_mutex to assoc_mutex
workqueue: WORKER_REBIND is no longer necessary for idle rebinding
workqueue: WORKER_REBIND is no longer necessary for busy rebinding
workqueue: reimplement idle worker rebinding
workqueue: deprecate __cancel_delayed_work()
workqueue: reimplement cancel_delayed_work() using try_to_grab_pending()
workqueue: use mod_delayed_work() instead of __cancel + queue
workqueue: use irqsafe timer for delayed_work
workqueue: clean up delayed_work initializers and add missing one
workqueue: make deferrable delayed_work initializer names consistent
workqueue: cosmetic whitespace updates for macro definitions
workqueue: deprecate system_nrt[_freezable]_wq
workqueue: deprecate flush[_delayed]_work_sync()
...
|
|
On an error iov may still have been reallocated and need freeing
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
We set ret to NULL then test it. Remove the bogus test
Signed-off-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: David Howells <dhowells@redhat.com>
|
|
- Replace key_user ->user_ns equality checks with kuid_has_mapping checks.
- Use from_kuid to generate key descriptions
- Use kuid_t and kgid_t and the associated helpers instead of uid_t and gid_t
- Avoid potential problems with file descriptor passing by displaying
keys in the user namespace of the opener of key status proc files.
Cc: linux-security-module@vger.kernel.org
Cc: keyrings@linux-nfs.org
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
|
|
This reverts commit d35abdb28824cf74f0a106a0f9c6f3ff700a35bf.
task_lock() was added to ensure exit_mm() and thus exit_task_work() is
not possible before task_work_add().
This is wrong, task_lock() must not be nested with write_lock(tasklist).
And this is no longer needed, task_work_add() now fails if it is called
after exit_task_work().
Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Oleg Nesterov <oleg@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: http://lkml.kernel.org/r/20120826191214.GA4231@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Give the key type the opportunity to preparse the payload prior to the
instantiation and update routines being called. This is done with the
provision of two new key type operations:
int (*preparse)(struct key_preparsed_payload *prep);
void (*free_preparse)(struct key_preparsed_payload *prep);
If the first operation is present, then it is called before key creation (in
the add/update case) or before the key semaphore is taken (in the update and
instantiate cases). The second operation is called to clean up if the first
was called.
preparse() is given the opportunity to fill in the following structure:
struct key_preparsed_payload {
char *description;
void *type_data[2];
void *payload;
const void *data;
size_t datalen;
size_t quotalen;
};
Before the preparser is called, the first three fields will have been cleared,
the payload pointer and size will be stored in data and datalen and the default
quota size from the key_type struct will be stored into quotalen.
The preparser may parse the payload in any way it likes and may store data in
the type_data[] and payload fields for use by the instantiate() and update()
ops.
The preparser may also propose a description for the key by attaching it as a
string to the description field. This can be used by passing a NULL or ""
description to the add_key() system call or the key_create_or_update()
function. This cannot work with request_key() as that required the description
to tell the upcall about the key to be created.
This, for example permits keys that store PGP public keys to generate their own
name from the user ID and public key fingerprint in the key.
The instantiate() and update() operations are then modified to look like this:
int (*instantiate)(struct key *key, struct key_preparsed_payload *prep);
int (*update)(struct key *key, struct key_preparsed_payload *prep);
and the new payload data is passed in *prep, whether or not it was preparsed.
Signed-off-by: David Howells <dhowells@redhat.com>
|