openwrt/staging/blogic.git
13 years agoSUNRPC: Replace svc_addr_u by sockaddr_storage
Mi Jinlong [Tue, 30 Aug 2011 09:18:41 +0000 (17:18 +0800)]
SUNRPC: Replace svc_addr_u by sockaddr_storage

For IPv6 local address, lockd can not callback to client for
missing scope id when binding address at inet6_bind:

 324       if (addr_type & IPV6_ADDR_LINKLOCAL) {
 325               if (addr_len >= sizeof(struct sockaddr_in6) &&
 326                   addr->sin6_scope_id) {
 327                       /* Override any existing binding, if another one
 328                        * is supplied by user.
 329                        */
 330                       sk->sk_bound_dev_if = addr->sin6_scope_id;
 331               }
 332
 333               /* Binding to link-local address requires an interface */
 334               if (!sk->sk_bound_dev_if) {
 335                       err = -EINVAL;
 336                       goto out_unlock;
 337               }

Replacing svc_addr_u by sockaddr_storage, let rqstp->rq_daddr contains more info
besides address.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Mi Jinlong <mijinlong@cn.fujitsu.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agoNFSD: Add a cache for fs_locations information
Trond Myklebust [Mon, 12 Sep 2011 23:37:26 +0000 (19:37 -0400)]
NFSD: Add a cache for fs_locations information

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
[ cel: since this is server-side, use nfsd4_ prefix instead of nfs4_ prefix. ]
[ cel: implement S_ISVTX filter in bfields-normal form ]
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agoNFSD: Remove the ex_pathname field from struct svc_export
Trond Myklebust [Mon, 12 Sep 2011 23:37:16 +0000 (19:37 -0400)]
NFSD: Remove the ex_pathname field from struct svc_export

There are no more users...

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agoNFSD: Cleanup for nfsd4_path()
Trond Myklebust [Mon, 12 Sep 2011 23:37:06 +0000 (19:37 -0400)]
NFSD: Cleanup for nfsd4_path()

The current code is sort of hackish in that it assumes a referral is always
matched to an export. When we add support for junctions that may not be the
case.
We can replace nfsd4_path() with a function that encodes the components
directly from the dentries. Since nfsd4_path is currently the only user of
the 'ex_pathname' field in struct svc_export, this has the added benefit
of allowing us to get rid of that.

Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: better stateid hashing
J. Bruce Fields [Sun, 11 Sep 2011 17:48:41 +0000 (13:48 -0400)]
nfsd4: better stateid hashing

First, we shouldn't care here about the structure of the opaque part of
the stateid.  Second, this hash is really dumb.  (I'm not sure the
replacement is much better, though--to look at it another patch.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: use deleg changes to cleanup preprocess_stateid_op
J. Bruce Fields [Fri, 9 Sep 2011 15:54:57 +0000 (11:54 -0400)]
nfsd4: use deleg changes to cleanup preprocess_stateid_op

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: fix test_stateid for delegation stateid's
J. Bruce Fields [Fri, 9 Sep 2011 15:26:58 +0000 (11:26 -0400)]
nfsd4: fix test_stateid for delegation stateid's

Test_stateid should handle delegation stateid's as well.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: hash deleg stateid's like any other
J. Bruce Fields [Fri, 9 Sep 2011 13:06:12 +0000 (09:06 -0400)]
nfsd4: hash deleg stateid's like any other

It's simpler to look up delegation stateid's in the same hash table as
any other stateid.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: share common stid-hashing helper function
J. Bruce Fields [Thu, 8 Sep 2011 16:16:03 +0000 (12:16 -0400)]
nfsd4: share common stid-hashing helper function

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: add common dl_stid field to delegation
J. Bruce Fields [Thu, 8 Sep 2011 16:07:44 +0000 (12:07 -0400)]
nfsd4: add common dl_stid field to delegation

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: move some of nfs4_stateid into a separate structure
J. Bruce Fields [Wed, 7 Sep 2011 20:06:42 +0000 (16:06 -0400)]
nfsd4: move some of nfs4_stateid into a separate structure

We want delegations to share more with open/lock stateid's, so first
we'll pull out some of the common stuff we want to share.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: remove redundant stateid initialization
J. Bruce Fields [Wed, 7 Sep 2011 16:54:06 +0000 (12:54 -0400)]
nfsd4: remove redundant stateid initialization

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: rename init_stateid
J. Bruce Fields [Tue, 6 Sep 2011 21:20:34 +0000 (17:20 -0400)]
nfsd4: rename init_stateid

Note this is actually open-stateid specific.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: pass around typemask instead of flags
J. Bruce Fields [Tue, 6 Sep 2011 19:50:21 +0000 (15:50 -0400)]
nfsd4: pass around typemask instead of flags

We're only using those flags to choose lock or open stateid's at this
point.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: split preprocess_seqid, cleanup
J. Bruce Fields [Tue, 6 Sep 2011 19:19:46 +0000 (15:19 -0400)]
nfsd4: split preprocess_seqid, cleanup

Move most of this into helper functions.  Also move the non-CONFIRM case
into caller, providing a helper function for that purpose.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: split up find_stateid
J. Bruce Fields [Tue, 6 Sep 2011 20:48:57 +0000 (16:48 -0400)]
nfsd4: split up find_stateid

Minor cleanup.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: rearrange to avoid a forward reference
J. Bruce Fields [Tue, 6 Sep 2011 18:56:09 +0000 (14:56 -0400)]
nfsd4: rearrange to avoid a forward reference

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: split out some free_generic_stateid code
J. Bruce Fields [Tue, 6 Sep 2011 18:50:49 +0000 (14:50 -0400)]
nfsd4: split out some free_generic_stateid code

We'll use this elsewhere.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: split stateowners into open and lockowners
J. Bruce Fields [Sun, 31 Jul 2011 03:33:59 +0000 (23:33 -0400)]
nfsd4: split stateowners into open and lockowners

The stateowner has some fields that only make sense for openowners, and
some that only make sense for lockowners, and I find it a lot clearer if
those are separated out.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: move CLOSE_STATE special case to caller
J. Bruce Fields [Fri, 2 Sep 2011 20:36:49 +0000 (16:36 -0400)]
nfsd4: move CLOSE_STATE special case to caller

Move the CLOSE_STATE case into the unique caller that cares about it
rather than putting it in preprocess_seqid_op.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: move double-confirm test to open_confirm
J. Bruce Fields [Fri, 2 Sep 2011 16:19:43 +0000 (12:19 -0400)]
nfsd4: move double-confirm test to open_confirm

I don't see the point of having this check in nfs4_preprocess_seqid_op()
when it's only needed by the one caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: simplify check_open logic
J. Bruce Fields [Fri, 2 Sep 2011 16:08:20 +0000 (12:08 -0400)]
nfsd4: simplify check_open logic

Sometimes the single-exit style is good, sometimes it's unnecessarily
convoluted....

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: share common seqid checks
J. Bruce Fields [Fri, 2 Sep 2011 13:03:37 +0000 (09:03 -0400)]
nfsd4: share common seqid checks

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: eliminate unused lt_stateowner
J. Bruce Fields [Thu, 1 Sep 2011 15:31:45 +0000 (11:31 -0400)]
nfsd4: eliminate unused lt_stateowner

This is used only as a local variable.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: drop most stateowner refcounting
J. Bruce Fields [Wed, 31 Aug 2011 02:15:47 +0000 (22:15 -0400)]
nfsd4: drop most stateowner refcounting

Maybe we'll bring it back some day, but we don't have much real use for
it now.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: eliminate impossible open replay case
J. Bruce Fields [Thu, 25 Aug 2011 22:17:52 +0000 (18:17 -0400)]
nfsd4: eliminate impossible open replay case

If open fails with any error other than nfserr_replay_me, then the main
nfsd4_proc_compound() loop continues unconditionally to
nfsd4_encode_operation(), which will always call encode_seqid_op_tail.
Thus the condition we check for here does not occur.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: extend state lock over seqid replay logic
J. Bruce Fields [Tue, 30 Aug 2011 21:02:48 +0000 (17:02 -0400)]
nfsd4: extend state lock over seqid replay logic

There are currently a couple races in the seqid replay code: a
retransmission could come while we're still encoding the original reply,
or a new seqid-mutating call could come as we're encoding a replay.

So, extend the state lock over the encoding (both encoding of a replayed
reply and caching of the original encoded reply).

I really hate doing this, and previously added the stateowner
reference-counting code to avoid it (which was insufficient)--but I
don't see a less complicated alternative at the moment.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: cleanup seqid op stateowner usage
J. Bruce Fields [Wed, 24 Aug 2011 16:45:03 +0000 (12:45 -0400)]
nfsd4: cleanup seqid op stateowner usage

Now that the replay owner is in the cstate we can remove it from a lot
of other individual operations and further simplify
nfs4_preprocess_seqid_op().

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: centralize handling of replay owners
J. Bruce Fields [Wed, 24 Aug 2011 16:27:31 +0000 (12:27 -0400)]
nfsd4: centralize handling of replay owners

Set the stateowner associated with a replay in one spot in
nfs4_preprocess_seqid_op() and keep it in cstate.  This allows removing
a few lines of boilerplate from all the nfs4_preprocess_seqid_op()
callers.

Also turn ENCODE_SEQID_OP_TAIL into a function while we're here.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: make delegation stateid's seqid start at 1
J. Bruce Fields [Wed, 31 Aug 2011 19:47:21 +0000 (15:47 -0400)]
nfsd4: make delegation stateid's seqid start at 1

Thanks to Casey for reminding me that 5661 gives a special meaning to a
value of 0 in the stateid's seqid field, so all stateid's should start
out with si_generation 1.  We were doing that in the open and lock
cases for minorversion 1, but not for the delegation stateid, and not
for openstateid's with v4.0.

It doesn't *really* matter much for v4.0 or for delegation stateid's
(which never get the seqid field incremented), but we may as well do the
same for all of them.

Reported-by: Casey Bodley <cbodley@citi.umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: simplify stateid generation code, fix wraparound
J. Bruce Fields [Tue, 23 Aug 2011 15:03:29 +0000 (11:03 -0400)]
nfsd4: simplify stateid generation code, fix wraparound

Follow the recommendation from rfc3530bis for stateid generation number
wraparound, simplify some code, and fix or remove incorrect comments.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: consolidate lock & open stateid tables
J. Bruce Fields [Mon, 22 Aug 2011 22:01:43 +0000 (18:01 -0400)]
nfsd4: consolidate lock & open stateid tables

There's no reason to have two separate hash tables for open and lock
stateid's.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: simplify distinguishing lock & open stateid's
J. Bruce Fields [Wed, 31 Aug 2011 19:25:46 +0000 (15:25 -0400)]
nfsd4: simplify distinguishing lock & open stateid's

The trick free_stateid is using is a little cheesy, and we'll have more
uses for this field later.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: remove typoed replay field
J. Bruce Fields [Mon, 29 Aug 2011 14:36:11 +0000 (10:36 -0400)]
nfsd4: remove typoed replay field

Wow, I wonder how long that typo's been there.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: fix off-by-one-error in SEQUENCE reply
J. Bruce Fields [Wed, 31 Aug 2011 19:39:30 +0000 (15:39 -0400)]
nfsd4: fix off-by-one-error in SEQUENCE reply

The values here represent highest slotid numbers.  Since slotid's are
numbered starting from zero, the highest should be one less than the
number of slots.

Reported-by: Rick Macklem <rmacklem@uoguelph.ca>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd: remove include/linux/nfsd/syscall.h
J. Bruce Fields [Fri, 26 Aug 2011 21:22:06 +0000 (17:22 -0400)]
nfsd: remove include/linux/nfsd/syscall.h

We don't need this any more.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: remove redundant is_open_owner check
J. Bruce Fields [Tue, 23 Aug 2011 19:17:50 +0000 (15:17 -0400)]
nfsd4: remove redundant is_open_owner check

When called with OPEN_STATE, preprocess_seqid_op only returns an open
stateid, hence only an open owner.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: get lock checks out of preprocess_seqid_op
J. Bruce Fields [Mon, 22 Aug 2011 17:13:31 +0000 (13:13 -0400)]
nfsd4: get lock checks out of preprocess_seqid_op

We've got some lock-specific code here in nfs4_preprocess_seqid_op which
is only used by nfsd4_lock().  Move it to the caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: simplify lock openmode check
J. Bruce Fields [Mon, 22 Aug 2011 15:39:07 +0000 (11:39 -0400)]
nfsd4: simplify lock openmode check

Note that the special handling for the lock stateid case is already done
by nfs4_check_openmode() (as of 02921914170e3b7fea1cd82dac9713685d2de5e2
"nfsd4: fix openmode checking on IO using lock stateid") so we no longer
need these two cases in the caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: cleanup and consolidate seqid_mutating_err
J. Bruce Fields [Tue, 23 Aug 2011 19:43:04 +0000 (15:43 -0400)]
nfsd4: cleanup and consolidate seqid_mutating_err

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: remove HAS_SESSION
J. Bruce Fields [Mon, 22 Aug 2011 14:07:12 +0000 (10:07 -0400)]
nfsd4: remove HAS_SESSION

This flag doesn't really buy us anything.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: cleanup lock/stateowner initialization
J. Bruce Fields [Fri, 12 Aug 2011 13:42:57 +0000 (09:42 -0400)]
nfsd4: cleanup lock/stateowner initialization

Share some common code, stop doing silly things like initializing a list
head immediately before adding it to a list, etc.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: name openowner data structures more clearly
J. Bruce Fields [Thu, 11 Aug 2011 22:50:18 +0000 (18:50 -0400)]
nfsd4: name openowner data structures more clearly

These appear to be generic (for both open and lock owners), but they're
actually just for open owners.  This has confused me more than once.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: replace some macros by functions
J. Bruce Fields [Sun, 31 Jul 2011 03:46:29 +0000 (23:46 -0400)]
nfsd4: replace some macros by functions

For all the usual reasons.  (Type safety, readability.)

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: stop using nfserr_resource for transitory errors
J. Bruce Fields [Wed, 10 Aug 2011 23:07:33 +0000 (19:07 -0400)]
nfsd4: stop using nfserr_resource for transitory errors

The server is returning nfserr_resource for both permanent errors and
for errors (like allocation failures) that might be resolved by retrying
later.  Save nfserr_resource for the former and use delay/jukebox for
the latter.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: fix failure to end nfsd4 grace period
Boaz Harrosh [Sat, 13 Aug 2011 00:30:12 +0000 (17:30 -0700)]
nfsd4: fix failure to end nfsd4 grace period

Even if we fail to write a recovery record, we should still mark the
client as having acquired its first state.  Otherwise we leave 4.1
clients with indefinite ERR_GRACE returns.

However, an inability to write stable storage records may cause failures
of reboot recovery, and the problem should still be brought to the
server administrator's attention.

So, make sure the error is logged.

These errors shouldn't normally be triggered on a corectly functioning
server--this isn't a case where a misconfigured client could spam the
logs.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: simplify recovery dir setting
J. Bruce Fields [Sat, 27 Aug 2011 00:40:28 +0000 (20:40 -0400)]
nfsd4: simplify recovery dir setting

Move around some of this code, simplify a bit.

Reviewed-by: Boaz Harrosh <bharrosh@panasas.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd: prettify NFSD_MAY_* flag definitions
J. Bruce Fields [Thu, 25 Aug 2011 18:23:39 +0000 (14:23 -0400)]
nfsd: prettify NFSD_MAY_* flag definitions

Acked-by: Jim Rees <rees@umich.edu>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: permit read opens of executable-only files
J. Bruce Fields [Thu, 25 Aug 2011 14:48:39 +0000 (10:48 -0400)]
nfsd4: permit read opens of executable-only files

A client that wants to execute a file must be able to read it.  Read
opens over nfs are therefore implicitly allowed for executable files
even when those files are not readable.

NFSv2/v3 get this right by using a passed-in NFSD_MAY_OWNER_OVERRIDE on
read requests, but NFSv4 has gotten this wrong ever since
dc730e173785e29b297aa605786c94adaffe2544 "nfsd4: fix owner-override on
open", when we realized that the file owner shouldn't override
permissions on non-reclaim NFSv4 opens.

So we can't use NFSD_MAY_OWNER_OVERRIDE to tell nfsd_permission to allow
reads of executable files.

So, do the same thing we do whenever we encounter another weird NFS
permission nit: define yet another NFSD_MAY_* flag.

The industry's future standardization on 128-bit processors will be
motivated primarily by the need for integers with enough bits for all
the NFSD_MAY_* flags.

Reported-by: Leonardo Borda <leonardoborda@gmail.com>
Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agoRemove include/linux/nfsd/const.h
J. Bruce Fields [Fri, 19 Aug 2011 15:38:52 +0000 (11:38 -0400)]
Remove include/linux/nfsd/const.h

Userspace shouldn't have a use for these constants.  Nothing here is
used outside fs/nfsd.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd: remove unused defines
J. Bruce Fields [Mon, 8 Aug 2011 11:09:03 +0000 (07:09 -0400)]
nfsd: remove unused defines

At least one of these is actually wrong anyway.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: it's OK to return nfserr_symlink
J. Bruce Fields [Mon, 15 Aug 2011 22:39:32 +0000 (18:39 -0400)]
nfsd4: it's OK to return nfserr_symlink

The nfsd4 code has a bunch of special exceptions for error returns which
map nfserr_symlink to other errors.

In fact, the spec makes it clear that nfserr_symlink is to be preferred
over less specific errors where possible.

The patch that introduced it back in 2.6.4 is "kNFSd: correct symlink
related error returns.", which claims that these special exceptions are
represent an NFSv4 break from v2/v3 tradition--when in fact the symlink
error was introduced with v4.

I suspect what happened was pynfs tests were written that were overly
faithful to the (known-incomplete) rfc3530 error return lists, and then
code was fixed up mindlessly to make the tests pass.

Delete these unnecessary exceptions.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: fix incorrect comment in nfsd4_set_nfs4_acl
J. Bruce Fields [Mon, 15 Aug 2011 20:57:07 +0000 (16:57 -0400)]
nfsd4: fix incorrect comment in nfsd4_set_nfs4_acl

Zero means "I don't care what kind of file this is".  And that's
probably what we want--acls are also settable at least on directories,
and if the filesystem doesn't want them on other objects, leave it to it
to complain.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd: clean up nfsd_mode_check()
J. Bruce Fields [Mon, 15 Aug 2011 21:04:19 +0000 (17:04 -0400)]
nfsd: clean up nfsd_mode_check()

Add some more comments, simplify logic, do & S_IFMT just once, name
"type" more helpfully.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd: open-code special directory-hardlink check
J. Bruce Fields [Mon, 15 Aug 2011 20:59:55 +0000 (16:59 -0400)]
nfsd: open-code special directory-hardlink check

We allow the fh_verify caller to specify that any object *except* those
of a given type is allowed, by passing a negative type.  But only one
caller actually uses it.  Open-code that check in the one caller.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: clean up S_IS -> NF4 file type mapping
J. Bruce Fields [Mon, 15 Aug 2011 15:49:30 +0000 (11:49 -0400)]
nfsd4: clean up S_IS -> NF4 file type mapping

A slightly unconventional approach to make the code more compact I could
live with, but let's give the poor reader *some* chance.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agosunrpc: use better NUMA affinities
Eric Dumazet [Thu, 28 Jul 2011 18:04:09 +0000 (20:04 +0200)]
sunrpc: use better NUMA affinities

Use NUMA aware allocations to reduce latencies and increase throughput.

sunrpc kthreads can use kthread_create_on_node() if pool_mode is
"percpu" or "pernode", and svc_prepare_thread()/svc_init_buffer() can
also take into account NUMA node affinity for memory allocations.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
CC: "J. Bruce Fields" <bfields@fieldses.org>
CC: Neil Brown <neilb@suse.de>
CC: David Miller <davem@davemloft.net>
Reviewed-by: Greg Banks <gnb@fastmail.fm>
[bfields@redhat.com: fix up caller nfs41_callback_up]
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agolocks: setlease cleanup
J. Bruce Fields [Fri, 19 Aug 2011 14:59:49 +0000 (10:59 -0400)]
locks: setlease cleanup

There's an incorrect comment here.  Also clean up the logic: the
"rdlease" and "wrlease" locals are confusingly named, and don't really
add anything since we can make a decision as soon as we hit one of these
cases.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agolocks: fix tracking of inprogress lease breaks
J. Bruce Fields [Tue, 26 Jul 2011 22:25:49 +0000 (18:25 -0400)]
locks: fix tracking of inprogress lease breaks

We currently use a bit in fl_flags to record whether a lease is being
broken, and set fl_type to the type (RDLCK or UNLCK) that it will
eventually have.  This means that once the lease break starts, we forget
what the lease's type *used* to be.  Breaking a read lease will then
result in blocking read opens, even though there's no conflict--because
the lease type is now F_UNLCK and we can no longer tell whether it was
previously a read or write lease.

So, instead keep fl_type as the original type (the type which we
enforce), and keep track of whether we're unlocking or merely
downgrading by replacing the single FL_INPROGRESS flag by
FL_UNLOCK_PENDING and FL_DOWNGRADE_PENDING flags.

To get this right we also need to track separate downgrade and break
times, to handle the case where a write-leased file gets conflicting
opens first for read, then later for write.

(I first considered just eliminating the downgrade behavior
completely--nfsv4 doesn't need it, and nobody as far as I can tell
actually uses it currently--but Jeremy Allison tells me that Windows
oplocks do behave this way, so Samba will probably use this some day.)

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agolocks: move F_INPROGRESS from fl_type to fl_flags field
J. Bruce Fields [Tue, 26 Jul 2011 20:28:29 +0000 (16:28 -0400)]
locks: move F_INPROGRESS from fl_type to fl_flags field

F_INPROGRESS isn't exposed to userspace.  To me it makes more sense in
fl_flags....

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agolocks: minor lease cleanup
J. Bruce Fields [Wed, 27 Jul 2011 00:10:51 +0000 (20:10 -0400)]
locks: minor lease cleanup

Use a helper function, to simplify upcoming changes.

Reviewed-by: Jeff Layton <jlayton@redhat.com>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: return nfserr_symlink on v4 OPEN of non-regular file
J. Bruce Fields [Mon, 15 Aug 2011 20:55:02 +0000 (16:55 -0400)]
nfsd4: return nfserr_symlink on v4 OPEN of non-regular file

Without this, an attempt to open a device special file without first
stat'ing it will fail.

Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: fix seqid_mutating_error
J. Bruce Fields [Wed, 10 Aug 2011 23:16:22 +0000 (19:16 -0400)]
nfsd4: fix seqid_mutating_error

The set of errors here does *not* agree with the set of errors specified
in the rfc!

While we're there, turn this macros into a function, for the usual
reasons, and move it to the one place where it's actually used.

Cc: stable@kernel.org
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agonfsd4: Remove check for a 32-bit cookie in nfsd4_readdir()
Bernd Schubert [Mon, 8 Aug 2011 15:38:08 +0000 (17:38 +0200)]
nfsd4: Remove check for a 32-bit cookie in nfsd4_readdir()

Fan Yong <yong.fan@whamcloud.com> noticed setting
FMODE_32bithash wouldn't work with nfsd v4, as
nfsd4_readdir() checks for 32 bit cookies. However, according to RFC 3530
cookies have a 64 bit type and cookies are also defined as u64 in
'struct nfsd4_readdir'. So remove the test for >32-bit values.

Cc: stable@kernel.org
Signed-off-by: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>
Signed-off-by: J. Bruce Fields <bfields@redhat.com>
13 years agoLinux 3.1-rc1
Linus Torvalds [Mon, 8 Aug 2011 01:23:30 +0000 (18:23 -0700)]
Linux 3.1-rc1

13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Linus Torvalds [Sun, 7 Aug 2011 22:52:19 +0000 (15:52 -0700)]
Merge git://git./linux/kernel/git/davem/sparc

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Fix build with DEBUG_PAGEALLOC enabled.

13 years agosh: Fix boot crash related to SCI
Rafael J. Wysocki [Sun, 7 Aug 2011 22:26:50 +0000 (00:26 +0200)]
sh: Fix boot crash related to SCI

Commit d006199e72a9 ("serial: sh-sci: Regtype probing doesn't need to be
fatal.") made sci_init_single() return when sci_probe_regmap() succeeds,
although it should return when sci_probe_regmap() fails.  This causes
systems using the serial sh-sci driver to crash during boot.

Fix the problem by using the right return condition.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoarm: remove stale export of 'sha_transform'
Linus Torvalds [Sun, 7 Aug 2011 22:49:11 +0000 (15:49 -0700)]
arm: remove stale export of 'sha_transform'

The generic library code already exports the generic function, this was
left-over from the ARM-specific version that just got removed.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoarm: remove "optimized" SHA1 routines
Linus Torvalds [Sun, 7 Aug 2011 21:07:03 +0000 (14:07 -0700)]
arm: remove "optimized" SHA1 routines

Since commit 1eb19a12bd22 ("lib/sha1: use the git implementation of
SHA-1"), the ARM SHA1 routines no longer work.  The reason? They
depended on the larger 320-byte workspace, and now the sha1 workspace is
just 16 words (64 bytes).  So the assembly version would overwrite the
stack randomly.

The optimized asm version is also probably slower than the new improved
C version, so there's no reason to keep it around.  At least that was
the case in git, where what appears to be the same assembly language
version was removed two years ago because the optimized C BLK_SHA1 code
was faster.

Reported-and-tested-by: Joachim Eastwood <manabian@gmail.com>
Cc: Andreas Schwab <schwab@linux-m68k.org>
Cc: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agofix rcu annotations noise in cred.h
Al Viro [Sun, 7 Aug 2011 17:55:11 +0000 (18:55 +0100)]
fix rcu annotations noise in cred.h

task->cred is declared as __rcu, and access to other tasks' ->cred is,
indeed, protected.  Access to current->cred does not need rcu_dereference()
at all, since only the task itself can change its ->cred.  sparse, of
course, has no way of knowing that...

Add force-cast in current_cred(), make current_fsuid() et.al. use it.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agovfs: rename 'do_follow_link' to 'should_follow_link'
Linus Torvalds [Sun, 7 Aug 2011 16:53:20 +0000 (09:53 -0700)]
vfs: rename 'do_follow_link' to 'should_follow_link'

Al points out that the do_follow_link() helper function really is
misnamed - it's about whether we should try to follow a symlink or not,
not about actually doing the following.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoFix POSIX ACL permission check
Ari Savolainen [Sat, 6 Aug 2011 16:43:07 +0000 (19:43 +0300)]
Fix POSIX ACL permission check

After commit 3567866bf261: "RCUify freeing acls, let check_acl() go ahead in
RCU mode if acl is cached" posix_acl_permission is being called with an
unsupported flag and the permission check fails. This patch fixes the issue.

Signed-off-by: Ari Savolainen <ari.m.savolainen@gmail.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
13 years agoMerge branch 'for-linus' of git://git.open-osd.org/linux-open-osd
Linus Torvalds [Sun, 7 Aug 2011 05:56:03 +0000 (22:56 -0700)]
Merge branch 'for-linus' of git://git.open-osd.org/linux-open-osd

* 'for-linus' of git://git.open-osd.org/linux-open-osd:
  ore: Make ore its own module
  exofs: Rename raid engine from exofs/ios.c => ore
  exofs: ios: Move to a per inode components & device-table
  exofs: Move exofs specific osd operations out of ios.c
  exofs: Add offset/length to exofs_get_io_state
  exofs: Fix truncate for the raid-groups case
  exofs: Small cleanup of exofs_fill_super
  exofs: BUG: Avoid sbi realloc
  exofs: Remove pnfs-osd private definitions
  nfs_xdr: Move nfs4_string definition out of #ifdef CONFIG_NFS_V4

13 years agovfs: optimize inode cache access patterns
Linus Torvalds [Sun, 7 Aug 2011 05:45:50 +0000 (22:45 -0700)]
vfs: optimize inode cache access patterns

The inode structure layout is largely random, and some of the vfs paths
really do care.  The path lookup in particular is already quite D$
intensive, and profiles show that accessing the 'inode->i_op->xyz'
fields is quite costly.

We already optimized the dcache to not unnecessarily load the d_op
structure for members that are often NULL using the DCACHE_OP_xyz bits
in dentry->d_flags, and this does something very similar for the inode
ops that are used during pathname lookup.

It also re-orders the fields so that the fields accessed by 'stat' are
together at the beginning of the inode structure, and roughly in the
order accessed.

The effect of this seems to be in the 1-2% range for an empty kernel
"make -j" run (which is fairly kernel-intensive, mostly in filename
lookup), so it's visible.  The numbers are fairly noisy, though, and
likely depend a lot on exact microarchitecture.  So there's more tuning
to be done.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agovfs: renumber DCACHE_xyz flags, remove some stale ones
Linus Torvalds [Sun, 7 Aug 2011 05:41:50 +0000 (22:41 -0700)]
vfs: renumber DCACHE_xyz flags, remove some stale ones

Gcc tends to generate better code with small integers, including the
DCACHE_xyz flag tests - so move the common ones to be first in the list.
Also just remove the unused DCACHE_INOTIFY_PARENT_WATCHED and
DCACHE_AUTOFS_PENDING values, their users no longer exists in the source
tree.

And add a "unlikely()" to the DCACHE_OP_COMPARE test, since we want the
common case to be a nice straight-line fall-through.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 7 Aug 2011 05:12:37 +0000 (22:12 -0700)]
Merge git://git./linux/kernel/git/davem/net

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net: Compute protocol sequence numbers and fragment IDs using MD5.
  crypto: Move md5_transform to lib/md5.c

13 years agoore: Make ore its own module
Boaz Harrosh [Sun, 7 Aug 2011 02:22:06 +0000 (19:22 -0700)]
ore: Make ore its own module

Export everything from ore need exporting. Change Kbuild and Kconfig
to build ore.ko as an independent module. Import ore from exofs

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
13 years agoexofs: Rename raid engine from exofs/ios.c => ore
Boaz Harrosh [Sun, 7 Aug 2011 02:26:31 +0000 (19:26 -0700)]
exofs: Rename raid engine from exofs/ios.c => ore

ORE stands for "Objects Raid Engine"

This patch is a mechanical rename of everything that was in ios.c
and its API declaration to an ore.c and an osd_ore.h header. The ore
engine will later be used by the pnfs objects layout driver.

* File ios.c => ore.c

* Declaration of types and API are moved from exofs.h to a new
  osd_ore.h

* All used types are prefixed by ore_ from their exofs_ name.

* Shift includes from exofs.h to osd_ore.h so osd_ore.h is
  independent, include it from exofs.h.

Other than a pure rename there are no other changes. Next patch
will move the ore into it's own module and will export the API
to be used by exofs and later the layout driver

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
13 years agoexofs: ios: Move to a per inode components & device-table
Boaz Harrosh [Fri, 5 Aug 2011 22:06:04 +0000 (15:06 -0700)]
exofs: ios: Move to a per inode components & device-table

Exofs raid engine was saving on memory space by having a single layout-info,
single pid, and a single device-table, global to the filesystem. Then passing
a credential and object_id info at the io_state level, private for each
inode. It would also devise this contraption of rotating the device table
view for each inode->ino to spread out the device usage.

This is not compatible with the pnfs-objects standard, demanding that
each inode can have it's own layout-info, device-table, and each object
component it's own pid, oid and creds.

So: Bring exofs raid engine to be usable for generic pnfs-objects use by:

* Define an exofs_comp structure that holds obj_id and credential info.

* Break up exofs_layout struct to an exofs_components structure that holds a
  possible array of exofs_comp and the array of devices + the size of the
  arrays.

* Add a "comps" parameter to get_io_state() that specifies the ids creds
  and device array to use for each IO.

  This enables to keep the layout global, but the device-table view, creds
  and IDs at the inode level. It only adds two 64bit to each inode, since
  some of these members already existed in another form.

* ios raid engine now access layout-info and comps-info through the passed
  pointers. Everything is pre-prepared by caller for generic access of
  these structures and arrays.

At the exofs Level:

* Super block holds an exofs_components struct that holds the device
  array, previously in layout. The devices there are in device-table
  order. The device-array is twice bigger and repeats the device-table
  twice so now each inode's device array can point to a random device
  and have a round-robin view of the table, making it compatible to
  previous exofs versions.

* Each inode has an exofs_components struct that is initialized at
  load time, with it's own view of the device table IDs and creds.
  When doing IO this gets passed to the io_state together with the
  layout.

While preforming this change. Bugs where found where credentials with the
wrong IDs where used to access the different SB objects (super.c). As well
as some dead code. It was never noticed because the target we use does not
check the credentials.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
13 years agoexofs: Move exofs specific osd operations out of ios.c
Boaz Harrosh [Mon, 16 May 2011 12:26:47 +0000 (15:26 +0300)]
exofs: Move exofs specific osd operations out of ios.c

ios.c will be moving to an external library, for use by the
objects-layout-driver. Remove from it some exofs specific functions.

Also g_attr_logical_length is used both by inode.c and ios.c
move definition to the later, to keep it independent

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
13 years agoexofs: Add offset/length to exofs_get_io_state
Boaz Harrosh [Tue, 16 Nov 2010 18:09:58 +0000 (20:09 +0200)]
exofs: Add offset/length to exofs_get_io_state

In future raid code we will need to know the IO offset/length
and if it's a read or write to determine some of the array
sizes we'll need.

So add a new exofs_get_rw_state() API for use when
writeing/reading. All other simple cases are left using the
old way.

The major change to this is that now we need to call
exofs_get_io_state later at inode.c::read_exec and
inode.c::write_exec when we actually know these things. So this
patch is kept separate so I can test things apart from other
changes.

Signed-off-by: Boaz Harrosh <bharrosh@panasas.com>
13 years agonet: Compute protocol sequence numbers and fragment IDs using MD5.
David S. Miller [Thu, 4 Aug 2011 03:50:44 +0000 (20:50 -0700)]
net: Compute protocol sequence numbers and fragment IDs using MD5.

Computers have become a lot faster since we compromised on the
partial MD4 hash which we use currently for performance reasons.

MD5 is a much safer choice, and is inline with both RFC1948 and
other ISS generators (OpenBSD, Solaris, etc.)

Furthermore, only having 24-bits of the sequence number be truly
unpredictable is a very serious limitation.  So the periodic
regeneration and 8-bit counter have been removed.  We compute and
use a full 32-bit sequence number.

For ipv6, DCCP was found to use a 32-bit truncated initial sequence
number (it needs 43-bits) and that is fixed here as well.

Reported-by: Dan Kaminsky <dan@doxpara.com>
Tested-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agocrypto: Move md5_transform to lib/md5.c
David S. Miller [Thu, 4 Aug 2011 02:45:10 +0000 (19:45 -0700)]
crypto: Move md5_transform to lib/md5.c

We are going to use this for TCP/IP sequence number and fragment ID
generation.

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6
Linus Torvalds [Sat, 6 Aug 2011 20:54:36 +0000 (13:54 -0700)]
Merge git://git./linux/kernel/git/sfrench/cifs-2.6

* git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
  cifs: cope with negative dentries in cifs_get_root
  cifs: convert prefixpath delimiters in cifs_build_path_to_root
  CIFS: Fix missing a decrement of inFlight value
  cifs: demote DFS referral lookup errors to cFYI
  Revert "cifs: advertise the right receive buffer size to the server"

13 years agoMerge branch 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspe...
Linus Torvalds [Sat, 6 Aug 2011 20:26:37 +0000 (13:26 -0700)]
Merge branch 'pm-fixes' of git://git./linux/kernel/git/rafael/suspend-2.6

* 'pm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6:
  PM / Runtime: Allow _put_sync() from interrupts-disabled context
  PM / Domains: Fix pm_genpd_poweron()

13 years agoMerge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platf...
Linus Torvalds [Sat, 6 Aug 2011 20:26:15 +0000 (13:26 -0700)]
Merge branch 'for_linus' of git://git./linux/kernel/git/mjg59/platform-drivers-x86

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mjg59/platform-drivers-x86: (38 commits)
  acer-wmi: support Lenovo ideapad S205 wifi switch
  acerhdf.c: spaces in aliased changed to *
  platform-drivers-x86: ideapad-laptop: add missing ideapad_input_exit in ideapad_acpi_add error path
  x86 driver: fix typo in TDP override enabling
  Platform: fix samsung-laptop DMI identification for N150/N210/220/N230
  dell-wmi: Add keys for Dell XPS L502X
  platform-drivers-x86: samsung-q10: make dmi_check_callback return 1
  Platform: Samsung Q10 backlight driver
  platform-drivers-x86: intel_scu_ipc: convert to DEFINE_PCI_DEVICE_TABLE
  platform-drivers-x86: intel_rar_register: convert to DEFINE_PCI_DEVICE_TABLE
  platform-drivers-x86: intel_menlow: add missing return AE_OK for intel_menlow_register_sensor()
  platform-drivers-x86: intel_mid_thermal: fix memory leak
  platform-drivers-x86: msi-wmi: add missing sparse_keymap_free in msi_wmi_init error path
  Samsung Laptop platform driver: support N510
  asus-wmi: add uwb rfkill support
  asus-wmi: add gps rfkill support
  asus-wmi: add CWAP support and clarify the meaning of WAPF bits
  asus-wmi: return proper value in store_cpufv()
  asus-wmi: check for temp1 presence
  asus-wmi: add thermal sensor
  ...

13 years agoMerge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 6 Aug 2011 19:22:30 +0000 (12:22 -0700)]
Merge branch 'stable/bug.fixes' of git://git./linux/kernel/git/konrad/xen

* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set
  xen: Fix misleading WARN message at xen_release_chunk
  xen: Fix printk() format in xen/setup.c
  xen/tracing: it looks like we wanted CONFIG_FTRACE
  xen/self-balloon: Add dependency on tmem.
  xen/balloon: Fix compile errors - missing header files.
  xen/grant: Fix compile warning.
  xen/pciback: remove duplicated #include

13 years agoMerge branch 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux...
Linus Torvalds [Sat, 6 Aug 2011 19:21:19 +0000 (12:21 -0700)]
Merge branch 'release' of git://git./linux/kernel/git/lenb/linux-acpi-2.6

* 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
  Battery: sysfs_remove_battery(): possible circular locking

13 years agosavagedb: Fix typo causing regression in savage4 series video chip detection
John Stanley [Thu, 4 Aug 2011 00:41:00 +0000 (20:41 -0400)]
savagedb: Fix typo causing regression in savage4 series video chip detection

Two additional savage4 variants were added, but the S3_SAVAGE4_SERIES
macro was incompletely modified, resulting in a false positive detection
of a savage4 card regardless of which savage card is actually present.

For non-savage4 series cards, such as a Savage/IX-MV card, this results
in garbled video and/or a hard-hang at boot time.  Fix this by changing
an '||' to an '&&' in the S3_SAVAGE4_SERIES macro.

Signed-off-by: John P. Stanley <jpsinthemix@verizon.net>
Reviewed-by: Tormod Volden <debian.tormod@gmail.com>
[ The macros have incomplete parenthesis too, but whatever ..  -Linus ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agoCodingStyle: Document the exception of not splitting user-visible strings, for grepping
Josh Triplett [Wed, 3 Aug 2011 19:19:07 +0000 (12:19 -0700)]
CodingStyle: Document the exception of not splitting user-visible strings, for grepping

Patch reviewers now recommend not splitting long user-visible strings,
such as printk messages, even if they exceed 80 columns.  This avoids
breaking grep.  However, that recommendation did not actually appear
anywhere in Documentation/CodingStyle.

See, for example, the thread at
  http://news.gmane.org/find-root.php?message_id=%3c1312215262.11635.15.camel%40Joe%2dLaptop%3e

Signed-off-by: Josh Triplett <josh@joshtriplett.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agovfs: show O_CLOEXE bit properly in /proc/<pid>/fdinfo/<fd> files
Linus Torvalds [Sat, 6 Aug 2011 18:51:33 +0000 (11:51 -0700)]
vfs: show O_CLOEXE bit properly in /proc/<pid>/fdinfo/<fd> files

The CLOEXE bit is magical, and for performance (and semantic) reasons we
don't actually maintain it in the file descriptor itself, but in a
separate bit array.  Which means that when we show f_flags, the CLOEXE
status is shown incorrectly: we show the status not as it is now, but as
it was when the file was opened.

Fix that by looking up the bit properly in the 'fdt->close_on_exec' bit
array.

Uli needs this in order to re-implement the pfiles program:

  "For normal file descriptors (not sockets) this was the last piece of
   information which wasn't available.  This is all part of my 'give
   Solaris users no reason to not switch' effort.  I intend to offer the
   code to the util-linux-ng maintainers."

Requested-by: Ulrich Drepper <drepper@akkadia.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agooom_ajd: don't use WARN_ONCE, just use printk_once
Linus Torvalds [Sat, 6 Aug 2011 18:43:08 +0000 (11:43 -0700)]
oom_ajd: don't use WARN_ONCE, just use printk_once

WARN_ONCE() is very annoying, in that it shows the stack trace that we
don't care about at all, and also triggers various user-level "kernel
oopsed" logic that we really don't care about.  And it's not like the
user can do anything about the applications (sshd) in question, it's a
distro issue.

Requested-by: Andi Kleen <andi@firstfloor.org> (and many others)
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agolib/sha1: use the git implementation of SHA-1
Mandeep Singh Baines [Sat, 6 Aug 2011 01:46:27 +0000 (18:46 -0700)]
lib/sha1: use the git implementation of SHA-1

For ChromiumOS, we use SHA-1 to verify the integrity of the root
filesystem.  The speed of the kernel sha-1 implementation has a major
impact on our boot performance.

To improve boot performance, we investigated using the heavily optimized
sha-1 implementation used in git.  With the git sha-1 implementation, we
see a 11.7% improvement in boot time.

10 reboots, remove slowest/fastest.

Before:

  Mean: 6.58 seconds Stdev: 0.14

After (with git sha-1, this patch):

  Mean: 5.89 seconds Stdev: 0.07

The other cool thing about the git SHA-1 implementation is that it only
needs 64 bytes of stack for the workspace while the original kernel
implementation needed 320 bytes.

Signed-off-by: Mandeep Singh Baines <msb@chromium.org>
Cc: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Cc: Nicolas Pitre <nico@cam.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: linux-crypto@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
13 years agosparc: Fix build with DEBUG_PAGEALLOC enabled.
David S. Miller [Sat, 6 Aug 2011 12:26:35 +0000 (05:26 -0700)]
sparc: Fix build with DEBUG_PAGEALLOC enabled.

arch/sparc/mm/init_64.c:1622:22: error: unused variable '__swapper_4m_tsb_phys_patch_end' [-Werror=unused-variable]
arch/sparc/mm/init_64.c:1621:22: error: unused variable '__swapper_4m_tsb_phys_patch' [-Werror=unused-variable]

Signed-off-by: David S. Miller <davem@davemloft.net>
13 years agoMerge branch 'battery' into release
Len Brown [Sat, 6 Aug 2011 02:16:42 +0000 (22:16 -0400)]
Merge branch 'battery' into release

13 years agoBattery: sysfs_remove_battery(): possible circular locking
Sergey Senozhatsky [Fri, 5 Aug 2011 22:34:08 +0000 (01:34 +0300)]
Battery: sysfs_remove_battery(): possible circular locking

Commit 9c921c22a7f33397a6774d7fa076db9b6a0fd669
Author: Lan Tianyu <tianyu.lan@intel.com>

    ACPI / Battery: Resolve the race condition in the sysfs_remove_battery()

fixed BUG https://bugzilla.kernel.org/show_bug.cgi?id=35642 , but as a side
effect made lockdep unhappy with sysfs_remove_battery():

[14818.477168]
[14818.477170] =======================================================
[14818.477200] [ INFO: possible circular locking dependency detected ]
[14818.477221] 3.1.0-dbg-07865-g1280ea8-dirty #668
[14818.477236] -------------------------------------------------------
[14818.477257] s2ram/1599 is trying to acquire lock:
[14818.477276]  (s_active#8){++++.+}, at: [<ffffffff81169147>] sysfs_addrm_finish+0x31/0x5a
[14818.477323]
[14818.477325] but task is already holding lock:
[14818.477350]  (&battery->lock){+.+.+.}, at: [<ffffffffa0047278>] sysfs_remove_battery+0x10/0x4b [battery]
[14818.477395]
[14818.477397] which lock already depends on the new lock.
[14818.477399]
[..]
[14818.479121] stack backtrace:
[14818.479148] Pid: 1599, comm: s2ram Not tainted 3.1.0-dbg-07865-g1280ea8-dirty #668
[14818.479175] Call Trace:
[14818.479198]  [<ffffffff814828c3>] print_circular_bug+0x293/0x2a4
[14818.479228]  [<ffffffff81070cb5>] __lock_acquire+0xfe4/0x164b
[14818.479260]  [<ffffffff81169147>] ? sysfs_addrm_finish+0x31/0x5a
[14818.479288]  [<ffffffff810718d2>] lock_acquire+0x138/0x1ac
[14818.479316]  [<ffffffff81169147>] ? sysfs_addrm_finish+0x31/0x5a
[14818.479345]  [<ffffffff81168a79>] sysfs_deactivate+0x9b/0xec
[14818.479373]  [<ffffffff81169147>] ? sysfs_addrm_finish+0x31/0x5a
[14818.479405]  [<ffffffff81169147>] sysfs_addrm_finish+0x31/0x5a
[14818.479433]  [<ffffffff81167bc5>] sysfs_hash_and_remove+0x54/0x77
[14818.479461]  [<ffffffff811681b9>] sysfs_remove_file+0x12/0x14
[14818.479488]  [<ffffffff81385bf8>] device_remove_file+0x12/0x14
[14818.479516]  [<ffffffff81386504>] device_del+0x119/0x17c
[14818.479542]  [<ffffffff81386575>] device_unregister+0xe/0x1a
[14818.479570]  [<ffffffff813c6ef9>] power_supply_unregister+0x23/0x27
[14818.479601]  [<ffffffffa004729c>] sysfs_remove_battery+0x34/0x4b [battery]
[14818.479632]  [<ffffffffa004778f>] battery_notify+0x2c/0x3a [battery]
[14818.479662]  [<ffffffff8148fe82>] notifier_call_chain+0x74/0xa1
[14818.479692]  [<ffffffff810624b4>] __blocking_notifier_call_chain+0x6c/0x89
[14818.479722]  [<ffffffff810624e0>] blocking_notifier_call_chain+0xf/0x11
[14818.479751]  [<ffffffff8107e40e>] pm_notifier_call_chain+0x15/0x27
[14818.479770]  [<ffffffff8107ee1a>] enter_state+0xa7/0xd5
[14818.479782]  [<ffffffff8107e341>] state_store+0xaa/0xc0
[14818.479795]  [<ffffffff8107e297>] ? pm_async_store+0x45/0x45
[14818.479807]  [<ffffffff81248837>] kobj_attr_store+0x17/0x19
[14818.479820]  [<ffffffff81167e27>] sysfs_write_file+0x103/0x13f
[14818.479834]  [<ffffffff81109037>] vfs_write+0xad/0x13d
[14818.479847]  [<ffffffff811092b2>] sys_write+0x45/0x6c
[14818.479860]  [<ffffffff81492f92>] system_call_fastpath+0x16/0x1b

This patch introduces separate lock to struct acpi_battery to
grab in sysfs_remove_battery() instead of battery->lock.
So fix by Lan Tianyu is still there, we just grab independent lock.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Tested-by: Lan Tianyu <tianyu.lan@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
13 years agoPM / Runtime: Allow _put_sync() from interrupts-disabled context
Kevin Hilman [Fri, 5 Aug 2011 19:45:20 +0000 (21:45 +0200)]
PM / Runtime: Allow _put_sync() from interrupts-disabled context

Currently the use of pm_runtime_put_sync() is not safe from
interrupts-disabled context because rpm_idle() will release the
spinlock and enable interrupts for the idle callbacks.  This enables
interrupts during a time where interrupts were expected to be
disabled, and can have strange side effects on drivers that expected
interrupts to be disabled.

This is not a bug since the documentation clearly states that only
_put_sync_suspend() is safe in IRQ-safe mode.

However, pm_runtime_put_sync() could be made safe when in IRQ-safe
mode by releasing the spinlock but not re-enabling interrupts, which
is what this patch aims to do.

Problem was found when using some buggy drivers that set
pm_runtime_irq_safe() and used _put_sync() in interrupts-disabled
context.

Reported-by: Colin Cross <ccross@google.com>
Tested-by: Nishanth Menon <nm@ti.com>
Signed-off-by: Kevin Hilman <khilman@ti.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
13 years agoPM / Domains: Fix pm_genpd_poweron()
Rafael J. Wysocki [Fri, 5 Aug 2011 19:45:11 +0000 (21:45 +0200)]
PM / Domains: Fix pm_genpd_poweron()

The local variable ret is defined twice in pm_genpd_poweron(), which
causes this function to always return 0, even if the PM domain's
.power_on() callback fails, in which case an error code should be
returned.

Remove the wrong second definition of ret and additionally remove an
unnecessary definition of wait from pm_genpd_poweron().

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
13 years agoacer-wmi: support Lenovo ideapad S205 wifi switch
Lee, Chun-Yi [Sat, 30 Jul 2011 09:00:45 +0000 (17:00 +0800)]
acer-wmi: support Lenovo ideapad S205 wifi switch

The AMW0 function in acer-wmi works on Lenovo ideapad S205 for control
the wifi hardware state. We also found there have a 0x78 EC register
exposes the state of wifi hardware switch on the machine.

So, add this patch to support Lenovo ideapad S205 wifi hardware switch
in acer-wmi driver.

Reference: bko#37892
https://bugzilla.kernel.org/show_bug.cgi?id=37892

Cc: Carlos Corbacho <carlos@strangeworlds.co.uk>
Cc: Matthew Garrett <mjg@redhat.com>
Cc: Dmitry Torokhov <dtor@mail.ru>
Cc: Corentin Chary <corentincj@iksaif.net>
Cc: Thomas Renninger <trenn@suse.de>
Tested-by: Florian Heyer <heyho@flanto.de>
Signed-off-by: Lee, Chun-Yi <jlee@suse.com>
Signed-off-by: Matthew Garrett <mjg@redhat.com>
13 years agoacerhdf.c: spaces in aliased changed to *
Anton V. Boyarshinov [Thu, 28 Jul 2011 14:05:35 +0000 (18:05 +0400)]
acerhdf.c: spaces in aliased changed to *

It seems that aliases shouldn't contain spaces, as
module-init-tools uses them as delimeters in module.alias file

Signed-off-by: Anton V. Boyarshinov <boyarsh@altlinux.org>
Signed-off-by: Matthew Garrett <mjg@redhat.com>