openwrt/staging/blogic.git
14 years ago stmmac: Remove redundant unlikely()
Tobias Klauser [Thu, 9 Dec 2010 04:50:22 +0000 (04:50 +0000)]
stmmac: Remove redundant unlikely()

IS_ERR() already implies unlikely(), so it can be omitted here.
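
A userspace sketch of why the extra unlikely() adds nothing. The macros
below are simplified from memory of include/linux/err.h and should be
read as an approximation, not the kernel's exact definitions; check()
is a hypothetical caller.

```c
#include <stdint.h>

#define MAX_ERRNO 4095

/* Branch hint (GCC/Clang builtin): the condition is expected to be false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* The range check itself already carries the hint. */
#define IS_ERR_VALUE(x) unlikely((uintptr_t)(x) >= (uintptr_t)-MAX_ERRNO)

/* Encode a negative errno in a pointer, and decode it again. */
static inline void *ERR_PTR(long error) { return (void *)error; }
static inline long PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int IS_ERR(const void *ptr) { return IS_ERR_VALUE(ptr); }

/* Hypothetical caller: returns 0 for a valid pointer, else the errno. */
static int check(const void *p)
{
    if (IS_ERR(p))      /* no need for unlikely(IS_ERR(p)) */
        return (int)PTR_ERR(p);
    return 0;
}
```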

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago xfrm: Traffic Flow Confidentiality for IPv6 ESP
Martin Willi [Wed, 8 Dec 2010 04:37:51 +0000 (04:37 +0000)]
xfrm: Traffic Flow Confidentiality for IPv6 ESP

Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.
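
The padding rule can be sketched in userspace C as follows. The helper
name tfc_padding() and its parameters are illustrative assumptions, not
the kernel's ESP code; pmtu stands for the largest payload that still
fits the path MTU, and a boundary of 0 is assumed to mean "TFC off".

```c
#include <stddef.h>

/*
 * Number of TFC padding bytes for a payload of len bytes.
 * tfcpad: boundary configured on the state (0 assumed to disable TFC)
 * pmtu:   largest payload that fits the path MTU
 */
static size_t tfc_padding(size_t len, size_t tfcpad, size_t pmtu)
{
    /* Clamp the boundary to the PMTU. */
    size_t target = tfcpad < pmtu ? tfcpad : pmtu;

    /* Pad only packets that are smaller than the clamped boundary. */
    return len < target ? target - len : 0;
}
```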

Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago xfrm: Traffic Flow Confidentiality for IPv4 ESP
Martin Willi [Wed, 8 Dec 2010 04:37:50 +0000 (04:37 +0000)]
xfrm: Traffic Flow Confidentiality for IPv4 ESP

Add TFC padding to all packets smaller than the boundary configured
on the xfrm state. If the boundary is larger than the PMTU, limit
padding to the PMTU.

Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago xfrm: Add Traffic Flow Confidentiality padding XFRM attribute
Martin Willi [Wed, 8 Dec 2010 04:37:49 +0000 (04:37 +0000)]
xfrm: Add Traffic Flow Confidentiality padding XFRM attribute

The XFRMA_TFCPAD attribute for XFRM state installation configures
Traffic Flow Confidentiality by padding ESP packets to a specified
length.

Signed-off-by: Martin Willi <martin@strongswan.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago ifb: use the lockless variants of skb_queue
Changli Gao [Sat, 4 Dec 2010 15:01:52 +0000 (15:01 +0000)]
ifb: use the lockless variants of skb_queue

rq and tq are both protected by tx queue lock, so we can simply use
the lockless variants of skb_queue.

skb_queue_splice_tail_init() is used instead of the open coded and slow
one.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago ifb: remove unused macro TX_TIMEOUT
Changli Gao [Fri, 3 Dec 2010 19:55:20 +0000 (19:55 +0000)]
ifb: remove unused macro TX_TIMEOUT

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago ifb: remove the useless debug stats
Changli Gao [Fri, 3 Dec 2010 19:55:19 +0000 (19:55 +0000)]
ifb: remove the useless debug stats

These debug stats are not exported, so they are useless.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago qeth: buffer count imbalance
Jan Glauber [Wed, 8 Dec 2010 02:58:01 +0000 (02:58 +0000)]
qeth: buffer count imbalance

The used buffers counter is not incremented in case of an error so
the counter can become negative. Increment the used buffers counter
before checking for errors.

Signed-off-by: Jan Glauber <jang@linux.vnet.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago qeth: l3 add vlan hdr in passthru frames
Frank Blaschka [Wed, 8 Dec 2010 02:58:00 +0000 (02:58 +0000)]
qeth: l3 add vlan hdr in passthru frames

OSA l3 mode provides hardware-accelerated VLAN only for IPv4. Make sure
the device driver adds the VLAN header to passthru frames.

Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago qeth: support VIPA add/del in offline mode
Einar Lueck [Wed, 8 Dec 2010 02:57:59 +0000 (02:57 +0000)]
qeth: support VIPA add/del in offline mode

Only work through the IP address to-do list if the card is UP or
SOFTSETUP. This allows VIPA add/del to be configured in offline mode.

Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago qeth: support ipv6 query arp cache for HiperSockets
Einar Lueck [Wed, 8 Dec 2010 02:57:58 +0000 (02:57 +0000)]
qeth: support ipv6 query arp cache for HiperSockets

Function qeth_l3_arp_query now queries for IPv6 addresses, too, if
QETH_QARP_WITH_IPV6 is passed as parameter to the ioctl. HiperSockets
and GuestLAN in HiperSockets mode provide corresponding entries.

Signed-off-by: Einar Lueck <elelueck@de.ibm.com>
Signed-off-by: Frank Blaschka <frank.blaschka@de.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net/ipv6/udp.c: fix typo in flush_stack()
Jiri Pirko [Thu, 9 Dec 2010 03:40:30 +0000 (03:40 +0000)]
net/ipv6/udp.c: fix typo in flush_stack()

skb1 should be passed as parameter to sk_rcvqueues_full() here.

Signed-off-by: Jiri Pirko <jpirko@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago ipv6: Fix 'release_it' logic in tcp_v6_get_peer()
David S. Miller [Fri, 10 Dec 2010 21:16:09 +0000 (13:16 -0800)]
ipv6: Fix 'release_it' logic in tcp_v6_get_peer()

We accidentally set it to "true" for the case where we
are using a route bound peer.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago bridge: Fix return values of br_multicast_add_group/br_multicast_new_group
Tobias Klauser [Fri, 10 Dec 2010 03:18:04 +0000 (03:18 +0000)]
bridge: Fix return values of br_multicast_add_group/br_multicast_new_group

If br_multicast_new_group returns NULL, we would return 0 (no error) to
the caller of br_multicast_add_group, which is not what we want. Instead
br_multicast_new_group should return ERR_PTR(-ENOMEM) in this case.
Also propagate the error number returned by br_mdb_rehash properly.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6
David S. Miller [Fri, 10 Dec 2010 19:22:57 +0000 (11:22 -0800)]
Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6

14 years ago Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc...
David S. Miller [Fri, 10 Dec 2010 18:20:43 +0000 (10:20 -0800)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-next-2.6

14 years ago Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville...
David S. Miller [Fri, 10 Dec 2010 17:50:47 +0000 (09:50 -0800)]
Merge branch 'for-davem' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next-2.6

Conflicts:
drivers/net/wireless/ath/ath9k/ar9003_eeprom.c

14 years ago dccp: remove unused macros
Shan Wei [Fri, 10 Dec 2010 11:49:23 +0000 (12:49 +0100)]
dccp: remove unused macros

Remove macros which have been unused since the initial implementation
(commit 7c657876b63cb1d8a2ec06f8fc6c37bb8412e66c, [DCCP]: Initial
 implementation from Tue Aug 9 20:14:34 2005 -0700).

Signed-off-by: Shan Wei <shanwei@cn.fujitsu.com>
Acked-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
14 years ago bnx2x: Update version number and a date.
Vladislav Zolotarov [Wed, 8 Dec 2010 01:43:37 +0000 (01:43 +0000)]
bnx2x: Update version number and a date.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago bnx2x: Fixed a compilation warning
Vladislav Zolotarov [Wed, 8 Dec 2010 01:43:29 +0000 (01:43 +0000)]
bnx2x: Fixed a compilation warning

bnx2x_src_init_t2() is used only when BCM_CNIC is defined.
So, to avoid a compilation warning, we won't define it unless
BCM_CNIC is defined.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago bnx2x: Use dma_alloc_coherent() semantics for ILT memory allocation
Vladislav Zolotarov [Wed, 8 Dec 2010 01:43:17 +0000 (01:43 +0000)]
bnx2x: Use dma_alloc_coherent() semantics for ILT memory allocation

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago bnx2x: LSO code was broken on BE platforms
Vladislav Zolotarov [Wed, 8 Dec 2010 01:43:09 +0000 (01:43 +0000)]
bnx2x: LSO code was broken on BE platforms

Make the LSO code work on BE platforms: parsing_data field of
a parsing BD (PBD) for 57712 was improperly composed which made FW read wrong
values for TCP header's length and offset and, as a result, the corresponding
PCI device was performing bad DMA reads triggering EEH.

Signed-off-by: Vladislav Zolotarov <vladz@broadcom.com>
Signed-off-by: Eilon Greenstein <eilong@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago filter: use size of fetched data in __load_pointer()
Eric Dumazet [Tue, 7 Dec 2010 22:26:15 +0000 (22:26 +0000)]
filter: use size of fetched data in __load_pointer()

__load_pointer() checks that the data we fetch from the skb lies within
the head portion, but it assumes we fetch one byte instead of up to four.

This won't crash because we have extra bytes (struct skb_shared_info)
after head, but it can read uninitialized bytes.

Fix this using size of the data (1, 2, 4 bytes) in the test.
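
A minimal sketch of a size-aware bounds check, assuming a simplified
stand-in for the linear part of an skb. struct buf and load_pointer()
are hypothetical names and do not reproduce the kernel's
__load_pointer().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the linear (head) part of an skb. */
struct buf {
    const uint8_t *head;   /* first valid byte */
    const uint8_t *tail;   /* one past the last valid byte */
};

/* Return a pointer to size readable bytes at offset k, or NULL. */
static const uint8_t *load_pointer(const struct buf *b, size_t k, size_t size)
{
    size_t avail = (size_t)(b->tail - b->head);

    /* The whole access [k, k + size) must fit, not just its first byte. */
    if (k > avail || size > avail - k)
        return NULL;
    return b->head + k;
}
```

With an 8-byte buffer, a 4-byte load at offset 6 is rejected here, while
a check that only looks at the first byte would accept it.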

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago The new jhash implementation
Jozsef Kadlecsik [Fri, 3 Dec 2010 02:39:01 +0000 (02:39 +0000)]
The new jhash implementation

The current jhash.h implements the lookup2() hash function by Bob Jenkins.
However, lookup2() is outdated as Bob wrote a new hash function called
lookup3(). The patch replaces the lookup2() implementation of the 'jhash*'
functions with that of lookup3().

You can read a longer comparison of the two and other hash functions at
http://burtleburtle.net/bob/hash/doobs.html.

Signed-off-by: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Acked-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net: optimize INET input path further
Eric Dumazet [Tue, 30 Nov 2010 19:04:07 +0000 (19:04 +0000)]
net: optimize INET input path further

Followup of commit b178bb3dfc30 (net: reorder struct sock fields)

Optimize INET input path a bit further, by :

1) moving sk_refcnt close to sk_lock.

This reduces number of dirtied cache lines by one on 64bit arches (and
64 bytes cache line size).

2) moving inet_daddr & inet_rcv_saddr to the beginning of sk

(same cache line as hash / family / bound_dev_if / nulls_node)

This reduces the number of accessed cache lines in lookups by one, and
does not increase the size of inet and timewait socks.
inet and tw sockets now share the same place-holder for these fields.

Before patch :

offsetof(struct sock, sk_refcnt) = 0x10
offsetof(struct sock, sk_lock) = 0x40
offsetof(struct sock, sk_receive_queue) = 0x60
offsetof(struct inet_sock, inet_daddr) = 0x270
offsetof(struct inet_sock, inet_rcv_saddr) = 0x274

After patch :

offsetof(struct sock, sk_refcnt) = 0x44
offsetof(struct sock, sk_lock) = 0x48
offsetof(struct sock, sk_receive_queue) = 0x68
offsetof(struct inet_sock, inet_daddr) = 0x0
offsetof(struct inet_sock, inet_rcv_saddr) = 0x4

compute_score() (udp or tcp) now uses a single cache line per ignored
item, instead of two.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net: Abstract away all dst_entry metrics accesses.
David S. Miller [Thu, 9 Dec 2010 05:16:57 +0000 (21:16 -0800)]
net: Abstract away all dst_entry metrics accesses.

Use helper functions to hide all direct accesses, especially writes,
to dst_entry metrics values.

This will allow us to:

1) More easily change how the metrics are stored.

2) Implement COW for metrics.

In particular this will help us put metrics into the inetpeer
cache if that is what we end up doing.  We can make the _metrics
member a pointer instead of an array, initially have it point
at the read-only metrics in the FIB, and then on the first set
grab an inetpeer entry and point the _metrics member there.

Signed-off-by: David S. Miller <davem@davemloft.net>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
14 years ago can: slcan: Add missing linux/sched.h include.
David S. Miller [Thu, 9 Dec 2010 02:41:03 +0000 (18:41 -0800)]
can: slcan: Add missing linux/sched.h include.

drivers/net/can/slcan.c: In function 'slcan_open':
drivers/net/can/slcan.c:568: error: dereferencing pointer to incomplete type

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6
David S. Miller [Wed, 8 Dec 2010 21:15:38 +0000 (13:15 -0800)]
Merge branch 'master' of master.kernel.org:/pub/scm/linux/kernel/git/davem/net-2.6

Conflicts:
drivers/net/wireless/ath/ath9k/ar9003_eeprom.c
net/llc/af_llc.c

14 years ago tcp: protect sysctl_tcp_cookie_size reads
Eric Dumazet [Tue, 7 Dec 2010 12:20:47 +0000 (12:20 +0000)]
tcp: protect sysctl_tcp_cookie_size reads

Make sure sysctl_tcp_cookie_size is read once in
tcp_cookie_size_check(), or we might return an illegal value to caller
if sysctl_tcp_cookie_size is changed by another cpu.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ben Hutchings <bhutchings@solarflare.com>
Cc: William Allen Simpson <william.allen.simpson@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago tcp: avoid a possible divide by zero
Eric Dumazet [Tue, 7 Dec 2010 12:03:55 +0000 (12:03 +0000)]
tcp: avoid a possible divide by zero

sysctl_tcp_tso_win_divisor might be set to zero while one cpu runs in
tcp_tso_should_defer(). Make sure we don't allow a divide by zero by
reading sysctl_tcp_tso_win_divisor exactly once.
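
The read-once pattern can be sketched in userspace C like this. The
variable sysctl_win_divisor and the helper win_chunk() are hypothetical
stand-ins, and the volatile qualifier plays the role of the kernel's
single-read accessor; this is not the code of tcp_tso_should_defer().

```c
#include <stdint.h>

/* Stand-in for a sysctl that another CPU may rewrite at any time. */
static volatile int sysctl_win_divisor = 3;

/* Hypothetical helper: window divided by the divisor, if one is set. */
static uint32_t win_chunk(uint32_t window)
{
    /* Take one snapshot; the test and the division both use it. */
    int divisor = sysctl_win_divisor;

    if (divisor)
        return window / (uint32_t)divisor;
    return window;
}
```

Testing the global and then dividing by the global would be two separate
reads, and the value could drop to zero between them.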

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago ehea: Fixing LRO configuration
Breno Leitao [Wed, 8 Dec 2010 20:19:14 +0000 (12:19 -0800)]
ehea: Fixing LRO configuration

In order to set LRO on ehea, the user must set a module parameter, which
is not the standard way to do so. This patch adds a way to set LRO using
the ethtool tool.

Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago tcp: Replace time wait bucket msg by counter
Tom Herbert [Wed, 8 Dec 2010 20:16:33 +0000 (12:16 -0800)]
tcp: Replace time wait bucket msg by counter

Rather than printing the message to the log, use a mib counter to keep
track of the count of occurrences of time wait bucket overflow.  Reduces
spam in logs.

Signed-off-by: Tom Herbert <therbert@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago x25: decrement netdev reference counts on unload
Apollon Oikonomopoulos [Tue, 7 Dec 2010 09:43:30 +0000 (09:43 +0000)]
x25: decrement netdev reference counts on unload

x25 does not decrement the network device reference counts on module unload.
Thus unregistering any pre-existing interface after unloading the x25 module
hangs and results in

 unregister_netdevice: waiting for tap0 to become free. Usage count = 1

This patch decrements the reference counts of all interfaces in x25_link_free,
the way it is already done in x25_link_device_down for NETDEV_DOWN events.

Signed-off-by: Apollon Oikonomopoulos <apollon@noc.grnet.gr>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago driver/net/benet: fix be_cmd_multicast_set() memcpy bug
Joe Jin [Mon, 6 Dec 2010 03:00:59 +0000 (03:00 +0000)]
driver/net/benet: fix be_cmd_multicast_set() memcpy bug

The benet be_cmd_multicast_set() function now uses the
netdev_for_each_mc_addr() helper to copy MAC addresses, but it did
not increment the index when copying to req->mac[].
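
A small sketch of the bug class, with hypothetical names (struct mac,
copy_macs()) standing in for the driver's request structure and list
walk. Without the post-increment, every address would land in slot 0.

```c
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for one entry of req->mac[]. */
struct mac {
    uint8_t byte[ETH_ALEN];
};

/* Copy n addresses from src into dst; returns the number of slots used. */
static int copy_macs(struct mac *dst, const struct mac *src, int n)
{
    int i = 0;

    for (int j = 0; j < n; j++)
        memcpy(dst[i++].byte, src[j].byte, ETH_ALEN); /* i++ advances the slot */
    return i;
}
```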

Cc: Sathya Perla <sathyap@serverengines.com>
Cc: Subbu Seetharaman <subbus@serverengines.com>
Cc: Sarveshwar Bandi <sarveshwarb@serverengines.com>
Cc: Ajit Khaparde <ajitk@serverengines.com>
Signed-off-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago l2tp: Fix modalias of l2tp_ip
Michal Marek [Mon, 6 Dec 2010 02:39:12 +0000 (02:39 +0000)]
l2tp: Fix modalias of l2tp_ip

Using the SOCK_DGRAM enum results in
"net-pf-2-proto-SOCK_DGRAM-type-115", so use the numeric value like it
is done in net/dccp.

Signed-off-by: Michal Marek <mmarek@suse.cz>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago econet: Do the correct cleanup after an unprivileged SIOCSIFADDR.
Nelson Elhage [Wed, 8 Dec 2010 18:13:55 +0000 (10:13 -0800)]
econet: Do the correct cleanup after an unprivileged SIOCSIFADDR.

We need to drop the mutex and do a dev_put, so set an error code and break like
the other paths, instead of returning directly.

Signed-off-by: Nelson Elhage <nelhage@ksplice.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago Merge branch 'sfc-2.6.37' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6
David S. Miller [Wed, 8 Dec 2010 20:13:23 +0000 (12:13 -0800)]
Merge branch 'sfc-2.6.37' of git://git.kernel.org/pub/scm/linux/kernel/git/bwh/sfc-2.6

14 years ago isdn/hisax: fix compiler warning on hisax_pci_tbl
Namhyung Kim [Tue, 7 Dec 2010 04:49:06 +0000 (04:49 +0000)]
isdn/hisax: fix compiler warning on hisax_pci_tbl

Annotate hisax_pci_tbl as '__used' to fix following warning:

  CC      drivers/isdn/hisax/config.o
drivers/isdn/hisax/config.c:1920: warning: ‘hisax_pci_tbl’ defined but not used

Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago af_packet: fix freeing pg_vec twice on error path
Changli Gao [Tue, 7 Dec 2010 05:05:18 +0000 (05:05 +0000)]
af_packet: fix freeing pg_vec twice on error path

The double free was introduced in:
        commit 0e3125c755445664f00ad036e4fc2cd32fd52877
        Author: Neil Horman <nhorman@tuxdriver.com>
        Date:   Tue Nov 16 10:26:47 2010 -0800

        packet: Enhance AF_PACKET implementation to not require high order contiguous memory allocation (v4)

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago af_packet: eliminate pgv_to_page on some arches
Changli Gao [Tue, 7 Dec 2010 04:26:16 +0000 (04:26 +0000)]
af_packet: eliminate pgv_to_page on some arches

Some arches don't need flush_dcache_page(), and don't implement it, so
we can eliminate pgv_to_page() calls on those arches.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net: call dev_queue_xmit_nit() after skb_dst_drop()
Eric Dumazet [Tue, 7 Dec 2010 00:30:37 +0000 (00:30 +0000)]
net: call dev_queue_xmit_nit() after skb_dst_drop()

Avoid some atomic ops on dst refcount, calling dev_queue_xmit_nit()
after skb_dst_drop() in dev_hard_start_xmit().

When queueing a packet into af_packet socket, we drop dst anyway.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago filter: constify sk_run_filter()
Eric Dumazet [Mon, 6 Dec 2010 20:50:09 +0000 (20:50 +0000)]
filter: constify sk_run_filter()

sk_run_filter() doesn't write to the skb, so change its prototype to
reflect this.

Fix two af_packet comments.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago via-rhine: hardware VLAN support
Roger Luethi [Mon, 6 Dec 2010 00:59:40 +0000 (00:59 +0000)]
via-rhine: hardware VLAN support

This patch adds VLAN hardware support for Rhine chips.

The driver uses up to 3 additional bytes of buffer space when extracting
802.1Q headers; PKT_BUF_SZ should still be sufficient.

The initial code was provided by David Lv. I reworked it to use standard
kernel facilities. Coding style clean up mostly follows via-velocity.

Adapted to new interface for VLAN acceleration (per request of Jesse Gross).

Signed-off-by: David Lv <DavidLv@viatech.com.cn>
Signed-off-by: Roger Luethi <rl@hellgate.ch>
 drivers/net/via-rhine.c |  326 +++++++++++++++++++++++++++++++++++++++++++++--
 1 files changed, 312 insertions(+), 14 deletions(-)
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()
Eric Dumazet [Sun, 5 Dec 2010 01:23:53 +0000 (01:23 +0000)]
net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()

On Sunday 05 December 2010 at 09:19 +0100, Eric Dumazet wrote:

> Hmm..
>
> If somebody can explain why RTNL is held in arp_ioctl() (and therefore
> in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
> that your patch can be applied.
>
> Right now it is not good, because RTNL won't necessarily be held when you
> are going to call arp_invalidate() ?

While doing this analysis, I found a refcount bug in llc, I'll send a
patch for net-2.6

Meanwhile, here is the patch for net-next-2.6

Your patch then can be applied after mine.

Thanks

[PATCH] net: RCU conversion of dev_getbyhwaddr() and arp_ioctl()

dev_getbyhwaddr() was called under RTNL.

Rename it to dev_getbyhwaddr_rcu() and change all its callers to now use
RCU locking instead of RTNL.

Change arp_ioctl() to use RCU instead of RTNL locking.

Note: this fixes a dev refcount bug in llc

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6
David S. Miller [Wed, 8 Dec 2010 18:01:00 +0000 (10:01 -0800)]
Merge branch 'dccp' of git://eden-feed.erg.abdn.ac.uk/net-next-2.6

14 years ago llc: fix a device refcount imbalance
Eric Dumazet [Sun, 5 Dec 2010 02:03:26 +0000 (02:03 +0000)]
llc: fix a device refcount imbalance

On Sunday 05 December 2010 at 12:23 +0100, Eric Dumazet wrote:
> On Sunday 05 December 2010 at 09:19 +0100, Eric Dumazet wrote:
>
> > Hmm..
> >
> > If somebody can explain why RTNL is held in arp_ioctl() (and therefore
> > in arp_req_delete()), we might first remove RTNL use in arp_ioctl() so
> > that your patch can be applied.
> >
> > Right now it is not good, because RTNL won't necessarily be held when you
> > are going to call arp_invalidate() ?
>
> While doing this analysis, I found a refcount bug in llc, I'll send a
> patch for net-2.6

Oh well, of course I must first fix the bug in net-2.6, and wait for David
to pull the fix into net-next-2.6 before sending this rcu conversion.

Note: this patch should be sent to stable teams (2.6.34 and up)

[PATCH net-2.6] llc: fix a device refcount imbalance

commit abf9d537fea225 (llc: add support for SO_BINDTODEVICE) added one
refcount imbalance in llc_ui_bind(), because dev_getbyhwaddr() doesn't
take a reference on device, while dev_get_by_index() does.

Fix this using RCU locking. And since an RCU conversion will be done for
2.6.38 for dev_getbyhwaddr(), put the rcu_read_lock/unlock exactly at
their final place.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: stable@kernel.org
Cc: Octavian Purdila <opurdila@ixiacom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net/9p/protocol.c: Remove duplicated macros.
Thiago Farina [Sat, 4 Dec 2010 15:22:46 +0000 (15:22 +0000)]
net/9p/protocol.c: Remove duplicated macros.

Use the macros already provided by kernel.h file.

Signed-off-by: Thiago Farina <tfransosi@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago ifb: goto resched directly if error happens and dp->tq isn't empty
Changli Gao [Sat, 4 Dec 2010 14:09:08 +0000 (14:09 +0000)]
ifb: goto resched directly if error happens and dp->tq isn't empty

If we break the loop when there are still skbs in tq and no skb in
rq, the skbs will be left in tq until new skbs are enqueued into rq.
In rare cases, no new skb is queued, and then these skbs will stay in
tq forever.

After this patch, if tq isn't empty when we break the loop, we goto
resched directly.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: Jamal Hadi Salim <hadi@cyberus.ca>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago net: init ingress queue
Changli Gao [Sat, 4 Dec 2010 02:31:41 +0000 (02:31 +0000)]
net: init ingress queue

The dev field of the ingress queue is never initialized, so a NULL
pointer dereference happens in qdisc_alloc().

Move inits of tx queues to netif_alloc_netdev_queues().

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago tcp: Bug fix in initialization of receive window.
Nandita Dukkipati [Fri, 3 Dec 2010 13:33:44 +0000 (13:33 +0000)]
tcp: Bug fix in initialization of receive window.

The bug has to do with boundary checks on the initial receive window.
If the initial receive window falls between init_cwnd and the
receive window specified by the user, the initial window is incorrectly
brought down to init_cwnd. The correct behavior is to allow it to
remain unchanged.

Signed-off-by: Nandita Dukkipati <nanditad@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago cxgb4: fix MAC address hash filter
Dimitris Michailidis [Fri, 3 Dec 2010 10:39:04 +0000 (10:39 +0000)]
cxgb4: fix MAC address hash filter

Fix the calculation of the inexact hash-based MAC address filter.
It's 64 bits but current code is missing a ULL.  Results in filtering out
some legitimate packets.
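
To illustrate the missing suffix: a plain 1 is an int, so shifting it by
32 or more is undefined on platforms with a 32-bit int and the upper
half of a 64-bit filter vector cannot be set reliably. The helper name
hash_filter_set() below is hypothetical and not part of the driver.

```c
#include <stdint.h>

/* Set bit hash (0..63) in a 64-bit hash filter vector. */
static uint64_t hash_filter_set(uint64_t vec, unsigned int hash)
{
    /* 1ULL makes the shift at least 64 bits wide; "1 << hash" would not be. */
    return vec | (1ULL << hash);
}
```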

Signed-off-by: Dimitris Michailidis <dm@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago can: add slcan driver for serial/USB-serial CAN adapters
Oliver Hartkopp [Thu, 2 Dec 2010 10:57:59 +0000 (10:57 +0000)]
can: add slcan driver for serial/USB-serial CAN adapters

This patch adds support for serial/USB-serial CAN adapters implementing the
LAWICEL ASCII protocol for CAN frame transport over serial lines.

The driver implements the SLCAN line discipline and is heavily based on the
slip.c driver. Therefore the code style remains similar to slip.c to be able
to apply changes of the SLIP driver to the SLCAN driver easily.

For more details see the slcan Kconfig entry.

Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago atm: lanai: use kernel's '%pM' format option to print MAC
Andy Shevchenko [Thu, 2 Dec 2010 02:45:08 +0000 (02:45 +0000)]
atm: lanai: use kernel's '%pM' format option to print MAC

Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Cc: Chas Williams <chas@cmf.nrl.navy.mil>
Cc: linux-atm-general@lists.sourceforge.net
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago CAIF: Fix U5500 compile error for shared memory driver
Kim Lilliestierna XX [Tue, 30 Nov 2010 09:11:22 +0000 (09:11 +0000)]
CAIF: Fix U5500 compile error for shared memory driver

Rearrange pr_fmt so it compiles.

Signed-off-by: Sjur Braendeland <sjur.brandeland@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
David S. Miller [Wed, 8 Dec 2010 16:13:01 +0000 (08:13 -0800)]
Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6

14 years ago sfc: Fix NAPI list corruption during ring reallocation
Ben Hutchings [Tue, 7 Dec 2010 19:47:34 +0000 (19:47 +0000)]
sfc: Fix NAPI list corruption during ring reallocation

Call netif_napi_{add,del}() on the NAPI contexts in the new and
old channels, respectively.

Since efx_init_napi() cannot fail, make its return type void.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years ago sfc: Fix crash in legacy interrupt handler during ring reallocation
Ben Hutchings [Tue, 7 Dec 2010 19:24:45 +0000 (19:24 +0000)]
sfc: Fix crash in legacy interrupt handler during ring reallocation

If we are using a legacy interrupt, our IRQ may be shared and our
interrupt handler may be called even though interrupts are disabled on
the NIC. When we change ring sizes, we reallocate the event queue and
the interrupt handler may use an invalid pointer when called for
another device's interrupt.

Maintain a legacy_irq_enabled flag and test that at the top of the
interrupt handler.  Note that this problem results from the need to
work around broken INT_ISR0 reads, and does not affect the legacy
interrupt handler for Falcon A1.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years ago sfc: Generalise filter spec initialisation
Ben Hutchings [Tue, 7 Dec 2010 19:11:26 +0000 (19:11 +0000)]
sfc: Generalise filter spec initialisation

Move search_depth arrays into per-table state.

Define initialisation function efx_filter_init_rx() which sets
everything apart from the match fields.

Define efx_filter_set_{ipv4_local,ipv4_full,eth_local}() to set the
match fields.  This allows some simplification of callers and later
support for additional protocols and more flexible matching using
multiple calls to these functions.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years ago sfc: Remove filter table IDs from filter functions
Ben Hutchings [Tue, 7 Dec 2010 19:02:27 +0000 (19:02 +0000)]
sfc: Remove filter table IDs from filter functions

The separation between filter tables is largely an internal detail
and it may be removed in future hardware.  To prepare for that:

- Merge table ID with filter index to make an opaque filter ID
- Wrap efx_filter_table_clear() with a function that clears filters
  from both RX tables, which is all that the current caller requires

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years ago sfc: Log start and end of ethtool self-test at INFO level
Ben Hutchings [Tue, 7 Dec 2010 18:29:52 +0000 (18:29 +0000)]
sfc: Log start and end of ethtool self-test at INFO level

Add message at start of self-test and increase log level of message at
end of self-test, so that any other messages produced during the
test are clearly associated with it.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years ago dccp qpolicy: Parameter checking of cmsg qpolicy parameters
Tomasz Grobelny [Sat, 4 Dec 2010 12:39:13 +0000 (13:39 +0100)]
dccp qpolicy: Parameter checking of cmsg qpolicy parameters

Ensure that cmsg->cmsg_type value is valid for qpolicy
that is currently in use.

Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
14 years ago dccp: Policy-based packet dequeueing infrastructure
Tomasz Grobelny [Sat, 4 Dec 2010 12:38:01 +0000 (13:38 +0100)]
dccp: Policy-based packet dequeueing infrastructure

This patch adds a generic infrastructure for policy-based dequeueing of
TX packets and provides two policies:
 * a simple FIFO policy (which is the default) and
 * a priority based policy (set via socket options).
Both policies honour the tx_qlen sysctl for the maximum size of the write
queue (can be overridden via socket options).

The priority policy uses skb->priority internally to assign a u32 priority
identifier, using the same ranking as SO_PRIORITY. The skb->priority field
is set to 0 when the packet leaves DCCP. The priority is supplied as ancillary
data using cmsg(3); the patch also provides the requisite parsing routines.

Signed-off-by: Tomasz Grobelny <tomasz@grobelny.oswiecenia.net>
Signed-off-by: Gerrit Renker <gerrit@erg.abdn.ac.uk>
14 years ago Revert "ehea: Use the standard logging functions"
David S. Miller [Tue, 7 Dec 2010 04:45:28 +0000 (20:45 -0800)]
Revert "ehea: Use the standard logging functions"

This reverts commit 539995d18649023199986424d140f1d620372ce5.

As reported by Stephen Rothwell, this breaks the build.

Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago sfc: Use TX push whenever adding descriptors to an empty queue
Ben Hutchings [Mon, 15 Nov 2010 23:53:11 +0000 (23:53 +0000)]
sfc: Use TX push whenever adding descriptors to an empty queue

Whenever we add DMA descriptors to a TX ring and update the ring
pointer, the TX DMA engine must first read the new DMA descriptors and
then start reading packet data.  However, all released Solarflare 10G
controllers have a 'TX push' feature that allows us to reduce latency
by writing the first new DMA descriptor along with the pointer update.
This is only useful when the queue is empty.  The hardware should
ignore the pushed descriptor if the queue is not empty, but this check
is buggy, so we must do it in software.

In order to tell whether a TX queue is empty, we need to compare the
previous transmission count (write_count) and completion count
(read_count).  However, if we do that every time we update the ring
pointer then read_count may ping-pong between the caches of two CPUs
running the transmission and completion paths for the queue.
Therefore, we split the check for an empty queue between the
completion path and the transmission path:

- Add an empty_read_count field representing a point at which the
  completion path saw the TX queue as empty.
- Add an old_write_count field for use on the completion path.
- On the completion path, whenever read_count reaches or passes
  old_write_count the TX queue may be empty.  We then read
  write_count, set empty_read_count if read_count == write_count,
  and update old_write_count.
- On the transmission path, we read empty_read_count.  If it's set, we
  compare it with the value of write_count before the current set of
  descriptors was added.  If they match, the queue really is empty and
  we can use TX push.
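
The split check listed above can be modelled in user space roughly as follows. This is a single-threaded sketch, not the driver code: `struct txq`, `txq_complete`, `txq_add` and the `EMPTY_COUNT_VALID` marker bit (used to represent "empty_read_count is set") are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EMPTY_COUNT_VALID 0x80000000u   /* hypothetical "field is set" marker */

struct txq {
    uint32_t write_count;       /* descriptors added (transmission path) */
    uint32_t read_count;        /* descriptors completed (completion path) */
    uint32_t old_write_count;   /* completion path's last view of write_count */
    uint32_t empty_read_count;  /* read_count at which the queue was seen empty */
};

/* Completion path: n more descriptors have completed. */
static void txq_complete(struct txq *q, uint32_t n)
{
    q->read_count += n;
    /* The signed difference copes with counter wrap-around. */
    if ((int32_t)(q->read_count - q->old_write_count) >= 0) {
        uint32_t wc = q->write_count;   /* the only read of write_count here */

        if (q->read_count == wc)
            q->empty_read_count = q->read_count | EMPTY_COUNT_VALID;
        q->old_write_count = wc;
    }
}

/* Transmission path: add n descriptors; return true if TX push may be used. */
static bool txq_add(struct txq *q, uint32_t n)
{
    uint32_t before = q->write_count;       /* value before this batch */
    uint32_t empty = q->empty_read_count;

    q->write_count += n;
    if (!(empty & EMPTY_COUNT_VALID))
        return false;                       /* never seen empty */
    /* Queue is empty only if nothing was written since it was seen empty. */
    return ((empty ^ before) & ~EMPTY_COUNT_VALID) == 0;
}
```

Note that the transmission path never touches read_count, and the completion path reads write_count only when read_count catches up with old_write_count, which is the point of the split.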

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agosfc: Remove locking from implementation of efx_writeo_paged()
Ben Hutchings [Mon, 6 Dec 2010 22:58:41 +0000 (22:58 +0000)]
sfc: Remove locking from implementation of efx_writeo_paged()

It is not necessary to serialise writes to the paged 128-bit
registers.  However, if we don't then we must always write the last
dword separately, not as part of a qword write.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agosfc: Add compile-time checks for correctness of paged register writes
Ben Hutchings [Mon, 6 Dec 2010 22:55:33 +0000 (22:55 +0000)]
sfc: Add compile-time checks for correctness of paged register writes

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agosfc: Remove redundant memory barriers between MMIOs
Ben Hutchings [Mon, 6 Dec 2010 22:55:18 +0000 (22:55 +0000)]
sfc: Remove redundant memory barriers between MMIOs

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agosfc: Expand/correct comments on collector behaviour and function usage
Ben Hutchings [Mon, 6 Dec 2010 22:55:00 +0000 (22:55 +0000)]
sfc: Expand/correct comments on collector behaviour and function usage

Document exactly which registers and functions have special behaviour,
and why races on writes to descriptor pointers are safe.

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agosfc: Use ACCESS_ONCE when copying efx_tx_queue::read_count
Ben Hutchings [Wed, 10 Nov 2010 18:46:40 +0000 (18:46 +0000)]
sfc: Use ACCESS_ONCE when copying efx_tx_queue::read_count

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agosfc: Reorder struct efx_nic to separate fields by volatility
Ben Hutchings [Mon, 6 Dec 2010 22:53:15 +0000 (22:53 +0000)]
sfc: Reorder struct efx_nic to separate fields by volatility

Place the regularly updated fields (locks, MAC stats, etc.) on a
separate cache-line from fields which are mostly constant.  This
should reduce cache misses for access to the latter on the data path.
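
The layout idea can be sketched in plain C11 as below. The structure and its field names are hypothetical (this is not struct efx_nic), and a 64-byte cache line is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64   /* assumed cache-line size */

/* Hypothetical device structure, grouped by how often fields change. */
struct nic {
    /* Mostly constant after probe: read on the data path. */
    uint32_t port_num;
    uint32_t n_channels;
    void *regs;

    /* Regularly updated: starts on its own cache line. */
    _Alignas(CACHE_LINE) uint32_t stats_lock;
    uint64_t rx_packets;
    uint64_t tx_packets;
};
```

Because the first frequently written field is aligned to a cache-line boundary, updates to it and to the fields after it do not invalidate the line holding the mostly constant fields.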

Signed-off-by: Ben Hutchings <bhutchings@solarflare.com>
14 years agonet: cris/eth_v10: Use net_device_stats from struct net_device_stats
Tobias Klauser [Thu, 2 Dec 2010 07:22:05 +0000 (07:22 +0000)]
net: cris/eth_v10: Use net_device_stats from struct net_device_stats

struct net_device has its own struct net_device_stats member, so use
this one instead of a private copy in struct net_local.

Note: This patch was not even compile tested.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: emaclite: Omit private ndo_get_stats function
Tobias Klauser [Thu, 2 Dec 2010 07:20:39 +0000 (07:20 +0000)]
net: emaclite: Omit private ndo_get_stats function

xemaclite_get_stats() just returns dev->stats so we can leave it out
altogether and let dev_get_stats() do the job.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: am79c961a: Omit private ndo_get_stats function
Tobias Klauser [Thu, 2 Dec 2010 07:20:05 +0000 (07:20 +0000)]
net: am79c961a: Omit private ndo_get_stats function

am79c961_getstats() just returns dev->stats so we can leave it out
altogether and let dev_get_stats() do the job.

Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Acked-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: fix enum type mismatch on disable laser
Don Skidmore [Fri, 3 Dec 2010 13:24:05 +0000 (13:24 +0000)]
ixgbe: fix enum type mismatch on disable laser

Fixes a recent bug in the patch (c6ecf39a10ceec3e97096e2a8d3eadcecd593422)
that disabled the laser on ifconfig down.  Compilers were seeing an enum
mismatch.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: fix for link failure on SFP+ DA cables
Don Skidmore [Fri, 3 Dec 2010 13:23:30 +0000 (13:23 +0000)]
ixgbe: fix for link failure on SFP+ DA cables

This patch helps prevent FW/SW semaphore collision from leading
to link establishment failure.  The collision might mess up the
PHY registers so we reset the PHY.  However there are SFI/KR areas
in the PHY that are not reset with a Reset_AN so we need to change
LMS to reset it.  Also wait until the AN state machine is AN_GOOD.

Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoixgbe: fix possible NULL pointer dereference in shutdown path
Don Skidmore [Wed, 1 Dec 2010 20:54:53 +0000 (20:54 +0000)]
ixgbe: fix possible NULL pointer dereference in shutdown path

After freeing the rings we were not zeroing out the ring count values.
This patch now clears these counts correctly.

Reported-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Don Skidmore <donald.c.skidmore@intel.com>
Tested-by: Stephen Ko <stephen.s.ko@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoFix a typo in datagram.c and sctp/socket.c.
David Shwatrz [Thu, 2 Dec 2010 09:01:55 +0000 (09:01 +0000)]
Fix a typo in datagram.c and sctp/socket.c.

Hi,
This patch fixes a typo in net/core/datagram.c and in net/sctp/socket.c

Regards,
David Shwartz

Signed-off-by: David Shwartz <dshwatrz@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agofilter: add a security check at install time
Eric Dumazet [Wed, 1 Dec 2010 20:46:24 +0000 (20:46 +0000)]
filter: add a security check at install time

We added some security checks in commit 57fe93b374a6
(filter: make sure filters dont read uninitialized memory) to close a
potential leak of kernel information to user.

This added a potential extra cost at run time, whereas we can instead
check the filter itself at install time, to make sure a malicious user
doesn't try to abuse us.

This patch adds a check_loads() function, whose sole purpose is to
make this check, allocating a temporary array of masks. We scan the
filter and propagate bitmask information telling us whether a load M(K) is
allowed because a previous store M(K) is guaranteed (so that
sk_run_filter() cannot read uninitialized memory).

Note: this can uncover application bugs, denying a filter attach that
was previously allowed.
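
The bitmask propagation described above can be illustrated with a simplified user-space checker. The instruction encoding (`enum op`, `struct insn`) is hypothetical and much smaller than the real filter instruction set; the sketch assumes 16 scratch memory words and forward-only jumps, and it is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MEMWORDS 16     /* assumed number of scratch memory words M[] */

/* Hypothetical, reduced instruction set for illustration only. */
enum op { OP_ST, OP_LD_MEM, OP_JA, OP_JCOND, OP_RET, OP_OTHER };

struct insn {
    enum op op;
    uint8_t jt, jf;     /* relative targets for OP_JCOND */
    uint32_t k;         /* memory index, or relative target for OP_JA */
};

/* Return 0 if every load M[k] is preceded by a store M[k] on all paths. */
static int check_loads(const struct insn *f, size_t len)
{
    uint16_t *masks, memvalid = 0;  /* one bit per memory word */
    int ret = 0;

    if (len == 0)
        return 0;
    masks = malloc(len * sizeof(*masks));
    if (!masks)
        return -1;
    memset(masks, 0xff, len * sizeof(*masks));

    for (size_t pc = 0; pc < len && ret == 0; pc++) {
        size_t t1 = pc + 1 + f[pc].jt, t2 = pc + 1 + f[pc].jf;
        size_t ta = pc + 1 + f[pc].k;

        /* Keep only the words valid on every path reaching this point. */
        memvalid &= masks[pc];

        switch (f[pc].op) {
        case OP_ST:
            if (f[pc].k >= MEMWORDS)
                ret = -1;
            else
                memvalid |= (uint16_t)(1u << f[pc].k);
            break;
        case OP_LD_MEM:
            if (f[pc].k >= MEMWORDS || !(memvalid & (1u << f[pc].k)))
                ret = -1;           /* load of a possibly unset word */
            break;
        case OP_JA:
            if (ta >= len) { ret = -1; break; }
            masks[ta] &= memvalid;  /* a jump records its state on the target */
            memvalid = 0xffff;
            break;
        case OP_JCOND:
            if (t1 >= len || t2 >= len) { ret = -1; break; }
            masks[t1] &= memvalid;
            masks[t2] &= memvalid;
            memvalid = 0xffff;
            break;
        default:
            break;
        }
    }
    free(masks);
    return ret;
}
```

A filter whose conditional branch skips the store before a load is rejected, even though the other branch would have initialized the word; that is the kind of previously accepted filter the note above refers to.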

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Dan Rosenberg <drosenberg@vsecurity.com>
Cc: Changli Gao <xiaosuo@gmail.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: arp: use assignment
Changli Gao [Wed, 1 Dec 2010 20:07:31 +0000 (20:07 +0000)]
net: arp: use assignment

arp_filter() is consulted only when dont_send is 0, so we can simply
assign the return value of arp_filter() to dont_send instead.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: Handle out of buffer completions for lancer
Sathya Perla [Wed, 1 Dec 2010 01:04:17 +0000 (01:04 +0000)]
be2net: Handle out of buffer completions for lancer

If the Lancer chip has no posted RX buffers, it posts an RX completion entry
with the same frag_index as the last valid completion. The Error bit is also
set. In BE, a flush completion is indicated with a zero value for num_rcvd in
the completion.
Such completions don't carry any data and are not processed.
This patch refactors the code to handle both cases in the same way.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: FW init cmd fix for lancer
Sathya Perla [Wed, 1 Dec 2010 01:03:36 +0000 (01:03 +0000)]
be2net: FW init cmd fix for lancer

Lancer can use the same pattern as BE to indicate a driver load
to the FW.

Signed-off-by: Padmanabh Ratnakar <padmanabh.ratnakar@emulex.com>
Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agobe2net: Fix be_dev_family_check() return value check
Sathya Perla [Wed, 1 Dec 2010 01:02:28 +0000 (01:02 +0000)]
be2net: Fix be_dev_family_check() return value check

Signed-off-by: Sathya Perla <sathya.perla@emulex.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoaf_packet: remove pgv.flags
Changli Gao [Wed, 1 Dec 2010 02:52:57 +0000 (02:52 +0000)]
af_packet: remove pgv.flags

Since we can check whether an address is a vmalloc address with
is_vmalloc_addr(), we remove pgv.flags. We may then get more pg_vecs.

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoaf_packet: use vmalloc_to_page() instead for the address returned by vmalloc()
Changli Gao [Wed, 1 Dec 2010 02:52:20 +0000 (02:52 +0000)]
af_packet: use vmalloc_to_page() instead for the address returned by vmalloc()

The commit quoted below means that pgv->buffer may point to memory
returned by vmalloc(), and we can't use virt_to_page() on a vmalloc
address.

This patch introduces a new inline function pgv_to_page(), which calls
vmalloc_to_page() for a vmalloc address, and virt_to_page() for a
__get_free_pages address.

We used to increment the page pointer to get the page at the next page
address. After Neil's patch this is wrong, as the physical addresses may
not be contiguous. This patch also fixes this issue.

    commit 0e3125c755445664f00ad036e4fc2cd32fd52877
    Author: Neil Horman <nhorman@tuxdriver.com>
    Date:   Tue Nov 16 10:26:47 2010 -0800

    packet: Enhance AF_PACKET implementation to not require high order contiguous memory allocation (v4)

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: kill an RCU warning in inet_fill_link_af()
Eric Dumazet [Wed, 1 Dec 2010 06:03:06 +0000 (06:03 +0000)]
net: kill an RCU warning in inet_fill_link_af()

Commits 9f0f7272 (ipv4: AF_INET link address family) and cf7afbfeb8c
(rtnl: make link af-specific updates atomic) incorrectly used
__in_dev_get_rcu() in RTNL protected contexts, triggering PROVE_RCU
warnings.

Switch to __in_dev_get_rtnl(), which is more appropriate, since we hold
RTNL.

Based on a report and initial patch from Amerigo Wang.

Reported-by: Amerigo Wang <amwang@redhat.com>
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Thomas Graf <tgraf@infradead.org>
Reviewed-by: WANG Cong <amwang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years ago__in_dev_get_rtnl() can use rtnl_dereference()
Eric Dumazet [Wed, 1 Dec 2010 01:37:42 +0000 (01:37 +0000)]
__in_dev_get_rtnl() can use rtnl_dereference()

If the caller holds RTNL, we don't need the memory barrier
(smp_read_barrier_depends) included in rcu_dereference().

Just use rtnl_dereference() to properly document the assertions.

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agofilter: add SKF_AD_RXHASH and SKF_AD_CPU
Eric Dumazet [Tue, 30 Nov 2010 21:45:56 +0000 (21:45 +0000)]
filter: add SKF_AD_RXHASH and SKF_AD_CPU

Add SKF_AD_RXHASH and SKF_AD_CPU to filter ancillary mechanism,
to be able to build advanced filters.

This can help spread packets across several sockets with a fast
selection, after RPS dispatch to N cpus for example, or to catch a
percentage of flows in one queue.

tcpdump -s 500 "cpu = 1" :

[0] ld CPU
[1] jeq #1  jt 2  jf 3
[2] ret #500
[3] ret #0

# take 12.5 % of flows (average)
tcpdump -s 1000 "rxhash & 7 = 2" :

[0] ld RXHASH
[1] and #7
[2] jeq #2  jt 3  jf 4
[3] ret #1000
[4] ret #0
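
For readers less used to filter listings, the two programs above behave like the following C functions, where the return value is the number of bytes to capture and 0 means "drop". The function names are hypothetical; the 12.5 % figure follows from `rxhash & 7` taking one of eight values, assuming the hash is evenly distributed across flows.

```c
#include <assert.h>
#include <stdint.h>

/* Equivalent of: ld CPU; jeq #1; ret #500 / ret #0 */
static unsigned int filter_cpu(uint32_t cpu)
{
    return cpu == 1 ? 500 : 0;
}

/* Equivalent of: ld RXHASH; and #7; jeq #2; ret #1000 / ret #0 */
static unsigned int filter_rxhash(uint32_t rxhash)
{
    return (rxhash & 7) == 2 ? 1000 : 0;
}

/* Count how many of the hash values 0..n-1 the rxhash filter accepts. */
static unsigned int count_accepted(uint32_t n)
{
    unsigned int hits = 0;

    for (uint32_t h = 0; h < n; h++)
        if (filter_rxhash(h) != 0)
            hits++;
    return hits;
}
```

Changing the compared value from 2 to any other number in 0..7 selects a different, non-overlapping eighth of the flows, which is how several sockets could share the load.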

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Rui <wirelesser@gmail.com>
Acked-by: Changli Gao <xiaosuo@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoehea: Use the standard logging functions
Joe Perches [Tue, 30 Nov 2010 08:18:44 +0000 (08:18 +0000)]
ehea: Use the standard logging functions

Remove ehea_error, ehea_info and ehea_debug macros.
Use pr_fmt, pr_<level>, netdev_<level> and netif_<level> as appropriate.
Fix messages to use trailing "\n"; some messages had an extra one
as the old ehea_<level> macros added a trailing "\n".
Coalesced long format strings.

Uncompiled/untested.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Breno Leitao <leitao@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: Fix too optimistic NETIF_F_HW_CSUM features
Michał Mirosław [Tue, 30 Nov 2010 06:38:00 +0000 (06:38 +0000)]
net: Fix too optimistic NETIF_F_HW_CSUM features

NETIF_F_HW_CSUM is a superset of NETIF_F_IP_CSUM+NETIF_F_IPV6_CSUM, but
some drivers miss the difference. Fix this and also fix UFO dependency
on checksumming offload as it makes the same mistake in assumptions.

Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
Acked-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoUSB CDC NCM host driver
Alexey Orishko [Mon, 29 Nov 2010 23:23:28 +0000 (23:23 +0000)]
USB CDC NCM host driver

The patch provides USB CDC NCM host driver support in the Linux Kernel.

Changes:
drivers/net/usb/cdc_ncm.c:
- initial submission of the CDC NCM host driver;
- verified on Intel 32/64 bit, Intel Atom, ST-Ericsson U8500 (ARM)
- throughput measured over 100 Mbits duplex;
- driver supports 16-bit NTB format only, but it is more than enough for
  transfers up to 64K;
- driver can handle up to 32 datagrams in received NTB;
- timer is used to collect several packets in Tx direction

drivers/net/usb/Kconfig:
- a new entry to compile CDC NCM host driver
drivers/net/usb/Makefile:
- a new entry to compile CDC NCM host driver

Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agousbnet: changes for upcoming cdc_ncm driver
Alexey Orishko [Mon, 29 Nov 2010 23:23:27 +0000 (23:23 +0000)]
usbnet: changes for upcoming cdc_ncm driver

Changes:
include/linux/usb/usbnet.h:
- a new flag to indicate driver's capability to accumulate IP packets in Tx
 direction and extract several packets from single skb in Rx direction.
drivers/net/usb/usbnet.c:
- the procedure of counting packets in usbnet was updated due to the
 accumulating of IP packets in the driver
- no short packets are sent if indicated by the flag in driver_info
 structure

Signed-off-by: Alexey Orishko <alexey.orishko@stericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wirel...
John W. Linville [Mon, 6 Dec 2010 20:35:34 +0000 (15:35 -0500)]
Merge branch 'master' of git://git./linux/kernel/git/linville/wireless-next-2.6 into for-davem

14 years agotg3: Update version to 3.116
Matt Carlson [Mon, 6 Dec 2010 08:28:54 +0000 (08:28 +0000)]
tg3: Update version to 3.116

This patch updates the tg3 version to 3.116.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotg3: Relax EEE thresholds
Matt Carlson [Mon, 6 Dec 2010 08:28:53 +0000 (08:28 +0000)]
tg3: Relax EEE thresholds

The hardware defaults to fairly aggressive EEE thresholds.  While there
appear to be no ill effects, this patch relaxes them, just as a
precaution.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotg3: Minor EEE code tweaks
Matt Carlson [Mon, 6 Dec 2010 08:28:52 +0000 (08:28 +0000)]
tg3: Minor EEE code tweaks

The first hunk of this patch makes sure that the driver checks for the
appropriate preconditions before checking if EEE negotiation succeeded.
More specifically the link needs to be full duplex for EEE to be
enabled.

The second and third hunks of this patch fix a bug where the EEE
advertisement register would be programmed with extra bits set.

The fourth hunk of this patch makes sure the EEE capability flag is not
set for 5718 A0 devices and that the device is not a serdes device.

None of these modifications are strictly necessary.  The driver /
hardware still does the right thing.  They are submitted primarily for
correctness.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotg3: Fix 57765 EEE support
Matt Carlson [Mon, 6 Dec 2010 08:28:51 +0000 (08:28 +0000)]
tg3: Fix 57765 EEE support

EEE support in the 57765 internal phy will not enable after a phy reset
unless it sees that EEE is supported in the MAC.  This patch moves the
code that programs the CPMU EEE registers to a place before the phy
reset.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotg3: Move EEE definitions into mdio.h
Matt Carlson [Mon, 6 Dec 2010 08:28:50 +0000 (08:28 +0000)]
tg3: Move EEE definitions into mdio.h

In commit 52b02d04c801fff51ca49ad033210846d1713253 entitled "tg3: Add
EEE support", Ben Hutchings had commented that the EEE advertisement
register will be in a standard location.  This patch moves that
definition into mdio.h and changes the code to use it.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agotg3: Raise the jumbo frame BD flag threshold
Matt Carlson [Mon, 6 Dec 2010 08:28:49 +0000 (08:28 +0000)]
tg3: Raise the jumbo frame BD flag threshold

The current transmit routines set the jumbo frame BD flag too
aggressively.  This can reduce performance for common cases.  This patch
raises the jumbo flag threshold to 1518, up from 1500.

Signed-off-by: Matt Carlson <mcarlson@broadcom.com>
Reviewed-by: Benjamin Li <benli@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agofilter: fix sk_filter rcu handling
Eric Dumazet [Mon, 6 Dec 2010 17:29:43 +0000 (09:29 -0800)]
filter: fix sk_filter rcu handling

Pavel Emelyanov tried to fix a race between sk_filter_(de|at)tach and
sk_clone() in commit 47e958eac280c263397.

The problem is that we can have several clones sharing a common sk_filter,
and these clones might want to sk_filter_attach() their own filters at the
same time, and can overwrite old_filter->rcu, corrupting RCU queues.

We cannot use filter->rcu without being sure no other thread could do
the same thing.

Switch the code to a more conventional ref-counting technique: do the
atomic decrement immediately and queue one RCU callback when the last
reference is released.
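
The ref-counting pattern described above can be sketched in user space with C11 atomics. All names here (`struct filter`, `filter_hold`, `filter_release`, `queue_deferred_free`) are hypothetical, and the deferred free is only a stand-in: real RCU would wait for a grace period before freeing.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical filter object shared by several sockets. */
struct filter {
    atomic_int refcnt;
};

static int deferred_frees;  /* how many times the deferred free was queued */

/* Stand-in for queueing an RCU callback; frees immediately in this sketch. */
static void queue_deferred_free(struct filter *f)
{
    deferred_frees++;
    free(f);
}

static struct filter *filter_alloc(void)
{
    struct filter *f = malloc(sizeof(*f));

    if (f)
        atomic_init(&f->refcnt, 1);
    return f;
}

static void filter_hold(struct filter *f)
{
    atomic_fetch_add(&f->refcnt, 1);
}

/* Decrement at once; only the thread dropping the last reference queues
 * the free, so the per-object callback state is touched exactly once. */
static void filter_release(struct filter *f)
{
    if (atomic_fetch_sub(&f->refcnt, 1) == 1)
        queue_deferred_free(f);
}
```

Since `atomic_fetch_sub` returns the previous value, exactly one caller observes 1 and becomes responsible for the object, however many clones release it concurrently.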

Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
14 years agonet: don't reallocate skb->head unless the current one hasn't the needed extra size...
Changli Gao [Mon, 29 Nov 2010 22:48:46 +0000 (22:48 +0000)]
net: don't reallocate skb->head unless the current one hasn't the needed extra size or is shared

Because the skb head is allocated by kmalloc(), it may be larger than
what was actually requested, owing to the discrete kmem cache sizes.
Before reallocating a new skb head, check whether the current one
already has the needed extra size.

Do this check only if the skb head is not shared.
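
A user-space analogue of this check is sketched below. `struct buf`, `buf_init`, `buf_expand` and the power-of-two `size_class()` rounding are all hypothetical; they merely imitate an allocator that hands out more memory than requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct buf {
    unsigned char *head;
    size_t used;        /* bytes requested so far */
    size_t capacity;    /* bytes the allocator really handed out */
    bool shared;        /* head is also referenced by someone else */
};

/* Hypothetical size classes: next power of two, at least 32 bytes. */
static size_t size_class(size_t n)
{
    size_t c = 32;

    while (c < n)
        c <<= 1;
    return c;
}

static int buf_init(struct buf *b, size_t n)
{
    b->capacity = size_class(n);
    b->head = malloc(b->capacity);
    b->used = n;
    b->shared = false;
    return b->head ? 0 : -1;
}

/* Grow by extra bytes. Returns 0 if the existing slack was reused,
 * 1 if a new head was allocated, -1 on allocation failure. */
static int buf_expand(struct buf *b, size_t extra)
{
    size_t cap;
    unsigned char *n;

    if (!b->shared && b->capacity - b->used >= extra) {
        b->used += extra;           /* slack is enough: no reallocation */
        return 0;
    }
    cap = size_class(b->used + extra);
    n = malloc(cap);
    if (!n)
        return -1;
    memcpy(n, b->head, b->used);
    if (!b->shared)
        free(b->head);              /* a shared head still belongs to others */
    b->head = n;
    b->capacity = cap;
    b->used += extra;
    b->shared = false;
    return 1;
}
```

In this sketch a 40-byte request lands in a 64-byte class, so growing by up to 24 bytes costs nothing, while a shared head is always copied regardless of slack.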

Signed-off-by: Changli Gao <xiaosuo@gmail.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Acked-by: Eric Dumazet <eric.dumazet@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>