MUNEDA Takahiro [Thu, 2 Feb 2012 16:09:22 +0000 (11:09 -0500)]
PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp driver
Add a parameter to avoid using MSI/MSI-X for PCIe native hotplug; it's
known to be buggy on some platforms.
In my environment, while shutting down, following stack trace is shown
sometimes.
irq 16: nobody cared (try booting with the "irqpoll" option)
Pid: 1081, comm: reboot Not tainted 3.2.0 #1
Call Trace:
<IRQ> [<
ffffffff810cec1d>] __report_bad_irq+0x3d/0xe0
[<
ffffffff810cee1c>] note_interrupt+0x15c/0x210
[<
ffffffff810cc485>] handle_irq_event_percpu+0xb5/0x210
[<
ffffffff810cc621>] handle_irq_event+0x41/0x70
[<
ffffffff810cf675>] handle_fasteoi_irq+0x55/0xc0
[<
ffffffff81015356>] handle_irq+0x46/0xb0
[<
ffffffff814fbe9d>] do_IRQ+0x5d/0xe0
[<
ffffffff814f146e>] common_interrupt+0x6e/0x6e
[<
ffffffff8106b040>] ? __do_softirq+0x60/0x210
[<
ffffffff8108aeb1>] ? hrtimer_interrupt+0x151/0x240
[<
ffffffff814fb5ec>] call_softirq+0x1c/0x30
[<
ffffffff810152d5>] do_softirq+0x65/0xa0
[<
ffffffff8106ae9d>] irq_exit+0xbd/0xe0
[<
ffffffff814fbf8e>] smp_apic_timer_interrupt+0x6e/0x99
[<
ffffffff814f9e5e>] apic_timer_interrupt+0x6e/0x80
<EOI> [<
ffffffff814f0fb1>] ? _raw_spin_unlock_irqrestore+0x11/0x20
[<
ffffffff812629fc>] pci_bus_write_config_word+0x6c/0x80
[<
ffffffff81266fc2>] pci_intx+0x52/0xa0
[<
ffffffff8127de3d>] pci_intx_for_msi+0x1d/0x30
[<
ffffffff8127e4fb>] pci_msi_shutdown+0x7b/0x110
[<
ffffffff81269d34>] pci_device_shutdown+0x34/0x50
[<
ffffffff81326c4f>] device_shutdown+0x2f/0x140
[<
ffffffff8107b981>] kernel_restart_prepare+0x31/0x40
[<
ffffffff8107b9e6>] kernel_restart+0x16/0x60
[<
ffffffff8107bbfd>] sys_reboot+0x1ad/0x220
[<
ffffffff814f4b90>] ? do_page_fault+0x1e0/0x460
[<
ffffffff811942d0>] ? __sync_filesystem+0x90/0x90
[<
ffffffff8105c9aa>] ? __cond_resched+0x2a/0x40
[<
ffffffff814ef090>] ? _cond_resched+0x30/0x40
[<
ffffffff81169e17>] ? iterate_supers+0xb7/0xd0
[<
ffffffff814f9382>] system_call_fastpath+0x16/0x1b
handlers:
[<
ffffffff8138a0f0>] usb_hcd_irq
[<
ffffffff8138a0f0>] usb_hcd_irq
[<
ffffffff8138a0f0>] usb_hcd_irq
Disabling IRQ #16
An un-wanted interrupt is generated when PCI driver switches from
MSI/MSI-X to INTx while shutting down the device. The interrupt does
not happen if MSI/MSI-X is not used on the device.
I confirmed that this problem does not happen if pcie_hp=nomsi was
specified and hotplug operation worked fine as usual.
v2: Automatically disable MSI/MSI-X against following device:
PCI bridge: Integrated Device Technology, Inc. Device 807f (rev 02)
v3: Based on the review comment, combile the if statements.
v4: Removed module parameter.
Move some code to build pciehp as a module.
Move device specific code to driver/pci/quirks.c.
v5: Drop a device specific code until getting a vendor statement.
Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 11 Feb 2012 08:18:41 +0000 (00:18 -0800)]
PCI: move pci_find_saved_cap out of linux/pci.h
Only one user in driver/pci/pci.c, so we don't need to put it in global
pci.h
Reviewed-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 11 Feb 2012 08:18:30 +0000 (00:18 -0800)]
PCI: fix memleak for pci dev removing during hotplug
unreferenced object 0xffff880276d17700 (size 64):
comm "swapper/0", pid 1, jiffies
4294897182 (age 3976.028s)
hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 18 f9 de 76 02 88 ff ff ...........v....
10 00 00 00 0e 00 00 00 0f 28 40 00 00 00 00 00 .........(@.....
backtrace:
[<
ffffffff81c8aede>] kmemleak_alloc+0x26/0x43
[<
ffffffff811385f0>] __kmalloc+0x121/0x183
[<
ffffffff813cf821>] pci_add_cap_save_buffer+0x35/0x7c
[<
ffffffff813d12b7>] pci_allocate_cap_save_buffers+0x1d/0x65
[<
ffffffff813cdb52>] pci_device_add+0x92/0xf1
[<
ffffffff81c8afe6>] pci_scan_single_device+0x9f/0xa1
[<
ffffffff813cdbd2>] pci_scan_slot.part.20+0x21/0x106
[<
ffffffff813cdce2>] pci_scan_slot+0x2b/0x35
[<
ffffffff81c8dae4>] __pci_scan_child_bus+0x51/0x107
[<
ffffffff81c8d75b>] pci_scan_bridge+0x376/0x6ae
[<
ffffffff81c8db60>] __pci_scan_child_bus+0xcd/0x107
[<
ffffffff81c8dbab>] pci_scan_child_bus+0x11/0x2a
[<
ffffffff81cca58c>] pci_acpi_scan_root+0x18b/0x21c
[<
ffffffff81c916be>] acpi_pci_root_add+0x1e1/0x42a
[<
ffffffff81406210>] acpi_device_probe+0x50/0x190
[<
ffffffff814a0227>] really_probe+0x99/0x126
Need to free saved_buffer for capabilities.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sun, 19 Feb 2012 22:50:12 +0000 (14:50 -0800)]
PCI: Fix device class print out
Found debug print of class is shifted.
| pci 0000:f8:15.2: [8086:2b56] type 0 class 0x000600
Code is trying to print class with 6 digits, but use shifted class with
4 digits valid value as variable.
Change to original dev->class directly.
Also remove not needed calculating of local variable class, because it
will be updated after pci_fixup_device(pci_fixup_early...)
Also unify type print out when class and header is not matched.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Anthony PERARD [Tue, 21 Feb 2012 16:22:10 +0000 (16:22 +0000)]
PCI: Add PCI_EXP_TYPE_PCIE_BRIDGE value
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 10 Feb 2012 23:33:48 +0000 (15:33 -0800)]
PCI: Skip cardbus assigned resource reset during pci bus rescan
Otherwise when rescan is used for cardbus, assigned resources will get
cleared.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 10 Feb 2012 23:33:47 +0000 (15:33 -0800)]
PCI: Fix "cardbus bridge resources as optional" size handling
We should not set the requested size to -2; that will confuse the
resource list sorting with align when SIZEALIGN is used.
Change to STARTALIGN and pass align from start; we are safe to do that
just as we do that regular pci bridge. In the long run, we should just
treat cardbus like a regular pci bridge.
Also fix the case when realloc_head is not passed: we should keep the
requested size.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 10 Feb 2012 23:33:46 +0000 (15:33 -0800)]
PCI: Disable cardbus bridge MEM1 prefetchable bit
Some BIOSes enable prefetch on both MEM0 and MEM1. But the cardbus code
assumes MEM1 is non-pref...
Discussion could be found at:
https://lkml.org/lkml/2012/1/12/1
https://bugzilla.kernel.org/show_bug.cgi?id=41622#c23
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Rafael J. Wysocki [Mon, 6 Feb 2012 23:50:35 +0000 (00:50 +0100)]
PCI / PM: Disable wakeup during shutdown for devices not enabled to wake up
If a PCI device is enabled to generate wakeup signals (PME) when put
into a low-power state by runtime PM, it will be still enabled to
generate those signals after the system shutdown, unless its driver's
.shutdown() callback takes care of the wakeup signals generation
setting. Moreover, there are devices that are not enabled to wake
up the system and that are configured by runtime PM to generate
wakeup signals so that (runtime) remote wakeup works with them.
Those devices should be reconfigured during system shutdown so that
they don't generate wakeup signals, but at least some drivers don't
do that. However, that very well may be done by the PCI core so
that drivers don't have to worry about it. For this reason, modify
pci_device_shutdown() to disable the generation of wakeup events for
devices not supposed to wake up the system.
References: https://bugzilla.kernel.org/show_bug.cgi?id=37952
Reported-and-tested-by: Kamil Iskra <kamil.54002@iskra.name>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sun, 5 Feb 2012 06:55:01 +0000 (22:55 -0800)]
PCI: Fix /sys warning when sriov enabled and card is hot removed
sysfs is a bit stricter now and emits warnings in more cases.
For SRIOV hotplug, we are calling pci_stop_dev() for each VF first
(after we update pci_stop_bus_devices) which remove each VF subdir. So
double check the VF dir in /sys before trying to remove the physfn link.
Signed-of-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Matthew Garrett [Fri, 3 Feb 2012 15:18:13 +0000 (10:18 -0500)]
PCI: pcie: Add support for setting default ASPM policy
Distributions may wish to provide different defaults for PCIE ASPM
depending on their target audience. Provide a configuration option for
choosing the default policy.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Thomas Jarosch [Wed, 7 Dec 2011 21:08:11 +0000 (22:08 +0100)]
PCI: Add quirk for still enabled interrupts on Intel Sandy Bridge GPUs
Some BIOS implementations leave the Intel GPU interrupts enabled,
even though no one is handling them (f.e. i915 driver is never loaded).
Additionally the interrupt destination is not set up properly
and the interrupt ends up -somewhere-.
These spurious interrupts are "sticky" and the kernel disables
the (shared) interrupt line after 100.000+ generated interrupts.
Fix it by disabling the still enabled interrupts.
This resolves crashes often seen on monitor unplug.
Tested on the following boards:
- Intel DH61CR: Affected
- Intel DH67BL: Affected
- Intel S1200KP server board: Affected
- Asus P8H61-M LE: Affected, but system does not crash.
Probably the IRQ ends up somewhere unnoticed.
According to reports on the net, the Intel DH61WW board is also affected.
Many thanks to Jesse Barnes from Intel for helping
with the register configuration and to Intel in general
for providing public hardware documentation.
Signed-off-by: Thomas Jarosch <thomas.jarosch@intra2net.com>
Tested-by: Charlie Suffin <charlie.suffin@stratus.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Arjan van de Ven [Tue, 31 Jan 2012 04:52:07 +0000 (20:52 -0800)]
PCI: Annotate PCI quirks in initcall_debug style
While diagnosing some boot time issues on a platform, all that I
could see in the bootgraph/dmesg was that the system was spending
a lot of time in applying one or more PCI quirks... which
was virtually undebuggable.
This patch adds printk's in "initcall_debug" style to the dmesg,
which are added when the user asks for the initcall_debug
(the nr one tool to use when debugging boot hangs or boot time issues)
kernel command line option.
v2: add #includes so quirks can build on non-x86
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Danny Kukawka [Mon, 30 Jan 2012 22:00:10 +0000 (23:00 +0100)]
PCI hotplug: cpcihp: fix debug module parameter to be bool
Fix debug variable from module parameter to be really bool to
fix 'warning: return from incompatible pointer type'.
Acked-by: Scott Murray <scott@spiteful.org>
Signed-off-by: Danny Kukawka <danny.kukawka@bisect.de>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Kay, Allen M [Thu, 26 Jan 2012 18:25:53 +0000 (10:25 -0800)]
PCI: check for pci bar restore completion and retry
On some OEM systems, pci_restore_state() is called while FLR has not yet
completed. As a result, PCI BAR register restore is not successful. This fix
reads back the restored value and compares it with saved value and re-tries 10
times before giving up.
Signed-off-by: Jean Guyader <jean.guyader@eu.citrix.com>
Signed-off-by: Eric Chanudet <eric.chanudet@citrix.com>
Signed-off-by: Allen Kay <allen.m.kay@intel.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:15 +0000 (10:55 -0800)]
PCI: pciehp: Disable/enable link during slot power off/on
On a system with a repeater on the system board to support gen2 hotplug,
we found that when an ExpressModule is removed from some slots,
/var/log/messages will be full of "card present/not present" warnings.
It turns out the root complex is continually trying to train the link to
the repeater because the repeater has not been reset.
This patch will disable the link at removal time to allow the repeater
to be reset properly. This also prevents a potential AER message at
removal time.
Also, when testing hotplug on a system under development, we found if we
boot the system without an EM installed, and later hot-add an EM, it
does not work with Linux, but another OS is ok. The root cause is that
BIOS left link disabled when slot was empty at boot time, and other OS
is modifying the link disable bit in link ctrl during power on/off.
So we should do the same thing to disable/enable link during power off/on.
-v2: check link DLLA bit instead of 100ms waiting.
Separate link disable/enable functions to another patch.
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:14 +0000 (10:55 -0800)]
PCI: pciehp: Add Disable/enable link functions
Will use it during power off/on of slots
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:13 +0000 (10:55 -0800)]
PCI: pciehp: Add pcie_wait_link_not_active()
Will use it for link disable status checking.
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:12 +0000 (10:55 -0800)]
PCI: pciehp: make check_link_active more helpful
A few changes:
- remove the 'inline' and let the complier decide
- return a bool to indicate whether the link was active
- add a debug message to indicate link state when it beocmes active
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:11 +0000 (10:55 -0800)]
PCI: pciehp: replace unconditional sleep with config space access check
During reviewing
| PCI: pciehp: wait 1000 ms before Link Training check
Linus said:
>...
> That's a *long* time, and it's irritating to the user. It makes the
> user think "the machine is slow".
>...
> And quite frankly, an unconditional one-second delay here seems bad.
>Two seconds was unacceptable, one second is just bad.
Try to access the pci conf of a pci device that is supposed to show up
in 1s. If we can read back a valid vendor/device id, we can return
early.
Related discussion could be found:
https://lkml.org/lkml/2011/12/6/339
-v2: seperate code to pci_bus_read_dev_vendor_id() from pci_scan_device()
and reuse it from pciehp code. Suggested by Matthew Wilcox.
-v3: According to Kenj, don't use array in stack, and don't wait too long
for crs, also return fail status if not found.
Also separate pci_bus_dev_read_vendor_id() change to another patch.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:10 +0000 (10:55 -0800)]
PCI: Separate pci_bus_read_dev_vendor_id from pci_scan_device
We can reuse it for pciehp probing.
-v2: according to Kenji, fix crs timeout checking, and export the function
for later use when pciehp is compiled as a module.
Suggested-by: Matthew Wilcox <matthew@wil.cx>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Fri, 27 Jan 2012 18:55:09 +0000 (10:55 -0800)]
PCI: make sriov work with hotplug remove
When hot removing a pci express module that has a pcie switch and supports
SRIOV, we got:
[ 5918.610127] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 1
[ 5918.615779] pciehp 0000:80:02.2:pcie04: Attention button interrupt received
[ 5918.622730] pciehp 0000:80:02.2:pcie04: Button pressed on Slot(3)
[ 5918.629002] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 1f9
[ 5918.637416] pciehp 0000:80:02.2:pcie04: PCI slot #3 - powering off due to button press.
[ 5918.647125] pciehp 0000:80:02.2:pcie04: pcie_isr: intr_loc 10
[ 5918.653039] pciehp 0000:80:02.2:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
[ 5918.661229] pciehp 0000:80:02.2:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd c0
[ 5924.667627] pciehp 0000:80:02.2:pcie04: Disabling domain:bus:device=0000:b0:00
[ 5924.674909] pciehp 0000:80:02.2:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 2f9
[ 5924.683262] pciehp 0000:80:02.2:pcie04: pciehp_unconfigure_device: domain:bus:dev = 0000:b0:00
[ 5924.693976] libfcoe_device_notification: NETDEV_UNREGISTER eth6
[ 5924.764979] libfcoe_device_notification: NETDEV_UNREGISTER eth14
[ 5924.873539] libfcoe_device_notification: NETDEV_UNREGISTER eth15
[ 5924.995209] libfcoe_device_notification: NETDEV_UNREGISTER eth16
[ 5926.114407] sxge 0000:b2:00.0: PCI INT A disabled
[ 5926.119342] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 5926.127189] IP: [<
ffffffff81353a3b>] pci_stop_bus_device+0x33/0x83
[ 5926.133377] PGD 0
[ 5926.135402] Oops: 0000 [#1] SMP
[ 5926.138659] CPU 2
[ 5926.140499] Modules linked in:
...
[ 5926.143754]
[ 5926.275823] Call Trace:
[ 5926.278267] [<
ffffffff81353a38>] pci_stop_bus_device+0x30/0x83
[ 5926.284180] [<
ffffffff81353af4>] pci_remove_bus_device+0x1a/0xba
[ 5926.290264] [<
ffffffff81366311>] pciehp_unconfigure_device+0x110/0x17b
[ 5926.296866] [<
ffffffff81365dd9>] ? pciehp_disable_slot+0x188/0x188
[ 5926.303123] [<
ffffffff81365d6f>] pciehp_disable_slot+0x11e/0x188
[ 5926.309206] [<
ffffffff81365e68>] pciehp_power_thread+0x8f/0xe0
...
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]--
| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Device
| | | +-00.1 Device
| | | +-00.2 Device
| | | \-00.3 Device
| | \-03.0-[b3]--+-00.0 Device
| | +-00.1 Device
| | +-00.2 Device
| | \-00.3 Device
root complex: 80:02.2
pci express modules: have pcie switch and are listed as b0:00.0, b1:02.0 and b1:03.0.
end devices are b2:00.0 and b3.00.0.
VFs are: b2:00.1,... b2:00.3, and b3:00.1,...,b3:00.3
Root cause: when doing pci_stop_bus_device() with phys fn, it will stop
virt fn and remove the fn, so
list_for_each_safe(l, n, &bus->devices)
will have problem to refer freed n that is pointed to vf entry.
Solution is just replacing list_for_each_safe() with
list_for_each_prev_safe(). This will make sure we can get valid n pointer
to PF instead of the freed VF pointer (because newly added devices are
inserted to the bus->devices list tail).
During reviewing the patch, Bjorn said:
| The PCI hot-remove path calls pci_stop_bus_devices() via
| pci_remove_bus_device().
|
| pci_stop_bus_devices() traverses the bus->devices list (point A below),
| stopping each device in turn, which calls the driver remove() method. When
| the device is an SR-IOV PF, the driver calls pci_disable_sriov(), which
| also uses pci_remove_bus_device() to remove the VF devices from the
| bus->devices list (point B).
|
| pci_remove_bus_device
| pci_stop_bus_device
| pci_stop_bus_devices(subordinate)
| list_for_each(bus->devices) <-- A
| pci_stop_bus_device(PF)
| ...
| driver->remove
| pci_disable_sriov
| ...
| pci_remove_bus_device(VF)
| <remove from bus_list> <-- B
|
| At B, we're changing the same list we're iterating through at A, so when
| the driver remove() method returns, the pci_stop_bus_devices() iterator has
| a pointer to a list entry that has already been freed.
Discussion thread can be found : https://lkml.org/lkml/2011/10/15/141
https://lkml.org/lkml/2012/1/23/360
-v5: According to Linus to make remove more robust, Change to
list_for_each_prev_safe instead. That is more reasonable, because
those devices are added to tail of the list before.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:32 +0000 (02:08 -0800)]
PCI: remove add_to_failed_list()
Only one user; just use add_to_list instead.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:31 +0000 (02:08 -0800)]
PCI: add debug print out for add_size
For use in debugging resource reallocation.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:30 +0000 (02:08 -0800)]
PCI: make free_list() into a function
After merging struct pci_dev_resource_x and pci_dev_resource,
We can use a function instead of macro now.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:29 +0000 (02:08 -0800)]
PCI: Rename dev_res_x to add_res or fail_res
Linus says don't use dev_res_x because it doesn't communicate anything
about usage. Rename them to add_res or fail_res etc according to
context.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:28 +0000 (02:08 -0800)]
PCI: Merge pci_dev_resource_x and pci_dev_resource
pci_dev_resource_x is a superset of pci_dev_resource and they're just
temp structs used during resource reallocation.
pci_dev_resource usage is quite limted.
So just use pci_dev_resource_x, and rename it as new pci_dev_resource.
-v2: According to Linus, Separate free_list change to another patch
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:27 +0000 (02:08 -0800)]
PCI: Replace resource_list with generic list
So we can use helper functions for generic list. This makes the
resource re-allocation code much more readable.
-v2: Use list_add_tail instead of adding list_insert_before, Pointed out
by Linus.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:26 +0000 (02:08 -0800)]
PCI: Move struct resource_list to setup-bus.c
No user outside of setup-bus.c now. Later patches will convert
resource_list to a regular list.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:25 +0000 (02:08 -0800)]
PCI: Move pdev_sort_resources() to setup-bus.c
This allows us to move the definition of struct resource_list to
setup_bus.c and later convert resource_list to a regular list.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:24 +0000 (02:08 -0800)]
PCI: make re-allocation try harder by reassigning ranges higher in the heirarchy
On a system with devices that support SRIOV connected to a pcie switch
to pcie root port:
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device 207a
| | \-03.0-[a3]--+-00.0 Oracle Corporation Device 207a
| +-02.2-[b0-bf]----00.0-[b1-b3]--+-02.0-[b2]--+-00.0 Oracle Corporation Device 207a
| | \-03.0-[b3]--+-00.0 Oracle Corporation Device 207a
When the BIOS does not assign resources for SRIOV BARs, kernel pci
reallocation only goes up one bridge and then gives up, failing to to
get resources for all sSRIOV BARs, even though the range is large enough
in the peer root bus.
Specifically, only the bridge at the a1:02.0 level has its resources
cleared and reallocated. The kernel does not go up to clear the bridge
at the 80:02.0 level.
To make it go to upper levels, during retry, we need to treat "good to have"
resources as "must have".
Only on the last try will we treat good to have resources as optional.
At that time, parent bridge resources will already have been released so
we'll have a chance to get everything assigned with must_have plus
good_to_have for all child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:23 +0000 (02:08 -0800)]
PCI: Make pci_rescan_bus handle add_list
This allows us to allocate resources to hotplug bridges during
remove/rescan.
We need to move the function to setup-bus.c so it can use
__pci_bus_size_bridges and __pci_bus_assign_resources directly to take
the add_list resource tracking list.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:22 +0000 (02:08 -0800)]
PCI: Make rescan bus increase bridge resource size if needed
Current rescan will not touch bridge MMIO and IO.
Try to reuse pci_assign_unassigned_bridge_resources(bridge) to update bridge
resources, if child devices need more resources.
Only do that for bridges whose children are all removed already; i.e. don't
release resources that could already be in use by drivers on child devices.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:21 +0000 (02:08 -0800)]
PCI: Use add_list in pcie hotplug path.
We need add size for hot plug path when pluging in hotplug chassis
without cards.
-v2: change descriptions. make it applicable after "pci: Check bridge
resources after resource allocation."
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:20 +0000 (02:08 -0800)]
PCI: try to assign required+option size first
We found reassignment can not find a range for one resource, even if the
total available range is large enough.
bridge b1:02.0 will need 2M+3M
bridge b1:03.0 will need 2M+3M
so bridge b0:00.0 will get assigned: 4M : [
f8000000-
f83fffff]
later is reassigned to 10M : [
f8000000-
f9ffffff]
b1:02.0 is assigned to 2M : [
f8000000-
f81fffff]
b1:03.0 is assigned to 2M : [
f8200000-
f83fffff]
After that b1:03.0 get chance to be reassigned to [
f8200000-
f86fffff],
but b1:02.0 will not have chance to expand, because b1:03.0 is using in
middle one.
[ 187.911401] pci 0000:b1:02.0: bridge window [mem 0x00100000-0x002fffff] to [bus b2-b2] add_size 300000
[ 187.920764] pci 0000:b1:03.0: bridge window [mem 0x00100000-0x002fffff] to [bus b3-b3] add_size 300000
[ 187.930129] pci 0000:b1:02.0: [mem 0x00100000-0x002fffff] get_res_add_size add_size 300000
[ 187.938500] pci 0000:b1:03.0: [mem 0x00100000-0x002fffff] get_res_add_size add_size 300000
[ 187.946857] pci 0000:b0:00.0: bridge window [mem 0x00100000-0x004fffff] to [bus b1-b3] add_size 600000
[ 187.956206] pci 0000:b0:00.0: BAR 14: assigned [mem 0xf8000000-0xf83fffff]
[ 187.963102] pci 0000:b0:00.0: BAR 15: assigned [mem 0xf5000000-0xf51fffff pref]
[ 187.970434] pci 0000:b0:00.0: BAR 14: reassigned [mem 0xf8000000-0xf89fffff]
[ 187.977497] pci 0000:b1:02.0: BAR 14: assigned [mem 0xf8000000-0xf81fffff]
[ 187.984383] pci 0000:b1:02.0: BAR 15: assigned [mem 0xf5000000-0xf50fffff pref]
[ 187.991695] pci 0000:b1:03.0: BAR 14: assigned [mem 0xf8200000-0xf83fffff]
[ 187.998576] pci 0000:b1:03.0: BAR 15: assigned [mem 0xf5100000-0xf51fffff pref]
[ 188.005888] pci 0000:b1:03.0: BAR 14: reassigned [mem 0xf8200000-0xf86fffff]
[ 188.012939] pci 0000:b1:02.0: BAR 14: can't assign mem (size 0x200000)
[ 188.019471] pci 0000:b1:02.0: failed to add 300000 to res=[mem 0xf8000000-0xf81fffff]
[ 188.027326] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 188.034071] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[ 188.040795] pci 0000:b2:00.0: BAR 2: assigned [mem 0xf8000000-0xf80fffff 64bit]
[ 188.048119] pci 0000:b2:00.0: BAR 2: set to [mem 0xf8000000-0xf80fffff 64bit] (PCI address [0xf8000000-0xf80fffff])
[ 188.058550] pci 0000:b2:00.0: BAR 6: assigned [mem 0xf5000000-0xf50fffff pref]
[ 188.065802] pci 0000:b2:00.0: BAR 0: assigned [mem 0xf8100000-0xf8103fff 64bit]
[ 188.073125] pci 0000:b2:00.0: BAR 0: set to [mem 0xf8100000-0xf8103fff 64bit] (PCI address [0xf8100000-0xf8103fff])
[ 188.083596] pci 0000:b2:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[ 188.090310] pci 0000:b2:00.0: BAR 9: can't assign mem (size 0x300000)
[ 188.096773] pci 0000:b2:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 188.103479] pci 0000:b2:00.0: BAR 7: assigned [mem 0xf8104000-0xf810ffff 64bit]
[ 188.110801] pci 0000:b2:00.0: BAR 7: set to [mem 0xf8104000-0xf810ffff 64bit] (PCI address [0xf8104000-0xf810ffff])
[ 188.121256] pci 0000:b1:02.0: PCI bridge to [bus b2-b2]
[ 188.126512] pci 0000:b1:02.0: bridge window [mem 0xf8000000-0xf81fffff]
[ 188.133328] pci 0000:b1:02.0: bridge window [mem 0xf5000000-0xf50fffff pref]
[ 188.140608] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 188.147341] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[ 188.154076] pci 0000:b3:00.0: BAR 2: assigned [mem 0xf8200000-0xf82fffff 64bit]
[ 188.161417] pci 0000:b3:00.0: BAR 2: set to [mem 0xf8200000-0xf82fffff 64bit] (PCI address [0xf8200000-0xf82fffff])
[ 188.171865] pci 0000:b3:00.0: BAR 6: assigned [mem 0xf5100000-0xf51fffff pref]
[ 188.179090] pci 0000:b3:00.0: BAR 0: assigned [mem 0xf8300000-0xf8303fff 64bit]
[ 188.186431] pci 0000:b3:00.0: BAR 0: set to [mem 0xf8300000-0xf8303fff 64bit] (PCI address [0xf8300000-0xf8303fff])
[ 188.196884] pci 0000:b3:00.0: reg 18c: [mem 0x00000000-0x000fffff 64bit]
[ 188.203591] pci 0000:b3:00.0: BAR 9: assigned [mem 0xf8400000-0xf86fffff 64bit]
[ 188.210909] pci 0000:b3:00.0: BAR 9: set to [mem 0xf8400000-0xf86fffff 64bit] (PCI address [0xf8400000-0xf86fffff])
[ 188.221379] pci 0000:b3:00.0: reg 184: [mem 0x00000000-0x00003fff 64bit]
[ 188.228089] pci 0000:b3:00.0: BAR 7: assigned [mem 0xf8304000-0xf830ffff 64bit]
[ 188.235407] pci 0000:b3:00.0: BAR 7: set to [mem 0xf8304000-0xf830ffff 64bit] (PCI address [0xf8304000-0xf830ffff])
[ 188.245843] pci 0000:b1:03.0: PCI bridge to [bus b3-b3]
[ 188.251107] pci 0000:b1:03.0: bridge window [mem 0xf8200000-0xf86fffff]
[ 188.257922] pci 0000:b1:03.0: bridge window [mem 0xf5100000-0xf51fffff pref]
[ 188.265180] pci 0000:b0:00.0: PCI bridge to [bus b1-b3]
[ 188.270443] pci 0000:b0:00.0: bridge window [mem 0xf8000000-0xf89fffff]
[ 188.277250] pci 0000:b0:00.0: bridge window [mem 0xf5000000-0xf51fffff pref]
[ 188.284512] pcieport 0000:80:02.2: PCI bridge to [bus b0-bf]
[ 188.290184] pcieport 0000:80:02.2: bridge window [io 0xa000-0xbfff]
[ 188.296735] pcieport 0000:80:02.2: bridge window [mem 0xf8000000-0xf8ffffff]
[ 188.303963] pcieport 0000:80:02.2: bridge window [mem 0xf5000000-0xf5ffffff 64bit pref]
Thus b2:00.0 BAR 9 does not get assigned...
root cause:
b1:02.0 can not be added more range, because b1:03.0 is just after it;
no space between the required ranges.
Solution:
Try to assign required + optional all together at first, and if that
fails, try again with just the required resources.
-v2: seperate add_to_list change() to another patch according to Jesse.
seperate get_res_add_size() moving to another patch according to Jesse.
add !realloc_head->next check if the list is empty to bail early
according to Jesse.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:19 +0000 (02:08 -0800)]
PCI: Move get_res_add_size() function
Need to call it from __assign_resources_sorted() later and we'd like to
avoid a forward declaraion.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:18 +0000 (02:08 -0800)]
PCI: Make add_to_list() return status
Will be used for resource_list_x duplication when trying
requested+optional at first.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sat, 21 Jan 2012 10:08:17 +0000 (02:08 -0800)]
PCI : Calculate right add_size
During debug of one SRIOV enabled hotplug device, we found found that
add_size is not passed properly.
The device has devices under two level bridges:
+-[0000:80]-+-00.0-[81-8f]--
| +-01.0-[90-9f]--
| +-02.0-[a0-af]----00.0-[a1-a3]--+-02.0-[a2]--+-00.0 Oracle Corporation Device
| | \-03.0-[a3]--+-00.0 Oracle Corporation Device
Which means later the parent bridge will not try to add a big enough range:
[ 557.455077] pci 0000:a0:00.0: BAR 14: assigned [mem 0xf9000000-0xf93fffff]
[ 557.461974] pci 0000:a0:00.0: BAR 15: assigned [mem 0xf6000000-0xf61fffff pref]
[ 557.469340] pci 0000:a1:02.0: BAR 14: assigned [mem 0xf9000000-0xf91fffff]
[ 557.476231] pci 0000:a1:02.0: BAR 15: assigned [mem 0xf6000000-0xf60fffff pref]
[ 557.483582] pci 0000:a1:03.0: BAR 14: assigned [mem 0xf9200000-0xf93fffff]
[ 557.490468] pci 0000:a1:03.0: BAR 15: assigned [mem 0xf6100000-0xf61fffff pref]
[ 557.497833] pci 0000:a1:03.0: BAR 14: can't assign mem (size 0x200000)
[ 557.504378] pci 0000:a1:03.0: failed to add optional resources res=[mem 0xf9200000-0xf93fffff]
[ 557.513026] pci 0000:a1:02.0: BAR 14: can't assign mem (size 0x200000)
[ 557.519578] pci 0000:a1:02.0: failed to add optional resources res=[mem 0xf9000000-0xf91fffff]
It turns out we did not calculate size1 properly.
static resource_size_t calculate_memsize(resource_size_t size,
resource_size_t min_size,
resource_size_t size1,
resource_size_t old_size,
resource_size_t align)
{
if (size < min_size)
size = min_size;
if (old_size == 1 )
old_size = 0;
if (size < old_size)
size = old_size;
size = ALIGN(size + size1, align);
return size;
}
We should not pass add_size with min_size in calculate_memsize since
that will make add_size not contribute final add_size.
So just pass add_size with size1 to calculate_memsize().
With this change, we should have chance to remove extra addon in
pci_reassign_resource.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Masanari Iida [Thu, 26 Jan 2012 14:45:47 +0000 (23:45 +0900)]
PCI: Fix typo in setup-res.c
Correct spelling "resouce" to "resource" in
dricers/pci/setup-res.c
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Bjorn Helgaas [Wed, 18 Jan 2012 00:41:21 +0000 (17:41 -0700)]
x86/PCI: don't fall back to defaults if _CRS has no apertures
Host bridges that lead to things like the Uncore need not have any
I/O port or MMIO apertures. For example, in this case:
ACPI: PCI Root Bridge [UNC1] (domain 0000 [bus ff])
PCI: root bus ff: using default resources
PCI host bridge to bus 0000:ff
pci_bus 0000:ff: root bus resource [io 0x0000-0xffff]
pci_bus 0000:ff: root bus resource [mem 0x00000000-0x3fffffffffff]
we should not pretend those default resources are available on bus ff.
CC: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Konrad Rzeszutek Wilk [Thu, 12 Jan 2012 17:06:47 +0000 (12:06 -0500)]
xen/pciback: Support pci_reset_function, aka FLR or D3 support.
We use the __pci_reset_function_locked to perform the action.
Also on attaching ("bind") and detaching ("unbind") we save and
restore the configuration states. When the device is disconnected
from a guest we use the "pci_reset_function" to also reset the
device before being passed to another guest.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Konrad Rzeszutek Wilk [Thu, 12 Jan 2012 17:06:46 +0000 (12:06 -0500)]
PCI: Introduce __pci_reset_function_locked to be used when holding device_lock.
The use case of this is when a driver wants to call FLR when a device
is attached to it using the SysFS "bind" or "unbind" functionality.
The call chain when a user does "bind" looks as so:
echo "0000:01.07.0" > /sys/bus/pci/drivers/XXXX/bind
and ends up calling:
driver_bind:
device_lock(dev); <=== TAKES LOCK
XXXX_probe:
.. pci_enable_device()
...__pci_reset_function(), which calls
pci_dev_reset(dev, 0):
if (!0) {
device_lock(dev) <==== DEADLOCK
The __pci_reset_function_locked function allows the the drivers
'probe' function to call the "pci_reset_function" while still holding
the driver mutex lock.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Julia Lawall [Thu, 12 Jan 2012 09:55:16 +0000 (10:55 +0100)]
PCI: drivers/pci/hotplug/ibmphp_ebda.c: add missing iounmap
Add missing iounmap in error handling code, in a case where the function
already preforms iounmap on some other execution path.
A simplified version of the semantic match that finds this problem is as
follows: (http://coccinelle.lip6.fr/)
// <smpl>
@@
expression e;
statement S,S1;
int ret;
@@
e = \(ioremap\|ioremap_nocache\)(...)
... when != iounmap(e)
if (<+...e...+>) S
... when any
when != iounmap(e)
*if (...)
{ ... when != iounmap(e)
return ...; }
... when any
iounmap(e);
// </smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@lip6.fr>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Amos Kong [Fri, 25 Nov 2011 07:03:07 +0000 (15:03 +0800)]
PCI: Can continually add funcs after adding func0
Boot up a KVM guest, and hotplug multifunction
devices(func1,func2,func0,func3) to guest.
for i in 1 2 0 3;do
qemu-img create /tmp/resize$i.qcow2 1G -f qcow2
(qemu) drive_add 0x11.$i id=drv11$i,if=none,file=/tmp/resize$i.qcow2
(qemu) device_add virtio-blk-pci,id=dev11$i,drive=drv11$i,addr=0x11.$i,multifunction=on
done
In linux kernel, when func0 of the slot is hot-added, the whole
slot will be marked as 'enabled', then driver will ignore other new
hotadded funcs.
But in Win7 & WinXP, we can continaully add other funcs after adding
func0, all funcs will be added in guest.
drivers/pci/hotplug/acpiphp_glue.c:
static int acpiphp_check_bridge(struct acpiphp_bridge *bridge)
{
....
for (slot = bridge->slots; slot; slot = slot->next) {
if (slot->flags & SLOT_ENABLED) {
acpiphp_disable_slot()
else
acpiphp_enable_slot()
.... |
} v
enable_device()
|
v
//only don't enable slot if func0 is not added
list_for_each_entry(func, &slot->funcs, sibling) {
...
}
slot->flags |= SLOT_ENABLED; //mark slot to 'enabled'
This patch just make pci driver can continaully add funcs after adding
func 0. Only mark slot to 'enabled' when all funcs are added.
For pci multifunction hotplug, we can add functions one by one(func 0 is
necessary), and all functions will be removed in one time.
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Myron Stowe [Mon, 21 Nov 2011 18:54:19 +0000 (11:54 -0700)]
x86/PCI: Convert maintaining FW-assigned BIOS BAR values to use a list
This patch converts the underlying maintenance aspects of FW-assigned
BIOS BAR values from a statically allocated array within struct pci_dev
to a list of temporary, stand alone, entries.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Myron Stowe [Mon, 21 Nov 2011 18:54:13 +0000 (11:54 -0700)]
x86/PCI: Infrastructure to maintain a list of FW-assigned BIOS BAR values
Commit
58c84eda075 introduced functionality to try and reinstate the
original BIOS BAR addresses of a PCI device when normal resource
assignment attempts fail. To keep track of the BIOS BAR addresses,
struct pci_dev was augmented with an array to hold the BAR addresses
of the PCI device: 'resource_size_t fw_addr[DEVICE_COUNT_RESOURCE]'.
The reinstatement of BAR addresses is an uncommon event leaving the
'fw_addr' array unused under normal circumstances. This functionality
is also currently architecture specific with an implementation limited
to x86. As the use of struct pci_dev is so prevalent, having the
'fw_addr' array residing within such seems somewhat wasteful.
This patch introduces a stand alone data structure and interfacing
routines for maintaining a list of FW-assigned BIOS BAR value entries.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Myron Stowe [Mon, 21 Nov 2011 18:54:07 +0000 (11:54 -0700)]
PCI: Fix starting basis for resource requests
pci_revert_fw_address() is used to reinstate a PCI device's original
FW-assigned BIOS BAR value(s) if normal resource assignment fails.
When attempting to reinstate an address, the point within the resource
tree from which to attempt the new resource request should be the parent
resource corresponding to the device, not the base of the resource tree
(ioport_resource or iomem_resource). For PCI devices this would
typically be the resource corresponding to the upstream PCI host bridge
or P2P bridge aperture.
This patch sets the point within the resource tree to attempt a new
resource assignment request to the PCI device's parent resource and only
if that fails does it fall back to the base ioport_resource or
iomem_resource.
Signed-off-by: Myron Stowe <myron.stowe@redhat.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Sun, 5 Feb 2012 06:55:00 +0000 (22:55 -0800)]
PCI: Fix pci cardbus removal
During test busn_res allocation with cardbus, found pci card removal is not
working anymore, and it turns out it is broken by:
|commit
79cc9601c3e42b4f0650fe7e69132ebce7ab48f9
|Date: Tue Nov 22 21:06:53 2011 -0800
|
| PCI: Only call pci_stop_bus_device() one time for child devices at remove
The above changed the behavior of pci_remove_behind_bridge that
yenta_cardbus depended on. So restore the old behavoir of
pci_remove_behind_bridge (which requires stopping and removing of all
devices) by:
1. rename pci_remove_behind_bridge to __pci_remove_behind_bridge, and let
__pci_remove_bus_device() call it instead.
2. add pci_stop_behind_bridge that will stop devices behind a bridge
3. add back pci_remove_behind_bridge that will stop and remove devices
under bridge.
-v2: update commit description a little bit.
Tested-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Vaidyanathan Srinivasan [Thu, 2 Feb 2012 17:41:20 +0000 (23:11 +0530)]
PCI: set pci sriov page size before reading SRIOV BAR
For an SRIOV device, PCI_SRIOV_SYS_PGSIZE should be set before
the PCI_SRIOV_BAR are queried. The sys pagesize defaults to 4k,
so this change is required on powerpc box with 64k base page size.
This is a regression caused due to moving SRIOV init to sriov_enable().
| commit
afd24ece5c76af87f6fc477f2747b83a764f161c
| Author: Ram Pai <linuxram@us.ibm.com>
| PCI: delay configuration of SRIOV capability
| The SRIOV capability, namely page size and total_vfs of a device are
| configured during enumeration phase of the device. This can potentially
| interfere with the PCI operations of the platform, if the IOV capability
| of the device is not enabled.
Signed-off-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
Acked-by: Ram Pai <linuxram@us.ibm.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Yinghai Lu [Mon, 30 Jan 2012 11:25:24 +0000 (12:25 +0100)]
PCI: workaround hard-wired bus number V2
Fixes PCI device detection on IBM xSeries IBM 3850 M2 / x3950 M2
when using ACPI resources (_CRS).
This is default, a manual workaround (without this patch)
would be pci=nocrs boot param.
V2: Add dev_warn if the workaround is hit. This should reveal
how common such setups are (via google) and point to possible
problems if things are still not working as expected.
-> Suggested by Jan Beulich.
Cc: stable@vger.kernel.org
Tested-by: garyhade@us.ibm.com
Signed-off-by: Yinghai Lu <yinghai.lu@oracle.com>
Signed-off-by: Jesse Barnes <jbarnes@virtuousgeek.org>
Linus Torvalds [Fri, 27 Jan 2012 15:56:25 +0000 (07:56 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux: (31 commits)
gma500: Fix suspend/resume functions
drm/exynos: fixed pm feature for fimd module.
MAINTAINERS: added maintainer entry for Exynos DRM Driver.
drm/exynos: fixed build dependency for DRM_EXYNOS_FIMD
drm/exynos: fix build dependency for DRM_EXYNOS_HDMI
drm/exynos: use release_mem_region instead of release_resource
agp: fix scratch page cleanup
drm/i915: fixup forcewake spinlock fallout in drpc debugfs function
drm/i915: debugfs: show semaphore registers also on gen7
drm/i915: allow userspace forcewake references also on gen7
drm/i915: Re-enable gen7 RC6 and GPU turbo after resume.
drm/i915: Correct debugfs printout for RC1e.
Revert "drm/i915: Work around gen7 BLT ring synchronization issues."
drm/i915: rip out the HWSTAM missed irq workaround
drm/i915: paper over missed irq issues with force wake voodoo
drm/i915: Hold gt_lock across forcewake register reads
drm/i915: Hold gt_lock during reset
drm/i915: Move reset forcewake processing to gen6_do_reset
drm/i915: protect force_wake_(get|put) with the gt_lock
drm/i915: convert force_wake_get to func pointer in the gpu reset code
...
Linus Torvalds [Fri, 27 Jan 2012 15:53:06 +0000 (07:53 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tiwai/sound
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda - Fix silent output on Haier W18 laptop
ALSA: hda: set mute led polarity for laptops with buggy BIOS based on SSID
ALSA: hda - Fix silent output on ASUS A6Rp
ALSA: Fix memory leak on error in snd_compr_set_params()
ALSA: ymfpci - Don't create invalid PCM & mixers when AC97 doesn't support
Ryan Mallon [Fri, 27 Jan 2012 06:28:24 +0000 (17:28 +1100)]
gma500: Fix suspend/resume functions
Both the suspend and resume functions incorrectly set psbfb =
to_psb_fb(NULL) outside of the loop over all of the framebuffers. Fix
this by moving the assignment of psbfb inside the loop and removing the
initialisation of fb.
Signed-off-by: Ryan Mallon <rmallon@gmail.com>
Acked-by: Alan Cox <alan@linux.intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Fri, 27 Jan 2012 07:16:27 +0000 (07:16 +0000)]
Merge branch 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung into drm-fixes
* 'exynos-drm-fixes' of git://git.infradead.org/users/kmpark/linux-samsung:
drm/exynos: fixed pm feature for fimd module.
MAINTAINERS: added maintainer entry for Exynos DRM Driver.
drm/exynos: fixed build dependency for DRM_EXYNOS_FIMD
drm/exynos: fix build dependency for DRM_EXYNOS_HDMI
drm/exynos: use release_mem_region instead of release_resource
Inki Dae [Fri, 27 Jan 2012 02:54:58 +0000 (11:54 +0900)]
drm/exynos: fixed pm feature for fimd module.
this patch separates fimd specific power on/off function from pm function
and the pm interfaces will call that function for power on or off.
and also removes unnecessary codes of resume function.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Inki Dae [Tue, 17 Jan 2012 05:08:55 +0000 (14:08 +0900)]
MAINTAINERS: added maintainer entry for Exynos DRM Driver.
I'd like to add my colleagues who dedicated to developing and
improving our driver to maintainer entry.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Inki Dae [Mon, 16 Jan 2012 09:55:02 +0000 (18:55 +0900)]
drm/exynos: fixed build dependency for DRM_EXYNOS_FIMD
FB based FIMD and DRM based FIMD drivers use same hardware
so with this patch, only one of them would be selected.
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Seung-Woo Kim [Wed, 4 Jan 2012 06:34:32 +0000 (15:34 +0900)]
drm/exynos: fix build dependency for DRM_EXYNOS_HDMI
DRM_EXYNOS_HDMI driver and VIDEO_SAMSUNG_S5P_TV driver should be
not enabled at once because they use same HW blocks. So dependency
for DRM_EXYNOS_HDMI is fixed to check VIDEO_SAMSUNG_S5P_TV=n.
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Seung-Woo Kim [Thu, 22 Dec 2011 02:30:09 +0000 (11:30 +0900)]
drm/exynos: use release_mem_region instead of release_resource
To make a api pair of request_mem_region and release_mem_region,
release_mem_region is used instead of release_resource.
Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Inki Dae <inki.dae@samsung.com>
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Linus Torvalds [Fri, 27 Jan 2012 01:04:47 +0000 (17:04 -0800)]
Merge branch 'v4l_for_linus' of git://git./linux/kernel/git/mchehab/linux-media
* 'v4l_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
[media] cinergyT2-fe: Fix bandwdith settings
[media] V4L: atmel-isi: add clk_prepare()/clk_unprepare() functions
[media] cxd2820r: sleep on DVB-T/T2 delivery system switch
[media] anysee: fix CI init
[media] cxd2820r: remove unused parameter from cxd2820r_attach
[media] cxd2820r: fix dvb_frontend_ops
Linus Torvalds [Fri, 27 Jan 2012 01:00:38 +0000 (17:00 -0800)]
Merge git://git./linux/kernel/git/davem/sparc
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
sparc32: forced setting of mode of sun4m per-cpu timers
Linus Torvalds [Thu, 26 Jan 2012 20:46:07 +0000 (12:46 -0800)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/microcode_amd: Add support for CPU family specific container files
x86/amd: Add missing feature flag for fam15h models 10h-1fh processors
x86/boot-image: Don't leak phdrs in arch/x86/boot/compressed/misc.c::Parse_elf()
x86/numachip: Drop unnecessary conflict with EDAC
x86/uv: Fix uninitialized spinlocks
x86/uv: Fix uv_gpa_to_soc_phys_ram() shift
Linus Torvalds [Thu, 26 Jan 2012 20:45:57 +0000 (12:45 -0800)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf: Call perf_cgroup_event_time() directly
perf: Don't call release_callchain_buffers() if allocation fails
Linus Torvalds [Thu, 26 Jan 2012 20:45:41 +0000 (12:45 -0800)]
Merge branch 'core-urgent-for-linus' of git://git./linux/kernel/git/tip/tip
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
rcu: Add missing __cpuinit annotation in rcutorture code
sched: Add "const" to is_idle_task() parameter
rcu: Make rcutorture bool parameters really bool (core code)
memblock: Fix alloc failure due to dumb underflow protection in memblock_find_in_range_node()
Linus Torvalds [Thu, 26 Jan 2012 20:43:57 +0000 (12:43 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/geert/linux-m68k
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Fix assembler constraint to prevent overeager gcc optimisation
mac_esp: rename irq
mac_scsi: dont enable mac_scsi irq before requesting it
macfb: fix black and white modes
m68k/irq: Remove obsolete IRQ_FLG_* definitions
Fix up trivial conflict in arch/m68k/kernel/process_mm.c as per Geert.
Michal Kubecek [Wed, 25 Jan 2012 15:51:05 +0000 (16:51 +0100)]
agp: fix scratch page cleanup
In error cleanup of agp_backend_initialize() and in agp_backend_cleanup(),
agp_destroy_page() is passed virtual address of the scratch page. This
leads to a kernel warning if the initialization fails (or upon regular
cleanup) as pointer to struct page should be passed instead.
Signed-off-by: Michal Kubecek <mkubecek@suse.cz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 26 Jan 2012 18:25:54 +0000 (18:25 +0000)]
Merge branch 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux into drm-fixes
* 'drm-intel-fixes' of git://people.freedesktop.org/~keithp/linux: (24 commits)
drm/i915: fixup forcewake spinlock fallout in drpc debugfs function
drm/i915: debugfs: show semaphore registers also on gen7
drm/i915: allow userspace forcewake references also on gen7
drm/i915: Re-enable gen7 RC6 and GPU turbo after resume.
drm/i915: Correct debugfs printout for RC1e.
Revert "drm/i915: Work around gen7 BLT ring synchronization issues."
drm/i915: rip out the HWSTAM missed irq workaround
drm/i915: paper over missed irq issues with force wake voodoo
drm/i915: Hold gt_lock across forcewake register reads
drm/i915: Hold gt_lock during reset
drm/i915: Move reset forcewake processing to gen6_do_reset
drm/i915: protect force_wake_(get|put) with the gt_lock
drm/i915: convert force_wake_get to func pointer in the gpu reset code
drm/i915: sprite init failure on pre-SNB is not a failure
drm/i915: VBT Parser cleanup for eDP block
drm/i915: mask transcoder select bits before setting them on LVDS
drm/i915: Add Clientron E830 to the ignore LVDS list
CHROMIUM: i915: Add DMI override to skip CRT initialization on ZGB
drm/i915: handle 3rd pipe
drm/i915: simplify pipe checking
...
Takashi Iwai [Thu, 26 Jan 2012 14:56:16 +0000 (15:56 +0100)]
ALSA: hda - Fix silent output on Haier W18 laptop
The very same problem is seen on Haier W18 laptop with ALC861 as seen
on ASUS A6Rp, which was fixed by the commit
3b25eb69.
Now we just need to add a new SSID entry pointing to the same fixup.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42656
Cc: <stable@kernel.org> [v3.2+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Andreas Herrmann [Fri, 20 Jan 2012 16:44:12 +0000 (17:44 +0100)]
x86/microcode_amd: Add support for CPU family specific container files
We've decided to provide CPU family specific container files
(starting with CPU family 15h). E.g. for family 15h we have to
load microcode_amd_fam15h.bin instead of microcode_amd.bin
Rationale is that starting with family 15h patch size is larger
than 2KB which was hard coded as maximum patch size in various
microcode loaders (not just Linux).
Container files which include patches larger than 2KB cause
different kinds of trouble with such old patch loaders. Thus we
have to ensure that the default container file provides only
patches with size less than 2KB.
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/20120120164412.GD24508@alberich.amd.com
[ documented the naming convention and tidied the code a bit. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Andreas Herrmann [Fri, 20 Jan 2012 16:38:23 +0000 (17:38 +0100)]
x86/amd: Add missing feature flag for fam15h models 10h-1fh processors
That is the last one missing for those CPUs.
Others were recently added with commits
fb215366b3c7320ac25dca766a0152df16534932
(KVM: expose latest Intel cpu new features (BMI1/BMI2/FMA/AVX2) to guest)
and
commit
969df4b82904a30fef19a67398a0c854d223ea67
(x86: Report cpb and eff_freq_ro flags correctly)
Signed-off-by: Andreas Herrmann <andreas.herrmann3@amd.com>
Link: http://lkml.kernel.org/r/20120120163823.GC24508@alberich.amd.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Jesper Juhl [Mon, 23 Jan 2012 22:34:59 +0000 (23:34 +0100)]
x86/boot-image: Don't leak phdrs in arch/x86/boot/compressed/misc.c::Parse_elf()
We allocate memory with malloc(), but neglect to free it before
the variable 'phdrs' goes out of scope --> leak.
Signed-off-by: Jesper Juhl <jj@chaosbits.net>
Link: http://lkml.kernel.org/r/alpine.LNX.2.00.1201232332590.8772@swampdragon.chaosbits.net
[ Mostly harmless. ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Daniel J Blueman [Wed, 25 Jan 2012 06:35:49 +0000 (14:35 +0800)]
x86/numachip: Drop unnecessary conflict with EDAC
EDAC detection no longer crashes multi-node systems, so don't
conflict on it with NumaChip.
Signed-off-by: Daniel J Blueman <daniel@numascale-asia.com>
Cc: Steffen Persvold <sp@numascale.com>
Link: http://lkml.kernel.org/r/1327473349-28395-1-git-send-email-daniel@numascale-asia.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cliff Wickman [Wed, 18 Jan 2012 15:40:47 +0000 (09:40 -0600)]
x86/uv: Fix uninitialized spinlocks
Initialize two spinlocks in tlb_uv.c and also properly define/initialize
the uv_irq_lock.
The lack of explicit initialization seems to be functionally
harmless, but it is diagnosed when these are turned on:
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_LOCKDEP=y
Signed-off-by: Cliff Wickman <cpw@sgi.com>
Cc: <stable@kernel.org>
Cc: Dimitri Sivanich <sivanich@sgi.com>
Link: http://lkml.kernel.org/r/E1RnXd1-0003wU-PM@eag09.americas.sgi.com
[ Added the uv_irq_lock initialization fix by Dimitri Sivanich ]
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Russ Anderson [Thu, 19 Jan 2012 02:07:54 +0000 (20:07 -0600)]
x86/uv: Fix uv_gpa_to_soc_phys_ram() shift
uv_gpa_to_soc_phys_ram() was inadvertently ignoring the
shift values. This fix takes the shift into account.
Signed-off-by: Russ Anderson <rja@sgi.com>
Cc: <stable@kernel.org>
Link: http://lkml.kernel.org/r/20120119020753.GA7228@sgi.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Linus Torvalds [Thu, 26 Jan 2012 03:28:58 +0000 (19:28 -0800)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6
* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
crypto: sha512 - reduce stack usage to safe number
crypto: sha512 - make it work, undo percpu message schedule
Linus Torvalds [Wed, 25 Jan 2012 23:36:44 +0000 (15:36 -0800)]
Merge branch 'for-linus' of git://oss.sgi.com/xfs/xfs
Quoth Ben Myers:
"Please pull in the following bugfix for xfs. We forgot to drop a lock on
error in xfs_readlink. It hasn't been through -next yet, but there is no
-next tree tomorrow. The fix is clear so I'm sending this request today."
* 'for-linus' of git://oss.sgi.com/xfs/xfs:
xfs: Fix missing xfs_iunlock() on error recovery path in xfs_readlink()
Linus Torvalds [Wed, 25 Jan 2012 23:24:30 +0000 (15:24 -0800)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
drm/ttm: fix two regressions since move_notify changes
drm/radeon: avoid deadlock if GPU lockup is detected in ib_pool_get
drm/radeon: silence out possible lock dependency warning
drm: Fix authentication kernel crash
gma500: Fix shmem mapping
drm/radeon/kms: refine TMDS dual link checks
drm/radeon/kms: use drm_detect_hdmi_monitor for picking encoder mode
drm/radeon/kms: rework modeset sequence for DCE41 and DCE5
drm/radeon/kms: move panel mode setup into encoder mode set
drm/radeon/kms: move disp eng pll setup to init path
drm/radeon: finish getting bios earlier
drm/radeon: fix invalid memory access in radeon_atrm_get_bios()
drm/radeon/kms: add some missing semaphore init
drm/radeon/kms: Add an MSI quirk for Dell RS690
gpu, drm, sis: Don't return uninitialized variable from sis_driver_load()
Linus Torvalds [Wed, 25 Jan 2012 23:13:04 +0000 (15:13 -0800)]
Merge branch 'fix/asoc' of git://git./linux/kernel/git/tiwai/sound
* 'fix/asoc' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ASoC: wm2000: Fix use-after-free - don't release_firmware() twice on error
ASoC: wm8958: Use correct format string in dev_err() call
ASoC: wm8996: Call _POST_PMU callback for CPVDD
ASoC: mxs: Fix mxs-saif timeout
ASoC: Disable register synchronisation for low frequency WM8996 SYSCLK
ASoC: Don't go through cache when applying WM5100 rev A updates
ASoC: Mark WM5100 register map cache only when going into BIAS_OFF
ASoC: tlv320aic32x4: always enable analouge block
ASoC: tlv320aic32x4: always enable dividers
ASoC: sgtl5000: Fix wrong register name in restore
Linus Torvalds [Wed, 25 Jan 2012 23:11:57 +0000 (15:11 -0800)]
Merge tag 'for-linus' of git://git./linux/kernel/git/broonie/regmap
A fairly simple bugfix for a WARN_ON() which was triggered in the cache
reset support as a result of some subsequent work. There's only one
mainline user for the code path that's updated right now (wm8994) so
should be low risk.
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap:
regmap: Reset cache status when reinitialsing the cache
Li Wang [Wed, 25 Jan 2012 07:40:31 +0000 (15:40 +0800)]
eCryptfs: move misleading function comments
The data encryption was moved from ecryptfs_write_end into
ecryptfs_writepage, this patch moves the corresponding function
comments to be consistent with the modification.
Signed-off-by: Li Wang <liwang@nudt.edu.cn>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Linus Torvalds [Wed, 25 Jan 2012 23:03:04 +0000 (15:03 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/tyhicks/ecryptfs
Says Tyler:
"Tim's logging message update will be really helpful to users when
they're trying to locate a problematic file in the lower filesystem
with filename encryption enabled.
You'll recognize the fix from Li, as you commented on that.
You should also be familiar with my setattr/truncate improvements,
since you were the one that pointed them out to us (thanks again!).
Andrew noted the /dev/ecryptfs write count sanitization needed to be
improved, so I've got a fix in there for that along with some other
less important cleanups of the /dev/ecryptfs read/write code."
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tyhicks/ecryptfs:
eCryptfs: Fix oops when printing debug info in extent crypto functions
eCryptfs: Remove unused ecryptfs_read()
eCryptfs: Check inode changes in setattr
eCryptfs: Make truncate path killable
eCryptfs: Infinite loop due to overflow in ecryptfs_write()
eCryptfs: Replace miscdev read/write magic numbers
eCryptfs: Report errors in writes to /dev/ecryptfs
eCryptfs: Sanitize write counts of /dev/ecryptfs
ecryptfs: Remove unnecessary variable initialization
ecryptfs: Improve metadata read failure logging
MAINTAINERS: Update eCryptfs maintainer address
Tyler Hicks [Tue, 24 Jan 2012 16:02:22 +0000 (10:02 -0600)]
eCryptfs: Fix oops when printing debug info in extent crypto functions
If pages passed to the eCryptfs extent-based crypto functions are not
mapped and the module parameter ecryptfs_verbosity=1 was specified at
loading time, a NULL pointer dereference will occur.
Note that this wouldn't happen on a production system, as you wouldn't
pass ecryptfs_verbosity=1 on a production system. It leaks private
information to the system logs and is for debugging only.
The debugging info printed in these messages is no longer very useful
and rather than doing a kmap() in these debugging paths, it will be
better to simply remove the debugging paths completely.
https://launchpad.net/bugs/913651
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Daniel DeFreez
Cc: <stable@vger.kernel.org>
Tyler Hicks [Wed, 18 Jan 2012 21:09:43 +0000 (15:09 -0600)]
eCryptfs: Remove unused ecryptfs_read()
ecryptfs_read() has been ifdef'ed out for years now and it was
apparently unused before then. It is time to get rid of it for good.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tyler Hicks [Fri, 20 Jan 2012 02:33:44 +0000 (20:33 -0600)]
eCryptfs: Check inode changes in setattr
Most filesystems call inode_change_ok() very early in ->setattr(), but
eCryptfs didn't call it at all. It allowed the lower filesystem to make
the call in its ->setattr() function. Then, eCryptfs would copy the
appropriate inode attributes from the lower inode to the eCryptfs inode.
This patch changes that and actually calls inode_change_ok() on the
eCryptfs inode, fairly early in ecryptfs_setattr(). Ideally, the call
would happen earlier in ecryptfs_setattr(), but there are some possible
inode initialization steps that must happen first.
Since the call was already being made on the lower inode, the change in
functionality should be minimal, except for the case of a file extending
truncate call. In that case, inode_newsize_ok() was never being
called on the eCryptfs inode. Rather than inode_newsize_ok() catching
maximum file size errors early on, eCryptfs would encrypt zeroed pages
and write them to the lower filesystem until the lower filesystem's
write path caught the error in generic_write_checks(). This patch
introduces a new function, called ecryptfs_inode_newsize_ok(), which
checks if the new lower file size is within the appropriate limits when
the truncate operation will be growing the lower file.
In summary this change prevents eCryptfs truncate operations (and the
resulting page encryptions), which would exceed the lower filesystem
limits or FSIZE rlimits, from ever starting.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reviewed-by: Li Wang <liwang@nudt.edu.cn>
Cc: <stable@vger.kernel.org>
Tyler Hicks [Thu, 19 Jan 2012 00:30:04 +0000 (18:30 -0600)]
eCryptfs: Make truncate path killable
ecryptfs_write() handles the truncation of eCryptfs inodes. It grabs a
page, zeroes out the appropriate portions, and then encrypts the page
before writing it to the lower filesystem. It was unkillable and due to
the lack of sparse file support could result in tying up a large portion
of system resources, while encrypting pages of zeros, with no way for
the truncate operation to be stopped from userspace.
This patch adds the ability for ecryptfs_write() to detect a pending
fatal signal and return as gracefully as possible. The intent is to
leave the lower file in a useable state, while still allowing a user to
break out of the encryption loop. If a pending fatal signal is detected,
the eCryptfs inode size is updated to reflect the modified inode size
and then -EINTR is returned.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Cc: <stable@vger.kernel.org>
Li Wang [Thu, 19 Jan 2012 01:44:36 +0000 (09:44 +0800)]
eCryptfs: Infinite loop due to overflow in ecryptfs_write()
ecryptfs_write() can enter an infinite loop when truncating a file to a
size larger than 4G. This only happens on architectures where size_t is
represented by 32 bits.
This was caused by a size_t overflow due to it incorrectly being used to
store the result of a calculation which uses potentially large values of
type loff_t.
[tyhicks@canonical.com: rewrite subject and commit message]
Signed-off-by: Li Wang <liwang@nudt.edu.cn>
Signed-off-by: Yunchuan Wen <wenyunchuan@kylinos.com.cn>
Reviewed-by: Cong Wang <xiyou.wangcong@gmail.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tyler Hicks [Sat, 14 Jan 2012 15:46:46 +0000 (16:46 +0100)]
eCryptfs: Replace miscdev read/write magic numbers
ecryptfs_miscdev_read() and ecryptfs_miscdev_write() contained many
magic numbers for specifying packet header field sizes and offsets. This
patch defines those values and replaces the magic values.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tyler Hicks [Sat, 14 Jan 2012 14:51:37 +0000 (15:51 +0100)]
eCryptfs: Report errors in writes to /dev/ecryptfs
Errors in writes to /dev/ecryptfs were being incorrectly reported by
returning 0 or the value of the original write count.
This patch clears up the return code assignment in error paths.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tyler Hicks [Thu, 12 Jan 2012 10:30:44 +0000 (11:30 +0100)]
eCryptfs: Sanitize write counts of /dev/ecryptfs
A malicious count value specified when writing to /dev/ecryptfs may
result in a a very large kernel memory allocation.
This patch peeks at the specified packet payload size, adds that to the
size of the packet headers and compares the result with the write count
value. The resulting maximum memory allocation size is approximately 532
bytes.
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Reported-by: Sasha Levin <levinsasha928@gmail.com>
Cc: <stable@vger.kernel.org>
Tim Gardner [Thu, 12 Jan 2012 15:31:55 +0000 (16:31 +0100)]
ecryptfs: Remove unnecessary variable initialization
Removes unneeded variable initialization in ecryptfs_read_metadata(). Also adds
a small comment to help explain metadata reading logic.
[tyhicks@canonical.com: Pulled out of for-stable patch and wrote commit msg]
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Tim Gardner [Thu, 12 Jan 2012 15:31:55 +0000 (16:31 +0100)]
ecryptfs: Improve metadata read failure logging
Print inode on metadata read failure. The only real
way of dealing with metadata read failures is to delete
the underlying file system file. Having the inode
allows one to 'find . -inum INODE`.
[tyhicks@canonical.com: Removed some minor not-for-stable parts]
Signed-off-by: Tim Gardner <tim.gardner@canonical.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Cc: stable@vger.kernel.org
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Dustin Kirkland [Wed, 7 Dec 2011 14:56:49 +0000 (08:56 -0600)]
MAINTAINERS: Update eCryptfs maintainer address
Update my email address in MAINTAINERS.
Signed-off-by: Dustin Kirkland <dustin.kirkland@gazzang.com>
Signed-off-by: Tyler Hicks <tyhicks@canonical.com>
Ben Skeggs [Wed, 25 Jan 2012 05:34:22 +0000 (15:34 +1000)]
drm/ttm: fix two regressions since move_notify changes
Both changes in
dc97b3409a790d2a21aac6e5cdb99558b5944119 cause serious
regressions in the nouveau driver.
move_notify() was originally able to presume that bo->mem is the old node,
and new_mem is the new node. The above commit moves the call to
move_notify() to after move() has been done, which means that now, sometimes,
new_mem isn't the new node at all, bo->mem is, and new_mem points at a
stale, possibly-just-been-killed-by-move node.
This is clearly not a good situation. This patch reverts this change, and
replaces it with a cleanup in the move() failure path instead.
The second issue is that the call to move_notify() from cleanup_memtype_use()
causes the TTM ghost objects to get passed into the driver. This is clearly
bad as the driver knows nothing about these "fake" TTM BOs, and ends up
accessing uninitialised memory.
I worked around this in nouveau's move_notify() hook by ensuring the BO
destructor was nouveau's. I don't particularly like this solution, and
would rather TTM never pass the driver these objects. However, I don't
clearly understand the reason why we're calling move_notify() here anyway
and am happy to work around the problem in nouveau instead of breaking the
behaviour expected by other drivers.
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Cc: Jerome Glisse <j.glisse@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Daniel Vetter [Wed, 25 Jan 2012 12:52:43 +0000 (13:52 +0100)]
drm/i915: fixup forcewake spinlock fallout in drpc debugfs function
My forcewake spinlock patches have a functional conflict with Ben
Widawsky's gen6 drpc support for debugfs. Result was a benign warning
about trying to read an non-atomic variabla with atomic_read.
Note that the entire check is racy anyway and purely informational.
Also update it to reflect the forcewake voodoo changes, the kernel can
now also hold onto a forcewake reference for longer times.
Signed-Off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
Jan Kara [Wed, 11 Jan 2012 18:52:10 +0000 (18:52 +0000)]
xfs: Fix missing xfs_iunlock() on error recovery path in xfs_readlink()
Commit
b52a360b forgot to call xfs_iunlock() when it detected corrupted
symplink and bailed out. Fix it by jumping to 'out' instead of doing return.
CC: stable@kernel.org
CC: Carlos Maiolino <cmaiolino@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Alex Elder <elder@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
Jerome Glisse [Mon, 23 Jan 2012 16:52:15 +0000 (11:52 -0500)]
drm/radeon: avoid deadlock if GPU lockup is detected in ib_pool_get
If GPU lockup is detected in ib_pool get we are holding the ib_pool
mutex that will be needed by the GPU reset code. As ib_pool code is
safe to be reentrant from GPU reset code we should not block if we
are trying to get the ib pool lock on the behalf of the same userspace
caller, thus use the radeon_mutex_lock helper.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jerome Glisse [Tue, 24 Jan 2012 17:08:52 +0000 (12:08 -0500)]
drm/radeon: silence out possible lock dependency warning
Silence out the lock dependency warning by moving bo allocation out
of ib mutex protected section. Might lead to useless temporary
allocation but it's not harmful as such things only happen at
initialization.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Thomas Hellstrom [Tue, 24 Jan 2012 17:54:21 +0000 (18:54 +0100)]
drm: Fix authentication kernel crash
If the master tries to authenticate a client using drm_authmagic and
that client has already closed its drm file descriptor,
either wilfully or because it was terminated, the
call to drm_authmagic will dereference a stale pointer into kmalloc'ed memory
and corrupt it.
Typically this results in a hard system hang.
This patch fixes that problem by removing any authentication tokens
(struct drm_magic_entry) open for a file descriptor when that file
descriptor is closed.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Gustavo Maciel Dias Vieira [Tue, 24 Jan 2012 15:27:56 +0000 (13:27 -0200)]
ALSA: hda: set mute led polarity for laptops with buggy BIOS based on SSID
HP laptop models with buggy BIOS are apparently frequent, including
machines with different codecs. Set the polarity of the mute led based
on the SSID and include an entry for the HP Mini 110-3100.
Signed-off-by: Gustavo Maciel Dias Vieira <gustavo@sagui.org>
Tested-by: Predrag Ivanovic <predivan@open.telekom.rs>
Cc: <stable@kernel.org> [v3.2+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Takashi Iwai [Wed, 25 Jan 2012 08:55:46 +0000 (09:55 +0100)]
ALSA: hda - Fix silent output on ASUS A6Rp
The refactoring of Realtek codec driver in 3.2 kernel caused a
regression for ASUS A6Rp laptop; it doesn't give any output.
The reason was that this machine has a secret master mute (or EAPD)
control via NID 0x0f VREF. Setting VREF50 on this node makes the
sound working again.
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=42588
Cc: <stable@kernel.org> [v3.2+]
Signed-off-by: Takashi Iwai <tiwai@suse.de>