x86: merge the simple bitops and move them to bitops.h
Some of those can be written in such a way that the same
inline assembly can be used to generate both 32 bit and
64 bit code.
For ffs and fls, x86_64 unconditionally used the cmov
instruction and i386 unconditionally used a conditional
branch over a mov instruction. In the current patch I
chose to select the version based on the availability
of the cmov instruction instead. A small detail here is
that x86_64 did not previously set CONFIG_X86_CMOV=y.
Improved comments for ffs, ffz, fls and variations.
Signed-off-by: Alexander van Heukelum <heukelum@fastmail.fm> Signed-off-by: Ingo Molnar <mingo@elte.hu>
x86, generic: optimize find_next_(zero_)bit for small constant-size bitmaps
This moves an optimization for searching constant-sized small
bitmaps form x86_64-specific to generic code.
On an i386 defconfig (the x86#testing one), the size of vmlinux hardly
changes with this applied. I have observed only four places where this
optimization avoids a call into find_next_bit:
In the functions return_unused_surplus_pages, alloc_fresh_huge_page,
and adjust_pool_surplus, this patch avoids a call for a 1-bit bitmap.
In __next_cpu a call is avoided for a 32-bit bitmap. That's it.
On x86_64, 52 locations are optimized with a minimal increase in
code size:
Current #testing defconfig:
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename 5392637 846592 724424 6963653 6a41c5 vmlinux
After removing the x86_64 specific optimization for find_next_*bit:
94 x bsf, 79 x find_next_*bit
text data bss dec hex filename 5392358 846592 724424 6963374 6a40ae vmlinux
After this patch (making the optimization generic):
146 x bsf, 27 x find_next_*bit
text data bss dec hex filename 5392396 846592 724424 6963412 6a40d4 vmlinux
The versions with inline assembly are in fact slower on the machines I
tested them on (in userspace) (Athlon XP 2800+, p4-like Xeon 2.8GHz, AMD
Opteron 270). The i386-version needed a fix similar to 06024f21 to avoid
crashing the benchmark.
Benchmark using: gcc -fomit-frame-pointer -Os. For each bitmap size
1...512, for each possible bitmap with one bit set, for each possible
offset: find the position of the first bit starting at offset. If you
follow ;). Times include setup of the bitmap and checking of the
results.
If the bitmap size is not a multiple of BITS_PER_LONG, and no set
(cleared) bit is found, find_next_bit (find_next_zero_bit) returns a
value outside of the range [0, size]. The generic version always returns
exactly size. The generic version also uses unsigned long everywhere,
while the x86 versions use a mishmash of int, unsigned (int), long and
unsigned long.
Using the generic version does give a slightly bigger kernel, though.
Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-fixes:
x86 PAT: decouple from nonpromisc devmem
x86 PAT: tone down debugging messages
* git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/v4l-dvb:
V4L/DVB (7751): ir-kbd-i2c: Save a temporary memory allocation in ir_probe
V4L/DVB (7750): au0828/ cleanups and fixes
V4L/DVB (7748): tuner-core: some adjustments at tuner logs, if debug enabled
V4L/DVB (7746): pvrusb2: make signed one-bit bitfields unsigned
V4L/DVB (7744): pvrusb2-dvb: add atsc/qam support for Hauppauge pvrusb2 model 751xx
V4L/DVB (7742): cx88: Add support for the DViCO FusionHDTV_7_GOLD digital modes
V4L/DVB (7741): s5h1411: Adding support for this ATSC/QAM demodulator
V4L/DVB (7740): tuner-xc2028.c dubious !x & y
V4L/DVB (7739): mt312.h: dubious one-bit signed bitfield
V4L/DVB (7735): Fix compilation for au0828
V4L/DVB (7734): em28xx: copy and paste error in em28xx_init_isoc
V4L/DVB (7733): blackbird_find_mailbox negative return ignored in blackbird_initialize_codec()
V4L/DVB (7732): vivi: fix a warning
* git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (61 commits)
ide: sanitize handling of IDE_HFLAG_NO_SET_MODE host flag
sis5513: fail early for unsupported chipsets
it821x: fix kzalloc() failure handling
qd65xx: use IDE_HFLAG_SINGLE host flag
qd65xx: always use ->selectproc method
ide-cd: put proc-related functions together under single ifdef
ide-cd: Replace __FUNCTION__ with __func__
IDE: Coding Style fixes to drivers/ide/ide-cd.c
IDE: Coding Style fixes to drivers/ide/pci/cy82c693.c
IDE: Coding Style fixes to drivers/ide/pci/it8213.c
IDE: Coding Style fixes to drivers/ide/ide-floppy.c
IDE: Coding Style fixes to drivers/ide/legacy/ali14xx.c
IDE: Coding Style fixes to drivers/ide/legacy/hd.c
IDE: Coding Style fixes to drivers/ide/pci/cmd640.c
IDE: Coding Style fixes to drivers/ide/pci/opti621.c
IDE: Coding Style fixes to drivers/ide/ide-pnp.c
IDE: Coding Style fixes to drivers/ide/ide-proc.c
IDE: Coding Style fixes to drivers/ide/legacy/ide-4drives.c
IDE: Coding Style fixes to drivers/ide/legacy/umc8672.c
IDE: Coding Style fixes to drivers/ide/pci/generic.c
...
Merge branch 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6
* 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
agp: convert drivers/char/agp/frontend.c to use unlocked_ioctl
agp: fix shadowed variable warning in amd-k7-agp.c
Merge branch 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
* 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6:
drm: _end is shadowing real _end, just rename it.
drm/vbl rework: rework how the drm deals with vblank.
drm: reorganise minor number handling using backported modesetting code.
drm/i915: Handle tiled buffers in vblank tasklet
drm/i965: On I965, use correct 3DSTATE_DRAWING_RECTANGLE command in vblank
drm: Remove unneeded dma sync in ATI pcigart alloc
drm: Fix mismerge of non-coherent DMA patch
Ingo Molnar [Mon, 3 Mar 2008 11:38:52 +0000 (12:38 +0100)]
x86: add optimized inlining
add CONFIG_OPTIMIZE_INLINING=y.
allow gcc to optimize the kernel image's size by uninlining
functions that have been marked 'inline'. Previously gcc was
forced by Linux to always-inline these functions via a gcc
attribute:
Especially when the user has already selected
CONFIG_OPTIMIZE_FOR_SIZE=y this can make a huge difference in
kernel image size (using a standard Fedora .config):
qd_select() checks itself whether timings should be reprogrammed so
remove superfluous qd_timing_ok() and always use ->selectproc method
(rename qd_select() to qd65xx_select() while at it).
All host drivers now either set hwif->mmio or reserve continuous
I/O resources so remove no longer needed hwif->straight8 flag
and never reached code for 'hwif->straight8 == 0' case.
Adrian Bunk [Sat, 26 Apr 2008 15:36:37 +0000 (17:36 +0200)]
remove include/linux/hdsmart.h
include/linux/hdsmart.h is not used by the kernel and should therefore
be removed.
Signed-off-by: Adrian Bunk <bunk@kernel.org> Cc: "Robert P. J. Day" <rpjday@crashcourse.ca>, Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
This removes the internal ide-cd buffer and falls back to read-ahead block layer
capabilities. Thorough testing (cd burning, dvd read, raw read) gives with the
bufferless mode marginally better performance in addition to simplified code.
bufferless:
dd: reading `/dev/hdc': Input/output error
6238+0 records in
6238+0 records out 204406784 bytes (204 MB) copied, 259.891 s, 787 kB/s
real 4m21.598s
user 0m0.014s
sys 0m0.744s
with the old buffer (2.6.25-rc1):
dd: reading `/dev/hdc': Input/output error
6238+0 records in
6238+0 records out 204406784 bytes (204 MB) copied, 262.893 s, 778 kB/s
Sergei Shtylyov [Sat, 26 Apr 2008 15:36:31 +0000 (17:36 +0200)]
ide: make ide_pci_check_iomem() actually work
This function didn't actually check if a given BAR is in I/O space because of
using the bogus PCI_BASE_ADDRESS_IO_MASK (which equals ~3) to test the resource
flags instead of IORESOURCE_IO -- fix this, make ide_hwif_configure() check the
results failing if necessary, and move the printk() call to the failure path.
Signed-off-by: Sergei Shtylyov <sshtylyov@ru.mvista.com> Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Jacek Luczak [Fri, 11 Apr 2008 11:29:04 +0000 (13:29 +0200)]
x86: section mismatch fixes, #3
This patch fixes section mismatch warnings in unlock_ExtINT_logic().
WARNING: arch/x86/kernel/built-in.o(.text+0x14a92): Section mismatch in reference from the function unlock_ExtINT_logic()
to the function .init.text:find_isa_irq_pin()
The function unlock_ExtINT_logic() references
the function __init find_isa_irq_pin().
This is often because unlock_ExtINT_logic lacks a __init
annotation or the annotation of find_isa_irq_pin is wrong.
Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Jacek Luczak [Fri, 11 Apr 2008 11:28:49 +0000 (13:28 +0200)]
x86: section mismatch fixes, #2
This patch fixes section mismatch warnings in smpboot_setup_io_apic().
WARNING: arch/x86/kernel/built-in.o(.text+0x11781): Section mismatch in reference from the function smpboot_setup_io_apic()
to the function .init.text:setup_IO_APIC()
The function smpboot_setup_io_apic() references
the function __init setup_IO_APIC().
This is often because smpboot_setup_io_apic lacks a __init
annotation or the annotation of setup_IO_APIC is wrong.
Signed-off-by: Jacek Luczak <luczak.jacek@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Roland McGrath [Tue, 22 Apr 2008 19:21:25 +0000 (12:21 -0700)]
x86_64 ia32 ptrace: convert to compat_arch_ptrace
Now that there are no more special cases in sys32_ptrace, we
can convert to using the generic compat_sys_ptrace entry point.
The sys32_ptrace function gets simpler and becomes compat_arch_ptrace.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Tue, 22 Apr 2008 19:20:20 +0000 (12:20 -0700)]
x86_64 ia32 ptrace: use compat_ptrace_request for siginfo
This removes the special-case handling for PTRACE_GETSIGINFO
and PTRACE_SETSIGINFO from x86_64's sys32_ptrace. The generic
compat_ptrace_request code handles these.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Sat, 19 Apr 2008 21:27:56 +0000 (14:27 -0700)]
x86 signals: lift set_fs
This lifts the set_fs(USER_DS) call for signal handler setup out of the
three places copying the same code into the one place that calls them
all. There is no change in what it does.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Roland McGrath [Sat, 19 Apr 2008 21:26:54 +0000 (14:26 -0700)]
x86 signals: lift flags diddling code
This lifts the code diddling the TF and DF bits for signal handler setup
out of the several places copying the same code into the one place that
calls them all. There is no change in what it does.
I also separated the recently-added DF bit clearing from the TF diddling.
The compiler turns them back into one instruction anyway. The tossing
in of DF to the same line of code with no new comments was a bit more
arcane than seems wise.
Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitri Vorobiev [Sun, 20 Apr 2008 20:47:55 +0000 (00:47 +0400)]
x86: remove NexGen support
It is claimed that NexGen CPUs were never shipped:
http://lkml.org/lkml/2008/4/20/179
Also, the kernel support for these chips has been broken for
a long time, the code intended to support NexGen thereby being
essentially dead.
As an outcome of the discussion that can be found using the URL
above, this patch removes the NexGen support altogether.
The changes in this patch survived a defconfig build for i386, a
couple of successful randconfig builds, as well as a runtime test,
which consisted in booting a 32-bit x86 box up to the shell prompt.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Dmitri Vorobiev [Sun, 20 Apr 2008 02:54:33 +0000 (06:54 +0400)]
x86: array can become static
In arch/x86/kernel/setup_64.c, the standard_io_resources array
is needlessly defined as global. This patch makes this variable
static.
This patch was successfully build-tested using the defconfig
for x86_64. Runtime test was performed by booting a 64-bit x86
box up to the shell prompt.
Signed-off-by: Dmitri Vorobiev <dmitri.vorobiev@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>