gcc 3.2+ supports __builtin_prefetch, so it's possible to use it on all
architectures. Change the generic fallback in linux/prefetch.h to use it
instead of noping it out. gcc should do the right thing when the
architecture doesn't support prefetching
Undefine the x86-64 inline assembler version and use the fallback.