movnt* instructions are not strongly ordered with respect to other stores,
so if we are to assume stores are strongly ordered in the rest of the 64
bit code, we must fence these off (see similar examples in 32 bit code).
[ The AMD memory ordering document seems to say that nontemporal stores can
also pass earlier regular stores, so maybe we need sfences _before_
movnt* everywhere too? ]
Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>