The implementation of __builtin_xchg() in at least some versions of
avr32 gcc is buggy. Rather than find out exactly which versions that
have this bug, let's just avoid the problem altogether by implementing
xchg() in inline assembly.
Also, in most architectures, xchg() seems to imply a memory barrier,
while the existing avr32 implementation did not. This patch also fixes
that discrepancy.