Accessing nohz_cpu_mask before incrementing rcp->cur is racy. It can cause
tickless idle CPUs to be included in rsp->cpumask, which will extend
graceperiods unnecessarily.
Fix this race. It has been tested using extensions to RCU torture module
that forces various CPUs to become idle.