Namely, it caused nodes to be added to the pool multiple times.
Brought the spin lock back, but it is now taken only when
the node valency reaches zero. Valency is now decremented
without any locks; if it reaches zero, the spin lock is
taken and the node color (which indicates whether the node
is scheduled or not) is checked and updated. The actual
task creation happens outside of the lock.
This might sound a bit complicated, but the code is
straightforward and is practically free of thread
synchronization latency.
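
A minimal sketch of the scheme described above, not the
actual code from this change. All names (Node, SpinLock,
Color, try_schedule, release_dependency, create_task,
run_demo) are hypothetical, and a plain counter stands in
for the real pool. It assumes a second path (a poller) that
may also see a ready node, which is why the color check is
needed in addition to the atomic decrement.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical types for illustration only.
enum class Color { Idle, Scheduled };

struct SpinLock {
    std::atomic_flag flag = ATOMIC_FLAG_INIT;
    void lock()   { while (flag.test_and_set(std::memory_order_acquire)) { /* spin */ } }
    void unlock() { flag.clear(std::memory_order_release); }
};

struct Node {
    std::atomic<int> valency{0};  // number of unfinished dependencies
    SpinLock lock;                // guards `color` only
    Color color = Color::Idle;    // whether the node is already scheduled
};

std::atomic<int> tasks_created{0};  // stand-in for the real pool

void create_task(Node&) { tasks_created.fetch_add(1, std::memory_order_relaxed); }

// Schedules the node at most once, and only when its valency is zero.
void try_schedule(Node& node) {
    if (node.valency.load(std::memory_order_acquire) != 0) return;  // no lock taken

    bool schedule = false;
    node.lock.lock();
    if (node.color == Color::Idle) {  // first thread to get here wins
        node.color = Color::Scheduled;
        schedule = true;
    }
    node.lock.unlock();

    if (schedule) create_task(node);  // task creation happens outside the lock
}

// Called when one dependency of `node` finishes.
void release_dependency(Node& node) {
    int previous = node.valency.fetch_sub(1, std::memory_order_acq_rel);  // lock-free
    assert(previous > 0);
    if (previous == 1) try_schedule(node);  // valency just reached zero
}

// Releases all dependencies concurrently while pollers race to schedule the node.
int run_demo(int dependencies = 8, int pollers = 4) {
    tasks_created.store(0);
    Node node;
    node.valency.store(dependencies);

    std::vector<std::thread> threads;
    for (int i = 0; i < dependencies; ++i)
        threads.emplace_back([&node] { release_dependency(node); });
    for (int i = 0; i < pollers; ++i)
        threads.emplace_back([&node] {
            for (int k = 0; k < 1000; ++k) try_schedule(node);
        });
    for (auto& t : threads) t.join();
    return tasks_created.load();
}
```

In this sketch the spin lock is only ever taken once the
valency is zero, so threads that merely decrement a
non-zero valency never contend on it.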