This was already changed for the TBB-based BLI_task_parallel_range in master.
This task local storage should always be initialized from the template, not
copied from another task which may be executing at the time the copy happens.
This may not fix any actual bug, we only use this user data for parallel reduce
and it's not clear that TBB ever calls the copy constructor for that case.
Ref T76858