Tested: overall speedup is about 5x when scaling 4096x4096 -> 4000x4000 in the sequencer. There were some artifacts in the resulting image, but double-checked and the old code produces the same artifacts. Added the old code back inside #if 0 blocks, since it's a bit more readable.