This patch adds compile-time optimizations for operations whose inputs are guaranteed to be non-single values. Pixel load methods now take an optional template parameter CouldBeSingle, which defaults to false. If an input is not guaranteed to be non-single (that is, it could be a single value), CouldBeSingle must be set to true. This gives up to a 3x performance improvement in some nodes.
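For illustration only, here is a minimal sketch of how a CouldBeSingle template parameter can remove the single-value check at compile time. The Buffer class, its factory functions, and the load/store names are hypothetical stand-ins, not the code in this patch; the sketch assumes C++17 for `if constexpr`.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical buffer type used only for this sketch.
class Buffer {
 public:
  // Full buffer: one float stored per pixel.
  static Buffer make_full(std::size_t width, std::size_t height, float fill)
  {
    return Buffer(width, height, std::vector<float>(width * height, fill), false);
  }

  // Single-value buffer: one float shared by every pixel.
  static Buffer make_single(std::size_t width, std::size_t height, float value)
  {
    return Buffer(width, height, std::vector<float>(1, value), true);
  }

  // Pixel load. With the default CouldBeSingle = false, the single-value
  // branch is not compiled into the function at all.
  template<bool CouldBeSingle = false>
  float load(std::size_t x, std::size_t y) const
  {
    assert(x < width_ && y < height_);
    if constexpr (CouldBeSingle) {
      // The caller cannot rule out a single value, so check at run time.
      if (is_single_) {
        return data_[0];
      }
    }
    else {
      // The caller promised a full buffer; verify that in debug builds only.
      assert(!is_single_);
    }
    return data_[y * width_ + x];
  }

  // Pixel store, valid only for full buffers.
  void store(std::size_t x, std::size_t y, float value)
  {
    assert(!is_single_ && x < width_ && y < height_);
    data_[y * width_ + x] = value;
  }

 private:
  Buffer(std::size_t width, std::size_t height, std::vector<float> data, bool is_single)
      : width_(width), height_(height), data_(std::move(data)), is_single_(is_single)
  {
  }

  std::size_t width_;
  std::size_t height_;
  std::vector<float> data_;
  bool is_single_;
};
```

In this sketch, a caller that knows its input is a full buffer writes `buf.load(x, y)` and gets the branch-free path, while a caller that might receive a single value writes `buf.load<true>(x, y)`. Making the safe-but-slower path opt-in matches the default described above, at the cost that a wrong default is only caught by the debug assertion.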