Recent failures requiring investigation have exposed some shortcomings
that this addresses:
- When creating the diff image for offline comparison, use a higher
threshold to prevent idiff from printing more output which will often
contradict the primary failure output just above it (very confusing)
- For metadata failures, make sure these get printed so it's obvious
what kind of failure we're dealing with
Pull Request: https://projects.blender.org/blender/blender/pulls/107058