Not sure when this happened but apparently the lower bar is now windows 7 [1]
This patch bumps to API version to 0x0601 (Win7) and cleans up any uses that
worked around the globally set API version.
[1] https://www.blender.org/download/requirements/
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D6758
The OptiX SRT motion expects a motion defined by translation,
rotation, shear and scale, but the matrix decomposition code in
Cycles was not able to extract shear information and instead
produced a stretch matrix with the information baked in. This
caused conflicting transforms between traversal and shading
and lead to render artifacts.
This patch changes the matrix decomposition to produce factors
inline with what OptiX expects to fix that.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D6605
Commit baeb11826b switched memory
allocation for the motion transform to use CUDA directly, instead of going
through abstractions. But no CUDA context was set active before those
were called, so the calls failed. This fixes that by binding a context beforehand.
The `optixAccelBuild` API throws an error when the property to get compacted size is passed in
without the `OPTIX_BUILD_FLAG_ALLOW_COMPACTION` flag set. This is not currently hit
because `background` is always true (set in `mem_alloc`), but would become an issue once that
is sorted out, so fixing it now to be safe.
This patch adds support for the OptiX denoiser as an alternative to the existing NLM denoiser in Cycles. It's re-using the same denoising architecture based on tiles and therefore implicitly also works with multiple GPUs.
Reviewed By: sergey
Differential Revision: https://developer.blender.org/D6395
This adds compaction support for OptiX acceleration structures, which reduces the device memory footprint in a post step after building. Depending on the scene this can reduce the amount of used device memory quite a bit and even improve performance (smaller acceleration structure improves cache usage). It's only enabled for background renders to make acceleration structure builds fast in viewport.
Also fixes a bug in the memory management for OptiX acceleration structures: These were held in a dynamic vector of 'device_memory' instances and used the mem_alloc/mem_free functions. However, those keep track of memory instances in the 'cuda_mem_map' via pointers to 'device_memory' (which works fine everywhere else since those are never copied/moved). But in the case of the vector, it may decide to reallocate at some point, which invalidates those pointers and would result in some nasty accesses to invalid memory. So it is not actually safe to move a 'device_memory' object and therefore this removes the move operator overloads again.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D6369
When encountering an error during context creation, the "OptiXDevice" constructor aborts early.
This means the "cuda_stream" vector is never resized and the destructor iterated over non-existent data.
The acceleration structure built by OptiX may be different between GPUs, so cannot assume the memory size is the same for all.
This fixes that by moving the memory management for all OptiX acceleration structures into the responsibility of each device (was already the case for BLAS previously, now for TLAS too).
Calling "OptiXDevice::load_kernels" multiple times would call "optixPipelineDestroy" on a pipeline
pointer that may have already been deleted previously (since the PIP_SHADER_EVAL pipeline is only
created conditionally).
This change also avoids a CUDA kernel reload every time this is called. The CUDA kernels are
precompiled and don't change, so there is no need to reload them every time.
The multi device code did not correctly handle cases where some GPUs store a
resource in device memory and others store it in host mapped memory.
Differential Revision: https://developer.blender.org/D6126
Was caused by D6068, which did not handle "MEM_PIXELS" memory
when not in background mode. Before that it always fell back to using
generic device memory, so restoring that behavior. In future this
should be changes to use OpenGL interop for optimal performance.
The OptiX implementation wasn't trying to allocate memory on the host if device allocation failed, while the CUDA implementation did. This copies the implementation over to OptiX to remedy that.
Differential Revision: https://developer.blender.org/D6068
Rendering would produce invalid results or crash if the Vector pass was active but motion blur was inactive. This caused the OptiX BVH to be built with motion (because objects reported motion available), but the pipeline to be built without motion support (since with disabled motion blur this is not in the list of requested features). The two are not compatible and therefore caused issues. This patch fixes that by not building the BVH with motion if motion blur is not active (which makes sense).
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5968
Curves with motion blur produced wrong results with OptiX (T69801). This is because the AABBs for the motion steps were calculated from incorrect attribute data because the offset into the attribute data array was incorrect.
Reviewed By: brecht
Differential Revision: https://developer.blender.org/D5961