An incorrect read happened after nodes which do not affect the output
were removed from the graph. Other nodes could still be connected to
these nodes, which led to accessing freed memory in other places.
Solved by removing links from unused nodes before removing them from
the graph.
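A minimal sketch of the idea, not the actual graph code: Node, connect()
and remove_unused() are hypothetical names used for illustration only.
Links are detached on both sides before the unused node is dropped, so
no surviving node keeps a reference to it.

```python
class Node:
    """Hypothetical graph node with links in both directions."""

    def __init__(self, name):
        self.name = name
        self.inputs = []   # nodes feeding this node
        self.outputs = []  # nodes fed by this node


def connect(src, dst):
    """Link src's output to dst's input."""
    src.outputs.append(dst)
    dst.inputs.append(src)


def remove_unused(nodes, output):
    """Return the nodes that affect output, unlinking all the others."""
    # Walk backwards from the output to find every node that matters.
    used = set()
    stack = [output]
    while stack:
        node = stack.pop()
        if node in used:
            continue
        used.add(node)
        stack.extend(node.inputs)

    for node in nodes:
        if node in used:
            continue
        # Detach links first, so no surviving node points at a removed one.
        for src in node.inputs:
            src.outputs.remove(node)
        for dst in node.outputs:
            dst.inputs.remove(node)
        node.inputs.clear()
        node.outputs.clear()

    return [node for node in nodes if node in used]
```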
It seems some variables used for light sampling should be reset when
the integrator's use_direct_light flag is set to false, otherwise
these values could be reused from a previously rendered layer,
resulting in use of the freed memory of the __light_distribution
texture.
Apparently, for viewport rendering on GPU a tile size of 1024 gives
much better performance than using a single tile. Not sure why it
doesn't work for background rendering, this needs to be investigated
further.
Meanwhile use the old debug value of 1024 for tile size.
Before this, debug_tile_size was used as the tile size. It had become
a hidden property since the tile-based rendering implementation and
could not be controlled.
This resolves the issue with a single thread being used for viewport
rendering in some cases. It also makes it possible to control tiles
for CUDA viewport rendering, which is still much faster when
using a single tile.
Also fixed an issue with the minimal tile size, which was used to
calculate the divider of the final resolution used for initial
rendering. It is now a Resolution Divider property in the Performance
tab. This option can be used to tweak the initial resolution of the
viewport for faster navigation or faster refresh when changing some
properties.
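A rough sketch of how such a divider can map to viewport resolutions.
preview_resolutions() is a hypothetical helper, and halving the divider
on every refinement step is an assumption made for illustration, not
something stated above.

```python
def preview_resolutions(width, height, divider):
    """Return viewport resolutions from coarse to full size.

    Assumption: the divider is halved on each refinement step until
    the full resolution is reached.
    """
    sizes = []
    while divider > 1:
        # Never go below one pixel in either dimension.
        sizes.append((max(1, width // divider), max(1, height // divider)))
        divider //= 2
    sizes.append((width, height))
    return sizes
```

With a 1920x1080 viewport and a divider of 4 the first pass would be
480x270, followed by 960x540 and then the full resolution.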
Currently it makes more sense to use a single tile for GPU rendering,
and in this case the tile-based progress report doesn't work well.
Since threading happens within a single tile, it's possible to
detect the sample currently being computed and report it to the
interface. This also allows smoother progress display when using CPU
with a small number of tiles.
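A minimal sketch of sample-based progress, assuming a single tile in
flight at a time. render_progress() and its parameters are hypothetical
names, not the actual progress API.

```python
def render_progress(tiles_done, total_tiles, sample, total_samples):
    """Return progress in [0, 1] from finished tiles plus the current sample.

    Assumption: only one tile is being rendered at a time, so the
    in-flight tile contributes the fraction of its samples done so far.
    """
    in_flight = sample / total_samples
    return min(1.0, (tiles_done + in_flight) / total_tiles)
```

With a single tile this reduces to sample / total_samples, so the
progress bar moves while the tile is still rendering instead of jumping
from 0 to 1.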
Do not discard (fill with black) tiles which are not fully rendered
(not all samples were calculated for the tile) when canceling a render.
This can be helpful for tweaking settings when a render glitch is
discovered. It can also be used in scenarios such as setting the
number of samples to something really high and rendering a still image
until the result is reasonable, controlling this manually.
This could make cancel less responsive on CPU, but it wouldn't
be less responsive than on GPU. It could also potentially give a few
percent of speedup by avoiding checking the cancel state after every
pixel sampled.
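A small sketch of the cancel behavior described here, under the
assumption that the cancel state is checked once per sample.
render_tile(), sample_pixel and is_cancelled are hypothetical names.

```python
def render_tile(num_pixels, num_samples, sample_pixel, is_cancelled):
    """Accumulate samples for a tile, keeping partial results on cancel.

    Returns the accumulation buffer and the number of samples that
    were actually rendered, so the caller can still display the tile.
    """
    accumulated = [0.0] * num_pixels
    samples_done = 0
    for sample in range(num_samples):
        # Cancel is checked once per sample, not once per pixel.
        if is_cancelled():
            break
        for pixel in range(num_pixels):
            accumulated[pixel] += sample_pixel(pixel, sample)
        samples_done += 1
    return accumulated, samples_done
```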
The issue was caused by wrong camera motion being stored on the device
when the first render layer does not have the vector pass enabled.
Solved by forcing a device camera update when the scene's motion has
changed since the previous device camera update.
Move the center tile acquiring code into its own function. This should
make things easier later, when we want to support other tile
render orders.
Also there should now be a single bucket growing from the center
when multi-GPU is used. Can't test this here though.
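A sketch of what acquiring the center tile can look like when it lives
in its own function. acquire_center_tile() and the (x, y, width, height)
tile tuples are hypothetical, not the actual tile manager interface.

```python
def acquire_center_tile(tiles, image_width, image_height):
    """Remove and return the pending tile closest to the image center.

    Each tile is an (x, y, width, height) tuple. Ties are resolved in
    favor of the tile that comes first in the list.
    """
    center_x = image_width / 2
    center_y = image_height / 2

    def distance_squared(tile):
        x, y, w, h = tile
        return (x + w / 2 - center_x) ** 2 + (y + h / 2 - center_y) ** 2

    best = min(tiles, key=distance_squared)
    tiles.remove(best)
    return best
```

Since all devices would pull from the same pending list, repeated calls
hand out tiles in order of distance from the center, which gives one
region growing outwards.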
This commit solves a couple of issues which appeared with the new
integrator:
- The render job progress bar now shows progress based on the number
  of rendered tiles. This is the same as Blender Internal does.
  This still requires some further thought, because for GPU it's
  better to use a single tile, and in this case the progress bar
  should be based on the number of rendered samples.
- Remove the "global" sample counter from the progress descriptor.
  There is no longer a global sample which would make sense.
  This counter was replaced with a tile counter.
- Use the proper sample number when copying the render buffer to
  Blender. The final sample number used to be used, which led to tiles
  going from completely dark to normal brightness as they were
  being rendered. Now a tile is displayed with proper
  brightness starting from the very first sample.
  Use the sample counter stored in the render tile descriptor and pass
  it to the update / write callbacks.
This was tested on CPU and GPU CUDA rendering.
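A minimal sketch of the brightness issue, assuming the render buffer
holds a running sum of samples per pixel. tile_display_values() is a
hypothetical helper.

```python
def tile_display_values(accumulated, sample):
    """Convert accumulated sums to display values.

    sample is the number of samples accumulated so far for this tile
    (at least 1), not the final sample count of the render.
    """
    return [value / sample for value in accumulated]
```

For a tile holding 2 of 100 samples with an accumulated value of 1.0,
dividing by the tile's own counter gives 0.5, while dividing by the
final count gives 0.01, which is the nearly black tile described above.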
Additional change:
OpenCL rendering should now be cancellable before it has finished
rendering all the samples (the same change as for CPU/CUDA from
a while ago).
This part of the commit wasn't actually tested, will do it later.
Carve proved it's the way to go, so the time has come to get rid of the
old boolean operation module, which isn't used anymore.
Still kept the BOP interface but moved it to the BSP module. At some
point it could be cleaned up further (perhaps by removing the extra
abstraction level or so), but it would be nice to combine such a
refactor with making BSP aware of NGons.
Tested on Linux using both CMake and SCons, possible regressions on
Windows/OSX. Will check the Windows build just after commit.
This required wrapping the create and update Python callbacks in
begin/end allow threading macros. From quick tests this seems to work,
but more tests are needed before considering this stable.