Dupli-groups used to have special case for updating
which is BKE_group_handle_recalc_and_update. This
function calls BKE_object_handle_update for every
object in the group.
This isn't thread-safe, because object could be
updating in separate thread already. And what's
worse dependencies are not known for objects inside
the group, which makes it impossible to schedule
objects from the group in a safe way.
It's even impossible to schedule groups as different
tasks, because groups could share the same objects.
For now used simple but robust solution which is
updating dupli-groups in main thread, handling
groups one-by-one and updating objects from the
group one-by-one as well.
Will work on a proper solution for this later.
I know this is not so much nice to have this guys hanging
around in a general Object datablock and ideally they better
be wrapped around into a structure like DerivedMesh or
something like this. But this is pure runtime only stuff and
we could re-wrap them around later.
Main purpose of this is making curves more thread safe,
so no separate threads will ever start freeing the same path
or the same bevel list.
It also makes sense because path and bevel shall include
deformation coming from modifiers which are applying on
pre-tesselation point and different objects could have
different set of modifiers. This used to be really confusing
in the past and now dtaa which depends on object is stored
in an object, making things clear for understanding even.
This doesn't make curve code fully thread-safe due to
pre-tesselation modifiers still modifies actual nurbs and
lock is still needed in makeDispListsCurveTypes, but this
change makes usage of paths safe for threading.
Once modifiers will stop modifying actual nurbs, curves
will be fully safe for threading.
Issue was caused by worker threads updating nodes valency
at the same time while we're filling the queue with "root"
nodes (leaf nodes which don't depend on others).
The tihng here is: curve is getting modified by modifier stack
and then it's coordinates are restored. To be really safe we
need to do all this locked.
Added threading lock around unsafe part of do_makeDispListCurveTypes
(parts which touches Curve->bev and Curve->path). Namely it means
pre-tesselation modifiers, bevel, path and non-modified display
list will be calculated inside a locked thread.
Post-tessellation modifiers will eb calculated outside of locked
thread, which means heavy constructive or deformation modifiers
applying on tesselated spline will be nicely threaded.
This makes it possible to use threaded object update by default
in the branch, so everyone could start testing it.
This display list was only used for texture space calculation,
and even there this display list was only used for bounding
box calulation.
Since we alreayd do have boundgind box in a curve datablock
there's no reason to duplicate non-modified display list
just to calculate bounding box later, let's just calculate
boundding box at the first point.
This makes code a little be more thread-safe but curves are
still not safe for threads at all because of bevel list and
path. That would be solved later.
It's not so much happening inside the lock and using spin
lock instead of mutex lock will give some speedup due to
smaller latency of resuming the thread when mutex was locked.
Initially i wanted to have some really simple and basic
threading scheduler and wrote one based on traversing
depsgraph in advance. But it ended up in some issues with
the single-pass traverse i did which didn't gather all
the dependencies actually.
That was for sure solvable, but it ended up in a bit of
time consuming thing and with huge help of Brecht's
patch it was faster just to write proper balancing.
But it's again really basic thing, which could be
easily changed depending on feedback and design decisions
from Joshua,
So for now it works in the following way:
- Currently DagNode is used for threaded evaluaiton,
meaning traversing actually happens for DagNodes.
This is easier than converting DAG to a graph where
only objects are stored, but required adding one int
field to DagNode for faster runtime checks.
We could change this later when it'll be clear how
and where we'll store evaluation data, but for now
it work pretty ok.
- The new field is called "valency" and it's basically
number of node parents which needs to be evaluated
before the node itself could be evaluated.
- Nodes' valency is getting initialized before threading,
and when node finished to update valency of it's childs
is getting decreased by one. And if it happens so
child's valency became zero, it's adding to task pool.
- There's thread lock around valency update, it'll be
replaced with spinlock in nearest future.
- Another update runtime data is node color. White nodes
represents objects, gray one non-objects.
Currently it's needed to distinguish whether we need to
call object_handle_update on node->ob or not. In the
future it could be replaced with node->type to support
granularity, meaning we then could update object data
separately from object itself.
- Needed to add some public depsgraph functions to make
it possible to traverse depsgraph without including
depsgraph private header to other files.
This change doesn't make code anyhow more stable, but
solves update order issues noticed while working on
fixing underlying bugs.
Threaded update is still ifdef-ed for until curves and
armatures are considered thread-safe, which is next
step to be done.
This was one of the consequences of r.57333 (i.e. influence shouldn't be ignored
on the first strip that animates a channel), as scale should really default to a
base value of 1 (instead of things being blended against 0 as per all other
properties). The end result was that bones were getting scaled to zero here when
the influence of their strip fell to zero.
Now, we use the RNA default values of properties to initialise their initial
values. This may/may not work well in all cases:
1) For properties which don't have the appropriate RNA defaults set, this will
be problematic. But, most properties people are likely to animate here I think
are already set up correctly.
2) It may not always be nice to have values "snapping back" to default values.
In this case, you should still be defining a strip at the bottom of your NLA
stack which defines what the appropriate rest poses *should* be for your shot.
Move static variables to context filling in by this fcuntion
and owned by a callee function. This ensures no conflicts
between threads happens because of static variables used in
this function.
Also moved modifier types and virtual modifiers data to a
function called from creator. This is needed to be sure all
the information is properly initialied to the time when
threads starts to use this data.
Make Local operator uses BKE_library_make_local function if all the
datablocks needs to be made local. And this function was calling
id_clear_lib_data for every datablock, which only clears library
data. But this function doesn't work correct for datablocks which
areshared by multiple users (this is also mentioned in comment
for this function).
This lead to situations when two datablocks shares the same runtime
data leading to crashes later. For example making everythig local in
scales cycles scene from durian ends up in a crash when toggling
rig edit mode.
Solved by using id_make_local instead of id_clear_lib_data, which
will ensure all the data are nicely expanded and made local.
Checked by Brecht, thanks fr the review!
This commit contains changes related on running function
BKE_object_handle_update_ex from multiple threads in order
to increase scene update time when having multiple
independent groups of objects.
Currently this required changes to two areas:
- scene.c, where scene_update_tagged_recursive is now using
threads for updating the object
There're some tricks to prevent threads from being spawned
when it's not needed:
* Threading will only happen if there're more than one CPU
core.
* Threading will happen only if there're more than single
object which needed to be updated.
There's currently one crappy part of the change: which is
freeing object caches (derivedFinal, derivedDeform and so)
from main thread. This is so because in case VBO are used
freeing DM is not thread safe. This is because DrawObject
used global array. Would look into possibility of making
that code safe later.
There're also currently some ifdef-ed debug-only code, which
helps a lot troubleshooting whether everything is working
fine. This code looks a bit ugly now, will either drop it
later or make it more cleat.
And one more thing: threaded update is CURRENTLY DISABLED.
This is because of some thread-unsafe issues discovered
while was working on this patch. Namely:
* I have once a crash in Curve module. Wasn't been able
to reproduce the crash, but could thing about some
unsafe code there.
* Virtual modifier list is not thread-safe (it uses static
variables).
* Armature modifier is also doesn't seem to be thread safe
because of storing some temporary runtime data in actual
armature.
All this issues are to be solved next.
- depsgraph.c, where i've added a function which gives list
of groups, each group contains objects and dependency is
only allowed between objects inside one group.
This is needed to make scheduling of objects easier, which
means update threads will operate on groups, and will handle
objects one-by-one inside group. Different threads will
operate on different groups.
Currently such groups will be generated on every update.
Actually, on every run of scene_update_objects_threaded which
only happens if there're objects marked for update. In the
future we could consider storing such groups in graph itself,
which will help saving CPU power on building such groups.
But this is something to be discussed with Joshua first.
P.S. If you really want to test threaded update, you'll
need to replace:
#undef USE_THREADED_UPDATE
with:
#define USE_THREADED_UPDATE
Printing text on the color grid image would initialize font glyphs from a thread at
the same time as the UI, causing conflicts. The freetype glyph renderer needs to be
mutex locked because it uses a shared buffer internally even when rendering for
different fonts. Also needed to change the image generate function to use the render
monospace font to avoid conflicts in blenfont.
What's still weak in the blenfont API is that there is no distinction between a font
and a thread using that font to render with some particular size, style, etc.
in some cases. The dupli code would still do object updates during render indirectly,
while this is disabled for thread safety everywhere else, now we disable it for this
case as well.
Not a pretty solution but this is for the depgraph refactor to solve.
rendering. This used to happen in an unneeded frame change update which was
removed. For heavy particle systems this could have a bad impact on viewport
performance after rendering.
of the active object in the 3D view. This was due to sharing a global G.moving
flag to indicate that transform is active, now it's only set per transform data
type so different editors don't influence each other.