The modifier stack no longer modifies the original curve's nurbs;
it now operates on a copy of the nurbs.
This makes it possible to process curve object updates with
shared curve datablocks from multiple threads. There's no
big overhead in creating a copy of the nurbs compared to the old
behavior, which allocated an array for the original vertices and
applied those coordinates back to the curve after all modifiers
were applied.
The only remaining issue with curves is the curve's bounding box
and texture space. This is not thread-safe, but it won't
lead to crashes; it could only lead to either a memory
leak or wrong texture coordinates due to differences in the
modifier stacks of objects which share the same curve.
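
A minimal sketch of the copy-based evaluation described above. All
names here (SketchNurb, sketch_evaluate and so on) are hypothetical
stand-ins, not the actual curve code; the "modifier" is a simple Z
offset chosen only for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a curve's spline data. */
typedef struct SketchNurb {
    float (*points)[3]; /* control point coordinates */
    size_t num_points;
} SketchNurb;

/* Deep-copy the spline so deformation never touches the shared original. */
SketchNurb sketch_nurb_copy(const SketchNurb *src)
{
    SketchNurb dst;
    assert(src != NULL && src->num_points > 0);
    dst.num_points = src->num_points;
    dst.points = malloc(sizeof(*dst.points) * src->num_points);
    assert(dst.points != NULL);
    memcpy(dst.points, src->points, sizeof(*dst.points) * src->num_points);
    return dst;
}

/* Stand-in for a deforming modifier: offsets every point along Z. */
void sketch_modifier_offset_z(SketchNurb *nurb, float offset)
{
    for (size_t i = 0; i < nurb->num_points; i++) {
        nurb->points[i][2] += offset;
    }
}

/* Run the "modifier stack" on a private copy.
 * The caller owns the result and frees result.points. */
SketchNurb sketch_evaluate(const SketchNurb *shared, float offset)
{
    SketchNurb local = sketch_nurb_copy(shared);
    sketch_modifier_offset_z(&local, offset);
    return local;
}
```

Since every object works on its own copy, two objects sharing the
same datablock never write to the same point array, and there is
nothing to restore afterwards.
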
The DAG node already has a type field, so we can just check whether
the type is ID_OB to decide whether object_handle_update needs to
be called for node->ob.
This saves iterating over the scene's bases when preparing a threaded
update. It will also very likely update objects which are
not in the scene's bases but are used in dupli-groups.
So now objects from dupli-groups are likely already properly updated
while traversing the DAG with threaded update, and the special hack
added in the previous commit, which updates objects from dupli-groups
from the main thread, is likely no longer needed.
However, the hack is kept for a while; this needs some more thought
and investigation.
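
The type check described above could look roughly like this. It is
a sketch only: SketchDagNode, SKETCH_ID_OB and
sketch_object_handle_update are hypothetical stand-ins for DagNode,
ID_OB and object_handle_update.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes; the real code compares against ID_OB. */
enum { SKETCH_ID_OB = 1, SKETCH_ID_OTHER = 2 };

typedef struct SketchObject {
    int updated;
} SketchObject;

typedef struct SketchDagNode {
    int type;  /* kind of datablock this node refers to */
    void *ob;  /* only a SketchObject when type == SKETCH_ID_OB */
} SketchDagNode;

/* Stand-in for the per-object update. */
void sketch_object_handle_update(SketchObject *ob)
{
    ob->updated = 1;
}

/* Returns 1 if the object update was run for this node, 0 otherwise. */
int sketch_update_node(SketchDagNode *node)
{
    assert(node != NULL);
    if (node->type != SKETCH_ID_OB) {
        return 0; /* not an object node, nothing to update here */
    }
    sketch_object_handle_update((SketchObject *)node->ob);
    return 1;
}
```
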
Dupli-groups used to have a special case for updating,
which is BKE_group_handle_recalc_and_update. This
function calls BKE_object_handle_update for every
object in the group.
This isn't thread-safe, because an object could already be
updating in a separate thread. What's worse,
dependencies are not known for objects inside
the group, which makes it impossible to schedule
objects from the group in a safe way.
It's even impossible to schedule groups as different
tasks, because groups could share the same objects.
For now a simple but robust solution is used:
dupli-groups are updated in the main thread, handling
groups one by one and updating objects from each
group one by one as well.
Will work on a proper solution for this later.
I know it's not that nice to have these guys hanging
around in the general Object datablock; ideally they would
be wrapped into a structure like DerivedMesh or
something similar. But this is purely runtime stuff and
we can re-wrap them later.
The main purpose of this is to make curves more thread-safe,
so that separate threads never start freeing the same path
or the same bevel list.
It also makes sense because path and bevel should include
deformation coming from modifiers which are applied to
pre-tessellation points, and different objects could have
different sets of modifiers. This used to be really confusing
in the past; now data which depends on the object is stored
in the object, which makes things easier to understand.
This doesn't make curve code fully thread-safe, because
pre-tessellation modifiers still modify the actual nurbs and a
lock is still needed in makeDispListsCurveTypes, but this
change makes usage of paths safe for threading.
Once modifiers stop modifying the actual nurbs, curves
will be fully safe for threading.
The issue was caused by worker threads updating node valency
at the same time as we were filling the queue with "root"
nodes (leaf nodes which don't depend on others).
Without this lock it's possible for thread_wait_pop to get stuck
at the point where it waits for a new task in the queue when in
fact exit was already requested.
This ended up in a deadlock in some circumstances. It was really
random because it depends entirely on timing.
The thing here is: the curve is modified by the modifier stack
and then its coordinates are restored. To be really safe we
need to do all of this under a lock.
Added a threading lock around the unsafe part of
do_makeDispListCurveTypes (the parts which touch Curve->bev and
Curve->path). Namely, it means pre-tessellation modifiers, bevel,
path and the non-modified display list will be calculated inside
the lock.
Post-tessellation modifiers will be calculated outside of the
lock, which means heavy constructive or deformation modifiers
applied to the tessellated spline will be nicely threaded.
This makes it possible to use threaded object update by default
in the branch, so everyone could start testing it.
This display list was only used for texture space calculation,
and even there it was only used for bounding
box calculation.
Since we already have a bounding box in the curve datablock,
there's no reason to duplicate the non-modified display list
just to calculate the bounding box later; let's just calculate
the bounding box at the first point.
This makes the code a little bit more thread-safe, but curves are
still not safe for threads at all because of the bevel list and
path. That will be solved later.
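
To illustrate why no duplicated display list is needed for this: a
bounding box only requires one min/max pass over the coordinates.
The sketch below uses a hypothetical SketchBoundBox and a flat
coordinate array; it is not the actual curve or displist code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical axis-aligned bounding box. */
typedef struct SketchBoundBox {
    float min[3];
    float max[3];
} SketchBoundBox;

/* Compute the box directly from num_points packed xyz triples,
 * without keeping any copy of the coordinates around. */
SketchBoundBox sketch_bound_box_from_points(const float *co, size_t num_points)
{
    SketchBoundBox bb;
    assert(co != NULL && num_points > 0);

    /* Start from the first point so the box is always valid. */
    for (int axis = 0; axis < 3; axis++) {
        bb.min[axis] = bb.max[axis] = co[axis];
    }
    /* Grow the box to include every remaining point. */
    for (size_t i = 1; i < num_points; i++) {
        for (int axis = 0; axis < 3; axis++) {
            const float v = co[3 * i + axis];
            if (v < bb.min[axis]) {
                bb.min[axis] = v;
            }
            if (v > bb.max[axis]) {
                bb.max[axis] = v;
            }
        }
    }
    return bb;
}
```
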
There used to be a check for ob->bb ? ob->bb : cu->bb, but
in fact it doesn't make sense and only makes the code more
crappy.
Making the displist for mballs and curves/surfaces/fonts
already ensures the object has a valid bounding box.
Not that much happens inside the lock, and using a spin
lock instead of a mutex lock will give some speedup due to the
smaller latency of resuming the thread when the lock was taken.
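
For reference, a spin lock around a very short critical section can
be sketched with C11 atomics as below. This is only an illustration
of the idea; SketchSpinLock and its functions are hypothetical and
are not the lock implementation used in the code base.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical spin lock built on a C11 atomic flag. */
typedef struct SketchSpinLock {
    atomic_flag flag;
} SketchSpinLock;

void sketch_spin_init(SketchSpinLock *lock)
{
    atomic_flag_clear(&lock->flag);
}

void sketch_spin_lock(SketchSpinLock *lock)
{
    /* Busy-wait instead of sleeping: cheap when the lock is held only briefly. */
    while (atomic_flag_test_and_set_explicit(&lock->flag, memory_order_acquire)) {
        /* spin */
    }
}

void sketch_spin_unlock(SketchSpinLock *lock)
{
    atomic_flag_clear_explicit(&lock->flag, memory_order_release);
}

/* Example of a tiny critical section: decrement a shared counter
 * and return its new value. */
int sketch_decrement_locked(SketchSpinLock *lock, int *counter)
{
    int result;
    assert(lock != NULL && counter != NULL);
    sketch_spin_lock(lock);
    result = --(*counter);
    sketch_spin_unlock(lock);
    return result;
}
```

The trade-off is that a waiting thread burns CPU while spinning, so
this only pays off when the protected section stays short.
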
Initially I wanted to have a really simple and basic
threading scheduler and wrote one based on traversing the
depsgraph in advance. But it ran into issues because
the single-pass traversal I did didn't actually gather all
the dependencies.
That was surely solvable, but it was turning into a rather
time-consuming thing, and with the huge help of Brecht's
patch it was faster to just write proper balancing.
But it's again a really basic thing, which could be
easily changed depending on feedback and design decisions
from Joshua.
So for now it works in the following way:
- Currently DagNode is used for threaded evaluation,
meaning traversal actually happens over DagNodes.
This is easier than converting the DAG to a graph where
only objects are stored, but it required adding one int
field to DagNode for faster runtime checks.
We could change this later when it's clear how
and where we'll store evaluation data, but for now
it works pretty well.
- The new field is called "valency" and it's basically the
number of node parents which need to be evaluated
before the node itself can be evaluated.
- Node valency is initialized before threading,
and when a node finishes updating, the valency of its children
is decreased by one. If a child's valency
becomes zero, the child is added to the task pool.
- There's a thread lock around the valency update; it'll be
replaced with a spinlock in the near future.
- Another piece of runtime data is node color. White nodes
represent objects, gray ones non-objects.
Currently it's needed to distinguish whether we need to
call object_handle_update on node->ob or not. In the
future it could be replaced with node->type to support
granularity, meaning we could then update object data
separately from the object itself.
- Needed to add some public depsgraph functions to make
it possible to traverse the depsgraph without including the
depsgraph private header in other files.
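
The valency bookkeeping from the list above can be sketched as
follows. This is a single-threaded illustration with a plain array
queue standing in for the task pool; SketchNode and the sketch_*
functions are hypothetical and do not match the real DagNode layout.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 16

typedef struct SketchNode {
    int valency;                    /* parents not yet evaluated */
    int num_children;
    int children[SKETCH_MAX_NODES]; /* indices of dependent nodes */
} SketchNode;

/* Register that `child` depends on `parent`. */
void sketch_add_relation(SketchNode *nodes, int parent, int child)
{
    assert(nodes[parent].num_children < SKETCH_MAX_NODES);
    nodes[parent].children[nodes[parent].num_children++] = child;
    nodes[child].valency++;
}

/* Evaluate all nodes in dependency order, writing the order to r_order.
 * Returns the number of evaluated nodes. */
int sketch_evaluate_graph(SketchNode *nodes, int num_nodes, int *r_order)
{
    int queue[SKETCH_MAX_NODES];
    int head = 0, tail = 0;
    assert(num_nodes <= SKETCH_MAX_NODES);

    /* Seed the queue with nodes which don't wait for anything. */
    for (int i = 0; i < num_nodes; i++) {
        if (nodes[i].valency == 0) {
            queue[tail++] = i;
        }
    }
    while (head < tail) {
        const int index = queue[head];
        r_order[head] = index; /* "evaluate" the node */
        head++;
        for (int c = 0; c < nodes[index].num_children; c++) {
            const int child = nodes[index].children[c];
            /* With worker threads this decrement must be done under a lock. */
            if (--nodes[child].valency == 0) {
                queue[tail++] = child;
            }
        }
    }
    return head;
}
```

Each node enters the queue exactly once, at the moment its last
parent finishes, so no node is evaluated before its dependencies.
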
This change doesn't make the code any more stable, but it
solves update order issues noticed while working on
fixing underlying bugs.
Threaded update is still ifdef-ed out until curves and
armatures are considered thread-safe, which is the next
step to be done.
Patch by Brecht, which in its original version also
gets rid of ThreadedWorker in favor of the new task
scheduler and ports some areas to it.
Kudos to Brecht for this work!
tweakmode, this will now result in tweakmode being exited instead of going into
a weird limbo-land where channel selection has changed (but tweakmode is still
active but not drawn)