Commit Graph

54863 Commits

Author SHA1 Message Date
Campbell Barton
7b8e89f297 Cleanup: incorrect comment 2017-11-27 15:15:56 +11:00
Bastien Montagne
440aa2bf70 Cleanup: ImageEditor's mask drawing code was re-implementing BKE_maskrasterize_buffer!
So this deduplicates and simplifies code, yeah.

Also, as an odd bonus, new code seems slighly quicker than previous one
(about 5 to 10% quicker).
2017-11-26 19:18:12 +01:00
Bastien Montagne
06e64058dd Removing OMP: BKE's mask_rasterize.c
Once again nothing much to say here, except that whole mask rendering
process from VSE is about 25% quicker now. ;)
2017-11-26 19:06:26 +01:00
Bastien Montagne
440a49a24c Removing OMP: autotrack BKE code.
Pretty straightforward this time, we already have a single struct
pointer containing all needed data (or nearly).

And we gain about 10-15% speed on tracking! :)
2017-11-26 17:25:41 +01:00
Bastien Montagne
f1ce279903 Removing OMP: bmesh_operators.c
Two more 'not really useful' cases (OMP only shows some noticeable
speedup with above 1M elements, and since this is quick operation anyway
compared to even ather basic operators, gain is in the 1% area of total
processing time in best case).

So not worth parallelizing here, we'll gain much more on tackling heavy
operations. ;)

And BMesh is free from OMP now!
2017-11-26 16:03:29 +01:00
Bastien Montagne
2b6f345558 Removing OMP: bmesh_interp.c
Performances tests on this one are quite surprising actually...
Parallelized loop itself is at least 10 times quicker with new BLI_task
code than it was with OMP. And subdividing e.g. a heavy mesh with 3
levels of multires (whole process) takes 8 seconds with new code, while
10 seconds with OMP one. And cherry on top, BLI_task code only uses
about 50% of CPU load, while OMP one was at nearly 100%!

In fact, I suspect OMP code was not properly declaring outside vars,
generating a lot of uneeded locks.

Also, raised the minimum level of subdiv to enable parallelization,
tests here showed that we only start to get significant gain with subdiv
levels of 4, below single threaded one is quicker.
2017-11-26 16:03:29 +01:00
Bastien Montagne
099bda8875 Removing OMP: nuke last usages in bmesh_mesh.c
Those three ones were actually giving no significant benefits, in fact
even slowing things down in one case compared to no parallelization at
all (in `BM_mesh_elem_table_ensure()`).

Point being, once more, parallelizing *very* small tasks (like index or
flag setting, etc.) is nearly never worth it.

Also note that we could not easlily use per-item parallel looping in
those three cases, since they are heavily relying on valid
loop-generated index (or are doing non-threadable things like allocation
from a mempool)...
2017-11-26 16:03:29 +01:00
Campbell Barton
311da4cd16 Cleanup: rename edge -> edges 2017-11-26 20:13:18 +11:00
Campbell Barton
23252eece6 Minor improvement to last commit
Don't operate on multiple boundaries at once,
instead keep collapsing from the first selected boundary.
2017-11-26 18:38:45 +11:00
Campbell Barton
329bf8e1bf BMesh: improve edge rotate when edges share faces
Previously outcome depended on order of edges,
now the longest boundary edges are rotated first,
then the faces connected edges.

This gives more predictable results, allowing regions containing
a vertex fan to be rotated onto the next vertex.
2017-11-26 17:51:22 +11:00
Campbell Barton
5b225c59bb Cleanup: move edge-rotate into own file 2017-11-26 13:42:25 +11:00
Joshua Leung
941deaca7a Fix T53393: Change from 'd' key to 'draw' panel button causes pencil to be activated immediately instead of upon LMB 2017-11-26 13:06:16 +13:00
Bastien Montagne
3c1f3c02c6 Fix for Fix (c): broken atomic lock in own bmesh code.
That was a nasty one, Debug build would never have any issue (even tried
with 64 threads!), but Release build would deadlock nearly immediately,
even with only 2 threads!

What happened here (I think) is that gcc optimizer would generate a
specific path endlessly looping when initial value of virtual_lock was
FLT_MAX, by-passing re-assignment from v_no[0] and the atomic cas
completely. Which would have been correct, should v_no[0] not have been
shared (and modified) by multiple threads. ;)

Idea of that (broken) for loop was to avoid completely calling the
atomic cas as long as v_no[0] was locked by some other thread, but...
Guess the avoided/missing memory barrier was the root of the issue here.

Lesson of the evening: Remember kids, do not trust your compiler to
understand all possible threading-related side effects, and be explicit
rather than elegant when using atomic ops!

Side-effect lesson: do check both release and debug builds when messing
with said atomic ops...
2017-11-25 23:14:54 +01:00
Bastien Montagne
dd6c918b2c Fix broken atomic_cas lock in own recent commit in bmesh.
Using atomic cas correctly is really hairy... ;)

In this case, the returned value from cas needs to validate *two*
conditions, it must not be FLT_MAX (which is our 'locked' value and
would mean another thread has already locked it), but it also must be
equal to previously stored value...

This means we need two steps per loop here, hence using a 'for' loop
instead of a 'while' one now.

Note that collisions are (as expected) very rare, less than 1 for 10k
typically, so did not catch the issue initially (also because I was
mostly working with release build to check on performances...).
2017-11-25 20:28:12 +01:00
Sergey Sharybin
1caa267ee6 Depsgraph: Cleanup, indentation 2017-11-24 15:45:41 +01:00
Sergey Sharybin
5f7981243e Depsgraph: Allow finding operations after construction is done 2017-11-24 15:38:20 +01:00
Sergey Sharybin
a8b97b2e41 Depsgraph: Deduplicate operation node finding logic 2017-11-24 15:35:42 +01:00
Sergey Sharybin
d232363290 Depsgraph: Use proper return type for find_node method 2017-11-24 15:34:53 +01:00
Sergey Sharybin
d80c1e1e11 Depsgraph: Use get_ prefix for function which expect operation to exists 2017-11-24 15:32:29 +01:00
Sergey Sharybin
d8f33fc818 Depsgraph: Make has_ prefixed function to return boolean 2017-11-24 15:26:54 +01:00
Sergey Sharybin
93e8a045df Depsgraph: Introduce explicit method which finds operation or returns NULL 2017-11-24 15:24:33 +01:00
Sergey Sharybin
68654c0be5 Depsgraph: Make more clear what find_operation() is doing for component 2017-11-24 15:21:50 +01:00
Bastien Montagne
8db63c6a1c Cleanup leftover timing debug prints from own recent commits.
Sorry about that...
2017-11-24 10:43:29 +01:00
Campbell Barton
c62e3a05b0 Cleanup: -Wnonnull-compare GCC warning 2017-11-24 14:29:17 +11:00
Bastien Montagne
b63442e0b6 Minor cleanup for own recent commits. 2017-11-23 22:43:11 +01:00
Bastien Montagne
43ddf0e9a7 Getting rid of OMP: first usage of new parallel BMesh items iteration instead.
`BM_mesh_normals_update` was converted from OMP to new parallel iterator code,
basic test with heavily subdivided cube (24.5k faces) gives:
    - old OMP code: average 10ms per run.
    - new BLI_task code: average 6ms per run.

So new code seems to be easily 40% quicker, in addition to getting rid of OMP. ;)

Reviewers: sergey, campbellbarton

Differential Revision: https://developer.blender.org/D2930
2017-11-23 21:21:32 +01:00
Bastien Montagne
bc3f0cfd14 BMesh: add limited support for parallelization over some basic iterators.
This merely uses new memloop/task looper over vertex/edge/face mempools.

Quite obviously, only BM_VERTS/EDGES/FACES_OF_MESH iterators are
supported.
2017-11-23 21:19:54 +01:00
Bastien Montagne
efb86b712d Add a new parallel looper for MemPool items to BLI_task.
It merely uses the new thread-safe iterators system of mempool, quite
straight forward.

Note that to avoid possible confusion with two void pointers as
parameters of the callback, a dummy opaque struct pointer is used
instead for the second parameter (pointer generated by iteration over
mempool), callback functions must explicitely convert it to expected
real type.

Also added a basic gtest for this new feature.
2017-11-23 21:14:43 +01:00
Bastien Montagne
b84e6dfee4 Add ability to use more than one mempool iterator simultaneously.
This will allow threaded tasks to 'consume' all mempool items in
parallel tasks, each one working on a whole chunk at once (to reduce
concurrency managing overhead).
2017-11-23 21:12:00 +01:00
Bastien Montagne
d423e66d34 Add non-gcc variant of static assert macro.
Adapted from http://www.pixelbeat.org/programming/gcc/static_assert.html.

Note that this macro just discards error message, so error when building
is much less nice than with gcc's _Static_assert... But error log will
point to right place in code, so should still be OK.
2017-11-23 20:25:55 +01:00
Brecht Van Lommel
5e13097dc3 Fix T53145: bevel tool fails when used a second time.
Pixel size was not initial early enough. For first time this was not a problem
because the bevel amount starts at 0 then, and after the mouse moves the pixel
size is initialized. For the second time the bevel amount starts at a non-zero
value, and it failed then.
2017-11-23 20:17:31 +01:00
Brecht Van Lommel
56da112ae0 Fix T53360: crash with GLSL bump mapping and missing group output node. 2017-11-23 18:12:32 +01:00
Brecht Van Lommel
f218e6d4da Fix T53276: encoding output quality UI clarification. 2017-11-23 17:55:25 +01:00
Brecht Van Lommel
dd04f54e84 Fix inaccuracy when storing material ID pass in half float multilayer EXR.
These and other non-RGB passes should always be stored as full float, the
precision loss is too unpredictable.

Related to T53381, but that one is about file output nodes where we don't
know the type of data being saved currently.
2017-11-23 17:14:04 +01:00
Brecht Van Lommel
e50ed90e4d Fix T53348: Cycles difference between gradient texture on CPU and GPU. 2017-11-23 17:14:04 +01:00
Bastien Montagne
580b34e52b atomic_ops: add char versions of uint8_t atomic primitives. 2017-11-23 16:24:34 +01:00
Bastien Montagne
497e2b3dfa Cleanup: use signed atomic ops when needed. 2017-11-23 16:24:34 +01:00
Sergey Sharybin
75a87abdc9 Depsgraph: Cleanup, deduplicate code around component registration 2017-11-23 15:23:19 +01:00
Sergey Sharybin
f2842ac65e Depsgraph: Cleanup, split build_object() a bit 2017-11-23 12:01:31 +01:00
Sergey Sharybin
f3fa5c1258 Depsgraph: Cleanup, always call full object 2017-11-23 11:39:28 +01:00
Campbell Barton
434ed96dd2 Revert "BLI_utildefines: Support SWAP macro with two args"
This reverts commit d749320e3b.

It's possible the container struct is larger,
we could do sizeof checks that falls back to memmove
but rather avoid complicating things.
2017-11-23 15:21:50 +11:00
Campbell Barton
3bec70ca60 Use custom SWAP macro for swapping userdef data
Avoids complicating the common case
2017-11-23 15:18:22 +11:00
Campbell Barton
326efb4319 Fix T53274: Saving template prefs overwrites default prefs 2017-11-23 03:12:00 +11:00
Campbell Barton
d749320e3b BLI_utildefines: Support SWAP macro with two args 2017-11-23 03:11:48 +11:00
Campbell Barton
69b5165902 WM: minor correction to user-pref writing
When saving templates had wrong return value.
2017-11-22 17:11:03 +11:00
Bastien Montagne
5f4058ddbc Removing OMP: get rid of usages in /bmesh/ area.
Just removing it, such cases are not bottlenecks and not worth the
complication of doing real threading with own BLI_task.

Other (remaining) usages may be relevant, need case-by-case check.
2017-11-21 22:25:22 +01:00
Bastien Montagne
7770c2ef87 Removing OMP: get rid of last bit in /editors/ area.
Just removing it, such cases are not bottlenecks and not worth the
complication of doing real threading with own BLI_task.
2017-11-21 22:25:22 +01:00
Sergey Sharybin
6c372530b4 Cleanup: We do not use camel case in Blender code
At least not for variables.
2017-11-21 17:34:44 +01:00
Sergey Sharybin
6785a2bd66 Fix T53371: Keying Node fails with values above 1
This was expected behavior for over-exposured lamps when the mode was originally
created for Tears of Steel. Turns out, there could be really bad green screen in
real production which will only have green (or rather screen) channel over
exposured.

Tweaked condition now so we use least bright channel to see if the area has
proper exposure or not.

Seems to work fine in tests, but further tweaks are possible.
2017-11-21 17:31:45 +01:00
Campbell Barton
175e8fdc1e Disable adding scene sequence strips into themselves
D2923 by @spockTheGray w/ edits, see T52586 for details
2017-11-21 16:46:27 +11:00