Commit Graph

4106 Commits

Author SHA1 Message Date
Alexander Gavrilov
01581d4a1e BLI_task: fix queue in work_and_wait, and support resetting.
To make the pool more usable for running multiple stages of tasks,
fix local queue handling in BLI_task_pool_work_and_wait.

Specifically, after the wait loop the local queue should be empty,
or the wait part of the function contract isn't fulfilled. Instead,
check and run any tasks in queue before the wait loop.

Also, add a new function that resets the suspended state of the pool.
2018-12-04 14:08:50 +03:00
Sergey Sharybin
df2635099b Merge branch 'master' into blender2.8 2018-12-04 11:45:22 +01:00
Sergey Sharybin
3f31ec8398 Cleanup: Spelling 2018-12-04 11:43:53 +01:00
Campbell Barton
81f898f053 Merge branch 'master' into blender2.8 2018-12-01 08:19:38 +11:00
Campbell Barton
a9bd788348 Cleanup: style 2018-12-01 08:15:25 +11:00
Campbell Barton
7e21c99d05 PyAPI: add load_factory_startup_post handler
Needed so we can apply changes to the startup file,
only in the case when it's load loaded from a user-saved startup.
2018-11-30 13:33:13 +11:00
Campbell Barton
2089feeb1b Fix leak in CPU brand check 2018-11-29 12:52:39 +11:00
Campbell Barton
67d77f47f0 Fix leak in CPU brand check 2018-11-29 08:22:15 +11:00
Sergey Sharybin
3ed0d5b4d4 Merge branch 'master' into blender2.8 2018-11-28 14:42:38 +01:00
Sergey Sharybin
ce927e15e0 Tweaks for threading schedule for Threadripper2 and EPYC
The idea is to make main thread and job threads to be scheduled
on CPU dies which has direct access to memory (those are NUMA
nodes 0 and 2).

We also do this for new EPYC CPUs since their NUMA nodes 1 and 3
do have access but only to a higher range DDR slots. By preferring
nodes 0 and 2 on EPYC we make it so users with partially filled
DDR slots has fast memory access.

One thing which is not really solved yet is localization of
memory allocation: we do not guarantee that memory is allocated
on the closest to the NUMA node DDR slot and hope that memory
manager of OS is acting in favor of us.
2018-11-28 14:41:22 +01:00
Sergey Sharybin
b3e2c69416 Add utility function to query CPU brand string 2018-11-28 14:35:26 +01:00
Campbell Barton
62edc31d34 Cleanup: correct function signatures 2018-11-28 16:21:24 +11:00
Campbell Barton
fb262f942e Cleanup: style, includes 2018-11-27 08:01:54 +11:00
Jacques Lucke
c1adf938e6 Timer: Generic BLI_timer with Python wrapper
There is a new `bpy.app.timers` api.
For more details, look in the Python API documentation.

Reviewers: campbellbarton

Differential Revision: https://developer.blender.org/D3994
2018-11-26 20:25:15 +01:00
Jacques Lucke
4b06d0bf51 Python API: bpy.app.handlers.depsgraph_update_pre/post
Reviewers: brecht

Differential Revision: https://developer.blender.org/D3978
2018-11-23 11:52:09 +01:00
Campbell Barton
f5df1efa2f Cleanup: warnings 2018-11-22 05:26:18 +11:00
Philipp Oeser
cec83e92e6 Fix T57884: Triangle count is incorrect when above around 2 billion
Maniphest Tasks: T57884

Differential Revision: https://developer.blender.org/D3962
2018-11-21 16:34:32 +01:00
Sergey Sharybin
5c632ced53 Merge branch 'master' into blender2.8 2018-11-20 15:02:13 +01:00
Sergey Sharybin
01e8e7dc6d Task scheduler: Optimize parallel loop over lists
The goal is to address performance regression when going from
few threads to 10s of threads. On a systems with more than 32
CPU threads the benefit of threaded loop was actually harmful.

There are following tweaks now:

- The chunk size is adaptive for the number of threads, which
  minimizes scheduling overhead.

- The number of tasks is adaptive to the list size and chunk
  size.

Here comes performance comparison on the production shot:

 Number of threads        DEG time before        DEG time after
       44                     0.09                   0.02
       32                     0.055                  0.025
       16                     0.025                  0.025
       8                      0.035                  0.033
2018-11-20 14:58:17 +01:00
Clément Foucault
0c987aa7ac BLI: Math: Add normal_float_to_short_v4 2018-11-17 14:56:17 +01:00
Campbell Barton
55e719ec35 Merge branch 'master' into blender2.8 2018-11-14 17:21:34 +11:00
Campbell Barton
d7f55c4ff5 Cleanup: comment block tabs 2018-11-14 17:10:56 +11:00
Sergey Sharybin
3cf724209f Cleanup, spelling 2018-11-08 15:00:19 +01:00
Campbell Barton
f6bec570c5 Merge branch 'master' into blender2.8 2018-11-07 12:28:26 +11:00
Campbell Barton
dca3d3ef0e Cleanup: use BLI_compiler_compat.h for BLI_INLINE 2018-11-07 12:17:58 +11:00
Alexander Gavrilov
f600b4bc67 Shrinkwrap: new mode that projects along the target normal.
The Nearest Surface Point shrink method, while fast, is neither
smooth nor continuous: as the source point moves, the projected
point can both stop and jump. This causes distortions in the
deformation of the shrinkwrap modifier, and the motion of an
animated object with a shrinkwrap constraint.

This patch implements a new mode, which, instead of using the simple
nearest point search, iteratively solves an equation for each triangle
to find a point which has its interpolated normal point to or from the
original vertex. Non-manifold boundary edges are treated as infinitely
thin cylinders that cast normals in all perpendicular directions.

Since this is useful for the constraint, and having multiple
objects with constraints targeting the same guide mesh is a quite
reasonable use case, rather than calculating the mesh boundary edge
data over and over again, it is precomputed and cached in the mesh.

Reviewers: mont29

Differential Revision: https://developer.blender.org/D3836
2018-11-06 21:20:17 +03:00
Alexander Gavrilov
84fa806491 BLI_kdopbvh: add an option to use a priority queue in find_nearest.
Simple find_nearest relies on a heuristic for efficient culling of
the BVH tree, which involves a fast callback that always updates the
result, and the caller reusing the result of the previous find_nearest
to prime the process for the next vertex.

If the callback is slow and/or applies significant restrictions on
what kind of nodes can qualify for the result, the heuristic can't
work. Thus for such tasks it is necessary to order and prune nodes
before the callback at BVH tree level using a priority queue.

Since, according to code history, for simple find_nearest the
heuristic approach is faster, this mode has to be an option.
2018-11-06 21:20:16 +03:00
Alexander Gavrilov
798cdaeeb6 Implement an Armature constraint that mimics the modifier.
The main use one can imagine for this is adding tweak controls to
parts of a model that are already deformed by multiple other major
bones. It is natural to expect such locations to deform as if the
tweaks aren't there by default; however currently there is no easy
way to make a bone follow multiple other bones.

This adds a new constraint that implements the math behind the Armature
modifier, with support for explicit weights, bone envelopes, and dual
quaternion blending. It can also access bones from multiple armatures
at the same time (mainly because it's easier to code it that way.)

This also fixes dquat_to_mat4, which wasn't used anywhere before.

Differential Revision: https://developer.blender.org/D3664
2018-11-06 10:56:08 +03:00
Campbell Barton
900c562b71 Cleanup: rename fast-heap -> heap-simple
In general prefer API names don't start with adjectives
since it causes grouping of unrelated API's for completion.
2018-11-06 13:06:49 +11:00
Campbell Barton
d805a4a5ef Cleanup: move fast heap into own source & header 2018-11-06 12:52:34 +11:00
Campbell Barton
29dfe9a614 Cleanup: style 2018-11-06 12:39:51 +11:00
Alexander Gavrilov
fee6ab18e7 BLI_heap: implement a limited but faster version of heap.
If the user only needs insertion and removal from top, there is
no need to allocate and manage separate HeapNode objects: the
data can be stored directly in the main tree array.

This measured a 24% FPS increase on a ~50% heap-heavy workload.

Reviewers: brecht

Differential Revision: https://developer.blender.org/D3898
2018-11-05 20:49:17 +03:00
Alexander Gavrilov
721a484ccb BLI_kdopbvh: reduce branching in calc_nearest_point_squared.
This lets the compiler use min/max instructions for 4.5% FPS
improvement in Shrinkwrap to Nearest Surface Point.
2018-11-05 17:16:06 +03:00
Alexander Gavrilov
a70589e439 BLI_heap: optimize heap_swap, heap_down and heap_up.
The index field of nodes is supposed to be its actual index, so
there is no need to read it in swap. On a 64-bit processor i and
j are already in registers, so this removes two memory reads.

In addition, cache the tree pointer, use branch hints, and
put the most frequently accessed 'value' field at 0 offset.

Produced a 20% FPS improvement for a 50% heap-heavy workload.
2018-11-05 17:16:06 +03:00
Alexander Gavrilov
d3c815bd08 BLI_heap: add an API function to directly read the top node value.
It is very commonly needed in loop conditions to check if
the items in the heap are good enough to continue.
2018-11-04 13:29:17 +03:00
Campbell Barton
0d69a5aa34 Merge branch 'master' into blender2.8 2018-11-04 18:12:58 +11:00
Campbell Barton
76b9eaf7a8 Fix ghash masking out upper bits on 64bit systems
The code this was taken from assumes a 'size_t' result,
which isn't the case here.

In practice the bucket distribution wasn't bad,
even so this was a nop so best fix.
2018-11-04 16:48:36 +11:00
Brecht Van Lommel
1b974563b1 Fix T57529: 2D image paint fill tool not taking into account alpha. 2018-11-01 13:05:57 +01:00
Clément Foucault
2c545c0409 BLI: Add comment about to orthogonalize_m3/4 2018-10-28 21:48:22 +01:00
Campbell Barton
a30c9f710a Partial revert '#if 0' cleanup
Partially revert 41216d5ad4

Some of this code had comments to be left as is for readability,
or comment the code should be kept.
Other functions were only for debugging.
2018-10-19 09:18:22 +11:00
Jacques Lucke
41216d5ad4 Cleanup: Remove more #if 0 blocks
Continuation of https://developer.blender.org/D3802

Reviewers: brecht

Differential Revision: https://developer.blender.org/D3808
2018-10-18 15:43:06 +02:00
Clément Foucault
01745051de DRW: Add DRW_shgroup_create_sub to create children shgroup
This makes is easy to create nested drawcalls that will inherit all the
parents properties. This is usefull if only a few uniforms changes for that
drawcall.
2018-10-12 16:38:55 +02:00
Campbell Barton
f5e51ded09 Merge branch 'master' into blender2.8 2018-10-11 09:38:17 +11:00
Campbell Barton
b618c185cb Fix incorrect strncpy use
Didn't ensure null terminated.
2018-10-11 09:36:43 +11:00
Clément Foucault
a5101de6a9 BLI: Add mul_v2_v2v2 function 2018-10-01 15:14:46 +02:00
mano-wii
d6ca699d7e BLI_math: add isect_seg_seg_v3 2018-10-01 13:52:00 +10:00
mano-wii
9df476ecaa BLI_math: add isect_seg_seg_v3 function and use in the cloth collision algorith.
In my tests a 4% improvement in performance was achieved by simulating a square cloth over the cube.
2018-10-01 00:16:44 -03:00
Antonioya
2ca67de960 A new function to move list at the beginning of another list
This is a change of the BLI_movelisttolist but in reverse order.
2018-09-29 16:45:33 +02:00
Brecht Van Lommel
28324143c4 UI: draw mono icons with button type text color, instead of area text color. 2018-09-27 18:39:50 +02:00
Luca Rood
0666ece2e2 Cloth: Collision improvements
This commit includes several performance, stability, and reliability
improvements to cloth collisions.

Most notably:
* The implementation of a new self-collisions system.
* Multithreading of collision detection.
* Implementation of single sided collisions and normal overrides.
* Replacement of the `plNearestPoints` function from Bullet with a
dedicated solution.

Further, this also includes several bug fixes, and algorithmic
improvements.

Reviewed By: brecht

Differential Revision: http://developer.blender.org/D3712
2018-09-26 17:49:40 +02:00