Commit Graph

146 Commits

Author SHA1 Message Date
Sergey Sharybin
6298632bfa Guardedalloc: Don't use aligned blocks to calculate memory sloppyness
Aligned memory is allocated with memalign() and malloc_usable_size() can't be
used to measure this block.
2015-04-20 19:23:25 +05:00
Sergey Sharybin
9fc2c37328 Guardedalloc: Reset peak memory should set peak to currently allocated memory
Otherwise statistics could be really funny looking.
2015-02-19 13:14:06 +05:00
Sergey Sharybin
6c5f63b476 Guardedalloc: Add extra logging and checks in MEM_freeN()
We don't like when NULL is send to MEM_freeN(), but there was some
differences between lockfree and guarded allocators:

- Lockfree would have silently crash, in both release and debug modes
- Guarded allocator would have printed error message, abort in debug
  but keep working in release build.

This commit makes lockfree allocator behavior to match guarded one.
2015-02-19 01:58:49 +05:00
Dan Horák
c86c9297dc Fix inconsistent types in guardealloc
This basically fixes mix of size_t and uintptr_t usages which might be different size.
2014-10-14 16:11:20 +02:00
Sergey Sharybin
f2280661cb Enable atomic peak memory detection
This gives more precise information about memory usage which might be real handy
when doing memory optimization.

It works good here for as long as i can tell but if for some reason you'll be
experiencing some weird slowdown please let me know.
2014-10-10 01:55:57 +06:00
Brecht Van Lommel
32f83a298c Fix build errors in atomic ops and warning in aligned malloc on OS X. 2014-09-25 23:59:38 +02:00
Sergey Sharybin
faf4f29cc0 Guardedalloc: Implement atomic peak memory update
Updating maximum requires a bit of a cycle which usually does 1 iteration only,
sometimes needs a bit more but seems there's no speed regressions.

For now the code is commented out. This way it's easier for others to verify
there's no speed regressions.

Reviewers: campbellbarton

Differential Revision: https://developer.blender.org/D626
2014-09-26 00:40:53 +06:00
Tamito Kajiyama
af585e843b Fix inconsistent use of print_error() and fprintf(stderr, ...) in MEM_guarded_printmemlist_internal().
Also extended the size of buf[] in print_error() to prevent mem_printmemlist_pydict_script[]
from getting truncated when MEM_printmemlist_pydict() is used.

Differential revision: https://developer.blender.org/D675

Reviewed by: Campbell Barton
2014-07-25 19:24:24 +09:00
Sergey Sharybin
ecfc2db6e2 I'd tend to declare dead code is forbidden
All this code blocks commented out with UNUSED comment are
really useless.
2014-06-16 14:08:22 +06:00
Campbell Barton
788f4858d7 Comment unused macro 2014-06-14 16:27:13 +10:00
Sergey Sharybin
7e20583688 Attempt to fix sign conversion error happening on buildbot 2014-06-14 03:35:22 +06:00
Sergey Sharybin
d0573ce905 Attempt to fix guardedalloc on OSX 2014-06-14 01:52:35 +06:00
Sergey Sharybin
a87fb34eda Use advantage of SSE2 instructions in gaussian blur node
This gives around 30% of speedup for gaussian blur node.

Pretty much straightforward implementation inside the node
itself, but needed to implement some additional things:

- Aligned malloc. It's needed to load data onto SSE registers
  faster. based on the aligned_malloc() from Libmv with
  some additional trickery going on to support arbitrary
  alignment (this magic is needed because of MemHead).

  In the practice only 16bit alignment is supported because
  of the lack of aligned malloc with arbitrary alignment
  for OSX. Not a bit deal for now because we need 16 bytes
  alignment at this moment only. Could be tweaked further
  later.

- Memory buffers in compositor are now aligned to 16 bytes.
  Should be harmless for non-SSE cases too. just mentioning.

Reviewers: campbellbarton, lukastoenne, jbakker

Reviewed By: campbellbarton

CC: lockal

Differential Revision: https://developer.blender.org/D564
2014-06-14 00:38:07 +06:00
Matteo F. Vescovi
9b23d9acec Fix compilation error non non-linux architectures 2014-06-02 16:26:38 +06:00
Sebastian Ramacher
76f7a5bd6b Fix compilation error on kFreeBSD 2014-05-19 16:35:24 +02:00
Campbell Barton
43fb105ff1 Move LIKELY/UNLIKELY into header 2014-04-06 17:25:50 +10:00
Campbell Barton
43a201662a Guarded Alloc: use UNLIKELY for debug memset 2014-04-06 12:58:10 +10:00
Campbell Barton
c16bd951cd Enable GCC pedantic warnings with strict flags,
also modify MIN/MAX macros to prevent shadowing.
2014-03-30 15:04:20 +11:00
Campbell Barton
a99a8a6070 Code cleanup: style and warnings 2014-03-26 07:53:56 +11:00
Campbell Barton
8480bb64ec Code cleanup: style 2014-03-17 21:48:13 +11:00
Sergey Sharybin
a81cf3182f Fix typo in mmap commit from a while ago 2014-01-23 18:41:38 +06:00
Brecht Van Lommel
282ad434a8 Memory allocation: do not use mmap for memory allocation on 64 bit.
On Windows we can only do mmap memory allocation up to 4 GB, which causes a
crash when doing very large renders on 64 bit systems with a lot of memory.

As far as I can tell the reason to use mmap is to get around address space
limitation on some 32 bit operating systems, and I can't see a reason to use
it on 64 bit. For the original explanation see here:
http://orange.blender.org/blog/stupid-memory-problems

Fixes T37841.
2014-01-23 01:13:46 +01:00
Campbell Barton
a5183d7a87 Code Cleanup: use NULL for pointer checks and remove joke. 2013-11-22 10:43:42 +11:00
Sergey Sharybin
4f6dd555b7 Fix for wrong implementation of mmap in lock-free allocator
- Freeing was not using proper block length
- Duplicating memory block was not aware of
  mmaped blocks.
2013-10-20 00:12:54 +00:00
Brecht Van Lommel
1760f5fdcc Fix FreeBSD build with recent malloc changes, patch by Shane Ambler. 2013-10-11 14:41:00 +00:00
Brecht Van Lommel
b880b01db5 Fix OS X build error in malloc code, and warning in rna. 2013-10-10 15:44:47 +00:00
Sergey Sharybin
4bd4037276 Lock-free memory allocator
Release builds will now use lock-free allocator by
default without any internal locks happening.

MemHead is also reduces to as minimum as it's possible.
It still need to be size_t stored in a MemHead in order
to make us keep track on memory we're requesting from
the system, not memory which system is allocating. This
is probably also faster than using a malloc's usable
size function.

Lock-free guarded allocator will say you whether all
the blocks were freed, but wouldn't give you a list
of unfreed blocks list. To have such a list use a
--debug or --debug-memory command line arguments.

Debug builds does have the same behavior as release
builds. This is so tools like valgrind are not
screwed up by guarded allocator as they're currently
are.

--
svn merge -r59941:59942 -r60072:60073 -r60093:60094 \
          -r60095:60096 ^/branches/soc-2013-depsgraph_mt
2013-10-10 11:58:01 +00:00
Campbell Barton
2dc988df8c reorder BLI_strict_flags.h include so its not conflicting with stdio.h on apple. 2013-09-03 04:39:12 +00:00
Campbell Barton
fe427f0561 kd-tree,
- replace numbers with defines for allocation increments and default array size.
- move array reallocation into a static function (deduplicate 2x).

also fix own mistake with uninitialized slop-space var in memory printing statistics.
2013-09-01 08:58:46 +00:00
Joshua Leung
33c68846de Mingw/Windows Compiling Fix
This commit attempts to fix the following error:

intern\guardedalloc\intern\mallocn.c: In function 'rem_memblock':
intern\guardedalloc\intern\mallocn.c:977:48: error: conversion to 'intptr_t' from 'size_t' may change the sign of the result [-Werror=sign-conversion]

From the references I've managed to find, it appears that
the second arg to munmap() should be size_t not intptr_t.
Fortunately though, we don't use this arg anyways atm, so 
this should be quite harmless...
2013-09-01 05:12:36 +00:00
Campbell Barton
9ad5f32fc0 use strict flags for guarded alloc 2013-09-01 02:46:34 +00:00
Brecht Van Lommel
9135425607 Attempted fix for #36569: couldn't unmap memory errors on Windows. The guardedalloc optimizations were not entirely thread safe for mmap. 2013-08-29 23:46:44 +00:00
Campbell Barton
1ac57ccbc8 correct own recent commit, malloc_usable_size() isn't valid for mmap()'d memory. 2013-08-28 22:12:40 +00:00
Campbell Barton
1a6b364c28 should fix builds for osx 2013-08-28 11:22:29 +00:00
Campbell Barton
d1d6a13297 include slop-space in debug statistics (gcc/clang only) 2013-08-28 10:17:26 +00:00
Sergey Sharybin
c0f8e15295 Speedup for guarded allocator
- Re-arrange locks, so no actual memory allocation
  (which is relatively slow) happens from inside
  the lock. operation system will take care of locks
  which might be needed there on it's own.

- Use spin lock instead of mutex, since it's just
  list operations happens from inside lock, no need
  in mutex here.

- Use atomic operations for memory in use and total
  used blocks counters.

This makes guarded allocator almost the same speed
as non-guarded one in files from Tube project.

There're still MemHead/MemTail overhead which might
be bad for CPU cache utilization
2013-08-19 10:51:40 +00:00
Sergey Sharybin
018ab045e3 Added check for whether thread lock is being removed while thread is using guarded alloc.
--
svn merge -r58788:58789 ^/branches/soc-2013-depsgraph_mt
2013-08-19 10:38:27 +00:00
Sergey Sharybin
58d7ae891d Blender might be compiled without guardedalloc again
This is useful for benchmark tests, to make CPU cache
utilization as good as we could with current design.
2013-08-15 07:36:56 +00:00
Campbell Barton
bd89bd9e1c avoid using MEM_reallocN_id directly, add utility macro for freeing. 2013-08-04 03:00:04 +00:00
Campbell Barton
2a8d76d734 add versions of MEM_reallocN, MEM_recallocN which take a string arg so new allocs have an ID, changing existing functions signatures would be too disruptive at the moment. 2013-08-03 17:53:41 +00:00
Campbell Barton
7068a5eba7 tweak to recent commit, don't show keymap in menu tooltips. 2013-06-02 15:58:43 +00:00
Brecht Van Lommel
fe02323632 Fix to actually disable DEBUG_BACKTRACE by default. 2013-05-31 12:36:35 +00:00
Sergey Sharybin
1be2936298 Backtrace for unfreed memory blocks
Added an option to show backtrace from where
non-freed datablock was allocated from.

To enable this feature, simply enable DEBUG_BACKTRACE
in mallocn.c file and all unfreed datablocks will
be followed up by a backtrace.

Currently works on linux and osx only,
windows support is on TODO.

This feature is for sure disabled by default,
so does not affect any builds which don't
explicitly define DEBUG_BACKTRACE.
2013-05-30 14:27:24 +00:00
Brecht Van Lommel
bc0e3ffc0c Fix build error after removing return value from MEM_freeN. 2013-05-21 10:13:42 +00:00
Campbell Barton
cd6b27f2b5 remove return value from MEM_freeN, it wasn't used anywhere and was cast to a different function signature. (which evidently works but error prone). 2013-05-21 07:37:59 +00:00
Campbell Barton
7d4eee2b18 add option to disable guardedalloc, helps for debugging memory errors
since guardedalloc confuses them.

The option cases a warning on build, since its ownly for experimental
use.
2013-05-08 12:55:23 +00:00
Thomas Dinges
431619e45b Windows compile fix:
* r54117 broke Windows, __func__ not declared.
2013-01-27 15:12:52 +00:00
Campbell Barton
9d36fade8f make MEM_reallocN and MEM_recallocN behave as libc's realloc() - alloc when receiving a NULL value. 2013-01-27 11:20:50 +00:00
Sergey Sharybin
5e739ddae2 Added some code which helps troubleshooting issues caused by
non-threadsafe usage of guarded allocator.

Also added small chunk of code to check consistency of begin/end
threaded malloc.

All this additional checks are commented and wouldn't affect on
builds, however found them helpful to troubleshoot issues so
decided to commit it to SVN.
2013-01-24 08:14:05 +00:00
Sergey Sharybin
6571713ddb Ambient occlusion baker from multi-resolution mesh
This implements AO baking directly from multi-resolution mesh with much
less memory overhead than regular baker.

Uses rays distribution implementation from Morten Mikkelsen, raycast
is based on RayObject also used by Blender Internal.

Works in single-thread yet, multi-threading would be implemented later.
2012-12-18 17:46:42 +00:00