Multithreaded calculation
Benchmarks for the threaded algorithm are shown here for unmasked z, a problem size n of 1000 and a total_chunk_count of 40, using up to 6 threads.
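As a minimal sketch of the configuration described above, the following creates a threaded contour generator with the same problem size, chunk count, and maximum thread count. The z field is a smooth stand-in chosen for illustration, not the dataset used in the benchmarks, and the contour levels are arbitrary.

```python
import numpy as np
from contourpy import FillType, LineType, contour_generator

n = 1000  # problem size quoted for the benchmarks
x, y = np.meshgrid(np.linspace(0.0, 1.0, n), np.linspace(0.0, 1.0, n))

# Illustrative smooth field only; NOT the benchmark's own dataset.
z = np.sin(6.0 * x) * np.cos(6.0 * y)

cont_gen = contour_generator(
    x, y, z,
    name="threaded",
    line_type=LineType.ChunkCombinedOffset,
    fill_type=FillType.ChunkCombinedOffsetOffset,
    total_chunk_count=40,
    thread_count=6,  # requested count; the library may use fewer
)

lines = cont_gen.lines(0.0)          # contour lines at an arbitrary level
filled = cont_gen.filled(0.0, 0.5)   # filled contours between two arbitrary levels

print("threads in use:", cont_gen.thread_count)
print("chunk counts (y, x):", cont_gen.chunk_count)
```

Printing thread_count is a useful check, since the number of threads actually used can be lower than the number requested, for example on a machine with fewer cores.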
For the simple dataset, contour calculations are faster with more threads but do not scale particularly well. The speedup with 6 threads is 2.4-2.5 for lines() and 2.5-2.6 for filled(). This dataset is perhaps not computationally expensive enough to justify the use of multiple threads.
For the random dataset, contour calculations scale much better with an increasing number of threads, as long as one of the ChunkCombined... line or fill types is being used. Using 6 threads, the speedup is 4.4 for lines() and 5.1-5.2 for filled().
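One way to compare the quoted speedups is parallel efficiency, the speedup divided by the number of threads. This metric is not part of the benchmarks themselves; the sketch below simply applies it to the lower end of each 6-thread speedup range quoted above.

```python
def parallel_efficiency(speedup: float, threads: int) -> float:
    """Fraction of ideal linear scaling achieved (1.0 means perfect scaling)."""
    return speedup / threads


# Lower bound of each speedup range quoted in the text for 6 threads.
quoted_speedups = {
    ("simple", "lines"): 2.4,
    ("simple", "filled"): 2.5,
    ("random", "lines"): 4.4,
    ("random", "filled"): 5.1,
}

for (dataset, function), speedup in quoted_speedups.items():
    efficiency = parallel_efficiency(speedup, threads=6)
    print(f"{dataset:6s} {function:6s} speedup {speedup:.1f} -> efficiency {efficiency:.0%}")
```

On these figures the simple dataset reaches roughly 40% of ideal scaling, while the random dataset with a ChunkCombined... type reaches roughly 73-85%.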
The LineType and FillType options that do not scale well are those that return individual
NumPy arrays for each line or polygon rather than combined arrays for each chunk. This is because
the allocation of a new NumPy array can only be performed by one thread at a time, so the larger the
number of arrays that are generated, the greater the likelihood that other threads are left waiting
before they can allocate arrays.
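The difference in the number of allocated arrays can be illustrated with a rough sketch that counts the NumPy arrays returned by lines() for one line type of each kind. The noisy z field, the level, and the count_arrays helper are assumptions made for this illustration and are not taken from the benchmarks.

```python
import numpy as np
from contourpy import LineType, contour_generator

rng = np.random.default_rng(0)
# Noisy stand-in field; NOT the benchmark's own dataset.
z = rng.uniform(size=(200, 200))


def count_arrays(obj):
    """Recursively count NumPy arrays in nested lists/tuples, ignoring None."""
    if isinstance(obj, np.ndarray):
        return 1
    if isinstance(obj, (list, tuple)):
        return sum(count_arrays(item) for item in obj)
    return 0


counts = {}
for line_type in (LineType.Separate, LineType.ChunkCombinedOffset):
    cont_gen = contour_generator(
        z=z, name="threaded", line_type=line_type, total_chunk_count=4
    )
    counts[line_type.name] = count_arrays(cont_gen.lines(0.5))

print(counts)
```

With LineType.Separate there is one array per contour line, so a noisy field produces a very large number of arrays, whereas LineType.ChunkCombinedOffset returns at most a points array and an offsets array per chunk.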
Note
Whether it is worth using threaded rather than serial for a particular problem depends on the complexity of the dataset and on what the calculated contours will be used for. If they are only needed for rendering with the Matplotlib Agg renderer, then for complicated problems the rendering time usually far exceeds the calculation time, so a reduction in calculation time may not be of much real-world benefit.