MultiThread: delayed spawn of last thread


#1

Hi,

I’m writing OFX plugins for Natron (on Linux) that use MultiThread and just noticed that the last thread seems to be spawned only after one of the others has finished. For example, if nThreads is 8, the threads with IDs 0 to 6 start running, but not thread 7. Only after one of threads 0 to 6 has finished does thread 7 start running.

This happens only when the “Max threads usable per effect” preference is set to the number of cores available on the system (or to 0=guess which results in the same).

I reckon Natron reserves one core so the user interface stays responsive, which probably isn’t a bad idea. But then that last thread shouldn’t be offered to the plugin for rendering and then held back.


#2

Basically, when the setting “Effects use thread-pool” is checked, the multi-thread suite uses the application’s global thread pool (which has at most the number of threads defined in the setting “Number of render threads”).

The global thread pool is also used to launch the render action, so if you have a total of 8 threads on a CPU, 1 is used by the render action to call the multi-thread suite and the other 7 are used to execute the functor.

The thread pool manages its threads itself, and whenever one becomes available it picks up work to do. There’s no guarantee that the number of threads you pass to multiThread is the number the thread pool will actually use (maybe only 2 are available at that point!).

You may uncheck the “Effects use thread-pool” setting, but it will yield terrible performance, because you are going to force Natron to spawn many threads, which will clutter the CPU and force it to do more scheduling than actual “work”.


#3

Thanks for the quick reply! This makes sense.

In this case a dynamic assignment of small tasks seems more appropriate than a fixed division of the job between n threads. Otherwise we might have to wait for one or two stragglers to complete. (At least if interactive performance of a single node is important).


#4

You should not even have to care about that as a plug-in developer; this is the host’s responsibility.
Basically you would call getNumCpu and pass it to the multiThread function.

Regarding interactivity, your plug-in should periodically (say, every scan-line) call abort() to check whether the render was aborted by the user, to avoid useless computations.
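A minimal sketch of that per-scan-line check, with the abort query passed in as a callback so the snippet stands alone. `renderRows`, `isAborted` and `processRow` are made-up names for illustration, not OFX API; in a real plug-in the callback would wrap the effect’s abort() call.

```cpp
#include <cassert>
#include <functional>

// Processes rows [y0, y1) and polls `isAborted` once per scan-line.
// Returns the number of rows actually processed.
// `isAborted` and `processRow` are hypothetical stand-ins, not OFX API.
int renderRows(int y0, int y1,
               const std::function<bool()>& isAborted,
               const std::function<void(int)>& processRow)
{
    assert(y0 <= y1);
    int done = 0;
    for (int y = y0; y < y1; ++y) {
        if (isAborted()) {
            break; // render was cancelled: stop instead of finishing the slice
        }
        processRow(y);
        ++done;
    }
    return done;
}
```

Checking once per row keeps the overhead small compared with a per-pixel check while still reacting quickly to a cancelled render.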


#5

Well, what I mean is that if I divide the number of scanlines by nThreads (which, let’s say, is 8), I expect to get the job done in about 1/8th of the time a single thread needs (in an ideal world …). However, the way it is currently handled, I get 7 threads running first and then one thread doing the last bit, which takes twice as long as if all 8 threads had started together. In this case it would be better to know right away that only 7 threads are available, so I could get the result in about 1/7th of the single-thread time rather than 2/8ths.
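To put numbers on this, here is a small sketch under idealized assumptions: all slices are equal, each takes the same time, and a waiting slice starts as soon as a full round of running slices has finished. `relativeTime` is a made-up helper name.

```cpp
#include <cassert>

// Wall-clock time relative to a single-threaded run, assuming the job is cut
// into `slices` equal pieces and only `workers` threads run concurrently.
double relativeTime(int slices, int workers)
{
    assert(slices > 0 && workers > 0);
    int rounds = (slices + workers - 1) / workers; // ceil(slices / workers)
    return static_cast<double>(rounds) / slices;
}
```

With these assumptions, relativeTime(8, 8) is 0.125, relativeTime(8, 7) is 0.25 (two rounds of 1/8 each), and relativeTime(7, 7) is 1/7, roughly 0.143.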

So far I have actually used a simple scheduling scheme that dynamically assigns scanlines to threads as they request them. This way I didn’t run into the issue of waiting for the last thread to finish (it never got work assigned because it never requested any). I only wondered why one CPU was always idle.
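A minimal sketch of such a scheme, using plain std::thread and an atomic row counter rather than the OFX multi-thread suite, so it runs on its own. `RowQueue`, `drainRows` and `renderDynamic` are made-up names, and the per-row “work” is a placeholder.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Shared state handed to every worker.
struct RowQueue {
    std::atomic<int> next{0}; // next unclaimed scanline
    int height = 0;           // total number of scanlines
};

// Worker body: claim one scanline at a time until none are left.
// Returns how many rows this worker processed.
int drainRows(RowQueue& queue, std::vector<int>& image)
{
    int processed = 0;
    for (;;) {
        int y = queue.next.fetch_add(1); // atomically claim a row
        if (y >= queue.height) {
            break; // nothing left to claim
        }
        image[y] = y * 2; // placeholder for real per-scanline work
        ++processed;
    }
    return processed;
}

// Runs `nThreads` workers over `height` rows and returns the filled buffer.
std::vector<int> renderDynamic(int height, int nThreads)
{
    assert(height >= 0 && nThreads > 0);
    RowQueue queue;
    queue.height = height;
    std::vector<int> image(height, -1);
    std::vector<std::thread> pool;
    for (int i = 0; i < nThreads; ++i) {
        pool.emplace_back([&] { drainRows(queue, image); });
    }
    for (auto& t : pool) {
        t.join();
    }
    return image;
}
```

A worker that starts late simply finds fewer rows left (or none), so nobody waits on a slice reserved for a thread that has not started yet.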

That said, according to your description, aren’t we wasting the render thread? It seems to sit idle in the .multiThread() call, waiting for the worker threads to finish. Wouldn’t it be better to have it execute one of the multiThreadFunction()s as well?
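To illustrate what I mean, here is a rough sketch with plain std::thread. It is not how Natron or the OFX suite is implemented; `multiThreadWithCaller` is a made-up name, and error handling is left out.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical helper, not the OFX suite: runs func(index, nThreads) for
// every index in [0, nThreads), with the calling thread taking index 0.
void multiThreadWithCaller(int nThreads,
                           const std::function<void(int, int)>& func)
{
    assert(nThreads > 0);
    std::vector<std::thread> helpers;
    for (int i = 1; i < nThreads; ++i) {
        helpers.emplace_back(func, i, nThreads); // indices 1..nThreads-1
    }
    func(0, nThreads); // the caller does a share instead of just waiting
    for (auto& t : helpers) {
        t.join();
    }
}
```

Here only nThreads - 1 extra threads are needed, and the calling thread is busy for the whole duration of the call.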