Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
1.3k views
in Technique[技术] by (71.8m points)

multithreading - Confusion about multiprocessing and workers in Keras fit_generator() with windows 10 in spyder

In the documentation for fit_generator() (docs: https://keras.io/models/sequential/#fit_generator) it says that the parameter use_multiprocessing accepts a bool that if set to True allows process-based threading.

It also says that the parameter workers is an integer that designates how many process to spin up if using process-based threading. Apparently it defaults to 1 (a single process based thread) and if set to 0 it will execute the generator on the main thread.

What I thought this meant was that if use_multiprocessing=True and workers > 0 (let's use 6 for an example) that it would spin up 6 processes running the generator independently. However, when I test this I think I must be misunderstanding something (see below).

My confusion arises from the fact that if I set use_multiprocessing to False and workers = 1 then in my task manager I can see that all 12 of my virtual cores are being utilized somewhat evenly and I am at about 50% CPU usage while training my model (for reference, I have an i7-8750H CPU with 6 cores that support virtualization and I have virtualization enabled in BIOS). If I increase the number of workers, the CPU usage goes to 100% and training is much faster. If I decrease the number of workers to 0 so that it runs on the main thread, I can see that all of my virtual cores are still being used, but it seems somewhat uneven and CPU usage is at about 36%.

Unfortunately, if I set multiprocessing = True, then I get a brokenpipe error. I have yet to fix this, but I'd like to better understand what I am trying to fix here.

If someone could please explain the difference between training with use_multiprocessing = True and use_multiprocessing = False, as well as when workers are = 0, 1, and >1 I would be very grateful. If it matters, I am using tensorflow (gpu version) as the backend for keras with python 3.6 in Spyder with the IPython Console.

My suspicion is that use_multiprocessing is actually enabling multiprocessing when True whereas workers>1 when use_multiprocessing=False is setting the number of threads, but that's just a guess.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

The only thing I know is that when use_multiprocessing=False and workers > 1, there are many parallel data loading threads (I'm not really good with these names, threads, processes, etc.). But there are five parallel fronts loading data to the queue (so, loading data is faster, but it doesn't affect the model's speed - this can be good when data loading takes too long).

Whenever I tried use_multiprocessing=True, everything got frozen.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...