Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


C++ - Change number of threads for TensorFlow inference with C API

I'm writing a C++ wrapper around the TensorFlow 1.2 C API (for inference purposes, if it matters). My application is multi-process and multi-threaded, with resources explicitly allocated, so I would like to limit TensorFlow to a single thread.

Currently, when I run a simple inference test that allows batch processing, I see it using all CPU cores. I have tried limiting the number of threads for a new session using a mixture of C and C++ as follows (forgive the partial code snippet, I hope it makes sense):

// Build a session config that requests a single thread everywhere
tensorflow::ConfigProto conf;
conf.set_intra_op_parallelism_threads(1);
conf.set_inter_op_parallelism_threads(1);
conf.add_session_inter_op_thread_pool()->set_num_threads(1);
// Serialize the proto and hand the raw bytes to the C API
std::string str;
conf.SerializeToString(&str);
TF_SetConfig(m_session_opts, str.data(), str.size(), m_status);
m_session = TF_NewSession(m_graph, m_session_opts, m_status);

But this makes no difference: all cores are still fully utilized.

Am I using the C API correctly?

(My current workaround is to recompile TensorFlow with the number of threads hard-coded to 1, which will probably work, but it's obviously not the best approach...)

-- Update --

I also tried adding:

conf.set_use_per_session_threads(true);

That did not help either. Multiple cores are still used...

I also tried to run with high log verbosity, and got this output (showing only what I think is relevant):

tensorflow/core/common_runtime/local_device.cc:40] Local device intraop parallelism threads: 8
tensorflow/core/common_runtime/session_factory.cc:75] SessionFactory type DIRECT_SESSION accepts target: 
tensorflow/core/common_runtime/direct_session.cc:95] Direct session inter op parallelism threads for pool 0: 1

The "parallelism threads: 8" message shows up as soon as I instantiate a new graph using TF_NewGraph(). I didn't find a way to specify options prior to this graph allocation though...


1 Reply


I had the same problem and solved it by setting the number of threads when creating the first TF session in my application. If the first session is created without an options object, TF creates a number of worker threads equal to twice the number of cores on the machine.

Here is the C++ code I used:

#include <memory>
#include "tensorflow/core/public/session.h"

// Call once when the application starts, before any other session is created
void InitThreads(int coresToUse)
{
    // Configure the number of worker threads
    tensorflow::SessionOptions options;
    tensorflow::ConfigProto& config = options.config;
    if (coresToUse > 0)
    {
        config.set_inter_op_parallelism_threads(coresToUse);
        config.set_intra_op_parallelism_threads(coresToUse);
        // false: use the process-wide thread pools, not per-session ones
        config.set_use_per_session_threads(false);
    }
    // Create (and immediately close) a session so the settings take effect
    std::unique_ptr<tensorflow::Session>
        session(tensorflow::NewSession(options));
    if (session)  // NewSession returns nullptr on failure
        session->Close();
}

Pass 1 to limit the number of inter-op and intra-op threads to 1 each.
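
If you only link against the C API and cannot build the ConfigProto class, the same two options could in principle be hand-encoded as protobuf bytes and passed to TF_SetConfig. The following is a minimal sketch, not something I have verified against every TF version: EncodeThreadConfig and AppendVarintField are hypothetical helper names, and the field numbers (2 for intra_op_parallelism_threads, 5 for inter_op_parallelism_threads) are an assumption taken from config.proto that you should check against your TF sources. It only produces the bytes; as above, it would presumably still need to be used for the first session created in the process.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Append one protobuf varint field (wire type 0) to `out`.
// `field` is the proto field number, `value` a non-negative integer.
static void AppendVarintField(std::vector<std::uint8_t>& out,
                              std::uint32_t field, std::uint64_t value) {
    auto put_varint = [&out](std::uint64_t v) {
        while (v >= 0x80) {
            // Low 7 bits plus the continuation bit
            out.push_back(static_cast<std::uint8_t>(v | 0x80));
            v >>= 7;
        }
        out.push_back(static_cast<std::uint8_t>(v));
    };
    put_varint(static_cast<std::uint64_t>(field) << 3);  // tag, wire type 0
    put_varint(value);
}

// Hypothetical helper: serialized ConfigProto carrying only thread counts.
// ASSUMPTION: field 2 = intra_op_parallelism_threads,
//             field 5 = inter_op_parallelism_threads (see config.proto).
std::vector<std::uint8_t> EncodeThreadConfig(int intra_threads,
                                             int inter_threads) {
    // Zero means "let TF decide" and negative values encode differently,
    // so this sketch only accepts positive counts.
    assert(intra_threads > 0 && inter_threads > 0);
    std::vector<std::uint8_t> bytes;
    AppendVarintField(bytes, 2, static_cast<std::uint64_t>(intra_threads));
    AppendVarintField(bytes, 5, static_cast<std::uint64_t>(inter_threads));
    return bytes;
}
```

With the C API available, the result would be passed as TF_SetConfig(opts, bytes.data(), bytes.size(), status) before calling TF_NewSession. Under the assumed field numbers, EncodeThreadConfig(1, 1) yields the four bytes 0x10 0x01 0x28 0x01.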

Edit: IMPORTANT NOTE: This code works when called from the main application (the Google sample trainer), BUT it stopped working when I moved it into a DLL dedicated to wrapping TensorFlow. TF 1.4.1 ignores the parameters I pass and spins up all threads. I would like to hear your comments...
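
To check whether the setting actually took effect, one option is to compare the process's thread count before and after the first session is created. The sketch below is Linux-only and is not part of TensorFlow: CurrentThreadCount is a hypothetical helper that reads the "Threads:" line of /proc/self/status. On Windows (the DLL case) you would need a different mechanism, which is not shown here.

```cpp
#include <fstream>
#include <ios>
#include <limits>
#include <string>

// Linux-only sketch: number of threads in the current process, as reported
// by the "Threads:" line of /proc/self/status. Returns -1 if the file or
// the line cannot be read (for example on non-Linux systems).
int CurrentThreadCount() {
    std::ifstream status("/proc/self/status");
    std::string key;
    while (status >> key) {  // first token of each line is "Key:"
        if (key == "Threads:") {
            int count = -1;
            if (status >> count)
                return count;
            return -1;
        }
        // Not the line we want: skip the rest of it
        status.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    }
    return -1;
}
```

Calling CurrentThreadCount() immediately before and after InitThreads(1) would show how many threads the first session spun up.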

