Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

multithreading - Use cases for ithreads (interpreter threads) in Perl and rationale for using or not using them?

If you want to learn how to use Perl interpreter threads, there's good documentation in perlthrtut (threads tutorial) and the threads pragma manpage. It's definitely good enough to write some simple scripts.

However, I have found little guidance on the web on why, and for what, one would sensibly use Perl's interpreter threads. In fact, there's not much talk about them, and when people do talk about them it's quite often to discourage others from using them.

These threads are available when perl -V:useithreads reports useithreads='define'; and are enabled by use threads. They are also called ithreads, perhaps more appropriately so, as they are very different from the threads offered by the Linux or Windows operating systems or the Java VM: nothing is shared by default, and a lot of data is copied (not just the thread stack), which significantly increases the process size. (To see the effect, load some modules in a test script, then create threads in a loop, pausing for a key press each time around, and watch memory rise in Task Manager or top.)
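To make the "nothing is shared by default" point concrete, here is a minimal sketch. It assumes a perl built with ithreads, and the variable names are made up for illustration. The thread modifies a variable it captured from the parent, yet the parent never sees the change, because the thread is working on its own clone.

```perl
use strict;
use warnings;
use threads;

my @data = (1, 2, 3);    # ordinary lexical, not shared

# The closure captures @data, but the new thread works on its own clone.
my $thr = threads->create(sub {
    push @data, 4;           # changes the thread's private copy only
    return scalar @data;
});

my $count_in_thread = $thr->join;
my $count_in_parent = scalar @data;

print "thread: $count_in_thread, parent: $count_in_parent\n";   # thread: 4, parent: 3
```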

[...] every time you start a thread all data structures are copied to the new thread. And when I say all, I mean all. This e.g. includes package stashes, global variables, lexicals in scope. Everything!

-- Things you need to know before programming Perl ithreads (Perlmonks 2003)

When researching the subject of Perl ithreads, you'll see people discouraging you from using them ("extremely bad idea", "fundamentally flawed", or "never use ithreads for anything").

The Perl thread tutorial highlights that "Perl Threads Are Different", but it makes little effort to explain how they are different and what that means for the user.

A useful but very brief explanation of what ithreads really are is from the Coro manpage under the heading WINDOWS PROCESS EMULATION. The author of that module (Coro - the only real threads in perl) also discourages using Perl interpreter threads.

Somewhere I read that compiling perl with threads enabled will result in a significantly slower interpreter.

There's a Perlmonks page from 2003 (Things you need to know before programming Perl ithreads), in which the author asks: "Now you may wonder why Perl ithreads didn't use fork()? Wouldn't that have made a lot more sense?" This seems to have been written by the author of the forks pragma. I'm not sure whether the information given on that page still holds true in 2012 for newer Perls.

Here are some guidelines for usage of threads in Perl I have distilled from my readings (maybe erroneously so):

That is my research so far. Thanks for any more light you can shed on this issue of threads in Perl. What are some sensible use cases for ithreads in Perl? What is the rationale for using or not using them?

1 Reply

The short answer is that they're quite heavy (you can't launch 100+ of them cheaply), and they exhibit unexpected behaviours (somewhat mitigated by recent CPAN modules).

You can safely use Perl ithreads by treating them as independent Actors.

  1. Create a Thread::Queue::Any for "work".
  2. Launch several ithreads, each with its own "result" queue, passing each thread the "work" queue and its own "result" queue by closure.
  3. Only then load (require) the remaining code your application needs (not before launching the threads!).
  4. Add work for the threads to the "work" queue as required.

In "worker" ithreads:

  1. Bring in any common code (for any kind of job)
  2. Blocking-dequeue a piece of work from the Queue
  3. Demand-load any other dependencies required for this piece of work.
  4. Do the work.
  5. Pass the result back to the main thread via the "result" queue.
  6. Back to 2.
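The two lists above can be sketched roughly as follows. This is only an illustration under a few assumptions: it uses the core Thread::Queue module instead of Thread::Queue::Any, the pool size and the "job" (summing 1..n with List::Util) are invented, and an undef item is used as this sketch's own "no more work" marker.

```perl
use strict;
use warnings;
use threads;          # load threads while little else is loaded
use Thread::Queue;    # core module, swapped in for Thread::Queue::Any

my $n_workers = 3;                    # hypothetical pool size
my $work      = Thread::Queue->new;   # single "work" queue
my (@workers, @result_queues);

for (1 .. $n_workers) {
    my $result = Thread::Queue->new;  # this worker's own "result" queue
    push @result_queues, $result;
    # Both queues reach the worker through the closure.
    push @workers, threads->create(sub {
        # Blocking dequeue; undef is this sketch's "no more work" marker.
        while (defined(my $n = $work->dequeue)) {
            require List::Util;       # demand-load what this job needs
            $result->enqueue(List::Util::sum(1 .. $n));
        }
    });
}

# The main thread would load the rest of the application here, after launch.
$work->enqueue($_) for 1 .. 5;          # add work
$work->enqueue(undef) for @workers;     # one stop marker per worker
$_->join for @workers;

# Drain every result queue without blocking.
my @sums;
for my $q (@result_queues) {
    while (defined(my $r = $q->dequeue_nb)) { push @sums, $r }
}
@sums = sort { $a <=> $b } @sums;
print "@sums\n";    # 1 3 6 10 15
```

A real application would more likely read results while the workers are still running rather than after joining them; joining first just keeps the sketch deterministic.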

If some "worker" threads grow too large, and you need to cap their number and replace them with fresh ones from time to time, create a "launcher" thread first. Its only job is to launch "worker" threads and hook them up to the main thread; because it is created early, the workers it spawns are cloned from a still-small interpreter.
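A launcher thread could look something like the sketch below. The queue names, the "worker-N" labels and the undef shutdown markers are all assumptions made for illustration, and Thread::Queue (core) stands in for Thread::Queue::Any. Retiring and replacing workers is left out; the sketch only shows workers being spawned on request from a thread that was created before the main thread grew.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $spawn   = Thread::Queue->new;   # main -> launcher: "start a worker"
my $work    = Thread::Queue->new;   # main -> workers
my $results = Thread::Queue->new;   # workers -> main

# Created first, while the process is small: every worker is cloned from
# this thread, not from the (later, larger) main thread.
my $launcher = threads->create(sub {
    my @kids;
    while (defined(my $name = $spawn->dequeue)) {   # undef means "shut down"
        push @kids, threads->create(sub {
            while (defined(my $job = $work->dequeue)) {
                $results->enqueue("$name finished job $job");
            }
        });
    }
    $_->join for @kids;
    return scalar @kids;
});

# The main thread may now load heavy modules without bloating new workers.

$spawn->enqueue("worker-$_") for 1 .. 2;   # ask for two workers
$work->enqueue($_) for 1 .. 4;             # hand out jobs
$work->enqueue(undef) for 1 .. 2;          # one stop marker per worker
$spawn->enqueue(undef);                    # stop the launcher

my $launched = $launcher->join;
my $finished = $results->pending;
print "launched $launched workers, $finished jobs done\n";   # launched 2 workers, 4 jobs done
```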

What are the main problems with Perl ithreads?

They're a little inconvenient for "shared" data, as you need to do the sharing explicitly (not a big issue).
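For what it's worth, explicit sharing with the threads::shared pragma looks roughly like this. It is a small sketch with invented variable names: one counter is marked :shared and protected with lock(), the other is an ordinary variable whose updates stay inside each thread's private copy.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $shared_hits :shared = 0;   # explicitly shared between all threads
my $private_hits        = 0;   # ordinary variable: each thread gets a copy

my @threads;
for (1 .. 4) {
    push @threads, threads->create(sub {
        for (1 .. 1000) {
            lock($shared_hits);    # released at the end of each iteration
            $shared_hits++;
        }
        $private_hits += 1000;     # lost when the thread exits
    });
}
$_->join for @threads;

print "shared: $shared_hits, private: $private_hits\n";   # shared: 4000, private: 0
```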

You need to look out for the behaviour of objects with DESTROY methods: when a copy goes out of scope in one thread, its destructor runs there, even if the underlying resource is still required in another thread!
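Here is a small sketch of the DESTROY problem, together with one common workaround (remembering the creating thread's id and skipping the cleanup elsewhere). The Resource class and the counters are hypothetical, and the guard is just one option, not something the modules mentioned here prescribe.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $destroy_calls :shared = 0;   # how often DESTROY ran, in any thread
my $cleanups      :shared = 0;   # how often the "real" cleanup ran

{
    package Resource;            # hypothetical class standing in for a handle
    sub new { my ($class) = @_; return bless { owner_tid => threads->tid }, $class }
    sub DESTROY {
        my ($self) = @_;
        { lock($destroy_calls); $destroy_calls++; }
        # Guard: only the thread that created the object releases the resource.
        return unless $self->{owner_tid} == threads->tid;
        lock($cleanups);
        $cleanups++;
    }
}

{
    my $res = Resource->new;
    # The thread holds a clone of $res; dropping it runs DESTROY in that thread.
    threads->create(sub { undef $res; return })->join;
}   # the original goes out of scope here, in the main thread

print "DESTROY calls: $destroy_calls, cleanups: $cleanups\n";   # DESTROY calls: 2, cleanups: 1
```

Without the owner_tid check, the cleanup would run twice for what is logically one resource.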

The big one: data/variables that aren't explicitly shared are CLONED into new threads. This is a performance hit and probably not at all what you intended. The workaround is to launch ithreads from a pretty much "pristine" condition (not many modules loaded).

IIRC, there are modules in the Thread:: namespace that help with making dependencies explicit and/or cleaning up cloned data for new threads.

Also, IIRC, there's a slightly different model using ithreads called "Apartment" threads, implemented by Thread::Apartment, which has a different usage pattern and another set of trade-offs.

The upshot:

Don't use them unless you know what you're doing :-)

Fork may be more efficient on Unix, but the IPC story is much simpler for ithreads. (This may have been mitigated by CPAN modules since I last looked :-)

They're still better than Python's threads.

There might, one day, be something much better in Perl 6.

