arrays - Matlab - bsxfun no longer faster than repmat?

I'm trying to find the fastest way of standardizing a matrix in Matlab (zero mean, unit variance columns). It all comes down to which is the quickest way of applying the same operation to every row of a matrix. Every post I've read comes to the same conclusion: use bsxfun instead of repmat. This article, written by MathWorks, is an example: http://blogs.mathworks.com/loren/2008/08/04/comparing-repmat-and-bsxfun-performance/
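
For reference, the full standardization I'm after looks something like this (just a sketch; the variable names are mine, and the benchmark below only times the mean subtraction):

mu    = mean(A);
sigma = std(A);

% bsxfun version: expand mu and sigma along the rows of A
Z1 = bsxfun(@rdivide, bsxfun(@minus, A, mu), sigma);

% repmat version: replicate mu and sigma explicitly to the size of A
Z2 = (A - repmat(mu, size(A,1), 1)) ./ repmat(sigma, size(A,1), 1);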

However, when I try this on my own computer, repmat is always quicker. Here are my results using the same code as in the article:

m = 1e5;
n = 100;
A = rand(m,n);

frepmat = @() A - repmat(mean(A),size(A,1),1);
timeit(frepmat)

fbsxfun = @() bsxfun(@minus,A,mean(A));
timeit(fbsxfun)

Results:

ans =

    0.0349


ans =

    0.0391

In fact, I can never get bsxfun to perform better than repmat in this situation no matter how small or large the input matrix is.

Can someone explain this?


1 Reply


Most of the advice you're reading, including the blog post from Loren, likely refers to old versions of MATLAB, in which bsxfun was quite a bit faster than repmat. In R2013b (see the "Performance" section of the release notes), repmat was reimplemented to give large performance improvements when applied to numeric, char and logical arguments. In recent versions it can be about the same speed as bsxfun.

For what it's worth, on my machine with R2014a I get

m = 1e5;
n = 100;
A = rand(m,n);

frepmat = @() A - repmat(mean(A),size(A,1),1);
timeit(frepmat)

fbsxfun = @() bsxfun(@minus,A,mean(A));
timeit(fbsxfun)

ans =
      0.03756
ans =
     0.034831

so it looks like bsxfun is still a tiny bit faster, but not by much, and on your machine it seems the reverse is the case. Of course, these results are likely to vary again if you change the size of A or the operation you're applying.
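
If you want to check this on your own installation, a quick sweep over a few row counts is easy to set up with timeit (a sketch; the particular sizes are arbitrary):

n = 100;
for m = [1e3 1e4 1e5 1e6]              % arbitrary row counts to try
    A = rand(m,n);
    t_rep = timeit(@() A - repmat(mean(A),size(A,1),1));
    t_bsx = timeit(@() bsxfun(@minus,A,mean(A)));
    fprintf('m = %7d: repmat %.5f s, bsxfun %.5f s\n', m, t_rep, t_bsx);
end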

There may still be other reasons to prefer one solution over the other, such as elegance (I prefer bsxfun, if possible).


Edit: commenters have asked for a specific reason to prefer bsxfun, suggesting that it might use less memory than repmat by avoiding a temporary copy that repmat has to make.

I don't think this is actually the case. For example, open Task Manager (or the equivalent on Linux/Mac), watch the memory levels, and type:

>> m = 1e5; n = 8e3; A = rand(m,n);
>> B = A - repmat(mean(A),size(A,1),1);
>> clear B
>> C = bsxfun(@minus,A,mean(A));
>> clear C

(Adjust m and n until the jumps are visible in the graph, but not so big that you run out of memory.)

I see exactly the same behaviour from both repmat and bsxfun, which is that memory rises smoothly to the new level (basically double the size of A) with no temporary additional peak.

This is also the case even if the operation is done in-place. Again, watch the memory and type:

>> m = 1e5; n = 8e3; A = rand(m,n);
>> A = A - repmat(mean(A),size(A,1),1);
>> clear all
>> m = 1e5; n = 8e3; A = rand(m,n);
>> A = bsxfun(@minus,A,mean(A));

Again, I see exactly the same behaviour from both repmat and bsxfun, which is that memory rises to a peak (basically double the size of A), and then falls back to the previous level.
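
If you'd rather not eyeball Task Manager, on Windows you can also query MATLAB's own footprint with the memory function before and after each step (a sketch; memory is Windows-only, and since it only takes snapshots it shows the steady-state increase rather than any transient peak):

m = 1e5; n = 8e3; A = rand(m,n);

u0 = memory;                                   % baseline
B  = A - repmat(mean(A),size(A,1),1);
u1 = memory;
fprintf('repmat: +%.0f MB\n', (u1.MemUsedMATLAB - u0.MemUsedMATLAB)/2^20);
clear B

u0 = memory;                                   % fresh baseline after clearing B
C  = bsxfun(@minus,A,mean(A));
u1 = memory;
fprintf('bsxfun: +%.0f MB\n', (u1.MemUsedMATLAB - u0.MemUsedMATLAB)/2^20);
clear C

For spotting the transient peak in the in-place case you really do need an external monitor, which is why I used Task Manager above.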

So I'm afraid I can't see much technical difference in terms of either speed or memory between repmat and bsxfun. My preference for bsxfun is really just a personal preference as it feels a bit more elegant.

