Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python 3.x - pandas: groupwise normalize

Let's start with this DataFrame:

import pandas as pd

dft = pd.DataFrame({'A': (1, 2, 3, 1, 2, 3),
                    'B': [10 * i for i in range(6)],
                    'C': ("a", "a", "a", "b", "b", "b")}).set_index(["C", "A"])
print(dft)
      B
C A    
a 1   0
  2  10
  3  20
b 1  30
  2  40
  3  50

I need to normalize column B within each group of the first index level "C", dividing every value by the value of B in the row where the second index level "A" == 2.

         B
C A       
a 1  0.000
  2  1.000
  3  2.000
b 1  0.750
  2  1.000
  3  1.250

It looks easy, but I've experimented with groupby, transform ... and cannot get it to work.



1 Reply


The idea is to first select only the rows where the second level is 2 with Series.xs, then divide B by those values after mapping them onto the first level with Index.map:

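# B values from the rows where the second level (A) is 2, indexed by C only
s = dft['B'].xs(2, level=1)

# Map each row's C label to its reference value, then divide
dft['B'] = dft['B'].div(dft.index.droplevel(-1).map(s))
print(dft)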
        B
C A      
a 1  0.00
  2  1.00
  3  2.00
b 1  0.75
  2  1.00
  3  1.25
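
To make the intermediate steps visible, here is a minimal self-contained sketch of the same Series.xs plus Index.map approach. The names `group_labels`, `denominator` and `normalized` are introduced only for illustration, and the levels are addressed by name ('C', 'A') instead of by position, which is an assumption about readability rather than a requirement:

```python
import pandas as pd

dft = pd.DataFrame({'A': (1, 2, 3, 1, 2, 3),
                    'B': [10 * i for i in range(6)],
                    'C': ("a", "a", "a", "b", "b", "b")}).set_index(["C", "A"])

# Reference value per group: B where A == 2. The A level is dropped,
# so the result is indexed by C only (a -> 10, b -> 40).
s = dft['B'].xs(2, level='A')

# The C label of every row, in row order.
group_labels = dft.index.get_level_values('C')

# Look up each row's reference value; convert to a plain array so the
# division below is purely positional.
denominator = group_labels.map(s).to_numpy()

# Element-wise division; the result keeps the original MultiIndex.
normalized = dft['B'].div(denominator)
print(normalized)
```

Writing the result to a new variable instead of back into dft['B'] keeps the original values available, which may be convenient when comparing against the second approach.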

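Another idea is to use Series.where to replace the B values with NaN wherever the A level is not 2, and then divide by the result of GroupBy.transform with GroupBy.first, which broadcasts each group's first non-NaN value to all rows of that group: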

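# Keep B only where the second level (A) is 2; all other rows become NaN
s1 = dft['B'].where(dft.index.get_level_values(1) == 2)
# 'first' skips NaN, so every row receives its group's reference value
dft['B'] = dft['B'].div(s1.groupby(level=0).transform('first'))
print(dft)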
        B
C A      
a 1  0.00
  2  1.00
  3  2.00
b 1  0.75
  2  1.00
  3  1.25
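
The Series.where plus GroupBy.transform variant can be sketched step by step in the same way, starting again from the original DataFrame. The names `is_ref`, `ref` and `normalized` are hypothetical helpers for this illustration only:

```python
import pandas as pd

dft = pd.DataFrame({'A': (1, 2, 3, 1, 2, 3),
                    'B': [10 * i for i in range(6)],
                    'C': ("a", "a", "a", "b", "b", "b")}).set_index(["C", "A"])

# Boolean mask marking the rows where the A level equals 2.
is_ref = dft.index.get_level_values('A') == 2

# B is kept on the marked rows and replaced by NaN everywhere else.
s1 = dft['B'].where(is_ref)

# Within each C group, 'first' picks the first non-NaN value and
# transform broadcasts it back to every row of that group.
ref = s1.groupby(level='C').transform('first')

normalized = dft['B'].div(ref)
print(normalized)
```

As far as I can tell, both approaches should give NaN for a group that has no row with A == 2, since there is no reference value to divide by, so it may be worth checking for that case on real data.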
