Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

pandas - fillna with max value of each group in python

DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({"sym": ["a", "a", "aa", "aa", "aa", "a", "ab", "ab", "ab"],
                   "id_h": [2.1, 2.2, 2.5, 3.1, 2.5, 3.8, 2.5, 5, 6],
                   "pm_h": [np.nan, 2.3, np.nan, 2.8, 2.7, 3.7, 2.4, 4.9, np.nan]})

I want to fill the NaN values in pm_h with the maximum id_h value of each "sym" group, i.e. (a, aa, ab).

Required output:

df1 = pd.DataFrame({"sym": ["a", "a", "aa", "aa", "aa", "a", "ab", "ab", "ab"],
                    "id_h": [2.1, 2.2, 2.5, 3.1, 2.5, 3.8, 2.5, 5, 6],
                    "pm_h": [3.8, 2.3, 3.1, 2.8, 2.7, 3.7, 2.4, 4.9, 6]})
Question from: https://stackoverflow.com/questions/65885577/fillna-with-max-value-of-each-group-in-python


1 Reply

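Use Series.fillna together with GroupBy.transform('max'). transform returns a new Series of per-group maxima with the same index as the original, so fillna can align it row by row and fill only the missing values: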

df['pm_h'] = df['pm_h'].fillna(df.groupby('sym')['id_h'].transform('max'))
print(df)
  sym  id_h  pm_h
0   a   2.1   3.8
1   a   2.2   2.3
2  aa   2.5   3.1
3  aa   3.1   2.8
4  aa   2.5   2.7
5   a   3.8   3.7
6  ab   2.5   2.4
7  ab   5.0   4.9
8  ab   6.0   6.0
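
To see why this works, it may help to look at the intermediate Series on its own. The following is a minimal, self-contained sketch using the data from the question; the names group_max, filled and expected are illustrative only and do not appear in the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "sym": ["a", "a", "aa", "aa", "aa", "a", "ab", "ab", "ab"],
    "id_h": [2.1, 2.2, 2.5, 3.1, 2.5, 3.8, 2.5, 5, 6],
    "pm_h": [np.nan, 2.3, np.nan, 2.8, 2.7, 3.7, 2.4, 4.9, np.nan],
})

# One value per original row: the maximum id_h within that row's sym group.
# (group_max is a hypothetical name used only for this sketch.)
group_max = df.groupby("sym")["id_h"].transform("max")

# fillna aligns on the index, so only rows where pm_h is NaN take the group maximum;
# existing pm_h values are left untouched.
filled = df["pm_h"].fillna(group_max)

# Compare against the pm_h column of the required output from the question.
expected = [3.8, 2.3, 3.1, 2.8, 2.7, 3.7, 2.4, 4.9, 6.0]
assert filled.tolist() == expected
```

Unlike groupby(...).max(), which would return one row per group indexed by sym, transform('max') keeps the original row index, and that is what lets fillna match each NaN to the maximum of its own group.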
