you can use DataFrame.stack and DataFrame.unstack to transform the dataframe, to get the result.
# for every primary_id create a Disease tag num
# e.g. as for primary_id 1: Diabetes-1, Hypertension-2
df2['tag'] = 1
df2['tag'] = df2.groupby('primary_id_id')['tag'].cumsum()
# use stack and unstack to transform the dataframe
df2n = df2.set_index(['primary_id_id','tag'])[['Disease','duration']].stack()
# swap the index level and unstack
df3 = df2n.swaplevel(1).unstack().unstack()
# join to df1 to get result
df_result = df1.join(df3, on='Primary_id', how='left')
# rename columns
df_result.columns = [col if isinstance(col, str) else f'{col[1]}_{col[0]}' for col in df_result.columns]
result:
print(df2n)
primary_id_id tag
1 1 Disease Diabetes
duration 10
2 Disease Hypertension
duration 20
3 1 Disease Cataract
duration 20
dtype: object
print(df3)
tag 1 2
Disease duration Disease duration
primary_id_id
1 Diabetes 10 Hypertension 20
3 Cataract 20 NaN NaN
print(df_result)
Primary_id Age Gender Disease_1 duration_1 Disease_2 duration_2
0 1 45 Male Diabetes 10 Hypertension 20
1 2 32 Female NaN NaN NaN NaN
2 3 35 Male Cataract 20 NaN NaN
与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…