I want to check whether each value in my DataFrame is greater than 1.5 times the median of all previous values (or of the last 10 previous values) and, if so, replace it with that median. I have a huge dataset, so I don't want to use loops.
df
Out[315]:
a
0 15.0
1 16.0
2 13.5
3 14.6
4 15.0
5 26.0
6 12.0
7 28.0
8 12.0
9 29.0
I want the 26 to be replaced by the median of the previous values, and so on. Once a value is replaced, I want the new value to be used the next time the median is calculated. What I actually want is to compare each value to 1.5 * the median of the previous 10 values and, if it is greater, replace it with the median of the previous 10 values, with the replaced value used the next time the median is calculated. For simplicity, the attempt below uses a condition of > 20 and the mean of the past 2 values. Here is what I have tried:
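import numpy as np
df["b"] = df["a"]
# Replace values above 20 with the mean of the 2 previous rows; shift() excludes the current row, matching the output below
df["b"] = np.where(df["b"] > 20, df["b"].shift().rolling(2).mean(), df["b"])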
df
Out[88]:
a b
0 11.0 11.0
1 16.0 16.0
2 13.5 13.5
3 14.6 14.6
4 15.0 15.0
5 26.0 14.8
6 12.0 12.0
7 28.0 19.0
8 12.0 12.0
9 29.0 20.0
Here the replaced values are not used when the next mean (or median, in my real case) is calculated. For example, the last value in df["b"] is 20, which is the mean of 28 and 12. But I want it to be the mean of 19 and 12, because 19 is the replaced value.
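Because each replacement feeds into the next median, the computation looks sequential by nature. The intended behaviour for the real condition (1.5 * median of up to 10 previous, already cleaned values) can be sketched as an explicit loop over a NumPy array. This is only a reference for the expected output, not a vectorized solution. The helper name `replace_spikes` and its parameters `window` and `factor` are made up for illustration.

```python
import numpy as np
import pandas as pd


def replace_spikes(values, window=10, factor=1.5):
    """Hypothetical helper: replace a value with the median of the previous
    `window` cleaned values when it exceeds `factor` times that median."""
    out = np.asarray(values, dtype=float).copy()
    for i in range(1, len(out)):
        # Up to `window` previous values, including earlier replacements
        prev = out[max(0, i - window):i]
        med = np.median(prev)
        if out[i] > factor * med:
            out[i] = med  # the replaced value is used in later medians
    return out


# Sample data from the first table in the question
df = pd.DataFrame({"a": [15.0, 16.0, 13.5, 14.6, 15.0, 26.0, 12.0, 28.0, 12.0, 29.0]})
df["b"] = replace_spikes(df["a"].to_numpy())
print(df)
```

With this sample data, 26, 28 and 29 each become 15.0, because the median of the cleaned previous values is 15.0 at each of those rows. The first row is left unchanged since it has no previous values.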
question from:
https://stackoverflow.com/questions/65713692/how-to-check-if-a-value-in-dataframe-satisfies-a-condition-based-on-all-or-last