Here's some code to build a pandas DataFrame with two columns, one called Data and the other called hours. The Data column holds random integers from -150 to 249 (the upper bound of 250 passed to np.random.randint is exclusive), and the hours column holds random floats from 0.5 to 15.5.
import random
import numpy as np
import pandas as pd
data = np.random.randint(-150,250,size=200)
df = pd.DataFrame(data, columns=['Data'])
# Generate random floats for the hours column
randomFloatList = []
# Produce one value per row of df
for i in range(len(df)):
    # Any random float between 0.50 and 15.50, rounded to 2 decimals
    x = round(random.uniform(0.50, 15.50), 2)
    randomFloatList.append(x)
df2 = pd.DataFrame(randomFloatList,columns=['hours'])
combined = df.join(df2)
print(combined)
Returns:
   Data  hours
0    93   9.66
1    85  14.76
2   -82  12.55
3   -44   2.40
4    -1  13.86
Can the pandas rank function reorder a DataFrame based on the highest values in one column (Data) and the lowest values in a different column (hours), while keeping the values of each row together? Hopefully this makes sense...
If I use
print(combined.rank(axis='columns'))
This returns something unwanted. I can't quite figure out whether this is possible with pandas rank or not.
   Data  hours
0   2.0    1.0
1   2.0    1.0
2   1.0    2.0
3   1.0    2.0
4   1.0    2.0
Any tips greatly appreciated.
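For reference, here is a minimal sketch of what seems to be going on and of two possible readings of the goal. It uses the five rows printed above instead of the random data. The rank_sum column and the idea of weighting both ranks equally are assumptions made for illustration, not something the question specifies.

```python
import pandas as pd

# Fixed sample: the five rows printed in the question, instead of random data
combined = pd.DataFrame({
    "Data": [93, 85, -82, -44, -1],
    "hours": [9.66, 14.76, 12.55, 2.40, 13.86],
})

# axis='columns' ranks the values inside each row against each other,
# so with two columns every cell can only be 1.0 or 2.0
row_wise = combined.rank(axis="columns")

# The default axis ranks each column separately, across the rows
data_rank = combined["Data"].rank(ascending=False)   # 1.0 = highest Data
hours_rank = combined["hours"].rank(ascending=True)  # 1.0 = lowest hours

# Assumption: both criteria count equally, so sort by the sum of the ranks
# (rank_sum is a hypothetical helper column)
ranked = combined.assign(rank_sum=data_rank + hours_rank).sort_values("rank_sum")

# Alternative reading: Data dominates and hours only breaks ties
sorted_rows = combined.sort_values(["Data", "hours"], ascending=[False, True])

print(ranked)
print(sorted_rows)
```

Both results keep each row intact, since only the row order changes. Which of the two fits depends on whether hours should carry as much weight as Data or only act as a tie-breaker.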