Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share

python - How to create separate dataframes with groupby time

I have a dataset with data collected over 34 days at 15-minute intervals.

How would I fetch all the data from the same time of day? I have already loaded the dataset and converted it to datetime format.

I have gotten the following piece of code to work:

tmp = weather_sensor_df()
rows = []
for i in tmp.index:
    time = tmp.DATE_TIME[i]
    # keep only the readings taken at exactly 13:00
    if time.hour == 13 and time.minute == 0:
        rows.append({
            tmp.columns[0]: time,
            tmp.columns[1]: tmp.AMBIENT_TEMPERATURE[i],
            tmp.columns[2]: tmp.MODULE_TEMPERATURE[i],
            tmp.columns[3]: tmp.IRRADIATION[i],
        })
# build the result once; DataFrame.append was removed in pandas 2.0
df = pd.DataFrame(rows, columns=tmp.columns)
print(df)

Reference: weather_sensor_df() loads the weather sensor dataframe and converts DATE_TIME to Timestamp format using pd.to_datetime().

I think that the groupby() function would be better suited for this situation but I am not sure how to proceed.

DATE_TIME,PLANT_ID,SOURCE_KEY,AMBIENT_TEMPERATURE,MODULE_TEMPERATURE,IRRADIATION
2020-05-15 00:00:00,4135001,HmiyD2TTLFNqkNe,25.184316133333333,22.8575074,0.0
2020-05-15 00:15:00,4135001,HmiyD2TTLFNqkNe,25.08458866666667,22.761667866666663,0.0
2020-05-15 00:30:00,4135001,HmiyD2TTLFNqkNe,24.935752600000004,22.59230553333333,0.0
2020-05-15 00:45:00,4135001,HmiyD2TTLFNqkNe,24.8461304,22.36085213333333,0.0
2020-05-15 01:00:00,4135001,HmiyD2TTLFNqkNe,24.621525357142858,22.165422642857145,0.0
2020-05-15 01:15:00,4135001,HmiyD2TTLFNqkNe,24.5360922,21.968570866666667,0.0
2020-05-15 01:30:00,4135001,HmiyD2TTLFNqkNe,24.638673866666664,22.352925666666668,0.0
2020-05-15 01:45:00,4135001,HmiyD2TTLFNqkNe,24.87302233333333,23.1609192,0.0
2020-05-15 02:00:00,4135001,HmiyD2TTLFNqkNe,24.936930466666663,23.026113,0.0
2020-05-15 02:15:00,4135001,HmiyD2TTLFNqkNe,25.0122476,23.343229266666665,0.0
2020-06-17 21:30:00,4135001,HmiyD2TTLFNqkNe,22.9965616,21.869773466666665,0.0
2020-06-17 21:45:00,4135001,HmiyD2TTLFNqkNe,23.137091,22.1259848,0.0
2020-06-17 22:00:00,4135001,HmiyD2TTLFNqkNe,22.563179466666668,21.164713466666665,0.0
2020-06-17 22:15:00,4135001,HmiyD2TTLFNqkNe,22.19922893333333,20.51527293333333,0.0
2020-06-17 22:30:00,4135001,HmiyD2TTLFNqkNe,22.171736666666664,21.0808288,0.0
2020-06-17 22:45:00,4135001,HmiyD2TTLFNqkNe,22.150569666666662,21.480377266666668,0.0
2020-06-17 23:00:00,4135001,HmiyD2TTLFNqkNe,22.129815666666666,21.38902386666667,0.0
2020-06-17 23:15:00,4135001,HmiyD2TTLFNqkNe,22.008274642857145,20.709211357142856,0.0
2020-06-17 23:30:00,4135001,HmiyD2TTLFNqkNe,21.96949473333333,20.7349628,0.0
2020-06-17 23:45:00,4135001,HmiyD2TTLFNqkNe,21.909287666666668,20.4279724,0.0
Question from: https://stackoverflow.com/questions/65861179/how-to-create-separate-dataframes-with-groupby-time


1 Reply

  • Use pandas.DataFrame.groupby on .dt.time to group the rows by time of day.
    • .dt.hour can be used instead if you want to group on the hour.
  • No aggregation function has been applied, so dfg is a DataFrameGroupBy object.
  • From the GroupBy object, a dict of dataframes can be created, with the time in isoformat ('hh:mm:ss') as the keys.
    • If .dt.hour is used for the grouping, remove .isoformat(), and the keys will be ints (0...23).
import pandas as pd

# load the data
tmp = pd.read_csv('./data/Plant_1_Weather_Sensor_Data.csv')

# set the column as a datetime dtype
tmp.DATE_TIME = pd.to_datetime(tmp.DATE_TIME)

# groupby time
dfg = tmp.groupby(tmp.DATE_TIME.dt.time)

# create a dict of dataframes, where the key is an isoformat datetime.time
df_times = {g.isoformat(): data for g, data in dfg}

# display(df_times['00:15:00'].head())
              DATE_TIME  PLANT_ID       SOURCE_KEY  AMBIENT_TEMPERATURE  MODULE_TEMPERATURE  IRRADIATION
1   2020-05-15 00:15:00   4135001  HmiyD2TTLFNqkNe            25.084589           22.761668          0.0
182 2020-05-17 00:15:00   4135001  HmiyD2TTLFNqkNe            24.011531           21.648279          0.0
278 2020-05-18 00:15:00   4135001  HmiyD2TTLFNqkNe            21.041437           20.475962          0.0
374 2020-05-19 00:15:00   4135001  HmiyD2TTLFNqkNe            22.548998           20.529877          0.0
467 2020-05-20 00:15:00   4135001  HmiyD2TTLFNqkNe            22.255206           20.110174          0.0

# iterate through the dict of dataframes like a normal dict
for k, v in df_times.items():
    print(k)
    print(v.head())    
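
The .dt.hour variant mentioned in the notes, and the lookup of a single time of day such as the 13:00 readings from the question, could look roughly like the sketch below. It does not read the real CSV: the frame demo, its made-up temperature values, and the names by_hour, by_time and at_1300 are assumptions for illustration only.

```python
import pandas as pd

# Synthetic stand-in for the sensor data: 3 days at 15-minute intervals.
# The values are invented; only the DATE_TIME layout mirrors the question.
idx = pd.date_range("2020-05-15", periods=3 * 96, freq="15min")
demo = pd.DataFrame({"DATE_TIME": idx, "AMBIENT_TEMPERATURE": range(len(idx))})

# Group on the hour: the keys are ints (0...23) and each dataframe
# holds every reading taken within that hour, across all days.
by_hour = {hour: data for hour, data in demo.groupby(demo.DATE_TIME.dt.hour)}

# Group on the exact time of day: the keys are 'hh:mm:ss' strings.
by_time = {t.isoformat(): data for t, data in demo.groupby(demo.DATE_TIME.dt.time)}

# All readings taken at exactly 13:00, one row per day.
at_1300 = by_time["13:00:00"]
print(at_1300)
```

With the hour grouping, by_hour[13] contains the 13:00, 13:15, 13:30 and 13:45 readings of every day, whereas by_time['13:00:00'] contains only the rows that the loop in the question collects.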
