Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - How to read in a csv and create new columns in the resulting dataframe based on results of presence/absence of strings in some of the columns?

I have a function that takes two inputs: the year and an identifier associated with a specific CSV file. Inside a try/except block, it does the following:

  1. Reads in the csv.
  2. Creates a new column based on the input of the function.
  3. Uses several list comprehensions to check if certain variables appear in column names.
  4. Assigns new columns based on the output of these 3 list comprehensions.

For some reason, everything works until I assign the new columns. Once I add that step, the function returns None instead of a DataFrame. I am at a loss as to why this is happening and how to fix it.

import os
import pandas as pd

def read_csv_create_cols(id, year=2020):
    subs_2 = ["_rate"]

    # path1 and path2 are base path components defined elsewhere
    path = os.path.join(path1, path2, str(id), str(year))

    try:
        new_df = pd.read_csv(path, low_memory=False)
        new_df["id"] = id

        # Each comprehension returns a list of matching column names
        col_name_1 = [col for col in new_df.columns if "product" in col]
        col_name_2 = [col for col in new_df.columns if "date" in col]
        col_name_3 = [col for col in new_df.columns if all(sub in col for sub in subs_2)]

        new_df["col_name_1"] = new_df[col_name_1]
        new_df["col_name_2"] = new_df[col_name_2]
        new_df["col_name_3"] = new_df[col_name_3]

    except:
        return None

Question from: https://stackoverflow.com/questions/65837791/how-to-read-in-a-csv-and-create-new-columns-in-the-resulting-dataframe-based-on

1 Reply

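All of your col_name variables are generated by list comprehensions, so each one is a list rather than a single column name. Indexing new_df with a list returns a DataFrame instead of one column, so take the column name out of the list before using it to assign the new column. Note that pop() raises an IndexError if the list is empty, that is, if no column matched. Try the following:

new_df["col_name_1"] = new_df[col_name_1.pop()]
new_df["col_name_2"] = new_df[col_name_2.pop()]
new_df["col_name_3"] = new_df[col_name_3.pop()]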
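
Putting this together, a minimal sketch of the whole function might look like the following. It rests on a few assumptions that are not in the question: each filter should pick the first matching column, the full file path is passed in as a hypothetical csv_path argument instead of being built from path1 and path2, and first_match is a hypothetical helper. It also adds return new_df, because the function as posted has no return statement on the success path and so returns None even when nothing fails, and it catches only read errors so that other problems are not silently swallowed.

```python
import os
import pandas as pd


def first_match(columns, predicate):
    """Return the first column name satisfying predicate, or None if none does."""
    return next((col for col in columns if predicate(col)), None)


def read_csv_create_cols(csv_path, id, substrings=("_rate",)):
    # csv_path is a hypothetical full path; the question builds it from path1/path2.
    try:
        new_df = pd.read_csv(csv_path, low_memory=False)
    except (FileNotFoundError, pd.errors.ParserError) as exc:
        # Report the failure instead of hiding it behind a bare except.
        print(f"Could not read {csv_path}: {exc}")
        return None

    new_df["id"] = id

    # Resolve each filter to a single column name (a string, not a list).
    matches = {
        "col_name_1": first_match(new_df.columns, lambda c: "product" in c),
        "col_name_2": first_match(new_df.columns, lambda c: "date" in c),
        "col_name_3": first_match(
            new_df.columns, lambda c: all(sub in c for sub in substrings)
        ),
    }

    # Copy each matched column; skip filters that matched nothing.
    for new_name, source in matches.items():
        if source is not None:
            new_df[new_name] = new_df[source]

    # Without this line the function returns None even on success.
    return new_df
```

Calling read_csv_create_cols("some_file.csv", "abc") would then return the DataFrame with the extra columns, or None only when the file cannot be read.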
