Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


python - Creating and assigning different variables using a for loop

So what I'm trying to do is the following:

I have 300+ CSVs in a certain folder. What I want to do is open each CSV and take only the first row of each.

What I wanted to do was the following:

import os

list_of_csvs = os.listdir()  # puts the names of all files in the current folder (the CSVs) into a list

The above generates a list for me like ['file1.csv','file2.csv','file3.csv'].

This is great and all, but where I get stuck is the next step. I'll demonstrate this using pseudo-code:

import pandas as pd

for index,file in enumerate(list_of_csvs):
    df{index} = pd.read_csv(file)    

Basically, I want my for loop to iterate over my list_of_csvs object, and read the first item to df1, 2nd to df2, etc. But upon trying to do this I just realized - I have no idea how to change the variable being assigned when doing the assigning via an iteration!!!

That's what prompts my question. I managed to find another way to get my original job done, no problem, but assigning to a different variable on each iteration is something I haven't been able to find clear answers on!


1 Reply


If I understand your requirement correctly, we can do this quite simply. Let's use pathlib instead of os; it was added in Python 3.4.

from pathlib import Path
import pandas as pd

csvs = Path.cwd().glob('*.csv')  # returns a generator of matching paths
# replace Path.cwd() with Path(your_path) if the script is in a different location

dfs = {}  # holds the DataFrames, keyed by file name

for file in csvs:
    dfs[file.stem] = pd.read_csv(file, nrows=1)  # nrows = number of data rows to read; change to your spec

# or with a dict comprehension
dfs = {file.stem: pd.read_csv(file, nrows=1) for file in Path('locationofyourfiles').glob('*.csv')}

This returns a dictionary of DataFrames. Each key is the CSV file name; .stem gives the name without the extension.

Something like:

{
'csv_1' : dataframe,
'csv_2' : dataframe
} 
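
If you'd rather keep the df1, df2, ... naming from the question, the keys can be built from the loop index instead of the file stem. A rough sketch, assuming pandas is installed; the two in-memory CSVs and the names sources and frames are made up for illustration and stand in for your real files:

```python
import io
import pandas as pd

# Hypothetical stand-ins for two CSV files on disk.
sources = [
    io.StringIO("a,b\n1,2\n3,4\n"),
    io.StringIO("a,b\n5,6\n7,8\n"),
]

# One dictionary entry per file instead of one variable per file.
frames = {}
for index, source in enumerate(sources, start=1):
    frames[f"df{index}"] = pd.read_csv(source, nrows=1)  # first data row only

print(sorted(frames))   # the generated names
print(frames["df2"])    # look a frame up by its name
```

Looking things up with frames["df2"] does the job that a separate df2 variable would have done, and the whole collection can still be looped over.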

If you want to concatenate these, do:

df = pd.concat(dfs)

The outer level of the resulting index will be the CSV file name.
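
To see what the concatenated index looks like, here is a small sketch, again assuming pandas is installed. The two hand-built frames stand in for the per-file DataFrames, and the level names "source" and "row" are my own choice, not anything pandas requires:

```python
import pandas as pd

# Hypothetical frames standing in for the per-file DataFrames in dfs.
dfs = {
    "csv_1": pd.DataFrame({"a": [1], "b": [2]}),
    "csv_2": pd.DataFrame({"a": [5], "b": [6]}),
}

# Passing a dict to concat uses its keys as the outer index level.
df = pd.concat(dfs)
print(df.loc["csv_2"])  # rows that came from csv_2

# names= labels the index levels; reset_index moves one into a column.
flat = pd.concat(dfs, names=["source", "row"]).reset_index(level="source")
print(flat)
```

The second form is handy if you'd rather have the file name as an ordinary column than as part of the index.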

