Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Create dictionary from CSV where column names are keys

I'm trying to read a CSV file (actually a TSV, but never mind) into a dictionary whose keys are the column names of the file and whose values are lists holding the remaining rows' entries for each column. The file also has some comment lines marked by the '#' character, which I intend to ignore:

csv_in.csv

##Some comments
##Can ignore these lines
Location   Form                  Range         <-- This would be the header
North      Dodecahedron          Limited       <---|
East       Toroidal polyhedron   Flexible      <------ These lines would be lists
South      Icosidodecahedron     Limited       <---| 

The main idea is to store them like this:

final_dict = {'Location': ['North','East','South'], 
'Form': ['Dodecahedron','Toroidal polyhedron','Icosidodecahedron'],
'Range': ['Limited','Flexible','Limited']}

So far I could come close like so:

tryercode.py

import csv
dct = {}
tsvfile = 'csv_in.csv'  # path to the input file shown above

# Open csv file
with open(tsvfile) as file_in:
    # Open reader instance with tab delimiter
    reader = csv.reader(file_in, delimiter='\t')
    # Iterate through rows
    for row in reader:
        # First I skip those rows that start with '#'
        if row[0].startswith('#'):
            pass
        elif row[0].startswith('L'):
            # Here I try to keep the first row that starts with the letter 'L' in a separate list
            # and insert this first row's values as keys with empty lists inside
            dictkeys_list = []
            for i in range(len(row)):
                dictkeys_list.append(row[i])
                dct[row[i]] = []
        else:
            # Insert each row's values into the lists, one per column
            print('??')

So far, the dictionary's skeleton looks fine:

print(dct)
{'Location': [], 'Form': [], 'Range': []}

But everything I have tried so far fails to append the values to the keys' empty lists as intended. I could only do so for the first row:

        (...)
    else:
        # Insert each row indexes as values by the quantity of rows
        print('??')
        for j in range(len(row)):
            dct[dictkeys_list[j]] = row[j]   # Here I select the intended key of the dict through the previously built list of key names

I searched Stack Overflow far and wide but couldn't find an answer for this particular case. (The code template is inspired by an answer to a related post, but the dictionary there has a different structure.)

question from:https://stackoverflow.com/questions/65932657/create-dictionary-from-csv-where-column-names-are-keys


1 Reply

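I amended your code slightly and ran it; with these changes it produces the right result. The main fix is to append each value to the list for its column instead of assigning over it.
The code is below: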

import csv
dct = {}

# Open csv file
tsvfile = "./tsv.csv"  # This is the tsv file path
with open(tsvfile) as file_in:
    # Open reader instance with tab delimiter
    reader = csv.reader(file_in, delimiter='\t')
    # Iterate through rows
    for row in reader:
        # Skip rows that start with '#'
        if row[0].startswith('#'):
            pass
        elif row[0].startswith('L'):
            # Header row (starts with 'L'): keep the column names in a separate
            # list and insert them as keys, each with an empty list
            dictkeys_list = []
            for i in range(len(row)):
                dictkeys_list.append(row[i])
                dct[row[i]] = []
        else:
            # Data row: append each value to the list of its column
            for i in range(len(row)):
                dct[dictkeys_list[i]].append(row[i])
print(dct)

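Running it gives the intended result. I also amended the code further, as below. I think this version can deal with more complicated input: empty rows, whitespace before '#' or around the values, and a header that does not start with 'L' (the first row that is not a comment is treated as the header):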

import csv
dct = {}

# Open csv file
tsvfile = "./tsv.csv"  # This is the tsv file path
is_head = True    # True until the header row has been read
with open(tsvfile) as file_in:
    # Open reader instance with tab delimiter
    reader = csv.reader(file_in, delimiter='\t')
    # Iterate through rows
    for row in reader:
        # Skip empty rows and rows that start with '#'
        # Use strip() to remove whitespace around each item
        if not row or row[0].strip().startswith('#'):
            pass
        elif is_head:
            # The first remaining row is the header: keep the column names in a
            # separate list and insert them as keys, each with an empty list
            is_head = False
            dictkeys_list = []
            for i in range(len(row)):
                item = row[i].strip()
                dictkeys_list.append(item)
                dct[item] = []
        else:
            # Data row: append each stripped value to the list of its column
            for i in range(len(row)):
                dct[dictkeys_list[i]].append(row[i].strip())
print(dct)
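
The same idea can also be written more compactly. The following is only a sketch under some assumptions: the file is tab-separated, comment lines have '#' as their first non-space character, and no field contains a quoted line break (the lines are filtered before the csv module parses them, which would split such a field). The function name `columns_from_tsv` and its parameters are hypothetical names chosen for illustration, and the in-memory sample stands in for csv_in.csv.

```python
import csv
import io


def columns_from_tsv(lines, delimiter="\t", comment="#"):
    """Return {column name: [values, ...]} from an iterable of text lines.

    Blank lines and lines whose first non-space character is `comment`
    are skipped; the first remaining line is treated as the header.
    """
    # Drop comments and blank lines before the csv module parses anything.
    kept = (
        line for line in lines
        if line.strip() and not line.lstrip().startswith(comment)
    )
    reader = csv.reader(kept, delimiter=delimiter)
    header = [name.strip() for name in next(reader)]
    columns = {name: [] for name in header}
    for row in reader:
        # zip() pairs each column name with the value in the same position;
        # extra values beyond the header length are ignored.
        for name, value in zip(header, row):
            columns[name].append(value.strip())
    return columns


# Small in-memory sample mirroring the file from the question (tab-separated).
sample = io.StringIO(
    "##Some comments\n"
    "##Can ignore these lines\n"
    "Location\tForm\tRange\n"
    "North\tDodecahedron\tLimited\n"
    "East\tToroidal polyhedron\tFlexible\n"
    "South\tIcosidodecahedron\tLimited\n"
)
final_dict = columns_from_tsv(sample)
print(final_dict)
```

To use it on a real file, pass the open file object instead of the sample, for example `with open("./tsv.csv", newline="") as file_in: dct = columns_from_tsv(file_in)`. An input with no header line at all would raise StopIteration at `next(reader)`, so that case would need its own handling.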

