Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

python - Sorting lines of text file numerically when the file has strange formatting

I'm having trouble working out how to do this. I have a txt file that stores birthdays like this:

**January birthdays:**
**17** - !@Mark
**4** - !@Jan
**15** - !@Ralph

**February birthdays:**
**27** - !@Steve
**19** - !@Bill
**29** - !@Bob

The list continues for every month, and each month is separated by a blank line. How on Earth do you sort the days sequentially with formatting like this?

For example January should be:

**January birthdays:**
**4** - !@Jan
**15** - !@Ralph 
**17** - !@Mark

What I've brainstormed:

I thought maybe I could potentially use readlines() from specific indexes and then save each line to a list, check the integer somehow, and then re-write the file properly. But this seems so tedious and frankly seems like the totally wrong idea.

I also considered using partial() to read until a stop condition such as the line of the next month and then sort somehow based on that.

Does Python offer any easier way to do something like this?

Question from: https://stackoverflow.com/questions/65908942/sorting-lines-of-text-file-numerically-when-the-file-has-strange-formatting

1 Reply

You can do it as follows.

Code

import re

def order_month(month_of_entries):
    '''
        Order lines for a Month of entries
    '''
    # Sort key based upon number in line
    # First line in Month does not have a number, 
    # so key function returns 0 for it so it stays first
    # (the := operator requires Python 3.8+)
    month_of_entries.sort(key=lambda x: int(p.group(0)) if (p:=re.search(r'\d+', x)) else 0)
            
# Process input file
with open('input.txt', 'r') as file:
    results = []
    months_data = []
    for line in file:
        line = line.rstrip()
        if line:
            months_data.append(line)
        else:
            # blank line
            # Order lines for this month
            order_month(months_data)
            results.append(months_data)
            
            # Setup for next month
            months_data = []
    else:
        # Reached end of file
        # Order lines for last month
        if months_data:
            order_month(months_data)
            results.append(months_data)
               
# Write to output file
with open('output.txt', 'w') as file:
    for i, months_data in enumerate(results):
        # Looping over each month
        for line in months_data:
            file.write(line + '\n')
        # Add blank line if not last month
        if i < len(results) - 1:
            file.write('\n')
         

Output

**January birthdays:**
**4** - !@Jan
**15** - !@Ralph
**17** - !@Mark

**February birthdays:**
**19** - !@Bill
**27** - !@Steve
**29** - !@Bob
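
For a quick check of the sort key on its own, the sketch below applies the same idea to the January lines from the question, held in memory with no files involved. The helper name `day_key` is made up for this illustration; it assumes the day is the first number on each line, as in the sample data.

```python
import re

def day_key(line):
    """Return the first integer found in the line, or 0 if there is none."""
    match = re.search(r'\d+', line)
    return int(match.group(0)) if match else 0

january = [
    '**January birthdays:**',
    '**17** - !@Mark',
    '**4** - !@Jan',
    '**15** - !@Ralph',
]

# The header has no digits, so its key is 0 and it stays on top
sorted_january = sorted(january, key=day_key)
print('\n'.join(sorted_january))
```

Because the header maps to 0 and real days start at 1, the month title never moves below its entries.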

Alternative that also sorts the months, if necessary

import re
from itertools import accumulate
from datetime import date
    
# Must be defined before find_month, whose default pattern uses it
months_of_year = [date(2021, i, 1).strftime('%B') for i in range(1, 13)] # Months of year

def find_day(s, pattern=re.compile(r'\d+')): 
    ' Key within a month: 99 for blank lines (last), day number for entries, 0 for the header (first)'
    return 99 if not s.strip() else int(p.group(0)) if (p:=pattern.search(s)) else 0

def find_month(previous, s, pattern = re.compile(fr"^\*\*({'|'.join(months_of_year)})")):
    ' Index of Month in year (i.e. 0-11); lines that are not a month header keep the previous index'
    return months_of_year.index(p.group(1)) if (p:=pattern.search(s)) else previous

with open('test.txt') as infile:
    lines = infile.readlines()
    
months = list(accumulate(lines, func = find_month, initial = ''))[1:]   # Create Month for each line
days = (find_day(line) for line in lines)                               # Day for each line

# sort lines based upon their month and day
result = (x[-1] for x in sorted(zip(months, days, lines), key = lambda x: x[:2]))
    
with open('output.txt', 'w') as outfile:
    outfile.writelines(result)
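
The less obvious step in the alternative is how `accumulate` tags each line with its month: the function receives the previous result and the current line, so a month header updates the running value and every other line inherits it. Below is a minimal sketch of that carry-forward on in-memory lines. The `carry_month` helper and the shortened `MONTHS` list are assumptions made for this illustration, not part of the answer's code.

```python
from itertools import accumulate

MONTHS = ['January', 'February']  # shortened list, for illustration only

def carry_month(previous, line):
    """Return the month index named by a header line, else keep the previous one."""
    for index, name in enumerate(MONTHS):
        if line.startswith(f'**{name}'):
            return index
    return previous

lines = [
    '**January birthdays:**',
    '**17** - !@Mark',
    '',
    '**February birthdays:**',
    '**27** - !@Steve',
]

# initial=-1 seeds the running value (Python 3.8+); [1:] drops that seed again
month_per_line = list(accumulate(lines, carry_month, initial=-1))[1:]
print(month_per_line)  # [0, 0, 0, 1, 1]
```

Pairing each line with this month index and its day number gives a tuple key that sorts the whole file in one pass, which is what the `zip` and `sorted` call above relies on.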
    
