unix - Shell Script for Splitting Sentences into new format in output text file?

Question

posted Oct 6, 2021 in Technique [Technology] by 深蓝 (71.8m points)

I have converted a Bible into a plain text file, which comes out like this:

$$  Genesis 40:1 It came to pass after these things that the butler and the baker of the king of Egypt ..

$$  Genesis 40:2 And Pharaoh was angry with his two officers, the chief butler and the chief baker.

$$  Genesis 40:3 So he put them in custody in the house of the captain of the guard, in the prison, the ..

I would like to run a shell script on the text file that goes through it and outputs a new file that looks like this:

$$ Genesis 40:1

It came to pass after these things that the butler and the baker of the king of Egypt ..

$$ Genesis 40:2

And Pharaoh was angry with his two officers, the chief butler and the chief baker.

$$ Genesis 40:3

So he put them in custody in the house of the captain of the guard, in the prison, the ..

I figure I somehow need to parse the first X characters of each line and then split the line at that point. However, I'm new to shell scripting and can't figure out the best way to process the file to accomplish this.

Any Thoughts?

question from:https://stackoverflow.com/questions/66051527/shell-script-for-splitting-sentences-into-new-format-in-output-text-file

1 Reply

深蓝 · Answer 1 · 2021-10-06T03:10:15+0000

Since you just need to replace the space after the numbers with two newline characters, you can use this command:

sed 's/\([0-9]\) /\1\n\n/' <textfile >newfile

This substitutes the first digit that is followed by a space with that same digit followed by two newlines.
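To see the effect on the sample lines from the question, here is a minimal sketch. The function name split_verses is hypothetical and only used for illustration. The replacement is written with backslash-escaped literal newlines, which should behave like \n\n in GNU sed and also works with sed implementations that do not interpret \n in the replacement.

```shell
#!/bin/sh
# Hypothetical helper: split each line after the first "digit + space".
split_verses() {
    # In the replacement, a backslash followed by a literal newline
    # inserts a newline, so the two continuation lines add two newlines.
    sed 's/\([0-9]\) /\1\
\
/'
}

# Sample input copied from the question.
split_verses <<'EOF'
$$  Genesis 40:2 And Pharaoh was angry with his two officers, the chief butler and the chief baker.
EOF
```

Each input line becomes the reference, a blank line, and the verse text. To process a whole file as in the answer, the same function can be run as split_verses <textfile >newfile.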

Follow-up from the asker: "This worked really well until it got to a line that read '1 John 1:1 something written here'; then it split the line in the wrong spot. How can I account for this?"

To account for lines that have a number and a space before the book name, we can make the match start at a lowercase letter and include everything up to the final digit that is followed by a space:

sed 's/\([a-z].*[0-9]\) /\1\n\n/' <textfile >newfile
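The sketch below tries the revised pattern on the "1 John" line. Because .* is greedy, the split lands after the last digit followed by a space, so a verse whose text itself contains such a digit would be split too late. The second function is an assumed alternative, not part of the original answer: it splits after the first chapter:verse reference instead. Both function names and the second sample line are hypothetical.

```shell
#!/bin/sh
# Pattern from the answer: a lowercase letter, then anything up to the
# last "digit + space" on the line (.* is greedy).
split_greedy() {
    sed 's/\([a-z].*[0-9]\) /\1\
\
/'
}

# Assumed alternative (not from the answer): split after the first
# "chapter:verse" reference followed by a space.
split_at_reference() {
    sed 's/\([0-9][0-9]*:[0-9][0-9]*\) /\1\
\
/'
}

# The line from the follow-up comment.
printf '%s\n' '$$  1 John 1:1 something written here' | split_greedy

# A made-up line whose text contains a number followed by a space.
printf '%s\n' '$$  1 John 1:1 text with 40 days in it' | split_at_reference
```

On the second line, split_greedy would break after "40" rather than after "1:1", which is why anchoring on the reference may be the safer choice if the verse text can contain numerals.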
