Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


list - Sentiment Analysis using NLTK and BeautifulSoup

I'm working on a personal project in which I want to use NLTK and VADER for sentiment analysis to compare presidential speeches.

I used Beautiful Soup to scrape one of George Washington's speeches and managed to put it into a list, but I'm not sure of the best way to proceed. Most examples seem to read the text from a text file, but my text is wrapped in the list's brackets, which makes that difficult. Should I store the scraped speech in a file, work directly from the list, or put the speech into a DataFrame right away?

from bs4 import BeautifulSoup
import requests
import spacy
import pandas as pd

page_link = 'https://www.ourdocuments.gov/doc.php?flash=false&doc=11&page=transcript'
page_response = requests.get(page_link, timeout=5)
page_content = BeautifulSoup(page_response.content, "html.parser")

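# Collect the text of the first seven <p> elements into a list.
paragraph_tags = page_content.find_all("p")
textContent = []
for i in range(0, 7):
    textContent.append(paragraph_tags[i].text)

# str() of a list writes its repr, so the file ends up containing the brackets and quotes.
with open('washington.txt', 'w') as toWrite:
    toWrite.write(str(textContent))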

Any help or pointers would be greatly appreciated.



1 Reply
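
One possible way forward, offered as a rough sketch rather than the definitive approach: keep the paragraphs as a list, write them to the file joined by newlines (so no brackets end up in it), and score each paragraph with VADER. The helper names below (`save_paragraphs`, `load_paragraphs`, `build_vader`, `score_paragraphs`) are made up for illustration and are not part of NLTK. The sketch assumes `nltk` is installed and that the `vader_lexicon` resource can be downloaded once.

```python
from pathlib import Path


def save_paragraphs(paragraphs, path):
    """Write one paragraph per line, so the file holds plain text, not a list repr."""
    # Collapse internal whitespace and drop empty paragraphs.
    cleaned = [" ".join(p.split()) for p in paragraphs if p.strip()]
    Path(path).write_text("\n".join(cleaned), encoding="utf-8")
    return cleaned


def load_paragraphs(path):
    """Read the file back into a list of non-empty paragraphs."""
    text = Path(path).read_text(encoding="utf-8")
    return [line for line in text.splitlines() if line.strip()]


def build_vader():
    """Create NLTK's VADER analyzer (imported lazily; assumes nltk is installed)."""
    import nltk
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download
    return SentimentIntensityAnalyzer()


def score_paragraphs(paragraphs, analyzer):
    """Return one dict per paragraph: the text plus the analyzer's scores."""
    rows = []
    for text in paragraphs:
        scores = analyzer.polarity_scores(text)  # neg / neu / pos / compound
        rows.append({"text": text, **scores})
    return rows


# Usage with the list from the question:
#   paragraphs = save_paragraphs(textContent, "washington.txt")
#   rows = score_paragraphs(paragraphs, build_vader())
```

Since `rows` is a list of dicts, `pd.DataFrame(rows)` should give one row per paragraph if you want a DataFrame later, so the file, the list, and the DataFrame do not have to be an either/or choice. Scoring per paragraph instead of the whole speech at once is only one option; it makes it easier to compare speeches of different lengths by averaging the `compound` column.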
