
pritamd/SoccerTweetAnalysis: Soccer Tweet Analysis using SparkSQL on a Jupyter n ...


Open-source project name:

pritamd/SoccerTweetAnalysis

Repository URL:

https://github.com/pritamd/SoccerTweetAnalysis

Programming language:

Jupyter Notebook 100.0%

Project description:

SoccerTweetAnalysis

This is an assignment from the "Big Data Integration and Processing" course on Coursera, the third course in the Big Data Specialization.

As the Sports Analyst, you are interested in reporting which countries are most popular on Twitter. A good way to approach this problem is to find which countries are mentioned most often in the tweets in your dataset and to analyze which words are used most in those tweets.

In addition to the CSV file you just exported from MongoDB, we give you a small dataset with the codes and names of some countries. To see this additional dataset, open the following file:

Downloads/big-data-3/final-project/country-list.csv

To get you started, we have prepared a Jupyter notebook template and started a SparkSQL context for you. Please open the notebook in:

Downloads/big-data-3/final-project/SoccerTweetAnalysis.ipynb

You will use this notebook to answer the questions below. So let's get started.

Question 1: As a Sports Analyst, you are interested in how many different countries are mentioned in the tweets. Use Spark to calculate this number. Note that regardless of how many times a single country is mentioned, it contributes only 1 to the total.

Question 2: Next, compute the total number of times any country is mentioned. This is different from the previous question since in this calculation, if a country is mentioned three times, then it contributes 3 to the total.

Question 3: Your next task is to determine the most popular countries. You can do this by finding the three countries mentioned the most.

Question 4: After exploring the dataset, you are now interested in how many times specific countries are mentioned. For example, how many times was France mentioned?

Question 5: Which country has the most mentions: Kenya, Wales, or Netherlands?

Question 6: Finally, what is the average number of times a country is mentioned?
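The six questions above all reduce to one table of per-country mention counts. The following is a minimal plain-Python sketch of that logic, not the Spark notebook solution: the sample tweets, the country list, and the helper name count_mentions are hypothetical, and it assumes a mention is a case-sensitive whole-word match of a single-word country name. For Question 6 it also assumes the average is taken over countries mentioned at least once.

```python
from collections import Counter

# Hypothetical sample data; the real inputs are the tweet CSV exported
# from MongoDB and country-list.csv.
countries = ["France", "Kenya", "Wales", "Netherlands", "Spain"]
tweets = [
    "France beat Wales tonight",
    "Great match, France!",
    "Netherlands and France draw",
    "Kenya Kenya Kenya",
    "Wales lost to France",
    "No country mentioned here",
]


def count_mentions(tweets, countries):
    """Count case-sensitive whole-word occurrences of each country name."""
    wanted = set(countries)
    counts = Counter()
    for text in tweets:
        for word in text.split():
            # Drop surrounding punctuation so "France!" matches "France".
            token = word.strip(".,!?;:\"'()")
            if token in wanted:
                counts[token] += 1
    return counts


mentions = count_mentions(tweets, countries)

# Question 1: each mentioned country contributes 1.
distinct_countries = len(mentions)
# Question 2: each mention contributes 1.
total_mentions = sum(mentions.values())
# Question 3: the three most-mentioned countries.
top_three = mentions.most_common(3)
# Question 4: mentions of one specific country.
france_mentions = mentions["France"]
# Question 5: the most-mentioned of three candidates.
best_of_three = max(["Kenya", "Wales", "Netherlands"], key=lambda c: mentions[c])
# Question 6: average over countries mentioned at least once (assumption).
average_mentions = total_mentions / distinct_countries
```

In the notebook itself the same quantities would come from a per-country count aggregated in Spark; the sketch only shows how Questions 1 and 2 differ (distinct keys versus summed counts) and how the remaining answers follow from the same table.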



