Twitter data analysis - Assignment #2 for Computational Social Media course (Fulbright University Vietnam)

Description

This is a Python program I created to statistically analyze a dataset of 10,000 Tweets (from Twitter) for the Computational Social Media course (CS205) at Fulbright University Vietnam in the Fall term, academic year 2021-2022.

The program uses the pandas, numpy, and math libraries in Python, alongside key concepts such as string manipulation, lists, dictionaries, and loops.

Features

Opens and reads the Tweets data from the 'twitter_covid_fuv_2021.xlsx' file.
Computes the following descriptive statistics over the dataset, as required in the assignment:
- Percentage of tweets that contain URLs.
- Percentage of tweets that are (or contain) retweets.
- Percentage of tweets that contain vaccination hashtags/keywords (%pfizer, %moderna, %astrazeneca, %janssen, %verocell).
- Distribution of languages declared in the tweet metadata (%EN, %FR, ...).
- Table of the 30 most frequent hashtags in the following format: [rank, hashtag, frequency]. Example: [1, #coronavirus, 2500].
- Percentage of tweets directly generated by all the 20 media accounts together. e.g.: 3% of tweets were produced by the 20 media accounts altogether.
- Percentage of tweets directly generated by the 20 NGOs/gov. accounts. e.g.: 5% of tweets were produced by the 20 NGOs/government accounts.
- Percentage of tweets generated by all the 20 media accounts that appear as retweets.
- Percentage of tweets generated by all the 20 NGOs/gov. accounts that appear as retweets.
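A few of the statistics above can be sketched with pandas as follows. This is only an illustration, not the code in QuynhAnh_analyze_tweets.py: the column names (`text`, `lang`, `screen_name`), the sample rows, the `media_accounts` set, and the choice to lowercase hashtags before counting are all assumptions made for the example. With the real data, the DataFrame would presumably come from `pd.read_excel('twitter_covid_fuv_2021.xlsx')` instead.

```python
import re
from collections import Counter

import pandas as pd

# Hypothetical sample rows standing in for the real dataset.
# The actual column names in the Excel file may differ.
tweets = pd.DataFrame({
    "text": [
        "RT @who: Get vaccinated https://example.org #COVID19 #vaccine",
        "Got my Pfizer shot today #covid19",
        "Les vaccins arrivent #COVID19 #moderna",
        "Stay home, stay safe",
    ],
    "lang": ["en", "en", "fr", "en"],
    "screen_name": ["news_a", "alice", "bob", "carol"],
})


def percent(mask):
    """Share of True values in a boolean Series, as a percentage."""
    return round(100 * float(mask.mean()), 2)


text = tweets["text"]

# Tweets containing at least one URL.
pct_urls = percent(text.str.contains(r"https?://", regex=True))

# Tweets that are (or contain) a retweet marker such as "RT @user".
pct_retweets = percent(text.str.contains(r"\bRT @", regex=True))

# Tweets mentioning one vaccine keyword, case-insensitively.
pct_pfizer = percent(text.str.lower().str.contains("pfizer", regex=False))

# Language distribution from the metadata column, in percent.
lang_distribution = tweets["lang"].value_counts(normalize=True).mul(100).round(2)

# Top hashtags as [rank, hashtag, frequency] rows.
# Assumption: hashtags are lowercased so "#COVID19" and "#covid19" count together.
hashtag_counts = Counter(
    tag.lower() for tweet in text for tag in re.findall(r"#\w+", tweet)
)
top_hashtags = [
    [rank, tag, freq]
    for rank, (tag, freq) in enumerate(hashtag_counts.most_common(30), start=1)
]

# Tweets posted directly by a given set of accounts (hypothetical account list).
media_accounts = {"news_a"}
pct_media = percent(tweets["screen_name"].str.lower().isin(media_accounts))

print(pct_urls, pct_retweets, pct_pfizer, pct_media)
print(lang_distribution.to_dict())
print(top_hashtags)
```

Each percentage relies on the fact that the mean of a boolean Series is the fraction of True values, so the same `percent` helper serves every "percentage of tweets that ..." item in the list.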

Other information

The Twitter data in the Excel file was provided by our course instructor for the educational purposes of the course only. I am grateful to our instructor, Thay Phan Thanh Trung, who kindly gave us access to the dataset as well as clear instructions and requirements for this analysis assignment.

#python #university #computerscience #dataanalysis #learning #pythonproject

