Tweet downloading class. The TweetDownloader class contains the main downloading function as well as the storing and plotting functions accessible to the user.

Parameters:

Name Type Description Default
credentials str

A path pointing to the location of the TwitterAPI credentials file.

required
name str, optional

The name to use when saving downloaded files and exports. The default value is 'Project_[date]', with the current date in %m%d%Y_%H%M%S format.

None
output_folder str, optional

Path to the folder in which saved information is going to be stored. It defaults to the current location.

''
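
The default name described above can be reproduced with the standard library. This is a minimal sketch, not the class's actual code: the helper `default_project_name` is hypothetical, and only the 'Project_' prefix and the %m%d%Y_%H%M%S format come from the parameter description.

```python
from datetime import datetime


def default_project_name(now=None):
    """Build a name of the form 'Project_[date]' (hypothetical helper)."""
    # Use the supplied moment, or the current local time if none is given.
    now = now or datetime.now()
    # %m%d%Y_%H%M%S: month, day, four-digit year, underscore, hour, minute, second.
    return "Project_" + now.strftime("%m%d%Y_%H%M%S")


example = default_project_name(datetime(2023, 3, 7, 14, 5, 9))
print(example)  # Project_03072023_140509
```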

Attributes:

Name Type Description
credentials str

A path pointing to the location of the TwitterAPI credentials file.

name str, optional

The name to use when saving downloaded files and exports.

output_folder str, optional

Path to the folder in which saved information is going to be stored.

tweets list

List of pages of the response tweet object obtained from Twitter API calls.

authors list

List of pages of the response authors object obtained from Twitter API calls.

places list

List of pages of the response location object obtained from Twitter API calls.

replies list

List of tweets that are replies to the tweets in the tweets attribute

tweets_df pandas.DataFrame

Table with the tweets from the attribute tweets

authors_df pandas.DataFrame

Table with the authors from the attribute authors

places_df pandas.DataFrame

Table with the georeferenced locations from the attribute places

replies_df pandas.DataFrame

Table containing replies to the tweets in the tweets_df table

search_args dict

Dictionary containing the Twitter keys required to access the API

timestamp str

A string to append at the end of saved files, so they all have a timestamp

get_tweets(query, start_time=None, end_time=None, lang=None, include_retweets=False, place=None, has_geo=True, max_tweets=10, max_page=500, save_temp=True, save_final=True, save_replies=False, include_replies=False, max_replies=10, temp_replies=True)

Parameters:

Name Type Description Default
query str

Words to be searched in tweets. Twitter API query operators supported.

required
start_time str

Lower bound of the time frame in which tweets are searched, in date-time format (default is the current date and time minus 24 hours)

None
end_time str

Upper bound of the time frame in which tweets are searched, in date-time format (default is the current date and time)

None
lang str, optional

Two-letter language code that retrieved tweets must match

None
include_retweets bool

Whether to include tweets that are just a retweet of a previous one (default is False)

False
place str, optional

Two-letter code for the country or place to which the search is constrained

None
has_geo bool, optional

Whether to only include tweets with geographic reference (default is True)

True
max_tweets int

The maximum number of tweets to retrieve in total (default is 10)

10
max_page int

The maximum number of tweets allowed per page of tweets (default is 500)

500
save_temp bool

Whether to save current progress (default is True)

True
save_final bool

Whether to save final tweets dataframe after download is over (default is True)

True
save_replies bool

Whether to include the replies to the downloaded tweets (default is False)

False
max_replies int

Maximum number of replies per tweet if replies are allowed (default is 10)

10
temp_replies bool

Whether to save progress while downloading replies if these are allowed (default is True)

True
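
As an illustration of the time-window defaults, the sketch below computes a 24-hour window and collects keyword arguments named after the parameters listed above. The helper `default_time_window`, the query text, and the ISO 8601 style string format are assumptions; the exact date-time format the method accepts is not stated here and should be checked before use.

```python
from datetime import datetime, timedelta, timezone


def default_time_window(now=None):
    """Return (start_time, end_time) strings covering the previous 24 hours."""
    # Fall back to the current UTC time when no reference moment is given.
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=24)
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # assumed format, not confirmed by this page
    return start.strftime(fmt), now.strftime(fmt)


start_time, end_time = default_time_window(
    datetime(2023, 3, 7, 12, 0, 0, tzinfo=timezone.utc)
)

# Keyword arguments named after the documented get_tweets parameters.
search_kwargs = {
    "query": "flood",  # illustrative query only
    "start_time": start_time,
    "end_time": end_time,
    "lang": "en",
    "has_geo": True,
    "max_tweets": 100,
    "max_page": 100,
}
print(search_kwargs)
```

With an instance of the class at hand, these arguments could then be passed as `downloader.get_tweets(**search_kwargs)`, where `downloader` is a hypothetical variable name.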

get_replies(max_replies=10, save_temp=True, save_final=True)

Parameters:

Name Type Description Default
max_replies int

Maximum number of replies for each tweet in the original tweets dataset (default is 10)

10
save_temp bool

Whether to save progress at each page (default is True)

True
save_final bool

Whether to save final replies dataset (default is True)

True

tweets_from_csv(path, sep=',', save_temp=True)

Parameters:

Name Type Description Default
path str

The path to the CSV file containing the download parameters

required
sep str, optional

The separator of the csv file (default is ,)

','
save_temp bool, optional

Whether to save progress at each downloaded page (default is True)

True
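
The layout of the parameter file is not described above. The sketch below assumes one download per row, with column names borrowed from the get_tweets parameters; both the column names and the sample values are assumptions made for illustration.

```python
import csv
import os
import tempfile

# Assumed column names, borrowed from the get_tweets parameters.
fieldnames = ["query", "start_time", "end_time", "lang", "max_tweets"]
rows = [
    {"query": "flood", "start_time": "2023-03-01T00:00:00Z",
     "end_time": "2023-03-02T00:00:00Z", "lang": "en", "max_tweets": 50},
    {"query": "wildfire", "start_time": "2023-03-01T00:00:00Z",
     "end_time": "2023-03-02T00:00:00Z", "lang": "es", "max_tweets": 50},
]

# Write the parameters to a temporary file using the default ',' separator.
csv_path = os.path.join(tempfile.mkdtemp(), "downloads.csv")
with open(csv_path, "w", newline="", encoding="utf-8") as handle:
    writer = csv.DictWriter(handle, fieldnames=fieldnames, delimiter=",")
    writer.writeheader()
    writer.writerows(rows)

# Read the file back to confirm that each row holds one set of parameters.
with open(csv_path, newline="", encoding="utf-8") as handle:
    loaded = list(csv.DictReader(handle, delimiter=","))

print(loaded[0]["query"], loaded[1]["lang"])
```

A file like this could then be handed over as `tweets_from_csv(csv_path, sep=',')`, provided the real column names match what the method expects.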

tweets_to_gdf(geo_type='centroids')

Parameters:

Name Type Description Default
geo_type

The type of geometry (default is centroids)

'centroids'
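
To make the 'centroids' option concrete, the sketch below reduces a bounding box to a single point by taking its midpoint. It assumes boxes in [west, south, east, north] order, as in Twitter API v2 place objects; the helper `bbox_centroid` is hypothetical, and the method itself may derive centroids differently.

```python
def bbox_centroid(bbox):
    """Return the (longitude, latitude) midpoint of a [west, south, east, north] box."""
    west, south, east, north = bbox
    # Average the two longitudes and the two latitudes.
    return ((west + east) / 2, (south + north) / 2)


# Illustrative bounding box, not real API output.
centroid = bbox_centroid([-74.0, 4.0, -73.0, 5.0])
print(centroid)  # (-73.5, 4.5)
```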

places_to_gdf(geo_type='centroids')

Parameters:

Name Type Description Default
geo_type

The type of geometry (default is centroids)

'centroids'

preview_tweet_locations()

interactive_map()

plot_heatmap(radius=20)

Parameters:

Name Type Description Default
radius int

The radius of the heatmap plot (default is 20)

20

map_animation(time_unit)

Parameters:

Name Type Description Default
time_unit

Time unit to aggregate by (default is 'day')

'second'
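
Aggregating by a time unit amounts to truncating each timestamp to that unit and counting the tweets in each bucket. The sketch below shows the idea with the standard library only; the unit names, the helper `count_by_unit`, and the sample timestamps are assumptions, and the method may aggregate in a different way.

```python
from collections import Counter
from datetime import datetime

# Truncation formats for a few units; the unit names are assumptions.
UNIT_FORMATS = {
    "day": "%Y-%m-%d",
    "hour": "%Y-%m-%d %H",
    "minute": "%Y-%m-%d %H:%M",
    "second": "%Y-%m-%d %H:%M:%S",
}


def count_by_unit(timestamps, time_unit):
    """Count timestamps per bucket after truncating them to time_unit."""
    fmt = UNIT_FORMATS[time_unit]
    return Counter(ts.strftime(fmt) for ts in timestamps)


# Illustrative creation times, not real tweets.
created_at = [
    datetime(2023, 3, 7, 9, 15, 0),
    datetime(2023, 3, 7, 9, 45, 30),
    datetime(2023, 3, 8, 18, 0, 5),
]
per_day = count_by_unit(created_at, "day")
per_hour = count_by_unit(created_at, "hour")
print(per_day)
```

Coarser units give fewer, fuller animation frames, while 'second' gives one frame per distinct timestamp in this sample.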

wordcloud(custom_stopwords=None, background_color='black', min_word_length=4, save_wordcloud=True, bar_plot=False, save_bar_plot=False)

Parameters:

Name Type Description Default
custom_stopwords list

List of words to exclude from word cloud

None
background_color

Background color of wordcloud plot

'black'
min_word_length int

Minimum length of strings to be considered for word cloud (default is 4)

4
save_wordcloud

Whether to save plot (default is True)

True
bar_plot

Whether to display barplot with word frequency (default is False)

False
save_bar_plot

Whether to save barplot (default is False)

False
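
The effect of custom_stopwords and min_word_length can be previewed without plotting. The sketch below counts word frequencies under those two filters; the helper `word_frequencies`, its simple tokenizer, and the sample texts are assumptions, and the method's own text cleaning may differ.

```python
import re
from collections import Counter


def word_frequencies(texts, custom_stopwords=None, min_word_length=4):
    """Count words that pass the length and stopword filters."""
    stopwords = {word.lower() for word in (custom_stopwords or [])}
    counts = Counter()
    for text in texts:
        # Lowercase the text and keep runs of letters and apostrophes.
        for word in re.findall(r"[a-z']+", text.lower()):
            if len(word) >= min_word_length and word not in stopwords:
                counts[word] += 1
    return counts


# Illustrative texts, not real tweets.
texts = ["Heavy rain and flood warnings", "Flood waters rising after the rain"]
freqs = word_frequencies(texts, custom_stopwords=["after"])
print(freqs.most_common(2))
```

Short words such as "and" and "the" drop out through the length filter alone, so custom_stopwords is mainly useful for longer terms, such as the search query itself.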