Welcome to Week 8 of ArcGIS Hub’s Civic Analytics Notebook series. In the last post we saw how the data catalog of a Hub can be analyzed and visualized. A data catalog is an organized inventory of all the data assets in your Hub and their metadata, and it aims to improve data transparency, access and governance.
This week we look at ways to extract data from social media platforms like Twitter. Over the years, Twitter has proved to be a gold mine for learning about events as they happen and about the social reach and impact of those events. Most people tweet their opinion or experience of an event by mentioning relevant Twitter handles (accounts) or by using hashtags, which categorize the tweet and help users find content related to that topic. This data can be used for research as well as to obtain civic and demographic insights. Accessing publicly available tweets for our city or local region helps us invert our lens and understand the general public and what matters to them.
We use the tweepy Python library here to access the Twitter API and fetch tweets. Before we get started with scripting using tweepy, you will need to apply for a Twitter developer account to access the API. Once your application is approved, you will receive a set of consumer keys and access tokens, which are used to authenticate you before you can work with the Twitter API. Here is a great tutorial by Twitter that covers the steps for applying for a developer account. Once we have our developer credentials, we are ready to get started with the notebook.
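As a rough illustration of the authentication step, here is a minimal sketch of how the four credentials might be loaded and handed to tweepy. The environment variable names and the helper functions load_credentials and make_api are assumptions made for this example, not part of the notebook; the tweepy calls (OAuthHandler, set_access_token, API) follow tweepy's OAuth 1a interface, which may differ slightly between library versions.

```python
import os

# Hypothetical environment variable names; use whatever names suit your setup.
CREDENTIAL_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=os.environ):
    """Return the four credentials in order, or raise if any are missing."""
    missing = [name for name in CREDENTIAL_VARS if not env.get(name)]
    if missing:
        raise KeyError("Missing credentials: " + ", ".join(missing))
    return [env[name] for name in CREDENTIAL_VARS]


def make_api(consumer_key, consumer_secret, access_token, access_token_secret):
    """Build an authenticated tweepy API object from developer credentials."""
    # Imported here so load_credentials can be used without tweepy installed.
    import tweepy

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    # wait_on_rate_limit makes tweepy pause instead of failing on rate limits.
    return tweepy.API(auth, wait_on_rate_limit=True)
```

With the variables set, `api = make_api(*load_credentials())` would give an object to pass to the extraction steps. Keeping the keys out of the notebook itself avoids publishing them by accident when the notebook is shared.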
A few different scenarios for extracting tweets
In this notebook we introduce 5 different scenarios of filtering and extracting tweets.
- The first scenario is for extracting tweets for a particular user mention. A user is mentioned by writing @user in the tweet text, which is a way to get that user’s attention. Extracting tweets for a mention tells us what people are generally saying about the user and where these tweets are made from.
- In the second scenario, we extract tweets for a particular hashtag. A hashtag is written with a # symbol and is used to categorize tweets by topic. Fetching tweets for a hashtag acts like an index for all the things being said about that topic.
- In the third scenario we filter tweets that contain a particular word (as a mention, as a hashtag or in the tweet text) and that have been made after a particular date. This helps us track tweets after a particular event to assess its impact on people.
- In the fourth scenario we see how we can extract a fixed number of tweets made by a certain user account, to understand the content created by that user.
- To conclude, we see how to fetch all the undeleted tweets made from a particular user account and store them in a table, which we then publish to ArcGIS as a hosted table for future use. We also add a section that shows you how to update this hosted table with new data in future re-runs of this notebook.
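The scenarios above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the notebook's own code: every function name here is hypothetical, the query strings assume the operators of Twitter's standard search API (@user, #tag, since:YYYY-MM-DD), and the tweepy calls (Cursor, API.search_tweets, API.user_timeline) use tweepy 4.x naming, where older releases call the search method API.search. The last helper shows one possible way to append only new tweets on a re-run, by tweet id.

```python
from datetime import date, datetime
from types import SimpleNamespace


def mention_query(user):
    """Scenario 1: query for tweets that mention a user."""
    return "@" + user.lstrip("@")


def hashtag_query(tag):
    """Scenario 2: query for tweets that carry a hashtag."""
    return "#" + tag.lstrip("#")


def keyword_since_query(word, since):
    """Scenario 3: query for tweets containing a word, sent since a date."""
    return f"{word} since:{since.isoformat()}"


def tweets_to_rows(tweets):
    """Flatten tweet objects into plain dicts, one table row per tweet."""
    rows = []
    for t in tweets:
        rows.append({
            "id": t.id,
            "created_at": t.created_at.isoformat(),
            "user": t.user.screen_name,
            # Extended-mode tweets expose full_text; others only text.
            "text": getattr(t, "full_text", None) or t.text,
        })
    return rows


def search_rows(api, query, limit=100):
    """Scenarios 1-3: run a search query and return up to `limit` rows."""
    import tweepy  # imported lazily; only needed for live requests
    cursor = tweepy.Cursor(api.search_tweets, q=query, tweet_mode="extended")
    return tweets_to_rows(cursor.items(limit))


def timeline_rows(api, screen_name, limit=None):
    """Scenarios 4-5: a fixed number of a user's tweets, or all available."""
    import tweepy
    cursor = tweepy.Cursor(api.user_timeline, screen_name=screen_name,
                           tweet_mode="extended")
    return tweets_to_rows(cursor.items(limit) if limit else cursor.items())


def merge_new_rows(existing, new):
    """Re-runs: keep existing rows and append only unseen tweet ids."""
    seen = {row["id"] for row in existing}
    return existing + [row for row in new if row["id"] not in seen]


# Offline demo with a stand-in tweet object (no network access needed).
sample = SimpleNamespace(id=1, created_at=datetime(2020, 3, 2, 12, 0, 0),
                         user=SimpleNamespace(screen_name="cityhall"),
                         text="Road closures today")
demo_rows = tweets_to_rows([sample])
demo_query = keyword_since_query("flood", date(2020, 3, 1))
```

With an authenticated api object, `search_rows(api, hashtag_query("opendata"))` would cover the second scenario and `timeline_rows(api, "cityhall", limit=50)` the fourth. The resulting list of dicts can be written to a CSV file or loaded into a data frame before publishing it as a table.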
I invite you to experiment with the different Twitter data extraction techniques in this notebook. Leverage the power of this data by creating ArcGIS Dashboards, Experience Builder or Insights apps that provide a visual summary of the trends and sentiments in this data. Here is an example of an ArcGIS Dashboard we created with this data. Next week we will see how to extract and analyze tweets from our city to understand the social fabric of our city or local region using text and spatial analysis techniques. We look forward to hearing your thoughts on and experiences with this notebook and our other Civic Analytics notebook offerings on our Community thread.
Link to notebook – Twitter data extraction