Usage of Twitter API v2 with tweepy and pandas in Python
In this tutorial, we’ll cover the setup to get started with the Twitter API v2 using Python and tweepy.
- Sign up with Twitter
- Create an App in the developer account: Follow steps 1 and 2 in this Twitter article
- Obtain the API key and secret, the bearer token, and the access token and access token secret. These can be generated in your developer portal, under the “Keys and tokens” tab for your developer App.
- Next, we need to install tweepy. Installation with Anaconda:
conda install -c conda-forge tweepy
Note that we use the free “Essential access” level and therefore have the following limitations:
- 500,000 Tweets per month
- 1 Project per account
- 1 App environment per Project
- No access to standard v1.1, premium v1.1, or enterprise
Now we are ready to import tweepy:
import tweepy
Create keys
We need to provide the Twitter keys and tokens in order to use the API v2.
Therefore, we first create a simple Python script called keys.py in which we store all keys and tokens.
We create the file with the following steps:
1. Create a variable that holds the file name keys.py.
2. Create the file with the %%writefile cell magic; this saves the script in the same folder as this notebook.
3. Open keys.py and insert your keys.
# Create variable
file_name = 'keys.py'
%%writefile {file_name}
consumer_key = "insert your API key"
consumer_secret = "insert your API secret"
access_token = "insert your access token"
access_token_secret = "insert your access token secret"
bearer_token = "insert your bearer token"
Writing keys.py
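The %%writefile magic only works inside a notebook. For readers working in a plain Python session, the same step could be sketched roughly as follows. The helper name write_keys_template and the constant KEYS_TEMPLATE are hypothetical, introduced only for this illustration, and the check against overwriting is an added safeguard that the notebook cell does not have.

```python
from pathlib import Path

# Placeholder contents, mirroring the keys.py cell above.
KEYS_TEMPLATE = '''consumer_key = "insert your API key"
consumer_secret = "insert your API secret"
access_token = "insert your access token"
access_token_secret = "insert your access token secret"
bearer_token = "insert your bearer token"
'''


def write_keys_template(path="keys.py"):
    """Write the placeholder keys file, refusing to overwrite an existing one."""
    target = Path(path)
    if target.exists():
        # Avoid clobbering a file that may already hold real credentials.
        raise FileExistsError(f"{target} already exists")
    target.write_text(KEYS_TEMPLATE, encoding="utf-8")
    return target
```

Calling write_keys_template() once creates the file in the current working directory; afterwards, open keys.py and replace the placeholders with your own values.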
Make a connection with API v2
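We import the keys and use them to create a tweepy.Client. Setting return_type to requests.Response makes the client return the raw HTTP response, which we convert to JSON later, and wait_on_rate_limit=True makes the client wait whenever the rate limit is reached: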
from keys import *
import requests
client = tweepy.Client(bearer_token=bearer_token,
                       consumer_key=consumer_key,
                       consumer_secret=consumer_secret,
                       access_token=access_token,
                       access_token_secret=access_token_secret,
                       return_type=requests.Response,
                       wait_on_rate_limit=True)
Make a query
- Let’s search Tweets from Barack Obama’s Twitter account (@BarackObama) from the last 7 days (search_recent_tweets).
- We exclude Retweets and limit the result to a maximum of 100 Tweets.
- We also include some additional information with tweet_fields (author id and when the Tweet was created).
# Define query
query = 'from:BarackObama -is:retweet'
# get max. 100 tweets
tweets = client.search_recent_tweets(query=query,
                                     tweet_fields=['author_id', 'created_at'],
                                     max_results=100)
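The query string above combines two operators: from: restricts the search to one account, and -is:retweet excludes Retweets. To reuse the same pattern for other accounts, the string could be assembled by a small helper. This is only a sketch; build_query and its parameters are hypothetical names, not part of tweepy.

```python
def build_query(username, exclude_retweets=True):
    """Build a recent-search query for Tweets posted by one account."""
    # Accept the handle with or without a leading "@".
    parts = [f"from:{username.lstrip('@')}"]
    if exclude_retweets:
        parts.append("-is:retweet")
    return " ".join(parts)


query = build_query("@BarackObama")
print(query)  # prints: from:BarackObama -is:retweet
```

The resulting string can be passed as the query argument of client.search_recent_tweets in the same way as the hand-written one.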
Convert to pandas DataFrame
Finally, we convert the data to a pandas DataFrame.
import pandas as pd
# Save data as dictionary
tweets_dict = tweets.json()
# Extract "data" value from dictionary
tweets_data = tweets_dict['data']
# Transform to pandas DataFrame
df = pd.json_normalize(tweets_data)
df
| | created_at | id | author_id | text |
|---|---|---|---|---|
| 0 | 2022-05-16T21:24:35.000Z | 1526312680226799618 | 813286 | It’s despicable, it’s dangerous — and it needs… |
| 1 | 2022-05-16T21:24:34.000Z | 1526312678951641088 | 813286 | We need to repudiate in the strongest terms th… |
| 2 | 2022-05-16T21:24:34.000Z | 1526312677521428480 | 813286 | This weekend’s shootings in Buffalo offer a tr… |
| 3 | 2022-05-16T13:16:16.000Z | 1526189794665107457 | 813286 | I’m proud to announce the Voyager Scholarship … |
| 4 | 2022-05-14T15:03:07.000Z | 1525491905139773442 | 813286 | Across the country, Americans are standing up … |
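Indexing the dictionary with ['data'] works when the search returns Tweets. If an account has not tweeted in the last 7 days, the response body may contain only a "meta" entry and no "data" key (this is an assumption about the response shape; check it against your own output), in which case the lookup raises a KeyError. A more defensive variant might look like this, where extract_tweets and both sample payloads are hypothetical and only illustrate the shape:

```python
def extract_tweets(payload):
    """Return the list of Tweet dictionaries, or an empty list if none matched."""
    return payload.get("data", [])


# Hypothetical payloads shaped like the JSON body of a recent-search response.
with_results = {
    "data": [
        {
            "id": "1526189794665107457",
            "author_id": "813286",
            "created_at": "2022-05-16T13:16:16.000Z",
            "text": "example text",
        }
    ],
    "meta": {"result_count": 1},
}
without_results = {"meta": {"result_count": 0}}

print(len(extract_tweets(with_results)))
print(len(extract_tweets(without_results)))
```

The list returned by extract_tweets(tweets.json()) can then be passed to pd.json_normalize in place of tweets_dict['data'].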
# save df
df.to_csv("tweets-obama.csv")