[PYTHON] Get replies to specific tweets with tweepy

1. About this article

This article shows how to use tweepy to collect a specific tweet together with the replies to it. As an example, the code below collects tweets that received at least 100 replies, along with the information for those replies.

Each collected tweet can be linked to its replies by matching the "id" in the tweet's status against the "in_reply_to_status_id" in the reply's status.
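
The matching described above can be sketched in a few lines. This is only an illustration on made-up dictionaries, not the tweepy status objects themselves; the function name `group_replies_by_tweet` and the sample data are assumptions introduced here.

```python
from collections import defaultdict


def group_replies_by_tweet(tweets, replies):
    """Map each collected tweet id to the replies that point at it."""
    tweet_ids = {tweet["id"] for tweet in tweets}
    grouped = defaultdict(list)
    for reply in replies:
        parent_id = reply["in_reply_to_status_id"]
        # Keep only replies whose destination is one of the collected tweets
        if parent_id in tweet_ids:
            grouped[parent_id].append(reply)
    return dict(grouped)


# Hypothetical sample data standing in for collected statuses
tweets = [{"id": 1, "text": "original A"}, {"id": 2, "text": "original B"}]
replies = [
    {"id": 10, "in_reply_to_status_id": 1, "text": "reply to A"},
    {"id": 11, "in_reply_to_status_id": 1, "text": "another reply to A"},
    {"id": 12, "in_reply_to_status_id": 99, "text": "reply to an uncollected tweet"},
]

linked = group_replies_by_tweet(tweets, replies)
```

Here tweet 1 ends up with two replies, tweet 2 with none, and the reply to the uncollected tweet 99 is dropped.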

Note that the free version of the Twitter API cannot collect tweets older than 7 days.
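
One way to stay inside that window is to clamp the requested start date before searching. The sketch below assumes the 7-day figure stated above; the constant `SEARCH_WINDOW_DAYS` and the helper `clamp_begin_date` are hypothetical names introduced for illustration.

```python
from datetime import datetime, timedelta

# Assumption: the 7-day limit mentioned in this article
SEARCH_WINDOW_DAYS = 7


def clamp_begin_date(requested, now):
    """Return the requested start, or the oldest searchable time if it is too old."""
    earliest = now - timedelta(days=SEARCH_WINDOW_DAYS)
    return max(requested, earliest)


now = datetime(2020, 3, 10, 12, 0, 0)
too_old = clamp_begin_date(datetime(2020, 3, 1), now)    # moved up to 2020-03-03 12:00:00
in_range = clamp_begin_date(datetime(2020, 3, 8), now)   # unchanged: 2020-03-08 00:00:00
```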

2. Code (Python 3)

gather.py


# coding:utf-8

import tweepy
import csv
import time
from datetime import datetime, date, timedelta
import re

# Get the current date and time
today = datetime.today()
# Posting-time range of the tweets to collect (example: from 2 days ago through today)
tweet_begin_date = (today - timedelta(days=2)).strftime('%Y-%m-%d_00:00:00_JST')
tweet_end_date = today.strftime('%Y-%m-%d_23:59:00_JST')
# Posting-time range of the replies to collect (example: from 2 days ago through today)
reply_begin_date = (today - timedelta(days=2)).strftime('%Y-%m-%d_00:00:00_JST')
reply_end_date = today.strftime('%Y-%m-%d_23:59:00_JST')

# Output directory for the result CSV files
csv_dir = '/hoge/'

# Twitter API KEY
Consumer_key = 'xxxx'
Consumer_secret = 'xxxx'
Access_token = 'xxxx'
Access_secret = 'xxxx'

# Authenticate with the Twitter API
# (written for tweepy 3.x; tweepy 4 renamed api.search to api.search_tweets
#  and removed the wait_on_rate_limit_notify argument)
def authTwitter():
        auth = tweepy.OAuthHandler(Consumer_key, Consumer_secret)
        auth.set_access_token(Access_token, Access_secret)
        api = tweepy.API(auth, retry_count=3, retry_delay=40,
                         retry_errors={401, 404, 500, 502, 503, 504},
                         wait_on_rate_limit=True, wait_on_rate_limit_notify=True)
        return api

# Collect tweets matching search string s, then the replies to them matching search string t
def gather_tweet_and_reply(s,t):
        api = authTwitter() #Authentication
        tweet_list = []
        reply_list = []
        tweet_id_list = []
        user_id_list = []

        tweets = tweepy.Cursor(api.search, q = s,     # Search string
                 include_entities = True,   # Include entities such as links
                 tweet_mode = 'extended',   # Get the full, untruncated tweet text
                 since = tweet_begin_date,    # Start of the collection period
                 until = tweet_end_date,      # End of the collection period
                 lang = 'ja').items()       # Japanese tweets only

        #Store searched tweets in a list
        for tweet in tweets:
                tweet_list.append([tweet.id, tweet.user.screen_name, tweet.created_at, tweet.full_text.replace('\n',''), tweet.favorite_count, tweet.retweet_count])
                tweet_id_list.append(tweet.id)
                user_id_list.append(tweet.user.screen_name)

        # Search for replies addressed to each user name stored in user_id_list
        # (dict.fromkeys drops duplicate names so no reply is collected twice)
        for user_id in dict.fromkeys(user_id_list):
                replies = tweepy.Cursor(api.search, q = t + " to:" + str(user_id),   # Search string
                          include_entities = True,   # Include entities such as links
                          tweet_mode = 'extended',   # Get the full, untruncated tweet text
                          since = reply_begin_date,    # Start of the reply collection period
                          until = reply_end_date,      # End of the reply collection period
                          lang = 'ja').items()       # Japanese tweets only
                # Pause so that a burst of requests does not get the session disconnected
                time.sleep(5)
                # Keep a reply only if the ID of the tweet it answers is in tweet_id_list
                for reply in replies:
                        if reply.in_reply_to_status_id in tweet_id_list:
                                reply_list.append([reply.id, reply.in_reply_to_status_id, reply.user.screen_name, reply.created_at, reply.full_text.replace('\n',''), reply.favorite_count, reply.retweet_count])

        # Write the collected tweets to a CSV file
        with open(csv_dir + 'tweet_' + today.strftime('%Y%m%d_%H%M%S') + '.csv', 'w', newline='', encoding='utf-8') as f:
                writer = csv.writer(f, lineterminator='\n')
                writer.writerow(["id","user","created_at","text","fav","RT"])
                writer.writerows(tweet_list)

        # Write the collected replies to a CSV file ("to_id" is the ID of the tweet replied to)
        with open(csv_dir + 'reply_' + today.strftime('%Y%m%d_%H%M%S') + '.csv', 'w', newline='', encoding='utf-8') as f:
                writer = csv.writer(f, lineterminator='\n')
                writer.writerow(["id","to_id","user","created_at","text","fav","RT"])
                writer.writerows(reply_list)

def main():
        gather_tweet_and_reply("lang:ja exclude:retweets min_replies:100","lang:ja filter:replies exclude:retweets")

if __name__ == "__main__":
        main()
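
Once gather.py has written its two CSV files, they can be joined on the "id" and "to_id" columns. The following is a minimal sketch under that assumption. It writes two tiny made-up files to a temporary directory so that it runs on its own; the helper name `count_replies_per_tweet` and the sample rows are hypothetical.

```python
import csv
import os
import tempfile
from collections import Counter


def count_replies_per_tweet(tweet_csv_path, reply_csv_path):
    """Return {tweet id: number of collected replies}, keyed by the "id" column."""
    with open(reply_csv_path, newline="", encoding="utf-8") as f:
        counts = Counter(row["to_id"] for row in csv.DictReader(f))
    with open(tweet_csv_path, newline="", encoding="utf-8") as f:
        return {row["id"]: counts.get(row["id"], 0) for row in csv.DictReader(f)}


# Made-up sample files using the same column layout as gather.py's output
tmp_dir = tempfile.mkdtemp()
tweet_path = os.path.join(tmp_dir, "tweet_sample.csv")
reply_path = os.path.join(tmp_dir, "reply_sample.csv")

with open(tweet_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, lineterminator="\n")
    writer.writerow(["id", "user", "created_at", "text", "fav", "RT"])
    writer.writerow([1, "alice", "2020-03-01 10:00:00", "hello", 5, 1])
    writer.writerow([2, "bob", "2020-03-01 11:00:00", "world", 0, 0])

with open(reply_path, "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, lineterminator="\n")
    writer.writerow(["id", "to_id", "user", "created_at", "text", "fav", "RT"])
    writer.writerow([10, 1, "carol", "2020-03-01 10:05:00", "hi", 0, 0])
    writer.writerow([11, 1, "dave", "2020-03-01 10:06:00", "hey", 0, 0])

reply_counts = count_replies_per_tweet(tweet_path, reply_path)
```

Because the csv module reads every field as a string, the keys of `reply_counts` are strings such as "1" rather than integers.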

3. Reference

Summary of procedures from Twitter API registration (account application method) to approval * Information as of August 2019
I didn't know what I could get from the Tweepy status list, so I took it out
