[PYTHON] A story about getting stuck trying to get a video URL with tweepy

Introduction

I am building a LINE Bot that automatically sends the latest images and videos of actress Mayu Matsuoka every day. While building it, I got stuck trying to get the image and video URLs with tweepy, so I am sharing what I learned.

Looking back on it now, I wasted a lot of time, but if I at least write an article about it, that time will be a little less wasted! So here it is.

L.png

Original code

search_tweets.py


import os
import tweepy
from datetime import datetime
from dateutil.relativedelta import relativedelta

# Credentials are read from environment variables
consumer_key = os.getenv('TWITTER_CONSUMER_KEY')
consumer_secret = os.getenv('TWITTER_CONSUMER_SECRET')
access_token = os.getenv('TWITTER_ACCESS_TOKEN')
access_token_secret = os.getenv('TWITTER_ACCESS_TOKEN_SECRET')


# Search Twitter for tweets about Mayu Matsuoka and collect the media URLs
def search_tweets():
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)

    yesterday = datetime.strftime(datetime.today() - relativedelta(days=1), "%Y-%m-%d")
    # Search conditions: has images or videos, no retweets, at least 10 likes,
    # at least 1 retweet, posted since yesterday
    q = f'#Mayu Matsuoka OR Mayu Matsuoka filter:media exclude:retweets min_faves:10 since:{yesterday} min_retweets:1'

    # The problematic part
    tweets = tweepy.Cursor(api.search, q=q).items(20)

    contents = []
    for tweet in tweets:
        print(tweet.text)
        print(tweet.extended_entities)
        try:
            media = tweet.extended_entities['media']
            for m in media:
                print(m)
                preview = m['media_url_https']
                if m['type'] == 'video':
                    origin = [variant['url'] for variant in m['video_info']['variants'] if variant['content_type'] == 'video/mp4'][0]
                else:
                    origin = m['media_url_https']

                # Format the result for sending via the LINE Bot
                content = {'preview': preview, 'origin': origin, 'type': m['type']}
                contents.append(content)

            print('--------------------------------------------')
        except:
            print('Error')
            print('--------------------------------------------')
    return contents


if __name__ == "__main__":
    search_tweets()

With this code, I expected to get the URLs for both tweets with images and tweets with videos...

The video URLs don't come back as expected

The search itself works, but I noticed that the video URL can be obtained for some tweets and not for others.

Maybe an error is occurring inside the try...

When I printed each tweet, I found that some tweets had no extended_entities.

for tweet in tweets:
    print(tweet.text)
    # The cause:
    # some tweets have no extended_entities, so accessing it raises an error
    # and the tweet ends up in except: without any URL being collected.
    print(tweet.extended_entities)
    try:
        media = tweet.extended_entities['media']

When I looked it up, I found many articles about it.

Get tweets over 140 characters https://qiita.com/hitsumabushi845/items/f7fd87106381fc65fc86

It seems that extended_entities disappears when the tweet text plus the video URL exceeds 140 characters.

It took me a long time to notice this. Or rather, once I actually searched for it, it was solved in an instant.
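Whatever the cause, the loop can also be made robust against tweets that lack extended_entities. The following is only a minimal sketch: the helper name get_media is hypothetical, and SimpleNamespace objects stand in for real tweepy Status objects, assuming nothing more than attribute access.

```python
from types import SimpleNamespace


def get_media(tweet):
    """Return the tweet's media list, or [] when extended_entities is absent."""
    # getattr with a default avoids AttributeError on tweets without the attribute.
    entities = getattr(tweet, 'extended_entities', None) or {}
    return entities.get('media', [])


# Stand-ins for tweepy Status objects (assumption: only attribute access matters).
with_media = SimpleNamespace(extended_entities={'media': [{'type': 'photo'}]})
without_media = SimpleNamespace(text='no media here')

print(get_media(with_media))
print(get_media(without_media))
```

With a helper like this, a tweet without media simply yields an empty list instead of falling into the except branch.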

Solution

Adding tweet_mode='extended' and include_entities=True to the api.search parameters fixes it.

# Before
tweets = tweepy.Cursor(api.search, q=q).items(20)

# After: request the parts that were being omitted
tweets = tweepy.Cursor(api.search, q=q,
    tweet_mode='extended',  # Get the full, untruncated tweet
    include_entities=True).items(20)  # Get all the omitted links

Now I can get the video URLs, but a different error is raised...

AttributeError: 'Status' object has no attribute 'text'

Apparently, when tweet_mode is set to 'extended', the attribute that holds the tweet text changes from text to full_text.

for tweet in tweets:
    # text → full_text
    # print(tweet.text)
    print(tweet.full_text)
    print(tweet.extended_entities)
    try:
        media = tweet.extended_entities['media']
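If the same loop has to handle results fetched with and without tweet_mode='extended', the text can be read defensively. This is a sketch under assumptions: tweet_text is a hypothetical helper, and the SimpleNamespace objects are stand-ins for tweepy Status objects.

```python
from types import SimpleNamespace


def tweet_text(tweet):
    """Return full_text when present (extended mode), otherwise fall back to text."""
    full = getattr(tweet, 'full_text', None)
    if full is not None:
        return full
    return getattr(tweet, 'text', '')


# Stand-ins: one result shaped like extended mode, one like the default mode.
extended = SimpleNamespace(full_text='long tweet body')
compat = SimpleNamespace(text='short tweet body')

print(tweet_text(extended))
print(tweet_text(compat))
```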

Modified code

search_tweets.py


import os
import tweepy
from datetime import datetime
from dateutil.relativedelta import relativedelta

# Credentials are read from environment variables
consumer_key = os.getenv('TWITTER_CONSUMER_KEY')
consumer_secret = os.getenv('TWITTER_CONSUMER_SECRET')
access_token = os.getenv('TWITTER_ACCESS_TOKEN')
access_token_secret = os.getenv('TWITTER_ACCESS_TOKEN_SECRET')


# Search Twitter for tweets about Mayu Matsuoka and collect the media URLs
def search_tweets():
    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_token, access_token_secret)
    api = tweepy.API(auth)

    yesterday = datetime.strftime(datetime.today() - relativedelta(days=1), "%Y-%m-%d")
    # Search conditions: has images or videos, no retweets, at least 10 likes,
    # at least 1 retweet, posted since yesterday
    q = f'#Mayu Matsuoka OR Mayu Matsuoka filter:media exclude:retweets min_faves:10 since:{yesterday} min_retweets:1'

    # The fixed part: request untruncated tweets so extended_entities is returned
    tweets = tweepy.Cursor(api.search, q=q,
                           tweet_mode='extended',
                           include_entities=True).items(20)

    contents = []
    for tweet in tweets:
        # In extended mode the text is in full_text, not text
        print(tweet.full_text)
        # Read extended_entities safely in case a tweet still lacks it
        entities = getattr(tweet, 'extended_entities', None)
        print(entities)
        try:
            media = entities['media']
            for m in media:
                print(m)
                preview = m['media_url_https']
                if m['type'] == 'video':
                    # Take the first MP4 variant as the video URL
                    origin = [variant['url'] for variant in m['video_info']['variants']
                              if variant['content_type'] == 'video/mp4'][0]
                else:
                    origin = m['media_url_https']

                # Format the result for sending via the LINE Bot
                content = {'preview': preview, 'origin': origin, 'type': m['type']}
                contents.append(content)

            print('--------------------------------------------')
        except (TypeError, KeyError, IndexError) as e:
            # TypeError: no extended_entities; IndexError: no MP4 variant
            print('Error', repr(e))
            print('--------------------------------------------')
    return contents


if __name__ == "__main__":
    search_tweets()
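The code above takes the first MP4 entry in video_info['variants']. If the highest-quality file is wanted instead, one option is to pick the MP4 variant with the largest bitrate. This is a sketch, not part of the original bot: best_mp4_url is a hypothetical helper, and the sample data uses made-up example.com URLs shaped like a video_info dictionary.

```python
def best_mp4_url(video_info):
    """Return the URL of the highest-bitrate MP4 variant, or None if there is none."""
    mp4_variants = [
        v for v in video_info.get('variants', [])
        if v.get('content_type') == 'video/mp4'
    ]
    if not mp4_variants:
        return None
    # Non-MP4 variants (for example HLS playlists) carry no bitrate, so they are
    # filtered out above; a missing bitrate is treated as 0 just in case.
    return max(mp4_variants, key=lambda v: v.get('bitrate', 0))['url']


# Made-up sample shaped like m['video_info'] in the code above.
sample = {
    'variants': [
        {'content_type': 'application/x-mpegURL', 'url': 'https://video.example.com/pl.m3u8'},
        {'content_type': 'video/mp4', 'bitrate': 832000, 'url': 'https://video.example.com/mid.mp4'},
        {'content_type': 'video/mp4', 'bitrate': 2176000, 'url': 'https://video.example.com/high.mp4'},
        {'content_type': 'video/mp4', 'bitrate': 256000, 'url': 'https://video.example.com/low.mp4'},
    ]
}

print(best_mp4_url(sample))
```

Returning None instead of indexing with [0] also avoids an IndexError when a video has no MP4 variant at all.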

Conclusion

I am a fan of Mayu Matsuoka (commonly known as Mayura) who no longer has time to check the Twitter timeline every day, so I built this LINE Bot just for myself. Still, if you are reading this article and are also a fan of Mayu Matsuoka, please add the bot as a friend using the QR code below!

Thank you!

L.png

Also, if you find any mistakes in this article, or think something could be done better, please leave a comment!

Reference articles

[Python] Search and get Twitter tweets with tweepy https://vatchlog.com/tweepy-search/
Get tweets over 140 characters https://qiita.com/hitsumabushi845/items/f7fd87106381fc65fc86
Get Video URL with Python + tweepy https://thinkami.hatenablog.com/entry/2017/11/02/062226
