[PYTHON] I want to automatically answer a Google Form at 5 o'clock every morning

Hello everyone. It's summer. Club activities have finally resumed, and I'm thrilled, but the night before last my adviser sent me the following message.

- Please take your temperature by 5:20 every morning and report the result via Google Form.
- Members who do not report will not be allowed to participate in morning training.

I was already fed up that morning training starts at 6:30, so being told to report by 5:20 didn't surprise me, but it did create a problem. I usually get up at 5 am and ride my bicycle to the station with a piece of bread in my mouth, so I have no time to take my temperature. I could simply get up a little earlier, but getting up before 5 is so hard that my body wouldn't hold up. And if I wake up at 5 and then take my temperature, I will be late for the morning training itself.

So I decided to write a program that, at around 5 am, enters an unremarkable body temperature in the specified form and submits it for me.

Submit form with Selenium

Using the real form would reveal my identity, so this time I will implement it against a test form I created with the same content as the real one (https://docs.google.com/forms/d/e/1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform).

Prepare a URL with initial value input

Google Forms can open a form with each question already filled in if you add parameters to the URL. The URL for opening the form normally is

https://docs.google.com/forms/d/e/1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform?usp=sf_link

with the parameter `usp=sf_link` attached after viewform. This parameter indicates a plain answer form without pre-filling, so first change it to `usp=pp_url` to indicate that the form is pre-filled.

Then add the answer to each question as a parameter. Each question in the form is identified by a number, so find the question's div in Chrome's Inspect (developer tools) panel and look for the number in the second level beneath it. When you find the number, add a parameter of the form `entry.<number>=<answer>`. This time the name and body temperature are entered as text, so the URL becomes:

https://docs.google.com/forms/d/e/1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform?usp=pp_url&entry.1534939278=荒川智則&entry.511939456=36.5

However, left as it is, this would report 36.5 degrees every day, which would look suspicious, so I will vary the value plausibly with random numbers.

import random

# Randomly generate a value between 36.1 and 36.7 and convert it to a string
body_temp = str(36 + random.randint(1,7)/10)
# Add it to the end of the URL
url = 'https://docs.google.com/forms/d/e/1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform?usp=pp_url&entry.1534939278=Tomonori Arakawa&entry.511939456='+body_temp
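
The name in this URL contains a space (and, in the earlier example, non-ASCII characters), which the browser has to encode on its own. As a minimal sketch, the same pre-filled URL could be assembled with `urllib.parse.urlencode` so that every value is encoded explicitly. The helper names `build_prefilled_url` and `random_temperature` are hypothetical; the form ID and the `entry.` numbers are the ones used in this article.

```python
import random
from urllib.parse import urlencode

FORM_URL = (
    "https://docs.google.com/forms/d/e/"
    "1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform"
)


def random_temperature():
    """Return a temperature between 36.1 and 36.7 in steps of 0.1."""
    return 36 + random.randint(1, 7) / 10


def build_prefilled_url(name, temperature):
    """Build the pre-filled form URL with encoded query parameters."""
    params = {
        "usp": "pp_url",                           # marks the form as pre-filled
        "entry.1534939278": name,                  # name question
        "entry.511939456": f"{temperature:.1f}",   # temperature, one decimal place
    }
    return FORM_URL + "?" + urlencode(params)


url = build_prefilled_url("Tomonori Arakawa", random_temperature())
```

`urlencode` writes the space as `+` and percent-encodes non-ASCII characters, so the result can be passed to `driver.get` unchanged.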

Automatic submission with Selenium

Once the URL is complete, all you have to do is open the URL in Selenium and have it click the submit button.

# Install selenium and chromedriver-binary with pip beforehand
from selenium import webdriver
from selenium.webdriver.common.by import By
import chromedriver_binary  # Importing this adds the bundled chromedriver to PATH
import time
import random

body_temp = str(36 + random.randint(1,7)/10)
url = 'https://docs.google.com/forms/d/e/1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform?usp=pp_url&entry.1534939278=Tomonori Arakawa&entry.511939456='+body_temp

# Click the element at the given XPath
def click(xpath):
    driver.find_element(By.XPATH, xpath).click()

# Type text (email address or password) into the element at the given XPath
def insert_pw(xpath, text):
    driver.find_element(By.XPATH, xpath).send_keys(text)

driver = webdriver.Chrome()
driver.implicitly_wait(1)
driver.get(url)

moving_login_button = '/html/body/div[2]/div/div[2]/div[3]/div[2]'
time.sleep(1)

# If you need to log in with your Google account, log in
if driver.find_elements(By.XPATH, moving_login_button):
    click(moving_login_button)
    login_id = "{Google account email address}"
    login_id_xpath = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[1]/div/form/span/section/div/div/div[1]/div/div[1]/div/div[1]/input'
    login_id_button = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[2]/div/div[1]/div/div'
    insert_pw(login_id_xpath, login_id)
    click(login_id_button)
    time.sleep(1)
    login_pw = "{Google account password}"
    login_pw_xpath = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[1]/div/form/span/section/div/div/div[1]/div[1]/div/div/div/div/div[1]/div/div[1]/input'
    login_pw_button = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[2]/div/div[1]/div/div'
    insert_pw(login_pw_xpath, login_pw)
    time.sleep(1)
    click(login_pw_button)

time.sleep(1)
submit_button = '//*[@id="mG61Hd"]/div[2]/div/div[3]/div[1]/div/div'
click(submit_button)

print("Done!")

# The browser eats memory, so shut it down properly (quit must be called, with parentheses)
driver.quit()
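
Since a leftover browser process keeps eating memory, it may be worth guaranteeing that `quit()` runs even when one of the clicks raises an exception. The following is a minimal sketch of that idea using `try`/`finally`; the helper name `run_and_quit` is hypothetical, and it only assumes that the driver object has a `quit()` method, as Selenium's drivers do.

```python
def run_and_quit(make_driver, action):
    """Create a driver, run action(driver), and always call driver.quit()."""
    driver = make_driver()
    try:
        return action(driver)
    finally:
        # Runs whether action() returned normally or raised
        driver.quit()


# Possible usage with the script above (not executed here):
# run_and_quit(webdriver.Chrome, lambda driver: driver.get(url))
```

Passing the driver factory in as an argument keeps the helper independent of how Chrome is configured, so the same function could wrap both a local run and a headless one.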

The form was submitted successfully.

Run it on a schedule

With the code written, all that remains is to run it on a schedule, but I stumbled a little here, so I will explain how I did it.

My original plan was to turn the script into an application with Automator, put it in the calendar, and run it every day ([Reference](https://qiita.com/baraobara/items/73d753c678e5c0e72f46#4-mac%E3%81%AE%E3%83%87%E3%83%95%E3%82%A9%E3%83%AB%E3%83%88%E3%81%AEautomator%E3%82%92%E7%94%A8%E3%81%84%E3%81%A6mac%E5%86%85%E3%82%A2%E3%83%97%E3%83%AA%E3%82%92%E4%BD%9C%E3%82%8B)). That way I could simply remove the days that need no report from the calendar, which seemed perfect. However, it doesn't work when the PC is shut down. Scheduling it with crontab is ruled out for the same reason.

In the end I decided to use AWS Lambda, which can run events regardless of the state of my PC, although I can't change the schedule as flexibly. (This site was helpful for learning how to use Lambda.)

Upload Selenium, Chromedriver, and headless-chromium to Lambda layers

To use libraries with Lambda, you need to zip each folder and upload it as a layer. This time we use Selenium, Chromedriver (the WebDriver for Chrome), and headless-chromium (for driving Chrome without opening a window), so we zip them and upload them as layers.

1. Selenium

mkdir selenium
cd selenium
mkdir python
cd python
pip install selenium -t .
cd ../
zip -r selenium.zip ./python

Upload the created zip file to the layer as it is.

2. Chrome driver and headless-chromium

# -L follows redirects (GitHub release downloads are redirected)
curl -L https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-55/stable-headless-chromium-amazonlinux-2017-03.zip > headless-chromium.zip
curl -L https://chromedriver.storage.googleapis.com/2.43/chromedriver_linux64.zip > chromedriver.zip

Unzip the two resulting zip files and put them together in a headless-chrome folder. Then zip the headless-chrome and upload it to the layer.
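
If you would rather script this step than unzip and zip by hand, here is a minimal sketch in Python using only the standard library. The function name `build_layer_zip` and the output file name are hypothetical. One caveat that motivates the `chmod` call: `zipfile.extractall` does not restore the executable bit, so the sketch sets it again before re-zipping.

```python
import os
import stat
import zipfile


def build_layer_zip(archives, layer_dir, out_zip):
    """Extract each archive into layer_dir, then zip layer_dir as one layer."""
    os.makedirs(layer_dir, exist_ok=True)
    for archive in archives:
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(layer_dir)

    # Store paths relative to the parent so the folder name is kept in the zip
    parent = os.path.dirname(os.path.abspath(layer_dir))
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as out:
        for folder, _dirs, files in os.walk(layer_dir):
            for name in files:
                path = os.path.join(folder, name)
                # Mark the binaries executable; ZipFile.write records the file mode
                mode = os.stat(path).st_mode
                os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
                out.write(path, os.path.relpath(path, parent))


# Possible usage with the two downloads above (not executed here):
# build_layer_zip(["headless-chromium.zip", "chromedriver.zip"],
#                 "headless-chrome", "headless-chrome-layer.zip")
```

The output zip should be written outside `layer_dir`, otherwise it would be picked up by the walk.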

3. Apply layer to function

Click "Layers" below the function, then add the two layers with the "Add a layer" button.

※Caution※

- When I set the runtime of the Lambda function to Python 3.8, Chromedriver didn't work (cause unknown), so I recommend setting the runtime to Python 3.6 or 3.7.
- headless-chromium and Chromedriver will not work unless their versions are compatible, so even the latest versions obtained from here may not work together.

Small code changes for Lambda

I had never used any AWS tool other than Cloud9, so I got the code working on Lambda by imitating examples from various sites. Thanks to those who came before me.

import json
import os
from selenium import webdriver
from selenium.webdriver.common.by import By
import time
import random

def lambda_handler(event, context):
    body_temp = str(36 + random.randint(1,7)/10)
    url = 'https://docs.google.com/forms/d/e/1FAIpQLScGgZ8dsBkcSVutvW3JgDLqy3pIEKk12ucjiA8mNQrKopILog/viewform?usp=pp_url&entry.1534939278=Tomonori Arakawa&entry.511939456='+body_temp
    options = webdriver.ChromeOptions()
    options.binary_location = '/opt/headless-chrome/headless-chromium'
    # If you don't add these 4 options, Chrome won't start and you will get an error.
    options.add_argument('--headless') # Run Chrome without a display
    options.add_argument('--no-sandbox') # Launch Chrome outside the sandbox
    options.add_argument('--single-process') # Use a single process instead of one process per tab/site
    options.add_argument('--disable-dev-shm-usage') # Put shared-memory files in a temporary directory instead of /dev/shm
    driver = webdriver.Chrome('/opt/headless-chrome/chromedriver',options = options)
    driver.implicitly_wait(1)
    driver.get(url)

    # Click the element at the given XPath
    def click(xpath):
        driver.find_element(By.XPATH, xpath).click()

    # Type text (email address or password) into the element at the given XPath
    def insert_pw(xpath, text):
        driver.find_element(By.XPATH, xpath).send_keys(text)

    moving_login_button = '/html/body/div[2]/div/div[2]/div[3]/div[2]'
    time.sleep(2)
    # If you need to log in with your Google account, log in
    if driver.find_elements(By.XPATH, moving_login_button):
        click(moving_login_button)
        # Set the Google account email address in the environment variable MY_GMAIL
        login_id = os.environ['MY_GMAIL']
        login_id_xpath = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[1]/div/form/span/section/div/div/div[1]/div/div[1]/div/div[1]/input'
        login_id_button = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[2]/div/div[1]/div/div'
        insert_pw(login_id_xpath, login_id)
        click(login_id_button)
        time.sleep(1)
        # Set the Google account password in the environment variable MY_PASSWORD
        login_pw = os.environ['MY_PASSWORD']
        login_pw_xpath = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[1]/div/form/span/section/div/div/div[1]/div[1]/div/div/div/div/div[1]/div/div[1]/input'
        login_pw_button = '/html/body/div[1]/div[1]/div[2]/div/div[2]/div/div/div[2]/div/div[2]/div/div[1]/div/div'
        insert_pw(login_pw_xpath, login_pw)
        time.sleep(1)
        click(login_pw_button)
    time.sleep(1)
    submit_button = '//*[@id="mG61Hd"]/div[2]/div/div[3]/div[1]/div/div'
    click(submit_button)
    # quit must actually be called (with parentheses) to shut Chrome down
    driver.quit()
    return {
        'statusCode': 200,
        'body': json.dumps('Form submission success!!')
    }
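
The comments in the handler ask for the Google account email address and password to be set in the environment variables MY_GMAIL and MY_PASSWORD. As a minimal sketch, they could be read through a small helper that fails with a clear message when one of them has not been set in the Lambda configuration. The helper name `load_credentials` is hypothetical.

```python
import os


def load_credentials(env=None):
    """Return (email, password) from MY_GMAIL and MY_PASSWORD, or raise if unset."""
    env = os.environ if env is None else env
    missing = [key for key in ("MY_GMAIL", "MY_PASSWORD") if not env.get(key)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return env["MY_GMAIL"], env["MY_PASSWORD"]
```

Inside the handler this could be used as `login_id, login_pw = load_credentials()`. Accepting a mapping as an argument makes it easy to try the function locally without touching the real environment.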

Important points

- If you do not add the Chrome startup options `--headless`, `--no-sandbox`, `--single-process`, and `--disable-dev-shm-usage`, Chrome will not start normally on Lambda and an error will occur. For more information on each option, please see here.
- Files uploaded to a layer are placed under the /opt directory. When specifying a path, write it in the form /opt/<directory name>/...
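
When Chrome refuses to start, one thing worth ruling out is that the layer files are not where the code expects them, or are not executable. A minimal diagnostic sketch follows; the function name `find_unusable_binaries` is hypothetical, and the two paths are the ones used in the handler in this article.

```python
import os

LAYER_BINARIES = (
    "/opt/headless-chrome/headless-chromium",
    "/opt/headless-chrome/chromedriver",
)


def find_unusable_binaries(paths=LAYER_BINARIES):
    """Return the paths that are missing or not executable."""
    return [
        path for path in paths
        if not (os.path.isfile(path) and os.access(path, os.X_OK))
    ]
```

Calling `print(find_unusable_binaries())` at the top of the handler would show any problem paths in the function's log output; an empty list means both files were found and are executable.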

Set a trigger in CloudWatch Events

On the function screen, click Add Trigger and select EventBridge (CloudWatch Events) from the dropdown.
For the rule, choose "Create a new rule" and enter any rule name. Set the rule type to a schedule expression, and since this time it should run at 5 am every day, enter `cron(0 20 ? * * *)`. (Note that the schedule is evaluated in UTC, so specify the time 9 hours earlier than Japan time.) Enable the trigger and click Add. (See here for how to write cron.)
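
The 9-hour shift is easy to get wrong by hand, so here is a minimal sketch that derives the schedule expression from a local time using the standard library. The function name `daily_cron_utc` is hypothetical, and the fixed UTC+9 offset is an assumption matching the 9-hour difference mentioned above.

```python
from datetime import datetime, timedelta, timezone

# Assumption: local time is a fixed UTC+9 offset (no daylight saving time)
JST = timezone(timedelta(hours=9))


def daily_cron_utc(hour, minute=0, tz=JST):
    """Return a daily schedule expression in UTC for a local hour and minute."""
    # The date is arbitrary; only the converted hour and minute are used
    local = datetime(2020, 1, 1, hour, minute, tzinfo=tz)
    utc = local.astimezone(timezone.utc)
    return f"cron({utc.minute} {utc.hour} ? * * *)"


print(daily_cron_utc(5))  # cron(0 20 ? * * *)
```

Because the expression runs every day, the fact that 20:00 UTC falls on the previous calendar day does not matter here; it would matter if the rule were restricted to particular days of the week.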

Test

Finally, let's test whether it works. Click Test on the Lambda function screen. It looks fine.

In conclusion

I skip the temperature measurement in the morning, but please be assured that I measure it properly before going to bed.

References

https://masakimisawa.com/selenium_headless-chrome_python_on_lambda/
https://github.com/heroku/heroku-buildpack-google-chrome/issues/56
https://qiita.com/mishimay/items/afd7f247f101fbe25f30