Automating My Car Search - Python & GitHub Actions
I was rear-ended a few months ago. Luckily, no one was hurt, but the car was totaled and it was time to search for a vehicle in a tough car market. Inventory was very limited everywhere due to COVID, and buyers were flush with cash: high demand and low supply.
I was scouring my local Toyota websites daily until I built the following solution to automatically scrape the listings and productionize the script with GitHub Actions.
Goal
- Scrape my local Toyota website for all RAV4 listings.
- Send myself the prices of each car via SMS text message, sorted from lowest price to highest.
Purpose
- Monitor the inventory and fluctuating prices so that I could quickly put down a deposit on a car.
Python Procedure
Identifying HTML Values
First, I needed to find the HTML values I was after on my Toyota dealer's website.
I inspected the MSRP value on the inventory page and found:
<div class = "price_value"> $34,889
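As a quick sanity check, the same class lookup can be tried offline against a saved snippet before any request is sent. The sketch below is not the method used in this script (that is Beautiful Soup, covered next); it swaps in Python's built-in html.parser so it runs with no extra installs. PriceValueParser and SAMPLE_HTML are hypothetical names, and the markup is made up for illustration, so a real dealer page will likely differ.

```python
from html.parser import HTMLParser


class PriceValueParser(HTMLParser):
    """Collect the text inside every <div class="price_value"> element."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # greater than 0 while inside a matching div
        self.buffer = []  # text chunks for the current match
        self.prices = []  # cleaned text of each finished match

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1  # a nested div inside a match
        elif "price_value" in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                # The matching div just closed: store its stripped text
                self.prices.append("".join(self.buffer).strip())
                self.buffer = []

    def handle_data(self, data):
        if self.depth > 0:
            self.buffer.append(data)


# Hypothetical markup modeled on the snippet above
SAMPLE_HTML = """
<div class="vehicle"><div class="price_value"> $34,889
</div></div>
<div class="vehicle"><div class="price_value"> $36,120
</div></div>
"""

parser = PriceValueParser()
parser.feed(SAMPLE_HTML)
print(parser.prices)
```

If the printed list contains the prices you expect, the class name is probably the right hook for the scraper.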
Requests and Beautiful Soup
Easy enough. The start of the script:
- Sends a basic GET request
- Creates a Beautiful Soup object
- Grabs all div elements with the class name of price_value
- De-duplicates the returned items using the set() data structure
import requests
from bs4 import BeautifulSoup

def get_bike_items(url):
    # Use the requests library for a simple GET request
    response = requests.get(url)
    # Build the bs4 soup object
    html_soup = BeautifulSoup(response.text, 'html.parser')
    # Find all divs with a class name of price_value
    all_cars = html_soup.find_all('div', class_='price_value')
    # De-duplicate the returned elements
    all_cars = set(all_cars)
Some light cleaning is needed for the returned strings:
- Loop through all cars
- Replace new lines and strip white space
- Append back to a new list
- Sort the list
- Return the list
Light Cleaning
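    # Continuing inside get_bike_items(url)
    cars_text = []
    for car in all_cars:
        text = car.text
        # Remove new lines and surrounding white space
        clean_text = text.replace("\n", "").strip()
        cars_text.append(clean_text)
    # Sort the cleaned price strings
    cars = sorted(cars_text)
    return cars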
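One caveat with calling sorted() on strings: it compares character by character, so a price like '$9,995' would land after '$34,889'. If prices with different digit counts can show up, a numeric sort key is one option. This is a minimal sketch, assuming whole-dollar prices formatted like the snippet above (cents would be folded into the number); price_to_int and sort_prices are hypothetical helpers, not part of the original script.

```python
import re


def price_to_int(price_text):
    """Convert a price string such as '$34,889' to the integer 34889."""
    digits = re.sub(r"[^\d]", "", price_text)
    if not digits:
        raise ValueError(f"no digits found in {price_text!r}")
    return int(digits)


def sort_prices(price_texts):
    """Return the price strings ordered from lowest to highest price."""
    return sorted(price_texts, key=price_to_int)


# Hypothetical values: plain string sorting would put '$9,995' last
sample = ["$34,889", "$9,995", "$112,450", "$36,120"]
print(sort_prices(sample))
```

Swapping sorted(cars_text) for something like sort_prices(cars_text) would keep the rest of the function unchanged.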
Constructing String Message
I had two URLs to pass into this function, one for hybrid and one for regular gasoline RAV4s, but I wanted the results from both pages sent in one text message.
- Loop over the array of URLs
- The first URL in the list was for hybrid RAV4s, so I started sms_text with 'HYBRIDs || ' and added REG to the end of the first item in the array (index 0)
- Join the returned list of strings into one string, separated by ||, using '||'.join(get_bike_items(url))
urls = ['https://www.sheehytoyotafredericksburg.com/new-toyota-rav4-hybrid-fredericksburg-va',
        'https://www.sheehytoyotafredericksburg.com/new-toyota-rav4-fredericksburg-va']
sms_text = 'HYBRIDs || '
for index, url in enumerate(urls):
    if index == 0:
        # The first URL is the hybrid page; the regular section starts after it
        sms_text += '||'.join(get_bike_items(url)) + '\nREG ||'
    else:
        sms_text += '||'.join(get_bike_items(url))
# Call send() (shown in the next section) to send the SMS.
# In the actual script, send() must be defined before this line runs.
send(sms_text)
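The index check works for two URLs, but it ties the labels to list positions. A variation that pairs each label with its prices might read a little more clearly. This is only a sketch: build_sms_text is a hypothetical helper, and the prices are made-up stand-ins for what get_bike_items would return.

```python
def build_sms_text(sections):
    """Join labeled price lists into one message, one line per label."""
    lines = []
    for label, prices in sections:
        lines.append(f"{label} || " + "||".join(prices))
    return "\n".join(lines)


# Hypothetical prices standing in for the scraped results
sections = [
    ("HYBRIDs", ["$34,889", "$36,120"]),
    ("REG", ["$29,450", "$31,075"]),
]
print(build_sms_text(sections))
```

Adding a third page would then only mean adding another (label, prices) pair.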
SMS Message
Lastly, I just needed to send the SMS message!
- Below is the function for sending an SMS message. It works by using an email account to send the message to the phone carrier's email-to-SMS gateway.
import smtplib

carriers = {
    'att': '@mms.att.net',
    'tmobile': '@tmomail.net',
    'verizon': '@vtext.com',
    'sprint': '@page.nextel.com'
}

def send(message):
    # Replace the number with your own, or consider using an argument/dict for multiple people.
    to_number = 'xxxxxxxxxx{}'.format(carriers['xxxxx'])
    auth = ('xxx@gmail.com', 'xxxpass')
    # Establish a secure session with Gmail's outgoing SMTP server using your Gmail account
    server = smtplib.SMTP("smtp.gmail.com", 587)
    server.starttls()
    server.login(auth[0], auth[1])
    # Send text message through SMS gateway of destination number
    server.sendmail(auth[0], to_number, message)
    # Close the connection to the SMTP server
    server.quit()
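Since this script ends up in a repository, it may be worth keeping the email password out of the source. One possible variation reads the credentials from environment variables, which a scheduler such as GitHub Actions can populate from repository secrets. Treat this as a sketch under assumptions: the variable names SMS_USER, SMS_PASS, SMS_NUMBER and SMS_CARRIER are hypothetical, as are the helpers gateway_address and build_message.

```python
import os
import smtplib
from email.message import EmailMessage

CARRIERS = {
    "att": "@mms.att.net",
    "tmobile": "@tmomail.net",
    "verizon": "@vtext.com",
    "sprint": "@page.nextel.com",
}


def gateway_address(number, carrier):
    """Build the carrier gateway address for a phone number."""
    return f"{number}{CARRIERS[carrier]}"


def build_message(sender, recipient, body):
    """Wrap the text in an EmailMessage so the headers are set properly."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(body)
    return msg


def send(body):
    """Send the text using credentials read from environment variables.

    The SMS_* variable names are hypothetical; use whatever names you set.
    """
    user = os.environ["SMS_USER"]
    password = os.environ["SMS_PASS"]
    recipient = gateway_address(os.environ["SMS_NUMBER"], os.environ["SMS_CARRIER"])
    # The context manager closes the connection even if sending fails
    with smtplib.SMTP("smtp.gmail.com", 587) as server:
        server.starttls()
        server.login(user, password)
        server.send_message(build_message(user, recipient, body))
```

Splitting out the two small helpers also makes the address and message construction easy to check without sending anything.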
GitHub Actions Automation
After the script was running, I wanted to be able to send these messages to myself automatically twice a day.
GitHub Actions works very well for this type of use case. It's essentially a serverless framework for executing code. I plan to write another article that goes deeper into Actions.
For now, to get the above script running:
- Push the code to a private GitHub repository
- Create a .yml file in your repository with the path .github/workflows/main.yml
The structure of the YAML is fairly straightforward.
GitHub Action YAML File
# Name of the Action
name: car_scrape_sms
# "on" is the trigger for when this will be executed
on:
  # Here I am using a basic cron schedule (GitHub evaluates cron times in UTC)
  schedule:
    - cron: '15 14 * * *'
    - cron: '15 2 * * *'
jobs:
  upload:
    # The execution environment of the runner
    runs-on: ubuntu-latest
    steps:
      # Checking the repository out to the execution env
      - uses: actions/checkout@v3
      - name: Set up Python
        # Installs Python on the env
        uses: actions/setup-python@v3
        with:
          python-version: '3.10'
      - name: Install dependencies
        # CLI arguments to execute on the runner
        run: |
          pip install requests
          pip install beautifulsoup4
      - name: run main
        run: |
          python main.py
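One detail that is easy to miss: scheduled workflows on GitHub Actions use UTC for their cron expressions, so the two entries above do not fire at 2:15 local time. The sketch below converts them for a reader on US Eastern daylight time, which is an assumption based on the dealer's location; the fixed UTC-4 offset and the helper cron_utc_to_local are hypothetical and ignore the switch to standard time.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset chosen for illustration; it ignores daylight saving changes
EDT = timezone(timedelta(hours=-4), "EDT")


def cron_utc_to_local(minute, hour, local_tz):
    """Return the local HH:MM for a daily cron entry given in UTC."""
    # The date is arbitrary; only the time of day matters here
    utc_time = datetime(2022, 6, 1, hour, minute, tzinfo=timezone.utc)
    return utc_time.astimezone(local_tz).strftime("%H:%M")


# The (minute, hour) pairs from the two cron lines above
for minute, hour in [(15, 14), (15, 2)]:
    local = cron_utc_to_local(minute, hour, EDT)
    print(f"{hour:02d}:{minute:02d} UTC -> {local} EDT")
```

Under that assumption, the two runs land mid-morning and late evening local time.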
GitHub Action Keywords Explained
- name: is the name of your Action run
- on: is the keyword to trigger your workflow; here I am using a cron schedule
- jobs: starts the main entry into the workflow
- runs-on: is the platform OS, essentially the container (runner) environment that your job will run on
- steps: are sections to categorically separate pieces of your Action
  - name: of your step
  - uses: can call other actions. Here I am using another action that checks out my repository to the container environment that this will be executed on. Essentially, it copies all of the scripts in the repo to the execution env. I am also setting up Python in the container env with actions/setup-python@v3
  - with: is passing a parameter to specify the version of Python for the above action
  - run: here we can specify the commands to run on the runner. You can treat this as the command line of the runner.
This barely scratches the surface of the power of GitHub Actions!!
Action Run Example
Below is an example of an Action run for the above scripts.
Output
Finally, here is the output that gets sent to my phone.