Thoughts and Deeds

Unbowed Unbent Unbroken

Want to Get Email Updates From a Store That Doesn't Provide Them?

I’ve been curious about learning leathercrafting for a while now, but I haven’t known where to start. Someone suggested I check out the Society for Creative Anachronism, but I couldn’t find a local chapter.

I found the very helpful Leathercraft subreddit and through them I discovered Tandy Leather.

The cool thing about Tandy Leather is they provide classes on leatherworking! Some are free and some cost money, but they are priced reasonably. A listing of their stores and upcoming classes is at http://www.tandyleather.com/en/leathercraft-classes.

Tandy does have an email list, but it’s not store specific, so I won’t get the upcoming courses for the store closest to me (Richmond, VA). There is a way to solve this though!

We’re going to scrape the page, grab the information for the Richmond store, email it to ourselves, and keep the script on an internet-connected server where we can set a cron job to run it weekly.

We’ll review how to do this in Python and Ruby.

Python

The script for this is very short and easy. For scraping we’re going to use BeautifulSoup. And to email the results to ourselves we’re going to use Mandrill.

We’re going to use the great requests library to make the connection to the page then put the response into a BeautifulSoup object.

At a command line (preferably in a virtualenv) let’s install what we need for now.

pip3 install beautifulsoup4
pip3 install requests
pip3 install html5lib

Then in our script let’s add:

import requests
import html5lib
from bs4 import BeautifulSoup

# fetch the class listing page
r = requests.get("http://www.tandyleather.com/en/leathercraft-classes")

# using the html5lib parser instead of the typical 
# html.parser
soup = BeautifulSoup(r.text, 'html5lib')
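Since this script will eventually run unattended, it may be worth making the fetch fail loudly instead of quietly parsing an error page. Here is a minimal sketch of that idea, not part of the script in this post. The names fetch_page and CLASSES_URL, the 10 second timeout, and the session parameter are all my own assumptions for illustration.

```python
import requests

CLASSES_URL = "http://www.tandyleather.com/en/leathercraft-classes"


def fetch_page(url=CLASSES_URL, session=None, timeout=10):
    """Return the page HTML, raising an exception on any HTTP error."""
    # `session` is a hypothetical hook so a stub can stand in during testing.
    http = session if session is not None else requests
    # Without a timeout, a stalled connection could hang the job indefinitely.
    response = http.get(url, timeout=timeout)
    # Turn 4xx/5xx responses into exceptions instead of parsing an error page.
    response.raise_for_status()
    return response.text
```

With something like this in place, soup = BeautifulSoup(fetch_page(), 'html5lib') would replace the two-step version above.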

Next we want to find the Richmond information on the page. The html for that looks like:

<a name="id_631"></a>

<hr>

<div class="store-class">
          <div>
      <span><strong>Richmond, VA #138 Leathercraft Classes</strong></span>
  </div>

  <div class="store-class-content"><p>
November 28, 2015 9:00A.M. - 4:00P.M.</p>
<p>
Black Friday Sale! No class. Look for our E-mail Blast, give us a call, or stop in the store early to check out our pre black Friday specials.</p>
<p>
&nbsp;</p>
<p>
December 5, 2015 10:00 A.M. - 1:00 P.M.</p>
<p>
Shadow Box: Come join us as we will be making a creative display case for small knickknacks, keepsakes, or leave them empty for use as wall decorations. We&rsquo;ll be cutting, forming, and adding your own personal flair to these elegant boxes.</p>
<p>
Cost: Retail $25, Gold/Elite $20</p>
<p>
&nbsp;</p>
<p>
December 12, 2015 10:00 A.M. &ndash; 1:00 P.M.</p>
<p>
Basic Carving: Have you recently starting or want to begin carving/tooling leather? Learn what the Basic 7 tools are and how to use them. We&rsquo;ll be carving a Sheridan style flower, which will utilize all seven tools.</p>
<p>
Cost: Free</p>
<p>
&nbsp;</p>
<p>
December 19, 2015 10:00 A.M. &ndash; 12:00 P.M.</p>
<p>
Valet Tray: need somewhere to toss your keys or change at the end of a long day? A leather Valet tray is the perfect solution. This classy catchall for the bedroom night stand or hallway/living room table. Personalize to make it your own. We&rsquo;ll be using snaps on it so it&rsquo;ll lay flat when not in use.</p>
<p>
Cost: Retail $25, Gold/Elite $20</p>
<p>
&nbsp;</p>
<p>
On December 18 &amp; 19, join us for our first Super Saturday Sale. Everything in the store will be on sale. You really don&rsquo;t want to miss this.</p>
</div>

  <div>
      Please contact store for class details.<br>
      Richmond138@tandyleather.com<br>
      Phone: 804-750-9970<br>
      Toll Free: 866-755-7090<br>
      9045 W. Broad St, #130<br>
      Henrico, VA 23294<br>
  </div>
</div>

Looking at the HTML, the only identifier for the Richmond block is <a name="id_631"></a>, which sits outside the div with class store-class, and there is no unique identifier for Richmond inside the div.store-class itself. So we have to find the anchor with name="id_631" and then grab the next div whose class is store-class.

We can do that with one line of code courtesy of BeautifulSoup.

rva = soup.find("a", attrs={"name": "id_631"}).find_next_sibling("div", class_="store-class")

That line finds the a whose name is id_631 then finds the next sibling that is a div whose class is store-class.

And that will give us the entire div with all of the information we want!
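One thing to watch for: if Tandy ever changes the page, soup.find returns None and the chained call raises an AttributeError. Below is a small sketch of a more defensive lookup. The SAMPLE_HTML snippet is a trimmed, made-up stand-in for the real page, find_store_block is a hypothetical helper name, and I use the built-in html.parser here only so the example runs without html5lib.

```python
from bs4 import BeautifulSoup

# A trimmed-down, hypothetical stand-in for the real page.
SAMPLE_HTML = """
<a name="id_630"></a>
<hr>
<div class="store-class"><strong>Some Other Store</strong></div>
<a name="id_631"></a>
<hr>
<div class="store-class"><strong>Richmond, VA #138 Leathercraft Classes</strong></div>
"""


def find_store_block(html, anchor_name="id_631"):
    """Return the div.store-class that follows the named anchor, or None."""
    soup = BeautifulSoup(html, "html.parser")
    anchor = soup.find("a", attrs={"name": anchor_name})
    if anchor is None:
        # The anchor is gone, so the page layout has probably changed.
        return None
    # Skips the <hr> and whitespace, stopping at the first matching div.
    return anchor.find_next_sibling("div", class_="store-class")


block = find_store_block(SAMPLE_HTML)
print(block.get_text(strip=True) if block else "Richmond block not found")
```

Returning None instead of crashing lets the script decide what to do, for example emailing a "page changed" notice instead of the class list.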

But there is a caveat. What we get back is a BeautifulSoup Tag object, not a string, so we need to convert it to a string before we can use it as the body of an email.

rva = str(rva)

That’s it!

We’ve got the information; now we just need to get it to ourselves.

For that I’m using Mandrill. Mandrill is a transactional email service made by MailChimp. Just like MailChimp, there is a free tier that will be more than enough for us.

Go to Mandrill, sign up, and get an API key.

At a command line let’s install the mandrill library.

pip3 install mandrill

Now to make our connection:

import mandrill

mandrill_client = mandrill.Mandrill("ENTER_API_KEY")

And let’s build out our message.

message = {
  'html': rva,
  'subject': 'Upcoming classes at Tandy Leather',
  'from_email': 'ENTER_FROM_EMAIL_ADDRESS',
  'to': [{'email': 'ENTER_TO_EMAIL_ADDRESS'}]
}

What we’re doing here is setting the HTML body of the message to the information we scraped from the web page. We give our email a subject, set the from address, and list the recipient.

Then we make a call to messages.send:

result = mandrill_client.messages.send(message=message)

The call returns a result, so if you don’t get the email you can print result and see what went wrong.
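As far as I understand the Mandrill API, messages.send returns a list with one dictionary per recipient, each carrying keys such as email, status, and reject_reason. Assuming that shape, here is a rough sketch of picking out the failures. The helper name failed_deliveries and the sample data are made up for illustration, and the set of "good" statuses is my assumption.

```python
def failed_deliveries(result):
    """Return the entries whose status is not a success-like value."""
    # Assumed success-like statuses; anything else is treated as a failure.
    ok_statuses = {"sent", "queued", "scheduled"}
    return [entry for entry in result if entry.get("status") not in ok_statuses]


# Hypothetical sample shaped like a messages.send return value.
sample_result = [
    {"email": "me@example.com", "status": "sent",
     "reject_reason": None, "_id": "abc123"},
    {"email": "bad@example.com", "status": "rejected",
     "reject_reason": "hard-bounce", "_id": "def456"},
]

for entry in failed_deliveries(sample_result):
    print(entry["email"], entry["status"], entry.get("reject_reason"))
```

In the real script you would pass result instead of sample_result. Since cron typically mails or logs whatever a job prints, printing the failures is often enough to notice them.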

Our final script looks like this:

#!/Users/jwhite/python_projects/tandyleather/env-tandyleather/bin/python

import requests
import html5lib
import mandrill
from bs4 import BeautifulSoup

mandrill_client = mandrill.Mandrill("ENTER_API_KEY")

r = requests.get("http://www.tandyleather.com/en/leathercraft-classes")

soup = BeautifulSoup(r.text, 'html5lib')

rva = soup.find("a", attrs={"name": "id_631"}).find_next_sibling("div", class_="store-class")

rva = str(rva)

message = {
  'html': rva,
  'subject': 'Upcoming classes at Tandy Leather',
  'from_email': 'ENTER_FROM_EMAIL_ADDRESS',
  'to': [{'email': 'ENTER_TO_EMAIL_ADDRESS'}]
}

result = mandrill_client.messages.send(message=message)

Notice that I included a shebang line at the top pointing at the Python interpreter inside our virtualenv. On our server, this lets a cron job run the script directly with the right interpreter and libraries, so it can email us the information automatically.

A cron tutorial is a little outside the scope of this post.

I did promise to show how to do this in Ruby also, didn’t I?

Ruby

The Ruby script is VERY similar to the Python version.

We’re going to need to install the following gems:

gem install mandrill-api
gem install httparty
gem install nokogiri

We then require our libraries:

require 'mandrill'
require 'nokogiri'
require 'httparty'

We use httparty to get the page:

response = HTTParty.get('http://www.tandyleather.com/en/leathercraft-classes')

Put that into a nokogiri object:

doc = Nokogiri::HTML(response.body)

Create our connection to Mandrill:

mandrill = Mandrill::API.new 'API_KEY'

We find the section we’re looking for, create a new empty string to hold the information, but then there’s an extra step with nokogiri.

id_631 = doc.css("a").select{|link| link['name'] == 'id_631'}

rva = ''

id_631.each do |item|
  hr = item.next_element
  rva = hr.next_element.inner_html
end

We have to loop over our id_631 to get the next element, but that’s an <hr>, so we have to get the next element after that which is the div.store-class.

We build the message the same way as before but using Ruby hashrocket syntax.

message = {
  "html" => rva,
  "subject" => "Upcoming classes at Tandy Leather",
  "from_name" => "Jody White",
  "to" => [{"email" => "ENTER_TO_EMAIL_ADDRESS"}],
  "from_email" => "ENTER_FROM_EMAIL_ADDRESS"
}

Then make the call to mandrill to send.

result = mandrill.messages.send message

The entire script is:

#!/Users/jwhite/.rbenv/shims/ruby

require 'mandrill'
require 'nokogiri'
require 'httparty'

response = HTTParty.get('http://www.tandyleather.com/en/leathercraft-classes')

doc = Nokogiri::HTML(response.body)

mandrill = Mandrill::API.new 'API_KEY'

id_631 = doc.css("a").select{|link| link['name'] == 'id_631'}

rva = ''

id_631.each do |item|
  hr = item.next_element
  rva = hr.next_element.inner_html
end

message = {
  "html" => rva,
  "subject" => "Upcoming classes at Tandy Leather",
  "from_name" => "Jody White",
  "to" => [{"email" => "ENTER_TO_EMAIL_ADDRESS"}],
  "from_email" => "ENTER_FROM_EMAIL_ADDRESS"
}

result = mandrill.messages.send message

And that’s it!

Summary

This was a quick and fun project to put together. And useful! I set my cron job to run every Sunday morning at 8am. I tested out my cron job and it was working when I set it a minute into the future, but the real test will be tomorrow morning!