Learn the basics of APIs and how they can be leveraged for SEO and marketing. Chock-full of Python code examples.
The URL to the GitHub gist link on slide 54 has changed to the following:
https://gist.github.com/pshapiro/a86dc340f57c38fc22d0545ddec1fc9e
7. #TTTLIVE19
Basically, APIs provide a way to interface with an external web service. This enables automation, lets you incorporate third-party systems into your own application, and allows you to expand both systems by combining their services and features.
9. #TTTLIVE19
HTTP is the protocol that facilitates communication between the client computer and the server computer via requests and responses.
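That request/response exchange can be sketched in plain text. The host and path below are illustrative placeholders, not a real endpoint:

```python
# A minimal sketch of the HTTP request/response cycle.
# The host and path are illustrative placeholders.

def build_get_request(host, path):
    """Compose the raw text of a simple HTTP GET request."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Accept: text/html\r\n"
        "\r\n"  # a blank line ends the headers
    )

print(build_get_request("example.com", "/catalogue"))
# The server replies with a status line, headers, and a body, e.g.:
#   HTTP/1.1 200 OK
#   Content-Type: text/html
#   ...
```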
10. #TTTLIVE19
CRUD Operations: Create, Read, Update, Delete

Operation         SQL      HTTP                 RESTful Web Services
Create            INSERT   PUT / POST           POST
Read (Retrieve)   SELECT   GET                  GET
Update (Modify)   UPDATE   PUT / POST / PATCH   PUT
Delete (Destroy)  DELETE   DELETE               DELETE

https://en.wikipedia.org/wiki/Create,_read,_update_and_delete
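The RESTful column of that table can be expressed directly in code. A small sketch, mirroring the conventional CRUD-to-HTTP-method mapping above:

```python
# CRUD operations mapped to the HTTP methods a RESTful web service
# conventionally uses, mirroring the table above.
REST_METHODS = {
    "Create": "POST",
    "Read": "GET",
    "Update": "PUT",
    "Delete": "DELETE",
}

for operation, method in REST_METHODS.items():
    print(f"{operation}: {method}")
```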
11. #TTTLIVE19
The interaction between client and server can
be facilitated via several structured methods
(sometimes referred to as verbs).
12. #TTTLIVE19
GET and POST are the most common methods
and commonly used in conjunction with web
APIs.
14. #TTTLIVE19
• “GET is used to request data from a
specified resource.”
• “POST is used to send data to a server to
create/update a resource.”
https://www.w3schools.com/tags/ref_httpmethods.asp
15. #TTTLIVE19
• “PUT is used to send data to a server to
create/update a resource.”
• “DELETE method deletes the specified
resource.”
https://www.w3schools.com/tags/ref_httpmethods.asp
17. #TTTLIVE19
APIs are a little bit like this antiquated ordering system…
1. You need to look at available inventory. You look at Spice Company’s catalogue via the GET method. This gives you a list of products you can order.
2. Once you know what you would like to purchase, your internal system marks it down according to some pre-defined business logic (in the form of item numbers and corresponding quantities).
3. Your program places an order by sending this payload to the corresponding API endpoint using the POST method, and you receive the product at your physical address sometime after.
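Those three steps could be sketched like this. The endpoint paths and payload fields are hypothetical, and the session object is injected so any HTTP client (such as a `requests.Session`) can be plugged in:

```python
# A sketch of the catalogue-then-order flow. The endpoints and payload
# fields are hypothetical; pass in a requests.Session (or any object
# with .get/.post methods) to talk to a real API.

def fetch_catalogue(session, base_url):
    # Step 1: GET the list of available products
    return session.get(base_url + "/catalogue").json()

def build_order(wanted):
    # Step 2: apply your business logic - item numbers and quantities
    return {"items": [{"item_number": n, "quantity": q}
                      for n, q in wanted.items()]}

def place_order(session, base_url, payload):
    # Step 3: POST the payload to the ordering endpoint
    return session.post(base_url + "/orders", json=payload)
```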
21. #TTTLIVE19
Parse the XML
board games
board games for kids
board games for adults
board games near me
board games online
board games list
board games walmart
board games boston
board games 2018
board games for toddlers
29. #TTTLIVE19
Full Python Script (Google Autosuggest/XML):
import requests
import xml.etree.ElementTree as ET

boardgames = ["Gaia Project", "Great Western Trail", "Spirit Island"]
for x in boardgames:
    # output=toolbar returns the suggestions as XML
    apiurl = "http://suggestqueries.google.com/complete/search"
    r = requests.get(apiurl, params={"output": "toolbar", "hl": "en", "q": x})
    tree = ET.fromstring(r.content)
    # Each <suggestion> element carries the suggested query in its "data" attribute
    for child in tree.iter('suggestion'):
        print(child.attrib['data'])
31. #TTTLIVE19
Combine Them Together?
import requests
import xml.etree.ElementTree as ET
import json

boardgames = ["board game", "bgg", "board game geek"]
for x in boardgames:
    suggest_url = "http://suggestqueries.google.com/complete/search?output=toolbar&hl=en&q=" + x
    r = requests.get(suggest_url)
    tree = ET.fromstring(r.content)
    for child in tree.iter('suggestion'):
        print(child.attrib['data'])
        # Look up monthly search volume for each suggestion in GrepWords
        grep_url = "http://api.grepwords.com/lookup?apikey=key&q=" + child.attrib['data']
        r = requests.get(grep_url)
        parsed_json = json.loads(r.text)
        try:
            print(parsed_json[0]['gms'])
        except (KeyError, IndexError):
            print("No data available in GrepWords.")
37. #TTTLIVE19
import requests
import json
import xml.etree.ElementTree as ET
import time

testurls = ["https://searchwilderness.com/", "https://trafficthinktank.com/", "https://searchengineland.com/"]
for x in testurls:
    # Kick off a first-view-only WebPageTest run with Lighthouse enabled
    apiurl = "http://www.webpagetest.org/runtest.php?fvonly=1&k=KEY&lighthouse=1&f=xml&url=" + x
    r = requests.get(apiurl)
    tree = ET.fromstring(r.content)
    for child in tree.findall('data'):
        wpturl = child.find('jsonUrl').text
        print(wpturl)
    # Poll the results URL until the test finishes (statusCode 100 means still running)
    ready = True
    while ready:
        r = requests.get(wpturl)
        parsed_json = json.loads(r.text)
        try:
            if parsed_json['data']['statusCode'] == 100:
                print("Not yet ready. Trying again in 20 seconds.")
                time.sleep(20)
            else:
                ready = False
        except KeyError:
            ready = False
    print(x + "\n")
    print("Lighthouse Average First Contentful Paint: " +
          str(parsed_json['data']['average']['firstView']['chromeUserTiming.firstContentfulPaint']))
43. #TTTLIVE19
from mozscape import Mozscape
import pandas as pd
import numpy as np
import requests
import time

def divide_chunks(l, n):
    # Yield successive n-sized chunks from list l
    for i in range(0, len(l), n):
        yield l[i:i + n]

client = Mozscape('access_id', 'secret_key')
csv = pd.read_csv('./all_outlinks.csv', skiprows=1)
links = csv[csv['Type'] == 'AHREF']
# filter out CDNs, self-references, and other known cruft
links = links[~links['Destination'].str.match('https?://boardgamegeek.com.*')]
# Distill each destination URL down to its domain
Domains = links['Destination'].replace(to_replace="(.*://)?([^/?]+).*", value=r"\2", regex=True)
x = list(divide_chunks(Domains.unique().tolist(), 5))
df = pd.DataFrame(columns=['pda', 'upa', 'url'])
for vals in x:
    # Request Domain Authority and Page Authority for a batch of 5 URLs
    da_pa = client.urlMetrics(vals, Mozscape.UMCols.domainAuthority | Mozscape.UMCols.pageAuthority)
    i = 0
    for y in da_pa:
        y['url'] = vals[i]
        i = i + 1
        df = df.append(y, ignore_index=True)
    print("Processing a batch of 5 URLs. Total URLs: " + str(len(Domains.unique())))
    time.sleep(5)
print(df)
https://github.com/seomoz/SEOmozAPISamples/tree/master/python
45. #TTTLIVE19
Schedule to run monthly with Cron and backup to SQL database:
https://searchwilderness.com/gwmt-data-python/
JR Oakes’ BigQuery vision:
http://bit.ly/2vmjDe8
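A crontab entry for a monthly schedule might look like the following; the interpreter and script paths are placeholders for wherever your script lives:

```
# Run at 02:00 on the first day of every month
0 2 1 * * /usr/bin/python3 /path/to/backup_script.py
```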
54. #TTTLIVE19
1. Takes the outlink report from Screaming Frog
2. Distills URLs to domains
3. Runs the Moz Linkscape API against the list for PA & DA
4. Checks the HTTP status code
5. Runs a WHOIS API to see if the domain is available
https://gist.github.com/pshapiro/819cd172ff8fe576f2a4e1f74395ec47
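Steps 2 and 4 of that pipeline can be sketched as follows. The regex mirrors the domain-distilling approach used in the Moz script earlier; the session is injected so the status check works with any HTTP client, and the WHOIS step is left out here because it depends on which WHOIS API you use:

```python
import re

def distill_domain(url):
    # Step 2: reduce a full URL to its domain
    return re.sub(r"(.*://)?([^/?]+).*", r"\2", url)

def check_status(session, domain):
    # Step 4: check the HTTP status code for a domain.
    # session is any object with a .head method (e.g. requests.Session).
    return session.head("http://" + domain, allow_redirects=True).status_code

print(distill_domain("https://boardgamegeek.com/boardgame/12333"))
# -> boardgamegeek.com
```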