Intro
In the previous SerpApi Async Requests with Pagination using Python blog post, we covered how to make async requests with SerpApi's pagination, how to use the Search Archive API, and how to use `Queue`.

In this blog post we'll cover how to make direct requests to `serpapi.com/search.json` without using SerpApi's `google-search-results` Python client. Making direct requests to SerpApi this way gives a slightly faster response time than the Python client's batch async search feature, which uses `Queue`. For example, 50 requests can complete in under 10 seconds, depending on your internet speed.
In a following blog post, we'll cover how to add pagination to the code shown below.
Subject of test: YouTube Search Engine Results API.
Test includes: 50 async search queries.
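Before the full script, here's a minimal sketch of the URL a single direct request targets. The parameter values below are placeholders (the API key in particular), and only the standard library is used, just to show how the query string is assembled:

```python
from urllib.parse import urlencode

# placeholder parameters for a single direct request
params = {
    'api_key': '...',          # your SerpApi API key
    'engine': 'youtube',       # search engine to parse data from
    'search_query': 'python',  # search query
}

# build the full request URL; a GET request to this URL returns JSON
url = f"https://serpapi.com/search.json?{urlencode(params)}"
print(url)
```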
Code
You can check the code example in the online IDE:
```python
import aiohttp
import asyncio
import json
import time


async def fetch_results(session, query):
    params = {
        'api_key': '...',       # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',    # search engine to parse data from
        'device': 'desktop',    # from which device to parse data
        'search_query': query,  # search query
        'no_cache': 'true'      # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }

    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': result.get('channel', {}).get('name'),
            })

    return data


async def main():
    # 50 queries
    queries = [
        'burly',
        'creator',
        'doubtful',
        'chance',
        'capable',
        'window',
        'dynamic',
        'train',
        'worry',
        'useless',
        'steady',
        'thoughtful',
        'matter',
        'rotten',
        'overflow',
        'object',
        'far-flung',
        'gabby',
        'tiresome',
        'scatter',
        'exclusive',
        'wealth',
        'yummy',
        'play',
        'saw',
        'spiteful',
        'perform',
        'busy',
        'hypnotic',
        'sniff',
        'early',
        'mindless',
        'airplane',
        'distribution',
        'ahead',
        'good',
        'squeeze',
        'ship',
        'excuse',
        'chubby',
        'smiling',
        'wide',
        'structure',
        'wrap',
        'point',
        'file',
        'sack',
        'slope',
        'therapeutic',
        'disturbed'
    ]

    async with aiohttp.ClientSession() as session:
        tasks = []

        for query in queries:
            task = asyncio.ensure_future(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

    data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds')  # ~7.192448616027832 seconds


asyncio.run(main())
```
Code Explanation
Import libraries:
```python
import aiohttp  # to make a request
import asyncio  # to run requests concurrently
import json     # for printing data
import time     # to measure execution time
```
In the `fetch_results()` function we:

- create search `params` that will be passed to SerpApi when making the request.
- make an async session request, passing `params`, awaiting each `response`, and storing it in the `results` variable.
- check for an `'error'` in the `results`; otherwise iterate over `'video_results'` and store the extracted data in the `data` list.
- return the `list` with video data.
```python
async def fetch_results(session, query):
    params = {
        'api_key': '...',       # your serpapi api key: https://serpapi.com/manage-api-key
        'engine': 'youtube',    # search engine to parse data from
        'device': 'desktop',    # from which device to parse data
        'search_query': query,  # search query
        'no_cache': 'true'      # https://serpapi.com/search-api#api-parameters-serpapi-parameters-no-cache
    }

    async with session.get('https://serpapi.com/search.json', params=params) as response:
        results = await response.json()

    data = []

    if 'error' in results:
        print(results['error'])
    else:
        for result in results.get('video_results', []):
            data.append({
                'title': result.get('title'),
                'link': result.get('link'),
                'channel': result.get('channel', {}).get('name'),
            })

    return data
```
In the second, `main()` function we:

- create a `list` of `queries`. The queries could also come from a txt/csv/json file.
- open an `aiohttp.ClientSession()`.
- iterate over the queries and create `asyncio` tasks.
- process all of the tasks with `asyncio.gather(*tasks)`.
- flatten the `list` of per-query results and store it in the `data` variable.
- print the data.
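The flattening step is worth showing in isolation: `asyncio.gather()` returns one list per query, and a nested comprehension merges them into a single flat list. The sample sublists below are made up for illustration; `itertools.chain.from_iterable()` is an equivalent alternative:

```python
from itertools import chain

# each sublist stands in for the video data returned for one query
results = [
    [{'title': 'A'}, {'title': 'B'}],  # results for query 1
    [{'title': 'C'}],                  # results for query 2
]

# the comprehension used in main()
data = [item for sublist in results for item in sublist]

# equivalent flattening using itertools
data_chained = list(chain.from_iterable(results))

assert data == data_chained
print(data)  # [{'title': 'A'}, {'title': 'B'}, {'title': 'C'}]
```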
```python
async def main():
    queries = [
        'burly',
        'creator',
        'doubtful',
        # ...
    ]

    async with aiohttp.ClientSession() as session:
        tasks = []

        for query in queries:
            task = asyncio.ensure_future(fetch_results(session, query))
            tasks.append(task)

        start_time = time.time()
        results = await asyncio.gather(*tasks)
        end_time = time.time()

    data = [item for sublist in results for item in sublist]

    print(json.dumps(data, indent=2, ensure_ascii=False))
    print(f'Script execution time: {end_time - start_time} seconds')  # ~7.192448616027832 seconds


asyncio.run(main())
```
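Firing all 50 requests at once works fine here, but if you ever need to cap how many requests run at the same time (for example, to stay under a rate limit), an `asyncio.Semaphore` can wrap the fetch. The sketch below is self-contained and hypothetical: the `fetch()` coroutine sleeps instead of calling SerpApi, purely to demonstrate the limiting pattern:

```python
import asyncio


async def fetch(semaphore, query):
    # acquire the semaphore before doing any work; at most
    # 10 coroutines run the body at the same time
    async with semaphore:
        await asyncio.sleep(0.01)  # stand-in for the real HTTP request
        return f'results for {query}'


async def main():
    semaphore = asyncio.Semaphore(10)  # allow at most 10 concurrent requests
    queries = [f'query {i}' for i in range(50)]
    tasks = [asyncio.create_task(fetch(semaphore, q)) for q in queries]
    return await asyncio.gather(*tasks)  # results keep the order of queries


results = asyncio.run(main())
print(len(results))  # 50
```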
Conclusion
As you saw (and possibly tried yourself), this approach produces quite fast response times. Additionally, we can add pagination to it, which will be covered in the next blog post.