Web scraping The Home Depot Product Info with Nodejs


What will be scraped


Full code

If you don't need an explanation, have a look at the full code example in the online IDE.

import dotenv from "dotenv";
import { config, getJson } from "serpapi";

dotenv.config();
config.api_key = process.env.API_KEY; //your API key from serpapi.com

const engine = "home_depot_product"; // search engine
const params = {
  product_id: "318783386", // The Home Depot identifier of a product
};

const getResults = async () => {
  const json = await getJson(engine, params);
  return json.product_results;
};

getResults().then((result) => console.dir(result, { depth: null }));

Why use The Home Depot Product API from SerpApi?

Using an API generally solves all or most of the problems you might encounter while creating your own parser or crawler. From a web scraping perspective, our API helps with the most painful ones:

  • It bypasses blocks from supported search engines by solving CAPTCHAs and dealing with IP blocks.

  • There is no need to create a parser from scratch and maintain it.

  • There is no need to pay for proxies and CAPTCHA solvers.

  • There is no need for browser automation when you have to extract large amounts of data faster.

Head to the Playground for a live and interactive demo.

Preparation

First, we need to create a Node.js* project and add npm packages serpapi and dotenv.

To do this, in the directory with our project, open the command line and enter:

$ npm init -y

And then:

$ npm i serpapi dotenv

*If you don't have Node.js installed, you can download it from nodejs.org and follow the installation documentation.

  • SerpApi package is used to scrape and parse search engine results using SerpApi. Get search results from Google, Bing, Baidu, Yandex, Yahoo, Home Depot, eBay, and more.

  • dotenv package is a zero-dependency module that loads environment variables from a .env file into process.env.

Next, we need to add a top-level "type" field with a value of "module" (that is, "type": "module") to our package.json file to allow using ES6 modules in Node.js.

With that, the Node.js environment for our project is set up, and we can move on to the step-by-step code explanation.

Code explanation

First, we need to import dotenv from the dotenv library, and config and getJson from the serpapi library:

import dotenv from "dotenv";
import { config, getJson } from "serpapi";

Then we apply some configuration: call the dotenv config() method and assign your SerpApi private API key to the global config object:

dotenv.config();
config.api_key = process.env.API_KEY; //your API key from serpapi.com

  • dotenv.config() will read your .env file, parse the contents, assign it to process.env, and return an object with a parsed key containing the loaded content or an error key if it failed.

  • config.api_key allows you to declare a global api_key value by modifying the config object.
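If the .env file is missing or the variable name is misspelled, process.env.API_KEY is simply undefined, and the problem only shows up later as a failed request. Here is a minimal sketch of an early check. Note that requireEnv is a hypothetical helper (it is not part of dotenv or serpapi), and it assumes the key is stored under the name API_KEY, as in the code above:

```javascript
// Hypothetical helper: read a required variable or fail fast with a clear message.
// `env` defaults to process.env, which dotenv.config() fills from the .env file.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(`Missing environment variable ${name}. Add it to your .env file.`);
  }
  return value.trim();
}

// Possible usage in the script above (assumption: the key is stored as API_KEY):
// config.api_key = requireEnv("API_KEY");
```

Passing env as a second argument is only a convenience for trying the helper without touching the real environment.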

Next, we specify the search engine and write the necessary search parameters for the request:

const engine = "home_depot_product"; // search engine
const params = {
  product_id: "318783386", // The Home Depot identifier of a product
};

You can use the following search parameters:

  • product_id: The Home Depot identifier of a product.

  • delivery_zip: the ZIP/postal code, used to filter the shipping products by a selected area.

  • store_id: the store ID, used to filter the products by a specific store only.

  • no_cache parameter will force SerpApi to fetch The Home Depot Product results even if a cached version is already present. A cache is served only if the query and all parameters are exactly the same. The cache expires after 1h. Cached searches are free and are not counted towards your searches per month. It can be set to false (default) to allow results from the cache, or true to disallow results from the cache. no_cache and async parameters should not be used together.

  • async parameter defines the way you want to submit your search to SerpApi. It can be set to false (default) to open an HTTP connection and keep it open until you get your search results, or true to just submit your search to SerpApi and retrieve the results later. In this case, you'll need to use our Searches Archive API to retrieve your results. async and no_cache parameters should not be used together. async should not be used on accounts with Ludicrous Speed enabled.
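The parameters listed above can be assembled in one place. The following is only a sketch: buildProductParams and its option names are hypothetical, the ZIP code is a placeholder value, and treating product_id as mandatory is our assumption based on the engine looking a product up by its ID:

```javascript
// Hypothetical helper that builds the params object for the "home_depot_product" engine.
// It only uses parameter names from the list above.
function buildProductParams({ productId, deliveryZip, storeId, noCache = false, isAsync = false }) {
  if (!productId) {
    // Assumption: the engine cannot look up a product without its ID.
    throw new Error("product_id is required");
  }
  if (noCache && isAsync) {
    // The list above says no_cache and async should not be used together.
    throw new Error("no_cache and async should not be used together");
  }
  const params = { product_id: String(productId) };
  if (deliveryZip) params.delivery_zip = String(deliveryZip);
  if (storeId) params.store_id = String(storeId);
  if (noCache) params.no_cache = true;
  if (isAsync) params.async = true;
  return params;
}

// Placeholder values for illustration only.
const exampleParams = buildProductParams({
  productId: "318783386",
  deliveryZip: "04401",
  noCache: true,
});
console.log(exampleParams);
// { product_id: '318783386', delivery_zip: '04401', no_cache: true }
```

The resulting object can be passed to getJson in place of the params constant from the code above.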

Next, we declare the function getResults that gets the data and returns it:

const getResults = async () => {
  ...
};

In this function, we get the JSON with results and return product_results from it:

const json = await getJson(engine, params);
return json.product_results;
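getResults returns json.product_results as is, so a response without that key resolves to undefined. Here is a hedged sketch of a stricter extraction step. pickProductResults is a hypothetical helper, and the top-level error field it checks is an assumption about the response shape, so verify it against the responses you actually receive:

```javascript
// Hypothetical helper: return product_results or throw a descriptive error.
function pickProductResults(json) {
  if (json === null || typeof json !== "object") {
    throw new TypeError("Expected the parsed JSON response to be an object");
  }
  // Assumption: a failed search carries a human-readable message in `error`.
  if (typeof json.error === "string") {
    throw new Error(`SerpApi returned an error: ${json.error}`);
  }
  if (!json.product_results) {
    throw new Error("The response does not contain product_results");
  }
  return json.product_results;
}

// Inside getResults this could replace `return json.product_results;`:
// return pickProductResults(json);
```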

And finally, we run the getResults function and print all the received information in the console with the console.dir method, which allows you to use an object with the necessary parameters to change default output options:

getResults().then((result) => console.dir(result, { depth: null }));

Output

{
   "product_id":"318783386",
   "title":"2500-Watt Recoil Start Ultra-Light Portable Gas and Propane Powered Dual Fuel Inverter Generator with CO Shield",
   "description":"The Champion Power Equipment 201122 2500-Watt Portable Inverter Generator with CO Shield is ideal for camping or tailgating. Weighing in at an ultralight 39 pounds, this model is one of the lightest 2500-watt inverters in the industry. Included are a covered 120V 20A household duplex outlet (5-20R) plus two handy 2.1A USB ports you can use to power your phone, laptop, or similar device. Just add oil (included 10W-30), and operate your Dual-Fuel generator right out of the box on gasoline or propane, and easily switch fuels with the fuel selector dial. When the 1.1-gallon tank of gasoline is full, the 79cc Champion engine produces 2500 starting watts and 1850 running watts and will run for up to 11.5 hours at 25% load. When using a 20-pound propane tank, it produces 2500 starting watts and 1665 running watts and will run for up to 34 hours at 25% load. CO Shield technology monitors the accumulation of carbon monoxide (CO), a poisonous gas produced by engine exhaust when the generator is running. If CO Shield detects unsafe elevated levels of CO gas, it automatically shuts off the engine. CO Shield is not a substitute for an indoor carbon monoxide alarm or for safe operation. DO NOT allow engine exhaust fumes to enter a confined area through windows, doors, vents or other openings. Generators must ALWAYS be used outdoors, far away from occupied buildings with engine exhaust pointed away from people and buildings. Meets the requirements of ANSI/PGMA G300-2018.",
   "link":"https://www.homedepot.com/p/Champion-Power-Equipment-2500-Watt-Recoil-Start-Ultra-Light-Portable-Gas-and-Propane-Powered-Dual-Fuel-Inverter-Generator-with-CO-Shield-201122/318783386",
   "upc":"817198025742",
   "model_number":"201122",
   "rating":"4.7821",
   "reviews":"1606",
   "price":899,
   "highlights":[
      "Ultra-quiet operation, Dual fuel flexibility (gas or propane)",
      "Engine oil and oil funnel included",
      "2500 starting watts and 1850 running watts"
   ],
   "brand":{
      "name":"Champion Power Equipment",
      "link":"https://www.homedepot.com/b/Sports-Outdoors-Tailgating-Gear-Tailgating-Portable-Gas-Power/Champion-Power-Equipment/N-5yc1vZcbwtZ9xs"
   },
   "images":[
      [
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_65.jpg",
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_100.jpg",
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_145.jpg",
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_300.jpg",
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_400.jpg",
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_600.jpg",
         "https://images.thdstatic.com/productImages/95951f39-703a-4efc-b12b-ad3a3276e73c/svn/champion-power-equipment-inverter-generators-201122-64_1000.jpg"
      ],
      ...and other images
   ],
   "bullets":[
      "Need help with service or repair? Champion Power Equipment is available to help 24 hours a day, 7 days a week. Call us at 1-877-338-0999.",
      "Operate your 2500-Watt portable generator right out of the box on either gasoline or propane, plus at only 39 lbs., this inverter is 1 of the lightest 2500-Watt inverters in the industry",
      "With an ultra-quiet 53 dBA from 23 ft., enjoy 2500-Watt starting watt, 1850-Watt running watt and up to 11.5-hours run time on gasoline and 1665-Watt running watt and up to 34-hours on propane",
      "Optional, sold-separately parallel kit enables this inverter to connect with another 2500-Watt Champion inverter to double your output power",
      "Includes a 120-Volt 20 Amp household duplex outlet (5-20R) with clean electricity (less than 3% THD) and 2 convenient USB ports",
      "Includes 3-year limited warranty with free lifetime technical support from dedicated experts",
      "<br /><br /><center><img src=\"https://inlinecontent.thdstatic.com/28I/CHAMPION POWER EQUIPMENT/Champinline.jpg\"></center><br />"
   ],
   "info_and_guides":[
      {
         "title":"SDS",
         "link":"https://images.thdstatic.com/catalog/pdfImages/15/15de2a59-eb0b-443f-85dd-f6b773d618df.pdf"
      },
      {
         "title":"Replacement Part List",
         "link":"https://images.thdstatic.com/catalog/pdfImages/f9/f999c0c7-2df4-432d-86e1-fff36b2d3de9.pdf"
      },
      {
         "title":"Service and Repairs",
         "link":"https://images.thdstatic.com/catalog/pdfImages/70/70c2dc77-a1d9-4d45-9610-a723523171ab.pdf"
      },
      {
         "title":"Product Brochure",
         "link":"https://images.thdstatic.com/catalog/pdfImages/c7/c767f0c7-bd5e-4b78-ad4d-0bf2272ca158.pdf"
      },
      {
         "title":"Product Label in Spanish",
         "link":"https://images.thdstatic.com/catalog/pdfImages/be/be815184-f129-40b7-829d-34d3629168fe.pdf"
      },
      {
         "title":"Full Product Manual",
         "link":"https://images.thdstatic.com/catalog/pdfImages/a5/a531c5cb-34f4-4d3f-8992-86ab7b774b1c.pdf"
      }
   ],
   "specifications":[
      {
         "key":"Details",
         "value":[
            {
               "name":"Application",
               "value":"Campsite,Recreation,Tailgating"
            },
            {
               "name":"Built-in inverter",
               "value":"Yes"
            },
           ... and other details
         ]
      },
      {
         "key":"Warranty / Certifications",
         "value":[
            {
               "name":"Certifications and Listings",
               "value":"CARB Compliant,CARB Compliant,EPA Approved,EPA Approved,FCC Listed"
            },
            {
               "name":"Manufacturer Warranty",
               "value":"3 Year Limited Warranty"
            }
         ]
      },
      {
         "key":"Dimensions",
         "value":[
            {
               "name":"Product Height (in.)",
               "value":"17.7 in"
            },
            {
               "name":"Product Length (in.)",
               "value":"17.3 in"
            },
            {
               "name":"Product Width (in.)",
               "value":"11.5 in"
            }
         ]
      }
   ],
   "fulfillment":{
      "countity":1385,
      "store":"Bangor",
      "options":[
         {
            "type":"Ship to Home",
            "title":"Get it by",
            "arrival_time":[
               "Dec 26",
               "Dec 26"
            ],
            "bottom":"Free delivery"
         },
         {
            "type":"Schedule delivery",
            "title":"Not available for this item"
         },
         {
            "type":"Ship to store",
            "title":"Pickup",
            "arrival_time":[
               "Dec 22",
               "Dec 28"
            ],
            "bottom":"FREE"
         }
      ]
   }
}
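In the output above, rating and reviews arrive as strings, each entry of images is an array holding the same picture in increasing sizes, and specifications is grouped by section. The sketch below flattens that into a smaller object. summarizeProduct is a hypothetical helper, and it assumes the last URL in each image group is the largest one, which holds for the sample shown but is not guaranteed:

```javascript
// Hypothetical helper: reduce product_results to a compact summary object.
function summarizeProduct(product) {
  // Flatten grouped specifications into "Section / Name" -> value pairs.
  const specs = {};
  for (const group of product.specifications ?? []) {
    for (const { name, value } of group.value ?? []) {
      specs[`${group.key} / ${name}`] = value;
    }
  }
  return {
    id: product.product_id,
    title: product.title,
    price: product.price,
    // rating and reviews are strings in the sample output, so convert them.
    rating: Number(product.rating),
    reviews: Number.parseInt(product.reviews, 10),
    // Assumption: sizes are listed from smallest to largest within each group.
    largestImages: (product.images ?? []).map((sizes) => sizes[sizes.length - 1]),
    specs,
  };
}

// Possible usage with the getResults function from this post:
// getResults().then((result) => console.dir(summarizeProduct(result), { depth: null }));
```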

How to extract product results and then extract product data

You can get products (with their product_id) using our Web scraping The Home Depot Search with Nodejs blog post and then get the product data with the extracted product_id. Or you can use this simple code example, which shows how to get products and then get the info for all of them:

import dotenv from "dotenv";
import { config, getJson } from "serpapi";

dotenv.config();
config.api_key = process.env.API_KEY; //your API key from serpapi.com

const getResults = async () => {
  const productsInfo = [];
  const { products } = await getJson("home_depot", { q: "refrigerator" });
  for (const { product_id } of products) {
    const { product_results } = await getJson("home_depot_product", { product_id });
    productsInfo.push(product_results);
  }
  return productsInfo;
};

getResults().then((result) => console.dir(result, { depth: null }));

First, we get and destructure products from the JSON received from the getJson function (we use the "home_depot" search engine and a params object with the search query (q) "refrigerator").

Then, in a for...of loop, we destructure product_id from each received product, and get and destructure product_results from the JSON received from the getJson function (this time we use the "home_depot_product" search engine and a params object with product_id). Finally, we add the received results to the productsInfo array (using the push() method).
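The loop above sends one product request per search result, one after another, and a single failed request rejects the whole function. Here is a minimal sketch of a more defensive variant. collectProductInfo is a hypothetical helper, the default limit of 5 is an arbitrary choice, and the request itself is passed in as a function so the logic can be tried without an API key:

```javascript
// Hypothetical helper: fetch details for at most `limit` products and
// keep going when an individual request fails.
async function collectProductInfo(products, fetchProduct, limit = 5) {
  const productsInfo = [];
  const failures = [];
  // Skip entries without a product_id and cap the number of follow-up requests.
  const ids = products
    .map((product) => product.product_id)
    .filter(Boolean)
    .slice(0, limit);
  for (const product_id of ids) {
    try {
      const results = await fetchProduct(product_id);
      if (results) productsInfo.push(results);
    } catch (error) {
      // Record the failure and continue with the remaining products.
      failures.push({ product_id, message: error.message });
    }
  }
  return { productsInfo, failures };
}

// With serpapi, fetchProduct could reuse the same call as the code above:
// const fetchProduct = async (product_id) =>
//   (await getJson("home_depot_product", { product_id })).product_results;
```

Capping the number of products also caps the number of searches the script performs, since every product lookup is a separate request.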

If you want other functionality added to this blog post or if you want to see some projects made with SerpApi, write me a message.

