Some time ago, I worked for a foreign online store, one of the benefits of a lifetime discount of 20% on selected products. I thought to myself, why not sell these products on the local market with an adequate margin? Unfortunately, my business budget was close to zero. Buying the products in stock and waiting for someone to buy it from me was out of the question. I decided to put my products up for sale even though I didn’t own them.


It is a HYPOTHETICAL SITUATION - I do not follow such practices. In the post, I omit legal and ethical matters because my goal is to present the possibilities of Google Sheets as a simple tool for web scraping and work automation.


The product tracking tool described here can be found at Google Drive. Source code is available on my GitHub.


The business model is straightforward. Clients buy a product from me, and then I order it from an online store. Depending on which distribution path I have taken, various problems may arise. One of them will be when a customer requests a selected product from me that has recently become unavailable in the online store. If it is a transaction through an OLX-type advertising website, it will only end up with a dissatisfied customer. For sites like Allegro, it can be more unpleasant (for example, a commission fee). Of course, you can protect yourself in various ways, e.g., described in the auction that you don’t have products now, but life shows that customers are different and may not be understanding.

I started by checking whether my idea made sense at all, i.e., whether there was a demand for the products available on the store’s website (for example, I took the www.military1st.co.uk store). I have selected a dozen or so products that I have displayed on Polish websites. After a few weeks, it turned out that they were willing to buy products at the prices offered. First of all, it turned out that I make money on it (the profit margin is greater than the costs of displaying the goods and promoting them, and they compensate for the time spent on servicing the business). I was able to move on to the next step, which is increasing the range offered. With a dozen or so products, I had no problem checking at least once a day whether a given product is available on the store’s website. With more products, such clicking was pointless.

Few people realize that Google Sheets is an excellent tool for smaller web scraping. It may not work with well-known and popular websites (IP blocking), but in ordinary stores, it works perfectly (but a lot depends on the technology in which the website is made). Good knowledge of Excel is enough, but the major problems can be a function written in JavaScript (“almost like in VBA”). To further develop the idea, I created a Google Sheet for product tracking to meet several assumptions.

  1. Keep track of selected products (I only paste links)
  2. If the availability of products changes, send an e-mail alert

Web scraping and Google Sheets

In the beginning, all you need is a table with several columns:

  • url
  • title
  • img
  • in_stock
  • in_stock_prev (to compare)
  • allegro_id / lub allegro_url / lub inny_serwis_url / etc.

The only columns that I fill in manually are the link to the product page and references to my auctions. The remaining fields are completed automatically (either using the sheet function or using scripts).

To download data from the store’s website (title, img, and in_stock columns), I used the IMPORTXML function and developer tools built into the browser. Below is an example of using developer tools. The copied path to the selected item depends on many things, incl the browser type and version.

Ultimately, the effect is essential. In my case, the XPATH paths to each field look like this:

  • "//h1" for title
  • "//img[contains(concat(' ',normalize-space(@class),' '),' main-image ')]/@src" for img
  • "//div[contains(concat(' ',normalize-space(@class),' '),' stock available ')]" for in_stock

Automation with Google Sheets

I have created a table that stores the current data. It was left to write a script that will refresh the data, compare the in_stock and in_stock columns, and send an alert if the value changes. The advantage of Google Sheets over Excel is that scripts (macros) can be run with time triggers. However, remember not to overload the website we are tracking (in the worst case, the website will block the IP).

Google Sheets scripts are written in JavaScript. The Script Editor is located on the toolbar under Tools.

Below is the code for the function refreshData() with a brief explanation.

/**
 * Refresh data in military1st.co.uk sheet and send the alert.
 */
function refreshData() {
  var sht = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("military1st.co.uk");
  var lastRow = sht.getLastRow();
  var rngInStock = sht.getRange(2, 4, lastRow-1, 1);
  
  var data = sht.getRange(2, 1, lastRow-1, 6).getValues()
  var productsInStock = [];
  var productsOutOfStock = [];
  
  var email = 'bartosz.pawel.sekiewicz@gmail.com';
  var subject = 'Product Tracker Alert!';
  var message = '';

  // check changes
  // ...

  // prepare and send alert e-mail
  // ...
  
  // copy from in_stock to in_stock_prev
  // ...
}

I keep links to products whose availability status has changed from “unavailable” to “available” in the productsInStock table and links to products whose status has changed from “available” to “unavailable” in the productsOutOfStock table.

  // check changes
  for (var i in data) {
    var row = data[i];
    
    if ((row[3] != row[4]) && i>0) {
      if (row[3] == true) {
        productsInStock.push(row[0]);
      } else if (row[3] == false) {
        productsOutOfStock.push(row[0]);
      }
    }
  }

I am composing the email body. If there is at least one change in the availability of products, I send a message using the MailApp.sendEmail function. NOTE! Use of this function will require one-time authorization.

  // prepare and send alert e-mail
  message = message + 'Products in stock:\n\n'
  if (productsInStock.length > 0) {
    message = message + productsInStock.join('\n') + '\n\n'
  } else {
    message = message + 'no change.\n\n'
  }
  
  message = message + 'Products out of stock:\n\n'
  if (productsOutOfStock.length > 0) {
    message = message + productsOutOfStock.join('\n') + '\n\n'
  } else {
    message = message + 'no change.\n\n'
  }
  
  if (productsInStock.length > 0 || productsOutOfStock.length > 0) {
    MailApp.sendEmail(email, subject, message);
  }

It remains to copy the value from the ʻin_stockcolumn to the ʻin_stock prev column.

  // copy from in_stock to in_stock_prev
  rngInStock.copyValuesToRange(sht, 5, 5, 2, lastRow);

Finally, I run a time trigger - run the function refreshData every 6 hours.

Opportunities for further development

A solution based on Google Sheets should be able to handle even several hundred products. Indeed, with a more significant number of products on offer, it would be useful to integrate the sheet with the purchasing platform - for example, to automatically suspend and unsuspend requests. Perhaps there are dedicated sales management services that may treat the Google Sheet as a database of our “store.”

It would also be worth starting to collect data so that it can be analyzed in the future. For example, write information about product availability in addition to a new column on a separate sheet. Based on this data, we could develop an algorithm that would suspend the auctions in advance if it decides that there is a high chance of a product being out of stock on a given day.

Ultimately, you will have to invest in ready-made solutions (or commission someone to create a dedicated one) that would allow you to automate even more elements (e.g., listing offers, mailing) and further expand the range. When we know business is small, it’s easier to invest money in more comprehensive tools.