Get data from a site into Google docs

Posted by Strainu on January 29, 2021
Personal / No Comments
Disclaimer: Data and text in websites is protected by copyright unless specified otherwise. Use this guide only under the fair use limitations in your country.

A question that I often receive when I talk about my spreadsheets is how do I get data directly into Google docs, so I’ve put together this tutorial on how to get Romanian stock prices. We’re going to use Tradeville‘s website to do that. As usual, the published data is delayed, but it’s good enough for end of day tracking.

1. Identify the data you need

From the homepage, you can use the search box in the upper-right corner to search for a ticker symbol (e.g. BRD). This will get you to a page like the one below. In that page, identify the URL (in the address bar of your browser) and the price.

Picture 1

2. Get the data

Now, you need a bit of tech knowledge: you need to open the developer tools in your browser. Fear not, in most browser you can use the Ctrl+Shift+I key combination (Cmd+Shift+I on Mac) to open it. You can see how that looks like in Firefox in the picture below.

Click the little arrow in the top-left corner (in red towards the middle of the picture), then select the element that contains the price. In the code pane, the equivalent line of code is selected (the red highlight at the bottom of the picture 2).

Picture 2
Picture 3

You’re almost done: right click the line of code in picture 3, then look for the command that allows you to copy the XPath of the element. In Firefox, that’s Copy->XPath.

3. Use the data in Google docs

Not that you have the path, you can use the information you got above (the URL of the page and the XPath of the element containing the price) in the IMPORTXML function in Google Spreadsheets. Just put in any cell the following function call:


Let’s see how that looks:

Picture 4

Ooops! Looks like we’re not quite there yet! Google returns 150.4 instead of 15.04. That’s actually because of the encoding differences between English and Romanian (the dot as a thousands separator in Romanian vs decimal separator in English) and how the Google code interprets the data. It can happen that Google considers the data as a date, which will leave you with a totally unusable value.

I will leave the solution to this problem as a task for the reader, since it might be different for different symbols. I personally chose a more complex formula including parsing more of the page using regular expressions, just because it was more resilient to small site changes.

If this guide helped you, please leave a comment or share it!

Share and Enjoy:
  • Facebook
  • Twitter
  • LinkedIn
  • Reddit
  • Add to favorites

Tags: , , ,