TrickyCases #6. Scrape Highcharts plots


This content originally appeared on Level Up Coding - Medium and was authored by Mykhailo Kushnir

Disclaimer: “TrickyCases” is a series of posts with relatively short code snippets that are useful in day-to-day ML practice. Here you can find something that you would otherwise be searching Stack Overflow for a few days from now.

Photo by Markus Winkler on Unsplash

Plots often contain valuable data, and if you are a data geek like I am, you’d want to take that data home. One of my recent discoveries was how easy it is to parse data drawn with the Highcharts.js library.

Before we jump to the scraping part, make sure that:

  • The site you want to scrape does not provide the same data through an API. It’s always going to be easier to use a programmable interface.
  • You’re not legally forbidden to scrape the site.
  • You’re not putting a heavy load on the site.

The Part Where We Scrape

As promised, the scraping part is relatively simple. For practice purposes, let us take a page like this one. Scroll down to the plot named “Max, Min and Average Temperature” and you’ll see the average temperature in degrees Celsius for each month in Brussels since 2009. Now open a browser console and run the following script:

Highcharts.charts[0].series[0].data

Data Points

After some examination, you’ll see that you now have access to the entire raw data behind the plot. Furthermore, with some JavaScript knowledge, you’ll even be able to print the data cleanly. The trick to scraping is to automate this process with Selenium and pandas.

Below I’ll share with you a simplified version of such a script so you’d be able to catch the gist:

In this code:

  1. You’ll open the target site
  2. Wait until the Highcharts plot appears
  3. Extract the data with JS run inside the page
  4. Store it in a DataFrame and save it to a local folder
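
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the author's original script: the URL, the output file name, and the helper names (scrape_series, points_to_rows, save_rows) are hypothetical, and it assumes the plot is the first chart on the page and that each point exposes plain x and y values.

```python
"""Sketch: copy one Highcharts series from a web page into a CSV file."""
from __future__ import annotations

from typing import Any

# JavaScript executed inside the page. Highcharts.charts is the library's
# global list of chart instances; the indexes arrive as script arguments.
READY_JS = """
const chart = window.Highcharts && Highcharts.charts[arguments[0]];
const series = chart && chart.series[arguments[1]];
return !!(series && series.data.length);
"""

EXTRACT_JS = """
const chart = Highcharts.charts[arguments[0]];
return chart.series[arguments[1]].data.map(p => ({x: p.x, y: p.y}));
"""


def scrape_series(url: str, chart_index: int = 0, series_index: int = 0,
                  timeout: float = 30) -> list[dict[str, Any]]:
    """Steps 1-3: open the page, wait for the plot, pull its points."""
    # Imported lazily so the pure helpers below work without a browser.
    from selenium import webdriver
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()  # assumes chromedriver is available
    try:
        driver.get(url)
        WebDriverWait(driver, timeout).until(
            lambda d: d.execute_script(READY_JS, chart_index, series_index)
        )
        return driver.execute_script(EXTRACT_JS, chart_index, series_index)
    finally:
        driver.quit()


def points_to_rows(points: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop points without a y value and sort the rest by x."""
    kept = [{"x": p["x"], "y": p["y"]} for p in points if p.get("y") is not None]
    return sorted(kept, key=lambda row: row["x"])


def save_rows(rows: list[dict[str, Any]], path: str) -> None:
    """Step 4: store the rows in a DataFrame and write them to disk."""
    import pandas as pd  # lazy import, same reason as above

    pd.DataFrame(rows, columns=["x", "y"]).to_csv(path, index=False)


if __name__ == "__main__":
    URL = "https://example.com/page-with-a-highcharts-plot"  # placeholder
    save_rows(points_to_rows(scrape_series(URL)), "highcharts_series.csv")
```

Waiting on the series itself, rather than on a fixed sleep, is a design choice that keeps the script from reading the chart before its data has loaded. On a date axis the x values are likely to be timestamps in milliseconds, so they may need converting before analysis.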

Be aware that the code assumes you already have chromedriver installed, running, and available. Here are a few good tutorials on how to do it on various operating systems:
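
A quick way to check that chromedriver is reachable before launching the scraper is to look it up on the PATH. This is a small sketch using only the standard library; it assumes "available" means the executable is on the PATH, and the function names find_chromedriver and describe are hypothetical.

```python
import shutil
from typing import Optional


def find_chromedriver(name: str = "chromedriver") -> Optional[str]:
    """Return the full path of the driver executable, or None if it is not on PATH."""
    return shutil.which(name)


def describe(path: Optional[str]) -> str:
    """Turn the lookup result into a short human-readable message."""
    if path is None:
        return "chromedriver not found on PATH"
    return f"chromedriver found at {path}"


if __name__ == "__main__":
    print(describe(find_chromedriver()))
```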

As usual, I’m open to any questions and comments. Let me know if you run into any trouble along the way and how you’ve used the script in your automation tasks!

