Node package for scraping wind forecast from a few websites

jeroentvb/wind-scrape

Disclaimer

This package scrapes websites. Web scraping is a grey area and may not be allowed by the website.
Use with caution and for personal use only!

As per windfinder's Terms & Conditions

1.4.2 The data are protected in our favor by copyright or related rights.

1.5.2 The data may be used without our consent only for the intended use within the scope of the services offered by us; in particular the data may not be used for own software, apps, web pages, etc., unless we have expressly agreed to this use.

As per windguru's Terms and Conditions

3.2. It is forbidden to download website content by automated scripts.

This basically means that you can't use the windfinder & windguru scrape functions in this package.
I wasn't able to find the terms and conditions for Windy.

Caution

To be able to use this package with the Ubuntu shell on Windows 10 (and possibly Linux), I've added the flag --no-sandbox to puppeteer.launch(). Puppeteer is the headless browser used for scraping, and this flag makes it launch without a sandbox. This is a security risk, so only use it to visit sites you trust! More info on this, and on how to configure it properly, can be found here. This package only uses puppeteer to visit www.windguru.cz and www.windy.com. Only use this package for scraping those sites if you deem them safe. I'm not responsible for anything that happens.

Note

If you use this package in a project, I highly recommend writing the scraped data to a file and reusing that file if the website has already been scraped within a certain amount of time. This avoids spamming a website with unnecessary requests.
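The caching advice above could be sketched roughly as follows. This is a minimal sketch and not part of the package: readFreshCache, cachedScrape, the cache file layout (savedAt plus data) and the file name in the usage line are all hypothetical names chosen for illustration.

```javascript
const fs = require('fs')

// Hypothetical helper: return the cached data if the file was written
// less than maxAgeMs milliseconds ago, otherwise return null.
function readFreshCache (file, maxAgeMs) {
  try {
    const { savedAt, data } = JSON.parse(fs.readFileSync(file, 'utf8'))
    return Date.now() - savedAt < maxAgeMs ? data : null
  } catch (err) {
    return null // missing or unreadable cache file: treat as a cache miss
  }
}

// Hypothetical helper: only call scrapeFn when there is no fresh cache,
// then store the result together with a timestamp.
async function cachedScrape (file, maxAgeMs, scrapeFn) {
  const cached = readFreshCache(file, maxAgeMs)
  if (cached !== null) return cached

  const data = await scrapeFn()
  fs.writeFileSync(file, JSON.stringify({ savedAt: Date.now(), data }))
  return data
}
```

With this in place, a call such as cachedScrape('tarifa.json', 60 * 60 * 1000, () => scrape.windfinder('tarifa')) would hit the website at most once per hour, assuming the process can write to the chosen file.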

Wind scrape

This package can scrape wind forecasts from windfinder superforecast, windguru and Windy, as well as wind reports from windfinder.

Installation

npm install https://github.com/jeroentvb/wind-scrape/releases/download/v3.0.0/dist.tgz

If there is a newer version available, you can use that version number.
Releases can be found here.

Usage

// CommonJS
const scrape = require('wind-scrape')

// TypeScript (use this instead of the require above)
// import * as scrape from 'wind-scrape'

// Scrape windfinder spot
scrape.windfinder('tarifa')
  .then(data => console.log(data))
  .catch(err => console.error(err))

// Scrape windguru spot
scrape.windguru(43)
  .then(data => console.log(data))
  .catch(err => console.error(err))

// Scrape windy spot
scrape.windy(36.012, -5.611)
  .then(data => console.log(data))
  .catch(err => console.error(err))

// Scrape wind report of a windfinder spot
scrape.windReport('tarifa')
  .then(data => console.log(data))
  .catch(err => console.error(err))

windfinder

scrape.windfinder(spotname)
Scrapes data from a windfinder superforecast page. Returns a promise which resolves to an object with the following format:

Windfinder data format
{
   "name": "Windfinder",
   "spot": "Tarifa Centro",
   "days": [
       {
           "date": "Sunday, Apr 07",
           "hours": [
               {
                   "hour": 7,
                   "windspeed": 16,
                   "windgust": 24,
                   "winddirection": 265,
                   "temperature": 14
               }
           ]
       }
   ]
}

The data is sliced so that only daytime hours are returned.

spotname

A string. Name of the spot to scrape. This is the part after https://www.windfinder.com/weatherforecast/.
Example: to scrape data for Tarifa Centro, use tarifa.
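As an illustration of working with the windfinder data format above, the following sketch picks the hour with the highest wind speed. strongestHour is a hypothetical helper, not part of the package, and the second hour in the sample data is made up for the example.

```javascript
// Hypothetical helper: find the hour with the highest windspeed in data
// shaped like the windfinder format. The day's date is copied onto the result.
function strongestHour (forecast) {
  let best = null
  for (const day of forecast.days) {
    for (const hour of day.hours) {
      if (best === null || hour.windspeed > best.windspeed) {
        best = { date: day.date, ...hour }
      }
    }
  }
  return best // null when the forecast contains no hours
}

// Sample data in the documented format (the 10:00 entry is invented)
const sampleForecast = {
  name: 'Windfinder',
  spot: 'Tarifa Centro',
  days: [
    {
      date: 'Sunday, Apr 07',
      hours: [
        { hour: 7, windspeed: 16, windgust: 24, winddirection: 265, temperature: 14 },
        { hour: 10, windspeed: 21, windgust: 29, winddirection: 270, temperature: 17 }
      ]
    }
  ]
}

console.log(strongestHour(sampleForecast))
```

In real use, the argument would be the object that scrape.windfinder(spotname) resolves to.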

windguru

scrape.windguru(spotnumber, modelNumbers)
Scrapes data from the selected windguru models (forecast tables). Returns a promise which resolves to an object with the following format:

Windguru data format
{
    "spot": {
        "name": "Spain - Tarifa",
        "coordinates": {
            "lat": "36",
            "lng": "-5.65"
        },
        "altitude": "16 C"
    },
    "models": [
        {
            "name": "GFS 27 km",
            "days": [
                {
                    "date": "Tue 4",
                    "hours": [
                        {
                            "wspd": "1",
                            "gust": "2",
                            "wdirn": "N",
                            "wdeg": "352",
                            "tmp": "16",
                            "slp": "1027",
                            "hcld": "0",
                            "mcld": "0",
                            "lcld": "-",
                            "apcp": "0",
                            "rh": "68",
                            "hour": "10"
                        }
                    ]
                }
            ]
        }
    ]
}

The included data may vary per forecast model. You can find the keys of variables on the windguru micro help page. The only variable all hours have is hour.
Wave models are now included as well. They have different variables.

spotnumber

A string or integer. The number windguru uses for a spot.
Example: to scrape data for Tarifa, use 43. You can get this number from the url of the forecast for a spot.
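In the windguru format above, all values are strings. If numbers are needed, a conversion along these lines might help. parseHour is a hypothetical helper, and treating "-" as a missing value is an assumption based on the sample output, not something the package documents.

```javascript
// Hypothetical helper: convert the numeric strings of one windguru hour
// to numbers. Non-numeric strings such as "N" are kept as they are.
// Assumption: "-" marks a missing value and is mapped to null.
function parseHour (hour) {
  const parsed = {}
  for (const [key, value] of Object.entries(hour)) {
    if (value === '-') {
      parsed[key] = null
    } else if (typeof value === 'string' && value.trim() !== '' && !Number.isNaN(Number(value))) {
      parsed[key] = Number(value)
    } else {
      parsed[key] = value
    }
  }
  return parsed
}

console.log(parseHour({ wspd: '1', gust: '2', wdirn: 'N', wdeg: '352', lcld: '-', hour: '10' }))
```

Because the included variables vary per model, the helper loops over whatever keys are present instead of relying on a fixed list.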

Windy

scrape.windy(lat, long)
Scrapes data for a custom location. Returns a promise which resolves to an object with the following format:

Windy data format
{
    "name": "Windy",
    "models": [
        {
            "name": "ECMWF 9km",
            "days": [
                {
                    "date": "07-04-2019",
                    "hours": [
                        {
                            "hour": 9,
                            "windspeed": 20,
                            "windgust": 30,
                            "winddirection": 278
                        }
                    ]
                }
            ]
        }
    ]
}

lat

Latitude of a spot

long

Longitude of a spot. Together, lat and long make up the coordinates of a spot. For example, in the windy url https://www.windy.com/36.012/-5.611/wind?, 36.012 is the latitude and -5.611 is the longitude.
I recommend using coordinates taken from a windy url, though any set of coordinates should work.
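To take the coordinates from a windy url as described above, something like the following could be used. coordinatesFromWindyUrl is a hypothetical helper, and it assumes the url has the same shape as the example (latitude and longitude as the first two path segments).

```javascript
// Hypothetical helper: read latitude and longitude from a windy url of the
// form https://www.windy.com/<lat>/<long>/... and return them as numbers.
function coordinatesFromWindyUrl (url) {
  const [lat, long] = new URL(url).pathname
    .split('/')
    .filter(Boolean) // drop empty segments caused by leading slashes
    .map(Number)

  if (!Number.isFinite(lat) || !Number.isFinite(long)) {
    throw new Error(`No coordinates found in ${url}`)
  }
  return { lat, long }
}

console.log(coordinatesFromWindyUrl('https://www.windy.com/36.012/-5.611/wind?'))
```

The result could then be passed on as scrape.windy(lat, long).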

WindReport

scrape.windReport(spotname)
Gets the report data for a windfinder spot report. Returns a promise which resolves to an object with the following format:

Windfinder report data format
{
    "name": "Windfinder report",
    "spot": "tarifa",
    "report": [
        {
            "windspeed": 17,
            "windgust": 25,
            "winddirection": 260,
            "time": "2019-04-06T15:00:00+02:00"
        }
    ]
}
Time is given in ISO 8601 format.
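Since the time is an ISO 8601 string, it can be passed straight to the built-in Date constructor. The sketch below uses that to pick the most recent reading. latestReading is a hypothetical helper, and the second entry in the sample data is made up for the example.

```javascript
// Hypothetical helper: return the report entry with the most recent time,
// or null when the report is empty.
function latestReading (windReport) {
  return windReport.report.reduce((latest, entry) => {
    const isNewer = latest === null || new Date(entry.time) > new Date(latest.time)
    return isNewer ? entry : latest
  }, null)
}

// Sample data in the documented format (the 16:00 entry is invented)
const sampleReport = {
  name: 'Windfinder report',
  spot: 'tarifa',
  report: [
    { windspeed: 17, windgust: 25, winddirection: 260, time: '2019-04-06T15:00:00+02:00' },
    { windspeed: 19, windgust: 27, winddirection: 262, time: '2019-04-06T16:00:00+02:00' }
  ]
}

console.log(latestReading(sampleReport))
```

Comparing Date objects instead of the raw strings keeps the result correct even if entries were to use different UTC offsets.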