from Hacker News

Historical weather data API for machine learning, free for non-commercial

by meteo-jeff on 7/5/22, 10:32 AM with 45 comments

  • by meteo-jeff on 7/6/22, 8:18 AM

    Some technical background:

    Open-Meteo has offered free weather APIs for a while now. Archiving data was not an option, because forecast data alone required 300 GB of storage.

    In the past couple of weeks, I started looking into fast and efficient compression algorithms like zstd, brotli or lz4. All of them performed rather poorly with time-series weather data.

    After a lot of trial and error, I found a couple of pre-processing steps that improve the compression ratio a lot:

    1) Scaling data to reasonable values. Temperature has an accuracy of 0.1° at best. I simply round everything to 0.05 instead of keeping the highest possible floating point precision.

    2) A temperature time-series increases and decreases by small values. 0.4° warmer, then 0.2° colder. Only storing deltas improves compression performance.

    3) Data are highly spatially correlated. If the temperature is rising in one "grid-cell", it is rising in the neighbouring grid-cells as well. Simply subtract the time-series of one grid-cell from that of the next grid-cell. This step in particular yielded a large boost.

    4) Although zstd performs quite well with this encoded data, other integer compression algorithms have far better compression and decompression speeds. Specifically, I am using FastPFor.
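
    A rough sketch of steps 1 to 3, not Open-Meteo's actual code. All function names are hypothetical, the temperatures are synthetic, and zlib from the Python standard library stands in for FastPFor purely to show the effect on compressed size. The scale factor of 20 is an assumption derived from rounding to 0.05.

```python
import math
import struct
import zlib

SCALE = 20  # assumption: 1 / 0.05, so integers count steps of 0.05 degrees


def quantize(series):
    """Step 1: round to the nearest 0.05 and store as integers."""
    return [round(v * SCALE) for v in series]


def time_deltas(ints):
    """Step 2: keep the first value, then only the change per time step."""
    return ints[:1] + [b - a for a, b in zip(ints, ints[1:])]


def encode(grid):
    """grid is a list of time series, one per grid cell, neighbours adjacent."""
    rows = [time_deltas(quantize(cell)) for cell in grid]
    # Step 3: subtract each cell's deltas from those of the next cell.
    return rows[:1] + [
        [cur - prev for prev, cur in zip(a, b)] for a, b in zip(rows, rows[1:])
    ]


def decode(encoded):
    """Undo steps 3 and 2; returns the quantized integers (value * SCALE)."""
    rows = [encoded[0]]
    for diff in encoded[1:]:
        rows.append([d + p for d, p in zip(diff, rows[-1])])
    out = []
    for row in rows:
        total, series = 0, []
        for d in row:
            total += d
            series.append(total)
        out.append(series)
    return out


def demo_grid(cells=8, hours=240):
    """Synthetic, made-up temperatures: a daily cycle plus a small wobble."""
    return [
        [15 + 8 * math.sin(2 * math.pi * t / 24) + 0.3 * c
         + 0.2 * math.sin(1.7 * t + 0.1 * c) for t in range(hours)]
        for c in range(cells)
    ]


def compressed_sizes(grid):
    """Bytes after zlib: raw float32 values vs. the encoded int16 values."""
    flat = [v for cell in grid for v in cell]
    raw = struct.pack(f"<{len(flat)}f", *flat)
    ints = [v for row in encode(grid) for v in row]
    small = struct.pack(f"<{len(ints)}h", *ints)
    return len(zlib.compress(raw)), len(zlib.compress(small))


grid = demo_grid()
raw_size, encoded_size = compressed_sizes(grid)
print("zlib on raw float32:", raw_size, "bytes")
print("zlib on encoded int16:", encoded_size, "bytes")
```

    The encoding is lossless after the rounding in step 1: decode(encode(grid)) returns exactly the quantized integers, so the only error is at most 0.025 per value.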

    With that compression approach, an archive became possible. One week of weather forecast data should be around 10 GB compressed. With that, I can easily maintain a very long archive.

  • by m0llusk on 7/6/22, 2:45 AM

    This is how we could defeat a rogue AI: Distract it by talking about the weather.

  • by MeteorMarc on 7/6/22, 7:30 AM

    It would also be fun to have the historical weather *forecasts* so that you can compare the forecasts with the eventually measured data.

  • by bernulli on 7/6/22, 1:39 AM

    Hi meteo-jeff, this looks really cool!

    I have two questions:

    1) How does the spatial resolution come into this? Is the data constant across the whole 2 km x 2 km (?) parcel with an abrupt change at the boundary, or is it interpolated in some way? Can I query the coordinates of the mesh?

    2) How 'historical' does it get? How far back can I go with this?

    Thank you!

  • by ricksunny on 7/6/22, 2:52 AM

    Original open-meteo HN thread for background (9 months back) https://news.ycombinator.com/item?id=28499910

  • by aarreedd on 7/6/22, 3:08 PM

    Does not seem accurate. This is telling me it snowed 1.33cm on June 17, 2022 in New York City.

    https://api.open-meteo.com/v1/forecast?latitude=40.71&longit...

  • by m3kw9 on 7/6/22, 12:28 PM

    Still, trying to predict weather using historical data is like trying to predict the next number on a roulette wheel using historical numbers.

  • by mhalle on 7/6/22, 1:56 PM

    Thanks for offering this service!

    You explain that your API offers historical data using the "past_days" parameter. Could you also offer a "date" parameter for a given day, or are you only keeping a rolling window of data?
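
    For reference, a request using that parameter could be built roughly like this. It is only a sketch: the helper name build_url is hypothetical, the coordinates are picked for illustration, and the value 2 for "past_days" is an assumption about what the API accepts.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.open-meteo.com/v1/forecast"


def build_url(latitude, longitude, past_days):
    """Build a forecast query that also asks for the previous `past_days` days."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "hourly": "temperature_2m",
        "past_days": past_days,
    }
    return f"{BASE_URL}?{urlencode(params)}"


# Hypothetical coordinates (Berlin), chosen only for illustration.
url = build_url(52.52, 13.41, past_days=2)
print(url)
```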

  • by Kalanos on 7/6/22, 3:28 PM

    check out https://docs.aiqc.io for easy walk-forward, multivariate deep learning: https://docs.aiqc.io/notebooks/gallery/tensorflow/tab_foreca...

    excited to play w some of this data

  • by farmin on 7/6/22, 2:36 AM

    Do you have a commercial option? Does anyone know good alternatives?

    What forecast models do you use for Australia?

  • by melony on 7/6/22, 5:35 AM

    You can get these for free from the government websites.