from Hacker News

Twitter’s new R package for anomaly detection

by astletron on 1/8/15, 11:45 AM with 15 comments

  • by jcr on 1/8/15, 12:24 PM

    This is blogspam and a duplicate. The original url is here:

    https://blog.twitter.com/2015/introducing-practical-and-robu...

    And the previous discussion is here:

    https://news.ycombinator.com/item?id=8846205

  • by tempodox on 1/8/15, 1:04 PM

    This reminds me how much of a hole there is in my knowledge about statistics and such. I built myself a Twitter client that sucks users' geolocations into a DB so I can do all kinds of analyses on their movements. Makes me wish we'd had statistics classes back in school. That should come right after learning the ABCs.

  • by naftaliharris on 1/8/15, 3:30 PM

    It's interesting that they chose to write their anomaly detection code in R, which is typically used in offline, post hoc analysis mode. It seems reasonable to suppose that the ability to discover an anomaly like "service X is failing" in real time is more valuable than discovering it a week later.

    In order to monitor important time series with this code, they would presumably need to run it every n minutes on the entire time series, or at least the recent part of it. It seems an anomaly detection system operating on streaming data might make more sense.

    Perhaps their real time anomaly detection system uses simpler logic on streaming data?
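
    To make "simpler logic on streaming data" concrete, here is a minimal Python sketch of one possible approach: a rolling z-score check that looks at each point once as it arrives. This is an illustration only, not the method the R package uses and not anything Twitter has described. The class name, window size, threshold, and sample values are all hypothetical.

```python
from collections import deque
import math


class RollingZScoreDetector:
    """Hypothetical helper: flags a point that lies more than `threshold`
    sample standard deviations from the mean of the previous `window` points."""

    def __init__(self, window=30, threshold=3.0):
        # deque with maxlen drops the oldest value automatically
        self.values = deque(maxlen=window)
        self.threshold = threshold

    def update(self, x):
        """Score x against the current window, then add x to the window."""
        is_anomaly = False
        if len(self.values) >= 2:
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / (len(self.values) - 1)
            std = math.sqrt(var)
            # Skip the check when the window is constant (std == 0)
            if std > 0 and abs(x - mean) > self.threshold * std:
                is_anomaly = True
        self.values.append(x)
        return is_anomaly


# Made-up metric values with one obvious spike
detector = RollingZScoreDetector(window=20, threshold=3.0)
stream = [10.0, 10.5, 9.8, 10.2, 9.9, 10.1, 10.3, 9.7, 10.0, 10.4, 50.0, 10.1]
flags = [detector.update(x) for x in stream]
```

    Each update costs time proportional to the window size rather than to the whole series, which is the trade-off being discussed: cheap enough to run on every new point, but blind to seasonality and slow drift that a batch method over the full history could account for.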

  • by codewithcheese on 1/8/15, 1:14 PM

    Does anyone know if these R packages (AnomalyDetection, BreakoutDetection) are meant to be used on large-scale data, or are they intended more for lab work?