by irs on 11/28/18, 5:16 PM with 124 comments
by citilife on 11/28/18, 5:47 PM
Often this means building systems to analyze terabytes of logs in [semi-]real time. All I have to say is: thank god! This is going to make my job a lot easier, and will likely let us retire our current infrastructure setup.
I know at one point we actually considered building our own time series database. Instead, we ended up using a Kafka queue feeding a SQL-based backend after we parsed and pared down the data, because it was the only setup quick enough for our queries (a rough sketch of that kind of pipeline is below).
Should make a lot of the modeling I've worked on a bit easier[1].
[1] https://medium.com/capital-one-tech/batch-and-streaming-in-t...
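For the curious, a minimal sketch of that kind of pipeline, with kafka-python and SQLite standing in for the real queue and backend; the topic, broker address, and schema are made up for illustration:

    import json
    import sqlite3
    from kafka import KafkaConsumer  # pip install kafka-python

    # Rough sketch of the pipeline described above: consume parsed log
    # events from Kafka and land the pared-down fields in a SQL backend.
    # Topic name, broker address, and schema are hypothetical.
    db = sqlite3.connect("metrics.db")
    db.execute("CREATE TABLE IF NOT EXISTS points (ts INTEGER, host TEXT, value REAL)")

    consumer = KafkaConsumer("parsed-logs", bootstrap_servers="localhost:9092")
    for msg in consumer:
        event = json.loads(msg.value)
        # keep only the fields we actually query on
        db.execute("INSERT INTO points VALUES (?, ?, ?)",
                   (event["ts"], event["host"], event["value"]))
        db.commit()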
by sciurus on 11/28/18, 7:23 PM
Imagine you have 1,000 servers each submitting data to 100 timeseries every minute. That's 100,000 writes a minute (unless they support batch writes across series). At $0.50 per million writes, that's $72 a day, or ~$26k a year.
Now imagine you want to alert on that data. Say you have 100 monitors that each evaluate 1GB of data once a minute. At $10 per TB of data scanned, that's $1,440 a day or $525k a year!
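The arithmetic above, as a quick script; the prices are the preview numbers quoted in this comment ($0.50/million writes, $10/TB scanned), and the workload (1,000 servers, 100 series, 100 monitors) is hypothetical:

    MINUTES_PER_DAY = 24 * 60

    writes_per_day = 1_000 * 100 * MINUTES_PER_DAY           # 144M writes
    write_cost = writes_per_day / 1_000_000 * 0.50           # $72/day

    tb_scanned_per_day = 100 * 1 * MINUTES_PER_DAY / 1_000   # 144 TB (1 TB ~ 1,000 GB)
    query_cost = tb_scanned_per_day * 10                     # $1,440/day

    print(f"writes:  ${write_cost:,.0f}/day  (${write_cost * 365:,.0f}/yr)")
    print(f"queries: ${query_cost:,.0f}/day  (${query_cost * 365:,.0f}/yr)")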
by willlll on 11/28/18, 5:52 PM
by Tehnix on 11/28/18, 9:22 PM
---
I've seen a lot of people complain about the pricing, so I thought I'd share a little about why we're excited about this:
We have approximately 280 devices out in the field, monitoring production lines and sending aggregated data every 5 seconds via MQTT to AWS IoT. On average we see ~2 million messages published a day (equipment is often turned off when not producing). The packets are very small and highly compressible, each below 1 KB, but let's just call it 1 KB.
We currently funnel this data into Lambda, which processes it, puts it into DynamoDB, and handles rollups. The cost of that whole setup is approximately $20 a day (IoT, DynamoDB, Lambda, and X-Ray), with Lambda + DynamoDB making up $17 of it.
Finally, our users look at this data live on dashboards, usually at the last 8 hours for a specific device. Let's say there will be 10,000 queries a day, each looking at that day's data (2 GB/day / 280 devices ≈ 0.00714 GB/device/day).
---
Now, running the same numbers on the AWS Timestream pricing[0] (daily cost):
- Writes: 2 million * $0.50/million = $1
- Memory store: 2 GB * $0.036 = $0.072
- SSD store: (2 GB/day * 7 days retained) * $0.01 per GB-day = $0.14
- Magnetic store: (2 GB/day * 30 days retained) * $0.03 per GB-month = $1.80/month, or ~$0.06/day
- Query: 10,000 queries * 0.00714 GB each ≈ 71 GB scanned/day; at $10/TB we cross our first billable TB around day 14, so roughly $10 per two weeks, i.e. ~$21/month or ~$0.70/day.
Giving us: $1 + $0.072 + $0.14 + $0.06 + $0.70 ≈ $1.97, call it $2/day.
From these (very) quick calculations, we could lower our cost from ~$20/day to roughly $2/day (the same math is sketched in code below). And that's not even taking into account that it removes our need to create and maintain our own custom solution.
I am probably missing some details, but it does look bright!
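Here's the same back-of-envelope as a small script, to make the assumptions explicit; the unit prices and retention windows (7 days SSD, 30 days magnetic) are the ones quoted above, not anything official:

    GB_PER_DAY = 2.0                                     # ~2M msgs/day at <= 1 KB each

    writes   = 2.0 * 0.50                                # $1.00 ($0.50 per million writes)
    memory   = GB_PER_DAY * 0.036                        # $0.072
    ssd      = (GB_PER_DAY * 7) * 0.01                   # 14 GB steady state -> $0.14
    magnetic = (GB_PER_DAY * 30) * 0.03 / 30             # 60 GB -> $1.80/month -> $0.06/day
    query    = 10_000 * (GB_PER_DAY / 280) / 1_000 * 10  # ~71 GB/day scanned -> ~$0.71

    total = writes + memory + ssd + magnetic + query
    print(f"~${total:.2f}/day, vs ~$20/day today")       # roughly $2/day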
by sciurus on 11/28/18, 5:55 PM
by addisonj on 11/28/18, 5:31 PM
From the little that was said, I'm going to guess this uses something like Beringei (https://code.fb.com/core-data/beringei-a-high-performance-ti...) under the hood.
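If so, the core trick (from the Gorilla paper that Beringei implements) is delta-of-delta timestamp encoding plus XOR compression of the values. A toy sketch of the timestamp half, with no claim that Timestream actually does this:

    # Toy delta-of-delta encoding, the Gorilla/Beringei trick for storing
    # regularly-spaced timestamps in a few bits per point. Illustrative
    # only; Timestream's actual internals are not public.
    def delta_of_delta(timestamps):
        deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
        return timestamps[0], deltas[0], [b - a for a, b in zip(deltas, deltas[1:])]

    # Points arriving every ~60s collapse to near-zero deltas-of-deltas,
    # which a bit-level encoder can store in 1-2 bits each:
    print(delta_of_delta([0, 60, 120, 180, 239, 300]))  # (0, 60, [0, 0, -1, 2])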
by plasma on 11/28/18, 8:02 PM
by axus on 11/28/18, 5:32 PM
by brootstrap on 11/28/18, 8:34 PM
by samstave on 11/28/18, 5:45 PM
I'll have to go look into this, because if AWS's historic pricing is any guide, any large-volume stream quickly becomes untenable.
It's very easy to end up with gobs and gobs of time series points... AWS might make using this way too expensive for anything at real scale for a small startup?
by brian_herman__ on 11/28/18, 5:24 PM
by probdist on 11/28/18, 5:32 PM
by erikcw on 11/28/18, 5:35 PM
by temuze on 11/28/18, 5:41 PM
IMO, it makes way more sense to decide the aggregations you want ahead of time (e.g. "SELECT customer, sum(value) FROM purchases GROUP BY customer"). That way, you deal with substantially less data and everything becomes a whole lot simpler.
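To illustrate, here's that GROUP BY maintained incrementally at ingest time instead of re-scanned at query time; the event shape is made up:

    from collections import defaultdict

    # Maintain the aggregate you'll actually query ("sum(value) per
    # customer") as events arrive, instead of scanning raw rows later.
    # The `purchase` record shape here is hypothetical.
    totals = defaultdict(float)

    def ingest(purchase):
        totals[purchase["customer"]] += purchase["value"]

    for p in [{"customer": "a", "value": 9.99},
              {"customer": "b", "value": 5.00},
              {"customer": "a", "value": 1.50}]:
        ingest(p)

    print(dict(totals))  # {'a': 11.49, 'b': 5.0}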
by MagicPropmaker on 11/28/18, 6:21 PM
by coredog64 on 11/28/18, 7:04 PM
CloudWatch metrics are also very expensive for what you get, so that's another similarity to Timestream ;)
by taf2 on 11/28/18, 5:53 PM
by mharroun on 11/28/18, 11:57 PM
by tjholowaychuk on 12/12/18, 12:33 PM
by inoiox on 11/28/18, 6:01 PM
by jopsen on 11/28/18, 5:32 PM
by booleandilemma on 11/28/18, 5:36 PM
by superkuh on 11/28/18, 6:08 PM
1. Amazon Timestream (amazon.com)
3. Amazon Quantum Ledger Database (amazon.com)
8. Amazon FSx for Lustre (amazon.com)
13. AWS DynamoDB On-Demand (amazon.com)
14. Amazon's homegrown Graviton processor was very nearly an AMD Arm CPU (theregister.co.uk)
21. Building an Alexa-Powered Electric Blanket (shkspr.mobi)
30. Amazon FSx for Windows File Server (amazon.com)
by nimbius on 11/28/18, 5:44 PM
by mLuby on 11/28/18, 6:30 PM