Summary: This is a summary of an article originally published by The New Stack. Read the full original article here →
General-purpose OLTP/OLAP databases are great, and not reinventing the wheel is always a good principle. However, that doesn’t mean every query use case is easy to implement correctly, fast to run, or reasonable in cost. Often, significant effort goes into designing and managing the database so that it supports the needed scale and complexity, and even then performance or cost can fall short of the target, especially as data and query volumes grow over time.
Of course, there are always other costs as well, such as data preparation in Spark and storage in S3.
Here are the initial results for two datasets, of 100 million and 500 million rows, stored in Parquet format in S3 and partitioned into 100 and 500 files, respectively.
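To make the dataset layout concrete, here is a minimal sketch of the arithmetic behind it. It assumes rows are spread evenly across files, which the summary does not state; the `LAYOUTS` dictionary and the `rows_per_file` helper are hypothetical names used only for illustration.

```python
# Dataset layouts quoted in the text: (total rows, number of Parquet files).
LAYOUTS = {
    "100M rows": (100_000_000, 100),
    "500M rows": (500_000_000, 500),
}


def rows_per_file(total_rows: int, num_files: int) -> int:
    """Average rows per file, ASSUMING an even split (not stated in the source)."""
    if num_files <= 0:
        raise ValueError("num_files must be positive")
    return total_rows // num_files


for name, (rows, files) in LAYOUTS.items():
    print(f"{name}: {files} files, ~{rows_per_file(rows, files):,} rows per file")
```

Under that assumption, both datasets work out to roughly the same number of rows per file, so the larger dataset differs mainly in file count rather than file size.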