Do you run a NoSQL database like Cassandra or MongoDB, in the cloud or on-prem, and find it terribly slow?
You are probably already paying a lot of money for the infrastructure, and slow is definitely something you don't need.
Cassandra is a very popular NoSQL database, widely deployed on AWS (Amazon Web Services) and GCP (Google Cloud Platform). The long-standing deployment dilemma, debated constantly over the past few years, is whether you should run Cassandra on local storage (ephemeral disks on AWS, local SSDs on GCP, local DAS on-prem) or on shared storage (EBS on AWS, PDs on GCP, SDS/SAN on-prem).
The former option offers great performance, but no flexibility, while the latter is very flexible, but cripples performance and adds cost.
DataStax documentation recommends running Cassandra on ephemeral disks instead of shared storage. However, many companies (Spotify, CrowdStrike, and Librato, among others) have lately been moving to shared storage to gain from its flexibility: for example, to migrate nodes easily and take backups.
In terms of CPU and RAM, you have a wide range of choices. The cloud providers offer various instance types, so that won't be a problem. But what about IOPS? The flexible, shared storage that cloud providers offer is great for keeping your data safe and always available. However, it comes with an important downside: it is poor in IOPS.
What if you could have your application running on extremely fast, local NVMe SSDs, while keeping the flexibility of shared storage at the same time?
Well this is what we are building, and we call it Rok.
Rok is decentralized storage for the cloud native world. We believe modern, highly mobile apps need to discover persistent data instantly and access it fast, anywhere they run. That’s why we designed the first software product to combine the performance of local storage with the flexibility of shared storage, while enabling seamless collaboration on data, across your global infrastructure. Rok allows you to run your stateful containers over fast, local NVMe storage on-prem or on the cloud, and still be able to snapshot the containers and distribute them efficiently: across machines of the same cluster, or across distinct locations and administrative domains over a decentralized network. We think performance and flexibility shouldn’t be mutually exclusive anymore. One should have both. Everywhere.
So, by running Cassandra with Rok you get all the advantages of running over local NVMe storage:
- extremely high IOPS
- I/O latency in the order of μs
- massive scale-out, with excellent scalability as the cluster grows
- significant cost savings compared to running over shared storage
while keeping all the advantages of shared storage:
- local backups
- offsite backups
- node migrations
Moreover, you get something that wasn’t possible before:
- collaboration at global scale
This means that you can take a snapshot of your NoSQL database along with its data, and share it with another user of a completely distinct administrative domain, at a distinct location. This is a perfect match for test & dev use cases, analytics, or forensics.
Now that I have your attention, I will let the numbers speak for themselves.
I will present an analysis of cost and performance for a 100TB cluster on AWS and GCP.
Cassandra on AWS
We are going to compare the cost and performance of two Cassandra clusters on AWS. Each one has 100TB of raw capacity.
| | Shared Storage | NVMe + Rok | Comparison |
|---|---|---|---|
| Storage Type | EBS (io1) | Local NVMe | – |
| Storage Capacity (raw) | 100 TB | 100 TB | – |
| Instance Type | c4.4xlarge | i3.4xlarge | – |
| Number of instances | 27 | 27 | – |
| Total vCPUs | 432 | 432 | same |
| Total GB of RAM | 810 | 3,294 | 4x better |
| Nominal aggregate write IOPS | 432 K | 9,720 K | 22x better |
| Nominal aggregate read IOPS | 432 K | 22,275 K | 51x better |
| Cost per month | $50,838 | $18,016 | 64% cheaper |
*Comparison of running a Cassandra cluster on AWS over shared storage (EBS) and over local NVMe shows that the latter approach results in 22 times more nominal aggregate write IOPS, 51 times more nominal aggregate read IOPS, and a 64% cost reduction.*
We can see that using Rok and local NVMe-backed instances on AWS, you get more than 51x the nominal aggregate read IOPS, and more than 22x the nominal aggregate write IOPS, with over 60% cost reduction, keeping all the flexibility you need.
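The table's aggregate figures can be sanity-checked with some quick arithmetic. The per-instance IOPS below are simply the table's totals divided by the 27 instances, so treat them as illustrative back-of-the-envelope numbers rather than official AWS specifications:

```python
# Back-derive per-instance IOPS from the table's aggregates (totals / 27),
# then recompute the totals and the headline ratios. Illustrative only.

instances = 27

ebs_iops_per_node = 16_000        # EBS (io1): 432K / 27
nvme_read_per_node = 825_000      # i3.4xlarge NVMe: 22,275K / 27
nvme_write_per_node = 360_000     # i3.4xlarge NVMe: 9,720K / 27

ebs_total = ebs_iops_per_node * instances            # 432,000
nvme_read_total = nvme_read_per_node * instances     # 22,275,000
nvme_write_total = nvme_write_per_node * instances   # 9,720,000

print(f"Read IOPS advantage:  {nvme_read_total // ebs_total}x")   # 51x
print(f"Write IOPS advantage: {nvme_write_total // ebs_total}x")  # 22x

# Monthly cost saving from the table's bottom line.
saving = 1 - 18_016 / 50_838
print(f"Cost reduction: {int(saving * 100)}%")  # 64%
```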
Cassandra on GCP
We are going to compare cost and performance of two Cassandra clusters on GCP. Each one has 100TB of raw capacity.
| | Shared Storage | NVMe + Rok | Comparison |
|---|---|---|---|
| Storage Type | PDs | Local NVMe | – |
| Storage Capacity (raw) | 100 TB | 100 TB | – |
| Instance Type | n1-standard-8 | n1-standard-8 | – |
| Number of instances | 34 | 34 | – |
| Total vCPUs | 272 | 272 | same |
| Total GB of RAM | 1,020 | 1,020 | same |
| Nominal aggregate write IOPS | 510 K | 12,240 K | 24x better |
| Nominal aggregate read IOPS | 510 K | 23,120 K | 45x better |
| Cost per month | $23,942 | $16,178 | 32% cheaper |
*Comparison of running a Cassandra cluster on GCP over shared storage (PDs) and over local NVMe shows that the latter approach results in 24 times more nominal aggregate write IOPS, 45 times more nominal aggregate read IOPS, and a 32% cost reduction.*
We can see that using Rok and local NVMe-backed instances on GCP, you get more than 45x the nominal aggregate read IOPS, and 24x the nominal aggregate write IOPS, with more than 30% cost reduction, keeping all the flexibility you need.
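The same quick sanity check applies to the GCP table. Again, the per-instance IOPS are simply the table's totals divided by the 34 instances, so treat them as illustrative rather than official GCP specifications:

```python
# Back-derive per-instance IOPS from the GCP table's aggregates (totals / 34),
# then recompute the totals and the headline ratios. Illustrative only.

instances = 34

pd_iops_per_node = 15_000         # Persistent Disks: 510K / 34
nvme_read_per_node = 680_000      # local NVMe SSDs: 23,120K / 34
nvme_write_per_node = 360_000     # local NVMe SSDs: 12,240K / 34

pd_total = pd_iops_per_node * instances              # 510,000
nvme_read_total = nvme_read_per_node * instances     # 23,120,000
nvme_write_total = nvme_write_per_node * instances   # 12,240,000

print(f"Read IOPS advantage:  {nvme_read_total // pd_total}x")   # 45x
print(f"Write IOPS advantage: {nvme_write_total // pd_total}x")  # 24x

# Monthly cost saving from the table's bottom line.
saving = 1 - 16_178 / 23_942
print(f"Cost reduction: {int(saving * 100)}%")  # 32%
```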
Conclusion
Now that the Rok data management platform adds the flexibility you need to this setup, and with NVMe prices going down, running on local NVMe storage is a very compelling option. We strongly recommend NVMe-backed instances combined with Rok; the performance boost you will experience, along with the associated cost savings, will surprise you.
If you have any questions, or want to learn more about the proposed solution, don’t hesitate to drop us a line at contact@arrikto.com or get started with MiniKF.