Clustrix, Inc. is a San Francisco-based company that develops a NewSQL database. The company was founded by Paul Mikesell (formerly of EMC Isilon) and
Sergei Tsarev (developer of Simple Time-series Database) and is headed by Mike Azevedo. The company is privately held and is backed by HighBAR Ventures, Sequoia Capital, U.S. Venture Partners (USVP), and ATA Ventures. Clustrix is a distributed primary SQL database.
Market
Clustrix is a scale-out SQL database and is counted among the NewSQL databases (modern relational database management systems), which began to gain mind share shortly after the NoSQL movement.
Clustrix is a mature product: it supports stored procedures and was designed and built before competing NewSQL databases. The product launched in 2006 and has served customers since 2008. Primary databases such as Microsoft SQL Server and MySQL supported online transaction processing and online analytical processing but were not distributed. Clustrix occupies this space with a distributed, ACID-compliant SQL database that scales transactions and supports real-time analytics. Other successful distributed SQL databases are columnar (they do not support primary transaction workloads) and focus on offline analytics; these include EMC Greenplum, HP Vertica, Infobright, and Amazon Redshift. Other notable players in the primary SQL database space are in-memory. These include VoltDB and MemSQL, which excel at low-latency transactions but do not target real-time analytics. NoSQL competitors such as MongoDB are good at handling unstructured data and read-heavy workloads but do not compete in the space for write-heavy workloads (no transactions, coarse-grained (DB-level) locking, and no SQL features such as joins), so the NewSQL and NoSQL databases are complementary.
Products
Clustrix is a primary scale-out SQL database. It supports workloads that involve scaling transactions and real-time analytics. The system is a drop-in replacement for MySQL and is designed to overcome MySQL scalability issues with a minimum of disruption to an enterprise's production activities. It has built-in fault-tolerance features for high availability within a cluster, and it provides parallel backup and parallel replication among clusters for disaster recovery.
Clustrix's database is available:
- as downloadable software
- in the Amazon Web Services Marketplace
Technology
Query evaluation
The Clustrix database operates on a distributed cluster of shared-nothing nodes using a query-to-data approach. Each node typically owns a subset of the data. SQL queries are split into query fragments, and each fragment is sent to the nodes that own the relevant data. This enables Clustrix to scale horizontally (scale out) as additional nodes are added.
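The query-to-data idea can be illustrated with a toy simulation. The sketch below is not Clustrix's implementation; the `Node` class, the modulo placement rule in `owner_of`, and the `orders` data are all hypothetical names and assumptions chosen for illustration. It shows a filter-and-aggregate query being run as a fragment on each node against only the rows that node owns, with the partial results merged afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy shared-nothing node that owns a subset of the rows."""
    name: str
    rows: list = field(default_factory=list)

    def run_fragment(self, predicate):
        # The fragment is evaluated only against rows this node owns.
        matching = [r for r in self.rows if predicate(r)]
        return len(matching), sum(r["amount"] for r in matching)


def owner_of(key, nodes):
    # Hypothetical placement rule (an assumption): modulo on an integer key.
    return nodes[key % len(nodes)]


def scatter_gather(nodes, predicate):
    # Send the same fragment to every node, then merge the partial aggregates.
    partials = [n.run_fragment(predicate) for n in nodes]
    count = sum(c for c, _ in partials)
    total = sum(t for _, t in partials)
    return count, total


nodes = [Node(f"node{i}") for i in range(3)]
for order_id in range(1, 10):
    row = {"id": order_id, "amount": order_id * 10}
    owner_of(order_id, nodes).rows.append(row)

# Roughly: SELECT COUNT(*), SUM(amount) FROM orders WHERE amount >= 50
result = scatter_gather(nodes, lambda r: r["amount"] >= 50)
print(result)
```

Because each node scans only its own rows, adding nodes reduces the work per node for the same data set, which is the intuition behind scaling out.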
Data distribution
The Clustrix database automatically splits data into slices and distributes them evenly across nodes, with each slice having copies on other nodes. Uniform data distribution is maintained when nodes are added or removed, or when data is inserted unevenly. This automatic data distribution approach removes the need to shard and enables Clustrix to maintain database availability in the face of node loss.
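A minimal sketch of replicated slice placement follows, under stated assumptions: the round-robin rule, the replica count of two, and the function names `place_slices`, `load`, and `survives_loss` are illustrative and are not taken from Clustrix. For simplicity the sketch recomputes the whole placement when a node is added; a real system would presumably move only as many slices as needed.

```python
from collections import Counter


def place_slices(num_slices, nodes, replicas=2):
    """Assign each slice to `replicas` distinct nodes, round-robin."""
    if replicas > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        s: [nodes[(s + r) % len(nodes)] for r in range(replicas)]
        for s in range(num_slices)
    }


def load(placement):
    # Number of slice copies held by each node.
    return Counter(n for owners in placement.values() for n in owners)


def survives_loss(placement, lost_node):
    # Data stays available if every slice has a copy on a surviving node.
    return all(
        any(n != lost_node for n in owners) for owners in placement.values()
    )


before = place_slices(12, ["n1", "n2", "n3"])
after = place_slices(12, ["n1", "n2", "n3", "n4"])  # a node is added

print(load(before))  # copies per node on three nodes
print(load(after))   # copies per node after adding n4
print(survives_loss(after, "n2"))
```

With 12 slices and two copies each, the 24 copies spread evenly over three nodes and again over four, and every slice keeps one copy when any single node is lost.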
Performance
In a performance test completed by Percona, a three-node cluster saw about a 73% increase in speed over a similarly equipped single MySQL server running tests with 1024 simultaneous threads. Additional nodes added to the Clustrix cluster provided roughly linear increases in speed.