What’s New in Apache Cassandra 4.0

By Deepak Vohra - March 16, 2020

Apache Cassandra is an open source, distributed NoSQL database based on the wide column model. The highly scalable, highly available database is great for handling large amounts of data across many commodity servers.

There is no set release date yet for the next version, Cassandra 4.0, but we do already know about several new features.

Support for Java 11

Java now follows a release cycle in which a new version ships every six months, but not all of these are long-term support (LTS) versions. Java 11 is the latest LTS version, and Cassandra 4.0 is adding experimental support for it. Experimental support means that it is not yet recommended for production use.

Virtual Tables

Virtual tables differ from regular tables in that they are backed by an API instead of by SSTables. This means that the data returned by a query on a virtual table is fetched from the dynamic state of the database.

Virtual tables are not meant to be created by a user. Instead, a fixed set of read-only virtual tables is provided. These tables expose the current settings in the cassandra.yaml configuration file, currently running SSTable tasks, system caches, and currently connected clients. Three virtual tables can be used to monitor the performance of the database, providing information about read, write, and scan latency. Other virtual tables present data about disk usage and internode inbound/outbound messaging.
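
To illustrate what "backed by an API" means, here is a minimal Python sketch of a read-only table whose rows are computed from live state each time it is queried. It is a toy model, not Cassandra's implementation; the VirtualTable class and the connected_clients list are hypothetical names invented for this example.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


class VirtualTable:
    """Toy read-only table: rows come from a callback, not from stored files."""

    def __init__(self, name: str, row_source: Callable[[], List[Row]]) -> None:
        self.name = name
        self._row_source = row_source

    def select(self) -> List[Row]:
        # Every read asks the callback for the current state.
        return self._row_source()

    def insert(self, row: Row) -> None:
        # Assumption for this toy model: all writes are rejected.
        raise PermissionError(f"{self.name} is read-only")


# Hypothetical in-memory state standing in for a node's live connections.
connected_clients: List[Row] = []

clients_table = VirtualTable(
    "clients", lambda: [dict(client) for client in connected_clients]
)

before = clients_table.select()
connected_clients.append({"address": "10.0.0.5", "username": "app"})
after = clients_table.select()

print(len(before), len(after))  # prints "0 1"
```

The second read sees the new client even though nothing was ever written to the table, which is the behavior the article describes: the result reflects current state rather than stored rows.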

Audit Logging

Audit logging records database activity to audit logs that are stored in the local filesystem. It captures all authentication attempts made against the database, whether the login succeeded or failed. All CQL commands (DDL and DML), whether failed or successful, are also logged. Audit logging is configured in the cassandra.yaml configuration file.
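
The kind of record an audit log keeps can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the AuditEntry fields, the three category names, and the keyword-based categorize function are hypothetical and do not reflect Cassandra's actual log format or classification rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AuditEntry:
    user: str
    category: str   # "AUTH", "DDL", "DML", or "OTHER" in this toy model
    operation: str
    success: bool


def categorize(cql: str) -> str:
    """Classify a statement by its first keyword (deliberately simplified)."""
    keyword = cql.strip().split()[0].upper()
    if keyword in {"CREATE", "ALTER", "DROP"}:
        return "DDL"
    if keyword in {"INSERT", "UPDATE", "DELETE"}:
        return "DML"
    return "OTHER"


audit_log: List[AuditEntry] = []


def record_login(user: str, success: bool) -> None:
    # Failed and successful logins are both recorded.
    audit_log.append(AuditEntry(user, "AUTH", "LOGIN", success))


def record_statement(user: str, cql: str, success: bool) -> None:
    # Statements are recorded whether or not they succeeded.
    audit_log.append(AuditEntry(user, categorize(cql), cql, success))


record_login("alice", success=False)
record_login("alice", success=True)
record_statement("alice", "CREATE TABLE ks.t (id int PRIMARY KEY)", success=True)
record_statement("alice", "INSERT INTO ks.missing (id) VALUES (1)", success=False)

failed = [entry for entry in audit_log if not entry.success]
print(len(audit_log), len(failed))  # prints "4 2"
```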

Full Query Logging

Full-query logging (FQL) is similar to audit logging, except it only logs queries. FQL is dedicated to requests made to the CQL interface. Audit logging also logs CQL requests but lacks features such as FQL Replay and FQL Compare.

FQL Replay can be used to replay the full query log for testing, debugging, and performance benchmarking. The replay can be run on a different machine or cluster, and repeated for different runs of production traffic, to compare different versions and configurations. The FQL Compare tool is used to compare the results output by FQL Replay.
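
The replay-then-compare workflow can be sketched as follows. This is a conceptual Python model, not the actual FQL tools: the replay and compare functions, the sample queries, and the two dictionaries standing in for clusters are all hypothetical.

```python
from typing import Callable, List, Tuple

Query = str
Result = Tuple[object, ...]


def replay(query_log: List[Query], execute: Callable[[Query], Result]) -> List[Result]:
    """Run every logged query, in order, against one target and collect results."""
    return [execute(query) for query in query_log]


def compare(run_a: List[Result], run_b: List[Result]) -> List[int]:
    """Return the positions at which two replay runs produced different results."""
    return [i for i, (a, b) in enumerate(zip(run_a, run_b)) if a != b]


# Hypothetical captured query log.
query_log = [
    "SELECT name FROM users WHERE id = 1",
    "SELECT name FROM users WHERE id = 2",
]

# Hypothetical stand-ins for two clusters: each maps a query to its result.
cluster_a = {query_log[0]: ("ada",), query_log[1]: ("bob",)}
cluster_b = {query_log[0]: ("ada",), query_log[1]: ("robert",)}

results_a = replay(query_log, cluster_a.__getitem__)
results_b = replay(query_log, cluster_b.__getitem__)

mismatches = compare(results_a, results_b)
print(mismatches)  # prints "[1]"
```

Here the second logged query returns different results on the two targets, which is the kind of discrepancy a comparison step is meant to surface when testing a new version or configuration.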

Transient Replication

To understand transient replication, we first need to understand how Cassandra performs repair. Cassandra stores multiple replicas of the same data for durability and high availability. When the replicas are no longer consistent with one another, a repair needs to be performed. Full repair is performed across the whole cluster, extending to newly added nodes. Incremental repair is performed only on data that has not been repaired previously.

Transient replication is used to create transient replicas that store data that has not been incrementally repaired. When sufficient numbers of full replicas become available, transient replicas stream the data they were storing to the full replicas. This is also an experimental feature.
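
The following Python sketch models that hand-off in a very simplified way. It is a toy model under stated assumptions, not Cassandra's algorithm: the Replica class, the write and incremental_repair functions, and the rule that repair runs only when every replica is up are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Replica:
    name: str
    transient: bool = False
    up: bool = True
    unrepaired: Set[str] = field(default_factory=set)
    repaired: Set[str] = field(default_factory=set)


def write(replicas: List[Replica], key: str) -> None:
    """Store a key as unrepaired data on every replica that is currently up."""
    for replica in replicas:
        if replica.up:
            replica.unrepaired.add(key)


def incremental_repair(replicas: List[Replica]) -> None:
    """Copy unrepaired data to full replicas, then drop it from transient ones."""
    if not all(replica.up for replica in replicas):
        return  # toy rule: repair only runs when every replica is reachable
    pending: Set[str] = set()
    for replica in replicas:
        pending |= replica.unrepaired
    for replica in replicas:
        if not replica.transient:
            replica.repaired |= pending
        replica.unrepaired.clear()


full_1 = Replica("full-1")
full_2 = Replica("full-2", up=False)  # temporarily unavailable
transient_1 = Replica("transient-1", transient=True)
replicas = [full_1, full_2, transient_1]

write(replicas, "row-42")  # lands on full-1 and the transient replica only
full_2.up = True           # the missing full replica comes back
incremental_repair(replicas)

print(sorted(full_2.repaired), sorted(transient_1.unrepaired))  # prints "['row-42'] []"
```

After the repair, both full replicas hold the row and the transient replica holds nothing, so in this model the transient replica only pays a storage cost while data is still unrepaired.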

About the Author

Deepak Vohra

Deepak is a Sun Certified Java Programmer and Web Component Developer, and has worked in the fields of XML, Java programming and Java EE for ten years. Deepak is the co-author of the Apress book Pro XML Development with Java Technology and was the technical reviewer for the O'Reilly book WebLogic: The Definitive Guide. Deepak was also the technical reviewer for the Course Technology PTR book Ruby Programming for the Absolute Beginner. Deepak is also the author of the Packt Publishing books JDBC 4.0 and Oracle JDeveloper for J2EE Development, Processing XML Documents with Oracle JDeveloper 11g, EJB 3.0 Database Persistence with Oracle Fusion Middleware 11g, and Java EE Development in Eclipse IDE. Deepak is a Docker Mentor and has published 5 books on Docker and Kubernetes.