Distributed Caching with AccelereX
- What is AccelereX?
- Current challenges with Caffeine Cache
- Benefits of Infinispan Cache
- How to integrate AccelereX into SMDS Applications?
- Trade-offs
What is AccelereX?
AccelereX is an in-house distributed, scalable caching platform designed for enterprise-grade applications. It addresses the limitations of local, single-node caching by providing distributed cache replication, persistence, scalability, and disaster recovery. It is built on top of the Infinispan OSS stack.
Current challenges with Caffeine Cache
While Caffeine is excellent for single-node caching, it falls short when we need distributed or shared caching.
- Limited to Single JVM: In a distributed system with multiple nodes, each node would have its own isolated cache, leading to potential inconsistencies or stale data.
- No Data Sharing Across Nodes: It cannot synchronize cache data between nodes. If one node updates the cache, other nodes remain unaware.
- No Built-in Fault Tolerance: If the node running Caffeine fails, the cached data is lost.
- Lack of Persistence: Cached data is stored entirely in memory and is lost when the application restarts.
- No Querying or Advanced Features: Caffeine lacks data grid features like querying, indexing, and event listeners.
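The first two limitations can be illustrated with a minimal Java sketch. It uses plain ConcurrentHashMap instances as stand-ins for per-JVM Caffeine caches (an assumption made to keep the example dependency-free), and the class names and keys are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for one application node with its own local cache.
// A real node would hold a Caffeine cache; a map keeps the sketch dependency-free.
class LocalCacheNode {
    private final Map<String, String> localCache = new ConcurrentHashMap<>();

    void put(String key, String value) {
        localCache.put(key, value);
    }

    String get(String key) {
        return localCache.get(key);
    }
}

class IsolatedCacheDemo {
    public static void main(String[] args) {
        LocalCacheNode nodeA = new LocalCacheNode();
        LocalCacheNode nodeB = new LocalCacheNode();

        // Both nodes cache the same value initially.
        nodeA.put("price:42", "100");
        nodeB.put("price:42", "100");

        // Node A sees an update; nothing propagates it to node B.
        nodeA.put("price:42", "120");

        System.out.println("node A: " + nodeA.get("price:42")); // prints "node A: 120"
        System.out.println("node B: " + nodeB.get("price:42")); // prints "node B: 100" (stale)
    }
}
```

Because each node owns a separate map, node B keeps serving the old value until its own entry expires or is overwritten.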
Benefits of Infinispan Cache
Infinispan overcomes the challenges of Caffeine and provides additional features for distributed systems:
- Distributed Caching: Keeps data consistent and synchronized across multiple nodes.
- High Availability: Provides fault tolerance through replication and recovery when nodes fail.
- Scalability: Handles increasing traffic or larger datasets by scaling horizontally with additional nodes.
- Data Persistence: Persists cached data to disk or an external store for durability across restarts.
- Advanced Features: Supports querying, indexing, and event listeners for complex use cases.
- Geo-Replication: Synchronizes data across geographically distributed clusters.
How to integrate AccelereX into SMDS Applications?
Before looking at how to integrate the Infinispan cache, it helps to define the two kinds of objects involved.
Application Cache vs Cache Cluster
Application Cache:
- This is the main object requested via API calls by the application.
- Access is controlled through Role-Based Access Control (RBAC), ensuring only cache owners or authorized identities can use it.
Cache Cluster:
- These are infrastructure components created by AccelereX to host application caches.
- Depending on workload requirements, AccelereX supports four types of cache clusters:
  - SHARED
  - SHARED-REPLICATED
  - DEDICATED
  - DEDICATED-REPLICATED
Workflow for AccelereX Integration
The diagram below illustrates how an application interacts with the Infinispan cluster using the Hot Rod protocol, which facilitates communication between the application and the distributed cache.
Components in the Workflow:
1. Application Layer
This layer represents the application using the Infinispan cache for various caching operations. It contains the following components:
a. Cache Inserter
- A component in the application responsible for adding data to the cache.
- It uses the Hot Rod client to interact with the Infinispan cluster.
- The Put Cache operation is used to insert or update data in the cache.
b. Component Demanding Cache
- This is a consumer or a part of the application that retrieves data from the cache when required.
- It performs the Get Cache operation to retrieve cached data via the Hot Rod client.
c. Near Cache
- A local cache maintained within the application to reduce latency.
- Frequently accessed data is stored in the near cache to avoid remote calls to the Infinispan cluster for every access.
- Data in the near cache is synchronized with the remote cache in the Infinispan cluster using listeners for cache events.
d. Listener
- A component that listens to cache events (e.g., updates, invalidations, or removals) from the Infinispan cluster.
- Ensures that the Near Cache remains consistent with the remote cache.
- Events like Put Cache/Update Cache are communicated to update or invalidate data in the near cache.
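The interaction between the Near Cache and the Listener can be sketched in plain Java. This is a single-threaded simulation under stated assumptions, not the Hot Rod client API: RemoteCacheStub and NearCachingClient are hypothetical names, and the stub runs in-process instead of talking to an Infinispan cluster.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical in-process stand-in for the remote cache in the Infinispan cluster.
class RemoteCacheStub {
    private final Map<String, String> data = new ConcurrentHashMap<>();
    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();
    int reads = 0; // counts simulated remote round trips (single-threaded demo only)

    void addListener(Consumer<String> onKeyChanged) {
        listeners.add(onKeyChanged);
    }

    // Put Cache: store the value, then notify every registered listener of the changed key.
    void put(String key, String value) {
        data.put(key, value);
        listeners.forEach(listener -> listener.accept(key));
    }

    // Get Cache: every call here represents a remote call.
    String get(String key) {
        reads++;
        return data.get(key);
    }
}

// Application-side view: a near cache kept consistent by a listener.
class NearCachingClient {
    private final Map<String, String> nearCache = new ConcurrentHashMap<>();
    private final RemoteCacheStub remote;

    NearCachingClient(RemoteCacheStub remote) {
        this.remote = remote;
        // Listener: drop the local copy whenever the remote entry changes.
        remote.addListener(nearCache::remove);
    }

    String get(String key) {
        String local = nearCache.get(key);
        if (local != null) {
            return local; // served locally, no remote call
        }
        String value = remote.get(key);
        if (value != null) {
            nearCache.put(key, value);
        }
        return value;
    }
}

class NearCacheDemo {
    public static void main(String[] args) {
        RemoteCacheStub remote = new RemoteCacheStub();
        NearCachingClient client = new NearCachingClient(remote);

        remote.put("user:1", "alice");   // Cache Inserter writes
        client.get("user:1");            // remote read, fills the near cache
        client.get("user:1");            // served from the near cache
        remote.put("user:1", "alicia");  // update triggers invalidation
        System.out.println(client.get("user:1") + ", remote reads = " + remote.reads);
        // prints "alicia, remote reads = 2"
    }
}
```

The sketch invalidates on every change rather than pushing the new value, so the next read after an update pays one remote call; updating the near cache in place is the alternative the Listener description mentions.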
2. Infinispan Cluster
This represents the distributed caching system powered by Infinispan. It is accessed by the application through the Hot Rod protocol. The following components exist in this layer:
a. Remote Cache
- The main cache maintained by the Infinispan cluster.
- Data is stored here and accessed remotely by application components using the Hot Rod client.
- This is the primary source of truth for cached data in a distributed environment.
b. Replicated Cache (if enabled)
- Optional component for high availability.
- Data in the remote cache is replicated across multiple nodes within the Infinispan cluster.
- Ensures fault tolerance and improved read performance by replicating data to other nodes in the cluster.
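The effect of replication on fault tolerance can be shown with a small simulation. This is a sketch under simplifying assumptions: ReplicatedCacheSketch is a hypothetical class, each "node" is an in-memory map, and every write is copied synchronously to all nodes. It does not model how Infinispan itself manages cluster membership or data transfer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical model of a replicated cache: every node holds a full copy of the data.
class ReplicatedCacheSketch {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    ReplicatedCacheSketch(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new ConcurrentHashMap<>());
        }
    }

    // A write is applied to every live node.
    void put(String key, String value) {
        for (Map<String, String> node : nodes) {
            node.put(key, value);
        }
    }

    // A read can be served by any live node; the first one is used here.
    String get(String key) {
        return nodes.isEmpty() ? null : nodes.get(0).get(key);
    }

    // Simulate a node failure by removing it from the cluster.
    void failNode(int index) {
        nodes.remove(index);
    }

    int liveNodes() {
        return nodes.size();
    }
}

class ReplicationDemo {
    public static void main(String[] args) {
        ReplicatedCacheSketch cache = new ReplicatedCacheSketch(3);
        cache.put("session:7", "active");
        cache.failNode(0); // lose one node
        System.out.println(cache.get("session:7") + " on " + cache.liveNodes() + " nodes");
        // prints "active on 2 nodes"
    }
}
```

The cost of this model is that every write touches every node, which is part of why replication is optional and selected per cluster type.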
Trade-offs
While Infinispan addresses the challenges above, it introduces trade-offs of its own:
- Increased Complexity: Setting up and managing a distributed cache is more complex than a local cache like Caffeine.
- Dual Data Maintenance: Introducing a remote cache means the same data resides in two places: the database and the remote cache. When data is updated in the database, the cache must also be updated or invalidated; without proper synchronization, the cache may serve stale values. (To check: is there an automated way to update the remote cache on a database update?)
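One common application-level way to handle dual data maintenance is to evict the cached entry in the same code path that writes to the database. The sketch below illustrates that idea only; it does not answer the open question of whether an automated mechanism exists. ProductRepository is a hypothetical class, and both the database and the remote cache are modelled as in-memory maps.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical repository; both maps are stand-ins for the real database
// and the remote cache reached through the Hot Rod client.
class ProductRepository {
    private final Map<String, String> database = new ConcurrentHashMap<>();
    private final Map<String, String> remoteCache = new ConcurrentHashMap<>();

    // Read path (cache-aside): serve from the cache, load from the database on a miss.
    String find(String id) {
        return remoteCache.computeIfAbsent(id, database::get);
    }

    // Problematic write path: the database changes but the cache is not told.
    void updateDatabaseOnly(String id, String value) {
        database.put(id, value);
    }

    // Safer write path: update the database, then evict the cached entry
    // so that the next read reloads the current value.
    void update(String id, String value) {
        database.put(id, value);
        remoteCache.remove(id);
    }
}

class DualMaintenanceDemo {
    public static void main(String[] args) {
        ProductRepository repo = new ProductRepository();

        repo.updateDatabaseOnly("p1", "v1");
        System.out.println(repo.find("p1")); // prints "v1" (miss, loaded and cached)

        repo.updateDatabaseOnly("p1", "v2");
        System.out.println(repo.find("p1")); // prints "v1" (stale cache entry)

        repo.update("p1", "v3");
        System.out.println(repo.find("p1")); // prints "v3" (entry evicted, reloaded)
    }
}
```

Eviction only covers writes that go through this code path. Changes made directly in the database by other systems would still leave stale entries, which is why expiry settings or an event-driven update remain worth investigating.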