Before we start wondering about the existence, use, need and architecture of the In-Memory Data Grid (IMDG), we first need to look into the concept of in-memory processing.
So, what exactly is in-memory processing? In-memory processing is the practice of working on data held in faster memory (i.e., RAM) rather than accessing the data on disk. One way to make this practice more effective is to eliminate access to disk storage altogether by loading the entire data set into RAM.
Let’s get back to the real topic, the In-Memory Data Grid (IMDG). An IMDG is an architecture that provides many features, but its fundamental design revolves around in-memory data processing. I am sure many people want a one-sentence definition of an IMDG, so I am not going to disappoint them.
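To make the idea concrete, here is a minimal Python sketch that compares a lookup that goes to disk every time with a lookup against data loaded into RAM once. The record layout, the file name and the helper names (`lookup_on_disk`, `load_into_memory`) are assumptions made up for this illustration; they do not come from any IMDG product.

```python
import json
import os
import tempfile

# Hypothetical data set: 1,000 small records written to disk as JSON lines.
records = {f"user:{i}": {"id": i, "score": i * 2} for i in range(1000)}

path = os.path.join(tempfile.mkdtemp(), "records.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for key, value in records.items():
        f.write(json.dumps({"key": key, "value": value}) + "\n")


def lookup_on_disk(key):
    """Scan the file on every call, so each lookup pays for disk access."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            row = json.loads(line)
            if row["key"] == key:
                return row["value"]
    return None


def load_into_memory():
    """Read the whole file once and keep every record in a dict (RAM)."""
    with open(path, encoding="utf-8") as f:
        return {row["key"]: row["value"] for row in map(json.loads, f)}


in_memory = load_into_memory()

# Both approaches return the same record; only the access path differs.
disk_result = lookup_on_disk("user:42")
memory_result = in_memory.get("user:42")
print(disk_result == memory_result)
```

After the one-time load, every lookup is a dictionary access in RAM and the file is never touched again, which is the effect in-memory processing is after.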
Definition: An in-memory data grid is a collection of networked computers (machines, or nodes) that pool their fast memory (RAM) to store a large amount of data, which can be shared among the applications connected to the same cluster.
What does an IMDG look like?
An IMDG can be represented visually as follows:
Memory Management in IMDG
The diagram below shows memory and data management in an IMDG system.
As shown in the diagram, the individual memory of each node is combined to act as one huge block of memory, and the data is scattered across the cluster. You may notice that some data is duplicated across the grid (e.g., D1 is present in the RAM of both Node 1 and Node 2); we will talk about that at a later stage.
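The layout described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `ToyGrid` class, the node names and the placement rule (hash the key to pick a primary node, then keep one backup copy on the next node in the list) are invented for illustration, and real IMDG products use their own partitioning schemes.

```python
import hashlib


def stable_hash(key):
    """Hash that stays the same across runs (built-in hash() of str does not)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class ToyGrid:
    """Toy model: each node's RAM is a dict, and the grid is all of them together."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        self.nodes = {name: {} for name in self.node_names}

    def owners(self, key):
        """Return (primary, backup); the backup is the next node in the list."""
        i = stable_hash(key) % len(self.node_names)
        primary = self.node_names[i]
        backup = self.node_names[(i + 1) % len(self.node_names)]
        return primary, backup

    def put(self, key, value):
        # Store the entry on the primary node and on its backup node.
        for name in self.owners(key):
            self.nodes[name][key] = value

    def get(self, key):
        primary, _ = self.owners(key)
        return self.nodes[primary].get(key)


grid = ToyGrid(["node1", "node2", "node3"])
for i in range(1, 7):
    grid.put(f"D{i}", f"value-{i}")

# Show which entries ended up in which node's memory.
for name, store in grid.nodes.items():
    print(name, sorted(store))
```

Each entry appears in the memory of exactly two nodes, which mirrors the duplicated D1 in the diagram, while an application only ever calls `put` and `get` on the grid as a whole.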
There are various IMDG platforms and products on the market. However, the features they offer vary from one product to another.
A few main features of in-memory data grids:
- Low latency
- High performance/throughput
- Data durability
- Management of large datasets
- Scalability
Extremely low latency and high performance/throughput: An IMDG keeps all the necessary data in memory (RAM), so reads and writes avoid disk access, which gives mission-critical applications a large performance benefit.
Data durability: An IMDG stores the data in a distributed manner and also keeps copies of the data on multiple nodes, which makes the data more durable when one of the nodes disappears from the network due to a technical failure. If you remember, we noticed earlier that the same data was stored on different nodes. This duplication is there to make the system more durable and fault-tolerant.
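Here is a small, hedged Python sketch of that idea: every entry is written to two nodes, one node is then taken away, and all the data is still readable from the survivors. The node names, the `put` and `get` helpers and the two-copy rule are assumptions for this example, not the behaviour of any specific product.

```python
import zlib

# Each node's RAM is modelled as a plain dict (hypothetical three-node cluster).
nodes = {"node1": {}, "node2": {}, "node3": {}}
names = sorted(nodes)


def put(key, value, copies=2):
    """Write the entry to `copies` consecutive nodes, starting at a hashed position."""
    start = zlib.crc32(key.encode("utf-8")) % len(names)
    for offset in range(copies):
        nodes[names[(start + offset) % len(names)]][key] = value


def get(key, alive):
    """Read the entry from any node that is still reachable."""
    for name in alive:
        if key in nodes[name]:
            return nodes[name][key]
    raise KeyError(key)


for i in range(1, 11):
    put(f"D{i}", i * 100)

# Simulate node2 disappearing from the network.
alive = [name for name in names if name != "node2"]
recovered = {f"D{i}": get(f"D{i}", alive) for i in range(1, 11)}
print(len(recovered))
```

With two copies per entry, the grid in this sketch survives the loss of any single node. Losing two nodes at once could still lose data, so the number of copies is a trade-off between memory use and fault tolerance.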
Scalability: An IMDG lets you scale the infrastructure when needed by adding new servers to the cluster, and it is smart enough to redistribute the data across all the nodes.
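To give a feel for what redistributing the data can mean, the sketch below uses rendezvous (highest-random-weight) hashing, a general technique chosen here purely for illustration. It is not taken from any particular IMDG, and the `score` and `owner` helpers and the node names are hypothetical.

```python
import hashlib


def score(node, key):
    """Deterministic weight for a (node, key) pair."""
    digest = hashlib.sha256(f"{node}:{key}".encode("utf-8")).hexdigest()
    return int(digest, 16)


def owner(key, nodes):
    """The node with the highest weight for this key owns it."""
    return max(nodes, key=lambda node: score(node, key))


keys = [f"D{i}" for i in range(1000)]

# Ownership before and after a fourth server joins the cluster.
before = {k: owner(k, ["node1", "node2", "node3"]) for k in keys}
after = {k: owner(k, ["node1", "node2", "node3", "node4"]) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved to the new node")
```

With this scheme, only the entries that the new node now owns have to move; everything else stays where it was, so the cluster can grow without reshuffling the whole data set.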
Where can we use IMDG?
There are many areas where we can use an IMDG. A few of them are mentioned below:
- Analytics and Machine Learning
- Data replication
- Data visualization
- Multi Data Mode Storage
- Mission-critical applications that need low latency
Please note that there are many other areas in which IMDGs are used or could be used, but the list is too long to cover here.
Where should we not use IMDG?
An IMDG is not for every application! Here are some cases where we should not implement one:
- Small data sets
- Non-mission-critical applications
- Applications with a heavy dependency on SQL
IMDGs have been around for more than a decade. However, due to the massive demand for in-memory processing from mission-critical applications, their popularity and use have surged recently. There are many flavours of IMDG on the market, each with different features, and it is up to you to determine which one suits your application.