Why is it recommended to keep dynamic data closer to the compute and static data closer to the end user in cloud computing?

Data proximity is an important principle of cloud computing. If we keep the right kind of data in the right place, it helps us build an efficient enterprise software system.

Keeping dynamic data close to compute resources reduces latency during processing, because servers do not have to fetch data from remote locations. MapReduce frameworks follow the same idea: they try to schedule each task on or near the node that already holds its input data.

Network latency is inherent in any cloud computing environment, so this practice improves overall performance by cutting the time spent moving data between servers before it can be processed.
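To make the latency argument concrete, here is a rough back-of-the-envelope model. The function name and all figures (record count, round-trip times, compute time) are hypothetical and chosen only for illustration; real numbers depend on the provider and the workload.

```python
def total_time_ms(num_fetches, round_trip_ms, compute_ms_per_item):
    """Time for a job that fetches each item over the network, then processes it."""
    return num_fetches * (round_trip_ms + compute_ms_per_item)


# Hypothetical workload: 10,000 records, 2 ms of compute per record.
# Assumed round-trip times: 1 ms within a zone, 80 ms across regions.
same_zone_ms = total_time_ms(10_000, round_trip_ms=1, compute_ms_per_item=2)
cross_region_ms = total_time_ms(10_000, round_trip_ms=80, compute_ms_per_item=2)

print(f"same zone:    {same_zone_ms / 1000:.0f} s")
print(f"cross region: {cross_region_ms / 1000:.0f} s")
```

Under these assumed numbers the compute work is identical in both cases; the difference comes entirely from where the data lives.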

Another benefit is cost. Cloud providers typically bill data transfer by the gigabyte, so moving data back and forth between distant locations adds to the overall bill. Keeping data close to the compute that uses it avoids much of that charge.
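A quick estimate shows how per-gigabyte billing adds up. The price and volume below are assumptions made up for this sketch, not any provider's actual rates; amounts are kept in whole cents to avoid floating-point rounding.

```python
def transfer_cost_cents(gb_transferred, price_cents_per_gb):
    """Cost of moving data, in cents, at a flat per-GB price."""
    return gb_transferred * price_cents_per_gb


# Hypothetical: 500 GB pulled from a remote location every day for 30 days,
# billed at an assumed 9 cents per GB.
monthly_gb = 500 * 30
monthly_cost_cents = transfer_cost_cents(monthly_gb, price_cents_per_gb=9)

print(f"{monthly_gb} GB costs ${monthly_cost_cents / 100:.2f} per month")
```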

When a large chunk of external data has to be processed in the cloud, we first transfer the data to nodes near the execution environment and then process it in parallel. It is common practice in data warehouse operations to first move the entire database into the cloud and then process it in parallel threads.
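The "move first, then process in parallel" pattern described above can be sketched as follows. This is a minimal illustration, not a production pipeline: `stage_locally` is a hypothetical stand-in for a one-time bulk copy to storage near the compute nodes, and the per-chunk work is just a sum.

```python
from concurrent.futures import ThreadPoolExecutor


def stage_locally(remote_records):
    """Stand-in for a one-time bulk copy to storage near the compute nodes."""
    return list(remote_records)


def split(records, num_chunks):
    """Split records into at most num_chunks roughly equal slices."""
    size = max(1, -(-len(records) // num_chunks))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


def process_chunk(chunk):
    """Placeholder for the real per-chunk work."""
    return sum(chunk)


def run_job(remote_records, workers=4):
    local = stage_locally(remote_records)      # pay the transfer cost once
    chunks = split(local, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunks))
    return sum(partials)                       # combine partial results


total = run_job(range(1, 101))
print(total)
```

Threads are used here because the text mentions parallel threads. In CPython, CPU-bound work usually benefits more from processes or from separate worker nodes, so treat the executor as a placeholder for whatever parallel runtime is actually in use.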

In multi-tier web applications, data is stored in and retrieved from relational databases. In such a scenario, the recommended architecture is to create the app server and database nodes in the same cloud environment. Data transfer between nodes within the same cloud is generally free, so keeping the app and database nodes together saves both time and money.

For static data such as images, PDFs, and videos, the recommended approach is to keep it closer to the end user. This kind of data can be cached on nodes that are close to the users consuming it, which can drastically reduce access latency and provide a better user experience.
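The caching idea above can be sketched with a tiny in-memory edge cache. The class, the `origin` function, and the file path are all hypothetical; a real edge node would also handle expiry, eviction, and invalidation, which are left out here.

```python
class EdgeCache:
    """Serves static files from local memory, going to the origin only on a miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin
        self._store = {}
        self.origin_requests = 0

    def get(self, path):
        if path not in self._store:
            # Cache miss: one slow trip to the distant origin server.
            self.origin_requests += 1
            self._store[path] = self._fetch(path)
        # Cache hit (or freshly filled entry): served from the nearby node.
        return self._store[path]


def origin(path):
    """Stand-in for the origin server that holds the static files."""
    return f"<bytes of {path}>"


edge = EdgeCache(origin)
first = edge.get("/img/logo.png")   # miss, fetched from origin
second = edge.get("/img/logo.png")  # hit, served from the edge

print(edge.origin_requests)
```

Only the first request pays the long round trip; every later request for the same file is answered by the node closest to the user.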
