Cloud Enabled Scalability – My Story





Something prompted me to blog again after a long time, and I am not sure whether I will stay motivated to continue. To start, I want to share a long-pending story about an experience I had eight years ago.

The best weapon the rise of cloud has given us is on-demand scalability (my reference point is eight years back; everyone knows this now). Most of the time when we talk about cloud, we talk only about managing cloud infrastructure and not about enabling the application itself for cloud adoption, which is sad.

I am going to talk about my experience of making an application cloud-enabled and the design principles we used. Since this experience is eight years old (and I am a bit rusty now), I will walk through the journey as a case study.

That’s How It Started


Our story starts back in 2011, when our hero, Mr Dan, came up with an idea to monetize online imagery content through advertising (back then it was a cool idea; Google was doing it only for its videos). I will not go into the architectural design of the application, as that is not the purpose of this blog.

Let me jump to the main turning point of our story. Application version 1.0 was ready. The pilot website was ready. A time was chosen to show the world the new power of monetizing website real estate.

That’s How It Unfolded


At 6 AM, the whole customer team was ready to see the results. My team was at the console, reading the pulse of the application. Everything was set.



And ....

In 10 minutes, the application went BOOM. We were all devastated that our whole effort had blown up in minutes.
The retrospective showed that the lexical relevance and proximity algorithms were good. Coding standards were fine. So what went wrong?

WE COULD NOT SCALE with the traffic flow.

Let’s get to the point now: what we did to fix the issue. Below are the scalability design principles we came up with and implemented.

That’s How It Changed


The following are the lessons from that experience, in the order we implemented them, for making your application scalable and able to leverage cloud infrastructure. This “Design Pattern for Scalable Application” was derived almost 8 years ago, but it still holds good.

Concurrency



First and foremost, your application should support concurrency. This means the application design should let mutually independent tasks be processed independently of each other, so that they can all make progress at the same time. It helps us make the best use of CPU time slicing. In other words, the first goal is to use each CPU to the best possible capacity.

In Java, the concurrency framework comes to your rescue. Concurrent programming mainly deals with two units of execution: processes and threads. Without going into detail, a process contains threads, and threads are often described as lightweight processes. The java.util.concurrent package offers better support for concurrency than using Thread objects directly.

What we did

We created separate processes for the ‘Crawler’, ‘Data Processing’ and ‘Best Match’ stages that run on each URL, as these were completely independent pieces of the puzzle. This let us control resource allocation for each one individually. Within the ‘Data Processing’ process, each URL was processed in its own thread.
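To make the "one thread per URL" idea concrete, here is a minimal sketch using java.util.concurrent. It is not our original code: the class name, the method names and the fake processing step (it only measures the URL length) are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Illustrative sketch only: class, method names and the "processing" are hypothetical.
class UrlProcessingSketch {

    // Stand-in for the real per-URL data processing; here it only measures the URL length.
    static int processUrl(String url) {
        return url.length();
    }

    // Runs one independent task per URL on a fixed-size thread pool
    // and returns the results in the same order as the input.
    static List<Integer> processAll(List<String> urls, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String url : urls) {
                Callable<Integer> task = () -> processUrl(url);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get()); // blocks until that URL's task has finished
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while processing URLs", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A URL task failed", e.getCause());
        } finally {
            pool.shutdown(); // let the worker threads exit once all tasks are done
        }
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList("http://a.example", "http://bb.example");
        System.out.println(processAll(urls, 2));
    }
}
```

A bounded pool is used here instead of literally one new thread per URL, so a traffic spike queues work rather than exhausting the machine. That is a design choice of this sketch, not a claim about what we built back then.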

Parallelism



It means the ability to break a task into smaller tasks that can run at the same time, so the whole job completes faster. Divide and conquer is the best example of the approach (remember merge sort). Another excellent design example of parallel processing is Map / Reduce.

Our privilege of relying on ever-faster CPUs ran out in the mid-2000s, when clock speeds stopped climbing because of power and heat limits, even though transistor counts kept growing. That started the era of multicore processors and the need for parallelism.

Back then, Java had a concurrency framework but little built-in support for parallelism; the Fork/Join framework only arrived in java.util.concurrent with Java 7, released in mid-2011. Later we saw the rise of languages such as Go, which were designed with this challenge in mind.
Java 8 then added streams, including parallel streams, which allow a MapReduce style of processing.

What we did

Unfortunately, we did not have much language or framework support back then. We went back to basics and implemented fork and join with plain threads. Each URL had to be evaluated against multiple data analysis criteria to arrive at a page score. Each criterion’s score was calculated in a separate child thread (divide), and the results were then merged into the final page score using weights (conquer).
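The divide and conquer scoring described above could look roughly like the sketch below. It uses plain threads with start() as the fork and join() as the join. The criteria (text length and word count) and the weights are invented for the example; they are not the ones we actually used.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Illustrative sketch only: the criteria, weights and names are hypothetical.
class PageScoreSketch {

    // Scores one page: every criterion runs in its own child thread (divide),
    // then the weighted partial scores are summed into the final score (conquer).
    static double score(String pageText, List<ToDoubleFunction<String>> criteria, double[] weights) {
        if (criteria.size() != weights.length) {
            throw new IllegalArgumentException("Need exactly one weight per criterion");
        }
        double[] partial = new double[criteria.size()];
        Thread[] workers = new Thread[criteria.size()];
        for (int i = 0; i < workers.length; i++) {
            final int idx = i;
            workers[i] = new Thread(() -> partial[idx] = criteria.get(idx).applyAsDouble(pageText));
            workers[i].start(); // fork
        }
        double total = 0.0;
        for (int i = 0; i < workers.length; i++) {
            try {
                workers[i].join(); // join: after this returns, the child's result is visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while scoring", e);
            }
            total += weights[i] * partial[i];
        }
        return total;
    }

    public static void main(String[] args) {
        List<ToDoubleFunction<String>> criteria = Arrays.asList(
                text -> text.length(),               // made-up criterion 1: text length
                text -> text.split("\\s+").length);  // made-up criterion 2: word count
        double[] weights = {0.5, 2.0};
        System.out.println(score("buy shoes now", criteria, weights));
    }
}
```

On a current JDK the same shape is usually expressed with ForkJoinPool or a parallel stream instead of hand-managed threads.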

Remove Contention



If your application has a non-scalable module on which all or most of the application depends, you have a contention point. Such points need to be removed, or your other scalability improvements may lose their relevance to the overall solution. There is no silver bullet for this. It all depends on the application design and the needs of the solution.

What we did

We did a quick performance evaluation of the application and realized that, even though we had implemented parallel processing, all the different pieces were heavily dependent on the DB. This prompted three decisions. First, we redesigned the DB and de-normalized the tables based on each module’s specific needs, rather than keeping a purely normalized DB. Second, the persistence layer needed to become a scalable service so that it could use load-balancing capabilities. Lastly, we reduced DB I/O by using an in-memory DB for operations whose persistence could wait for some time.
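The third decision is essentially a write-behind buffer: keep the latest values in memory and push them to the database in batches. Below is a minimal sketch under that assumption. The class, its methods and the Consumer standing in for the real database write are all hypothetical, and a real version would need a flush schedule and a plan for data lost if the process dies before a flush.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

// Illustrative sketch only: a tiny write-behind buffer with a hypothetical API.
class WriteBehindBufferSketch {
    private final Map<String, Double> pending = new ConcurrentHashMap<>();
    private final Consumer<Map<String, Double>> batchWriter; // stand-in for the real DB write

    WriteBehindBufferSketch(Consumer<Map<String, Double>> batchWriter) {
        this.batchWriter = batchWriter;
    }

    // Hot path: keep only the latest score per URL in memory, with no DB round trip.
    void put(String url, double score) {
        pending.put(url, score);
    }

    // Returns the buffered value, or null if nothing is waiting for this URL.
    Double get(String url) {
        return pending.get(url);
    }

    // Meant to be called periodically: drains what is buffered and writes it as one batch.
    // Values put while a flush is running simply stay buffered for the next flush.
    synchronized int flush() {
        Map<String, Double> batch = new HashMap<>();
        for (String url : pending.keySet()) {
            Double score = pending.remove(url);
            if (score != null) {
                batch.put(url, score);
            }
        }
        if (!batch.isEmpty()) {
            batchWriter.accept(batch);
        }
        return batch.size();
    }

    public static void main(String[] args) {
        List<Map<String, Double>> fakeDb = new ArrayList<>();
        WriteBehindBufferSketch buffer = new WriteBehindBufferSketch(fakeDb::add);
        buffer.put("http://a.example", 1.0);
        buffer.put("http://a.example", 2.0); // overwrites in memory, still a single DB write
        buffer.put("http://b.example", 3.0);
        System.out.println("rows written: " + buffer.flush() + ", batches: " + fakeDb.size());
    }
}
```

The trade-off is explicit: many in-memory updates collapse into one batch write, at the price of a window in which the database is behind.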

Microism / Modularism

Most of us know what this means by 2020. The more modular we make our application design, or the more micro we make our services, the better our chances of using cloud infrastructure to scale the application at runtime.

What we did

As you have seen, each of the previous steps contributed to making the services modular. The more formalized version of this is today’s microservices architecture.

Hardware Scaling

The last step is to leverage hardware scaling. Cloud gives us the capability to scale hardware at length. By the time we reach this step, we are the king. The first step is vertical scaling, where we add a higher-capacity CPU; with concurrency implemented, we are ready to use it to our advantage. The second step is again vertical scaling, this time by adding multicore processing power; with parallelism, we are ready to use the added horsepower.
The next step is horizontal scaling. We add load-balanced machines based on the firepower each individual component of the application needs (DB, persistence, analytics, etc.), and since our application is modular and micro, we are all set to take advantage.

Conclusion


From what I have learned, if you really need to leverage the cloud for limitless scale, follow these steps in order when you refactor your application, and then the world is yours.

What happened to us

We refactored the whole application in 2 months and had a major success. It was a big lesson in how the right decisions lead to success. Many of the things we implemented at that time are now readily available in languages or frameworks. We had to create our own producer-consumer model, but now we have Apache Kafka. Similarly, we chose a binary protocol rather than text data protocols like JSON for all internal data exchange.
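For readers who have not built one by hand, a producer-consumer model in a single JVM can be sketched with a BlockingQueue, as below. This is a toy illustration and not our original implementation. The names and the end-of-stream marker are assumptions, and it is no substitute for a distributed log like Kafka.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Illustrative sketch only: one producer, one consumer, hypothetical names throughout.
class ProducerConsumerSketch {
    // Sentinel that tells the consumer to stop; assumed never to be a real URL.
    private static final String END = "__END__";

    static List<String> run(List<String> urls, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> processed = new ArrayList<>(); // touched only by the consumer until join()

        Thread producer = new Thread(() -> {
            try {
                for (String url : urls) {
                    queue.put(url); // blocks while the queue is full (back-pressure)
                }
                queue.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread consumer = new Thread(() -> {
            try {
                for (String url = queue.take(); !END.equals(url); url = queue.take()) {
                    processed.add("processed:" + url); // stand-in for the real work
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        producer.start();
        consumer.start();
        try {
            producer.join();
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }
        return processed;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("a", "b", "c"), 1));
    }
}
```

The bounded queue is the important part: when the consumer falls behind, the producer is slowed down instead of memory filling up.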

I hope I made some sense, or that you at least enjoyed the story. Thanks for reading.
