Over the last couple of years, "Cloud Computing" has replaced Web 2.0 as the new buzzword. Everywhere you read, hear and see that the cloud is coming. To most developers, this is still the same old sh*t. If you have experience in developing distributed systems then you should be fine, you say. Well, that is not entirely true: the IT department wants to deploy on a cheap cloud, and therefore some restrictions now apply. I will list 5 things that I think all developers should know when working with a cloud Platform as a Service (PaaS) provider such as Amazon Beanstalk or Google App Engine (GAE). This list also applies to IaaS architectures. Some of the points might be obvious to the more experienced; nevertheless, they need to be mentioned.
- Static objects
- Caching Objects
This one is related to performance: we cache to avoid expensive operations such as running database queries. Sometimes we need to cache objects in memory, so we implement our own caching strategy with a simple HashMap or one of the other caching solutions available out there. Caching has many benefits, but implementing a caching strategy should be approached with care, because caching has the same problem as static objects. Your cache lives in the local JVM, therefore it will not be visible to the other instances in the cluster. There are some solutions: for example, GAE offers a Memcache service and Beanstalk can make use of Amazon ElastiCache, which is compliant with Memcached. When developing for a PaaS environment, make sure not to implement your own caching system but look for one that is supported by the vendor. I know this can lead to vendor lock-in.
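One way to soften that lock-in is to keep the vendor's cache behind a small interface of your own. What follows is only a minimal sketch under my own assumptions: `AppCache`, `LocalAppCache` and `CachedLookup` are hypothetical names, and the in-memory implementation is a stand-in for local tests. On a real PaaS you would implement the same interface on top of the vendor-supported service, which is not shown here.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Minimal cache contract the application codes against (hypothetical name). */
interface AppCache {
    Object get(String key);
    void put(String key, Object value);
}

/** Local stand-in for development and tests only; NOT shared across JVMs. */
class LocalAppCache implements AppCache {
    private final Map<String, Object> entries = new ConcurrentHashMap<>();

    @Override
    public Object get(String key) {
        return entries.get(key);
    }

    @Override
    public void put(String key, Object value) {
        entries.put(key, value);
    }
}

/** Cache-aside lookup: try the cache first, fall back to the expensive loader. */
class CachedLookup {
    private final AppCache cache;
    private final Function<String, Object> loader;

    CachedLookup(AppCache cache, Function<String, Object> loader) {
        this.cache = cache;
        this.loader = loader;
    }

    Object find(String key) {
        Object value = cache.get(key);
        if (value == null) {
            // Cache miss: run the expensive operation (e.g. a database query).
            value = loader.apply(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }
}
```

The application only ever sees `AppCache`, so swapping the local map for a clustered cache should not touch the calling code.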
- Server-side Session
Something we take for granted in a single-server environment is storing application session data on the server. From experience, mainly with GAE, I have encountered multiple issues with session management. Since then, Google has fixed a lot of the issues with the way GAE handles sessions for Java applications. To minimize writing the session to a datastore, we store application state in memory. Most applications are written without any vendor-specific approach in mind, so we use JEE as-is. This approach would work if you deploy to any self-hosted clustered environment, but not on Google's PaaS. Google implements its own session management, which is off by default; therefore you need to enable it in appengine-web.xml and make sure that all the objects you put in the session implement the java.io.Serializable interface.
Note: session data is always written synchronously to memcache. If a request tries to read the session data when memcache is not available (or the session data has been flushed), it will fail over to the datastore, which may not yet have the most recent session data. This means that asynchronous session persistence may cause your application to see stale session data. However, for most applications the latency benefit far outweighs the risk.
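The Serializable requirement is easy to get wrong, because a class can declare the interface and still hold a field that cannot be serialized. Below is a small sketch of a check you could run in a unit test before deploying. `CartState`, `BrokenState` and `SessionChecks` are hypothetical names of mine, and the check uses plain Java serialization, which I am assuming is close enough to what the container does with session attributes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Session attribute that can be stored: every field is serializable. */
class CartState implements Serializable {
    private static final long serialVersionUID = 1L;
    final String customerId;
    final int itemCount;

    CartState(String customerId, int itemCount) {
        this.customerId = customerId;
        this.itemCount = itemCount;
    }
}

/** Compiles fine, but holds a field whose class is not Serializable. */
class BrokenState implements Serializable {
    private static final long serialVersionUID = 1L;
    final Object handle = new Object(); // java.lang.Object is not Serializable
}

final class SessionChecks {
    private SessionChecks() {
    }

    /** Returns true if the whole object graph survives Java serialization. */
    static boolean canBeStoredInSession(Object attribute) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(attribute);
            return true;
        } catch (IOException e) {
            // NotSerializableException is a subclass of IOException.
            return false;
        }
    }
}
```

Calling `SessionChecks.canBeStoredInSession(...)` on every type you place in the session gives an early warning on your own machine instead of a failure on the platform.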
- Event-driven Execution
This one is about running a process at a given time, such as a scheduled task. Again, in an environment you manage yourself, it is straightforward to implement a timer or scheduler service. But this is a clustered environment which is not managed by you, and the vendor's stack is different from yours. I personally use Quartz Scheduler when working in a single-server environment. In a clustered environment such as Beanstalk or GAE, it is difficult to know which instance will be triggered, and to make sure the task is executed only once. The folks at Google have provided another solution with their own implementation of Cron for Java. At the time of writing, Amazon Beanstalk didn't have a solution yet. Therefore, when designing your system, consider beforehand which approach to take in order to create scheduled tasks for your application.
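To illustrate the "execute only once" problem, here is a generic sketch that is neither the Quartz nor the GAE Cron approach: each instance tries to claim a run in a shared registry before doing the work, and only the first claim wins. `RunRegistry` and `ScheduledJob` are hypothetical names. The registry here is an in-memory map purely so the example runs; in a real cluster it would have to be backed by a store that all instances share and that supports an atomic insert-if-absent.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Stand-in for a store shared by all instances (hypothetical, in-memory here). */
class RunRegistry {
    private final ConcurrentMap<String, String> claims = new ConcurrentHashMap<>();

    /** Atomically claims a run; only the first caller for a given runId wins. */
    boolean tryClaim(String runId, String instanceId) {
        return claims.putIfAbsent(runId, instanceId) == null;
    }
}

/** A task that any instance may be asked to run, but that must run only once. */
class ScheduledJob {
    private final RunRegistry registry;
    private final String instanceId;
    private final Runnable work;

    ScheduledJob(RunRegistry registry, String instanceId, Runnable work) {
        this.registry = registry;
        this.instanceId = instanceId;
        this.work = work;
    }

    /** Called by whatever trigger the platform offers; true if this instance ran the work. */
    boolean trigger(String runId) {
        if (!registry.tryClaim(runId, instanceId)) {
            return false; // another instance already claimed this run
        }
        work.run();
        return true;
    }
}
```

The `runId` should identify one scheduled occurrence (for example the job name plus its planned time), so that a late or duplicate trigger for the same occurrence is ignored.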
- JRE white list
I believe this one relates to GAE J only. Google App Engine for Java doesn't allow the use of every API available in Java, especially those that require access to the file system. The fact that Google imposes such a restriction has led us to look elsewhere for some of our projects. The cost of redeveloping our application to please them is much higher than deploying it elsewhere. Another downside of GAE J is that it doesn't fully support the JEE servlet specification. You cannot implement custom security for your application through your web.xml, which pushes you to use Google's own security mechanism. I would recommend using GAE J when developing a greenfield project which can be built with these restrictions in mind. If you do not mind being locked in to GAE J for your application, then I recommend it as a cost-efficient way of testing your application; otherwise, look somewhere else.
I hope this was helpful. If there is a mistake, feel free to get back to me and I will make any corrections. Also, I am sure that I am missing some other points; add them in the comments section.
P.S. here is a nice comparison from IBM
Cheers and Happy Coding.
Hi Armel, Have you tried Heroku? I'd love to hear your feedback.
-James
Hi James, I haven't used Heroku yet. Funnily enough, I was looking at it yesterday to try to understand what it brings to the table. I am also considering Cloud Foundry as an alternative to GAE and Beanstalk. I will let you know once I have tried it.
For points #1 to #4, those are things you have to take care of when you develop applications which are going to be deployed in a cluster. They apply to cloud application development, but they are not new to cloud. If you do want to cache data, you need some sort of messaging system to synchronize data across multiple servers.
Hi Jun, you're right. Those need to be taken care of in a clustered environment. The point here is that you do not really have the freedom to use whichever framework you want, as it might not be supported by the cloud vendor, therefore leading to vendor lock-in. Some vendors did not support caching and have only recently implemented a caching mechanism, such as Amazon ElastiCache.
Thanks for the nice intro into the Cloud. I'll just comment on using JEE, which is not official (though common): http://www.java.com/en/about/javanaming.jsp "Please say Java"
Programming in cloud... a good start for me.
Good Article ...
I have very minimal knowledge of Cloud. With this article I learnt new things about Cloud from a Java perspective. Good one. Thanks.
I have seen fantastic blogs and I have seen not so fantastic blogs. This blog is very informative in many ways and certainly ranks in the former category. Really appreciate the information you're providing us avid readers!
I am new to the combination of cloud and Java, but in my development experience the cloud has a bright future with Java EE 5 and 6, as they include EAR packaging, which makes provisioning cloud apps easy.
Thanks for sharing useful information. I always make sure to bookmark pages like this because you know it will be useful in the future too. Thanks again.
I really believe you will do much better in the future I appreciate everything you have added to my knowledge base. Admiring the time and effort you put into your blog and detailed information you offer!