This series basically aims to lay out my opinions on the subject.
The problem...
In high-traffic enterprise applications, where traditionally you'd have a few heavy-duty application servers and a couple of database instances... you quickly reach a point of diminishing returns. As disk I/O and connection counts rise, each machine's ability to handle the load drops off rapidly.
Traditional architectures rely heavily on tightly coupled, highly localized data and content, with very little distribution. The cost of SAN disk arrays climbs dramatically as storage requirements grow, and even then you're paying for a centralized (though often scalable) disk array.
What is a cloud architecture?
Cloud architectures are designed to be massively scalable, highly available, and robust. The components or services are generally designed to be decoupled from one another, which adds a great deal of flexibility to the architecture itself.
The diagram above (a Windows Azure example) isn't really the subject of this post, since this isn't about specific implementations of cloud architectures, but I thought it was a perfect representation of a cloud. Traditionally in this sort of diagram, items drawn as cloud clipart are components of the app that aren't hard-defined; nothing specific is implied about their implementation or the number of machines involved in serving that component. In the Windows Azure diagram above, you can see that the storage, queueing, and blob storage components are represented as clouds.
The symbolism here denotes that these components are service layers designed to be decoupled from one another. Their particular implementation and hardware requirements are purposely abstracted away from these diagrams. The "service" is responsible for distributing commands to the specific underlying servers. This is a very common trend in the cloud architectures I've come across thus far.
In the diagram above, the user story represents the process of uploading an image, storing the binary data in cloud blob storage, and perhaps storing metadata about the binary in table storage. The underlying implementation of the data store could be a number of database servers with replication, a database pool with sharding or partitioning, or several servers simply attached to the same SAN LUNs. The point here is that the underlying implementation, and the fetching and saving of that data, is the responsibility of the blob storage layer, not the business logic layer (a rough sketch of that separation follows below).
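To make that concrete, here's a minimal Python sketch of the idea. The `BlobStore` and `TableStore` interfaces and the `upload_image` function are hypothetical stand-ins, not any particular vendor's API; the point is only that the business logic talks to abstract stores and never to specific machines.

```python
from abc import ABC, abstractmethod

class BlobStore(ABC):
    """Abstract blob storage. An implementation might replicate, shard,
    or write to a SAN; the consumer never knows or cares."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None: ...

class TableStore(ABC):
    """Abstract metadata (table) storage, same idea."""

    @abstractmethod
    def put(self, table: str, key: str, row: dict) -> None: ...

def upload_image(blobs: BlobStore, tables: TableStore,
                 image_id: str, image_bytes: bytes) -> None:
    # Business logic only says WHAT to store; the storage layers decide
    # WHERE and HOW (replication, sharding, SAN, etc.).
    blobs.save(image_id, image_bytes)
    tables.put("image_meta", image_id,
               {"size": len(image_bytes), "content_type": "image/png"})
```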
Key characteristics in my opinion are...
- Components or services are decoupled
- Underlying implementation is never touched directly, and ultimately shouldn't matter from the point of view of the consumer.
- Components or services should be responsible for broadcasting messages and fetching the data they need. After all, you basically send a command or a business object to the cloud... the cloud figures out who needs to handle it, and how it's to be handled (see the queue sketch after this list).
- Implementations of components or services generally tend to be made up of many smaller nodes... which individually are less powerful.
- The web service component of a cloud app should be machine-agnostic. Web apps should be as stateless as possible. (Requests may migrate from one machine to another at any point in time.)
- Sometimes the sheer number of connections on a single machine can hamper performance, and four lighter-weight machines may be more suitable than one heavy-duty machine.
- Depending on application requirements, this may help alleviate storage woes, in that the $150k storage array could potentially be replaced by cheaper hardware and leave you better off.
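As a rough illustration of the "send a command to the cloud" idea above, here's a Python sketch. The in-process `queue.Queue` is just a stand-in for a real distributed queue service, and the `resize_image` command is made up; what matters is that the producer never addresses a specific machine, and any stateless worker can pick the message up.

```python
import json
import queue

# Stand-in for a real cloud queue service; in production this would be a
# distributed queue, not an in-process object.
command_queue: queue.Queue = queue.Queue()

def submit_command(name: str, payload: dict) -> None:
    # The producer just hands a message to "the cloud"; it has no idea
    # which node will end up processing it.
    command_queue.put(json.dumps({"command": name, "payload": payload}))

def drain_queue_once() -> None:
    # A real worker would loop forever; any number of identical, stateless
    # workers can run this, and nodes can join or leave the pool freely.
    while not command_queue.empty():
        message = json.loads(command_queue.get())
        if message["command"] == "resize_image":
            print("resizing", message["payload"]["image_id"])

submit_command("resize_image", {"image_id": "img-42"})
drain_queue_once()  # prints: resizing img-42
```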
What a cloud is not...
I've seen all too many times people mixing up typical client-server applications with the idea of a cloud architecture... when in fact that analogy only really works in one direction.
- Cloud architectures are not highly coupled
- Layers should be clearly defined (the business logic layer should not need to know how to fetch data out of database b on machine a, which ultimately houses the data, but rather trust that the data layer knows how to delegate requests to the appropriate places).
- The business logic layer should never know the physical breakdown of a service's implementation, and should never be sending data to specific machines in those components at all. In my opinion... any of those component participants should be able to either handle your request or delegate it to the machine that can (a rough delegation sketch follows this list). Likewise, for ultimate scalability... the ability to drop in and out of the pool is a huge plus.
- Management of cloud architectures shouldn't usually refer to the specific machines participating in the service components individually, but rather have something in place to allow bulk updates and deploys.
- If I have 20 web app servers participating in my web services component, I really shouldn't need to push new builds to each of those machines individually, manage them individually, and so on. (After all, the idea is that the web server component is a cloud of one or more machines, and the underlying implementation doesn't really matter.)
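To illustrate that delegation point, here's a hypothetical Python sketch of a data layer that routes each request to a shard by key. The node names and the hash-based routing are made up for illustration; what matters is that the caller asks for data by key and never names a machine.

```python
import hashlib

class ShardedDataLayer:
    """Data layer that hides the physical breakdown: every request is
    delegated to whichever node owns the key."""

    def __init__(self, nodes: list) -> None:
        self._nodes = nodes  # e.g. connection strings; purely illustrative

    def _node_for(self, key: str) -> str:
        # Simple hash-based routing; a real system might use consistent
        # hashing so nodes can drop in and out of the pool gracefully.
        digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
        return self._nodes[digest % len(self._nodes)]

    def get(self, key: str) -> str:
        node = self._node_for(key)
        # Placeholder: a real implementation would query `node` here.
        return f"(value of {key!r} fetched from {node})"

data_layer = ShardedDataLayer(["db-a", "db-b", "db-c"])
print(data_layer.get("user:42"))  # the caller never names a machine
```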
In keeping with the tradition of many of my other blog posts... this ended up being a sort of slightly structured mind dump. Sorry in advance if any of it seemed confusing or like parts of it ran away.
In the next article, I intend to show how some products lend themselves to cloud architectures better than others, and hopefully lay out some problems I've encountered in my development efforts (to perhaps prevent you guys from making the same mistakes I have).