Tuesday, October 14, 2008

Part 2 - Designing the manager services

This stage is probably the most crucial, as it determines the emergent properties of your system: how it behaves while in operation. Luckily, there are a number of principles we can introduce to steer us down the right path. Some of these are:

Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.

Asynchrony: The system makes progress under all circumstances.

Autonomy: The system is designed such that individual components can make decisions based on local information.

Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.

Controlled concurrency: Operations are designed such that no or limited concurrency control is required.

Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.

Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.

Decompose into small well-understood building blocks: Do not try to provide a single service that does everything for everyone, but instead build small components that can be used as building blocks for other services.

Simplicity: The system should be made as simple as possible (but no simpler).

Just a note at this stage: this type of architecture is for enterprise systems, those deployed within an organization to run critical business processes, where things like data integrity and security are vital. There are many other styles of architecture, and we can go into those in future posts. I have a strong bias towards WCF as I believe it to be the most productive framework for building this type of architecture.

I recommend a closed architecture, where objects are allowed to call other objects only in their own tier or in the tier immediately below them. This encapsulates changes and eases the task of long-term maintenance.
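
To make that concrete, here is a minimal sketch (all names invented for illustration) of the calling discipline a closed architecture enforces:

```csharp
// Each tier calls only the tier immediately below it.
public class ResourceAccess               // bottom tier: talks to the store
{
    public void Save(string record) { /* persist the record */ }
}

public class OrderEngine                  // middle tier: business logic
{
    private readonly ResourceAccess _access = new ResourceAccess();

    public void Process(string record)
    {
        _access.Save(record);             // allowed: one tier down
    }
}

public class OrderManager                 // top tier: drives a use case
{
    private readonly OrderEngine _engine = new OrderEngine();

    public void HandleOrder(string record)
    {
        _engine.Process(record);          // allowed: one tier down
        // Calling ResourceAccess directly from here would skip a tier
        // and break the closed architecture.
    }
}
```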

At the service boundary, services need to support business capabilities: they should have an interface that can be thought of as what the business does, and an interface that can be thought of as how the business currently does that process. The key here is to keep the what and the how decoupled from each other.

Decoupling is the cornerstone of modern development. Look at the modern patterns that we as an industry subscribe to; notice something? They all hold as a central tenet this notion of decoupling: the removal of direct dependencies on concrete objects, with dependency injection as the enabler.

Interfaces allow us to achieve the separation between what and how, so we should take things that are good and try using them elsewhere. Don't get me wrong, coupling is not all bad; we need to be coupled to things in order for our application to be useful, so the decision of what to be coupled to becomes critical.
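
As a hedged sketch (the shipping example and every name in it are invented), here is what keeping the what behind an interface looks like, with the how in a swappable implementation supplied via dependency injection:

```csharp
public interface IShipProduct                 // the "what": a business capability
{
    void Ship(int orderId);
}

public class CourierShipping : IShipProduct   // the "how": today's process
{
    public void Ship(int orderId)
    {
        // current procedure: book a courier collection for the order
    }
}

public class ShippingManager
{
    private readonly IShipProduct _shipping;

    // The manager is coupled only to the abstraction, so the "how"
    // can change without this class being touched.
    public ShippingManager(IShipProduct shipping)
    {
        _shipping = shipping;
    }

    public void DispatchOrder(int orderId)
    {
        _shipping.Ship(orderId);
    }
}
```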

I have observed that as soon as IT assets and components map well to the real world, things work well and systems become really useful. I mentioned earlier that the top-level services should map to business capabilities, but what are these?

A business capability is a particular ability or capacity that a business may possess or exchange to achieve a specific purpose or outcome. A capability describes what the business does (outcomes and service levels) that create value for customers; for example, pay employee or ship product. A business capability abstracts and encapsulates the people, process/procedures, technology, and information into the essential building blocks needed to facilitate performance improvement and redesign analysis.

So to achieve stable, useful boundary services, i.e. the ones exposed to the business people, we need to align them with business capabilities. One technique for achieving this is capability mapping, which, when applied correctly, leads to process-driven services.

One of our top non-functional requirements is maintainability, as the amount of time a system spends in development pales in comparison to the amount of time it will spend in maintenance. With that in mind, the granularity and size of our services deserve thought. Services with hundreds of methods will be unmaintainable, while services with one method will be useless; the trick is to find the area of minimum cost between the cost of building a service and the cost of integrating it into the rest of the system (testing etc.). Metrics tell us the magic number sits between five and eight methods per interface, with twelve as an upper limit.
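
For illustration, a hypothetical WCF service contract sitting in that sweet spot, five cohesive operations, might look like this:

```csharp
using System.ServiceModel;

// A hypothetical contract at the suggested granularity:
// a handful of cohesive operations, not hundreds and not one.
[ServiceContract]
public interface IOrderManager
{
    [OperationContract]
    int SubmitOrder(int customerId, string productCode, int quantity);

    [OperationContract]
    void CancelOrder(int orderId);

    [OperationContract]
    string GetOrderStatus(int orderId);

    [OperationContract]
    void UpdateShippingAddress(int orderId, string newAddress);

    [OperationContract]
    int[] FindOrdersForCustomer(int customerId);
}
```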

As simplicity is number one on our list of non-functional requirements, I suggest that you make your services stateless. Note that stateless and state-aware are not the same; by stateless I mean that your services use a per-call instantiation pattern unless you have a natural singleton or a really good reason not to.

State is the sworn enemy of scalability, so by remaining per-call your system will be able to scale indefinitely.
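
In WCF the per-call pattern is a one-line instancing decision; a minimal sketch (contract and names invented):

```csharp
using System.ServiceModel;

[ServiceContract]
public interface IQuoteManager
{
    [OperationContract]
    decimal GetQuote(string productCode);
}

// With per-call instancing WCF builds a fresh service object for every
// request and disposes it afterwards, so no state lingers between calls.
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
public class QuoteManager : IQuoteManager
{
    // No instance fields: everything the operation needs arrives as a
    // parameter or is loaded from a durable store inside the call.
    public decimal GetQuote(string productCode)
    {
        return 42m; // placeholder for a real price lookup
    }
}
```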

I term these services manager services, as I see them as managing a use case. This technique has saved my bacon in the past, as I have had customers withhold final payment claiming that the delivered system is not what they asked for. If your manager services map one-to-one to the use cases the customer signed off on, the process of system validation becomes simple and such customers do not have much of an argument.

Recently I have started using workflow within these manager services to control the flow of logic graphically; I am a firm believer that the development industry is moving towards visual tools, and that future IDEs will not require the developer to write as much code, but rather to focus more on design. Workflows come with an overhead, as learning this programming model requires a paradigm shift for developers. For simple managers I just code it up the regular way.



Manager services serve as the root for transactions: I start my transaction scope here for operations that need transactability. Operation design is key here; consider splitting interfaces into operations that require transactions and those that do not.

What is cool is that the transaction is cross-process, so your programming model becomes much simpler: any enlistable resource you touch is automatically added to the transaction, and it even works with message queues. This enables you to program without worrying too much about this sort of thing, as you can be sure that a rollback will occur across all the enlisted services if an exception is thrown. You can also consider including the client in your transaction if you have access to their machine and a requirement to do so.
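
A hedged sketch of what this looks like in WCF (the payment example is invented; the attributes are the standard WCF transaction plumbing), with transactional and non-transactional operations split as suggested above:

```csharp
using System.ServiceModel;

[ServiceContract]
public interface IPaymentManager
{
    // This operation participates in (or starts) a distributed transaction.
    [OperationContract]
    [TransactionFlow(TransactionFlowOption.Allowed)]
    void PayEmployee(int employeeId, decimal amount);

    // This one does not need transactability, so it is kept separate.
    [OperationContract]
    [TransactionFlow(TransactionFlowOption.NotAllowed)]
    decimal GetBalance(int employeeId);
}

public class PaymentManager : IPaymentManager
{
    // WCF wraps this method in a transaction scope; every enlistable
    // resource touched inside (databases, message queues) is enlisted
    // automatically, and an unhandled exception rolls them all back.
    [OperationBehavior(TransactionScopeRequired = true,
                       TransactionAutoComplete = true)]
    public void PayEmployee(int employeeId, decimal amount)
    {
        // debit the ledger, credit the account, drop a message on a queue
    }

    public decimal GetBalance(int employeeId)
    {
        return 0m; // read-only; no transaction required
    }
}
```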

While I am mentioning exceptions, let's discuss this a little. Services need to be fault tolerant and bear local responsibility, so they need to handle exceptions gracefully and not burden the client with the details; typically the client can do nothing about exceptions in the service anyway. Writing code to compensate for exceptions is a waste of time; people only cater for the easiest exceptions anyway and conveniently ignore the rest.

There are two types of exceptions you need to know about. Business exceptions are the ones you throw to notify users of a violation of business rules; system exceptions are the ones the runtime throws. I say catch all system exceptions at the service boundary and log them off to a logging service, thus shielding the client from the internal details of the service; throw all business exceptions as FaultException and send them back to the user, as they can actually do something about those.
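
A sketch of this policy at the boundary (the fault type, contract and logging service are invented stand-ins):

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class BusinessRuleFault
{
    [DataMember]
    public string Message { get; set; }
}

[ServiceContract]
public interface IOrderManager
{
    [OperationContract]
    [FaultContract(typeof(BusinessRuleFault))]
    void SubmitOrder(int orderId);
}

public class OrderManager : IOrderManager
{
    public void SubmitOrder(int orderId)
    {
        try
        {
            if (orderId <= 0) // a violated business rule
                throw new FaultException<BusinessRuleFault>(
                    new BusinessRuleFault { Message = "Order id must be positive." });

            // ... normal processing ...
        }
        catch (FaultException)
        {
            throw; // business exceptions travel back to the user untouched
        }
        catch (Exception ex)
        {
            // System exceptions are logged and shielded; the client sees
            // only an opaque fault, never the internal details.
            LoggingService.Log(ex);
            throw new FaultException("An internal error occurred.");
        }
    }
}

// stand-in for the logging service mentioned above
public static class LoggingService
{
    public static void Log(Exception ex) { /* write to the log store */ }
}
```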

Security has not been mentioned to this point. Manager services cannot be burdened with the security requirements of any one application, as you would typically like to reuse them in multiple applications. But security is more than just authorization and authentication; things like non-repudiation, confidentiality and integrity also play a part. I typically delegate authorization and authentication to an application-specific façade layer and handle the rest at the manager service.

By using X509 certificates I can ensure that every packet sent is encrypted and signed to prevent tampering. I can also ensure total end-to-end security, unlike HTTPS, which only protects point to point. Security is the most intensive thing you can apply to your services, but in my opinion it is non-negotiable and should be there by design, not as an afterthought. Larger, more dispersed and heterogeneous systems will require a security token service (STS); look at things like Zermatt to assist you in building these.
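
As a minimal sketch, message-level security with certificate credentials can be switched on in a couple of lines of binding configuration (shown here in code, though the same can be done in app.config; the factory class is an invented convenience):

```csharp
using System.ServiceModel;

// Message security signs and encrypts each message end to end,
// rather than securing only the transport hop as HTTPS does.
public static class SecureBindingFactory
{
    public static WSHttpBinding Create()
    {
        WSHttpBinding binding = new WSHttpBinding(SecurityMode.Message);
        binding.Security.Message.ClientCredentialType =
            MessageCredentialType.Certificate;
        return binding;
    }
}
```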

In closing, we have aligned with all the things we know make good services, in a productive and achievable way, while delegating the heavy lifting to the framework and significantly reducing the amount of plumbing our developers need to write.

Friday, October 10, 2008

Formulating an enterprise architecture - Part 1

Tackling the task of formulating an architecture for the construction of an enterprise-scale system is a daunting one. The architect needs to draw upon multi-disciplinary skills: one needs to understand the technical details of the technologies available and the process options for constructing solutions, as architectures do not all fit well into every process, and experience has taught that when process and architecture are aligned, good things happen.

The following post is part one of a series that will present my current thinking around the process of formulating such an architecture. I must confess that most of the ideas are not my own; I have drawn on the experience and teaching of some of the world's best architects to guide me. You will find ideas from Juval Lowy, Roger Sessions, Eric Evans, Michele Leroux Bustamante and Bill Poole.

This first post attempts to present an overview; following posts will drill down into the component pieces and flesh out the detail.

Application architecture results from the balancing of functional and the non-functional requirements applicable to the system. Non-functional requirements (restrictions and constraints) serve the purpose of limiting the number of potential solutions that will satisfy a set of functional requirements. Functional requirements are the behaviour that the owner of the system would like to emerge from the development process. All architectures exhibit emergent behaviour. If the emergent behaviours are not specified / managed in the form of non-functional requirements, the developers will choose them on behalf of business, leading to misalignment.

Any new architecture can only result from balancing functional and non-functional requirements against the forces of time, effort and cost. Every feature and every non-functional requirement carries an associated cost in one of the forces mentioned above.

Architecture is applied at two levels:

• The application architecture addresses the software components of the actual application.
• The systems architecture addresses the physical (hardware) and logical distribution of components across the network environment.

Before any architecture can be formulated the following need to be considered:

Non Functional Requirements

• Simplicity
• Maintainability
• Data Integrity
• Security
• Reporting
• Extensibility
• Scalability
• Deployment

Complexity

The most common aspect shared by all failed systems is the failure to manage complexity [Sessions]. An architecture that fails to manage this aspect is likewise doomed to fail.
In order to manage complexity we must first understand it: the complexity of a system is a function of the number of states in which the system can find itself.

An example: a system (A) with 3 dice has 6³ = 216 potential states, while another system (B) with one die has 6¹ = 6 potential states. If you were to make repeated guesses at the state of each system, you would be right, on average, 36 times more often with system B than with system A, because system B is less complex and easier to predict.

Partitioning

With this basic model of complexity, we can gain some insight into how complexity can be better organised.

Consider another two systems, (C) and (D); both have three six-sided dice, but in C all the dice are together as before, while in D the dice are divided into three partitions. Let's assume we can deal with the three partitions independently, in effect three subsystems, each like B. We know that the complexity of C is 216.

The overall complexity of system D is the sum of the complexities of its partitions (6¹ + 6¹ + 6¹), or 18. If you are not convinced of this, imagine inspecting C and D for correctness: in C you would need to examine 216 different states, checking each for correctness, while in D you would need to examine only 6 states in each partition.

The result when you compare how much more complex a non-partitioned system of 9 dice is than a partitioned system containing the same number of dice is startling: a ratio of 10,077,696 to 54. A lot!
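
As a quick sanity check of those numbers, here is a minimal C# sketch (the partitioning of one die per subsystem matches the 54 above):

```csharp
using System;

class ComplexityDemo
{
    static void Main()
    {
        // A system of d six-sided dice kept together has 6^d states;
        // partitioned into independent subsystems, the complexities add.
        double together = Math.Pow(6, 9);        // 9 dice, one system: 10,077,696
        double partitioned = 9 * Math.Pow(6, 1); // 9 partitions of 1 die: 54

        Console.WriteLine(together);               // 10077696
        Console.WriteLine(partitioned);            // 54
        Console.WriteLine(together / partitioned); // 186624, the startling ratio
    }
}
```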

Process

Just as any architecture will not work with any process, any process will not work with any architecture; process and architecture are really different manifestations of the same thing [Lowy].
When formulating any architecture the following graph always remains true.

[Graph omitted: cost curves showing the area of minimum cost.]

The goal of any architecture is to find that area of minimum cost. The interaction of team members is isomorphic to the interaction between components: a good design (loose coupling, minimised interactions, and encapsulation) minimises the communication overhead both within the team and within the system, and any process adopted needs to support this.

Best Practice Design Principles

Decentralization: Use fully decentralized techniques to remove scaling bottlenecks and single points of failure.
Asynchrony: The system makes progress under all circumstances.
Autonomy: The system is designed such that individual components can make decisions based on local information.
Local responsibility: Each individual component is responsible for achieving its consistency; this is never the burden of its peers.
Controlled concurrency: Operations are designed such that no or limited concurrency control is required.
Failure tolerant: The system considers the failure of components to be a normal mode of operation, and continues operation with no or minimal interruption.
Controlled parallelism: Abstractions used in the system are of such granularity that parallelism can be used to improve performance and robustness of recovery or the introduction of new nodes.
Decompose into small well-understood building blocks: Do not try to provide a single service that does everything for everyone, but instead build small components that can be used as building blocks for other services.
Simplicity: The system should be made as simple as possible (but no simpler).



Process - Staged Delivery


The process suggested for this development is one of staged delivery, as defined in the diagram below.

[Diagram omitted: the staged delivery process.]

This is a very defensive approach, as you are able to exit the process at any time and either release what you have or move on to the next stage. We are not going to try to deliver all the requirements on day one, as requirements will change during the course of the project.


The axiom is that at any moment in time we have a releasable system; releases may be internal, or, as time progresses and features take shape, they become releases to the customer. This approach reduces risk: if at the point of release the project is out of time or money for whatever reason, what you have may be good enough.


It also facilitates a clear display of progress throughout the development process and builds confidence between the development team and the customer. The process advocates constant integration, with daily builds and tests (smoke tests).

Each component of the system follows a life cycle in order to ensure quality at a service/component level. This is followed by the developer of the component/service.

[Diagram omitted: the component/service development life cycle.]

Service Governance


It would be unprofessional to deploy a distributed system with no means of managing the operational assets deployed. To this end we propose to use service virtualisation to facilitate this process.


The Managed Services Engine (MSE) is one approach to facilitating Enterprise SOA through service virtualization. Built upon the Windows Communication Foundation (WCF) and the Microsoft Server Platform, the MSE was developed by Microsoft Services as we helped customers address the challenges of SOA in the enterprise.


The MSE fully enables service virtualization through a Service Repository, which helps organizations deploy services faster, coordinate change management, and maximize the reuse of various service elements. In doing so, the MSE provides the ability to support versioning, abstraction, management, routing, and runtime policy enforcement for services.


Proposed Architecture

In the next post I will dive deeper into concepts that relate to the service layers and the concept of fortresses.