Clemens Vasters on Longhorn

Exploring Next Generation Web Services with Indigo

September 2003 - Posts

  • "Services" in SOA posting: Comments and Responses

    Javier Gonzalez sent me a mail today on my most recent SOA post and says that it resonates with his experience:

    I just read your article about services and find it very interesting. I have been using OOP languages to build somewhat complex systems for the last 5 years and even if I have had some degree of success with them, I usually find myself facing those same problems u mention (why, for instance, do I have to throw an exception to a module that doesn't know how to deal with it?). Yes, objects in a well designed OOP system are *supposed* to be loosely coupled, but then, is that really possible to completely achieve? So I do agree with u SOA might be a solution to some of my nightmares. Only one thing bothers me, and that is service implementation. Services, and most of all Web Services only care about interfaces, or better yet, contracts, but the functionality that those contracts provide has to be implemented in some way, right? Being as I am an "object fan" I would use an OO language, but I would like to hear your opinions on the subject. Also, there's something I call "service feasibility". Web Services and SOA in general do "sound" a very nice idea, but then, on real systems they tend to be sluggish, to say the least. They can put a network on its knees if the amount of information transmitted is only fair. SOAP is a very nice idea when it comes to interoperability, but the messages are *bloated* and the system's performance tends to suffer. -- I'd love to hear your opinions on these topics.

    Here’s my reply to Javier:

    Within a service, OOP stays as much of a good idea as it always was, because it gives us all the qualities of pre-built infrastructure reuse that we've learned to appreciate in recent years. I don't see much realistic potential for business logic or business object reuse, but OOP as a tool is alive and well.

    Your point about services being sluggish has some truth to it, if you look at system components singularly. There is no doubt that a Porsche 911 is faster than a Ford Focus. However, if you look at a larger system as a whole, to stay in the picture let's take a bridge crossing a river at rush hour, the Focus and the 911 move at the same speed because of congestion -- a congestion that would occur even if everyone driving on that bridge were driving a 911. The primary goal is thus to make that bridge wider and not to give everyone a Porsche.

    Maximizing throughput always tops optimizing raw performance. The idea of SOA in conjunction with autonomous computing networks decouples subsystems in a way that you get largely independent processing islands connected by one-way roads to which you can add arbitrary numbers of lanes (and arbitrary numbers of identical islands). So while an individual operation may indeed take a bit longer and the bandwidth requirements may be higher, the overall system can keep scaling its capacity and throughput by adding lanes and islands.

    Still, for a quick reality check: Have you looked at what size packets IIOP or DCOM produce on the wire and at the number of network roundtrips they require for protocol negotiation? The scary thing about SOAP is that it is really very much in our face and relatively easy to comprehend. Thus people tend to pay more attention to it. If you compare common binary protocols to SOAP (considering a realistic mix of payloads), SOAP doesn't look all that horrible. Also, XML compresses really well and much better than binary data. All that being said, I know that the vendors (specifically Microsoft) are looking very closely at how to reduce the wire footprint of SOAP and I expect them to come up with proposals in the not too distant future.

    Over in the comment view of that article, Stu Charlton raises some concerns and posts some questions. Here are some answers:

    1) "No shared application state, everything must be passed through messages."  Every "service" oriented system I have ever witnessed has stated this as a goal, and eventually someone got sick of it and implemented a form of shared state. The GIT in COM, session variables in PL/SQL packages, ASP[.NET] Sessions, JSP HttpSession, common areas in CICS, Linda/JavaSpaces, Stateful Session Beans, Scratchpads / Blackboards, etc. Concern: No distributed computing paradigm has ever eliminated transient shared state, no matter how messy or unscalable it is.

    Sessions are scoped to a conversation; what I mean is application-scoped state shared across sessions. Some of the examples you give are about session state, some are about application state. Session state can’t be avoided (although it can sometimes be piggybacked into the message flow) and is owned by a particular service. If you’ve started a conversation with a service, you need to go back to that service to continue the conversation. If the service itself is implemented using a local (load balance and/or failover) cluster that’s great, but you shouldn’t need to know about it. Application state that’s shared between multiple services provided by an application leads to co-location assumptions and is therefore bad.

    2) "A customer record isn't uniquely identifiable in-memory and even not an addressable on-disk entity that's known throughout the system"  -- Question: This confuses me quite a bit. Are you advocating the abolishment of a primary key for a piece of shared data? If not, what do you mean by this: no notion of global object identity (fair), or something else?

    I am saying that not all data can and should be treated alike. There is shared data whose realistic frequency of change is so low that it simply doesn’t deserve uniqueness (and identification by a primary key in a central store). There is shared data for which a master copy exists, but of which many concurrent on-disk replicas and in-memory copies may safely float throughout the system as long as there is an understanding about the temporal accuracy requirements as well as about the potential for concurrent modification. While there is always a theoretical potential for concurrent data modification, the reality of many systems is that records in many tables can and will never be concurrently accessed, because the information causing the change does not surface at two places at the same time. How many call center agents will realistically attempt to change a single customer’s address information at the same time? Lastly, there is data that should only be touched within a transaction and can and may only exist in a single place.

    I am not abandoning the idea of “primary key” or a unique customer number. I am saying that reflecting that uniqueness in in-memory state is rarely the right choice and rarely worth the hassle. Concurrent modification of data is rare and there are techniques to eliminate it in many cases, for instance by introducing chronologies. Even if you are booking into a financial account, you are just adding information to a uniquely identifiable set of data. You are not modifying the account itself, but you add information to it. Counter example: If you have an object that represents a physical device such as a printer, a sensor, a network switch or a manufacturing robot, in-memory identity immediately reflects the identity of the physical entity you are dealing with. These are cases where objects and object identity make sense. That direct correspondence rarely exists in business systems. Those deal with data about things, not things.
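
    To make the "chronology" idea concrete, here is a minimal Python sketch of an account that is only ever appended to. The class and field names (Booking, Account, book, balance) are assumptions made up for illustration; the point is merely that a booking adds an entry and the balance is derived from the entries, so nothing is modified in place.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class Booking:
    """One immutable entry in an account's chronology (hypothetical type)."""
    sequence: int
    amount: Decimal
    memo: str


@dataclass
class Account:
    """An account identified by its number; its state is the list of bookings."""
    number: str
    bookings: list = field(default_factory=list)

    def book(self, amount: Decimal, memo: str) -> Booking:
        # Append only: existing entries are never modified in place.
        entry = Booking(len(self.bookings) + 1, amount, memo)
        self.bookings.append(entry)
        return entry

    def balance(self) -> Decimal:
        # The balance is derived from the chronology, not stored and overwritten.
        return sum((b.amount for b in self.bookings), Decimal("0"))


account = Account("12345678")
account.book(Decimal("100.00"), "deposit")
account.book(Decimal("-30.50"), "withdrawal")
print(account.balance())  # 69.50
```

    Two agents booking at the same time would each add their own entry; under this assumption there is no single field that both of them overwrite.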

    3) "In a services world, there are no objects, just data". – […] Anyway, I don't think anyone [sane] has advocated building fine-grained object model distributed systems for quite a few years. […] But the object oriented community has known that for quite some time, hence the "Facade" pattern, and the packaging/reuse principles from folks such as Robert C. Martin. Domain models may still exist in the implementation of the service, depending on the complexity of the service.

    OOP is great for the inner implementation of a service (see above) and I am in line with you here. There are, however, plenty of people who still believe in object purity and that’s why I am saying what I am saying.

    4) "data record stored & retrieved from many different data sources within the same application out of a variety of motivations"  --- I assume all of these copies of data are read-only, with one service having responsibility for updates. I also assume you mean that some form of optimistic conflict checking would be involved to ensure no lost updates. Concern: Traditionally we have had serializable transaction isolation to protect us from concurrent anomalies. Will we still have this sort of isolation in the face of multiple cached copies across web services?

    I think that absolute temporal accuracy is severely overrated and is more an engineering obsession than anything else. Your typical large online retailer basically lies to the faces of millions of users each day by saying “only 2-4 items left in stock” or “Usually ships within 24 hours”. Can they give you to-the-second accurate information from their backend warehouse? Of course they can’t. They won’t even tell you when your stuff ships when you’re through checkout and gave them your money. They’ll do so later – by email.

    I also think that the risk of concurrent updates to records is – as outlined above – very low if you segment your data along the lines of the business use cases and not so much along the lines of what a DBA thinks is perfect form.

    I’ll skip 5) and 6) (the answers are “Ok” and “If you want to see it that way”) and move on to:

    7) "Problematic assumptions regarding single databases vs. parallel databases for scalability" -- I'm not sure what the problem is here from an SOA perspective? Isn't this a physical data architecture issue, something encapsulated by your database's interface? As far as I know it's pretty transparent to me if Oracle decides to use a parallel query, unless I dig into the SQL plan. […]

    “which may or may not be directly supported by your database system” is the half sentence to consider here as well. The Oracle cluster does it, SQL Server does it too, but there are other database systems out there and there are also other ways of storing and accessing data than an RDBMS.

    8) "Strong contracts eliminate "illegal argument" errors" Question: What about semantic constraints? Or referential integrity constraints? XML Schemas are richer than IDL, but they still don't capture rich semantic constraints (i.e. "book a room in this hotel, ensuring there are no overlapping reservations" -- or "employee reporting relationships must be hierarchical"). […]

    “Book a room in this hotel” is a message to the service. The requirements-motivated answer to this message is either “yes” or “no”. “No overlapping reservations” is a local concern of that service and even “Sorry, we don’t know that hotel” is. The employee reporting relationships for a message relayed to an HR service can indeed be expressed by referential constraints in XSD; the validity of merging the message into the backend store is an internal concern of the service. The answer is “can do that” or “can’t do that”.

    What you won’t get are failures like “the employee name has more than 80 characters and we don’t know how to deal with that”. Stronger contracts and automatic enforcement of these contracts reduce the number of stupid errors, side-effects and the combination of stupid errors and side effects to look for – at either endpoint.

    9) "The vision of Web services as an integration tool of global scale exhibits these and other constraints, making it necessary to enable asynchronous behavior and parallel processing as a core principle of mainstream application design and don’t leave that as a specialty to the high-performance and super-computing space."  -- Concern: Distributed/concurrent/parallel computing is hard. I haven't seen much evidence that SOA/ web services makes this any easier. It makes contracts easier, and distributing data types easier. But it's up to the programming model (.NET, J2EE, or something else) to make the distributed/concurrent/parallel model easier. There are some signs of improvement here, but I'm skeptical there will be anything that breaks this stuff into the "mainstream" (I guess it depends on what one defines as mainstream)...

    Oh, I wouldn’t be too sure about that. There are lots of things going on in that area that I know of but can’t talk about at present.

    While SOA as a means of widespread systems integration is a solid idea, the dream of service-oriented "grid" computing isn't really economically viable unless the computation is very expensive. Co-locating processing & filtering as close as possible to the data source is still the key principle to an economic & performing system. (Jim Gray also has a recent paper on this on his website). Things like XQuery for integration and data federations (service oriented or not) still don't seem economically plausible until distributed query processors get a lot smarter and WAN costs go down.

    Again, if the tools were up to speed, it would be economically feasible to do so. That’s going to be fixed. Even SOA based grids apparently sound much less like science fiction to me than to you.

  • "Services" in SOA

    If you are a developer and don't live in the Netherlands (where SOA stands, well known, for "Sexueel Overdraagbare Aandoeningen" = "Sexually transmitted diseases”), you may have heard by now that SOA stands for "service oriented architectures".

    What's really interesting is that talks and articles about SOA (including the ones that I gave on this year's Microsoft Architect's Tour) tend to focus almost exclusively on the glue between services: how the use of registries and dynamic binding, message and service contract exchange and negotiation, and standard protocols and data exchange formats promises greater flexibility for enterprise architectures. Little is said about the characteristics of the "services" themselves.

    So, here's a bit of my current thinking around services; by now I think I could probably fill a book on the topic and therefore a blog entry cannot come even close to giving the complete picture. Also, I don’t claim to say anything new here, but rather just want to have it all in one place on my blog. So here we go:


    Very broadly speaking, a service is an autonomous unit that is responsible for a transformation, storage and/or retrieval of data. Services never interact with other services by side-effect, meaning there is no notion of inter-service (application) state that is not explicitly exchanged through messages. Services are accessed through well-defined public access points that are governed by contracts that tightly define the set of supported messages, the message content and the applicable service policies.

    To explain services and their motivation, I will first have to write about objects. The basic idea of exclusive data ownership is not too dissimilar to the idealistic view of domain objects. There, you have an object “Customer 12345678 Peter Miller” and that object has its own “save data” and “load data” capability. To activate (load) an object from persistent storage, you go through some sort of factory that takes the object identity as an argument and from there, all you talk to is the object and the object’s inner implementation worries about the details of storage all by itself.

    However, in contrast to the object notion of having data and code in one place and at one location, services strictly separate code and data. The customer record mentioned above isn’t a uniquely identifiable in-memory entity, nor even an addressable on-disk entity that’s known throughout the system; data simply flows through the system and the same record may exist in multiple places at the same time. In a service world, there are no objects, there is just data.

    The idea of “self-contained” domain objects, while thought to be an ideal modularization model, indeed most often fails to provide modularity. A data record can be stored to and retrieved from many different data sources within the same application out of a variety of motivations. It may be stored in an offline “isolated storage” replica on a mobile machine, inside a queue message that is processed only during an hourly or daily batch run, in a SQL database, an in-memory caching structure and many more places based on its character. The character of a data record includes, for instance, how often and by how many concurrent activities it is likely to be changed and therefore how safe it is to create read-only replicas of the record and how long these replicas can be regarded as being valid and accurate or even just “good enough” to base further processing on them. Likewise, a data record can be rendered for presentation to both human and machine consumers in a vast variety of ways, ranging from an XML fragment over HTML rendering to sophisticated 3D graphics visualizations.
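
    The "how long is a replica good enough" part can be sketched in a few lines of Python. The Replica type and its max_age_seconds field are assumptions for illustration only; in practice the acceptable age would follow from the character of the record as described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Replica:
    """A read-only copy of a record plus the time it was taken (hypothetical)."""
    data: dict
    fetched_at: float        # seconds on whatever clock the system agrees on
    max_age_seconds: float   # how long this kind of record stays "good enough"

    def is_good_enough(self, now: float) -> bool:
        # The copy may be used as long as it is not older than its allowed age.
        return now - self.fetched_at <= self.max_age_seconds


copy = Replica({"name": "Peter Miller"}, fetched_at=1000.0, max_age_seconds=3600.0)
print(copy.is_good_enough(now=1500.0))  # True: taken 500 seconds ago
print(copy.is_good_enough(now=5000.0))  # False: taken 4000 seconds ago
```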

    Although the idea of object-centric storage, object self-responsibility and universal object-identity is fantastically attractive, a single object implementation that attempts to accommodate all these requirements simply results in a monolithic application block that is anything but modular. Even when it comes to “business logic”, the implementation of the rules that govern the contextual correctness and integrity of an object, putting all rules that result from the requirements for an entire system into a single class breaks the separation of subsystems. Creating a single “customer” object class for a bank’s loan, investment and financial collections business is essentially impossible because of conflicting rules and requirements and a different perception of “customer” by these businesses. Still, it is standard procedure to have a central database with data records that hold the customer information shared by these systems – and a service that governs this data store. Not too rarely that service goes by the name “host communication” and shifts data records on and off the mainframe via CICS transactions.

    The consequence from this thinking about domain objects and generally about the notion of object identity and self-responsibility (and I am sure that a lot of people will disagree violently with me on that) is that there is not only no proper way of realizing the dream of “true objects”, but there is indeed no way of defining any method for a domain object in a way that it doesn’t result in a monolith spanning multiple concerns having methods that are inappropriate or wrong to be used in certain contexts.

    However, this statement explicitly excludes property access methods that enforce rules like “value must be greater-equal to 0 and less-equal to 100”, because the value in question may represent an expression in percent. Now, one could argue that such property access methods are a clear example of why domain objects do indeed make sense, because these methods implement fundamental business logic, but in my view they don’t. The fact that property access methods enforcing such rules must exist simply fixes an inherent weakness in the type system of most mainstream programming languages. The rule [0<=x<=100] is a property of the “percentage” data type, but that doesn’t readily map into most languages. Hence, it’s the job of explicit coding to fix that limitation and provide stronger types. The type description language XML Schema (and siblings like Schematron and Relax NG) provides facilities to define data types of the desired strength and infrastructures supporting these description formats are capable of either enforcing these rules without specific coding or generating the code required to enforce them. Property access methods are just a way to overcome programming model limitations and to enforce contracts; they are not an object feature or business logic. At least I don’t see them that way.
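
    As an illustration of "explicit coding to provide stronger types", the following Python sketch moves the [0<=x<=100] rule into a small value type instead of a property setter on a domain object. The Percentage class is a made-up example; in XML Schema the same rule would typically be expressed as a simple type restricted with the minInclusive and maxInclusive facets.

```python
class Percentage:
    """A value type that carries the rule 0 <= x <= 100 with it (illustrative)."""

    __slots__ = ("_value",)

    def __init__(self, value: float) -> None:
        # The rule belongs to the data type, not to any particular domain object.
        if not 0 <= value <= 100:
            raise ValueError(f"percentage must be within 0..100, got {value!r}")
        self._value = float(value)

    @property
    def value(self) -> float:
        return self._value

    def __repr__(self) -> str:
        return f"Percentage({self._value})"


discount = Percentage(15)
print(discount)  # Percentage(15.0)

try:
    Percentage(120)
except ValueError as error:
    print("rejected:", error)
```

    Any class that holds a Percentage gets the constraint for free, which is the sense in which the check is a type-system repair rather than business logic.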

    So, if you’re still reading after I’ve slaughtered the idea of “objects” for a modular, layered and even distributed system, it’s not so far to go from here to the essence of what a service is.

    To recap the initial statement, a service is an autonomous unit. The autonomous character of a service results from the combination of exclusive responsibility for certain operations on data and a strict definition of the message contracts for both the messages it receives and the messages it is able to provide.

    Exclusive responsibility means that there is exactly one service in a given system that may perform a certain operation on data, for instance storing and retrieving data into a certain set of tables on a certain database. Any other service that requires access to this data must use the responsible service. This serves to guarantee that only a single implementation of (for instance) data consistency rules exists, but also helps to eliminate assumptions that hinder (again, for instance) scalability. One of these problematic assumptions is that all records of a given type are co-located in the same database or in the same location. That assumption is okay as long as you don’t have to deal with a massive data volume or very high concurrency with very frequent transactional writes. In these cases, it may be beneficial to break up the storage into multiple tables or even across multiple databases, which may or may not be directly supported by your database system. If it isn’t or doesn’t work with the desired flexibility, it’s nearly impossible to introduce this scalability technique once everyone is permitted to access backend storage directly. (To get an idea of this sort of parallelism and partitioning, check out this PPT by Jim Gray).
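
    Here is a rough Python sketch of why exclusive responsibility makes partitioning possible after the fact. Everything in it is an assumption for illustration: CustomerStoreService is a hypothetical name, plain dictionaries stand in for separate databases, and hash partitioning is just one possible scheme. Callers only see store and retrieve, so the number of partitions can change without touching them.

```python
import hashlib


class CustomerStoreService:
    """The only component allowed to touch customer storage (hypothetical).

    Plain dicts stand in for separate tables or databases.
    """

    def __init__(self, partition_count: int) -> None:
        self._partitions = [{} for _ in range(partition_count)]

    def _partition_for(self, customer_id: str) -> dict:
        # A stable hash, so the same id always maps to the same partition.
        digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % len(self._partitions)
        return self._partitions[index]

    def store(self, customer_id: str, record: dict) -> None:
        # Keep a copy: callers hold data, not references into the store.
        self._partition_for(customer_id)[customer_id] = dict(record)

    def retrieve(self, customer_id: str):
        record = self._partition_for(customer_id).get(customer_id)
        return dict(record) if record is not None else None


service = CustomerStoreService(partition_count=4)
service.store("12345678", {"name": "Peter Miller"})
print(service.retrieve("12345678"))  # {'name': 'Peter Miller'}
```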

    Ruling out that state is implicitly shared between services (in memory or on disk) is a direct consequence from this and also serves the scalability purpose, because it further eliminates co-location assumptions about services and enables clustering. Note that this isn’t about “stateless” or “stateful”. Everything is stateful while it runs.

    Strong contracts and operational guarantees further allow you to rely on (trust) the service to perform a given task without passing the caller an error that it likely can’t handle anyway. If the message contract and the description of types are sufficiently precise, a service won’t ever and should never have to come back to the caller with an “invalid argument” exception. If input is compliant with a contract, it’s the receiving service’s own problem to deal with any issues it has with the data, even if that involves manual resolution by an operator. The sender (client) can’t and won’t have any additional information or implemented measures to fix the input if the contract doesn’t sufficiently express the constraints. Operational guarantees like transactional processing and reliable transport make sure that the data that is passed on to a service does not get lost on the way or when a processing attempt is unsuccessful. If a service (A) can trust that an invoked service (B) will be able to handle a set of data contained in a message and can trust that processing will occur without further intervention by (A), the processing can occur asynchronously and the message sent from (A) to (B) can be queued and load balanced.
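
    A small Python sketch of that division of labor, under loudly stated assumptions: StoreCustomerMessage and its rules (8-digit id, name of at most 80 characters) are invented for illustration, an in-process queue stands in for reliable transport, and a list stands in for the operator's work basket. The contract is checked when the message is built; once a compliant message is queued, the receiver either processes it or parks it for manual resolution, and never hands an error back to the sender.

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreCustomerMessage:
    """Hypothetical message contract; the field rules are illustrative."""
    customer_id: str
    name: str

    def __post_init__(self) -> None:
        # Contract enforcement happens when the message is built, at the sender.
        if not (self.customer_id.isdigit() and len(self.customer_id) == 8):
            raise ValueError("customer_id must be exactly 8 digits")
        if not 1 <= len(self.name) <= 80:
            raise ValueError("name must be 1 to 80 characters")


def process_pending(inbox: queue.Queue, store: dict, needs_review: list) -> int:
    """Service (B): drain the queue without ever returning an error to (A)."""
    handled = 0
    while not inbox.empty():
        message = inbox.get()
        existing = store.get(message.customer_id)
        if existing is not None and existing != message.name:
            # B's own problem: park the message for manual resolution.
            needs_review.append(message)
        else:
            store[message.customer_id] = message.name
        handled += 1
    return handled


inbox = queue.Queue()
store, needs_review = {}, []
# Service (A) drops compliant messages into the queue and moves on.
inbox.put(StoreCustomerMessage("12345678", "Peter Miller"))
inbox.put(StoreCustomerMessage("12345678", "Peter Muller"))
print(process_pending(inbox, store, needs_review))  # 2
print(store, len(needs_review))  # {'12345678': 'Peter Miller'} 1
```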

    This is not only true for one-way storage operations, but also for requesting data. If (A) can trust that service (B) will not simply fail a request operation and die, but is able to recover from any problem (with a reasonable probability) it may run into and send a response, and (A) passes (B) a reply-to entry point to drop the request result into, (A) can safely end or suspend processing until that reply arrives or, if required, a timeout occurs. This type of asynchronous “call me back when you’re ready” interaction between services is called “dialog” and is much better suited for fair load distribution in distributed systems than request/response. In essence, dialogs turn the call trees resulting from request/response operations into a sequence of one-way operations. [A further important aspect in this context is 2PC vs. compensating transactions, but I won’t go into that here and now]
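
    The dialog pattern can be approximated in a few lines of Python. This is only a sketch under assumptions: two in-process queues and a thread stand in for real messaging infrastructure, and the message fields (correlation_id, payload, reply_to) are names invented for the example. (A) sends a one-way message that carries its reply-to entry point, then waits for the reply or a timeout; (B) always answers, even when its own processing fails.

```python
import queue
import threading


def service_b(requests: queue.Queue) -> None:
    """Service (B): handle one-way request messages and post replies."""
    while True:
        message = requests.get()
        if message is None:  # shutdown signal, specific to this sketch
            break
        try:
            result = message["payload"].upper()
            reply = {"correlation_id": message["correlation_id"], "result": result}
        except Exception as error:
            # B recovers and still answers instead of failing and dying.
            reply = {"correlation_id": message["correlation_id"], "fault": str(error)}
        message["reply_to"].put(reply)


requests = queue.Queue()
replies = queue.Queue()  # (A)'s reply-to entry point
worker = threading.Thread(target=service_b, args=(requests,))
worker.start()

# Service (A): send one-way, then suspend until the reply arrives or a timeout occurs.
requests.put({"correlation_id": 1, "payload": "peter miller", "reply_to": replies})
try:
    reply = replies.get(timeout=5.0)
except queue.Empty:
    reply = None  # timeout: (A) decides what to do next
print(reply)  # {'correlation_id': 1, 'result': 'PETER MILLER'}

requests.put(None)
worker.join()
```

    Because each step is a one-way message into a queue, more workers could read from the same request queue without (A) noticing, which is the load distribution property the paragraph above describes.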

    Asynchronous and parallel operation is a key element of both scalable systems and systems that operate well in the presence of substantial communication constraints like network latency and the required processing introduced by strong security boundaries. The vision of Web services as an integration tool of global scale exhibits these and other constraints, making it necessary to enable asynchronous behavior and parallel processing as a core principle of mainstream application design and not to leave that as a specialty of the high-performance and super-computing space.

    Summarizing, services and service oriented architectures are, in a sense, a return to quite a few of the good old principles of structured programming and batch processing. Data and code are kept separate in order to allow cross-organization, cross-platform modularization, and asynchronous processing is better than synchronous processing if you want your systems to scale. But service oriented architectures also mean that we rely much more on the abstraction and tighter definition that data contracts provide compared to what can be expressed in a programming model. Message contracts expressed in a rich, cross-platform type description language such as XML Schema are much more powerful and precise than any IDL file you could ever write and they are independent of the implementation platform that’s chosen for a particular subsystem. Service policy contracts provide a similar abstraction for the operational requirements and guarantees that can be mandated or given in order to establish the required level of trust between services independent of the platform they are implemented on.

    [For some answers to reader comments go here]