In this section we are going to look at some of the inherent
problems we encounter if we wish to integrate applications within a
business process. First, let's compare and contrast a typical
multi-application business system with the simple client-server model
with which we are all familiar.
Simple Client-Server Model
If you have ever programmed a client-server system, you have
actually performed application integration. Neither of the
applications can fulfill the requirements of the system by itself;
they must be able to exchange data. The two programs must be made to
share data reliably. Without a predictable, efficient, and robust
protocol for exchanging data, the client-server system cannot be made
to work.
The simplicity of the client-server model produces some distinct
advantages in comparison with the sort of application integration we
will focus on in this book. Client-server has just two entities
sending data back and forth. Although one entity - the server - can
communicate with one or more clients concurrently, there are only two
roles in the system. A party is either a client or a server. A single
interprocess communication technology is embedded into both
applications in the client-server model. The two programs are
designed from the start around a protocol suited to the overall
system. Typically, both applications are programmed by the same
people, or at the very least, the two programming teams are in close
communication. In short, the client and server applications were
intended to work together, so application integration is relatively
simple.
Distributed Business Systems
Contrast the client-server model with the problem of building a
distributed business system involving multi-part data flow (known as
workflow). Information comes into the system through some
client interface, and one or more applications perform processing on
the data. These programs may act directly on the data, or they may
perform supporting functions as a side effect of the data passing
through the system.
Often, many of the applications are in existence before the
distributed system is implemented. In one way, this is a desirable
situation, as we are then dealing with well-tested building blocks.
However, integrating these applications poses several challenges:
· Incompatible protocols and data formats
· Workflow design and error handling
· Monitoring the workflow
Let's take a look at each of these potential problem
areas.
Incompatible Protocols and Data Formats
First, we have the problem of incompatible protocols. If the
applications were built independently, it is highly unlikely that
they will use the same technology for interprocess communication. An
application enabled for DCOM cannot speak directly to another
application designed for HTTP. Some older applications may not be
equipped for direct interprocess communication at all - they may
expect to find a data file on a local disk.
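One common way to bridge such differences is to hide each delivery mechanism behind a small adapter with a shared interface. The following is only a minimal sketch of that idea, not a description of any particular product: the class names, the `send` method, and the file naming scheme are all hypothetical, and an in-process callback stands in for a real network protocol.

```python
import tempfile
from pathlib import Path
from typing import Callable, Protocol


class Adapter(Protocol):
    """Hypothetical common interface: every adapter can send a text message."""

    def send(self, message: str) -> None: ...


class FileDropAdapter:
    """Delivers messages to an application that only reads files from a local directory."""

    def __init__(self, drop_dir: Path):
        self.drop_dir = drop_dir
        self._count = 0

    def send(self, message: str) -> None:
        self._count += 1
        # File naming scheme is an assumption made for this sketch.
        path = self.drop_dir / f"message-{self._count:04d}.txt"
        path.write_text(message, encoding="utf-8")


class CallbackAdapter:
    """Stands in for a direct interprocess call (for example, a network request)."""

    def __init__(self, handler: Callable[[str], None]):
        self.handler = handler

    def send(self, message: str) -> None:
        self.handler(message)


def broadcast(message: str, adapters) -> None:
    """Deliver the same message through every adapter, whatever its transport."""
    for adapter in adapters:
        adapter.send(message)
```

The caller of `broadcast` does not need to know which application expects a file and which expects a direct call; that knowledge lives in the adapters.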
We also encounter the problem of incompatible data formats.
This may be a question of mismatched low-level data type
representation between dissimilar computers, or a higher-level issue
of mismatched data structures.
Mismatched data types arise when two computers use different binary
representations for the same data type; numeric types are
particularly troublesome. Structures that are logically identical can
then differ physically at a low level of representation.
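Byte order is one familiar instance of this problem. The sketch below uses Python's standard `struct` module to show the same 32-bit unsigned integer laid out in big-endian and little-endian form; the value 1025 is an arbitrary choice for illustration.

```python
import struct

value = 1025  # 0x00000401, chosen arbitrarily for the illustration

# The same logical number, packed as a 4-byte unsigned integer in each byte order.
big_endian = struct.pack(">I", value)     # most significant byte first
little_endian = struct.pack("<I", value)  # least significant byte first

# A receiver that assumes the wrong byte order silently reads a different number.
misread = struct.unpack("<I", big_endian)[0]

# A receiver that knows the sender's byte order recovers the original value.
correct = struct.unpack(">I", big_endian)[0]
```

Here `misread` is 17039360 rather than 1025. No error is raised, which is why such mismatches must be handled explicitly during integration.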
Dissimilar data structures arise when two programming teams choose
different structures to represent the same body of data. For
example, one program might use a hierarchical structure well suited
to programmatic use. Another might serialize its data in fixed length
fields, a system oriented to saving data to disk or performing bulk
data exchanges through files. Whatever the case, application
integration usually needs some facility for translating between data
formats. This facility must correctly map one item of data
into another and back again if two programs are to communicate.
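As a rough illustration of such a translation facility, the sketch below maps a small hierarchical order onto a fixed-length record and back. The field names, field widths, and padding rules are assumptions invented for this example; a real integration would take them from the specifications of the two applications.

```python
# Hypothetical fixed-length layout: (field name, width in characters).
FIELDS = [("customer", 20), ("sku", 10), ("quantity", 5)]


def to_fixed(order: dict) -> str:
    """Flatten a hierarchical order into one fixed-length record."""
    values = {
        "customer": order["customer"]["name"],
        "sku": order["item"]["sku"],
        "quantity": str(order["item"]["quantity"]),
    }
    parts = []
    for field, width in FIELDS:
        text = values[field]
        if len(text) > width:
            raise ValueError(f"{field!r} does not fit in {width} characters")
        # Numbers are zero-padded on the left; text is space-padded on the right.
        parts.append(text.rjust(width, "0") if field == "quantity" else text.ljust(width))
    return "".join(parts)


def from_fixed(record: str) -> dict:
    """Rebuild the hierarchical order from a fixed-length record."""
    raw = {}
    position = 0
    for field, width in FIELDS:
        raw[field] = record[position:position + width]
        position += width
    return {
        "customer": {"name": raw["customer"].rstrip()},
        "item": {"sku": raw["sku"].rstrip(), "quantity": int(raw["quantity"])},
    }
```

For example, `to_fixed({"customer": {"name": "Ada Lovelace"}, "item": {"sku": "BK-1001", "quantity": 3}})` yields a 35-character record, and passing that record to `from_fixed` returns the original structure. Note that this simple scheme would lose trailing spaces in text fields, a reminder that round-trip mappings need care.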
Workflow Design and Error Handling
Application integration is not complete when two programs are made
to communicate with one another. We have to deal with data flow and
error handling. Implementing a non-trivial process usually requires
the connection of multiple applications in an appropriate sequence.
The architect of such a system must also consider how the process can
fail and provide for alternative paths through the system.
Consider a simple retail purchase at a web site:
1. The site searches and displays a catalog
2. The user selects purchase
3. The site accesses an inventory control system to confirm availability
and mark the product for selection
4. An order price is calculated (this may involve shipping costs, sales
taxes, and promotional discounts)
5. An e-mail acknowledgement is formatted and sent
6. A shipping invoice is generated so that warehouse fulfillment workers
can complete and check the order
If you have the luxury of starting this business from scratch and
are writing custom systems to support it, you have little need for
extra application integration. If, on the other hand, you are
enabling web solutions for an existing business, you will probably
have to coordinate the activities of several programs.
Also, the path I have just described is the path the process will
take if everything works correctly. If the site is to be robust,
though, the architect will have to consider various contingencies,
such as:
· What if the product is back ordered, or perhaps discontinued?
· Is the product compatible with all forms of shipping?
· What happens if the acknowledgement is bounced at the receiving domain?
Different applications may have to be invoked in different
sequences to account for all the contingencies that are possible in
the system, as shown in the diagram below:
In other words, we need an integrated system that is flexible, and
has built-in error-handling.
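The idea of alternative paths can be sketched in a few lines of code. The example below models part of the retail purchase and some of the contingencies listed above. Everything in it is hypothetical: the exception names, the in-memory inventory table, the rule that decides whether an e-mail bounces, and the chosen responses to each failure are stand-ins for real applications and real business decisions.

```python
from dataclasses import dataclass, field


class OutOfStock(Exception):
    pass


class Discontinued(Exception):
    pass


class MailBounced(Exception):
    pass


@dataclass
class Order:
    sku: str
    email: str
    log: list = field(default_factory=list)  # records the path taken


# Hypothetical inventory: a missing SKU means the product is discontinued.
INVENTORY = {"BK-1001": 4, "BK-2002": 0}


def reserve_stock(order: Order) -> None:
    if order.sku not in INVENTORY:
        raise Discontinued(order.sku)
    if INVENTORY[order.sku] == 0:
        raise OutOfStock(order.sku)
    order.log.append("reserved")


def send_acknowledgement(order: Order) -> None:
    # Stand-in rule: only addresses at example.com are deliverable.
    if not order.email.endswith("@example.com"):
        raise MailBounced(order.email)
    order.log.append("acknowledged")


def process(order: Order) -> str:
    """Run the workflow, switching to an alternative path on each failure."""
    try:
        reserve_stock(order)
        status = "confirmed"
    except OutOfStock:
        order.log.append("back-ordered")
        status = "back-ordered"
    except Discontinued:
        order.log.append("rejected")
        return "rejected"

    try:
        send_acknowledgement(order)
    except MailBounced:
        # The order still proceeds; someone must contact the customer another way.
        order.log.append("flagged for follow-up")

    if status == "confirmed":
        order.log.append("invoice generated")
    return status
```

Even in this toy version, the error-handling paths take up more of `process` than the successful path does, which is typical of real workflows.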
Monitoring the Workflow
Even when all the individual applications have been stitched
together to implement the desired workflow, system administrators
must be able to monitor the process during run-time, to manually
verify the status of any particular item of work and intervene if
necessary. They should also be able to adjust the parameters
governing each step in the workflow. Considerations include:
· Should some steps be processed in batches, and if so, at what
intervals?
· How long should one process wait for a reply from another before
raising an error?
In other words, it is not enough to build an integrated system. We
must be able to monitor and control it, as well.
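One way to make such parameters adjustable is to keep them in a settings object that each step consults, rather than hard-coding them. The sketch below assumes a hypothetical `StepSettings` type holding a batch size and a reply timeout, and uses a standard in-process queue as a stand-in for the channel on which replies from another application arrive.

```python
import queue
from dataclasses import dataclass


@dataclass
class StepSettings:
    """Hypothetical per-step parameters an administrator could adjust."""

    reply_timeout_seconds: float
    batch_size: int


def wait_for_reply(replies: queue.Queue, settings: StepSettings):
    """Return the next reply, or raise TimeoutError if none arrives in time."""
    try:
        return replies.get(timeout=settings.reply_timeout_seconds)
    except queue.Empty:
        raise TimeoutError(
            f"no reply within {settings.reply_timeout_seconds} seconds"
        ) from None


def batches(items: list, settings: StepSettings):
    """Yield the work items in groups of at most batch_size."""
    for start in range(0, len(items), settings.batch_size):
        yield items[start:start + settings.batch_size]
```

With `StepSettings(reply_timeout_seconds=0.05, batch_size=2)`, five work items are split into groups of two, two, and one, and a step waiting on an empty reply queue raises `TimeoutError` after roughly 50 milliseconds.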
Summarizing the Challenges of Enterprise Application Integration
The task of application integration poses several challenges.
Efficiently overcoming them requires a set of tools and technologies.
The challenges include:
· Defining one or more data formats for the exchange of data
· Defining the logical exchange of data between two applications
· Implementing the physical exchange of data, accounting for dissimilar
protocols and asynchronous exchanges
· Defining ideal workflow processes
· Identifying error conditions and processing exceptions, and defining
workflow to handle these cases
· Monitoring and operating the integrated system effectively and
efficiently
These are the challenges that BizTalk Server is designed to
overcome. When we discuss the various parts of the product later, you
will see that there are different tools and services that address
each of these areas. Before we get into that, however, let's look at
some scenarios that use EAI so that you can see some ways in which
these challenges arise.