There are circumstances when the use of Microservices is the best choice. These were discussed primarily in Chapters 1, 2, and 8. Despite the advantages of Microservices, making sweeping changes to introduce system-wide use of Microservices might not be necessary. The number of Microservices should be determined by business purpose and technical justification.
Note
The running case study of NuCoverage, which was introduced in Chapter 1 and has been carried throughout this book, is continued in this chapter. If some context within the examples seems to be missing, please see the earlier chapters.
Transitioning from a Monolith-only architecture to some qualified number of Microservices might mean preserving some parts of the Monolith to operate in conjunction with the newly extracted Microservices. There might also be a good reason to entirely replace a Monolith with a full array of Microservices. This chapter addresses both approaches.
When breaking off Microservices from a Monolith, the freshly introduced distributed system components bring along a significantly higher potential for runtime failure. It’s not only the Microservices that introduce failure potential, but also the overall distributed ecosystem required to provide an end-to-end solution. This is largely due to the increased dependency on the network to communicate across different computing nodes, and the fact that both networks and nodes tend to fail: the more nodes, the greater the likelihood of failure. To fathom the new complexities, it’s beneficial to consider the effects of falling into the traps of distributed computing. Here is a list of the Fallacies of Distributed Computing [Fallacies]:
1. The network is reliable.
2. Latency is zero.
3. Bandwidth is infinite.
4. The network is secure.
5. Topology doesn't change.
6. There is one administrator.
7. Transport cost is zero.
8. The network is homogeneous.
Now consider the consequences of believing these fallacies, and how distributed systems fail because none of them is true. These are the reasons that distributed systems software will fail:
1. The network is unreliable and changes over time without warning, and software lacks resiliency and recovery tactics.
2. Latency is indeterminate and unpredictable, and worsened by the obliviousness common among inexperienced software developers.
3. Bandwidth fluctuates at counterintuitive times, and that inconsistency is exacerbated by naivete.
4. The network is not fully secure due to the Internet’s historical openness, as well as the complexity faced by security teams required to balance default openness with corporate security policies, and even complacency and laxness among common users who don’t follow mandated security policies.
5. Topology has suboptimal routes and produces unexpected bottlenecks.
6. An unknown number of administrators create conflicting policies that slow or block traffic.
7. Building network transports is nontrivial, and mistakes in their construction can have negative financial and/or functional and non-functional consequences.
8. Assuming that the network is composed entirely of the same or similar parts leads to the effects of 1–3.
Points 1–3 indicate the primary problems with distributed systems. Most software is unprepared to deal with network failures, or any other failures common to distributed computing. Points 1–8 address network failures alone; they don’t account for software failures unrelated to the network. Now add to that list all other kinds of failure: application errors, server and other hardware failures, persistence resources that are slowed or downed, and other general infrastructure failures. All of these problems can lead to catastrophic consequences.
Distributed systems must have non-naive designs. Design with failure prevention and recovery in mind:
▪ Failure supervision: Prevents cascading failures by placing software bulkheads around application components.
▪ Circuit breakers: Prevents cascading failures by blocking access to a failed component until that component is restored to full operation.
▪ Retries with capped exponential backoff: Provides the means for failed requests from a client to a server to retry until the server is restored to full operation, but without overwhelming the network or server by rapidly repeating retries at a constant rate. (A minimal sketch of this tactic appears just after this list.)
▪ Idempotent receivers: When a server receives the same request more than once, it knows how to safely ignore subsequent operations, or otherwise perform the operation again without negative consequences.
▪ Tolerating out-of-sequence message deliveries: Design mutating operations to be commutative, in the way that addition and multiplication are (2 + 3 = 3 + 2) but subtraction and division are not. Additionally, rather than demand a specific sequence, understand what qualifies as the total of all necessary message deliveries and take the next step(s) only when that total is reached.
▪ Avoid overwhelming requests and/or deliveries by using load shedding, offloading, and backpressure: To lighten the runtime burdens on subsystems, load shedding drops or ignores some unnecessary work. Offloading distributes work to other workers that have capacity. Consumers use backpressure to limit the number of elements that might be received from a producer within a given time frame.
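To make the retry guidance concrete, here is a minimal sketch, assuming a plain Java environment; the `submitInvoice` call in the usage comment is purely illustrative. The delay doubles on each failed attempt, is capped, and is jittered so that many failing clients don’t retry in lockstep.

```java
import java.time.Duration;
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

public final class Retries {
    // Retries the operation with capped exponential backoff plus full jitter.
    // Example (hypothetical client): Retries.withBackoff(
    //     () -> billingClient.submitInvoice(invoice), 5,
    //     Duration.ofMillis(100), Duration.ofSeconds(5));
    public static <T> T withBackoff(Supplier<T> operation,
                                    int maxAttempts,
                                    Duration baseDelay,
                                    Duration maxDelay) throws InterruptedException {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        long delayMillis = baseDelay.toMillis();
        RuntimeException lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException e) {
                lastFailure = e;
                if (attempt == maxAttempts) break;
                // Sleep a random duration up to the current cap (full jitter).
                Thread.sleep(ThreadLocalRandom.current().nextLong(delayMillis + 1));
                delayMillis = Math.min(delayMillis * 2, maxDelay.toMillis());
            }
        }
        throw lastFailure;
    }
}
```

Pairing retries with idempotent receivers matters: a timed-out request might have succeeded on the server, so a retry can arrive as a duplicate.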
Of course, none of these remedies, by itself, will solve all problems. At a minimum, failure supervision can prevent catastrophic outcomes. Supervision might be difficult to introduce for some architectural designs. The use of the Actor Model will generally provide supervision as a platform requirement, but not always. Still, most software runtime environments are devoid of supervision.
Yet, even employing supervision should not be considered a complete solution on its own. Continual failure with local recovery only, and lack of cooperative stabilizing designs, will not reverse deteriorating conditions or prevent compute anomalies. Think of the negative consequences of improperly handled redelivery and wrongly ordered mutating operations. For example, continuously adding to or subtracting from financial accounts because of unrecognized redelivery will inflict monetary damage on multiple parties, and possibly even prove fatal to the party responsible for the error.
Teams must prepare mentally and resolve to prevent landslide failures, and anticipate that the small technical failures that arise can become large technical failures. This must be followed up with robust designs and implementations, which are essential for successful distributed systems operations.
Chapter 10, “Building Monoliths Like You Mean It,” outlined two approaches for building modular Monoliths:
▪ Build a modular Monolith from the start.
▪ Recover from the negative effects of a Big Ball of Mud by gradually refactoring it into a modular Monolith.
Figures 10.2 and 10.3 in Chapter 10 illustrated the ultimate goal of forming eight contextual modules inside a single deployment container named Auto Insurance System. Using either approach, NuCoverage now has a modular Monolith. Even so, some new dynamics within the system are causing pain.
There are five total teams working on the eight subdomains and contextual modules. Each of the modules changes at a different rate, but two of the eight modules, Risk and Rate, change especially frequently. The business rules change in both, as well as the algorithms that execute the actuarial processing and the pricing calculations, respectively. Additionally, Risk and Rate must scale independently due to their heavy loads and their demand for greater resources. Moreover, the company plans to expand the Rewards offerings, and its addition of new lines of insurance will bring several other impacts. This means that Rewards must be factored out of Policyholder Accounts. The custom-built legacy Billing subsystem is getting long in the tooth and lacks new billing rules and payment options. It will be replaced by a Software as a Service (SaaS)–based billing solution.
All these changes call for extracting some contexts from the Monolith into Microservices:
1. Risk Context
2. Rate Context
3. Policyholder Accounts Context
4. Rewards Context
5. Billing Context
There will be other extractions, but there are enough unfamiliar challenges to be faced with these initial ones. Making stepwise refinements with these five business capabilities will help the teams gain experience with relatively low risk.
The initial tasks extract one contextual module at a time. As Figure 11.1 shows, within a relatively short time, the four existing contexts have been extracted into four autonomous Microservices. Although the Billing Context will eventually be completely replaced, the team takes a conservative step of extracting the existing modular context with a specific purpose in mind. More refactorings follow.
Note
As was the case with Figure 10.3, the user interfaces are not shown in Figure 11.1 for the sake of simplicity, so it appears that users are interacting directly with adapters. Also, the users would usually be shown strictly on the left of each subsystem. Here the architectures are “rotated” for the convenience of showing users surrounding the system and having roles in multiple subsystems. User interfaces are discussed later in the section “User Interactions.”
Because the Monolith is already using a Message Bus with reliable delivery, the inter-context communication using commands, events, and queries by means of messaging requires no changes when extracting new contexts. What changes is that any new Microservices must manage security on their own to keep out bad actors and provide authorization to control who can do what. Also, the Rewards model must be extracted from the Policyholder Accounts Context and the Billing Context must be greatly altered, mostly to provide a transport between the existing system and the new SaaS Billing service. And we’ll add another item to the technical concerns checklist: Each Bounded Context should own its own database(s).
The SaaS-based Billing service doesn’t understand the Auto Insurance System’s events or offer a means to stream events to it. As seen in Figure 11.2, to facilitate the integration of the NuCoverage system with the subscription Billing service, we need a small Billing Interchange Context that is responsible for translating the local events into API calls on the SaaS-based Billing service.
There is a feed available from the SaaS-based Billing service that provides a stream of event records that carry information about what has happened there. The NuCoverage Billing Interchange Context translates that stream into events that are published onto the NuCoverage Message Bus and understood where needed. The event types already exist, so the system as a whole does not need to change as a result of using a new Billing subsystem.
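To suggest the shape of that translation, here is a minimal sketch of the outbound side of the Billing Interchange Context. The PolicyBound event, the CreateInvoiceRequest payload, and the SaasBillingClient wrapper are illustrative assumptions, not the actual NuCoverage types or the vendor’s API.

```java
// Hypothetical local event and vendor request types, shown only so the sketch is self-contained.
record PolicyBound(String policyId, String policyholderId, long premiumCents) {}
record CreateInvoiceRequest(String policyId, String accountId, long amountCents) {}

// Assumed thin wrapper over the SaaS-based Billing service's HTTP API.
interface SaasBillingClient {
    void createInvoice(CreateInvoiceRequest request);
}

// Outbound translator in the Billing Interchange Context: it subscribes to local
// events on the Message Bus and turns them into API calls on the SaaS Billing service.
final class BillingInterchangeTranslator {
    private final SaasBillingClient billingClient;

    BillingInterchangeTranslator(SaasBillingClient billingClient) {
        this.billingClient = billingClient;
    }

    // Invoked by the Message Bus subscription when a PolicyBound event arrives.
    void when(PolicyBound event) {
        billingClient.createInvoice(new CreateInvoiceRequest(
                event.policyId(), event.policyholderId(), event.premiumCents()));
    }
}
```

The inbound side works in reverse: a feed reader in the same context consumes the SaaS service’s event records and republishes them on the Message Bus as the event types the rest of the system already understands.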
The initial extraction of the Billing Context might seem like a wasteful step because much of it will be thrown out. It’s true that the legacy business logic in Billing will eventually go away. Even so, taking advice from refactoring guidelines, it would be safer to prove that extracting the current Billing Context doesn’t break the system. Once that is done, each of the features provided by the legacy Billing Context can be redirected one-by-one to the new SaaS Billing service.
True, the team could redirect Billing features one-by-one from within the Monolith to the subscription service.1 Yet, taking the initial step will create the separation that is ultimately desired. Specifically, the Billing capability will no longer be inside the Monolith and there will be one less team to cause a conflicting pace of change for the other teams. This will be especially helpful as the intense refactoring of Billing takes place and several releases begin occurring daily or at least weekly. Consider how much time would be involved in extracting the current Billing Context compared to the value it might or might not provide, and decide how the team should proceed.
1 This is a form of the Strangler Fig Application pattern (also known as Strangler), which is discussed in the next section.
Finally, the Rewards model is extracted from the Policyholder Accounts Context into its own Rewards Context and packaged and deployed as a Microservice. This is a fairly simple extraction because the current Rewards support is implemented as just one or a few attributes on the Account type. Nevertheless, migrating these into a new context and ensuring that all Rewards concerns are now addressed in one place will provide at least some opportunities for the team to pause and think. The good news is that the number and types of Rewards can now more readily expand to popularize new lines of insurance as the specific coverage products go live.
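As a rough sketch of what the extraction yields, the few Rewards attributes formerly carried by the legacy Account type become a small model owned by the Rewards Context. The names below are illustrative assumptions rather than the actual NuCoverage model.

```java
// Before (conceptually): class Account { ... int rewardPoints; String rewardTier; ... }
// After: a minimal Rewards model owned by the new Rewards Context. Note that the
// account is referenced by identity only, never by a shared object across contexts.
public final class Reward {
    private final String accountId;
    private final String rewardType;  // e.g., "SafeDriver", "MultiPolicy"
    private int points;

    public Reward(String accountId, String rewardType, int points) {
        this.accountId = accountId;
        this.rewardType = rewardType;
        this.points = points;
    }

    public void addPoints(int earned) {
        this.points += earned;
    }

    public String accountId() { return accountId; }
    public String rewardType() { return rewardType; }
    public int points() { return points; }
}
```

Because the Rewards Context now owns this model and its database, adding new reward types for new lines of insurance no longer requires touching Policyholder Accounts.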
The far more challenging approach to extracting components from a Monolith involves decomposing a Big Ball of Mud directly into Microservices. Considering that this effort requires assuming several complexities at once, it isn’t difficult to perceive how challenging it is. All of the steps described in the section “Right from Wrong” in Chapter 10 must be shoved and squeezed through a narrow seam that must be pried open for each rework task. Because this is all done while the existing Big Ball of Mud is still running in production, nothing can be broken or compromised by this decomposition, or at least not for very long. Collectively, the resulting sweeping changes will be comparable to a very big leap forward. Even so, there’s a lot of weight to lose and software strengthening required to make that strenuous long jump.
Again, the major change involves transitioning away from troublesome software development behaviors, pivoting on a pebble’s radius to become a team operating in a highly optimized iterative process, while employing business-driven strategy with quality assurance. This must happen quickly. A tall order? Absolutely.
Here’s a short list of what has to happen:
Step 1. Identify all business capabilities that exist in the Big Ball of Mud.
Step 2. Rank business capabilities according to strategic significance: core competitive advantage > supporting functionality > generic operations that could be replaced by third-party solutions.
Step 3. Gather an inventory of the business rules and functionality that are still relevant and now irrelevant.
Step 4. Decide which business capabilities will be extracted in priority order.
Step 5. Plan how the first of the business capabilities will be extracted.
Step 6. Work iteratively and deliver incrementally on steps 1–5.
Early on, the teams can bias step 4 toward prioritizing an attack that obtains an easy win. An example is replacing the legacy Policyholder Accounts with a new Microservice. This would naturally lead to the Rewards Context being created so that it can be extracted from Policyholder Accounts. These changes are the least complex and can help build the teams’ experience and confidence.
Several varied details are inherent within step 5. The sections that follow disclose the details, which are not necessarily given in sequential order. Some steps may be taken at any given time as demand requires. For the most part, this technique is known as “strangling” the Monolith, much in the same way a vine strangles a tree.
The system remains enabled and available for users of the system at all times. Thus, the user interface must remain consistent.
Maintaining the system’s contract with users can be facilitated by placing a Facade [GoF] between users and the Monolith. Figure 11.3 shows an API Gateway as the Facade, which works as described next.
1. Initially all user requests are directed from the API Gateway to the legacy Big Ball of Mud.
2. As the features of business capabilities are extracted into Microservices and enabled for the user, the API Gateway is changed to direct some or all respective user requests to the newly enabled Microservice.
3. This scheme can be used for A/B testing to determine if users in groups A and B have good/bad and better/worse experiences.
4. If there are bad or worse experiences with the new Microservice, direct all user requests back to the legacy Big Ball of Mud.
5. If tests determine that the newly enabled Microservice is sound and properly integrated, direct all user requests there.
Repeat these steps for every newly extracted Microservice.
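The routing scheme just described might be sketched as follows. This is a toy in-process router, assuming percentage-based A/B splitting; a real deployment would more likely configure an off-the-shelf API Gateway product to the same effect.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Strangler-style routing table: each path prefix can send traffic to the legacy
// Big Ball of Mud, to a newly extracted Microservice, or split it for A/B testing.
final class StranglerRouter {
    record Route(String legacyUrl, String microserviceUrl, int percentToMicroservice) {}

    private final Map<String, Route> routes = new ConcurrentHashMap<>();

    void configure(String pathPrefix, Route route) {
        routes.put(pathPrefix, route);
    }

    // Returns the backend base URL that should serve the given request path.
    String backendFor(String path) {
        return routes.entrySet().stream()
                .filter(entry -> path.startsWith(entry.getKey()))
                .findFirst()
                .map(entry -> choose(entry.getValue()))
                .orElseThrow(() -> new IllegalArgumentException("No route for " + path));
    }

    private String choose(Route route) {
        // Roll 0..99; below the threshold goes to the Microservice, otherwise to the legacy.
        return ThreadLocalRandom.current().nextInt(100) < route.percentToMicroservice()
                ? route.microserviceUrl()
                : route.legacyUrl();
    }
}
```

During A/B testing the split might be 10 percent to the new Risk Microservice; rolling back means setting it to 0, and full cutover means 100.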
It is possible to succeed in redirecting requests without the use of an API Gateway, but it requires changes to the UI. Although it might be more difficult to perform A/B tests this way, it is still doable. The authors have used this approach with success. Nevertheless, an API Gateway makes it far simpler.
A common complexity here, regardless of the approach used, is the need to deal with single requests that must be split. This occurs when the legacy and one or more new Microservices all provide part of the requested service. Whatever results they produce, the multiple outcomes must be aggregated. This is much simpler to handle through an API Gateway. A query request and response can often be readily aggregated. By comparison, requests for creates and updates are not so readily aggregated. Some query tools can make this task run more smoothly, but the solution might run the risk of needing full and direct database access rather than using the well-designed application and domain model. A query tool such as GraphQL can also work through REST request–response communications, thus preserving valuable application services and the domain model. The pattern has been referred to as GraphQL for Server-Side Resource Aggregation [TW-GraphQL].
The preceding discussion points out a need to rethink how the user interface is designed and built. As data becomes segregated into various modules within the legacy, or as whole business capabilities are reimplemented in autonomous Microservices, the user interface must display the aggregation of the data from multiple sources to provide a meaningful user experience. There is no longer just one place from which to get all the data to render in the user interface.2 This is yet another reason to use an API Gateway. As illustrated in Figure 11.4, the pattern that changes the way the user interface works is known as Composite UI.
2 Actually, there was never a convenient way to aggregate data into the user interface from the Big Ball of Mud, although there was possibly an illusion that such a way existed from an outside perspective.
When using this pattern, the UI is aware that different components are assembled within one Web page. This is accomplished by using the relatively simple HTML Web markup language. It provides block-based document segmentation or divisions, which can be used to compose various areas of the Web page to show data from different subsystems. Each segment or division gets its content from a different source. In Figure 11.4, there are three user interface components to be rendered, which are obtained by querying Underwriting in the legacy, and Risk and Rate in the new Microservices. All of this aggregation and composition work can be managed with an API Gateway. The API Gateway can use traditional REST queries or specialized GraphQL queries.
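A server-side sketch of that composition follows, assuming the API Gateway exposes a single page-model endpoint and that Underwriting, Risk, and Rate each offer a simple JSON query endpoint; the URLs and response handling are assumptions made for illustration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// API Gateway-side composition for one Composite UI page: the page model is
// assembled from the legacy Underwriting module and the Risk and Rate
// Microservices, each of which fills its own segment (division) of the page.
final class QuotePageComposer {
    private final HttpClient http = HttpClient.newHttpClient();

    // Base URLs are assumptions for the sketch.
    private static final String UNDERWRITING = "http://legacy.internal/underwriting/applications/";
    private static final String RISK = "http://risk.internal/assessments/";
    private static final String RATE = "http://rate.internal/quotes/";

    Map<String, String> composePage(String applicationId) {
        // Query the three sources concurrently rather than serially.
        CompletableFuture<String> underwriting = fetch(UNDERWRITING + applicationId);
        CompletableFuture<String> risk = fetch(RISK + applicationId);
        CompletableFuture<String> rate = fetch(RATE + applicationId);

        // Each JSON fragment is rendered into its own division of the Web page.
        return Map.of(
                "underwritingSegment", underwriting.join(),
                "riskSegment", risk.join(),
                "rateSegment", rate.join());
    }

    private CompletableFuture<String> fetch(String url) {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return http.sendAsync(request, HttpResponse.BodyHandlers.ofString())
                   .thenApply(HttpResponse::body);
    }
}
```

A GraphQL-based gateway accomplishes the same aggregation declaratively, while still going through the application services rather than reaching directly into the databases.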
Note that it is possible to accomplish the same goals using other approaches, such as a backend for frontend and micro frontends. These approaches are described in detail in our follow-on book Implementing Strategic Monoliths and Microservices (Vernon & Jaskuła, Addison-Wesley, forthcoming).
While strangling the Monolith, the data that is being maintained in a Microservice must be given to the legacy, and the legacy data must be given to the Microservices. Otherwise, users will see inconsistent views of what they perceive as the same data. Thus, whenever system state data is mutated in the legacy, or when a Microservice’s state is modified, generally the changes must be harmonized on the opposite side. That is, changes to the legacy system state are generally made available to all Microservices that use this data. Likewise, state changes made in a Microservice will likely be needed by the legacy application. Because “generally” and “likely” aren’t decisions, each case must be dealt with individually.
Note
It might seem better to refer to “harmonizing” as “synchronizing,” because what is described here is a sort of synchronization. Yet, the data will no doubt take a different shape and possibly change its meaning between the legacy system and the new Microservices. Thus, the authors have chosen the terms “harmonizing” and “harmony.” What’s happening here is similar to two singers who have different vocal ranges, but can still sing in harmony.
Such cross-process data harmony will be eventually consistent. Thus, if the legacy system state changes, the Microservice that must be harmonized will only be able to merge the changes after some delay. For this reason, the Microservice will have at least a brief discrepancy with the data in the legacy system. The same goes for the legacy system being eventually harmonized with data changes in the Microservices. There isn’t much that can be done about this temporary mismatch other than querying the multiple sources of data and merging into the latest snapshot. Even so, changes to the same data may occur before the current use case is completed. In distributed computing, there truly is no “now,” and time always wins over consistency.
Harmonizing data modifications on both sides is essential when both remain at least partially dependent on the same data. This might seem odd, but until an entire business capability is replaced by one or more Microservices, all subsystems holding the same data must eventually agree on the system’s state. This might be necessary for a while, despite it not being the ultimate goal. The ultimate goal is a single source of truth.
Of course, having a single source of truth is impossible to achieve at all times because at any given moment the changes in the legacy or a given Microservice will be recorded in only one place. The harmony of system state is achieved only eventually; that is, a true distributed system is always eventually consistent. In reality, in a large and complex system with many constantly changing sources of data, all system states are never fully consistent. Achieving this feat would require defying the laws of physics, so it’s best not even to try.
Next, we’ll present three primary approaches to achieve eventual data harmony of dependent system states: database triggers, event surfacing, and Change Data Capture (CDC).
The first approach uses database triggers. When a trigger fires, an event is created to represent what changed, inserted into an events table, and managed by the transaction scope. The event is made available to Microservices through message publishing. A given use case might need triggers to fire on multiple tables in a single transaction. The problem with triggers is that database products other than RDBMS types don’t support them. Even when they are supported, triggers can be tedious to work with. Triggers can also be slow when used on a database under heavy load.
The authors have used this technique when no better solution was available. In one case, the legacy Big Ball of Mud was implemented in an arcane language and framework with a sizable amount of reimplementations and new work using a modern programming language. It would have been quite difficult to use anything other than triggers due to the extensive use of the Microsoft SQL Server database in that environment. Fortunately, with SQL Server, all triggers can fire before the transaction commits, enabling the aggregation of multi-table modifications into a single event. At the time, Change Data Capture was not an option. Using database triggers in this situation was challenging, but it worked.
The second approach to harmonize data is to create a table in the legacy database that will be used to insert events related to each transaction. Call it the events table, and the technique event surfacing. We chose the name event surfacing because the original implementation was not designed with events, but now events are being jammed into the legacy system codebase to make strangling it easier. The difference between database triggers and event surfacing is that the first approach creates an event when one or more triggers are fired, while the second approach explicitly creates the event in the application code and persists it into the database along with the entities.
When inserting, updating, or deleting one or more application entities to carry out a use case, a new event is created by the application (perhaps the Service Layer) and inserted into the events table. The Entities and the event are committed in the same transaction. The event will later be queried by a background task and published via messaging.
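For the straightforward case, a sketch of event surfacing in a legacy Service Layer component might look like the following, assuming plain JDBC and a hypothetical events table with event_type and payload columns; the entity update and the surfaced event commit or roll back together.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import javax.sql.DataSource;

// Event surfacing: the legacy application explicitly writes an event row in the
// same transaction as the entity change. A background task later reads the events
// table and publishes each surfaced event onto the Message Bus.
final class AccountService {
    private final DataSource dataSource;

    AccountService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    void changeMailingAddress(String accountId, String newAddress) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            connection.setAutoCommit(false);
            try (PreparedStatement updateAccount = connection.prepareStatement(
                         "UPDATE account SET mailing_address = ? WHERE account_id = ?");
                 PreparedStatement insertEvent = connection.prepareStatement(
                         "INSERT INTO events (event_type, payload) VALUES (?, ?)")) {

                updateAccount.setString(1, newAddress);
                updateAccount.setString(2, accountId);
                updateAccount.executeUpdate();

                insertEvent.setString(1, "MailingAddressChanged");
                insertEvent.setString(2, "{\"accountId\":\"" + accountId + "\"}");
                insertEvent.executeUpdate();

                connection.commit(); // entity change and surfaced event succeed or fail together
            } catch (SQLException e) {
                connection.rollback();
                throw e;
            }
        }
    }
}
```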
The complexity here is that it is often difficult to find a good place to create and inject the event. Sometimes multiple Service Layer components will modify system state independently while being managed within a single transaction scope. Unless some top-level component is orchestrating the subtasks, how and where is a single event created? Perhaps each individual Service Layer component will create a unique event, and the Microservices must deal with multiple events rather than one. If a thread-local or thread-static concept is available, and assuming the legacy code is single threaded per request, this could be used to advantage in aggregating all in-thread changes into a single event. These technical topics are discussed in detail in our follow-on book, Implementing Strategic Monoliths and Microservices (Vernon & Jaskuła, Addison-Wesley, forthcoming).
A third approach uses special database tooling that supports Change Data Capture [CDC]. In fact, triggers can be used to implement a form of Change Data Capture, but a class of highly specialized tools is available for this task that works even better. Such tools read the database transaction log, which holds only the changes made to the underlying data. The extraction tool runs as a separate process from the database, which eliminates problematic issues related to coupling and contention. This specific technique is referred to as database transaction log tailing.
For example, the tool named Debezium [Debezium] is available as open source. Debezium has had some limitations in regard to the number of database products it supports, but its abilities have been steadily increasing and improving. Unsurprisingly, Debezium gives priority to supporting open source database products. At the time of writing, there were nine total supported database products—six open source and three fully commercial. Take this product seriously and watch this space.
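Whichever tool is chosen, the listener side of log tailing might look roughly like this sketch. The ChangeRecord shape and the MessageBus interface are assumptions standing in for whatever the CDC tool and the messaging infrastructure actually provide.

```java
import java.util.Map;

// Assumed shape of a change record delivered by a CDC/log-tailing tool: which
// table changed, the kind of change, and the column values after the change.
record ChangeRecord(String table, String operation, Map<String, Object> after) {}

// Assumed abstraction over the existing Message Bus.
interface MessageBus {
    void publish(String eventType, Object payload);
}

// Translates low-level change records tailed from the legacy database's
// transaction log into events that the Microservices understand.
final class LegacyChangeListener {
    private final MessageBus bus;

    LegacyChangeListener(MessageBus bus) {
        this.bus = bus;
    }

    // Invoked by the CDC tool for every committed change it reads from the log.
    void onChange(ChangeRecord change) {
        if ("account".equals(change.table()) && "UPDATE".equals(change.operation())) {
            bus.publish("AccountChanged", change.after());
        }
        // ...mappings for other tables and operations...
    }
}
```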
Using Change Data Capture is very efficient when its proper use is well understood. Such tools can be used for other solutions, and there are patterns to describe those. The additional patterns are described in the follow-on book within this series, Implementing Strategic Monoliths and Microservices (Vernon & Jaskuła, Addison-Wesley, forthcoming).
All of the previously discussed approaches to harmonizing data are a means to the same end. They are not intended to make the Big Ball of Mud a better place to work, although that can be a goal (as discussed in Chapter 10). In this case, NuCoverage has decided that its Big Ball of Mud should be decomposed piece by piece, and finally decommissioned.
Likely one of the previously discussed approaches will suit this specific situation, or at least provide some significant clues about what will work in other cases. Once an approach has been chosen, it’s time to apply it to harmonizing data. Figure 11.5 and the list that follows explain how this process works from end to end.
1. The user submits a request.
2. The API Gateway directs the request to the legacy system.
3. Data modifications are persisted into the legacy database transaction log.
4. Change Data Capture reads the database transaction log and dispatches to a listener, and the listener creates an event that is placed on the Message Bus.
5. The Message Bus delivers the event to the Microservice, which establishes consistency (database persistence implied).
6. The user submits a request.
7. The API Gateway directs the request to the Microservice.
8. The event emitted by the Microservice (database persistence implied) is placed on the Message Bus.
9. The Message Bus delivers the event to the legacy, which harmonizes the data.
10. The harmonized data modifications are persisted in the legacy database.
Note that both sides must understand how to detect and ignore events that are seen during a full round-trip. That is, an event caused by the legacy that harmonizes state in the Microservice will, in turn, emit an event that the legacy system will receive. The legacy system must know to ignore the event that is the result of a full round-trip. The same goes for an event that originates in the Microservice as a result of a direct user action and is used to harmonize state in the legacy, which in turn emits an event from Change Data Capture. The Microservice must know to ignore the event that is the result of a full round-trip. If this consideration is not taken into account, a single database transaction could lead to infinite synchronization. This outcome can be avoided by each event carrying the originator’s correlation identity/tag, which must be propagated by all receivers into their emitted events. When the originator’s correlation identity/tag is seen back home, the event carrying it can be safely ignored.
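One way to implement that round-trip guard is sketched below, assuming every event travels in an envelope carrying the originator’s identity and correlation tag; the envelope shape is an assumption for illustration.

```java
// Every event carries the identity of the subsystem that originated the change,
// plus a correlation tag. Receivers copy these into any event they emit as a
// consequence, so the originator can recognize its own change coming back around.
record EventEnvelope(String originatorId, String correlationId,
                     String eventType, Object payload) {}

final class RoundTripGuard {
    private final String localSubsystemId; // e.g., "legacy" or "rewards-microservice"

    RoundTripGuard(String localSubsystemId) {
        this.localSubsystemId = localSubsystemId;
    }

    // True when the incoming event is the echo of a change this subsystem originated;
    // such events are safely ignored rather than harmonized again.
    boolean isEcho(EventEnvelope event) {
        return localSubsystemId.equals(event.originatorId());
    }

    // When harmonizing produces a consequent event, propagate the original
    // originator and correlation tag so the next hop can detect the full round-trip.
    EventEnvelope consequentOf(EventEnvelope cause, String eventType, Object payload) {
        return new EventEnvelope(cause.originatorId(), cause.correlationId(), eventType, payload);
    }
}
```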
The previous two subsections describe the procedures for facilitating the use of a pattern named Strangler Fig Application [Strangler-Fig], or simply Strangler [Strangler].3 Here, we consider how to work within the ecosystem that the Strangler creates.
3 Interestingly, Martin Fowler later decided that he didn’t like the name “Strangler Application,” which he originally used, because the word “strangler” has an “unpleasantly violent connotation.” Yet, encyclopedias and other sources use “Strangler” as a common name for this plant species. The authors use the name best known in the software industry, which is “Strangler.”
Interestingly, the physical strangler fig itself, if it were sentient, might be labeled an opportunist. It first gets a ride up a tree from inside an animal, such as a monkey, and is then deposited on the tree in a most unpleasant package. Growth begins by taking in sunlight and nutrients from rain, as well as plant litter from the tree that becomes its host. From its humble beginnings, the strangler fig sprouts roots that begin a slow growth following rich and convenient paths. When the roots find the ground, they embed themselves securely into the soil and begin growing aggressively. This opens new opportunities to forcefully overtake the host tree by consuming the nutrients that the tree would otherwise receive. At the point where it covers the whole host tree, the strangler fig tightens its grip by developing thicker roots and squeezes the trunk of the tree. It simultaneously stretches above the host, enveloping the top and blocking the sunlight’s reach. Incredibly, it’s not only strangulation that kills the tree, but also depletion of sunlight and root monopoly of soil nutrients.
The organic “opportunistic strategy and tactics” provide a framework for succeeding when strangling a legacy Big Ball of Mud. This is a fitting metaphor from which we can learn valuable lessons.
▪ Things will be ugly and smelly when we first start out, and it might appear to be a long way to fall from our humble beginnings, but starting is a must.
▪ Begin slowly, and be biased toward prioritizing for an easy win. The example we cited earlier was replacing the legacy Policyholder Accounts with a new Microservice.
▪ After achieving some early growth and gaining confidence, it’s possible to sprout new roots by following a few more opportunities for certain wins.
▪ Continue in this mode until the team has become grounded in the techniques.
▪ Start aggressive growth, tightening the grip around the Monolith.
▪ If the choice is to kill the Monolith, that final squeeze might not be a long way off.
One good thing about this approach is that it might not be necessary to carry the baggage of the legacy forward. It’s possible that some existing code could be migrated to Microservices, but that should be done with caution. Strangling is not the same as refactoring from a Big Ball of Mud to a modular Monolith, and this is a great opportunity to shed the weight of the legacy and gain strength from a well-designed set of Microservices. This is discussed in more detail later in this chapter in the section “Unplugging the Legacy Monolith.”
Consider some cautionary points regarding the decomposition of a Big Ball of Mud Monolith into Microservices:
▪ Beware of blurred behavior. The legacy code has no explicit model of a given business behavior. Other impediments include components that have too many responsibilities, business logic flung into the user interface and infrastructure, conflicts between the user interface and service layer, and duplicated behavior and business rules. (On a more technical note, these problems may include issues such as low cohesion of related components and high coupling between any two or more components.) This list of impediments is not exhaustive, but the examples chosen are certainly frequently seen in the wild. One of the authors encountered a scenario where the user interface sported business rules that triggered massive technical workflows and processes and integration with business partners.4 A hierarchical selection list housed business logic, and depending on the list items involved, different categories of data would be updated, triggering the workflows and processes. The codebase provided no practical assistance in understanding the business logic and rules. To change this joyless situation, a deep analysis and extensive archaeological dig, including help from business experts, finally led to knowledge acquisition.
4 In case it is not obvious, this is an antipattern that tends to go far beyond wrong.
▪ Don’t carry over the wrong behavior. Very often some parts of the Big Ball of Mud were wrongly implemented, from both business and technical perspectives. Business people must accomplish their work, even in the face of poorly designed systems. As a result, user-contrived workarounds abound. This tacit knowledge is gained and shared in the user community, and software never answers their needs. Familiarity breeds complacency. Fight it: Don’t replicate the same wrong model into the new Microservices. Instead, challenge business experts constantly to help discover breakthrough innovations. Keep checking the pulse of the legacy decomposition, monitoring how results are being received. One heuristic is to observe users accomplishing a certain task with the system. Does the user experience appear to be intuitive? Do users require special knowledge to work around problems? Do users need paper instructions, sticky notes on their display, or secret spreadsheets with hacks? The more experienced users might carry most of the knowledge in their head, so user experience issues might be difficult to see in some cases. Observe, yes—but also ask and challenge. Be understanding and empathetic.
▪ Don’t implement a new feature in the Monolith. The ongoing decomposition effort might take some time, but the business won’t grind to a halt while waiting for those tasks to be accomplished. Rather, the business will constantly demand new features. If some business capability currently in the Monolith is supposed to be reimplemented in a Microservice, try to avoid implementing a new feature for that business capability in the Monolith, if at all possible. Doing so would be a waste of time and effort, unless there is a very urgent need to introduce the new feature in one of the remaining vestiges of the Monolith. Besides limiting the growth and size of the Monolith (which should instead be shrinking), building the new feature inside a Microservice will accelerate development and more quickly deliver the business value.
Extracting business capability behavior out of a Big Ball of Mud requires very hard work. There has to be a substantial effort devoted to rediscovering the domain, identifying its rules, and challenging them.
On the one hand, at some point it might be practical or even necessary to put the final squeeze on the legacy and bring about its death. On the other hand, that action might be either impractical or unnecessary. If the decision is to keep some parts of the legacy in operation, it might be possible to rid its codebase of obsolete code. Or it could be that the code was previously so tangled that it’s not practical to remove large swaths or even small pieces. The saddest part is knowing that harmonizing data between the legacy and Microservices will very likely be necessary for the foreseeable future and beyond.
If some business capabilities will not be extracted at all, or will remain relevant for only a short time, leave the legacy Monolith running. Of course, this is not desirable because the rigging to maintain data harmony will almost certainly be necessary even for the reduced number of business capabilities in the legacy. Naturally, leaving the Monolith running simultaneously with the new Microservices will increase the operational and development complexity, which will certainly lead to greater business risk.
Some situations will not tolerate retaining the legacy system, even when terminating it is associated with great complexity. Consider the cases where companies are themselves in a stranglehold by computing machinery and software that is long obsolete or extremely unpopular, not to mention the grand expense of renewing licenses and support contracts. It can feel like paying the mob for protection from the mob itself. This legacy overhead imposes a huge “tax,” and unless the organization takes drastic measures, it never, ever, ends. Even beyond vendor lock-in, in some situations it has become nearly impossible to hire developers to maintain the legacy. The original code might have been implemented by people who now have great-grandchildren, or are no longer with us.
For these reasons and more, a less significant muddy legacy system cannot remain in service beyond a given license and support contract expiration date. A few semi-tractor trailers (e.g., lorries or large hauling trucks) must pull up to a shipping dock, where burly movers push some behemoths into the trailer, and take them to the museum of computing. The team charged with decommissioning the legacy must be the best possible human form of the strangler fig. For some CEOs, CFOs, CIOs, and others who would like to move on from the 1960s and 1970s, unplugging machinery will have never felt so good.
This chapter considers how to move from a Monolithic architecture to a Microservices architecture. Because it is essential to understanding the challenges of Microservices, an introduction to the issues related to distributed computing was provided first. Next, we considered the simplest step of transitioning from a well-modularized Monolith to Microservices. With that understanding, we outlined the challenges related to extracting Microservices directly from a Big Ball of Mud, and provided step-by-step guidance for doing so. Lastly, the chapter described the goal of eventually unplugging the legacy Big Ball of Mud in the face of the challenges associated with shedding unhealthy technology lock-in.
The primary takeaways of this chapter are as follows:
▪ Distributed computing introduces several complex challenges, which are mostly avoidable when employing a Monolithic architecture.
▪ The simplest and most direct way to realize Microservices architecture from a legacy system is to start with a well-modularized Monolith, like that developed in Chapter 10.
▪ It is far more challenging to extract components from a legacy Big Ball of Mud system directly into Microservices, because doing so requires assuming several levels of complexities at once.
▪ Consider an API Gateway as a means to aggregate requests between several Microservices and legacy applications when each implements parts of requested services.
▪ A composite UI is a great way of aggregating data from multiple services.
▪ Consider always having a single source of truth for any data when moving features from a Monolith to Microservices.
▪ Database triggers, event surfacing, and Change Data Capture are patterns to consider when migrating a legacy system to Microservices.
In Chapter 12 (the final chapter in this book), we will look back at everything we have seen so far.
[CDC] https://en.wikipedia.org/wiki/Change_data_capture
[Debezium] https://debezium.io/
[Fallacies] https://en.wikipedia.org/wiki/Fallacies_of_distributed_computing
[GoF] Erich Gamma, Richard Helm, Ralph Johnson, and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Reading, MA: Addison-Wesley, 1995.
[Strangler] https://docs.microsoft.com/en-us/azure/architecture/patterns/strangler
[Strangler-Fig] https://martinfowler.com/bliki/StranglerFigApplication.html
[TW-GraphQL] https://www.thoughtworks.com/radar/techniques/graphql-for-server-side-resource-aggregation