mobcode

Can inversion of control go awry?

16 May, 2007

Is there any downside to using inversion of control? Absolutely

Can the testing code become a disaster? Totally

Can you create a big mess of tangled code with inversion of control? Daily

Is it harder to read code that has been “distilled” so that compile time and runtime dependencies are different? Yep

So what is the conclusion? We need to train our brains to think in new ways and build/find tools to help us write code that supports automated testing (and therefore works and can be maintained).

We need to get better at distilling out the un-testable parts. We need to learn how to test more un-testable code. We need to learn better patterns of clearly doing inversion of control. We need to train our brains to think in different ways. We need to keep groping towards a better way of building software.

Distill code to make it testable

10 May, 2007

I am adding a small module to a big application. If I add the module in the default way it won’t be easy to test. The default way is to simply slap the new code onto the existing app and stir it all together nicely. If I want to test the new code I must run the entire application. And that is hard to do in an automated way.

To make this example concrete, let’s say I need to add table support to a word processor. The default approach is to just start adding code to the word processor code until I have tables working. But when I am done, the “table module” is very tightly integrated with the word processor app.

Here is an alternative approach. Two kinds of code get mixed into the new module and must be distilled out. The first is application-specific code. Find all the parts that are specific to this use of the new module and pull them together at the top. This is the common approach of making a library. It involves a bit of abstraction: pulling out constants and other details that apply only to the application. So I make a table library and then use that table library in the application.

This part is fairly well understood, if not followed. The second aspect is much less well understood. The second kind of code that must be distilled out is system access code. System access code is any code that goes outside of memory and touches real resources. For example, reading a file, talking on the network, accessing a database, reading the system clock. This is all system access code that is harder to test than normal code.

Just as I distilled the application code out to the top of the new module, I need to distill the system access code out to the bottom of the new module. So imagine different pieces of code working together: the application code is on top (the word processor), making calls down to the new module (the table module), which in turn is calling bits of system access code that are plugged in underneath it.

The final step is to use inversion of control to allow the application code to pass the system access code into the module. From an object construction perspective this pulls the system access code up on top of the module and puts it under the control of the application. (This point is complicated unless you understand inversion of control. The module still makes calls down to the system access code, but the system access code is constructed by the application code. So there is a runtime dependency from the module to the system access code. But the system access code implements interfaces defined in the module, so the compile time dependencies run the other way: the system access code depends on the module, not vice versa.)

With inversion of control in place, I can create fake system access code that is just normal code (i.e. only uses memory, does not access other system resources). This makes it easy to test. For example, suppose the table module in the word processor needs to read a config file to know how many columns to create by default. With the file access code distilled out of the module, I can write simple automated tests that give the module different strings as “config files”.
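A minimal sketch of the config file example, under some assumptions of mine: the names `ConfigSource`, `TableModule`, `InMemoryConfigSource`, the file name `table.conf`, the `columns=N` format, and the fallback of 3 columns are all hypothetical. The interface lives with the module, and the application (or a test) passes an implementation in.

```java
import java.util.HashMap;
import java.util.Map;

// Interface defined by the table module. The application supplies the
// real (file-backed) implementation; tests supply a fake one.
interface ConfigSource {
    // Returns the raw text of the named config, or null if it is missing.
    String read(String name);
}

// The module itself: only in-memory logic, no direct file access.
class TableModule {
    static final int FALLBACK_COLUMNS = 3; // hypothetical default

    private final ConfigSource config;

    TableModule(ConfigSource config) {
        this.config = config;
    }

    // Looks for a line of the form "columns=N" in the config text.
    int defaultColumnCount() {
        String text = config.read("table.conf");
        if (text == null) {
            return FALLBACK_COLUMNS;
        }
        for (String line : text.split("\n")) {
            String[] parts = line.split("=", 2);
            if (parts.length == 2 && parts[0].trim().equals("columns")) {
                try {
                    return Integer.parseInt(parts[1].trim());
                } catch (NumberFormatException e) {
                    return FALLBACK_COLUMNS;
                }
            }
        }
        return FALLBACK_COLUMNS;
    }
}

// Fake system access code: plain strings standing in for config files.
class InMemoryConfigSource implements ConfigSource {
    private final Map<String, String> files = new HashMap<String, String>();

    InMemoryConfigSource with(String name, String contents) {
        files.put(name, contents);
        return this;
    }

    public String read(String name) {
        return files.get(name);
    }
}
```

A test can then hand the module any string it likes, for example `new TableModule(new InMemoryConfigSource().with("table.conf", "columns=5"))`, without touching the file system.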

To do this really well I want to only distill out pure application code and pure system access code. I want the distilled parts to be as small as possible. Why? Because they are going to be harder to test. I am going to put the module under extensive automated testing. So the more code that is in the module, the more code that will be tested. Which means: the more code that will work.

This approach also simplifies the task of automated testing because everything that must be faked for the automated test is gathered together at the top of the library (remember the system access code is “on top” with the application code from a compile time perspective). This means I can have a nice neat bit of code that “fakes” the system access, and then everything below that is the “real” module running.

So, to make my modules testable I need to keep them free of application code and free of system access code. I use inversion of control to allow the application code to control what system access code to use.

Requirements, people, and monsters (part 4)

2 May, 2007

The final complex system is the application we build: the system. Over time the system becomes a monster. A monster that threatens the team by causing damage, demanding attention, creating more urgent work, growing out of control, refusing to cooperate, and generally causing pain for the development team. The source code grows into a monster. The running system grows into another kind of monster.

The task is to tame the beast. The monster is supposed to serve the team and the users, not the other way around. We need to get the beast in a cage, tame him, get a bit in his mouth, and steer him where we want. We must make the monster serve people.

Once again, the agile movement shows us many of the key techniques we need. Create automated unit tests around each piece of the system. Create automated end-to-end functional tests that confirm the whole thing works as expected. These tests create a cage that constrains the monster.

Build with an eye to creating visible workings. The system cannot be a black box; it has to show its users what it is doing. The system has to provide useful logging and monitoring. Then people can reason about its behavior rather than making up superstitions to explain the rampages of the beast.

With extensive tests in place the development team has a safety net that emboldens them to keep the design from deteriorating into a big ball of mud. When a coder working on a piece of code sees how terrible it is, they can make it better and count on the tests to help keep things working. The code base can be steered in the direction of good design. The monster has a bit in his mouth.

Software development projects are dominated by these three complex systems: the requirements, the team, and the application itself. Each one of these offers endless opportunities for learning. Any one of them can run out of control and cause misery. Welcome to the joyful world of software development.

Use Java thread pool to isolate poorly behaved objects

13 April, 2007

There is an evil in software systems. Some services are not well behaved. When you call them you may block forever waiting for them to respond. These are toxic services. The really bad aspect of such services is they are infectious.

Imagine you have created a component. It is processing many simultaneous requests from callers. It has many threads running. If your component calls a toxic service in the most obvious way then it will become toxic. The way this happens is that a thread running in your code calls the toxic service. If the toxic service is misbehaving, then your thread blocks forever. A little bit later another one of your threads calls the toxic service and it too blocks forever. This will continue until all of your threads are blocked waiting for the toxic service.

If you observe the external behavior of your component at this time you will see that it is toxic just like the toxic service you are using. Sometimes callers get a normal response from your component; sometimes they block forever. You are toxic.

And it gets worse. Consider services offered by your component that don’t use the toxic service. Even these are choked out by all the busy threads blocked on the toxic service. So the failure spreads to include operations that are unrelated to the underlying source of the problem. Even if only 1 out of every 100 operations uses the toxic service, the failure will still spread and quickly block all 100 operations.

And it gets worse. Because your component continues to consume resources without bound you will bring down other components running on the same system. The problem also propagates upstream to the calling systems. If the calling components are written in a naive fashion (like your component) then they too will become toxic and continue spreading the love.

This is the way systems die.

So what is the solution? The general solution is to move from synchronous to asynchronous constructs. Invoke the toxic service through asynchronous messaging instead of blocking synchronously on a thread waiting for a response. Calls to the toxic service show up in queues. These queues can be monitored, bounded, and managed.
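As a rough illustration of what "bounded and monitored" could mean, here is a sketch of a queue of pending calls. The class name `ToxicCallQueue` and its methods are hypothetical; the only library piece it relies on is `ArrayBlockingQueue` from `java.util.concurrent`. Callers hand off a call and return immediately, and a separate worker thread would drain the queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical holder for calls waiting to be sent to the toxic service.
class ToxicCallQueue {
    private final BlockingQueue<Runnable> pending;

    ToxicCallQueue(int capacity) {
        // The capacity is the bound: at most this many calls can pile up.
        pending = new ArrayBlockingQueue<Runnable>(capacity);
    }

    // Never blocks. Returns false when the queue is full, so the caller
    // can report the problem instead of waiting.
    boolean submit(Runnable call) {
        return pending.offer(call);
    }

    // Monitoring hook: how many calls are currently waiting.
    int backlog() {
        return pending.size();
    }

    // Used by a worker thread; blocks until a call is available.
    Runnable next() throws InterruptedException {
        return pending.take();
    }
}
```

The key design choice is `offer` rather than `put`: a full queue turns into an immediate "no" for the caller instead of another blocked thread.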

The problem with the asynchronous messaging solution is that it requires a fairly dramatic change to the design of your code.

Here is a simplified implementation of the asynchronous idea that does not disrupt the design of your code. Allocate a pool of threads for dealing with the toxic resource. Give this pool a safe upper limit. You will never sacrifice more than N threads to the toxic service. This is the foundation of the approach. It provides a safety mechanism to keep the toxicity from spreading without limit.

When your code needs to call the toxic service, instead of calling it directly, it asks one of the brave volunteers from the thread pool to call the toxic service. This is a sacrificial thread that may never return… Set a timeout and wait for a little while to see how the thread fares. If it returns, great! The toxic service is working. If it doesn’t return then give up and tell the caller about the problem.

This leaves the sacrificial thread in the thread pool “hung” waiting for the toxic service. But, since we have an upper bound on the pool size, once we reach the limit we will stop calling the toxic service. If we stop calling it, then it cannot claim any more threads.

This approach leaves your component free to continue servicing requests that don’t require the toxic service. This approach does not consume unlimited system resources so other components on the same computer continue to operate. This approach doesn’t block calling code indefinitely. So the toxic service has been contained!

Now for some code. The Java libraries provide the code needed to implement the strategy. I ran this code on Java 1.6.

Here is an example of a toxic object. Notice how it randomly blocks forever.

    import java.util.Random;

    // A poorly behaved class.
    public class ToxicService implements Service {
        private static final int FOREVER = 10000;

        private final Random random = new Random();

        public void go() throws ServiceException, InterruptedException {
            if (oneOutOf3()) {
                // Sometimes it blocks.
                blockForever();
            } else if (oneOutOf3()) {
                // Sometimes it fails.
                throw new ServiceException();
            }
            // Sometimes it works!
        }

        void blockForever() throws InterruptedException {
            // This is what makes this object toxic. Sometimes it blocks
            // forever when called. (Ok… not really forever in this
            // example, but long enough to see the problem.)
            try {
                Thread.sleep(FOREVER);
            } catch (InterruptedException e) {
                // Bad code - ignore interruption, just keep running
                Thread.sleep(FOREVER);
            }
        }

        boolean oneOutOf3() {
            // nextInt(3) yields 0, 1, or 2, so this is true one time in three.
            return random.nextInt(3) == 0;
        }
    }

    public interface Service {
        public void go() throws ServiceException, InterruptedException;
    }

    // Exception thrown by Service.
    public class ServiceException extends Exception {
        private static final long serialVersionUID = 1L;
    }

The following class shows how this ToxicService can be wrapped in a way that will protect the calling code from the bad behavior. This class uses the thread pool tools built into Java to isolate the calls to the toxic object in separate threads. A timeout is used to give up on these calls if they don’t return quickly.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // A class that "contains" the badness of a ToxicService.
    public class ContainedService implements Service {
        // Toxic service that is being contained.
        final Service service = new ToxicService();

        // Thread pool for running toxic calls.
        ThreadPoolExecutor executor =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(5);

        public void go() throws ServiceException, InterruptedException {
            List<Callable<Object>> toRun = new ArrayList<Callable<Object>>();
            toRun.add(new Callable<Object>() {
                public Object call() throws Exception {
                    // Call the service.
                    service.go();
                    return null;
                }
            });
            List<Future<Object>> futures =
                    executor.invokeAll(toRun, 1000, TimeUnit.MILLISECONDS);
            try {
                // Find out what happened when the service was called.
                // If the call timed out it was cancelled, and get()
                // throws CancellationException.
                futures.get(0).get();
            } catch (ExecutionException e) {
                // Propagate the exception that is part of the interface.
                if (ServiceException.class.isAssignableFrom(
                        e.getCause().getClass())) {
                    throw (ServiceException) e.getCause();
                }
                throw new RuntimeException(e);
            }
        }

        public void shutdown() {
            // Shutdown the thread pool.
            executor.shutdown();
        }
    }

The ContainedService class can be used like this:

    import java.util.concurrent.CancellationException;

    public class ContainedMain {
        public static void main(String[] args) throws InterruptedException {
            ContainedService service = new ContainedService();
            for (int i = 0; i < 10; i++) {
                try {
                    service.go();
                    System.out.println("success");
                } catch (CancellationException e) {
                    System.out.println("timeout");
                } catch (ServiceException e) {
                    System.out.println("failed");
                }
            }
            service.shutdown();
        }
    }

This code works, but it is hand-crafted for the specific object being contained. A better solution would allow us to capture the timeout behavior in a generic form to be applied to arbitrary objects. We can create an InvocationHandler that will generically wrap each call with the timeout behavior.

Notice how the invoke(…) method handles all calls generically and the exception handling deals with all exceptions generically.

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.InvocationTargetException;
    import java.lang.reflect.Method;
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutionException;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // A generic class for timing out operations on poorly behaved objects.
    public class TimeoutHandler implements InvocationHandler {
        // Toxic object. Calls to this object will timeout.
        private final Object target;

        // Thread pool used to call toxic object.
        ThreadPoolExecutor executor =
                (ThreadPoolExecutor) Executors.newFixedThreadPool(5);

        public TimeoutHandler(Object target) {
            this.target = target;
        }

        public Object invoke(Object proxy, final Method method,
                final Object[] args) throws Throwable {
            List<Callable<Object>> toRun = new ArrayList<Callable<Object>>();
            toRun.add(new Callable<Object>() {
                public Object call() throws Exception {
                    // Call the toxic method.
                    return method.invoke(target, args);
                }
            });
            List<Future<Object>> futures =
                    executor.invokeAll(toRun, 1000, TimeUnit.MILLISECONDS);
            try {
                // Discover result of toxic call.
                return futures.get(0).get();
            } catch (ExecutionException e) {
                // Unwrap interface specific exceptions.
                if (InvocationTargetException.class.isAssignableFrom(
                        e.getCause().getClass())) {
                    throw ((InvocationTargetException) e.getCause()).getCause();
                }
                throw e.getCause();
            }
        }

        public void shutdown() {
            // Clean up thread pool.
            executor.shutdown();
        }
    }

This InvocationHandler can be used with the Java Proxy mechanism like this:

    import java.lang.reflect.Proxy;
    import java.util.concurrent.CancellationException;

    public class ProxyMain {
        public static void main(String[] args) throws InterruptedException {
            TimeoutHandler timeoutHandler =
                    new TimeoutHandler(new ToxicService());
            Service service = (Service) Proxy.newProxyInstance(
                    Thread.currentThread().getContextClassLoader(),
                    new Class[] { Service.class },
                    timeoutHandler);
            for (int i = 0; i < 10; i++) {
                try {
                    service.go();
                    System.out.println("success");
                } catch (CancellationException e) {
                    System.out.println("timeout");
                } catch (ServiceException e) {
                    System.out.println("failed");
                }
            }
            timeoutHandler.shutdown();
        }
    }

Note: This example code uses threads to solve the problem without ever using synchronized, wait(), or notify(). This is important! It means the code has a decent chance of working. The java.util.concurrent package deals with the low-level tricky threading issues for you.

Don’t make up your own language

6 November, 2006

So you are working on an app and you find that you want to add some hooks to let users customize some aspect of the app. Say… you are building a task management application and you want to allow users to plug in an algorithm for scheduling the order in which tasks are worked. You might start out small and give users a custom field to add an order to each task. Then you realize they need more power so you let the users enter a simple expression that evaluates some of the fields on each task to determine its order. Then you go a step further and add the ability to call a sub-routine. Then you add variables. You are sliding down a slippery slope of creating an ad-hoc custom language.

That is a bad thing.

Instead of doing that just adopt some scripting language and add a hook to call a script. The default choice for a scripting language must be JavaScript. For a list of some other choices check out the Extension/embeddable languages section here http://en.wikipedia.org/wiki/Scripting_language.

Let data flow through to users

1 November, 2006

Applications often need to be treated as pipes that carry data. In such cases the pipes should be smooth and should not consume data. Rather the data should flow out the other end.

I have seen this pattern in a variety of applications including billing systems and email handling systems.

Consider an application for routing email. The basic idea is that every incoming message needs to flow through the filter and end up coming out the other end.

Suppose there are fixed filtering rules that you are implementing. One rule might be that if the email is from domain “casino.com” then route the message to the folder “gambling”. That rule is implemented and the messages flow through. If a message arrives from casino.com then it ends up in the gambling folder. So this meets the objective of allowing the data to flow.

Now, consider another rule. The rule is that if the subject line contains the word “urgent” then the message should be handled specially. The users are unable to tell you how to handle it. The requirements are pending. But the users assure you that there are special rules for such messages and you will need to implement them. So what do you do? The right answer is to add a rule that detects the word “urgent” and puts those messages into the folder called “urgent”. This seems obvious. How could it be done wrong?

The wrong answer is to implement code that detects the word “urgent” and then places the messages in some internal holding area that the users do not have access to. This is violating the objective of being a smooth pipe. Data is sent in, but it never comes out. This leaves users peering into the pipe looking for their data, shaking the pipe, cursing the pipe. It leaves users unable to use the app because data comes in and disappears.

And users are the key to this issue. Users are people, not computers. That means users can adapt and do the right thing even if the system is only partially implemented. So even if you don’t know the complete requirements, make sure that the data flows out the other end. This way users can carry out the missing requirements by hand. If you don’t do this then you will prevent users from being able to do their job simply because you did not have the needed requirements to build the system. Now it may not be your problem that the requirements are not available, but if you build a system that eats the data then you become the problem.

Now consider this advice in light of an agile, iterative development process. Never build a version of the code that hides data away in some corner. Every step of the way make a system that lets the data flow. The first step in building this email router is to make a no-op router that merely forwards every message on to a single destination. With that start, never build a system that eats messages.
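A rough sketch of a router that keeps to this rule. The types `Message` and `MailRouter` and the default folder name `inbox` are hypothetical; only the "casino.com" and "urgent" rules come from the example above. The point to notice is that `route` always delivers the message to some folder the users can see, whichever rule matches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical message type: only the fields the rules need.
class Message {
    final String from;
    final String subject;

    Message(String from, String subject) {
        this.from = from;
        this.subject = subject;
    }
}

class MailRouter {
    static final String DEFAULT_FOLDER = "inbox"; // assumed name

    private final Map<String, List<Message>> folders =
            new LinkedHashMap<String, List<Message>>();

    // Every message ends up in a user-visible folder. Returns its name.
    String route(Message message) {
        String folder = chooseFolder(message);
        List<Message> contents = folders.get(folder);
        if (contents == null) {
            contents = new ArrayList<Message>();
            folders.put(folder, contents);
        }
        contents.add(message);
        return folder;
    }

    private String chooseFolder(Message message) {
        if (message.from.toLowerCase(Locale.ROOT).endsWith("@casino.com")) {
            return "gambling";
        }
        // Requirements for urgent mail are still pending, so just make
        // the messages visible in their own folder.
        if (message.subject.toLowerCase(Locale.ROOT).contains("urgent")) {
            return "urgent";
        }
        // The no-op starting point: forward everything else unchanged.
        return DEFAULT_FOLDER;
    }

    // Total number of messages sitting in folders.
    int delivered() {
        int count = 0;
        for (List<Message> contents : folders.values()) {
            count += contents.size();
        }
        return count;
    }
}
```

With no rules at all, `chooseFolder` would simply return `DEFAULT_FOLDER`, which is the no-op router described above; each new rule only changes where a message comes out, never whether it comes out.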

Respect the web

26 September, 2006

As an enterprise architect I spent years holding web technologies in disdain. Sure, you could make some neat little pages but it was certainly no way to build real apps. Over the last few years I have seen the light and come to appreciate the wonderful, messy, hacked-together, really useful real world of the web.

So here is my guide:

  1. HTML - It is not just an inconsequential view that may be used to render your data. It is a really useful way of marking up data, not just for browsers to view but for code to parse. Forget custom XML schemas. Use HTML. If your data is in HTML then your code can parse it easily, your browser can render it easily, and you can use CSS to format it nicely.
  2. HTTP - Don’t dismiss it as some low-level communication layer that you will use to tunnel your application protocol. Embrace HTTP as your application protocol. Read the spec. See what it offers and use it. Even if you don’t have a web client use HTTP to communicate with your server. Don’t bury HTTP at the bottom of the protocol stack, but embrace it and build an app to use its features.
  3. JavaScript - It is not just a toy scripting language for creating pop-ups or doing validation. It is a powerful dynamic language in its own right.
  4. REST - Learn about the principles that have made the web the success that it is. Embrace these principles in your own apps.

Even if you are building “enterprise” apps use web technologies. Even enterprise applications need to inter-operate and that is what the web does. As Mark Baker says, the web is the distributed object system you have been looking for.

Roll your own

4 June, 2006

You hear the buzz about AOP, or IOC, or ORM. Instead of rushing out and adopting some third party libraries and application frameworks just write the code you need.

Maybe you don’t need an aspect-oriented programming language extension that will auto-magically generate byte-code on the fly. Maybe you simply need to intercept method calls and apply some generic code.
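For instance, in Java the standard `java.lang.reflect.Proxy` class can intercept calls on any interface. The sketch below is one possible shape, not a recommendation of a particular design; `Greeter`, `PlainGreeter`, and `CallRecorder` are made-up names, and the "generic code" here just records method names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface and implementation to be intercepted.
interface Greeter {
    String greet(String name);
}

class PlainGreeter implements Greeter {
    public String greet(String name) {
        return "Hello, " + name;
    }
}

// Generic interceptor: records each method name, then delegates.
class CallRecorder implements InvocationHandler {
    private final Object target;
    final List<String> calls = new ArrayList<String>();

    CallRecorder(Object target) {
        this.target = target;
    }

    public Object invoke(Object proxy, Method method, Object[] args)
            throws Throwable {
        calls.add(method.getName());
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            // Rethrow the target's own exception, not the reflection wrapper.
            throw e.getCause();
        }
    }

    // Builds a proxy that implements iface and routes calls through recorder.
    static <T> T wrap(Class<T> iface, CallRecorder recorder) {
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(), new Class<?>[] { iface }, recorder);
        return iface.cast(proxy);
    }
}
```

Usage is a couple of lines: create `new CallRecorder(new PlainGreeter())`, pass it to `CallRecorder.wrap(Greeter.class, recorder)`, and call the returned `Greeter` as usual.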

Maybe you don’t need an inversion-of-control container library that will manage all of your component dependencies. Maybe you simply need to write some wiring code at the top of your application to assemble your components.

Maybe you don’t need an object-to-relational mapping engine. Maybe you simply need some generic code to read a result set into a simple structure.

All of these things can be done in simple and straightforward ways, even using such primitive platforms as Java and .NET. If you are a good developer you can write these things yourself. You can write the needed code and be done with it. You can do it without being swallowed up in a black-hole of framework development. You aren’t writing a framework; you are writing some code you need.

So you say you’re an architect

3 June, 2006

If you say on your resume that you are an “architect” then you need to give some thought to what that means. In a job interview you might be asked:

  • What exactly do you do as an architect?
  • What are some architectural principles you use?
  • What are some big mistakes you have committed…err.. seen committed?

And while you are thinking about it, also spend some time thinking about which kind of architect you are:

  • The non-coding, sit on committees, mandate corporate policies kind of architect.
  • The space faring architecture astronaut type who simultaneously over-complicates and over-simplifies everything.
  • An honest-to-goodness, hard-working, real-life architect who works in the trenches, writes code, delivers features, and helps make the key decisions and provide the technical leadership to deliver working systems.