Distill code to make it testable
I am adding a small module to a big application. If I add the module in the default way it won’t be easy to test. The default way is to simply slap the new code onto the existing app and stir it all together nicely. If I want to test the new code I must run the entire application. And that is hard to do in an automated way.
To make this example concrete, let’s say I need to add table support to a word processor. The default approach is to just start adding code to the word processor code until I have tables working. But when I am done, the “table module” is very tightly integrated with the word processor app.
Here is an alternative approach: There are two kinds of code mixed into the new module that must be distilled out. The first is application-specific code. Find all the parts that are specific to this use of the new module and pull them together at the top. This is the common approach of making a library. It involves a bit of abstraction: pulling out constants and other details that apply only to the application. So I make a table library and then use that table library in the application.
This part is fairly well understood, if not followed. The second aspect is much less well understood. The second kind of code that must be distilled out is system access code. System access code is any code that goes outside of memory and touches real resources. For example, reading a file, talking on the network, accessing a database, reading the system clock. This is all system access code that is harder to test than normal code.
Just as I distilled the application code out to the top of the new module, I need to distill the system access code out to the bottom of the new module. So imagine different pieces of code working together: the application code is on top (the word processor), making calls down to the new module (the table module), which in turn is calling bits of system access code that are plugged in underneath it.
The final step is to use inversion of control, to allow the application code to pass the system access code into the module. From an object construction perspective this pulls the system access code up on top of the module and puts it under the control of the application. (This point is complicated unless you understand inversion of control. The module still makes calls down to the system access code, but the system access code is constructed by the application code. So there is a runtime dependency from the module to the system access code. But, the system access code implements interfaces defined in the module, so the compile time dependencies are such that the system access code depends on the module, not vice-versa).
With inversion of control in place, I can create fake system access code that is just normal code (i.e. only uses memory, does not access other system resources). This makes it easy to test. For example, suppose the table module in the word processor needs to read a config file to know how many columns to create by default. With the file access code distilled out of the module, I can write simple automated tests that give the module different strings as “config files”.
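The config file example can be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the names ConfigSource, TableModule, FileConfigSource, and StringConfigSource, the file name table.conf, the "columns = N" format, and the fallback of 2 columns are all invented here, and the file-reading code assumes Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Interface defined by the module (hypothetical name). System access
// code implements it, so the compile time dependency points at the module.
interface ConfigSource {
    String read(String name);
}

// The module: normal in-memory code only, no file access of its own.
class TableModule {
    private final ConfigSource config;

    // The application passes the system access code in (inversion of control).
    TableModule(ConfigSource config) {
        this.config = config;
    }

    int defaultColumns() {
        String text = config.read("table.conf"); // hypothetical file name
        for (String line : text.split("\n")) {
            String[] parts = line.split("=", 2);
            if (parts.length == 2 && parts[0].trim().equals("columns")) {
                return Integer.parseInt(parts[1].trim());
            }
        }
        return 2; // hypothetical fallback when no value is configured
    }
}

// Real system access code: touches the file system, so it is kept tiny.
class FileConfigSource implements ConfigSource {
    public String read(String name) {
        try {
            return new String(Files.readAllBytes(Paths.get(name)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Fake system access code for tests: a string stands in for the config file.
class StringConfigSource implements ConfigSource {
    private final String text;

    StringConfigSource(String text) {
        this.text = text;
    }

    public String read(String name) {
        return text;
    }
}

class TableModuleDemo {
    public static void main(String[] args) {
        TableModule module = new TableModule(new StringConfigSource("columns = 4\n"));
        System.out.println(module.defaultColumns()); // prints 4
    }
}
```

The application would construct the module with new FileConfigSource(), while a test hands it a StringConfigSource. TableModule cannot tell the difference.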
To do this really well I want to only distill out pure application code and pure system access code. I want the distilled parts to be as small as possible. Why? Because they are going to be harder to test. I am going to put the module under extensive automated testing. So the more code that is in the module, the more code that will be tested. Which means: the more code that will work.
This approach also simplifies the task of automated testing because everything that must be faked for the automated test is gathered together at the top of the library (remember the system access code is “on top” with the application code from a compile time perspective). This means I can have a nice neat bit of code that “fakes” the system access, and then everything below that is the “real” module running.
So, to make my modules testable I need to keep them free of application code and free of system access code. I use inversion of control to allow the application code to control what system access code to use.
10 ways to spot a good coder
Some coders are really good. How can you identify them?
You gotta know how to read a stack trace. I know this is a pathetically low bar, but I have seen countless developers just stare in wonder at a stack trace. So even this low bar will eliminate a bunch of people.
Know your tools. Know your IDE. Know your editor. Know your operating system. Select the right “power” tools and make them serve you. (An interesting corollary is that if you code on Windows then you have to use Cygwin to be a good developer.)
Know your language. If you have things to say in code, then you need to know how to talk. Learn every part of your language syntax. Get a broad understanding of the libraries your language offers.
You have to be able to download a 3rd party package, get the source code compiling, make some changes to the source, and get the hacked library working. Maybe you even have to use a decompiler to get the source.
Know how to use a debugger.
Know how to use a profiler. At some point the code is too slow and you need to know why. A good developer can bust out a profiler and get an answer.
Read a spec. Sure every developer will read the spec before coding. Good developers read the spec when they are done coding… then they code up all the stuff they missed.
Read code. Writing code is more fun, but a good developer reads someone else’s code, understands it, and makes sensible changes to it.
Work from the command line. Yeah GUIs are great, but you are doing way too much manual work if you don’t drop down to the command line and script out what you need.
Create the build script. It’s easy to find a good developer on a project. Find the guy who made the build script. He is the one who actually knows how the stuff works.
So what is the common theme here? The bad developers are faced with something new and they stare in wonder at all the meaningless symbols in front of them. Whether it be a stack trace or someone else’s code, or the confusion of buttons in our tools, or the build script. The good coders are filled with the same wonder, for a moment, but they apply their brain to gain understanding. They dig in, read, understand, learn. Programming is a knowledge business if there ever was one.
Requirements, people, and monsters (part 4)
The final complex system is the application we build: the system. Over time the system becomes a monster. A monster that threatens the team by causing damage, demanding attention, creating more urgent work, growing out of control, refusing to cooperate, and generally causing pain for the development team. The source code grows into a monster. The running system grows into another kind of monster.
The task is to tame the beast. The monster is supposed to serve the team and the users, not the other way around. We need to get the beast in a cage, tame him, get a bit in his mouth, and steer him where we want. We must make the monster serve people.
Once again, the agile movement shows us many of the key techniques we need. Create automated unit tests around each piece of the system. Create automated end-to-end functional tests that confirm the whole thing works as expected. These tests create a cage that constrains the monster.
Build with an eye to creating visible workings. The system cannot be a black box, it has to show its users what it is doing. The system has to provide useful logging and monitoring. Now people can reason about its behavior rather than making up superstitions to explain the rampages of the beast.
With extensive tests in place the development team has a safety net that emboldens them to keep the design from deteriorating into a big ball of mud. When a coder is working on a piece of code and sees how terrible it is, they can make it better and count on the tests to help keep things working. The code base can be steered in the direction of good design. The monster has a bit in his mouth.
Software development projects are dominated by these three complex systems: the requirements, the team, and the application itself. Each one of these offers endless opportunities for learning. Any one of them can run out of control and cause misery. Welcome to the joyful world of software development.
Use Java thread pool to isolate poorly behaved objects
There is an evil in software systems. Some services are not well behaved. When you call them you may block forever waiting for them to respond. These are toxic services. The really bad aspect of such services is they are infectious.
Imagine you have created a component. It is processing many simultaneous requests from callers. It has many threads running. If your component calls a toxic service in the most obvious way then it will become toxic. The way this happens is that a thread running in your code calls the toxic service. If the toxic service is misbehaving, then your thread blocks forever. A little bit later another one of your threads calls the toxic service and it too blocks forever. This will continue until all of your threads are blocked waiting for the toxic service.
If you observe the external behavior of your component at this time you will see that it is toxic just like the toxic service you are using. Sometimes callers get a normal response from your component; sometimes they block forever. You are toxic.
And it gets worse. Consider services offered by your component that don’t use the toxic service. Even these are choked out by all the busy threads blocked on the toxic service. So the failure spreads to include operations that are unrelated to the underlying source of the problem. Even if only 1 out of every 100 operations uses the toxic service, the failure will still spread and quickly block all 100 operations.
And it gets worse. Because your component continues to consume resources without bound you will bring down other components running on the same system. The problem also propagates upstream to the calling systems. If the calling components are written in a naive fashion (like your component) then they too will become toxic and continue spreading the love.
This is the way systems die.
So what is the solution? The general solution is to move from synchronous to asynchronous constructs. Invoke the toxic service through asynchronous messaging instead of blocking synchronously on a thread waiting for a response. Calls to the toxic service show up in queues. These queues can be monitored, bounded, and managed.
The problem with the asynchronous messaging solution is that it requires a fairly dramatic change to the design of your code.
Here is a simplified implementation of the asynchronous idea that does not disrupt the design of your code. Allocate a pool of threads for dealing with the toxic resource. Give this pool a safe upper limit. You will never sacrifice more than N threads to the toxic service. This is the foundation of the approach. It provides a safety mechanism to keep the toxicity from spreading without limit.
When your code needs to call the toxic service, instead of calling it directly, it asks one of the brave volunteers from the thread pool to call the toxic service. This is a sacrificial thread that may never return… Set a timeout and wait for a little while to see how the thread fares. If it returns, great! The toxic service is working. If it doesn’t return then give up and tell the caller about the problem.
This leaves the sacrificial thread in the thread pool “hung” waiting for the toxic service. But, since we have an upper bound on the pool size, once we reach the limit we will stop calling the toxic service. If we stop calling it, then it cannot claim any more threads.
This approach leaves your component free to continue servicing requests that don’t require the toxic service. This approach does not consume unlimited system resources so other components on the same computer continue to operate. This approach doesn’t block calling code indefinitely. So the toxic service has been contained!
Now for some code. The Java libraries provide the code needed to implement the strategy. I ran this code on Java 1.6.
Here is an example of a toxic object. Notice how it randomly blocks forever.
import java.util.Random;

// A poorly behaved class.
public class ToxicService implements Service {
    private static final int FOREVER = 10000;

    private final Random random = new Random();

    public void go() throws ServiceException, InterruptedException {
        if (oneOutOf3()) {
            // Sometimes it blocks.
            blockForever();
        } else if (oneOutOf3()) {
            // Sometimes it fails.
            throw new ServiceException();
        }
        // Sometimes it works!
    }

    void blockForever() throws InterruptedException {
        // This is what makes this object toxic. Sometimes it blocks
        // forever when called. (Ok… not really forever in this
        // example, but long enough to see the problem.)
        try {
            Thread.sleep(FOREVER);
        } catch (InterruptedException e) {
            // Bad code - ignore interruption, just keep running
            Thread.sleep(FOREVER);
        }
    }

    boolean oneOutOf3() {
        // nextInt(3) returns 0, 1, or 2, so this is true one time in three.
        return random.nextInt(3) == 0;
    }
}

public interface Service {
    public void go() throws ServiceException, InterruptedException;
}

// Exception thrown by Service.
public class ServiceException extends Exception {
    private static final long serialVersionUID = 1L;
}
The following class shows how this ToxicService can be wrapped in a way that will protect the calling code from the bad behavior. This class uses the thread pool tools built into Java to isolate the calls to the toxic object in separate threads. A timeout is used to give up on these calls if they don’t return quickly.
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A class that "contains" the badness of a ToxicService.
public class ContainedService implements Service {
    // Toxic service that is being contained.
    final Service service = new ToxicService();

    // Thread pool for running toxic calls.
    ThreadPoolExecutor executor =
            (ThreadPoolExecutor) Executors.newFixedThreadPool(5);

    public void go() throws ServiceException, InterruptedException {
        List<Callable<Object>> toRun = new ArrayList<Callable<Object>>();
        toRun.add(new Callable<Object>() {
            public Object call() throws Exception {
                // Call the service.
                service.go();
                return null;
            }
        });
        List<Future<Object>> futures =
                executor.invokeAll(toRun, 1000, TimeUnit.MILLISECONDS);
        try {
            // Find out what happened when the service was called.
            futures.get(0).get();
        } catch (ExecutionException e) {
            // Propagate the exception that is part of the interface.
            if (e.getCause() instanceof ServiceException) {
                throw (ServiceException) e.getCause();
            }
            throw new RuntimeException(e);
        }
    }

    public void shutdown() {
        // Shutdown the thread pool.
        executor.shutdown();
    }
}
The ContainedService class can be used like this:
import java.util.concurrent.CancellationException;

public class ContainedMain {
    public static void main(String[] args) throws InterruptedException {
        ContainedService service = new ContainedService();
        for (int i = 0; i < 10; i++) {
            try {
                service.go();
                System.out.println("success");
            } catch (CancellationException e) {
                System.out.println("timeout");
            } catch (ServiceException e) {
                System.out.println("failed");
            }
        }
        service.shutdown();
    }
}
This code works, but it is hand-crafted for the specific object being contained. A better solution would allow us to capture the timeout behavior in a generic form to be applied to arbitrary objects. We can create an InvocationHandler that will generically wrap each call with the timeout behavior.
Notice how the invoke(…) method handles all calls generically and the exception handling deals with all exceptions generically.
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// A generic class for timing out operations on poorly
// behaved objects.
public class TimeoutHandler implements InvocationHandler {
    // Toxic object. Calls to this object will timeout.
    private final Object target;

    // Thread pool used to call toxic object.
    ThreadPoolExecutor executor =
            (ThreadPoolExecutor) Executors.newFixedThreadPool(5);

    public TimeoutHandler(Object target) {
        this.target = target;
    }

    public Object invoke(Object proxy, final Method method,
            final Object[] args) throws Throwable {
        List<Callable<Object>> toRun = new ArrayList<Callable<Object>>();
        toRun.add(new Callable<Object>() {
            public Object call() throws Exception {
                // Call the toxic method.
                return method.invoke(target, args);
            }
        });
        List<Future<Object>> futures =
                executor.invokeAll(toRun, 1000, TimeUnit.MILLISECONDS);
        try {
            // Discover result of toxic call.
            return futures.get(0).get();
        } catch (ExecutionException e) {
            // Unwrap interface specific exceptions.
            if (e.getCause() instanceof InvocationTargetException) {
                throw ((InvocationTargetException) e.getCause()).getCause();
            }
            throw e.getCause();
        }
    }

    public void shutdown() {
        // Clean up thread pool.
        executor.shutdown();
    }
}
This InvocationHandler can be used with the Java Proxy mechanism like this:
import java.lang.reflect.Proxy;
import java.util.concurrent.CancellationException;

public class ProxyMain {
    public static void main(String[] args) throws InterruptedException {
        TimeoutHandler timeoutHandler =
                new TimeoutHandler(new ToxicService());
        Service service = (Service) Proxy.newProxyInstance(
                Thread.currentThread().getContextClassLoader(),
                new Class[] { Service.class },
                timeoutHandler);
        for (int i = 0; i < 10; i++) {
            try {
                service.go();
                System.out.println("success");
            } catch (CancellationException e) {
                System.out.println("timeout");
            } catch (ServiceException e) {
                System.out.println("failed");
            }
        }
        timeoutHandler.shutdown();
    }
}
Note: This example code uses threads to solve the problem without ever using synchronized, wait(), or notify(). This is important! It means the code has a decent chance of working. The java.util.concurrent package deals with the low-level tricky threading issues for you.
Requirements, people, and monsters
Software development projects are dominated by three complex systems. First there are the requirements. Not just a few requirements but a massive mountain of requirements:

And it is a dangerous mountain that threatens to come tumbling down as an avalanche:
Crushing the development team. The second system: the team. A group of people. And that is their distinguishing characteristic. They are people. Weird people, creative people, confused people, confusing people, thinking people. That is the source of both their great annoyance and their charm… they are people:
But the people are not alone, they have built an application. They have created a monster:

A monster that requires constant care and feeding. A monster that occasionally goes on rampages threatening life and limb and demanding immediate attention.
That is the dangerous world we live in. A bunch of people under threat of being crushed in an avalanche of requirements and living with a monster. Sounds like fun, eh?
Invest $20 in a RentACoder experiment
If you are a developer you need to experience what it is like to use RentACoder to buy code.
Here is what you do:
Think of a little piece of code you want written. Think of something you would pay $20 for.
Write a one page spec for what you want.
Create an account on RentACoder. This is a bit painful, particularly the part where they validate your financial credentials. But hey, no pain, no gain.
Post the spec on RentACoder, watch the bids come in, pick a likely looking developer, and have them code it for you.
Here is why:
Learn what developers around the world will code for $20.
See what kind of code your competition produces.
See what it is like working with a developer. What does the developer do to help make the effort a success or a failure?
Experience buying software development services. Most coders spend their lives coding, but never get a chance to buy coding services. Experiencing the buying side will help you know how to relate to your buyers better.
If you are serious about your craft of programming, it is well worth your time and money to do a RentACoder experiment. Think of it as a bit of market research. It is guaranteed to open your eyes in some ways.
Give it a try and post a comment back here to let me know what you learn!
Exploring ruby class and metaclass relationships
The Ruby class structure is difficult to understand without drawing pictures. Three class relationships are important:
- class
- superclass
- metaclass
In order to sort it out I consider three categories of objects:
- plain objects
- classes
- metaclasses
Plain objects are just plain old instances of objects. So if I have a Dog class and this code:
d = Dog.new
Then d is a plain object.
In this example Dog is a class. In Ruby, classes can usually be considered as just special types of objects. In the following code, Dog is a class and Animal is a class.
class Animal
end
class Dog < Animal
end
Other relevant classes include Class itself and Object.
That leaves metaclasses. Here is a good explanation of metaclasses (after reading his stuff I feel like I should write in the wonderful gibberish that he uses… alas, I don’t have that talent so I have to settle for plain old boring English). The basic story of metaclasses goes something like this: Plain objects hold state, not behavior. An object’s methods are defined in its class. The behavior of all Dog instances is defined in the Dog class. A metaclass is a special class that allows behavior to be defined at the instance level. Adding a method to the metaclass of object d, creates a method that will only apply to object d.
Similarly, a class can have a metaclass. This is analogous to the situation I just described. Methods on the metaclass only apply to a single instance of an object, in this case the object happens to be a class. Methods on a class metaclass end up looking a lot like class methods from Java.
The bottom line is that every object in Ruby can have a metaclass. This includes plain old objects, classes, and metaclasses. They can all have metaclasses that define methods just for them.
I keep saying they can have a metaclass. The first time a metaclass is referenced it will be created. Until then it does not exist.
I use this bit of code (also from why) to access an object’s metaclass:
class Object
  def metaclass
    class << self
      self
    end
  end
end
The syntax, class <<, is used to access an object’s metaclass. This small bit of code adds a method called metaclass to the base class, Object. This new method uses the class << construct to get access to the current object’s metaclass and then returns that metaclass. Within the context of the class << construct, self refers to the metaclass.
Finally, as I said, a metaclass itself can also have a metaclass. So you can keep adding metaclasses to your heart’s content.
Those are the three types of objects; now the three types of relationships.
The class of an object tells you what type of object it is. The class of plain objects is simple, it is their class. The class of a class is easy, it is always Class. This includes special classes such as Object. The class of Object is also Class. Similarly the class of all metaclasses is also Class.

The superclass relationship is mostly straightforward too. Plain objects don’t have a superclass. The superclass of classes is what you would expect. It follows the normal object-oriented inheritance structure. In Ruby, Class has a superclass of Module which has a superclass of Object. All classes descend from Object.

The superclass of metaclasses is a bit complicated. Based on experiments that I conducted with Ruby 1.8.5 I found that the metaclass inheritance structure changes in surprising ways. In the simple case, a plain old object’s metaclass starts out with a superclass of the object’s class’ metaclass.
This also applies to a class metaclass which means by default a class metaclass has a superclass which is the Class metaclass.

However, when a metaclass is added to the object’s metaclass, the superclass of the object’s metaclass changes to point to the next metaclass out. With the outermost metaclass pointing to itself as superclass.

The final weirdness is that the metaclasses of Object and Module have a superclass of Class. This explains how metaclasses end up in the Class hierarchy. And this too changes once a metaclass is added to the metaclass of Object or Module.
Charting out the inheritance structure of metaclasses is complicated by the fact that metaclasses are created on demand and when they are created they change the metaclass inheritance structure. So simply retrieving a metaclass to see what its superclass is causes the inheritance structure to change. Fun.
NOTE: These results are different than what the docs say.
If superclass is complicated, the metaclass relationship is quite simple. Every object’s metaclass is its own unique metaclass. This applies to plain objects, classes, and metaclasses.

Now all of these diagrams can be combined to get the whole picture of how plain objects, classes, and metaclasses relate to each other.

These are the conventions in the diagrams:
- plain objects are ovals
- classes are white squares
- metaclasses are blue squares
- class relationships are shown as lines pointing to the north-west
- superclass relationships are lines pointing up
- metaclass relationships are lines pointing to the left
More complete code to explore these relationships:
class Animal
end
class Dog < Animal
end
class Object
  def metaclass
    class << self
      self
    end
  end
end
d = Dog.new
def dump(d, code)
  puts "#{code} = #{eval code}"
end
dump d, "d.class"
dump d, "d.metaclass"
puts
dump d, "d.metaclass.class"
dump d, "d.metaclass.superclass"
dump d, "d.metaclass.metaclass"
puts
dump d, "d.class.class"
dump d, "d.class.superclass"
dump d, "d.class.metaclass"
puts
dump nil, "Class.class"
dump nil, "Class.superclass"
dump nil, "Class.metaclass"
Generate a real parser for your custom language
Many years ago I found myself in the position of writing telephony applications for the phone company. We were using an IBM system that expected you to use a graphical programming tool to create a state machine describing the program. The state machine had primitives like play a message to the caller, collect digits from the caller, etc. Writing programs as state machines in their dismal environment was brutally painful. You ended up doing ridiculous things like building while loops out of primitives that checked a condition, took an action, and then looped back to the top. All of this by adding blocks and drawing lines between them… ugh!
So I longed for a simple C style syntax to write my code. Not knowing any better I got a copy of the Dragon Book from the library, downloaded lex and yacc and created a prototype of a language that compiled to a state machine format the IBM system could read. Once I had a prototype working I was convinced I could really build such a thing. I switched to JavaCC and created a full-blown language.
The way tools like lex/yacc and JavaCC work is that you create an input file that specifies the grammar for your language. You tell it what the keywords are and what order they are to appear in. Then the tool generates a parser for your language. It produces an abstract syntax tree that you can then walk to produce the compiled output that you need.
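To give a feel for the "walk the tree to produce compiled output" step, here is a small Java sketch. It is not the language I built and not the IBM format: the node types (Play, While), the instruction strings (CHECK, PLAY, GOTO), and the class names are all made up for illustration, and the tree is built by hand where a generated parser would normally build it. It turns a while loop into the check-a-condition, take-an-action, loop-back-to-the-top primitives described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical syntax tree node: each node appends numbered states to a list.
interface Node {
    void emit(List<String> out);
}

// "play a message to the caller"
class Play implements Node {
    private final String message;

    Play(String message) {
        this.message = message;
    }

    public void emit(List<String> out) {
        out.add("PLAY " + message);
    }
}

// A while loop, lowered to a condition check plus a jump back to the top.
class While implements Node {
    private final String condition;
    private final List<Node> body;

    While(String condition, List<Node> body) {
        this.condition = condition;
        this.body = body;
    }

    public void emit(List<String> out) {
        int top = out.size();
        out.add(null); // placeholder, patched once the exit state is known
        for (Node n : body) {
            n.emit(out);
        }
        out.add("GOTO " + top);
        out.set(top, "CHECK " + condition + " ELSE GOTO " + out.size());
    }
}

class StateMachineCompiler {
    // Walk the tree in order; the list index is the state number.
    static List<String> compile(List<Node> program) {
        List<String> out = new ArrayList<String>();
        for (Node n : program) {
            n.emit(out);
        }
        return out;
    }
}

class CompilerDemo {
    public static void main(String[] args) {
        // Tree for: while (retries_left) { play enter_pin; } play goodbye;
        List<Node> program = Arrays.<Node>asList(
                new While("retries_left", Arrays.<Node>asList(new Play("enter_pin"))),
                new Play("goodbye"));
        List<String> states = StateMachineCompiler.compile(program);
        for (int i = 0; i < states.size(); i++) {
            System.out.println(i + ": " + states.get(i));
        }
    }
}
```

The demo emits four states: the condition check at 0, the PLAY at 1, the jump back to 0 at 2, and the final PLAY at 3, which is where the check exits to.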
This was undoubtedly the most fun I ever had programming. And this was before I became a serious professional programmer. In fact, this experience was one of the factors that drove me from engineering to programming as a profession. So I highly recommend the experience of creating a language in this way. Furthermore, if you are going to create your own language this is the way to do it. You need a real parser for it, not some home-grown monstrosity. So using something like yacc or JavaCC is the way to go. You can hand someone the grammar specification and they can read it to see how to speak your new language.
It is quite amazing the power you feel having such a power tool at your fingertips. On future jobs I would whip out JavaCC and create a code generator at the drop of a hat.
Back to the original telephony project… once I had the basic language working, I then proceeded to create another language: a domain specific language that compiled to my C style language. This DSL was tailored to creating the kinds of voice applications that we were building. Unlike the IBM graphical programming tool, my DSL made it trivial to create the apps that we needed to create. I became dramatically more productive than my teammates by using these tools.
So what was the next step? Getting my teammates to use it of course. And why wouldn’t they want to? It would save them tons of time. Well, as it turns out, I was never able to get anyone else to even look at it seriously, much less use it. What good was the formal specification of the grammar for the language? None. Nobody read it. And if they did look at it, it’s not the most self-explanatory thing. What you really need is lots of examples that show the language in use. Nobody learns a language by reading the grammar specification.
So you can create your own language for yourself, but there is not much chance of getting others to use it.
Don’t make up your own language
So you are working on an app and you find that you want to add some hooks to let users customize some aspect of the app. Say… you are building a task management application and you want to allow users to plug in an algorithm for scheduling the order to work tasks. You might start out small and give users a custom field to add an order to each task. Then you realize they need more power so you let the users enter a simple expression that evaluates some of the fields on each task to determine its order. Then you go a step further and add the ability to call a sub-routine. Then you add variables. You are sliding down a slippery slope of creating an ad-hoc custom language.
That is a bad thing.
Instead of doing that just adopt some scripting language and add a hook to call a script. The default choice for a scripting language must be JavaScript. For a list of some other choices check out the Extension/embeddable languages section here http://en.wikipedia.org/wiki/Scripting_language.
Failure oblivious computing
Do you really need to understand all aspects of the program you are writing?
Is it more important for a program to be correct or simply keep running?
I certainly have strong reactions to these questions. My natural, strong responses are based on the principles of how software ought to be done.
This paper (link provided to me courtesy of Kyle) challenges my thinking in a startling way: RinardOOPSLA06.pdf
The results are so startling that on this page I was ready to discard the paper in complete disbelief:
Our Philosophy
- Should be able to ignore addressing errors
- Perform dynamic bounds checks
- Discard out of bounds writes
- Manufacture values for out of bounds reads
- Continue executing
- Called failure-oblivious computing
Did you get that? If the program writes to memory that it does not own… just throw away the write (!?). If the program reads from memory that it does not own… just make up some value to return (!?).
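To make the idea concrete, here is a toy Java sketch of those two rules applied to an array. It only illustrates the philosophy quoted above and is not how the paper implements it: the class name ObliviousArray is invented, and always returning 0 for an out of bounds read is a simplification chosen here.

```java
// A toy array wrapper that never fails on a bad index (hypothetical class).
class ObliviousArray {
    private final int[] data;

    ObliviousArray(int size) {
        data = new int[size];
    }

    // Dynamic bounds check on every access.
    private boolean inBounds(int index) {
        return index >= 0 && index < data.length;
    }

    // Out of bounds writes are silently discarded.
    void write(int index, int value) {
        if (inBounds(index)) {
            data[index] = value;
        }
    }

    // Out of bounds reads get a manufactured value (0 is an assumption here).
    int read(int index) {
        return inBounds(index) ? data[index] : 0;
    }
}

class ObliviousDemo {
    public static void main(String[] args) {
        ObliviousArray a = new ObliviousArray(3);
        a.write(1, 7);   // normal write
        a.write(5, 9);   // thrown away, no exception
        System.out.println(a.read(1)); // prints 7
        System.out.println(a.read(5)); // prints 0, a made-up value
    }
}
```

A plain Java array would throw ArrayIndexOutOfBoundsException on both bad accesses. The wrapper keeps executing instead.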
But then the author makes a compelling case based on empirical data.
The startling conclusion is that in many cases we cannot afford to build correct software. When I think about reality, rather than my wishful thinking, this resonates deeply.