Archive for October, 2008

When will we have a real multi-user windowing system

Tuesday, October 28th, 2008

I once worked on a VNC type application, where the purpose was to allow some number of users to share any desktop application with a group of people (part of the allure was that Windows applications could be shared with Linux users). The feature that was meant to set it apart from a typical desktop sharing application like VNC was that it would enhance collaboration amongst the users.

It was a fun application to write. It had a Java Swing front end with JNI to use portions of the win32 API and X-Windows API, and we wrote a video and audio server to distribute the view of the application and allow the users to talk to each other in an online group phone call while they were using the application.

There was one desired feature that we could not implement. We wanted to allow users to simultaneously enter text, and simultaneously control their own mouse cursor to interact with the application. We were completely blocked on this feature because it goes against a major assumption of all major operating systems - that one user will use the system at a time. For instance, it’s assumed that one window will have focus at a time. It is also assumed that there is only one mouse cursor at a time. There are even more subtle issues that come up, such as there being only one copy and paste buffer.

At the time, I was disappointed, but I didn’t dwell on it too much. I figured that that level of multi-user collaboration would only service a very small niche market, and that it wasn’t so bad that we didn’t have the possibility of better collaboration.

Fast forward to today. I pair program most of the time, and I find myself longing for an environment that would allow me and my pair to collaborate in different ways. It’s standard pair programming, where we both work on the same window, and have to share the same keyboard and mouse input (preferably with 2 mice and 2 keyboards). So what do you do when one person has to check email, or one person decides to research an issue online, or otherwise engage in a task that is not the same as the other pair member? Typically, each person will have a laptop that they can use for these situations, but that causes the person to separate themselves from the primary work environment. Sharing information then has to be done by sending an email, putting a file on a shared drive, or having the other person look at the laptop’s screen. What about spiking a problem side by side, where each member of the pair tries a similar but different approach to solving a programming problem? I would prefer to collaborate with someone in the same environment, where we can do some different tasks in parallel, and can share windows, applications, and data in the same desktop environment.

I think that it would be interesting to expand this idea even further. What if an entire team worked in the same environment, on the same desktop? Each person could have a portion of a huge desktop, and people in a team could choose to collaborate with each other in a very fluid fashion by putting their desktop spaces close together. It would be an interesting experiment in collaboration applied to software development.

Even if multiple simultaneous mouse and keyboard input were generally possible in operating systems, there would still likely be a lot of complex concurrency issues for developers, such as making sure that servers bind to different ports and write out to different files. Even once popular operating systems generally allow the multi-user features I mentioned before, it would likely still be preferable to have isolated machines that interact through collaborative software rather than have everyone actually use the same machine.

I’ve seen software that allows sharing of the mouse and keyboard across machines, and sharing the copy and paste buffer, but the purpose was to allow one person to connect 2 machines and use them like an extended desktop. Perhaps software like this could be tailored for better collaboration for pair programming or developers on a team not pairing.

An idea for a ruby inspection tool

Saturday, October 25th, 2008

Working on some Ruby code for a little while, I had an idea for a tool that would be useful.

Basically, what I want is an inspection tool that also has a bundle in TextMate, so that I can see the history of objects that are created, as well as their interactions, in a form that is searchable and that quickly shows what really happened. It would be like having debugging on all the time.

The goal of this tool would be to allow developers to view what really happens with the code when it executes. Sometimes finding the source of bugs is difficult because statically analyzing code is not the easiest way to predict the future state of an object, or to predict the interactions between objects. You don’t really know what happens until the code executes.

Nor is traditional debugging an efficient way to get the information that you need. You have to set a break-point, add in a line to call a debugger, or even print out some information in order to answer a question about the actual execution of the code.

What I want is a tool that records a wealth of data about the actual execution of some code, such as when I run a test, or initiate an action from a controller. Then, I would like to be able to filter/query that information to get more precise information.

For the inspection tool, I’d like to know things like:

- See the actual methods and fields of a class or object instance at the beginning and end of code execution
- See all calls made to an object, or a specific instance of an object
- See all callers of a specific method of a specific object type or object instance
- Detect changes in the structure of an individual object, such as methods being added or redefined after the object is initialized
- View all changes to fields of an object
- I would like to see the order that objects are created in. I would like to see when all objects of a certain type were created, or objects that respond to a specific method
- I would like to see all interactions between 2 objects
- I would like to be able to further scope the search, so that I only see information within the scope of a specific method call

Preferably, I would want to be able to view all of this information after running the code one time. And then be able to enter some search criteria into a form and have it reduce the amount of information to match my search criteria. Then, I can tweak some starting data in a test, and then rerun the test and get a new set of information to query.

I think that most of this information could be obtained by creating a Module to intercept all method sends to every object, and save some data about the context of the call when it’s made. Then, a client tool could parse the output, and allow a developer to query it.
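To keep the code on this page in one language, here is the same interception idea sketched in Java rather than Ruby, using java.lang.reflect.Proxy. All of the names (CallRecord, CallRecorder, Claim, SimpleClaim) are made up for illustration. A proxy only sees calls made through an interface, so this is far narrower than intercepting every send in Ruby, but it shows the two halves of the idea: record each call with some context, then query the recording afterwards.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// One recorded call: which method, with what arguments, and what came back.
final class CallRecord {
    final String method;
    final List<Object> args;
    final Object result;

    CallRecord(String method, List<Object> args, Object result) {
        this.method = method;
        this.args = args;
        this.result = result;
    }
}

// Wraps objects in proxies that record every call made through an interface.
final class CallRecorder {
    private final List<CallRecord> records = new ArrayList<CallRecord>();

    @SuppressWarnings("unchecked")
    <T> T wrap(Class<T> type, final T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Forward the call to the real object, then save what happened.
            Object result = method.invoke(target, args);
            List<Object> argList = args == null
                    ? Collections.<Object>emptyList()
                    : Arrays.asList(args);
            records.add(new CallRecord(method.getName(), argList, result));
            return result;
        };
        return (T) Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    // The "query" side: filter the recorded calls by method name.
    List<CallRecord> callsTo(String methodName) {
        List<CallRecord> matches = new ArrayList<CallRecord>();
        for (CallRecord record : records) {
            if (record.method.equals(methodName)) {
                matches.add(record);
            }
        }
        return matches;
    }
}

// Hypothetical domain types, only here so there is something to record.
interface Claim {
    boolean isExpired();
}

final class SimpleClaim implements Claim {
    private final boolean expired;

    SimpleClaim(boolean expired) {
        this.expired = expired;
    }

    public boolean isExpired() {
        return expired;
    }
}
```

Wrapping a SimpleClaim with recorder.wrap(Claim.class, ...) and calling isExpired() on the returned proxy adds one CallRecord per call, which callsTo("isExpired") then returns. A real tool would also want the caller, a timestamp, and object identity, none of which this sketch captures.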

I tried to do some googling to find something like this, but all I found were standard debugging tools, which I don’t think have the features that I’m looking for. If anyone knows of something that can do what I want, let me know.

Granularity of Abstractions

Thursday, October 23rd, 2008

In Java, iterating over a collection and collecting some objects based on a condition is a classic example of something to put into a method. Here is an example:

private Collection<Claim> getExpiredClaims() {
    List<Claim> expiredClaims = new ArrayList<Claim>();
    for (Claim claim : allClaims) {
        if (claim.isExpired()) {
            expiredClaims.add(claim);
        }
    }
    return expiredClaims;
}

In Ruby, you can do this in one line.

all_claims.select { |claim| claim.expired? }

It’s lovely how much more succinct that operation becomes. It can actually get more succinct by using Symbol#to_proc:

all_claims.select(&:expired?)

There is still the question of whether to move this to a method. Now, moving it to a method doesn’t reduce multi-line duplication the way it would in the Java example; there is only a minor amount of single line duplication that can be reduced.

The resistance I often get to refactoring a single line to a method is that it creates more total lines in the file with the def/end lines and spacing. Also, it takes a slight amount of effort to do the refactoring, and some developers won’t refactor unless the need is glaring and obvious.

There are still important reasons to move this to a method, and those reasons are better readability of code and reducing duplication of expressions. Without the refactoring, you’ll see code that looks like this:

def mark_expired_claims_for_review
    all_claims.select(&:expired?).each(&:needs_review!)
end

def notify_claim_agents_of_expired_claims
    all_claims.select(&:expired?).each(&:notify_agent_of_expiration)
end

After the refactoring, this is what it would look like.

def expired_claims
    all_claims.select(&:expired?)
end

def mark_expired_claims_for_review
    expired_claims.each(&:needs_review!)
end

def notify_claim_agents_of_expired_claims
    expired_claims.each(&:notify_agent_of_expiration)
end

It’s a minor point perhaps, and I’m sure that I could come up with much more excessive examples, but I just wanted to focus on a simple example. By replacing the expression with a symbolic reference, a method in this case, you express in English something that is a programmatic operation. This improves readability quite a bit.

Also, there is the additional benefit of abstracting on the concept of expired claims. By having multiple places using “all_claims.select(&:expired?)” to express expired claims, you duplicate the implementation detail that expired claims are derived from a larger collection of claims. This may not always be true, and a change in that derivation results in a change in many places.

Perhaps the question here is how DRY you should make your code. I’m still undecided on this point, but I think that the amount of work that it takes to reduce duplication to this level is minimal, and the result will be an important part of a pristine codebase.

That’s stupid

Wednesday, October 22nd, 2008

I meet a lot of developers (or IT people in general) that make this statement. It’s a recurring personality trait. Some do it more than others. It’s annoying. I don’t consider myself a “strong” personality, and as part of that, I hesitate to call anything stupid. I like to ponder topics for a while before I declare the stupidity at hand. Often, things that seem really stupid have a deeper context that results in their existence. It’s not as simple as saying that an idea, a person, or an occurrence is stupid; sometimes “stupid” things come out of a culmination of small factors.

Ironically, I often find that the people that make these claims do it out of ignorance. Many of these personality types make these statements and then have people explain to them why it’s not as stupid as it seems. Even then, a subset of these people will continue to assert that it’s all stupid.

Strictly speaking, it’s not a bad technique; I’ve met many people with this approach who earn validation in their claims from people who trust that they are, in fact, very smart. I don’t want to discredit some of these people; I’ve met some that were incredibly intelligent and could successfully back up their claims with both facts and a quick wit.

For many, it’s an argumentative technique, where they begin an argument basically saying “I’m smart, and most of you are stupid, now try to tell me that I’m wrong.” When I was a teenager, I started a lot of debates this way. I thought that I was smarter than most, and I felt empowered by this approach.

The problem that I find is that I meet too many developers who are not good at it. They are clearly ignorant of the context of their statements, and it shows. They tend to look a lot like the naive teenager that thinks that they know more than they really do. Nor are they often the type of people that you can easily inform of the downside to their approach; many of them won’t consider that they are wrong long enough to accept feedback.

Perhaps this bothers me as much as it does because I’m a consultant. Either it occurs with other consultants, and it demonstrates a lack of consulting skills, or it happens with client employees, where I’m not in a position to give them completely honest feedback.

Aside from this being a rant, I guess the purpose of this post is to cause some amount of introspection from people out there. Are you putting your ideas out there as best as you can? If you just read this blog post, and you feel like it could apply to you, then think for a while about whether you can improve your approach. If you really want your ideas and opinions to have value, then try to articulate them in a way that shows that you are cognizant of the factors involved, and are trying to make things better. Otherwise, you run the risk of discrediting yourself, and decreasing your ability to effect change.

Automated acceptance testing can be optional for some projects

Monday, October 20th, 2008

My experience with acceptance tests is that they usually fill the role of deep integration testing as well as immediate acceptance testing. They can serve as both a requirement recording mechanism and a regression testing tool. There is a cost associated with them, and you have to consider the context of the project in deciding whether that cost is justified.

For instance, a public facing application generally requires a higher bar of quality than an internal application. Often, small defects in an internal app can be tolerated, while they can become hugely embarrassing mistakes in the public facing application.

An example for a public facing application: suppose amazon.com has a bug in whether users can add items to the cart at certain points. It may not be the only way to add items to the cart, but even a small bug could result in many missed sales and an expensive loss to Amazon (not to mention the risk of losing shoppers from a poor user experience). An acceptance testing suite that tests specific small features like that can be very valuable, offsetting the cost of maintenance.

However, some projects may find that they can get away with a reduced mitigation of defects. These projects may also find that they can run acceptably with a small team of QA even without those tests.

For instance, I worked on an internal workflow application. It was a large application in terms of the number of screens and tables (more than 150 tables). It was implemented as a Java Swing thick client that communicated with an Apache Tomcat server. There were fewer than 500 users using the application, and all of them were employees of the company. The important consideration was not to have zero bugs, but to avoid any that would block a user’s ability to complete some portion of a business critical workflow. If any defects like that were discovered in production after a release (released every 1 to 2 weeks), we would be able to push out a branched hotfix release to get it working again.

We had 2 to 3 QAs and about 12 devs. The QAs were unable to manually do a full regression test run since they had over 10,000 test cases (which would take months to test manually), so they focused on testing what they knew had changed, and doing a quick happy path test of the workflow to make sure it still worked.

An additional aspect to this was that developers often helped out in QA. We thoroughly (manually) tested our changes when we were done with a story or bug fix, and often put developers on QA tasks in order to increase testing capacity when needed. One last thing of critical importance; when a developer made changes to the codebase, there was communication to the QA people on how risky the change might be. Low risk changes could get by with a QA testing a single screen briefly, whereas high risk changes would merit greater regression testing of the application. Developers worked with QA to give them better information to help them discover bugs more often.

The project ran well with this approach. There were defects, but almost none of them were critical. When a critical bug did occur, it was quickly resolved at little cost.

Getting rid of switch statements with Java Enums

Sunday, October 19th, 2008

I recently saw an interesting and polymorphic way to get rid of using a case statement when using enums. This is possible by defining a method for each instance of an enum.

I’m sure that you have seen code like this:

enum Friend {
    Joey, Chandler;
}

And then somewhere in the code, you might see:

class SomeObjectThatNeedsToKnowBestFriends {

    void doSomethingWithBestFriends() {
        for (Friend friend : Friend.values()) {
            doSomething(bestFriend(friend));
        }
    }

    Friend bestFriend(Friend friend) {
        switch (friend) {
            case Joey: return Friend.Chandler;
            case Chandler: return Friend.Joey;
            default: throw new RuntimeException("This person has no friend");
        }
    }
}

This is a common smell in a code base where client code has logic that should be better encapsulated. Now, whenever someone adds a friend, they are going to have to search for references on Friends and add a new entry, or they are going to get a RuntimeException from the switch statement. In a well tested codebase, there is probably going to be a unit test that asserts that all Friends have best friends. In any case, the switch statement in the client object code is not great from an OO standpoint and it creates a maintainability issue.

The first refactoring is to move all Friend related logic to the Friend enum where it belongs.

enum Friend {
    Joey, Chandler;

    Friend bestFriend() {
        switch (this) {
            case Joey: return Chandler;
            case Chandler: return Joey;
            default: throw new RuntimeException("This person has no friend");
        }
    }

}

Now, we can just ask the Friend who their best friend is.

Joey.bestFriend(); // returns Chandler

Nice. Still, I’m not wild about that switch statement. Developers still have to know to update it, and really, I’d rather not even have to throw an exception because it was misused. It would be better if the structure of the code did not allow misuse.

Here is an example of how to do this:

enum Friend {
    Joey {
        Friend bestFriend() { return Chandler; }
    },
    Chandler {
        Friend bestFriend() { return Joey; }
    };

    abstract Friend bestFriend();
}

Then,

Joey.bestFriend(); // returns Chandler

Great, now we know that when someone adds a new friend, they will immediately be confronted with having to supply a best friend. My only issue with this approach is that all the method definitions become verbose when you introduce many methods like this. I tried different approaches to solving this problem, but because the enum instances reference each other, I was not able to find one that worked.

Here is an example of what you can’t do:

enum Friend {
    Joey(Chandler),
    Chandler(Joey);

    final Friend bestFriend;

    Friend(Friend bestFriend) {
        this.bestFriend = bestFriend;
    }

    Friend bestFriend() {
        return bestFriend;
    }
}

This won’t work because you can’t reference Chandler in the enum definition for Joey. The Chandler instance hasn’t been defined yet at that point, so this won’t even compile. However, you can “trick” the compiler by fully qualifying the reference as Friend.Chandler:

enum Friend {
    Joey(Friend.Chandler),
    Chandler(Joey);

    final Friend bestFriend;

    Friend(Friend bestFriend) {
        this.bestFriend = bestFriend;
    }

    Friend bestFriend() {
        return bestFriend;
    }
}

However, the result is not what we want:

Joey.bestFriend(); // returns null
Chandler.bestFriend(); // returns Joey

Even though I can reference the other enum instance this way, it resolves to null. The reason lies in the fact that when the Enum is compiled, each instance is a static final field, and initialized in a static block. Here is a snippet of the generated code:

public static final Friend Joey;
public static final Friend Chandler;
final Friend bestFriend;
private static final Friend ENUM$VALUES[];

static
{
    Joey = new Friend("Joey", 0, Chandler);
    Chandler = new Friend("Chandler", 1, Joey);
    ENUM$VALUES = (new Friend[] {
        Joey, Chandler
    });
}

Interestingly, I can write a program that tries the fully qualified name for explicit static constants, and it works:

public class StaticEnumClass {

    static final String foobar = "foo" + StaticEnumClass.bar;
    static final String bar = "bar";
   
    public static void main(String[] args) {
        System.out.println(foobar); // prints out "foobar"
    }
   
}

But if I use a static initialization, it doesn’t:

public class StaticEnumClass {

    static final String foobar;
    static final String bar;
   
    static {
        foobar = "foo" + StaticEnumClass.bar;
        bar = "bar";
    }
   
    public static void main(String[] args) {
        System.out.println(foobar); // prints "foonull"
    }
   
}
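A third variant may help narrow down what makes the difference between those two programs. This is a minimal sketch with a made-up class name (ConstantInliningDemo): bar keeps its inline initializer, but the initializer is a method call rather than a literal, so it no longer qualifies as a compile-time constant.

```java
class ConstantInliningDemo {

    // Same shape as the first program, with no static block at all.
    // The only change: bar is initialized by a method call, so the compiler
    // cannot treat it as a constant and inline its value into foobar.
    static final String foobar = "foo" + ConstantInliningDemo.bar;
    static final String bar = String.valueOf("bar");

    public static void main(String[] args) {
        System.out.println(foobar); // prints "foonull"
    }

}
```

The static block is gone here, yet the result is still "foonull", because static initializers run in textual order and bar has not been assigned when foobar is computed. That suggests the deciding factor is whether bar is a constant that the compiler can inline, not the static block itself.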

Interesting. The compiler is smart enough to resolve the correct value when bar is a compile-time constant, which it no longer is once it’s assigned in a static initialization block. Back to my original example with Friends. The next attempt was to create an anonymous constructor (actually, an instance initializer) for the enum instances and see if I could get what I want:

enum Friend {
    Joey {
        {
            bestFriend = Chandler;
        }
    },
    Chandler {
        {
            bestFriend = Joey;
        }
    };

    Friend bestFriend;

    Friend bestFriend() {
        return bestFriend;
    }
}

I had to remove the final from bestFriend, since I’m initializing the value when the object is instantiated using an instance initializer. This compiles, and seems like an okay approach. My hope was that the references to other enum instances would get resolved in much the same manner as in the case of creating a method that returns each one. Interestingly, this doesn’t happen.

Joey.bestFriend(); // returns null
Chandler.bestFriend(); // returns Joey

The reason is that even though I am using an instance initializer, it’s being called from a static block since the instances are created in a static block. Turns out to be a naive attempt. Here is what it ends up looking like when the enum gets generated as a class:

static
{
    Joey = new Friend("Joey", 0) {
        {
            bestFriend = Friend.Chandler;
        }
    };
    Chandler = new Friend("Chandler", 1) {
        {
            bestFriend = Friend.Joey;
        }
    };
    ENUM$VALUES = (new Friend[] {
        Joey, Chandler
    });
}

Oh well. That’s as far as my experimentation went. I’m satisfied that I can at least create an anonymous subtype of an enum that returns the correct value, but if anyone has any ideas on how to do this in a cleaner way, let me know.
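One option that might be worth trying is to leave the enum instances bare and wire up the best friends in a static block placed after them. By the time that block runs, every instance has been constructed, so nothing resolves to null. This is only a sketch, not something from the experiments above; the validation loop and its error message are my own additions.

```java
enum Friend {
    Joey, Chandler;

    // Not final: it is assigned after construction, in the static block below.
    private Friend bestFriend;

    static {
        // Static initializers run in textual order, so both instances
        // already exist when these assignments execute.
        Joey.bestFriend = Chandler;
        Chandler.bestFriend = Joey;

        // Fail when the enum is first loaded if a newly added friend was not wired up.
        for (Friend friend : values()) {
            if (friend.bestFriend == null) {
                throw new IllegalStateException(friend + " has no best friend");
            }
        }
    }

    Friend bestFriend() {
        return bestFriend;
    }
}
```

The trade-off is that a forgotten best friend shows up when the enum is first loaded rather than at compile time, so the abstract method version still gives the stronger guarantee. What this buys is one line per friend instead of a method body per friend.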

Saving lost developer time with better hardware

Saturday, October 4th, 2008

A common problem that I see on projects is that the computers available to the teams are mediocre. The obvious example of this is when the computers given to developers are mediocre, but I also think that there is a compelling point to be made around solving performance problems on build machines with hardware instead of software.

Developer Machines

I was once on a project where the local update and build process became an hour long. I won’t get into the details, but it was largely an IO bound delay, with portions where the processor was the bottleneck. We were using Dell 610 laptops. When some developers started getting Dell 620’s (dual core laptops), we discovered that it reduced the local build time on the machines by 33% to 50%. Whoa.

Think about that. A 60 minute build cut down to 30 minutes. Let’s assume that developers only build once per day and that each developer has an average cost of $100 per hour (total cost to the organization, not just wages). With those savings, getting every developer a Dell 620 instead of a Dell 610 pays for itself in a couple of weeks. And this only considers cutting one long build in half. There are many other situations where having a slow machine causes lost developer time.
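The arithmetic behind that estimate is easy to sketch. The 30 minutes saved, one build per day, and $100 per hour come from the numbers above; the $500 price difference between the two laptops is purely an assumption for illustration, as are the class and method names.

```java
class BuildTimePayback {

    // Working days until the time saved on builds covers the extra hardware cost.
    static double paybackDays(double minutesSavedPerBuild, double buildsPerDay,
                              double costPerHour, double extraHardwareCost) {
        double savedPerDay = minutesSavedPerBuild / 60.0 * buildsPerDay * costPerHour;
        return extraHardwareCost / savedPerDay;
    }

    public static void main(String[] args) {
        // 30 minutes saved, 1 build a day, $100 an hour: $50 saved per developer per day.
        // Assumed (not from the post): the faster laptop costs $500 more.
        System.out.println(paybackDays(30, 1, 100, 500)); // prints 10.0
    }

}
```

Ten working days is two weeks, and any developer who builds more than once a day shortens that further.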

We lobbied for getting the developers better machines, and were mostly denied. I discovered that organizations measure the cost of people separately from the cost of hardware. In fact, they may be accounted for by different departments entirely, where an arbitrary budget is given to the department that issues employee computers.

I’ve seen this on every project I’ve been on. We are given slow machines, and time is lost. It may be lost because I’m running grep over a lot of files, or because I have all my development tools open and the machine slows down.

I think that it’s fine that organizations begin developers with cheap machines, but they should be quick to spend money at the first sign that it is needed by the developers. I believe that it is an aspect of agility that many organizations fall short on, where the ability to respond to constraints in the software development system is hindered by the structure and policies of the organization.

In fact, I think that IT organizations should run a few tests against their technology stack to see what kind of performance differences exist, and use those numbers to decide which machines will give developers the best performance while being reasonable on cost. This is especially true of Java J2EE projects, as most of the tools and applications are resource intensive, and building an entire application can take a long time.

Build Machines

If your project has any kind of continuous integration (and it should!) then you have probably felt the pain of long builds at some point. I’ve seen this on every project I’ve been on. There are two areas in particular that I’ve found to be painful: Long running regression or acceptance tests, and long compilation and deployment cycles due to heavyweight tools.

Often builds and tests are segmented into builds that are run locally on developer machines, and builds that are run by the build server. A typical approach is to have developers run unit tests and fast running integration tests locally while developing, but to have long running integration tests and acceptance tests run by a build server, where failures will be fixed later by developers when the build completes.

Many projects will find over time that the time it takes to run these large integration tests and acceptance tests becomes so long that the value is reduced. The time it takes to get feedback might be hours or even days. Often, these tests are failing, as by the time they complete, multiple developers or teams may have made changes that break a portion of the test suite.

I’ve seen or read of different approaches to this problem, from using in-memory databases, to manually splitting the regression suites into separate builds or “pipelines”, to distributed computing, to transparent parallelization of tests.

An example of a new tool for transparently running tests in parallel is Selenium Grid, which attempts to run Selenium tests in parallel. While I think there is merit in exploring these tools, they are non-trivial to set up and maintain, and while they may cut the build/test time down to a fraction of its original time, they increase the complexity of the infrastructure that developers will have to maintain. There tend to be surprise issues with parallelization as well. You have to make sure that your tests can write to the filesystem, query a database, or call other services in parallel.

One day, I hope to try a different approach. I would rather spend the money trying to use hardware to solve the problem instead of using some complicated tool. From a previous experience of dealing with an incredibly IO bound build, I’ve long dreamt of building a hard drive out of RAM. I’m not talking about using flash memory; I’m talking about using DDR RAM instead of a traditional hard drive.

I recently looked into this concept, and I found that there are a few manufacturers out there that provide devices to do this very thing, such as the Hyper Drive 4. There are a few other devices out there that can achieve this, but I liked the information/propaganda on the Hyper Drive page the best.

The stats claimed by using a RAM based hard disk are nothing short of sexy.

I won’t reiterate the numbers here, but depending on the usage, it ranges from an order of magnitude to several orders of magnitude in increased performance. Even in builds and test suites that are not predominately IO bound, I am willing to bet that the performance boost to the operating system will translate into large gains for the performance of the tests. My favorite statistic was that Windows XP booted in 2 seconds with their test configuration, and that was only because of device polling.

I don’t think that the cost of such a system is unreasonable. One 16 gigabyte drive using the Hyper Drive system would probably cost around $5000. Assuming an average developer cost of $100 an hour, it pays for itself once about 50 hours (a little over one week) of one developer’s time is saved, let alone considering the benefits to an entire team. Come to think of it, I would argue that developers should have similar setups for their local machines. For instance, grep would probably be instantaneous with a DDR based drive.
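For a rough sense of scale, here is the break-even worked out from the figures above: $5000 for the drive and $100 per developer hour. The team size of 5 in the second call is an assumption for illustration, and the class and method names are made up.

```java
class RamDrivePayback {

    // Hours each developer must save before the hardware has paid for itself.
    static double breakEvenHoursPerDeveloper(double hardwareCost, double costPerHour,
                                             int developers) {
        return hardwareCost / (costPerHour * developers);
    }

    public static void main(String[] args) {
        // One developer waiting on the build machine.
        System.out.println(breakEvenHoursPerDeveloper(5000, 100, 1)); // prints 50.0
        // Assumed team of 5 sharing the same build machine.
        System.out.println(breakEvenHoursPerDeveloper(5000, 100, 5)); // prints 10.0
    }

}
```

At these numbers the drive breaks even at 50 saved hours for a single developer, a bit more than one 40 hour week, and the figure drops quickly as more people wait on the same build.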