Archive for October, 2008

When will we have a real multi-user windowing system

Tuesday, October 28th, 2008

I once worked on a VNC-type application whose purpose was to allow some number of users to share any desktop application with a group of people (part of the allure was that Windows applications could be shared with Linux users). What was meant to set it apart from a typical desktop sharing application like VNC was that it would enhance collaboration amongst the users.

It was a fun application to write. It had a Java Swing front end with JNI to use portions of the Win32 API and the X Window System API, and we wrote a video and audio server to distribute the view of the application and allow the users to talk to each other in an online group phone call while they were using the application.

There was one desired feature that we could not implement. We wanted to allow users to simultaneously enter text, and simultaneously control their own mouse cursor to interact with the application. We were completely blocked on this feature because it goes against a major assumption of all major operating systems - that one user will use the system at a time. For instance, it’s assumed that one window will have focus at a time. It is also assumed that there is only one mouse cursor at a time. There are even more subtle issues that come up, such as there being only one copy and paste buffer.

At the time, I was disappointed, but I didn’t dwell on it too much. I figured that that level of multi-user collaboration would only service a very small niche market, and that it wasn’t so bad that we didn’t have the possibility of better collaboration.

Fast forward to today. I pair program most of the time, and I find myself longing for an environment that would allow me and my pair to collaborate in different ways. It’s standard pair programming, where we both work in the same window and have to share the same keyboard and mouse input (preferably with 2 mice and 2 keyboards). So what do you do when one person has to check email, or decides to research an issue online, or otherwise takes on a task that differs from what the other person is doing? Typically, each person will have a laptop for these situations, but using it separates that person from the primary work environment. Sharing information then has to be done by sending an email, putting a file on a shared drive, or having the other person look at the laptop’s screen. What about spiking a problem side by side, where each member of the pair tries a similar but different approach to solving a programming problem? I would prefer to collaborate with someone in the same environment, where we can do some different tasks in parallel, and can share windows, applications, and data in the same desktop environment.

I think that it would be interesting to expand this idea even further. What if an entire team worked in the same environment, on the same desktop? Each person could have a portion of a huge desktop, and people in a team could choose to collaborate with each other in a very fluid fashion by putting their desktop spaces close together. It would be an interesting experiment in collaboration applied to software development.

Even if multiple simultaneous mouse and keyboard input were generally possible in operating systems, there would still likely be a lot of complex concurrency issues for developers, such as making sure that servers bind to different ports and write out to different files. Even once popular operating systems generally allow the multi-user features I mentioned before, it would likely still be preferable to have isolated machines that interact through collaborative software rather than have everyone actually use the same machine.

I’ve seen software that allows sharing of the mouse and keyboard across machines, and sharing the copy and paste buffer, but the purpose was to allow one person to connect 2 machines and use them like an extended desktop. Perhaps software like this could be tailored for better collaboration, whether for pair programming or for developers on a team who are not pairing.

An idea for a ruby inspection tool

Saturday, October 25th, 2008

While working on some Ruby code for a little while, I had an idea for a tool that would be useful.

Basically, what I want is an inspection tool that also has a bundle in TextMate, so that I can see the history of objects that are created, as well as their interactions, in a form that is searchable and that quickly shows what really happened. It would be like having debugging on all the time.

The goal of this tool would be to allow developers to view what really happens with the code when it executes. Sometimes finding the source of bugs is difficult because statically analyzing code is not the easiest way to predict the future state of an object, or to predict the interactions between objects. You don’t really know what happens until the code executes.

Nor is traditional debugging an efficient way to get the information that you need. You have to set a break-point, add in a line to call a debugger, or even print out some information in order to answer a question about the actual execution of the code.

What I want is a tool that records a wealth of data about the actual execution of some code, such as when I run a test, or initiate an action from a controller. Then, I would like to be able to filter/query that information to get more precise information.

For the inspection tool, I’d like to know things like:

- See the actual methods and fields of a class or object instance at the beginning and end of code execution
- See all calls made to an object, or a specific instance of an object
- See all callers of a specific method of a specific object type or object instance
- Detect changes in the structure of an individual object that happens, such as methods being added or redefined after the object is initialized
- View all changes to fields of an object
- I would like to see the order that objects are created in. I would like to see when all objects of a certain type were created, or objects that respond to a specific method
- I would like to see all interactions between 2 objects
- I would like to be able to further scope the search, so that I only see information within the scope of a specific method call

Preferably, I would want to be able to view all of this information after running the code one time, and then enter some search criteria into a form to reduce the information to what matches. Then, I can tweak some starting data in a test, rerun the test, and get a new set of information to query.

I think that most of this information could be obtained by creating a Module to intercept all method sends to every object, and save some data about the context of the call when it’s made. Then, a client tool could parse the output, and allow a developer to query it.
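As a rough sketch of the recording half of this idea: instead of the hand-written Module described above, the example below leans on Ruby's built-in TracePoint API (added in Ruby 2.0, so newer than this post) to log every Ruby-level method call made inside a block. The names CallRecorder, Entry, record and calls_to are hypothetical, chosen only for illustration, and a real tool would need to capture far more context than this.

```ruby
# Hypothetical recorder, for illustration only.
class CallRecorder
  # One row per observed call: the class that defines the method,
  # the method name, and the file and line where the method is defined.
  Entry = Struct.new(:defined_class, :method_name, :path, :line)

  attr_reader :entries

  def initialize
    @entries = []
  end

  # Runs the block with tracing switched on.
  # The :call event fires for methods written in Ruby, not for C built-ins.
  def record
    trace = TracePoint.new(:call) do |tp|
      @entries << Entry.new(tp.defined_class, tp.method_id, tp.path, tp.lineno)
    end
    trace.enable { yield }
  end

  # A very simple query: every recorded call to the given method name.
  def calls_to(method_name)
    entries.select { |entry| entry.method_name == method_name }
  end
end

# Assumed example class, standing in for real application code.
class Claim
  def initialize(expired)
    @expired = expired
  end

  def expired?
    @expired
  end
end

recorder = CallRecorder.new
recorder.record do
  [Claim.new(true), Claim.new(false)].each { |claim| claim.expired? }
end

recorder.calls_to(:expired?).each do |entry|
  puts "#{entry.defined_class}##{entry.method_name} (#{entry.path}:#{entry.line})"
end
```

Keeping the entries as plain structs means the querying side can start out as ordinary select calls before any client tool exists.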

I tried to do some googling to find something like this, but all I found were standard debugging tools, which I don’t think have the features that I’m looking for. If anyone knows of something that can do what I want, let me know.

Granularity of Abstractions

Thursday, October 23rd, 2008

In Java, iterating over a collection and collecting some objects based on a condition is a classic example of something to put into a method. Here is an example:

    private Collection<Claim> getExpiredClaims() {
        List<Claim> expiredClaims = new ArrayList<Claim>();
        for (Claim claim : allClaims) {
            if (claim.isExpired()) {
                expiredClaims.add(claim);
            }
        }
        return expiredClaims;
    }

In Ruby, you can do this in one line.

    all_claims.select { |claim| claim.expired? }

It’s lovely how much more succinct that operation becomes. It can actually get more succinct by using Symbol#to_proc:

    all_claims.select(&:expired?)

There is still the question of whether to move this to a method. Now, moving it to a method doesn’t reduce multi-line duplication the way it would in the Java example; there is only a minor amount of single line duplication that can be reduced.

The resistance I often get to refactoring a single line to a method is that it creates more total lines in the file with the def/end lines and spacing. Also, it takes a slight amount of effort to do the refactoring, and some developers won’t refactor unless the need is glaring and obvious.

There are still important reasons to move this to a method, and those reasons are better readability of code and reducing duplication of expressions. Without the refactoring, you’ll see code that looks like this:

    def mark_expired_claims_for_review
      all_claims.select(&:expired?).each(&:needs_review!)
    end

    def notify_claim_agents_of_expired_claims
      all_claims.select(&:expired?).each(&:notify_agent_of_expiration)
    end

After the refactoring, this is what it would look like.

    def expired_claims
      all_claims.select(&:expired?)
    end

    def mark_expired_claims_for_review
      expired_claims.each(&:needs_review!)
    end

    def notify_claim_agents_of_expired_claims
      expired_claims.each(&:notify_agent_of_expiration)
    end

It’s a minor point perhaps, and I’m sure that I could come up with much more excessive examples, but I just wanted to focus on a simple example. By replacing the expression with a symbolic reference, a method in this case, you express in English something that is a programmatic operation. This improves readability quite a bit.

Also, there is the additional benefit of abstracting on the concept of expired claims. By having multiple places using “all_claims.select(&:expired?)” to express expired claims, you duplicate the implementation detail that expired claims are derived from a larger collection of claims. This may not always be true, and a change in that derivation results in a change in many places.
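To make that point concrete, here is a small sketch in which the derivation of expired claims changes. Only all_claims, expired?, needs_review! and expired_claims come from the examples above; the Claim, ClaimFile and ArchivedClaimFile classes, and the idea that expired claims might arrive as a separate list, are assumptions made up for illustration.

```ruby
# Hypothetical classes for illustration.
class Claim
  def initialize(expired)
    @expired = expired
    @needs_review = false
  end

  def expired?
    @expired
  end

  def needs_review!
    @needs_review = true
  end

  def needs_review?
    @needs_review
  end
end

class ClaimFile
  attr_reader :all_claims

  def initialize(all_claims)
    @all_claims = all_claims
  end

  # The only method that knows expired claims are filtered from all_claims.
  def expired_claims
    all_claims.select(&:expired?)
  end

  def mark_expired_claims_for_review
    expired_claims.each(&:needs_review!)
  end
end

# Assumed variation: expired claims now arrive as their own list.
# Only expired_claims changes; mark_expired_claims_for_review is untouched.
class ArchivedClaimFile < ClaimFile
  def initialize(all_claims, archived_claims)
    super(all_claims)
    @archived_claims = archived_claims
  end

  def expired_claims
    @archived_claims
  end
end
```

Had each caller spelled out the select expression itself, the same change would have meant editing every one of them.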

Perhaps the question here is how DRY you should make your code. I’m still undecided on this point, but I think that the amount of work that it takes to reduce duplication to this level is minimal, and the result will be an important part of a pristine codebase.

Getting rid of switch statements with Java Enums

Sunday, October 19th, 2008

I recently saw an interesting and polymorphic way to get rid of using a case statement when using enums. This is possible by defining a method for each instance of an enum.

I’m sure that you have seen code like this:

    enum Friend {
        Joey, Chandler;
    }

And then somewhere in the code, you might see:

    class SomeObjectThatNeedsToKnowBestFriends {

        void doSomethingWithBestFriends() {
            for (Friend friend : Friend.values()) {
                doSomething(bestFriend(friend));
            }
        }

        Friend bestFriend(Friend friend) {
            switch (friend) {
                case Joey: return Friend.Chandler;
                case Chandler: return Friend.Joey;
                default: throw new RuntimeException("This person has no friend");
            }
        }
    }

This is a common smell in a code base where client code has logic that should be better encapsulated. Now, whenever someone adds a friend, they are going to have to search for references on Friends and add a new entry, or they are going to get a RuntimeException from the switch statement. In a well tested codebase, there is probably going to be a unit test that asserts that all Friends have best friends. In any case, the switch statement in the client object code is not great from an OO standpoint and it creates a maintainability issue.

The first refactoring is to move all Friend related logic to the Friend enum where it belongs.

    enum Friend {
        Joey, Chandler;

        Friend bestFriend() {
            switch (this) {
                case Joey: return Chandler;
                case Chandler: return Joey;
                default: throw new RuntimeException("This person has no friend");
            }
        }

    }

Now, we can just ask the Friend who their best friend is.

    Joey.bestFriend(); // returns Chandler

Nice. Still, I’m not wild about that switch statement. Developers still have to know to update it, and really, I’d rather not even have to throw an exception because it was misused. It would be better if the structure of the code did not allow misuse.

Here is an example of how to do this:

    enum Friend {
        Joey {
            Friend bestFriend() { return Chandler; }
        },
        Chandler {
            Friend bestFriend() { return Joey; }
        }; // the constant list must end with a semicolon before other members

        abstract Friend bestFriend();
    }

Then,

    Joey.bestFriend(); // returns Chandler

Great, now we know that when someone adds a new friend, they will immediately be confronted with having to supply a best friend. My only issue with this approach is that the method definitions become verbose when you introduce many methods like this. I tried different approaches to solving this problem, but because the enum instances reference each other, I was not able to find one that works.

Here is an example of what you can’t do:

    enum Friend {
        Joey(Chandler), // does not compile: illegal forward reference
        Chandler(Joey);

        final Friend bestFriend;

        Friend(Friend bestFriend) {
            this.bestFriend = bestFriend;
        }

        Friend bestFriend() {
            return bestFriend;
        }
    }

This won’t work because you can’t reference Chandler in the enum definition for Joey. Chandler is declared after Joey, so the compiler rejects the simple name as an illegal forward reference and this won’t even compile. However, you can “trick” the compiler by fully qualifying the reference as Friend.Chandler:

    enum Friend {
        Joey(Friend.Chandler),
        Chandler(Joey);

        final Friend bestFriend;

        Friend(Friend bestFriend) {
            this.bestFriend = bestFriend;
        }

        Friend bestFriend() {
            return bestFriend;
        }
    }

However, the result is not what we want:

    Joey.bestFriend();     // null
    Chandler.bestFriend(); // Joey

Even though I can reference the other enum instance this way, it resolves to null. The reason is that when the enum is compiled, each instance becomes a static final field that is initialized in a static block, in declaration order. When Joey is constructed, Chandler has not been assigned yet and still holds its default value of null. Here is a snippet of the generated code:

    public static final Friend Joey;
    public static final Friend Chandler;
    final Friend bestFriend;
    private static final Friend ENUM$VALUES[];

    static
    {
        Joey = new Friend("Joey", 0, Chandler);
        Chandler = new Friend("Chandler", 1, Joey);
        ENUM$VALUES = (new Friend[] {
            Joey, Chandler
        });
    }

Interestingly, I can write a program that tries the fully qualified name for explicit static constants, and it works:

    public class StaticEnumClass {

        static final String foobar = "foo" + StaticEnumClass.bar;
        static final String bar = "bar";

        public static void main(String[] args) {
            System.out.println(foobar); // prints out "foobar"
        }

    }

But if I use a static initialization, it doesn’t:

    public class StaticEnumClass {

        static final String foobar;
        static final String bar;

        static {
            foobar = "foo" + StaticEnumClass.bar;
            bar = "bar";
        }

        public static void main(String[] args) {
            System.out.println(foobar); // prints "foonull"
        }

    }

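Interesting. The first version works only because bar is a compile-time constant: a static final String initialized with a constant expression, so the compiler inlines its value wherever it is used. Once the assignment moves into a static initialization block, bar is an ordinary field that still holds its default value of null when foobar is computed. Back to my original example with Friends. The next attempt was to add an instance initializer (a sort of anonymous constructor) to each enum instance and see if I could get what I want: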

    enum Friend {
        Joey {
            {
                bestFriend = Chandler;
            }
        },
        Chandler {
            {
                bestFriend = Joey;
            }
        }; // the constant list must end with a semicolon before other members

        Friend bestFriend;

        Friend bestFriend() {
            return bestFriend;
        }
    }

I had to remove the final from bestFriend, since I’m initializing the value with an instance initializer when the object is instantiated. This compiles, and seems like an okay approach. My hope was that the references to the other enum instances would get resolved in much the same manner as in the case of creating a method that returns each one. Interestingly, this doesn’t happen.

    Joey.bestFriend();     // null
    Chandler.bestFriend(); // Joey

The reason is that even though I am using an instance initializer, it’s being called from a static block since the instances are created in a static block. Turns out to be a naive attempt. Here is what it ends up looking like when the enum gets generated as a class:

    static
    {
        Joey = new Friend("Joey", 0) {
            {
                bestFriend = Friend.Chandler;
            }
        };
        Chandler = new Friend("Chandler", 1) {
            {
                bestFriend = Friend.Joey;
            }
        };
        ENUM$VALUES = (new Friend[] {
            Joey, Chandler
        });
    }

Oh well. That’s as far as my experimentation went. I’m satisfied that I can at least create an anonymous subtype of an enum that returns the correct value, but if anyone has any ideas on how to do this in a cleaner way, let me know.

Saving lost developer time with better hardware

Saturday, October 4th, 2008

A common problem that I see on projects is that the computers available to the teams are mediocre. The obvious example of this is when the computers given to developers are mediocre, but I also think that there is a compelling point to be made around solving performance on build machines with hardware instead of software.

Developer Machines

I was once on a project where the local update and build process became an hour long. I won’t get into the details, but it was largely an IO bound delay, with portions where the processor was the bottleneck. We were using Dell 610 laptops. When some developers started getting Dell 620’s (dual core laptops), we discovered that it reduced the local build time on the machines by 33% to 50%. Whoa.

Think about that. A 60 minute build cut down to 30 minutes. Let’s assume that developers only build once per day and that each developer has an average cost of $100 per hour (total cost to the organization, not just wages). With those savings, getting every developer a Dell 620 instead of a Dell 610 pays for itself in a couple of weeks. And this only considers cutting a long build in half. There are many other situations where having a slow machine causes lost developer time.
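The payback estimate is easy to check with a few lines of Ruby. The build frequency, time saved and hourly cost below come from the paragraph above; the $500 price difference between the two laptops is purely an assumption for the sake of the arithmetic, and payback_days is a hypothetical helper name.

```ruby
# Working days until a hardware upgrade has paid for itself.
def payback_days(extra_cost:, minutes_saved_per_day:, hourly_cost:)
  daily_saving = (minutes_saved_per_day / 60.0) * hourly_cost
  (extra_cost / daily_saving).ceil
end

# From the post: one build per day, 30 minutes saved, $100 per developer hour.
# Assumption: the faster laptop costs $500 more (the post gives no price).
puts payback_days(extra_cost: 500, minutes_saved_per_day: 30, hourly_cost: 100) # prints 10
```

Ten working days is two weeks, and the figure shrinks further if developers build more than once a day.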

We lobbied for getting the developers better machines, and were mostly denied. I discovered that organizations measure the cost of people separately from the cost of hardware. In fact, they may be accounted for by different departments entirely, where an arbitrary budget is given to the department that issues employee computers.

I’ve seen this on every project I’ve been on. We are given slow machines, and time is lost. It may be lost because I’m running grep over a lot of files, or because the machine slows down when I have all my development tools open.

I think that it’s fine that organizations start developers on cheap machines, but they should be quick to spend money at the first sign that the developers need it. I believe that this is an aspect of agility that many organizations fall short on, where the ability to respond to constraints in the software development system is hindered by the structure and policies of the organization.

In fact, I think that IT organizations should run a few tests against their technology stack to see what kind of performance differences exist, and use those numbers to decide which machines will give developers the best performance at a reasonable cost. This is especially true of Java J2EE projects, as most of the tools and applications are resource intensive, and building an entire application can take a long time.

Build Machines

If your project has any kind of continuous integration (and it should!) then you have probably felt the pain of long builds at some point. I’ve seen this on every project I’ve been on. There are two areas in particular that I’ve found to be painful: Long running regression or acceptance tests, and long compilation and deployment cycles due to heavyweight tools.

Often builds and tests are segmented into builds that are run locally on developer machines, and builds that are run by the build server. A typical approach is to have developers run unit tests and fast running integration tests locally while developing, but to have long running integration tests and acceptance tests run by a build server, where failures will be fixed later by developers when the build completes.

Many projects will find over time that the time it takes to run these large integration tests and acceptance tests becomes so long that the value is reduced. The time it takes to get feedback might be hours or even days. Often, these tests are failing, as by the time they complete, multiple developers or teams may have made changes that break a portion of the test suite.

I’ve seen or read of different approaches to this problem, from using in-memory databases, to manually splitting the regression suites into separate builds or “pipelines”, to distributed computing, to transparent parallelization of tests.

An example of a new tool for transparently running tests in parallel is Selenium Grid, which attempts to run Selenium tests in parallel. While I think there is merit in exploring these tools, they are non-trivial to set up and maintain, and while they may cut the build/test time down to a fraction of its original time, they increase the complexity of the infrastructure that developers will have to maintain. There tend to be surprising issues with parallelization as well. You have to make sure that tests that write to the filesystem, query a database, or call other services can safely run in parallel.

One day, I hope to try a different approach. I would rather spend the money trying to use hardware to solve the problem instead of using some complicated tool. From a previous experience of dealing with an incredibly IO bound build, I’ve long dreamt of building a hard drive out of RAM. I’m not talking about using flash memory; I’m talking about using DDR RAM instead of a traditional hard drive.

I recently looked into this concept, and I found that there are a few manufacturers out there that provide devices to do this very thing, such as the Hyper Drive 4. There are a few other devices out there that can achieve this, but I liked the information/propaganda on the Hyper Drive page the best.

The stats claimed by using a RAM based hard disk are nothing short of sexy.

I won’t reiterate the numbers here, but depending on the usage, it ranges from an order of magnitude to several orders of magnitude in increased performance. Even in builds and test suites that are not predominately IO bound, I am willing to bet that the performance boost to the operating system will translate into large gains for the performance of the tests. My favorite statistic was that Windows XP booted in 2 seconds with their test configuration, and that was only because of device polling.

I don’t think that the cost of such a system is unreasonable. One 16 gigabyte drive using the Hyper Drive system would probably cost around $5000. Assuming an average developer cost of $100 an hour, it pays for itself once about 50 hours of developer time are saved, which is little more than a week of one developer’s time, let alone considering the benefits to an entire team. Come to think of it, I would argue that developers should have similar setups for their local machines. For instance, grep would probably be instantaneous with a DDR based drive.