Posts Tagged ‘java’

Granularity of Abstractions

Thursday, October 23rd, 2008

In Java, iterating over a collection and collecting some objects based on a condition is a classic example of something to put into a method. Here is an example:

  1. private Collection<Claim> getExpiredClaims() {
  2.     List<Claim> expiredClaims = new ArrayList<Claim>();
  3.     for (Claim claim : allClaims) {
  4.         if (claim.isExpired()) {
  5.             expiredClaims.add(claim);
  6.         }
  7.     }
  8.     return expiredClaims;
  9. }

In Ruby, you can do this in one line.

  1. all_claims.select { |claim| claim.expired? }

It’s lovely how much more succinct that operation becomes. It can actually get more succinct by using Symbol#to_proc:

  1. all_claims.select (&:expired?)

There is still the question of whether to move this to a method. Now, moving it to a method doesn’t reduce multi-line duplication the way it would in the Java example; there is only a minor amount of single line duplication that can be reduced.

The resistance I often get to refactoring a single line to a method is that it creates more total lines in the file with the def/end lines and spacing. Also, it takes a slight amount of effort to do the refactoring, and some developers won’t refactor unless the need is glaring and obvious.

There are still important reasons to move this to a method, and those reasons are better readability of code and reducing duplication of expressions. Without the refactoring, you’ll see code that looks like this:

  1. def mark_expired_claims_for_review
  2.     all_claims.select (&:expired?).each(&:needs_review!)
  3. end
  4.  
  5. def notify_claim_agents_of_expired_claims
  6.     all_claims.select (&:expired?).each(&:notify_agent_of_expiration)
  7. end

After the refactoring, this is what it would look like.

  1. def expired_claims
  2.     all_claims.select(&:expired?)
  3. end
  4.  
  5. def mark_expired_claims_for_review
  6.     expired_claims.each(&:needs_review!)
  7. end
  8.  
  9. def notify_claim_agents_of_expired_claims
  10.     expired_claims.each(&:notify_agent_of_expiration)
  11. end

It’s a minor point perhaps, and I’m sure that I could come up with much more excessive examples, but I just wanted to focus on a simple example. By replacing the expression with a symbolic reference, a method in this case, you express in English something that is a programmatic operation. This improves readability quite a bit.

Also, there is the additional benefit of abstracting on the concept of expired claims. By having multiple places using “all_claims.select(&:expired?)” to express expired claims, you duplicate the implementation detail that expired claims are derived from a larger collection of claims. This may not always be true, and a change in that derivation results in a change in many places.

Perhaps the question here is how DRY should you make your code. I’m still undecided on this on this point, but I think that the amount of work that it takes to reduce duplication to this level is minimal, and the result will be an important part of a pristine codebase.

Getting rid of switch statements with Java Enums

Sunday, October 19th, 2008

I recently saw an interesting and polymorphic way to get rid of using a case statement when using enums. This is possible by defining a method for each instance of an enum.

I’m sure that you have seen code like this:

  1. enum Friend {
  2.     Joey, Chandler;
  3. }

And then somewhere in the code, you might see:

  1. class SomeObjectThatNeedsToKnowBestFriends {
  2.  
  3.     void doSomethingWithBestFriends() {
  4.         for (Friend friend : Friend.values()) {
  5.             doSomething(bestFriend(friend));
  6.         }
  7.     }
  8.  
  9.     Friend bestFriend(Friend friend) {
  10.          switch (friend) {
  11.              case Joey: return Friend.Chandler;
  12.              case Chandler: return Friend.Joey;
  13.              default: throw new RuntimeException(“This person has no friend”);
  14.          }
  15.     }
  16. }

This is a common smell in a code base where client code has logic that should be better encapsulated. Now, whenever someone adds a friend, they are going to have to search for references on Friends and add a new entry, or they are going to get a RuntimeException from the switch statement. In a well tested codebase, there is probably going to be a unit test that asserts that all Friends have best friends. In any case, the switch statement in the client object code is not great from an OO standpoint and it creates a maintainability issue.

The first refactoring is to move all Friend related logic to the Friend enum where it belongs.

  1. enum Friend {
  2.     Joey, Chandler;
  3.  
  4.     Friend bestFriend() {
  5.          switch (this) {
  6.              case Joey: return Chandler;
  7.              case Chandler: return Joey;
  8.              default: throw new RuntimeException(“This person has no friend”);
  9.          }
  10.     }
  11.  
  12. }

Now, we can just ask the Friend who their best friend is.

  1. Joey.bestFriend(); –> returns Chandler;

Nice. Still, I’m not wild about that switch statement. Developers still have to know to update it, and really, I’d rather not even have to throw an exception because it was misused. It would be better if the structure of the code did not allow misuse.

Here is an example of how to do this:

  1. enum Friend {
  2.     Joey {
  3.         Friend bestFriend() { return Chandler; }
  4.     },
  5.     Chandler {
  6.         Friend bestFriend() { return Joey; }
  7.     }
  8.     abstract Friend bestFriend();
  9. }

Then,

  1. Joey.bestFriend(); –> returns Chandler;

Great, now we know that when someone adds a new friend, they will immediately be confronted with having to supply a best friend. My only issue with this approach is that all the method definitions become verbose when you introduce many methods like this. I tried different approaches to solving this problem, but due to the enums referencing each other, I was not able to do a different approach.

Here is an example of what you can’t do:

  1. enum Friend {
  2.     Joey (Chandler),
  3.     Chandler(Joey)
  4.  
  5.     final Friend bestFriend;
  6.  
  7.     Friend(Friend bestFriend) {
  8.         this.bestFriend = bestFriend;
  9.     }
  10.  
  11.     Friend bestFriend() {
  12.         return bestFriend;
  13.     }
  14. }

This won’t work because you can’t reference Chandler in the enum definition for Joey. The Chandler Enum hasn’t been defined yet, so this won’t even compile. However, you can “trick” the compiler by fully referencing Chandler using Friend.Chandler;

  1. enum Friend {
  2.     Joey (Friend.Chandler),
  3.     Chandler(Joey)
  4.  
  5.     final Friend bestFriend;
  6.  
  7.     Friend(Friend bestFriend) {
  8.         this.bestFriend = bestFriend;
  9.     }
  10.  
  11.     Friend bestFriend() {
  12.         return bestFriend;
  13.     }
  14. }

However, the result is not what we want:

Joey.bestFriend(); –> null
Chandler.bestFriend(); –> Joey

Even though I can reference the other enum instance this way, it resolves to null. The reason lies in the fact that when the Enum is compiled, each instance is a static final field, and initialized in a static block. Here is a snippet of the generated code:

  1. public static final Friend Joey;
  2.     public static final Friend Chandler;
  3.     final Friend bestFriend;
  4.     private static final Friend ENUM$VALUES[];
  5.  
  6.     static
  7.     {
  8.         Joey = new Friend(“Joey”, 0, Chandler);
  9.         Chandler = new Friend(“Chandler”, 1, Joey);
  10.         ENUM$VALUES = (new Friend[] {
  11.             Joey, Chandler
  12.         });
  13.     }

Interestingly, I can write a program that tries the fully qualified name for explicit static constants, and it works:

  1. public class StaticEnumClass {
  2.  
  3.     static final String foobar = “foo” + StaticEnumClass.bar;
  4.     static final String bar = “bar”;
  5.    
  6.     public static void main(String[] args) {
  7.         System.out.println(foobar); // prints out "foobar"
  8.     }
  9.    
  10. }

But if I use a static initialization, it doesn’t:

  1. public class StaticEnumClass {
  2.  
  3.     static final String foobar;
  4.     static final String bar;
  5.    
  6.     static {
  7.         foobar = “foo” + StaticEnumClass.bar;
  8.         bar = “bar”;
  9.     }
  10.    
  11.     public static void main(String[] args) {
  12.         System.out.println(foobar); // prints "foonull"
  13.     }
  14.    
  15. }

Interesting. The compiler is smart enough to resolve the correct value when you don’t use a static initialization block. Back to my original example with Friends. The next attempt was to create an anonymous constructor (actually, an instance initializer) for the enum instances and see if I could get what I want:

  1. enum Friend {
  2.     Joey {
  3.         {
  4.             bestFriend = Chandler;
  5.         }
  6.     },
  7.     Chandler {
  8.         {
  9.             bestFriend = Joey;
  10.         }
  11.     }
  12.  
  13.     Friend bestFriend;
  14.  
  15.     Friend bestFriend() {
  16.         return bestFriend;
  17.     }
  18. }

I had to remove the final from bestFriend, since I’m inializing the value when the object is instantiated using an instance initializer. This compiles, and seems like an okay approach. My hope was that the references to other enum types would get resolved in much the same manner as in the case of creating a method that returns each one. Interestingly, this doesn’t happen.

  1. Joey.bestFriend(); –> null
  2. Chandler.bestFriend(); –> Joey

The reason is that even though I am using an instance initializer, it’s being called from a static block since the instances are created in a static block. Turns out to be a naive attempt. Here is what it ends up looking like when the enum gets generated as a class:

  1. static
  2.     {
  3.         Joey = new Friend(“Joey”, 0) {
  4.  
  5.            
  6.             {
  7.                 bestFriend = Friend.Chandler;
  8.             }
  9.         }
  10. ;
  11.         Chandler = new Friend(“Chandler”, 1) {
  12.  
  13.            
  14.             {
  15.                 bestFriend = Friend.Joey;
  16.             }
  17.         }
  18. ;
  19.         ENUM$VALUES = (new Friend[] {
  20.             Joey, Chandler
  21.         });
  22.     }

Oh well. That’s as far as my experimentation went. I’m satisfied that I can at least create an anonymous subtype of an enum that returns the correct value, but if anyone has any ideas on how to do this in a cleaner way, let me know.

The Law of Demeter

Saturday, May 10th, 2008

The other day I came across a reference to something called the law of demeter. With much sadness, I realized that I could not remember what that law was. It sounded familiar, and was probably an answer to some test question in college. Nonetheless, I had forgotten it, and I felt that it was my duty to look it up.

Wikipedia had a nice page[1] on it, and I was happy to know that I was following this policy already. In fact, the concept is core to well encapsulated code and a good domain model.

It’s a shame that it has such a pedantic sounding name; I don’t see myself being able to tell someone that their code would better if they only followed the law of demeter. Principle of least knowledge sounds better, but it’s still kind of vague. Unfortunately, it’s a very commonly violated principle, but it is one that should guide our default programming behavior. That is, when we aren’t being forced to violate it because of an api or legacy code base.

Anyway, to summarize what it is, it basically says that an object A can only use other objects that are fields, parameters to a method, or created inside of A. Here is a nicely contrived example of violating this principle:

  1. class Book {
  2.     double cost;
  3.     public double getCost() { return cost; }
  4. }
  5.  
  6. class ShoppingCart {
  7.     List<Book> books;
  8.     public List<Book> getBooks() { return books; }
  9. }
  10.  
  11. public class SalesCalculator {
  12.     public void calculateTotalCost(ShoppingCart cart) {
  13.         double total = 0;
  14.         for (Book item : cart.getBooks()) {
  15.             total += item.getCost();
  16.         }
  17.         System.out.println(“Total cost is: “ + total);
  18.     }
  19. }

Actually, this isn’t that contrived. For some reason, people love putting calculations into a class with Calculator in the name. In this example, SalesCalculator knows about more than just the ShoppingCart it’s given. It knows about the internals of the shopping cart. Not only does it know that it has books, but that it stores the books in a List. Now, the getter provides a slight appearance that it doesn’t really know that it’s stored in a list, but in this example, it’s returning the exact list object that it is storing internally. In this situation, if I modify the list outside of the ShoppingCart, then it’s internal state changes. We may as well be accessing books by doing a cart.books and leave out the getter code.

I’m digressing here; there is probably some other elitist sounding principle that basically says that I should not be able to mutate an objects internal state from outside the object. An improved version of the code would look like:

  1. class Book {
  2.     double cost;
  3.     public double getCost() { return cost; }
  4. }
  5.  
  6. class ShoppingCart {
  7.     List<Book> books;
  8.     public double calculateTotalCost() {
  9.         double total = 0;
  10.         for (Book book : books) {
  11.             total += book.getCost();
  12.         }
  13.         return total;
  14.     }
  15. }
  16.  
  17. public class SalesCalculator {
  18.     public void calculateTotalCost(ShoppingCart cart) {
  19.         System.out.println(“Total cost is: “
  20. + cart.calculateTotalCost());
  21.     }
  22. }

That’s better. Now SalesCalculator knows nothing about Book objects, nor is it playing around with the List that shopping cart was using to hold the books. In fact, it looks like the SalesCalculator barely has a reason to exist. And it probably shouldn’t, not as a calculator. Maybe it could be a SalesPrinter or something.

The benefit of this increases as the logic of a domain model grows. Another common smell are objects like Managers:

  1. class ShoppingCart {
  2.     List<Book> books;
  3.     int numberOfComicBooks;
  4.     public List<Book> getBooks() {
  5.         return books;
  6.     }
  7.     public int getNumberOfComicBooks() {
  8.         return numberOfComicBooks;
  9.     }
  10.     public void setNumberOfComicBooks(int numberOfComicBooks) {
  11.         this.numberOfComicBooks = numberOfComicBooks;
  12.     }
  13. }
  14.  
  15. class ShoppingCartManager {
  16.     public void addBookToCart(ShoppingCart cart, Book book) {
  17.         cart.getBooks().add(book);
  18.         if (book.isComicBook) {
  19.             cart.setNumberOfComicBooks(
  20. cart.getNumberOfComicBooks() + 1);
  21.         }
  22.     }
  23. }

Why can’t the Shopping cart manage the adding of books and counting of comic books itself? If it did we would have less code anyway:

  1. class ShoppingCart {
  2.     List<Book> books;
  3.     int numberOfComicBooks;
  4.  
  5.     public void add(Book book) {
  6.         books.add(book);
  7.         if (book.isComicBook()) {
  8.             numberOfComicBooks++;
  9.         }
  10.     }
  11. }

Some developers have a default behavior that results in less code following the law of demeter (and tosses encapsulation in the garbage). That is, when they create an object, they immediately have their IDE generate getters and setters for all the fields they added. I think the idea is that it will save them time later or some nonsense like that. The unfortunate reality is that when I begin changing the ShoppingCart class, I’m going to have to traverse a tangled web of references across the code base trying to figure out what other objects are holding onto the ShoppingCarts book list and changing it on me. It gets even worse if you introduce multiple threads in to the mix.

[1] http://en.wikipedia.org/wiki/Law_of_Demeter