Michael Bolin's 6.170 Recitation |
www.bolinfest.com |
10/14 Recitation 6 - Equality
1. The equals() ContractThe equals() method has the following specified requirements for non-null references:
Make sure that if you override equals(), then you override hashCode() so that for any two objects for which equals() is true that their hashCode() method returns the same int. Failure to do these two things will lead to insidious bugs. 2. Why equals(null) Returns FalseGiven that the method signature is:public boolean equals(Object o)the invocation equals(null) can do one of the following things:
Option 1: Return trueIfequals(null) were to return true ,
then intuitively, this would not make any sense
because a non-null object would be considered
equal to something that is not even an object.
This is inconsistent with our idea of equality.
Option 2: Throw an ExceptionBecauseequals() is a method of Object ,
it is a fundamental method in Java that is used often,
and in a variety of ways.
Thus, it would be inconvenient if it threw an exception when
passed a null reference when there is a more reasonable
response, namely:
Option 3: Return falseThis consistent with our idea of equality because it considers objects and non-objects unequal, so this appears to be the most reasonable result forequals(null) .
3. Template for Implementing equals()Suppose you are overridingequals() in a class called MyClass ;
use the following as a template for your implementation:
public boolean equals(Object obj) { if (!(obj instanceof MyClass)) { return false; } MyClass myClass = (MyClass)obj; // myClass is a non-null reference // add the code that is // specific to comparing // this and myClass here }Because (obj instanceof Object) will return false
if obj is null , the lines:
if (!(obj instanceof MyClass)) { return false; }will ensure that this implementation of equals()
fulfills the part of the contract the requires it to return
false when passed a null .
The instanceof keyword is used to test to see if the
reference on the left either implements or is a subclass of the
type on the right. Performing an instanceof test
is a relatively expensive operation in Java, so it should be
be avoided whenever possible, as it improves performance and
is also better style.
For example, if you had a List of Student
and Professor objects, then you would have to do something
like this when iterating over the list:
for (Iterator iter = list.iterator(); iter.hasNext(); ) { Object obj = iter.next(); if (obj instanceof Student) { Student student = (Student)obj; System.out.println("Name: " + student.getStudentName()); } else if (obj instanceof Professor) { Professor professor = (Professor)obj; System.out.println("Name: " + professor.getProfessorName()); } }It would be better style (and better performance) to create an interface Person with a method getName() that both
Student and Professor implement:
class Student implements Person { public String getName() { return getStudentName(); } } class Professor implement Person { public String getName() { return getProfessorName()); }Then the previous code could be replaced with: for (Iterator iter = list.iterator(); iter.hasNext(); ) { Person person = (Person)iter.next(); System.out.println("Name: " + person.getName()); }This is much more elegant. Unfortunately, a similar refactoring cannot be done with equals() to eliminate the call to instanceof .
However, Bloch (in Item 7 of Effective Java)
proposes making the first line of equals() the following:
if (obj == this) { return true; }As this is an inexpensive comparison to perform, this may save a lot of cycles if there is a fair chance that the object will be compared to itself via equals() .
You may think that you could solve this problem by doing something like this:
public class Point3D { private int x, y, z; public Point3D(int x, int y, int z) { this.x = x; this.y = y; this.z = z; } public boolean equals(Point3D p) { if (p == null) return false; return this.x == p.x && this.y == p.y && this.z == p.z; } }Unfortunately, this will not work. For example, if you try to run the following code: java.util.Set pointSet = new java.util.HashSet(); pointSet.add(new Point3D(1, 2, 3)); pointSet.add(new Point3D(1, 2, 3)); System.out.println(pointSet.size());the number 2 will be printed to the console even though
it seems that addition of the second point should be ignored, as
it is a duplicate. The catch is that Set tests for equality
by invoking the equals(Object) method.
Since Point3D does not override equals(Object) ,
it uses its default implementation:
public boolean equals(Object obj) { return obj == this; }Thus, the equals(Point3D) method is ignored, and unused, by Set .
This method will only be used when the compiler can match the types of the parameters with the method.
Examine the results of the following equals() comparisions for a better understanding
of how this works:
Point3D a = new Point3D(3, 14, 159); Point3D b = new Point3D(3, 14, 159); Object c = a; Object d = b; a.equals(b); // true a.equals(c); // true, uses equals(Object) but a == c in this case a.equals(d); // false a.equals((Point3D)d); // true c.equals(b); // false c.equals(d); // falseThus, make sure that you override public boolean equals(Object obj)
and not public boolean equals(MyType type) .
4. Overriding equals() in a SubclassConsider a classDuration and a class NanoDuration that extends it:
public class Duration { protected int min; protected int sec; public Duration(int min, int sec) { this.min = min; this.sec = sec; } } public class NanoDuration extends Duration { private int nano; public NanoDuration(int min, int sec, int nano) { super(min, sec); this.nano = nano; } }Now let's explore some implementations of equals() :
Try #1// Duration public boolean equals(Object obj) { if (!(obj instanceof Duration) return false; Duration d = (Duration)obj; return this.min == d.min && this.sec == d.sec; } // NanoDuration public boolean equals(Object obj) { if (!(obj instanceof NanoDuration) return false; NanoDuration nd = (NanoDuration)obj; return this.min == nd.min && this.sec == d.sec && this.nano == nd.nano; }At first, this seems reasonable; however, there is a problem because this violates the symmetry requirement of the equals() contract:
Duration d = new Duration(2, 71); NanoDuration nd = new NanoDuration(2, 71, 82); d.equals(nd); // true nd.equals(d); // false Try #2// Duration public boolean equals(Object obj) { if (obj == null || !obj.getClass().equals(Duration.class)) { return false; } Duration d = (Duration)o; return this.min == d.min && this.sec == d.sec; } // NanoDuration //suppose it inherits equals() from DurationThis seems like it might be an improvement; however, it is worse because it still violates symmetry, and what's worse is that it violates reflexivity in the subclass: NanoDuration nd = new NanoDuration(0, 6, 170); nd.equals(nd); // false: nd.getClass().equals(NanoDuration.class) Try #3// Duration public boolean equals(Object obj) { if (obj == null || !obj.getClass().equals(getClass()) { return false; } Duration d = (Duration)o; return this.min == d.min && this.sec == d.sec; } // NanoDuration public boolean equals(Object obj) { if (obj == null || !obj.getClass().equals(getClass()) { return false; } NanoDuration nd = (NanoDuration)o; return this.min == nd.min && this.sec == nd.sec && this.nano == nd.nano; }This is a better solution because it satisfies the equals() contract.
Unfortunately, it is impossible to compare a Duration with a
NanoDuration under this implementation; however, it is difficult
to define how such a comparision should be made, anyway.
The larger problem with this solution is that "harmless" subclasses cannot be
compared. For example, consider the following:
public class ArithmeticDuration extends Duration { public Duration addDuration(Duration d) { return new Duration(this.min + d.min, this.sec + d.sec); } }This leads to the following problem: Duration d = new Duration(4, 20); Duration a = new ArithmeticDuration(4, 20); d.getMinutes() == a.getMinutes(); // true d.getSeconds() == a.getSeconds(); // true d.equals(a); // falseThus, two objects that appear to be the same are not considered equal when compared with the equals() method. (Note that this problem would also exist if
we had used NanoDuration in place of ArithmeticDuration in
this example). Even though ArithmeticDuration is not fundamentally different
than its superclass, the implementation of equals() in Duration
makes the two types incomparable. Thus, this is not an ideal solution to our problem.
Try #4As Bloch advises in Item 14 of Effective Java: "Favor composition over inheritance." Instead of extendingDuration ,
NanoDuration could reuse Duration
by composing it as a field,
and delegating calls to it internally, as necessary:
public class NanoDuration { private Duration d; private int nano; public NanoDuration(int min, int sec, int nano) { d = new Duration(min, sec); this.nano = nano; } }Note that the client of NanoDuration need not know that
NanoDuration uses Duration .
Because Duration and NanoDuration do not extend one another,
there are no problems involving inheriting a broken equals() method.
However, we do lose something: previously, we could add both Duration
and NanoDuration objects to a Duration[] array.
Because NanoDuration is no longer a subclass, we can no longer do this.
The solution is to create an interface that both classes implement:
public interface TimePeriod { public int getMinutes(); public int getSeconds(); }Thus, we could create a TimePeriod[] array to which we could add
both Duration and NanoDuration objects.
Note that great care must be exercised when using the equals()
method of objects that are referred to through their interface,
as an interface rarely specifies how equals() must be implemented.
For example, if we have the following:
TimePeriod tp1, tp2; // suppose these were initialized outside this block of code java.util.Set set = new java.util.HashSet(); tp1.getMinutes() == tp2.getMinutes(); // true tp1.getSeconds() == tp2.getSeconds(); // true set.add(tp1); set.add(tp2); System.out.println(set.size());What gets printed out? From this code, there is no way to know because it is unclear what happens when add(tp2) is invoked and set tests if
tp1.equals(tp2) . Thus, be conscious of the use of equals()
in scenarios like this.
5. equals() and hashCode(): Best Friends ForeverBoth the equals(Object obj) and the hashCode() method are defined by Object, so every Java object inherits these two methods. According to the specification for hashCode():
The general contract of
![]() 6. Spelling Counts!When you override a method in Java, make sure that you spell it correctly. Unfortunately, the Java compiler will not signal an error if you declare either of the following methods:public boolean Equals(Object obj); public int hashcode();Instead, it will let you implement them, but will never invoke them because the spelling of the name does not exactly match that of the method it is intended to override. In the unfortunate event that you make this error, you will find that it is an extremely difficult bug to track down. Thus, one way to avoid this problem is to use the Override/Implement Methods... command in Eclipse to create these method signatures for you so that you can be sure that there will not be any typos. 7. Creating a Good Hash CodeConsider the following implementations forDuration.hashCode() :
Try #1return 1;This is a terrible hash code because it would make everything hash to the same bucket, making the hash code function as a linked list (eliminating the benefit of constant-time-lookup that a hash table should provide). Try #2return min;This is a better hash code; however, it will still result in a lot of collisions because every Duration
with the same value for min will end up in the same bucket.
Try #3return min + sec;Again, this is better, but there will still be collisions for a number of different durations. For example, when (min,sec) = (3,4) and (min,sec) = (4,3) ,
both will hash to the same bucket.
Try #4return sec << 16 + min;This is a bigger improvement because now only equal durations have equal hash codes (assuming minutes and seconds take on values between 0 and 60). However, if only the low order bits are used in the hash (which happens in techniques such as extensible hashing) then there will still be collisions when durations have the same minute. Note that this shows that just because you have unique hash codes for unique values, this does not guarantee a good hash. Try #5int result = 17; result = 37 * result + sec; result = 37 * result + min; return result;"Whoah -- where did this come from?" you ask. Well, the truth is that number theory people have discovered that multiplying by prime numbers helps spread out hash code values. In fact, they have discovered a lot of neat things about how to produce better hash codes, but they are often quite complicated, so it's probably easiest to look at the guidelines that Bloch has in Item 8 and apply them to your code rather than devise your own algorithms to acheive good hashing (unless you're into that sort of thing). Try #6Actually, there was nothing wrong with Try #5, but if thehashCode() method is going to be invoked often,
and is always going to return the same number, then it will help
to cache the hash code and to declare the hashCode()
method final so that the cost of method lookup and
computation will be reduced. For a large number of elements in a hashed data structure
(like nodes in the graph of the Stata Center), this will have a noticeable
impact on performance:
/** * <code>Duration</code> is immutable. */ public class Duration { protected final int min; protected final int sec; private final int hash; public Duration(int min, int sec) { this.min = min; this.sec = sec; int h = 37; h = 37 * h + sec; h = 37 * h + min; hash = h; } public boolean equals(Object obj) { if (this == obj) return true; if (!(obj instanceof Duration)) return false; Duration d = (Duration)obj; return min == d.min && sec == d.sec; } public final int hashCode() { return hash; } }Even if Duration were a mutable object,
we could still cache the hash code by updating the cached
value every time a Duration were mutated.
Alternatively, we could set a flag whenever it is mutated,
and then check that flag before returning the cached
value: if it has been mutated, then we consider the
cached value stale and then recalculate and cache the hash
before returning it.
The number of optimizations like this that you want to
encode depends on the nature of the problem if
hashCode() is inexpensive to compute,
and is not accessed often, then these mechanisms
will just make your code more difficult to maintain
without providing any real benefit to the client.
The important thing is to think carefully about
how likely hashCode() is to cause a collision,
and to make sure that it is in agreement with equals() .
8. EquivalenceIn 6.170, we talk about two types of equivalence:
In Java, == tests references for behavioral equivalence because if two references refer to the same object, then any operation performed using one reference must have the same effect as performing the operation on the other reference because they're being applied to the same object! This may lead you to believe that only references for which == returns true are behaviorially equivalent; however, this is not true. Consider the following: String a = new String("foo"); String b = new String("foo"); System.out.println(a == b); // prints falseIn the example above, a and b are not referentially equivalent; however, they are behaviorally equivalent. The general reason why this is true is because String is an immutable type. An immutable type doesn't have any mutators, so if two immutable objects are observationally equivalent, then they must also be behaviorally equivalent. Let's also look at an example of how to show that two things are not behaviorally equivalent: StringBuffer b1 = new StringBuffer(), b2 = new StringBuffer(); System.out.println(b1.length() == b2.length()); // prints true b1.append("foo"); System.out.println(b1.length() == b2.length()); // prints falseAha! We found a sequence of mutators (a sequence of one element: b1.append("foo")) that can distinguish between b1 and b2. Thus, b1 and b2 are not behaviorally equivalent. |
©2004 Michael Bolin | www.bolinfest.com |