In this article I'll describe a possible base class for domain entities that
implements a surrogate key as its identity field and provides equality comparison
and hash code generation.
Introduction
Martin Fowler writes in his PoEAA book: "The identity
field saves a database ID field in an object to maintain identity between an
in-memory object and a database row."
And further he states: "The first concern is whether to use meaningful or
meaningless keys. A meaningful key is something like the U.S. Social Security
Number... A meaningless key is essentially a random number the database dreams
up that's never intended for human use."
There are many reasons why meaningful keys are often NOT good candidates for
an identity field. Primarily, they are often neither immutable (human errors
have to be corrected) nor unique. Thus Martin Fowler states: "... As a result,
meaningful keys should be distrusted. ..."
Having provided you with some background on the ongoing dispute over what makes a
good candidate for an identity field, I'll now make my choice. I always choose
meaningless keys as identity fields. Such fields are often called surrogate keys. Important:
"The surrogate key is not derived from application
data."
My favorite type of surrogate key is a GUID (globally unique
identifier). The algorithm used to generate a new GUID is designed so
that it is (nearly) impossible to generate the same ID twice (the probability
tends to zero).
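As a minimal sketch of what this looks like in .NET, the snippet below generates two GUIDs with Guid.NewGuid() and checks that they differ from each other and from Guid.Empty, the all-zero value used later in this article to mark an object without an identity. The class and method names are hypothetical and exist only for illustration.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class GuidDemo
{
    // Returns true when two freshly generated GUIDs differ from each other
    // and from Guid.Empty (the all-zero GUID).
    public static bool NewIdsLookUnique()
    {
        Guid first = Guid.NewGuid();
        Guid second = Guid.NewGuid();

        return first != second
            && first != Guid.Empty
            && second != Guid.Empty;
    }
}
```

A collision is not strictly impossible, only so unlikely that it can be ignored in practice.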
NHibernate supports GUID as one possible type for the identity field.
Problem Description
When dealing with NHibernate one often uses a special type of collection
known as a Set. A set is a collection that contains no duplicate
elements. More formally, sets contain no pair of elements e1
and e2 such that e1.Equals(e2), and at most one null
element. As a Set is not provided by the .NET framework, NHibernate uses the
Iesi.Collections library, which contains an implementation of a set.
The definition above names the predicate that decides
whether two elements are the same or not: the
Equals function. By default, the Equals function of a reference type
compares object references. So if two variables e1
and e2 refer to two different instances of a class, Equals will
always return false. But we want to use the identity field as the relevant part
in the comparison of two instances. If two different instances have the
same identity field, then they are equal (that is, they refer to
the same database record).
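The problem can be reproduced with a small sketch. It uses System.Collections.Generic.HashSet<T> as a stand-in for the Iesi set (an assumption made to keep the sample free of external dependencies), and PlainCustomer is a hypothetical entity that overrides neither Equals nor GetHashCode.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity that relies on the default Equals and GetHashCode.
public class PlainCustomer
{
    public Guid Id { get; set; }
}

public static class DefaultEqualityDemo
{
    // Adds two distinct instances carrying the same Id to a set
    // and returns the resulting element count.
    public static int CountAfterAddingTwin()
    {
        Guid sharedId = Guid.NewGuid();
        var first = new PlainCustomer { Id = sharedId };
        var second = new PlainCustomer { Id = sharedId };

        var set = new HashSet<PlainCustomer>();
        set.Add(first);
        set.Add(second); // accepted, because the default Equals compares references

        return set.Count; // 2, although both instances stand for the same record
    }
}
```

With the default behavior the set ends up holding the same database record twice, which is exactly what a set is supposed to prevent.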
Implementation
The default implementation of the Equals function is found in the
System.Object class, from which all other classes in .NET implicitly or
explicitly inherit. Fortunately the Equals function is virtual, so we are able
to override it. But when overriding the Equals function we also have to
override the GetHashCode function, so that two objects considered equal also
return the same hash code.
Assuming that we take a GUID called Id as the
identity field, we can define the following base class from which all our domain
classes will directly or indirectly inherit:
public class IdentityFieldProvider<T>
    where T : IdentityFieldProvider<T>
{
    private Guid _id;

    public virtual Guid Id
    {
        get { return _id; }
        set { _id = value; }
    }
}
Now let's override the Equals method. A possible solution is:
public override bool Equals(object obj)
{
    T other = obj as T;
    if (other == null)
        return false;

    // handle the case of comparing two NEW objects
    bool otherIsTransient = Equals(other.Id, Guid.Empty);
    bool thisIsTransient = Equals(Id, Guid.Empty);
    if (otherIsTransient && thisIsTransient)
        return ReferenceEquals(other, this);

    return other.Id.Equals(Id);
}
We have to distinguish three possible cases. The first one is that the
user/developer wants to compare two objects of different types. This case is
trivial; the answer is ALWAYS "not equal". The second case is when the two
objects are both new (also called transient); they are then equal only if the two
references point to the same instance. The third case simply uses the implementation of
the Equals method of the GUID type to check for equality.
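The three cases can be exercised with a short sketch. Customer and Order are hypothetical entity classes, and the base class is a condensed copy of the code above so that the sample compiles on its own. Its GetHashCode is only a placeholder that silences the compiler warning about overriding Equals alone; the real implementation is discussed next.

```csharp
using System;

// Condensed copy of the article's base class (Equals only).
public class IdentityFieldProvider<T> where T : IdentityFieldProvider<T>
{
    public virtual Guid Id { get; set; }

    public override bool Equals(object obj)
    {
        T other = obj as T;
        if (other == null)
            return false;

        bool otherIsTransient = other.Id == Guid.Empty;
        bool thisIsTransient = Id == Guid.Empty;
        if (otherIsTransient && thisIsTransient)
            return ReferenceEquals(other, this);

        return other.Id.Equals(Id);
    }

    // Placeholder only; not the article's final hash code logic.
    public override int GetHashCode() { return base.GetHashCode(); }
}

// Hypothetical domain classes.
public class Customer : IdentityFieldProvider<Customer> { }
public class Order : IdentityFieldProvider<Order> { }

public static class EqualityCasesDemo
{
    // Case 1: objects of different types are never equal, even with the same Id.
    public static bool DifferentTypesAreEqual()
    {
        Guid id = Guid.NewGuid();
        var customer = new Customer { Id = id };
        var order = new Order { Id = id };
        return customer.Equals(order);
    }

    // Case 2: two distinct transient objects (Id == Guid.Empty) are not equal.
    public static bool TwoTransientsAreEqual()
    {
        var first = new Customer();
        var second = new Customer();
        return first.Equals(second);
    }

    // Case 2, continued: a transient object is still equal to itself.
    public static bool TransientEqualsItself()
    {
        var customer = new Customer();
        return customer.Equals(customer);
    }

    // Case 3: distinct instances with the same Id are equal.
    public static bool SameIdIsEqual()
    {
        Guid id = Guid.NewGuid();
        var first = new Customer { Id = id };
        var second = new Customer { Id = id };
        return first.Equals(second);
    }
}
```

Only TransientEqualsItself and SameIdIsEqual return true; the other two methods return false.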
Now we also have to override the GetHashCode method, which is likewise inherited
from System.Object:
private int? _oldHashCode;

public override int GetHashCode()
{
    // Once we have a hash code we'll never change it
    if (_oldHashCode.HasValue)
        return _oldHashCode.Value;

    bool thisIsTransient = Equals(Id, Guid.Empty);

    // When this instance is transient, we use the base GetHashCode()
    // and remember it, so an instance can NEVER change its hash code.
    if (thisIsTransient)
    {
        _oldHashCode = base.GetHashCode();
        return _oldHashCode.Value;
    }

    return Id.GetHashCode();
}
Why this kind of code, you might ask? An object should
never change its hash code during its lifetime, that is, from the moment the
object is instantiated until it is discarded; otherwise a hash-based collection
such as a set can no longer find it. If an object is loaded from the
database there is no problem, since every existing database record has a
well-defined and unique identity field. Thus we can derive the hash code from
this Id field. This is done in the last line of the code snippet
above.
A little more problematic is the case where a new object is created in
memory. Its identity field is then undefined (the object has not yet been saved to
the database and is thus considered transient). In our
case, undefined means that the Id field has a value of Guid.Empty. In
this case we use the default implementation (from System.Object) of the
GetHashCode method to generate a hash code, and we store it in an
instance variable for later reference.
Later in its life cycle the instance may be persisted to the database
(while it continues to live in memory). At that moment NHibernate
assigns a new unique value to the Id field of the instance. The object is no longer
transient, but the first two lines of the method prevent its hash code
from changing. It is still the same object as before; it has just been
made persistent.
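The effect of the cached hash code can be checked in isolation. The sketch below is an assumption-laden miniature: HashedEntity is a hypothetical class that contains only the GetHashCode logic described above, and assigning Id by hand stands in for NHibernate assigning the identity on save.

```csharp
using System;

// Hypothetical class carrying only the hash code logic from this article.
public class HashedEntity
{
    private int? _oldHashCode;

    public virtual Guid Id { get; set; }

    public override int GetHashCode()
    {
        // A hash code that was handed out once is never changed.
        if (_oldHashCode.HasValue)
            return _oldHashCode.Value;

        // Transient instance: fall back to the object-based hash and remember it.
        if (Id == Guid.Empty)
        {
            _oldHashCode = base.GetHashCode();
            return _oldHashCode.Value;
        }

        return Id.GetHashCode();
    }
}

public static class HashStabilityDemo
{
    // A transient object keeps its hash code after it receives an Id.
    public static bool HashSurvivesSave()
    {
        var entity = new HashedEntity();
        int before = entity.GetHashCode(); // computed and cached while transient

        entity.Id = Guid.NewGuid();        // simulates the identity assigned on save
        int after = entity.GetHashCode();

        return before == after;
    }

    // An object that already has an Id derives its hash code from that Id.
    public static bool LoadedEntityUsesIdHash()
    {
        Guid id = Guid.NewGuid();
        var entity = new HashedEntity { Id = id };
        return entity.GetHashCode() == id.GetHashCode();
    }
}
```

Both methods return true. The sketch also makes one consequence visible: an instance that was hashed while transient keeps its object-based hash code, so it will generally not share a hash code with a second instance that is later loaded with the same Id.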
Finally, we can also overload the two operators '==' and '!=' to make it
possible to compare two instances with those operators instead of only with the
Equals method:
public static bool operator ==(IdentityFieldProvider<T> x, IdentityFieldProvider<T> y)
{
    return Equals(x, y);
}

public static bool operator !=(IdentityFieldProvider<T> x, IdentityFieldProvider<T> y)
{
    return !(x == y);
}
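One detail worth seeing in action is that the static Equals(x, y) call handles null operands before calling the instance method, so the operators are safe to use with null references. The following sketch shows this with SimpleEntity, a hypothetical and deliberately simplified class: it is not generic and compares by Id only, without the transient handling shown earlier.

```csharp
using System;

// Hypothetical, simplified entity: equality by Id only.
public class SimpleEntity
{
    public Guid Id { get; set; }

    public override bool Equals(object obj)
    {
        SimpleEntity other = obj as SimpleEntity;
        // ReferenceEquals avoids going through the overloaded == operator.
        if (ReferenceEquals(other, null))
            return false;
        return other.Id.Equals(Id);
    }

    public override int GetHashCode() { return Id.GetHashCode(); }

    public static bool operator ==(SimpleEntity x, SimpleEntity y)
    {
        // Static object.Equals checks for null before calling x.Equals(y).
        return Equals(x, y);
    }

    public static bool operator !=(SimpleEntity x, SimpleEntity y)
    {
        return !(x == y);
    }
}

public static class OperatorDemo
{
    // Two distinct instances with the same Id compare equal with ==.
    public static bool SameIdComparesEqual()
    {
        Guid id = Guid.NewGuid();
        var first = new SimpleEntity { Id = id };
        var second = new SimpleEntity { Id = id };
        return first == second && !(first != second);
    }

    // Comparing with null neither throws nor reports equality.
    public static bool InstanceEqualsNull()
    {
        var entity = new SimpleEntity { Id = Guid.NewGuid() };
        return entity == null;
    }

    // Two null references compare equal.
    public static bool NullEqualsNull()
    {
        SimpleEntity first = null;
        SimpleEntity second = null;
        return first == second;
    }
}
```

SameIdComparesEqual and NullEqualsNull return true, while InstanceEqualsNull returns false.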
That's it. You can now use this class as the base for every entity class in
your domain and never ever have to think about the identity field and the
equality of objects. It just happens...