I'm bringing up this topic just to throw in my two coppers on two things:
what DataCleaner does very well and very quickly, which is profiling data, and the separate challenge of 'scrubbing' data.
All the approaches above are indeed quite common, and will *always* need to be customized depending on which identity fields you have available to work with.
Between name parsers that understand prefixes, suffixes, and hyphenated names, soundex/metaphone for comparison (not entirely accurate, mind you, just helpful), nickname aliasing, and address standardization with lookup/address-moved algorithms, it is a really complex scope.
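To give a feel for just two of those pieces, here is a rough Python sketch of American Soundex combined with nickname aliasing for first names. It is only an illustration under my own assumptions: the `NICKNAMES` table and the `canonical` and `first_names_match` helpers are hypothetical names made up for this example, not part of DataCleaner or any ETL tool, and a real alias table would be far larger.

```python
# Soundex digit for each coded consonant (American Soundex).
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit

# Hypothetical, tiny alias table; a real one would hold many more entries.
NICKNAMES = {"bob": "robert", "bill": "william", "liz": "elizabeth"}


def soundex(name):
    """Return the 4-character American Soundex code, or "" if no letters."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w are skipped and do not separate equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # vowels reset prev, so a repeated code after one counts
        if len(result) == 4:
            break
    return result.ljust(4, "0")


def canonical(first_name):
    """Map a nickname to its formal name when the alias table knows it."""
    key = first_name.strip().lower()
    return NICKNAMES.get(key, key)


def first_names_match(a, b):
    """Loose comparison: resolve nicknames, then compare Soundex codes."""
    return soundex(canonical(a)) == soundex(canonical(b))
```

With this sketch, `first_names_match("Bob", "Robert")` is true, but so is `first_names_match("Bob", "Rupert")`, since Robert and Rupert share a Soundex code. That false positive is exactly the "helpful, not accurate" caveat, and it is why these comparisons usually get combined with other identity fields.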
Scrubbing data with these sometimes heavyweight approaches should be the area of specialized tools and/or integration with your favorite ETL tool (particularly if you can write custom components for your SSIS or, in my case, Pentaho Data Integration ETL job).