Topic: Validating Danish names
Hi everybody,
Attached to ticket #25 are some text files that can be used to validate Danish names with the Dictionary Lookup validation rule. I'm posting this because I know there are a couple of Danish users out there, so please enjoy!
On another note: if anyone has similar validation data available, please go ahead and post it, and we will make sure it gets into the DataCleaner resources project, which will contain regexes, dictionaries and other input data for making efficient use of the data quality framework.
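For anyone who wants to try the dictionary lookup idea outside DataCleaner, here is a minimal Python sketch. It is not DataCleaner's own implementation, and it assumes the text files hold one name per line, which may not match the files attached to the ticket. The function names and the sample data are made up for illustration.

```python
import io


def load_dictionary(lines):
    """Build a case-insensitive lookup set from an iterable of text lines.

    Assumption: one name per line; blank lines are skipped.
    """
    return {line.strip().casefold() for line in lines if line.strip()}


def is_valid_name(name, dictionary):
    """Return True if the trimmed, case-folded name is in the dictionary."""
    return name.strip().casefold() in dictionary


# Hypothetical sample standing in for one of the attached text files.
sample_file = io.StringIO("Kasper\nBørge\nMette\n")
first_names = load_dictionary(sample_file)

print(is_valid_name("kasper", first_names))
print(is_valid_name("Xyzzy", first_names))
```

With a real file you would pass an open file handle (opened with encoding="utf-8" so that letters such as æ, ø and å survive) instead of the in-memory sample.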
On another note, I found this website, which contains a lot of official data about Danish geography, companies, etc.:
http://www.informationsportalen.dk/databaser/baser/index.htm
Perhaps there are similar portals around for other countries?
A full list of all Danish female first names, male first names and last names is available from the Danish statistical bureau. They charge a fee for delivery.
The list also contains the frequency of each name, which is useful for validating names more intelligently.
Furthermore, the list contains the age distribution for each first name. This is useful if you want to estimate the probable age of a person with a given name: say, the probable age of a male called ‘Kasper’ is 22, and the probable age of a male called ‘Børge’ is 77.
Similar lists are available from the Swedish and Norwegian statistical bureaus.
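To make the frequency and age ideas above concrete, here is a rough Python sketch. The numbers are invented so that the averages come out at the 22 and 77 used in the example; they are not figures from the statistical bureau, and the layout of the real list is not known here. "Probable age" is read as the count-weighted mean age, which is only one possible interpretation.

```python
# Invented age distributions: name -> {age: number of bearers}.
# These are illustrative values, not data from any statistical bureau.
AGE_DISTRIBUTION = {
    "Kasper": {20: 50, 22: 100, 24: 50},
    "Børge": {75: 10, 77: 20, 79: 10},
}

# Total number of bearers per name, derived from the table above.
NAME_COUNTS = {name: sum(ages.values()) for name, ages in AGE_DISTRIBUTION.items()}


def expected_age(age_counts):
    """Return the count-weighted mean age for one name."""
    total = sum(age_counts.values())
    if total == 0:
        raise ValueError("no bearers recorded for this name")
    return sum(age * count for age, count in age_counts.items()) / total


def relative_frequency(name, name_counts):
    """Return the share of all recorded bearers that carry this name."""
    total = sum(name_counts.values())
    return name_counts.get(name, 0) / total if total else 0.0


for name, ages in AGE_DISTRIBUTION.items():
    print(name, expected_age(ages), relative_frequency(name, NAME_COUNTS))
```

A validator could then treat a name with a relative frequency of zero as unknown and a very low one as worth a manual check, rather than giving a plain yes/no answer. Where to put that threshold is a judgment call that depends on the data.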
Great, I'll add this info to ticket #25!


