Topic: Some useful dictionaries
Hi everybody,
I've been trying out your product and lurking on these forums for a couple of weeks, and I really like what I'm seeing. I've been using DataCleaner to profile a dataset consisting mainly of people data, such as names, nationalities, etc.
For this I started with the provided name dictionaries, but I quickly realized that they only cover Danish names. Anyway, I've collected four new dictionaries that I think could be useful to everyone, so I'd like to know whether you want them for your application.
The dictionaries are: list of nationalities, list of countries, US top 1000 male names, US top 1000 female names.
Hello beno,
Yes, this is really something we could use. I agree that DataCleaner should ship with more dictionaries, and I think we will put a lot more work into this over the next couple of months. If you could attach them as text files to ticket #25, that would be great!
I've uploaded the dictionaries that I had. I also included a list of US states and took a little time to find US last names as well. Now I'm off to bed, nightie :)
For reference, you can find US names here:
http://www.ssa.gov/OACT/babynames/
I just used the most popular names of 2007, but I'm now wondering whether that was a bad decision. It would be better to rank names by their frequency in the whole population rather than among newborns.
Thank you very much beno!
Regarding the baby names, I will look for other sources for this information and then build a merged dictionary from them. Thanks for the contributions.
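For anyone who wants to try the merging step themselves, here is a minimal sketch in Python. It assumes each source is a plain text file with one name per line; the helper names `merge_dictionaries` and `read_names` are made up for illustration and are not part of DataCleaner.

```python
from pathlib import Path
from typing import Iterable


def merge_dictionaries(sources: Iterable[Iterable[str]]) -> list[str]:
    """Merge several name lists into one sorted, de-duplicated list.

    Names are compared case-insensitively; the first spelling seen is kept.
    (Hypothetical helper, assuming one name per entry.)
    """
    merged: dict[str, str] = {}
    for source in sources:
        for line in source:
            name = line.strip()
            if name:  # skip blank lines
                merged.setdefault(name.casefold(), name)
    return sorted(merged.values(), key=str.casefold)


def read_names(path: Path) -> list[str]:
    """Read a one-name-per-line text file (assumed UTF-8 encoded)."""
    return path.read_text(encoding="utf-8").splitlines()
```

Each file can be loaded with `read_names` and the resulting lists passed to `merge_dictionaries`; the merged list can then be written back out one name per line.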
Thank you beno and kasper,
I applied the US name dictionaries yesterday, and they got my job done before the Christmas break! Cheers and have a merry Christmas.


