DataCleaner is an Open Source application for profiling, validating and comparing data. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation.
DataCleaner is the free alternative to software for master data management (MDM) methodologies, data warehousing (DW) projects, statistical research, preparation for extract-transform-load (ETL) activities and more.
Who uses DataCleaner?
Read about where DataCleaner is being used and what our users think.
See it in action
Learning how to use DataCleaner can be a visual and interesting experience. Start by taking a look at our online webcasts and screenshots.
Take the user survey
Please help us gain a better understanding of our users' impressions and interests by filling out our new user survey.
Open Source software is software whose source code is shared with everyone. This means that many
people can contribute to the code, that improvements are integrated rapidly and that planning
is largely driven by users.
DataCleaner is licensed under the terms of the GNU Lesser General Public License (LGPL), which allows anyone to use the software for any purpose. If you distribute a modified version of the code, the LGPL requires you to make those modifications available under the same license.
Master Data Management (MDM) can loosely be translated as "management of reference data". This means
all the data that is non-transactional and can be shared globally within a company or organization. Examples
are datasets containing customers, employees, products, office locations etc. Performing data quality analysis
on this data is crucial to ensure that it is useful and used correctly throughout the organization.
Extract-Transform-Load (ETL) is the process of copying and transforming data from one or more datasources
to another. Typically you would use a tool like DataCleaner before, during and after any ETL activity.
- Before, to gain insight into the datasources that you are about to use in your work. We typically refer to this as looking below the tip of the iceberg of data.
- During, if (when) you run into any unexpected mismatches during the ETL process.
- After, to ensure consistency and quality in the datasource that you have populated.
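To make the "before" step concrete, here is a minimal sketch of what profiling a single column can look like. It is plain Python using only the standard library and is not DataCleaner's own API; the function name `profile_column` and the sample data are assumptions made up for illustration.

```python
import csv
import io
from collections import Counter


def profile_column(rows, column):
    """Return simple profile statistics for one column of a list of dict rows.

    Hypothetical helper for illustration only, not part of DataCleaner.
    """
    values = [row.get(column) for row in rows]
    # Treat missing values and whitespace-only strings as blank.
    non_blank = [v.strip() for v in values if v is not None and v.strip() != ""]
    counts = Counter(non_blank)
    return {
        "row_count": len(values),
        "blank_count": len(values) - len(non_blank),
        "distinct_count": len(counts),
        "most_common": counts.most_common(3),
    }


# Made-up sample data standing in for a source file you are about to load.
SAMPLE_CSV = """id,name,country
1,Alice,DK
2,Bob,
3,Carol,DK
4,Dave,SE
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
profile = profile_column(rows, "country")
print(profile)
```

Running the same profile on the populated target after the load and comparing the two results is one simple way to approach the "after" step.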
DataCleaner is quite possibly the easiest available data quality application.
Within just a few clicks you can get yourself an insightful overview of your data.
DataCleaner gives you the power to customize while respecting the
pleasantness of simplicity.
If you want to take your Data Quality skills to the next level, participating
in the development of DataCleaner is a great way to do it. We have very
interesting tasks for a lot of different kinds of people. If you're a programmer you can
of course help out with developing the source code, but we also need people to help out
with the website, answer questions on the forums and voice their opinions on the mailing
lists.
With our new RegexSwap forum facility we promote the idea of accessing and sharing a wide
range of Data Quality content directly on the website. By sharing, discussing and
voting on the provided content, the DataCleaner DQ model gives you the instructions,
support and quality you need in your DQ deployments.
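As a rough illustration of how a shared regex can be used for validation, the sketch below splits a list of values into those that match a pattern and those that do not. It is plain Python rather than DataCleaner code; the pattern (four ASCII digits, the shape of a Danish postal code), the `validate` helper and the sample values are all assumptions chosen for the example, not entries taken from the RegexSwap.

```python
import re

# Assumed example pattern: exactly four ASCII digits, the shape of a Danish postal code.
POSTAL_CODE_DK = re.compile(r"[0-9]{4}")


def validate(values, pattern):
    """Split values into (valid, invalid) by whether the whole value matches the pattern.

    Hypothetical helper for illustration only.
    """
    valid, invalid = [], []
    for value in values:
        if pattern.fullmatch(value):
            valid.append(value)
        else:
            invalid.append(value)
    return valid, invalid


# Made-up sample values.
valid, invalid = validate(["2100", "8000", "80000", "DK-1"], POSTAL_CODE_DK)
print("valid:", valid)
print("invalid:", invalid)
```

Using `fullmatch` rather than `search` means a value such as "80000" is rejected even though it contains four digits in a row.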
DataCleaner can access and analyze practically any datastore, including:
- Databases such as Oracle, Microsoft SQL Server, PostgreSQL, MySQL, OpenOffice (ODB) and more
- Comma-separated and tab-separated files (.csv/.tsv)
- Excel spreadsheets (.xls)
- XML files
DataCleaner is open source software and we are giving it away for free.
Our philosophy is that the user should have a big say in how the application works.
The goal of DataCleaner is to draw on your knowledge of data quality to build
the best possible application, not to make money.
