DataCleaner is an Open Source application for profiling, validating and comparing data. These activities help you administer and monitor your data quality in order to ensure that your data is useful and applicable to your business situation.

DataCleaner is a free alternative to the software used for master data management (MDM), data warehousing (DW) projects, statistical research, preparation for extract-transform-load (ETL) activities and more.

Who uses DataCleaner?

Read about where DataCleaner is being used and what our users think.

See it in action

Learning how to use DataCleaner can be a visual and interesting experience. Start by taking a look at our online webcasts and screenshots.

Take the user survey

Please help us gain a better understanding of our users' impressions and interests by filling out our new user survey.
Open Source software is software whose source code is shared with everyone. This means that many people can contribute to the code, that improvements are rapidly integrated and that the planning is extensively user-based.
DataCleaner is licensed under the terms of the GNU Lesser General Public License (LGPL), which allows anyone to use the software for all purposes, but requires that modified versions of the code that are distributed be made available under the same license.
Master Data Management (MDM) can loosely be translated into "management of reference data". This means all the data that is non-transactional and can be shared globally within a company or organization. Examples are datasets containing customers, employees, products, office locations, etc. Analyzing the quality of this data is crucial to ensuring that it is useful and used correctly throughout the organization.
Extract-Transform-Load (ETL) is the process of copying and transforming data from one or more datasources to another. Typically you would use a tool like DataCleaner before, during and after any ETL activity.
  • Before, to gain insight into the datasources that you are about to use in your work. We typically refer to this as looking below the tip of the iceberg of data.
  • During, if (when) you run into any unexpected mismatches during the ETL process.
  • After, to ensure consistency and quality in the datasource that you have populated.
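To make the before and after steps concrete, here is a minimal sketch of column profiling in plain Python. It does not use DataCleaner's own API; the function names `profile_column` and `compare_profiles` and the sample rows are assumptions made up for this illustration.

```python
from collections import Counter


def profile_column(rows, column):
    """Summarize one column: row count, missing values, distinct values, top values."""
    values = [row.get(column) for row in rows]
    # Treat None and blank strings as missing.
    present = [v for v in values if v is not None and str(v).strip() != ""]
    counts = Counter(present)
    return {
        "rows": len(values),
        "missing": len(values) - len(present),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }


def compare_profiles(before, after):
    """Return only the metrics that differ, as (before, after) pairs."""
    return {key: (before[key], after[key]) for key in before if before[key] != after.get(key)}


# Hypothetical source data (before ETL) and populated target (after ETL).
source = [
    {"id": "1", "country": "DK"},
    {"id": "2", "country": "dk"},
    {"id": "3", "country": ""},
    {"id": "4", "country": "DK"},
]
target = [
    {"id": "1", "country": "DK"},
    {"id": "2", "country": "DK"},
    {"id": "4", "country": "DK"},
]

before = profile_column(source, "country")
after = profile_column(target, "country")
differences = compare_profiles(before, after)
```

Here the profile of the source reveals a blank value and inconsistent casing before any transformation is written, and the comparison afterwards shows that one row did not reach the target.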
DataCleaner is quite possibly the easiest data quality application to use. Within just a few clicks you can get an insightful overview of your data. DataCleaner gives you the power to customize while keeping things simple.
If you want to take your Data Quality skills to the next level, participating in the development of DataCleaner is a great place to start. We have interesting tasks for many different kinds of people. If you're a programmer you can of course help out with developing the source code, but we also need people to help out with the website, answer questions on the forums and voice their opinions on the mailing lists.
With our new RegexSwap forum facility we promote the idea of accessing and sharing a wide range of Data Quality content directly on the website. By sharing, discussing and voting on the provided content, the DataCleaner DQ model gives you the instructions, support and quality you need in your DQ deployments.
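As an illustration of the kind of content that can be shared this way, the following sketch validates values against a regular expression in plain Python. The pattern is a deliberately simple, hypothetical email check written for this example; it is not taken from RegexSwap, and the `validate` helper is likewise an assumption.

```python
import re

# Hypothetical pattern: something@something.something, with no spaces or extra "@".
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate(values, pattern):
    """Split values into those that match the pattern and those that do not."""
    valid, invalid = [], []
    for value in values:
        (valid if pattern.match(value) else invalid).append(value)
    return valid, invalid


valid, invalid = validate(
    ["kim@example.com", "no-at-sign.example.com", "a@b"],
    EMAIL,
)
```

A shared expression like this is only as good as its review, which is where discussing and voting on it becomes useful.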
DataCleaner can access and analyze practically any datastore, including:
  • Databases such as Oracle, Microsoft SQL Server, PostgreSQL, MySQL, OpenOffice (ODB) and more
  • Comma-separated and tab-separated files (.csv/.tsv)
  • Excel spreadsheets (.xls)
  • XML files
DataCleaner is open source software and we are giving it away for free. Our philosophy is that the user should have a big say in how the application works. The goal of DataCleaner is to draw on your knowledge of data quality to build the best possible application, not to make money.
