Topic: Experience using DataCleaner

Topic by
barboza

2011-04-11
17:23

Hello, good afternoon!
I'm a systems analyst and a student in the MBA in Software Quality Assurance program at UFRJ (Federal University of Rio de Janeiro). I am preparing a presentation on the data quality tool DataCleaner and am looking for success stories. For example:
- Scenario before and after the tool;
- Motivation to use the tool;
- Why use DataCleaner?
- Advantages and disadvantages of the tool.

I appreciate the help!
Sincerely,
Humberto Barboza

Reply by
beno

2011-04-12
07:22
Hello Humberto,

I like the tool because it's pretty easy to use and connects to a lot of different data sources. We often get spreadsheets, Access databases and more at our customers' sites, and DataCleaner gives a good initial overview before we start manipulating the data for our consolidation jobs.

Maybe also take a look at the page about who uses the tool: http://datacleaner.eobjects.org/who_uses

Reply by
kasper

2011-04-12
20:19
Hi barboza,

Glad to hear that you're dealing with this at the university in Rio!

I have a few success stories based on my own experience and on what I've heard from community feedback. If I can, I would love to help you with your research.

I wrote a blog entry about the advantages, the use-case and the motivation for making DataCleaner and how we imagine that it will differentiate itself from most competing tools here: http://www.datavaluetalk.com/2011/02/17/data-quality-analysis-%E2%80%93-it-requires-a-bit-of-all-worlds/. Maybe it's something you can use.

Please keep us updated and bring your questions if you have some :)

Reply by
chiaochi

2011-04-12
21:53
Background: Our company is migrating to a new CRM. There are about 30+ tables to extract, transform and load into the new system.

Scenario before the tool: We kept getting pages of messy error logs when loading the manipulated data into the new CRM. Errors ranged from invalid formats to missing look-up values. It was hard to distinguish cascading issues from real errors. Since the system rejects a row as soon as an error is found in one of the fields, there could be more errors in the same row that were not documented in the error logs.

After the tool: We spent hours building the rules in DataCleaner and running them against each extract. In our test load, 80% of the tables were loaded successfully. The other 20% either failed because we did not have time to build all the rules (which is probably neither efficient nor realistic anyway) or were expected failures.

Motivation to use the tool: To make the system migration go more smoothly.

Why use DataCleaner: I've tried Talend and DQ Analyzer, but found DataCleaner the most intuitive of all. The output/report of DataCleaner really stands out compared to the other data profilers.

Advantages of the tool: As mentioned earlier, the intuitive design and the reports won me over. The responsiveness of the developer and the community are definitely huge pluses. Since the program is open source (free), I could test the software right away without applying for a budget.

Disadvantages of the tool: My gut feeling tells me that other SQL-engine DP tools (like Talend) might have higher performance in some areas. Also, some basic features are still missing, such as "Save As" and the ability to delete an analysis file in the "Open" window.

Hope this helps!

George

Reply by
barboza

2011-04-24
13:51
Good morning, Kasper!

I want to thank you for your prompt response and your readiness to help me with this research.
Dude, I contacted the companies listed in the "Who uses DataCleaner" section, but they have not replied. I need real success stories like the one our friend George described in his post. It does not need to be a complex description, just one that covers the challenge (why did you need a data quality tool?), the solution (why did you choose DataCleaner?) and the benefits of using DataCleaner (what were the consequences for the organization?).
Our objective is to learn to use DataCleaner, to buy into the idea of using it, and to talk about the need to implement such a tool, how it works, its origin, support and training platform, its advantages and disadvantages compared to other tools on the market, and real success stories.
Our presentation will be next Wednesday and we look forward to it. =)
Another thing: I liked this data quality story so much that I'm even thinking of moving to Denmark to help you develop it. =) I just think the language there is complicated ... =)

So that's it. If you can help us with success stories, that will be great!
Again, thank you for your always quick and valuable help.

Sincerely,
Humberto Barboza

Reply by
barboza

2011-04-24
13:56
Good morning George!

Thank you for answering my questions!
Your experience is already recorded and will be discussed in the presentation. Could you just tell me your company or website?

Thank you very much for your help!

Sincerely,
Humberto Barboza

Reply by
chiaochi

2011-04-24
14:08
Humberto,

I work for The Advisory Board Company. Here is my company's website: http://www.advisoryboardcompany.com/

Good luck!

George

Reply by
kasper

2011-04-28
10:50
Hi Humberto,

Too bad that the companies did not reply. But I'm happy that you at least got a few points from George!

Even if it might be a long trip to Denmark (and I would welcome you, and might even provide you with some office space ;-)), you can also help develop DataCleaner remotely. That's one of the nice things about open source community development: everything is online, though of course face-to-face conversations help. Anyway, if you're interested I'll be very happy to help you take the first steps in becoming a DataCleaner contributor.

I hope your presentation went well!

Reply by
barboza

2011-05-12
12:57
Hi Kasper!

The presentation was great. We covered the subject thoroughly and presented the tool broadly: its advantages, how it works, technical information (platform, compatibility) and several success stories achieved with its use.
Personally, I became very interested in the subject and the tool, and it would be a great pleasure to work with you and your team, whether remotely or in Denmark. Your project is quite interesting, and its success and growth depend only on us.
As for going to Denmark, I have a strong desire to leave the country for Europe or the USA. I even went to Michigan at the end of the year to line up a possible trip there. Maybe I won't stop there? I'll certainly visit your office when I'm in Denmark!
Regarding the invitation to contribute to DataCleaner, I think it is a great idea and opportunity. I have some personal projects under way, as I work on becoming an expert in software quality, sharpening my English and getting more confident in areas such as governance, project management, testing and QA. But I do want to participate, and I will get in touch once I have finished some of these projects and can fit in DataCleaner development.
I take this opportunity to thank you for all the support in preparing this work, and to emphasize that the community's interaction and participation are sensational and a key part of DataCleaner's success!
If you want, I can send you our presentation; however, it is only in Portuguese for now ...

Sincerely,
Humberto.

Reply by
kasper

2011-05-13
06:47
Hi Humberto!

Sounds brilliant. Glad that the presentation went well. I would very much like to see it! You can send it to me by email: kasper "at" eobjects.org...

Thank you for all your nice words. And please let me know if you come to Denmark (or the Netherlands where I also spend some time because Human Inference's headquarters are there).

:-)
