Topic: [SOLVED] Reporting

Topic by
MFontana

2008-05-21
09:17

Maybe I missed something, but I found it quite difficult to export the results of a DataCleaner analysis to other formats. I managed to do so only by copy and paste. Is there a structured way to print a report or export results?
Thank you in advance

Reply by
kasper

2008-05-21
11:19
You're absolutely right, we need a lot of improvement on the reporting side.

There is a ticket (#139) to enable easier copying to the clipboard. But in response to your request here are my thoughts on this (which can be changed over time, but it's all a question of time and resources right now):
  • My original plan with the DataCleaner GUI was that it should be a "desktop app", meaning that it was intended for single users who had a somewhat "explorative" goal. I never thought of reporting in this use case scenario, but I will definitely put some more work into it now that I know that it's a real feature request.
  • In addition we are planning the webmonitor web application, which should be used by enterprises who want to schedule and monitor data quality over time. This would be an ideal place to put reporting using some intelligent report format (Eclipse BIRT is my personal favourite, but let's see). The webmonitor is still in a very early stage, so this doesn't really respond to your wishes.

Out of curiosity, can I ask in what kind of environment you are using DataCleaner? I think you are one of the early adopters and I would love to hear about users' experiences.

Reply by
anonymous

2008-05-21
12:55
I'm using DataCleaner as data profiling software for both academic and commercial projects based on SQL Server and MySQL. The main focus is on ETL processes (developed using Kettle, ad hoc SQL procedures, Integration Services and Informatica PowerCenter) and data quality. I think DataCleaner could be a good tool to evaluate data quality, at least to define the big picture. Databases involved in these projects have tables reaching 9-10 million rows, with an average of 1.5 million rows. We adopted Pentaho as our BI platform. My request for reporting features is due to the need to document results step by step during the ETL process. Tell me if you need more info.

Just one more idea: have you ever thought about embedding DataCleaner inside an ETL tool or even just a BI platform? It would be much appreciated and your tool could gain a lot of visibility. Pentaho could be a good proving ground ;-)

Reply by
kasper

2008-05-21
14:32
Your scenario sounds very interesting. I'm glad to hear that you're using it to profile MS SQL because we actually haven't tested that database! :-O If you don't mind we would be very happy if you could go to our [DataCleanerFeatures features page] and fill in some info on how to connect to MS SQL (look at the bottom of the page, there are similar examples with MySQL etc.)?

Personally I don't particularly like Pentaho. Not from a user point of view, because their solutions seem to work pretty OK, but from a developer point of view I think they are doing a lot of things wrong. I have been in contact with them but they don't seem to have a lot of interest in DataCleaner either. We are right now talking to the guys from SpagoBI about some sort of collaboration with them, because they have an architecture that is much more in line with my ideas of good design and they are also more open to new ideas. So let's hope that we can get a good collaboration up and running with them.

Reply by
matteo.fontana@statistica.unimib.it

2008-05-21
15:12
Added MSSQL compliance references.

Well, about Pentaho vs SpagoBI, we have to look at the problem from the opposite point of view: we have to pay attention to the user's point of view, and I believe that Pentaho's solution is much easier (both to understand and to implement). But I also believe their ETL solution still has to grow (a lot ;-)) and I'm looking for an open source ETL project that can deal with large databases.

I leave you my email address.
Let's keep in touch.

Reply by
kasper

2008-05-21
18:44
Thank you very much for the MSSQL docs.

I know that Pentaho is easier to implement, and actually I still feel that there are a lot of opportunities for improvement in both suites on the usability and integration side. I feel that a lot of the technologies are too fragmented and there's too little metadata to tie the ends together.

Also, when investigating your previous request about large databases, I was disappointed to see that none of the ETL tools out there had come as far as we actually have (by version 1.1) in optimizing and splitting up queries for the sake of control over memory consumption. This is also why we've spent some time separating some initial DataCleaner core code out to the MetaModel code - because we actually "invented" something that was not out there in the middleware market already. Check out ticket #134 for more info.

Yes, let's keep in touch; you can find all my contact info at [KasperSorensen my wiki page]. I think it's great to hear what "real users" think of the product. It definitely helps us prioritize the tickets, because right now we want people to get interested and build the community.

Reply by
kasper

2008-06-05
08:45
Hi MFontana
Just to let you know, I finished working on the improved "copy to clipboard" functionality yesterday. So now you can right-click on any table in DataCleaner and copy 1) the selected cells or 2) the entire table.

You will of course have to wait a bit for version 1.2 of DataCleaner or [BuildingDataCleaner build the app] yourself. Or you could do a quick and dirty update (assuming you use the 1.1 development release) and just replace these jar files into your target and target/lib dirs:

Hope it works out.

Reply by
kasper

2008-07-27
13:35
For your interest, we've just released version 1.3 of DataCleaner with those improvements that you've inspired us to do (but now in a stable, supported version).
