Topic: Can I change metadata
I am trying to profile a fixed width file. I am able to define the datasource by selecting a file and defining the column widths.
When I am done I see that the column names in the datastore are taken from the first row of the file I used when defining the datasource. Can I change this? My first row contains actual data and many of the columns are actually empty.
Also it has interpreted numeric fields with leading zeros as strings and I am not able to do any numeric profiling on them.
I suppose the same question applies to delimited datastores. Once I have defined the datasource can I alter the metadata?
Thank you for your help.
Hi tzimmerman,
Actually you could say that the metadata of a datastore is discovered, not defined, by DataCleaner. Therefore I would suggest that you make sure the first line of the file (delimited or fixed width) has column headers in it. In future versions we might add an ability to define these at the application level.
Regarding numeric profiling - simply use the "convert to number" transformer and then you will have the same columns available with number types.
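For readers who want to see what such a conversion amounts to outside the tool, here is a minimal Python sketch. It is not DataCleaner's transformer or its API; the function name `to_number` is made up for illustration, and treating blank or unparseable fields as missing is an assumption about how empty columns should be handled.

```python
from typing import Optional, Union

Number = Union[int, float]


def to_number(raw: str) -> Optional[Number]:
    """Convert a text field such as '00042' to a number.

    Hypothetical helper, not part of DataCleaner. Blank or
    non-numeric fields yield None instead of raising.
    """
    text = raw.strip()
    if not text:
        # Fixed width files often pad empty columns with spaces.
        return None
    try:
        # int() accepts leading zeros in a decimal string.
        return int(text)
    except ValueError:
        pass
    try:
        # Fall back to float for values with a decimal point.
        return float(text)
    except ValueError:
        return None


if __name__ == "__main__":
    for field in ["00042", "  007 ", "003.50", "", "abc"]:
        print(repr(field), to_number(field))
```

Note that the leading zeros are lost in the conversion, so keep the original string column as well if the zeros carry meaning (for example in account or postal codes).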
Good luck.
kasper,
Thank you for the prompt response. This is pretty much what I had concluded. The ability to either specify or modify the metadata once it has been discovered would be a big plus in a future version (IMHO).
I will look into the "convert to number" transformer for my other issue.
Again, thanks for the help and thank you for DataCleaner.
You're very welcome, happy that you like it. And thanks for your feedback, it's very valuable!
A little word of warning, to be precise: the first line will not be included in your analysis jobs, because it is assumed to be a header line (not data content). I therefore generally recommend that you add such a line yourself if it's not already there. This is definitely something that will be configurable at some point; I just improved this for CSV files the other day.
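If the file has no header line, one way to add it before defining the datastore is a small script like the Python sketch below. It is only an illustration under stated assumptions: the column names, widths and file paths are hypothetical, the functions `build_header` and `prepend_header` are not part of DataCleaner, and it assumes each header name should be padded (or cut) to its column's width so the header lines up with the data.

```python
from typing import Sequence


def build_header(names: Sequence[str], widths: Sequence[int]) -> str:
    """Build a fixed width header line from column names and widths.

    Hypothetical helper. Names longer than their column are truncated,
    shorter names are padded with spaces on the right.
    """
    if len(names) != len(widths):
        raise ValueError("names and widths must have the same length")
    return "".join(name[:width].ljust(width) for name, width in zip(names, widths))


def prepend_header(src_path: str, dst_path: str,
                   names: Sequence[str], widths: Sequence[int]) -> None:
    """Copy src_path to dst_path with a header line added on top."""
    header = build_header(names, widths)
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(header + "\n")
        # Copy the data lines unchanged, one at a time.
        for line in src:
            dst.write(line)


if __name__ == "__main__":
    # Hypothetical layout: a 6 character id followed by a 10 character name.
    print(repr(build_header(["id", "name"], [6, 10])))
```

Writing to a separate output file rather than editing in place keeps the original data untouched, so the first data row is no longer consumed as the header.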


