Topic: Datacleaner with command line

Topic by
datacleanuser123

2011-04-05
15:43

How does DataCleaner clean data? Or is it more suitable for data profiling?

Please let me know. I would also like to know how I can use the DataCleaner tool from the command line. I just want to pass input parameters, e.g. the input filename, the field name on which I need to run the cleaning task, and which specific cleaning task to run, then run the DataCleaner engine and get the output in a .csv/text file.

Has anyone already done this kind of work? Please let me know.

Reply by
franklin

2011-04-05
17:26
I guess the tool is useful for both cleansing and profiling, but probably the main focus is on profiling/analysis. Cleansing can be done using filters and transformers - see the webcasts on the /media page.

There used to be a command line interface for the old DataCleaner 1.5.4, but I haven't heard of anything like it for 2.x yet.

Reply by
kasper

2011-04-05
18:26
The command line is indeed in the works. Actually the AnalyzerBeans project (which DataCleaner builds upon) has a very nice command line interface that we can almost use out of the box, except that DataCleaner's settings need to be loaded first ... If you're up for registering all datastores, dictionaries etc. in the conf.xml file, then you can use the command line interface as is. Otherwise, it's a feature that I think will not be that difficult to include in an upcoming release.

Reply by
datacleanuser123

2011-04-05
18:44
Thanks franklin and kasper for your replies.

Kasper - So will this command line feature be in the current release of DataCleaner? I am eager to use a DataCleaner release with a command line in my project.

Thanks once again for your reply.


Reply by
kasper

2011-04-05
19:52
It will not be included directly in this release - that's too late of course. BUT there is actually a "hidden" existing command line interface that you can use. It requires a few steps to make it work, but I just successfully did it to verify. I shall try to guide you:
  • First of all you need to download args4j-2.0.12.jar and put it in your DataCleaner folder.
  • With this file you should be able to run a job. Here's how to run one of the examples:
cd DataCleaner
java -cp DataCleaner.jar;args4j-2.0.12.jar org.eobjects.analyzer.cli.Main -conf conf.xml -job examples/employees.analysis.xml

Above we use the conf.xml file as the configuration and the examples/employees.analysis.xml file as the job file. For a job to work, all datastores and reference data need to be registered in conf.xml. The XML schema for conf.xml may help you with this.
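
Note that the command above uses `;` as the classpath separator, which is the Windows convention; on Linux or macOS the Java launcher expects `:` instead. A minimal sketch of a wrapper that builds the same command portably is shown below. It assumes Python is available and that the jar and file names are the ones used in this post; `build_command` and `run_job` are hypothetical helper names, not part of DataCleaner.

```python
import os
import subprocess


def build_command(conf="conf.xml",
                  job="examples/employees.analysis.xml",
                  jars=("DataCleaner.jar", "args4j-2.0.12.jar")):
    """Return the argument list for the CLI invocation shown in the post."""
    # os.pathsep is ';' on Windows and ':' on Linux/macOS,
    # which matches what the java launcher expects for -cp.
    classpath = os.pathsep.join(jars)
    return ["java", "-cp", classpath,
            "org.eobjects.analyzer.cli.Main",
            "-conf", conf, "-job", job]


def run_job(workdir="."):
    """Run the job from the given folder (assumed to be the DataCleaner folder)."""
    return subprocess.run(build_command(), cwd=workdir, check=True)
```

Calling `run_job("C:/DataCleaner")` would then be roughly equivalent to the `cd DataCleaner` step followed by the `java` command above.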

Please let me know how it works out!

Reply by
datacleanuser123

2011-04-06
14:43
I ran DataCleaner on the command line, but I got an error message, which I have attached along with this email. I have also attached a copy of the conf.xml file. I am not sure whether I have created it correctly.

Another question I would like to ask: to run DataCleaner as a command line tool for a data cleaning job, which scenario is correct?

Scenario 1.

I should open the input file in the DataCleaner tool, process it with the help of filters and data analyzers, data dictionaries or synonyms, and then save that job (datastore?). Then run DataCleaner from the command line.

Scenario 2.

I should just put the input file in the DataCleaner examples directory, change conf.xml to pass all the configuration parameters in this file, and then run it with DataCleaner. But in that case, how can I define which specific data cleansing functions (string analyzer etc.) I should use?

---------------------------------------

Error message at the command prompt after running the command with args4j-2.0.12.jar on the classpath:

C:\DataCleaner>java -cp DataCleaner.jar;args4j-2.0.12.jar org.eobjects.analyzer.cli.Main -conf conf.xml -job examples/employees.analysis.xml
WARN JaxbValidationEventHandler - encountered JAXB parsing error: unexpected element (uri:"http://eobjects.org/analyzerbeans/configuration/1.0", local:"datastoreCatalogType"). Expected elements are <{http://eobjects.org/analyzerbeans/configuration/1.0}custom-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}singlethreaded-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}datastore-catalog>,<{http://eobjects.org/analyzerbeans/configuration/1.0}reference-data-catalog>,<{http://eobjects.org/analyzerbeans/configuration/1.0}multithreaded-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}classpath-scanner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}configuration-metadata>,<{http://eobjects.org/analyzerbeans/configuration/1.0}storage-provider>
Exception in thread "main" java.lang.IllegalArgumentException: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://eobjects.org/analyzerbeans/configuration/1.0", local:"datastoreCatalogType"). Expected elements are <{http://eobjects.org/analyzerbeans/configuration/1.0}custom-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}singlethreaded-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}datastore-catalog>,<{http://eobjects.org/analyzerbeans/configuration/1.0}reference-data-catalog>,<{http://eobjects.org/analyzerbeans/configuration/1.0}multithreaded-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}classpath-scanner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}configuration-metadata>,<{http://eobjects.org/analyzerbeans/configuration/1.0}storage-provider>
    at org.eobjects.analyzer.configuration.JaxbConfigurationReader.unmarshall(JaxbConfigurationReader.java:163)
    at org.eobjects.analyzer.configuration.JaxbConfigurationReader.create(JaxbConfigurationReader.java:151)
    at org.eobjects.analyzer.configuration.JaxbConfigurationReader.create(JaxbConfigurationReader.java:144)
    at org.eobjects.analyzer.cli.Main.run(Main.java:95)
    at org.eobjects.analyzer.cli.Main.main(Main.java:87)
Caused by: javax.xml.bind.UnmarshalException: unexpected element (uri:"http://eobjects.org/analyzerbeans/configuration/1.0", local:"datastoreCatalogType"). Expected elements are <{http://eobjects.org/analyzerbeans/configuration/1.0}custom-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}singlethreaded-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}datastore-catalog>,<{http://eobjects.org/analyzerbeans/configuration/1.0}reference-data-catalog>,<{http://eobjects.org/analyzerbeans/configuration/1.0}multithreaded-taskrunner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}classpath-scanner>,<{http://eobjects.org/analyzerbeans/configuration/1.0}configuration-metadata>,<{http://eobjects.org/analyzerbeans/configuration/1.0}storage-provider>
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.handleEvent(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportError(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.reportUnexpectedChildElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.Loader.childElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.StructureLoader.childElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext._startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallingContext.startElement(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.SAXConnector.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(Unknown Source)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(Unknown Source)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(Unknown Source)
    at com.sun.xml.internal.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at javax.xml.bind.helpers.AbstractUnmarshallerImpl.unmarshal(Unknown Source)
    at org.eobjects.analyzer.configuration.JaxbConfigurationReader.unmarshall(JaxbConfigurationReader.java:160)
    ... 4 more
---------------------------------------
Here is conf.xml:
<?xml version="1.0" encoding="UTF-8"?>
<configuration xmlns="http://eobjects.org/analyzerbeans/configuration/1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

<configuration-metadata>
<configuration-name>DataCleaner configuration</configuration-name>
<configuration-description>Configures DataCleaner's initial environment.
This includes an example datastore and some example reference data.</configuration-description>
<configuration-version>2.0</configuration-version>
<author>eobjects.org</author>
</configuration-metadata>

<datastore-catalog>
<jdbc-datastore name="orderdb">
<url>jdbc:hsqldb:res:orderdb;readonly=true</url>
<driver>org.hsqldb.jdbcDriver</driver>
<username>SA</username>
<password></password>
</jdbc-datastore>
</datastore-catalog>

<reference-data-catalog>

</reference-data-catalog>

<multithreaded-taskrunner max-threads="30" />

<storage-provider>
<combined>
<collections-storage>
<berkeley-db />
</collections-storage>
<row-annotation-storage>
<in-memory max-rows-threshold="1000" />
</row-annotation-storage>
</combined>
</storage-provider>

<classpath-scanner>
<package recursive="true">org.eobjects.analyzer.beans</package>
<package>org.eobjects.analyzer.result.renderer</package>
<package>org.eobjects.datacleaner.output.beans</package>
<package recursive="true">org.eobjects.datacleaner.widgets.result</package>
</classpath-scanner>
<datastoreCatalogType>
<xml-datastore>
<filename name="C:\DataCleaner\examples\employees.analysis"></filename>
</xml-datastore>
</datastoreCatalogType>
<referenceDataCatalogType>
<dictionaries><value-list-dictionary></value-list-dictionary></dictionaries>
<synonym-catalogs><datastore-synonym-catalog></datastore-synonym-catalog></synonym-catalogs>
<string-patterns><simplePatternType name="validate_phone"></simplePatternType></string-patterns>
</referenceDataCatalogType>
</configuration>

Reply by
kasper

2011-04-06
17:25
I took the liberty of adding some quote styling to your post :)

Hmm, I don't want to dictate which scenario you choose. Maybe someone else has experience with both ways, but in general I would say that both approaches should work.

The error you're seeing is because the conf.xml file is not formatted in a valid way: the parser rejects the <datastoreCatalogType> element, which is not among the elements it expects directly under <configuration> (it expects <datastore-catalog>, for example). The best way to edit such XML files by hand is using an XML schema aware editor such as Eclipse. Another way might be to look at some examples. There are lots of examples in the test resources of AnalyzerBeans, but beware that a few of them are invalid (the filenames should state what they do on a high level).
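
As a rough sanity check before running the job, the top-level elements of conf.xml can be compared against the names listed in the "Expected elements are ..." part of the error message. The following is only a sketch under that assumption: it uses Python's standard library, which does not perform real XML Schema validation, so it is no substitute for a schema-aware editor. `unexpected_top_level_elements` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the error message and from the posted conf.xml.
NS = "http://eobjects.org/analyzerbeans/configuration/1.0"

# Element names copied from the "Expected elements are ..." list in the error.
EXPECTED = {
    "configuration-metadata", "datastore-catalog", "reference-data-catalog",
    "singlethreaded-taskrunner", "multithreaded-taskrunner",
    "custom-taskrunner", "storage-provider", "classpath-scanner",
}


def unexpected_top_level_elements(xml_text):
    """Return local names of direct children of the root that are not expected."""
    root = ET.fromstring(xml_text)
    bad = []
    for child in root:
        # ElementTree writes namespaced tags as '{namespace}local-name'.
        namespace, _, local = child.tag.rpartition("}")
        if namespace != "{" + NS or local not in EXPECTED:
            bad.append(local)
    return bad
```

For the conf.xml posted above, a check like this would be expected to flag `datastoreCatalogType` and `referenceDataCatalogType`, the two elements appended after `<classpath-scanner>`.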

The same applies for job files. There is an XML schema and also some examples you can look at to see how they define which analyzers etc. are being used.

Reply by
datacleanuser123

2011-04-07
20:08
Hello kasper

Huh... I did succeed in validating the XML file. I ran DataCleaner on the command line again, and now I am getting a different error compared to the first one.

C:\DataCleaner>java -cp DataCleaner.jar;args4j-2.0.12.jar org.eobjects.analyzer.cli.Main -conf conf.xml -job examples/employees.analysis.xml
INFO JaxbConfigurationReader - Configuration name: Configuration with all the datastores
INFO JaxbConfigurationReader - Configuration version: null
INFO JaxbConfigurationReader - Configuration description: null
INFO JaxbConfigurationReader - Author: null
INFO JaxbConfigurationReader - Created date: null
INFO JaxbConfigurationReader - Updated date: null
INFO JaxbJobReader - Job name: null
INFO JaxbJobReader - Job version: null
INFO JaxbJobReader - Job description: Created with DataCleaner 2.0 (BETA)
INFO JaxbJobReader - Author: John Doe
INFO JaxbJobReader - Created date: 2010-11-12Z
INFO JaxbJobReader - Updated date: 2011-02-09+01:00
ERROR Main - Exception thrown in org.eobjects.analyzer.job.NoSuchDatastoreException: No such datastore: orderdb
Error:
org.eobjects.analyzer.job.NoSuchDatastoreException: No such datastore: orderdb
    at org.eobjects.analyzer.job.JaxbJobReader.create(JaxbJobReader.java:268)
    at org.eobjects.analyzer.job.JaxbJobReader.create(JaxbJobReader.java:213)
    at org.eobjects.analyzer.job.JaxbJobReader.create(JaxbJobReader.java:208)
    at org.eobjects.analyzer.job.JaxbJobReader.create(JaxbJobReader.java:201)
    at org.eobjects.analyzer.cli.Main.runJob(Main.java:249)
    at org.eobjects.analyzer.cli.Main.run(Main.java:98)
    at org.eobjects.analyzer.cli.Main.main(Main.java:87)
INFO MultiThreadedTaskRunner - shutdown() called, shutting down executor service

When I look carefully at the job file employees.analysis.xml, I guess it is getting its data from orderdb. But then I don't understand where the actual data file, e.g. orderdb in this case, should be stored.

Thanks.

Reply by
kasper

2011-04-08
06:19
The datastores are defined in conf.xml. It surprises me that it cannot find 'orderdb', because it is defined in the configuration file that is shipped with DataCleaner. Did you modify conf.xml?

To reaffirm, there are two files involved:
  • conf.xml: Defines the "environment" of DataCleaner.
  • [your-filename].analysis.xml: Defines a single job running in the defined "environment".
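
To see which datastore names a given conf.xml actually defines, something like the sketch below could be used. It assumes, based on the conf.xml posted earlier in this thread, that each datastore is a direct child of `<datastore-catalog>` carrying a `name` attribute (as in `<jdbc-datastore name="orderdb">`); `datastore_names` is a hypothetical helper, not a DataCleaner API. A "No such datastore: orderdb" error would suggest that no entry with that name is present in the file being loaded.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the configuration namespace used in conf.xml.
CONF_NS = {"c": "http://eobjects.org/analyzerbeans/configuration/1.0"}


def datastore_names(conf_xml_text):
    """Return the 'name' attributes of all entries under <datastore-catalog>."""
    root = ET.fromstring(conf_xml_text)
    catalog = root.find("c:datastore-catalog", CONF_NS)
    if catalog is None:
        # No catalog element at all, so no datastores are registered.
        return []
    # Assumption: every datastore type is a direct child with a name attribute.
    return [ds.get("name") for ds in catalog if ds.get("name") is not None]
```

Reading the file with `open("conf.xml", encoding="utf-8").read()` and passing the text to `datastore_names` would list the registered names, which can then be compared with the datastore the job expects.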

Reply by
datacleanuser123

2011-04-08
14:13
I did modify it.

So I can get the original conf.xml again from the DataCleaner website. Is that right?

Another thing: ever since I validated the conf.xml file, if I open any analysis job or any datastore from the DataCleaner GUI, I get an error saying "Unexpected error. Descriptor can not be null".

Also, should the conf.xsd schema file be in the same folder as conf.xml? And should the actual database/data file on which the analysis/cleansing job is to be performed be in the datastore folder?

Reply by
datacleanuser123

2011-04-08
14:45
I got the previous problem solved. Here is what I am now getting as output when I run DataCleaner on the command line.

C:\>cd DataCleaner

C:\DataCleaner>java -cp DataCleaner.jar;args4j-2.0.12.jar org.eobjects.analyzer.cli.Main -conf conf.xml -job
INFO JaxbConfigurationReader - Configuration name: DataCleaner configuration
INFO JaxbConfigurationReader - Configuration version: 2.0
INFO JaxbConfigurationReader - Configuration description: Configures DataCleaner's initial environment. T
ence data.
INFO JaxbConfigurationReader - Author: eobjects.org
INFO JaxbConfigurationReader - Created date: null
INFO JaxbConfigurationReader - Updated date: null
INFO ClasspathScanDescriptorProvider - Scanning package path 'org/eobjects/analyzer/beans' (and subpackages
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
yzer._columns
INFO AbstractBeanDescriptor - No @Inject annotation found for @Provided field: org.eobjects.analyzer.storage
nAnalyzer._annotationFactory
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
alesceDatesTransformer.input
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
alesceNumbersTransformer.input
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
alesceStringsTransformer.input
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
EmptyStringAsNull
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo
r.nullReplacement
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String[] org.e
er._trueTokens
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String[] org.e
er._falseTokens
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Number org.eob
nullReplacement
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
nullReplacement
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
er.fromColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
er.toColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
er.groupColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo

INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
naryLookupFilter.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
ctionaryLookupFilter.dictionary
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
Filter.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String[] org.e
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: int org.eobjects.analyze
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
ullFilter.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
ll
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
RangeFilter.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Double org.eob
e
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Double org.eob
ue
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
WordFilter.input
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
LengthRangeFilter.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: int org.eobjects.analyze
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: int org.eobjects.analyze
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
PatternMatchFilter.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
er.StringPatternMatchFilter.stringPatterns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.be
ns.filter.StringPatternMatchFilter.matchCriteria
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
ValueRangeFilter.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
tValue
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
stValue
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: private org.eobjects.met
her.leftTable
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: private org.eobjects.met
her.rightTable
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: private org.eobjects.met
cher.leftTableJoinColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: private org.eobjects.met
cher.rightTableJoinColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
lyzer.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
gAnalyzer.dictionaries
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
hingAnalyzer.stringPatterns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
zer._columns
INFO AbstractBeanDescriptor - No @Inject annotation found for @Provided field: org.eobjects.analyzer.storage
Analyzer._annotationFactory
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.metamodel.s
grityValidator.primaryKeyColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.metamodel.s
grityValidator.foreignKeyColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
ignKey
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
ScriptFilter.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
ScriptTransformer.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
Code
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.be
bjects.analyzer.beans.similarity.PhoneticSimilarityFinder.matchMode
INFO NamedPattern - compiling pattern: ([a-zA-Z0-9\._%+-]+)@([a-zA-Z0-9\._%+-]+\.[a-zA-Z0-9\._%+-]{2,4})
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Integer org.eo
umTokens
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
okenizerTransformer.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: char[] org.eobjects.anal
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
rlStandardizerTransformer.inputColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
zer._columns
INFO AbstractBeanDescriptor - No @Inject annotation found for @Provided field: org.eobjects.analyzer.storage
Analyzer._annotationFactory
INFO AbstractBeanDescriptor - No @Inject annotation found for @Provided field: org.eobjects.analyzer.storage
pattern.PatternFinderAnalyzer._rowAnnotationFactory
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
.PatternFinderAnalyzer.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo
r.discriminateTextCase
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo
r.discriminateNegativeNumbers
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo
r.discriminateDecimals
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo
r.enableMixedTokens
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Boolean org.eo
r.ignoreRepeatedSpaces
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
eExpandable
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
eExpandable
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
.predefinedTokenName
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String[] org.e
er.predefinedTokenPatterns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Character org.
zer.decimalSeparator
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Character org.
zer.thousandsSeparator
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.Character org.
zer.minusSign
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
oncatenatorTransformer.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
eparator
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
eDiffTransformer.fromColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
eDiffTransformer.toColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
eMaskMatcherTransformer._column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String[] org.e
mer._dateMasks
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
eToAgeTransformer.dateColumn
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
rm.DictionaryMatcherTransformer._dictionaries
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
tionaryMatcherTransformer._column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob

INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
ingLengthTransformer.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
sform.StringPatternMatcherTransformer._stringPatterns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
ingPatternMatcherTransformer._column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
onymReplacementTransformer.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.re
form.SynonymReplacementTransformer.synonymCatalog
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
inOriginalValue
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
hitespaceTrimmerTransformer.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
eft
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
ight
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: boolean org.eobjects.ana
ultipleToSingleSpace
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
DecoderTransformer.column
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
eekdayDistributionAnalyzer.dateColumns
INFO ClasspathScanDescriptorProvider - Scanning package path 'org/eobjects/analyzer/result/renderer' (and su
INFO ClasspathScanDescriptorProvider - Scanning package path 'org/eobjects/datacleaner/output/beans' (and su
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: char org.eobjects.datacl
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: char org.eobjects.datacl
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.io.File org.eobject
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
bstractOutputWriterAnalyzer.columns
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
atastoreName
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: java.lang.String org.eob
ableName
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.datacleaner
ects.datacleaner.output.beans.DatastoreOutputAnalyzer.writeMode
INFO SimpleComponentDescriptor - No @Inject annotation found for @Configured field: org.eobjects.analyzer.da
bstractOutputWriterAnalyzer.columns
INFO ClasspathScanDescriptorProvider - Scanning package path 'org/eobjects/datacleaner/widgets/result' (and
INFO JaxbJobReader - Job name: null
INFO JaxbJobReader - Job version: null
INFO JaxbJobReader - Job description: Created with DataCleaner 2.0 (BETA)
INFO JaxbJobReader - Author: null
INFO JaxbJobReader - Created date: null
INFO JaxbJobReader - Updated date: 2011-02-09+01:00
INFO UsageAwareDatastore - Reusing existing DataContextProvider: UsageAwareDataContextProvider[datastore=ord
INFO UsageAwareDataContextProvider - Usage incremented to 2 for UsageAwareDataContextProvider[datastore=orde
INFO JdbcDataContext - Found schemaName: INFORMATION_SCHEMA
INFO JdbcDataContext - Found schemaName: PUBLIC
INFO JdbcDataContext - Querying for table types [TABLE, VIEW] in catalog: null, schema: PUBLIC
INFO JdbcDataContext - Querying for columns in table: EMPLOYEES
INFO UsageAwareDataContextProvider - Method close() invoked, usage decremented to 1 for UsageAwareDataContex
INFO UsageAwareDatastore - Reusing existing DataContextProvider: UsageAwareDataContextProvider[datastore=ord
INFO UsageAwareDataContextProvider - Usage incremented to 2 for UsageAwareDataContextProvider[datastore=orde
INFO UsageAwareDataContextProvider - Method close() invoked, usage decremented to 1 for UsageAwareDataContex
INFO AnalysisRunnerJobDelegate - Created 1 row processor publishers
INFO FilterBeanInstance - assignConfigured (org.eobjects.analyzer.beans.filter.EqualsFilter@72adf5be)
INFO FilterBeanInstance - assignProvided (org.eobjects.analyzer.beans.filter.EqualsFilter@72adf5be)
INFO TransformerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.standardize.EmailStandardizerTr
INFO TransformerBeanInstance - assignProvided (org.eobjects.analyzer.beans.standardize.EmailStandardizerTran
INFO TransformerBeanInstance - initialize (org.eobjects.analyzer.beans.standardize.EmailStandardizerTransfor
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@5918
INFO DefaultTokenizer - Predefined tokens are turned OFF, using tokenizeInternal
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@528a
INFO DefaultTokenizer - Predefined tokens are turned OFF, using tokenizeInternal
INFO FilterBeanInstance - initialize (org.eobjects.analyzer.beans.filter.EqualsFilter@72adf5be)
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@773c
INFO DefaultTokenizer - Predefined tokens are turned OFF, using tokenizeInternal
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@2a11
INFO DefaultTokenizer - Predefined tokens are turned OFF, using tokenizeInternal
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@4477
INFO DefaultTokenizer - Predefined tokens are turned OFF, using tokenizeInternal
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@c96a
INFO DefaultTokenizer - Predefined tokens are turned OFF, using tokenizeInternal
INFO AnalyzerBeanInstance - assignConfigured (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyze
INFO AnalyzerBeanInstance - assignProvided (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@
INFO FileHelper - Using 'C:\Users\arti\AppData\Local\Temp' as tmpdir.
INFO BerkeleyDbStorageProvider - Using target directory for persistent collections (deleteOnExit=true): C:\U
-46ae-8680-0ddb6d09b606
INFO FileHelper - Using 'C:\Users\arti\AppData\Local\Temp' as tmpdir.
INFO BerkeleyDbStorageProvider - Using target directory for persistent collections (deleteOnExit=true): C:\U
-46ae-8680-0ddb6d09b606
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@207f
INFO AnalyzerBeanInstance - initialize (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@6ac6
INFO ForkTaskListener - Calling onComplete(...) on nested TaskListener ()
INFO ForkTaskListener - Scheduling 1 tasks
INFO UsageAwareDatastore - Reusing existing DataContextProvider: UsageAwareDataContextProvider[datastore=ord
INFO UsageAwareDataContextProvider - Usage incremented to 2 for UsageAwareDataContextProvider[datastore=orde
INFO JdbcDataContext - SELECT COUNT(*) FROM PUBLIC."EMPLOYEES"
Analyzing 23 rows from table: EMPLOYEES
INFO InfoLoggingAnalysisListener - Beginning row processing of 23 rows in Table[name=EMPLOYEES,type=TABLE,re
INFO JdbcDataContext - SELECT "EMPLOYEES"."EMAIL", "EMPLOYEES"."LASTNAME", "EMPLOYEES"."REPORTSTO", "EMPLOYE
FICECODE" FROM PUBLIC."EMPLOYEES"
INFO UsageAwareDataContextProvider - Method close() invoked, usage decremented to 1 for UsageAwareDataContex
Done processing rows from table: EMPLOYEES
INFO ForkTaskListener - Scheduling 10 tasks
INFO FilterBeanInstance - close (org.eobjects.analyzer.beans.filter.EqualsFilter@72adf5be)
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@2
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@7
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@4
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@5
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@528a52b6)
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@5
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@5918cb3a)
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@c
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@c96ad7c)
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@2
INFO ValueDistributionAnalyzer - getResult()
INFO AnalyzerBeanInstance - returnResults (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@6
INFO ValueDistributionAnalyzer - getResult()
INFO TransformerBeanInstance - close (org.eobjects.analyzer.beans.standardize.EmailStandardizerTransformer@7
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@6ac67a88)
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.valuedist.ValueDistributionAnalyzer@207ff5b6)
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@44775121)
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@2a114025)
INFO AnalyzerBeanInstance - close (org.eobjects.analyzer.beans.stringpattern.PatternFinderAnalyzer@773c550f)
INFO ForkTaskListener - Calling onComplete(...) on nested TaskListener ()
INFO ForkTaskListener - Calling onComplete(...) on nested TaskListener ()
INFO ForkTaskListener - Scheduling 2 tasks
SUCCESS!

RESULT:
Match count Sample
aaaaaaaaaa 23 jfirrelli


RESULT:
Match count Sample
aaaaaaaaaaaaaaaa.aaa 23 classicmodelcars.com


RESULT:
Match count Sample
Aaaaa Aaa 17 Sales Rep
AA Aaaaaaaaa 2 VP Marketing
Aaaa Aaaaaaa (AAAA) 1 Sale Manager (EMEA)
Aaaaa Aaaaaaa (AA) 1 Sales Manager (NA)
Aaaaa Aaaaaaa (AAAAA, AAAA) 1 Sales Manager (JAPAN, APAC)
Aaaaaaaaa 1 President


RESULT:
Match count Sample
aaaaaaaaaa@aaaaaaaaaaaaaaaa.aaa 23 jfirrelli@classicmodelcars.com


RESULT:
Match count Sample
Aaaaaaa 22 Jeff
Aaaa Aaa 1 Foon Yue


RESULT:
Match count Sample
Aaaaaaaaa 23 Firrelli


RESULT:
Value distribution for column: REPORTSTO
Top values:
- 1102: 6
- 1143: 6
- 1088: 5
Null count: 0
Unique values:

RESULT:
Value distribution for column: OFFICECODE
Top values:
- 4: 4
- 6: 3
- 1: 2
- 2: 2
- 3: 2
- 5: 2
- 7: 2
Null count: 0
Unique values:
INFO MultiThreadedTaskRunner - shutdown() called, shutting down executor service

Reply by
datacleanuser123

2011-04-08
17:47
After 2-3 days of struggle I finally succeeded in testing a small sample file.

Here is a brief explanation of what I did.

1. List_A.csv is the input file. I put it in the datastores folder.
2. In the examples folder I created an analysis job file by hand, without creating the job in the DataCleaner GUI. I just specified which jobs I needed in the .xml file.
3. I modified conf.xml, adding a CSV datastore. Here is an example:
<csv-datastore name="List_A_csv">
<filename>datastores/List_A.csv</filename>
<quote-char>"</quote-char>
<separator-char>,</separator-char>
<encoding>UTF-8</encoding>
</csv-datastore>
4. Made sure that conf.xml is valid and that the analysis job XML file is well formed.
5. Ran DataCleaner on the command line.

That's it...

Thanks again to everyone who helped me get this done, and thanks to the DataCleaner development team.

Right now I am trying to find out how to save the results I got to a CSV or plain text file.

Next up is exploring it further :-)
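
Step 4 can be partly scripted. The following is a minimal Python sketch, assuming only that the files are XML: it checks well-formedness, not validity against DataCleaner's configuration schema. The function name is_well_formed is made up for illustration, and the sample input is the datastore fragment from step 3.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return (True, None) if xml_text parses as XML, else (False, error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        return False, str(err)


# The CSV datastore fragment from step 3, used as sample input.
csv_fragment = """<csv-datastore name="List_A_csv">
<filename>datastores/List_A.csv</filename>
<quote-char>"</quote-char>
<separator-char>,</separator-char>
<encoding>UTF-8</encoding>
</csv-datastore>"""

ok, error = is_well_formed(csv_fragment)
print("well formed:", ok)

# A fragment with a missing closing tag is reported with the parser's message.
broken_ok, broken_error = is_well_formed("<csv-datastore><filename>x.csv</csv-datastore>")
print("well formed:", broken_ok, "-", broken_error)
```

To check a whole file instead of a string, ET.parse("conf.xml") raises the same ParseError on malformed input.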

Reply by
kasper

2011-04-09
07:17
Hey,

Great that you figured all this out! I know that currently it is kind of a well kept secret how to configure the engine manually.

You can of course look into how to save the results. A simple solution would be to just pipe the console output to a file using the '>' operator on the command line.

In the future we will supply more output formats such as CSV, Excel, HTML etc.
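
For scheduled or scripted runs, the same redirection can be done from a small wrapper instead of typing '>' each time. This is only a sketch in Python: the command below is a stand-in so that it runs without DataCleaner installed, the function name run_and_save is hypothetical, and only standard output is captured (whether the INFO log lines go to standard output or standard error is not something this thread establishes).

```python
import os
import subprocess
import sys
import tempfile


def run_and_save(command, output_path):
    """Run command (a list of arguments) and write its standard output to output_path."""
    with open(output_path, "w", encoding="utf-8") as out:
        completed = subprocess.run(command, stdout=out, check=False)
    return completed.returncode


# Stand-in command so the sketch runs anywhere.
# Replace it with the real launcher and its arguments.
demo_command = [sys.executable, "-c", "print('SUCCESS!')"]
output_path = os.path.join(tempfile.mkdtemp(), "results.txt")
exit_code = run_and_save(demo_command, output_path)

with open(output_path, encoding="utf-8") as saved:
    print(exit_code, saved.read().strip())
```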

Reply by
datacleanuser123

2011-04-10
15:17
Thanks kasper. I'll be happy to have such a feature. Great kudos to the whole DataCleaner team, and thanks to you for persistently answering everyone's questions.

Reply by
datacleanuser123

2011-04-11
19:54
Kasper

I used the '>' operator to write the command line console output to a file. I have a question regarding the output.

I used the Name standardizer in my analysis job. In the GUI I saw that if you run the Name standardizer on a name column, you get output columns for various combinations of given, last and MI name, and you can then save or preview the transformed data.

Here is my situation: I have a Name column in which first name, last name and MI name are combined, and I wanted to separate them. So I used the Name standardizer in an analysis job and ran DataCleaner on the command line, but the result does not show first name, last name and MI name separated. I wanted to see the 3 separate output columns (first name, last name and MI name) for the input column Name.

Is this the right way to perform name standardization?

Reply by
tech4

2011-04-11
20:23
Kasper,

Could you please give me an example of a datastore entry in conf.xml that connects to an Oracle DB?

I couldn't find one in the examples given online.

-tech4

Reply by
kasper

2011-04-12
07:10
Hi tech4,

True, there are no examples, because editing the conf.xml file by hand is still not properly documented, so we didn't want to give it a lot of focus yet. But here's an example:

<jdbc-datastore name="my_oracle_datastore">
<url>jdbc:oracle:thin:@localhost:1521:mydb</url>
<driver>oracle.jdbc.OracleDriver</driver>
<username>myuser</username>
<password>mypass</password>
</jdbc-datastore>
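
If such an entry is generated from a script rather than typed by hand, characters such as & or < in a password have to be escaped to keep conf.xml well formed. A sketch in Python, assuming only the element names shown in the example above; the helper name jdbc_datastore and the sample password are hypothetical.

```python
import xml.etree.ElementTree as ET


def jdbc_datastore(name, url, driver, username, password):
    """Build a jdbc-datastore element as text, escaping special characters."""
    element = ET.Element("jdbc-datastore", {"name": name})
    for tag, value in (
        ("url", url),
        ("driver", driver),
        ("username", username),
        ("password", password),
    ):
        ET.SubElement(element, tag).text = value
    return ET.tostring(element, encoding="unicode")


# The password is a made-up value chosen to contain characters that need escaping.
oracle_fragment = jdbc_datastore(
    "my_oracle_datastore",
    "jdbc:oracle:thin:@localhost:1521:mydb",
    "oracle.jdbc.OracleDriver",
    "myuser",
    "p&ss<word",
)
print(oracle_fragment)
```

The URL follows the Oracle thin driver form jdbc:oracle:thin:@host:port:SID, as in the example above.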

Reply by
tech4

2011-04-12
13:31
thanks kasper. it worked.

Reply by
lwdallas

2012-03-14
23:48
Kasper,

Has there been any update to the command line functionality in the current version of DataCleaner? This seems like a huge opportunity for success.

Reply by
kasper

2012-03-15
08:45
Hi lwdallas,

Yes there's been plenty of updates to the CLI. Take a look at the docs, or my recent blog post about doing "dq monitoring" using the command line: Now you can build your own DQ monitoring solution with DataCleaner.

Reply by
lwdallas

2012-03-15
16:27
Kasper!

Thanks for the very fast response. I just made a quick skim of that posting--very cool!

Lonnie

Reply by
paul

2012-05-11
18:59
I am having trouble getting the output written to a file. Why would it not create the output file? Running under Win 7.
Cmd: DataCleaner-Console.exe -job examples\PBD_contact_string.analysis.xml\ -ot SERIALIZED\ -of pbd.analysis.result.dat

I would actually like it in HTML, and tried -ot HTML\ also.

Reply by
kasper

2012-05-11
20:03
That looks entirely correct. What happens when you execute it? I would expect that it creates a new file called "pbd.analysis.result.dat" in your DC directory, correct?

Reply by
paul

2012-05-17
19:12
It displays the following on the console:


500 rows processed from table: Contact
1000 rows processed from table: Contact
996 rows processed from table: Contact
1002 rows processed from table: Contact
1500 rows processed from table: Contact
2000 rows processed from table: Contact
2500 rows processed from table: Contact
3000 rows processed from table: Contact
3500 rows processed from table: Contact
4000 rows processed from table: Contact
4500 rows processed from table: Contact
5000 rows processed from table: Contact
5500 rows processed from table: Contact
6000 rows processed from table: Contact
6500 rows processed from table: Contact
7000 rows processed from table: Contact
7500 rows processed from table: Contact
8000 rows processed from table: Contact
7998 rows processed from table: Contact
8026 rows processed from table: Contact
8500 rows processed from table: Contact
9055 rows processed from table: Contact
8935 rows processed from table: Contact
9018 rows processed from table: Contact
8999 rows processed from table: Contact
9093 rows processed from table: Contact
9500 rows processed from table: Contact
10000 rows processed from table: Contact
10520 rows processed from table: Contact
10426 rows processed from table: Contact
10500 rows processed from table: Contact
10483 rows processed from table: Contact
10561 rows processed from table: Contact
11000 rows processed from table: Contact
11500 rows processed from table: Contact
11472 rows processed from table: Contact
11594 rows processed from table: Contact
12066 rows processed from table: Contact
11976 rows processed from table: Contact
12028 rows processed from table: Contact
12500 rows processed from table: Contact
12377 rows processed from table: Contact
12504 rows processed from table: Contact
12455 rows processed from table: Contact
12508 rows processed from table: Contact
13000 rows processed from table: Contact
13500 rows processed from table: Contact
14000 rows processed from table: Contact
14500 rows processed from table: Contact
14479 rows processed from table: Contact
14513 rows processed from table: Contact
15000 rows processed from table: Contact
SUCCESS!

RESULT: ImmutableAnalyzerJob[name=null,analyzer=String analyzer]
Account_ID__c AssistantName AssistantPhone Birthdate Client_Survey_Recipient__c Cont1_RecID__c Cont2_RecID__c Contact_Type__c ContSupp__c CreatedDate Department DoNotCall Email Email_Opt_In__c EmailBouncedDate EmailBouncedReason Fax FirstName Freight_Letter__c HasOptedOutOfEmail Holiday_Card__c HomePhone If_Other_List_Lead_Source__c IsDeleted Jigsaw JigsawContactId Last_Date__c Last_User__c LastActivityDate LastCURequestDate LastCUUpdateDate LastModifiedDate LastName LeadSource LID__LinkedIn_Company_Id__c LID__LinkedIn_Member_Token__c LinkedIn__c Mailing__c MailingCity MailingCountry MailingPostalCode MailingState MailingStreet Memberships__c Merge_Codes__c MobilePhone Name Newsletter__c OtherCity OtherCountry OtherPhone OtherPostalCode OtherState OtherStreet Partner_Summit__c Phone Sales_Opt_In__c Salutation Status_Contact__c Survey_Opt_In__c Survey_Opt_Out__c SystemModstamp Thanksgiving_Call__c Title
Row count 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176 15176
Null count 15176 15109 15155 15176 0 13538 13539 14676 12264 0 14120 0 9340 0 15169 15169 7047 20 0 0 0 15161 14962 0 15176 15176 13897 13897 12487 15176 15176 0 0 12551 15174 15159 15144 15022 2135 7248 2318 2200 2217 15070 15156 14982 0 0 15055 15162 15130 15058 15055 15052 0 8823 0 13984 12688 0 0 0 0 1200
Entirely uppercase count 0 2 1 0 0 1638 1637 0 1292 0 27 0 0 0 0 0 0 199 0 0 0 1 13 0 0 0 0 1279 0 0 0 0 67 320 0 0 0 0 96 484 30 12822 95 30 0 1 55 0 1 14 0 0 114 0 0 10 0 27 1209 0 0 0 0 1013
Entirely lowercase count 0 1 8 0 15176 0 0 0 0 0 1 15176 5820 15176 0 0 2 8 15176 15176 15176 2 14 15176 0 0 0 0 0 0 0 0 12 72 0 0 25 0 8 0 0 24 4 0 0 0 6 15176 0 0 3 0 1 0 15176 783 15176 0 0 15176 15176 0 15176 6
Total char count 0 717 359 0 75791 24570 24555 3784 51851 409752 19832 75880 121729 75652 189 374 120571 86829 75813 75877 75364 213 3882 75880 0 0 34533 8004 72603 0 0 409752 101703 31730 11 170 2523 1158 118270 98931 79565 26467 265689 1550 240 2711 203688 75472 1051 49 648 670 256 2946 75782 94041 75786 3667 4976 75847 75880 409752 75821 223417
Max chars <null> 30 29 <null> 5 15 15 10 20 27 80 5 40 5 27 78 40 37 5 5 5 23 40 5 <null> <null> 27 8 27 <null> <null> 27 41 33 7 10 255 17 30 14 14 14 107 30 12 15 45 5 18 6 22 10 5 81 5 38 5 19 2 5 5 27 5 80
Min chars <null> 3 5 <null> 4 15 15 7 15 27 1 5 10 4 27 24 11 1 4 4 4 5 3 5 <null> <null> 27 4 27 <null> <null> 27 1 2 4 10 37 7 2 2 2 1 2 8 12 12 3 4 2 2 8 5 2 7 4 3 4 2 2 4 5 27 4 1
Avg chars <null> 10.7 17.1 <null> 4.99 15 15 7.57 17.81 27 18.78 5 20.86 4.98 27 53.43 14.83 5.73 5 5 4.97 14.2 18.14 5 <null> <null> 27 6.26 27 <null> <null> 27 6.7 12.09 5.5 10 78.84 7.52 9.07 12.48 6.19 2.04 20.5 14.62 12 13.97 13.42 4.97 8.69 3.5 14.09 5.68 2.12 23.76 4.99 14.8 4.99 3.08 2 5 5 27 5 15.99
Max white spaces <null> 3 3 <null> 0 3 2 0 2 1 14 0 0 0 1 13 2 3 0 0 0 4 6 0 <null> <null> 1 0 1 <null> <null> 1 7 3 0 0 0 2 4 1 2 2 20 4 1 1 8 0 1 0 3 0 0 11 0 7 0 2 0 0 0 1 0 12
Min white spaces <null> 0 0 <null> 0 0 0 0 0 1 0 0 0 0 1 4 0 0 0 0 0 1 0 0 <null> <null> 1 0 1 <null> <null> 1 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0
Avg white spaces <null> 0.72 0.76 <null> 0 0.15 0.62 0 0.13 1 1.6 0 0 0 1 8.29 1.01 0.14 0 0 0 1.27 1.85 0 <null> <null> 1 0 1 <null> <null> 1 0.04 1.13 0 0 0 1.05 0.35 0.94 0 0 3.2 1.69 1 0.98 1.18 0 0.19 0 1.04 0 0 3.42 0 1.19 0 0.01 0 0 0 1 0 1.62
Uppercase chars 0 121 2 0 0 11041 11957 500 15177 0 2138 0 40 0 0 12 0 17770 0 0 0 2 597 0 0 0 0 8004 0 0 0 0 17175 8123 0 79 145 178 18090 16294 100 25773 39437 496 40 1 34945 0 146 37 1 0 233 406 0 40 0 1284 1209 0 0 0 0 39086
Uppercase chars (excl. first letters) 0 54 0 0 0 9333 10233 0 12095 0 1265 0 28 0 0 3 0 2503 0 0 0 1 395 0 0 0 0 6725 0 0 0 0 1902 5349 0 67 145 24 4994 7829 68 12810 25282 388 20 0 18889 0 22 15 0 0 113 257 0 17 0 84 0 0 0 0 0 24592
Lowercase chars 0 533 142 0 75791 0 0 3284 2819 0 15805 75880 108393 75652 0 241 28 66087 75813 75877 75364 15 2493 75880 0 0 0 0 0 0 0 0 83202 18028 0 66 1551 340 94926 74306 15 615 121314 534 180 0 149289 75472 878 0 9 0 16 1451 75782 1691 75786 1266 0 75847 75880 0 75821 153703
Digit chars 0 3 140 0 0 4782 4656 0 21072 318696 13 0 248 0 147 36 86802 0 0 0 0 135 348 0 0 0 26859 0 56469 0 0 318696 3 2149 11 23 263 470 496 75 76231 37 56514 310 0 1940 3 0 0 0 458 654 5 572 0 65759 0 0 3767 0 0 318696 0 13
Diacritic chars 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Non-letter chars 0 63 215 0 0 13529 12598 0 33855 409752 1889 0 13296 0 189 121 120543 2972 0 0 0 196 792 0 0 0 34533 0 72603 0 0 409752 1326 5579 11 25 827 640 5254 8331 79450 79 104938 520 20 2710 19454 0 27 12 638 670 7 1089 0 92310 0 1117 3767 0 0 409752 0 30628
Word count 0 115 37 0 15176 1887 2640 500 3282 30352 2743 15176 5836 15176 14 65 16328 17280 15176 15176 15176 34 608 15176 0 0 2558 1279 5378 0 0 30352 15792 5591 2 17 32 316 17559 15353 12885 12981 54271 285 40 385 33072 15176 144 14 94 118 121 544 15176 13927 15176 1201 2488 15176 15176 30352 15176 36493
Max words <null> 4 4 <null> 1 4 3 1 3 2 15 1 1 1 2 14 3 4 1 1 1 5 7 1 <null> <null> 2 1 2 <null> <null> 2 8 4 1 1 1 3 5 2 3 2 21 5 2 2 9 1 2 1 4 1 1 10 1 8 1 3 1 1 1 2 1 13
Min words <null> 1 1 <null> 1 1 1 1 1 2 1 1 1 1 2 5 1 1 1 1 1 2 1 1 <null> <null> 2 1 2 <null> <null> 2 1 1 1 1 1 2 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 2 1 1

Reply by
kasper

2012-05-18
09:02
Can you try without the backslashes? I just tried the "examples\employees.analysis.xml" example job and it didn't seem to work correctly when I added backslashes. Actually I thought that was something supported by the OS and that application programmers did not have to worry about... Will have to read up on it since I rarely have used the backslash/multiline feature myself.

Reply by
kasper

2012-05-18
09:04
It seems that ending a line with a backslash to continue the command is a Unix shell feature. In Windows you can use the caret (^):

http://stackoverflow.com/questions/605686/windows-how-to-specify-multiline-command-on-command-prompt

But also remember to add a space between the caret and the actual arguments, or else DC will think the caret is a part of the argument values.
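
One way to sidestep shell-specific line continuation altogether is to launch DataCleaner from a script that passes each argument as its own list entry, so no backslash or caret can end up inside a value. A sketch in Python, using only the launcher name and the -job, -ot and -of options that appear in this thread; the function name build_command and the output file name result.html are hypothetical.

```python
def build_command(job_file, output_type, output_file,
                  executable="DataCleaner-Console.exe"):
    """Return the launcher call as a list with one entry per argument."""
    return [executable, "-job", job_file, "-ot", output_type, "-of", output_file]


# The job path is the example job mentioned above; result.html is a made-up name.
command = build_command(r"examples\employees.analysis.xml", "HTML", "result.html")
for argument in command:
    print(argument)

# To actually run it (requires DataCleaner in the current directory):
# import subprocess
# subprocess.run(command, check=True)
```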

Reply by
paul

2012-05-18
13:20
Thanks, removing the backslash altogether worked too. I did not realize that was being used for a line continuation. Nice to know how to use it in both now.

I used HTML output and noticed that AVG CHARS and AVG WHITE SPACE were displayed with a large number of decimals. You may want to check the rounding there so that it is consistent with the output above.

Thanks for the great product!

Paul
