XML stands for Extensible Markup Language. It is easy for both humans and machines to read, and because an XML file carries its own structure along with its data, it can serve as a small self-describing data store.
An XML file usually starts with the declaration <?xml version="1.0" encoding="UTF-8"?>, which states the XML version and the character encoding. The encoding tells the parser how to decode the bytes of the file, so changing it changes how special characters are interpreted.
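To make the effect of the encoding concrete, here is a minimal Python sketch using the standard library parser rather than Talend. The element name and the sample text are assumptions chosen for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-element document containing a special character.
BODY = "<city>Málaga</city>"

# Bytes and declaration agree: the parser decodes the text correctly.
latin1_doc = ('<?xml version="1.0" encoding="ISO-8859-1"?>' + BODY).encode("iso-8859-1")
utf8_doc = ('<?xml version="1.0" encoding="UTF-8"?>' + BODY).encode("utf-8")
latin1_text = ET.fromstring(latin1_doc).text
utf8_text = ET.fromstring(utf8_doc).text

# Bytes written as ISO-8859-1 but declared as UTF-8: the parser rejects them.
mismatched_doc = ('<?xml version="1.0" encoding="UTF-8"?>' + BODY).encode("iso-8859-1")
try:
    ET.fromstring(mismatched_doc)
    mismatch_parsed = True
except ET.ParseError:
    mismatch_parsed = False

print(latin1_text, utf8_text, mismatch_parsed)
```

The same characters are read correctly only when the declared encoding matches the bytes actually stored in the file.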
Note: Before reading any file make sure it is not password protected.
I am reading the file below.
The tFileInputXML component reads an XML structured file row by row, splits each row into fields, and sends the fields, as defined in the schema, to the next component.
The component has a few basic properties that need to be checked or unchecked so that the data is processed with the proper formatting.
- We need to add one column of type ‘Document’.
- In the ‘Loop XPath query’ option we need to provide the tag path within the XML file, e.g. “/”. A single forward slash selects the document root, so the file is read from beginning to end; we can also provide a more specific path such as “/root/value”.
- Under Mapping, in ‘XPath query’, we can similarly provide the “/” node value to fetch the values of all tags.
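The idea of a loop path plus per-field queries can be sketched outside Talend with Python's standard library. This is only an illustration: the sample document, the field names `id` and `name`, and the helper `read_rows` are assumptions, and ElementTree supports only a subset of XPath, so the loop path is written relative to the root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample file matching a loop path of /root/value.
SAMPLE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<root>
  <value><id>1</id><name>alpha</name></value>
  <value><id>2</id><name>beta</name></value>
</root>"""

def read_rows(xml_text, loop_path, fields):
    """Return one dict per node matched by loop_path (relative to the root)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    rows = []
    for node in root.findall(loop_path):  # plays the role of the loop query
        # Each field query is evaluated relative to the current loop node.
        rows.append({field: node.findtext(field) for field in fields})
    return rows

rows = read_rows(SAMPLE_XML, "./value", ["id", "name"])
for row in rows:
    print(row)
```

Each `value` element becomes one row, which mirrors how the loop element determines the number of rows sent to the next component.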
tXMLMap is similar to the tMap component. It is an advanced component fine-tuned for transforming and routing XML data flows (data of the Document type), especially when processing numerous XML data sources, with or without flat data to be joined.
In the tXMLMap component, if we already have an XML file, we can import it by right-clicking the doc (Document) column and selecting ‘Import from XML file’; the schema is then created automatically. We then have to set the loop element. In the example shown, the loop element is ‘value’, so iteration happens on the ‘value’ tag.
tAdvancedFileOutputXML outputs data to an XML type of file and offers an interface to deal with loop and group by elements if needed.
tAdvancedFileOutputXML can be used in place of tXMLMap. In the example shown, ‘entidad’ is set as the loop element, so iteration happens on this tag. ‘@id’ is an attribute of entidad, which means no sub-element can be added under it, whereas ‘direction’ is a sub-element of entidad and can have sub-elements of its own.
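The difference between an attribute and a sub-element can be sketched in Python with the standard library. Only ‘entidad’, ‘@id’ and ‘direction’ come from the text above; the root tag `entidades`, the child tags `street` and `city`, and the sample rows are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical input rows, one per loop iteration.
rows = [
    {"id": "1", "street": "Main St", "city": "Phoenix"},
    {"id": "2", "street": "High St", "city": "Sydney"},
]

root = ET.Element("entidades")  # hypothetical root tag
for row in rows:
    # One 'entidad' element is written per row: this is the loop element.
    entidad = ET.SubElement(root, "entidad")
    # '@id' is an attribute: a plain name/value pair with no children.
    entidad.set("id", row["id"])
    # 'direction' is a sub-element, so it can hold further sub-elements.
    direction = ET.SubElement(entidad, "direction")
    ET.SubElement(direction, "street").text = row["street"]
    ET.SubElement(direction, "city").text = row["city"]

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```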
tFileInputJSON extracts JSON data from a file and passes it on to a file, a database table, etc.
‘Edit schema’ contains all the columns. ‘Read By’ has three options, of which we use ‘JsonPath’. Check ‘Use Url’ if the JSON file needs to be fetched from a website; otherwise leave it unchecked. ‘Loop Json query’ appears because we selected ‘JsonPath’ under ‘Read By’; it holds the path to the looping node in the file (see the JSON file shown earlier).
In this file, each book has four attributes that we need to extract. Finally, tLogRow displays the output.
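As a rough Python sketch of the same extraction, the loop below walks to the list of books and keeps four attributes per book. The sample JSON, the path `store` → `book`, and the attribute names are assumptions, since the original file is not reproduced here; the helper `extract_books` is hypothetical.

```python
import json

# Hypothetical sample file; the real structure may differ.
SAMPLE_JSON = """
{"store": {"book": [
  {"category": "reference", "author": "A. Author", "title": "First Title", "price": 8.95},
  {"category": "fiction", "author": "B. Writer", "title": "Second Title", "price": 12.99}
]}}
"""

# The four attributes to extract, playing the role of the schema columns.
COLUMNS = ["category", "author", "title", "price"]

def extract_books(json_text):
    """Loop over every book node and keep only the schema columns."""
    data = json.loads(json_text)
    return [{col: book.get(col) for col in COLUMNS} for book in data["store"]["book"]]

books = extract_books(SAMPLE_JSON)
for book in books:
    # Print one delimited line per row, loosely like a log component would.
    print("|".join(str(book[col]) for col in COLUMNS))
```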
tFileOutputJSON receives data and rewrites it in a JSON structured data block in an output file.
Below is the file that we are going to convert into a JSON file.
‘Name of data block’ is the name that appears at the top of the JSON output; see the image below.
‘Edit schema’ holds all the columns that need to be mapped.
Output JSON file:
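The exact output depends on the job. As a minimal sketch of the general shape, assuming a semicolon-delimited input, a data block named `data`, and the hypothetical columns `id` and `name`, the conversion could look like this in Python:

```python
import csv
import io
import json

# Hypothetical flat input; the real file and delimiter may differ.
FLAT_TEXT = "id;name\n1;alpha\n2;beta\n"

def rows_to_json(flat_text, block_name, columns):
    """Wrap all rows in a single JSON object keyed by the data block name."""
    reader = csv.DictReader(io.StringIO(flat_text), delimiter=";")
    rows = [{col: record[col] for col in columns} for record in reader]
    return json.dumps({block_name: rows})

output = rows_to_json(FLAT_TEXT, "data", ["id", "name"])
print(output)
```

The data block name becomes the top-level key, and each input row becomes one object holding the mapped columns.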
If, while working with Talend, we come across an issue that we cannot resolve ourselves, we can raise it with the Talend community, whose team will help solve the problem.
Girikon is an IT service organization, headquartered in Phoenix, Arizona with presence across India and Australia. We provide cutting-edge Salesforce consulting services and solutions to help your business grow and achieve sustainable success.