Pdfreader-call
Definition
The PDFReaderCall plug-in is compatible with all PDF formats and is very simple to configure. It receives a binary stream containing the PDF file as input, or reads the file from the file system, and returns a corresponding XML structure as output.
A ChangeGVBufferNode operation or an XSL transformation can then be applied to the output of the PDFReaderCall plug-in to retrieve the data of interest.
GreenVulcano® ESB provides two different tools, GV Console® and VulCon®, to configure all supported plug-ins.
VulCon / GV Console Configuration
pdfreader-call is the operation that must be configured in the System section of VulCon® or GV Console® to convert a PDF file, held in the GVBuffer.object field or stored on the file system, into an XML document.
To add a pdfreader-call operation, define the following attributes:
Attribute | Type | Description |
---|---|---|
class | fixed | it.greenvulcano.gvesb.virtual.pdf.reader.GVPdfReaderCallOperation (Java class that manages the PDFReaderCall invocation). |
type | fixed | Must be set to call. |
name | required | Identifies the operation; this is the name used in the service definition. |
fileName | optional | PDF file name. Can contain placeholders to be decoded at runtime. If not defined, the PDF file content must be in the GVBuffer.object field. |
pageStart | optional | Starting page for conversion. Can contain placeholders to be decoded at runtime. If not defined, it is -1, meaning that only the PDF metadata is extracted. |
pageEnd | optional | Ending page for conversion. Can contain placeholders to be decoded at runtime. If not defined, it is -1, meaning that pages are extracted through the last page of the PDF. |
embedPDF | optional | If true, the input PDF file is embedded as base64 data in the output XML. Defaults to false. |
The following example shows the configuration generated from VulCon® or GV Console® when you configure a pdfreader-call operation:
<?xml version="1.0" encoding="UTF-8"?>
<GVSystems name="SYSTEMS" type="module">
    <Systems>
        <System id-system="SYSTEM-NAME" system-activation="on">
            <Channel id-channel="CHANNEL-NAME">
                <pdfreader-call class="it.greenvulcano.gvesb.virtual.pdf.reader.GVPdfReaderCallOperation"
                                name="ReadPDF" type="call" pageStart="1" pageEnd="1" embedPDF="true"/>
            </Channel>
        </System>
    </Systems>
</GVSystems>
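Going by the attribute table above, leaving out pageStart and pageEnd keeps both at -1, so only the PDF metadata should be extracted, and setting fileName makes the operation read the PDF from the file system instead of GVBuffer.object. A minimal sketch of such a variant follows; the operation name ReadPDFMetadata and the path /data/in/sample.pdf are hypothetical and only serve as illustration:

```xml
<!-- Hypothetical variant of the operation above (name and path are assumptions).
     pageStart and pageEnd are omitted, so per the attribute table only
     metadata is extracted; embedPDF is omitted, so it defaults to false. -->
<pdfreader-call class="it.greenvulcano.gvesb.virtual.pdf.reader.GVPdfReaderCallOperation"
                name="ReadPDFMetadata" type="call"
                fileName="/data/in/sample.pdf"/>
```

Like the generated example, this element would sit inside a Channel element of the System section.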
To use a pdfreader-call in a GreenVulcano® ESB service, define a node of type GVOperationNode in the Service section and set its operation-name field to the name given to the pdfreader-call operation.
The following example shows the configuration generated from VulCon® or GV Console® when you configure a pdfreader-call operation in a GreenVulcano® ESB service:
<?xml version="1.0" encoding="UTF-8"?>
<GVServices name="SERVICES" type="module">
    <Groups>
        <Group group-activation="on" id-group="DEFAULT_GRP"/>
    </Groups>
    <Services>
        <Service group-name="DEFAULT_GRP" id-service="SERVICE-NAME"
                 service-activation="on">
            <Client id-system="SYSTEM-NAME" statistics="off" system-activation="on">
                <Operation name="RequestReply" operation-activation="on"
                           out-check-type="none" type="operation">
                    <Participant id-channel="CHANNEL-NAME" id-system="SYSTEM-NAME"/>
                    <Flow first-node="pdf_reader" point-x="20" point-y="112">
                        <GVOperationNode class="it.greenvulcano.gvesb.core.flow.GVOperationNode"
                                         id="pdf_reader" id-system="SYSTEM-NAME"
                                         input="input" next-node-id="end"
                                         op-type="call"
                                         operation-name="ReadPDF"
                                         output="pdf_xml" point-x="158"
                                         point-y="112" type="flow-node"/>
                        <GVEndNode class="it.greenvulcano.gvesb.core.flow.GVEndNode"
                                   end-business-process="yes" id="end" op-type="end"
                                   output="pdf_xml" point-x="358" point-y="112"
                                   type="flow-node"/>
                    </Flow>
                </Operation>
            </Client>
        </Service>
    </Services>
</GVServices>
At this point you have configured a service with a pdfreader-call operation.
Example
This example shows the XML document generated from a simple PDF document:
<?xml version="1.0" encoding="UTF-8"?>
<pdf>
<metadata>
<page-count>5</page-count>
<title>FOP Development: RTFLib (jfor)</title>
<author/>
<subject>Apache FOP</subject>
<keywords/>
<creator/>
<producer>Apache FOP Version 0.94</producer>
<creation-date>2008-07-31T16:06:16+02:00</creation-date>
<modification-date/>
<trapped/>
<extra>
<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description xmlns:pdf="http://ns.adobe.com/pdf/1.3/" rdf:about="">
<pdf:PDFVersion>1.4</pdf:PDFVersion>
<pdf:Producer>Apache FOP Version 0.94</pdf:Producer>
<pdf:Creator>Apache Forrest - http://forrest.apache.org/</pdf:Creator>
</rdf:Description>
<rdf:Description xmlns:xmp="http://ns.adobe.com/xap/1.0/" rdf:about="">
<xmp:MetadataDate>2008-07-31T15:06:16+01:00</xmp:MetadataDate>
<xmp:CreateDate>2008-07-31T15:06:16+01:00</xmp:CreateDate>
</rdf:Description>
<rdf:Description xmlns:dc="http://purl.org/dc/elements/1.1/" rdf:about="">
<dc:date>2008-07-31T15:06:16+01:00</dc:date>
<dc:title>FOP Development: RTFLib (jfor)</dc:title>
<dc:description>Apache FOP</dc:description>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
</extra>
</metadata>
<pages end="1" start="1">
<page num="1"
>PDF created by Apache FOP
http://xmlgraphics.apache.org/fop/
FOP Development: RTFLib (jfor)
Version 627324
Table of contents
1 General Information............................................................................................................................. 2
1.1 Introduction.....................................................................................................................................2
1.2 History.............................................................................................................................................2
1.3 Status...............................................................................................................................................2
2 User Documentation.............................................................................................................................2
2.1 Overview.........................................................................................................................................2
2.2 Document Structure........................................................................................................................ 3
2.3 Attributes.........................................................................................................................................3
</page>
</pages>
<base64pdf>JVBERi0xLjQKJaqrrK0KNCAwIG9iago8PAovVGl0bGUgKEZPUCBEZXZlbG9wbWVudDogUlRGTGli
IFwoamZvclwpKQovU3ViamVjdCAoQXBhY2hlIEZPUCkKL1Byb2R1Y2VyIChBcGFjaGUgRk9QIFZl
cnNpb24gMC45NCkKL0NyZWF0aW9uRGF0ZSAoRDoyMDA4MDczMTE1MDYxNiswMScwMCcpCj4+CmVu
ZG9iago1IDAgb2JqCjw8IC9OIDMKL0xlbmd0aCAyMiAwIFIKL0ZpbHRlciAvRmxhdGVEZWNvZGUg
Cj4+CnN0cmVhbQp4nJ2Wd1RT2RaHz703vVCSEIqU0GtoUgJIDb1IkS4qMQkQSsCQACI2RFRwRFGR
pggyKOCAo0ORsSKKhQFRsesEGUTUcXAUG5ZJZK0Z37x5782b3x/3fmufvc/dZ+991roAkPyDBcJM
WAmADKFYFOHnxYiNi2dgBwEM8AADbADgcLOzQhb4RgKZAnzYjGyZE/gXvboOIPn7KtM/jMEA/5+U
uVkiMQBQmIzn8vjZXBkXyTg9V5wlt0/JmLY0Tc4wSs4iWYIyVpNz8ixbfPaZZQ858zKEPBnLc87i
ZfDk3CfjjTkSvoyRYBkX5wj4uTK+JmODdEmGQMZv5LEZfE42ACiS3C7mc1NkbC1jkigygi3jeQDg
SMlf8NIvWMzPE8sPxc7MWi4SJKeIGSZcU4aNkxOL4c/PTeeLxcwwDjeNI+Ix2JkZWRzhcgBmz/xZ
FHltGbIiO9g4OTgwbS1tvijUf138m5L3dpZehH/uGUQf+MP2V36ZDQCwpmW12fqHbWkVAF3rAVC7
...........
</base64pdf>
</pdf>
With a ChangeGVBufferNode, it is possible to parse this XML and retrieve any tag and its value.
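The other option mentioned in the Definition, an XSL transformation, can be sketched against the sample output above. The stylesheet below is only an illustration, not part of the plug-in: it assumes the structure shown in the example (pdf/metadata/title, pdf/metadata/page-count, pdf/pages/page with a num attribute), and the output element names summary, title, page-count and text are hypothetical choices:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative XSLT 1.0 sketch; output element names are assumptions. -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" encoding="UTF-8" indent="yes"/>

    <!-- Match the root element produced by pdfreader-call. -->
    <xsl:template match="/pdf">
        <summary>
            <!-- Copy two metadata values from the metadata section. -->
            <title><xsl:value-of select="metadata/title"/></title>
            <page-count><xsl:value-of select="metadata/page-count"/></page-count>
            <!-- Emit one element per extracted page, with whitespace collapsed. -->
            <xsl:for-each select="pages/page">
                <text page="{@num}">
                    <xsl:value-of select="normalize-space(.)"/>
                </text>
            </xsl:for-each>
        </summary>
    </xsl:template>
</xsl:stylesheet>
```

Because the only template matches /pdf and emits selected values, the base64pdf payload and the XMP data under extra are left out of the result. How the stylesheet is wired into a service flow depends on the transformation configuration of the installation and is not covered here.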