Difference between revisions of "DataHandler"

From GreenVulcano Wiki
Jump to: navigation, search
({{GVESB}} Configuration)
 
(14 intermediate revisions by 2 users not shown)
Line 1: Line 1:
 
==Description==
 
==Description==
  
The DataHandler is a component extremely performable and configurable, which deals with extraction, edition, insertion and deletion of data and execution of stored procedures/functions of one or more RDBMS using JDBC APIs. It is only available in the Enterprise version.
+
[[File:datahandlerArchitecture.jpg|thumb|DataHandler Architecture]]The DataHandler is an extremely powerful and configurable component, which deals with extraction and modification of data and execution of stored procedures/functions on one or more RDBMS using JDBC APIs. It is only available in the Enterprise edition.
  
This component works on XML defined by XSD. Through maps of transformation, the XML input is received and normalized into another XML, which defines the data types and values, necessary to perform operations on the database. Thereafter, the DataHandler performs operations on the database configured for the service. Depending on the type of operation configured, the DataHandler returns an output consisting in a report that describes the operation result, and a XML which, through a transformation map, is converted into a second XML defined by one XSD.
+
 
 +
This component is optimized to work with XML documents as input or output. Through transformation maps, the XML passed in is converted to an XML of commands, which defines data types and values​​, necessary to perform the update operations on the database.
 +
 
 +
 
 +
Depending on the type of operation configured, the DataHandler returns an output consisting of a report that describes the result of the operation, and a possible internal XML document that, through a transformation map, is converted into a specific XML document for the client.
  
 
=={{GVESB}} Configuration==
 
=={{GVESB}} Configuration==
  
The configuration of this component occurs in three steps:
+
The configuration of the Datahandler(DH) component occurs in three steps:
 
* Definition of XSD in input and output of the service;
 
* Definition of XSD in input and output of the service;
 
* Definition of maps of conversion from the format defined in the XSD created and vice versa;
 
* Definition of maps of conversion from the format defined in the XSD created and vice versa;
 
* Composition of select/update calls in SQL language for finding/changing data in the database.
 
* Composition of select/update calls in SQL language for finding/changing data in the database.
  
The language used for map transformation is XSLT 1.0.
+
You might perform a call to DataHandler by include into your workflow configuration a [[dh-call]] plugin.
  
[[File:datahandlerVSjdbc.jpg|thumb|DataHandler vs JDBC]]Here are some differences between the connector JDBC and DataHandler (DH):
+
The basic language used for map transformation is XSLT 1.0. The package includes Xalan, but it is also possible to add others XSLT processors which implement XSLT 2.0. 
 +
 
 +
===DataHandler vs JDBC===
 +
 
 +
[[File:datahandlerVSjdbc.jpg|thumb|DataHandler vs JDBC]]Here are some differences between the connector JDBC and DataHandler:
 
* The JDBC connector can generally operate on a single record, operations on multiple records must be controlled by an Iterator
 
* The JDBC connector can generally operate on a single record, operations on multiple records must be controlled by an Iterator
 
* The DataHandler can handle more operations by a single invocation, the operations are controlled by the stylesheet of data conversion
 
* The DataHandler can handle more operations by a single invocation, the operations are controlled by the stylesheet of data conversion
Line 23: Line 31:
 
     where last_name = <nowiki>'@{{LAST_NAME}}'</nowiki>
 
     where last_name = <nowiki>'@{{LAST_NAME}}'</nowiki>
 
     <nowiki>decode{{@{{CITY}}::NULL:: ::and city='@{{CITY}}'}}</nowiki><br/>
 
     <nowiki>decode{{@{{CITY}}::NULL:: ::and city='@{{CITY}}'}}</nowiki><br/>
     If the parameter CITY is not populated the corresponding where clause
+
     If the parameter CITY is not populated the corresponding where clause
 
     will not be included in the statement
 
     will not be included in the statement
 +
 
* The DataHandler can perform multiple operations depending on data:
 
* The DataHandler can perform multiple operations depending on data:
 
** deleting a record on Table A -> removal of records matching the table B
 
** deleting a record on Table A -> removal of records matching the table B
Line 34: Line 43:
 
** validation procedure (call statement)
 
** validation procedure (call statement)
 
** extraction results validation (select statement)
 
** extraction results validation (select statement)
 +
[[File:datahandlerVSjdbc2.jpg|thumb|DataHandler vs JDBC]]
 
* The DataHandler can operate on multiple tables in a single invocation.
 
* The DataHandler can operate on multiple tables in a single invocation.
 
* The DH, using XA compliant JDBC connections, can operate on multiple databases in a single invocation.
 
* The DH, using XA compliant JDBC connections, can operate on multiple databases in a single invocation.
* The DH during transformations of input/output data (before of insert/update/delete and after of select), can use the helpers presents in the XSL maps that allow to perform SQL queries or scripts JavaScript and its result can be used to enhance the input/output data:
+
* The DataHandler during transformations of input/output data (before of insert/update/delete and after of select), can use the helpers presents in the XSL maps that allow to perform SQL queries or scripts JavaScript and its result can be used to enhance the input/output data:
     Insert: XML with the city names -> database with numeric id => SQL retriever that maps name  
+
     Insert: XML with the city names  
                                    -> id for inserts/updates and id -> name for the select
+
      -> database with numeric id => SQL retriever that maps name  
 +
      -> id for inserts/updates and id -> name for the select
 
* The DataHandler can perform a series of parallel data extraction into tables, or with groupings, diferents, aggregate the partial results and send the aggregate to the DTE for an eventual data transformation.
 
* The DataHandler can perform a series of parallel data extraction into tables, or with groupings, diferents, aggregate the partial results and send the aggregate to the DTE for an eventual data transformation.
 
* The DataHandler can perform a series of data extraction (also from different databases) + XML transformation (different from each other) and then add the result (via XPath) into a single document to be returned to the caller.
 
* The DataHandler can perform a series of data extraction (also from different databases) + XML transformation (different from each other) and then add the result (via XPath) into a single document to be returned to the caller.
Line 48: Line 59:
 
** Error Statement
 
** Error Statement
 
** Security Error
 
** Security Error
 +
 +
===Data flow===
 +
 +
[[File:DataHandlerFlow.jpg|thumb|DataHandler Data flow]]
 +
 +
With DataHandler the flow of data might occur in two ways:
 +
 +
* From XML to Database: valid for processes of data insertion and/or update
 +
* From Database to XML: for extracting processes.
 +
 +
 +
Next will be shown an example of internal XML and its description:
 +
 +
<syntaxhighlight lang="XML"><?xml version="1.0" encoding="UTF-8"?>
 +
<RowSet>
 +
    <data id="0" key_1="ASTOR" key_2="20120325" key_3="ITA" key_4="HIVAL">
 +
        <row id="0">
 +
            <col format="yyyyMMdd HH:mm:ss" type="timestamp">20120325 00:00:00</col>
 +
            <col decimal-separator="," format="#0.00" grouping-separator="."     
 +
                type="float">10000,00</col>
 +
            <col decimal-separator="," format="#,##0.###" grouping-separator="."
 +
                type="float">1</col>
 +
        </row>
 +
    </data>
 +
    <data id="0" key_1="MGP" key_2="20120325" key_3="BRNN" key_4="GREC">
 +
        <row id="0">
 +
            <col format="yyyyMMdd HH:mm:ss" type="timestamp">20120325 00:00:00</col>
 +
            <col decimal-separator="," format="#0.00" grouping-separator="."
 +
                type="float">10000,00</col>
 +
            <col decimal-separator="," format="#,##0.###" grouping-separator="."
 +
                type="float">1</col>
 +
        </row>
 +
    </data>
 +
</RowSet>
 +
</syntaxhighlight>
 +
 +
* The '''data''' element is a grouping of records and it is significant only in data extraction services.
 +
* The '''id''' attribute identifies the statement used for the extraction of data or the statement to be used for modifing/inserting data.
 +
* The '''key_xx''' attributes identify the actual values ​​of aggregation keys.
 +
* The '''row''' element represents a record in data extraction services or the container of the parameters for a service of insertion/update data.
 +
* The '''col''' element represents a record field in services of data extraction or a parameter for a service of data insertion / update.
 +
* The '''type''' attribute identifies the type of the containing value and it is used for format conversion of data to be read or inserted:
 +
** string (default)
 +
** timestamp
 +
** numeric
 +
** float
 +
** long-string (CLOB)
 +
** base64 (BLOB)
 +
** binary (string in BLOB)
 +
 +
: For the timestamp type it can be defined the format attribute containing the conversion pattern  string <-> date/datehour/timestamp. There are used the same conventions as in SimpleDateFormat Java class. The default value is yyyyMMdd HH:mm:ss
 +
 +
: For the  float type will be used the same conventions of the DecimalFormat java class. It can be defined the following attributes describing the conversion string <-> floating/fixed point
 +
::format  (conversion pattern, default: #,##0.###)
 +
::decimal-separator  (decimal separator, default:  ,)
 +
::grouping-separator (thousands separator, default:  .)
 +
 +
The DataHandler elementary operations that may be combining in a service are represented by the [[DBOBuilder#DBO|DBOs]].

Latest revision as of 10:43, 2 April 2012

Description

DataHandler Architecture

The DataHandler is an extremely powerful and configurable component, which deals with extraction and modification of data and execution of stored procedures/functions on one or more RDBMS using JDBC APIs. It is only available in the Enterprise edition.


This component is optimized to work with XML documents as input or output. Through transformation maps, the XML passed in is converted to an XML of commands, which defines data types and values​​, necessary to perform the update operations on the database.


Depending on the type of operation configured, the DataHandler returns an output consisting of a report that describes the result of the operation, and a possible internal XML document that, through a transformation map, is converted into a specific XML document for the client.

GreenVulcano® ESB Configuration

The configuration of the Datahandler(DH) component occurs in three steps:

  • Definition of XSD in input and output of the service;
  • Definition of maps of conversion from the format defined in the XSD created and vice versa;
  • Composition of select/update calls in SQL language for finding/changing data in the database.

You might perform a call to DataHandler by include into your workflow configuration a dh-call plugin.

The basic language used for map transformation is XSLT 1.0. The package includes Xalan, but it is also possible to add others XSLT processors which implement XSLT 2.0.

DataHandler vs JDBC

DataHandler vs JDBC

Here are some differences between the connector JDBC and DataHandler:

  • The JDBC connector can generally operate on a single record, operations on multiple records must be controlled by an Iterator
  • The DataHandler can handle more operations by a single invocation, the operations are controlled by the stylesheet of data conversion
  • The JDBC connector can execute the statement configured as it appears
  • The DataHandler can process a statement opportunely modified by metadata.
   select first_name, last_name, city from anagrafic
   where last_name = '@{{LAST_NAME}}'
   decode{{@{{CITY}}::NULL:: ::and city='@{{CITY}}'}}
If the parameter CITY is not populated the corresponding where clause will not be included in the statement
  • The DataHandler can perform multiple operations depending on data:
    • deleting a record on Table A -> removal of records matching the table B
    • inserting a record on Table A -> insert a record equivalent of an audit table C
  • Both components are used to extract the data into an internal XML format, the DH allows to aggregate results of multiple select into a single service.
  • The DH returns, in the internal XML, information about the type of fields extracted (numeric, string, date, c/blob) and its precision (date format, decimal/whole cipher, separators). That information can be used in subsequent manipulations/transformations of the data.
  • Both components can invoke the stored procedure/function, but DH allows a best control of the parameters IN/OUT and the number /sequence of procedures to be invoked, even before or after other statements typology:
    • data entry (insert statement)
    • validation procedure (call statement)
    • extraction results validation (select statement)
DataHandler vs JDBC
  • The DataHandler can operate on multiple tables in a single invocation.
  • The DH, using XA compliant JDBC connections, can operate on multiple databases in a single invocation.
  • The DataHandler during transformations of input/output data (before of insert/update/delete and after of select), can use the helpers presents in the XSL maps that allow to perform SQL queries or scripts JavaScript and its result can be used to enhance the input/output data:
   Insert: XML with the city names 
      -> database with numeric id => SQL retriever that maps name 
      -> id for inserts/updates and id -> name for the select
  • The DataHandler can perform a series of parallel data extraction into tables, or with groupings, diferents, aggregate the partial results and send the aggregate to the DTE for an eventual data transformation.
  • The DataHandler can perform a series of data extraction (also from different databases) + XML transformation (different from each other) and then add the result (via XPath) into a single document to be returned to the caller.
  • The DataHandler backs to the caller (application) a report about the operation performed: number of read/inserted/updated/discarded records and, for those discarded, also a description of the discard cause (eg. invalid data, an empty required field, key violation)
  • The DataHandler can be configured for each information change service, to behave in an appropriate manner depending on the type of the error found in the interaction with the DB (are blocked by default):
    • Platform Error
    • Data Error
    • Constraint Error
    • Error Statement
    • Security Error

Data flow

DataHandler Data flow

With DataHandler the flow of data might occur in two ways:

  • From XML to Database: valid for processes of data insertion and/or update
  • From Database to XML: for extracting processes.


Next will be shown an example of internal XML and its description:

<?xml version="1.0" encoding="UTF-8"?>
<RowSet>
    <data id="0" key_1="ASTOR" key_2="20120325" key_3="ITA" key_4="HIVAL">
        <row id="0">
            <col format="yyyyMMdd HH:mm:ss" type="timestamp">20120325 00:00:00</col>
            <col decimal-separator="," format="#0.00" grouping-separator="."      
                 type="float">10000,00</col>
            <col decimal-separator="," format="#,##0.###" grouping-separator="." 
                 type="float">1</col>
        </row>
    </data>
    <data id="0" key_1="MGP" key_2="20120325" key_3="BRNN" key_4="GREC">
        <row id="0">
            <col format="yyyyMMdd HH:mm:ss" type="timestamp">20120325 00:00:00</col>
            <col decimal-separator="," format="#0.00" grouping-separator="." 
                 type="float">10000,00</col>
            <col decimal-separator="," format="#,##0.###" grouping-separator="." 
                 type="float">1</col>
        </row>
     </data>
</RowSet>
  • The data element is a grouping of records and it is significant only in data extraction services.
  • The id attribute identifies the statement used for the extraction of data or the statement to be used for modifing/inserting data.
  • The key_xx attributes identify the actual values ​​of aggregation keys.
  • The row element represents a record in data extraction services or the container of the parameters for a service of insertion/update data.
  • The col element represents a record field in services of data extraction or a parameter for a service of data insertion / update.
  • The type attribute identifies the type of the containing value and it is used for format conversion of data to be read or inserted:
    • string (default)
    • timestamp
    • numeric
    • float
    • long-string (CLOB)
    • base64 (BLOB)
    • binary (string in BLOB)
For the timestamp type it can be defined the format attribute containing the conversion pattern string <-> date/datehour/timestamp. There are used the same conventions as in SimpleDateFormat Java class. The default value is yyyyMMdd HH:mm:ss
For the float type will be used the same conventions of the DecimalFormat java class. It can be defined the following attributes describing the conversion string <-> floating/fixed point
format (conversion pattern, default: #,##0.###)
decimal-separator (decimal separator, default: ,)
grouping-separator (thousands separator, default: .)

The DataHandler elementary operations that may be combining in a service are represented by the DBOs.