DataHandler

From GreenVulcano Wiki
Revision as of 09:51, 30 March 2012 by Anonymous (talk) (Data flow)
Jump to: navigation, search

Description

DataHandler Architecture

The DataHandler is a component extremely performable and configurable, which deals with extraction, edition, insertion and deletion of data and execution of stored procedures/functions of one or more RDBMS using JDBC APIs. It is only available in the Enterprise version.


This component works on XML defined by XSD. Through maps of transformation, the XML input is received and normalized into another XML, which defines the data types and values, necessary to perform operations on the database.


Thereafter, the DataHandler performs operations on the database configured for the service. Depending on the type of operation configured, the DataHandler returns an output consisting in a report that describes the operation result, and a XML which, through a transformation map, is converted into a second XML defined by one XSD.

GreenVulcano® ESB Configuration

The configuration of the Datahandler(DH) component occurs in three steps:

  • Definition of XSD in input and output of the service;
  • Definition of maps of conversion from the format defined in the XSD created and vice versa;
  • Composition of select/update calls in SQL language for finding/changing data in the database.

The language used for map transformation is XSLT 1.0.

Data flow

DataHandler Data flow

With DataHandler the flow of data might occur in two ways:

  • From XML to Database: valid for processes of data insertion and/or update
  • From Database to XML: for extracting processes.


Next will be shown an example of internal XML and its description:

<?xml version="1.0" encoding="UTF-8"?>
<RowSet>
    <data id="0" key_1="MGP" key_2="20120325" key_3="AUST" key_4="NORD">
        <row id="0">
            <col format="yyyyMMdd HH:mm:ss" type="timestamp">20120325 00:00:00</col>
            <col decimal-separator="," format="#0.00" grouping-separator="."      
                 type="float">10000,00</col>
            <col decimal-separator="," format="#,##0.###" grouping-separator="." 
                 type="float">1</col>
        </row>
    </data>
    <data id="0" key_1="MGP" key_2="20120325" key_3="BRNN" key_4="GREC">
        <row id="0">
            <col format="yyyyMMdd HH:mm:ss" type="timestamp">20120325 00:00:00</col>
            <col decimal-separator="," format="#0.00" grouping-separator="." 
                 type="float">10000,00</col>
            <col decimal-separator="," format="#,##0.###" grouping-separator="." 
                 type="float">1</col>
        </row>
     </data>
</RowSet>
  • The data element is a grouping of records and it is significant only in data extraction services.
  • The id attribute identifies the statement used for the extraction of data or the statement to be used for modifing/inserting data.
  • The key_xx attributes identify the actual values ​​of aggregation keys.
  • The row element represents a record in data extraction services or the container of the parameters for a service of insertion/update data.
  • The col element represents a record field in services of data extraction or a parameter for a service of data insertion / update.
  • The type attribute identifies the type of the containing value and it is used for format conversion of data to be read or inserted:
    • string (default)
    • timestamp
    • numeric
    • float
    • long-string (CLOB)
    • base64 (BLOB)
    • binary (string in BLOB)
For the timestamp type it can be defined the format attribute containing the conversion pattern string <-> date/datehour/timestamp. There are used the same conventions as in SimpleDateFormat Java class. The default value is yyyyMMdd HH:mm:ss
For the float type it can be defined the following attributes describing the conversion string <-> floating/fixed point
  • format (conversion pattern, default: #,##0.###)
  • decimal-separator (decimal separator, default: ,)
  • grouping-separator (thousands separator, default: .)
Will be used the same conventions of the DecimalFormat java class.

DataHandler vs JDBC

DataHandler vs JDBC

Here are some differences between the connector JDBC and DataHandler:

  • The JDBC connector can generally operate on a single record, operations on multiple records must be controlled by an Iterator
  • The DataHandler can handle more operations by a single invocation, the operations are controlled by the stylesheet of data conversion
  • The JDBC connector can execute the statement configured as it appears
  • The DataHandler can process a statement opportunely modified by metadata.
   select first_name, last_name, city from anagrafic
   where last_name = '@{{LAST_NAME}}'
   decode{{@{{CITY}}::NULL:: ::and city='@{{CITY}}'}}
If the parameter CITY is not populated the corresponding where clause will not be included in the statement
  • The DataHandler can perform multiple operations depending on data:
    • deleting a record on Table A -> removal of records matching the table B
    • inserting a record on Table A -> insert a record equivalent of an audit table C
  • Both components are used to extract the data into an internal XML format, the DH allows to aggregate results of multiple select into a single service.
  • The DH returns, in the internal XML, information about the type of fields extracted (numeric, string, date, c/blob) and its precision (date format, decimal/whole cipher, separators). That information can be used in subsequent manipulations/transformations of the data.
  • Both components can invoke the stored procedure/function, but DH allows a best control of the parameters IN/OUT and the number /sequence of procedures to be invoked, even before or after other statements typology:
    • data entry (insert statement)
    • validation procedure (call statement)
    • extraction results validation (select statement)
DataHandler vs JDBC
  • The DataHandler can operate on multiple tables in a single invocation.
  • The DH, using XA compliant JDBC connections, can operate on multiple databases in a single invocation.
  • The DataHandler during transformations of input/output data (before of insert/update/delete and after of select), can use the helpers presents in the XSL maps that allow to perform SQL queries or scripts JavaScript and its result can be used to enhance the input/output data:
   Insert: XML with the city names -> database with numeric id => SQL retriever that maps name 
                                   -> id for inserts/updates and id -> name for the select
  • The DataHandler can perform a series of parallel data extraction into tables, or with groupings, diferents, aggregate the partial results and send the aggregate to the DTE for an eventual data transformation.
  • The DataHandler can perform a series of data extraction (also from different databases) + XML transformation (different from each other) and then add the result (via XPath) into a single document to be returned to the caller.
  • The DataHandler backs to the caller (application) a report about the operation performed: number of read/inserted/updated/discarded records and, for those discarded, also a description of the discard cause (eg. invalid data, an empty required field, key violation)
  • The DataHandler can be configured for each information change service, to behave in an appropriate manner depending on the type of the error found in the interaction with the DB (are blocked by default):
    • Platform Error
    • Data Error
    • Constraint Error
    • Error Statement
    • Security Error