appRules
  • Welcome to appRulesĀ®
  • PLATFORM
    • appRulesĀ® Concepts & Features
    • Platform Overview
    • Platform Features
      • Sourced Values
        • Overview
        • Sourced Values List
      • Actions
        • Overview
        • Action Categories
          • Actions-DataSource
            • Accounting
            • AIMachineLearning
            • Analytics
            • ApplicationPlatforms
            • BigData
            • CalendarsAndContactMgrs
            • CloudStorage
            • Collaboration
            • CRM
            • CustomerService
            • Databases
            • DataWarehouse
            • ECommercePlatforms
            • EmailSystems
            • ERP
            • ExpenseManagement
            • Files
            • Financial
            • Marketing
            • Miscelleanous
            • NoSQL
            • PaymentProcessing
            • PLM
            • ProjectManagement
            • Search
            • Services
            • ShippingAndMailing
            • Social
            • Spreadsheets
          • Actions-Execution
          • Actions-Fields-Field
          • Actions-Fields-Library-Financial
          • Actions-Fields-Library-Math
          • Actions-Utility-Automation
          • Actions-Utility-FileSystem
      • Conditions
        • Overview
        • Condition Categories
          • Conditions-Cache
          • Conditions-Custom
          • Conditions-Fields
          • Conditions-FileSystem
          • Conditions-Records
          • Conditions-Results
          • Conditions-Workflow
      • Workflow Activities
        • Control Flow Activities
        • Base Activities
        • Dialog Activities
        • Base Data Activities
      • System Settings
        • Runtime Settings
        • Users
        • Data Sources
          • Overview
          • Configuring Connection Settings
          • Generating Metadata
          • Connection Settings
            • Accounting
              • Exact Online
              • Raisers Edge NXT
              • Xero
              • Quickbooks
              • Quickbooks Online
              • Quickbooks POS
              • Freshbooks
              • Sage Business Cloud Accounting
              • Sage Intacct
              • Sage50 UK
              • Reckon
            • Application Platforms
              • Alfresco Platform
              • Kintone Platform
              • Quickbase Platform
              • Veeva Vault
            • AI & Machine Learning
              • DataRobot
              • Salesforce Einstein
              • Splunk
            • Analytics
              • Google Analytics
              • Adobe Analytics
                • Adobe Custom oAuth App
              • Azure Analysis Services
              • SAS
              • Youtube Analytics
            • Big Data
              • Apache HBase
              • Apache Hive
              • Apache HDFS
              • Apache Spark
              • Greenplum
              • HPCC LexisNexis
            • Calendar & Contact Managers
              • Google Calendar
              • Google Contacts
            • Cloud Storage
              • Azure Table Storage
              • Amazon S3
              • IBM Cloud Object Storage
              • Wasabi
            • Collaboration
              • Airtable
              • Asana
              • Box
              • Dropbox
              • Exchange
              • Google Drive
              • Microsoft Teams
              • Office365
              • Microsoft OneDrive
              • Microsoft OneNote
              • Salesforce Chatter
              • SharePoint
              • Slack
              • Smartsheet
            • CRM
              • Act! CRM
              • Bullhorn CRM
              • Microsoft Dynamics 365 Sales
              • Microsoft Dynamics CRM
              • Highrise
              • Oracle Sales Cloud
              • Salesforce
              • SAP Hybris C4C
              • Sugar CRM
              • Suite CRM
              • Veeva CRM
              • Zoho CRM
            • Customer Service & Support
              • ServiceNow
              • Zendesk
              • Jira
            • Data Warehouse
              • Azure Data Lake Storage Gen1
              • Azure Data Lake Storage Gen2
              • Google BigQuery
              • Redshift
              • Databricks
              • Snowflake
              • Azure Synapse
            • E-Commerce Platforms
              • Amazon MarketPlace
              • Magento
              • Shopify
              • Woocommerce
            • Electronic Signature
              • Docusign
            • Email
              • Gmail
              • Email
            • ERP
              • NetSuite
              • Odoo
              • Acumatica
              • Dynamics NAV
              • Dynamics GP
              • Microsoft Dynamics 365 FinOp
              • Microsoft Dynamics365 Business Central
                • Business Central Endpoints
              • SAP Business One
              • SAP ERP
            • Expense Management
              • SAP Concur
            • Files
              • CSV Database
              • Fixed Length File
              • Delimited File
            • Marketing
              • Bing Ads
              • Marketo
              • MailChimp
              • HubSpot
              • Eloqua
              • Salesforce Marketing Cloud
            • NoSQL
              • Cassandra
                • Cassandra Advanced Settings
              • Azure CosmosDB
              • DynamoDB
              • Couchbase
              • MongoDB
              • IBM Cloudant
              • Redis
            • Payment Processing
              • Authorize.Net
              • Square
              • Paypal
              • Stripe
            • Product Lifecycle Management (PLM)
              • Propel
            • Relational Databases
              • appRules Embedded Database
              • Firebird
              • IBM DB2
              • IBM Informix
              • Microsoft Access
              • MySQL
              • MariaDB
              • Oracle
              • PostgreSQL
              • Microsoft SQL Server
              • Microsoft SQL Server CE
              • SQL Azure
              • SQLlite
              • Vertica
            • Search
              • ElasticSearch
              • Bing
              • Google Search
            • Shipping and Mailing
              • Fedex
              • UPS
              • USPS
            • Social Networks
              • Linkedin
              • Facebook
              • Instagram
              • Twitter
            • Services
              • REST
              • RSS
              • XML
                • Connecting to XML Datasources
                • Parsing Hierarchical Data
                • Raw Data
              • Active Directory
              • Amazon Athena
              • Microsoft Dataverse
              • JSON
              • LDAP
              • GraphQL
            • Spreadsheets
              • Excel
              • Excel Services
              • Excel Online
              • GoogleSheets
          • SQL Compliance
            • Accounting
              • Exact Online
              • Raisers Edge NXT
              • Xero
              • Quickbooks
              • Quickbooks Online
              • QuickBooks POS
              • Freshbooks
              • Reckon
              • Sage Intacct
              • Sage Business Cloud Accounting
              • Sage50 UK
            • AI & Machine Learning
              • Salesforce Einstein
              • DataRobot
              • Splunk
            • Analytics
              • Google Analytics
              • Youtube Analytics
              • Azure Analysis Services
              • Adobe Analytics
              • SAS
            • Application Platforms
              • Alfresco Platform
              • Kintone Platform
              • Quick Base Platform
              • Veeva Vault
            • Big Data
              • Apache HBase
              • Apache HDFS
              • Apache Hive
              • Apache Spark
              • Greenplum
              • HPCC Systems
            • Calendar & Contact Managers
              • Google Calendar
              • Google Contacts
            • Cloud Storage
              • Azure Table Storage
              • Amazon S3
              • Wasabi
              • IBM Cloud Object Storage
            • Collaboration
              • Airtable
              • Asana
              • Box
              • Dropbox
              • Google Drive
              • Microsoft Teams
              • Microsoft OneDrive
              • Microsoft OneNote
              • Microsoft Exchange
              • Office365
              • Salesforce Chatter
              • SharePoint
              • Slack
              • Smartsheet
            • CRM
              • Act! CRM
              • Bullhorn CRM
              • Highrise
              • Microsoft Dynamics CRM
              • Microsoft Dynamics 365 Sales
              • Oracle Sales Cloud
              • Salesforce
              • SAP Hybris C4C
              • Sugar CRM
              • Veeva CRM
              • Zoho CRM
            • Customer Service & Support
              • ServiceNow
              • Zendesk
              • Jira
            • E-commerce Platforms
              • Woocommerce
              • Magento
              • Shopify
              • Amazon Marketplace
            • DataWarehouse
              • Azure Data Lake Storage Gen1
              • Azure Data Lake Storage Gen2
              • Azure Synapse
              • Databricks
              • Google BigQuery
              • Redshift
              • Snowflake
            • Email
              • Gmail
            • ERP
              • Acumatica
              • NetSuiteSQL
              • NetSuiteTalk
              • Odoo
              • Dynamics NAV
              • Dynamics GP
              • Microsoft Dynamics 365 FinOp
              • Microsoft Dynamics365 Business Central
              • SAP Business One
              • SAP NetWeaver
              • Workday
            • Expense Management
              • SAP Concur
            • Files
              • CSV Database
            • Marketing
              • Bing Ads
              • Google Ads
              • HubSpot
              • MailChimp
              • Marketo
              • Oracle Eloqua
              • Salesforce Marketing Cloud
              • Survey Monkey
              • Youtube 2020
            • NoSQL
              • Cassandra
              • Azure CosmosDB
              • Amazon SimpleDB
              • Amazon DynamoDB
              • Couchbase
              • MongoDB
              • IBM Cloudant
              • Redis
            • Payment Processing
              • Authorize.Net
              • Paypal
              • Stripe
              • Square
            • Product Lifecycle Management (PLM)
              • Propel
            • Project Management
              • Basecamp
              • Microsoft Project
            • Search
              • ElasticSearch
              • Bing
              • Google Search
            • Shipping and Mailing
              • FedEx
              • UPS
              • USPS
            • Spreadsheets
              • Excel Online
              • Excel
              • Excel Services
              • GoogleSheets
            • Social Networks
              • Facebook
              • Instagram
              • Linkedin
              • Twitter
            • Services
              • Active Directory
              • Amazon Athena
              • JSON
              • LDAP
              • Microsoft Dataverse
              • OData
              • REST
              • RSS
              • XML
        • Projects
          • Overview
          • Project Settings
      • Utilities
        • DataSource Browser
        • Job Monitor
        • Logs/Statistics
        • CheckIn/CheckOut
        • Master Admin Options
      • Creating & Running Projects
        • Creating a New Project
        • Orchestrating Workflows with Designer
          • Designer Home Page Overview
          • Using the appRules Designer
        • Reading Data
        • Mapping Data
        • Writing Data
        • Project Validation
        • Running Projects
          • Running from the Home Page
          • Using the Web API
          • Using the Scheduler
        • Monitoring Jobs
        • Logging
      • Flow Data Migration
        • Overview
        • Basic Flow Data Migrations
        • Bulk Loader Flow Data Migrations
    • Self-Hosting
      • Software Installation
        • Installation Prerequisites
        • Running the Installer
        • HostService Settings
        • IIS Settings
        • Web API Settings
        • Scheduler
        • Creating a New Project Database
      • Application Configuration
      • appRules Embedded Database DBA Tool
  • PLATFORM PRODUCTS
    • appRules Data Migration Engine
      • Overview
      • Flow Data Migration
      • The Migrate Data Activity
        • Overview
        • Initializing the Migrate Data Activity
        • Target Entities Collection
          • Overview
          • Toolbar
          • Target Entity Properties
        • Basic Data Migration
        • Matching Schema Migration
        • One-To-Many Migration
        • Entities to Csv Migration
        • Data Replication
        • Data Synchronization
        • Bidirectional Data Synchronization
      • Field Mapping & Transformation
      • Reviewing the Project Settings
      • Validations, Error Handling & Notifications
      • Job Execution & Monitoring
      • Evaluating Data Migration Job Results
    • appRules Integration & Automation Engine
      • Overview
      • Flow Data Migration
      • Process Automation, Business Rules & Decision Support
        • Overview
        • Control Flow Activities
          • If
          • While
        • Conditions & Actions
          • Overview
          • Conditions
          • Actions
        • Evaluating Conditions
        • Performing Actions
        • Decision Tables
        • Executing Rule Sets
    • appRules Enterprise 360
      • Overview
      • Data Migration
      • Integration & Automation
  • Samples
    • Samples Overview
    • Sample Projects
      • Data Migration, Replication & Synchronization
        • Replicate Dynamics365 In Snowflake
        • Migrate Parent/Child Records Using Target / Bulk Loader
        • Export SQL Table to CSV Using Flow
        • One-to-Many Data Migration
        • Batch Data Source Entities to Files
      • Integration & Automation
        • Decision Table - Customer Discount
        • Iterate Deleted Salesforce Records
        • Use CSV Folder As SQL Database
        • Conditional Actions Execution
        • Round Robin Lead Assignment
      • Snippets
        • Snippet - Arguments, Variables, Dynamic Values & Sourced Values
        • Snippet - Counters & Running Totals
        • Snippet - Executing Financial Functions
        • Snippet - Infinite Loop
        • Using Composite Keys
      • Miscellaneous
        • Custom Queries
        • Parent-Child: Child Project
        • Parent-Child: Parent Project
        • Scheduled System Maintenance
        • Target Activity Errors & Logs
        • Task Parallelism
  • APPENDIX
    • Miscellaneous
      • Common Connection Properties
      • Troubleshooting a Connection
      • Dynamics365 - Connection Settings
      • Dynamics365 App Registration
      • Getting the oAuth Access and Refresh Tokens
      • Editing Windows Environment Variables
      • PickLists
      • Creating a Custom Azure oAuth App
      • Using Kerberos
      • Stored Procedures
    • FAQs
Powered by GitBook
On this page
  • Flattened Documents Model
  • Joining Object Arrays into a Single Table
  • Example
  • Relational Model:
  • Joining Nested Arrays as Tables
  • Example
  • Top-Level Document Model
  • Modeling a Top-Level Document View
  • Example
Export as PDF
  1. PLATFORM
  2. Platform Features
  3. System Settings
  4. Data Sources
  5. Connection Settings
  6. Services
  7. XML

Parsing Hierarchical Data

PreviousConnecting to XML DatasourcesNextRaw Data

Last updated 4 years ago

The provider offers three basic configurations to model nested data in XML as tables.

Flattened Documents Model

For users who simply need access to the entirety of their XML data, flattening the data into a single table is the best option. The provider will use streaming and only parses the XML data once per query in this mode.

Joining Object Arrays into a Single Table

With DataModel set to "FlattenedDocuments", the provider returns a separate table for each object array, but implicitly JOINed to the parent table. Any nested sibling XPath values (child paths at the same height) will be treated as a SQL CROSS JOIN.

Example

Below is a sample query and the results, based on the sample document in and parsing based on the XPaths /root/people, /root/people/vehicles, and /root/people/vehicles/maintenance. This implicitly JOINs the people element with the vehicles element and implicitly JOINs the vehicles element with the maintenance element.

Connection String

Use the following connection string to query the in this example.

URI=C:\people.txt;DataModel=FlattenedDocuments;XPath='/root/people;/root/people/vehicles;/root/people/vehicles/maintenance;'

Query

The following query drills into the nested elements in each people element. Since the XPath property included the vehicles node as an XML path, you can query an element of a vehicle explicitly.

SELECT [personal.age] AS age, [personal.gender] AS gender, [personal.name.first] AS name_first, [personal.name.last] AS name_last, [source], [type], [model], [insurance.company] AS ins_company, [insurance.policy_num] AS ins_policy_num, [date] AS maint_date, [desc] AS maint_descFROM [people]

Results

With horizontal and vertical flattening based on the described paths, each vehicle element is implicitly JOINed to its parent people element and each maintenance element is implicitly JOINed to its parent vehicle element.

age

gender

first_name

last_name

source

type

model

ins_company

ins_policy_num

maint_date

maint_desc

20

M

John

Doe

internet

car

Honda Civic

ABC Insurance

12345

2017-07-17

oil change

20

M

John

Doe

internet

car

Honda Civic

ABC Insurance

12345

2018-01-03

new tires

20

M

John

Doe

internet

truck

Dodge Ram

ABC Insurance

12345

2017-08-27

new tires

20

M

John

Doe

internet

truck

Dodge Ram

ABC Insurance

12345

2018-01-08

oil change

24

F

Jane

Roberts

phone

car

Toyota Camry

Car Insurance

98765

2017-05-11

tires rotated

24

F

Jane

Roberts

phone

car

Toyota Camry

Car Insurance

98765

2017-11-03

oil change

24

F

Jane

Roberts

phone

car

Honda Accord

Car Insurance

98765

2017-10-07

new air filter

24

F

Jane

Roberts

phone

car

Honda Accord

Car Insurance

98765

2018-01-13

new brakes

Relational Model:

The provider for XML can be configured to create a relational model of the data in the XML file or source, treating each XPath as an individual table containing a primary key and a foreign key that links to the parent document. This is particularly useful if you need to work with your XML data in existing BI, reporting, and ETL tools that expect a relational data model.

Joining Nested Arrays as Tables

With DataModel set to "Relational", any JOINs are controlled by the query. Any time you perform a JOIN query, the XML file or source will be queried once for each table included in the query.

Example

Connecting String

Set the DataModel connection property to "Relational" and set the XPath connection property to "/root/people;/root/people/vehicles;/root/people/vehicles/maintenance;" to perform the following query and see the example result set.

URI=C:\people.txt;DataModel=Relational;XPath='/root/people;/root/people/vehicles;/root/people/vehicles/maintenance;'

Query

The following query explicitly JOINs the people, vehicles, and maintenance tables.

SELECT [people].[personal.age] AS age, [people].[personal.gender] AS gender, [people].[personal.name.first] AS first_name, [people].[personal.name.last] AS last_name, [people].[source], [vehicles].[type], [vehicles].[model], [vehicles].[insurance.company] AS ins_company, [vehicles].[insurance.policy_num] AS ins_policy_num, [maintenance].[date] AS maint_date, [maintenance].[desc] AS maint_descFROM [people]JOIN [vehicles]ON [people].[_id] = [vehicles].[people_id]JOIN [maintenance]ON [vehicles].[_id] = [maintenance].[vehicles_id]

Results

In the example query, each maintenance element is JOINed to its parent vehicle element, which is JOINed to its parent people element to produce a table with 8 rows (2 maintenance entries for each of 2 vehicles each for 2 people).

age

gender

first_name

last_name

source

type

model

ins_company

ins_policy_num

maint_date

maint_desc

20

M

John

Doe

internet

car

Honda Civic

ABC Insurance

12345

2017-07-17

oil change

20

M

John

Doe

internet

car

Honda Civic

ABC Insurance

12345

2018-01-03

new tires

20

M

John

Doe

internet

truck

Dodge Ram

ABC Insurance

12345

2017-08-27

new tires

20

M

John

Doe

internet

truck

Dodge Ram

ABC Insurance

12345

2018-01-08

oil change

24

F

Jane

Roberts

phone

car

Toyota Camry

Car Insurance

98765

2017-05-11

tires rotated

24

F

Jane

Roberts

phone

car

Toyota Camry

Car Insurance

98765

2017-11-03

oil change

24

F

Jane

Roberts

phone

car

Honda Accord

Car Insurance

98765

2017-10-07

new air filter

24

F

Jane

Roberts

phone

car

Honda Accord

Car Insurance

98765

2018-01-13

new brakes

Top-Level Document Model

Using a top-level document view of the data provides ready access to top-level elements. The provider returns nested elements in aggregate, as single columns.

One aspect to consider is performance. You forego the time and resources to process and parse nested elements -- the provider parses the returned data once, using streaming to read the data. Another consideration is your need to access any data stored in nested parent elements, and the ability of your tool or application to process XML.

Modeling a Top-Level Document View

With DataModel set to "Document" (the default), the provider scans only a single element, the top-level element by default. The top-level elements are available as columns due to default object flattening. Nested elements are returned as aggregated XML.

You can set XPath to specify an element other than the top-level one.

Example

Connection String

Set the DataModel connection property to "Document" to perform the following query and see the example result set. The provider will scan only the XPath value below:

URI=C:\people.txt;DataModel=Document;XPath='/root/people';

Query

The following query pulls the top-level elements and the subelements of the vehicles element into the results.

SELECT [personal.age] AS age, [personal.gender] AS gender, [personal.name.first] AS name_first, [personal.name.last] AS name_last, [source], [vehicles]FROM [people]

Results

With a document view of the data, the personal element is flattened into 4 columns and the source and vehicles elements are returned as individual columns, resulting in a table with 6 columns.

age

gender

name_first

name_last

source

vehicles

20

M

John

Doe

internet

24

F

Jane

Roberts

phone

Below is a sample query against the sample document in , using a relational model based on the XML paths "/root/people", "/root/people/vehicles", and "/root/people/vehicles/maintenance".

Below is a sample query and the results, based on the sample document in . The query results in a single "people" table based on the XPath "/root/people".

Raw Data
Raw Data
Raw Data
Raw Data