You must be logged in and have permission to create or edit a blog.

Representing MultiValue Data In an ADO.NET DataSet

Aug 10

Written by: Robert Houben, CTO
8/10/2010 2:29 PM  RssIcon

The Problem with MultiValues

While it supports non-relational data sources, Microsoft’s ADO.NET technology is built first and foremost to work with relational data.  When retrieving data that is non-relational, it is necessary to map it to a first-normal-form (FNF) structure.  MultiValued systems allow you to create a table that has up to two levels of nesting contained within it.  For instance, you can have an invoice item that contains invoice lines, and those invoice lines can in turn have fields that track options.  So, for instance, you could have

Invoice Date: 1 Aug 2010 Invoice # 10234 Customer: ABC Company

Line 1: XYZ Radio

               Option 1:             Name = Color, Value = Black

               Option 2:             Name = Voltage, Value = 110

Line 2: Headphones

In a relational database, the 2 levels of 1 to many relationships would force you to have a minimum of 3 tables to support this.  In a MultiValued system, this can all be contained in a single table, known in MultiValued terminology as a “file”.

Although Microsoft supports non-relational data sources, they ultimately require you to normalize the data at least to FNF. i.e. Multiple Values need to be extracted and flattened into a separate view for each grouping, for all levels of nesting.

Microsoft also insists that for any column (attribute) the data type that is used to represent it be consistent and unchanging.

In this document, we describe how the DbDataAdapter Fill method automatically creates multiple tables and both keys and foreign keys to link them to each other, based on the relationships between these when MultiValued data is presented to it.

FusionWare CUST File

The FusionWare Direct DAS account “GAXOLEDB” includes a file named “CUST” which includes 2 MultiValued sets, each of which contains a SubValued set.  It contains all the data types supported by your MultiValue system and also includes an example of a record with non-conforming data ( item-id “BAD.3”).

CUST File Data

Here is a dump of an item from the CUST file:

>CT CUST 1

    1

001 JOHN DOE

002 55

003 9567

004 10324753

005 13266

006 S]C

007 SAVINGS]CHEQUING

008 12335\23447]59866

009 1237\125618]325067

010 14001

011 DONT LOSE THIS

012 TRANSFER FROM ANOTHER\BRANCH

013 HERE'S A BUNCH OF TRAILING

014 STUFF THAT WE WANT TO MAKE

015 SURE WE DON'T LOSE.

016 14001\14001]14001

 

Here is that same data annotated:

    1                                   The Item Id of the file (CustomerId)

001 JOHN DOE                           Customer Name

002 55                                  Age (in years)

003 9567                               Birth Date (I know, they don’t match up!)

004 10324753                           Net Worth (MD2 – 2 stored decimal places)

005 13266                              Birth Time (odd, but lets us work with a time field)

006 S]C                                 MultiValued field: Account Type Code(s)

007 SAVINGS]CHEQUING                   MultiValued field: Name of account type

008 12335\23447]59866                  SubValued field: Account number(s)

009 1237\125618]325067                 SubValued field: Account balance(s) (MD2 – 2 decimals)

010 14001                              MultiValued field: Visit date

011 DONT LOSE THIS                     Ignored field

012 TRANSFER FROM ANOTHER\BRANCH      SubValued field: Visit notes, multiple lines per visit

013 HERE'S A BUNCH OF TRAILING        This and all subsequent fields are unmapped and ignored.

014 STUFF THAT WE WANT TO MAKE

015 SURE WE DON'T LOSE.

016 14001\14001]14001

 

We actually have two MultiValued groupings, and each of these has one SubValued grouping nested within it.

The first grouping is a grouping of account types. We have “S”avings and “C”hequing in this item, but some of our records also have “P”ersonal Line of Credit types.

For each account type multivalue, there are two attributes that are part of a nested SubValued grouping.  This grouping contains a list of all the account numbers and balances for the controlling multivalue’s account types. So, for instance this item has two savings accounts with balances and only one chequing account.

The second multivalued grouping is a list of dates on which the customer visited the branch.  Associated with this one field is a SubValued grouping with a single field, which contains multiple lines of text explaining what occurred during the visit.

There are 4 items in this file.  Their item ids are 1, 2, 3 and BAD.3.  The last one “BAD.3” is an example of non-conforming data.  i.e. The fields defined as DATE, TIME, INTEGER and DECIMAL all have invalid data in them.  Invariable, large MultiValued tables wind up with bogus records in them.  This enables us to test and demonstrate how our products allow you to deal with these real-world anomalies.

The file contains dictionaries for all the relevant fields.

CUST File Mapping

We also provide a mapping of this file that exposes some of the fields:

>CT DICT CUST %OLEDB.T.CUST%

    %OLEDB.T.CUST%

001 VARCHAR]CUSTID]0

002 VARCHAR]CustName]0

003 INTEGER\]Age]0

004 DATE]Birthdate]0

005 DECIMAL\2]NetWorth]0

006 TIME]Birthtime]0

007 VARCHAR]AccountType]1\AccountTypes

008 VARCHAR]AccountTypeName]1\AccountTypes

009 VARCHAR]AccountNumber]2\AccountTypes\Accounts

010 DECIMAL\2]AccountBalance]2\AccountTypes\Accounts

011 DATE]VisitDate]1\Visits

012 IGNORE]]

013 VARCHAR]VisitNotes]2\Visits\VisitNoteLines

 

The first attribute of this mapping item represents the item id (sometimes called attribute 0).  The second attribute represents data in attribute 1 of the item, and so on.

Note that attribute 12 is marked “IGNORE” for data type, which means we have not mapped attribute 11 of the file’s data.  Attribute 12 *is* mapped, but everything after that is not.  We put data in there to allow us to test that the Update method does not lose this data.

The ‘]’ characters are actually value delimiters and the ‘\’ characters are subvalue delimiters.  The structure of this mapping file is:

Data type\Scale]Field name]Level\Multivalue group name\Subvalue group name

Scale (and the subvalue delimiter for it) are optional unless the data type is DECIMAL.

Level is 0 for single valued fields, 1 for multi valued fields, and 2 for sub valued fields.

Multivalue group name is optional if level is 0.

Subvalue group name is optional if level is 0 or 1.

ADO.NET Representation of Nested Data

ADO.NET represents nested as a series of IDataReader instances.  The top level one is the parent of all the others, and in our case has all the single valued fields.  For each multivalued group there is a field in the top level IDataReader, which is itself an IDataReader. This field is named based on the multivalue group name.  Within this nested, multivalued IDataReader, for each subvalued group that is contained within the multivalued group, there is, again, another field that is itself a nested IDataReader.

When you ask the DbDataAdapter implementation to “Fill” the DataSet object based on the IDataReader, it asks for (or defaults) the top level table name.  Let’s say that we name it “CUST”.  The top level IDataReader gets a table named CUST, in this case.

This table initially contains all the single valued fields.  For every nested IDataReader that this contains, a table is created, which is named “CUST” suffixed with the name of that IDataReader field, which in this case is the name of the multivalued group.  In our example we have a multivalued group name of AccountTypes and another of “Visits”, so we wind up with two tables: CUSTAccountTypes and CUSTVisits.  In order to allow linking of rows in the multivalued tables to their parent rows in the top level table, an integer field named after the IDataReader field name is created with a unique numeric sequential, zero-based row id.  So we have two fields, named AccountTypes and Visits in the CUST table.

Furthermore, the CUSTAccountTypes table has a matching foreign key field named “CUSTAccountTypes” which is the link back to the AccountTypes field in the CUST table, and the CustVisits table has a matching foreign key field named “CustVisits” which is the link back to the Visits field in the CUST table.

This exercise is repeated to the subvalue level.

Tables Generated for the CUST File Mapping

Nesting Name

DataTable Name

Description

CUST

CUST

This is the table of single valued data

AccountTypes

CUSTAccountTypes

This is the table with all multivalued fields from the AcountTypes multivalued group.

Accounts

CUSTAccountTypesAccounts

This is the table with all subvalues in the Accounts subvalued group which is nested within the AccountTypes subvalued group.

Visits

CUSTVisits

This is the table with all the multivalued fields from the Visits multivalued group.

VisitNoteLines

CUSTVisitsVisitNoteLines

This is the table with all the subvalued fields from the VisitNoteLines subvalued group which is nested within the Visits multivalued group.

 


 

Row Key and Matching Foreign Key Fields for the CUST File

Table Name

Key Field Name

Description

CUST

AccountTypes

Key to identify the rows in the CUST table so that the AccountTypes table can link back to the appropriate single-valued row (item).

CUST

Visits

Key to identify the rows in the CUST table so that the Visits table can link back to the appropriate single-valued row (item).

Note that the values in AccountTypes and Visits will always be the same. This is redundant, but it works.

CUSTAccountTypes

CUSTAccountTypes

Foreign key to link back to the AccountTypes field of the singlevalued row that is the parent of this row.

CUSTAccountTypes

Accounts

Key to identify the row in the CUSTAccountTypes table so that the CUSTAccountTypesAccount table can link back to this table.

CUSTAccountTypesAccounts

CUSTAccountTypesAccounts

Foreign key to link back to the multivalued row that is the parent of this row.

CUSTVisits

CUSTVisits

Foreign key to link back to the Visits field for the singlevalued row that is the parent of this row.

CUSTVisits

VisitNoteLines

Key to identify the row in the CUSTVisits table so that the CUSTVisitsVisitNoteLines table can link back to this table.

CUSTVisitsVisitNoteLines

CUSTVisitsVisitNoteLines

Foreign key to link back to the multivalued row that is the parent of this row.

 


Notes for MultiValue Programmers

  • These key fields are zero-based.  Multivalue and subvalue numbers are one-based.
  • Multivalue numbers for each item restart at 1.  These key fields however are sequential for all items represented in the table. If the first item has 2 multivalues (0,1) then the next item starts at 2.  The point here is that these key fields are not the same as a multivalue or subvalue number.
  • These should NEVER be accessible to an end user.  You should not normally even show them.  They should be hidden from view.

 

FusionWare Direct Product Family

The FusionWare Direct product family includes client providers that enable access to your MultiValue data from Java, .NET and Legacy COM/COM+ environments.

Client Platforms Supported

Client Platforms supported for .NET and COM clients include:

  • Windows 2000, 2003 and 2008, 32 bit and 64 bit
  • Windows XP, Vista and Windows 7, 32 bit and 64 bit

Client Platforms Supported for Java client include:

  • Windows 2000, 2003 and 2008, 32 bit and 64 bit
  • Windows XP, Vista and Windows 7, 32 bit and 64 bit
  • IBM Platforms:
    • IBM i (AS/400)
    • System z (mainframe)
    • System p (AIX)
    • Certified "Ready for IBM Systems with Linux"
  • Linux
  • Unix
  • and more...

MultiValue Platforms Supported

  • Rocket (formerly IBM)
    • U2 (Universe and Unidata) including Universe back to version 5
    • PI Open
  • Raining Data
    • D3
    • mvEnterprise
    • mvBase
  • Northgate
    • Reality
    • Reality/X
  • Temenos Group
    • jBase
  • Ladybridge Systems (coming soon)
    • QM
    • Open QM
  • Others (older or unsupported MultiValue Systems
    • UltPlus
    • Power95
    • and more...

Copyright ©2010 Robert Houben, CTO

Categories:
Location: Blogs Parent Separator CTO Blog

Your name:
Gravatar Preview
Your email:
(Optional) Email used only to show Gravatar.
Your website:
Title:
Comment:
Security Code
CAPTCHA image
Enter the code shown above in the box below
Add Comment   Cancel 
Privacy Statement | Terms Of Use | Copyright 2017 by Fusionware Integration Corp.