WPF multipage reports - Part II - Data Grouping

In my first post about building WPF multipage reports I described the overall problem and a high-level design.
Today I will describe the data layer, which reads data from a data source and performs simple grouping calculations (sorry, no WPF stuff today). I decided not to use the GROUPING and CUBE features of SQL Server, because I wanted to stay independent of the data source and to support a more extensible computational model than the SQL language allows. I also tried to perform all calculations in one pass, both for performance reasons and to enable partial processing in the future.
Do not expect a full-featured data engine from a single post, but you should get the idea. The result is the data prepared for the presentation layer, encapsulated in one class.
Here is the ReportData class:

using System;
using System.Collections.Generic;
using System.Data;

public class ReportData
    {
        DataTable rows;

        /// <summary>
        /// Represents the raw underlying data
        /// </summary>
        public DataTable Rows
        {
            get { return rows; }
        }

        List<GroupData> groups;

        /// <summary>
        /// Contains calculated values for each group level
        /// </summary>
        public List<GroupData> Groups
        {
            get { return groups; }
        }

        public ReportData(DataTable rows, List<GroupData> groups,GroupData reportGroup)
        {
            this.rows = rows;
            this.groups = groups;
            this.reportGroup = reportGroup;
        }

        private GroupData reportGroup;

        /// <summary>
        /// Contains calculated values for whole report
        /// </summary>
        public GroupData ReportGroup
        {
            get { return reportGroup; }
           
        }
    }


For the raw data container I’m using the good old DataTable. One reason is that I wanted to support practically any data source, and the generic nature of DataTable suits that well. The same idea is behind using IDataReader as my data source.
The root group holds the calculated values for the whole report, even if there is no real grouping defined. The container for the calculated values of a group is the GroupData class:

    public class GroupData
    {
        int level;

        /// <summary>
        /// The level in a group hierarchy 
        /// </summary>
        public int Level
        {
            get { return level; }
        }

        private List<GroupData> nestedDataGroups;

        /// <summary>
        /// Inner data groups.
        /// </summary>
        public List<GroupData> NestedDataGroups
        {
            get {
                if (nestedDataGroups == null)
                    nestedDataGroups = new List<GroupData>();

                return nestedDataGroups; 
            }
        }


        string key;

        /// <summary>
        /// The grouping key value for this data group.
        /// </summary>
        public string Key
        {
            get { return key; }
        }

        int count;
        /// <summary>
        /// Holds the number of rows in a group
        /// </summary>
        public int Count
        {
            get { return count; }
            set { count = value; }
        }

        int startRow;

        /// <summary>
        /// Index of the first row of this group in the DataTable
        /// </summary>
        public int StartRow
        {
            get { return startRow; }
        }

        public bool HasNestedGroups
        {
            get {return (nestedDataGroups != null && nestedDataGroups.Count > 0);}
        }

        Dictionary<string, ComputeField> computes = new Dictionary<string, ComputeField>();

        /// <summary>
        /// Initializes the fields and local computes
        /// </summary>
        public GroupData(int level, string key, int startRow, IDataReader dataReader)
        {
            this.level = level;
            this.startRow = startRow;
            this.key = key;

            //For demo purposes I will aggregate all (and only) decimal fields
            for (int i = 0; i <= dataReader.FieldCount - 1; i++)
            {
                Type t = dataReader.GetFieldType(i);
                if (t == typeof(decimal))
                    computes.Add(dataReader.GetName(i),new DecimalAggregateField( i));
            }
        }

        /// <summary>
        /// Used during report rendering phase and  not for calculations 
        /// </summary>
        public object GetComputedValue(string name)
        {
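            // Return the computed value itself rather than the ComputeField wrapper
            ComputeField field;
            if (computes.TryGetValue(name, out field)) return field.Value;
            return key;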
        }

        /// <summary>
        /// Updates calculated values for each compute field
        /// </summary>
        /// <param name="dataReader"></param>
        public void UpdateValues(IDataReader dataReader)
        {
            count++;
            foreach (ComputeField cf in computes.Values)
                cf.UpdateValue(dataReader);
        }
    }
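
Because the data arrives sorted, each group is a contiguous run of rows, so StartRow and Count together identify the group's slice of the DataTable. Here is a minimal sketch of reading that slice. GroupRowSlice and RowsOf are hypothetical names of mine, not part of the code in this post:

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of the original engine.
public static class GroupRowSlice
{
    // Yields table.Rows[startRow] .. table.Rows[startRow + count - 1].
    public static IEnumerable<DataRow> RowsOf(DataTable table, int startRow, int count)
    {
        for (int i = startRow; i < startRow + count; i++)
            yield return table.Rows[i];
    }
}
```

For a given group this would be called as GroupRowSlice.RowsOf(report.Rows, group.StartRow, group.Count). The Count of an outer group includes the rows of all its nested groups, because every group level is updated for every row.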


If grouping is used, the underlying data has to be sorted according to the group definition. It is better to sort in the underlying database layer, where appropriate indexes can speed up the sort.
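
To see why the sort order matters: the engine starts a new group whenever the key differs from the key of the previous row. The sketch below isolates just that rule. RunGrouping and CountRuns are illustrative names, not part of the engine:

```csharp
using System;
using System.Collections.Generic;

// Illustrative only: counts the groups produced by a "new group on key change" scan.
public static class RunGrouping
{
    public static int CountRuns(IEnumerable<string> keys)
    {
        int runs = 0;
        string last = null;
        bool first = true;
        foreach (string key in keys)
        {
            // A group starts on the first row and whenever the key changes.
            if (first || key != last)
            {
                runs++;
                last = key;
                first = false;
            }
        }
        return runs;
    }
}
```

With the keys A, B, A the scan reports three groups. Once the same keys are sorted to A, A, B it reports two, which is what a report reader would expect.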
The computed values can be of different types, and you can implement various expressions and algorithms. I have created a simple class hierarchy to support this variability.

 /// <summary>
    /// Generic computation field ancestor class
    /// </summary>
    abstract class ComputeField
    {
        /// <summary>
        /// Overrides have to implement the exact calculation logic
        /// </summary>
        /// <param name="reader"></param>
        public abstract void UpdateValue(IDataReader reader);

        public abstract object Value { get; }
    }

For this post I will implement the most common aggregations.

 /// <summary>
    /// Supported aggregate types
    /// </summary>
    public enum AggegateType
    {
        None,
        Sum,
        Count,
        Min,
        Max
    }

    
    /// <summary>
    /// Supports aggregation
    /// </summary>
    abstract class AggregateField:ComputeField
    {
        /// <summary>
        /// the ordinal position in the datasource
        /// </summary>
        protected int ordinal;
   
        public AggregateField(int ordinal)
        {
            this.ordinal = ordinal;
        }

        AggegateType aggregate=AggegateType.Sum;

        public AggegateType Aggregate
        {
            get { return aggregate; }
            set { aggregate = value; }
        }

    }


The next class adds support for typed decimal values.

class DecimalAggregateField : AggregateField
    {
        decimal value;
        public DecimalAggregateField(int ordinal) : base( ordinal) {}
        public override void UpdateValue(IDataReader reader)
        {
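            // Skip NULL values; GetDecimal throws when the column holds DBNull
            if (reader.IsDBNull(ordinal)) return;
            decimal d=reader.GetDecimal(ordinal);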
            switch (Aggregate)
            {
                case AggegateType.Sum: value += d; break;
                default: throw new NotImplementedException();
            }
        }

        public override object Value
        {
            get { return value; }
        }
     
    }
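
The class above implements only Sum and throws for everything else. As a rough sketch of how the remaining aggregates could be filled in, here is a standalone accumulator. SketchAggregate and DecimalAccumulator are hypothetical names, and the class is deliberately kept separate from the ComputeField hierarchy so that it compiles on its own:

```csharp
using System;

// Hypothetical stand-in for the aggregate type enum used in this post.
public enum SketchAggregate { Sum, Count, Min, Max }

// Hypothetical accumulator showing one way to handle each aggregate.
public class DecimalAccumulator
{
    private readonly SketchAggregate kind;
    private decimal value;
    private bool hasValue; // false until the first value arrives

    public DecimalAccumulator(SketchAggregate kind)
    {
        this.kind = kind;
    }

    public void Update(decimal d)
    {
        switch (kind)
        {
            case SketchAggregate.Sum: value += d; break;
            case SketchAggregate.Count: value += 1; break;
            // The first value seeds Min and Max, later values are compared against it.
            case SketchAggregate.Min: value = hasValue ? Math.Min(value, d) : d; break;
            case SketchAggregate.Max: value = hasValue ? Math.Max(value, d) : d; break;
            default: throw new NotImplementedException();
        }
        hasValue = true;
    }

    public decimal Value
    {
        get { return value; }
    }
}
```

The hasValue flag is there because Min and Max cannot start from zero: a group containing only positive amounts would otherwise report a minimum of 0.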


During the group calculations I need a class that holds the group definition for each group level. This is the GroupItem class.

 /// <summary>
    /// Represents a group definition.
    /// Builds the group data instances according to underlying data.
    /// </summary>
    class GroupItem
    {
        int ordinal;
        int level;

        GroupItem upperGroup;
        public GroupItem(int ordinal,int level,GroupItem upperGroup)
        {
            this.ordinal = ordinal;
            this.upperGroup=upperGroup;
            this.level = level;
        }

        // null never equals a real key, so the first row always starts a new group
        string lastKey=null;
        GroupData currentDataGroup;

        //currently useful only for root group level
        List<GroupData> dataGroups = new List<GroupData>();

        /// <summary>
        /// The data groups for this group level
        /// </summary>
        public List<GroupData> DataGroups
        {
            get { return dataGroups; }
        }

        public void AddChildDataGroup(GroupData gData)
        {
            this.currentDataGroup.NestedDataGroups.Add(gData);
        }

        /// <summary>
        /// If the grouping key value changes or the parent group is new, a new GroupData is created.
        /// The calculated values are always updated.
        /// </summary>
        public bool UpdateGroupData(IDataReader reader,int rowIndex,bool forceNewGroup)
        {
            bool newGroup=false;
            string newKey=reader.GetValue(ordinal).ToString();
            if (forceNewGroup || newKey != lastKey)
            {
                lastKey = newKey;
                currentDataGroup = new GroupData(level,newKey,rowIndex, reader);
                newGroup = true;
                this.dataGroups.Add(currentDataGroup);
                if (this.upperGroup != null)
                    upperGroup.AddChildDataGroup(currentDataGroup);
            }
            currentDataGroup.UpdateValues(reader);
            return newGroup;
        }

    }

OK, all the supporting classes are ready, so I can add the real data processing code. The source data comes from an IDataReader. First I iterate over its fields to create the DataTable and the group definitions, then I read all the rows from the reader and update the group data.


public static class DataEngine
    {
        public static ReportData Load(IDataReader dataReader, string[] groupColumns)
        {
            DataTable tmp = new DataTable();
            for (int i = 0; i <= dataReader.FieldCount - 1; i++)
            {
                Type t = dataReader.GetFieldType(i);
                tmp.Columns.Add(dataReader.GetName(i), t);
            }

            List<GroupItem> groups=new List<GroupItem>();
            GroupItem parentGroup=null;
            for (int i = 0; i < groupColumns.Length; i++)
            {
                GroupItem g= new GroupItem(dataReader.GetOrdinal(groupColumns[i]),i,parentGroup);
                groups.Add(g);
                parentGroup=g;
            }

            GroupData reportGroup = new GroupData(-1, "Report", 0, dataReader);
            // prepare empty data buffer
            object[] rowData=new object[dataReader.FieldCount];

            int rowIndex = 0;
            while (dataReader.Read())
            {
                //update totals for a report
                reportGroup.UpdateValues(dataReader);

                //update group computes and if needed start a new group
                bool newGroup = false;
                for (int i = 0; i < groups.Count; i++)
                    newGroup = groups[i].UpdateGroupData(dataReader, rowIndex, newGroup);

                dataReader.GetValues(rowData);
                tmp.LoadDataRow(rowData, true);
                rowIndex++;
            }
            dataReader.Close();
            reportGroup.Count = tmp.Rows.Count;

            if (groups.Count>0)
                return new ReportData(tmp,groups[0].DataGroups,reportGroup);
            else
                return new ReportData(tmp, null,reportGroup);

        }
}
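
To try the engine without a database, an in-memory DataTable can serve as the source, since DataTable.CreateDataReader returns a reader that implements IDataReader. The sketch below builds a small unsorted table and returns a sorted reader over it. SampleData, Build, SortedReader and the column names are all made up for illustration:

```csharp
using System;
using System.Data;

// Hypothetical test helper, not part of the original engine.
public static class SampleData
{
    // Builds a small unsorted table with two group columns and one decimal column.
    public static DataTable Build()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Country", typeof(string));
        table.Columns.Add("City", typeof(string));
        table.Columns.Add("Amount", typeof(decimal));
        table.Rows.Add("SK", "Kosice", 10m);
        table.Rows.Add("AT", "Vienna", 5m);
        table.Rows.Add("SK", "Bratislava", 7m);
        return table;
    }

    // Returns a reader over a copy of the table sorted by the given
    // columns, for example "Country, City".
    public static IDataReader SortedReader(DataTable table, string sort)
    {
        DataView view = new DataView(table);
        view.Sort = sort;
        return view.ToTable().CreateDataReader();
    }
}
```

Assuming the classes from this post are compiled into the same project, the reader can be passed straight to the engine as DataEngine.Load(SampleData.SortedReader(SampleData.Build(), "Country, City"), new string[] { "Country", "City" }). Note that Load closes the reader when it has finished.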

In the next post I will go back to WPF and describe the rendering part.

 


Tags: WPF, .NET, LOB