Sunday, 18 March 2012

Synchronised and Padded Iterators

What are they and why use them?

An iterator is an object used to step through a collection of related items. For example, you might have an array of objects on which you would like to perform some operation. An iterator lets you read the current value and then step to the next item of interest.

Padded Iterators

Usually, you are only interested in the values that actually exist in the array. Trying to read an index that does not point to an item in the array results in a runtime error, such as an array index out-of-bounds exception. However, there are cases where you want some value returned anyway, even though none exists in the array. An example is two arrays that represent events in time, such as sampled time series. You want to output the two series aligned in time, so that each row reads
Time stamp - Value1 - Value2
But what if the two series do not line up exactly, so that series 1 has values where series 2 does not, or vice versa? This is where padded iterators come in. They give a value at any index, whether one exists or not, and you specify which value to return when none exists.
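
As a minimal sketch of the padding idea on its own, leaving out the date handling used in the source code further down, the helper below returns the item at an index if there is one and a pad string otherwise. The names PaddingSketch and ValueOrPad are made up for this illustration and are not part of the classes in this post.

```csharp
using System;
using System.Collections.Generic;

static class PaddingSketch
{
    // Hypothetical helper: returns the item at 'index' as text,
    // or 'padVal' when the index falls outside the list.
    public static string ValueOrPad(IList<double> values, int index, string padVal)
    {
        if (index >= 0 && index < values.Count)
        {
            return values[index].ToString();
        }
        return padVal;
    }

    static void Main()
    {
        var series = new List<double> { 10, 20, 30 };

        // Indices -1 and 3 do not exist, so they are padded instead of throwing.
        for (int i = -1; i <= 3; i++)
        {
            Console.WriteLine("{0}\t{1}", i, ValueOrPad(series, i, "None"));
        }
    }
}
```

The padded iterator below applies the same rule, but decides whether a value exists by comparing dates rather than indices.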

Synchronised Iterators

Synchronised iterators are useful when you would like to iterate over several lists of objects at once. In the example above, you want to get the value of both time series for a given time stamp and display them alongside each other. Then you want to advance every list to the next time stamp in a single step.

Source Code

Padded iterator code

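using System;
using System.Collections.Generic;

class PaddedIterator
{
    public PaddedIterator(
        List<double> Values, DateTime DataStart, String PadVal)
    {
        _Values = Values;
        _DataStartDate = DataStart;
        _PadVal = PadVal;
        _Iterator = _Values.GetEnumerator();
    }

    public override string ToString()
    {
        EnsureStarted();
        if (_Started && _MoreData)
        {
            //return dereferenced iterator
            return _Iterator.Current.ToString();
        }
        else
        {
            // return padded value
            return _PadVal;
        }
    }

    public void MoveNext()
    {
        // Make sure a value due at the current date is consumed before leaving it.
        EnsureStarted();
        bool WasStarted = _Started;
        this.CurrentDate += this.Increment;
        if (!WasStarted)
        {
            // The data may begin at the new date.
            EnsureStarted();
        }
        else if (_MoreData)
        {
            _MoreData = _Iterator.MoveNext();
        }
    }

    // A new enumerator sits before the first element, so step onto the
    // first value once the current date reaches the start of the data.
    private void EnsureStarted()
    {
        if (!_Started && this.CurrentDate >= _DataStartDate)
        {
            _Started = true;
            _MoreData = _Iterator.MoveNext();
        }
    }

    public DateTime CurrentDate {get;set;}

    public TimeSpan Increment {get;set;}

    private List<double> _Values;
    private List<double>.Enumerator _Iterator;
    private DateTime _DataStartDate;
    private String _PadVal;
    private bool _MoreData = true;
    private bool _Started = false;
}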

Explanation

In the above C# code, you first construct a list of double values to iterate over. Then you construct a padded iterator, giving it the date of the first value and the text to return when you ask for a data value that does not exist. Before using the iterator, you set the CurrentDate and Increment properties. CurrentDate is the date the iterator starts giving values for, which must be either before or at the beginning of the data. In its current form, it does not work to give it a CurrentDate that falls within the data, as it always starts reading from the first value. Increment tells the iterator how much to add to the current date each time it is advanced.

Synchronised iterator code

class SynchronisedIterator
{
    public SynchronisedIterator(
        DateTime ReferenceDate, TimeSpan Increment, String PadVal)
    {
        this.Iterators = new List<PaddedIterator>();
        _EndDate = _CurrentDate = _ReferenceDate = ReferenceDate;
        _Increment = Increment;
        _PadVal = PadVal;
    }

    public List<PaddedIterator> Iterators {get;set;}

    public void AddIterator(List<double> Values, DateTime StartDate)
    {
        PaddedIterator Iterator =
            new PaddedIterator(Values, StartDate, _PadVal);
        Iterator.CurrentDate = _ReferenceDate;
        Iterator.Increment = _Increment;
        this.Iterators.Add(Iterator);

        //Recalculate end date to include new data.
        //TimeSpan arithmetic keeps increments that are not whole days exact;
        //casting TotalDays to int would truncate them (to zero below one day).
        DateTime DataEndDate = StartDate +
            TimeSpan.FromTicks(_Increment.Ticks * Values.Count);
        if (DataEndDate > _EndDate) _EndDate = DataEndDate;
    }

    public override String ToString()
    {
        String Row = _CurrentDate.ToShortDateString();
        foreach (PaddedIterator It in this.Iterators)
        {
            Row += string.Format("\t{0}", It);
        }
        return Row;
    }

    public void NextRow()
    {
        _CurrentDate += _Increment;
        foreach (PaddedIterator iterator in this.Iterators)
        {
            iterator.MoveNext();
        }
    }

    public bool MoreData()
    {
        return (_CurrentDate < _EndDate);
    }

    private DateTime _ReferenceDate;
    private DateTime _CurrentDate;
    private DateTime _EndDate;
    private TimeSpan _Increment;
    private String _PadVal;
}

Explanation

In this example, the synchronised iterator holds a collection of padded iterators. The padded iterators know about the data being iterated over, and when to give a data value and when to give a padding value. The job of the synchronised iterator is to output a row containing the date and the value each padded iterator gives for that date.
The synchronised iterator is given a reference start date, which is the date you would like to start returning values for. It then sets up each padded iterator in its list with the reference start date as the current date. The end date is recalculated each time a padded iterator is added, so that it always covers the series that finishes last.
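
To make the end date calculation concrete, here is a small worked sketch. It assumes whole-day increments and uses illustrative numbers (two series of five values each, 2 days apart, starting on 1 Jan 2000 and 15 Jan 2000). The names EndDateSketch and DataEndDate are hypothetical and exist only for this illustration; the sketch restates the arithmetic rather than reproducing the class above.

```csharp
using System;
using System.Globalization;

static class EndDateSketch
{
    // Hypothetical helper: the date just past the last value of a series
    // that starts at 'start' and holds 'count' values spaced 'increment' apart.
    public static DateTime DataEndDate(DateTime start, int count, TimeSpan increment)
    {
        return start + TimeSpan.FromTicks(increment.Ticks * count);
    }

    static void Main()
    {
        TimeSpan increment = TimeSpan.FromDays(2);
        DateTime endDate = new DateTime(2000, 1, 1); // starts at the reference date

        DateTime[] starts = { new DateTime(2000, 1, 1), new DateTime(2000, 1, 15) };
        foreach (DateTime start in starts)
        {
            // 1 Jan + 5 * 2 days = 11 Jan; 15 Jan + 5 * 2 days = 25 Jan
            DateTime candidate = DataEndDate(start, 5, increment);
            if (candidate > endDate) endDate = candidate; // keep the latest end
        }

        // Prints 2000-01-25
        Console.WriteLine(endDate.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture));
    }
}
```

Rows are produced while the current date is earlier than this end date, so with these numbers the last row printed would be for 23 Jan 2000.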

Program code

class Program
{
    static void Main(string[] args)
    {
        List<double> myList = new List<double> { 10, 20, 30, 40, 50 };
        SynchronisedIterator synchronisedIterator =
            new SynchronisedIterator(new DateTime(2000, 1, 1),
                TimeSpan.FromDays(2), "None");

        synchronisedIterator.AddIterator(myList, new DateTime(2000, 1, 1));
        synchronisedIterator.AddIterator(myList, new DateTime(2000, 1, 15));
        synchronisedIterator.AddIterator(myList, new DateTime(2000, 1, 6));
        synchronisedIterator.AddIterator(myList, new DateTime(2000, 1, 3));
        synchronisedIterator.AddIterator(myList, new DateTime(2000, 1, 8));

        while (synchronisedIterator.MoreData())
        {
            Console.WriteLine(synchronisedIterator);
            synchronisedIterator.NextRow();
        }

        Console.ReadKey();
    }
}

Explanation

Because the iterators hide most of the complexity (how to step through the data, how to format it, and whether to give a value or a pad), the program that constructs and uses them can be quite simple.
In this example, I create a single list of double values, which is iterated over by several padded iterators. The synchronised iterator is constructed by telling it to start giving values from 1 Jan 2000, with each value 2 days apart. Whenever there is no value at a particular date, it pads with the word "None". The padded iterators themselves are added by calling AddIterator with the data and the date the data starts from.
The while loop iterates over all the data, printing one row per time stamp until there is no more.

Result

Output of Program
