Cook Computing

Arrays and Enumerable.Last()

August 28, 2011 Written by Charles Cook

Code which determines the index of the last item in an array like this has always irritated me a little, particularly when doing complex array manipulation:

object lastItem = myArray[myArray.Length - 1];

It's obvious enough but it would be nice to have a cleaner way of getting the index of the last item or the last item itself. The latter has a good solution, using the Linq Enumerable.Last() extension method:

object lastItem = myArray.Last();

I initially thought this could be sub-optimal because the implementation might traverse the whole array but according to Ed Maurer at Microsoft, in a reply on a Connect thread, Last()is optimized for source sequences which implements IList<T>:

Thanks for your investigation of the performance of Enumerable.Last(). I've inspected the implementation, and it has an optimization to deal with cases in which the source sequence implements IList<T> like your array case - cast to IList<T> and use the indexer method. I believe the implementation employs the most practical optimization available to us, and we won't invest further to improve the performance of this method. Thanks again for your comments.

The Mono implementation of Last() applies this optimization (but also illustrates the level of performance hit using Last()):

public static TSource Last<TSource> (this IEnumerable<TSource> source)
{
    Check.Source (source);

    var collection = source as ICollection<TSource>;
    if (collection != null && collection.Count == 0)
        throw new InvalidOperationException ();

    var list = source as IList<TSource>;
    if (list != null)
        return list [list.Count - 1];

    return source.Last (PredicateOf<TSource>.Always, Fallback.Throw);
}

Back in 2004 Brian Grunkemeyer blogged about what makes this possible:

When we were designing our generic collections classes, one of the things that bothered me was how to write a generic algorithm that would work on both arrays and collections. To drive generic programming, of course we must make arrays and generic collections as seamless as possible. It felt that there should be a simple solution to this problem that meant you shouldn't have to write the same code twice, once taking an IList<T> and again taking a T[]. The solution that dawned on me was that arrays needed to implement our generic IList. We made arrays in V1 implement the non-generic IList, which was rather simple due to the lack of strong typing with IList and our base class for all arrays (System.Array). What we needed was to do the same thing in a strongly typed way for IList<T>.

There were some restrictions here though - we didn't want to support multidimensional arrays since IList<T> only provides single dimensional accesses. Also, arrays with non-zero lower bounds are rather strange, and probably wouldn't mesh well with IList<T>, where most people may iterate from 0 to the return from the Count property on that IList. So, instead of making System.Array implement IList<T>, we made T[] implement IList<T>. Here, T[] means a single dimensional array with 0 as its lower bound (often called an SZArray internally, but I think Brad wanted to promote the term "vector" publically at one point in time), and the element type is T. So Int32[] implements IList<Int32>, and String[] implements IList<String>.

He goes on to describe how implementing this was decidedly non-trivial.

Finally going back to the first issue, now to determine the index of the last item in any array, it's possible to use Array.GetUpperBound():

int idx = myArray.GetUpperBound(0);

But this is a bit ugly because it's relying on the fact that an array type of T[] is a special case of System.Array. Perhaps it's better to use an extension method:

static class Extensions
{
  public static int LastIndex<T>(this T[] array)
  {
    if (array == null) throw new ArgumentNullException("array");
    if (array.Length == 0) throw new InvalidOperationException("zero-length array");
    return array.Length - 1;
  }
}

So we end up with:

int idx = intArray.LastIndex();

It's slower than using length - 1 but potentially safer because calling it on a zero-length array will result in InvalidOperationException being thrown. And it's not quite as succint as Perl's $#array syntax but that's another story.