Interrupt Performance Considerations in DotNet (C#)

Posted in software by Christopher R. Wirz on Thu Jun 20 2019

We were implementing a message queue (maybe to get away from ActiveMQ or SignalR) when it dawned on us that C#'s Queue&lt;T&gt;.Dequeue() method throws an exception when there is nothing in the queue. In the initial Pull Request to fix the bug, a try-catch was put around Dequeue(), but that means relying on the interrupt pattern (exception-based control flow)...
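For context, here is a minimal repro of the behavior that triggered the bug (the class name is illustrative):

```csharp
using System;
using System.Collections.Generic;

class EmptyDequeueDemo
{
    static void Main()
    {
        var queue = new Queue<string>();

        // Dequeue() on an empty Queue<T> throws rather than returning a sentinel value
        try
        {
            queue.Dequeue();
        }
        catch (InvalidOperationException ex)
        {
            Console.WriteLine(ex.GetType().Name); // InvalidOperationException
        }
    }
}
```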

The interrupt pattern is often viewed as being slower, so the goal would be to write an implementation that avoids raising or handling an exception. Basically, this method would try to Dequeue and return false if it could not.
Let's try a few approaches to the TryDequeue implementation:


namespace System.Collections.Generic
{
    /// <summary>
    /// Extensions on Queue
    /// </summary>
    public static class QueueExtensions
    {
        /// <summary>
        /// Performs a Dequeue first checking count
        /// </summary>
        /// <typeparam name="T">The type</typeparam>
        /// <param name="queue">The Queue</param>
        /// <param name="result">The result</param>
        /// <returns>True if result found</returns>
        public static bool TryDequeueCount<T>(this Queue<T> queue, out T result)
        {
            if ((queue?.Count ?? 0) > 0)
            {
                result = queue.Dequeue();
                return true;
            }
            result = default;
            return false;
        }

        /// <summary>
        /// Performs a Dequeue using the enumerator to check if possible
        /// </summary>
        /// <typeparam name="T">The type</typeparam>
        /// <param name="queue">The Queue</param>
        /// <param name="result">The result</param>
        /// <returns>True if result found</returns>
        public static bool TryDequeueEnumerator<T>(this Queue<T> queue, out T result)
        {
            // A copy of the struct enumerator is advanced once just to test for a first element
            if (queue?.GetEnumerator().MoveNext() == true)
            {
                result = queue.Dequeue();
                return true;
            }
            result = default;
            return false;
        }

        /// <summary>
        /// Performs a Dequeue catching any exceptions
        /// </summary>
        /// <typeparam name="T">The type</typeparam>
        /// <param name="queue">The Queue</param>
        /// <param name="result">The result</param>
        /// <returns>True if result found</returns>
        public static bool TryDequeueThrow<T>(this Queue<T> queue, out T result)
        {
            try
            {
                result = queue.Dequeue();
                return true;
            }
            catch
            {
                // Ignored
            }
            result = default;
            return false;
        }
    }
}

This API was based on the framework's Queue&lt;T&gt;.TryDequeue(out T result) method, as suggested by a colleague. This sounded like a great approach, except we'd have to move our .NET Standard target up to 2.1, where TryDequeue is available. But maybe it offered a performance advantage?
Let's target the right .NET Standard and write some test code...


using System;
using System.Collections.Generic;

class DequeueTest
{
	/// <summary>
	/// Makes a queue of messages
	/// </summary>
	/// <param name="numberOfQueues">The number of queues</param>
	/// <param name="queueSize">The size of each queue</param>
	/// <returns>A collection of queues for testing</returns>
	public static Queue<string>[] MakeTestArray(int numberOfQueues = 100000, int queueSize = 5)
	{
		var testArray = new Queue<string>[numberOfQueues];
		for (int i = 0; i < numberOfQueues; i++)
		{
			var queue = new Queue<string>(queueSize);
			for (int j = 0; j < queueSize; j++)
			{
				queue.Enqueue("tests");
			}
			testArray[i] = queue;
		}
		return testArray;
	}
	/// <summary>
	/// Runs a test of each TryDequeue given a number of queues and queue size 
	/// </summary>
	/// <param name="numberOfQueues">The number of queues</param>
	/// <param name="queueSize">The size of each queue</param>
	public static void Run(int numberOfQueues = 100000, int queueSize = 5)
	{
		var sw = new System.Diagnostics.Stopwatch();
		var ta = MakeTestArray(numberOfQueues, queueSize);
		sw.Start();
		// Baseline: time the array traversal alone (no dequeuing)
		for (int i = 0; i < numberOfQueues; i++)
		{
			var current = ta[i];
		}
		sw.Stop();
		Console.Write(sw.Elapsed.Ticks + ",\t");

		sw.Restart();
		for (int i = 0; i < numberOfQueues; i++)
		{
			var current = ta[i];
			while (current.TryDequeue(out string _)) { }
		}
		sw.Stop();
		Console.Write(sw.Elapsed.Ticks + ",\t");

		ta = MakeTestArray(numberOfQueues, queueSize);
		sw.Restart();
		for (int i = 0; i < numberOfQueues; i++)
		{
			var current = ta[i];
			while (current.TryDequeueCount(out string _)) { }
		}
		sw.Stop();
		Console.Write(sw.Elapsed.Ticks + ",\t");

		ta = MakeTestArray(numberOfQueues, queueSize);
		sw.Restart();
		for (int i = 0; i < numberOfQueues; i++)
		{
			var current = ta[i];
			while (current.TryDequeueEnumerator(out string _)) { }
		}
		sw.Stop();
		Console.Write(sw.Elapsed.Ticks + ",\t");

		ta = MakeTestArray(numberOfQueues, queueSize);
		sw.Restart();
		for (int i = 0; i < numberOfQueues; i++)
		{
			var current = ta[i];
			while (current.TryDequeueThrow(out string _)) { }
		}
		sw.Stop();
		Console.WriteLine(sw.Elapsed.Ticks);
	}
}

Here is the average of 30 runs (all times in Ticks):



Average Time (in ticks) to Dequeue 100k Queues

| Items in Queue | Time to Traverse Test Data | Framework Implementation | Count Implementation | MoveNext Implementation | Interrupt Implementation |
|---:|---:|---:|---:|---:|---:|
| 1 | 85 | 1223 | 769 | 4250 | 2764586 |
| 2 | 89 | 2266 | 1355 | 7111 | 2900740 |
| 4 | 89 | 4182 | 2741 | 12683 | 2566929 |
| 8 | 85 | 7727 | 4964 | 25196 | 3890715 |
| 16 | 89 | 15792 | 8899 | 46304 | 1961727 |
| 32 | 85 | 29724 | 18785 | 89751 | 1975518 |
| 64 | 85 | 58004 | 36481 | 177254 | 1974013 |
| 128 | 89 | 114340 | 71530 | 351468 | 2526171 |
| 256 | 89 | 226954 | 142646 | 700127 | 2605587 |
| 512 | 85 | 453322 | 282012 | 1402411 | 2541266 |
| 1024 | 89 | 968186 | 563182 | 2828222 | 3158801 |
| 2048 | 89 | 1807073 | 1122391 | 5685464 | 4259148 |
| 4096 | 85 | 3695966 | 2261828 | 11236762 | 6901183 |
| 8192 | 89 | 7425176 | 4524679 | 22414515 | 11408915 |
| 16384 | 89 | 14570487 | 9608051 | 45315573 | 21544792 |
| 32768 | 98 | 29357790 | 18174311 | 91511936 | 39762457 |
| 65536 | 119 | 62032218 | 36157367 | 183406812 | 83570126 |


Let's normalize the times to the results of the framework implementation, shall we?



Normalized Average Time (with respect to Framework results)

| Items in Queue | Framework Implementation | Count Implementation | MoveNext Implementation | Interrupt Implementation |
|---:|---:|---:|---:|---:|
| 1 | 1 | 0.63 | 3.48 | 2260.5 |
| 2 | 1 | 0.6 | 3.14 | 1280.11 |
| 4 | 1 | 0.66 | 3.03 | 613.8 |
| 8 | 1 | 0.64 | 3.26 | 503.52 |
| 16 | 1 | 0.56 | 2.93 | 124.22 |
| 32 | 1 | 0.63 | 3.02 | 66.46 |
| 64 | 1 | 0.63 | 3.06 | 34.03 |
| 128 | 1 | 0.63 | 3.07 | 22.09 |
| 256 | 1 | 0.63 | 3.08 | 11.48 |
| 512 | 1 | 0.62 | 3.09 | 5.61 |
| 1024 | 1 | 0.58 | 2.92 | 3.26 |
| 2048 | 1 | 0.62 | 3.15 | 2.36 |
| 4096 | 1 | 0.61 | 3.04 | 1.87 |
| 8192 | 1 | 0.61 | 3.02 | 1.54 |
| 16384 | 1 | 0.66 | 3.11 | 1.48 |
| 32768 | 1 | 0.62 | 3.12 | 1.35 |
| 65536 | 1 | 0.58 | 2.96 | 1.35 |


Visually, this might make more sense on a log2 - log2 plot...





These results are somewhat interesting. First of all, since an exception is only raised once the Queue can no longer Dequeue, the interrupt pattern approaches framework times as the number of items in the queue increases. Of course, for sizes less than 2^10 it is still the least performant. We expected using MoveNext to be slower than framework times... roughly 3x, which seems reasonable. The other interesting finding is that the Count implementation of TryDequeue is consistently faster than the framework.

How is that possible? All Count does is return the queue's private _size field.
Let's look at the source code from the .NET runtime's Queue&lt;T&gt;...


// Dequeue implementation
public T Dequeue()
{
	int head = _head;
	T[] array = _array;
	if (_size == 0)
	{
		ThrowForEmptyQueue();
	}
	T result = array[head];
	if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
	{
		array[head] = default(T);
	}
	MoveNext(ref _head);
	_size--;
	_version++;
	return result;
}



// TryDequeue implementation
public bool TryDequeue([MaybeNullWhen(false)] out T result)
{
	int head = _head;
	T[] array = _array;
	if (_size == 0)
	{
		result = default(T);
		return false;
	}
	result = array[head];
	if (RuntimeHelpers.IsReferenceOrContainsReferences<T>())
	{
		array[head] = default(T);
	}
	MoveNext(ref _head);
	_size--;
	_version++;
	return true;
}

Really, there is no difference that explains the ~35% speed improvement of the Count implementation. Maybe this is one of the many times we have found that static methods can be faster than instance methods.
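Whatever the cause, it is worth confirming that the Count-based extension really is a drop-in replacement for the framework method. Here is a quick equivalence check (a sketch, assuming a runtime where Queue&lt;T&gt;.TryDequeue is available, i.e. .NET Core 2.0 / .NET Standard 2.1 or later; the extension is inlined so the snippet stands alone):

```csharp
using System;
using System.Collections.Generic;

static class QueueCheck
{
    // Copy of the Count-based extension from earlier in the post
    static bool TryDequeueCount<T>(this Queue<T> queue, out T result)
    {
        if ((queue?.Count ?? 0) > 0)
        {
            result = queue.Dequeue();
            return true;
        }
        result = default;
        return false;
    }

    static void Main()
    {
        var a = new Queue<int>(new[] { 1, 2, 3 });
        var b = new Queue<int>(new[] { 1, 2, 3 });

        while (true)
        {
            bool gotA = a.TryDequeue(out int x);      // framework implementation
            bool gotB = b.TryDequeueCount(out int y); // Count-based extension
            if (gotA != gotB || (gotA && x != y))
                throw new Exception("Mismatch");
            if (!gotA) break; // both queues drained, including the empty-queue case
        }
        Console.WriteLine("Equivalent");
    }
}
```

Both methods return the same items in the same order and agree on the empty-queue result, so the choice between them is purely a performance (and target-framework) question.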