C# : Task Parallel Library (TPL) with async await and TaskCompletionSource for async I/O operations

Asynchronous operations have long been a part of C# and the .NET Framework. In earlier posts we discussed asynchronous programming using the delegate’s BeginInvoke and EndInvoke methods and the Event-based Asynchronous Pattern (EAP). These patterns work well and enable developers to introduce asynchrony into their applications, especially GUI-based ones. EAP in particular is easy to use and takes care of the callback once the asynchronous operation has completed.

However, there was still scope for more developer-friendly code, and C# 5.0 took things a notch higher by introducing the async and await keywords. In layman’s terms, these keywords don’t introduce asynchrony by themselves; they build on the underlying .NET infrastructure (the Task types), and the compiler rewrites an async method into a state machine. In other words, they are syntactic sugar, but very powerful as far as code readability goes.

Though this post refers to Task Parallel Library (TPL), we will concentrate exclusively on the more important aspect of TPL – the usage of Task class with async and await keywords. We will also go over CPU bound and I/O bound async operations and some best practices involving them both.

Lastly, we will also take a look at TaskCompletionSource<TResult> which is especially useful when you don’t want to hold up threads during I/O operations.

Let us start with the most important member of TPL – the Task class.

Consider a very simple example where we do some CPU intensive work: generating random numbers and words. Here is the sample WPF code. We have two functions, GenerateRandomNumbers() and GenerateRandomWords(), doing exactly what their names describe. Two ListBox controls display these random numbers and words.

XAML :
<Window x:Class="WpfApp2.MainWindow"
        xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
        xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
        xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
        xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
        xmlns:local="clr-namespace:WpfApp2"
        mc:Ignorable="d"
        Title="MainWindow" Height="653.097" Width="1086.726">
   
    <Grid RenderTransformOrigin="-0.091,-0.032" Margin="0,0,-45,-29.8" Background="LightSlateGray">
        <ListBox Name="listBoxRandomNum"  HorizontalAlignment="Left" Height="241" Margin="85,44,0,0" VerticalAlignment="Top" Width="363"/>
        <Button Margin="494,44,302,486" FontSize="23" Click="btnRandomNum_Click">
            <TextBlock Text="Generate Random Numbers and Words" TextWrapping="Wrap" TextAlignment="Center"/>
        </Button>
        <ListBox x:Name="listBoxRandomWords"  HorizontalAlignment="Left" Height="241" Margin="85,313,0,0" VerticalAlignment="Top" Width="363"/>
    </Grid>
</Window>
Code Behind ( Synchronous Code )
        private void btnRandomNum_Click(object sender, RoutedEventArgs e)
        {
            GenerateRandomNumbers();
            GenerateRandomWords();
        }
        private void GenerateRandomNumbers()
        {
            //the relatively expensive operation of generating random numbers.
            //there is no I/O involved in this. It is a purely CPU based operation.
            List<int> lstRandomNum = new List<int>();
            Random r = new Random();
            for (int count = 0; count < 2000000; count++)
            {
                lstRandomNum.Add(r.Next());
            }
            //iterate the List and fill up the ListBox.
            foreach(int number in lstRandomNum)
            {
                listBoxRandomNum.Items.Add(number);
            }            
        }
        private void GenerateRandomWords()
        {
            //there is no I/O involved in this. It is a purely CPU based operation.
            List<string> lstAlphabets = new List<string>() { "a", "f", "i", "y", "u", "i", "r", "b", "h", "x", "c", "v" };
            List<string> lstRandomWords = new List<string>();
            Random r = new Random();
            //generate 20,000 three-letter words.
            for (int count = 0; count < 20000; count++)
            {
                //the upper bound of Next() is exclusive, so use Count to make every letter selectable.
                lstRandomWords.Add(lstAlphabets[r.Next(lstAlphabets.Count)] + lstAlphabets[r.Next(lstAlphabets.Count)] + lstAlphabets[r.Next(lstAlphabets.Count)]);
            }
            foreach (string randomWord in lstRandomWords)
            {
                listBoxRandomWords.Items.Add(randomWord);
            }
        }

Both these functions are invoked sequentially, one after the other, on the button’s click event. This results in a blocked UI as soon as the first function is invoked. We can use the typical callback mechanism or a delegate with the BeginInvoke() and EndInvoke() methods to achieve asynchrony here. However, this would require either an event handler or polling IAsyncResult to get the results and populate the ListBox.

The Task Class

A good alternative is the Task class. The Task class represents a “promise” to the caller signifying that the code wrapped in it would be executed asynchronously and the results will be delivered when the operation completes. The caller doesn’t need to bother about how and when the Task would complete and can continue executing other code.

The most commonly used member of the Task class is the Task.Run() method. It queues the wrapped code, passed as a delegate, to the .NET ThreadPool, where a pool thread runs it for the entire course of its execution. It has several overloads. The synchronous code above has now been changed to use the Task.Run() method.

Code with Task.Run() :

        private void GenerateRandomNumbers()
        {
            List<int> lstRandomNum = new List<int>();
            //the relatively expensive operation of generating random numbers.
            //there is no I/O involved in this. It is a purely CPU based operation.
            Task t = Task.Run(() =>
            {
                Random r = new Random();
                for (int count = 0; count < 20000000; count++)
                {
                    lstRandomNum.Add(r.Next());
                }
            });
            t.Wait();
            //iterate the List and fill up the ListBox.
            foreach (int number in lstRandomNum)
            {
                listBoxRandomNum.Items.Add(number);
            }            
        }
        private void GenerateRandomWords()
        {
            //there is no I/O involved in this. It is a purely CPU based operation.
            List<string> lstAlphabets = new List<string>() { "a", "f", "i", "y", "u", "i", "r", "b", "h", "x", "c", "v" };
            List<string> lstRandomWords = new List<string>();
            Task t = Task.Run(() =>
            {
                Random r = new Random();
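                //generate 20,000,000 three-letter words.
                for (int count = 0; count < 20000000; count++)
                {
                    //the upper bound of Next() is exclusive, so use Count to make every letter selectable.
                    lstRandomWords.Add(lstAlphabets[r.Next(lstAlphabets.Count)] + lstAlphabets[r.Next(lstAlphabets.Count)] + lstAlphabets[r.Next(lstAlphabets.Count)]);
                }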
            });
            t.Wait();
            foreach (string randomWord in lstRandomWords)
            {
                listBoxRandomWords.Items.Add(randomWord);
            }
        }

You can see that the random number and word generation has been wrapped in Task.Run(). However, because we need the results of the code executed by Task.Run() before filling the ListBox, we have to call the Wait() method of the Task class, which blocks the calling thread (the UI thread here) until the Task is complete, so the UI still freezes. This wouldn’t be required if you didn’t need the result of the task immediately.
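The idea that a caller can start a Task, carry on with other work, and collect the result only when it is needed can be sketched in a small console program. This is an illustration rather than part of the WPF sample; the names PromiseDemo and SumUpTo are made up for it.

```csharp
using System;
using System.Threading.Tasks;

static class PromiseDemo
{
    // Hypothetical CPU-bound helper: adds the integers 1..n.
    public static long SumUpTo(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++)
        {
            sum += i;
        }
        return sum;
    }

    static void Main()
    {
        // Task.Run returns immediately with a Task<long> that stands in for the future result.
        Task<long> promise = Task.Run(() => SumUpTo(1000000));

        Console.WriteLine("The caller keeps working while the sum is computed...");

        // Result blocks only if the task has not finished yet (acceptable in a console app,
        // but it would freeze the UI thread of a GUI application just like Wait()).
        Console.WriteLine("Sum = " + promise.Result);
    }
}
```

In a GUI application the blocking call at the end is exactly what we want to avoid, which is the gap the next section fills.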

async and await

This is where the async and await keywords come in. They enable us to mark the method as “async” and then “await” an asynchronous operation inside the method. Advantages are  :

  • You don’t need to explicitly define a callback or an event handler when the async operation completes.
  • Code readability is very good
  • Since the await keyword typically operates on a Task object, it is easy to get a method to return a Task encapsulating some work and then await it.

Before we change the above code, here is a small sample code which uses async await keywords :

        //returning Task instead of void lets callers await the method and observe its exceptions.
        private async Task DoAsyncOperation()
        {
            await Task.Delay(1000);

            //do some other dependent work here.
        }



We have simulated async behavior using Task.Delay(1000), which creates a Task that completes after 1 second when the DoAsyncOperation() method is called. The async keyword in the method signature tells the compiler that this method will await an asynchronous operation using the await keyword. When the await is reached and the awaited operation has not yet completed, control passes back to the calling code while the asynchronous operation continues. Once the asynchronous operation is over, the rest of the method is executed.
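The order of execution described above can be made visible with a small console sketch. It uses a variant of DoAsyncOperation() that returns Task so the caller can await it; the Log list and the RunAsync() method are hypothetical helpers added only for this illustration, and async Main assumes C# 7.1 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ControlFlowDemo
{
    // Records the order in which the steps run.
    public static readonly List<string> Log = new List<string>();

    public static async Task DoAsyncOperation()
    {
        Log.Add("1: before await");
        // The delay has not completed yet, so control returns to the caller here.
        await Task.Delay(1000);
        Log.Add("3: after await");
    }

    public static async Task RunAsync()
    {
        // Runs synchronously up to the first incomplete await, then returns a Task.
        Task pending = DoAsyncOperation();
        Log.Add("2: caller continues");
        // Waits for the rest of the method without blocking a thread.
        await pending;
    }

    static async Task Main()
    {
        await RunAsync();
        foreach (string entry in Log)
        {
            Console.WriteLine(entry);
        }
    }
}
```

The entries are logged in the order 1, 2, 3: the caller's line runs while the delay is still pending, and the remainder of DoAsyncOperation() runs only after the delay completes.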

Let us change our random numbers and words example by using the async await keywords. Here is the code.

        //returning Task instead of void lets callers await the method and observe its exceptions.
        private async Task GenerateRandomNumbers()
        {
            List<int> lstRandomNum = new List<int>();
            //the relatively expensive operation of generating random numbers.
            //there is no I/O involved in this. It is a purely CPU based operation.
            await Task.Run(() =>
            {
                Random r = new Random();
                for (int count = 0; count < 20000000; count++)
                {
                    lstRandomNum.Add(r.Next());
                }
            });         

            listBoxRandomNum.ItemsSource = lstRandomNum.AsEnumerable();
        }
        private async Task GenerateRandomWords()
        {
            //there is no I/O involved in this. It is a purely CPU based operation.
            List<string> lstAlphabets = new List<string>() { "a", "f", "i", "y", "u", "i", "r", "b", "h", "x", "c", "v" };
            List<string> lstRandomWords = new List<string>();
            await Task.Run(() =>
            {
                Random r = new Random();
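                //generate 20,000,000 three-letter words.
                for (int count = 0; count < 20000000; count++)
                {
                    //the upper bound of Next() is exclusive, so use Count to make every letter selectable.
                    lstRandomWords.Add(lstAlphabets[r.Next(lstAlphabets.Count)] + lstAlphabets[r.Next(lstAlphabets.Count)] + lstAlphabets[r.Next(lstAlphabets.Count)]);
                }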
            });

            listBoxRandomWords.ItemsSource = lstRandomWords;
        }

The methods GenerateRandomNumbers() and GenerateRandomWords() are now marked with the async keyword, which allows await to be used inside them. Task.Run() hands the random number/word calculation to a thread from the .NET ThreadPool. When the await keyword is encountered and that work has not yet finished, the method returns control to its caller on the calling thread (the UI thread in our case), so the UI stays responsive. When the work completes, the rest of the method resumes on the UI thread, which is why it is safe to set ItemsSource there.

The main takeaway here is that when you mark a method as async and then use the await keyword inside it, you are essentially making the entire method asynchronous, enabling it to yield control to the calling code even when the method execution has not finished. This way, the calling code can continue performing other non-dependent tasks.

In this run, the random number generation finishes first and its ListBox is populated with the result, while the random word generation continues without blocking the UI.

CPU Bound Operations

The above example involved generating random numbers and words, a relatively lengthy calculation using the Random class. This is a CPU bound operation: the work is done entirely by the CPU. A thread is borrowed from the ThreadPool and runs the Task’s delegate until it finishes. This is OK because the thread is actually “doing some work” and not waiting for some other dependent process to complete.
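As a minimal console sketch of a CPU bound operation offloaded with Task.Run(), the program below counts primes on a ThreadPool thread and prints the thread IDs involved. CountPrimes is a hypothetical stand-in for the random number generation, and async Main assumes C# 7.1 or later.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class CpuBoundDemo
{
    // Hypothetical CPU-bound work: counts the primes below 'limit' by trial division.
    public static int CountPrimes(int limit)
    {
        int count = 0;
        for (int n = 2; n < limit; n++)
        {
            bool isPrime = true;
            for (int d = 2; d * d <= n; d++)
            {
                if (n % d == 0)
                {
                    isPrime = false;
                    break;
                }
            }
            if (isPrime)
            {
                count++;
            }
        }
        return count;
    }

    static async Task Main()
    {
        Console.WriteLine("Caller thread: " + Thread.CurrentThread.ManagedThreadId);

        int primes = await Task.Run(() =>
        {
            // This delegate runs on a ThreadPool thread and keeps it busy with real work.
            Console.WriteLine("Worker thread: " + Thread.CurrentThread.ManagedThreadId
                + ", pool thread: " + Thread.CurrentThread.IsThreadPoolThread);
            return CountPrimes(100000);
        });

        Console.WriteLine("Primes below 100000: " + primes);
    }
}
```

The pool thread is occupied for the whole calculation, which is a reasonable use of it because the thread is computing rather than waiting.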

I/O Bound Operations

I/O operations involve work that happens outside the CPU, so the CPU mostly waits for it. For instance, they could consist of database operations or reading/writing something over the network. The .NET Framework provides many built-in methods for such operations, like BeginExecuteReader(), BeginExecuteNonQuery() etc. Basically, these methods start the I/O operation, and in most cases a callback is then used to perform the post-I/O work, like binding results to a data grid or writing them to a file. This is okay in itself if we go back to the older C# style of calling an async I/O method and implementing callbacks.

However, since async and await are already at our disposal, it makes sense to use them with the existing .NET asynchronous methods. This cannot be done directly, since the above methods return an IAsyncResult rather than a Task. It is not impossible to await types other than Task, but it requires some work. Read this excellent post by Stephen Toub if you want to explore creating awaitables – await anything
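To give a flavour of what “awaiting something other than a Task” means, here is a minimal sketch: the compiler accepts any type for which it can find a suitable GetAwaiter() method, including an extension method. The class and method names below are illustrative assumptions, and the awaiter simply delegates to Task.Delay().

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

static class TimeSpanAwaiterExtensions
{
    // An extension GetAwaiter() is enough to make TimeSpan awaitable.
    public static TaskAwaiter GetAwaiter(this TimeSpan delay)
    {
        return Task.Delay(delay).GetAwaiter();
    }
}

static class AwaitAnythingDemo
{
    // Awaits a TimeSpan directly and reports how long the wait took.
    public static async Task<long> WaitAsync(int milliseconds)
    {
        Stopwatch watch = Stopwatch.StartNew();
        await TimeSpan.FromMilliseconds(milliseconds);
        return watch.ElapsedMilliseconds;
    }

    static async Task Main()
    {
        long elapsed = await WaitAsync(200);
        Console.WriteLine("Resumed after roughly " + elapsed + " ms");
    }
}
```

For wrapping an existing callback-based API, a Task is usually the more practical thing to hand back than a custom awaiter.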

Note : We should never wrap an existing I/O operation in Task.Run() and then await it. This ties up a ThreadPool thread for the entire duration of the I/O operation without the thread actually doing anything, because during I/O the CPU is mostly idle.
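The difference can be sketched with file I/O, which needs no database. The first method below is the discouraged pattern from the note; the second uses the stream’s own asynchronous API. The class and method names and the temporary file are assumptions made for this illustration.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class IoBoundDemo
{
    // Discouraged: a pool thread sits blocked for the whole duration of the read.
    public static Task<string> ReadBlockingOnPoolThreadAsync(string path)
    {
        return Task.Run(() => File.ReadAllText(path));
    }

    // Preferred: the stream's asynchronous API; no thread is blocked while the read is pending.
    public static async Task<string> ReadWithAsyncIoAsync(string path)
    {
        using (FileStream stream = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read, 4096, useAsync: true))
        using (StreamReader reader = new StreamReader(stream))
        {
            return await reader.ReadToEndAsync();
        }
    }

    static async Task Main()
    {
        // Temporary file used only for this demo.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "hello async I/O");
            Console.WriteLine(await ReadBlockingOnPoolThreadAsync(path));
            Console.WriteLine(await ReadWithAsyncIoAsync(path));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Both calls return the same text; what differs is whether a ThreadPool thread is held idle while the operating system performs the read.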

Using TaskCompletionSource<TResult> with async await for I/O bound operations

Let’s take an example to demonstrate awaiting an I/O operation that involves executing a SELECT query against a SQL Server database using the .NET method BeginExecuteReader(). Since this is an asynchronous method, it returns immediately.

public IAsyncResult BeginExecuteReader (AsyncCallback callback, object stateObject);

An AsyncCallback delegate is passed to it, and this callback is executed once the DB operation completes. Normally, you could just call this method to execute the SELECT query and then use the returned IAsyncResult to poll whether the operation has completed. Or you could call the EndExecuteReader() method, which blocks the thread till the query finishes executing.

The TaskCompletionSource class exposes a Task object (through its Task property) that “represents” the underlying I/O operation, but whose completion can only be controlled through the TaskCompletionSource instance. The AsyncCallback callback method can be used to set the result on the Task object. Let us take an example and then dissect it to understand the flow.

We have a simple database operation getting all records from a SQL Server table using the async method SqlCommand.BeginExecuteReader(). These records are inserted into a list and the list count is displayed on the console.

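    using System;
    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Threading.Tasks;

    class Program
    {
        //a TaskCompletionSource can be completed only once, so this static instance supports a single query per run.
        static TaskCompletionSource<List<string>> taskCompletionSource = new TaskCompletionSource<List<string>>();
        static void Main(string[] args)
        {
            GetDBResultsAsync();
            Console.WriteLine("async execution is still in progress, please wait for some time...");
            Console.ReadLine();
        }

        private static async Task GetDBResultsAsync()
        {
            try
            {
                Console.WriteLine("getting record count from MyTable");
                List<string> recordList = await GetTaskResultsFromMyTable();
                /*more code goes here*/
                Console.WriteLine("record count = " + recordList.Count.ToString());
            }
            catch (Exception ex)
            {
                //failures reported through the Task (or thrown while opening the connection) end up here.
                Console.WriteLine("query failed: " + ex.Message);
            }
        }

        private static Task<List<string>> GetTaskResultsFromMyTable()
        {
            SqlConnection connection = new SqlConnection("Data Source=S734410-W10;Initial Catalog=MyDB;Integrated Security=True");
            connection.Open();
            SqlCommand command = new SqlCommand("select * from mytable (nolock)", connection);

            //the command is passed as the state object so that the callback can retrieve it.
            AsyncCallback callback = new AsyncCallback(ResultHandler);
            command.BeginExecuteReader(callback, command);

            //return the Task right away; it completes when ResultHandler sets the result.
            return taskCompletionSource.Task;
        }

        private static void ResultHandler(IAsyncResult result)
        {
            SqlCommand command = (SqlCommand)result.AsyncState;
            try
            {
                List<string> stringValueList = new List<string>();
                using (SqlDataReader reader = command.EndExecuteReader(result))
                {
                    while (reader.Read())
                    {
                        //column index 1 is expected to hold a non-null string value.
                        stringValueList.Add((string)reader.GetValue(1));
                    }
                }

                //completes the Task; the awaiting method resumes with this list.
                taskCompletionSource.SetResult(stringValueList);
            }
            catch (Exception ex)
            {
                //surface the failure to the awaiting caller instead of leaving the Task incomplete forever.
                taskCompletionSource.TrySetException(ex);
            }
            finally
            {
                command.Connection.Close();
            }
        }
    }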

We make a call to GetDBResultsAsync(), which is marked as async. It awaits the method GetTaskResultsFromMyTable(), which returns a Task<List<string>> object. GetTaskResultsFromMyTable() uses BeginExecuteReader() to query the database and handles the completion in the callback method ResultHandler(). However, where does the Task object that is awaited in the calling method come from?

This is where the TaskCompletionSource<TResult> class comes into the picture. It gives us access to the Task object we can await.

There are 3 things to notice :

  1. We have declared a static TaskCompletionSource<List<string>> field: static TaskCompletionSource<List<string>> taskCompletionSource = new TaskCompletionSource<List<string>>();
  2. As soon as we call the BeginExecuteReader() asynchronous method, we return the taskCompletionSource.Task object, which is awaited in the calling method. This way, GetTaskResultsFromMyTable() becomes awaitable.
  3. When the SELECT query finishes executing, the callback method ResultHandler() is called, where we call the TaskCompletionSource.SetResult() method. As soon as we call SetResult(), the awaited Task transitions to the completed state and the awaiting method, GetDBResultsAsync(), resumes with the entire data set as a List<string>.
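The same three steps can be tried without a database. The sketch below applies them to the BeginRead()/EndRead() pair of a Stream, using a MemoryStream as a stand-in for the data source. The names ApmToTaskDemo and ReadAsTask are hypothetical, and unlike the sample above it creates one TaskCompletionSource per call and reports failures through SetException().

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading.Tasks;

static class ApmToTaskDemo
{
    // Wraps the BeginRead/EndRead pair of a Stream in a Task<int>.
    public static Task<int> ReadAsTask(Stream stream, byte[] buffer)
    {
        // Step 1: one TaskCompletionSource per call, so every call gets its own Task.
        var tcs = new TaskCompletionSource<int>();

        stream.BeginRead(buffer, 0, buffer.Length, asyncResult =>
        {
            try
            {
                // Step 3: the callback completes the Task with the number of bytes read.
                tcs.SetResult(stream.EndRead(asyncResult));
            }
            catch (Exception ex)
            {
                // The awaiting caller sees the failure instead of waiting forever.
                tcs.SetException(ex);
            }
        }, null);

        // Step 2: hand the Task back immediately so the caller can await it.
        return tcs.Task;
    }

    static async Task Main()
    {
        byte[] data = Encoding.UTF8.GetBytes("rows from an imaginary table");
        using (var stream = new MemoryStream(data))
        {
            byte[] buffer = new byte[data.Length];
            int bytesRead = await ReadAsTask(stream, buffer);
            Console.WriteLine("bytes read = " + bytesRead);
        }
    }
}
```

Creating the TaskCompletionSource inside the method is a design choice worth considering for real code, since a TaskCompletionSource can be completed only once and a shared instance cannot serve a second call.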

Summary

async and await can be very useful for ease of coding in asynchronous programming. TaskCompletionSource is especially useful when you need to have control over the results of an asynchronous operation and return the results to be awaited using the async await style of coding.
