Answer by Theodor Zoulias for How to limit the amount of concurrent async I/O operations?

This is my second answer, with a possibly improved version of Theo Yaung's solution (the accepted answer). It is also based on a SemaphoreSlim and enumerates the urls lazily, but it does not rely on Task.WhenAll for awaiting the tasks to complete. The SemaphoreSlim is used for that purpose too. This can be an advantage, because the completed tasks need not stay referenced for the duration of the whole operation. Instead, each task becomes eligible for garbage collection immediately after its completion.
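The idea of letting the semaphore double as the completion signal can be shown in isolation. What follows is only a minimal sketch, not the implementation in this answer: the names SemaphoreDrainSketch, RunAsync, RunOneAsync and Counters are made up for illustration, Task.Delay stands in for real I/O, and a discarded Task is used instead of an async void local function. Once the loop has started every operation, acquiring all maximumConcurrency permits can only succeed after each operation has released its own.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreDrainSketch
{
    // Hypothetical bookkeeping, only used to observe the behavior
    private sealed class Counters { public int InFlight, Peak, Completed; }

    public static async Task<(int Peak, int Completed)> RunAsync(
        int count, int maximumConcurrency)
    {
        var semaphore = new SemaphoreSlim(maximumConcurrency, maximumConcurrency);
        var counters = new Counters();

        for (int i = 0; i < count; i++)
        {
            // One permit per operation, so at most maximumConcurrency run at once
            await semaphore.WaitAsync().ConfigureAwait(false);
            _ = RunOneAsync(semaphore, counters); // The task is not stored anywhere
        }

        // Drain: holding every permit means that no operation is still pending
        for (int i = 0; i < maximumConcurrency; i++)
            await semaphore.WaitAsync().ConfigureAwait(false);

        lock (counters) return (counters.Peak, counters.Completed);
    }

    private static async Task RunOneAsync(SemaphoreSlim semaphore, Counters counters)
    {
        lock (counters)
        {
            counters.InFlight++;
            if (counters.InFlight > counters.Peak) counters.Peak = counters.InFlight;
        }
        try
        {
            await Task.Delay(20).ConfigureAwait(false); // Stands in for real I/O
        }
        finally
        {
            lock (counters) { counters.InFlight--; counters.Completed++; }
            semaphore.Release(); // Always give the permit back
        }
    }
}
```

Calling SemaphoreDrainSketch.RunAsync(10, 3) should report 10 completed operations and a peak that never exceeds 3, without any list of tasks being kept around.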

Two overloads of the ForEachAsync extension method are provided (the name is borrowed from Dogu Arslan's answer, the next most popular answer). One is for tasks that return a result, and one for tasks that do not. A nice extra feature is the onErrorContinue parameter, which controls the behavior in case of exceptions. The default is false, which mimics the behavior of Parallel.ForEach (which stops processing shortly after an exception) rather than the behavior of Task.WhenAll (which waits for all tasks to complete). In both cases the collected exceptions are thrown at the end, wrapped in an AggregateException.

public static async Task<TResult[]> ForEachAsync<TSource, TResult>(
    this IEnumerable<TSource> source,
    Func<TSource, Task<TResult>> action,
    int maximumConcurrency = 1,
    bool onErrorContinue = false)
{
    // Arguments validation omitted
    var semaphore = new SemaphoreSlim(maximumConcurrency, maximumConcurrency);
    var results = new List<TResult>();
    var exceptions = new ConcurrentQueue<Exception>();
    int index = 0;
    try
    {
        foreach (var item in source)
        {
            var localIndex = index++;
            lock (results) results.Add(default); // Reserve space in the list
            await semaphore.WaitAsync(); // continue on captured context
            if (!onErrorContinue && !exceptions.IsEmpty) { semaphore.Release(); break; }
            FireAndAwaitTask();

            async void FireAndAwaitTask()
            {
                try
                {
                    var task = action(item);
                    var result = await task.ConfigureAwait(false);
                    lock (results) results[localIndex] = result;
                }
                catch (Exception ex) { exceptions.Enqueue(ex); return; }
                finally { semaphore.Release(); }
            }
        }
    }
    catch (Exception ex) { exceptions.Enqueue(ex); }

    // Wait for all pending operations to complete
    for (int i = 0; i < maximumConcurrency; i++)
        await semaphore.WaitAsync().ConfigureAwait(false);
    if (!exceptions.IsEmpty) throw new AggregateException(exceptions);
    lock (results) return results.ToArray();
}

public static Task ForEachAsync<TSource>(
    this IEnumerable<TSource> source,
    Func<TSource, Task> action,
    int maximumConcurrency = 1,
    bool onErrorContinue = false)
{
    // Arguments validation omitted
    return ForEachAsync<TSource, object>(source, async item =>
    {
        await action(item).ConfigureAwait(false); return null;
    }, maximumConcurrency, onErrorContinue);
}

The action is invoked on the context of the caller, because the semaphore is awaited inside the loop without ConfigureAwait(false). This can be desirable because it allows, for example, UI elements to be accessed inside the lambda. If you prefer to invoke it on the ThreadPool instead, you can just wrap the supplied action in a Task.Run().
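The Task.Run wrapping can be sketched as a small adapter. The OnThreadPool name and the ThreadPoolActionSketch class are hypothetical, introduced only for this illustration. The adapter takes an action of the shape that ForEachAsync expects and returns one with the same shape that starts on a ThreadPool thread.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadPoolActionSketch
{
    // Hypothetical helper: wraps an asynchronous action so that it is
    // started on the ThreadPool instead of on the caller's context.
    public static Func<TSource, Task<TResult>> OnThreadPool<TSource, TResult>(
        Func<TSource, Task<TResult>> action)
    {
        if (action == null) throw new ArgumentNullException(nameof(action));
        // Task.Run has an overload for Func<Task<TResult>> that unwraps the
        // inner task, so the returned task completes with the action's result.
        return item => Task.Run(() => action(item));
    }
}
```

It could then be passed as urls.ForEachAsync(ThreadPoolActionSketch.OnThreadPool<string, string>(url => httpClient.GetStringAsync(url)), maximumConcurrency: 10). Keep in mind that an action wrapped this way no longer runs on the UI thread, so it should not touch UI elements directly.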

To keep things simple, the Task-returning ForEachAsync overload is implemented suboptimally, by calling the generic Task<TResult[]> overload.

Usage example:

await urls.ForEachAsync(async url =>
{
    var html = await httpClient.GetStringAsync(url);
    TextBox1.AppendText($"Url: {url}, {html.Length:#,0} chars\r\n");
}, maximumConcurrency: 10, onErrorContinue: true);
