This is my second answer, with a possibly improved version of Theo Yaung's solution (the accepted answer). It is also based on a `SemaphoreSlim` and enumerates the urls lazily, but it does not rely on `Task.WhenAll` for awaiting the tasks to complete. The `SemaphoreSlim` is used for this purpose too. This can be an advantage, because it means that the completed tasks need not be referenced during the whole operation. Instead, each task is eligible for garbage collection immediately after its completion.
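The idea of using the semaphore itself to await completion can be shown in isolation. The following is a minimal sketch, not the implementation from this answer: the names `SemaphoreDrainDemo`, `RunAsync` and `WorkAsync` are made up for illustration, `Task.Delay` stands in for real work, and a discarded `Task` is used instead of an `async void` local function.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreDrainDemo
{
    // Runs `count` simulated operations with at most `maximumConcurrency` in flight,
    // and returns how many had completed by the time the drain loop finished.
    public static async Task<int> RunAsync(int count, int maximumConcurrency)
    {
        var semaphore = new SemaphoreSlim(maximumConcurrency, maximumConcurrency);
        int completed = 0;

        for (int i = 0; i < count; i++)
        {
            await semaphore.WaitAsync(); // Acquire a slot before starting an operation
            _ = WorkAsync();             // The task is deliberately not stored anywhere
        }

        // Drain: holding every slot means that no operation can still be running
        for (int i = 0; i < maximumConcurrency; i++)
            await semaphore.WaitAsync();

        return Volatile.Read(ref completed);

        async Task WorkAsync()
        {
            try
            {
                await Task.Delay(10); // Stand-in for real asynchronous work
                Interlocked.Increment(ref completed);
            }
            finally
            {
                semaphore.Release(); // Always give the slot back
            }
        }
    }

    public static async Task Main()
    {
        int completed = await RunAsync(count: 20, maximumConcurrency: 4);
        Console.WriteLine($"Completed: {completed}");
    }
}
```

Because every operation releases its slot only after it has finished, acquiring all `maximumConcurrency` slots at the end is enough to know that everything has completed, without keeping a list of tasks.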
Two overloads of the `ForEachAsync` extension method are provided (the name is borrowed from Dogu Arslan's answer, the next most popular answer). One is for tasks that return a result, and one for tasks that do not. A nice extra feature is the `onErrorContinue` parameter, which controls the behavior in case of exceptions. The default is `false`, which mimics the behavior of `Parallel.ForEach` (that stops processing shortly after an exception), and not the behavior of `Task.WhenAll` (that waits for all tasks to complete).
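For reference, the `Task.WhenAll` behavior mentioned above can be observed with a small experiment. This is only an illustrative sketch: `WhenAllDemo`, `SlowTaskCompletedAsync` and `SlowAsync` are hypothetical names, and the 100 msec delay is an arbitrary stand-in for an operation that is still in flight when another one fails.

```csharp
using System;
using System.Threading.Tasks;

public static class WhenAllDemo
{
    // Returns true if the slow task had finished by the time WhenAll surfaced the error
    public static async Task<bool> SlowTaskCompletedAsync()
    {
        bool slowCompleted = false;
        Task failing = Task.FromException(new InvalidOperationException("Simulated failure"));
        Task slow = SlowAsync();

        try
        {
            await Task.WhenAll(failing, slow); // Completes only after both tasks complete
        }
        catch (InvalidOperationException)
        {
            // The failure is observed here, after the slow task has also finished
        }
        return slowCompleted;

        async Task SlowAsync()
        {
            await Task.Delay(100); // Stand-in for an operation still in flight
            slowCompleted = true;
        }
    }

    public static async Task Main()
    {
        Console.WriteLine(await SlowTaskCompletedAsync());
    }
}
```

With `onErrorContinue: false`, the `ForEachAsync` below differs in that it stops starting new operations once an error has been recorded, although it still waits for the operations that are already running.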

    public static async Task<TResult[]> ForEachAsync<TSource, TResult>(
        this IEnumerable<TSource> source,
        Func<TSource, Task<TResult>> action,
        int maximumConcurrency = 1,
        bool onErrorContinue = false)
    {
        // Arguments validation omitted
        var semaphore = new SemaphoreSlim(maximumConcurrency, maximumConcurrency);
        var results = new List<TResult>();
        var exceptions = new ConcurrentQueue<Exception>();
        int index = 0;
        try
        {
            foreach (var item in source)
            {
                var localIndex = index++;
                lock (results) results.Add(default); // Reserve space in the list
                await semaphore.WaitAsync(); // continue on captured context
                if (!onErrorContinue && !exceptions.IsEmpty)
                {
                    semaphore.Release();
                    break;
                }
                FireAndAwaitTask();

                async void FireAndAwaitTask()
                {
                    try
                    {
                        var task = action(item);
                        var result = await task.ConfigureAwait(false);
                        lock (results) results[localIndex] = result;
                    }
                    catch (Exception ex)
                    {
                        exceptions.Enqueue(ex);
                        return;
                    }
                    finally
                    {
                        semaphore.Release();
                    }
                }
            }
        }
        catch (Exception ex)
        {
            exceptions.Enqueue(ex);
        }

        // Wait for all pending operations to complete
        for (int i = 0; i < maximumConcurrency; i++)
            await semaphore.WaitAsync().ConfigureAwait(false);
        if (!exceptions.IsEmpty) throw new AggregateException(exceptions);
        lock (results) return results.ToArray();
    }

    public static Task ForEachAsync<TSource>(
        this IEnumerable<TSource> source,
        Func<TSource, Task> action,
        int maximumConcurrency = 1,
        bool onErrorContinue = false)
    {
        // Arguments validation omitted
        return ForEachAsync<TSource, object>(source, async item =>
        {
            await action(item).ConfigureAwait(false);
            return null;
        }, maximumConcurrency, onErrorContinue);
    }

The `action` is invoked on the context of the caller. This can be desirable because it allows (for example) UI elements to be accessed inside the lambda. In case it is preferable to invoke it on the `ThreadPool` context, you can just wrap the supplied `action` in a `Task.Run()`.
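Wrapping the `action` in `Task.Run` could look something like the sketch below. The `OnThreadPool` helper is hypothetical (it is not part of the code above), and it only assumes the `Func<TSource, Task<TResult>>` delegate shape that `ForEachAsync` accepts.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ThreadPoolWrapDemo
{
    // Hypothetical helper: returns a delegate that invokes `action` on the ThreadPool
    public static Func<TSource, Task<TResult>> OnThreadPool<TSource, TResult>(
        Func<TSource, Task<TResult>> action)
    {
        // This Task.Run overload unwraps the inner task, so the result is a Task<TResult>
        return item => Task.Run(() => action(item));
    }

    public static async Task Main()
    {
        // Reports whether the synchronous part of the delegate runs on a ThreadPool thread
        Func<int, Task<bool>> action = item =>
            Task.FromResult(Thread.CurrentThread.IsThreadPoolThread);

        bool wrapped = await OnThreadPool(action)(42);
        Console.WriteLine($"Wrapped delegate ran on the ThreadPool: {wrapped}");
    }
}
```

The wrapped delegate can then be passed in place of the original one, for example `urls.ForEachAsync(ThreadPoolWrapDemo.OnThreadPool<string, string>(DownloadAsync), maximumConcurrency: 10)`, where `DownloadAsync` is a placeholder for your own method. Keep in mind that UI elements can no longer be accessed inside the wrapped `action`.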
To keep things simple, the `Task ForEachAsync` overload is implemented suboptimally, by calling the generic `Task<TResult[]>` overload and discarding the resulting array.
Usage example:

    await urls.ForEachAsync(async url =>
    {
        var html = await httpClient.GetStringAsync(url);
        TextBox1.AppendText($"Url: {url}, {html.Length:#,0} chars\r\n");
    }, maximumConcurrency: 10, onErrorContinue: true);