This content originally appeared on Level Up Coding - Medium and was authored by Stewart Celani
Recently I was working on a project to back up files via the Autodesk BIM360/Construction Cloud API.
My initial prototype took 12 hours to back up roughly 100 GB (40,230 files in 11,737 folders), which wasn’t good. Given how quickly data is being added to the account, the backup would be taking over 24 hours within 1–2 years.
After a program re-write, which included moving the file downloading logic from a synchronous foreach loop to a Parallel.ForEachAsync loop, I was able to get the nightly backup time down to 3 hours.
I wanted to quantify how much of the 4x performance improvement was down to the re-write vs the Parallel.ForEachAsync loop, so I ran two benchmarks in the awesome BenchmarkDotNet.
The first was on a small project containing 300 MB (136 files in 30 folders):
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1645 (21H2)
Intel Core i7-5820K CPU 3.30GHz (Broadwell), 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.202
[Host] : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
Job-KSORIT : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
IterationCount=3 LaunchCount=1 WarmupCount=1
| Method | Mean | Error | StdDev |
|---------------- |--------:|--------:|-------:|
| ParallelLoop | 123.2 s | 61.06 s | 3.35 s |
| SynchronousLoop | 299.8 s | 74.74 s | 4.10 s |
The second was on a large project containing 26 GB (10,653 files in 973 folders):
BenchmarkDotNet=v0.13.1, OS=Windows 10.0.19044.1706 (21H2)
Intel Core i7-5820K CPU 3.30GHz (Broadwell), 1 CPU, 12 logical and 6 physical cores
.NET SDK=6.0.202
[Host] : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
Job-QIVIZX : .NET 6.0.4 (6.0.422.16404), X64 RyuJIT
IterationCount=3 LaunchCount=1 RunStrategy=Monitoring
WarmupCount=0
| Method | Mean | Error | StdDev |
|---------------- |---------:|---------:|---------:|
| ParallelLoop | 98.83 m | 12.97 m | 0.711 m |
| SynchronousLoop | 300.62 m | 368.38 m | 20.192 m |
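So the improvement from Parallel.ForEachAsync alone is roughly 3x on the large project (300.62 m vs 98.83 m) and about 2.4x on the small one (299.8 s vs 123.2 s).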
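For anyone who wants to check the ratios, here is a minimal sketch of the arithmetic using the mean times from the two tables above. The variable names are my own and are not from the benchmark code.

```csharp
using System;
using System.Globalization;

// Mean times copied from the two benchmark tables above.
double smallSynchronous = 299.8;   // seconds, benchmark 1
double smallParallel = 123.2;      // seconds, benchmark 1
double largeSynchronous = 300.62;  // minutes, benchmark 2
double largeParallel = 98.83;      // minutes, benchmark 2

// Speedup = synchronous mean / parallel mean (units cancel within each benchmark).
double smallSpeedup = smallSynchronous / smallParallel;
double largeSpeedup = largeSynchronous / largeParallel;

var inv = CultureInfo.InvariantCulture;
Console.WriteLine($"Small project: {smallSpeedup.ToString("F2", inv)}x");  // prints "Small project: 2.43x"
Console.WriteLine($"Large project: {largeSpeedup.ToString("F2", inv)}x");  // prints "Large project: 3.04x"
```

With only 3 iterations per benchmark the error bars are wide, so these ratios are best read as ballpark figures.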
I would have loved to run more iterations of benchmark 2 to get the error and stddev down, but it took a day to run as is, and the results match both benchmark 1 and the nightly real-world performance of the 100 GB backup.
Note that the ~100 minutes for the 26 GB backup in the benchmark above was measured on my home internet connection, which is much slower than the connection the backup runs from each night.
The actual benchmark runner:
The SynchronousLoop task on line 49 does what you'd expect: a foreach loop iterates over an IEnumerable of files and passes each one down to a DownloadFile method:
The file downloading logic is wrapped in a Polly AsyncRetryPolicy and is essentially 3 lines (20–22).
Both the SynchronousLoop and the ParallelLoop use the above method to download files.
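As a rough illustration of that shape (not the listing from the original post), here is a minimal sketch of a sequential foreach loop handing each file to a DownloadFile method that retries on failure. To stay dependency-free, it assumes a hand-rolled retry loop in place of Polly's AsyncRetryPolicy and a hypothetical FakeTransferAsync helper in place of the real API call. The file names and retry counts are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var attempts = new Dictionary<string, int>();
var downloaded = new List<string>();

// Hypothetical stand-in for the real HTTP transfer:
// "flaky.rvt" fails on its first two attempts, everything else succeeds.
async Task FakeTransferAsync(string file)
{
    await Task.Delay(1);
    attempts[file] = attempts.GetValueOrDefault(file) + 1;
    if (file == "flaky.rvt" && attempts[file] < 3)
        throw new InvalidOperationException($"Transient failure for {file}");
    downloaded.Add(file);
}

// Assumption: a simple retry loop standing in for a Polly AsyncRetryPolicy.
// Retries up to maxAttempts times with a short, growing delay between tries.
async Task DownloadFile(string file, int maxAttempts = 3)
{
    for (var attempt = 1; ; attempt++)
    {
        try
        {
            await FakeTransferAsync(file);
            return;
        }
        catch (InvalidOperationException) when (attempt < maxAttempts)
        {
            await Task.Delay(TimeSpan.FromMilliseconds(10 * attempt));
        }
    }
}

// The sequential loop: each download finishes before the next one starts.
IEnumerable<string> files = new[] { "a.dwg", "flaky.rvt", "c.pdf" };
foreach (var file in files)
{
    await DownloadFile(file);
}

Console.WriteLine($"Downloaded {downloaded.Count} files");  // prints "Downloaded 3 files"
```

Because every await completes before the loop moves on, total time is the sum of the individual download times. That is the cost the parallel version removes.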
The ParallelLoop’s ‘DownloadContentsRecursively’ call on line 44 of the benchmark uses the method below, which incorporates Parallel.ForEachAsync:
As you can see on line 14, the same DownloadFile method is being called as with the synchronous loop.
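For readers without the original listing to hand, the following is a minimal sketch of the pattern, not the post's actual method. It assumes a hypothetical DownloadFile that simulates network latency with Task.Delay, a flat list of made-up file names rather than a recursive folder walk, and an arbitrary MaxDegreeOfParallelism of 8 (the post does not say which value was used).

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Thread-safe collection, since several downloads complete concurrently.
var completed = new ConcurrentBag<string>();

// Hypothetical stand-in for the real download: waits 100 ms to mimic I/O latency.
async ValueTask DownloadFile(string file, CancellationToken ct)
{
    await Task.Delay(100, ct);
    completed.Add(file);
}

var files = Enumerable.Range(1, 16).Select(i => $"file-{i}.dat").ToList();

// Assumption: 8 concurrent downloads. When this is left unset,
// Parallel.ForEachAsync defaults to Environment.ProcessorCount.
var options = new ParallelOptions { MaxDegreeOfParallelism = 8 };

var stopwatch = Stopwatch.StartNew();
await Parallel.ForEachAsync(files, options, async (file, ct) =>
{
    // Same per-file method a sequential loop would call.
    await DownloadFile(file, ct);
});
stopwatch.Stop();

Console.WriteLine($"{completed.Count} files in {stopwatch.ElapsedMilliseconds} ms");
```

With 16 simulated files at 100 ms each, a sequential loop would need at least 1.6 seconds, while 8 at a time should finish in a fraction of that. For real downloads the best degree of parallelism depends on bandwidth and on any rate limits the API enforces, so it is worth measuring rather than guessing.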
The C# dev team have really made it incredibly simple to incorporate parallelism into our applications when the situation calls for it.
Hope you found that useful!
Stewart Celani | Sciencx (2022-06-12T17:04:36+00:00) Benchmark: How to use C# Parallel.ForEachAsync for 3x faster bulk file downloading time. Retrieved from https://www.scien.cx/2022/06/12/benchmark-how-to-use-c-parallel-foreachasync-for-3x-faster-bulk-file-downloading-time/