any modern hard drive can handle the data rates I calculated for very worst-case scenarios
You still seem to be thinking in terms of a single thread. Remember that there is only one disk controller per device. Once the controller starts servicing a request from the processor to fetch data and write it into memory, that is a complicated, lengthy process lasting milliseconds, which on a processor-relative timescale is a gargantuan amount of time; SSDs are much better, but still slow by comparison. Sure, caching will offset this a bit (especially if fragmentation is minimal), but the other threads still have to wait until the disk controller is done with its current task. And even then, your server's threads may not be the only ones wanting disk access on the machine, and you have no real control over the operating system's process scheduler.
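A minimal sketch of that serialization effect, modeling the single controller as a lock. The 5 ms service time and the thread count are made-up numbers purely for illustration, not measurements of any real device:

```python
import threading
import time

DISK_SERVICE_MS = 5                  # assumed per-request service time
controller_lock = threading.Lock()   # stands in for the single disk controller

def disk_read(request_id, results):
    # Every thread's request funnels through the one controller.
    with controller_lock:
        time.sleep(DISK_SERVICE_MS / 1000)   # the "lengthy process"
        results.append(request_id)

results = []
threads = [threading.Thread(target=disk_read, args=(i, results))
           for i in range(8)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# 8 "concurrent" requests still take roughly 8 * 5 ms of wall time,
# because the controller services one request at a time.
print(f"{elapsed * 1000:.1f} ms for {len(results)} requests")
```

The point is that issuing the requests from eight threads buys you nothing once they all queue on the same device: the wall-clock time is still about the sum of the individual service times.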
Also, since you mention chunk sizes: the amount of data you fetch from an HDD is mostly irrelevant. The first byte of a request is the most expensive one; the block of bytes that follows comes essentially for free.
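A back-of-the-envelope model of that cost structure (the seek latency and throughput figures below are assumptions for a typical consumer HDD, not measurements):

```python
SEEK_MS = 9.0      # assumed average seek + rotational latency per request
MB_PER_S = 150.0   # assumed sequential throughput

def read_cost_ms(num_requests, mb_each):
    # Each request pays the fixed positioning cost once; after that,
    # the bytes stream at sequential speed.
    return num_requests * (SEEK_MS + mb_each / MB_PER_S * 1000.0)

one_big = read_cost_ms(1, 64)      # 64 MB in a single request
many_small = read_cost_ms(64, 1)   # the same 64 MB as 64 separate requests
print(f"one request: {one_big:.0f} ms, 64 requests: {many_small:.0f} ms")
```

Under these numbers, fetching the same 64 MB as 64 scattered requests costs several times more than one large request, and almost all of the extra cost is the repeated "first byte" penalty.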
You may not consider occasional micro-stutters and slightly unimpressive loading times severe, but it is a well-known fact that current HDD and SSD technology is easily the biggest bottleneck in your computer when measuring execution speed. We have whole technologies dedicated to combating this bottleneck: caches, processor prefetching and prediction, disk-optimized C library functions...
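As an illustration of the "disk-optimized C library functions" point: stdio-style buffering exists precisely to turn many tiny application reads into a few large device requests. A sketch using Python's `io` layer, with a fake "device" that counts how many requests actually reach it:

```python
import io

class CountingRaw(io.RawIOBase):
    """Fake raw 'device' that counts how many read requests reach it."""
    def __init__(self, data):
        self.data, self.pos, self.requests = data, 0, 0
    def readable(self):
        return True
    def readinto(self, b):
        self.requests += 1
        chunk = self.data[self.pos:self.pos + len(b)]
        b[:len(chunk)] = chunk
        self.pos += len(chunk)
        return len(chunk)

data = b"x" * 100_000
raw = CountingRaw(data)
buffered = io.BufferedReader(raw, buffer_size=64 * 1024)
while buffered.read(1):   # 100,000 one-byte reads from the application
    pass
print(raw.requests)       # only a handful of requests reached the "device"
```

The application issued 100,000 one-byte reads, but the buffering layer collapsed them into a few large fills of its 64 KB buffer; a C program using `fread` gets the same effect from stdio's internal buffer.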