Summary
Even with a tuned OS and a 100 Gbps port, application architecture can limit throughput. Single-threaded tools, conservative defaults (e.g., worker counts, buffer sizes), and TLS/SSH encryption overhead often cap performance well below line rate.
Common examples & fixes
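- File transfer: SCP and SFTP move data over a single SSH connection, so one TCP flow and one CPU core bound the transfer. Use bbcp, which opens parallel streams, or run several rsync processes side by side (rsync itself uses one stream per invocation).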
- Web services / APIs: Increase worker processes/threads and buffer sizes, and enable keep-alive so connections are reused.
- Databases: Enable parallel query execution and scale out consumers.
- Virtualization: Use paravirtualized NIC drivers (virtio, vmxnet3) and confirm that offloads are enabled.
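The file-transfer fix above can be sketched as follows. This is a minimal illustration, not a definitive tool: it splits a file list into size-balanced shards so that each shard can be handed to its own rsync process. The function names (shard_by_size, rsync_commands) and the shard list-file paths are hypothetical; the only rsync options assumed are -a and --files-from.

```python
import heapq
import shlex
from typing import Dict, List, Tuple


def shard_by_size(sizes: Dict[str, int], n_shards: int) -> List[List[str]]:
    """Greedy partition: largest files first, each into the lightest shard so far."""
    if n_shards < 1:
        raise ValueError("n_shards must be >= 1")
    # Heap entries are (total_bytes, shard_index), so the lightest shard pops first.
    heap: List[Tuple[int, int]] = [(0, i) for i in range(n_shards)]
    heapq.heapify(heap)
    shards: List[List[str]] = [[] for _ in range(n_shards)]
    # Sort by descending size; break ties by name so the result is deterministic.
    for path, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        shards[idx].append(path)
        heapq.heappush(heap, (total + size, idx))
    return shards


def rsync_commands(shard_list_files: List[str], src_dir: str, dest: str) -> List[str]:
    """Build one rsync command line per shard list file.

    Each command is meant to run as a separate process, and therefore as a
    separate TCP flow. The list files (hypothetical paths) hold one relative
    path per line, as --files-from expects.
    """
    return [
        "rsync -a --files-from={} {} {}".format(
            shlex.quote(list_file), shlex.quote(src_dir), shlex.quote(dest)
        )
        for list_file in shard_list_files
    ]
```

Balancing by bytes rather than by file count keeps one shard of very large files from finishing long after the others. The right number of shards depends on CPU cores, storage, and the measured per-flow rate, so treat it as a value to test rather than a fixed setting.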
Validation workflow
- Prove host ↔ host raw throughput with iPerf3.
- Add the application and watch where performance drops.
- Profile CPU, latency, and per-flow rates; scale parallelism.
- For commercial or open-source software packages, carefully review the vendor or project documentation for performance tuning guidance.
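As a rough sketch of the first and third steps, the snippet below runs iPerf3 with JSON output, extracts per-flow receive rates, and estimates how many parallel flows would be needed to reach a target rate. The helper names are hypothetical. The JSON layout (end.streams[].receiver.bits_per_second) is assumed from iPerf3 TCP test output and should be checked against the installed version. The flow estimate is an optimistic lower bound, since it assumes flows scale linearly.

```python
import json
import math
import subprocess
from typing import Any, Dict, List


def run_iperf3(host: str, streams: int = 4, seconds: int = 10) -> Dict[str, Any]:
    """Run an iPerf3 client test against `host` and return the parsed JSON report."""
    proc = subprocess.run(
        ["iperf3", "-c", host, "-P", str(streams), "-t", str(seconds), "-J"],
        capture_output=True, text=True, check=True,
    )
    return json.loads(proc.stdout)


def per_stream_gbps(result: Dict[str, Any]) -> List[float]:
    """Receiver-side rate of each stream in Gbps (assumed TCP report layout)."""
    return [s["receiver"]["bits_per_second"] / 1e9 for s in result["end"]["streams"]]


def streams_needed(target_gbps: float, per_flow_gbps: float) -> int:
    """Minimum number of flows to reach the target, assuming linear scaling."""
    if per_flow_gbps <= 0:
        raise ValueError("per_flow_gbps must be positive")
    return math.ceil(target_gbps / per_flow_gbps)
```

For example, if the application sustains about 30 Gbps on a single flow, streams_needed(100, 30) returns 4. Comparing the per-flow rates of the raw iPerf3 run with those of the application shows whether the drop comes from the network path or from the application itself.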
References
- bbcp overview: parallel TCP streams for high-speed transfer, with SSH used to start the remote side. ESnet/US Dept. of Energy
- ESnet: Linux Host Tuning general guide under Host Tuning → Linux. fasterdata.es.net
- ESnet: 100G Benchmarking & Other Tuning pages. fasterdata.es.net