src/PerfProblemSimulator/wwwroot/documentation.html
223 additions & 0 deletions
@@ -632,6 +632,229 @@ <h3>Why This Matters</h3>
632
632
<p>During thread pool starvation, CPU and memory metrics often look normal, but users experience severe latency. The latency monitor makes this invisible problem <strong>visible</strong> — you can watch response times spike from milliseconds to seconds when triggering the sync-over-async simulation.</p>
633
633
</section>
634
634
635
+
<!-- CPU Stress Section -->
636
+
<section id="cpu" class="doc-section">
637
+
<h2>🔥 CPU Stress Simulation</h2>
638
+
<p>The CPU Stress simulation creates high CPU usage through parallel spin loops, saturating all available CPU cores for a specified duration.</p>
639
+
640
+
<h3>How It Works</h3>
641
+
<p>The simulation spawns one thread per CPU core, each running a busy-wait spin loop (incrementing a counter in a tight <code>while(true)</code> loop). Because the loops never sleep or yield, CPU usage climbs to roughly 100% across all cores.</p>
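<p>A minimal sketch of this idea, not the simulator's actual code: <code>BurnCpu</code> is a hypothetical helper, and the loop checks a cancellation token instead of running a literal <code>while(true)</code> so that it stops after the requested duration.</p>
<pre><code>using System;
using System.Threading;

// Hypothetical helper for illustration only.
static void BurnCpu(TimeSpan duration)
{
    // The token source cancels itself once the duration has elapsed.
    using var cts = new CancellationTokenSource(duration);
    var threads = new Thread[Environment.ProcessorCount];

    for (int i = 0; i &lt; threads.Length; i++)
    {
        threads[i] = new Thread(() =&gt;
        {
            long counter = 0;
            // Busy-wait: no sleep, no yield, so the core stays occupied.
            while (!cts.IsCancellationRequested)
            {
                counter++;
            }
        })
        { IsBackground = true };
        threads[i].Start();
    }

    // Wait for every spinner to observe the cancellation.
    foreach (var thread in threads)
    {
        thread.Join();
    }
}

BurnCpu(TimeSpan.FromSeconds(10));</code></pre>
<p>Dedicated threads are used here rather than thread pool work items, so the spin loops do not also consume pool threads that request processing depends on.</p>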
642
+
643
+
<h3>Dashboard Controls</h3>
644
+
<ul>
645
+
<li><strong>Duration:</strong> How long to run the stress test (10-300 seconds, default 30)</li>
646
+
<li><strong>Start/Stop:</strong> Begin or cancel the simulation</li>
647
+
</ul>
648
+
649
+
<h3>What You'll Observe</h3>
650
+
<table class="api-table">
651
+
<thead>
652
+
<tr>
653
+
<th>Metric</th>
654
+
<th>During Simulation</th>
655
+
<th>Explanation</th>
656
+
</tr>
657
+
</thead>
658
+
<tbody>
659
+
<tr>
660
+
<td>CPU %</td>
661
+
<td>~100%</td>
662
+
<td>All cores saturated by spin loops</td>
663
+
</tr>
664
+
<tr>
665
+
<td>Memory</td>
666
+
<td>Unchanged</td>
667
+
<td>Spin loops don't allocate memory</td>
668
+
</tr>
669
+
<tr>
670
+
<td>Latency</td>
671
+
<td>May increase</td>
672
+
<td>Less CPU available for request processing</td>
673
+
</tr>
674
+
</tbody>
675
+
</table>
676
+
677
+
<h3>Real-World CPU Issues</h3>
678
+
<p>High CPU in production is typically caused by:</p>
679
+
<ul>
680
+
<li>Inefficient algorithms (O(n²) when O(n) is possible)</li>
<li><code>dotnet-trace collect -p {PID}</code> for CPU profiling</li>
691
+
<li>Azure App Service → Diagnose and Solve Problems → CPU usage</li>
692
+
<li>Application Insights Performance blade</li>
693
+
</ul>
694
+
</section>
695
+
696
+
<!-- Memory Pressure Section -->
697
+
<section id="memory" class="doc-section">
698
+
<h2>💾 Memory Pressure Simulation</h2>
699
+
<p>The Memory Pressure simulation allocates and holds large byte arrays to demonstrate memory consumption patterns and garbage collection behavior.</p>
700
+
701
+
<h3>How It Works</h3>
702
+
<p>Large byte arrays (minimum 10MB, default 100MB) are allocated using <code>GC.AllocateArray&lt;byte&gt;(length, pinned: true)</code>. The pinned allocation prevents the garbage collector from moving the memory, simulating problematic allocation patterns.</p>
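<p>The allocate-and-hold pattern can be sketched roughly as follows. The names <code>heldBlocks</code>, <code>Allocate</code>, and <code>ReleaseAll</code> are hypothetical, and touching one byte per 4 KB page is an assumption about how to make the memory show up in the working set, not something the simulator is known to do.</p>
<pre><code>using System;
using System.Collections.Generic;

// Hypothetical holder: keeping references here is what prevents collection.
var heldBlocks = new List&lt;byte[]&gt;();

void Allocate(int sizeMb)
{
    // pinned: true asks the GC never to relocate this array.
    byte[] block = GC.AllocateArray&lt;byte&gt;(sizeMb * 1024 * 1024, pinned: true);

    // Assumption: write to each 4 KB page so the OS backs it with physical memory.
    for (int i = 0; i &lt; block.Length; i += 4096)
    {
        block[i] = 1;
    }

    heldBlocks.Add(block);
}

void ReleaseAll()
{
    // Drop the references, then ask the GC to reclaim the memory right away.
    heldBlocks.Clear();
    GC.Collect();
}

Allocate(100);
Console.WriteLine($"Holding {heldBlocks.Count} block(s), managed heap is about {GC.GetTotalMemory(false) / (1024 * 1024)} MB");
ReleaseAll();</code></pre>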
703
+
704
+
<h3>Dashboard Controls</h3>
705
+
<ul>
706
+
<li><strong>Size (MB):</strong> Amount of memory to allocate (10-500MB)</li>
707
+
<li><strong>Allocate:</strong> Adds a new memory block</li>
708
+
<li><strong>Release All:</strong> Frees all allocated blocks and triggers GC</li>
709
+
</ul>
710
+
711
+
<h3>What You'll Observe</h3>
712
+
<table class="api-table">
713
+
<thead>
714
+
<tr>
715
+
<th>Metric</th>
716
+
<th>After Allocation</th>
717
+
<th>Explanation</th>
718
+
</tr>
719
+
</thead>
720
+
<tbody>
721
+
<tr>
722
+
<td>Memory (MB)</td>
723
+
<td>Increases by allocation size</td>
724
+
<td>Charts reflect working set increase</td>
725
+
</tr>
726
+
<tr>
727
+
<td>CPU</td>
728
+
<td>Brief spike</td>
729
+
<td>Memory allocation and zeroing</td>
730
+
</tr>
731
+
<tr>
732
+
<td>Memory (after Release All)</td>
733
+
<td>Memory drops</td>
734
+
<td>GC.Collect() reclaims memory</td>
735
+
</tr>
736
+
</tbody>
737
+
</table>
738
+
739
+
<h3>Large Object Heap (LOH)</h3>
740
+
<p>Objects of 85,000 bytes or more (roughly 85KB) are allocated on the Large Object Heap:</p>
741
+
<ul>
742
+
<li>LOH is collected less frequently (Gen 2 collections only)</li>
<li>Can lead to memory fragmentation over time</li>
745
+
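<li>Our allocations (10MB+) are far above this threshold, although on .NET 5 and later arrays created with <code>pinned: true</code> may be placed on the Pinned Object Heap rather than the LOH</li>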
746
+
</ul>
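<p>One way to see the threshold in action is to ask the runtime which generation an object belongs to. This is an illustrative sketch using ordinary (unpinned) arrays; large objects are expected to be reported as Gen 2, while a freshly allocated small array normally starts in Gen 0.</p>
<pre><code>using System;

var small = new byte[1_000];     // well below the threshold: small object heap
var large = new byte[100_000];   // above 85,000 bytes: Large Object Heap

// GC.GetGeneration reports the generation the GC currently assigns to an object.
Console.WriteLine($"small array: Gen {GC.GetGeneration(small)}");
Console.WriteLine($"large array: Gen {GC.GetGeneration(large)}");</code></pre>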
747
+
748
+
<h3>Real-World Memory Issues</h3>
749
+
<ul>
750
+
<li>Static collections that accumulate data without limits</li>
751
+
<li>Event handlers not being unsubscribed</li>
752
+
<li>Improper IDisposable implementation</li>
753
+
<li>Caching without size limits or expiration</li>
754
+
<li>Keeping references to large objects longer than needed</li>
755
+
</ul>
756
+
757
+
<h3>Diagnosis Tools</h3>
758
+
<ul>
759
+
<li><code>dotnet-dump collect -p {PID}</code> for memory dumps</li>
760
+
<li><code>dotnet-gcdump collect -p {PID}</code> for GC heap analysis</li>
761
+
<li>Visual Studio Memory Profiler</li>
762
+
<li>Azure App Service → Memory Working Set blade</li>
763
+
</ul>
764
+
</section>
765
+
766
+
<!-- Thread Pool Starvation Section -->
767
+
<section id="threadpool" class="doc-section">
768
+
<h2>🧵 Thread Pool Starvation</h2>
769
+
<p>The Thread Pool Starvation simulation demonstrates the dangerous "sync-over-async" anti-pattern, where synchronous code blocks waiting for asynchronous operations to complete.</p>
770
+
771
+
<h3>How It Works</h3>
772
+
<p>The simulation spawns multiple parallel tasks (default: 2× CPU cores) that each call <code>Task.Delay().Wait()</code>. This blocks thread pool threads, causing:</p>
773
+
<ol>
774
+
<li>Thread pool threads become blocked waiting</li>
775
+
<li>.NET slowly adds new threads (~1 per second)</li>
776
+
<li>Incoming requests queue up waiting for threads</li>
777
+
<li>Request latency spikes dramatically</li>
778
+
</ol>
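<p>The steps above can be reproduced with a few lines of C#. This is a sketch under stated assumptions, not the simulator's implementation: the task count and the 15-second delay simply mirror the defaults mentioned in this section.</p>
<pre><code>using System;
using System.Threading;
using System.Threading.Tasks;

int blockers = Environment.ProcessorCount * 2;   // 2x CPU cores, as described above
var delay = TimeSpan.FromSeconds(15);

var tasks = new Task[blockers];
for (int i = 0; i &lt; blockers; i++)
{
    // Task.Run schedules onto the thread pool; .Wait() then parks that pool thread
    // for the full delay instead of returning it to the pool.
    tasks[i] = Task.Run(() =&gt; Task.Delay(delay).Wait());
}

// Snapshot of the pool shortly after the blockers were queued.
Console.WriteLine($"Pool threads: {ThreadPool.ThreadCount}, queued work items: {ThreadPool.PendingWorkItemCount}");

Task.WaitAll(tasks);</code></pre>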
779
+
780
+
<h3>Dashboard Controls</h3>
781
+
<ul>
782
+
<li><strong>Duration:</strong> How long each blocking operation lasts (default 15s)</li>
783
+
<li><strong>Threads:</strong> Number of blocking threads to spawn</li>
784
+
<li><strong>Start/Stop:</strong> Begin or cancel the simulation</li>
785
+
</ul>
786
+
787
+
<h3>The Sync-Over-Async Anti-Pattern</h3>
788
+
<p>These patterns block threads and should be avoided:</p>
789
+
<pre><code>// ❌ BAD - each of these blocks the calling thread until the Task completes
var result1 = SomeAsyncMethod().Result;
var result2 = SomeAsyncMethod().GetAwaiter().GetResult();
SomeAsyncMethod().Wait();

// ✅ GOOD - awaiting releases the thread while the Task is in flight
var result = await SomeAsyncMethod();</code></pre>
796
+
797
+
<h3>What You'll Observe</h3>
798
+
<table class="api-table">
799
+
<thead>
800
+
<tr>
801
+
<th>Metric</th>
802
+
<th>During Starvation</th>
803
+
<th>Explanation</th>
804
+
</tr>
805
+
</thead>
806
+
<tbody>
807
+
<tr>
808
+
<td>CPU</td>
809
+
<td>LOW</td>
810
+
<td>Threads are waiting, not working</td>
811
+
</tr>
812
+
<tr>
813
+
<td>Memory</td>
814
+
<td>Normal</td>
815
+
<td>No memory leak occurring</td>
816
+
</tr>
817
+
<tr>
818
+
<td>Thread Count</td>
819
+
<td>Slowly climbing</td>
820
+
<td>.NET adds ~1 thread/second</td>
821
+
</tr>
822
+
<tr>
823
+
<td>Latency</td>
824
+
<td>🔴 Spikes to seconds</td>
825
+
<td>Requests queued waiting for threads</td>
826
+
</tr>
827
+
</tbody>
828
+
</table>
829
+
830
+
<h3>Why CPU is Low but Latency is High</h3>
831
+
<p>This is the signature of thread pool starvation. The CPU isn't working because all threads are blocked waiting. New requests can't start because there are no available threads. It looks like the app is idle, but users experience timeouts.</p>
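<p>This signature can be measured directly by timing how long a trivial work item waits before a pool thread runs it. The probe below is a hypothetical sketch (<code>MeasureQueueDelayAsync</code> is not part of the simulator); on a healthy pool the delay should be very small, and during starvation it grows while CPU stays low.</p>
<pre><code>using System;
using System.Diagnostics;
using System.Threading.Tasks;

// Hypothetical probe: measures the wait before an empty work item gets a pool thread.
static async Task&lt;TimeSpan&gt; MeasureQueueDelayAsync()
{
    var stopwatch = Stopwatch.StartNew();
    await Task.Run(() =&gt; { });   // completes only once a pool thread picks it up
    return stopwatch.Elapsed;
}

for (int i = 0; i &lt; 5; i++)
{
    TimeSpan queueDelay = await MeasureQueueDelayAsync();
    Console.WriteLine($"Thread pool queue delay: {queueDelay.TotalMilliseconds:F1} ms");
    await Task.Delay(1000);
}</code></pre>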
832
+
833
+
<h3>Real-World Causes</h3>
834
+
<ul>
835
+
<li>Calling async libraries from sync code: <code>httpClient.GetAsync(url).Result</code></li>
836
+
<li>Mixing sync and async in constructors</li>
837
+
<li>Using <code>.Result</code> in properties</li>
838
+
<li>Calling async methods from Dispose()</li>
839
+
<li>Third-party libraries that block internally</li>
840
+
</ul>
841
+
842
+
<h3>How to Fix</h3>
843
+
<ul>
844
+
<li>Use <code>await</code> all the way up the call stack ("async all the way down")</li>
845
+
<li>Avoid <code>.Result</code>, <code>.Wait()</code>, and <code>.GetAwaiter().GetResult()</code></li>
846
+
<li>Use <code>ConfigureAwait(false)</code> in library code</li>
847
+
<li>Consider <code>IAsyncEnumerable</code> for streaming data</li>
848
+
</ul>
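<p>As an illustration of the first three points, here is a before-and-after sketch. <code>PriceClient</code> and its method names are hypothetical; only <code>HttpClient.GetStringAsync</code> and <code>ConfigureAwait</code> are real APIs.</p>
<pre><code>using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical class for illustration only.
public class PriceClient
{
    private readonly HttpClient _http;

    public PriceClient(HttpClient http) =&gt; _http = http;

    // Before: the calling thread is blocked until the HTTP call finishes.
    public string GetPriceBlocking(string url) =&gt;
        _http.GetStringAsync(url).Result;

    // After: the thread is released while the request is in flight.
    // ConfigureAwait(false) is used because this is library-style code.
    public async Task&lt;string&gt; GetPriceAsync(string url) =&gt;
        await _http.GetStringAsync(url).ConfigureAwait(false);
}</code></pre>
<p>Callers of <code>GetPriceAsync</code> need to be async as well, which is why the fix usually has to propagate all the way up the call stack.</p>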
849
+
850
+
<h3>Diagnosis Tools</h3>
851
+
<ul>
852
+
<li><code>dotnet-counters</code>: Watch "ThreadPool Thread Count" and "ThreadPool Queue Length"</li>
853
+
<li>Response times spike for ALL endpoints simultaneously</li>
854
+
<li>Azure App Service → Diagnose and Solve Problems → Thread count</li>