<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Millibottlenecks | Xuhang Gu</title>
    <link>https://xgu5.github.io/tag/millibottlenecks.html</link>
      <atom:link href="https://xgu5.github.io/tag/millibottlenecks/index.xml" rel="self" type="application/rss+xml" />
    <description>Millibottlenecks</description>
    <generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Thu, 27 Feb 2025 00:00:00 +0000</lastBuildDate>
    <image>
      <url>https://xgu5.github.io/media/icon_hu_a4f3f82829f42164.png</url>
      <title>Millibottlenecks</title>
      <link>https://xgu5.github.io/tag/millibottlenecks.html</link>
    </image>
    
    <item>
      <title>Transient Bottlenecks in Distributed Systems</title>
      <link>https://xgu5.github.io/project/transition_bottlenecks.html</link>
      <pubDate>Thu, 27 Feb 2025 00:00:00 +0000</pubDate>
      <guid>https://xgu5.github.io/project/transition_bottlenecks.html</guid>
      <description>&lt;div class=&#34;container&#34;&gt;
 &lt;div class=&#34;row justify-content-center align-items-center&#34;&gt;
 &lt;div class=&#34;col-md-10 text-center&#34;&gt;
&lt;img src=&#34;../../img/Burst/long_chain.png&#34; alt=&#34;A long call chain&#34; class=&#34;m-0 w-100 img-fluid rounded&#34;/&gt;
&lt;h4 class=&#34;mt-1 mb-3&#34;&gt;Figure 1. A representative long call chain in the SocialNetwork microservices application. Sudden workload increases and resource contention are common factors that cause transient bottlenecks.&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p class=&#34;text-start&#34;&gt;
Maintaining consistently low response times is crucial for mission-critical, web-facing applications (e.g., e-commerce), which are typically implemented using distributed systems such as microservices architectures. Through extensive benchmarking of a microservices application in a cloud environment, we find that response time stability is fragile, exhibiting significant variations ranging from milliseconds to seconds.
&lt;p&gt;Our detailed timeline analysis shows that even a millibottleneck (a bottleneck lasting less than a second) can trigger a queuing effect that starts in a downstream service and propagates to upstream services, resulting in dropped requests and TCP retransmissions lasting several seconds at the weakest link in the chain.&lt;/p&gt;
&lt;/p&gt;
 &lt;div class=&#34;row justify-content-center align-items-center&#34;&gt;
  &lt;div class=&#34;col-md-10 text-center&#34;&gt;
    &lt;img src=&#34;../../img/Burst/Burst_ReqRate.png&#34; alt=&#34;A single burst&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;img src=&#34;../../img/Burst/Burst_Latency.png&#34; alt=&#34;Large response time fluctuations&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;img src=&#34;../../img/Burst/Burst_DroppedReqs.png&#34; alt=&#34;Multiple waves of dropped requests&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;img src=&#34;../../img/Burst/Burst_Queues.png&#34; alt=&#34;Queue propagation&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;img src=&#34;../../img/Burst/Burst_CPU.png&#34; alt=&#34;CPU utilizations&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;h4 class=&#34;mt-1 mb-3&#34;&gt;Figure 2. An illustration showing how a single traffic burst triggers multiple waves of dropped requests and TCP retransmissions over a ten-second span, leading to significant response time fluctuations.&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p class=&#34;text-start&#34;&gt;
External bursty workloads occur when a microservice receives a sudden increase in requests, causing the system to become temporarily overloaded and response times to spike.
&lt;p&gt;Figure 2 shows a representative 10-second snapshot, with each metric captured by fine-grained monitors using a 50 ms time window. It demonstrates how a bursty workload induces substantial response time fluctuations. Notably, &lt;b&gt;even very short resource saturation in a deep downstream microservice can significantly degrade performance&lt;/b&gt;, highlighting the critical impact of transient bottlenecks on system stability.&lt;/p&gt;&lt;/p&gt;
&lt;/div&gt;</description>
    </item>
    
    <item>
      <title>Attacking Microservices by Exploiting Execution Dependencies</title>
      <link>https://xgu5.github.io/project/attacks.html</link>
      <pubDate>Sun, 25 Feb 2024 00:00:00 +0000</pubDate>
      <guid>https://xgu5.github.io/project/attacks.html</guid>
      <description>&lt;div class=&#34;container&#34;&gt;
 &lt;div class=&#34;row justify-content-center align-items-center&#34;&gt;
  &lt;div class=&#34;col-md-11 text-center&#34;&gt;
    &lt;img src=&#34;../../img/Attacks/profile_bursts.png&#34; alt=&#34;Profiling bursts&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;h4 class=&#34;mt-1 mb-3&#34;&gt;Figure 1. Testing the performance interference between a pair of different requests to profile their execution dependencies.&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p class=&#34;text-start&#34;&gt;
Building on our understanding of &lt;a href=&#34;../../project/execution_dependencies.html&#34;&gt;execution dependencies&lt;/a&gt; in microservices, we propose a black-box approach that leverages legitimate HTTP requests to precisely profile the internal pairwise dependencies across all supported execution paths in the target microservices. This profiling shows that overloading just a few microservices can significantly degrade the performance of the entire system, revealing potential performance vulnerabilities within the microservices.
&lt;/p&gt;
 &lt;div class=&#34;row justify-content-center align-items-center&#34;&gt;
 &lt;div class=&#34;col-md-12 text-center p-0&#34;&gt;
&lt;img src=&#34;../../img/Attacks/Grunt.png&#34; alt=&#34;Grunt attack scenario&#34; class=&#34;m-0 w-100 img-fluid rounded&#34;/&gt;
&lt;h4 class=&#34;mt-1 mb-3&#34;&gt;Figure 2. By exploiting execution dependencies in microservices, the Grunt attack triggers millibottlenecks alternately across different paths, causing persistent blocking effects that result in long response times system-wide.&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p class=&#34;text-start&#34;&gt;
To better understand performance vulnerabilities in microservices, we present the Grunt attack, a novel low-volume DDoS attack that exploits execution dependencies in microservice applications. By systematically grouping and characterizing execution paths based on their pairwise dependencies, &lt;b&gt;the Grunt attack can target only a few well-selected execution paths to launch a low-volume DDoS attack that causes substantial, widespread performance degradation across the system&lt;/b&gt;. To enhance stealth, the attacker avoids creating a persistent bottleneck by dynamically alternating target execution paths within their dependency group.
&lt;/p&gt;
 &lt;div class=&#34;row justify-content-center align-items-center&#34;&gt;
  &lt;div class=&#34;col-md-9 text-center&#34;&gt;
    &lt;img src=&#34;../../img/Attacks/Attack_results.png&#34; alt=&#34;Attacking results&#34; class=&#34;m-2 img-fluid rounded&#34;/&gt;
    &lt;h4 class=&#34;mt-1 mb-3&#34;&gt;Figure 3. Conceptual illustration of cross-service queue blocking.&lt;/h4&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;p class=&#34;text-start&#34;&gt;
As a result, the Grunt attack consumes less than 20% additional CPU resources on the target system while increasing its average response time by more than 10x.
&lt;/p&gt;
&lt;/div&gt;</description>
    </item>
    
  </channel>
</rss>
