Among industries, e-commerce sees a wide range of bot attacks. It ranks fifth in the intensity of bad bot traffic and first in the volume of sophisticated bot traffic. Companies like Amazon face a persistent challenge: 70% of their total traffic is generated by bots, with significant economic consequences. Different types of bots cause different problems for e-commerce, such as manipulating product rankings and increasing data access latency. Although e-commerce businesses continuously adopt modern technologies to enhance user experience, malicious bots pose a significant threat to their revenue and overall success.
We review the characteristics of bot traffic on real-world e-commerce websites and analyze how these characteristics create vulnerabilities when in-network caching is used in the backend. We outline how the presence of bot traffic can lead to more cache misses for legitimate users. As depicted, regular users typically request popular items, so these items are stored in the in-network cache. Malicious bots follow a different pattern: they actively request less popular items in an attempt to inject counterfeit popular items into the cache. This injection evicts the genuinely popular items, leading to a notable rise in data access latency for normal clients.
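The cache-pollution effect described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the setup used in this work: the LRU eviction policy, the catalogue and cache sizes, the 90/10 request split for legitimate users, and the names `LRUCache` and `legit_hit_rate` are all hypothetical choices made for illustration.

```python
import random
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache that tracks keys only (values do not matter here)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def access(self, key):
        """Return True on a hit; on a miss, insert the key and evict the LRU entry."""
        if key in self.items:
            self.items.move_to_end(key)  # mark as most recently used
            return True
        self.items[key] = None
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used
        return False


def legit_hit_rate(bot_fraction, n_requests=20000, n_items=1000,
                   n_popular=50, capacity=100, seed=0):
    """Cache hit rate seen by legitimate users for a given share of bot requests."""
    rng = random.Random(seed)
    cache = LRUCache(capacity)
    hits = total = 0
    for _ in range(n_requests):
        if rng.random() < bot_fraction:
            # Bot request: always an unpopular item, which pollutes the cache.
            cache.access(rng.randrange(n_popular, n_items))
            continue
        # Legitimate request (assumed mix): 90% popular, 10% unpopular items.
        if rng.random() < 0.9:
            key = rng.randrange(n_popular)
        else:
            key = rng.randrange(n_popular, n_items)
        total += 1
        hits += cache.access(key)
    return hits / total


if __name__ == "__main__":
    for share in (0.0, 0.25, 0.5):
        print(f"bot share {share:.2f}: legit hit rate {legit_hit_rate(share):.3f}")
```

In this toy model, the hit rate observed by legitimate users falls as the bot share rises, because unpopular keys requested by bots push popular keys out of the cache before they are requested again.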
We propose In-network Caching Shelter (INCS), a mitigation solution for attacks on in-network caching. To safeguard the caching system, an in-network machine learning framework (i.e., Planter) is used to classify incoming requests, detecting and acting on bot-generated queries. INCS runs on a programmable network target, such as a Data Processing Unit (DPU) or SmartNIC, using Behavioral Model Version 2 (BMV2) or Open vSwitch (OvS). We classify the bots in our dataset into three categories: (1) Moderate, (2) Intense, and (3) Aggressive. Moderate bots request unpopular items 60% of the time and popular items 40% of the time. Intense bots request unpopular items 75% of the time and popular items 25% of the time. Aggressive bots request unpopular items 90% of the time and popular items 10% of the time. We deployed INCS on the ARM cores of a BlueField-2 DPU using several trained machine learning models. The more aggressively a bot behaves, the easier it is to detect. For instance, Random Forest (RF) achieves a bot detection accuracy of 80.42% for Moderate bot activity, 87.36% for Intense bot activity, and 94.72% for Aggressive bot activity.
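The trend that more aggressive bots are easier to detect can be illustrated with a toy experiment. The sketch below takes the three request mixes from the text, but everything else is an assumption: the human baseline (`HUMAN_UNPOPULAR_PROB`), the session length, and the single-threshold rule, which is a deliberately simple stand-in for the trained models and is not the Planter pipeline or the Random Forest classifier. Its numbers are not expected to match the reported accuracies.

```python
import random

# Probability that one request targets an unpopular item (mixes from the text).
BOT_PROFILES = {"moderate": 0.60, "intense": 0.75, "aggressive": 0.90}
# Assumed, not from the source: legitimate users rarely request unpopular items.
HUMAN_UNPOPULAR_PROB = 0.20


def unpopular_fraction(p_unpopular, session_len, rng):
    """Simulate one session and return its share of unpopular-item requests."""
    unpopular = sum(rng.random() < p_unpopular for _ in range(session_len))
    return unpopular / session_len


def detection_accuracy(profile, threshold=0.5, session_len=25,
                       n_sessions=2000, seed=0):
    """Accuracy of a one-feature threshold rule on a balanced human/bot mix."""
    rng = random.Random(seed)
    correct = 0
    for _ in range(n_sessions):
        human = unpopular_fraction(HUMAN_UNPOPULAR_PROB, session_len, rng)
        bot = unpopular_fraction(BOT_PROFILES[profile], session_len, rng)
        correct += human < threshold  # human session correctly passed
        correct += bot >= threshold   # bot session correctly flagged
    return correct / (2 * n_sessions)


if __name__ == "__main__":
    for name in BOT_PROFILES:
        print(f"{name}: toy detection accuracy {detection_accuracy(name):.3f}")
```

The single feature used here, the share of unpopular requests per session, moves further from the assumed human baseline as the bot profile becomes more aggressive, which is why even this crude rule separates Aggressive bots more reliably than Moderate ones.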