ProactivePIM: Accelerating Weight-Sharing Embedding Layer With PIM for Scalable Recommendation System
TL;DR
ProactivePIM uses processing-in-memory (PIM) to accelerate the memory-intensive weight-sharing embedding layer in recommendation systems, addressing the bottleneck caused by sparse, irregular embedding accesses.
Youngsuk Kim; Junghwan Lim; Hyuk-Jae Lee; Chae Eun Rhee
https://doi.org/10.1109/ACCESS.2025.3648766
Volume 14
Although deep learning-based personalized recommendation systems provide high-quality recommendations, they strain data center resources. The main bottleneck is the embedding layer, which is highly memory-intensive because of its sparse, irregular access patterns to embeddings. Recent near-memory processing (NMP) and processing-in-memory (PIM) architectures have addressed these issues by exploiting paral...