Item Anomaly Detection Based on Dynamic Partition for Time Series in Recommender Systems
August
Min Gao 0 1 2
Renli Tian 0 1 2
Junhao Wen 0 1 2
Qingyu Xiong 0 1 2
Bin Ling 0 1 2
Linda Yang 0 1 2
0 1 Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education , Chongqing, 400044, China , 2 School of Software Engineering, Chongqing University , Chongqing, 400044, China , 3 School of Engineering, University of Portsmouth , Portsmouth, PO1 3AH , United Kingdom
1 Funding: This work was supported by National Natural Science Foundation of China (71102065
2 Editor: Sergio Gómez, Universitat Rovira i Virgili , SPAIN
In recent years, recommender systems have become an effective way to cope with information overload. However, recommendation technology still suffers from many problems. One of them is the shilling attack, in which attackers inject spam user profiles to distort the list of recommended items. All types of shilling attacks share two characteristics: 1) Item abnormality: the rating of the target item is always the maximum or the minimum; and 2) Attack promptness: the attack profiles are injected within a very short period of time. Some papers have proposed item anomaly detection methods based on these two characteristics, but their detection rate, false alarm rate, and generality need further improvement. To address these problems, this paper proposes an item anomaly detection method based on dynamic partitioning of time series. The method first dynamically partitions each item's rating time series at important points. It then uses the chi-square (χ²) distribution to detect abnormal intervals. Experimental results on MovieLens 100K and 1M indicate that the approach achieves a high detection rate and a low false alarm rate and is stable across different attack models and filler sizes.
-
Recommendation systems are an effective and widely used remedy for information overload [1].
Although personalized recommendation technology has made great progress on the cold
start problem, forecasting precision, the diversity-accuracy dilemma, user experience, and
context-based recommendation [2–6], it still suffers from many problems. The shilling attack, in
which attackers inject spam user profiles (a user profile is a user's set of ratings over all items)
to change the recommendation results, is one of the most serious [7–8]. In a
collaborative filtering-based recommendation system without defenses, the target item can reach the top of
the recommendation list with spam users' efforts representing only one percent of the list [9].
The injection of spam users' ratings into e-commerce systems seriously distorts the system's
recommendation ranking and thereby keeps users from finding what they really want.
www.edu.cn/zheng_ce_fa_gui_1115/20090820/t20090820_400962.shtml, MG). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing Interests: The authors have declared that no competing interests exist.
Consequently, such injections reduce user satisfaction. Shilling attacks are
divided into two categories: push attacks and nuke attacks. Push attacks make the target items
more likely to be recommended; nuke attacks make them less likely to be recommended.
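The effect of the two attack categories on a single target item can be illustrated with a minimal sketch. It assumes a 1-5 star rating scale (as in MovieLens) and looks only at the ratings the target item receives; the genuine ratings, the function name `inject`, and the burst size are hypothetical and chosen purely for illustration. Real attack profiles also rate filler items, which this sketch omits.

```python
from statistics import mean

R_MIN, R_MAX = 1, 5  # assumed 1-5 star rating scale


def inject(genuine, n_spam, kind):
    """Return the item's ratings after n_spam attack ratings are appended.

    kind="push" appends the maximum rating, kind="nuke" the minimum,
    mirroring the item-abnormality characteristic of shilling attacks.
    """
    if kind not in ("push", "nuke"):
        raise ValueError("kind must be 'push' or 'nuke'")
    spam_rating = R_MAX if kind == "push" else R_MIN
    return list(genuine) + [spam_rating] * n_spam


# Hypothetical genuine ratings for one target item (illustrative data only).
genuine = [3, 4, 2, 3, 3, 4, 2, 3, 4, 3]

pushed = inject(genuine, 5, "push")  # burst of maximum ratings
nuked = inject(genuine, 5, "nuke")   # burst of minimum ratings

print(f"genuine mean: {mean(genuine):.2f}")
print(f"after push:   {mean(pushed):.2f}")
print(f"after nuke:   {mean(nuked):.2f}")
```

Even a small burst of extreme ratings shifts the item's mean rating in the attacker's chosen direction, which is why item-level detection looks for short intervals in which the rating distribution changes abruptly.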
Traditional methods for detecting shilling attacks are based mainly on features of user
profiles, such as the RDMA and Degsim features [10]. From the machine learning perspective,
there are supervised and unsupervised detection algorithms [9–16]. These methods focus
primarily on detecting spam users; they perform well on some specific attack models,
but they do not generalize well. Zhang et al. [17] and Gao et al. [18] observed that the ultimate
goal of a shilling attack is to bring about a change in the target items. Therefore, they proposed detection
methods for abnormal items, aiming to address shilling attacks from the item's
perspective. Focusing on attack promptness, Zhang et al. [17] proposed an item anomaly
detection approach based on the sample average and sample entropy of a time series. Gao et al. [18]
divided all items into different types according to the features of the items' lifecycles and rating
counts. They then divided the timeline into intervals with a fixed window and used χ² to detect
abnormal intervals. Because the intervals are produced by a fixed time window,
the choice of window size directly influences the effectiveness of the
detection. Additionally, an item's own characteristics vary over time. As the time window
becomes larger, the rating distributions of adjacent windows become closer, which makes
detection more difficult. Conversely, as the time window becomes smaller, the differences
between adjacent windows' rating distributions become greater, which raises the false alarm
rate.
smaller. The detection rate and false alarm rate of this method needs to be further i (...truncated)