Spammer Detection Relying on Reviewer Behavior Features Under Uncertainty

4.1. Introduction

Today, the Internet gives people worldwide the opportunity to express and share their opinions and attitudes regarding products or services. These opinions, called online reviews, have become one of the most important sources of information thanks to their availability and visibility. They are increasingly used by both consumers and organizations. Positive reviews usually attract new customers and bring financial gain, whereas negative ones damage the e-reputation of businesses and lead to losses. Reviewing has changed the face of marketing in this new era. Because of this impact, companies invest money to overrate their products and to gain insight into readers’ preferences. To do this, they rely on spammers to post deceptive reviews: positive ones to attract new customers and negative ones to damage competitors’ e-reputation. These fraudulent activities are extremely harmful to both companies and readers. Hence, detecting and analyzing opinion spam is pivotal for protecting e-commerce and for ensuring trustworthiness and fair competition between products and services.

Researchers have therefore given considerable attention to this challenging problem. Several studies (Deng and Chen 2014; Ong et al. 2014; Rayana and Akoglu 2015; Heydari et al. 2016; Fontanarava et al. 2017) have been devoted to developing methods capable of spotting fake reviews and stopping these misleading actions. These approaches can be classified into three global categories: spam review detection based on the review content and linguistic features, group spammer detection based on relational indicators, and spammer detection.

Since spammers are mainly responsible for the appearance of deceptive reviews, spotting them is one of the most essential tasks in this field. Several approaches have addressed this problem (Heydari et al. 2016) and achieved significant results. Spammer detection techniques can be divided into two global categories: graph-based methods and behavioral indicator-based methods.

One of the first studies relying on a graph representation to detect fake reviews was proposed by Wang et al. (2011). This method attempts to spot fake reviewers and reviews of online stores. It is based on a graph model composed of three types of nodes: reviewers, reviews and stores. The spamming clues are captured through the interconnections and relationships between nodes. The detection of these clues is based on the trustworthiness of reviewers, the honesty of reviews and the reliability of stores. Thanks to these three measures, the method generates a ranked list of spam reviews and reviewers. It was tested on a real dataset extracted from resellerratings.com and labeled by human expert judges. However, the accuracy of this method is limited to 49%. A similar study, also based on the review graph model, was proposed by Fayazbakhsh and Sinha (2012). This method generates a suspicion score for each node in the review graph and updates these scores based on the graph connectivity using an iterative algorithm. It was evaluated on a dataset labeled through human judgment. A third graph-related approach was introduced by Akoglu et al. (2013) as an unsupervised framework. This method relies on a bipartite network composed of reviewers and products, in which a review can be positive or negative according to its rating. The method assumes that spammers usually write positive reviews for bad products and negative ones for good-quality products. The authors use an iterative propagation algorithm that exploits the correlations between nodes: it assigns a score to each vertex and updates it using loopy belief propagation (LBP). The method outputs a list of scores used to rank reviewers and products and to group them into clusters. The results were compared with those of two iterative classifiers, against which the method performed well.

Behavioral indicators were introduced by Lim et al. (2010) to detect spammers. This method measures spamming behaviors and assigns a score used to rank reviewers according to the ratings they give. It is essentially based on the assumption that fake reviewers target specific products and that their review ratings deviate from the average rating associated with these products. The authors report that this method achieved significant results. Another method, proposed in Savage et al. (2015), is also based on the rating behavior of each reviewer. It focuses on the gap between the majority rating and each reviewer’s rating, and uses binomial regression to identify spammers. One of the most widely cited studies was conducted by Fei et al. (2013) and is essentially based on various behavioral patterns of spammers. Since spammers and genuine reviewers display distinct behaviors, the proposed method models each reviewer’s spamicity by observing their actions. It was formulated as an unsupervised clustering problem in a Bayesian framework. The proposed technique was tested on data from Amazon and proved effective. Moreover, Fei et al. (2013) proposed a method to detect burst patterns in the reviews given to specific products or services. This approach generates five new spammer behavioral indicators to enhance review spammer detection. The authors used Markov random fields to model the reviewers in a burst and a hidden node to model the reviewer spamicity. They then rely on the loopy belief propagation framework to spot spammers. This method achieves a precision of 83.7% thanks to the spammers’ behavioral indicators. Since then, behavioral indicators have become an important basis for the spammer detection task and are used in several recent studies (Liu et al. 2017).

Nevertheless, we believe that the information on reviewers, or their history, can be imprecise or uncertain. In addition, apparently deceptive behavior might be due to coincidence, which makes the spammer detection problem full of uncertainty. Ignoring such uncertainty may deeply affect the quality of the detection. To address these concerns, we propose a novel method that classifies reviewers into spammers and genuine ones using the K-nearest neighbors algorithm within the belief function theory, in order to deal with the uncertainty carried by the spammer behavioral indicators, which are used as features. The belief function theory is known as one of the richest theories for dealing with all levels of imperfection, from total ignorance to full certainty. In addition, it allows us to manage different pieces of evidence, not only to combine them but also to make decisions while facing imprecision and imperfection. This theory has proved robust in this field through our previous methods, which achieved significant results (Ben Khalifa et al. 2018a, 2018b, 2019a, 2019b, 2019c, 2020). Furthermore, our use of the evidential K-NN (Denoeux 1995) is motivated by its robustness in real-world classification problems under uncertainty. We seek to account for the imprecision in spammers’ behavioral indicators, which are central to our approach since they are used as features for the evidential K-NN. In this way, our method distinguishes between spammers and innocent reviewers while offering an uncertain output, which is the spamicity degree related to each user.

This chapter is structured as follows: section 4.2 presents the basic concepts of the belief function theory and the evidential K-nearest neighbors, and section 4.3 details the proposed method. Section 4.4 presents the experimental results, and section 4.5 concludes the chapter and outlines some future work.

4.2. Background

In this section, we elucidate the fundamentals of the belief function theory as well as the evidential K-nearest neighbors classifier.

4.2.1. The belief function theory

The belief function theory, also called Dempster-Shafer theory, is one of the most powerful theories for handling uncertainty in different tasks. It was introduced by Shafer (1976) as a model to manage beliefs.

Basic concepts

In this theory, a given problem is represented by a finite and exhaustive set of different events called the frame of discernment $\Omega$. $2^\Omega$ is the power set of $\Omega$ that includes all possible hypotheses, and it is defined by:

$$2^\Omega = \{A : A \subseteq \Omega\}$$

A basic belief assignment (bba) is defined as a function $m^\Omega$ from $2^\Omega$ to [0,1] that represents the degree of belief given to an element $A$ such that:

$$\sum_{A \subseteq \Omega} m^\Omega(A) = 1$$

A focal element $A$ is a set of hypotheses with a positive mass value, $m^\Omega(A) > 0$.

Several types of bbas have been proposed (Smets 1992) in order to model special situations of uncertainty. Here, we present some special cases of bbas:

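– the certain bba represents the state of total certainty, and it is defined as follows:

$$m^\Omega(\{\omega_i\}) = 1, \quad \omega_i \in \Omega$$

– simple support function: in this case, the bba focal elements are $\{A, \Omega\}$. A simple support function is defined by the following equation:

$$m^\Omega(X) = \begin{cases} w & \text{if } X = \Omega \\ 1 - w & \text{if } X = A \text{ for some } A \subset \Omega \\ 0 & \text{otherwise} \end{cases}$$

where $A$ is the focus and $w \in [0,1]$.

Combination rules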

Various combination rules have been suggested in the framework of belief functions to aggregate a set of bbas provided by pieces of evidence from different experts. Let $m_1^\Omega$ and $m_2^\Omega$ be two bbas modeling two distinct sources of information defined on the same frame of discernment $\Omega$. In the following, we present the combination rules related to our approach.

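1) Conjunctive rule: it was introduced in Smets (1992), is denoted by $\text{\textcircled{$\scriptstyle\cap$}}$ and is defined as:

$$\left(m_1^\Omega \,\text{\textcircled{$\scriptstyle\cap$}}\, m_2^\Omega\right)(A) = \sum_{B \cap C = A} m_1^\Omega(B)\, m_2^\Omega(C)$$

2) Dempster’s rule of combination: this rule is the normalized version of the conjunctive rule (Dempster 1967). It is denoted by $\oplus$ and defined as:

$$\left(m_1^\Omega \oplus m_2^\Omega\right)(A) = \begin{cases} \dfrac{\left(m_1^\Omega \,\text{\textcircled{$\scriptstyle\cap$}}\, m_2^\Omega\right)(A)}{1 - \left(m_1^\Omega \,\text{\textcircled{$\scriptstyle\cap$}}\, m_2^\Omega\right)(\emptyset)} & \text{if } A \neq \emptyset \\ 0 & \text{otherwise} \end{cases}$$

Decision process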

The belief function framework provides numerous solutions to make a decision. Within the transferable belief model (TBM) (Smets 1998), the decision process is performed at the pignistic level, where bbas are transformed into pignistic probabilities denoted by $BetP$ and defined as:

$$BetP(B) = \sum_{A \subseteq \Omega} \frac{|A \cap B|}{|A|} \, \frac{m^\Omega(A)}{1 - m^\Omega(\emptyset)}, \quad \forall B \in \Omega$$
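The conjunctive rule, Dempster’s rule and the pignistic transformation can be illustrated with a minimal Python sketch. It assumes a bba is stored as a dictionary mapping `frozenset` subsets of the frame to masses; the function names and the two example bbas are hypothetical and chosen only for illustration.

```python
from itertools import product


def conjunctive(m1, m2):
    """Unnormalized conjunctive rule: each product of masses goes to B ∩ C."""
    out = {}
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        a = b & c
        out[a] = out.get(a, 0.0) + mass_b * mass_c
    return out


def dempster(m1, m2):
    """Dempster's rule: conjunctive rule with the conflict mass normalized away."""
    joint = conjunctive(m1, m2)
    conflict = joint.pop(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {a: mass / (1.0 - conflict) for a, mass in joint.items()}


def betp(m, frame):
    """Pignistic probability of every singleton of the frame."""
    empty = m.get(frozenset(), 0.0)
    return {
        w: sum(mass / len(a) for a, mass in m.items() if w in a) / (1.0 - empty)
        for w in frame
    }


# Illustrative example on a two-class frame (values are made up).
frame = frozenset({"S", "notS"})
m1 = {frozenset({"S"}): 0.6, frame: 0.4}      # simple support function for S
m2 = {frozenset({"notS"}): 0.3, frame: 0.7}   # simple support function for notS
m12 = dempster(m1, m2)
print(m12)
print(betp(m12, frame))
```

Here the conflict mass is 0.6 × 0.3 = 0.18, so Dempster’s rule divides the remaining masses by 0.82 before the pignistic transformation splits the mass left on the frame equally between the two classes.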


4.2.2. Evidential K-nearest neighbors

The evidential K-nearest neighbors (EKNN) (Denoeux 1995) is one of the well-known classification methods within the belief function framework. It improves on the basic crisp KNN method thanks to its ability to offer a credal classification of the different objects. This credal partition provides richer information in the classifier’s output.

Notations

– $\Omega = \{\omega_1, \omega_2, \ldots, \omega_N\}$: the frame of discernment containing the $N$ possible classes of the problem.

– $X_i \in \{X_1, X_2, \ldots, X_M\}$: the object $X_i$ belonging to the set of $M$ distinct instances in the problem.

– A new instance $X$ to be classified.

– $N_K(X)$: the set of the K-nearest neighbors of $X$.

EKNN method

The main objective of the EKNN is to classify a new object $X$ based on the information given by the training set. A new instance $X$ to be classified must be allocated to one class on the basis of its selected neighbors $N_K(X)$. The knowledge that a neighbor $X_i$ belongs to class $\omega_q$ may be regarded as a piece of evidence that raises the belief that the object $X$ to be classified belongs to the class $\omega_q$. For this reason, the EKNN technique treats each neighbor as a piece of evidence that supports some hypotheses about the class of the pattern $X$ to be classified. In fact, the smaller the distance $d(X, X_i)$ between $X$ and $X_i$, the stronger the evidence. This evidence can be represented by a simple support function with a bba such that:

$$m_{X,X_i}(\{\omega_q\}) = \alpha_0 \exp\left(-\gamma_q^2 \, d(X, X_i)^2\right)$$

$$m_{X,X_i}(\Omega) = 1 - \alpha_0 \exp\left(-\gamma_q^2 \, d(X, X_i)^2\right)$$

where:

– $\alpha_0$ is a constant that has been fixed at 0.95;

– $d(X, X_i)$ represents the Euclidean distance between the instance to be classified and the other instances in the training set;

– $\gamma_q$, assigned to each class $\omega_q$, is a positive parameter. It represents the inverse of the mean distance between all the training instances belonging to the class $\omega_q$.

After the generation of the different bbas by the K-nearest neighbors, they can be combined through the Dempster combination rule as follows:

$$m_X = m_{X,X_1} \oplus m_{X,X_2} \oplus \cdots \oplus m_{X,X_K}$$


where {1, …, K} is the set including the indexes of the K-nearest neighbors.
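To make the effect of the distance on the evidence concrete, the following Python sketch evaluates the simple support function induced by a single neighbor. The value 0.95 comes from the text; the function name `neighbor_mass`, the choice of a unit scale parameter and the sample distances are assumptions made for illustration.

```python
import math

ALPHA0 = 0.95  # constant stated in the text


def neighbor_mass(distance, gamma, alpha0=ALPHA0):
    """Return (mass on the neighbor's class, mass left on the whole frame)."""
    support = alpha0 * math.exp(-(gamma ** 2) * distance ** 2)
    return support, 1.0 - support


# The closer the neighbor, the more mass goes to its class.
for d in (0.0, 0.5, 1.0, 2.0):
    on_class, on_frame = neighbor_mass(d, gamma=1.0)
    print(f"d={d:.1f}  m(class)={on_class:.3f}  m(frame)={on_frame:.3f}")
```

Even a neighbor at distance zero leaves a mass of 0.05 on the frame, so no single neighbor can impose full certainty.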

4.3. Spammer detection relying on the reviewers’ behavioral features

The idea behind our method is to take the uncertain aspect into account in order to improve the detection of spammer reviewers. To this end, we propose a novel approach based on different spammer indicators, and we rely on the evidential K-nearest neighbors, which is a well-known classifier under the belief function framework. In the remainder of this section, we detail the different steps of the proposed approach. In the first step, we model and calculate the spammers’ indicators from the reviewers’ behaviors. In the second step, we present the initialization and learning phase. Finally, we distinguish between spammers and innocent reviewers through the classification phase, in which we also offer an uncertain output that reports the spamicity degree of each reviewer. Figure 4.1 illustrates the steps of our method.

Figure 4.1 Spammer detection relying on the reviewers’ behavioral features

4.3.1. Step 1: Features extraction

As mentioned above, spammer indicators have become one of the most powerful tools in the spammer detection field and are used in several studies. In this section, we examine whether reviewers’ behaviors are linked with spamming activities and can thus be used as features to train the evidential KNN classifier in order to distinguish between the two classes: spammer and innocent reviewers. We select the significant features used in previous work (Mukherjee et al. 2013a, 2013b). We detail them in two lists: the first presents the reviewer (author) features, and the second presents the review features. To make the equations more comprehensible, we present the different notations in Table 4.1.

Reviewer features

The values of these features lie in the interval [0,1]. The closer the value is to 1, the higher the indicated spamicity degree.

Content similarity (CS)

Generally, spammers copy reviews from other similar products because, for them, writing a new review takes time. Hence, we assume that it is very useful to detect the content similarity of reviews (using cosine similarity). From this perspective, and in order to capture the worst behavior of a spammer, we use the maximum similarity, computed over all pairs of distinct reviews written by the same reviewer.

Table 4.1. List of notations

Notation | Description
equation | A reviewer
equation | Set of reviewers, where H is the total number of reviewers
equation | A review
equation | A product
equation | Total number of reviews
equation | Review written by the reviewer equation
equation | Set of all reviews written by the reviewer equation
equation | Set of reviews on product or service equation
equation | Review on product equation
equation | Review given by the reviewer equation to the same product equation
equation | Set of all reviews given by the reviewer equation to the same product equation
equation | Set of rating reviews given by the reviewer equation to the same product equation
equation | Last posting date of the review written by the reviewer equation
equation | First posting date of the review written by the reviewer equation
equation | The date of the product launch
equation | Spamming indicator
equation | The mean score of a given product equation
equation | The reviewing score of the reviews given to one product equation by the same reviewer equation
$\Omega = \{S, \bar{S}\}$ | The frame of discernment including the spammer and not spammer class

Maximum number of reviews (MNR)

Writing many reviews and posting them successively in a single day is an indication of deviant behavior. This indicator is the maximum number of reviews that a reviewer posts in one day, normalized by the maximum value over the full dataset.

Reviewing burstiness (BST)

Authentic reviewers publish reviews from their accounts occasionally and over a long period, whereas opinion spammers are usually not long-standing members of the site. We can therefore use account activity to capture spamming behavior. Reviewing burstiness is defined through the activity window, which is the difference between the last and the first posting dates of a reviewer’s reviews. If the reviews are posted over a reasonably long timeframe, this suggests typical activity. In contrast, posting reviews in a short burst (a window of at most 28 days, the threshold estimated by Mukherjee et al. (2013a, 2013b)) indicates spam behavior.

Ratio of first reviews (RFR)

People rely heavily on the first reviews posted for a product. For this reason, spammers tend to write reviews at an early stage in order to affect initial sales; they believe that controlling the first reviews of each product lets them steer people’s sentiment. For every author, we calculate the ratio between the number of their first reviews and their total number of reviews. By first reviews, we mean those for which the author was the first to evaluate the product.

Review features

These features have binary values. A feature value equal to 1 indicates spamming; otherwise, it indicates non-spamming.

Duplicate/near-duplicate reviews (DUP)

To inflate ratings, spammers frequently publish multiple reviews, often duplicates or near-duplicates of earlier reviews of the same product. We capture this activity by detecting duplicate reviews on the same product: a review by an author on a product takes the value 1 if it is similar to another review of that product, where similarity is measured by cosine similarity and must exceed the threshold of 0.7 estimated in Mukherjee et al. (2013a, 2013b).

Extreme rating (EXT)

To demote or promote a product, spammers often give it extreme ratings (1* or 5*) on the five-star (*) rating scale. The feature takes the value 1 if the rating given by the reviewer to the product is 1* or 5*, and 0 otherwise.

Rating deviation

Spammers aim to promote or demote target products or services, and they generate reviews or rating values accordingly. In order to shift the overall rating of a product, they have to contradict the prevailing opinion by posting deceptive ratings that deviate strongly from the overall mean.

The rating deviation of a review is the gap between the score given by the reviewer to the product and the mean score of that product, normalized by 4, the maximum possible deviation on a five-star scale. If this deviation exceeds the threshold of 0.63 estimated in Mukherjee et al. (2013a, 2013b), the feature takes the value 1.

Early time frame (ETF)

Since the first reviews are a meaningful way to gauge people’s sentiment about a product, spammers tend to review early in order to maximize their impact. This feature is designed to detect that behavior. The degree of earliness of an author who has reviewed a product is computed from the gap between the posting date of the author’s last review of the product and the product launch date. The earliness window is about 7 months (estimated in Mukherjee et al. (2013a, 2013b)): according to this definition, the last review is not considered early if it was posted more than 7 months after the product’s launch. Conversely, the sooner a review follows the launch, the higher its degree of earliness; the feature indicates spamming once this degree exceeds the threshold of 0.69, also estimated in Mukherjee et al. (2013a, 2013b).

Rating abuse (RA)

To capture the misuse of multiple ratings, we adopt the rating abuse (RA) feature. Giving multiple ratings to a single product is considered unusual behavior. Although this feature resembles DUP, it does not focus on the content but on the rating dimension. Rating abuse is defined as the similarity of the ratings given by an author to a product across the author’s multiple reviews, weighted by the total number of reviews the author wrote on this product.

To measure this similarity, we take the difference between the two extremes (maximum and minimum) of the author’s ratings on the five-star scale, normalized by 4, the maximum possible difference between ratings. The feature takes lower values when, as in authentic cases, the multiple ratings change (as a result of healthy use). The value 2.01 is taken as the threshold indicating spamming and is estimated in Mukherjee et al. (2013a, 2013b).
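The nine indicators described in this step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, reviews are represented as plain strings, ratings as integers and dates as `datetime.date` objects, cosine similarity is computed on whitespace-tokenized bag-of-words vectors, and the 7-month window is approximated as 210 days. The thresholds (28 days, 0.7, 0.63, 0.69, 2.01) are those quoted in the text; the closed forms for BST, ETF and RA follow the definitions of Mukherjee et al. (2013a, 2013b) as we understand them.

```python
from collections import Counter
from datetime import date
from math import sqrt

TAU_DAYS = 28          # burstiness window quoted in the text
DELTA_DAYS = 7 * 30    # earliness window: 7 months, 30-day months assumed
BETA_DUP, BETA_DEV, BETA_ETF, BETA_RA = 0.7, 0.63, 0.69, 2.01


def cosine(a, b):
    """Cosine similarity between bag-of-words vectors (tokenization assumed)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = sqrt(sum(c * c for c in va.values()) * sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def content_similarity(texts):
    """CS: maximum pairwise similarity among one reviewer's reviews."""
    return max((cosine(a, b) for i, a in enumerate(texts) for b in texts[i + 1:]),
               default=0.0)


def max_reviews_ratio(dates, global_max):
    """MNR: reviewer's busiest day divided by the busiest day over all reviewers."""
    return max(Counter(dates).values()) / global_max


def burstiness(dates):
    """BST: 0 if the activity window exceeds TAU_DAYS, else 1 - window / TAU_DAYS."""
    window = (max(dates) - min(dates)).days
    return 0.0 if window > TAU_DAYS else 1.0 - window / TAU_DAYS


def first_review_ratio(n_first, n_total):
    """RFR: share of the reviewer's reviews that were the first on their product."""
    return n_first / n_total


def is_duplicate(text, other_texts):
    """DUP: 1 if the review is near-identical to another review of the product."""
    return int(any(cosine(text, t) > BETA_DUP for t in other_texts))


def extreme_rating(stars):
    """EXT: 1 for a 1-star or 5-star rating."""
    return int(stars in (1, 5))


def rating_deviation(stars, product_mean):
    """Rating deviation: 1 if the normalized gap to the product mean is too large."""
    return int(abs(stars - product_mean) / 4 > BETA_DEV)


def early_time_frame(last_review, launch):
    """ETF: 1 if the degree of earliness exceeds its threshold."""
    gap = (last_review - launch).days
    earliness = 0.0 if gap > DELTA_DAYS else 1.0 - gap / DELTA_DAYS
    return int(earliness > BETA_ETF)


def rating_abuse(stars_list):
    """RA: 1 if many near-identical ratings were given to the same product."""
    score = len(stars_list) * (1 - (max(stars_list) - min(stars_list)) / 4)
    return int(score > BETA_RA)
```

For example, `burstiness([date(2020, 1, 1), date(2020, 1, 8)])` gives 0.75, since a 7-day window uses a quarter of the 28-day threshold, and `rating_abuse([5, 5, 5])` gives 1 because three identical ratings yield a score of 3, above 2.01.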

4.3.2. Step 2: Initialization and learning phase

In order to apply the evidential K-NN classifier, we should first assign values to the parameters $\alpha_0$ and $\gamma_q$ to be used in the learning phase. We start by initializing the parameter $\alpha_0$ and then compute the second parameter $\gamma_q$ while exploiting the reviewer-item matrix. As mentioned in the EKNN procedure (Denoeux 1995), $\alpha_0$ is initialized to 0.95. The value of $\alpha_0$ is assigned only once, while the value of $\gamma_q$ changes according to the class. To compute $\gamma_q$, we select the reviewers who belong to the same class $\omega_q$ and take the inverse of the average distance between each pair of reviewers $R_i$ and $R_j$ of that class. This calculation is based on the Euclidean distance, denoted $d(R_i, R_j)$, such that:

$$d(R_i, R_j) = \sqrt{\sum_{k=1}^{n} \left(x_{i,k} - x_{j,k}\right)^2}$$

where $n$ represents the number of indicators and $x_{i,k}$ is the value of the $k$-th indicator for the reviewer $R_i$.

Once the spammer indicators are calculated and the two parameters $\alpha_0$ and $\gamma_q$ have been assigned, we must select a set of reviewers. Then, we compute for each reviewer $R_i$ in the database its distance to the target reviewer $R$. Given a target reviewer, we identify its K most similar neighbors by selecting the K reviewers with the smallest Euclidean distances, denoted by $d(R, R_i)$.
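The initialization and neighbor selection can be sketched as below. The sketch assumes that each reviewer is represented by a tuple of indicator values and a class label; the names `class_gammas` and `k_nearest`, the labels `"S"` and `"notS"`, the fallback value used when a class has fewer than two members and the toy data are all assumptions for illustration.

```python
import math
from itertools import combinations


def euclidean(u, v):
    """Euclidean distance between two indicator vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def class_gammas(features, labels):
    """Per class: inverse of the mean pairwise distance between its reviewers."""
    gammas = {}
    for cls in set(labels):
        members = [f for f, lab in zip(features, labels) if lab == cls]
        dists = [euclidean(u, v) for u, v in combinations(members, 2)]
        mean = sum(dists) / len(dists) if dists else 0.0
        # Fallback of 1.0 for degenerate classes is an assumption, not from the text.
        gammas[cls] = 1.0 / mean if mean > 0 else 1.0
    return gammas


def k_nearest(target, features, labels, k):
    """Return the k (distance, label) pairs closest to the target reviewer."""
    ranked = sorted((euclidean(target, f), lab) for f, lab in zip(features, labels))
    return ranked[:k]


# Toy data: two indicators per reviewer (values are made up).
features = [(0.9, 0.8), (0.7, 1.0), (0.1, 0.2), (0.2, 0.0)]
labels = ["S", "S", "notS", "notS"]
gammas = class_gammas(features, labels)
neighbors = k_nearest((0.8, 0.9), features, labels, k=2)
print(gammas, neighbors)
```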

4.3.3. Step 3: Distinguishing between innocent and spammer reviewers

In this step, we aim to classify a new reviewer as a spammer or an innocent reviewer. Let $\Omega = \{S, \bar{S}\}$, where $S$ represents the class of the spammer reviewers and $\bar{S}$ the class of the genuine reviewers.

The bba generation

Each reviewer $R_i$ induces a piece of evidence that builds up our belief about the class that the new reviewer belongs to. However, this information does not supply certain knowledge about the class. In the belief function framework, this case is modeled by simple support functions, where only a part of the belief is committed to $\omega_q \in \Omega$, where $\omega_q$ is the label of the reviewer $R_i$, and the rest is assigned to $\Omega$. Thus, we obtain the following bba:

$$m_{R,R_i}(\{\omega_q\}) = \alpha_0 \exp\left(-\gamma_q^2 \, d(R, R_i)^2\right)$$

$$m_{R,R_i}(\Omega) = 1 - \alpha_0 \exp\left(-\gamma_q^2 \, d(R, R_i)^2\right)$$

where $R$ is the new reviewer and $R_i$ is one of its similar reviewers, $i \in \{1, \ldots, K\}$, and $\alpha_0$ and $\gamma_q$ are the two parameters assigned in the initialization phase.

In our case, each neighbor of the new reviewer has two possible hypotheses. It can be a spammer reviewer, in which case the committed belief is allocated to the spammer class $S$ and the rest to the frame of discernment $\Omega$. Otherwise, it is an innocent reviewer, in which case the committed belief is given to the not spammer class $\bar{S}$ and the rest is assigned to $\Omega$. We treat the K most similar reviewers as independent sources of evidence, each one modeled by a basic belief assignment. Hence, K different bbas can be generated for each reviewer.

The combination of bbas

After the generation of the bbas for each reviewer $R_i$, we describe how to aggregate these bbas in order to get the final belief about the reviewer classification. Under the belief function framework, such bbas can be combined using the Dempster combination rule. The obtained bba therefore represents the evidence of the K-nearest neighbors regarding the class of the reviewer. This global mass function $m_R$ is obtained as follows:

$$m_R = m_{R,R_1} \oplus m_{R,R_2} \oplus \cdots \oplus m_{R,R_K}$$

Decision-making

We apply the pignistic probability $BetP$ in order to select the membership of the reviewer $R$ to one of the classes of $\Omega$ and to assign it a spamicity degree. Then, the classification decision is made: either the reviewer is a spammer or not. For this, the label that has the greater pignistic probability is selected. Moreover, we assign to each reviewer, even one who is not classified as a spammer, a spamicity degree, which is the $BetP$ value of the spammer class.
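For a two-class frame, the bba generation, Dempster combination and pignistic decision admit a compact closed form, sketched below in Python. The sketch assumes that neighbors are given as (distance, label) pairs with the hypothetical labels `"S"` and `"notS"` and that the per-class scale parameters are supplied in a dictionary; the function name `spamicity` and the example values are illustrative only.

```python
import math

ALPHA0 = 0.95  # constant stated in the text


def spamicity(neighbors, gammas, alpha0=ALPHA0):
    """Return (BetP of the spammer class, predicted label) for a new reviewer.

    neighbors: iterable of (distance, label) with label in {"S", "notS"}.
    gammas: dict mapping each label to its scale parameter.
    """
    # Dempster-combining simple support functions that share the same focus
    # multiplies the masses they leave on the frame.
    rest = {"S": 1.0, "notS": 1.0}
    for dist, label in neighbors:
        rest[label] *= 1.0 - alpha0 * math.exp(-(gammas[label] ** 2) * dist ** 2)
    a, b = 1.0 - rest["S"], 1.0 - rest["notS"]  # pooled support for each class

    # Combine the two pooled functions; a * b is the conflict to normalize away.
    norm = 1.0 - a * b
    m_s = a * (1.0 - b) / norm
    m_not = b * (1.0 - a) / norm
    m_frame = (1.0 - a) * (1.0 - b) / norm

    # Pignistic transformation: the mass on the frame is split equally.
    betp_s = m_s + m_frame / 2.0
    betp_not = m_not + m_frame / 2.0
    return betp_s, ("S" if betp_s > betp_not else "notS")


# Two close spammer neighbors and one distant genuine neighbor (made-up values).
degree, label = spamicity([(0.1, "S"), (0.2, "S"), (1.5, "notS")],
                          {"S": 1.0, "notS": 1.0})
print(f"spamicity degree = {degree:.3f}, decision = {label}")
```

Because the output is a probability rather than a hard label, a reviewer classified as genuine still carries a spamicity degree that can be monitored.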

4.4. Experimental study

Evaluation in the fake review detection problem has always been challenging, owing to the unavailability of real-world ground-truth data and to the variability of the features and classification methods used in related work, which can make comparisons in this field unreliable.

4.4.1. Evaluation protocol

Data description

In order to test the performance of our method, we use two datasets collected from yelp.com. These are the largest, most complete, most diversified and most general-purpose labeled datasets available today for the spam review detection field. They are labeled according to the Yelp filter, which has been used as ground truth in various previous works (Mukherjee et al. 2013b; Rayana and Akoglu 2015; Fontanarava et al. 2017; Ben Khalifa et al. 2019b, 2019c) because of its efficient detection algorithm, which is based on experts’ judgment and on various behavioral features. Table 4.2 describes the content of the datasets, where the percentages indicate the filtered fake reviews (not recommended) and the spammer reviewers.

The YelpNYC dataset contains reviews of restaurants located in New York City. The YelpZip dataset is bigger than the YelpNYC dataset, since it includes businesses in various regions of the United States, such as New York, New Jersey, Vermont, Connecticut and Pennsylvania. The strong points of these datasets are:

– the high number of reviews per user, which facilitates the modeling of the behavioral features of each reviewer;

– the miscellaneous kinds of entities reviewed, i.e. hotels and restaurants;

– above all, the datasets hold only fundamental information, such as the content, label, rating and date of each review, connected to the reviewer who wrote it. Since no over-specific information is required, the proposed method can be generalized to different review sites.

Table 4.2. Datasets description

Datasets | Reviews (filtered %) | Reviewers (spammer %) | Services (restaurant or hotel)
YelpZip | 608,598 (13.22%) | 260,277 (23.91%) | 5,044
YelpNYC | 359,052 (10.27%) | 160,225 (17.79%) | 923

Evaluation criteria

We rely on the following three criteria to evaluate our method: accuracy, precision and recall. They are defined by the following equations, where $TP$, $TN$, $FP$ and $FN$ denote true positives, true negatives, false positives and false negatives, respectively:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

$$Precision = \frac{TP}{TP + FP}$$

$$Recall = \frac{TP}{TP + FN}$$
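As a small illustration, the three criteria can be computed from predicted and true labels as follows. The function name `evaluate`, the label strings and the sample predictions are assumptions; returning 0.0 when a denominator is empty is a convention chosen for this sketch.

```python
def evaluate(y_true, y_pred, positive="S"):
    """Return (accuracy, precision, recall) with `positive` as the spammer class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # convention when undefined
    recall = tp / (tp + fn) if tp + fn else 0.0     # convention when undefined
    return accuracy, precision, recall


# Made-up labels for five reviewers.
acc, prec, rec = evaluate(["S", "S", "notS", "notS", "S"],
                          ["S", "notS", "notS", "S", "S"])
print(acc, prec, rec)
```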


4.4.2. Results and discussion

Our method relies on the evidential KNN to classify reviewers into spammers and genuine ones. We compare it to the support vector machine (SVM) and Naive Bayes (NB) classifiers, which are used by most spammer detection methods in this field (Mukherjee et al. 2013a; Rayana and Akoglu 2015; Liu et al. 2017). We also compare it with our previously proposed uncertain classifier to detect spammers (UCS) (Ben Khalifa et al. 2019a). Table 4.3 reports the different results.

Our method achieves the best detection performance in terms of accuracy, precision and recall, surpassing the baseline classifiers. At best, we record an accuracy improvement of over 24% on both the YelpZip and YelpNYC datasets compared to NB, and of over 19% compared to SVM. Moreover, the improvement of over 10% at best over our previous uncertain method, UCS (Ben Khalifa et al. 2019a), shows the importance of the variety of features used in the proposed approach.

Our method can be used in several fields by different review websites. In fact, these websites must block detected spammers in order to stop the appearance of fake reviews. Furthermore, thanks to our uncertain output, which represents the spamicity degree of each reviewer, they can monitor genuine reviewers with a high spamicity degree to prevent them from turning into spammers.

Table 4.3. Comparative results

Evaluation criteria | Accuracy | Precision | Recall

4.5. Conclusion and future work

In this chapter, we tackled the spammer detection problem and put forward a novel approach that distinguishes between spammers and innocent reviewers while taking into account the uncertainty in the different suspicious behavioral indicators. Our method performs well in detecting spammers while assigning a spamicity degree to each reviewer. The proposed approach can be useful for different review sites in various fields. Moreover, our uncertain output can be used by other methods to model the reliability of each reviewer. In the future, we aim to tackle the group spammer aspect in the interest of improving detection in this field.

4.6. References

  1. Akoglu, L., Chandy, R., Faloutsos, C. (2013). Opinion fraud detection in online reviews by network effects. Proceedings of the Seventh International Conference on Weblogs and Social Media, ICWSM, 13, 2–11.
  2. Bandakkanavar, R.V., Ramesh, M., Geeta, H. (2014). A survey on detection of reviews using sentiment classification of methods. IJRITCC, 2(2), 310–314.
  3. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2018a). Fake reviews detection under belief function framework. Proceedings of the International Conference on Advanced Intelligent System and Informatics (AISI), 395–404.
  4. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2018b). Multiple criteria fake reviews detection using belief function theory. The 18th International Conference on Intelligent Systems Design and Applications (ISDA), 315–324.
  5. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2019a). Fake reviews detection based on both the review and the reviewer features under belief function theory. The 16th International Conference Applied Computing, 123–130.
  6. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2019b). Spammers detection based on reviewers’ behaviors under belief function theory. The 32nd International Conference on Industrial, Engineering and Other Applications of Applied Intelligent Systems (IEA/AIE), 642–653.
  7. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2019c). Multiple criteria fake reviews detection based on spammers’ indicators within the belief function theory. The 19th International Conference on Hybrid Intelligent Systems (HIS), Bhopal.
  8. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2019d). An evidential spammer detection based on the suspicious behaviors’ indicators. The International Multiconference OCTA 2019, February 6–8.
  9. Ben Khalifa, M., Elouedi, Z., Lefèvre, E. (2020). Evidential group spammers detection. In Information Processing and Management of Uncertainty in Knowledge-Based Systems, Lesot, M.J., Vieira, S., Reformat, M., Carvalho, P.J., Wilbik, A., Bouchon-Meunier, B., Yager, R. (eds). Springer, Heidelberg.
  10. Dempster, A.P. (1967). Upper and lower probabilities induced by a multivalued mapping. Annals of Mathematical Statistics, 38, 325–339.
  11. Deng, X. and Chen, R. (2014). Sentiment analysis based online restaurants fake reviews hype detection. Web Technologies and Applications, 1–10.
  12. Denoeux, T. (1995). A K-nearest neighbor classification rule based on Dempster-Shafer theory. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 25(5), 804–813.
  13. Fayazbakhsh, S. and Sinha, J. (2012). Review spam detection: A network-based approach. Final project report, CSE 590 (Data Mining and Networks).
  14. Fei, G., Mukherjee, A., Liu, B., Hsu, M., Castellanos, M., Ghosh, R. (2013). Exploiting burstiness in reviews for review spammer detection. Proceedings of the Seventh International Conference on Weblogs and Social Media, ICWSM, 13, 175–184.
  15. Fontanarava, J., Pasi, G., Viviani, M. (2017). Feature analysis for fake review detection through supervised classification. Proceedings of the International Conference on Data Science and Advanced Analytics, 658–666.
  16. Heydari, A., Tavakoli, M., Ismail, Z., Salim, N. (2016). Leveraging quality metrics in voting model based thread retrieval. World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering, 10(1), 117–123.
  17. Jindal, N. and Liu, B. (2008). Opinion spam and analysis. Proceedings of the 2008 International Conference on Web Search and Data Mining, ACM, 219–230.
  18. Jousselme, A.-L., Grenier, D., Bossé, É. (2001). A new distance between two bodies of evidence. Information Fusion, 2(2), 91–101.
  19. Lim, P., Nguyen, V., Jindal, N., Liu, B., Lauw, H. (2010). Detecting product review spammers using rating behaviors. Proceedings of the 19th ACM International Conference on Information and Knowledge Management, 939–948.
  20. Ling, X. and Rudd, W. (1989). Combining opinions from several experts. Applied Artificial Intelligence: An International Journal, 3(4), 439–452.
  21. Liu, P., Xu, Z., Ai, J., Wang, F. (2017). Identifying indicators of fake reviews based on spammers behavior features. IEEE International Conference on Software Quality, Reliability and Security Companion (QRS-C), 396–403.
  22. Mukherjee, A., Kumar, A., Liu, B., Wang, J., Hsu, M., Castellanos, M. (2013a). Spotting opinion spammers using behavioral footprints. Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining, 632–640.
  23. Mukherjee, A., Venkataraman, V., Liu, B., Glance, N. (2013b). What Yelp fake review filter might be doing. Proceedings of the Seventh International Conference on Weblogs and Social Media, ICWSM, 409–418.
  24. Ong, T., Mannino, M., Gregg, D. (2014). Linguistic characteristics of shill reviews. Electronic Commerce Research and Applications, 13(2), 69–78.
  25. Pan, L., Zhenning, X., Jun, A., Fei, W. (2017). Identifying indicators of fake reviews based on spammer’s behavior features. Proceedings of the IEEE International Conference on Software Quality, Reliability and Security Companion, QRS-C, 396–403.
  26. Rayana, S. and Akoglu, L. (2015). Collective opinion spam detection: Bridging review networks and metadata. Proceedings of the 21th International Conference on Knowledge Discovery and Data Mining, ACM SIGKDD, 985–994.
  27. Savage, D., Zhang, X., Yu, X., Chou, P., Wang, Q. (2015). Detection of opinion spam based on anomalous rating deviation. Expert Systems with Applications, 42(22), 8650–8657.
  28. Shafer, G. (1976). A Mathematical Theory of Evidence, vol. 1. Princeton University Press, Princeton.
  29. Smets, P. (1990). The combination of evidence in the transferable belief model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(5), 447–458.
  30. Smets, P. (1992). The transferable belief model for expert judgement and reliability problem. Reliability Engineering and System Safety, 38, 59–66.
  31. Smets, P. (1995). The canonical decomposition of a weighted belief. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1896–1901.
  32. Smets, P. (1998). The transferable belief model for quantified belief representation. In Quantified Representation of Uncertainty and Imprecision, Smets, P. (ed.). Springer, Dordrecht.
  33. Wang, G., Xie, S., Liu, B., Yu, P.S. (2011). Review graph based online store review spammer detection. Proceedings of 11th International Conference on Data Mining, ICDM, 1242–1247.


  1. Chapter written by Malika BEN KHALIFA, Zied ELOUEDI and Eric LEFÈVRE.