Efficient Tensor Strategy for Recommendation



Introduction
From the perspective of e-commerce, a recommender system (RS) is a tool that helps users search through records of knowledge related to their interests and preferences; its core function is to identify useful items for the user [1,2]. In [3], an RS is defined as a means of assisting and augmenting the social process of using the recommendations of others to make choices when there is insufficient personal knowledge or experience of the alternatives. An RS must therefore predict whether an item is worth recommending [4,5]. Recommender systems have consequently become an active field of research. They play an important role in automatic recommendation and are now pervasive in domains such as book recommendation at Amazon and music and movie recommendation at Netflix, where they serve as algorithmic tools for tackling the information overload problem. The main constraints on RSs include data sparsity and cold-start issues. To overcome such problems, matrix factorization methods have been applied extensively [6]-[8]. More recently, additional sources of information have been integrated into RSs. Much of this research relies on matrix factorization methods such as social matrix factorization (Social MF), which combines ratings with social relations [9]-[14]. Another research thread is topic matrix factorization, which combines latent factors in ratings with latent topics in item reviews [15]. In [16,10], the authors exploited other sources of information, such as reviews that justify a user's rating and ratings associated with item attributes hidden in reviews, producing strong results but at the cost of training data and time. We therefore argue that omitting such information does not aid recommendation accuracy.
According to prior research, such problems can be handled well through tensor decomposition, as propounded by [17], and our motivation for this paper is strongly tied to these reasons. Various tensor decomposition methods have been proposed. The CANDECOMP/PARAFAC decomposition, abbreviated as CP decomposition, is a direct extension of low-rank matrix decomposition to tensors. Probabilistic Tensor Factorization (PTF) [18], which can be regarded as a special case of CP decomposition and is inspired by probabilistic latent factor models [19,20], has been proposed by various researchers as an effective tool for tackling recommendation problems [21,22]. The era of big data has also witnessed an explosion of tensor datasets, so large-scale PTF analysis is important for accommodating the growing data. A comprehensive overview can be found in the survey paper [23]. There is therefore a need for tensor decomposition analysis that can extract hidden patterns from multi-way datasets.

ASTESJ ISSN: 2415-6698

The core concept of senti-PTF is to capture additional sources of information that are occasionally neglected in various recommendation models and that could efficiently improve prediction performance in RSs. The key contribution of our model is that it integrates all available data sources; that is, it provides a joint model of users, product IDs, ratings, reviews and review helpfulness. Our contributions are summarized as follows:
1. We provide an effective way to jointly exploit ratings, reviews and relations to overcome cold-start problems.
2. We propose a new framework, senti-PTF, which is effective in terms of prediction, as assessed through error detection, and which addresses sparsity problems.
The rest of this paper is organized as follows. Section 2 gives the tensor decomposition preliminaries. Section 3 presents the details of the experiments and the datasets. Section 4 gives concluding remarks and a discussion of some future work.

Matrix factorization and its application to personalized recommendation demonstrated the effectiveness of directly modelling all the dimensions simultaneously in a unified framework. These and other works suggest that tensor decomposition models perform well in terms of prediction efficiency and effectiveness compared with the various matrix factorization algorithms, particularly when applied to massive data processing [24]-[26]. However, the numerous literature concerning the subject.

Problem Statement
Despite the various attempts made by researchers, collaborative filtering models still suffer from two problems. The first is sparsity, caused by the sparse rating matrix. The second is cold start: the models perform poorly on cold users and cold items for which there are few or no data. User feedback is intended to reveal latent product and user dimensions. Unfortunately, traditional methods often discard review text, which makes the user and product latent dimensions difficult to interpret, because the very text that justifies a user's rating is ignored. In our opinion, ignoring this rich source of information is a major shortcoming of existing work on recommender systems.

Related Work
Tensor factorization methods are useful tools in recommendation systems. One prominent factor-based method is Probabilistic Tensor Factorization (PTF), which has been studied by a number of researchers in the field. Bayesian Probabilistic Tensor Factorization (BPTF) was used in [27] to enhance prediction accuracy and recommendation using sales data. The PTF model proposed in [28] is naturally applicable to incomplete tensors and provides both point estimates and multiple imputation for missing entries. Tensor factorization was also applied effectively to precision medicine in heart failure with preserved ejection fraction [29]. The work in [30] on probabilistic polyadic factorization and its application to personalized recommendation demonstrated the effectiveness of directly modelling all the dimensions simultaneously in a unified framework. These and other works suggest that tensor decomposition models perform well in terms of prediction efficiency and effectiveness compared with the various matrix factorization algorithms, particularly when applied to massive data processing [31]-[33]. However, the numerous literature concerning the subject.

Proposed Sentiment-based Tensor Analysis
We propose a tensor decomposition approach to the sparsity and cold-start problems of collaborative filtering algorithms. The approach uses review sentiments and rating scores within Probabilistic Tensor Factorization (PTF). The main idea is to capture the latent structure of a tensor through a probabilistic factorization framework and to use that latent structure for prediction. We jointly model ratings and review sentiment scores with the PTF algorithm. PTF is an instance of the CANDECOMP/PARAFAC (CP) tensor decomposition [34], a commonly used tensor factorization model. CP decomposition factorizes a tensor into a sum of rank-one tensors, where A, B and C are the latent factors.
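The statement that CP decomposition writes a tensor as a sum of rank-one tensors can be illustrated with a minimal NumPy sketch. The function name `cp_reconstruct`, the toy sizes and the random factors are assumptions made only for illustration; they are not taken from the authors' implementation.

```python
import numpy as np

def cp_reconstruct(A, B, C):
    """Rebuild a 3rd-order tensor from CP factors.

    A has shape (I, D), B has shape (J, D), C has shape (K, D).
    Entry (i, j, k) of the result is sum_d A[i, d] * B[j, d] * C[k, d].
    """
    return np.einsum("id,jd,kd->ijk", A, B, C)

rng = np.random.default_rng(0)
I, J, K, D = 4, 3, 2, 2  # toy sizes, chosen only for illustration
A = rng.normal(size=(I, D))
B = rng.normal(size=(J, D))
C = rng.normal(size=(K, D))

S_hat = cp_reconstruct(A, B, C)

# The same tensor written explicitly as a sum of D rank-one tensors,
# each being the outer product of one column from A, B and C.
S_sum = sum(
    np.multiply.outer(np.multiply.outer(A[:, d], B[:, d]), C[:, d])
    for d in range(D)
)
assert np.allclose(S_hat, S_sum)
```

The number of columns D plays the role of the latent dimension: a larger D allows a closer fit to the observed entries at the cost of more parameters.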

Equations
Tensor factorization techniques have gained popularity and have become standard recommender approaches owing to their accuracy and scalability [35]. They have a probabilistic interpretation with Gaussian noise. Our model, senti-PTF, combines our sentiment algorithm with the probabilistic tensor factorization framework. Consider a tensor S of size I × J × K in which each entry is indexed by (i, j, k), and assume there are D-dimensional latent factors A_i, B_j and C_k corresponding to each i, j and k respectively. In other words, each dimension of the tensor has a latent factor matrix: A (I × D), B (J × D) and C (K × D) respectively. The distribution of an unknown entry (i, j, k) given the observed tensor S is a multivariate Gaussian distribution. Given S, the learning task is to estimate the model parameters θ from the likelihood function of the observed entries.
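The likelihood expression itself does not appear in the text. Under the Gaussian noise assumption stated above, one common form for CP-based probabilistic tensor factorization is sketched below. The symbols Ω (the set of observed index triples) and σ² (the noise variance) are assumptions introduced here for illustration, and the expression is not necessarily the authors' exact formula.

```latex
\[
p(S \mid A, B, C, \sigma^2)
  = \prod_{(i,j,k) \in \Omega}
    \mathcal{N}\!\left( S_{ijk} \,\middle|\,
      \sum_{d=1}^{D} A_{id} B_{jd} C_{kd},\; \sigma^2 \right)
\]
```

Under this form, taking the negative logarithm shows that maximizing the likelihood over θ = (A, B, C) is equivalent to minimizing the sum of squared errors between observed entries and their CP reconstruction.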

Experiments
Our experiments are designed to study the accuracy and efficiency of senti-PTF, rat-PTF and the baselines on publicly available social media review datasets. All experiments were run on an AMD E2-6110 APU with AMD Radeon R2 Graphics (1500 MHz, 4 cores, 4 logical processors) and 12 GB of RAM.

(a) Datasets and Parameter Settings: The real-world tensor data used in our experiments come from a public collaborative filtering dataset, the Amazon dataset, which contains product reviews and metadata from Amazon, including 142.8 million reviews spanning May 1996 to July 2014 [37]. The dataset includes reviews (ratings, text, helpfulness votes), product metadata (descriptions, category information, price, brand and image features) and links. To study senti-PTF's performance, we process the data into three 3rd-order tensors in which the modes correspond to item IDs, users and reviews; ratings are modelled likewise, with item IDs, users and ratings as the modes. These are denoted by the tensor ABC, as shown in our probabilistic model. The ratings range from 1 to 5, while the review sentiments were encoded as 0 and 1, representing negative and positive sentiments.
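The tensor construction described above can be sketched as follows. The toy records, the identifiers such as `u1` and `p1`, and the one-hot encoding of the third mode are assumptions made for illustration only; the text does not specify how the authors encoded the third mode.

```python
import numpy as np

# Hypothetical toy records: (user, item, rating in 1..5, sentiment in {0, 1}).
records = [
    ("u1", "p1", 5, 1),
    ("u1", "p2", 2, 0),
    ("u2", "p1", 4, 1),
    ("u3", "p3", 1, 0),
]

# Map raw identifiers to contiguous integer indices.
users = sorted({r[0] for r in records})
items = sorted({r[1] for r in records})
user_idx = {u: n for n, u in enumerate(users)}
item_idx = {p: n for n, p in enumerate(items)}

# Mode order is (item, user, level). The third mode one-hot encodes the
# rating (5 levels) or the sentiment (2 levels); this is an assumption.
rating_tensor = np.zeros((len(items), len(users), 5))
senti_tensor = np.zeros((len(items), len(users), 2))

for user, item, rating, sentiment in records:
    i, j = item_idx[item], user_idx[user]
    rating_tensor[i, j, rating - 1] = 1.0
    senti_tensor[i, j, sentiment] = 1.0

# Fraction of (item, user) pairs with an observation; low values mean sparsity.
density = senti_tensor.sum() / (len(items) * len(users))
```

Even in this toy example fewer than half of the item-user pairs are observed, which mirrors the sparsity problem that motivates the factorization.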

Error Detection
For comparison, we implement senti-PTF and rat-PTF and report their prediction performance. For consistency of expression, we still use "customer" and "item" to refer to reviewers and automotive products, respectively. To assess the performance of our model, we estimate error rates on the sentiments and ratings expressed; the results are shown in Figure 1. The error graph shows how our algorithms, senti-PTF and rat-PTF, performed. Senti-PTF performed better than rat-PTF in terms of prediction performance.

Figure 1 Error detection for senti-PTF and rat-PTF
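The paper reports recommender performance in terms of RMSE. A minimal sketch of that metric is given here; the function name `rmse` and the sample values are hypothetical and are not results from the experiments.

```python
import numpy as np

def rmse(predicted, observed):
    """Root mean squared error between predictions and observed values."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))

# Hypothetical held-out ratings and model predictions.
observed = [5, 3, 4, 1]
predicted = [4.5, 3.5, 4.0, 2.0]
error = rmse(predicted, observed)
```

A lower RMSE indicates that the predicted ratings or sentiments lie closer to the held-out values.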

Conclusion
We proposed a unified framework, comprising rat-PTF and senti-PTF, that aligns latent factors and topics and performs Probabilistic Tensor Factorization for effective rating and sentiment prediction. Experiments on real-world datasets demonstrate that our senti-PTF model outperforms the traditional CP decomposition, and that exploiting review sentiment beyond ratings can significantly improve recommender performance in terms of RMSE (Figure 2). Figure 1 shows that senti-PTF achieves better performance as the tensor size increases on the Amazon datasets. This result directly sheds light on the necessity of a senti-PTF solution. We therefore propose the sentiment-based tensor analysis approach for recommendation, as it addresses the cold-start problem, improves prediction efficiently and addresses the scalability problems of the big data era. Model integration could be considered in our future work [38].

Conflict of Interest
We declare that there is no conflict of interest regarding the publication of this work.