@PHDTHESIS{ 2018:1997138684, title = {Leveraging User Opinions for Product Catalog Enrichment}, year = {2018}, url = "https://tede.ufam.edu.br/handle/tede/6831", abstract = "Many people post reviews of the products of all types that are offered online. In these reviews, people express their opinions about the products and their features. Consequently, a large number of opinions is available, which can be a valuable source of knowledge for decision-making by manufacturers and customers alike. From these opinions, manufacturers can obtain immediate feedback to improve the quality of their products, and customers can consult assessments before purchasing a product. However, as is common in many types of social media, the sheer volume of available reviews for each product normally exceeds human processing capacity and can thus become a major barrier to their effective use. The question that now arises is how to structure opinions so that they can be effectively used by customers and manufacturers. Traditional methods of organizing a large number of product reviews aim at creating an opinion summary. However, these methods are inadequate for addressing customer queries on the most relevant product characteristics. In particular, current methods cluster opinions arbitrarily by aspect expressions, so the resulting clusters do not necessarily align with relevant product characteristics. We claim that the product characteristics most important to people are represented by the attributes of product catalogs, and that the process of organizing opinions should be guided by these attributes. Therefore, in this thesis, we formulated and investigated the following problem: enriching product catalogs with user opinions at the attribute granularity level as a new form of opinion summarization.
Grouping opinions around the attributes of the product catalog also allows the catalog to be enriched with these opinions over time. To address this novel problem, we started by investigating the impact of product catalog attributes on user opinions. In this investigation, we used a large collection of data. The experimental results indicate that user opinions are significantly influenced by product attributes. In addition, we presented a new approach comprising two phases: opinion extraction and opinion mapping. Based on this approach, we developed two distinct methods. The first method, named AspectLink, adopts an unsupervised strategy. The second method, named OpinionLink, adopts a supervised strategy. An extensive experimental evaluation demonstrated the effectiveness of the proposed methods. Furthermore, a bootstrapping strategy was proposed to train the classifiers of OpinionLink in order to reduce the dependence on training data. Finally, the supervised method was applied as a full pipeline, and the experimental results demonstrated the feasibility of using this method in real and large-scale applications. To properly evaluate the methods developed in this study, we built experimental datasets that did not previously exist in the literature; these are available as a further contribution. We also developed a practical application to showcase some of the ideas proposed in this thesis.", publisher = {Universidade Federal do Amazonas}, school = {Programa de Pós-graduação em Informática}, note = {Instituto de Computação} }