This content originally appeared on HackerNoon and was authored by YAML
:::info Authors:
(1) Daniele Malitesta, Politecnico di Bari, Italy (daniele.malitesta@poliba.it), corresponding author;
(2) Giuseppe Gassi, Politecnico di Bari, Italy (g.gassi@studenti.poliba.it), corresponding author;
(3) Claudio Pomo, Politecnico di Bari, Italy (claudio.pomo@poliba.it);
(4) Tommaso Di Noia, Politecnico di Bari, Italy (tommaso.dinoia@poliba.it).
:::
Abstract and 1 Introduction and Motivation
2 Architecture and 2.1 Dataset
5 Demonstrations and 5.1 Demo 1: visual + textual items features
5.2 Demo 2: audio + textual items features
5.3 Demo 3: textual items/interactions features
6 Conclusion and Future Work, Acknowledgments and References
5.3 Demo 3: textual items/interactions features
Online platforms usually let customers post reviews and comments about the products they have purchased, sharing their experience with other potentially interested customers. In an e-commerce scenario, items may come with textual descriptions of the product characteristics (as seen in Demo 1). However, textual reviews written by users about those items may also be available. Unlike most existing works in the literature, which usually treat both sources of information as item representations, we conceptually distinguish between item-side representations for the former (descriptions) and interaction-side (i.e., user-item) representations for the latter (reviews).
Input data. We adopt the popular Amazon recommendation dataset, in which each purchase record carries metadata such as customer/product ids, the review text, the rating, and the purchase date. As in the other demos, we retain only a small subset of the original dataset: 100 reviews and the corresponding product descriptions (each obtained by concatenating the product title and category). Specifically, we save descriptions and reviews into separate TSV files, where the former follows the same format as in Demo 1 and Demo 2, while the latter maps user/item ids to review texts. Note that the number of products does not match the number of user-item interactions, because we only consider the set of unique interacted items. While Ducho extracts description/interaction texts from the last column of the TSV file by default, here we provide explicit column names to tell Ducho where to find product descriptions and user reviews in the respective TSV files.
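As a rough illustration of the file layout described above, the following sketch splits a few toy interactions into a descriptions table and a reviews table, writes both as TSV with a header line, and reads a column back by name. The column names (`item_id`, `description`, `user_id`, `review_text`), the sample rows, the helper functions, and the space used to join title and category are all hypothetical, chosen only for this example; they are not taken from Ducho or from the Amazon dataset.

```python
import csv
import io

# Hypothetical toy interactions: (user id, item id, review text, title, category).
INTERACTIONS = [
    ("u1", "i1", "Toasts evenly, very happy.", "Two-slice toaster", "Kitchen"),
    ("u2", "i1", "Stopped working after a month.", "Two-slice toaster", "Kitchen"),
    ("u1", "i2", "Comfortable and light.", "Running shoes", "Sports"),
]


def build_tables(interactions):
    """Split interactions into item descriptions and user-item reviews."""
    descriptions = {}  # one entry per unique interacted item
    reviews = []       # one entry per user-item interaction
    for user_id, item_id, review, title, category in interactions:
        # Assumption: title and category are joined with a single space.
        descriptions[item_id] = f"{title} {category}"
        reviews.append(
            {"user_id": user_id, "item_id": item_id, "review_text": review}
        )
    items = [{"item_id": i, "description": d} for i, d in descriptions.items()]
    return items, reviews


def to_tsv(rows, fieldnames):
    """Serialise rows to a tab-separated string with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def read_column(tsv_text, column):
    """Read one column by its header name rather than by position."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row[column] for row in reader]


items, reviews = build_tables(INTERACTIONS)
descriptions_tsv = to_tsv(items, ["item_id", "description"])
reviews_tsv = to_tsv(reviews, ["user_id", "item_id", "review_text"])
```

In this toy data, three interactions cover only two unique items, which mirrors why the descriptions file is shorter than the reviews file. Reading by header name is also why the text column need not be the last one.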
Extraction. For the items' descriptions we again use the same sentence encoder as in Demos 1 and 2. For the users' reviews, we extract textual features through a multilingual BERT-based model pre-trained on customers' reviews, and we specify sentiment analysis as the task for this model.
Output. Textual item features are saved to NumPy arrays whose filenames are the item ids. Conversely, the textual interaction features are saved under filenames derived from the user and item ids, so that each file provides a unique pointer to a single review.
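The naming scheme above can be sketched as follows. This is only an illustration of the idea, not Ducho's actual implementation: the directory names, the `_` separator between user and item ids, and both helper functions are assumptions made for this example. The `.npy` extension is the one NumPy uses for saved arrays.

```python
from pathlib import PurePosixPath


def item_feature_path(output_dir, item_id):
    """Path of the array holding the features of one item description."""
    return PurePosixPath(output_dir) / f"{item_id}.npy"


def interaction_feature_path(output_dir, user_id, item_id, sep="_"):
    """Path of the array holding the features of one user-item review.

    Assumption: user and item ids are joined with `sep`; the real
    separator used by Ducho is not stated in the text.
    """
    return PurePosixPath(output_dir) / f"{user_id}{sep}{item_id}.npy"


# Hypothetical (user id, item id) pairs, one per review.
review_keys = [("u1", "i1"), ("u2", "i1"), ("u1", "i2")]

# One file per unique item, one file per review.
item_paths = sorted({str(item_feature_path("items", i)) for _, i in review_keys})
interaction_paths = [
    str(interaction_feature_path("interactions", u, i)) for u, i in review_keys
]
```

Two reviews of the same item (`u1` and `u2` on `i1`) map to different interaction files, while the item itself maps to a single description file. A joined name of this kind is only unambiguous if the separator cannot appear inside the ids themselves.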
:::info This paper is available on arXiv under the CC BY 4.0 DEED license.
:::

YAML | Sciencx (2025-02-16T20:10:32+00:00) Ducho, the AI That Knows What You Think About That Toaster. Retrieved from https://www.scien.cx/2025/02/16/ducho-the-ai-that-knows-what-you-think-about-that-toaster/