News aggregation and recommendation using clustering algorithms and graph theory
DOI:
https://doi.org/10.70530/kuset.v20i1.720
Keywords:
News aggregation, Clustering, Graph theory, Recommendation system, Content-based filtering
Abstract
The proliferation of online news sources makes it difficult to manage redundant articles and to provide personalized recommendations, especially when no user data is available. This study proposes a two-stage methodology that addresses both problems. First, we apply Agglomerative Hierarchical Clustering (AHC) to a corpus of 8,142 articles scraped from prominent Nepali news websites, grouping articles by semantic similarity to reduce content duplication. The resulting unique articles are then modeled as nodes in a weighted, undirected graph. A content-based recommendation engine generates curated news lists by applying Dijkstra’s algorithm to this graph, using a composite edge weight that accounts for semantic similarity, publication recency, and category. In our evaluation, this approach reduced article redundancy by 15.9%. The recommendation system generated 10 recommendations in 1.14 seconds on average, and a user study (N=192) found the recommendations to be highly relevant (40% potential CTR). This work validates a computationally efficient, graph-based framework for news aggregation and recommendation that entirely bypasses the training overhead and ‘cold start’ limitations of traditional machine learning models.
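The second stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the composite edge-weight formula, so the weighted sum below, the coefficients `alpha`, `beta`, and `gamma`, the 30-day recency window, and the article fields `vec`, `day`, and `category` are all assumptions made for illustration, not the authors' implementation. The sketch assumes the articles have already been deduplicated and embedded as vectors; it builds a complete undirected graph and returns the `k` articles with the smallest shortest-path distance from a source article.

```python
import heapq
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def edge_weight(a, b, alpha=0.6, beta=0.3, gamma=0.1, max_age_days=30.0):
    """Hypothetical composite weight: lower means 'closer'.

    Combines semantic dissimilarity, publication-date gap, and a
    category-mismatch penalty. The formula and coefficients are assumed.
    """
    # Clamp at zero so rounding error never yields a negative weight,
    # which Dijkstra's algorithm does not support.
    dissimilarity = max(0.0, 1.0 - cosine(a["vec"], b["vec"]))
    age_gap = min(abs(a["day"] - b["day"]) / max_age_days, 1.0)
    category_penalty = 0.0 if a["category"] == b["category"] else 1.0
    return alpha * dissimilarity + beta * age_gap + gamma * category_penalty


def build_graph(articles):
    """Build a complete weighted, undirected graph as nested dicts."""
    graph = {article_id: {} for article_id in articles}
    ids = list(articles)
    for i, u in enumerate(ids):
        for v in ids[i + 1:]:
            w = edge_weight(articles[u], articles[v])
            graph[u][v] = w
            graph[v][u] = w
    return graph


def recommend(graph, source, k=10):
    """Return up to k article ids nearest to source by shortest-path distance."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled = []
    visited = set()
    # Stop once the source plus k neighbours have been settled.
    while heap and len(settled) < k + 1:
        d, u = heapq.heappop(heap)
        if u in visited:
            continue  # stale heap entry
        visited.add(u)
        settled.append(u)
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return settled[1:]  # drop the source itself
```

Because Dijkstra's algorithm settles nodes in order of increasing distance, the search can stop as soon as `k` articles other than the source have been settled, which keeps each query cheap and needs no model training or user history.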

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.