Efficient algorithm for the k-means problem with Must-Link and Cannot-Link constraints
Journal article
Jia, Chaoqi, Guo, Longkun, Liao, Kewen and Lu, Zhigang. (2023). Efficient algorithm for the k-means problem with Must-Link and Cannot-Link constraints. Tsinghua Science and Technology. 28(6), pp. 1050-1062. https://doi.org/10.26599/TST.2022.9010056
Authors | Jia, Chaoqi, Guo, Longkun, Liao, Kewen and Lu, Zhigang |
---|---|
Abstract | Constrained clustering, such as k -means with instance-level Must-Link (ML) and Cannot-Link (CL) auxiliary information as the constraints, has been extensively studied recently, due to its broad applications in data science and AI. Despite some heuristic approaches, there has not been any algorithm providing a non-trivial approximation ratio to the constrained k -means problem. To address this issue, we propose an algorithm with a provable approximation ratio of O(logk) when only ML constraints are considered. We also empirically evaluate the performance of our algorithm on real-world datasets having artificial ML and disjoint CL constraints. The experimental results show that our algorithm outperforms the existing greedy-based heuristic methods in clustering accuracy. |
Keywords | constrained k-means; Must-Link (ML) constraints; Cannot-Link (CL) constraints; approximation algorithm; constrained clustering |
Year | 2023 |
Journal | Tsinghua Science and Technology |
Journal citation | 28 (6), pp. 1050-1062 |
Publisher | Tsinghua University |
ISSN | 1007-0214 |
Digital Object Identifier (DOI) | https://doi.org/10.26599/TST.2022.9010056 |
Scopus EID | 2-s2.0-85166738954 |
Open access | Published as ‘gold’ (paid) open access |
Page range | 1050-1062 |
Funder | National Natural Science Foundation of China (NSFC) |
Universities of Shandong Province | |
Publisher's version | License File Access Level Open |
Output status | Published |
Publication dates | |
Online | 28 Jul 2023 |
Publication process dates | |
Deposited | 10 Oct 2023 |
Grant ID | 12271098 |
61772005 | |
2020KJN008 |
https://acuresearchbank.acu.edu.au/item/8zv5y/efficient-algorithm-for-the-k-means-problem-with-must-link-and-cannot-link-constraints
Download files
Publisher's version
OA_Jia_2023_Efficient_algorithm_for_the_k-means.pdf | |
License: CC BY 4.0 | |
File access level: Open |
42
total views49
total downloads4
views this month1
downloads this month