Bringing Clustering to MLL: Weakly-Supervised Clustering for Partial Multi-Label Learning
2026-04-10 • Machine Learning
AI summary
The authors address the problem of noisy labels in partial multi-label learning, where each data point's candidate label set may contain both correct and incorrect labels. They introduce a clustering method that splits the cluster membership matrix into two components, one that keeps the usual clustering constraints and one that keeps the multi-label structure, so that clustering can be applied to multi-label data despite label noise. Their approach combines unsupervised clustering with confidence-based weak supervision and refines the clusters iteratively. Experiments on 24 datasets show that their method outperforms six state-of-the-art methods.
multi-label learning, label noise, partial multi-label learning, clustering, membership matrix, weak supervision, iterative optimization, prototype learning
Authors
Yu Chen, Weijun Lv, Yue Huang, Xuhuan Zhu, Fang Li
Abstract
Label noise in multi-label learning (MLL) poses significant challenges for model training, particularly in partial multi-label learning (PML) where candidate labels contain both relevant and irrelevant labels. While clustering offers a natural approach to exploit data structure for noise identification, traditional clustering methods cannot be directly applied to multi-label scenarios due to a fundamental incompatibility: clustering produces membership values that sum to one per instance, whereas multi-label assignments require binary values that can sum to any number. We propose a novel weakly-supervised clustering approach for PML (WSC-PML) that bridges clustering and multi-label learning through membership matrix decomposition. Our key innovation decomposes the clustering membership matrix $\mathbf{A}$ into two components: $\mathbf{A} = \boldsymbol{\Pi} \odot \mathbf{F}$, where $\boldsymbol{\Pi}$ maintains clustering constraints while $\mathbf{F}$ preserves multi-label characteristics. This decomposition enables seamless integration of unsupervised clustering with multi-label supervision for effective label noise handling. WSC-PML employs a three-stage process: initial prototype learning from noisy labels, adaptive confidence-based weak supervision construction, and joint optimization via iterative clustering refinement. Extensive experiments on 24 datasets demonstrate that our approach outperforms six state-of-the-art methods across all evaluation metrics.
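To make the decomposition concrete, the following is a small numerical sketch of the Hadamard product $\mathbf{A} = \boldsymbol{\Pi} \odot \mathbf{F}$ on toy data. It is an illustration under assumptions, not the WSC-PML algorithm: the helper name `decompose_demo`, the softmax used to build a row-normalized $\boldsymbol{\Pi}$, and the use of the candidate-label mask as $\mathbf{F}$ are all hypothetical choices that do not come from the paper. The sketch only shows how one factor can satisfy a sum-to-one constraint per instance while the other stays binary with an arbitrary number of active labels.

```python
import numpy as np


def decompose_demo(scores, candidate_mask):
    """Toy illustration of A = Pi * F (element-wise); not the paper's method.

    scores:         (n, q) real-valued affinities (hypothetical input)
    candidate_mask: (n, q) binary matrix of candidate labels
    """
    # Pi: row-wise softmax, so each row sums to one (clustering-style constraint).
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    pi = np.exp(shifted)
    pi /= pi.sum(axis=1, keepdims=True)

    # F: binary indicators; each row may contain any number of ones.
    f = candidate_mask.astype(float)

    # A: Hadamard (element-wise) product of the two factors.
    a = pi * f
    return pi, f, a


# Two instances, three labels; each instance has two candidate labels.
scores = np.array([[2.0, 1.0, 0.0],
                   [0.0, 0.0, 3.0]])
mask = np.array([[1, 1, 0],
                 [0, 1, 1]])

pi, f, a = decompose_demo(scores, mask)
print("row sums of Pi:", pi.sum(axis=1))
print("row sums of F: ", f.sum(axis=1))
print("A =\n", a)
```

In this toy setting, entries of $\mathbf{A}$ are zero wherever $\mathbf{F}$ is zero and equal to the corresponding entry of $\boldsymbol{\Pi}$ elsewhere, so the row sums of $\mathbf{A}$ need not equal one.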