ProtoSSL: Interpretable Prototype Learning from Unlabeled Time-Series Data

AI in healthcare
Published: arXiv: 2605.06943v1
Authors

Steven Song, Sahil Sethi, Brett Beaulieu-Jones, Robert L. Grossman

Abstract

In time-series domains where both predictive performance and interpretability are essential, deep neural networks achieve strong results but provide limited insight into how their predictions are made. Projection-based prototype networks address this limitation by grounding predictions in similarity to representative training examples, enabling case-based explanations and global prototype inspection. However, existing approaches rely on label supervision, tying prototypes to a specific task and requiring large labeled datasets. We introduce ProtoSSL, a novel framework for learning interpretable, projection-based prototypes from unlabeled time-series data and adapting them to downstream tasks. Our key idea is to separate motif discovery from label alignment. ProtoSSL first learns a reusable prototype bank using a self-supervised objective applied directly to prototype activations, and then aligns these prototypes to downstream tasks through an efficient assignment procedure. Across six electrocardiography (ECG) datasets, ProtoSSL improves label efficiency, outperforming supervised prototype baselines in low-data regimes with as few as 256 labeled examples; with fine-tuning, ProtoSSL outperforms supervised prototype baselines at full dataset scale. In a human evaluation study, ProtoSSL produces prototypes and prototype-based explanations that are judged more favorably than those learned with direct label supervision. We further show that the framework extends to audio classification. Thus, ProtoSSL enables both learning generalizable prototypes from unlabeled data before the downstream label space is known, and subsequent assignment of interpretable, projection-grounded prototypes to new time-series tasks.
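To make "projection-based" concrete, the following is a minimal sketch of a prototype projection step as it is commonly done in prototype networks: each learned prototype vector is replaced by its nearest latent segment from the training set, so every prototype can be traced back to a real example. The function name, array shapes, and the squared Euclidean distance are assumptions for illustration; the abstract does not specify how ProtoSSL performs projection.

```python
import numpy as np

def project_prototypes(prototypes, train_latents):
    """Snap each prototype to its nearest training latent segment.

    Hypothetical helper, not taken from the paper.
    prototypes:    (P, D) learned prototype vectors
    train_latents: (N, T, D) encoder outputs, T segments per training series
    Returns the projected prototypes (P, D) and, for each prototype,
    the (series index, segment index) it was projected onto, shape (P, 2).
    """
    n, t, d = train_latents.shape
    flat = train_latents.reshape(n * t, d)
    # Squared Euclidean distance between every prototype and every segment.
    dists = ((prototypes[:, None, :] - flat[None, :, :]) ** 2).sum(axis=-1)
    nearest = dists.argmin(axis=1)  # (P,) index into the flattened segments
    sources = np.stack(np.unravel_index(nearest, (n, t)), axis=1)
    return flat[nearest], sources
```

The returned `sources` array is what would support a case-based explanation: it identifies the training series and segment that each prototype stands for.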

Paper Summary

Problem
Deep learning models are widely used in time-series applications like medical waveform analysis, audio classification, and human-activity recognition. However, these models often lack interpretability, making it difficult to understand how they make predictions. This is a significant problem because interpretability is essential in many domains, such as medicine, where decisions have a direct impact on people's lives.
Key Innovation
The researchers introduce ProtoSSL, a framework for learning interpretable, projection-based prototypes from unlabeled time-series data. It separates motif discovery from label alignment: a prototype bank is first learned with a self-supervised objective applied directly to prototype activations, and the prototypes are then aligned to a downstream task through an efficient assignment procedure. Because prototype learning no longer depends on task-specific labels, the bank can be learned before the downstream label space is known and reused across tasks without retraining.
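The second stage, aligning an already-learned prototype bank to a labeled task, can be sketched as follows. This is only an illustration under stated assumptions: activations are taken to be the maximum cosine similarity over time, and each prototype is assigned to the class with the highest mean activation among the labeled examples. The summary does not describe the paper's actual assignment procedure, so this simple rule is a stand-in, and all function names are hypothetical.

```python
import numpy as np

def prototype_activations(latents, prototypes, eps=1e-8):
    """Max cosine similarity over time between each series and each prototype.

    latents: (N, T, D) encoder outputs; prototypes: (P, D). Returns (N, P).
    """
    z = latents / (np.linalg.norm(latents, axis=-1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    sims = z @ p.T              # (N, T, P) cosine similarities
    return sims.max(axis=1)     # strongest match along the time axis

def assign_prototypes(activations, labels, num_classes):
    """Assumed rule: give each prototype to the class that activates it most.

    Requires at least one labeled example per class.
    """
    class_means = np.stack(
        [activations[labels == c].mean(axis=0) for c in range(num_classes)]
    )                                   # (C, P)
    return class_means.argmax(axis=0)   # (P,) class index per prototype

def predict(activations, assignment, num_classes):
    """Score each class by summing the activations of its assigned prototypes."""
    scores = np.stack(
        [activations[:, assignment == c].sum(axis=1) for c in range(num_classes)],
        axis=1,
    )                                   # (N, C)
    return scores.argmax(axis=1)
```

Note that the prototype vectors themselves are never updated here; only the prototype-to-class mapping is computed from the labeled examples, which is what lets the same bank be reused for a different label space.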
Practical Impact
ProtoSSL improves label efficiency by learning prototypes from unlabeled data. Across six ECG datasets, it outperforms supervised prototype baselines in low-data regimes with as few as 256 labeled examples, and with fine-tuning it also outperforms them at full dataset scale. Because the prototype bank is learned once and then aligned to each task, it can be reused across tasks rather than retrained, which helps most where labels are scarce or expensive to obtain. In a human evaluation study, its prototypes and prototype-based explanations were also judged more favorably than those learned with direct label supervision.
Analogy / Intuitive Explanation
Think of ProtoSSL as a reference library of characteristic patterns, each one grounded in a real example from the training data. The library is compiled before anyone knows which tasks it will serve. When a new task arrives, the assignment procedure attaches that task's labels to the existing patterns, much as a librarian files existing books under a new catalogue's categories instead of writing new books. A prediction is then explained by pointing to the patterns that a new signal most resembles.
Paper Information
Categories: cs.LG
arXiv ID: 2605.06943v1
