Interpreting Contrastive Embeddings in Specific Domains with Fuzzy Rules

2026-03-12 Symbolic Computation

Symbolic Computation, Machine Learning
AI summary

The authors explore how to turn free-form text, like medical reports and movie reviews, into useful data for computer analysis. They use a model called CLIP, which links text and images, and combine it with fuzzy rule-based classification to better understand important features in the text. Their method is tested on clinical and film data, showing how the rules connect different features. They also talk about the approach's limits and possible improvements.

free-form text, natural language processing, CLIP model, fuzzy rule-based classification, vector representation, zero-shot learning, few-shot learning, clinical reports, film reviews
Authors
Javier Fumanal-Idocin, Mohammadreza Jamalifard, Javier Andreu-Perez
Abstract
Free-style text is still one of the common ways in which data is registered in real environments, like legal procedures and medical records. Because of that, there have been significant efforts in the area of natural language processing to convert these texts into a structured format, which standard machine learning methods can then exploit. One of the most popular methods to embed text into a vectorial representation is the Contrastive Language-Image Pre-training model (CLIP), which was trained using both images and text. Although the representations computed by CLIP have been very successful in zero-shot and few-shot learning problems, they still have problems when applied to a particular domain. In this work, we use a fuzzy rule-based classification system along with some standard text processing techniques to map some of our features of interest to the space created by a CLIP model. Then, we discuss the rules and associations obtained and the importance of each feature considered. We apply this approach in two different data domains, clinical reports and film reviews, and compare the results obtained individually and when considering both. Finally, we discuss the limitations of this approach and how it could be further improved.
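
As a rough illustration of what a fuzzy rule-based classifier does, the sketch below evaluates a handful of hand-written rules over two numeric features. It is not the authors' implementation: the features (assumed here to be values already normalized to [0, 1], for example similarities between a text embedding and a feature of interest), the linguistic labels, the rule base, the rule weights, and the class names "A" and "B" are all hypothetical and chosen only to make the example runnable.

```python
from typing import Dict, List, Tuple

# Hypothetical linguistic labels over a feature normalized to [0, 1].
# "low" peaks at 0, "medium" at 0.5, "high" at 1 (piecewise-linear memberships).
LABELS = {
    "low": lambda x: max(0.0, 1.0 - 2.0 * x),
    "medium": lambda x: max(0.0, 1.0 - abs(2.0 * x - 1.0)),
    "high": lambda x: max(0.0, 2.0 * x - 1.0),
}

# A rule is (antecedent, consequent class, rule weight).
# The antecedent maps a feature index to the linguistic label it must match.
Rule = Tuple[Dict[int, str], str, float]


def firing_strength(rule: Rule, x: List[float]) -> float:
    """Product t-norm of the antecedent memberships, scaled by the rule weight."""
    antecedent, _, weight = rule
    strength = weight
    for idx, label in antecedent.items():
        strength *= LABELS[label](x[idx])
    return strength


def classify(rules: List[Rule], x: List[float]) -> Tuple[str, float]:
    """Single-winner inference: return the class of the strongest-firing rule."""
    best = max(rules, key=lambda r: firing_strength(r, x))
    return best[1], firing_strength(best, x)


# Illustrative rule base (invented, not taken from the paper).
RULES: List[Rule] = [
    ({0: "high", 1: "low"}, "A", 1.0),   # IF f0 is high AND f1 is low THEN A
    ({0: "low", 1: "high"}, "B", 1.0),   # IF f0 is low AND f1 is high THEN B
    ({0: "medium"}, "A", 0.5),           # IF f0 is medium THEN A (weaker rule)
]

if __name__ == "__main__":
    print(classify(RULES, [0.9, 0.1]))
    print(classify(RULES, [0.1, 0.9]))
```

Because each prediction is traced back to one readable rule and its firing strength, a rule base of this kind can be inspected directly, which is the interpretability property the abstract relies on.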