Head Tail Open: Open Tailed Classification of Imbalanced Document Data
Joshi, Chetan
23 April 2024
Deep learning models for scanned document image classification and form understanding have made significant progress in the last few years. With copious amounts of labelled training data, a model can achieve high accuracy in closed-world classification. However, very little work has been done on fine-grained, head-tailed open-world classification for documents, where "head-tailed" refers to class imbalance in which some classes have many data points and others have few. Our proposed method achieves better classification results than the baseline on the head-tail-novel/open dataset. Our techniques include separating the head and tail classes and transferring knowledge from the head data to the tail data. This transfer of knowledge also improves the ability to recognize a novel category by 15% compared to the baseline.
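As an illustration of the head/tail separation mentioned above, the following is a minimal sketch of one way to partition classes by sample count. It is not the thesis's implementation: the function name `split_head_tail`, the `min_head_count` threshold, and the example class names are all assumptions made for this example.

```python
from collections import Counter


def split_head_tail(labels, min_head_count=100):
    """Partition class names into head (frequent) and tail (rare) sets.

    `min_head_count` is a hypothetical cutoff chosen for illustration;
    the abstract does not state how head and tail classes are separated.
    """
    counts = Counter(labels)
    # Classes with at least `min_head_count` samples are treated as head classes.
    head = {cls for cls, n in counts.items() if n >= min_head_count}
    # Every remaining class is treated as a tail class.
    tail = set(counts) - head
    return head, tail


# Illustrative, made-up label distribution for a document dataset.
labels = ["invoice"] * 500 + ["letter"] * 300 + ["memo"] * 12 + ["resume"] * 5
head, tail = split_head_tail(labels, min_head_count=100)
print(sorted(head), sorted(tail))
```

A fixed count threshold is only one possible criterion; a split based on a fraction of the dataset or on the rank of each class by frequency would fit the same interface.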