Published on: 12 January 2025
Deep learning, particularly Convolutional Neural Networks (CNNs), has revolutionized the field of image-based classification. However, most CNN applications rely on flat classification structures, overlooking the nested relationships inherent in many real-world tasks. This study investigates a hierarchical CNN framework for the taxonomic classification of Mollusca species, leveraging the modularity and flexibility of a “Local Classifier per Parent Node” (LCPN) architecture. Each node in the taxonomic hierarchy (e.g., family, genus, species) is assigned a specialized CNN model, enabling individualized hyperparameter tuning and allowing each model to capture the morphological nuances specific to its target group.
We construct a hierarchy of CNNs, from the phylum level down to individual genera and species, and address dataset imbalances through class weighting, focal loss, data augmentation, and undersampling. Empirical results underscore the effectiveness of our hierarchical architecture, showing improvements in predictive performance and robustness compared to a monolithic, flat classification model. Analyses within the family Cypraeidae reveal that genus-specific morphological features, rather than dataset size or class counts, most strongly affect model accuracy. By reducing classification errors at each taxonomic tier, the hierarchical approach offers finer control and interpretability of classifications, a key advantage for biodiversity monitoring, ecological research, and conchology.
Overall, our findings affirm that a hierarchical CNN framework provides significant gains in accuracy, scalability, and adaptability for complex taxonomic tasks, underscoring its potential for broader applications in biological taxonomy and beyond.
In recent years, Convolutional Neural Networks (CNNs) have emerged as a dominant paradigm in machine learning, particularly for tasks
involving image recognition, object detection, and classification. Their architecture, inspired by the human visual system, enables them
to capture spatial hierarchies in data, making them well suited to extracting features across varying
levels of granularity. While CNNs have demonstrated remarkable success in numerous domains, most applications are designed
to produce flat, single-level predictions, without explicitly leveraging the hierarchical relationships inherent in many
classification tasks.
Hierarchical classification tasks, such as taxonomic classification of species, inherently
involve nested relationships between classes. For example, biological taxonomies organize organisms in a hierarchy of taxonomic
ranks (e.g., class, order, family, and species), while diseases are categorized by systems and subtypes. Exploiting these
hierarchical structures within a classification framework has the potential to improve both the accuracy and interpretability
of machine learning models. Hierarchical CNNs, which either embed hierarchical structures into a single model or use a collection of CNN models
working together hierarchically, represent a promising approach to address this challenge.
The concept of hierarchical CNNs encompasses two primary strategies. The first approach integrates hierarchical relationships directly
into the architecture or loss function of a single CNN, enabling it to model dependencies among different levels of classification.
The second approach involves a collection of CNN models, each specialized to operate at a specific level of the hierarchy, progressively
refining predictions from coarse categories (e.g., order, family) to finer ones (e.g., species). This latter method closely mirrors
hierarchical decision-making, in which general categories are identified first, followed by finer distinctions.
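The second strategy can be sketched as a top-down routing procedure. The snippet below is a minimal illustration under stated assumptions, not the implementation used in this study: `Node`, `predict_path`, and the toy stand-in classifiers (with their confidence values) are hypothetical, and in a real system each node would wrap a trained CNN.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A classifier maps an image to (child label, confidence).
# Here it is any callable; in practice it would wrap a trained CNN.
Classifier = Callable[[object], Tuple[str, float]]


@dataclass
class Node:
    """One parent node in the taxonomy with its own local classifier."""
    name: str
    classifier: Classifier
    children: Dict[str, "Node"] = field(default_factory=dict)


def predict_path(root: Node, image: object) -> List[Tuple[str, float]]:
    """Walk down the hierarchy, asking each parent node's classifier
    which child to descend into."""
    path = []
    node = root
    while node is not None:
        label, confidence = node.classifier(image)
        path.append((label, confidence))
        # Stop when the predicted child has no classifier of its own.
        node = node.children.get(label)
    return path


# Toy stand-ins for trained models (hypothetical outputs).
family_node = Node("Cypraeidae", lambda img: ("Naria", 0.91))
order_node = Node("Littorinimorpha", lambda img: ("Cypraeidae", 0.97),
                  {"Cypraeidae": family_node})
root = Node("Mollusca", lambda img: ("Littorinimorpha", 0.99),
            {"Littorinimorpha": order_node})

path = predict_path(root, image=None)
print(path)
```

Because every step returns its own label and confidence, a low-confidence prediction can be reported at the last reliable rank instead of forcing a species-level answer.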
Hierarchical CNNs have shown potential in domains like taxonomy, where large datasets often contain imbalances across levels of the hierarchy.
This paper explores the application of hierarchical CNNs to the classification of Mollusca. We discuss the advantages of leveraging hierarchical structures and
analyze existing methodologies.
Figure 1: "Flat" classification approach using a multi-class classifier to predict the leaf nodes (Species). The grey area represents the classifier.
Figure 2: The Local Classifier per Parent Node uses a multi-class classifier to predict the nodes at the next level. The grey squares represent the classifiers.
Figure 3: Local Classifier per Level classification approach using a multi-class classifier to classify the nodes at a level. The grey area represents the classifier.
Figure 4: Global classification approach using a single multi-class classifier to predict the nodes at the leaf level while using hierarchy information. The grey squares represent the classifiers.
Hyperparameter | Initial value |
---|---|
Epochs | 50 |
Learning rate | 0.0005 |
Early stopping | Yes |
Batch Size | 64 |
Dropout top layer | 0.20 |
Optimizer | Adam |
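
The table lists early stopping without further detail. As a rough, library-free sketch, the initial values can be collected in a configuration object and early stopping expressed as a patience rule on the validation loss. The `TrainConfig` and `stopping_epoch` names, the monitored quantity, and the `patience` value are assumptions for illustration; the source does not state them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Initial hyperparameters taken from the table above."""
    epochs: int = 50
    learning_rate: float = 0.0005
    batch_size: int = 64
    dropout_top_layer: float = 0.20
    optimizer: str = "adam"
    early_stopping: bool = True
    patience: int = 5  # assumption: not stated in the source


def stopping_epoch(val_losses, config):
    """Return the 1-based epoch at which training would stop."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses[:config.epochs], start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
        if config.early_stopping and epochs_without_improvement >= config.patience:
            return epoch
    # No early stop: training ran for all available epochs.
    return min(len(val_losses), config.epochs)


# Hypothetical validation-loss curve that stops improving after epoch 3.
losses = [0.5, 0.4, 0.3, 0.31, 0.32, 0.33, 0.34, 0.35, 0.36]
print(stopping_epoch(losses, TrainConfig()))
```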
Figure 6.
Name Taxon | Taxonomic level | # Images |
---|---|---|
Archiheterodonta | Subterclass | 1473 |
Arcida | Order | 5567 |
Caenogastropoda incertae sedis | Order | 14023 |
Chitonida | Order | 800 |
Cycloneritida | Order | 8072 |
Dentaliidae | Family | 473 |
Euheterodonta | Subterclass | 16658 |
Euthyneura | Infraclass | 4637 |
Lepetellida | Order | 14463 |
Lepidopleurida | Order | 434 |
Limida | Order | 1408 |
Littorinimorpha | Order | 131468 |
Lottioidea | Superfamily | 1884 |
Mytilida | Order | 4542 |
Neogastropoda | Order | 170973 |
Nuculanoidea | Superfamily | 540 |
Nuculoidea | Superfamily | 1067 |
Patelloidea | Superfamily | 5322 |
Pectinida | Order | 11878 |
Pleurotomariida | Order | 3182 |
Seguenziida | Order | 638 |
Trochida | Order | 20693 |
Name Taxon | Recall | Precision | F1 |
---|---|---|---|
Archiheterodonta | 1.000 | 1.000 | 1.000 |
Arcida | 0.386 | 0.809 | 0.523 |
Caenogastropoda incertae sedis | 0.982 | 0.823 | 0.896 |
Chitonida | 1.000 | 0.428 | 0.600 |
Cycloneritida | 0.958 | 0.884 | 0.920 |
Dentaliidae | 1.000 | 0.500 | 0.666 |
Euheterodonta | 0.947 | 0.831 | 0.885 |
Euthyneura | 1.000 | 0.900 | 0.947 |
Lepetellida | 0.951 | 0.975 | 0.963 |
Lepidopleurida | 1.000 | 1.000 | 1.000 |
Limida | 1.000 | 0.833 | 0.909 |
Littorinimorpha | 0.975 | 0.987 | 0.981 |
Lottioidea | 1.000 | 0.500 | 0.666 |
Mytilida | 0.944 | 0.894 | 0.918 |
Neogastropoda | 0.973 | 0.989 | 0.981 |
Nuculanoidea | 1.000 | 0.166 | 0.285 |
Patelloidea | 0.888 | 0.841 | 0.864 |
Pectinida | 1.000 | 1.000 | 1.000 |
Pleurotomariida | 1.000 | 0.833 | 0.909 |
Seguenziida | 1.000 | 0.800 | 0.888 |
Trochida | 0.958 | 0.985 | 0.971 |
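
The F1 column in the table above is the harmonic mean of precision and recall. The short check below recomputes it for three rows; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (recall, precision) pairs copied from the table above.
rows = {
    "Arcida": (0.386, 0.809),
    "Cycloneritida": (0.958, 0.884),
    "Nuculanoidea": (1.000, 0.166),
}
for taxon, (recall, precision) in rows.items():
    print(taxon, round(f1_score(precision, recall), 3))
```

For these rows the recomputed values agree with the reported F1 scores to three decimals. The Nuculanoidea row shows how perfect recall combined with low precision still gives a low F1.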
Class weights were used, but no other adjustments were made to address the class imbalance. The top 3 layers were unfrozen and a regularization value of 0.0001 was used. All other parameters had default values. The dataset has 424406 images.
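
The source does not say how the class weights were computed. One common choice, assumed here purely for illustration, is inverse-frequency weighting, where each class receives the weight total / (number of classes × class count). The sketch applies it to four of the taxa listed above.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * class_count), so that
    rare classes contribute more to the loss."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


# Image counts for four of the taxa in the dataset table.
counts = {
    "Dentaliidae": 473,
    "Nuculanoidea": 540,
    "Littorinimorpha": 131468,
    "Neogastropoda": 170973,
}
weights = balanced_class_weights(counts)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")
```

With this scheme every class contributes the same total weight to the loss, regardless of how many images it has.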
Total images | Max. per class | Training accuracy | Validation accuracy | Training loss | Validation loss | Dentaliidae F1 | Dentaliidae # images | Nuculanoidea F1 | Nuculanoidea # images | Littorinimorpha F1 | Littorinimorpha # images | Neogastropoda F1 | Neogastropoda # images |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
12235 | 1000 | 0.986 | 0.938 | 0.010 | 0.033 | 0.928 | 473 | 0.914 | 540 | 0.849 | 1000 | 0.870 | 1000 |
26526 | 2500 | 0.989 | 0.959 | 0.009 | 0.025 | 0.914 | 473 | 0.980 | 540 | 0.943 | 2500 | 0.897 | 2500 |
44485 | 5000 | 0.983 | 0.964 | 0.012 | 0.022 | 1.000 | 473 | 0.923 | 540 | 0.952 | 5000 | 0.948 | 5000 |
68895 | 10000 | 0.985 | 0.971 | 0.012 | 0.018 | 1.000 | 473 | 0.870 | 540 | 0.967 | 10000 | 0.930 | 10000 |
103489 | 25000 | 0.980 | 0.971 | 0.012 | 0.019 | 1.000 | 473 | 0.875 | 540 | 0.961 | 25000 | 0.963 | 25000 |
No undersampling | - | 0.960 | 0.964 | 0.030 | 0.033 | 0.666 | 473 | 1.000 | 540 | 0.981 | 131468 | 0.981 | 170973 |
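
Undersampling with a maximum number of images per class, as in the table above, can be sketched as follows. This is an illustrative version only: the function name, the random selection, the fixed seed, and the toy file names are assumptions, and the sketch does not attempt to reproduce the totals reported in the table.

```python
import random
from collections import defaultdict


def undersample(samples, max_per_class, seed=0):
    """Keep at most `max_per_class` randomly chosen items per label.

    `samples` is a list of (item, label) pairs.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for item, label in samples:
        by_label[label].append(item)

    kept = []
    for label, items in by_label.items():
        if len(items) > max_per_class:
            items = rng.sample(items, max_per_class)
        kept.extend((item, label) for item in items)
    return kept


# Toy data: one large class and one small class (hypothetical file names).
samples = [(f"neo_{i}.jpg", "Neogastropoda") for i in range(5000)]
samples += [(f"den_{i}.jpg", "Dentaliidae") for i in range(473)]
reduced = undersample(samples, max_per_class=1000)
print(len(reduced))
```

Classes below the cap, such as Dentaliidae with 473 images, are left untouched, which is why their image counts stay constant across the rows of the table.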
Total images | Max. per class | Training accuracy | Validation accuracy | Training loss | Validation loss | Dentaliidae F1 | Dentaliidae # images | Nuculanoidea F1 | Nuculanoidea # images | Littorinimorpha F1 | Littorinimorpha # images | Neogastropoda F1 | Neogastropoda # images |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
44485 | 5000 | 0.991 | 0.970 | 0.008 | 0.019 | 1.000 | 473 | 0.952 | 540 | 0.984 | 3000 | 0.937 | 3000 |
103489 | 25000 | 0.989 | 0.976 | 0.013 | 0.016 | 1.000 | 473 | 1.000 | 540 | 0.977 | 15000 | 0.970 | 15000 |
133389 | 50000 | 0.982 | 0.975 | 0.023 | 0.018 | 0.666 | 473 | 1.000 | 540 | 0.982 | 30000 | 0.979 | 30000 |
Genus | # images | # species | Train. acc. (initial) | Val. acc. (initial) | Train. loss (initial) | Val. loss (initial) | Learning rate (optimized) | Top layer dropout (optimized) | Unfrozen top layers (optimized) | Regularization (optimized) | Train. acc. (fine-tuned) | Val. acc. (fine-tuned) | Train. loss (fine-tuned) | Val. loss (fine-tuned) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Austrasiatica | 1396 | 3 | 0.942 | 0.943 | 0.248 | 0.203 | 0.0005 | 0.2 | - | - | 0.942 | 0.943 | 0.248 | 0.203 |
Bistolida | 4440 | 11 | 0.837 | 0.837 | 0.266 | 0.075 | 0.001 | 0.2 | 20 | 0.01 | 0.965 | 0.919 | 0.087 | 0.085 |
Cypraeovula | 3830 | 14 | 0.885 | 0.876 | 0.867 | 0.885 | 0.0005 | 0.2 | 3 | 0.0001 | 0.956 | 0.914 | 0.266 | 0.289 |
Eclogavena | 2110 | 5 | 0.892 | 0.879 | 0.485 | 0.376 | 0.00025 | 0.2 | 3 | - | 0.959 | 0.950 | 0.182 | 0.189 |
Erronea | 9338 | 16 | 0.860 | 0.892 | 1.122 | 0.362 | 0.0005 | 0.2 | 3 | 0.0001 | 0.953 | 0.935 | 0.324 | 0.211 |
Leporicypraea | 2394 | 4 | 0.913 | 0.914 | 0.546 | 0.244 | 0.0005 | 0.2 | 3 | 0.0001 | 0.968 | 0.960 | 0.155 | 0.126 |
Lyncina | 9210 | 11 | 0.912 | 0.934 | 1.279 | 0.226 | 0.0005 | 0.2 | 3 | 0.0001 | 0.978 | 0.964 | 0.291 | 0.123 |
Mauritia | 8299 | 8 | 0.815 | 0.853 | 0.125 | 0.051 | 0.0005 | 0.2 | 3 | 0.0001 | 0.953 | 0.919 | 0.024 | 0.031 |
Melicerona | 623 | 2 | 0.938 | 0.863 | 0.230 | 0.293 | 0.00025 | 0.25 | 3 | 0.0001 | 0.96 | 0.903 | 0.157 | 0.241 |
Naria | 23518 | 24 | 0.879 | 0.925 | 0.924 | 0.296 | 0.001 | 0.1 | 3 | 0.0001 | 0.987 | 0.965 | 0.018 | 0.026 |
Notocypraea | 2230 | 5 | 0.825 | 0.796 | 0.659 | 0.559 | 0.001 | 0.2 | 3 | 0.001 | 0.959 | 0.888 | 0.167 | 0.408 |
Palmadusta | 7058 | 13 | 0.947 | 0.945 | 0.278 | 0.175 | 0.0005 | 0.2 | - | - | 0.947 | 0.945 | 0.278 | 0.175 |
Pseudozonaria | 1753 | 4 | 0.962 | 0.980 | 0.152 | 0.089 | 0.0005 | 0.2 | - | - | 0.962 | 0.980 | 0.152 | 0.089 |
Purpuradusta | 2575 | 8 | 0.870 | 0.849 | 1.233 | 0.533 | 0.0005 | 0.2 | 3 | 0.0001 | 0.932 | 0.882 | 0.663 | 0.650 |
Pustularia | 4332 | 10 | 0.837 | 0.848 | 0.854 | 0.471 | 0.0005 | 0.2 | 20 | 0.01 | 0.967 | 0.916 | 0.050 | 0.060 |
Talparia | 1075 | 2 | 0.949 | 0.977 | 0.222 | 0.086 | 0.0005 | 0.2 | - | - | 0.949 | 0.977 | 0.222 | 0.086 |
Umbilia | 2898 | 7 | 0.846 | 0.889 | 0.947 | 0.341 | 0.0005 | 0.2 | 3 | - | 0.941 | 0.941 | 0.408 | 0.213 |
Zoila | 7036 | 27 | 0.855 | 0.848 | 1.264 | 0.502 | 0.0005 | 0.2 | 3 | 0.0001 | 0.977 | 0.916 | 0.231 | 0.299 |
Using the same parameters for the first training (see default parameters), the results vary widely, from a training accuracy of 0.815 for the Mauritia model to 0.962 for the Pseudozonaria model. Optimizing the hyperparameters and further fine-tuning improved the accuracy of every model to which they were applied; after this step, all models have a validation accuracy of at least 0.88 (the lowest being 0.882 for Purpuradusta). When the number of images or the number of species (classes) in the dataset was plotted against the accuracy, no correlation was found. The size of the dataset therefore cannot explain the variation in accuracy among the models.
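A check of this kind can be reproduced from the table above. The source does not state which statistic or which accuracy column was used, so the sketch below makes two assumptions: it uses the Pearson correlation coefficient, and it uses the validation accuracy of the fine-tuned models.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# (# images, # species, fine-tuned validation accuracy) per genus,
# copied from the table above.
genera = {
    "Austrasiatica": (1396, 3, 0.943),
    "Bistolida": (4440, 11, 0.919),
    "Cypraeovula": (3830, 14, 0.914),
    "Eclogavena": (2110, 5, 0.950),
    "Erronea": (9338, 16, 0.935),
    "Leporicypraea": (2394, 4, 0.960),
    "Lyncina": (9210, 11, 0.964),
    "Mauritia": (8299, 8, 0.919),
    "Melicerona": (623, 2, 0.903),
    "Naria": (23518, 24, 0.965),
    "Notocypraea": (2230, 5, 0.888),
    "Palmadusta": (7058, 13, 0.945),
    "Pseudozonaria": (1753, 4, 0.980),
    "Purpuradusta": (2575, 8, 0.882),
    "Pustularia": (4332, 10, 0.916),
    "Talparia": (1075, 2, 0.977),
    "Umbilia": (2898, 7, 0.941),
    "Zoila": (7036, 27, 0.916),
}
images, species, accuracy = zip(*genera.values())
r_images = pearson(images, accuracy)
r_species = pearson(species, accuracy)
print(f"images vs. accuracy:  r = {r_images:.3f}")
print(f"species vs. accuracy: r = {r_species:.3f}")
```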
This study demonstrates that hierarchical convolutional neural networks (CNNs) are a robust tool for identifying Mollusca species. Hierarchical classification can be implemented through diverse architectures, ranging from a single, end-to-end CNN that refines predictions from coarse taxonomic levels (e.g., class) down to fine-grained distinctions (e.g., species), to an ensemble of interconnected CNNs in which each model is specialized for a specific node within the taxonomic hierarchy. In this work, we pursued the latter approach, constructing a hierarchical network of independent CNNs. As detailed in previous sections, this architectural decision was driven by its advantages in modularity, optimization, and feature specificity. The strong performance of our models supports this choice and underscores the potential of hierarchical CNNs for accurate and efficient Mollusca classification.
The variation in performance across taxonomic groups, particularly within the family Cypraeidae, illustrates the benefits of the chosen architecture. Because each node in the hierarchy has a dedicated model, hyperparameters can be tuned separately at each level of classification, so that each model is suited to the specific challenges posed by its target taxonomic group. This strategy leads to better outcomes than a single, "flat" model attempting to classify all Mollusca species simultaneously. In such a monolithic approach, hyperparameters that are optimal for one group might prove detrimental to another, hindering overall accuracy.
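Per-node tuning of this kind can be represented as a set of shared defaults plus overrides for individual nodes. The sketch below is illustrative only: `NodeConfig` and `config_for` are hypothetical names, the values are copied from the per-genus table, and a "-" in that table is read here as "no layers unfrozen, no regularization", which is an assumption.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class NodeConfig:
    """Hyperparameters for the model attached to one node."""
    learning_rate: float = 0.0005
    top_layer_dropout: float = 0.2
    unfrozen_top_layers: int = 0           # assumption: "-" means none
    regularization: Optional[float] = None  # assumption: "-" means none


DEFAULT = NodeConfig()

# Overrides for three genera, taken from the per-genus table.
# Genera not listed here fall back to the defaults.
OVERRIDES = {
    "Bistolida": dict(learning_rate=0.001, unfrozen_top_layers=20,
                      regularization=0.01),
    "Melicerona": dict(learning_rate=0.00025, top_layer_dropout=0.25,
                       unfrozen_top_layers=3, regularization=0.0001),
    "Naria": dict(learning_rate=0.001, top_layer_dropout=0.1,
                  unfrozen_top_layers=3, regularization=0.0001),
}


def config_for(genus: str) -> NodeConfig:
    """Return the defaults with any genus-specific overrides applied."""
    return replace(DEFAULT, **OVERRIDES.get(genus, {}))


print(config_for("Naria"))
print(config_for("Palmadusta"))
```

A flat model has only one such configuration for all classes, which is the limitation described above.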
Furthermore, the adoption of a hierarchical network of specialized models allows each CNN to focus on learning the distinct morphological characteristics relevant to its assigned genus. Our findings suggest that the presence of these genus-specific features is a more significant determinant of model performance than either the sheer size of the dataset or the total number of classes within a given node. This implies that even within a complex and diverse group like Mollusca, a well-structured hierarchy can leverage subtle, genus-level morphological differences to achieve accurate identification.
The success of our hierarchical CNN approach has significant implications for the field of conchology and biodiversity research. It provides a framework for developing automated, accurate, and efficient tools for species identification, which can be invaluable for tasks such as biodiversity monitoring, ecological surveys, conservation efforts, and shell collection. Investigating the interpretability of the learned features within each model could provide further insights into the specific morphological characteristics that distinguish different Mollusca genera and species.