LizardLens: A Two-Stage Deep Learning Pipeline for Detecting and Classifying Similar Species in Visually Complex Environments
bioRxiv preprint
Community science platforms like iNaturalist generate unprecedented volumes of biodiversity data, but their scientific utility depends critically on accurate species identification, a persistent challenge when contributors often lack taxonomic expertise. We developed LizardLens, a two-stage machine learning pipeline that decouples object detection from species classification to enable fine-grained identification of morphologically similar organisms in visually complex field photographs. Using 10,000 verified iNaturalist images of five Anolis lizard species in Florida, we trained specialized YOLO-based detection and Swin Transformer classification models and compared their performance against state-of-the-art single-stage architectures. Our two-stage pipeline achieved 83.0% Top-1 accuracy and a macro-averaged F1-score of 89.0%, indicating strong precision-recall performance across species, and outperformed single-stage YOLOv8 and YOLOv12 models on all evaluation metrics for all species, with relative improvements ranging from 10.5% to 13.2%. Gradient-weighted Class Activation Mapping (Grad-CAM) indicated that the models' predictions were consistently associated with regions corresponding to diagnostic morphological features (e.g., head shape, feet, and limb lengths) and pattern features (e.g., ocular rings and body patterning), providing evidence that LizardLens leverages biologically relevant visual cues consistent with those used by expert taxonomists. Error analysis identified partial occlusion and multiple proximate individuals as the primary sources of missed detections, while spurious detections of lizard-like environmental features (e.g., sticks, bark) were the dominant false-positive error mode. We deployed LizardLens as an accessible web application featuring interactive bounding box correction and ranked species predictions with confidence scores, directly supporting the Lizards on the Loose middle school community science initiative.
By combining technical advances in fine-grained visual classification with user-centered design, LizardLens demonstrates how machine learning can simultaneously enhance data quality for biodiversity monitoring and provide authentic scientific experiences for student participants. Our approach is generalizable to other small-bodied organisms in complex habitats and provides a framework for translating computer vision advances into practical tools for community science and conservation.
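The decoupling of detection from classification described above can be illustrated with a minimal sketch. This is not the LizardLens implementation: the `identify` function, the `Detection` and `Identification` types, the toy detector and classifier, the `min_score` threshold, and the placeholder labels `species_a` to `species_c` are all assumptions made for illustration. In a real pipeline the two callables would wrap a trained YOLO-based detector and a Swin Transformer classifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in pixels, x1/y1 exclusive
Image = List[List[int]]           # toy grayscale image: rows of pixel values


@dataclass
class Detection:
    box: Box
    score: float                  # detector confidence in [0, 1]


@dataclass
class Identification:
    box: Box
    detection_score: float
    ranked_species: List[Tuple[str, float]]  # most probable species first


def crop(image: Image, box: Box) -> Image:
    """Cut the detected region out of the full photograph."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def identify(
    image: Image,
    detect: Callable[[Image], List[Detection]],
    classify: Callable[[Image], Dict[str, float]],
    min_score: float = 0.5,       # hypothetical detection threshold
    top_k: int = 3,
) -> List[Identification]:
    """Stage 1 finds candidate boxes; stage 2 classifies each crop."""
    results = []
    for det in detect(image):
        if det.score < min_score:
            continue              # skip low-confidence candidates
        probs = classify(crop(image, det.box))
        ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
        results.append(Identification(det.box, det.score, ranked[:top_k]))
    return results


# Toy stand-ins for the trained models (hypothetical, for demonstration only).
def toy_detect(image: Image) -> List[Detection]:
    return [Detection((0, 0, 2, 2), 0.9), Detection((2, 2, 4, 4), 0.2)]


def toy_classify(patch: Image) -> Dict[str, float]:
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    if mean > 100:
        return {"species_a": 0.7, "species_b": 0.2, "species_c": 0.1}
    return {"species_a": 0.1, "species_b": 0.6, "species_c": 0.3}


image = [
    [200, 200, 0, 0],
    [200, 200, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
results = identify(image, toy_detect, toy_classify)
for r in results:
    print(r.box, r.detection_score, r.ranked_species)
```

Because the classifier only ever sees the cropped region, it can be trained and evaluated separately from the detector, and a user-corrected bounding box could be passed straight to the second stage without rerunning the first.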