Pathologists’ Routine Fixations Can Be Used to Supervise Lymph Node Deep Learning Models
Poster Presentation: Sunday, May 18, 2025, 2:45 – 6:45 pm, Pavilion
Session: Eye Movements: Perception, fixational eye movements
Meng Ling1, Veronica Thai1, Shuning Jiang1, Rui Li1, Jeremy Wolfe2, Wei-Lun Chao1, Yan Hu1, Anil Parwani1, Raghu Machiraju1, Srinivasan Parthasarathy1, Zaibo Li1, Jian Chen1; 1Ohio State University, 2Harvard University
Locating cancerous tissue in large, high-resolution whole-slide images (WSIs) is hindered by a lack of training data to supervise deep convolutional neural network (DCNN) algorithms. Patch-based human annotation is time- and labor-intensive, and DCNN training would further benefit from diverse stimuli drawn from routine clinical settings. Expert pathologists know where to look in gigapixel WSIs to locate cancerous tissue, so harvesting their routine examinations could yield a theoretically unlimited supply of training samples. To test the reliability of this idea, we collected eye-tracking data from 10 pathologists, each viewing 60 slides from the CAMELYON16 dataset. These data were fed into DeepPFNet, our automated human-intelligence-based data-preparation pipeline for supervising AI to identify tumors. Specifically, we computed a pathologist fixation map (PFMap) over each WSI and trained a DCNN on tumor tiles sampled from these maps and benign tiles sampled from the tissue areas of benign slides. We validated DeepPFNet in experiments examining effectiveness and scalability. Models trained with DeepPFNet achieved significantly higher accuracy than random sampling (F1 = 0.84, AUC = 0.91), and increasing the number of sampled slides led to significant improvement (ΔF1 = 0.08, ΔAUC = 0.13). DeepPFNet models were more accurate (F1 = 0.84, AUC = 0.93) than those using clustering (F1 = 0.70, AUC = 0.82) or viewport (F1 = 0.68, AUC = 0.88) approaches. Using the DeepPFNet model to classify tiles from the training WSIs and expand the sampling maps significantly improved the pipeline (ΔF1 = 0.04, ΔAUC = 0.03). Fixation-based fine-tuning of weakly supervised learning improved slide-level classification accuracy (ΔF1 = 0.01, ΔAUC = 0.01). Finally, applying PFMaps to benign slides to sample benign tiles improved sensitivity (Δsensitivity = 0.04) but decreased accuracy (ΔF1 = -0.06, ΔAUC = -0.03).
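To make the PFMap sampling step concrete, below is a minimal Python sketch of one way duration-weighted fixations could be accumulated into a smoothed density map over a WSI and used to select candidate tumor tiles. It is not the authors' implementation: the function names (build_pfmap, sample_tiles) and all parameter values (the Gaussian sigma, tile size, and selection threshold) are illustrative assumptions, as the abstract does not specify how DeepPFNet converts fixation density into tile selections.

# Minimal sketch (not the authors' code): turn duration-weighted fixations
# into a smoothed density map and pick tiles under it. All parameter values
# (sigma_px, tile size, threshold) are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_filter

def build_pfmap(fixations, wsi_shape, sigma_px=32):
    """Accumulate (x, y, duration_ms) fixations into a normalized density map."""
    pfmap = np.zeros(wsi_shape, dtype=np.float32)
    for x, y, dur in fixations:
        r, c = int(round(y)), int(round(x))
        if 0 <= r < wsi_shape[0] and 0 <= c < wsi_shape[1]:
            pfmap[r, c] += dur          # weight each fixation by its duration
    pfmap = gaussian_filter(pfmap, sigma=sigma_px)  # spread points into a map
    if pfmap.max() > 0:
        pfmap /= pfmap.max()            # normalize to [0, 1]
    return pfmap

def sample_tiles(pfmap, tile=256, thresh=0.5):
    """Return top-left corners of non-overlapping tiles whose peak PFMap
    value reaches thresh (one simple selection policy among many)."""
    h, w = pfmap.shape
    return [(r, c)
            for r in range(0, h - tile + 1, tile)
            for c in range(0, w - tile + 1, tile)
            if pfmap[r:r + tile, c:c + tile].max() >= thresh]

# Toy usage on a synthetic 1024 x 1024 "slide" with three fixations:
# the clustered pair dominates the normalized map, so its tile is selected.
fixations = [(300, 400, 250.0), (320, 410, 400.0), (900, 150, 120.0)]
pfmap = build_pfmap(fixations, wsi_shape=(1024, 1024))
print(sample_tiles(pfmap))

Thresholding on each tile's peak density is only one plausible policy; sampling proportionally to density, or weighting by viewing magnification, would be equally consistent with the pipeline as described.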
Acknowledgements: OSU Translational Data Analytics Institute (TDAI) Research Pilot Award