Workshop
8th Workshop on Computer Vision for Microscopy Image Analysis
Mei Chen · Daniel J. Hoeppner · Dimitris N. Metaxas · Steve Finkbeiner
East 10
Keywords: CV + X: Biomedical
High-throughput microscopy enables researchers to acquire thousands of images automatically in a matter of hours, making it possible to conduct large-scale, image-based experiments for biological discovery. The main challenge and bottleneck in such experiments is the conversion of “big visual data” into interpretable information, and hence into discoveries. Visual analysis of large-scale image data is a daunting task: cells need to be located and their phenotypes (e.g., shape) described; the behaviors of cell components, cells, or groups of cells need to be analyzed; and cell lineages need to be traced. Not only do computers have more “stamina” than human annotators for such tasks, they also perform analyses that are more reproducible and less subjective. The post-acquisition component of high-throughput microscopy experiments therefore calls for effective and efficient computer vision techniques.
This workshop will bring together computer vision experts from academia, industry, and government who have made progress in developing computer vision tools for microscopy image analysis. It will provide a comprehensive forum on this topic, foster in-depth discussion of technical and application issues, and encourage cross-disciplinary collaboration. It will also serve as an introduction to this important and fertile field for curious researchers and students.
Schedule
Mon 8:30 a.m. - 8:40 a.m. | Opening Remarks & Logistics of the Day (Presentation)
Mon 8:40 a.m. - 9:20 a.m. | Machine learning challenges in spatial single cell omics analysis (Invited Talk)
Methods for profiling RNA and protein expression in a spatially resolved manner are rapidly evolving, making it possible to comprehensively characterize cells and tissues in health and disease. The resulting large-scale, complex multimodal data sets raise interesting computational and machine learning challenges, from QC and storage to the analysis itself. For instance, what is the best description of a local cell neighborhood? How do we find interesting neighborhoods, and how do they differ across disease states or other perturbations? And how can they be chained together to build a spatial human cell atlas? Here, I will present approaches from the lab touching on a few of these points. In particular, I will show how our recent toolbox Squidpy and the related SpatialData format support standard steps in the analysis and visualization of spatial molecular data. I will then discuss recent approaches to multimodal classification, learning cell-cell communication, and extending toward morphometric representations under perturbations using generative models.
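For readers new to the toolbox, a minimal sketch of the kind of neighborhood analysis mentioned above, using Squidpy's public API (the demo dataset and its "cell type" key follow Squidpy's IMC tutorial; treat both as assumptions):

```python
# Build a spatial graph over cells, then test which cell-type pairs
# co-occur as neighbors more often than expected by chance.
import squidpy as sq

adata = sq.datasets.imc()                       # small demo dataset (assumption)
sq.gr.spatial_neighbors(adata)                  # spatial graph over cell positions
sq.gr.nhood_enrichment(adata, cluster_key="cell type")
sq.pl.nhood_enrichment(adata, cluster_key="cell type")
```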
Mon 9:20 a.m. - 10:00 a.m. | AI for breast cancer diagnostics 2.0 (Invited Talk)
Deep learning is a state-of-the-art pattern recognition technique that has proven extremely powerful for the analysis of digitized histopathological slides. In our work, we studied the use of deep learning to assess a range of breast-cancer-related tissue features: the presence of lymph node metastases, the extent of lymphocytic infiltrate within tumors, and the components of tumor grading. We showed that deep learning enables reproducible, quantitative tumor feature extraction, with good correlation with pathologists' scores and with patient outcome. Our current research involves larger-scale validation with pathologists to study the added value of the developed algorithms in routine practice, in terms of efficiency and diagnostic accuracy. Such studies will be required for certification but are still mostly lacking in our field, making it difficult to assess the true potential of deep learning for pathology diagnostics.
Mon 10:00 a.m. - 10:10 a.m. | Giga-SSL: Self-Supervised Learning for Gigapixel Images (Accepted Paper)
Whole slide images (WSI) are microscopy images of stained tissue slides routinely prepared for diagnosis and treatment selection in medical practice. WSI are very large (gigapixel size) and complex (made up of up to millions of cells). The current state-of-the-art (SoTA) approach to classifying WSI subdivides them into tiles, encodes the tiles with pre-trained networks, and applies Multiple Instance Learning (MIL) to train for specific downstream tasks. However, annotated datasets are often small, typically a few hundred to a few thousand WSI, which may cause overfitting and underperforming models. Conversely, the number of unannotated WSI is ever increasing, with datasets of tens of thousands (soon to be millions) of images available. While it has previously been proposed to use these unannotated data to learn suitable tile representations by self-supervised learning (SSL), downstream classification tasks still require full supervision because parts of the MIL architecture are not trained during tile-level SSL pre-training. Here, we propose a strategy of slide-level SSL to leverage the large number of WSI without annotations and infer powerful slide representations. Applying our method to The Cancer Genome Atlas, one of the most widely used data resources in cancer research (16 TB of image data), we are able to downsize the dataset to 23 MB without any loss in predictive power: we show that a linear classifier trained on top of these embeddings maintains or improves previous SoTA performance on various benchmark WSI classification tasks. Finally, we observe that training a classifier on these representations with tiny datasets (e.g., 50 slides) improves performance over SoTA by an average of +6.3 AUC points across all downstream tasks. Altogether, our Giga-SSL representations of whole slide images are agnostic to downstream classification tasks and are well suited for small datasets.
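A sketch of the downstream step the abstract describes: once every slide is reduced to a fixed-length embedding, a plain linear classifier is all that remains. The file names below are placeholders, not the authors' release:

```python
# Train a linear probe on precomputed slide-level embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X = np.load("slide_embeddings.npy")   # hypothetical file: (n_slides, d)
y = np.load("slide_labels.npy")       # hypothetical file: (n_slides,), binary
clf = LogisticRegression(max_iter=1000)
print(cross_val_score(clf, X, y, scoring="roc_auc").mean())
```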
Mon 10:10 a.m. - 10:20 a.m. | Fast local thickness (Accepted Paper)
We propose a fast algorithm for the computation of local thickness in 2D and 3D. Compared to the conventional algorithm, our fast algorithm yields local thickness in a fraction of the time. We first compute the distance field of the object and then iteratively dilate selected parts of the distance field. In every iteration we employ small structuring elements, which makes our approach fast. Our algorithm is implemented in Python and is freely available as a pip-installable module. Besides giving a detailed description of our method, we test our implementation on 2D images and 3D volumes. In 2D, we compute the ground truth using the conventional local thickness method, where the distance field is dilated with increasingly larger circular structuring elements, and use it as a reference to evaluate the quality of our results. In 3D, we have no ground truth, since it would be too time-consuming to compute; instead, we compare our results with the gold standard method provided by BoneJ. In both 2D and 3D, we also compare with another Python-based approach from PoreSpy. Our algorithm performs as well as or better than the other approaches, and is significantly faster.
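To make the baseline concrete, here is a minimal sketch (assuming NumPy and SciPy) of the conventional 2D method the abstract uses as ground truth: threshold the distance field at each radius and dilate with a disc of that radius. The paper's speed-up comes from replacing these ever-larger discs with repeated small-kernel dilations.

```python
import numpy as np
from scipy import ndimage

def disc(r):
    # Boolean disc of radius r, used as the structuring element.
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    return x * x + y * y <= r * r

def local_thickness_reference(mask):
    # Conventional 2D local thickness on a boolean mask: a pixel has
    # thickness >= r if some disc of radius r that fits entirely inside
    # the object covers it.
    dist = ndimage.distance_transform_edt(mask)
    thickness = np.zeros(mask.shape)
    for r in range(1, int(dist.max()) + 1):
        centres = dist >= r                     # centres where radius-r disc fits
        covered = ndimage.binary_dilation(centres, structure=disc(r))
        thickness[covered & mask] = r           # ascending r keeps the maximum
    return thickness
```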
Mon 10:20 a.m. - 10:25 a.m. | Automatic analysis of cryo-electron tomography using computer vision and machine learning (Work-in-Progress Spotlight)
Mon 10:25 a.m. - 10:30 a.m. | Performance Review of Retraining and Transfer Learning of DeLTA 2.0 for Image Segmentation for Pseudomonas fluorescens SBW25 (Work-in-Progress Spotlight)
Mon 10:30 a.m. - 10:40 a.m. | Coffee Break
Mon 10:40 a.m. - 10:50 a.m. | A Super-Resolution Training Paradigm Based on Low-Resolution Data Only to Surpass the Technical Limits of STEM and STM Microscopy (Accepted Paper)
Modern microscopes can image at atomic resolutions but often reach technical limitations for high-resolution images captured at the smallest nanoscale. Prior works have applied super-resolution (SR) with deep neural networks, employing high-resolution images as targets in supervised training. However, in practice, it may be impossible to obtain these high-resolution images at the smallest atomic scales. Approaching this problem, we consider a new super-resolution training paradigm based on low-resolution (LR) microscope images only, to surpass the highest physically captured resolution available for training. As a solution, we propose a novel multi-scale training method for SR based on LR data only, which simultaneously supervises SR at multiple resolutions, allowing the SR model to generalize beyond the LR training data. We physically captured low- and high-resolution images for evaluation, thereby incorporating real microscope degradation to deliver a proof of concept. Our experiments on periodic atomic structures in STEM and STM microscopy images show that our proposed multi-scale training method enables deep neural network image SR even up to 360% of the highest physically recorded resolution. Code and data are available on GitHub.
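One way to read the abstract's multi-scale idea, as a hedged PyTorch sketch (the model interface and the L1 objective are assumptions; the paper's exact formulation may differ): each LR image serves as the target for a further-downsampled copy of itself, so SR is supervised at several scales without any HR data.

```python
import torch
import torch.nn.functional as F

def multiscale_sr_loss(model, lr_batch, scales=(2, 3, 4)):
    # Supervise SR at several scales using only LR images as targets.
    loss = 0.0
    for s in scales:
        small = F.interpolate(lr_batch, scale_factor=1.0 / s,
                              mode="bicubic", antialias=True)
        pred = model(small, scale=s)       # hypothetical scale-conditioned SR net
        h, w = pred.shape[-2:]             # crop target in case of rounding
        loss = loss + F.l1_loss(pred, lr_batch[..., :h, :w])
    return loss / len(scales)
```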
Mon 10:50 a.m. - 11:00 a.m. | New Bayesian Focal Loss Targeting Aleatoric Uncertainty Estimate: Pollen Image Recognition (Accepted Paper)
In biological image recognition, different species may look similar, resulting in a small inter-class margin and causing errors in image labeling. Pollen grain image classification suffers heavily from both problems, preventing the construction of well-calibrated recognition models. In this research, we aim to filter out aleatoric uncertainty caused by noisy labeling and by the similar shapes of pollen species. To estimate aleatoric uncertainty, we propose a new Bayesian Focal Softmax loss function. It uses the softmax activation, which is more convenient for single-label tasks than the original Focal loss based on the logistic function. The proposed loss function better estimates aleatoric uncertainty, increasing overall model performance. For evaluation, we used two datasets: POLLEN13L-det, containing 13 classes of allergenic pollen, and POLLEN20L-det, containing additional honey plant pollen species. We achieved state-of-the-art results on both by applying the proposed loss function to RetinaNet. It improved the mAP and significantly reduced the variance compared to the regular Focal loss with softmax, and provided a much better aleatoric uncertainty estimate than the Bayesian Focal loss with sigmoid activation.
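For reference, the softmax focal loss that the proposed function builds on, as a minimal PyTorch sketch (the paper's Bayesian aleatoric-uncertainty weighting is not reproduced here):

```python
import torch
import torch.nn.functional as F

def focal_softmax_loss(logits, targets, gamma=2.0):
    # Softmax focal loss for single-label classification: down-weight
    # well-classified examples by (1 - p_t)^gamma.
    log_p = F.log_softmax(logits, dim=-1)
    log_pt = log_p.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    pt = log_pt.exp()
    return ((1.0 - pt) ** gamma * -log_pt).mean()
```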
Mon 11:00 a.m. - 11:05 a.m. | Virtual Staining for Pixel-Wise and Quantitative Analysis of Single Cell Image Analysis (Work-in-Progress Spotlight)
Mon 11:05 a.m. - 11:10 a.m. | Self-supervised clustering and annotation of single-cell trajectories (Work-in-Progress Spotlight)
Mon 11:10 a.m. - 11:15 a.m. | Spatio-temporal graph attention networks predict single cell response to cancer treatment in live 3D tumour spheroids (Work-in-Progress Spotlight)
Mon 11:15 a.m. - 11:55 a.m. | How Can Humans Learn from AI (Invited Talk)
In traditional ML, models learn from hand-engineered features informed by existing domain knowledge. More recently, in the deep learning era, combining large-scale model architectures, compute, and datasets has enabled learning directly from raw data, often at the expense of human interpretability. In this talk, I'll discuss using deep learning to predict patient outcomes with interpretability methods to extract new knowledge that humans could learn and apply. This process is a natural next step in the evolution of applying ML to problems in medicine and science, moving from the use of ML to distill existing human knowledge to people using ML as a tool for knowledge discovery.
Mon 11:55 a.m. - 1:00 p.m. | Lunch Break
Mon 1:00 p.m. - 1:40 p.m. | Decoding hidden signal from neurodegenerative drug discovery high-content screens (Invited Talk)
Alzheimer's disease is a complex and recalcitrant condition that has largely evaded traditional molecular drug discovery approaches. Phenotypic drug discovery using high-content cellular models with unbiased small-molecule screening is promising but faces obstacles from subtle signal, artifacts, and non-specific visual markers. We propose two deep learning-based methods to overcome these challenges in large-scale cellular screens. First, we develop deep neural networks to generate missing fluorescence channel images from an Alzheimer's disease high-content screen (HCS), enabling the identification and prospective validation of overlooked but active small molecules. This is a unique application of generative image models in drug discovery. Second, we introduce a learned biological landscape that uses deep metric learning to organize drug-like molecules by live-cell HCS images. Metric learning outperforms conventional image scoring and reveals previously hidden molecules that push diseased cells toward a healthy state as effectively as positive control compounds. These results indicate that a wealth of actionable biological information lies untapped but readily available in HCS datasets.
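Deep metric learning of the kind invoked here is typically trained with an objective like the triplet loss below — a generic PyTorch sketch, not the speakers' specific formulation:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Pull embeddings of same-treatment images together and push different
    # treatments apart; the learned space becomes a "biological landscape".
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```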
Mon 1:40 p.m. - 2:20 p.m. | Multimodal Computational Pathology (Invited Talk)
Advances in digital pathology and artificial intelligence present the potential to build assistive tools for objective diagnosis, prognosis, and prediction of therapeutic response and resistance. In this talk we will discuss our work on: 1) data-efficient methods for weakly-supervised whole slide classification, with examples in cancer diagnosis and subtyping (Nature BME, 2021) and allograft rejection (Nature Medicine, 2022); 2) harnessing weakly-supervised, fast, and data-efficient WSI classification for identifying origins for cancers of unknown primary (Nature, 2021); 3) discovering integrative histology-genomic prognostic markers via interpretable multimodal deep learning (Cancer Cell, 2022; IEEE TMI, 2020; ICCV, 2021); 4) self-supervised deep learning for pathology and image retrieval (CVPR, 2022; Nature BME, 2022); 5) integrating vision and language for computational pathology (CVPR, 2023); 6) deploying weakly supervised models in low-resource settings without slide scanners, network connections, computational resources, or expensive microscopes; and 7) bias and fairness in computational pathology algorithms.
Mon 2:20 p.m. - 2:30 p.m. | Learning to Correct Sloppy Annotations in Electron Microscopy Volumes (Accepted Paper)
Connectomics deals with the problem of reconstructing neural circuitry from electron microscopy images at the synaptic level. Automatically reconstructing circuits from these volumes requires high-fidelity 3-D instance segmentation, which remains a daunting task for current computer vision algorithms. Hence, to date, most datasets are not reconstructed by fully automated methods. Even after painstaking proofreading, these methods still produce numerous small errors. In this paper, we propose an approach to accelerate manual reconstruction by learning to correct imperfect manual annotations. To achieve this, we designed a novel solution for the canonical problem of marker-based 2-D instance segmentation, reporting a new state of the art for region-growing algorithms demonstrated on challenging electron microscopy image stacks. We use our marker-based instance segmentation algorithm to learn to correct “sloppy” object annotations by reducing and expanding them. Our correction algorithm yields high-quality morphological reconstruction (near ground-truth quality) while significantly cutting annotation time (~8x) for several examples in connectomics. We demonstrate the accuracy of our approach on public connectomics benchmarks and on a set of large-scale neuron reconstruction problems, including a new octopus dataset that cannot be automatically segmented at scale by existing algorithms.
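For context, the classical (non-learned) baseline for marker-based region growing is the seeded watershed; a minimal scikit-image sketch, not the paper's learned method:

```python
from skimage.segmentation import watershed

def grow_from_markers(boundary_prob, markers):
    # boundary_prob: (H, W) float map, high at membranes.
    # markers: labeled seed image, one integer label per object.
    # Regions grow from the seeds, flooding low-boundary areas first and
    # stopping where basins from different seeds meet.
    return watershed(boundary_prob, markers=markers)
```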
Mon 2:30 p.m. - 2:40 p.m. | Theia: Bleed-Through Estimation with Convolutional Neural Networks (Accepted Paper)
Microscopy is ubiquitous in biological research, and with high content screening
Mon 2:40 p.m. - 2:50 p.m. | RxRx1: A Dataset for Evaluating Experimental Batch Correction Methods (Accepted Paper)
High-throughput screening techniques are commonly used to obtain large quantities of data in many fields of biology. It is well known that artifacts arising from variability in the technical execution of different experimental batches within such screens confound these observations, and can lead to invalid biological conclusions. It is, therefore, necessary to account for these batch effects when analyzing outcomes. In this paper, we describe RxRx1, a biological dataset designed specifically for the systematic study of batch effect correction methods. The dataset consists of 125,510 high-resolution fluorescence microscopy images of human cells under 1,138 genetic perturbations in 51 experimental batches across 4 cell types. Visual inspection of the images clearly demonstrates significant batch effects. We also propose a classification task designed to evaluate the effectiveness of experimental batch correction methods on these images and examine the performance of a number of correction methods on this task. Our goal in releasing RxRx1 is to encourage the development of effective experimental batch correction methods that generalize well to unseen experimental batches. The dataset can be downloaded at https://rxrx.ai.
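The evaluation the abstract proposes amounts to testing perturbation classifiers on experimental batches never seen in training; a hedged sketch of such a split (column names are assumptions, not RxRx1's actual metadata schema):

```python
import pandas as pd

# Hold out whole batches so that test-time batch effects are truly unseen.
meta = pd.read_csv("metadata.csv")            # hypothetical metadata file
batches = meta["batch"].drop_duplicates()
held_out = set(batches.sample(frac=0.2, random_state=0))
test = meta[meta["batch"].isin(held_out)]
train = meta[~meta["batch"].isin(held_out)]
# A batch-correction method succeeds if perturbation-classification accuracy
# on `test` approaches accuracy on batches seen during training.
```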
Mon 2:50 p.m. - 3:00 p.m. | An Ensemble Method with Edge Awareness for Abnormally Shaped Nuclei Segmentation (Accepted Paper)
Abnormalities in the shapes of biological cell nuclei are correlated with cell cycle stages, disease states, and various external stimuli. Many deep learning approaches are used for nuclei segmentation and analysis, and in recent years transformers have outperformed CNN methods on many computer vision tasks. One problem with many deep learning nuclei segmentation methods is acquiring large amounts of annotated nuclei data, which is generally expensive to obtain. In this paper, we propose a Transformer and CNN hybrid ensemble method with edge awareness for accurately segmenting abnormally shaped nuclei. We call this method Hybrid Edge Mask R-CNN (HER-CNN); it uses Mask R-CNNs with ResNet and Swin-Transformer backbones to segment abnormally shaped nuclei. We add an edge awareness loss to the mask prediction step of the Mask R-CNN to better distinguish the edge difference between abnormally shaped nuclei and typical oval nuclei. We describe an ensemble processing strategy to combine or fuse the individual segmentations from the CNN and the Transformer. Due to the limited amount of data, we introduce synthetic ground truth image generation to supplement the annotated training images. Our proposed method is compared with other segmentation methods for segmenting abnormally shaped nuclei, and we include ablation studies showing the effectiveness of the edge awareness loss and the synthetic ground truth images.
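One plausible form of an edge-awareness term, as a PyTorch sketch (the paper's exact loss is not given in the abstract; the boundary extraction via morphological gradient is an assumption):

```python
import torch
import torch.nn.functional as F

def edge_awareness_loss(pred_prob, gt_mask):
    # pred_prob, gt_mask: (N, 1, H, W) float tensors, values in [0, 1].
    # Weight the mask loss toward boundary pixels, extracted from the
    # ground truth with a morphological gradient (dilation minus erosion).
    dil = F.max_pool2d(gt_mask, 3, stride=1, padding=1)
    ero = -F.max_pool2d(-gt_mask, 3, stride=1, padding=1)
    edges = (dil - ero).clamp(0, 1)                      # boundary band
    bce = F.binary_cross_entropy(pred_prob, gt_mask, reduction="none")
    return (bce * edges).sum() / edges.sum().clamp(min=1.0)
```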
Mon 3:00 p.m. - 3:30 p.m. | Coffee Break
Mon 3:30 p.m. - 4:00 p.m. | TBD (Invited Talk)
Mon 4:00 p.m. - 4:40 p.m. | Point-and-click: using microscopy images to guide spatial next generation sequencing measurements (Invited Talk)
By using microscopy images of cells to guide multiplexed spatial indexing of sequencing reads, Light-Seq allows combined imaging and spatially resolved next-generation sequencing (NGS) measurements to be captured from fixed biological samples. This is achieved by combining spatially targeted, rapid photocrosslinking of DNA barcodes onto complementary DNAs in situ with a one-step DNA stitching reaction that creates pooled, spatially indexed sequencing libraries. Cells can be selected manually, by pointing and clicking on the regions of interest (ROIs) to be sequenced, or automatically, using computer vision for cell type identification and segmentation. This foundational capability opens up a broad range of applications for multi-omic analysis unifying microscopy and NGS measurements from intact biological samples.
Mon 4:40 p.m. - 4:50 p.m. | Out of Distribution Generalization via Interventional Style Transfer in Single-Cell Microscopy (Accepted Paper)
Real-world deployment of computer vision systems, including in the discovery processes of biomedical research, requires causal representations that are invariant to contextual nuisances and generalize to new data. Leveraging the internal replicate structure of two novel single-cell fluorescence microscopy datasets, we propose generally applicable tests to assess the extent to which models learn causal representations across increasingly challenging levels of OOD generalization. We show that, despite seemingly strong performance as assessed by other established metrics, both naive baselines and contemporary baselines designed to ward against confounding collapse to random performance on these tests. We introduce a new method, Interventional Style Transfer (IST), which substantially improves OOD generalization by generating interventional training distributions in which spurious correlations between biological causes and nuisances are mitigated. We publish our code and datasets.
Mon 4:50 p.m. - 5:00 p.m. | One-shot and Partially-Supervised Cell Image Segmentation Using Small Visual Prompt (Accepted Paper)
Semantic segmentation of microscopic cell images using deep learning is an important technique; however, it requires a large number of images and ground-truth labels for training. To address this problem, we consider an efficient learning framework that uses as little data as possible, and we propose two learning strategies: one-shot segmentation, which can learn from only one training sample, and partially-supervised segmentation, which assigns annotations to only part of the images. Furthermore, we introduce novel segmentation methods using small prompt images, inspired by prompt learning in recent studies. Our proposed methods use a model pre-trained on cell images only and transfer the information in the prompt pairs to the target image to be segmented via an attention mechanism, which allows for efficient learning while reducing annotation costs. Through experiments on three types of microscopic cell image datasets, we confirmed that the proposed methods improve the Dice similarity coefficient (DSC) in comparison with conventional methods.
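The attention-based label transfer described here can be sketched generically (the shapes and the dot-product formulation are assumptions, not the authors' architecture):

```python
import torch

def prompt_label_transfer(target_feats, prompt_feats, prompt_labels):
    # target_feats: (N, C) features of target-image pixels
    # prompt_feats: (M, C) features of prompt-image pixels
    # prompt_labels: (M, K) one-hot labels of the prompt pixels
    scale = target_feats.shape[-1] ** 0.5
    attn = torch.softmax(target_feats @ prompt_feats.T / scale, dim=-1)
    return attn @ prompt_labels      # (N, K): labels carried over by attention
```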
Mon 5:00 p.m. - 5:30 p.m. | Enhancing SAM's Biomedical Image Analysis through Prompt-based Learning (Invited Talk)
The Segment Anything Model (SAM), a foundation model trained on an extensive collection of images, presents many opportunities for diverse applications. For instance, we employed SAM in our biological pathway curation pipeline, which synergizes image understanding and text mining techniques for deciphering gene relationships; SAM proved highly efficient in recognizing pathway entities and their interconnections. However, SAM does not work well when applied directly to low-contrast images. To counter this, we investigated prompt-based learning with SAM, specifically for identifying proteins in cryo-electron microscopy (cryo-EM) images. We trained a U-Net-based filter to adapt these grayscale cryo-EM images into RGB images suitable as SAM's input. We also trained continuous prompts and achieved state-of-the-art (SOTA) performance, even with a limited quantity of labeled data. The outcomes of our studies underscore the potential utility of prompt-based learning on SAM for efficient biomedical image analysis.
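A minimal stand-in for the adapter idea, as a PyTorch sketch (the talk's actual filter is a U-Net, and the wiring to a frozen SAM encoder is an assumption, not a confirmed detail):

```python
import torch.nn as nn

class GrayToRGBAdapter(nn.Module):
    # Learn a 1-channel -> 3-channel mapping so grayscale cryo-EM images
    # match the RGB input SAM expects; a real U-Net would replace this stack.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Sigmoid(),  # outputs in [0, 1]
        )

    def forward(self, x):        # x: (N, 1, H, W) grayscale batch
        return self.net(x)       # (N, 3, H, W), ready for SAM's image encoder

# Training would backpropagate a segmentation loss through a frozen SAM into
# the adapter (and any learned continuous prompts).
```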
Mon 5:30 p.m. - 5:45 p.m. | Challenge Reportout (Challenge)
Mon 5:45 p.m. - 5:50 p.m. | Closing Remarks (Presentation)