Uncovering patterns in cancer cells with visual representation learning

Project: Research project


Description (abstract)

Deep learning has revolutionized image and data analysis. However, the most successful applications build on large annotated datasets, which are expensive to create and can introduce bias into the learning process. The aim of this research project is to solve this problem by learning general representations of fluorescence microscopy images of cells and tissues and applying these representations in studies of cancer samples. Fluorescence microscopy plays a central role in image-based profiling, which has a wide range of applications such as drug discovery and functional genomics. The outcome of the project will advance machine learning research and impact studies of clinically relevant predictive biomarkers to aid the treatment of patients.

During this project we will study representation learning using weakly-supervised and unsupervised approaches. Our learning approach enables representation learning without annotation bias. We will collect a total of 10 million fluorescence microscopy images and their experimental metadata from open databases and train our models in a weakly-supervised manner. We will study unsupervised representation learning using generative adversarial networks and variational autoencoders to 1) generate realistic images for weakly-supervised learning and 2) implicitly capture representations of the image data. Finally, we will study clustering and outlier detection to identify rare phenotypic classes of cancer cells. The methods and models developed in the project will be used to profile cancer cell and tissue samples provided by our collaborators.

The project will provide new knowledge about learning generalizable, domain-independent image features. Prior to this project, no standard models for fluorescence microscopy data existed. The representations learned during the project will be shared openly with the whole research community in the project's GitHub repository. The models trained here are expected to impact the whole bioimaging community by facilitating improved reproducibility and analytical precision. Together with our biological collaborators, we expect these representations to clarify cancer cell heterogeneity and how this information can be used to find better cancer treatments and predictive biomarkers.

General description

One of the biggest challenges in machine learning is to learn generalizable models from limited amounts of annotated data, as creating annotated data is extremely costly and may limit novel findings. In this research project we study novel solutions to this challenge in the field of microscopy imaging of cancer cells using weakly-supervised and unsupervised learning. The developed methods and learned models will be applied in studies of cancer cells and tissues to uncover unknown phenotypes and predictive biomarkers that may be clinically relevant for cancer patient survival. The outcome of the project will provide new knowledge in machine learning and enable solutions for various biological and medical questions regarding cancer function and treatment. The project will be carried out at the Institute for Molecular Medicine Finland (FIMM), HiLIFE, University of Helsinki.
Effective start/end date: 01/09/2021 to 31/08/2026


  • 113 Computer and information sciences