Exploration of tissue morphologies in breast cancer samples using unsupervised machine learning

Research output: Contribution to journal, Meeting abstract, Research, Peer review

Abstract

We applied a machine learning approach for exploration of tissue morphology in hematoxylin and eosin (H&E) stained breast cancer tissue microarray (TMA) samples. We then investigated whether the morphological categories produced were associated with clinically relevant molecular biomarkers and 10-year overall survival. The data set comprises digitized (0.22 µm/pixel) and H&E stained TMA spots from tumor samples of 490 women who were diagnosed with primary breast cancer within a Finnish breast cancer database (FinProg) collected in 1991 and 1992. In order to quantitatively describe the tissue morphologies of the TMA spots, we divided the tissue images into rectangular sub-images (224x224 pixels), and extracted features with a pre-trained convolutional neural network. We then clustered the sub-images (n=147,266) with a non-linear data embedding algorithm that creates a two-dimensional mapping of the tissue morphologies. Lastly, we defined a quantitative profile for each tumor, describing the morphologies within the tissue spot image by dividing the two-dimensional map of morphologies into 128 separate clusters with k-nearest neighbor clustering. Visual inspection of the two-dimensional embedding of tissue spot images verified that the morphologies clustered coherently, i.e. similar looking sub-images formed distinct clusters in the map. Interestingly, some morphological patterns were strongly associated with tumor estrogen receptor content, progesterone receptor content, human epidermal growth factor receptor 2 status, and the proliferation marker Ki-67 status (p<0.0001 for each comparison). In exploratory analyses we identified one morphological category that was associated with a favorable 10-year overall survival with a risk ratio of 0.68 (CI95% 0.53-0.89, p=0.002, power = 0.87). Our work demonstrates that unsupervised machine learning can be applied to explore and better understand the role of morphological patterns in breast cancer. 
Methods that quantitatively assess the morphology of cancer tissue may complement molecular biomarkers and potentially reveal novel prognostic and predictive factors.
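
The tiling and profiling steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The tile size (224) and cluster count (128) come from the abstract; the function names `tile_image` and `morphology_profile`, the choice to drop partial tiles at the image border, and the normalization of the profile to fractions are assumptions made for illustration. Feature extraction with the pre-trained network, the two-dimensional embedding, and the cluster assignment itself are not shown; the sketch assumes each sub-image already has a cluster id.

```python
import numpy as np

TILE = 224        # sub-image side length in pixels (from the abstract)
N_CLUSTERS = 128  # number of morphological clusters (from the abstract)


def tile_image(image, tile=TILE):
    """Split an (H, W, C) array into non-overlapping tile x tile sub-images.

    Assumption: partial tiles at the right and bottom edges are dropped.
    Tiles are returned in row-major order over the tile grid.
    """
    h, w, c = image.shape
    rows, cols = h // tile, w // tile
    cropped = image[: rows * tile, : cols * tile]
    # Split both spatial axes into (grid index, offset within tile),
    # then move the two grid axes to the front.
    grid = cropped.reshape(rows, tile, cols, tile, c).swapaxes(1, 2)
    return grid.reshape(rows * cols, tile, tile, c)


def morphology_profile(cluster_ids, n_clusters=N_CLUSTERS):
    """Return the fraction of a spot's sub-images assigned to each cluster."""
    ids = np.asarray(cluster_ids, dtype=np.int64)
    counts = np.bincount(ids, minlength=n_clusters)
    return counts / max(counts.sum(), 1)


# Synthetic stand-in for one TMA spot image (not real data).
spot = np.zeros((500, 700, 3), dtype=np.uint8)
tiles = tile_image(spot)                      # 2 rows x 3 columns of tiles

# Hypothetical cluster ids for four sub-images of one tumor.
profile = morphology_profile([0, 0, 5, 127])  # 128-element vector of fractions
```

A profile of this kind gives every tumor a fixed-length vector regardless of how many sub-images its spot yields, which is what makes it usable for comparison against biomarker status or survival.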
Original language: English
Journal: Cancer Research
Volume: 77
Issue number: 13 Supplement
Pages (from-to): 673
Number of pages: 1
ISSN: 0008-5472
DOI: 10.1158/1538-7445.AM2017-673
Status: Published - 1 Jul 2017
Event: AACR Annual Meeting 2017 - Washington, United States
Duration: 1 Apr 2017 to 5 Apr 2017
http://www.aacr.org/Meetings/Pages/MeetingDetail.aspx?EventItemID=105

Cite this

@article{a058ed59dee6400da5a048c6d30a6c72,
title = "Exploration of tissue morphologies in breast cancer samples using unsupervised machine learning",
abstract = "We applied a machine learning approach for exploration of tissue morphology in hematoxylin and eosin (H{\&}E) stained breast cancer tissue microarray (TMA) samples. We then investigated whether the morphological categories produced were associated with clinically relevant molecular biomarkers and 10-year overall survival. The data set comprises digitized (0.22 µm/pixel) and H{\&}E stained TMA spots from tumor samples of 490 women who were diagnosed with primary breast cancer within a Finnish breast cancer database (FinProg) collected in 1991 and 1992. In order to quantitatively describe the tissue morphologies of the TMA spots, we divided the tissue images into rectangular sub-images (224x224 pixels), and extracted features with a pre-trained convolutional neural network. We then clustered the sub-images (n=147,266) with a non-linear data embedding algorithm that creates a two-dimensional mapping of the tissue morphologies. Lastly, we defined a quantitative profile for each tumor, describing the morphologies within the tissue spot image by dividing the two-dimensional map of morphologies into 128 separate clusters with k-nearest neighbor clustering. Visual inspection of the two-dimensional embedding of tissue spot images verified that the morphologies clustered coherently, i.e. similar looking sub-images formed distinct clusters in the map. Interestingly, some morphological patterns were strongly associated with tumor estrogen receptor content, progesterone receptor content, human epidermal growth factor receptor 2 status, and the proliferation marker Ki-67 status (p<0.0001 for each comparison). In exploratory analyses we identified one morphological category that was associated with a favorable 10-year overall survival with a risk ratio of 0.68 (CI95{\%} 0.53-0.89, p=0.002, power = 0.87). Our work demonstrates that unsupervised machine learning can be applied to explore and better understand the role of morphological patterns in breast cancer. 
Methods that quantitatively assess the morphology of cancer tissue may complement molecular biomarkers and potentially reveal novel prognostic and predictive factors.",
author = "Riku Turkki and Dmitrii Bychkov and Nina Linder and Jorma Isola and Joensuu, {Heikki Tuomas} and Johan Lundin",
year = "2017",
month = "7",
day = "1",
doi = "10.1158/1538-7445.AM2017-673",
language = "English",
volume = "77",
pages = "673",
journal = "Cancer Research",
issn = "0008-5472",
publisher = "American Association for Cancer Research",
number = "13 Supplement",
}

Exploration of tissue morphologies in breast cancer samples using unsupervised machine learning. / Turkki, Riku; Bychkov, Dmitrii; Linder, Nina; Isola, Jorma; Joensuu, Heikki Tuomas; Lundin, Johan.

In: Cancer Research, Vol. 77, No. 13 Supplement, 01.07.2017, p. 673.

TY - JOUR
T1 - Exploration of tissue morphologies in breast cancer samples using unsupervised machine learning
AU - Turkki, Riku
AU - Bychkov, Dmitrii
AU - Linder, Nina
AU - Isola, Jorma
AU - Joensuu, Heikki Tuomas
AU - Lundin, Johan
PY - 2017/7/1
Y1 - 2017/7/1
N2 - We applied a machine learning approach for exploration of tissue morphology in hematoxylin and eosin (H&E) stained breast cancer tissue microarray (TMA) samples. We then investigated whether the morphological categories produced were associated with clinically relevant molecular biomarkers and 10-year overall survival. The data set comprises digitized (0.22 µm/pixel) and H&E stained TMA spots from tumor samples of 490 women who were diagnosed with primary breast cancer within a Finnish breast cancer database (FinProg) collected in 1991 and 1992. In order to quantitatively describe the tissue morphologies of the TMA spots, we divided the tissue images into rectangular sub-images (224x224 pixels), and extracted features with a pre-trained convolutional neural network. We then clustered the sub-images (n=147,266) with a non-linear data embedding algorithm that creates a two-dimensional mapping of the tissue morphologies. Lastly, we defined a quantitative profile for each tumor, describing the morphologies within the tissue spot image by dividing the two-dimensional map of morphologies into 128 separate clusters with k-nearest neighbor clustering. Visual inspection of the two-dimensional embedding of tissue spot images verified that the morphologies clustered coherently, i.e. similar looking sub-images formed distinct clusters in the map. Interestingly, some morphological patterns were strongly associated with tumor estrogen receptor content, progesterone receptor content, human epidermal growth factor receptor 2 status, and the proliferation marker Ki-67 status (p<0.0001 for each comparison). In exploratory analyses we identified one morphological category that was associated with a favorable 10-year overall survival with a risk ratio of 0.68 (CI95% 0.53-0.89, p=0.002, power = 0.87). Our work demonstrates that unsupervised machine learning can be applied to explore and better understand the role of morphological patterns in breast cancer. 
Methods that quantitatively assess the morphology of cancer tissue may complement molecular biomarkers and potentially reveal novel prognostic and predictive factors.

U2 - 10.1158/1538-7445.AM2017-673
DO - 10.1158/1538-7445.AM2017-673
M3 - Meeting Abstract
VL - 77
SP - 673
JO - Cancer Research
JF - Cancer Research
SN - 0008-5472
IS - 13 Supplement
ER -