Journal of Pathology Informatics

J Pathol Inform 2019,  10:32

14th European congress on digital pathology

Date of Web Publication: 11-Nov-2019


Source of Support: None, Conflict of Interest: None

DOI: 10.4103/2153-3539.270744


How to cite this article:
14th European congress on digital pathology. J Pathol Inform 2019;10:32.


   ECDP/NDP 2018 Summary

Johan Lundin1,2, Nina Linder3

1Institute for Molecular Medicine Finland – FIMM, Helsinki, Finland, 2Department of Public Health Sciences, Karolinska Institutet, Solna, Sweden, 3Institute for Molecular Medicine Finland – FIMM and Uppsala University, Helsinki, Finland. E-mail: [email protected]

The 14th European Congress on Digital Pathology (ECDP 2018) was organized May 29 – June 1, 2018, in Helsinki, Finland. This time the meeting also hosted the 5th Nordic Symposium on Digital Pathology (NDP 2018).

The primary goal was to bring together researchers, pathologists, clinicians and industry working in the field of digital pathology to present and discuss science, the implementation of digital techniques and the latest advances in the field.

The congress theme was digital diagnostics and intelligence augmentation, with focus on artificial intelligence for pathology. The meeting also included sessions on standards, quality management, point-of-care pathology, emerging technologies, computational pathology, translational research and precision medicine.

It was an honor for us to host our distinguished keynote and invited speakers, representing both pioneers in digital pathology and key opinion leaders in the field. We were very happy that more than one hundred abstracts were submitted. We had 375 attendees from 33 countries, including 74 speakers and 60 poster presenters.

One day of the Conference was a “Nordic Day” dedicated to the 5th Nordic Symposium on Digital Pathology (NDP 2018) and chaired by Prof. Claes Lundström and Prof. Darren Treanor.

The industrial exhibition showed the continuing growth and advances in the field and reflected the broad usability of the technology. We were especially happy that more than twenty companies attended and that the IHE and DICOM working groups also met during the event.

A “Connectathon” (a connectivity marathon) was also organized during the conference, where companies could test hands-on their compatibility with the latest interoperability standards.

A special new event was a competition for the “ESDIP Digital Diagnostics Award,” in which innovators and companies had the chance to pitch their innovations on stage (max 3 minutes) and received instant feedback from a panel of high-profile experts. Competitors were asked to submit a proposal of at most 500 words describing what is unique and innovative about their novel digital diagnostics product or invention (for example, an image analysis algorithm) and how the innovation delivers valuable impact for patients.

Startups, established companies, non-governmental organizations, researchers and pathologists were invited to compete, as long as they presented an innovation in digital diagnostics for pathology.

Ten finalists gave pitches and a winner was selected by the jury that consisted of Prof. Carolina Wählby, Uppsala University, Prof. G. Steven Bova, University of Tampere, Prof. Jorma Isola, University of Tampere and Prof. Nasir Rajpoot, University of Warwick.

The winner of the first ESDIP Digital Diagnostics Award was Dr. Andrew Janowczyk from Case Western Reserve University in Cleveland, United States, who presented HistoQC, a quality control pipeline for digital pathology slides.

In this issue of the Journal, accepted extended and short abstracts of the Conference are presented.

Keynote Speakers

   Cognitive Algorithms for Tissue-Based Diagnosis

Klaus Kayser1

1Charité – University of Berlin, Berlin, Germany. E-mail: [email protected]

Klaus Kayser is one of the European pioneers of digital pathology and telepathology, and organizer of the first European Congress of Telepathology (current ECDP) hosted in Heidelberg, Germany in 1992. Prof. Kayser gave a presentation during the opening ceremony of the meeting and a keynote talk on “Cognitive algorithms for tissue-based diagnosis.”

   The Opportunity for Machine Intelligence in Digital Medicine

Greg Corrado1

1Augmented Intelligence Research, Google Brain, Google Inc.

Greg Corrado, Director of Augmented Intelligence Research at Google, is a senior researcher in artificial intelligence, computational neuroscience, and machine learning. At Google he was one of the founding members and co-leads of the Google Brain project for large-scale deep neural networks. Dr. Corrado gave a presentation on the future of machine learning.

   The DICOM Standard Approach to Whole Slide Imaging Deployment

David Clunie1

1PixelMed Publishing, Bangor, Pennsylvania, USA. E-mail: [email protected]

As WSI begins to permeate routine clinical anatomical pathology workflow globally, confusion reigns as to the role of standard protocols and formats for image interchange and storage, as opposed to proprietary monolithic systems. Though the regulatory issues of modular systems remain to be resolved, standards do exist and are being implemented, as the success of recent DICOM Connectathons demonstrates. The lessons learned from other specialties about the advantages and disadvantages of DICOM are considered for a range of use cases. Decisions in the design of the DICOM WSI standard are considered in light of experience with similar proprietary formats. The role of metadata and the importance of anticipating annotation interchange are also examined, particularly from the perspective of automated and interactive analytic applications. High-volume operational use requires attention to the workflow, and appropriate acquisition and reporting management standards need to be selected and implemented. Enterprise integration and cross-enterprise data exchange are key considerations that are also predicated on the successful deployment of standards.
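As an illustration of one design decision in the DICOM WSI object: each pyramid level is stored as a tiled multi-frame image, where TotalPixelMatrixColumns/TotalPixelMatrixRows give the full level size and Columns/Rows give the per-frame tile size. A minimal sketch of the resulting frame-count arithmetic (the pixel dimensions below are invented for illustration):

```python
import math

def wsi_frame_count(total_cols, total_rows, tile_cols, tile_rows):
    """Number of tile frames needed to cover a DICOM WSI pixel matrix.

    In the DICOM VL Whole Slide Microscopy Image IOD, a slide level is a
    multi-frame image: TotalPixelMatrixColumns/Rows describe the whole
    level, Columns/Rows the per-frame tile (TILED_FULL organization).
    Edge tiles are padded, so each dimension rounds up.
    """
    tiles_across = math.ceil(total_cols / tile_cols)
    tiles_down = math.ceil(total_rows / tile_rows)
    return tiles_across * tiles_down

# A hypothetical base level of 98,304 x 65,536 pixels stored as 512x512 tiles:
print(wsi_frame_count(98304, 65536, 512, 512))  # 192 * 128 = 24576 frames
```

This is why WSI objects routinely contain tens of thousands of frames per level, which in turn drives the workflow and storage considerations discussed above.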

David Clunie is a retired neuroradiologist, medical informaticist, DICOM open-source software author, editor of the DICOM standard and independent consultant. He was formerly the co-chair of the IHE Radiology Technical Committee and industry co-chairman of the DICOM Standards Committee, as well as a member or chair of many of the DICOM working groups, including structured reporting, digital x-ray, compression, interchange media, base standard, display, mammography, security, application hosting, clinical trials, small animal imaging and digital pathology. Recently he has specialized in the technical aspects of enterprise imaging for non-radiological specialties and has provided technical support for the whole slide imaging Connectathons organized by DICOM WG 26, specializing in the validation of compliance of WSI implementations with the DICOM standard.

   Image Based In Situ Sequencing as a Basis for Learning Tissue Morphology

Carolina Wählby1

1Department of Information Technology, Uppsala University, Uppsala, Sweden. E-mail: [email protected]

Carolina Wählby, Professor at the Department of Information Technology, Uppsala University, is head of the Centre for Image Analysis at Uppsala University, focusing on digital image processing and analysis at the interface between biomedicine, microscopy, and computer science. She leads a number of research projects involving large-scale cell and tissue analysis, including computational methods for spatial transcriptomics, funded by the ERC and the Swedish Foundation for Strategic Research. Prof. Wählby gave an overview of recent projects related to tissue analytics and machine learning for pathology.

   A Byte of the Future: Digital Pathology and Machine Learning, New Value Based Opportunities

Michael Feldman1

1Pathology and Laboratory Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA. E-mail: [email protected]

Digital Pathology is a broad term that is often associated with just whole slide imaging but truly encompasses far more. In Europe, Digital Pathology has enjoyed a CE mark for routine diagnostics and has seen full-scale adoption in a growing number of practices across multiple countries. In the USA, the FDA recently cleared the first whole slide imaging system for primary diagnosis, with more companies to follow, and the first clinical site is moving toward adoption of whole slide imaging for diagnostics. New image acquisition modalities are developing that push “Digital Pathology” in new directions (computational photonics), and acquisition methodologies for “slideless” imaging of tissue present new and exciting opportunities for us to rethink what we mean by the term “digital pathology.” Of course, no presentation on “digital pathology” would be complete without a discussion of the implications of machine learning within this space. A value-focused discussion with specific use cases of machine learning and deep learning to create a path toward adoption will be presented as these technologies mature from basic science laboratories and move into clinical practice. Finally, integrated multiscale imaging across radiology and pathology will be briefly discussed, again with an eye on value to the health system as well as to our patients.

Michael Feldman is Professor of Pathology and Laboratory Medicine at the Hospital of the University of Pennsylvania. His professional interests revolve around the development, integration and adoption of information technologies in the discipline of pathology. One of his main areas of interest within this broad discipline has been the field of digital imaging. He has been studying pathology imaging on several fronts, including interactions between pathology and radiology (radiopathogenomics of prostate cancer and breast carcinoma) and the development and utilization of computer-assisted diagnostic algorithms for machine vision in prostate and breast cancer. More recently, his team has been developing deep learning methods for complex interrogation of pathology slides within the cancer domain as well as in cardiovascular and renal pathology. Prof. Feldman and his collaborators have also been developing methods to apply multispectral imaging to the analysis of multiplexed immunohistochemistry and immunofluorescence in tissues, along with a quantitative system for scoring and analyzing these studies at a cytometric level on surgical pathology slides. These efforts have been recognized by the national funding agencies of the NIH and DOD as well as through industry-sponsored projects.

   The Human Protein Atlas: A Digital Tool for Pathology and Clinical Medicine

Fredrik Pontén1

1Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden. E-mail: [email protected]

Fredrik Pontén, Professor at Uppsala University, is a board-certified senior physician and specialist of Anatomical Pathology. Prof. Pontén is co-founder and the Vice-Program and Clinical Director of the Human Protein Atlas. His research is focused on gene expression profiling in normal and cancer tissues with emphasis on translational medicine. Prof. Pontén gave a talk entitled “The Human Protein Atlas - a digital tool for pathology and clinical medicine.”

   Invited Speakers

   Does Pathologist Input for AI Apps Matter?

Liron Pantanowitz1

1Department of Pathology, UPMC University of Pittsburgh, Pittsburgh, Pennsylvania, USA. E-mail: [email protected]

This is an exciting but unnerving time in pathology, in which we are witnessing the emergence of AI algorithms that can accurately analyze and interpret pathology images. Several academic computational biologists and vendors, including many AI start-up companies, are now focused on AI. Pathologists are accordingly being increasingly asked to participate in co-developing these algorithms for digital pathology. The intent of this talk is to address some of the key questions about developing computational pathology algorithms from a pathologist's perspective. These questions include: (1) Do we use a traditional machine learning or a deep learning approach? (2) How much data is needed to build a good algorithm? (3) Why and who needs to annotate images? (4) How does one get other pathologists to use the algorithm? (5) Does the app need regulatory clearance if it is intended for clinical use?

Dr. Liron Pantanowitz is a Professor of Pathology and Biomedical Informatics at the University of Pittsburgh in the USA. He is the Director of Pathology Informatics and Cytopathology at the University of Pittsburgh Medical Center. Dr. Pantanowitz is an Editor-in-Chief of the Journal of Pathology Informatics. He is a member of the Association for Pathology Informatics council, College of American Pathologist's digital pathology committee, and Digital Pathology Association board of directors. He is well published and has written several textbooks in informatics, including digital pathology.

   Digital Image Analysis in Breast Pathology to Facilitate Personalised Medicine

Johan Hartman1

1Department of Pathology, Karolinska Institutet, Solna, Sweden. E-mail: [email protected]

Malignant cell proliferation remains a paramount prognostic indicator in breast cancer. Moreover, cancer cell proliferation is associated with therapeutic response. Tumor cell proliferation can be detected by direct assessment of mitotic figures or of associated biomarkers. The most frequently used biomarker in routine diagnostics and research is Ki67, but its assessment is hampered by poor reproducibility. We have earlier shown that Ki67 analysis is dramatically improved by digital image analysis. Moreover, digital image analysis can stratify patients into subgroups of both prognostic and predictive relevance. Recent developments in machine learning enable the identification of relevant morphological patterns and even more sophisticated methods to stratify patients. In combination with cancer sequencing, these methods will be fundamental to facilitating precision medicine in oncology.
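As a minimal sketch of the kind of quantity such image analysis automates, the Ki67 proliferation index is the fraction of tumor nuclei staining positive, optionally dichotomized at a cutoff for stratification. The nuclear counts and the 20% cutoff below are illustrative assumptions, not values from the abstract:

```python
def ki67_index(positive_nuclei, negative_nuclei):
    """Ki67 proliferation index: fraction of tumor nuclei staining positive."""
    total = positive_nuclei + negative_nuclei
    if total == 0:
        raise ValueError("no nuclei counted")
    return positive_nuclei / total

def stratify(index, cutoff=0.20):
    """Dichotomize into low/high proliferation; the cutoff is illustrative."""
    return "high" if index >= cutoff else "low"

# Hypothetical nuclear counts produced by an image analysis pipeline:
idx = ki67_index(positive_nuclei=340, negative_nuclei=660)
print(f"Ki67 index: {idx:.1%} -> {stratify(idx)}")  # Ki67 index: 34.0% -> high
```

The reproducibility gain described above comes from replacing a visual estimate of this fraction with exhaustive, deterministic nuclear counts over the whole region of interest.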

Johan Hartman, MD, PhD, is a breast pathologist at Karolinska University Laboratory, Stockholm, Sweden. He is a board member of the Stockholm Medical Biobank and leads the task force in precision medicine at the personalised medicine program at Karolinska Institutet. He leads the national quality and standardisation committee in breast pathology (KVAST-bröst), consisting of expert breast pathologists with responsibility for composing national guidelines.

Dr Hartman's research team performs translational breast cancer research with focus on precision medicine. This includes digital image analysis, clinical cancer sequencing and patient-derived ex vivo models.

   Artificial Intelligence: What it Means for Pathology from a Pathologist's Perspective

Keith Kaplan1

1E-mail: [email protected]

Recent advances in artificial intelligence (AI) show great promise in several fields of medicine. The field of deep learning is enabling the development of expert-level automated algorithms. AI is simply a tool, not a replacement for a pathologist. Deep learning can be applied to numerous diagnostic tasks and defines a new era in digital pathology. This session will focus on advancements in AI in surgical pathology and what the use of this technology means for the future of the practice of pathology. Objectives for this session: (1) understand basic definitions of AI and deep learning, (2) review applications and use cases to date within medicine and pathology, (3) discuss implications of AI for the practice of pathology, and (4) highlight potential opportunities and weaknesses of AI within pathology.

Dr. Kaplan is a native of Chicago and a graduate of Northwestern University Feinberg School of Medicine. He completed residency training in anatomic and clinical pathology at Walter Reed Army Medical Center, Washington, DC. Dr. Kaplan was an Associate Professor of Pathology at Mayo Medical School and a Senior Associate Consultant at the Mayo Clinic, Rochester, MN. Since 2007, Dr. Kaplan has been the publisher of the industry's leading pathology blog.

   Machine Learning and Software Infrastructure for Prognostication and Discovery in Pathology

Lee Cooper1

1Department of Biomedical Informatics, Emory University School of Medicine, Atlanta, Georgia. E-mail: [email protected]

Predicting the future course of a patient's disease is critical in choosing therapy and in helping patients to plan their lives. Despite the rich data produced by genomic and imaging platforms, the accuracy of prognoses for patients diagnosed with cancer can be highly variable, often relying on a handful of molecular biomarkers or subjective interpretations of histology.

In this talk, Dr. Cooper will discuss recent advances in combining conventional survival models with deep learning techniques to build machines that can predict patient survival from histology and genomics. He will also discuss how open-source platforms for digital pathology can play an important role in facilitating the development of the next generation of digital pathology algorithms, and can overcome challenges in satisfying the need for training data.
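One common way to combine conventional survival models with deep learning, as described above, is to train a network to output a per-patient risk score that minimizes the negative Cox partial log-likelihood. A minimal NumPy sketch of that loss (no handling of tied event times; the three-patient cohort is invented for illustration):

```python
import numpy as np

def neg_cox_partial_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood, without tie handling.

    risk  : predicted log-risk score per patient (e.g. a network output)
    time  : follow-up time per patient
    event : 1 if the event (e.g. death) was observed, 0 if censored

    Only observed events contribute; each event's risk score is compared
    against the risk set of patients still under observation at that time.
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    ll = 0.0
    for i in np.where(event == 1)[0]:
        at_risk = time >= time[i]  # patients still at risk at event time i
        ll += risk[i] - np.log(np.sum(np.exp(risk[at_risk])))
    return -ll

# Toy cohort: assigning higher risk to the earlier deaths yields a lower loss.
loss = neg_cox_partial_log_likelihood(
    risk=[2.0, 1.0, 0.0], time=[1.0, 2.0, 3.0], event=[1, 1, 0])
print(round(loss, 4))  # 0.7209
```

In deep survival models this loss is back-propagated through the network producing the risk scores, so histology-derived features are trained directly against observed outcomes, including censored ones.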

Lee Cooper, PhD, is an Assistant Professor of Biomedical Informatics and Engineering at the Emory University School of Medicine and Georgia Institute of Technology in Atlanta, Georgia, USA. His research focuses on machine-learning methods for predicting clinical outcomes from integrated histology and genomic data, and on the development of open-source software infrastructure that allows clinicians and investigators to interact with complex pathology datasets and learning algorithms.

   Computational Pathology as a Companion Diagnostic: Implications for Precision Medicine

Anant Madabhushi1

1Department of Biomedical Engineering, Case Western Reserve University, Cleveland, Ohio, USA. E-mail: [email protected]

With the advent of digital pathology, there is an opportunity to develop computerized image analysis methods not just to detect and diagnose disease from histopathology tissue sections, but also to attempt to predict risk of recurrence, disease aggressiveness and long-term survival. At the Center for Computational Imaging and Personalized Diagnostics, our team has been developing a suite of image processing and computer vision tools specifically designed to predict disease progression and response to therapy via the extraction and analysis of image-based “histological biomarkers” derived from digitized tissue biopsy specimens. These tools would serve as an attractive alternative to molecular-based assays, which attempt to perform the same predictions. The fundamental hypotheses underlying our work are that:

1) the genomic expressions detected by molecular assays manifest as unique phenotypic alterations (i.e. histological biomarkers) visible in the tissue; 2) these histological biomarkers contain information necessary to predict disease progression and response to therapy; and 3) sophisticated computer vision algorithms are integral to the successful identification and extraction of these biomarkers. We have developed and applied these prognostic tools in the context of several different disease domains including ER+ breast cancer, prostate cancer, Her2+ breast cancer, ovarian cancer, and more recently medulloblastomas. For the purposes of this talk I will focus on our work in breast, prostate, rectal, oropharyngeal, and lung cancer.

Dr. Anant Madabhushi is the Director of the Center for Computational Imaging and Personalized Diagnostics (CCIPD) and the F. Alex Nason Professor II in the Department of Biomedical Engineering, Case Western Reserve University. His team has developed pioneering computer-aided diagnosis, pattern recognition, and image analysis tools for the diagnosis and prognosis of different types of cancer (prostate, breast, medulloblastoma, oropharyngeal) based on quantitative and computerized histomorphometric image analysis of digitized histologic biopsy tissue specimens. This novel approach involves quantitatively mining the histologic image data for hundreds of image features via sophisticated image segmentation, feature extraction, machine learning and pattern recognition methods and then predicting the risk of disease recurrence and patient prognosis. His group has also pioneered new ways of combining histomorphometric imaging features with “omics”-derived biomarkers for improved and integrated prediction of cancer outcome. His team has published over 140 peer-reviewed journal papers and over 160 peer-reviewed conference papers (Google H-Index=46) and has over 70 patents awarded or pending.

   Deep Learning and Augmented Reality May Bridge the Gap between Conventional and Digital Pathology

Peter Hufnagl1

1Department of Digital Pathology and IT, Charité University Hospital Berlin, Berlin, Germany. E-mail: [email protected]

The use of conventional light microscopes and digital pathology seem to be mutually exclusive. On the one hand, complete digitization offers many advantages. On the other hand, the digital workflow requires the timely and complete scanning of all slides. This requires not only a change in the pathologist's workflow, but also a corresponding investment that many institutions shy away from. Can one still benefit from digitization while retaining the conventional way of working? Yes, there are several possibilities. Some institutions practice a successful coexistence of conventional and digital working methods; as a rule, cases are assigned according to indication or personnel. Of particular interest is the augmentation of conventional microscopy through the online creation of WSI, deep learning and augmented reality. For this purpose, the light microscope must first be supplemented with a fast camera. The resulting images can either be used to create a WSI in the background, for example to obtain a second opinion, or be analyzed immediately on the computer with the results reflected live into the light microscope. Thus, mitoses can be counted or tumor areas visualized. Such a retrofit of a microscope is comparatively inexpensive compared to a complete conversion to WSI. The advantage: a smooth transition from conventional to digital becomes possible.

Peter Hufnagl, PhD, is head of Digital Pathology at the Charité – Universitätsmedizin Berlin, Germany. His research interest is in combining different technologies with the aim of helping patients individually. This includes conventional image analysis and telemedical systems as well as biobanking and the use of artificial intelligence. Since 2017, Dr. Hufnagl has also been head of the Center for Biomedical Image and Information Processing (CBMI) with a focus on the application of artificial intelligence.

   The Promise of Computational Pathology

Nasir Rajpoot1

1Department of Computer Science, University of Warwick, Coventry, UK. E-mail: [email protected]

The human brain is fantastic at recognising people and objects and building an understanding of the natural world around us. However, the visual cortex is not ideal at objectively measuring what we see, and complex spatial patterns hidden in plain sight sometimes cannot be deciphered by the naked eye. Computational Pathology is an emerging discipline concerned with the study of computer algorithms for understanding disease from the analysis of digitised histology images. I will show some snippets of computational pathology research in my group to demonstrate the value of analytics of information-rich whole-slide images (the so-called Big Cancer Image Data) for cancer diagnosis and prognosis. I will show examples of how histological motifs extracted from digital pathology image data are likely to lead to patient stratification for precision medicine. I will conclude with some of the main challenges facing digital pathology research.

Nasir Rajpoot is Professor in Computational Pathology at the University of Warwick, where he started his academic career as a Lecturer (Assistant Professor) in 2001.

At Warwick, he has been the founding Head of the Tissue Image Analytics (TIA) lab since 2012. He has also held an Honorary Scientist position at the Department of Pathology, University Hospitals Coventry & Warwickshire NHS Trust since 2016. The focus of current research in his lab is on developing novel computational pathology algorithms with applications to computer-assisted grading of cancer and image-based markers for prediction of cancer progression and survival. Prof. Rajpoot has been active in the digital pathology community for almost a decade, having co-chaired several meetings in the histology image analysis (HIMA) series since 2008 and served as a founding PC member of the SPIE Digital Pathology meeting since 2012. He was the General Chair of the UK Medical Image Understanding and Analysis (MIUA) conference in 2010, and the Technical Chair of the British Machine Vision Conference (BMVC) in 2007. He has guest edited a special issue of Machine Vision and Applications on Microscopy Image Analysis and its Applications in Biology in 2012, and a special section on Multivariate Microscopy Image Analysis in the IEEE Transactions on Medical Imaging in 2010. He is a Senior Member of the IEEE and a member of the ACM, the British Association for Cancer Research (BACR), and the European Association for Cancer Research (EACR). Prof. Rajpoot was recently awarded the Wolfson Fellowship by the UK Royal Society and the Turing Fellowship by the Alan Turing Institute, the UK's national data science institute. He will be chairing the European Congress on Digital Pathology (ECDP) at Warwick in 2019.

   Computational Pathology and Artificial Intelligence

Francesco Ciompi1

1Computational Pathology Group, Radboud University Nijmegen, Nijmegen, Netherlands. E-mail: [email protected]

Computational Pathology embodies the synergy of Digital Pathology, Medical Image Analysis, Computer Vision, and Machine Learning. The huge amount of information and data available in multi-gigapixel histopathology images makes digital pathology the perfect use case for advanced image analysis techniques. For this reason, deep learning and artificial intelligence have successfully powered computational pathology research in recent years. In this talk, I will present some of our recent research results in deep learning and computational pathology and discuss open challenges.

Dr. Francesco Ciompi is a senior researcher in Computational Pathology at Radboud University Medical Center, Nijmegen (Netherlands). He received the Master's degree in Electronic Engineering from the University of Pisa in July 2006 and the Master's degree in Computer Vision and Artificial Intelligence from the Autonomous University of Barcelona in September 2008. In July 2012 he obtained the PhD (cum laude) in Applied Mathematics and Analysis at the University of Barcelona, with a thesis on “Multi-Class Learning for Vessel Characterization in Intravascular Ultrasound”. In February 2013 he joined the Autonomous University of Barcelona as postdoctoral researcher, working on machine learning for computer vision and large-scale image classification and retrieval. From September 2007 to September 2013 he was also member of the Computer Vision Center. From October 2013 to September 2015, he was a postdoctoral researcher at the Diagnostic Image Analysis Group of Radboud University Medical Center, Nijmegen. His research focuses on deep learning for the analysis of medical images in cancer research, with a particular focus on development of prognostic and predictive imaging biomarkers based on histopathology image analysis.

   Deep Neural Networks and Cloud Computing for Automated Histopathologic Analysis of Lung Tumors

Sami Blom1

1Aiforia Technologies Oy, Helsinki, Finland. E-mail: [email protected]

Cancer remains a leading cause of death worldwide. A variety of genetically engineered mouse models (GEMMs) are used in efforts to elucidate molecular events, therapeutic targets, and new treatment modalities. A well-known GEMM for studying lung cancer is the KRAS(G12D), p53(+/-) conditional model, which captures characteristic features of the human disease, importantly much of the histologic architecture, in particular intratumoral heterogeneity. The intratumoral heterogeneity and the efficiency of generating new GEMMs also increase the need for automated histopathological analysis that would offer consistent and fast analysis as well as widely available common standards for the classification of tumor features. We trained a deep neural network (dNN) on the Aiforia® Cloud platform to detect and quantify tumor burden and to classify tumor grades at cell-level resolution in whole-slide H&E images of the KRAS(G12D), p53(+/-) GEMM. The results were validated against an independent set of slides that were visually analyzed by experts. The combination of dNN and cloud computing on Aiforia® Cloud enables automated and quantitative analysis of histologic features in H&E images at a scale unachievable using the currently available cadre of veterinary pathologists or traditional analysis tools.

Sami Blom is the application manager at Aiforia Technologies Oy. He has a background in the fields of translational cancer research and in vitro diagnostics, and has specialized in the development of tissue staining and imaging methods for solid tumours. Sami holds an M.Sc. degree in biochemistry from the University of Turku, Finland.

   Mobile Phone and Handheld Microscopy for Global Health Applications

Isaac Bogoch1

1Department of Medicine, University of Toronto, Toronto, Canada. E-mail: [email protected]

There is limited laboratory infrastructure and laboratory capacity in low-income countries, especially in rural areas. Infectious and non-infectious diseases are more common in these settings and preferentially affect the poorest of the poor. Novel solutions to improve the quality of clinical and Public Health care delivery are required. Mobile phone and handheld microscopy may enable diagnostic support in such settings. These devices have several attractive attributes in that they are low cost, battery powered, and portable. More sophisticated equipment can harness technological features such as image recognition for automated diagnoses, and GPS monitoring to map diseases in a region. This talk focuses on the development and implementation of mobile phone and handheld microscopes for use in clinical and Public Health settings in Africa that are endemic for infectious diseases, such as malaria, schistosomiasis, and other worm infections. The strengths and limitations of diagnostic devices, in addition to future directions will be discussed in detail.

Dr. Isaac Bogoch is an Assistant Professor at the University of Toronto in the Department of Medicine, and is an Infectious Diseases consultant and General Internist at the Toronto General Hospital. Dr. Bogoch divides his clinical and research time between Canada and several countries in Africa and Asia. He collaborates with a team that models the spread of emerging infectious diseases. In addition, Dr. Bogoch studies innovative and simple diagnostic solutions to improve the quality of care in resource-constrained settings, including implementing mobile phone and handheld microscopy for clinical and public health applications.

   Beyond Classification: Deep Learning for Computational Microscopy in Digital Pathology

Yair Rivenson1

1UCLA Henry Samueli School of Engineering and Applied Science, University of California, Los Angeles, California, USA. E-mail: [email protected]

In recent years, deep learning has redefined the state-of-the-art results for diagnosis and classification tasks in digital pathology. Here, we demonstrate the application of deep learning to enhancing microscopic imaging for digital pathology, with some unique challenges and opportunities that this framework brings. Amongst the applications, we will demonstrate enhancement of benchtop microscope images, extending their resolution, depth of field and field of view. Another application is the extension of this framework to mobile, smartphone-based microscopy, where deep learning enables users to match the imaging performance of a laboratory-grade benchtop microscope with a cost-effective smartphone microscope. The deep network essentially learns to eliminate spectral distortions, increase signal-to-noise ratio and enhance resolution, even for highly compressed images, and can be used in devices deployed in low-resource settings. We'll also discuss the application of deep learning to coherent imaging systems, recovering both the amplitude and phase of an object from a single diffraction pattern. Finally, we'll demonstrate the application of virtual histopathology staining, where a deep network can learn how to digitally stain a single, label-free (unstained) autofluorescence image to match the same tissue section as it would appear if chemically stained (for example, using H&E or Masson's Trichrome stains) and imaged using a brightfield microscope. All these results will be demonstrated on thin tissue sections as well as blood and Pap smears. These results establish the potential of deep learning as a promising framework for multimodal computational microscopy in digital pathology.

Yair Rivenson is a postdoctoral researcher in Electrical and Computer Engineering at the University of California, Los Angeles (UCLA). His research aims at the development of intelligent biomedical computational imaging and sensing platforms, specifically applying novel mathematical frameworks, such as deep learning, with numerical-physical systems modeling. Dr. Rivenson has recently focused his research on deep learning-based approaches, which can be used as a new framework for computational microscopy, significantly enhancing the performance of standard and mobile microscopic modalities.

   Point-of-Care Diagnostics of Cancer and Infectious Diseases with AI-Supported Mobile Microscopy

Johan Lundin1,2

1Institute for Molecular Medicine Finland – FIMM, University of Helsinki, Helsinki, Finland, 2Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden. E-mail: [email protected]

The aim of our current studies is to assess the feasibility and diagnostic accuracy of mobile digital microscopy scanners combined with artificial intelligence for improving access to diagnostics. Our research group has developed a mobile, miniature whole-slide scanner that can be used for point-of-care (POC) diagnostics, especially in low-resource and expert-poor settings. The mobile microscope is wirelessly connected for remote diagnostic support, either by computer vision or human expert assessment. We have initiated a series of proof-of-concept studies of mobile whole-slide microscopy combined with artificial intelligence, in both high- and low-resource settings, for cancer and infectious disease diagnostics. The current talk describes a POC diagnostic platform for image-based detection of malaria parasites in blood smears, helminth eggs in faecal smears and dysplastic cells in cervical smears. Experiences are shared from the POC studies in Finland, Tanzania and Kenya. Current challenges of point-of-care pathology will be discussed, and some visions presented for future opportunities of mobile pathology.

Johan Lundin, MD, PhD is a Research Director at the Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Finland and a Professor of Medical Technology at Karolinska Institutet, Stockholm. His overall research aim is to study the use of digital technologies and artificial intelligence for improving diagnostics and care of the individual patient. In addition to this research, Dr. Lundin has, together with his research group at FIMM and researchers at Karolinska Institutet, developed technologies for diagnostic decision support, for example cloud-based and mobile solutions that allow the diagnostic process to be performed using automated computerized analysis. Dr. Lundin gave an overview of how digital methods can enable task-shifting and improve access to diagnostics at the point of care, in both high- and low-resource settings.

   Digital Imaging Adoption Model: A Joint Initiative of HIMSS and the European Society of Radiology to Promote Advanced IT Solutions in Medical Imaging

Peter Mildenberger1

1University Medical Center Mainz, Mainz, Germany. E-mail: [email protected]

IT support in radiology is well established, and many different tools and systems are available; guidance is therefore helpful for planning and decision-making when implementing imaging informatics. Jointly developed by HIMSS Analytics® and the European Society of Radiology (ESR), the Digital Imaging Adoption Model (DIAM) helps to evaluate the maturity of IT-supported processes in medical imaging in both hospitals and diagnostic centres. This eight-stage maturity model drives organisational, strategic and tactical alignment towards imaging-IT maturity. With its standardised approach, it enables regional, national and international benchmarking. Participants receive a gap analysis with detailed action items for future investment decisions. In addition, DIAM provides an authoritative basis for presentation to the management level.

Peter Mildenberger, MD, is an associate professor for Radiology at the University Medical Center in Mainz (Germany). Besides a clinical focus on diagnostic imaging in cardio-vascular, emergency and uro-radiology, he is very much engaged in imaging informatics. His current activities focus on structured reporting and machine learning in Radiology. He is also very active in several organisations, e.g. as chair of the ESR subcommittee on “Professional Issues and Economics in Radiology” and chair of the joint RSNA-ESR “Template Library Advisory Panel”. He also represents ESR in the DICOM Standards Committee and in IHE.

   Deep Digital Convergence: Pathology, Radiology and Molecular Medicine in the 21st Century

Harry B. Burke1

1Department of Medicine, F. Edward Hébert School of Medicine, Uniformed Services University, Bethesda, Maryland, USA. E-mail: [email protected]

For most of the 20th century, pathology, radiology, and molecular medicine were independent clinical domains. Each existed in its own ocular realm – radiologists looked at structural and functional anatomic images, pathologists looked at tissue images, and molecular medicine looked at biochemical false-color microarrays. In the 21st century, the information in these domains has become, or is rapidly becoming, digital. The transition from the ocular to the digital has created electronic data types that, for many medical activities, allow the convergence of these three information levels of analysis to create a new integrated information paradigm, one that comprises digital radiology, digital pathology, and digital molecular medicine (D3RPM). D3RPM is an integrated multi-dimensional model that significantly extends the power of clinical medicine. Each dimension contributes orthogonal information, and together they are greater than the sum of their parts. The result is a multilevel structural, multicellular, and molecular model of disease. Together, they are the core of precision medicine. But I have gotten ahead of myself; we have not yet integrated these three information levels of analysis into a coherent information system, and we have not yet integrated this system into clinical medicine.
In order to clinically operationalize D3RPM we will have to build electronic clinical decision support systems (CDSS) that: (1) function in the medical domains of risk/prevention, diagnosis, and prognosis/treatment, (2) contain powerful predictive analytics for accurate individualized patient predictions and include deep learning algorithms within and across patients that improve performance, (3) construct models using D3RPM information and other relevant clinical data, (4) acquire and integrate into these models new clinical information acquired in real time during the clinician-patient encounter, and (5) effectively communicate to the clinician and patient, in real time, the model-generated patient-centric clinical information necessary for improved shared decision-making. This approach has the potential to move precision medicine from an interesting idea at the bench to a powerful clinical instrument at the bedside.

Harry B. Burke, MD, PhD, was awarded medical and doctoral degrees by the University of Chicago. He is a Professor of Medicine at the Uniformed Services University of Health Sciences, Bethesda, MD, USA. His research interests include advanced statistical modeling and predictive analytics, applied information theory, health informatics including clinical decision support systems and natural language processing, and the use of molecular biomarkers for clinical outcome prediction. He has an extensive publication record, he is a member of the Editorial Board of several journals, including the Journal of the National Cancer Institute and the American Cancer Society's journal Cancer, and he reviews for many high impact journals.

   Deep Learning for Detecting Tumour Infiltrating Lymphocytes in Testicular Germ Cell Tumours

Clare Verrill1

1Nuffield Department of Surgical Sciences, Oxford University, Oxford, England. E-mail: [email protected]

Machine learning, and deep learning in particular, has shown potential for expert-level accuracy in biomedical image classification. Through collaboration between Oxford University and the Finnish Institute of Molecular Medicine, a deep learning network was developed to identify and count tumour infiltrating lymphocytes (TILs) in testicular germ cell tumours as well as to predict disease relapse. Digitized haematoxylin-eosin (H&E) stained tumour whole slides from 89 patients with clinicopathological data were evaluated. Patients without testicular cancer relapse in general had higher TIL density in the primary tumour compared to patients who relapsed, and in seminomas none of the relapsed cases belonged to the highest TIL density tertile (2011/mm2, P=0.04, Fisher's exact test). TIL quantifications performed visually by three pathologists on the same tumours were not significantly associated with outcome. The average inter-observer agreement between the pathologists when assigning a patient into TIL tertiles was 0.32 (kappa), compared to 0.35 between the algorithm and the experts. A higher TIL density was associated with a lower clinical tumour stage, seminoma histology and lack of lymphovascular invasion at presentation. We show that deep learning-based image analysis can be applied to automated detection of TILs in H&E stained digitized samples of testicular germ cell cancer and that it has potential for use as a prognostic marker for disease relapse.
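The inter-observer agreement values quoted above are kappa statistics; for two raters assigning categorical labels, Cohen's kappa compares observed agreement with the agreement expected by chance. A minimal sketch (with made-up tertile assignments, not the study data):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    # Observed agreement: fraction of items given identical labels
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (observed - expected) / (1 - expected)

# Hypothetical TIL-density tertile assignments (1 = low, 2 = mid, 3 = high)
pathologist = [1, 2, 2, 3, 1, 3, 2, 1, 3, 2]
algorithm   = [1, 2, 3, 3, 1, 2, 2, 1, 3, 1]
print(round(cohens_kappa(pathologist, algorithm), 2))  # → 0.55
```

Kappa values around 0.32-0.35, as reported above, indicate only fair agreement, which is the point being made about visual tertile assignment.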

Clare Verrill, FRCPath is an associate professor of pathology with the Nuffield Department of Surgical Sciences at the University of Oxford, UK. She works partly as a diagnostic urological pathologist and partly as an academic pathologist and has her own research group. She is the UK National Cancer Research Institute Cellular Molecular Pathology (CM-Path) Initiative Workstream lead for Technology and Informatics. Her research interests include digital pathology applications within urological pathology, in particular in prostate and testis. She gave an overview of a collaborative project between Oxford University and the Finnish Institute of Molecular Medicine in which a deep learning tool was developed to quantify tumour infiltrating lymphocytes in testicular germ cell tumours.

   New European Medical Device Regulation: Impact on Slide Scanners and Image Analysis Solutions

David de Mena García1

1Andalusian Public Health System, Sevilla, Spain. E-mail: [email protected]

David De Mena Garcia graduated as a Telecommunications Engineer from the University of Seville, holds a Bachelor's degree in Philosophy from the Lateranense University of Rome and an MBA from the School of Industrial Organization, and holds a doctorate in Applied Sciences from the University of Castilla-La Mancha on DICOM and Digital Pathology Images. He has several specializations in Digital Health and Entrepreneurship from international universities and business schools. He is a member of DICOM WG26 and the IHE PaLM group, and an ISO 13485 internal auditor. He has been an invited speaker at the University of Seville as well as at different digital pathology schools. As an Innovation Projects Manager, he has developed his professional career in the central node of the Deputy Direction of ICT in the Andalusian Public Health System, advising on and preparing innovation projects in Information and Communications Technologies. He has participated in several national and European projects. Dr. Mena Garcia gave an overview of the new European medical device regulation.

   Digital Pathology in the Landscape of Interoperability, Regulations and Standards

Nick Haarselhorst1

1Digital Pathology Solutions, Philips. E-mail: [email protected]

Nick Haarselhorst is a Professional Service Consultant & Interoperability Architect within the Service Innovation group of Philips Digital Pathology Solutions. In this role he provides technical leadership with respect to interoperability and integration within digital pathology projects and is responsible for the final solution design. Besides that, he has contributed to many standardization efforts and has represented Philips Healthcare in standardization bodies such as IHE (Europe, Services, The Netherlands, and PaLM), DICOM and HL7. Nick Haarselhorst gave an overview of the current status of interoperability and the way forward in the coming years from a vendor perspective.

   Experiences of Digital Diagnostics in Practice

Gloria Bueno1

1University of Castilla-La Mancha, Ciudad Real, Spain. E-mail: [email protected]

The future paradigm of pathology will be digital. Instead of conventional microscopy, a pathologist will perform a diagnosis by interacting with images on computer screens and performing quantitative analysis. The fourth generation of virtual slide telepathology systems, so-called virtual microscopy and whole-slide imaging (WSI), has allowed for the storage and fast dissemination of image data in pathology and other biomedical areas. These novel digital imaging modalities encompass high-resolution scanning of tissue slides and derived technologies, including automatic digitization and computational processing of whole microscopic slides. Moreover, automated image analysis with WSI can extract specific diagnostic features of diseases and quantify individual components of these features to support diagnoses and provide informative clinical measures of disease. Therefore, the challenge is: first, to apply information technology and image analysis methods to exploit the new and emerging digital pathology technologies effectively, in order to process and model all the data and information contained in WSI; and second, to adopt the developed tools into clinical practice, which is still the major challenge. The final objective is to support the complex workflow from specimen receipt to anatomic pathology report transmission, that is, to improve diagnosis both in terms of pathologists' efficiency and with new information. This talk will present two tools developed by the VISILAB group and currently used in clinical practice and research at INCLIVA and HGUCR. The tools are based on both deep learning approaches and classical methods, and are dedicated to HER2 quantification and angiogenesis research, and to neuroblastoma analysis, respectively.

Dr. Gloria Bueno Garcia is Professor at the Engineering School of the University of Castilla-La Mancha, where she leads the VISILAB research group, dedicated to machine vision and artificial intelligence applications. She holds a PhD in Machine Vision from Coventry University, UK (1998). She has experience working as principal researcher in several research centres, such as the UMR 7005 research unit CNRS/Louis Pasteur Univ. Strasbourg (France), Gilbert Gilkes & Gordon Technology (UK) and CEIT San Sebastian (Spain). She is the author of 2 patents, 4 registered software packages and more than 80 refereed papers in journals and conferences, and received the runner-up award for the best PhD work on computer vision & pattern recognition from AERFAI and the 'Image File & Reformatting Software' Challenge Award. She has served as visiting researcher at Carnegie Mellon University (USA), Imperial College London (UK) and Leica Biosystems (Ireland). She is a Senior Member of IEEE and is affiliated with several societies relevant to the topic of ECDP, such as ESDIP. In the field of Digital Pathology she is the coordinator of the European AIDPATH project, 'Academia and Industry Collaboration for Digital Pathology', composed of 11 partners from the private and public sectors.

   Getting Pathology Pixels to Work

Arvydas Laurinavičius1

1Department of Pathology, Pharmacology and Forensic Medicine, Vilnius University, Vilnius, Lithuania. E-mail: [email protected]

Digital image processing technologies and analytics promise an outburst of new knowledge and practical implementations in tissue pathology. Microscopy slides contain abundant biological data which can be retrieved in multiple tissue staining and imaging modalities. Recent advances in artificial intelligence bring a new wave of innovation. While the digital technologies are maturing and new computational pathology analytics are being developed, a major effort is needed to promote clinical validation and implementation of the tools. As an example, our pilot implementation of comprehensive Ki67 immunohistochemistry analytics for routine breast cancer diagnosis revealed added value in quality assurance of the pathologist's evaluation of the proliferation index. The tool also provided advice on the “hottest” area of the tumor tissue based on hexagonal grid analytics; however, this aspect of the experiment led to discussions between the participating pathologists on the concept of the hot spot. Ironically, this concept is widely used but has multiple, often obscure definitions in research papers and clinical guidelines. We suggest that “subvisual” features such as hot spots, even if defined for “human use,” can hardly be validated and reproducibly evaluated by human observers. Instead, robust automated computational applications, validated against each other and against clinical outcomes, could promote clinical adoption of the decision support tools.
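To illustrate the grid-density idea behind automated hot-spot detection, here is a minimal sketch that bins positive-cell centroids into a grid and returns the densest cell. It is a simplification: it uses a square grid rather than the hexagonal grid described above, and hypothetical coordinates:

```python
from collections import Counter

def hottest_cell(points, cell=100.0):
    """Bin centroids (x, y) into a square grid of side `cell` and
    return (grid index, count) for the densest cell."""
    counts = Counter((int(x // cell), int(y // cell)) for x, y in points)
    return counts.most_common(1)[0]

# Hypothetical Ki67-positive nuclei centroids, in micrometres
pts = [(12, 15), (30, 40), (410, 420), (415, 430), (418, 425), (402, 401)]
print(hottest_cell(pts))  # → ((4, 4), 4)
```

As the abstract argues, the answer depends on the grid shape, cell size and tie-breaking rule, which is why a fixed, validated computational definition is preferable to visual hot-spot selection.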

Arvydas Laurinavicius, MD, PhD is Pathology Professor at Vilnius University and Director of the National Center of Pathology, affiliate of the Vilnius University Hospital Santara Clinics, Lithuania. His research focuses on digital image analytics to derive novel tissue pathology indicators for disease modelling. In particular, Prof. Laurinavicius has together with his research group at Vilnius University and researchers at Caen and Nottingham Universities developed methodologies for comprehensive digital immunohistochemistry analytics to empower information retrieval from routine IHC slides. Prof. Laurinavicius will share recent experiences in integrating potential decision support tools into diagnostic pathology workflow and perspectives on further developments needed to achieve robust clinical applications.

   A Simulation Model for Computational Histopathology Issued from the Study of Tumour Microenvironment: An Instantiation in Breast Carcinomas

Daniel Racoceanu1

1Department of Biomedical Image and Data Computing, Pontifical Catholic University of Peru, Lima, Peru. E-mail: [email protected]

Breast carcinomas are cancers that arise from the epithelial cells of the breast, which are the cells that line the lobules and the lactiferous ducts. Breast carcinoma is the most common type of breast cancer and can be divided into different subtypes based on architectural features and growth patterns, recognized during a histopathological examination. Tumor microenvironment (TME) is the cellular environment in which tumor cells develop. Being composed of various cell types having different biological roles, TME is recognized as playing an important role in the progression of the disease. The architectural heterogeneity in breast carcinomas and the spatial interactions with TME are not yet perfectly understood. Developing a spatial model of tumor architecture and spatial interactions with TME can advance our understanding of tumor heterogeneity. Furthermore, generating histological synthetic datasets can contribute to validating, and comparing analytical methods that are used in digital pathology. The model proposed here applies to different breast carcinoma subtypes and TME spatial distributions based on mathematical morphology. It is based on a few morphological parameters that give access to a large spectrum of breast tumor architectures and are able to differentiate in-situ ductal carcinomas (DCIS) and histological subtypes of invasive carcinomas such as ductal (IDC) and lobular carcinoma (ILC). In addition, a part of the parameters of the model controls the spatial distribution of TME relative to the tumor. The test of the model is performed by comparing morphological features between real and simulated images.

Daniel Racoceanu is Professor in Biomedical Image and Data Computing at the Pontifical Catholic University of Peru and holds a tenured professor position at Sorbonne University, Paris. His areas of competency are Medical Image Analysis, Pattern Recognition, and Machine Learning, with research mainly focused on Digital Pathology and its integrative aspects. He received his HDR (2006) and Ph.D. (1997) from the University of Franche-Comté, Besançon, France, and was Project Manager at General Electric Energy Products - Europe before joining the University of Besançon in 1999 as an Associate Professor and a Research Fellow at the FEMTO-ST Institute (French National Research Center - CNRS), Besançon, France. From 2005 to 2014, he participated in the creation and development of the International Joint Research Unit (UMI CNRS) Image & Pervasive Access Lab, serving as Director (from 2008 to 2014) of this joint research venture created between Sorbonne University, the French National Center for Scientific Research (CNRS), the National University of Singapore (NUS), the Agency for Science, Technology and Research (A*STAR), Univ. Grenoble Alpes and the Institut Mines-Telecom, in Singapore. From 2009 to 2015, he was Associate and then Full Professor (adjunct) at the School of Computing, National University of Singapore. Between 2014 and 2016, he was a member of the Executive Board of the University Institute of Health Engineering of Sorbonne Université, and co-Director and co-initiator of a new B.Sc. minor dedicated to Innovation in Public Health. During the same period, he led the Cancer Theranostics research team at the Bioimaging Lab, a joint research unit of Sorbonne Université, CNRS and INSERM (French National Institute of Health and Medical Research). He is vice-President of the European Society for Digital Integrative Pathology (ESDIP) and a member of the MICCAI (Medical Image Computing & Computer Assisted Intervention) Board of Directors.

   Why Not JPEG2000 for WSI Storage?

Jorma Isola1

1BioMediTech, University of Tampere, Tampere, Finland. E-mail: [email protected]

Whole slide images (WSIs, digitized histopathology glass slides) are large data files whose long-term storage remains a significant cost for pathology departments. With improved scanning resolutions (typically 0.2-0.25 um per pixel instead of 0.5) and the requirement to scan macroslides and z-stacks, WSI storage remains a significant problem, especially if WSI files are to be stored for decades. Most currently used WSI formats are based on a JPEG-type lossy image compression algorithm. While the advantages of the JPEG 2000 algorithm (JP2) are commonly recognized, its compression parameters have not been fully optimized for pathology WSIs. We developed a new parametrization of JPEG 2000 image compression, designated JP2-WSI, for use with histopathological WSIs. Our parametrization allows a very high degree of compression on the background part of the WSI while applying a conventional amount of compression to the tissue-containing part of the image, resulting in high overall compression ratios. When comparing the compression power of JP2-WSI to our previously used fixed 35:1 compression ratio JPEG 2000 and the default image formats of the proprietary Aperio, Hamamatsu, and 3DHISTECH scanners, JP2-WSI produced the smallest file sizes and highest overall compression ratios for all slides tested. The image quality, as judged by visual inspection and peak signal-to-noise ratio (PSNR) measurements, was equal to or better than that of the compared image formats. The average file size with JP2-WSI amounted to 15, 9, and 16 percent, respectively, of the file sizes of the three commercial scanner vendors' proprietary file formats (3DHISTECH MRXS, Aperio SVS, and Hamamatsu NDPI). In comparison to the commonly used 35:1 compressed JPEG 2000, JP2-WSI was three times more efficient. Conclusion: JP2-WSI allows very efficient and cost-effective data compression for whole slide images without loss of the image information required for histopathological diagnosis.
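The effect of region-dependent compression on the overall ratio can be made explicit: compressed sizes add, so the overall ratio is a harmonic combination of the per-region ratios weighted by area fraction. A sketch with hypothetical numbers (not the actual JP2-WSI parameters):

```python
def overall_ratio(tissue_fraction, tissue_ratio, background_ratio):
    """Overall compression ratio of a slide whose tissue and background
    regions are compressed at different ratios."""
    # Compressed size expressed as a fraction of the uncompressed size
    compressed = (tissue_fraction / tissue_ratio
                  + (1 - tissue_fraction) / background_ratio)
    return 1 / compressed

# Hypothetical slide: 30% tissue at a conventional 35:1,
# 70% background at an aggressive 1000:1
print(round(overall_ratio(0.30, 35, 1000), 1))  # → 107.9
```

The background term contributes almost nothing to the compressed size, so the overall ratio is dominated by the tissue fraction, which is how a region-aware parametrization can beat a fixed whole-image ratio.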

Jorma Isola, MD, PhD, is a board-certified pathologist and, since 1995, Professor of Cancer Biology at the University of Tampere, Finland. He has ~300 publications (PubMed) and 2 patents, and has conducted research on virtual microscopy since 2004.

   Short Abstracts

   Digital Breast Pathology in the NHS: Experience from an Innovative Validation and Training Pilot

Bethany Williams1,2, Andrew Hanby1,2, Rebecca Millican-Slater1, Anju Nijhawan1, Eldo Verghese1,2, Darren Treanor1,2

1Leeds TH NHS Trust, Leeds, United Kingdom, 2University of Leeds, Leeds, United Kingdom. E-mail: [email protected]

Background: Safe and successful use of digital microscopy in routine primary diagnosis is reliant on individual pathologists receiving adequate training and gaining sufficient experience of digital cases. We designed an innovative training and validation protocol to allow our pathologists to gain competence and confidence in live digital diagnosis, in a real-world, risk-mitigated environment. Methods: Three breast pathologists received basic digital microscope training and completed a set of training cases compiled to reflect the breadth and depth of breast histopathology and to provide exposure to challenging digital diagnostic scenarios. Following discussion of the training set, pathologists commenced live digital reporting. All breast diagnoses were made digitally, with immediate reconciliation against the glass slides before sign-out. Data on diagnoses, discrepancies, diagnostic confidence, and diagnostic modality preference were collected. After 2 months' experience of primary digital diagnosis with glass review, a validation document was produced for each pathologist, summarizing the cases viewed, overall concordance, overall preference for digital or glass, and any potential pitfalls for that individual pathologist on the digital microscope. Results: In the clinical reporting phase, 693 live cases were reported digitally, with glass verification prior to sign-out. Absolute concordance between digital and glass diagnoses was 98.2%. None of the discordant diagnoses on the digital microscope would have resulted in significant patient harm. Conclusions: Our pilot validation study demonstrates the importance of individualised validation for histopathologists for primary digital diagnosis. Our pathologists all reported high rates of satisfaction with digital microscopy, and all now report breast cases digitally as standard.

   Imaging and Machine Learning Methods for Assessing HPV In Situ Hybridisation Patterns in Oropharyngeal Carcinomas

Shereen Fouad1,2, Gabriel Landini1, Max Robinson3, David A. Randell1, Hisham Mehanna4

1School of Dentistry, University of Birmingham, Birmingham, United Kingdom, 2Faculty of Computing, Engineering and the Built Environment, Birmingham City University, Birmingham, United Kingdom, 3Centre for Oral Health Research, Newcastle University, Newcastle, United Kingdom, 4Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, United Kingdom. E-mail: [email protected]

Some Human Papillomavirus (HPV) strains are considered aetiological factors in oropharyngeal carcinomas. Interestingly, HPV+ neoplasms appear to have a different demographic profile and a relatively better prognosis than HPV- ones, making HPV status a relevant diagnostic and prognostic feature. Assessing HPV status by in situ hybridisation is particularly challenging due to the complexity of staining patterns, high-resolution imaging requirements and the presence of staining artefacts, all of which call into question the feasibility of imaging techniques for analysing such histological samples. We present a machine learning-based, automated imaging workflow for the identification of HPV status in digitized tissue microarray samples of oropharyngeal carcinomas. High-risk HPV genomes assessed with the INFORM HPV-III system (Ventana) revealed blue stained regions (NBT/BCIP) in epithelial nuclei on a pink background (Red Counterstain II). The contributions of the two dyes (plus a residual colour component) were determined using colour deconvolution. Blue regions were segmented using Renyi's entropy thresholding, while stain artefacts were identified by feature co-occurrence in the blue and residual channels. Morphological parameters of the segmented regions and clinical data were summarized and submitted to a set of supervised recognition classifiers. The evaluation of 695 cases (2085 TMA images, x20 magnification) using feature selection procedures in conjunction with a support vector machine classifier achieved an average of 90% accuracy in detecting HPV status when compared with the histopathologist scoring as the standard. In conclusion, imaging of in situ hybridisation patterns can provide automated means of screening HPV status in large datasets at known levels of accuracy.
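Colour deconvolution of this kind is typically done by Ruifrok-Johnston optical-density unmixing: pixel intensities are converted to optical densities, which combine linearly in the stain amounts, so a matrix inverse recovers the per-stain contributions. A numpy sketch with illustrative stain vectors (hypothetical values, not the calibrated vectors used in the study):

```python
import numpy as np

# Rows: optical-density vectors for a blue NBT/BCIP-like dye (absorbs
# red/green), a pink counterstain (absorbs green), and a residual
# channel. These numbers are illustrative only.
stains = np.array([
    [0.70, 0.65, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
stains /= np.linalg.norm(stains, axis=1, keepdims=True)
unmix = np.linalg.inv(stains.T)  # maps an OD vector to stain amounts

def deconvolve(rgb, i0=255.0):
    """Return per-stain contributions for one RGB pixel."""
    od = -np.log10(np.clip(rgb, 1, None) / i0)  # Beer-Lambert optical density
    return unmix @ od

# A blue-ish pixel loads predominantly on the first (blue) stain
conc = deconvolve(np.array([50.0, 60.0, 180.0]))
```

The same unmixing applied per pixel across a whole image yields the separate blue and residual channels in which segmentation and artefact detection are then performed.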

   A 4-Year Report on the Use of Digital Pathology to Teach Microscopy Classes

Sandrina Martens1, Yves Sucaet1,2, Silke Smeets1, Ilse Rooman1, Peter In't Veld1

1Vrije Universiteit Brussel, Jette, Belgium, 2Pathomation, Berchem, Belgium. E-mail: [email protected]

Background: Students can study glass slides under a microscope during classes, but they cannot repeat this process at home. Since 2014, we have gradually introduced digital pathology in various courses. We aimed to digitize the slides for our courses in histology and pathology, create digital content, and measure how the digitization was received by students. Methods: Slides were scanned, stored on a server and visualized using commercial software by Pathomation. Virtual slides with background information are available online; no data or software needs to be downloaded. QR codes were added to the manuals to view slides on a smartphone or tablet. A mobile app was created for iPhone and Android. Interactive quizzes were created in which virtual slides were combined with multiple-choice and open-ended questions. We administered a questionnaire to students to gauge their experiences with the technology. Results: In four years, we have seen a shift occur in digital pathology technology and in students' expertise with blended learning alike. In our recent questionnaire (n=81), nearly all (98%) students use the website, and 99% find it useful. However, most students prefer using both the website and the manual to study. While 47% of the students used the app, 22% indicated they did not know it existed. The QR codes were used by 28% of the students. Quizzes are used by 98% of the students, and 81% find them useful. Conclusions: Our data indicate a strong potential for digitizing microscopy content for educational purposes.

   Validation of Digital Frozen Island for Cancer and Transplant Intraoperative Services

Luca Cima1, Matteo Brunelli1, Antonietta D'Errico2, Luca Novelli3, Desley Neil4, Francesca Vanzo5, Alessandro Sorio1, Claudia Mescoli6, Massimo Rugge6, Vito Cirielli7, Mattia Barbareschi8, Giovanni Valotto1, Aldo Scarpa1, Albino Eccher1

1Department of Diagnostics and Public Health, University and Hospital Trust of Verona, Verona, Italy, 2Department of Specialised, Experimental and Diagnostic Medicine, Pathology Unit, S. Orsola-Malpighi University Hospital of Bologna, Bologna, Italy, 3Pathology Unit, Careggi University Hospital, Firenze, Italy, 4Department of Histopathology, Pathology Unit, Queen Elizabeth Hospital Birmingham, Birmingham, United Kingdom, 5Arsenàl, Veneto's Research Center for eHealth Innovation, Verona, Italy, 6Department of Medicine, Surgical Pathology and Cytopathology Unit, University and Hospital Trust of Padova, Padova, Italy, 7Department of Diagnostics and Public Health, Forensic Pathology Unit, University and Hospital Trust of Verona, Verona, Italy, 8Pathology Unit, S. Chiara Hospital, Trento, Italy. E-mail: [email protected]

Background: Whole-slide imaging (WSI) technology is used for primary diagnosis and consultation, including intraoperative frozen section (FS). We aimed to implement and validate a digital workstation for the FS evaluation of cancer and transplant biopsies following the recommendations of the College of American Pathologists. Methods: Routine FS cases were scanned at 20x magnification by the “Navigo” system and anonymized. Virtual intraoperative reports were drafted by a trained pathologist interpreting glass slides and, after a 3-week washout period, digital slides. The validity of WSI diagnosis was assessed with accuracy rate, intra-observer kappa coefficients, sensitivity, specificity and predictive values. Participants also completed a digital survey indicating times to scan and view per case, final microscope/WSI diagnosis, image quality, interface handling and problems encountered. Results: 121 cases (436 slides) were successfully scanned, including 93 oncological and 28 donor-organ FS biopsies. Full agreement with the glass-slide diagnosis was obtained in 90 of 93 (97%, k = 0.96) oncological and in 24 of 28 (86%, k = 0.91) transplant FSs. Two major and one minor discrepancy occurred for cancer FSs (sensitivity 100%, specificity 96%); average scanning and reporting times were 12 and 3 minutes, respectively. Two major and two minor disagreements occurred for transplant FSs (sensitivity 96%, specificity 75%); average scanning and reporting times were 18 and 5 minutes, respectively. A high diagnostic comfort level emerged from the survey. Conclusions: The “Navigo” digital workstation can reliably provide a routine WSI service for intraoperative FS, validating the integrated digital (WSI) and frozen (intraoperative) island.
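The agreement statistics reported above (accuracy, sensitivity, specificity and predictive values) all derive from a 2x2 confusion table of WSI calls against the glass-slide reference. A minimal helper (our own naming, shown for illustration only) is:

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard 2x2-table statistics for a binary diagnostic comparison."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }
```

For example, `diagnostic_metrics(9, 1, 89, 1)` (hypothetical counts) yields 98% accuracy and 90% sensitivity.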

   Cytologic Immune Cell-Based Risk Score for Patients with Malignant Pleural Effusions

Wu Chengguang1, Alex Soltermann1, Fabian Mairinger2, Ruben Casanova1

1Institute of Pathology and Molecular Pathology, University Hospital Zürich, Zurich, Switzerland, 2Universitätsklinikum Essen, Institut für Pathologie, Essen, Germany. E-mail: [email protected]

Background: Malignant pleural effusion (MPE) is a common clinical problem with no effective therapy. Immune cells and immune checkpoints in MPE play important roles in the tumor-associated microenvironment; however, their prognostic roles in MPE have not yet been well investigated. Methods: Cores from effusion cell blocks of 188 patients were assembled on hybrid cytology-tissue microarrays (C/TMA). For the assessment of PD-L1 IHC staining patterns, three different antibody clones (SP263, E1L3N, Quartett) were processed. In parallel, the expression of MPO, CD3, CD4, CD8, CD20 and CD68 was analyzed by IHC. Immunoreactivity was quantitatively counted and scored by pathologists and by computer. mRNA expression of 40 MPE lung adenocarcinoma cases was profiled with NanoString (770 genes). Data were correlated with clinico-pathologic parameters. Results: The MPE cohort comprised 155 patients (83 lung adenocarcinoma (54%), 33 breast carcinoma (21%), 19 mesothelioma (12%), 12 ovarian carcinoma (8%) and 8 gastro-intestinal carcinoma (5%)). The median survival (MS) of the cohort was 158 days, with a large spread of survival times across tumor entities. The lung adenocarcinoma group had a MS of only 107 days. The IHC and gene expression results revealed the discrepancy and heterogeneity of the MPE immune microenvironment. Conclusion: These data provide a better understanding of MPE immunity and emphasize the importance of personalized medicine for advancing anticancer therapy. This may aid clinical decision making in this diverse patient population.

   Automated Classification of Non-Small Cell Lung Cancer Histologic Subtypes by Deep Learning

Ruben Casanova1,2, Elvis Murina2, Martina Haberecker1, Anne-Laure Leblond1, Hanna Honcharova-Biletska1, Bart Vrugt1, Oliver Dürr2, Beate Sick2, Alex Soltermann1

1University Hospital Zurich, Zurich, Switzerland, 2Zurich University of Applied Sciences, Winterthur, Switzerland. E-mail: [email protected]

Background: Non-small cell lung cancer (NSCLC) encompasses a heterogeneous group of histological subtypes. Typically, adenocarcinoma (ADC) and squamous cell carcinoma (LSCC) are differentiated morphologically based on histologic features such as the formation of glandular structures for ADC and the presence of keratin and/or intercellular desmosomes for LSCC. Methods: In this study, we developed a convolutional neural network (CNN) to differentiate ADC from LSCC on a cohort of 208 NSCLC patients from the University Hospital of Zurich. Histologic slides were prepared from formalin-fixed, paraffin-embedded tumor blocks and stained with hematoxylin and eosin. Slides were scanned, and 50 frames (128x128 pixels) were prepared for each tumor at a spatial resolution of 2.3 μm/pixel. Results: The patient cohort was split into training (n=112 patients), validation (n=28) and test (n=68) sets. The performance of the CNN was evaluated on the test set, and the results were compared with the classification made by three independent pathologists. In total, 66/68 cases were correctly labeled using the average prediction score of the 50 frames for each patient. In addition, feature extraction from the final convolutional layers allowed histologic subgroups to be identified without relying on expert labels. Conclusions: Quantitative frameworks for histologic subtyping of cancer tissues using deep learning algorithms could serve as analytic companion tools to support pathologists in routine diagnostic tasks.
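The patient-level call described above averages the 50 per-frame prediction scores before thresholding. A minimal sketch of that aggregation step (threshold value and label handling are our assumptions, not stated in the abstract):

```python
import numpy as np

def patient_label(frame_scores, threshold=0.5):
    """Aggregate per-frame scores for the ADC class into one
    patient-level label by thresholding their mean."""
    return "ADC" if float(np.mean(frame_scores)) >= threshold else "LSCC"
```

Averaging over frames smooths out individual tiles that land on stroma or ambiguous morphology, which is why the patient-level accuracy (66/68) can exceed frame-level accuracy.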

   Comparison of Automated and Supervised Ki67 Digital Image Analysis in Breast Cancer Whole Slide Images

Anna Bodén1,2, Rebecca Halas3, Stina Garvin1,2, Jesper Molin4, Darren Treanor1,2,3,5

1Department of Clinical and Experimental Medicine, Division of Neurobiology, Faculty of Health Sciences, Linköping University, Linköping, Sweden, 2Department of Clinical Pathology, Center for Diagnostics, Region Östergötland, Linköping, Sweden, 3Leeds Teaching Hospitals NHS Trust, Leeds, UK, 4Sectra AB, Linköping, Sweden, 5University of Leeds, Leeds, UK. E-mail: [email protected]

Digital image analysis (DIA) is becoming increasingly common within routine pathology practice. DIA systems typically perform well in controlled studies, but in routine practice their performance remains largely unknown. This study investigates the performance of a semi-automatic nuclear algorithm that has been used to quantify Ki67 within the clinical routine in Linköping, Sweden since 2016. In an initial pilot study, 20 areas were retrospectively extracted from analysed Ki67 hot spots on digitized breast cancer whole slides, where the algorithm had been applied and manually corrected (supervised) by a breast pathologist in routine work. A ground truth was established for each extracted area and compared to the automated and the supervised scores. As a reference, an additional eyeballing score was produced. Thus, three scoring methodologies were compared: automatic DIA, supervised DIA and eyeballing. The results from the pilot study show the following average Ki67 index error rates: automatic DIA 4.0% (SD: 3.7), supervised DIA 2.4% (SD: 3.0) and eyeballing 8.0% (SD: 5.2). A trend is observed towards less variability between the supervised method and the ground truth compared to the other methodologies, but this needs to be confirmed in a larger study. The study will be expanded with additional images and pathologists using a web-based user interface. In addition, we will study the effect of algorithm accuracy and efficacy. The pilot study, as described in the presentation, constitutes a model of how to investigate the performance of clinical DIA systems and the effects of pathologist supervision.
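The error rates quoted are average absolute deviations of the Ki67 index from the ground truth, in percentage points. A sketch of that computation (our own formulation; the study's exact error definition may differ):

```python
import numpy as np

def ki67_index_error(scores, ground_truth):
    """Mean and sample SD of the absolute Ki67-index error,
    in percentage points."""
    err = np.abs(np.asarray(scores, float) - np.asarray(ground_truth, float))
    return float(err.mean()), float(err.std(ddof=1))
```

For example, scores of 10%, 20% and 30% against ground truths of 12%, 24% and 30% give a mean error of 2.0 percentage points.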

   Multiplexing: Next-Generation Immunohistochemistry

Torben Steiniche1, Jeanette Georgsen1, Kristina Lauridsen1, Patricia Switten Nielsen1

1Department of Pathology, Aarhus University Hospital, Aarhus, Denmark. E-mail: [email protected]

Introduction: Immunohistochemistry (IHC) is used in almost every cancer case to determine its type and treatment. Usually only one biomarker is visualized and manually reviewed at a time. Accordingly, a more cost- and time-efficient method that preserves tissue seems desirable. Instead of conventional IHC with chromogens, fluorescent dyes enable multiplexing of numerous biomarkers. Because cells are more easily distinguished, the accuracy of image analysis also increases. A profound drawback of fluorescence is, however, the loss of morphology and tissue context. To address this, the study aimed to develop simple multiplexing panels with 3-5 fluorescent markers and to digitally convey their signals on an H&E stain. Materials and Methods: On sections from formalin-fixed, paraffin-embedded tissue, an indirect, sequential IHC procedure was utilized. Primary antibodies were visualized with fluorescent dyes, and nuclei were counterstained with DAPI. Sections were then re-stained with H&E, and the whole slide images were aligned as virtual multiple stains. Results: Biomarkers were successfully multiplexed on a fully commercial IHC platform. By visual inspection, signals were very distinct and displayed sufficient sensitivity and specificity. Multiple multiplexed stains were aligned and their signals visualized on an H&E stain. Discussion and Conclusion: Virtual combination of fluorescent panels with H&E stains proved feasible to perform and assess. In time, they may become alternatives to conventional IHC in both routine pathology and research studies. Importantly, multiplex imaging that includes morphologic and contextual information may accurately define the suppressive mechanisms within the tumor microenvironment, which may guide novel immunotherapy.

   Molecular Tissue Characterisation with Mass Spectrometry Imaging: Data Analysis and Visualisation

Judith M. Fonville1, Claire L. Carter2, Elaine Holmes1, Josephine Bunch2

1Imperial College London, United Kingdom, 2University of Birmingham, Birmingham, United Kingdom. E-mail: [email protected]

Matrix-assisted laser desorption/ionization (MALDI) mass spectrometry imaging (MSI) provides localized information about the molecular content of a tissue sample. MSI has a great, but as yet not fully realized, potential for diagnostics and research in biology and pathology. The MSI methodology generates a series of mass spectra from discrete sample locations. To derive reliable conclusions from these hyperspectral MSI data, appropriate data processing steps are needed. We present an approach to select biologically relevant peaks and pixels, remove the influence of the applied MALDI matrix, and normalise peak intensities across the different pixels. This robust data processing approach is demonstrated on MALDI MSI of a sagittal rat brain section. After data processing, MSI data are often analysed by visually interpreting images of individual molecules of interest, which is labour-intensive and likely to miss associations between different biomolecules. Instead, we propose to leverage the multidimensional nature of the data set, and developed an intuitive colour-coding scheme based on hyperspectral modelling to generate a single overview image of the complex data set. This visualization strategy was applied to the results of principal component analysis, self-organizing maps and t-distributed stochastic neighbour embedding. Our automated data processing, modelling and display of MSI data allows both the spatial and molecular information to be visualized intuitively and effectively. Applications include biomarker profiling, preclinical drug distribution studies, and studies addressing the underlying molecular mechanisms of tissue pathology.
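One simple instance of such hyperspectral colour-coding maps each pixel's spectrum onto its first three principal components and uses those as RGB channels. The sketch below is our own minimal illustration under that assumption; the abstract's actual scheme may differ:

```python
import numpy as np

def pca_rgb(spectra):
    """Project pixel spectra (n_pixels x n_peaks) onto their first three
    principal components and rescale each component to [0, 1] for RGB."""
    X = spectra - spectra.mean(axis=0)          # centre each peak channel
    _, _, vt = np.linalg.svd(X, full_matrices=False)  # rows of vt = PCs
    scores = X @ vt[:3].T                       # per-pixel PC scores
    lo, hi = scores.min(axis=0), scores.max(axis=0)
    return (scores - lo) / (hi - lo)            # rescale for display
```

Reshaping the returned array back to the image grid yields a single false-colour overview in which pixels with similar spectra receive similar colours.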

   German Guideline “Digital Pathology: Virtual Microscopy in Primary Diagnostics”

Gunter Haroske1, Ralf Zwönitzer1,2, Peter Hufnagl1,3

1Federal Association of German Pathologists, Berlin, Germany, 2Imassense GmbH, Berlin, Germany, 3Charité – Universitätsmedizin Berlin, Berlin, Germany. E-mail: [email protected]

Background: The digitization of medicine is gaining momentum in pathology, too. The underlying technologies have reached a degree of maturity that makes their use in primary diagnostics in routine pathology possible. Given the complexity of technological solutions, the far-reaching consequences for diagnostic reliability, and the high investments and workflow modifications involved, the decision for a specific product can be highly complex for a pathologist. Aim: A Guideline for Digital Diagnostics in Pathology is presented to describe the technical and legal conditions for making this new technology feasible for the individual pathologist. Results and Discussion: The Digital Pathology Commission of the Federal Association of German Pathologists developed this implementation guide for the use of virtual microscopy in daily pathology routine in Germany. The key is the principal comparability of diagnostic reliability between conventionally stained microscopic slides and their digital images, which has to be demonstrated by the potential user. In eight chapters, the validation procedure as well as minimum technical requirements for slide scanners, the visualization pipeline, archiving, and integration into the pathology workflow are described. Conclusions: The recommendations given in the guideline enable pathologists to design, plan, install, and perform virtual microscopy in routine diagnostics in their labs. The key issue is the validation of the new technique in terms of diagnostic reliability compared with conventional light microscopy. Minimum requirements for the technology are defined so as to enable this diagnostic reliability.

   A New High-Throughput Auto-Annotation Method to Detect and Outline Cancer Areas in Prostate Biopsies

Lars Björk1,2, Jonas Gustafsson3, Feria Hikmet Noraddin3, Kristian Eurén1, Cecilia Lindskog3

1ContextVision AB, Stockholm, Sweden, 2Department of Women's and Children's Health, Karolinska Institutet, Solna, Sweden, 3Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden. E-mail: [email protected]

Prostate cancer is one of the most commonly diagnosed cancers and a leading cause of cancer-related death in males. The manual examination and Gleason scoring of prostate biopsies is, however, a major bottleneck in the pathology workflow, and studies have shown that the inter-observer variability in scoring is high. To reduce the risk of therapeutic decision errors, there is a high demand for automated image analysis algorithms to serve as decision support tools for pathologists. The aim of the present investigation was to develop a strategy for highly specific detection and outlining of cancer areas in clinical biopsy whole slide images (WSIs), which will serve as training material for machine-learning algorithms. Prostate sections were triple-stained for Cytokeratin 5/6, Cytokeratin 8/18 and AMACR, followed by DAPI counterstaining and immunofluorescence whole slide scanning. After detachment of the coverslip, the same slides were stained with hematoxylin and eosin (HE) and scanned in brightfield. The immunofluorescent stainings generated high-resolution multiplex images marking specific structures in the prostate biopsies, accurately outlining the cancer-containing areas. By digitally overlaying the multiplex antibody stainings with HE, the cancer areas could be marked exactly in the corresponding HE images. This provides the basis for further qualitative and quantitative image analysis of prostate sections. In summary, we have generated a robust and powerful method for specific and objective visualization of cancer areas in prostate biopsy WSIs, which will be used for machine learning to generate a highly accurate decision support tool for pathologists.

   Automated Slide Screening for the Diagnosis of Breast Cancer Metastases in Lymph Nodes

Gianluca Gerard1, Patrizia Morbini2, Marco Piastra1

1Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Pavia, Italy, 2Department of Molecular Medicine, Unit of Pathology, University of Pavia and Policlinico San Matteo Foundation IRCCS, Pavia, Italy. E-mail: [email protected]

Whole sentinel lymph node serial examination for breast cancer metastasis is a tedious and time-consuming process for pathologists. Misdiagnoses, due to operator fatigue and less-than-optimal slide quality, can have a huge impact on patients' lives, and automation could help limit their occurrence. The availability of whole-slide scanners to digitise glass slides at high resolution has extended diagnostic capabilities and has opened up the opportunity of applying the latest advances in computer vision to the automated analysis of whole slide images (WSI). Recent advancements in automatic image analysis with deep convolutional neural networks (DCNN) have created huge interest in the automated diagnosis of breast cancer metastases in lymph nodes, as demonstrated by recent academic challenges on predefined datasets such as CAMELYON16/17. In our project we adopt a different perspective: we aim to realize a tool that supports the activity of pathologists by performing a preliminary analysis of serial lymph node sections to identify areas of interest within them. To this aim, we adopt a specific variant of DCNNs, namely fully convolutional networks, which proves extremely effective in selecting those segments of slide images that, with high likelihood, contain lesions with features of metastasis. Preliminary results show the potential of the proposed approach to achieve near-total recall of lesion-containing areas while maintaining high specificity, reducing the number of false positives to be excluded afterwards.

   Implementation of a Digital Pathology Network in the Northern Part of the Netherlands

Jacko Duker1, Jan Doff1, Joost Bart2, Callista Weggemans3, Bart Hamel4, Marius Van den Heuvel5, Bert Van der Vegt1

1University Medical Center Groningen, Groningen, Netherlands, 2Pathology Isala, Zwolle, Netherlands, 3Pathology Treant, Hoogeveen, Netherlands, 4Department of Pathology, Martini Hospital, Groningen, Netherlands, 5Pathology Friesland, Leeuwarden, Netherlands. E-mail: [email protected]

The introduction of subspecialization in pathology and the large number of referrals between hospitals in the northern part of The Netherlands (2.86 million residents) necessitate intensive collaboration between the five pathology departments (4 general and 1 university) in this region. The introduction of digital pathology could reduce the labor-intensive exchange of physical slides and increase diagnostic speed. Therefore, a private digital pathology network will be implemented through a specialized health-care network provider. Strict rules on security and privacy will be applied; e.g., patient data will be stored separately from image data to prevent a data breach. In this network, every local image management server (IMS) will be connected to a multi-site server at the central site (the university). After a case is finished at a local department, digital slides will be stored in a digital archive located at two geographically separate data warehouses (scalable, currently 5 PB). Each department can access archived slides and display them in its own IMS system. The design of this network will reduce turnaround times for consultations and revisions. Cost-effectiveness will increase by sharing both hardware and personnel for support and maintenance. The network will also allow us to teach residents uniformly by storing interesting cases, and to collaborate on computational pathology and digital image analysis innovations. This network will provide a solid base for further collaboration in diagnostics (digital and computational pathology) between the pathology departments and will support the further development of subspecialization in pathology in the northern part of the Netherlands.

   Digital Image Analysis of HER2 in Breast Carcinoma: Comparison with Manual Scoring and Inter-Platform Agreement

Timco Koopman1, Henk Buikema1, Bert van der Vegt1

1Department of Pathology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands. E-mail: [email protected]

Background: We aimed to test the validity of digital image analysis (DIA) of HER2 immunohistochemistry (IHC) in breast cancer, and to assess inter-platform agreement between two independent DIA platforms. Methods: HER2 IHC and in situ hybridization (ISH) were performed on 152 consecutive invasive breast carcinomas. Immunohistochemical HER2 scores (0/1+: negative; 2+: equivocal; 3+: positive) were determined using two independent DIA platforms. Manual scoring was performed by two independent observers, who reached consensus. HER2 status was considered positive in 3+ and ISH-positive 2+ cases. Results: HER2 positivity was 10.5%. Overall agreement of HER2 IHC scores between manual scoring and platform A was 82.9% (linear weighted κ = 0.61); between manual scoring and platform B, 91.1% (κ = 0.85); and between platforms A and B, 82.2% (κ = 0.59). Compared to manual scoring, DIA resulted in a 92.3% reduction of 2+ cases with platform A and a 7.7% reduction with platform B. However, with ISH as a reference, there were three false-positive cases with platform A (100% sensitivity, 97.8% specificity, 78.6% positive predictive value (PPV), 100% negative predictive value (NPV)), while sensitivity, specificity, PPV and NPV were 100% for manual scoring and platform B. Conclusions: DIA is a feasible alternative to manual scoring of HER2 IHC in breast carcinomas, which may lead to a reduction of 2+ cases requiring subsequent ISH. However, different platforms can behave differently, and optimal calibration is essential to safely introduce this technique into daily practice.
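The linear weighted κ values above penalize disagreements by their distance on the ordinal 0/1+ < 2+ < 3+ scale. A compact reference implementation (our own, for illustration; categories encoded 0, 1, 2) is:

```python
import numpy as np

def linear_weighted_kappa(a, b, n_cat=3):
    """Linear weighted Cohen's kappa for ordinal scores 0..n_cat-1."""
    a, b = np.asarray(a), np.asarray(b)
    obs = np.zeros((n_cat, n_cat))
    for i, j in zip(a, b):
        obs[i, j] += 1
    obs /= obs.sum()                               # observed proportions
    exp = np.outer(obs.sum(axis=1), obs.sum(axis=0))  # chance agreement
    # linear distance weights: 0 on the diagonal, 1 at maximum distance
    w = np.abs(np.subtract.outer(np.arange(n_cat), np.arange(n_cat))) / (n_cat - 1)
    return 1.0 - np.sum(w * obs) / np.sum(w * exp)
```

Perfect agreement gives κ = 1, chance-level agreement gives κ = 0, and systematic disagreement gives negative values.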

   Assessment of Predictive Biomarkers in Cancer Tissues Using Micro-Immunohistochemistry Followed by DNA Sequencing

Anne-Laure Leblond1, Ruben Casanova1, Markus Rechsteiner1, Peter Wild1, Amy Jones2, Ata Tuna Ciftlik2, Alex Soltermann1

1Institut für Klinische Pathologie und Molekular Pathologie, Zurich, Switzerland, 2Lunaphore Technologies SA, Lausanne, Switzerland. E-mail: [email protected]

Background: Limited access to tumoral tissue underscores the need to combine IHC and molecular analysis on the same specimen. Microfluidic technology such as the MicroTissue Processor (MTP) allows fast and reliable immunofluorescent staining of formalin-fixed, paraffin-embedded (FFPE) cancer tissues. This study aimed at investigating the potential use of the MTP for chromogenic staining of FFPE sections (micro-IHC) and its combination with next-generation sequencing (NGS). Methods: We performed micro-IHC on different cancer tissues using a BRAF V600E mutation-specific antibody, pan-CK and pan-Melan antibodies. We investigated micro-IHC at room temperature and variable incubation times with the primary antibody. Positive micro-IHC was assessed for H-score and positive cell quantification, and processed for BRAF V600E mutation detection by NGS. Results: There was no significant difference in H-score between fully automated IHC and micro-IHC. Micro-IHC led to strong immunoreactivity after only a 4-minute incubation with the primary antibody. Titration curve experiments showed a steady increase of signal intensity until 8 minutes before reaching a plateau phase, consistent with an exponential saturation model. Combination with DNA sequencing showed that pan-CK micro-IHC could be used for BRAF V600E mutation detection and that BRAF V600E micro-IHC results were concordant with NGS results. Conclusion: Our results show that micro-IHC is a reliable way to combine minute-scale IHC and molecular analysis by NGS on the same FFPE tissue section.

   Multispectral Imaging for Quantitative and Compartment-Specific Immune Infiltrates Reveals Distinct Patterns in Lung Cancer Patients

Artur Mezheyeuski1, Christian Holst Bergsland2, Max Backman1, Dijana Djureinovic1, Tobias Sjöblom1, Jarle Bruun2, Patrick Micke1

1Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden, 2Department of Molecular Oncology, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway. E-mail: [email protected]

Semiquantitative assessment of immune markers by immunohistochemistry (IHC) has significant limitations in describing the diversity of the immune response in cancer. Therefore, we evaluated a fluorescence-based multiplexed IHC method in combination with a multispectral imaging system to quantify immune infiltrates in the in situ environment of non-small cell lung cancer (NSCLC). A tissue microarray including 57 NSCLC cases was stained with CD8, CD20, CD4, FoxP3, CD45RO and pan-cytokeratin antibodies, and immune cells were quantified in the epithelial and stromal compartments. The results were compared to conventional IHC and related to corresponding RNAseq expression values. We found a strong correlation between the visual and digital quantification of lymphocytes for CD45RO (correlation coefficient: r = 0.52), FoxP3 (0.87), CD4 (0.79), CD20 (0.81) and CD8 cells (0.90). The correlation to RNAseq data was comparable or better for digital than for visual quantification (visual: 0.38-0.58 versus digital: 0.35-0.65). By combining the signals of the five immune markers, further subpopulations of lymphocytes were identified and localized. Specific patterns of immune cell infiltration, based either on the spatial distribution (distance between regulatory CD8+ T cells and cancer cells) or on the relation of lymphocyte subclasses to each other (e.g. cytotoxic/regulatory cell ratio), were associated with patient prognosis. In conclusion, the fluorescence multiplexed IHC method, based on only one tissue section, provided a reliable quantification and localization of immune cells in cancer tissue. The application of this technique to clinical biopsies can provide a basic characterization of immune infiltrates to guide clinical decisions in the era of immunotherapy.
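The visual-versus-digital agreement figures quoted above are Pearson correlation coefficients, which can be reproduced with a few lines (a generic helper, not the authors' code):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    xm, ym = x - x.mean(), y - y.mean()
    return float(np.sum(xm * ym) / np.sqrt(np.sum(xm ** 2) * np.sum(ym ** 2)))
```

A value of r = 0.90, as reported for CD8, indicates that digital counts track visual counts almost linearly across the cohort.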

   Cytokeratin-Supervised Deep Learning for Automatic Recognition of Epithelial Cells in Breast Tumors

Mira Valkonen1, Jorma Isola1, Onni Ylinen1, Teemu Tolonen2, Matti Nykter1, Pekka Ruusuvuori1,3

1Faculty of Medicine and Life Sciences, BioMediTech, University of Tampere, Tampere, Finland, 2Department of Pathology, Fimlab Laboratories, Tampere University Hospital, Tampere, Finland, 3Tampere University of Technology, Tampere, Finland. E-mail: [email protected]

Background: Immunohistochemical staining of ER, PR, and Ki-67 are established biomarkers used routinely in breast cancer diagnostics. Determination of the staining labeling indexes should be restricted to malignant epithelial cells, carefully avoiding tumor-infiltrating stroma and inflammatory cells. Methods: Here, we developed a deep learning-based digital panCK mask for automated epithelial cell detection, using fluoro-chromogenic cytokeratin-Ki67 double staining and sequential hematoxylin-IHC staining as training material. A partially pre-trained deep convolutional neural network was fine-tuned using image batches from 152 patient samples. Digital panCK masks were predicted for 366 fields of view collected from 98 unseen samples using the trained model. The validity of the predicted epithelial cell masks was confirmed by comparison to cytokeratin images and to visual evaluation performed by two pathologists. Results: A good discrimination of epithelial cells was achieved (area under the receiver operating characteristic curve, AUC = 0.85), well in concordance with the pathologists' visual assessment (4.01/5 and 4.67/5). Conclusions: Our findings indicate that deep learning can be applied to detect carcinoma cells in breast cancer samples stained with conventional brightfield Ki-67 IHC. Additionally, the deep learning-based epithelial cell detection performed equally well on tissue sections stained for estrogen and progesterone receptors with single-color brightfield IHC. Thus, a deep learning algorithm, which can be implemented as part of the image analysis software, was found to provide a significant improvement in Ki-67 image analysis compared to the use of adjacent step sections, which is the prevailing method in current practice.
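The AUC reported for epithelial discrimination equals the probability that a randomly chosen epithelial pixel scores higher than a randomly chosen non-epithelial one (the Mann-Whitney interpretation). A small sketch of that computation (our own helper, for illustration):

```python
import numpy as np

def auc_mann_whitney(pos_scores, neg_scores):
    """AUC via the Mann-Whitney U statistic: the fraction of
    (positive, negative) score pairs ranked correctly, ties counting 0.5."""
    pos = np.asarray(pos_scores, float)[:, None]
    neg = np.asarray(neg_scores, float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))
```

An AUC of 0.85 therefore means that 85% of epithelial/non-epithelial pairs are ranked correctly by the network's score.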

   Digital Microscopy in Modern Training of Clinical Pathology

Larisa Volkova1, Fedor Paramzin1

1Immanuel Kant Baltic Federal University, Kaliningrad, Russia. E-mail: [email protected]

Achievements in digital pathology diagnostics and telemedicine make archives of scanned images of different pathological processes and diseases available for teaching medical students and postgraduate courses. For the last 4 years, the training of pathology in the Department of Fundamental Medicine of Immanuel Kant Baltic Federal University has included digital and video microscopy using virtual slides (CaseCenter, 3DHISTECH Ltd.). Traditional histological slides are transferred to digital format with a Pannoramic 250 Flash scanner (3DHISTECH Ltd.). At present, the teaching of clinical pathology uses: 1) the collection of the Department of Fundamental Medicine, with more than 50 digital slides covering different themes of the basic pathology course; 2) an archive of the most difficult and interesting cases from the clinical practice of the Laboratory of Immunohistochemistry and Pathology Diagnostics. The visualization and analysis of digital images of the histological sections is carried out with the Pannoramic Viewer program. 3DHISTECH Ltd. software is also applied for research and for acquiring digital images in the fields of breast cancer and gynecological pathology. In breast cancer research, the expression of the proliferation marker Ki-67 was quantitatively evaluated on digital images of histological sections with the PatternQuant QuantCenter software (3DHISTECH). Our experience with scanning and digital images in teaching at the Medical Institute and in postgraduate courses, in the clinical practice of pathologists and in research investigations confirms the high effectiveness of this modern approach.

   Deep Learning for Quantification of Tall Cells in Papillary Thyroid Carcinoma

Sebastian Stenman1,2, Päivi Siironen3, Johan Lundin1,4, Caj Haglund3,5, Johanna Arola2

1Institute for Molecular Medicine Finland - FIMM, Helsinki, Finland, 2Huslab Pathology, Helsinki University Hospital, Helsinki, Finland, 3Department of Surgery, Helsinki University Hospital, Helsinki, Finland, 4Department of Public Health Science, Karolinska Institutet, Stockholm, Stockholm, Sweden, 5Research Programs Unit, Translational Cancer Biology, University of Helsinki, Helsinki, Finland. E-mail: [email protected]

Background: The tall cell variant (TCV), a subtype of papillary thyroid carcinoma (PTC), has a worse prognosis than conventional PTC. TCV is defined by the WHO as a tumor in which at least 30% of the epithelial cells are 2-3 times as tall as they are wide. However, scanning tumors for these tall cells (TCs) under a conventional microscope is a time-consuming and inexact process, and new diagnostic methods are needed. Our aim was to assess the feasibility of a machine learning approach for quantifying TCs. Methods: Two separate deep learning classifiers were trained, one counting TCs and one counting all epithelial cells, together yielding the TC percentage. 299 images of 224x224 pixels covering a wide range of tumor morphologies were extracted. Training was performed on 238 of these images, and the test set comprised the remaining 61. Validation was performed on a case-control cohort of 65 PTC patients. All samples were digitally scanned and imported as whole-slide images (WSIs) into an image management and processing platform. Results: The TC and epithelial region classifiers had recalls of 86.7% and 81.9%, precisions of 80.3% and 86.2%, and F-scores of 84.3% and 84.0%, respectively. The area under the curve on untrained regions was 98.7% for the TC classifier and 96.9% for the epithelial classifier. The nucleus classifier had a recall of 80.1%, a precision of 89.3%, and an F-score of 84.4%. Conclusions: The TC percentage in a PTC can be calculated automatically with high specificity and sensitivity, and the tool could reliably assist pathologists scanning PTCs for TCs.
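As a minimal illustration of the arithmetic behind these results (not the authors' code), the per-class detection metrics and the WHO-style TC percentage can be sketched as:

```python
def precision_recall_f1(tp, fp, fn):
    """Detection metrics from true-positive, false-positive and
    false-negative counts, as reported for each classifier."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def tall_cell_percentage(n_tall, n_epithelial):
    """TC percentage from the two classifier outputs; the WHO threshold
    for calling a tumor TCV is 30%."""
    return 100.0 * n_tall / n_epithelial
```

For example, 40 detected tall cells among 100 epithelial cells would yield a TC percentage of 40%, above the 30% WHO cutoff.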

   Integrative Computational Pathology Approach to Classify Prostate Cancer: Phenotype, Genomic, Tumor Microenvironment Via Deep Learning

Laura Marin1, Fanny Lys Casado Peña2, Daniel Racoceanu1,3

1Engineering Department, Electrical and Electronics Section, Medical Imaging Lab, Pontifical Catholic University of Peru, San Miguel, Peru, 2Pontifical Catholic University of Peru Laboratory of Research in Omic Sciences and Applied Biotechnology, Lima, Peru, 3Sorbonne University, UPMC Univ Paris 06, CNRS, INSERM, Paris, France. E-mail: [email protected]

In 2017, prostate cancer was the second most common cancer in men after lung cancer. It is conventionally diagnosed by evaluating tissue biopsies and classified according to the Gleason grading system. Novel molecular classifications of prostate cancer have been proposed, but their clinical use is limited for several reasons, including their lack of localization information. The main goal of this work is to implement an automatic tissue classification that takes into account the phenotype, the microenvironment (Haralick features, direction of the stroma, …) and a genomic signature predictive of recurrence, to improve the existing grading system. Modern image classification techniques keep getting broader and more accurate, in particular with the introduction of convolutional neural networks. This approach clearly differs from traditional classification in that it lets the network decide which features carry the most discriminative importance. Therefore, instead of using a classic fully connected layer, we integrate heterogeneous inputs at different levels. To avoid normalization mismatches between the phenotype data and the genomic signature, we introduce them via different layers: the images serve as input to the first layer, while the gene expression from the omics and microenvironment data is input to the last layer. With this approach, we can classify the image with respect to the survival rate and the risk of recurrence, the gold standard of any cancer. Our method is thus able to generate an augmented score, enabling a more accurate and personalized diagnosis.
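The layer-wise integration described above can be sketched as a late-fusion step. The snippet below is a simplified stand-in for the final layer only (a plain dot product with hypothetical weights), not the authors' network:

```python
def late_fusion_score(image_features, genomic_signature, w_img, w_gen, bias=0.0):
    """Combine CNN-derived image features (entering at the first layer in the
    full model) with a genomic recurrence signature (introduced only at the
    last layer) in a single linear scoring step.  Keeping the modalities in
    separate weight blocks avoids forcing them onto a common normalization."""
    fused = list(image_features) + list(genomic_signature)
    weights = list(w_img) + list(w_gen)
    return sum(w * x for w, x in zip(weights, fused)) + bias
```

In the full model the image features would come from convolutional layers and the weights from training; here all values are placeholders.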

   Heterogeneity Analysis of Patient-Derived Xenografts to Optimize Analysis of Immune Infiltrates in Preclinical Oncoimmunology Research

Anne Grote1, Julia Schüler2, Eva Oswald2, Friedrich Feuerhake1

1Hannover Medical School, Hannover, Germany, 2Charles River Discovery Research Services Germany GmbH, Freiburg, Germany. E-mail: [email protected]

Humanized mouse models allow in-vivo studies of the interaction of human immune cells with patient-derived tumor xenografts (PDX). The heterogeneity of the inflammatory tumor microenvironment (iTME) is a challenge for robust evaluation of immunomodulatory therapy effects in PDX models. We apply a digital pathology workflow to analyze the heterogeneity of tumor tissue and iTME in multiple, uniformly spaced sections representing whole xenograft tumors. Seven paraffin blocks of humanized lung cancer PDX were fully sectioned at 80 μm intervals, and two consecutive slices per step were stained with hCD45 and mCD31 and scanned. The automated image analysis workflow comprised tissue classification with a pre-trained convolutional neural network, adapted by transfer learning, and cell detection using blob detection, edge detection, level sets, and watersheds. A variety of relevant measures, such as the ratio of necrotic to tumor tissue and the CD45 positivity in tumor and stroma, were obtained from the slides. The tissue classification accuracy was 95% and the F1 score of the cell detection was 81% on test image subsets. Heterogeneity analysis showed significant changes in the necrosis/tumor tissue ratio in almost all blocks, while the CD45 positivity ratios were more stable. In the majority of cases, two or three slides yielded a mean positivity ratio within one standard deviation of the mean derived from the full stack, while individual PDX models required more sections for robust iTME evaluation. Our study provides guidance for the rational design of digital pathology workflows for iTME analysis in translational oncoimmunology.
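The abstract's criterion for how many sections suffice — a subsample mean falling within one standard deviation of the full-stack mean — can be sketched as follows (an illustrative helper, not the study's code):

```python
from statistics import mean, stdev

def subset_is_representative(ratios, k):
    """True if the mean positivity ratio of the first k sections lies within
    mean ± SD of the full section stack, i.e. k sections would have been
    enough by the criterion used in the heterogeneity analysis."""
    full_mean, full_sd = mean(ratios), stdev(ratios)
    return abs(mean(ratios[:k]) - full_mean) <= full_sd
```

A homogeneous stack passes with very few sections, while a heterogeneous one (e.g. positivity concentrated in some sections) requires more.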

   Deep Convolutional Neural Network-Based Method for Quantification of the Pancreatic β-Cell Mass in Mice

Tatiana Danilova1, Sami Blom2, Tuomas Ropponen2, Kari Pitkänen2, Maria Lindahl1

1Institute of Biotechnology, HiLIFE Unit, University of Helsinki, Helsinki, Finland, 2Fimmic Oy, Biomedicum 2, Helsinki, Finland. E-mail: [email protected]

Background: Functional pancreatic β-cell mass is an important parameter in diabetes research, as it correlates with insulin secretion in the pancreas. Traditionally, image captures are acquired from insulin-stained pancreatic sections and analyzed using low-throughput software platforms. However, accurate analysis of high-resolution captures of pancreatic β-cell mass using established morphological methods is technically challenging and time-consuming. In contrast, low-resolution imaging solves the throughput issue but performs poorly in identifying small islets, individual β-cells, and even pancreatic tissue. Thus, there is a need for next-generation image analysis methods enabling high-resolution whole-slide image analysis for accurate measurement of β-cell mass. Methods: Machine learning methods based on deep convolutional neural networks (dCNNs) are highly efficient at image classification and are increasingly used in medical and biological research. We employed dCNNs for measurement of β-cell mass as well as quantification of individual β-cells on whole-slide digital images of mouse pancreatic sections stained with anti-insulin antibody, from normal wild-type mice and from MANF-deficient mice, which develop insulin-dependent diabetes due to a progressive postnatal decrease in β-cell mass. Results: As validation, we found good agreement between the human observer and the algorithms, with F1-scores of 97% (95% CI 96.99–97.01%) and 91% (95% CI 85.0–96.4%) for pixel-level classification of pancreas tissue and counting of islets of Langerhans, respectively. Conclusions: The established algorithms overcome current limitations in β-cell mass analysis and yield reliable and consistent data. Our algorithms were developed and run on a fully cloud-embedded WebMicroscope® Cloud Platform.

   Accurate Prostate Glandular Segmentation, a Key to Successful Automatic Grading and Prognostication of Prostate Cancer

Christophe Avenel1, Anna Tolf2, Pr Ingrid Carlbom1

1CADESS Medical AB, Uppsala, Sweden, 2Department of Clinical Pathology, Uppsala University Hospital, Uppsala, Sweden. E-mail: [email protected]

Digital pathology offers the potential for computer-aided diagnosis, significantly reducing pathologists' workload and paving the way for accurate prognostication with reduced inter- and intra-observer variation. But successful computer-based analysis requires careful tissue preparation. While the human eye may recognize a gland with intensity variations or with parts of its boundary missing, a computer algorithm may fail under such conditions. Since malignancy grading of prostate cancer is based on the architectural growth patterns of prostatic carcinoma, automatic methods must rely on accurate identification of the glands. But due to the poor color differentiation between stroma and epithelium with the commonly used Hematoxylin-Eosin stain, no method is yet able to segment all types of glands, making automatic prognostication impossible with this stain. We address the effect of tissue preparation on glandular segmentation with an alternative stain, Picro-Sirius red-Hematoxylin, which clearly delineates the glandular boundaries, and couple this stain with color decomposition that removes intensity variation. In this manner, image analysis based on tensor-based mathematical morphology and gradient-maximizing thresholds can successfully determine the glandular boundaries. The segmentation method, with its color decomposition, has been successfully tested on more than 12,000 glands, including well-formed glands (Gleason grade 3), cribriform and fine-caliber glands (grade 4), and single cells (grade 5).

   Going Fully Digital: Three Years of Experience at Cannizzaro Hospital in Italy

Filippo Fraggetta1, Liron Pantanowitz2

1Department of Pathology, Cannizzaro Hospital, Catania, Italy, 2Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA. E-mail: [email protected]

Background: A fully digital whole slide imaging workflow for routine clinical practice is available in only a few pathology laboratories worldwide. The first three years of experience with digital workflow at Cannizzaro Hospital in Catania, Italy are presented. Methods: We started by digitizing all (100%) permanent histopathology glass slides at 20x and subsequently progressed to 40x scanning mode using two Aperio AT2 scanners. Results: Over 160,000 glass slides were digitized, with a scan failure rate of around 2%. Successful adoption required a 2D barcode tracking system, modification of histology workflow processes, and implementation of complementary software tools (i.e., “telepathox” for electronic remote consultation and “eslide macro view” for safer diagnosis). A Laboratory Information System (LIS)-centric approach (“Pathox” ver. 13.2.0) was necessary to support a virtual slide tray for pathologists and bidirectional communication with the digital pathology system (eSlideManager). Moving to 40x scanning required boosting the IT network from 100 Mbit/s to 1 Gbit/s. Conclusion: Successful adoption of WSI for primary diagnostic use in histopathology depended on integration with the LIS, enhancement of the underlying network bandwidth, and auxiliary software tools. Going fully digital for routine histopathology in clinical practice secondarily created an opportunity to standardize workflow processes in the pathology laboratory.

   HER2 Control-Aware Automatic Diagnosis Tool Based on Deep Learning

Anibal Pedraza1, Oscar Deniz1, Lucia Gonzalez2, Gloria Bueno1

1VISILAB Research Group, University of Castilla La Mancha, Ciudad Real, Spain, 2Department of Anatomical Pathology, University Hospital, Ciudad Real, Spain. E-mail: [email protected]

Background: The aim of this work is to develop a tool that performs a diagnosis for HER2, a gene that can be overexpressed in breast cancer. Overexpression can be assessed with immunohistochemical (IHC) analysis and a staining-dependent score, which ranges from 0+ to 3+. Methods: A dataset was built from patches extracted from whole slide images, representing the main classes of HER2 expression. The tool can handle several image formats (Leica, Hamamatsu, …). It also extracts the tissue region that contains a control (a positive area to be used by pathologists), using image processing and clustering. The diagnosis is based on a convolutional neural network trained with this dataset. Results: The decision for each patch is shown graphically. A global decision is also given, depending on the proportion of each kind of expression. Conclusions: The developed tool is able to assist pathologists by highlighting the overexpressed regions to support a final diagnosis.
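The abstract does not specify the aggregation rule, but a slide-level decision "depending on the proportion of each kind of expression" could plausibly look like the sketch below (the 10% threshold and the whole rule are hypothetical):

```python
from collections import Counter

def her2_global_score(patch_scores, min_fraction=0.1):
    """Slide-level HER2 decision from per-patch scores (0..3): report the
    highest score whose patch proportion reaches min_fraction, else 0."""
    counts = Counter(patch_scores)
    total = len(patch_scores)
    for score in (3, 2, 1):
        if counts[score] / total >= min_fraction:
            return score
    return 0
```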

   Glass Slides and Digital Slides in the Era of Digital Pathology: How Long to Store?

Dr Sabine Leh1,2, Jens Lien3,4, Ivar Skaland5, Sindre Byrkjeland Nessen6, Line Rodahl Dokset7,8, Inger Nina Farstad9

Departments of 1Pathology and 2Research and Development, Haukeland University Hospital, Bergen, Norway, 3Central Norway Regional Health Authority's IT department, Trondheim, Norway, 4Bouvet Norway ASA, Trondheim, Norway, 5Department of Pathology, Stavanger University Hospital, Stavanger, Norway, 6Helse Vest IKT AS, Bergen, Norway, 7Sykehuspartner Trust, Oslo, Norway, 8Nasjonal IKT, Bergen, Norway, 9Department of Pathology, Laboratory Clinic, Oslo University Hospital, Oslo, Norway. E-mail: [email protected]

Background: Digital slides represent a new storage item for pathology archives. The Norwegian National Project for digital pathology has developed recommendations for the retention and storage of glass slides and digital slides. Methods: The as-is state of storage, costs and system support in Norway was mapped in 2017. The project defined principles for retention and assessed legal requirements. Scenarios for the storage of glass slides and digital slides were developed and the costs of the various scenarios estimated. On this basis, the project proposed recommendations for retention and storage. Results: Glass slides are generally stored indefinitely in Norway. The Norwegian pathology archives currently contain approximately 70 million glass slides, with an annual production of 2.5 million. About 70,000 digital slides have been stored so far, requiring 70 TB of storage capacity. A central principle is a cost-benefit assessment that considers both the patient's interest and the potential benefit to society. Digital slides are legally interpreted as health data, and their storage is regulated in the corresponding laws. The economically most favorable scenario for future storage is the following: if the diagnosis is based on digital slides, the glass slides are discarded and the digital slides stored; glass slides in the historical archives are discarded after a defined time. Conclusion: Recommendations for retention and storage of glass slides and digital slides will be sent to the Directorate of Health and are expected to form the basis for national guidelines.
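A back-of-envelope reading of these figures (decimal units assumed): 70 TB over 70,000 slides implies roughly 1 GB per digital slide, so digitizing the full annual production would add on the order of 2.5 PB per year:

```python
def gb_per_slide(n_slides, total_tb):
    """Average storage per digital slide in GB (1 TB = 1000 GB assumed)."""
    return total_tb * 1000.0 / n_slides

def annual_storage_pb(annual_slides, gb_each):
    """Projected yearly storage in PB if all slide production were digitized."""
    return annual_slides * gb_each / 1.0e6
```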

   Nuclei Detection: Inter-Observer Agreement and its Impact on Training a Deep Learning Method

Henning Kost1, Christiane Engel1, Jesper Molin2, André Homeyer1, Nick Weiss1, Claes F. Lundström2,3, Horst K. Hahn1,4

1Fraunhofer MEVIS, Bremen, Germany, 2Sectra AB, Linköping, Sweden, 3Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden, 4Jacobs University, Bremen, Germany. E-mail: [email protected]

Machine learning-based medical image analysis methods such as CNNs rely on carefully crafted sets of training images, which must sufficiently cover the variability of the images encountered in practice. However, training annotations are often created by only a single observer. Because medical images can be ambiguous, this can introduce a substantial bias. In our study, we evaluated the impact of observer variability on training CNNs for nuclei detection in Ki-67-stained histological images. Three independent observers annotated all nuclei in 101 field-of-view images. CNNs were trained with the annotations of each observer and of each combination of two observers. The agreement of the CNNs' outputs with the observers, as well as the inter-observer agreement, was assessed using the F1 measure. The average inter-observer agreement was 0.75. The outputs of CNNs trained with annotations from one observer had an average agreement of 0.81 when compared with annotations of the same observer; compared with the other observers, the mean agreement was 0.76. CNNs trained with annotations of two observers showed an average agreement of 0.78 with the third observer. All CNNs trained on a single observer's annotations agreed better with the other observers than those observers agreed with each other, an interesting finding that needs further research. CNNs can thus perform well even when trained with inconsistent data, but they tend to overfit to the observers who created the training annotations. It is therefore important to employ training annotations from multiple observers to better cover the inter-observer variability.
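Computing F1 between two observers' nucleus annotations first requires matching the two point sets. A simple greedy sketch is shown below; the distance threshold is hypothetical, and the study's exact matching rule is not given in the abstract:

```python
import math

def match_f1(points_a, points_b, max_dist=10.0):
    """Greedy one-to-one matching of two observers' nucleus positions;
    detections closer than max_dist (pixels) count as agreeing.  F1 over the
    matched/unmatched counts then serves as the agreement measure."""
    unmatched_b = list(points_b)
    tp = 0
    for p in points_a:
        best = None
        for q in unmatched_b:
            d = math.dist(p, q)
            if d <= max_dist and (best is None or d < math.dist(p, best)):
                best = q
        if best is not None:
            unmatched_b.remove(best)
            tp += 1
    fp = len(points_a) - tp   # a's detections with no partner in b
    fn = len(unmatched_b)     # b's detections with no partner in a
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0
```

Note the measure is symmetric in the two observers only up to the greedy matching order; a bipartite (Hungarian) matching would remove that dependence.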

   Web Application for Pathology Training Using a Low Cost Platform and Annotated Whole Slide Images

Paula Toro1, Germán Corredor1, Viviana Arias1, Eduardo Romero1

1Universidad Nacional De Colombia, Bogotá, Colombia. E-mail: [email protected]

Education in dermatopathology relies heavily on books, which are expensive, become quickly outdated, and offer limited possibilities. In recent years, virtual microscopy, a method that enables the examination of digitized microscopy samples by means of a computer, has earned interest because of its remarkable benefits for education, including low cost, flexibility, and easy content updating. In this work, we introduce a full low-cost platform for the digitization and consultation of dermatopathology samples. First, physical slides are digitized using an optical microscope coupled to a digital camera controlled by a custom motorized scanner. The whole slide images are stitched and stored for later consultation. A mobile-responsive web application, developed using open-source tools, then accesses these images and allows users to interact with the content by panning and zooming. The application also allows users to freehand-annotate specific image regions. This tool was used to provide a dermatology atlas for pathology and dermatology residents. For this purpose, a set of 100 histological dermatopathology slides representing basic lesions and inflammatory skin diseases (based on Ackerman patterns), provided by the Pathology Department of Universidad Nacional de Colombia, was published. Each image contains clinical information as well as annotations of relevant regions. The platform is currently used by trainees, who highlight the benefits of this kind of tool, which complements their training and helps improve their diagnostic skills. The tool can, of course, easily be extended with more cases of different lesions and organs.

   A Deep Learning Approach to Semantic Segmentation of Epidermal Tissue in Whole Slide Histopathological Images

Kay Raymond Oskal1, Martin Risdal1, Emiel Janssen2, Thor Ole Gulsrud1,3

1International Research Institute of Stavanger, Stavanger, Norway, 2Stavanger University Hospital, Stavanger, Norway, 3University of Stavanger, Stavanger, Norway. E-mail: [email protected]

Malignant melanoma is a severe and aggressive type of skin cancer, with a rapid decrease in survival rate if not diagnosed and treated at an early stage. Histopathological examination of whole slide images (WSIs) is currently the gold standard for diagnosis. However, this is a difficult and time-consuming task, and at most pathology laboratories the sheer number of skin biopsies poses real logistic and personnel challenges. Additionally, diagnosis is often subject to intra- and inter-observer variability. To address these issues, a computer-based pre-screening system can help through quantitative image analysis. The epidermis region is often the first area a pathologist analyzes, since it holds several diagnostic clues. An important initial step in developing such a system is therefore an automatic epidermis segmentation algorithm. Due to texture complexity and inconsistent staining, a new and robust technique is needed. Recent advances in deep learning have led to promising results in the automatic analysis and segmentation of WSIs. In this study we present a deep learning approach to epidermis segmentation. We use ~1M labeled patches from 30 training WSIs to train a convolutional neural network (CNN). Because the number of training images is limited, we apply transfer learning, fine-tuning a known ImageNet architecture to discriminate patches located in the epidermis region from the rest. Finally, we feed overlapping image patches to our CNN and use the prediction results to create an epidermis probability heatmap, which is post-processed into a binary epidermis mask.
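The final heatmap-and-mask step can be sketched as follows, assuming square patches and a hypothetical 0.5 threshold (a real pipeline would also clean up small components and fill holes):

```python
import numpy as np

def patch_predictions_to_heatmap(shape, patches, preds):
    """Average overlapping patch probabilities into a per-pixel heatmap.
    `patches` are (row, col, size) windows and `preds` the CNN's epidermis
    probability per patch (both hypothetical placeholders here)."""
    acc = np.zeros(shape, dtype=float)
    cnt = np.zeros(shape, dtype=float)
    for (r, c, s), p in zip(patches, preds):
        acc[r:r + s, c:c + s] += p
        cnt[r:r + s, c:c + s] += 1
    # Pixels covered by no patch stay 0 instead of dividing by zero.
    return np.divide(acc, cnt, out=np.zeros(shape), where=cnt > 0)

def heatmap_to_mask(heat, threshold=0.5):
    """Threshold the probability heatmap into a binary epidermis mask."""
    return (heat >= threshold).astype(np.uint8)
```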

   A Multiplexing Imaging Instrument with Reagents for Digital Pathology Employing Novel Up-Converting Nanoparticles

Anders Sjögren1, Stefan Andersson-Engels1,2, Wayne Bowen3

1Lumito AB, Lund, Sweden, 2Tyndall National Institute, Cork, Ireland, 3TTP Plc, Melbourn, UK. E-mail: [email protected]

Digital pathology is crucial for personalized treatment and prognosis. However, options for staining, scanning and automatically diagnosing tissue sections are limited. Classic pathology employs brightfield imaging, which is not well suited to computer-assisted diagnosis, at best providing a qualitative readout. Fluorescence imaging represents an alternative, multiplex approach but is hard to combine with brightfield. In addition, traditional fluorophores are not amenable to storage, suffer from background interference, and readily photobleach, all significant obstacles to clinical adoption. Here, novel luminescent up-converting nanoparticles (UCNPs) were used to create a flexible solution comprising a prototype instrument, reagents and staining protocols. UCNP fluorescence imaging of human tissue sections was compared with a commercial system for digital pathology. To quantitatively gauge multiplexing capabilities, multiple protein biomarkers were stained with two UCNP reagents. Pulsed excitation and gated detection were explored to improve scanning speed. UCNP and H&E co-staining and co-imaging were also investigated, which could provide a safe route from classical pathology to digital pathology. To assess viability for computer-assisted diagnosis supporting the pathologist, the UCNP fluorescence image data were compared with standard brightfield image data. The results indicate that the properties of UCNPs make them well suited to fluorescence imaging in digital pathology, with great potential for a complete digital pathology system. The quality, flexibility and speed of the solution could surpass any current system offering multi-modal brightfield and fluorescence imaging. Research and development in this area will continue toward a complete, off-the-shelf commercial system for digital pathology.

   Structured Synoptic Reports for Anatomical Pathology Using SNOMED CT: The Swedish Experiences

Carlos Fernandez Moro1,2, Keng-Ling Wallin3, Daniel Karlsson3

1Department of Clinical Pathology/Cytology, Karolinska University Hospital Huddinge, Stockholm, Sweden, 2Department of Laboratory Medicine, Division of Pathology, Karolinska Institute, Stockholm, Sweden, 3The National Board of Health and Welfare, eHealth and Structured Information Unit, Stockholm, Sweden. E-mail: [email protected]

Background: Digital pathology focuses largely on software to assist examination and diagnosis using digital images of histopathological sections; less attention has been given to the digital use of pathology request forms and diagnostic reports. It is common practice to have pathology examination results dictated and transposed into narrative form. Semi-structured text is not readily machine-computable, leading to duplication of work when coding, e.g., for registries. There is also an increasing demand for structured data, e.g., for standardized cancer protocols. Methods: A collaboration network has been established between different disciplines and organizations. Structured anatomical pathology synoptic report templates were created for different organ systems, and SNOMED CT codes were assigned to report elements. Results: The Swedish Society of Pathology has decided to migrate to standardized synoptic reporting with SNOMED CT. Organ-specific groups have started to develop synoptic reports. So far, reports for pancreatic cancer and malignant melanoma have been developed, and further reports are underway through international collaboration. Web-based reporting prototypes for these tumor types have been developed. The prototypes support SNOMED CT coding through pre-determined lists of possible answers. The coding captures major parts of the pathology report, not only topology and morphology. Missing codes have been submitted for inclusion in future versions of SNOMED CT. Conclusion: A process towards electronic standardized pathology reporting with SNOMED CT is ongoing in Sweden. This may increase not only the quality of pathology reporting but also the utilization of pathology data for registries and research, while eliminating duplicate work.

   A Deep-Learning Algorithm to Determine Liver Fat Content in Nonalcoholic Fatty Liver Disease in Humans

Laura Ahtiainen1,2, Panu Luukkonen1,2, Tuomas Ropponen3, Sami Blom3, Johanna Arola4, Hannele Yki-Järvinen1,2

1University of Helsinki, Helsinki, Finland, 2Minerva Foundation Institute for Medical Research, Helsinki, Finland, 3Fimmic Oy, Helsinki, Finland, 4Huslab Pathology and Helsinki Biobank, Helsinki, Finland. E-mail: [email protected]

Background: Non-alcoholic fatty liver disease (NAFLD) is the most common cause of chronic liver disease in the world, and better tools for the assessment of liver histology are called for. Methods: We developed a deep learning-based algorithm to quantify liver fat content in liver biopsies obtained from 160 subjects undergoing bariatric surgery (age 50±0.71 years, BMI 42.5±0.52 kg/m²). Liver histology was determined conventionally by a liver pathologist using the SAF score (steatosis, activity, fibrosis): 25% of the subjects had normal liver histology, 30% had simple steatosis, 5% had NASH (non-alcoholic steatohepatitis), and 26% had any fibrosis. The average macrovesicular steatosis was 16.2% according to the SAF score. Whole-slide images (WSIs) of the Herovici-stained liver biopsies were acquired at 0.26 μm/px resolution. We trained an automatic image analysis algorithm using 2.9 and 428 gigapixels of training and validation data, respectively. Results: The percentage of liver fat determined by the algorithm correlated highly significantly with the macrovesicular steatosis from the SAF score (r=0.91, p<0.0001). Further, the average size of the lipid droplets was 189±8.6 μm² and the average droplet count was 132±9 per square millimeter. Macroscopic lipid droplet size correlated highly significantly with the macrovesicular steatosis determined by the algorithm (r=0.86, p<0.0001). Conclusions: We conclude that steatosis in human NAFLD samples can be accurately analyzed using the developed algorithm. Our approach enables novel metrics to characterize the steatosis phenotype more accurately. In addition, the algorithm can determine the size and number of macrovesicular lipid droplets in human liver sections.
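The droplet-level metrics reported above (fat area fraction, mean droplet size, droplets per mm²) reduce to simple arithmetic once droplets have been segmented; an illustrative helper, not the authors' cloud implementation:

```python
def steatosis_metrics(droplet_areas_um2, tissue_area_um2):
    """Summarize macrovesicular steatosis from per-droplet areas (µm²)
    within a given tissue area: fat area fraction (%), mean droplet size
    (µm²), and droplet density (per mm²; 1 mm² = 1e6 µm²)."""
    fat_area = sum(droplet_areas_um2)
    return {
        "fat_fraction_pct": 100.0 * fat_area / tissue_area_um2,
        "mean_droplet_um2": fat_area / len(droplet_areas_um2),
        "droplets_per_mm2": len(droplet_areas_um2) / (tissue_area_um2 / 1e6),
    }
```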

   The Importance of Macro Images for Primary Diagnosis with Whole Slide Imaging

Filippo Fraggetta1, Marcial Garcia Rojo2, Alexi Baidoshvili3, Yukako Yagi4, Andrew J Evans5, J. Mark Tuthill6, Junya Fukuoka7, Liron Pantanowitz8

1Department of Pathology, Cannizzaro Hospital, Catania, Italy, 2Department of Pathology, Hospital de Jerez de la Frontera, Cádiz, Spain, 3Laboratory for Pathology East-Netherlands, Hengelo, The Netherlands, 4Department of Pathology, Memorial Sloan Kettering Cancer Center, New York, USA, 5Department of Pathology, University Health Network, Toronto, Canada, 6Department of Pathology Informatics, Henry Ford Health System, Detroit, Michigan, USA, 7Department of Pathology, Nagasaki University Graduate School of Biomedical Sciences, Nagasaki, Japan, 8Department of Pathology, University of Pittsburgh Medical Center, Pittsburgh, Pennsylvania, USA. E-mail: [email protected]

Background: A whole slide image (WSI) typically comprises (a) a macro image (a low-power snapshot of the glass slide) and (b) stacked tiles in a pyramid structure (with the low-resolution thumbnail at the top). The macro image shows the label, all pieces of tissue, and any pen marks on the slide. Many scanner vendors do not readily show this macro overview to pathologists. We demonstrate that failure to do so may result in serious misdiagnosis. Methods: We solicited examples of errors that occurred during the digitization of glass slides, where the virtual slide differed from the macro of the original glass slide. Examples were retrieved from experts in the USA, Canada, Europe and Asia and from a variety of scanners. Results: Errors were categorized into (a) device limitations (e.g. scanning coverage), (b) technical problems (e.g. tissue-finder failure, mismatching) and (c) human mistakes (e.g. poor manual region-of-interest selection). These errors were all evident in the macro image. Conclusion: Our experience indicates that whole slide imaging is subject to inadvertent errors related to scanning glitches, corrupt images, or mistakes by human scan technicians. Displaying the macro image is very important in digital pathology practice, because it helps detect the majority of these imaging problems.

   Validation of a Gleason Grading Algorithm in Prostate Cancer Biopsies

Agnieszka Krzyzanowska1, Felicia Marginean1,2, Ida Arvidsson3, Athanasios Simoulis2, MEng Erik Sjöblom4, Claes Lundström4, Niels-Christian Overgaard3, Roy Ehrnström2, Kalle Åström3, Anders Heyden3, Anders Bjartell1

1Department of Translational Medicine, Lund University, Lund, Sweden, 2Department of Pathology, Skåne University Hospital, Malmö, Sweden, 3Centre for Mathematical Sciences, Lund University, Lund, Sweden, 4Sectra, Linköping, Sweden. E-mail: [email protected]

Background: Prostate cancer (PCa) is one of the most commonly diagnosed cancers in men worldwide. Correct identification of the stage and severity of PCa on histological preparations, using Gleason grading, is essential for diagnosis and helps healthcare specialists predict patient outcome and choose the best treatment options. To improve diagnosis, there is great interest in automating and standardizing the Gleason grading system. Over the last two years we have been developing an algorithm based on convolutional neural networks. Methods: To train the algorithm, 317 haematoxylin & eosin-stained PCa biopsies from Malmö University Hospital were digitally scanned and annotated by two experienced uropathologists in specialised software (Sectra IDS7). The algorithm was trained to detect benign and cancerous areas and to distinguish between Gleason grades 3, 4 and 5. The algorithm's results were reviewed with the pathologists, and the algorithm was adjusted, re-trained and improved until a satisfactory result was obtained. Results: Here we present a pilot validation of the algorithm on biopsies that were not used for training. Preliminary results show the algorithm to be very good at separating cancerous from non-cancerous areas (Pearson correlation between pathologist and algorithm, r² = 0.98). The algorithm could also detect Gleason 3 (r² = 0.76), Gleason 4 (r² = 0.92) and Gleason 5 (r² = 0.77) patterns. The overall diagnosis estimated by the algorithm was correct on 80% of the tested slides. Conclusions: An optimised Gleason grading algorithm can be of great use for prostate cancer diagnostics.
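The abstract reports squared Pearson correlations between pathologist and algorithm (presumably over the per-slide extent of each pattern, though the exact quantity is not stated); the computation itself is straightforward, assuming non-constant inputs:

```python
def pearson_r2(xs, ys):
    """Squared Pearson correlation between paired measurements, e.g.
    per-slide cancer extent according to pathologist vs. algorithm."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return (cov * cov) / (var_x * var_y)
```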

   Automated Quantification of Steatosis: Spatial Heterogeneity Matters

André Homeyer1, Seddik Hammad2, Lars Ole Schwen1, Uta Dahmen3, Henning Kost1, Andrea Schenk1, Steven Dooley2

1Fraunhofer MEVIS, Bremen, Germany, 2Medical Faculty Mannheim, University of Heidelberg, Mannheim, Germany, 3Universitätsklinikum Jena, Jena, Germany. E-mail: [email protected]

Background: Steatosis, the pathological accumulation of fat, is often heterogeneously distributed across tissue sections. Most automated image analysis methods quantify steatosis in terms of the area fraction of fat droplets in the entire tissue. Thereby, they ignore its spatial heterogeneity. We present and evaluate novel measures for quantifying steatosis that take its spatial heterogeneity into account. Methods: The measures are computed in three steps. First, whole-slide images of liver tissue sections are divided into thousands of small tiles. Second, the steatosis area fractions in the individual tiles are determined via automated image analysis. Third, different descriptive statistics are derived from the distribution of steatosis area fractions across the tiles. We evaluated the measures on 30 hematoxylin-and-eosin-stained mouse liver sections from different groups, where each group represented a different state of diet-induced steatosis. Measurement reliability was assessed via the intra-class correlation coefficient (ICC) with respect to the groups. Results: The mean steatosis area fraction across all tiles, which corresponds to the result produced by most image analysis methods, was only fairly reliable in distinguishing the groups (ICC=0.53). Statistics of the spatial steatosis distribution, like certain percentiles, showed excellent reliability, with ICC values up to 0.87. Conclusions: We conclude that, when quantifying steatosis in histological tissue sections, it is necessary to take its spatial heterogeneity into account. Whole-slide image analysis provides the unique capability to assess heterogeneity across entire tissue sections.
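The three-step procedure above can be sketched in a few lines of numpy. This is a minimal illustration, not the authors' implementation; the tile size, the binary-mask representation and the choice of statistics are assumptions:

```python
import numpy as np

def tile_area_fractions(fat_mask, tissue_mask, tile=256):
    """Step 1+2: divide a whole-slide fat/tissue segmentation into tiles
    and return the steatosis area fraction (fat pixels / tissue pixels)
    for every tile that contains tissue."""
    h, w = fat_mask.shape
    fractions = []
    for y in range(0, h - tile + 1, tile):
        for x in range(0, w - tile + 1, tile):
            tissue = tissue_mask[y:y + tile, x:x + tile].sum()
            if tissue == 0:
                continue  # skip background-only tiles
            fat = fat_mask[y:y + tile, x:x + tile].sum()
            fractions.append(fat / tissue)
    return np.array(fractions)

def steatosis_measures(fractions):
    """Step 3: the global mean (the conventional measure) plus
    distribution statistics that capture spatial heterogeneity."""
    return {
        "mean": fractions.mean(),
        "p50": np.percentile(fractions, 50),
        "p90": np.percentile(fractions, 90),  # heavily steatotic tiles
        "iqr": np.percentile(fractions, 75) - np.percentile(fractions, 25),
    }
```

In a heterogeneous section, percentile statistics of the tile distribution separate "hot" steatotic regions that a single global area fraction averages away.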

   Implementation of Neural Networks for Studies of Brain Pathology in Parkinson's Disease

Ilmari Parkkinen1, Anna-Maija Penttinen1, Katrina Albert1, Jaan-Olle Andressoo1, Jaakko Kopra1, Sami Blom2, Kari Pitkänen2, Merja Voutilainen1, Mart Saarma1, Mikko Airavaara1

1Institute of Biotechnology, University of Helsinki, Helsinki, Finland, 2Fimmic Oy, Helsinki, Finland. E-mail: [email protected]

Background: Unbiased estimates of neuron numbers within the substantia nigra are important for experimental Parkinson's disease models. Although the widely used unbiased stereological counting techniques with optical fractionation are accurate, they are extremely laborious and time-consuming. The development of neural networks and deep learning has enabled machine learning to be applied to automated cell counting. The advantages of computerized counting are reproducibility, elimination of human error, and fast, high-capacity analysis. We implemented whole-slide digital imaging and convolutional neural networks (CNN) to count dopamine neurons and Lewy bodies in mouse and rat brain. Methods: After immunohistochemistry, digital whole-slide images of brain sections were acquired at 0.22 μm/px resolution. We trained a CNN algorithm on the WebMicroscope® platform using 528 megapixels of image data to recognize TH-positive neuron cell bodies in the digital images and validated it against published data and independent human observers. Results: The algorithm's performance was validated against stereology (Pearson correlation 0.9, p<0.0001; R2=0.819; n=44) and against manual neuron cell body counts by two independent observers in regions that were not included in the training data (Pearson correlation 0.98, p<0.001; R2=0.95; n=26). The sensitivity, specificity and F1-score of the algorithm were 88.5% (CI95: 85.5–91.4%), 87.8% (CI95: 84.9–90.7%) and 88.2% (CI95: 85.3–91.0%), respectively. We also trained an algorithm to detect phosphorylated alpha-synuclein (pAsyn S129), a marker for Lewy bodies. Conclusions: The algorithms developed on WebMicroscope® are robust tools for cell counting in mouse and rat brain sections, enabling fast, high-capacity analytics for experimental studies of Parkinson's disease.
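For reference, the reported validation metrics follow directly from confusion-matrix counts; a minimal sketch (the counts used in the test below are hypothetical, not the study's data):

```python
def detection_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity and F1-score from confusion-matrix
    counts, as used to validate a cell-detection algorithm."""
    sensitivity = tp / (tp + fn)   # recall: fraction of true neurons found
    specificity = tn / (tn + fp)   # fraction of non-neurons rejected
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, f1
```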

   Open and Collaborative Digital Pathology Using Cytomine

Raphaël Marée1, Renaud Hoyoux2, Ulysse Rubens1, Romain Mormont1, Rémy Vandaele1, Christopher Hamilton2, Grégoire Vincke2, Pr. Pierre Geurts1, Pr. Louis Wehenkel1

1Montefiore Institute, University of Liège, Liège, Belgium, 2Cytomine SCRL FS, Liège, Belgium. E-mail: [email protected]

Common digital pathology software is often closed-source and hardly interoperable. There is a need for sustainable open-source software and open innovation models in digital pathology. Methods: Cytomine has been continuously developed since 2010. It is based on modern web and distributed software development methodologies and machine/deep learning, and it integrates tens of open-source libraries into a user-friendly rich internet application. Results: Cytomine (Marée et al., Bioinformatics 2016) provides remote and collaborative features so that users can readily and securely share their data worldwide. It relies on data models that make it easy to organize and semantically annotate imaging datasets in a standardized way (e.g. to build pathology atlases for training courses or ground-truth datasets for machine learning). It efficiently supports digital slides produced by most scanner vendors, and it provides mechanisms to proofread and share image quantifications produced by machine/deep learning-based algorithms. Cytomine can be used free of charge and is distributed under a permissive license. It has been installed at various institutes worldwide and is used by thousands of users in research and education settings. In addition to a growing research community contributing to its development, we created a co-operative social company whose goals are to promote the sustainability of the project, to coordinate the community, and to provide paid services (hosting, installation, support, training, specific developments, ...) to end-users (researchers, pathologists, companies, ...). Conclusions: We believe our open and cooperative initiative will foster collaborative and reproducible science and will favor innovation in digital pathology.

   Chemotherapeutic Response Prediction in High-Grade Serous Ovarian Cancer Using Histopathological Image Analysis

Valeria Ariotta1, Jimmy Azar1, Elisa Ficarra2, Olli Carpén3,4, Sampsa Hautaniemi1

1Systems Biology of Drug Resistance in Cancer (University of Helsinki), Helsinki, Finland, 2Department of Computer and Control Engineering, Politecnico di Torino, Turin, Italy, 3Department Pathology, Genome-Scale Biology Research Program, Research Programs Unit, Faculty of Medicine, University of Helsinki, Helsinki, Finland, 4Turku University Hospital, University of Turku, Turku, Finland. E-mail: [email protected]

High-grade serous ovarian cancer (HGSOC) is the most common and lethal subtype of epithelial ovarian cancer. The standard treatment for HGSOC consists of surgery and platinum-taxane chemotherapy, and even though 80% of patients have an excellent initial response, the majority relapse within 18 months, leading to a 5-year survival rate of less than 45%. Thus, it is important to develop tools to predict the patient's response and identify which patients benefit from the treatment. Here we use hematoxylin and eosin (H&E) stained histopathological images to predict response to chemotherapy in HGSOC, using prospectively collected samples from 44 HGSOC patients and 55 samples from The Cancer Genome Atlas (TCGA). Firstly, we developed tools to extract features from H&E images using morphological and texture methods, such as the fractions of cancer/immune/stromal cells and the distances between cancer and immune cells. Secondly, we use machine learning methods to predict chemotherapy response based on the extracted features. The accuracy of correct prediction is >75%. Our results demonstrate that H&E images allow prediction of chemotherapy response in HGSOC and shed light on the role of tumor-infiltrating lymphocytes in HGSOC.

   Correlation of Ki67 in Breast Cancer Biopsies and Surgical Specimens Using Image Analysis Techniques

Enrico Pegolo1, Fulvio Antoniazzi2, David Pilutti3, Vincenzo Della Mea3, Carla Di Loreto1,2

1Anatomic Pathology Institute, University Hospital of Udine, Udine, Italy, 2Dipartimento di Area Medica, University of Udine, Udine, Italy, 3Department of Mathematics, Computer Science, and Physics, University of Udine, Udine, Italy. E-mail: [email protected]

Background: The Ki67 labeling index is one of the most robust biomarkers for evaluating the proliferation rate of breast cancers and, along with the assessment of hormone receptor status and HER2 expression, is used to assign molecular subtypes (luminal-A, luminal-B, HER2-enriched, triple negative) for therapeutic decisions. Even though there is no standardized methodology for the evaluation of Ki67, assessment of the most proliferative areas (“hot spots”) is recommended by the guidelines. Ki67 can be evaluated either in the core needle biopsy or in the corresponding surgical specimen, with concordance rates that appear inferior to those of the other biomarkers. Methods: We propose a method that combines the automated identification of hot spots with the quantification of Ki67 positivity in both the needle biopsy and the surgical specimen, to quantitatively analyze their concordance. Firstly, a set (1-3) of hotspots is identified using a previously published automated method (AKHoD). Then, positivity is calculated over the hotspots using the Nuclear v9 method (Leica) with default parameters, to identify the hotspot with the highest proliferative rate. Finally, the proliferative rates resulting from the analysis of the needle biopsies are compared with those from the analysis of the surgical specimens to calculate their correlation. Results: The test was performed on 32 digital slides of Ki67-stained breast cancer biopsies and the corresponding surgical specimens, showing a correlation of more than 89%. Conclusions: This method has shown promising results, although further evaluation in large series is needed to confirm our findings.
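The biopsy-versus-specimen comparison reduces to selecting the most proliferative hotspot per slide and correlating the paired measurements. A minimal numpy sketch with hypothetical values (AKHoD hotspot detection and Leica Nuclear v9 scoring are not reproduced here):

```python
import numpy as np

def hotspot_ki67(positivity_per_slide, n_hotspots=3):
    """For each slide, take the candidate hotspot (up to n_hotspots)
    with the highest Ki67 positivity."""
    return np.array([max(vals[:n_hotspots]) for vals in positivity_per_slide])

def biopsy_specimen_correlation(biopsy_vals, specimen_vals):
    """Pearson correlation between Ki67 rates measured on needle
    biopsies and on the matched surgical specimens."""
    return np.corrcoef(biopsy_vals, specimen_vals)[0, 1]
```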

   Deep Learning Tissue Classification in Whole Slides Images of Colorectal Liver Metastases after Preoperative Chemotherapy

Carlos Fernández Moro1,2, Zhaoyang Xu3, Danyil Kuznyecov4, Faranak Sobhani3, Nira Nirmalathas5, Béla Bozóky1, Qianni Zhang3

1Department of Clinical Pathology/Cytology, Karolinska University Hospital Huddinge, Stockholm, Sweden, 2Department of Laboratory Medicine (LABMED), Division of Pathology, Karolinska Institute, Stockholm, Sweden, 3Queen Mary University of London, London, UK, 4Department of Clinical Pathology and Genetics, Regional Laboratories, Skåne University Hospital, Lund, Sweden, 5Karolinska Institute, Stockholm, Sweden. E-mail: [email protected]

Background: The liver is the main metastatic organ in colorectal cancer, the third most common cancer worldwide. More than 50% of colorectal cancer patients will develop colorectal liver metastases (CRLM), a serious condition associated with significantly reduced life expectancy. Surgical resection of CRLM, usually after preoperative chemotherapy, is the only potentially curative treatment and the only one associated with long-term survival. Accurate assessment and quantification of residual tumor, host tissues and immune cell infiltrates is of critical importance for prognostic stratification and for developing personalized therapies. However, visual quantitative evaluations on whole slides are inherently imprecise and subject to significant interobserver variability, even among specialized liver pathologists. Methods: Extensive annotation was performed by specialized liver pathologists on whole slide images (WSI, n=34) of hematoxylin-eosin stained CRLM resected at Karolinska University Hospital, Stockholm. The annotated classes were tumour cells, hepatocytes, fibrosis, necrosis, mucin, lymphoplasmacytic infiltrate, macrophages, and blood. The annotation strategy aimed at maximizing the captured morphological variation for each class. We trained two multi-scale, context-aware convolutional networks using a patch-based approach. To enhance the generalizability of the model, image augmentation was applied before training, including rotation, random resizing, and adjustment of hue, brightness and saturation. Results: The proposed networks outperformed classical networks trained on a single magnification level. Classification results are exported as overlays on the WSIs for qualitative evaluation by pathologists. Conclusions: Deep networks effectively classify cell types and tissue components in CRLM, opening the path to automated, objective and accurate quantitative histopathological assessments.
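The augmentation step can be illustrated with a numpy sketch. The jitter ranges are illustrative assumptions rather than the parameters used in the study, and hue adjustment is omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment_patch(patch):
    """Randomly rotate and colour-jitter a square uint8 RGB patch.
    Saturation is jittered by interpolating toward the per-pixel gray
    value, brightness by scaling; ranges are illustrative."""
    out = np.rot90(patch, k=int(rng.integers(4))).astype(np.float32)
    gray = out.mean(axis=2, keepdims=True)
    sat = rng.uniform(0.8, 1.2)          # saturation jitter
    out = gray + sat * (out - gray)
    out *= rng.uniform(0.9, 1.1)         # brightness jitter
    return np.clip(out, 0, 255).astype(np.uint8)
```

In a real pipeline, a fresh random variant of each training patch would be generated every epoch to make the classifier less sensitive to stain and orientation differences.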

   Digital Image Analysis of Tumour Associated Macrophages in HER2 Positive Breast Carcinoma

Mieke Zwager1,2, Rico Bense2, Stijn Waaijer2, Hetty Timmer2, Stine Harder3, Andreas Schønau3, Elisabeth de Vries2, Carolien Schröder2, Bert van der Vegt1

Departments of 1Pathology and 2Medical Oncology, University Medical Center Groningen, University of Groningen, Groningen, The Netherlands, 3Visiopharm, Hørsholm, Denmark. E-mail: [email protected]

Background: Tumour-associated macrophages (TAMs) may play a role in breast cancer development and progression. The physiological balance between anti-tumour M1-like and pro-tumour M2-like TAMs is changed in breast cancer. Gaining insight into TAM fractions in breast cancer could be of clinical relevance. Manual scoring of TAMs is labour-intensive, and digital image analysis (DIA) may aid standardised, robust TAM assessment. This study aimed to develop a DIA algorithm to calculate TAM number/ratio in breast cancer and to apply this algorithm to a series of HER2-positive breast carcinomas to assess the relation between TAM number/ratio and tumour characteristics. Methods: Tissue microarrays containing 106 consecutive primary invasive HER2-positive breast carcinomas were serially sectioned and immunohistochemically stained for CD68, CD163, ER, PR, HER2 and AR. DIA was used to detect cells and to classify positive cells based on cytoplasmic staining. The numbers of M1-like and M2-like TAMs were quantified and the M2:M1 ratio calculated. Tumour characteristics (tumour type, grade, size) were revised. Results: We found a median of 1 (range 1-392) M1-like TAM and 419 (range 51-3296) M2-like TAMs per tumour. High numbers of M2-like TAMs were related to high tumour grade and low AR expression (p<0.01 and p=0.01, respectively). A high M2:M1 ratio was related to high tumour grade and low ER expression (p<0.01 and p=0.04, respectively). Conclusions: Our study shows that TAM number/ratio in breast carcinoma can be calculated successfully using DIA. In HER2-positive breast cancers, a high TAM number/ratio is related to unfavourable tumour characteristics.

   3D Histology Based on Serial Sectioning and Computational Reconstruction

Kimmo Kartasalo1,2, Masi Valkonen1, Tapio Visakorpi1, Matti Nykter1,2, Leena Latonen1, Pekka Ruusuvuori1,2

1University of Tampere, Tampere, Finland, 2Tampere University of Technology, Tampere, Finland. E-mail: [email protected]

The integration of serial sectioning, whole slide imaging (WSI) and computational reconstruction algorithms enables the examination of histological samples in 3D at subcellular resolution. This allows visualizing and analyzing the imaged sample in its true three-dimensional context, offering a more comprehensive view of its spatial and morphological characteristics than typical 2D examination. The advantages of computationally reconstructing 3D models from WSI, compared with direct 3D imaging, include the combination of high resolution, large sample sizes and compatibility with existing biochemical techniques such as in situ hybridization and immunohistochemistry, as well as with established histological staining and interpretation protocols. For this purpose, we developed a software pipeline that performs routine 3D reconstruction tasks for large image sizes and datasets in a fully automatic manner. The steps in our 3D reconstruction process are image acquisition, alignment of the images to a shared coordinate space (registration), and visualization of the reconstructed 3D data. We selected a registration algorithm for this implementation and optimized its parameters on the basis of a quantitative benchmarking framework and Bayesian optimization. The pipeline was implemented in Python using the Fiji image analysis distribution to allow automatic processing of large datasets on a computational cluster. As a proof of concept, we applied the system to a murine model of prostate cancer, reconstructing and characterizing the histology of 12 prostates in 3D. We will further scale up the protocol to perform 3D reconstruction of serially sectioned human prostates. The approach is also generally applicable to other tissue types.
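One standard building block of such a registration step is phase correlation, which recovers the translation between consecutive sections. A minimal numpy sketch; the authors' pipeline uses a benchmarked, Bayesian-optimized algorithm, not necessarily this one:

```python
import numpy as np

def phase_correlation_shift(fixed, moving):
    """Estimate the integer translation (dy, dx) such that
    np.roll(moving, (dy, dx), axis=(0, 1)) aligns with `fixed`,
    via the phase-correlation peak in the Fourier domain."""
    F = np.fft.fft2(fixed)
    M = np.fft.fft2(moving)
    cross = F * np.conj(M)
    cross /= np.abs(cross) + 1e-12        # keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = fixed.shape
    if dy > h // 2:                        # unwrap to signed shifts
        dy -= h
    if dx > w // 2:
        dx -= w
    return int(dy), int(dx)
```

A full serial-section pipeline would apply this (or a more general rigid/elastic model) pairwise down the stack, composing the transforms into one shared coordinate space.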

   Real-Time Image Quality Assurance in Digital Pathology

David Ameisen1,2, Julie Auger-Kantor1, Emmanuel Ameisen1

1ImginIT, Paris, France, 2Paris Biotech Santé, Université Paris Descartes, Paris, France. E-mail: [email protected]

Background: Real-time image quality assurance (QA) is key to streamlining digital pathology, especially in clinical use. We have designed a new QA solution for whole slide images (WSI), regardless of their dimensions, quantity or acquisition rate. Methods: Developed on macOS, the programming libraries are GPU-accelerated and implemented in C/C++ and Python. The standalone program based on these libraries runs on Linux, macOS and Windows. Results: Each WSI's custom-sized tiles are analyzed to evaluate objective and quantifiable quality parameters such as sharpness and saturation. Parameters and thresholds are configurable. Global results are generated and stored in an exportable structured format along with a quality heat-map. Analyzed tiles can be reacquired on the fly, and WSIs can be sorted and filtered (kept, put up for review or sent for deletion). The solution handles hundreds of image formats, including many WSI formats; it uses the OpenSlide, DICOM, PIL and OpenCV libraries and accepts third-party libraries. Easy to use by third parties, it is scalable, compatible, and has a low processing footprint. The new sharpness assessment method and quality assessment library we designed in 2017 analyze WSIs at up to 180 Gpx/min and are 50x faster than our results published in 2016, which were already two orders of magnitude faster than the then state of the art at equal computing power. Conclusions: This normalized QA analysis takes less time than acquiring a WSI, and the time spent by QA staff can be reduced sixfold. Such real-time QA in any device, program or server will considerably improve the QA of the entire digital pathology workflow.
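A tile-wise sharpness measure of the kind described can be sketched with the variance of a discrete Laplacian. This is a generic illustration in plain numpy, not the authors' method, and the threshold is an arbitrary placeholder:

```python
import numpy as np

def laplacian_variance(tile):
    """Sharpness score for a grayscale tile: variance of the 4-neighbour
    discrete Laplacian. Low values flag out-of-focus regions."""
    t = tile.astype(np.float32)
    lap = (-4 * t[1:-1, 1:-1] + t[:-2, 1:-1] + t[2:, 1:-1]
           + t[1:-1, :-2] + t[1:-1, 2:])
    return float(lap.var())

def quality_heatmap(image, tile=64, threshold=25.0):
    """Per-tile sharpness map plus a boolean pass/fail mask. The
    threshold is an illustrative assumption; in a real QA system it
    would be configurable."""
    h, w = image.shape
    ny, nx = h // tile, w // tile
    scores = np.empty((ny, nx))
    for i in range(ny):
        for j in range(nx):
            scores[i, j] = laplacian_variance(
                image[i * tile:(i + 1) * tile, j * tile:(j + 1) * tile])
    return scores, scores >= threshold
```

Tiles failing the mask would be the ones queued for on-the-fly reacquisition.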

   Separating Tissue and Stain Information Using Deep Learning for Increased Generalization Ability

Ida Arvidsson1, Niels Christian Overgaard1, Mattias Ohlsson2, Anders Heyden1, Kalle Åström1

1Centre for Mathematical Sciences, Lund, Sweden, 2Department of Theoretical Physics, Lund University, Lund, Sweden. E-mail: [email protected]

Variations in stain appearance are a major problem for techniques in digital pathology: well-performing algorithms trained on samples from one hospital may not generalize to samples from another, because relevant information about the tissue is lost among the details of the staining. To prevent this, we have trained a neural network that separates information about the tissue type from the colours of the stains. The network produces a rough segmentation into relevant tissue components and separately extracts the colour of each main component. The network design is novel: the main part is an autoencoder operating on local features, while a second path separates out the stain information as global features for the whole image. Since the network is trained as an autoencoder, i.e. it tries to reconstruct its input, no annotations are needed and data from new sites can easily be added. Using the encoder of an autoencoder as a preprocessing step before classification, for example with a convolutional neural network, has previously been shown to improve generalization performance on the task of Gleason grading; we believe our suggested modification can improve this further. Moreover, new samples can be generated artificially by modifying the stain features of a sample, similar to, but more principled than, random colour augmentation.

   Automated Ki67 “Pareto Hotspot” is an Independent Prognostic Factor in Breast Cancer Patients

Justinas Besusparis1,2, Benoit Plancoulaine1,3, Paulette Herlin1, Allan Rasmusson1,2, Renaldas Augulis1,2, Aida Laurinaviciene1,2, Arvydas Laurinavicius1,2

1Faculty of Medicine, Vilnius University, Vilnius, Lithuania, 2National Center of Pathology, Affiliate of Vilnius University Hospital Santaros Clinics, Vilnius, Lithuania, 3Normandie Univ, UNICAEN, INSERM, ANTICIPE, Caen, France. E-mail: [email protected]

Background: The Ki67 labeling index (Ki67LI) is a potential prognostic factor in breast cancer, but due to intratumor heterogeneity, Ki67 assessment in hotspots is preferred over global enumeration. Both manual and automated methodologies have addressed the challenges of Ki67LI hotspot quantification, but there is still no agreed-upon definition of a hotspot in terms of its size, contrast, etc. This study used clinical outcomes as the ground truth against which the prognostic value of several hotspot definitions was evaluated, to investigate an automated, objective Ki67LI assessment. Methods: 294 images of Ki67-stained breast cancer samples were analyzed with HALO. The data were then subsampled into hexagonal tiles to compute distribution and heterogeneity indicators. In particular, the 90th percentile, representing the median of the 20% of tumor area with the highest Ki67 expression, was considered the “Pareto hotspot”. Overall survival (OS) analyses were performed for several suggested hotspot definitions and for pathology report data. Results: The median follow-up was 43 months; 25 patients died. The Ki67 indicators allowed significant stratification of the patients into prognostic groups in univariate analyses. Multivariate Cox regression revealed the superiority of the 90th percentile of intratumor Ki67 expression as an independent predictor of worse OS (HR: 1.738, 95%CI: 1.054-2.866, p<0.05). Conclusion: The “Pareto hotspot”, representing the median of a systematically subsampled 20% of the most proliferative tumor tissue, was the best expression of Ki67LI for predicting OS. Although this definition does not necessarily imply discrete, contiguous hotspots, it is biologically relevant and independent of other spatial features.
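Once per-tile Ki67 values are available, the “Pareto hotspot” indicator itself is a one-liner: the 90th percentile of the tile distribution is the median of the top 20% most proliferative tiles. A minimal sketch with hypothetical tile values:

```python
import numpy as np

def pareto_hotspot(tile_ki67, q=90):
    """'Pareto hotspot' indicator: the q-th percentile of per-hexagon
    Ki67 labeling indices. For q=90 this equals the median of the 20%
    of sampled tumor tissue with the highest Ki67 expression."""
    return float(np.percentile(np.asarray(tile_ki67), q))
```

In a heterogeneous tumor, this indicator sits well above the global mean Ki67LI, which is exactly the property exploited for prognostic stratification.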

   Working Methods for End-to-End Development of Automated Assistants in Pathology

Martin Lindvall1,2, Karin Skoglund3, Jerónimo Rose2, Claes Lundström1,2

1Sectra AB, Linköping, Sweden, 2Linköpings Universitet, Linköping, Sweden, 3Department of Pathology, Region Östergötland, Linköping, Sweden. E-mail: [email protected]

The recent technical advancements in deep learning are eagerly being applied in digital pathology. However, few studies focus on the end-to-end steps needed to realize clinical utility. In this case study, using a cross-disciplinary team of researchers, industry representatives and clinicians, we describe a stepwise process that includes need-finding, problem definition, matching technical possibilities with clinical utility, data selection and curation, the development of annotation protocols and, finally, creating a deep learning pipeline. The case study was conducted in the context of supporting the diagnosis of high-grade serous carcinoma (HGSC) in ovarian tissue. The proposed process can be summarized as follows. Need-finding: an analysis of areas of possible improvement, as manifested in clinical practice; for the HGSC case, the selected need was the waiting time for ordering additional immunohistochemistry (IHC) slides. Problem definition: identifying the objective of the automated assistant; here this was set as predicting the need for additional stains. Matching possibilities with utility: judging whether means and objectives are in balance; in our case, considerable time savings can be achieved even with less-than-perfect predictions, in the worst case risking only unneeded IHC slides rather than an incorrect diagnosis. Data selection/curation: ensuring that the content and amount of available data can sufficiently inform the machine learning process. Annotation protocols: pinning down the ontology or other structured labeling to employ, determining the sufficient level of detail, etc. Once all this is done, the deep learning method can be developed, often with off-the-shelf components.

   Immunohistochemical Fluoro-Chromogenic Staining and Digital Image Analysis for Accurate Detection of PD-L1 in Cytokeratin-Positive NSCLC

Teppo Haapaniemi1,2,3, Satu Luhtala1, Onni Ylinen1,4, Ville Muhonen1,4, Taneli Tani2, Jorma Isola1,4

1University of Tampere, Tampere, Finland, 2Päijät-Häme Joint Authority for Health and Wellbeing, Pathology, Lahti, Finland, 3BioSiteHisto, Tampere, Finland, 4Jilab Inc., Tampere, Finland. E-mail: [email protected]

A novel immunotherapy for non-small cell lung carcinoma (NSCLC) is based on blocking the signaling between programmed death ligand 1 (PD-L1) and programmed cell death protein 1 (PD-1). For therapy, demonstration of PD-L1 expression in malignant tumor cells is required. A major problem in the immunohistochemical detection of PD-L1 in NSCLC is PD-L1 expression in alveolar macrophages, which may lead to misinterpretation and increased inter-observer variability. With the fluoro-chromogenic staining presented here, interpretation can be facilitated by demonstrating co-expression of PD-L1 and cytokeratin in carcinoma cells, which excludes PD-L1-positive macrophages from the analysis. FFPE samples of 40 NSCLCs were stained using the fluoro-chromogenic method with PD-L1, PD-1 and cytokeratin antibodies. PD-L1 was detected with HRP-polymer and visualized with DAB. Epithelial cells were labeled with anti-pancytokeratin and detected on the same sections with Cy2-conjugated IgG. Hematoxylin was used as counterstain. Slides were scanned as whole slide images (WSIs) sequentially under brightfield and fluorescence illumination and saved as multilayer JPEG2000 images. WSIs were viewed and analyzed with SlideVantage 1.2 software, whose brightfield and fluorescence image blending mode was found very helpful by the pathologists evaluating PD-L1 stainings. Improved accuracy of the count of PD-L1-positive, cytokeratin-positive carcinoma cells was observed, especially in tumors with low-level PD-L1 expression and in low-grade carcinomas. Importantly, PD-L1 expression in carcinoma cells, macrophages and necrotic tissue was distinguishable by the fluorescent cytokeratin label. Thus, the developed PD-L1 quantification method decreased the risk of misinterpretation and was shown to be feasible for clinical practice. Virtual slides are available at

   Integrin Beta4 Associates with Tumor Budding and Predicts Survival in Stage II Colorectal Cancer

Khadija Slik1, Sami Blom2, Riku Turkki2, Katja Välimäki2, Samu Kurki3, Harri Mustonen4, Caj Haglund4, Olli Carpén3,5, Olli Kallioniemi2,6, Eija Korkeila1, Jari Sundström1, Teijo Pellinen2

1Department of Pathology, Turku University Hospital, University of Turku, Turku, Finland, 2Institute For Molecular Medicine Finland FIMM, Helsinki, Finland, 3Auria Biobank, Turku University Hospital, University of Turku, Turku, Finland, 4Department of Surgery, University of Helsinki and Helsinki University Hospital, Helsinki, Finland, 5Department of Pathology, University of Helsinki and HUSLAB, Helsinki University Hospital, Helsinki, Finland, 6Department of Oncology and Pathology, Science for Life Laboratory, Karolinska Institutet, Stockholm, Sweden. E-mail: [email protected]

Tumor budding is considered an independent predictor of survival in stage II colorectal cancer (CRC). It has been suggested to be associated with epithelial-to-mesenchymal transition, yet the molecular changes are poorly characterized. We first validated the predictive power of tumor budding according to the ITBCC guidelines using H&E whole tumor sections of stage II colon cancer patients (n=232). Tumor budding was prognostic of overall survival and disease-free survival, and independently predicted disease-specific survival (DSS; HR=6.04; 95% CI=2.00-18.20). Blinded to tumor budding, we constructed a TMA of the same patient cohort with 1.2-mm cores of benign, tumor center, and tumor border areas from each patient (n=1248 cores), using the corresponding H&E sample blocks. To characterize phenotypic epithelial marker associations with tumor budding, we performed multiplex IHC (mIHC) staining of the TMA cohort for epithelial markers: pan-cytokeratin, E-cadherin (adherens junctions), ITGB4 = integrin beta 4 (basal cell contacts), and ZO-1 (tight junctions). A change in ITGB4 expression from basal cell-cell contacts to diffuse cytoplasm was observed in cells and cell clusters resembling tumor budding. Automated image analysis was applied to digitally detect all the epithelial clusters. Small epithelial clusters (≤5 cells) with high ITGB4 expression correlated with visual assessment of tumor budding in H&E sections (r=0.240; p<0.001) and predicted DSS when present in the tumor border in the digital analysis (HR=4.14; 95% CI=1.40-12.27). In summary, the mIHC and digital pathology platform provides a powerful tool to study tumor epithelial phenotypes and reveals an association of ITGB4 with high-grade tumor budding in stage II CRC.
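The digital detection of small, ITGB4-high epithelial clusters can be sketched as connected-component analysis over a binary epithelial mask. This is a simplified pixel-level stand-in: a real pipeline groups segmented cells rather than pixels, and the size/intensity cut-offs here are illustrative:

```python
import numpy as np
from collections import deque

def epithelial_clusters(mask):
    """4-connected components of a binary epithelial mask, each
    returned as a list of member coordinates."""
    h, w = mask.shape
    seen = np.zeros((h, w), dtype=bool)
    clusters = []
    for y in range(h):
        for x in range(w):
            if mask[y, x] and not seen[y, x]:
                q, comp = deque([(y, x)]), []
                seen[y, x] = True
                while q:  # breadth-first flood fill
                    cy, cx = q.popleft()
                    comp.append((cy, cx))
                    for ny, nx in ((cy-1, cx), (cy+1, cx), (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                            seen[ny, nx] = True
                            q.append((ny, nx))
                clusters.append(comp)
    return clusters

def small_high_itgb4_buds(clusters, itgb4, max_size=5, min_intensity=128):
    """Flag small clusters (<= max_size members) whose mean ITGB4 signal
    exceeds a threshold -- the tumor-bud candidates."""
    return [c for c in clusters
            if len(c) <= max_size
            and np.mean([itgb4[y, x] for y, x in c]) >= min_intensity]
```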

   Experiments in Training Neural Networks with Heterogeneous Sources

Francesco Daneluzzi1, David Pilutti1, Fulvio Antoniazzi2, Enrico Pegolo3, Carla Di Loreto2, Vincenzo Della Mea1

Departments of 1Mathematics, Computer Science, and Physics and 2Medicine, University of Udine, Udine, Italy, 3Institute of Pathology, Hospital of Udine, Udine, Italy. E-mail: [email protected]

Background: When training systems for tissue classification, the most crucial component is the expert knowledge needed for the ground truth, that is, tissue annotations made by human experts. Good annotations demand a great deal of precise work and time from expert pathologists. We designed a preliminary experiment to verify whether it is possible to reuse available annotations in a classification problem different from the one for which the annotations were collected. In particular, our aim was to recognize invasive tumor in breast slides. Methods: The Camelyon Challenge provided annotated slides for detecting breast cancer metastases in sections of lymph nodes. We exploited those metastasis annotations as examples of invasive breast tumor, and we used breast biopsies found free of invasive or in-situ cancer as normal tissue. Inception v3 was adopted with Caffe and a pre-trained VGG16 model; training on 111 slides was run on an nVidia Titan Xp (from an nVidia GPU grant). The test set comprised 27 breast cancer biopsies. Results: The preliminary results have been encouraging but far from perfect: tumor was recognised in 21 of 27 slides, and the recognised areas were smaller than the real tumor areas. Conclusions: The main reason for incomplete recognition could be the staining differences between the labs involved, which suggests that some form of stain normalization should be applied before training and testing. However, the same issue could arise in any attempt to generalize the use of a system trained for a specific lab.

   Deep Learning Tissue Classification and Fibrosis Quantification in Digital Slides of Ductal Carcinoma In Situ Top

Zhaoyang Xu1, Carina Strell2, Carlos Fernández Moro3,4, Yibao Sun1, Fredrik Wärnberg5, Arne Östman2, Qianni Zhang1

1Multimedia and Vision Group, Queen Mary University of London, London, United Kingdom, 2Department of Oncology-Pathology, Cancer Center Karolinska, Karolinska Institute, Stockholm, Sweden, 3Department of Clinical Pathology/Cytology, Karolinska University Hospital, Stockholm, Sweden, 4Department of Laboratory Medicine (LABMED), Division of Pathology, Karolinska Institute, Stockholm, Sweden, 5Department of Surgical Sciences, Uppsala Academic Hospital, Uppsala University, Uppsala, Sweden. E-mail: [email protected]

Background: The incidence of DCIS diagnoses has increased since the introduction of screening programs. DCIS is considered a precursor of invasive ductal carcinoma, but our knowledge of its natural progression course is rather limited. Most patients are treated with breast-conserving surgery and radiotherapy. To improve future disease management, the identification of prognostic and predictive biomarkers is in high demand. In-depth quantitative analysis of the different tissue components of DCIS and their morphological properties could provide important information for better stratification of patients. Methods: Multi-class ground truth annotation of a training set was provided based on the morphology of different tissue components in whole slide H&E images of DCIS, including fat (FA), fibrosis (FI), periductal fibrosis (PF), immune infiltration (IF), and tumour (TU). To classify the different tissue types, we trained a 5-layer deep convolutional neural network with patches cropped from the annotated regions at the 20X magnification level. A sliding-window method was then used to generate the distribution of the different components over the whole slide image. Results: The training and testing accuracies on the patches are 99.6% and 98.3%, respectively. The F1-scores for the five classes in the testing set (FA/FI/PF/IF/TU) are 99.9%, 98.3%, 91.7%, 99.3%, and 99.7%. Conclusions: These experiments demonstrate that the five tissue types can be well differentiated using appropriately designed deep neural network structures.
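The sliding-window step described above can be sketched in a few lines of Python. The sketch below is a minimal illustration assuming a per-patch classifier is already available; `classify` here is a toy placeholder, not the 5-layer network from the study.

```python
import numpy as np

def sliding_window_map(image, patch_size, stride, classify):
    """Slide a window over `image` and record the predicted class of each
    patch, yielding a coarse map of tissue-component distribution."""
    h, w = image.shape[:2]
    rows = (h - patch_size) // stride + 1
    cols = (w - patch_size) // stride + 1
    class_map = np.zeros((rows, cols), dtype=int)
    for i in range(rows):
        for j in range(cols):
            y, x = i * stride, j * stride
            class_map[i, j] = classify(image[y:y + patch_size, x:x + patch_size])
    return class_map

# Toy image: left half "class 0", right half "class 1";
# the toy classifier simply thresholds mean intensity.
demo = np.zeros((64, 64))
demo[:, 32:] = 1.0
cmap = sliding_window_map(demo, patch_size=16, stride=16,
                          classify=lambda p: int(p.mean() > 0.5))
```

In practice, the resulting class map is upsampled or color-coded to overlay the component distribution on the whole slide image.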

   Multi-Scale Fusion for Semantic Segmentation on CRLM Tumour Border Top

Zhaoyang Xu1, Carlos Fernández Moro2,3, Danyil Kuznyecov4, Faranak Sobhani1, Qianni Zhang1

1Multimedia and Vision Group, Queen Mary University of London, London, United Kingdom, 2Department of Clinical Pathology/Cytology, Karolinska University Hospital, Stockholm, Sweden, 3Department of Laboratory Medicine (LABMED), Division of Pathology, Karolinska Institute, Solna, Sweden, 4Department of Clinical Pathology and Genetics, Regional Laboratories, Skåne University Hospital, Lund, Sweden. E-mail: [email protected]

Background: The accurate identification of tissue components in the tumour border area is the foundation for further histopathology image analysis tasks. However, owing to the high morphological variance in histology images, especially in border regions where cancer cells infiltrate into normal tissue, it is challenging even for pathologists to define the border with precision, let alone for machines. Method: We present a framework to semantically segment the tumour border area in colorectal liver metastases at the pixel level by integrating features from deep convolutional networks with spatial and statistical information about the cells. Using annotations from pathologists, a two-level deep neural network, comprising a cell-level model and a tissue-level model, is trained to classify patches from whole slide scan images. Based on the prediction outputs of the trained models, a growing-style algorithm is proposed to finalize the segmentation by leveraging the statistical and spatial properties of the cells. Results: Experiments on 10 tumour border areas demonstrate that the proposed model achieves 81.08% pixel accuracy and 60.03% mean intersection over union, compared with 77.79%/58.39% for the tissue-level model alone and 62.7%/44.42% for the cell-level model alone. Conclusions: The framework jointly exploits the advantages of deep convolutional neural networks trained at both the tissue level and the cell level for a more accurate result.

   Open Source Piwigo as Collaborative Platform for Effective Review of Histopathological Annotations from Digital Slides Top

Zhaoyang Xu1, Faranak Sobhani1, Carlos Fernández Moro2,3, Qianni Zhang1

1Multimedia and Vision Group, Queen Mary University of London, London, United Kingdom, 2Department of Clinical Pathology/Cytology, Karolinska University Hospital, Stockholm, Sweden, 3Department of Laboratory Medicine (LABMED), Division of Pathology, Karolinska Institute, Stockholm, Sweden. E-mail: [email protected]

Background: In the deep learning era, a dataset with high-quality annotations is the foundation of a successful model. For histopathological images, the manual annotation process requires substantial expertise and effort. To overcome annotator subjectivity and improve annotation quality, cross-review of the annotated regions is necessary. Pathologists usually prefer offline annotation to online platforms because of the better user experience, which does not depend on a smooth network connection. However, offline annotation limits the possibility of collaborative annotation. Methods: The platform is built on the open source library OpenSlide and the online picture sharing and management system Piwigo. OpenSlide provides a universal interface for decoding most histology image formats, while Piwigo offers stable APIs for uploading and managing images. The developed platform combines the advantages of the two and provides pathologists with a flexible and efficient way to perform histopathological annotation and analysis collaboratively. Results: The basic functions of the platform include the following: 1) Parse and process different digital histopathology image and annotation formats. 2) Extract and upload annotated regions and their tags/labels to the review platform. 3) Allow permitted users to edit the properties of the uploaded regions, including commenting, rating, and deleting. 4) After the review process, save the final annotations of the slides in any format required. Conclusions: The developed platform enables low-cost and efficient collaborative review of histopathological annotations online.

   Scorenado: An Efficient and User-Friendly Visual Assessment Tool Top

Micha Eichmann1, Stefan Reinhard1, Inti Zlobec1

1Institute of Pathology, University of Bern, Bern, Switzerland. E-mail: [email protected]

Despite significant advances in the field of digital image analysis (IA), visual assessment (VA) of histological slides and quantification of features/staining by “eyeballing” is still common practice. This is due to the need for large expert-annotated datasets to train deep learning IA algorithms and the absence of reliable IA algorithms tailored to the everyday needs of pathologists. Here, we present Scorenado, a software solution for the VA of tissue microarray (TMA) and whole-tissue slide images. For broad compatibility, Scorenado was developed as a cross-browser web tool in jQuery for individual users and groups. It runs on an Apache web server with PHP and a MariaDB database. Additionally, Groovy scripts were written for the open-source software QuPath to automate the creation of TMA spot or tissue tile image collections for use in Scorenado. Scorenado permits researchers to define and record their parameters of interest (score, count, class, etc.) and automatically optimizes its graphical user interface accordingly. In combination with the randomization of presented images, this facilitates VA in a blinded and time-efficient manner. So far, Scorenado has been successfully used for VA of tumor buds, immune cells, protein expression, and colorectal polyp classes, totaling 63,388 TMA slide spots and 2,235 whole-tissue slide tiles. Incorporating Scorenado into the research routine not only allows researchers to evaluate TMAs and whole slides efficiently but also greatly facilitates the creation of expert-annotated datasets on the fly, which has the potential to significantly advance digital clinical histopathology.

   Virtual Staining Simulator for Computational Pathology Method Development Top

Martin Risdal1, Thor Ole Gulsrud1, Emiel Janssen2

1International Research Institute of Stavanger, Stavanger, Norway, 2Stavanger University Hospital, Stavanger, Norway. E-mail: [email protected]

Worldwide, significant efforts are being made in the research and development of computational pathology decision support systems. It is well known that tissue biopsy H&E staining procedures, color preferences, and biopsy thickness vary significantly between laboratories. A computational pathology method should hence be validated on datasets with representative staining/coloring diversity. Ideally, each dataset should be cut and stained by different laboratories. Unfortunately, this is hard to accomplish due to the different pre-analytic circumstances and instrumentation at the different laboratories, so establishing such a dataset is costly in terms of human resources, time, and funds. We have addressed this issue by developing an adaptive staining simulator, which synthesizes staining variations by adapting the hematoxylin and eosin staining of a whole slide image to that of a reference. The staining simulator is based on a combination of a tissue/pixel classifier, a coloring model, and a color adaptation scheme. The coloring model operates in the Hue-Saturation-Value (HSV) color space. The coloring differences between nuclei and eosinophilic structures require a classification of the different cell structures and types. An adaptation scheme fits the model to a specific biopsy with a desired coloring by minimizing an objective function. The staining simulator concept has been evaluated by tuning the colors of biopsy samples from other sites to match those of a local pathology lab, where the image quality and color match were qualitatively evaluated by lab personnel. Initial tests of the simulator show promising results.
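As a rough illustration of a coloring model operating in HSV space, the sketch below shifts the mean saturation and value of a set of RGB pixels toward those of a reference. This simple statistic-matching is only a stand-in: the actual simulator fits a per-structure coloring model by minimizing an objective function, which this sketch does not reproduce.

```python
import colorsys

def adapt_colors(pixels, ref_pixels):
    """Shift the mean saturation (S) and value (V) of `pixels` (RGB triples
    in [0, 1]) toward those of `ref_pixels`, leaving hue untouched. A crude
    stand-in for the objective-function-based adaptation in the abstract."""
    to_hsv = lambda px: [colorsys.rgb_to_hsv(*p) for p in px]
    src, ref = to_hsv(pixels), to_hsv(ref_pixels)
    mean = lambda xs, i: sum(x[i] for x in xs) / len(xs)
    ds = mean(ref, 1) - mean(src, 1)  # saturation offset
    dv = mean(ref, 2) - mean(src, 2)  # value (brightness) offset
    clip = lambda v: min(1.0, max(0.0, v))
    return [colorsys.hsv_to_rgb(h, clip(s + ds), clip(v + dv))
            for h, s, v in src]

# Pull a dull red toward a brighter, less saturated reference color.
out = adapt_colors([(0.5, 0.2, 0.2)], [(0.8, 0.4, 0.4)])
```

Working in HSV keeps hue (the stain identity) fixed while the saturation and brightness of the staining are adapted, which is one reason the HSV space is a natural choice for such a coloring model.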

   Considering the z-Axis: Control Data Created with Digital Image Analysis on a Next-Generation Tissue Microarray (ngTMA®) Top

Tilman T. Rau1, Stefan Reinhard1, Carol Büchi1, Micha Eichmann1, Inti Zlobec1

1Institute of Pathology, University of Bern, Bern, Switzerland. E-mail: [email protected]

The tissue microarray is an ancillary tool for biomarker analysis, but the validity and reproducibility of basic TMA data are rarely reported. Here, we measure simple cell counts, tissue area, and cell density across a multi-panel ngTMA experiment in order to determine how these parameters change along the z-axis across serial sections. An ngTMA with 2740 possible core positions was followed across 7 serial sections, irrespective of the underlying immunohistochemistry. Simple cell counts and tissue area per core were measured using the open source software QuPath, and cell densities per core were calculated. Simple statistics included means, standard deviations, and Pearson's correlation. A total of 18,369,417 cells were counted on a total area of 1324.13 mm2, giving an average cell density of 13,781 cells per mm2. Naturally, there was a strong correlation between cell count and the underlying area (Pearson coefficient r=0.869). However, of the three values, cell density showed the least variation compared with area and cell number (percentage SD: 16.9%, 29.9%, and 35.2%, respectively). The shift in these measures along the z-axis (across the serial sections) was in the range of only ±2%. Our approach of analyzing the shift in cell number along the z-axis of a TMA could highlight the robustness of its construction and provide a supporting framework for the pathologist. For instance, including or excluding cores for further analysis could be based on a mathematical-statistical rationale rather than on “interpretability” by the pathologist.
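The per-core statistics used here (density = count/area, the Pearson correlation of count against area, and relative variation) are straightforward to reproduce. The sketch below uses invented toy numbers for five cores, not the study's data.

```python
import numpy as np

# Invented per-core measurements (cells counted and tissue area in mm^2);
# the real study measured 7 serial sections of a 2740-core ngTMA in QuPath.
counts = np.array([12000., 15500., 9800., 14200., 11000.])
areas  = np.array([0.9, 1.1, 0.7, 1.0, 0.85])

density = counts / areas                  # cells per mm^2, per core
r = np.corrcoef(counts, areas)[0, 1]      # Pearson r: count vs. area
cv = lambda x: x.std() / x.mean() * 100   # relative variation, %
```

With these toy values the count-area correlation is strong while the density varies far less than either raw measure, mirroring the pattern reported above.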

   Platform for Training and Implementation of Deep-Learning Neural Network in Prostate Cancer Detection and Grading Top

Kevin Sandeman1, Sami Blom2, Tuomas Ropponen2, Tuomas Mirtti1

1Faculty of Medicine, Medicum, University of Helsinki, Helsinki, Finland, 2Fimmic Oy, Helsinki, Finland. E-mail: [email protected]

Prostate cancer (PC) is globally the second most common cancer and the fifth most frequent cause of cancer mortality in men. The pathological Gleason score (GS) applied to prostate biopsies is considered the most accurate diagnostic and predictive tool for patient outcome. Artificial Intelligence (AI) may help detect and score cancer in the future. To train a deep learning analysis tool for the detection and grading of PC, a uropathological expert team annotated 59 scanned prostate biopsies with 0.22 μm/pixel resolution. Based on morphological pattern, areas were annotated as benign, Gleason 3, Gleason 4, cribriform Gleason 4, or Gleason 5. For an independent validation of agreement between AI and a pathologist, 214 biopsies were analysed using a 7-tier grouping: benign (0), Grade Group (GG) 1-5, and three subgroups in GG 5. From the training areas, AI assigned benign, G3, G4, cribriform G4, and G5 with total area errors of 12.33%, 1.25%, 0.99%, 0.80%, and 0.14%, respectively. In the independent analysis of 214 biopsies, there was total agreement between AI and the clinician in 58 cases; AI gave a higher GG in 134 cases and the clinician in 22 cases. The absolute GG difference between AI and the clinician was 0 in 27.1%, 1 in 30.8%, 2 in 32.3%, and 3 in 7.9% of cases. The Pearson correlation for total tumour area assessment between AI and pathologist was 0.56 (p<0.00001). The results of AI for detection and scoring of PC may have direct implications for improving clinical diagnostics of PC.

   Use of Open Communication Platforms for Diagnosis in Pathology Top

Rodrigo Ugalde Herrá1,2, Ana Fernández Ibañez3, Hector Torres Rivas1, Luis Fernández Fernández1, Ivan Fernández Vega1,3, Jorge Ugalde Puyol3

1Hospital Universitario Central De Asturias, Oviedo, España, 2Cuenca, Ecuador, 3Facultad de Medicina, Universidad de Oviedo, Oviedo, España. E-mail: [email protected]

Introduction: The evolution over recent decades from the microscopes of Ramón y Cajal to digital diagnostic methods has brought a complexity and cost that, in most situations, put these methods out of reach of small laboratories, whether public or private. Many pathologists have therefore been using open communication platforms for their routine activities, for diagnosis or interconsultation, or simply as a teaching medium for resident doctors and for pathologists in continuing education. Materials and Methods: The communication platforms Skype and WhatsApp were used for diagnosis by static image or in real time. In 2017, these platforms were applied in 80 consultation cases, including intraoperative diagnoses of various pathologies and teaching material, to objectively assess the initial diagnosis, the final diagnostic agreement, and their applicability to interconsultation in complex cases. Results: The WhatsApp platform was mainly used for intraoperative diagnosis at both the macroscopic and microscopic levels. A 95% diagnostic correlation was obtained between the initial image-based diagnosis and the final result. Conclusions: We are currently heading toward a change in which digital pathology will become a fundamental tool in our daily work, by allowing access to highly varied samples in a very short time and from very distant places.

   Application of Social Communication Networks in the Continuing Education of Health Professionals Top

Rodrigo Ugalde Herrá1,2, Ana Fernández Ibañez3, Maria del Mar Ugalde Herrá2, Carmen Lis Ugalde Herrá2, Isabel Herrá Diaz de la Espina2, Hector Torres Rivas1, Luis Fernández Fernández1, Sara Marcos Gonzalez4, Ivan Fernández Vega1,2,3, Jorge Ugalde Puyol2

1Hospital Universitario Central De Asturias, Oviedo, España, 2Cuenca, Ecuador, 3Facultad de Medicina, Universidad de Oviedo, Oviedo, España, 4Hospital Universitario Marques de Valdecilla, Santander, España. E-mail: [email protected]

Introduction: In the era of communications, web pages and social networks, among other media, are useful for ongoing training in any branch of medicine. Pathology is no exception: these social networks also serve as disseminators of knowledge and information at a universal level. Institutions and medical specialists present clinical cases and scientific reviews with images and explanations that can be very useful in teaching and daily clinical practice, since in many cases they propose differential diagnoses for multidisciplinary teams or open very interesting debates among health professionals. Methods: We evaluated the use of web pages and different social networks, reviewing the comments made by professionals of various disciplines about the images and reviews proposed during one year. We reviewed 150 cases that included images (macroscopic and microscopic) with their respective medical commentaries, bibliographic reviews, or specific clinical cases, assessing their possible applicability both in daily clinical practice and in the training of health professionals. Results: Review of all the proposed material showed great utility for adequate, high-quality continuing education, with a clinical-diagnostic correlation between the initial suspicion and the subsequent confirmation of >95%. Conclusions: Social networks and web media are very useful for the continuing education of health professionals, since they provide almost unlimited access to clinical cases, bibliographical reviews, and multidisciplinary explanations that are very useful in daily practice, allowing greater diagnostic performance.

   Leveraging Unlabeled Data to Improve Mitosis Detection Top

Saad Ullah Akram1, Talha Qaiser2, Simon Graham2, Juho Kannala3, Janne Heikkilä1, Nasir Rajpoot2

1University of Oulu, Oulu, Finland, 2University of Warwick, Coventry, United Kingdom, 3Aalto University, Helsinki, Finland. E-mail: [email protected]

Mitosis counting in H&E stained tissue biopsies is commonly used to estimate the tumor proliferation rate, an important biomarker for various cancers. It is very tedious, time-consuming, and highly subjective, with large discrepancies between pathologists. Methods: The public mitosis datasets contain a limited number of mitosis samples (~1500), and gains in performance can be made by increasing the training-set size. We propose a self-supervised method in which a model is first trained using labelled data; it is then applied to unlabeled data, and high-confidence detections are used to augment the training set, which is then used to re-train the model. Our mitosis detection model is based on ResNet, which is first trained using mitosis patches and randomly sampled background patches. Once it has converged, it is used to mine hard negatives, which are then used to fine-tune it. Results: When using a limited number of mitosis samples for training, adding samples from the unlabelled dataset leads to a significant gain in performance. We also show that when training on the whole of the TUPAC and MITOS datasets, the use of unlabeled data leads to improved performance, with our model reaching an F1-score of 0.68 on the TUPAC validation set and 0.62 on the MITOS14 validation set. Conclusions: In digital pathology, large unlabeled datasets are readily available, and annotating them can be very challenging as it typically requires domain-specific knowledge. Self- and semi-supervised methods have the potential to utilize these vast untapped resources and surpass the performance of fully supervised methods.
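The core self-supervised step — augmenting the training set with high-confidence detections from unlabeled data — can be sketched as follows. The function name and confidence threshold are illustrative, not taken from the abstract.

```python
import numpy as np

def augment_with_pseudo_labels(train_X, train_y, unlabeled_X,
                               predict_proba, thresh=0.95):
    """One self-training round: score unlabeled patches with the current
    model and add only high-confidence mitosis detections (labelled 1)
    back into the training set for re-training."""
    probs = predict_proba(unlabeled_X)
    keep = probs >= thresh
    new_X = np.concatenate([train_X, unlabeled_X[keep]])
    new_y = np.concatenate([train_y, np.ones(keep.sum(), dtype=int)])
    return new_X, new_y

# Toy data: a frozen "model" that returns fixed probabilities.
train_X = np.zeros((4, 8)); train_y = np.array([1, 1, 0, 0])
unlabeled_X = np.ones((5, 8))
probs = np.array([0.99, 0.20, 0.97, 0.50, 0.96])
new_X, new_y = augment_with_pseudo_labels(train_X, train_y, unlabeled_X,
                                          lambda X: probs)
```

In the full loop described above, the model would be re-trained on the augmented set and the mining step repeated; a similar confidence-based selection underlies the hard-negative mining step as well.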

   Multi-Centre Raman Spectroscopy in Oesophageal Cancer Pathology Classification Top

Martin Isabelle1, Catherine Kendall2, Nick Stone3, Geraint Thomas4, Gavin Lloyd2, Riana Gaifulina4, Neil Shepherd2, Manuel Rodriguez-Justo4, Aaran Lewis4, Jennifer Dorney3, Ian Bell1, Hugh Barr2

1Renishaw Plc, New Mills, Wotton-under-Edge, Gloucester, United Kingdom, 2Biophotonics Research Unit, Gloucestershire Hospitals NHSFT, Gloucester, United Kingdom, 3The University of Exeter, Exeter, United Kingdom, 4Department of Cell and Developmental Biology, University College London, London, United Kingdom. E-mail: [email protected]

Introduction: Despite clear evidence of the potential of Raman spectroscopy (RS) as an accurate diagnostic tool in a range of cancer and non-cancer diseases, RS has yet to make the wide translation from research to clinic. One barrier blocking clinical implementation is the lack of evidence that RS can provide comparable results or diagnoses across multiple sites utilising multiple instruments. Methods: Three Renishaw benchtop RA800 series Raman spectrometers (located at three different geographical sites) were used to collect Raman spectra from tissue sections from patients with Barrett's oesophagus, dysplasia, and adenocarcinoma. Contiguous haematoxylin and eosin (H&E) stained sections were prepared for histological review to identify homogeneous regions of tissue pathology. The review was repeated (blind) by other expert histopathologists so that consensus opinion could be used to develop rigorous multivariate classification models. Results: Algorithms were developed to minimize instrument and sample quality variations within and between the instruments and sites. Spectra were processed, and classification models were developed to discriminate normal squamous (NSq) tissue versus adenocarcinoma (AC), low risk versus high risk, and between five pathology groups. Discussion: Single-site classification achieved good performance, with NSq vs AC sensitivity of 92-100% and specificity of 96-100%, and independent validation sensitivity of 95-100% and specificity of 83-100%; however, the classification models were not fully transferable between data collected at the three sites. This study illustrates the potential of developing validated oesophageal classification models constructed from data measured at three remote instrument sites, with potential application to other tissue types.

   An AI-based Quality Control System in a Clinical Workflow Setting Top

Judith Sandbank1,2, Chaim Linhart2, Joseph Mossel2

1Institute of Pathology, Maccabi Healthcare Services, Rehovot, Israel, 2IBEX Medical Analytics Ltd., Tel Aviv-Yafo, Israel. E-mail: [email protected]

Maccabi Healthcare Services is a large healthcare provider with a centralized pathology institute that handles 120,000 histology accessions per year, of which approximately 700 are prostate core needle biopsies (PCNBs). Roughly 40% of the PCNBs are diagnosed with cancer. In collaboration with IBEX Medical Analytics, we developed software that identifies various cell types and features within whole slide images of PCNBs, including cancerous glands (of Gleason patterns 3, 4, and 5), high-grade PIN, and inflammation. The algorithm utilizes state-of-the-art Artificial Intelligence (AI) and Machine Learning techniques and was trained on many thousands of image samples, taken from hundreds of PCNBs from multiple institutes and manually annotated by senior pathologists. We ran the algorithm on 80 retrospective cases that had been diagnosed as benign and found two major errors: in both cases, the algorithm identified small foci of Gleason 3. Two years later, both patients were diagnosed with higher-grade cancer and underwent radical prostatectomy. Following these findings, the institute decided to deploy the algorithm as a QC system on all new PCNBs entering the lab. The system raises an alert whenever it encounters a discrepancy between the automated analysis and the original diagnosis, prompting a second human opinion. The complexity of prostate cancer diagnosis, together with the considerable shortage of pathologists, makes such a QC system extremely useful for diagnostic accuracy and safety. To the best of our knowledge, this is the first AI-based digital pathology diagnostic system running in a live clinical setting.

   Telepathology in the Grand-Duchy of Luxembourg: Integrating a Network of Hospitals with a Central Laboratory Top

Daniel Val1, Paulo Miranda1, Jean-Marc Papi1, Alves Javier1, Adrian Cuevas1, Margarida Sarreira1, Philippe Vielh1, Michel Mittelbronn1, Fernando Schmitt1,2

1Laboratoire National De Santé, Dudelange, Luxembourg, 2Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal. E-mail: [email protected]

Background: Luxembourg is a European country with five different hospitals and a single public Pathology Laboratory. This geographic dissociation poses several challenges, the most notable being a long delay in intra-operative frozen section reporting, to the point of making it unavailable for some surgical procedures and/or hospitals located too far away. Methods: A Telepathology room was set up in two hospitals during 2016-17, with a plan to expand to the remaining three. Each room was equipped with a remote-controlled microscope, a virtual macro camera, and equipment to process fresh and fixed specimens. A protected VPN network was established between the institutions to allow a live view of both macroscopic and microscopic images. A validation phase of twenty cases per hospital was established, and follow-up until March 2018 was further analyzed. Results: Comparison with results prior to the use of Telepathology showed a clearly reduced delay in frozen section reporting. Discrepancies between virtual and traditional frozen sections were analyzed, with no significant disadvantage of telepathology assessment versus traditional assessment. Conclusions: A project to integrate all the hospitals with a central Pathology Laboratory is analyzed. Its impact on patient care and surgeon satisfaction proves beneficial, and other benefits are also outlined.

   Preliminary Study of Stromal Compartment of Prostate Cancer through Fractal Dimension Top

Mircea-Sebastian Serbanescu1, Razvan-Mihail Plesea2,3, Rossy Vladut Teica1, Viorel Ciovica3, Valentin Tiberiu Moldovan4, Iancu-Emil Plesea3, 4, 5

Departments of 1Medical Informatics and Biostatistics and 2Medical Genetics, University of Medicine and Pharmacy of Craiova, Craiova, Romania, 3Doctoral School, University of Medicine and Pharmacy of Craiova, Craiova, Romania, 4National Institute of Research-Development in the Pathology Domain and Biomedical Sciences “Victor Babes,” Bucharest, Romania, 5Department of Pathology, University of Medicine and Pharmacy “Carol Davila,” Bucharest, Romania. E-mail: [email protected]

Background: The quest for new methods for the objective assessment of prostatic carcinoma images is still open. Most methods focus on the objective evaluation of nuclei distribution, leaving the stroma behind. Our approach focuses on assessing the distribution of the stromal compartment using the fractal dimension (FD), on the assumption that in any malignant process cell proliferation influences the stromal compartment. Methods: Four serial sections from 229 prostatic cancer cases were stained with H&E for grading, and with the Gömöri technique, Goldner's trichrome, and the CD34 immunomarker to assess glandular architecture, collagen distribution, and the vascular network, respectively. Assessment images were binarized using different approaches, with a color focus for the Goldner and CD34 stainings and an intensity focus for the Gömöri staining. The FD was computed for each binary image using a box-counting algorithm. The three computed values were used for clustering (k-means) and classification (k-Nearest-Neighbour, k-NN). Images were classified using Gleason's grading system: Gl1 (n=2), Gl2 (n=24), Gl3A (n=42), Gl3B (n=40), Gl3C (n=9), Gl4A (n=62), Gl4B (n=44), Gl5A (n=3), Gl5B (n=3). Results: Clustering showed a mixture between different patterns, probably the reason why other papers use different grading systems. Given the non-uniform distribution of cases across patterns, the k-NN proved to be a good choice, with a high classification rate. The results show that stromal assessment can bring useful information to the objective quantification of prostate cancer. Conclusions: The combined use of the three FDs gave good results in the classification task, but, given the uneven distribution of cases across patterns, larger and more uniform datasets should be examined.
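A box-counting estimate of the fractal dimension, as applied here to the binarized staining images, can be sketched as follows; the grid sizes and the trimming strategy are illustrative choices.

```python
import numpy as np

def box_counting_fd(binary, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension of a binary mask by box counting:
    count occupied boxes N(s) for each box size s, then fit the slope of
    log N(s) against log(1/s)."""
    counts = []
    for s in sizes:
        h, w = binary.shape
        # Trim so the grid divides evenly, then count boxes containing
        # any foreground pixel.
        grid = binary[:h - h % s, :w - w % s].reshape(h // s, s, w // s, s)
        counts.append(np.count_nonzero(grid.any(axis=(1, 3))))
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return slope
```

A filled square yields an FD near 2 and a straight line near 1, which makes for a quick sanity check before applying the estimator to binarized stain masks.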

   HistoQC: A Quality Control Pipeline for Digital Pathology Slides Top

Andrew Janowczyk1, Ren Zuo1, Michael Feldman2, Anant Madabhushi1

1Case Western Reserve University, Cleveland, Ohio, USA, 2School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, USA. E-mail: [email protected]

Background: Artifacts in digital pathology (DP) slides come from many potential sources, e.g. slide preparation or scanning. Consequently, there is a need for tools to assess DP image quality and determine diagnostic/computational suitability. In this work we present a DP-specific quality control program, HistoQC, for the automated assessment of image quality. Methods: HistoQC employs a combination of image features (e.g., color histograms, brightness, contrast) and supervised classifiers (e.g., pen detection) to identify regions within the image that are artifact free. 92 whole slide images of breast cancer were randomly chosen from 1143 TCGA slides for this study, encompassing a diversity of artifacts including pen markings, bubbles, and cracks. The images and the corresponding HistoQC-generated masks of artifact-free regions were manually reviewed by a pathologist. Results: 85 of the 92 masks created by HistoQC were identified as suitable for diagnostic evaluation. Additionally, both HistoQC and the pathologist identified two slides as neither diagnostically nor computationally suitable due to bubble artifacts. Conclusion: Initial results suggest HistoQC could be used to identify both images with preparation artifacts and artifact-free regions within the slide image. We additionally anticipate that researchers could extend HistoQC's modular functionality to define the artifact and image characteristic tolerances of their own pipelines.
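As a toy illustration of the feature-thresholding idea — this is not HistoQC's actual implementation, which combines many features with supervised classifiers — a brightness-based artifact-free mask might look like:

```python
import numpy as np

def artifact_free_mask(gray, dark=0.1, bright=0.9):
    """Keep pixels in a plausible tissue intensity range: very dark pixels
    (pen ink, cracks) and very bright pixels (background, bubbles) are
    excluded. Thresholds here are invented for the example."""
    return (gray > dark) & (gray < bright)

# Tiny grayscale tile with one near-black and one near-white pixel.
img = np.array([[0.05, 0.50, 0.95],
                [0.40, 0.60, 0.02]])
mask = artifact_free_mask(img)
```

Per-feature masks like this one can then be intersected across features, so only regions passing every quality check are retained for analysis.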

   Computer Assisted Diagnostics in Histopathology: A Systematic Review Top

Katherine Hewitt1, Nasir Rajpoot1,2, David Snead1

1Department of Pathology, University Hospital of Coventry and Warwickshire, Coventry, United Kingdom, 2Department of Computer Science, University of Warwick, Coventry, United Kingdom. E-mail: [email protected]

Considerable advances have been made in Computer Assisted Diagnostics (CAD) over the last decade. Image analysis software is routinely used in several specialties, including Radiology and Ophthalmology; however, progress in diagnostic Histopathology has been limited. This is plausibly due to the relatively nascent digitisation of tissue slides; nevertheless, recent developments in machine learning algorithms have shown promise for clinical application. The potential scope of CAD applications in histopathology is considerable. To assess progress in this field, we quantified the scale of research endeavour by reviewing the annual research output related to algorithm development in Histopathology. Electronic databases were interrogated between 2007 and 2017 using a systematic approach. Annual output in terms of the number of studies was analysed and categorised according to area of focus (tissue type, application), country of origin, first author, and publishing journal. Research output almost doubled within the timeframe. A relative peak in publications during the last 4 years was noted, believed to correlate with modifications in HER2 scoring and breakthroughs in deep learning. The most frequent applications for algorithm development were breast cancer diagnosis, grading, and immunohistochemistry scoring. CAD algorithm development in Histopathology, particularly breast pathology, is attracting significant research effort. The most prevalent developments have been algorithms that quantify malignant cell features, tasks which are time consuming and difficult to standardise between trained pathologists. Pathology images lend themselves well to the application of modern machine learning methods, and the potential application of these to CAD in histopathology therefore appears auspicious.

   A Digital Pathology Approach to Evaluate the Prognostic Role of CAF-1/p60 in OSCC

Francesco Merolla1, Daniela Russo2, Giovanni Zarrilli1, Francesco Martino2, Gennaro Ilardi2, Silvia Varricchio2, Virginia Napolitano2, Maria Luisa Vecchione2, Giovanni Dell'Aversana Orabona3, Massimo Mascolo2, Luigi Califano3, Stefania Staibano2

1Department of Medicine and Health Sciences, University of Molise, Campobasso, Italy, 2Department of Advanced Biomedical Sciences, University of Naples “Federico II”, Naples, Italy, 3Department of Neuroscience and Reproductive and Odontostomatological Sciences, Operative Unit of Maxillo-Facial Surgery, University of Naples “Federico II,” Naples, Italy. E-mail: [email protected]

Background: Oral squamous cell carcinomas (OSCC) still account for high mortality and morbidity rates, despite all the efforts to understand the biology of these tumors and to identify reliable markers of outcome. CAF-1 p60, a subunit of the chromatin chaperone CAF-1, is involved in chromatin restoration following DNA replication and repair. The prognostic value of CAF-1 p60 has been demonstrated in several human solid tumors. Methods: In the present study, we performed an immunohistochemical (IHC) assessment of CAF-1 p60 expression on a case series of primary OSCC samples arranged in tissue microarrays (TMAs); the study also included normal oral mucosa samples as reference. Following digitization of the glass slides, we analyzed the digital slides with the open-source software QuPath, classifying the samples and retrieving expression data. We set the threshold for IHC signal quantification on the normal epithelium, considering as “positive” only the tumor cells with levels of p60 expression exceeding the threshold. Finally, we performed survival analysis, correlating p60 expression with patients' follow-up data. Results: Our data showed that overexpression of CAF-1 p60 has significant value as a prognostic marker in OSCC. Conclusion: Taking advantage of image analysis software, we could quantify the overexpression of CAF-1 p60 in oral squamous cell carcinoma samples via an objective and precisely quantifiable evaluation of immunohistochemistry, a technique historically assessed in a qualitative or semi-quantitative manner. We obtained reference values for the quantification of CAF-1 p60 expression that are more sensitive and specific than those so far described in the literature.
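As a rough illustration of the thresholding step described above (the actual per-cell measurements are produced by QuPath; the mean + 2 SD rule and all numbers below are hypothetical, not the study's exact criteria):

```python
import numpy as np

def positivity_threshold(normal_od, k=2.0):
    """Cut-off derived from staining intensities measured in normal
    epithelium: mean + k standard deviations (illustrative rule only)."""
    return normal_od.mean() + k * normal_od.std()

def fraction_positive(tumor_od, threshold):
    """Fraction of tumour cells whose p60 signal exceeds the threshold."""
    return float((tumor_od > threshold).mean())

# Toy data: normal cells cluster at low optical density, tumour cells higher.
rng = np.random.default_rng(0)
normal = rng.normal(0.2, 0.05, 1000)
tumor = rng.normal(0.5, 0.15, 1000)
t = positivity_threshold(normal)
print(t, fraction_positive(tumor, t))
```

The same per-cell intensities could be exported from QuPath's detection measurements and analysed in this way.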

   Low-Cost, Point-of-Care Digital Microscopy for Breast Cancer Lymph Node Frozen Section Analysis

Oscar Holmström1, Nina Linder1, Stig Nordling1, Antti Suutala1, Hannu Moilanen2, Mikael Lundin1, Johan Lundin1

1Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland, 2Center of Microscopy and Nanotechnology, University of Oulu, Oulu, Finland. E-mail: [email protected]

Background: Assessment of lymph nodes for the detection of metastases is essential in the staging of breast cancer, affecting treatment and prognosis. Intraoperative light microscopy of frozen section samples remains the gold standard of diagnosis, but requires the presence of a pathologist close to the point of care. Digital microscopy can provide means for remote consultation, but currently requires expensive and bulky instruments not suitable for use at the point of care or in low-resource settings. Objective: To determine whether a mobile digital microscope scanner, constructed from mass-produced, low-cost consumer microelectronics and smartphone components, can be used for assessment of metastases in routine frozen sections of lymph nodes. Methods: Routinely prepared lymph node frozen section samples from 82 breast cancer patients were collected retrospectively and digitized using the prototype microscopy scanner. The obtained whole-slide images were reviewed by a pathologist to detect metastatic cells. Results were compared to conventional light microscopy examination. Results: Visual assessment of the whole-slide images generated with the prototype scanner resulted in a sensitivity of 93% and a specificity of 100%, with conventional light microscopy analysis as the ground truth. Two cases out of 82 were incorrectly scored as negative with the prototype scanner, but no false positives were observed among the 54 negative test samples, yielding an overall accuracy of 98%. Conclusion: The results indicate that assessment of breast cancer axillary metastases in frozen section samples is feasible by analysis of digital samples scanned using low-cost, point-of-care digital microscopy. Advances in smartphone camera technologies are likely to further improve image quality.
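The reported figures can be reproduced directly from the case counts given in the abstract (82 samples, 54 negative, 2 false negatives, no false positives):

```python
# Counts as reported in the abstract
positives, negatives = 28, 54        # 82 samples total, 54 of them negative
false_negatives, false_positives = 2, 0

tp = positives - false_negatives     # 26 metastatic samples detected
tn = negatives - false_positives     # all 54 negatives correctly scored
sensitivity = tp / positives                       # 26/28 ≈ 0.93
specificity = tn / negatives                       # 54/54 = 1.00
accuracy = (tp + tn) / (positives + negatives)     # 80/82 ≈ 0.98
print(sensitivity, specificity, accuracy)
```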

   Facilitating Ultrastructural Pathology through Automated Imaging and Analysis

Ida-Maria Sintorn1,2, Amit Suveer2, Anca Dragomir3, Kjell Hultenby4, Elisabeth Wetzer2, Kristina Lidayova1, Natasa Sladoje2, Joakim Lindblad2, Martin Ryner1

1Vironova AB, Stockholm, Sweden, 2Department of IT, Centre for Image Analysis, Uppsala University, Uppsala, Sweden, 3Department of Pathology, Uppsala Academic Hospital, Uppsala, Sweden, 4Karolinska Institute and Karolinska Hospital, Huddinge, Sweden. E-mail: [email protected]

Transmission electron microscopy (TEM) is an important diagnostic tool for analyzing human tissue at the nanometer scale. It is the only option, or the gold standard, for diagnosing several disorders, e.g. ciliary and renal diseases and certain rare cancers. However, conventional TEM microscopes are highly manual and technically complex, and a special environment is required to house the bulky, sensitive machines. Interpretation is subjective, time-consuming, and relies on a high level of expertise which, unfortunately, is rare for this specialty within pathology. Here, we present methods and results from an ongoing project whose goal is to develop a smart, easy-to-use platform for ultrastructural pathologic diagnosis. The platform is based on the recently developed MiniTEM instrument, a highly automated table-top TEM. In the project, we develop image analysis methods for guided as well as fully automated search for, and analysis of, structures of interest. In addition, we enrich MiniTEM with an integrated database for convenient image handling and traceability. These points were identified by user representatives as crucial for creating a cost-effective diagnostic platform. We will show strategies and results for using image analysis and machine learning to automatically search for objects and regions of interest at low magnification, as well as for combining multiple object instances acquired at high magnification to enhance the nanometer-scale details necessary for correct diagnosis. This will be exemplified for the diagnosis of primary ciliary dyskinesia and renal disorders. The automation in imaging and analysis within the platform is a big step towards digital ultrastructural pathology.

   Tissue Image-based Outcome Prediction in Breast Cancer with Neural Networks

Dmitrii Bychkov1, Riku Turkki1, Aleksei Tiulpin2, Esa Rahtu3, Juho Kannala4, Heikki Joensuu5, Mikael Lundin1, Nina Linder1,6, Johan Lundin1,7

1University of Helsinki, Finland, 2University of Oulu, Oulu, Finland, 3Tampere University of Technology, Tampere, Finland, 4Aalto University, Espoo, Finland, 5Department of Oncology, Helsinki University Hospital, University of Helsinki, Helsinki, Finland, 6Department of Women's and Children's Health, International Maternal and Child Health, Uppsala University, Uppsala, 7Department of Public Health Sciences, Karolinska Institutet, Stockholm, Sweden. E-mail: [email protected]

Background: A novel approach to survival prediction in cancer is to train a deep learning model on tumor tissue images with outcome as the endpoint, instead of proxies for outcome such as glandular formation, mitoses, pleomorphism or tumor-infiltrating immune cells. This approach eliminates the need for subjective ground truth annotation and domain expertise. In this study, we apply deep networks to image-based outcome prediction in a nationwide cohort of patients with breast cancer. Methods: Whole-slide images of hematoxylin-eosin-stained tissue microarray (TMA) samples were obtained from the primary tumors of 1299 patients with breast cancer. A convolutional neural network (ResNet50) was trained to predict long-term outcome directly from the raw pixel intensities of the digitized tissue samples. Results: The outcome-supervised deep network, trained on TMA images from 979 patients, was used to generate a digital risk score for the 320 patients in the test set. The accuracy of the digital risk score in predicting long-term outcome was 0.60 (AUC) and the hazard ratio was > 2.00. The corresponding AUC for visual risk scoring by a pathologist was 0.58, with a hazard ratio of 1.74. Conclusions: Our results demonstrate that a deep network can extract features predictive of long-term survival from small tissue regions stained for basic morphology only.

   Visualization and Exploration of 3D Augmented Whole Slide Images

Norman Zerbe1, Max Bergfeld2, Stephan Wienert1, Christoph Harms3, Peter Hufnagl1,2

1Department of Pathology, Charité - University Medicine Berlin, Berlin, Germany, 2University of Applied Sciences, Berlin, Germany, 3Department of Experimental Neurology, Charité - University Medicine Berlin, Berlin, Germany. E-mail: [email protected]

Background: Stroke recovery investigations in mice require dedicated regions of interest (ROI) to be identified and compared in both hemispheres of the brain. To conduct measurements down to the cellular level, histological sections have to be used. Each hemisphere has to be assessed separately, since stroke has an asymmetric effect on the brain. Methods: Histological sections of each brain were distributed on multiple glass slides and digitized to acquire WSI. Subsequently, all sections were segmented and aligned with an elastic intensity-based registration using mutual information to create an image stack. Coronal slice-based annotation layers of cell groups and anatomical regions within the hemispheres of the brain were then extracted from the Allen Brain Atlas Project. For each section, the corresponding atlas layer was aligned to both hemispheres separately. Initially, histology and atlas data were aligned coarsely using an iterative intensity-based recursive pyramid registration with B-spline transformations. Subsequently, corresponding landmarks on the borders of histology sections and atlas annotations were automatically extracted and aligned using thin-plate spline transformations. Results: All data are made available through a dedicated graphical user interface to interactively explore and measure the histological image stack at different magnifications. Moreover, annotations can be used to navigate within images, and comparative measurements between hemispheres are possible. Conclusion: The augmentation of WSI enhances the quality and feasibility of quantification and navigation. Moreover, the introduced atlas-histology registration provides a robust way to measure lesion sizes for the space-occupying effect in mouse stroke models.
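A minimal sketch of the mutual-information similarity measure that drives such intensity-based registration (toy images below; the actual pipeline applies elastic, B-spline and thin-plate spline transforms to WSI data):

```python
import numpy as np

def mutual_information(a, b, bins=32):
    """Mutual information between two equal-size grayscale images:
    the similarity measure maximised during intensity-based registration."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    p = joint / joint.sum()                 # joint intensity distribution
    px = p.sum(axis=1, keepdims=True)       # marginal of image a
    py = p.sum(axis=0, keepdims=True)       # marginal of image b
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(1)
img = rng.random((64, 64))
print(mutual_information(img, img))                    # self-MI is maximal
print(mutual_information(img, rng.random((64, 64))))   # near zero when unrelated
```

A registration routine would evaluate this measure over candidate transforms of one image and keep the transform that maximises it.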

   Basal Cell Carcinoma Detection in Full Field OCT Images Using Convolutional Neural Networks

Diana Mandache1, Emilie Benoit2, John R Durkin3, Jean-Christophe Olivo-Marin1, Claude Boccara2, Vannary Meas-Yedid1

1Institut Pasteur, BioImage Analysis Unit, CNRS UMR 3691, Paris, France, 2LLTech, Paris, France, 3Drexel University College of Medicine, Philadelphia, Pennsylvania, USA. E-mail: [email protected]

Skin cancer is the most common human malignancy, predominantly represented by non-melanoma types with 5.4 million cases per year, 80% of which are basal cell carcinomas, with the majority of the remaining being squamous cell carcinomas. The gold standard procedure for treating non-melanoma skin cancer in high-risk areas is Mohs surgery. The technique involves the consecutive removal of thin layers of skin, followed by histological preparation and microscopic examination for tumour clearance. This process can take up to an hour and guides further tissue extraction. We investigate the feasibility of using a non-invasive optical slicing modality, together with automated diagnosis of the cancerous areas, which would speed up the procedure and, consequently, improve patient comfort and physician throughput. In this study, we introduce a new application that exploits the emerging imaging modality of full field optical coherence tomography (FFOCT) as a means of optical biopsy. The objective is to build a computer-aided diagnosis tool that can speed up the detection of tumoral areas in skin excisions resulting from Mohs surgery. Since there is little prior knowledge about the appearance of cancer cell morphology in this type of imagery, deep learning techniques are applied. Using convolutional neural networks, we train a feature extractor able to find representative characteristics of FFOCT data and a classifier that learns a generalized distribution of the data. With a dataset of 40 high-resolution images, we obtained a classification accuracy of 95.93%. These preliminary results are promising for automating the segmentation of FFOCT images.

   Detection of β Cells in Immunohistochemical Stained Mouse Pancreatic Tissue

Talha Qaiser1, Navid Alemi Koohbanani1,2, Nasir Rajpoot1, 2, 3

1University of Warwick, Coventry, United Kingdom, 2The Alan Turing Institute, London, United Kingdom, 3University Hospitals Coventry and Warwickshire NHS Trust, Coventry, United Kingdom. E-mail: [email protected]

Beta (β) cells in the pancreas are primarily responsible for storing and releasing insulin, which ultimately affects the concentration of blood glucose. Accurate detection and density estimation of β cells during mouse pregnancy may aid our understanding of the formation and growth of beta cells in pancreatic islets. We approach this problem by proposing a fast deep convolutional regression model that predicts the spatial coordinates of a bounding box containing a nucleus. The proposed framework is capable of encoding spatial context, as opposed to sliding-window approaches, which are computationally complex and whose analysis is limited to a small subsection of the input image. Our aim is to achieve near real-time detection of beta cells. Our method is closely related to YOLO, with the addition of residual blocks instead of only convolutional layers to alleviate the vanishing gradient problem. In addition, we introduce domain-specific parameters in the loss function that relate to the morphological appearance of a nucleus. The dataset for this study consists of 200 sections of IHC-stained mouse pancreatic whole-slide images. The training dataset contains more than 9,000 hand-marked nuclei from pancreatic islets of Langerhans. We demonstrate the efficacy of the proposed method for effective and efficient detection of pancreatic β cells, which could potentially be used for detecting other types of nuclei in computational pathology.
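Bounding-box predictions from such a detector are conventionally compared against hand-marked boxes with intersection-over-union (IoU), the overlap measure underlying YOLO-style training and evaluation; a minimal sketch:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)   # zero if no overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```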

   Deep Autoencoder based Registration of Histology Images

Ruqayya Awan1, Nasir Rajpoot1, 2, 3

1Department of Computer Science, The University of Warwick, Coventry, United Kingdom, 2The Alan Turing Institute, London, United Kingdom, 3Department of Pathology, University Hospitals Coventry and Warwickshire, Coventry, United Kingdom. E-mail: [email protected]

Cross-slide image analysis provides additional information, compared to single-slide analysis, by expressing the behaviour of different biomarkers. During slide preparation, a tissue section may not be placed in the same orientation as the other sections of the same tissue block. Therefore, after the digitization of these serial sections, their alignment is mandatory prior to any multi-section analysis. Currently, the registration is performed manually by pathologists, which is time-consuming due to the large number of sections per tissue block. There are many studies on medical image registration using intensity-based and feature-based methods, including both supervised and unsupervised learning criteria. Among unsupervised feature-based methods, autoencoders (AEs) have been demonstrated to perform well for the registration of MRI brain images. Since the AE is an unsupervised learning method and does not require ground truth, it can be applied to other imaging modalities. In this study, our objective in using an AE is to learn latent representation features that can be used for the registration of histology images. Our proposed method estimates the transformation that maximises the mutual information (MI) between the features of the two images. For comparison purposes, we experimented with image intensities using the same MI method. Our results demonstrate that the use of AE features reduces the error rate and computational time by a significant margin. In this study, we evaluated our proposed method on small histology images, and the results encourage the use of this approach for the registration of whole slide images.

   A Multi-Task Deep Convolutional Network for Nuclei Segmentation

Simon Graham1,2, Nasir Rajpoot1

1Department of Computer Science, University of Warwick, Coventry, United Kingdom, 2Mathematics for Real-World Systems Centre for Doctoral Training, University of Warwick, Coventry, United Kingdom. E-mail: [email protected]

Segmentation of nuclear material in histology slides is an important step in the digital pathology workflow, because nuclei act as key diagnostic markers. Manual segmentation can be a laborious task, where pathologists are often required to analyse many nuclei within a whole slide image (WSI). The rise of digital pathology has been matched by an increase in interest in automated nuclei segmentation in Hematoxylin & Eosin (H&E) stained histology images, yet this remains a challenge due to two main contributing factors. Firstly, nuclear heterogeneity can lead to nuclei having a variable Hematoxylin intensity, which often has detrimental effects on the success of current methods. Secondly, it is common for tumor nuclei to be clumped together, which makes it difficult to separate individual instances. We propose a deep multi-task neural network that addresses both of these key challenges by: (i) using a novel loss function that is sensitive to the Hematoxylin intensity; (ii) segmenting additional nuclear information that assists with the separation of nuclei. We show that the proposed network outperforms all competing methods in the computational precision medicine (CPM) nuclei segmentation challenge held in conjunction with MICCAI 2017.
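One plausible form of an intensity-sensitive loss is a pixel-wise cross-entropy whose weights depend on the Hematoxylin channel; the weighting below is purely illustrative and not the authors' exact formulation:

```python
import numpy as np

def intensity_weighted_bce(pred, target, hematoxylin):
    """Pixel-wise binary cross-entropy that up-weights pixels with weak
    Hematoxylin staining (illustrative sketch, not the published loss)."""
    eps = 1e-7
    weight = 1.0 + (1.0 - hematoxylin)   # weakly stained pixels count more
    bce = -(target * np.log(pred + eps) + (1 - target) * np.log(1 - pred + eps))
    return float((weight * bce).mean())

rng = np.random.default_rng(0)
target = (rng.random((16, 16)) > 0.5).astype(float)   # toy nuclei mask
hema = rng.random((16, 16))                           # normalized intensity
good = np.clip(target, 0.05, 0.95)                    # predictions near target
bad = 1.0 - good                                      # predictions far off
print(intensity_weighted_bce(good, target, hema),
      intensity_weighted_bce(bad, target, hema))
```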

   On Quantifying the Role of Tumour Infiltrating Lymphocytes in Head and Neck Squamous Cell Carcinoma

Anna Lisowska1, Muhammad Shaban1, Nasir Rajpoot1

1Department of Computer Science, University of Warwick, Coventry, UK. E-mail: [email protected]

Head and neck squamous cell carcinoma (HNSCC) develops in the mucous membranes of the nose, mouth and throat and is among the most prevalent cancers worldwide. It arises from DNA alterations in squamous cells following exposure to mutagenic factors, such as tobacco, large doses of alcohol and human papillomavirus (HPV) infection. Researchers have identified mutations in several genes in HNSCC patients, especially genes associated with the control of cell growth and proliferation. However, the role of these mutations in tumour survival and progression is still largely unknown. The purpose of this study is to investigate how the tumour microenvironment is affected by genetic mutations that have previously been associated with patient survival. Since the host immune response to the tumour has been shown to play a major role in patient outcome, we specifically focus this analysis on tumour infiltrating lymphocytes (TILs) and their interaction with cancer cells. To do this, we first develop a deep learning algorithm to identify TIL-rich areas in Haematoxylin and Eosin-stained whole slide images of histology sections obtained from HNSCC patients, available from the National Cancer Institute Genomic Data Commons Legacy Archive database. Then, we quantify the degree of tumour infiltration by TILs and, finally, investigate the relationship between TIL infiltration, genetic mutations and patient survival. Our long-term goal is to further our understanding of how genetic mutations lead to cancer development and progression and to enable more precise outcome predictions based on patients' genetic profiles, both of which are critical in making patient-oriented therapy recommendations.

   AI-based Identification and Quantification of Growth Patterns in Lung Adenocarcinoma

Najah Alsubaie1,2, Muhammad Shaban1, David Snead3, Ali Khurram4, Nasir Rajpoot1,3

1Department of Computer Science, Warwick University, Coventry, UK, 2Department of Computer Science, Princess Nourah University, Riyadh, KSA, 3Department of Pathology, University Hospitals Coventry and Warwickshire, Coventry, UK, 4School of Clinical Dentistry, University of Sheffield, Sheffield, England. E-mail: [email protected]

Lung adenocarcinoma is one of the most common types of cancer in the world. The main characteristic that distinguishes it from the other types of non-small cell lung cancer is the presence of specific tumour morphology patterns, known as growth patterns or adenocarcinoma histology subtypes. In 2011, a new lung adenocarcinoma classification system was proposed by a joint IASLC/ATS/ERS group, advocating the use of the predominant growth pattern for prognostic purposes. It is routine clinical practice to identify the presence of these patterns by visual examination of histology slides under the microscope. The growth patterns are identified by estimating the percentage of each subtype in 5% increments; the predominant pattern is then assigned to the case. According to the latest (2015) WHO classification of lung tumours, lung adenocarcinoma has five growth patterns: acinar, papillary, micropapillary, solid and lepidic. A tumour may contain one or more of these patterns in the same biopsy. Studies show that the growth patterns are correlated with patient survival, with solid and micropapillary having the worst prognosis. Several studies agree that the micropapillary pattern is an independent factor for overall survival, while cases with a predominant lepidic pattern have good survival.
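The predominant-pattern rule described above can be sketched as follows (the case percentages are hypothetical):

```python
def predominant_pattern(percentages):
    """Return the predominant growth pattern from subtype percentages
    estimated in 5% increments, per the IASLC/ATS/ERS scheme."""
    return max(percentages, key=percentages.get)

# Hypothetical case: estimates in 5% increments summing to 100
case = {"acinar": 40, "lepidic": 25, "solid": 20,
        "papillary": 10, "micropapillary": 5}
print(predominant_pattern(case))  # → acinar
```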

Automatic identification of growth patterns is therefore critical and could affect patients' lives. It would not only reduce pathologists' effort and variability, but could also provide a second opinion to support pathologists' decisions. To the best of our knowledge, ours is the first algorithm to classify growth patterns in lung adenocarcinoma. We propose a deep learning-based framework that mimics the pathologist's examination. Our method identifies all possible patterns by examining the tissue at several resolutions. We then fuse the classification results delivered at these resolutions to construct the final classification over the whole slide image. We measure pattern classification accuracy against pathologist annotations and show promising preliminary results.

   Embedded Deep Features for Classification of Breast Cancer Histology Images

Navid Alemi Koohbanani1,2, Ruqayya Awan1, Muhammad Shaban1, Anna Lisowska1, Nasir Rajpoot1,2

1Department of Computer Science, University of Warwick, Coventry, UK, 2The Alan Turing Institute, London, UK. E-mail: [email protected]

Breast cancer is the most commonly diagnosed cancer in women and, after lung cancer, the second most common cause of cancer mortality. Due to the increased incidence of breast cancer and subjectivity in diagnosis, there is an increasing demand for computer-assisted diagnosis and grading of breast cancer histology images. To this end, deep neural networks (DNNs) have been widely used to produce state-of-the-art results for a variety of histology image analysis tasks such as nuclei detection and classification, and tissue classification and segmentation. The generalizability of DNNs makes their features transferable to other applications, which has encouraged researchers to employ transfer learning for histology images. These features have also been used to train separate classifiers for prediction, which is particularly useful when there is not enough data to train a CNN from scratch. In some recent studies, context-aware learning architectures have been introduced, in which a first CNN is trained on high-resolution patches to extract features at a cellular level that are then fed to a second CNN, stacked on top of the first, to expand the context from a single patch to a large tissue region. The experimental results of these studies suggest that contextual information plays a crucial role in identifying abnormalities in heterogeneous tissue structures. Our contribution in this work is twofold. First, we propose to use CNN features as a generic descriptor for a small dataset. We extract transferable features from a CNN that was trained on image patches, for classification by a separate classifier trained on these features. As our second contribution, we combine these features to learn the context of a large patch to improve classification performance. To this end, we use transferable features for a block of consecutive patches to train an SVM model that classifies the H&E stained breast images into normal, benign, carcinoma in situ (CIS) and breast invasive carcinoma (BIC). We show that the proposed framework outperforms the state-of-the-art method for breast cancer classification.
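A dependency-free sketch of the two ideas: random vectors stand in for the transferable CNN descriptors, blocks of consecutive patches are pooled to capture context, and a nearest-centroid classifier stands in for the SVM (all data and dimensions below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for transferable CNN descriptors of 200 image patches
# (hypothetical 64-d features; two well-separated classes for the demo).
feats = np.concatenate([rng.normal(0, 1, (100, 64)),
                        rng.normal(3, 1, (100, 64))])
labels = np.array([0] * 100 + [1] * 100)

# Context step: average descriptors over blocks of 4 consecutive patches.
pooled = feats.reshape(50, 4, 64).mean(axis=1)
pooled_labels = labels.reshape(50, 4)[:, 0]

# Nearest-centroid classifier as a stand-in for the SVM.
centroids = np.stack([pooled[pooled_labels == c].mean(axis=0) for c in (0, 1)])
pred = np.argmin(np.linalg.norm(pooled[:, None] - centroids, axis=2), axis=1)
print((pred == pooled_labels).mean())   # separable demo data → 1.0
```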

   Context Aware Deep Learning in Computational Pathology

Muhammad Shaban1, Nasir Rajpoot1, 2, 3

1Department of Computer Science, The University of Warwick, Coventry, United Kingdom, 2The Alan Turing Institute, London, United Kingdom, 3Department of Pathology, University Hospitals Coventry and Warwickshire, Coventry, United Kingdom. E-mail: [email protected]

Tumour grade is an important parameter in deciding the treatment plan and has been shown to have clinical and prognostic significance. Grading of a tumour is often based on the appearance of cancer cells and their architectural features, which is time-consuming and subjective in nature. Therefore, an objective automated method is needed, one that requires a high-resolution view of cancer cells along with large contextual information to capture cell organization. Convolutional neural network (CNN) based automated methods process large whole slide images (WSI) in a patch-wise manner, ignoring contextual information from neighbouring patches and leading to many noisy predictions. Processing a WSI as a whole through a CNN is computationally infeasible, and down-sampling the WSI is not an option either, as cell-level features at high magnification are important for accurate grading. We propose a Context Aware Network (CAN), consisting of two stacked CNNs, for automated grading of large histology images. The first network encodes the cellular morphology of small local regions into high-dimensional features, whereas the second network considers the features of all local regions along with their spatial organization to predict the final grade. We will show evaluation results of our proposed method on the task of tumour segmentation to gauge its power in terms of context awareness, and show that it removes the noisy isolated regions that were wrongly predicted by the state-of-the-art patch classifier. In this way, we demonstrate the significance of incorporating contextual awareness in computational pathology.
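The stacked design can be caricatured with simple stand-ins for the two networks (hand-crafted statistics instead of the first CNN, a fixed rule instead of the second; everything below is illustrative):

```python
import numpy as np

def local_encoder(patch):
    """Stand-in for the first CNN: summarize a high-resolution patch
    into a short feature vector (here simply mean and std of pixels)."""
    return np.array([patch.mean(), patch.std()])

def context_predictor(feature_grid):
    """Stand-in for the second CNN: predict from the whole grid of local
    features, so each decision sees its neighbourhood, not a lone patch."""
    return float(feature_grid[..., 0].mean() > 0.5)

rng = np.random.default_rng(0)
wsi = rng.random((256, 256))                         # toy "whole-slide image"
patches = wsi.reshape(8, 32, 8, 32).swapaxes(1, 2)   # 8x8 grid of 32x32 patches
grid = np.stack([[local_encoder(patches[i, j]) for j in range(8)]
                 for i in range(8)])                  # shape (8, 8, 2)
print(grid.shape, context_predictor(grid))
```

The point of the sketch is the data flow: per-patch encoding first, then a single prediction informed by the spatial arrangement of all local features.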

   Multispectral Imaging as Potential Tool to Identify Immunological Theranostic Features in High-Grade Ovarian Cancer

Eliana Pivetta1, Melissa Manchi2, Gustavo Baldassarre1, Vincenzo Canzonieri2

1Molecular Oncology and Preclinical Model of Tumor Progression Division, CRO National Cancer Institute, Aviano, Italy, 2Pathology Unit, CRO National Cancer Institute, Aviano, Italy. E-mail: [email protected]

MultiSpectral Imaging (MSI) is a promising tool that makes it possible to identify different TIL subsets based on their specific markers, simultaneously with other molecules of interest, avoiding the staining of sequential slides and the problems of effective and proper biomarker co-localization. Multiplexed immunohistochemistry captures essential prognostic features, such as immune cell type, functional orientation, location in the tumour and density, thus permitting the development of an immunoscore. We applied multiplex staining to analyse the interrelationships between immune infiltrates (CD8+, CD4+, and CD68+) and tumour cells in samples of the most common and aggressive subtype of ovarian cancer, high-grade serous ovarian carcinoma (HGSOC), typically diagnosed at an advanced stage. Strong lymphocytic infiltration has been reported to be associated with good clinical outcome in this type of cancer. Moreover, the field of immunotherapy is rapidly expanding due to the promising results of preclinical and early clinical trials with monoclonal antibodies targeting the immune checkpoint molecules that modulate antitumor responses. Among immune checkpoints, PD-L1 appears to be prognostically favourable, since it has been proven to be strongly associated with tumour-infiltrating lymphocytes in ovarian cancer. In this perspective, HGSOC may be a useful model to investigate the potential of this technology, which permits the evaluation and quantification of immune infiltrate subsets and their correlation with PD-L1 expression and clinico-pathological features, through a semi-quantitative analysis. In the era of immunotherapy and personalised medicine, the ability to identify specific immune subsets and relate them to immune checkpoint molecules may have diagnostic and prognostic implications and presents a great therapeutic opportunity.

   Extended Peer-Reviewed Abstracts

   Feasibility of Whole Slide Imaging in Hematopathology Practice: Experience with 707 International Telepathology Consultation Cases

Joshua Pantanowitz1, Liron Pantanowitz1

1University of Pittsburgh, Pittsburgh, Pennsylvania, USA. E-mail: [email protected]


Hematopathology cases are often excluded from whole slide imaging (WSI) diagnostic work because of the difficulty of viewing hematolymphoid cells without oil magnification. At the University of Pittsburgh Medical Center (UPMC), WSI has been leveraged for international teleconsultation for many years. We aimed to determine the feasibility of WSI for supporting second-opinion diagnoses in hematopathology. A customized web-based teleconsultation portal was developed to submit cases, scanned at 40x, from China to UPMC. A consecutive series of cases submitted specifically for hematopathology teleconsultation was analyzed. Of the 707 cases, a definitive diagnosis was rendered in 314 (44%), a specific diagnosis was favored in 317 (45%), 76 (11%) were labeled as abnormal, and in one case no diagnosis was possible. In almost all cases immunostains were requested. Flow cytometry was rarely available. There were 20 (3%) technical issues (16 slide focus, 1 insufficient resolution, 3 tissue/stain artifact). Overall, WSI proved feasible for teleconsultation in the vast majority (89%) of hematopathology cases. Shortcomings were attributed to pre-imaging factors (e.g. tissue preparation) and a lack of ancillary studies (e.g. flow cytometry, cytogenetics) rather than technical problems.

Keywords: Consultation, digital pathology, hematopathology, telepathology, whole slide imaging


Whole slide imaging (WSI) has many applications in surgical pathology, including primary diagnosis and pathologist-to-pathologist consultation. Indeed, telepathology has been employed around the world for many years to support diagnostic teleconsultation services.[1] Several published validation studies have indicated that while WSI is suitable for diagnostic use in most histopathology subspecialties, there are specific problematic areas such as detecting granulocytes (e.g. eosinophils).[2] Hematopathology cases are also often excluded from such studies because of the limitation of WSI for easily viewing hematolymphoid cells without oil magnification.[3] At the University of Pittsburgh Medical Center (UPMC), WSI of all case types, including hematopathology, has been leveraged for international teleconsultation.[4],[5] We aimed to determine the feasibility of WSI for supporting second-opinion diagnoses in hematopathology.


A customized web-based teleconsultation portal was developed to submit cases from China to UPMC. In general, glass slides selected for consultation were scanned at 20x (2.0-HT NanoZoomer, Hamamatsu), except for hematopathology cases, which were scanned at 40x. The scanner objective lens has a 0.75 numerical aperture (NA) and employs a 3-chip time delay integration (TDI) camera. Scanning resolution is 0.46 μm in 20x mode and 0.23 μm in 40x mode. Pathology consultants from Pittsburgh in the USA used the UPMC web portal to securely access digital slides, either directly from the client's server or after they had been transferred to our data center in the USA using an automated commercial high-speed file transfer software solution (Aspera).[6] The Aspera software used hot folders for automated file transfer, so that every time a file was uploaded into the folder on the client's server it was immediately transferred over to the UPMC server. If required, UPMC pathologists were permitted to order additional stains (e.g. immunostains) to further work up cases. These additional slides were prepared by the client, then digitized and uploaded to the existing consult case. A broad menu of immunohistochemical stains was available for this purpose. A consecutive series of these formalin-fixed cases (5-year time period) submitted for hematopathology teleconsultation was analyzed using descriptive statistics (Microsoft Excel).
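The hot-folder mechanism described above (every file dropped into a watched directory is forwarded automatically) can be illustrated with a minimal polling loop. This is a generic standard-library sketch of the pattern only, not the Aspera implementation, and all directory and file names are hypothetical:

```python
import shutil
import time
from pathlib import Path

def poll_hot_folder(hot_dir: Path, dest_dir: Path, seen: set) -> list:
    """One polling pass: forward any new files from hot_dir to dest_dir.

    Returns the list of file names transferred on this pass. A production
    transfer tool would additionally resume interrupted transfers, verify
    checksums, and use an accelerated wide-area protocol; this only
    illustrates the triggering logic of a hot folder.
    """
    transferred = []
    for path in sorted(hot_dir.iterdir()):
        if path.is_file() and path.name not in seen:
            shutil.copy2(path, dest_dir / path.name)
            seen.add(path.name)
            transferred.append(path.name)
    return transferred

def watch(hot_dir, dest_dir, interval=5.0, passes=None):
    """Poll the hot folder repeatedly (passes=None means run forever)."""
    seen = set()
    n = 0
    while passes is None or n < passes:
        poll_hot_folder(Path(hot_dir), Path(dest_dir), seen)
        time.sleep(interval)
        n += 1
```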


Hematopathology represented around 24% of all cases received for consultation. Of the 707 such cases submitted (nodal 54.6%, extranodal 45.4%), a definitive diagnosis was rendered in 314 (44%), a specific diagnosis was favored in 317 (45%), 76 (11%) were labeled as abnormal, and in one case no diagnosis was possible. Cases comprised 365 (51.6%) non-Hodgkin lymphomas, 41 (5.8%) Hodgkin lymphomas, and 301 (42.6%) other diagnoses (e.g. myeloid sarcoma). There were on average 23 slides (range 2-80) scanned per case. In almost all cases immunostains were requested. Flow cytometry was rarely available. Turnaround time averaged 83 hours (range 2-293 hours). There were 20 (3%) technical issues (16 for slide focus, 1 for insufficient resolution, 3 for tissue/stain artifact).
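The reported percentages follow directly from the raw counts; a quick consistency check (counts taken from the text, with Python standing in for the Excel descriptive statistics), where the 89% feasibility figure corresponds to definitive plus favored diagnoses:

```python
counts = {
    "definitive": 314,   # definitive diagnosis rendered
    "favored": 317,      # specific diagnosis favored
    "abnormal": 76,      # labeled as abnormal
}
total = 707

# Per-category percentages, rounded as reported in the text.
pct = {k: round(100 * v / total) for k, v in counts.items()}

# "Feasible" cases = definitive + favored diagnoses (44% + 45% = 89%).
feasible_pct = round(100 * (counts["definitive"] + counts["favored"]) / total)

# Technical-issue rate: 20 of 707 cases.
tech_pct = round(100 * 20 / total)
```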


These results indicate that using WSI to enable access to pathology subspecialists for international telepathology can augment patient care. In particular, WSI proved to be feasible for teleconsultation in the vast majority (89%) of hematopathology cases. Many were difficult cases with numerous slides, including many immunostains, and were accordingly time-consuming to review. Whilst a definitive diagnosis could not be reached in all cases, this shortcoming was most often due to pre-imaging factors (e.g. tissue preparation) and lack of ancillary studies (e.g. flow cytometry) rather than technical problems.

Competing interests

Liron Pantanowitz is a consultant for Hamamatsu.


  1. Farahani N, Riben M, Evans AJ, Pantanowitz L. International telepathology: Promises and pitfalls. Pathobiology 2016;83:121-6.
  2. Williams BJ, DaCosta P, Goacher E, Treanor D. A systematic analysis of discordant diagnoses in digital pathology compared with light microscopy. Arch Pathol Lab Med 2017;141:1712-8.
  3. Naughler C. Imaging in clinical pathology. In: Pantanowitz L, Parwani AV, editors. Digital Pathology. USA: ASCP Press; 2017. p. 197-210.
  4. Zhao C, Wu T, Ding X, Parwani AV, Chen H, McHugh J, et al. International telepathology consultation: Three years of experience between the University of Pittsburgh Medical Center and KingMed Diagnostics in China. J Pathol Inform 2015;6:63.
  5. Pantanowitz L, Wiley CA, Demetris A, Lesniak A, Ahmed I, Cable W, et al. Experience with multimodality telepathology at the University of Pittsburgh Medical Center. J Pathol Inform 2012;3:45.
  6. Pantanowitz L, McHugh J, Cable W, Zhao C, Parwani AV. Imaging file management to support international telepathology. J Pathol Inform 2015;6:17.

   Epithelium and Stroma Segmentation Using Multiscale Superpixel Clustering

Gabriel Landini1, Shereen Fouad1, David Randell1, Hisham Mehanna2

1Oral Pathology Unit, School of Dentistry, University of Birmingham, Birmingham, UK, 2Institute of Cancer and Genomic Sciences, University of Birmingham, Birmingham, UK. E-mail: [email protected]


We present an unsupervised segmentation framework featuring multiscale superpixels and k-means clustering to identify image data belonging to three distinct regions of interest: epithelium, stroma and background in H&E stained sections of oropharyngeal cancer micro-arrays. The method could be used as an exploratory tool for image contents and for semi-automated image annotation by providing a first automated approximation that expert operators can later refine and approve.

Keywords: Automation, clustering, multi-resolution, segmentation, superpixels


Accurate image segmentation is essential in quantitative histopathology, although it is also challenging due to tissue complexity, heterogeneity and the uncertainty about the scene contents of pathological specimens. Histologically relevant structures exist at a range of sizes, and the resolution at which some of those structures are detected (e.g. cell nuclei) does not always coincide with the resolution at which other features (e.g. cell clumps, vessels, glands) are found. We investigated to what extent multiscale methods can capture the level of detail necessary for tissue detection.


The image data consisted of fifty-six H&E stained sections of oropharyngeal carcinoma tissue microarrays (TMAs), captured with an automated microscope (Olympus BX50, Japan, x20 objective, N.A. 0.5, resolution 0.67 micrometres) attached to a QImaging Retiga 2000R (Canada) greyscale camera (1600x1200 pixels, inter-pixel spacing 0.367 micrometres) and a tunable LCD RGB filter that provides pixel-aligned colour capture without colour interpolation artefacts. TMA fields were captured and stitched using an OASIS Glide XY Scanning Stage (Objective Imaging, UK) with a motorised focus drive (Prior Scientific, UK). Individual TMA core images were approximately 3300x3300 pixels, background corrected during capture and saved in 24-bit RGB colour TIFF format. All imaging procedures were performed in ImageJ.[1] Superpixels are compact groups of pixels sharing some degree of perceptual meaning. We adopted the SLIC superpixel approach implemented by Borovec and Kybic[2] to generate image partitions of grid-sizes from 35 pixels (12.8 micrometres) to 100 pixels (36.6 micrometres) in increments of 5 pixels (1.8 micrometres). A regularisation value of 0.3 (following preliminary tests[3]) allowed superpixel shape to change according to the image data. K-means clustering is an unsupervised learning method to partition observations into k different clusters, where every observation is associated with the cluster with the nearest mean value in the data space. Here, superpixels were clustered based on their statistical colour features obtained after colour deconvolution (a procedure introduced by Ruifrok and Johnston[4] to unmix colour images featuring mixtures of subtractive colour dyes into the contributions of each dye separately). In H&E images this is used to unmix colours into 'haematoxylin', 'eosin' and a 'residual' channel, to describe tissue stain uptake in a detailed manner.
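The clustering step above can be sketched as follows: given a superpixel label image (from any SLIC implementation, e.g. jSLIC) and per-pixel channel values (such as the deconvolved stain channels), compute per-superpixel colour statistics and cluster them with k-means. This is a minimal NumPy/scikit-learn illustration with a reduced feature set, not the paper's Weka/ImageJ pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans

def superpixel_features(channels: np.ndarray, labels: np.ndarray) -> np.ndarray:
    """Per-superpixel statistics for each channel.

    channels: (H, W, C) array, e.g. haematoxylin/eosin/residual values
    after colour deconvolution. labels: (H, W) superpixel label image.
    Returns an (n_superpixels, C * 4) feature matrix with mean, std,
    median and min per channel (a subset of the features in the text).
    """
    feats = []
    for i in np.unique(labels):
        px = channels[labels == i]          # (n_pixels, C)
        feats.append(np.concatenate([
            px.mean(axis=0), px.std(axis=0),
            np.median(px, axis=0), px.min(axis=0),
        ]))
    return np.asarray(feats)

def cluster_superpixels(channels, labels, k=3, seed=0):
    """Assign each superpixel to one of k clusters; returns a label image."""
    feats = superpixel_features(channels, labels)
    km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(feats)
    lut = dict(zip(np.unique(labels), km.labels_))
    return np.vectorize(lut.get)(labels)
```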
For each superpixel partition size, the colour values extracted from each channel were: minimum, maximum, mode, median, average, average deviation, standard deviation, skewness, kurtosis and entropy. The data were submitted to the k-means clusterer in the Weka library[5] (other clustering methods can be used) via ImageJ plugins[6] which retrieved the cluster allocation to finally label every superpixel. It is worth noting that label assignment during clustering is arbitrary, which prevents direct comparisons across different images/scales. We found the closest correspondences across images by reformulating the problem in terms of bipartite graph matching, similar to the optimal (minimal cost) assignment for a group of n people performing n jobs. Kuhn's 'Hungarian algorithm'[7] was used to optimally match the labellings across image pairs, using the additive inverse of the Dice index between all possible label combinations to generate a 'cost matrix' on which the algorithm is applied [Figure 1]. Performing this on consecutive image pairs resulted in a stack of images with closest labellings. 'Probability' images for each label [Figure 1]p, [Figure 1]q, [Figure 1]r were then computed from the z-projection of the matched stack (e.g. pixels labelled in every slice as belonging to the same cluster would have a probability of 1, and so on). This enabled computing a 'most likely' segmentation [Figure 1]s using label voting, as well as 'uncertainty' images from the inverse of the maximal projection of all cluster probability images.
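The label-matching step can be implemented directly with SciPy's assignment solver: build a cost matrix from the additive inverse of the Dice index between every label pair and solve the bipartite matching. A minimal sketch, with hypothetical label images standing in for the cluster outputs at two scales:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def dice(a: np.ndarray, b: np.ndarray) -> float:
    """Dice index between two binary masks."""
    inter = np.logical_and(a, b).sum()
    denom = a.sum() + b.sum()
    return 2.0 * inter / denom if denom else 1.0

def match_labels(ref: np.ndarray, other: np.ndarray, k: int) -> np.ndarray:
    """Relabel `other` so its cluster labels best match `ref`.

    Cost matrix = additive inverse of the Dice index between every
    (ref label, other label) pair; the Hungarian algorithm finds the
    minimal-cost assignment.
    """
    cost = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            cost[i, j] = -dice(ref == i, other == j)
    rows, cols = linear_sum_assignment(cost)
    mapping = {c: r for r, c in zip(rows, cols)}
    return np.vectorize(mapping.get)(other)
```

Applying this to consecutive scale pairs yields a stack with consistent labels, from which the per-label probability images can be computed by z-projection.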
Figure 1: (a) Original H&E image (field width 1258 μm) of a tissue microarray core (oropharyngeal carcinoma). (b-o) Results of k-means clustering (k = 3) on increasingly larger superpixel partitions (sizes 35 to 100 pixels in increments of 5) with cluster labels matched using the Hungarian algorithm.[7] (p-r) Probability (heat map) images of the three clusters (bright is higher). (s) Most likely segmentation. (t) Gold standard image labelled as background (grey), epithelium (yellow) and stroma (pale blue)



The 'most likely' labelling [Figure 1]s was compared to a set of manually annotated images [as gold standard, [Figure 1]t] using the average Dice index over the three labels per image. The average index over the fifty-six images was 0.76 (minimum 0.52, median 0.79, maximum 0.89) [Figure 2]. With regard to multiscale vs. single-scale analysis, the multiscale approach produced higher Dice index values than those of the smallest and largest superpixel sizes considered in 71% and 94% of instances, respectively, while the analysis repeated in the range of 100 to 165 pixels resulted in significantly lower similarity (Dice index 0.67, t-test p<0.01). No differences were found in the average Dice index similarity (t-test, p=0.14) when the superpixel partitioning was done on the RGB or 'stain' images (where the R, G, and B channels stored the haematoxylin, eosin and residual data). The Dice index of the superpixel classification based on the stain image was, however, slightly but still significantly higher than that computed from the image RGB components (0.76 vs. 0.74, t-test p<0.05). We noted a morphological asymmetry worth considering when comparing segmentation results blindly with annotated tissue images: small background 'islands' within the stromal and epithelial compartments commonly arise due to the heterogeneous composition of the tissues, whereas the opposite (small tissue islands on the background) is not common. It seemed unreasonable to expect small background islands to be reliably identified with larger sized superpixels and therefore, for a fair comparison, annotated images were processed to treat background regions within tissue as part of the tissue compartments containing them. The results showed a significant improvement in the average Dice index agreement over the three classes: mean: 0.81 (t-test, p<0.01), minimum: 0.64, median: 0.81 and maximum: 0.93.
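The background-island adjustment amounts to morphological hole filling: background regions fully enclosed by a tissue compartment are reassigned to that compartment before the Dice comparison. A sketch using scipy.ndimage, with hypothetical labels (0 = background, 1 = epithelium, 2 = stroma):

```python
import numpy as np
from scipy.ndimage import binary_fill_holes

def absorb_background_islands(seg: np.ndarray) -> np.ndarray:
    """Reassign enclosed background pixels to the enclosing tissue class.

    For each tissue class, background (label 0) holes fully inside that
    class's mask are relabelled as the class itself. Background touching
    the image border (the true background) is left unchanged.
    """
    out = seg.copy()
    for cls in (1, 2):                        # epithelium, stroma
        mask = seg == cls
        filled = binary_fill_holes(mask)
        island = filled & ~mask & (seg == 0)  # enclosed background only
        out[island] = cls
    return out
```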
Figure 2: Results of the average Dice index over the three cluster labels when comparing to a set of gold standard images (n = 56) with and without considering background islands in the tissue regions (see Methods), once the dye RGB vectors (for colour deconvolution) have been appropriately determined



The unsupervised multiscale segmentation approach presented performed, on average, better than single-resolution superpixel clustering. It has been shown that it is possible to devise supervised methods based on training sets that perform better than unsupervised approaches; however, it is not always possible to produce large numbers of annotations for machine learning training, especially when considering the natural variability of histological samples and the variety of stains currently in use. The outlined method could play a role not only as an exploratory tool, but also in considerably reducing the burden of image annotation by providing a first automated approximation that expert operators can later refine and approve. Other possible variations include the use of alternative clustering algorithms, or combinations of these (e.g. via consensus clustering[6]). The principles are applicable to specimens processed with other staining techniques, once the reference colour vectors for the dyes (in the colour deconvolution step) have been accurately determined. This work was supported by the EPSRC (UK) through funding under grant EP/M023869/1 “Novel context-based segmentation algorithms for intelligent microscopy”.

Competing interest



  1. Rasband WS. ImageJ. US National Institutes of Health: Bethesda, MD, USA; 1997-2018. Available from: [Last accessed on 2018 Aug 14].
  2. Borovec J, Kybic J. jSLIC: Superpixels in ImageJ. Praha: Computer Vision Winter Workshop; 2014.
  3. Fouad S, Randell DA, Galton A, Mehanna H, Landini G. Epithelium and stroma identification in histopathological images using unsupervised and semi-supervised superpixel-based segmentation. J Imaging 2017;3: 61.
  4. Ruifrok AC, Johnston DA. Quantification of histochemical staining by color deconvolution. Anal Quant Cytol Histol 2001;23:291-9.
  5. Frank E, Hall MA, Witten IH. The WEKA Workbench. Online Appendix for “Data Mining: Practical Machine Learning Tools and Techniques”. 4th ed. Cambridge, MA: Morgan Kaufmann; 2016.
  6. Fouad S, Randell D, Galton A, Mehanna H, Landini G. Unsupervised morphological segmentation of tissue compartments in histopathological images. PLoS One 2017;12:e0188717.
  7. Kuhn HW. The Hungarian method for the assignment problem. Naval Res Logist Q 1955;2:83-97.

   A Digital Decision Support Tool to Identify and Classify Lung Cancer in Human Tissue Samples

Nikolay Burlutskiy1, Max Backman2, Lars Björk1,3, Hedvig Elfving2, Johanna Mattsson2, Artur Mazheyeuski2, Dijana Djureinovic2, Sahar Sayegh2, Lena Kajland-Wilén1, Patrick Micke2

1ContextVision AB, Uppsala, Sweden, 2Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden, 3Department of Women's and Children's Health, Karolinska Institutet, Solna, Sweden. E-mail: [email protected]


We developed a digital decision support tool to identify and classify lung cancer in human tissue samples. The study was based on a cohort of lung cancer patients operated at the Uppsala University Hospital. The tissues were reviewed by lung pathologists and then the cores were compiled to tissue micro-arrays (TMAs). For experiments, hematoxylin-eosin stained slides from 712 patients were scanned and then manually annotated. Then these scans and annotations were used to train segmentation and classification models of the tool. The performance of the developed deep learning based tool was evaluated on fully annotated TMA cores reaching pixel-wise precision of 0.80 and recall of 0.85. Finally, the performance of the tool to distinguish adenocarcinoma and squamous cell cancer subgroups was evaluated with an accuracy of up to 85% based only on a single tissue core.

Keywords: Deep learning, digital pathology, lung cancer, semantic segmentation


Lung cancer is the leading cause of cancer death worldwide and the second most common cancer type in the world in both men and women.[1] Lung cancer takes more than 1.6 million lives each year, and the survival rate is less than 20%. The microscopic evaluation of cancer is the backbone of clinical diagnostics and, particularly in lung cancer, the evaluation is highly dependent on the experience of the pathologist. Hence, classification can vary considerably between individual pathologists. Therefore, we developed a deep learning based image analysis tool that helps to identify and classify cancer in patient tissue specimens.


The study was based on a cohort of 712 lung cancer patients operated on at Uppsala University Hospital. The tissues were reviewed by two lung pathologists, and histological diagnosis was based on both whole slide sections and immunohistochemistry. Next, tissue cores were removed from the tissue blocks and compiled into tissue microarrays. Each core had a diameter of 1 mm, thus approximately representing the tissue area available on a lung biopsy. Finally, the tissue microarrays were stained with hematoxylin-eosin and scanned at a high resolution of 0.5 um/pixel. All the scanned cores were uploaded into Cytomine,[2] an annotation tool, and then extensively annotated by trained pathologists [see [Figure 1] for the whole workflow]. The annotated areas included cancer areas, necrosis, tumor stroma, and benign lung tissue areas. In total, 707 out of 712 tissue cores were manually annotated with almost 10,000 annotated areas. A few cores, 5 out of 712, were excluded and not annotated due to the presence of artifacts such as folds and tears in the tissue that caused difficulties in annotating the cores accurately. Once the annotation process was completed, the annotations were extracted from Cytomine and then used as a ground truth for training and evaluating a deep learning model. The purpose of the first trained model was to identify and segment out cancer areas in unseen scanned cores. As a first step, the annotated images of scanned cores were split into train, validation, and test sets. Then a fully convolutional network (FCN)[3] was trained on the train set of 354 images of scanned cores and corresponding manual annotations. The validation set of 175 images was used to avoid overfitting of the trained model. The purpose of the second trained model was to distinguish between the main histological subtypes of lung cancer and to assign cases to the adenocarcinoma or squamous cell subgroup based on an image of a single core.
For training and testing the second model, only cores with squamous and adenocarcinoma histological subtypes were used; in total, cores from 103 squamous and 207 adenocarcinoma patients were selected. The test set comprised cores from 10 randomly selected adenocarcinoma and 10 squamous cell carcinoma patients. The cores from the remaining patients, 93 squamous and 197 adenocarcinoma, were used for training and validating a classifier. Since the images of the scanned cores were large and could not fit into the memory of a modern GPU, each scanned core was split into smaller patches. The second model was then trained to classify core patches into the adenocarcinoma or squamous cell subgroup, by fine-tuning the weights of the final layer of a pre-trained model with the training data. An Inception V3 classifier[4] pre-trained on the ImageNet dataset was chosen for this purpose.
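The patch-splitting step above can be sketched with NumPy. The patch size and the decision to drop incomplete border patches are illustrative assumptions, not details stated in the text:

```python
import numpy as np

def split_into_patches(core: np.ndarray, patch: int = 299) -> np.ndarray:
    """Cut a scanned core image (H, W, 3) into non-overlapping square
    patches small enough to fit in GPU memory.

    Incomplete patches at the right/bottom borders are dropped.
    Returns an array of shape (n_patches, patch, patch, 3).
    """
    h, w = core.shape[:2]
    rows, cols = h // patch, w // patch
    trimmed = core[: rows * patch, : cols * patch]
    patches = (
        trimmed.reshape(rows, patch, cols, patch, 3)
        .transpose(0, 2, 1, 3, 4)          # group row/col blocks together
        .reshape(rows * cols, patch, patch, 3)
    )
    return patches
```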
Figure 1: Workflow of the annotation tool



Finally, the performance of the first trained network was evaluated on 178 scanned cores from the test set by predicting cancer areas in these test images [see [Figure 2] for three such predictions]. The predictions were evaluated both qualitatively and quantitatively: they were examined visually by pathologists at the pixel level, and precision and recall values were calculated using the manual annotations of cancer areas as a ground truth. Visual comparison of the manual annotations and the predicted cancer areas revealed striking agreement, which was further confirmed by a calculated precision of 0.80 and recall of 0.85 at the pixel level. The performance of the second model was evaluated by predicting the cancer subgroups for each patch of the scanned cores from the test set. The whole cores, consisting of the predicted patches, were then assigned to the adenocarcinoma or squamous cell subgroup using the majority voting rule. As a result, an accuracy of up to 85% based only on a single tissue core was reached.
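The per-core decision step above is a simple majority vote over per-patch predictions; a minimal sketch (label strings hypothetical):

```python
from collections import Counter

def classify_core(patch_predictions):
    """Assign a whole core to a subtype by majority vote over its patches.

    patch_predictions: iterable of per-patch labels, e.g. "adenocarcinoma"
    or "squamous". Ties are broken by first-seen order, since
    Counter.most_common is insertion-ordered for equal counts.
    """
    votes = Counter(patch_predictions)
    return votes.most_common(1)[0][0]
```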
Figure 2: Annotated and predicted cancer areas in test images



In conclusion, we developed a deep learning image analysis tool that could annotate lung cancer areas in human tissue specimens on a pixel level and also determine the histological subtype of the cancer. Work is ongoing to optimize the prediction and to quantify the performance of the model compared to a trained pathologist. Testing of the model on lung tissue biopsies, to evaluate the performance of the trained deep learning models in clinical settings, is also ongoing.


  1. Lung Cancer Facts and Statistics by International Association for the Study of Lung Cancer. Available from: [Last accessed on 2018 Mar 26].
  2. Marée R, Rollus L, Stévens B, Hoyoux R, Louppe G, Vandaele R, et al. Collaborative analysis of multi-gigapixel imaging data using cytomine. Bioinformatics 2016;32:1395-401.
  3. Long J, Shelhamer E, Darrell T. Fully Convolutional Networks for Semantic Segmentation. In Conference on Computer Vision and Pattern Recognition (CVPR); 2015.
  4. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the Inception Architecture for Computer Vision; 2015.

   Transfer Learning and Cloud Computing Used for Nervous System Tumors Detection

Slawomir Walkowski1

1Poznan University of Medical Sciences, Poznań, Poland. E-mail: [email protected]


The goal of this work is to design and test a versatile method of automated detection of two tumors of the central nervous system, glioblastoma and neurinoma, that is based on processing histopathological whole slide images (WSIs) and can be prepared with only a limited set of input data available. The proposed approach uses transfer learning to create a machine learning model based on an existing pre-trained model and fragments of WSIs. Cloud computing is used for fast training and running the model. The solution was verified on a set of histopathological images and appeared to be a viable approach to distinguishing the tumors, resulting in correct classification of 93% of areas of WSIs with glioblastoma and 82% of areas of WSIs with neurinoma.

Keywords: Cloud computing, digital pathology, transfer learning, tumors of the central nervous system, whole slide images


WHO Classification of Tumours of the Central Nervous System[1] is an authoritative reference for categorizing these tumors. Among other parameters, it uses histology to define tumor entities.[2]

Glioblastoma and neurinoma (schwannoma) are among the diseases covered by this classification. These two tumors were already the subject of automated detection based on computer vision methods applied to whole slide images (WSIs), proposed in a previous study.[3] That approach used structural analysis and shape descriptors to detect histopathologic patterns in the images, pseudopalisading and palisading, characteristic of glioblastoma and neurinoma, respectively. It required the design and implementation of algorithms tailored to finding these specific patterns. One drawback of this approach is that the algorithms prepared for detecting these patterns cannot be directly used for recognizing other tumors.

Supervised machine learning is an alternative approach, in which a generic framework can be used to train a computational model based on specific labeled images. Importantly, the chosen framework can be agnostic to the domain or classes of the images (like different tumors in the case of WSI classification), which makes it relatively easy to adopt. For example, deep learning has been used for breast cancer detection.[4],[5] On the other hand, machine learning typically requires a significant amount of training data and computational power.

This paper demonstrates the use of transfer learning and cloud computing to address these challenges when building a classification model based on a limited number of images representing glioblastoma and neurinoma.


The approach presented in this paper is based on processing a set of WSI areas coming from glioblastoma cases (41 areas) and neurinoma cases (11 areas). The images were captured with an Axiocam HRc (ZEISS) camera using a robotic microscope with a Planapo 20x/0.75 objective, originally as 1300x1030 tiles with 15% overlap, then stitched into larger areas of about 8000x8000 pixels each. The same source material was used in a previous study.[3]

In the first step, the magnification level of the input WSI areas is reduced, and the areas are cut into patches of size 320x320 pixels, giving 1265 patches in total. Some of the patches contain at least fragments of the distinguishable patterns, pseudopalisading or palisading, characteristic of the analyzed tumors. Based on the presence of these patterns, each patch is manually labeled as belonging to one of three classes: glioblastoma, neurinoma, or 'none' (if it contains neither pattern). In this experiment, the patches were labeled by a person without formal training in histopathology, using a description of the characteristic patterns given by a pathologist.

Then, a neural network is trained based on the patches and using the transfer learning approach. Inception-v3, an existing convolutional neural network pre-trained on ImageNet data set,[6] is used as the base model. Pre-logit layers from the Inception model extract features from the patches cut from WSI areas, while the newly trained classifier adds a fully-connected layer on top of them, and uses softmax function for calculating the final output.[7] This way the new classifier can utilize the compact summaries of the images output by the bottleneck (penultimate) layer of Inception,[8] and categorize them into one of the three classes listed above.
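Training only the final layer on top of frozen bottleneck features amounts to fitting a multinomial logistic (softmax) classifier on fixed feature vectors. A sketch with scikit-learn, where the synthetic features stand in for outputs of Inception's penultimate layer (2048-dimensional in the real model):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_head(bottleneck_feats: np.ndarray, labels: np.ndarray):
    """Fit a softmax classifier on frozen bottleneck features.

    bottleneck_feats: (n_patches, d) outputs of the pre-trained network's
    penultimate layer; labels: class per patch, e.g. 0 = glioblastoma,
    1 = neurinoma, 2 = none. Only this head is trained; the convolutional
    layers stay fixed, which is the essence of transfer learning here.
    """
    clf = LogisticRegression(max_iter=1000)
    clf.fit(bottleneck_feats, labels)
    return clf
```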

Finally, classification results are aggregated across patches to decide on the classification of the WSI area that they belong to (see the example in [Figure 1]). This is done by comparing the numbers of patches classified as each of the two tumors, and requiring a minimum difference between these numbers for a confident decision. A difference of 3 was required when testing this solution.
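This decision rule can be written directly: count the patches assigned to each tumor and commit to a class only when the count difference reaches the threshold, interpreted here as requiring a difference of at least min_diff (otherwise the area is left undecided). Label strings are hypothetical:

```python
from collections import Counter

def classify_area(patch_labels, min_diff=3):
    """Aggregate patch classifications into a per-area decision.

    patch_labels: per-patch outputs, each "glioblastoma", "neurinoma" or
    "none". Returns the winning tumor class, or None when the count
    difference between the two tumors is below min_diff (no confident call).
    """
    counts = Counter(patch_labels)
    g, n = counts["glioblastoma"], counts["neurinoma"]
    if abs(g - n) < min_diff:
        return None
    return "glioblastoma" if g > n else "neurinoma"
```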
Figure 1: WSI areas of glioblastoma (a) and neurinoma (b) with marked fragments classified as belonging to one of the two tumors by the trained model. Green squares denote classification consistent with the actual tumor present in the slide, red squares indicate incorrect classification.


It is worth noting that the data set was divided into 5 parts, which enabled training and testing 5 models through 5-fold cross-validation. Patches coming from the same WSI, or at least from the same WSI area, were put into the same part to avoid training and testing the model on neighboring areas of the same slides. The training set size was 660-830 patches, depending on the fold, while the validation and test sets contained 110-375 patches each. This shows that the total data set was relatively small for a machine learning experiment, which means that its results are not yet proven to generalize well to larger and more diverse data sets.
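Keeping all patches from one slide in the same fold corresponds to grouped cross-validation; with scikit-learn this is GroupKFold, where the group id is the source WSI. A minimal sketch with synthetic slide ids:

```python
import numpy as np
from sklearn.model_selection import GroupKFold

def grouped_folds(n_patches, slide_ids, n_splits=5):
    """Yield (train_idx, test_idx) pairs such that all patches from one
    slide land on the same side of the split, preventing leakage between
    neighboring areas of the same WSI."""
    X = np.zeros((n_patches, 1))            # features are irrelevant here
    gkf = GroupKFold(n_splits=n_splits)
    yield from gkf.split(X, groups=slide_ids)
```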

Preprocessing the patches and training the final model were performed using TensorFlow,[9] and run on a cloud platform. This computation was based on an existing example[7] and code,[10] originally prepared for classifying other classes of images. It utilized cloud services for parallel processing[11] and streamlined model training and serving.[12]


The cross-validation experiment of classifying the original WSI areas, composed of the categorized patches, resulted in correct identification of 93% of areas with actual glioblastoma and 82% of areas with actual neurinoma (recall). Once an area was classified as containing one of the tumors, the detected disease was almost always correct, for 97% of areas classified as glioblastoma and 100% of areas classified as neurinoma (precision).

Classification of individual patches into the three categories through the trained models was characterized by lower accuracy. Area under receiver operating characteristic (AUC) was 89% for glioblastoma patches, 92% for neurinoma patches, and 83% for 'none' patches. However, the patches misclassified as containing one of the two tumors were typically labeled as 'none' but actually belonged to the WSI area which contained the detected tumor class. This is why aggregating the results for multiple patches when classifying bigger WSI areas, which was the main goal of the experiment, helped in improving the accuracy, as summarized in the previous paragraph. On the other hand, it means that there were quite a few patches which were classified consistently with the corresponding WSI area but did not contain patterns characteristic for the given tumor. This suggests that the classifier learned the patterns imprecisely, or it was overfit to the training data to some extent.

Access to computational resources through a cloud platform enabled faster training thanks to parallel execution. For example, running data preprocessing on 7 worker machines resulted in a median preprocessing time of 2 hours 16 minutes per fold, while running it sequentially would have taken several times longer. The total training time per fold was 2 hours 34 minutes at the median. Classifying a single WSI area, typically consisting of 25 patches, took about 40 seconds.


The proposed detection method was successfully applied to the available set of WSI areas. Even though the input data set was rather small, machine learning models trained using a transfer learning approach enabled good discrimination of glioblastoma and neurinoma. Using an existing machine learning framework and services hosted in a cloud platform helped in fast preparation of the models.

Classification quality of this method is at least similar to the performance of the other approach described earlier.[3] That solution resulted in 97% precision and 82% recall when classifying WSI areas with glioblastoma, and 75% precision and 90% recall when classifying WSI areas with neurinoma. This means that a detection method based on a model generated through a versatile machine learning approach can be as good as or better than algorithms tailored to detecting two specific tumors in the images.

The experiment can be further improved and extended. WSIs without tumors could be added to the data set to better test discrimination of areas with and without the diseases. The base model used in transfer learning could be replaced with a classifier trained on a big set of histological images, instead of images from other domains, for more effective transferability. Thanks to using a generic machine learning approach, not designed for any specific domain, the method could be reused for recognizing other diseases in histopathological images. Finally, the trained model served from a cloud platform could be used in interactive applications built to classify displayed WSI areas.

Competing interests

Slawomir Walkowski works at Google Poland.


  1. Louis DN, Ohgaki H, Wiestler OD, Cavenee WK, editors. WHO Classification of Tumours of the Central Nervous System. Revised 4th ed. Lyon: IARC; 2016.
  2. Louis DN, Perry A, Reifenberger G, von Deimling A, Figarella-Branger D, Cavenee WK, et al. The 2016 World Health Organization classification of tumors of the central nervous system: A summary. Acta Neuropathol 2016;131:803-20.
  3. Walkowski S, Szymas J. Histopathologic patterns of nervous system tumors based on computer vision methods and whole slide imaging (WSI). Anal Cell Pathol (Amst) 2012;35:117-22.
  4. Wang D, Khosla A, Gargeya R, Irshad H, Beck AH. Deep Learning for Identifying Metastatic Breast Cancer. arXiv preprint arXiv:1606.05718; 2016.
  5. Liu Y, Gadepalli K, Norouzi M, Dahl GE, Kohlberger T, Boyko A. Detecting Cancer Metastases on Gigapixel Pathology Images. arXiv preprint arXiv:1703.02442; 2017.
  6. Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z. Rethinking the Inception Architecture for Computer Vision. arXiv preprint arXiv:1512.00567; 2015.
  7. How to Classify Images with TensorFlow Using Google Cloud Machine Learning and Cloud Dataflow. Google Cloud; 16 December 2016. Available from: [Last accessed on 2018 May 11].
  8. How to Retrain an Image Classifier for New Categories. TensorFlow; 30 March 2018. Available from: [Last accessed on 2018 May 11].
  9. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, Corrado GS. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems. arXiv preprint arXiv:1603.04467; 2016.
  10. Google Cloud Platform – Flowers: Image-Based Transfer Learning on Cloud ML. GitHub; 7 February 2018. Available from: [Last accessed on 2018 May 11].
  11. Google Cloud Dataflow. Google Cloud. Available from: [Last accessed on 2018 May 11].
  12. Google Cloud Machine Learning Engine. Google Cloud. Available from: [Last accessed on 2018 May 11].

   Quality Assurance and Local Regions for Whole Slide Image Registration

Leslie Solorzano1, Carolina Wählby1

1Department of Information Technology, Uppsala University, Uppsala, Sweden. E-mail: [email protected]


Protein activity in tissue can be studied in situ by staining for the proteins of interest and scanning the tissue under different kinds of microscopy (e.g., fluorescence and brightfield). The result is a whole slide image (WSI) per protein. Brought together, these WSIs reveal multiplexed information for the same point in the tissue, e.g., mapping of protein expression and localization. To achieve this, WSIs must be registered to accurately compare and correlate patterns of different markers and expressions. Registration of gigapixel WSIs presents several challenges: (i) artifacts resulting from thin sectioning of fixed tissues make global affine registration prone to very large local errors, (ii) local affine registration is required to preserve correct tissue morphology (local size, shape and texture), and (iii) because of the large image sizes, fast and efficient strategies have to be used to open and display WSIs. WSIs are usually preprocessed to obtain a resolution pyramid for fast access and visualization. All subregions of the tissue are registered separately to a common tissue section, overcoming challenge (i). Affine registration is used to adjust rotation and translation of each region while avoiding artifacts that could be introduced by non-linear registration. This also overcomes challenge (iii), working with large image sizes, by introducing natural subregions (as compared to square tiles), and challenge (ii) by excluding scientifically irrelevant information resulting from artifacts such as ripping, folding, and missing tissue. We use a new state-of-the-art registration framework that uses both spatial and intensity features to perform the alignment.
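The resolution-pyramid preprocessing mentioned above can be illustrated with a rough estimate of how many downsample-by-two levels a gigapixel WSI needs; the 256-pixel tile size below is an assumption for illustration, not a value from the abstract:

```python
from math import ceil, log2

def pyramid_levels(width, height, tile=256):
    """Number of levels in a power-of-two resolution pyramid such that
    the coarsest level fits within a single tile."""
    return ceil(log2(max(width, height) / tile)) + 1

# A 120,000 x 120,000 pixel WSI needs 10 levels under these assumptions.
levels = pyramid_levels(120_000, 120_000)
```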

Keywords: Brightfield, protein, registration, whole slide image


Tissue environment and behaviour is a consequence of interactions at various levels: genetic, epigenetic, gene expression, protein expression and protein activity. To study protein interaction and its relationship to disease mechanisms, consecutive thin tissue sections are stained by means of immunohistochemistry with haematoxylin (H) and diaminobenzidine (DAB). H is the first stain and reveals the cell nuclei and the common tissue morphology, while DAB is the second stain, used to locate one specific protein in each section guided by protein-specific antibodies. Usually, independent quantification of each protein can serve as a biomarker indicating a certain activity in the tissue,[1] but without spatial colocalization, assumptions cannot be made about the interaction between the proteins. Therefore, registration (spatial alignment) is needed. In image analysis, registration can be feature-based (e.g., edges, corners) or intensity-based (pixel values). For feature-based registration, salient features must be found in the image and given coordinates; the same points should be found in both images involved, named the fixed and moving images. To find the corresponding transformation, the feature points must be aligned using a method that compares set-to-set distances. SIFT is a common feature extraction method.[2] In intensity-based methods, the difference between images is calculated pixel-wise. Registration methods in general define a distance measure whose gradient explains how a change in the transformation parameters will affect the result.


The first step is to create two new images per input image of the same size but with only one channel: one that represents only H and one that represents only DAB.

This is done by a process commonly referred to in the literature as color deconvolution, which can also be viewed as a projection into another color space, from RGB to HDAB.[3]
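A minimal sketch of such a deconvolution, following the optical-density approach of Ruifrok and Johnston; the stain vectors below are illustrative values, not calibrated for any particular scanner or for the authors' pipeline:

```python
import numpy as np

# Illustrative unit optical-density vectors (rows: H, DAB, residual).
M = np.array([
    [0.65, 0.70, 0.29],   # haematoxylin
    [0.27, 0.57, 0.78],   # DAB
    [0.71, 0.42, 0.56],   # residual channel
])
M = M / np.linalg.norm(M, axis=1, keepdims=True)

def rgb_to_stains(rgb):
    """Map RGB values in (0, 1] to per-stain concentrations."""
    od = -np.log10(np.clip(rgb, 1e-6, 1.0))   # optical density per channel
    return od @ np.linalg.inv(M)              # project onto the stain basis

def stains_to_rgb(conc):
    """Recompose RGB from stain concentrations (inverse mapping)."""
    return 10.0 ** (-(conc @ M))
```

The round trip is exact up to floating-point error, which makes the "projection into another color space" view concrete.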

After separating the WSI into its composing stains, we use the morphology exhibited by H to guide and evaluate the registration, while the overlap of the DAB marker stains is used to show and quantify protein colocalization.[4] One of the biggest challenges of working with WSIs is the image size. The presented example consists of three WSIs of about 120,000 by 120,000 pixels each, with three image channels in the RGB space. The registration receives the ROI for each tissue section, selected using our web tool for visualization and annotation of WSIs. Once the H and DAB images are separated, a region is selected and registered using H. The affine registration is achieved using Alpha-AMD,[5] a new registration framework that combines both intensity and feature information, avoiding the high number of local optima in intensity-based methods and the difficulties that arise when corresponding features are lacking in purely feature-based methods. Alpha-AMD achieves this by using a distance measure that takes into account different quantization levels of image intensity and computes the Euclidean distance from points in the fixed and moving images (features) across these levels.[6] [Figure 1] shows how intensity and spatial information are combined. Different quantization levels may reveal common structures in both H images, and the distance between them is then reduced gradually without interference from other levels.
Figure 1: Images are quantized. Each level contains and provides intensity information. Distance from a transformed subset of points to the quantization levels provides the spatial information. This happens every iteration

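The combination of quantization levels and point distances can be caricatured as follows. This is only a toy, brute-force, asymmetric illustration of the level-set idea; the real Alpha-AMD framework is symmetric and far more efficient:

```python
import numpy as np

def quantized_set_distance(fixed, moving, levels=4):
    """Toy version of the level-set idea: bin intensities into quantization
    levels and, per level, measure how far each pixel of the moving image
    lies from the nearest same-level pixel of the fixed image."""
    edges = np.linspace(fixed.min(), fixed.max(), levels + 1)
    total = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        f_pts = np.argwhere((fixed >= lo) & (fixed <= hi))
        m_pts = np.argwhere((moving >= lo) & (moving <= hi))
        if len(f_pts) == 0 or len(m_pts) == 0:
            continue
        # Pairwise Euclidean distances between pixel coordinates.
        d = np.linalg.norm(m_pts[:, None, :] - f_pts[None, :, :], axis=-1)
        total += d.min(axis=1).mean()
    return total
```

Perfectly aligned images score zero, and the score grows as structures within each intensity level drift apart, which is what the registration gradually minimises.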


Evaluation of the registration is needed to ensure quality and trust in the quantification of colocalization of specific stains. For this purpose, using the H from each protein section, we calculate a local registration quality/confidence score, where colocalized H indicates where both tissues are present without artifacts, thus serving as a mask for the right locations to quantify.[4] We then apply the same transformation to the DAB stain, revealing the colocalization of the proteins, which can then be quantified. We compare the proposed approach to a global affine transformation and show that it results in a higher Pearson correlation coefficient (PCC) over all pixel pairs that are not background in both images, as compared to affine registration of the full single WSI. The process that the images undergo is presented in [Figure 2]: every tissue section is separated into H and DAB, and after registration all H are combined into a quality map (A) and all DAB are combined into a protein colocalization map (B). In the confidence map of the registration result, green regions mean that all three common stains are highly correlated, tissue is present, and the registration as well as colocalization/lack of colocalization of the specific stain can thus be fully trusted. White means that only two out of three of the common stains colocalize, while the third is poorly aligned or missing due to sectioning artifacts. Red means that only one stain is present (the others are missing), and colocalization (or lack thereof) of the protein-specific stains cannot be studied in this part of the tissue section. These colors are also reflected in the total PCC for the whole image, which can be used as a measure to compare different registration approaches; a higher PCC means better alignment.
Figure 2: Each thin section is separated into H and DAB. H guides the transformation and gives the quality of alignment. DAB stains are merged into a protein colocalization map

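The quality measure described above reduces to a Pearson correlation restricted to informative pixels; a minimal sketch, in which the zero-valued background convention is an assumption:

```python
import numpy as np

def masked_pcc(a, b, background=0.0):
    """Pearson correlation between two aligned stain maps, ignoring
    pixels that are background in both images."""
    mask = ~((a == background) & (b == background))
    return np.corrcoef(a[mask], b[mask])[0, 1]
```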


We present a study using three WSIs stained by means of immunohistochemistry with Haematoxylin (H) and diaminobenzidine (DAB). Each WSI is separated into its H and DAB stains. Using only the common H stain, registration is performed using Alpha-AMD to align them spatially. The transformation found is applied to the DAB stains and the result is a map where protein colocalization can be quantified within a trusted area given by the areas where the H is colocalized. Colocalization is quantified using PCC.


We would like to thank Carla Oliveira, head of I3S at Porto University, for providing the data for this project.

We also thank the European Research Council for funding via ERC Consolidator grant 682810 to C.



  1. Pontén F, Gry M, Fagerberg L, Lundberg E, Asplund A, Berglund L, et al. A global view of protein expression in human cells, tissues, and organs. Mol Syst Biol 2009;5:337.
  2. Lowe D. Object Recognition From Local Scale-Invariant Features. Proceedings of the International Conference on Computer Vision. 1999. p. 1150-7.
  3. Wemmert C, Kruger J, Forestier G, Sternberger L, Feuerhake F, Gancarski P. Stain Unmixing in Brightfield Multiplexed Immunohistochemistry. IEEE International Conference on Image Processing. Melbourne, VIC; 2013. p. 1125-9.
  4. Solorzano L, Almeida G, Mesquita B, Martins D, Oliveira C, Wählby C. Whole Slide Image Registration for the Study of Tumor Heterogeneity. In: Computational Pathology and Ophthalmic Medical Image Analysis. OMIA 2018. Lecture Notes in Computer Science, Vol. 11039. Cham: Springer; 2018.
  5. Öfverstedt J, Lindblad J, Sladoje N. Fast and Robust Symmetric Image Registration Based on Distances Combining Intensity and Spatial Information. IEEE Transactions on Image Processing. Vol. 28. 2019. p. 3584-97.
  6. Lindblad J, Sladoje N. Linear Time Distances Between Fuzzy Sets With Applications to Pattern Matching and Classification. IEEE Transactions on Image Processing. Vol. 23. 2014. p. 126-36.

   Digital Diagnostics for Quantitative Evaluation of Proliferation Marker Ki-67 in Breast Cancer

Larisa Volkova1, Fedor Paramzin1

1Immanuel Kant Baltic Federal University, Kaliningrad, Russia. E-mail: [email protected]


Immunohistochemical verification of the biological properties of breast cancer is important for prognosis and treatment. Among the most important prognostic factors in different types of tumours is the marker Ki-67, whose percentage of positive nuclear expression in tumour cells is evaluated predominantly semi-quantitatively to characterise proliferative activity. The aim of this study was quantitative analysis of Ki-67 expression in breast cancer on scans of tissue specimens after immunohistochemical staining and automatic counting (PatternQuant software, QuantCenter, 3DHISTECH). The results were compared with data from the traditional determination of the level of proliferative activity of tumour cells in breast carcinomas by an experienced pathologist.

Keywords: Breast cancer, digital images, quantitative analysis of Ki-67 expression


Immunohistochemical evaluation of expression of steroid hormone receptors, proliferation marker Ki-67, Her2/neu and markers of angiogenesis is very important for determination of the biological properties of breast cancer, for patient prognosis and treatment. Among the most important prognostic factors in different types of tumours including breast carcinoma is the proliferation marker Ki-67. High variability of Ki-67 evaluation results is observed in breast cancer, especially in grade II carcinomas and in cases with tumour heterogeneity.[1],[2]

In current pathology practice, the level of Ki-67 marker is evaluated predominantly with semi-quantitative evaluation of the percentage of positive nuclear expression in tumour cells. Quantitative evaluation of Ki-67 expression in carcinoma of the breast of different histological types and grades is necessary for pathology practice and for understanding the fundamental aspects of breast cancer morphology.[3],[4],[5],[6],[7],[8],[9],[10],[11],[12]

The aim of this study was quantitative analysis of Ki-67 expression in carcinoma of the breast on scans of tissue specimens after immunohistochemical staining and automatic counting (PatternQuant software, QuantCenter, 3DHISTECH), with comparison to the proliferative activity of tumour cells as traditionally determined by an experienced pathologist.


In the Kaliningrad region, 50 clinical cases of invasive ductal carcinomas of the breast of no special type (NST) were investigated.

Quantitative retrospective morphological study of Ki-67 expression (software PatternQuant, QuantCenter 3DHISTECH) was carried out on tissue specimens from the archive of the Laboratory of Immunohistochemistry and Pathology Diagnostics of Clinical Diagnostic Center of Immanuel Kant Baltic Federal University.

The methods of clinical and morphological analysis included traditional morphological investigation, evaluation of the breast tumours according to the Nottingham classification criteria, and immunohistochemical staining of slides using an immunohistochemical stainer (BOND-MAX). Oestrogen receptor (6F11) and progesterone receptor (16) determination and histological grading according to the Elston–Ellis modification method were performed, and statistical analysis was done with Statistica 10.0 and Excel 10.0.

The results of evaluation of Ki-67 expression (clone MIB-1, Dako) by two methods were compared: traditional investigation of immunohistochemical slides by an experienced pathologist, and automatic counting on the scans (three fields of view, ×40) with the PatternQuant program (3DHISTECH) [Figure 1]. The immunohistochemical proliferation marker Ki-67 is a nuclear protein, so only nuclear staining was evaluated, as the percentage of positive cells among the total quantity of tumour cells. In cases of breast carcinoma without morphological heterogeneity, any three fields of view of the scanned image were evaluated (×40). In specimens with histological and immunohistochemical heterogeneity of the carcinoma, quantitative counting was performed in hot spots with the highest levels of cell proliferation (×40).
Figure 1: Moderate (a, b) and weak (c, d) expression of Ki-67 in carcinoma of the breast in immunohistochemical tissue specimens (a, c) and digital images presenting the results of the quantitative automatic counting (b, d)

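The counting rule described above (field averaging for homogeneous tumours, hot-spot selection for heterogeneous ones) can be sketched as follows; the helper names and the nucleus counts in the test are hypothetical:

```python
def ki67_index(positive, total):
    """Percentage of Ki-67-positive nuclei among all tumour nuclei
    in one field of view."""
    return 100.0 * positive / total

def ki67_score(fields, heterogeneous=False):
    """fields: (positive, total) nucleus counts, one pair per x40 field.
    Average over the fields for homogeneous tumours; report the hot spot
    (highest-scoring field) when the tumour is heterogeneous."""
    indices = [ki67_index(p, t) for p, t in fields]
    return max(indices) if heterogeneous else sum(indices) / len(indices)
```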


The average age of patients was 61.2 ± 12.8 years. According to the Nottingham classification of breast cancer, the tumours were grade II carcinomas in 72% of patients, grade III in 26%, and grade I in 2% of cases. A predominance of grade II breast carcinomas and hormone receptor positive tumours was revealed in the study group. Positive expression of estrogen receptors (ER) was revealed in 68% of patients, and of progesterone receptors (PR) in 58% of cases.

The level of proliferation was examined by two methods. Method one was evaluation of Ki-67 expression by an experienced pathologist. Method two was quantitative automatic counting on the scans using the PatternQuant program.

The quantity of Ki-67-positive tumour cells was counted in subgroups of patients with carcinoma of the breast of different grades. The results of evaluation of Ki-67 expression in grade II carcinomas of the breast were as follows: method one – 50.4 ± 18.1%, method two – 40.13 ± 17.8%; the corresponding data for breast tumours of grade III: method one – 52.3 ± 27%, method two – 41.79 ± 18.56%.

Expression of Ki-67 was also evaluated in subgroups of carcinoma with positive ER and PR status. The following results were obtained in ER-positive tumours: method one – 50.38 ± 18.1%, method two – 40.1 ± 17.8%; the data for the PR-positive breast carcinomas: method one – 50.38 ± 18.1%, method two – 40.1 ± 17.4%.

Results of evaluating Ki-67 expression in the total study group: method one – the average Ki-67 proliferation level was 52.1%, with results varying from 18.8% to 90%; method two – the average Ki-67 score was 41.4%, with variation from 8.33% to 75.43%. Correlation between the results obtained by the two methods of calculating Ki-67 expression is characterised by a high linear dependence: r = 0.72, p < 0.05.
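The reported agreement combines a Pearson correlation with a systematic offset between the methods; a sketch of both, where the paired scores in the test are invented, not the study data:

```python
from math import sqrt

def method_agreement(scores_a, scores_b):
    """Pearson r and mean difference (bias) between paired Ki-67 scores
    from two scoring methods."""
    n = len(scores_a)
    ma = sum(scores_a) / n
    mb = sum(scores_b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(scores_a, scores_b))
    sa = sqrt(sum((x - ma) ** 2 for x in scores_a))
    sb = sqrt(sum((y - mb) ** 2 for y in scores_b))
    return cov / (sa * sb), ma - mb
```

A high r with a positive bias is exactly the pattern reported here: the methods rank cases similarly, but the automated scores run lower.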


Despite the high correlation between the results yielded by methods one and two, lower Ki-67-positive cell scores were revealed in all study groups and subgroups after quantitative automatic counting on the scans with the software program (PatternQuant) compared to the data assessed by the pathologist.

The possibilities of quantitative analysis of the Ki-67 proliferation marker in digital images were demonstrated. The main problems in applying the method in clinical practice may include: the quality of tissue specimens, the level of professional qualification of the pathologist in breast cancer examination, determination of representative fields of the scans in individual cases of carcinoma, and heterogeneity of the tumours.


This study revealed the prevalence of invasive ductal carcinomas of no special type grade II with positive ER and PR expression, with levels of Ki-67 expression characterised by significant variability from less than 20% to 90%.

The results demonstrate the high effectiveness of quantitative evaluation of Ki-67 expression in breast cancer using dedicated software (PatternQuant, 3DHISTECH). A high correlation was revealed between the results obtained by traditional examination by a pathologist and the data obtained by quantitative automatic counting on the scans, with a lower Ki-67-positive cell score in the digital images.

Competing interests

There are no conflicts of interest.


  1. Elmore JG, Nelson HD, Pepe MS, Longton GM, Tosteson AN, Geller B, et al. Variability in pathologists' interpretations of individual breast biopsy slides: A population perspective. Ann Intern Med 2016;164:649-55.
  2. Plancoulaine B, Laurinaviciene A, Herlin P, Besusparis J, Meskauskas R, et al. A methodology for comprehensive breast cancer Ki67 labeling index with intra-tumour heterogeneity appraisal based on hexagonal tiling of digital image analysis data. Virchows Arch 2015;467:711-22.
  3. Rakha EA, Aleskandarani M, Toss MS, Green AR, Ball G, Ellis IO, et al. Breast cancer histologic grading using digital microscopy: Concordance and outcome association. J Clin Pathol 2018;71:680-6.
  4. Williams BJ, Hanby A, Millican-Slater R, Nijhawan A, Verghese E, Treanor D, et al. Digital pathology for the primary diagnosis of breast histopathological specimens: An innovative validation and concordance study on digital pathology validation and training. Histopathology 2018;72:662-71.
  5. Elmore JG, Longton GM, Pepe MS, Carney PA, Nelson HD, Allison KH, et al. A randomized study comparing digital imaging to traditional glass slide microscopy for breast biopsy and cancer diagnosis. J Pathol Inform 2017;8:12.
  6. Van Eycke YR, Allard J, Salmon I, Debeir O, Decaestecker C. Image processing in digital pathology: An opportunity to solve inter-batch variability of immunohistochemical staining. Sci Rep 2017;7:42964.
  7. Joshi S, Watkins J, Gazinska P, Brown JP, Gillett CE, Grigoriadis A, et al. Digital imaging in the immunohistochemical evaluation of the proliferation markers ki67, MCM2 and geminin, in early breast cancer, and their putative prognostic value. BMC Cancer 2015;15:546.
  8. Varga Z, Cassoly E, Li Q, Oehlschlegel C, Tapia C, Lehr HA, et al. Standardization for ki-67 assessment in moderately differentiated breast cancer. A retrospective analysis of the SAKK 28/12 study. PLoS One 2015;10:e0123435.
  9. Harvey J, Thomas C, Wood B, Hardie M, Dessauvagie B, Combrinck M, et al. Practical issues concerning the implementation of ki-67 proliferative index measurement in breast cancer reporting. Pathology 2015;47:13-20.
  10. Jing N, Fang C, Williams DS. Validity and reliability of ki-67 assessment in oestrogen receptor positive breast cancer. Pathology 2017;49:371-8.
  11. Menter DG, Hoque A, Motiwala N, Sahin AA, Sneige N, Lieberman R, et al. Computerized image analysis of ki-67 in ductal breast carcinoma in situ. Anal Quant Cytol Histol 2001;23:218-28.
  12. Zhong F, Bi R, Yu B, Yang F, Yang W, Shui R, et al. A comparison of visual assessment and automated digital image analysis of ki67 labeling index in breast cancer. PLoS One 2016;11:e0150505.

   Helsinki Biobank's Digital Pathology Solutions in Processing Tissue Samples

Tiina Vesterinen1,2, Jenni Niinimäki3, Johanna Arola1, Tuomas Mirtti1,3

1Department of Pathology, HUSLAB, Helsinki University Hospital, University of Helsinki, Helsinki, Finland, 2Institute for Molecular Medicine Finland (FIMM), HiLife, University of Helsinki, Helsinki, Finland, 3HUSLAB, Helsinki Biobank, Helsinki University Hospital, Helsinki, Finland. E-mail: [email protected]


Digital pathology is an image-based environment that enables management, viewing, and analysis of histological samples in the form of digitized slides. It has been widely used in teaching and training as well as in clinical pathology, including diagnostics and remote consultation. Clinical research and biobanks are growing application areas of digital pathology. Currently, the Helsinki Biobank collects over 100 fresh frozen tissue samples per month. Digitized hematoxylin and eosin (H&E) stained slides are available for most of the samples. This histological verification allows researchers to select the most representative samples, even remotely, for their studies. In addition to fresh frozen samples, Helsinki Biobank administers a tissue repository of 4 million formalin-fixed, paraffin-embedded (FFPE) tissue samples. To maximize the value of this collection, digital pathology solutions are widely applied. Biomarker research relies mainly on tissue microarrays instead of whole tissue sections. In the Helsinki Biobank, the next-generation tissue microarray (ngTMA) is the main technology for delivering FFPE tissue material. After careful planning and design of the TMA, the H&E slides of the donor blocks are scanned and precise tissue areas for punching are marked on the digitized slides. Annotated slides are then overlaid on the images of the donor blocks, and arraying of the annotated tissue areas is performed automatically.

Here we describe the ngTMA protocol based on an example of 133 patients with pulmonary carcinoid tumors. We also present how digital pathology is utilized in fresh tissue biobanking.

Keywords: Biobank, digital pathology, fresh frozen tissue, tissue microarray


According to the Finnish Biobank Act,[1] a biobank is a unit for collecting and storing human biological material coupled with associated information for future research purposes. Biobanks accelerate research by providing readily accessible resources of samples and data, thereby relieving the researchers from recruiting volunteers as well as collecting samples and clinical follow-up information.

Helsinki Biobank, the largest hospital-integrated biobank in Finland, is owned by the Hospital District of Helsinki and Uusimaa. The aim of the biobank is to collect a 10 ml EDTA blood sample from each person entering its hospitals, and also to gather fresh frozen tissue samples from predefined patient cohorts. Although novel techniques for extraction of DNA, RNA, and proteins appear to enable the use of formalin-fixed, paraffin-embedded (FFPE) material,[2],[3],[4] fresh frozen tissue is still preferred in many study settings. In addition to fresh frozen samples, Helsinki Biobank administers a tissue repository of four million FFPE tissue samples.

Biomarker research relies on tissue microarrays (TMA), which are prepared by transferring small, carefully selected tissue cores from a traditional FFPE block into a so-called recipient block.[5] By using TMAs instead of whole sections, more samples can be studied under identical experimental conditions at significantly diminished cost. Recently, the next-generation TMA approach was conceptualized.[6] It utilizes digital pathology and automated tissue arraying together with scientific and histological expertise.

Here we describe how Helsinki Biobank utilizes the ngTMA approach for preparing TMA blocks, using a pulmonary carcinoid (PC) project as an example. We also present how fresh tissue processing takes advantage of digital pathology.


Fresh tissue biobanking

Resection specimens are transported at room temperature from the operating theatre to the Department of Pathology, where fresh tissue processing for biobanking purposes is part of the routine workflow. The same pathologist who prepares the sample for diagnostic purposes also harvests tissue for biobanking [Figure 1]. A macroscopically representative piece of tumor, as well as non-tumorous tissue, is detached, divided into smaller pieces, placed into cryovials (0.7 ml, 96-well format, external thread screw cap vial with 2D barcode and jacket, FluidX, Brooks Automation, Inc., Chelmsford, MA, US), and snap frozen in liquid nitrogen. Frozen vials are stored in the liquid nitrogen vapor phase. Sample-related information (e.g. time when the tissue was detached from circulation, time of arrival, tissue type, cryovial identifiers) is recorded in the QPati database of the Department of Pathology.
Figure 1: The protocol for fresh tissue biobanking. The biobanked sample is divided into smaller pieces which are placed into cryovials (a), snap frozen and stored in liquid nitrogen vapor phase (b). One of the diagnostic samples is taken right next to the biobanked sample and processed with routine histopathological methods (c). The resulting H&E slide is digitized and serves as an adjacent histological control for biobanked sample (d)

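The metadata recorded per cryovial can be pictured as a simple record; the field names here are illustrative, not the QPati database schema:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class FrozenSample:
    """One cryovial and the metadata recorded for it (illustrative)."""
    vial_barcode: str
    tissue_type: str       # e.g. "tumor" or "non-tumorous"
    detached_at: datetime  # tissue detached from circulation
    arrived_at: datetime   # arrival at the pathology department

    @property
    def cold_ischemia(self) -> timedelta:
        """Elapsed time between detachment and arrival."""
        return self.arrived_at - self.detached_at
```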

One of the diagnostic samples is then taken right next to the biobanked sample. It goes through the routine histopathology workflow including fixation, tissue processing, paraffin-embedding, sectioning, and staining. The hematoxylin and eosin (H&E) stained slide of the diagnostic sample is digitized with a whole-slide scanner (Pannoramic, 3DHISTECH, Budapest, Hungary) and uploaded into a specific folder in the internal network. This digitized slide serves as a mirror histological control for the biobanked sample.

Tissue microarray


The patients were identified from the registry of the Department of Pathology, HUSLAB, Helsinki University Hospital. The inclusion criteria were a) a surgically treated tumor between January 1990 and August 2013, b) a whole tumor specimen obtained at surgery, and c) availability of clinical data. Altogether, 146 patients were found. The diagnostic slides of these patients were retrieved from the archives and re-classified according to the latest World Health Organization classification for pulmonary carcinoids.[7] During this process, 13 patients were excluded due to scarce sample material (n=12) or a wrong primary diagnosis (n=1). For the remaining 133 patients, the most representative slides were selected and fresh H&E sections were prepared from the corresponding tissue blocks.

Slide scanning and annotation

H&E slides were digitized with a whole-slide scanner (Pannoramic, 3DHISTECH). Annotations for the TMA were marked using digital microscopy application software (CaseViewer, 3DHISTECH). An annotation tool with 1.0 mm circle marks in different colors was used to mark four cores on the tumor center (red), four on the tumor border (blue), four on non-tumorous lung (yellow), and two on bronchus or bronchiole (green) [Figure 2]. In addition, four tumor center areas (black) were marked for tissue cores to be detached for further DNA extraction. In total, 18 annotations were marked per patient case.
Figure 2: Digital annotation of tissue areas for TMA. Different annotation colors can be utilized (red for tumor center, blue for tumor border, yellow for non-tumorous area, green for bronchiole, and black for macromolecule extraction)

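The annotation scheme above can be written down as a small per-case plan; the names are illustrative and not part of the TMA software:

```python
# Number of 1.0 mm cores marked per patient case, keyed by annotation colour.
TMA_PLAN = {
    "red":    ("tumor center", 4),
    "blue":   ("tumor border", 4),
    "yellow": ("non-tumorous lung", 4),
    "green":  ("bronchus/bronchiole", 2),
    "black":  ("tumor center, for DNA extraction", 4),
}

annotations_per_case = sum(n for _, n in TMA_PLAN.values())  # 18 in total
```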

Automated preparing of tissue microarrays

The TMAs were constructed with an automated tissue microarrayer (TMA Grand Master, 3DHISTECH). Two replicate TMAs were prepared. The TMA layout was created and each annotated digital slide was overlaid on the photo of the corresponding donor block with dedicated software (TMA Control Software, 3DHISTECH). Annotations were confirmed and the donor blocks punched.


Fresh tissue biobanking

Since February 2016, over 2300 fresh frozen tissue samples have been collected. Almost three quarters (73%) of them were breast cancer samples, with pulmonary tumors representing the second most common cancer type (9%). Approximately half of the H&E slides of the corresponding diagnostic FFPE samples have been digitized.

Tissue microarray

Scanning of the slides representing the donor blocks was done mostly overnight, taking from 5 to 15 minutes per slide. Digitized slides were uploaded to an external hard drive. The digital slide annotation software was easy and quick to use, and the results were more accurate compared to traditional annotation under the microscope. In addition, the exact area where each tissue core was punched remains on the digital slide for any possible further needs.

Overlaying of the digital slide image on the donor block was occasionally challenging, especially with light-colored tissues in white tissue cassettes or when the tissue section for the newly generated H&E slide had stretched too much in the warm water bath while sectioning. Automated punching of tissue cores and transferring them into the recipient blocks succeeded in most cases: less than 5% of the cases needed further manual punching. As a result, five duplicate TMA blocks were created containing altogether 816 tissue spots per series (range per TMA block 123-186). In addition, detached tissue cores for macromolecule extraction were automatically placed into 0.2 ml domed cap tubes (Thermo Scientific, Thermo Fisher Scientific, Waltham, MA, US). Core drop-out rate in sectioning and staining of TMA slides was less than 5%.


Although novel techniques for the extraction of DNA, RNA, and proteins appear to enable the use of FFPE material, cryopreservation is the preferred method for maintaining the integrity of macromolecules. For this purpose, Helsinki Biobank offers digitalized mirror images of frozen tissues to facilitate tissue selection, even remotely.

Careful planning and design is the cornerstone of ngTMA. Digitalization and automated arraying increase the accuracy while saving human resources and time.

Conflicts of interest

There are no conflicts of interest.


  1. Ministry of Social Affairs and Health, Finland. 688/2012 Biobank Act. Unofficial translation. Available from: [Last accessed on 2018 Aug 30].
  2. FitzGerald LM, Jung CH, Wong EM, Joo JE, Gould JA, Vasic V, et al. Obtaining high quality transcriptome data from formalin-fixed, paraffin-embedded diagnostic prostate tumor specimens. Lab Invest 2018;98:537-50.
  3. Bonnet E, Moutet ML, Baulard C, Bacq-Daian D, Sandron F, Mesrob L, et al. Performance comparison of three DNA extraction kits on human whole-exome data from formalin-fixed paraffin-embedded normal and tumor samples. PLoS One 2018;13:e0195471.
  4. Yakovleva A, Plieskatt JL, Jensen S, Humeida R, Lang J, Li G, et al. Fit for genomic and proteomic purposes: Sampling the fitness of nucleic acid and protein derivatives from formalin fixed paraffin embedded tissue. PLoS One 2017;12:e0181756.
  5. Kononen J, Bubendorf L, Kallioniemi A, Bärlund M, Schraml P, Leighton S, et al. Tissue microarrays for high-throughput molecular profiling of tumor specimens. Nat Med 1998;4:844-7.
  6. Zlobec I, Suter G, Perren A, Lugli A. A next-generation tissue microarray (ngTMA) protocol for biomarker studies. J Vis Exp 2014;91:51893.
  7. Beasley MB, Brambilla E, Chirieac LR, Austin JH, Devesa SS, Hasleton P, et al. Carcinoid tumour. In: Travis WD, Brambilla E, Burke AP, Marx A, Nicholson AG, editors. WHO Classification of Tumours of the Lung, Pleura, Thymus and Heart. Lyon: International Agency for Research on Cancer; 2015. p. 73-7.

   Identification and Retrieval of Prostate Cancer Cases Using a Content Based Search Tool

Sebastian Otálora1, Manfredo Atzori1, Roger Schaer1, Mats Andersson2, Kristian Eurén2, Martin Hedlund2, Lena Kajland Wilén2, Henning Müller1

1University of Applied Sciences Western Switzerland, Delémont, Switzerland, 2ContextVision AB, Stockholm, Sweden. E-mail: [email protected]


In recent years, large amounts of digital histopathology images have become available. Such images can be useful for pathologists; however, searching them for specific cases and similarities is not straightforward. In this work, we present a content-based retrieval system and a scale detection method that allow browsing in heterogeneous prostate histopathology datasets. The system is based on state-of-the-art deep convolutional neural networks[1] and handcrafted features. It allows retrieval of regions of prostate images that are visually similar to manually delineated regions of interest at specific magnification levels. Several image features were tested and compared, showing that a properly tuned retrieval system can enhance the practice of pathologists.


Large amounts of histopathology images have become available in digital form over the past years. Such databases allow the images to be analysed algorithmically and also make them available for visual image retrieval. However, retrieving histopathology images from varied sources, such as the literature and teaching files, can be difficult due to their heterogeneity in magnification and color and, sometimes, a lack of metadata. In this work, we present a content-based retrieval system and a scale detection method that allow browsing in heterogeneous histopathology datasets.


An annotation tool integrates the visual image retrieval system for digital pathology,[1] allowing the tests to be performed and the results to be evaluated visually. The extracted image features, as well as the detection of the scale, are based on fine-tuned state-of-the-art deep convolutional neural networks[2] and handcrafted texture, shape, and color features. The dataset includes a proprietary set of whole slide images and histopathology images automatically extracted from the biomedical open access literature of PubMed Central. The total training time for the deep learning networks did not exceed 4 hours on a modern graphics processing unit.
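Once each indexed region is represented by a feature vector, retrieval reduces to a nearest-neighbour search over the index. The following minimal sketch illustrates only the similarity-ranking step; the vectors are invented toy data, and the real system's feature extraction and index are assumed to exist upstream:

```python
import numpy as np

def retrieve_similar(query_vec, index_vecs, k=5):
    """Return indices of the k indexed regions whose feature vectors are
    most similar to the query, ranked by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    idx = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    sims = idx @ q                 # cosine similarity to every indexed region
    return np.argsort(-sims)[:k]   # best matches first

# Toy index: 6 regions, each described by a 4-dimensional feature vector.
index = np.array([[1.0, 0.0, 0.0, 0.0],
                  [0.9, 0.1, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0],
                  [0.0, 0.0, 1.0, 0.0],
                  [0.1, 0.9, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
query = np.array([1.0, 0.05, 0.0, 0.0])
print(retrieve_similar(query, index, k=2))   # regions 0 and 1 are closest
```

In the real system the index would hold one such vector per 224 × 224 region, at each magnification level.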


The data used include 110 whole slide images of prostate biopsies and their respective indexed features of small regions at 5, 10, 20 and 40× magnification. In total, more than 100,000 regions were indexed and used for training of the deep learning network. Local annotations were available for some of the images, and the indexed regions measure 224 × 224 pixels.


The system allows retrieval of regions of prostate images that are visually similar to manually delineated regions of interest at specific magnification levels.

The quantitative retrieval performance of both types of visual features extracted is depicted in [Figure 1]. The main observation is that the deep learning-based features perform better at higher magnifications, i.e., at 20 and 40×. This could be partly because the number of patches used for these magnifications was slightly larger than for lower magnifications. It also suggests that the performance of the system could improve if more images were added to the training of the deep learning networks. Another observation is that the precision of both types of features is similar when few results are retrieved, but in general the deep features perform better when more results are retrieved.
Figure 1: Precision-recall graph of the retrieval performance using visual features extracted with the DenseNet architecture (orange line) and the color and edge directivity descriptor features (blue line) at four magnification levels. The DenseNet features systematically lead to slightly higher results, especially at the higher magnification levels (×20 and ×40)

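The precision-recall comparison reported here can be computed from binary relevance judgements of the ranked results; a small illustrative sketch of precision and recall at rank k (the relevance labels below are invented for illustration, not taken from the study):

```python
def precision_recall_at_k(relevant, k, total_relevant):
    """relevant: list of 0/1 flags for the ranked retrieval results,
    best match first. Returns (precision@k, recall@k)."""
    hits = sum(relevant[:k])
    return hits / k, hits / total_relevant

# Hypothetical ranked result list: 1 = relevant region, 0 = irrelevant.
ranked = [1, 1, 0, 1, 0, 0, 1, 0]
p, r = precision_recall_at_k(ranked, k=4, total_relevant=4)
print(p, r)   # 0.75 0.75
```

Sweeping k over the full ranked list yields one precision-recall curve per feature type, as plotted in Figure 1.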

Scale detection identifies the scale of the images and allows adapted features to be defined for retrieval and classification even if the magnification level is unknown.

Visual evaluation of the results was performed in two different ways, as shown in [Figure 2]a and [Figure 2]b. First, by searching for similar data within the same image. A whole slide image (WSI) can easily reach billions of pixels, while regions of interest can be small (e.g. 10³-10⁵ pixels). Thus, searching a WSI for areas that are similar to a specific region of interest can be time-consuming for the pathologist. The system retrieves the regions most similar to the defined region of interest within the image and provides them to the user as a convenient side-by-side list; this helps the pathologist automatically look for areas missed during annotation of the whole slide image. The second way to obtain relevant information is by comparing the suspicious area to similar images present in the open access literature, i.e., pathology journals. In the latter case, it is possible to launch a multimodal query by combining an image region and relevant text. This allows the amount and quality of the retrieved images to be refined.
Figure 2: Retrieved similar regions in (a) the same whole slide image and (b) images from PubMed Central journals


A short video demonstration of the system is available at


We compared different scale-based features for the identification and retrieval of similar prostate cancer regions of interest. The features allow searching for similar areas at specific scales and will be combined with a scale detection system. A visual histopathology retrieval system can enhance the practice of pathologists by allowing them to easily retrieve similar regions in an image or similar cases in proprietary datasets. Scale detection improves the retrieval performance on heterogeneous datasets, thus allowing pathologists to retrieve images from scientific publications, teaching files and books. A quantitative evaluation of the performance of the system was performed. The evaluation shows that, even though some improvements remain to be made, the features' ability to retrieve relevant areas within the same image and in the scientific literature opens the possibility of assisting pathologists in their daily work and research duties with this kind of system.


  1. Schaer R, Otálora S, Jimenez-Del-Toro O, Atzori M, Müller H. Deep learning-based retrieval system for gigapixel histopathology cases and the open access literature. J Pathol Inform 2019;10:19.
  2. Huang G, Liu Z, Van der Maaten L, Weinberger KQ. Densely connected convolutional networks. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Vol. 1; 2017.

   Digital Pathology from a National Perspective

Sabine Leh1,2, Jens Lien3,4, Ivar Skaland5, Sindre Byrkjeland Nessen6, Inger Nina Farstad7, Line Rodahl Dokset8,9

Departments of 1Pathology and 2Research and Development, Haukeland University Hospital, Bergen, Norway, 3Central Norway Regional Health Authority's IT Department, 4Bouvet Norway ASA, 5Department of Pathology, Stavanger University Hospital, Stavanger, Norway, 6Helse Vest IKT AS, 7Department of Pathology, Laboratory Clinic, Oslo University Hospital, Oslo, Norway, 8Sykehuspartner Trust, 9Nasjonal IKT. E-mail: [email protected]


Objective: Digital pathology improves traditional workflows and enables the redesign of work processes and technical solutions. Some work processes and technical solutions are generally applicable and are therefore best developed at a national level. Methods: The Norwegian national project for digital pathology has developed concepts and plans since February 2016. Both the steering group and the project team are interdisciplinary, representing experts from the professions affected by digital pathology. The project has a long-term perspective and is scheduled to be completed by 2022. Results: The project has drawn up a roadmap extending to 2022. A concept report was delivered in May 2017. The project has since worked on detailed planning. A national solution for nationwide cooperation, "ePat", is a key deliverable. ePat is planned to be integrated with regional laboratory information systems, allowing workflows across health enterprises and enabling easy consultation. Nationwide digital cooperation requires higher-level organizational structures and standardized work processes. The project will establish such structures and processes. Examples of organizational structures are national professional networks for pathologists and laboratory technicians. Nationwide standards for sample processing and shared quality requirements, as well as synoptic reports, contribute to standardized work processes. Workflows across health enterprises require adjustment of funding schemes and nationwide unique sample identifiers. Conclusion: The Norwegian national project for digital pathology proposes nationwide technical and organizational solutions that relieve the regional digital pathology enterprises and, at the same time, provide patients with a faster and safer diagnosis irrespective of where they live.

Keywords: Definition, digital pathology, nationwide collaboration


To realize the potential of a new IT solution, two actions are needed: first, implementation of the technology itself, and second, redesign of workflows and the related organizations.[1] For digital pathology, simply digitizing glass slides would not realize the full potential of this new technology. Workflows have to be adapted to the new possibilities that the technology enables. Some work processes and technical solutions have to be developed locally; others are generally applicable and are therefore best developed at a national level. Consequently, as more and more departments in Norway are in the process of implementing digital pathology, it was appropriate to start a national project addressing common aspects of digital pathology. The national project is intended to develop concepts for the effective use of digital pathology at a national level. Based on the concepts developed, joint solutions are to be planned and implemented later.


The Norwegian national project for digital pathology has developed concepts and plans since February 2016. A concept report was delivered in May 2017. The project has since worked on detailed planning and is now in transition to the execution phase. Both the steering group and the project team are interdisciplinary, representing experts from the professions affected by digital pathology, such as pathologists, laboratory technicians, clinicians, enterprise architects, controllers and health IT specialists. A critical success factor for the project is that all four health regions in Norway participate in the project team. The project has a long-term perspective and is scheduled to be completed by 2022.


Since digitization encompasses and affects all processes in a pathology department, long and short definitions of the term "digital pathology" were developed to take this into account. The long version is as follows:

“Digital pathology involves digitizing all information that belongs to or can be extracted from tissue and cell samples. A key element of digital pathology is scanning of thin tissue sections or cell smears to obtain digital slides, which are examined on a screen and not in the microscope. However, the term digital pathology also encompasses other types of digital information, such as electronic requisition with clinical information, images of surgical specimens or results from molecular pathological analyses. Digital pathology is a paradigm shift within pathology. The technology provides new opportunities for image analysis and pattern recognition and facilitates and improves collaboration as well as utilization of highly specialized expertise. The technology implies integration with patient administration systems as well as standardization and optimization of work processes both in the laboratory and in the evaluation of digital information. Thus, digital pathology covers the entire process from requisition to report and evaluation of pathology data in registries.”

The short version reads:

“Digital pathology involves digitizing all information that belongs to or can be extracted from tissue or cell samples and includes the entire process from requisition, digitization of the slide to report and evaluation of pathology data in registries.”

The project has drawn up a roadmap of national solutions for digital pathology extending to 2022. In this roadmap, several sub-projects will contribute to achieving the vision of equivalent, fast and accurate diagnoses in all pathology departments in Norway.

A national solution for nationwide collaboration, "ePat", will be a key deliverable [Figure 1]. ePat is planned to be integrated with regional laboratory information systems and digital slide repositories, thus allowing workflows across health enterprises and enabling easy consultation and access to complete case histories. While the main workflows will remain local and regional, national workflows will be used for consultation, review and flexible workload allocation, for example in case of vacancies. ePat will give clinicians access to digital slides from their patients, and patients access to their own slides. ePat will also have standardized systems for the transfer of data: pathology data will be continuously transferred to health registries, and data or digital slides will be delivered to approved research projects. Furthermore, ePat will include a national education and training database with anonymized teaching materials (digital slides, images and other materials) for residency training and continuing professional development.
Figure 1: Technical diagram illustrating the national solution for nationwide collaboration “ePat”


Nationwide digital collaboration requires higher-level organizational structures and standardized work processes. The project will establish such structures and processes [Figure 2]. National professional networks of subspecialized pathologists will be a key structure. They are the backbone of subspecialized diagnostics, when digital slides are flexibly deployed according to the principle “right sample to the right pathologist”. National networks of subspecialized pathologists are also central in the development and maintenance of standardized pathology reports with structured data.
Figure 2: Chart illustrating structures and processes required by nationwide collaboration


Cross-departmental diagnostics will require digital slides from microscopic sections with histological and immunohistochemical stains of consistently high quality. To achieve this, a national network of laboratory technicians will be formed. This network is supposed to establish nationwide standardized laboratory processes and a central quality management. Workflows across health enterprises require nationwide unique sample identifiers and standardized digital slide formats for storage and collaboration. Funding schemes have to be adjusted to changed workflows. Privacy and other legal aspects will also be covered by the project.


The Norwegian national project for digital pathology is implementing nationwide technical and organizational solutions that relieve the regional digital pathology enterprises and, at the same time, provide patients with a faster and safer diagnosis irrespective of where they live.

Competing interests

The authors disclose no competing interests.


1. Wachter R. The Digital Doctor: Hope, Hype, and Harm at the Dawn of Medicine's Computer Age. McGraw-Hill Education; 2015.

   Image-Analysis Microservices Integrated in an Open Source PACS

David Pilutti1, Andrea Poli2, Giacomo Petronio2, Vincenzo Della Mea1

1Department of Mathematics, Computer Science, and Physics, University of Udine, Udine, Italy, 2O3 Enterprise, Trieste, Italy. E-mail: [email protected]


Image analysis is becoming increasingly popular in digital pathology for tasks that include quantification of biomarkers, tissue classification, and identification of rare events. In the present abstract we describe an approach, based on microservices, to partially automate image analysis on digital slides after digital slide acquisition. The microservices architecture is a variant of the service-oriented architecture framework in which an application is a collection of loosely coupled services collaborating through lightweight protocols. As a preliminary step, we developed a hierarchy of atomic microservices, from which more abstract services can be created by aggregating them through orchestration languages such as BPEL. We tested the architecture within the O3IMS PACS (O3 Enterprise Image Management Suite - Picture Archiving and Communication System) in a simple immunohistochemistry analysis case. The large size of digital slides makes it impractical to move the slide to the microservice provider. Thus, the PACS and the microservices provider share storage: slides annotated with staining and organ are automatically submitted to analysis through a microservice call. The proposed architecture, although preliminary, seems to provide a relatively simple approach to generic digital slide analysis services in a distributed environment, taking into account the issues arising from the large size of the involved files.

Keywords: Digital pathology, image analysis, Microservices, PACS, SOA


Image analysis is becoming increasingly popular in digital pathology, for tasks that include quantification of biomarkers, tissue classification, and identification of rare events. For a very few applications, algorithms have also been certified by the Food and Drug Administration (FDA) or CE-marked for use in clinical routine. However, in most cases the workflow surrounding image analysis has not yet been optimized for routine use: manual intervention is required to select and activate algorithms.

The microservices architecture is a variant of the service-oriented architecture (SOA) framework in which an application is a collection of loosely coupled services collaborating through lightweight protocols. This may also enable cloud-based services, provided in a pay-as-you-go modality.[1] Implementations of such architectures have been applied to the analysis of traumatic brain injury (TBI) images,[2] to the analysis of whole slide images (WSIs) in digital pathology,[3] and to teleconsultation of medical images.[4]

Some papers describe service-oriented architectures applied to digital pathology. Schuler et al.[5] recognized the need for a loosely coupled infrastructure in general digital pathology services. Guo et al. exploited SOAP-based web services for the integration of digital pathology services and anatomic pathology information systems.[6]

In the case of digital slide image analysis, the finest-grained microservices are those that are not further decomposable and that can be called directly or composed together to provide more abstract output. As an example, immunohistochemistry (IHC) quantification can be seen as a composition of tissue classification and diaminobenzidine (DAB) quantification on the extracted tumor areas. Since tissue classification and DAB quantification can each be used in different applications, both can also function as standalone microservices.

In the present paper we describe an approach, based on microservices, to partially automate image analysis on digital slides, as a step subsequent to digital slide acquisition. The work is being carried out inside the HeAD POR-ESF project funded by the Region Friuli – Venezia Giulia, together with O3 Enterprise, a spin-off company of the University of Trieste that develops a number of biomedical imaging systems, including:

  • O3IMS.Store: a PACS system for archiving and managing bio-images (MR, CT, etc.), currently being adapted for digital slide storage. This PACS is a substantial evolution of a previous open source project.[7]
  • O3IMS.View: a system for the visualization and reporting of biomedical images.

All the above systems comply with the relevant DICOM and HL7 standards, as well as with the IHE specifications.

The present work aims at providing a further processing module to be added to the overall system, which can also be used independently of the other components.


The proposed system includes the implementation of some atomic microservices, such as tissue classification, color deconvolution for DAB quantification, and nuclear segmentation.

Input parameters should include the digital slide, the staining (possibly chosen from a term set), and optionally the organ. We chose REST-based communication over HTTP. [Figure 1] shows an example of the hierarchy of atomic microservices and their possible combinations to perform image analysis for digital pathology images.
Figure 1: Schematic example of Image analysis for digital pathology performed using atomic microservices and their combinations in different contexts such as Progesterone (PGR) quantification or Tumor Infiltrating Lymphocytes (Tils) classification


However, digital slides pose a problem due to their size. While in most cases input parameters are passed directly to microservices through the calling protocol, the large size of digital slides makes it impractical to move the slide to the microservice provider. For this reason, we envisaged an architecture with shared storage, where the slide path becomes an input parameter.

We tested the architecture within the O3 Enterprise Imaging Products by encapsulating immunohistochemistry screening algorithms able to discriminate clearly positive and negative cases for progesterone (PGR) and estrogen (ER). O3IMS and the microservices provider share storage; slides annotated with staining and organ are automatically submitted to analysis through a microservice call. The call result consists of a "positive", "negative", or "suspect" answer, which is incorporated into the slide annotations shown by the viewer, with a warning about their usage in clinical routine.


A sample hierarchy of atomic microservices has been designed to cover the three main areas of image analysis (quantification, tissue classification, rare event identification) and is shown in [Figure 2].
Figure 2: An example of hierarchy of atomic microservices to cover three main areas of WSI analysis


Inputs always include a digital slide path. Depending on the service, they can also include one or more annotations and other metadata if needed. Outputs may take the form of numeric data, classes or tags, and also annotations. For this, a semantic resource is needed to identify image or case features in a standardized way.

Further services could be designed for low level operations, like image registration, etc.

The above-mentioned services can be composed at code level or through service orchestration languages (such as BPEL, the Business Process Execution Language) to obtain higher-level functionalities.

For example, a web service for quantifying nuclear immunohistochemistry in the tumoral tissue of a biopsy that may also contain normal tissue can be obtained by invoking the /classify/tumor/ service first and then the /quantify/nuclear/ service, the latter taking as input the same section and the annotation produced by the former.
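This composition can be sketched as two chained calls. The endpoint names follow the text, but the stub functions and their payloads below are hypothetical placeholders for the actual REST calls; note that only the slide path travels between services, since the slide itself stays on shared storage:

```python
# Minimal sketch of orchestrating two atomic microservices at code level.
def classify_tumor(slide_path):
    """Stand-in for a REST call to /classify/tumor/ (hypothetical stub):
    returns an annotation delimiting the tumoral region of the slide."""
    return {"slide": slide_path, "region": "tumor", "bbox": [120, 80, 900, 640]}

def quantify_nuclear(slide_path, annotation):
    """Stand-in for a REST call to /quantify/nuclear/ (hypothetical stub):
    quantifies nuclear IHC staining inside the annotated region only."""
    return {"slide": slide_path, "region": annotation["region"],
            "positive_fraction": 0.42}   # dummy result value

def quantify_ihc_on_tumor(slide_path):
    # Orchestration: the annotation output by the first service
    # becomes an input of the second.
    annotation = classify_tumor(slide_path)
    return quantify_nuclear(slide_path, annotation)

result = quantify_ihc_on_tumor("/shared/slides/case_001.svs")
print(result)
```

In production the same chaining would be expressed in an orchestration language such as BPEL rather than in client code.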

To test the proposed system, we implemented prototypes of some atomic microservices, such as color deconvolution, which can be used for stain separation and quantification in both PGR and ER image analysis. The result of the analysis classifies each case as "negative", "positive", or "suspect". The cases classified as "suspect" are then further analyzed by experts, with possible software support from other services. For example, the analysis could be refined using other atomic services to identify the tumor area and to evaluate positive cells only in the tissue classified as tumor.
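Color deconvolution of this kind is commonly done in the Ruifrok-Johnston style: pixel RGB values are converted to optical densities and unmixed with the pseudo-inverse of a stain matrix. A minimal numpy sketch, using illustrative (uncalibrated) stain vectors rather than the prototype's actual parameters:

```python
import numpy as np

# Illustrative unit RGB optical-density vectors; a real system would use
# measured, calibrated stain vectors for the staining protocol in use.
STAINS = np.array([[0.65, 0.70, 0.29],    # hematoxylin
                   [0.27, 0.57, 0.78]])   # DAB

def deconvolve(rgb):
    """rgb: float array (N, 3) with values in (0, 1]. Returns per-pixel
    concentrations (N, 2) of the two stains via Beer-Lambert unmixing."""
    od = -np.log10(np.clip(rgb, 1e-6, 1.0))   # optical density per channel
    return od @ np.linalg.pinv(STAINS)         # unmix: OD = C @ STAINS

# Synthesize a pixel with known stain concentrations, then recover them.
c_true = np.array([[0.8, 0.3]])
rgb = 10.0 ** (-(c_true @ STAINS))             # forward Beer-Lambert model
c_est = deconvolve(rgb)
print(np.round(c_est, 3))   # recovers approximately [0.8, 0.3]
```

The recovered per-stain concentration maps can then be thresholded or aggregated to yield the positive/negative/suspect screening answer.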


The proposed architecture, although preliminary, seems to provide a relatively simple approach to generic digital slide analysis services in a distributed environment, taking into account the issues arising from the large size of the involved files. Specific security issues have not yet been examined, although SOA security has been investigated extensively (e.g., [8]) and apparently provides a sensible solution for the security of health records.[9]

However, to describe and retrieve image analysis microservices according to the theoretical model of SOA, sufficiently abstract and shared terms should be adopted; that is, an ontology of operations, of slide contents at different levels (subcellular, cellular, tissue, organ, etc.) and of diseases is needed, as proposed by the MICO project.[10]

The microservices are currently being integrated in the O3IMS PACS in two ways, to be considered representative of two image analysis modalities:

- Pull modality: the pathologist, while viewing the slide through a workstation, decides to carry out some automated analysis, and thus invokes a service from the graphical interface;

- Push modality: after digitization of a slide, the PACS may autonomously invoke some analysis on it, based on available metadata. This way, when the pathologist examines the slide, the results are already available. Furthermore, some analyses could screen slides to spare the pathologist the examination of obvious ones (i.e., frankly negative IHC results).

The more microservices are implemented and integrated within the PACS, the more analyses become possible, using the microservices either directly or in combination for deeper analysis and automation. The scalable architecture allows high flexibility in digital slide image analysis, potentially covering a wide range of quantitative analyses over different types of stained images. The main advantage of such an architecture is that many different analyses can be performed over potentially heterogeneous data with a limited computational load on the local machine as well as on the communication system.

The system can be extended with new atomic microservices, as well as combinations of new and existing ones, to fulfill clinicians' image analysis needs.


The work has been partially funded by the project HEaD – Higher Education and Development - FP1619942003 (Region Friuli – Venezia Giulia).


  1. Flynn AJ, Boisvert P, Gittlen N, Gross C, Iott B, Lagoze C, et al. Architecture and initial development of a knowledge-as-a-service activator for computable knowledge objects for health. Stud Health Technol Inform 2018;247:401-5.
  2. Gao Y, Burns SS, Lauzon CB, Fong AE, James TA, Lubar JF, et al. Integration of XNAT/PACS, DICOM, and research software for automated multi-modal image analysis. Proc SPIE Int Soc Opt Eng 2013;8674.
  3. Zerbe N, Hufnagl P, Schlüns K. Distributed computing in image analysis using open source frameworks and application to image sharpness assessment of histological whole slide images. Diagn Pathol 2011;6 Suppl 1:S16.
  4. Rassias Andrikos C, Tsanakas P, Maglogiannis I. Versatile cloud collaboration services for device-transparent medical imaging teleconsultations. In: Proceedings of CBMS. IEEE; 2017. p. 306-11.
  5. Schuler R, Smith DE, Kumaraguruparan G, Chervenak A, Lewis AD, Hyde DM, et al. A flexible, open, decentralized system for digital pathology networks. Stud Health Technol Inform 2012;175:29-38.
  6. Guo H, Birsa J, Farahani N, Hartman DJ, Piccoli A, O'Leary M, et al. Digital pathology and anatomic pathology laboratory information system integration to support digital pathology sign-out. J Pathol Inform 2016;7:23.
  7. Inchingolo P, Beltrame M, Bosazzi P, Cicuta D, Faustini G, Mininel S, et al. O3-DPACS open-source image-data manager/Archiver and HDW2 image-data display: An IHE-compliant project pushing the e-health integration in the world. Comput Med Imaging Graph 2006;30:391-406.
  8. Beer MJ, Hassan MF. Adaptive security architecture for protecting RESTful web services in enterprise computing environment. Serv Oriented Comput Appl 2017;12:111-21.
  9. Rezaeibagha F, Win KT, Susilo W. A systematic literature review on security and privacy of electronic health record systems: Technical perspectives. Health Inf Manag 2015;44:23-38.
  10. Racoceanu D, Capron F. Towards semantic-driven high-content image analysis: An operational instantiation for mitosis detection in digital histopathology. Comput Med Imaging Graph 2015;42:2-15.

   Deep Digital Convergence of Radiology, Pathology, and Clinical Molecular Biology

Harry B. Burke1

1Department of Medicine, F. Edward Hébert School of Medicine, Uniformed Services University of the Health Sciences, Bethesda, Maryland, USA. E-mail: [email protected]


Historically, most radiology, pathology, and clinical molecular biology information has been image-based and sequestered within each domain. This means that their information could not be combined to improve our knowledge of disease and treatment. This paper proposes the convergence of radiology, pathology, and clinical molecular biology through the integration of their digital data in powerful statistical models in order to create information synergies that trained models can use to improve risk estimation, diagnostic certainty, treatment effectiveness, and clinical outcomes. This advance will improve the quality and safety of medical care.

During the 20th century, radiology, pathology, and clinical molecular biology were independent clinical domains. Each existed in its own ocular realm: radiologists looked at structural and functional anatomic images, pathologists looked at tissue-based images, and clinical molecular biologists looked at biochemical false-color microarray images. Each looked for features that could be used to determine the risk of disease, diagnose disease, and assess its severity. Radiology viewed anatomic images generated by (i) Röntgen-ray imaging and (ii) algorithmic construction, either from digital molecular data (for example, magnetic resonance imaging) or from analog cellular data (for example, ultrasound). Pathology viewed enhanced (stains, antibodies, etc.) images of molecular/cellular/multicellular material affixed to slides. Clinical molecular biology viewed analog false-color images of probe-detected gene expression, which could be converted into numeric data, but the resulting numbers were imprecise because of the imprecision inherent in the false-color data.

There are several issues related to the visual detection of image-based clinical information. One issue is additivity: each clinical domain possesses some information that is not contained in the other domains (orthogonality), but there was no way to combine the images across the imaging domains in order to create an additive model of the patient and disease. Another issue is observability: there is more information in the analog data than can be seen by an observer; therefore, from a cybernetic perspective, the visual assessment of images results in a loss of information. A third issue is subjectivity: the visual assessment of analog data (signal detection) is subject to high intra- and inter-observer variability (error). This lowers predictive accuracy for the three types of prediction (namely, risk/prevention, diagnosis, and prognosis/treatment) and, as a consequence, decreases the clinical utility of the information.[1] Fortunately, radiology, pathology, and clinical molecular biology data have become, or are rapidly becoming, digital. This means that the images will no longer be needed; they will be replaced by computational digital data. This transition from analog to digital data permits, for the first time, the convergence of the three domains. The union of these imaging domains will increase the amount and quality of clinically useful information through improvements in additivity and observability, and it will reduce error through the elimination of subjectivity.
We can employ a conceptual framework, a set of variables and the relations among them that are thought to account for a phenomenon, to help us understand, and guide our modeling of, disease.[2] The body is a unitary, complex biological system[3] which has the following framework: (1) the body is an integrated, interdependent hierarchical organization that is composed of systems, each of which serves one or more biological functions; (2) the complete uncompensated failure of one of the body's necessary systems results in the body's failure – but the body has alternate systems for some functions and they may be able to take over for a failed system; (3) the body can be described in terms of four-dimensional, interconnected levels of analysis, including the molecular, cellular, and multicellular levels; (4) a level is defined in terms of its units and rules (the allowed interactions and activities) and the level's units and rules are the constituents of its functional systems at that level; (5) each level has different units and rules; (6) the levels are interrelated in a hierarchical manner; (7) time is different at different levels, i.e., things occur at different rates at different levels; furthermore, body time is not equivalent to level time; (8) complex biological systems are dynamical in that their units are always interacting with each other in order to maintain homeostasis; (9) in terms of the functioning of complex biological systems, the inhibition of an activity is often just as important as the existence of the activity; (10) the body's biological systems self-organize based on their constitutive units and rules; (11) complex biological systems have the ability to adapt to changes in their internal systems, their functions, and their external environment, which means that they evolve over time; (12) complex biological systems have the ability to maintain themselves, protect their existence, and learn from experience; and (13) the functioning of a 
complex biological system depends on its present state, its environment, and its feedback and feedforward processes. Finally, higher biological/anatomical levels are constructed from four-dimensional multicellular functional units (MFUs). MFUs have at least three characteristics: (1) they are composed of multiple cell types and their local environment, including the extracellular matrix; (2) their cells are co-dependent and spatio-temporally interact in an organized, cooperative manner; and (3) they perform one or more biological functions that are required to maintain homeostasis. Biological systems are probabilistic (statistical) rather than deterministic (causal) because they are loosely coupled rather than tightly coupled systems. Tightly coupled means that what the system does in the future is not influenced by what it is currently doing. For example, a spring is governed by the rule of proportionality, which means that its response to a stimulus is always proportional to the strength of the stimulus. Because tightly coupled systems are deterministic, they are inflexible; therefore, when conditions change they become maladapted and dysfunctional. Loosely coupled means that the forces affecting the present state of the system can determine, to a greater or lesser degree, the future state of the system. Biological systems are loosely coupled probabilistic systems that can adapt to changes in themselves, their functions, and their environment. For example, in physics, Newton and Einstein created deterministic systems, whereas quantum mechanics, a statistical theory, is probabilistic.[4] Complex biological systems are loosely coupled to the extent that they do not jeopardize their existence. A deterministic system does not allow for choice; there is only one outcome, so there is no uncertainty. A probabilistic system always has choice; there is always more than one possible outcome, so it is inherently uncertain. 
Since information is anything that reduces uncertainty[5] (uncertainty is mathematically equivalent to Boltzmann entropy), there is no information in a deterministic system. The amount of information in a probabilistic system is a function of its ability to reduce uncertainty. We are interested in the uncertainty related to the accuracy of our risk, diagnosis, and treatment predictions. For example, a patient's biomarker is informative if knowing its numerical value reduces our uncertainty regarding the patient's outcome (e.g., prognosis). Once we have a framework for complex biological systems, we can begin to understand non-traumatic disease. The homeostatic principle is necessary, but not sufficient, for the normal functioning of a biological system. An important homeostatic mechanism is deviation reduction (negative feedback) – which maintains the system's normal functioning. Disease is a biological system that violates the body's homeostatic principle through the use of deviation amplification (positive feedback) – it is this deviation amplification away from homeostasis that allows the disease to cause the body's failure. Disease has many of the characteristics of a chaotic system. Chaotic systems are a special class of loosely coupled systems that employ deviation amplification processes. In some situations, the body is able to contain the disease but, in other situations, the disease takes control of the body and destroys it. Our job is to help the body regain homeostasis through traumatic interventions, for example, surgery, and/or through molecular interventions, for example, medications. 
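The definition of information as uncertainty reduction can be made concrete with Shannon entropy. The sketch below uses invented numbers: a hypothetical biomarker that shifts a 50/50 recurrence estimate to 90/10 supplies roughly half a bit of information.

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical illustration: before any test, recurrence is judged 50/50
# (maximal uncertainty, 1 bit). Suppose an invented biomarker result
# shifts the estimate to 90/10.
prior = [0.5, 0.5]
posterior = [0.9, 0.1]

# Information provided by the biomarker = reduction in uncertainty (bits).
info_gained = entropy(prior) - entropy(posterior)
```

A perfectly deterministic outcome (probability 1) has zero entropy, which is exactly the text's point that a deterministic system carries no information.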
To accomplish this task, we must attend to Sir William Osler, who said, “Medicine is the science of uncertainty and the art of probability” – in other words, our goal is to reduce uncertainty by creating statistical (probabilistic) models which integrate multilevel biological information in order to create accurate disease representations which help us understand and defeat the disease. It might be thought that the convergence of radiology, pathology, and clinical molecular biology is about the collapsing of three levels of analysis related to anatomic scale – from larger to smaller scale. But our levels of analysis are not related to scale; rather, they are the hierarchical levels related to the molecular, cellular, and multicellular systems and subsystems. These systems and subsystems are fantastically complex. Even the smallest subsystems are extremely complicated; for example, researchers are just beginning to try to model a single signal transduction pathway using supercomputers.[6] Modeling every system in the body is currently not possible, so what is the point of combining orthogonal biological information? The answer is that we are not going to model all the workings of normal biological systems; rather, we are going to model deviant biological systems (disease) – we are going to enter the disease-related data into a statistical algorithm, let the statistical method learn the relevant-for-the-disease relationships (rather than all permissible biological system relationships), create a trained model that accurately represents the disease, and use that model of the disease to provide accurate probability estimates related to disease risk/prevention, diagnosis, and prognosis/treatment. In other words, we do not need to know all the aspects of a system in order to model those aspects related to the disease and to use that model to improve the quality and safety of patient care. 
Radiology, pathology, and clinical molecular biology provide the digital molecular, cellular, and multicellular data we need to create our disease models. Radiomics is radiographic digital data at the molecular, cellular, and multicellular levels.[7] Non-Röntgen-ray radiology is noninvasive, and some modalities, such as magnetic resonance imaging, are inherently digital; these modalities can operate at the molecular, cellular, and multicellular levels. Pathomics, which is invasive, is the use of statistical algorithms to digitize and learn key features of pathology images, and it can operate at the molecular, cellular, and multicellular levels.[8] Proteogenomics, which is also invasive, is the acquisition of disease-related digital gene and protein information and, of course, it operates at the molecular, cellular, and multicellular levels.[9] Radiomic, pathomic, and proteogenomic digital data can be combined with other digital data, including physiologic and laboratory data, to create an optimal statistical model of the disease. The resulting trained model acts as a surrogate for the disease: to the extent that the model is an accurate representation of the disease, it can inform us regarding the disease's natural course and it can tell us which treatment to select to slow or truncate the disease's progression. In order to select the best statistical method to model the disease, we need to formally understand complex biological systems. One way to do so is through network theory, i.e., nodes (functional units) and their connections (the relationships and interactions between nodes). Biological systems can be represented as a multiplex network, i.e., the units at different levels are not separate entities, which means that the data cannot be compressed into a single-level model or represented as an aggregated model. 
Furthermore, we are modeling a dynamical rather than a structural network; therefore, a level's units interact with each other, they are constitutive within and across levels (the effects of one level pass through from one level to another), and they can be codependent.[10] In addition, we have to be aware of multilevel information redundancy, i.e., that the same information can be shared across levels.[11] Finally, we would like to use as few levels as possible to explain the phenomenon because the more levels there are, the sparser the representation. These network characteristics of biological systems have important implications for the convergence of multilevel digital data. One implication is that the statistical method must be able to deal with multilevel, interactional data. Another is that the method must be able to learn from the data. Finally, the method should make few, if any, parametric assumptions. What is required is a universal classifier, one that can learn multilevel dynamical data, including capturing the nonlinearities and interactions, and that will converge on the correct solution which, in this case, is the creation of an accurate model of the disease. Fortunately, there is such a universal classifier, namely, the neural network.[12] A three-layer backpropagation neural network with an arbitrarily large number of sigmoidal hidden layer units can fit any real continuous function and, given that the solution is in the data and that there is sufficient data, it will find the correct solution.[13] The discriminative accuracy of the trained neural network is a function of how well it models the disease, and it is measured by the receiver operating characteristic (ROC).[14] To be clinically useful, a model's ROC area should be at least 0.70. 
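The ROC-based accuracy criterion can be illustrated with a minimal sketch. The area under the ROC curve equals the probability that a randomly chosen positive case outscores a randomly chosen negative one (the Mann–Whitney formulation); the model outputs below are invented for illustration.

```python
def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed directly as the probability that
    a randomly chosen positive case outscores a randomly chosen negative
    case (ties count one half) -- the Mann-Whitney formulation."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Invented model outputs for diseased vs. healthy cases:
diseased = [0.9, 0.8, 0.6, 0.55]
healthy = [0.7, 0.4, 0.3, 0.2]
auc = roc_auc(diseased, healthy)  # 0.875 for these invented scores
```

An AUC of 0.5 corresponds to chance discrimination and 1.0 to perfect discrimination, which is why a threshold such as 0.70 is used as a bar for clinical usefulness.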
Our remaining task is to create clinical decision support systems (CDSS) that: (1) function in the prediction domains of risk/prevention, diagnosis, and prognosis/treatment; (2) contain powerful neural networks that use the patient's digital data to make accurate individualized patient predictions and that learn in order to improve their performance over time; and (3) effectively communicate to the clinician and patient in real time the individual patient clinical information and predictions required for better shared decision-making and optimal treatment. In summary, the transformation of the clinical domains of radiology, pathology, and clinical molecular biology from analog to digital data, and the integration of their digital data in powerful statistical models, will create information synergies that will improve risk estimation, diagnostic certainty, treatment effectiveness, and clinical outcomes. This advance will improve the quality and safety of patient care.


Presented at the 14th European Congress on Digital Pathology and the 5th Nordic Symposium on Digital Pathology, Helsinki, Finland on 31 May 2018.


The views expressed in this manuscript are those of the author and do not represent the views of the U.S. Government or any of its agencies.


Support for this manuscript was provided by the Patient Safety and Quality Academic Collaborative, a joint Defense Health Agency – Uniformed Services University program.

Conflicts of interest

The author declares no conflicts of interest.


  1. Elmore JG, Longton GM, Carney PA, Geller BM, Onega T, Tosteson AN, et al. Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA 2015;313:1122-32.
  2. Carpiano RM, Daley DM. A guide and glossary on post-positivist theory building for population health. J Epidemiol Community Health 2006;60:564-70.
  3. von Bertalanffy L. General System Theory: Foundations, Development, Applications. Revised edition. New York: George Braziller; 1976.
  4. Schrödinger E. What Is Life? New York: The Macmillan Company; 1945.
  5. Shannon CE. A mathematical theory of communication. Bell Syst Tech J 1948;27:379–423, 623-56.
  6. Ganesan N, Li J, Sharma V, Jiang H, Compagnoni A. Process simulation of complex biological pathways in physical reactive space and reformulated for massively parallel computing platforms. IEEE/ACM Trans Comput Biol Bioinform 2016;13:365-79.
  7. Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images are more than pictures, they are data. Radiology 2016;278:563-77.
  8. Bychkov D, Linder N, Turkki R, Nordling S, Kovanen PE, Verrill C, et al. Deep learning based tissue analysis predicts outcome in colorectal cancer. Sci Rep 2018;8:3395.
  9. Burke HB. Predicting clinical outcomes using molecular biomarkers. Biomark Cancer 2016;8:89-99.
  10. DeFord DR, Pauls SD. A new framework for dynamical models on multiplex networks. J Complex Netw 2018;6:353-81.
  11. Stolarczyk S, Bhardwaj M, Bassler KE, Ma WJ, Josić K. Loss of information in feedforward social networks. J Complex Netw 2018;6:448-69.
  12. Burke HB, Goodman PH, Rosen DB, Henson DE, Weinstein JN, Harrell FE Jr., et al. Artificial neural networks improve the accuracy of cancer survival prediction. Cancer 1997;79:857-62.
  13. Hornik K, Stinchcombe M, White H. Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks. Neural Netw 1990;3:551-60.
  14. Swets JA. Signal Detection Theory and ROC Analysis in Psychology and Diagnostics: Collected Papers. Mahwah, NJ: Lawrence Erlbaum Associates; 1996.

   Predicting Prostate Cancer Progression Using a Network of Bivariate Prognostic Models Top

Guenter Schmidt1, Hadassah Sade1, Nathalie Harder1, Harald Hessel2, Maria Athelogou1, Alexander Buchner2, Christian Stief2, Thomas Kirchner2, Ralf Huss1

1Definiens AG, Munich, Germany, 2Ludwig Maximilian University, Munich, Germany. E-mail: [email protected]


Background: The accurate identification of prostate cancer patients with prostate specific antigen (PSA) biochemical recurrence (BCR) after radical prostatectomy is unsolved in oncology. We present a set of prognostic bivariate models discovered by a novel graph-based method integrating tissue phenomics data from immunohistochemistry (IHC) and mRNA-based gene expression profiling. Methods: Automated image analysis and co-registration determined spatial properties of cell populations detected in consecutive tissue sections stained for CD3/CD8, CD68/CD163, CK18/p63 and CD34 (Definiens Tissue Phenomics, Munich). For each of the 23 patients (Gleason-score 6-9, pT2, age<=80 years), the resulting tissue phenomics feature vector was expanded with gene expression measurements (nCounter PanCancer Immune Profiling Panel, NanoString Technologies, Seattle) from the same tissue sample. A minimal spanning tree was constructed based on graph nodes representing univariate prognostic features by adding edges representing bivariate prognostic features. Results: The edges of the prognostic network linking IHC with gene expression features comprise: (1) a large distance from CD163+ to CD3+CD8- cells and low MAGEC2 expression indicates low BCR risk, and (2) a small distance from CD34+ to CD163+ cells in stroma and high TICAM2 expression indicates a high BCR risk. The topological center of the graph shows that a high MAGEC2 and high LGALS3 expression indicates a high BCR risk. Conclusion: By combining tissue phenomics with gene expression measurements we provide spatial context to a prognostic bivariate network model. This network is a novel, concise representation of factors driving disease progression, complementing graphical protein interaction models known from systems immunology.

Keywords: Digital pathology, prognostic models, prostate cancer, systems immunology, tissue phenomics


The accurate identification of prostate cancer patients with prostate specific antigen (PSA) biochemical recurrence (BCR) after radical prostatectomy is unsolved in oncology. In particular, the established Gleason grading provides limited prognostic value for low and intermediate risk patients.[1] Motivated by recent studies on the role of the immune contexture for cancer progression,[2],[3],[4] we investigated the prognostic potential of the spatial distribution of various immune cell populations combined with immune system-focused mRNA gene expression profiling. The discovered univariate and bivariate prognostic features are visualized as a prognostic graph offering a dual representation to known gene expression networks.[5]


Data collection: Data was collected from a cohort of 23 prostate cancer patients characterized by Gleason-score 6-9, pT2, age<=80 years. In 12 patients, BCR was observed. From the resected prostate of the patient, one FFPE tissue block comprising tumor tissue was generated for this study. The block was sectioned for immunohistochemistry (IHC) and mRNA gene expression analysis.

IHC-based analysis: From each tissue block, we processed three consecutive sections with dual stains: CD3/CD8 (T-cells/cytotoxic T-cells), CD68/CD163 (all-macrophages/M2-polarised macrophages), and CK18/p63 (epithelial/basal cells). Normal prostate glands are characterized by CK18-positive luminal cells with at least one p63-positive adjacent basal cell. All other CK18-positive cells are considered “tumor”. One additional fourth section was processed with CD34 (endothelial cells). As described by Harder et al.,[6] we used a slide-specific, highly auto-adaptive machine learning approach to segment individual cells.[7] Image coregistration[8] of all four sections enabled the computation of virtual multiplexed cell density heatmaps. For subsequent data mining we computed potentially prognostic features such as densities, ratios, and distances of stain-positive cells in specific regions-of-interest (ROIs). The relevant ROIs were: (1) tumor, as defined above; (2) stroma within a predefined distance to the tumor, defined by a threshold-based segmentation of stroma using a distance-from-tumor map with a 150μm threshold; and (3) other stroma, which is tissue excluding normal glands. In total, 462 features were computed.

Gene expression analysis: We processed within the tumor region tissue from a fifth section using the NanoString nCounter PanCancer Immune Profiling Panel (NanoString Technologies, Seattle). The NanoString software provided quantitative measurements for 730 genes with potentially prognostic value.

Feature normalization: To integrate the IHC-based features and the gene expression features into a joint feature vector for each patient, we normalized all features by quantile normalization to the range [0%…100%].
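As a minimal sketch, one plausible reading of “quantile normalization to the range [0%…100%]” is mapping each raw value to its percentile rank within the cohort; the exact procedure used by the authors may differ.

```python
def to_percentile(values):
    """Map each raw feature value to its percentile rank in [0, 100] across
    the cohort. This is one plausible reading of 'quantile normalization to
    [0%...100%]'; ties take the rank of their first sorted occurrence."""
    order = sorted(values)
    n = len(values)
    return [100.0 * order.index(v) / (n - 1) for v in values]

# Invented feature values for five patients:
normalized = to_percentile([5.0, 1.0, 3.0, 9.0, 7.0])
# -> [50.0, 0.0, 25.0, 100.0, 75.0]
```

After such a rescaling, IHC-based features and gene counts share a common scale, which is what makes their fuzzy-logical combination (below) meaningful.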

Univariate feature selection: For every IHC-based and gene expression feature, and every potential cut point within a quantile range of 40% to 60%, we computed the negative logarithm (base 10) of the Kaplan–Meier log-rank test p-value (log-p-value) using the biochemical recurrence (BCR) data. Those features with a mean log-p-value greater than -log10(0.05) were considered prognostic.

Bivariate feature selection: We combined all significant univariate features with all available features. To combine the feature f1 with feature f2 into a single (potentially prognostic) feature f12, we chose four (fuzzy) logical combinations: f12 = |c1 – f1| × |c2 – f2| with c1, c2 = {0, 1}. The selection of significant combinations was equivalent to the univariate feature selection, with the additional requirement that the log-p-value of the combination f12 is at least a factor 2 larger than the maximum log-p-value from the univariate analysis of f1 or f2. To further reduce the number of potentially prognostic features, we used graph theory to construct a minimal spanning tree (MST) using a modified version of Prim's algorithm[9] and the bivariate log-p-values as edge weights. The MST was extended with features having a log-p-value greater than the 75% quantile of all bivariate features.
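The bivariate combination rule and a Prim-style tree construction can be sketched as follows. Feature values are assumed rescaled to [0, 1] so that c1, c2 ∈ {0, 1} act as fuzzy negations; the toy edge weights are invented, and the authors' modified Prim variant may differ from this plain version.

```python
def bivariate_combinations(f1, f2):
    """The four fuzzy-logical combinations f12 = |c1 - f1| * |c2 - f2|,
    c1, c2 in {0, 1}, for two features rescaled to [0, 1]."""
    return {(c1, c2): abs(c1 - f1) * abs(c2 - f2)
            for c1 in (0, 1) for c2 in (0, 1)}

def prim_mst(nodes, weight):
    """Plain Prim's algorithm on a complete graph; 'weight' maps a frozenset
    node pair to an edge cost. To favor strong bivariate features, a
    log-p-value would be negated before being used as a cost."""
    visited = {nodes[0]}
    tree = []
    while len(visited) < len(nodes):
        u, v = min(((a, b) for a in visited for b in nodes if b not in visited),
                   key=lambda e: weight[frozenset(e)])
        tree.append((u, v))
        visited.add(v)
    return tree
```

For example, `bivariate_combinations(0.2, 0.7)` yields 0.14 for (c1, c2) = (0, 0) and 0.24 for (1, 1), i.e., the four ways of pairing each feature with its fuzzy negation.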


Univariate features predicting PSA recurrence: Using the method described above, we discovered 3 phenomics (IHC-based) and 29 gene expression features which provide significant prognostic power [[Table 1]-left]. The phenomics features are related to the distance of CD163(+) cells to CD3(+) or CD34(+) cells. The gene expression features relate to various immune system components, including the complement system (C4B).
Table 1: List of discovered prostate specific antigen BCR prognostic features in prostate cancer


Bivariate features predicting PSA recurrence: The combination of the univariate features with all features resulted in a set of 9746 significant features. Using the minimal spanning tree approach, this set was reduced to a network of 22 bivariate features [[Table 1]-right] which provide high prognostic value for BCR prediction. The network is shown in [Figure 1].
Figure 1: Extended minimal spanning tree of bivariate features predicting prostate cancer PSA recurrence after radical prostatectomy. Node size is proportional to –log10 (univariate log-rank test P-value), the edge size is proportional to –log10 (bivariate log-rank test P-value). A missing arrow head represents a c1 (or c2) = 1 in the bivariate combination of two features



By combining tissue phenomics with gene expression measurements we provide spatial context to a prognostic bivariate network model. This network is a novel, concise representation of factors driving disease progression, complementing graphical protein interaction models known from systems immunology. Although some of the discovered univariate features, such as MAGEC2[10] and LGALS3,[11] are known to be prognostic in prostate cancer, and other features, such as PTGDR2[12] and DDX43,[13] have been published for other cancers, our method provides a unique overview of multiple prognostic factors, emphasizing a network topology which puts MAGEC2 in the center, with LGALS3, CD47 and C4B as strong complementary co-factors. In future work we will investigate the prognostic potential of IHC biomarkers related to the most interesting gene expression features mentioned above, and validate all results using an independent patient cohort from another clinical site.

Financial support and sponsorship

The research reported in this publication was supported by Definiens.

Conflicts of interest

HH has been part time consultant to Definiens.


  1. Swanson GP, Basler JW. Prognostic factors for failure after prostatectomy. J Cancer 2010;2:1-9.
  2. Fridman WH, Pagès F, Sautès-Fridman C, Galon J. The immune contexture in human tumours: Impact on clinical outcome. Nat Rev Cancer 2012;12:298-306.
  3. Kärjä V, Aaltomaa S, Lipponen P, Isotalo T, Talja M, Mokka R, et al. Tumour-infiltrating lymphocytes: A prognostic factor of PSA-free survival in patients with local prostate carcinoma treated by radical prostatectomy. Anticancer Res 2005;25:4435-8.
  4. Lanciotti M, Masieri L, Raspollini MR, Minervini A, Mari A, Comito G, et al. The role of M1 and M2 macrophages in prostate cancer in relation to extracapsular tumor extension and biochemical recurrence after radical prostatectomy. Biomed Res Int 2014;2014:486798.
  5. Khosravi P, Gazestani VH, Akbarzadeh M, Mirkhalaf S, Sadeghi M, Goliaei B, et al. Comparative analysis of prostate cancer gene regulatory networks via hub type variation. Avicenna J Med Biotechnol 2015;7:8-15.
  6. Harder N, Athelogou M, Hessel H, Brieu N, Yigitsoy M, Zimmermann J, et al. Tissue phenomics for prognostic biomarker discovery in low- and intermediate-risk prostate cancer. Sci Rep 2018;8:4470.
  7. Brieu N, Schmidt G. Learning Size Adaptive Local Maxima Selection for Robust Nuclei Detection in Histopathology Images. IEEE International Symposium on Biomedical Imaging (ISBI); 2017.p. 937-41.
  8. Yigitsoy M, Schmidt G. Hierarchical Patch-Based Co-Registration of Differently Stained Histopathology Slides. Medical Imaging 2017; Digital Pathology, Proc. SPIE 2017 1014009-1014009–6.
  9. Prim RC. Shortest connection networks and some generalizations. Bell Syst Tech J 1957;36:1389-401.
  10. von Boehmer L, Keller L, Mortezavi A, Provenzano M, Sais G, Hermanns T, et al. MAGE-C2/CT10 protein expression is an independent predictor of recurrence in prostate cancer. PLoS One 2011;6:e21366.
  11. Wang Y, Nangia-Makker P, Tait L, Balan V, Hogan V, Pienta KJ, et al. Regulation of prostate cancer progression by galectin-3. Am J Pathol 2009;174:1515-23.
  12. Jandl K, Heinemann A. The therapeutic potential of CRTH2/DP2 beyond allergy and asthma. Prostaglandins Other Lipid Mediat 2017;133:42-8.
  13. Abdel-Fatah TM, McArdle SE, Johnson C, Moseley PM, Ball GR, Pockley AG, et al. HAGE (DDX43) is a biomarker for poor prognosis and a predictor of chemotherapy response in breast cancer. Br J Cancer 2014;110:2450-61.

   New Cytomine Modules for User Behavior Analytics in Digital Pathology Top

Raphaël Maree1, Laurent Vanhee1, Ulysse Rubens1, Renaud Hoyoux2, Alodie Weatherspoon3, Laurence Pesesse3, Pascale Quatresooz3, Sylvie Multon3, Valérie Defaweux3

1Montefiore Institute, University of Liège, Liège, Belgium, 2Cytomine SCRL FS, Seraing, Belgium, 3Human Histology Service, Faculty of Medicine, University of Liège, Liège, Belgium. E-mail: [email protected]


Our aim is to provide new open source tools for collecting and assessing user activities in digital pathology. We implemented new modules in the Cytomine web-based software and validated our developments on a large database of student activities from an online course in histology.

Keywords: Digital pathology, learning analytics, machine learning, open source, user behavior


In-depth behavioral analysis of users of digital pathology software can provide useful insights in education and diagnostic settings. In education, it might help us better understand and improve how students learn histology and cytology concepts. In diagnostics, it might help in designing more efficient recognition algorithms based on experienced pathologists' browsing patterns. While other works have analyzed viewing patterns in digital pathology,[1],[2],[3],[4] none of them provides reusable, open-source software. In this work, we implemented open-source modules and we illustrate their potential for analyzing the behavior of undergraduate medical students in their practical histology course.


Cytomine[5] is open-source software for digital pathology that has been continuously developed since 2010. It is based on modern web and distributed software development methodologies and machine/deep learning. It integrates tens of open-source libraries into a user-friendly rich internet application. Cytomine provides remote and collaborative features so that users can readily and securely share their whole-slide images and related metadata worldwide. It relies on data models that make it easy to organize and semantically annotate imaging datasets in a standardized way (e.g., to build pathology atlases for training courses or ground-truth datasets for machine learning). It efficiently supports digital slides produced by most scanner vendors. It provides mechanisms to proofread and share image quantifications produced by machine/deep learning-based algorithms. Cytomine can be used free of charge and is distributed under a permissive license. It has been installed at various institutes worldwide and is used by thousands of users in research and education settings.

In this work, we extended the Cytomine web-based software for user behavior analytics with new modules for data acquisition, data processing, and data analysis.

First, new modules for user data acquisition were developed and integrated into the Cytomine-Core component. These new components gather online user actions (such as user positions in whole-slide images, user clicks on reference annotations, etc.) in a NoSQL database (MongoDB). In addition, we developed new RESTful web services that allow these data to be exported in a standardized format (JSON) using regular HTTP requests. It is, for example, possible to extract in a single request all positions (with (x,y) coordinates, zoom level, and timestamps) for a given user in a given whole-slide image within a certain period of time.
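As an illustration of assembling such an export request, the snippet below merely builds a GET URL. The route and parameter names are hypothetical stand-ins, not the actual Cytomine API; the real endpoint and parameters are defined in the Cytomine REST documentation.

```python
from urllib.parse import urlencode

def positions_export_url(base, image_id, user_id, t_start_ms, t_end_ms):
    """Assemble a GET URL for exporting one user's positions in one image
    as JSON. The path and parameter names here are purely illustrative --
    consult the Cytomine REST API documentation for the real route."""
    query = urlencode({"user": user_id,
                       "afterThan": t_start_ms,
                       "beforeThan": t_end_ms})
    return "{}/api/imageinstance/{}/positions.json?{}".format(
        base, image_id, query)

# Hypothetical server, image, user, and millisecond timestamps:
url = positions_export_url("https://demo.cytomine.example", 101, 42,
                           1500000000000, 1500003600000)
```

The resulting URL could then be fetched with any HTTP client; the JSON response would carry the (x,y) coordinates, zoom level, and timestamp of each recorded position.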

Second, new Python data processing modules have been developed to transform these raw data and automatically generate heat maps and gaze plots [Figure 1], as well as tens of behavioral features for a set of users and a series of whole-slide images. This module generates a tabular file (one line per user) that can be linked to additional data (e.g., student final grades in an education setting, or the diagnosis in a clinical setting) for further statistical analysis (see below). Behavioral features include, for each user: the number of images visited; the number of user positions (total, median, average) at each zoom level for each image; the viewing time (total, median, average) for each image; the number of annotation actions (user clicks on a reference annotation: total, median, average) for each image; annotation and image scores (an estimation of observation for each annotation and image); statistics on the hours of use of the software (daily or nightly); and performance scores related to the way students explore histological sections, i.e., whether they follow a pedagogical course. Indeed, several histological whole-slide images can be browsed: each of the structures to be observed is highlighted by numbered markers. These reference annotations, to be followed as a “treasure hunt”, are associated with questions and answers, illustrated with drawings or photos as previously described.[6] Features encoding performance scores were based on the regularized number of positions close to the reference annotations and were weighted according to the distance to each reference annotation.
Figure 1: Overview of the new Cytomine modules for the analysis of online user activities

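One of the simplest behavioral features described above, viewing time per image, could be derived from position timestamps along these lines. This is a sketch with an assumed 30-second break threshold, not the authors' implementation.

```python
def viewing_time(timestamps, max_gap=30.0):
    """Approximate viewing time (seconds) for one user on one image from a
    sorted list of position timestamps. Gaps longer than max_gap are
    treated as breaks; the 30 s threshold is an assumption for this
    sketch, not a value from the paper."""
    total = 0.0
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap <= max_gap:
            total += gap
    return total

# Invented timestamps: three positions in quick succession, a long break,
# then two more positions.
t = viewing_time([0, 10, 20, 200, 210])  # 30.0 seconds
```

Per-zoom-level position counts and annotation-click counts follow the same pattern: group the raw events by user and image, then aggregate.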

Third, for data analysis, we integrated tree-based machine learning algorithms (more specifically, we used extremely randomized regression forests[7] from scikit-learn[8]). We propose to use their variable importance rankings to enable the identification of the most prominent features (among the whole set of behavioral features described above) with respect to any relevant outputs.


All our new developments are distributed under a permissive open-source license and can be used free of charge. These developments potentially allow several questions to be answered by summarizing and visualizing behavioral data in different ways [see [Figure 1]], and by providing statistical and machine learning tools to assess correlations between user image exploration patterns and relevant outputs (e.g., exam grades or correctness of a diagnosis).

To validate and illustrate our developments, we applied our new tools a posteriori to raw data from a practical histology course given to first-year medical students of the Faculty of Medicine (University of Liège). This course was broadcast as a massive open online course (the “Introduction to Histology” MOOC hosted on the France Université Numérique platform). While an in-depth study of all user behaviors is beyond the scope of this technical paper, our aim here is to illustrate the potential of our new tools to study a posteriori the correlation between user viewing patterns on the Cytomine platform and final certification exam success. The final grade was obtained from 7 different exercises, including multiple-choice and written exercises. Such an analysis might allow better characterization of the user viewing patterns that contribute to better grades, hence providing guidelines to future users (students and teachers). Among the 5220 registered users of the MOOC, we analyzed data from the 395 first-year medical students who passed the exams (between February and June 2017). Over the semester, the students were given the task of browsing through a set of 78 whole-slide images where they were invited to follow predefined learning paths (through reference annotations created by teachers to learn general histological concepts, including the diagnosis of different cells and tissues).

Overall, there were over tens of millions positions recorded in Cytomine core database as well as tens of thousands actions performed by all the users. In this preliminary study, for each user, we applied our data processing tool to generate 2651 behavioral input features (computed for the whole semester period and also for different time ranges, as described above). The resulting data table (395 users x 2651 variables) was injected into tree-based machine learning algorithms and input variables were ranked according to their importance to predict final grades (output variables). As for the behavioral features, we observed that the most impactful ones are often the performance scores that roughly correspond to the fact that a user actually followed the predefined paths. In addition, we tried to predict users' final grades based on how they explored whole-slide images, similarly to (Walkowski, 2015). To assess grade prediction error, we followed a leave-one-student-out protocol: a prediction model is learned using the data from all students except one and then the prediction of this model is evaluated on the remaining student. We repeat this procedure for all students and compute the average prediction error. Results were compared to a baseline “median model”, ie., a model that predicts the observed median grade (in each training set of the leave-one-out protocol) for each student. For example, with the global grade (the weighted sum of all the grades obtained for each question given in the exam), the tree-based model (using 1000 fully developed trees and the maximum number of features tested at each tree node) returned a mean absolute error of 2.02 between predicted and original grades. It improves the “median model” (mean absolute error of 2.37) but predicting grades remains a difficult task [see [Figure 2] to visualize the distribution of actual and predicted grades].
Figure 2: Results of the leave-one-out-procedure for the prediction of global grades

Click here to view


In this work, we implemented new modules for user behavior analytics into Cytomine open-source software. We illustrated their potential on actual data from an online course in histology. Our proposed tools can be used in various ways to summarize and analyze whole-slide viewing patterns. Here, we provided preliminary results in the context of a correlation analysis between viewing patterns and final grades, but we believe many other questions of educational interest and other statistical analyses could be investigated in the future. Our developments are the first attempt to make such behavioral data available and analyzable using standard and open tools. In the future, we believe our new developments could help users (teachers, students, lab managers, technicians, pathologists,...) to keep track of their work and could ease to provide them useful feedback. Overall, we hope our new tools that can be installed freely at any site (university, hospital, research center, company) will pave the way to improvments in teaching methods in digital pathology. These modules might also be useful in other contexts such as in diagnostic settings where studying the viewing patterns of experienced pathologists might help to design better computer-assisted diagnostic methods.

Competing interests

R.M. is co-founder of the not-for-profit cooperative company Cytomine SCRL FS.


  1. Christensen PA, Lee NE, Thrall MJ, Powell SZ, Chevez-Barrios P, Long SW, et al. An open source, whole slide image-based pathology education system. J Pathol Inform 2017;8:10.
  2. Shin D, Kovalenko M, Ersoy I, Li Y, Doll D, Shyu CR, et al. PathEdEx - uncovering high-explanatory visual diagnostics heuristics using digital pathology and multiscale gaze data. J Pathol Inform 2017;8:29.
  3. Walkowski S, Lundin M, Szymas J, Lundin J. Exploring viewing behavior data from whole slide images to predict correctness of students' answers during practical exams in oral pathology. J Pathol Inform 2015;6:28.
  4. Roa-Peña L, Gómez F, Romero E. An experimental study of pathologist's navigation patterns in virtual microscopy. Diagn Pathol 2010;5:71.
  5. Marée R, Rollus L, Stévens B, Hoyoux R, Louppe G, Vandaele R, et al. Collaborative analysis of multi-gigapixel imaging data using cytomine. Bioinformatics 2016;32:1395-401.
  6. Multon S, Weatherspoon A, Schaffer P, Quatresooz P, Defaweux V. Practical histology in tune with the times. Med Educ 2015;49:1166-7.
  7. Geurts P. Ernst D. Wehenkel L. Extremely randomized trees. Mach Learn 2006;63:3.
  8. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, GriselO, et al. Scikit-learn: Machine Learning in Python. Journal of Machine Learning Research 2011;12:2825-30.

   Validation of Automated H-Score Wholeslide Analysis of CEA-Expressing Tumor Cells in IHC Brightfield Imaging Top

Auranuch Lorsakul1, Yao Nie1, Emilia Andersson2, Irina Klaman2, Oliver Grimm2

1Roche Tissue Diagnostics, Imaging and Algorithms, Digital Pathology, Santa Clara, CA, USA, 2Roche Innovation Center Munich, Penzberg, Germany. E-mail: [email protected]


Pathologists' visual scoring of carcinoembryonic antigen (CEA) based on immunohistochemistry (IHC) is costly, subjective, and produces ordinal rather than continuously variable scores. To overcome these limitations, we developed a fully automated image-analysis solution to report H-Scores for CEA-expressing tumors in wholeslide IHC-brightfield images. The method was validated by comparing to three pathologists' visual scoring in 110 wholeslide images. The pathologists' mean score was highly correlated with the score produced by the algorithm (R^2=0.81, CCC=0.90 and R^2=0.94, CCC=0.96), indicating that the proposed solution accurately classified carcinomatous cells within IHC-slide images.

Keywords: Automated image analysis, cancer, computer vision, immunohistochemistry, machine learning


Carcinoembryonic antigen (CEA) based on Immunohistochemistry (IHC) is currently assessed semi-quantitatively by pathologists in clinical studies for immunotherapy. Typically, IHC results can be evaluated by a semiquantitative approach used to assign an H-score (or “histo” score) to tumor samples. However, this visual scoring of CEA on IHC is costly, inherently subjective, and produces ordinal rather than continuously variable scores.

To increase opportunities for developing companion diagnostics and overcoming these limitations, we proposed:

  1. to develop a fully-automated wholeslide image-analysis solution to report H-Scores for CEA-expressing tumors in wholeslide IHC-brightfield images; and
  2. to validate the proposed automated H-Scores analysis by comparing the results to visual scorings performed by three pathologists on 110 wholeslide images.


In the algorithm development, a total of 66 tissue slides (2.5-μm FFPE) of human carcinoma tissues were included: 2×primary colorectal cancer (CRC), 16×metastatic colorectal cancer, 16×primary pancreas cancer (PaC), 7×metastatic pancreas cancer, and 25×primary non-small cell lung cancer (NSCLC), as shown in [Figure 1].
Figure 1: Example of tissue slides (2.5-μm FFPE) of human carcinoma tissues including metastatic colorectal cancer, primary pancreas cancer, and primary nonsmall cell lung cancer. The slides were stained and detected for CEA (Clone: CEA31, VENTANA#7604594, Lot-independent) using VENTANA ultraView DAB-detection systems, digitized on a VENTANA iScan HT scanner, and processed using automated image analysis

Click here to view

Tissue slides were stained to detect CEA expressing cells [Clone: CEA31, VENTANA#7604594, Lot-independent] using a DAB-detection system (VENTANA ultraView). Subsequently, the tissue slides were scanned using a whole-slide scanner (VENTANA iScan HT) and were processed using automated image analysis. A total of 423 fields-of-view (FOV) extracted from the total of 66 slides were used; algorithm development comprised the following steps:

  1. unmix stains to separate DAB and hematoxylin,[1]
  2. detect and individually identify tumor cells using computer-vision and supervised machine-learning methods,[2],[3],[4]
  3. automatically group the detected tumor cells into CEA-intensity categories: 1) CEACAM5+high, 2) CEACAM5+medium, 3) CEACAM5+low, and 4) CEACAM5-negative. The different intensities' discriminatory thresholds were determined by ROC studies on annotated ground-truth regions.
  4. report automated wholeslide H-scores based on percentage of tumor cells stained at different intensities for wholeslide readouts, together with heatmaps, intensity histograms, and tumor cell locations.

To evaluate the accuracy, the automated image-analysis results were quantitatively compared to three pathologists' scoring within the annotated tumor regions.


The algorithm was verified at the cell-by-cell level at 20X magnification (0.465 μm/pixel) using a total of 166 FOVs from a total of 66 development slides of primary CRC, PaC, NSCLC and metastatic CRC and PaC in liver and lung. The correlation of automated cell counts with pathologists' ground truth yielded R^2=0.94 (CRC/PaC) and 0.96 (NSCLC). Lin's concordance correlation coefficient (CCC) was 0.95 and 0.94 with 62,368 (algorithm) and 77,016 (ground-truth) cell counts, respectively.

For gene-copy verification, two datasets of a total of 16 tumor cell lines on tissue microarray and wholeslide images with known CEA-copy and mRNA numbers were included. The scatter-corrected DAB intensity measured by algorithm correlated well to gene-copy numbers (R^2=0.75, p < 0.001). We observed a linear relationship between the CEA intensity and the gene-copy numbers; however, the saturation of the intensity measurement was observed at the copy numbers that were greater than 62,000.

For the reproducibility study, Kolmogorov–Smirnov tests showed good intensity measurement reproducibility using 6-paired cell-line slides from different pellet blocks stained under the same conditions (p < 0.005).

For the wholeslide validation, a total of 110 slides were analyzed by the proposed automated algorithm and compared against two pathologists on CRC/PaC, and one pathologist on NSCLC, respectively. The CRC/PaC scoring was provided independently under microscope on glass slides and on a screen, whereas the NSCLC scoring was provided only on a screen. For CRC/PaC, the inter-observer correlation of pathologists' scores yielded R^2=0.83, CCC=0.81, although the scoring of the CEA expression on screen was generally slightly higher than the glass-slide scoring. The correlation of pathologists' mean score with the algorithm yielded R^2=0.81, CCC=0.90 (CRC/PaC) and R^2=0.94, CCC=0.96 (NSCLC), indicating that the proposed solution accurately classified carcinomatous cells within IHC-slide images, as illustrated in [Figure 2].
Figure 2: The example of the wholeslide analysis to automatically identify and group the detected tumor clls toCEA-intensity categories: (1) CEACAM5 + high, (2) CEACAM5 + medium, (3) CEACAM5 + low, and (4) CEACAM5-nagative, as shown the colors in red, yellow, green, and magenta, respectively. The automated wholeslide H-scores based on percentage of cells stained at different intensities for Wholeslide readouts were reported

Click here to view


We demonstrated that our automated image-analysis method can produce H-Scores essentially equivalent to visual evaluation by pathologists. We anticipate that automated assessment on brightfield-IHC slides can be a viable option for CEA-targeted diagnostics.


  1. Ruifrok AC, Johnston DA. Quantification of histochemical staining by color deconvolution. J Chem Inf Model 2013;53:1689-99.
  2. Parvin B1, Yang Q, Han J, Chang H, Rydberg B, Barcellos-Hoff MH. Iterative voting for inference of structural saliency and characterization of subcellular events. IEEE Trans. Image Process 2007;16:615-23.
  3. Nguyen K, Bredno J, Knowles D. Using contextual information to classify nuclei in histology images. Proc Int Symp Biomed Imaging 2015:995-8.
  4. Lorsakul A, Andersson E, Vega Harring, Bredno J. Automated Wholeslide Analysis of Multiplex-Brightfield IHC Images for Cancer Cells and Carcinoma-Associated Fibroblasts. Proceedings of SPIE, 10140, Medical Imaging 2017: Digital Pathology, 1014007; 01 March 2017.

   Detection and Grading of Ductal Carcinoma In Situ by Using Structural Features Top

Germán Corredor1,2, Cristian Barrera1,2, Paula Toro1, Ricardo Moncayo1, Charlems Alvarez-Jimenez1, Hannah Gilmore3, Anant Madabhushi2, Eduardo Romero1

1Computer Imaging and Medical Applications Lab, Department of Medical Imaging, Universidad Nacional De Colombia, Bogotá, Colombia, 2Biomedical Engineering Department, Case Western Reserve University, Cleveland, Ohio, USA, 3Department of Pathology, University Hospitals Cleveland Medical Center, Cleveland, Ohio. E-mail: [email protected]


Breast cancer is a leading cause of women mortality worldwide. Ductal carcinoma in situ (DCIS) is the most common non-invasive breast cancer type. Although DCIS is not life-threatening, this is an early cancer stage with a high risk of developing actual invasive breast cancer. Unfortunately, DCIS comprises a heterogeneous group of lesions with highly variable morphology, biomarker expression, genomic profile, and natural progression. In consequence, discrimination between DCIS and other breast lesions is challenging because of the high inter-observer variability. This work introduces an automatic method that exploits three scales of information to characterize breast histopathology images with DCIS: 1) local information, obtained from each nucleus itself (size, shape, color, texture, etc.), 2) regional information, extracted from the nuclei surrounding the nucleus (nuclei density, variance of neighbors, etc.), and 3) global information, computed from the grade of grouping of image nuclei. This information is then used to classify each image as DCSI or non-DCIS. In addition, this method can differentiate the DCIS grade (low, moderate, or high). The method was evaluated using 1102 fields of view of 1024x1024 at 40x extracted from 28 different cases (non-DCIS =400, low grade DCIS=106, moderate grade=251, and high grade=345). Using a 10-fold cross-validation scheme, splitting the dataset randomly into 70/30, a trained Gradient Boosted Regression Trees classifier yielded an accuracy of 95% when differentiating DCIS and non-DCIS, and 91%, 88%, and 94% when identifying low moderate, and high-grade DCIS. These automatic approaches might provide pathologists with objective and quantitative tools that facilitate decision making and treatment planning.

Keywords: Breast cancer, digital pathology, ductal carcinoma in situ, machine learning


Breast cancer comprises several kinds of lesions with different severity grades. From such lesions, ductal carcinoma in situ (DCIS) is the most common non-invasive breast cancer type. In this case, tumor cells are still located in the tissue of origin (the milk ducts) and have not spread into any surrounding tissue.[1] Although DCIS is not life-threatening, DCIS is synonymous of a high risk of developing invasive carcinoma and these patients may require additional surveillance, prevention, or treatment to reduce their risks. For this reason, early detection results is crucial in these cases.[2]

Unfortunately, detection of DCIS is challenging since this is described as a set of lesions with highly variable morphology, biomarker expression, genomic profile, and natural progression.[3] Usually, DCIS is categorized into three grades: low, moderate, and high. Low grade DCIS contains cancerous cells that look very similar to normal or atypical ductal hyperplasic cells. Moderate grade lesions contain cancerous cells slightly different from normal cells. High grade DCIS is characterized by well-differentiated and fast-growing cancerous cells.[1] Previous studies have revealed low levels of agreement among experts when analyzing DCIS lesions,[2],[3] a definite issue in clinical practice. Misclassification of breast lesions may lead to over/under treatments of lesions identified during breast screening.[2] In this context, automatic measures may contribute to discrimination between breast lesions.

This work presents an automatic strategy that classifies microscopic Field of Views (FoVs) extracted from breast histology images into two classes: DCIS and non-DCIS. Furthermore, this strategy also classifies DCIS lesions into 3 grades. The presented approach characterizes each FoV using different nuclear features and their context. Each nucleus is threefold characterized by its own morphological properties (size, shape, color, texture, etc.), by its neighbor nuclei features within a determined radius, and by its distance to other image nuclei. Unlike other state-of-the-art methods, any feature in this approach exploits nuclei relative information, i.e., each nucleus is not only characterized by its own information but also by how that nucleus feature is with respect to the surrounding nuclei. This method shows a good classification performance as well as fast training times and needs no large annotated datasets.


The pathology semiology is based on identifying abnormalities in terms of color, shapes, sizes, textures, and spatial arrangement of present structures at different scales.[4],[5] Inspired by this observation, the underlying idea behind the present work is that after nuclei are automatically segmented from breast tissue images, different nuclei-based features characterize such images, as illustrated in [Figure 1].
Figure 1: Illustration of the feature extraction process of histopathological images of ductal carcinoma in situ (DCIS). Whole slide images (WSIs) is manually annotated by an expert pathologist. A set of field of view (FoV) is extracted from the annotated areas of WSIs. Once nuclei is detected, each nucleus is threefold characterized by local, regional and cellularity features.

Click here to view

Nuclei segmentation

Automatic nuclei segmentation is performed by a watershed-based algorithm.[6] This method applies a set of mathematical operations at different scales to identify candidate nuclei in Hematoxylin-Eosin (H&E) stained images. This method was selected by its visual efficacy, simplicity, speed, and absence of adjustable parameters.

Local features and regional features

Previous works have shown that nuclear morphological features are useful to characterize DCIS.[2] For this reason, after nuclei are segmented, a set of low-level features are extracted from them, including characteristics of shape (Zernike moments, ratio between axes, etc.), texture (Haralick features like homogeneity, correlation, entropy, etc.), and color (mean intensity, mean red, etc.). This set of local features is used to characterize each nucleus.

Besides nuclear local features, pathologists also examine the nuclei context/neighborhood looking for particular patterns. Different approaches have used graph-based techniques to characterize nuclei neighborhoods;[7],[8],[9] however, these features only take into account the spatial distribution and ignore the variability of other features. For this reason, for each image nucleus, a set of circles with incremental radii of k=dL*10, dL*20, dL*30 pixels were placed at the nucleus center (dL=20 pixels, the average diameter of all the detected nuclei). Different radii were used aiming to model a multi-scale approach. Finally, a set of regional features are computed within each circle and used to characterize the nucleus. These features aim to measure the neighborhood density and variations in color, shape, and texture.

Once nuclei are characterized, each feature (local and regional) is represented by a histogram to characterize the FoV. For this purpose, a dynamic feature range is set between the maximum and minimum values found along the whole set of FoVs. The dynamic range is divided into ten intervals and the bins of the histogram are constructed as the number of occurrences within each of the particular intervals. The final histogram is normalized thereby obtaining a probability distribution function.

Cellularity features

Since cancer is characterized by an uncontrolled proliferation of cells, features related to the number of nuclei and their grouping grade were also included. This grouping index was computed as follows: a fully-connected graph is built using nuclei as nodes and the inverse of the Euclidean distance between nuclei is set as edge weights. The grouping index for each node is computed as the sum of the weights of every edge connected to such a node.[10] A high/low value means that such particular nucleus is close/far to other nuclei. Finally, the FoV was also characterized by different statistical measures of such grouping grade (mean, median, mode, etc.).


A group of 1102 FoVs (1024x1024 pixels) were extracted from a set of H&E breast histology samples from 28 different patients. The cases were acquired from Indiana Hospital and scanned into WSIs at University Hospitals in Cleveland - Ohio using Aperio and Philips scanners. Each FoV was automatically extracted from a set of manual annotations carried out by an expert pathologist. The final set comprises 400 non-DCIS, 106 low DCIS grade, 251 moderate DCIS grade, and 345 high DCIS grade FoVs.

Experimental setup

Two experiments were carried out. The first experiment attempted to classify between DCIS and non-DCIS. A 10-fold cross validation scheme was used. At each iteration, 70% of the whole set of FoVs was used to train a Gradient Boosted Regression Trees (GBRT) classifier[11] and 30% was used to test the trained model. Finally, the measured performances along the 10 iterations were averaged. The second experiment aimed to distinguish between the different DCIS grades; this experiment followed a methodology similar to the first experiment, but in this case only the images labeled by the pathologist as DCIS were used. The presented method was compared with two approaches: The former uses only morphological features and the latter combines morphological and graph-based (Voronoi, Delaunay, etc.) features.[7] For evaluation, three metrics were used, namely area under the receiver operator characteristic (ROC) curve, accuracy and F-measure. Briefly, accuracy corresponds to the number of correctly predicted data points from of all the existent data points; area under the ROC curve is equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one; and F-measure measures the balance between precision (the number of correct positive results divided by the number of all positive results returned by the classifier) and recall (the number of correct positive results divided by the number of all the existent positive samples).


All experiments were performed according to experimental setup, previously described. [Figure 2]a. illustrates the average receiver operator characteristic curves of the predictions using the three different approaches, and [Figure 2]b. presents the corresponding accuracies and f-measures for both experiments. Results demonstrate that the presented approach outperforms the baseline approaches in all the tested scenarios.
Figure 2: (a) The average ROC curves for correctly identifying DCIS and DCIS grades in the test set of FOVs using the introduced approach and the comparative strategies (Morphological only features and morphological + graph-based features). To generate an adequate precision of the ROC curves, 100 repeats of 10-fold cross-validation were run. (b) Performance metrics of the tested approaches. First row presents the results of an approach that classifies FoVs based just on morphological features. Second row shows the results of a strategy that uses the morphological and graph-based features reported in.[7] The third row represents the results of the introduced approach that uses local, regional, and cellularity features. Blue-shaded cells correspond to the first, and yellow-shaded cells correspond to the second

Click here to view


In this work, an automatic method to characterize breast histopathology images with DCIS was introduced. This approach exploits different scales of information, so each nucleus is not only characterized by its local information but also by how the nucleus is with regards the nuclei surrounding it. The method was tested on a classification task in which was able to separate images between non-DCIS and DCIS with an accuracy of 95%, thereby outperforming state-of-the-art methods. Likewise, the method was also used to distinguish between different grades of DCIS, yielding an accuracy of 91%, 88%, and 94% when identifying low, moderate, and high-grade, respectively. These automatic approaches might provide pathologists with objective and quantitative tools that facilitate decision making and treatment planning. Future work will include designing a complete framework able to perform a global classification, i.e., given an field of view (FoV) the classifier will assign a definitive label (non-DCIS, low, moderate, or high). Also, the method will be tested on whole slide images (WSIs).


This work was supported by project number 40518 funded by Universidad Nacional de Colombia by means of “Hacia la implementación de un programa de medicina personalizada para el diagnóstico y tratamiento del cáncer en el Hospital Universitario Nacional (HUN) en la era del posconflicto”.

Competing interests

Dr. Madabhushi is an equity holder in Elucid Bioimaging and in Inspirata Inc. He is also a scientific advisory consultant for Inspirata Inc and also sits on its scientific advisory board. Additionally, his technology has been licensed to Elucid Bioimaging and Inspirata Inc. He is also involved in a NIH U24 grant with PathCore Inc.


  1. Ductal Carcinoma In Situ (DCIS): Available from: [Last accessed on 2018 Jan 12].
  2. Elmore JG, Longton GM, Carney PA, Geller BM, Onega T, Tosteson AN, et al. Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA 2015;313:1122-32.
  3. Ehteshami Bejnordi B, Balkenhol M, Litjens G, Holland R, Bult P, Karssemeijer N, van der Laak JA. Automated detection of DCIS in whole-slide H&E stained breast histopathology images. IEEE transactions on medical imaging. 2016;35:2141-50.
  4. Kumar V, Abbas AK, Fausto N, Aster JC. Robbins and Cotran Pathologic Basis of Disease, Professional Edition E-Book. ISBN 9781455726134: Elsevier Health Sciences; 2014.
  5. Veta M, Pluim JP, van Diest PJ, Viergever MA. Breast cancer histopathology image analysis: A review. IEEE Trans Biomed Eng 2014;61:1400-11.
  6. Veta M, van Diest PJ, Kornegoor R, Huisman A, Viergever MA, Pluim JP, et al. Automatic nuclei segmentation in H&amp;E stained breast cancer histopathology images. PLoS One 2013;8:e70221.
  7. Basavanhally A, Ganesan S, Feldman M, Shih N, Mies C, Tomaszewski J, et al. Multi-field-of-view framework for distinguishing tumor grade in ER+ breast cancer from entire histopathology slides. IEEE Trans Biomed Eng 2013;60:2089-99.
  8. Ali S, Veltri R, Epstein JA, Christudass C, Madabhushi A. Cell cluster graph for prediction of biochemical recurrence in prostate cancer patients from tissue microarrays. Proc. SPIE 8676, Medical Imaging 2013: Digital Pathology, 86760H.
  9. Lee G, Sparks R, Ali S, Shih NN, Feldman MD, Spangler E, et al. Co-occurring gland angularity in localized subgraphs: Predicting biochemical recurrence in intermediate-risk prostate cancer patients. PLoS One 2014;9:e97954.
  10. Corredor G, Romero E, Iregui M. An adaptable navigation strategy for virtual microscopy from mobile platforms. J Biomed Inform 2015;54:39-49.
  11. Becker C, Rigamonti R, Lepetit V, Fua P. Supervised feature learning for curvilinear structure segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. Berlin, Heidelberg: Springer; 2013. p. 526-33.

   Morphometric Assessment of Intratumoral Stroma Changes in Gastric Carcinoma Top

Iancu-Emil Pleşea1, 2, 3, Aurelia Glavici4, Constantin Daniel Uscatu5, Mircea Sebastian Şerbănescu6, Răzvan Mihail Pleşea3, Valentin Titus Grigorean7, Victor Dan Eugen Strâmbu7, Ion Păun8

1Department of Pathology, University of Medicine and Pharmacy “Carol Davila”, Bucharest, Romania, 2Department of Pathology, National Institute of Research-Development in the Pathology Domain and Biomedical Sciences “Victor Babes”, Bucharest, Romania, 3Doctoral School, University of Medicine and Pharmacy Craiova, Craiova, Romania, 4Department of Pathology, Emergency County Hospital, Drobeta Turnu-Severin, Oltenia, Romania, 5Department of Pathology, Emergency County Hospital, Piteşti, Romania, 6Department of Medical Informatics and Biostatistics, University of Medicine and Pharmacy, Craiova, Romania, 7Department of Surgery, University of Medicine and Pharmacy “Carol Davila, Bucharest, Romania, 8Department of Surgery, University of Medicine and Pharmacy, Craiova, Romania. E-mail: [email protected]


Objective: The authors assessed the possible correlations between different components of stromal compartment in gastric carcinomas. Methods: Four serial sections of 75 tumor areas from 59 patients with gastric carcinoma (GC) were stained with Gömöri technique, Masson's trichrome, α smooth muscle actin (ASMA) and CD34 immunomarkers to assess fibrillary constituents (FC), stromal specific cells (SSCs) and tumor stroma vascular network density (VD). Images were acquired and measurements were done with dedicated image analysis software. Statistical correlations were assessed with Pearson correlation test. Findings: Tumor stroma percentage generally varied between 10% and 40% of the tumor area. FC generally dominated the stromal architecture. Newly formed reticular fibers (Re-F) usually dominated the fibrillary compartment. ASMA+ cells dominated the cellular compartment. The vascular compartment showed great variability in the density of intratumoral network. SSCs number highly correlated with the amount of FC and especially Re-F. Re-F influenced both Re-F/Mature collagen fibers (Cl-F) ratio and the amount of FC. Consequently, the total amount of tumor stroma (TTS) correlated with FC/SSCs ratio. VD was in close correlation with the amount of vascular specific cells (CD34+) an also with the TTS and the number of FC producing cells.

Keywords: Gastric carcinoma, image analysis, quantitative assessment, tumor stroma


Gastric cancer is the fifth most common diagnostic cancer and the third most common mortality factor among malignancies worldwide,[1] the 5 year survival rate varying remarkably between geographical regions, from 25-29% in Europe and USA to 74% in Asia.[2],[3],[4] It is a multifactorial disease that displays also considerable variation in the histological pattern and degree of differentiation both between different tumors but even within the same tumor.

The tumor is more than some groups of solitary transformed cells. Epithelial tumor cells can grow only in a deviated microenvironment, consisting of altered extracellular matrix and many untransformed cells that play a role in the initiation and progression of neoplasms.[5],[6],[7],[8]

Tumor stroma represents the reaction of the connective tissue suffering various changes in contact with tumor parenchyma.[9],[10]

Depending on the expression of CD34, CD31, ASMA and high molecular weight caldesmon, in the gastrointestinal tract, in normal and pathological conditions, 5 different immunophenotypes of stromal cells can be identified, other than inflammatory cells and interstitial cells of Cajal: pericryptal fibroblasts, smooth muscle cells (SMCs), ASMA+ stromal cells, CD34+ stromal cells and endothelial cells. Angiogenesis, expressed as mean VD is closely correlated with metastasis and a poor prognosis of gastric cancer, both being considered independent prognostic factors for gastric cancer patients.[11]


The studied material consisted of 75 tumor areas from 59 patients with GC. Tumor tissue fragments were subjected to conventional histological processing techniques (fixation and paraffin embedding). Four serial sections were stained with Gömöri technique, Masson's trichrome, ASMA and CD34 immunomarkers to assess FC, SSCs and VD. Images were acquired and measurements were done with “Acquisition” and “Measure” modules respectively of analySIS Pro 5.0 image analysis software.

The assessed parameters were the amount of Cl-F, the amount of Re-F, the amount of FC, the amount of ASMA+ cells, the amount of CD34+ cells, the amount of SSCs, the TTS and the VD. For each of them, the minimum value (MIN-V), maximum value (MAX-V), mean value (AV) and standard deviation (STDEV) were calculated. Statistical correlations between all these parameters were assessed with Pearson's correlation test.
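
The per-parameter summary statistics and pairwise correlation test described above can be sketched as follows; the measurement values below are invented for illustration and do not come from the study data.

```python
# Minimal sketch (hypothetical values) of the summary statistics and
# Pearson correlation used for each stromal parameter.
from statistics import mean, stdev

def summarize(values):
    """Return MIN-V, MAX-V, AV and STDEV for one stromal parameter."""
    return {"MIN-V": min(values), "MAX-V": max(values),
            "AV": mean(values), "STDEV": stdev(values)}

def pearson_r(x, y):
    """Pearson's correlation coefficient between two parameter series."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Illustrative (invented) measurements for two parameters:
fc = [12.0, 18.5, 25.0, 30.2, 22.1]    # fibrillary constituents, % of area
tts = [15.0, 24.0, 31.0, 38.5, 27.0]   # total tumor stroma, % of area
stats = summarize(fc)
r = pearson_r(fc, tts)
```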


Tumor stroma percentage generally varied between 10% and 40% of the tumor area, a variation explained by the large number of tumors invading the gastric wall beyond the submucosal layer.

FC generally dominated the stromal architecture, with an average ratio of the two components around 2 [Figure 1]a, reflecting a somewhat balanced distribution that ranged from a slight prevalence of FC to a share twice as high as that of the SSCs. Notably, SSCs outweighed FC in only 20% of determinations.
Figure 1: Assessment of main stromal constituents. ASMA = α smooth muscle actin; AV = Mean value; B-Vs = blood vessels; Cl-F = mature collagen fibers; FC = fibrillary constituents; Re-F = newly formed reticular fibers; SSC = stromal specific cells; TTS = total amount of tumor stroma

Click here to view

Although each of the two fiber types (Cl-F and Re-F) individually accounted for up to 10%, the overall fibrillary component most often ranged between 10% and 30%, with only four cases showing a very rich fibrillary component (over 30% of the tumor area) [Figure 1]a and [Figure 1]c.

Re-F usually dominated the fibrillary compartment [Figure 1]c, revealing a significant secretory activity of SSCs.

The cellular compartment was composed mostly of SSCs that produced the extracellular matrix fibrillary material, i.e. ASMA+ cells, considered the major constituent of desmoplastic stroma.[12],[13] Vascular cells (endothelial and adventitial) represented around one-tenth to one-thirteenth of the cellular compartment [Figure 1]a and [Figure 1]d. The vascular compartment showed great variability in the density of the intratumoral network, with an average of a little less than 200 blood vessels (B-Vs)/mm2 and more than half of the VD values below average [Figure 1]b.

The assessment of correlations between different components of the tumor stroma strongly highlighted the active and dynamic behavior of the intratumoral stroma, which tended to maintain a balance between its morphological components, probably to meet the needs of the cell population it supports. One exception, a non-correlation, was the ratio between the cell population producing fibrillar structures and the proportion of Cl-F. The mature type of collagen fibers, which probably constitutes the basic skeleton even in an architectural construction as disordered as a tumor, hardly varied.

The adaptation of the supporting fibrillary structure to the dynamics of the tumor cell population is achieved by varying the number of cells producing the necessary fibrillary material, dominated by Re-F, as confirmed by the close correlation between cell number and the amount of fibers of this type, which respond much better to the disordered “construction” of the tumor parenchyma. Thus, the SSC number correlated highly with the amount of FC and especially Re-F. The latter influenced both the Re-F/Cl-F ratio and the amount of FC. Consequently, TTS correlated with the FC/SSC ratio.

Vascular support also correlated with the dynamics of the stromal architecture. VD closely correlated with the amount of vascular-specific cells (CD34+), and also with the TTS and the number of FC-producing cells. In all comparisons, the p value of Pearson's correlation test was <0.05 [Table 1].
Table 1: Statistical correlations between the assessed stromal parameters

Click here to view


The stromal microenvironment is an active participant in the neoplastic process; the cellular compartment, which remodels and continuously adapts the supporting structure to the needs of neoplastic epithelial cells, plays a central role, but only in close relation to the vascular compartment.

Competing interests

The authors have no competing interests.


  1. Bray F, Ferlay J, Soerjomataram I, Siegel RL, Torre LA, Jemal A. Global cancer statistics 2018: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J Clin. 2018;68:394-424.
  2. Anderson LA, Tavilla A, Brenner H, Luttmann S, Navarro C, Gavin AT, et al. Survival for oesophageal, stomach and small intestine cancers in Europe 1999-2007: Results from EUROCARE-5. Eur J Cancer 2015;51:2144-57.
  3. Jim MA, Pinheiro PS, Carreira H, Espey DK, Wiggins CL, Weir HK, et al. Stomach cancer survival in the United States by race and stage (2001-2009): Findings from the CONCORD-2 study. Cancer 2017;123 Suppl 24:4994-5013.
  4. Jung KW, Won YJ, Oh CM, Kong HJ, Lee DH, Lee KH, et al. Cancer statistics in Korea: Incidence, mortality, survival, and prevalence in 2014. Cancer Res Treat 2017;49:292-305.
  5. Weinberg R, Mihich E. Eighteenth annual pezcoller symposium: Tumor microenvironment and heterotypic interactions. Cancer Res 2006;66:11550-3.
  6. Bissell MJ, Radisky D. Putting tumours in context. Nat Rev Cancer 2001;1:46-54.
  7. Radisky D, Hagios C, Bissell MJ. Tumors are unique organs defined by abnormal signaling and context. Semin Cancer Biol 2001;11:87-95.
  8. Tlsty TD, Hein PW. Know thy neighbor: Stromal cells can contribute oncogenic signals. Curr Opin Genet Dev 2001;11:54-9.
  9. Zaharia B, Pleşea IE, Foarfă C, Georgescu CV. Stroma tumorală. In: Morfopatologie Generală: Editura Medicală Universitară Craiova; 2005. p. 181-2.
  10. Cabanne F, Bonenfant JL. Anatomie Pathologique – Principes de pathologie générale et spéciale. Paris: Les Presses de L'Université Laval Québec & Maloine S.A. Editeur; 1980. p. 254-57.
  11. Zhao HC, Qin R, Chen XX, Sheng X, Wu JF, Wang DB, et al. Microvessel density is a prognostic marker of human gastric cancer. World J Gastroenterol 2006;12:7598-603.
  12. Nakayama H, Enzan H, Miyazaki E, Toi M. Alpha smooth muscle actin positive stromal cells in gastric carcinoma. J Clin Pathol 2002;55:741-4.
  13. Nakayama H, Enzan H, Miyazaki E, Kuroda N, Naruse K, Hiroi M. Differential expression of CD34 in colorectal normal tissue, peritumoral inflammatory tissue, and tumour stroma. J Clin Pathol 2000;53:626-9.

   Digital Pathology in a Box Top

Heimo Müller1, 2, 3, Roxana Merino Martinez4, Robert Reihs1,4, Markus Plass1, Kurt Zatloukal1

1Diagnostic and Research Center for Molecular Biomedicine, Institute of Pathology, Medical University of Graz, Graz, Austria, 2Biobanking and Biomolecular Research Infrastructure (BBMRI-ERIC), Common Service IT, 3B3Africa, H2020 Project, 4Department of Medical Epidemiology and Biostatistics, Karolinska Institute, Solna, Sweden. E-mail: [email protected]


In digital pathology, the adoption of open source software has several positive implications, e.g. for the implementation of standards for data representation and the capacity for integration with other software and platforms. Furthermore, the development of machine learning in digital pathology requires open access to large sets of annotated training data. Several open source tools are available to process clinical and molecular data, providing valuable annotations for whole slide images. However, utilizing open source software tools requires considerable technical skill for installation and configuration. To minimize this complexity we developed the software as a service (SaaS) framework BIBBOX (Basic Infrastructure Building Box). We built a box ready for digital pathology, including whole slide management and viewers (e.g. Cytomine, OMERO, Girder) and data analysis (QuPath, ASAP, SlideJ, HistomicsTK), and integrated these with other open source software for data management, analysis and sharing. “Digital pathology in a BOX” is available either as a virtual machine or as a “real box” including a data management server, network interfaces and computing nodes. It can be used to organize federated whole slide image collections and/or as a standalone unit supporting the integration of digital pathology software with biobanking, clinical and life science tools. Besides this, “Digital pathology in a BOX” may be used as an informatics platform for education and training, and as a starter kit for digital pathology and biobanks in low- and middle-income countries.

Keywords: Data management, open source, software platforms


Digital pathology is not just the transformation of the pathologist's microscopic analysis of histological slides into a digital workflow; it is a disruptive innovation that will markedly change health care in the next few years. Digital pathology and machine learning will change the education and training of pathologists, an urgently needed solution to the global shortage of medical specialists, and will generate new business models for diagnostic services. It is expected that several of the solutions developed in the context of digital pathology will also be relevant to other fields of medical data analysis.[1]

Future research projects, particularly in machine learning and medical data analysis, have to build on shared knowledge and access to large data sets that cover the variety of human diseases in different organ systems. Such data sets have to meet the quality and regulatory criteria of medical devices (raw data) and be described by all necessary provenance metadata, from patient to sample to image. To generate and share whole slide image collections, open source software tools can improve data harmonization and collaborative workflows. The open science movement is in alignment with the EU vision of research in a connected and participative society; an existing example is the European Open Science Cloud (EOSC), which aims to support data storage to foster collaboration among research institutions.

Open source software can also be utilized in education and training and as a “starter kit” for digital pathology workflows. However, the installation, configuration and evaluation of open source software tools require technical skills often not available on the user side. To minimize this complexity we developed the software as a service (SaaS) framework BIBBOX (Basic Infrastructure Building Box).

The work was initiated in biobanking and life science research and funded within the BBMRI-ERIC common service IT and the H2020 project B3Africa (grant agreement 654404).


The Basic Infrastructure Building Box (BIBBOX) abstracts the complexities involved in the installation and configuration of open-source software and provides a framework for user management and data integration. It is based on the metaphor of a smartphone “App Store”, where each App is a pre-configured open source software tool. The selection of Apps is based on a scenario approach initially developed in the biobanking domain.[2],[3]

For digital pathology, our scenario is the setup of a federated “colon cancer tissue bank” based on BBMRI-ERIC's colon cancer cohort collection (H2020 ADOPT BBMRI-ERIC project). In this effort, multiple biobanks in Europe are working together to build a collection of 10,000 colon cancer cases with clinical data. For a subset of these cases, stained tissue sections from archived material will also be scanned and provided, together with clinical annotation, to the research community, with the aim of developing algorithms and software tools for analysing stained tumor tissue sections.

Our methodology is based on virtualization and container technology, which enables applications to be separated from infrastructure. With the help of software containers we can package single services and combine them in an App description. An App defines the template for the orchestration of several containers, which are instantiated when the App is installed in a local BIBBOX framework, see [Figure 1].
Figure 1: The proposed Basic Infrastructure Building Box (BIBBOX) framework

Click here to view
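
The App-as-orchestration-template idea can be sketched as follows; the manifest structure, class names and container images are illustrative assumptions, not BIBBOX's actual App format.

```python
# Hypothetical sketch: an App is a template listing the containers it
# orchestrates; "installing" it instantiates one container per service.
from dataclasses import dataclass, field

@dataclass
class App:
    name: str
    services: dict = field(default_factory=dict)  # service -> container image

    def instantiate(self, instance_id):
        """Return the container name/image pairs a local BIBBOX would start."""
        return [(f"{self.name}-{instance_id}-{svc}", image)
                for svc, image in self.services.items()]

# An invented manifest for a whole-slide-image viewer App:
wsi_viewer = App("omero", {"server": "openmicroscopy/omero-server",
                           "web": "openmicroscopy/omero-web",
                           "db": "postgres:12"})
containers = wsi_viewer.instantiate("site-a")
```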

As Apps are based on containers, they are lightweight, i.e. they do not need the extra load of a hypervisor but run directly within the host machine's kernel. This means more Apps can be run on given hardware than with virtual machines.

BIBBOX can be seen as a superclass of Apps. Each implementation of BIBBOX is an instance of that class with a different set of deployed applications. Each deployed application produces its own data; e.g. an image management App can be deployed at two different sites. Each instance stores data from its local collections and can share (aggregated) metadata with a central catalogue, and optionally allow managed access to whole slide images.
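
A minimal sketch of this instance model, with invented class and field names (the real BIBBOX framework is not structured exactly like this): each instance keeps full records local and shares only aggregated metadata with a central catalogue.

```python
# Sketch of the "superclass/instance" idea: raw records never leave the
# site; only aggregated counts are pushed to the shared catalogue.
class BibboxInstance:
    def __init__(self, site):
        self.site = site
        self.local_slides = []          # full records stay local

    def add_slide(self, slide_id, diagnosis):
        self.local_slides.append({"id": slide_id, "diagnosis": diagnosis})

    def aggregated_metadata(self):
        """Only counts per diagnosis leave the site, never raw records."""
        counts = {}
        for s in self.local_slides:
            counts[s["diagnosis"]] = counts.get(s["diagnosis"], 0) + 1
        return {"site": self.site, "counts": counts}

catalogue = []
graz = BibboxInstance("graz")
graz.add_slide("S1", "colon carcinoma")
graz.add_slide("S2", "colon carcinoma")
catalogue.append(graz.aggregated_metadata())
```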


“Digital Pathology in a Box” consists of a central service (App store and FAIR-compliant federation framework) and a software as a service (SaaS) framework comprising an App manager (install/start/stop and backup of Apps), a user portal to configure Apps with specific user roles and permissions, and an ID management system to link digital objects across applications.

The App store is based on a set of GitHub repositories,[4] each repository containing a blueprint for installing the App.[5] The BIBBOX framework provides a web-based “one-click” installer, so each App can be installed easily, and App-specific documentation and logs are available in a graphical user interface, see [Figure 2].
Figure 2: Graphical user interface of the proposed Basic Infrastructure Building Box (BIBBOX) framework

Click here to view

“Digital pathology in a Box” includes a software federation framework that allows the sharing of data and knowledge between boxes. The data are harmonised according to a standard explicitly designed for the federation, which can include any type of metadata. The software extracts, transforms and uploads data from several boxes and visualises it through a flexible and advanced visualization and exploration tool. The main software components of the federation framework are based on ElasticSearch, an open-source platform for searching and analysing heterogeneous data in real time.

The “Digital pathology in a Box” framework can be deployed in three ways: (i) as a virtual machine in any hypervisor environment using Vagrant/Puppet scripts, (ii) as a pre-configured virtual appliance in OVA format, and (iii) as a physical box consisting of an Ubuntu server together with storage and computing nodes.


Adopting open source software in the research process has several positive implications. For instance, open source software quickly adopts state-of-the-art technologies and new paradigms in software engineering and computer science, and its implementation of standards for data representation and its capacity for integration with other software and platforms are higher than with commercial software.

We selected and evaluated a set of tools for digital pathology, starting with tools for management and sharing of whole slide image (WSI) collections. These tools were transformed to BIBBOX Apps, which can be installed, configured and integrated with already existing tools from biobanking (sample management) and bioinformatics (data analysis). With the help of a federation framework, several instances of BIBBOX can share a catalogue of their collections.

Competing interests

The authors declare that they have no conflict of interest.


  1. Holzinger A, Malle B, Kieseberg P, Roth PM, Müller H, Reihs R, et al. Machine learning and knowledge extraction in digital pathology needs an integrative approach. In: Holzinger A, Goebel R, Ferri M, Palade V. editors. Towards Integrative Machine Learning and Knowledge Extraction. Lecture Notes in Computer Science. Vol 10344. Cham: Springer; 2017.
  2. Müller H, et al. From the evaluation of existing solutions to an all-inclusive package for biobanks. Health Technol. (Berlin) 2015;7. (Doi: 10.1007/s12553-016-0175-x).
  3. Müller H, Reihs R, Zatloukal K, Jeanquartier F, Merino-Martinez R, van Enckevort D, et al. State-of-the-art and future challenges in the integration of biobank catalogues. In: Holzinger A, Röcker C, Ziefle M, editors. Smart Health. Lecture Notes in Computer Science. Vol. 8700. Cham: Springer; 2015.
  4. BIBBOX apps in Github. Available from: [Last accessed 2019 Oct 15].
  5. The Anatomy of an APP In BIBBOX. Available from: [Last accessed 2019 Oct 15].

   Experience of Digital Pathology Applications in a Diagnostic Pathology Laboratory Top

Meral Üner1,2

1Yozgat City Hospital, Yozgat, Turkey, 2Department of Pathology, Hacettepe University, Ankara, Turkey. E-mail: [email protected]


Objective: Digital pathology is a dynamic, image-based method of obtaining pathology information from a digitized analog slide, including managing, interpreting, and sharing this information. Herein, we aim to share the experience and basic algorithm of our laboratory, and to discuss the future effects and utility of digital pathology in diagnostic pathology. Method: Our whole slide scanner (Leica Aperio LV1) is a device mainly designed for research purposes, with a four-slide scanning capacity. Considering the capacity of our device, we designed an optimized scanning algorithm: pathologists marked the area best representing the lesion, which was scanned by the technician at a single level at ×20 magnification. Findings: Our use of digital pathology comprises three main components: (1) limited archiving of special cases, (2) consultation and telepathology for intraoperative consultation, and (3) educational purposes. Complete digitization of pathology cases was not possible in our laboratory due to limitations in digital pathology technology and our technical infrastructure. We do not use our whole slide scanner for routine primary diagnosis, a fourth field of digital pathology use, because technological constraints prolong the reporting process and reduce the reliability of the diagnosis.

Keywords: Archiving, digital pathology, scanning, telepathology


Digital pathology is a trending technology in worldwide pathology practice, being a dynamic, image-based method of obtaining pathological information from a digitized analog slide (whole slide imaging, WSI), including managing, interpreting (deep learning, limited AI), and sharing (telepathology) this information. Although digital pathology has begun to take its place in many fields of diagnostic and experimental pathology,[1],[2] complete digitization of pathology cases is not effective enough for many routine diagnostic laboratories in developing countries such as ours, because of technological limitations.[3]

In this article, we aim to share our laboratory's modest digital pathology experience, used for archiving, consultation, telepathology and education.


Our whole slide scanner (Leica Aperio LV1) is designed for research purposes and has a four-slide scanning capacity. A scanned image requires 0.5-5 GB of storage space, depending on scan quality, and approximately 15 minutes of scanning time.[4] The device can transfer microscopic images in live mode, as a digital microscope, to computers and display units at ×1, ×2.5, ×10, ×20 and ×40 magnifications [Figure 1]a; this image can also be controlled and accessed from another computer via remote access programs (such as TeamViewer).
Figure 1: (a) Whole slide scanner in our laboratory (Leica Aperio LV1). The device can transfer microscopic images in live mode to computers and display units. (b) Pathologists mark the area that best represents the diagnosis; our technicians scan the marked areas at a single level at ×20 magnification in SVS format, and the images are then filed by date. Manual image analyses can be performed in the ImageScope program. (c) Number of scanned cases since July 2017; diagnoses were categorized as malignant and benign

Click here to view

We designed an optimized scanning algorithm covering a limited set of cases, considering the capacity of our device. Pathologists marked the area best representing the lesion, which was scanned by the technician at a single level at ×20 magnification.


For our laboratory, we designed digital pathology usage around three main components:

  1. Limited archiving with special cases,
  2. Consultation & telepathology for intraoperative consultation,
  3. Meetings and educational purposes.

Since our technical infrastructure does not allow scanning of all routine cases for archival purposes, we planned limited archiving of special cases in our laboratory. During an 11-month period, we scanned slides from 131 malignant and 42 benign cases [Figure 1]c, according to the criteria listed below as the “Limited Archiving Procedure”:

- Malignant cases,

- Rare cases and cases with academic diagnostic value,

- Cases with judicial considerations,

- Cases with incompatibility between clinical preliminary diagnosis and final pathological diagnosis,

- Cases with significant discordance between the cytological diagnosis and paraffin section diagnosis,

- Cases with significant discordance between diagnosis of endoscopic or incisional biopsy and diagnosis of resection material.

Digital pathology can be used for consultation purposes through remote access to slides in “live mode”. This method is useful for consulting on problematic cases with other institutions according to established protocols, for intraoperative consultations, and in emergency situations (such as assessment of graft rejection) when a pathologist is not present in the laboratory.

For education, presentations and meetings, the “microscopic image” created in “live mode” can be transferred to a monitor in our meeting room to support interdisciplinary clinicopathological meetings, case sharing between pathologists and in-laboratory training. Scans in SVS format [Figure 1]b and pictures taken in JPEG, JPG and TIFF formats can be used in presentations at national and international meetings and congresses.

“Using digital pathology for primary diagnosis of pathology slides”, a fourth field of use, is not implemented in our laboratory because this process extends the reporting time and lowers the reliability of the diagnostic procedure due to some limitations of our existing technology. These limitations include:

- the extra time needed to prepare analog slides and digital images separately,

- the increased cost of these processes,

- the need for extra data storage capacity,

- the resolution quality of the slide monitoring device,

- data loss and slowdowns due to factors such as the speed of the internet connection that transfers data to the physician's computer.


Digital pathology, a rising technology, has begun to take its place in routine pathology laboratories in many new areas, as well as in scientific research and education. “Archiving” and “telepathology” are two digital pathology applications that can find a place in developing-country laboratories, demonstrating the utility of digital pathology in diagnostic processes. To develop a more effective algorithm and usage for routine diagnostic pathology, there is a need for advanced technology, more infrastructural support and consultation protocols between institutions.

Competing interests

Author does not have any competing interests to declare.


  1. Glassy EF. Digital pathology: Quo vadis? Pathology 2018;50:375-6.
  2. Racoceanu D, Hufnagl P. Preface. Selected Papers From the 13th European Congress on Digital Pathology. Vol. 61. 2017. p. 1-34.
  3. Flotte TJ, Bell DA. Anatomical pathology is at a crossroads. Pathology 2018;50:373-4.
  4. Available from: [Last accessed on 2018 Sep 15].

   Ontology-Based Text Mining for Large Scale Digital Slide Annotation Top

Robert Reihs1, Heimo Müller1, Kurt Zatloukal1

1Institute of Pathology, Medical University of Graz, Graz, Austria. E-mail: [email protected]


Extraction of structured information from medical reports is a key enabling technology for generating the large data sets of annotated digital slides needed for machine learning. Combined datasets of digital images and medical data, alongside developments in AI/machine learning, could make novel information accessible and quantifiable for human experts; information may be hidden in images that is not findable through low-level annotations. Major progress can be expected from unsupervised learning on large numbers of digital slides associated with pathological, clinical and outcome data. To build the structured data source for pathological findings, we created a text mining system called SAAT (Semi-Automated Annotation Tool). It uses ontology-based term extraction and a decision tree built with a manually curated classification system. From pathology reports it extracts ICD-10 and ICD-O codes, TNM classification and immunohistochemistry results, e.g. hormone receptor status. A visual editor is used by experts to generate, modify and evaluate the rules, and a reference dataset, excluded from the evaluation, is used to measure performance. 1.4 million pathology reports generated over 23 years have been processed. We achieved F-scores of 89.7% for ICD-10 and 94.7% for ICD-O classification; for extraction of tumor staging and receptor status we achieved F-scores between 81.8% and 96.8%. SAAT is an open source tool and has been tested at four institutes in Austria, Germany and Switzerland. It can be used to generate a large-scale resource of digital slides linked with structured and cleaned histopathological information for machine learning.

Keywords: Automatic classification, text mining, ontologies


Medical universities and hospitals have accumulated large data pools over the last decades. Extraction of structured information from medical reports is a key enabling technology for generating the large data sets of annotated digital slides needed for machine learning. Exploring the knowledge in these “big data” pools is a very challenging task, and inhomogeneous data collections, built over several decades with evolving information content, do not make it easier.

A manual clean-up is possible for smaller projects with fewer than 100 patients. However, with hundreds of thousands of datasets, as needed in artificial intelligence/machine learning, an automated processing system is needed. Current advances in artificial intelligence/machine learning and the combination of different data sources (images, patient records, …) make novel information accessible and quantifiable to a human expert.[1]

The starting point of our activity is the tissue collection of the Institute of Pathology of the Medical University of Graz. The dataset contains approximately 1.4 million samples from 700,000 patients recorded over 23 years; follow-up and survival data are also recorded. This patient cohort represents a non-selected patient group characteristic of Central Europe. It is part of the Biobank of the Medical University of Graz[2] and of the Central Research Infrastructure for Molecular Pathology (CRIP).[3] For all of these datasets, the histopathological and immunohistochemistry slides are available. Among other uses, this collection provides an ideal basis for an artificial intelligence/machine learning dataset.

Problems such as misspellings, divergent ontologies, synonyms, term variations, changes in classification and differing descriptions of clinical findings arise especially with datasets covering a long time period. To overcome these main challenges, we created a tool set called SAAT (Semi-Automated Annotation Tool) to extract information and classify the findings.


The SAAT tool set is a bundle of modules for cleaning up, structuring and coding clinical data records. It uses ontology-based term extraction and a decision-tree text mining approach with a manually curated classification system. An information extraction system based on regular expressions is used to extract TNM tumor staging and immunohistochemistry results, e.g. hormone receptor status.

Data clean-up is the first step, in which data are integrated into the SAAT system; this step is not needed if the system runs as a service. Patient records can also be merged with a separate tool (patient merger). Dictionaries are then created, spelling errors corrected, and all findings normalised. Additionally, a wordnet is generated.

In the information extraction module, a regular-expression-based system with a substitution system is used to extract data, e.g. T, N, M, G, R, L, V staging (T, size and extent of the primary tumor; N, degree of spread to regional lymph nodes; M, presence of distant metastasis; G, grade of the cancer cells; R, presence of cancer cells at the resection margins; L, invasion into lymphatic vessels; V, invasion into veins) and receptor status (progesterone receptor, estrogen receptor, FISH, HER2/neu). With the substitution system, text phrases such as “The basal resection margin was tumor positive” are translated to R1. In the structuring step, textual values are also translated to numeric values and vice versa (progesterone receptor status: <10% -> 1; 2 and 3 -> mildly depressed).
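
The extraction and substitution steps can be sketched as follows; the patterns and phrase table are simplified English examples, not SAAT's actual rule set (which targets German-language reports).

```python
# Hedged sketch of regular-expression staging extraction plus a phrase
# substitution table; patterns here are illustrative, not SAAT's rules.
import re

TNM_PATTERN = re.compile(r"\bp?T([0-4][a-c]?)\s*N([0-3])\s*M([01])\b")

# Substitution system: free-text phrases mapped to structured codes.
PHRASE_SUBSTITUTIONS = {
    r"resection margin was tumor positive": "R1",
    r"resection margins? (are|was|were) free of tumou?r": "R0",
}

def extract_staging(report):
    findings = {}
    m = TNM_PATTERN.search(report)
    if m:
        findings.update({"T": m.group(1), "N": m.group(2), "M": m.group(3)})
    for pattern, code in PHRASE_SUBSTITUTIONS.items():
        if re.search(pattern, report, flags=re.IGNORECASE):
            findings["R"] = code
    return findings

report = "Adenocarcinoma, pT3 N1 M0. The basal resection margin was tumor positive."
staging = extract_staging(report)   # {'T': '3', 'N': '1', 'M': '0', 'R': 'R1'}
```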

The classification module[4] is the core of the SAAT toolbox and classifies the texts with ICD-10[5] and ICD-O[6] codes. In a preparation step, some words are merged into terms. Then the text is split into single terms (tokenized), together with right and left neighbours, sentences and findings. We use a decision-tree-based classification system: every node describes a matching token in the diagnosis with a regular expression and carries a set of processing rules describing the relation and positioning of tokens, as well as negation rules. Additionally, a priority system for selecting codes is integrated at the node level. The system is designed to run with either a local or a centralized dictionary, making a cooperative classification tree between multiple institutes possible, so the effort of maintaining and updating the tree can be minimised and shared. Only the classification tree is shared, never the records, so there is no privacy issue. To overcome linguistic differences between institutes, rules in the centralized catalogue can also be individualized for specific institutes.
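
The node mechanics can be illustrated with a minimal sketch; the node contents, codes, priorities and negation rules below are invented examples, not the real Graz classification tree.

```python
# Minimal illustration of decision-tree classification: each node matches a
# token via a regular expression, with negation rules and a code priority.
import re

class Node:
    def __init__(self, pattern, code, priority=0, negations=()):
        self.pattern = re.compile(pattern, re.IGNORECASE)
        self.code = code
        self.priority = priority
        self.negations = [re.compile(n, re.IGNORECASE) for n in negations]

    def matches(self, tokens, text):
        if any(neg.search(text) for neg in self.negations):
            return False
        return any(self.pattern.fullmatch(tok) for tok in tokens)

def classify(text, nodes):
    tokens = re.findall(r"\w+", text)   # tokenize the diagnosis text
    hits = [n for n in nodes if n.matches(tokens, text)]
    return max(hits, key=lambda n: n.priority).code if hits else None

tree = [
    Node(r"adenocarcinomas?", "C18.9", priority=2,
         negations=[r"no evidence of"]),
    Node(r"adenomas?", "D12.6", priority=1),
]
code = classify("Tubular adenoma, no evidence of adenocarcinoma.", tree)
```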

Currently the classification system consists of 104 trees with 4285 nodes. The classification trees were created by medical experts at the Institute of Pathology in Graz with an interactive tree editor, a word net visualization tool and an ontology-based term extraction tool.


The classification system was evaluated by medical experts with a web-based evaluation system. Additionally, an evaluation dataset was created to run automated evaluation tests after each run and each modification of the rule set; the changes between runs are recorded to support the rule evaluation process. In the classification we achieved an F-score of 89.7% (precision 83.2% and recall 97.5%) for ICD-10 and 94.7% (precision 91.2% and recall 98.5%) for ICD-O codes. [Table 1] shows the F-scores of the information extraction for TNM staging. For comparison, we computed the F-score for a simple classification at the lexical level with regular expressions using the “fruit machine”.[7] “FM original” shows the results using the original records; “FM corrected” uses the processed dataset for the information extraction system. Our approach performs very well compared to others. It could be further improved by extending the clean-up to textual descriptions of the tumor stage or by additional graph-based extraction methods.[8]
Table 1: Precision, recall and F-score values for search results and extraction of tumor classifications
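As a sanity check on the figures above, the reported F-scores are the harmonic mean of the stated precision and recall values:

```python
def f1(precision, recall):
    """F-score as the harmonic mean of precision and recall (in percent)."""
    return 2 * precision * recall / (precision + recall)

print(round(f1(83.2, 97.5), 1))  # 89.8, matching the reported ICD-10 F-score of 89.7% up to rounding
print(round(f1(91.2, 98.5), 1))  # 94.7, the reported ICD-O F-score
```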


SAAT is open-source software for pathology, based on ontology-based term extraction and a semi-manually curated decision tree. It will be integrated into the open-source BIBBOX[9] Suite. The system achieved an F-score of 89.7% for ICD-10 and 94.7% for ICD-O. In the information extraction module, we achieved F-scores between 81.8% and 96.8%.

Comparing the result with other state-of-the-art approaches shows slightly higher accuracy. This is because our system is specially tailored to German pathology diagnoses and has a manually optimized classification tree. Fully automatic NLP methods can be easily trained for new tasks when training sets are available.[10] The classification tree currently supports German semantics for pathological findings; however, a translation of the ruleset can easily be achieved. Piwowar[11] pointed out that "Academic health centres (AHCs) have a critical role in enabling, encouraging, and rewarding data sharing. The leaders of medical schools and academic-affiliated hospitals can play a unique role in supporting this transformation of the research enterprise." Our tool can help build the structured dataset that is needed for artificial intelligence/machine learning. In this context, patient privacy is a major concern. Here we use the k-anonymity approach,[12] which requires a well-structured and normalized data pool as a prerequisite.
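The k-anonymity property mentioned above can be checked with a minimal sketch: a data release is k-anonymous when every combination of quasi-identifier values occurs in at least k records. The field names and example records below are invented for illustration and are not part of the SAAT implementation, which uses the priority-based weighted generalisation of Stark et al.[12]

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    occurs in at least k records (the basic k-anonymity property)."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return all(count >= k for count in groups.values())

# Hypothetical generalized records (age ranges, truncated postal codes).
records = [
    {"age_range": "40-49", "zip": "801*", "diagnosis": "C50.9"},
    {"age_range": "40-49", "zip": "801*", "diagnosis": "C18.7"},
    {"age_range": "50-59", "zip": "802*", "diagnosis": "C61"},
]
print(is_k_anonymous(records, ["age_range", "zip"], k=2))  # False: one singleton group
```

A release failing the check would be generalized further (e.g., wider age ranges) until every group reaches size k.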

Competing interests

The authors declare that they have no conflict of interest.


  1. Holzinger A, Malle B, Kieseberg P, Roth PM, Müller H, Reihs R, et al. Towards the augmented pathologist: Challenges of explainable-AI in digital pathology. 2017:1-34. Available from: [Last accessed on 2019 Oct 15].
  2. BioBank Graz; 2016. Available from: [Last accessed on 2019 Oct 15].
  3. CRIP – Central Infrastructure for Biomedical Research Involving Human Tissue Repositories; 2016. Available from: [Last accessed on 2018 Aug 22].
  4. Davis B, Dantuluri P, Dragan L, Handschuh S, Cunningham H. On designing controlled natural languages for semantic annotation. Berlin, Heidelberg: Springer; 2010. p. 187-205.
  5. World Health Organization. ICD-10: International Statistical Classification of Diseases and Related Health Problems. 10th revision, 2nd ed. World Health Organization; 2004. [Last accessed on 2019 Oct 15].
  6. World Health Organization. International Classification of Diseases for Oncology (ICD-O). World Health Organization; 2000.
  7. Dinwoodie HP, Howell RW. Automatic disease coding: The 'fruit-machine' method in general practice. Br J Prev Soc Med 1973;27:59-62.
  8. Zhou X, Han H, Chankai I, Prestrud A, Brooks A. Approaches to text mining for clinical medical records. In: Proceedings of the 2006 ACM Symposium on Applied Computing; 2006. p. 235-9.
  9. Müller H, Malservet N, Quinlan P, Reihs R, Penicaud M, Chami A, et al. From the evaluation of existing solutions to an all-inclusive package for biobanks. Health Technol (Berl) 2017;7:89-95.
  10. Névéol A, Zweigenbaum P. Clinical natural language processing in 2015: Leveraging the variety of texts of clinical interest. Yearb Med Inform 2016;1:234-9.
  11. Piwowar HA, Becich MJ, Bilofsky H, Crowley RS. Towards a data sharing culture: Recommendations for leadership from academic health centers. PLoS Med 2008;5:e183.
  12. Stark K, Eder J, Zatloukal K. Priority-based k-anonymity accomplished by weighted generalisation structures. In: Data Warehousing and Knowledge Discovery. Berlin, Heidelberg: Springer; 2006. p. 394-404.






