SYMPOSIUM  ORIGINAL RESEARCH Year : 2013  Volume : 4  Issue : 2  Page : 5 Approaches to automatic parameter fitting in a microscopy image segmentation pipeline: An exploratory parameter space analysis Christian Held^{1}, Tim Nattkemper^{2}, Ralf Palmisano^{3}, Thomas Wittenberg^{1}, ^{1} Department for Image Processing and Biomedical Engineering, Fraunhofer Institute for Integrated Circuits, Erlangen, Germany ^{2} Biodata Mining Group, Faculty of Technology, Bielefeld University, (D33501 Bielefeld), Germany ^{3} Optical Imaging Center Erlangen, OICE, Erlangen, Germany Correspondence Address: Introduction: Research and diagnosis in medicine and biology often require the assessment of a large amount of microscopy image data. Although on the one hand, digital pathology and new bioimaging technologies find their way into clinical practice and pharmaceutical research, some general methodological issues in automated image analysis are still open. Methods: In this study, we address the problem of fitting the parameters in a microscopy image segmentation pipeline. We propose to fit the parameters of the pipeline«SQ»s modules with optimization algorithms, such as, genetic algorithms or coordinate descents, and show how visual exploration of the parameter space can help to identify suboptimal parameter settings that need to be avoided. Results: This is of significant help in the design of our automatic parameter fitting framework, which enables us to tune the pipeline for large sets of micrographs. Conclusion: The underlying parameter spaces pose a challenge for manual as well as automated parameter optimization, as the parameter spaces can show several local performance maxima. Hence, optimization strategies that are not able to jump out of local performance maxima, like the hill climbing algorithm, often result in a local maximum.
Introduction In clinical practice, pharmaceutical studies, and life science research, automatic analysis of large sets of microscopy images has received a lot more attention in the last ten years, because of several parallel developments. In the clinical domain, the digitization of pathological preparations, that is, digital pathology, has experienced a significant speed up in the last year, as more tissue sections are investigated and imaging and image archiving have been improved into integrated hardware solutions. As a consequence, large image collections can be recorded under almost identical conditions, enabling (and calling for) new software solutions for the automated and semiautomated analysis of the data, which includes quantitative measurements (counting of cell nuclei) or tissue segmentation. [1],[2],[3],[4] In pharmaceutical research and life sciences, advances in robotics, automation, signal processing, and staining techniques enable new microscopy imaging approaches, which are able to resolve high dimensional tissue features with multistaining or multispectral images. As a consequence, algorithmic approaches for the analysis of such data are desperately needed. The socalled bioimage informatics has been developed as a new branch in the bioinformatics tree. [5] For a manual assessment of microscopic imagery, the commercial and academic sectors provide different solutions to support experts in data management, exploration visualization, and semantic annotation. [6],[7],[8],[9] However, it is common sense, that manual evaluation of large scale image data is a repetitive and timeconsuming task. As a result of this, fatigue or pressure of time can lead to erroneous or nonreproducible results. The analysis of microscopy images (e.g., from tissue sections or fluorescent data) can involve different steps, and it is dependent on the diagnostic or scientific context. In one diagnostic context, the analysis can be deduced by counting different types of cells in a sample. Another context on the same or similar image data may require measurement of cell size or measurement of the average signal intensity across a handselected region of interest (ROI). Using automated image analysis, reproducibility as well as time required for evaluation can be reduced for many experiments. The development of an automated image analysis system promises robust and accurate segmentation of each object (e.g., nuclei) in the image, but poses several challenges like overlapping cells, low signaltonoise ratios or artifacts. To overcome these obstacles, usually pipelines or workflows of image processing modules are designed, involving modules of denoising, thresholding, and segmentation. Optimal adjustment of each module's parameters is important to obtain high quality segmentation. Tools like the CellProfiler [10] offer a segmentation pipeline editor, enabling the arrangement and parameterization of modules. However, the design of the pipeline and fitting the modules parameters is a nontrivial task, as an experienced image processing expert has to perfectly understand the aim of the segmentation and the image characteristics (like tissue morphology, staining). Furthermore, if the expert finally finds a good parameter set for the chosen pipeline it is unclear how this pipeline will perform on the new data recorded in another laboratory. To ease the integration of medical/pathological background knowledge (usually defined by manually acquired ground truth segmentation) into the image processing solution, artificial neural networks and machine learning had been already proposed in the mid 1990s and their application has been proposed. [11],[12],[13] However, the parameter fitting problem remains with this approach as well. One alternative to an iterative manual parameter adjustment has been proposed just recently by Pretorious et al., [14] describing a plugin for the CellProfiler [10] that computes and stores segmentation results for different combinations of parameters. Based on the selection of high or lowquality results by the user, optimal parameterization of the segmentation pipeline is determined. A further alternative to manual parameter adjustment that is applicable to fluorescent image data is described by Wittenberg et al.[15] This requires the user to manually delineate some representative cells. Parameterization of the segmentation pipeline is then adjusted toward the user input by a genetic algorithm. In this study we investigate the problem of automatic parameter fitting in a segmentation pipeline. We introduce the segmentation pipeline developed in one particular project. We have chosen this example, as the cell shapes are very complex and the pipeline relates well to many other published pipelines. Next, we explain how the parameters are fitted automatically with different approaches and explain how segmentation accuracy is assessed. Visualization of the parameter space in relation to segmentation accuracy enables us to understand why some automatic and manual methods are likely to fail in finding the global optimum in the parameter space. Materials Macrophages from C56BL/6 mice were stained with CD11b/APC. An additional DAPI staining enabled visualization of the cell nuclei. Using a Zeiss Axivert microscope, with a 20x objective, 20 micrographs with a size of 1388 × 1040 pixels were captured [Figure 1]. In order to define the solution of the segmentation task, each cell was manually delineated by an experienced biologist. Cells exceeding the image boundary or cells whose boundaries could not be resolved by the human expert were also delineated and assigned to a rejection class. The resulting ground truth consisted of 588 valid macrophages and 320 cells assigned to the rejection class.{Figure 1} Methods The Segmentation Pipeline The segmentation pipeline used in this study consists of three modules. In the first module, preprocessing is applied aiming to remove the potential noise or shading artifacts. Next, the cells are separated from the image background. In the last step, the touching or overlapping cells are separated from each other. An illustration of the segmentation pipeline is provided in [Figure 2].{Figure 2} Automated or manual adjustment of the module parameters correspond to optimization of a multidimensional function f(x1 , x2., xn), where n corresponds to the full number of parameters of all the modules. For this study we assume that each parameter is discretized and can be represented by a finite set of ordered values. Minimum FilterBased Preprocessing In the preprocessing stage, a Gaussian smoothing filter with standard deviation σM ∈ {1,3.,19} (so x1 = σ, see above) is applied for noise reduction. Then, a difference imagingbased shading correction technique is performed. For shading correction, the minimum filtered background image is subtracted from the smoothed image. The width and height of the minimum filter are set to 2ε +1, with ε ∈ {1,2.,10} (so x2 = ε and so on) to enable efficient adjustment of the kernel size. As the runtime of the minimum filter usually increases with the radius of the structuring element, the implementation of Van Herk [16] is used, which only requires six comparisons per pixel, independent of the kernel size. KMeans Clustering Based FigureGround Separation For the separation of the fore and background pixels, a method based on kmeans clustering [17] is applied. Note that this technique can be implemented efficiently using the histogram of the image. [18] Compared to alternative figureground separation routines, this method enables easy adjustment of the threshold level by changing the number of clusters, k ≥ 2. After clustering, each of the clusters is assigned to the fore or background classes. Assuming that the dynamics of the cells are large compared to the dynamics of the background, the darkest cluster can be regarded as the background. All the remaining clusters are interpreted as the foreground. For the opposite case with dynamics of the image background being larger than the dynamics of the image foreground, clustering with k ≥ 2 results in oversegmentation of the image. To enable a robust, figureground separation for such image data, the rules for assigning each cluster to the fore or background must be modified. Hence, only the brightest cluster is regarded as the foreground. All the remaining clusters are interpreted as the background. For adaption to data from different fluorescence imaging domains, parameterization of kmeans clustering must be adjustable, such that, both modes for assigning clusters to the fore and background are supported. Therefore, an auxiliary variable k0 is introduced, encoding the number of clusters k. By optimizing the parameter k0 , the proposed kmeans clusteringbased figureground separation method supports both modes for assigning clusters to the fore or background. For k0 ≥ 2, the number of clusters k is set to k = k0 , and the cells are assumed to be represented by all clusters, except the darkest cluster. If k0 < 2, the brightest cluster is regarded as the foreground. As both modes result in equal results if clustering is performed with two clusters, k0 = 1 shall correspond to kmeans clustering with three clusters and only the brightest cluster shall be interpreted as the foreground. Accordingly, the number of clusters is set to k = 4 − k0 if k0 < 2. For an illustration of the segmentation results using varying values for k0, see [Figure 3]. The automated parameter optimization range of k0 is restricted to k0 ∈ {−8,−7.,12}.{Figure 3} Watershed and Seeded Watershed Transform The watershed transform is a commonly used method for splitting of touching cells. For splitting of approximately round or oval cells, a distance transformed binary image is used as an input for the watershed transform. The binary image is obtained from the kmeans clustering, as described above. Alternatively, cells can be split by using the information on gradient magnitude. In our study, we have applied a hybrid watershed transform, [19] which combines the distancetransformed image with information on the gradient magnitude. Denoting the gradient magnitudebased image as ∇I (x, y) and the distance transformed binary image as ID (x, y), the input for the hybrid watershed transform IH (x, y) is defined as: [INLINE:1] Where αw ∈ {0,0.1,...,1} is a weighting factor that allows balancing between the impact of the gradient magnitudebased and the distance transformbased components. In order to reduce oversegmentation, a Gaussian smoothing filter with standard deviation σw ∈ {1,2,...,10} is applied for smoothing I H (x, y), thus reducing the oversegmentation artifacts. In contrary to the watershed transform, the seeded watershed transform is not based on local intensity maxima. Instead, cell nuclei that have been previously segmented in the DAPI channel are used as seeds. The seeded watershed transform works analogous to the watershed transform and uses parameters αws ∈ {0,0.1,...,1} for balancing of gradient and distance information and σws ∈ {1,2,...,10} for smoothing of the hybrid image. Segmentation Accuracy Assessment A performance metric is required to determine the similarity between the ground truth data and the segmentation results. The Jaccard similarity enables pairwise comparison of the segmentations. Let the ground truth be represented by a set of pixels Sgt and the automatically generated segmentation by Sres. The Jaccard similarity OJ ∈ [0,1] is then defined as: [INLINE:2] Measurement of the Jaccard similarity only takes the qualitative aspect overlap into account. Information on the number of correctly identified cells, NTP , number of missed cells, NFN , and number of erroneously detected cells, NFP , is not reflected by the performance metric. In order to evaluate the quantitative information, accuracy Oa of the segmentation can be utilized: [INLINE:3] Determining NTP, NFP , and NFN requires a mapping between all ground truth objects and the objects from the segmentation result. For this mapping, each of the segmented objects is assigned to the best fitting ground truth object. Thereby, only a single segmented object can be assigned to each ground truth region. Additionally, a rejection criterion is implemented. Regions are rejected in case of exceeding the image boundary. Objects assigned to the rejection class will be excluded from evaluation and not influence the performance measurement. Hence, a decision rule for determining if the segmented region of interest corresponds to a rejection ground truth object or to a valid segmentation result, is required. For this study, regions that in terms of Jaccard similarity better fit a rejection class region than any ground truth region are assigned to the rejection class and excluded from evaluation. In order to combine quantitative and qualitative performance measurements, the combined Jaccard metric is defined. Denoting an optimal pair of corresponding regions Sgt and Sres as [INSIDE:1] with i ∈ 1,2., NTP , the combined Jaccard metric [INSIDE:2] is defined as: [INLINE:4] Optimization of Multidimensional Functions Different strategies exist for the optimization of such multidimensional functions. Some optimization strategies enable efficient optimization, based on the derivative of f. As a derivative of f can only be estimated by finite differences, applicability of such functions is not discussed in this article. Instead, we focus on the optimization methods that do not require estimation of derivatives. Hill Climbing The basic hill climbing method is an iterative algorithm applicable to optimization of multidimensional functions. Based on an initial solution, the hill climbing algorithm attempts to find a better solution by increasing or decreasing a single parameter x i, i = 1.., n. If the new solution outperforms the previous solution, the new solution is accepted and the search continues. This method is repeated until no further improvements can be found. Due to the local search, the hill climbing algorithm is sensitive to local performance maxima. Steepest Ascent Hill Climbing Contrary to the basic hill climbing algorithm, the steepest ascent hill climbing method evaluates the performance for all points in a local neighborhood, around the current solution. From these points, only the solution yielding the highest performance is accepted. Convergence of this optimization scheme is assumed if no further improvement can be identified. The steepest ascent hill climbing algorithm is also sensitive to the local performance maxima. Coordinate Descent The coordinate descent optimization method is an approach for minimization of a multidimensional function f (x1, x2., x n) that does not require gradient information. Outcome of this method depends on its initialization. Using the coordinate descent, each parameter is optimized individually by a brute force approach in every single iteration. Hence, for the first iteration, parameter x1 is varied and optimized. Next, parameters x2 , x n are optimized. This coordinatewise optimization of all parameters is then repeated until no new performance maximum is determined. Results obtained by utilizing this search method depend on the initialization and identification of the global optimum, which is not guaranteed. For details on convergence of the coordinate descent method see Luo et al.[20] Genetic Algorithm Due to its ability to jump out of local extrema and its capability to efficiently optimize multidimensional objective functions, genetic algorithms [21] are often used for automatic parameter optimization of complex multidimensional functions. Using genetic algorithms, the set of parameters is regarded as a genome, which consists of a set of alleles. Each of the alleles thereby represents a parameter. In this study, discrete parameters defined by a minimum value, a maximum value, and a step size are used to reduce the size of the parameter space. For initialization of the genome a random initializer is used. After initialization, exploration of the parameter space is performed based on mutation and crossover operations. For this study, probability that a crossover operation occurs is set at pcross = 0.5 and probability that a mutation occurs at pmut = 0.2. This results in a crossover operation for every second individual and an average amount of ≈ 1 mutation per individual. Convergence of the genetic algorithm is assumed, if no better individual has been determined for 200 iterations. Using this convergence criterion in Praxis, about 1000 iterations have to be performed until convergence, using the described fluorescent macrophage data. Manual Parameter Optimization In the following, manual optimization strategies applied by expert users as well as nonexpert users are briefly analyzed. Nonexpert users in this context refer to users that do not have an understanding of the underlying image processing pipeline. For both user groups, initialization of the search space is determined by a guess, including a potential prior estimate. Manual parameter optimization for nonexpert users is often performed by systematically incrementing or decrementing the parameters. This strategy can be compared to a hill climbing algorithm. Contrary to this, expert users often know the parameter, whose modification results in the strongest increase in segmentation performance. Assuming that their intuition is right, it results in the steepest ascent hill climbing algorithm, otherwise, their optimization strategy is similar to the hill climbing algorithm. Experimental Setup In the following experiment, the applicability of strategies for manual as well as automated parameter optimization is discussed. The applied segmentation pipeline combines the described minimum, filterbased, preprocessing routine, the figureground separation based on kmeans clustering and watershed or seeded watershed transform, for cell splitting. Both segmentation pipelines require the user to adjust a total of five parameters. To enable visualization, parameter spaces are investigated separately for each step of the segmentation engine. Therefore, all parameters are automatically adjusted for a set of images by applying a genetic algorithm, such that, the combined Jaccard similarity is maximized. Note that no crossvalidation is included because a single parameter set that bestrepresents the complete dataset is required. Assuming that the global performance maximum has been determined by the genetic algorithm, parameters of each stage are individually varied around the performance optimum, to enable visualization of the corresponding parameter spaces. Results Based on the optimal parameterization of the segmentation pipeline, the parameter spaces of preprocessing, figureground separation, and object splitting can be visualized. Therefore, the combined Jaccard similarity is evaluated for all possible parameter combinations. [Figure 4] shows the resulting parameter spaces using the watershed transform or the seeded watershed transform, for splitting of the cells. For the watershed transform, the parameter space of minimum preprocessing [Figure 4]a thereby shows several local performance maxima. The parameter space of the kmeans, clusteringbased, figureground separation [Figure 4]b also shows local performance maxima, whereas, the parameter space for watershed transformbased cell splitting [Figure 4]c is very monotonous. Using a segmentation pipeline with seeded watershed transformbased splitting of the cells, the minimum filterbased preprocessing again shows several local performance maxima [Figure 4]d. Contrary to the previous segmentation pipeline, the parameter space for figureground separation [Figure 4]e is very monotonous and does not show any local performance maxima. However, the parameter space of the seeded watershed transform shows several local performance maxima [Figure 4]f.{Figure 4} Discussion and Conclusions In the presented study, a representative segmentation pipeline is described, which requires the adjustment of five parameters. Using fluorescence microscopy image data as an example, the parameters were automatically adjusted toward a manually generated ground truth, to obtain an optimal parameterization of the segmentation pipeline. By varying the parameters for preprocessing, figureground separation, and cell splitting, the parameter spaces were visualized for each of the modules. The depicted illustrations show that most parameter spaces contain several local performance maxima, which pose a challenge for many automated parameter optimization methods. The obtained results indicate that the genetic algorithm outperforms other approaches in solving our optimization problem. However, the time required until convergence by optimizing the ground truth data, consisting of 20 images, is around six hours for the genetic algorithm. Application of the hill climbing algorithm or the improved steepest ascent hill climbing algorithm probably results in a local performance minimum, depending on the initialization [Figure 5].{Figure 5} The convergence time for the hill climbing algorithm is much faster than the convergence of the genetic algorithm and usually lies below one hour. For the coordinate descend optimization method, the convergence time is about two hours. Based on visualization of the parameter spaces, a conclusion may be drawn that the coordinate descent also enables identification of the global performance optimum, nearly independent from the initialization. Hence, the coordinate descent can be applied as a more timeefficient alternative to the genetic algorithm. On the basis of these results, a conclusion may be drawn that the underlying parameter spaces pose a challenge for manual as well as automated parameter optimization, as the parameter spaces can show several local performance maxima. Hence, optimization strategies that are not able to jump out of local performance maxima, like the hill climbing algorithm, often result in a local maximum. In order to decrease the liability to the local maxima, the coordinate descent or genetic algorithmbased search strategies are recommended. Acknowledgements We would like to thank the DFG for funding the SFB 796 in projects A4, C3, and Z and Prof. Roland Lang and Jens Wenzel from the Institute of Clinical Microbiology, Immunology, and Hygiene, University Hospital Erlangen, for providing the image data used for this study. References


