- Research article
- Open Access
Semiautomatic segmentation of the kidney in magnetic resonance images using unimodal thresholding
BMC Research Notesvolume 9, Article number: 489 (2016)
Total kidney volume (TKV) is an important marker for the presence or progression of chronic kidney disease, however, routine ultrasonography underestimates renal volume to a high and varying degree.
The aim of this work was to adapt and evaluate a semi-automatic unimodal thresholding method for volumetric analysis of the kidney in native T2-weighted magnetic resonance (MR) images.
In a group of healthy volunteers (n = 24; 48 kidneys), we defined a region of interest (ROI) by manually tracing the outline of the kidney in every MR image. An automatic unimodal thresholding algorithm with visual feedback was applied to the probability distribution function of voxel intensities in the ROI to remove intrarenal non-parenchyma volume. For comparison, reference volumes were created by manual segmentation. Intra- and inter-observer reliability was evaluated.
There was a small, significant mean difference of 1.5 ml between semi-automatically and manually segmented TKV (p = 0.009, 95% CI [0.4, 2.7]). While intra-observer reliability was good (mean difference 2.9 ml, p < 0.01, 95% CI [1.5, 4.2]) there was a small but significant mean difference of 4.8 ml (p < 0.01, 95% CI [3.6, 5.9]) between the TKV results of different observers. Reference volume correlations were excellent (r = 0.97–0.98). Semi-automated segmentation was significantly faster than manual segmentation; mean difference = 234 s [91–483 s]; p < 0.05. Automatic unimodal thresholding removed a considerable mean volume of 18.7 ml (13.1%) from the coarse manual pre-segmentations.
Unimodal thresholding of native MR images is a robust and sufficiently reliable method for kidney segmentation and volumetric analysis. The manual pre-segmentation can be done by non-experts with little introduction.
Chronic kidney disease (CKD) is a public health problem with a growing incidence in aging populations [1, 2]. A change in kidney volume is a potential marker for the presence or progression of CKD. Kidney volume also helps to determine therapy and invasive diagnostics, e.g. kidney biopsy . In ultrasonography, TKV is usually calculated by a variation of the ellipsoid formula. Unfortunately, this approximation significantly underestimates the kidney volume and suffers from poor repeatability [3, 4]. Therefore more accurate, reproducible and fast alternatives for determining TKV are desirable. Magnetic resonance imaging (MRI) uses a tomographic method, is non-invasive and does not expose patients to ionizing radiation, contrary to computed tomography. It should be acknowledged that MRI-based examinations are limited by the relatively high costs, the partly limited access to scanners (e.g., compared to ultrasound), some contraindications (e.g., subjects with some types of metal implants or pacemakers), and the potentially reduced convenience and tolerability for patients (e.g., claustrophobia). Due to the fact that the acquisition of MR images takes longer than the acquisition of CT images motion artifacts may occur, especially when imaging breath-dependent organs such as the kidney.
Manual slice by slice segmentation of the kidney parenchyma, excluding e.g. intrarenal vessels, calyces, cysts, is time intensive and user-dependent. There are various approaches that fully or partly automate the process . Fully automatic segmentation of the kidney in MR images is a challenging problem, owing to noise, low contrast, artifacts and the highly variable shape of the organ. Examples for state of the art approaches include methods based on the generation of probability maps  and neural networks . Semiautomatic solutions often combine local adaptive algorithms, which detect edges and contours in an image, with an interactive mouse tool. The user interactively ‘brushes’ the kidney with the tool and achieves faster segmentation. Local adaptive algorithms have parametric properties that need to be adjusted manually by trial and visual feedback, depending on image quality, MR sequence and, most importantly, the shape and structure of the kidney itself. This leads to operator dependence in adjusting these properties and can be time intensive.
The aim of this work was to develop and evaluate a semi-automatic unimodal thresholding method for volumetric analysis of the kidney in native T2-weighted MR images. Unimodal thresholding has first been described by Rosin et al.  and works with any image that has a unimodal intensity distribution. Rosin et al.  evaluated images of various modalities with good results, including CT angiograms. There also exists an adaptation for volumetric segmentation of 3D breast images acquired with cone-beam computed tomography. Our adaptation is based on the distribution of the voxel intensities in a rough, manually pre-segmented region of interest (ROI) of the kidney. To our knowledge, the proposed implementation is the first in the MRI domain.
In this study, 24 healthy subjects were chosen (11 males, 13 females). The mean age was 26 years (range: 21–41 years). The body mass index of all subjects was on average 21.8 kg/m2 (range: 18.9–24.8 kg/m2). The study recruitment process began in July 2015 and was conducted using an advertisement sent to our circle of acquaintances. The informed consent documents delineated the MR exam and informed the volunteer that data collected during the exam could be used for research purposes. Informed consent forms were signed by every volunteer in the study. The study was executed according to the Declaration of Helsinki and ‘good clinical practice’ (GCP) guidelines. The institutional review board of the University Hospital Erlangen/Germany approved the study.
In 19/24 MR examinations, we used a 1.5 Tesla MR Scanner (Magnetom Avanto, Siemens, Erlangen, Germany). The remaining 5/24 examinations were acquired on a 3 Tesla MR scanner (Magnetom Verio, Siemens, Erlangen, Germany). The volunteers were examined in supine position with a standard abdominal phased array coil (Siemens, Erlangen, Germany). For each subject, a volume containing both kidneys was scanned using a T2 turbo spin echo (TSE) sequence with prospective acquisition correction (PACE) in axial orientation and the following parameters: TR 4400 ms, TE 88 ms, bandwidth 260 Hz/px, acquisition matrix 328 × 288 px, slice thickness 4.4 mm, in-plane resolution 1.0 × 1.0 mm.
For the manual pre-segmentation we used the Multi-image Analysis GUI (Mango) image processing system.Footnote 1 Mango was selected because it is easy to use and employs the standard Neuroimaging Informatics Technology Initiative (NIfTI) file format for saving image volumes and ROI.
The unimodal thresholding algorithm was implemented by one of the authors (M.S.) as a custom script in Python, version 2.7Footnote 2 using the Scipy ecosystem, Numpy,Footnote 3 nibabelFootnote 4 and scikit-imageFootnote 5 libraries. Spyder, a scientific development environment,Footnote 6 and part of the Anaconda software suiteFootnote 7 was selected for ease of use.
The script takes an abdominal MRI and a ROI around one kidney, both in NifTI-format, as input. From all voxel intensities within the ROI, a probability distribution function (PDF) is generated with a kernel-density estimate using a Gaussian kernelFootnote 8 and Scott’s method . The maxima of the PDF are detected automatically by a peak-seeking algorithm.Footnote 9 If there is more than one maximum, the user has the choice to depart from the default, which is the global maximum of the PDF. Two points on the PDF with a distance of one standard deviation left and right of its maximum are calculated. Straight lines are drawn from the peak to these points. The area of a right triangle, its hypotenuse part of the respective line and its right vertex on the PDF, is maximized—this is equivalent to the maximum perpendicular distance between the PDF and the line. The upper and lower threshold is determined by the right vertices on the PDF (Fig. 1a). For visual feedback in all slices of the kidney volume the manual ROI and all voxels within the upper and lower threshold are shown as a colored line and colored area (Fig. 2).
A manual segmentation was performed with Photoshop (Version CS6, Adobe Systems, San Jose, CA, USA). The entire kidney parenchyma was segmented from the surrounding tissues manually on the T2-weighted MR images using knowledge about the shape, location and structure of the kidney. The contours of both kidneys were carefully drawn manually in each slice for each volunteer. The manual segmentation was performed by a medical student. A board-certified radiologist [6 years of work experience (M.H.)] verified and corrected the segmentation where necessary. These delineations were considered as the reference volume.
The statistics in the study were calculated using R, version 3.2.1.Footnote 12 For the inter-observer variation study, manual segmentation of all 48 kidneys was performed by two independent observers, one medical postgraduate and one layperson. An intra-observer variation study was performed by comparing two segmentation groups of all 48 kidneys by the same observer (medical postgraduate) with a 6 months minimum time difference. These results were compared to a reference volume, of all 48 kidneys, obtained by a purely manual segmentation (see above) by another independent observer. All observers were blind to the others results. Correlations were calculated using Pearson’s product-moment correlation coefficient. Statistical differences between groups were compared with a paired t test where p < 0.05 was considered significant. Comparisons were performed with a linear regression analysis and the Bland–Altman method .
Voxel histogram and probability distribution function (PDF)
Except for two kidneys, the PDFs are unimodal which means that they have a single global maximum (Fig. 3). Kidneys 016L and 023L have two local maxima (bimodal). The default method uses only the global maximum; local maxima are ignored. Choosing the maxima near the respective lines as their peak points (Fig. 1b) and comparing the result to the default we found a maximum volume difference of 1.2 ml for kidney 016L (observer 1) and of 0.4 ml for kidney 023L (observer 2). We consider these differences negligible. The results reported below are based on the automatic global maximum method.
Throughout the evaluation, the areas marked by the automatic threshold corresponded well to anatomical features like vessels and calyxes (Fig. 2).
Using unimodal thresholding, the mean TKV was 143.2 ± 29.0 ml; 146.3 ± 28.0 ml for the left kidney and 140.1 ± 29.8 ml for the right kidney; 157.1 ± 26.2 ml for male and 131.4 ± 25.9 ml for female subjects; see Table 1.
There was a small, significant mean difference of 1.5 ml between the volumes acquired by semi-automated segmentation and the reference volume (95% CI [0.4, 2.7], N = 144, p < 0.01, paired t test). The mean absolute difference (MAD) was 5.5 ml.
Repeated measurements by the same observer (intra-reader reliability) showed a small, significant variability; the mean difference was 2.9 ml (95% CI [1.5, 4.2], N = 48, p < 0.01, paired t test), the MAD was 4.3 ml.
To evaluate inter-reader reliability, both segmentation groups of observer 1 were compared to one segmentation group of observer 2. We found a small, significant mean difference of 4.8 ml (95% CI [3.6, 5.9], N = 96, p < 0.01, paired t test) and a MAD of 6.0 ml. A detailed summary is shown in Tables 2 and 3.
Comparing the results acquired by the same observer, we found excellent correlation (r = 0.99, t = 44.1, p < 0.01, 95% CI [0.98, 0.99]) and excellent agreement (mean value of differences = 2.9 ± 4.5 ml, 95% CI [−6.1, 11.8]) (see Fig. 4).
Comparing the results acquired by different observers, we also found excellent correlation (observer 1 (1)/observer 2: r = 0.98, 95% CI [0.96, 0.99], p < 0.001; observer 1 (2)/observer 2: r = 0.99, 95% CI [0.98, 0.99], p < 0.01) and good agreement (observer 1 (1)/observer 2: mean value of differences = 6.2 ± 6.5 ml, 95% CI [−6.6, 19.0], observer 1 (2)/observer 2: mean value of differences = 3.3 ± 4.7 ml, 95% CI [-6.0, 12.6]), see Fig. 5.
Comparing the three unimodal thresholding TK volumes to the reference volume, there was again excellent correlation (Table 3) and excellent agreement (Fig. 6). For the manual segmentation, the time required per kidney was 408 ± 105 s. The semi-automated segmentation took 174 ± 38 s and, therefore, was significantly faster.
The mean automatically removed volume of all observers was 18.7 ml (13.1%), shown in Table 4.
The aim of this work was to develop and evaluate a fast, robust, accurate and reliable method for volumetric analysis of the kidney in native T2-weighted MR images. We developed an automatic algorithm to partition the PDF of the voxel intensities within a rough and manually acquired kidney ROI. This thresholding method of essentially unimodal distributions has been described and tested with images of various modalities . It has also been successfully adapted for volumetric segmentation of 3D breast images acquired with cone-beam computed tomography .
It should be mentioned that the obtained thresholds are not fixed. They are case-dependent and vary with the peak location of the PDF derived from the ROI intensity histogram. This is an important advantage, because MR intensity is arbitrary and varies between subjects, scanners and imaging sessions. Of note, MR intensity normalization techniques are commonly based on the image histogram . In our case the underlying assumptions are that the kidney parenchyma, consisting of a cortex and medulla, is the predominant tissue in the kidney ROI and has characteristic intensity values that are different from non-parenchyma tissue. This then shows as a peak in the intensity histogram of the ROI and the derived PDF. We used points on the PDF with a distance of one standard deviation left and right of its maximum rather than the left and right ends of the PDF. This empirical modification was necessary because, without it, right-sided outliers (high intensity) would lead to markedly higher upper thresholds and a systematically too high TKV. Also, the lower threshold would otherwise be influenced by the location of the maximum of the PDF.
The method was tested on 24 scans of healthy volunteers (48 kidneys). It removed a considerable mean volume of 13.1% from the manually pre-segmented ROI. On visual inspection this volume consisted predominantly of non-parenchyma tissue such as blood vessels and calyxes. Renal volume, listed in Table 1, is in good accordance with values reported in the literature . Intra- and inter-observer reliability was good, and the results correlated well with a manually segmented reference volume.
The unimodal thresholding approach was proposed by Rosin et al. more than 15 years ago . Since then, numerous approaches were described in the domain of MR image segmentation [5–7]. Our proposed method requires a rough manual pre-segmentation of the kidney. However, it affords significant time savings compared to a fine manual segmentation and could be used for generating training data for segmentation approaches that employ machine learning techniques. The necessary coarse manual pre-segmentation could be done by non-experts with little introduction. The end-results should be independent of the software used for this step. This could be advantageous for the evaluation of large data sets. Kidney volume was determined automatically at completion of the unimodal thresholding algorithm and we did not make any corrections. Our study has some limitations. The unimodal thresholding only removes voxels that differ somewhat from the predominant intensity within the manually pre-segmented ROI. If voxels with similar intensity (spleen, pelvis, vascular tissue, dull margins due to partial volume effects) are included in the pre-segmentation they will not be removed by the subsequent automatic unimodal thresholding. This leads to a false high TKV.
Additionally, if a lot of near isointense non-parenchyma voxels are included in the manual pre-segmentation the intensity distribution of the whole ROI could be altered in a way that leads to higher thresholds compared to a sufficiently accurate pre-segmentation, resulting in a systematic error towards a higher TKV. We did not see this error in our pre-segmentations (observer 2 is a layperson); therefore, we did not see this property having a considerable impact.
However, we saw small but significant systematic differences between observation groups (Table 2). We consider these to be mainly a manifestation of user dependability concerning the manual segmentation, which is the tendency of the user to include or exclude disputable volume that may or may not be part of the kidney. These differences were more pronounced for left kidneys than for right kidneys (Table 3), which is most likely due to the proximity of iso-intense spleen tissue. These systematic errors were sufficiently small for us to consider acceptable.
For the evaluation of our semiautomatic segmentation approach, we used images of healthy volunteers. Further research is necessary to show how the algorithm works with data of patients that could have a worse image quality, different intensity distributions within the kidney or lower contrast between renal parenchyma and surrounding tissue.
For the clinical setting a more interactive process would be desirable, such as unimodal thresholding executed every time the user draws a kidney contour in an image slice. For diseased kidneys it could be beneficial if the user had the ability to manipulate thresholds along the pdf with instant feedback.
Lastly, all of the software used is open source for research purposes. It stands to reason that this thresholding method is promising also for different tissues and different image modalities, as long as the intensity distribution of the voxels in the region of interest is nearly unimodal. The Python code for the implementation of the unimodal thresholding algorithm consists of a central script, several functions and sub functions. The Python code is available as additional material (Additional file 1).
Unimodal thresholding of native MR images is a robust and sufficiently reliable method for kidney segmentation and volumetric analysis. The manual pre-segmentation can be done by non-experts and with a brief introduction. Future research includes the evaluation of kidney voxel distributions at different magnetic field strengths, the evaluation of alternative MR protocols, embedding of unimodal thresholding in a more interactive protocol and the investigation of patient data as well as large data sets.
[http://ric.uthscsa.edu/mango] Mango. Accessed 10 June 2016.
[http://www.python.org] Python. Accessed 10 June 2016.
[http://docs.scipy.org/doc/] Numpy and Scipy Documentation. Accessed 10 June 2016.
[http://nipy.org/nibabel/] Nibabel. Accessed 10 June 2016.
[http://scikit-image.org/]Scikit-image. Accessed 10 June 2016.
[https://pythonhosted.org/spyder/] Spyder. Accessed 10 June 2016.
[http://docs.continuum.io/anaconda/index] Anaconda. Accessed 10 June 2016.
[http://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.gaussian_kde.html] SciPy Kernel Density Estimate. Accessed 7 June 2016.
[https://github.com/demotu/BMC/blob/master/functions/detect_peaks.py] DetectPeaks. Accessed 10 June 2016.
[https://www.sqlite.org/] SQLite. Accessed 10 June 2016.
[https://addons.mozilla.org/en-us/firefox/addon/sqlite-manager/] SQLite Manager. Accessed 10 June 2016.
[https://www.r-project.org/] The R Project for Statistical Computing. Accessed 10 June 2016.
chronic kidney disease
good clinical practice
graphical user interface
mean absolute difference
magnetic resonance imaging
Neuroimaging Informatics Technology Initiative
prospective acquisition correction
probability distribution function
region of interest
total kidney volume
time to echo
time to repeat
turbo spin echo
Zhang Q-L, Rothenbacher D. Prevalence of chronic kidney disease in population-based studies: systematic review. BMC Public Health. 2008;8:117. doi:10.1186/1471-2458-8-117.
El Nahas AM, Bello AK. Chronic kidney disease: the global challenge. Lancet. 2005;365:331–40. doi:10.1016/S0140-6736(05)17789-7.
Bakker J, Olree M, Kaatee R, de Lange EE, Moons KGM, Beutler JJ, Beek FJA. Renal volume measurements: accuracy and repeatability of US compared with that of MR imaging. Radiology. 1999;211:623–8. doi:10.1148/radiology.211.3.r99jn19623.
Cheong B, Muthupillai R, Rubin MF, Flamm SD. Normal values for renal length and volume as measured by magnetic resonance imaging. Clin J Am Soc Nephrol. 2006;2:38–45. doi:10.2215/CJN.00930306.
Zöllner FG, Svarstad E, Munthe-Kaas AZ, Schad LR, Lundervold A, Rørvik J. Assessment of kidney volumes from MRI: acquisition and segmentation techniques. Am J Roentgenol. 2012;199:1060–9. doi:10.2214/AJR.12.8657.
Gloger O, Tonnies K, Laqua R, Volzke H. Fully automated renal tissue volumetry in MR Volume data using prior-shape-based segmentation in subject-specific probability maps. IEEE Trans Biomed Eng. 2015;62:2338–51. doi:10.1109/TBME.2015.2425935.
Goceri N, Goceri E. A neural network based kidney segmentation from MR images. In: 2015 IEEE 14th Int Conf Mach Learn Appl. Anaheim: ICMLA; 2015. p. 1195–8.
Scott DW. Multivariate density estimation: theory, practice, and visualization. New York: Wiley; 2015.
Bland JM, Altman DG. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8:135–60.
Rosin PL. Unimodal thresholding. Pattern Recognit. 2001;34:2083–96. doi:10.1016/S0031-3203(00)00136-9.
Chen Z. Histogram partition and interval thresholding for volumetric breast tissue segmentation. Comput Med Imaging Graph. 2008;32:1–10. doi:10.1016/j.compmedimag.2007.07.007.
Shinohara RT, Sweeney EM, Goldsmith J, Shiee N, Mateen FJ, Calabresi PA, Jarso S, Pham DL, Reich DS, Crainiceanu CM. Statistical normalization techniques for magnetic resonance imaging. Neuroimage Clin. 2014;6:9–19. doi:10.1016/j.nicl.2014.08.008.
All authors have made substantial contributions to conception and design, or acquisition of data, or analysis and interpretation of data. MH carried out data analysis, data collection, and manuscript revision. RJ was involved in project development, data analysis, data collection and manuscript revision. MS carried out project development and programming, data analysis, data collection, data management, manuscript drafting and manuscript revision. HS carried out data collection, and manuscript revision. RT carried out data collection. MU was involved in project development. All authors have given final approval of the version to be published and agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved. All authors read and approved the final manuscript.
This work contains parts of Martin Sandmair’s doctoral thesis.
The authors declare that they have no competing interests.
Consent for publication
We confirm that we have given due consideration to the protection of intellectual property associated with this work and that there are no impediments to publication, including the timing of publication, with respect to intellectual property. In so doing we confirm that we have followed the regulations of our institutions concerning intellectual property. Consent to publish was obtained from all individual participants included in the study.
Ethics approval and consent to participate
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
The study was approved by the institutional review board of the University Hospital Erlangen/Germany.
This research has been supported by the Smart Data Program in the KDI project of the Federal Ministry for Economic Affairs and Energy, Germany (01MT14001E).
Martin Sandmair and Matthias Hammon contributed equally to this work