A ranking index for quality assessment of forensic DNA profiles
- Johannes Hedman^{1, 2},
- Ricky Ansell^{1, 3} and
- Anders Nordgaard^{1, 4}
https://doi.org/10.1186/1756-0500-3-290
© Nordgaard et al; licensee BioMed Central Ltd. 2010
Received: 2 July 2010
Accepted: 9 November 2010
Published: 9 November 2010
Abstract
Background
Assessment of DNA profile quality is vital in forensic DNA analysis, both in order to determine the evidentiary value of DNA results and to compare the performance of different DNA analysis protocols. Generally the quality assessment is performed through manual examination of the DNA profiles based on empirical knowledge, or by comparing the intensities (allelic peak heights) of the capillary electrophoresis electropherograms.
Results
We recently developed a ranking index for unbiased and quantitative quality assessment of forensic DNA profiles, the forensic DNA profile index (FI) (Hedman et al. Improved forensic DNA analysis through the use of alternative DNA polymerases and statistical modeling of DNA profiles, Biotechniques 47 (2009) 951-958). FI uses electropherogram data to combine the intensities of the allelic peaks with the balances within and between loci, using Principal Components Analysis. Here we present the construction of FI. We explain the mathematical and statistical methodologies used and present details about the applied data reduction method. Thereby we show how to adapt the ranking index for any Short Tandem Repeat-based forensic DNA typing system through validation against a manual grading scale and calibration against a specific set of DNA profiles.
Conclusions
The developed tool provides unbiased quality assessment of forensic DNA profiles. It can be applied to any DNA profiling system based on Short Tandem Repeat markers. Apart from crime-related DNA analysis, FI can therefore be used as a quality tool in paternity or familial testing as well as in disaster victim identification.
Background
In recent years, several statistical models and expert systems have been developed to streamline and simplify the routine evaluation of forensic DNA profiles [1–3], to aid in the interpretation of mixed DNA profiles [4, 5] and to estimate the risk of encountering artifact peaks and/or allelic drop-outs [6–9]. However, assessment of DNA profile quality is generally not quantified or treated in an unbiased way. For example, in most studies comparing the performance of different forensic DNA analysis protocols, DNA profile quality is assessed by manual examination based on empirical knowledge and/or by comparing the intensities (allelic peak heights or areas) of the electropherograms (EPGs)/DNA profiles [10–14]. Manual examination has obvious drawbacks: it is difficult to reproduce and to automate. Intensity is a reasonable quality measure but may be misleading if the allelic peak balance is not taken into account.
We recently developed the forensic DNA profile index (FI), a ranking index for unbiased and quantitative quality assessment of forensic DNA profiles [15]. FI combines intensity and balance into one single, easily interpretable numerical index. FI is constructed by using Principal Components Analysis (PCA) on the following DNA profile quality measures: total allelic peak height (intensity), balance between allelic peaks within heterozygous loci (intra-locus or local balance), and balance between STR markers (inter-loci or global balance). The ranking index is based on empirical data taking into account statistical properties of such data as well as common opinions about what is considered a high or low quality EPG/DNA profile. Here we present the construction of FI, describing the applied mathematical and statistical methodologies. We show how to adapt the ranking index for any STR-based forensic DNA typing system through validation against a manual grading scale and calibration against a specific set of DNA profiles.
Methods and results
This section describes the construction of the ranking index. First, we define the three quality measures that are used to create FI, and show how PCA is used to combine these measures into one single value. Second, we describe how FI is validated against a manual grading scale and how it may be calibrated against a calibration set of DNA profiles.
Methodology
Basic measures of DNA profile quality
Consider the EPG/DNA profile presented in Figure 1A. An experienced reporting officer would have no difficulty identifying what is acceptable or not in this DNA profile. However, there is no obvious way of immediately ranking the DNA profile without careful comparisons with other "competing" profiles. For that we need to define what we are supposed to look for in the EPG and how our observations could be summarized in a more compact way. This section generally follows what was published in Hedman et al. (2009), but with more details on central statistical issues.
A global measure of intra-locus or local balance is then obtained by taking the mean of these measures for all observed markers (Mean Local Balance): $MLB = M^{-1}\sum_{i=1}^{M} LB_i$, where $M$ is the number of analyzed STR markers. MLB is genotype dependent: DNA profiles from different people have different combinations of homo- and heterozygous STR markers, which affects the measure. In the extreme case, a person has only homozygous loci; all resulting DNA profiles would then get an MLB equal to 1, as long as there are no drop-outs. Discrepancies in summarized peak heights between loci (Figure 1) are less straightforward to handle. One approach could be to apply a measure of dispersion, like the standard deviation, but as such a measure is scale-dependent, data need to be standardized before it can be applied. We instead suggest using the Shannon entropy [16], which in our case is defined as $SH = -\sum_{i=1}^{M} p_i \ln(p_i)$, where $p_i$ is the relative contribution from marker $i$ to the total sum of peak heights (i.e., $p_i = TPH_i / \sum_{j=1}^{M} TPH_j$). SH varies between 0 and ln(M), where 0 is attained when only one marker has observable peaks and ln(M) is attained when the summarized peak heights in all markers are equal. Thus, the higher the value of SH, the greater the inter-loci balance. For example, if the DNA profile is made up of ten STR markers, SH has a maximum value of ln(10), or 2.30. SH can only be calculated for markers that contain peaks. However, drop-out markers generally lower the calculated SH by leaving fewer terms to include in the sum. Additionally, if there are allelic drop-outs in an EPG/DNA profile, the remaining markers generally exhibit poor intra-locus and inter-loci balance, further strengthening the validity of using SH as a quality measure.
Shannon entropy emerged within information theory, but has later become a useful measure in different areas, e.g., in studies of biodiversity, where a high entropy means great diversity of species within a habitat. In our case the analogues to species are observable peaks for a particular EPG/DNA profile. Each locus must have one or two alleles and in a good representation of the profile all loci included should be equally visible. Thus if all summarized peak heights are reasonably equal in the EPG, the profile can be considered to be globally balanced.
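The three measures can be illustrated with a short Python sketch. It uses the peak heights listed for the Figure 1B profile in the Discussion. The definition of the per-locus balance LB_i is not reproduced in this text, so the sketch assumes it is the ratio of the lower to the higher peak at a heterozygous locus and 1 at a homozygous locus; the function names are hypothetical, and the handling of drop-outs is deliberately left out.

```python
import math

# Peak heights (rfu) per locus, copied from the Figure 1B electropherogram
# table in the Discussion; homozygous loci have a single peak.
PROFILE_1B = {
    "D3S1358": [671, 706],
    "vWA": [714, 710],
    "D16S539": [227, 253],
    "D2S1338": [442],
    "D8S1179": [416, 431],
    "D21S11": [351, 317],
    "D18S51": [198, 190],
    "D19S433": [624],
    "TH01": [557],
    "FGA": [234, 243],
}


def total_peak_height(profile):
    """TPH: sum of all allelic peak heights in the profile."""
    return sum(sum(peaks) for peaks in profile.values())


def local_balance(peaks):
    """ASSUMPTION: LB_i = lower peak / higher peak; 1 for a homozygous locus."""
    if len(peaks) == 1:
        return 1.0
    return min(peaks) / max(peaks)


def mean_local_balance(profile):
    """MLB = M^-1 * sum(LB_i) over the M analyzed markers."""
    return sum(local_balance(p) for p in profile.values()) / len(profile)


def shannon_entropy(profile):
    """SH = -sum(p_i * ln(p_i)), using only markers that contain peaks."""
    locus_sums = [sum(p) for p in profile.values() if sum(p) > 0]
    total = sum(locus_sums)
    return -sum((s / total) * math.log(s / total) for s in locus_sums)


tph = total_peak_height(PROFILE_1B)
mlb = mean_local_balance(PROFILE_1B)
sh = shannon_entropy(PROFILE_1B)
print(f"TPH={tph}  MLB={mlb:.2f}  SH={sh:.2f}")  # TPH=7284  MLB=0.96  SH=2.19
```

Under these assumptions the sketch reproduces the TPH, MLB and SH values tabulated for the Figure 1B profile. Two markers with equal summarized peak heights give SH = ln(2), and a single marker gives SH = 0, matching the bounds stated above.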
From three measures to one using data reduction methods
FI is a so-called ranking index, a single number that can be used to rank DNA profiles according to quality. Such an index should be based on empirical data comprising several quality measures, in particular the ones that have been defined in the previous section. Constructing one single number from several measures means that some data reduction is necessary. We use Principal Components Factor Analysis [17, 18] for this purpose, and retain only the first principal component to represent the entities of interest provided the loadings of that component are consistent with each entity's relationship with the quality.
Principal components
where a_{1}, a_{2} and a_{3} are the estimated coefficients (or factor loadings) for this component. Further, if a_{1}, a_{2} and a_{3} are all positive the retained principal component will be large for high quality DNA profiles and small for low quality profiles.
The translated scores $\{tpc_i\}_{i=1}^{n}$ will all be greater than zero unless all the original variables are zero, which can never happen given how the three measures are constructed: if TPH is zero, calculation of SH is not meaningful, and for SH to equal zero there must be exactly one locus with detectable peaks (i.e., TPH > 0). We return to translated components later in this paper, but first it is necessary to improve the principal component with respect to ranking power.
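As a rough illustration of this step, the following Python sketch extracts a first principal component from standardized TPH, MLB and SH values and then translates the scores. Several things here are assumptions rather than the authors' procedure: the data are synthetic, the component is taken as a unit-norm eigenvector of the correlation matrix (plain PCA, without any factor-analysis scaling of the loadings), and the translation is assumed to subtract the score of a hypothetical all-zero profile, which is one reading of the statement that translated scores are positive unless all original variables are zero.

```python
import math
import random


def standardize(column):
    """Return z-scores plus the mean and sample standard deviation used."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / (len(column) - 1))
    return [(x - mean) / sd for x in column], mean, sd


def correlation_matrix(z_columns):
    """Correlation matrix of already standardized columns."""
    n = len(z_columns[0])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
            for ci in z_columns]


def first_eigenvector(matrix, iterations=500):
    """Dominant unit eigenvector by power iteration (start vector all ones)."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Synthetic calibration set (hypothetical values, not data from the study).
rng = random.Random(0)
quality = [rng.random() for _ in range(50)]
TPH = [500 + 20000 * q + rng.uniform(0, 2000) for q in quality]
MLB = [0.5 + 0.4 * q + rng.uniform(0, 0.1) for q in quality]
SH = [1.5 + 0.7 * q + rng.uniform(0, 0.1) for q in quality]

standardized = [standardize(col) for col in (TPH, MLB, SH)]
z_columns = [z for z, _, _ in standardized]
loadings = first_eigenvector(correlation_matrix(z_columns))  # a1, a2, a3

# First principal component score for each profile.
pc = [sum(a * z[i] for a, z in zip(loadings, z_columns))
      for i in range(len(TPH))]

# ASSUMED translation: subtract the score of a hypothetical all-zero profile.
pc_zero = sum(a * (0 - mean) / sd
              for a, (_, mean, sd) in zip(loadings, standardized))
tpc = [score - pc_zero for score in pc]

print("loadings:", [round(a, 3) for a in loadings])
print("smallest translated score:", round(min(tpc), 3))
```

Because the three synthetic measures are positively correlated, all loadings come out positive, so the component grows with quality. Under the assumed translation each score equals the loading-weighted sum of the raw measures divided by their standard deviations, and so stays above zero.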
Constructing the forensic DNA profile index (FI)
Combining the principal component with manual ranking to a ranking index
The principal component (1) is a natural base for the construction of a ranking index. It automatically takes into account the interrelationships between the embedded measures TPH, MLB and SH, which makes it less biased than any ranking procedure that uses the three measures separately. Nevertheless, although increasing scores of pc are consistent with improved EPG/DNA profile quality, nothing ensures that the rate of increase in pc corresponds to the rate of increase in profile quality. To achieve this, and at the same time obtain a numerically interpretable ranking index, pc must be validated against a ranking of profiles based on other arguments.
A particular calibration set
Scrutinizing (3), we see that the correlation between TPH, MLB and SH has resulted in a first principal component that puts the largest weight on the standardized intra-locus balance measure (mlb), while the standardized total sum of peak heights (tph) is less important. This result is not fully consistent with a DNA analyst's opinion, which would instead make the total sum of peak heights the dominant part of a quality measure. Nevertheless, (3) is considered sufficient to represent the variation in TPH, MLB and SH and forms the base of a final ranking index. Below we adjust (3) by validation against a scale consistent with the opinions of a DNA analyst.
Non-PCA based DNA profile ranking
The state of the art today is to evaluate DNA profiles manually, i.e., by visual inspection of the EPGs with consideration given to the heights of the allelic peaks. In general, peak heights dominate when two DNA profiles are compared, but aspects of peak balance, both local and global, are also taken into account. This is particularly the case when peak heights are small, whereas for moderate or large peak heights the balance aspects are less important. The two steps outlined below constitute an attempt to transform manual ranking to a numerical scale, based on manual rankings made by different analysts at the Swedish National Laboratory of Forensic Science.
1. Summarized peak heights, i.e., TPH in our notation, are classified into 15 intervals and each interval is coded with a rank according to Table 1. The lengths of the 15 intervals increase with TPH reflecting that for large enough peak heights the quality of the profile does not change that much with increasing TPH. The same argument goes for the choice of even-numbered ranks only for intervals between a TPH of 500 and a TPH of 10000, reflecting that a change in TPH at those levels has great impact on the quality.
Table 1. Manual grading scale (profile grades) for forensic DNA profiles, with intervals for summarized peak heights (TPH)
Interval | Profile grade |
---|---|
50000 ≤ TPH | 1 |
40000 ≤ TPH < 50000 | 2 |
30000 ≤ TPH < 40000 | 3 |
25000 ≤ TPH < 30000 | 4 |
20000 ≤ TPH < 25000 | 5 |
15000 ≤ TPH < 20000 | 6 |
12500 ≤ TPH < 15000 | 7 |
10000 ≤ TPH < 12500 | 8 |
7500 ≤ TPH < 10000 | 10 |
5000 ≤ TPH < 7500 | 12 |
2500 ≤ TPH < 5000 | 14 |
1000 ≤ TPH < 2500 | 16 |
500 ≤ TPH < 1000 | 18 |
0 < TPH < 500 | 19 |
TPH = 0 | 20 |
where Range(MLB) = (1 - min(MLB)) + (1 - max(MLB)), with min(MLB) and max(MLB) being the smallest and largest values, respectively, of MLB in the calibration set, and Range(SH) = (ln(10) - min(SH)) + (ln(10) - max(SH)), with analogous definitions of min(SH) and max(SH). The conditions in (4) relate to which of MLB and SH is relatively closest to its maximum value (1 for MLB and ln(10) for SH). The values of d vary between 0 and 1, attaining these bounds if MLB or SH attains its maximum somewhere in the calibration set. For the ranks 8, 10, 12, 14, 16 and 18 we instead add the value 2d, and for the rank 20 nothing is added. The whole procedure thus refines the ranking to rational numbers between 1 and 20, hereafter referred to as profile grades, prg, which decrease with increasing DNA profile quality. The construction allows a stretching to the whole interval between two initial ranks provided it is considered possible to have either perfect local balance or perfect global balance, but otherwise the range of possible values between two ranks is more centered. It should be pointed out that the suggested construction of prg is completely additive, while a more comprehensive transformation should possibly include multiplicative relationships. The addition of d (or 2d) brings balance aspects into the ranking in such a way that they become important for profiles with similar peak heights. However, prg should be considered a rough approximation of the more complicated and subjective judgement of profile quality, and cannot serve as an adequate replacement for it.
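The grading scheme can be sketched in Python as a lookup in Table 1 followed by the balance adjustment. The formula for d, equation (4), is not reproduced in this text, so d is taken here as a given input between 0 and 1. That d (rather than 2d) is added for the remaining ranks 1 to 7 and 19 is inferred from the wording above. The function names are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the TPH intervals in Table 1 (ascending order) and the
# grade of each interval; GRADES[0] covers 0 < TPH < 500.
LOWER_BOUNDS = [500, 1000, 2500, 5000, 7500, 10000, 12500, 15000,
                20000, 25000, 30000, 40000, 50000]
GRADES = [19, 18, 16, 14, 12, 10, 8, 7, 6, 5, 4, 3, 2, 1]

# Ranks whose TPH interval spans two grade units, so 2d is added.
DOUBLE_STEP_RANKS = {8, 10, 12, 14, 16, 18}


def base_rank(tph):
    """Initial rank from Table 1 for a summarized peak height TPH."""
    if tph < 0:
        raise ValueError("TPH cannot be negative")
    if tph == 0:
        return 20
    return GRADES[bisect_right(LOWER_BOUNDS, tph)]


def profile_grade(tph, d):
    """Refined grade prg: base rank plus d or 2d (nothing for rank 20).

    d is assumed to come from equation (4), which is not shown here.
    """
    if not 0 <= d <= 1:
        raise ValueError("d must lie between 0 and 1")
    rank = base_rank(tph)
    if rank == 20:
        return 20.0
    if rank in DOUBLE_STEP_RANKS:
        return rank + 2 * d
    return rank + d


print(base_rank(2628), base_rank(7284))  # 14 12
print(profile_grade(2628, 0.5))          # 15.0
```

With d between 0 and 1, a base rank of 14 can be refined to any value from 14 to 16, which fills the gap left by using only even-numbered ranks in that part of the scale.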
Validation and adjustment of the principal component
where η(·) is a chosen function. The relationship (6) covers (5) and also e.g., polynomial regression models [19] and generalized linear models [20] with the addition of a probability distribution for the random variation in prg.
- (i) For profile i in the calibration set, define $ES_i = \{(prg_j, pc_j),\ j \ne i\}$ and $TS_i = (prg_i, pc_i)$.
- (ii) Fit the model (5) to all observations in $ES_i$, giving $\widehat{prg}_j = b_0^{(-i)} + b_1^{(-i)} \cdot pc_j,\ j \ne i$, where the superscript (-i) means that $(prg_i, pc_i)$ is left out from the estimation.
- (iii) Repeat (i) and (ii) for all profiles in the calibration set.
- (iv) Find the value of θ that minimizes $PRESS(\theta) = \sum_{i=1}^{n} \left( prg_i - \left[ b_0^{(-i)} + \theta \cdot b_1^{(-i)} \cdot pc_i \right] \right)^2$.
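Steps (i) to (iv) can be sketched in Python as follows. The sketch assumes that model (5) is the simple linear model prg = b_0 + b_1 · pc fitted by ordinary least squares, as the fitted expression in step (ii) suggests. The calibration values are synthetic and the function names are hypothetical. Because PRESS(θ) is quadratic in θ, the minimizer is computed in closed form rather than by numerical search.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def loo_coefficients(pc, prg):
    """Steps (i)-(iii): refit the line with each profile left out in turn."""
    return [fit_line(pc[:i] + pc[i + 1:], prg[:i] + prg[i + 1:])
            for i in range(len(pc))]


def press(theta, pc, prg, coeffs):
    """PRESS(theta): squared errors when predicting each left-out profile."""
    return sum((y - (b0 + theta * b1 * x)) ** 2
               for x, y, (b0, b1) in zip(pc, prg, coeffs))


def best_theta(pc, prg, coeffs):
    """Step (iv): setting d PRESS / d theta = 0 gives a closed-form minimizer."""
    num = sum((y - b0) * b1 * x for x, y, (b0, b1) in zip(pc, prg, coeffs))
    den = sum((b1 * x) ** 2 for x, (_, b1) in zip(pc, coeffs))
    return num / den


# Synthetic calibration set (hypothetical values, not data from the study).
pc_scores = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
grades = [18.2, 15.9, 13.1, 11.4, 8.2, 6.5]

coeffs = loo_coefficients(pc_scores, grades)
theta = best_theta(pc_scores, grades, coeffs)
print("theta:", round(theta, 4))
print("PRESS at theta:", round(press(theta, pc_scores, grades, coeffs), 4))
```

Here θ acts as a multiplier on the leave-one-out slope. For data that lie exactly on a line, every left-out fit is identical and the minimizer is θ = 1, so no adjustment of the slope is needed.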
The forensic DNA profile index (FI)
where we introduce the FI notation, for forensic DNA profile index. Like the translated principal component of (2), FI is always greater than zero.
With a quadratic prediction model the estimated coefficients c_{1}, c_{2} and c_{3} become 4.8693, 0.0216 and 0.0760 respectively, which are very close to the ones obtained with the linear prediction model. The choice of model is therefore of minor importance for the shrinking of the PCA coefficients, and we prefer the linear prediction model for the reasons explained earlier.
Discussion
Electropherogram data for the DNA profile in Figure 1A
Locus | Allele 1 | Peak height 1 (rfu) | Allele 2 | Peak height 2 (rfu) |
---|---|---|---|---|
D3S1358 | 14 | 244 | 17 | 201 |
vWA | 15 | 165 | 17 | 226 |
D16S539 | 12 | 146 | 13 | 115 |
D2S1338 | 17 | 79 | | |
D8S1179 | 14 | 174 | 15 | 240 |
D21S11 | 32.2 | 177 | 33.2 | 113 |
D18S51 | 16 | 61 | 19 | 61 |
D19S433 | 14 | 416 | | |
TH01 | 9 | 123 | | |
FGA | 21 | 87 | d.o | d.o |

Summary measures for the whole profile: TPH = 2628, MLB = 0.81, SH = 2.14, FI = 0.94.
Electropherogram data for the DNA profile in Figure 1B
Locus | Allele 1 | Peak height 1 (rfu) | Allele 2 | Peak height 2 (rfu) |
---|---|---|---|---|
D3S1358 | 14 | 671 | 17 | 706 |
vWA | 15 | 714 | 17 | 710 |
D16S539 | 12 | 227 | 13 | 253 |
D2S1338 | 17 | 442 | | |
D8S1179 | 14 | 416 | 15 | 431 |
D21S11 | 32.2 | 351 | 33.2 | 317 |
D18S51 | 16 | 198 | 19 | 190 |
D19S433 | 14 | 624 | | |
TH01 | 9 | 557 | | |
FGA | 21 | 234 | 27 | 243 |

Summary measures for the whole profile: TPH = 7284, MLB = 0.96, SH = 2.19, FI = 1.59.
We chose to base our ranking index on three quality aspects, which together describe the DNA profile quality: intensity, balance within a locus and balance between loci. TPH is a straightforward, easily interpretable measure of DNA profile intensity, and consequently of DNA profile quality. However, if the fluorescence is saturated due to overloading of DNA template, bleed-through peaks may be formed, lowering the perceived quality of the profile. For extreme peak heights, TPH may therefore be misleading as a quality measure. Hence, FI should only be applied to DNA profiles without bleed-through peaks caused by DNA overloading.
TPH, MLB and SH are all quantitative and measured on a continuous scale, which increases the success in constructing an unbiased and quantitative ranking index. Other quality measures sometimes used in the forensic community include the fraction of unbalanced heterozygote STR markers, and the number of complete markers in a profile. Using the fraction of unbalanced markers to create a ranking index suffers from two identified drawbacks; (i) the decision about whether a marker is balanced or not must precede the calculation of a quality index and has a potential contribution of bias; (ii) the number of STR markers in the standard amplification kits is low (in our case ten, in other common kits up to around 16) which gives low resolution of the measure and thus discretizes the scale. Calculating the number of complete markers in a profile may also be biased, as different laboratories may use different peak height threshold values for accepting a peak as a true allelic peak. We omitted these measures when creating our ranking index, as our aim was to design a general tool that is independent of arbitrary balance rules and peak height threshold values.
Nothing has so far been said about the interpretation of the numerically derived index, but the validation against a grading scale would make an increase in the index value consistent with an increase in the profile grade, no matter the level of that grade. This is so because a linear prediction model has been used in the validation. However, the non-linear part of the true relationship should possibly be investigated further. Likewise, a separate study is needed to draw adequate conclusions about the probability distribution of the ranking index in the population of DNA profiles obtained in real crime cases. One might argue that instead of using the first principal component a linear combination of TPH, MLB and SH could be found by ordinary least-squares fitting of the profile grade, prg, i.e., a multiple regression model. However, in regression models it is the conditional mean of the response given the values of the predictors that is modeled, and we do not consider any of the values of TPH, MLB and SH to be part of a fixed design. Furthermore, the intra-correlation structure of these three measures would lead to problems with multicollinearity when they are all used in the same model, and as a consequence the estimated slopes will not all be positive.
The FI model was developed for use with the ten STR marker DNA typing kit AmpFlSTR SGM Plus [15]. However, the model can be adapted to any STR-based DNA profiling system, e.g., systems with a higher number of markers such as AmpFlSTR NGM (Applied Biosystems) or PowerPlex ESI/ESX (Promega, Madison, WI, USA), by using an appropriate calibration set of samples and by validating the index against a suitable manual grading scale. The mathematical and statistical procedures described here can be used to adapt FI to other DNA typing systems. It is also possible to calculate FI for a part of an EPG/DNA profile, e.g., for STR markers in a certain length range. This could be useful when analyzing degraded or impure DNA, which often results in preferential amplification of the shorter markers. Additionally, it may be possible to use FI as a DNA profile evaluation tool in routine casework. In the present format, the user decides which alleles to incorporate into the FI calculations. Thus, stochastic thresholds can be set by each individual laboratory, or all peaks over the detection limit can be included in the calculations. FI does not handle mixed DNA profiles, so other statistical tools should be used to evaluate such complex profiles.
Conclusions
FI is a quantitative, unbiased quality measure for forensic DNA profiles. It combines intensity and balance into one easily interpretable index that describes the overall quality of the DNA profile. FI can be adapted to any STR-based DNA typing system, and can be used for validation studies as well as other comparative studies of different DNA analysis protocols. Apart from crime-related DNA analysis, FI can be used as a quality tool in paternity or familial testing as well as in disaster victim identification.
Declarations
Acknowledgements
The authors are grateful to Linda Albinsson at the Swedish National Laboratory of Forensic Science (SKL) for providing this study with ideas and technical support, and to Professor Peter Rådström, department of Applied Microbiology, Lund University, for comments on the manuscript.
References
1. Hedman J, Albinsson L, Ansell C, Tapper H, Hansson O, Holgersson S, Ansell R: A fast analysis system for forensic DNA reference samples. Forensic Sci Int Genet. 2008, 2: 184-189. 10.1016/j.fsigen.2007.12.011.
2. Bill M, Knox C: FSS-i^{3} Expert Systems. Profiles in DNA. 2005, 8: 8-10.
3. Power T, McCabe B, Harbison SA: FaSTR DNA: a new expert system for forensic DNA analysis. Forensic Sci Int Genet. 2008, 2: 159-165. 10.1016/j.fsigen.2007.11.007.
4. Bill M, Gill P, Curran J, Clayton T, Pinchin R, Healy M, Buckleton J: PENDULUM--a guideline-based approach to the interpretation of STR mixtures. Forensic Sci Int. 2005, 148: 181-189. 10.1016/j.forsciint.2004.06.037.
5. Cowell RG, Lauritzen SL, Mortera J: Identification and separation of DNA mixtures using peak area information. Forensic Sci Int. 2007, 166: 28-34. 10.1016/j.forsciint.2006.03.021.
6. Gill P, Curran J, Neumann C, Kirkham A, Clayton T, Whitaker J, Lambert J: Interpretation of complex DNA profiles using empirical models and a method to measure their robustness. Forensic Sci Int Genet. 2008, 2: 91-103. 10.1016/j.fsigen.2007.10.160.
7. Gill P, Kirkham A, Curran J: LoComatioN: a software tool for the analysis of low copy number DNA profiles. Forensic Sci Int. 2007, 166: 128-138. 10.1016/j.forsciint.2006.04.016.
8. Tvedebrink T, Eriksen PS, Mogensen HS, Morling N: Estimating the probability of allelic drop-out of STR alleles in forensic genetics. Forensic Sci Int Genet. 2009, 3: 222-226. 10.1016/j.fsigen.2009.02.002.
9. Balding DJ, Buckleton J: Interpreting low template DNA profiles. Forensic Sci Int Genet. 2009, 4: 1-10. 10.1016/j.fsigen.2009.03.003.
10. Castella V, Dimo-Simonin N, Brandt-Casadevall C, Mangin P: Forensic evaluation of the QIAshredder/QIAamp DNA extraction procedure. Forensic Sci Int. 2006, 156: 70-73. 10.1016/j.forsciint.2005.11.012.
11. Abaz J, Walsh SJ, Curran JM, Moss DS, Cullen J, Bright JA, Crowe GA, Cockerton SL, Power TE: Comparison of the variables affecting the recovery of DNA from common drinking containers. Forensic Sci Int. 2002, 126: 233-240. 10.1016/S0379-0738(02)00089-0.
12. Forster L, Thomson J, Kutranov S: Direct comparison of post-28-cycle PCR purification and modified capillary electrophoresis methods with the 34-cycle "low copy number" (LCN) method for analysis of trace forensic DNA samples. Forensic Sci Int Genet. 2008, 2: 318-328. 10.1016/j.fsigen.2008.04.005.
13. Li RC, Harris HA: Using hydrophilic adhesive tape for collection of evidence for forensic DNA analysis. J Forensic Sci. 2003, 48: 1318-1321.
14. Moss D, Harbison SA, Saul DJ: An easily automated, closed-tube forensic DNA extraction procedure using a thermostable proteinase. Int J Legal Med. 2003, 117: 340-349. 10.1007/s00414-003-0400-9.
15. Hedman J, Nordgaard A, Rasmusson B, Ansell R, Rådström P: Improved forensic DNA analysis through the use of alternative DNA polymerases and statistical modeling of DNA profiles. Biotechniques. 2009, 47: 951-958. 10.2144/000113246.
16. Shannon CE: A mathematical theory of communication. Bell System Technical Journal. 1948, 27: 379-423, 623-656.
17. Johnson RA, Wichern DW: Applied multivariate statistical analysis. 2002, Upper Saddle River, NJ, USA: Prentice Hall.
18. Manly BFJ: Multivariate statistical methods. 2004, London, UK: Chapman & Hall, 2.
19. Neter J, Kutner MH, Nachtsheim CJ, Wasserman W: Applied linear statistical models. 1996, Scarborough, ON, Canada: Irwin, 4.
20. McCullagh P, Nelder JA: Generalized linear models. 1989, London, UK: Chapman & Hall, 2.
21. Stone M: Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Series B Stat Methodol. 1974, 36: 111-147.
22. Hjorth JSU: Computer intensive statistical methods: Validation, model selection and bootstrap. 1993, London, UK: Chapman & Hall.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.