Research Article Open Access
A Bootstrap Prediction Confidence Band for QMRA Beta-Poisson Dose-Response Models
Gang Xie1,2*
1School of Medicine, Griffith University, Queensland, Australia
2Faculty of Science, Health, Education and Engineering, University of the Sunshine Coast, Queensland, Australia
*Corresponding authors address: Gang Xie, Adjunct Research Fellow, School of Medicine, Griffith University, Queensland 4111, Australia, Tel.: +61-7-37357485; Fax: +61-7-37355318; E-mail: g.xie@griffith.edu.au
Received: May 05, 2017; Accepted: May 23, 2017; Published: May 26, 2017
Citation: Gang Xie (2017) A Bootstrap Prediction Confidence Band for QMRA Beta-Poisson Dose-Response Models. Int Struct Comput Biol 1(1): 1-11.
Abstract
Let P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ denote the probability of infection at a given mean dose d . In the quantitative microbial risk assessment (QMRA) framework, the beta-Poisson dose-response model P I (d)=1   1 F 1 (α,α+β,d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaiabg2da9iaaigda cqGHsislcaqGGaWaaSbaaSqaaiaaigdaaeqaaOGaamOramaaBaaale aacaaIXaaabeaakiaacIcacqaHXoqycaGGSaGaeqySdeMaey4kaSIa eqOSdiMaaiilaiabgkHiTiaadsgacaGGPaaaaa@4A5E@ where 1 F 1 (.,.,.) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaSbaaSqaai aaigdaaeqaaOGaamOramaaBaaaleaacaaIXaaabeaakiaacIcacaGG UaGaaiilaiaac6cacaGGSaGaaiOlaiaacMcaaaa@3D71@ denotes the Kummer confluent hypergeometric function and α,β are the model parameters, remains the most popular plausible dose response model in practice. One commonly accepted way of constructing the confidence band about the dose-response curve in QMRA literature is to follow a bootstrap procedure based on the maximum likelihood estimates of the model parameters α and β. Here, it is shown that this bootstrap confidence bands reported in the literature represent the confidence intervals for the mean value of the probability of adverse effect, (i.e., P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ which may represent the probability of infection at mean dose levels), not the confidence intervals for prediction. Therefore, the existing literature bootstrap (95%) confidence bands normally contain far less than 95% data points. In this study, a sample-size-dependent bootstrap (95%) confidence band for prediction is proposed which can normally contain 95% of the data points. Comparisons between the existing and the newly proposed confidence band algorithms were made through a comprehensive Monte Carlo simulation study by applying both algorithms to four well studied experimental dose-response datasets. The comparison results showed that the confidence band for prediction is a better representation of the degree of uncertainty of an estimated dose-response relation. The upper limit of the estimated prediction confidence band could be used as a better (i.e., more conservative but sensible) estimate for the worst case scenario in risk assessment.

Key words: QMRA; beta-Poisson dose-response model; bootstrap; confidence intervals for prediction; Monte Carol simulation.
Introduction
QMRA [1, 2] has provided a valuable alternative quantitative framework for analysis of adverse health outcomes associated with the environmental exposures to pathogenic organisms [3, 4] and is widely used by researchers to characterise microbial risks associated with food, water, and wastewater use in agriculture (e.g., [1, 3, 5]). The core part of the QMRA framework is the dose-response analysis which models the mathematical characterization of the relationship between the dose administered and the probability of adverse effect (typically, the probability of infection) in the exposed population [1, 2, 6]. Among different microbial dose-response models proposed in the literature, the beta-Poisson models remain the most popular plausible models in practice [1, 2, 7].The beta-Poisson dose-response model and its special case, the exponential doseresponse model have been well studied and widely employed to characterize infectivity of various viral, bacterial, and protozoan pathogens since the 1980s [1, 2, 7, 8]. Because a beta-Poisson model has two parameters, a confidence band about the doseresponse curve cannot be calculated directly from the confidence intervals of the parameter estimates. It is well accepted in QMRA practice to employ a parametric bootstrap algorithm [9] to construct a confidence band for the dose-response curve[1, 2, 8, 10].The upper limit of the estimated confidence band is used as the worst case scenario in risk assessment.

Here, it is shown that the (95%) bootstrap confidence bands reported in microbial risk literature normally contain far less than 95% of the data points because they are not confidence bands for prediction. This implies that the existing literature confidence band may not be a good representation of the degree of uncertainty of an estimated dose-response relation. To fill this knowledge gap in QMRA analysis, a new, sample-size-dependent bootstrap (95%) confidence band construction algorithm is proposed which can normally contain 95% of the data points and the results are validated by an extensive Monte Carlo simulation study using four well studied experimental dose-response datasets.

The rest of the paper is organized as follows. Section 2 begins with a description of the general setting for this study which includes: the notation and definitions; datasets; model specification and parameter estimation. The existing and the new bootstrap confidence band algorithms are specified in details in Section .The comparisons between the existing and the new algorithms were made and the results are presented and discussed in the last part of Section 2. The paper is completed with a conclusion section. For a better reading flow and repeatability of our research findings, an appendix section is added at the end which contains: table 4, the details of the selected real doseresponse datasets; table 5, the core parts of R code programs [11] for random sample generation from a beta-Poisson doseresponse model.
Data, model specification and parameter estimation
Dataset, Notation and Definition
Well-studied dose-response experiment datasets provide a reliable baseline for validation of our research findings. In this study, four experimental datasets from the literature are analysed for comparison between the single-hit beta-Poisson model P I (d|α,β) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiiFaiabeg7aHjaacYca cqaHYoGycaGGPaaaaa@3F00@ and the generalized beta-Poisson model P I (d|α,β,r*) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiiFaiabeg7aHjaacYca cqaHYoGycaGGSaGaamOCaiaacQcacaGGPaaaaa@4155@ with respect to parameter estimation and model performance. The selected experiment datasets are: rotavirus (CJN) and infection in healthy volunteers (Dataset 1) [8, 10]; Campylobacter and infection in healthy volunteers (Dataset 2) [8, 10]; Salmonella (Nontyphoid Strains) and infection in human volunteers (Dataset 3) (p399,[12]); Listeria monocytogenes and infection in mice (Dataset 4)[2]. These datasets have been used for dose-response analyses in [2, 8, 10, 12], although these references are not the original data sources. These four datasets are chosen because of their variety of coverage, e.g., virus and bacteria, few data points (e.g., ≤ 10) and many data points (> 40), human volunteers and mice, and old (back to 1996) and recent research sources. For the purpose of a clear and consistent model specification and easy comparison with the literature results, we have adopted a notation scheme as detailed in table I throughout this paper. The term ‘organism’ is used as a short name for pathogenic microorganism.
Model Specification
Beta-Poisson dose-response models are derived based on a plausible conceptual process with the following assumptions [14]: (1) only one viable organism is required to initiate the infection process in vivo; (2) the exact number of organisms being ingested in each dose representing a random sample from a Poisson distribution (with a known mean number d); (3) the survival of any organism within a single host is independent survival of any other organism within that host.

A detailed description and derivation of the exponential and the beta-Poisson dose-response models are given by Haas et al..[1] For the purpose of this paper, the following model specifications are needed.

If we assume that the actual number of organisms ingested, D, follows a Poisson distribution, i.e.,D~ Poisson ( D i );r MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaaiikaiaads eadaWgaaWcbaGaamyAaaqabaGccaGGPaGaai4oaiaadkhaaaa@3AF1@ a constant (a homogeneous susceptibility assumption); and Kmin = 1 (a single-hit assumption), the simplest QMRA dose-response P I (d)=1exp(rd),       (1) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaiabg2da9iaaigda cqGHsislcaqGLbGaaeiEaiaabchacaqGOaGaeyOeI0IaamOCaiaads gacaGGPaGaaiilaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeii aiaabccacaqGOaGaaeymaiaabMcaaaa@4AE9@ where is the mean/effective dose.

In Equation (1), if we take into account the individual susceptibility/host sensitivity by allowing rtofollow a beta distribution with the probability density function (pdf) f(r|α,β)= Γ(α+β) Γ(α)Γ(β) r α1 (1r) β1 ,        (2) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOzaiaacI cacaWGYbGaaiiFaiabeg7aHjaacYcacqaHYoGycaGGPaGaeyypa0Za aSaaaeaacqqHtoWrcaGGOaGaeqySdeMaey4kaSIaeqOSdiMaaiykaa qaaiabfo5ahjaacIcacqaHXoqycaGGPaGaeu4KdCKaaiikaiabek7a IjaacMcaaaGaamOCamaaCaaaleqabaGaeqySdeMaeyOeI0IaaGymaa aakiaacIcacaaIXaGaeyOeI0IaamOCaiaacMcadaahaaWcbeqaaiab ek7aIjabgkHiTiaaigdaaaGccaGGSaGaaeiiaiaabccacaqGGaGaae iiaiaabccacaqGGaGaaeiiaiaabccacaqGOaGaaeOmaiaabMcaaaa@629C@ where 0<r<1, MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaaGimaiabgY da8iaadkhacqGH8aapcaaIXaGaaiilaaaa@3B19@ and parameters α>0 MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqySdeMaey Opa4JaaGimaaaa@3956@ and β>0 MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqOSdiMaey Opa4JaaGimaaaa@3958@ we obtain the well-known beta-Poisson (single-hit) dose-response model[1]: P I (d)=1   1 F 1 (α,α+β,d),         (3) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaiabg2da9iaaigda cqGHsislcaqGGaWaaSbaaSqaaiaaigdaaeqaaOGaamOramaaBaaale aacaaIXaaabeaakiaacIcacqaHXoqycaGGSaGaeqySdeMaey4kaSIa eqOSdiMaaiilaiabgkHiTiaadsgacaGGPaGaaiilaiaabccacaqGGa GaaeiiaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabIca caqGZaGaaeykaaaa@52D6@ where 1 F 1 (α,α+β,d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaSbaaSqaai aaigdaaeqaaOGaamOramaaBaaaleaacaaIXaaabeaakiaacIcacqaH XoqycaGGSaGaeqySdeMaey4kaSIaeqOSdiMaaiilaiabgkHiTiaads gacaGGPaaaaa@42F2@ is the Kummer confluent hypergeometric function [15] defined by 1 F 1 (α,α+β,d)=1+ Γ(α+β) Γ(α) j=1 [ Γ(α+j) Γ(α+β=j) (1) j1 (d) j j! ] .          (4) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaSbaaSqaai aaigdaaeqaaOGaamOramaaBaaaleaacaaIXaaabeaakiaacIcacqaH XoqycaGGSaGaeqySdeMaey4kaSIaeqOSdiMaaiilaiabgkHiTiaads gacaGGPaGaeyypa0JaaGymaiabgUcaRmaalaaabaGaeu4KdCKaaiik aiabeg7aHjabgUcaRiabek7aIjaacMcaaeaacqqHtoWrcaGGOaGaeq ySdeMaaiykaaaadaaeWaqaamaadmaabaWaaSaaaeaacqqHtoWrcaGG OaGaeqySdeMaey4kaSIaamOAaiaacMcaaeaacqqHtoWrcaGGOaGaeq ySdeMaey4kaSIaeqOSdiMaeyypa0JaamOAaiaacMcaaaWaaSaaaeaa caGGOaGaeyOeI0IaaGymaiaacMcadaahaaWcbeqaaiaadQgacqGHsi slcaaIXaaaaOGaaiikaiaadsgacaGGPaWaaWbaaSqabeaacaWGQbaa aaGcbaGaamOAaiaacgcaaaaacaGLBbGaayzxaaaaleaacaWGQbGaey ypa0JaaGymaaqaaabaaaaaaaaapeGaeyOhIukan8aacqGHris5aOGa aiOlaiaabccacaqGGaGaaeiiaiaabccacaqGGaGaaeiiaiaabccaca qGGaGaaeiiaiaabccacaqGOaGaaeinaiaabMcaaaa@7C48@ Since there are no analytic solutions to 1 F 1 (α,α+β,d), MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaSbaaSqaai aaigdaaeqaaOGaamOramaaBaaaleaacaaIXaaabeaakiaacIcacqaH XoqycaGGSaGaeqySdeMaey4kaSIaeqOSdiMaaiilaiabgkHiTiaads gacaGGPaGaaiilaaaa@43A2@ only numeric approximation solutions or asymptotic solutions can be obtained[15, 16]. In their original paper, Furumoto and Mickey [17] derived the simple, attractive approximation beta-Poisson dose-response formula: P I (d)=1 ( 1+ d β ) α MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaiabg2da9iaaigda cqGHsisldaqadaqaaiaaigdacqGHRaWkdaWcaaqaaiaadsgaaeaacq aHYoGyaaaacaGLOaGaayzkaaWaaWbaaSqabeaacqGHsislcqaHXoqy aaaaaa@4537@ When αβ MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqySdeMaeS OAI0JaeqOSdigaaa@3A8F@ and β1,  P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqOSdiMaeS 4AI8JaaGymaiaacYcacaqGGaGaamiuamaaBaaaleaacaWGjbaabeaa kiaacIcacaWGKbGaaiykaaaa@3F1C@ obtained from Equation (5) is a very good approximation to the true values of P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ defined by Equation (3) [1, 10]

A single-hit model reaches its maximum risk limit when r=1 in Equation (1). This is an important property of the beta- Poisson model (Equation (3)) which provides an upper bound of the confidence level of the dose-response relation. It is worth noting that this maximum risk limit property is not retained in the approximate beta-Poisson model (Equation (5)) [10]
Model Parameter Estimation and Statistical analysis
Let denote the value of a ─2 log-likelihood ratio and we term this the deviance. According to [1](p285, Equation 8.33), for dose-response data analysed in this study, Y can be calculated as Y=2 i=1 m [ y i  log π i π i o +( N i y i ) log 1 π i 1 π i o ] MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamywaiabg2 da9iabgkHiTiaaikdadaaeWaqaamaadmaabaGaamyEamaaBaaaleaa caWGPbaabeaakiaabccacaqGSbGaae4BaiaabEgadaWcaaqaaiabec 8aWnaaBaaaleaacaWGPbaabeaaaOqaaiabec8aWnaaDaaaleaacaWG PbaabaGaam4BaaaaaaGccqGHRaWkcaGGOaGaamOtamaaBaaaleaaca WGPbaabeaakiabgkHiTiaadMhadaWgaaWcbaGaamyAaaqabaGccaGG PaGaaeiiaiaabYgacaqGVbGaae4zamaalaaabaGaaGymaiabgkHiTi abec8aWnaaBaaaleaacaWGPbaabeaaaOqaaiaaigdacqGHsislcqaH apaCdaqhaaWcbaGaamyAaaqaaiaad+gaaaaaaaGccaGLBbGaayzxaa aaleaacaWGPbGaeyypa0JaaGymaaqaaiaad2gaa0GaeyyeIuoaaaa@6234@ where π i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aaS baaSqaaiaadMgaaeqaaaaa@38CC@ is the predicted response, typically the predicated probability of infection;m is the number of population groups at risk in terms of the mean dose levels as defined in Table 1. The true response (i.e., the true probability of infection) is estimated by π i o = y i / N i , MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aa0 baaSqaaiaadMgaaeaacaWGVbaaaOGaeyypa0JaamyEamaaBaaaleaa caWGPbaabeaakiaac+cacaWGobWaaSbaaSqaaiaadMgaaeqaaOGaai ilaaaa@404D@ the ratio of the observed number of infected individuals to the number of all individuals at risk at each mean/ effective dose level D i . MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiramaaBa aaleaacaWGPbaabeaakiaac6caaaa@3894@
Table 1: Notation and definition

Notation

Definition

 

Ds= { D 1, D 2 ,..., D m } MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaiWaaeaaca WGebWaaSbaaSqaaiaaigdacaGGSaaabeaakiaadseadaWgaaWcbaGa aGOmaaqabaGccaGGSaGaaiOlaiaac6cacaGGUaGaaiilaiaadseada WgaaWcbaGaamyBaaqabaaakiaawUhacaGL9baaaaa@41B2@

mean dose levels (i.e., the average number of organisms per dose or average concentrations); where the meaning is clear, denote d D i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamizaiabgg Mi6kaadseadaWgaaWcbaGaamyAaaqabaaaaa@3A8A@ for i=1,2,...,m MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamyAaiabg2 da9iaaigdacaGGSaGaaGOmaiaacYcacaGGUaGaaiOlaiaac6cacaGG SaGaamyBaaaa@3E78@ .

 

N = { N 1 , N 2 ,..., N m } MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaiWaaeaaca WGobWaaSbaaSqaaiaaigdaaeqaaOGaaiilaiaad6eadaWgaaWcbaGa aGOmaaqabaGccaGGSaGaaiOlaiaac6cacaGGUaGaaiilaiaad6eada WgaaWcbaGaamyBaaqabaaakiaawUhacaGL9baaaaa@41D0@

N i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOtamaaBa aaleaacaWGPbaabeaaaaa@37E2@ is the number of individuals participated in the i th exposure group ( m MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamyBaaaa@36E7@  groups, each corresponding to one mean dose level D i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiramaaBa aaleaacaWGPbaabeaaaaa@37D8@ ).

y = { y 1 , y 2 ,..., y m } MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaiWaaeaaca WG5bWaaSbaaSqaaiaaigdaaeqaaOGaaiilaiaadMhadaWgaaWcbaGa aGOmaaqabaGccaGGSaGaaiOlaiaac6cacaGGUaGaaiilaiaadMhada WgaaWcbaGaamyBaaqabaaakiaawUhacaGL9baaaaa@4251@

y i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamyEamaaBa aaleaacaWGPbaabeaaaaa@380D@  is the number of infected individuals in the th exposure group

r

survival probability of  each single organism ingested

K

number of organisms surviving to the target site inside a host body

Kmin

the threshold value of the surviving organisms, i.e., the minimum number of organisms required for causing an infection event

P I MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaaaaa@37C4@

probability of infection estimated by a dose-response model

π i o MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aa0 baaSqaaiaadMgaaeaacaWGVbaaaaaa@39C1@

observed proportion of infected individuals in the i th exposure group, i.e.,  π i o = y i / N i  for i=1,2,...,m. MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aa0 baaSqaaiaadMgaaeaacaWGVbaaaOGaeyypa0JaamyEamaaBaaaleaa caWGPbaabeaakiaac+cacaWGobWaaSbaaSqaaiaadMgaaeqaaOGaae iiaiaabAgacaqGVbGaaeOCaiaabccacaWGPbGaeyypa0JaaGymaiaa cYcacaaIYaGaaiilaiaac6cacaGGUaGaaiOlaiaacYcacaWGTbGaai Olaaaa@4CE8@

π i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aaS baaSqaaiaadMgaaeqaaaaa@38CC@

predicted binary response by a model, e.g., if response is defined by probability of infection, π i = P I ( D i ) P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aaS baaSqaaiaadMgaaeqaaOGaeyypa0JaamiuamaaBaaaleaacaWGjbaa beaakiaacIcacaWGebWaaSbaaSqaaiaadMgaaeqaaOGaaiykaiabgg Mi6kaadcfadaWgaaWcbaGaamysaaqabaGccaGGOaGaamizaiaacMca aaa@44DF@ .

exp(.) and log

exp(1) = e2.71828 MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamyzaiabgI Ki7kaaikdacaGGUaGaaG4naiaaigdacaaI4aGaaGOmaiaaiIdaaaa@3DBA@  is the natural logarithm base such that log(e) = 1.

Γ(.) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeu4KdCKaai ikaiaac6cacaGGPaaaaa@3968@

the gamma function [13]

Equation (6) is a measure of deviance and can be used as an objective function for minimisation in model parameter estimation [18].The results are the maximum likelihood estimates (MLE) [1, 19]. Note that π i = P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aaS baaSqaaiaadMgaaeqaaOGaeyypa0JaamiuamaaBaaaleaacaWGjbaa beaakiaacIcacaWGKbGaaiykaaaa@3DF7@ defined in Table 1.Then, the MLE of the model parameters, α ̂and β ̂, can be obtained using Equation (3) for an exact beta-Poisson model or Equation (5) for an approximate beta-Poisson model. The evaluation of the Kummer confluent hypergeometric function (Equation (4)) could be non-trivial. The statistical data analyses in this study were conducted using the open source statistical software R [11]. The R function ‘hyperg_1F1’ in the R package ‘gsl’ [20] is employed for evaluation of Equation (4). The built-in optimization function ‘optim’ in R is employed to obtain α ̂ and β ̂ by minimizing the model deviance Y defined by Equation (6).
A new Bootstrap confidence band Algorithm
A bootstrap approach is commonly accepted for construction of confidence bands on dose-response curves in microbial risk literature [1, 2, 8, 10]. Following the descriptions given in Haas et al. [1]and in Teunis et al. [8], the bootstrap algorithm employed in the literature (the existing algorithm) is summarized in the left column in Table 2. The new bootstrap algorithm, which produces sample-size-dependent confidence bands for prediction, is summarized in the right column of the same table. The comparison of both algorithms is depicted through figures1 to 7 for the four selected datasets.
Table 2: Parametric bootstrap algorithms for construction of 95% confidence bands about the mean dose-response curve

Existing bootstrap algorithm (confidence band for means)

New bootstrap algorithm (confidence band for prediction)

 

Step 1

Generate a large number of simulation samples, e.g., N*=2000, based on Ds, N, α ̂ , β ̂ MathType@MTEF@5@5@+= feaagGart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHXoqyaSqabeaaqaaaaaaaaaWdbiablkWaKaaak8aacaGGSaWaaCbi aeaacqaHYoGyaSqabeaapeGaeSOadqcaaaaa@3C35@ ; Determine the plotting mean dose levels (Dsim).

 

Step 1

Determine at which mean dose levels (Dsim) and the number of people (Nsim={max(N)}) exposed to hazard for generating simulation samples;

 

Step 2

Fit a beta-Poisson model to the generated  simulation sample so that 2000 sets of { α ̂ * , β ̂ * } MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaiWaaeaada WfGaqaaiabeg7aHbWcbeqaaabaaaaaaaaapeGaeSOadqcaaOWdamaa CaaaleqabaGaaiOkaaaakiaacYcadaWfGaqaaiabek7aIbWcbeqaa8 qacqWIcmajaaGcpaWaaWbaaSqabeaacaGGQaaaaaGccaGL7bGaayzF aaaaaa@4048@  are obtained;

 

Step 2

Generate a large number of simulation samples, e.g., N*=2000, based on Dsim, Nsim, α ̂ , β ̂ MathType@MTEF@5@5@+= feaagGart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHXoqyaSqabeaaqaaaaaaaaaWdbiablkWaKaaak8aacaGGSaWaaCbi aeaacqaHYoGyaSqabeaapeGaeSOadqcaaaaa@3C35@ ;

 

step 3

Calculate the probability of infection based on Equation (3) or (5) using parameter estimates obtained in Step 2 (2000 values at each Dsim level );

 

Step 3

Calculate the probability of infection using the ratio ysim/Nsim, where ysimare the simulation sample values from Step 2 (2000 values at each Dsim level);

 

 

Step 4

Obtain a 95 percentile (2.5% to 97.5%) confidence interval for each Dsim level using the results from Step 3; connect all lower/upper bound points to form a confidence band about the infection probability curve.

 

 

Step 4

Obtain a 95 percentile (2.5% to 97.5%) confidence interval for each Dsim level using the results from Step 3; connect all lower/upper bound points to form a confidence band about the infection probability curve.

Note that the generation of random samples of ysim ={ y sim (1), y sim (2),..., y sim (m) }, MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeyypa0Zaai WaaeaacaWG5bWaaSbaaSqaaiaadohacaWGPbGaamyBaaqabaGccaGG OaGaaGymaiaacMcacaGGSaGaamyEamaaBaaaleaacaWGZbGaamyAai aad2gaaeqaaOGaaiikaiaaikdacaGGPaGaaiilaiaac6cacaGGUaGa aiOlaiaacYcacaWG5bWaaSbaaSqaaiaadohacaWGPbGaamyBaaqaba GccaGGOaGaamyBaiaacMcaaiaawUhacaGL9baacaGGSaaaaa@509A@ given Ds = { D 1 , D 2 ,..., D m } MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaiWaaeaaca WGebWaaSbaaSqaaiaaigdaaeqaaOGaaiilaiaadseadaWgaaWcbaGa aGOmaaqabaGccaGGSaGaaiOlaiaac6cacaGGUaGaaiilaiaadseada WgaaWcbaGaamyBaaqabaaakiaawUhacaGL9baaaaa@41B2@ (or Dsim) and N= { N 1 , N 2 ,..., N m }, MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaiWaaeaaca WGobWaaSbaaSqaaiaaigdaaeqaaOGaaiilaiaad6eadaWgaaWcbaGa aGOmaaqabaGccaGGSaGaaiOlaiaac6cacaGGUaGaaiilaiaad6eada WgaaWcbaGaamyBaaqabaaakiaawUhacaGL9baacaGGSaaaaa@4280@ is essential for employing a bootstrap algorithm to construct confidence bands. For a selected beta- Poisson dose-response model, the parametric bootstrap [9] random samples, ysim, is generated using R code programs provided in table 5.

The two different approaches to estimate the probabilities of infection, P I (d), MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaiaacYcaaaa@3AC0@ have a substantial impact on the resulting confidence bands. With respect to each bootstrap sample (i.e., the simulated data) ysim, the parameter estimates α ̂ * MathType@MTEF@5@5@+= feaagGart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaaqa aaaaaaaaWdbiabeg7aHbWcpaqabeaapeGaeSOadqcaaOWdamaaCaaa leqabaWdbiaacQcaaaaaaa@39DB@ and β ̂ * MathType@MTEF@5@5@+= feaagGart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHYoGyaSqabeaaqaaaaaaaaaWdbiablkWaKaaak8aadaahaaWcbeqa a8qacaGGQaaaaaaa@39BE@ are obtained first and the existing algorithm then estimates P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ using Equation (3) (for an exact beta-Poisson model) or Equation (5) (for an approximate beta-Poisson model), whereas the new algorithm directly calculates the ratio ysim/Nsim as the model estimates for P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ .

Although both algorithms construct the parametric bootstrap confidence bands, it is important to realize that the band based on the existing algorithm counts for the variation of the probability of infection given a mean dose level while the band based on the new algorithm counts for the variation for the probability of infection given a specific dose.

Therefore, the difference between these two confidence bands is an analogy of the difference between confidence bands for a regression line and for prediction of a new data point in a linear regression model. Furthermore, these two algorithms are fundamentally different in evaluating the discrepancy between Equation (3) and Equation (5): the existing algorithm treats the discrepancy as a difference caused by two different dose-response models (i.e., the exact versus the approximate models) while the new algorithm treats the discrepancy as a difference caused by two different parameter sets under the exact beta-Poisson doseresponse model.

Because the new algorithm employs the proportion data as the estimates for P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ significant data discretization effects are expected in the resulting confidence bands when the maximum number of N i s MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOtamaaBa aaleaacaWGPbaabeaakiaabAgacaqGGaGaam4Caaaa@3A70@ (denoted by max(N)) is relatively small (e.g., max(N) < 15). Because the observed proportion of infection π i o ̂ = y i / N i , MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaaqa aaaaaaaaWdbiabec8aW9aadaqhaaWcbaWdbiaadMgaa8aabaWdbiaa d+gaaaaapaqabeaapeGaeSOadqcaaOGaeyypa0JaamyEa8aadaWgaa WcbaWdbiaadMgaa8aabeaak8qacaGGVaGaamOta8aadaWgaaWcbaWd biaadMgaa8aabeaakiaacYcaaaa@422E@ where π i o = log N i ( y i N i ), MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaeqiWda3aaS baaSqaaiaadMgaaeqaaOWaaWbaaSqabeaacaWGVbaaaOGaeyypa0Ja aeiBaiaab+gacaqGNbWaaSbaaSqaaiaad6eadaWgaaadbaGaamyAaa qabaWccqGHsgIRcqGHEisPaeqaaOWaaeWaaeaadaWcaaqaaiaadMha daWgaaWcbaGaamyAaaqabaaakeaacaWGobWaaSbaaSqaaiaadMgaae qaaaaaaOGaayjkaiaawMcaaiaacYcaaaa@49C1@ is used as the optimization target in the P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ parameter estimation process for Equations (3) and (5), the (median) curve estimated using the new algorithm should converge to the analogous value using the existing algorithm for large max(N), e.g., max(N) > 50. Whereas the large sample assumption is implicitly assumed in the existing algorithm, the new algorithm is critically dependent on the choice of Nsim. Smaller (larger) Nsim would result in a wider (narrower) confidence band. With respect to each mean dose level ( d D i ), MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaeWaaeaaca WGKbGaeyyyIORaamiramaaBaaaleaacaWGPbaabeaaaOGaayjkaiaa wMcaaiaacYcaaaa@3CCD@ the number of individuals exposed to a hazard, N i MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamOtamaaBa aaleaacaWGPbaabeaaaaa@37E2@ (or Nsim) may be considered as a sample size in a binomial process. In this sense, the new algorithm confidence bands are, therefore, sample-size-dependent (i.e. Nsim =max(N)-dependent) These theoretical properties are examined in the Figures presented in next section.
Comparison Results and Discussions
Results and Discussions for dataset 1
Using the rotavirus dataset (Dataset 1) as an example, the data and parameters used in both bootstrap algorithms for confidence band construction are listed in Table 3.Figures1 and 2 are created based on Table II (bootstrap algorithm procedures) and table 3 (input information).
Table 3: Data and parameters used for bootstrap confidence band construction in Figures1 and 2

Ds

{ 0.009, 0.09, 0.9, 9, 90, 900, 9000, 90000}                                   

N

{5, 7, 7, 11, 7, 8, 7, 3}                                                            

y

{0, 0, 1, 8, 6, 7, 5, 3}

Approximate B-P model:

α ̂ MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHXoqyaSqabeaaqaaaaaaaaaWdbiablkWaKaaaaaa@38B7@  = 0.253; β ̂ MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHYoGyaSqabeaaqaaaaaaaaaWdbiablkWaKaaaaaa@38B9@  = 0.422

Exact B-P model:

α ̂ MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHXoqyaSqabeaaqaaaaaaaaaWdbiablkWaKaaaaaa@38B7@  = 0.167; β ̂ MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaacq aHYoGyaSqabeaaqaaaaaaaaaWdbiablkWaKaaaaaa@38B9@  = 0.191

Dsim

{0.005, 0.009, 0.05, 0.09, 0.5, 0.9, 5, 9, 50, 90, 500, 900, 5000, 9000, 50000, 90000}

Nsim=max(N)

{11,   11,   11,    11,    11,   11,  11,  11,  11, 11,  11,   11,   11,    11,    11,   11 }

In figure1, within the observed mean dose level range, the smooth confidence bands (dot-dashed lines in green colour, based on the existing algorithm) are consistently and noticeably narrower than the less smooth confidence bands (dotted lines in black colour, based on the new algorithm) while the median lines (solid lines in green or black colour) match well with each other.
Table 4: Selected dose- response experiment datasets from the literature

Dataset 1: Rotavirus (CJN) and infection in healthy volunteers [8, 10]

Ds (mean dose)

N (total)

y(infected)

9×10-3

5

0

9×10-2

7

0

9×10-1

7

1

9

11

8

9×101

7

6

9×102>

8

7

9×103

7

5

9×104

3

3

 

Dataset2: Campylobacter and infection in healthy volunteers [8, 10]

Ds(mean dose)

N(total)

y(infected)

8×102

10

 5

8×103

10

 6

9×104

13

11

8×105

11

 8

1×106

19

15

1×108

5

  5

 

Dataset3: Salmonella (Nontyphoid Strains) and infection in human volunteers (p399,[12])

Ds(mean dose)

N(total)

y(infected)

1.52×105

6

3

3.85×105

8

6

1.35×106

6

6

1.39×105

6

3

7.05×105

6

4

1.66×106

6

4

1.5×107

6

4

1.25×105

6

5

6.95×105

6

6

1.7×106

6

5

1.2×104

5

2

2.4×104

6

3

6.6×104

6

4

1.41×105

6

3

2.56×105

6

5

5.87×105

6

4

8.6×105

6

6

8.9×104

6

5

4.48×105

6

4

1.04×106

6

6

3.9×106

6

4

1×107

6

6

2.39×107

6

5

4.45×107

6

6

6.73×107

8

8

1.26×106

6

6

4.68×106

6

6

1.2×104

6

3

2.4×104

6

4

5.2×104

6

3

9.6×104

6

3

1.55×105

6

5

3×105

6

6

7.2×105

5

4

1.15×106

6

6

5.5×106

6

5

2.4×107

5

5

5×107

6

6

1×106

6

6

5.5×106

6

6

1×107

6

5

2×107

6

6

4.1&times;107

6

6

1.5×106

6

5

7.68×106

6

6

1×107

6

5

1.58×105

6

1

 

Dataset4: (292 and 295 pooling) Listeria monocytogenes  and infection in mice [2]

Ds(mean dose)

N(total)

y(infected)

2

6

 0

5

6

 1

1.1102

6

 2

5.5×103

10

 7

3.24×104

10

 7

3.9×104

6

 4

5.5×104

10

 9

2.51×105

10

10

5.5×105

10

10

2.82×106

10

10

Ds= {D_1,D_2,…,D_m } representsm mean dose levels (i.e., the average number of organisms per dose or average concentrations) applied tom exposure groups; N = {N_1,N_2,…,N_m }where N_i represents the number of individuals participated in the ith exposure group (corresponding to mean dose level D_i); y = {y_1,y_2,… ,y_m }where y_irepresents the number of infected individuals in the ith exposure group.
Table 5:
This is the most important feature to note in the comparison of these two algorithms. With the new algorithm, the resulting confidence band is critically dependent on the choice of Nsim. By choosing Nsim= max(N) as proposed in table II and exemplified in table 3, the confidence band based on the new algorithm provides a lower bound of the band-width in estimation of the prediction confidence levels (i.e., the confidence band for prediction should be at least as wide). As expected, in all datasets we examined, the existing algorithm band was narrower than the analogous band based on the new algorithm within the observed mean dose level range. This provides strong empirical evidence to confirm the theoretical property concluded in the last section that the confidence bands based on the algorithm currently proposed in the literature are not confidence bands for prediction.
Figure 1: Comparison of bootstrap confidence bands (dataset 1). Plotting scheme: circles represent the data points; solid lines depict the estimatedmedian P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ , and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞).
Figure 2: Comparison of bootstrap confidence bands (dataset 1, approximate beta-Poisson model). Plotting scheme: circles represent the data points; solid lines depict the estimated median P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ , and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞); the thin red dashed line is the maximum risk curve in finite sample case (i.e., when Nsim is finite), a bootstrap 97.5 percentile line.
Another important feature to note in Figure1 is that, except for the confidence band of the exact beta-Poisson model based on the existing algorithm (the green dot-dashed lines in figure 1(b)), all three other confidence band upper limit lines exceed the maximum risk line (bold red dashed line) in the verylow dose region (the bottom left corner).This exceedance issue is further illustrated and discussed in figure2.

An approximate beta-Poisson model can perform poorly in that its predicted P_I (d) curve may go above the maximum risk line (the bold red dashed line); figure1(a) essentially reproduced the result reported by Teunis in 2000[10]. Figure 2(a) is a reproduction of figure 1(a) but with one extra curve added in – the thin red dashed line in the farthest left part of the plot.

It is noted that, in the bottom left corner region of figure 2(a), the upper dotted line exceeds the upper dot-dashed line which is already above the bold red dashed line. However, the dotted line remains under the thin red dashed line. Therefore, the confidence band based on the new algorithm (dotted lines) does not fail here, because it is a sample-size dependent confidence band and the thin red dashed line is the finite sample maximum risk line (choosing the same Nsim values, as shown in table 3, to take account of the sampling variation in estimating the maximum risk level). On the other hand, the existing algorithm for computing the approximate beta-Poisson confidence band (dot-dashed lines) does violate the maximum risk limit for MLEs are obtained under the large sample assumption and the bold red-dashed line (Nsim→ ∞) applies as the maximum risk limit. Figure 2(b) is drawn to verify how the new algorithm confidence band will change as Nsim changes (Nsim = {200}, compared with Nsim = {11} in figure 2(a)). Figure2(b) shows that the thin red dashed line moves much closer to the bold red dashed line and the new algorithm band (dotted lines) shrinks significantly, with the upper dotted line being always under the thin-red dashed line. It is noted that the (existing algorithm) upper dot-dashed line even exceeds the thin red dashed line in figure 2(b) which confirms the finding reported by Teunis [10]: the maximum risk limit property is not retained in the approximate beta-Poisson model. Noting the fact that N_i is seldom more than 20 in the real dose-response data, it is, therefore, necessary to take into account the sampling variation effects in determination of the maximum risk limits using Equation (1).

As shown by figure 1(b), by ignoring the sampling variation effects, the existing algorithm could underestimate the maximum risk in the very low dose level range (e.g., mean/ effective dose d< 0.05or log(mean/effective dose) = -3 ) which is of particular interest (and concern) in microbial risk assessment practice.
Figure 3: Comparison of bootstrap confidence bands (dataset 2). Plotting scheme: circles represent the data points; solid lines depict the estimated median P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ , and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞).
Results and Discussions for datasets 2, 3, and 4
Figures 3 through to 7 are produced for the analysis of datasets 2, 3, and 4 following the same procedures as exemplified with dataset 1. Similar to figure1, figure 3 shows that the median PI (d) lines match very closely for both algorithms and the black dotted lines (the new algorithm) form a wider confidence band than the band formed by green dot-dashed lines (the existing algorithm) over the observed mean dose level range. In contrast toFigure1, however, data points in figure 3 only spread in the high dose level range (log(mean dose) > 6.5). In figure 3, the new algorithm band is much narrower than the existing algorithm band outside the observed mean dose range (roughly where log(mean dose) < 6.5). It is also noted that the new algorithm bands are well under the limiting maximum risk (bold red dashed) line. In contrast, the approximate beta-Poisson model confidence band based on the existing algorithm (figure 3(a)) badly exceeds the maximum risk line in the lower mean dose level range (roughly where log(mean dose) < -2). With a 95% confidence band, we expect to observe, on average, only one out of 20 data points falling outside the band. However, we notice that, with six data points, the existing algorithm confidence band clearly missed one point and another point is only marginally included. In contrast, all six data points are well included in the new algorithm band.

Figures 4 and 5 are drawn in order to show the impacts of the sampling variation on the new algorithm confidence band while the existing algorithm band should remain unchanged. The sampling variation is determined by the sample size Nsim. The plots in figure 4(a), (b), (c), and (d), show how the bands, constructed based on the new algorithm, change as the sample size Nsim changes.
Figure 4: Comparison of bootstrap confidence bands by setting Nsim to different values (dataset 2, approximate beta-Poisson model). Plotting scheme: circles represent the data points; solid lines depict the estimated median P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ , and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞); the thin red dashed line is the maximum risk curve in the finite sample case (i.e., when Nsim is finite), a bootstrap 97.5 percentile line.
Figure 5: Comparison of bootstrap confidence bands (dataset 2, exact beta-Poisson model, Nsim={2} ). Plotting scheme: circles represent the data points; solid lines depict the estimated median P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ , and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞).
Note that the thin red dashed lines move away / diverge from the bold red dashed lines as the sample size decreases. Figure5 is with the Nsim = {2} setting, i.e., it is assumed that only two subjects are exposed to hazard at each mean dose level. Settings in Figures4 and 5 are supposed to cover most of the real life experimental scenarios. Figure4 examines the approximate beta-Poisson model situation and Figure5 examines the exact beta-Poisson model situation. All these cases have shown a consistent pattern – the existing algorithm confidence band is possibly too wide outside the observed mean dose level range. Since the parameter estimation results are α ̂ =0.145;  β ̂ =7.59; MathType@MTEF@5@5@+= feaagGart1ev2aqatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaWaaCbiaeaaqa aaaaaaaaWdbiabeg7aHbWcpaqabeaapeGaeSOadqcaaOWdaiabg2da 9iaaicdacaGGUaGaaGymaiaaisdacaaI1aGaai4oaiaabccadaWfGa qaaiabek7aIbWcbeqaa8qacqWIcmajaaGcpaGaeyypa0JaaG4naiaa c6cacaaI1aGaaGyoaiaacUdaaaa@4683@ (approximate beta- Poisson model [10]) for dataset 2. This implies that the confidence band for dataset 2, no matter whether the parameter estimates are obtained from Equation (3) or Equation (5), should almost always be valid. While this numeric evidence is consistent with the new algorithm bands as shown in figures3, 4, and 5, the existing approximate beta-Poisson model bands clearly contradict this expectation. The result of the deviance difference analysis (not reported in this article) also suggested that, with dataset 2, the approximate beta-Poisson model is a very good approximation. All these assessments strongly suggest that the confidence band based on the existing algorithm may not be a proper estimation of the degree of uncertainty of the dose-response relation in these circumstances. Figures 6 and 7, which present the confidence band comparison for datasets 3 and 4, provide more evidence to show the comparatively poor performance of the confidence bands based on the existing algorithm.
Figure 6: Comparison of bootstrap confidence bands (dataset 3, exact beta-Poisson model). Plotting scheme: circles represent the data points; solid lines depict the estimated median P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ ,and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞)
Figure 7: Comparison of bootstrap confidence bands (dataset 4, exact beta-Poisson model). Plotting scheme: circles represent the data points; olid lines depict the estimated median P I (d) MathType@MTEF@5@5@+= feaagGart1ev2aaatCvAUfeBSjuyZL2yd9gzLbvyNv2CaerbuLwBLn hiov2DGi1BTfMBaeXatLxBI9gBaerbd9wDYLwzYbItLDharqqtubsr 4rNCHbGeaGqiVu0Je9sqqrpepC0xbbL8F4rqqrFfpeea0xe9Lq=Jc9 vqaqpepm0xbba9pwe9Q8fs0=yqaqpepae9pg0FirpepeKkFr0xfr=x fr=xb9adbaqaaeGaciGaaiaabeqaamaabaabaaGcbaGaamiuamaaBa aaleaacaWGjbaabeaakiaacIcacaWGKbGaaiykaaaa@3A10@ ,and the 2.5 and 97.5 percentile lines are drawn with dot-dashed lines (the existing algorithm in green colour) or with dotted lines (the new algorithm in black colour); the bold red dashed line is the asymptotic maximum risk curve (i.e., when Nsim→ ∞).
Figure 6 (a) is essentially a reproduction of figure 9-2 in [12] (p402) and figure 6(b) shows the corresponding confidence band based on the new algorithm. It is obvious that only a minority (roughly 10 out 47) data points are included in the existing algorithm band while the new algorithm band contains 45 out of 47 data points as expected. Similar comparisons can be made for dataset 4 as shown in figure 7(a) and (b) -- 5 out of 10 data points are outside the existing algorithm band while all data points are contained in the new algorithm band.

The literature dose-response analysis results of dataset 4 can be found in [2].

http://qmrawiki.msu.edu/index.php?title=Listeria_monocytoge nes_%28Infection%29%3A_Dose_Response_Models

(accessed on 23/04/2017). In the analysis, the author(s) compared the fit of the beta-Poisson and exponential models to the data and concluded that the beta-Poisson model was a adequate model. Figure 7 depicts the results of the doseresponse analysis with dataset 4. Clearly, the new algorithm has provided a much better fit.
Conclusions
A new parametric bootstrap confidence band algorithm is proposed for the popular QMRA beta-Poisson dose-response model in this study. The resulting confidence band is sample-size dependent (i.e., Nsim-dependent)and it provides the lower bound of the band-width for estimating the true confidence level for prediction. Through an intensive Monte Carlo simulation study, the performances of the confidence bands based on the existing literature algorithm and the new algorithm are compared by fitting the beta-Poisson dose-response models to four well studied datasets. It has been shown that, asymptotically (i.e., as Nsim→ ∞) the estimated median dose-response curve based on the new algorithm converges to the existing algorithm median curve; in finite Nsim case, these two curves differ only in smoothness (the existing algorithm curve is always smoother). The new algorithm confidence band substantially improves the existing algorithm band in data point coverage. Due to the big sampling variation in proportion data under small sample conditions, the existing algorithm is likely to underestimate the maximum risk limit of dose-response data. On the other hand, outside the observed mean dose level range, the existing algorithm tends to overestimate the upper bound of the confidence band. This study has shown that the new algorithm is able to facilitate both types of bias. Therefore, the confidence band for prediction based on the new algorithm is a better representation of the degree of uncertainty of an estimated dose-response relation. The upper limit of the estimated prediction confidence band could be used as a better estimate for the worst case scenario in risk assessment.
Acknowledgements
The early version of this article was finished in 2015 when the author worked for the pond project which was funded by Department of Science, Information Technology and Innovation (DSITI) of Queensland government, Australia. The author would like to thank Professor Kerrie Mengersen (Queensland University of Technology) for her valuable feedback on the early version of the draft manuscript.
References
  1. Haas CN, JB Rose,CP Gerba. Quantitative Microbial Risk Assessment. second ed. New York: John Wiley & Sons;INC2014.
  2.  Rose J. RA Wiki 23 January, 2015 [cited 2017 February]; Quantitative Microbial Risk Assessment (QMRA) Wiki] Available from: http://qmrawiki.canr.msu.edu/index.php/Quantitative_Microbial_Risk_Assessment_%28QMRA%29_Wiki
  3. Mara DD. Sleigh PA, Blumenthal UJ, Carr RM. Health risks in wastewater irrigation: comparing estimates from quantitative microbial risk analyses and epidemiological studies. J Water Health. 2007;5(1): 39-50.
  4. Karavarsamis N, AJ Hamilto. Estimators of annual probability of infection for quantitative microbial risk assessment. Journal of Water and Health, 2010;8(2): 365-373.
  5. Teunis P, Takumi K, and Shinagawa K. Dose response for infection by Escherichia coli O157: H7 from outbreak data. Risk Analysis. 2004;24(2): 401-407.
  6. Xie G, Stratton H, Lemckert C, Dunn PK, Mengersen K. A Generalized QMRA Beta‐Poisson Dose‐Response Model. Risk Analysis, 2016;36(10):1948-1958. doi:10.1111/risa.12561
  7. The Interagency MRA Guideline Workgroup, Microbial Risk Assessment Guideline, Pathogenic Microorganisms with Focus on Food and Water. 2012, U.S. Environment Protection Agency (EPA) and U.S. Department of Agriculture/Food Safety and Inspection Service (USDA/FSIS).
  8. Teunis P, O G. van der Heijden, JWB van der Giessen, A H Havelaar. The dose-response relation in human volunteers for gastro-intestinal pathogens. National Institute of Public Health and the Environment (RIVM); The Netherlands:1996.
  9.  Efron B. RJ. Tibshirani. An introduction to the bootstrap. Vol. 57;1994: CRC press.
  10. Teunis PF, Havelaar AH. The Beta Poisson dose‐response model is not a single‐hit model. Risk Analysis. 2000;20(4): 513-520.
  11. R Development Core Team. R: A Language and Environment for Statistical Computing. 2014. Available from: http://www.R-project.org

  12. Haas CN, JB Rose, CP Gerba. Quantitative Microbial Risk Assessment. first ed. New York:John Wiley & Sons, INC;1999.
  13.  Kreyszig E. Advanced engineering mathematics. 10th edition ed:John Wiley & Sons;2011.
  14. Haas CN. Conditional dose-response relationships for microorganisms: Development and application. Risk analysis. 2002; 22(3):455-463.
  15. Muller KE, Computing the confluent hypergeometric function, M (a, b, x). Numerische Mathematik. 2001;90(1): 179-196. doi:10.1007/s002110100285
  16. Butler RW, Andrew T. A. Wood. Laplace Approximations for Hypergeometric Functions with Matrix Argument. The Annals of Statistics. 2002;30(4):1155-1177.
  17. Furumoto WA, Mickey R. A mathematical model for the infectivity-dilution curve of tobacco mosaic virus: Theoretical considerations. Virology. 1967;32(2):216-223. doi:10.1016/0042-6822(67)90272-3
  18. McCullagh P, J Nelder. Generalized linear models. Monographs on statistics and applied probability. Second edition London; Chapman and Hall:1989.
  19. Morgan B.J. Analysis of quantal response data. Vol. 46. 1992: CRC Press.
  20. Hankin, RK. Package ‘gsl’. 2013.
 
Listing : ICMJE   

Creative Commons License Open Access by Symbiosis is licensed under a Creative Commons Attribution 4.0 Unported License