Linear models
In general, the \(p\)th linear QRmodel is of the form \[\begin{equation}\label{eq:4}\tag{4}Q_{Y|X}(p) = \mathbf{x}^{\top}\boldsymbol \beta(p)\end{equation}\] where \(\mathbf{x}\) is a \(k\)-dimensional vector of covariates(including \(1\) as first element) and\(\boldsymbol \beta(p) = [\beta_{0}(p),\beta_{1}(p),\) \(\ldots,\beta_{k-1}(p)]^{\top}\) is a vector of coefficients. The‘slopes’ \(\beta_{j}(p)\), \(j = 1,\ldots, k-1\), have the usualinterpretation of partial derivatives. For example, in case of thesimple model \(Q_{Y|X}(p) = \beta_{0}(p) +\beta_{1}(p)x\), one obtains \[\frac{\partial Q_{Y|X}(p)}{\partial x} = \beta_{1}(p).\\\] If \(x\) is a dummy variable,then \(\beta_{1}(p) = Q_{Y|X = 1}(p) -Q_{Y|X=0}(p)\), i.e.the so-called ‘quantile treatment effect’(Doksum 1974; Lehmann 1975; Koenker and Xiao2002). Estimation can be carried out using LP algorithms which,given a sample \((\mathbf{x}_{i},y_{i})\), \(i=1,\dots,n\), solve \[\min_{b \in \mathbb{R}^{k}} \sum_{i=1}^{n} \kappa_{p}\left(y_{i} -\mathbf{x}_{i}^{\top}\mathbf{b}\right),\] where \(\kappa_{p}(u) = u(p - I(u< 0))\), \(0 < p < 1\),is the check loss function. Large-\(n\)approximation of standard errors can be obtained from the samplingdistribution of the linear quantile estimators (Koenker and Bassett 1978).
Figure 3. (a) Waiting times between eruptions againstdurations of eruptions (dashed vertical line drawn at \(3\) minutes) in the Old Faithful Geyserdata set. (b) Mid-CDF of waiting time by duration of eruption (solidline, shorter than 3 minutes; dashed line, longer than 3 minutes).
Waiting times between eruptions are plotted against the durations ofthe eruptions in Figure 3. Two clusters of observations can be definedfor durations below and above 3 minutes (see alsoAzzalini and Bowman 1990). The distribution shows a strongbimodality as already illustrated in Figure 2. A dummy variable fordurations equal to or longer than \(3\)minutes is created to define the two distributions and included ascovariate \(X\) in a model as the onespecified in (4). The latter is then fitted to the Old Faithful Geyserdata using the function rq in the packagequantreg for \(p \in \{0.1,0.25, 0.5, 0.75, 0.9\}\).
> require("quantreg")> y <- faithful$waiting> x <- as.numeric(faithful$eruptions >= 3)> fit <- rq(formula = y ~ x, tau = c(0.1, 0.25, 0.5, 0.75, 0.9))> fitCall:rq(formula = y ~ x, tau = c(0.1, 0.25, 0.5, 0.75, 0.9))Coefficients: tau= 0.10 tau= 0.25 tau= 0.50 tau= 0.75 tau= 0.90(Intercept) 47 50 54 59 63x 26 26 26 25 25Degrees of freedom: 272 total; 270 residual
From the output above, it is quite evident that the distribution ofwaiting times is shifted by an approximately constant amount at allconsidered values of \(p\). Thelocation-shift hypothesis can be tested by using the Khmaladze test. Thenull hypothesis is that two distributions, say \(F_{0}\) and \(F_{1}\), differ by a pure location shift(Koenker and Xiao 2002), that is \[H_{0}: \, F^{-1}_{1}(p) = F^{-1}_{0}(p) + \delta_{0},\] where \(\delta_{0}\) is thequantile treatment effect, constant over \(p\). The location–scale-shift specificationof the test considers \[H_{0}: \, F^{-1}_{1}(p) = \delta_{1}F^{-1}_{0}(p) + \delta_{0}.\] The alternative hypothesis is that the model is more complexthan the one specified in the null hypothesis. The Khmaladze test isimplemented in quantreg (see?quantreg::KhmaladzeTest for further details). The criticalvalues of the test and corresponding significance levels (Koenker 2005) are not readily available in thesame package. These have been hardcoded in the Qtoolsfunction KhmaladzeFormat which can be applied to“KhmaladzeTest” objects. For the Old Faithful Geyser data, theresult of the test is not statistically significant at the \(10\%\) level.
> kt <- KhmaladzeTest(formula = y ~ x, taus = seq(.05, .95, by = .01),> KhmaladzeFormat(kt, 0.05)Khmaladze test for the location-shift hypothesisJoint test is not significant at 10% levelTest(s) for individual slopes: not significant at 10% level
Goodness of fit
Distribution-free quantile regression does not require introducing anassumption on the functional form of the error distribution (Koenker and Bassett 1978), but only weakerquantile restrictions (Powell 1994).Comparatively, the linear specification of the conditional quantilefunction in Equation 4 is a much stronger assumption and thus plays animportant role for inferential purposes.
The problem of assessing the goodness of fit (GOF) is ratherneglected in applications of QR. Although some approaches to GOF havebeen proposed (Zheng 1998; Koenker and Machado1999; X. M. He and Zhu 2003; Khmaladze and Koul 2004), there iscurrently a shortage of software code available to users. The functionGOFTest implements a test based on the cusum process of thegradient vector (X. M. He and Zhu 2003).Briefly, the test statistic is given by the largest eigenvalue of \[n^{-1}\sum_{i}^{n}\mathbf{R}_{n}(\mathbf{x}_{i})\mathbf{R}^{\top}_{n}(\mathbf{x}_{i})\] where \(\mathbf{R}_{n}(\mathbf{t}) =n^{-1/2} \sum_{j=1}^{n} \psi_{p}(r_{j})\mathbf{x}_{j} I(\mathbf{x}_{j}\leq \mathbf{t})\) is the residual cusum (RC) process and \(\psi_{p}(r_{j})\) is the derivative of theloss function \(\kappa_{p}\) calculatedfor residual \(r_{j} = y_{j} -\mathbf{x}_{j}^{\top}\boldsymbol \beta(p)\). The samplingdistribution of this test statistic is non-normal (X. M. He and Zhu 2003) and a resamplingapproach is used to obtain the \(p\)-value under the null hypothesis.
An example is provided further below using the New York Air Qualitydata set, which contains \(111\)complete observations on daily mean ozone (parts per billion – ppb) andsolar radiation (Langleys – Ly). For simplicity, wind speed and maximumdaily temperature, also included in the data set, are not analysedhere.
Suppose that the model of interest is \[\begin{equation}\label{eq:5}\tag{5}Q_{\text{ozone}}(p) = \beta_{0}(p) + \beta_{1}(p) \cdot \text{Solar.R}.\end{equation}\] Three conditional quantiles (\(p \in \{0.1,0.5,0.9\}\)) are estimated andplotted using the following code:
> dd <- airquality[complete.cases(airquality), ]> dd <- dd[order(dd$Solar.R), ]> fit.rq <- rq(Ozone ~ Solar.R, tau = c(.1, .5, .9), data = dd)> x <- seq(min(dd$Solar.R), max(dd$Solar.R), length = 200)> yhat <- predict(fit.rq, newdata = data.frame(Solar.R = x))> plot(Ozone ~ Solar.R, data = dd)> apply(yhat, 2, function(y,x) lines(x,y), x = x)
Figure 4. Predicted 10th (solid line), 50th (dashedline), and 90th (dot-dashed line) centiles of ozone conditional on solarradiation in the New York Air Quality data set.
As a function of solar radiation, the median of the ozone dailyaverages increases by \(0.09\) ppb foreach Ly increase in solar radiation (Figure 4). The 90th centile ofconditional ozone shows a steeper slope at \(0.39\) ppb/Ly, about nine times larger thanthe slope of the conditional \(10\)thcentile at \(0.04\) ppb/Ly.
The RC test applied to the the object fit.rq providesevidence of lack of fit for all quantiles considered, particularly for\(p = 0.1\) and \(p = 0.5\). Therefore the straight-linemodel in Equation 5 for these three conditional quantiles does not seemto be appropriate. The New York Air Quality data set will be analysedagain in the next section, where a transformation-based approach tononlinear modelling is discussed.
> gof.rq <- GOFTest(fit.rq, alpha = 0.05, B = 1000, seed = 987)> gof.rqGoodness-of-fit test for quantile regression based on the cusum processQuantile 0.1: Test statistic = 0.1057; p-value = 0.001Quantile 0.5: Test statistic = 0.2191; p-value = 0Quantile 0.9: Test statistic = 0.0457; p-value = 0.018
Transformation models
Complex dynamics may result in nonlinear effects in the relationshipbetween the covariates and the response variable. For instance, inkinesiology, pharmaco*kinetics, and enzyme kinetics, the study of thedynamics of an agent in a system involves the estimation of nonlinearmodels; phenomena like human growth, certain disease mechanisms and theeffects of harmful environmental substances such as lead and mercury,may show strong nonlinearities over time. In this section, the linearmodel is abandoned in favor of a more general model of the type \[\begin{equation}\label{eq:6}\tag{6}Q_{Y|X}(p) = g\left\{\mathbf{x}^{\top}\boldsymbol \beta(p)\right\},\end{equation}\] for some real-valued function \(g\). If \(g\) is nonlinear, the alternativeapproaches to conditional quantile modelling are
- nonlinear parametric models—this approach may provide a model withsubstantive interpretability, possibly parsimonious (in general moreparsimonious than polynomials), and valid beyond the observed range ofthe data. A nonlinear model depends on either prior knowledge of thephenomenon or the introduction of new, strong theory to explain theobserved relationship with potential predictive power. Estimation maypresent challenges;
- polynomial models and smoothing splines—this approach goes under thelabel of nonparametric regression, in which the complexity ofthe model is approximated by a sequence of linear polynomials (a naiveglobal polynomial trend can be considered to be a special case). Anonparametric model need not introducing strong assumptions about therelationship and is essentially data-driven. Estimation is based onlinear approximations and, typically, requires the introduction of apenalty term to control the degree of smoothing; and
- transformation models—a flexible, parsimonious family of parametrictransformations is applied to the response seeking to obtain approximatelinearity on the transformed scale. The data provide information aboutthe ‘best’ transformation among a family of transformations. Estimationis facilitated by the application of methods for linear models.
The focus of this section is on the third approach. More specificallythe functions available in Qtools refer to the methodsfor transformation-based QR models developed by Powell (1991), Chamberlain (1994), Muand He (2007), Dehbi, Cortina-Borja, andGeraci (2016) and Geraci and Jones(2015). Examples of approaches to nonlinear QR based onparametric models or splines can be found in Koenker and Park (1996) and Yu and Jones (1998), respectively.
The goal of transformation-based QR is to fit the model \[\begin{equation}\label{eq:7}\tag{7}Q_{h\left(Y;\lambda_{p}\right)}(p) = \mathbf{x}^{\top}\boldsymbol\beta(p).\end{equation}\] The assumption is that the transformation \(h\) is the inverse of \(g\), \(h(Y;\lambda_{p}) \equiv g^{-1}(Y)\), so that the \(p\)th quantile function of the transformedresponse variable is linear. (In practice, it is satisfactory to achieveapproximate linearity.) The parameter \(\lambda_{p}\) is a low-dimensionalparameter that gives some flexibility to the shape of the transformationand is estimated from the data. In general, the interest is onpredicting \(Q_{Y|X}(p)\) andestimating the effects of the covariates on \(Q_{Y|X}(p)\). If \(h\) is a non-decreasing function on \(\mathbb{R}\) (as is the case for alltransformations considered here), predictions can be easily obtainedfrom (7) by virtue of the equivariance property of quantiles, \[\begin{equation}\label{eq:8}\tag{8}Q_{Y|X}(p) = h^{-1}\left\{\mathbf{x}^{\top}\boldsymbol \beta(p);\lambda_{p}\right\}.\end{equation}\] The marginal effect of the \(j\)th covariate \(x_{j}\) can be obtained by differentiatingthe quantile function \(Q_{Y|X}(p)\)with respect to \(x_{j}\). This can bewritten as the derivative of the composition \(Q \circ \eta\), i.e.
\[\begin{equation}\label{eq:9}\tag{9}\frac{\partial Q(p)}{\partial x_{j}} = \frac{\partial Q(p)}{\partial\eta(p)} \cdot \frac{\partial \eta(p)}{\partial x_{j}},\end{equation}\] \(\eta(p) =\mathbf{x}^{\top}\boldsymbol \beta(p)\). Once the estimates \(\hat{\boldsymbol{\beta}}(p)\) and \(\hat{\lambda}_{p}\) are obtained, these canbe plugged in Equations 8 and 9.
The package Qtools provides several transformationfamilies, namely the Box–Cox (Box and Cox1964), Aranda-Ordaz (Aranda-Ordaz1981), and Jones (Jones 2007; Geraci andJones 2015) transformations. A distinction between these familiesis made in terms of the support of the response variable to which thetransformation is applied and the number of transformation parameters.The Box–Cox model is a one-parameter family of transformations whichapplies to singly bounded variables, \(y >0\). The Aranda-Ordaz symmetric and asymmetric transformationstoo have one parameter and are used when responses are bounded on theunit interval, \(0 < y < 1\)(doubly bounded). Geraci and Jones (2015)developed two families of transformations which can be applied to eithersingly or doubly bounded responses:
- Proposal I transformations – this family has one parameter and itcomes in both symmetric and asymmetric forms;
- Proposal II transformations – this family has two parameters, withone parameter modelling the symmetry (or lack thereof) of thetransformation.
Originally, Box and Cox (1964) proposedusing power transformations to address lack of linearity,hom*oscedasticity and normality of the residuals in mean regressionmodelling. () reported that ``seldom does this transformation fulfil thebasic assumptions of linearity, normality and hom*oscedasticitysimultaneously as originally suggested by Box & Cox (1964). TheBox-Cox transformation has found more practical utility in the empiricaldetermination of functional relationships in a variety of fields,especially in econometrics’’.
Indeed, the practical utility of power transformations has been longrecognised in QR modelling (Powell 1991;Buchinsky 1995; Chamberlain 1994; Mu and He 2007). Model 7 is theBox–Cox QR model if \[\begin{equation}\label{eq:10}\tag{10}h_{BC}\left(Y;\lambda_{p}\right) =\begin{cases}\dfrac{Y^{\lambda_{p}} - 1}{\lambda_{p}} & \text{if $\lambda_{p}\neq 0$}\\[.5cm]\log Y & \text{if $\lambda_{p} = 0$}.\end{cases}\end{equation}\] Note that when \(\lambda_{p} \neq 0\), the range of thistransformation is not \(\mathbb{R}\)but the singly bounded interval \((-1/\lambda_{p},\infty)\). This impliesthat the inversion in (8) is defined only for \(\lambda_{p} \mathbf{x}^{\top}\boldsymbol \beta(p)+ 1 > 0\).
The ‘symmetric’ Aranda-Ordaz transformation is given by \[\begin{equation}\label{eq:11}\tag{11}h_{AOs}\left(Y;\lambda_{p}\right) =\begin{cases}\dfrac{2}{\lambda_{p}} \quad \dfrac{Y^{\lambda_{p}} -\left(1-Y\right)^{\lambda_{p}}}{Y^{\lambda_{p}} +\left(1-Y\right)^{\lambda_{p}}}& \text{if $\lambda_{p} \neq0$},\\[.5cm]\log\left(\dfrac{Y}{1-Y}\right) & \text{if $\lambda_{p} =0$}.\end{cases}\end{equation}\] (The symmetry here is that \(h_{AOs}(\theta;\lambda_p) =-h_{AOs}(1-\theta;\lambda_p)=h_{AOs}(\theta;-\lambda_p)\).) Thereis a range problem with this transformation too since, for all \(\lambda_p \neq 0\), the range of \(h_{AOs}\) is not \(\mathbb{R}\), but \((-2/|\lambda_{p}|, 2/|\lambda_{p}|)\). The‘asymmetric’ Aranda-Ordaz transformation is given by \[\begin{equation}\label{eq:12}\tag{12}h_{AOa}\left(Y;\lambda_{p}\right) =\begin{cases}\log \left\{\dfrac{\left(1-Y\right)^{-\lambda_{p}} -1}{\lambda_{p}} \right\}& \text{if $\lambda_{p} \neq 0$},\\[.5cm]\log \left\{-\log\left(1 - Y\right)\right\} & \text{if$\lambda_{p} = 0$}.\end{cases}\end{equation}\] For \(\lambda_{p} =0\), this is equivalent to the complementary log-log. Theasymmetric Aranda-Ordaz transformation does have range \(\mathbb{R}\). Note that \(h_{AOa}(Y;1) = \log (Y/(1-Y))\), i.e.thetransformation is symmetric.
To overcome range problems, which give rise to computationaldifficulties, Geraci and Jones (2015)proposed to use instead one-parameter transformations with range \(\mathbb{R}\). Proposal I is written interms of the variable (say) \(W\),where \[\begin{equation}\label{eq:13}\tag{13}h_I\left(W;\lambda_{p}\right) =\begin{cases}\dfrac{1}{2\lambda_{p}}\left(W^{\lambda_{p}} -\dfrac{1}{W^{\lambda_{p}}}\right)& \text{if $\lambda_{p} \neq0$}\\[.5cm]\log W & \text{if $\lambda_{p} = 0$},\end{cases}\end{equation}\] which takes on four forms depending on therelationship of \(W\) to \(Y\), as described in Table 2. For each ofdomains \((0,\infty)\) and \((0,1)\), there are symmetric and asymmetricforms.
Support of \(Y\) | Symmetric | Asymmetric |
\((0,\infty)\) | \(W= Y\) | \(W= \log(1+Y)\) | \(h_{Is}(Y;\lambda_p)\) | \(h_{Ia}(Y;\lambda_p)\) | \((0,1)\) | \(W= Y/(1-Y)\) | \(W= -\log(1-Y)\) | \(h_{Is}(Y;\lambda_p)\) | \(h_{Ia}(Y;\lambda_p)\) | Table 2: Choices of \(W\) andcorresponding notation for transformations based on (13).
Since the transformation in (13) has range \(\mathbb{R}\) for all \(\lambda_{p}\), it admits an explicitinverse transformation. In addition, in the case of a single covariate,every estimated quantile that results will be monotone increasing,decreasing or constant, although different estimated quantiles can havedifferent shapes from this collection. Geraci andJones (2015) also proposed a transformation that unifies thesymmetric and asymmetric versions of \(h_{I}\) into a single two-parametertransformation, namely \[\begin{equation}\label{eq:14}\tag{14}h_{II}(W;\lambda_p) = h_I(W_{\delta_p};\lambda_p),\end{equation}\] where \(h_I\)is given in (13) and \[\begin{equation*}W_{\delta_{p}} = h_{BC}(1+W;\delta_p) =\begin{cases}\dfrac{(1+W)^{\delta_{p}} - 1}{\delta_{p}} & \text{if $\delta_{p}> 0$}\\[.5cm]\log (1+W) & \text{if $\delta_{p} = 0$},\end{cases}\end{equation*}\] with \(W=Y\),if \(Y > 0\), and \(W=Y/(1-Y)\), if \(Y \in (0,1)\). The additional parameter\(\delta_{p}\) controls the asymmetry:symmetric forms of \(h_{I}\) correspondto \(\delta_p = 1\) while asymmetricforms of \(h_{I}\) to \(\delta_p = 0\).
All transformation models discussed above can be fitted using atwo-stage (TS) estimator (Chamberlain 1994;Buchinsky 1995) whereby \(\boldsymbol\beta(p)\) is estimated conditionally on a fine grid of valuesfor the transformation parameter(s). Alternatively, point estimation canbe approached using the RC process (Mu and He2007), which is akin to the process that leads to the RC testintroduced in the previous section. The RC estimator avoids thetroublesome inversion of the Box-Cox and Aranda-Ordaz transformations,but it is computationally more intensive than the TS estimator.
There are several methods for interval estimation, including thosebased on large-\(n\) approximations andthe ubiquitous bootstrap. Both the TS and RC estimators have anasymptotic normal distribution. The large-sample properties of the TSestimator for monotonic quantile regression models have been studied byPowell (1991) (see also Chamberlain (1994);Machado and Mata (2000)). Under regularityconditions, it can be shown that the TS estimator is unbiased and willconverge to a normal distribution with a sandwich-type limitingcovariance matrix which is easy to calculate. In contrast, the form ofthe covariance matrix of the sampling distribution for the RC estimatoris rather complicated and its estimation requires resampling (Mu and He 2007). Finally, if the transformationparameter is assumed to be known, then conditional inference isapposite. In this case, the estimation procedures simplify to those forstandard quantile regression problems.
In Qtools, model fitting for one-parametertransformation models can be carried out using the functiontsrq. The formula argument specifies the model for thelinear predictor as in (7), while the argument tsf provides thedesired transformation \(h\) asspecified in Equations 10-13: bc for the Box–Cox model,ao for Aranda-Ordaz families, and mcjI for proposal Itransformations. Additional arguments in the function tsrqinclude
- symmetry, a logical flag to specify the symmetric orasymmetric version of ao and mcjI;
- dbounded, a logical flag to specify whether the responsevariable is doubly bounded (default is strictly positive, i.e.singlybounded);
- lambda, a numerical vector to define the grid of values forestimating \(\lambda_p\); andconditional, a logical flag indicating whether \(\lambda_{p}\) is assumed to be known (inwhich case the argument lambda provides such known value).\end{itemize}
There are other functions to fit transformation models. The functionrcrq fits one-parameter transformation models using the RCestimator. The functions tsrq2 and nlrq2 are specificto Geraci and Jones’s (2015) Proposal II transformations. The formeremploys a two-way grid search while the latter is based on Nelder-Meadoptimization as implemented in optim. Simulation studies inGeraci and Jones (2015) suggest that,although computationally slower, a two grid search is numerically morestable than the derivative-free approach.
A summary of the basic differences between all fitting functions isgiven in Table 3. The table also shows the available methods insummary.rqt to estimate standard errors and confidenceintervals for the model’s parameters. Unconditional inference is carriedout jointly on \(\boldsymbol \beta(p)\)and the transformation parameter by means of bootstrap using the packageboot (Canty and Ripley 2014;Davison and Hinkley 1997). Large-\(n\) approximations (Powell 1991; Chamberlain 1994; Machado and Mata2000) are also available for the one-parameter TS estimator underiid or nid assumptions.
When summary.rqt is executed with the argumentconditional = TRUE, confidence interval estimation for \(\boldsymbol \beta_{p}\) is performed withone of the several methods developed for linear quantile regressionestimators (see options rank, iid, nid,ker, and boot in quantreg::summary.rq).
Function name | Transformation parameters | Estimation | Standard errors or confidence intervals | ||||||
Unconditional | Conditional | tsrq | 1 | Two-stage | iid, nid, boot | All types | rcrq | 1 | Residual cusum process | boot | All types | tsrq2 | 2 | Two-stage | boot | All types | nlrq2 | 2 | Nelder–Mead | boot | – | Table 3: Transformation-based quantile regression in packageQtools. ‘All types’ consists of options rank,iid, nid, ker, and boot as providedby function summary in package quantreg. |
In the New York Air Quality data example, a linear model was foundunsuitable to describe the relationship between ozone and solarradiation. At closer inspection, Figure 4 reveals that the conditionaldistribution of ozone may in fact be nonlinearly associated with solarradiation, at least for some of the conditional quantiles. The model\[\begin{equation}\label{eq:15}\tag{15}Q_{h_{Is}\{\text{ozone}\}}(p) = \beta_{0}(p) + \beta_{1}(p) \cdot\text{Solar.R},\end{equation}\] where \(h_{Is}\) denotes the symmetric version of(13) for a singly bounded response variable, is fitted for the quantiles\(p \in \{0.1,0.5,0.9\}\) using thefollowing code:
> system.time(fit.rqt <- tsrq(Ozone ~ Solar.R, data = dd, tsf = "mcjI",+ symm = TRUE, dbounded = FALSE, lambda = seq(1, 3, by = 0.005),+ conditional = FALSE, tau = c(.1, .5, .9))) user system elapsed 0.5 0.0 0.5> fit.rqtcall:tsrq(formula = Ozone ~ Solar.R, data = dd, tsf = "mcjI", symm = TRUE, dbounded = FALSE, lambda = seq(1, 3, by = 0.005), conditional = FALSE, tau = c(0.1, 0.5, 0.9))Proposal I symmetric transformation (singly bounded response)Optimal transformation parameter:tau = 0.1 tau = 0.5 tau = 0.9 2.210 2.475 1.500Coefficients linear model (transformed scale): tau = 0.1 tau = 0.5 tau = 0.9(Intercept) -3.3357578 -48.737341 16.557327Solar.R 0.4169697 6.092168 1.443407Degrees of freedom: 111 total; 109 residual
The TS estimator makes a search for \(\lambda_{p}\) over the grid \(1.000, 1.005, \ldots, 2.995, 3.000\). Thechoice of the search interval usually results from a compromise betweenaccuracy and performance: the coarser the grid, the faster thecomputation but the less accurate the estimate. A reasonable approachwould be to start with a coarse, wide-ranging grid (e.g.seq(-5, 5,by = 0.5)), then center the interval about the resulting estimateusing a finer grid, and re-fit the model.
The output above reports the estimates \(\boldsymbol{\hat{\beta}}(p)\) and \(\hat{\lambda}_p\) for each quantile levelspecified in tau. Here, the quantities of interest are thepredictions on the ozone scale and the marginal effect of solarradiation, which can obtained using the functionpredict.rqt.
> x <- seq(9, 334, length = 200)> qhat <- predict(fit.rqt, newdata = data.frame(Solar.R = x),+ type = "response")> dqhat <- predict(fit.rqt, newdata = data.frame(Solar.R = x),+ type = "maref", namevec = "Solar.R")The linear component of the marginal effect is calculated as derivative of Ozone ~ beta1 * Solar.R with respect to Solar.R
The calculations above are based on a sequence of 200 ozone values inthe interval \([9,334]\) Ly, asprovided via the argument newdata (if this argument is missing,the function returns the fitted values). There are three types ofpredictions available:
- link—predictions of conditional quantiles on thetransformed scale (7), i.e.\[\hat{Q}_{h\left(Y;\hat{\lambda}_{p}\right)}(p) =\mathbf{x}^{T}\hat{\boldsymbol{\beta}}(p);\]
- response—predictions of conditional quantiles on theoriginal scale (8), i.e.\[\hat{Q}_{Y|X}(p) =h^{-1}\left\{\mathbf{x}^{\top}\hat{\boldsymbol{\beta}}(p);\hat{\lambda}_{p}\right\};\] and
- maref—predictions of the marginal effect (9).
In the latter case, the argument namevec is used to specifythe name of the covariate with respect to which the marginal effect hasto be calculated. The function maref.rqt computes derivativessymbolically using the stats function derivand these are subsequently evaluated numerically. While the nonlinearcomponent of the marginal effect in Equation 9 (i.e.\(\partial Q(p)/\partial \eta(p)\)) is ratherstraightforward to derive for any of the transformations (10)-(13), thederivative of the linear predictor (i.e.\(\partial \eta(p)/\partial x_{j}\)) requiresparsing the formula argument in order to obtain an expressionsuitable for deriv. The function maref.rqt can handlesimple expressions with common functions like log,exp, etc., interaction terms, and `as is’ terms(i.e.I()). However, using functions that are not recognised byderiv will trigger an error.
Figure 5. Predicted 10th (solid line), 50th (dashedline), and 90th (dot-dashed line) centiles of ozone conditional on solarradiation (a) and corresponding estimated marginal effects (b) using thesymmetric proposal I transformation in the New York Air Quality dataset.
The predicted quantiles of ozone and the marginal effects of solarradiation are plotted in Figure 5 using the following code:
> par(mfrow = c(1, 2))> plot(Ozone ~ Solar.R, data = dd, xlab = "Solar radiation (lang)",+ ylab = "Ozone (ppb)")> for(i in 1:3) lines(x, qhat[ ,i], lty = c(1, 2, 4)[i], lwd = 2)> plot(range(x), range(dqhat), type = "n", xlab = "Solar radiation (lang)",+ ylab = "Marginal effect")> for(i in 1:3) lines(x, dqhat[ ,i], lty = c(1, 2, 4)[i], lwd = 2)
The effect of solar radiation on different quantiles of ozone levelsshows a nonlinear behavior, especially at lower ranges of radiation(below \(50\) Ly) and on the medianozone. It might be worth testing the goodness-of-fit of the model. Inthe previous analysis, it was found evidence of lack of fit for thelinear specification (5). In contrast, the output reported belowindicates that, in general, the goodness of fit of the quantile modelsbased on the transformation model (15) has improved since the teststatistics are now smaller at all values of \(p\). However, such improvement is not yetsatisfactory for the median.
> GOFTest(fit.rqt, alpha = 0.05, B = 1000, seed = 416)Goodness-of-fit test for quantile regression based on the cusum processQuantile 0.1: Test statistic = 0.0393; p-value = 0.025Quantile 0.5: Test statistic = 0.1465; p-value = 0.005Quantile 0.9: Test statistic = 0.0212; p-value = 0.127
The TS and RC estimators generally provide similar estimates andpredictions. However, computation based on the cusum process tends to besomewhat slow, as shown further below. This is also true for the RC testprovided by GOFTest.
> system.time(fit.rqt <- rcrq(Ozone ~ Solar.R, data = dd, tsf = "mcjI",+ symm = TRUE, dbounded = FALSE, lambda = seq(1, 3, by = 0.005),+ tau = c(.1, .5, .9))) user system elapsed 36.88 0.03 37.64
An example using doubly bounded transformations is demonstrated usingthe A-level Chemistry Scores data set. The latter is available fromQtools and it consists of 31022 observations of A-levelscores in Chemistry for England and Wales students, 1997. The data setalso includes information of prior academic achievement as assessed withGeneral Certificate of Secondary Education (GCSE) average scores. Thegoal is to evaluate the ability of GCSE to predict A-level scores. Thelatter are based on national exams in specific subjects (e.g.chemistry)with grades ranging from A to F. For practical purposes, scores areconverted numerically as follows: A = 10, B = 8, C = 6, D = 4, E = 2,and F = 0. The response is therefore doubly bounded between 0 ad 10. Itshould be noted that this variable is discrete, although, for the sakeof simplicity, here it is assumed that the underlying process iscontinuous.
The model considered here is \[\begin{equation}\label{eq:16}\tag{16}Q_{h_{AOa}\{\text{score}\}}(p) = \beta_{0}(p) + \beta_{1}(p) \cdot\text{gcse},\end{equation}\] where \(h_{AOa}\) denotes the asymmetricAranda-Ordaz transformation in (12). This model is fitted for \(p = 0.9\):
> data(Chemistry)> fit.rqt <- tsrq(score ~ gcse, data = Chemistry, tsf = "ao", symm = FALSE,+ lambda = seq(0,2,by=0.01), tau = 0.9)
The predicted \(90\)th centile ofA-level scores conditional on GCSE and the marginal effect of GCSE areplotted in Figure 6. There is clearly a positive, nonlinear associationbetween the two scores. The nonlinearity is partly explained by thefloor and ceiling effects which result from the boundedness of themeasurement scale. Note, however, that the S-shaped curve is notsymmetric about the inflection point. As a consequence, the marginaleffect is skewed to the left. Indeed, the estimate \(\hat{\lambda}_{0.9} = 0\) and the narrowconfidence interval give support to a complementary log-logtransformation:
> summary(fit.rqt, conditional = FALSE, se = "nid")call:summary.rqt(object = fit.rqt, se = "nid", conditional = FALSE)Aranda-Ordaz asymmetric transformation (doubly bounded response)Summary for unconditional inferencetau = 0.9Optimal transformation parameter: Value Std. Error Lower bound Upper bound 0.000000000 0.001364422 -0.002674218 0.002674218Coefficients linear model (transformed scale): Value Std. Error Lower bound Upper bound(Intercept) -4.3520060 0.015414540 -4.3822179 -4.3217941gcse 0.8978072 0.002917142 0.8920898 0.9035247Degrees of freedom: 31022 total; 31020 residual
Alternatively, one can estimate the parameter \(\delta_{p}\) using a two-parametertransformation:
> coef(tsrq2(score ~ gcse, data = chemsub, dbounded = TRUE,+ lambda = seq(0, 2, by = 0.1), delta = seq(0, 2, by = 0.1),+ tau = 0.9), all = TRUE)(Intercept) gcse lambda delta -4.1442274 0.8681246 0.0000000 0.0000000
These results confirm the asymmetric nature of the relationship since\(\hat{\delta}_{0.9} = 0\). Similarresults (not shown) were obtained with nlrq2.
In conclusion, the package Qtools offers severaloptions in terms of transformations and estimation algorithms, theadvantages and disadvantages of which are discussed by . In particular,they found that the symmetric Proposal I transformation improvesconsiderably on the Box-Cox method and marginally on the Aranda-Ordaztransformation in terms of mean squared error of the predictions. Also,asymmetric transformations do not seem to improve sufficiently often onsymmetric transformations to be especially recommendable. However, theBox-Cox and the symmetric Aranda-Ordaz transformations should not beused when individual out-of-range predictions represent a potentialinconvenience as, for example, in multiple imputation (see sectionfurther below). Finally, in some situations transformation-basedquantile regression may be competitive as compared to methods based onsmoothing, as demonstrated by a recent application to anthropometriccharts (Boghossian et al. 2016).
Conditional LSS
Quantile-based measures of location, scale, and shape can be assessedconditionally on covariates. A simple approach is to a fit alinear model as in (4) or a transformation-based model as in (7), andthen predict \(\hat{Q}_{Y|X}(p)\) toobtain the conditional LSS measures in Equation 3 for specific values of\(\mathbf{x}\).
Estimation of conditional LSS can be carried out by using thefunction qlss.formula. The conditional model is specified inthe argument formula, while the probability \(p\) is given in probs. (As seen inEquation 3, the other probabilities of interest to obtain thedecomposition of the conditional quantiles are \(1-p\), \(0.25\), \(0.5\), and \(0.75\).) The argument typespecifies the required type of regression model, more specificallyrq for linear models and rqt for transformation-basedmodels. The function qlss.formula will take any additionalargument to be passed to quantreg::rq or tsrq(e.g.subset, weights, etc.).
Let’s consider the New York Air Quality data example discussed in theprevious section and assume that the transformation model (15) holds forthe quantiles \(p \in \{0.05, 0.1, 0.25, 0.5,0.75,\) \(0.9, 0.95\}\). Thenthe conditional LSS summary of the distribution of ozone conditional onsolar radiation for \(p = 0.05\) and\(p = 0.1\) is calculated asfollows:
> fit.qlss <- qlss(formula = Ozone ~ Solar.R, data = airquality, type =+ "rqt", tsf = "mcjI", symm = TRUE, dbounded = FALSE, lambda =+ seq(1, 3, by = 0.005), probs = c(0.05, 0.1))> fit.qlsscall:qlss.formula(formula = Ozone ~ Solar.R, probs = c(0.05, 0.1), data = airquality, type = "rqt", tsf = "mcjI", symm = TRUE, dbounded = FALSE, lambda = seq(1, 3, by = 0.005))Conditional Quantile-Based Location, Scale, and Shape-- Values are averaged over observations --** Location **Median[1] 30.2258** Scale **Inter-quartile range (IQR)[1] 43.40648Inter-quantile range (IPR) 0.05 0.188.02909 73.93430**Shape**Skewness index 0.05 0.10.5497365 0.5180108Shape index 0.05 0.11.960315 1.661648
The output, which is of class “qlss”, is a named list withthe same LSS measures seen in the case of unconditional quantiles.However, these are now conditional on solar radiation. By default, thepredictions are the fitted values, which are averaged over observationsfor printing purposes. An optional data frame for predictions can begiven via the argument newdata in predict.qlss. Ifinterval = TRUE, the latter computes confidence intervals atthe specified level using R bootstrap replications (itis, therefore, advisable to set the seed before callingpredict.qlss). The conditional LSS measures can be convenientlyplotted using the plot.qlss function as shown in the codebelow. The argument z is required and specifies the covariateused for plotting. Finally, the argument whichp specifies oneprobability (and one only) among those given in probs thatshould be used for plotting (e.g.\(p =0.1\) in the following example).
> set.seed(567)> x <- seq(9, 334, length = 200)> qhat <- predict(fit.qlss, newdata = data.frame(Solar.R = x),+ interval = TRUE, level = 0.90, R = 500)> plot(qhat, z = x, whichp = 0.1, interval = TRUE, type = "l",+ xlab = "Solar radiation (lang)", lwd = 2)
Figure 7 shows that both the median and the IQR of ozone increasenonlinearly with increasing solar radiation. The distribution of ozoneis skewed to the right and the degree of asymmetry is highest at lowvalues of solar radiation. This is due to the extreme curvature of themedian which takes on values close to the 10th centile (Figure 5).(Recall that the index approaches 1 when \(Q(p) \approx Q(0.5)\).) However, thesparsity of observations at the lower end of the observed range of solarradiation determines substantial uncertainty as reflected by the widerconfidence interval (Figure 7). At \(p =0.1\), the conditional shape index is on average equal to \(1.66\) and it increases monotonically from\(1.32\) to about \(1.85\), remaining always below thetail-weight threshold of a normal distribution (\(1.90\)).
Figure 7. Location, scale and shape of ozone levelsconditional on solar radiation in the New York Air Quality data set.Dashed lines denote the bootstrapped \(90%\) point-wise confidence intervals.