Multivariate Calibration

In a calibration problem we have accurately known data values (X) and responses to those values (Y). The responses are scaled and contaminated by noise (E), but are easier to obtain. Given the calibration data (X, Y), we want to estimate new data values (X') when we observe a response Y'. Using Brown's (Brown 1982) notation, we have the model

[tex] Y=\textbf{1}\alpha ^T + XB + E [/tex] (1)

[tex] Y'=\alpha ^T + X'^T B + E' [/tex] (2)

where the matrix sizes are Y (n×q), E (n×q), B (p×q), Y' (1×q), E' (1×q), X (n×p) and X' (p×1). [tex]\textbf{1}[/tex] is a column vector of ones (n×1). This is slightly less general than Brown's model (only one response vector for each X'). Here n is the length of the calibration data, q the length of the response vector, and p the length of the unknown X'. For example, if Y contains proxy responses to global temperature X, then p is one and q is the number of proxy records.

In the following, it is assumed that the columns of E are zero-mean, normally distributed vectors and that the rows of E are uncorrelated. (This assumption would be contradicted by red proxy noise.) The (q×q) covariance matrix of the noise is denoted by G. In addition, the columns of X are centered and scaled to have average sum of squares one.
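To fix the notation, here is a minimal NumPy sketch of models (1) and (2) on synthetic data (the dimensions follow the proxy example above; the seed, noise level, and the choice G = I are illustrative assumptions, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, q = 79, 1, 40          # calibration length, dim of X', number of proxies

alpha = rng.normal(size=q)   # unknown proxy offsets (alpha)
B = rng.normal(size=(p, q))  # unknown proxy sensitivities (B)

# Columns of X centered, with average sum of squares one
X = rng.normal(size=(n, p))
X -= X.mean(axis=0)
X /= np.sqrt((X ** 2).mean(axis=0))

E = rng.normal(size=(n, q))              # white noise rows; G = I here
Y = np.ones((n, 1)) * alpha + X @ B + E  # model (1): Y = 1 alpha^T + X B + E

x_new = np.array([1.5])                          # the unknown X'
y_new = alpha + x_new @ B + rng.normal(size=q)   # model (2): one new response Y'
```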

Classical and Inverse Calibration Estimators

The classical estimator of X' (CCE (Williams 69), indirect regression (Sundberg 99), inverse regression (Juckes 06)) is obtained by deriving the ML estimator with known [tex]B[/tex] and [tex]G[/tex] and then replacing [tex]B[/tex] by [tex]\hat{B}[/tex] and [tex]G[/tex] by [tex]\hat{G}[/tex], where

[tex]\hat{B}=(X^TX)^{-1}X^TY[/tex] (3a)

[tex]\hat{\alpha}^T=(\textbf{1}^T \textbf{1})^{-1}\textbf{1}^TY[/tex] (3b)


[tex]\hat{G}=(Y_c ^TY_c-\hat{B}^TX^TY_c)/(n-p-q) [/tex] (4)

([tex]Y_c=Y-\textbf{1}\hat{\alpha}^T[/tex], i.e. centered Y), yielding the CCE estimator

[tex] \hat{X}'=(\hat{B} S^{-1}\hat{B}^T)^{-1}\hat{B} S^{-1}(Y'^T-\hat{\alpha})[/tex] (5)


[tex]S=Y_c^TY_c-\hat{B}^TX^TY_c[/tex] (6)
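Equations (3a)–(6) translate almost line by line into code. A sketch (the function name `cce` and the synthetic high-SNR test data are my own; Y' is handled as a 1-D vector):

```python
import numpy as np

def cce(X, Y, y_new):
    """Classical calibration estimator, eqs (3)-(6)."""
    alpha_hat = Y.mean(axis=0)                    # (3b): (1^T 1)^{-1} 1^T Y
    Yc = Y - alpha_hat                            # centered Y
    B_hat = np.linalg.solve(X.T @ X, X.T @ Yc)    # (3a); X^T Y = X^T Y_c since X is centered
    S = Yc.T @ Yc - B_hat.T @ (X.T @ Yc)          # (6); note G_hat = S/(n-p-q)
    M = B_hat @ np.linalg.solve(S, B_hat.T)       # B_hat S^{-1} B_hat^T
    d = B_hat @ np.linalg.solve(S, y_new - alpha_hat)
    return np.linalg.solve(M, d)                  # (5)

# Illustrative check on synthetic, high-SNR data (values assumed, not from the post)
rng = np.random.default_rng(1)
n, p, q = 200, 1, 20
alpha = rng.normal(size=q); B = rng.normal(size=(p, q))
X = rng.normal(size=(n, p)); X -= X.mean(axis=0); X /= np.sqrt((X ** 2).mean(axis=0))
Y = np.ones((n, 1)) * alpha + X @ B + 0.05 * rng.normal(size=(n, q))
y_new = alpha + np.array([1.5]) @ B + 0.05 * rng.normal(size=q)
x_hat = cce(X, Y, y_new)
```

With this much signal the estimate lands very close to the true value 1.5.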

Another way to go is ICE (inverse calibration estimator (Krutchkoff 67), direct regression (Sundberg 99)): regress X directly on Y,

[tex]\hat{\hat{X}}'^T=(Y'-\hat{\alpha}^T)(Y_c^TY_c)^{-1}Y_c^TX[/tex] (7)
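Eq (7) is even shorter in code; a sketch under the same kind of synthetic setup (`ice` is my name for it; note that [tex]Y_c^TY_c[/tex] is invertible only when n exceeds q):

```python
import numpy as np

def ice(X, Y, y_new):
    """Inverse calibration estimator, eq (7): regress X directly on Y."""
    alpha_hat = Y.mean(axis=0)
    Yc = Y - alpha_hat
    coef = np.linalg.solve(Yc.T @ Yc, Yc.T @ X)   # (q x p) coefficients; needs n > q
    return (y_new - alpha_hat) @ coef

# Illustrative check on synthetic, high-SNR data (values assumed, not from the post)
rng = np.random.default_rng(1)
n, p, q = 200, 1, 20
alpha = rng.normal(size=q); B = rng.normal(size=(p, q))
X = rng.normal(size=(n, p)); X -= X.mean(axis=0); X /= np.sqrt((X ** 2).mean(axis=0))
Y = np.ones((n, 1)) * alpha + X @ B + 0.05 * rng.normal(size=(n, q))
y_new = alpha + np.array([1.5]) @ B + 0.05 * rng.normal(size=q)
x_ice = ice(X, Y, y_new)
```

At this SNR the shrinkage of ICE toward zero is negligible, so the result is close to CCE.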

Note that nothing said so far implies that these estimators are optimal in any sense. It turns out that given special prior knowledge of X' (Xs and Ys sampled from a normal population), ICE is optimal.

An important note (without proof here yet): with ICE, the sample variance of the reconstruction in the calibration period will be smaller than the sample variance of the calibration data, while with CCE it will be larger. In the absence of noise, ICE and CCE naturally yield the same result. Update: see Gerd's link, and also note that ICE is a matrix-weighted average between CCE and the zero matrix (Brown82, Eq 2.21).
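The variance claim is easy to check numerically. In the sketch below both estimators reconstruct the calibration-period X from its own proxy rows; with p=1 the ICE reconstruction is an exact scalar shrinkage of the CCE one toward zero, so its variance is necessarily smaller (synthetic data, SNR about 1, setup assumed by me):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 79, 1, 40
alpha = rng.normal(size=q); B = rng.normal(size=(p, q))
X = rng.normal(size=(n, p)); X -= X.mean(axis=0); X /= np.sqrt((X ** 2).mean(axis=0))
Y = np.ones((n, 1)) * alpha + X @ B + rng.normal(size=(n, q))   # SNR ~ 1

alpha_hat = Y.mean(axis=0); Yc = Y - alpha_hat
B_hat = np.linalg.solve(X.T @ X, X.T @ Yc)
S = Yc.T @ Yc - B_hat.T @ (X.T @ Yc)

# CCE reconstruction of every calibration-period value, eq (5) applied row-wise
M = B_hat @ np.linalg.solve(S, B_hat.T)
cce_rec = np.linalg.solve(M, B_hat @ np.linalg.solve(S, Yc.T)).T   # (n x p)

# ICE reconstruction, eq (7) applied row-wise
ice_rec = Yc @ np.linalg.solve(Yc.T @ Yc, Yc.T @ X)                # (n x p)
```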

Confidence Region for X’

Following Brown, we have a [tex]100(1-\gamma)[/tex] per cent confidence region: all X' such that

[tex](Y'^T-\hat{\alpha}-\hat{B}^TX')^TS^{-1}(Y'^T-\hat{\alpha}-\hat{B}^TX')/\sigma ^2(X')\leq (q/v)F(\gamma)[/tex] (8)

where [tex]F(\gamma)[/tex] is the upper [tex]100\gamma[/tex] per cent point of the standard F-distribution on q and v=(n-p-q) degrees of freedom, and

[tex]\sigma ^2(X')=1+1/n+X'^T(X^TX)^{-1}X'[/tex] (9)
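For p=1 the region (8)-(9) can be evaluated pointwise. A sketch (the function name and synthetic data are mine; the F quantile comes from scipy, with the convention that a 100(1-γ)% region uses the upper 100γ% point):

```python
import numpy as np
from scipy.stats import f as f_dist

def in_region(x_cand, y_new, X, Y, gamma=0.05):
    """True if scalar candidate x_cand lies in the 100(1-gamma)% region of eq (8)."""
    n, p = X.shape
    q = Y.shape[1]
    v = n - p - q
    alpha_hat = Y.mean(axis=0)
    Yc = Y - alpha_hat
    B_hat = np.linalg.solve(X.T @ X, X.T @ Yc)
    S = Yc.T @ Yc - B_hat.T @ (X.T @ Yc)
    xc = np.atleast_1d(float(x_cand))
    r = (y_new - alpha_hat) - B_hat.T @ xc                  # Y'^T - alpha_hat - B_hat^T X'
    lhs = r @ np.linalg.solve(S, r)
    sigma2 = 1 + 1 / n + xc @ np.linalg.solve(X.T @ X, xc)  # (9)
    return bool(lhs / sigma2 <= (q / v) * f_dist.ppf(1 - gamma, q, v))

# Synthetic high-SNR data: the region should contain the truth and exclude far points
rng = np.random.default_rng(3)
n, p, q = 200, 1, 20
alpha = rng.normal(size=q); B = rng.normal(size=(p, q))
X = rng.normal(size=(n, p)); X -= X.mean(axis=0); X /= np.sqrt((X ** 2).mean(axis=0))
Y = np.ones((n, 1)) * alpha + X @ B + 0.1 * rng.normal(size=(n, q))
y_new = alpha + np.array([1.5]) @ B + 0.1 * rng.normal(size=q)
```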

The form of this confidence region is very interesting, and it is important to note that letting [tex]\gamma[/tex] approach one makes the region degenerate to the CCE estimate [tex]\hat{X}'[/tex]. Update2: The central point of the region is NOT (AFAIK for now ;) ) the ML estimate, and the relation between the central point and CCE is, as per Brown,

[tex]C^{-1}D[/tex] (10), where

[tex]C=\hat{B}S^{-1}\hat{B}^T-(q/v)F(\gamma)(X^TX)^{-1}[/tex] (11)


[tex]D=\hat{B}S^{-1}(Y'^T-\hat{\alpha})[/tex] (12).
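A quick numerical sketch of the center (10)-(12) versus CCE (synthetic data, setup assumed by me): since C is the CCE matrix [tex]\hat{B}S^{-1}\hat{B}^T[/tex] minus a positive term, whenever the region is bounded (C positive) the center has the same sign as the CCE estimate but larger magnitude, and dropping the F term in (11) recovers CCE exactly.

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(4)
n, p, q = 79, 1, 40
v = n - p - q
gamma = 0.05
alpha = rng.normal(size=q); B = rng.normal(size=(p, q))
X = rng.normal(size=(n, p)); X -= X.mean(axis=0); X /= np.sqrt((X ** 2).mean(axis=0))
Y = np.ones((n, 1)) * alpha + X @ B + rng.normal(size=(n, q))
y_new = alpha + np.array([1.0]) @ B + rng.normal(size=q)

alpha_hat = Y.mean(axis=0); Yc = Y - alpha_hat
B_hat = np.linalg.solve(X.T @ X, X.T @ Yc)
S = Yc.T @ Yc - B_hat.T @ (X.T @ Yc)

M = B_hat @ np.linalg.solve(S, B_hat.T)             # B_hat S^{-1} B_hat^T
D = B_hat @ np.linalg.solve(S, y_new - alpha_hat)   # (12)
Fcrit = f_dist.ppf(1 - gamma, q, v)
C = M - (q / v) * Fcrit * np.linalg.inv(X.T @ X)    # (11)
center = np.linalg.solve(C, D)                      # (10)
x_cce = np.linalg.solve(M, D)                       # (5): C^{-1}D with the F term dropped
```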

Often calibration residuals are used to generate CIs for proxy reconstructions. Let's see what goes missing in that case:
I simulated proxy vs. temperature cases with q=40, n=79, and SNR=1 and SNR=0.01. With SNR=1 we get nice CIs (which agree quite well with the calibration residuals), but as the SNR gets lower the confidence region grows rapidly, quite soon becoming open from the upper side! Yet in the latter case the calibration residuals indicate relatively low noise. The dangerous situation is when the true X' is greater than the calibration X (the very thing hockey sticks are trying to prove wrong).

[Figures: simulated confidence regions. Above: SNR=1, below: SNR=0.01]
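For p=1 the "opens from the upper side" behaviour has a closed-form check: as X' grows, both sides of (8) are dominated by their quadratic terms, so the region is bounded exactly when the scalar C of eq (11) is positive. A sketch of the SNR experiment under that criterion (the exact simulation settings behind the figures are not given, so the setup below is an assumption; SNR is taken as a signal-to-noise amplitude ratio):

```python
import numpy as np
from scipy.stats import f as f_dist

def region_is_bounded(X, Y, gamma=0.05):
    """p = 1: the confidence region (8) is bounded iff C of eq (11) is positive."""
    n, p = X.shape
    q = Y.shape[1]
    v = n - p - q
    Yc = Y - Y.mean(axis=0)
    B_hat = np.linalg.solve(X.T @ X, X.T @ Yc)
    S = Yc.T @ Yc - B_hat.T @ (X.T @ Yc)
    C = (B_hat @ np.linalg.solve(S, B_hat.T)
         - (q / v) * f_dist.ppf(1 - gamma, q, v) * np.linalg.inv(X.T @ X))
    return bool(C[0, 0] > 0)

def simulate(snr, seed=0):
    """Synthetic proxies: q=40 records, n=79 calibration steps, unit-variance signal."""
    rng = np.random.default_rng(seed)
    n, p, q = 79, 1, 40
    alpha = rng.normal(size=q); B = rng.normal(size=(p, q))
    X = rng.normal(size=(n, p)); X -= X.mean(axis=0); X /= np.sqrt((X ** 2).mean(axis=0))
    Y = np.ones((n, 1)) * alpha + X @ B + rng.normal(size=(n, q)) / snr
    return X, Y
```

At SNR=1 the region is comfortably bounded; at SNR=0.01 it becomes open, exactly the failure mode described above.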



  1. Direct use of calibration residuals for estimating confidence intervals is a dangerous procedure.
  2. The assumptions of ICE simply do not hold in proxy reconstructions.


Brown 82: Multivariate Calibration, Journal of the Royal Statistical Society, Series B, Vol. 44, No. 3, pp. 287-321

Williams 69: Regression methods in calibration problems. Bull. ISI., 43, 17-28

Krutchkoff 67: Classical and inverse regression methods of calibration. Technometrics, 9, 425-439

Sundberg 99: Multivariate Calibration – Direct and Indirect Regression Methodology

Juckes 06: Millennial temperature reconstruction intercomparison and evaluation



15 Responses to “Multivariate Calibration”

  1. John Creighton Says:

    I see how I can apply these techniques now but do you know where I can find a derivation?

  2. uced Says:

    CCE is quite simple to derive, see for example Hu McCulloch’s post at CA, . Properties of CCE in the multivariate case are a different story, still working on it. Above post is a bit ‘expired’, I’ll try to update when I have time.

  3. John Creighton Says:

The post was interesting, but I didn't really get much out of it which I hadn't already noticed myself. I actually disagree with Hu McCulloch but I can't prove it yet. Let me test your board features:


  4. John Creighton Says:

New thought. I derive B via the first equation. Then I apply maximum likelihood to solve for X' given Y'. I agree with the formula for B, except I wonder why it is not:

    [tex] \hat{B}=(X^{T} X)^{-1} X^{T} (Y-1 \alpha) [/tex]

    maybe I don’t understand what \alpha is.

    UC: Updated, sorry, I’ll blame Brown ;) Now it should work with unknown mean as well.

  5. John Creighton Says:

You must think I can't read. I noticed you wrote something like I mentioned above. Anyway, let me try to minimize the left-hand side as you say.

  6. John Creighton Says:

    How come in the equation for the confidence interval that the B and X are in the opposite multiplicative order as in the original model?

  7. John Creighton Says:

    Also why is it that in the second equation on the page the transpose of X is taken but the transpose of X is not taken in the first equation on the page?

  8. uced Says:

    John, “Also why is it that in the second equation on the page the transpose of X is taken but the transpose of X is not taken in the first equation on the page?”

    The second equation corresponds to one new observation Y’ (row vector), which doesn’t belong to calibration data. X’ is transposed in the original Brown’s text (xi in 2.2), I guess the reason is to
    have unknown vector X’ ( to be estimated) in column vector form.

  9. John Creighton Says:

    Hey UC,
    can you number your equations? It makes them easier to refer to. It wasn’t apparent to me until today that if X is a column vector then X’ is a scalar. Thus it should be easy to minimize the above equation. That said it is not clear to me why the above error bounds are still valid for when x’ is not scalar.

  10. uced Says:

    “can you number your equations?”


  11. walid Says:


I am Walid. I'm working on multivariate calibration and calibration intervals. I need copies of the following 3 articles:

    1- Brown 82: Multivariate Calibration, Journal of the Royal Statistical Society. Ser B. Vol. 44, No. 3, pp. 287-321

    2- Williams 69: Regression methods in calibration problems. Bull. ISI., 43, 17-28

3- Krutchkoff 67: Classical and inverse regression methods of calibration. Technometrics, 9, 425-439.

    Please contact me on:

  12. uced Says:



    2,3) I have only paper copies of those

  13. Marty Ringo Says:


I may not understand the initial setup, but equation system (1) appears to be just a system of q equations with the same exogenous variables in each equation, a "reduced form" system in the terminology of econometrics. The estimator of alpha and B in (3a) and (3b) is like an OLS. I am not clear here why the vector of ones is not just placed in the "X" and standard notation used, but that is not my real question.

My problem comes from the G and the maximum likelihood estimator. Is G diagonal? If so, then I will trust you on the algebra for the rest. But if G is not diagonal, i.e. not known to be diagonal a priori, then (3a) isn't the ML estimator of B.

    Let me restate my question in the time dimension and equation dimension. As I understand, we are assuming no serial correlation, and thus the covariance matrix for each equation is diagonal. But what about between equations? If there is correlation of the error terms between equations — known in econometrics as a system of “Seemingly Unrelated Regression,” or SUR, equations — then that information would be in the likelihood function (the covariance of the i_th error term in equation k with the i_th error term in equation k’).

    Further, while I am decades out of date with the research, I don’t believe the finite sample distribution of SUR estimators for non-orthogonal regressors has been reduced to a tractable form and incorporated into the standard statistical software. Now it may be that the inverse estimation problem is more tractable although that is not likely, and thus, I don’t think that those F-statistics can be anything more than an approximation to the critical regions.

    Since the proxies are the original dependents, I can’t say as to whether or not one can hypothesize zero cross-equation correlation, but with temperatures themselves the cross-equation (geographical correlation) is not zero (although a SUR, with ARMA, estimation of temperature doesn’t offer much more than a standard, one-equation ARMA estimation).

  14. uced Says:

    Interesting point, Marty. Need to think about that. My Eq. 3a equals Brown87 Eq. 2.12, with text ‘the maximum likelihood estimator of B from the calibration training data solely.’ Some possibly relevant comments:

    Indeed, there is assumption of no serial correlation, but errors can be correlated between responses. Yet, this correlation is to be estimated (after computing [tex]\hat{B}[/tex] ) so it cannot be used before.

CCE is not the ML estimator; Sundberg99:

    The classical estimator is not ML (unless q=p). Heuristically this is because only p-vector part of the q-vector is used for estimation of the unknown x, and the remaining q-p components contain some (little) information about [tex]\Gamma[/tex]

    Brown87: Confidence and Conflict in Multivariate Calibration, Philip J. Brown; Rolf Sundberg; Journal of the Royal Statistical Society. Series B (Methodological), Vol. 49, No. 1. (1987), pp. 46-57.

  15. uc00 Says:

    This post is now moved to CA, . Will close the comments here.

Comments are closed.


