SCM

[#6471] `.vcov.avar1` expects something a little different from what `lmrob.S` provides

Date:
2016-10-05 21:05
Priority:
3
State:
Open
Submitted by:
Ben Hansen (benstatistician)
Assigned to:
Nobody (None)
Hardware:
None
Product:
None
Operating System:
All
Component:
None
Version:
None
Severity:
None
Resolution:
None
URL:
Summary:
`.vcov.avar1` expects something a little different from what `lmrob.S` provides

Detailed description
`.vcov.avar1` computes sandwich estimates of covariance. The entry on the diagonal of the bread matrix corresponding to the scale parameter is now being estimated by

```
mean(w0^2 - bb^2)
```
(In the terms of Croux, Dhaene and Hoorelbeke [2004], this is the estimate of the lower-right entry in 3.7's $\Omega_{MM}$.)

The use of this expression to estimate that matrix entry presumes the mean of `w0` to be `bb`. (Or perhaps that `w0` is a realization of a random variable with expectation `bb`.) However, `lmrob.S` arranges that `sum(w0)/(n-p) = bb`, not that `mean(w0) = bb`, as I learned from Matias Salibian Barrera on this discussion thread: https://github.com/msalibian/Fast-S/issues/1.
There are some simple examples on that thread also.

The problem seems likely to undercut `vcov.avar1`'s performance in small samples; it may also explain some failures of positive-definiteness. It shouldn't be difficult to correct; the trickiest part may be deciding whether `.vcov.avar1` should mimic `lmrob.S`'s approach to degrees of freedom or continue taking simple means, but with an updated centering point. An intermediate to these alternatives would be to replace `mean(w0^2 - bb^2)` with `var(w0)`.

Comments:

Message  ↓
Date: 2016-10-07 02:24
Sender: Ben Hansen

I tested out a fix that works by adjusting the value of `bb`. The new lines are between the first and last in the block below, they being adjacent line s of the current .vcov.avar1 definition.

```
p <- ncol(x)
## Next 2 lines added post-.vcov.MM, addressing #6471.
## This assumes initial S-estimate solved sum( loss )/(n-p) == bb,
## not mean( loss ) == bb as assumed in .vcov.avar1
adj <- (n-p)/n
bb <- bb * adj
r.s <- r / scale # final scaled residuals
```

I tried this change in a very small simulation study. In small samples (n=50, p=2) it improved coverage, but the improvement was extremely modest: a nominally 5% test that previously had 10% type 1 error rate came down to a type 1 error rate of 9%.

Attached Files:

Changes

No Changes Have Been Made to This Item

Thanks to:
Vienna University of Economics and Business Powered By FusionForge