jackknife
Compute jackknife estimates of a parameter taking one or more given samples as parameters.
In particular, E is the estimator to be jackknifed as a function name, handle, or inline function, and x is the sample for which the estimate is to be taken. The i-th entry of jackstat will contain the value of the estimator on the sample x with its i-th row omitted.
jackstat (i) = E(x(1 : i - 1, i + 1 : length(x))) |
Depending on the number of samples to be used, the estimator must have the appropriate form:
jackstat = jackknife (@std, rand (100, 1)).
jackstat = jackknife (@(x) std(x{1})/var(x{2}),
rand (100, 1), randn (100, 1))
|
If all goes well, a theoretical value P for the parameter is already known, n is the sample size,
t = n * E(x) - (n - 1) *
mean(jackstat)
and
v = sumsq(n * E(x) - (n - 1) *
jackstat - t) / (n * (n - 1))
then
(t-P)/sqrt(v) should follow a t-distribution with
n-1 degrees of freedom.
Jackknifing is a well known method to reduce bias. Further details can be found in:
Source Code: jackknife
for k = 1:1000
rand ("seed", k); # for reproducibility
x = rand (10, 1);
s(k) = std (x);
jackstat = jackknife (@std, x);
j(k) = 10 * std (x) - 9 * mean (jackstat);
endfor
figure();
hist ([s', j'], 0:sqrt(1/12)/10:2*sqrt(1/12))
|
for k = 1:1000
randn ("seed", k); # for reproducibility
x = randn (1, 50);
rand ("seed", k); # for reproducibility
y = rand (1, 50);
jackstat = jackknife (@(x) std(x{1})/std(x{2}), y, x);
j(k) = 50 * std (y) / std (x) - 49 * mean (jackstat);
v(k) = sumsq ((50 * std (y) / std (x) - 49 * jackstat) - j(k)) / (50 * 49);
endfor
t = (j - sqrt (1 / 12)) ./ sqrt (v);
figure();
plot (sort (tcdf (t, 49)), ...
"-;Almost linear mapping indicates good fit with t-distribution.;")
|