class Statsample::DominanceAnalysis::Bootstrap

Goal¶ ↑

Generates Bootstrap sample to identity the replicability of a Dominance Analysis. See Azen & Bodescu (2003) for more information.

Usage¶ ↑

require 'statsample'
a=100.times.collect {rand}.to_scale
b=100.times.collect {rand}.to_scale
c=100.times.collect {rand}.to_scale
d=100.times.collect {rand}.to_scale
ds={'a'=>a,'b'=>b,'c'=>c,'d'=>d}.to_dataset
ds['y']=ds.collect{|row| row['a']*5+row['b']*2+row['c']*2+row['d']*2+10*rand()}
dab=Statsample::DominanceAnalysis::Bootstrap.new(ds2, 'y', :debug=>true)
dab.bootstrap(100,nil)
puts dab.summary

<strong>Output</strong>

 Sample size: 100
t: 1.98421693632958

Linear Regression Engine: Statsample::Regression::Multiple::MatrixEngine
Table: Bootstrap report
--------------------------------------------------------------------------------------------
| pairs                 | sD  | Dij    | SE(Dij) | Pij   | Pji   | Pno   | Reproducibility |
--------------------------------------------------------------------------------------------
| Complete dominance    |
--------------------------------------------------------------------------------------------
| a - b                 | 1.0 | 0.6150 | 0.454   | 0.550 | 0.320 | 0.130 | 0.550           |
| a - c                 | 1.0 | 0.9550 | 0.175   | 0.930 | 0.020 | 0.050 | 0.930           |
| a - d                 | 1.0 | 0.9750 | 0.131   | 0.960 | 0.010 | 0.030 | 0.960           |
| b - c                 | 1.0 | 0.8800 | 0.276   | 0.820 | 0.060 | 0.120 | 0.820           |
| b - d                 | 1.0 | 0.9250 | 0.193   | 0.860 | 0.010 | 0.130 | 0.860           |
| c - d                 | 0.5 | 0.5950 | 0.346   | 0.350 | 0.160 | 0.490 | 0.490           |
--------------------------------------------------------------------------------------------
| Conditional dominance |
--------------------------------------------------------------------------------------------
| a - b                 | 1.0 | 0.6300 | 0.458   | 0.580 | 0.320 | 0.100 | 0.580           |
| a - c                 | 1.0 | 0.9700 | 0.156   | 0.960 | 0.020 | 0.020 | 0.960           |
| a - d                 | 1.0 | 0.9800 | 0.121   | 0.970 | 0.010 | 0.020 | 0.970           |
| b - c                 | 1.0 | 0.8850 | 0.283   | 0.840 | 0.070 | 0.090 | 0.840           |
| b - d                 | 1.0 | 0.9500 | 0.181   | 0.920 | 0.020 | 0.060 | 0.920           |
| c - d                 | 0.5 | 0.5800 | 0.360   | 0.350 | 0.190 | 0.460 | 0.460           |
--------------------------------------------------------------------------------------------
| General Dominance     |
--------------------------------------------------------------------------------------------
| a - b                 | 1.0 | 0.6500 | 0.479   | 0.650 | 0.350 | 0.000 | 0.650           |
| a - c                 | 1.0 | 0.9800 | 0.141   | 0.980 | 0.020 | 0.000 | 0.980           |
| a - d                 | 1.0 | 0.9900 | 0.100   | 0.990 | 0.010 | 0.000 | 0.990           |
| b - c                 | 1.0 | 0.9000 | 0.302   | 0.900 | 0.100 | 0.000 | 0.900           |
| b - d                 | 1.0 | 0.9700 | 0.171   | 0.970 | 0.030 | 0.000 | 0.970           |
| c - d                 | 1.0 | 0.5600 | 0.499   | 0.560 | 0.440 | 0.000 | 0.560           |
--------------------------------------------------------------------------------------------

Table: General averages
---------------------------------------
| var | mean  | se    | p.5   | p.95  |
---------------------------------------
| a   | 0.133 | 0.049 | 0.062 | 0.218 |
| b   | 0.106 | 0.048 | 0.029 | 0.199 |
| c   | 0.035 | 0.032 | 0.002 | 0.106 |
| d   | 0.023 | 0.019 | 0.002 | 0.062 |
---------------------------------------

References:¶ ↑

Azen, R. & Budescu, D.V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8(2), 129-148.

Goal¶ ↑

Generates Bootstrap sample to identity the replicability of a Dominance Analysis. See Azen & Bodescu (2003) for more information.

Usage¶ ↑

require 'statsample'
a=100.times.collect {rand}.to_scale
b=100.times.collect {rand}.to_scale
c=100.times.collect {rand}.to_scale
d=100.times.collect {rand}.to_scale
ds={'a'=>a,'b'=>b,'c'=>c,'d'=>d}.to_dataset
ds['y']=ds.collect{|row| row['a']*5+row['b']*2+row['c']*2+row['d']*2+10*rand()}
dab=Statsample::DominanceAnalysis::Bootstrap.new(ds2, 'y', :debug=>true)
dab.bootstrap(100,nil)
puts dab.summary

<strong>Output</strong>

 Sample size: 100
t: 1.98421693632958

Linear Regression Engine: Statsample::Regression::Multiple::MatrixEngine
Table: Bootstrap report
--------------------------------------------------------------------------------------------
| pairs                 | sD  | Dij    | SE(Dij) | Pij   | Pji   | Pno   | Reproducibility |
--------------------------------------------------------------------------------------------
| Complete dominance    |
--------------------------------------------------------------------------------------------
| a - b                 | 1.0 | 0.6150 | 0.454   | 0.550 | 0.320 | 0.130 | 0.550           |
| a - c                 | 1.0 | 0.9550 | 0.175   | 0.930 | 0.020 | 0.050 | 0.930           |
| a - d                 | 1.0 | 0.9750 | 0.131   | 0.960 | 0.010 | 0.030 | 0.960           |
| b - c                 | 1.0 | 0.8800 | 0.276   | 0.820 | 0.060 | 0.120 | 0.820           |
| b - d                 | 1.0 | 0.9250 | 0.193   | 0.860 | 0.010 | 0.130 | 0.860           |
| c - d                 | 0.5 | 0.5950 | 0.346   | 0.350 | 0.160 | 0.490 | 0.490           |
--------------------------------------------------------------------------------------------
| Conditional dominance |
--------------------------------------------------------------------------------------------
| a - b                 | 1.0 | 0.6300 | 0.458   | 0.580 | 0.320 | 0.100 | 0.580           |
| a - c                 | 1.0 | 0.9700 | 0.156   | 0.960 | 0.020 | 0.020 | 0.960           |
| a - d                 | 1.0 | 0.9800 | 0.121   | 0.970 | 0.010 | 0.020 | 0.970           |
| b - c                 | 1.0 | 0.8850 | 0.283   | 0.840 | 0.070 | 0.090 | 0.840           |
| b - d                 | 1.0 | 0.9500 | 0.181   | 0.920 | 0.020 | 0.060 | 0.920           |
| c - d                 | 0.5 | 0.5800 | 0.360   | 0.350 | 0.190 | 0.460 | 0.460           |
--------------------------------------------------------------------------------------------
| General Dominance     |
--------------------------------------------------------------------------------------------
| a - b                 | 1.0 | 0.6500 | 0.479   | 0.650 | 0.350 | 0.000 | 0.650           |
| a - c                 | 1.0 | 0.9800 | 0.141   | 0.980 | 0.020 | 0.000 | 0.980           |
| a - d                 | 1.0 | 0.9900 | 0.100   | 0.990 | 0.010 | 0.000 | 0.990           |
| b - c                 | 1.0 | 0.9000 | 0.302   | 0.900 | 0.100 | 0.000 | 0.900           |
| b - d                 | 1.0 | 0.9700 | 0.171   | 0.970 | 0.030 | 0.000 | 0.970           |
| c - d                 | 1.0 | 0.5600 | 0.499   | 0.560 | 0.440 | 0.000 | 0.560           |
--------------------------------------------------------------------------------------------

Table: General averages
---------------------------------------
| var | mean  | se    | p.5   | p.95  |
---------------------------------------
| a   | 0.133 | 0.049 | 0.062 | 0.218 |
| b   | 0.106 | 0.048 | 0.029 | 0.199 |
| c   | 0.035 | 0.032 | 0.002 | 0.106 |
| d   | 0.023 | 0.019 | 0.002 | 0.062 |
---------------------------------------

References:¶ ↑

Azen, R. & Budescu, D.V. (2003). The dominance analysis approach for comparing predictors in multiple regression. Psychological Methods, 8(2), 129-148.

Constants

ALPHA: Default level of confidence for t calculation

Attributes

alpha[RW]

Alpha level of confidence. Default: ALPHA

debug[RW]

Debug?

ds[RW]

Dataset

fields[R]

Name of fields

lr_class[RW]

Regression class used for analysis

name[RW]

Name of analysis

regression_class[RW]

Regression class used for analysis

samples_cd[R]

Conditional Dominance results

samples_ga[R]

General average results

samples_gd[R]

General Dominance results

samples_td[R]

Total Dominance results

Public Class Methods

new(ds,y_var, opts=Hash.new) click to toggle source

Create a new Dominance Analysis Bootstrap Object

ds: A Dataset object
y_var: Name of dependent variable
opts: Any other attribute of the class

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 97
def initialize(ds,y_var, opts=Hash.new)
  @ds=ds
  @y_var=y_var
  @n=ds.cases
  
  @n_samples=0
  @alpha=ALPHA
  @debug=false
  if y_var.is_a? Array
    @fields=ds.fields-y_var
    @regression_class=Regression::Multiple::MultipleDependent
    
  else
    @fields=ds.fields-[y_var]
    @regression_class=Regression::Multiple::MatrixEngine
  end
  @samples_ga=@fields.inject({}){|a,v| a[v]=[];a}

  @name=_("Bootstrap dominance Analysis:  %s over %s") % [ ds.fields.join(",") , @y_var]
  opts.each{|k,v|
    self.send("#{k}=",v) if self.respond_to? k
  }
  create_samples_pairs            
end

Public Instance Methods

bootstrap(number_samples,n=nil) click to toggle source

Creates n re-samples from original dataset and store result of each sample on @samples_td, @samples_cd, @samples_gd, @samples_ga

number_samples: Number of new samples to add
n: size of each new sample. If nil, equal to original sample size

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 135
def bootstrap(number_samples,n=nil)
  number_samples.times{ |t|
    @n_samples+=1
    puts _("Bootstrap %d of %d") % [t+1, number_samples] if @debug
    ds_boot=@ds.bootstrap(n)
    da_1=DominanceAnalysis.new(ds_boot, @y_var, :regression_class => @regression_class)
    
    da_1.total_dominance.each{|k,v|
      @samples_td[k].push(v)
    }
    da_1.conditional_dominance.each{|k,v|
      @samples_cd[k].push(v)
    }
    da_1.general_dominance.each{|k,v|
      @samples_gd[k].push(v)
    }
    da_1.general_averages.each{|k,v|
      @samples_ga[k].push(v)
    }
  }
end

create_samples_pairs() click to toggle source

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 156
def create_samples_pairs
  @samples_td={}
  @samples_cd={}
  @samples_gd={}
  @pairs=[]
  c=(0...@fields.size).to_a.combination(2)
  c.each do |data|
    p data
    convert=data.collect {|i| @fields[i] }
    @pairs.push(convert)
    [@samples_td, @samples_cd, @samples_gd].each{|s|
      s[convert]=[]
    }
  end
end

da() click to toggle source

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 123
def da
  if @da.nil?
    @da=DominanceAnalysis.new(@ds,@y_var, :regression_class => @regression_class)
  end
  @da
end

f(v,n=3) click to toggle source

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 228
def f(v,n=3)
    prec="%0.#{n}f"
    sprintf(prec,v)
end

summary_pairs(pair,std,ttd) click to toggle source

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 220
def summary_pairs(pair,std,ttd)
    freqs=std.proportions
    [0, 0.5, 1].each{|n|
        freqs[n]=0 if freqs[n].nil?
    }
    name="%s - %s" % [@ds[pair[0]].name, @ds[pair[1]].name]
    [name,f(ttd,1),f(std.mean,4),f(std.sd),f(freqs[1]), f(freqs[0]), f(freqs[0.5]), f(freqs[ttd])]
end

t() click to toggle source

# File lib/statsample/dominanceanalysis/bootstrap.rb, line 171
def t
  Distribution::T.p_value(1-((1-@alpha) / 2), @n_samples - 1)
end