G::Tools Statistics
Summary
G::Tools::Statistics - Statistical Methods
Package variables
No package variables defined.
Included modules
G::Messenger
SelfLoader
SubOpt
Inherit
Exporter
Synopsis
    use G::Tools::Statistics;
    $mean = mean(@values);
Description
This module provides statistical analysis methods, most of which are
simple wrappers around CPAN Statistics:: modules.
Methods
cor
cumulative
least_squares_fit
max
maxdex
mean
median
min
mindex
standard_deviation
sum
ttest
variance
Methods description
cor
  Name: cor   -   calculate the correlation between two given array references

  Description:
    Calculates the correlation coefficient of two array references, using
    Pearson's, Spearman's, or Kendall's method.

    This is a wrapper around the Statistics::Descriptive and Statistics::RankCorrelation modules.

  Usage:
    $r = cor([@array1], [@array2]); 

  Options:
    -method    "pearson", "spearman", or "kendall" (default:pearson)
               Method used to calculate the correlation coefficient
    -sorted    passed through as the sorted flag of Statistics::RankCorrelation;
               used only with the spearman and kendall methods (default:0)

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20130422-01 changed the default value to Pearson's correlation rather than R^2
    20070607-01 initial posting
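For orientation, the Pearson coefficient that cor returns by default can be sketched in plain Perl as follows. This is an illustrative sketch only, not the module's code (the listing of cor delegates to Statistics::Descriptive), and the helper name pearson_r is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: Pearson's r for two equal-length array references.
# The result is undefined (division by zero) if either input is constant.
sub pearson_r {
    my ($x, $y) = @_;
    my $n = @$x;
    die "need two arrays of equal length (>= 2)" if $n < 2 || $n != @$y;

    my ($mean_x, $mean_y) = (0, 0);
    $mean_x += $_ / $n for @$x;
    $mean_y += $_ / $n for @$y;

    # Accumulate the co-deviation and the two squared deviations.
    my ($sxy, $sxx, $syy) = (0, 0, 0);
    for my $i (0 .. $n - 1) {
        my ($dx, $dy) = ($x->[$i] - $mean_x, $y->[$i] - $mean_y);
        $sxy += $dx * $dy;
        $sxx += $dx * $dx;
        $syy += $dy * $dy;
    }
    return $sxy / sqrt($sxx * $syy);
}

my $r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8]);   # perfectly linear data
printf "%.3f\n", $r;                             # prints 1.000
```

Like cor, the sketch takes array references rather than flat lists, so the two series stay separate in the argument list.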
cumulative
  Name: cumulative   -   returns the cumulative sums of the given array

  Description:
    Returns an array in which each element is the running total of the
    data up to and including that position.

  Usage:
    @cumulative = cumulative(@array_of_values);

  Options:
    -mean      subtract the mean of the data from each value before
               accumulating when 1 (default:0)

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20081124-01 initial posting
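The running total can be sketched in plain Perl as below. The sketch also mirrors the mean option that appears in the code listing of cumulative, where the mean is subtracted from every value before accumulating. The helper name running_total and its positional flag are hypothetical; the module itself reads the option through SubOpt.

```perl
use strict;
use warnings;

# Hypothetical helper: running total of a list. When $center is true,
# the mean of the values is subtracted from every value first.
sub running_total {
    my ($center, @values) = @_;

    my $offset = 0;
    if ($center && @values) {
        $offset += $_ for @values;
        $offset /= @values;
    }

    my ($total, @out) = (0);
    for my $v (@values) {
        $total += $v - $offset;
        push @out, $total;
    }
    return @out;
}

print join(",", running_total(0, 1, 2, 3, 4)), "\n";   # prints 1,3,6,10
print join(",", running_total(1, 1, 2, 3)), "\n";      # prints -1,-1,0
```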
least_squares_fit
  Name: least_squares_fit   -   calculate the least squares fit of two given array references

  Description:
    Performs a linear least squares fit of the y values (first array
    reference) against the x values (second array reference).

    When called in array context, this returns an array of four
    values ($intercept, $slope, $r, $error), where the linear fit is expressed by
    y = $slope * x + $intercept, $r is Pearson's linear correlation coefficient,
    and $error is the root-mean-square error.

    When called in scalar context, only $r is returned.

    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    ($intercept, $slope, $r, $error) = least_squares_fit([@y_values], [@x_values]);
      or
    $r = least_squares_fit([@y_values], [@x_values]);
      or
    $r = scalar least_squares_fit([@y_values], [@x_values]);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
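The fit and its context-sensitive return value can be sketched in plain Perl as follows. This is a sketch under stated assumptions rather than the module's implementation: the helper name fit_line is hypothetical, the argument order (y values first, then x values) follows the code listing of least_squares_fit, and the error is taken here as the square root of the mean squared residual.

```perl
use strict;
use warnings;

# Hypothetical helper: fit y = slope * x + intercept by least squares.
# Takes the y values first, then the x values, both as array references.
sub fit_line {
    my ($y, $x) = @_;
    my $n = @$y;
    die "need two arrays of equal length (>= 2)" if $n < 2 || $n != @$x;

    my ($sx, $sy, $sxx, $syy, $sxy) = (0) x 5;
    for my $i (0 .. $n - 1) {
        $sx  += $x->[$i];
        $sy  += $y->[$i];
        $sxx += $x->[$i] ** 2;
        $syy += $y->[$i] ** 2;
        $sxy += $x->[$i] * $y->[$i];
    }

    my $slope     = ($n * $sxy - $sx * $sy) / ($n * $sxx - $sx ** 2);
    my $intercept = ($sy - $slope * $sx) / $n;
    my $r = ($n * $sxy - $sx * $sy)
          / sqrt(($n * $sxx - $sx ** 2) * ($n * $syy - $sy ** 2));

    # Root-mean-square of the residuals around the fitted line
    # (normalisation by n is an assumption of this sketch).
    my $sq = 0;
    $sq += ($y->[$_] - ($slope * $x->[$_] + $intercept)) ** 2 for 0 .. $n - 1;
    my $rms = sqrt($sq / $n);

    # List context gets all four values; scalar context gets only r.
    return wantarray ? ($intercept, $slope, $r, $rms) : $r;
}

my ($intercept, $slope, $r, $rms) = fit_line([3, 5, 7, 9], [1, 2, 3, 4]);
print "y = $slope * x + $intercept (r = $r)\n";
```

Using wantarray keeps a single entry point for both callers who need the full fit and callers who only want the correlation coefficient.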
max
  Name: max   -   get the maximum value of the given array of data

  Description:
    Returns the maximum value of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $max = max(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
maxdex
  Name: maxdex   -   get the index of maximum value of the given array of data

  Description:
    Returns the index of the maximum value of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $maximum_index = maxdex(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
mean
  Name: mean   -   calculate the mean of the given array of data

  Description:
    Returns the mean of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $mean = mean(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
median
  Name: median   -   calculate the median of the given array of data

  Description:
    Returns the median of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $median = median(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
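As a reminder of what the median is for odd and even sample sizes, a minimal plain-Perl sketch follows. The helper name median_of is hypothetical, and the code is illustrative rather than the module's implementation, which delegates to Statistics::Descriptive.

```perl
use strict;
use warnings;

# Hypothetical helper: median of a list of numbers.
sub median_of {
    my @sorted = sort { $a <=> $b } @_;
    die "need at least one value" unless @sorted;
    my $mid = int(@sorted / 2);
    # Odd count: the middle value. Even count: mean of the two middle values.
    return @sorted % 2 ? $sorted[$mid] : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
}

print median_of(5, 1, 3), "\n";      # prints 3
print median_of(4, 1, 3, 2), "\n";   # prints 2.5
```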
min
  Name: min   -   get the minimum value of the given array of data

  Description:
    Returns the minimum value of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $min = min(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
mindex
  Name: mindex   -   get the index of minimum value of the given array of data

  Description:
    Returns the index of the minimum value of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $minimum_index = mindex(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
standard_deviation
  Name: standard_deviation   -   calculate the standard deviation of the given array of data

  Description:
    Returns the standard deviation of the data. Division by n-1 is used.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $standard_deviation = standard_deviation(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
sum
  Name: sum   -   calculate the sum of the given array of data

  Description:
    Returns the sum of the data.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $sum = sum(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
ttest
  Name: ttest   -   performs Student's t-test on two given array references

  Description:
    Performs Student's t-test (independent or paired) on two array references.

    Only the p-value is returned in scalar context. If called in array context,
    an array of 3 values ($t_value, $p_value, $degrees_of_freedom) is returned.

    This is a wrapper around the Statistics::TTest and Statistics::DependantTTest modules.

  Usage:
    ($t_value, $p_value, $df) = ttest([@array1], [@array2]);
      or
    $p_value = ttest([@array1], [@array2]); 
      or
    $p_value = scalar ttest([@array1], [@array2]); 

  Options:
    -paired    1 for dependent (paired t-test), 0 for independent (default:0)

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
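To make the paired case concrete, the t statistic and degrees of freedom of a paired t-test can be sketched in plain Perl as below. This is an illustrative sketch, not the module's code (which delegates to Statistics::DependantTTest); the helper name paired_t is hypothetical, and the p-value lookup is deliberately left out.

```perl
use strict;
use warnings;

# Hypothetical helper: t statistic and degrees of freedom for a paired
# t-test on two equal-length array references.
# The result is undefined (division by zero) if all differences are equal.
sub paired_t {
    my ($x, $y) = @_;
    my $n = @$x;
    die "need two arrays of equal length (>= 2)" if $n < 2 || $n != @$y;

    # Work on the per-pair differences.
    my @d = map { $x->[$_] - $y->[$_] } 0 .. $n - 1;
    my $mean = 0;
    $mean += $_ / $n for @d;

    # Sample variance of the differences (division by n-1).
    my $var = 0;
    $var += ($_ - $mean) ** 2 / ($n - 1) for @d;

    # Mean difference divided by its standard error.
    my $t = $mean / sqrt($var / $n);
    return ($t, $n - 1);
}

my ($t, $df) = paired_t([5, 7, 9, 12], [4, 5, 8, 9]);
printf "t = %.3f, df = %d\n", $t, $df;
```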
variance
  Name: variance   -   calculate the variance of the given array of data

  Description:
    Returns the variance of the data. Division by n-1 is used.
    This is a wrapper around the Statistics::Descriptive module.

  Usage:
    $variance = variance(@array_of_values);

  Options:
    None.

  Author: 
    Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
  History:
    20070607-01 initial posting
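The "division by n-1" convention used by variance and standard_deviation can be sketched in plain Perl as follows. The helper name sample_variance is hypothetical and the code is illustrative only; the module itself delegates to Statistics::Descriptive.

```perl
use strict;
use warnings;

# Hypothetical helper: sample variance with division by n-1.
sub sample_variance {
    my @v = @_;
    my $n = @v;
    die "need at least two values" if $n < 2;

    my $mean = 0;
    $mean += $_ / $n for @v;

    # Sum of squared deviations from the mean.
    my $ss = 0;
    $ss += ($_ - $mean) ** 2 for @v;

    return $ss / ($n - 1);
}

my @data = (2, 4, 4, 4, 5, 5, 7, 9);
# The standard deviation is the square root of the variance.
printf "variance = %.4f, sd = %.4f\n",
    sample_variance(@data), sqrt(sample_variance(@data));
```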
Methods code
cor
sub cor {
    opt_default(method=>"pearson", sorted=>0);
    my @args = opt_get(@_);
    my $method = opt_val("method");
    my $sorted = opt_val("sorted");

    if($method eq 'spearman' || $method eq 'kendall'){
	require Statistics::RankCorrelation;
	my $stat = Statistics::RankCorrelation->new(@args, sorted=>$sorted);

	return $stat->spearman if($method eq 'spearman');
	return $stat->kendall  if($method eq 'kendall');

    }else{
#       require Statistics::LineFit;
#       my $stat = Statistics::LineFit->new();
#       $stat->setData(@args) or die("G::Tools::Statistics::cor - Invalid data");
#       my ($intercept, $slope) = $stat->coefficients();
#       my $r = $stat->rSquared();
#       my $error = $stat->meanSqError();

        require Statistics::Descriptive;
        my $stat2 = Statistics::Descriptive::Full->new();
        $stat2->add_data(@{$args[0]});
        my @result = $stat2->least_squares_fit(@{$args[1]});
        return $result[2];

#       if(wantarray()){
#           return $r, $result[2], $result[2] * $result[2];
#           return ($intercept, $slope, $r, $error);
#       }else{
#           return $r;
#       }
    }
}
cumulative
sub cumulative {
    opt_default("mean"=>0);
    my @args = opt_get(@_);
    my $mean = opt_val("mean");

    my @array;
    my $cum;
    my $ave = $mean ? mean(@args) : 0;
    foreach my $val (@args){
	$cum += $val - $ave;
	push(@array, $cum);
    }

    return @array;
}
least_squares_fit
sub least_squares_fit {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@{$_[0]}); 

    my @result = $stat->least_squares_fit(@{$_[1]});

    if(wantarray()){
	return @result;
    }else{
	return $result[2];
    }
}
max
sub max {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->max();
}
maxdex
sub maxdex {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->maxdex();
}
mean
sub mean {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->mean();
}
median
sub median {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->median();
}
min
sub min {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->min();
}
mindex
sub mindex {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->mindex();
}
standard_deviation
sub standard_deviation {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->standard_deviation();
}
sum
sub sum {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->sum();
}
ttest
sub ttest {
    opt_default(paired=>0);

    my @args = opt_get(@_);
    my $paired = opt_val("paired");

    my ($t_value, $p_value, $df);

    if($paired){
        require Statistics::DependantTTest;
        require Statistics::Distributions;
        my $stat = Statistics::DependantTTest->new;
        $stat->load_data('x',@{$args[0]});
        $stat->load_data('y',@{$args[1]});
        ($t_value,$df) = $stat->perform_t_test('x','y');
        # tprob() returns the upper-tail (one-sided) probability for $t_value
        $p_value = Statistics::Distributions::tprob ($df, $t_value);
    }else{
        require Statistics::TTest;
        my $stat = Statistics::TTest->new;
        $stat->load_data(@args);

        $t_value = $stat->t_statistic;
        $p_value = $stat->{t_prob};
        $df = $stat->df;
    }

    if(wantarray()){
	return ($t_value, $p_value, $df);
    }else{
	return ($p_value);
    }
}
variance
sub variance {
    require Statistics::Descriptive;
    my $stat = Statistics::Descriptive::Full->new();
    $stat->add_data(@_); 
    return $stat->variance();
}
General documentation
AUTHOR
Kazuharu Arakawa, <gaou@sfc.keio.ac.jp>
COPYRIGHT AND LICENSE
Copyright 2007 by Kazuharu Arakawa
This library is a part of G-language GAE.