stsdas.analysis.statistics

The statistics package contains statistical analysis tasks.

Notes

For questions or comments, please see our GitHub page. We encourage and appreciate user feedback.

Most of these notebooks rely on basic knowledge of the Astropy FITS I/O module. If you are unfamiliar with this module please see the Astropy FITS I/O user documentation before using this documentation.

Many of the tasks below are documented in their respective library documentation. Please see the links provided for example usage.

bhkmethod

Please review the Notes section above before running any examples in this notebook

The bhkmethod task is used to compute the generalized Kendall’s tau correlation coefficient. We show a short example here taken from the scipy.stats.kendalltau documentation.

# Standard Imports
from scipy import stats
x1 = [12, 2, 1, 12, 2]
x2 = [1, 4, 7, 1, 0]
tau, p_value = stats.kendalltau(x1, x2)
print("tau: {}".format(tau))
print("p_value: {}".format(p_value))
tau: -0.4714045207910316
p_value: 0.2827454599327748

buckleyjames-kmestimate

Please review the Notes section above before running any examples in this notebook

The buckleyjames and kmestimate tasks compute linear regression coefficients and estimators using the Kaplan-Meier estimator. The Python package lifelines currently provides this fitter.
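
To make concrete what a Kaplan-Meier estimate is, here is a minimal pure-Python sketch. It is not the lifelines implementation and not the original kmestimate algorithm; the function name kaplan_meier and the sample data are hypothetical, and it assumes right-censored data (a censored subject is only known to have survived at least as long as its recorded duration).

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: time at which each subject had the event or was censored
    observed:  True if the event was seen, False if the subject was censored
    (Hypothetical helper for illustration; assumes right-censoring.)
    """
    # Number of observed events at each distinct time
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(set(durations)):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Everyone with duration t (event or censored) leaves the risk set
        at_risk -= sum(1 for u in durations if u == t)
    return curve


# Made-up example data: six subjects, two of them censored
durations = [1, 2, 2, 3, 4, 5]
observed = [True, True, False, True, False, True]
for t, s in kaplan_meier(durations, observed):
    print("t={}: S(t)={:.3f}".format(t, s))
```

For real analyses, a maintained fitter such as the one in lifelines is preferable, since it also reports confidence intervals and handles edge cases that this sketch ignores.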

coxhazard

Please review the Notes section above before running any examples in this notebook

The coxhazard task is used to compute the correlation probability by Cox’s proportional hazard model. See an example of this fitter in the lifelines package.

kolmov

Please review the Notes section above before running any examples in this notebook

The kolmov task uses the Kolmogorov-Smirnov test for goodness of fit. SciPy provides both the one-sample test (scipy.stats.kstest) and the two-sample test (scipy.stats.ks_2samp).
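
A minimal sketch of both SciPy tests, using scipy.stats.kstest to compare one sample against a reference distribution and scipy.stats.ks_2samp to compare two samples against each other. The data values are made up for illustration, and this is not a reproduction of the original kolmov task.

```python
from scipy import stats

# One sample against a reference distribution (here the standard uniform).
sample = [0.1, 0.3, 0.5, 0.7, 0.9]
one_sample = stats.kstest(sample, "uniform")
print("one-sample statistic: {}".format(one_sample.statistic))
print("one-sample p-value: {}".format(one_sample.pvalue))

# Two samples against each other; these two do not overlap at all,
# so the empirical distribution functions are maximally separated.
first = [1.0, 2.0, 3.0, 4.0, 5.0]
second = [6.0, 7.0, 8.0, 9.0, 10.0]
two_sample = stats.ks_2samp(first, second)
print("two-sample statistic: {}".format(two_sample.statistic))
print("two-sample p-value: {}".format(two_sample.pvalue))
```

A small p-value suggests that the sample does not follow the reference distribution (one-sample case) or that the two samples come from different distributions (two-sample case).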

spearman

Please review the Notes section above before running any examples in this notebook

The spearman task is used to compute Spearman’s rank correlation coefficient (rho). SciPy provides an equivalent in scipy.stats.spearmanr; see its documentation for details.

# Standard Imports
from scipy import stats
rho, pvalue = stats.spearmanr([1,2,3,4,5],[5,6,7,8,7])
print("rho: {}".format(rho))
print("p-value: {}".format(pvalue))
rho: 0.8207826816681233
p-value: 0.08858700531354381

twosampt

Please review the Notes section above before running any examples in this notebook

The twosampt task is used to determine if two sets of data are from the same population. It provided the following types of two-sample tests: gehan-permute, gehan-hyper, logrank, peto-peto, and peto-prentice. These tests do not currently have an equivalent in SciPy, but other two-sample tests are available in scipy.stats, for example ttest_ind, mannwhitneyu, and ks_2samp.
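
As an illustration of two-sample tests that SciPy does offer, the sketch below runs scipy.stats.ttest_ind and scipy.stats.mannwhitneyu on made-up data. Neither test accounts for censored observations, so they are not drop-in replacements for the twosampt tests listed above.

```python
from scipy import stats

# Made-up, fully observed (uncensored) samples
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]

# Student's t-test for two independent samples (assumes equal variances)
t_result = stats.ttest_ind(group_a, group_b)
print("t statistic: {}".format(t_result.statistic))
print("t-test p-value: {}".format(t_result.pvalue))

# Mann-Whitney U rank test, a non-parametric alternative
u_result = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")
print("U statistic: {}".format(u_result.statistic))
print("Mann-Whitney p-value: {}".format(u_result.pvalue))
```

For censored data, a survival-analysis package is the more appropriate tool.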

Not Replacing

  • censor - Information about the censoring indicator in survival analysis. Deprecated.

  • emmethod - Compute linear regression for censored data by EM method. Deprecated.

  • schmittbin - Compute regression coefficients by Schmitt’s method. Deprecated.

  • survival - Provide background & overview of survival analysis. Deprecated.