Source code for naginterfaces.library.stat

# -*- coding: utf-8 -*-
r"""
Module Summary
--------------
Interfaces for the NAG Mark 29.3 `stat` Chapter.

``stat`` - Simple Calculations on Statistical Data

This module covers the following topics:

    descriptive statistics and exploratory data analysis;

    statistical distribution functions and their inverses;

    testing for Normality and other distributions.

See Also
--------
``naginterfaces.library.examples.stat`` :
    This subpackage contains examples for the ``stat`` module.
    See also the :ref:`library_stat_ex` subsection.

Functionality Index
-------------------

**Descriptive statistics / Exploratory analysis**

  summaries

    frequency / contingency table

      one variable: :meth:`frequency_table`

      two variables, with :math:`\chi^2` and Fisher's exact test: :meth:`contingency_table`

    mean, variance, skewness, kurtosis (one variable)

      combine summaries: :meth:`summary_onevar_combine`

      from frequency table: :meth:`summary_freq`

      from raw data: :meth:`summary_onevar`

    mean, variance, sums of squares and products (two variables): :meth:`summary_2var`

    median, hinges / quartiles, minimum, maximum: :meth:`five_point_summary`

    quantiles

      approximate

        large data stream of fixed size: :meth:`quantiles_stream_fixed`

        large data stream of unknown size: :meth:`quantiles_stream_arbitrary`

      unordered vector

        unweighted: :meth:`quantiles`

    rolling window

      mean, standard deviation (one variable): :meth:`moving_average`

**Distributions**

  :math:`\chi^2`

    vectorized deviates: :meth:`inv_cdf_chisq_vector`

    vectorized probabilities: :meth:`prob_chisq_vector`

  Beta

    central

      deviates

        scalar: :meth:`inv_cdf_beta`

        vectorized: :meth:`inv_cdf_beta_vector`

      probabilities and probability density function

        scalar: :meth:`prob_beta`

        vectorized: :meth:`prob_beta_vector`

    non-central

      probabilities: :meth:`prob_beta_noncentral`

  binomial

    distribution function

      scalar: :meth:`prob_binomial`

      vectorized: :meth:`prob_binomial_vector`

  Dickey--Fuller unit root test

    probabilities: :meth:`prob_dickey_fuller_unit`

  Durbin--Watson statistic

    probabilities: :meth:`prob_durbin_watson`

  energy loss distributions

    Landau

      density: :meth:`pdf_landau`

      derivative of density: :meth:`pdf_landau_deriv`

      distribution: :meth:`prob_landau`

      first moment: :meth:`pdf_landau_moment1`

      inverse distribution: :meth:`inv_cdf_landau`

      second moment: :meth:`pdf_landau_moment2`

    Vavilov

      density: :meth:`pdf_vavilov`

      distribution: :meth:`prob_vavilov`

      initialization: :meth:`init_vavilov`

  :math:`F`

    central

      deviates

        scalar: :meth:`inv_cdf_f`

        vectorized: :meth:`inv_cdf_f_vector`

      probabilities

        scalar: :meth:`prob_f`

        vectorized: :meth:`prob_f_vector`

    non-central

      probabilities: :meth:`prob_f_noncentral`

  gamma

    deviates

      scalar: :meth:`inv_cdf_gamma`

      vectorized: :meth:`inv_cdf_gamma_vector`

    probabilities

      scalar: :meth:`prob_gamma`

      vectorized: :meth:`prob_gamma_vector`

    probability density function

      scalar: :meth:`pdf_gamma`

      vectorized: :meth:`pdf_gamma_vector`

  hypergeometric

    distribution function

      scalar: :meth:`prob_hypergeom`

      vectorized: :meth:`prob_hypergeom_vector`

  Kolmogorov--Smirnov

    probabilities

      one-sample: :meth:`prob_kolmogorov1`

      two-sample: :meth:`prob_kolmogorov2`

  Normal

    bivariate

      probabilities: :meth:`prob_bivariate_normal`

    multivariate

      probabilities: :meth:`prob_multi_normal`

      probability density function

        vectorized: :meth:`pdf_multi_normal_vector`

      quadratic forms

        cumulants and moments: :meth:`moments_quad_form`

        moments of ratios: :meth:`moments_ratio_quad_forms`

    univariate

      deviates

        scalar: :meth:`inv_cdf_normal`

        vectorized: :meth:`inv_cdf_normal_vector`

      probabilities

        scalar: :meth:`prob_normal`

        vectorized: :meth:`prob_normal_vector`

      probability density function

        scalar: :meth:`pdf_normal`

        vectorized: :meth:`pdf_normal_vector`

      reciprocal of Mills' Ratio: :meth:`mills_ratio`

      Shapiro and Wilk's test for Normality: :meth:`test_shapiro_wilk`

  Poisson

    distribution function

      scalar: :meth:`prob_poisson`

      vectorized: :meth:`prob_poisson_vector`

  Student's :math:`t`

    central

      bivariate

        probabilities: :meth:`prob_bivariate_students_t`

      multivariate

        probabilities: :meth:`prob_multi_students_t`

      univariate

        deviates

          scalar: :meth:`inv_cdf_students_t`

          vectorized: :meth:`inv_cdf_students_t_vector`

        probabilities

          scalar: :meth:`prob_students_t`

          vectorized: :meth:`prob_students_t_vector`

    non-central

      probabilities: :meth:`prob_students_t_noncentral`

  Studentized range statistic

    deviates: :meth:`inv_cdf_studentized_range`

    probabilities: :meth:`prob_studentized_range`

  von Mises

    probabilities: :meth:`prob_vonmises`

  :math:`\chi^2`

    central

      deviates: :meth:`inv_cdf_chisq`

      probabilities: :meth:`prob_chisq`

      probability of linear combination: :meth:`prob_chisq_lincomb`

    non-central

      probabilities: :meth:`prob_chisq_noncentral`

      probability of linear combination: :meth:`prob_chisq_noncentral_lincomb`

**Scores**

  Normal scores

    accurate: :meth:`normal_scores_exact`

    approximate: :meth:`normal_scores_approx`

    variance-covariance matrix: :meth:`normal_scores_var`

  Normal scores, ranks or exponential (Savage) scores: :meth:`ranks_and_scores`

For full information please refer to the NAG Library document

https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html
"""

# NAG Copyright 2017-2023.

def summary_2var(x1, x2, wt=None):
    r"""
    ``summary_2var`` computes the means, standard deviations, corrected sums of squares and products, maximum and minimum values, and the product-moment correlation coefficient for two variables.
    Unequal weighting may be given.

    .. _g01ab-py2-py-doc:

    For full information please refer to the NAG Library document for g01ab

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01abf.html

    .. _g01ab-py2-py-parameters:

    **Parameters**
    **x1** : float, array-like, shape :math:`\left(n\right)`
        The observations from the first sample, :math:`x_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`.

    **x2** : float, array-like, shape :math:`\left(n\right)`
        The observations from the second sample, :math:`y_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`.

    **wt** : None or float, array-like, shape :math:`\left(n\right)`, optional
        If weights are being supplied then the elements of :math:`\mathrm{wt}` must contain the weights associated with the observations, :math:`w_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`.

        If weights are not supplied then :math:`\mathrm{wt}` may be **None**.

    **Returns**
    **nval** : int
        The number of valid observations, :math:`m`; see :ref:`Notes <g01ab-py2-py-notes>` \(g).

    **res** : float, ndarray, shape :math:`\left(13\right)`
        The elements of :math:`\mathrm{res}` contain the following results:

        .. rst-class:: nag-rules-none nag-align-left

        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[0]` |mean of the first sample, :math:`\bar{x}`;                                                          |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[1]` |mean of the second sample, :math:`\bar{y}`;                                                         |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[2]` |standard deviation of the first sample, :math:`s_1`;                                                |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[3]` |standard deviation of the second sample, :math:`s_2`;                                               |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[4]` |corrected sum of squares of the first sample, :math:`c_{11}`;                                       |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[5]` |corrected sum of products of the two samples, :math:`c_{12}`;                                       |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[6]` |corrected sum of squares of the second sample, :math:`c_{22}`;                                      |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[7]` |product-moment correlation coefficient, :math:`R`;                                                  |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[8]` |minimum of the first sample;                                                                        |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[9]` |maximum of the first sample;                                                                        |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[10]`|minimum of the second sample;                                                                       |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[11]`|maximum of the second sample;                                                                       |
        +------------------------+----------------------------------------------------------------------------------------------------+
        |:math:`\mathrm{res}[12]`|sum of weights, :math:`\sum_{{i = 1}}^nw_i` (:math:`{} = n`, if :math:`\textit{iwt} = 0`, on entry).|
        +------------------------+----------------------------------------------------------------------------------------------------+

    .. _g01ab-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`n \geq 1`.

        (`errno` :math:`3`)
            The number of valid cases, :math:`m`, is :math:`0`.

        (`errno` :math:`3`)
            On entry, :math:`\mathrm{wt}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{wt}[\textit{i}-1]\geq 0.0`, for :math:`\textit{i} = 1,2,\ldots,n`.

    **Warns**
    **NagAlgorithmicWarning**
        (`errno` :math:`2`)
            The number of valid cases, :math:`m`, is :math:`1`. In this case the standard deviations and the product-moment correlation coefficient cannot be calculated.

    .. _g01ab-py2-py-notes:

    **Notes**
    `No equivalent traditional C interface for this routine exists in the NAG Library.`

    The data consist of two samples of :math:`n` observations, denoted by :math:`x_i` and :math:`y_i`, for :math:`i = 1,2,\ldots,n`, with corresponding weights :math:`w_i`, for :math:`\textit{i} = 1,2,\ldots,n`.

    If no specific weighting is given, then each :math:`w_i` is set to :math:`1.0` in ``summary_2var``.

    The quantities calculated are:

    (a) The sum of weights,

        .. math::
            W = \sum_{{i = 1}}^nw_i\text{.}

    (#) The means,

        .. math::
            \bar{x} = \frac{{\sum_{{i = 1}}^nw_ix_i}}{W}\text{, }\quad \bar{y} = \frac{{\sum_{{i = 1}}^nw_iy_i}}{W}\text{.}

    (#) The corrected sums of squares and products

        .. math::
            \begin{array}{l} c_{11} = \sum_{{i = 1}}^n w_i \left(x_i-\bar{x}\right)^2 \\\\ c_{21} = c_{12} = \sum_{{i = 1}}^n w_i \left(x_i-\bar{x}\right) \left(y_i-\bar{y}\right) \\\\ c_{22} = \sum_{{i = 1}}^n w_i \left(y_i-\bar{y}\right)^2 \text{.} \end{array}

    (#) The standard deviations

        .. math::
            s_j = \sqrt{\frac{c_{{jj}}}{d}}\text{, where }\quad j = 1,2\quad \text{ and }\quad d = W-\frac{{\sum_{{i = 1}}^nw_i^2}}{W}\text{.}

    (#) The product-moment correlation coefficient

        .. math::
            R = \frac{c_{12}}{{\sqrt{c_{11}c_{22}}}}\text{.}

    (#) The minimum and maximum elements in each of the two samples.

    (#) The number of pairs of observations, :math:`m`, for which :math:`w_i > 0`, i.e., the number of **valid** observations.

    The quantities in (d) and (e) above will only be computed if :math:`m\geq 2`. All other items are computed if :math:`m\geq 1`.
    """
    raise NotImplementedError
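# The formulas in the Notes of ``summary_2var`` can be checked by hand with a
# small pure-Python sketch. It only illustrates quantities (a)-(g) and is not
# the NAG implementation: the name ``sketch_summary_2var`` is hypothetical, and
# taking the minima and maxima over observations with positive weight is an
# assumption, since the Notes do not say how zero-weight cases are treated.

```python
import math


def sketch_summary_2var(x1, x2, wt=None):
    """Illustrative re-computation of quantities (a)-(g); not the NAG routine."""
    n = len(x1)
    if n < 1 or len(x2) != n:
        raise ValueError("x1 and x2 must be non-empty and of equal length")
    w = [1.0] * n if wt is None else [float(v) for v in wt]
    if len(w) != n or any(v < 0.0 for v in w):
        raise ValueError("wt must have length n and non-negative entries")

    # (g) valid observations are those with strictly positive weight.
    valid = [i for i in range(n) if w[i] > 0.0]
    m = len(valid)
    if m == 0:
        raise ValueError("the number of valid cases is 0")

    big_w = sum(w)                                                  # (a)
    xbar = sum(w[i] * x1[i] for i in valid) / big_w                 # (b)
    ybar = sum(w[i] * x2[i] for i in valid) / big_w
    c11 = sum(w[i] * (x1[i] - xbar) ** 2 for i in valid)            # (c)
    c12 = sum(w[i] * (x1[i] - xbar) * (x2[i] - ybar) for i in valid)
    c22 = sum(w[i] * (x2[i] - ybar) ** 2 for i in valid)

    # Same ordering as the ``res`` table; None marks "not computed".
    res = [xbar, ybar, None, None, c11, c12, c22, None,
           min(x1[i] for i in valid), max(x1[i] for i in valid),    # (f), assumed
           min(x2[i] for i in valid), max(x2[i] for i in valid),    # over valid cases
           big_w]

    if m >= 2:
        d = big_w - sum(v * v for v in w) / big_w
        res[2] = math.sqrt(c11 / d)                                 # (d)
        res[3] = math.sqrt(c22 / d)
        if c11 > 0.0 and c22 > 0.0:
            res[7] = c12 / math.sqrt(c11 * c22)                     # (e)
    return m, res
```

# With unit weights, d reduces to n - 1, so s_1 and s_2 are the usual sample
# standard deviations.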
def summary_freq(x, ifreq):
    r"""
    ``summary_freq`` calculates the mean, standard deviation and coefficients of skewness and kurtosis for data grouped in a frequency distribution.

    .. _g01ad-py2-py-doc:

    For full information please refer to the NAG Library document for g01ad

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01adf.html

    .. _g01ad-py2-py-parameters:

    **Parameters**
    **x** : float, array-like, shape :math:`\left(k\right)`
        The elements of :math:`\mathrm{x}` must contain the boundary values of the classes in ascending order, so that class :math:`\textit{i}` is bounded by the values in :math:`\mathrm{x}[\textit{i}-1]` and :math:`\mathrm{x}[\textit{i}]`, for :math:`\textit{i} = 1,2,\ldots,k-1`.

    **ifreq** : int, array-like, shape :math:`\left(k\right)`
        The :math:`\textit{i}`\ th element of :math:`\mathrm{ifreq}` must contain the frequency associated with the :math:`\textit{i}`\ th class, for :math:`\textit{i} = 1,2,\ldots,k-1`. :math:`\mathrm{ifreq}[k-1]` is not used by the function.

    **Returns**
    **xmean** : float
        The mean value, :math:`\bar{y}`.

    **s2** : float
        The standard deviation, :math:`s_2`.

    **s3** : float
        The coefficient of skewness, :math:`s_3`.

    **s4** : float
        The coefficient of kurtosis, :math:`s_4`.

    **n** : int
        The total frequency, :math:`n`.

    .. _g01ad-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`k = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`k > 1`.

        (`errno` :math:`2`)
            On entry, :math:`\textit{I} = \langle\mathit{\boldsymbol{value}}\rangle`, :math:`\mathrm{x}[\textit{I}-2] = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{x}[\textit{I}-1] = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{x}[\textit{I}-2]\leq \mathrm{x}[\textit{I}-1]`.

        (`errno` :math:`3`)
            Either :math:`\mathrm{ifreq}[i] < 0` for some :math:`i`, or the sum of frequencies is zero.

    **Warns**
    **NagAlgorithmicWarning**
        (`errno` :math:`4`)
            The total frequency, :math:`n`, is less than :math:`2`, hence the quantities :math:`s_2`, :math:`s_3` and :math:`s_4` cannot be calculated.

    .. _g01ad-py2-py-notes:

    **Notes**
    The input data consist of a univariate frequency distribution, denoted by :math:`f_i`, for :math:`\textit{i} = 1,2,\ldots,k-1`, and the boundary values of the classes :math:`x_i`, for :math:`\textit{i} = 1,2,\ldots,k`.
    Thus the frequency associated with the interval :math:`\left(x_i, x_{{i+1}}\right)` is :math:`f_i`, and ``summary_freq`` assumes that all the values in this interval are concentrated at the point

    .. math::
        y_i = \left(x_{{i+1}}+x_i\right)/2\text{, }\quad i = 1,2,\ldots,k-1\text{.}

    The following quantities are calculated:

    (a) total frequency,

        .. math::
            n = \sum_{{i = 1}}^{{k-1}}f_i\text{.}

    (#) mean,

        .. math::
            \bar{y} = \frac{{\sum_{{i = 1}}^{{k-1}}f_iy_i}}{n}\text{.}

    (#) standard deviation,

        .. math::
            s_2 = \sqrt{\frac{{\sum_{{i = 1}}^{{k-1}}f_i\left(y_i-\bar{y}\right)^2}}{\left(n-1\right)}}\text{, }\quad n\geq 2\text{.}

    (#) coefficient of skewness,

        .. math::
            s_3 = \frac{{\sum_{{i = 1}}^{{k-1}}f_i\left(y_i-\bar{y}\right)^3}}{{\left(n-1\right)\times s_2^3}}\text{, }\quad n\geq 2\text{.}

    (#) coefficient of kurtosis,

        .. math::
            s_4 = \frac{{\sum_{{i = 1}}^{{k-1}}f_i\left(y_i-\bar{y}\right)^4}}{{\left(n-1\right)\times s_2^4}}-3\text{, }\quad n\geq 2\text{.}

    The function has been developed primarily for groupings of a continuous variable.
    If, however, the function is to be used on the frequency distribution of a discrete variable, taking the values :math:`y_1,\ldots,y_{{k-1}}`, then the boundary values for the classes may be defined as follows:

    (i) for :math:`k > 2`,

        .. math::
            \begin{array}{rcl}x_1& = &\left(3y_1-y_2\right)/2\\x_j& = &\left(y_{{j-1}}+y_j\right)/2\text{, }\quad j = 2,\ldots,k-1\\x_k& = &\left(3y_{{k-1}}-y_{{k-2}}\right)/2\end{array}

    (#) for :math:`k = 2`,

        .. math::
            x_1 = y_1-a\quad \text{ and }\quad x_2 = y_1+a\quad \text{ for any }a > 0\text{.}
    """
    raise NotImplementedError
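# The grouped-data formulas in the Notes of ``summary_freq`` translate almost
# line for line into plain Python. The sketch below is an illustration only,
# not the NAG routine: both function names are hypothetical, ``a = 0.5`` is an
# arbitrary choice for the k = 2 case (the Notes allow any a > 0), and
# returning None when a quantity is undefined is a convention of this sketch.

```python
import math


def sketch_boundaries_for_discrete(y, a=0.5):
    """Class boundaries x_1..x_k for ascending discrete values y_1..y_{k-1}."""
    if not y:
        raise ValueError("at least one value is required")
    if len(y) == 1:                          # k = 2
        return [y[0] - a, y[0] + a]
    x = [(3 * y[0] - y[1]) / 2]              # x_1
    x += [(y[j - 1] + y[j]) / 2 for j in range(1, len(y))]   # x_2 .. x_{k-1}
    x.append((3 * y[-1] - y[-2]) / 2)        # x_k
    return x


def sketch_summary_freq(x, ifreq):
    """Mean, s_2, s_3, s_4 and n from class boundaries x and frequencies ifreq."""
    k = len(x)
    if k < 2:
        raise ValueError("k must be greater than 1")
    if any(x[i] > x[i + 1] for i in range(k - 1)):
        raise ValueError("x must be in ascending order")
    f = list(ifreq[:k - 1])                  # ifreq[k-1], if present, is unused
    if len(f) != k - 1 or any(v < 0 for v in f):
        raise ValueError("need k-1 non-negative frequencies")
    n = sum(f)
    if n == 0:
        raise ValueError("the sum of frequencies is zero")

    mid = [(x[i + 1] + x[i]) / 2 for i in range(k - 1)]       # y_i
    ybar = sum(fi * yi for fi, yi in zip(f, mid)) / n
    if n < 2:
        return ybar, None, None, None, n

    def central_sum(p):
        return sum(fi * (yi - ybar) ** p for fi, yi in zip(f, mid))

    s2 = math.sqrt(central_sum(2) / (n - 1))
    if s2 == 0.0:                            # all mass in one class
        return ybar, s2, None, None, n
    s3 = central_sum(3) / ((n - 1) * s2 ** 3)
    s4 = central_sum(4) / ((n - 1) * s2 ** 4) - 3
    return ybar, s2, s3, s4, n
```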
def frequency_table(x, cb=None):
    r"""
    ``frequency_table`` constructs a frequency distribution of a variable, according to either user-supplied, or function-calculated class boundary values.

    .. _g01ae-py2-py-doc:

    For full information please refer to the NAG Library document for g01ae

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01aef.html

    .. _g01ae-py2-py-parameters:

    **Parameters**
    **x** : float, array-like, shape :math:`\left(n\right)`
        The sample of observations of the variable for which the frequency distribution is required, :math:`x_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`. The values may be in any order.

    **cb** : None or float, array-like, shape :math:`\left(k\right)`, optional
        If :math:`\textit{iclass} = 0`, the elements of :math:`\mathrm{cb}` need not be assigned values, as ``frequency_table`` calculates :math:`k-1` class boundary values.

        If :math:`\textit{iclass} = 1`, the first :math:`k-1` elements of :math:`\mathrm{cb}` must contain the class boundary values you supplied, in ascending order.

        In both cases, the element :math:`\mathrm{cb}[k-1]` need not be assigned, as it is not used in the function.

    **Returns**
    **cb** : float, ndarray, shape :math:`\left(k\right)`
        The first :math:`k-1` elements of :math:`\mathrm{cb}` contain the class boundary values in ascending order.

    **ifreq** : int, ndarray, shape :math:`\left(k\right)`
        The elements of :math:`\mathrm{ifreq}` contain the frequencies in each class, :math:`f_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,k`. In particular :math:`\mathrm{ifreq}[0]` contains the frequency of the class up to :math:`\mathrm{cb}[0]`, :math:`f_1`, and :math:`\mathrm{ifreq}[k-1]` contains the frequency of the class greater than :math:`\mathrm{cb}[k-2]`, :math:`f_k`.

    **xmin** : float
        The smallest value in the sample, :math:`a`.

    **xmax** : float
        The largest value in the sample, :math:`b`.

    .. _g01ae-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`k = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`k \geq 2`.

        (`errno` :math:`2`)
            On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`n \geq 1`.

        (`errno` :math:`3`)
            On entry, :math:`\mathrm{cb}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{cb}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{cb}[\langle\mathit{\boldsymbol{value}}\rangle] < \mathrm{cb}[\langle\mathit{\boldsymbol{value}}\rangle]`.

    .. _g01ae-py2-py-notes:

    **Notes**
    The data consist of a sample of :math:`n` observations of a continuous variable, denoted by :math:`x_i`, for :math:`\textit{i} = 1,2,\ldots,n`.
    Let :math:`a = \mathrm{min}\left(x_1, \ldots, x_n\right)` and :math:`b = \mathrm{max}\left(x_1, \ldots, x_n\right)`.

    ``frequency_table`` constructs a frequency distribution with :math:`k\left(> 1\right)` classes denoted by :math:`f_i`, for :math:`\textit{i} = 1,2,\ldots,k`.

    The boundary values may be either user-supplied, or function-calculated, and are denoted by :math:`y_j`, for :math:`\textit{j} = 1,2,\ldots,k-1`.

    If the boundary values of the classes are to be function-calculated, then they are determined in one of the following ways:

    (a) if :math:`k > 2`, the range of :math:`x` values is divided into :math:`k-2` intervals of equal length, and two extreme intervals, defined by the class boundary values :math:`y_1,y_2,\ldots,y_{{k-1}}`;

    (#) if :math:`k = 2`, :math:`y_1 = \frac{1}{2}\left(a+b\right)`.

    However formed, the values :math:`y_1,\ldots,y_{{k-1}}` are assumed to be in ascending order.
    The class frequencies are formed with

        :math:`f_1 = \text{}` the number of :math:`x` values in the interval :math:`\left({-\infty }, y_1\right)`

        :math:`f_i = \text{}` the number of :math:`x` values in the interval :math:`\left[{y_{{i-1}}}, y_i\right)`, :math:`\quad \text{ }\quad i = 2,\ldots,k-1`

        :math:`f_k = \text{}` the number of :math:`x` values in the interval :math:`\left[{y_{{k-1}}}, \infty \right)`,

    where [ means inclusive, and ) means exclusive.
    If the class boundary values are function-calculated and :math:`k > 2`, then :math:`f_1 = f_k = 0`, and :math:`y_1` and :math:`y_{{k-1}}` are chosen so that :math:`y_1 < a` and :math:`y_{{k-1}} > b`.

    If a frequency distribution is required for a discrete variable, then it is suggested that you supply the class boundary values; function-calculated boundary values may be slightly imprecise (due to the adjustment of :math:`y_1` and :math:`y_{{k-1}}` outlined above) and cause values very close to a class boundary to be assigned to the wrong class.
    """
    raise NotImplementedError
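# The half-open class intervals described in the Notes of ``frequency_table``
# map directly onto ``bisect.bisect_right``. The sketch below covers only the
# user-supplied boundary case; how the function-calculated boundaries are
# adjusted so that y_1 < a and y_{k-1} > b is not specified in the Notes, so
# it is not reproduced here. The name ``sketch_frequency_counts`` is
# hypothetical, and this is not the NAG implementation.

```python
from bisect import bisect_right


def sketch_frequency_counts(x, boundaries):
    """Count x values per class for strictly ascending boundaries y_1..y_{k-1}.

    Class 1 is (-inf, y_1), class i is [y_{i-1}, y_i) and class k is
    [y_{k-1}, inf), so a value equal to a boundary falls in the class above it.
    """
    if len(x) < 1:
        raise ValueError("n must be at least 1")
    if len(boundaries) < 1:
        raise ValueError("k must be at least 2, i.e. at least one boundary")
    if any(lo >= hi for lo, hi in zip(boundaries, boundaries[1:])):
        raise ValueError("boundaries must be strictly ascending")

    counts = [0] * (len(boundaries) + 1)     # k classes for k-1 boundaries
    for value in x:
        # bisect_right returns the number of boundaries <= value, which is
        # exactly the zero-based index of the class containing value.
        counts[bisect_right(boundaries, value)] += 1
    return counts, min(x), max(x)
```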
def contingency_table(nobs, num=0):
    r"""
    ``contingency_table`` performs the analysis of a two-way :math:`r\times c` contingency table or classification.
    If :math:`r = c = 2`, and the total number of objects classified is :math:`40` or fewer, then the probabilities for Fisher's exact test are computed.
    Otherwise, a test statistic is computed (with Yates' correction when :math:`r = c = 2`), which under the assumption of no association between the classifications has approximately a chi-square distribution with :math:`\left(r-1\right)\times \left(c-1\right)` degrees of freedom.

    .. _g01af-py2-py-doc:

    For full information please refer to the NAG Library document for g01af

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01aff.html

    .. _g01af-py2-py-parameters:

    **Parameters**
    **nobs** : int, array-like, shape :math:`\left(m, n\right)`
        The elements :math:`\mathrm{nobs}[\textit{i}-1,\textit{j}-1]`, for :math:`\textit{j} = 1,2,\ldots,n`, for :math:`\textit{i} = 1,2,\ldots,m`, must contain the frequencies for the two-way classification. The :math:`\left(m+1\right)`\ th row and the :math:`\left(n+1\right)`\ th column of :math:`\mathrm{nobs}` need not be set.

    **num** : int, optional
        The value assigned to :math:`\mathrm{num}` must determine whether automatic 'shrinkage' is required when any :math:`r_{{ij}} < 1`, as outlined in :ref:`Notes <g01af-py2-py-notes>` \(i).

        If :math:`\mathrm{num} = 1`, shrinkage is required, otherwise shrinkage is not required.

    **Returns**
    **nobs** : int, ndarray, shape :math:`\left(m, n\right)`
        Contains the following information:

        :math:`\mathrm{nobs}[\textit{i}-1,\textit{j}-1]`, for :math:`\textit{j} = 1,2,\ldots,n_1`, for :math:`\textit{i} = 1,2,\ldots,m_1`, contain the frequencies for the two-way classification after 'shrinkage' has taken place (see :ref:`Notes <g01af-py2-py-notes>`).

        :math:`\mathrm{nobs}[\textit{i}-1,n]`, for :math:`\textit{i} = 1,2,\ldots,m_1`, contain the total frequencies in the remaining rows, :math:`R_i`.

        :math:`\mathrm{nobs}[m,\textit{j}-1]`, for :math:`\textit{j} = 1,2,\ldots,n_1`, contain the total frequencies in the remaining columns, :math:`C_j`.

        :math:`\mathrm{nobs}[m,n]` contains the total frequency, :math:`\mathrm{T}`.

        If any 'shrinkage' has occurred, all other cells contain no useful information.

    **num** : int
        When Fisher's exact test for a :math:`2\times 2` classification is used then :math:`\mathrm{num}` contains the number of elements used in the array :math:`\mathrm{p}`, otherwise :math:`\mathrm{num}` is set to zero.

    **pred** : float, ndarray, shape :math:`\left(m, n\right)`
        The elements :math:`\mathrm{pred}[i-1,j-1]`, where :math:`i = 1,2,\ldots,\mathrm{m1}` and :math:`j = 1,2,\ldots,\mathrm{n1}` contain the expected frequencies, :math:`r_{{ij}}` corresponding to the observed frequencies :math:`\mathrm{nobs}[i-1,j-1]`, except in the case when Fisher's exact test for a :math:`2\times 2` classification is to be used, when :math:`\mathrm{pred}` is not used. No other elements are utilized.

    **chis** : float
        The value of the test statistic, :math:`\chi^2`, except when Fisher's exact test for a :math:`2\times 2` classification is used in which case it is unspecified.

    **p** : float, ndarray, shape :math:`\left(21\right)`
        The first :math:`\mathrm{num}` elements contain the probabilities associated with the various possible frequency tables, :math:`P_{\textit{r}}`, for :math:`\textit{r} = 0,1,\ldots,R_1`, the remainder are unspecified.

    **npos** : int
        :math:`\mathrm{p}[\mathrm{npos}-1]` holds the probability associated with the given table of frequencies.

    **ndf** : int
        The value of :math:`\mathrm{ndf}` gives the number of degrees of freedom for the chi-square distribution, :math:`\left(m_1-1\right)\times \left(n_1-1\right)`; when Fisher's exact test is used :math:`\mathrm{ndf} = 1`.

    **m1** : int
        The number of rows of the two-way classification, after any 'shrinkage', :math:`m_1`.

    **n1** : int
        The number of columns of the two-way classification, after any 'shrinkage', :math:`n_1`.

    .. _g01af-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            The number of rows or columns of :math:`\mathrm{nobs}` is less than :math:`2`.

        (`errno` :math:`1`)
            On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`n > 2`.

        (`errno` :math:`1`)
            On entry, :math:`m = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`m > 2`.

        (`errno` :math:`2`)
            At least one frequency is negative, or all frequencies are zero.

    .. _g01af-py2-py-notes:

    **Notes**
    `No equivalent traditional C interface for this routine exists in the NAG Library.`

    The data consist of the frequencies for the two-way classification, denoted by :math:`n_{{\textit{i}\textit{j}}}`, for :math:`\textit{j} = 1,2,\ldots,n`, for :math:`\textit{i} = 1,2,\ldots,m` with :math:`m,n > 1`.

    A check is made to see whether any row or column of the matrix of frequencies consists entirely of zeros, and if so, the matrix of frequencies is reduced by omitting that row or column.
    Suppose the final size of the matrix is :math:`m_1\times n_1` (:math:`m_1,n_1 > 1`), and let

        :math:`R_{\textit{i}} = \sum_{{j = 1}}^{n_1}n_{{\textit{i}j}}`, the total frequency for the :math:`\textit{i}`\ th row, for :math:`\textit{i} = 1,2,\ldots,m_1`,

        :math:`C_{\textit{j}} = \sum_{{i = 1}}^{m_1}n_{{i\textit{j}}}`, the total frequency for the :math:`\textit{j}`\ th column, for :math:`\textit{j} = 1,2,\ldots,n_1`, and

        :math:`T = \sum_{{i = 1}}^{m_1}R_i = \sum_{{j = 1}}^{n_1}C_j`, the total frequency.

    There are two situations:

    (i) If :math:`m_1 > 2` and/or :math:`n_1 > 2`, or :math:`m_1 = n_1 = 2` and :math:`T > 40`, then the matrix of expected frequencies, denoted by :math:`r_{{ij}}`, for :math:`i = 1,2,\ldots,m_1` and :math:`j = 1,2,\ldots,n_1`, and the test statistic, :math:`\chi^2`, are computed, where

        .. math::
            r_{{ij}} = R_iC_j/T\text{, }\quad i = 1,2,\ldots,m_1\text{;}j = 1,2,\ldots,n_1

        and

        .. math::
            \chi^2 = \sum_{{i = 1}}^{m_1}\sum_{{j = 1}}^{n_1}\left[\left\lvert r_{{ij}}-n_{{ij}}\right\rvert -Y\right]^2/r_{{ij}}\text{,}

        where

        .. math::
            Y = \left\{\begin{array}{l}\frac{1}{2}\quad \text{ if } m_1 = n_1 = 2\\0\quad \text{ otherwise}\end{array}\right.

        is Yates' correction for continuity.

        Under the assumption that there is no association between the two classifications, :math:`\chi^2` will have approximately a chi-square distribution with :math:`\left(m_1-1\right)\times \left(n_1-1\right)` degrees of freedom.

        An option exists which allows for further 'shrinkage' of the matrix of frequencies in the case where :math:`r_{{ij}} < 1` for the (:math:`i,j`)th cell.
        If this is the case, then row :math:`i` or column :math:`j` will be combined with the adjacent row or column with smaller total.
        Row :math:`i` is selected for combination if :math:`R_i\times m_1\leq C_j\times n_1`.
        This 'shrinking' process is continued until :math:`r_{{ij}}\geq 1` for all cells (:math:`i,j`).

    (#) If :math:`m_1 = n_1 = 2` and :math:`T\leq 40`, the probabilities to enable Fisher's exact test to be made are computed.

        The matrix of frequencies may be rearranged so that :math:`R_1` is the smallest marginal (i.e., column and row) total, and :math:`C_2\geq C_1`.
        Under the assumption of no association between the classifications, the probability of obtaining :math:`r` entries in cell :math:`\left(1, 1\right)` is computed where

        .. math::
            P_{{r+1}} = \frac{{R_1!R_2!C_1!C_2!}}{{T!r!\left(R_1-r\right)!\left(C_1-r\right)!\left(T-C_1-R_1+r\right)!}}\text{, }\quad r = 0,1,\ldots,R_1\text{.}

        The probability of obtaining the table of given frequencies is returned.
        A test of the assumption against some alternative may then be made by summing the relevant values of :math:`P_r`.
    """
    raise NotImplementedError
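# The two situations in the Notes of ``contingency_table`` can be illustrated
# with a short pure-Python sketch. It is not the NAG implementation: both
# function names are hypothetical, and the sketch assumes a table with no
# all-zero row or column and performs no 'shrinkage' and no rearrangement of
# the margins.

```python
import math


def sketch_chi_square(nobs):
    """Expected frequencies r_ij, the chi-square statistic and its d.o.f."""
    row_tot = [sum(row) for row in nobs]                 # R_i
    col_tot = [sum(col) for col in zip(*nobs)]           # C_j
    total = sum(row_tot)                                 # T
    m1, n1 = len(row_tot), len(col_tot)
    if m1 < 2 or n1 < 2 or 0 in row_tot or 0 in col_tot:
        raise ValueError("need at least 2 non-empty rows and columns")

    yates = 0.5 if m1 == 2 and n1 == 2 else 0.0          # Y
    pred = [[row_tot[i] * col_tot[j] / total for j in range(n1)]
            for i in range(m1)]
    chis = sum((abs(pred[i][j] - nobs[i][j]) - yates) ** 2 / pred[i][j]
               for i in range(m1) for j in range(n1))
    return pred, chis, (m1 - 1) * (n1 - 1)


def sketch_fisher_probabilities(r1, r2, c1, c2):
    """P_{r+1} for r = 0..R_1, using the factorial formula from the Notes."""
    total = r1 + r2
    if c1 + c2 != total:
        raise ValueError("row and column totals must agree")
    f = math.factorial
    numerator = f(r1) * f(r2) * f(c1) * f(c2)
    probs = []
    for r in range(r1 + 1):
        if c1 - r < 0 or total - c1 - r1 + r < 0:
            probs.append(0.0)                            # table not attainable
            continue
        denominator = (f(total) * f(r) * f(r1 - r) * f(c1 - r)
                       * f(total - c1 - r1 + r))
        probs.append(numerator / denominator)
    return probs
```

# The Fisher probabilities form a hypergeometric distribution, so they sum to
# one over r = 0..R_1 when the margins are arranged as the Notes describe.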
def five_point_summary(x):
    r"""
    ``five_point_summary`` calculates a five-point summary for a single sample.

    .. _g01al-py2-py-doc:

    For full information please refer to the NAG Library document for g01al

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01alf.html

    .. _g01al-py2-py-parameters:

    **Parameters**
    **x** : float, array-like, shape :math:`\left(n\right)`
        The sample observations, :math:`x_1,x_2,\ldots,x_n`.

    **Returns**
    **res** : float, ndarray, shape :math:`\left(5\right)`
        :math:`\mathrm{res}` contains the five-point summary.

        :math:`\mathrm{res}[0]`
            The minimum.

        :math:`\mathrm{res}[1]`
            The lower hinge.

        :math:`\mathrm{res}[2]`
            The median.

        :math:`\mathrm{res}[3]`
            The upper hinge.

        :math:`\mathrm{res}[4]`
            The maximum.

    .. _g01al-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`n\geq 5`.

    .. _g01al-py2-py-notes:

    **Notes**
    ``five_point_summary`` calculates the minimum, lower hinge, median, upper hinge and the maximum of a sample of :math:`n` observations.

    The data consist of a single sample of :math:`n` observations, denoted by :math:`x_i`. Let :math:`z_i`, for :math:`i = 1,2,\ldots,n`, represent the sample observations sorted into ascending order.

    Let :math:`m = \frac{n}{2}` if :math:`n` is even and :math:`\frac{\left(n+1\right)}{2}` if :math:`n` is odd,

    and :math:`k = \frac{m}{2}` if :math:`m` is even and :math:`\frac{\left(m+1\right)}{2}` if :math:`m` is odd.

    Then we have

    .. rst-class:: nag-rules-none nag-align-left

    +-----------+---------------------------------------------------+---------------------+
    |Minimum    |:math:`\text{} = z_1`,                             |                     |
    +-----------+---------------------------------------------------+---------------------+
    |Maximum    |:math:`\text{} = z_n`,                             |                     |
    +-----------+---------------------------------------------------+---------------------+
    |Median     |:math:`\text{} = z_m`                              |if :math:`n` is odd, |
    +-----------+---------------------------------------------------+---------------------+
    |           |:math:`\text{} = \frac{{z_m+z_{{m+1}}}}{2}`        |if :math:`n` is even,|
    +-----------+---------------------------------------------------+---------------------+
    |Lower hinge|:math:`\text{} = z_k`                              |if :math:`m` is odd, |
    +-----------+---------------------------------------------------+---------------------+
    |           |:math:`\text{} = \frac{{z_k+z_{{k+1}}}}{2}`        |if :math:`m` is even,|
    +-----------+---------------------------------------------------+---------------------+
    |Upper hinge|:math:`\text{} = z_{{n-k+1}}`                      |if :math:`m` is odd, |
    +-----------+---------------------------------------------------+---------------------+
    |           |:math:`\text{} = \frac{{z_{{n-k}}+z_{{n-k+1}}}}{2}`|if :math:`m` is even.|
    +-----------+---------------------------------------------------+---------------------+

    .. _g01al-py2-py-references:

    **References**
    Erickson, B H and Nosanchuk, T A, 1985, `Understanding Data`, Open University Press, Milton Keynes

    Tukey, J W, 1977, `Exploratory Data Analysis`, Addison--Wesley
    """
    raise NotImplementedError
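# The table in the Notes of ``five_point_summary`` uses one-based indices into
# the sorted sample. The sketch below follows it row by row, shifting each
# index by one for Python lists. It is an illustration only, not the NAG
# routine, and the name ``sketch_five_point_summary`` is hypothetical.

```python
def sketch_five_point_summary(x):
    """Return [minimum, lower hinge, median, upper hinge, maximum]."""
    n = len(x)
    if n < 5:
        raise ValueError("n must be at least 5")
    z = sorted(x)                                   # z_1..z_n is z[0]..z[n-1]

    m = n // 2 if n % 2 == 0 else (n + 1) // 2
    k = m // 2 if m % 2 == 0 else (m + 1) // 2

    if n % 2 == 1:
        median = z[m - 1]                           # z_m
    else:
        median = (z[m - 1] + z[m]) / 2              # (z_m + z_{m+1}) / 2

    if m % 2 == 1:
        lower = z[k - 1]                            # z_k
        upper = z[n - k]                            # z_{n-k+1}
    else:
        lower = (z[k - 1] + z[k]) / 2               # (z_k + z_{k+1}) / 2
        upper = (z[n - k - 1] + z[n - k]) / 2       # (z_{n-k} + z_{n-k+1}) / 2

    return [z[0], lower, median, upper, z[-1]]
```

# The hinges are symmetric by construction: the upper hinge uses the same
# offsets from the top of the sorted sample as the lower hinge does from the
# bottom.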
[docs]def quantiles(rv, q): r""" ``quantiles`` finds specified quantiles from a vector of unsorted data. .. _g01am-py2-py-doc: For full information please refer to the NAG Library document for g01am https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01amf.html .. _g01am-py2-py-parameters: **Parameters** **rv** : float, array-like, shape :math:`\left(n\right)` The vector whose quantiles are to be determined. **q** : float, array-like, shape :math:`\left(\textit{nq}\right)` The quantiles to be calculated, in ascending order. Note that these must be between :math:`0.0` and :math:`1.0`, with :math:`0.0` returning the smallest element and :math:`1.0` the largest. **Returns** **qv** : float, ndarray, shape :math:`\left(\textit{nq}\right)` :math:`\mathrm{qv}[i-1]` contains the quantile specified by the value provided in :math:`\mathrm{q}[i-1]`, or an interpolated value if the quantile falls between two data values. .. _g01am-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n > 0`. (`errno` :math:`2`) On entry, :math:`\textit{nq} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{nq} > 0`. (`errno` :math:`3`) On entry, an element of :math:`\mathrm{q}` was less than :math:`0.0` or greater than :math:`1.0`. (`errno` :math:`4`) On entry, :math:`\mathrm{q}` was not in ascending order. .. _g01am-py2-py-notes: **Notes** A quantile is a value which divides a frequency distribution such that there is a given proportion of data values below the quantile. For example, the median of a dataset is the :math:`0.5` quantile because half the values are less than or equal to it; and the :math:`0.25` quantile is the :math:`25`\ th percentile. ``quantiles`` uses a modified version of Singleton's 'median-of-three' Quicksort algorithm (Singleton (1969)) to determine specified quantiles of a vector of real values. 
The input vector is partially sorted, as far as is required to compute desired quantiles; for a single quantile, this is much faster than sorting the entire vector. Where necessary, linear interpolation is also carried out to return the values of quantiles which lie between two data points. .. _g01am-py2-py-references: **References** Singleton, R C, 1969, `An efficient algorithm for sorting with minimal storage: Algorithm 347`, Comm. ACM (12), 185--187 """ raise NotImplementedError
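As a rough illustration of the Notes for ``quantiles``, the sketch below sorts the whole vector and interpolates linearly between neighbouring data points. Two things are assumptions rather than documented behaviour: the position rule ``p * (n - 1)`` used to place a quantile between order statistics, and the use of a full sort in place of the partial Quicksort described above. The argument checks mirror the documented constraints on ``rv`` and ``q``; the name ``quantiles_sketch`` is hypothetical.

```python
def quantiles_sketch(rv, q):
    """Quantiles of unsorted data by full sort plus linear interpolation.

    Illustrative only: the position rule p * (n - 1) is an ASSUMPTION and
    may differ from the rule used by the NAG routine.
    """
    data = sorted(rv)
    q = list(q)
    n = len(data)
    if n == 0:
        raise ValueError("n > 0 is required")
    if not q:
        raise ValueError("nq > 0 is required")
    if any(p < 0.0 or p > 1.0 for p in q):
        raise ValueError("every element of q must lie in [0.0, 1.0]")
    if any(a > b for a, b in zip(q, q[1:])):
        raise ValueError("q must be in ascending order")

    qv = []
    for p in q:
        pos = p * (n - 1)          # fractional 0-based position (assumed rule)
        lo = int(pos)              # floor, since pos >= 0
        hi = min(lo + 1, n - 1)    # neighbour to interpolate towards
        frac = pos - lo
        qv.append(data[lo] + frac * (data[hi] - data[lo]))
    return qv
```

With this rule ``q = 0.0`` returns the smallest element and ``q = 1.0`` the largest, in line with the parameter description.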
[docs]def quantiles_stream_fixed(ind, n, rv, nb, eps, q, nq, rcomm, icomm): r""" ``quantiles_stream_fixed`` finds approximate quantiles from a data stream of known size using an out-of-core algorithm. .. _g01an-py2-py-doc: For full information please refer to the NAG Library document for g01an https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01anf.html .. _g01an-py2-py-parameters: **Parameters** **ind** : int Indicates the action required in the current call to ``quantiles_stream_fixed``. :math:`\mathrm{ind} = 0` Return the required length of :math:`\mathrm{rcomm}` and :math:`\mathrm{icomm}` in :math:`\mathrm{icomm}[0]` and :math:`\mathrm{icomm}[1]` respectively. :math:`\mathrm{n}` and :math:`\mathrm{eps}` must be set and :math:`\textit{licomm}` must be at least :math:`2`. :math:`\mathrm{ind} = 1` Initialise the communication arrays and process the first :math:`\mathrm{nb}` values from the data stream as supplied in :math:`\mathrm{rv}`. :math:`\mathrm{ind} = 2` Process the next block of :math:`\mathrm{nb}` values from the data stream. The calling program must update :math:`\mathrm{rv}` and (if required) :math:`\mathrm{nb}`, and re-enter ``quantiles_stream_fixed`` with all other parameters unchanged. :math:`\mathrm{ind} = 3` Calculate the :math:`\mathrm{nq}` :math:`\epsilon`-approximate quantiles specified in :math:`\mathrm{q}`. The calling program must set :math:`\mathrm{q}` and :math:`\mathrm{nq}` and re-enter ``quantiles_stream_fixed`` with all other parameters unchanged. This option can be chosen only when :math:`\mathrm{np}\geq \left\lceil \mathrm{exp}\left(1.0\right)/\mathrm{eps}\right\rceil`. **n** : int :math:`n`, the total number of values in the data stream. **rv** : float, array-like, shape :math:`\left(:\right)` Note: the required length for this argument is determined as follows: if :math:`\mathrm{ind}\text{ in } (1, 2)`: :math:`\mathrm{nb}`; otherwise: :math:`0`. 
If :math:`\mathrm{ind} = 1` or :math:`2`, the vector containing the current block of data, otherwise :math:`\mathrm{rv}` is not referenced. **nb** : int If :math:`\mathrm{ind} = 1` or :math:`2`, the size of the current block of data. The size of blocks of data in array :math:`\mathrm{rv}` can vary; therefore, :math:`\mathrm{nb}` can change between calls to ``quantiles_stream_fixed``. **eps** : float Approximation factor :math:`\epsilon`. **q** : float, array-like, shape :math:`\left(:\right)` Note: the required length for this argument is determined as follows: if :math:`\mathrm{ind}=3`: :math:`\mathrm{nq}`; otherwise: :math:`0`. If :math:`\mathrm{ind} = 3`, the quantiles to be calculated, otherwise :math:`\mathrm{q}` is not referenced. Note that :math:`\mathrm{q}[i] = 0.0` corresponds to the minimum value and :math:`\mathrm{q}[i] = 1.0` to the maximum value. **nq** : int If :math:`\mathrm{ind} = 3`, the number of quantiles requested, otherwise :math:`\mathrm{nq}` is not referenced. **rcomm** : float, ndarray, shape :math:`\left(:\right)`, modified in place Note: the required length for this argument is determined as follows: if :math:`\mathrm{ind} \neq 0`: the value returned in :math:`\mathrm{icomm}[0]` by a call to ``quantiles_stream_fixed`` with :math:`\mathrm{ind} = 0`; otherwise: :math:`0`. Communication array. **icomm** : int, ndarray, shape :math:`\left(:\right)`, modified in place Note: the required length for this argument is determined as follows: if :math:`\mathrm{ind} \neq 0`: the value returned in :math:`\mathrm{icomm}[1]` by a call to ``quantiles_stream_fixed`` with :math:`\mathrm{ind} = 0`; otherwise: :math:`2`. Communication array. **Returns** **ind** : int Indicates output from a successful call. :math:`\mathrm{ind} = 1` Lengths of :math:`\mathrm{rcomm}` and :math:`\mathrm{icomm}` have been returned in :math:`\mathrm{icomm}[0]` and :math:`\mathrm{icomm}[1]` respectively. 
:math:`\mathrm{ind} = 2` ``quantiles_stream_fixed`` has processed :math:`\mathrm{np}` data points and expects to be called again with additional data (i.e., :math:`\mathrm{np} < \mathrm{n}`). :math:`\mathrm{ind} = 3` ``quantiles_stream_fixed`` has returned the requested :math:`\epsilon`-approximate quantiles in :math:`\mathrm{qv}`. These quantiles are based on :math:`\mathrm{np}` data points. :math:`\mathrm{ind} = 4` ``quantiles_stream_fixed`` has processed all :math:`\mathrm{n}` data points (i.e., :math:`\mathrm{np} = \mathrm{n}`). **np** : int The number of elements processed so far. **qv** : float, ndarray, shape :math:`\left(:\right)` If :math:`\mathrm{ind} = 3`, :math:`\mathrm{qv}[i]` contains the :math:`\epsilon`-approximate quantile specified by the value provided in :math:`\mathrm{q}[i]`. .. _g01an-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{ind} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{ind} = 0`, :math:`1`, :math:`2` or :math:`3`. (`errno` :math:`2`) On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n} > 0`. (`errno` :math:`3`) On entry, :math:`\mathrm{eps} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{exp}\left(1.0\right)/\mathrm{n}\leq \mathrm{eps}\leq 1.0`. (`errno` :math:`4`) On entry, :math:`\mathrm{ind} = 1` or :math:`2` and :math:`\mathrm{nb} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{ind} = 1` or :math:`2` then :math:`\mathrm{nb} > 0`. (`errno` :math:`7`) Number of data elements streamed, :math:`\langle\mathit{\boldsymbol{value}}\rangle`, is not sufficient for a quantile query when :math:`\mathrm{eps} = \langle\mathit{\boldsymbol{value}}\rangle`. Supply more data or reprocess the data with a higher :math:`\mathrm{eps}` value. (`errno` :math:`8`) On entry, :math:`\mathrm{ind} = 3` and :math:`\mathrm{nq} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Constraint: if :math:`\mathrm{ind} = 3` then :math:`\mathrm{nq} > 0`. (`errno` :math:`9`) On entry, :math:`\mathrm{ind} = 3` and :math:`\mathrm{q}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{ind} = 3` then :math:`0.0\leq \mathrm{q}[i]\leq 1.0` for all :math:`i`. .. _g01an-py2-py-notes: **Notes** A quantile is a value which divides a frequency distribution such that there is a given proportion of data values below the quantile. For example, the median of a dataset is the :math:`0.5` quantile because half the values are less than or equal to it. ``quantiles_stream_fixed`` uses a slightly modified version of an algorithm described in a paper by Zhang and Wang (2007) to determine :math:`\epsilon`-approximate quantiles of a data stream of :math:`n` real values, where :math:`n` is known. Given any quantile :math:`q \in \left[0.0, 1.0\right]`, an :math:`\epsilon`-approximate quantile is defined as an element in the data stream whose rank falls within :math:`\left[{\left(q-\epsilon \right)n}, {\left(q+\epsilon \right)n}\right]`. In case of more than one :math:`\epsilon`-approximate quantile being available, the one closest to :math:`qn` is returned. .. _g01an-py2-py-references: **References** Zhang, Q and Wang, W, 2007, `A fast algorithm for approximate quantiles in high speed data streams`, Proceedings of the 19th International Conference on Scientific and Statistical Database Management, IEEE Computer Society, 29 """ raise NotImplementedError
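The definition of an :math:`\epsilon`-approximate quantile in the Notes for ``quantiles_stream_fixed`` can be made concrete with a brute-force check on a fully stored stream. This is only a sketch of the definition, not of the streaming algorithm of Zhang and Wang, which never stores or sorts the whole stream. It assumes 1-based ranks and distinct values; the helper names are hypothetical. ``min_points_for_query`` reproduces the documented condition :math:`\mathrm{np}\geq \lceil \mathrm{exp}(1.0)/\mathrm{eps}\rceil` for a quantile query.

```python
import math


def min_points_for_query(eps):
    """Smallest np for which a quantile query (ind = 3) is allowed."""
    return math.ceil(math.e / eps)


def eps_approximate_candidates(stream, q, eps):
    """Elements whose 1-based rank lies in [(q - eps) n, (q + eps) n].

    Brute force on a stored stream; ASSUMES 1-based ranks and no ties.
    """
    z = sorted(stream)
    n = len(z)
    lo, hi = (q - eps) * n, (q + eps) * n
    return [v for rank, v in enumerate(z, start=1) if lo <= rank <= hi]


def closest_candidate(stream, q, eps):
    """Among the candidates, the element whose rank is closest to q * n."""
    z = sorted(stream)
    n = len(z)
    lo, hi = (q - eps) * n, (q + eps) * n
    ranked = [(abs(rank - q * n), v)
              for rank, v in enumerate(z, start=1) if lo <= rank <= hi]
    if not ranked:
        raise ValueError("no element satisfies the rank bounds")
    return min(ranked)[1]
```

With 16 distinct values, :math:`q = 0.5` and :math:`\epsilon = 0.125`, the admissible ranks are 6 to 10, and the element of rank 8 is the one closest to :math:`qn`.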
def quantiles_stream_arbitrary(ind, rv, nb, eps, q, nq, rcomm, icomm): r""" ``quantiles_stream_arbitrary`` finds approximate quantiles from a large arbitrary-sized data stream using an out-of-core algorithm. .. _g01ap-py2-py-doc: For full information please refer to the NAG Library document for g01ap https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01apf.html .. _g01ap-py2-py-parameters: **Parameters** **ind** : int `On initial entry`: must be set to :math:`0`. Indicates the action required in the current call to ``quantiles_stream_arbitrary``. :math:`\mathrm{ind} = 0` Initialize the communication arrays and attempt to process the first :math:`\mathrm{nb}` values from the data stream. :math:`\mathrm{eps}`, :math:`\mathrm{rv}` and :math:`\mathrm{nb}` must be set and :math:`\textit{licomm}` must be at least :math:`10`. :math:`\mathrm{ind} = 1` Attempt to process the next block of :math:`\mathrm{nb}` values from the data stream. The calling program must update :math:`\mathrm{rv}` and (if required) :math:`\mathrm{nb}`, and re-enter ``quantiles_stream_arbitrary`` with all other parameters unchanged. :math:`\mathrm{ind} = 2` Continue calculation following the reallocation of either or both of the communication arrays :math:`\mathrm{rcomm}` and :math:`\mathrm{icomm}`. :math:`\mathrm{ind} = 3` Calculate the :math:`\mathrm{nq}` :math:`\epsilon`-approximate quantiles specified in :math:`\mathrm{q}`. The calling program must set :math:`\mathrm{q}` and :math:`\mathrm{nq}` and re-enter ``quantiles_stream_arbitrary`` with all other parameters unchanged. This option can be chosen only when :math:`\mathrm{np}\geq \left\lceil \mathrm{exp}\left(1.0\right)/\mathrm{eps}\right\rceil`. **rv** : float, array-like, shape :math:`\left(:\right)` Note: the required length for this argument is determined as follows: if :math:`\mathrm{ind}\text{ in } (0, 1, 2)`: :math:`\mathrm{nb}`; otherwise: :math:`0`. 
If :math:`\mathrm{ind} = 0`, :math:`1` or :math:`2`, the vector containing the current block of data, otherwise :math:`\mathrm{rv}` is not referenced. **nb** : int If :math:`\mathrm{ind} = 0`, :math:`1` or :math:`2`, the size of the current block of data. The size of blocks of data in array :math:`\mathrm{rv}` can vary; therefore, :math:`\mathrm{nb}` can change between calls to ``quantiles_stream_arbitrary``. **eps** : float Approximation factor :math:`\epsilon`. **q** : float, array-like, shape :math:`\left(:\right)` Note: the required length for this argument is determined as follows: if :math:`\mathrm{ind}=3`: :math:`\mathrm{nq}`; otherwise: :math:`0`. If :math:`\mathrm{ind} = 3`, the quantiles to be calculated, otherwise :math:`\mathrm{q}` is not referenced. Note that :math:`\mathrm{q}[i] = 0.0` corresponds to the minimum value and :math:`\mathrm{q}[i] = 1.0` to the maximum value. **nq** : int If :math:`\mathrm{ind} = 3`, the number of quantiles requested, otherwise :math:`\mathrm{nq}` is not referenced. **rcomm** : float, ndarray, shape :math:`\left(\textit{lrcomm}\right)`, modified in place `On entry`: if :math:`\mathrm{ind} = 1` or :math:`2` then the first :math:`l` elements of :math:`\mathrm{rcomm}` as supplied to ``quantiles_stream_arbitrary`` must be identical to the first :math:`l` elements of :math:`\mathrm{rcomm}` returned from the last call to ``quantiles_stream_arbitrary``, where :math:`l` is the value of :math:`\textit{lrcomm}` used in the last call. In other words, the contents of :math:`\mathrm{rcomm}` must not be altered between calls to this function. If :math:`\mathrm{rcomm}` needs to be reallocated then its contents must be preserved. If :math:`\mathrm{ind} = 0` then :math:`\mathrm{rcomm}` need not be set. `On exit`: :math:`\mathrm{rcomm}` holds information required by subsequent calls to ``quantiles_stream_arbitrary``. 
**icomm** : int, ndarray, shape :math:`\left(\textit{licomm}\right)`, modified in place `On entry`: if :math:`\mathrm{ind} = 1` or :math:`2` then the first :math:`l` elements of :math:`\mathrm{icomm}` as supplied to ``quantiles_stream_arbitrary`` must be identical to the first :math:`l` elements of :math:`\mathrm{icomm}` returned from the last call to ``quantiles_stream_arbitrary``, where :math:`l` is the value of :math:`\textit{licomm}` used in the last call. In other words, the contents of :math:`\mathrm{icomm}` must not be altered between calls to this function. If :math:`\mathrm{icomm}` needs to be reallocated then its contents must be preserved. If :math:`\mathrm{ind} = 0` then :math:`\mathrm{icomm}` need not be set. `On exit`: :math:`\mathrm{icomm}[0]` holds the minimum required length for :math:`\mathrm{rcomm}` and :math:`\mathrm{icomm}[1]` holds the minimum required length for :math:`\mathrm{icomm}`. The remaining elements of :math:`\mathrm{icomm}` are used for communication between subsequent calls to ``quantiles_stream_arbitrary``. **Returns** **ind** : int Indicates output from the call. :math:`\mathrm{ind} = 1` ``quantiles_stream_arbitrary`` has processed :math:`\mathrm{np}` data points and expects to be called again with additional data. :math:`\mathrm{ind} = 2` Either one or more of the communication arrays :math:`\mathrm{rcomm}` and :math:`\mathrm{icomm}` is too small. The new minimum lengths of :math:`\mathrm{rcomm}` and :math:`\mathrm{icomm}` have been returned in :math:`\mathrm{icomm}[0]` and :math:`\mathrm{icomm}[1]` respectively. If the new minimum length is greater than the current length then the corresponding communication array needs to be reallocated, its contents preserved and ``quantiles_stream_arbitrary`` called again with all other parameters unchanged. 
If there is more data to be processed, it is recommended that :math:`\textit{lrcomm}` and :math:`\textit{licomm}` are made significantly bigger than the minimum to limit the number of reallocations. :math:`\mathrm{ind} = 3` ``quantiles_stream_arbitrary`` has returned the requested :math:`\epsilon`-approximate quantiles in :math:`\mathrm{qv}`. These quantiles are based on :math:`\mathrm{np}` data points. **np** : int :math:`m`, the number of elements processed so far. **qv** : float, ndarray, shape :math:`\left(:\right)` If :math:`\mathrm{ind} = 3`, :math:`\mathrm{qv}[i]` contains the :math:`\epsilon`-approximate quantile specified by the value provided in :math:`\mathrm{q}[i]`. .. _g01ap-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{ind} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{ind} = 0`, :math:`1`, :math:`2` or :math:`3`. (`errno` :math:`2`) On entry, :math:`\mathrm{eps} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0 < \mathrm{eps}\leq 1.0`. (`errno` :math:`3`) On entry, :math:`\mathrm{ind} = 0`, :math:`1` or :math:`2` and :math:`\mathrm{nb} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{ind} = 0`, :math:`1` or :math:`2` then :math:`\mathrm{nb} > 0`. (`errno` :math:`6`) The contents of :math:`\mathrm{icomm}` have been altered between calls to this function. (`errno` :math:`7`) The contents of :math:`\mathrm{rcomm}` have been altered between calls to this function. (`errno` :math:`8`) Number of data elements streamed, :math:`\langle\mathit{\boldsymbol{value}}\rangle`, is not sufficient for a quantile query when :math:`\mathrm{eps} = \langle\mathit{\boldsymbol{value}}\rangle`. Supply more data or reprocess the data with a higher :math:`\mathrm{eps}` value. (`errno` :math:`9`) On entry, :math:`\mathrm{ind} = 3` and :math:`\mathrm{nq} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Constraint: if :math:`\mathrm{ind} = 3` then :math:`\mathrm{nq} > 0`. (`errno` :math:`10`) On entry, :math:`\mathrm{ind} = 3` and :math:`\mathrm{q}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{ind} = 3` then :math:`0.0\leq \mathrm{q}[i]\leq 1.0` for all :math:`i`. .. _g01ap-py2-py-notes: **Notes** A quantile is a value which divides a frequency distribution such that there is a given proportion of data values below the quantile. For example, the median of a dataset is the :math:`0.5` quantile because half the values are less than or equal to it. ``quantiles_stream_arbitrary`` uses a slightly modified version of an algorithm described in a paper by Zhang and Wang (2007) to determine :math:`\epsilon`-approximate quantiles of a large arbitrary-sized data stream of real values, where :math:`\epsilon` is a user-defined approximation factor. Let :math:`m` denote the number of data elements processed so far then, given any quantile :math:`q \in \left[0.0, 1.0\right]`, an :math:`\epsilon`-approximate quantile is defined as an element in the data stream whose rank falls within :math:`\left[{\left(q-\epsilon \right)m}, {\left(q+\epsilon \right)m}\right]`. In case of more than one :math:`\epsilon`-approximate quantile being available, the one closest to :math:`qm` is used. .. _g01ap-py2-py-references: **References** Zhang, Q and Wang, W, 2007, `A fast algorithm for approximate quantiles in high speed data streams`, Proceedings of the 19th International Conference on Scientific and Statistical Database Management, IEEE Computer Society, 29 """ raise NotImplementedError
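When ``quantiles_stream_arbitrary`` returns :math:`\mathrm{ind} = 2`, the caller has to enlarge a communication array while keeping its existing contents, and the documentation recommends leaving headroom so that this happens rarely. The sketch below shows one way to do that with plain Python lists; in real use the arrays would be NumPy ndarrays and the minimum lengths would be read from :math:`\mathrm{icomm}[0]` and :math:`\mathrm{icomm}[1]`. The helper name ``grow_preserving`` and the default headroom factor of 2 are assumptions made for illustration.

```python
def grow_preserving(comm, min_len, headroom=2.0, fill=0):
    """Return an array of at least min_len elements, keeping old contents.

    Hypothetical helper: if comm is already long enough it is returned
    unchanged; otherwise a longer copy is made, padded with `fill`.
    The headroom factor over-allocates to limit future reallocations.
    """
    if len(comm) >= min_len:
        return comm
    new_len = max(min_len, int(min_len * headroom))
    # Existing elements stay in place at the front; padding goes at the end.
    return list(comm) + [fill] * (new_len - len(comm))
```

After enlarging whichever array was too short, the routine would be re-entered with all other parameters unchanged, as described under :math:`\mathrm{ind} = 2`.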
[docs]def summary_onevar(x, wt=None, pn=0, rcomm=None): r""" ``summary_onevar`` calculates the mean, standard deviation, coefficients of skewness and kurtosis, and the maximum and minimum values for a set of (optionally weighted) data. The input data can be split into arbitrary sized blocks, allowing large datasets to be summarised. .. _g01at-py2-py-doc: For full information please refer to the NAG Library document for g01at https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01atf.html .. _g01at-py2-py-parameters: **Parameters** **x** : float, array-like, shape :math:`\left(\textit{nb}\right)` The current block of observations, corresponding to :math:`x_{\textit{i}}`, for :math:`\textit{i} = k+1,\ldots,k+b`, where :math:`k` is the number of observations processed so far and :math:`b` is the size of the current block of data. **wt** : None or float, array-like, shape :math:`\left(:\right)`, optional Note: the required length for this argument is determined as follows: if :math:`\mathrm{wt}\text{ is not }\mathbf{None}`: :math:`\textit{nb}`; otherwise: :math:`0`. If :math:`\mathrm{wt}` is not **None**, :math:`\mathrm{wt}` must contain the user-supplied weights corresponding to the block of data supplied in :math:`\mathrm{x}`, that is :math:`w_{\textit{i}}`, for :math:`\textit{i} = k+1,\ldots,k+b`. If :math:`\mathrm{wt}` is **None**, :math:`w_i = 1` for all :math:`i`. **pn** : int, optional The number of valid observations processed so far, that is the number of observations with :math:`w_i > 0`, for :math:`\textit{i} = 1,2,\ldots,k`. On the first call to ``summary_onevar``, or when starting to summarise a new dataset, :math:`\mathrm{pn}` must be set to :math:`0`. If :math:`\mathrm{pn}\neq 0`, it must be the same value as returned by the last call to ``summary_onevar``. 
**rcomm** : None or float, ndarray, shape :math:`\left(:\right)`, optional, modified in place Note: the required length for this argument is determined as follows: if :math:`\mathrm{rcomm}\text{ is not }\mathbf{None}`: :math:`20`; otherwise: :math:`0`. `Optionally, on entry`: communication array, used to store information between calls to ``summary_onevar``. If :math:`\mathrm{pn} = 0`, :math:`\mathrm{rcomm}` need not be initialized, otherwise it must be unchanged since the last call to this function. If :math:`\mathrm{rcomm}` is **None**, :math:`\mathrm{rcomm}` is not referenced and all the data must be supplied in one go. `On exit`, if not **None** on entry: the updated communication array. The first five elements of :math:`\mathrm{rcomm}` hold information that may be of interest with .. math:: \begin{array}{rcl}\mathrm{rcomm}[0]& = & \sum_{1}^{{k+b}}{w_i} \\\mathrm{rcomm}[1]& = & \left(\sum_{1}^{k+b}{w_i}\right)^2 - \sum_{1}^{k+b}{w_i^2} \\\mathrm{rcomm}[2]& = & \sum_{1}^{k+b}{w_i\left(x_i-\bar{x}\right)^2} \\\mathrm{rcomm}[3]& = & \sum_{1}^{k+b}{w_i\left(x_i-\bar{x}\right)^3} \\\mathrm{rcomm}[4]& = & \sum_{1}^{k+b}{w_i\left(x_i-\bar{x}\right)^4} \end{array} the remaining elements of :math:`\mathrm{rcomm}` are used for workspace and so are undefined. **Returns** **pn** : int The updated number of valid observations processed, that is the number of observations with :math:`w_i > 0`, for :math:`\textit{i} = 1,2,\ldots,k+b`. **xmean** : float :math:`\bar{x}`, the mean of the first :math:`k+b` observations. **xsd** : float :math:`s_2`, the standard deviation of the first :math:`k+b` observations. **xskew** : float :math:`s_3`, the coefficient of skewness for the first :math:`k+b` observations. **xkurt** : float :math:`s_4`, the coefficient of kurtosis for the first :math:`k+b` observations. **xmin** : float The smallest value in the first :math:`k+b` observations. **xmax** : float The largest value in the first :math:`k+b` observations. .. 
_g01at-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`11`) On entry, :math:`\textit{nb} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{nb}\geq 0`. (`errno` :math:`41`) On entry, :math:`\mathrm{wt}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{wt}\text{ is not }\mathbf{None}` then :math:`\mathrm{wt}[\textit{i}-1]\geq 0`, for :math:`\textit{i} = 1,2,\ldots,\textit{nb}`. (`errno` :math:`51`) On entry, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{pn}\geq 0`. (`errno` :math:`52`) On entry, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. On exit from previous call, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{pn} > 0`, :math:`\mathrm{pn}` must be unchanged since previous call. (`errno` :math:`54`) On entry, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{rcomm}\text{ is }\mathbf{None}` then :math:`\mathrm{pn} = 0`. (`errno` :math:`121`) :math:`\mathrm{rcomm}` has been corrupted between calls. **Warns** **NagAlgorithmicWarning** (`errno` :math:`53`) On entry, the number of valid observations is zero. (`errno` :math:`71`) On exit we were unable to calculate :math:`\mathrm{xskew}` or :math:`\mathrm{xkurt}`. A value of :math:`0` has been returned. (`errno` :math:`72`) On exit we were unable to calculate :math:`\mathrm{xsd}`, :math:`\mathrm{xskew}` or :math:`\mathrm{xkurt}`. A value of :math:`0` has been returned. .. _g01at-py2-py-notes: **Notes** Given a sample of :math:`n` observations, denoted by :math:`x = \left\{x_i:i = 1,2,\ldots,n\right\}` and a set of non-negative weights, :math:`w = \left\{w_i:i = 1,2,\ldots,n\right\}`, ``summary_onevar`` calculates a number of quantities: (a) Mean .. 
math:: \bar{x} = \frac{{\sum_{1}^{n}{w_ix_i}}}{W}\text{, where }\quad W = \sum_{1}^{n}{w_i}\text{.} (#) Standard deviation .. math:: s_2 = \sqrt{\frac{{\sum_{1}^{n}{w_i}\left(x_i-\bar{x}\right)^2}}{d}}\text{, where }\quad d = W-\frac{{\sum_{1}^{n}{w_i^2}}}{W}\text{.} (#) Coefficient of skewness .. math:: s_3 = \frac{{\sum_{1}^{n}{w_i}\left(x_i-\bar{x}\right)^3}}{{ds_2^3}}\text{.} (#) Coefficient of kurtosis .. math:: s_4 = \frac{{\sum_{1}^{n}{w_i\left(x_i-\bar{x}\right)^4}}}{{ds_2^4}}-3\text{.} (#) Maximum and minimum elements, with :math:`w_i\neq 0`. These quantities are calculated using the one-pass algorithm of West (1979). For large datasets, or where all the data is not available at the same time, :math:`x` and :math:`w` can be split into arbitrary sized blocks and ``summary_onevar`` called multiple times. .. _g01at-py2-py-references: **References** West, D H D, 1979, `Updating mean and variance estimates: An improved method`, Comm. ACM (22), 532--535 """ raise NotImplementedError
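The formulas in the Notes for ``summary_onevar`` can be evaluated directly, which is useful for checking small cases by hand. The sketch below is a plain two-pass evaluation of :math:`\bar{x}`, :math:`s_2`, :math:`s_3` and :math:`s_4`; it is not the one-pass algorithm of West (1979) and does not support block-wise updates. The function name is hypothetical, and where the NAG routine issues a warning and returns :math:`0`, this sketch raises ``ValueError`` instead.

```python
import math


def weighted_summary_sketch(x, wt=None):
    """Return (mean, sd, skewness, kurtosis, minimum, maximum).

    Two-pass evaluation of the documented formulas; illustrative only.
    """
    w = [1.0] * len(x) if wt is None else list(wt)
    if len(w) != len(x):
        raise ValueError("x and wt must have the same length")
    if any(wi < 0 for wi in w):
        raise ValueError("weights must be non-negative")

    # Observations with zero weight contribute nothing to any quantity.
    pairs = [(xi, wi) for xi, wi in zip(x, w) if wi > 0]
    if not pairs:
        raise ValueError("the number of valid observations is zero")

    W = sum(wi for _, wi in pairs)
    mean = sum(wi * xi for xi, wi in pairs) / W
    d = W - sum(wi * wi for _, wi in pairs) / W

    # Weighted sums of powers of deviations from the mean.
    m2 = sum(wi * (xi - mean) ** 2 for xi, wi in pairs)
    m3 = sum(wi * (xi - mean) ** 3 for xi, wi in pairs)
    m4 = sum(wi * (xi - mean) ** 4 for xi, wi in pairs)
    if d <= 0 or m2 <= 0:
        raise ValueError("standard deviation, skewness and kurtosis are undefined")

    sd = math.sqrt(m2 / d)
    skew = m3 / (d * sd ** 3)
    kurt = m4 / (d * sd ** 4) - 3
    xs = [xi for xi, _ in pairs]
    return mean, sd, skew, kurt, min(xs), max(xs)
```

For unit weights :math:`d = n - 1`, so :math:`s_2` reduces to the usual sample standard deviation.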
[docs]def summary_onevar_combine(mrcomm): r""" ``summary_onevar_combine`` combines sets of summaries produced by :meth:`summary_onevar`. .. _g01au-py2-py-doc: For full information please refer to the NAG Library document for g01au https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01auf.html .. _g01au-py2-py-parameters: **Parameters** **mrcomm** : float, array-like, shape :math:`\left(20, b\right)` The :math:`j`\ th column of :math:`\mathrm{mrcomm}` must contain the information returned in :math:`\textit{rcomm}` from one of the runs of :meth:`summary_onevar`. **Returns** **pn** : int The number of valid observations, that is the number of observations with :math:`w_i > 0`, for :math:`\textit{i} = 1,2,\ldots,n`. **xmean** : float :math:`\bar{x}`, the mean. **xsd** : float :math:`s_2`, the standard deviation. **xskew** : float :math:`s_3`, the coefficient of skewness. **xkurt** : float :math:`s_4`, the coefficient of kurtosis. **xmin** : float The smallest value. **xmax** : float The largest value. **rcomm** : float, ndarray, shape :math:`\left(20\right)` An amalgamation of the information held in :math:`\mathrm{mrcomm}`. This is in the same format as :math:`\textit{rcomm}` from :meth:`summary_onevar`. .. _g01au-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`11`) On entry, :math:`b = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`b\geq 1`. (`errno` :math:`21`) On entry, :math:`\mathrm{mrcomm}` is not in the expected format. **Warns** **NagAlgorithmicWarning** (`errno` :math:`31`) On entry, the number of valid observations is zero. (`errno` :math:`51`) On exit we were unable to calculate :math:`\mathrm{xskew}` or :math:`\mathrm{xkurt}`. A value of :math:`0` has been returned. (`errno` :math:`52`) On exit we were unable to calculate :math:`\mathrm{xsd}`, :math:`\mathrm{xskew}` or :math:`\mathrm{xkurt}`. A value of :math:`0` has been returned. .. 
_g01au-py2-py-notes: **Notes** Assume a dataset containing :math:`n` observations, denoted by :math:`x = \left\{x_i:i = 1,2,\ldots,n\right\}` and a set of weights, :math:`w = \left\{w_i:i = 1,2,\ldots,n\right\}`, has been split into :math:`b` blocks, and each block summarised via a call to :meth:`summary_onevar`. Then ``summary_onevar_combine`` takes the :math:`b` communication arrays returned by :meth:`summary_onevar` and returns the mean (:math:`\bar{x}`), standard deviation (:math:`s_2`), coefficients of skewness (:math:`s_3`) and kurtosis (:math:`s_4`), and the maximum and minimum values for the whole dataset. For a definition of :math:`\bar{x},s_2,s_3` and :math:`s_4` see :ref:`Notes for summary_onevar <g01at-py2-py-notes>`. .. _g01au-py2-py-references: **References** West, D H D, 1979, `Updating mean and variance estimates: An improved method`, Comm. ACM (22), 532--535 """ raise NotImplementedError
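To see why per-block summaries are enough to recover whole-dataset statistics, the sketch below combines block summaries of the form (sum of weights, mean, weighted sum of squared deviations) using the standard pairwise update for means and sums of squares. This is a swapped-in textbook technique covering the mean and the second moment only; it is not claimed to be what ``summary_onevar_combine`` does internally, and the tuple layout is hypothetical rather than the documented ``rcomm`` format.

```python
from functools import reduce


def block_summary(x, wt=None):
    """Summarise one block as (W, mean, M2), with M2 = sum w_i (x_i - mean)^2."""
    w = [1.0] * len(x) if wt is None else list(wt)
    W = float(sum(w))
    mean = sum(wi * xi for xi, wi in zip(x, w)) / W
    m2 = sum(wi * (xi - mean) ** 2 for xi, wi in zip(x, w))
    return W, mean, m2


def combine_two(a, b):
    """Merge two (W, mean, M2) summaries without revisiting the data."""
    wa, mean_a, m2a = a
    wb, mean_b, m2b = b
    W = wa + wb
    delta = mean_b - mean_a
    mean = mean_a + delta * wb / W
    # The cross term accounts for the shift between the two block means.
    m2 = m2a + m2b + delta * delta * wa * wb / W
    return W, mean, m2


def combine_blocks(summaries):
    """Fold any number of block summaries into one."""
    return reduce(combine_two, summaries)
```

Splitting the values 1 to 5 into the blocks ``[1, 2, 3]`` and ``[4, 5]`` and combining gives the same sum of weights, mean and sum of squared deviations as summarising all five values at once.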
[docs]def prob_binomial(n, p, k): r""" ``prob_binomial`` returns the lower tail, upper tail and point probabilities associated with a binomial distribution. .. _g01bj-py2-py-doc: For full information please refer to the NAG Library document for g01bj https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01bjf.html .. _g01bj-py2-py-parameters: **Parameters** **n** : int The parameter :math:`n` of the binomial distribution. **p** : float The parameter :math:`p` of the binomial distribution. **k** : int The integer :math:`k` which defines the required probabilities. **Returns** **plek** : float The lower tail probability, :math:`\mathrm{Prob}\left\{X\leq k\right\}`. **pgtk** : float The upper tail probability, :math:`\mathrm{Prob}\left\{X > k\right\}`. **peqk** : float The point probability, :math:`\mathrm{Prob}\left\{X = k\right\}`. .. _g01bj-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n}\geq 0`. (`errno` :math:`2`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} < 1.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} > 0.0`. (`errno` :math:`3`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k} \geq 0`. (`errno` :math:`3`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k} \leq \mathrm{n}`. (`errno` :math:`4`) On entry, :math:`\mathrm{n}` is too large to be represented exactly as a double precision number. (`errno` :math:`5`) On entry, the variance :math:`\left(= {np}\left(1-p\right)\right)` exceeds :math:`10^6`. .. 
_g01bj-py2-py-notes: **Notes** Let :math:`X` denote a random variable having a binomial distribution with parameters :math:`n` and :math:`p` (:math:`n\geq 0` and :math:`0 < p < 1`). Then .. math:: \mathrm{Prob}\left\{X = k\right\} = \begin{pmatrix}n\\k\end{pmatrix}p^k\left(1-p\right)^{{n-k}}\text{, }\quad k = 0,1,\ldots,n\text{.} The mean of the distribution is :math:`np` and the variance is :math:`np\left(1-p\right)`. ``prob_binomial`` computes for given :math:`n`, :math:`p` and :math:`k` the probabilities: .. math:: \begin{array}{l}\mathrm{plek} = \mathrm{Prob}\left\{X\leq k\right\}\\\mathrm{pgtk} = \mathrm{Prob}\left\{X > k\right\}\\ \mathrm{peqk} = \mathrm{Prob}\left\{X = k\right\} \text{.} \end{array} The method is similar to the method for the Poisson distribution described in Knüsel (1986). .. _g01bj-py2-py-references: **References** Knüsel, L, 1986, `Computation of the chi-square and Poisson distribution`, SIAM J. Sci. Statist. Comput. (7), 1022--1036 """ raise NotImplementedError
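For small :math:`n` the three binomial probabilities can be obtained by summing the point probabilities in the Notes directly. The sketch below does exactly that with ``math.comb`` (Python 3.8 or later); it is a naive reference calculation, not the method used by ``prob_binomial``, and it is not suitable for large :math:`n`, where the terms overflow or lose accuracy. The function name is hypothetical; the argument checks follow the documented constraints.

```python
import math


def binomial_probs_sketch(n, p, k):
    """Return (plek, pgtk, peqk) for a binomial(n, p) variable.

    Naive direct summation; intended only for small n.
    """
    if n < 0:
        raise ValueError("n >= 0 is required")
    if not 0.0 < p < 1.0:
        raise ValueError("0.0 < p < 1.0 is required")
    if not 0 <= k <= n:
        raise ValueError("0 <= k <= n is required")

    def pmf(j):
        # Prob{X = j} = C(n, j) p^j (1 - p)^(n - j)
        return math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)

    peqk = pmf(k)
    plek = sum(pmf(j) for j in range(k + 1))
    pgtk = sum(pmf(j) for j in range(k + 1, n + 1))
    return plek, pgtk, peqk
```

Summing the upper tail separately, rather than taking ``1 - plek``, avoids cancellation when the upper tail is very small.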
[docs]def prob_poisson(rlamda, k): r""" ``prob_poisson`` returns the lower tail, upper tail and point probabilities associated with a Poisson distribution. .. _g01bk-py2-py-doc: For full information please refer to the NAG Library document for g01bk https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01bkf.html .. _g01bk-py2-py-parameters: **Parameters** **rlamda** : float The parameter :math:`\lambda` of the Poisson distribution. **k** : int The integer :math:`k` which defines the required probabilities. **Returns** **plek** : float The lower tail probability, :math:`\mathrm{Prob}\left\{X\leq k\right\}`. **pgtk** : float The upper tail probability, :math:`\mathrm{Prob}\left\{X > k\right\}`. **peqk** : float The point probability, :math:`\mathrm{Prob}\left\{X = k\right\}`. .. _g01bk-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{rlamda} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{rlamda} > 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k}\geq 0`. (`errno` :math:`3`) On entry, :math:`\mathrm{rlamda} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{rlamda}\leq 10^6`. .. _g01bk-py2-py-notes: **Notes** Let :math:`X` denote a random variable having a Poisson distribution with parameter :math:`\lambda` :math:`\left(> 0\right)`. Then .. math:: \mathrm{Prob}\left\{X = k\right\} = e^{{-\lambda }}\frac{\lambda^k}{{k!}}\text{, }\quad k = 0,1,2,\ldots The mean and variance of the distribution are both equal to :math:`\lambda`. ``prob_poisson`` computes for given :math:`\lambda` and :math:`k` the probabilities: .. math:: \begin{array}{c}\mathrm{plek} = \mathrm{Prob}\left\{X\leq k\right\}\\\mathrm{pgtk} = \mathrm{Prob}\left\{X > k\right\}\\\mathrm{peqk} = \mathrm{Prob}\left\{X = k\right\} \text{.} \end{array} The method is described in Knüsel (1986). .. 
_g01bk-py2-py-references: **References** Knüsel, L, 1986, `Computation of the chi-square and Poisson distribution`, SIAM J. Sci. Statist. Comput. (7), 1022--1036 """ raise NotImplementedError
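The Poisson point probabilities satisfy the recurrence :math:`\mathrm{Prob}\{X = j\} = \mathrm{Prob}\{X = j-1\}\,\lambda/j`, which gives a simple way to accumulate the lower tail. The sketch below uses that recurrence as a naive reference calculation; it is not the method of Knüsel (1986). It loses accuracy in the upper tail, because that is formed as ``1 - plek``, and it underflows for large :math:`\lambda`. The function name is hypothetical.

```python
import math


def poisson_probs_sketch(rlamda, k):
    """Return (plek, pgtk, peqk) for a Poisson(rlamda) variable.

    Naive recurrence-based summation; intended only for moderate rlamda.
    """
    if rlamda <= 0.0:
        raise ValueError("rlamda > 0.0 is required")
    if k < 0:
        raise ValueError("k >= 0 is required")

    term = math.exp(-rlamda)      # Prob{X = 0}
    plek = term
    for j in range(1, k + 1):
        term *= rlamda / j        # Prob{X = j} from Prob{X = j - 1}
        plek += term
    return plek, 1.0 - plek, term
```

For :math:`\lambda = 2` and :math:`k = 2` the point probability is :math:`2e^{-2}` and the lower tail is :math:`5e^{-2}`.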
[docs]def prob_hypergeom(n, l, m, k): r""" ``prob_hypergeom`` returns the lower tail, upper tail and point probabilities associated with a hypergeometric distribution. .. _g01bl-py2-py-doc: For full information please refer to the NAG Library document for g01bl https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01blf.html .. _g01bl-py2-py-parameters: **Parameters** **n** : int The parameter :math:`n` of the hypergeometric distribution. **l** : int The parameter :math:`l` of the hypergeometric distribution. **m** : int The parameter :math:`m` of the hypergeometric distribution. **k** : int The integer :math:`k` which defines the required probabilities. **Returns** **plek** : float The lower tail probability, :math:`\mathrm{Prob}\left\{X\leq k\right\}`. **pgtk** : float The upper tail probability, :math:`\mathrm{Prob}\left\{X > k\right\}`. **peqk** : float The point probability, :math:`\mathrm{Prob}\left\{X = k\right\}`. .. _g01bl-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n}\geq 0`. (`errno` :math:`2`) On entry, :math:`\mathrm{l} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l}\leq \mathrm{n}`. (`errno` :math:`2`) On entry, :math:`\mathrm{l} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l}\geq 0`. (`errno` :math:`3`) On entry, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{m}\leq \mathrm{n}`. (`errno` :math:`3`) On entry, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{m}\geq 0`. 
(`errno` :math:`4`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle`, :math:`\mathrm{l} = \langle\mathit{\boldsymbol{value}}\rangle`, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{l}+\mathrm{m}-\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k}\geq \mathrm{l}+\mathrm{m}-\mathrm{n}`. (`errno` :math:`4`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k}\leq \mathrm{m}`. (`errno` :math:`4`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{l} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k}\leq \mathrm{l}`. (`errno` :math:`4`) On entry, :math:`\mathrm{k} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{k}\geq 0`. (`errno` :math:`5`) On entry, :math:`\mathrm{n}` is too large to be represented exactly as a double precision number. (`errno` :math:`6`) On entry, the variance :math:`\text{} = \frac{{lm\left(n-l\right)\left(n-m\right)}}{{n^2\left(n-1\right)}}` exceeds :math:`10^6`. .. _g01bl-py2-py-notes: **Notes** Let :math:`X` denote a random variable having a hypergeometric distribution with parameters :math:`n`, :math:`l` and :math:`m` (:math:`n\geq l\geq 0`, :math:`n\geq m\geq 0`). Then .. math:: \mathrm{Prob}\left\{X = k\right\} = \frac{{\begin{pmatrix}m\\k\end{pmatrix}\begin{pmatrix}n-m\\l-k\end{pmatrix}}}{\begin{pmatrix}n\\l\end{pmatrix}}\text{,} where :math:`\mathrm{max}\left(0, {l-\left(n-m\right)}\right)\leq k\leq \mathrm{min}\left(l, m\right)`, :math:`0\leq l\leq n` and :math:`0\leq m\leq n`. The hypergeometric distribution may arise if in a population of size :math:`n` a number :math:`m` are marked. From this population a sample of size :math:`l` is drawn and of these :math:`k` are observed to be marked. 
The mean of the distribution :math:`\text{} = \frac{{lm}}{n}`, and the variance :math:`\text{} = \frac{{lm\left(n-l\right)\left(n-m\right)}}{{n^2\left(n-1\right)}}`. ``prob_hypergeom`` computes for given :math:`n`, :math:`l`, :math:`m` and :math:`k` the probabilities: .. math:: \begin{array}{c}\mathrm{plek} = \mathrm{Prob}\left\{X\leq k\right\}\\\mathrm{pgtk} = \mathrm{Prob}\left\{X > k\right\}\\\mathrm{peqk} = \mathrm{Prob}\left\{X = k\right\} \text{.} \end{array} The method is similar to the method for the Poisson distribution described in Knüsel (1986). .. _g01bl-py2-py-references: **References** Knüsel, L, 1986, `Computation of the chi-square and Poisson distribution`, SIAM J. Sci. Statist. Comput. (7), 1022--1036 """ raise NotImplementedError
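# Illustrative sketch only (not part of naginterfaces): for small parameters the
# hypergeometric probabilities in the Notes above can be evaluated directly from
# binomial coefficients. The helper name ``hypergeom_probs`` is hypothetical, and
# the summation is an assumption made for illustration, not the library's method.

```python
from math import comb


def hypergeom_probs(n, l, m, k):
    """Return (plek, pgtk, peqk) using exact binomial coefficients."""
    lo = max(0, l - (n - m))  # smallest attainable value of X
    hi = min(l, m)            # largest attainable value of X
    if not (0 <= l <= n and 0 <= m <= n and lo <= k <= hi):
        raise ValueError("parameters outside the documented constraints")
    denom = comb(n, l)

    def pmf(j):
        return comb(m, j) * comb(n - m, l - j) / denom

    plek = sum(pmf(j) for j in range(lo, k + 1))
    return plek, 1.0 - plek, pmf(k)


# Population of 10 with 5 marked, sample of 2, at most 1 marked in the sample.
plek, pgtk, peqk = hypergeom_probs(10, 2, 5, 1)
```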
[docs]def normal_scores_exact(n, etol): r""" ``normal_scores_exact`` computes a set of Normal scores, i.e., the expected values of an ordered set of independent observations from a Normal distribution with mean :math:`0.0` and standard deviation :math:`1.0`. .. _g01da-py2-py-doc: For full information please refer to the NAG Library document for g01da https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01daf.html .. _g01da-py2-py-parameters: **Parameters** **n** : int :math:`n`, the size of the set. **etol** : float The maximum value for the estimated absolute error in the computed scores. **Returns** **pp** : float, ndarray, shape :math:`\left(\mathrm{n}\right)` The Normal scores. :math:`\mathrm{pp}[\textit{i}-1]` contains the value :math:`E\left(x_{\left(\textit{i}\right)}\right)`, for :math:`\textit{i} = 1,2,\ldots,n`. **errest** : float A computed estimate of the maximum error in the computed scores (see `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01daf.html#accuracy>`__). .. _g01da-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n} > 0`. (`errno` :math:`2`) On entry, :math:`\mathrm{etol} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{etol} > 0.0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`3`) The function was unable to estimate the scores with estimated error less than :math:`\mathrm{etol}`. The best result obtained is returned together with the associated value of :math:`\mathrm{errest}`. .. _g01da-py2-py-notes: **Notes** If a sample of :math:`n` observations from any distribution (which may be denoted by :math:`x_1,x_2,\ldots,x_n`), is sorted into ascending order, the :math:`r`\ th smallest value in the sample is often referred to as the :math:`r`\ th '**order statistic**', sometimes denoted by :math:`x_{\left(r\right)}` (see Kendall and Stuart (1969)). 
The order statistics, therefore, have the property .. math:: x_{\left(1\right)}\leq x_{\left(2\right)}\leq \cdots \leq x_{\left(n\right)}\text{.} (If :math:`n = 2r+1`, :math:`x_{{r+1}}` is the sample median.) For samples originating from a known distribution, the distribution of each order statistic in a sample of given size may be determined. In particular, the expected values of the order statistics may be found by integration. If the sample arises from a Normal distribution, the expected values of the order statistics are referred to as the '**Normal scores**'. The Normal scores provide a set of reference values against which the order statistics of an actual data sample of the same size may be compared, to provide an indication of Normality for the sample. Normal scores have other applications; for instance, they are sometimes used as alternatives to ranks in nonparametric testing procedures. ``normal_scores_exact`` computes the :math:`r`\ th Normal score for a given sample size :math:`n` as .. math:: E\left(x_{\left(r\right)}\right) = \int_{{-\infty }}^{\infty }x_rdG_r\text{,} where .. math:: dG_r = \frac{{A_r^{{r-1}}\left(1-A_r\right)^{{n-r}}dA_r}}{{\beta \left(r, {n-r+1}\right)}}\text{, }\quad A_r = \frac{1}{\sqrt{2\pi }}\int_{{-\infty }}^{x_r}e^{{-t^2/2}}{dt}\text{, }\quad r = 1,2,\ldots,n\text{,} and :math:`\beta` denotes the complete beta function. The function attempts to evaluate the scores so that the estimated error in each score is less than the value :math:`\mathrm{etol}` specified by you. All integrations are performed in parallel and arranged so as to give good speed and reasonable accuracy. .. _g01da-py2-py-references: **References** Kendall, M G and Stuart, A, 1969, `The Advanced Theory of Statistics (Volume 1)`, (3rd Edition), Griffin """ raise NotImplementedError
def normal_scores_approx(n):
    r"""
    ``normal_scores_approx`` calculates an approximation to the set of Normal Scores, i.e., the expected values of an ordered set of independent observations from a Normal distribution with mean :math:`0.0` and standard deviation :math:`1.0`.

    .. _g01db-py2-py-doc:

    For full information please refer to the NAG Library document for g01db

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01dbf.html

    .. _g01db-py2-py-parameters:

    **Parameters**
    **n** : int
        :math:`n`, the size of the sample.

    **Returns**
    **pp** : float, ndarray, shape :math:`\left(\mathrm{n}\right)`
        The Normal scores. :math:`\mathrm{pp}[\textit{i}-1]` contains the value :math:`E\left(x_{\left(\textit{i}\right)}\right)`, for :math:`\textit{i} = 1,2,\ldots,n`.

    .. _g01db-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{n} \geq 1`.

    .. _g01db-py2-py-notes:

    **Notes**
    `No equivalent traditional C interface for this routine exists in the NAG Library.`

    ``normal_scores_approx`` is an adaptation of the Applied Statistics Algorithm AS :math:`177.3`, see Royston (1982).
    If you are particularly concerned with the accuracy with which ``normal_scores_approx`` computes the expected values of the order statistics (see `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01dbf.html#accuracy>`__), then :meth:`normal_scores_exact`, which is more accurate, should be used instead, at a cost of increased storage and computing time.

    Let :math:`x_{\left(1\right)},x_{\left(2\right)},\ldots,x_{\left(n\right)}` be the order statistics from a random sample of size :math:`n` from the standard Normal distribution.
    Defining

    .. math::
        P_{{r,n}} = \Phi \left(-E\left(x_{\left(r\right)}\right)\right)

    and

    .. math::
        Q_{{r,n}} = \frac{{r-\epsilon }}{{n+\gamma }}\text{, }\quad r = 1,2,\ldots,n\text{,}

    where :math:`E\left(x_{\left(r\right)}\right)` is the expected value of :math:`x_{\left(r\right)}`, the current function approximates the Normal upper tail area corresponding to :math:`E\left(x_{\left(r\right)}\right)` as

    .. math::
        \tilde{P}_{{r,n}} = Q_{{r,n}}+\frac{\delta_1}{n}Q_{{r,n}}^{\lambda }+\frac{\delta_2}{n}Q_{{r,n}}^{{2\lambda }}-C_{{r,n}}\text{.}

    Separate estimates of :math:`\epsilon`, :math:`\gamma`, :math:`\delta_1`, :math:`\delta_2` and :math:`\lambda` are obtained for :math:`r = 1,2,3` and for :math:`r\geq 4`.
    A small correction :math:`C_{{r,n}}` to :math:`\tilde{P}_{{r,n}}` is necessary when :math:`r\leq 7` and :math:`n\leq 20`.

    The approximation to :math:`E\left(x_{\left(r\right)}\right)` is thus given by

    .. math::
        E\left(x_{\left(r\right)}\right) = {-\Phi^{-1}}\left(\tilde{ P }_{{r,n}}\right),\quad \text{ }\quad r = 1,2,\ldots,n\text{.}

    Values of the inverse Normal probability integral :math:`\Phi^{-1}` are obtained from :meth:`inv_cdf_normal`.

    .. _g01db-py2-py-references:

    **References**
    Royston, J P, 1982, `Algorithm AS 177: expected normal order statistics (exact and approximate)`, Appl. Statist. (31), 161--165
    """
    raise NotImplementedError
[docs]def normal_scores_var(n, exp1, exp2, sumssq): r""" ``normal_scores_var`` computes an approximation to the variance-covariance matrix of an ordered set of independent observations from a Normal distribution with mean :math:`0.0` and standard deviation :math:`1.0`. .. _g01dc-py2-py-doc: For full information please refer to the NAG Library document for g01dc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01dcf.html .. _g01dc-py2-py-parameters: **Parameters** **n** : int :math:`n`, the sample size. **exp1** : float The expected value of the largest Normal order statistic, :math:`m_n`, from a sample of size :math:`n`. **exp2** : float The expected value of the second largest Normal order statistic, :math:`m_{{n-1}}`, from a sample of size :math:`n`. **sumssq** : float The sum of squares of the expected values of the Normal order statistics from a sample of size :math:`n`. **Returns** **vec** : float, ndarray, shape :math:`\left(\mathrm{n}\times \left(\mathrm{n}+1\right)/2\right)` The upper triangle of the :math:`n\times n` variance-covariance matrix packed by column. Thus element :math:`V_{{ij}}` is stored in :math:`\mathrm{vec}[i+j\times \left(j-1\right)/2-1]`, for :math:`1\leq i\leq j\leq n`. .. _g01dc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n} > 0`. .. _g01dc-py2-py-notes: **Notes** ``normal_scores_var`` is an adaptation of the Applied Statistics Algorithm AS 128, see Davis and Stephens (1978). An approximation to the variance-covariance matrix, :math:`V`, using a Taylor series expansion of the Normal distribution function is discussed in David and Johnson (1954). However, convergence is slow for extreme variances and covariances. The present function uses the David--Johnson approximation to provide an initial approximation and improves upon it by use of the following identities for the matrix. 
For a sample of size :math:`n`, let :math:`m_i` be the expected value of the :math:`i`\ th largest order statistic, then: (a) for any :math:`i = 1,2,\ldots,n`, :math:`\sum_{{j = 1}}^nV_{{ij}} = 1` (#) :math:`V_{12} = V_{11}+m_n^2-m_nm_{{n-1}}-1` (#) the trace of :math:`V` is :math:`tr\left(V\right) = n-\sum_{{i = 1}}^nm_i^2` (#) :math:`V_{{ij}} = V_{{ji}} = V_{{rs}} = V_{{sr}}` where :math:`r = n+1-i`, :math:`s = n+1-j` and :math:`i,j = 1,2,\ldots,n`. Note that only the upper triangle of the matrix is calculated and returned column-wise in vector form. .. _g01dc-py2-py-references: **References** David, F N and Johnson, N L, 1954, `Statistical treatment of censored data, Part 1. Fundamental formulae`, Biometrika (41), 228--240 Davis, C S and Stephens, M A, 1978, `Algorithm AS 128: approximating the covariance matrix of Normal order statistics`, Appl. Statist. (27), 206--212 """ raise NotImplementedError
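# Illustrative sketch only (not part of naginterfaces): the packed storage rule
# documented for ``vec`` above, V_ij held at vec[i + j*(j-1)/2 - 1] for
# 1 <= i <= j <= n, can be unpacked into a full symmetric matrix as follows.
# The helper names ``packed_index`` and ``unpack_symmetric`` are hypothetical.

```python
def packed_index(i, j):
    """0-based position in vec of V_ij, for 1-based indices with 1 <= i <= j."""
    if not 1 <= i <= j:
        raise ValueError("require 1 <= i <= j")
    return i + j * (j - 1) // 2 - 1


def unpack_symmetric(vec, n):
    """Expand an upper triangle packed by column into a full n-by-n list of lists."""
    if len(vec) != n * (n + 1) // 2:
        raise ValueError("vec must have n*(n+1)/2 elements")
    full = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):
        for i in range(1, j + 1):
            value = vec[packed_index(i, j)]
            full[i - 1][j - 1] = value
            full[j - 1][i - 1] = value  # mirror into the lower triangle
    return full


# Entries are labelled by their (row, column) position to make the layout visible.
full = unpack_symmetric([11.0, 12.0, 22.0, 13.0, 23.0, 33.0], 3)
```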
[docs]def test_shapiro_wilk(x, a=None): r""" ``test_shapiro_wilk`` calculates Shapiro and Wilk's :math:`W` statistic and its significance level for testing Normality. .. _g01dd-py2-py-doc: For full information please refer to the NAG Library document for g01dd https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ddf.html .. _g01dd-py2-py-parameters: **Parameters** **x** : float, array-like, shape :math:`\left(n\right)` The ordered sample values, :math:`x_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`. **a** : None or float, array-like, shape :math:`\left(n\right)`, optional If :math:`\textit{calwts}` has been set to :math:`\mathbf{False}` then before entry :math:`\mathrm{a}` must contain the :math:`n` weights as calculated in a previous call to ``test_shapiro_wilk``, otherwise :math:`\mathrm{a}` need not be set. **Returns** **a** : float, ndarray, shape :math:`\left(n\right)` The :math:`n` weights required to calculate :math:`\mathrm{w}`. **w** : float The value of the statistic, :math:`\mathrm{w}`. **pw** : float The significance level of :math:`\mathrm{w}`. .. _g01dd-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n \geq 3`. (`errno` :math:`2`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n\leq 5000`. (`errno` :math:`3`) On entry, elements of :math:`\mathrm{x}` not in order. :math:`\mathrm{x}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`, :math:`\mathrm{x}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`, :math:`\mathrm{x}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. (`errno` :math:`3`) On entry, all elements of :math:`\mathrm{x}` are equal. .. 
_g01dd-py2-py-notes: **Notes** ``test_shapiro_wilk`` calculates Shapiro and Wilk's :math:`W` statistic and its significance level for any sample size between :math:`3` and :math:`5000`. It is an adaptation of the Applied Statistics Algorithm AS R94, see Royston (1995). The full description of the theory behind this algorithm is given in Royston (1992). Given a set of observations :math:`x_1,x_2,\ldots,x_n` sorted into either ascending or descending order (:meth:`sort.realvec_sort <naginterfaces.library.sort.realvec_sort>` may be used to sort the data) this function calculates the value of Shapiro and Wilk's :math:`W` statistic defined as: .. math:: W = \frac{{\left(\sum_{{i = 1}}^na_ix_i\right)^2}}{{\sum_{{i = 1}}^n\left(x_i-\bar{x}\right)^2}}\text{,} where :math:`\bar{x} = \frac{1}{n}\sum_1^nx_i` is the sample mean and :math:`a_i`, for :math:`i = 1,2,\ldots,n`, are a set of 'weights' whose values depend only on the sample size :math:`n`. On exit, the values of :math:`a_i`, for :math:`\textit{i} = 1,2,\ldots,n`, are only of interest should you wish to call the function again to calculate :math:`\mathrm{w}` and its significance level for a different sample of the same size. It is recommended that the function is used in conjunction with a Normal :math:`\left(Q-Q\right)` plot of the data. Functions :meth:`normal_scores_exact` and :meth:`normal_scores_approx` can be used to obtain the required Normal scores. .. _g01dd-py2-py-references: **References** Royston, J P, 1982, `Algorithm AS 181: the` :math:`W` `test for normality`, Appl. Statist. (31), 176--180 Royston, J P, 1986, `A remark on AS 181: the` :math:`W` `test for normality`, Appl. Statist. (35), 232--234 Royston, J P, 1992, `Approximating the Shapiro--Wilk's` :math:`W` `test for non-normality`, Statistics & Computing (2), 117--119 Royston, J P, 1995, `A remark on AS R94: A remark on Algorithm AS 181: the` :math:`W` `test for normality`, Appl. Statist. (44(4)), 547--551 """ raise NotImplementedError
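# Illustrative sketch only (not part of naginterfaces): the W statistic defined
# in the Notes above, evaluated for an ordered sample and a supplied set of
# weights. The helper name ``shapiro_wilk_w`` is hypothetical. The n = 3 weights
# used in the example are an assumption made for illustration; in practice the
# weights would come from the ``a`` array returned by ``test_shapiro_wilk``.
# The significance level is not computed here.

```python
import math


def shapiro_wilk_w(x, a):
    """Return W = (sum a_i x_i)^2 / sum (x_i - mean)^2 for ordered x and weights a."""
    n = len(x)
    if n != len(a) or n < 3:
        raise ValueError("x and a must have the same length, at least 3")
    mean = sum(x) / n
    sum_sq = sum((xi - mean) ** 2 for xi in x)
    if sum_sq == 0.0:
        raise ValueError("all elements of x are equal")
    numerator = sum(ai * xi for ai, xi in zip(a, x)) ** 2
    return numerator / sum_sq


# Assumed weights for n = 3: antisymmetric with unit sum of squares.
a3 = [-1.0 / math.sqrt(2.0), 0.0, 1.0 / math.sqrt(2.0)]
w = shapiro_wilk_w([1.0, 2.0, 3.0], a3)  # equally spaced sample
```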
def ranks_and_scores(scores, ties, x):
    r"""
    ``ranks_and_scores`` computes the ranks, Normal scores, an approximation to the Normal scores or the exponential scores as requested by you.

    .. _g01dh-py2-py-doc:

    For full information please refer to the NAG Library document for g01dh

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01dhf.html

    .. _g01dh-py2-py-parameters:

    **Parameters**
    **scores** : str, length 1
        Indicates which of the following scores are required.

        :math:`\mathrm{scores} = \texttt{'R'}`
            The ranks.

        :math:`\mathrm{scores} = \texttt{'N'}`
            The Normal scores, that is the expected value of the Normal order statistics.

        :math:`\mathrm{scores} = \texttt{'B'}`
            The Blom version of the Normal scores.

        :math:`\mathrm{scores} = \texttt{'T'}`
            The Tukey version of the Normal scores.

        :math:`\mathrm{scores} = \texttt{'V'}`
            The van der Waerden version of the Normal scores.

        :math:`\mathrm{scores} = \texttt{'S'}`
            The Savage scores, that is the expected value of the exponential order statistics.

    **ties** : str, length 1
        Indicates which of the following methods is to be used to assign scores to tied observations.

        :math:`\mathrm{ties} = \texttt{'A'}`
            The average of the scores for tied observations is used.

        :math:`\mathrm{ties} = \texttt{'L'}`
            The lowest score in the group of ties is used.

        :math:`\mathrm{ties} = \texttt{'H'}`
            The highest score in the group of ties is used.

        :math:`\mathrm{ties} = \texttt{'N'}`
            The nonrepeatable random number generator is used to randomly untie any group of tied observations.

        :math:`\mathrm{ties} = \texttt{'R'}`
            The repeatable random number generator is used to randomly untie any group of tied observations.

        :math:`\mathrm{ties} = \texttt{'I'}`
            Any ties are ignored, that is the scores are assigned to tied observations in the order that they appear in the data.

    **x** : float, array-like, shape :math:`\left(n\right)`
        The sample of observations, :math:`x_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`.

    **Returns**
    **r** : float, ndarray, shape :math:`\left(n\right)`
        Contains the scores, :math:`s_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`, as specified by :math:`\mathrm{scores}`.

    .. _g01dh-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{scores} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{scores} = \texttt{'R'}`, :math:`\texttt{'N'}`, :math:`\texttt{'B'}`, :math:`\texttt{'T'}`, :math:`\texttt{'V'}` or :math:`\texttt{'S'}`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{ties} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{ties} = \texttt{'A'}`, :math:`\texttt{'L'}`, :math:`\texttt{'H'}`, :math:`\texttt{'N'}`, :math:`\texttt{'R'}` or :math:`\texttt{'I'}`.

        (`errno` :math:`1`)
            On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`n \geq 1`.

    .. _g01dh-py2-py-notes:

    **Notes**
    ``ranks_and_scores`` computes one of the following scores for a sample of observations, :math:`x_1,x_2,\ldots,x_n`.

    (1) **Rank Scores**

        The ranks are assigned to the data in ascending order, that is the :math:`i`\ th observation has score :math:`s_i = k` if it is the :math:`k`\ th smallest observation in the sample.

    (#) **Normal Scores**

        The Normal scores are the expected values of the Normal order statistics from a sample of size :math:`n`. If :math:`x_i` is the :math:`k`\ th smallest observation in the sample, then the score for that observation, :math:`s_i`, is :math:`E\left(Z_k\right)` where :math:`Z_k` is the :math:`k`\ th order statistic in a sample of size :math:`n` from a standard Normal distribution and :math:`E` is the expectation operator.

    (#) **Blom, Tukey and van der Waerden Scores**

        These scores are approximations to the Normal scores. The scores are obtained by evaluating the inverse cumulative Normal distribution function, :math:`\Phi^{-1}\left(\cdot\right)`, at the values of the ranks scaled into the interval :math:`\left(0, 1\right)` using different scaling transformations.

        The Blom scores use the scaling transformation :math:`\frac{{r_i-\frac{3}{8}}}{{n+\frac{1}{4}}}` for the rank :math:`r_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`. Thus the Blom score corresponding to the observation :math:`x_i` is

        .. math::
            s_i = \Phi^{-1}\left(\frac{{r_i-\frac{3}{8}}}{{n+\frac{1}{4}}}\right)\text{.}

        The Tukey scores use the scaling transformation :math:`\frac{{r_i-\frac{1}{3}}}{{n+\frac{1}{3}}}`; the Tukey score corresponding to the observation :math:`x_i` is

        .. math::
            s_i = \Phi^{-1}\left(\frac{{r_i-\frac{1}{3}}}{{n+\frac{1}{3}}}\right)\text{.}

        The van der Waerden scores use the scaling transformation :math:`\frac{r_i}{{n+1}}`; the van der Waerden score corresponding to the observation :math:`x_i` is

        .. math::
            s_i = \Phi^{-1}\left(\frac{r_i}{{n+1}}\right)\text{.}

        The van der Waerden scores may be used to carry out the van der Waerden test for testing for differences between several population distributions, see Conover (1980).

    (#) **Savage Scores**

        The Savage scores are the expected values of the exponential order statistics from a sample of size :math:`n`. They may be used in a test discussed by Savage (1956) and Lehmann (1975). If :math:`x_i` is the :math:`k`\ th smallest observation in the sample, then the score for that observation is

        .. math::
            s_i = E\left(Y_k\right) = \frac{1}{n}+\frac{1}{{n-1}} + \cdots +\frac{1}{{n-k+1}}\text{,}

        where :math:`Y_k` is the :math:`k`\ th order statistic in a sample of size :math:`n` from a standard exponential distribution and :math:`E` is the expectation operator.

    Ties may be handled in one of five ways.
    Let :math:`x_{{t\left(\textit{i}\right)}}`, for :math:`\textit{i} = 1,2,\ldots,m`, denote :math:`m` tied observations, that is :math:`x_{{t\left(1\right)}} = x_{{t\left(2\right)}} = \cdots = x_{{t\left(m\right)}}` with :math:`t\left(1\right) < t\left(2\right) < \cdots < t\left(m\right)`.
    If the rank of :math:`x_{{t\left(1\right)}}` is :math:`k`, then if ties are ignored the rank of :math:`x_{{t\left(j\right)}}` will be :math:`k+j-1`.
    Let the scores ignoring ties be :math:`s_{{t\left(1\right)}}^*,s_{{t\left(2\right)}}^*,\ldots,s_{{t\left(m\right)}}^*`.
    Then the scores, :math:`s_{{t\left(\textit{i}\right)}}`, for :math:`\textit{i} = 1,2,\ldots,m`, may be calculated as follows:

    - if averages are used, then :math:`s_{{t\left(i\right)}} = \sum_{{j = 1}}^ms_{{t\left(j\right)}}^*/m`;

    - if the lowest score is used, then :math:`s_{{t\left(i\right)}} = s_{{t\left(1\right)}}^*`;

    - if the highest score is used, then :math:`s_{{t\left(i\right)}} = s_{{t\left(m\right)}}^*`;

    - if ties are to be broken randomly, then :math:`s_{{t\left(i\right)}} = s_{{t\left(I\right)}}^*` where :math:`I \in \left\{\text{random permutation of }1,2,\ldots,m\right\}`;

    - if ties are to be ignored, then :math:`s_{{t\left(i\right)}} = s_{{t\left(i\right)}}^*`.

    .. _g01dh-py2-py-references:

    **References**
    Blom, G, 1958, `Statistical Estimates and Transformed Beta-variables`, Wiley

    Conover, W J, 1980, `Practical Nonparametric Statistics`, Wiley

    Lehmann, E L, 1975, `Nonparametrics: Statistical Methods Based on Ranks`, Holden--Day

    Savage, I R, 1956, `Contributions to the theory of rank order statistics -- the two-sample case`, Ann. Math. Statist. (27), 590--615

    Tukey, J W, 1962, `The future of data analysis`, Ann. Math. Statist. (33), 1--67
    """
    raise NotImplementedError
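# Illustrative sketch only (not part of naginterfaces): the Blom, Tukey and
# van der Waerden scaling transformations and the Savage score sum from the
# Notes above, written with the Python standard library. The helper names
# ``approx_normal_scores`` and ``savage_score`` are hypothetical, the inverse
# Normal CDF comes from ``statistics.NormalDist`` rather than from the NAG
# Library, and the ranks are assumed to be untied.

```python
from statistics import NormalDist

# (shift, scale) pairs in the transformation (r - shift) / (n + scale).
_SCALINGS = {
    "B": (3.0 / 8.0, 1.0 / 4.0),  # Blom
    "T": (1.0 / 3.0, 1.0 / 3.0),  # Tukey
    "V": (0.0, 1.0),              # van der Waerden
}


def approx_normal_scores(ranks, kind="B"):
    """Approximate Normal scores for untied ranks 1..n."""
    n = len(ranks)
    shift, scale = _SCALINGS[kind]
    inv_cdf = NormalDist().inv_cdf
    return [inv_cdf((r - shift) / (n + scale)) for r in ranks]


def savage_score(k, n):
    """Expected k-th exponential order statistic: 1/n + 1/(n-1) + ... + 1/(n-k+1)."""
    if not 1 <= k <= n:
        raise ValueError("require 1 <= k <= n")
    return sum(1.0 / (n - j) for j in range(k))


blom = approx_normal_scores([1, 2, 3], "B")
vdw = approx_normal_scores([1, 2, 3], "V")
largest_savage = savage_score(5, 5)
```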
def prob_normal(x, tail='L'):
    r"""
    ``prob_normal`` returns a one or two tail probability for the standard Normal distribution.

    .. _g01ea-py2-py-doc:

    For full information please refer to the NAG Library document for g01ea

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01eaf.html

    .. _g01ea-py2-py-parameters:

    **Parameters**
    **x** : float
        :math:`x`, the value of the standard Normal variate.

    **tail** : str, length 1, optional
        Indicates which tail the returned probability should represent.

        :math:`\mathrm{tail} = \texttt{'L'}`
            The lower tail probability is returned, i.e., :math:`P\left(X\leq x\right)`.

        :math:`\mathrm{tail} = \texttt{'U'}`
            The upper tail probability is returned, i.e., :math:`P\left(X\geq x\right)`.

        :math:`\mathrm{tail} = \texttt{'S'}`
            The two tail (significance level) probability is returned, i.e., :math:`P\left(X\geq \left\lvert x\right\rvert \right)+P\left(X\leq -\left\lvert x\right\rvert \right)`.

        :math:`\mathrm{tail} = \texttt{'C'}`
            The two tail (confidence interval) probability is returned, i.e., :math:`P\left(X\leq \left\lvert x\right\rvert \right)-P\left(X\leq -\left\lvert x\right\rvert \right)`.

    **Returns**
    **p** : float
        A one or two tail probability for the standard Normal distribution.

    .. _g01ea-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{tail} = \texttt{'L'}`, :math:`\texttt{'U'}`, :math:`\texttt{'S'}` or :math:`\texttt{'C'}`.

    .. _g01ea-py2-py-notes:

    **Notes**
    The lower tail probability for the standard Normal distribution, :math:`P\left(X\leq x\right)` is defined by:

    .. math::
        P\left(X\leq x\right) = \int_{{-\infty }}^xZ\left(X\right){dX}\text{,}

    where

    .. math::
        Z\left(X\right) = \frac{1}{\sqrt{2\pi }}e^{{-X^2/2}},{-\infty } < X < \infty \text{.}

    The relationship

    .. math::
        P\left(X\leq x\right) = \frac{1}{2}\mathrm{erfc}\left(\frac{{-x}}{\sqrt{2}}\right)

    is used, where erfc is the complementary error function, and is computed using :meth:`specfun.erfc_real <naginterfaces.library.specfun.erfc_real>`.

    For the upper tail probability the relationship :math:`P\left(X\geq x\right) = P\left(X\leq -x\right)` is used and for the two tail significance level probability twice the probability obtained from the absolute value of :math:`x` is returned.

    When the two tail confidence probability is required the relationship

    .. math::
        P\left(X\leq \left\lvert x\right\rvert \right)-P\left(X\leq -\left\lvert x\right\rvert \right) = \mathrm{erf}\left(\frac{{\left\lvert x\right\rvert }}{\sqrt{2}}\right)\text{,}

    is used, where erf is the error function, and is computed using :meth:`specfun.erf_real <naginterfaces.library.specfun.erf_real>`.

    .. _g01ea-py2-py-references:

    **References**
    NIST Digital Library of Mathematical Functions

    Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth
    """
    raise NotImplementedError
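# Illustrative sketch only (not part of naginterfaces): the erfc and erf
# relationships quoted in the Notes above, written with ``math.erfc`` and
# ``math.erf`` from the Python standard library in place of the NAG special
# function routines. The helper name ``normal_tail`` is hypothetical.

```python
import math


def normal_tail(x, tail="L"):
    """One or two tail probability for the standard Normal distribution."""
    root2 = math.sqrt(2.0)
    if tail == "L":  # P(X <= x) = erfc(-x / sqrt(2)) / 2
        return 0.5 * math.erfc(-x / root2)
    if tail == "U":  # P(X >= x) = P(X <= -x)
        return 0.5 * math.erfc(x / root2)
    if tail == "S":  # twice the tail probability beyond |x|
        return math.erfc(abs(x) / root2)
    if tail == "C":  # P(X <= |x|) - P(X <= -|x|) = erf(|x| / sqrt(2))
        return math.erf(abs(x) / root2)
    raise ValueError("tail must be 'L', 'U', 'S' or 'C'")


p_sig = normal_tail(1.96, "S")
p_conf = normal_tail(1.96, "C")
```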
[docs]def prob_students_t(t, df, tail='L'): r""" ``prob_students_t`` returns the lower tail, upper tail or two tail probability for the Student's :math:`t`-distribution with real degrees of freedom. .. _g01eb-py2-py-doc: For full information please refer to the NAG Library document for g01eb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ebf.html .. _g01eb-py2-py-parameters: **Parameters** **t** : float :math:`t`, the value of the Student's :math:`t` variate. **df** : float :math:`\nu`, the degrees of freedom of the Student's :math:`t`-distribution. **tail** : str, length 1, optional Indicates which tail the returned probability should represent. :math:`\mathrm{tail} = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`P\left(T\geq t:\nu \right)`. :math:`\mathrm{tail} = \texttt{'S'}` The two tail (significance level) probability is returned, i.e., :math:`P\left(T\geq \left\lvert t\right\rvert :\nu \right)+P\left(T\leq -\left\lvert t\right\rvert :\nu \right)`. :math:`\mathrm{tail} = \texttt{'C'}` The two tail (confidence interval) probability is returned, i.e., :math:`P\left(T\leq \left\lvert t\right\rvert :\nu \right)-P\left(T\leq -\left\lvert t\right\rvert :\nu \right)`. :math:`\mathrm{tail} = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`P\left(T\leq t:\nu \right)`. **Returns** **p** : float Either the lower tail, upper tail or two tail probability for the Student's :math:`t`-distribution, depending on the value of :math:`\mathrm{tail}` supplied. .. _g01eb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{tail} = \texttt{'L'}`, :math:`\texttt{'U'}`, :math:`\texttt{'S'}` or :math:`\texttt{'C'}`. (`errno` :math:`2`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df}\geq 1.0`. .. 
_g01eb-py2-py-notes: **Notes** The lower tail probability for the Student's :math:`t`-distribution with :math:`\nu` degrees of freedom, :math:`P\left(T\leq t:\nu \right)` is defined by: .. math:: P\left(T\leq t:\nu \right) = \frac{{\Gamma \left(\left(\nu +1\right)/2\right)}}{{\sqrt{\pi \nu }\Gamma \left(\nu /2\right)}}\int_{{-\infty }}^t\left[1+\frac{T^2}{\nu }\right]^{{-\left(\nu +1\right)/2}}dT\text{, }\quad \nu \geq 1\text{.} Computationally, there are two situations: (i) when :math:`\nu < 20`, a transformation of the beta distribution, :math:`P_{\beta }\left({B\leq \beta :a}, b\right)` is used .. math:: P\left(T\leq t:\nu \right) = \frac{1}{2}P_{\beta }\left(B\leq \frac{\nu }{{\nu +t^2}}:\nu /2,\frac{1}{2}\right)\quad \text{ when }t < 0.0 or .. math:: P\left(T\leq t:\nu \right) = \frac{1}{2}+\frac{1}{2}P_{{\beta }}\left(B\geq \frac{{\nu }}{{\nu +t^2}}:\nu /2,\frac{1}{2}\right)\quad \text{ when }t > 0.0\text{;} (#) when :math:`\nu \geq 20`, an asymptotic normalizing expansion of the Cornish--Fisher type is used to evaluate the probability, see Hill (1970). .. _g01eb-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Hill, G W, 1970, `Student's` :math:`t` `-distribution`, Comm. ACM (13(10)), 617--619 """ raise NotImplementedError
[docs]def prob_chisq(x, df, tail='L'): r""" ``prob_chisq`` returns the lower or upper tail probability for the :math:`\chi^2`-distribution with real degrees of freedom. .. _g01ec-py2-py-doc: For full information please refer to the NAG Library document for g01ec https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ecf.html .. _g01ec-py2-py-parameters: **Parameters** **x** : float :math:`x`, the value of the :math:`\chi^2` variate with :math:`\nu` degrees of freedom. **df** : float :math:`\nu`, the degrees of freedom of the :math:`\chi^2`-distribution. **tail** : str, length 1, optional Indicates whether the upper or lower tail probability is required. :math:`\mathrm{tail} = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`P\left(X\leq x:\nu \right)`. :math:`\mathrm{tail} = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`P\left(X\geq x:\nu \right)`. **Returns** **p** : float The lower or upper tail probability for the :math:`\chi^2`-distribution, depending on the value of :math:`\mathrm{tail}` supplied. .. _g01ec-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{tail} = \texttt{'L'}` or :math:`\texttt{'U'}`. (`errno` :math:`2`) On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{x}\geq 0.0`. (`errno` :math:`3`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df} > 0.0`. (`errno` :math:`4`) The series used to calculate the gamma probabilities has failed to converge. The result returned should represent an approximation to the solution. .. _g01ec-py2-py-notes: **Notes** The lower tail probability for the :math:`\chi^2`-distribution with :math:`\nu` degrees of freedom, :math:`P\left(X\leq x:\nu \right)` is defined by: .. 
math:: P\left(X\leq x:\nu \right) = \frac{1}{{2^{{\nu /2}}\Gamma \left(\nu /2\right)}}\int_{0.0}^xX^{{\nu /2-1}}e^{{-X/2}}{dX}\text{, }\quad x\geq 0,\nu > 0\text{.} To calculate :math:`P\left(X\leq x:\nu \right)` a transformation of a gamma distribution is employed, i.e., a :math:`\chi^2`-distribution with :math:`\nu` degrees of freedom is equal to a gamma distribution with scale parameter :math:`2` and shape parameter :math:`\nu /2`. .. _g01ec-py2-py-references: **References** NIST Digital Library of Mathematical Functions Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
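# Illustrative sketch only (not part of naginterfaces and not necessarily the
# series the NAG routine uses): the gamma transformation described in the Notes
# above, with shape nu/2 and scale 2, evaluated through the standard power
# series for the regularized lower incomplete gamma function. The helper name
# ``chisq_lower_tail`` and the tolerance defaults are assumptions. The series
# converges slowly when x is much larger than nu.

```python
import math


def chisq_lower_tail(x, df, tol=1e-15, max_terms=10000):
    """P(X <= x : df) via P(a, y) = y^a e^-y * sum_n y^n / Gamma(a + n + 1)."""
    if x < 0.0 or df <= 0.0:
        raise ValueError("require x >= 0 and df > 0")
    if x == 0.0:
        return 0.0
    a, y = df / 2.0, x / 2.0  # gamma shape, and the variate divided by the scale 2
    term = math.exp(a * math.log(y) - y - math.lgamma(a + 1.0))  # n = 0 term
    total = term
    for n in range(1, max_terms):
        term *= y / (a + n)  # ratio of consecutive series terms
        total += term
        if term < tol * total:
            return total
    raise ArithmeticError("series failed to converge")


p = chisq_lower_tail(2.0, 2.0)  # closed form for df = 2 is 1 - exp(-x / 2)
```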
def prob_f(f, df1, df2, tail='L'):
    r"""
    ``prob_f`` returns the probability for the lower or upper tail of the :math:`F` or variance-ratio distribution with real degrees of freedom.

    .. _g01ed-py2-py-doc:

    For full information please refer to the NAG Library document for g01ed

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01edf.html

    .. _g01ed-py2-py-parameters:

    **Parameters**
    **f** : float
        :math:`f`, the value of the :math:`F` variate.

    **df1** : float
        The degrees of freedom of the numerator variance, :math:`\nu_1`.

    **df2** : float
        The degrees of freedom of the denominator variance, :math:`\nu_2`.

    **tail** : str, length 1, optional
        Indicates whether an upper or lower tail probability is required.

        :math:`\mathrm{tail} = \texttt{'L'}`
            The lower tail probability is returned, i.e., :math:`P\left({F\leq f:\nu_1}, \nu_2\right)`.

        :math:`\mathrm{tail} = \texttt{'U'}`
            The upper tail probability is returned, i.e., :math:`P\left({F\geq f:\nu_1}, \nu_2\right)`.

    **Returns**
    **p** : float
        The probability for the lower or upper tail of the :math:`F` or variance-ratio distribution with real degrees of freedom.

    .. _g01ed-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{tail} = \texttt{'L'}` or :math:`\texttt{'U'}`.

        (`errno` :math:`2`)
            On entry, :math:`\mathrm{f} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{f}\geq 0.0`.

        (`errno` :math:`3`)
            On entry, :math:`\mathrm{df1} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{df2} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{df1} > 0.0` and :math:`\mathrm{df2} > 0.0`.

        (`errno` :math:`4`)
            The probability is too close to :math:`0.0` or :math:`1.0`. :math:`\mathrm{f}` is too far out into the tails for the probability to be evaluated exactly. The result tends to approach :math:`1.0` if :math:`f` is large, or :math:`0.0` if :math:`f` is small. The result returned is a good approximation to the required solution.

    .. _g01ed-py2-py-notes:

    **Notes**
    The lower tail probability for the :math:`F`, or variance-ratio distribution, with :math:`\nu_1` and :math:`\nu_2` degrees of freedom, :math:`P\left({F\leq f:\nu_1}, \nu_2\right)`, is defined by:

    .. math::
        P\left({F\leq f:\nu_1}, \nu_2\right) = \frac{{\nu_1^{{\nu_1/2}}\nu_2^{{\nu_2/2}}\Gamma \left(\left(\nu_1+\nu_2\right)/2\right)}}{{\Gamma \left(\nu_1/2\right)\Gamma \left(\nu_2/2\right)}}\int_0^fF^{{\left(\nu_1-2\right)/2}}\left(\nu_1F+\nu_2\right)^{{-\left(\nu_1+\nu_2\right)/2}}dF\text{,}

    for :math:`\nu_1`, :math:`\nu_2 > 0`, :math:`f\geq 0`.

    The probability is computed by means of a transformation to a beta distribution, :math:`P_{\beta }\left({B\leq \beta :a}, b\right)`:

    .. math::
        P\left({F\leq f:\nu_1}, \nu_2\right) = P_{\beta }\left(B\leq \frac{{\nu_1f}}{{\nu_1f+\nu_2}}:\nu_1/2,\nu_2/2\right)

    and using a call to :meth:`prob_beta`.

    For very large values of both :math:`\nu_1` and :math:`\nu_2`, greater than :math:`10^5`, a normal approximation is used.
    If only one of :math:`\nu_1` or :math:`\nu_2` is greater than :math:`10^5` then a :math:`\chi^2` approximation is used, see Abramowitz and Stegun (1972).

    .. _g01ed-py2-py-references:

    **References**
    Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications

    Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth
    """
    raise NotImplementedError

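# Illustrative sketch, not part of naginterfaces: the Notes for ``prob_f`` map the F lower tail
# to a beta lower tail with beta = nu_1 f / (nu_1 f + nu_2), a = nu_1/2, b = nu_2/2. The helper
# names are hypothetical. To stay self-contained the beta tail is only evaluated for a = 1
# (nu_1 = 2), where I_x(1, b) = 1 - (1 - x)^b, and is compared with Simpson quadrature of the
# defining F integral; the real routine calls ``prob_beta`` instead.

```python
import math


def f_to_beta(f, df1, df2):
    """Return (beta, a, b) for the beta-distribution form of the F lower tail."""
    return df1 * f / (df1 * f + df2), df1 / 2.0, df2 / 2.0


def beta_lower_tail_a1(x, b):
    """I_x(1, b) = 1 - (1 - x)^b; valid only when the first beta parameter is 1."""
    return 1.0 - (1.0 - x) ** b


def f_integrand(t, df1, df2):
    """Integrand of the F lower-tail integral, including the leading constant."""
    log_c = (0.5 * df1 * math.log(df1) + 0.5 * df2 * math.log(df2)
             + math.lgamma(0.5 * (df1 + df2))
             - math.lgamma(0.5 * df1) - math.lgamma(0.5 * df2))
    return (math.exp(log_c) * t ** (0.5 * (df1 - 2.0))
            * (df1 * t + df2) ** (-0.5 * (df1 + df2)))


def simpson(func, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * func(lo + i * h)
    return total * h / 3.0


def f_lower_tail_df1_is_2(f, df2):
    """P(F <= f : 2, df2) through the beta transformation."""
    x, a, b = f_to_beta(f, 2.0, df2)
    assert a == 1.0
    return beta_lower_tail_a1(x, b)
```

# For f = 1.5 and nu_2 = 5 the transformed variate is 0.375, and the quadrature of the defining
# integral agrees with the beta form to within the quadrature error.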
def prob_beta(x, a, b):
    r"""
    ``prob_beta`` computes the upper and lower tail probabilities and the probability density function of the beta distribution with parameters :math:`a` and :math:`b`.

    .. _g01ee-py2-py-doc:

    For full information please refer to the NAG Library document for g01ee

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01eef.html

    .. _g01ee-py2-py-parameters:

    **Parameters**
    **x** : float
        :math:`\beta`, the value of the beta variate.

    **a** : float
        :math:`a`, the first parameter of the required beta distribution.

    **b** : float
        :math:`b`, the second parameter of the required beta distribution.

    **Returns**
    **p** : float
        The lower tail probability, :math:`P\left({B\leq \beta :a}, {b}\right)`.

    **q** : float
        The upper tail probability, :math:`P\left({B\geq \beta :a}, {b}\right)`.

    **pdf** : float
        The probability density function, :math:`f\left({B:a}, b\right)`.

    .. _g01ee-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{x}\geq 0.0`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{x}\leq 1.0`.

        (`errno` :math:`2`)
            On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{a} > 0.0`.

        (`errno` :math:`2`)
            On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{b} > 0.0`.

    .. _g01ee-py2-py-notes:

    **Notes**
    The probability density function of the beta distribution with parameters :math:`a` and :math:`b` is:

    .. math::
        f\left({B:a}, b\right) = \frac{{\Gamma \left(a+b\right)}}{{\Gamma \left(a\right)\Gamma \left(b\right)}}B^{{a-1}}\left(1-B\right)^{{b-1}}\text{, }\quad 0\leq B\leq 1\text{;}a,b > 0\text{.}

    The lower tail probability, :math:`P\left({B\leq \beta :a}, {b}\right)` is defined by

    .. math::
        P\left({B\leq \beta :a}, b\right) = \frac{{\Gamma \left(a+b\right)}}{{\Gamma \left(a\right)\Gamma \left(b\right)}}\int_0^{\beta }B^{{a-1}}\left(1-B\right)^{{b-1}}{dB} = I_{\beta }\left(a, b\right)\text{, }\quad 0\leq \beta \leq 1\text{;}a,b > 0\text{.}

    The function :math:`I_x\left(a, b\right)`, also known as the incomplete beta function is calculated using :meth:`specfun.beta_incomplete <naginterfaces.library.specfun.beta_incomplete>`.

    .. _g01ee-py2-py-references:

    **References**
    Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth
    """
    raise NotImplementedError

def prob_gamma(g, a, b, tail='L'):
    r"""
    ``prob_gamma`` returns the lower or upper tail probability of the gamma distribution, with parameters :math:`\alpha` and :math:`\beta`.

    .. _g01ef-py2-py-doc:

    For full information please refer to the NAG Library document for g01ef

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01eff.html

    .. _g01ef-py2-py-parameters:

    **Parameters**
    **g** : float
        :math:`g`, the value of the gamma variate.

    **a** : float
        The parameter :math:`\alpha` of the gamma distribution.

    **b** : float
        The parameter :math:`\beta` of the gamma distribution.

    **tail** : str, length 1, optional
        Indicates whether an upper or lower tail probability is required.

        :math:`\mathrm{tail} = \texttt{'L'}`
            The lower tail probability is returned, that is :math:`P\left({G\leq g:\alpha }, \beta \right)`.

        :math:`\mathrm{tail} = \texttt{'U'}`
            The upper tail probability is returned, that is :math:`P\left({G\geq g:\alpha }, \beta \right)`.

    **Returns**
    **p** : float
        The lower or upper tail probability of the gamma distribution, with parameters :math:`\alpha` and :math:`\beta`.

    .. _g01ef-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{tail} = \texttt{'L'}` or :math:`\texttt{'U'}`.

        (`errno` :math:`2`)
            On entry, :math:`\mathrm{g} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{g}\geq 0.0`.

        (`errno` :math:`3`)
            On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{a} > 0.0` and :math:`\mathrm{b} > 0.0`.

    **Warns**
    **NagAlgorithmicWarning**
        (`errno` :math:`4`)
            The algorithm has failed to converge in :math:`\langle\mathit{\boldsymbol{value}}\rangle` iterations. The probability returned should be a reasonable approximation to the solution.

    .. _g01ef-py2-py-notes:

    **Notes**
    The lower tail probability for the gamma distribution with parameters :math:`\alpha` and :math:`\beta`, :math:`P\left(G\leq g\right)`, is defined by:

    .. math::
        P\left({G\leq g\text{;}\alpha }, \beta \right) = \frac{1}{{\beta^{\alpha }\Gamma \left(\alpha \right)}}\int_0^gG^{{\alpha -1}}e^{{-G/\beta }}{dG}\text{, }\quad \alpha > 0.0\text{, }\beta > 0.0\text{.}

    The mean of the distribution is :math:`\alpha \beta` and its variance is :math:`\alpha \beta^2`.
    The transformation :math:`Z = \frac{G}{\beta }` is applied to yield the following incomplete gamma function in normalized form,

    .. math::
        P\left({G\leq g\text{;}\alpha }, \beta \right) = P\left({Z\leq g/\beta :\alpha }, 1.0\right) = \frac{1}{{\Gamma \left(\alpha \right)}}\int_0^{{g/\beta }}Z^{{\alpha -1}}e^{{-Z}}{dZ}\text{.}

    This is then evaluated using :meth:`specfun.gamma_incomplete <naginterfaces.library.specfun.gamma_incomplete>`.

    .. _g01ef-py2-py-references:

    **References**
    Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth
    """
    raise NotImplementedError

def prob_studentized_range(q, v, ir):
    r"""
    ``prob_studentized_range`` returns the probability associated with the lower tail of the distribution of the Studentized range statistic.

    .. _g01em-py2-py-doc:

    For full information please refer to the NAG Library document for g01em

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01emf.html

    .. _g01em-py2-py-parameters:

    **Parameters**
    **q** : float
        :math:`q`, the Studentized range statistic.

    **v** : float
        :math:`v`, the number of degrees of freedom for the experimental error.

    **ir** : int
        :math:`r`, the number of groups.

    **Returns**
    **p** : float
        The probability associated with the lower tail of the distribution of the Studentized range statistic.

    .. _g01em-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{q} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{q} > 0.0`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{ir} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{ir} \geq 2`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{v} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{v}\geq 1.0`.

    **Warns**
    **NagAlgorithmicWarning**
        (`errno` :math:`2`)
            There is some doubt as to whether full accuracy has been achieved. The returned value should be a reasonable estimate of the true value.

    .. _g01em-py2-py-notes:

    **Notes**
    The externally Studentized range, :math:`q`, for a sample, :math:`x_1,x_2,\ldots,x_r`, is defined as:

    .. math::
        q = \frac{{\mathrm{max}\left(x_i\right)-\mathrm{min}\left(x_i\right)}}{\hat{\sigma }_e}\text{,}

    where :math:`\hat{\sigma }_e` is an independent estimate of the standard error of the :math:`x_i`'s.
    The most common use of this statistic is in the testing of means from a balanced design.
    In this case for a set of group means, :math:`\bar{T}_1,\bar{T}_2,\ldots,\bar{T}_r`, the Studentized range statistic is defined to be the difference between the largest and smallest means, :math:`\bar{T}_{\mathrm{largest}}` and :math:`\bar{T}_{\mathrm{smallest}}`, divided by the square root of the mean-square experimental error, :math:`MS_{\mathrm{error}}`, over the number of observations in each group, :math:`n`, i.e.,

    .. math::
        q = \frac{{\bar{T}_{\mathrm{largest}}-\bar{T}_{\mathrm{smallest}}}}{{\sqrt{MS_{\mathrm{error}}/n}}}\text{.}

    The Studentized range statistic can be used as part of a multiple comparisons procedure such as the Newman--Keuls procedure or Duncan's multiple range test (see Montgomery (1984) and Winer (1970)).

    For a Studentized range statistic the probability integral, :math:`P\left({q;v}, r\right)`, for :math:`v` degrees of freedom and :math:`r` groups can be written as:

    .. math::
        P\left({q;v}, r\right) = C\int_0^{\infty }x^{{v-1}}e^{{-vx^2/2}}\left\{r\int_{{-\infty }}^{\infty }\phi \left(y\right){\left[\Phi \left(y\right)-\Phi \left(y-qx\right)\right]}^{{r-1}}{dy}\right\}{dx}\text{,}

    where

    .. math::
        C = \frac{v^{{v/2}}}{{\Gamma \left(v/2\right)2^{{v/2-1}}}}\text{, }\quad \phi \left(y\right) = \frac{1}{\sqrt{2\pi }}e^{{-y^2/2}}\quad \text{ and }\quad \Phi \left(y\right) = \int_{{-\infty }}^y\phi \left(t\right){dt}\text{.}

    The above two-dimensional integral is evaluated using :meth:`quad.dim2_fin <naginterfaces.library.quad.dim2_fin>` with the upper and lower limits computed to give stated accuracy (see `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01emf.html#accuracy>`__).

    If the degrees of freedom :math:`v` are greater than :math:`2000` the probability integral can be approximated by its asymptotic form:

    .. math::
        P\left(q;r\right) = r\int_{{-\infty }}^{\infty }\phi \left(y\right){\left[\Phi \left(y\right)-\Phi \left(y-q\right)\right]}^{{r-1}}{dy}\text{.}

    This integral is evaluated using :meth:`quad.dim1_inf <naginterfaces.library.quad.dim1_inf>`.

    .. _g01em-py2-py-references:

    **References**
    NIST Digital Library of Mathematical Functions

    Lund, R E and Lund, J R, 1983, `Algorithm AS 190: probabilities and upper quartiles for the studentized range`, Appl. Statist. (32(2)), 204--210

    Montgomery, D C, 1984, `Design and Analysis of Experiments`, Wiley

    Winer, B J, 1970, `Statistical Principles in Experimental Design`, McGraw--Hill
    """
    raise NotImplementedError

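# Illustrative sketch, not part of naginterfaces: the asymptotic (large v) form
# P(q; r) = r * integral of phi(y) [Phi(y) - Phi(y - q)]^(r-1) dy from the Notes for
# ``prob_studentized_range`` can be approximated with a simple quadrature. The function names,
# the truncation of the infinite range to [-10, 10 + q], and the use of Simpson's rule are
# assumptions made for illustration; the library itself uses ``quad.dim1_inf``.

```python
import math


def std_normal_pdf(y):
    """phi(y), the standard Normal density."""
    return math.exp(-0.5 * y * y) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(y):
    """Phi(y), the standard Normal distribution function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))


def range_prob_asymptotic(q, r, half_width=10.0, n=4000):
    """Asymptotic Studentized range probability P(q; r) by composite Simpson's rule."""
    lo, hi = -half_width, half_width + q
    h = (hi - lo) / n  # n must be even

    def integrand(y):
        return std_normal_pdf(y) * (std_normal_cdf(y) - std_normal_cdf(y - q)) ** (r - 1)

    total = integrand(lo) + integrand(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * integrand(lo + i * h)
    return r * total * h / 3.0
```

# For r = 2 the range of two independent standard Normal values is |X1 - X2|, so the integral
# reduces to erf(q / 2), which provides an independent check of the quadrature.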
def prob_durbin_watson(n, ip, d):
    r"""
    ``prob_durbin_watson`` calculates upper and lower bounds for the significance of a Durbin--Watson statistic.

    .. _g01ep-py2-py-doc:

    For full information please refer to the NAG Library document for g01ep

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01epf.html

    .. _g01ep-py2-py-parameters:

    **Parameters**
    **n** : int
        :math:`n`, the number of observations used in calculating the Durbin--Watson statistic.

    **ip** : int
        :math:`p`, the number of independent variables in the regression model, including the mean.

    **d** : float
        :math:`d`, the Durbin--Watson statistic.

    **Returns**
    **pdl** : float
        Lower bound for the significance of the Durbin--Watson statistic, :math:`p_l`.

    **pdu** : float
        Upper bound for the significance of the Durbin--Watson statistic, :math:`p_u`.

    .. _g01ep-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{ip} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{ip} \geq 1`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{ip} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{n} > \mathrm{ip}`.

        (`errno` :math:`2`)
            On entry, :math:`\mathrm{d} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{d}\geq 0.0`.

    .. _g01ep-py2-py-notes:

    **Notes**
    Let :math:`r = \left(r_1, r_2, \ldots, r_n\right)^\mathrm{T}` be the residuals from a linear regression of :math:`y` on :math:`p` independent variables, including the mean, where the :math:`y` values :math:`y_1,y_2,\ldots,y_n` can be considered as a time series.
    The Durbin--Watson test (see Durbin and Watson (1950), Durbin and Watson (1951) and Durbin and Watson (1971)) can be used to test for serial correlation in the error term in the regression.

    The Durbin--Watson test statistic is:

    .. math::
        d = \frac{{\sum_{{i = 1}}^{{n-1}}\left(r_{{i+1}}-r_i\right)^2}}{{\sum_{{i = 1}}^nr_i^2}}\text{,}

    which can be written as

    .. math::
        d = \frac{{r^\mathrm{T}Ar}}{{r^\mathrm{T}r}}\text{,}

    where the :math:`n\times n` matrix :math:`A` is given by

    .. math::
        A = \left[\begin{array}{rrrrr}1&-1&0&\ldots &:\\-1&2&-1&\ldots &:\\0&-1&2&\ldots &:\\:&0&-1&\ldots &:\\:&:&:&\ldots &:\\:&:&:&\ldots &-1\\0&0&0&\ldots &1\end{array}\right]

    with the nonzero eigenvalues of the matrix :math:`A` being :math:`\lambda_j = \left(1-\cos\left(\pi j/n\right)\right)`, for :math:`\textit{j} = 1,2,\ldots,n-1`.

    Durbin and Watson show that the exact distribution of :math:`d` depends on the eigenvalues of a matrix :math:`HA`, where :math:`H` is the hat matrix of independent variables, i.e., the matrix such that the vector of fitted values, :math:`\hat{y}`, can be written as :math:`\hat{y} = Hy`.
    However, bounds on the distribution can be obtained, the lower bound being

    .. math::
        d_l = \frac{{\sum_{{i = 1}}^{{n-p}}\lambda_iu_i^2}}{{\sum_{{i = 1}}^{{n-p}}u_i^2}}

    and the upper bound being

    .. math::
        d_u = \frac{{\sum_{{i = 1}}^{{n-p}}\lambda_{{i-1+p}}u_i^2}}{{\sum_{{i = 1}}^{{n-p}}u_i^2}}\text{,}

    where :math:`u_i` are independent standard Normal variables.

    Two algorithms are used to compute the lower tail (significance level) probabilities, :math:`p_l` and :math:`p_u`, associated with :math:`d_l` and :math:`d_u`.
    If :math:`n\leq 60` the procedure due to Pan (1964) is used, see Farebrother (1980), otherwise Imhof's method (see Imhof (1961)) is used.

    The bounds are for the usual test of positive correlation; if a test of negative correlation is required the value of :math:`d` should be replaced by :math:`4-d`.

    .. _g01ep-py2-py-references:

    **References**
    Durbin, J and Watson, G S, 1950, `Testing for serial correlation in least squares regression. I`, Biometrika (37), 409--428

    Durbin, J and Watson, G S, 1951, `Testing for serial correlation in least squares regression. II`, Biometrika (38), 159--178

    Durbin, J and Watson, G S, 1971, `Testing for serial correlation in least squares regression. III`, Biometrika (58), 1--19

    Farebrother, R W, 1980, `Algorithm AS 153. Pan's procedure for the tail probabilities of the Durbin--Watson statistic`, Appl. Statist. (29), 224--227

    Imhof, J P, 1961, `Computing the distribution of quadratic forms in Normal variables`, Biometrika (48), 419--426

    Newbold, P, 1988, `Statistics for Business and Economics`, Prentice--Hall

    Pan, Jie--Jian, 1964, `Distributions of the noncircular serial correlation coefficients`, Shuxue Jinzhan (7), 328--337
    """
    raise NotImplementedError

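# Illustrative sketch, not part of naginterfaces: the two equivalent forms of the
# Durbin--Watson statistic in the Notes for ``prob_durbin_watson`` (the sum of squared
# successive differences, and the quadratic form r^T A r / r^T r) can be checked against each
# other in plain Python. The function names are hypothetical; only the statistic d is computed
# here, not the bounds p_l and p_u.

```python
def durbin_watson_sum(residuals):
    """d as the sum of squared successive differences over the sum of squares."""
    num = sum((residuals[i + 1] - residuals[i]) ** 2
              for i in range(len(residuals) - 1))
    den = sum(r * r for r in residuals)
    return num / den


def difference_matrix(n):
    """The n x n tridiagonal matrix A: 1 in the two corner diagonal entries,
    2 elsewhere on the diagonal, and -1 on the first off-diagonals (n >= 2)."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0 if i in (0, n - 1) else 2.0
        if i + 1 < n:
            a[i][i + 1] = a[i + 1][i] = -1.0
    return a


def durbin_watson_quadratic(residuals):
    """d as the ratio of quadratic forms r^T A r / r^T r."""
    n = len(residuals)
    a = difference_matrix(n)
    num = sum(residuals[i] * a[i][j] * residuals[j]
              for i in range(n) for j in range(n))
    den = sum(r * r for r in residuals)
    return num / den
```

# Strictly alternating residuals such as [1, -1, 1, -1] give d = 3, towards the upper end of
# the range, which is the situation in which the Notes suggest testing with 4 - d instead.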
def prob_vonmises(t, vk):
    r"""
    ``prob_vonmises`` returns the probability associated with the lower tail of the von Mises distribution between :math:`{-\pi }` and :math:`\pi`.

    .. _g01er-py2-py-doc:

    For full information please refer to the NAG Library document for g01er

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01erf.html

    .. _g01er-py2-py-parameters:

    **Parameters**
    **t** : float
        :math:`\theta`, the observed von Mises statistic measured in radians.

    **vk** : float
        The concentration parameter :math:`\kappa`, of the von Mises distribution.

    **Returns**
    **p** : float
        The probability associated with the lower tail of the von Mises distribution between :math:`{-\pi }` and :math:`\pi`.

    .. _g01er-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{vk} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{vk}\geq 0.0`.

    .. _g01er-py2-py-notes:

    **Notes**
    The von Mises distribution is a symmetric distribution used in the analysis of circular data.
    The lower tail area of this distribution on the circle with mean direction :math:`\mu_0 = 0` and concentration parameter kappa, :math:`\kappa`, can be written as

    .. math::
        \mathrm{Pr}\left(\Theta \leq \theta :\kappa \right) = \frac{1}{{2\pi I_0\left(\kappa \right)}}\int_{{-\pi }}^{\theta }e^{{\kappa \cos\left(\Theta \right)}}\mathrm{d}\Theta \text{,}

    where :math:`\theta` is reduced modulo :math:`2\pi` so that :math:`{-\pi }\leq \theta < \pi` and :math:`\kappa \geq 0`.
    Note that if :math:`\theta = \pi` then ``prob_vonmises`` returns a probability of :math:`1`.
    For very small :math:`\kappa` the distribution is almost the uniform distribution, whereas for :math:`\kappa \to \infty` all the probability is concentrated at one point.

    The method of calculation for small :math:`\kappa` involves backwards recursion through a series expansion in terms of modified Bessel functions, while for large :math:`\kappa` an asymptotic Normal approximation is used.

    In the case of small :math:`\kappa` the series expansion of :math:`\mathrm{Pr}\left(\Theta \leq \theta :\kappa \right)` can be expressed as

    .. math::
        \mathrm{Pr}\left(\Theta \leq \theta :\kappa \right) = \frac{1}{2}+\frac{\theta }{\left(2\pi \right)}+\frac{1}{{\pi I_0\left(\kappa \right)}}\sum_{{n = 1}}^{\infty }n^{-1}I_n\left(\kappa \right)\sin\left(n\theta \right)\text{,}

    where :math:`I_n\left(\kappa \right)` is the modified Bessel function.
    This series expansion can be represented as a nested expression of terms involving the modified Bessel function ratio :math:`R_n`,

    .. math::
        R_n\left(\kappa \right) = \frac{{I_n\left(\kappa \right)}}{{I_{{n-1}}\left(\kappa \right)}}\text{, }\quad n = 1,2,3,\ldots \text{,}

    which is calculated using backwards recursion.

    For large values of :math:`\kappa` (see `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01erf.html#accuracy>`__) an asymptotic Normal approximation is used.
    The angle :math:`\Theta` is transformed to the nearly Normally distributed variate :math:`Z`,

    .. math::
        Z = b\left(\kappa \right)\sin\left(\frac{\Theta }{2}\right)\text{,}

    where

    .. math::
        b\left(\kappa \right) = \frac{{\sqrt{\frac{2}{\pi }}e^{\kappa }}}{{I_0\left(\kappa \right)}}

    and :math:`b\left(\kappa \right)` is computed from a continued fraction approximation.
    An approximation to order :math:`\kappa^{-4}` of the asymptotic normalizing series for :math:`Z` is then used.
    Finally the Normal probability integral is evaluated.

    For a more detailed analysis of the methods used see Hill (1977).

    .. _g01er-py2-py-references:

    **References**
    Hill, G W, 1977, `Algorithm 518: Incomplete Bessel function` :math:`I_0` `: The Von Mises distribution`, ACM Trans. Math. Software (3), 279--284

    Mardia, K V, 1972, `Statistics of Directional Data`, Academic Press
    """
    raise NotImplementedError

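# Illustrative sketch, not part of naginterfaces: for small kappa the lower tail of the von
# Mises distribution is 1/2 + theta/(2 pi) + (1/(pi I_0(kappa))) * sum_{n>=1} I_n(kappa) sin(n theta)/n.
# The code below sums that series directly and compares it with quadrature of the defining
# integral. All function names are hypothetical, and the Bessel functions come from their
# ascending power series rather than from the backwards recursion that the Notes for
# ``prob_vonmises`` describe, so this is a check of the formula, not of the library's method.

```python
import math


def bessel_i(order, kappa, terms=60):
    """Modified Bessel function I_order(kappa) from its ascending power series."""
    half = 0.5 * kappa
    term = half ** order / math.factorial(order)
    total = term
    for m in range(1, terms):
        term *= half * half / (m * (m + order))
        total += term
    return total


def vonmises_lower_tail_series(theta, kappa, n_max=40):
    """Series form of Pr(Theta <= theta : kappa); intended for small kappa only."""
    theta = (theta + math.pi) % (2.0 * math.pi) - math.pi  # reduce to [-pi, pi)
    series = sum(bessel_i(n, kappa) * math.sin(n * theta) / n
                 for n in range(1, n_max + 1))
    return 0.5 + theta / (2.0 * math.pi) + series / (math.pi * bessel_i(0, kappa))


def vonmises_lower_tail_quadrature(theta, kappa, n=2000):
    """Composite Simpson's rule applied to the defining integral from -pi to theta."""
    lo = -math.pi
    h = (theta - lo) / n  # n must be even
    total = math.exp(kappa * math.cos(lo)) + math.exp(kappa * math.cos(theta))
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * math.exp(kappa * math.cos(lo + i * h))
    return total * h / 3.0 / (2.0 * math.pi * bessel_i(0, kappa))
```

# At kappa = 0 the series term vanishes and the result is the uniform value 1/2 + theta/(2 pi);
# at theta = 0 symmetry gives exactly one half for any kappa.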
def prob_landau(x):
    r"""
    ``prob_landau`` returns the value of the Landau distribution function :math:`\Phi \left(\lambda \right)`.

    .. _g01et-py2-py-doc:

    For full information please refer to the NAG Library document for g01et

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01etf.html

    .. _g01et-py2-py-parameters:

    **Parameters**
    **x** : float
        The argument :math:`\lambda` of the function.

    **Returns**
    **p** : float
        The value of the Landau distribution function :math:`\Phi \left(\lambda \right)`.

    .. _g01et-py2-py-notes:

    **Notes**
    ``prob_landau`` evaluates an approximation to the Landau distribution function :math:`\Phi \left(\lambda \right)` given by

    .. math::
        \Phi \left(\lambda \right) = \int_{{-\infty }}^{\lambda }\phi \left(\lambda \right){d\lambda }\text{,}

    where :math:`\phi \left(\lambda \right)` is described in :meth:`pdf_landau`, using piecewise approximation by rational functions.
    Further details can be found in Kölbig and Schorr (1984).

    .. _g01et-py2-py-references:

    **References**
    Kölbig, K S and Schorr, B, 1984, `A program package for the Landau distribution`, Comp. Phys. Comm. (31), 97--111
    """
    raise NotImplementedError

def prob_vavilov(x, comm):
    r"""
    ``prob_vavilov`` returns the value of the Vavilov distribution function :math:`\Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)`.

    It is intended to be used after a call to :meth:`init_vavilov`.

    .. _g01eu-py2-py-doc:

    For full information please refer to the NAG Library document for g01eu

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01euf.html

    .. _g01eu-py2-py-parameters:

    **Parameters**
    **x** : float
        The argument :math:`\lambda` of the function.

    **comm** : dict, communication object
        Communication structure.

        This argument must have been initialized by a prior call to :meth:`init_vavilov`.

    **Returns**
    **p** : float
        The value of the Vavilov distribution function :math:`\Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)`.

    .. _g01eu-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            Either the initialization function has not been called prior to the first call of this function or a communication array has become corrupted.

    .. _g01eu-py2-py-notes:

    **Notes**
    ``prob_vavilov`` evaluates an approximation to the Vavilov distribution function :math:`\Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` given by

    .. math::
        \Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right) = \int_{{-\infty }}^{\lambda }\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right){d\lambda }\text{,}

    where :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` is described in :meth:`pdf_vavilov`.
    The method used is based on Fourier expansions.
    Further details can be found in Schorr (1974).

    .. _g01eu-py2-py-references:

    **References**
    Schorr, B, 1974, `Programs for the Landau and the Vavilov distributions and the corresponding random numbers`, Comp. Phys. Comm. (7), 215--224
    """
    raise NotImplementedError

def prob_dickey_fuller_unit(ts_type, n, ts, method=1, nsamp=100000, statecomm=None):
    r"""
    ``prob_dickey_fuller_unit`` returns the probability associated with the lower tail of the distribution for the Dickey--Fuller unit root test statistic.

    .. _g01ew-py2-py-doc:

    For full information please refer to the NAG Library document for g01ew

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ewf.html

    .. _g01ew-py2-py-parameters:

    **Parameters**
    **ts_type** : int
        The type of test statistic, supplied in :math:`\mathrm{ts}`.

    **n** : int
        :math:`n`, the length of the time series used to calculate the test statistic.

    **ts** : float
        The Dickey--Fuller test statistic for which the probability is required. If

        :math:`\mathrm{ts\_type} = 1`
            :math:`\mathrm{ts}` must contain :math:`\tau`.

        :math:`\mathrm{ts\_type} = 2`
            :math:`\mathrm{ts}` must contain :math:`\tau_{\mu }`.

        :math:`\mathrm{ts\_type} = 3`
            :math:`\mathrm{ts}` must contain :math:`\tau_{\tau }`.

        If the test statistic was calculated using :meth:`tsa.uni_dickey_fuller_unit <naginterfaces.library.tsa.uni_dickey_fuller_unit>` the value of :math:`\mathrm{ts\_type}` and :math:`\mathrm{n}` must not change between calls to :meth:`tsa.uni_dickey_fuller_unit <naginterfaces.library.tsa.uni_dickey_fuller_unit>` and ``prob_dickey_fuller_unit``.

    **method** : int, optional
        The method used to calculate the probability.

        :math:`\mathrm{method} = 1`
            The probability is interpolated from a look-up table, whose values were obtained via simulation.

        :math:`\mathrm{method} = 2`
            The probability is interpolated from a look-up table, whose values were obtained from Dickey (1976).

        :math:`\mathrm{method} = 3`
            The probability is obtained via simulation.

        The probability calculated from the look-up table should give sufficient accuracy for most applications.

    **nsamp** : int, optional
        If :math:`\mathrm{method} = 3`, the number of samples used in the simulation; otherwise :math:`\mathrm{nsamp}` is not referenced and need not be set.

    **statecomm** : None or dict, RNG communication object, optional, modified in place
        RNG communication structure.

        When :math:`\mathrm{method} = 3`, this argument must have been initialized by a prior call to :meth:`rand.init_repeat <naginterfaces.library.rand.init_repeat>` or :meth:`rand.init_nonrepeat <naginterfaces.library.rand.init_nonrepeat>`.

    **Returns**
    **pint** : float
        The probability associated with the lower tail of the distribution for the (augmented) Dickey--Fuller unit root test statistic supplied in :math:`\mathrm{ts}`.

    .. _g01ew-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`11`)
            On entry, :math:`\mathrm{method} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{method} = 1`, :math:`2` or :math:`3`.

        (`errno` :math:`21`)
            On entry, :math:`\mathrm{ts\_type} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{ts\_type} = 1`, :math:`2` or :math:`3`.

        (`errno` :math:`31`)
            On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: if :math:`\mathrm{method} \neq 3`, :math:`\mathrm{n} > 0`.

        (`errno` :math:`31`)
            On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: if :math:`\mathrm{method} = 3` and :math:`\mathrm{ts\_type} = 1`, :math:`\mathrm{n} > 2`.

        (`errno` :math:`31`)
            On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: if :math:`\mathrm{method} = 3` and :math:`\mathrm{ts\_type} = 2`, :math:`\mathrm{n} > 3`.

        (`errno` :math:`31`)
            On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: if :math:`\mathrm{method} = 3` and :math:`\mathrm{ts\_type} = 3`, :math:`\mathrm{n} > 4`.

        (`errno` :math:`51`)
            On entry, :math:`\mathrm{nsamp} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: if :math:`\mathrm{method} = 3`, :math:`\mathrm{nsamp} > 0`.

        (`errno` :math:`61`)
            On entry, :math:`\mathrm{method} = 3` and the :math:`\mathrm{statecomm}`\ ['state'] vector has been corrupted or not initialized.

    **Warns**
    **NagAlgorithmicWarning**
        (`errno` :math:`201`)
            The supplied input values were outside the range of at least one look-up table, therefore, extrapolation was used.

    .. _g01ew-py2-py-notes:

    **Notes**
    If the root of the characteristic equation for a time series is one then that series is said to have a unit root.
    Such series are nonstationary.
    ``prob_dickey_fuller_unit`` is designed to be called after :meth:`tsa.uni_dickey_fuller_unit <naginterfaces.library.tsa.uni_dickey_fuller_unit>` and returns the probability associated with one of three types of (augmented) Dickey--Fuller test statistic: :math:`\tau`, :math:`\tau_{\mu }` or :math:`\tau_{\tau }`, used to test for a unit root, a unit root with drift or a unit root with drift and a deterministic time trend, respectively.
    The three types of test statistic are constructed as follows:

    (1) To test whether a time series, :math:`y_t`, for :math:`\textit{t} = 1,2,\ldots,n`, has a unit root the regression model

        .. math::
            \nabla y_t = \beta_1y_{{t-1}}+\sum_{{i = 1}}^{{p-1}}\delta_i\nabla y_{{t-i}}+\epsilon_t

        is fit and the test statistic :math:`\tau` constructed as

        .. math::
            \tau = \frac{\hat{\beta }_1}{\sigma_{{11}}}

        where :math:`\nabla` is the difference operator, with :math:`\nabla y_t = y_t-y_{{t-1}}`, and where :math:`\hat{\beta }_1` and :math:`\sigma_{{11}}` are the least squares estimate and associated standard error for :math:`\beta_1` respectively.

    (#) To test for a unit root with drift the regression model

        .. math::
            \nabla y_t = \beta_1y_{{t-1}}+\sum_{{i = 1}}^{{p-1}}\delta_i\nabla y_{{t-i}}+\alpha +\epsilon_t

        is fit and the test statistic :math:`\tau_{\mu }` constructed as

        .. math::
            \tau_{\mu } = \frac{\hat{\beta }_1}{\sigma_{{11}}}\text{.}

    (#) To test for a unit root with drift and deterministic time trend the regression model

        .. math::
            \nabla y_t = \beta_1y_{{t-1}}+\sum_{{i = 1}}^{{p-1}}\delta_i\nabla y_{{t-i}}+\alpha +\beta_2t+\epsilon_t

        is fit and the test statistic :math:`\tau_{\tau }` constructed as

        .. math::
            \tau_{\tau } = \frac{\hat{\beta }_1}{\sigma_{{11}}}\text{.}

    All three test statistics: :math:`\tau`, :math:`\tau_{\mu }` and :math:`\tau_{\tau }` can be calculated using :meth:`tsa.uni_dickey_fuller_unit <naginterfaces.library.tsa.uni_dickey_fuller_unit>`.

    The probability distributions of these statistics are nonstandard and are a function of the length of the series of interest, :math:`n`.
    The probability associated with a given test statistic, for a given :math:`n`, can, therefore, only be calculated by simulation as described in Dickey and Fuller (1979).
    However, such simulations require a significant number of iterations and are, therefore, prohibitively expensive in terms of the time taken.
    As such ``prob_dickey_fuller_unit`` also allows the probability to be interpolated from a look-up table.
    Two such tables are provided, one from Dickey (1976) and one constructed as described in `Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ewf.html#fcomments>`__.
    The three different methods of obtaining an estimate of the probability can be chosen via the :math:`\mathrm{method}` argument.
    Unless there is a specific reason for choosing otherwise, :math:`\mathrm{method} = 1` should be used.

    .. _g01ew-py2-py-references:

    **References**
    Dickey, A D, 1976, `Estimation and hypothesis testing in nonstationary time series`, PhD Thesis, Iowa State University, Ames, Iowa

    Dickey, A D and Fuller, W A, 1979, `Distribution of the estimators for autoregressive time series with a unit root`, J. Am. Stat. Assoc. (74 366), 427--431
    """
    raise NotImplementedError

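# Illustrative sketch, not part of naginterfaces: the simplest of the three statistics in the
# Notes for ``prob_dickey_fuller_unit`` is tau for the model nabla y_t = beta_1 y_{t-1} + eps_t
# with no lagged-difference terms (p = 1). The function name is hypothetical, and the residual
# degrees of freedom (number of differences minus one) is an assumption about the least squares
# convention; ``tsa.uni_dickey_fuller_unit`` is the supported way to obtain the statistic.

```python
import math


def dickey_fuller_tau(y):
    """tau = beta_1_hat / standard error, for nabla y_t = beta_1 * y_{t-1} + eps_t."""
    lagged = y[:-1]                                        # y_{t-1}
    diffs = [y[t] - y[t - 1] for t in range(1, len(y))]    # nabla y_t
    sxx = sum(v * v for v in lagged)
    sxy = sum(v * d for v, d in zip(lagged, diffs))
    beta_hat = sxy / sxx                                   # least squares estimate
    rss = sum((d - beta_hat * v) ** 2 for v, d in zip(lagged, diffs))
    dof = len(diffs) - 1                                   # one estimated coefficient
    std_err = math.sqrt(rss / dof / sxx)                   # standard error of beta_hat
    return beta_hat / std_err
```

# The value returned here is only the test statistic; converting it to a lower tail
# probability is the job of ``prob_dickey_fuller_unit`` with ``ts_type`` set to 1.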
def prob_kolmogorov1(n, d): r""" ``prob_kolmogorov1`` returns the upper tail probability associated with the one sample Kolmogorov--Smirnov distribution. .. _g01ey-py2-py-doc: For full information please refer to the NAG Library document for g01ey https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01eyf.html .. _g01ey-py2-py-parameters: **Parameters** **n** : int :math:`n`, the number of observations in the sample. **d** : float Contains the test statistic, :math:`D_n^+` or :math:`D_n^-`. **Returns** **p** : float The upper tail probability associated with the one sample Kolmogorov--Smirnov distribution. .. _g01ey-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{n} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n} \geq 1`. (`errno` :math:`2`) On entry, :math:`\mathrm{d} < 0.0` or :math:`\mathrm{d} > 1.0`: :math:`\mathrm{d} = \langle\mathit{\boldsymbol{value}}\rangle`. .. _g01ey-py2-py-notes: **Notes** Let :math:`S_n\left(x\right)` be the sample cumulative distribution function and :math:`F_0\left(x\right)` the hypothesised theoretical distribution function. ``prob_kolmogorov1`` returns the upper tail probability, :math:`p`, associated with the one-sided Kolmogorov--Smirnov test statistic :math:`D_n^+` or :math:`D_n^-`, where these one-sided statistics are defined as follows: .. math:: \begin{array}{lcl}D_n^+& = &\mathrm{sup}_x\left[S_n\left(x\right)-F_0\left(x\right)\right]\text{,}\\&&\\D_n^-& = &\mathrm{sup}_x\left[F_0\left(x\right)-S_n\left(x\right)\right]\text{.}\end{array} If :math:`n\leq 100` an exact method is used; for the details see Conover (1980). Otherwise a large sample approximation derived by Smirnov is used; see Feller (1948), Kendall and Stuart (1973) and Smirnov (1948). .. _g01ey-py2-py-references: **References** Conover, W J, 1980, `Practical Nonparametric Statistics`, Wiley Feller, W, 1948, `On the Kolmogorov--Smirnov limit theorems for empirical distributions`, Ann.
Math. Statist. (19), 179--181 Kendall, M G and Stuart, A, 1973, `The Advanced Theory of Statistics (Volume 2)`, (3rd Edition), Griffin Siegel, S, 1956, `Non-parametric Statistics for the Behavioral Sciences`, McGraw--Hill Smirnov, N, 1948, `Table for estimating the goodness of fit of empirical distributions`, Ann. Math. Statist. (19), 279--281 """ raise NotImplementedError
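The one-sided statistics :math:`D_n^+` and :math:`D_n^-` defined in the Notes for ``prob_kolmogorov1`` can be computed directly from a sample. The following is a minimal sketch, assuming a continuous hypothesised distribution so that the suprema are attained at the ordered observations. The names ``ks_one_sided`` and ``uniform_cdf`` are hypothetical and not part of the library.

```python
def ks_one_sided(sample, cdf):
    """Return (D_n_plus, D_n_minus) for a sample and a continuous hypothesised CDF."""
    xs = sorted(sample)
    n = len(xs)
    # S_n jumps by 1/n at each ordered observation, so it is enough to
    # compare S_n and F_0 just after and just before each sample point.
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return d_plus, d_minus


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (example hypothesis F_0)."""
    return min(1.0, max(0.0, x))


d_plus, d_minus = ks_one_sided([0.1, 0.4, 0.7], uniform_cdf)
```

Either value, together with the sample size, corresponds to the arguments ``d`` and ``n`` described above.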
[docs]def prob_kolmogorov2(n1, n2, d): r""" ``prob_kolmogorov2`` returns the probability associated with the upper tail of the Kolmogorov--Smirnov two sample distribution. .. _g01ez-py2-py-doc: For full information please refer to the NAG Library document for g01ez https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ezf.html .. _g01ez-py2-py-parameters: **Parameters** **n1** : int The number of observations in the first sample, :math:`n_1`. **n2** : int The number of observations in the second sample, :math:`n_2`. **d** : float The test statistic :math:`D_{{n_1,n_2}}`, for the two sample Kolmogorov--Smirnov goodness-of-fit test, that is the maximum difference between the empirical cumulative distribution functions (CDFs) of the two samples. **Returns** **p** : float The probability associated with the upper tail of the Kolmogorov--Smirnov two sample distribution. .. _g01ez-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{n1} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{n2} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{n1} \geq 1` and :math:`\mathrm{n2} \geq 1`. (`errno` :math:`2`) On entry, :math:`\mathrm{d} < 0.0` or :math:`\mathrm{d} > 1.0`: :math:`\mathrm{d} = \langle\mathit{\boldsymbol{value}}\rangle`. (`errno` :math:`3`) The Smirnov approximation used for large samples did not converge in :math:`200` iterations. The probability is set to :math:`1.0`. .. _g01ez-py2-py-notes: **Notes** Let :math:`F_{n_1}\left(x\right)` and :math:`G_{n_2}\left(x\right)` denote the empirical cumulative distribution functions for the two samples, where :math:`n_1` and :math:`n_2` are the sizes of the first and second samples respectively. The function ``prob_kolmogorov2`` computes the upper tail probability for the Kolmogorov--Smirnov two sample two-sided test statistic :math:`D_{{n_1,n_2}}`, where .. 
math:: D_{{n_1,n_2}} = \mathrm{sup}_x\left\lvert F_{n_1}\left(x\right)-G_{n_2}\left(x\right)\right\rvert \text{.} The probability is computed exactly if :math:`n_1,n_2\leq 10000` and :math:`\mathrm{max}\left(n_1, n_2\right)\leq 2500` using a method given by Kim and Jenrich (1973). For the case where :math:`\mathrm{min}\left(n_1, n_2\right)\leq 10\%` of the :math:`\mathrm{max}\left(n_1, n_2\right)` and :math:`\mathrm{min}\left(n_1, n_2\right)\leq 80` the Smirnov approximation is used. For all other cases the Kolmogorov approximation is used. These two approximations are discussed in Kim and Jenrich (1973). .. _g01ez-py2-py-references: **References** Conover, W J, 1980, `Practical Nonparametric Statistics`, Wiley Feller, W, 1948, `On the Kolmogorov--Smirnov limit theorems for empirical distributions`, Ann. Math. Statist. (19), 179--181 Kendall, M G and Stuart, A, 1973, `The Advanced Theory of Statistics (Volume 2)`, (3rd Edition), Griffin Kim, P J and Jenrich, R I, 1973, `Tables of exact sampling distribution of the two sample Kolmogorov--Smirnov criterion` :math:`D_{{mn}}\left(m < n\right)`, Selected Tables in Mathematical Statistics (1), 80--129, American Mathematical Society Siegel, S, 1956, `Non-parametric Statistics for the Behavioral Sciences`, McGraw--Hill Smirnov, N, 1948, `Table for estimating the goodness of fit of empirical distributions`, Ann. Math. Statist. (19), 279--281 """ raise NotImplementedError
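The two-sided statistic :math:`D_{{n_1,n_2}}` in the Notes for ``prob_kolmogorov2`` is the largest absolute gap between the two empirical distribution functions. The sketch below is one straightforward way to evaluate it; the function name ``ks_two_sample`` is an assumption for illustration and the code makes no claim about how the library itself obtains the statistic.

```python
from bisect import bisect_right


def ks_two_sample(sample1, sample2):
    """Return D = sup_x |F_n1(x) - G_n2(x)| for two samples."""
    xs, ys = sorted(sample1), sorted(sample2)
    n1, n2 = len(xs), len(ys)
    d = 0.0
    # Both empirical CDFs are right-continuous step functions, so the
    # supremum is attained at one of the observed values.
    for v in xs + ys:
        f = bisect_right(xs, v) / n1   # proportion of sample 1 that is <= v
        g = bisect_right(ys, v) / n2   # proportion of sample 2 that is <= v
        d = max(d, abs(f - g))
    return d


d = ks_two_sample([1.0, 2.0, 3.0], [2.5, 3.5, 4.5])
```

The value ``d`` and the two sample sizes match the arguments ``d``, ``n1`` and ``n2`` documented above.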
[docs]def inv_cdf_normal(p, tail='L'): r""" ``inv_cdf_normal`` returns the deviate associated with the given probability of the standard Normal distribution. .. _g01fa-py2-py-doc: For full information please refer to the NAG Library document for g01fa https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01faf.html .. _g01fa-py2-py-parameters: **Parameters** **p** : float :math:`p`, the probability from the standard Normal distribution as defined by :math:`\mathrm{tail}`. **tail** : str, length 1, optional Indicates which tail the supplied probability represents. :math:`\mathrm{tail} = \texttt{'L'}` The lower probability, i.e., :math:`{P\left(X\leq x_p\right)}`. :math:`\mathrm{tail} = \texttt{'U'}` The upper probability, i.e., :math:`{P\left(X\geq x_p\right)}`. :math:`\mathrm{tail} = \texttt{'S'}` The two tail (significance level) probability, i.e., :math:`{P\left(X\geq \left\lvert x_p\right\rvert \right)}+{P\left(X\leq -\left\lvert x_p\right\rvert \right)}`. :math:`\mathrm{tail} = \texttt{'C'}` The two tail (confidence interval) probability, i.e., :math:`{P\left(X\leq \left\lvert x_p\right\rvert \right)}-P\left(X\leq -\left\lvert x_p\right\rvert \right)`. **Returns** **x** : float The deviate associated with the given probability of the standard Normal distribution. .. _g01fa-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{tail} = \texttt{'L'}`, :math:`\texttt{'U'}`, :math:`\texttt{'S'}` or :math:`\texttt{'C'}`. (`errno` :math:`2`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} < 1.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} > 0.0`. .. 
_g01fa-py2-py-notes: **Notes** The deviate, :math:`x_p` associated with the lower tail probability, :math:`p`, for the standard Normal distribution is defined as the solution to .. math:: P\left(X\leq x_p\right) = p = \int_{{-\infty }}^{x_p}Z\left(X\right){dX}\text{,} where .. math:: Z\left(X\right) = \frac{1}{\sqrt{2\pi }}e^{{-X^2/2}}\text{, }\quad {-\infty } < X < \infty \text{.} The method used is an extension of that of Wichura (1988). :math:`p` is first replaced by :math:`q = p-0.5`. (a) If :math:`\left\lvert q\right\rvert \leq 0.3`, :math:`x_p` is computed by a rational Chebyshev approximation .. math:: x_p = s\frac{{A\left(s^2\right)}}{{B\left(s^2\right)}}\text{,} where :math:`s = \sqrt{2\pi }q` and :math:`A`, :math:`B` are polynomials of degree :math:`7`. (#) If :math:`0.3 < \left\lvert q\right\rvert \leq 0.42`, :math:`x_p` is computed by a rational Chebyshev approximation .. math:: x_p = \mathrm{sign}\left(q\right)\left(\frac{{C\left(t\right)}}{{D\left(t\right)}}\right)\text{,} where :math:`t = \left\lvert q\right\rvert -0.3` and :math:`C`, :math:`D` are polynomials of degree :math:`5`. (#) If :math:`\left\lvert q\right\rvert > 0.42`, :math:`x_p` is computed as .. math:: x_p = \mathrm{sign}\left(q\right)\left[\left(\frac{{E\left(u\right)}}{{F\left(u\right)}}\right)+u\right]\text{,} where :math:`u = \sqrt{-2\times \log\left(\mathrm{min}\left(p, {1-p}\right)\right)}` and :math:`E`, :math:`F` are polynomials of degree :math:`6`. For the upper tail probability :math:`{-x_p}` is returned, while for the two tail probabilities the value :math:`x_{{p^*}}` is returned, where :math:`p^*` is the required tail probability computed from the input value of :math:`p`. .. _g01fa-py2-py-references: **References** NIST Digital Library of Mathematical Functions Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Wichura, 1988, `Algorithm AS 241: the percentage points of the Normal distribution`, Appl. Statist. 
(37), 477--484 """ raise NotImplementedError
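The four settings of ``tail`` for ``inv_cdf_normal`` can all be reduced to a lower tail probability :math:`p^*`. The sketch below shows one way to do that reduction. It is an illustration under stated assumptions: ``normal_deviate`` is a hypothetical name, the formulas for :math:`p^*` are derived here from the tail definitions in the Parameters section, the non-negative deviate is returned in the two tail cases, and the standard library's ``statistics.NormalDist.inv_cdf`` (Python 3.8+) stands in for the rational approximations described in the Notes.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def normal_deviate(p, tail="L"):
    """Deviate of the standard Normal distribution for probability p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0.0 < p < 1.0")
    if tail == "L":
        return _STD_NORMAL.inv_cdf(p)
    if tail == "U":
        return -_STD_NORMAL.inv_cdf(p)               # upper tail: -x_p
    if tail == "S":
        return _STD_NORMAL.inv_cdf(1.0 - p / 2.0)    # significance level: p* = 1 - p/2
    if tail == "C":
        return _STD_NORMAL.inv_cdf((1.0 + p) / 2.0)  # confidence interval: p* = (1 + p)/2
    raise ValueError("tail must be 'L', 'U', 'S' or 'C'")
```

With these conventions, a lower tail probability of 0.975, an upper tail probability of 0.025, a significance level of 0.05 and a confidence level of 0.95 all map to the same deviate.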
[docs]def inv_cdf_students_t(p, df, tail='L'): r""" ``inv_cdf_students_t`` returns the deviate associated with the given tail probability of Student's :math:`t`-distribution with real degrees of freedom. .. _g01fb-py2-py-doc: For full information please refer to the NAG Library document for g01fb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fbf.html .. _g01fb-py2-py-parameters: **Parameters** **p** : float :math:`p`, the probability from the required Student's :math:`t`-distribution as defined by :math:`\mathrm{tail}`. **df** : float :math:`\nu`, the degrees of freedom of the Student's :math:`t`-distribution. **tail** : str, length 1, optional Indicates which tail the supplied probability represents. :math:`\mathrm{tail} = \texttt{'U'}` The upper tail probability, i.e., :math:`{P\left(T\geq t_p:\nu \right)}`. :math:`\mathrm{tail} = \texttt{'L'}` The lower tail probability, i.e., :math:`{P\left(T\leq t_p:\nu \right)}`. :math:`\mathrm{tail} = \texttt{'S'}` The two tail (significance level) probability, i.e., :math:`{P\left(T\geq \left\lvert t_p\right\rvert :\nu \right)}+{P\left(T\leq -\left\lvert t_p\right\rvert :\nu \right)}`. :math:`\mathrm{tail} = \texttt{'C'}` The two tail (confidence interval) probability, i.e., :math:`{P\left(T\leq \left\lvert t_p\right\rvert :\nu \right)}-P\left(T\leq -\left\lvert t_p\right\rvert :\nu \right)`. **Returns** **x** : float The deviate associated with the given tail probability of Student's :math:`t`-distribution with real degrees of freedom. .. _g01fb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{tail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{tail} = \texttt{'L'}`, :math:`\texttt{'U'}`, :math:`\texttt{'S'}` or :math:`\texttt{'C'}`. (`errno` :math:`2`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} < 1.0`. 
(`errno` :math:`2`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} > 0.0`. (`errno` :math:`3`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df}\geq 1.0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`5`) The solution has failed to converge. However, the result should be a reasonable approximation. .. _g01fb-py2-py-notes: **Notes** The deviate, :math:`t_p` associated with the lower tail probability, :math:`p`, of the Student's :math:`t`-distribution with :math:`\nu` degrees of freedom is defined as the solution to .. math:: P\left(T < t_p:\nu \right) = p = \frac{{\Gamma \left(\left(\nu +1\right)/2\right)}}{{\sqrt{\nu \pi }\Gamma \left(\nu /2\right)}}\int_{{-\infty }}^{t_p}\left(1+\frac{T^2}{\nu }\right)^{{-\left(\nu +1\right)/2}}dT\text{, }\quad \nu \geq 1\text{; }{-\infty } < t_p < \infty \text{.} For :math:`\nu = 1` or :math:`2` the integral equation is easily solved for :math:`t_p`. For other values of :math:`\nu < 3` a transformation to the beta distribution is used and the result obtained from :meth:`inv_cdf_beta`. For :math:`\nu \geq 3` an inverse asymptotic expansion of Cornish--Fisher type is used. The algorithm is described by Hill (1970). .. _g01fb-py2-py-references: **References** Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Hill, G W, 1970, `Student's` :math:`t` `-distribution`, Comm. ACM (13(10)), 617--619 """ raise NotImplementedError
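The Notes for ``inv_cdf_students_t`` state that the integral equation is easily solved for :math:`\nu = 1` or :math:`2`. The sketch below writes out those two closed forms for the lower tail, obtained by inverting the well-known distribution functions of the Cauchy distribution and of Student's :math:`t` with two degrees of freedom. The function name ``t_deviate_closed_form`` is hypothetical, and the code is not taken from the library.

```python
import math


def t_deviate_closed_form(p, df):
    """Lower tail deviate of Student's t-distribution for df = 1 or df = 2."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must satisfy 0.0 < p < 1.0")
    if df == 1:
        # Cauchy distribution: P(T <= t) = 1/2 + atan(t) / pi
        return math.tan(math.pi * (p - 0.5))
    if df == 2:
        # P(T <= t) = 1/2 + t / (2 * sqrt(2 + t^2))
        return (2.0 * p - 1.0) / math.sqrt(2.0 * p * (1.0 - p))
    raise ValueError("closed forms are shown only for df = 1 or df = 2")
```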
[docs]def inv_cdf_chisq(p, df): r""" ``inv_cdf_chisq`` returns the deviate associated with the given lower tail probability of the :math:`\chi^2`-distribution with real degrees of freedom. .. _g01fc-py2-py-doc: For full information please refer to the NAG Library document for g01fc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fcf.html .. _g01fc-py2-py-parameters: **Parameters** **p** : float :math:`p`, the lower tail probability from the required :math:`\chi^2`-distribution. **df** : float :math:`\nu`, the degrees of freedom of the :math:`\chi^2`-distribution. **Returns** **x** : float The deviate associated with the given lower tail probability of the :math:`\chi^2`-distribution with real degrees of freedom. .. _g01fc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} < 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p}\geq 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df} > 0.0`. (`errno` :math:`3`) The probability is too close to :math:`0.0` or :math:`1.0`. (`errno` :math:`5`) The series used to calculate the gamma function has failed to converge. This is an unlikely error exit. **Warns** **NagAlgorithmicWarning** (`errno` :math:`4`) The algorithm has failed to converge in :math:`\langle\mathit{\boldsymbol{value}}\rangle` iterations. The result should be a reasonable approximation. .. _g01fc-py2-py-notes: **Notes** The deviate, :math:`x_p`, associated with the lower tail probability :math:`p` of the :math:`\chi^2`-distribution with :math:`\nu` degrees of freedom is defined as the solution to .. 
math:: P\left(X\leq x_p:\nu \right) = p = \frac{1}{{2^{{\nu /2}}\Gamma \left(\nu /2\right)}}\int_0^{x_p}e^{{-X/2}}X^{{\nu /2-1}}{dX}\text{, }\quad 0\leq x_p < \infty \text{;}\nu > 0\text{.} The required :math:`x_p` is found by using the relationship between a :math:`\chi^2`-distribution and a gamma distribution, i.e., a :math:`\chi^2`-distribution with :math:`\nu` degrees of freedom is equal to a gamma distribution with scale parameter :math:`2` and shape parameter :math:`\nu /2`. For very large values of :math:`\nu`, greater than :math:`10^5`, Wilson and Hilferty's normal approximation to the :math:`\chi^2` is used; see Kendall and Stuart (1969). .. _g01fc-py2-py-references: **References** Best, D J and Roberts, D E, 1975, `Algorithm AS 91. The percentage points of the` :math:`\chi^2` `distribution`, Appl. Statist. (24), 385--388 Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Kendall, M G and Stuart, A, 1969, `The Advanced Theory of Statistics (Volume 1)`, (3rd Edition), Griffin """ raise NotImplementedError
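The Wilson and Hilferty normal approximation mentioned in the Notes for ``inv_cdf_chisq`` can be sketched as follows. The function name ``chisq_deviate_wilson_hilferty`` is an assumption, and ``statistics.NormalDist`` from the standard library supplies the Normal deviate. The documentation says the library uses this approximation only for :math:`\nu > 10^5`; the small value of :math:`\nu` in the example is chosen merely to make the size of the approximation error visible.

```python
import math
from statistics import NormalDist


def chisq_deviate_wilson_hilferty(p, df):
    """Approximate lower tail deviate of the chi-square distribution."""
    z = NormalDist().inv_cdf(p)    # standard Normal deviate for probability p
    c = 2.0 / (9.0 * df)
    # Wilson-Hilferty: (X / df) ** (1/3) is roughly Normal(1 - c, c)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3


x_approx = chisq_deviate_wilson_hilferty(0.95, 10.0)
```

For 10 degrees of freedom the tabulated 95% point is about 18.307, so the approximation is already close; it improves as :math:`\nu` grows.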
[docs]def inv_cdf_f(p, df1, df2): r""" ``inv_cdf_f`` returns the deviate associated with the given lower tail probability of the :math:`F` or variance-ratio distribution with real degrees of freedom. .. _g01fd-py2-py-doc: For full information please refer to the NAG Library document for g01fd https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fdf.html .. _g01fd-py2-py-parameters: **Parameters** **p** : float :math:`p`, the lower tail probability from the required :math:`F`-distribution. **df1** : float The degrees of freedom of the numerator variance, :math:`\nu_1`. **df2** : float The degrees of freedom of the denominator variance, :math:`\nu_2`. **Returns** **x** : float The deviate associated with the given lower tail probability of the :math:`F` or variance-ratio distribution with real degrees of freedom. .. _g01fd-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} < 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p}\geq 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{df1} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{df2} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df1} > 0.0` and :math:`\mathrm{df2} > 0.0`. (`errno` :math:`4`) The probability is too close to :math:`0.0` or :math:`1.0`. The value of :math:`f_p` cannot be computed. This will only occur when the large sample approximations are used. **Warns** **NagAlgorithmicWarning** (`errno` :math:`3`) The solution has failed to converge. However, the result should be a reasonable approximation. Alternatively, :meth:`inv_cdf_beta` can be used with a suitable setting of the argument :math:`\textit{tol}`. .. 
_g01fd-py2-py-notes: **Notes** The deviate, :math:`f_p`, associated with the lower tail probability, :math:`p`, of the :math:`F`-distribution with degrees of freedom :math:`\nu_1` and :math:`\nu_2` is defined as the solution to .. math:: P\left({F\leq f_p:\nu_1}, \nu_2\right) = p = \frac{{\nu_1^{{\frac{1}{2}\nu_1}}\nu_2^{{\frac{1}{2}\nu_2}}\Gamma \left(\frac{{\nu_1+\nu_2}}{2}\right)}}{{\Gamma \left(\frac{\nu_1}{2}\right)\Gamma \left(\frac{\nu_2}{2}\right)}}\int_0^{f_p}F^{{\frac{1}{2}\left(\nu_1-2\right)}}\left(\nu_2+\nu_1F\right)^{{-\frac{1}{2}\left(\nu_1+\nu_2\right)}}{dF}\text{,} where :math:`\nu_1,\nu_2 > 0`; :math:`0\leq f_p < \infty`. The value of :math:`f_p` is computed by means of a transformation to a beta distribution, :math:`P_{\beta }\left({B\leq \beta :a}, b\right)`: .. math:: P\left({F\leq f:\nu_1}, \nu_2\right) = P_{\beta }\left(B\leq \frac{{\nu_1f}}{{\nu_1f+\nu_2}}:\nu_1/2,\nu_2/2\right) and using a call to :meth:`inv_cdf_beta`. For very large values of both :math:`\nu_1` and :math:`\nu_2`, greater than :math:`10^5`, a normal approximation is used. If only one of :math:`\nu_1` or :math:`\nu_2` is greater than :math:`10^5` then a :math:`\chi^2` approximation is used; see Abramowitz and Stegun (1972). .. _g01fd-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
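The transformation between the :math:`F` and beta distributions given in the Notes for ``inv_cdf_f`` is easy to write down in both directions. The sketch below uses hypothetical helper names (``beta_from_f``, ``f_from_beta``) and checks the mapping in the special case :math:`\nu_1 = \nu_2 = 2`, where the corresponding beta distribution has parameters :math:`(1, 1)` and is uniform, so that :math:`\beta_p = p`.

```python
def beta_from_f(f, df1, df2):
    """Map an F deviate to the matching beta deviate: df1*f / (df1*f + df2)."""
    return df1 * f / (df1 * f + df2)


def f_from_beta(beta, df1, df2):
    """Inverse mapping: recover the F deviate from a beta deviate in (0, 1)."""
    return df2 * beta / (df1 * (1.0 - beta))


# Special case df1 = df2 = 2: Beta(1, 1) is uniform, so beta_p = p.
p = 0.95
f_p = f_from_beta(p, 2.0, 2.0)
```

In the general case the beta deviate would come from a routine such as ``inv_cdf_beta`` with parameters :math:`\nu_1/2` and :math:`\nu_2/2`, as the Notes describe.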
[docs]def inv_cdf_beta(p, a, b, tol=0.0): r""" ``inv_cdf_beta`` returns the deviate associated with the given lower tail probability of the beta distribution. .. _g01fe-py2-py-doc: For full information please refer to the NAG Library document for g01fe https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fef.html .. _g01fe-py2-py-parameters: **Parameters** **p** : float :math:`p`, the lower tail probability from the required beta distribution. **a** : float :math:`a`, the first parameter of the required beta distribution. **b** : float :math:`b`, the second parameter of the required beta distribution. **tol** : float, optional The relative accuracy required by you in the result. If ``inv_cdf_beta`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. **Returns** **x** : float The deviate associated with the given lower tail probability of the beta distribution. .. _g01fe-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p}\geq 0.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p}\leq 1.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{a} > 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{a}\leq 10^6`. (`errno` :math:`2`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Constraint: :math:`\mathrm{b} > 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{b}\leq 10^6`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`3`) The solution has failed to converge. However, the result should be a reasonable approximation. Requested accuracy not achieved when calculating beta probability. You should try setting :math:`\mathrm{tol}` larger. (`errno` :math:`4`) The requested accuracy has not been achieved. Use a larger value of :math:`\mathrm{tol}`. There is doubt concerning the accuracy of the computed result. :math:`100` iterations of the Newton--Raphson method have been performed without satisfying the accuracy criterion (see `Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fef.html#fcomments>`__). The result should be a reasonable approximation of the solution. .. _g01fe-py2-py-notes: **Notes** The deviate, :math:`\beta_p`, associated with the lower tail probability, :math:`p`, of the beta distribution with parameters :math:`a` and :math:`b` is defined as the solution to .. math:: P\left({B\leq \beta_p:a}, b\right) = p = \frac{{\Gamma \left(a+b\right)}}{{\Gamma \left(a\right)\Gamma \left(b\right)}}\int_0^{{\beta_p}}B^{{a-1}}\left(1-B\right)^{{b-1}}{dB}\text{, }\quad 0\leq \beta_p\leq 1\text{;}a,b > 0\text{.} The algorithm is a modified version of the Newton--Raphson method, following closely that of Cran `et al.` (1977). An initial approximation, :math:`\beta_0`, to :math:`\beta_p` is found (see Cran `et al.` (1977)), and the Newton--Raphson iteration .. math:: \beta_i = \beta_{{i-1}}-\frac{{f\left(\beta_{{i-1}}\right)}}{{f^{\prime }\left(\beta_{{i-1}}\right)}}\text{,} where :math:`f\left(\beta \right) = P\left({B\leq \beta :a}, {b}\right)-p` is used, with modifications to ensure that :math:`\beta` remains in the range :math:`\left(0, 1\right)`. .. 
_g01fe-py2-py-references: **References** Cran, G W, Martin, K J and Thomas, G E, 1977, `Algorithm AS 109. Inverse of the incomplete beta function ratio`, Appl. Statist. (26), 111--114 Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
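The Newton--Raphson iteration in the Notes for ``inv_cdf_beta`` can be illustrated with a small self-contained sketch. Everything here is an assumption made for illustration: the names ``beta_pdf``, ``beta_cdf`` and ``beta_deviate_newton``, the use of composite Simpson integration for the probability (valid here only for :math:`a, b \geq 1`), the crude starting value 0.5, and the bisection fallback that keeps the iterate inside :math:`(0, 1)`. The modifications and starting approximation of Cran `et al.` (1977) are not reproduced.

```python
import math


def beta_pdf(x, a, b):
    """Density of the beta distribution, i.e. f'(beta) in the iteration."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm) * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


def beta_cdf(x, a, b, panels=2000):
    """Composite Simpson estimate of P(B <= x : a, b); assumes a, b >= 1."""
    h = x / panels
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for i in range(1, panels):
        total += (4.0 if i % 2 else 2.0) * beta_pdf(i * h, a, b)
    return total * h / 3.0


def beta_deviate_newton(p, a, b, tol=1e-10, max_iter=100):
    """Solve P(B <= x : a, b) = p by Newton-Raphson with a bisection safeguard."""
    lo, hi = 0.0, 1.0   # bracket that always contains the root
    x = 0.5             # crude starting value (assumption)
    for _ in range(max_iter):
        f = beta_cdf(x, a, b) - p
        if abs(f) <= tol:
            return x
        if f > 0.0:
            hi = x
        else:
            lo = x
        step = x - f / beta_pdf(x, a, b)   # Newton-Raphson update
        # Fall back to bisection when the Newton step leaves the bracket.
        x = step if lo < step < hi else 0.5 * (lo + hi)
    return x
```

For a beta distribution with :math:`a = 2`, :math:`b = 1` the distribution function is :math:`x^2`, which gives a simple check on the result.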
def inv_cdf_gamma(p, a, b, tol=0.0): r""" ``inv_cdf_gamma`` returns the deviate associated with the given lower tail probability of the gamma distribution. .. _g01ff-py2-py-doc: For full information please refer to the NAG Library document for g01ff https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fff.html .. _g01ff-py2-py-parameters: **Parameters** **p** : float :math:`p`, the lower tail probability from the required gamma distribution. **a** : float :math:`\alpha`, the shape parameter of the gamma distribution. **b** : float :math:`\beta`, the scale parameter of the gamma distribution. **tol** : float, optional The relative accuracy required by you in the results. The smallest recommended value is :math:`50\times \delta`, where :math:`\delta = \mathrm{max}\left(10^{-18}, \text{machine precision}\right)`. If ``inv_cdf_gamma`` is entered with :math:`\mathrm{tol}` less than :math:`50\times \delta` or greater than or equal to :math:`1.0`, then :math:`50\times \delta` is used instead. **Returns** **x** : float The deviate associated with the given lower tail probability of the gamma distribution. .. _g01ff-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p} < 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{p}\geq 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{a}\leq 10^6`. (`errno` :math:`2`) On entry, :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{b} > 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{a} > 0.0`. (`errno` :math:`3`) The probability is too close to :math:`0.0` for the given :math:`\mathrm{a}` to enable the result to be calculated.
(`errno` :math:`5`) The series used to calculate the gamma function has failed to converge. This is an unlikely error exit. **Warns** **NagAlgorithmicWarning** (`errno` :math:`4`) The algorithm has failed to converge in :math:`100` iterations. A larger value of :math:`\mathrm{tol}` should be tried. The result may be a reasonable approximation. .. _g01ff-py2-py-notes: **Notes** The deviate, :math:`g_p`, associated with the lower tail probability, :math:`p`, of the gamma distribution with shape parameter :math:`\alpha` and scale parameter :math:`\beta`, is defined as the solution to .. math:: P\left({G\leq g_p:\alpha }, \beta \right) = p = \frac{1}{{\beta^{\alpha }\Gamma \left(\alpha \right)}}\int_0^{g_p}e^{{-G/\beta }}G^{{\alpha -1}}{dG}\text{, }\quad 0\leq g_p < \infty \text{;}\alpha,\beta > 0\text{.} The method used is described by Best and Roberts (1975) making use of the relationship between the gamma distribution and the :math:`\chi^2`-distribution. Let :math:`y = 2\frac{g_p}{\beta }`. The required :math:`y` is found from the Taylor series expansion .. math:: y = y_0+\sum_r\frac{{C_r\left(y_0\right)}}{{r!}}\left(\frac{E}{{\phi \left(y_0\right)}}\right)^r\text{,} where :math:`y_0` is a starting approximation :math:`C_1\left(u\right) = 1`, :math:`C_{{r+1}}\left(u\right) = \left(r\Psi +\frac{d}{{du}}\right)C_r\left(u\right)`, :math:`\Psi = \frac{1}{2}-\frac{{\alpha -1}}{u}`, :math:`E = p-\int_0^{y_0}\phi \left(u\right){du}`, :math:`\phi \left(u\right) = \frac{1}{{2^{\alpha }\Gamma \left(\alpha \right)}}e^{{-u/2}}u^{{\alpha -1}}`. For most values of :math:`p` and :math:`\alpha` the starting value .. math:: y_{01} = 2\alpha \left(z\sqrt{\frac{1}{{9\alpha }}}+1-\frac{1}{{9\alpha }}\right)^3 is used, where :math:`z` is the deviate associated with a lower tail probability of :math:`p` for the standard Normal distribution. For :math:`p` close to zero, .. math:: y_{02} = \left(p\alpha 2^{\alpha }\Gamma \left(\alpha \right)\right)^{{1/\alpha }} is used. 
For large :math:`p` values, when :math:`y_{01} > 4.4\alpha +6.0`, .. math:: y_{03} = -2\left[\mathrm{ln}\left(1-p\right)-\left(\alpha -1\right)\mathrm{ln}\left(\frac{1}{2}y_{01}\right)+\mathrm{ln}\left(\Gamma \left(\alpha \right)\right)\right] is found to be a better starting value than :math:`y_{01}`. For small :math:`\alpha` :math:`\left(\alpha \leq 0.16\right)`, :math:`p` is expressed in terms of an approximation to the exponential integral and :math:`y_{04}` is found by Newton--Raphson iterations. Seven terms of the Taylor series are used to refine the starting approximation, repeating the process if necessary until the required accuracy is obtained. .. _g01ff-py2-py-references: **References** Best, D J and Roberts, D E, 1975, `Algorithm AS 91. The percentage points of the` :math:`\chi^2` `distribution`, Appl. Statist. (24), 385--388 """ raise NotImplementedError
[docs]def inv_cdf_studentized_range(p, v, ir): r""" ``inv_cdf_studentized_range`` returns the deviate associated with the lower tail probability of the distribution of the Studentized range statistic. .. _g01fm-py2-py-doc: For full information please refer to the NAG Library document for g01fm https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01fmf.html .. _g01fm-py2-py-parameters: **Parameters** **p** : float The lower tail probability for the Studentized range statistic, :math:`p_0`. **v** : float :math:`v`, the number of degrees of freedom. **ir** : int :math:`r`, the number of groups. **Returns** **x** : float The deviate associated with the lower tail probability of the distribution of the Studentized range statistic. .. _g01fm-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{p} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0 < \mathrm{p} < 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{ir} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{ir} \geq 2`. (`errno` :math:`1`) On entry, :math:`\mathrm{v} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{v}\geq 1.0`. (`errno` :math:`2`) The function was unable to find an upper bound for the value of :math:`q_0`. This will be caused by :math:`p_0` being too close to :math:`1.0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`3`) There is some doubt as to whether full accuracy has been achieved. The returned value should be a reasonable estimate of the true value. .. _g01fm-py2-py-notes: **Notes** The externally Studentized range, :math:`q`, for a sample, :math:`x_1,x_2,\ldots,x_r`, is defined as .. math:: q = \frac{{\mathrm{max}\left(x_i\right)-\mathrm{min}\left(x_i\right)}}{\hat{\sigma }_e}\text{,} where :math:`\hat{\sigma }_e` is an independent estimate of the standard error of the :math:`x_i`. The most common use of this statistic is in the testing of means from a balanced design. 
In this case, for a set of group means, :math:`\bar{T}_1,\bar{T}_2,\ldots,\bar{T}_r`, the Studentized range statistic is defined to be the difference between the largest and smallest means, :math:`\bar{T}_{\text{largest}}` and :math:`\bar{T}_{\text{smallest}}`, divided by the square root of the mean-square experimental error, :math:`MS_{\text{error}}`, over the number of observations in each group, :math:`n`, i.e., .. math:: q = \frac{{\bar{T}_{\text{largest}}-\bar{T}_{\text{smallest}}}}{{\sqrt{MS_{\text{error}}/n}}}\text{.} The Studentized range statistic can be used as part of a multiple comparisons procedure such as the Newman--Keuls procedure or Duncan's multiple range test (see Montgomery (1984) and Winer (1970)). For a Studentized range statistic the probability integral, :math:`P\left({q;v}, r\right)`, for :math:`v` degrees of freedom and :math:`r` groups, can be written as: .. math:: P\left({q;v}, r\right) = C\int_0^{\infty }x^{{v-1}}e^{{-vx^2/2}}\left(r\int_{{-\infty }}^{\infty }\phi \left(y\right){\left(\Phi \left(y\right)-\Phi \left(y-qx\right)\right)}^{{r-1}}{dy}\right){dx}\text{,} where .. math:: C = \frac{v^{{v/2}}}{{\Gamma \left(v/2\right)2^{{v/2-1}}}}\text{, }\quad \phi \left(y\right) = \frac{1}{\sqrt{2\pi }}e^{{-y^2/2}}\quad \text{ and }\quad \Phi \left(y\right) = \int_{{-\infty }}^y\phi \left(t\right){dt}\text{.} For a given probability :math:`p_0`, the deviate :math:`q_0` is found as the solution to the equation .. math:: P\left({q_0\text{;}v}, r\right) = p_0\text{,} using :meth:`roots.contfn_brent_rcomm <naginterfaces.library.roots.contfn_brent_rcomm>`. Initial estimates are found using the approximation given in Lund and Lund (1983) and a simple search procedure. .. _g01fm-py2-py-references: **References** Lund, R E and Lund, J R, 1983, `Algorithm AS 190: probabilities and upper quantiles for the studentized range`, Appl. Statist.
(32(2)), 204--210 Montgomery, D C, 1984, `Design and Analysis of Experiments`, Wiley Winer, B J, 1970, `Statistical Principles in Experimental Design`, McGraw--Hill """ raise NotImplementedError
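As a worked illustration of the balanced-design formula in the Notes of ``inv_cdf_studentized_range``, the sketch below computes :math:`q` from raw group data. The helper name ``studentized_range_statistic`` and the sample data are hypothetical and not part of ``naginterfaces``. The mean-square error is assumed to be the pooled within-group variance on :math:`r(n-1)` degrees of freedom, as in a one-way analysis of variance.

```python
import math


def studentized_range_statistic(groups):
    """Return (q, v, r) for a balanced one-way layout (hypothetical helper)."""
    r = len(groups)
    n = len(groups[0])
    if r < 2 or n < 2 or any(len(g) != n for g in groups):
        raise ValueError("need at least two groups of equal size n >= 2")
    means = [sum(g) / n for g in groups]
    # Pooled within-group sum of squares and its degrees of freedom.
    ss_error = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    v = r * (n - 1)
    ms_error = ss_error / v
    # q = (largest mean - smallest mean) / sqrt(MS_error / n)
    q = (max(means) - min(means)) / math.sqrt(ms_error / n)
    return q, v, r


q, v, r = studentized_range_statistic(
    [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
)
```

The resulting ``q`` could then be compared with the deviate returned by ``inv_cdf_studentized_range(p, v, ir)`` for the same ``v`` and ``ir = r``.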
[docs]def inv_cdf_landau(x): r""" ``inv_cdf_landau`` returns the value of the inverse :math:`\Phi^{-1}\left(x\right)` of the Landau distribution function. .. _g01ft-py2-py-doc: For full information please refer to the NAG Library document for g01ft https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ftf.html .. _g01ft-py2-py-parameters: **Parameters** **x** : float The argument :math:`x` of the function. **Returns** **invlan** : float The value of the inverse :math:`\Phi^{-1}\left(x\right)` of the Landau distribution function. .. _g01ft-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{x} < 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{x} > 0.0`. .. _g01ft-py2-py-notes: **Notes** ``inv_cdf_landau`` evaluates an approximation to the inverse :math:`\Phi^{-1}\left(x\right)` of the Landau distribution function given by .. math:: \Psi \left(x\right) = \Phi^{-1}\left(x\right) (where :math:`\Phi \left(\lambda \right)` is described in :meth:`prob_landau` and :meth:`pdf_landau`), using either linear or quadratic interpolation or rational approximations which mimic the asymptotic behaviour. Further details can be found in Kölbig and Schorr (1984). It can also be used to generate Landau distributed random numbers in the range :math:`0 < x < 1`. .. _g01ft-py2-py-references: **References** Kölbig, K S and Schorr, B, 1984, `A program package for the Landau distribution`, Comp. Phys. Comm. (31), 97--111 """ raise NotImplementedError
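The remark in the Notes of ``inv_cdf_landau`` about generating random numbers refers to inverse-transform sampling: applying an inverse distribution function to uniform variates on :math:`(0, 1)`. A minimal sketch follows. ``sample_by_inversion`` is a hypothetical helper, and a unit-exponential inverse stands in for the Landau inverse so that the block runs without the NAG Library.

```python
import math
import random


def sample_by_inversion(inv_cdf, size, seed=None):
    """Draw `size` variates by applying an inverse CDF to uniforms on (0, 1)."""
    rng = random.Random(seed)
    out = []
    while len(out) < size:
        u = rng.random()  # uniform on [0, 1)
        if u > 0.0:  # the documented constraint is 0 < x < 1, so reject u == 0
            out.append(inv_cdf(u))
    return out


# Stand-in inverse CDF (unit exponential); an assumption made for illustration.
exp_draws = sample_by_inversion(lambda u: -math.log1p(-u), 20000, seed=12345)
exp_mean = sum(exp_draws) / len(exp_draws)
```

With ``naginterfaces`` installed and licensed, passing ``inv_cdf_landau`` as ``inv_cdf`` would be expected to give Landau-distributed variates in the same way.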
[docs]def prob_students_t_noncentral(t, df, delta, tol=0.0, maxit=100): r""" ``prob_students_t_noncentral`` returns the lower tail probability for the noncentral Student's :math:`t`-distribution. .. _g01gb-py2-py-doc: For full information please refer to the NAG Library document for g01gb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gbf.html .. _g01gb-py2-py-parameters: **Parameters** **t** : float :math:`t`, the deviate from the Student's :math:`t`-distribution with :math:`\nu` degrees of freedom. **df** : float :math:`\nu`, the degrees of freedom of the Student's :math:`t`-distribution. **delta** : float :math:`\delta`, the noncentrality parameter of the Student's :math:`t`-distribution. **tol** : float, optional The absolute accuracy required by you in the results. If ``prob_students_t_noncentral`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. **maxit** : int, optional The maximum number of terms that are used in each of the summations. **Returns** **p** : float The lower tail probability for the noncentral Student's :math:`t`-distribution. .. _g01gb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df}\geq 1.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{maxit} \geq 1`. (`errno` :math:`4`) Unable to calculate the probability as it is too close to zero or one. **Warns** **NagAlgorithmicWarning** (`errno` :math:`3`) One of the series has failed to converge with :math:`\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{tol} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Reconsider the requested tolerance and/or the maximum number of iterations. (`errno` :math:`4`) The probability is too close to :math:`0` or :math:`1`. The returned value should be a reasonable estimate of the true value. .. _g01gb-py2-py-notes: **Notes** The lower tail probability of the noncentral Student's :math:`t`-distribution with :math:`\nu` degrees of freedom and noncentrality parameter :math:`\delta`, :math:`P\left(T\leq t:\nu \text{;}\delta \right)`, is defined by .. math:: P\left(T\leq t:\nu \text{;}\delta \right) = C_{\nu }\int_0^{\infty }\left(\frac{1}{\sqrt{2\pi }}\int_{{-\infty }}^{{\alpha u-\delta }}e^{{-x^2/2}}{dx}\right)u^{{\nu -1}}e^{{-u^2/2}}du\text{, }\quad \nu > 0.0 with .. math:: C_{\nu } = \frac{1}{{\Gamma \left(\frac{1}{2}\nu \right)2^{{\left(\nu -2\right)/2}}}}\text{, }\quad \alpha = \frac{t}{\sqrt{\nu }}\text{.} The probability is computed in one of two ways. (i) When :math:`t = 0.0`, the relationship to the normal is used: .. math:: P\left(T\leq t:\nu \text{;}\delta \right) = \frac{1}{\sqrt{2\pi }}\int_{\delta }^{\infty }e^{{-u^2/2}}{du}\text{.} (ii) Otherwise the series expansion described in Equation 9 of Amos (1964) is used. This involves the sums of confluent hypergeometric functions, the terms of which are computed using recurrence relationships. .. _g01gb-py2-py-references: **References** Amos, D E, 1964, `Representations of the central and non-central` :math:`t` `-distributions`, Biometrika (51), 451--458 """ raise NotImplementedError
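The defining integral in the Notes of ``prob_students_t_noncentral`` can be checked numerically with a brute-force Simpson rule. This is only an illustration of the definition and of case (i); it is not the Amos (1964) series that the NAG routine is documented to use. The names ``norm_cdf`` and ``nct_cdf_quadrature``, the truncation point and the number of intervals are all assumptions made for the sketch.

```python
import math


def norm_cdf(z):
    """Standard Normal lower tail probability."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def nct_cdf_quadrature(t, df, delta, intervals=4000):
    """Simpson evaluation of the defining integral (illustration only).

    Assumes df >= 1 and an even number of intervals.
    """
    if t == 0.0:
        # Case (i): the integral over [delta, infinity) of the Normal density.
        return norm_cdf(-delta)
    alpha = t / math.sqrt(df)
    # log of C_nu = 1 / (Gamma(nu/2) * 2**((nu - 2)/2))
    log_c = -math.lgamma(0.5 * df) - (0.5 * df - 1.0) * math.log(2.0)
    # Assumed truncation point: u**(df-1) * exp(-u*u/2) is negligible beyond it.
    upper = math.sqrt(df) + 12.0
    h = upper / intervals

    def integrand(u):
        return norm_cdf(alpha * u - delta) * u ** (df - 1.0) * math.exp(-0.5 * u * u)

    total = integrand(0.0) + integrand(upper)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return math.exp(log_c) * total * h / 3.0


p_cauchy = nct_cdf_quadrature(1.0, 1.0, 0.0)    # central t, 1 d.f., at t = 1
p_near_zero = nct_cdf_quadrature(1e-9, 5.0, 1.3)  # should approach case (i)
```

With :math:`\delta = 0` and :math:`\nu = 1` the distribution is Cauchy, so ``p_cauchy`` should be close to :math:`0.75`; ``p_near_zero`` should be close to ``norm_cdf(-1.3)``.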
[docs]def prob_chisq_noncentral(x, df, rlamda, tol=0.0, maxit=100): r""" ``prob_chisq_noncentral`` returns the probability associated with the lower tail of the noncentral :math:`\chi^2`-distribution. .. _g01gc-py2-py-doc: For full information please refer to the NAG Library document for g01gc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gcf.html .. _g01gc-py2-py-parameters: **Parameters** **x** : float The deviate from the noncentral :math:`\chi^2`-distribution with :math:`\nu` degrees of freedom and noncentrality parameter :math:`\lambda`. **df** : float :math:`\nu`, the degrees of freedom of the noncentral :math:`\chi^2`-distribution. **rlamda** : float :math:`\lambda`, the noncentrality parameter of the noncentral :math:`\chi^2`-distribution. **tol** : float, optional The required accuracy of the solution. If ``prob_chisq_noncentral`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. **maxit** : int, optional The maximum number of iterations to be performed. **Returns** **p** : float The probability associated with the lower tail of the noncentral :math:`\chi^2`-distribution. .. _g01gc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{df} = 0.0` and :math:`\mathrm{rlamda} = 0.0`. Constraint: :math:`\mathrm{rlamda} > 0.0` if :math:`\mathrm{df} = 0.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{maxit} \geq 1`. (`errno` :math:`1`) On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{x}\geq 0.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{rlamda} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{rlamda}\geq 0.0`. 
(`errno` :math:`1`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df}\geq 0.0`. (`errno` :math:`2`) The initial value of the Poisson weight used in the summation of Equation `(1) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gcf.html#eqn1>`__ (see :ref:`Notes <g01gc-py2-py-notes>`) was too small to be calculated. The computed probability is likely to be zero. (`errno` :math:`3`) The solution has failed to converge in :math:`\langle\mathit{\boldsymbol{value}}\rangle` iterations. Consider increasing :math:`\mathrm{maxit}` or :math:`\mathrm{tol}`. (`errno` :math:`4`) The value of a term required in Equation `(2) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gcf.html#eqn2>`__ (see :ref:`Notes <g01gc-py2-py-notes>`) is too large to be evaluated accurately. The most likely cause of this error is that both :math:`\mathrm{x}` and :math:`\mathrm{rlamda}` are too large. (`errno` :math:`5`) The calculation of the central chi-square probability has failed to converge. A larger value of :math:`\mathrm{tol}` should be used. .. _g01gc-py2-py-notes: **Notes** The lower tail probability of the noncentral :math:`\chi^2`-distribution with :math:`\nu` degrees of freedom and noncentrality parameter :math:`\lambda`, :math:`P\left(X\leq x:\nu \text{;}\lambda \right)`, is defined by .. math:: P\left(X\leq x:\nu \text{;}\lambda \right) = \sum_{{j = 0}}^{\infty }e^{{-\lambda /2}}\frac{{\left(\lambda /2\right)}^j}{{j!}}P\left(X\leq x:\nu +2j\text{;}0\right)\text{,} where :math:`P\left(X\leq x:\nu +2j\text{;}0\right)` is a central :math:`\chi^2`-distribution with :math:`\nu +2j` degrees of freedom. The value of :math:`j` at which the Poisson weight, :math:`e^{{-\lambda /2}}\frac{{\left(\lambda /2\right)}^j}{{j!}}`, is greatest is determined and the summation `(1) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gcf.html#eqn1>`__ is made forward and backward from that value of :math:`j`. 
The recursive relationship: .. math:: P\left(X\leq x:a+2\text{;}0\right) = P\left(X\leq x:a\text{;}0\right)-\frac{{\left(x/2\right)^{{a/2}}e^{{-x/2}}}}{{\Gamma \left(a/2+1\right)}} is used during the summation in `(1) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gcf.html#eqn1>`__. .. _g01gc-py2-py-references: **References** NIST Digital Library of Mathematical Functions """ raise NotImplementedError
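The Poisson-weighted sum that defines the noncentral :math:`\chi^2` lower tail probability can be sketched with the standard library alone. The sketch makes several simplifying assumptions that differ from the documented routine: it sums forward from :math:`j = 0` rather than outward from the largest Poisson weight, it evaluates each central probability from a power series instead of a recurrence, and it requires ``df > 0`` with moderate ``x`` and ``lam``. The function names are hypothetical.

```python
import math


def central_chisq_cdf(x, df):
    """Central chi-square lower tail via the series for P(df/2, x/2); df > 0."""
    if x <= 0.0:
        return 0.0
    s, z = 0.5 * df, 0.5 * x
    term = 1.0 / s
    total = term
    k = 1
    # Sum z**k / (s (s+1) ... (s+k)) until the terms stop contributing.
    while term > 1e-17 * total and k < 100000:
        term *= z / (s + k)
        total += term
        k += 1
    return total * math.exp(s * math.log(z) - z - math.lgamma(s))


def ncx2_cdf(x, df, lam, tol=1e-12, maxit=1000):
    """Noncentral chi-square lower tail as a Poisson mixture (sketch)."""
    half = 0.5 * lam
    weight = math.exp(-half)  # Poisson weight for j = 0
    total = 0.0
    remaining = 1.0  # Poisson mass not yet used; bounds the truncation error
    for j in range(maxit):
        total += weight * central_chisq_cdf(x, df + 2 * j)
        remaining -= weight
        if remaining < tol:
            break
        weight *= half / (j + 1)
    return total


p_example = ncx2_cdf(3.0, 1.0, 2.0)
```

For one degree of freedom the result can be checked against the closed form :math:`\Phi(\sqrt{x}-\sqrt{\lambda}) - \Phi(-\sqrt{x}-\sqrt{\lambda})`, where :math:`\Phi` is the standard Normal distribution function.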
[docs]def prob_f_noncentral(f, df1, df2, rlamda, tol=0.0, maxit=500): r""" ``prob_f_noncentral`` returns the probability associated with the lower tail of the noncentral :math:`F` or variance-ratio distribution. .. _g01gd-py2-py-doc: For full information please refer to the NAG Library document for g01gd https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gdf.html .. _g01gd-py2-py-parameters: **Parameters** **f** : float :math:`f`, the deviate from the noncentral :math:`F`-distribution. **df1** : float The degrees of freedom of the numerator variance, :math:`\nu_1`. **df2** : float The degrees of freedom of the denominator variance, :math:`\nu_2`. **rlamda** : float :math:`\lambda`, the noncentrality parameter. **tol** : float, optional The relative accuracy required by you in the results. If ``prob_f_noncentral`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. **maxit** : int, optional The maximum number of iterations to be used. **Returns** **p** : float The probability associated with the lower tail of the noncentral :math:`F` or variance-ratio distribution. .. _g01gd-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{df2} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df2} > 0.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{rlamda} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0\leq \mathrm{rlamda}\leq {-2.0}\times \log\left(U\right)`, where :math:`U` is the safe range parameter as defined by :meth:`machine.real_safe <naginterfaces.library.machine.real_safe>`. (`errno` :math:`1`) On entry, :math:`\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{maxit} \geq 1`. 
(`errno` :math:`1`) On entry, :math:`\mathrm{df1} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0 < \mathrm{df1}\leq 10^6`. (`errno` :math:`1`) On entry, :math:`\mathrm{f} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{f} > 0.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{df1} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df1} > 0.0`. (`errno` :math:`2`) The solution has failed to converge in :math:`\langle\mathit{\boldsymbol{value}}\rangle` iterations. Consider increasing :math:`\mathrm{maxit}` or :math:`\mathrm{tol}`. (`errno` :math:`3`) The required probability cannot be computed accurately. This may happen if the result would be very close to zero or one. Alternatively, the values of :math:`\mathrm{df1}` and :math:`\mathrm{f}` may be too large. In the latter case you could try using a normal approximation; see Abramowitz and Stegun (1972). **Warns** **NagAlgorithmicWarning** (`errno` :math:`4`) The required accuracy was not achieved when calculating the initial value of the central :math:`F` or :math:`\chi^2` probability. You should try a larger value of :math:`\mathrm{tol}`. If the :math:`\chi^2` approximation is being used then ``prob_f_noncentral`` returns zero; otherwise the value returned should be an approximation to the correct value. .. _g01gd-py2-py-notes: **Notes** The lower tail probability of the noncentral :math:`F`-distribution with :math:`\nu_1` and :math:`\nu_2` degrees of freedom and noncentrality parameter :math:`\lambda`, :math:`P\left({F\leq f:\nu_1}, {\nu_2\text{;}\lambda }\right)`, is defined by .. math:: P\left({F\leq f:\nu_1}, {\nu_2\text{;}\lambda }\right) = \int_0^f p\left({F:\nu_1}, {\nu_2\text{;}\lambda }\right){dF}\text{,} where .. 
math:: P\left({F:\nu_1}, {\nu_2\text{;}\lambda }\right) = \sum_{{j = 0}}^{\infty }e^{{-\lambda /2}}\frac{{\left(\lambda /2\right)}^j}{{j!}}\times \frac{{\left(\nu_1+2j\right)^{{\left(\nu_1+2j\right)/2}}\nu_2^{{\nu_2/2}}}}{{B\left({\left(\nu_1+2j\right)/2}, {\nu_2/2}\right)}} .. math:: \times u^{{\left(\nu_1+2j-2\right)/2}}{\left[\nu_2+\left(\nu_1+2j\right)u\right]}^{{-\left(\nu_1+2j+\nu_2\right)/2}} and :math:`B\left(·, ·\right)` is the beta function. The probability is computed by means of a transformation to a noncentral beta distribution: .. math:: P\left({F\leq f:\nu_1}, {\nu_2\text{;}\lambda }\right) = P_{\beta }\left({X\leq x:a}, {b\text{;}\lambda }\right)\text{,} where :math:`x = \frac{{\nu_1f}}{{\nu_1f+\nu_2}}` and :math:`P_{\beta }\left({X\leq x:a}, {b\text{;}\lambda }\right)` is the lower tail probability integral of the noncentral beta distribution with parameters :math:`a`, :math:`b`, and :math:`\lambda`. If :math:`\nu_2` is very large, greater than :math:`10^6`, then a :math:`\chi^2` approximation is used. .. _g01gd-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications """ raise NotImplementedError
[docs]def prob_beta_noncentral(x, a, b, rlamda, tol=0.0, maxit=500): r""" ``prob_beta_noncentral`` returns the probability associated with the lower tail of the noncentral beta distribution. .. _g01ge-py2-py-doc: For full information please refer to the NAG Library document for g01ge https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gef.html .. _g01ge-py2-py-parameters: **Parameters** **x** : float :math:`\beta`, the deviate from the beta distribution, for which the probability :math:`P\left({B\leq \beta :a}, {b\text{;}\lambda }\right)` is to be found. **a** : float :math:`a`, the first parameter of the required beta distribution. **b** : float :math:`b`, the second parameter of the required beta distribution. **rlamda** : float :math:`\lambda`, the noncentrality parameter of the required beta distribution. **tol** : float, optional The relative accuracy required by you in the results. If ``prob_beta_noncentral`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. See `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gef.html#accuracy>`__ for the relationship between :math:`\mathrm{tol}` and :math:`\mathrm{maxit}`. **maxit** : int, optional The maximum number of iterations that the algorithm should use. See `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gef.html#accuracy>`__ for suggestions as to suitable values for :math:`\mathrm{maxit}` for different values of the arguments. **Returns** **p** : float The probability associated with the lower tail of the noncentral beta distribution. .. _g01ge-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0 < \mathrm{b}\leq 10^6`. 
(`errno` :math:`1`) On entry, :math:`\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{maxit} \geq 1`. (`errno` :math:`1`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0 < \mathrm{a}\leq 10^6`. (`errno` :math:`1`) On entry, :math:`\mathrm{x} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0\leq \mathrm{x}\leq 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{rlamda} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`0.0\leq \mathrm{rlamda}\leq {-2.0}\log\left(U\right)`, where :math:`U` is the safe range parameter as defined by :meth:`machine.real_safe <naginterfaces.library.machine.real_safe>`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`2`) The solution has failed to converge in :math:`\langle\mathit{\boldsymbol{value}}\rangle` iterations. Consider increasing :math:`\mathrm{maxit}` or :math:`\mathrm{tol}`. The returned value will be an approximation to the correct value. (`errno` :math:`3`) The probability is too close to :math:`0.0` or :math:`1.0` for the algorithm to be able to calculate the required probability. ``prob_beta_noncentral`` will return :math:`0.0` or :math:`1.0` as appropriate. This should be a reasonable approximation. (`errno` :math:`4`) The required accuracy was not achieved when calculating the initial value of the beta distribution. You should try a larger value of :math:`\mathrm{tol}`. The returned value will be an approximation to the correct value. .. _g01ge-py2-py-notes: **Notes** The lower tail probability for the noncentral beta distribution with parameters :math:`a` and :math:`b` and noncentrality parameter :math:`\lambda`, :math:`P\left({B\leq \beta :a}, {b\text{;}\lambda }\right)`, is defined by .. math:: P\left({B\leq \beta :a}, {b\text{;}\lambda }\right) = \sum_{{j = 0}}^{\infty }e^{{-\lambda /2}}\frac{{\left(\lambda /2\right)}^j}{{j!}}P\left({B\leq \beta :a+j}, {b\text{;}0}\right)\text{,} where .. 
math:: P\left({B\leq \beta :a}, {b\text{;}0}\right) = \frac{{\Gamma \left(a+b\right)}}{{\Gamma \left(a\right)\Gamma \left(b\right)}}\int_0^{{\beta }}B^{{a-1}}\left(1-B\right)^{{b-1}}{dB}\text{,} which is the central beta probability function or incomplete beta function. Recurrence relationships given in Abramowitz and Stegun (1972) are used to compute the values of :math:`P\left({B\leq \beta :a}, {b\text{;}0}\right)` for each step of the summation `(1) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01gef.html#eqn1>`__. The algorithm is discussed in Lenth (1987). .. _g01ge-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Lenth, R V, 1987, `Algorithm AS 226: Computing noncentral beta probabilities`, Appl. Statist. (36), 241--244 """ raise NotImplementedError
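The summation for the noncentral beta probability can be sketched as a Poisson mixture in which the :math:`j`-th term uses the central beta probability with first parameter :math:`a + j`. This sketch is not the documented algorithm: it evaluates every central term afresh with a continued fraction for the regularized incomplete beta function, rather than using the recurrence relationships mentioned above. The function names, tolerances and iteration limits are hypothetical.

```python
import math


def _beta_cf(a, b, x, max_iter=300, eps=1e-15, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step, then odd step, of the continued fraction.
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def central_beta_cdf(x, a, b):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def ncbeta_cdf(x, a, b, lam, tol=1e-12, maxit=500):
    """Noncentral beta lower tail as a Poisson mixture (sketch)."""
    half = 0.5 * lam
    weight = math.exp(-half)  # Poisson weight for j = 0
    total = 0.0
    remaining = 1.0  # unused Poisson mass; bounds the truncation error
    for j in range(maxit):
        total += weight * central_beta_cdf(x, a + j, b)
        remaining -= weight
        if remaining < tol:
            break
        weight *= half / (j + 1)
    return total


p_example = ncbeta_cdf(0.4, 2.5, 1.0, 3.0)
```

When :math:`b = 1` the central term is :math:`\beta^{a+j}`, so the sum has the closed form :math:`\beta^a e^{-\lambda(1-\beta)/2}`, which gives a convenient check on ``p_example``.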
[docs]def prob_bivariate_normal(x, y, rho): r""" ``prob_bivariate_normal`` returns the lower tail probability for the bivariate Normal distribution. .. _g01ha-py2-py-doc: For full information please refer to the NAG Library document for g01ha https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01haf.html .. _g01ha-py2-py-parameters: **Parameters** **x** : float :math:`x`, the first argument for which the bivariate Normal distribution function is to be evaluated. **y** : float :math:`y`, the second argument for which the bivariate Normal distribution function is to be evaluated. **rho** : float :math:`\rho`, the correlation coefficient. **Returns** **p** : float The lower tail probability for the bivariate Normal distribution. .. _g01ha-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{rho} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{rho}\leq 1.0`. (`errno` :math:`1`) On entry, :math:`\mathrm{rho} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{rho} \geq {-1.0}`. .. _g01ha-py2-py-notes: **Notes** For the two random variables :math:`\left(X, Y\right)` following a bivariate Normal distribution with .. math:: E\left[X\right] = 0\text{, }\quad E\left[Y\right] = 0\text{, }\quad E\left[X^2\right] = 1\text{, }\quad E\left[Y^2\right] = 1\quad \text{ and }\quad E\left[XY\right] = \rho \text{,} the lower tail probability is defined by: .. math:: P\left({X\leq x}, {Y\leq y:\rho }\right) = \frac{1}{{2\pi \sqrt{1-\rho^2}}}\int_{{-\infty }}^y\int_{{-\infty }}^x\mathrm{exp}\left(-\frac{\left(X^2-2\rho XY+Y^2\right)}{{2\left(1-\rho^2\right)}}\right)dXdY\text{.} For a more detailed description of the bivariate Normal distribution and its properties see Abramowitz and Stegun (1972) and Kendall and Stuart (1969). The method used is described by Genz (2004). .. 
_g01ha-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Genz, A, 2004, `Numerical computation of rectangular bivariate and trivariate Normal and` :math:`t` `probabilities`, Statistics and Computing (14), 151--160 Kendall, M G and Stuart, A, 1969, `The Advanced Theory of Statistics (Volume 1)`, (3rd Edition), Griffin """ raise NotImplementedError
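The double integral in the Notes of ``prob_bivariate_normal`` can be reduced to a single integral over the correlation, :math:`\Phi(x)\Phi(y) + \frac{1}{2\pi}\int_0^{\arcsin \rho}\exp\left(-\frac{x^2+y^2-2xy\sin\theta}{2\cos^2\theta}\right)d\theta`, which follows from differentiating the probability with respect to :math:`\rho`. The sketch below applies a plain Simpson rule to that representation. It is not the Genz (2004) implementation, it assumes :math:`-1 < \rho < 1`, and accuracy degrades as :math:`\lvert\rho\rvert` approaches one. The function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard Normal lower tail probability."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bvn_lower_tail(x, y, rho, intervals=2000):
    """P(X <= x, Y <= y) for a standard bivariate Normal (sketch)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("this sketch assumes -1 < rho < 1")
    theta_max = math.asin(rho)
    h = theta_max / intervals  # negative when rho < 0, which flips the sign

    def integrand(theta):
        c2 = math.cos(theta) ** 2
        return math.exp(-(x * x + y * y - 2.0 * x * y * math.sin(theta)) / (2.0 * c2))

    total = integrand(0.0) + integrand(theta_max)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return norm_cdf(x) * norm_cdf(y) + total * h / (6.0 * math.pi)


p_origin = bvn_lower_tail(0.0, 0.0, 0.5)
```

At the origin the probability is :math:`\frac{1}{4} + \frac{\arcsin \rho}{2\pi}`, so ``p_origin`` should be close to :math:`1/3`.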
[docs]def prob_multi_normal(xmu, sig, a=None, b=None, tol=0.0001): r""" ``prob_multi_normal`` returns the upper tail, lower tail or central probability associated with a multivariate Normal distribution of up to ten dimensions. .. _g01hb-py2-py-doc: For full information please refer to the NAG Library document for g01hb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01hbf.html .. _g01hb-py2-py-parameters: **Parameters** **xmu** : float, array-like, shape :math:`\left(n\right)` :math:`\mu`, the mean vector of the multivariate Normal distribution. **sig** : float, array-like, shape :math:`\left(n, n\right)` :math:`\Sigma`, the variance-covariance matrix of the multivariate Normal distribution. Only the lower triangle is referenced. **a** : None or float, array-like, shape :math:`\left(n\right)`, optional If upper tail or central probabilities are to be returned, :math:`\mathrm{a}` should supply the lower bounds, :math:`a_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`. **b** : None or float, array-like, shape :math:`\left(n\right)`, optional If lower tail or central probabilities are to be returned, :math:`\mathrm{b}` should supply the upper bounds, :math:`b_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`. **tol** : float, optional If :math:`n > 2` the relative accuracy required for the probability, and if the upper or the lower tail probability is requested then :math:`\mathrm{tol}` is also used to determine the cut-off points, see `Accuracy <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01hbf.html#accuracy>`__. If :math:`n = 1`, :math:`\mathrm{tol}` is not referenced. **Returns** **p** : float The upper tail, lower tail or central probability associated with the multivariate Normal distribution. .. _g01hb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`1\leq n\leq 10`. 
(`errno` :math:`1`) On entry, :math:`\textit{tail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{tail} = \texttt{'L'}`, :math:`\texttt{'U'}` or :math:`\texttt{'C'}`. (`errno` :math:`1`) On entry, :math:`\mathrm{tol} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{tol} > 0.0`. (`errno` :math:`2`) On entry, the :math:`\langle\mathit{\boldsymbol{value}}\rangle`:math:`\langle\mathit{\boldsymbol{value}}\rangle` value in :math:`\mathrm{b}` is less than or equal to the corresponding value in :math:`\mathrm{a}`. (`errno` :math:`3`) On entry, :math:`\mathrm{sig}` is not positive definite. **Warns** **NagAlgorithmicWarning** (`errno` :math:`4`) Full accuracy not achieved, relative accuracy :math:`\text{} = \langle\mathit{\boldsymbol{value}}\rangle`. (`errno` :math:`5`) Accuracy requested by :math:`\mathrm{tol}` is too strict: :math:`\mathrm{tol} = \langle\mathit{\boldsymbol{value}}\rangle`. .. _g01hb-py2-py-notes: **Notes** Let the vector random variable :math:`X = \left(X_1, X_2, \ldots, X_n\right)^\mathrm{T}` follow an :math:`n`-dimensional multivariate Normal distribution with mean vector :math:`\mu` and :math:`n\times n` variance-covariance matrix :math:`\Sigma`, then the probability density function, :math:`f\left({X:\mu }, \Sigma \right)`, is given by .. math:: f\left({X:\mu }, \Sigma \right) = \left(2\pi \right)^{{-\left(1/2\right)n}}\left\lvert \Sigma \right\rvert^{{-1/2}}\mathrm{exp}\left(-\frac{1}{2}\left(X-\mu \right)^\mathrm{T}\Sigma^{-1}\left(X-\mu \right)\right)\text{.} The lower tail probability is defined by: .. math:: P\left({X_1\leq b_1}, \ldots, {X_n\leq b_n:\mu }, \Sigma \right) = \int_{{-\infty }}^{{b_1}} \cdots \int_{{-\infty }}^{{b_n}}f\left({X:\mu }, {\Sigma }\right){dX_n} \cdots {dX_1}\text{.} The upper tail probability is defined by: .. 
math:: P\left({X_1\geq a_1}, \ldots, {X_n\geq a_n:\mu }, \Sigma \right) = \int_{a_1}^{\infty } \cdots \int_{a_n}^{\infty }f\left({X:\mu }, \Sigma \right){dX_n} \cdots {dX_1}\text{.} The central probability is defined by: .. math:: P\left({a_1\leq X_1\leq b_1}, \ldots, {a_n\leq X_n\leq b_n:\mu }, {\Sigma }\right) = \int_{a_1}^{{b_1}} \cdots \int_{a_n}^{{b_n}}f\left({X:\mu }, {\Sigma }\right){dX_n} \cdots {dX_1}\text{.} To evaluate the probability for :math:`n\geq 3`, the probability density function of :math:`X_1,X_2,\ldots,X_n` is considered as the product of the conditional probability of :math:`X_1,X_2,\ldots,X_{{n-2}}` given :math:`X_{{n-1}}` and :math:`X_n` and the marginal bivariate Normal distribution of :math:`X_{{n-1}}` and :math:`X_n`. The bivariate Normal probability can be evaluated as described in :meth:`prob_bivariate_normal` and numerical integration is then used over the remaining :math:`n-2` dimensions. In the case of :math:`n = 3`, :meth:`quad.dim1_fin_bad <naginterfaces.library.quad.dim1_fin_bad>` is used and for :math:`n > 3` :meth:`quad.md_adapt <naginterfaces.library.quad.md_adapt>` is used. To evaluate the probability for :math:`n = 1` a direct call to :meth:`prob_normal` is made and for :math:`n = 2` calls to :meth:`prob_bivariate_normal` are made. .. _g01hb-py2-py-references: **References** Kendall, M G and Stuart, A, 1969, `The Advanced Theory of Statistics (Volume 1)`, (3rd Edition), Griffin """ raise NotImplementedError
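For :math:`n = 1` the three probabilities reduce to differences of the univariate Normal distribution function. The sketch below shows that case with the standard library only. The way the tail is inferred from which of ``a`` and ``b`` is ``None`` is an assumption based on the parameter descriptions of ``prob_multi_normal``, and the helper name ``univariate_normal_prob`` is hypothetical.

```python
import math


def norm_cdf(z):
    """Standard Normal lower tail probability."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def univariate_normal_prob(mu, var, a=None, b=None):
    """n = 1 case: only b -> lower tail, only a -> upper tail, both -> central."""
    if var <= 0.0:
        raise ValueError("the variance must be positive")
    if a is None and b is None:
        raise ValueError("supply a, b or both")
    sd = math.sqrt(var)
    if a is None:
        return norm_cdf((b - mu) / sd)      # lower tail, X <= b
    if b is None:
        return norm_cdf(-(a - mu) / sd)     # upper tail, X >= a
    if b <= a:
        raise ValueError("b must exceed a for a central probability")
    return norm_cdf((b - mu) / sd) - norm_cdf((a - mu) / sd)


p_lower = univariate_normal_prob(0.0, 1.0, b=0.0)
p_upper = univariate_normal_prob(0.0, 1.0, a=0.0)
p_central = univariate_normal_prob(1.0, 4.0, a=-1.0, b=3.0)  # one s.d. either side
```

Here ``p_lower`` and ``p_upper`` are both one half, and ``p_central`` is the probability of lying within one standard deviation of the mean.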
[docs]def prob_bivariate_students_t(df, rho, a=None, b=None): r""" ``prob_bivariate_students_t`` returns probabilities for the bivariate Student's :math:`t`-distribution. .. _g01hc-py2-py-doc: For full information please refer to the NAG Library document for g01hc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01hcf.html .. _g01hc-py2-py-parameters: **Parameters** **df** : int :math:`\nu`, the degrees of freedom of the bivariate Student's :math:`t`-distribution. **rho** : float :math:`\rho`, the correlation of the bivariate Student's :math:`t`-distribution. **a** : None or float, array-like, shape :math:`\left(2\right)`, optional If upper tail or central probabilities are to be returned, :math:`\mathrm{a}` should supply the lower bounds, :math:`a_{\textit{i}}`, for :math:`\textit{i} = 1,2`. **b** : None or float, array-like, shape :math:`\left(2\right)`, optional If lower tail or central probabilities are to be returned, :math:`\mathrm{b}` should supply the upper bounds, :math:`b_{\textit{i}}`, for :math:`\textit{i} = 1,2`. **Returns** **p** : float The probabilities for the bivariate Student's :math:`t`-distribution. .. _g01hc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\textit{tail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{tail} = \texttt{'L'}`, :math:`\texttt{'U'}` or :math:`\texttt{'C'}`. (`errno` :math:`3`) On entry, :math:`\mathrm{b}[i-1]\leq \mathrm{a}[i-1]` for central probability, for some :math:`i = 1,2`. (`errno` :math:`4`) On entry, :math:`\mathrm{df} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{df}\geq 1`. (`errno` :math:`5`) On entry, :math:`\mathrm{rho} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`{-1.0}\leq \mathrm{rho}\leq 1.0`. .. 
_g01hc-py2-py-notes: **Notes** Let the vector random variable :math:`X = \left(X_1, X_2\right)^\mathrm{T}` follow a bivariate Student's :math:`t`-distribution with degrees of freedom :math:`\nu` and correlation :math:`\rho`, then the probability density function, :math:`f\left(X:\nu,\rho \right)`, is given by .. math:: f\left(X:\nu,\rho \right) = \frac{1}{{2\pi \sqrt{1-\rho^2}}}\left(1+\frac{{X_1^2+X_2^2-2\rho X_1X_2}}{{\nu \left(1-\rho^2\right)}}\right)^{{-\nu /2-1}}\text{.} The lower tail probability is defined by: .. math:: P\left({X_1\leq b_1},{X_2\leq b_2}:\nu,\rho \right) = \int_{{-\infty }}^{b_1}\int_{{-\infty }}^{b_2}f\left(X:\nu,\rho \right)dX_2dX_1\text{.} The upper tail probability is defined by: .. math:: P\left({X_1\geq a_1},{X_2\geq a_2}:\nu,\rho \right) = \int_{a_1}^{\infty }\int_{a_2}^{\infty }f\left(X:\nu,\rho \right)dX_2dX_1\text{.} The central probability is defined by: .. math:: P\left({a_1\leq X_1\leq b_1},{a_2\leq X_2\leq b_2}:\nu,\rho \right) = \int_{a_1}^{b_1}\int_{a_2}^{b_2}f\left(X:\nu,\rho \right)dX_2dX_1\text{.} Calculations use the Dunnett and Sobel (1954) method, as described by Genz (2004). .. _g01hc-py2-py-references: **References** Dunnett, C W and Sobel, M, 1954, `A bivariate generalization of Student's` :math:`t` `-distribution, with tables for certain special cases`, Biometrika (41), 153--169 Genz, A, 2004, `Numerical computation of rectangular bivariate and trivariate Normal and` :math:`t` `probabilities`, Statistics and Computing (14), 151--160 """ raise NotImplementedError
[docs]def prob_multi_students_t(tail, a, b, nu, delta, iscov, rc, epsabs=0.0, epsrel=0.001, numsub=350, nsampl=8, fmax=None): r""" ``prob_multi_students_t`` returns a probability associated with a multivariate Student's :math:`t`-distribution. .. _g01hd-py2-py-doc: For full information please refer to the NAG Library document for g01hd https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01hdf.html .. _g01hd-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(n\right)` Defines the calculated probability, set :math:`\mathrm{tail}[i-1]` to: :math:`\mathrm{tail}[i-1] = \texttt{'L'}` If the :math:`i`\ th lower limit :math:`a_i` is negative infinity. :math:`\mathrm{tail}[i-1] = \texttt{'U'}` If the :math:`i`\ th upper limit :math:`b_i` is infinity. :math:`\mathrm{tail}[i-1] = \texttt{'C'}` If both :math:`a_i` and :math:`b_i` are finite. **a** : float, array-like, shape :math:`\left(n\right)` :math:`a_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`, the lower integral limits of the calculation. If :math:`\mathrm{tail}[i-1] = \texttt{'L'}`, :math:`\mathrm{a}[i-1]` is not referenced and the :math:`i`\ th lower limit of integration is :math:`-\infty`. **b** : float, array-like, shape :math:`\left(n\right)` :math:`b_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`, the upper integral limits of the calculation. If :math:`\mathrm{tail}[i-1] = \texttt{'U'}`, :math:`\mathrm{b}[i-1]` is not referenced and the :math:`i`\ th upper limit of integration is :math:`\infty`. **nu** : float :math:`\nu`, the degrees of freedom. **delta** : float, array-like, shape :math:`\left(n\right)` :math:`\mathrm{delta}[\textit{i}-1]` the noncentrality parameter for the :math:`\textit{i}`\ th dimension, for :math:`\textit{i} = 1,2,\ldots,n`; set :math:`\mathrm{delta}[i-1] = 0` for the central probability. 
**iscov** : int Set :math:`\mathrm{iscov} = 1` if the covariance matrix is supplied and :math:`\mathrm{iscov} = 2` if the correlation matrix is supplied. **rc** : float, array-like, shape :math:`\left(n, n\right)` The lower triangle of either the covariance matrix (if :math:`\mathrm{iscov} = 1`) or the correlation matrix (if :math:`\mathrm{iscov} = 2`). In either case the array elements corresponding to the upper triangle of the matrix need not be set. **epsabs** : float, optional :math:`\epsilon_a`, the absolute accuracy requested in the approximation. If :math:`\mathrm{epsabs}` is negative, the absolute value is used. **epsrel** : float, optional :math:`\epsilon_r`, the relative accuracy requested in the approximation. If :math:`\mathrm{epsrel}` is negative, the absolute value is used. **numsub** : int, optional If quadrature is used, the number of sub-intervals used by the quadrature algorithm; otherwise :math:`\mathrm{numsub}` is not referenced. **nsampl** : int, optional If quadrature is used, :math:`\mathrm{nsampl}` is not referenced; otherwise :math:`\mathrm{nsampl}` is the number of samples used to estimate the error in the approximation. **fmax** : None or int, optional Note: if this argument is **None** then a default value will be used, determined as follows: :math:`1000\times n`. If a number theoretic approach is used, the maximum number of evaluations for each integrand function. **Returns** **p** : float The probability associated with the multivariate Student's :math:`t`-distribution. **rc** : float, ndarray, shape :math:`\left(n, n\right)` The strict upper triangle of :math:`\mathrm{rc}` contains the correlation matrix used in the calculations. **errest** : float An estimate of the error in the calculated probability. .. _g01hd-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`1 < n\leq 1000`. 
(`errno` :math:`2`) On entry, :math:`\mathrm{tail}[k-1] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{tail}[\textit{k}-1] = \texttt{'L'}`, :math:`\texttt{'U'}` or :math:`\texttt{'C'}`. (`errno` :math:`4`) On entry, :math:`k = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{b}[k-1] > \mathrm{a}[k-1]` for a central probability. (`errno` :math:`5`) On entry, :math:`\mathrm{nu} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: degrees of freedom :math:`\mathrm{nu} > 0.0`. (`errno` :math:`8`) On entry, :math:`\mathrm{iscov} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{iscov} = 1` or :math:`2`. (`errno` :math:`9`) On entry, the information supplied in :math:`\mathrm{rc}` is invalid. (`errno` :math:`12`) On entry, :math:`\mathrm{numsub} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{numsub}\geq 1`. (`errno` :math:`13`) On entry, :math:`\mathrm{nsampl} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{nsampl}\geq 1`. (`errno` :math:`14`) On entry, :math:`\mathrm{fmax} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{fmax}\geq 1`. .. _g01hd-py2-py-notes: **Notes** A random vector :math:`x \in \mathbb{R}^n` that follows a Student's :math:`t`-distribution with :math:`\nu` degrees of freedom and covariance matrix :math:`\Sigma` has density: .. math:: \frac{{\Gamma \left(\left(\nu +n\right)/2\right)}}{{\Gamma \left(\nu /2\right)\nu^{{n/2}}\pi^{{n/2}}\left\lvert \Sigma \right\rvert^{{1/2}}\left[1+\frac{1}{\nu }x^\mathrm{T}\Sigma^{-1}x\right]^{{\left(\nu +n\right)/2}}}}\text{,} and probability :math:`p` given by: .. 
math:: p = \frac{{\Gamma \left(\left(\nu +n\right)/2\right)}}{{\Gamma \left(\nu /2\right)\sqrt{\left\lvert \Sigma \right\rvert \left(\pi \nu \right)^n}}}\int_{a_1}^{b_1}\int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n}\left(1+x^\mathrm{T}\Sigma^{-1}x/\nu \right)^{{-\left(\nu +n\right)/2}}dx\text{.} The method of calculation depends on the dimension :math:`n` and degrees of freedom :math:`\nu`. The method of Dunnett and Sobel (1954) is used in the bivariate case if :math:`\nu` is a whole number. A Plackett transform followed by a quadrature method is adopted in other bivariate cases and trivariate cases. In dimensions higher than three a number theoretic approach to evaluating multidimensional integrals is adopted. Error estimates are supplied as the published accuracy in the Dunnett and Sobel (1954) case, a Monte Carlo standard error for multidimensional integrals, and otherwise the quadrature error estimate. A parameter :math:`\delta` allows for non-central probabilities. The number theoretic method is used if any :math:`\delta` is nonzero. In cases other than the central bivariate with whole :math:`\nu`, ``prob_multi_students_t`` attempts to evaluate probabilities within a requested accuracy :math:`\mathrm{max}\left(\epsilon_a, {\epsilon_r\times I}\right)`, for an approximate integral value :math:`I`, absolute accuracy :math:`\epsilon_a` and relative accuracy :math:`\epsilon_r`. .. _g01hd-py2-py-references: **References** Dunnett, C W and Sobel, M, 1954, `A bivariate generalization of Student's` :math:`t` `-distribution, with tables for certain special cases`, Biometrika (41), 153--169 Genz, A and Bretz, F, 2002, `Methods for the computation of multivariate` :math:`t` `-probabilities`, Journal of Computational and Graphical Statistics (11), 950--971 """ raise NotImplementedError
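# Illustrative sketch only: the accuracy target max(eps_a, eps_r * I) described in
# the Notes above, with negative tolerances replaced by their absolute values as
# the parameter descriptions state. ``requested_accuracy`` is a hypothetical
# helper name; the defaults mirror the documented epsabs=0.0 and epsrel=0.001.

```python
def requested_accuracy(integral_estimate, epsabs=0.0, epsrel=0.001):
    """Return the accuracy target for an approximate integral value I.

    Hypothetical helper; it only restates the documented formula and makes
    no claim about how the library applies it internally.
    """
    return max(abs(epsabs), abs(epsrel) * integral_estimate)


# With the default tolerances the target is purely relative.
target_default = requested_accuracy(0.5)
# A negative epsabs is treated as its absolute value.
target_absolute = requested_accuracy(0.5, epsabs=-0.01)
```

# Comparing ``errest`` from the library against this target is one way to judge
# whether the returned probability met the request.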
[docs]def prob_chisq_noncentral_lincomb(a, mult, rlamda, c, tol=0.0, maxit=500): r""" ``prob_chisq_noncentral_lincomb`` returns the lower tail probability of a distribution of a positive linear combination of :math:`\chi^2` random variables. .. _g01jc-py2-py-doc: For full information please refer to the NAG Library document for g01jc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01jcf.html .. _g01jc-py2-py-parameters: **Parameters** **a** : float, array-like, shape :math:`\left(n\right)` The weights, :math:`a_1,a_2,\ldots,a_n`. **mult** : int, array-like, shape :math:`\left(n\right)` The degrees of freedom, :math:`m_1,m_2,\ldots,m_n`. **rlamda** : float, array-like, shape :math:`\left(n\right)` The noncentrality parameters, :math:`\lambda_1,\lambda_2,\ldots,\lambda_n`. **c** : float :math:`c`, the point for which the lower tail probability is to be evaluated. **tol** : float, optional The relative accuracy required by you in the results. If ``prob_chisq_noncentral_lincomb`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. **maxit** : int, optional The maximum number of terms that should be used during the summation. **Returns** **p** : float The lower tail probability associated with the linear combination of :math:`n` :math:`\chi^2` random variables with :math:`m_{\textit{j}}` degrees of freedom, and noncentrality parameters :math:`\lambda_{\textit{j}}`, for :math:`\textit{j} = 1,2,\ldots,n`. **pdf** : float The value of the probability density function of the linear combination of :math:`\chi^2` variables. .. _g01jc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{maxit} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{maxit} \geq 1`. 
(`errno` :math:`1`) On entry, :math:`\mathrm{c} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{c}\geq 0.0`. (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n \geq 1`. (`errno` :math:`2`) On entry, :math:`\mathrm{rlamda}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{rlamda}[\textit{i}]\geq 0.0`, for :math:`\textit{i} = 0,\ldots,n-1`. (`errno` :math:`2`) On entry, :math:`\mathrm{mult}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{mult}[\textit{i}]\geq 1`, for :math:`\textit{i} = 0,\ldots,n-1`. (`errno` :math:`2`) On entry, :math:`\mathrm{a}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{a}[\textit{i}] > 0.0`, for :math:`\textit{i} = 0,\ldots,n-1`. (`errno` :math:`3`) The central :math:`\chi^2` calculation has failed to converge. This is an unlikely exit. A larger value of :math:`\mathrm{tol}` should be tried. **Warns** **NagAlgorithmicWarning** (`errno` :math:`4`) The solution has failed to converge within :math:`\mathrm{maxit}` iterations. A larger value of :math:`\mathrm{maxit}` or :math:`\mathrm{tol}` should be used. The returned value should be a reasonable approximation to the correct value. (`errno` :math:`5`) The solution appears to be too close to :math:`0` or :math:`1` for accurate calculation. The value returned is :math:`0` or :math:`1` as appropriate. .. _g01jc-py2-py-notes: **Notes** For a linear combination of noncentral :math:`\chi^2` random variables with integer degrees of freedom the lower tail probability is .. 
math:: P\left(\sum_{{j = 1}}^na_j\chi^2\left(m_j, {\lambda_j}\right)\leq c\right)\text{,} where :math:`a_j` and :math:`c` are positive constants and where :math:`\chi^2\left(m_j, {\lambda_j}\right)` represents an independent :math:`\chi^2` random variable with :math:`m_j` degrees of freedom and noncentrality parameter :math:`\lambda_j`. The linear combination may arise from considering a quadratic form in Normal variables. Ruben's method as described in Farebrother (1984) is used. Ruben has shown that `(1) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01jcf.html#eqn1>`__ may be expanded as an infinite series of the form .. math:: \sum_{{k = 0}}^{\infty }d_kF\left(m+2k,c/\beta \right)\text{,} where :math:`F\left(m+2k,c/\beta \right) = P\left(\chi^2\left(m+2k\right) < c/\beta \right)`, i.e., the probability that a central :math:`\chi^2` is less than :math:`c/\beta`. The value of :math:`\beta` is set at .. math:: \beta = \beta_B = \frac{2}{{\left(1/a_{\mathrm{min}}+1/a_{\mathrm{max}}\right)}} unless :math:`\beta_B > 1.8a_{\mathrm{min}}`, in which case .. math:: \beta = \beta_A = a_{\mathrm{min}} is used, where :math:`a_{\mathrm{min}} = \mathrm{min}\left\{a_j\right\}` and :math:`a_{\mathrm{max}} = \mathrm{max}\left\{a_j\right\}`, for :math:`\textit{j} = 1,2,\ldots,n`. .. _g01jc-py2-py-references: **References** Farebrother, R W, 1984, `The distribution of a positive linear combination of` :math:`\chi^2` `random variables`, Appl. Statist. (33(3)) """ raise NotImplementedError
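# Illustrative sketch only: the rule for choosing beta stated in the Notes above.
# beta_B = 2 / (1/a_min + 1/a_max) is used unless it exceeds 1.8 * a_min, in which
# case beta_A = a_min is used. ``choose_beta`` is a hypothetical helper name and
# covers only this selection step, not Ruben's series itself.

```python
def choose_beta(weights):
    """Pick the scaling constant beta for the weights a_1, ..., a_n."""
    if not weights or min(weights) <= 0.0:
        raise ValueError("weights must be non-empty and strictly positive")
    a_min, a_max = min(weights), max(weights)
    beta_b = 2.0 / (1.0 / a_min + 1.0 / a_max)
    # Fall back to beta_A = a_min when beta_B is too large relative to a_min.
    return a_min if beta_b > 1.8 * a_min else beta_b


beta_close = choose_beta([2.0, 3.0])     # beta_B = 2.4, below 1.8 * 2.0
beta_spread = choose_beta([1.0, 100.0])  # beta_B is about 1.98, above 1.8 * 1.0
```

# Widely spread weights therefore fall back to the smallest weight, while
# similar weights use the harmonic-mean style value beta_B.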
[docs]def prob_chisq_lincomb(rlam, d, c, method='D'): r""" ``prob_chisq_lincomb`` calculates the lower tail probability for a linear combination of (central) :math:`\chi^2` variables. .. _g01jd-py2-py-doc: For full information please refer to the NAG Library document for g01jd https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01jdf.html .. _g01jd-py2-py-parameters: **Parameters** **rlam** : float, array-like, shape :math:`\left(n\right)` The weights, :math:`\lambda_{\textit{i}}`, for :math:`\textit{i} = 1,2,\ldots,n`, of the central :math:`\chi^2` variables. **d** : float :math:`d`, the multiplier of the central :math:`\chi^2` variables. **c** : float :math:`c`, the value of the constant. **method** : str, length 1, optional Indicates whether Pan's, Imhof's or an appropriately selected procedure is to be used. :math:`\mathrm{method} = \texttt{'P'}` Pan's method is used. :math:`\mathrm{method} = \texttt{'I'}` Imhof's method is used. :math:`\mathrm{method} = \texttt{'D'}` Pan's method is used if :math:`\lambda_{\textit{i}}^*`, for :math:`\textit{i} = 1,2,\ldots,n` are at least :math:`1\%` distinct and :math:`n\leq 60`; otherwise Imhof's method is used. **Returns** **prob** : float The lower tail probability for the linear combination of central :math:`\chi^2` variables. .. _g01jd-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{method} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{method} = \texttt{'P'}`, :math:`\texttt{'I'}` or :math:`\texttt{'D'}`. (`errno` :math:`1`) On entry, :math:`\mathrm{d} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{d}\geq 0.0`. (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n \geq 1`. (`errno` :math:`2`) On entry, :math:`\mathrm{rlam}[\textit{i}-1] = \mathrm{d}` for all values of :math:`\textit{i}`, for :math:`\textit{i} = 1,2,\ldots,n`. .. 
_g01jd-py2-py-notes: **Notes** Let :math:`u_1,u_2,\ldots,u_n` be independent Normal variables with mean zero and unit variance, so that :math:`u_1^2,u_2^2,\ldots,u_n^2` have independent :math:`\chi^2`-distributions with unit degrees of freedom. ``prob_chisq_lincomb`` evaluates the probability that .. math:: \lambda_1u_1^2+\lambda_2u_2^2 + \cdots +\lambda_nu_n^2 < d\left(u_1^2+u_2^2 + \cdots +u_n^2\right)+c\text{.} If :math:`c = 0.0` this is equivalent to the probability that .. math:: \frac{{\lambda_1u_1^2+\lambda_2u_2^2 + \cdots +\lambda_nu_n^2}}{{u_1^2+u_2^2 + \cdots +u_n^2}} < d\text{.} Alternatively let .. math:: \lambda_i^* = \lambda_i-d\text{, }i = 1,2,\ldots,n\text{,} then ``prob_chisq_lincomb`` returns the probability that .. math:: \lambda_1^*u_1^2+\lambda_2^*u_2^2 + \cdots +\lambda_n^*u_n^2 < c\text{.} Two methods are available. One, due to Pan (1964) (see Farebrother (1980)), makes use of series approximations. The other, due to Imhof (1961), reduces the problem to a one-dimensional integral. If :math:`n\geq 6` then a non-adaptive method described in :meth:`quad.dim1_fin_smooth <naginterfaces.library.quad.dim1_fin_smooth>` is used to compute the value of the integral; otherwise :meth:`quad.dim1_fin_bad <naginterfaces.library.quad.dim1_fin_bad>` is used. Pan's procedure can only be used if the :math:`\lambda_i^*` are sufficiently distinct; ``prob_chisq_lincomb`` requires the :math:`\lambda_i^*` to be at least :math:`1\%` distinct; see `Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01jdf.html#fcomments>`__. If the :math:`\lambda_i^*` are at least :math:`1\%` distinct and :math:`n\leq 60`, then Pan's procedure is recommended; otherwise Imhof's procedure is recommended. .. _g01jd-py2-py-references: **References** Farebrother, R W, 1980, `Algorithm AS 153. Pan's procedure for the tail probabilities of the Durbin--Watson statistic`, Appl. Statist.
(29), 224--227 Imhof, J P, 1961, `Computing the distribution of quadratic forms in Normal variables`, Biometrika (48), 419--426 Pan, Jie--Jian, 1964, `Distributions of the noncircular serial correlation coefficients`, Shuxue Jinzhan (7), 328--337 """ raise NotImplementedError
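# Illustrative sketch only: a Monte Carlo estimate of the probability defined in
# the Notes for ``prob_chisq_lincomb``, i.e. P(sum_i (lambda_i - d) u_i^2 < c) for
# independent standard Normal u_i. This is neither Pan's nor Imhof's method; it is
# a slow, low-accuracy cross-check. The name ``lincomb_prob_monte_carlo`` and the
# ``n_draws`` and ``seed`` arguments are hypothetical.

```python
import math
import random


def lincomb_prob_monte_carlo(rlam, d, c, n_draws=20000, seed=12345):
    """Estimate P(sum_i (rlam[i] - d) * u_i**2 < c) by simulation."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        total = 0.0
        for lam in rlam:
            u = rng.gauss(0.0, 1.0)
            total += (lam - d) * u * u  # lambda_i^* = lambda_i - d
        if total < c:
            hits += 1
    return hits / n_draws


# n = 1, lambda = 1, d = 0, c = 1: P(u^2 < 1) = erf(1 / sqrt(2)).
estimate_single = lincomb_prob_monte_carlo([1.0], d=0.0, c=1.0)
exact_single = math.erf(1.0 / math.sqrt(2.0))

# lambda^* = (1, 0, -1) with c = 0: P(u_1^2 < u_3^2) = 1/2 by symmetry.
estimate_symmetric = lincomb_prob_monte_carlo([3.0, 2.0, 1.0], d=2.0, c=0.0)
```

# With 20000 draws the standard error is roughly 0.003 to 0.004, so agreement to
# about two decimal places is the most this check can show.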
[docs]def pdf_normal(x, xmean, xstd): r""" ``pdf_normal`` returns the value of the probability density function (PDF) for the Normal (Gaussian) distribution with mean :math:`\mu` and variance :math:`\sigma^2` at a point :math:`x`. .. _g01ka-py2-py-doc: For full information please refer to the NAG Library document for g01ka https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01kaf.html .. _g01ka-py2-py-parameters: **Parameters** **x** : float :math:`x`, the value at which the PDF is to be evaluated. **xmean** : float :math:`\mu`, the mean of the Normal distribution. **xstd** : float :math:`\sigma`, the standard deviation of the Normal distribution. **Returns** **pdf** : float The value of the PDF. .. _g01ka-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{xstd} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{xstd}\times \sqrt{2.0\pi } > U`, where :math:`U` is the safe range parameter as defined by :meth:`machine.real_safe <naginterfaces.library.machine.real_safe>`. (`errno` :math:`2`) Computation abandoned owing to underflow of :math:`\frac{1}{{\left(\sigma \times \sqrt{2\pi }\right)}}`. (`errno` :math:`3`) Computation abandoned owing to an internal calculation overflowing. .. _g01ka-py2-py-notes: **Notes** The Normal distribution has probability density function (PDF) .. math:: f\left(x\right) = \frac{1}{{\sigma \sqrt{{2\pi }}}}e^{{-\left(x-\mu \right)^2/2\sigma^2}}\text{, }\quad \sigma > 0\text{.} """ raise NotImplementedError
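# Illustrative sketch only: the Normal PDF formula from the Notes above in plain
# Python. ``normal_pdf`` is a hypothetical reference helper, not the library
# routine; it omits the underflow and overflow safeguards that the documented
# error exits describe.

```python
import math


def normal_pdf(x, xmean, xstd):
    """Evaluate exp(-(x - mu)^2 / (2 sigma^2)) / (sigma sqrt(2 pi))."""
    if xstd <= 0.0:
        raise ValueError("xstd must be positive")
    z = (x - xmean) / xstd
    return math.exp(-0.5 * z * z) / (xstd * math.sqrt(2.0 * math.pi))


standard_peak = normal_pdf(0.0, 0.0, 1.0)  # 1 / sqrt(2 pi)
scaled_peak = normal_pdf(1.0, 1.0, 2.0)    # half the standard peak
```

# Doubling the standard deviation halves the peak height, which follows directly
# from the 1 / sigma factor in the formula.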
[docs]def pdf_gamma(x, a, b): r""" ``pdf_gamma`` returns the value of the probability density function (PDF) for the gamma distribution with shape parameter :math:`\alpha` and scale parameter :math:`\beta` at a point :math:`x`. .. _g01kf-py2-py-doc: For full information please refer to the NAG Library document for g01kf https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01kff.html .. _g01kf-py2-py-parameters: **Parameters** **x** : float :math:`x`, the value at which the PDF is to be evaluated. **a** : float :math:`\alpha`, the shape parameter of the gamma distribution. **b** : float :math:`\beta`, the scale parameter of the gamma distribution. **Returns** **pdf** : float The value of the PDF. .. _g01kf-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{a} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{a} > 0.0`. (`errno` :math:`2`) On entry, :math:`\mathrm{b} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{b} > 0.0`. (`errno` :math:`3`) Computation abandoned owing to overflow due to extreme parameter values. .. _g01kf-py2-py-notes: **Notes** The gamma distribution has PDF .. math:: \begin{array}{cc}f\left(x\right) = \frac{1}{{\beta^{\alpha }\Gamma \left(\alpha \right)}}x^{{\alpha -1}}e^{{-x/\beta }}&\text{if }x\geq 0\text{; }\quad \alpha,\beta > 0\\\\f\left(x\right) = 0&\text{otherwise.}\end{array} If :math:`0.01\leq x,\alpha,\beta \leq 100` then an algorithm based directly on the gamma distribution's PDF is used. For values outside this range, the function is calculated via the Poisson distribution's PDF as described in Loader (2000) (see `Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01kff.html#fcomments>`__). .. _g01kf-py2-py-references: **References** Loader, C, 2000, `Fast and accurate computation of binomial probabilities` (not yet published) """ raise NotImplementedError
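# Illustrative sketch only: the gamma PDF from the Notes above, evaluated in log
# space with ``math.lgamma`` to delay overflow. ``gamma_pdf`` is a hypothetical
# reference helper; it does not reproduce the Loader (2000) approach that the
# library uses outside the range 0.01 to 100. The handling of x = 0 is the
# mathematical limit of the formula and is an assumption, not documented
# library behaviour.

```python
import math


def gamma_pdf(x, a, b):
    """Evaluate x^(a-1) exp(-x/b) / (b^a Gamma(a)) for x >= 0, else 0."""
    if a <= 0.0 or b <= 0.0:
        raise ValueError("shape a and scale b must be positive")
    if x < 0.0:
        return 0.0
    if x == 0.0:
        # Assumed limiting values at the origin.
        if a < 1.0:
            return math.inf
        return 1.0 / b if a == 1.0 else 0.0
    log_pdf = (a - 1.0) * math.log(x) - x / b - a * math.log(b) - math.lgamma(a)
    return math.exp(log_pdf)


exponential_case = gamma_pdf(1.0, 1.0, 1.0)  # a = 1 gives exp(-x)
shape_two_case = gamma_pdf(2.0, 2.0, 1.0)    # x exp(-x) at x = 2
```

# With shape 1 the gamma distribution is the exponential distribution, which
# gives an easy closed form to check against.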
[docs]def pdf_gamma_vector(ilog, x, a, b): r""" ``pdf_gamma_vector`` returns a number of values of the probability density function (PDF), or its logarithm, for the gamma distribution. .. _g01kk-py2-py-doc: For full information please refer to the NAG Library document for g01kk https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01kkf.html .. _g01kk-py2-py-parameters: **Parameters** **ilog** : int The value of :math:`\mathrm{ilog}` determines whether the logarithmic value is returned in :math:`\mathrm{pdf}`. :math:`\mathrm{ilog} = 0` :math:`f\left(x_i, \alpha_i, \beta_i\right)`, the probability density function is returned. :math:`\mathrm{ilog} = 1` :math:`\log\left(f\left(x_i, \alpha_i, \beta_i\right)\right)`, the logarithm of the probability density function is returned. **x** : float, array-like, shape :math:`\left(\textit{lx}\right)` :math:`x_i`, the values at which the PDF is to be evaluated. **a** : float, array-like, shape :math:`\left(\textit{la}\right)` :math:`\alpha_i`, the shape parameter. **b** : float, array-like, shape :math:`\left(\textit{lb}\right)` :math:`\beta_i`, the scale parameter. **Returns** **pdf** : float, ndarray, shape :math:`\left(\max\left(\textit{lx},\textit{la}\right)\right)` :math:`f\left(x_i, \alpha_i, \beta_i\right)` or :math:`\log\left(f\left(x_i, \alpha_i, \beta_i\right)\right)`. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{lx},\textit{la}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` :math:`\alpha_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 2` :math:`\beta_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 3` :math:`\frac{x_i}{\beta_i}` overflows, the value returned should be a reasonable approximation. .. _g01kk-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\mathrm{ilog} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Constraint: :math:`\mathrm{ilog} = 0` or :math:`1`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lx} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{la} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lb} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{x}`, :math:`\mathrm{a}` or :math:`\mathrm{b}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01kk-py2-py-notes: **Notes** The gamma distribution with shape parameter :math:`\alpha_i` and scale parameter :math:`\beta_i` has PDF .. math:: \begin{array}{cc} f \left(x_i, \alpha_i, \beta_i\right) = \frac{1}{{\beta_i^{\alpha_i}\Gamma \left(\alpha_i\right)}} x_i^{{\alpha_i-1}} e^{{-x_i/\beta_i}} & \text{if } x_i \geq 0 \text{; }\quad \alpha_i, \beta_i > 0 \\\\f\left(x_i, \alpha_i, \beta_i\right) = 0&\text{otherwise.}\end{array} If :math:`0.01\leq x_i,\alpha_i,\beta_i\leq 100` then an algorithm based directly on the gamma distribution's PDF is used. For values outside this range, the function is calculated via the Poisson distribution's PDF as described in Loader (2000) (see `Further Comments <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01kkf.html#fcomments>`__). The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01kk-py2-py-references: **References** Loader, C, 2000, `Fast and accurate computation of binomial probabilities` (not yet published) """ raise NotImplementedError
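# Illustrative sketch only: how shorter input arrays might be re-used across
# evaluations, as the Notes for ``pdf_gamma_vector`` describe. Two assumptions are
# made here and should be checked against the G01 Introduction: elements are
# re-used cyclically (index modulo array length), and the number of evaluations is
# the longest input length. ``gamma_pdf_vector_sketch`` is a hypothetical name, and
# the 0.0 placeholder stored for invalid entries is also an assumption.

```python
import math


def _gamma_log_pdf(x, a, b):
    """Log of the gamma PDF; -inf where the density is zero."""
    if x < 0.0:
        return -math.inf
    if x == 0.0:
        # Assumed limiting values at the origin.
        if a == 1.0:
            return -math.log(b)
        return -math.inf if a > 1.0 else math.inf
    return (a - 1.0) * math.log(x) - x / b - a * math.log(b) - math.lgamma(a)


def gamma_pdf_vector_sketch(ilog, x, a, b):
    """Return (pdf, ivalid) lists, cycling through the shorter inputs."""
    if ilog not in (0, 1):
        raise ValueError("ilog must be 0 or 1")
    if not x or not a or not b:
        raise ValueError("x, a and b must be non-empty")
    n_eval = max(len(x), len(a), len(b))  # assumed number of evaluations
    pdf, ivalid = [], []
    for i in range(n_eval):
        xi, ai, bi = x[i % len(x)], a[i % len(a)], b[i % len(b)]
        if ai <= 0.0:
            ivalid.append(1)  # documented code for alpha_i <= 0
            pdf.append(0.0)   # placeholder value (assumption)
        elif bi <= 0.0:
            ivalid.append(2)  # documented code for beta_i <= 0
            pdf.append(0.0)   # placeholder value (assumption)
        else:
            ivalid.append(0)
            log_f = _gamma_log_pdf(xi, ai, bi)
            pdf.append(log_f if ilog == 1 else math.exp(log_f))
    return pdf, ivalid


# One shape and one scale are shared by three evaluation points.
values, flags = gamma_pdf_vector_sketch(0, [1.0, 2.0, 3.0], [1.0], [1.0])
# The second shape parameter is invalid and is flagged rather than raised.
log_values, log_flags = gamma_pdf_vector_sketch(1, [1.0, 1.0], [1.0, -1.0], [1.0])
```

# Flagging bad entries per element, instead of raising, mirrors the documented
# role of ``ivalid`` alongside the ``NagAlgorithmicWarning``.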
[docs]def pdf_normal_vector(ilog, x, xmu, xstd): r""" ``pdf_normal_vector`` returns a number of values of the probability density function (PDF), or its logarithm, for the Normal (Gaussian) distributions. .. _g01kq-py2-py-doc: For full information please refer to the NAG Library document for g01kq https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01kqf.html .. _g01kq-py2-py-parameters: **Parameters** **ilog** : int The value of :math:`\mathrm{ilog}` determines whether the logarithmic value is returned in PDF. :math:`\mathrm{ilog} = 0` :math:`f\left(x_i, \mu_i, \sigma_i\right)`, the probability density function is returned. :math:`\mathrm{ilog} = 1` :math:`\log\left(f\left(x_i, \mu_i, \sigma_i\right)\right)`, the logarithm of the probability density function is returned. **x** : float, array-like, shape :math:`\left(\textit{lx}\right)` :math:`x_i`, the values at which the PDF is to be evaluated. **xmu** : float, array-like, shape :math:`\left(\textit{lxmu}\right)` :math:`\mu_i`, the means. **xstd** : float, array-like, shape :math:`\left(\textit{lxstd}\right)` :math:`\sigma_i`, the standard deviations. **Returns** **pdf** : float, ndarray, shape :math:`\left(\max\left(\textit{lx},\textit{lxstd}\right)\right)` :math:`f\left(x_i, \mu_i, \sigma_i\right)` or :math:`\log\left(f\left(x_i, \mu_i, \sigma_i\right)\right)`. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{lx},\textit{lxstd}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` :math:`\sigma_i < 0`. .. _g01kq-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\mathrm{ilog} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{ilog} = 0` or :math:`1`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lx} > 0`. 
(`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lxmu} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lxstd} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{xstd}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01kq-py2-py-notes: **Notes** The Normal distribution with mean :math:`\mu_i` and variance :math:`\sigma_i^2` has probability density function (PDF) .. math:: f\left(x_i, \mu_i, \sigma_i\right) = \frac{1}{{\sigma_i\sqrt{{2\pi }}}}e^{{-\left(x_i-\mu_i\right)^2/2\sigma_i^2}}\text{, }\quad \sigma_i > 0\text{.} The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. """ raise NotImplementedError
[docs]def pdf_multi_normal_vector(ilog, x, xmu, iuld, sig): r""" ``pdf_multi_normal_vector`` returns a number of values of the probability density function (PDF), or its logarithm, for the multivariate Normal (Gaussian) distribution. .. _g01lb-py2-py-doc: For full information please refer to the NAG Library document for g01lb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01lbf.html .. _g01lb-py2-py-parameters: **Parameters** **ilog** : int The value of :math:`\mathrm{ilog}` determines whether the logarithmic value is returned in PDF. :math:`\mathrm{ilog} = 0` :math:`f\left({X:\mu }, \Sigma \right)`, the probability density function is returned. :math:`\mathrm{ilog} = 1` :math:`\log\left(f\left({X:\mu }, \Sigma \right)\right)`, the logarithm of the probability density function is returned. **x** : float, array-like, shape :math:`\left(n, k\right)` :math:`X`, the matrix of :math:`k` points at which to evaluate the probability density function, with the :math:`i`\ th dimension for the :math:`j`\ th point held in :math:`\mathrm{x}[i-1,j-1]`. **xmu** : float, array-like, shape :math:`\left(n\right)` :math:`\mu`, the mean vector of the multivariate Normal distribution. **iuld** : int Indicates the form of :math:`\Sigma` and how it is stored in :math:`\mathrm{sig}`. :math:`\mathrm{iuld} = 1` :math:`\mathrm{sig}` holds the lower triangular portion of :math:`\Sigma`. :math:`\mathrm{iuld} = 2` :math:`\mathrm{sig}` holds the upper triangular portion of :math:`\Sigma`. :math:`\mathrm{iuld} = 3` :math:`\Sigma` is a diagonal matrix and :math:`\mathrm{sig}` only holds the diagonal elements. :math:`\mathrm{iuld} = 4` :math:`\mathrm{sig}` holds the lower Cholesky decomposition, :math:`L` such that :math:`LL^\mathrm{T} = \Sigma`. :math:`\mathrm{iuld} = 5` :math:`\mathrm{sig}` holds the upper Cholesky decomposition, :math:`U` such that :math:`U^\mathrm{T}U = \Sigma`. 
**sig** : float, array-like, shape :math:`\left(:, n\right)` Note: the required extent for this argument in dimension 1 is determined as follows: if :math:`\mathrm{iuld}=3`: :math:`1`; otherwise: :math:`n`. Information defining the variance-covariance matrix, :math:`\Sigma`. :math:`\mathrm{iuld} = 1` or :math:`2` :math:`\mathrm{sig}` must hold the lower or upper portion of :math:`\Sigma`, with :math:`\Sigma_{{ij}}` held in :math:`\mathrm{sig}[i-1,j-1]`. The supplied variance-covariance matrix must be positive semidefinite. :math:`\mathrm{iuld} = 3` :math:`\Sigma` is a diagonal matrix and the :math:`i`\ th diagonal element, :math:`\Sigma_{{ii}}`, must be held in :math:`\mathrm{sig}[0,i-1]` :math:`\mathrm{iuld} = 4` or :math:`5` :math:`\mathrm{sig}` must hold :math:`L` or :math:`U`, the lower or upper Cholesky decomposition of :math:`\Sigma`, with :math:`L_{{ij}}` or :math:`U_{{ij}}` held in :math:`\mathrm{sig}[i-1,j-1]`, depending on the value of :math:`\mathrm{iuld}`. No check is made that :math:`LL^\mathrm{T}` or :math:`U^\mathrm{T}U` is a valid variance-covariance matrix. The diagonal elements of the supplied :math:`L` or :math:`U` must be greater than zero **Returns** **pdf** : float, ndarray, shape :math:`\left(k\right)` :math:`f\left({X:\mu }, \Sigma \right)` or :math:`\log\left(f\left({X:\mu }, \Sigma \right)\right)` depending on the value of :math:`\mathrm{ilog}`. **rank** : int :math:`r`, rank of :math:`\Sigma`. .. _g01lb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`11`) On entry, :math:`\mathrm{ilog} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{ilog} = 0` or :math:`1`. (`errno` :math:`21`) On entry, :math:`k = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`k\geq 0`. (`errno` :math:`31`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n\geq 2`. (`errno` :math:`71`) On entry, :math:`\mathrm{iuld} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Constraint: :math:`\mathrm{iuld} = 1`, :math:`2`, :math:`3`, :math:`4` or :math:`5`. (`errno` :math:`81`) On entry, :math:`\Sigma` is not positive semidefinite. (`errno` :math:`82`) On entry, at least one diagonal element of :math:`\Sigma` is less than or equal to :math:`0`. (`errno` :math:`83`) On entry, :math:`\Sigma` is not positive definite and eigenvalue decomposition failed. (`errno` :math:`92`) On entry, :math:`\textit{ldsig} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{iuld} \neq 3`, :math:`\textit{ldsig}\geq n`. .. _g01lb-py2-py-notes: **Notes** The probability density function, :math:`f\left({X:\mu }, \Sigma \right)`, of an :math:`n`-dimensional multivariate Normal distribution with mean vector :math:`\mu` and :math:`n\times n` variance-covariance matrix :math:`\Sigma` is given by .. math:: f\left({X:\mu }, \Sigma \right) = \left(\left(2\pi \right)^n\left\lvert \Sigma \right\rvert \right)^{{-1/2}}\mathrm{exp}\left(-\frac{1}{2}\left(X-\mu \right)^\mathrm{T}\Sigma^{-1}\left(X-\mu \right)\right)\text{.} If the variance-covariance matrix, :math:`\Sigma`, is not of full rank then the probability density function is calculated as .. math:: f\left({X:\mu }, \Sigma \right) = \left(\left(2\pi \right)^r\text{pdet}\left(\Sigma \right)\right)^{{-1/2}}\mathrm{exp}\left(-\frac{1}{2}\left(X-\mu \right)^\mathrm{T}\Sigma^-\left(X-\mu \right)\right) where :math:`\text{pdet}\left(\Sigma \right)` is the pseudo-determinant, :math:`\Sigma^-` a generalized inverse of :math:`\Sigma` and :math:`r` its rank. ``pdf_multi_normal_vector`` evaluates the PDF at :math:`k` points with a single call. """ raise NotImplementedError
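# Illustrative sketch only: the full-rank multivariate Normal log-density from the
# Notes above, computed through a lower Cholesky factor L with L L^T = Sigma. Only
# the lower triangle of ``sigma`` is read, in the spirit of iuld = 1. The helper
# names ``cholesky_lower`` and ``multi_normal_log_pdf`` are hypothetical, and the
# rank-deficient (pseudo-determinant) case is not handled.

```python
import math


def cholesky_lower(sigma):
    """Return L (list of lists) with L L^T = sigma, reading the lower triangle."""
    n = len(sigma)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                diag = sigma[i][i] - partial
                if diag <= 0.0:
                    raise ValueError("sigma is not positive definite")
                chol[i][j] = math.sqrt(diag)
            else:
                chol[i][j] = (sigma[i][j] - partial) / chol[j][j]
    return chol


def multi_normal_log_pdf(point, mu, sigma):
    """Log-density of N(mu, sigma) at one point, full-rank sigma only."""
    n = len(mu)
    chol = cholesky_lower(sigma)
    # Forward substitution: solve L z = point - mu, so z^T z is the quadratic form.
    z = []
    for i in range(n):
        resid = point[i] - mu[i] - sum(chol[i][k] * z[k] for k in range(i))
        z.append(resid / chol[i][i])
    log_det = 2.0 * sum(math.log(chol[i][i]) for i in range(n))
    quad = sum(v * v for v in z)
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


identity = [[1.0, 0.0], [0.0, 1.0]]
pdf_at_mean = math.exp(multi_normal_log_pdf([0.0, 0.0], [0.0, 0.0], identity))
correlated = [[1.0, 0.5], [0.5, 1.0]]
pdf_correlated = math.exp(multi_normal_log_pdf([0.0, 0.0], [0.0, 0.0], correlated))
```

# Working with the log-density, as ilog = 1 does, avoids underflow when the
# quadratic form is large; exponentiate only when the plain PDF is needed.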
[docs]def mills_ratio(x): r""" ``mills_ratio`` returns the reciprocal of Mills' Ratio. .. _g01mb-py2-py-doc: For full information please refer to the NAG Library document for g01mb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01mbf.html .. _g01mb-py2-py-parameters: **Parameters** **x** : float :math:`x`, the argument of the reciprocal of Mills' Ratio. **Returns** **rmr** : float The reciprocal of Mills' Ratio. .. _g01mb-py2-py-notes: **Notes** ``mills_ratio`` calculates the reciprocal of Mills' Ratio, the hazard rate, :math:`\lambda \left(x\right)`, for the standard Normal distribution. It is defined as the ratio of the ordinate to the upper tail area of the standard Normal distribution, that is, .. math:: \lambda \left(x\right) = \frac{{Z\left(x\right)}}{{Q\left(x\right)}} = \frac{{\frac{1}{\sqrt{2\pi }}e^{{-\left(x^2/2\right)}}}}{{\frac{1}{\sqrt{2\pi }}\int_x^{\infty }e^{{-\left(t^2/2\right)}}{dt}}}\text{.} The calculation is based on a Chebyshev expansion as described in :meth:`specfun.erfcx_real <naginterfaces.library.specfun.erfcx_real>`. .. _g01mb-py2-py-references: **References** Gross, A J and Clark, V A, 1975, `Survival Distributions: Reliability Applications in the Biomedical Sciences`, Wiley """ raise NotImplementedError
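# Illustrative sketch only: the reciprocal of Mills' Ratio from the Notes above,
# formed directly as Z(x) / Q(x) with ``math.erfc`` supplying the upper tail area.
# ``reciprocal_mills_ratio`` is a hypothetical reference helper. Unlike the
# Chebyshev expansion the library is documented to use, this direct ratio breaks
# down for large positive x, where the tail area underflows to zero.

```python
import math


def reciprocal_mills_ratio(x):
    """Return lambda(x) = Z(x) / Q(x) for the standard Normal distribution."""
    ordinate = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)  # Z(x)
    upper_tail = 0.5 * math.erfc(x / math.sqrt(2.0))              # Q(x)
    if upper_tail == 0.0:
        raise OverflowError("upper tail underflowed; x is too large for this sketch")
    return ordinate / upper_tail


hazard_at_zero = reciprocal_mills_ratio(0.0)  # sqrt(2 / pi)
hazard_at_three = reciprocal_mills_ratio(3.0)
```

# For x > 0 the hazard rate exceeds x, a standard bound that makes a quick
# plausibility check on any computed value.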
[docs]def pdf_landau(x): r""" ``pdf_landau`` returns the value of the Landau density function :math:`\phi \left(\lambda \right)`. .. _g01mt-py2-py-doc: For full information please refer to the NAG Library document for g01mt https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01mtf.html .. _g01mt-py2-py-parameters: **Parameters** **x** : float The argument :math:`\lambda` of the function. **Returns** **pdf** : float The value of the Landau density function :math:`\phi \left(\lambda \right)`. .. _g01mt-py2-py-notes: **Notes** ``pdf_landau`` evaluates an approximation to the Landau density function :math:`\phi \left(\lambda \right)` given by .. math:: \phi \left(\lambda \right) = \frac{1}{{2\pi i}}\int_{{c-i\infty }}^{{c+i\infty }}\mathrm{exp}\left(\lambda s+s\mathrm{ln}\left(s\right)\right){ds}\text{,} where :math:`c` is an arbitrary real constant, using piecewise approximation by rational functions. Further details can be found in Kölbig and Schorr (1984). To obtain the value of :math:`\phi^{\prime }\left(\lambda \right)`, :meth:`pdf_landau_deriv` can be used. .. _g01mt-py2-py-references: **References** Kölbig, K S and Schorr, B, 1984, `A program package for the Landau distribution`, Comp. Phys. Comm. (31), 97--111 """ raise NotImplementedError
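For a rough cross-check of the Landau density, the contour integral above has a well-known real-line form, :math:`\phi(\lambda) = \frac{1}{\pi}\int_0^{\infty} e^{-t\ln t-\lambda t}\sin(\pi t)\,dt`. That representation is an assumption brought in here; it is not stated in the docstring. The sketch below applies Simpson's rule to it. The name ``landau_pdf_quadrature`` and the truncation point and step count are hypothetical choices, suitable only for moderate :math:`\lambda`, and this is not the rational-function method the library uses.

```python
import math


def landau_pdf_quadrature(lam, upper=40.0, steps=40000):
    """Approximate phi(lambda) by Simpson's rule on a real-line integral.

    Illustration only; `upper` and `steps` (which must be even) are ad hoc.
    """

    def integrand(t):
        if t == 0.0:
            return 0.0  # sin(pi * t) vanishes and t * ln(t) tends to 0
        return math.exp(-t * math.log(t) - lam * t) * math.sin(math.pi * t)

    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for k in range(1, steps):
        # Simpson weights alternate 4, 2, 4, ... over the interior nodes.
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / (3.0 * math.pi)


# The density peaks near lambda = -0.22.
phi_near_mode = landau_pdf_quadrature(-0.2228)
```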
[docs]def pdf_vavilov(x, comm): r""" ``pdf_vavilov`` returns the value of the Vavilov density function :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)`. It is intended to be used after a call to :meth:`init_vavilov`. .. _g01mu-py2-py-doc: For full information please refer to the NAG Library document for g01mu https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01muf.html .. _g01mu-py2-py-parameters: **Parameters** **x** : float The argument :math:`\lambda` of the function. **comm** : dict, communication object Communication structure. This argument must have been initialized by a prior call to :meth:`init_vavilov`. **Returns** **pdf** : float The value of the Vavilov density function :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)`. .. _g01mu-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) Either the initialization function has not been called prior to the first call of this function or a communication array has become corrupted. .. _g01mu-py2-py-notes: **Notes** ``pdf_vavilov`` evaluates an approximation to the Vavilov density function :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` given by .. math:: \phi_V\left({\lambda \text{;}\kappa }, \beta^2\right) = \frac{1}{{2\pi i}}\int_{{c-i\infty }}^{{c+i\infty }}e^{{\lambda s}}f\left({s\text{;}\kappa }, \beta^2\right){ds}\text{,} where :math:`\kappa > 0` and :math:`0\leq \beta^2\leq 1`, :math:`c` is an arbitrary real constant and .. 
math:: f\left({s\text{;}\kappa }, \beta^2\right) = C\left(\kappa, \beta^2\right)\mathrm{exp}\left\{s\mathrm{ln}\left(\kappa \right)+\left(s+\kappa \beta^2\right)\left[\mathrm{ln}\left(\frac{s}{\kappa }\right)+E_1\left(\frac{s}{\kappa }\right)\right]-\kappa \mathrm{exp}\left(-\frac{s}{\kappa }\right)\right\}\text{.} :math:`E_1\left(x\right) = \int_0^xt^{-1}\left(1-e^{{-t}}\right){dt}` is the exponential integral, :math:`C\left(\kappa, \beta^2\right) = \mathrm{exp}\left\{\kappa \left(1+\gamma \beta^2\right)\right\}` and :math:`\gamma` is Euler's constant. The method used is based on Fourier expansions. Further details can be found in Schorr (1974). For values of :math:`\kappa \leq 0.01`, the Vavilov distribution can be replaced by the Landau distribution since :math:`\lambda_V = \left(\lambda_L-\mathrm{ln}\left(\kappa \right)\right)/\kappa`. For values of :math:`\kappa \geq 10`, the Vavilov distribution can be replaced by a Gaussian distribution with mean :math:`\mu = \gamma -1-\beta^2-\mathrm{ln}\left(\kappa \right)` and variance :math:`\sigma^2 = \left(2-\beta^2\right)/2\kappa`. .. _g01mu-py2-py-references: **References** Schorr, B, 1974, `Programs for the Landau and the Vavilov distributions and the corresponding random numbers`, Comp. Phys. Comm. (7), 215--224 """ raise NotImplementedError
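The Gaussian replacement quoted above for :math:`\kappa \geq 10` is simple to compute. The sketch below uses hypothetical helper names and reads the variance expression as :math:`\left(2-\beta^2\right)/\left(2\kappa\right)`, which is an interpretation of the inline formula rather than something the docstring spells out.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler's constant, gamma


def vavilov_gaussian_params(kappa, beta2):
    """Mean and variance of the Gaussian limit quoted for kappa >= 10.

    Hypothetical helper; assumes variance = (2 - beta2) / (2 * kappa).
    """
    if kappa < 10.0 or not 0.0 <= beta2 <= 1.0:
        raise ValueError("expects kappa >= 10 and 0 <= beta2 <= 1")
    mean = EULER_GAMMA - 1.0 - beta2 - math.log(kappa)
    variance = (2.0 - beta2) / (2.0 * kappa)
    return mean, variance


def vavilov_gaussian_pdf(lam, kappa, beta2):
    """Gaussian density standing in for the Vavilov density at large kappa."""
    mean, variance = vavilov_gaussian_params(kappa, beta2)
    scale = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-0.5 * (lam - mean) ** 2 / variance) / scale


mean, variance = vavilov_gaussian_params(10.0, 0.5)
```

For :math:`\kappa = 10` and :math:`\beta^2 = 0.5` this gives a variance of :math:`0.075`, so the approximating density is already quite narrow at the lower end of the range.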
[docs]def moments_quad_form(a, sigma, l, emu=None): r""" ``moments_quad_form`` computes the cumulants and moments of quadratic forms in Normal variates. .. _g01na-py2-py-doc: For full information please refer to the NAG Library document for g01na https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01naf.html .. _g01na-py2-py-parameters: **Parameters** **a** : float, array-like, shape :math:`\left(n, n\right)` The :math:`n\times n` symmetric matrix :math:`A`. Only the lower triangle is referenced. **sigma** : float, array-like, shape :math:`\left(n, n\right)` The :math:`n\times n` variance-covariance matrix :math:`\Sigma`. Only the lower triangle is referenced. **l** : int The required number of cumulants, and moments if specified. **emu** : None or float, array-like, shape :math:`\left(:\right)`, optional Note: the required length for this argument is determined as follows: if :math:`\textit{mean}=\texttt{'M'}`: :math:`n`; otherwise: :math:`1`. If :math:`\textit{mean} = \texttt{'M'}`, :math:`\mathrm{emu}` must contain the :math:`n` elements of the vector :math:`\mu`. If :math:`\textit{mean} = \texttt{'Z'}`, :math:`\mathrm{emu}` is not referenced. **Returns** **rkum** : float, ndarray, shape :math:`\left(\mathrm{l}\right)` The :math:`\mathrm{l}` cumulants of the quadratic form. **rmom** : float, ndarray, shape :math:`\left(:\right)` If :math:`\textit{mom} = \texttt{'M'}`, the :math:`\mathrm{l}` moments of the quadratic form. .. _g01na-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{l} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l} \leq 12`. (`errno` :math:`1`) On entry, :math:`\textit{mean} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{mean} = \texttt{'Z'}` or :math:`\texttt{'M'}`. (`errno` :math:`1`) On entry, :math:`\textit{mom} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{mom} = \texttt{'C'}` or :math:`\texttt{'M'}`. 
(`errno` :math:`1`) On entry, :math:`\mathrm{l} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l} \geq 1`. (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n > 1`. (`errno` :math:`2`) On entry, :math:`\mathrm{sigma}` is not positive definite. .. _g01na-py2-py-notes: **Notes** Let :math:`x` have an :math:`n`-dimensional multivariate Normal distribution with mean :math:`\mu` and variance-covariance matrix :math:`\Sigma`. Then for a symmetric matrix :math:`A`, ``moments_quad_form`` computes up to the first :math:`12` moments and cumulants of the quadratic form :math:`Q = x^\mathrm{T}Ax`. The :math:`s`\ th moment (about the origin) is defined as .. math:: E\left(Q^s\right)\text{,} where :math:`E` denotes expectation. The :math:`s`\ th moment of :math:`Q` can also be found as the coefficient of :math:`t^s/s!` in the expansion of :math:`E\left(e^{{Qt}}\right)`. The :math:`s`\ th cumulant is defined as the coefficient of :math:`t^s/s!` in the expansion of :math:`\log\left(E\left(e^{{Qt}}\right)\right)`. The function is based on the function CUM written by Magnus and Pesaran (1993a) and based on the theory given by Magnus (1978), Magnus (1979) and Magnus (1986). .. _g01na-py2-py-references: **References** Magnus, J R, 1978, `The moments of products of quadratic forms in Normal variables`, Statist. Neerlandica (32), 201--210 Magnus, J R, 1979, `The expectation of products of quadratic forms in Normal variables: the practice`, Statist. Neerlandica (33), 131--136 Magnus, J R, 1986, `The exact moments of a ratio of quadratic forms in Normal variables`, Ann. Économ. Statist. (4), 95--109 Magnus, J R and Pesaran, B, 1993, `The evaluation of cumulants and moments of quadratic forms in Normal variables (CUM): Technical description`, Comput. Statist. 
(8), 39--45 Magnus, J R and Pesaran, B, 1993, `The evaluation of moments of quadratic forms and ratios of quadratic forms in Normal variables: Background, motivation and examples`, Comput. Statist. (8), 47--55 """ raise NotImplementedError
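In the zero-mean case the cumulants of :math:`Q = x^\mathrm{T}Ax` have a standard closed form, :math:`\kappa_s = 2^{s-1}\left(s-1\right)!\,\mathrm{tr}\left(\left(A\Sigma\right)^s\right)`, and the moments follow from the usual cumulant-to-moment recursion. The sketch below relies on that textbook result, not on the CUM algorithm the library is based on. All function names are hypothetical and a non-zero mean is not handled.

```python
from math import comb, factorial


def matmul(p, q):
    """Product of two square matrices given as lists of rows."""
    n = len(p)
    return [[sum(p[i][k] * q[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def zero_mean_cumulants(a, sigma, l):
    """Cumulants kappa_1..kappa_l of Q = x^T A x for x ~ N(0, Sigma)."""
    base = matmul(a, sigma)
    power = base  # holds (A Sigma)^s at the start of iteration s
    cumulants = []
    for s in range(1, l + 1):
        trace = sum(power[i][i] for i in range(len(power)))
        cumulants.append(2 ** (s - 1) * factorial(s - 1) * trace)
        power = matmul(power, base)
    return cumulants


def moments_from_cumulants(cumulants):
    """Moments about the origin via m_s = sum_j C(s-1, j) kappa_{j+1} m_{s-1-j}."""
    moments = [1.0]  # m_0 = 1
    for s in range(1, len(cumulants) + 1):
        moments.append(sum(comb(s - 1, j) * cumulants[j] * moments[s - 1 - j]
                           for j in range(s)))
    return moments[1:]


# With A = Sigma = I in two dimensions, Q is chi-square with 2 degrees of freedom.
identity = [[1.0, 0.0], [0.0, 1.0]]
rkum = zero_mean_cumulants(identity, identity, 3)  # [2.0, 4.0, 16.0]
rmom = moments_from_cumulants(rkum)                # [2.0, 8.0, 48.0]
```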
[docs]def moments_ratio_quad_forms(a, b, sigma, l1, l2, eps, c=None, ela=None, emu=None): r""" ``moments_ratio_quad_forms`` computes the moments of ratios of quadratic forms in Normal variables and related statistics. .. _g01nb-py2-py-doc: For full information please refer to the NAG Library document for g01nb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01nbf.html .. _g01nb-py2-py-parameters: **Parameters** **a** : float, array-like, shape :math:`\left(n, n\right)` The :math:`n\times n` symmetric matrix :math:`A`. Only the lower triangle is referenced. **b** : float, array-like, shape :math:`\left(n, n\right)` The :math:`n\times n` positive semidefinite symmetric matrix :math:`B`. Only the lower triangle is referenced. **sigma** : float, array-like, shape :math:`\left(n, n\right)` The :math:`n\times n` variance-covariance matrix :math:`\Sigma`. Only the lower triangle is referenced. **l1** : int The first moment to be computed, :math:`l_1`. **l2** : int The last moment to be computed, :math:`l_2`. **eps** : float The relative accuracy required for the moments; this value is also used in the checks for the existence of the moments. If :math:`\mathrm{eps} = 0.0`, a value of :math:`\sqrt{\epsilon }` is used, where :math:`\epsilon` is the machine precision. **c** : None or float, array-like, shape :math:`\left(:, n\right)`, optional Note: the required extent for this argument in dimension 1 is determined as follows: if :math:`\mathrm{c}\text{ is not }\mathbf{None}`: :math:`n`; otherwise: :math:`0`. If :math:`\mathrm{c}\text{ is not }\mathbf{None}`, :math:`\mathrm{c}` must contain the :math:`n\times n` symmetric matrix :math:`C`; only the lower triangle is referenced. **ela** : None or float, array-like, shape :math:`\left(n\right)`, optional If :math:`\mathrm{ela}\text{ is not }\mathbf{None}`, :math:`\mathrm{ela}` must contain the vector :math:`a` of length :math:`n`, otherwise :math:`\mathrm{ela}` is not referenced. 
**emu** : None or float, array-like, shape :math:`\left(n\right)`, optional If :math:`\mathrm{emu}\text{ is not }\mathbf{None}`, :math:`\mathrm{emu}` must contain the :math:`n` elements of the vector :math:`\mu`. **Returns** **lmax** : int The highest moment computed, :math:`l_{\mathrm{MAX}}`. This will be :math:`l_2` if no exception or warning is raised on exit. **rmom** : float, ndarray, shape :math:`\left(\mathrm{l2}-\mathrm{l1}+1\right)` The :math:`l_1` to :math:`l_{\mathrm{MAX}}` moments. **abserr** : float The estimated maximum absolute error in any computed moment. .. _g01nb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`1`) On entry, :math:`\mathrm{l2} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l2} \leq 12`. (`errno` :math:`1`) On entry, :math:`\mathrm{l1} = \langle\mathit{\boldsymbol{value}}\rangle` and :math:`\mathrm{l2} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l2}\geq \mathrm{l1}`. (`errno` :math:`1`) On entry, :math:`\mathrm{l1} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{l1} \geq 1`. (`errno` :math:`1`) On entry, :math:`\mathrm{eps} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{eps}\neq 0.0`, :math:`\mathrm{eps}\geq \text{machine precision}`. (`errno` :math:`1`) On entry, :math:`n = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`n > 1`. (`errno` :math:`2`) On entry, :math:`\mathrm{sigma}` is not positive definite. (`errno` :math:`2`) On entry, :math:`\mathrm{b}` is not positive semidefinite or is null. (`errno` :math:`3`) Only :math:`\langle\mathit{\boldsymbol{value}}\rangle` moments exist, less than :math:`\mathrm{l1} = \langle\mathit{\boldsymbol{value}}\rangle`, therefore, none of the required moments can be computed. (`errno` :math:`4`) The matrix :math:`L^\mathrm{T}BL` is not positive semidefinite or is null. 
(`errno` :math:`5`) The computation of the eigenvalues required in the calculation of the moments has failed to converge; this is an unlikely error exit. **Warns** **NagAlgorithmicWarning** (`errno` :math:`6`) Only some of the required moments have been computed; the highest is given by :math:`\mathrm{lmax}`. (`errno` :math:`7`) The required accuracy has not been achieved in the integration. An estimate of the accuracy is returned in :math:`\mathrm{abserr}`. .. _g01nb-py2-py-notes: **Notes** Let :math:`x` have an :math:`n`-dimensional multivariate Normal distribution with mean :math:`\mu` and variance-covariance matrix :math:`\Sigma`. Then for a symmetric matrix :math:`A` and symmetric positive semidefinite matrix :math:`B`, ``moments_ratio_quad_forms`` computes a subset, :math:`l_1` to :math:`l_2`, of the first :math:`12` moments of the ratio of quadratic forms .. math:: R = x^\mathrm{T}Ax/x^\mathrm{T}Bx\text{.} The :math:`s`\ th moment (about the origin) is defined as .. math:: E\left(R^s\right)\text{,} where :math:`E` denotes the expectation. Alternatively, this function will compute the following expectations: .. math:: E\left(R^s\left(a^\mathrm{T}x\right)\right) and .. math:: E\left(R^s\left(x^\mathrm{T}Cx\right)\right)\text{,} where :math:`a` is a vector of length :math:`n` and :math:`C` is an :math:`n\times n` symmetric matrix, if they exist. In the case of `(2) <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01nbf.html#eqn2>`__ the moments are zero if :math:`\mu = 0`. The conditions of theorems 1, 2 and 3 of Magnus (1986) and Magnus (1990) are used to check for the existence of the moments. If not all of the requested moments exist, the computations are carried out for the requested moments up to the highest one that exists, :math:`l_{\mathrm{MAX}}`. This function is based on the function QRMOM written by Magnus and Pesaran (1993a) and on the theory given by Magnus (1986) and Magnus (1990). 
The computation of the moments requires first the computation of the eigenvectors of the matrix :math:`L^\mathrm{T}BL`, where :math:`LL^\mathrm{T} = \Sigma`. The matrix :math:`L^\mathrm{T}BL` must be positive semidefinite and not null. Given the eigenvectors of this matrix, a function which has to be integrated over the range zero to infinity can be computed. This integration is performed using :meth:`quad.dim1_inf <naginterfaces.library.quad.dim1_inf>`. .. _g01nb-py2-py-references: **References** Magnus, J R, 1986, `The exact moments of a ratio of quadratic forms in Normal variables`, Ann. Économ. Statist. (4), 95--109 Magnus, J R, 1990, `On certain moments relating to quadratic forms in Normal variables: Further results`, Sankhyā, Ser. B (52), 1--13 Magnus, J R and Pesaran, B, 1993, `The evaluation of cumulants and moments of quadratic forms in Normal variables (CUM): Technical description`, Comput. Statist. (8), 39--45 Magnus, J R and Pesaran, B, 1993, `The evaluation of moments of quadratic forms and ratios of quadratic forms in Normal variables: Background, motivation and examples`, Comput. Statist. (8), 47--55 """ raise NotImplementedError
[docs]def pdf_landau_moment1(x): r""" ``pdf_landau_moment1`` returns the value of the first moment :math:`\Phi_1\left(x\right)` of the Landau density function. .. _g01pt-py2-py-doc: For full information please refer to the NAG Library document for g01pt https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01ptf.html .. _g01pt-py2-py-parameters: **Parameters** **x** : float The argument :math:`x` of the function. **Returns** **mom1** : float The value of the first moment :math:`\Phi_1\left(x\right)` of the Landau density function. .. _g01pt-py2-py-notes: **Notes** ``pdf_landau_moment1`` evaluates an approximation to the first moment :math:`\Phi_1\left(x\right)` of the Landau density function given by .. math:: \Phi_1\left(x\right) = \frac{1}{{\Phi \left(x\right)}}\int_{{-\infty }}^x\lambda \phi \left(\lambda \right){d\lambda }\text{,} where :math:`\phi \left(\lambda \right)` is described in :meth:`pdf_landau`, using piecewise approximation by rational functions. Further details can be found in Kölbig and Schorr (1984). To obtain the value of :math:`\Phi_2\left(x\right)`, :meth:`pdf_landau_moment2` can be used. .. _g01pt-py2-py-references: **References** Kölbig, K S and Schorr, B, 1984, `A program package for the Landau distribution`, Comp. Phys. Comm. (31), 97--111 """ raise NotImplementedError
[docs]def pdf_landau_moment2(x): r""" ``pdf_landau_moment2`` returns the value of the second moment :math:`\Phi_2\left(x\right)` of the Landau density function. .. _g01qt-py2-py-doc: For full information please refer to the NAG Library document for g01qt https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01qtf.html .. _g01qt-py2-py-parameters: **Parameters** **x** : float The argument :math:`x` of the function. **Returns** **mom2** : float The value of the second moment :math:`\Phi_2\left(x\right)` of the Landau density function. .. _g01qt-py2-py-notes: **Notes** ``pdf_landau_moment2`` evaluates an approximation to the second moment :math:`\Phi_2\left(x\right)` of the Landau density function given by .. math:: \Phi_2\left(x\right) = \frac{1}{{\Phi \left(x\right)}}\int_{{-\infty }}^x\lambda^2\phi \left(\lambda \right){d\lambda }\text{,} where :math:`\phi \left(\lambda \right)` is described in :meth:`pdf_landau`, using piecewise approximation by rational functions. Further details can be found in Kölbig and Schorr (1984). To obtain the value of :math:`\Phi_1\left(x\right)`, :meth:`pdf_landau_moment1` can be used. .. _g01qt-py2-py-references: **References** Kölbig, K S and Schorr, B, 1984, `A program package for the Landau distribution`, Comp. Phys. Comm. (31), 97--111 """ raise NotImplementedError
[docs]def pdf_landau_deriv(x): r""" ``pdf_landau_deriv`` returns the value of the derivative :math:`\phi^{\prime }\left(\lambda \right)` of the Landau density function. .. _g01rt-py2-py-doc: For full information please refer to the NAG Library document for g01rt https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01rtf.html .. _g01rt-py2-py-parameters: **Parameters** **x** : float The argument :math:`\lambda` of the function. **Returns** **deriv** : float The value of the derivative :math:`\phi^{\prime }\left(\lambda \right)` of the Landau density function. .. _g01rt-py2-py-notes: **Notes** ``pdf_landau_deriv`` evaluates an approximation to the derivative :math:`\phi^{\prime }\left(\lambda \right)` of the Landau density function given by .. math:: \phi^{\prime }\left(\lambda \right) = \frac{{d\phi \left(\lambda \right)}}{{d\lambda }}\text{,} where :math:`\phi \left(\lambda \right)` is described in :meth:`pdf_landau`, using piecewise approximation by rational functions. Further details can be found in Kölbig and Schorr (1984). To obtain the value of :math:`\phi \left(\lambda \right)`, :meth:`pdf_landau` can be used. .. _g01rt-py2-py-references: **References** Kölbig, K S and Schorr, B, 1984, `A program package for the Landau distribution`, Comp. Phys. Comm. (31), 97--111 """ raise NotImplementedError
[docs]def prob_normal_vector(tail, x, xmu, xstd): r""" ``prob_normal_vector`` returns a number of one or two tail probabilities for the Normal distribution. .. _g01sa-py2-py-doc: For full information please refer to the NAG Library document for g01sa https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01saf.html .. _g01sa-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the returned probabilities should represent. Letting :math:`Z` denote a variate from a standard Normal distribution, and :math:`z_i = \frac{{x_i-\mu_i}}{\sigma_i}`, then for :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{lx}, \textit{ltail}, \textit{lxmu}, \textit{lxstd}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`p_i = P\left(Z\leq z_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`p_i = P\left(Z\geq z_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'C'}` The two tail (confidence interval) probability is returned, i.e., :math:`p_i = P\left(Z\leq \left\lvert z_i\right\rvert \right)-P\left(Z\leq -\left\lvert z_i\right\rvert \right)`. :math:`\mathrm{tail}[j] = \texttt{'S'}` The two tail (significance level) probability is returned, i.e., :math:`p_i = P\left(Z\geq \left\lvert z_i\right\rvert \right)+P\left(Z\leq -\left\lvert z_i\right\rvert \right)`. **x** : float, array-like, shape :math:`\left(\textit{lx}\right)` :math:`x_i`, the Normal variate values. **xmu** : float, array-like, shape :math:`\left(\textit{lxmu}\right)` :math:`\mu_i`, the means. **xstd** : float, array-like, shape :math:`\left(\textit{lxstd}\right)` :math:`\sigma_i`, the standard deviations. 
**Returns** **p** : float, ndarray, shape :math:`\left(\max\left(\textit{lx},\textit{ltail},\textit{lxmu},\textit{lxstd}\right)\right)` :math:`p_i`, the probabilities for the Normal distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{lx},\textit{ltail},\textit{lxmu},\textit{lxstd}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`\sigma_i\leq 0.0`. .. _g01sa-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\textit{ltail} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\textit{lx} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lx} > 0`. (`errno` :math:`4`) On entry, :math:`\textit{lxmu} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lxmu} > 0`. (`errno` :math:`5`) On entry, :math:`\textit{lxstd} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lxstd} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}` or :math:`\mathrm{xstd}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01sa-py2-py-notes: **Notes** The lower tail probability for the Normal distribution, :math:`P\left(X_i\leq x_i\right)` is defined by: .. math:: P\left(X_i\leq x_i\right) = \int_{{-\infty }}^{x_i}Z_i\left(X_i\right){dX_i}\text{,} where .. math:: Z_i\left(X_i\right) = \frac{1}{\sqrt{2\pi \sigma_i^2}}e^{{-\left(X_i-\mu_i\right)^2/\left(2\sigma_i^2\right)}},{-\infty } < X_i < \infty \text{.} The relationship .. 
math:: P\left(X_i\leq x_i\right) = \frac{1}{2}\mathrm{erfc}\left(\frac{{-\left(x_i-\mu_i\right)}}{{\sqrt{2}\sigma_i}}\right) is used, where erfc is the complementary error function, and is computed using :meth:`specfun.erfc_real <naginterfaces.library.specfun.erfc_real>`. When the two tail confidence probability is required the relationship .. math:: P\left(X_i\leq \left\lvert x_i\right\rvert \right)-P\left(X_i\leq -\left\lvert x_i\right\rvert \right) = \mathrm{erf}\left(\frac{{\left\lvert x_i-\mu_i\right\rvert }}{{\sqrt{2}\sigma_i}}\right)\text{,} is used, where erf is the error function, and is computed using :meth:`specfun.erf_real <naginterfaces.library.specfun.erf_real>`. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01sa-py2-py-references: **References** NIST Digital Library of Mathematical Functions Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
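The ``erfc``/``erf`` relationships and the cyclic re-use of shorter input arrays described above can be sketched with the standard library. The function name ``normal_tail_probs`` is hypothetical, and returning ``nan`` for an invalid entry is an assumption of this sketch; the docstring does not say what the library stores in :math:`p_i` in that case.

```python
import math


def normal_tail_probs(tail, x, xmu, xstd):
    """Tail probabilities with cyclic re-use of the shorter input sequences.

    Illustration only; mirrors the documented ivalid codes 0, 1 and 2.
    """
    n = max(len(tail), len(x), len(xmu), len(xstd))
    p, ivalid = [], []
    for i in range(n):
        flag = tail[i % len(tail)]
        sd = xstd[i % len(xstd)]
        if flag not in ("L", "U", "C", "S"):
            p.append(float("nan"))  # assumed placeholder for an invalid tail
            ivalid.append(1)
            continue
        if sd <= 0.0:
            p.append(float("nan"))  # assumed placeholder for an invalid sigma
            ivalid.append(2)
            continue
        # Standardized deviate divided by sqrt(2), as in the erfc relationship.
        u = (x[i % len(x)] - xmu[i % len(xmu)]) / (math.sqrt(2.0) * sd)
        if flag == "L":
            value = 0.5 * math.erfc(-u)
        elif flag == "U":
            value = 0.5 * math.erfc(u)
        elif flag == "C":
            value = math.erf(abs(u))
        else:  # "S": both tails beyond |z|
            value = math.erfc(abs(u))
        p.append(value)
        ivalid.append(0)
    return p, ivalid


# One variate value re-used across four tail settings.
p, ivalid = normal_tail_probs(["L", "U", "C", "S"], [1.96], [0.0], [1.0])
```

Because ``x``, ``xmu`` and ``xstd`` each have length one here, they are re-used for all four evaluations, which is the flexibility the vectorized interface is designed to give.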
[docs]def prob_students_t_vector(tail, t, df): r""" ``prob_students_t_vector`` returns a number of one or two tail probabilities for the Student's :math:`t`-distribution with real degrees of freedom. .. _g01sb-py2-py-doc: For full information please refer to the NAG Library document for g01sb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01sbf.html .. _g01sb-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the returned probabilities should represent. For :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lt}, \textit{ldf}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`p_i = \left({T_i\leq t_i}:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`p_i = \left({T_i\geq t_i}:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'C'}` The two tail (confidence interval) probability is returned, i.e., :math:`p_i = \left({T_i\leq \left\lvert t_i\right\rvert }:\nu_i\right)-\left({T_i\leq -\left\lvert t_i\right\rvert }:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'S'}` The two tail (significance level) probability is returned, i.e., :math:`p_i = \left({T_i\geq \left\lvert t_i\right\rvert }:\nu_i\right)+\left({T_i\leq -\left\lvert t_i\right\rvert }:\nu_i\right)`. **t** : float, array-like, shape :math:`\left(\textit{lt}\right)` :math:`t_i`, the values of the Student's :math:`t` variates. **df** : float, array-like, shape :math:`\left(\textit{ldf}\right)` :math:`\nu_i`, the degrees of freedom of the Student's :math:`t`-distribution. **Returns** **p** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lt},\textit{ldf}\right)\right)` :math:`p_i`, the probabilities for the Student's :math:`t` distribution. 
**ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lt},\textit{ldf}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`\nu_i < 1.0`. .. _g01sb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lt} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}` or :math:`\mathrm{df}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01sb-py2-py-notes: **Notes** The lower tail probability for the Student's :math:`t`-distribution with :math:`\nu_i` degrees of freedom, :math:`\left({T_i\leq t_i}:\nu_i\right)` is defined by: .. math:: \left({T_i\leq t_i}:\nu_i\right) = \frac{{\Gamma \left(\left(\nu_i+1\right)/2\right)}}{{\sqrt{\pi \nu_i}\Gamma \left(\nu_i/2\right)}}\int_{{-\infty }}^{t_i}\left[1+\frac{T_i^2}{\nu_i}\right]^{{-\left(\nu_i+1\right)/2}}dT_i\text{, }\quad \nu_i\geq 1\text{.} Computationally, there are two situations: (i) when :math:`\nu_i < 20`, a transformation of the beta distribution, :math:`P\beta_i\left({B_i\leq \beta_i}:a_i,b_i\right)` is used .. math:: \left({T_i\leq t_i}:\nu_i\right) = \frac{1}{2}P\beta_i\left({B_i\leq \frac{\nu_i}{{\nu_i+t_i^2}}}:{\nu_i/2},\frac{1}{2}\right)\quad \text{ when }t_i < 0.0 or .. 
math:: \left({T_i\leq t_i}:\nu_i\right) = \frac{1}{2}+\frac{1}{2}P{\beta_i}\left({B_i\geq \frac{{\nu_i}}{{\nu_i+t_i^2}}}:{\nu_i/2},\frac{1}{2}\right)\quad \text{ when }t_i > 0.0\text{;} (ii) when :math:`\nu_i\geq 20`, an asymptotic normalizing expansion of the Cornish--Fisher type is used to evaluate the probability, see Hill (1970). The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01sb-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Hill, G W, 1970, `Student's` :math:`t` `-distribution`, Comm. ACM (13(10)), 617--619 """ raise NotImplementedError
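The defining integral for the lower tail probability can be checked numerically without the beta-distribution transformation or the Cornish--Fisher expansion. The sketch below applies Simpson's rule to the density and uses its symmetry about zero; it is a verification aid under those assumptions, not the library's method. The name ``students_t_lower_tail`` and the step count are hypothetical, and ``math.gamma`` limits it to moderate degrees of freedom.

```python
import math


def students_t_lower_tail(t, df, steps=2000):
    """P(T <= t) by Simpson's rule applied to the defining integral.

    Illustration only; `steps` must be even.
    """
    if df < 1.0:
        raise ValueError("df must be at least 1")
    # Normalizing constant of the Student's t density.
    const = math.gamma((df + 1.0) / 2.0) / (
        math.sqrt(math.pi * df) * math.gamma(df / 2.0))

    def density(u):
        return const * (1.0 + u * u / df) ** (-(df + 1.0) / 2.0)

    # By symmetry, P(T <= t) = 1/2 + sign(t) * (area under the density on [0, |t|]).
    upper = abs(t)
    h = upper / steps
    total = density(0.0) + density(upper)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * density(k * h)
    area = total * h / 3.0
    return 0.5 + math.copysign(area, t)


# With one degree of freedom (the Cauchy distribution), P(T <= 1) = 0.75.
p_cauchy = students_t_lower_tail(1.0, 1.0)
```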
[docs]def prob_chisq_vector(tail, x, df): r""" ``prob_chisq_vector`` returns a number of lower or upper tail probabilities for the :math:`\chi^2`-distribution with real degrees of freedom. .. _g01sc-py2-py-doc: For full information please refer to the NAG Library document for g01sc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01scf.html .. _g01sc-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates whether the lower or upper tail probabilities are required. For :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lx}, \textit{ldf}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`p_i = \left({X_i\leq x_i}:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`p_i = \left({X_i\geq x_i}:\nu_i\right)`. **x** : float, array-like, shape :math:`\left(\textit{lx}\right)` :math:`x_i`, the values of the :math:`\chi^2` variates with :math:`\nu_i` degrees of freedom. **df** : float, array-like, shape :math:`\left(\textit{ldf}\right)` :math:`\nu_i`, the degrees of freedom of the :math:`\chi^2`-distribution. **Returns** **p** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lx},\textit{ldf}\right)\right)` :math:`p_i`, the probabilities for the :math:`\chi^2` distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lx},\textit{ldf}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`x_i < 0.0`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\nu_i\leq 0.0`. 
:math:`\mathrm{ivalid}[i-1] = 4` The solution has failed to converge while calculating the gamma variate. The result returned should represent an approximation to the solution. .. _g01sc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lx} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{x}`, :math:`\mathrm{df}` or :math:`\mathrm{tail}` was invalid, or the solution failed to converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01sc-py2-py-notes: **Notes** The lower tail probability for the :math:`\chi^2`-distribution with :math:`\nu_i` degrees of freedom, :math:`P = \left({X_i\leq x_i}: \nu_i\right)` is defined by: .. math:: P = \left({X_i\leq x_i}: \nu_i\right) = \frac{1}{{2^{{\nu_i/2}}\Gamma \left(\nu_i/2\right)}}\int_{0.0}^{x_i}X_i^{{\nu_i/2-1}}e^{{-X_i/2}}{dX_i}\text{, }\quad x_i\geq 0, \nu_i > 0\text{.} To calculate :math:`P = \left({X_i\leq x_i}: \nu_i\right)` a transformation of a gamma distribution is employed, i.e., a :math:`\chi^2`-distribution with :math:`\nu_i` degrees of freedom is equal to a gamma distribution with scale parameter :math:`2` and shape parameter :math:`\nu_i/2`. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. 
_g01sc-py2-py-references: **References** NIST Digital Library of Mathematical Functions Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
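The element re-use rule described in the Notes (index :math:`j = \mathrm{mod}(i-1, \textit{length})` for each input array) can be sketched in plain Python. This is only an illustration of the indexing, not code from naginterfaces; the helper name ``recycle_args`` and the sample values are hypothetical.

```python
def recycle_args(*arrays):
    """Yield one tuple of arguments per evaluation, re-using short inputs.

    Hypothetical helper: for evaluation i = 1, ..., max(len(array)) it picks
    element j = mod(i - 1, len(array)) from every input sequence.
    """
    if any(len(a) == 0 for a in arrays):
        raise ValueError("every input sequence must be non-empty")
    n_eval = max(len(a) for a in arrays)
    for i in range(1, n_eval + 1):
        yield tuple(a[(i - 1) % len(a)] for a in arrays)


# One tail flag shared by three variates; two df values cycled over them.
tail = ['L']
x = [6.2, 8.4, 10.8]
df = [20.0, 7.5]
triples = list(recycle_args(tail, x, df))
```

Here the third evaluation wraps around and pairs ``x[2]`` with ``df[0]`` again, which is the behaviour the documented modulus index implies.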
def prob_f_vector(tail, f, df1, df2): r""" ``prob_f_vector`` returns a number of lower or upper tail probabilities for the :math:`F` or variance-ratio distribution with real degrees of freedom. .. _g01sd-py2-py-doc: For full information please refer to the NAG Library document for g01sd https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01sdf.html .. _g01sd-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates whether the lower or upper tail probabilities are required. For :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lf}, \textit{ldf1}, \textit{ldf2}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`p_i = \left({F_i\leq f_i}:u_i,v_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`p_i = \left({F_i\geq f_i}:u_i,v_i\right)`. **f** : float, array-like, shape :math:`\left(\textit{lf}\right)` :math:`f_i`, the value of the :math:`F` variate. **df1** : float, array-like, shape :math:`\left(\textit{ldf1}\right)` :math:`u_i`, the degrees of freedom of the numerator variance. **df2** : float, array-like, shape :math:`\left(\textit{ldf2}\right)` :math:`v_i`, the degrees of freedom of the denominator variance. **Returns** **p** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lf}\right)\right)` :math:`p_i`, the probabilities for the :math:`F`-distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lf}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`f_i < 0.0`. 
:math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`u_i\leq 0.0`, or, :math:`v_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 4` The solution has failed to converge. The result returned should represent an approximation to the solution. .. _g01sd-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lf} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf1} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf2} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{f}`, :math:`\mathrm{df1}`, :math:`\mathrm{df2}` or :math:`\mathrm{tail}` was invalid, or the solution failed to converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01sd-py2-py-notes: **Notes** The lower tail probability for the :math:`F`, or variance-ratio, distribution with :math:`u_i` and :math:`v_i` degrees of freedom, :math:`\left({F_i\leq f_i}:u_i,v_i\right)`, is defined by: .. math:: \left({F_i\leq f_i}:u_i,v_i\right) = \frac{{u_i^{{u_i/2}}v_i^{{v_i/2}}\Gamma \left(\left(u_i+v_i\right)/2\right)}}{{\Gamma \left(u_i/2\right)\Gamma \left(v_i/2\right)}}\int_0^{f_i}F_i^{{\left(u_i-2\right)/2}}\left(u_iF_i+v_i\right)^{{-\left(u_i+v_i\right)/2}}dF_i\text{,} for :math:`u_i`, :math:`v_i > 0`, :math:`f_i\geq 0`. The probability is computed by means of a transformation to a beta distribution, :math:`P\beta_i\left({B_i\leq \beta_i}:a_i,b_i\right)`: .. math:: \left({F_i\leq f_i}:u_i,v_i\right) = P\beta_i\left({B_i\leq \frac{{u_if_i}}{{u_if_i+v_i}}}:{u_i/2},{v_i/2}\right) and using a call to :meth:`prob_beta`. 
For very large values of both :math:`u_i` and :math:`v_i`, greater than :math:`10^5`, a normal approximation is used. If only one of :math:`u_i` or :math:`v_i` is greater than :math:`10^5` then a :math:`\chi^2` approximation is used, see Abramowitz and Stegun (1972). The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01sd-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
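The transformation to a beta distribution given in the Notes, :math:`\beta = uf/(uf+v)` with parameters :math:`u/2` and :math:`v/2`, can be checked numerically. The sketch below is not how the library evaluates the incomplete beta function: it assumes a simple composite Simpson rule (only sensible when both beta parameters are at least 1), and the names ``beta_cdf_simpson`` and ``f_lower_tail`` are hypothetical.

```python
import math


def beta_cdf_simpson(x, a, b, n=2000):
    """Regularized incomplete beta I_x(a, b) via composite Simpson's rule.

    Illustrative only: assumes a >= 1 and b >= 1 so the integrand is bounded.
    """
    if not 0.0 <= x <= 1.0 or a < 1.0 or b < 1.0:
        raise ValueError("need 0 <= x <= 1 and a, b >= 1 for this sketch")
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    h = x / n

    def dens(t):
        return t ** (a - 1.0) * (1.0 - t) ** (b - 1.0)

    total = dens(0.0) + dens(x)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * dens(k * h)
    return norm * total * h / 3.0


def f_lower_tail(f, u, v):
    """Lower tail of the F-distribution through the beta transformation."""
    x = u * f / (u * f + v)
    return beta_cdf_simpson(x, u / 2.0, v / 2.0)


# For u = 2 the lower tail has the closed form 1 - (1 + 2 f / v) ** (-v / 2).
approx = f_lower_tail(3.0, 2.0, 4.0)
exact = 1.0 - (1.0 + 2.0 * 3.0 / 4.0) ** (-2.0)
```

With :math:`u = 2` the transformed beta distribution has first parameter 1, which is why a closed form is available for comparison.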
def prob_beta_vector(tail, beta, a, b): r""" ``prob_beta_vector`` computes a number of lower or upper tail probabilities for the beta distribution. .. _g01se-py2-py-doc: For full information please refer to the NAG Library document for g01se https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01sef.html .. _g01se-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates whether lower or upper tail probabilities are required. For :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lbeta}, \textit{la}, \textit{lb}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`p_i = \left({B_i\leq \beta_i}:a_i,{b_i}\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`p_i = \left({B_i\geq \beta_i}:a_i,{b_i}\right)`. **beta** : float, array-like, shape :math:`\left(\textit{lbeta}\right)` :math:`\beta_i`, the value of the beta variate. **a** : float, array-like, shape :math:`\left(\textit{la}\right)` :math:`a_i`, the first parameter of the required beta distribution. **b** : float, array-like, shape :math:`\left(\textit{lb}\right)` :math:`b_i`, the second parameter of the required beta distribution. **Returns** **p** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lbeta}\right)\right)` :math:`p_i`, the probabilities for the beta distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lbeta}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`\beta_i < 0.0`, or, :math:`\beta_i > 1.0`. 
:math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`a_i\leq 0.0`, or, :math:`b_i\leq 0.0`. .. _g01se-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lbeta} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{la} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lb} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{beta}`, :math:`\mathrm{a}`, :math:`\mathrm{b}` or :math:`\mathrm{tail}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01se-py2-py-notes: **Notes** The lower tail probability, :math:`\left({B_i\leq \beta_i}:a_i,{b_i}\right)` is defined by .. math:: \left({B_i\leq \beta_i}:a_i,b_i\right) = \frac{{\Gamma \left(a_i+b_i\right)}}{{\Gamma \left(a_i\right)\Gamma \left(b_i\right)}}\int_0^{\beta_i}B_i^{{a_i-1}}\left(1-B_i\right)^{{b_i-1}}{dB_i} = I_{\beta_i}\left(a_i, b_i\right)\text{, }\quad 0\leq \beta_i\leq 1\text{; }\quad a_i,b_i > 0\text{.} The function :math:`I_{\beta_i}\left(a_i, b_i\right)`, also known as the incomplete beta function is calculated using :meth:`specfun.beta_incomplete <naginterfaces.library.specfun.beta_incomplete>`. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. 
_g01se-py2-py-references: **References** NIST Digital Library of Mathematical Functions Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Majumder, K L and Bhattacharjee, G P, 1973, `Algorithm AS 63. The incomplete beta integral`, Appl. Statist. (22), 409--411 """ raise NotImplementedError
def prob_gamma_vector(tail, g, a, b): r""" ``prob_gamma_vector`` returns a number of lower or upper tail probabilities for the gamma distribution. .. _g01sf-py2-py-doc: For full information please refer to the NAG Library document for g01sf https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01sff.html .. _g01sf-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates whether a lower or upper tail probability is required. For :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lg}, \textit{la}, \textit{lb}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability is returned, i.e., :math:`p_i = \left({G_i\leq g_i}:\alpha_i,\beta_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability is returned, i.e., :math:`p_i = \left({G_i\geq g_i}:\alpha_i,\beta_i\right)`. **g** : float, array-like, shape :math:`\left(\textit{lg}\right)` :math:`g_i`, the value of the gamma variate. **a** : float, array-like, shape :math:`\left(\textit{la}\right)` The parameter :math:`\alpha_i` of the gamma distribution. **b** : float, array-like, shape :math:`\left(\textit{lb}\right)` The parameter :math:`\beta_i` of the gamma distribution. **Returns** **p** : float, ndarray, shape :math:`\left(\max\left(\textit{lg},\textit{la}\right)\right)` :math:`p_i`, the probabilities for the gamma distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{lg},\textit{la}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`g_i < 0.0`. 
:math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\alpha_i\leq 0.0`, or, :math:`\beta_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 4` The solution did not converge in :math:`600` iterations, see :meth:`specfun.gamma_incomplete <naginterfaces.library.specfun.gamma_incomplete>` for details. The probability returned should be a reasonable approximation to the solution. .. _g01sf-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lg} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{la} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lb} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{g}`, :math:`\mathrm{a}`, :math:`\mathrm{b}` or :math:`\mathrm{tail}` was invalid, or the solution did not converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01sf-py2-py-notes: **Notes** The lower tail probability for the gamma distribution with parameters :math:`\alpha_i` and :math:`\beta_i`, :math:`P\left(G_i\leq g_i\right)`, is defined by: .. math:: \left({G_i\leq g_i}:\alpha_i,\beta_i\right) = \frac{1}{{\beta_i^{\alpha_i}\Gamma \left(\alpha_i\right)}}\int_0^{g_i}G_i^{{\alpha_i-1}}e^{{-G_i/\beta_i}}{dG_i}\text{, }\quad \alpha_i > 0.0\text{, }\beta_i > 0.0\text{.} The mean of the distribution is :math:`\alpha_i\beta_i` and its variance is :math:`\alpha_i\beta_i^2`. The transformation :math:`Z_i = \frac{G_i}{\beta_i}` is applied to yield the following incomplete gamma function in normalized form, .. 
math:: \left({G_i\leq g_i}:\alpha_i,\beta_i\right) = \left({Z_i\leq g_i/\beta_i}:\alpha_i,1.0\right) = \frac{1}{{\Gamma \left(\alpha_i\right)}}\int_0^{{g_i/\beta_i}}Z_i^{{\alpha_i-1}}e^{{-Z_i}}{dZ_i}\text{.} This is then evaluated using :meth:`specfun.gamma_incomplete <naginterfaces.library.specfun.gamma_incomplete>`. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01sf-py2-py-references: **References** Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
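The scaling step :math:`Z_i = G_i/\beta_i` described in the Notes can be illustrated for the special case of an integer shape parameter, where the normalized incomplete gamma function reduces to a finite sum. This is a sketch under that assumption, not the general method used by ``specfun.gamma_incomplete``; the function name ``gamma_lower_tail_integer_shape`` is hypothetical.

```python
import math


def gamma_lower_tail_integer_shape(g, alpha, beta):
    """P(G <= g) for a gamma variate with integer shape alpha and scale beta.

    Uses the identity P(alpha, z) = 1 - exp(-z) * sum_{k < alpha} z**k / k!,
    which holds only for positive integer alpha.
    """
    if g < 0.0 or beta <= 0.0 or alpha < 1 or int(alpha) != alpha:
        raise ValueError("need g >= 0, beta > 0 and integer alpha >= 1")
    z = g / beta          # rescale to a unit-scale gamma variate
    term = 1.0            # holds z**k / k!, starting at k = 0
    partial = 0.0
    for k in range(int(alpha)):
        partial += term
        term *= z / (k + 1)
    return 1.0 - math.exp(-z) * partial


# alpha = 1 is the exponential distribution with mean beta.
p_exponential = gamma_lower_tail_integer_shape(2.0, 1, 2.0)
# alpha = 2 evaluated at z = 1.
p_shape_two = gamma_lower_tail_integer_shape(3.0, 2, 3.0)
```

Only the ratio :math:`g_i/\beta_i` enters the result, which is the point of the transformation.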
def prob_binomial_vector(n, p, k): r""" ``prob_binomial_vector`` returns a number of the lower tail, upper tail and point probabilities for the binomial distribution. .. _g01sj-py2-py-doc: For full information please refer to the NAG Library document for g01sj https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01sjf.html .. _g01sj-py2-py-parameters: **Parameters** **n** : int, array-like, shape :math:`\left(\textit{ln}\right)` :math:`n_i`, the first parameter of the binomial distribution. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the second parameter of the binomial distribution. **k** : int, array-like, shape :math:`\left(\textit{lk}\right)` :math:`k_i`, the integer which defines the required probabilities. **Returns** **plek** : float, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{lp}\right)\right)` :math:`\mathrm{Prob}\left\{X_i\leq k_i\right\}`, the lower tail probabilities. **pgtk** : float, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{lp}\right)\right)` :math:`\mathrm{Prob}\left\{X_i > k_i\right\}`, the upper tail probabilities. **peqk** : float, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{lp}\right)\right)` :math:`\mathrm{Prob}\left\{X_i = k_i\right\}`, the point probabilities. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{lp}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, :math:`n_i < 0`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`p_i\leq 0.0`, or, :math:`p_i\geq 1.0`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`k_i < 0`, or, :math:`k_i > n_i`. :math:`\mathrm{ivalid}[i-1] = 4` On entry, :math:`n_i` is too large to be represented exactly as a real number. :math:`\mathrm{ivalid}[i-1] = 5` On entry, the variance (:math:`\text{} = n_ip_i\left(1-p_i\right)`) exceeds :math:`10^6`.
.. _g01sj-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ln} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lk} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{n}`, :math:`\mathrm{p}` or :math:`\mathrm{k}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01sj-py2-py-notes: **Notes** Let :math:`X = \left\{X_i: {i = 1,2,\ldots,m}\right\}` denote a vector of random variables each having a binomial distribution with parameters :math:`n_i` and :math:`p_i` (:math:`n_i\geq 0` and :math:`0 < p_i < 1`). Then .. math:: \mathrm{Prob}\left\{X_i = k_i\right\} = \begin{pmatrix}n_i\\k_i\end{pmatrix}p_i^{k_i}\left(1-p_i\right)^{{n_i-k_i}}\text{, }\quad k_i = 0,1,\ldots,n_i\text{.} The mean of each distribution is given by :math:`n_ip_i` and the variance by :math:`n_ip_i\left(1-p_i\right)`. ``prob_binomial_vector`` computes, for given :math:`n_i`, :math:`p_i` and :math:`k_i`, the probabilities: :math:`\mathrm{Prob}\left\{X_i\leq k_i\right\}`, :math:`\mathrm{Prob}\left\{X_i > k_i\right\}` and :math:`\mathrm{Prob}\left\{X_i = k_i\right\}` using an algorithm similar to that described in Knüsel (1986) for the Poisson distribution. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. 
_g01sj-py2-py-references: **References** Knüsel, L, 1986, `Computation of the chi-square and Poisson distribution`, SIAM J. Sci. Statist. Comput. (7), 1022--1036 """ raise NotImplementedError
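The three quantities returned for each evaluation (``plek``, ``pgtk``, ``peqk``) follow directly from the point probability formula in the Notes. The sketch below evaluates them by direct summation for small :math:`n`; it is not the Knüsel-style algorithm the library uses, and the function name ``binomial_probs`` is hypothetical.

```python
import math


def binomial_probs(n, p, k):
    """Return (P(X <= k), P(X > k), P(X = k)) for X ~ Binomial(n, p).

    Direct summation of the point probabilities; intended for small n only.
    """
    if n < 0 or not 0.0 < p < 1.0 or not 0 <= k <= n:
        raise ValueError("need n >= 0, 0 < p < 1 and 0 <= k <= n")
    pmf = [math.comb(n, j) * p ** j * (1.0 - p) ** (n - j)
           for j in range(n + 1)]
    plek = math.fsum(pmf[:k + 1])   # lower tail, includes k
    pgtk = math.fsum(pmf[k + 1:])   # upper tail, excludes k
    peqk = pmf[k]
    return plek, pgtk, peqk


plek, pgtk, peqk = binomial_probs(4, 0.5, 1)
```

Summing the two tails separately, rather than computing one as the complement of the other, avoids cancellation when one tail is very small.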
def prob_poisson_vector(l, k): r""" ``prob_poisson_vector`` returns a number of the lower tail, upper tail and point probabilities for the Poisson distribution. .. _g01sk-py2-py-doc: For full information please refer to the NAG Library document for g01sk https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01skf.html .. _g01sk-py2-py-parameters: **Parameters** **l** : float, array-like, shape :math:`\left(\textit{ll}\right)` :math:`\lambda_i`, the parameter of the Poisson distribution. **k** : int, array-like, shape :math:`\left(\textit{lk}\right)` :math:`k_i`, the integer which defines the required probabilities. **Returns** **plek** : float, ndarray, shape :math:`\left(\max\left(\textit{ll},\textit{lk}\right)\right)` :math:`\mathrm{Prob}\left\{X_i\leq k_i\right\}`, the lower tail probabilities. **pgtk** : float, ndarray, shape :math:`\left(\max\left(\textit{ll},\textit{lk}\right)\right)` :math:`\mathrm{Prob}\left\{X_i > k_i\right\}`, the upper tail probabilities. **peqk** : float, ndarray, shape :math:`\left(\max\left(\textit{ll},\textit{lk}\right)\right)` :math:`\mathrm{Prob}\left\{X_i = k_i\right\}`, the point probabilities. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ll},\textit{lk}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, :math:`\lambda_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`k_i < 0`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\lambda_i > 10^6`. .. _g01sk-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ll} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lk} > 0`. 
**Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{l}` or :math:`\mathrm{k}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01sk-py2-py-notes: **Notes** Let :math:`X = \left\{X_i: {i = 1,2,\ldots,m}\right\}` denote a vector of random variables each having a Poisson distribution with parameter :math:`\lambda_i` :math:`\left(> 0\right)`. Then .. math:: \mathrm{Prob}\left\{X_i = k_i\right\} = e^{{-\lambda_i}}\frac{\lambda_i^{k_i}}{{k_i!}}\text{, }\quad k_i = 0,1,2,\ldots The mean and variance of each distribution are both equal to :math:`\lambda_i`. ``prob_poisson_vector`` computes, for given :math:`\lambda_i` and :math:`k_i` the probabilities: :math:`\mathrm{Prob}\left\{X_i\leq k_i\right\}`, :math:`\mathrm{Prob}\left\{X_i > k_i\right\}` and :math:`\mathrm{Prob}\left\{X_i = k_i\right\}` using the algorithm described in Knüsel (1986). The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01sk-py2-py-references: **References** Knüsel, L, 1986, `Computation of the chi-square and Poisson distribution`, SIAM J. Sci. Statist. Comput. (7), 1022--1036 """ raise NotImplementedError
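The Poisson point probability in the Notes satisfies the recurrence :math:`\mathrm{Prob}\{X = j\} = \mathrm{Prob}\{X = j-1\}\,\lambda/j`, which gives a short way to accumulate the lower tail. The sketch below is a naive illustration, not the Knüsel (1986) algorithm: it assumes moderate :math:`\lambda` (``exp(-lam)`` underflows for very large values) and obtains the upper tail by complement, which loses accuracy when that tail is tiny. The name ``poisson_probs`` is hypothetical.

```python
import math


def poisson_probs(lam, k):
    """Return (P(X <= k), P(X > k), P(X = k)) for X ~ Poisson(lam).

    Naive recurrence; adequate for moderate lam and k only.
    """
    if lam <= 0.0 or k < 0:
        raise ValueError("need lam > 0 and k >= 0")
    term = math.exp(-lam)      # P(X = 0)
    plek = term
    for j in range(1, k + 1):
        term *= lam / j        # P(X = j) from P(X = j - 1)
        plek += term
    return plek, 1.0 - plek, term


plek, pgtk, peqk = poisson_probs(2.0, 2)
```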
def prob_hypergeom_vector(n, l, m, k): r""" ``prob_hypergeom_vector`` returns a number of the lower tail, upper tail and point probabilities for the hypergeometric distribution. .. _g01sl-py2-py-doc: For full information please refer to the NAG Library document for g01sl https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01slf.html .. _g01sl-py2-py-parameters: **Parameters** **n** : int, array-like, shape :math:`\left(\textit{ln}\right)` :math:`n_i`, the population size parameter of the hypergeometric distribution. **l** : int, array-like, shape :math:`\left(\textit{ll}\right)` :math:`l_i`, the sample size parameter of the hypergeometric distribution. **m** : int, array-like, shape :math:`\left(\textit{lm}\right)` :math:`m_i`, the number of marked items in the population, a parameter of the hypergeometric distribution. **k** : int, array-like, shape :math:`\left(\textit{lk}\right)` :math:`k_i`, the integer which defines the required probabilities. **Returns** **plek** : float, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{ll}\right)\right)` :math:`\mathrm{Prob}\left\{X_i\leq k_i\right\}`, the lower tail probabilities. **pgtk** : float, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{ll}\right)\right)` :math:`\mathrm{Prob}\left\{X_i > k_i\right\}`, the upper tail probabilities. **peqk** : float, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{ll}\right)\right)` :math:`\mathrm{Prob}\left\{X_i = k_i\right\}`, the point probabilities. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ln},\textit{ll}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, :math:`n_i < 0`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`l_i < 0`, or, :math:`l_i > n_i`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`m_i < 0`, or, :math:`m_i > n_i`. 
:math:`\mathrm{ivalid}[i-1] = 4` On entry, :math:`k_i < 0`, or, :math:`k_i > l_i`, or, :math:`k_i > m_i`, or, :math:`k_i < l_i+m_i-n_i`. :math:`\mathrm{ivalid}[i-1] = 5` On entry, :math:`n_i` is too large to be represented exactly as a real number. :math:`\mathrm{ivalid}[i-1] = 6` On entry, the variance (see :ref:`Notes <g01sl-py2-py-notes>`) exceeds :math:`10^6`. .. _g01sl-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ln} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ll} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lm} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lk} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{n}`, :math:`\mathrm{l}`, :math:`\mathrm{m}` or :math:`\mathrm{k}` was invalid, or the variance was too large. Check :math:`\mathrm{ivalid}` for more information. .. _g01sl-py2-py-notes: **Notes** Let :math:`X = \left\{X_i: {i = 1,2,\ldots,r}\right\}` denote a vector of random variables having a hypergeometric distribution with parameters :math:`n_i`, :math:`l_i` and :math:`m_i`. Then .. math:: \mathrm{Prob}\left\{X_i = k_i\right\} = \frac{{\begin{pmatrix}m_i\\k_i\end{pmatrix}\begin{pmatrix} n_i - m_i \\ l_i - k_i \end{pmatrix}}}{\begin{pmatrix}n_i\\l_i\end{pmatrix}}\text{,} where :math:`\mathrm{max}\left(0, {l_i+m_i-n_i}\right)\leq k_i\leq \mathrm{min}\left(l_i, m_i\right)`, :math:`0\leq l_i\leq n_i` and :math:`0\leq m_i\leq n_i`. The hypergeometric distribution may arise if in a population of size :math:`n_i` a number :math:`m_i` are marked. 
From this population a sample of size :math:`l_i` is drawn and of these :math:`k_i` are observed to be marked. The mean of the distribution :math:`\text{} = \frac{{l_im_i}}{n_i}`, and the variance :math:`\text{} = \frac{{l_im_i\left(n_i-l_i\right)\left(n_i-m_i\right)}}{{n_i^2\left(n_i-1\right)}}`. ``prob_hypergeom_vector`` computes for given :math:`n_i`, :math:`l_i`, :math:`m_i` and :math:`k_i` the probabilities: :math:`\mathrm{Prob}\left\{X_i\leq k_i\right\}`, :math:`\mathrm{Prob}\left\{X_i > k_i\right\}` and :math:`\mathrm{Prob}\left\{X_i = k_i\right\}` using an algorithm similar to that described in Knüsel (1986) for the Poisson distribution. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01sl-py2-py-references: **References** Knüsel, L, 1986, `Computation of the chi-square and Poisson distribution`, SIAM J. Sci. Statist. Comput. (7), 1022--1036 """ raise NotImplementedError
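The point probability and the support bounds stated in the Notes translate directly into exact integer binomial coefficients. The following sketch sums the point probabilities over the support; it is meant for small populations and is not the library's algorithm. The name ``hypergeom_probs`` is hypothetical.

```python
import math


def hypergeom_probs(n, l, m, k):
    """Return (P(X <= k), P(X > k), P(X = k)) for a hypergeometric variate.

    n: population size, l: sample size, m: marked items, k: marked in sample.
    """
    if n < 0 or not 0 <= l <= n or not 0 <= m <= n:
        raise ValueError("need n >= 0, 0 <= l <= n and 0 <= m <= n")
    lo, hi = max(0, l + m - n), min(l, m)   # support of the distribution
    if not lo <= k <= hi:
        raise ValueError("k lies outside the support")
    denom = math.comb(n, l)

    def pmf(j):
        return math.comb(m, j) * math.comb(n - m, l - j) / denom

    plek = math.fsum(pmf(j) for j in range(lo, k + 1))
    pgtk = math.fsum(pmf(j) for j in range(k + 1, hi + 1))
    return plek, pgtk, pmf(k)


# Population of 10 with 4 marked; sample of 3; exactly or at most 1 marked.
plek, pgtk, peqk = hypergeom_probs(10, 3, 4, 1)
```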
def inv_cdf_normal_vector(tail, p, xmu, xstd): r""" ``inv_cdf_normal_vector`` returns a number of deviates associated with given probabilities of the Normal distribution. .. _g01ta-py2-py-doc: For full information please refer to the NAG Library document for g01ta https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01taf.html .. _g01ta-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the supplied probabilities represent. Letting :math:`Z` denote a variate from a standard Normal distribution, and :math:`z_i = \frac{{x_{p_i}-\mu_i}}{\sigma_i}`, then for :math:`j = \mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lp}, \textit{lxmu}, \textit{lxstd}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability, i.e., :math:`p_i = P\left(Z\leq z_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability, i.e., :math:`p_i = P\left(Z\geq z_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'C'}` The two tail (confidence interval) probability, i.e., :math:`p_i = P\left(Z\leq \left\lvert z_i\right\rvert \right)-P\left(Z\leq -\left\lvert z_i\right\rvert \right)`. :math:`\mathrm{tail}[j] = \texttt{'S'}` The two tail (significance level) probability, i.e., :math:`p_i = P\left(Z\geq \left\lvert z_i\right\rvert \right)+P\left(Z\leq -\left\lvert z_i\right\rvert \right)`. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the probabilities for the Normal distribution as defined by :math:`\mathrm{tail}` with :math:`p_i = \mathrm{p}[j]`, :math:`j = \mathrm{mod}\left({i-1}, \textit{lp}\right)`. **xmu** : float, array-like, shape :math:`\left(\textit{lxmu}\right)` :math:`\mu_i`, the means. **xstd** : float, array-like, shape :math:`\left(\textit{lxstd}\right)` :math:`\sigma_i`, the standard deviations. 
**Returns** **x** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lxmu}\right)\right)` :math:`x_{p_i}`, the deviates for the Normal distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lxmu}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`x_{p_i}`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`p_i\leq 0.0`, or, :math:`p_i\geq 1.0`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\sigma_i\leq 0.0`. .. _g01ta-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lxmu} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lxstd} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}`, :math:`\mathrm{xstd}` or :math:`\mathrm{p}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. _g01ta-py2-py-notes: **Notes** The deviate, :math:`x_{p_i}` associated with the lower tail probability, :math:`p_i`, for the Normal distribution is defined as the solution to .. math:: P\left(X_i\leq x_{p_i}\right) = p_i = \int_{{-\infty }}^{x_{p_i}}Z_i\left(X_i\right){dX_i}\text{,} where .. 
math:: Z_i\left(X_i\right) = \frac{1}{\sqrt{2\pi \sigma_i^2}}e^{{-\left(X_i-\mu_i\right)^2/\left(2\sigma_i^2\right)}}\text{, }{-\infty } < X_i < \infty \text{.} The method used is an extension of that of Wichura (1988). :math:`p_i` is first replaced by :math:`q_i = p_i-0.5`. (a) If :math:`\left\lvert q_i\right\rvert \leq 0.3`, :math:`z_i` is computed by a rational Chebyshev approximation .. math:: z_i = s_i\frac{{A_i\left(s_i^2\right)}}{{B_i\left(s_i^2\right)}}\text{,} where :math:`s_i = \sqrt{2\pi }q_i` and :math:`A_i`, :math:`B_i` are polynomials of degree :math:`7`. (#) If :math:`0.3 < \left\lvert q_i\right\rvert \leq 0.42`, :math:`z_i` is computed by a rational Chebyshev approximation .. math:: z_i = \mathrm{sign}\left(q_i\right)\left(\frac{{C_i\left(t_i\right)}}{{D_i\left(t_i\right)}}\right)\text{,} where :math:`t_i = \left\lvert q_i\right\rvert -0.3` and :math:`C_i`, :math:`D_i` are polynomials of degree :math:`5`. (#) If :math:`\left\lvert q_i\right\rvert > 0.42`, :math:`z_i` is computed as .. math:: z_i = \mathrm{sign}\left(q_i\right)\left[\left(\frac{{E_i\left(u_i\right)}}{{F_i\left(u_i\right)}}\right)+u_i\right]\text{,} where :math:`u_i = \sqrt{-2\times \log\left(\mathrm{min}\left(p_i, {1-p_i}\right)\right)}` and :math:`E_i`, :math:`F_i` are polynomials of degree :math:`6`. :math:`x_{p_i}` is then calculated from :math:`z_i`, using the relationship :math:`z_i = \frac{{x_{p_i}-\mu_i}}{\sigma_i}`. For the upper tail probability :math:`{-x_{p_i}}` is returned, while for the two tail probabilities the value :math:`x_{p_i^*}` is returned, where :math:`p_i^*` is the required tail probability computed from the input value of :math:`p_i`. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. 
See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01ta-py2-py-references: **References** NIST Digital Library of Mathematical Functions Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Wichura, 1988, `Algorithm AS 241: the percentage points of the Normal distribution`, Appl. Statist. (37), 477--484 """ raise NotImplementedError
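The tail handling, the per-element ``ivalid`` codes and the cyclic re-use of shorter input arrays described above can be illustrated without the NAG Library. The following is a minimal sketch, not the library's implementation: it assumes Python's standard ``statistics.NormalDist`` as a stand-in for the rational approximations, it supports only the ``'L'`` and ``'U'`` tails, and the helper name ``normal_deviates_sketch`` is hypothetical.

```python
from statistics import NormalDist


def normal_deviates_sketch(tail, p, xmu, xstd):
    """Return (x, ivalid) lists; shorter inputs are re-used cyclically.

    Hypothetical helper for illustration only. The ivalid codes mirror the
    ones documented above (1: bad tail, 2: bad p, 3: bad standard deviation).
    Only 'L' and 'U' are handled here, so 'C' and 'S' are also reported as 1.
    """
    n = max(len(tail), len(p), len(xmu), len(xstd))
    x, ivalid = [], []
    for i in range(n):
        # Element i of each argument wraps around its own length.
        t = tail[i % len(tail)]
        pi = p[i % len(p)]
        mu = xmu[i % len(xmu)]
        sd = xstd[i % len(xstd)]
        if t not in ("L", "U"):
            code = 1
        elif not 0.0 < pi < 1.0:
            code = 2
        elif sd <= 0.0:
            code = 3
        else:
            code = 0
        ivalid.append(code)
        if code:
            x.append(0.0)  # placeholder value for an invalid element
            continue
        # An upper tail probability p corresponds to lower tail 1 - p.
        lower_p = pi if t == "L" else 1.0 - pi
        x.append(NormalDist(mu, sd).inv_cdf(lower_p))
    return x, ivalid
```

Calling ``normal_deviates_sketch(["L", "U"], [0.975], [0.0], [1.0])`` evaluates two deviates from a single probability, because the one-element arrays are re-used for both tails.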
def inv_cdf_students_t_vector(tail, p, df): r""" ``inv_cdf_students_t_vector`` returns a number of deviates associated with given probabilities of Student's :math:`t`-distribution with real degrees of freedom. .. _g01tb-py2-py-doc: For full information please refer to the NAG Library document for g01tb https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01tbf.html .. _g01tb-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the supplied probabilities represent. For :math:`j = \left(\mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)\right) +1 -1`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lp}, \textit{ldf}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability, i.e., :math:`p_i = \left({T_i\leq t_{p_i}}:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability, i.e., :math:`p_i = \left({T_i\geq t_{p_i}}:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'C'}` The two tail (confidence interval) probability, i.e., :math:`p_i = \left({T_i\leq \left\lvert t_{p_i}\right\rvert }:\nu_i\right)-\left({T_i\leq -\left\lvert t_{p_i}\right\rvert }:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'S'}` The two tail (significance level) probability, i.e., :math:`p_i = \left({T_i\geq \left\lvert t_{p_i}\right\rvert }:\nu_i\right)+\left({T_i\leq -\left\lvert t_{p_i}\right\rvert }:\nu_i\right)`. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the probability of the required Student's :math:`t`-distribution as defined by :math:`\mathrm{tail}`. **df** : float, array-like, shape :math:`\left(\textit{ldf}\right)` :math:`\nu_i`, the degrees of freedom of the Student's :math:`t`-distribution. **Returns** **t** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`t_{p_i}`, the deviates for the Student's :math:`t`-distribution. 
**ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`t_{p_i}`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`p_i\leq 0.0`, or, :math:`p_i\geq 1.0`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\nu_i < 1.0`. :math:`\mathrm{ivalid}[i-1] = 4` The solution has failed to converge. The result returned should represent an approximation to the solution. .. _g01tb-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}`, :math:`\mathrm{p}` or :math:`\mathrm{df}` was invalid, or the solution failed to converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01tb-py2-py-notes: **Notes** The deviate, :math:`t_{p_i}` associated with the lower tail probability, :math:`p_i`, of the Student's :math:`t`-distribution with :math:`\nu_i` degrees of freedom is defined as the solution to .. 
math:: \left({T_i < t_{p_i}}:\nu_i\right) = p_i = \frac{{\Gamma \left(\left(\nu_i+1\right)/2\right)}}{{\sqrt{\nu_i\pi }\Gamma \left(\nu_i/2\right)}}\int_{{-\infty }}^{t_{p_i}}\left(1+\frac{T_i^2}{\nu_i}\right)^{{-\left(\nu_i+1\right)/2}}dT_i\text{, }\quad \nu_i\geq 1\text{; }{-\infty } < t_{p_i} < \infty \text{.} For :math:`\nu_i = 1` or :math:`2` the integral equation is easily solved for :math:`t_{p_i}`. For other values of :math:`\nu_i < 3` a transformation to the beta distribution is used and the result obtained from :meth:`inv_cdf_beta`. For :math:`\nu_i\geq 3` an inverse asymptotic expansion of Cornish--Fisher type is used. The algorithm is described by Hill (1970). The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01tb-py2-py-references: **References** Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Hill, G W, 1970, `Student's` :math:`t` `-distribution`, Comm. ACM (13(10)), 617--619 """ raise NotImplementedError
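The Notes state that the integral equation is easily solved for one or two degrees of freedom but do not give the solutions. The sketch below fills in the standard closed forms for the lower tail: :math:`t_p = \tan\left(\pi \left(p-\frac{1}{2}\right)\right)` for :math:`\nu = 1` and :math:`t_p = \left(2p-1\right)/\sqrt{2p\left(1-p\right)}` for :math:`\nu = 2`. The function names are hypothetical and this is not necessarily how the library evaluates these cases.

```python
import math


def t_deviate_closed_form(p, df):
    """Lower tail deviate of Student's t for df = 1 or 2 (illustrative)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if df == 1:
        # Cauchy distribution: invert 1/2 + atan(t)/pi.
        return math.tan(math.pi * (p - 0.5))
    if df == 2:
        # Invert 1/2 + t / (2 * sqrt(2 + t^2)).
        return (2.0 * p - 1.0) / math.sqrt(2.0 * p * (1.0 - p))
    raise ValueError("closed form shown only for df = 1 or 2")


def t_cdf_closed_form(t, df):
    """Matching lower tail probabilities, used to check the inverse."""
    if df == 1:
        return 0.5 + math.atan(t) / math.pi
    if df == 2:
        return 0.5 + t / (2.0 * math.sqrt(2.0 + t * t))
    raise ValueError("closed form shown only for df = 1 or 2")
```

A round trip such as ``t_cdf_closed_form(t_deviate_closed_form(0.9, 2), 2)`` recovers the input probability up to rounding error.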
def inv_cdf_chisq_vector(tail, p, df): r""" ``inv_cdf_chisq_vector`` returns a number of deviates associated with the given probabilities of the :math:`\chi^2`-distribution with real degrees of freedom. .. _g01tc-py2-py-doc: For full information please refer to the NAG Library document for g01tc https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01tcf.html .. _g01tc-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the supplied probabilities represent. For :math:`j = \left(\mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)\right) +1 -1`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lp}, \textit{ldf}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability, i.e., :math:`p_i = \left({X_i\leq x_{p_i}}:\nu_i\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability, i.e., :math:`p_i = \left({X_i\geq x_{p_i}}:\nu_i\right)`. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the probability of the required :math:`\chi^2`-distribution as defined by :math:`\mathrm{tail}`. **df** : float, array-like, shape :math:`\left(\textit{ldf}\right)` :math:`\nu_i`, the degrees of freedom of the :math:`\chi^2`-distribution. **Returns** **x** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`x_{p_i}`, the deviates for the :math:`\chi^2`-distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`x_{p_i}`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, invalid value for :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\nu_i\leq 0.0`. 
:math:`\mathrm{ivalid}[i-1] = 4` :math:`p_i` is too close to :math:`0.0` or :math:`1.0` for the result to be calculated. :math:`\mathrm{ivalid}[i-1] = 5` The solution has failed to converge. The result should be a reasonable approximation. .. _g01tc-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}`, :math:`\mathrm{p}` or :math:`\mathrm{df}` was invalid, or the solution failed to converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01tc-py2-py-notes: **Notes** The deviate, :math:`x_{p_i}`, associated with the lower tail probability :math:`p_i` of the :math:`\chi^2`-distribution with :math:`\nu_i` degrees of freedom is defined as the solution to .. math:: \left({X_i\leq x_{p_i}}:\nu_i\right) = p_i = \frac{1}{{2^{{\nu_i/2}}\Gamma \left(\nu_i/2\right)}}\int_0^{x_{p_i}}e^{{-X_i/2}}X_i^{{\nu_i/2-1}}{dX_i}\text{, }\quad 0\leq x_{p_i} < \infty \text{; }\nu_i > 0\text{.} The required :math:`x_{p_i}` is found by using the relationship between a :math:`\chi^2`-distribution and a gamma distribution, i.e., a :math:`\chi^2`-distribution with :math:`\nu_i` degrees of freedom is equal to a gamma distribution with scale parameter :math:`2` and shape parameter :math:`\nu_i/2`. For very large values of :math:`\nu_i`, greater than :math:`10^5`, Wilson and Hilferty's Normal approximation to the :math:`\chi^2`-distribution is used; see Kendall and Stuart (1969). 
The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01tc-py2-py-references: **References** Best, D J and Roberts, D E, 1975, `Algorithm AS 91. The percentage points of the` :math:`\chi^2` `distribution`, Appl. Statist. (24), 385--388 Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth Kendall, M G and Stuart, A, 1969, `The Advanced Theory of Statistics (Volume 1)`, (3rd Edition), Griffin """ raise NotImplementedError
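The Wilson and Hilferty Normal approximation mentioned in the Notes can be written as :math:`x_p \approx \nu \left(1-\frac{2}{9\nu }+z_p\sqrt{\frac{2}{9\nu }}\right)^3`, where :math:`z_p` is the standard Normal deviate. The following is a minimal sketch of that textbook formula, assuming ``statistics.NormalDist`` for :math:`z_p`; the function name is hypothetical and the exact form used inside the library is not stated in this documentation.

```python
import math
from statistics import NormalDist


def chisq_deviate_wilson_hilferty(p, df):
    """Approximate lower tail chi-square deviate (Wilson-Hilferty)."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if df <= 0.0:
        raise ValueError("df must be positive")
    z = NormalDist().inv_cdf(p)  # standard Normal deviate z_p
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3
```

For two degrees of freedom the exact deviate is :math:`-2\ln\left(1-p\right)`, which gives a quick way to see how rough the approximation is when :math:`\nu` is small; the routine itself only switches to it for :math:`\nu > 10^5`, where the error is far smaller.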
def inv_cdf_f_vector(tail, p, df1, df2): r""" ``inv_cdf_f_vector`` returns a number of deviates associated with given probabilities of the :math:`F` or variance-ratio distribution with real degrees of freedom. .. _g01td-py2-py-doc: For full information please refer to the NAG Library document for g01td https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01tdf.html .. _g01td-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the supplied probabilities represent. For :math:`j = \left(\mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)\right) +1 -1`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lp}, \textit{ldf1}, \textit{ldf2}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability, i.e., :math:`p_i = \left({F_i\leq f_{p_i}}:{u_i,v_i}\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability, i.e., :math:`p_i = \left({F_i\geq f_{p_i}}:{u_i,v_i}\right)`. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the probability of the required :math:`F`-distribution as defined by :math:`\mathrm{tail}`. **df1** : float, array-like, shape :math:`\left(\textit{ldf1}\right)` :math:`u_i`, the degrees of freedom of the numerator variance. **df2** : float, array-like, shape :math:`\left(\textit{ldf2}\right)` :math:`v_i`, the degrees of freedom of the denominator variance. **Returns** **f** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`f_{p_i}`, the deviates for the :math:`F`-distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`f_{p_i}`. 
:math:`\mathrm{ivalid}[i-1] = 2` On entry, invalid value for :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`u_i\leq 0.0`, or, :math:`v_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 4` The solution has not converged. The result should still be a reasonable approximation to the solution. :math:`\mathrm{ivalid}[i-1] = 5` The value of :math:`p_i` is too close to :math:`0.0` or :math:`1.0` for the result to be computed. This will only occur when the large sample approximations are used. .. _g01td-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf1} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ldf2} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}`, :math:`\mathrm{p}`, :math:`\mathrm{df1}`, :math:`\mathrm{df2}` was invalid, or the solution failed to converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01td-py2-py-notes: **Notes** The deviate, :math:`f_{p_i}`, associated with the lower tail probability, :math:`p_i`, of the :math:`F`-distribution with degrees of freedom :math:`u_i` and :math:`v_i` is defined as the solution to .. 
math:: \left({F_i\leq f_{p_i}}:u_i,v_i\right) = p_i = \frac{{u_i^{{\frac{1}{2}u_i}}v_i^{{\frac{1}{2}v_i}}\Gamma \left(\frac{{u_i+v_i}}{2}\right)}}{{\Gamma \left(\frac{u_i}{2}\right)\Gamma \left(\frac{v_i}{2}\right)}}\int_0^{f_{p_i}}F_i^{{\frac{1}{2}\left(u_i-2\right)}}\left(v_i+u_iF_i\right)^{{-\frac{1}{2}\left(u_i+v_i\right)}}{dF_i}\text{,} where :math:`u_i,v_i > 0`; :math:`0\leq f_{p_i} < \infty`. The value of :math:`f_{p_i}` is computed by means of a transformation to a beta distribution, :math:`P_{\beta_i}\left({B_i\leq \beta_i}:a_i,b_i\right)`: .. math:: \left({F_i\leq f_{p_i}}:u_i,v_i\right) = P_{\beta_i}\left({B_i\leq \frac{{u_if_{p_i}}}{{u_if_{p_i}+v_i}}}:{u_i/2},{v_i/2}\right) and using a call to :meth:`inv_cdf_beta_vector`. For very large values of both :math:`u_i` and :math:`v_i`, greater than :math:`10^5`, a Normal approximation is used. If only one of :math:`u_i` or :math:`v_i` is greater than :math:`10^5` then a :math:`\chi^2` approximation is used; see Abramowitz and Stegun (1972). The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01td-py2-py-references: **References** Abramowitz, M and Stegun, I A, 1972, `Handbook of Mathematical Functions`, (3rd Edition), Dover Publications Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
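The transformation to the beta distribution used in the Notes can be inverted by hand: if :math:`\beta = uf/\left(uf+v\right)` then :math:`f = v\beta /\left(u\left(1-\beta \right)\right)`. The sketch below shows only this change of variable, with hypothetical function names; obtaining the beta deviate itself is assumed to be done elsewhere.

```python
def beta_from_f(f, u, v):
    """Map an F deviate with (u, v) degrees of freedom to the beta scale."""
    return u * f / (u * f + v)


def f_from_beta(beta, u, v):
    """Inverse map: recover the F deviate from a Beta(u/2, v/2) deviate."""
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must lie in [0, 1)")
    return v * beta / (u * (1.0 - beta))
```

As a check, with :math:`u = v = 2` the beta distribution has parameters :math:`\left(1, 1\right)` and is uniform, so its deviate for lower tail probability :math:`p` is :math:`p` itself and ``f_from_beta(p, 2, 2)`` gives :math:`p/\left(1-p\right)`, which matches the :math:`F` distribution function :math:`f/\left(1+f\right)` for those degrees of freedom.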
def inv_cdf_beta_vector(tail, p, a, b, tol=0.0): r""" ``inv_cdf_beta_vector`` returns a number of deviates associated with given probabilities of the beta distribution. .. _g01te-py2-py-doc: For full information please refer to the NAG Library document for g01te https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01tef.html .. _g01te-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the supplied probabilities represent. For :math:`j = \left(\mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)\right) +1 -1`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lp}, \textit{la}, \textit{lb}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability, i.e., :math:`p_i = \left({B_i\leq \beta_{p_i}}:{a_i,b_i}\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability, i.e., :math:`p_i = \left({B_i\geq \beta_{p_i}}:{a_i,b_i}\right)`. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the probability of the required beta distribution as defined by :math:`\mathrm{tail}`. **a** : float, array-like, shape :math:`\left(\textit{la}\right)` :math:`a_i`, the first parameter of the required beta distribution. **b** : float, array-like, shape :math:`\left(\textit{lb}\right)` :math:`b_i`, the second parameter of the required beta distribution. **tol** : float, optional The relative accuracy required by you in the results. If ``inv_cdf_beta_vector`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. **Returns** **beta** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`\beta_{p_i}`, the deviates for the beta distribution. 
**ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`\beta_{p_i}`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, :math:`p_i < 0.0`, or, :math:`p_i > 1.0`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`a_i\leq 0.0`, or, :math:`a_i > 10^6`, or, :math:`b_i\leq 0.0`, or, :math:`b_i > 10^6`. :math:`\mathrm{ivalid}[i-1] = 4` The solution has not converged but the result should be a reasonable approximation to the solution. :math:`\mathrm{ivalid}[i-1] = 5` Requested accuracy not achieved when calculating the beta probability. The result should be a reasonable approximation to the correct solution. .. _g01te-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{la} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lb} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}`, :math:`\mathrm{p}`, :math:`\mathrm{a}`, or :math:`\mathrm{b}` was invalid, or the solution failed to converge. Check :math:`\mathrm{ivalid}` for more information. .. _g01te-py2-py-notes: **Notes** The deviate, :math:`\beta_{p_i}`, associated with the lower tail probability, :math:`p_i`, of the beta distribution with parameters :math:`a_i` and :math:`b_i` is defined as the solution to .. 
math:: \left({B_i\leq \beta_{p_i}}:a_i,b_i\right) = p_i = \frac{{\Gamma \left(a_i+b_i\right)}}{{\Gamma \left(a_i\right)\Gamma \left(b_i\right)}}\int_0^{{\beta_{p_i}}}B_i^{{a_i-1}}\left(1-B_i\right)^{{b_i-1}}{dB_i}\text{, }\quad 0\leq \beta_{p_i}\leq 1\text{; }a_i,b_i > 0\text{.} The algorithm is a modified version of the Newton--Raphson method, following closely that of Cran `et al.` (1977). An initial approximation, :math:`\beta_{{i0}}`, to :math:`\beta_{p_i}` is found (see Cran `et al.` (1977)), and the Newton--Raphson iteration .. math:: \beta_k = \beta_{{k-1}}-\frac{{f_i\left(\beta_{{k-1}}\right)}}{{f_i^{\prime }\left(\beta_{{k-1}}\right)}}\text{,} where :math:`f_i\left(\beta_k\right) = \left({B_i\leq \beta_k}:a_i,{b_i}\right)-p_i` is used, with modifications to ensure that :math:`\beta_k` remains in the range :math:`\left(0, 1\right)`. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01te-py2-py-references: **References** Cran, G W, Martin, K J and Thomas, G E, 1977, `Algorithm AS 109. Inverse of the incomplete beta function ratio`, Appl. Statist. (26), 111--114 Hastings, N A J and Peacock, J B, 1975, `Statistical Distributions`, Butterworth """ raise NotImplementedError
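The Newton--Raphson iteration in the Notes, with a safeguard that keeps each iterate inside :math:`\left(0, 1\right)`, can be sketched as follows. This is an illustration under loud assumptions, not the method of Cran `et al.`: the starting value is simply :math:`0.5` rather than their initial approximation, the distribution function is evaluated by composite Simpson's rule (so the sketch is restricted to :math:`a,b\geq 1`, where the density is bounded), and all function names are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of the beta distribution at x."""
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


def beta_cdf(x, a, b, n=2000):
    """Lower tail probability by composite Simpson's rule (n must be even)."""
    h = x / n
    total = beta_pdf(0.0, a, b) + beta_pdf(x, a, b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * beta_pdf(k * h, a, b)
    return total * h / 3.0


def beta_deviate_newton(p, a, b, tol=1e-10, max_iter=100):
    """Solve beta_cdf(x) = p by safeguarded Newton-Raphson (illustrative)."""
    if a < 1.0 or b < 1.0:
        raise ValueError("this sketch assumes a >= 1 and b >= 1")
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    x = 0.5  # assumed starting value, not the Cran et al. approximation
    for _ in range(max_iter):
        # f(x) = CDF(x) - p and f'(x) is the density.
        x_new = x - (beta_cdf(x, a, b) - p) / beta_pdf(x, a, b)
        # Safeguard: if the step leaves (0, 1), go halfway to the bound.
        if x_new <= 0.0:
            x_new = 0.5 * x
        elif x_new >= 1.0:
            x_new = 0.5 * (x + 1.0)
        if abs(x_new - x) <= tol * x_new:
            return x_new
        x = x_new
    return x  # not converged; still an approximation
```

For parameters :math:`\left(2, 1\right)` the distribution function is :math:`x^2`, so ``beta_deviate_newton(0.81, 2.0, 1.0)`` should approach :math:`0.9`; the first Newton step from :math:`0.5` overshoots :math:`1` and is pulled back by the safeguard.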
def inv_cdf_gamma_vector(tail, p, a, b, tol=0.0): r""" ``inv_cdf_gamma_vector`` returns a number of deviates associated with given probabilities of the gamma distribution. .. _g01tf-py2-py-doc: For full information please refer to the NAG Library document for g01tf https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01tff.html .. _g01tf-py2-py-parameters: **Parameters** **tail** : str, length 1, array-like, shape :math:`\left(\textit{ltail}\right)` Indicates which tail the supplied probabilities represent. For :math:`j = \left(\mathrm{mod}\left({\textit{i}-1}, \textit{ltail}\right)\right) +1 -1`, for :math:`\textit{i} = 1,2,\ldots,\mathrm{max}\left(\textit{ltail}, \textit{lp}, \textit{la}, \textit{lb}\right)`: :math:`\mathrm{tail}[j] = \texttt{'L'}` The lower tail probability, i.e., :math:`p_i = \left({G_i\leq g_{p_i}}:{\alpha_i,\beta_i}\right)`. :math:`\mathrm{tail}[j] = \texttt{'U'}` The upper tail probability, i.e., :math:`p_i = \left({G_i\geq g_{p_i}}:{\alpha_i,\beta_i}\right)`. **p** : float, array-like, shape :math:`\left(\textit{lp}\right)` :math:`p_i`, the probability of the required gamma distribution as defined by :math:`\mathrm{tail}`. **a** : float, array-like, shape :math:`\left(\textit{la}\right)` :math:`\alpha_i`, the first parameter of the required gamma distribution. **b** : float, array-like, shape :math:`\left(\textit{lb}\right)` :math:`\beta_i`, the second parameter of the required gamma distribution. **tol** : float, optional The relative accuracy required by you in the results. If ``inv_cdf_gamma_vector`` is entered with :math:`\mathrm{tol}` greater than or equal to :math:`1.0` or less than :math:`10\times \text{machine precision}` (see :meth:`machine.precision <naginterfaces.library.machine.precision>`), the value of :math:`10\times \text{machine precision}` is used instead. 
**Returns** **g** : float, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`g_{p_i}`, the deviates for the gamma distribution. **ivalid** : int, ndarray, shape :math:`\left(\max\left(\textit{ltail},\textit{lp}\right)\right)` :math:`\mathrm{ivalid}[i-1]` indicates any errors with the input arguments, with :math:`\mathrm{ivalid}[i-1] = 0` No error. :math:`\mathrm{ivalid}[i-1] = 1` On entry, invalid value supplied in :math:`\mathrm{tail}` when calculating :math:`g_{p_i}`. :math:`\mathrm{ivalid}[i-1] = 2` On entry, invalid value for :math:`p_i`. :math:`\mathrm{ivalid}[i-1] = 3` On entry, :math:`\alpha_i\leq 0.0`, or, :math:`\alpha_i > 10^6`, or, :math:`\beta_i\leq 0.0`. :math:`\mathrm{ivalid}[i-1] = 4` :math:`p_i` is too close to :math:`0.0` or :math:`1.0` to enable the result to be calculated. :math:`\mathrm{ivalid}[i-1] = 5` The solution has failed to converge. The result may be a reasonable approximation. .. _g01tf-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`2`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{ltail} > 0`. (`errno` :math:`3`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lp} > 0`. (`errno` :math:`4`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{la} > 0`. (`errno` :math:`5`) On entry, :math:`\text{array size} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{lb} > 0`. **Warns** **NagAlgorithmicWarning** (`errno` :math:`1`) On entry, at least one value of :math:`\mathrm{tail}`, :math:`\mathrm{p}`, :math:`\mathrm{a}`, or :math:`\mathrm{b}` was invalid. Check :math:`\mathrm{ivalid}` for more information. .. 
_g01tf-py2-py-notes: **Notes** The deviate, :math:`g_{p_i}`, associated with the lower tail probability, :math:`p_i`, of the gamma distribution with shape parameter :math:`\alpha_i` and scale parameter :math:`\beta_i`, is defined as the solution to .. math:: \left({G_i\leq g_{p_i}}:\alpha_i,\beta_i\right) = p_i = \frac{1}{{\beta_i^{\alpha_i}\Gamma \left(\alpha_i\right)}}\int_0^{g_{p_i}}e^{{-G_i/\beta_i}}G_i^{{\alpha_i-1}}{dG_i}\text{, }\quad 0\leq g_{p_i} < \infty \text{; }\alpha_i,\beta_i > 0\text{.} The method used is described by Best and Roberts (1975) making use of the relationship between the gamma distribution and the :math:`\chi^2`-distribution. Let :math:`y_i = 2\frac{g_{p_i}}{\beta_i}`. The required :math:`y_i` is found from the Taylor series expansion .. math:: y_i = y_0+\sum_r\frac{{C_r\left(y_0\right)}}{{r!}}\left(\frac{E_i}{{\phi_i\left(y_0\right)}}\right)^r\text{,} where :math:`y_0` is a starting approximation, :math:`C_1\left(u_i\right) = 1`, :math:`C_{{r+1}}\left(u_i\right) = \left(r\Psi_i+\frac{d}{{du_i}}\right)C_r\left(u_i\right)`, :math:`\Psi_i = \frac{1}{2}-\frac{{\alpha_i-1}}{u_i}`, :math:`E_i = p_i-\int_0^{y_0}\phi_i\left(u_i\right){du_i}`, :math:`\phi_i\left(u_i\right) = \frac{1}{{2^{\alpha_i}\Gamma \left(\alpha_i\right)}}e^{{-u_i/2}}u_i^{{\alpha_i-1}}`. For most values of :math:`p_i` and :math:`\alpha_i` the starting value .. math:: y_{01} = 2\alpha_i\left(z_i\sqrt{\frac{1}{{9\alpha_i}}}+1-\frac{1}{{9\alpha_i}}\right)^3 is used, where :math:`z_i` is the deviate associated with a lower tail probability of :math:`p_i` for the standard Normal distribution. For :math:`p_i` close to zero, .. math:: y_{02} = \left(p_i\alpha_i2^{\alpha_i}\Gamma \left(\alpha_i\right)\right)^{{1/\alpha_i}} is used. For large :math:`p_i` values, when :math:`y_{01} > 4.4\alpha_i+6.0`, .. 
math:: y_{03} = -2\left[\mathrm{ln}\left(1-p_i\right)-\left(\alpha_i-1\right)\mathrm{ln}\left(\frac{1}{2}y_{01}\right)+\mathrm{ln}\left(\Gamma \left(\alpha_i\right)\right)\right] is found to be a better starting value than :math:`y_{01}`. For small :math:`\alpha_i` :math:`\left(\alpha_i\leq 0.16\right)`, :math:`p_i` is expressed in terms of an approximation to the exponential integral and :math:`y_{04}` is found by Newton--Raphson iterations. Seven terms of the Taylor series are used to refine the starting approximation, repeating the process if necessary until the required accuracy is obtained. The input arrays to this function are designed to allow maximum flexibility in the supply of vector arguments by re-using elements of any arrays that are shorter than the total number of evaluations required. See `the G01 Introduction <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01intro.html#vectorizeddoc>`__ for further information. .. _g01tf-py2-py-references: **References** Best, D J and Roberts, D E, 1975, `Algorithm AS 91. The percentage points of the` :math:`\chi^2` `distribution`, Appl. Statist. (24), 385--388 """ raise NotImplementedError
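The starting values :math:`y_{01}`, :math:`y_{02}` and :math:`y_{03}` quoted in the Notes, and the rescaling :math:`g_p = \beta y/2`, can be evaluated directly. The sketch below only computes those starting approximations; it does not perform the Taylor series refinement, the function names are hypothetical, and ``statistics.NormalDist`` is assumed for the standard Normal deviate :math:`z`.

```python
import math
from statistics import NormalDist


def gamma_start_y01(p, alpha):
    """Starting value y_01 (used for most values of p and alpha)."""
    z = NormalDist().inv_cdf(p)
    c = 1.0 / (9.0 * alpha)
    return 2.0 * alpha * (z * math.sqrt(c) + 1.0 - c) ** 3


def gamma_start_y02(p, alpha):
    """Starting value y_02 (for p close to zero)."""
    return (p * alpha * 2.0 ** alpha * math.gamma(alpha)) ** (1.0 / alpha)


def gamma_start_y03(p, alpha):
    """Starting value y_03 (for large p, when y_01 > 4.4 * alpha + 6)."""
    y01 = gamma_start_y01(p, alpha)
    return -2.0 * (
        math.log(1.0 - p)
        - (alpha - 1.0) * math.log(0.5 * y01)
        + math.lgamma(alpha)
    )


def gamma_deviate_from_y(y, beta):
    """Undo the substitution y = 2 g / beta."""
    return 0.5 * beta * y
```

With :math:`\alpha = 1` the gamma distribution is exponential and the exact solution is :math:`y = -2\ln\left(1-p\right)`. In that case :math:`y_{03}` reproduces the exact value, and :math:`y_{02} = 2p` agrees with it to first order for small :math:`p`.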
def moving_average(m, x, iwt=0, wt=None, pn=0, wantsd=False, comm=None): r""" ``moving_average`` calculates the mean and, optionally, the standard deviation using a rolling window for an arbitrarily sized data stream. .. _g01wa-py2-py-doc: For full information please refer to the NAG Library document for g01wa https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01waf.html .. _g01wa-py2-py-parameters: **Parameters** **m** : int :math:`m`, the length of the rolling window. If :math:`\mathrm{pn}\neq 0`, :math:`\mathrm{m}` must be unchanged since the last call to ``moving_average``. **x** : float, array-like, shape :math:`\left(\textit{nb}\right)` The current block of observations, corresponding to :math:`x_{\textit{i}}`, for :math:`\textit{i} = k+1,\ldots,k+b`, where :math:`k` is the number of observations processed so far and :math:`b` is the size of the current block of data. **iwt** : int, optional The type of weighting to use. :math:`\mathrm{iwt} = 0` No weights are used. :math:`\mathrm{iwt} = 1` Each observation has its own weight. :math:`\mathrm{iwt} = 2` Each position in the window has its own weight. :math:`\mathrm{iwt} = 3` Each position in the window has a weight equal to its position number. If :math:`\mathrm{pn}\neq 0`, :math:`\mathrm{iwt}` must be unchanged since the last call to ``moving_average``. **wt** : None or float, array-like, shape :math:`\left(:\right)`, optional Note: the required length for this argument is determined as follows: if :math:`\mathrm{iwt}=1`: :math:`\textit{nb}`; if :math:`\mathrm{iwt}=2`: :math:`\mathrm{m}`; otherwise: :math:`0`. The user-supplied weights. If :math:`\mathrm{iwt} = 1`, :math:`\mathrm{wt}[\textit{i}-1] = \nu_{{\textit{i}+k}}`, for :math:`\textit{i} = 1,2,\ldots,b`. If :math:`\mathrm{iwt} = 2`, :math:`\mathrm{wt}[\textit{j}-1] = w_{\textit{j}}`, for :math:`\textit{j} = 1,2,\ldots,m`. **pn** : int, optional :math:`k`, the number of observations processed so far. 
On the first call to ``moving_average``, or when starting to summarise a new dataset, :math:`\mathrm{pn}` must be set to :math:`0`. If :math:`\mathrm{pn}\neq 0`, it must be the same value as returned by the last call to ``moving_average``. **wantsd** : bool, optional If the standard deviations are required then :math:`\mathrm{wantsd}` should be set to :math:`\mathbf{True}`. **comm** : None or dict, communication object, optional, modified in place Communication structure. If :math:`\mathbf{None}`, all the data must be supplied in one go; otherwise it need not be set. **Returns** **pn** : int :math:`k+b`, the updated number of observations processed so far. **rmean** : float, ndarray, shape :math:`\left(\max\left(0,{ \textit{nb} + \min\left(0,{ \mathrm{pn} - \mathrm{m} + 1 }\right) }\right)\right)` :math:`\mu_{\textit{l}}`, the (weighted) moving averages, for :math:`\textit{l} = 1,2,\ldots,{b+\mathrm{min}\left(0, {k-m+1}\right)}`. Therefore, :math:`\mu_l` is the mean of the data in the window that ends on :math:`\mathrm{x}[l+m-\mathrm{min}\left(k, {m-1}\right)-2]`. If, on entry, :math:`\mathrm{pn}\geq \mathrm{m}-1`, i.e., at least one window's worth of data has been previously processed, then :math:`\mathrm{rmean}[l-1]` is the summary corresponding to the window that ends on :math:`\mathrm{x}[l-1]`. On the other hand, if, on entry, :math:`\mathrm{pn} = 0`, i.e., no data has been previously processed, then :math:`\mathrm{rmean}[l-1]` is the summary corresponding to the window that ends on :math:`\mathrm{x}[\mathrm{m}+l-2]` (or, equivalently, starts on :math:`\mathrm{x}[l-1]`). **rsd** : None or float, ndarray, shape :math:`\left(:\right)` If :math:`\mathrm{wantsd} = \mathbf{True}` then :math:`\sigma_l`, the (weighted) standard deviation. The ordering of :math:`\mathrm{rsd}` is the same as the ordering of :math:`\mathrm{rmean}`. .. 
_g01wa-py2-py-errors: **Raises** **NagValueError** (`errno` :math:`11`) On entry, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{m}\geq 1`. (`errno` :math:`12`) On entry, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle`. On entry at previous call, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{pn} > 0`, :math:`\mathrm{m}` must be unchanged since previous call. (`errno` :math:`21`) On entry, :math:`\textit{nb} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\textit{nb}\geq 0`. (`errno` :math:`22`) On entry, :math:`\textit{nb} = \langle\mathit{\boldsymbol{value}}\rangle`, :math:`\mathrm{m} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{comm}` is **None**, :math:`\textit{nb}\geq \mathrm{m}`. (`errno` :math:`41`) On entry, :math:`\mathrm{iwt} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{iwt} = 0`, :math:`1`, :math:`2` or :math:`3`. (`errno` :math:`42`) On entry, :math:`\mathrm{iwt} = \langle\mathit{\boldsymbol{value}}\rangle`. On entry at previous call, :math:`\mathrm{iwt} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{pn} > 0`, :math:`\mathrm{iwt}` must be unchanged since previous call. (`errno` :math:`51`) On entry, :math:`\mathrm{wt}[\langle\mathit{\boldsymbol{value}}\rangle] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: :math:`\mathrm{wt}[i-1]\geq 0`. (`errno` :math:`52`) On entry, :math:`\mathrm{wt}[0] = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{iwt} = 2`, :math:`\mathrm{wt}[0] > 0`. (`errno` :math:`55`) On entry, sum of weights supplied in :math:`\mathrm{wt}` is :math:`\langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{iwt} = 2`, the sum of the weights :math:`> 0`. (`errno` :math:`61`) On entry, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. 
Constraint: :math:`\mathrm{pn}\geq 0`. (`errno` :math:`62`) On entry, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. On exit from previous call, :math:`\mathrm{pn} = \langle\mathit{\boldsymbol{value}}\rangle`. Constraint: if :math:`\mathrm{pn} > 0`, :math:`\mathrm{pn}` must be unchanged since previous call. (`errno` :math:`101`) :math:`\mathrm{comm}`\ ['rcomm'] has been corrupted between calls. **Warns** **NagAlgorithmicWarning** (`errno` :math:`53`) On entry, at least one window had all zero weights. (`errno` :math:`54`) On entry, unable to calculate at least one standard deviation due to the weights supplied. .. _g01wa-py2-py-notes: **Notes** Given a sample of :math:`n` observations, denoted by :math:`x = \left\{x_i:i = 1,2,\ldots,n\right\}` and a set of weights, :math:`w = \left\{w_j:j = 1,2,\ldots,m\right\}`, ``moving_average`` calculates the mean and, optionally, the standard deviation, in a rolling window of length :math:`m`. For the :math:`i`\ th window the mean is defined as .. math:: \mu_i = \frac{{\sum_{{j = 1}}^mw_jx_{{i+j-1}}}}{W} and the standard deviation as .. math:: \sigma_i = \sqrt{\frac{{\sum_{{j = 1}}^mw_j\left(x_{{i+j-1}}-\mu_i\right)^2}}{{W-\frac{{\sum_{{j = 1}}^mw_j^2}}{W}}}} with :math:`W = \sum_{{j = 1}}^mw_j`. Four different types of weighting are possible: (i) **No weights (** :math:`w_j = 1` **)** When no weights are required both the mean and standard deviations can be calculated in an iterative manner, with .. math:: \begin{array}{cc} \mu_{{i+1}} = & \mu_i + \frac{\left(x_{{i+m}}-x_i\right)}{m} \\ \sigma_{{i+1}}^2 = & \left(m-1\right) \sigma_i^2 + \left(x_{{i+m}}-\mu_i\right)^2 - \left(x_i-\mu_i\right)^2 - \frac{\left(x_{{i+m}}-x_i\right)^2}{m} \end{array} where the initial values :math:`\mu_1` and :math:`\sigma_1` are obtained using the one pass algorithm of West (1979). 

    (#) **Each observation has its own weight**

        In this case, rather than supplying a vector of :math:`m` weights, a vector of :math:`n` weights is supplied instead, :math:`v = \left\{v_j:j = 1,2,\ldots,n\right\}` and :math:`w_j = v_{{i+j-1}}` in `[equation] <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01waf.html#mean_rolling_window_eqn>`__ and `[equation] <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01waf.html#sd_rolling_window_eqn>`__.

        If the standard deviations are not required then the mean is calculated using the iterative formula:

        .. math::
            \begin{array}{cc} W_{{i+1}} = & W_i + \left(v_{{i+m}}-v_i\right) \\ \mu_{{i+1}} = & W_{{i+1}}^{-1} \left(W_i\mu_i + v_{{i+m}}x_{{i+m}}-v_ix_i\right) \end{array}

        where :math:`W_1 = \sum_{{i = 1}}^mv_i` and :math:`\mu_1 = W_1^{-1}\sum_{{i = 1}}^mv_ix_i`.

        If both the mean and standard deviation are required then the one pass algorithm of West (1979) is used in each window.

    (#) **Each position in the window has its own weight**

        This is the case as described in `[equation] <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01waf.html#mean_rolling_window_eqn>`__ and `[equation] <https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01waf.html#sd_rolling_window_eqn>`__, where the weight given to each observation differs depending on which summary is being produced.

        When these types of weights are specified both the mean and standard deviation are calculated by applying the one pass algorithm of West (1979) multiple times.

    (#) **Each position in the window has a weight equal to its position number (** :math:`w_j = j` **)**

        This is a special case of \(iii).

        If the standard deviations are not required then the mean is calculated using the iterative formula:

        .. math::
            \begin{array}{cc} S_{{i+1}} = & S_i + \left(x_{{i+m}}-x_i\right) \\ \mu_{{i+1}} = & \mu_i + \frac{{2\left(mx_{{i+m}}-S_i\right)}}{{m\left(m+1\right)}} \end{array}

        where :math:`S_1 = \sum_{{i = 1}}^mx_i` and :math:`\mu_1 = 2\left(m^2+m\right)^{-1}\sum_{{i = 1}}^mix_i`.

        If both the mean and standard deviation are required then the one pass algorithm of West (1979) is applied multiple times.

    For large datasets, or where all the data is not available at the same time, :math:`x` (and if each observation has its own weight, :math:`v`) can be split into arbitrarily sized blocks and ``moving_average`` called multiple times.

    .. _g01wa-py2-py-references:

    **References**
    Chan, T F, Golub, G H and Leveque, R J, 1982, `Updating Formulae and a Pairwise Algorithm for Computing Sample Variances`, Compstat, Physica-Verlag

    West, D H D, 1979, `Updating mean and variance estimates: An improved method`, Comm. ACM (22), 532--535

    See Also
    --------
    :meth:`naginterfaces.library.examples.stat.moving_average_ex.main`
    """
    raise NotImplementedError
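The update formulae in the Notes of ``moving_average`` can be checked numerically against the window definitions of the mean and standard deviation. The following is a minimal pure-Python sketch under stated assumptions, not the NAG implementation: the helper names are hypothetical, the first window is initialised with a plain two-pass computation rather than West's one pass algorithm, and the variance update is carried on the sum of squared deviations, which equals (m - 1) times the variance in the unweighted case. It assumes m >= 2 whenever a standard deviation is requested.

```python
import math


def rolling_direct(x, w):
    """Window means and SDs straight from the definitions, with position weights w."""
    m = len(w)
    total = sum(w)                                      # W
    denom = total - sum(wj * wj for wj in w) / total    # W - sum(w_j^2) / W
    means, sds = [], []
    for i in range(len(x) - m + 1):
        win = x[i:i + m]
        mu = sum(wj * xj for wj, xj in zip(w, win)) / total
        ss = sum(wj * (xj - mu) ** 2 for wj, xj in zip(w, win))
        means.append(mu)
        sds.append(math.sqrt(ss / denom))
    return means, sds


def rolling_unweighted(x, m):
    """No weights: update the mean and the sum of squared deviations."""
    mu = sum(x[:m]) / m
    ss = sum((xj - mu) ** 2 for xj in x[:m])  # two-pass start (assumption)
    means, sds = [mu], [math.sqrt(ss / (m - 1))]
    for i in range(len(x) - m):
        d = x[i + m] - x[i]
        # The update uses the previous window's mean, so ss changes before mu.
        ss += (x[i + m] - mu) ** 2 - (x[i] - mu) ** 2 - d * d / m
        mu += d / m
        means.append(mu)
        sds.append(math.sqrt(max(ss, 0.0) / (m - 1)))  # guard against round-off
    return means, sds


def rolling_position_mean(x, m):
    """Weights w_j = j: update the running sum S_i and the mean."""
    scale = 2.0 / (m * (m + 1))
    s = sum(x[:m])
    mu = scale * sum((j + 1) * xj for j, xj in enumerate(x[:m]))
    means = [mu]
    for i in range(len(x) - m):
        mu += scale * (m * x[i + m] - s)  # uses S_i
        s += x[i + m] - x[i]              # becomes S_{i+1}
        means.append(mu)
    return means


x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
m = 3
direct_mu, direct_sd = rolling_direct(x, [1.0] * m)
iter_mu, iter_sd = rolling_unweighted(x, m)
pos_mu, _ = rolling_direct(x, [1.0, 2.0, 3.0])

print(all(math.isclose(a, b) for a, b in zip(direct_mu, iter_mu)))
print(all(math.isclose(a, b) for a, b in zip(direct_sd, iter_sd)))
print(all(math.isclose(a, b) for a, b in zip(pos_mu, rolling_position_mean(x, m))))
```

Each update costs a constant amount of work per window, whereas the direct evaluation costs work proportional to m per window, which is why the iterative forms are attractive when only one summary type is needed.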
def init_vavilov(rkappa, beta2, mode):
    r"""
    ``init_vavilov`` is used to initialize functions :meth:`pdf_vavilov` and :meth:`prob_vavilov`.
    It is intended to be used before a call to :meth:`pdf_vavilov` or :meth:`prob_vavilov`.

    .. _g01zu-py2-py-doc:

    For full information please refer to the NAG Library document for g01zu

    https://www.nag.com/numeric/nl/nagdoc_29.3/flhtml/g01/g01zuf.html

    .. _g01zu-py2-py-parameters:

    **Parameters**
    **rkappa** : float
        The argument :math:`\kappa` of the function.

    **beta2** : float
        The argument :math:`\beta^2` of the function.

    **mode** : int
        If :math:`\mathrm{mode} = 0`, :meth:`pdf_vavilov` is to be called after the call to ``init_vavilov``. Otherwise, :meth:`prob_vavilov` is to be called.

    **Returns**
    **xl** : float
        :math:`x_l`, a threshold value below which :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` will be set to zero by :meth:`pdf_vavilov` and :math:`\Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` will be set to zero by :meth:`prob_vavilov` if :math:`\lambda < x_l`.

    **xu** : float
        :math:`x_u`, a threshold value above which :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` will be set to zero by :meth:`pdf_vavilov` and :math:`\Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` will be set to unity by :meth:`prob_vavilov` if :math:`\lambda > x_u`.

    **comm** : dict, communication object
        Communication structure.

    .. _g01zu-py2-py-errors:

    **Raises**
    **NagValueError**
        (`errno` :math:`1`)
            On entry, :math:`\mathrm{beta2} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{beta2}\leq 1.0`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{beta2} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{beta2}\geq 0.0`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{rkappa} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{rkappa}\leq 10.0`.

        (`errno` :math:`1`)
            On entry, :math:`\mathrm{rkappa} = \langle\mathit{\boldsymbol{value}}\rangle`.

            Constraint: :math:`\mathrm{rkappa}\geq 0.01`.

        (`errno` :math:`2`)
            The initialization has been abandoned due to an internal error. This error exit is unlikely to occur, but if it does, change the values of :math:`\mathrm{rkappa}` and/or :math:`\mathrm{beta2}` and rerun ``init_vavilov``.

    .. _g01zu-py2-py-notes:

    **Notes**
    ``init_vavilov`` initializes the array :math:`\mathrm{comm}`\ ['rcomm'] for use by :meth:`pdf_vavilov` or :meth:`prob_vavilov` in the evaluation of the Vavilov functions :math:`\phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` and :math:`\Phi_V\left({\lambda \text{;}\kappa }, \beta^2\right)` respectively.
    Multiple calls to :meth:`prob_vavilov` or :meth:`pdf_vavilov` can be made following a single call to ``init_vavilov``, provided that neither :math:`\mathrm{rkappa}` nor :math:`\mathrm{beta2}` changes, and that either all calls are to :meth:`prob_vavilov` or all calls are to :meth:`pdf_vavilov`.
    If you wish to call both :meth:`prob_vavilov` and :meth:`pdf_vavilov`, then you will need to initialize each separately.

    .. _g01zu-py2-py-references:

    **References**
    Schorr, B, 1974, `Programs for the Landau and the Vavilov distributions and the corresponding random numbers`, Comp. Phys. Comm. (7), 215--224
    """
    raise NotImplementedError
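The argument constraints and the role of the returned thresholds in ``init_vavilov`` can be illustrated without the library. The following is a minimal sketch under stated assumptions: ``check_vavilov_args`` and ``apply_thresholds`` are hypothetical helpers that are not part of ``naginterfaces``, the threshold values and the ``evaluate`` callable in the usage lines are made-up stand-ins, and nothing here computes a real Vavilov function.

```python
def check_vavilov_args(rkappa, beta2):
    """Mirror the documented errno 1 constraints (hypothetical helper)."""
    if not 0.01 <= rkappa <= 10.0:
        raise ValueError(f"rkappa = {rkappa}: need 0.01 <= rkappa <= 10.0")
    if not 0.0 <= beta2 <= 1.0:
        raise ValueError(f"beta2 = {beta2}: need 0.0 <= beta2 <= 1.0")


def apply_thresholds(lam, xl, xu, mode, evaluate):
    """Apply the documented cut-offs; `evaluate` is a stand-in for the real function.

    mode == 0 mimics the density case, any other value the distribution case.
    """
    if lam < xl:
        return 0.0                         # both functions are set to zero
    if lam > xu:
        return 0.0 if mode == 0 else 1.0   # density zero, distribution unity
    return evaluate(lam)


# Made-up thresholds and a constant stand-in evaluator, for illustration only.
xl, xu = -1.0, 5.0
check_vavilov_args(rkappa=2.5, beta2=0.7)
print(apply_thresholds(-2.0, xl, xu, 0, lambda lam: 0.5))
print(apply_thresholds(6.0, xl, xu, 1, lambda lam: 0.5))
```

Because the mode is fixed at initialization, a caller that needs both the density and the distribution function would keep two separately initialized communication structures, one per mode.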