If x is a vector, compute the mean of the elements of x
  mean (x) = SUM_i x(i) / N
If x is a matrix, compute the mean for each column and return them in a row vector.
With the optional argument opt, the kind of mean computed can be selected. The following options are recognized:
"a"
- Compute the (ordinary) arithmetic mean. This is the default.
"g"
- Compute the geometric mean.
"h"
- Compute the harmonic mean.
If the optional argument dim is supplied, work along dimension dim.
Both dim and opt are optional. If both are supplied, either may appear first.
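The three kinds of mean can be sketched for a plain vector as follows. This is an illustrative Python sketch of the definitions, not the implementation documented here; the helper name mean_of is hypothetical, and the dim argument is not modelled.

```python
import math

def mean_of(values, opt="a"):
    """Hypothetical helper: arithmetic ("a"), geometric ("g") or harmonic ("h") mean."""
    n = len(values)
    if opt == "a":
        # SUM_i x(i) / N
        return sum(values) / n
    if opt == "g":
        # exp of the mean of the logarithms; needs strictly positive values
        return math.exp(sum(math.log(v) for v in values) / n)
    if opt == "h":
        # reciprocal of the mean of the reciprocals; needs non-zero values
        return n / sum(1.0 / v for v in values)
    raise ValueError("opt must be 'a', 'g' or 'h'")
```

For the vector [1, 2, 4] the arithmetic mean is 7/3, the geometric mean is 2 and the harmonic mean is 12/7.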
If x is a vector, compute the median value of the elements of x.
With the elements of x sorted in ascending order, the median is
  median (x) = x(ceil(N/2))               if N is odd
  median (x) = (x(N/2) + x((N/2)+1))/2    if N is even
If x is a matrix, compute the median value for each column and return them in a row vector.
See also: std, mean.
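The odd and even cases of the median can be sketched for a plain vector as follows. This is an illustrative Python sketch with a hypothetical helper name (median_of); note that Python lists are 0-based, so the 1-based indices of the formula are shifted by one.

```python
def median_of(values):
    """Hypothetical helper: median of a non-empty list, following the odd/even rule."""
    s = sorted(values)          # the rule applies to the sorted data
    n = len(s)
    if n % 2 == 1:
        # 1-based x(ceil(N/2)) is 0-based index (n + 1) // 2 - 1
        return s[(n + 1) // 2 - 1]
    # 1-based x(N/2) and x(N/2 + 1) are 0-based n // 2 - 1 and n // 2
    return (s[n // 2 - 1] + s[n // 2]) / 2
```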
If x is a vector, compute the standard deviation of the elements of x.
  std (x) = sqrt (sumsq (x - mean (x)) / (n - 1))
If x is a matrix, compute the standard deviation for each column and return them in a row vector.
The argument opt determines the type of normalization to use. Valid values are
- 0: normalizes with N-1; this provides the square root of the best unbiased estimator of the variance [default]
- 1: normalizes with N; this provides the square root of the second moment around the mean
The third argument dim determines the dimension along which the standard deviation is calculated.
See also: mean, median.
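The effect of the normalization argument can be sketched for a plain vector as follows. This is an illustrative Python sketch; the helper name std_of is hypothetical and the dim argument is not modelled.

```python
import math

def std_of(values, opt=0):
    """Hypothetical helper: standard deviation with N-1 (opt=0) or N (opt=1) normalization."""
    n = len(values)
    m = sum(values) / n
    sum_sq = sum((v - m) ** 2 for v in values)   # sumsq (x - mean (x))
    divisor = n - 1 if opt == 0 else n
    return math.sqrt(sum_sq / divisor)
```

For [2, 4, 4, 4, 5, 5, 7, 9] the sum of squared deviations is 32, so opt = 1 gives sqrt (32/8) = 2 while the default gives sqrt (32/7).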
If each row of x and y is an observation and each column is a variable, the (i, j)-th entry of cov (x, y) is the covariance between the i-th variable in x and the j-th variable in y. If called with one argument, compute cov (x, x).
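The layout of the result (rows as observations, columns as variables) can be sketched as follows. This is an illustrative Python sketch with a hypothetical helper name (cov_matrix). The text above does not state the normalization; the N-1 divisor used here is an assumption.

```python
def cov_matrix(x, y):
    """Hypothetical helper: c[i][j] is the covariance of column i of x and column j of y.

    x and y are lists of rows (one row per observation).
    ASSUMPTION: sums are divided by N-1; the source text does not say which divisor is used.
    """
    n = len(x)
    if len(y) != n:
        raise ValueError("x and y must have the same number of observations")
    mean_x = [sum(col) / n for col in zip(*x)]
    mean_y = [sum(col) / n for col in zip(*y)]
    return [
        [
            sum((rx[i] - mean_x[i]) * (ry[j] - mean_y[j]) for rx, ry in zip(x, y)) / (n - 1)
            for j in range(len(mean_y))
        ]
        for i in range(len(mean_x))
    ]
```

With two variables in x and one in y, the result has two rows and one column.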
If each row of x and y is an observation and each column is a variable, the (i, j)-th entry of corrcoef (x, y) is the correlation between the i-th variable in x and the j-th variable in y. If called with one argument, compute corrcoef (x, x).
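A single entry of such a correlation matrix, for one pair of variables, can be sketched as follows. This is an illustrative Python sketch of the usual product-moment correlation with a hypothetical helper name (correlation); the result does not depend on whether N or N-1 is used for normalization, since the divisor cancels.

```python
import math

def correlation(u, v):
    """Hypothetical helper: product-moment correlation of two equally long vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    # undefined (division by zero) if either vector is constant
    return s_uv / math.sqrt(s_uu * s_vv)
```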
If x is a vector of length N, return the kurtosis
  kurtosis (x) = N^(-1) std(x)^(-4) sum ((x - mean(x)).^4) - 3
of x. If x is a matrix, return the kurtosis over the first non-singleton dimension. The optional argument dim can be given to force the kurtosis to be given over that dimension.
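The kurtosis formula can be evaluated literally for a plain vector as follows. This is an illustrative Python sketch with a hypothetical helper name (kurtosis_of); it assumes that std in the formula is the default, N-1 normalized standard deviation.

```python
import math

def kurtosis_of(values):
    """Hypothetical helper: N^(-1) std^(-4) sum ((x - mean)^4) - 3.

    ASSUMPTION: std uses the default N-1 normalization.
    """
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return sum((v - m) ** 4 for v in values) / (n * sd ** 4) - 3
```

For [1, 2, 3, 4, 5] the fourth powers of the deviations sum to 34 and std^4 is 6.25, giving 34/31.25 - 3.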
Return the Mahalanobis' D-square distance between the multivariate samples x and y, which must have the same number of components (columns), but may have a different number of observations (rows).
If x is a vector of length N, return the skewness
  skewness (x) = N^(-1) std(x)^(-3) sum ((x - mean(x)).^3)
of x. If x is a matrix, return the skewness along the first non-singleton dimension of the matrix. If the optional dim argument is given, operate along this dimension.
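The skewness formula can be evaluated literally for a plain vector as follows. This is an illustrative Python sketch with a hypothetical helper name (skewness_of); it assumes that std in the formula is the default, N-1 normalized standard deviation.

```python
import math

def skewness_of(values):
    """Hypothetical helper: N^(-1) std^(-3) sum ((x - mean)^3).

    ASSUMPTION: std uses the default N-1 normalization.
    """
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return sum((v - m) ** 3 for v in values) / (n * sd ** 3)
```

A symmetric sample such as [1, 2, 3, 4, 5] has skewness 0; a sample with a long right tail such as [1, 2, 6] has positive skewness.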
Return the distinct values in a column vector, arranged in ascending order.
For vector arguments, return the (real) variance of the values. For matrix arguments, return a row vector containing the variance for each column.
The argument opt determines the type of normalization to use. Valid values are
- 0: normalizes with N-1; this provides the best unbiased estimator of the variance [default]
- 1: normalizes with N; this provides the second moment around the mean
The third argument dim determines the dimension along which the variance is calculated.
Create a contingency table t from data vectors. The l vectors are the corresponding levels.
Currently, only 1- and 2-dimensional tables are supported.
If x is a vector, subtract its mean and divide by its standard deviation.
If x is a matrix, do the above along the first non-singleton dimension. If the optional argument dim is given then operate along this dimension.
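Subtracting the mean and dividing by the standard deviation can be sketched for a plain vector as follows. This is an illustrative Python sketch with a hypothetical helper name (standardize); it assumes the N-1 normalized standard deviation.

```python
import math

def standardize(values):
    """Hypothetical helper: (x - mean) / std for each element.

    ASSUMPTION: std uses N-1 normalization; a constant vector would divide by zero.
    """
    n = len(values)
    m = sum(values) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (n - 1))
    return [(v - m) / sd for v in values]
```

The result has mean 0 and (N-1 normalized) standard deviation 1; for example, [1, 2, 3] becomes [-1, 0, 1].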
If x is a matrix, return a matrix with the minimum, first quartile, median, third quartile, maximum, mean, standard deviation, skewness and kurtosis of the columns of x as its rows.
If x is a vector, treat it as a column vector.
Compute Spearman's rank correlation coefficient rho for each of the variables specified by the input arguments.
For matrices, each row is an observation and each column a variable; vectors are always observations and may be row or column vectors.
spearman (x) is equivalent to spearman (x, x).
For two data vectors x and y, Spearman's rho is the correlation of the ranks of x and y.
If x and y are drawn from independent distributions, rho has zero mean and variance 1 / (n - 1), and is asymptotically normally distributed.
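For two data vectors, rho as the correlation of the ranks can be sketched as follows. This is an illustrative Python sketch, not the documented implementation; all helper names are hypothetical, and ties are assumed to receive the average of the ranks they span.

```python
import math

def average_ranks(values):
    """Hypothetical helper. ASSUMPTION: tied values share the average of their ranks."""
    # rank = (number of strictly smaller values) + (number of equal values + 1) / 2
    return [
        sum(w < v for w in values) + (sum(w == v for w in values) + 1) / 2
        for v in values
    ]

def correlation(u, v):
    """Product-moment correlation of two equally long vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)

def spearman_rho(x, y):
    """Spearman's rho: the correlation of the ranks of x and y."""
    return correlation(average_ranks(x), average_ranks(y))
```

Because only ranks enter, any strictly increasing transformation of x or y leaves rho unchanged.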
Count the upward runs along the first non-singleton dimension of x of length 1, 2, ..., n-1 and greater than or equal to n. If the optional argument dim is given, operate along this dimension.
If x is a vector, return the (column) vector of ranks of x adjusted for ties.
If x is a matrix, do the above along the first non-singleton dimension. If the optional argument dim is given, operate along this dimension.
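Ranking a vector with an adjustment for ties can be sketched as follows. This is an illustrative Python sketch with a hypothetical helper name (ranks_of); it assumes that "adjusted for ties" means that tied values share the average of the ranks they occupy.

```python
def ranks_of(values):
    """Hypothetical helper: 1-based ranks of the values.

    ASSUMPTION: tied values receive the average of the ranks they span.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # extend the group [start, end] over all values equal to the first one
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # average of the 1-based positions start+1 .. end+1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks
```

For [10, 20, 20, 30] the two tied values share the rank (2 + 3) / 2 = 2.5.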
If x is a vector, return the range, i.e., the difference between the maximum and the minimum, of the input data.
If x is a matrix, do the above for each column of x.
If the optional argument dim is supplied, work along dimension dim.
Perform a QQ-plot (quantile plot).
If F is the CDF of the distribution dist with parameters params and G its inverse, and x a sample vector of length n, the QQ-plot graphs ordinate s(i) = i-th smallest element of x (the i-th order statistic) versus abscissa q(i) = G((i - 0.5)/n).
If the sample comes from F except for a transformation of location and scale, the pairs will approximately follow a straight line.
The default for dist is the standard normal distribution. The optional argument params contains a list of parameters of dist. For example, for a quantile plot of the uniform distribution on [2,4] and x, use
  qqplot (x, "uniform", 2, 4)
If no output arguments are given, the data are plotted directly.
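The points of a quantile plot against the standard normal distribution can be computed as follows. This is an illustrative Python sketch that only produces the coordinates and does not plot; the helper name qq_pairs is hypothetical, and the sample is sorted in ascending order so that each value is paired with an increasing theoretical quantile.

```python
from statistics import NormalDist

def qq_pairs(sample):
    """Hypothetical helper: abscissae q and ordinates s for a standard normal QQ-plot."""
    n = len(sample)
    s = sorted(sample)                       # order statistics, ascending
    inverse_cdf = NormalDist().inv_cdf       # G, the inverse of the standard normal CDF
    q = [inverse_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return q, s
```

If the sample is normal up to location and scale, the pairs (q(i), s(i)) lie close to a straight line.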
For each component of p, return the probit (the quantile of the standard normal distribution) of p.
Perform a PP-plot (probability plot).
If F is the CDF of the distribution dist with parameters params and x a sample vector of length n, the PP-plot graphs ordinate y(i) = F (i-th smallest element of x) versus abscissa p(i) = (i - 0.5)/n. If the sample comes from F, the pairs will approximately follow a straight line.
The default for dist is the standard normal distribution. The optional argument params contains a list of parameters of dist. For example, for a probability plot of the uniform distribution on [2,4] and x, use
  ppplot (x, "uniform", 2, 4)
If no output arguments are given, the data are plotted directly.
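The points of a probability plot against the standard normal distribution can be computed as follows. This is an illustrative Python sketch that only produces the coordinates and does not plot; the helper name pp_pairs is hypothetical, and the sample is sorted in ascending order before F is applied.

```python
from statistics import NormalDist

def pp_pairs(sample):
    """Hypothetical helper: abscissae p and ordinates y for a standard normal PP-plot."""
    n = len(sample)
    s = sorted(sample)                            # order statistics, ascending
    y = [NormalDist().cdf(v) for v in s]          # F applied to each order statistic
    p = [(i - 0.5) / n for i in range(1, n + 1)]  # plotting positions
    return p, y
```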
If x is a vector, compute the p-th moment of x.
If x is a matrix, return the row vector containing the p-th moment of each column.
With the optional string opt, the kind of moment to be computed can be specified. If opt contains "c" or "a", central and/or absolute moments are returned. For example,
  moment (x, 3, "ac")
computes the third central absolute moment of x.
If the optional argument dim is supplied, work along dimension dim.
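The effect of the "c" and "a" flags can be sketched for a plain vector as follows. This is an illustrative Python sketch with a hypothetical helper name (moment_of). The text above gives no formula; the sketch assumes the usual definition, the average of the p-th powers, with the mean subtracted first for "c" and absolute values taken for "a".

```python
def moment_of(values, p, opt=""):
    """Hypothetical helper: p-th moment, optionally central ("c") and/or absolute ("a").

    ASSUMPTION: moment = SUM_i t(i)^p / N, where t is x, x - mean (x), or its absolute value.
    """
    n = len(values)
    terms = list(values)
    if "c" in opt:
        m = sum(terms) / n
        terms = [t - m for t in terms]       # central: deviations from the mean
    if "a" in opt:
        terms = [abs(t) for t in terms]      # absolute: drop the signs
    return sum(t ** p for t in terms) / n
```

For [1, 2, 6] the deviations from the mean are -2, -1 and 3, so the third central moment is 18/3 = 6 and the third central absolute moment is 36/3 = 12.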
For vector arguments, return the mean square of the values. For matrix arguments, return a row vector containing the mean square of each column. With the optional dim argument, return the mean square of the values along this dimension.
Compute Kendall's tau for each of the variables specified by the input arguments.
For matrices, each row is an observation and each column a variable; vectors are always observations and may be row or column vectors.
kendall (x) is equivalent to kendall (x, x).
For two data vectors x, y of common length n, Kendall's tau is the correlation of the signs of all rank differences of x and y; i.e., if both x and y have distinct entries, then
  tau = 1 / (n (n-1)) * SUM_{i,j} sign (q(i) - q(j)) * sign (r(i) - r(j))
in which the q(i) and r(i) are the ranks of x and y, respectively.
If x and y are drawn from independent distributions, Kendall's tau is asymptotically normal with mean 0 and variance (2 * (2n+5)) / (9 * n * (n-1)).
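For two data vectors with distinct entries, the double sum defining tau and the variance under independence can be sketched as follows. This is an illustrative Python sketch with hypothetical helper names; it uses the data values directly instead of their ranks, which gives the same signs when all entries are distinct.

```python
def sign(v):
    """Return -1, 0 or 1 according to the sign of v."""
    return (v > 0) - (v < 0)

def kendall_tau(x, y):
    """Hypothetical helper: Kendall's tau for two equally long vectors with distinct entries."""
    n = len(x)
    total = sum(
        sign(x[i] - x[j]) * sign(y[i] - y[j])
        for i in range(n)
        for j in range(n)      # all ordered pairs; terms with i == j contribute 0
    )
    return total / (n * (n - 1))

def tau_null_variance(n):
    """Variance of tau when x and y are drawn from independent distributions."""
    return (2 * (2 * n + 5)) / (9 * n * (n - 1))
```

For x = [1, 2, 3] and y = [1, 3, 2], two of the three pairs are ordered the same way in both vectors and one is reversed, so tau = (2 - 1) / 3 = 1/3.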
If x is a vector, return the interquartile range, i.e., the difference between the upper and lower quartile, of the input data.
If x is a matrix, do the above along the first non-singleton dimension of x. If the optional dim argument is given, then operate along this dimension.
Create categorical data out of numerical or continuous data by cutting into intervals.
If breaks is a scalar, the data is cut into that many equal-width intervals. If breaks is a vector of break points, the category has length (breaks) - 1 groups.
The returned value is a vector of the same size as x telling which group each point in x belongs to. Groups are labelled from 1 to the number of groups; points outside the range of breaks are labelled by NaN.
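Grouping by a vector of break points can be sketched as follows. This is an illustrative Python sketch with a hypothetical helper name (cut_into_groups). The text above does not say which end of each interval is closed; the sketch assumes half-open intervals that include the lower break point, so a value equal to the last break point is treated as outside the range.

```python
import math

def cut_into_groups(values, breaks):
    """Hypothetical helper: 1-based group label per value, NaN outside the breaks.

    ASSUMPTION: group k covers breaks[k-1] <= v < breaks[k]; breaks are ascending.
    """
    groups = []
    for v in values:
        label = math.nan                      # default: outside the range of breaks
        for k in range(len(breaks) - 1):
            if breaks[k] <= v < breaks[k + 1]:
                label = k + 1                 # groups are labelled from 1
                break
        groups.append(label)
    return groups
```

With breaks [0, 1, 2] there are length (breaks) - 1 = 2 groups; 0.5 falls in group 1, 1.5 in group 2, and 5 or -1 are labelled NaN.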
The (i, j)-th entry of cor (x, y) is the correlation between the i-th variable in x and the j-th variable in y.
For matrices, each row is an observation and each column a variable; vectors are always observations and may be row or column vectors.
cor (x) is equivalent to cor (x, x).