The probability distribution function for a histogram consists of a set of bins which measure the probability of an event falling into a given range of a continuous variable x. A probability distribution function is defined by the following struct, which actually stores the cumulative probability distribution function. This is the natural quantity for generating samples via the inverse transform method, because there is a one-to-one mapping between the cumulative probability distribution and the range [0,1]. It can be shown that by taking a uniform random number in this range and finding its corresponding coordinate in the cumulative probability distribution we obtain samples with the desired probability distribution.
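The "it can be shown" step is short enough to write out. As a sketch, write p(x) for the density and F(x) for its cumulative distribution; both symbols are notation introduced here, not names from the library. Assume F is continuous and strictly increasing, so that its inverse exists:

```latex
\begin{align*}
F(x) &= \int_{-\infty}^{x} p(x')\,dx', \qquad r \sim U[0,1), \qquad s = F^{-1}(r), \\
\Pr(s \le x) &= \Pr\bigl(F^{-1}(r) \le x\bigr) = \Pr\bigl(r \le F(x)\bigr) = F(x).
\end{align*}
```

The last equality holds because r is uniform on [0,1), so s has cumulative distribution F and therefore density p. For a histogram, F is piecewise linear, so inverting it amounts to a linear interpolation inside one bin. Bins with zero content make F flat there, and such bins are simply never selected.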
size_t n
- This is the number of bins used to approximate the probability distribution function.
double * range
- The ranges of the bins are stored in an array of n+1 elements pointed to by range.
double * sum
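- The cumulative probability for the bins is stored in an array pointed to by sum. The sampling formula at the end of this section reads sum[i+1] for every bin index i, so the array needs n+1 elements, beginning with sum[0] = 0.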
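To make the layout concrete, here is a minimal sketch of a struct with the three fields listed above, plus a consistency check. The names pdf_sketch and pdf_sketch_is_valid are hypothetical and are not the library's own declarations. The sketch assumes that sum holds n+1 values running from sum[0] = 0 to sum[n] = 1, so that sum[i+1] is defined for every bin.

```c
#include <stddef.h>

/* Hypothetical mirror of the fields described above (not the library's declaration). */
typedef struct {
    size_t n;       /* number of bins */
    double *range;  /* n+1 bin edges */
    double *sum;    /* assumed here: n+1 cumulative probabilities, sum[0] = 0 */
} pdf_sketch;

/* Returns 1 if the bin edges strictly increase and the cumulative sums
   start at 0, never decrease, and end at 1 (within rounding); otherwise 0. */
int pdf_sketch_is_valid(const pdf_sketch *p)
{
    if (p == NULL || p->n == 0 || p->range == NULL || p->sum == NULL)
        return 0;
    if (p->sum[0] != 0.0)
        return 0;
    for (size_t i = 0; i < p->n; i++) {
        if (!(p->range[i] < p->range[i + 1]))
            return 0;
        if (p->sum[i + 1] < p->sum[i])
            return 0;
    }
    double d = p->sum[p->n] - 1.0;
    return d > -1e-12 && d < 1e-12;
}
```

A check like this is only a debugging aid; it encodes the two properties the sampling step relies on, ordered edges and a non-decreasing cumulative sum.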
The following functions allow you to create a gsl_histogram_pdf struct which represents this probability distribution and generate random samples from it.
Function: gsl_histogram_pdf * gsl_histogram_pdf_alloc (size_t n)
This function allocates memory for a probability distribution with n bins and returns a pointer to a newly initialized gsl_histogram_pdf struct. If insufficient memory is available, a null pointer is returned and the error handler is invoked with an error code of GSL_ENOMEM.
Function: int gsl_histogram_pdf_init (gsl_histogram_pdf * p, const gsl_histogram * h)
This function initializes the probability distribution p with the contents of the histogram h. If any of the bins of h are negative, the error handler is invoked with an error code of GSL_EDOM, because a probability distribution cannot contain negative values.
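The initialization step can be pictured as turning bin contents into a normalized running total. The following is a minimal standalone sketch of that idea, not the library's implementation. The function name cumulative_from_bins is hypothetical, as are its return codes and its choice to reject an all-zero histogram. It assumes sum has room for n+1 values.

```c
#include <stddef.h>

/* Fills sum[0..n] with the normalized cumulative totals of bin[0..n-1].
   Returns 0 on success, -1 if any bin is negative or the total is zero
   (the zero-total rule is an assumption of this sketch). */
int cumulative_from_bins(const double *bin, size_t n, double *sum)
{
    double total = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (bin[i] < 0.0)
            return -1;          /* a probability cannot be negative */
        total += bin[i];
    }
    if (total <= 0.0)
        return -1;

    double running = 0.0;
    sum[0] = 0.0;
    for (size_t i = 0; i < n; i++) {
        running += bin[i];
        sum[i + 1] = running / total;   /* cumulative probability up to bin i */
    }
    return 0;
}
```

For bin contents 1, 3, 4 the running totals are 1, 4, 8, which normalize to 0.125, 0.5 and 1.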
Function: void gsl_histogram_pdf_free (gsl_histogram_pdf * p)
This function frees the probability distribution function p and all of the memory associated with it.
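Allocation and release pair up in the usual way: one call acquires the struct and its arrays, one call releases all of them. Below is a minimal standalone sketch of that lifecycle. The names pdf_sketch, pdf_sketch_alloc and pdf_sketch_free are hypothetical, the sketch does not call any error handler, and treating n == 0 as a failure is an assumption made here.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the struct described in this section. */
typedef struct {
    size_t n;
    double *range;  /* n+1 bin edges */
    double *sum;    /* assumed here: n+1 cumulative probabilities */
} pdf_sketch;

/* Returns a new struct with room for n bins, or NULL on failure. */
pdf_sketch *pdf_sketch_alloc(size_t n)
{
    if (n == 0)
        return NULL;
    pdf_sketch *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->range = malloc((n + 1) * sizeof *p->range);
    p->sum = malloc((n + 1) * sizeof *p->sum);
    if (p->range == NULL || p->sum == NULL) {
        /* release whatever was obtained so nothing leaks */
        free(p->range);
        free(p->sum);
        free(p);
        return NULL;
    }
    p->n = n;
    return p;
}

/* Releases the struct and both arrays; a NULL argument is ignored. */
void pdf_sketch_free(pdf_sketch *p)
{
    if (p == NULL)
        return;
    free(p->range);
    free(p->sum);
    free(p);
}
```

Because the allocator can return a null pointer, callers should test the result before using it.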
Function: double gsl_histogram_pdf_sample (const gsl_histogram_pdf * p, double r)
This function uses r, a uniform random number between zero and one, to compute a single random sample from the probability distribution p. The algorithm used to compute the sample s is given by the following formula,
    s = range[i] + delta * (range[i+1] - range[i])
where i is the index which satisfies sum[i] <= r < sum[i+1] and delta is (r - sum[i])/(sum[i+1] - sum[i]).
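The formula above can be sketched as a standalone function that first locates the bin and then interpolates inside it. The name sample_from_cdf is hypothetical, and using a binary search to find i is a choice made for this sketch rather than a statement about the library's internals. It assumes n >= 1, 0 <= r < 1, and that sum holds n+1 non-decreasing values with sum[0] = 0 and sum[n] = 1.

```c
#include <stddef.h>

/* Maps a uniform number r in [0,1) to a sample from the binned distribution. */
double sample_from_cdf(const double *range, const double *sum, size_t n, double r)
{
    /* Binary search keeping the invariant sum[lo] <= r < sum[hi]. */
    size_t lo = 0, hi = n;
    while (hi - lo > 1) {
        size_t mid = lo + (hi - lo) / 2;
        if (r >= sum[mid])
            lo = mid;
        else
            hi = mid;
    }
    size_t i = lo;  /* the bin with sum[i] <= r < sum[i+1] */

    /* Fractional position of r inside the bin's share of probability. */
    double delta = (r - sum[i]) / (sum[i + 1] - sum[i]);
    return range[i] + delta * (range[i + 1] - range[i]);
}
```

With edges 0, 1, 3, 7 and cumulative sums 0, 0.125, 0.5, 1, the input r = 0.75 lands in the last bin with delta = 0.5 and yields s = 5. A bin with zero probability has sum[i] equal to sum[i+1], so no r can select it and the division is never evaluated with a zero denominator. When using the library itself, r would typically come from a generator such as gsl_rng_uniform and be passed to gsl_histogram_pdf_sample.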