
Information content and entropy are fundamental concepts in information theory that quantify the amount of information (or "surprise") in a random variable. Both concepts are closely related to the probability density/mass of events: improbable events have higher information content. The probability of each observation maps to its information content; the average information content of a variable is its entropy. Information content and entropy can be calculated for discrete or continuous probabilities, and humdrumR defines methods for both.
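For reference, the standard textbook definitions behind these terms can be written as follows. This is a sketch of the general formulas, not a statement of humdrumR's internals; it assumes that the logarithm base b corresponds to the base argument (so base = 2 gives results in bits), and the symbols p, f, I, H, and h are introduced here for illustration.

```latex
% Information content of a single observation x with probability p(x)
I(x) = -\log_b p(x)

% Entropy of a discrete variable X: the expected information content
H(X) = \mathbb{E}[I(X)] = -\sum_{x} p(x) \log_b p(x)

% Differential entropy of a continuous variable X with density f
h(X) = -\int f(x) \log_b f(x) \, dx
```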

Usage

entropy(..., base = 2)

H(..., base = 2)

# S3 method for probability
entropy(q, p, condition = NULL, base = 2)

# S3 method for numeric
entropy(x, base = 2, na.rm = TRUE)

# S3 method for density
entropy(x, base = 2, na.rm = TRUE)

# S3 method for default
entropy(..., base = 2)

mutualInfo(..., base = 2)

# S3 method for probability
mutualInfo(x, base = 2)

# S3 method for default
mutualInfo(..., base = 2)

Details

To calculate information content or entropy, we must assume (or estimate) a probability distribution. HumdrumR uses R's standard table() and density() functions to estimate discrete and continuous probability distributions, respectively.

Entropy is the average information content of a variable. The entropy() function can accept either a table() object (for discrete variables) or a density() object (for continuous variables). If entropy() is passed an atomic vector, the values of the vector are treated as observations of a random variable. For numeric vectors, the stats::density() function is used to estimate the probability distribution of the (continuous) random variable, and entropy is then computed for the density. For other atomic vectors, table() is called to tabulate the discrete probability mass for each observed level, and entropy is then computed for the table.
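The two estimation routes described above can be sketched in base R alone. This is a minimal illustration under stated assumptions, not humdrumR's implementation: the example data, the variable names (x, y, p, H_discrete, H_continuous), and the simple Riemann-sum approximation of the integral are all chosen here for illustration.

```r
# Discrete case: tabulate the observed levels, then average the
# information content (illustrative data, not from humdrumR).
x <- c("a", "a", "b", "c")
p <- prop.table(table(x))                  # estimated probability mass per level
H_discrete <- -sum(p * log(p, base = 2))   # entropy in bits

# Continuous case: estimate a density, then approximate
# -integral of f(x) * log2(f(x)) dx with a Riemann sum over the density grid.
set.seed(1)
y <- rnorm(1000)
d <- stats::density(y)                     # kernel density estimate on an even grid
dx <- diff(d$x)[1]                         # grid spacing
f <- d$y[d$y > 0]                          # drop zeros to avoid log(0)
H_continuous <- -sum(f * log(f, base = 2)) * dx

H_discrete
H_continuous
```

In the discrete example the probabilities are 0.5, 0.25, and 0.25, so the entropy is 1.5 bits. The continuous estimate depends on the sample and the bandwidth that density() selects, so it only approximates the differential entropy of the underlying distribution.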

The ic() function only accepts atomic vectors as its main (x) argument, but it also requires a distribution argument. If no distribution is supplied, it is estimated by default using density() (numeric input) or table() (other input).
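The idea of mapping each observation to its information content can be sketched in base R for the discrete case. This is an illustration of the concept only and does not call ic(); the data and the names x, p, and ic_x are hypothetical, and the distribution is assumed to be estimated from the same observations with table().

```r
# Observations of a discrete variable (illustrative data)
x <- c("a", "a", "b", "c")

# Distribution estimated from the observations themselves
p <- prop.table(table(x))

# Look up the probability of each observation, then convert to bits
ic_x <- -log(as.numeric(p[x]), base = 2)

ic_x        # one value per observation
mean(ic_x)  # the average information content, i.e. the entropy
```

Here the two "a" observations each carry 1 bit and the rarer "b" and "c" carry 2 bits each, so the mean is 1.5 bits, which matches the entropy of the estimated distribution.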