Information content and entropy are fundamental concepts in information theory that quantify the amount of information (or "surprise") in a random variable. Both concepts are closely related to the probability density/mass of events: improbable events have higher information content. The probability of each observation maps to its information content; the average information content of a variable is its entropy. Information content and entropy can be calculated for discrete or continuous probability distributions, and humdrumR defines methods for calculating both.
Usage
entropy(..., base = 2)
H(..., base = 2)
# S3 method for probability
entropy(q, p, condition = NULL, base = 2)
# S3 method for numeric
entropy(x, base = 2, na.rm = TRUE)
# S3 method for density
entropy(x, base = 2, na.rm = TRUE)
# S3 method for default
entropy(..., base = 2)
mutualInfo(..., base = 2)
# S3 method for probability
mutualInfo(x, base = 2)
# S3 method for default
mutualInfo(..., base = 2)
Details
To calculate information content or entropy, we must assume (or estimate) a probability distribution. HumdrumR uses R's standard table() and density() functions to estimate discrete and continuous probability distributions, respectively.
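For reference, the standard textbook definitions behind these terms can be written as below. The symbols are notation introduced here, not taken from the package: p is a probability mass function (such as one estimated with table()), f is a probability density (such as one estimated with density()), and b is the logarithm base (the base argument, where 2 gives bits). These are the general definitions and are not a statement of how humdrumR evaluates the continuous case internally.

```latex
% Information content ("surprise") of a single outcome x
\[ I(x) = -\log_b p(x) \]

% Entropy: the expected information content of a discrete variable X
\[ H(X) = \mathbb{E}[I(X)] = -\sum_{x} p(x) \log_b p(x) \]

% Differential entropy of a continuous variable X with density f
\[ h(X) = -\int f(x) \log_b f(x) \, dx \]
```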
Entropy is the average information content of a variable. The entropy() function can accept either a table() object (for discrete variables) or a density() object (for continuous variables). If entropy() is passed an atomic vector, the values of the vector are treated as observations of a random variable: for numeric vectors, the stats::density() function is used to estimate the probability distribution of the (continuous) random variable, and entropy is then computed for the density. For other atomic vectors, table() is called to tabulate the discrete probability mass of each observed level, and entropy is then computed for the table.
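The two estimation routes described above can be sketched in base R alone. This is an illustration of the idea, not humdrumR's implementation: the variable names are hypothetical, and the Riemann-sum approximation over the density() grid is an assumption about one reasonable way to evaluate the continuous case.

```r
# Discrete case: tabulate a non-numeric vector, then average the surprise.
notes <- c("a", "a", "b", "c")                   # hypothetical observations
p <- as.numeric(table(notes)) / length(notes)    # probability mass per level
H_discrete <- -sum(p * log(p, base = 2))         # entropy in bits

# Continuous case: estimate a density, then approximate the integral
# -integral(f * log2(f)) with a Riemann sum over density()'s evenly spaced grid.
set.seed(1)
durations <- rnorm(200)                          # hypothetical numeric observations
d <- stats::density(durations)
dx <- d$x[2] - d$x[1]                            # grid spacing
keep <- d$y > 0                                  # skip zeros to avoid 0 * log(0)
H_continuous <- -sum(d$y[keep] * log(d$y[keep], base = 2)) * dx
```

In the discrete example the probabilities are 0.5, 0.25, and 0.25, so the entropy works out to 1.5 bits. The continuous estimate depends on the sample and on the bandwidth density() selects, so no fixed value is given for it.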
The ic() function only accepts atomic vectors as its main (x) argument, but also requires a distribution argument. By default, the distribution argument is estimated using density() (numeric input) or table() (other input).
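The relationship between per-observation information content and entropy can be illustrated with a short base-R sketch. It does not call ic() itself; the variable names are hypothetical, and the distribution is simply estimated from the same vector with table(), mirroring the default described above for non-numeric input.

```r
# Estimate a discrete distribution from the observations themselves.
notes <- c("a", "a", "b", "c")                   # hypothetical observations
probs <- table(notes) / length(notes)            # a = 0.5, b = 0.25, c = 0.25

# Look up each observation's probability by name, then convert to bits.
ic_per_obs <- -log(as.numeric(probs[notes]), base = 2)

# Averaging the per-observation values recovers the entropy of the table.
mean_ic <- mean(ic_per_obs)
```

Here the two "a" observations each carry 1 bit and the rarer "b" and "c" each carry 2 bits, so the mean is 1.5 bits, which equals the entropy of the tabulated distribution.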