humdrumR data size and shape (humSize)

These functions can be used to quickly get basic information about the size and "shape" of a humdrumR corpus object. For more details, use the census() or spines() functions instead.

HumdrumR objects can be divided into "subcorpora." The anySubcorpora and namesSubcorpora functions tell us if there are any subcorpora and, if so, what they are called.

Usage

nrecord(humdrumR, dataTypes = "GLIMDd")

# S4 method for humdrumR
nrow(x)

ntoken(humdrumR, dataTypes = "GLIMDd")

npieces(humdrumR)

nfiles(humdrumR)

# S4 method for humdrumR
length(x)

# S4 method for humdrumR
ncol(x)

# S4 method for humdrumR
dim(x)

is.empty(humdrumR)

anyMultiPieceFiles(humdrumR)

anyPaths(humdrumR)

anyStops(humdrumR)

anySubcorpora(humdrumR)

namesSubcorpora(humdrumR)

Arguments

humdrumR

HumdrumR data.

Must be a humdrumR data object.

dataTypes

Which types of humdrum record(s) to include in the count.

Defaults to "GLIMDd".

Must be a single character string. Legal values are 'G', 'L', 'I', 'M', 'D', 'd' or any combination of these (e.g., "LIM"). (See the humdrum table documentation Fields section for explanation.)
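
As a rough illustration of the dataTypes argument, the sketch below counts records and tokens twice: once with the default "GLIMDd" and once restricted to non-null data ("D"). It assumes the humdrumR package is installed and that the Bach chorale example files are available under humdrumRroot; the file pattern and the variable name chorales are illustrative choices, not part of this page.

```r
# Sketch only: runs the example if humdrumR is installed, otherwise does nothing.
# Assumption: the package's bundled Bach chorales live under humdrumRroot.
if (requireNamespace("humdrumR", quietly = TRUE)) {
  library(humdrumR)

  chorales <- readHumdrum(humdrumRroot, "HumdrumData/BachChorales/chor00[1-4].krn")

  # Default: count every record type (G, L, I, M, D, d)
  allRecords <- nrecord(chorales)

  # Restrict the count to records/tokens of type D (non-null data)
  dataRecords <- nrecord(chorales, dataTypes = "D")
  dataTokens  <- ntoken(chorales, dataTypes = "D")

  print(c(allRecords = allRecords, dataRecords = dataRecords, dataTokens = dataTokens))
}
```

Because "D" is a subset of the default record types, the restricted record count should never exceed the default one.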

Details

The following functions are defined.

nfiles: The number of input files in the corpus.
- length(humdrumR) is a synonym.
npieces: The number of pieces in the corpus. (There may be multiple pieces per file.)
nrecord: The number of records in the corpus.
- nrow(humdrumR) is a synonym.
ntoken: The number of tokens in the corpus.
ncol(humdrumR): Returns the maximum number of "columns" needed to represent the data in a 2d matrix. Matches the default output from as.matrix(humdrumR).
dim(humdrumR): The same as c(nrow(humdrumR), ncol(humdrumR)).
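
The size functions listed above might be used together roughly as follows. This is a minimal sketch, assuming humdrumR is installed and that its bundled Bach chorale files can be read from humdrumRroot; the function names follow the Usage section, while the file pattern and the chorales variable are illustrative.

```r
# Sketch only: runs the example if humdrumR is installed, otherwise does nothing.
# Assumption: the package's bundled Bach chorales live under humdrumRroot.
if (requireNamespace("humdrumR", quietly = TRUE)) {
  library(humdrumR)

  chorales <- readHumdrum(humdrumRroot, "HumdrumData/BachChorales/chor00[1-4].krn")

  nfiles(chorales)    # number of input files
  length(chorales)    # documented above as a synonym
  npieces(chorales)   # number of pieces (may exceed the number of files)

  nrecord(chorales)   # number of records
  nrow(chorales)      # synonym for nrecord()
  ntoken(chorales)    # number of tokens

  ncol(chorales)      # columns needed for a 2d matrix representation
  dim(chorales)       # c(nrow(chorales), ncol(chorales))
}
```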

Is/Any

A few additional functions return quick TRUE/FALSE answers regarding a humdrumR corpus:

is.empty: Returns TRUE if a corpus contains no non-null data tokens (D tokens).
anyPaths: Returns TRUE if there are any spine paths (Path > 0) in any pieces in the corpus.
anyStops: Returns TRUE if there are any multi-stops (Stop > 1) in any pieces in the corpus.
anySubcorpora: Returns TRUE if the corpus was read with different regex patterns matching "subcorpora" labels.
- namesSubcorpora returns the names of the subcorpora labels (Label field).
anyMultiPieceFiles: Returns TRUE if any files contain more than one piece (Piece != File).
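
The TRUE/FALSE checks above could be combined into a quick sanity check before further analysis. The following is a sketch under the same assumptions as before (humdrumR installed, example chorales readable from humdrumRroot); the checks vector is a hypothetical name introduced only for this illustration.

```r
# Sketch only: runs the example if humdrumR is installed, otherwise does nothing.
# Assumption: the package's bundled Bach chorales live under humdrumRroot.
if (requireNamespace("humdrumR", quietly = TRUE)) {
  library(humdrumR)

  chorales <- readHumdrum(humdrumRroot, "HumdrumData/BachChorales/chor00[1-4].krn")

  # Collect the quick checks in one named logical vector (hypothetical helper name)
  checks <- c(empty          = is.empty(chorales),
              paths          = anyPaths(chorales),
              stops          = anyStops(chorales),
              multiPieceFile = anyMultiPieceFiles(chorales),
              subcorpora     = anySubcorpora(chorales))
  print(checks)

  # Only ask for subcorpus names when there are subcorpora to name
  if (anySubcorpora(chorales)) {
    print(namesSubcorpora(chorales))
  }
}
```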