A common source of problems is passing functions data whose columns are not
all of the expected type. The problem often begins when reading data from a
text file with functions such as utils::read.csv(), utils::read.delim(), and
friends, which commonly guess a column type other than the one you expect.
These common offenders are strongly discouraged; instead consider using
readr::read_csv(), readr::read_tsv(), and friends, which guess column types
correctly much more often than their analogs from the utils package.
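To make the problem concrete, here is a minimal sketch, assuming readr is installed. The toy file and its two columns are made up for illustration and are not part of any ForestGEO dataset. It shows a tag with leading zeros being read as an integer by utils::read.csv(), and kept as text when the column types are declared up front.

```r
# Toy file with a tag column whose values carry leading zeros (made-up data).
path <- tempfile(fileext = ".csv")
writeLines(c("Tag,DBH", "001,10.5", "002,NA"), path)

# utils::read.csv() guesses that Tag is an integer, so the zeros are lost.
guessed <- utils::read.csv(path)
class(guessed$Tag)
#> [1] "integer"

# Declaring the types up front keeps Tag as text and DBH as a double.
declared <- readr::read_csv(path, col_types = list(Tag = "c", DBH = "d"))
declared$Tag
#> [1] "001" "002"
```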
type_vft() and type_taxa() help you to read data more safely by explicitly
specifying what type to expect from each column of known datasets. These
functions output the specification of column types used internally by
read_vft() and read_taxa():
type_vft(): Type specification for ViewFullTable.
type_taxa(): Type specification for ViewFullTaxonomy.
type_vft()
type_taxa()
A named list that maps each column name to a one-letter type code.
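Because the return value is an ordinary list, a single entry can be inspected or overridden before the list is passed to a reader. The sketch below uses a small stand-in list with hypothetical entries rather than the real output of type_vft(), so it runs without any fgeo package installed.

```r
# Stand-in for the kind of list these functions return (not the real spec).
spec <- list(TreeID = "i", Tag = "c", DBH = "d")

# Override one column, for example to read DBH as text while debugging.
spec$DBH <- "c"

# Columns that will now be read as character.
names(spec)[unlist(spec) == "c"]
#> [1] "Tag" "DBH"
```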
Types reference (for more details see read_delim()):
c = character,
i = integer,
n = number,
d = double,
l = logical,
D = date,
T = date time,
t = time,
? = guess,
or _/- to skip the column.
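As an illustration of these codes, the following sketch, assuming readr is installed, reads a made-up one-row file twice: once with a named specification and once with the compact string form, where each letter applies to one column in file order. The Notes column is hypothetical and exists only to show the skip code.

```r
library(readr)

# Made-up file; the column names only loosely mimic a census table.
path <- tempfile(fileext = ".csv")
writeLines(c("Tag,DBH,ExactDate,Notes", "001,10.5,2002-01-01,ok"), path)

# Named form: one code per column; "_" skips the Notes column.
by_name <- read_csv(
  path,
  col_types = cols(Tag = "c", DBH = "d", ExactDate = "D", Notes = "_")
)

# Compact form: one letter per column, in the order they appear in the file.
by_position <- read_csv(path, col_types = "cdD_")

# First class of each column that was kept.
vapply(by_name, function(column) class(column)[1], character(1))
```

The named form is usually easier to maintain, because it does not depend on the order of the columns in the file.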
Other functions to operate on column types:
type_ensure()
Other functions to read text files delivered by ForestGEO's database:
read_vft()
assert_is_installed("fgeo.x")
library(fgeo.x)
library(readr)
str(type_vft())
#> List of 32
#> $ DBHID : chr "i"
#> $ PlotName : chr "c"
#> $ PlotID : chr "i"
#> $ Family : chr "c"
#> $ Genus : chr "c"
#> $ SpeciesName : chr "c"
#> $ Mnemonic : chr "c"
#> $ Subspecies : chr "c"
#> $ SpeciesID : chr "i"
#> $ SubspeciesID : chr "c"
#> $ QuadratName : chr "c"
#> $ QuadratID : chr "i"
#> $ PX : chr "d"
#> $ PY : chr "d"
#> $ QX : chr "d"
#> $ QY : chr "d"
#> $ TreeID : chr "i"
#> $ Tag : chr "c"
#> $ StemID : chr "i"
#> $ StemNumber : chr "i"
#> $ StemTag : chr "i"
#> $ PrimaryStem : chr "c"
#> $ CensusID : chr "i"
#> $ PlotCensusNumber: chr "i"
#> $ DBH : chr "d"
#> $ HOM : chr "d"
#> $ ExactDate : chr "D"
#> $ Date : chr "i"
#> $ ListOfTSM : chr "c"
#> $ HighHOM : chr "i"
#> $ LargeStem : chr "c"
#> $ Status : chr "c"
read_csv(example_path("view/vft_4quad.csv"), col_types = type_vft())
#> # A tibble: 500 × 32
#> DBHID PlotName PlotID Family Genus Speci…¹ Mnemo…² Subsp…³ Speci…⁴ Subsp…⁵
#> <int> <chr> <int> <chr> <chr> <chr> <chr> <chr> <int> <chr>
#> 1 385164 luquillo 1 Rubiace… Psyc… brachi… PSYBRA NA 185 NA
#> 2 385261 luquillo 1 Urticac… Cecr… schreb… CECSCH NA 74 NA
#> 3 384600 luquillo 1 Rubiace… Psyc… brachi… PSYBRA NA 185 NA
#> 4 608789 luquillo 1 Rubiace… Psyc… berter… PSYBER NA 184 NA
#> 5 388579 luquillo 1 Arecace… Pres… acumin… PREMON NA 182 NA
#> 6 384626 luquillo 1 Araliac… Sche… moroto… SCHMOR NA 196 NA
#> 7 410958 luquillo 1 Rubiace… Psyc… brachi… PSYBRA NA 185 NA
#> 8 385102 luquillo 1 Piperac… Piper glabre… PIPGLA NA 174 NA
#> 9 353163 luquillo 1 Arecace… Pres… acumin… PREMON NA 182 NA
#> 10 481018 luquillo 1 Salicac… Case… arborea CASARB NA 70 NA
#> # … with 490 more rows, 22 more variables: QuadratName <chr>, QuadratID <int>,
#> # PX <dbl>, PY <dbl>, QX <dbl>, QY <dbl>, TreeID <int>, Tag <chr>,
#> # StemID <int>, StemNumber <int>, StemTag <int>, PrimaryStem <chr>,
#> # CensusID <int>, PlotCensusNumber <int>, DBH <dbl>, HOM <dbl>,
#> # ExactDate <date>, Date <int>, ListOfTSM <chr>, HighHOM <int>,
#> # LargeStem <chr>, Status <chr>, and abbreviated variable names ¹SpeciesName,
#> # ²Mnemonic, ³Subspecies, ⁴SpeciesID, ⁵SubspeciesID
str(type_taxa())
#> List of 21
#> $ ViewID : chr "i"
#> $ SpeciesID : chr "i"
#> $ SubspeciesID : chr "c"
#> $ Family : chr "c"
#> $ Mnemonic : chr "c"
#> $ Genus : chr "c"
#> $ SpeciesName : chr "c"
#> $ Rank : chr "c"
#> $ Subspecies : chr "c"
#> $ Authority : chr "c"
#> $ IDLevel : chr "c"
#> $ subspMnemonic : chr "c"
#> $ subspAuthority: chr "c"
#> $ FieldFamily : chr "c"
#> $ Lifeform : chr "c"
#> $ Description : chr "c"
#> $ wsg : chr "d"
#> $ wsglevel : chr "c"
#> $ ListOfOldNames: chr "c"
#> $ Specimens : chr "c"
#> $ Reference : chr "c"
read_csv(example_path("view/taxa.csv"), col_types = type_taxa())
#> # A tibble: 163 × 21
#> ViewID SpeciesID Subspec…¹ Family Mnemo…² Genus Speci…³ Rank Subsp…⁴ Autho…⁵
#> <int> <int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1 56 NA Fabac… AESAME Aesc… americ… NA NA (Poir.…
#> 2 2 57 NA Eupho… ALCFLO Alch… florib… NA NA (Benth…
#> 3 3 58 NA Eupho… ALCLAT Alch… latifo… NA NA Sw.
#> 4 4 59 NA Fabac… ANDINE Andi… inermis NA NA (W. Wr…
#> 5 5 60 NA Rubia… ANTOBT Sten… obtusi… NA NA (Urb.)…
#> 6 6 61 NA Myrsi… ARDGLA Ardi… glauci… NA NA Urb.
#> 7 7 62 NA Morac… ARTALT Arto… altilis NA NA (Parki…
#> 8 8 63 NA Laura… BEIPEN Beil… pendula NA NA (Sw.) …
#> 9 9 64 NA Solan… BRUPOR Brun… portor… NA NA Krug &…
#> 10 10 65 NA Combr… BUCTET Buch… tetrap… NA NA (Aubl.…
#> # … with 153 more rows, 11 more variables: IDLevel <chr>, subspMnemonic <chr>,
#> # subspAuthority <chr>, FieldFamily <chr>, Lifeform <chr>, Description <chr>,
#> # wsg <dbl>, wsglevel <chr>, ListOfOldNames <chr>, Specimens <chr>,
#> # Reference <chr>, and abbreviated variable names ¹SubspeciesID, ²Mnemonic,
#> # ³SpeciesName, ⁴Subspecies, ⁵Authority