Search, match, and look up values (like Excel's functions INDEX + MATCH
).
Source: R/bruceR_utils.R
LOOKUP.Rd
In Excel, we can use VLOOKUP
, HLOOKUP
, XLOOKUP
(a new function released in 2019),
or the combination of INDEX
and MATCH
to search, match, and look up values.
Here I provide a similar function.
Usage
LOOKUP(
data,
vars,
data.ref,
vars.ref,
vars.lookup,
return = c("new.data", "new.var", "new.value")
)
Arguments
- data
Main data.
- vars
Character (vector), specifying the variable(s) to be searched in
data
.- data.ref
Reference data containing both the reference variable(s) and the lookup variable(s).
- vars.ref
Character (vector), with the same length and order as
vars
, specifying the reference variable(s) to be matched indata.ref
.- vars.lookup
Character (vector), specifying the variable(s) to be looked up and returned from
data.ref
.- return
What to return. Default (
"new.data"
) is to return a data frame with the lookup values added. You may also set it to"new.var"
or"new.value"
.
Examples
ref = data.table(City=rep(c("A", "B", "C"), each=5),
Year=rep(2013:2017, times=3),
GDP=sample(1000:2000, 15),
PM2.5=sample(10:300, 15))
ref
#> City Year GDP PM2.5
#> <char> <int> <int> <int>
#> 1: A 2013 1615 285
#> 2: A 2014 1507 62
#> 3: A 2015 1497 268
#> 4: A 2016 1040 213
#> 5: A 2017 1558 296
#> 6: B 2013 1726 237
#> 7: B 2014 1026 25
#> 8: B 2015 1221 187
#> 9: B 2016 1401 36
#> 10: B 2017 1825 102
#> 11: C 2013 1670 52
#> 12: C 2014 1458 128
#> 13: C 2015 1392 115
#> 14: C 2016 1649 286
#> 15: C 2017 1439 183
data = data.table(sub=1:5,
city=c("A", "A", "B", "C", "C"),
year=c(2013, 2014, 2015, 2016, 2017))
data
#> sub city year
#> <int> <char> <num>
#> 1: 1 A 2013
#> 2: 2 A 2014
#> 3: 3 B 2015
#> 4: 4 C 2016
#> 5: 5 C 2017
LOOKUP(data, c("city", "year"), ref, c("City", "Year"), "GDP")
#> sub city year GDP
#> <int> <char> <num> <int>
#> 1: 1 A 2013 1615
#> 2: 2 A 2014 1507
#> 3: 3 B 2015 1221
#> 4: 4 C 2016 1649
#> 5: 5 C 2017 1439
LOOKUP(data, c("city", "year"), ref, c("City", "Year"), c("GDP", "PM2.5"))
#> sub city year GDP PM2.5
#> <int> <char> <num> <int> <int>
#> 1: 1 A 2013 1615 285
#> 2: 2 A 2014 1507 62
#> 3: 3 B 2015 1221 187
#> 4: 4 C 2016 1649 286
#> 5: 5 C 2017 1439 183