Skip to contents

In Excel, we can use VLOOKUP, HLOOKUP, XLOOKUP (a new function released in 2019), or the combination of INDEX and MATCH to search, match, and look up values. Here I provide a similar function.

Usage

LOOKUP(
  data,
  vars,
  data.ref,
  vars.ref,
  vars.lookup,
  return = c("new.data", "new.var", "new.value")
)

Arguments

data

Main data.

vars

Character (vector), specifying the variable(s) to be searched in data.

data.ref

Reference data containing both the reference variable(s) and the lookup variable(s).

vars.ref

Character (vector), with the same length and order as vars, specifying the reference variable(s) to be matched in data.ref.

vars.lookup

Character (vector), specifying the variable(s) to be looked up and returned from data.ref.

return

What to return. Default ("new.data") is to return a data frame with the lookup values added. You may also set it to "new.var" or "new.value".

Value

New data object, new variable, or new value (see the argument return).

Details

If multiple values were simultaneously matched, a warning message would be printed.

Examples

ref = data.table(City=rep(c("A", "B", "C"), each=5),
                 Year=rep(2013:2017, times=3),
                 GDP=sample(1000:2000, 15),
                 PM2.5=sample(10:300, 15))
ref
#>     City Year  GDP PM2.5
#>  1:    A 2013 1615   285
#>  2:    A 2014 1507    62
#>  3:    A 2015 1497   268
#>  4:    A 2016 1040   213
#>  5:    A 2017 1558   296
#>  6:    B 2013 1726   237
#>  7:    B 2014 1026    25
#>  8:    B 2015 1221   187
#>  9:    B 2016 1401    36
#> 10:    B 2017 1825   102
#> 11:    C 2013 1670    52
#> 12:    C 2014 1458   128
#> 13:    C 2015 1392   115
#> 14:    C 2016 1649   286
#> 15:    C 2017 1439   183

data = data.table(sub=1:5,
                  city=c("A", "A", "B", "C", "C"),
                  year=c(2013, 2014, 2015, 2016, 2017))
data
#>    sub city year
#> 1:   1    A 2013
#> 2:   2    A 2014
#> 3:   3    B 2015
#> 4:   4    C 2016
#> 5:   5    C 2017

LOOKUP(data, c("city", "year"), ref, c("City", "Year"), "GDP")
#>    sub city year  GDP
#> 1:   1    A 2013 1615
#> 2:   2    A 2014 1507
#> 3:   3    B 2015 1221
#> 4:   4    C 2016 1649
#> 5:   5    C 2017 1439
LOOKUP(data, c("city", "year"), ref, c("City", "Year"), c("GDP", "PM2.5"))
#>    sub city year  GDP PM2.5
#> 1:   1    A 2013 1615   285
#> 2:   2    A 2014 1507    62
#> 3:   3    B 2015 1221   187
#> 4:   4    C 2016 1649   286
#> 5:   5    C 2017 1439   183