Download pre-trained language models (Transformer models such as GPT, BERT, RoBERTa, DeBERTa, DistilBERT, etc.) from HuggingFace to your local ".cache" folder (e.g., "C:/Users/[YourUserName]/.cache/" on Windows). Downloaded models are kept there and are never removed unless you run text_model_remove.
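
If you want to verify what has been downloaded, the cache location can be inspected with base R. The snippet below is a minimal sketch only; the exact subfolder layout under ".cache" (e.g., a "huggingface" subdirectory) depends on the Hugging Face tooling version and is an assumption here.

# Locate the local ".cache" folder on Windows (USERPROFILE) or macOS/Linux (HOME).
cache_dir <- file.path(
  Sys.getenv(ifelse(.Platform$OS.type == "windows", "USERPROFILE", "HOME")),
  ".cache"
)
dir.exists(cache_dir)   # TRUE once at least one model has been downloaded
list.files(cache_dir)   # top-level folders inside the cache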

Usage

text_model_download(model = NULL)

Arguments

model

Character string(s) specifying the pre-trained language model(s) to download. For a full list of options, see HuggingFace. Defaults to NULL, which downloads nothing and only checks (and lists) the models already downloaded.

Example choices:

  • "gpt2" (50257 vocab, 768 dims, 12 layers)

  • "openai-gpt" (40478 vocab, 768 dims, 12 layers)

  • "bert-base-uncased" (30522 vocab, 768 dims, 12 layers)

  • "bert-large-uncased" (30522 vocab, 1024 dims, 24 layers)

  • "bert-base-cased" (28996 vocab, 768 dims, 12 layers)

  • "bert-large-cased" (28996 vocab, 1024 dims, 24 layers)

  • "bert-base-chinese" (21128 vocab, 768 dims, 12 layers)

  • "bert-base-multilingual-cased" (119547 vocab, 768 dims, 12 layers)

  • "distilbert-base-uncased" (30522 vocab, 768 dims, 6 layers)

  • "distilbert-base-cased" (28996 vocab, 768 dims, 6 layers)

  • "distilbert-base-multilingual-cased" (119547 vocab, 768 dims, 6 layers)

  • "albert-base-v2" (30000 vocab, 768 dims, 12 layers)

  • "albert-large-v2" (30000 vocab, 1024 dims, 24 layers)

  • "roberta-base" (50265 vocab, 768 dims, 12 layers)

  • "roberta-large" (50265 vocab, 1024 dims, 24 layers)

  • "xlm-roberta-base" (250002 vocab, 768 dims, 12 layers)

  • "xlm-roberta-large" (250002 vocab, 1024 dims, 24 layers)

  • "xlnet-base-cased" (32000 vocab, 768 dims, 12 layers)

  • "xlnet-large-cased" (32000 vocab, 1024 dims, 24 layers)

  • "microsoft/deberta-v3-base" (128100 vocab, 768 dims, 12 layers)

  • "microsoft/deberta-v3-large" (128100 vocab, 1024 dims, 24 layers)

  • ... (see https://huggingface.co/models)
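
Any identifier shown above, or any other model ID on HuggingFace, can be passed directly as the model argument. A minimal sketch, assuming the package has been loaded and (if needed) initialized via text_init():

text_model_download("microsoft/deberta-v3-base")  # download one model by its HuggingFace ID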

Value

Invisibly returns the names of all downloaded models.

Examples

if (FALSE) {
# text_init()  # initialize the environment

text_model_download()  # check downloaded models
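# download multiple models at once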
text_model_download(c(
  "bert-base-uncased",
  "bert-base-cased",
  "bert-base-multilingual-cased"
))
}
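
Because the names of all downloaded models are returned invisibly, the result can also be captured for later use; a minimal sketch:

downloaded <- text_model_download()  # invisibly returns downloaded model names
print(downloaded)                    # show the captured character vector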