Tokenizer Java

stackoverflow.com
https://stackoverflow.com › questions
What does Keras Tokenizer method exactly do? - Stack Overflow
On occasion, circumstances require us to do the following: from keras.preprocessing.text import Tokenizer tokenizer = Tokenizer(num_words=my_max) Then, invariably, we chant this mantra: …
stackoverflow.com
https://stackoverflow.com › questions
python - AutoTokenizer.from_pretrained fails to load locally saved ...
1. tokenizer_config.json, 2. special_tokens_map.json, 3. vocab.txt, 4. added_tokens.json, 5. tokenizer.json. However, added_tokens.json is missing from the location. If you can point me somewhere …
stackoverflow.com
https://stackoverflow.com › questions
How to apply max_length to truncate the token sequence from the left …
May 11, 2022 · In the HuggingFace tokenizer, applying the max_length argument specifies the length of the tokenized text. I believe it truncates the sequence to max_length-2 (if truncation=True) by cutting …
stackoverflow.com
https://stackoverflow.com › questions
Unable to get the tokenizer of Gemma-3 - Stack Overflow
Mar 22, 2025 · I am trying to get the tokenizer using the Hugging Face AutoTokenizer library, but I am unable to fetch it. Is there any other way to get it? What am I doing wrong?
stackoverflow.com
https://stackoverflow.com › questions › how-to-fix-valueerror-this-tokenizer...
python - How to fix "ValueError: This tokenizer cannot be instantiated ...
Feb 28, 2023 · I'm trying to run a Hugging Face model using the following code in Google Colab: !pip install transformers from transformers import AutoTokenizer tokenizer = …
stackoverflow.com
https://stackoverflow.com › questions
How to add new tokens to an existing Huggingface tokenizer?
May 8, 2023 · And then it points to the train_new_from_iterator() function in Chapter 7 but I can't seem to find reference to how to use it to extend the tokenizer without re-training it.
stackoverflow.com
https://stackoverflow.com › questions
How to add all standard special tokens to my hugging face tokenizer …
Aug 11, 2022 · I want all special tokens to always be available. How do I do this? My first attempt to give it to my tokenizer: def does_t5_have_sep_token(): tokenizer: PreTrainedTokenizerFast = …
stackoverflow.com
https://stackoverflow.com › questions › tokenizer-from-file-hugginface-excepti…
json - Tokenizer.from_file () HUGGINFACE - Stack Overflow
Nov 1, 2022 · Tokenizer.from_file () HUGGINFACE : Exception: data did not match any variant of untagged enum ModelWrapper
stackoverflow.com
https://stackoverflow.com › questions › openai-api-how-do-i-count-tokens-befo…
OpenAI API: How do I count tokens before(!) I send an API request?
Mar 21, 2023 · How do I count tokens before (!) I send an API request? As stated in the official OpenAI article: To further explore tokenization, you can use our interactive Tokenizer tool, which allows you …
stackoverflow.com
https://stackoverflow.com › questions
what's difference between tokenizer.encode and tokenizer.encode_plus …
The main difference stems from the additional information that encode_plus provides. If you read the documentation on the respective functions, then there is a slight difference for encode(): …
