If you’re working with Hugging Face’s transformers and peft libraries on Windows, you’ve likely seen messages or warnings related to model caching, symlinks, and environment variables. This guide demystifies how Hugging Face handles model storage, how to change the cache locations, and how to resolve common issues, especially on Windows.
What Is Model Caching in Hugging Face?
When you use Hugging Face libraries like transformers, models and tokenizers are downloaded and cached locally so they don’t need to be re-downloaded for future use. This caching system improves speed and efficiency, particularly when experimenting or using multiple scripts or notebooks.
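As a quick illustration (a minimal sketch using the public bert-base-uncased checkpoint; substitute any model ID):

from transformers import AutoTokenizer

# First call downloads the tokenizer files into the local cache
tok = AutoTokenizer.from_pretrained("bert-base-uncased")

# Second call is served entirely from the cache - no re-download
tok = AutoTokenizer.from_pretrained("bert-base-uncased")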
By default, Hugging Face caches models in the following directories:
Hugging Face Hub Cache
This cache stores downloaded models and datasets from the Hugging Face Hub:
- Windows: C:\Users\<YourUsername>\.cache\huggingface\hub\ (or, if overridden: H:\HuggingFace\Cache\hub\)
- Linux/macOS: ~/.cache/huggingface/hub/
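To see where the hub cache resolves on your machine, recent huggingface_hub versions expose the path as a constant (a small sketch; the constants module location may differ in older releases):

from huggingface_hub import constants

print(constants.HF_HUB_CACHE)  # e.g. C:\Users\<YourUsername>\.cache\huggingface\hub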
Transformers-Specific Cache
This stores transformers-specific files like:
pytorch_model.bin
tokenizer.json
config.json
Location:
- Windows: C:\Users\<YourUsername>\.cache\huggingface\transformers
- Linux/macOS: ~/.cache/huggingface/transformers/
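In many transformers 4.x releases you can print this resolved path too (a hedged sketch; the import location has moved between versions):

from transformers.utils import TRANSFORMERS_CACHE

print(TRANSFORMERS_CACHE)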
Symlinks Warning on Windows
You may encounter a warning like this when using Hugging Face:
huggingface_hub\file_download.py: UserWarning:
`huggingface_hub` cache-system uses symlinks by default...
Caching will still work but in a degraded version...
What It Means:
Hugging Face uses symlinks (symbolic links, similar to shortcuts) to avoid storing duplicate copies of model files and save disk space. However, Windows often restricts symlink creation unless:
- Developer Mode is enabled, or
- Python is run as an Administrator
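You can check whether your current Python process is allowed to create symlinks with a small self-contained test (a sketch using only the standard library, not a Hugging Face API):

import os
import tempfile

# Try creating a symlink in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "target.txt")
    link = os.path.join(tmp, "link.txt")
    open(target, "w").close()
    try:
        os.symlink(target, link)
        print("Symlinks supported - full cache deduplication available")
    except OSError:
        print("Symlinks blocked - enable Developer Mode or run as Administrator")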
How to Enable Symlinks on Windows
Option 1: Enable Developer Mode
- Press Win + R, type ms-settings:developers, and press Enter
- Enable Developer Mode
If “For Developers” is missing from Settings, you can enable Developer Mode via the Windows Registry (advanced users only).
Option 2: Run Python as Administrator
- Right-click your terminal or IDE (e.g., VS Code) → Run as Administrator
- This allows symlink creation without Developer Mode.
Option 3: Suppress the Warning
You can safely ignore the warning, but to suppress it, set the flag before importing transformers or huggingface_hub:

import os

# Set before any Hugging Face import - the flag is read when the library initialises
os.environ["HF_HUB_DISABLE_SYMLINKS_WARNING"] = "1"
Or set it globally via environment variables:
- Variable name: HF_HUB_DISABLE_SYMLINKS_WARNING
- Value: 1
Changing Cache Directory Locations
1. Change Hugging Face Hub Cache (HF_HOME)
Set the HF_HOME environment variable. This moves the entire Hugging Face cache root, so the hub cache then lives under <HF_HOME>\hub.
In Python (temporary):

import os

# Must run before importing transformers/huggingface_hub for the new path to take effect
os.environ["HF_HOME"] = "D:/MyCustomHFCache"
System-wide (Windows):
- Open Environment Variables
- Add a new variable:
  - Name: HF_HOME
  - Value: D:\MyCustomHFCache
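To confirm the override took effect, you can check where the hub cache now resolves (a sketch; D:/MyCustomHFCache is the example path from above):

import os

os.environ["HF_HOME"] = "D:/MyCustomHFCache"  # must be set before any Hugging Face import

from huggingface_hub import constants
print(constants.HF_HUB_CACHE)  # should point inside D:/MyCustomHFCache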
2. Change Transformers Cache (TRANSFORMERS_CACHE)
In Python:

import os

os.environ["TRANSFORMERS_CACHE"] = "D:/MyTransformersCache"
System-wide:
- Add a new environment variable:
  - Name: TRANSFORMERS_CACHE
  - Value: D:\MyTransformersCache
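Note: recent transformers releases deprecate TRANSFORMERS_CACHE in favour of HF_HOME (or HF_HUB_CACHE), so on a current install prefer those variables.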
Managing Cache Storage
You can inspect or clean up your cache folder manually:
- Check the size: navigate to .cache/huggingface and right-click → Properties
- Delete old models: you can safely delete folders like models--..., snapshots, or specific model subfolders; Hugging Face will re-download them when needed
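You can also inspect the cache programmatically. huggingface_hub ships a scan_cache_dir() helper (a minimal sketch, assuming a reasonably recent huggingface_hub version):

from huggingface_hub import scan_cache_dir

info = scan_cache_dir()
print(f"Total cache size: {info.size_on_disk / 1e9:.2f} GB")

# List each cached repo and its footprint on disk
for repo in info.repos:
    print(repo.repo_id, f"{repo.size_on_disk / 1e6:.1f} MB")

The companion CLI commands huggingface-cli scan-cache and huggingface-cli delete-cache offer a similar report plus interactive cleanup.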
Example: Loading a Fine-Tuned PEFT Model with LoRA
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
from peft import PeftModel, PeftConfig
# Load PEFT config
peft_model_path = "fine_tuned_policy_model"
config = PeftConfig.from_pretrained(peft_model_path)
# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(config.base_model_name_or_path)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
# Apply LoRA adapter
model = PeftModel.from_pretrained(base_model, peft_model_path)
model.eval()
# Inference
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("Once upon a time,", max_new_tokens=50)[0]["generated_text"])
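If you only need the LoRA adapter at inference time, many peft versions also let you fold the adapter weights into the base model (a hedged one-liner; available for LoRA-style adapters):

merged_model = model.merge_and_unload()  # merges LoRA weights into the base model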
Summary
| Task | Solution |
|---|---|
| Default cache path | ~/.cache/huggingface/ (Windows: C:\Users\<You>\.cache\...) |
| Change cache location | Use HF_HOME or TRANSFORMERS_CACHE env vars |
| Enable symlinks | Turn on Developer Mode or run Python as Admin |
| Suppress symlink warnings | Set HF_HUB_DISABLE_SYMLINKS_WARNING=1 |
| Clean up old models | Manually delete from cache folder |
My Tech Advice: Managing Hugging Face’s cache and understanding how to optimize it — especially on Windows — is key to a smoother ML workflow. With a few tweaks, you can avoid annoying warnings, save disk space, and keep things running efficiently.
Ready to optimise your Hugging Face solution? Try the above tech concept, or contact me for tech advice!
#AskDushyant
Note: The names and information mentioned are based on my personal experience and publicly available data; however, they do not represent any formal statement.
#TechConcept #TechAdvice #HuggingFace #AI #Configuration