
TIGER-AI-Lab/VLM2Vec - GitHub
This repository contains the official code and data for VLM2Vec-V2, a unified framework for learning powerful multimodal embeddings across diverse visual formats including images, videos, and visual …
VLM2Vec/src/grad_cache/loss.py at main - GitHub
assert dist.is_initialized(), "Distributed training has not been properly initialized."
VLM2Vec/src/loss.py at main · TIGER-AI-Lab/VLM2Vec · GitHub
assert dist.is_initialized(), "Distributed training has not been properly initialized."
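The assertion above guards a loss that shares in-batch negatives across processes: each rank must be part of an initialized process group before embeddings can be gathered. A minimal sketch of that pattern, with a single-process fallback (the helper names `gather_embeddings` and `contrastive_loss` are illustrative, not the repo's API):

```python
import torch
import torch.distributed as dist


def gather_embeddings(reps: torch.Tensor) -> torch.Tensor:
    # Gather embeddings from every rank so in-batch negatives span the
    # global batch. No-op when running single-process.
    if not dist.is_initialized():
        return reps
    gathered = [torch.zeros_like(reps) for _ in range(dist.get_world_size())]
    dist.all_gather(gathered, reps)
    gathered[dist.get_rank()] = reps  # keep gradients for the local shard
    return torch.cat(gathered, dim=0)


def contrastive_loss(q: torch.Tensor, p: torch.Tensor, temperature: float = 0.02) -> torch.Tensor:
    # InfoNCE-style loss: each query should match the passage at the
    # same global index; all other passages act as negatives.
    q_all, p_all = gather_embeddings(q), gather_embeddings(p)
    scores = q_all @ p_all.T / temperature
    targets = torch.arange(scores.size(0), device=scores.device)
    return torch.nn.functional.cross_entropy(scores, targets)
```

In a real run, the `dist.is_initialized()` check would instead be the hard assert shown in the snippet, since training without the global gather silently shrinks the negative pool.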
VLM2Vec/src/model/model.py at main · TIGER-AI-Lab/VLM2Vec
from typing import Dict import torch import torch.distributed as dist from torch import nn, Tensor from transformers import PreTrainedModel, AutoModelForCausalLM, AutoConfig from peft …
VLM2Vec/experiments/public/data at main - GitHub
This repo contains the code for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks" [ICLR 2025] - TIGER-AI-Lab/VLM2Vec
VLM2Vec/src/data/loader/mixed_dataset.py at main - GitHub
if len(train_datasets) > 1: train_dataset = interleave_datasets(train_datasets, probabilities=probs, batch_size=interleave_batch_size, seed=training_args.seed, …
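The snippet mixes multiple per-task training sets by sampling each next example from a dataset chosen according to fixed probabilities. A pure-Python sketch of that sampling behavior (the `interleave` function is a simplified stand-in for `datasets.interleave_datasets`, stopping once any source is exhausted):

```python
import random


def interleave(datasets, probabilities, seed=0):
    # At each step, pick a source dataset with the given probability
    # weights and emit its next example; stop when any source runs out
    # (the "first exhausted" stopping strategy).
    rng = random.Random(seed)
    cursors = [0] * len(datasets)
    out = []
    while all(c < len(d) for c, d in zip(cursors, datasets)):
        i = rng.choices(range(len(datasets)), weights=probabilities)[0]
        out.append(datasets[i][cursors[i]])
        cursors[i] += 1
    return out
```

Fixing the seed, as the repo code does with `training_args.seed`, makes the mixing order reproducible across restarts.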
VLM2Vec/src/grad_cache/grad_cache.py at main · TIGER-AI-Lab
Should be in similar order as the class's model. :param no_sync_except_last: If True, under distributed setup, for each model, only trigger gradient reduction across processes for the last sub-batch's …
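The `no_sync_except_last` behavior described here is a standard gradient-caching optimization: when backpropagating through many sub-batches under DDP, the cross-process gradient all-reduce is suppressed for every sub-batch except the last, so gradients are reduced once instead of once per sub-batch. A sketch under the assumption that `model` exposes PyTorch DDP's `no_sync()` context manager (the function name is illustrative):

```python
import contextlib


def backward_over_subbatches(model, sub_losses, no_sync_except_last=True):
    # Skip the DDP gradient all-reduce for all but the last sub-batch;
    # the final backward() then reduces the accumulated gradients once.
    for i, loss in enumerate(sub_losses):
        is_last = i == len(sub_losses) - 1
        ctx = (
            model.no_sync()
            if (no_sync_except_last and not is_last)
            else contextlib.nullcontext()
        )
        with ctx:
            loss.backward()
```

This matters most when gradient caching splits a large effective batch into many small sub-batches: without it, communication cost grows linearly with the number of sub-batches.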
VLM2Vec/src/model/baseline_backbone/phi3_v/image_embedding
# You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is …
VLM2Vec/src/grad_cache/minigc_cmd.md at main - GitHub
VLM2Vec/src/prompt/tart.py at main · TIGER-AI-Lab/VLM2Vec