I am trying to use the generate_text_prompts function to generate text embeddings for a custom dataset formatted like Flickr30k. My goal is to use these embeddings to reparameterize and then fine-tune YOLO-World. Here is my current workflow:
I take the text spans referenced by tokens_positive_eval and pass them to generate_text_prompts as the input categories to produce the text embeddings (a minimal sketch of this step is included below).
However, this yields a very large number of categories (73,350), which in turn requires setting num_classes and num_training_classes to 73,350 during reparameterization fine-tuning.
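For context, here is roughly what my embedding-generation step looks like. This is a minimal sketch of the equivalent logic rather than the repo's actual script: I assume a CLIP text encoder loaded through Hugging Face transformers, and the checkpoint name and file paths are placeholders from my setup.

```python
import json

import numpy as np
import torch
from transformers import CLIPTextModelWithProjection, CLIPTokenizer

# Placeholder checkpoint -- should match the text encoder the
# YOLO-World weights were trained with.
model_name = "openai/clip-vit-base-patch32"
tokenizer = CLIPTokenizer.from_pretrained(model_name)
text_model = CLIPTextModelWithProjection.from_pretrained(model_name).eval()

# Placeholder input: a json list of category strings, here the phrases
# gathered from tokens_positive_eval (all 73,350 of them in my case).
with open("my_categories.json") as f:
    categories = json.load(f)

chunks = []
with torch.no_grad():
    for i in range(0, len(categories), 64):  # batch to keep memory bounded
        inputs = tokenizer(categories[i:i + 64], padding=True, return_tensors="pt")
        embeds = text_model(**inputs).text_embeds
        chunks.append(embeds / embeds.norm(dim=-1, keepdim=True))  # L2-normalize

# One row per category; for clip-vit-base-patch32 the shape is (73350, 512).
np.save("my_text_embeddings.npy", torch.cat(chunks).numpy())
```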
As a beginner, I am unsure if this is the correct approach. Specifically, I have the following questions:
1. Is it appropriate to use the tokens_positive_eval text as the input to generate_text_prompts when generating embeddings for a custom dataset?
2. Is setting num_classes and num_training_classes to 73,350 the correct way to handle this many categories during fine-tuning? (A sketch of how I currently set these fields follows this list.)
3. Are there any best practices or a recommended workflow for generating text embeddings and fine-tuning YOLO-World on custom datasets with a large number of categories?
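For reference, this is roughly how the two fields in question appear in my fine-tune config. It is only a sketch: the base config path and embedding file are placeholders, and the key nesting is my reading of the released YOLO-World finetune configs, so treat everything except the two variables as an assumption about my setup.

```python
# Sketch of my current fine-tune config (mmengine style). Only num_classes /
# num_training_classes are the settings in question; the base config path and
# the embedding file name are placeholders from my own setup.
_base_ = "../path/to/yolo_world_finetune_base.py"  # placeholder

num_classes = 73350           # one entry per tokens_positive_eval phrase
num_training_classes = 73350  # same count used during training

model = dict(
    bbox_head=dict(head_module=dict(num_classes=num_classes)),
    train_cfg=dict(assigner=dict(num_classes=num_training_classes)),
)
```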
Any guidance or suggestions would be greatly appreciated!
Thank you in advance for your help!