Welcome to an open source implementation of OpenAI's CLIP (Contrastive Language-Image Pre-training).

Using this codebase, we have trained several models on a variety of data sources and compute budgets, ranging from small-scale experiments to larger runs, including models trained on datasets such as LAION-400M, LAION-2B and DataComp-1B.

Many of our models and their scaling properties are studied in detail in the paper "Reproducible scaling laws for contrastive language-image learning". Some of our best models and their zero-shot ImageNet-1k accuracy are shown below, along with the ViT-L model trained by OpenAI. We provide more details about our full collection of pretrained models here, and zero-shot results for 38 datasets here.

Model cards with additional model-specific details can be found on the Hugging Face Hub under the OpenCLIP library tag.

If you found this repository useful, please consider citing. We welcome anyone to submit an issue or send an email if you have any other requests or suggestions.

Note that portions of the src/open_clip/ modelling and tokenizer code are adaptations of OpenAI's official repository.

```python
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32', pretrained='laion2b_s34b_b79k')
tokenizer = open_clip.get_tokenizer('ViT-B-32')

image = preprocess(Image.open("docs/CLIP.png")).unsqueeze(0)  # example image shipped with the repo
text = tokenizer(["a diagram", "a dog", "a cat"])

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

print("Label probs:", text_probs)  # prints: [[1., 0., 0.]]
```

To compute billions of embeddings efficiently, you can use clip-retrieval, which has OpenCLIP support.

We offer a simple model interface to instantiate both pre-trained and untrained models. To see which pretrained models are available, use the code snippet shown at the end of this section. More details about our pretrained models are available here. You can find more about the models we support (e.g. number of parameters, FLOPs) in this table.

NOTE: Many existing checkpoints use the QuickGELU activation from the original OpenAI models. This activation is actually less efficient than native torch.nn.GELU in recent versions of PyTorch.
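In practice, checkpoints trained with QuickGELU are typically paired with model definitions carrying a `-quickgelu` suffix. A minimal sketch, assuming the `ViT-B-32-quickgelu` definition and the `laion400m_e32` weights are available in your installed version of open_clip:

```python
import open_clip

# Pick a QuickGELU model definition so the activation matches the checkpoint;
# the specific architecture/weights pair below is only an example.
model, _, preprocess = open_clip.create_model_and_transforms(
    'ViT-B-32-quickgelu', pretrained='laion400m_e32'
)
```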
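The snippet referred to above for discovering available pretrained weights, plus an untrained instantiation, looks like this; a short sketch using `open_clip.list_pretrained()`:

```python
import open_clip

# List the available (model architecture, pretrained tag) pairs.
print(open_clip.list_pretrained())

# Instantiate an untrained model by omitting the `pretrained` argument;
# the weights are then randomly initialised.
model, _, preprocess = open_clip.create_model_and_transforms('ViT-B-32')
```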
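Similarly, the model cards on the Hugging Face Hub can be used as a model source directly; a sketch, assuming the `hf-hub:` naming scheme supported by recent open_clip releases (the repository id below is only an example):

```python
import open_clip

# Load a model and its preprocessing transforms straight from a Hub repository;
# replace the repo id with the model card you are interested in.
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:laion/CLIP-ViT-B-32-laion2B-s34B-b79K'
)
tokenizer = open_clip.get_tokenizer('hf-hub:laion/CLIP-ViT-B-32-laion2B-s34B-b79K')
```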