Add CLIP score #1311

chinoll · 2022-11-03T11:41:27Z

🚀 Feature

Add a CLIP score metric that measures how well an image matches a text, computed as the cosine similarity between their CLIP embeddings.

Motivation

Evaluating the performance of text-to-image models.

Pitch

PyTorch-like pseudocode:

def clip_score(img_inputs, txt_inputs):
    # `clip` stands for any pretrained CLIP model exposing these two encoders.
    img_features = clip.get_image_features(img_inputs)
    txt_features = clip.get_text_features(txt_inputs)
    # L2-normalize so the dot product below is a cosine similarity.
    img_features, txt_features = [
        x / torch.linalg.norm(x, dim=-1, keepdim=True)
        for x in [img_features, txt_features]
    ]
    return (img_features * txt_features).sum(dim=-1)
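The normalize-then-dot-product step in the pseudocode is a row-wise cosine similarity. A minimal sketch to check that, assuming random tensors as stand-ins for real CLIP embeddings (the helper name `clip_score_from_features` is hypothetical and not part of any library):

```python
import torch
import torch.nn.functional as F


def clip_score_from_features(img_features: torch.Tensor, txt_features: torch.Tensor) -> torch.Tensor:
    """Row-wise cosine similarity between paired image and text embeddings."""
    img_features = img_features / torch.linalg.norm(img_features, dim=-1, keepdim=True)
    txt_features = txt_features / torch.linalg.norm(txt_features, dim=-1, keepdim=True)
    return (img_features * txt_features).sum(dim=-1)


torch.manual_seed(0)
# Random stand-ins for CLIP embeddings: 4 image/text pairs, 512 dimensions each.
img = torch.randn(4, 512)
txt = torch.randn(4, 512)

scores = clip_score_from_features(img, txt)
# PyTorch's built-in cosine similarity should agree with the manual version.
reference = F.cosine_similarity(img, txt, dim=-1)
```

Each score lies in [-1, 1], and a pair with identical embeddings scores 1.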

Alternatives

Additional context

clip score

github-actions · 2022-11-03T11:41:57Z

Hi! Thanks for your contribution! Great first issue!

stancld · 2022-11-03T17:07:57Z

Hi @chinoll, this sounds like a nice addition, and it would be the first multi-modal metric. Would you happen to have a reference implementation?

SkafteNicki · 2022-11-04T16:26:21Z

Here is at least one reference implementation:
https://github.com/mehdidc/DALLE_clip_score

chinoll · 2022-11-04T16:55:49Z

> Hi @chinoll, this sounds like a nice addition, and it would be the first multi-modal metric. Would you happen to have a reference implementation?

Here is a simple PyTorch implementation, based on CLIP-score-vs-FID-pareto-curves:

import torch
from PIL import Image
from transformers import CLIPFeatureExtractor, CLIPModel, CLIPTokenizer

version = "openai/clip-vit-large-patch14"
tokenizer = CLIPTokenizer.from_pretrained(version)
model = CLIPModel.from_pretrained(version)
# Newer transformers releases deprecate CLIPFeatureExtractor in favor of CLIPImageProcessor.
feature_extractor = CLIPFeatureExtractor.from_pretrained(version)

@torch.no_grad()
def clip_score(text: str, image: Image.Image) -> torch.Tensor:
    # Embed the caption and the image with the pretrained CLIP encoders.
    txt_features = model.get_text_features(**tokenizer(text, return_tensors="pt"))
    img_features = model.get_image_features(**feature_extractor(image, return_tensors="pt"))
    # L2-normalize both embeddings so their dot product is the cosine similarity.
    img_features, txt_features = [
        x / torch.linalg.norm(x, dim=-1, keepdim=True)
        for x in [img_features, txt_features]
    ]
    return (img_features * txt_features).sum(dim=-1)
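To evaluate a text-to-image model rather than a single pair, the per-pair scores would typically be averaged over many samples. A rough sketch of such an accumulator, assuming the image and text features have already been extracted; the class `RunningClipScore` is hypothetical and purely illustrative, and random tensors stand in for real CLIP embeddings:

```python
import torch


class RunningClipScore:
    """Hypothetical accumulator: averages per-pair cosine similarities over batches."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def update(self, img_features: torch.Tensor, txt_features: torch.Tensor) -> None:
        # Same normalize-and-dot-product step as in clip_score.
        img = img_features / torch.linalg.norm(img_features, dim=-1, keepdim=True)
        txt = txt_features / torch.linalg.norm(txt_features, dim=-1, keepdim=True)
        scores = (img * txt).sum(dim=-1)
        self.total += scores.sum().item()
        self.count += scores.numel()

    def compute(self) -> float:
        if self.count == 0:
            raise ValueError("update() must be called at least once before compute()")
        return self.total / self.count


torch.manual_seed(0)
metric = RunningClipScore()
# Two batches of random stand-in features (3 and 5 pairs, 8 dimensions each).
batch_a = (torch.randn(3, 8), torch.randn(3, 8))
batch_b = (torch.randn(5, 8), torch.randn(5, 8))
metric.update(*batch_a)
metric.update(*batch_b)
mean_score = metric.compute()
```

Keeping a running sum and count means batches of different sizes are weighted by their number of pairs, so the result equals the mean over all pairs seen.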

SkafteNicki · 2022-11-05T08:58:08Z

I have started working on adding the metric in #1314.

chinoll added the enhancement New feature or request label Nov 3, 2022

stancld added the New metric label Nov 4, 2022

SkafteNicki mentioned this issue Nov 5, 2022

Add CLIP score #1314

Merged

4 tasks

SkafteNicki self-assigned this Nov 5, 2022

stancld added this to the v0.11 milestone Nov 6, 2022

Borda closed this as completed in #1314 Nov 18, 2022

Borda added the topic: Image label Aug 25, 2023
