Danbooru 0-shot classification demo

Demo for 0-shot classification on Danbooru images.

The model uses a DaViT-tiny backbone, an ML-Decoder classification head, and the Alibaba-NLP/gte-large-en-v1.5 text embedding model. The training set consists of posts whose ID is at most 5,400,000 and whose last three digits fall in the range [0, 899], inclusive.
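To check whether a given post was part of the training set, the ID rule above can be sketched in a few lines of Python. The function name and constants below are hypothetical and are not taken from the demo's code; the sketch also assumes post IDs are non-negative integers.

```python
# Minimal sketch of the training-set membership rule described above.
# MAX_TRAIN_ID, MAX_TRAIN_SUFFIX, and is_in_training_set are illustrative
# names (assumptions), not part of the demo's actual code.
MAX_TRAIN_ID = 5_400_000   # highest post ID included in training
MAX_TRAIN_SUFFIX = 899     # highest allowed value of the last three digits


def is_in_training_set(post_id: int) -> bool:
    """Return True if post_id satisfies both training-set conditions."""
    last_three_digits = post_id % 1000
    return post_id <= MAX_TRAIN_ID and last_three_digits <= MAX_TRAIN_SUFFIX


if __name__ == "__main__":
    for pid in (1_234_567, 1_234_900, 5_400_000, 5_400_001):
        print(pid, is_in_training_set(pid))
```

Posts that fail either condition were not seen during training, so they are the more informative ones to try when judging 0-shot behavior.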

To supply an image, upload one or fetch it by post ID. To supply a tag description, type it into the input box or fetch it by tag name.
