Captioning photos with AI II

After having my 60,000 photos captioned by the BLIP AI, I looked through the captions.

The captions that BLIP produced were not bad. In fact, they were heartbreakingly accurate: "A baby girl opening a box of candy", which indeed she was. My wife and I sat over the captions for some time and were really moved.

But BLIP focuses too much on the person in the foreground. When there were several people in the image, BLIP often did not mention the others.

So I looked at the source code of Stable-Diffusion to learn more about CLIP. While doing so, I found captionr, a piece of software that does exactly what I wanted to program myself.

But playing around with its command-line switches for CLIP did not make me any happier.

So I looked for alternatives to BLIP and CLIP.