This repository was archived by the owner on Sep 28, 2023. It is now read-only.

RGB-as-text for image classification with LLMs #1

@billhowe

Can an LLM perform non-trivial image classification when the image is supplied as text, that is, as its RGB (or grayscale) pixel values?
For example, take an MNIST image, print its sequence of grayscale values as text, and include that text in a prompt to an LLM along with the question "What is this a picture of?"

If this works, text-only models could handle simple image tasks, albeit inefficiently.
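
A minimal sketch of the encoding step, assuming a grayscale image is already available as a grid of integers in the range 0-255. The helper names `image_to_text` and `build_prompt`, the row-per-line layout, and the toy 5x5 image are illustrative assumptions, not part of this issue. No model is called; the sketch only builds the prompt string.

```python
from typing import Sequence

Grid = Sequence[Sequence[int]]


def image_to_text(pixels: Grid) -> str:
    """Render a grayscale image as text: one row per line, values space-separated."""
    for row in pixels:
        for value in row:
            if not 0 <= value <= 255:
                raise ValueError(f"pixel value out of range 0-255: {value}")
    return "\n".join(" ".join(str(value) for value in row) for row in pixels)


def build_prompt(pixels: Grid, question: str = "What is this a picture of?") -> str:
    """Combine a short description, the pixel text, and the question into one prompt."""
    height = len(pixels)
    width = len(pixels[0]) if height else 0
    header = (
        f"Below is a {height}x{width} grayscale image, one row per line, "
        "with each pixel given as an integer from 0 (black) to 255 (white)."
    )
    return f"{header}\n\n{image_to_text(pixels)}\n\n{question}"


if __name__ == "__main__":
    # Hypothetical toy image (a bright vertical stroke), standing in for a 28x28 MNIST digit.
    toy_image = [
        [0, 0, 255, 0, 0],
        [0, 0, 255, 0, 0],
        [0, 0, 255, 0, 0],
        [0, 0, 255, 0, 0],
        [0, 0, 255, 0, 0],
    ]
    print(build_prompt(toy_image))
```

The resulting string could be passed to any text-only model. A full 28x28 MNIST digit yields 784 values, so prompt length grows quickly with image size, which is one reason the approach would be inefficient.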
