Images and multimodal: see and create
Goal. Use ChatGPT with images in both directions: have it interpret a figure or photo you upload, and have it generate a diagram or illustration from a description.
Generate a clean, labelled schematic of the central dogma (DNA → RNA → protein) suitable for a lecture slide: simple, high-contrast, with arrows and labels. Then suggest a one-line caption.
Send the prompt above to generate an image. To go the other way, upload a photo of a gel or a chart and ask "What does this figure show, and what stands out?" That request uses ChatGPT's vision to read the image.
1. Generate from a description. Describe the image you want: content, style, and where it'll be used (slide, poster, figure). Iterate: "make the arrows thicker", "remove the background".
2. Interpret what you upload (vision). Attach a photo, chart, gel, or microscopy image and ask ChatGPT to describe or interpret it. Good for a first read; not a substitute for your own quantification.
3. Stay honest about generated images. AI-generated figures are illustrations, not data. Never present a generated image as an experimental result.
You'll see. A labelled schematic created from your description and, when you upload a figure or photo, a plain-language read of it.
Cost. Image generation and vision both work on the free tier but are limited (slower, fewer images). Plus gives more and higher-quality image creation; Pro gives unlimited, faster generation.
Takeaway. ChatGPT is multimodal in both directions: it can read images you upload and create new ones. Generated images are illustrations, never data.