The massive modify from GPT-3.5 is the fact that OpenAI's 4th era language design is multimodal, which implies it might course of action both text, visuals and audio. This implies you could display it pictures and it will respond to them alongside a text prompt – an early example of this, mentioned via the Big apple Times, concerned giving GPT-4