ChatGPT is based on OpenAI's GPT (Generative Pre-trained Transformer) architecture, fine-tuned specifically for conversational responses. The underlying transformer architecture helps the model capture the context of the input it is given.
GPT models generate human-like text autoregressively: at each step they predict the probability of the next word (token) given all the words that came before it. The initial version of ChatGPT was trained with supervised fine-tuning, in which human AI trainers wrote conversations playing both sides (user and AI assistant); this dialogue data was then mixed with the InstructGPT dataset, transformed into a dialogue format.
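The idea of predicting the next word from the previous ones can be illustrated with a toy sketch. This is not a transformer: it is a minimal bigram count model over a made-up corpus (both are assumptions for illustration), but the autoregressive loop, "predict a distribution over the next word, pick one, append, repeat", is the same shape of computation.

```python
from collections import Counter, defaultdict

# Toy corpus standing in for training text (assumption: real models train
# on vast token corpora with a transformer; this sketch uses bigram counts
# purely to illustrate next-word prediction).
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count bigrams so that P(next | prev) ~ count(prev, next) / count(prev).
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word_distribution(prev):
    """Probability distribution over the next word given the previous word."""
    counts = bigrams[prev]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def generate(start, n=5):
    """Greedy autoregressive generation: repeatedly pick the likeliest next word."""
    out = [start]
    for _ in range(n):
        dist = next_word_distribution(out[-1])
        if not dist:
            break
        out.append(max(dist, key=dist.get))
    return " ".join(out)

print(next_word_distribution("the"))  # e.g. {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
print(generate("the"))
```

A real GPT model replaces the bigram table with a transformer that conditions on the entire preceding context, and replaces greedy selection with sampling strategies, but the generation loop is conceptually the same.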
The training pipeline combines Supervised Fine-Tuning (SFT) with Reinforcement Learning (RL): supervised learning on a dataset of dialogues first teaches the model to produce plausible responses, and reinforcement learning from human feedback then optimizes those responses over time.
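One piece of the human-feedback stage can be sketched concretely: a reward model is trained on human rankings of responses, commonly with a pairwise (Bradley-Terry style) loss. The sketch below shows that loss on hypothetical scalar reward scores; the specific numbers and function names are assumptions for illustration, not ChatGPT's actual implementation.

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Pairwise preference loss: -log sigmoid(r_chosen - r_rejected).
    Low when the reward model scores the human-preferred response higher,
    high when it scores the rejected response higher."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical reward scores for two responses to the same prompt:
print(preference_loss(2.0, 0.5))  # small loss: model agrees with the labeler
print(preference_loss(0.5, 2.0))  # large loss: model disagrees with the labeler
```

Minimizing this loss over many ranked pairs pushes the reward model toward human preferences; that reward model then supplies the training signal the RL step uses to optimize the dialogue policy.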