Yes, they do use trained transformer models; in the case of ChatGPT, that is a language model serving as a "user interface" between human and machine, if you will.
I just talked with ChatGPT and was able to get it to tell me that it is trained on massive text datasets. The algorithms that make up the program then store that "learning" in files as weights, or parameters. Those weights are data structures. The program's algorithm loads the data into active memory and processes it, generating a text output.
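To make that concrete, here is a toy sketch of my own (not ChatGPT's actual format, which is vastly larger and more complex): weights are saved to a file, loaded back into active memory, and processed by an algorithm to produce an output.

```python
import json
import os
import tempfile

# Hypothetical toy "model": its learned knowledge is nothing but numbers.
weights = {"w": [0.5, -1.2, 2.0], "b": 0.1}

# Save the weights to a file, as a trained model would be checkpointed...
path = os.path.join(tempfile.gettempdir(), "toy_weights.json")
with open(path, "w") as f:
    json.dump(weights, f)

# ...then load them back into active memory for use.
with open(path) as f:
    loaded = json.load(f)

def predict(x, params):
    """Apply the loaded weights to an input: a dot product plus a bias."""
    return sum(wi * xi for wi, xi in zip(params["w"], x)) + params["b"]

output = predict([1.0, 1.0, 1.0], loaded)
print(output)  # 0.5 - 1.2 + 2.0 + 0.1 = 1.4
```

The point of the toy is that nothing here "thinks": the file holds numbers, and the code mechanically applies them to the input.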
What I am saying is that ChatGPT, or any current "AI," is not artificial intelligence. It cannot think and act on its own. ChatGPT and related computer programs are definitely an evolutionary improvement in human-machine interface technology, but they are not a revolutionary one, and they are in no way close to AI. They are just better at carrying responses from human to machine and from machine to human. I think it does a disservice to society to have all the kids thinking "oh, we have AI now," when that is not correct.
It was interesting to talk with ChatGPT. You really have to get specific to get it to tell you what you want to know. I included my chat with ChatGPT below, leaving out many initial questions and responses that didn't answer my questions.
___________________ chatting with ChatGPT_________________________________________
You
how do you retain your encoded knowledge?
ChatGPT
The encoded knowledge in a model like mine, based on the GPT-3.5 architecture, is retained in the form of weights or parameters in the neural network. During the training process, the model adjusts these parameters to minimize the difference between its predicted outputs and the actual training data.
The learning occurs through a combination of backpropagation and gradient descent, where the model updates its parameters based on the errors it makes during predictions. The adjusted parameters represent the learned knowledge, capturing the patterns and relationships in the training data.
Once the training is completed, the model’s parameters are saved, and they are used during inference (when generating responses to user inputs). The model doesn’t retain explicit knowledge of specific documents or sources within its training data but rather a generalized understanding of patterns and structures that allow it to generate contextually relevant responses.
It’s important to note that I don’t have the ability to retain information between conversations or sessions, as each interaction is stateless. I don’t store personal data shared during conversations, and privacy and data security are prioritized.
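(An aside from me: the backpropagation and gradient descent that ChatGPT describes above can be sketched in a few lines. This is a toy with one weight and one made-up training example, not the real training code.)

```python
# Minimal gradient-descent sketch: one weight, one training pair (x, y),
# and the squared-error loss (w*x - y)**2. The ideal weight here is 3.0.
x, y = 2.0, 6.0   # training input and target
w = 0.0           # the weight starts untrained
lr = 0.1          # learning rate

for _ in range(100):
    pred = w * x                 # forward pass: make a prediction
    grad = 2 * (pred - y) * x    # backward pass: gradient of the loss w.r.t. w
    w -= lr * grad               # update the weight to reduce the error

print(round(w, 4))  # converges toward 3.0
```

The loop is the whole trick: the weight is nudged, step by step, toward whatever value makes the prediction match the training data.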
You
do you keep the weights or parameters in your neural network in files or in active computer memory
ChatGPT
The weights or parameters of the neural network in models like mine are typically stored in files. During the training process, these weights are continuously updated and adjusted to capture patterns in the data. Once the training is complete, the final set of learned parameters is saved to files.
When the model is in use for inference, these saved parameters are loaded into the active computer memory (RAM) to perform the computations needed for generating responses to user inputs. Loading the parameters into memory allows the model to quickly access and use the learned information during the conversation.
In summary, the weights or parameters are stored in files, but they are loaded into the active computer memory when the model is in use for generating responses.
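(My aside: the save-then-load cycle ChatGPT describes can be illustrated with a toy binary checkpoint. The format below is made up for illustration; real frameworks store far more data and metadata.)

```python
import os
import struct
import tempfile

# Hypothetical checkpoint: pack a few float32 weights into a binary file.
weights = [0.25, -1.5, 3.0]
path = os.path.join(tempfile.gettempdir(), "weights.bin")
with open(path, "wb") as f:
    f.write(struct.pack(f"{len(weights)}f", *weights))

# At inference time, the file is read back into RAM so the numbers
# are available for fast computation.
with open(path, "rb") as f:
    raw = f.read()
loaded = list(struct.unpack(f"{len(raw) // 4}f", raw))
print(loaded)  # [0.25, -1.5, 3.0]
```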
You
Finally we’re getting somewhere!
ChatGPT
I’m glad to hear that you found the information helpful! If you have any more questions or if there’s anything else you’d like to discuss, feel free to let me know. I’m here to help!
You
So, you process the weights or parameters of the neural network based on the algorithms contained in your computer code?
ChatGPT
Yes, that’s correct. The processing of weights or parameters in the neural network is done through the algorithms encoded in the computer code that defines the model architecture. During the training phase, an optimization algorithm, often gradient descent, is used along with backpropagation to update these weights in order to minimize the difference between the model’s predictions and the actual training data.
The algorithms implemented in the code are responsible for the forward pass (making predictions), backward pass (calculating gradients and updating weights), and other training procedures. Once the model is trained, these algorithms are also used during inference to generate responses based on the learned parameters.
In summary, the algorithms in the computer code govern how the model processes and updates its weights, enabling it to learn from the training data and make predictions during inference.
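(My aside: the "forward pass" ChatGPT mentions is, at bottom, just arithmetic applied to the stored weights. Here is a toy two-input, two-hidden-unit, one-output network with made-up weights; real models do the same thing across billions of parameters.)

```python
import math

# Made-up weights for illustration only.
W1 = [[0.5, -0.5], [1.0, 1.0]]   # hidden-layer weights, one row per unit
W2 = [1.0, -1.0]                 # output-layer weights

def forward(x):
    """Forward pass: weighted sums, a tanh nonlinearity, then a weighted sum."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(w * h for w, h in zip(W2, hidden))

out = forward([1.0, 2.0])
print(out)
```

There is no decision-making in there beyond multiplication and addition, which is the substance of my argument above.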
You
can you display the file contents of weights or measures
ChatGPT
I’m sorry for any confusion, but I cannot display the file contents of the weights or parameters directly. The weights are stored in a complex numerical format within the model’s files, and they are not directly interpretable as human-readable text.
The files containing the model parameters are typically binary files, and the information they contain is structured in a way that the model’s architecture and numerical representations of the learned weights can be reconstructed during the model loading process. These files are not intended for direct human inspection or interpretation.
If you have specific questions about the model or its functioning, I’m happy to help explain concepts or processes to the best of my ability!
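(One last aside from me: ChatGPT's point that weight files are binary and "not directly interpretable as human-readable text" is easy to demonstrate with a toy example. The raw bytes look like gibberish, yet with knowledge of the format they decode cleanly back into numbers.)

```python
import struct

# Three float32 "weights" packed into raw bytes, then decoded again.
blob = struct.pack("3f", 0.1, -2.5, 7.0)
decoded = struct.unpack("3f", blob)
print(blob)     # gibberish bytes, not human-readable
print(decoded)  # the weights, recovered
```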