Glossary

User: Represents an individual or entity that interacts with the Protege Engine. Users can own applications, predictions, training predictions, datasets, models, and inference backends. Each user is identified by a unique ID, a name, and an email address, and can have an associated role (admin, editor, viewer, etc.) and an API key for making requests.

UserRole: Defines the set of roles a user can have within the Protege Engine environment, such as ADMIN, EDITOR, and VIEWER. These roles determine the level of access and permissions a user has.
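The User and UserRole entries above could be modeled roughly as follows. This is a minimal sketch, not the Protege Engine's actual schema: the field names, the `_RANK` ordering (ADMIN above EDITOR above VIEWER), and the `has_at_least` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class UserRole(Enum):
    ADMIN = "admin"
    EDITOR = "editor"
    VIEWER = "viewer"


# Hypothetical privilege ranking; the glossary does not define one.
_RANK = {UserRole.VIEWER: 0, UserRole.EDITOR: 1, UserRole.ADMIN: 2}


@dataclass
class User:
    id: str
    name: str
    email: str
    role: Optional[UserRole] = None   # a user may have no role assigned
    api_key: Optional[str] = None     # used to authenticate requests

    def has_at_least(self, required: UserRole) -> bool:
        """Return True if the user's role ranks at or above `required`."""
        if self.role is None:
            return False
        return _RANK[self.role] >= _RANK[required]
```

For example, under these assumptions a user with the EDITOR role passes `has_at_least(UserRole.VIEWER)` but not `has_at_least(UserRole.ADMIN)`.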

App: An application controlled by a user that interfaces with the Protege Engine. Each app has a unique name and can be assigned an API key for authentication.

BaseLLM (Base Large Language Model): A trainable base model within the Protege Engine ecosystem. Each base model has a name, a version, and optionally a description, and can be associated with a HuggingFace path or stored in an S3 bucket.

InferenceBackend: A deployed backend service used for making predictions with a Large Language Model. It records whether the backend is active, its name, the URLs used to access it, and an optional API key.

Prediction: Represents the outcome of a prediction request made through the system. It includes details like the inference backend used, the request arguments, response data, prompt, completion, and associated user labels for further categorization or feedback.
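To show how an InferenceBackend and a Prediction relate, here is a hedged sketch in Python. The field names, the `record_prediction` helper, and the `"prompt"` and `"completion"` dictionary keys are hypothetical and chosen only for illustration; the real engine may store these differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class InferenceBackend:
    name: str
    urls: List[str]
    is_active: bool = True
    api_key: Optional[str] = None


@dataclass
class Prediction:
    backend_name: str
    request_args: Dict[str, Any]
    response: Dict[str, Any]
    prompt: str
    completion: str
    labels: List[str] = field(default_factory=list)


def record_prediction(
    backend: InferenceBackend,
    request_args: Dict[str, Any],
    response: Dict[str, Any],
) -> Prediction:
    """Build a Prediction record from a request/response pair (hypothetical helper)."""
    if not backend.is_active:
        raise ValueError(f"backend {backend.name!r} is not active")
    return Prediction(
        backend_name=backend.name,
        request_args=request_args,
        response=response,
        # Assumed key names; missing keys fall back to an empty string.
        prompt=str(request_args.get("prompt", "")),
        completion=str(response.get("completion", "")),
    )
```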

TrainingPrediction: A type of prediction uploaded by the user for the purpose of training and evaluating models. It belongs to a dataset and targets a specific model.

Label: A component of a prediction that categorizes or provides additional granularity to the prediction output. Each label has a name, value, confidence level, and can be linked to feedback for adjustments or improvements.

Feedback: Input provided by users regarding the accuracy or relevance of predictions or labels. Feedback can indicate approval (upvote) or disapproval (downvote) and may include corrective content.
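The link between Label and Feedback can be sketched as below. This is one possible reading of the two entries above, not the engine's implementation: the `Vote` enum, the assumption that confidence is a float, and the rule in `corrected_value` (the latest downvote with corrective content overrides the label value) are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Vote(Enum):
    UPVOTE = "upvote"
    DOWNVOTE = "downvote"


@dataclass
class Feedback:
    vote: Vote
    correction: Optional[str] = None  # corrective content, if the user supplied any


@dataclass
class Label:
    name: str
    value: str
    confidence: float  # assumed to be a float; the glossary gives no range
    feedback: List[Feedback] = field(default_factory=list)

    def corrected_value(self) -> str:
        """Return the most recent downvote correction, or the original value."""
        for fb in reversed(self.feedback):
            if fb.vote is Vote.DOWNVOTE and fb.correction is not None:
                return fb.correction
        return self.value
```

Under this rule, upvotes never change the value; only a downvote that carries a correction does.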

Dataset: A collection of predictions grouped together for the purpose of training or evaluating models. Datasets belong to users and can contain both standard predictions and training predictions.
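A Dataset that holds both standard and training predictions might look like the following sketch. Modeling TrainingPrediction as a subclass of Prediction, the `target_model` and `owner_id` fields, and the `training_examples` helper are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Prediction:
    prompt: str
    completion: str


@dataclass
class TrainingPrediction(Prediction):
    target_model: str = ""  # name of the model this example targets (assumed field)


@dataclass
class Dataset:
    name: str
    owner_id: str  # ID of the user who owns the dataset
    predictions: List[Prediction] = field(default_factory=list)

    def training_examples(self, target_model: str) -> List[TrainingPrediction]:
        """Return only the training predictions aimed at `target_model`."""
        return [
            p for p in self.predictions
            if isinstance(p, TrainingPrediction) and p.target_model == target_model
        ]
```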