- running out of public data to train on
- model as a service
- AI engineering - building apps on top of models
- language model - encodes statistical information about one or more languages
- tells how likely a word is to appear in a given context
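- token - basic unit of text a language model operates on (a character, a word, or part of a word)
- vocabulary - set of all tokens a model can work with
- subword tokens help process unknown words by splitting them into known pieces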
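- masked language model - predicts missing tokens anywhere in a sequence, using context from both before and after
- commonly used for non-generative tasks, e.g. classification
- autoregressive language model - predicts the next token using only the preceding tokens
- completions are predictions based on probabilities, so they are not guaranteed to be correct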
- parameter - variable within an ML model that is updated during training
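- RAG - retrieval-augmented generation; retrieves relevant information from an external data source (e.g. a database) to supplement the model's context
- context for a query can be constructed via RAG and via agents
- two steps: retrieve, then generate
- two components: retriever and generator
- given a query, the retriever returns the data chunks most relevant to it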
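- sparse vector - vector in which most values are zero
- dense vector - vector in which most values are non-zero
- term-based retrieval - relevance measured at the lexical level (matching words)
- embedding-based retrieval - relevance measured at the semantic level (matching meaning)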
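
The language-model bullets above ("how likely a word appears given a context", "next token based on previous tokens") can be illustrated with a toy bigram model. This is only a sketch under stated assumptions: the corpus is made up, whitespace-split words stand in for tokens, and the context is a single previous token. Real language models learn these probabilities with neural networks over much longer contexts.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical). Whitespace-split words stand in for tokens;
# real tokenizers usually split text into subword pieces.
corpus = "the cat sat on the mat . the cat ate the fish ."
tokens = corpus.split()

# Count how often each token follows each one-token context.
follow_counts = defaultdict(Counter)
for prev, nxt in zip(tokens, tokens[1:]):
    follow_counts[prev][nxt] += 1


def next_token_probs(context_token):
    """Estimate P(next token | previous token) from the counts."""
    counts = follow_counts[context_token]
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()}


def complete(prompt_token, max_new_tokens=3):
    """Greedy autoregressive completion: keep appending the likeliest next token."""
    out = [prompt_token]
    for _ in range(max_new_tokens):
        probs = next_token_probs(out[-1])
        if not probs:  # context never seen in the corpus
            break
        out.append(max(probs, key=probs.get))
    return out


print(next_token_probs("the"))
print(complete("on", max_new_tokens=2))
```

The completion is only the most probable continuation according to the counts, which is why the notes say completions are predictions rather than guaranteed facts.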
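
The RAG bullets above (retrieve then generate, sparse vectors, term-based relevance) can be sketched with a tiny term-based retriever. Everything here is an illustrative assumption: the chunks are invented, the helper names (`to_sparse_vector`, `retrieve`, `build_prompt`) are hypothetical, and raw term counts with cosine similarity stand in for production scoring schemes such as TF-IDF or BM25. The generator itself is not shown; the sketch stops at the prompt that would be passed to it.

```python
import math
from collections import Counter

# Hypothetical data chunks standing in for an external data source.
chunks = [
    "a token is a unit of text",
    "a retriever finds relevant chunks for a query",
    "a generator writes the answer from retrieved chunks",
]

# Vocabulary: every distinct whitespace token seen in the chunks.
vocab = sorted({word for chunk in chunks for word in chunk.split()})


def to_sparse_vector(text):
    """Term-count vector over the vocabulary; most entries are zero."""
    counts = Counter(text.split())
    return [counts[word] for word in vocab]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve(query, k=1):
    """Retriever: return the k chunks whose term vectors best match the query."""
    q = to_sparse_vector(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, to_sparse_vector(c)), reverse=True)
    return ranked[:k]


def build_prompt(query):
    """Retrieve first, then assemble the context a generator would receive."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"


print(build_prompt("what does a retriever do"))
```

Because matching is purely lexical, a query that says "fetcher" instead of "retriever" would not match the second chunk on that word at all. Embedding-based retrieval addresses this by comparing dense vectors that capture meaning, which requires an embedding model and is not shown here.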