> Could a large language model trained on data fit under that term? I don't think so, but the terminology is vague enough that once again I'm not ready to stake my reputation on it.
My intuition is that this wording does suggest that private repositories are used for training:
Private repo --[text data] --> embeddings generator server --[embeddings]--> model training server --> model
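The flow above could be sketched roughly like this. Note this is purely illustrative of the hypothesized pipeline, not anything GitHub has confirmed; `embed` here is a toy hash-based stand-in, not a real embeddings API:

```python
# Toy sketch of the hypothesized flow: private-repo text -> embeddings -> training data.
# embed() is a stand-in vectorizer for illustration only, not any real service API.
import hashlib

def embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in embedding: hash the text into a fixed-length float vector."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

# Text data leaves the private repo for the embeddings generator server...
repo_files = {"main.py": "print('hello')", "NOTES.md": "internal design notes"}

# ...and the resulting embeddings are what the model-training server receives.
training_examples = [(path, embed(src)) for path, src in repo_files.items()]
```

The point of the sketch is that the repo contents themselves never reach the training server, only derived vectors, which is exactly why the terminology gets murky.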
I personally do not have a problem with this, and would opt into it if I could (I wouldn't say no to some free GitHub service credits), but my immediate gut leans toward "it's definitely used for training," and I was surprised you think otherwise.