> Could a large language model trained on data fit under that term? I don't think so, but the terminology is vague enough that once again I'm not ready to stake my reputation on it.
My intuition is that this wording does suggest that private repositories are used for training:
Private repo --[text data] --> embeddings generator server --[embeddings]--> model training server --> model
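The flow above could be sketched roughly like this. Note this is purely illustrative of the hypothesized pipeline, not anything GitHub has confirmed; `embed` here is a toy hash-based stand-in, not a real embeddings API:

```python
# Toy sketch of the hypothesized flow: private-repo text -> embeddings -> training data.
# embed() is a stand-in vectorizer for illustration only, not any real service API.
import hashlib

def embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in embedding: hash the text into a fixed-length float vector."""
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

# Text data leaves the private repo for the embeddings generator server...
repo_files = {"main.py": "print('hello')", "NOTES.md": "internal design notes"}

# ...and the resulting embeddings are what the model-training server receives.
training_examples = [(path, embed(src)) for path, src in repo_files.items()]
```

The point of the sketch is that the repo contents themselves never reach the training server, only derived vectors, which is exactly why the terminology gets murky.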
I personally do not have a problem with this, and would opt into it if I could (I wouldn't say no to some free GitHub service credits), but my immediate gut leans toward "it's definitely used for training," and I was surprised you think otherwise.