@crazycells Great article, and a good testing ground for future cases too, I think. I do firmly believe you are right. There are a number of sources where a synopsis of the book itself could be available as a “teaser” (for example, Amazon has a feature where it’s possible to “read” the first couple of chapters of a book, and it’s not behind a paywall either, so it could be scraped) - of course, not the entire work, as that would be pointless.
Another possibility is the works leaking online in digital format. Whilst novels don’t necessarily have the allure of bootleg DVDs or warez / illegal downloads, it’s still plausible in my view.
However, I see this more as a case for plagiarism than anything else. If it’s there on the internet, it can be discovered, and that would be the basis of my argument for sure.
Having read the article, I think this passage says it all:
“ChatGPT allows users to ask questions and type commands into a chatbot and responds with text that resembles human language patterns. The model underlying ChatGPT is trained with data that is publicly available on the internet.”
Based on this, is there really a case? Surely ChatGPT has simply ingested what it found during its crawl-and-learn process?
This could easily become the “Napster” of 2023.
https://www.theguardian.com/technology/2000/jul/27/copyright.news