Circumventing computer security to copy items en masse to distribute wholesale without transformation is a far cry from reading data on public-facing web pages.
He didn't circumvent computer security. He had a right to use the MIT network and pull the JSTOR information. He certainly did it in a shady way (computer in a closet) but it's every bit as arguable that he did it that way because he didn't want someone stealing or unplugging his laptop while it was downloading the data.
He also did not distribute the information wholesale. What he planned on doing with the information was never proven.
OpenAI IS distributing information they got wholesale from the internet without license to that information. Heck, they are selling the information they distribute.
That right ended when he used it to break the law. It was also for use on MIT computers, not for remote access (which is why he decided to install the laptop, also knowing this was against his "right to use").
The "right to use" also included a warning that misuse could result in state and federal prosecutions. It was not some free for all.
> and pull the JSTOR information
No, he did not have the right to pull en masse. The JSTOR access explicitly disallowed that. So he most certainly did not have the "right" to do that, even if he were sitting at MIT in an office not breaking into systems.
> did it in a shady way
The word you're looking for is "illegal." Breaking and entering is not simply shady - it's illegal and against the law. B&E with intent to commit a felony (which is what he was doing) is an even more serious crime, and one of the charges.
> he did it that way because he didn't want someone stealing or unplugging his laptop
Ah, the old "ends justify breaking the law" argument.
Now, to be precise, MIT and JSTOR went to great lengths to stop the outflow of copying, which both saw. Swartz returned multiple times to devise workarounds, continuing to break laws and circumvent yet more security measures. This was not some simple plug-and-forget laptop. He continually and persistently engaged in hacking to get around the protections both MIT and JSTOR were putting in place to stop him. He added a second computer, he used MAC spoofing, among other things. His actions started to affect all users of JSTOR at MIT. The rate of outflow caused JSTOR to suffer performance, so JSTOR disabled all of MIT access.
Go read the indictment and evidence.
> OpenAI IS distributing information they got wholesale
No, that's ludicrous. How many complete JSTOR papers can I pull from ChatGPT? Zero? How many complete novels? None? Short stories? Also none? Can I ask for any of a category of items and get any of them? Nope. I cannot.
It's extremely hard to even get a complete decent sized paragraph from any work, and almost certainly not one you pre-select at will (most of those anyone produces are found by running massive search runs, then post selecting any matches).
Go ahead and demonstrate some wholesale distribution - pick an author and reproduce a few works, for example. I'll wait.
How many could I get from what Swartz downloaded? Millions? Not even just as text - I could have gotten the complete author-formatted layout, diagrams, everything, in perfect photo-ready copy.
You're being dishonest in claiming these are the same. One can feel sad about Swartz's outcome, realize he was breaking the law, and realize that the current OpenAI copyright situation is likely unlike any previous copyright situation, all at the same time. No need to equate such different things.
OK, you've written a lot, but it comes down to this: what law did he break?
Neither MIT nor JSTOR raised issue with what Swartz did. JSTOR even went out of their way to tell the FBI they did not want him prosecuted.
Remember, again, with what he was charged. Wiretapping and intent to distribute. He wasn't charged with trespassing, breaking and entering, or anything else. Wiretapping and intent to distribute.
> His actions started to affect all users of JSTOR at MIT. The rate of outflow caused JSTOR to suffer performance, so JSTOR disabled all of MIT access.
And this is where you are confusing a "crime" with "misuse of a system". MIT and JSTOR were in their rights to cut access. That does not mean that what Swartz did was illegal. Similar to how if a business owner tells you "you need to leave now" you aren't committing a crime because they asked you to leave. That doesn't happen until you are trespassed.
> Go ahead and demonstrate some wholesale distribution - pick an author and reproduce a few works, for example. I'll wait.
You violate copyright by transforming. And fortunately, it's really simple to show that ChatGPT will violate and simply emit byte for byte chunks of copyrighted material.
You can, for example, ask it to implement Java's ArrayList and get several verbatim parts of the JDK's source code echoed back at you.
> How many could I get from what Swartz downloaded?
You can read the indictment, which I already suggested you do.
> Remember, again, with what he was charged. Wiretapping and intent to distribute. He wasn't charged with trespassing, breaking and entering, or anything else. Wiretapping and intent to distribute.
He wasn't charged with wiretapping (not even sure that's a generic crime). He was charged with (two counts of) wire fraud (18 USC 1343), a huge difference. He also had 5 different charges of computer fraud (18 USC 1030(a)(4), (b) & 2), 5 counts of unlawfully obtaining information from a protected computer (18 USC 1030 (a)(2), (b), (c)(2)(B)(iii) & 2), and 1 count of recklessly damaging a protected computer (18 USC...).
He was not charged with "intent to distribute", and there's no such thing as a "wiretapping" charge. Did you ever once read the actual indictment, or did you just make all this up from internet forum posts?
If you're going to start with the phrase "Remember, again.." you should try not to make up nonsense. Actually read what you're asking others to "remember", which you apparently never knew in the first place.
> you are confusing a "crime" with "misuse of a system"
Apparently you are (willfully?) ignorant of law.
> You violate copyright by transforming.
That's false too. Transformative use is one defense against a claim of copyright infringement. Carefully read up on the topic.
> ask it to implement Java's ArrayList and get several verbatim parts of the JDK's source code echoed back at you
Provide the prompt. Courts have ruled that code that is the naïve way to create a simple solution is not copyrighted on its own, so if you have only a few disconnected snippets, that violates nothing. Can you make it reproduce an entire source file, comments, legalese at the top? I doubt it. To violate copyright one needs a certain amount (determined by trials) of the content.
You might also want to make sure you're not simply reading OpenJDK.
> 0, because he didn't distribute.
Please read. "How many could I get from what Swartz downloaded?" does not mean he published it all before he was stopped. It means what he took.
That you seem unable to tell the difference between someone copying millions of PDFs to distribute as-is, and the effort one must go to to possibly get a desired copyrighted snippet, shows either dishonesty or ignorance of relevant laws.