I am making an unofficial Reddit API, which mimics the official one.
It's early days, but I would like to have a discussion about it here, since my post was blocked on Reddit (of course).
Let me know what you think of the project. Any input is welcome.
Is there a reason you’re scraping data rather than attaching a network sniffer/reverse engineering the official apps and documenting the results? Or mapping the RSS feed to an API? The main thrust behind my comment is that I think scraping is pretty fragile, so I’m interested as to why other options are infeasible.
There’s currently no implementation (the repos are currently just skeletons), so it could just be a semantics difference right now.
Wouldn’t those other options be C&D’d?
*I am a layman
This is likely to be C&D’d as well if it ever reaches the point where it does anything useful (remember, reddit doesn’t need grounds that would hold up in court to send a C&D).
Don’t worry, it won’t be a problem. I have taken reasonable measures to ensure my anonymity. Also, you can’t really kill free/libre software easily anyway.
I suspect that any of the methods proposed here would be prone to a C&D, but IMO the safest legally would probably be the RSS method (not a lawyer though). Reddit’s RSS feeds are public, documented, and available without the need for private APIs, authentication, or an API key, so I don’t see how they could claim that a wrapper is unauthorised/illegal. Documenting their private API, however, seems like a gray area. In Google LLC v. Oracle America, Inc., the Supreme Court left open whether APIs are copyrightable but held that Google’s copying was fair use, so this use may constitute fair use as well.
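To make the RSS idea concrete, here is a minimal sketch of mapping a feed to API-style records. It assumes the feed is Atom-formatted XML with `entry`, `title`, `author/name`, `id`, and `link` elements; the function name `feed_to_posts`, the output field names, and the sample data are made up for illustration, and fetching the feed over HTTP is left out.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Atom namespace; assumes the feed is Atom rather than RSS 2.0.
ATOM = "{http://www.w3.org/2005/Atom}"


def feed_to_posts(atom_xml: str) -> List[Dict[str, str]]:
    """Turn an Atom feed document into a list of plain post records."""
    root = ET.fromstring(atom_xml)
    posts = []
    for entry in root.iter(f"{ATOM}entry"):
        link = entry.find(f"{ATOM}link")
        posts.append({
            "id": entry.findtext(f"{ATOM}id", default=""),
            "title": entry.findtext(f"{ATOM}title", default=""),
            "author": entry.findtext(f"{ATOM}author/{ATOM}name", default=""),
            "url": link.get("href", "") if link is not None else "",
        })
    return posts


# Hypothetical sample feed, not real data.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>t3_abc123</id>
    <title>Hello world</title>
    <author><name>/u/example</name></author>
    <link href="https://example.invalid/r/test/comments/abc123/"/>
  </entry>
</feed>"""

if __name__ == "__main__":
    for post in feed_to_posts(SAMPLE):
        print(post)
```

A feed only exposes listings, so this covers reading posts and not the wider functionality (voting, posting, messaging) that a full API replacement would need.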
Because we need to retain the breadth of functionality the API has. If you just want to scrape posts, APIs for that already exist, but I am aiming for something more.
As for reverse engineering, they can change that part at any time too, and it may be even more fragile, since they can change the private API without breaking the UX. If they change the front-page CSS selectors or layout, on the other hand, it affects the UX more, because that changes the expected output rather than the middle layer, which is just raw data.
That's my reasoning. I appreciate the input though (:
Making a breaking change to the mobile API also breaks outdated installations of the app. Websites and their APIs are usually updated in sync; apps are not.
If they were really motivated to stop your method, they could just obfuscate the frontend with webpack and break your scraper every time they make an update.
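A small sketch of why that kind of breakage is easy to trigger: the scraper below finds post titles by a CSS class name, so renaming the class (as a build step that mangles class names might do) silently returns nothing. The class names `post-title` and `_3xK9` and the HTML snippets are invented for illustration and are not taken from Reddit's actual markup.

```python
from html.parser import HTMLParser


class TitleScraper(HTMLParser):
    """Collects the text of any element carrying a given CSS class."""

    def __init__(self, title_class):
        super().__init__()
        self.title_class = title_class
        self.titles = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.title_class in classes:
            self._in_title = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current title.
        self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.titles.append(data.strip())


def scrape_titles(html, title_class="post-title"):
    parser = TitleScraper(title_class)
    parser.feed(html)
    parser.close()
    return parser.titles


# Same content, before and after a hypothetical class-name change.
BEFORE = '<div><a class="post-title" href="/r/x/1">First post</a></div>'
AFTER = '<div><a class="_3xK9" href="/r/x/1">First post</a></div>'

if __name__ == "__main__":
    print(scrape_titles(BEFORE))  # the title is found
    print(scrape_titles(AFTER))   # nothing matches the old class name
```

The page looks identical to a user in both cases, which is the point: the site can break the scraper without changing anything visible.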