I’m a big proponent of self-hosting, right to repair, and rolling your own whatever when you can. That probably started as teenage rebellion that got baked in - I was lucky enough to read both Walden and The Hobbit during a week-long cyclone lockdown several decades ago - but I suspect there’s a non-trivial overlap between that space and privacy-minded people in general.

My endgame is a self-sufficient intranet for myself and family: if the net goes down tomorrow, we’d barely notice.

I also use LLMs as a tool. True self-hosted equivalence to state-of-the-art models is still an expensive proposition, so like many, I use cloud-based tools like Claude or Codex for domain-specific heavy lifting - mostly coding. Not apologising for it; I think it’s a reasonable trade-off while local hardware catches up.

That context is just to establish where I’m coming from when I say this caught my attention today:

https://support.claude.com/en/articles/14328960-identity-verification-on-claude

To be accurate about what it actually says: this isn’t a blanket “show us your passport to use Claude.” Not yet.

The policy as written is narrower than it might first appear.

My concern isn’t what it says - it’s that the precedent now exists. OpenAI will no doubt follow suit.

Scope creep is a documented pattern with this kind of thing, and “we only use it for X” describes current intent, not a structural constraint.

Given the nature of this community, figured it was worth flagging.

  • SuspciousCarrot78@lemmy.world (OP) · 22 hours ago

The intranet becomes the internet :) Everything is local, accessible from multiple devices within my WLAN. The main box plugs into the router and serves everything over Wi-Fi to trusted devices - my documents, media, books, games, etc.

    I wrote (flippantly) about the bones of the system here, 3 or 4 months ago. It’s more complex now, but the endgame has always been “what if cloud, but you are your own cloud?”

    https://lemmy.world/post/41315607/21438607

It may not be fresh (if the net goes down), but it would be local. The only real question I have to grok for myself is whether I want to mirror a curated section of Wikipedia, books, etc.

    https://en.wikipedia.org/wiki/Kiwix

Probably I should. May as well go full data-hoarder. Good excuse to get a few more TB of storage. What I’ve done so far is all within 4TB, using clever tricks and black magic, but there’s a limit to what 4TB can hold. Fortunately, hard drives are still not too $$$. Another 4TB is about $200 here locally, so the entire setup is still around $600-700 AUD (around $350-400 USD).

All the other stuff I have more or less teed up (barring the UPS + solar kit I am building later in the year).

    https://www.youtube.com/watch?v=1q4dUt1yK0g

Anyhow, 8TB local should just about cover what I have in mind (he said, fully aware of dragon-hoarder sickness). Then I’ll grab something for offsite storage of critical docs - I have an old Raspberry Pi with a 256GB NVMe SSD I can use for that.

I’m semi-tempted (because fuck it, why not have fun) to look into LoRa after that.

    https://en.wikipedia.org/wiki/LoRa

What I really NEED to do is finish the LLM stack (I’m on it and nearly done) and then do a curated YouTube replacement with yt-dlp feeding into Nova Player or Jellyfin, once/if SmartTube etc. gets shit-canned. The YouTube thing I’m kinda excited about because I’ve figured out how to squeeze ~1,500 videos into around 250ish GB of storage, with TTL (time-to-live) mechanics, a download-replacement schedule, etc. The kids watch too much random shit on YT, so daddy will make YT at home (ha!).

    I have some other wild ideas too…it’s a whole other thing…don’t get me started :)

    Once I’m finished, I will open-source the entire thing, post about it here, and let others replicate / improve on it. And so it goes. Once you begin walking down the dark path, you are forever doomed. Be careful :)

    • ropatrick@lemmy.world · 1 day ago

OK, I have you. You don’t need the internet because you have the internet in your terabyte farm. Pretty cool.

      Thanks for the detailed reply.

One final question - I’m sure it’s dark at the bottom of the deep rabbit hole you’re in, so what do you do for batteries for your head torch?! 😀

      • SuspciousCarrot78@lemmy.world (OP) · 1 day ago

Exactly so. Kids: Mom, can we get the internet? Mom: we have the internet at home.

        Batteries? I don’t need batteries. I have the never-ending warm glow of weaponized autism. And that’s not even a joke.

        I tend to hyper-fixate on something until either it breaks or I do. It’s usually 70/30 in my favour :)

        • webghost0101@sopuli.xyz · 1 day ago

That’s weird, I don’t remember having an alt account called SuspiciousCarrot78, but surely you must be me - same project, same neurology… same fixation pattern.

            • webghost0101@sopuli.xyz · 1 day ago

              Yep, definitely talking to myself again

Jokes aside, I haven’t seen you mention anything for media streaming.

I highly recommend Navidrome for music. It is my absolute favorite and most-used self-hosted service.

For acquiring media like film and music, depending on where you live, ripping discs from your local library is in some places arguably protected fair use (a carve-out from the era when MP3 players became common and runners would take rented CDs outside in their Walkmans). In my experience, 480p DVD is much higher quality than internet 480p streams, and the total size is much smaller than what you find in downloads.

ARM can help you automatically rip these as long as you have a drive in your PC. I got ARM running in a Proxmox LXC with drive passthrough, but that honestly was a pain to set up, so I’m not sure you should go that exact route. Either way, the moment ARM is functional it’s smooth sailing and your only concern becomes storage space.

              • SuspciousCarrot78@lemmy.world (OP) · 22 hours ago

Well, this is going to freak you out, because I am (literally, right now) explicitly scoping out offline YouTube integration into Jellyfin, as a sort of rolling library. Jellyfin has been good to me, but I’ve been using Nova Player for a while now, since my Pi borked itself (Nova Player is: plug hard drive into router, install app on TVs, done). The limit is that yt-dlp doesn’t integrate very well with it. I mean, I could build something, or fork the repo myself… or I could just use what already exists.

                So it might be time to restore the entire *arr stack.

                The TL;DR: I want one front end for ALL my media - YouTube, instructionals, movies, TV shows. That immediately speaks to Jellyfin, which I’m very familiar with. The issue is YouTube. There’s too much slop on there, I want a curated experience for the kids, SmartTube won’t work forever, and the eldest is starting to go black-hat and screw around with settings. That’s accelerating the timeline.

                The stack I’m scoping:

                • Jellyfin - front end for everything
                • Tube Archivist - YouTube archive, metadata, download manager
                • Tube Archivist Jellyfin plugin - maps channels as Shows, videos as Episodes, bidirectional playback sync
                • The usual *arr stack (SABnzbd, Sonarr, Radarr, etc.) - for maximum yarr, me hearties. I’ve been downloading from 1337 like a pleb.
                • HandBrake (+ the usual media-ripping stuff from DVD as needed)

                The YT stack: rolling library logic:

                • Core “keepers” - permanent, protected, not touched by auto-delete
                • TA rescans subscribed channels twice a week
                • Auto-delete watched videos after N days, per channel, and mark them ignored so they don’t re-download
                • Whole thing surfaces in Jellyfin as a YouTube-style shelf
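                That auto-delete rule is simple enough to sketch. A minimal version of the sweep logic in Python (the field names, 14-day default, and keeper/ignored sets are all hypothetical - Tube Archivist ships its own auto-delete settings, this is just the bare idea):

```python
import time
from pathlib import Path

TTL_DAYS = 14            # per-channel N; hypothetical default
SECONDS_PER_DAY = 86_400

def expired(video: dict, now: float, keepers: set, ttl_days: int = TTL_DAYS) -> bool:
    """A watched video expires after ttl_days, unless it is a core keeper."""
    if video["id"] in keepers:
        return False     # permanent, protected, never auto-deleted
    if not video.get("watched"):
        return False     # only rotate out what has already been seen
    return now - video["watched_at"] > ttl_days * SECONDS_PER_DAY

def sweep(library: list, keepers: set, ignored: set) -> list:
    """Delete expired videos and mark their IDs ignored so they never re-download."""
    now = time.time()
    kept = []
    for video in library:
        if expired(video, now, keepers):
            ignored.add(video["id"])                     # never fetch this one again
            Path(video["path"]).unlink(missing_ok=True)  # remove the file on disk
        else:
            kept.append(video)
    return kept
```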

                Scoping the maths at 200GB, 30-min average per vid, using compressed modern codecs:

                Planning numbers per video: assume the average video is 30 mins. At 360p, that’s ~100MB per video; 480p ~160MB; 540p ~220MB; 720p ~320MB.

                If I have a selection of “core keepers” at 720p H.265 (~300 videos), taking up ~80GB, that leaves ~120GB for the rolling pool:

                Rotating quality    Rotating count    Total library
                360p                ~1,200            ~1,500 (garbage; OK for kids’ cartoons)
                480p                ~750              ~1,050 (surprisingly OK)
                540p                ~545              ~845 (good to my eyes)
                720p                ~375              ~675 (very nice)
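                If you want to sanity-check those counts, the arithmetic is one loop (the per-video sizes are the planning assumptions above, not measured encodes):

```python
# Planning assumptions from the post: ~30-min videos, modern compressed codecs.
SIZE_MB = {"360p": 100, "480p": 160, "540p": 220, "720p": 320}
ROLLING_GB = 120   # 200 GB budget minus ~80 GB of 720p core keepers
CORE = 300         # permanent keepers, counted into the totals

for quality, mb in SIZE_MB.items():
    rotating = ROLLING_GB * 1000 // mb            # videos that fit in the rolling pool
    print(f"{quality}: ~{rotating} rotating, ~{rotating + CORE} total")
```

Running it reproduces the table’s rotating counts and totals.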

                I don’t need 4K… hell, 1080p is wasted on me. So I’m thinking 300 core vids at 720p + a rolling library at 540p = 845 videos, give or take. More than enough to keep the fam off my back once SmartTube goes tits-up (they can’t play whack-a-mole forever).

                I would prefer a clean migration to other, live sources (I have those scoped out as well), but not all the Minecraft / gaming / pretend-play / blah blah stuff the family watches is on PeerTube/Odysee/Curiosity Stream.

                PS: I see your 480p and raise you 60, because 540p is the forbidden resolution :)

                PPS: I was planning on using JF for music too…but maybe I should look at Navidrome like you said.

                The crazy idea I had was to use AI to create an infinite playlist of sorts: seed it with your own music, get it to generate tracks in THAT style as filler, and intermingle them (so there’s always something new).

                Finish it off with an AI DJ that pulls in “local news” from your curated RSS feeds.

                Think: Three Dog from Fallout 3.

                Basically what I spoke about here -

                https://lemmy.world/post/43936980/22784324

                I have a pretty clear idea of how to get that done. It could be amusing.
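                The intermingling part, at least, is trivial to sketch. A toy interleaver (the seed-to-filler ratio is an arbitrary assumption; generating the filler tracks is the actual hard part):

```python
import random

def infinite_playlist(seed_tracks, generated_tracks, ratio=3, rng=None):
    """Yield an endless stream: roughly `ratio` tracks from your own
    library for every AI-generated filler track, so there is always
    something familiar and always something new."""
    rng = rng or random.Random()
    while True:
        # a small random batch of your own music...
        yield from rng.sample(seed_tracks, k=min(ratio, len(seed_tracks)))
        # ...then one generated track in that style
        yield rng.choice(generated_tracks)
```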

                https://huggingface.co/ACE-Step/acestep-5Hz-lm-0.6B

                • webghost0101@sopuli.xyz · 22 hours ago

                  Currently my setup for YouTube in Jellyfin is pretty basic (but it works well).

                  It’s one complex yt-dlp script that runs every hour and reads a list of channels from a txt file. It uses RSS to check the last 2 vids per channel; if they’re not downloaded already (there’s a log with vid IDs per channel), it starts downloading on a high-quality preset, filtering out Shorts and “live” streams. If the result gets above a certain size, it aborts and tries again with a lower-quality preset.
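                  That abort-and-retry ladder is easy to factor out. A sketch of the selection logic only (the format selectors and size cap are illustrative, and `try_download` stands in for the actual yt-dlp subprocess call):

```python
from typing import Callable, Optional, Tuple

# Quality ladder, best first. The selectors are illustrative yt-dlp format syntax.
PRESETS = [
    "bestvideo[height<=1080]+bestaudio",
    "bestvideo[height<=720]+bestaudio",
    "bestvideo[height<=480]+bestaudio",
]
MAX_BYTES = 800 * 1024 * 1024  # hypothetical abort threshold

def download_with_fallback(
    try_download: Callable[[str], int],
    presets: list = PRESETS,
) -> Tuple[Optional[str], int]:
    """Walk the ladder best-first. try_download(preset) performs the
    download and returns the resulting file size in bytes; oversized
    results are discarded and the next, smaller preset is tried."""
    for preset in presets:
        size = try_download(preset)
        if size <= MAX_BYTES:
            return preset, size   # first preset under the cap wins
    return None, 0                # nothing fit; skip this video
```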

                  I don’t have a system to auto-delete yet because the script is only for my favorites. For everything else, or channels that upload too regularly, I host a local Invidious.

                  But Google really hates Invidious and it breaks frequently - it even got my original residential IP banned. I may take a look at Tube Archivist because I would prefer download-and-delete as a more stable flow.

                  Something else I am experimenting with is using the same data in multiple systems. My music collection is a Proxmox dataset, but as long as the read rights are good, I see no reason why Jellyfin can’t also read those files.

                  The output folder of ARM used to be its own library in Jellyfin on my first server.