Learning professional skills at home with unprofessional gains
After years of toying with ideas for how I wanted to do homelab things at home, I finally had a scope of what I wanted to do and how I wanted to implement it. Professionally, I am 90% a Microsoft / Windows guy. I dabble in .NET and administer a mostly Windows environment. I speak all the terms pretty well, but there’s this other 10% that I know dangerously little about, and I feel like it’s something I need to stop making fun of and just get into.
The Scope
I’ve maintained a media collection for quite some time, because it’s legal to own a digital copy of physical media (that’s my story and I’m sticking to it), and being able to stream it locally eliminates issues with connection speed. I want to make the platform around that media work better and be smarter. Ideally the frontend service needs to be universal: it must work everywhere, on the TV in the living room, on mobile devices, inside a web browser, etc. It also must be quick. If it’s a hassle to get to the media, I’ll be less likely to use it, and if I’m less likely to use it, my home users will be even less likely to want to put time into it. Lastly, the interface points need to be easy; I’m here for the complicated setup so that everything downstream of it stays simple.
That seems simple enough, and I have a couple of options. I’ve looked into Plex, and, well, it put a sour taste in my mouth from the start. Then I saw some things around Emby but hadn’t really given it a fair shake, so last on the list was Jellyfin. The interface is pretty quick, and it pulls in quite a bit of metadata without much effort to get it in place, so the whole platform has a polish about it that makes it really attractive.
Now, I’m not going to settle for just serving the media. I, of course, need a better way to bring media into the library (let’s face it, I’m going to want more media…). So onward to find ways to automate fetching media. The tools for this are all well known, and since I don’t really have to interface with them directly, it’ll be simple enough to set them and forget them. On the opposite end of the stack is JellySeerr, a search engine of sorts that hands terms to those well-known tools to make the magic happen and integrates with Jellyfin.
Okay, application stack out of the way: how do I serve it? Jumping on the whale late here, I’ll use Docker, of course. Since my professional life has me constantly looking at processes and identifying their failure points, I saw an immediate need for more than one host, because the best part about redundancy is redundancy.
Seems like I have a rough shopping list; time to get into it.
The Hardware
Now, I’ve operated some form of storage on my home network for quite some time, and I’ve got redundancy built into the arrays, so a disk failure isn’t going to stop me. But I only have one NAS… that may change in the future, but for now it’s just going to be a weak link in the chain. Media takes up a lot of storage, and replicating it across multiple NASes would be prohibitively expensive at this juncture.
The Docker hosts: I had some criteria. Since these would live in the rack behind my computer chair, they had to be quiet and preferably small. I was holding out to capitalize on surplus hardware when my office did a hardware refresh, but that didn’t happen. I had seen something around using Micro Form Factor desktop PCs for these tasks, and I knew this was going to be my solution. Off to eBay to find a few identical machines for my “server”. I found a really good deal on some Dell OptiPlex 3040 MFFs; they had a decent but not fantastic CPU, half the RAM the board could handle, and, as expected, no hard drive. I got four. The RAM and SSDs came from Amazon: 4x 1TB SSDs, because they’re dumb cheap this time of year, and enough RAM to max out all the boxes. The irritation came when I had to order the external power supplies, not something I was used to in the desktop space, but it makes sense when you realize that MFF PCs are really just laptops with better cooling and a bit more room for I/O. The boxes came with WiFi capability, which got my gears turning, but ultimately I opted to just remove it. SSDs in, RAM maxed out, and I was ready for an OS.
The Software Part 1
Something I hadn’t really considered up until this point was what OS I wanted for the Docker hosts. I had previously used Debian, and I had some recommendations for RHEL, but I opted to go with Ubuntu Server for the most universal Linux experience. I may catch some flak for it, but at the end of the day I have a working product. I got the OS installed on all the machines and made sure that I did everything identically on each one.** Once each machine was ready for production, I was off to take on my next consideration: which flavor of containerization/orchestration did I want to use? The two products I considered were Kubernetes and Docker Swarm. After not much research, I figured that Kubernetes was going to be too much for what I wanted to do, so Docker Swarm it is! Setup was dead simple; I had the cluster up and running in no time. Adding in Portainer for management gave me a pretty solid foundation to work with.
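For the curious, standing up the swarm really is just a couple of commands. Roughly something like this, with the IP address standing in for whatever the first node actually uses:

```bash
# On the first box: initialize the swarm and advertise its LAN address
sudo docker swarm init --advertise-addr 192.168.1.101

# docker swarm init prints a join token; run the join command on each remaining box
sudo docker swarm join --token <token-from-init> 192.168.1.101:2377

# Back on the first node: confirm all four hosts show up in the cluster
sudo docker node ls
```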
Without getting fully into the gritty details, I have the following containers to stand up: Jellyfin, JellySeerr, Radarr, Sonarr, Transmission, and Jackett (Prowlarr is probably going to replace this). I want them to be able to run on any host at any time, so I had to come up with a way to make the storage the same on each host. After learning how mounts work in Linux, I opened up an NFS share on my Synology and mounted it on each host. Now, regardless of where a container is running, it can see its configuration data and do its job.
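The mount itself is nothing fancy. An /etc/fstab entry along these lines on every host does the trick; the NAS address and export path here are placeholders, not my real ones:

```bash
# Add the Synology NFS export to /etc/fstab so it mounts in the same place on every host
echo '192.168.1.50:/volume1/docker  /mnt/appdata  nfs  defaults,_netdev  0 0' | sudo tee -a /etc/fstab

# Mount everything in fstab right now instead of waiting for a reboot
sudo mount -a
```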
Some quick configuration to get all the services talking to each other and it was time to test. I picked a movie I hadn’t seen and requested it through JellySeerr, which in turn handed the search over to Radarr, which grabbed some information from Jackett and sent the request to Transmission. Great, right? Once the download completed, Radarr moved the file into my media library and we were ready to rock.
Trouble in paradise
This story isn’t without a little peril, though. I found that Radarr and Sonarr weren’t fully doing what they were supposed to, and after a couple of restarts I found that they were really struggling with their internal SQLite database (which really, really, really dislikes being shared over NFS), so it was time to figure out how to have persistent storage available across the cluster. The first suggestion was Ceph, but that really requires having a spare disk to dedicate to it, and these MFF PCs don’t have one. Next up was GlusterFS, which can slot in on top of an existing file system, you know, like that 1TB SSD I have in each of the hosts? Another really simple setup that only required a little bit of work to get functional. The one thing that didn’t occur to me at first was adding options to wait for Gluster to start up before mounting the storage; once I added those in, I had persistent, replicated storage available to all four hosts.
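For anyone following along at home, the Gluster side looked roughly like this; the hostnames, volume name, and brick path are placeholders, and the fstab options at the end are the "wait for Gluster" piece I mentioned:

```bash
# From one node, peer with the other three (hostnames are placeholders)
sudo gluster peer probe node2
sudo gluster peer probe node3
sudo gluster peer probe node4

# Create a 4-way replicated volume from a brick directory on each host's SSD,
# then start it ('force' is needed when the bricks sit on the root filesystem)
sudo gluster volume create swarm-vol replica 4 \
  node1:/gluster/brick node2:/gluster/brick \
  node3:/gluster/brick node4:/gluster/brick force
sudo gluster volume start swarm-vol

# /etc/fstab on each host: don't try to mount the volume until glusterd is up
echo 'localhost:/swarm-vol  /mnt/gluster  glusterfs  defaults,_netdev,x-systemd.requires=glusterd.service  0 0' | sudo tee -a /etc/fstab
```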
The Software Part 2: The Search for Curly’s Certificate
Now that I had all the services working, it was time to condense them into a stack for easier deployment and re-creation should my entire cluster fail. Exposing Jellyfin to the outside world, I realized a couple of issues. First: port mapping from my router to the cluster doesn’t really allow me to do proper load balancing or failover between hosts. Second: I should figure out a reverse proxy to hide all the services behind, so everything is secured, encrypted, and happy.
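The stack file itself is just a compose file with deploy sections. A trimmed-down sketch of the Jellyfin piece looks something like this; the file name, stack name, and host paths are my own choices (the Gluster mount for config, the NFS share for media), not anything official:

```bash
# Write a minimal stack file (trimmed to just Jellyfin for illustration)
cat > media-stack.yml <<'EOF'
version: "3.8"
services:
  jellyfin:
    image: jellyfin/jellyfin:latest
    ports:
      - "8096:8096"                       # Jellyfin's default HTTP port
    volumes:
      - /mnt/gluster/jellyfin:/config     # config lives on the replicated Gluster mount
      - /mnt/appdata/media:/media         # media stays on the Synology share
    deploy:
      replicas: 1                         # one copy, free to land on any node
EOF

# Deploy (or update) the whole stack, then check which node everything landed on
sudo docker stack deploy -c media-stack.yml media
sudo docker stack ps media
```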
Not knowing how to approach the first one, I started to try to fix the second. I had trouble getting Traefik configured to properly handle inbound requests and forward them to the appropriate service, so for the time being I opted to use the nginx-based reverse proxy GUI built into the Synology.
After doing some more research on the subject, I found that there is a Linux application, Keepalived, that lets all the nodes in the cluster share the same virtual IP and, through VRRP, decide which host answers on that address, allowing any node in the cluster to fail without taking all the services with it.
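Keepalived’s config is pleasantly small. Something in this neighborhood goes on each node, with the state and priority adjusted per host; the interface name, router ID, and virtual IP here are placeholders:

```bash
sudo apt install keepalived

# /etc/keepalived/keepalived.conf -- one node is MASTER with the highest priority,
# the other three are BACKUP with lower priorities
sudo tee /etc/keepalived/keepalived.conf <<'EOF'
vrrp_instance SWARM_VIP {
    state MASTER              # BACKUP on the other nodes
    interface eno1            # whatever the host's NIC is actually called
    virtual_router_id 51
    priority 150              # lower number on each backup node
    advert_int 1
    virtual_ipaddress {
        192.168.1.100/24      # the shared virtual IP the router forwards to
    }
}
EOF

sudo systemctl enable --now keepalived
```

Point the router’s port forwards at that virtual IP and whichever node currently holds it answers; if that node dies, the address simply moves to another one.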
The End of Our Story
So, now I have this semi-complex setup to acquire new media and to consume said media in a friendly way, one that flexes some principles I use in my day job. In the near future, I want to add some internal process monitoring and external monitoring so I know if the services go down. I’d also like to take a better look at Traefik and move the reverse proxy into the Docker environment. Not too shabby for someone who doesn’t know Linux.
** If only there were a tool for this, one that could make sure that my environment stays homogeneous and all commands are executed the same way every time on multiple machines…