It just happened again. It was so bad I couldn't even attach to the container it was running in. But I noticed the container was using all of its allocated memory (8 GB). I increased it to 12 GB and the instant I did, attaching to the container worked, but it immediately used those new 4 GB as well. But a few seconds later, both memory and CPU usage dropped to almost 0 again. So it seems there is a bug in PhotoPrism that triggers endless 100% CPU usage once memory runs out. Not great :D
My PhotoPrism still periodically, throughout the day, hogs my server's CPU to the max while indexing. It does settle down eventually (after like 20 minutes) and return to normal, but this is new behaviour. I haven't even updated anything. Maybe I have a file in there that wreaks havoc on the indexer?
I turned on trace logs, but they don't show anything different right before the CPU load goes up compared to normal operation. It then just consumes all RAM and swap and doesn't catch itself, but just gets killed after a while. Which might also explain why it keeps happening - it can't finish the indexing process and restarts every time...
Update on the PhotoPrism freezes. I had disabled automatic indexing for now since I couldn't be bothered to put up with the issue. Came back to it now and started indexing with a different command which for some reason gives me more info in the log output (go figure). Shortly before consuming all memory and subsequently getting OOM killed, it reports that a RAW image file has changed size from 76.2 MB (pretty large already) to 9944 MB (???). Seems suspicious. No idea why that happens yet.
Gonna check that file and remove it, who knows what's up with that. But seems like that might be the culprit, especially because it also reports that it can't find metadata in that file and it's gonna start a brute-force search. Yeah... :D
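For anyone hunting for a similar file in their own library: a rough sketch of how you could scan for RAW files that are suspiciously large. This is not anything PhotoPrism does itself. The function name `find_oversized`, the extension list and the size cutoff are all made up for illustration, so adjust them to your setup.

```python
from pathlib import Path

# Assumed set of RAW extensions; extend it for your own cameras.
RAW_EXTENSIONS = {".raw", ".dng", ".cr2", ".nef", ".arw"}


def find_oversized(root, limit_bytes, extensions=RAW_EXTENSIONS):
    """Return (path, size) pairs for matching files larger than limit_bytes, biggest first."""
    hits = []
    for path in Path(root).rglob("*"):
        # Only look at regular files with one of the chosen extensions.
        if path.is_file() and path.suffix.lower() in extensions:
            size = path.stat().st_size
            if size > limit_bytes:
                hits.append((path, size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

Calling something like `find_oversized("/path/to/originals", 500 * 1024 * 1024)` would list every RAW file over roughly 500 MB (the path is a placeholder). A 9944 MB file would stick out immediately.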
Yep, after restoring the original file, indexing now works again. So it really was just down to a corrupt file! I saved both the original and corrupted file for later analysis.
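For the later analysis, a first step could be finding where the two copies start to diverge. A minimal sketch, assuming both files are available locally (`first_difference` is a hypothetical helper name, not an existing tool):

```python
def first_difference(path_a, path_b, chunk_size=1 << 20):
    """Return the byte offset where the two files first differ, or None if identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Walk the mismatching chunk to find the exact byte.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One chunk is a prefix of the other: the shorter file ends here.
                return offset + min(len(a), len(b))
            if not a:
                # Both files hit EOF at the same point without a mismatch.
                return None
            offset += len(a)
```

If the returned offset equals the size of the original, the corrupted copy would be the original with extra data appended. An earlier offset would mean the contents themselves were altered.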
(cc @red since you wanted to know what the issue was)