Archiving emails on Paperless-NGX
I personally self-host https://github.com/paperless-ngx/paperless-ngx to archive a multitude of documents, such as receipts, payslips and manuals. Anything document-based, really.
However, more and more shops now send Shopify-esque emails with no .PDF attached, where the email itself is the receipt. I ended up creating a "Receipts" mail folder, which just keeps growing with no real way to cut it down.
Then, while going through the Paperless docs, I spotted two optional services: Tika & Gotenberg.
I frantically got them set up in my home k8s cluster, updated the env vars, and now Paperless is able to consume the whole email as a document! 🥳
With this beautiful option of an .eml consumption scope, Paperless now processes the email as if the email itself were the attachment.
The only issue I've run into so far is that it keeps the email header as a whole page in the downloaded document.
In terms of what I changed, I:
- Added two new containers (Tika & Gotenberg) to the Paperless pod in my Kubernetes deployment.
- Added two new services (for Tika & Gotenberg) that my Paperless instance can access.
- Added env vars for the new containers to enable .eml parsing.
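
For reference, here is a rough sketch of what the container and env var changes above might look like in the pod spec. It is not my exact manifest: the container names and image tags are assumptions for illustration, and the endpoints assume Tika and Gotenberg run in the same pod as Paperless on their default ports (9998 and 3000). If you route through the Services instead, swap localhost for your Service names.

```yaml
# Hypothetical excerpt of the Deployment's pod spec.
# Container names and image tags are illustrative assumptions.
containers:
  - name: paperless
    image: ghcr.io/paperless-ngx/paperless-ngx:latest
    env:
      # Turns on the Tika/Gotenberg-backed parsers in Paperless
      - name: PAPERLESS_TIKA_ENABLED
        value: "1"
      # Same-pod containers share a network namespace, so localhost works
      - name: PAPERLESS_TIKA_ENDPOINT
        value: "http://localhost:9998"
      - name: PAPERLESS_TIKA_GOTENBERG_ENDPOINT
        value: "http://localhost:3000"
  - name: tika
    image: docker.io/apache/tika:latest
    ports:
      - containerPort: 9998
  - name: gotenberg
    image: docker.io/gotenberg/gotenberg:8
    ports:
      - containerPort: 3000
```

The three PAPERLESS_TIKA_* variables are the ones listed in the Paperless configuration docs; everything else here should be adapted to your own cluster.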