From e00845f58c917e2129de8b2be66ba9151849d9b6 Mon Sep 17 00:00:00 2001
From: Nicholas Hebert <68243838+n-hebert@users.noreply.github.com>
Date: Tue, 19 Mar 2024 11:13:47 -0300
Subject: [PATCH] Revise md section not formatting properly in html

---
 README.md | 22 ++++++++++------------
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/README.md b/README.md
index 6c17b7f5..43f0080c 100644
--- a/README.md
+++ b/README.md
@@ -1060,7 +1060,6 @@ Improved support for saving multiple snapshots of a single URL without this hash
-
 ### Storage Requirements
 
 Because ArchiveBox is designed to ingest a large volume of URLs with multiple copies of each URL stored by different 3rd-party tools, it can be quite disk-space intensive. There are also some special requirements when using filesystems like NFS/SMB/FUSE.
@@ -1070,17 +1069,16 @@ Because ArchiveBox is designed to ingest a large volume of URLs with multiple co
 Click to learn more about ArchiveBox's filesystem and hosting requirements...
-
-**ArchiveBox can use anywhere from ~1gb per 1000 articles, to ~50gb per 1000 articles**, mostly dependent on whether you're saving audio & video using `SAVE_MEDIA=True` and whether you lower `MEDIA_MAX_SIZE=750mb`.
-
-Disk usage can be reduced by using a compressed/deduplicated filesystem like ZFS/BTRFS, or by turning off extractors methods you don't need. You can also deduplicate content with a tool like [fdupes](https://github.com/adrianlopezroche/fdupes) or [rdfind](https://github.com/pauldreik/rdfind).
-
-**Don't store large collections on older filesystems like EXT3/FAT** as they may not be able to handle more than 50k directory entries in the `data/archive/` folder.
-
-**Try to keep the `data/index.sqlite3` file on local drive (not a network mount)** or SSD for maximum performance, however the `data/archive/` folder can be on a network mount or slower HDD.
-
-If using Docker or NFS/SMB/FUSE for the `data/archive/` folder, you may need to set [`PUID` & `PGID`](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#puid--pgid) and [disable `root_squash`](https://github.com/ArchiveBox/ArchiveBox/issues/1304) on your fileshare server.
-
+

Learn More