diff --git a/README.md b/README.md
index df8aa0b6..72cf202a 100644
--- a/README.md
+++ b/README.md
@@ -72,7 +72,7 @@ The goal is to sleep soundly knowing the part of the internet you care about wil


-bookshelf graphic   logo   bookshelf graphic
+bookshelf graphic   logo   bookshelf graphic

Demo | Screenshots | Usage
@@ -110,10 +110,10 @@ ls ./archive/*/index.json # or browse directly via the filesyste


-cli init screenshot
-cli init screenshot
-server snapshot admin screenshot
-server snapshot details page screenshot
+cli init screenshot
+cli init screenshot
+server snapshot admin screenshot
+server snapshot details page screenshot

@@ -146,7 +146,7 @@ ls ./archive/*/index.json # or browse directly via the filesyste

-grassgrass
+grassgrass
# Quickstart
@@ -351,7 +351,7 @@ See below for usage examples using the CLI, W
-
+
✨ Alpha (contributors wanted!): for more info, see the: Electron ArchiveBox repo.
@@ -443,7 +443,7 @@ ls ./archive/*/index.html # or inspect snapshots on the filesystem
-grassgrass
+grassgrass

@@ -460,7 +460,7 @@ ls ./archive/*/index.html # or inspect snapshots on the filesystem
---
-lego
+lego

@@ -474,12 +474,12 @@ ArchiveBox supports many input formats for URLs, including Pocket & Pinboard exp
*Click these links for instructions on how to prepare your links from these sources:*
-- TXT, RSS, XML, JSON, CSV, SQL, HTML, Markdown, or [any other text-based format...](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Import-a-list-of-URLs-from-a-text-file)
-- [Browser history](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) or [browser bookmarks](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) (see instructions for: [Chrome](https://support.google.com/chrome/answer/96816?hl=en), [Firefox](https://support.mozilla.org/en-US/kb/export-firefox-bookmarks-to-backup-or-transfer), [Safari](http://i.imgur.com/AtcvUZA.png), [IE](https://support.microsoft.com/en-us/help/211089/how-to-import-and-export-the-internet-explorer-favorites-folder-to-a-32-bit-version-of-windows), [Opera](http://help.opera.com/Windows/12.10/en/importexport.html), [and more...](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive))
-- Browser extension [`archivebox-exporter`](https://github.com/tjhorner/archivebox-exporter) (realtime archiving from Chrome/Chromium/Firefox)
+- TXT, RSS, XML, JSON, CSV, SQL, HTML, Markdown, or [any other text-based format...](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Import-a-list-of-URLs-from-a-text-file)
+- [Browser history](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) or [browser bookmarks](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive) (see instructions for: [Chrome](https://support.google.com/chrome/answer/96816?hl=en), [Firefox](https://support.mozilla.org/en-US/kb/export-firefox-bookmarks-to-backup-or-transfer), [Safari](https://github.com/ArchiveBox/ArchiveBox/assets/511499/24ad068e-0fa6-41f4-a7ff-4c26fc91f71a), [IE](https://support.microsoft.com/en-us/help/211089/how-to-import-and-export-the-internet-explorer-favorites-folder-to-a-32-bit-version-of-windows), [Opera](https://help.opera.com/en/latest/features/#bookmarks:~:text=Click%20the%20import/-,export%20button,-on%20the%20bottom), [and more...](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive))
+- Browser extension [`archivebox-exporter`](https://github.com/tjhorner/archivebox-exporter) (realtime archiving from Chrome/Chromium/Firefox)
- [Pocket](https://getpocket.com/export), [Pinboard](https://pinboard.in/export/), [Instapaper](https://www.instapaper.com/user), [Shaarli](https://shaarli.readthedocs.io/en/master/Usage/#importexport), [Delicious](https://www.groovypost.com/howto/howto/export-delicious-bookmarks-xml/), [Reddit Saved](https://github.com/csu/export-saved-reddit), [Wallabag](https://doc.wallabag.org/en/user/import/wallabagv2.html), [Unmark.it](http://help.unmark.it/import-export), [OneTab](https://www.addictivetips.com/web/onetab-save-close-all-chrome-tabs-to-restore-export-or-import/), [and more...](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive)
-
+
```bash
@@ -506,7 +506,7 @@ It also includes a built-in scheduled import feature with `archivebox schedule`
Inside each Snapshot folder, ArchiveBox saves these different types of extractor outputs as plain files:
-
+
`./archive//*`
@@ -530,7 +530,7 @@ It does everything out-of-the-box by default, but you can disable or tweak [indi
## Configuration
-
+
ArchiveBox can be configured via environment variables, by using the `archivebox config` CLI, or by editing the `ArchiveBox.conf` config file directly.
@@ -572,9 +572,10 @@ PUBLIC_ADD_VIEW=False # default: False whether anon users can add new URLs
For better security, easier updating, and to avoid polluting your host system with extra dependencies, **it is strongly recommended to use the official [Docker image](https://github.com/ArchiveBox/ArchiveBox/wiki/Docker)** with everything pre-installed for the best experience.
-To achieve high-fidelity archives in as many situations as possible, ArchiveBox depends on a variety of 3rd-party tools and libraries that specialize in extracting different types of content. These optional dependencies used for archiving sites include:
+These optional dependencies used for archiving sites include:
+
+archivebox --version CLI output screenshot showing dependencies installed
-
- `chromium` / `chrome` (for screenshots, PDF, DOM HTML, and headless JS scripts)
- `node` & `npm` (for readability, mercury, and singlefile)
@@ -731,36 +732,36 @@ Disk usage can be reduced by using a compressed/deduplicated filesystem like ZFS
## Screenshots
-
+
@@ -773,7 +774,7 @@ Disk usage can be reduced by using a compressed/deduplicated filesystem like ZFS
-paisley graphic
+paisley graphic
# Background & Motivation
@@ -785,8 +786,8 @@ Vast treasure troves of knowledge are lost every day on the internet to link rot
Whether it's to resist censorship by saving articles before they get taken down or edited, or just to save a collection of early 2010's flash games you love to play, having the tools to archive internet content enables you to save the stuff you care most about before it disappears.
-
- Image from WTF is Link Rot?...
+
+ Image from Perma.cc...
The balance between the permanence and ephemeral nature of content on the internet is part of what makes it beautiful. I don't think everything should be preserved in an automated fashion--making all content permanent and never removable, but I do think people should be able to decide for themselves and effectively archive specific content that they care about.
@@ -796,7 +797,7 @@ ArchiveBox archives the sites in **several different formats** beyond what publi
## Comparison to Other Projects
-comparison
+comparison
▶ **Check out our [community page](https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community) for an index of web archiving initiatives and projects.**
@@ -826,14 +827,14 @@ For more alternatives, see our [list here](https://github.com/ArchiveBox/Archive

-dependencies graphic
+dependencies graphic
## Internet Archiving Ecosystem
Whether you want to learn which organizations are the big players in the web archiving space, want to find a specific open-source tool for your web archiving need, or just want to see where archivists hang out online, our Community Wiki page serves as an index of the broader web archiving community. Check it out to learn about some of the coolest web archiving projects and communities on the web!
-
+
- [Community Wiki](https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community)
- [The Master Lists](https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community#the-master-lists)
@@ -861,7 +862,7 @@ Whether you want to learn which organizations are the big players in the web arc
---
-documentation graphic
+documentation graphic
# Documentation
@@ -907,7 +908,7 @@ You can also access the docs locally by looking in the [`ArchiveBox/docs/`](http
---
-development
+development
# ArchiveBox Development
-brew install archivebox
-archivebox version
+brew install archivebox
+archivebox version
-archivebox init
+archivebox init
-archivebox add
+archivebox add
-archivebox data dir
+archivebox data dir
-archivebox server
+archivebox server
-archivebox server add
+archivebox server add
-archivebox server list
+archivebox server list
-archivebox server detail
+archivebox server detail