mirror of synced 2024-06-22 16:10:54 +12:00

re-arrange

Nick Sweeting 2017-05-05 05:56:11 -04:00 committed by GitHub
parent c375d88355
commit 031a9ec176


@@ -8,6 +8,9 @@ Save an archived copy of all websites you star using Pocket, indexed in an html
## Quickstart
**Runtime:** I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
Those numbers are from running it on my 4-core i5 machine with a 50 Mbps downlink. YMMV.
**Dependencies:** Google Chrome headless, wget
```bash
@@ -21,7 +24,7 @@ sudo sh -c 'echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable
apt update; apt install google-chrome-beta
```
**Usage:**
**Archiving:**
1. Download your pocket export file `ril_export.html` from https://getpocket.com/export
2. Download this repo `git clone https://github.com/pirate/pocket-archive-stream`
@@ -35,9 +38,6 @@ organized by timestamp. For each site it saves:
- `screenshot.png` 1440x900 screenshot of site using headless chrome
- `output.pdf` Printed PDF of site using headless chrome
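
The screenshot and PDF steps listed above could be driven from Python roughly as follows. This is a minimal sketch, not the code in `archive.py`: the binary name `google-chrome` and the helper names are assumptions for illustration, and the functions only build the command lines rather than running them. The `--headless`, `--screenshot`, `--window-size`, and `--print-to-pdf` flags are standard headless Chrome switches.

```python
# Assumption: the Chrome binary is on PATH under this name. The install
# step in this README uses google-chrome-beta, so adjust as needed.
CHROME_BINARY = "google-chrome"


def screenshot_command(url, width=1440, height=900):
    """Build (but do not run) a headless Chrome screenshot command."""
    return [
        CHROME_BINARY,
        "--headless",
        "--screenshot",  # output lands in the current working directory
        "--window-size={},{}".format(width, height),
        url,
    ]


def pdf_command(url):
    """Build (but do not run) a headless Chrome print-to-PDF command."""
    return [
        CHROME_BINARY,
        "--headless",
        "--print-to-pdf",  # output lands in the current working directory
        url,
    ]
```

To actually run one, pass the list to `subprocess.run(cmd, cwd=site_folder, timeout=60)`, where `site_folder` is a hypothetical per-site output directory and the timeout is an arbitrary example value.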
I've found it takes about an hour to download 1000 articles, and they'll take up roughly 1GB.
Those numbers are from running it on my 4-core i5 machine with a 50 Mbps downlink. YMMV.
You can tweak parameters like screenshot size, file paths, timeouts, etc. in `archive.py`.
You can also tweak the generated HTML index in `index_template.html`. It just uses Python
format strings (not a proper templating engine like Jinja2), which is why the CSS uses doubled braces `{{...}}`.
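
As a rough illustration of why the braces are doubled: `str.format` treats `{name}` as a placeholder, so literal CSS braces have to be written as `{{` and `}}`. The template and the `render_index` helper below are hypothetical stand-ins, not the actual contents of `index_template.html` or `archive.py`.

```python
from html import escape

# Hypothetical template. CSS braces are doubled so that str.format
# emits them literally; {title} and {rows} are real placeholders.
TEMPLATE = """<style>
  body {{ font-family: sans-serif; }}
</style>
<h1>{title}</h1>
<ul>
{rows}
</ul>"""


def render_index(title, links):
    """Fill the template from a title and a list of (url, label) pairs."""
    rows = "\n".join(
        '  <li><a href="{}">{}</a></li>'.format(escape(url), escape(label))
        for url, label in links
    )
    return TEMPLATE.format(title=escape(title), rows=rows)


page = render_index("Archive", [("https://example.com", "Example")])
```

After formatting, the CSS rule comes out with single braces (`body { font-family: sans-serif; }`). Forgetting to double a brace in the template typically raises `KeyError` or `ValueError` when `format` is called.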