mirror of synced 2024-07-09 00:05:37 +12:00
Commit graph

40 commits

Author SHA1 Message Date
Nick Sweeting 1cbbf7204d
Merge pull request #114 from karlicoss/fix-favicon
Add dummy favicon entry so FETCH_FAVICON='False' isn't failing
2018-11-26 16:23:22 -05:00
Dima Gerasimov 03c1b0009c Fix 'Too many open files' error.
Previously happened after continuously archiving a few hundred links.

Fix by:
* setting process object to `None` to trigger GC finalizer cleanup of pipe descriptors
* protecting against double cleanup
2018-11-26 02:29:56 +00:00
Dima Gerasimov b0ffc9c076 Add dummy favicon entry so FETCH_FAVICON='False' isn't failing 2018-11-25 23:55:12 +00:00
Dima Gerasimov 75c062f33e Add script to remove entries from index 2018-11-09 20:12:37 +00:00
Nick Sweeting a2f5fa8ba6 Use a more appropriate coding style from @pirate.
Co-Authored-By: f0086 <mail@aaron-fischer.net>
2018-10-24 21:10:41 +02:00
Aaron Fischer ebc327bb89 Turn an O(n^2) loop into an O(n) one. 2018-10-21 22:36:32 +02:00
Aaron Fischer b1b6be4f13
merge_links() used wrong index
Because merge_links() uses the index, we need to get the new_links() _before_ we manipulate the index with write_links_index(). This has the negative side effect that "Adding X new links ..." is output twice (because we execute merge_links() twice). To avoid that, we only print output when only_new is not set.
2018-10-19 22:35:08 +02:00
Aaron Fischer 69c007ce85 Optionally import only new links
When importing a huge list of links periodically (for example, from a big
dump of links from a bookmark service) with a lot of broken links,
these links will always be rechecked. To skip this, the environment
variable ONLY_NEW can be used to import only new links and skip the rest
altogether. This partially fixes #95.
2018-10-19 21:34:57 +02:00
William Esz a59d609571 Fix archive_dot_org submit_url
It was removing functional query parameters (e.g., https://news.ycombinator.com/item?id=18216459).
2018-10-15 13:09:31 +02:00
William Esz 8b850393df Fix archive_dot_org CMD
`curl -I {url}` returns 404
2018-10-15 13:07:20 +02:00
Nick Sweeting 46ad4fd163 fix python io encoding 2018-10-13 22:12:31 -04:00
Nick Sweeting 6c6bdaa3d7 add chrome sandbox option 2018-10-13 22:12:26 -04:00
Nick Sweeting a6650dfca0 move requirements down a level 2018-10-12 23:48:15 -04:00
Christian Kollmann fbc90b4279 Enable importing files from wallabag 2018-10-08 18:45:51 +02:00
Pig Monkey 7ed4f8deed support a configurable output directory
Closes #94
2018-09-21 17:41:11 -07:00
Florian Tham 5450afd18b fixes unstable sorting between consecutive runs 2018-09-15 00:08:59 +02:00
Nick Sweeting 8a23358fc8 create robots.txt in output dir 2018-09-12 19:26:00 -04:00
Nick Sweeting ff10253cef fix user agent breaking all wgets 2018-09-10 22:31:19 -04:00
Nick Sweeting 735f530516 hide scrollbars in screenshots 2018-06-17 19:09:09 -04:00
Nick Sweeting 738513ead8 rejigger wget exception handling order 2018-06-17 19:09:01 -04:00
Nick Sweeting 70530060c2 make ts naming consistent 2018-06-17 18:35:09 -04:00
Nick Sweeting 062d2ddc98 fix archive_url broken on first run 2018-06-17 18:32:52 -04:00
Nick Sweeting 47c3d563b2 tweak index columns and footer links 2018-06-17 18:32:42 -04:00
Nick Sweeting 16b6e0b428 flip collapse and return to archive buttons 2018-06-17 18:16:12 -04:00
Nick Sweeting c4e0af84e7 add default wget user agent 2018-06-17 18:05:25 -04:00
Nick Sweeting cad622a137 re-arrange index columns 2018-06-17 17:51:28 -04:00
Nick Sweeting aa5a674a17 add new migrate_data step to move old folder 2018-06-10 23:01:56 -04:00
Nick Sweeting 755845c69a use latest instead of deriving wget path 2018-06-10 22:43:01 -04:00
Nick Sweeting 5498822a97 fix parsing of chrome and ff histories 2018-06-10 22:13:56 -04:00
Nick Sweeting 9ec1f81bd5 add author and version 2018-06-10 22:02:33 -04:00
Nick Sweeting b5e2ed1d46 pretty_path the source and index paths in stdout 2018-06-10 22:00:31 -04:00
Nick Sweeting d6354ac93f rearrange files again 2018-06-10 21:58:48 -04:00
Nick Sweeting 19ade54668 move examples to tests 2018-06-10 21:50:09 -04:00
Nick Sweeting e74227569e fix timestamp parsing edgecase 2018-06-10 21:27:23 -04:00
Nick Sweeting c90f4bfd5b cleanup ARCHIVE_DIR paths 2018-06-10 21:26:11 -04:00
Nick Sweeting 46ea65d4f2 remove DS_Store files 2018-06-10 21:22:19 -04:00
Nick Sweeting d2d1b977fe log wget 500 errors 2018-06-10 21:14:46 -04:00
Nick Sweeting c1c689cb94 log wget 404 and 403 errors 2018-06-10 21:13:07 -04:00
Nick Sweeting a287900345 fix template locations 2018-06-10 21:12:55 -04:00
Nick Sweeting d0f2e693b3 re-arrange and cleanup directory structure 2018-06-10 20:52:15 -04:00
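
Commit 03c1b0009c describes fixing a 'Too many open files' error by dropping the process object so its pipe descriptors get cleaned up, and by protecting against double cleanup. The following is a minimal sketch of that idea, not the repository's actual code: the function name `run_and_release` and its parameters are hypothetical, and the explicit `close()` calls are an addition on top of the "set to `None`" approach named in the commit message.

```python
import subprocess
import sys


def run_and_release(cmd, timeout=60):
    """Run cmd and return (returncode, stdout), releasing pipe descriptors promptly.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        stdout, _ = proc.communicate(timeout=timeout)
        return proc.returncode, stdout
    except subprocess.TimeoutExpired:
        # Stop the child and reap it so it does not linger as a zombie.
        proc.kill()
        proc.communicate()
        raise
    finally:
        # Guard each close so that running cleanup twice is harmless.
        for pipe in (proc.stdout, proc.stderr):
            if pipe is not None and not pipe.closed:
                pipe.close()
        # Drop the local reference so the Popen object can be collected.
        proc = None


if __name__ == "__main__":
    # Run a trivial child process many times; descriptors should not accumulate.
    for _ in range(300):
        code, out = run_and_release([sys.executable, "-c", "print('ok')"])
    print(code, out)
```

Closing the pipes explicitly does not depend on when the garbage collector runs, which may matter when a few hundred links are archived in one long-lived process.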