mirror of synced 2024-06-10 06:24:30 +12:00
ArchiveBox/archivebox/extractors
2024-03-01 14:50:32 -06:00
__init__.py minor fixes 2024-02-22 04:50:22 -08:00
archive_org.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
dom.py After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2023-08-28 17:27:03 +02:00
favicon.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
git.py Refactor should_save_extractor methods to accept overwrite parameter 2021-01-21 15:56:32 -06:00
headers.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
htmltotext.py new archivebox update speed improvements 2024-02-22 04:50:22 -08:00
media.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
mercury.py improve readability and mercury error handling and fix output path to be relative 2021-02-16 15:53:11 -05:00
pdf.py After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2023-08-28 17:27:03 +02:00
readability.py tag URLs immediately once added instead of waiting until archival completes 2024-01-03 20:31:46 -08:00
screenshot.py After a timeout, chrome will leave behind a SingletonLock, which prevents future instances of chrome from starting. When an extractor fails due to a timeout, remove this file. 2023-08-28 17:27:03 +02:00
singlefile.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
title.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
wget.py Flip dedupe precedence order 2024-03-01 14:50:32 -06:00
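
The commit message on dom.py, pdf.py, and screenshot.py describes removing chrome's leftover SingletonLock when an extractor times out. Below is a minimal sketch of that idea, not the code from those files: the helper name `run_with_lock_cleanup`, its parameters, and the assumption that the lock sits directly inside the chrome user data directory are all hypothetical.

```python
import subprocess
from pathlib import Path


def run_with_lock_cleanup(cmd, user_data_dir, timeout):
    """Run cmd; if it times out, delete a stale SingletonLock, then re-raise.

    Hypothetical helper: the name, the arguments, and the lock location
    (directly inside user_data_dir) are assumptions made for illustration.
    """
    try:
        return subprocess.run(cmd, capture_output=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        lock = Path(user_data_dir) / "SingletonLock"
        # missing_ok=True (Python 3.8+): the lock may never have been created.
        lock.unlink(missing_ok=True)
        # Re-raise so the caller can still record the extractor as failed.
        raise
```

Re-raising keeps the failure visible to the caller, so the cleanup only unblocks the next chrome start and does not hide the timeout.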