mirror of synced 2024-05-19 19:52:41 +12:00
Commit graph

235 commits

Author SHA1 Message Date
Serene-Arc 36ff95de6b Add Patreon image support 2021-12-19 13:44:24 +10:00
dbanon87 1530456cf7
Update downloader.py 2021-11-29 09:23:04 -05:00
dbanon87 9ccc9e6863
Update archiver.py 2021-11-29 09:22:21 -05:00
Serene-Arc f670b347ae Add integration test for archiver option 2021-11-24 12:49:11 +10:00
Serene-Arc d0d72c8229 Add integration test for downloader option 2021-11-24 12:49:11 +10:00
Jay R. Wren 2b50ee0724 add test. fix typos. 2021-11-24 12:49:11 +10:00
Jay R. Wren dd8d74ee25 Add --ignore to ignore user 2021-11-24 12:49:11 +10:00
Serene-Arc 8925643331 Rename module to reflect backend change 2021-11-24 10:40:18 +10:00
Serene-Arc 2dd446a402 Fix max path length calculations 2021-11-22 14:37:21 +10:00
Serene-Arc 17939fe47c Fix bug with youtube class and children 2021-11-22 14:37:21 +10:00
Serene-Arc 53562f4873 Fix regex 2021-11-16 17:05:46 +03:00
OMEGARAZER f05e909008 Stop videos from being downloaded as images
Erroneous .gifv extensions such as .giff or .gift resolve to a static image and are downloaded by the direct downloader (e.g. https://i.imgur.com/OGeVuAe.giff).
2021-11-16 17:05:46 +03:00
Serene-Arc 4be0f5ec19 Add more tests for file length checking 2021-11-15 11:57:54 +10:00
Serene-Arc 801784c46d Fix a crash when downloading a disabled pornhub video 2021-11-05 13:23:55 +10:00
Serene-Arc e493ab048a Fix bug with period not separating file extension 2021-11-05 12:47:46 +10:00
Serene-Arc c6c6002ab2 Update Erome module 2021-10-02 17:50:20 +10:00
Serene-Arc 9b23f273fc Separate function out 2021-10-02 17:50:20 +10:00
Serene-Arc eeb2054606 Switch to yt-dlp 2021-10-02 17:50:20 +10:00
Serene e004ccd148
Merge pull request #521 from Serene-Arc/bug_fix_518
Fix bug with different Vidble links
2021-09-14 13:48:26 +10:00
Serene-Arc 80baab8de7 Fix bug with different Vidble links 2021-09-14 13:47:46 +10:00
Eli Lipsitz 33312687ac imgur: download videos as mp4 instead of gif
Some imgur URLs have the extension ".gifv" and show up as a gif,
even though they're actually supposed to be mp4 videos. Imgur
serves all videos/gifs as both .gif and .mp4. The image dict has
a key "prefer_video" to distinguish the two. This commit
overrides the .gif extension if "prefer_video" is true to ensure
we download the submission as originally intended.
2021-09-12 17:30:25 -05:00
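The override described in the commit message above can be sketched roughly as follows. This is only an illustration, not the project's actual code: the function name `choose_extension` and the `"ext"` key are assumptions made for the example; only the `"prefer_video"` key is named in the commit message.

```python
def choose_extension(image: dict) -> str:
    """Pick the file extension to download for an Imgur image dict.

    Assumption: the dict carries its extension under "ext" (hypothetical
    key) and may carry the "prefer_video" flag named in the commit message.
    """
    extension = image["ext"]
    # Imgur serves the same media as both .gif and .mp4; when the flag is
    # set, the submission was meant to be a video, so prefer .mp4.
    if image.get("prefer_video") and extension == ".gif":
        return ".mp4"
    return extension


print(choose_extension({"ext": ".gif", "prefer_video": True}))
print(choose_extension({"ext": ".gif", "prefer_video": False}))
```

Using `dict.get` keeps the sketch tolerant of image dicts that lack the flag entirely; such entries keep their original extension.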
Ali Parlakçı 483f179ccc
Merge pull request #482 from Serene-Arc/enhancement_481
Add ability to read IDs from files
2021-09-12 20:07:17 +03:00
Serene-Arc aee6f4add9 Add Vidble to download factory 2021-09-11 12:15:35 +10:00
Serene-Arc 940d646d30 Add Vidble module 2021-09-11 12:13:21 +10:00
Serene 3040a35306
Merge pull request #512 from Serene-Arc/bug_fix_510 2021-09-03 19:30:30 +10:00
Serene-Arc 87f283cc98 Fix backup config location 2021-09-03 19:24:28 +10:00
Serene-Arc 7bca303b1b Add in downloader parameters 2021-07-29 19:10:10 +10:00
Serene-Arc dbe8733fd4 Refactor method to remove max wait time 2021-07-27 14:02:30 +10:00
Serene-Arc 3cdae99490 Implement callbacks for downloading 2021-07-27 13:39:49 +10:00
Serene-Arc 1a4ff07f78 Add ability to read IDs from files 2021-07-21 17:32:38 +10:00
Serene-Arc 77aaee96f3 Fix bug with deleted galleries 2021-07-19 18:44:54 +10:00
Serene-Arc 2f8ca766c6 Update regex 2021-07-04 11:00:02 +10:00
Serene-Arc d03a5e556e Stop writing new value to config 2021-07-04 10:59:35 +10:00
Serene-Arc 7f1c929a08 Add fallback scope 2021-07-03 13:54:26 +10:00
Serene-Arc d5ef991b3a Catch additional error in galleries 2021-07-02 15:11:09 +10:00
Serene-Arc aa55a92791 Remove unused local variables 2021-07-02 14:58:56 +10:00
Serene-Arc 6efcf1ce7e Remove unused imports 2021-07-02 14:58:20 +10:00
Serene-Arc 1319eeb6da Fix error with crossposted Reddit galleries 2021-07-02 14:53:02 +10:00
Serene-Arc 8db9d0bcc4 Add test for unauthenticated instances 2021-07-02 14:29:39 +10:00
Serene-Arc bd34c37052 Add exception for special friends subreddit 2021-07-02 14:09:44 +10:00
Serene-Arc edfeb653a4 Record user flair in comment archive entries 2021-07-02 14:01:24 +10:00
Serene-Arc d53b3b7274 Update gallery code to work with NSFW galleries 2021-06-25 17:52:11 +10:00
Serene-Arc e8998da2f0 Catch some Imgur errors with weird links 2021-06-25 17:52:11 +10:00
Serene-Arc 1a52dfdcbc Add PornHub module 2021-06-25 17:47:49 +10:00
Serene-Arc e5be624f1e Check submission URL against filter before factory 2021-06-23 14:30:39 +10:00
Ali Parlakçı 71930e06a8
Merge pull request #457 from Serene-Arc/enhancement_322
Add option for archiver full context
2021-06-19 13:56:04 +03:00
Serene-Arc 7c27b7bf12 Update logging message 2021-06-13 09:49:42 +10:00
Serene-Arc c5c010bce0 Rename option 2021-06-12 10:35:31 +10:00
Serene-Arc 6eeadc8821 Add option for archiver full context 2021-06-11 15:31:11 +10:00
Serene-Arc 8ba2d0bb55 Add missing return statement 2021-06-10 18:59:22 +10:00
Serene-Arc 8be3efb6e4 Fix bug with Imgur gifs being shortened too much
The rstrip function was used incorrectly: it doesn't remove a substring but
rather removes any of the characters provided. Here it removed any I, G, V,
or F that ended the six-character ID for Imgur, resulting in a 404 error
for the resources in question.
2021-06-08 13:08:39 +10:00
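The `rstrip` pitfall described in the commit message above is easy to reproduce. The snippet below is a minimal sketch: the ID `abcdif` is made up for illustration, and `strip_suffix` is a hypothetical helper, not the fix that was actually committed.

```python
def strip_suffix(text: str, suffix: str) -> str:
    """Remove `suffix` from the end of `text` only if it is really there."""
    if suffix and text.endswith(suffix):
        return text[: -len(suffix)]
    return text


link_tail = "abcdif.gifv"  # made-up six-character ID followed by .gifv

# str.rstrip treats its argument as a SET of characters, so it keeps
# eating trailing '.', 'g', 'i', 'f' and 'v' and damages the ID itself.
broken = link_tail.rstrip(".gifv")

# Removing the exact suffix leaves the ID intact.
fixed = strip_suffix(link_tail, ".gifv")

print(broken)  # abcd
print(fixed)   # abcdif
```

On Python 3.9 and later, the built-in `str.removesuffix` does the same job as the helper.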
Serene 6dcef83666
Add ability to disable modules (#434)
* Fix test name to match standard

* Rename file

* Add ability to disable modules

* Update README

* Fix missing comma

* Fix more missing commas. sigh...

Co-authored-by: Ali Parlakçı <parlakciali@gmail.com>
2021-06-06 13:47:56 +03:00
Serene 434aeb8feb
Add a combined command for the archiver and downloader: clone (#433)
* Simplify downloader function

* Add basic scraper class

* Add "scrape" command

* Rename "scrape" command to "clone"

* Add integration tests for clone command

* Update README

* Fix failing test
2021-06-06 13:29:09 +03:00
Ali Parlakçı a2f010c40d
Merge pull request #432 from Serene-Arc/enhancement_429
Allow --user to be specified multiple times
2021-06-06 13:25:00 +03:00
Ali Parlakçı 6839c65bd6
Merge pull request #396 from Serene-Arc/bug_fix_385
Add path limit check
2021-06-06 13:24:22 +03:00
Serene-Arc 79fba4ac4a Fix indent 2021-05-31 13:42:41 +10:00
Serene-Arc 9a1e1ebea1 Add path limit fix 2021-05-27 15:27:02 +10:00
Serene-Arc 6b78a23484 Allow --user to be specified multiple times 2021-05-27 15:22:58 +10:00
Serene-Arc fef2fc864b Update blacklist 2021-05-25 19:33:32 +10:00
Serene-Arc 87959028e5 Add blacklist for web filetypes 2021-05-25 19:20:51 +10:00
Serene-Arc f47688812d Rename function 2021-05-25 18:51:24 +10:00
Serene-Arc 323b2d2b03 Fix download retries logic 2021-05-25 09:56:22 +10:00
Serene-Arc e2582ecb3e Catch error with MacOS writing per issue #407 2021-05-23 12:17:14 +10:00
Serene-Arc 47a4951279 Rename variable 2021-05-23 12:13:44 +10:00
Serene-Arc 4395dd4646 Update logging messages to include submission IDs 2021-05-22 11:53:44 +10:00
Serene-Arc a104a154fc Simplify method structure 2021-05-22 11:53:44 +10:00
Ali Parlakci da8c64ec51 Read files in chunks instead when hashing (#416) 2021-05-22 08:46:39 +10:00
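Hashing a file in chunks, as the commit above describes, keeps memory use bounded regardless of file size. The following is a sketch of the general technique only: the function name `hash_file`, the chunk size, and the choice of MD5 are assumptions for the example, not details taken from the commit.

```python
import hashlib


def hash_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the hex digest of a file, reading it one chunk at a time.

    Assumptions: MD5 and a 1 MiB chunk size are illustrative choices.
    """
    digest = hashlib.md5()
    with open(path, "rb") as file:
        # iter() with a sentinel calls file.read until it returns b"" (EOF).
        for chunk in iter(lambda: file.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Feeding the chunks to `digest.update` one by one yields the same digest as hashing the whole file contents at once.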
Ali Parlakci cf6905db28
Reverts #384 2021-05-21 00:22:44 +03:00
Ali Parlakçı bfa6e4da5a
Merge pull request #409 from Ailothaen/master
Adding info to threads and comments: distinguished, spoiler, pinned, locked
2021-05-21 00:04:53 +03:00
Ailothaen 827f1ab80e Adding some more info in threads and comments: distinguished, spoiler, locked, sticky 2021-05-20 18:26:50 +02:00
Serene-Arc 830e4f2830 Catch additional error 2021-05-19 10:09:46 +10:00
Serene-Arc 3b28ad24b3 Fix bug with some Imgur extensions 2021-05-19 09:57:27 +10:00
Ali Parlakci 7c401b1461
Merge branch 'reddit_connector_refactor' of https://github.com/Serene-Arc/bulk-downloader-for-reddit into Serene-Arc-reddit_connector_refactor 2021-05-17 13:53:48 +03:00
Serene c581bef790
Set file creation times to the post creation time (#391) 2021-05-17 13:49:35 +03:00
Serene-Arc 7016603763 Refactor out super class RedditConnector 2021-05-17 11:50:17 +10:00
Ali Parlakci 200916a150 Rename --exclude-id(-file) to --skip-id(-file) 2021-05-17 10:30:55 +10:00
Ali Parlakci f768a7d61c Rename --skip to --skip-format 2021-05-17 10:30:55 +10:00
alpbetgam ef37712115 Fix error with old gfycat/redgifs urls 2021-05-16 19:15:36 +10:00
Ali Parlakci c7a5ec4376 bug(youtube.dl): Fix crash on zero downloads #375 2021-05-16 11:06:13 +10:00
BlipRanger fca3184950
Bind socket to '0.0.0.0' rather than 'localhost' to allow for more flexible OAuth connection. (#368) 2021-05-12 17:47:33 +03:00
Serene-Arc 7e70175e4c Change logging message to include submission ID 2021-05-10 19:03:20 +10:00
Ali Parlakçı a2e22e894a
Fix xml archiver encoding bug (#349)
* test_integration: add archiver tests

* archiver.py: fix encoding bug in xml archiver
2021-05-06 16:11:48 +03:00
Ali Parlakçı 283ad164e5
__main__.py: fix typo in -f argument 2021-05-06 12:52:45 +03:00
Serene-Arc f6d89097f8 Consolidate exception block 2021-05-06 10:43:56 +10:00
Ali Parlakci e642ad68d4 youtubedl_fallback.py: add a fallback exception and log messages 2021-05-05 16:56:34 +03:00
Ali Parlakci 00defe3b87 youtubedl_fallback: remove logging the expected exception 2021-05-05 16:35:03 +03:00
Serene-Arc c9cde54a72 Remove VReddit downloader module 2021-05-03 14:10:54 +10:00
Serene-Arc ab96a3ba97 Remove Streamable downloader module 2021-05-03 14:10:54 +10:00
Serene-Arc fba70dcf18 Intercept youtube-dl output 2021-05-03 14:10:54 +10:00
Serene-Arc a8c2136270 Add fallback downloader 2021-05-03 14:10:54 +10:00
Serene-Arc afa3e2548f Add customisable time formatting 2021-05-03 14:05:05 +10:00
Serene-Arc eda12e5274 Make downloadfilter apply itself to Resources 2021-05-03 14:02:03 +10:00
Serene-Arc 711f8b0c76 Add exception for r/all in subreddit check 2021-05-02 14:00:23 +10:00
Serene-Arc 14195157de Catch errors for banned or private subreddits 2021-05-01 13:36:38 +10:00
Daniel Clowry fe95394b3b Match import order, update docs 2021-04-30 09:22:26 +10:00
Daniel Clowry e6d2980db3 Add Streamable to download factory 2021-04-30 09:22:26 +10:00
Daniel Clowry 2c54cd740a Add Streamable downloader 2021-04-30 09:22:26 +10:00
Serene-Arc 39935c58d9 Remove GifDeliveryNetwork module 2021-04-28 19:08:31 +10:00
Serene-Arc 9931839d14 Remove gifdeliverynetwork from download factory 2021-04-28 19:08:31 +10:00
Serene-Arc 760e59e1f7 Invert inheritance direction 2021-04-28 19:08:31 +10:00
Serene-Arc 3c6e9f6ccf Refactor class 2021-04-28 19:08:31 +10:00
Ali Parlakci e1a4ac063c (bug) redgifs: fix could not read page source 2021-04-28 19:08:31 +10:00
Serene-Arc 7fcbf623a0 Catch additional errors in site downloaders 2021-04-28 15:20:02 +10:00
Serene-Arc 6a20548269 Catch additional error 2021-04-28 12:03:28 +10:00
Serene-Arc 17499baf61 Add informative error when testing user existence 2021-04-28 12:03:28 +10:00
Serene-Arc e6551bb797 Return banned users as not existing 2021-04-28 12:03:28 +10:00
Serene-Arc db46676dec Catch error when logfile accessed concurrently 2021-04-28 12:03:28 +10:00
Serene-Arc cb41d4749a Add option to specify logfile location 2021-04-28 12:03:28 +10:00
Serene-Arc a28c2d3c73 Add missing default argument 2021-04-28 12:03:28 +10:00
Serene-Arc f5d11107a7 Remove unused imports 2021-04-28 12:03:28 +10:00
Serene-Arc 8cdf926211 Rename function 2021-04-28 12:03:28 +10:00
Serene-Arc 7438543f49 Remove unused variable 2021-04-28 12:03:28 +10:00
Serene-Arc ca495a6677 Add missing typing declaration 2021-04-28 12:03:28 +10:00
Serene-Arc 214c883a10 Simplify regex string slightly 2021-04-28 12:03:28 +10:00
Serene-Arc 6767777944 Catch requests errors in site downloaders 2021-04-28 12:03:28 +10:00
Serene-Arc d960bc0b7b Use ISO format for timestamps in names 2021-04-28 12:03:28 +10:00
Ali Parlakçı 2eab4052c5
Fix GifDeliveryNetwork link algorithm (#298)
* Catch additional error when parsing site

* Fix GifDeliveryNetwork link algorithm

Co-authored-by: Serene-Arc <serenical@gmail.com>
2021-04-22 10:07:05 +03:00
Ali Parlakci 1c4cfbb580 tests: move outside of the package 2021-04-21 08:29:14 +10:00
Serene b37ff0714f Fix time filters (#279) 2021-04-18 16:44:52 +03:00
Ali Parlakci f483f24e15 test_integration.py: fix skipif test_config 2021-04-18 16:44:52 +03:00
Ali Parlakçı 5e81160e5f test_vreddit: remove flaky test (#272) 2021-04-18 16:44:52 +03:00
Ali Parlakci 8eb374eec6 test_vreddit: fix incorrect file hash 2021-04-18 16:44:52 +03:00
Serene d8752b15fa Add option to skip specified subreddits (#268)
* Rename variables

* Add option to skip specific subreddits

* Update README
2021-04-18 16:44:52 +03:00
Serene-Arc 48dca9e5ee Fix mistaken backreference in some titles
This should resolve #267
2021-04-18 16:44:52 +03:00
Nathan Spicer-Davis 59ab5d8777 Update extension regex to match URI fragments (#264) 2021-04-18 16:44:52 +03:00
Serene-Arc ab7a0f6a51 Catch errors when resources have no extension
This is related to #266 and will prevent the BDFR from
completely crashing when a file extension is unknown
2021-04-18 16:44:52 +03:00
Serene-Arc e672e28a12 Fix typing on function 2021-04-18 16:44:52 +03:00
Serene-Arc 4b195f2b53 Remove unneeded logger entry 2021-04-18 16:44:52 +03:00
Serene 62dedb6c95 Fix bug with emojis in the filename (#263) 2021-04-18 16:44:52 +03:00
Serene-Arc 5758aad48b Fix formatting 2021-04-18 16:44:52 +03:00
Serene-Arc 77bdbbac63 Update test hash 2021-04-18 16:44:52 +03:00
Serene-Arc 7d71f8ffab Add regex for all 2xx HTTP codes 2021-04-18 16:44:52 +03:00
Serene-Arc 308853d531 Use standards in HTTP errors 2021-04-18 16:44:52 +03:00
Serene-Arc e35dd9e5d0 Fix bug in file name formatter 2021-04-18 16:44:52 +03:00
Serene-Arc bd9f276acc Rename module 2021-04-18 16:44:52 +03:00