1
0
Fork 0
mirror of synced 2024-05-18 19:32:24 +12:00

Compare commits

...

106 commits

Author SHA1 Message Date
Nick Sweeting 102e87578c
re-lock packages for python3.10 because we have to support it still 2024-05-11 15:06:19 -07:00
Nick Sweeting 913590ee39
explain weird use of ellipses magic value 2024-05-11 15:02:43 -07:00
Nick Sweeting 3882b1ee22
Update README.md 2024-05-11 14:17:38 -07:00
Nick Sweeting 471cf06d89
Update README.md 2024-05-11 14:17:16 -07:00
Nick Sweeting 340fc95f75
Update README.md 2024-05-09 00:45:20 -07:00
Nick Sweeting 75a3f03149
Update README.md 2024-05-09 00:23:59 -07:00
Nick Sweeting 2e9512adfd
Update README.md 2024-05-08 23:49:47 -07:00
Nick Sweeting 1c76193704
Update README.md 2024-05-08 23:17:27 -07:00
Nick Sweeting e23c7cb3db
Update README.md 2024-05-08 23:11:43 -07:00
Nick Sweeting 7a8ed9cd55
Update README.md 2024-05-08 23:10:32 -07:00
Nick Sweeting 72f52d5dd5
Update README.md 2024-05-08 23:00:02 -07:00
Nick Sweeting 3ce801a182
Update README.md 2024-05-08 22:58:55 -07:00
Nick Sweeting 8b75788644
Update README.md 2024-05-08 22:57:05 -07:00
Nick Sweeting 7673b42117
Update README.md 2024-05-08 22:52:02 -07:00
Nick Sweeting 03296e2200
Update README.md 2024-05-08 22:40:53 -07:00
Nick Sweeting e522810a20
Update README.md 2024-05-08 22:39:06 -07:00
Nick Sweeting 69579a73ec
Update README.md 2024-05-08 22:33:01 -07:00
Nick Sweeting f5f8d091c3
Update README.md 2024-05-08 22:32:38 -07:00
Nick Sweeting 51601632c2
Update README.md 2024-05-08 22:28:13 -07:00
Nick Sweeting b489338555
Update README.md (#1427) 2024-05-08 22:25:42 -07:00
Nick Sweeting d03f447555
Update README.md 2024-05-08 22:25:20 -07:00
Nick Sweeting 68a859ccfd
Update README.md 2024-05-08 22:06:50 -07:00
Nick Sweeting 6baf2b2f69
Update README.md 2024-05-08 21:30:15 -07:00
Nick Sweeting d451636224
Update README.md 2024-05-08 21:28:34 -07:00
Nick Sweeting 208c16c611
Update README.md 2024-05-08 21:26:49 -07:00
Nick Sweeting 16d1b92fd6
add Runtipi as a self-hosted app provider option 2024-05-08 20:49:15 -07:00
Nick Sweeting b90ba6c909
change header auth to use X-ArchiveBox-API-Key so it doesnt conflict with other auth headers 2024-05-08 20:02:47 -07:00
Nick Sweeting 09360fd191
Update README.md 2024-05-08 17:53:11 -07:00
Nick Sweeting 4c5a3fba8b
more fixes for wget_output_path 2024-05-07 05:38:29 -07:00
Nick Sweeting f2729c9dc7
revert wget character limit back to windows, ascii is allows MORE not less 2024-05-07 05:24:16 -07:00
Nick Sweeting cf9ef88aa8
Path validation fixes for wget_output_path() (#1424) 2024-05-07 05:04:48 -07:00
Nick Sweeting 9b21ce490e
add workaround logic to catch paths that are too long or contain unprintable characters 2024-05-07 05:03:23 -07:00
Nick Sweeting f62cb5fb43
change wget to use stricter ascii filepath normalization 2024-05-07 05:03:01 -07:00
Nick Sweeting f770bba3cf
fix OSError 36 caused by checking for path that is too long to exist 2024-05-07 04:12:07 -07:00
Nick Sweeting ce42472732
Text Search and Filters don't work at the same time in the web UI #1316 (#1333) 2024-05-06 23:14:25 -07:00
Nick Sweeting ef856e8051
Merge branch 'dev' into issue1316 2024-05-06 23:14:16 -07:00
Nick Sweeting 27d5d1ddc8
revert queryset intersection back to union for search results 2024-05-06 23:13:52 -07:00
Nick Sweeting 664e09f0b4
Bump cryptography from 42.0.6 to 42.0.7 (#1422) 2024-05-06 23:12:21 -07:00
Nick Sweeting f472705d10
Change phrasing (#1419) 2024-05-06 23:11:26 -07:00
Nick Sweeting 3095265880
fix inner quote not escaped 2024-05-06 23:10:59 -07:00
dependabot[bot] 60df0c3137
Bump cryptography from 42.0.6 to 42.0.7
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.6 to 42.0.7.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.6...42.0.7)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-05-06 22:11:05 +00:00
Nick Sweeting 32aea66913
Fix quotation (#1421) 2024-05-06 12:16:37 -07:00
Brandl 8ccd606973
Fix quotation
Fixes:
=> ERROR [stage-0 22/23] RUN "/app"/bin/docker_entrypoint.sh version 2>&1 | tee -a /VERSION.txt                                       1.7s 
------                                                                                                                                      
 > [stage-0 22/23] RUN "/app"/bin/docker_entrypoint.sh version 2>&1 | tee -a /VERSION.txt:                                                  
1.665 Traceback (most recent call last):                                                                                                    
1.665   File "/usr/local/bin/archivebox", line 5, in <module>                                                                               
1.665     from archivebox.cli import main                                                                                                   
1.665   File "/app/archivebox/cli/__init__.py", line 83, in <module>
1.665     SUBCOMMANDS = list_subcommands()
1.665                   ^^^^^^^^^^^^^^^^^^
1.665   File "/app/archivebox/cli/__init__.py", line 43, in list_subcommands
1.665     module = import_module('.archivebox_{}'.format(subcommand), __package__)
1.665              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.665   File "/usr/local/lib/python3.11/importlib/__init__.py", line 126, in import_module
1.665     return _bootstrap._gcd_import(name[level:], package, level)
1.665            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
1.666   File "/app/archivebox/cli/archivebox_add.py", line 11, in <module>
1.666     from ..main import add
1.666   File "/app/archivebox/main.py", line 233
1.666     f'COMMIT_HASH={COMMIT_HASH[:7] if COMMIT_HASH else 'unknown'}',
1.666                                                         ^^^^^^^
1.666 SyntaxError: f-string: expecting '}'
2024-05-06 21:04:14 +02:00
Nick Sweeting 94ee394339
Add ability to view configuration in Admin UI (ability to edit coming later...) (#1420) 2024-05-06 11:25:37 -07:00
Nick Sweeting 027c029316
redact passwords, keys, and secret tokens in admin UI 2024-05-06 11:06:42 -07:00
Nick Sweeting 8667ed29f1
improve API webhooks helptext and change app_label to API 2024-05-06 08:11:01 -07:00
Evan Boehs f998647350
change phrasing 2024-05-06 10:32:36 -04:00
Nick Sweeting 29c794925e
Add Webhooks support to new REST API (#1418) 2024-05-06 07:17:36 -07:00
Nick Sweeting 641a07b08a
bump dependencies 2024-05-06 07:14:22 -07:00
Nick Sweeting c30d697904
archivebox/package-lock.json 2024-05-06 07:14:18 -07:00
Nick Sweeting d782bafe2e
fix storages missing stackfiles error 2024-05-06 07:14:01 -07:00
Nick Sweeting 47666ec26b
show webhooks config in django admin 2024-05-06 07:13:54 -07:00
Nick Sweeting f067451267
fix django timezone.utc removed in 5.0 2024-05-06 07:13:25 -07:00
Nick Sweeting c7fc9c004f
add django-signal-webhooks 2024-05-06 06:58:03 -07:00
Nick Sweeting 08931edbe0
Add Railway self-host option (#1417) 2024-05-06 06:27:33 -07:00
Nick Sweeting 9dc7065506
Update FUNDING.yml 2024-05-06 06:12:46 -07:00
Nick Sweeting 12a990c178
Update FUNDING.yml 2024-05-06 06:10:43 -07:00
Evan Boehs f95b369f0d
Add railway 2024-05-06 08:57:27 -04:00
Nick Sweeting 90b7a7f40d
fix TrueNAS chart link 2024-05-05 01:53:40 -07:00
Nick Sweeting 3805a1730d
add 0002 api migration 2024-04-30 21:45:02 -07:00
Nick Sweeting 2094ed842b
fix django-stubs in pyproject.toml 2024-04-30 21:43:51 -07:00
Nick Sweeting 8d7dd47c43
stop pushing version tags by default on docker build 2024-04-30 21:43:44 -07:00
Nick Sweeting e20eb52f15
fix COMMIT_HASH missing error 2024-04-30 21:43:22 -07:00
Nick Sweeting 17b35496cc
fix missing bottle in dev dependencies 2024-04-25 22:25:58 -07:00
Nick Sweeting 1c9f9fe1b7
remove playwright from prod requirements.txt 2024-04-25 21:39:48 -07:00
Nick Sweeting 8f3901dd36
fix edge case in docker_entrypoint where crontabs glob fails to expand 2024-04-25 21:38:15 -07:00
Nick Sweeting 18a5b6e99c
move EXTERNAL_LOCATIONS to data locations in version output 2024-04-25 21:36:43 -07:00
Nick Sweeting 6a6ae7468e
fix lint errors 2024-04-25 21:36:11 -07:00
Nick Sweeting 1d9e7ec66a
declare no-install-recommends at top of dockerfile and remove armv7 build steps 2024-04-25 21:35:09 -07:00
Nick Sweeting 8cbc1a4adc
remove pdm lockfile from git 2024-04-25 18:03:44 -07:00
Nick Sweeting 4a5ad32040
add django-requests-tracker 2024-04-25 18:02:01 -07:00
Nick Sweeting af669d2f37
rename api files for clarity 2024-04-25 05:55:47 -07:00
Nick Sweeting 716ba52450
bump django to version 5.0 and all other requirements 2024-04-25 04:19:16 -07:00
Nick Sweeting 75153252dc
big overhaul of REST API, split into auth, core, and cli methods 2024-04-25 03:56:22 -07:00
Nick Sweeting e5aba0dc2e
fix urljoin bug causing multiple slashes to be merged into one 2024-04-24 19:41:31 -07:00
Nick Sweeting 6cb357e76c
fix fix_url_from_markdown assertion to be valid url 2024-04-24 19:41:11 -07:00
Nick Sweeting 128419f991
expand comment about markdown url trailing paren trimming 2024-04-24 17:50:18 -07:00
Nick Sweeting beb3932d80
replace uses of URL_REGEX with find_all_urls to handle markdown better 2024-04-24 17:45:45 -07:00
Nick Sweeting 3afdd3d96f
remove Docker arm/v7 auto-builds, rename :main to :stable 2024-04-24 16:55:14 -07:00
Nick Sweeting 463ea54616
remove references to main branch in favor of stable branch 2024-04-24 16:38:00 -07:00
Nick Sweeting c6d644be29
Merge branch 'dev' into issue1316 2024-04-24 16:24:16 -07:00
Nick Sweeting 8e9cfc8869
fix crontab symlinking to use glob instead of ls for iteration 2024-04-24 16:23:08 -07:00
Nick Sweeting 98c5e69203
bump lockfiles 2024-04-24 14:38:21 -07:00
Nick Sweeting 8dcfa93ec6
Merge branch 'main' into dev 2024-04-24 14:32:07 -07:00
Nick Sweeting e28f33fcd0
Update docker-compose.yml 2024-04-24 14:17:53 -07:00
Nick Sweeting 665a2e505f
fix the URL_REGEX used in generic_html parsers (#1396) 2024-04-23 19:54:04 -07:00
Nick Sweeting 17f40f3ada
Merge branch 'dev' into fix-URL_REGEX 2024-04-23 19:53:58 -07:00
Nick Sweeting c6f8a33a63
Update util.py 2024-04-23 19:53:18 -07:00
Nick Sweeting 24175f5b4a
REST API v1 using django-ninja (#1397) 2024-04-23 17:50:31 -07:00
Nick Sweeting a1a877f47f
bump pip_dist submodule 2024-04-23 17:48:53 -07:00
Nick Sweeting 63fc317229
minor pylint fixes in logging_util 2024-04-23 17:45:36 -07:00
Nick Sweeting 756e159dfe
add new bin/lock_pkgs.sh to generate package lockfiles consistently 2024-04-23 17:45:36 -07:00
Nick Sweeting 667cf38fc6
Update dependabot.yml 2024-04-12 17:19:01 -07:00
Nick Sweeting 11acc9ceea
Add Dockerfile labels needed for depandabot and Docker Extension marketplace 2024-04-12 14:16:55 -07:00
Nick Sweeting 55d6bde7db
Create dependabot.yml 2024-04-12 13:59:05 -07:00
Nick Sweeting bc0b0303ea
Create codeql.yml 2024-04-12 13:57:30 -07:00
Nick Sweeting 82b38df8ec
Update README.md 2024-04-11 15:50:37 -07:00
Nick Sweeting 8ced9fd4bb
Update README.md 2024-04-11 01:12:06 -07:00
longzai e4dc2701ef fix URL_REGEX 2 2024-04-11 15:51:55 +08:00
Nick Sweeting 99502bd928
Merge branch 'main' into main 2024-04-10 05:15:43 -07:00
Nick Sweeting b76875aab6
Update README.md 2024-04-10 02:45:30 -07:00
Nick Sweeting 9ad99d86c1
Update docker-compose.yml to add rclone remote storage example 2024-04-09 18:38:29 -07:00
Brandl 5f9aac18f2
api v1 2024-04-10 01:29:24 +02:00
longzai 4ae765ec27 fix the URL_REGEX used in generic_html parsers
Signed-off-by: longzai <437172242@qq.com>
2024-04-08 04:53:05 +08:00
Nick Sweeting 9d4cc361e6
Update docker-compose.yml 2024-03-27 20:15:27 -07:00
Neel Suthar 279883d6bb Text Search and Filters don't work at the same time in the web UI #1316
Making sure to return distinct results. Changing set operation to '&' to show the matching results from filters AND search term
2024-01-21 17:34:22 -06:00
63 changed files with 4818 additions and 1561 deletions

View file

@ -17,6 +17,11 @@ venv/
.venv-old/
.docker-venv/
node_modules/
chrome/
chromeprofile/
pdm.dev.lock
pdm.lock
docs/
build/

5
.github/FUNDING.yml vendored
View file

@ -1,3 +1,2 @@
github: pirate
patreon: theSquashSH
custom: ["https://hcb.hackclub.com/donations/start/archivebox", "https://paypal.me/NicholasSweeting"]
github: ["ArchiveBox", "pirate"]
custom: ["https://donate.archivebox.io", "https://paypal.me/NicholasSweeting"]

12
.github/dependabot.yml vendored Normal file
View file

@ -0,0 +1,12 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
- package-ecosystem: "pip" # See documentation for possible values
directory: "/"
target-branch: "dev"
schedule:
interval: "weekly"

92
.github/workflows/codeql.yml vendored Normal file
View file

@ -0,0 +1,92 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"
on:
push:
branches: [ "dev" ]
pull_request:
branches: [ "dev" ]
schedule:
- cron: '33 17 * * 6'
jobs:
analyze:
name: Analyze (${{ matrix.language }})
# Runner size impacts CodeQL analysis time. To learn more, please see:
# - https://gh.io/recommended-hardware-resources-for-running-codeql
# - https://gh.io/supported-runners-and-hardware-resources
# - https://gh.io/using-larger-runners (GitHub.com only)
# Consider using larger runners or machines with greater resources for possible analysis time improvements.
runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
timeout-minutes: ${{ (matrix.language == 'swift' && 120) || 360 }}
permissions:
# required for all workflows
security-events: write
# required to fetch internal or private CodeQL packs
packages: read
# only required for workflows in private repositories
actions: read
contents: read
strategy:
fail-fast: false
matrix:
include:
- language: python
build-mode: none
# CodeQL supports the following values keywords for 'language': 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'swift'
# Use `c-cpp` to analyze code written in C, C++ or both
# Use 'java-kotlin' to analyze code written in Java, Kotlin or both
# Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
# To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,
# see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.
# If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how
# your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
steps:
- name: Checkout repository
uses: actions/checkout@v4
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: ${{ matrix.language }}
build-mode: ${{ matrix.build-mode }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
# queries: security-extended,security-and-quality
# If the analyze step fails for one of the languages you are analyzing with
# "We were unable to automatically build your code", modify the matrix above
# to set the build mode to "manual" for that language. Then modify this step
# to build your code.
# Command-line programs to run using the OS shell.
# 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
- if: matrix.build-mode == 'manual'
run: |
echo 'If you are using a "manual" build mode for one or more of the' \
'languages you are analyzing, replace this with the commands to build' \
'your code, for example:'
echo ' make bootstrap'
echo ' make release'
exit 1
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
with:
category: "/language:${{matrix.language}}"

View file

@ -11,7 +11,7 @@ on:
env:
DOCKER_IMAGE: archivebox-ci
jobs:
buildx:
runs-on: ubuntu-latest
@ -24,21 +24,21 @@ jobs:
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
id: buildx
uses: docker/setup-buildx-action@v3
with:
version: latest
install: true
platforms: linux/amd64,linux/arm64,linux/arm/v7
platforms: linux/amd64,linux/arm64
- name: Builder instance name
run: echo ${{ steps.buildx.outputs.name }}
- name: Available platforms
run: echo ${{ steps.buildx.outputs.platforms }}
- name: Cache Docker layers
uses: actions/cache@v3
with:
@ -51,21 +51,27 @@ jobs:
uses: docker/login-action@v3
if: github.event_name != 'pull_request'
with:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
- name: Collect Docker tags
# https://github.com/docker/metadata-action
id: docker_meta
uses: docker/metadata-action@v5
with:
images: archivebox/archivebox,nikisweeting/archivebox
tags: |
# :stable
type=ref,event=branch
# :0.7.3
type=semver,pattern={{version}}
# :0.7
type=semver,pattern={{major}}.{{minor}}
# :sha-463ea54
type=sha
type=raw,value=latest,enable={{is_default_branch}}
# :latest
type=raw,value=latest,enable=${{ github.ref == format('refs/heads/{0}', 'stable') }}
- name: Build and push
id: docker_build
uses: docker/build-push-action@v5
@ -77,7 +83,7 @@ jobs:
tags: ${{ steps.docker_meta.outputs.tags }}
cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache-new
platforms: linux/amd64,linux/arm64,linux/arm/v7
platforms: linux/amd64,linux/arm64
- name: Image digest
run: echo ${{ steps.docker_build.outputs.digest }}
@ -88,7 +94,7 @@ jobs:
username: ${{ secrets.DOCKER_USERNAME }}
password: ${{ secrets.DOCKER_PASSWORD }}
repository: archivebox/archivebox
# This ugly bit is necessary if you don't want your cache to grow forever
# until it hits GitHub's limit of 5GB.
# Temp fix

8
.gitignore vendored
View file

@ -12,6 +12,11 @@ venv/
.docker-venv/
node_modules/
# Ignore dev lockfiles (should always be built fresh)
pdm.lock
pdm.dev.lock
requirements-dev.txt
# Packaging artifacts
.pdm-python
.pdm-build
@ -22,9 +27,6 @@ dist/
# Data folders
data/
data1/
data2/
data3/
data*/
output/

View file

@ -20,9 +20,23 @@ FROM python:3.11-slim-bookworm
LABEL name="archivebox" \
maintainer="Nick Sweeting <dockerfile@archivebox.io>" \
description="All-in-one personal internet archiving container" \
description="All-in-one self-hosted internet archiving solution" \
homepage="https://github.com/ArchiveBox/ArchiveBox" \
documentation="https://github.com/ArchiveBox/ArchiveBox/wiki/Docker#docker"
documentation="https://github.com/ArchiveBox/ArchiveBox/wiki/Docker" \
org.opencontainers.image.title="ArchiveBox" \
org.opencontainers.image.vendor="ArchiveBox" \
org.opencontainers.image.description="All-in-one self-hosted internet archiving solution" \
org.opencontainers.image.source="https://github.com/ArchiveBox/ArchiveBox" \
com.docker.image.source.entrypoint="Dockerfile" \
# TODO: release ArchiveBox as a Docker Desktop extension (requires these labels):
# https://docs.docker.com/desktop/extensions-sdk/architecture/metadata/
com.docker.desktop.extension.api.version=">= 1.4.7" \
com.docker.desktop.extension.icon="https://archivebox.io/icon.png" \
com.docker.extension.publisher-url="https://archivebox.io" \
com.docker.extension.screenshots='[{"alt": "Screenshot of Admin UI", "url": "https://github.com/ArchiveBox/ArchiveBox/assets/511499/e8e0b6f8-8fdf-4b7f-8124-c10d8699bdb2"}]' \
com.docker.extension.detailed-description='See here for detailed documentation: https://wiki.archivebox.io' \
com.docker.extension.changelog='See here for release notes: https://github.com/ArchiveBox/ArchiveBox/releases' \
com.docker.extension.categories='database,utility-tools'
ARG TARGETPLATFORM
ARG TARGETOS
@ -73,7 +87,9 @@ COPY --chown=root:root --chmod=755 package.json "$CODE_DIR/"
RUN grep '"version": ' "${CODE_DIR}/package.json" | awk -F'"' '{print $4}' > /VERSION.txt
# Force apt to leave downloaded binaries in /var/cache/apt (massively speeds up Docker builds)
RUN echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' > /etc/apt/apt.conf.d/keep-cache \
RUN echo 'Binary::apt::APT::Keep-Downloaded-Packages "1";' > /etc/apt/apt.conf.d/99keep-cache \
&& echo 'APT::Install-Recommends "0";' > /etc/apt/apt.conf.d/99no-intall-recommends \
&& echo 'APT::Install-Suggests "0";' > /etc/apt/apt.conf.d/99no-intall-suggests \
&& rm -f /etc/apt/apt.conf.d/docker-clean
# Print debug info about build and save it to disk, for human eyes only, not used by anything else
@ -106,10 +122,10 @@ RUN echo "[*] Setting up $ARCHIVEBOX_USER user uid=${DEFAULT_PUID}..." \
# Install system apt dependencies (adding backports to access more recent apt updates)
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT \
echo "[+] Installing APT base system dependencies for $TARGETPLATFORM..." \
&& echo 'deb https://deb.debian.org/debian bookworm-backports main contrib non-free' >> /etc/apt/sources.list.d/backports.list \
&& echo 'deb https://deb.debian.org/debian bookworm-backports main contrib non-free' > /etc/apt/sources.list.d/backports.list \
&& mkdir -p /etc/apt/keyrings \
&& apt-get update -qq \
&& apt-get install -qq -y -t bookworm-backports --no-install-recommends \
&& apt-get install -qq -y -t bookworm-backports \
# 1. packaging dependencies
apt-transport-https ca-certificates apt-utils gnupg2 curl wget \
# 2. docker and init system dependencies
@ -120,27 +136,13 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$T
######### Language Environments ####################################
# Install Node environment
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.npm,sharing=locked,id=npm-$TARGETARCH$TARGETVARIANT \
echo "[+] Installing Node $NODE_VERSION environment in $NODE_MODULES..." \
&& echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_${NODE_VERSION}.x nodistro main" >> /etc/apt/sources.list.d/nodejs.list \
&& curl -fsSL "https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key" | gpg --dearmor | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg \
&& apt-get update -qq \
&& apt-get install -qq -y -t bookworm-backports --no-install-recommends \
nodejs libatomic1 python3-minimal \
&& rm -rf /var/lib/apt/lists/* \
# Update NPM to latest version
&& npm i -g npm --cache /root/.npm \
# Save version info
&& ( \
which node && node --version \
&& which npm && npm --version \
&& echo -e '\n\n' \
) | tee -a /VERSION.txt
# Install Python environment
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.cache/pip,sharing=locked,id=pip-$TARGETARCH$TARGETVARIANT \
echo "[+] Setting up Python $PYTHON_VERSION runtime..." \
# && apt-get update -qq \
# && apt-get install -qq -y -t bookworm-backports --no-upgrade \
# python${PYTHON_VERSION} python${PYTHON_VERSION}-minimal python3-pip \
# && rm -rf /var/lib/apt/lists/* \
# tell PDM to allow using global system python site packages
# && rm /usr/lib/python3*/EXTERNALLY-MANAGED \
# create global virtual environment GLOBAL_VENV to use (better than using pip install --global)
@ -157,13 +159,34 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$T
&& echo -e '\n\n' \
) | tee -a /VERSION.txt
# Install Node environment
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.npm,sharing=locked,id=npm-$TARGETARCH$TARGETVARIANT \
echo "[+] Installing Node $NODE_VERSION environment in $NODE_MODULES..." \
&& echo "deb [signed-by=/etc/apt/keyrings/nodesource.gpg] https://deb.nodesource.com/node_${NODE_VERSION}.x nodistro main" >> /etc/apt/sources.list.d/nodejs.list \
&& curl -fsSL "https://deb.nodesource.com/gpgkey/nodesource-repo.gpg.key" | gpg --dearmor | gpg --dearmor -o /etc/apt/keyrings/nodesource.gpg \
&& apt-get update -qq \
&& apt-get install -qq -y -t bookworm-backports --no-upgrade libatomic1 \
&& apt-get install -y -t bookworm-backports --no-upgrade \
nodejs \
&& rm -rf /var/lib/apt/lists/* \
# Update NPM to latest version
&& npm i -g npm --cache /root/.npm \
# Save version info
&& ( \
which node && node --version \
&& which npm && npm --version \
&& echo -e '\n\n' \
) | tee -a /VERSION.txt
######### Extractor Dependencies ##################################
# Install apt dependencies
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.cache/pip,sharing=locked,id=pip-$TARGETARCH$TARGETVARIANT \
echo "[+] Installing APT extractor dependencies globally using apt..." \
&& apt-get update -qq \
&& apt-get install -qq -y -t bookworm-backports --no-install-recommends \
&& apt-get install -qq -y -t bookworm-backports \
curl wget git yt-dlp ffmpeg ripgrep \
# Packages we have also needed in the past:
# youtube-dl wget2 aria2 python3-pyxattr rtmpdump libfribidi-bin mpv \
@ -182,25 +205,21 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$T
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.cache/pip,sharing=locked,id=pip-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.cache/ms-playwright,sharing=locked,id=browsers-$TARGETARCH$TARGETVARIANT \
echo "[+] Installing Browser binary dependencies to $PLAYWRIGHT_BROWSERS_PATH..." \
&& apt-get update -qq \
&& apt-get install -qq -y -t bookworm-backports --no-install-recommends \
&& apt-get install -qq -y -t bookworm-backports \
fontconfig fonts-ipafont-gothic fonts-wqy-zenhei fonts-thai-tlwg fonts-khmeros fonts-kacst fonts-symbola fonts-noto fonts-freefont-ttf \
at-spi2-common fonts-liberation fonts-noto-color-emoji fonts-tlwg-loma-otf fonts-unifont libatk-bridge2.0-0 libatk1.0-0 libatspi2.0-0 libavahi-client3 \
libavahi-common-data libavahi-common3 libcups2 libfontenc1 libice6 libnspr4 libnss3 libsm6 libunwind8 \
libxaw7 libxcomposite1 libxdamage1 libxfont2 \
libxkbfile1 libxmu6 libxpm4 libxt6 x11-xkb-utils xfonts-encodings \
# xfonts-scalable xfonts-utils xserver-common xvfb \
# chrome can run without dbus/upower technically, it complains about missing dbus but should run ok anyway
# libxss1 dbus dbus-x11 upower \
# && service dbus start \
&& if [[ "$TARGETPLATFORM" == *amd64* || "$TARGETPLATFORM" == *arm64* ]]; then \
# install Chromium using playwright
pip install playwright \
&& cp -r /root/.cache/ms-playwright "$PLAYWRIGHT_BROWSERS_PATH" \
&& playwright install --with-deps chromium \
&& export CHROME_BINARY="$(python -c 'from playwright.sync_api import sync_playwright; print(sync_playwright().start().chromium.executable_path)')"; \
else \
# fall back to installing Chromium via apt-get on platforms not supported by playwright (e.g. risc, ARMv7, etc.)
# apt-get install -qq -y -t bookworm-backports --no-install-recommends \
# chromium \
# && export CHROME_BINARY="$(which chromium)"; \
echo 'armv7 no longer supported in versions after v0.7.3' \
exit 1; \
fi \
# install Chromium using playwright
&& pip install playwright \
&& cp -r /root/.cache/ms-playwright "$PLAYWRIGHT_BROWSERS_PATH" \
&& playwright install chromium \
&& export CHROME_BINARY="$(python -c 'from playwright.sync_api import sync_playwright; print(sync_playwright().start().chromium.executable_path)')" \
&& rm -rf /var/lib/apt/lists/* \
&& ln -s "$CHROME_BINARY" /usr/bin/chromium-browser \
&& mkdir -p "/home/${ARCHIVEBOX_USER}/.config/chromium/Crash Reports/pending/" \
@ -233,8 +252,8 @@ COPY --chown=root:root --chmod=755 "./pyproject.toml" "requirements.txt" "$CODE_
RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$TARGETVARIANT --mount=type=cache,target=/root/.cache/pip,sharing=locked,id=pip-$TARGETARCH$TARGETVARIANT \
echo "[+] Installing PIP ArchiveBox dependencies from requirements.txt for ${TARGETPLATFORM}..." \
&& apt-get update -qq \
&& apt-get install -qq -y -t bookworm-backports --no-install-recommends \
build-essential \
&& apt-get install -qq -y -t bookworm-backports \
# build-essential \
libssl-dev libldap2-dev libsasl2-dev \
python3-ldap python3-msgpack python3-mutagen python3-regex python3-pycryptodome procps \
# && ln -s "$GLOBAL_VENV" "$APP_VENV" \
@ -244,8 +263,8 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$T
# && pdm export -o requirements.txt --without-hashes \
# && source $GLOBAL_VENV/bin/activate \
&& pip install -r requirements.txt \
&& apt-get purge -y \
build-essential \
# && apt-get purge -y \
# build-essential \
&& apt-get autoremove -y \
&& rm -rf /var/lib/apt/lists/*
@ -255,7 +274,7 @@ RUN --mount=type=cache,target=/var/cache/apt,sharing=locked,id=apt-$TARGETARCH$T
echo "[*] Installing PIP ArchiveBox package from $CODE_DIR..." \
# && apt-get update -qq \
# install C compiler to build deps on platforms that dont have 32-bit wheels available on pypi
# && apt-get install -qq -y -t bookworm-backports --no-install-recommends \
# && apt-get install -qq -y -t bookworm-backports \
# build-essential \
# INSTALL ARCHIVEBOX python package globally from CODE_DIR, with all optional dependencies
&& pip install -e "$CODE_DIR"[sonic,ldap] \

View file

@ -124,8 +124,8 @@ curl -fsSL 'https://get.archivebox.io' | sh
## Key Features
- [**Free & open source**](https://github.com/ArchiveBox/ArchiveBox/blob/dev/LICENSE), doesn't require signing up online, stores all data locally
- [**Powerful, intuitive command line interface**](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#CLI-Usage) with [modular optional dependencies](#dependencies)
- [**Free & open source**](https://github.com/ArchiveBox/ArchiveBox/blob/dev/LICENSE), own your own data & maintain your privacy by self-hosting
- [**Powerful CLI**](https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#CLI-Usage) with [modular dependencies](#dependencies) and [support for Google Drive/NFS/SMB/S3/B2/etc.](https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-Up-Storage)
- [**Comprehensive documentation**](https://github.com/ArchiveBox/ArchiveBox/wiki), [active development](https://github.com/ArchiveBox/ArchiveBox/wiki/Roadmap), and [rich community](https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community)
- [**Extracts a wide variety of content out-of-the-box**](https://github.com/ArchiveBox/ArchiveBox/issues/51): [media (yt-dlp), articles (readability), code (git), etc.](#output-formats)
- [**Supports scheduled/realtime importing**](https://github.com/ArchiveBox/ArchiveBox/wiki/Scheduled-Archiving) from [many types of sources](#input-formats)
@ -152,8 +152,8 @@ ArchiveBox is free for everyone to self-host, but we also provide support, secur
- **Governments:**
`snapshoting public service sites`, `recordkeeping compliance`
> ***[Contact us](https://zulip.archivebox.io/#narrow/stream/167-enterprise/topic/welcome/near/1191102)** if your org wants help using ArchiveBox professionally.*
> We offer: setup & support, hosting, custom features, security, hashing & audit logging/chain-of-custody, etc.
> ***[Contact us](https://zulip.archivebox.io/#narrow/stream/167-enterprise/topic/welcome/near/1191102)** if your org wants help using ArchiveBox professionally.* (we are also seeking [grant funding](https://github.com/ArchiveBox/ArchiveBox/issues/1126#issuecomment-1487431394))
> We offer: setup & support, CAPTCHA/ratelimit unblocking, SSO, audit logging/chain-of-custody, and more
> *ArchiveBox has 🏛️ 501(c)(3) [nonprofit status](https://hackclub.com/hcb/) and all our work supports open-source development.*
<br/>
@ -407,11 +407,12 @@ See <a href="#%EF%B8%8F-cli-usage">below</a> for usage examples using the CLI, W
> *Warning: These are contributed by external volunteers and may lag behind the official `pip` channel.*
<ul>
<li>TrueNAS: <a href="https://truecharts.org/charts/incubator/archivebox/">Official ArchiveBox TrueChart</a> / <a href="https://dev.to/finloop/setting-up-archivebox-on-truenas-scale-1788">Custom App Guide</a></li>
<li>TrueNAS: <a href="https://truecharts.org/charts/stable/archivebox/">Official ArchiveBox TrueChart</a> / <a href="https://dev.to/finloop/setting-up-archivebox-on-truenas-scale-1788">Custom App Guide</a></li>
<li><a href="https://unraid.net/community/apps?q=archivebox#r">UnRaid</a></li>
<li><a href="https://github.com/YunoHost-Apps/archivebox_ynh">Yunohost</a></li>
<li><a href="https://www.cloudron.io/store/io.archivebox.cloudronapp.html">Cloudron</a></li>
<li><a href="https://github.com/ArchiveBox/ArchiveBox/pull/922/files#diff-00f0606e18b2618c3cc1667ca7c2b703b537af690ca71eba1330633587dcb1ee">AppImage</a></li>
<li><a href="https://runtipi.io/docs/apps-available#:~:text=for%20AI%20Chats.-,ArchiveBox,Open%20source%20self%2Dhosted%20web%20archiving.,-Atuin%20Server">Runtipi</a></li>
<li><a href="https://github.com/ArchiveBox/ArchiveBox/issues/986">Umbrel</a> (need contributors...)</li>
<li>More: <a href="https://github.com/ArchiveBox/ArchiveBox/issues/new"><i>contribute another distribution...!</i></a></li>
@ -445,6 +446,9 @@ Other providers of paid ArchiveBox hosting (not officially endorsed):<br/>
<li><a href="https://fly.io/">
<img src="https://img.shields.io/badge/Unmanaged_App-Fly.io-%239a2de6.svg?style=flat" height="22px"/>
</a> (USD $10-50+/mo, <a href="https://fly.io/docs/hands-on/start/">instructions</a>)</li>
<li><a href="https://railway.app/template/2Vvhmy">
<img src="https://img.shields.io/badge/Unmanaged_App-Railway-%23A11BE6.svg?style=flat" height="22px"/>
</a> (USD $0-5+/mo)</li>
<li><a href="https://aws.amazon.com/marketplace/pp/Linnovate-Open-Source-Innovation-Support-For-Archi/B08RVW6MJ2"><img src="https://img.shields.io/badge/Unmanaged_VPS-AWS-%23ee8135.svg?style=flat" height="22px"/></a> (USD $60-200+/mo)</li>
<li><a href="https://azuremarketplace.microsoft.com/en-us/marketplace/apps/meanio.archivebox?ocid=gtmrewards_whatsnewblog_archivebox_vol118"><img src="https://img.shields.io/badge/Unmanaged_VPS-Azure-%237cb300.svg?style=flat" height="22px"/></a> (USD $60-200+/mo)</li>
<br/>
@ -669,7 +673,7 @@ docker run -it -v $PWD:/data archivebox/archivebox add --depth=1 'https://exampl
```bash
# archivebox add --help
archivebox add 'https://example.com/some/page'
archivebox add < ~/Downloads/firefox_bookmarks_export.html
archivebox add --parser=generic_rss < ~/Downloads/some_feed.xml
archivebox add --depth=1 'https://news.ycombinator.com#2020-12-12'
echo 'http://example.com' | archivebox add
echo 'any text with <a href="https://example.com">urls</a> in it' | archivebox add
@ -865,6 +869,7 @@ Each snapshot subfolder <code>data/archive/TIMESTAMP/</code> includes a static <
<h4>Learn More</h4>
<ul>
<li><a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-Up-Storage">Wiki: Setting Up Storage (SMB, NFS, S3, B2, Google Drive, etc.)</a></li>
<li><a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#Disk-Layout">Wiki: Usage (Disk Layout)</a></li>
<li><a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Usage#large-archives">Wiki: Usage (Large Archives)</a></li>
<li><a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview#output-folder">Wiki: Security Overview (Output Folder)</a></li>
@ -1007,7 +1012,7 @@ https://127.0.0.1:8000/archive/*
### Working Around Sites that Block Archiving
For various reasons, many large sites (Reddit, Twitter, Cloudflare, etc.) actively block archiving or bots in general. There are a number of approaches to work around this.
For various reasons, many large sites (Reddit, Twitter, Cloudflare, etc.) actively block archiving or bots in general. There are a number of approaches to work around this, and we also provide <a href="https://docs.monadical.com/s/archivebox-consulting-services">consulting services</a> to help here.
<br/>
<details>
@ -1018,7 +1023,7 @@ For various reasons, many large sites (Reddit, Twitter, Cloudflare, etc.) active
<ul>
<li>Set <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#curl_user_agent"><code>CHROME_USER_AGENT</code>, <code>WGET_USER_AGENT</code>, <code>CURL_USER_AGENT</code></a> to impersonate a real browser (by default, ArchiveBox reveals that it's a bot when using the default user agent settings)</li>
<li>Set up a logged-in browser session for archiving using <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install#setting-up-a-chromium-user-profile"><code>CHROME_USER_DATA_DIR</code> &amp; <code>COOKIES_FILE</code></a></li>
<li>Rewrite your URLs before archiving to swap in an alternative frontend thats more bot-friendly e.g.<br>
<li>Rewrite your URLs before archiving to swap in alternative frontends that are more bot-friendly e.g.<br>
<code>reddit.com/some/url</code> -&gt; <code>teddit.net/some/url</code>: <a href="https://github.com/mendel5/alternative-front-ends">https://github.com/mendel5/alternative-front-ends</a></li>
</ul>
@ -1174,7 +1179,7 @@ ArchiveBox's stance is that duplication of other people's content is only ethica
- A. doesn't deprive the original creators of revenue and
- B. is responsibly curated by an individual/institution.
In the U.S., <a href="https://guides.library.oregonstate.edu/copyright/libraries">libraries, researchers, and archivists</a> are allowed to duplicate copyrighted materials under <a href="https://libguides.ala.org/copyright/fairuse">"fair use"</a> for <a href="https://guides.cuny.edu/cunyfairuse/librarians#:~:text=One%20of%20these%20specified%20conditions,may%20be%20liable%20for%20copyright">private study, scholarship, or research</a>. Archive.org's preservation work is covered under this exemption, as they are as a non-profit providing public service, and they respond to <a href="https://cardozoaelj.com/2015/03/20/use-of-copyright-law-to-take-down-revenge-porn/">unethical content</a>/<a href="https://help.archive.org/help/rights/">DMCA</a>/<a href="https://gdpr.eu/right-to-be-forgotten/#:~:text=An%20individual%20has%20the%20right,that%20individual%20withdraws%20their%20consent.">GDPR</a> removal requests.
In the U.S., <a href="https://guides.library.oregonstate.edu/copyright/libraries">libraries, researchers, and archivists</a> are allowed to duplicate copyrighted materials under <a href="https://libguides.ala.org/copyright/fairuse">"fair use"</a> for <a href="https://guides.cuny.edu/cunyfairuse/librarians#:~:text=One%20of%20these%20specified%20conditions,may%20be%20liable%20for%20copyright">private study, scholarship, or research</a>. Archive.org's non-profit preservation work is <a href="https://blog.archive.org/2024/03/01/fair-use-in-action-at-the-internet-archive/">covered under fair use</a> in the US, and they properly handle <a href="https://cardozoaelj.com/2015/03/20/use-of-copyright-law-to-take-down-revenge-porn/">unethical content</a>/<a href="https://help.archive.org/help/rights/">DMCA</a>/<a href="https://gdpr.eu/right-to-be-forgotten/#:~:text=An%20individual%20has%20the%20right,that%20individual%20withdraws%20their%20consent.">GDPR</a> removal requests to maintain good standing in the eyes of the law.
As long as you A. don't try to profit off pirating copyrighted content and B. have processes in place to respond to removal requests, many countries allow you to use sofware like ArchiveBox to ethically and responsibly archive any web content you can view. That being said, ArchiveBox is not liable for how you choose to operate the software. You must research your own local laws and regulations, and get proper legal council if you plan to host a public instance (start by putting your DMCA/GDPR contact info in <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#footer_info"><code>FOOTER_INFO</code></a> and changing your instance's branding using <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#custom_templates_dir"><code>CUSTOM_TEMPLATES_DIR</code></a>).
@ -1187,21 +1192,25 @@ As long as you A. don't try to profit off pirating copyrighted content and B. ha
<img src="https://github.com/ArchiveBox/ArchiveBox/assets/511499/4cac62a9-e8fb-425b-85a3-ca644aa6dd42" width="5%" align="right" alt="comparison" style="float: right"/>
> **Check out our [community wiki](https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community) for a list of web archiving tools and orgs.**
> **Check out our [community wiki](https://github.com/ArchiveBox/ArchiveBox/wiki/Web-Archiving-Community) for a list of alternative web archiving tools and orgs.**
A variety of open and closed-source archiving projects exist, but few provide a nice UI and CLI to manage a large, high-fidelity collection over time.
ArchiveBox gained momentum in the internet archiving industry because it uniquely combines 3 things:
- **it's distributed:** users own their data instead of entrusting it to one big central provider
- **it's future-proof:** saving in *multiple formats* and extracting out raw TXT, PNG, PDF, MP4, etc. files
- **it's extensible:** with powerful APIs, flexible storage, and a big community adding new extractors regularly
<br/>
<details>
<summary><i>Click to read about how we differ from other centralized archiving services and open source tools...</i></summary><br/>
<summary><i>Expand for a more direct comparison to Archive.org and specific open-source alternatives...</i></summary><br/>
ArchiveBox tries to be a robust, set-and-forget archiving solution suitable for archiving RSS feeds, bookmarks, or your entire browsing history (beware, it may be too big to store), including private/authenticated content that you wouldn't otherwise share with a centralized service.
ArchiveBox tries to be a robust, set-and-forget archiving solution suitable for archiving RSS feeds, bookmarks, or your entire browsing history (beware, it may be too big to store), including private/authenticated content that you wouldn't otherwise share with a centralized service like Archive.org.
<h3>Comparison With Centralized Public Archives</h3>
Not all content is suitable to be archived in a centralized collection, whether because it's private, copyrighted, too large, or too complex. ArchiveBox hopes to fill that gap.
Not all content is suitable to be archived on a centralized, publicly accessible platform. Archive.org doesn't offer the ability to save things behind login walls for good reason, as the content may not have been intended for a public audience. ArchiveBox exists to fill that gap by letting everyone save what they have access to on an individual basis, and to encourage decentralized archiving that's less succeptible to censorship or natural disasters.
By having each user store their own content locally, we can save much larger portions of everyone's browsing history than a shared centralized service would be able to handle. The eventual goal is to work towards federated archiving where users can share portions of their collections with each other.
By having users store their content locally or within their organizations, we can also save much larger portions of the internet than a centralized service has the disk capcity handle. The eventual goal is to work towards federated archiving where users can share portions of their collections with each other, and with central archives on a case-by-case basis.
<h3>Comparison With Other Self-Hosted Archiving Options</h3>
@ -1251,7 +1260,7 @@ ArchiveBox is neither the highest fidelity nor the simplest tool available for s
**Need help building a custom archiving solution?**
> ✨ **[Hire the team that built Archivebox](https://zulip.archivebox.io/#narrow/stream/167-enterprise/topic/welcome/near/1191102) to work on your project.** ([@ArchiveBoxApp](https://twitter.com/ArchiveBoxApp))
> ✨ **[Hire the team that built Archivebox](https://zulip.archivebox.io/#narrow/stream/167-enterprise/topic/welcome/near/1191102) to solve archiving for your org.** ([@ArchiveBoxApp](https://twitter.com/ArchiveBoxApp))
<br/>
@ -1264,9 +1273,11 @@ ArchiveBox is neither the highest fidelity nor the simplest tool available for s
<img src="https://read-the-docs-guidelines.readthedocs-hosted.com/_images/logo-dark.png" width="13%" align="right" style="float: right"/>
We use the [GitHub wiki system](https://github.com/ArchiveBox/ArchiveBox/wiki) and [Read the Docs](https://archivebox.readthedocs.io/en/latest/) (WIP) for documentation.
We use the [ArchiveBox GitHub Wiki](https://github.com/ArchiveBox/ArchiveBox/wiki) for documentation.
You can also access the docs locally by looking in the [`ArchiveBox/docs/`](https://github.com/ArchiveBox/ArchiveBox/wiki/Home) folder.
<sub>There is also a mirror available on <a href="https://archivebox.readthedocs.io/en/latest/">Read the Docs</a> (though it's sometimes outdated).</sub>
> ✏️ You can submit docs changes & suggestions in our dedicated repo [`ArchiveBox/docs`](https://github.com/ArchiveBox/docs).
## Getting Started
@ -1277,16 +1288,19 @@ You can also access the docs locally by looking in the [`ArchiveBox/docs/`](http
- [Configuration](https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration)
- [Supported Sources](https://github.com/ArchiveBox/ArchiveBox/wiki/Quickstart#2-get-your-list-of-urls-to-archive)
- [Supported Outputs](https://github.com/ArchiveBox/ArchiveBox/wiki#can-save-these-things-for-each-site)
- [Scheduled Archiving](https://github.com/ArchiveBox/ArchiveBox/wiki/Scheduled-Archiving)
## Advanced
- [Troubleshooting](https://github.com/ArchiveBox/ArchiveBox/wiki/Troubleshooting)
- [Scheduled Archiving](https://github.com/ArchiveBox/ArchiveBox/wiki/Scheduled-Archiving)
- [Publishing Your Archive](https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive)
- [Chromium Install](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install)
- [Cookies & Sessions Setup](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install#setting-up-a-chromium-user-profile)
- [Security Overview](https://github.com/ArchiveBox/ArchiveBox/wiki/Security-Overview)
- [Cookies & Sessions Setup](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install#setting-up-a-chromium-user-profile) (archiving sites that require logins)
- [Setting up the Search Backends](https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-up-Search) (choosing ripgrep, Sonic, or FTS5)
- [Setting up Local/Remote Storages](https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-up-Storage) (S3/B2/Google Drive/SMB/NFS/etc.)
- [Setting up Authentication & Permissions](https://github.com/ArchiveBox/ArchiveBox/wiki/Setting-up-Authentication) (SSO/LDAP/OAuth/API Keys/etc.)
- [Publishing Your Archive](https://github.com/ArchiveBox/ArchiveBox/wiki/Publishing-Your-Archive) (sharing your archive server with others)
- [Chromium Install Options](https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install) (installing and configuring ArchiveBox's Chrome)
- [Upgrading or Merging Archives](https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives)
- [Troubleshooting](https://github.com/ArchiveBox/ArchiveBox/wiki/Troubleshooting)
## Developers

View file

@ -1 +1,7 @@
__package__ = 'archivebox'
# monkey patch django timezone to add back utc (it was removed in Django 5.0)
import datetime
from django.utils import timezone
timezone.utc = datetime.timezone.utc

View file

@ -0,0 +1 @@
__package__ = 'archivebox.api'

7
archivebox/api/apps.py Normal file
View file

@ -0,0 +1,7 @@
__package__ = 'archivebox.api'
from django.apps import AppConfig
class APIConfig(AppConfig):
name = 'api'

107
archivebox/api/auth.py Normal file
View file

@ -0,0 +1,107 @@
__package__ = 'archivebox.api'
from typing import Optional
from django.http import HttpRequest
from django.contrib.auth import login
from django.contrib.auth import authenticate
from django.contrib.auth.models import AbstractBaseUser
from ninja.security import HttpBearer, APIKeyQuery, APIKeyHeader, HttpBasicAuth, django_auth_superuser
def auth_using_token(token, request: Optional[HttpRequest]=None) -> Optional[AbstractBaseUser]:
"""Given an API token string, check if a corresponding non-expired APIToken exists, and return its user"""
from api.models import APIToken # lazy import model to avoid loading it at urls.py import time
user = None
submitted_empty_form = token in ('string', '', None)
if submitted_empty_form:
user = request.user # see if user is authed via django session and use that as the default
else:
try:
token = APIToken.objects.get(token=token)
if token.is_valid():
user = token.user
except APIToken.DoesNotExist:
pass
if not user:
print('[❌] Failed to authenticate API user using API Key:', request)
return None
def auth_using_password(username, password, request: Optional[HttpRequest]=None) -> Optional[AbstractBaseUser]:
"""Given a username and password, check if they are valid and return the corresponding user"""
user = None
submitted_empty_form = (username, password) in (('string', 'string'), ('', ''), (None, None))
if submitted_empty_form:
user = request.user # see if user is authed via django session and use that as the default
else:
user = authenticate(
username=username,
password=password,
)
if not user:
print('[❌] Failed to authenticate API user using API Key:', request)
return user
### Base Auth Types
class APITokenAuthCheck:
"""The base class for authentication methods that use an api.models.APIToken"""
def authenticate(self, request: HttpRequest, key: Optional[str]=None) -> Optional[AbstractBaseUser]:
user = auth_using_token(
token=key,
request=request,
)
if user is not None:
login(request, user, backend='django.contrib.auth.backends.ModelBackend')
return user
class UserPassAuthCheck:
"""The base class for authentication methods that use a username & password"""
def authenticate(self, request: HttpRequest, username: Optional[str]=None, password: Optional[str]=None) -> Optional[AbstractBaseUser]:
user = auth_using_password(
username=username,
password=password,
request=request,
)
if user is not None:
login(request, user, backend='django.contrib.auth.backends.ModelBackend')
return user
### Django-Ninja-Provided Auth Methods
class HeaderTokenAuth(APITokenAuthCheck, APIKeyHeader):
"""Allow authenticating by passing X-API-Key=xyz as a request header"""
param_name = "X-ArchiveBox-API-Key"
class BearerTokenAuth(APITokenAuthCheck, HttpBearer):
"""Allow authenticating by passing Bearer=xyz as a request header"""
pass
class QueryParamTokenAuth(APITokenAuthCheck, APIKeyQuery):
"""Allow authenticating by passing api_key=xyz as a GET/POST query parameter"""
param_name = "api_key"
class UsernameAndPasswordAuth(UserPassAuthCheck, HttpBasicAuth):
"""Allow authenticating by passing username & password via HTTP Basic Authentication (not recommended)"""
pass
### Enabled Auth Methods
API_AUTH_METHODS = [
HeaderTokenAuth(),
BearerTokenAuth(),
QueryParamTokenAuth(),
django_auth_superuser,
UsernameAndPasswordAuth(),
]

View file

@ -0,0 +1,29 @@
# Generated by Django 4.2.11 on 2024-04-25 04:19
import api.models
from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion
import uuid
class Migration(migrations.Migration):
initial = True
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
operations = [
migrations.CreateModel(
name='APIToken',
fields=[
('id', models.UUIDField(default=uuid.uuid4, editable=False, primary_key=True, serialize=False)),
('token', models.CharField(default=api.models.generate_secret_token, max_length=32, unique=True)),
('created', models.DateTimeField(auto_now_add=True)),
('expires', models.DateTimeField(blank=True, null=True)),
('user', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to=settings.AUTH_USER_MODEL)),
],
),
]

View file

@ -0,0 +1,17 @@
# Generated by Django 5.0.4 on 2024-04-26 05:28
from django.db import migrations
class Migration(migrations.Migration):
dependencies = [
('api', '0001_initial'),
]
operations = [
migrations.AlterModelOptions(
name='apitoken',
options={'verbose_name': 'API Key', 'verbose_name_plural': 'API Keys'},
),
]

View file

63
archivebox/api/models.py Normal file
View file

@ -0,0 +1,63 @@
__package__ = 'archivebox.api'
import uuid
import secrets
from datetime import timedelta
from django.conf import settings
from django.db import models
from django.utils import timezone
from django_stubs_ext.db.models import TypedModelMeta
def generate_secret_token() -> str:
# returns cryptographically secure string with len() == 32
return secrets.token_hex(16)
class APIToken(models.Model):
id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
user = models.ForeignKey(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
token = models.CharField(max_length=32, default=generate_secret_token, unique=True)
created = models.DateTimeField(auto_now_add=True)
expires = models.DateTimeField(null=True, blank=True)
class Meta(TypedModelMeta):
verbose_name = "API Key"
verbose_name_plural = "API Keys"
def __str__(self) -> str:
return self.token
def __repr__(self) -> str:
return f'<APIToken user={self.user.username} token=************{self.token[-4:]}>'
def __json__(self) -> dict:
return {
"TYPE": "APIToken",
"id": str(self.id),
"user_id": str(self.user.id),
"user_username": self.user.username,
"token": self.token,
"created": self.created.isoformat(),
"expires": self.expires_as_iso8601,
}
@property
def expires_as_iso8601(self):
"""Returns the expiry date of the token in ISO 8601 format or a date 100 years in the future if none."""
expiry_date = self.expires or (timezone.now() + timedelta(days=365 * 100))
return expiry_date.isoformat()
def is_valid(self, for_date=None):
for_date = for_date or timezone.now()
if self.expires and self.expires < for_date:
return False
return True

30
archivebox/api/tests.py Normal file
View file

@ -0,0 +1,30 @@
__package__ = 'archivebox.api'
from django.test import TestCase
from ninja.testing import TestClient
from .routes_cli import router
class ArchiveBoxCLIAPITestCase(TestCase):
def setUp(self):
self.client = TestClient(router)
def test_add_endpoint(self):
response = self.client.post("/add", json={"urls": ["http://example.com"], "tag": "testTag1,testTag2"})
self.assertEqual(response.status_code, 200)
self.assertTrue(response.json()["success"])
def test_remove_endpoint(self):
response = self.client.post("/remove", json={"filter_patterns": ["http://example.com"]})
self.assertEqual(response.status_code, 200)
self.assertTrue(response.json()["success"])
def test_update_endpoint(self):
response = self.client.post("/update", json={})
self.assertEqual(response.status_code, 200)
self.assertTrue(response.json()["success"])
def test_list_all_endpoint(self):
response = self.client.post("/list_all", json={})
self.assertEqual(response.status_code, 200)
self.assertTrue(response.json()["success"])

17
archivebox/api/urls.py Normal file
View file

@ -0,0 +1,17 @@
__package__ = 'archivebox.api'
from django.urls import path
from django.views.generic.base import RedirectView
from .v1_api import urls as v1_api_urls
urlpatterns = [
path("", RedirectView.as_view(url='/api/v1')),
path("v1/", v1_api_urls),
path("v1", RedirectView.as_view(url='/api/v1/docs')),
# ... v2 can be added here ...
# path("v2/", v2_api_urls),
# path("v2", RedirectView.as_view(url='/api/v2/docs')),
]

111
archivebox/api/v1_api.py Normal file
View file

@ -0,0 +1,111 @@
__package__ = 'archivebox.api'
from io import StringIO
from traceback import format_exception
from contextlib import redirect_stdout, redirect_stderr
from django.http import HttpRequest, HttpResponse
from django.core.exceptions import ObjectDoesNotExist, EmptyResultSet, PermissionDenied
from ninja import NinjaAPI, Swagger
# TODO: explore adding https://eadwincode.github.io/django-ninja-extra/
from api.auth import API_AUTH_METHODS
from ..config import VERSION, COMMIT_HASH
COMMIT_HASH = COMMIT_HASH or 'unknown'
html_description=f'''
<h3>Welcome to your ArchiveBox server's REST API <code>[v1 ALPHA]</code> homepage!</h3>
<br/>
<i><b>WARNING: This API is still in an early development stage and may change!</b></i>
<br/>
<ul>
<li> Manage your server: <a href="/admin/api/"><b>Setup API Keys</b></a>, <a href="/admin/">Go to your Server Admin UI</a>, <a href="/">Go to your Snapshots list</a>
<li>💬 Ask questions and get help here: <a href="https://zulip.archivebox.io">ArchiveBox Chat Forum</a></li>
<li>🐞 Report API bugs here: <a href="https://github.com/ArchiveBox/ArchiveBox/issues">Github Issues</a></li>
<li>📚 ArchiveBox Documentation: <a href="https://github.com/ArchiveBox/ArchiveBox/wiki">Github Wiki</a></li>
<li>📜 See the API source code: <a href="https://github.com/ArchiveBox/ArchiveBox/blob/dev/archivebox/api"><code>archivebox/api/</code></a></li>
</ul>
<small>Served by ArchiveBox v{VERSION} (<a href="https://github.com/ArchiveBox/ArchiveBox/commit/{COMMIT_HASH}"><code>{COMMIT_HASH[:8]}</code></a>), API powered by <a href="https://django-ninja.dev/"><code>django-ninja</code></a>.</small>
'''
def register_urls(api: NinjaAPI) -> NinjaAPI:
api.add_router('/auth/', 'api.v1_auth.router')
api.add_router('/core/', 'api.v1_core.router')
api.add_router('/cli/', 'api.v1_cli.router')
return api
class NinjaAPIWithIOCapture(NinjaAPI):
def create_temporal_response(self, request: HttpRequest) -> HttpResponse:
stdout, stderr = StringIO(), StringIO()
with redirect_stderr(stderr):
with redirect_stdout(stdout):
request.stdout = stdout
request.stderr = stderr
response = super().create_temporal_response(request)
print('RESPONDING NOW', response)
return response
api = NinjaAPIWithIOCapture(
title='ArchiveBox API',
description=html_description,
version='1.0.0',
csrf=False,
auth=API_AUTH_METHODS,
urls_namespace="api",
docs=Swagger(settings={"persistAuthorization": True}),
# docs_decorator=login_required,
# renderer=ORJSONRenderer(),
)
api = register_urls(api)
urls = api.urls
@api.exception_handler(Exception)
def generic_exception_handler(request, err):
status = 503
if isinstance(err, (ObjectDoesNotExist, EmptyResultSet, PermissionDenied)):
status = 404
print(''.join(format_exception(err)))
return api.create_response(
request,
{
"succeeded": False,
"message": f'{err.__class__.__name__}: {err}',
"errors": [
''.join(format_exception(err)),
# or send simpler parent-only traceback:
# *([str(err.__context__)] if getattr(err, '__context__', None) else []),
],
},
status=status,
)
# import orjson
# from ninja.renderers import BaseRenderer
# class ORJSONRenderer(BaseRenderer):
# media_type = "application/json"
# def render(self, request, data, *, response_status):
# return {
# "success": True,
# "errors": [],
# "result": data,
# "stdout": ansi_to_html(stdout.getvalue().strip()),
# "stderr": ansi_to_html(stderr.getvalue().strip()),
# }
# return orjson.dumps(data)

52
archivebox/api/v1_auth.py Normal file
View file

@ -0,0 +1,52 @@
__package__ = 'archivebox.api'
from typing import Optional
from ninja import Router, Schema
from api.models import APIToken
from api.auth import auth_using_token, auth_using_password
router = Router(tags=['Authentication'])
class PasswordAuthSchema(Schema):
"""Schema for a /get_api_token request"""
username: Optional[str] = None
password: Optional[str] = None
@router.post("/get_api_token", auth=None, summary='Generate an API token for a given username & password (or currently logged-in user)') # auth=None because they are not authed yet
def get_api_token(request, auth_data: PasswordAuthSchema):
user = auth_using_password(
username=auth_data.username,
password=auth_data.password,
request=request,
)
if user:
# TODO: support multiple tokens in the future, for now we just have one per user
api_token, created = APIToken.objects.get_or_create(user=user)
return api_token.__json__()
return {"success": False, "errors": ["Invalid credentials"]}
class TokenAuthSchema(Schema):
"""Schema for a /check_api_token request"""
token: str
@router.post("/check_api_token", auth=None, summary='Validate an API token to make sure its valid and non-expired') # auth=None because they are not authed yet
def check_api_token(request, token_data: TokenAuthSchema):
user = auth_using_token(
token=token_data.token,
request=request,
)
if user:
return {"success": True, "user_id": str(user.id)}
return {"success": False, "user_id": None}

234
archivebox/api/v1_cli.py Normal file
View file

@ -0,0 +1,234 @@
__package__ = 'archivebox.api'
from typing import List, Dict, Any, Optional
from enum import Enum
from ninja import Router, Schema
from ..main import (
add,
remove,
update,
list_all,
schedule,
)
from ..util import ansi_to_html
from ..config import ONLY_NEW
# router for API that exposes archivebox cli subcommands as REST endpoints
router = Router(tags=['ArchiveBox CLI Sub-Commands'])
# Schemas
JSONType = List[Any] | Dict[str, Any] | bool | int | str | None
class CLICommandResponseSchema(Schema):
success: bool
errors: List[str]
result: JSONType
stdout: str
stderr: str
class FilterTypeChoices(str, Enum):
exact = 'exact'
substring = 'substring'
regex = 'regex'
domain = 'domain'
tag = 'tag'
timestamp = 'timestamp'
class StatusChoices(str, Enum):
indexed = 'indexed'
archived = 'archived'
unarchived = 'unarchived'
present = 'present'
valid = 'valid'
invalid = 'invalid'
duplicate = 'duplicate'
orphaned = 'orphaned'
corrupted = 'corrupted'
unrecognized = 'unrecognized'
class AddCommandSchema(Schema):
urls: List[str]
tag: str = ""
depth: int = 0
update: bool = not ONLY_NEW # Default to the opposite of ONLY_NEW
update_all: bool = False
index_only: bool = False
overwrite: bool = False
init: bool = False
extractors: str = ""
parser: str = "auto"
class UpdateCommandSchema(Schema):
resume: Optional[float] = 0
only_new: bool = ONLY_NEW
index_only: bool = False
overwrite: bool = False
after: Optional[float] = 0
before: Optional[float] = 999999999999999
status: Optional[StatusChoices] = StatusChoices.unarchived
filter_type: Optional[str] = FilterTypeChoices.substring
filter_patterns: Optional[List[str]] = ['https://example.com']
extractors: Optional[str] = ""
class ScheduleCommandSchema(Schema):
import_path: Optional[str] = None
add: bool = False
every: Optional[str] = None
tag: str = ''
depth: int = 0
overwrite: bool = False
update: bool = not ONLY_NEW
clear: bool = False
class ListCommandSchema(Schema):
filter_patterns: Optional[List[str]] = ['https://example.com']
filter_type: str = FilterTypeChoices.substring
status: Optional[StatusChoices] = StatusChoices.indexed
after: Optional[float] = 0
before: Optional[float] = 999999999999999
sort: str = 'added'
as_json: bool = True
as_html: bool = False
as_csv: str | bool = 'timestamp,url'
with_headers: bool = False
class RemoveCommandSchema(Schema):
delete: bool = True
after: Optional[float] = 0
before: Optional[float] = 999999999999999
filter_type: str = FilterTypeChoices.exact
filter_patterns: Optional[List[str]] = ['https://example.com']
@router.post("/add", response=CLICommandResponseSchema, summary='archivebox add [args] [urls]')
def cli_add(request, args: AddCommandSchema):
result = add(
urls=args.urls,
tag=args.tag,
depth=args.depth,
update=args.update,
update_all=args.update_all,
index_only=args.index_only,
overwrite=args.overwrite,
init=args.init,
extractors=args.extractors,
parser=args.parser,
)
return {
"success": True,
"errors": [],
"result": result,
"stdout": ansi_to_html(request.stdout.getvalue().strip()),
"stderr": ansi_to_html(request.stderr.getvalue().strip()),
}
@router.post("/update", response=CLICommandResponseSchema, summary='archivebox update [args] [filter_patterns]')
def cli_update(request, args: UpdateCommandSchema):
result = update(
resume=args.resume,
only_new=args.only_new,
index_only=args.index_only,
overwrite=args.overwrite,
before=args.before,
after=args.after,
status=args.status,
filter_type=args.filter_type,
filter_patterns=args.filter_patterns,
extractors=args.extractors,
)
return {
"success": True,
"errors": [],
"result": result,
"stdout": ansi_to_html(request.stdout.getvalue().strip()),
"stderr": ansi_to_html(request.stderr.getvalue().strip()),
}
@router.post("/schedule", response=CLICommandResponseSchema, summary='archivebox schedule [args] [import_path]')
def cli_schedule(request, args: ScheduleCommandSchema):
result = schedule(
import_path=args.import_path,
add=args.add,
show=args.show,
clear=args.clear,
every=args.every,
tag=args.tag,
depth=args.depth,
overwrite=args.overwrite,
update=args.update,
)
return {
"success": True,
"errors": [],
"result": result,
"stdout": ansi_to_html(request.stdout.getvalue().strip()),
"stderr": ansi_to_html(request.stderr.getvalue().strip()),
}
@router.post("/list", response=CLICommandResponseSchema, summary='archivebox list [args] [filter_patterns]')
def cli_list(request, args: ListCommandSchema):
result = list_all(
filter_patterns=args.filter_patterns,
filter_type=args.filter_type,
status=args.status,
after=args.after,
before=args.before,
sort=args.sort,
csv=args.as_csv,
json=args.as_json,
html=args.as_html,
with_headers=args.with_headers,
)
result_format = 'txt'
if args.as_json:
result_format = "json"
elif args.as_html:
result_format = "html"
elif args.as_csv:
result_format = "csv"
return {
"success": True,
"errors": [],
"result": result,
"result_format": result_format,
"stdout": ansi_to_html(request.stdout.getvalue().strip()),
"stderr": ansi_to_html(request.stderr.getvalue().strip()),
}
@router.post("/remove", response=CLICommandResponseSchema, summary='archivebox remove [args] [filter_patterns]')
def cli_remove(request, args: RemoveCommandSchema):
result = remove(
yes=True, # no way to interactively ask for confirmation via API, so we force yes
delete=args.delete,
before=args.before,
after=args.after,
filter_type=args.filter_type,
filter_patterns=args.filter_patterns,
)
return {
"success": True,
"errors": [],
"result": result,
"stdout": ansi_to_html(request.stdout.getvalue().strip()),
"stderr": ansi_to_html(request.stderr.getvalue().strip()),
}

210
archivebox/api/v1_core.py Normal file
View file

@ -0,0 +1,210 @@
__package__ = 'archivebox.api'
from uuid import UUID
from typing import List, Optional
from datetime import datetime
from django.shortcuts import get_object_or_404
from ninja import Router, Schema, FilterSchema, Field, Query
from ninja.pagination import paginate
from core.models import Snapshot, ArchiveResult, Tag
router = Router(tags=['Core Models'])
### ArchiveResult #########################################################################
class ArchiveResultSchema(Schema):
id: UUID
snapshot_id: UUID
snapshot_url: str
snapshot_tags: str
extractor: str
cmd: List[str]
pwd: str
cmd_version: str
output: str
status: str
created: datetime
@staticmethod
def resolve_id(obj):
return obj.uuid
@staticmethod
def resolve_created(obj):
return obj.start_ts
@staticmethod
def resolve_snapshot_url(obj):
return obj.snapshot.url
@staticmethod
def resolve_snapshot_tags(obj):
return obj.snapshot.tags_str()
class ArchiveResultFilterSchema(FilterSchema):
id: Optional[UUID] = Field(None, q='uuid')
search: Optional[str] = Field(None, q=['snapshot__url__icontains', 'snapshot__title__icontains', 'snapshot__tags__name__icontains', 'extractor', 'output__icontains'])
snapshot_id: Optional[UUID] = Field(None, q='snapshot_id')
snapshot_url: Optional[str] = Field(None, q='snapshot__url')
snapshot_tag: Optional[str] = Field(None, q='snapshot__tags__name')
status: Optional[str] = Field(None, q='status')
output: Optional[str] = Field(None, q='output__icontains')
extractor: Optional[str] = Field(None, q='extractor__icontains')
cmd: Optional[str] = Field(None, q='cmd__0__icontains')
pwd: Optional[str] = Field(None, q='pwd__icontains')
cmd_version: Optional[str] = Field(None, q='cmd_version')
created: Optional[datetime] = Field(None, q='updated')
created__gte: Optional[datetime] = Field(None, q='updated__gte')
created__lt: Optional[datetime] = Field(None, q='updated__lt')
@router.get("/archiveresults", response=List[ArchiveResultSchema])
@paginate
def list_archiveresults(request, filters: ArchiveResultFilterSchema = Query(...)):
qs = ArchiveResult.objects.all()
results = filters.filter(qs)
return results
@router.get("/archiveresult/{archiveresult_id}", response=ArchiveResultSchema)
def get_archiveresult(request, archiveresult_id: str):
archiveresult = get_object_or_404(ArchiveResult, id=archiveresult_id)
return archiveresult
# @router.post("/archiveresult", response=ArchiveResultSchema)
# def create_archiveresult(request, payload: ArchiveResultSchema):
# archiveresult = ArchiveResult.objects.create(**payload.dict())
# return archiveresult
#
# @router.put("/archiveresult/{archiveresult_id}", response=ArchiveResultSchema)
# def update_archiveresult(request, archiveresult_id: str, payload: ArchiveResultSchema):
# archiveresult = get_object_or_404(ArchiveResult, id=archiveresult_id)
#
# for attr, value in payload.dict().items():
# setattr(archiveresult, attr, value)
# archiveresult.save()
#
# return archiveresult
#
# @router.delete("/archiveresult/{archiveresult_id}")
# def delete_archiveresult(request, archiveresult_id: str):
# archiveresult = get_object_or_404(ArchiveResult, id=archiveresult_id)
# archiveresult.delete()
# return {"success": True}
### Snapshot #########################################################################
class SnapshotSchema(Schema):
id: UUID
url: str
tags: str
title: Optional[str]
timestamp: str
bookmarked: datetime
added: datetime
updated: datetime
archive_path: str
archiveresults: List[ArchiveResultSchema]
# @staticmethod
# def resolve_id(obj):
# return str(obj.id)
@staticmethod
def resolve_tags(obj):
return obj.tags_str()
@staticmethod
def resolve_archiveresults(obj, context):
if context['request'].with_archiveresults:
return obj.archiveresult_set.all().distinct()
return ArchiveResult.objects.none()
class SnapshotFilterSchema(FilterSchema):
id: Optional[UUID] = Field(None, q='id')
search: Optional[str] = Field(None, q=['url__icontains', 'title__icontains', 'tags__name__icontains'])
url: Optional[str] = Field(None, q='url')
tag: Optional[str] = Field(None, q='tags__name')
title: Optional[str] = Field(None, q='title__icontains')
timestamp: Optional[str] = Field(None, q='timestamp__startswith')
added: Optional[datetime] = Field(None, q='added')
added__gte: Optional[datetime] = Field(None, q='added__gte')
added__lt: Optional[datetime] = Field(None, q='added__lt')
@router.get("/snapshots", response=List[SnapshotSchema])
@paginate
def list_snapshots(request, filters: SnapshotFilterSchema = Query(...), with_archiveresults: bool=True):
request.with_archiveresults = with_archiveresults
qs = Snapshot.objects.all()
results = filters.filter(qs)
return results
@router.get("/snapshot/{snapshot_id}", response=SnapshotSchema)
def get_snapshot(request, snapshot_id: str, with_archiveresults: bool=True):
request.with_archiveresults = with_archiveresults
snapshot = get_object_or_404(Snapshot, id=snapshot_id)
return snapshot
# @router.post("/snapshot", response=SnapshotSchema)
# def create_snapshot(request, payload: SnapshotSchema):
# snapshot = Snapshot.objects.create(**payload.dict())
# return snapshot
#
# @router.put("/snapshot/{snapshot_id}", response=SnapshotSchema)
# def update_snapshot(request, snapshot_id: str, payload: SnapshotSchema):
# snapshot = get_object_or_404(Snapshot, id=snapshot_id)
#
# for attr, value in payload.dict().items():
# setattr(snapshot, attr, value)
# snapshot.save()
#
# return snapshot
#
# @router.delete("/snapshot/{snapshot_id}")
# def delete_snapshot(request, snapshot_id: str):
# snapshot = get_object_or_404(Snapshot, id=snapshot_id)
# snapshot.delete()
# return {"success": True}
### Tag #########################################################################
class TagSchema(Schema):
name: str
slug: str
@router.get("/tags", response=List[TagSchema])
def list_tags(request):
return Tag.objects.all()

View file

@ -72,7 +72,7 @@ CONFIG_SCHEMA: Dict[str, ConfigDefaultDict] = {
'TIMEOUT': {'type': int, 'default': 60},
'MEDIA_TIMEOUT': {'type': int, 'default': 3600},
'OUTPUT_PERMISSIONS': {'type': str, 'default': '644'},
'RESTRICT_FILE_NAMES': {'type': str, 'default': 'windows'},
'RESTRICT_FILE_NAMES': {'type': str, 'default': 'windows'}, # TODO: move this to be a default WGET_ARGS
'URL_DENYLIST': {'type': str, 'default': r'\.(css|js|otf|ttf|woff|woff2|gstatic\.com|googleapis\.com/css)(\?.*)?$', 'aliases': ('URL_BLACKLIST',)}, # to avoid downloading code assets as their own pages
'URL_ALLOWLIST': {'type': str, 'default': None, 'aliases': ('URL_WHITELIST',)},
@ -112,7 +112,7 @@ CONFIG_SCHEMA: Dict[str, ConfigDefaultDict] = {
'LDAP_FIRSTNAME_ATTR': {'type': str, 'default': None},
'LDAP_LASTNAME_ATTR': {'type': str, 'default': None},
'LDAP_EMAIL_ATTR': {'type': str, 'default': None},
'LDAP_CREATE_SUPERUSER': {'type': bool, 'default': False},
'LDAP_CREATE_SUPERUSER': {'type': bool, 'default': False},
},
'ARCHIVE_METHOD_TOGGLES': {
@ -265,7 +265,7 @@ CONFIG_ALIASES = {
for key, default in section.items()
for alias in default.get('aliases', ())
}
USER_CONFIG = {key for section in CONFIG_SCHEMA.values() for key in section.keys()}
USER_CONFIG = {key: section[key] for section in CONFIG_SCHEMA.values() for key in section.keys()}
def get_real_name(key: str) -> str:
"""get the current canonical name for a given deprecated config key"""
@ -282,6 +282,7 @@ ARCHIVE_DIR_NAME = 'archive'
SOURCES_DIR_NAME = 'sources'
LOGS_DIR_NAME = 'logs'
PERSONAS_DIR_NAME = 'personas'
CRONTABS_DIR_NAME = 'crontabs'
SQL_INDEX_FILENAME = 'index.sqlite3'
JSON_INDEX_FILENAME = 'index.json'
HTML_INDEX_FILENAME = 'index.html'
@ -355,7 +356,7 @@ ALLOWED_IN_OUTPUT_DIR = {
'static',
'sonic',
'search.sqlite3',
'crontabs',
CRONTABS_DIR_NAME,
ARCHIVE_DIR_NAME,
SOURCES_DIR_NAME,
LOGS_DIR_NAME,
@ -598,7 +599,6 @@ DYNAMIC_CONFIG_SCHEMA: ConfigDefaultDict = {
'DEPENDENCIES': {'default': lambda c: get_dependency_info(c)},
'CODE_LOCATIONS': {'default': lambda c: get_code_locations(c)},
'EXTERNAL_LOCATIONS': {'default': lambda c: get_external_locations(c)},
'DATA_LOCATIONS': {'default': lambda c: get_data_locations(c)},
'CHROME_OPTIONS': {'default': lambda c: get_chrome_info(c)},
'CHROME_EXTRA_ARGS': {'default': lambda c: c['CHROME_EXTRA_ARGS'] or []},
@ -985,11 +985,6 @@ def get_code_locations(config: ConfigDict) -> SimpleConfigValueDict:
'enabled': True,
'is_valid': (config['TEMPLATES_DIR'] / 'static').exists(),
},
'CUSTOM_TEMPLATES_DIR': {
'path': config['CUSTOM_TEMPLATES_DIR'] and Path(config['CUSTOM_TEMPLATES_DIR']).resolve(),
'enabled': bool(config['CUSTOM_TEMPLATES_DIR']),
'is_valid': config['CUSTOM_TEMPLATES_DIR'] and Path(config['CUSTOM_TEMPLATES_DIR']).exists(),
},
# 'NODE_MODULES_DIR': {
# 'path': ,
# 'enabled': ,
@ -997,50 +992,25 @@ def get_code_locations(config: ConfigDict) -> SimpleConfigValueDict:
# },
}
def get_external_locations(config: ConfigDict) -> ConfigValue:
abspath = lambda path: None if path is None else Path(path).resolve()
return {
'CHROME_USER_DATA_DIR': {
'path': abspath(config['CHROME_USER_DATA_DIR']),
'enabled': config['USE_CHROME'] and config['CHROME_USER_DATA_DIR'],
'is_valid': False if config['CHROME_USER_DATA_DIR'] is None else (Path(config['CHROME_USER_DATA_DIR']) / 'Default').exists(),
},
'COOKIES_FILE': {
'path': abspath(config['COOKIES_FILE']),
'enabled': config['USE_WGET'] and config['COOKIES_FILE'],
'is_valid': False if config['COOKIES_FILE'] is None else Path(config['COOKIES_FILE']).exists(),
},
}
def get_data_locations(config: ConfigDict) -> ConfigValue:
return {
# OLD: migrating to personas
# 'CHROME_USER_DATA_DIR': {
# 'path': os.path.abspath(config['CHROME_USER_DATA_DIR']),
# 'enabled': config['USE_CHROME'] and config['CHROME_USER_DATA_DIR'],
# 'is_valid': False if config['CHROME_USER_DATA_DIR'] is None else (Path(config['CHROME_USER_DATA_DIR']) / 'Default').exists(),
# },
# 'COOKIES_FILE': {
# 'path': os.path.abspath(config['COOKIES_FILE']),
# 'enabled': config['USE_WGET'] and config['COOKIES_FILE'],
# 'is_valid': False if config['COOKIES_FILE'] is None else Path(config['COOKIES_FILE']).exists(),
# },
'OUTPUT_DIR': {
'path': config['OUTPUT_DIR'].resolve(),
'enabled': True,
'is_valid': (config['OUTPUT_DIR'] / SQL_INDEX_FILENAME).exists(),
'is_mount': os.path.ismount(config['OUTPUT_DIR'].resolve()),
},
'SOURCES_DIR': {
'path': config['SOURCES_DIR'].resolve(),
'enabled': True,
'is_valid': config['SOURCES_DIR'].exists(),
},
'LOGS_DIR': {
'path': config['LOGS_DIR'].resolve(),
'enabled': True,
'is_valid': config['LOGS_DIR'].exists(),
},
'PERSONAS_DIR': {
'path': config['PERSONAS_DIR'].resolve(),
'enabled': True,
'is_valid': config['PERSONAS_DIR'].exists(),
},
'ARCHIVE_DIR': {
'path': config['ARCHIVE_DIR'].resolve(),
'enabled': True,
'is_valid': config['ARCHIVE_DIR'].exists(),
'is_mount': os.path.ismount(config['ARCHIVE_DIR'].resolve()),
},
'CONFIG_FILE': {
'path': config['CONFIG_FILE'].resolve(),
'enabled': True,
@ -1052,6 +1022,38 @@ def get_data_locations(config: ConfigDict) -> ConfigValue:
'is_valid': (config['OUTPUT_DIR'] / SQL_INDEX_FILENAME).exists(),
'is_mount': os.path.ismount((config['OUTPUT_DIR'] / SQL_INDEX_FILENAME).resolve()),
},
'ARCHIVE_DIR': {
'path': config['ARCHIVE_DIR'].resolve(),
'enabled': True,
'is_valid': config['ARCHIVE_DIR'].exists(),
'is_mount': os.path.ismount(config['ARCHIVE_DIR'].resolve()),
},
'SOURCES_DIR': {
'path': config['SOURCES_DIR'].resolve(),
'enabled': True,
'is_valid': config['SOURCES_DIR'].exists(),
},
'LOGS_DIR': {
'path': config['LOGS_DIR'].resolve(),
'enabled': True,
'is_valid': config['LOGS_DIR'].exists(),
},
'CUSTOM_TEMPLATES_DIR': {
'path': config['CUSTOM_TEMPLATES_DIR'] and Path(config['CUSTOM_TEMPLATES_DIR']).resolve(),
'enabled': bool(config['CUSTOM_TEMPLATES_DIR']),
'is_valid': config['CUSTOM_TEMPLATES_DIR'] and Path(config['CUSTOM_TEMPLATES_DIR']).exists(),
},
'PERSONAS_DIR': {
'path': config['PERSONAS_DIR'].resolve(),
'enabled': True,
'is_valid': config['PERSONAS_DIR'].exists(),
},
# managed by bin/docker_entrypoint.sh and python-crontab:
# 'CRONTABS_DIR': {
# 'path': config['CRONTABS_DIR'].resolve(),
# 'enabled': True,
# 'is_valid': config['CRONTABS_DIR'].exists(),
# },
}
def get_dependency_info(config: ConfigDict) -> ConfigValue:
@ -1366,6 +1368,7 @@ def check_data_folder(out_dir: Union[str, Path, None]=None, config: ConfigDict=C
stderr(' archivebox init')
raise SystemExit(2)
def check_migrations(out_dir: Union[str, Path, None]=None, config: ConfigDict=CONFIG):
output_dir = out_dir or config['OUTPUT_DIR']
from .index.sql import list_migrations

View file

@ -14,12 +14,17 @@ from django.shortcuts import render, redirect
from django.contrib.auth import get_user_model
from django import forms
from signal_webhooks.apps import DjangoSignalWebhooksConfig
from signal_webhooks.admin import WebhookAdmin, WebhookModel
from ..util import htmldecode, urldecode, ansi_to_html
from core.models import Snapshot, ArchiveResult, Tag
from core.forms import AddLinkForm
from core.mixins import SearchResultsAdminMixin
from api.models import APIToken
from index.html import snapshot_icons
from logging_util import printable_filesize
@ -98,10 +103,32 @@ class ArchiveBoxAdmin(admin.AdminSite):
return render(template_name='add.html', request=request, context=context)
# monkey patch django-signals-webhooks to change how it shows up in Admin UI
DjangoSignalWebhooksConfig.verbose_name = 'API'
WebhookModel._meta.get_field('name').help_text = 'Give your webhook a descriptive name (e.g. Notify ACME Slack channel of any new ArchiveResults).'
WebhookModel._meta.get_field('signal').help_text = 'The type of event the webhook should fire for (e.g. Create, Update, Delete).'
WebhookModel._meta.get_field('ref').help_text = 'Dot import notation of the model the webhook should fire for (e.g. core.models.Snapshot or core.models.ArchiveResult).'
WebhookModel._meta.get_field('endpoint').help_text = 'External URL to POST the webhook notification to (e.g. https://someapp.example.com/webhook/some-webhook-receiver).'
WebhookModel._meta.app_label = 'api'
archivebox_admin = ArchiveBoxAdmin()
archivebox_admin.register(get_user_model())
archivebox_admin.register(APIToken)
archivebox_admin.register(WebhookModel, WebhookAdmin)
archivebox_admin.disable_action('delete_selected')
# patch admin with methods to add data views
from admin_data_views.admin import get_app_list, admin_data_index_view, get_admin_data_urls, get_urls
archivebox_admin.get_app_list = get_app_list.__get__(archivebox_admin, ArchiveBoxAdmin)
archivebox_admin.admin_data_index_view = admin_data_index_view.__get__(archivebox_admin, ArchiveBoxAdmin)
archivebox_admin.get_admin_data_urls = get_admin_data_urls.__get__(archivebox_admin, ArchiveBoxAdmin)
archivebox_admin.get_urls = get_urls(archivebox_admin.get_urls).__get__(archivebox_admin, ArchiveBoxAdmin)
class ArchiveResultInline(admin.TabularInline):
model = ArchiveResult

View file

@ -1,3 +1,5 @@
__package__ = 'archivebox.core'
from django.apps import AppConfig
@ -5,6 +7,22 @@ class CoreConfig(AppConfig):
name = 'core'
def ready(self):
# register our custom admin as the primary django admin
from django.contrib import admin
from django.contrib.admin import sites
from core.admin import archivebox_admin
admin.site = archivebox_admin
sites.site = archivebox_admin
# register signal handlers
from .auth import register_signals
register_signals()
# from django.contrib.admin.apps import AdminConfig
# class CoreAdminConfig(AdminConfig):
# default_site = "core.admin.get_admin_site"

View file

@ -1,5 +1,6 @@
import os
from django.conf import settings
__package__ = 'archivebox.core'
from ..config import (
LDAP
)

View file

@ -1,10 +1,8 @@
from django.conf import settings
from ..config import (
LDAP_CREATE_SUPERUSER
)
def create_user(sender, user=None, ldap_user=None, **kwargs):
if not user.id and LDAP_CREATE_SUPERUSER:
user.is_superuser = True

View file

@ -10,7 +10,7 @@ class SearchResultsAdminMixin:
search_term = search_term.strip()
if not search_term:
return qs, use_distinct
return qs.distinct(), use_distinct
try:
qsearch = query_search_index(search_term)
qs = qs | qsearch

View file

@ -18,6 +18,7 @@ from ..config import (
CUSTOM_TEMPLATES_DIR,
SQL_INDEX_FILENAME,
OUTPUT_DIR,
ARCHIVE_DIR,
LOGS_DIR,
TIMEZONE,
@ -61,7 +62,11 @@ INSTALLED_APPS = [
'django.contrib.admin',
'core',
'api',
'admin_data_views',
'signal_webhooks',
'django_extensions',
]
@ -172,6 +177,17 @@ if DEBUG_TOOLBAR:
]
MIDDLEWARE = [*MIDDLEWARE, 'debug_toolbar.middleware.DebugToolbarMiddleware']
# https://github.com/bensi94/Django-Requests-Tracker (improved version of django-debug-toolbar)
# Must delete archivebox/templates/admin to use because it relies on some things we override
# visit /__requests_tracker__/ to access
DEBUG_REQUESTS_TRACKER = False
if DEBUG_REQUESTS_TRACKER:
INSTALLED_APPS += ["requests_tracker"]
MIDDLEWARE += ["requests_tracker.middleware.requests_tracker_middleware"]
INTERNAL_IPS = ["127.0.0.1", "10.0.2.2", "0.0.0.0", "*"]
################################################################################
### Staticfile and Template Settings
################################################################################
@ -241,6 +257,29 @@ CACHES = {
EMAIL_BACKEND = 'django.core.mail.backends.console.EmailBackend'
STORAGES = {
"default": {
"BACKEND": "django.core.files.storage.FileSystemStorage",
},
"staticfiles": {
"BACKEND": "django.contrib.staticfiles.storage.StaticFilesStorage",
},
"archive": {
"BACKEND": "django.core.files.storage.FileSystemStorage",
"OPTIONS": {
"base_url": "/archive/",
"location": ARCHIVE_DIR,
},
},
# "personas": {
# "BACKEND": "django.core.files.storage.FileSystemStorage",
# "OPTIONS": {
# "base_url": "/personas/",
# "location": PERSONAS_DIR,
# },
# },
}
################################################################################
### Security Settings
################################################################################
@ -367,3 +406,32 @@ LOGGING = {
}
},
}
# Add default webhook configuration to the User model
SIGNAL_WEBHOOKS = {
"HOOKS": {
"django.contrib.auth.models.User": ..., # ... is a special value that means "use the default autogenerated hooks"
"core.models.Snapshot": ...,
"core.models.ArchiveResult": ...,
"core.models.Tag": ...,
"api.models.APIToken": ...,
},
}
ADMIN_DATA_VIEWS = {
"NAME": "configuration",
"URLS": [
{
"route": "live/",
"view": "core.views.live_config_list_view",
"name": "live",
"items": {
"route": "<str:key>/",
"view": "core.views.live_config_value_view",
"name": "live_config_value",
},
},
],
}

View file

@ -1,4 +1,4 @@
from .admin import archivebox_admin
__package__ = 'archivebox.core'
from django.urls import path, include
from django.views import static
@ -6,7 +6,14 @@ from django.contrib.staticfiles.urls import staticfiles_urlpatterns
from django.conf import settings
from django.views.generic.base import RedirectView
from core.views import HomepageView, SnapshotView, PublicIndexView, AddView, HealthCheckView
from .admin import archivebox_admin
from .views import HomepageView, SnapshotView, PublicIndexView, AddView, HealthCheckView
# GLOBAL_CONTEXT doesn't work as-is, disabled for now: https://github.com/ArchiveBox/ArchiveBox/discussions/1306
# from config import VERSION, VERSIONS_AVAILABLE, CAN_UPGRADE
# GLOBAL_CONTEXT = {'VERSION': VERSION, 'VERSIONS_AVAILABLE': VERSIONS_AVAILABLE, 'CAN_UPGRADE': CAN_UPGRADE}
# print('DEBUG', settings.DEBUG)
@ -31,8 +38,10 @@ urlpatterns = [
path('accounts/', include('django.contrib.auth.urls')),
path('admin/', archivebox_admin.urls),
path("api/", include('api.urls')),
path('health/', HealthCheckView.as_view(), name='healthcheck'),
path('error/', lambda _: 1/0),
path('error/', lambda *_: 1/0),
# path('jet_api/', include('jet_django.urls')), Enable to use https://www.jetadmin.io/integrations/django
@ -43,10 +52,10 @@ urlpatterns = [
urlpatterns += staticfiles_urlpatterns()
if settings.DEBUG_TOOLBAR:
import debug_toolbar
urlpatterns += [
path('__debug__/', include(debug_toolbar.urls)),
]
urlpatterns += [path('__debug__/', include("debug_toolbar.urls"))]
if settings.DEBUG_REQUESTS_TRACKER:
urlpatterns += [path("__requests_tracker__/", include("requests_tracker.urls"))]
# # Proposed FUTURE URLs spec

View file

@ -1,10 +1,12 @@
__package__ = 'archivebox.core'
from typing import Callable
from io import StringIO
from contextlib import redirect_stdout
from django.shortcuts import render, redirect
from django.http import HttpResponse, Http404
from django.http import HttpRequest, HttpResponse, Http404
from django.utils.html import format_html, mark_safe
from django.views import View, static
from django.views.generic.list import ListView
@ -14,6 +16,10 @@ from django.contrib.auth.mixins import UserPassesTestMixin
from django.views.decorators.csrf import csrf_exempt
from django.utils.decorators import method_decorator
from admin_data_views.typing import TableContext, ItemContext
from admin_data_views.utils import render_with_table_view, render_with_item_view, ItemLink
from core.models import Snapshot
from core.forms import AddLinkForm
@ -26,6 +32,10 @@ from ..config import (
COMMIT_HASH,
FOOTER_INFO,
SNAPSHOTS_PER_PAGE,
CONFIG,
CONFIG_SCHEMA,
DYNAMIC_CONFIG_SCHEMA,
USER_CONFIG,
)
from ..main import add
from ..util import base_url, ansi_to_html
@ -124,9 +134,9 @@ class SnapshotView(View):
'<center><br/><br/><br/>'
f'Snapshot <a href="/archive/{snapshot.timestamp}/index.html" target="_top"><b><code>[{snapshot.timestamp}]</code></b></a> exists in DB, but resource <b><code>{snapshot.timestamp}/'
'{}'
f'</code></b> does not exist in <a href="/archive/{snapshot.timestamp}/" target="_top">snapshot dir</a> yet.<br/><br/>'
'Maybe this resource type is not availabe for this Snapshot,<br/>or the archiving process has not completed yet?<br/>'
f'<pre><code># run this cmd to finish archiving this Snapshot<br/>archivebox update -t timestamp {snapshot.timestamp}</code></pre><br/><br/>'
f'</code></b> does not exist in the <a href="/archive/{snapshot.timestamp}/" target="_top">snapshot dir</a> yet.<br/><br/>'
'It\'s possible that this resource type is not available for the Snapshot,<br/>or that the archiving process has not completed yet.<br/>'
f'<pre><code># if interrupted, run this cmd to finish archiving this Snapshot<br/>archivebox update -t timestamp {snapshot.timestamp}</code></pre><br/><br/>'
'<div class="text-align: left; width: 100%; max-width: 400px">'
'<i><b>Next steps:</i></b><br/>'
f'- list all the <a href="/archive/{snapshot.timestamp}/" target="_top">Snapshot files <code>.*</code></a><br/>'
@ -312,3 +322,124 @@ class HealthCheckView(View):
content_type='text/plain',
status=200
)
def find_config_section(key: str) -> str:
matching_sections = [
name for name, opts in CONFIG_SCHEMA.items() if key in opts
]
section = matching_sections[0] if matching_sections else 'DYNAMIC'
return section
def find_config_default(key: str) -> str:
default_val = USER_CONFIG.get(key, {}).get('default', lambda: None)
if isinstance(default_val, Callable):
return None
else:
default_val = repr(default_val)
return default_val
def find_config_type(key: str) -> str:
if key in USER_CONFIG:
return USER_CONFIG[key]['type'].__name__
elif key in DYNAMIC_CONFIG_SCHEMA:
return type(CONFIG[key]).__name__
return 'str'
def key_is_safe(key: str) -> bool:
for term in ('key', 'password', 'secret', 'token'):
if term in key.lower():
return False
return True
@render_with_table_view
def live_config_list_view(request: HttpRequest, **kwargs) -> TableContext:
assert request.user.is_superuser, 'Must be a superuser to view configuration settings.'
rows = {
"Section": [],
"Key": [],
"Type": [],
"Value": [],
"Default": [],
# "Documentation": [],
"Aliases": [],
}
for section in CONFIG_SCHEMA.keys():
for key in CONFIG_SCHEMA[section].keys():
rows['Section'].append(section.replace('_', ' ').title().replace(' Config', ''))
rows['Key'].append(ItemLink(key, key=key))
rows['Type'].append(mark_safe(f'<code>{find_config_type(key)}</code>'))
rows['Value'].append(mark_safe(f'<code>{CONFIG[key]}</code>') if key_is_safe(key) else '******** (redacted)')
rows['Default'].append(mark_safe(f'<a href="https://github.com/search?q=repo%3AArchiveBox%2FArchiveBox+path%3Aconfig.py+%27{key}%27&type=code"><code style="text-decoration: underline">{find_config_default(key) or 'See here...'}</code></a>'))
# rows['Documentation'].append(mark_safe(f'Wiki: <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#{key.lower()}">{key}</a>'))
rows['Aliases'].append(', '.join(CONFIG_SCHEMA[section][key].get('aliases', [])))
section = 'DYNAMIC'
for key in DYNAMIC_CONFIG_SCHEMA.keys():
rows['Section'].append(section.replace('_', ' ').title().replace(' Config', ''))
rows['Key'].append(ItemLink(key, key=key))
rows['Type'].append(mark_safe(f'<code>{find_config_type(key)}</code>'))
rows['Value'].append(mark_safe(f'<code>{CONFIG[key]}</code>') if key_is_safe(key) else '******** (redacted)')
rows['Default'].append(mark_safe(f'<a href="https://github.com/search?q=repo%3AArchiveBox%2FArchiveBox+path%3Aconfig.py+%27{key}%27&type=code"><code style="text-decoration: underline">{find_config_default(key) or 'See here...'}</code></a>'))
# rows['Documentation'].append(mark_safe(f'Wiki: <a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#{key.lower()}">{key}</a>'))
rows['Aliases'].append(ItemLink(key, key=key) if key in USER_CONFIG else '')
return TableContext(
title="Computed Configuration Values",
table=rows,
)
@render_with_item_view
def live_config_value_view(request: HttpRequest, key: str, **kwargs) -> ItemContext:
assert request.user.is_superuser, 'Must be a superuser to view configuration settings.'
aliases = USER_CONFIG.get(key, {}).get("aliases", [])
return ItemContext(
slug=key,
title=key,
data=[
{
"name": mark_safe(f'data / ArchiveBox.conf &nbsp; [{find_config_section(key)}] &nbsp; <b><code style="color: lightgray">{key}</code></b>' if key in USER_CONFIG else f'[DYNAMIC CONFIG] &nbsp; <b><code style="color: lightgray">{key}</code></b> &nbsp; <small>(calculated at runtime)</small>'),
"description": None,
"fields": {
'Key': key,
'Type': find_config_type(key),
'Value': CONFIG[key] if key_is_safe(key) else '********',
},
"help_texts": {
'Key': mark_safe(f'''
<a href="https://github.com/ArchiveBox/ArchiveBox/wiki/Configuration#{key.lower()}">Documentation</a> &nbsp;
<span style="display: {"inline" if aliases else "none"}">
Aliases: {", ".join(aliases)}
</span>
'''),
'Type': mark_safe(f'''
<a href="https://github.com/search?q=repo%3AArchiveBox%2FArchiveBox+path%3Aconfig.py+%27{key}%27&type=code">
See full definition in <code>archivebox/config.py</code>...
</a>
'''),
'Value': mark_safe(f'''
{'<b style="color: red">Value is redacted for your security. (Passwords, secrets, API tokens, etc. cannot be viewed in the Web UI)</b><br/><br/>' if not key_is_safe(key) else ''}
Default: <a href="https://github.com/search?q=repo%3AArchiveBox%2FArchiveBox+path%3Aconfig.py+%27{key}%27&type=code">
<code>{find_config_default(key) or 'See 1here...'}</code>
</a>
<br/><br/>
<p style="display: {"block" if key in USER_CONFIG else "none"}">
<i>To change this value, edit <code>data/ArchiveBox.conf</code> or run:</i>
<br/><br/>
<code>archivebox config --set {key}="{
val.strip("'")
if (val := find_config_default(key)) else
(repr(CONFIG[key] if key_is_safe(key) else '********')).strip("'")
}"</code>
</p>
'''),
},
},
],
)

View file

@ -133,64 +133,38 @@ def save_wget(link: Link, out_dir: Optional[Path]=None, timeout: int=TIMEOUT) ->
@enforce_types
def wget_output_path(link: Link) -> Optional[str]:
"""calculate the path to the wgetted .html file, since wget may
adjust some paths to be different than the base_url path.
See docs on wget --adjust-extension (-E)
"""
# Wget downloads can save in a number of different ways depending on the url:
# https://example.com
# > example.com/index.html
# https://example.com?v=zzVa_tX1OiI
# > example.com/index.html?v=zzVa_tX1OiI.html
# https://www.example.com/?v=zzVa_tX1OiI
# > example.com/index.html?v=zzVa_tX1OiI.html
# https://example.com/abc
# > example.com/abc.html
# https://example.com/abc/
# > example.com/abc/index.html
# https://example.com/abc?v=zzVa_tX1OiI.html
# > example.com/abc?v=zzVa_tX1OiI.html
# https://example.com/abc/?v=zzVa_tX1OiI.html
# > example.com/abc/index.html?v=zzVa_tX1OiI.html
# https://example.com/abc/test.html
# > example.com/abc/test.html
# https://example.com/abc/test?v=zzVa_tX1OiI
# > example.com/abc/test?v=zzVa_tX1OiI.html
# https://example.com/abc/test/?v=zzVa_tX1OiI
# > example.com/abc/test/index.html?v=zzVa_tX1OiI.html
# There's also lots of complexity around how the urlencoding and renaming
# is done for pages with query and hash fragments or extensions like shtml / htm / php / etc
# Since the wget algorithm for -E (appending .html) is incredibly complex
# and there's no way to get the computed output path from wget
# in order to avoid having to reverse-engineer how they calculate it,
# we just look in the output folder read the filename wget used from the filesystem
def unsafe_wget_output_path(link: Link) -> Optional[str]:
# There used to be a bunch of complex reverse-engineering path mapping logic here,
# but it was removed in favor of just walking through the output folder recursively to try to find the
# html file that wget produced. It's *much much much* slower than deriving it statically, and is currently
# one of the main bottlenecks of ArchiveBox's performance (the output data is often on a slow HDD or network mount).
# But it's STILL better than trying to figure out URL -> html filepath mappings ourselves from first principles.
full_path = without_fragment(without_query(path(link.url))).strip('/')
search_dir = Path(link.link_dir) / domain(link.url).replace(":", "+") / urldecode(full_path)
for _ in range(4):
if search_dir.exists():
if search_dir.is_dir():
html_files = [
f for f in search_dir.iterdir()
if re.search(".+\\.[Ss]?[Hh][Tt][Mm][Ll]?$", str(f), re.I | re.M)
]
if html_files:
return str(html_files[0].relative_to(link.link_dir))
try:
if search_dir.exists():
if search_dir.is_dir():
html_files = [
f for f in search_dir.iterdir()
if re.search(".+\\.[Ss]?[Hh][Tt][Mm][Ll]?$", str(f), re.I | re.M)
]
if html_files:
return str(html_files[0].relative_to(link.link_dir))
# sometimes wget'd URLs have no ext and return non-html
# e.g. /some/example/rss/all -> some RSS XML content)
# /some/other/url.o4g -> some binary unrecognized ext)
# test this with archivebox add --depth=1 https://getpocket.com/users/nikisweeting/feed/all
last_part_of_url = urldecode(full_path.rsplit('/', 1)[-1])
for file_present in search_dir.iterdir():
if file_present == last_part_of_url:
return str((search_dir / file_present).relative_to(link.link_dir))
# sometimes wget'd URLs have no ext and return non-html
# e.g. /some/example/rss/all -> some RSS XML content)
# /some/other/url.o4g -> some binary unrecognized ext)
# test this with archivebox add --depth=1 https://getpocket.com/users/nikisweeting/feed/all
last_part_of_url = urldecode(full_path.rsplit('/', 1)[-1])
for file_present in search_dir.iterdir():
if file_present == last_part_of_url:
return str((search_dir / file_present).relative_to(link.link_dir))
except OSError:
# OSError 36 and others can happen here, caused by trying to check for impossible paths
# (paths derived from URLs can often contain illegal unicode characters or be too long,
# causing the OS / filesystem to reject trying to open them with a system-level error)
pass
# Move up one directory level
search_dir = search_dir.parent
@ -200,10 +174,93 @@ def wget_output_path(link: Link) -> Optional[str]:
# check for literally any file present that isnt an empty folder
domain_dir = Path(domain(link.url).replace(":", "+"))
files_within = list((Path(link.link_dir) / domain_dir).glob('**/*.*'))
files_within = [path for path in (Path(link.link_dir) / domain_dir).glob('**/*.*') if not str(path).endswith('.orig')]
if files_within:
return str((domain_dir / files_within[-1]).relative_to(link.link_dir))
# abandon all hope, wget either never downloaded, or it produced an output path so horribly mutilated
# that it's better we just pretend it doesnt exist
# this is why ArchiveBox's specializes in REDUNDANTLY saving copies of sites with multiple different tools
return None
@enforce_types
def wget_output_path(link: Link) -> Optional[str]:
"""calculate the path to the wgetted .html file, since wget may
adjust some paths to be different than the base_url path.
See docs on: wget --adjust-extension (-E), --restrict-file-names=windows|unix|ascii, --convert-links
WARNING: this function is extremely error prone because mapping URLs to filesystem paths deterministically
is basically impossible. Every OS and filesystem have different requirements on what special characters are
allowed, and URLs are *full* of all kinds of special characters, illegal unicode, and generally unsafe strings
that you dont want anywhere near your filesystem. Also URLs can be obscenely long, but most filesystems dont
accept paths longer than 250 characters. On top of all that, this function only exists to try to reverse engineer
wget's approach to solving this problem, so this is a shittier, less tested version of their already insanely
complicated attempt to do this. Here be dragons:
- https://github.com/ArchiveBox/ArchiveBox/issues/549
- https://github.com/ArchiveBox/ArchiveBox/issues/1373
- https://stackoverflow.com/questions/9532499/check-whether-a-path-is-valid-in-python-without-creating-a-file-at-the-paths-ta
- and probably many more that I didn't realize were caused by this...
The only constructive thing we could possibly do to this function is to figure out how to remove it.
Preach loudly to anyone who will listen: never attempt to map URLs to filesystem paths,
and pray you never have to deal with the aftermath of someone else's attempt to do so...
"""
# Wget downloads can save in a number of different ways depending on the url:
# https://example.com
# > example.com/index.html
# https://example.com?v=zzVa_tX1OiI
# > example.com/index.html@v=zzVa_tX1OiI.html
# https://www.example.com/?v=zzVa_tX1OiI
# > example.com/index.html@v=zzVa_tX1OiI.html
# https://example.com/abc
# > example.com/abc.html
# https://example.com/abc/
# > example.com/abc/index.html
# https://example.com/abc?v=zzVa_tX1OiI.html
# > example.com/abc@v=zzVa_tX1OiI.html
# https://example.com/abc/?v=zzVa_tX1OiI.html
# > example.com/abc/index.html@v=zzVa_tX1OiI.html
# https://example.com/abc/test.html
# > example.com/abc/test.html
# https://example.com/abc/test?v=zzVa_tX1OiI
# > example.com/abc/test@v=zzVa_tX1OiI.html
# https://example.com/abc/test/?v=zzVa_tX1OiI
# > example.com/abc/test/index.html@v=zzVa_tX1OiI.html
# There's also lots of complexity around how the urlencoding and renaming
# is done for pages with query and hash fragments, extensions like shtml / htm / php / etc,
# unicode escape sequences, punycode domain names, unicode double-width characters, extensions longer than
# 4 characters, paths with multipe extensions, etc. the list goes on...
output_path = None
try:
output_path = unsafe_wget_output_path(link)
except Exception as err:
pass # better to pretend it just failed to download than expose gnarly OSErrors to users
# check for unprintable unicode characters
# https://github.com/ArchiveBox/ArchiveBox/issues/1373
if output_path:
safe_path = output_path.encode('utf-8', 'replace').decode()
if output_path != safe_path:
# contains unprintable unicode characters that will break other parts of archivebox
# better to pretend it doesnt exist and fallback to parent dir than crash archivebox
output_path = None
# check for a path that is just too long to safely handle across different OS's
# https://github.com/ArchiveBox/ArchiveBox/issues/549
if output_path and len(output_path) > 250:
output_path = None
if output_path:
return output_path
# fallback to just the domain dir
search_dir = Path(link.link_dir) / domain(link.url).replace(":", "+")
if search_dir.is_dir():

0
archivebox/index.sqlite3 Normal file
View file

View file

@ -4,6 +4,7 @@ WARNING: THIS FILE IS ALL LEGACY CODE TO BE REMOVED.
DO NOT ADD ANY NEW FEATURES TO THIS FILE, NEW CODE GOES HERE: core/models.py
These are the old types we used to use before ArchiveBox v0.4 (before we switched to Django).
"""
__package__ = 'archivebox.index'

View file

@ -638,17 +638,15 @@ def printable_folder_status(name: str, folder: Dict) -> str:
@enforce_types
def printable_dependency_version(name: str, dependency: Dict) -> str:
version = None
color, symbol, note, version = 'red', 'X', 'invalid', '?'
if dependency['enabled']:
if dependency['is_valid']:
color, symbol, note, version = 'green', '', 'valid', ''
color, symbol, note = 'green', '', 'valid'
parsed_version_num = re.search(r'[\d\.]+', dependency['version'])
if parsed_version_num:
version = f'v{parsed_version_num[0]}'
if not version:
color, symbol, note, version = 'red', 'X', 'invalid', '?'
else:
color, symbol, note, version = 'lightyellow', '-', 'disabled', '-'

View file

@ -104,7 +104,6 @@ from .config import (
COMMIT_HASH,
BUILD_TIME,
CODE_LOCATIONS,
EXTERNAL_LOCATIONS,
DATA_LOCATIONS,
DEPENDENCIES,
CHROME_BINARY,
@ -231,7 +230,7 @@ def version(quiet: bool=False,
p = platform.uname()
print(
'ArchiveBox v{}'.format(get_version(CONFIG)),
*((f'COMMIT_HASH={COMMIT_HASH[:7]}',) if COMMIT_HASH else ()),
f'COMMIT_HASH={COMMIT_HASH[:7] if COMMIT_HASH else "unknown"}',
f'BUILD_TIME={BUILD_TIME}',
)
print(
@ -272,11 +271,6 @@ def version(quiet: bool=False,
for name, path in CODE_LOCATIONS.items():
print(printable_folder_status(name, path))
print()
print('{white}[i] Secrets locations:{reset}'.format(**ANSI))
for name, path in EXTERNAL_LOCATIONS.items():
print(printable_folder_status(name, path))
print()
if DATA_LOCATIONS['OUTPUT_DIR']['is_valid']:
print('{white}[i] Data locations:{reset}'.format(**ANSI))
@ -695,7 +689,7 @@ def add(urls: Union[str, List[str]],
if CAN_UPGRADE:
hint(f"There's a new version of ArchiveBox available! Your current version is {VERSION}. You can upgrade to {VERSIONS_AVAILABLE['recommended_version']['tag_name']} ({VERSIONS_AVAILABLE['recommended_version']['html_url']}). For more on how to upgrade: https://github.com/ArchiveBox/ArchiveBox/wiki/Upgrading-or-Merging-Archives\n")
return all_links
return new_links
@enforce_types
def remove(filter_str: Optional[str]=None,
@ -1362,7 +1356,7 @@ def manage(args: Optional[List[str]]=None, out_dir: Path=OUTPUT_DIR) -> None:
if (args and "createsuperuser" in args) and (IN_DOCKER and not IS_TTY):
stderr('[!] Warning: you need to pass -it to use interactive commands in docker', color='lightyellow')
stderr(' docker run -it archivebox manage {}'.format(' '.join(args or ['...'])), color='lightyellow')
stderr()
stderr('')
execute_from_command_line([f'{ARCHIVEBOX_BINARY} manage', *(args or ['help'])])

View file

@ -7,7 +7,7 @@ if __name__ == '__main__':
# versions of ./manage.py commands whenever possible. When that's not possible
# (e.g. makemigrations), you can comment out this check temporarily
if not ('makemigrations' in sys.argv or 'migrate' in sys.argv):
if not ('makemigrations' in sys.argv or 'migrate' in sys.argv or 'startapp' in sys.argv):
print("[X] Don't run ./manage.py directly (unless you are a developer running makemigrations):")
print()
print(' Hint: Use these archivebox CLI commands instead of the ./manage.py equivalents:')

2391
archivebox/package-lock.json generated Normal file

File diff suppressed because it is too large Load diff

View file

@ -1,6 +1,6 @@
{
"name": "archivebox",
"version": "0.7.3",
"version": "0.8.0",
"description": "ArchiveBox: The self-hosted internet archive",
"author": "Nick Sweeting <archivebox-npm@sweeting.me>",
"repository": "github:ArchiveBox/ArchiveBox",
@ -8,6 +8,6 @@
"dependencies": {
"@postlight/parser": "^2.2.3",
"readability-extractor": "github:ArchiveBox/readability-extractor",
"single-file-cli": "^1.1.46"
"single-file-cli": "^1.1.54"
}
}

View file

@ -7,7 +7,6 @@ For examples of supported import formats see tests/.
__package__ = 'archivebox.parsers'
import re
from io import StringIO
from typing import IO, Tuple, List, Optional
@ -28,7 +27,6 @@ from ..util import (
htmldecode,
download_url,
enforce_types,
URL_REGEX,
)
from ..index.schema import Link
from ..logging_util import TimedProgress, log_source_saved
@ -202,54 +200,3 @@ def save_file_as_source(path: str, timeout: int=TIMEOUT, filename: str='{ts}-{ba
log_source_saved(source_file=source_path)
return source_path
# Check that plain text regex URL parsing works as expected
# this is last-line-of-defense to make sure the URL_REGEX isn't
# misbehaving due to some OS-level or environment level quirks (e.g. bad regex lib)
# the consequences of bad URL parsing could be disastrous and lead to many
# incorrect/badly parsed links being added to the archive, so this is worth the cost of checking
_test_url_strs = {
'example.com': 0,
'/example.com': 0,
'//example.com': 0,
':/example.com': 0,
'://example.com': 0,
'htt://example8.com': 0,
'/htt://example.com': 0,
'https://example': 1,
'https://localhost/2345': 1,
'https://localhost:1234/123': 1,
'://': 0,
'https://': 0,
'http://': 0,
'ftp://': 0,
'ftp://example.com': 0,
'https://example.com': 1,
'https://example.com/': 1,
'https://a.example.com': 1,
'https://a.example.com/': 1,
'https://a.example.com/what/is/happening.html': 1,
'https://a.example.com/what/ís/happening.html': 1,
'https://a.example.com/what/is/happening.html?what=1&2%20b#höw-about-this=1a': 1,
'https://a.example.com/what/is/happéning/?what=1&2%20b#how-aboüt-this=1a': 1,
'HTtpS://a.example.com/what/is/happening/?what=1&2%20b#how-about-this=1af&2f%20b': 1,
'https://example.com/?what=1#how-about-this=1&2%20baf': 1,
'https://example.com?what=1#how-about-this=1&2%20baf': 1,
'<test>http://example7.com</test>': 1,
'https://<test>': 0,
'https://[test]': 0,
'http://"test"': 0,
'http://\'test\'': 0,
'[https://example8.com/what/is/this.php?what=1]': 1,
'[and http://example9.com?what=1&other=3#and-thing=2]': 1,
'<what>https://example10.com#and-thing=2 "</about>': 1,
'abc<this["https://example11.com/what/is#and-thing=2?whoami=23&where=1"]that>def': 1,
'sdflkf[what](https://example12.com/who/what.php?whoami=1#whatami=2)?am=hi': 1,
'<or>http://examplehttp://15.badc</that>': 2,
'https://a.example.com/one.html?url=http://example.com/inside/of/another?=http://': 2,
'[https://a.example.com/one.html?url=http://example.com/inside/of/another?=](http://a.example.com)': 3,
}
for url_str, num_urls in _test_url_strs.items():
assert len(re.findall(URL_REGEX, url_str)) == num_urls, (
f'{url_str} does not contain {num_urls} urls')

View file

@ -10,7 +10,7 @@ from ..index.schema import Link
from ..util import (
htmldecode,
enforce_types,
URL_REGEX,
find_all_urls,
)
from html.parser import HTMLParser
from urllib.parse import urljoin
@ -40,10 +40,22 @@ def parse_generic_html_export(html_file: IO[str], root_url: Optional[str]=None,
parser.feed(line)
for url in parser.urls:
if root_url:
# resolve relative urls /home.html -> https://example.com/home.html
url = urljoin(root_url, url)
for archivable_url in re.findall(URL_REGEX, url):
url_is_absolute = (url.lower().startswith('http://') or url.lower().startswith('https://'))
# url = https://abc.com => True
# url = /page.php?next=https://example.com => False
if not url_is_absolute: # resolve it by joining it with root_url
relative_path = url
url = urljoin(root_url, relative_path) # https://example.com/somepage.html + /home.html
# => https://example.com/home.html
# special case to handle bug around // handling, crucial for urls that contain sub-urls
# e.g. https://web.archive.org/web/https://example.com
if did_urljoin_misbehave(root_url, relative_path, url):
url = fix_urljoin_bug(url)
for archivable_url in find_all_urls(url):
yield Link(
url=htmldecode(archivable_url),
timestamp=str(datetime.now(timezone.utc).timestamp()),
@ -56,3 +68,74 @@ def parse_generic_html_export(html_file: IO[str], root_url: Optional[str]=None,
KEY = 'html'
NAME = 'Generic HTML'
PARSER = parse_generic_html_export
#### WORKAROUND CODE FOR https://github.com/python/cpython/issues/96015 ####
def did_urljoin_misbehave(root_url: str, relative_path: str, final_url: str) -> bool:
"""
Handle urljoin edge case bug where multiple slashes get turned into a single slash:
- https://github.com/python/cpython/issues/96015
- https://github.com/ArchiveBox/ArchiveBox/issues/1411
This workaround only fixes the most common case of a sub-URL inside an outer URL, e.g.:
https://web.archive.org/web/https://example.com/some/inner/url
But there are other valid URLs containing // that are not fixed by this workaround, e.g.:
https://example.com/drives/C//some/file
"""
# if relative path is actually an absolute url, cut off its own scheme so we check the path component only
relative_path = relative_path.lower()
if relative_path.startswith('http://') or relative_path.startswith('https://'):
relative_path = relative_path.split('://', 1)[-1]
# TODO: properly fix all double // getting stripped by urljoin, not just ://
original_path_had_suburl = '://' in relative_path
original_root_had_suburl = '://' in root_url[8:] # ignore first 8 chars because root always starts with https://
final_joined_has_suburl = '://' in final_url[8:] # ignore first 8 chars because final always starts with https://
urljoin_broke_suburls = (
(original_root_had_suburl or original_path_had_suburl)
and not final_joined_has_suburl
)
return urljoin_broke_suburls
def fix_urljoin_bug(url: str, nesting_limit=5):
"""
recursively replace broken suburls .../http:/... with .../http://...
basically equivalent to this for 99.9% of cases:
url = url.replace('/http:/', '/http://')
url = url.replace('/https:/', '/https://')
except this handles:
other schemes besides http/https (e.g. https://example.com/link/git+ssh://github.com/example)
other preceding separators besides / (e.g. https://example.com/login/?next=https://example.com/home)
fixing multiple suburls recursively
"""
input_url = url
for _ in range(nesting_limit):
url = re.sub(
r'(?P<root>.+?)' # https://web.archive.org/web
+ r'(?P<separator>[-=/_&+%$#@!*\(\\])' # /
+ r'(?P<subscheme>[a-zA-Z0-9+_-]{1,32}?):/' # http:/
+ r'(?P<suburl>[^/\\]+)', # example.com
r"\1\2\3://\4",
input_url,
re.IGNORECASE | re.UNICODE,
)
if url == input_url:
break # nothing left to replace, all suburls are fixed
input_url = url
return url
# sanity check to make sure workaround code works as expected and doesnt introduce *more* bugs
assert did_urljoin_misbehave('https://web.archive.org/web/https://example.com', 'abc.html', 'https://web.archive.org/web/https:/example.com/abc.html') == True
assert did_urljoin_misbehave('http://example.com', 'https://web.archive.org/web/http://example.com/abc.html', 'https://web.archive.org/web/http:/example.com/abc.html') == True
assert fix_urljoin_bug('https:/example.com') == 'https:/example.com' # should not modify original url's scheme, only sub-urls
assert fix_urljoin_bug('https://web.archive.org/web/https:/example.com/abc.html') == 'https://web.archive.org/web/https://example.com/abc.html'
assert fix_urljoin_bug('http://example.com/link/git+ssh:/github.com/example?next=ftp:/example.com') == 'http://example.com/link/git+ssh://github.com/example?next=ftp://example.com'

View file

@ -72,21 +72,13 @@ def parse_generic_json_export(json_file: IO[str], **_kwargs) -> Iterable[Link]:
json_file.seek(0)
try:
links = json.load(json_file)
if type(links) != list:
raise Exception('JSON parser expects list of objects, maybe this is JSONL?')
except json.decoder.JSONDecodeError:
# sometimes the first line is a comment or other junk, so try without
json_file.seek(0)
first_line = json_file.readline()
#print(' > Trying JSON parser without first line: "', first_line.strip(), '"', sep= '')
links = json.load(json_file)
# we may fail again, which means we really don't know what to do
links = json.load(json_file)
if type(links) != list:
raise Exception('JSON parser expects list of objects, maybe this is JSONL?')
for link in links:
if link:
yield jsonObjectToLink(link,json_file.name)
yield jsonObjectToLink(link, json_file.name)
KEY = 'json'
NAME = 'Generic JSON'

View file

@ -3,11 +3,9 @@ __package__ = 'archivebox.parsers'
import json
from typing import IO, Iterable
from datetime import datetime, timezone
from ..index.schema import Link
from ..util import (
htmldecode,
enforce_types,
)

View file

@ -1,8 +1,6 @@
__package__ = 'archivebox.parsers'
__description__ = 'Plain Text'
import re
from typing import IO, Iterable
from datetime import datetime, timezone
from pathlib import Path
@ -11,7 +9,7 @@ from ..index.schema import Link
from ..util import (
htmldecode,
enforce_types,
URL_REGEX
find_all_urls,
)
@ -39,7 +37,7 @@ def parse_generic_txt_export(text_file: IO[str], **_kwargs) -> Iterable[Link]:
pass
# otherwise look for anything that looks like a URL in the line
for url in re.findall(URL_REGEX, line):
for url in find_all_urls(line):
yield Link(
url=htmldecode(url),
timestamp=str(datetime.now(timezone.utc).timestamp()),
@ -48,17 +46,6 @@ def parse_generic_txt_export(text_file: IO[str], **_kwargs) -> Iterable[Link]:
sources=[text_file.name],
)
# look inside the URL for any sub-urls, e.g. for archive.org links
# https://web.archive.org/web/20200531203453/https://www.reddit.com/r/socialism/comments/gu24ke/nypd_officers_claim_they_are_protecting_the_rule/fsfq0sw/
# -> https://www.reddit.com/r/socialism/comments/gu24ke/nypd_officers_claim_they_are_protecting_the_rule/fsfq0sw/
for sub_url in re.findall(URL_REGEX, line[1:]):
yield Link(
url=htmldecode(sub_url),
timestamp=str(datetime.now(timezone.utc).timestamp()),
title=None,
tags=None,
sources=[text_file.name],
)
KEY = 'txt'
NAME = 'Generic TXT'

View file

@ -6,6 +6,7 @@
<a href="/admin/core/tag/">Tags</a> |
<a href="/admin/core/archiveresult/?o=-1">Log</a> &nbsp; &nbsp;
<a href="{% url 'Docs' %}" target="_blank" rel="noopener noreferrer">Docs</a> |
<a href="/api">API</a> |
<a href="{% url 'public-index' %}">Public</a> |
<a href="/admin/">Admin</a>
&nbsp; &nbsp;
@ -16,7 +17,7 @@
{% endblock %}
{% block userlinks %}
{% if user.has_usable_password %}
<a href="{% url 'admin:password_change' %}">Account</a> /
<a href="{% url 'admin:password_change' %}" title="Change your account password">Account</a> /
{% endif %}
<a href="{% url 'admin:logout' %}">{% trans 'Log out' %}</a>
{% endblock %}

View file

@ -57,19 +57,62 @@ short_ts = lambda ts: str(parse_date(ts).timestamp()).split('.')[0]
ts_to_date_str = lambda ts: ts and parse_date(ts).strftime('%Y-%m-%d %H:%M')
ts_to_iso = lambda ts: ts and parse_date(ts).isoformat()
COLOR_REGEX = re.compile(r'\[(?P<arg_1>\d+)(;(?P<arg_2>\d+)(;(?P<arg_3>\d+))?)?m')
# https://mathiasbynens.be/demo/url-regex
URL_REGEX = re.compile(
r'(?=('
r'http[s]?://' # start matching from allowed schemes
r'(?:[a-zA-Z]|[0-9]' # followed by allowed alphanum characters
r'|[-_$@.&+!*\(\),]' # or allowed symbols (keep hyphen first to match literal hyphen)
r'|(?:%[0-9a-fA-F][0-9a-fA-F]))' # or allowed unicode bytes
r'[^\]\[\(\)<>"\'\s]+' # stop parsing at these symbols
r'(?=('
r'http[s]?://' # start matching from allowed schemes
r'(?:[a-zA-Z]|[0-9]' # followed by allowed alphanum characters
r'|[-_$@.&+!*\(\),]' # or allowed symbols (keep hyphen first to match literal hyphen)
r'|[^\u0000-\u007F])+' # or allowed unicode bytes
r'[^\]\[<>"\'\s]+' # stop parsing at these symbols
r'))',
re.IGNORECASE,
re.IGNORECASE | re.UNICODE,
)
COLOR_REGEX = re.compile(r'\[(?P<arg_1>\d+)(;(?P<arg_2>\d+)(;(?P<arg_3>\d+))?)?m')
def parens_are_matched(string: str, open_char='(', close_char=')'):
"""check that all parentheses in a string are balanced and nested properly"""
count = 0
for c in string:
if c == open_char:
count += 1
elif c == close_char:
count -= 1
if count < 0:
return False
return count == 0
def fix_url_from_markdown(url_str: str) -> str:
"""
cleanup a regex-parsed url that may contain dangling trailing parens from markdown link syntax
helpful to fix URLs parsed from markdown e.g.
input: https://wikipedia.org/en/some_article_(Disambiguation).html?abc=def).somemoretext
result: https://wikipedia.org/en/some_article_(Disambiguation).html?abc=def
IMPORTANT ASSUMPTION: valid urls wont have unbalanced or incorrectly nested parentheses
e.g. this will fail the user actually wants to ingest a url like 'https://example.com/some_wei)(rd_url'
in that case it will return https://example.com/some_wei (truncated up to the first unbalanced paren)
This assumption is true 99.9999% of the time, and for the rare edge case the user can use url_list parser.
"""
trimmed_url = url_str
# cut off one trailing character at a time
# until parens are balanced e.g. /a(b)c).x(y)z -> /a(b)c
while not parens_are_matched(trimmed_url):
trimmed_url = trimmed_url[:-1]
# make sure trimmed url is still valid
if re.findall(URL_REGEX, trimmed_url):
return trimmed_url
return url_str
def find_all_urls(urls_str: str):
for url in re.findall(URL_REGEX, urls_str):
yield fix_url_from_markdown(url)
def is_static_file(url: str):
# TODO: the proper way is with MIME type detection + ext, not only extension
@ -315,7 +358,8 @@ def chrome_cleanup():
if IN_DOCKER and lexists("/home/archivebox/.config/chromium/SingletonLock"):
remove_file("/home/archivebox/.config/chromium/SingletonLock")
def ansi_to_html(text):
@enforce_types
def ansi_to_html(text: str) -> str:
"""
Based on: https://stackoverflow.com/questions/19212665/python-converting-ansi-color-codes-to-html
"""
@ -399,3 +443,98 @@ class ExtendedEncoder(pyjson.JSONEncoder):
return pyjson.JSONEncoder.default(self, obj)
### URL PARSING TESTS / ASSERTIONS
# Check that plain text regex URL parsing works as expected
# this is last-line-of-defense to make sure the URL_REGEX isn't
# misbehaving due to some OS-level or environment level quirks (e.g. regex engine / cpython / locale differences)
# the consequences of bad URL parsing could be disastrous and lead to many
# incorrect/badly parsed links being added to the archive, so this is worth the cost of checking
assert fix_url_from_markdown('http://example.com/a(b)c).x(y)z') == 'http://example.com/a(b)c'
assert fix_url_from_markdown('https://wikipedia.org/en/some_article_(Disambiguation).html?abc=def).link(with)_trailingtext') == 'https://wikipedia.org/en/some_article_(Disambiguation).html?abc=def'
URL_REGEX_TESTS = [
('https://example.com', ['https://example.com']),
('http://abc-file234example.com/abc?def=abc&23423=sdfsdf#abc=234&234=a234', ['http://abc-file234example.com/abc?def=abc&23423=sdfsdf#abc=234&234=a234']),
('https://twitter.com/share?url=https://akaao.success-corp.co.jp&text=ア@サ!ト&hashtags=ア%オ,元+ア.ア-オ_イ*シ$ロ abc', ['https://twitter.com/share?url=https://akaao.success-corp.co.jp&text=ア@サ!ト&hashtags=ア%オ,元+ア.ア-オ_イ*シ$ロ', 'https://akaao.success-corp.co.jp&text=ア@サ!ト&hashtags=ア%オ,元+ア.ア-オ_イ*シ$ロ']),
('<a href="https://twitter.com/share#url=https://akaao.success-corp.co.jp&text=ア@サ!ト?hashtags=ア%オ,元+ア&abc=.ア-オ_イ*シ$ロ"> abc', ['https://twitter.com/share#url=https://akaao.success-corp.co.jp&text=ア@サ!ト?hashtags=ア%オ,元+ア&abc=.ア-オ_イ*シ$ロ', 'https://akaao.success-corp.co.jp&text=ア@サ!ト?hashtags=ア%オ,元+ア&abc=.ア-オ_イ*シ$ロ']),
('///a', []),
('http://', []),
('http://../', ['http://../']),
('http://-error-.invalid/', ['http://-error-.invalid/']),
('https://a(b)c+1#2?3&4/', ['https://a(b)c+1#2?3&4/']),
('http://उदाहरण.परीक्षा', ['http://उदाहरण.परीक्षा']),
('http://例子.测试', ['http://例子.测试']),
('http://➡.ws/䨹 htps://abc.1243?234', ['http://➡.ws/䨹']),
('http://⌘.ws">https://exa+mple.com//:abc ', ['http://⌘.ws', 'https://exa+mple.com//:abc']),
('http://مثال.إختبار/abc?def=ت&ب=abc#abc=234', ['http://مثال.إختبار/abc?def=ت&ب=abc#abc=234']),
('http://-.~_!$&()*+,;=:%40:80%2f::::::@example.c\'om', ['http://-.~_!$&()*+,;=:%40:80%2f::::::@example.c']),
('http://us:pa@ex.co:42/http://ex.co:19/a?_d=4#-a=2.3', ['http://us:pa@ex.co:42/http://ex.co:19/a?_d=4#-a=2.3', 'http://ex.co:19/a?_d=4#-a=2.3']),
('http://code.google.com/events/#&product=browser', ['http://code.google.com/events/#&product=browser']),
('http://foo.bar?q=Spaces should be encoded', ['http://foo.bar?q=Spaces']),
('http://foo.com/blah_(wikipedia)#c(i)t[e]-1', ['http://foo.com/blah_(wikipedia)#c(i)t']),
('http://foo.com/(something)?after=parens', ['http://foo.com/(something)?after=parens']),
('http://foo.com/unicode_(✪)_in_parens) abc', ['http://foo.com/unicode_(✪)_in_parens']),
('http://foo.bar/?q=Test%20URL-encoded%20stuff', ['http://foo.bar/?q=Test%20URL-encoded%20stuff']),
('[xyz](http://a.b/?q=(Test)%20U)RL-encoded%20stuff', ['http://a.b/?q=(Test)%20U']),
('[xyz](http://a.b/?q=(Test)%20U)-ab https://abc+123', ['http://a.b/?q=(Test)%20U', 'https://abc+123']),
('[xyz](http://a.b/?q=(Test)%20U) https://a(b)c+12)3', ['http://a.b/?q=(Test)%20U', 'https://a(b)c+12']),
('[xyz](http://a.b/?q=(Test)a\nabchttps://a(b)c+12)3', ['http://a.b/?q=(Test)a', 'https://a(b)c+12']),
('http://foo.bar/?q=Test%20URL-encoded%20stuff', ['http://foo.bar/?q=Test%20URL-encoded%20stuff']),
]
for urls_str, expected_url_matches in URL_REGEX_TESTS:
url_matches = list(find_all_urls(urls_str))
assert url_matches == expected_url_matches, 'FAILED URL_REGEX CHECK!'
# More test cases
_test_url_strs = {
'example.com': 0,
'/example.com': 0,
'//example.com': 0,
':/example.com': 0,
'://example.com': 0,
'htt://example8.com': 0,
'/htt://example.com': 0,
'https://example': 1,
'https://localhost/2345': 1,
'https://localhost:1234/123': 1,
'://': 0,
'https://': 0,
'http://': 0,
'ftp://': 0,
'ftp://example.com': 0,
'https://example.com': 1,
'https://example.com/': 1,
'https://a.example.com': 1,
'https://a.example.com/': 1,
'https://a.example.com/what/is/happening.html': 1,
'https://a.example.com/what/ís/happening.html': 1,
'https://a.example.com/what/is/happening.html?what=1&2%20b#höw-about-this=1a': 1,
'https://a.example.com/what/is/happéning/?what=1&2%20b#how-aboüt-this=1a': 1,
'HTtpS://a.example.com/what/is/happening/?what=1&2%20b#how-about-this=1af&2f%20b': 1,
'https://example.com/?what=1#how-about-this=1&2%20baf': 1,
'https://example.com?what=1#how-about-this=1&2%20baf': 1,
'<test>http://example7.com</test>': 1,
'https://<test>': 0,
'https://[test]': 0,
'http://"test"': 0,
'http://\'test\'': 0,
'[https://example8.com/what/is/this.php?what=1]': 1,
'[and http://example9.com?what=1&other=3#and-thing=2]': 1,
'<what>https://example10.com#and-thing=2 "</about>': 1,
'abc<this["https://example11.com/what/is#and-thing=2?whoami=23&where=1"]that>def': 1,
'sdflkf[what](https://example12.com/who/what.php?whoami=1#whatami=2)?am=hi': 1,
'<or>http://examplehttp://15.badc</that>': 2,
'https://a.example.com/one.html?url=http://example.com/inside/of/another?=http://': 2,
'[https://a.example.com/one.html?url=http://example.com/inside/of/another?=](http://a.example.com)': 3,
}
for url_str, num_urls in _test_url_strs.items():
assert len(list(find_all_urls(url_str))) == num_urls, (
f'{url_str} does not contain {num_urls} urls')

6
archivebox/vendor/requirements.txt vendored Normal file
View file

@ -0,0 +1,6 @@
# this folder contains vendored versions of these packages
atomicwrites==1.4.0
pocket==0.3.7
django-taggit==1.3.0
base32-crockford==0.3.0

View file

@ -31,6 +31,20 @@ else
echo "[!] Warning: No virtualenv presesnt in $REPO_DIR.venv"
fi
# Build python package lists
# https://pdm-project.org/latest/usage/lockfile/
echo "[+] Generating requirements.txt and pdm.lock from pyproject.toml..."
pdm lock --group=':all' --production --lockfile pdm.lock --strategy="cross_platform"
pdm sync --group=':all' --production --lockfile pdm.lock --clean || pdm sync --group=':all' --production --lockfile pdm.lock --clean
pdm export --group=':all' --production --lockfile pdm.lock --without-hashes -o requirements.txt
pdm lock --group=':all' --dev --lockfile pdm.dev.lock --strategy="cross_platform"
pdm sync --group=':all' --dev --lockfile pdm.dev.lock --clean || pdm sync --group=':all' --dev --lockfile pdm.dev.lock --clean
pdm export --group=':all' --dev --lockfile pdm.dev.lock --without-hashes -o requirements-dev.txt
# cleanup build artifacts
rm -Rf build deb_dist dist archivebox-*.tar.gz

View file

@ -21,6 +21,20 @@ VERSION="$(jq -r '.version' < "$REPO_DIR/package.json")"
SHORT_VERSION="$(echo "$VERSION" | perl -pe 's/(\d+)\.(\d+)\.(\d+)/$1.$2/g')"
REQUIRED_PLATFORMS="${2:-"linux/arm64,linux/amd64,linux/arm/v7"}"
# Build python package lists
# https://pdm-project.org/latest/usage/lockfile/
echo "[+] Generating requirements.txt and pdm.lock from pyproject.toml..."
pdm lock --group=':all' --production --lockfile pdm.lock --strategy="cross_platform"
pdm sync --group=':all' --production --lockfile pdm.lock --clean || pdm sync --group=':all' --production --lockfile pdm.lock --clean
pdm export --group=':all' --production --lockfile pdm.lock --without-hashes -o requirements.txt
pdm lock --group=':all' --dev --lockfile pdm.dev.lock --strategy="cross_platform"
pdm sync --group=':all' --dev --lockfile pdm.dev.lock --clean || pdm sync --group=':all' --dev --lockfile pdm.dev.lock --clean
pdm export --group=':all' --dev --lockfile pdm.dev.lock --without-hashes -o requirements-dev.txt
echo "[+] Building Docker image: tag=$TAG_NAME version=$SHORT_VERSION arch=$REQUIRED_PLATFORMS"
@ -32,4 +46,4 @@ docker build . --no-cache -t archivebox-dev --load
# -t archivebox \
# -t archivebox:$TAG_NAME \
# -t archivebox:$VERSION \
# -t archivebox:$SHORT_VERSION
# -t archivebox:$SHORT_VERSION

View file

@ -18,7 +18,7 @@ which docker > /dev/null || exit 1
which jq > /dev/null || exit 1
# which pdm > /dev/null || exit 1
SUPPORTED_PLATFORMS="linux/amd64,linux/arm64,linux/arm/v7"
SUPPORTED_PLATFORMS="linux/amd64,linux/arm64"
TAG_NAME="${1:-$(git rev-parse --abbrev-ref HEAD)}"
VERSION="$(jq -r '.version' < "$REPO_DIR/package.json")"
@ -71,10 +71,8 @@ docker buildx use xbuilder 2>&1 >/dev/null || create_builder
check_platforms || (recreate_builder && check_platforms) || exit 1
# Build python package lists
echo "[+] Generating requirements.txt and pdm.lock from pyproject.toml..."
pdm lock --group=':all' --strategy="cross_platform" --production
pdm export --group=':all' --production --without-hashes -o requirements.txt
# Make sure pyproject.toml, pdm{.dev}.lock, requirements{-dev}.txt, package{-lock}.json are all up-to-date
bash ./bin/lock_pkgs.sh
echo "[+] Building archivebox:$VERSION docker image..."
@ -82,20 +80,20 @@ echo "[+] Building archivebox:$VERSION docker image..."
# docker build . --no-cache -t archivebox-dev \
# replace --load with --push to deploy
docker buildx build --platform "$SELECTED_PLATFORMS" --load . \
-t archivebox/archivebox \
# -t archivebox/archivebox \
-t archivebox/archivebox:$TAG_NAME \
-t archivebox/archivebox:$VERSION \
-t archivebox/archivebox:$SHORT_VERSION \
# -t archivebox/archivebox:$VERSION \
# -t archivebox/archivebox:$SHORT_VERSION \
-t archivebox/archivebox:$GIT_SHA \
-t archivebox/archivebox:latest \
-t nikisweeting/archivebox \
# -t archivebox/archivebox:latest \
# -t nikisweeting/archivebox \
-t nikisweeting/archivebox:$TAG_NAME \
-t nikisweeting/archivebox:$VERSION \
-t nikisweeting/archivebox:$SHORT_VERSION \
# -t nikisweeting/archivebox:$VERSION \
# -t nikisweeting/archivebox:$SHORT_VERSION \
-t nikisweeting/archivebox:$GIT_SHA \
-t nikisweeting/archivebox:latest \
# -t nikisweeting/archivebox:latest \
-t ghcr.io/archivebox/archivebox/archivebox:$TAG_NAME \
-t ghcr.io/archivebox/archivebox/archivebox:$VERSION \
-t ghcr.io/archivebox/archivebox/archivebox:$SHORT_VERSION \
# -t ghcr.io/archivebox/archivebox/archivebox:$VERSION \
# -t ghcr.io/archivebox/archivebox/archivebox:$SHORT_VERSION \
-t ghcr.io/archivebox/archivebox/archivebox:$GIT_SHA \
-t ghcr.io/archivebox/archivebox/archivebox:latest
# -t ghcr.io/archivebox/archivebox/archivebox:latest

View file

@ -20,20 +20,13 @@ else
fi
cd "$REPO_DIR"
echo "[*] Cleaning up build dirs"
cd "$REPO_DIR"
rm -Rf build dist
# Generate pdm.lock, requirements.txt, and package-lock.json
bash ./bin/lock_pkgs.sh
echo "[+] Building sdist, bdist_wheel, and egg_info"
rm -f archivebox/package.json
cp package.json archivebox/package.json
pdm self update
pdm install
rm -Rf build dist
pdm build
pdm export --without-hashes -o ./pip_dist/requirements.txt
cp dist/* ./pip_dist/
echo
echo "[√] Finished. Don't forget to commit the new sdist and wheel files in ./pip_dist/"
echo "[√] Finished. Don't forget to commit the new sdist and wheel files in ./pip_dist/"

View file

@ -18,6 +18,7 @@
# https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html
# set -o xtrace
# set -o nounset
shopt -s nullglob
set -o errexit
set -o errtrace
set -o pipefail
@ -166,13 +167,13 @@ fi
# symlink etc crontabs into place
mkdir -p "$DATA_DIR/crontabs"
if ! test -L /var/spool/cron/crontabs; then
# copy files from old location into new data dir location
for file in $(ls /var/spool/cron/crontabs); do
cp /var/spool/cron/crontabs/"$file" "$DATA_DIR/crontabs"
# move files from old location into new data dir location
for existing_file in /var/spool/cron/crontabs/*; do
mv "$existing_file" "$DATA_DIR/crontabs/"
done
# replace old system path with symlink to data dir location
rm -Rf /var/spool/cron/crontabs
ln -s "$DATA_DIR/crontabs" /var/spool/cron/crontabs
ln -sf "$DATA_DIR/crontabs" /var/spool/cron/crontabs
fi
# set DBUS_SYSTEM_BUS_ADDRESS & DBUS_SESSION_BUS_ADDRESS

View file

@ -15,7 +15,7 @@ DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && cd .. && pwd )"
source "$DIR/.venv/bin/activate"
echo "[*] Running flake8..."
cd archivebox
cd "$DIR/archivebox"
flake8 . && echo "√ No errors found."
echo

101
bin/lock_pkgs.sh Executable file
View file

@ -0,0 +1,101 @@
#!/usr/bin/env bash
### Bash Environment Setup
# http://redsymbol.net/articles/unofficial-bash-strict-mode/
# https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html
# set -o xtrace
set -o errexit
set -o errtrace
set -o nounset
set -o pipefail
IFS=$'\n'
REPO_DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && cd .. && pwd )"
cd "$REPO_DIR"
py_version="$(grep 'version = ' pyproject.toml | awk '{print $3}' | jq -r)"
js_version="$(jq -r '.version' package.json)"
if [[ "$py_version" != "$js_version" ]]; then
echo "[❌] Version in pyproject.toml ($py_version) does not match version in package.json ($js_version)!"
exit 1
fi
echo "[🔒] Locking all ArchiveBox dependencies (pip, npm)"
echo
echo "pyproject.toml: archivebox $py_version"
echo "package.json: archivebox $js_version"
echo
echo
echo "[*] Cleaning up old lockfiles and build files"
deactivate 2>/dev/null || true
rm -Rf build dist
rm -f pdm.lock
rm -f pdm.dev.lock
rm -f requirements.txt
rm -f requirements-dev.txt
rm -f package-lock.json
rm -f archivebox/package.json
rm -f archivebox/package-lock.json
rm -Rf ./.venv
rm -Rf ./node_modules
rm -Rf ./archivebox/node_modules
echo
echo
echo "[+] Generating dev & prod requirements.txt & pdm.lock from pyproject.toml..."
pip install --upgrade pip setuptools
pdm self update >/dev/null 2>&1 || true
pdm venv create 3.10
echo
echo "pyproject.toml: archivebox $(grep 'version = ' pyproject.toml | awk '{print $3}' | jq -r)"
echo "$(which python): $(python --version | head -n 1)"
echo "$(which pdm): $(pdm --version | head -n 1)"
pdm info --env
pdm info
echo
# https://pdm-project.org/latest/usage/lockfile/
# prod
pdm lock --group=':all' --production --lockfile pdm.lock --strategy="cross_platform"
pdm sync --group=':all' --production --lockfile pdm.lock --clean
pdm export --group=':all' --production --lockfile pdm.lock --without-hashes -o requirements.txt
cp ./pdm.lock ./pip_dist/
cp ./requirements.txt ./pip_dist/
# dev
pdm lock --group=':all' --dev --lockfile pdm.dev.lock --strategy="cross_platform"
pdm sync --group=':all' --dev --lockfile pdm.dev.lock --clean
pdm export --group=':all' --dev --lockfile pdm.dev.lock --without-hashes -o requirements-dev.txt
cp ./pdm.dev.lock ./pip_dist/
cp ./requirements-dev.txt ./pip_dist/
echo
echo "[+] Generating package-lock.json from package.json..."
npm install -g npm
echo
echo "package.json: archivebox $(jq -r '.version' package.json)"
echo
echo "$(which node): $(node --version | head -n 1)"
echo "$(which npm): $(npm --version | head -n 1)"
echo
npm install --package-lock-only
cp package.json archivebox/package.json
cp package-lock.json archivebox/package-lock.json
echo
echo "[√] Finished. Don't forget to commit the new lockfiles:"
echo
ls "pyproject.toml" | cat
ls "pdm.lock" | cat
ls "pdm.dev.lock" | cat
ls "requirements.txt" | cat
ls "requirements-dev.txt" | cat
echo
ls "package.json" | cat
ls "package-lock.json" | cat
ls "archivebox/package.json" | cat
ls "archivebox/package-lock.json" | cat

View file

@ -27,9 +27,9 @@ if (which docker-compose > /dev/null && docker pull archivebox/archivebox:latest
if [ -f "./index.sqlite3" ]; then
mv -i ~/archivebox/* ~/archivebox/data/
fi
curl -fsSL 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/main/docker-compose.yml' > docker-compose.yml
curl -fsSL 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/stable/docker-compose.yml' > docker-compose.yml
mkdir -p ./etc
curl -fsSL 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/main/etc/sonic.cfg' > ./etc/sonic.cfg
curl -fsSL 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/stable/etc/sonic.cfg' > ./etc/sonic.cfg
docker compose run --rm archivebox init --setup
echo
echo "[+] Starting ArchiveBox server using: docker compose up -d..."

View file

@ -1,14 +1,12 @@
# Usage:
# docker compose run archivebox init --setup
# docker compose up
# echo "https://example.com" | docker compose run archivebox archivebox add
# docker compose run archivebox add --depth=1 https://example.com/some/feed.rss
# docker compose run archivebox config --set MEDIA_MAX_SIZE=750m
# echo 'https://example.com' | docker compose run -T archivebox add
# docker compose run archivebox add --depth=1 'https://news.ycombinator.com'
# docker compose run archivebox config --set SAVE_ARCHIVE_DOT_ORG=False
# docker compose run archivebox help
# Documentation:
# https://github.com/ArchiveBox/ArchiveBox/wiki/Docker#docker-compose
services:
archivebox:
image: archivebox/archivebox:latest
@ -23,11 +21,11 @@ services:
- PUBLIC_INDEX=True # set to False to prevent anonymous users from viewing snapshot list
- PUBLIC_SNAPSHOTS=True # set to False to prevent anonymous users from viewing snapshot content
- PUBLIC_ADD_VIEW=False # set to True to allow anonymous users to submit new URLs to archive
- SEARCH_BACKEND_ENGINE=sonic # uncomment these and sonic container below for better full-text search
- SEARCH_BACKEND_ENGINE=sonic # tells ArchiveBox to use sonic container below for fast full-text search
- SEARCH_BACKEND_HOST_NAME=sonic
- SEARCH_BACKEND_PASSWORD=SomeSecretPassword
# - PUID=911 # set to your host user's UID & GID if you encounter permissions issues
# - PGID=911
# - PGID=911 # UID/GIDs <500 may clash with existing users and are not recommended
# - MEDIA_MAX_SIZE=750m # increase this filesize limit to allow archiving larger audio/video files
# - TIMEOUT=60 # increase this number to 120+ seconds if you see many slow downloads timing out
# - CHECK_SSL_VALIDITY=True # set to False to disable strict SSL checking (allows saving URLs w/ broken certs)
@ -35,7 +33,6 @@ services:
# ...
# add further configuration options from archivebox/config.py as needed (to apply them only to this container)
# or set using `docker compose run archivebox config --set SOME_KEY=someval` (to persist config across all containers)
# For ad-blocking during archiving, uncomment this section and pihole service section below
# networks:
# - dns
@ -45,51 +42,50 @@ services:
######## Optional Addons: tweak examples below as needed for your specific use case ########
### Enable ability to run regularly scheduled archiving tasks by uncommenting this container
# $ docker compose run archivebox schedule --every=day --depth=1 'https://example.com/some/rss/feed.xml'
# then restart the scheduler container to apply the changes to the schedule
### This optional container runs any scheduled tasks in the background, add new tasks like so:
# $ docker compose run archivebox schedule --add --every=day --depth=1 'https://example.com/some/rss/feed.xml'
# then restart the scheduler container to apply any changes to the scheduled task list:
# $ docker compose restart archivebox_scheduler
archivebox_scheduler:
image: archivebox/archivebox:latest
command: schedule --foreground
environment:
- TIMEOUT=120 # increase if you see timeouts often during archiving / on slow networks
- ONLY_NEW=True # set to False to retry previously failed URLs when re-adding instead of skipping them
# - PUID=502 # set to your host user's UID & GID if you encounter permissions issues
# - PGID=20
volumes:
- ./data:/data
# cpus: 2 # uncomment / edit these values to limit container resource consumption
# mem_limit: 2048m
# shm_size: 1024m
image: archivebox/archivebox:latest
command: schedule --foreground --update --every=day
environment:
- TIMEOUT=120 # use a higher timeout than the main container to give slow tasks more time when retrying
# - PUID=502 # set to your host user's UID & GID if you encounter permissions issues
# - PGID=20
volumes:
- ./data:/data
# cpus: 2 # uncomment / edit these values to limit scheduler container resource consumption
# mem_limit: 2048m
# restart: always
### Runs the Sonic full-text search backend, config file is auto-downloaded into sonic.cfg:
# After starting, backfill any existing Snapshots into the full-text index:
### This runs the optional Sonic full-text search backend (much faster than default rg backend).
# If Sonic is ever started after not running for a while, update its full-text index by running:
# $ docker-compose run archivebox update --index-only
sonic:
image: valeriansaliou/sonic:latest
build:
# custom build just auto-downloads archivebox's default sonic.cfg as a convenience
# not needed if you have already have /etc/sonic.cfg
# not needed after first run / if you have already have ./etc/sonic.cfg present
dockerfile_inline: |
FROM quay.io/curl/curl:latest AS setup
RUN curl -fsSL 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/main/etc/sonic.cfg' > /tmp/sonic.cfg
FROM quay.io/curl/curl:latest AS config_downloader
RUN curl -fsSL 'https://raw.githubusercontent.com/ArchiveBox/ArchiveBox/stable/etc/sonic.cfg' > /tmp/sonic.cfg
FROM valeriansaliou/sonic:latest
COPY --from=setup /tmp/sonic.cfg /etc/sonic.cfg
COPY --from=config_downloader /tmp/sonic.cfg /etc/sonic.cfg
expose:
- 1491
environment:
- SEARCH_BACKEND_PASSWORD=SomeSecretPassword
volumes:
- ./etc/sonic.cfg:/etc/sonic.cfg
- ./sonic.cfg:/etc/sonic.cfg
- ./data/sonic:/var/lib/sonic/store
### Example: Watch the ArchiveBox browser in realtime as it archives things,
# or remote control it to set up logins and credentials for sites you want to archive.
### This container runs xvfb+noVNC so you can watch the ArchiveBox browser as it archives things,
# or remote control it to set up a chrome profile w/ login credentials for sites you want to archive.
# https://github.com/ArchiveBox/ArchiveBox/wiki/Chromium-Install#setting-up-a-chromium-user-profile
novnc:
@ -99,11 +95,13 @@ services:
- DISPLAY_HEIGHT=1080
- RUN_XTERM=no
ports:
# to view/control ArchiveBox's browser, visit: http://localhost:8080/vnc.html
- "8080:8080"
# to view/control ArchiveBox's browser, visit: http://127.0.0.1:8080/vnc.html
# restricted to access from localhost by default because it has no authentication
- 127.0.0.1:8080:8080
### Example: Put Nginx in front of the ArchiveBox server for SSL termination
### Example: Put Nginx in front of the ArchiveBox server for SSL termination and static file serving.
# You can also any other ingress provider for SSL like Apache, Caddy, Traefik, Cloudflare Tunnels, etc.
# nginx:
# image: nginx:alpine
@ -121,7 +119,8 @@ services:
# pihole:
# image: pihole/pihole:latest
# ports:
# - 127.0.0.1:8090:80 # uncomment to access the admin HTTP interface on http://localhost:8090
# # access the admin HTTP interface on http://localhost:8090
# - 127.0.0.1:8090:80
# environment:
# - WEBPASSWORD=SET_THIS_TO_SOME_SECRET_PASSWORD_FOR_ADMIN_DASHBOARD
# - DNSMASQ_LISTENING=all
@ -136,7 +135,44 @@ services:
# - ./etc/dnsmasq:/etc/dnsmasq.d
### Example: run all your ArchiveBox traffic through a WireGuard VPN tunnel
### Example: Enable ability to run regularly scheduled archiving tasks by uncommenting this container
# $ docker compose run archivebox schedule --every=day --depth=1 'https://example.com/some/rss/feed.xml'
# then restart the scheduler container to apply the changes to the schedule
# $ docker compose restart archivebox_scheduler
# archivebox_scheduler:
# image: archivebox/archivebox:latest
# command: schedule --foreground
# environment:
# - MEDIA_MAX_SIZE=750m # increase this number to allow archiving larger audio/video files
# # - TIMEOUT=60 # increase if you see timeouts often during archiving / on slow networks
# # - ONLY_NEW=True # set to False to retry previously failed URLs when re-adding instead of skipping them
# # - CHECK_SSL_VALIDITY=True # set to False to allow saving URLs w/ broken SSL certs
# # - SAVE_ARCHIVE_DOT_ORG=True # set to False to disable submitting URLs to Archive.org when archiving
# # - PUID=502 # set to your host user's UID & GID if you encounter permissions issues
# # - PGID=20
# volumes:
# - ./data:/data
# - ./etc/crontabs:/var/spool/cron/crontabs
# # cpus: 2 # uncomment / edit these values to limit container resource consumption
# # mem_limit: 2048m
# # shm_size: 1024m
### Example: Put Nginx in front of the ArchiveBox server for SSL termination
# nginx:
# image: nginx:alpine
# ports:
# - 443:443
# - 80:80
# volumes:
# - ./etc/nginx.conf:/etc/nginx/nginx.conf
# - ./data:/var/www
### Example: run all your ArchiveBox traffic through a WireGuard VPN tunnel to avoid IP blocks.
# You can also use any other VPN that works at the docker IP level, e.g. Tailscale, OpenVPN, etc.
# wireguard:
# image: linuxserver/wireguard:latest
@ -167,10 +203,30 @@ services:
networks:
# network needed for pihole container to offer :53 dns resolving on fixed ip for archivebox container
# network just used for pihole container to offer :53 dns resolving on fixed ip for archivebox container
dns:
ipam:
driver: default
config:
- subnet: 172.20.0.0/24
# To use remote storage for your ./data/archive (e.g. Amazon S3, Backblaze B2, Google Drive, OneDrive, SFTP, etc.)
# Follow the steps here to set up the Docker RClone Plugin https://rclone.org/docker/
# $ docker plugin install rclone/docker-volume-rclone:amd64 --grant-all-permissions --alias rclone
# $ nano /var/lib/docker-plugins/rclone/config/rclone.conf
# [examplegdrive]
# type = drive
# scope = drive
# drive_id = 1234567...
# root_folder_id = 0Abcd...
# token = {"access_token":...}
# volumes:
# archive:
# driver: rclone
# driver_opts:
# remote: 'examplegdrive:archivebox'
# allow_other: 'true'
# vfs_cache_mode: full
# poll_interval: 0

2
docs

@ -1 +1 @@
Subproject commit a1b69c51ba9b249c0b2a6efd141dbb792fc36ad2
Subproject commit f23abba9773b67ad9f2fd04d6f2e8e056dfa6521

480
package-lock.json generated
View file

@ -1,23 +1,33 @@
{
"name": "archivebox",
"version": "0.7.3",
"version": "0.8.0",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "archivebox",
"version": "0.7.3",
"version": "0.8.0",
"license": "MIT",
"dependencies": {
"@postlight/parser": "^2.2.3",
"readability-extractor": "github:ArchiveBox/readability-extractor",
"single-file-cli": "^1.1.46"
"single-file-cli": "^1.1.54"
}
},
"node_modules/@asamuzakjp/dom-selector": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/@asamuzakjp/dom-selector/-/dom-selector-2.0.2.tgz",
"integrity": "sha512-x1KXOatwofR6ZAYzXRBL5wrdV0vwNxlTCK9NCuLqAzQYARqGcvFwiJA6A1ERuh+dgeA4Dxm3JBYictIes+SqUQ==",
"dependencies": {
"bidi-js": "^1.0.3",
"css-tree": "^2.3.1",
"is-potential-custom-element-name": "^1.0.1"
}
},
"node_modules/@babel/runtime-corejs2": {
"version": "7.23.7",
"resolved": "https://registry.npmjs.org/@babel/runtime-corejs2/-/runtime-corejs2-7.23.7.tgz",
"integrity": "sha512-JmMk2t1zGDNkvsY2MsLLksocjY+ufGzSk8UlcNcxzfrzAPu4nMx0HRFakzIg2bhcqQq6xBI2nUaW/sHoaYIHdQ==",
"version": "7.24.5",
"resolved": "https://registry.npmjs.org/@babel/runtime-corejs2/-/runtime-corejs2-7.24.5.tgz",
"integrity": "sha512-cC9jiO6s/IN+xwCHYy1AGrcFJ4bwgIwb8HX1KaoEpRsznLlO4x9eBP6AX7RIeMSWlQqEj2WHox637OS8cDq6Ew==",
"dependencies": {
"core-js": "^2.6.12",
"regenerator-runtime": "^0.14.0"
@ -168,9 +178,9 @@
}
},
"node_modules/@puppeteer/browsers": {
"version": "1.8.0",
"resolved": "https://registry.npmjs.org/@puppeteer/browsers/-/browsers-1.8.0.tgz",
"integrity": "sha512-TkRHIV6k2D8OlUe8RtG+5jgOF/H98Myx0M6AOafC8DdNVOFiBSFa5cpRDtpm8LXOa9sVwe0+e6Q3FC56X/DZfg==",
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/@puppeteer/browsers/-/browsers-2.0.0.tgz",
"integrity": "sha512-3PS82/5+tnpEaUWonjAFFvlf35QHF15xqyGd34GBa5oP5EPVfFXRsbSxIGYf1M+vZlqBZ3oxT1kRg9OYhtt8ng==",
"dependencies": {
"debug": "4.3.4",
"extract-zip": "2.0.1",
@ -184,7 +194,7 @@
"browsers": "lib/cjs/main-cli.js"
},
"engines": {
"node": ">=16.3.0"
"node": ">=18"
}
},
"node_modules/@tootallnate/quickjs-emscripten": {
@ -193,9 +203,9 @@
"integrity": "sha512-C5Mc6rdnsaJDjO3UpGW/CQTHtCKaYlScZTly4JIu97Jxo/odCiH0ITnDXSJPTOrEKk/ycSZ0AOgTmkDtkOsvIA=="
},
"node_modules/@types/node": {
"version": "20.10.6",
"resolved": "https://registry.npmjs.org/@types/node/-/node-20.10.6.tgz",
"integrity": "sha512-Vac8H+NlRNNlAmDfGUP7b5h/KA+AtWIzuXy0E6OyP8f1tCLYAtPvKRRDJjAPqhpCb0t6U2j7/xqAuLEebW2kiw==",
"version": "20.12.11",
"resolved": "https://registry.npmjs.org/@types/node/-/node-20.12.11.tgz",
"integrity": "sha512-vDg9PZ/zi+Nqp6boSOT7plNuthRugEKixDv5sFTIpkE89MmNtEArAShI4mxuX2+UrLEe9pxC1vm2cjm9YlWbJw==",
"optional": true,
"dependencies": {
"undici-types": "~5.26.4"
@ -211,9 +221,9 @@
}
},
"node_modules/agent-base": {
"version": "7.1.0",
"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-7.1.0.tgz",
"integrity": "sha512-o/zjMZRhJxny7OyEF+Op8X+efiELC7k7yOjMzgfzVqOzXqkBkWI79YoTdOtsuWd5BWhAGAuOY/Xa6xpiaWXiNg==",
"version": "7.1.1",
"resolved": "https://registry.npmjs.org/agent-base/-/agent-base-7.1.1.tgz",
"integrity": "sha512-H0TSyFNDMomMNJQBn8wFV5YC/2eJ+VXECwOadZJT554xP6cODZHPX3H9QMQECxvrgiSOP1pHjy1sMWQVYJOUOA==",
"dependencies": {
"debug": "^4.3.4"
},
@ -304,14 +314,15 @@
"integrity": "sha512-NmWvPnx0F1SfrQbYwOi7OeaNGokp9XhzNioJ/CSBs8Qa4vxug81mhJEAVZwxXuBmYB5KDRfMq/F3RR0BIU7sWg=="
},
"node_modules/b4a": {
"version": "1.6.4",
"resolved": "https://registry.npmjs.org/b4a/-/b4a-1.6.4.tgz",
"integrity": "sha512-fpWrvyVHEKyeEvbKZTVOeZF3VSKKWtJxFIxX/jaVPf+cLbGUSitjb49pHLqPV2BUNNZ0LcoeEGfE/YCpyDYHIw=="
"version": "1.6.6",
"resolved": "https://registry.npmjs.org/b4a/-/b4a-1.6.6.tgz",
"integrity": "sha512-5Tk1HLk6b6ctmjIkAcU/Ujv/1WqiDl0F0JdRCR80VsOcUlHcu7pWeWRlOqQLHfDEsVx9YH/aif5AG4ehoCtTmg=="
},
"node_modules/balanced-match": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/balanced-match/-/balanced-match-1.0.2.tgz",
"integrity": "sha512-3oSeUO0TMV67hN1AmbXsK4yaqU7tjiHlbxRDZOpH0KW9+CeX4bRAaX0Anxt0tx2MrpRpWwQaPwIlISEJhYU5Pw=="
"node_modules/bare-events": {
"version": "2.2.2",
"resolved": "https://registry.npmjs.org/bare-events/-/bare-events-2.2.2.tgz",
"integrity": "sha512-h7z00dWdG0PYOQEvChhOSWvOfkIKsdZGkWr083FgN/HyoQuebSew/cgirYqh9SCuy/hRvxc5Vy6Fw8xAmYHLkQ==",
"optional": true
},
"node_modules/base64-js": {
"version": "1.5.1",
@ -333,9 +344,9 @@
]
},
"node_modules/basic-ftp": {
"version": "5.0.4",
"resolved": "https://registry.npmjs.org/basic-ftp/-/basic-ftp-5.0.4.tgz",
"integrity": "sha512-8PzkB0arJFV4jJWSGOYR+OEic6aeKMu/osRhBULN6RY0ykby6LKhbmuQ5ublvaas5BOwboah5D87nrHyuh8PPA==",
"version": "5.0.5",
"resolved": "https://registry.npmjs.org/basic-ftp/-/basic-ftp-5.0.5.tgz",
"integrity": "sha512-4Bcg1P8xhUuqcii/S0Z9wiHIrQVPMermM1any+MX5GeGD7faD3/msQUDGLol9wOcz4/jbg/WJnGqoJF6LiBdtg==",
"engines": {
"node": ">=10.0.0"
}
@ -348,6 +359,14 @@
"tweetnacl": "^0.14.3"
}
},
"node_modules/bidi-js": {
"version": "1.0.3",
"resolved": "https://registry.npmjs.org/bidi-js/-/bidi-js-1.0.3.tgz",
"integrity": "sha512-RKshQI1R3YQ+n9YJz2QQ147P66ELpa1FQEg20Dk8oW9t2KgLbpDLLp9aGZ7y8WHSshDknG0bknqGw5/tyCs5tw==",
"dependencies": {
"require-from-string": "^2.0.2"
}
},
"node_modules/bluebird": {
"version": "2.11.0",
"resolved": "https://registry.npmjs.org/bluebird/-/bluebird-2.11.0.tgz",
@ -358,15 +377,6 @@
"resolved": "https://registry.npmjs.org/boolbase/-/boolbase-1.0.0.tgz",
"integrity": "sha512-JZOSA7Mo9sNGB8+UjSgzdLtokWAky1zbztM3WRLCbZ70/3cTANmQmOdR7y2g+J0e2WXywy1yS468tY+IruqEww=="
},
"node_modules/brace-expansion": {
"version": "1.1.11",
"resolved": "https://registry.npmjs.org/brace-expansion/-/brace-expansion-1.1.11.tgz",
"integrity": "sha512-iCuPHDFgrHX7H2vEI/5xpz07zSHB00TpugqhmYtVmMO6518mCuRMoOYFldEBl0g187ufozdaHgWKcYFb61qGiA==",
"dependencies": {
"balanced-match": "^1.0.0",
"concat-map": "0.0.1"
}
},
"node_modules/brotli": {
"version": "1.3.3",
"resolved": "https://registry.npmjs.org/brotli/-/brotli-1.3.3.tgz",
@ -446,12 +456,12 @@
}
},
"node_modules/chromium-bidi": {
"version": "0.4.33",
"resolved": "https://registry.npmjs.org/chromium-bidi/-/chromium-bidi-0.4.33.tgz",
"integrity": "sha512-IxoFM5WGQOIAd95qrSXzJUv4eXIrh+RvU3rwwqIiwYuvfE7U/Llj4fejbsJnjJMUYCuGtVQsY2gv7oGl4aTNSQ==",
"version": "0.5.8",
"resolved": "https://registry.npmjs.org/chromium-bidi/-/chromium-bidi-0.5.8.tgz",
"integrity": "sha512-blqh+1cEQbHBKmok3rVJkBlBxt9beKBgOsxbFgs7UJcoVbbeZ+K7+6liAsjgpc8l1Xd55cQUy14fXZdGSb4zIw==",
"dependencies": {
"mitt": "3.0.1",
"urlpattern-polyfill": "9.0.0"
"urlpattern-polyfill": "10.0.0"
},
"peerDependencies": {
"devtools-protocol": "*"
@ -497,11 +507,6 @@
"node": ">= 0.8"
}
},
"node_modules/concat-map": {
"version": "0.0.1",
"resolved": "https://registry.npmjs.org/concat-map/-/concat-map-0.0.1.tgz",
"integrity": "sha512-/Srv4dswyQNBfohGpz9o6Yb3Gz3SrUDqBH5rTuhGR7ahtlbYKnVxw2bCFMRljaA7EXHaXZ8wsHdodFvbkhKmqg=="
},
"node_modules/core-js": {
"version": "2.6.12",
"resolved": "https://registry.npmjs.org/core-js/-/core-js-2.6.12.tgz",
@ -533,6 +538,18 @@
"nth-check": "~1.0.1"
}
},
"node_modules/css-tree": {
"version": "2.3.1",
"resolved": "https://registry.npmjs.org/css-tree/-/css-tree-2.3.1.tgz",
"integrity": "sha512-6Fv1DV/TYw//QF5IzQdqsNDjx/wc8TrMBZsqjL9eW01tWb7R7k/mq+/VXfJCl7SoD5emsJop9cOByJZfs8hYIw==",
"dependencies": {
"mdn-data": "2.0.30",
"source-map-js": "^1.0.1"
},
"engines": {
"node": "^10 || ^12.20.0 || ^14.13.0 || >=15.0.0"
}
},
"node_modules/css-what": {
"version": "2.1.3",
"resolved": "https://registry.npmjs.org/css-what/-/css-what-2.1.3.tgz",
@ -542,14 +559,14 @@
}
},
"node_modules/cssstyle": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/cssstyle/-/cssstyle-3.0.0.tgz",
"integrity": "sha512-N4u2ABATi3Qplzf0hWbVCdjenim8F3ojEXpBDF5hBpjzW182MjNGLqfmQ0SkSPeQ+V86ZXgeH8aXj6kayd4jgg==",
"version": "4.0.1",
"resolved": "https://registry.npmjs.org/cssstyle/-/cssstyle-4.0.1.tgz",
"integrity": "sha512-8ZYiJ3A/3OkDd093CBT/0UKDWry7ak4BdPTFP2+QEP7cmhouyq/Up709ASSj2cK02BbZiMgk7kYjZNS4QP5qrQ==",
"dependencies": {
"rrweb-cssom": "^0.6.0"
},
"engines": {
"node": ">=14"
"node": ">=18"
}
},
"node_modules/dashdash": {
@ -564,9 +581,9 @@
}
},
"node_modules/data-uri-to-buffer": {
"version": "6.0.1",
"resolved": "https://registry.npmjs.org/data-uri-to-buffer/-/data-uri-to-buffer-6.0.1.tgz",
"integrity": "sha512-MZd3VlchQkp8rdend6vrx7MmVDJzSNTBvghvKjirLkD+WTChA3KUf0jkE68Q4UyctNqI11zZO9/x2Yx+ub5Cvg==",
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/data-uri-to-buffer/-/data-uri-to-buffer-6.0.2.tgz",
"integrity": "sha512-7hvf7/GW8e86rW0ptuwS3OcBGDjIi6SZva7hCyWC0yYry2cOPmLIjXAUHI6DK2HsnwJd9ifmt57i8eV2n4YNpw==",
"engines": {
"node": ">= 14"
}
@ -657,9 +674,9 @@
}
},
"node_modules/devtools-protocol": {
"version": "0.0.1203626",
"resolved": "https://registry.npmjs.org/devtools-protocol/-/devtools-protocol-0.0.1203626.tgz",
"integrity": "sha512-nEzHZteIUZfGCZtTiS1fRpC8UZmsfD1SiyPvaUNvS13dvKf666OAm8YTi0+Ca3n1nLEyu49Cy4+dPWpaHFJk9g=="
"version": "0.0.1232444",
"resolved": "https://registry.npmjs.org/devtools-protocol/-/devtools-protocol-0.0.1232444.tgz",
"integrity": "sha512-pM27vqEfxSxRkTMnF+XCmxSEb6duO5R+t8A9DEEJgy4Wz2RVanje2mmj99B6A3zv2r/qGfYlOvYznUhuokizmg=="
},
"node_modules/difflib": {
"version": "0.2.6",
@ -696,9 +713,9 @@
"integrity": "sha512-3VdM/SXBZX2omc9JF9nOPCtDaYQ67BGp5CoLpIQlO2KCAPETs8TcDHacF26jXadGbvUteZzRTeos2fhID5+ucQ=="
},
"node_modules/dompurify": {
"version": "3.0.7",
"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.0.7.tgz",
"integrity": "sha512-BViYTZoqP3ak/ULKOc101y+CtHDUvBsVgSxIF1ku0HmK6BRf+C03MC+tArMvOPtVtZp83DDh5puywKDu4sbVjQ=="
"version": "3.1.3",
"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.1.3.tgz",
"integrity": "sha512-5sOWYSNPaxz6o2MUPvtyxTTqR4D3L77pr5rUQoWgD5ROQtVIZQgJkXbo1DLlK3vj11YGw5+LnF4SYti4gZmwng=="
},
"node_modules/domutils": {
"version": "1.5.1",
@ -726,6 +743,11 @@
"safer-buffer": "^2.1.0"
}
},
"node_modules/ecc-jsbn/node_modules/jsbn": {
"version": "0.1.1",
"resolved": "https://registry.npmjs.org/jsbn/-/jsbn-0.1.1.tgz",
"integrity": "sha512-UVU9dibq2JcFWxQPA6KCqj5O42VOmAY3zQUfEKxU0KpTGXwNoCjkX1e13eHNvw/xPynt6pU0rZ1htjWTNTSXsg=="
},
"node_modules/ellipsize": {
"version": "0.1.0",
"resolved": "https://registry.npmjs.org/ellipsize/-/ellipsize-0.1.0.tgz",
@ -750,9 +772,9 @@
"integrity": "sha512-f2LZMYl1Fzu7YSBKg+RoROelpOaNrcGmE9AZubeDfrCEia483oW4MI4VyFd5VNHIgQ/7qm1I0wUHK1eJnn2y2w=="
},
"node_modules/escalade": {
"version": "3.1.1",
"resolved": "https://registry.npmjs.org/escalade/-/escalade-3.1.1.tgz",
"integrity": "sha512-k0er2gUkLf8O0zKJiAhmkTnJlTvINGv7ygDNPbeIsX/TJjGJZHuh9B2UxbsaEkmlEo9MfhrSzmhIlhRlI2GXnw==",
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/escalade/-/escalade-3.1.2.tgz",
"integrity": "sha512-ErCHMCae19vR8vQGe50xIsVomy19rg6gFu3+r3jkEO46suLMWBksvVyoGgQV+jOfl84ZSOSlmv6Gxa89PmTGmA==",
"engines": {
"node": ">=6"
}
@ -890,31 +912,26 @@
}
},
"node_modules/fs-extra": {
"version": "8.1.0",
"resolved": "https://registry.npmjs.org/fs-extra/-/fs-extra-8.1.0.tgz",
"integrity": "sha512-yhlQgA6mnOJUKOsRUFsgJdQCvkKhcz8tlZG5HBQfReYZy46OwLcY+Zia0mtdHsOo9y/hP+CxMN0TU9QxoOtG4g==",
"version": "11.2.0",
"resolved": "https://registry.npmjs.org/fs-extra/-/fs-extra-11.2.0.tgz",
"integrity": "sha512-PmDi3uwK5nFuXh7XDTlVnS17xJS7vW36is2+w3xcv8SVxiB4NyATf4ctkVY5bkSjX0Y4nbvZCq1/EjtEyr9ktw==",
"dependencies": {
"graceful-fs": "^4.2.0",
"jsonfile": "^4.0.0",
"universalify": "^0.1.0"
"jsonfile": "^6.0.1",
"universalify": "^2.0.0"
},
"engines": {
"node": ">=6 <7 || >=8"
"node": ">=14.14"
}
},
"node_modules/fs-extra/node_modules/universalify": {
"version": "0.1.2",
"resolved": "https://registry.npmjs.org/universalify/-/universalify-0.1.2.tgz",
"integrity": "sha512-rBJeI5CXAlmy1pV+617WB9J63U6XcazHHF2f2dbJix4XzpUF0RS3Zbj0FGIOCAva5P/d/GBOYaACQ1w+0azUkg==",
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/universalify/-/universalify-2.0.1.tgz",
"integrity": "sha512-gptHNQghINnc/vTGIk0SOFGFNXw7JVrlRUtConJRlvaw6DuX0wO5Jeko9sWrMBhh+PsYAZ7oXAiOnf/UKogyiw==",
"engines": {
"node": ">= 4.0.0"
"node": ">= 10.0.0"
}
},
"node_modules/fs.realpath": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/fs.realpath/-/fs.realpath-1.0.0.tgz",
"integrity": "sha512-OO0pH2lK6a0hZnAdau5ItzHPI6pUlvI7jMVnxUQRtw4owF2wk8lOSabtGDCTP4Ggrg2MbGnWO9X8K1t4+fGMDw=="
},
"node_modules/get-caller-file": {
"version": "2.0.5",
"resolved": "https://registry.npmjs.org/get-caller-file/-/get-caller-file-2.0.5.tgz",
@ -938,14 +955,14 @@
}
},
"node_modules/get-uri": {
"version": "6.0.2",
"resolved": "https://registry.npmjs.org/get-uri/-/get-uri-6.0.2.tgz",
"integrity": "sha512-5KLucCJobh8vBY1K07EFV4+cPZH3mrV9YeAruUseCQKHB58SGjjT2l9/eA9LD082IiuMjSlFJEcdJ27TXvbZNw==",
"version": "6.0.3",
"resolved": "https://registry.npmjs.org/get-uri/-/get-uri-6.0.3.tgz",
"integrity": "sha512-BzUrJBS9EcUb4cFol8r4W3v1cPsSyajLSthNkz5BxbpDcHN5tIrM10E2eNvfnvBn3DaT3DUgx0OpsBKkaOpanw==",
"dependencies": {
"basic-ftp": "^5.0.2",
"data-uri-to-buffer": "^6.0.0",
"data-uri-to-buffer": "^6.0.2",
"debug": "^4.3.4",
"fs-extra": "^8.1.0"
"fs-extra": "^11.2.0"
},
"engines": {
"node": ">= 14"
@ -959,25 +976,6 @@
"assert-plus": "^1.0.0"
}
},
"node_modules/glob": {
"version": "7.2.3",
"resolved": "https://registry.npmjs.org/glob/-/glob-7.2.3.tgz",
"integrity": "sha512-nFR0zLpU2YCaRxwoCJvL6UvCH2JFyFVIvwTLsIf21AuHlMskA1hhTdk+LlYJtOlYt9v6dvszD2BGRqBL+iQK9Q==",
"dependencies": {
"fs.realpath": "^1.0.0",
"inflight": "^1.0.4",
"inherits": "2",
"minimatch": "^3.1.1",
"once": "^1.3.0",
"path-is-absolute": "^1.0.0"
},
"engines": {
"node": "*"
},
"funding": {
"url": "https://github.com/sponsors/isaacs"
}
},
"node_modules/graceful-fs": {
"version": "4.2.11",
"resolved": "https://registry.npmjs.org/graceful-fs/-/graceful-fs-4.2.11.tgz",
@ -1034,9 +1032,9 @@
}
},
"node_modules/http-proxy-agent": {
"version": "7.0.0",
"resolved": "https://registry.npmjs.org/http-proxy-agent/-/http-proxy-agent-7.0.0.tgz",
"integrity": "sha512-+ZT+iBxVUQ1asugqnD6oWoRiS25AkjNfG085dKJGtGxkdwLQrMKU5wJr2bOOFAXzKcTuqq+7fZlTMgG3SRfIYQ==",
"version": "7.0.2",
"resolved": "https://registry.npmjs.org/http-proxy-agent/-/http-proxy-agent-7.0.2.tgz",
"integrity": "sha512-T1gkAiYYDWYx3V5Bmyu7HcfcvL7mUrTWiM6yOfa3PIphViJ/gFPbvidQ+veqSOHci/PxBcDabeUNCzpOODJZig==",
"dependencies": {
"agent-base": "^7.1.0",
"debug": "^4.3.4"
@ -1059,9 +1057,9 @@
}
},
"node_modules/https-proxy-agent": {
"version": "7.0.2",
"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-7.0.2.tgz",
"integrity": "sha512-NmLNjm6ucYwtcUmL7JQC1ZQ57LmHP4lT15FQ8D61nak1rO6DH+fz5qNK2Ap5UN4ZapYICE3/0KodcLYSPsPbaA==",
"version": "7.0.4",
"resolved": "https://registry.npmjs.org/https-proxy-agent/-/https-proxy-agent-7.0.4.tgz",
"integrity": "sha512-wlwpilI7YdjSkWaQ/7omYBMTliDcmCN8OLihO6I9B86g06lMyAoqgoDpV0XqoaPOKj+0DIdAvnsWfyAAhmimcg==",
"dependencies": {
"agent-base": "^7.0.2",
"debug": "4"
@ -1105,24 +1103,22 @@
"resolved": "https://registry.npmjs.org/immediate/-/immediate-3.0.6.tgz",
"integrity": "sha512-XXOFtyqDjNDAQxVfYxuF7g9Il/IbWmmlQg2MYKOH8ExIT1qg6xc4zyS3HaEEATgs1btfzxq15ciUiY7gjSXRGQ=="
},
"node_modules/inflight": {
"version": "1.0.6",
"resolved": "https://registry.npmjs.org/inflight/-/inflight-1.0.6.tgz",
"integrity": "sha512-k92I/b08q4wvFscXCLvqfsHCrjrF7yiXsQuIVvVE7N82W3+aqpzuUdBbfhWcy/FZR3/4IgflMgKLOsvPDrGCJA==",
"dependencies": {
"once": "^1.3.0",
"wrappy": "1"
}
},
"node_modules/inherits": {
"version": "2.0.4",
"resolved": "https://registry.npmjs.org/inherits/-/inherits-2.0.4.tgz",
"integrity": "sha512-k/vGaX4/Yla3WzyMCvTQOXYeIHvqOKtnqBduzTHpzpQZzAskKMhZ2K+EnBiSM9zGSoIFeMpXKxa4dYeZIQqewQ=="
},
"node_modules/ip": {
"version": "1.1.8",
"resolved": "https://registry.npmjs.org/ip/-/ip-1.1.8.tgz",
"integrity": "sha512-PuExPYUiu6qMBQb4l06ecm6T6ujzhmh+MeJcW9wa89PoAz5pvd4zPgN5WJV104mb6S2T1AwNIAaB70JNrLQWhg=="
"node_modules/ip-address": {
"version": "9.0.5",
"resolved": "https://registry.npmjs.org/ip-address/-/ip-address-9.0.5.tgz",
"integrity": "sha512-zHtQzGojZXTwZTHQqra+ETKd4Sn3vgi7uBmlPoXVWZqYvuKmtI0l/VZTjqGmJY9x88GGOaZ9+G9ES8hC4T4X8g==",
"dependencies": {
"jsbn": "1.1.0",
"sprintf-js": "^1.1.3"
},
"engines": {
"node": ">= 12"
}
},
"node_modules/is-fullwidth-code-point": {
"version": "3.0.0",
@ -1153,16 +1149,17 @@
"integrity": "sha512-Yljz7ffyPbrLpLngrMtZ7NduUgVvi6wG9RJ9IUcyCd59YQ911PBJphODUcbOVbqYfxe1wuYf/LJ8PauMRwsM/g=="
},
"node_modules/jsbn": {
"version": "0.1.1",
"resolved": "https://registry.npmjs.org/jsbn/-/jsbn-0.1.1.tgz",
"integrity": "sha512-UVU9dibq2JcFWxQPA6KCqj5O42VOmAY3zQUfEKxU0KpTGXwNoCjkX1e13eHNvw/xPynt6pU0rZ1htjWTNTSXsg=="
"version": "1.1.0",
"resolved": "https://registry.npmjs.org/jsbn/-/jsbn-1.1.0.tgz",
"integrity": "sha512-4bYVV3aAMtDTTu4+xsDYa6sy9GyJ69/amsu9sYF2zqjiEoZA5xJi3BrfX3uY+/IekIu7MwdObdbDWpoZdBv3/A=="
},
"node_modules/jsdom": {
"version": "23.0.1",
"resolved": "https://registry.npmjs.org/jsdom/-/jsdom-23.0.1.tgz",
"integrity": "sha512-2i27vgvlUsGEBO9+/kJQRbtqtm+191b5zAZrU/UezVmnC2dlDAFLgDYJvAEi94T4kjsRKkezEtLQTgsNEsW2lQ==",
"version": "23.2.0",
"resolved": "https://registry.npmjs.org/jsdom/-/jsdom-23.2.0.tgz",
"integrity": "sha512-L88oL7D/8ufIES+Zjz7v0aes+oBMh2Xnh3ygWvL0OaICOomKEPKuPnIfBJekiXr+BHbbMjrWn/xqrDQuxFTeyA==",
"dependencies": {
"cssstyle": "^3.0.0",
"@asamuzakjp/dom-selector": "^2.0.1",
"cssstyle": "^4.0.1",
"data-urls": "^5.0.0",
"decimal.js": "^10.4.3",
"form-data": "^4.0.0",
@ -1170,7 +1167,6 @@
"http-proxy-agent": "^7.0.0",
"https-proxy-agent": "^7.0.2",
"is-potential-custom-element-name": "^1.0.1",
"nwsapi": "^2.2.7",
"parse5": "^7.1.2",
"rrweb-cssom": "^0.6.0",
"saxes": "^6.0.0",
@ -1181,7 +1177,7 @@
"whatwg-encoding": "^3.1.1",
"whatwg-mimetype": "^4.0.0",
"whatwg-url": "^14.0.0",
"ws": "^8.14.2",
"ws": "^8.16.0",
"xml-name-validator": "^5.0.0"
},
"engines": {
@ -1235,13 +1231,24 @@
"integrity": "sha512-ZClg6AaYvamvYEE82d3Iyd3vSSIjQ+odgjaTzRuO3s7toCdFKczob2i0zCh7JE8kWn17yvAWhUVxvqGwUalsRA=="
},
"node_modules/jsonfile": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/jsonfile/-/jsonfile-4.0.0.tgz",
"integrity": "sha512-m6F1R3z8jjlf2imQHS2Qez5sjKWQzbuuhuJ/FKYFRZvPE3PuHcSMVZzfsLhGVOkfd20obL5SWEBew5ShlquNxg==",
"version": "6.1.0",
"resolved": "https://registry.npmjs.org/jsonfile/-/jsonfile-6.1.0.tgz",
"integrity": "sha512-5dgndWOriYSm5cnYaJNhalLNDKOqFwyDB/rr1E9ZsGciGvKPs8R2xYGCacuf3z6K1YKDz182fd+fY3cn3pMqXQ==",
"dependencies": {
"universalify": "^2.0.0"
},
"optionalDependencies": {
"graceful-fs": "^4.1.6"
}
},
"node_modules/jsonfile/node_modules/universalify": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/universalify/-/universalify-2.0.1.tgz",
"integrity": "sha512-gptHNQghINnc/vTGIk0SOFGFNXw7JVrlRUtConJRlvaw6DuX0wO5Jeko9sWrMBhh+PsYAZ7oXAiOnf/UKogyiw==",
"engines": {
"node": ">= 10.0.0"
}
},
"node_modules/jsprim": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/jsprim/-/jsprim-2.0.2.tgz",
@ -1375,6 +1382,11 @@
"node": ">=12"
}
},
"node_modules/mdn-data": {
"version": "2.0.30",
"resolved": "https://registry.npmjs.org/mdn-data/-/mdn-data-2.0.30.tgz",
"integrity": "sha512-GaqWWShW4kv/G9IEucWScBx9G1/vsFZZJUO+tD26M8J8z3Kw5RDQjaoZe03YAClgeS/SWPOcb4nkFBTEi5DUEA=="
},
"node_modules/mime-db": {
"version": "1.52.0",
"resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz",
@ -1394,17 +1406,6 @@
"node": ">= 0.6"
}
},
"node_modules/minimatch": {
"version": "3.1.2",
"resolved": "https://registry.npmjs.org/minimatch/-/minimatch-3.1.2.tgz",
"integrity": "sha512-J7p63hRiAjw1NDEww1W7i37+ByIrOWO5XQQAzZ3VOcL0PNybwpfmV/N05zFAzwQ9USyEcX6t3UO+K5aqBQOIHw==",
"dependencies": {
"brace-expansion": "^1.1.7"
},
"engines": {
"node": "*"
}
},
"node_modules/mitt": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/mitt/-/mitt-3.0.1.tgz",
@ -1461,9 +1462,9 @@
}
},
"node_modules/nwsapi": {
"version": "2.2.7",
"resolved": "https://registry.npmjs.org/nwsapi/-/nwsapi-2.2.7.tgz",
"integrity": "sha512-ub5E4+FBPKwAZx0UwIQOjYWGHTEq5sPqHQNRN8Z9e4A7u3Tj1weLJsL59yH9vmvqEtBHaOmT6cYQKIZOxp35FQ=="
"version": "2.2.9",
"resolved": "https://registry.npmjs.org/nwsapi/-/nwsapi-2.2.9.tgz",
"integrity": "sha512-2f3F0SEEer8bBu0dsNCFF50N0cTThV1nWFYcEYFZttdW0lDAoybv9cQoK7X7/68Z89S7FoRrVjP1LPX4XRf9vg=="
},
"node_modules/oauth-sign": {
"version": "0.9.0",
@ -1500,12 +1501,11 @@
}
},
"node_modules/pac-resolver": {
"version": "7.0.0",
"resolved": "https://registry.npmjs.org/pac-resolver/-/pac-resolver-7.0.0.tgz",
"integrity": "sha512-Fd9lT9vJbHYRACT8OhCbZBbxr6KRSawSovFpy8nDGshaK99S/EBhVIHp9+crhxrsZOuvLpgL1n23iyPg6Rl2hg==",
"version": "7.0.1",
"resolved": "https://registry.npmjs.org/pac-resolver/-/pac-resolver-7.0.1.tgz",
"integrity": "sha512-5NPgf87AT2STgwa2ntRMr45jTKrYBGkVU36yT0ig/n/GMAa3oPqhZfIQ2kMEimReg0+t9kZViDVZ83qfVUlckg==",
"dependencies": {
"degenerator": "^5.0.0",
"ip": "^1.1.8",
"netmask": "^2.0.2"
},
"engines": {
@ -1539,14 +1539,6 @@
"url": "https://github.com/fb55/entities?sponsor=1"
}
},
"node_modules/path-is-absolute": {
"version": "1.0.1",
"resolved": "https://registry.npmjs.org/path-is-absolute/-/path-is-absolute-1.0.1.tgz",
"integrity": "sha512-AVbw3UJ2e9bq64vSaS9Am0fje1Pa8pbGqTTsmXfaIiMpnr5DlDhfJOuLj9Sf95ZPVDAUerDfEk88MPmPe7UCQg==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/pend": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/pend/-/pend-1.2.0.tgz",
@ -1648,25 +1640,25 @@
}
},
"node_modules/puppeteer-core": {
"version": "21.5.2",
"resolved": "https://registry.npmjs.org/puppeteer-core/-/puppeteer-core-21.5.2.tgz",
"integrity": "sha512-v4T0cWnujSKs+iEfmb8ccd7u4/x8oblEyKqplqKnJ582Kw8PewYAWvkH4qUWhitN3O2q9RF7dzkvjyK5HbzjLA==",
"version": "22.0.0",
"resolved": "https://registry.npmjs.org/puppeteer-core/-/puppeteer-core-22.0.0.tgz",
"integrity": "sha512-S3s91rLde0A86PWVeNY82h+P0fdS7CTiNWAicCVH/bIspRP4nS2PnO5j+VTFqCah0ZJizGzpVPAmxVYbLxTc9w==",
"dependencies": {
"@puppeteer/browsers": "1.8.0",
"chromium-bidi": "0.4.33",
"@puppeteer/browsers": "2.0.0",
"chromium-bidi": "0.5.8",
"cross-fetch": "4.0.0",
"debug": "4.3.4",
"devtools-protocol": "0.0.1203626",
"ws": "8.14.2"
"devtools-protocol": "0.0.1232444",
"ws": "8.16.0"
},
"engines": {
"node": ">=16.13.2"
"node": ">=18"
}
},
"node_modules/puppeteer-core/node_modules/ws": {
"version": "8.14.2",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.14.2.tgz",
"integrity": "sha512-wEBG1ftX4jcglPxgFCMJmZ2PLtSbJ2Peg6TmpJFTbe9GZYOQCDPdMYu/Tm0/bGZkw8paZnJY45J4K2PZrLYq8g==",
"version": "8.16.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.16.0.tgz",
"integrity": "sha512-HS0c//TP7Ina87TfiPUz1rQzMhHrl/SG2guqRcTOIUYD2q8uhUdNHZYJUaQ8aTGPzCh+c6oawMKW35nFl1dxyQ==",
"engines": {
"node": ">=10.0.0"
},
@ -1703,8 +1695,7 @@
},
"node_modules/readability-extractor": {
"version": "0.0.11",
"resolved": "git+ssh://git@github.com/ArchiveBox/readability-extractor.git#2fb4689a65c6433036453dcbee7a268840604eb9",
"license": "MIT",
"resolved": "git+ssh://git@github.com/ArchiveBox/readability-extractor.git#057f2046f9535cfc6df7b8d551aaad32a9e6226c",
"dependencies": {
"@mozilla/readability": "^0.5.0",
"dompurify": "^3.0.6",
@ -1740,25 +1731,19 @@
"node": ">=0.10.0"
}
},
"node_modules/require-from-string": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/require-from-string/-/require-from-string-2.0.2.tgz",
"integrity": "sha512-Xf0nWe6RseziFMu+Ap9biiUbmplq6S9/p+7w7YXP/JBHhrUDDUhwa+vANyubuqfZWTveU//DYVGsDG7RKL/vEw==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/requires-port": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/requires-port/-/requires-port-1.0.0.tgz",
"integrity": "sha512-KigOCHcocU3XODJxsu8i/j8T9tzT4adHiecwORRQ0ZZFcp7ahwXuRU1m+yuO90C5ZUyGeGfocHDI14M3L3yDAQ=="
},
"node_modules/rimraf": {
"version": "3.0.2",
"resolved": "https://registry.npmjs.org/rimraf/-/rimraf-3.0.2.tgz",
"integrity": "sha512-JZkJMZkAGFFPP2YqXZXPbMlMBgsxzE8ILs4lMIX/2o0L9UBw9O/Y3o6wFw/i9YLapcUJWwqbi3kdxIPdC62TIA==",
"dependencies": {
"glob": "^7.1.3"
},
"bin": {
"rimraf": "bin.js"
},
"funding": {
"url": "https://github.com/sponsors/isaacs"
}
},
"node_modules/rrweb-cssom": {
"version": "0.6.0",
"resolved": "https://registry.npmjs.org/rrweb-cssom/-/rrweb-cssom-0.6.0.tgz",
@ -1800,9 +1785,9 @@
}
},
"node_modules/selenium-webdriver": {
"version": "4.15.0",
"resolved": "https://registry.npmjs.org/selenium-webdriver/-/selenium-webdriver-4.15.0.tgz",
"integrity": "sha512-BNG1bq+KWiBGHcJ/wULi0eKY0yaDqFIbEmtbsYJmfaEghdCkXBsx1akgOorhNwjBipOr0uwpvNXqT6/nzl+zjg==",
"version": "4.17.0",
"resolved": "https://registry.npmjs.org/selenium-webdriver/-/selenium-webdriver-4.17.0.tgz",
"integrity": "sha512-e2E+2XBlGepzwgFbyQfSwo9Cbj6G5fFfs9MzAS00nC99EewmcS2rwn2MwtgfP7I5p1e7DYv4HQJXtWedsu6DvA==",
"dependencies": {
"jszip": "^3.10.1",
"tmp": "^0.2.1",
@ -1818,16 +1803,16 @@
"integrity": "sha512-MATJdZp8sLqDl/68LfQmbP8zKPLQNV6BIZoIgrscFDQ+RsvK/BxeDQOgyxKKoh0y/8h3BqVFnCqQ/gd+reiIXA=="
},
"node_modules/single-file-cli": {
"version": "1.1.46",
"resolved": "https://registry.npmjs.org/single-file-cli/-/single-file-cli-1.1.46.tgz",
"integrity": "sha512-+vFj0a5Y4ESqpMwH0T6738pg8ZA9KVhhl6OlIOsicamGNU9DnMa+q9dL1S2KnLWHoauKjU0BThhR/YKUleJSxw==",
"version": "1.1.54",
"resolved": "https://registry.npmjs.org/single-file-cli/-/single-file-cli-1.1.54.tgz",
"integrity": "sha512-wnVPg7BklhswwFVrtuFXbmluI4piHxg2dC0xATxYTeXAld6PnRPlnp7ufallRKArjFBZdP2u+ihMkOIp7A38XA==",
"dependencies": {
"file-url": "3.0.0",
"iconv-lite": "0.6.3",
"jsdom": "23.0.0",
"puppeteer-core": "21.5.2",
"selenium-webdriver": "4.15.0",
"single-file-core": "1.3.15",
"jsdom": "24.0.0",
"puppeteer-core": "22.0.0",
"selenium-webdriver": "4.17.0",
"single-file-core": "1.3.24",
"strong-data-uri": "1.0.6",
"yargs": "17.7.2"
},
@ -1847,11 +1832,11 @@
}
},
"node_modules/single-file-cli/node_modules/jsdom": {
"version": "23.0.0",
"resolved": "https://registry.npmjs.org/jsdom/-/jsdom-23.0.0.tgz",
"integrity": "sha512-cbL/UCtohJguhFC7c2/hgW6BeZCNvP7URQGnx9tSJRYKCdnfbfWOrtuLTMfiB2VxKsx5wPHVsh/J0aBy9lIIhQ==",
"version": "24.0.0",
"resolved": "https://registry.npmjs.org/jsdom/-/jsdom-24.0.0.tgz",
"integrity": "sha512-UDS2NayCvmXSXVP6mpTj+73JnNQadZlr9N68189xib2tx5Mls7swlTNao26IoHv46BZJFvXygyRtyXd1feAk1A==",
"dependencies": {
"cssstyle": "^3.0.0",
"cssstyle": "^4.0.1",
"data-urls": "^5.0.0",
"decimal.js": "^10.4.3",
"form-data": "^4.0.0",
@ -1870,14 +1855,14 @@
"whatwg-encoding": "^3.1.1",
"whatwg-mimetype": "^4.0.0",
"whatwg-url": "^14.0.0",
"ws": "^8.14.2",
"ws": "^8.16.0",
"xml-name-validator": "^5.0.0"
},
"engines": {
"node": ">=18"
},
"peerDependencies": {
"canvas": "^3.0.0"
"canvas": "^2.11.2"
},
"peerDependenciesMeta": {
"canvas": {
@ -1909,9 +1894,9 @@
}
},
"node_modules/single-file-core": {
"version": "1.3.15",
"resolved": "https://registry.npmjs.org/single-file-core/-/single-file-core-1.3.15.tgz",
"integrity": "sha512-/YNpHBwASWNxmSmZXz0xRolmXf0+PGAbwpVrwn6A8tYeuAdezxxde5RYTTQ7V4Zv68+H4JMhE2DwCRV0sVUGNA=="
"version": "1.3.24",
"resolved": "https://registry.npmjs.org/single-file-core/-/single-file-core-1.3.24.tgz",
"integrity": "sha512-1B256mKBbNV8jXAV+hRyEv0aMa7tn0C0Ci+zx7Ya4ZXZB3b9/1MgKsB/fxVwDiL28WJSU0pxzh8ftIYubCNn9w=="
},
"node_modules/smart-buffer": {
"version": "4.2.0",
@ -1923,24 +1908,24 @@
}
},
"node_modules/socks": {
"version": "2.7.1",
"resolved": "https://registry.npmjs.org/socks/-/socks-2.7.1.tgz",
"integrity": "sha512-7maUZy1N7uo6+WVEX6psASxtNlKaNVMlGQKkG/63nEDdLOWNbiUMoLK7X4uYoLhQstau72mLgfEWcXcwsaHbYQ==",
"version": "2.8.3",
"resolved": "https://registry.npmjs.org/socks/-/socks-2.8.3.tgz",
"integrity": "sha512-l5x7VUUWbjVFbafGLxPWkYsHIhEvmF85tbIeFZWc8ZPtoMyybuEhL7Jye/ooC4/d48FgOjSJXgsF/AJPYCW8Zw==",
"dependencies": {
"ip": "^2.0.0",
"ip-address": "^9.0.5",
"smart-buffer": "^4.2.0"
},
"engines": {
"node": ">= 10.13.0",
"node": ">= 10.0.0",
"npm": ">= 3.0.0"
}
},
"node_modules/socks-proxy-agent": {
"version": "8.0.2",
"resolved": "https://registry.npmjs.org/socks-proxy-agent/-/socks-proxy-agent-8.0.2.tgz",
"integrity": "sha512-8zuqoLv1aP/66PHF5TqwJ7Czm3Yv32urJQHrVyhD7mmA6d61Zv8cIXQYPTWwmg6qlupnPvs/QKDmfa4P/qct2g==",
"version": "8.0.3",
"resolved": "https://registry.npmjs.org/socks-proxy-agent/-/socks-proxy-agent-8.0.3.tgz",
"integrity": "sha512-VNegTZKhuGq5vSD6XNKlbqWhyt/40CgoEw8XxD6dhnm8Jq9IEa3nIa4HwnM8XOqU0CdB0BwWVXusqiFXfHB3+A==",
"dependencies": {
"agent-base": "^7.0.2",
"agent-base": "^7.1.1",
"debug": "^4.3.4",
"socks": "^2.7.1"
},
@ -1948,11 +1933,6 @@
"node": ">= 14"
}
},
"node_modules/socks/node_modules/ip": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/ip/-/ip-2.0.0.tgz",
"integrity": "sha512-WKa+XuLG1A1R0UWhl2+1XQSi+fZWMsYKffMZTTYsiZaUD8k2yDAj5atimTUD2TZkyCkNEeYE5NhFZmupOGtjYQ=="
},
"node_modules/source-map": {
"version": "0.6.1",
"resolved": "https://registry.npmjs.org/source-map/-/source-map-0.6.1.tgz",
@ -1962,6 +1942,19 @@
"node": ">=0.10.0"
}
},
"node_modules/source-map-js": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/source-map-js/-/source-map-js-1.2.0.tgz",
"integrity": "sha512-itJW8lvSA0TXEphiRoawsCksnlf8SyvmFzIhltqAHluXd88pkCd+cXJVHTDwdCr0IzwptSm035IHQktUu1QUMg==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/sprintf-js": {
"version": "1.1.3",
"resolved": "https://registry.npmjs.org/sprintf-js/-/sprintf-js-1.1.3.tgz",
"integrity": "sha512-Oo+0REFV59/rz3gfJNKQiBlwfHaSESl1pcGyABQsnnIfWOFt6JNj5gCog2U6MLZ//IGYD+nA8nI+mTShREReaA=="
},
"node_modules/sshpk": {
"version": "1.18.0",
"resolved": "https://registry.npmjs.org/sshpk/-/sshpk-1.18.0.tgz",
@ -1986,6 +1979,11 @@
"node": ">=0.10.0"
}
},
"node_modules/sshpk/node_modules/jsbn": {
"version": "0.1.1",
"resolved": "https://registry.npmjs.org/jsbn/-/jsbn-0.1.1.tgz",
"integrity": "sha512-UVU9dibq2JcFWxQPA6KCqj5O42VOmAY3zQUfEKxU0KpTGXwNoCjkX1e13eHNvw/xPynt6pU0rZ1htjWTNTSXsg=="
},
"node_modules/stream-length": {
"version": "1.0.2",
"resolved": "https://registry.npmjs.org/stream-length/-/stream-length-1.0.2.tgz",
@ -1995,12 +1993,15 @@
}
},
"node_modules/streamx": {
"version": "2.15.6",
"resolved": "https://registry.npmjs.org/streamx/-/streamx-2.15.6.tgz",
"integrity": "sha512-q+vQL4AAz+FdfT137VF69Cc/APqUbxy+MDOImRrMvchJpigHj9GksgDU2LYbO9rx7RX6osWgxJB2WxhYv4SZAw==",
"version": "2.16.1",
"resolved": "https://registry.npmjs.org/streamx/-/streamx-2.16.1.tgz",
"integrity": "sha512-m9QYj6WygWyWa3H1YY69amr4nVgy61xfjys7xO7kviL5rfIEc2naf+ewFiOA+aEJD7y0JO3h2GoiUv4TDwEGzQ==",
"dependencies": {
"fast-fifo": "^1.1.0",
"queue-tick": "^1.0.1"
},
"optionalDependencies": {
"bare-events": "^2.2.0"
}
},
"node_modules/string_decoder": {
@ -2067,9 +2068,9 @@
}
},
"node_modules/tar-stream": {
"version": "3.1.6",
"resolved": "https://registry.npmjs.org/tar-stream/-/tar-stream-3.1.6.tgz",
"integrity": "sha512-B/UyjYwPpMBv+PaFSWAmtYjwdrlEaZQEhMIBFNC5oEG8lpiW8XjcSdmEaClj28ArfKScKHs2nshz3k2le6crsg==",
"version": "3.1.7",
"resolved": "https://registry.npmjs.org/tar-stream/-/tar-stream-3.1.7.tgz",
"integrity": "sha512-qJj60CXt7IU1Ffyc3NJMjh6EkuCFej46zUqJ4J7pqYlThyd9bO0XBTmcOIhSzZJVWfsLks0+nle/j538YAW9RQ==",
"dependencies": {
"b4a": "^1.6.4",
"fast-fifo": "^1.2.0",
@ -2082,20 +2083,17 @@
"integrity": "sha512-w89qg7PI8wAdvX60bMDP+bFoD5Dvhm9oLheFp5O4a2QF0cSBGsBX4qZmadPMvVqlLJBBci+WqGGOAPvcDeNSVg=="
},
"node_modules/tmp": {
"version": "0.2.1",
"resolved": "https://registry.npmjs.org/tmp/-/tmp-0.2.1.tgz",
"integrity": "sha512-76SUhtfqR2Ijn+xllcI5P1oyannHNHByD80W1q447gU3mp9G9PSpGdWmjUOHRDPiHYacIk66W7ubDTuPF3BEtQ==",
"dependencies": {
"rimraf": "^3.0.0"
},
"version": "0.2.3",
"resolved": "https://registry.npmjs.org/tmp/-/tmp-0.2.3.tgz",
"integrity": "sha512-nZD7m9iCPC5g0pYmcaxogYKggSfLsdxl8of3Q/oIbqCqLLIO9IAF0GWjX1z9NZRHPiXv8Wex4yDCaZsgEw0Y8w==",
"engines": {
"node": ">=8.17.0"
"node": ">=14.14"
}
},
"node_modules/tough-cookie": {
"version": "4.1.3",
"resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-4.1.3.tgz",
"integrity": "sha512-aX/y5pVRkfRnfmuX+OdbSdXvPe6ieKX/G2s7e98f4poJHnqH3281gDPm/metm6E/WRamfx7WC4HUqkWHfQHprw==",
"version": "4.1.4",
"resolved": "https://registry.npmjs.org/tough-cookie/-/tough-cookie-4.1.4.tgz",
"integrity": "sha512-Loo5UUvLD9ScZ6jh8beX1T6sO1w2/MpCRpEP7V280GKMVUQ0Jzar2U3UJPsrdbziLEMMhu3Ujnq//rhiFuIeag==",
"dependencies": {
"psl": "^1.1.33",
"punycode": "^2.1.1",
@ -2125,9 +2123,9 @@
"integrity": "sha512-AEYxH93jGFPn/a2iVAwW87VuUIkR1FVUKB77NwMF7nBTDkDrrT/Hpt/IrCJ0QXhW27jTBDcf5ZY7w6RiqTMw2Q=="
},
"node_modules/turndown": {
"version": "7.1.2",
"resolved": "https://registry.npmjs.org/turndown/-/turndown-7.1.2.tgz",
"integrity": "sha512-ntI9R7fcUKjqBP6QU8rBK2Ehyt8LAzt3UBT9JR9tgo6GtuKvyUzpayWmeMKJw1DPdXzktvtIT8m2mVXz+bL/Qg==",
"version": "7.1.3",
"resolved": "https://registry.npmjs.org/turndown/-/turndown-7.1.3.tgz",
"integrity": "sha512-Z3/iJ6IWh8VBiACWQJaA5ulPQE5E1QwvBHj00uGzdQxdRnd8fh1DPqNOJqzQDu6DkOstORrtXzf/9adB+vMtEA==",
"dependencies": {
"domino": "^2.1.6"
}
@ -2178,9 +2176,9 @@
}
},
"node_modules/urlpattern-polyfill": {
"version": "9.0.0",
"resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-9.0.0.tgz",
"integrity": "sha512-WHN8KDQblxd32odxeIgo83rdVDE2bvdkb86it7bMhYZwWKJz0+O0RK/eZiHYnM+zgt/U7hAHOlCQGfjjvSkw2g=="
"version": "10.0.0",
"resolved": "https://registry.npmjs.org/urlpattern-polyfill/-/urlpattern-polyfill-10.0.0.tgz",
"integrity": "sha512-H/A06tKD7sS1O1X2SshBVeA5FLycRpjqiBeqGKmBwBDBy28EnRjORxTNe269KSSr5un5qyWi1iL61wLxpd+ZOg=="
},
"node_modules/util-deprecate": {
"version": "1.0.2",
@ -2298,9 +2296,9 @@
"integrity": "sha512-l4Sp/DRseor9wL6EvV2+TuQn63dMkPjZ/sp9XkghTEbV9KlPS1xUsZ3u7/IQO4wxtcFB4bgpQPRcR3QCvezPcQ=="
},
"node_modules/ws": {
"version": "8.16.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.16.0.tgz",
"integrity": "sha512-HS0c//TP7Ina87TfiPUz1rQzMhHrl/SG2guqRcTOIUYD2q8uhUdNHZYJUaQ8aTGPzCh+c6oawMKW35nFl1dxyQ==",
"version": "8.17.0",
"resolved": "https://registry.npmjs.org/ws/-/ws-8.17.0.tgz",
"integrity": "sha512-uJq6108EgZMAl20KagGkzCKfMEjxmKvZHG7Tlq0Z6nOky7YF7aq4mOx6xK8TJ/i1LeK4Qus7INktacctDgY8Ow==",
"engines": {
"node": ">=10.0.0"
},

View file

@ -1,6 +1,6 @@
{
"name": "archivebox",
"version": "0.7.3",
"version": "0.8.0",
"description": "ArchiveBox: The self-hosted internet archive",
"author": "Nick Sweeting <archivebox-npm@sweeting.me>",
"repository": "github:ArchiveBox/ArchiveBox",

853
pdm.lock
View file

@ -1,853 +0,0 @@
# This file is @generated by PDM.
# It is not intended for manual editing.
[metadata]
groups = ["default", "ldap", "sonic"]
strategy = ["cross_platform"]
lock_version = "4.4.1"
content_hash = "sha256:4ba1c25daa30a36c5b3ffdb563d5024c2ab15042758f4fbc3f375dedb35d1bdf"
[[package]]
name = "asgiref"
version = "3.7.2"
requires_python = ">=3.7"
summary = "ASGI specs, helper code, and adapters"
dependencies = [
"typing-extensions>=4; python_version < \"3.11\"",
]
files = [
{file = "asgiref-3.7.2-py3-none-any.whl", hash = "sha256:89b2ef2247e3b562a16eef663bc0e2e703ec6468e2fa8a5cd61cd449786d4f6e"},
{file = "asgiref-3.7.2.tar.gz", hash = "sha256:9e0ce3aa93a819ba5b45120216b23878cf6e8525eb3848653452b4192b92afed"},
]
[[package]]
name = "asttokens"
version = "2.4.1"
summary = "Annotate AST trees with source code positions"
dependencies = [
"six>=1.12.0",
]
files = [
{file = "asttokens-2.4.1-py2.py3-none-any.whl", hash = "sha256:051ed49c3dcae8913ea7cd08e46a606dba30b79993209636c4875bc1d637bc24"},
{file = "asttokens-2.4.1.tar.gz", hash = "sha256:b03869718ba9a6eb027e134bfdf69f38a236d681c83c160d510768af11254ba0"},
]
[[package]]
name = "brotli"
version = "1.1.0"
summary = "Python bindings for the Brotli compression library"
files = [
{file = "Brotli-1.1.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:e1140c64812cb9b06c922e77f1c26a75ec5e3f0fb2bf92cc8c58720dec276752"},
{file = "Brotli-1.1.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:c8fd5270e906eef71d4a8d19b7c6a43760c6abcfcc10c9101d14eb2357418de9"},
{file = "Brotli-1.1.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1ae56aca0402a0f9a3431cddda62ad71666ca9d4dc3a10a142b9dce2e3c0cda3"},
{file = "Brotli-1.1.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:43ce1b9935bfa1ede40028054d7f48b5469cd02733a365eec8a329ffd342915d"},
{file = "Brotli-1.1.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_12_i686.manylinux2010_i686.whl", hash = "sha256:7c4855522edb2e6ae7fdb58e07c3ba9111e7621a8956f481c68d5d979c93032e"},
{file = "Brotli-1.1.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:38025d9f30cf4634f8309c6874ef871b841eb3c347e90b0851f63d1ded5212da"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:e6a904cb26bfefc2f0a6f240bdf5233be78cd2488900a2f846f3c3ac8489ab80"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:a37b8f0391212d29b3a91a799c8e4a2855e0576911cdfb2515487e30e322253d"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_ppc64le.whl", hash = "sha256:e84799f09591700a4154154cab9787452925578841a94321d5ee8fb9a9a328f0"},
{file = "Brotli-1.1.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:f66b5337fa213f1da0d9000bc8dc0cb5b896b726eefd9c6046f699b169c41b9e"},
{file = "Brotli-1.1.0-cp310-cp310-win32.whl", hash = "sha256:be36e3d172dc816333f33520154d708a2657ea63762ec16b62ece02ab5e4daf2"},
{file = "Brotli-1.1.0-cp310-cp310-win_amd64.whl", hash = "sha256:0c6244521dda65ea562d5a69b9a26120769b7a9fb3db2fe9545935ed6735b128"},
{file = "Brotli-1.1.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:a3daabb76a78f829cafc365531c972016e4aa8d5b4bf60660ad8ecee19df7ccc"},
{file = "Brotli-1.1.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:c8146669223164fc87a7e3de9f81e9423c67a79d6b3447994dfb9c95da16e2d6"},
{file = "Brotli-1.1.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:30924eb4c57903d5a7526b08ef4a584acc22ab1ffa085faceb521521d2de32dd"},
{file = "Brotli-1.1.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ceb64bbc6eac5a140ca649003756940f8d6a7c444a68af170b3187623b43bebf"},
{file = "Brotli-1.1.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a469274ad18dc0e4d316eefa616d1d0c2ff9da369af19fa6f3daa4f09671fd61"},
{file = "Brotli-1.1.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:524f35912131cc2cabb00edfd8d573b07f2d9f21fa824bd3fb19725a9cf06327"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:5b3cc074004d968722f51e550b41a27be656ec48f8afaeeb45ebf65b561481dd"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:19c116e796420b0cee3da1ccec3b764ed2952ccfcc298b55a10e5610ad7885f9"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_ppc64le.whl", hash = "sha256:510b5b1bfbe20e1a7b3baf5fed9e9451873559a976c1a78eebaa3b86c57b4265"},
{file = "Brotli-1.1.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:a1fd8a29719ccce974d523580987b7f8229aeace506952fa9ce1d53a033873c8"},
{file = "Brotli-1.1.0-cp311-cp311-win32.whl", hash = "sha256:39da8adedf6942d76dc3e46653e52df937a3c4d6d18fdc94a7c29d263b1f5b50"},
{file = "Brotli-1.1.0-cp311-cp311-win_amd64.whl", hash = "sha256:aac0411d20e345dc0920bdec5548e438e999ff68d77564d5e9463a7ca9d3e7b1"},
{file = "Brotli-1.1.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:5fb2ce4b8045c78ebbc7b8f3c15062e435d47e7393cc57c25115cfd49883747a"},
{file = "Brotli-1.1.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:7905193081db9bfa73b1219140b3d315831cbff0d8941f22da695832f0dd188f"},
{file = "Brotli-1.1.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a77def80806c421b4b0af06f45d65a136e7ac0bdca3c09d9e2ea4e515367c7e9"},
{file = "Brotli-1.1.0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:8dadd1314583ec0bf2d1379f7008ad627cd6336625d6679cf2f8e67081b83acf"},
{file = "Brotli-1.1.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.whl", hash = "sha256:901032ff242d479a0efa956d853d16875d42157f98951c0230f69e69f9c09bac"},
{file = "Brotli-1.1.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.whl", hash = "sha256:22fc2a8549ffe699bfba2256ab2ed0421a7b8fadff114a3d201794e45a9ff578"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:ae15b066e5ad21366600ebec29a7ccbc86812ed267e4b28e860b8ca16a2bc474"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:949f3b7c29912693cee0afcf09acd6ebc04c57af949d9bf77d6101ebb61e388c"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_ppc64le.whl", hash = "sha256:89f4988c7203739d48c6f806f1e87a1d96e0806d44f0fba61dba81392c9e474d"},
{file = "Brotli-1.1.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:de6551e370ef19f8de1807d0a9aa2cdfdce2e85ce88b122fe9f6b2b076837e59"},
{file = "Brotli-1.1.0-cp39-cp39-win32.whl", hash = "sha256:f0d8a7a6b5983c2496e364b969f0e526647a06b075d034f3297dc66f3b360c64"},
{file = "Brotli-1.1.0-cp39-cp39-win_amd64.whl", hash = "sha256:cdad5b9014d83ca68c25d2e9444e28e967ef16e80f6b436918c700c117a85467"},
{file = "Brotli-1.1.0.tar.gz", hash = "sha256:81de08ac11bcb85841e440c13611c00b67d3bf82698314928d0b676362546724"},
]
[[package]]
name = "brotlicffi"
version = "1.1.0.0"
requires_python = ">=3.7"
summary = "Python CFFI bindings to the Brotli library"
dependencies = [
"cffi>=1.0.0",
]
files = [
{file = "brotlicffi-1.1.0.0-cp37-abi3-macosx_10_9_x86_64.whl", hash = "sha256:9b7ae6bd1a3f0df532b6d67ff674099a96d22bc0948955cb338488c31bfb8851"},
{file = "brotlicffi-1.1.0.0-cp37-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:19ffc919fa4fc6ace69286e0a23b3789b4219058313cf9b45625016bf7ff996b"},
{file = "brotlicffi-1.1.0.0-cp37-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:9feb210d932ffe7798ee62e6145d3a757eb6233aa9a4e7db78dd3690d7755814"},
{file = "brotlicffi-1.1.0.0-cp37-abi3-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:84763dbdef5dd5c24b75597a77e1b30c66604725707565188ba54bab4f114820"},
{file = "brotlicffi-1.1.0.0-cp37-abi3-win32.whl", hash = "sha256:1b12b50e07c3911e1efa3a8971543e7648100713d4e0971b13631cce22c587eb"},
{file = "brotlicffi-1.1.0.0-cp37-abi3-win_amd64.whl", hash = "sha256:994a4f0681bb6c6c3b0925530a1926b7a189d878e6e5e38fae8efa47c5d9c613"},
{file = "brotlicffi-1.1.0.0-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:2e4aeb0bd2540cb91b069dbdd54d458da8c4334ceaf2d25df2f4af576d6766ca"},
{file = "brotlicffi-1.1.0.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:4b7b0033b0d37bb33009fb2fef73310e432e76f688af76c156b3594389d81391"},
{file = "brotlicffi-1.1.0.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:54a07bb2374a1eba8ebb52b6fafffa2afd3c4df85ddd38fcc0511f2bb387c2a8"},
{file = "brotlicffi-1.1.0.0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7901a7dc4b88f1c1475de59ae9be59799db1007b7d059817948d8e4f12e24e35"},
{file = "brotlicffi-1.1.0.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:ce01c7316aebc7fce59da734286148b1d1b9455f89cf2c8a4dfce7d41db55c2d"},
{file = "brotlicffi-1.1.0.0-pp37-pypy37_pp73-macosx_10_9_x86_64.whl", hash = "sha256:246f1d1a90279bb6069de3de8d75a8856e073b8ff0b09dcca18ccc14cec85979"},
{file = "brotlicffi-1.1.0.0-pp37-pypy37_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cc4bc5d82bc56ebd8b514fb8350cfac4627d6b0743382e46d033976a5f80fab6"},
{file = "brotlicffi-1.1.0.0-pp37-pypy37_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:37c26ecb14386a44b118ce36e546ce307f4810bc9598a6e6cb4f7fca725ae7e6"},
{file = "brotlicffi-1.1.0.0-pp37-pypy37_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ca72968ae4eaf6470498d5c2887073f7efe3b1e7d7ec8be11a06a79cc810e990"},
{file = "brotlicffi-1.1.0.0-pp37-pypy37_pp73-win_amd64.whl", hash = "sha256:add0de5b9ad9e9aa293c3aa4e9deb2b61e99ad6c1634e01d01d98c03e6a354cc"},
{file = "brotlicffi-1.1.0.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:9b6068e0f3769992d6b622a1cd2e7835eae3cf8d9da123d7f51ca9c1e9c333e5"},
{file = "brotlicffi-1.1.0.0-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8557a8559509b61e65083f8782329188a250102372576093c88930c875a69838"},
{file = "brotlicffi-1.1.0.0-pp38-pypy38_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:2a7ae37e5d79c5bdfb5b4b99f2715a6035e6c5bf538c3746abc8e26694f92f33"},
{file = "brotlicffi-1.1.0.0-pp38-pypy38_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:391151ec86bb1c683835980f4816272a87eaddc46bb91cbf44f62228b84d8cca"},
{file = "brotlicffi-1.1.0.0-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:2f3711be9290f0453de8eed5275d93d286abe26b08ab4a35d7452caa1fef532f"},
{file = "brotlicffi-1.1.0.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:1a807d760763e398bbf2c6394ae9da5815901aa93ee0a37bca5efe78d4ee3171"},
{file = "brotlicffi-1.1.0.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:fa8ca0623b26c94fccc3a1fdd895be1743b838f3917300506d04aa3346fd2a14"},
{file = "brotlicffi-1.1.0.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3de0cf28a53a3238b252aca9fed1593e9d36c1d116748013339f0949bfc84112"},
{file = "brotlicffi-1.1.0.0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6be5ec0e88a4925c91f3dea2bb0013b3a2accda6f77238f76a34a1ea532a1cb0"},
{file = "brotlicffi-1.1.0.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:d9eb71bb1085d996244439154387266fd23d6ad37161f6f52f1cd41dd95a3808"},
{file = "brotlicffi-1.1.0.0.tar.gz", hash = "sha256:b77827a689905143f87915310b93b273ab17888fd43ef350d4832c4a71083c13"},
]
[[package]]
name = "certifi"
version = "2023.11.17"
requires_python = ">=3.6"
summary = "Python package for providing Mozilla's CA Bundle."
files = [
{file = "certifi-2023.11.17-py3-none-any.whl", hash = "sha256:e036ab49d5b79556f99cfc2d9320b34cfbe5be05c5871b51de9329f0603b0474"},
{file = "certifi-2023.11.17.tar.gz", hash = "sha256:9b469f3a900bf28dc19b8cfbf8019bf47f7fdd1a65a1d4ffb98fc14166beb4d1"},
]
[[package]]
name = "cffi"
version = "1.16.0"
requires_python = ">=3.8"
summary = "Foreign Function Interface for Python calling C code."
dependencies = [
"pycparser",
]
files = [
{file = "cffi-1.16.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:6b3d6606d369fc1da4fd8c357d026317fbb9c9b75d36dc16e90e84c26854b088"},
{file = "cffi-1.16.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:ac0f5edd2360eea2f1daa9e26a41db02dd4b0451b48f7c318e217ee092a213e9"},
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:7e61e3e4fa664a8588aa25c883eab612a188c725755afff6289454d6362b9673"},
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:a72e8961a86d19bdb45851d8f1f08b041ea37d2bd8d4fd19903bc3083d80c896"},
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:5b50bf3f55561dac5438f8e70bfcdfd74543fd60df5fa5f62d94e5867deca684"},
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:7651c50c8c5ef7bdb41108b7b8c5a83013bfaa8a935590c5d74627c047a583c7"},
{file = "cffi-1.16.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e4108df7fe9b707191e55f33efbcb2d81928e10cea45527879a4749cbe472614"},
{file = "cffi-1.16.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:32c68ef735dbe5857c810328cb2481e24722a59a2003018885514d4c09af9743"},
{file = "cffi-1.16.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:673739cb539f8cdaa07d92d02efa93c9ccf87e345b9a0b556e3ecc666718468d"},
{file = "cffi-1.16.0-cp310-cp310-win32.whl", hash = "sha256:9f90389693731ff1f659e55c7d1640e2ec43ff725cc61b04b2f9c6d8d017df6a"},
{file = "cffi-1.16.0-cp310-cp310-win_amd64.whl", hash = "sha256:e6024675e67af929088fda399b2094574609396b1decb609c55fa58b028a32a1"},
{file = "cffi-1.16.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:b84834d0cf97e7d27dd5b7f3aca7b6e9263c56308ab9dc8aae9784abb774d404"},
{file = "cffi-1.16.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:1b8ebc27c014c59692bb2664c7d13ce7a6e9a629be20e54e7271fa696ff2b417"},
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ee07e47c12890ef248766a6e55bd38ebfb2bb8edd4142d56db91b21ea68b7627"},
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:d8a9d3ebe49f084ad71f9269834ceccbf398253c9fac910c4fd7053ff1386936"},
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e70f54f1796669ef691ca07d046cd81a29cb4deb1e5f942003f401c0c4a2695d"},
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:5bf44d66cdf9e893637896c7faa22298baebcd18d1ddb6d2626a6e39793a1d56"},
{file = "cffi-1.16.0-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:7b78010e7b97fef4bee1e896df8a4bbb6712b7f05b7ef630f9d1da00f6444d2e"},
{file = "cffi-1.16.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:c6a164aa47843fb1b01e941d385aab7215563bb8816d80ff3a363a9f8448a8dc"},
{file = "cffi-1.16.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:e09f3ff613345df5e8c3667da1d918f9149bd623cd9070c983c013792a9a62eb"},
{file = "cffi-1.16.0-cp311-cp311-win32.whl", hash = "sha256:2c56b361916f390cd758a57f2e16233eb4f64bcbeee88a4881ea90fca14dc6ab"},
{file = "cffi-1.16.0-cp311-cp311-win_amd64.whl", hash = "sha256:db8e577c19c0fda0beb7e0d4e09e0ba74b1e4c092e0e40bfa12fe05b6f6d75ba"},
{file = "cffi-1.16.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:582215a0e9adbe0e379761260553ba11c58943e4bbe9c36430c4ca6ac74b15ed"},
{file = "cffi-1.16.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:b29ebffcf550f9da55bec9e02ad430c992a87e5f512cd63388abb76f1036d8d2"},
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_12_i686.manylinux2010_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:dc9b18bf40cc75f66f40a7379f6a9513244fe33c0e8aa72e2d56b0196a7ef872"},
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:9cb4a35b3642fc5c005a6755a5d17c6c8b6bcb6981baf81cea8bfbc8903e8ba8"},
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:b86851a328eedc692acf81fb05444bdf1891747c25af7529e39ddafaf68a4f3f"},
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:c0f31130ebc2d37cdd8e44605fb5fa7ad59049298b3f745c74fa74c62fbfcfc4"},
{file = "cffi-1.16.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8f8e709127c6c77446a8c0a8c8bf3c8ee706a06cd44b1e827c3e6a2ee6b8c098"},
{file = "cffi-1.16.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:748dcd1e3d3d7cd5443ef03ce8685043294ad6bd7c02a38d1bd367cfd968e000"},
{file = "cffi-1.16.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:8895613bcc094d4a1b2dbe179d88d7fb4a15cee43c052e8885783fac397d91fe"},
{file = "cffi-1.16.0-cp39-cp39-win32.whl", hash = "sha256:ed86a35631f7bfbb28e108dd96773b9d5a6ce4811cf6ea468bb6a359b256b1e4"},
{file = "cffi-1.16.0-cp39-cp39-win_amd64.whl", hash = "sha256:3686dffb02459559c74dd3d81748269ffb0eb027c39a6fc99502de37d501faa8"},
{file = "cffi-1.16.0.tar.gz", hash = "sha256:bcb3ef43e58665bbda2fb198698fcae6776483e0c4a631aa5647806c25e02cc0"},
]
[[package]]
name = "charset-normalizer"
version = "3.3.2"
requires_python = ">=3.7.0"
summary = "The Real First Universal Charset Detector. Open, modern and actively maintained alternative to Chardet."
files = [
{file = "charset-normalizer-3.3.2.tar.gz", hash = "sha256:f30c3cb33b24454a82faecaf01b19c18562b1e89558fb6c56de4d9118a032fd5"},
{file = "charset_normalizer-3.3.2-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:25baf083bf6f6b341f4121c2f3c548875ee6f5339300e08be3f2b2ba1721cdd3"},
{file = "charset_normalizer-3.3.2-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:06435b539f889b1f6f4ac1758871aae42dc3a8c0e24ac9e60c2384973ad73027"},
{file = "charset_normalizer-3.3.2-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:9063e24fdb1e498ab71cb7419e24622516c4a04476b17a2dab57e8baa30d6e03"},
{file = "charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:6897af51655e3691ff853668779c7bad41579facacf5fd7253b0133308cf000d"},
{file = "charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1d3193f4a680c64b4b6a9115943538edb896edc190f0b222e73761716519268e"},
{file = "charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:cd70574b12bb8a4d2aaa0094515df2463cb429d8536cfb6c7ce983246983e5a6"},
{file = "charset_normalizer-3.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:8465322196c8b4d7ab6d1e049e4c5cb460d0394da4a27d23cc242fbf0034b6b5"},
{file = "charset_normalizer-3.3.2-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:a9a8e9031d613fd2009c182b69c7b2c1ef8239a0efb1df3f7c8da66d5dd3d537"},
{file = "charset_normalizer-3.3.2-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:beb58fe5cdb101e3a055192ac291b7a21e3b7ef4f67fa1d74e331a7f2124341c"},
{file = "charset_normalizer-3.3.2-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:e06ed3eb3218bc64786f7db41917d4e686cc4856944f53d5bdf83a6884432e12"},
{file = "charset_normalizer-3.3.2-cp310-cp310-musllinux_1_1_ppc64le.whl", hash = "sha256:2e81c7b9c8979ce92ed306c249d46894776a909505d8f5a4ba55b14206e3222f"},
{file = "charset_normalizer-3.3.2-cp310-cp310-musllinux_1_1_s390x.whl", hash = "sha256:572c3763a264ba47b3cf708a44ce965d98555f618ca42c926a9c1616d8f34269"},
{file = "charset_normalizer-3.3.2-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:fd1abc0d89e30cc4e02e4064dc67fcc51bd941eb395c502aac3ec19fab46b519"},
{file = "charset_normalizer-3.3.2-cp310-cp310-win32.whl", hash = "sha256:3d47fa203a7bd9c5b6cee4736ee84ca03b8ef23193c0d1ca99b5089f72645c73"},
{file = "charset_normalizer-3.3.2-cp310-cp310-win_amd64.whl", hash = "sha256:10955842570876604d404661fbccbc9c7e684caf432c09c715ec38fbae45ae09"},
{file = "charset_normalizer-3.3.2-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:802fe99cca7457642125a8a88a084cef28ff0cf9407060f7b93dca5aa25480db"},
{file = "charset_normalizer-3.3.2-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:573f6eac48f4769d667c4442081b1794f52919e7edada77495aaed9236d13a96"},
{file = "charset_normalizer-3.3.2-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:549a3a73da901d5bc3ce8d24e0600d1fa85524c10287f6004fbab87672bf3e1e"},
{file = "charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f27273b60488abe721a075bcca6d7f3964f9f6f067c8c4c605743023d7d3944f"},
{file = "charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:1ceae2f17a9c33cb48e3263960dc5fc8005351ee19db217e9b1bb15d28c02574"},
{file = "charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:65f6f63034100ead094b8744b3b97965785388f308a64cf8d7c34f2f2e5be0c4"},
{file = "charset_normalizer-3.3.2-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:753f10e867343b4511128c6ed8c82f7bec3bd026875576dfd88483c5c73b2fd8"},
{file = "charset_normalizer-3.3.2-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:4a78b2b446bd7c934f5dcedc588903fb2f5eec172f3d29e52a9096a43722adfc"},
{file = "charset_normalizer-3.3.2-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:e537484df0d8f426ce2afb2d0f8e1c3d0b114b83f8850e5f2fbea0e797bd82ae"},
{file = "charset_normalizer-3.3.2-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:eb6904c354526e758fda7167b33005998fb68c46fbc10e013ca97f21ca5c8887"},
{file = "charset_normalizer-3.3.2-cp311-cp311-musllinux_1_1_ppc64le.whl", hash = "sha256:deb6be0ac38ece9ba87dea880e438f25ca3eddfac8b002a2ec3d9183a454e8ae"},
{file = "charset_normalizer-3.3.2-cp311-cp311-musllinux_1_1_s390x.whl", hash = "sha256:4ab2fe47fae9e0f9dee8c04187ce5d09f48eabe611be8259444906793ab7cbce"},
{file = "charset_normalizer-3.3.2-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:80402cd6ee291dcb72644d6eac93785fe2c8b9cb30893c1af5b8fdd753b9d40f"},
{file = "charset_normalizer-3.3.2-cp311-cp311-win32.whl", hash = "sha256:7cd13a2e3ddeed6913a65e66e94b51d80a041145a026c27e6bb76c31a853c6ab"},
{file = "charset_normalizer-3.3.2-cp311-cp311-win_amd64.whl", hash = "sha256:663946639d296df6a2bb2aa51b60a2454ca1cb29835324c640dafb5ff2131a77"},
{file = "charset_normalizer-3.3.2-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:c235ebd9baae02f1b77bcea61bce332cb4331dc3617d254df3323aa01ab47bd4"},
{file = "charset_normalizer-3.3.2-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:5b4c145409bef602a690e7cfad0a15a55c13320ff7a3ad7ca59c13bb8ba4d45d"},
{file = "charset_normalizer-3.3.2-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:68d1f8a9e9e37c1223b656399be5d6b448dea850bed7d0f87a8311f1ff3dabb0"},
{file = "charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:22afcb9f253dac0696b5a4be4a1c0f8762f8239e21b99680099abd9b2b1b2269"},
{file = "charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:e27ad930a842b4c5eb8ac0016b0a54f5aebbe679340c26101df33424142c143c"},
{file = "charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:1f79682fbe303db92bc2b1136016a38a42e835d932bab5b3b1bfcfbf0640e519"},
{file = "charset_normalizer-3.3.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:b261ccdec7821281dade748d088bb6e9b69e6d15b30652b74cbbac25e280b796"},
{file = "charset_normalizer-3.3.2-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:122c7fa62b130ed55f8f285bfd56d5f4b4a5b503609d181f9ad85e55c89f4185"},
{file = "charset_normalizer-3.3.2-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:d0eccceffcb53201b5bfebb52600a5fb483a20b61da9dbc885f8b103cbe7598c"},
{file = "charset_normalizer-3.3.2-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:9f96df6923e21816da7e0ad3fd47dd8f94b2a5ce594e00677c0013018b813458"},
{file = "charset_normalizer-3.3.2-cp39-cp39-musllinux_1_1_ppc64le.whl", hash = "sha256:7f04c839ed0b6b98b1a7501a002144b76c18fb1c1850c8b98d458ac269e26ed2"},
{file = "charset_normalizer-3.3.2-cp39-cp39-musllinux_1_1_s390x.whl", hash = "sha256:34d1c8da1e78d2e001f363791c98a272bb734000fcef47a491c1e3b0505657a8"},
{file = "charset_normalizer-3.3.2-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:ff8fa367d09b717b2a17a052544193ad76cd49979c805768879cb63d9ca50561"},
{file = "charset_normalizer-3.3.2-cp39-cp39-win32.whl", hash = "sha256:aed38f6e4fb3f5d6bf81bfa990a07806be9d83cf7bacef998ab1a9bd660a581f"},
{file = "charset_normalizer-3.3.2-cp39-cp39-win_amd64.whl", hash = "sha256:b01b88d45a6fcb69667cd6d2f7a9aeb4bf53760d7fc536bf679ec94fe9f3ff3d"},
{file = "charset_normalizer-3.3.2-py3-none-any.whl", hash = "sha256:3e4d1f6587322d2788836a99c69062fbb091331ec940e02d12d179c1d53e25fc"},
]
[[package]]
name = "colorama"
version = "0.4.6"
requires_python = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,!=3.6.*,>=2.7"
summary = "Cross-platform colored terminal text."
files = [
{file = "colorama-0.4.6-py2.py3-none-any.whl", hash = "sha256:4f1d9991f5acc0ca119f9d443620b77f9d6b33703e51011c16baf57afb285fc6"},
{file = "colorama-0.4.6.tar.gz", hash = "sha256:08695f5cb7ed6e0531a20572697297273c47b8cae5a63ffc6d6ed5c201be6e44"},
]
[[package]]
name = "croniter"
version = "2.0.1"
requires_python = ">=2.6, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
summary = "croniter provides iteration for datetime object with cron like format"
dependencies = [
"python-dateutil",
"pytz>2021.1",
]
files = [
{file = "croniter-2.0.1-py2.py3-none-any.whl", hash = "sha256:4cb064ce2d8f695b3b078be36ff50115cf8ac306c10a7e8653ee2a5b534673d7"},
{file = "croniter-2.0.1.tar.gz", hash = "sha256:d199b2ec3ea5e82988d1f72022433c5f9302b3b3ea9e6bfd6a1518f6ea5e700a"},
]
[[package]]
name = "dateparser"
version = "1.2.0"
requires_python = ">=3.7"
summary = "Date parsing library designed to parse dates from HTML pages"
dependencies = [
"python-dateutil",
"pytz",
"regex!=2019.02.19,!=2021.8.27",
"tzlocal",
]
files = [
{file = "dateparser-1.2.0-py2.py3-none-any.whl", hash = "sha256:0b21ad96534e562920a0083e97fd45fa959882d4162acc358705144520a35830"},
{file = "dateparser-1.2.0.tar.gz", hash = "sha256:7975b43a4222283e0ae15be7b4999d08c9a70e2d378ac87385b1ccf2cffbbb30"},
]
[[package]]
name = "decorator"
version = "5.1.1"
requires_python = ">=3.5"
summary = "Decorators for Humans"
files = [
{file = "decorator-5.1.1-py3-none-any.whl", hash = "sha256:b8c3f85900b9dc423225913c5aace94729fe1fa9763b38939a95226f02d37186"},
{file = "decorator-5.1.1.tar.gz", hash = "sha256:637996211036b6385ef91435e4fae22989472f9d571faba8927ba8253acbc330"},
]
[[package]]
name = "django"
version = "3.1.14"
requires_python = ">=3.6"
summary = "A high-level Python Web framework that encourages rapid development and clean, pragmatic design."
dependencies = [
"asgiref<4,>=3.2.10",
"pytz",
"sqlparse>=0.2.2",
]
files = [
{file = "Django-3.1.14-py3-none-any.whl", hash = "sha256:0fabc786489af16ad87a8c170ba9d42bfd23f7b699bd5ef05675864e8d012859"},
{file = "Django-3.1.14.tar.gz", hash = "sha256:72a4a5a136a214c39cf016ccdd6b69e2aa08c7479c66d93f3a9b5e4bb9d8a347"},
]
[[package]]
name = "django-auth-ldap"
version = "4.1.0"
requires_python = ">=3.7"
summary = "Django LDAP authentication backend."
dependencies = [
"Django>=2.2",
"python-ldap>=3.1",
]
files = [
{file = "django-auth-ldap-4.1.0.tar.gz", hash = "sha256:77f749d3b17807ce8eb56a9c9c8e5746ff316567f81d5ba613495d9c7495a949"},
{file = "django_auth_ldap-4.1.0-py3-none-any.whl", hash = "sha256:68870e7921e84b1a9867e268a9c8a3e573e8a0d95ea08bcf31be178f5826ff36"},
]
[[package]]
name = "django-extensions"
version = "3.1.5"
requires_python = ">=3.6"
summary = "Extensions for Django"
dependencies = [
"Django>=2.2",
]
files = [
{file = "django-extensions-3.1.5.tar.gz", hash = "sha256:28e1e1bf49f0e00307ba574d645b0af3564c981a6dfc87209d48cb98f77d0b1a"},
{file = "django_extensions-3.1.5-py3-none-any.whl", hash = "sha256:9238b9e016bb0009d621e05cf56ea8ce5cce9b32e91ad2026996a7377ca28069"},
]
[[package]]
name = "exceptiongroup"
version = "1.2.0"
requires_python = ">=3.7"
summary = "Backport of PEP 654 (exception groups)"
files = [
{file = "exceptiongroup-1.2.0-py3-none-any.whl", hash = "sha256:4bfd3996ac73b41e9b9628b04e079f193850720ea5945fc96a08633c66912f14"},
{file = "exceptiongroup-1.2.0.tar.gz", hash = "sha256:91f5c769735f051a4290d52edd0858999b57e5876e9f85937691bd4c9fa3ed68"},
]
[[package]]
name = "executing"
version = "2.0.1"
requires_python = ">=3.5"
summary = "Get the currently executing AST node of a frame, and other information"
files = [
{file = "executing-2.0.1-py2.py3-none-any.whl", hash = "sha256:eac49ca94516ccc753f9fb5ce82603156e590b27525a8bc32cce8ae302eb61bc"},
{file = "executing-2.0.1.tar.gz", hash = "sha256:35afe2ce3affba8ee97f2d69927fa823b08b472b7b994e36a52a964b93d16147"},
]
[[package]]
name = "idna"
version = "3.6"
requires_python = ">=3.5"
summary = "Internationalized Domain Names in Applications (IDNA)"
files = [
{file = "idna-3.6-py3-none-any.whl", hash = "sha256:c05567e9c24a6b9faaa835c4821bad0590fbb9d5779e7caa6e1cc4978e7eb24f"},
{file = "idna-3.6.tar.gz", hash = "sha256:9ecdbbd083b06798ae1e86adcbfe8ab1479cf864e4ee30fe4e46a003d12491ca"},
]
[[package]]
name = "ipython"
version = "8.18.1"
requires_python = ">=3.9"
summary = "IPython: Productive Interactive Computing"
dependencies = [
"colorama; sys_platform == \"win32\"",
"decorator",
"exceptiongroup; python_version < \"3.11\"",
"jedi>=0.16",
"matplotlib-inline",
"pexpect>4.3; sys_platform != \"win32\"",
"prompt-toolkit<3.1.0,>=3.0.41",
"pygments>=2.4.0",
"stack-data",
"traitlets>=5",
"typing-extensions; python_version < \"3.10\"",
]
files = [
{file = "ipython-8.18.1-py3-none-any.whl", hash = "sha256:e8267419d72d81955ec1177f8a29aaa90ac80ad647499201119e2f05e99aa397"},
{file = "ipython-8.18.1.tar.gz", hash = "sha256:ca6f079bb33457c66e233e4580ebfc4128855b4cf6370dddd73842a9563e8a27"},
]
[[package]]
name = "jedi"
version = "0.19.1"
requires_python = ">=3.6"
summary = "An autocompletion tool for Python that can be used for text editors."
dependencies = [
"parso<0.9.0,>=0.8.3",
]
files = [
{file = "jedi-0.19.1-py2.py3-none-any.whl", hash = "sha256:e983c654fe5c02867aef4cdfce5a2fbb4a50adc0af145f70504238f18ef5e7e0"},
{file = "jedi-0.19.1.tar.gz", hash = "sha256:cf0496f3651bc65d7174ac1b7d043eff454892c708a87d1b683e57b569927ffd"},
]
[[package]]
name = "matplotlib-inline"
version = "0.1.6"
requires_python = ">=3.5"
summary = "Inline Matplotlib backend for Jupyter"
dependencies = [
"traitlets",
]
files = [
{file = "matplotlib-inline-0.1.6.tar.gz", hash = "sha256:f887e5f10ba98e8d2b150ddcf4702c1e5f8b3a20005eb0f74bfdbd360ee6f304"},
{file = "matplotlib_inline-0.1.6-py3-none-any.whl", hash = "sha256:f1f41aab5328aa5aaea9b16d083b128102f8712542f819fe7e6a420ff581b311"},
]
[[package]]
name = "mutagen"
version = "1.47.0"
requires_python = ">=3.7"
summary = "read and write audio tags for many formats"
files = [
{file = "mutagen-1.47.0-py3-none-any.whl", hash = "sha256:edd96f50c5907a9539d8e5bba7245f62c9f520aef333d13392a79a4f70aca719"},
{file = "mutagen-1.47.0.tar.gz", hash = "sha256:719fadef0a978c31b4cf3c956261b3c58b6948b32023078a2117b1de09f0fc99"},
]
[[package]]
name = "mypy-extensions"
version = "1.0.0"
requires_python = ">=3.5"
summary = "Type system extensions for programs checked with the mypy type checker."
files = [
{file = "mypy_extensions-1.0.0-py3-none-any.whl", hash = "sha256:4392f6c0eb8a5668a69e23d168ffa70f0be9ccfd32b5cc2d26a34ae5b844552d"},
{file = "mypy_extensions-1.0.0.tar.gz", hash = "sha256:75dbf8955dc00442a438fc4d0666508a9a97b6bd41aa2f0ffe9d2f2725af0782"},
]
[[package]]
name = "parso"
version = "0.8.3"
requires_python = ">=3.6"
summary = "A Python Parser"
files = [
{file = "parso-0.8.3-py2.py3-none-any.whl", hash = "sha256:c001d4636cd3aecdaf33cbb40aebb59b094be2a74c556778ef5576c175e19e75"},
{file = "parso-0.8.3.tar.gz", hash = "sha256:8c07be290bb59f03588915921e29e8a50002acaf2cdc5fa0e0114f91709fafa0"},
]
[[package]]
name = "pexpect"
version = "4.9.0"
summary = "Pexpect allows easy control of interactive console applications."
dependencies = [
"ptyprocess>=0.5",
]
files = [
{file = "pexpect-4.9.0-py2.py3-none-any.whl", hash = "sha256:7236d1e080e4936be2dc3e326cec0af72acf9212a7e1d060210e70a47e253523"},
{file = "pexpect-4.9.0.tar.gz", hash = "sha256:ee7d41123f3c9911050ea2c2dac107568dc43b2d3b0c7557a33212c398ead30f"},
]
[[package]]
name = "prompt-toolkit"
version = "3.0.43"
requires_python = ">=3.7.0"
summary = "Library for building powerful interactive command lines in Python"
dependencies = [
"wcwidth",
]
files = [
{file = "prompt_toolkit-3.0.43-py3-none-any.whl", hash = "sha256:a11a29cb3bf0a28a387fe5122cdb649816a957cd9261dcedf8c9f1fef33eacf6"},
{file = "prompt_toolkit-3.0.43.tar.gz", hash = "sha256:3527b7af26106cbc65a040bcc84839a3566ec1b051bb0bfe953631e704b0ff7d"},
]
[[package]]
name = "ptyprocess"
version = "0.7.0"
summary = "Run a subprocess in a pseudo terminal"
files = [
{file = "ptyprocess-0.7.0-py2.py3-none-any.whl", hash = "sha256:4b41f3967fce3af57cc7e94b888626c18bf37a083e3651ca8feeb66d492fef35"},
{file = "ptyprocess-0.7.0.tar.gz", hash = "sha256:5c5d0a3b48ceee0b48485e0c26037c0acd7d29765ca3fbb5cb3831d347423220"},
]
[[package]]
name = "pure-eval"
version = "0.2.2"
summary = "Safely evaluate AST nodes without side effects"
files = [
{file = "pure_eval-0.2.2-py3-none-any.whl", hash = "sha256:01eaab343580944bc56080ebe0a674b39ec44a945e6d09ba7db3cb8cec289350"},
{file = "pure_eval-0.2.2.tar.gz", hash = "sha256:2b45320af6dfaa1750f543d714b6d1c520a1688dec6fd24d339063ce0aaa9ac3"},
]
[[package]]
name = "pyasn1"
version = "0.5.1"
requires_python = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,>=2.7"
summary = "Pure-Python implementation of ASN.1 types and DER/BER/CER codecs (X.208)"
files = [
{file = "pyasn1-0.5.1-py2.py3-none-any.whl", hash = "sha256:4439847c58d40b1d0a573d07e3856e95333f1976294494c325775aeca506eb58"},
{file = "pyasn1-0.5.1.tar.gz", hash = "sha256:6d391a96e59b23130a5cfa74d6fd7f388dbbe26cc8f1edf39fdddf08d9d6676c"},
]
[[package]]
name = "pyasn1-modules"
version = "0.3.0"
requires_python = "!=3.0.*,!=3.1.*,!=3.2.*,!=3.3.*,!=3.4.*,!=3.5.*,>=2.7"
summary = "A collection of ASN.1-based protocols modules"
dependencies = [
"pyasn1<0.6.0,>=0.4.6",
]
files = [
{file = "pyasn1_modules-0.3.0-py2.py3-none-any.whl", hash = "sha256:d3ccd6ed470d9ffbc716be08bd90efbd44d0734bc9303818f7336070984a162d"},
{file = "pyasn1_modules-0.3.0.tar.gz", hash = "sha256:5bd01446b736eb9d31512a30d46c1ac3395d676c6f3cafa4c03eb54b9925631c"},
]
[[package]]
name = "pycparser"
version = "2.21"
requires_python = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"
summary = "C parser in Python"
files = [
{file = "pycparser-2.21-py2.py3-none-any.whl", hash = "sha256:8ee45429555515e1f6b185e78100aea234072576aa43ab53aefcae078162fca9"},
{file = "pycparser-2.21.tar.gz", hash = "sha256:e644fdec12f7872f86c58ff790da456218b10f863970249516d60a5eaca77206"},
]
[[package]]
name = "pycryptodomex"
version = "3.20.0"
requires_python = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"
summary = "Cryptographic library for Python"
files = [
{file = "pycryptodomex-3.20.0-cp35-abi3-macosx_10_9_universal2.whl", hash = "sha256:59af01efb011b0e8b686ba7758d59cf4a8263f9ad35911bfe3f416cee4f5c08c"},
{file = "pycryptodomex-3.20.0-cp35-abi3-macosx_10_9_x86_64.whl", hash = "sha256:82ee7696ed8eb9a82c7037f32ba9b7c59e51dda6f105b39f043b6ef293989cb3"},
{file = "pycryptodomex-3.20.0-cp35-abi3-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:91852d4480a4537d169c29a9d104dda44094c78f1f5b67bca76c29a91042b623"},
{file = "pycryptodomex-3.20.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:bca649483d5ed251d06daf25957f802e44e6bb6df2e8f218ae71968ff8f8edc4"},
{file = "pycryptodomex-3.20.0-cp35-abi3-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:6e186342cfcc3aafaad565cbd496060e5a614b441cacc3995ef0091115c1f6c5"},
{file = "pycryptodomex-3.20.0-cp35-abi3-musllinux_1_1_aarch64.whl", hash = "sha256:25cd61e846aaab76d5791d006497134602a9e451e954833018161befc3b5b9ed"},
{file = "pycryptodomex-3.20.0-cp35-abi3-musllinux_1_1_i686.whl", hash = "sha256:9c682436c359b5ada67e882fec34689726a09c461efd75b6ea77b2403d5665b7"},
{file = "pycryptodomex-3.20.0-cp35-abi3-musllinux_1_1_x86_64.whl", hash = "sha256:7a7a8f33a1f1fb762ede6cc9cbab8f2a9ba13b196bfaf7bc6f0b39d2ba315a43"},
{file = "pycryptodomex-3.20.0-cp35-abi3-win32.whl", hash = "sha256:c39778fd0548d78917b61f03c1fa8bfda6cfcf98c767decf360945fe6f97461e"},
{file = "pycryptodomex-3.20.0-cp35-abi3-win_amd64.whl", hash = "sha256:2a47bcc478741b71273b917232f521fd5704ab4b25d301669879e7273d3586cc"},
{file = "pycryptodomex-3.20.0-pp27-pypy_73-manylinux2010_x86_64.whl", hash = "sha256:1be97461c439a6af4fe1cf8bf6ca5936d3db252737d2f379cc6b2e394e12a458"},
{file = "pycryptodomex-3.20.0-pp27-pypy_73-win32.whl", hash = "sha256:19764605feea0df966445d46533729b645033f134baeb3ea26ad518c9fdf212c"},
{file = "pycryptodomex-3.20.0-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:f2e497413560e03421484189a6b65e33fe800d3bd75590e6d78d4dfdb7accf3b"},
{file = "pycryptodomex-3.20.0-pp310-pypy310_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:e48217c7901edd95f9f097feaa0388da215ed14ce2ece803d3f300b4e694abea"},
{file = "pycryptodomex-3.20.0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:d00fe8596e1cc46b44bf3907354e9377aa030ec4cd04afbbf6e899fc1e2a7781"},
{file = "pycryptodomex-3.20.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:88afd7a3af7ddddd42c2deda43d53d3dfc016c11327d0915f90ca34ebda91499"},
{file = "pycryptodomex-3.20.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:d3584623e68a5064a04748fb6d76117a21a7cb5eaba20608a41c7d0c61721794"},
{file = "pycryptodomex-3.20.0-pp39-pypy39_pp73-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:0daad007b685db36d977f9de73f61f8da2a7104e20aca3effd30752fd56f73e1"},
{file = "pycryptodomex-3.20.0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5dcac11031a71348faaed1f403a0debd56bf5404232284cf8c761ff918886ebc"},
{file = "pycryptodomex-3.20.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:69138068268127cd605e03438312d8f271135a33140e2742b417d027a0539427"},
{file = "pycryptodomex-3.20.0.tar.gz", hash = "sha256:7a710b79baddd65b806402e14766c721aee8fb83381769c27920f26476276c1e"},
]
[[package]]
name = "pygments"
version = "2.17.2"
requires_python = ">=3.7"
summary = "Pygments is a syntax highlighting package written in Python."
files = [
{file = "pygments-2.17.2-py3-none-any.whl", hash = "sha256:b27c2826c47d0f3219f29554824c30c5e8945175d888647acd804ddd04af846c"},
{file = "pygments-2.17.2.tar.gz", hash = "sha256:da46cec9fd2de5be3a8a784f434e4c4ab670b4ff54d605c4c2717e9d49c4c367"},
]
[[package]]
name = "python-crontab"
version = "3.0.0"
summary = "Python Crontab API"
dependencies = [
"python-dateutil",
]
files = [
{file = "python-crontab-3.0.0.tar.gz", hash = "sha256:79fb7465039ddfd4fb93d072d6ee0d45c1ac8bf1597f0686ea14fd4361dba379"},
{file = "python_crontab-3.0.0-py3-none-any.whl", hash = "sha256:6d5ba3c190ec76e4d252989a1644fcb233dbf53fbc8fceeb9febe1657b9fb1d4"},
]
[[package]]
name = "python-dateutil"
version = "2.8.2"
requires_python = "!=3.0.*,!=3.1.*,!=3.2.*,>=2.7"
summary = "Extensions to the standard Python datetime module"
dependencies = [
"six>=1.5",
]
files = [
{file = "python-dateutil-2.8.2.tar.gz", hash = "sha256:0123cacc1627ae19ddf3c27a5de5bd67ee4586fbdd6440d9748f8abb483d3e86"},
{file = "python_dateutil-2.8.2-py2.py3-none-any.whl", hash = "sha256:961d03dc3453ebbc59dbdea9e4e11c5651520a876d0f4db161e8674aae935da9"},
]
[[package]]
name = "python-ldap"
version = "3.4.4"
requires_python = ">=3.6"
summary = "Python modules for implementing LDAP clients"
dependencies = [
"pyasn1-modules>=0.1.5",
"pyasn1>=0.3.7",
]
files = [
{file = "python-ldap-3.4.4.tar.gz", hash = "sha256:7edb0accec4e037797705f3a05cbf36a9fde50d08c8f67f2aef99a2628fab828"},
]
[[package]]
name = "pytz"
version = "2023.3.post1"
summary = "World timezone definitions, modern and historical"
files = [
{file = "pytz-2023.3.post1-py2.py3-none-any.whl", hash = "sha256:ce42d816b81b68506614c11e8937d3aa9e41007ceb50bfdcb0749b921bf646c7"},
{file = "pytz-2023.3.post1.tar.gz", hash = "sha256:7b4fddbeb94a1eba4b557da24f19fdf9db575192544270a9101d8509f9f43d7b"},
]
[[package]]
name = "regex"
version = "2023.12.25"
requires_python = ">=3.7"
summary = "Alternative regular expression module, to replace re."
files = [
{file = "regex-2023.12.25-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:0694219a1d54336fd0445ea382d49d36882415c0134ee1e8332afd1529f0baa5"},
{file = "regex-2023.12.25-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:b014333bd0217ad3d54c143de9d4b9a3ca1c5a29a6d0d554952ea071cff0f1f8"},
{file = "regex-2023.12.25-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:d865984b3f71f6d0af64d0d88f5733521698f6c16f445bb09ce746c92c97c586"},
{file = "regex-2023.12.25-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:1e0eabac536b4cc7f57a5f3d095bfa557860ab912f25965e08fe1545e2ed8b4c"},
{file = "regex-2023.12.25-cp310-cp310-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:c25a8ad70e716f96e13a637802813f65d8a6760ef48672aa3502f4c24ea8b400"},
{file = "regex-2023.12.25-cp310-cp310-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:a9b6d73353f777630626f403b0652055ebfe8ff142a44ec2cf18ae470395766e"},
{file = "regex-2023.12.25-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:a9cc99d6946d750eb75827cb53c4371b8b0fe89c733a94b1573c9dd16ea6c9e4"},
{file = "regex-2023.12.25-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:88d1f7bef20c721359d8675f7d9f8e414ec5003d8f642fdfd8087777ff7f94b5"},
{file = "regex-2023.12.25-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:cb3fe77aec8f1995611f966d0c656fdce398317f850d0e6e7aebdfe61f40e1cd"},
{file = "regex-2023.12.25-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:7aa47c2e9ea33a4a2a05f40fcd3ea36d73853a2aae7b4feab6fc85f8bf2c9704"},
{file = "regex-2023.12.25-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:df26481f0c7a3f8739fecb3e81bc9da3fcfae34d6c094563b9d4670b047312e1"},
{file = "regex-2023.12.25-cp310-cp310-musllinux_1_1_ppc64le.whl", hash = "sha256:c40281f7d70baf6e0db0c2f7472b31609f5bc2748fe7275ea65a0b4601d9b392"},
{file = "regex-2023.12.25-cp310-cp310-musllinux_1_1_s390x.whl", hash = "sha256:d94a1db462d5690ebf6ae86d11c5e420042b9898af5dcf278bd97d6bda065423"},
{file = "regex-2023.12.25-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:ba1b30765a55acf15dce3f364e4928b80858fa8f979ad41f862358939bdd1f2f"},
{file = "regex-2023.12.25-cp310-cp310-win32.whl", hash = "sha256:150c39f5b964e4d7dba46a7962a088fbc91f06e606f023ce57bb347a3b2d4630"},
{file = "regex-2023.12.25-cp310-cp310-win_amd64.whl", hash = "sha256:09da66917262d9481c719599116c7dc0c321ffcec4b1f510c4f8a066f8768105"},
{file = "regex-2023.12.25-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:1b9d811f72210fa9306aeb88385b8f8bcef0dfbf3873410413c00aa94c56c2b6"},
{file = "regex-2023.12.25-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:d902a43085a308cef32c0d3aea962524b725403fd9373dea18110904003bac97"},
{file = "regex-2023.12.25-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:d166eafc19f4718df38887b2bbe1467a4f74a9830e8605089ea7a30dd4da8887"},
{file = "regex-2023.12.25-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c7ad32824b7f02bb3c9f80306d405a1d9b7bb89362d68b3c5a9be53836caebdb"},
{file = "regex-2023.12.25-cp311-cp311-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:636ba0a77de609d6510235b7f0e77ec494d2657108f777e8765efc060094c98c"},
{file = "regex-2023.12.25-cp311-cp311-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:0fda75704357805eb953a3ee15a2b240694a9a514548cd49b3c5124b4e2ad01b"},
{file = "regex-2023.12.25-cp311-cp311-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:f72cbae7f6b01591f90814250e636065850c5926751af02bb48da94dfced7baa"},
{file = "regex-2023.12.25-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:db2a0b1857f18b11e3b0e54ddfefc96af46b0896fb678c85f63fb8c37518b3e7"},
{file = "regex-2023.12.25-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:7502534e55c7c36c0978c91ba6f61703faf7ce733715ca48f499d3dbbd7657e0"},
{file = "regex-2023.12.25-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:e8c7e08bb566de4faaf11984af13f6bcf6a08f327b13631d41d62592681d24fe"},
{file = "regex-2023.12.25-cp311-cp311-musllinux_1_1_ppc64le.whl", hash = "sha256:283fc8eed679758de38fe493b7d7d84a198b558942b03f017b1f94dda8efae80"},
{file = "regex-2023.12.25-cp311-cp311-musllinux_1_1_s390x.whl", hash = "sha256:f44dd4d68697559d007462b0a3a1d9acd61d97072b71f6d1968daef26bc744bd"},
{file = "regex-2023.12.25-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:67d3ccfc590e5e7197750fcb3a2915b416a53e2de847a728cfa60141054123d4"},
{file = "regex-2023.12.25-cp311-cp311-win32.whl", hash = "sha256:68191f80a9bad283432385961d9efe09d783bcd36ed35a60fb1ff3f1ec2efe87"},
{file = "regex-2023.12.25-cp311-cp311-win_amd64.whl", hash = "sha256:7d2af3f6b8419661a0c421584cfe8aaec1c0e435ce7e47ee2a97e344b98f794f"},
{file = "regex-2023.12.25-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:f7bc09bc9c29ebead055bcba136a67378f03d66bf359e87d0f7c759d6d4ffa31"},
{file = "regex-2023.12.25-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:e14b73607d6231f3cc4622809c196b540a6a44e903bcfad940779c80dffa7be7"},
{file = "regex-2023.12.25-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:9eda5f7a50141291beda3edd00abc2d4a5b16c29c92daf8d5bd76934150f3edc"},
{file = "regex-2023.12.25-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:cc6bb9aa69aacf0f6032c307da718f61a40cf970849e471254e0e91c56ffca95"},
{file = "regex-2023.12.25-cp39-cp39-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:298dc6354d414bc921581be85695d18912bea163a8b23cac9a2562bbcd5088b1"},
{file = "regex-2023.12.25-cp39-cp39-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:2f4e475a80ecbd15896a976aa0b386c5525d0ed34d5c600b6d3ebac0a67c7ddf"},
{file = "regex-2023.12.25-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:531ac6cf22b53e0696f8e1d56ce2396311254eb806111ddd3922c9d937151dae"},
{file = "regex-2023.12.25-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:22f3470f7524b6da61e2020672df2f3063676aff444db1daa283c2ea4ed259d6"},
{file = "regex-2023.12.25-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl", hash = "sha256:89723d2112697feaa320c9d351e5f5e7b841e83f8b143dba8e2d2b5f04e10923"},
{file = "regex-2023.12.25-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:0ecf44ddf9171cd7566ef1768047f6e66975788258b1c6c6ca78098b95cf9a3d"},
{file = "regex-2023.12.25-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:905466ad1702ed4acfd67a902af50b8db1feeb9781436372261808df7a2a7bca"},
{file = "regex-2023.12.25-cp39-cp39-musllinux_1_1_ppc64le.whl", hash = "sha256:4558410b7a5607a645e9804a3e9dd509af12fb72b9825b13791a37cd417d73a5"},
{file = "regex-2023.12.25-cp39-cp39-musllinux_1_1_s390x.whl", hash = "sha256:7e316026cc1095f2a3e8cc012822c99f413b702eaa2ca5408a513609488cb62f"},
{file = "regex-2023.12.25-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:3b1de218d5375cd6ac4b5493e0b9f3df2be331e86520f23382f216c137913d20"},
{file = "regex-2023.12.25-cp39-cp39-win32.whl", hash = "sha256:11a963f8e25ab5c61348d090bf1b07f1953929c13bd2309a0662e9ff680763c9"},
{file = "regex-2023.12.25-cp39-cp39-win_amd64.whl", hash = "sha256:e693e233ac92ba83a87024e1d32b5f9ab15ca55ddd916d878146f4e3406b5c91"},
{file = "regex-2023.12.25.tar.gz", hash = "sha256:29171aa128da69afdf4bde412d5bedc335f2ca8fcfe4489038577d05f16181e5"},
]
[[package]]
name = "requests"
version = "2.31.0"
requires_python = ">=3.7"
summary = "Python HTTP for Humans."
dependencies = [
"certifi>=2017.4.17",
"charset-normalizer<4,>=2",
"idna<4,>=2.5",
"urllib3<3,>=1.21.1",
]
files = [
{file = "requests-2.31.0-py3-none-any.whl", hash = "sha256:58cd2187c01e70e6e26505bca751777aa9f2ee0b7f4300988b709f44e013003f"},
{file = "requests-2.31.0.tar.gz", hash = "sha256:942c5a758f98d790eaed1a29cb6eefc7ffb0d1cf7af05c3d2791656dbd6ad1e1"},
]
[[package]]
name = "six"
version = "1.16.0"
requires_python = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*"
summary = "Python 2 and 3 compatibility utilities"
files = [
{file = "six-1.16.0-py2.py3-none-any.whl", hash = "sha256:8abb2f1d86890a2dfb989f9a77cfcfd3e47c2a354b01111771326f8aa26e0254"},
{file = "six-1.16.0.tar.gz", hash = "sha256:1e61c37477a1626458e36f7b1d82aa5c9b094fa4802892072e49de9c60c4c926"},
]
[[package]]
name = "sonic-client"
version = "1.0.0"
summary = "python client for sonic search backend"
files = [
{file = "sonic-client-1.0.0.tar.gz", hash = "sha256:fe324c7354670488ed84847f6a6727d3cb5fb3675cb9b61396dcf5720e5aca66"},
{file = "sonic_client-1.0.0-py3-none-any.whl", hash = "sha256:291bf292861e97a2dd765ff0c8754ea9631383680d31a63ec3da6f5aa5f4beda"},
]
[[package]]
name = "sqlparse"
version = "0.4.4"
requires_python = ">=3.5"
summary = "A non-validating SQL parser."
files = [
{file = "sqlparse-0.4.4-py3-none-any.whl", hash = "sha256:5430a4fe2ac7d0f93e66f1efc6e1338a41884b7ddf2a350cedd20ccc4d9d28f3"},
{file = "sqlparse-0.4.4.tar.gz", hash = "sha256:d446183e84b8349fa3061f0fe7f06ca94ba65b426946ffebe6e3e8295332420c"},
]
[[package]]
name = "stack-data"
version = "0.6.3"
summary = "Extract data from python stack frames and tracebacks for informative displays"
dependencies = [
"asttokens>=2.1.0",
"executing>=1.2.0",
"pure-eval",
]
files = [
{file = "stack_data-0.6.3-py3-none-any.whl", hash = "sha256:d5558e0c25a4cb0853cddad3d77da9891a08cb85dd9f9f91b9f8cd66e511e695"},
{file = "stack_data-0.6.3.tar.gz", hash = "sha256:836a778de4fec4dcd1dcd89ed8abff8a221f58308462e1c4aa2a3cf30148f0b9"},
]
[[package]]
name = "traitlets"
version = "5.14.1"
requires_python = ">=3.8"
summary = "Traitlets Python configuration system"
files = [
{file = "traitlets-5.14.1-py3-none-any.whl", hash = "sha256:2e5a030e6eff91737c643231bfcf04a65b0132078dad75e4936700b213652e74"},
{file = "traitlets-5.14.1.tar.gz", hash = "sha256:8585105b371a04b8316a43d5ce29c098575c2e477850b62b848b964f1444527e"},
]
[[package]]
name = "typing-extensions"
version = "4.9.0"
requires_python = ">=3.8"
summary = "Backported and Experimental Type Hints for Python 3.8+"
files = [
{file = "typing_extensions-4.9.0-py3-none-any.whl", hash = "sha256:af72aea155e91adfc61c3ae9e0e342dbc0cba726d6cba4b6c72c1f34e47291cd"},
{file = "typing_extensions-4.9.0.tar.gz", hash = "sha256:23478f88c37f27d76ac8aee6c905017a143b0b1b886c3c9f66bc2fd94f9f5783"},
]
[[package]]
name = "tzdata"
version = "2023.4"
requires_python = ">=2"
summary = "Provider of IANA time zone data"
files = [
{file = "tzdata-2023.4-py2.py3-none-any.whl", hash = "sha256:aa3ace4329eeacda5b7beb7ea08ece826c28d761cda36e747cfbf97996d39bf3"},
{file = "tzdata-2023.4.tar.gz", hash = "sha256:dd54c94f294765522c77399649b4fefd95522479a664a0cec87f41bebc6148c9"},
]
[[package]]
name = "tzlocal"
version = "5.2"
requires_python = ">=3.8"
summary = "tzinfo object for the local timezone"
dependencies = [
"tzdata; platform_system == \"Windows\"",
]
files = [
{file = "tzlocal-5.2-py3-none-any.whl", hash = "sha256:49816ef2fe65ea8ac19d19aa7a1ae0551c834303d5014c6d5a62e4cbda8047b8"},
{file = "tzlocal-5.2.tar.gz", hash = "sha256:8d399205578f1a9342816409cc1e46a93ebd5755e39ea2d85334bea911bf0e6e"},
]
[[package]]
name = "urllib3"
version = "2.1.0"
requires_python = ">=3.8"
summary = "HTTP library with thread-safe connection pooling, file post, and more."
files = [
{file = "urllib3-2.1.0-py3-none-any.whl", hash = "sha256:55901e917a5896a349ff771be919f8bd99aff50b79fe58fec595eb37bbc56bb3"},
{file = "urllib3-2.1.0.tar.gz", hash = "sha256:df7aa8afb0148fa78488e7899b2c59b5f4ffcfa82e6c54ccb9dd37c1d7b52d54"},
]
[[package]]
name = "w3lib"
version = "2.1.2"
requires_python = ">=3.7"
summary = "Library of web-related functions"
files = [
{file = "w3lib-2.1.2-py3-none-any.whl", hash = "sha256:c4432926e739caa8e3f49f5de783f336df563d9490416aebd5d39fb896d264e7"},
{file = "w3lib-2.1.2.tar.gz", hash = "sha256:ed5b74e997eea2abe3c1321f916e344144ee8e9072a6f33463ee8e57f858a4b1"},
]
[[package]]
name = "wcwidth"
version = "0.2.13"
summary = "Measures the displayed width of unicode strings in a terminal"
files = [
{file = "wcwidth-0.2.13-py2.py3-none-any.whl", hash = "sha256:3da69048e4540d84af32131829ff948f1e022c1c6bdb8d6102117aac784f6859"},
{file = "wcwidth-0.2.13.tar.gz", hash = "sha256:72ea0c06399eb286d978fdedb6923a9eb47e1c486ce63e9b4e64fc18303972b5"},
]
[[package]]
name = "websockets"
version = "12.0"
requires_python = ">=3.8"
summary = "An implementation of the WebSocket Protocol (RFC 6455 & 7692)"
files = [
{file = "websockets-12.0-cp310-cp310-macosx_10_9_universal2.whl", hash = "sha256:d554236b2a2006e0ce16315c16eaa0d628dab009c33b63ea03f41c6107958374"},
{file = "websockets-12.0-cp310-cp310-macosx_10_9_x86_64.whl", hash = "sha256:2d225bb6886591b1746b17c0573e29804619c8f755b5598d875bb4235ea639be"},
{file = "websockets-12.0-cp310-cp310-macosx_11_0_arm64.whl", hash = "sha256:eb809e816916a3b210bed3c82fb88eaf16e8afcf9c115ebb2bacede1797d2547"},
{file = "websockets-12.0-cp310-cp310-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:c588f6abc13f78a67044c6b1273a99e1cf31038ad51815b3b016ce699f0d75c2"},
{file = "websockets-12.0-cp310-cp310-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5aa9348186d79a5f232115ed3fa9020eab66d6c3437d72f9d2c8ac0c6858c558"},
{file = "websockets-12.0-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6350b14a40c95ddd53e775dbdbbbc59b124a5c8ecd6fbb09c2e52029f7a9f480"},
{file = "websockets-12.0-cp310-cp310-musllinux_1_1_aarch64.whl", hash = "sha256:70ec754cc2a769bcd218ed8d7209055667b30860ffecb8633a834dde27d6307c"},
{file = "websockets-12.0-cp310-cp310-musllinux_1_1_i686.whl", hash = "sha256:6e96f5ed1b83a8ddb07909b45bd94833b0710f738115751cdaa9da1fb0cb66e8"},
{file = "websockets-12.0-cp310-cp310-musllinux_1_1_x86_64.whl", hash = "sha256:4d87be612cbef86f994178d5186add3d94e9f31cc3cb499a0482b866ec477603"},
{file = "websockets-12.0-cp310-cp310-win32.whl", hash = "sha256:befe90632d66caaf72e8b2ed4d7f02b348913813c8b0a32fae1cc5fe3730902f"},
{file = "websockets-12.0-cp310-cp310-win_amd64.whl", hash = "sha256:363f57ca8bc8576195d0540c648aa58ac18cf85b76ad5202b9f976918f4219cf"},
{file = "websockets-12.0-cp311-cp311-macosx_10_9_universal2.whl", hash = "sha256:5d873c7de42dea355d73f170be0f23788cf3fa9f7bed718fd2830eefedce01b4"},
{file = "websockets-12.0-cp311-cp311-macosx_10_9_x86_64.whl", hash = "sha256:3f61726cae9f65b872502ff3c1496abc93ffbe31b278455c418492016e2afc8f"},
{file = "websockets-12.0-cp311-cp311-macosx_11_0_arm64.whl", hash = "sha256:ed2fcf7a07334c77fc8a230755c2209223a7cc44fc27597729b8ef5425aa61a3"},
{file = "websockets-12.0-cp311-cp311-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:8e332c210b14b57904869ca9f9bf4ca32f5427a03eeb625da9b616c85a3a506c"},
{file = "websockets-12.0-cp311-cp311-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:5693ef74233122f8ebab026817b1b37fe25c411ecfca084b29bc7d6efc548f45"},
{file = "websockets-12.0-cp311-cp311-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:6e9e7db18b4539a29cc5ad8c8b252738a30e2b13f033c2d6e9d0549b45841c04"},
{file = "websockets-12.0-cp311-cp311-musllinux_1_1_aarch64.whl", hash = "sha256:6e2df67b8014767d0f785baa98393725739287684b9f8d8a1001eb2839031447"},
{file = "websockets-12.0-cp311-cp311-musllinux_1_1_i686.whl", hash = "sha256:bea88d71630c5900690fcb03161ab18f8f244805c59e2e0dc4ffadae0a7ee0ca"},
{file = "websockets-12.0-cp311-cp311-musllinux_1_1_x86_64.whl", hash = "sha256:dff6cdf35e31d1315790149fee351f9e52978130cef6c87c4b6c9b3baf78bc53"},
{file = "websockets-12.0-cp311-cp311-win32.whl", hash = "sha256:3e3aa8c468af01d70332a382350ee95f6986db479ce7af14d5e81ec52aa2b402"},
{file = "websockets-12.0-cp311-cp311-win_amd64.whl", hash = "sha256:25eb766c8ad27da0f79420b2af4b85d29914ba0edf69f547cc4f06ca6f1d403b"},
{file = "websockets-12.0-cp39-cp39-macosx_10_9_universal2.whl", hash = "sha256:ab3d732ad50a4fbd04a4490ef08acd0517b6ae6b77eb967251f4c263011a990d"},
{file = "websockets-12.0-cp39-cp39-macosx_10_9_x86_64.whl", hash = "sha256:a1d9697f3337a89691e3bd8dc56dea45a6f6d975f92e7d5f773bc715c15dde28"},
{file = "websockets-12.0-cp39-cp39-macosx_11_0_arm64.whl", hash = "sha256:1df2fbd2c8a98d38a66f5238484405b8d1d16f929bb7a33ed73e4801222a6f53"},
{file = "websockets-12.0-cp39-cp39-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:23509452b3bc38e3a057382c2e941d5ac2e01e251acce7adc74011d7d8de434c"},
{file = "websockets-12.0-cp39-cp39-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:2e5fc14ec6ea568200ea4ef46545073da81900a2b67b3e666f04adf53ad452ec"},
{file = "websockets-12.0-cp39-cp39-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:46e71dbbd12850224243f5d2aeec90f0aaa0f2dde5aeeb8fc8df21e04d99eff9"},
{file = "websockets-12.0-cp39-cp39-musllinux_1_1_aarch64.whl", hash = "sha256:b81f90dcc6c85a9b7f29873beb56c94c85d6f0dac2ea8b60d995bd18bf3e2aae"},
{file = "websockets-12.0-cp39-cp39-musllinux_1_1_i686.whl", hash = "sha256:a02413bc474feda2849c59ed2dfb2cddb4cd3d2f03a2fedec51d6e959d9b608b"},
{file = "websockets-12.0-cp39-cp39-musllinux_1_1_x86_64.whl", hash = "sha256:bbe6013f9f791944ed31ca08b077e26249309639313fff132bfbf3ba105673b9"},
{file = "websockets-12.0-cp39-cp39-win32.whl", hash = "sha256:cbe83a6bbdf207ff0541de01e11904827540aa069293696dd528a6640bd6a5f6"},
{file = "websockets-12.0-cp39-cp39-win_amd64.whl", hash = "sha256:fc4e7fa5414512b481a2483775a8e8be7803a35b30ca805afa4998a84f9fd9e8"},
{file = "websockets-12.0-pp310-pypy310_pp73-macosx_10_9_x86_64.whl", hash = "sha256:248d8e2446e13c1d4326e0a6a4e9629cb13a11195051a73acf414812700badbd"},
{file = "websockets-12.0-pp310-pypy310_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:f44069528d45a933997a6fef143030d8ca8042f0dfaad753e2906398290e2870"},
{file = "websockets-12.0-pp310-pypy310_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:c4e37d36f0d19f0a4413d3e18c0d03d0c268ada2061868c1e6f5ab1a6d575077"},
{file = "websockets-12.0-pp310-pypy310_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:3d829f975fc2e527a3ef2f9c8f25e553eb7bc779c6665e8e1d52aa22800bb38b"},
{file = "websockets-12.0-pp310-pypy310_pp73-win_amd64.whl", hash = "sha256:2c71bd45a777433dd9113847af751aae36e448bc6b8c361a566cb043eda6ec30"},
{file = "websockets-12.0-pp38-pypy38_pp73-macosx_10_9_x86_64.whl", hash = "sha256:0bee75f400895aef54157b36ed6d3b308fcab62e5260703add87f44cee9c82a6"},
{file = "websockets-12.0-pp38-pypy38_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:423fc1ed29f7512fceb727e2d2aecb952c46aa34895e9ed96071821309951123"},
{file = "websockets-12.0-pp38-pypy38_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:27a5e9964ef509016759f2ef3f2c1e13f403725a5e6a1775555994966a66e931"},
{file = "websockets-12.0-pp38-pypy38_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:c3181df4583c4d3994d31fb235dc681d2aaad744fbdbf94c4802485ececdecf2"},
{file = "websockets-12.0-pp38-pypy38_pp73-win_amd64.whl", hash = "sha256:b067cb952ce8bf40115f6c19f478dc71c5e719b7fbaa511359795dfd9d1a6468"},
{file = "websockets-12.0-pp39-pypy39_pp73-macosx_10_9_x86_64.whl", hash = "sha256:00700340c6c7ab788f176d118775202aadea7602c5cc6be6ae127761c16d6b0b"},
{file = "websockets-12.0-pp39-pypy39_pp73-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:e469d01137942849cff40517c97a30a93ae79917752b34029f0ec72df6b46399"},
{file = "websockets-12.0-pp39-pypy39_pp73-manylinux_2_5_i686.manylinux1_i686.manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:ffefa1374cd508d633646d51a8e9277763a9b78ae71324183693959cf94635a7"},
{file = "websockets-12.0-pp39-pypy39_pp73-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:ba0cab91b3956dfa9f512147860783a1829a8d905ee218a9837c18f683239611"},
{file = "websockets-12.0-pp39-pypy39_pp73-win_amd64.whl", hash = "sha256:2cb388a5bfb56df4d9a406783b7f9dbefb888c09b71629351cc6b036e9259370"},
{file = "websockets-12.0-py3-none-any.whl", hash = "sha256:dc284bbc8d7c78a6c69e0c7325ab46ee5e40bb4d50e494d8131a07ef47500e9e"},
{file = "websockets-12.0.tar.gz", hash = "sha256:81df9cbcbb6c260de1e007e58c011bfebe2dafc8435107b0537f393dd38c8b1b"},
]
[[package]]
name = "yt-dlp"
version = "2023.12.30"
requires_python = ">=3.8"
summary = "A youtube-dl fork with additional features and patches"
dependencies = [
"brotli; implementation_name == \"cpython\"",
"brotlicffi; implementation_name != \"cpython\"",
"certifi",
"mutagen",
"pycryptodomex",
"requests<3,>=2.31.0",
"urllib3<3,>=1.26.17",
"websockets>=12.0",
]
files = [
{file = "yt-dlp-2023.12.30.tar.gz", hash = "sha256:a11862e57721b0a0f0883dfeb5a4d79ba213a2d4c45e1880e9fd70f8e6570c38"},
{file = "yt_dlp-2023.12.30-py2.py3-none-any.whl", hash = "sha256:c00d9a71d64472ad441bcaa1ec0c3797d6e60c9f934f270096a96fe51657e7b3"},
]

@ -1 +1 @@
Subproject commit 5323fc773d33ef3f219c35c946f3b353b1251d37
Subproject commit 1380be7e4ef156d85957dfef8c6d154ef9880578

View file

@ -1,32 +1,48 @@
[project]
name = "archivebox"
version = "0.7.3"
version = "0.8.0"
package-dir = "archivebox"
requires-python = ">=3.10,<3.13"
platform = "py3-none-any"
description = "Self-hosted internet archiving solution."
authors = [
{name = "Nick Sweeting", email = "pyproject.toml@archivebox.io"},
]
authors = [{name = "Nick Sweeting", email = "pyproject.toml@archivebox.io"}]
license = {text = "MIT"}
readme = "README.md"
package-dir = "archivebox"
requires-python = ">=3.10,<3.12"
# pdm install
# pdm update --unconstrained
dependencies = [
# pdm update [--unconstrained]
"croniter>=0.3.34",
"dateparser>=1.0.0",
# Last Bumped: 2024-04-25
# Base Framework and Language Dependencies
"setuptools>=69.5.1",
"django>=5.0.4,<6.0",
"django-ninja>=1.1.0",
"django-extensions>=3.2.3",
"django>=4.2.0,<5.0",
"setuptools>=69.0.3",
"mypy-extensions>=1.0.0",
# Python Helper Libraries
"requests>=2.31.0",
"dateparser>=1.0.0",
"feedparser>=6.0.11",
"ipython>5.0.0",
"mypy-extensions>=0.4.3",
"python-crontab>=2.5.1",
"requests>=2.24.0",
"w3lib>=1.22.0",
"yt-dlp>=2024.3.10",
# dont add playwright becuase packages without sdists cause trouble on many build systems that refuse to install wheel-only packages
"playwright>=1.39.0; platform_machine != 'armv7l'",
"w3lib>=2.1.2",
# Feature-Specific Dependencies
"python-crontab>=3.0.0", # for: archivebox schedule
"croniter>=2.0.5", # for: archivebox schedule
"ipython>=8.23.0", # for: archivebox shell
# Extractor Dependencies
"yt-dlp>=2024.4.9", # for: media
# "playwright>=1.43.0; platform_machine != 'armv7l'", # WARNING: playwright doesn't have any sdist, causes trouble on build systems that refuse to install wheel-only packages
# TODO: add more extractors
# - gallery-dl
# - scihubdl
# - See Github issues for more...
"django-signal-webhooks>=0.3.0",
"django-admin-data-views>=0.3.1",
]
homepage = "https://github.com/ArchiveBox/ArchiveBox"
repository = "https://github.com/ArchiveBox/ArchiveBox"
documentation = "https://github.com/ArchiveBox/ArchiveBox/wiki"
keywords = ["internet archiving", "web archiving", "digipres", "warc", "preservation", "backups", "archiving", "web", "bookmarks", "puppeteer", "browser", "download"]
classifiers = [
"Development Status :: 4 - Beta",
"Environment :: Console",
@ -42,9 +58,6 @@ classifiers = [
"Natural Language :: English",
"Operating System :: OS Independent",
"Programming Language :: Python :: 3",
"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",
"Programming Language :: Python :: 3.11",
"Programming Language :: Python :: 3.12",
@ -59,45 +72,52 @@ classifiers = [
"Topic :: Utilities",
"Typing :: Typed",
]
# dynamic = ["version"] # TODO: programatticaly fetch version from package.json at build time
# pdm lock --group=':all'
# pdm install -G:all
# pdm update --group=':all' --unconstrained
[project.optional-dependencies]
# pdm update [--group=':all'] [--unconstrained]
sonic = [
# echo "deb [signed-by=/usr/share/keyrings/valeriansaliou_sonic.gpg] https://packagecloud.io/valeriansaliou/sonic/debian/ bookworm main" > /etc/apt/sources.list.d/valeriansaliou_sonic.list
# curl -fsSL https://packagecloud.io/valeriansaliou/sonic/gpgkey | gpg --dearmor -o /usr/share/keyrings/valeriansaliou_sonic.gpg
# apt install sonic
"sonic-client>=0.0.5",
"sonic-client>=1.0.0",
]
ldap = [
# apt install libldap2-dev libsasl2-dev python3-ldap
"python-ldap>=3.4.3",
"django-auth-ldap>=4.1.0",
]
# playwright = [
# platform_machine isnt respected by pdm export -o requirements.txt, this breaks arm/v7
# "playwright>=1.39.0; platform_machine != 'armv7l'",
# ]
# pdm lock --group=':all' --dev
# pdm install -G:all --dev
# pdm update --dev [--unconstrained]
# pdm update --dev --unconstrained
[tool.pdm.dev-dependencies]
dev = [
# building
build = [
# "pdm", # usually installed by apt/brew, dont double-install with pip
"setuptools>=69.5.1",
"pip",
"wheel",
"pdm",
"homebrew-pypi-poet>=0.10.0",
# documentation
"homebrew-pypi-poet>=0.10.0", # for: generating archivebox.rb brewfile list of python packages
]
docs = [
"recommonmark",
"sphinx",
"sphinx-rtd-theme",
# debugging
]
debug = [
"django-debug-toolbar",
"djdt_flamegraph",
"ipdb",
# testing
"requests-tracker>=0.3.3",
]
test = [
"pytest",
# linting
"bottle",
]
lint = [
"flake8",
"mypy",
"django-stubs",
@ -108,16 +128,33 @@ lint = "./bin/lint.sh"
test = "./bin/test.sh"
# all = {composite = ["lint mypackage/", "test -v tests/"]}
[tool.pytest.ini_options]
testpaths = [ "tests" ]
[project.scripts]
archivebox = "archivebox.cli:main"
[build-system]
requires = ["pdm-backend"]
build-backend = "pdm.backend"
[project.scripts]
archivebox = "archivebox.cli:main"
[tool.pytest.ini_options]
testpaths = [ "tests" ]
[tool.mypy]
mypy_path = "archivebox"
namespace_packages = true
explicit_package_bases = true
# follow_imports = "silent"
# ignore_missing_imports = true
# disallow_incomplete_defs = true
# disallow_untyped_defs = true
# disallow_untyped_decorators = true
# exclude = "pdm/(pep582/|models/in_process/.+\\.py)"
plugins = ["mypy_django_plugin.main"]
[tool.django-stubs]
django_settings_module = "core.settings"
[project.urls]
Homepage = "https://github.com/ArchiveBox/ArchiveBox"

View file

@ -1,54 +1,70 @@
# This file is @generated by PDM.
# Please do not edit it manually.
asgiref==3.7.2
annotated-types==0.6.0
anyio==4.3.0
asgiref==3.8.1
asttokens==2.4.1
brotli==1.1.0; implementation_name == "cpython"
brotlicffi==1.1.0.0; implementation_name != "cpython"
certifi==2023.11.17
cffi==1.16.0; implementation_name != "cpython"
certifi==2024.2.2
cffi==1.16.0; platform_python_implementation != "PyPy" or implementation_name != "cpython"
charset-normalizer==3.3.2
colorama==0.4.6; sys_platform == "win32"
croniter==2.0.1
croniter==2.0.5
cryptography==42.0.7
dateparser==1.2.0
decorator==5.1.1
django==3.1.14
django-auth-ldap==4.1.0
django-extensions==3.1.5
exceptiongroup==1.2.0; python_version < "3.11"
django==5.0.6
django-admin-data-views==0.3.1
django-auth-ldap==4.8.0
django-extensions==3.2.3
django-ninja==1.1.0
django-settings-holder==0.1.2
django-signal-webhooks==0.3.0
exceptiongroup==1.2.1; python_version < "3.11"
executing==2.0.1
idna==3.6
ipython==8.18.1
feedparser==6.0.11
h11==0.14.0
httpcore==1.0.5
httpx==0.27.0
idna==3.7
ipython==8.24.0
jedi==0.19.1
matplotlib-inline==0.1.6
matplotlib-inline==0.1.7
mutagen==1.47.0
mypy-extensions==1.0.0
parso==0.8.3
pexpect==4.9.0; sys_platform != "win32"
parso==0.8.4
pexpect==4.9.0; sys_platform != "win32" and sys_platform != "emscripten"
prompt-toolkit==3.0.43
ptyprocess==0.7.0; sys_platform != "win32"
ptyprocess==0.7.0; sys_platform != "win32" and sys_platform != "emscripten"
pure-eval==0.2.2
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser==2.21; implementation_name != "cpython"
pycryptodomex==3.19.1
pygments==2.17.2
pyasn1==0.6.0
pyasn1-modules==0.4.0
pycparser==2.22; platform_python_implementation != "PyPy" or implementation_name != "cpython"
pycryptodomex==3.20.0
pydantic==2.7.1
pydantic-core==2.18.2
pygments==2.18.0
python-crontab==3.0.0
python-dateutil==2.8.2
python-dateutil==2.9.0.post0
python-ldap==3.4.4
pytz==2023.3.post1
regex==2023.12.25
pytz==2024.1
regex==2024.5.10
requests==2.31.0
setuptools==69.5.1
sgmllib3k==1.0.0
six==1.16.0
sniffio==1.3.1
sonic-client==1.0.0
sqlparse==0.4.4
sqlparse==0.5.0
stack-data==0.6.3
traitlets==5.14.1
typing-extensions==4.9.0; python_version < "3.11"
tzdata==2023.4; platform_system == "Windows"
traitlets==5.14.3
typing-extensions==4.11.0
tzdata==2024.1; sys_platform == "win32" or platform_system == "Windows"
tzlocal==5.2
urllib3==2.1.0
urllib3==2.2.1
w3lib==2.1.2
wcwidth==0.2.12
wcwidth==0.2.13
websockets==12.0
yt-dlp==2023.12.30
yt-dlp==2024.4.9