
Update README.md

Nick Sweeting 2023-11-08 22:42:20 -08:00 committed by GitHub
parent 7c7257c446
commit fb698e6ecf
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23

@@ -45,15 +45,15 @@ Without active preservation effort, everything on the internet eventually dissap
<img src="https://github.com/ArchiveBox/ArchiveBox/assets/511499/90f1ce3c-75bb-401d-88ed-6297694b76ae" alt="snapshot detail page" align="right" width="190px"/>
-💾 **It saves offline-viewable snapshots of the URLs you feed it in a few redundant formats.**
-It also auto-detects the content featured *inside* each webpage extracts it out to common, easy file formats:
+💾 **It saves snapshots of the URLs you feed it in several redundant formats.**
+It also detects any content featured *inside* each webpage & extracts it out into a folder:
- `HTML/Generic Websites -> HTML/PDF/PNG/WARC`
- `YouTube/SoundCloud/etc. -> mp3/mp4`,
- `news articles -> article body text`
- `github/gitlab/etc. links -> cloned source code`
- *[and more...](#output-formats)*
-You get back simple folders containing all the content for each URL (with a CLI and web UI to browse and manage it).
+You get back folders on your filesystem containing all the content for each URL (with a CLI and web UI to browse and manage it).
---