mirror of
https://github.com/bluxmit/alnoda-workspaces.git
synced 2024-05-16 12:02:19 +12:00
80 lines
2.9 KiB
Markdown
# Data Workstation

```sh
# Build the image
docker build -t data-workstation-base:3.8 --build-arg docker_registry=rg.fr-par.scw.cloud/dgym .

# Run the locally built image
docker run -p 3000:3000 -p 8001:8000 -p 3012:3012 -p 8092:8092 -p 8448:8448 -p 9992:9992 -p 8085:8085 -p 8086:8086 -p 8082:8082 -p 8084:8084 data-workstation-base:3.8

# Run the python-workstation image from the registry
docker run -p 3000:3000 -p 8001:8000 -p 3012:3012 -p 8092:8092 -p 8448:8448 -p 9992:9992 -p 8085:8085 -p 8086:8086 -p 8082:8082 -p 8084:8084 rg.fr-par.scw.cloud/dgym/python-workstation:3.8
```
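
The `docker run` commands above repeat the same ten port mappings. One way to avoid retyping them is to build the command from a list. This is only a sketch: the helper name `docker_run_command` and the `PORTS` constant are made up for illustration and are not part of this repository.

```python
import shlex

# (host, container) port pairs copied from the docker run commands above.
PORTS = [
    (3000, 3000), (8001, 8000), (3012, 3012), (8092, 8092), (8448, 8448),
    (9992, 9992), (8085, 8085), (8086, 8086), (8082, 8082), (8084, 8084),
]


def docker_run_command(image, ports=PORTS):
    """Return the `docker run` argument list for the given image (hypothetical helper)."""
    args = ["docker", "run"]
    for host, container in ports:
        args += ["-p", f"{host}:{container}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # Print the command instead of executing it, so it can be reviewed first.
    print(shlex.join(docker_run_command("data-workstation-base:3.8")))
```

The argument list can also be passed directly to `subprocess.run` if the container should be started from Python.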
## Luigi

Useful links:

- [Luigi GitHub Repo](https://github.com/spotify/luigi)
- [A Tutorial on Luigi, the Spotify’s Pipeline](https://towardsdatascience.com/a-tutorial-on-luigi-spotifys-pipeline-5c694fb4113e)
- [Create your first ETL in Luigi](http://blog.adnansiddiqi.me/create-your-first-etl-in-luigi/)
- [Luigi on PyPI](https://pypi.org/project/luigi/)

## DBT

Useful links:

- [DBT main page](https://docs.getdbt.com/)
- [dbt (Data Build Tool) Tutorial](https://www.startdataengineering.com/post/dbt-data-build-tool-tutorial/)
- [DBT on PyPI](https://pypi.org/project/dbt/)
- [Analytics Engineering with dbt and PostgreSQL](https://dsotm-rsa.space/post/2019/09/01/analytics-engineering-with-dbt-data-build-tool-and-postgres-11/)

```sh
dbt init simple_dbt_project --adapter postgres
```
## Great Expectations

Useful links:

- [Great Expectations main page](https://greatexpectations.io/)
- [Great Expectations documentation](https://docs.greatexpectations.io/en/latest/)
- [Great Expectations on PyPI](https://pypi.org/project/great-expectations/)
- [Understanding Great Expectations and How to Use It](https://medium.com/hashmapinc/understanding-great-expectations-and-how-to-use-it-7754c78962f4)
- [Know Your Data Pipelines with Great Expectations](https://medium.com/hashmapinc/know-your-data-pipelines-with-great-expectations-tool-b6d38a2e6f06)
- https://www.startdataengineering.com/post/ensuring-data-quality-with-great-expectations/
- https://docs.greatexpectations.io/en/stable/guides/tutorials/how_to_create_expectations.html

## Papermill

- [Papermill Report GitHub](https://github.com/ariadnext/papermill_report)
- [Automated Report Generation with Papermill: Part 1](https://pbpython.com/papermil-rclone-report-1.html)
- [Automated Report Generation with Papermill: Part 2](https://pbpython.com/papermil-rclone-report-2.html)

## Prefect

https://docs.prefect.io/core/getting_started/installation.html

## ADVANCED DATA

https://www.datacouncil.ai/blog/25-hot-new-data-tools-and-what-they-dont-do

## PREFECT

Dockerfile instruction: `RUN pip install prefect==0.14.20`

Supervisor program entry:

```
[program:prefect]
directory=/home/
command=/bin/sh -c " prefect backend server; prefect server start --ui-port 8095; prefect agent local start "
stderr_logfile = /var/log/prefect-stderr.log
stdout_logfile = /var/log/prefect-stdout.log
logfile_maxbytes = 1024
```

Publish the Prefect UI port when running the container: `-p 8095:8095`
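
Once the container is running with port 8095 published, a quick TCP check can confirm that something is listening there. The sketch below uses only the Python standard library. The function name `port_is_open` is made up for illustration, and a successful connection only shows that the port accepts connections, not that the Prefect UI itself is healthy.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8095 is the UI port from the supervisor command and the -p mapping above.
    print("Port 8095 reachable:", port_is_open("localhost", 8095))
```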