Automated ML pipeline with Python, Docker, Luigi, scikit-learn, and pandas to predict wine quality ratings
Updated May 30, 2020 - Jupyter Notebook
pipecutter provides a few tools that make Luigi work better with data science libraries and environments such as pandas, scikit-learn, and Jupyter notebooks.
Contains the presentation of the data project carried out for the course "Data Product Architecture": 1) Functional data product: video of the final run of the data product; 2) "Front" presentation; 3) Delivery of the final document in the repository; 4) Last commit of the project
This repository contains an ETL (Extract, Transform, Load) pipeline implemented using Luigi, a Python package for building data pipelines. The pipeline extracts data from a CSV file hosted on a GitHub repository, performs some cleaning and transformation steps, and then loads the data into a SQLite database table.
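The extract, clean, and load steps described above can be sketched in plain Python. This is only an illustration of the data flow, not the repository's code: it uses the standard library instead of Luigi, and the sample CSV text, the column names (`name`, `score`), the cleaning rules, and the table name `scores` are all assumptions. In a Luigi pipeline, each function would typically sit in the `run()` method of its own task.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for the CSV file hosted on GitHub.
SAMPLE_CSV = """name,score
 Alice ,90
Bob,
carol,75
"""


def extract(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Drop rows with no score, tidy the names, and cast scores to int."""
    cleaned = []
    for row in rows:
        score = (row.get("score") or "").strip()
        if not score:
            continue  # assumed cleaning rule: skip incomplete rows
        cleaned.append((row["name"].strip().title(), int(score)))
    return cleaned


def load(records, conn, table="scores"):
    """Write the cleaned records into a SQLite table, replacing old contents."""
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} (name TEXT, score INTEGER)")
    conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", records)
    conn.commit()


def run_pipeline(text, conn):
    """Chain the three steps and return the loaded rows, sorted by name."""
    load(transform(extract(text)), conn)
    return conn.execute("SELECT name, score FROM scores ORDER BY name").fetchall()
```

Calling `run_pipeline(SAMPLE_CSV, sqlite3.connect(":memory:"))` runs the whole chain against an in-memory database, which keeps the sketch free of files on disk.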
As a Data Engineer at Erdigma, the goal of this task is to merge several CSV files from Google Drive into a single file so that it can be loaded into a database. The data processing is done with Python, using Luigi for workflow automation.
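Merging several CSV files into one could look roughly like the following. This is a minimal sketch that assumes the files are already downloaded locally and share the same header row; the function name `merge_csv_files` is hypothetical, and the Google Drive download and Luigi task wrapping are left out.

```python
import csv


def merge_csv_files(paths, output_path):
    """Concatenate CSV files with identical headers; return the data row count."""
    header = None
    total = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for path in paths:
            with open(path, newline="", encoding="utf-8") as src:
                reader = csv.reader(src)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip completely empty files
                if header is None:
                    # The first non-empty file defines the header, written once.
                    header = file_header
                    writer.writerow(header)
                elif file_header != header:
                    raise ValueError(f"{path} has a different header: {file_header}")
                for row in reader:
                    writer.writerow(row)
                    total += 1
    return total
```

Raising an error on a mismatched header is a design choice in this sketch: it avoids silently loading misaligned columns into the database.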
Universal Luigi ETL pipeline. Validates data received from external sources, then extracts, transforms, and lands it.
This repository scrapes data from a static website using Luigi. It was created to demonstrate my ability to use a Luigi pipeline to automate data collection and other tasks.
PyCon 2017: Creating ETL tasks with the Luigi package