Basic ETL using Python, BigQuery, Data Studio & Airflow
Contents
ETL Diagram of my first project
Idea:
Get raw data from the Austin crime dataset, transform it, store it in the cloud, and use a visualization tool to present it properly.
The technologies used in this project are:
- Visual Studio
- Python Pandas
- BigQuery
- Data Studio
- Airflow
1. Library Imports
2. Extraction: API containing the crime data from Austin, Texas.
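A minimal sketch of the extraction step, assuming the crime data is served by a Socrata-style JSON endpoint that accepts `$limit` and `$offset` query parameters. The URL is a placeholder, and `build_url` and `fetch_records` are hypothetical helper names that do not come from the project itself.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder endpoint (assumption): replace with the real dataset URL.
API_URL = "https://example.com/resource/austin-crime.json"


def build_url(base_url, limit=1000, offset=0):
    """Append Socrata-style paging parameters to the endpoint URL."""
    query = urlencode({"$limit": limit, "$offset": offset})
    return f"{base_url}?{query}"


def fetch_records(base_url=API_URL, limit=1000, offset=0):
    """Download one page of records and return it as a list of dicts."""
    with urlopen(build_url(base_url, limit, offset), timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

The list of dicts returned by `fetch_records` can be passed directly to `pandas.DataFrame(records)` for the transformation step.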
3. Transformation: Used Visual Studio & Pandas.
Transformation: Rename of Columns
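Renaming columns in Pandas might look like the sketch below. The raw column names and the single sample row are invented for illustration (assumption); the real names depend on what the API returns.

```python
import pandas as pd

# Hypothetical raw column names (assumption): adjust to match the API response.
COLUMN_NAMES = {
    "occ_date": "Date Occurred",
    "rep_date": "Date Reported",
    "crime_type": "Crime Type",
}

# One made-up row, only so the example runs on its own.
raw = pd.DataFrame(
    {
        "occ_date": ["2023-01-05"],
        "rep_date": ["2023-01-06"],
        "crime_type": ["THEFT"],
    }
)

# rename returns a new DataFrame; columns missing from the mapping keep their names.
renamed = raw.rename(columns=COLUMN_NAMES)
print(list(renamed.columns))
```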
Transformation: Change the format of “Date Occurred” from military (24-hour) time to standard (12-hour) time
Transformation: Change the format of “Date Reported” from military (24-hour) time to standard (12-hour) time
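The two time conversions above can be sketched with one small helper. This assumes the timestamps arrive as strings such as `2023-01-05 14:30:00`; the input format and the function name `to_standard_time` are assumptions, so adjust `in_format` to match the real data.

```python
from datetime import datetime


def to_standard_time(
    value,
    in_format="%Y-%m-%d %H:%M:%S",      # assumed 24-hour input format
    out_format="%Y-%m-%d %I:%M:%S %p",  # 12-hour clock with AM/PM
):
    """Re-format a 24-hour timestamp string as a 12-hour (AM/PM) string."""
    return datetime.strptime(value, in_format).strftime(out_format)


for sample in ("2023-01-05 00:15:00", "2023-01-05 14:30:00"):
    print(sample, "->", to_standard_time(sample))
```

On a whole Pandas column, `pd.to_datetime(df[col]).dt.strftime(out_format)` gives the same result without a Python-level loop.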
4. Load: Upload the data into BigQuery.
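One way to do the upload is with the `google-cloud-bigquery` client, sketched below. This is an assumption about tooling, not necessarily what the project used; the helper names and the project, dataset, and table values are hypothetical, and `load_table_from_dataframe` also needs `pyarrow` installed.

```python
def make_table_id(project, dataset, table):
    """Build the fully qualified table ID BigQuery expects: project.dataset.table."""
    return f"{project}.{dataset}.{table}"


def load_to_bigquery(df, table_id):
    """Upload a pandas DataFrame to BigQuery, replacing any existing table contents."""
    # Imported lazily so make_table_id works without the client library installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses application default credentials
    job_config = bigquery.LoadJobConfig(
        write_disposition=bigquery.WriteDisposition.WRITE_TRUNCATE
    )
    job = client.load_table_from_dataframe(df, table_id, job_config=job_config)
    job.result()  # block until the load job finishes
    return client.get_table(table_id).num_rows


# Hypothetical usage:
# rows = load_to_bigquery(df, make_table_id("my-project", "crime", "austin"))
```

`WRITE_TRUNCATE` overwrites the table on every run, which keeps a scheduled reload idempotent; `WRITE_APPEND` would be the choice for incremental loads.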
5. Airflow: My preferred scheduler (I also wanted to learn the application)
Useful: Airflow documentation
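A rough sketch of how the three ETL steps could be wired into a daily Airflow DAG, assuming Airflow 2.4 or later in the 2.x series. The DAG ID, start date, schedule, and the three placeholder functions are all hypothetical.

```python
from datetime import datetime


# Hypothetical placeholders for the three ETL steps described above.
def extract():
    print("pull crime data from the API")


def transform():
    print("rename columns and re-format timestamps")


def load():
    print("upload the result to BigQuery")


def build_dag():
    """Create the DAG: extract -> transform -> load, once per day."""
    # Imported here so this sketch can be imported without Airflow installed.
    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="austin_crime_etl",        # hypothetical name
        start_date=datetime(2023, 1, 1),  # hypothetical start date
        schedule="@daily",                # releases before 2.4 use schedule_interval
        catchup=False,                    # do not backfill past runs
    ) as dag:
        t_extract = PythonOperator(task_id="extract", python_callable=extract)
        t_transform = PythonOperator(task_id="transform", python_callable=transform)
        t_load = PythonOperator(task_id="load", python_callable=load)

        # Run the tasks strictly in ETL order.
        t_extract >> t_transform >> t_load
    return dag


# In a real DAG file inside the dags/ folder, build the DAG at module level
# so the scheduler can discover it:
# dag = build_dag()
```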