NYC Taxi Data Pipeline
This GitHub Actions workflow automates the end-to-end data pipeline, from initializing the Snowflake infrastructure to producing analytical tables and views using Python and dbt.
Project source code
Online dbt documentation
Data Source
TLC Trip Record Data - NYC Taxi and Limousine Commission
The data includes:
- Pickup and dropoff dates/times
- Pickup and dropoff zones
- Distances, detailed fares, payment types
- Passenger count reported by the driver
The data is collected by authorized technology providers and provided to the TLC. The TLC does not guarantee the accuracy of this data.
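The fields listed above can be sketched as a typed record with a few sanity checks, which is one way to act on the TLC's accuracy caveat. This is a minimal illustration, not the project's actual code. The column names follow the TLC yellow taxi data dictionary, and it is an assumption that this pipeline uses yellow taxi trips and keeps those names. The `TripRecord` class and `quality_issues` helper are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class TripRecord:
    # Column names taken from the TLC yellow taxi data dictionary.
    # Assumption: the pipeline may use other trip types or rename columns.
    tpep_pickup_datetime: datetime
    tpep_dropoff_datetime: datetime
    PULocationID: int            # pickup taxi zone
    DOLocationID: int            # dropoff taxi zone
    trip_distance: float         # miles, as reported by the taximeter
    fare_amount: float
    payment_type: int            # numeric code for the payment method
    passenger_count: Optional[int]  # driver-entered, may be missing


def quality_issues(trip: TripRecord) -> list[str]:
    """Return labels for basic plausibility problems in one trip (hypothetical helper)."""
    issues = []
    if trip.tpep_dropoff_datetime <= trip.tpep_pickup_datetime:
        issues.append("dropoff_not_after_pickup")
    if trip.trip_distance < 0:
        issues.append("negative_distance")
    if trip.passenger_count is None or trip.passenger_count <= 0:
        issues.append("missing_or_zero_passengers")
    return issues
```

In a dbt project, checks of this kind would more commonly live in model tests or staging filters. The Python form here is only meant to make the field list concrete.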
License
This project is licensed under the MIT License. The source data is provided by the NYC TLC and subject to their terms of use.