Movie Data - ETL in a Single Function

Project Overview

The purpose of this project is to create an automated pipeline that takes in new movie data, performs the appropriate transformations, and loads the data into existing tables. The existing code is refactored into a single function that performs the entire extract, transform, load (ETL) process.
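As a rough illustration of what "ETL in a single function" can look like, here is a minimal sketch. It is not the code from this repository's notebooks: the function name, column names, and sample records are all hypothetical, it uses only the Python standard library, and an in-memory SQLite database stands in for PostgreSQL so that the example runs anywhere.

```python
import csv
import io
import json
import sqlite3

# Made-up sample inputs (hypothetical columns, not the real datasets).
SAMPLE_WIKI_JSON = (
    '[{"imdb_id": "tt0000001", "director": "Director A"}, {"title": "No ID"}]'
)
SAMPLE_KAGGLE_CSV = "imdb_id,title\ntt0000001,Movie A\ntt0000002,Movie B\n"
SAMPLE_RATINGS_CSV = "movieId,rating\n1,4.0\n1,3.5\n"


def extract_transform_load(wiki_json, kaggle_csv, ratings_csv, conn):
    """Extract, transform, and load in one call; returns the rows loaded."""
    # Extract: parse the three raw inputs into lists of dicts.
    wiki_movies = json.loads(wiki_json)
    kaggle_movies = list(csv.DictReader(io.StringIO(kaggle_csv)))
    ratings = list(csv.DictReader(io.StringIO(ratings_csv)))

    # Transform: drop Wikipedia entries without an IMDb ID, then inner-join
    # them to the Kaggle metadata on that ID.
    wiki_by_id = {m["imdb_id"]: m for m in wiki_movies if m.get("imdb_id")}
    movie_rows = [
        (k["imdb_id"], k["title"], wiki_by_id[k["imdb_id"]].get("director"))
        for k in kaggle_movies
        if k["imdb_id"] in wiki_by_id
    ]
    rating_rows = [(int(r["movieId"]), float(r["rating"])) for r in ratings]

    # Load: insert into tables that already exist.
    # Note: "?" is SQLite's placeholder style; other drivers may differ.
    cur = conn.cursor()
    cur.executemany("INSERT INTO movies VALUES (?, ?, ?)", movie_rows)
    cur.executemany("INSERT INTO ratings VALUES (?, ?)", rating_rows)
    conn.commit()
    return len(movie_rows), len(rating_rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (imdb_id TEXT, title TEXT, director TEXT)")
    conn.execute("CREATE TABLE ratings (movie_id INTEGER, rating REAL)")
    print(extract_transform_load(
        SAMPLE_WIKI_JSON, SAMPLE_KAGGLE_CSV, SAMPLE_RATINGS_CSV, conn
    ))  # (1, 2)
```

Keeping all three stages behind one call means new data can be processed by rerunning a single function, which is the point of the refactoring described above.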

Resources and Software

Resources:
- Wikipedia Movie Data
- Kaggle Movie Data: This dataset exceeds the maximum file size allowed on GitHub, so download it directly from Kaggle.
  - From the zip file downloaded from Kaggle, only "movies_metadata.csv" and "ratings.csv" are used in this project.

Software:
- Python 3.7
- pgAdmin 4.50
- PostgreSQL v13

Results

The new ETL function performs correctly, and the data is successfully added to a PostgreSQL database, as seen in the images below (movies_rows.png and ratings_rows.png). The final code can be viewed here.

These outputs were created with the following queries:

SELECT COUNT(*) AS "Number of Movies row" FROM movies;
SELECT COUNT(*) AS "Number of Ratings row" FROM ratings;
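The same row-count check can also be run from Python after a load. The sketch below is an assumption-laden illustration rather than part of the project: `count_rows` is a hypothetical helper, and an in-memory SQLite database with made-up rows stands in for the PostgreSQL database. It uses only the standard DB-API cursor methods `execute` and `fetchone`, so a PostgreSQL driver connection should work similarly.

```python
import sqlite3

# Tables this hypothetical helper is allowed to count.
ALLOWED_TABLES = ("movies", "ratings")


def count_rows(conn, table):
    """Return the number of rows in one of the allowed tables."""
    # Table names cannot be bound as query parameters, so validate the
    # name against a fixed allow-list before formatting it into the SQL.
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unexpected table name: {table!r}")
    cur = conn.cursor()
    cur.execute(f"SELECT COUNT(*) FROM {table}")
    return cur.fetchone()[0]


if __name__ == "__main__":
    # Stand-in database with made-up rows, for demonstration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE movies (title TEXT)")
    conn.execute("CREATE TABLE ratings (movie_id INTEGER, rating REAL)")
    conn.executemany("INSERT INTO movies VALUES (?)", [("Movie A",), ("Movie B",)])
    conn.executemany("INSERT INTO ratings VALUES (?, ?)", [(1, 4.0), (1, 3.5), (2, 5.0)])
    for table in ALLOWED_TABLES:
        print(table, count_rows(conn, table))
```

Comparing these counts before and after a run is a quick way to confirm that the load step actually added rows.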

Author: Michael Mishkanian

For all questions and inquiries, please contact me on LinkedIn.

