| author | HastarTara <joslinrashleigh@gmail.com> | 2024-08-16 13:47:45 +0100 |
|---|---|---|
| committer | HastarTara <joslinrashleigh@gmail.com> | 2024-08-16 13:47:45 +0100 |
| commit | 857f389ccd7a64f79e5c9ebb5ec1cdd2ce639df9 (patch) | |
| tree | b49719b214a0ec90bb1ccc56878b4aa41e87f1ee /README.md | |
| parent | 24dd35f4bc6a0b8934f09b320f73bc88c6f68f1f (diff) | |
| parent | 6425cd0b5bd9afe3f0fea8fdc37cfb7fe624d0e5 (diff) | |
Merge branch 'development' of https://github.com/ajschofield/de-project-bentley into lambda-layers
Diffstat (limited to 'README.md')
| -rw-r--r-- | README.md | 44 |
1 files changed, 43 insertions, 1 deletions
@@ -1 +1,43 @@
-# de-project-bentley
\ No newline at end of file
+# ToteSys - Data Engineering Project
+
+# Summary
+The project aims to implement a data platform that can extract data from an
+operational database, archive it in a data lake, and make it easily accessible
+within a remodelled OLAP data warehouse.
+
+The solution showcases our skills in:
+
+- Python
+- PostgreSQL
+- Database modelling
+- Amazon Web Services (AWS)
+- Agile methodologies
+
+# Main Objective
+
+Our goal is to create a reliable ETL (Extract, Transform, Load) pipeline that
+can:
+
+1. Extract the data from the `totesys` operational database
+2. Store the data in AWS S3 buckets, that will form our data lake
+3. Transform the data into a suitable schema for the data warehouse
+4. Load the transformed data into the data warehouse hosted on AWS
+
+# Key Features
+
+We aim for the project to have certain features. Some are more prioritised than
+others.
+
+- [ ] Automated data ingestion from `totesys` db
+- [ ] Data storage for ingested and processed data in S3 buckets
+- [ ] Data transformation for data warehouse schema
+- [ ] Automated data loading into the data warehouse schema
+- [ ] Logging and monitoring with CloudWatch
+- [ ] Notifications for errors and successful runs (e.g. successful ingestion)
+- [ ] Visualisation of warehouse data
+
+# Test Coverage
+TBA
+
+# Contributors
+TBA
\ No newline at end of file
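
The four steps listed under "Main Objective" in the new README (extract, store, transform, load) could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: plain dictionaries stand in for the `totesys` database, the S3 data lake, and the warehouse, and the table names (`staff`, `department`, `dim_staff`), the key layout, and every function name are hypothetical.

```python
import json


def extract(source_tables):
    """Step 1: copy every row out of each source table (dicts stand in for totesys)."""
    return {name: list(rows) for name, rows in source_tables.items()}


def store(lake, stage, table, rows, run_id):
    """Step 2: write one JSON document per table and run, keyed like an S3 object path."""
    key = f"{stage}/{table}/{run_id}.json"  # hypothetical key layout
    lake[key] = json.dumps(rows)
    return key


def transform_staff(staff_rows, department_rows):
    """Step 3: join staff to department to get a denormalised, dimension-style table."""
    departments = {d["department_id"]: d for d in department_rows}
    return [
        {
            "staff_id": s["staff_id"],
            "first_name": s["first_name"],
            "department_name": departments[s["department_id"]]["department_name"],
        }
        for s in staff_rows
    ]


def load(warehouse, table, rows):
    """Step 4: append the transformed rows to the warehouse table; return the row count."""
    warehouse.setdefault(table, []).extend(rows)
    return len(rows)


def run_pipeline(source_tables, lake, warehouse, run_id):
    """Run the four steps in order, archiving both raw and processed data in the lake."""
    extracted = extract(source_tables)
    for table, rows in extracted.items():
        store(lake, "ingested", table, rows, run_id)
    dim_staff = transform_staff(extracted["staff"], extracted["department"])
    store(lake, "processed", "dim_staff", dim_staff, run_id)
    return load(warehouse, "dim_staff", dim_staff)
```

Keeping each step as a separate function mirrors the README's split between ingested and processed data, so each stage could later be swapped for a real database query or S3 call without changing the others.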
