From d25f05ba140cb85847ca604bef0e68b76a17ba62 Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Fri, 16 Aug 2024 10:34:50 +0100
Subject: docs: add draft summary section

---
 README.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 8ae0cb3..203482e 100644
--- a/README.md
+++ b/README.md
@@ -1 +1,14 @@
-# de-project-bentley
\ No newline at end of file
+# ToteSys - Data Engineering Project
+
+# Summary
+The project aims to implement a data platform that can extract data from an
+operational database, archive it in a data lake, and make it easily accessible
+within a remodelled OLAP data warehouse.
+
+The solution showcases our skills in:
+
+- Python
+- PostgreSQL
+- Database modelling
+- Amazon Web Services (AWS)
+- Agile methodologies
\ No newline at end of file
-- 
cgit v1.2.3

From 9809e7ca1351d7b27f62b3c7c74db7124cab5dc9 Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Fri, 16 Aug 2024 10:40:00 +0100
Subject: docs: add draft main objective section

---
 README.md | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 203482e..e55cb16 100644
--- a/README.md
+++ b/README.md
@@ -11,4 +11,14 @@ The solution showcases our skills in:
 - PostgreSQL
 - Database modelling
 - Amazon Web Services (AWS)
-- Agile methodologies
\ No newline at end of file
+- Agile methodologies
+
+# Main Objective
+
+Our goal is to create a reliable ETL (Extract, Transform, Load) pipeline that
+can:
+
+1. Extract the data from the `totesys` operational database
+2. Store the data in AWS S3 buckets, that will form our data lake
+3. Transform the data into a suitable schema for the data warehouse
+4. Load the data into the data warehouse hosted on AWS
\ No newline at end of file
-- 
cgit v1.2.3

From 37eb3bb7974904614867c7d0c2d4f6eccb39f22e Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Fri, 16 Aug 2024 10:41:01 +0100
Subject: docs(main_obj): clarify data being loaded into data warehouse

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index e55cb16..9c7baee 100644
--- a/README.md
+++ b/README.md
@@ -21,4 +21,4 @@ can:
 1. Extract the data from the `totesys` operational database
 2. Store the data in AWS S3 buckets, that will form our data lake
 3. Transform the data into a suitable schema for the data warehouse
-4. Load the data into the data warehouse hosted on AWS
\ No newline at end of file
+4. Load the transformed data into the data warehouse hosted on AWS
\ No newline at end of file
-- 
cgit v1.2.3

From 67a3caf058416718e9413520cb74be049af1e93e Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Fri, 16 Aug 2024 11:09:59 +0100
Subject: docs: add draft key features section

---
 README.md | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 9c7baee..0bf6b9d 100644
--- a/README.md
+++ b/README.md
@@ -21,4 +21,17 @@ can:
 1. Extract the data from the `totesys` operational database
 2. Store the data in AWS S3 buckets, that will form our data lake
 3. Transform the data into a suitable schema for the data warehouse
-4. Load the transformed data into the data warehouse hosted on AWS
\ No newline at end of file
+4. Load the transformed data into the data warehouse hosted on AWS
+
+# Key Features
+
+We aim for the project to have certain features. Some are more prioritised than
+others.
+
+- [ ] Automated data ingestion from `totesys` db
+- [ ] Data storage for ingested and processed data in S3 buckets
+- [ ] Data transformation for data warehouse schema
+- [ ] Automated data loading into the data warehouse schema
+- [ ] Logging and monitoring with CloudWatch
+- [ ] Notifications for errors and successful runs (e.g. successful ingestion)
+- [ ] Visualisation of warehouse data
\ No newline at end of file
-- 
cgit v1.2.3

From a217da60ba75a226bf72a9fc680c4cbabe883aea Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Fri, 16 Aug 2024 12:53:22 +0100
Subject: docs: add empty sections

---
 README.md | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 0bf6b9d..6bc75dc 100644
--- a/README.md
+++ b/README.md
@@ -34,4 +34,10 @@ others.
 - [ ] Automated data loading into the data warehouse schema
 - [ ] Logging and monitoring with CloudWatch
 - [ ] Notifications for errors and successful runs (e.g. successful ingestion)
-- [ ] Visualisation of warehouse data
\ No newline at end of file
+- [ ] Visualisation of warehouse data
+
+# Test Coverage
+TBA
+
+# Contributors
+TBA
\ No newline at end of file
-- 
cgit v1.2.3

From 9dabc89c897f7dc9034e44c277d68e01c7e12ad7 Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Fri, 16 Aug 2024 21:06:51 +0100
Subject: docs: add badges to README

---
 README.md | 8 ++++++++
 1 file changed, 8 insertions(+)

(limited to 'README.md')

diff --git a/README.md b/README.md
index 6bc75dc..cbb446c 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,13 @@
 # ToteSys - Data Engineering Project
 
+[![Python](https://img.shields.io/badge/Python-FFD43B?style=for-the-badge&logo=python&logoColor=blue)](https://www.python.org/)
+[![AWS](https://img.shields.io/badge/Amazon_AWS-FF9900?style=for-the-badge&logo=amazonaws&logoColor=white)](https://aws.amazon.com/)
+[![Terraform](https://img.shields.io/badge/Terraform-7B42BC?style=for-the-badge&logo=terraform&logoColor=white)](https://www.terraform.io/)
+[![Postgresql](https://img.shields.io/badge/PostgreSQL-316192?style=for-the-badge&logo=postgresql&logoColor=white)](https://www.postgresql.org/)
+[![GitHub Actions](https://img.shields.io/badge/GitHub_Actions-2088FF?style=for-the-badge&logo=github-actions&logoColor=white)](https://github.com/features/actions)
+
+[![Terraform Main Deployment Workflow Status](https://img.shields.io/github/actions/workflow/status/ajschofield/de-project-bentley/deploy.yml?branch=main&style=flat-square&label=deploy)](https://github.com/ajschofield/de-project-bentley/actions/workflows/deploy.yml?query=branch%3Amain)
+[![Production Environment Status](https://img.shields.io/github/deployments/ajschofield/de-project-bentley/production?style=flat-square&label=env)](https://github.com/ajschofield/de-project-bentley/deployments/production)
 # Summary
 The project aims to implement a data platform that can extract data from an
 operational database, archive it in a data lake, and make it easily accessible
-- 
cgit v1.2.3