From 7823850692c12bb8a7155c5c26e66bd8129c9b4a Mon Sep 17 00:00:00 2001
From: Alex Schofield
Date: Mon, 17 Feb 2025 14:21:21 +0000
Subject: update MVP section to include the minimum requirements of the project

---
 README.md | 6 ++++++
 1 file changed, 6 insertions(+)

(limited to 'README.md')

diff --git a/README.md b/README.md
index b82ccbb..222808e 100644
--- a/README.md
+++ b/README.md
@@ -11,7 +11,13 @@ A Python library designed to detect and remove Personally Identifiable Informati
 ## Minimum Viable Product (MVP)
+The MVP covers:
+1. Reading a JSON string containing the S3 location of the CSV file and the names of the fields that are required to be obfuscated
+2. Ingesting the CSV file containing data records (with a primary key) from an AWS S3 bucket
+3. Obfuscating chosen PII fields (e.g. `name`, `email_address`) by replacing their values with an obfuscated string (`***`)
+4. Producing an output CSV file (or a byte-stream) that maintains the original structure but with sensitive fields changed
+This meets the requirements under the General Data Protection Regulation [(GDPR)](https://ico.org.uk/media/for-organisations/guide-to-data-protection/guide-to-the-general-data-protection-regulation-gdpr-1-1.pdf) to ensure that all data containing information that can be used to identify an individual should be anonymised.
 ## Setup
-- 
cgit v1.2.3
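
The four MVP steps added in the patch above could be sketched roughly as follows. This is a minimal illustration and not the library's actual code: the function name `obfuscate_csv`, the JSON keys `file_to_obfuscate` and `pii_fields`, and the sample bucket and data are all assumptions made for the example. The S3 download in step 2 is replaced by an in-memory string so that the sketch runs without AWS credentials.

```python
import csv
import io
import json


def obfuscate_csv(csv_text, pii_fields, mask="***"):
    """Return csv_text as UTF-8 bytes with every value in pii_fields replaced by mask.

    Hypothetical helper: the name and signature are illustrative only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []

    # Fail loudly if a requested field is not in the header row.
    missing = [f for f in pii_fields if f not in fieldnames]
    if missing:
        raise ValueError(f"fields not found in CSV header: {missing}")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        for field in pii_fields:
            row[field] = mask  # step 3: replace the sensitive value
        writer.writerow(row)

    # Step 4: same columns and row order, returned as a byte-stream.
    return out.getvalue().encode("utf-8")


# Step 1: parse the JSON request (key names are assumed, not taken from the README).
request = json.loads(
    '{"file_to_obfuscate": "s3://example-bucket/students.csv",'
    ' "pii_fields": ["name", "email_address"]}'
)

# Step 2 stand-in: in practice this text would be the body of the S3 object.
sample_csv = (
    "student_id,name,email_address,course\n"
    "1,Ada Smith,ada@example.com,Data\n"
)

result = obfuscate_csv(sample_csv, request["pii_fields"])
print(result.decode("utf-8"))
```

The primary key column (`student_id` in this made-up sample) is left untouched, so obfuscated rows can still be matched back to their source records.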