diff --git a/NOTES.md b/NOTES.md
new file mode 100644
index 0000000..9179c45
--- /dev/null
+++ b/NOTES.md
@@ -0,0 +1,23 @@
+# Flow
+
+`from gdpr_obfuscator import Obfuscator`
+
+- User imports the `Obfuscator` class from the `gdpr_obfuscator` module
+ - This is the main entry point for the obfuscation tool
+- During instantiation, the `__init__()` constructor creates an instance of
+`gdpr_obfuscator.read.DataReader`, which is responsible for fetching data from
+local files or an S3 bucket
+ - The `read_local()` method, called via `process_local()`, opens the CSV file
+ at the path the user specifies, uses `csv.DictReader()` to parse the file,
+ and then returns a list of dictionaries
+ - The `read_s3()` method, called via `process_s3()`, uses the `boto3` library
+ to fetch the CSV file from the S3 bucket, and then returns a list of
+ dictionaries
+- The output of either `read_local()` or `read_s3()` is then passed to the
+`obfuscate_data()` function in the `gdpr_obfuscator.obfuscate` module, which
+receives the data and the names of the Personally Identifiable Information
+(PII) fields to obfuscate
+- Following this, the obfuscated data is passed into `create_byte_stream()` to
+create the byte stream that will be written to a new CSV file or S3 bucket
+object
+
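The flow described in NOTES.md can be sketched end to end with the standard library alone. This is a minimal illustration, not the repository's implementation: `read_rows()` is a hypothetical stand-in for `read_local()` that parses an in-memory string instead of opening a file, and the signatures of `obfuscate_data()` and `create_byte_stream()`, the `"***"` mask, and the sample data are all assumptions.

```python
import csv
import io


def read_rows(csv_text):
    """Hypothetical stand-in for read_local(): parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def obfuscate_data(rows, pii_fields, mask="***"):
    """Return new rows with every listed PII field replaced by the mask (assumed behaviour)."""
    return [
        {key: (mask if key in pii_fields else value) for key, value in row.items()}
        for row in rows
    ]


def create_byte_stream(rows):
    """Serialise the rows back to CSV and return the result as UTF-8 bytes."""
    if not rows:
        return b""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue().encode("utf-8")


# Illustrative data only; the column names are not taken from the project.
sample = "id,name,email\n1,Ada,ada@example.com\n2,Alan,alan@example.com\n"
rows = read_rows(sample)
masked = obfuscate_data(rows, pii_fields=["name", "email"])
payload = create_byte_stream(masked)
```

Building new dictionaries in `obfuscate_data()` leaves the rows returned by the reader untouched, which keeps the read and obfuscate steps easy to test separately. The resulting `payload` is what would then be written to a local file or uploaded as an S3 object.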