# Flow

`from gdpr_obfuscator import Obfuscator`

- User imports the `Obfuscator` class from the `gdpr_obfuscator` module
    - This is the main entry point for the obfuscation tool
- During instantiation, the `__init__()` constructor method creates an instance
of `gdpr_obfuscator.read.DataReader`, which is responsible for fetching data from
local files or an S3 bucket
    - The `read_local()` method called via `process_local()` opens the CSV file
    at the path the user specifies, uses `csv.DictReader()` to parse the file,
    and then returns a list of dictionaries
    - The `read_s3()` method called via `process_s3()` uses the `boto3` library to
    fetch the CSV file from the S3 bucket, and then returns a list of
    dictionaries
- The output of either `read_local()` or `read_s3()` is then passed to the
`obfuscate_data()` function in the `gdpr_obfuscator.obfuscate` module, which
receives the data and the Personally Identifiable Information (PII) fields to
obfuscate
- Following this, the obfuscated data is passed into `create_byte_stream()` to
create the byte stream that will be written to a new CSV file or S3 bucket
object
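
The flow above can be sketched end to end with the standard library alone. This is a minimal illustration, not the package's actual implementation: `read_csv_text()` is a hypothetical stand-in for the `DataReader` methods (it parses an in-memory string instead of opening a file or calling `boto3`), the signatures of `obfuscate_data()` and `create_byte_stream()` are assumed, and replacing PII values with `***` is an assumed masking rule that the notes do not specify.

```python
import csv
import io


def read_csv_text(text):
    """Hypothetical stand-in for the reader: parse CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def obfuscate_data(rows, pii_fields, mask="***"):
    """Return copies of the rows with each listed PII field replaced by the mask.

    The mask value is an assumption made for this sketch.
    """
    return [
        {key: (mask if key in pii_fields else value) for key, value in row.items()}
        for row in rows
    ]


def create_byte_stream(rows):
    """Serialise the rows back to CSV and wrap the encoded text in a BytesIO."""
    if not rows:
        return io.BytesIO()
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return io.BytesIO(buffer.getvalue().encode("utf-8"))


# Sample data invented for the example.
sample = "id,name,email\n1,Ada,ada@example.com\n2,Alan,alan@example.com\n"

rows = read_csv_text(sample)                      # read step
masked = obfuscate_data(rows, ["name", "email"])  # obfuscation step
stream = create_byte_stream(masked)               # byte stream step

print(stream.getvalue().decode("utf-8"))
```

Returning new dictionaries rather than editing the rows in place leaves the parsed input untouched, which makes it easy to compare the data before and after obfuscation.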