Sensitive Data

Most software eventually has to process sensitive data, for example in government systems or when handling customer data. There are several ways to handle its storage, and this article collects some of them.

If encryption is implemented improperly, for example by keeping the keys in the same place as the encrypted data, it serves no purpose: whoever gets hold of the leaked data also gets the keys needed to decrypt it.

Basic Practices

These are standard practices that are mentioned almost everywhere nowadays.

  • Use encrypted connections such as HTTPS or FTPS (both run over TLS).
  • Encrypt data at rest. Managed storage services such as AWS RDS and S3 usually support it.
  • Only keep the necessary ports open, e.g. 443.
  • Keep the data behind a firewall and only allow the necessary IPs.

The actual hard part is applying an encryption scheme to the data itself, so that unintended access can be prevented.

Encryption By User Password

With this scheme, every sensitive piece of data that belongs to a user is encrypted with a key that only their password can unlock.

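  • Nobody but the user can access the data
  • Suitable for data that is viewed and modified by the user themselves
  • Requires an active session, since the data can only be decrypted while the user is logged in
  • Password change and reset need extra handling, because the key depends on the password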

Implementation

I think this works best if you have a centralized system for recovering the key when the password is lost, e.g. a separate system that stores a backup of the keys and is only queried when a password change happens.

user | encrypted_key         | recovery_id
-----+-----------------------+-------------
1    | encrypted_by_password | 4tgfb98
2    | encrypted_by_password | 943549s
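The table above could be set up roughly like this. It is only a sketch: SQLite stands in for whatever database you use, and the table name `user_keys`, the column types and the `conn` / `recovery_id` parameters of the helpers are my assumptions.

```python
import sqlite3

# Hypothetical table name and column types; only the three columns come from the table above.
SCHEMA = """
CREATE TABLE user_keys (
    user          INTEGER PRIMARY KEY,
    encrypted_key BLOB NOT NULL,  -- the user's key, encrypted with the password-derived key
    recovery_id   TEXT NOT NULL   -- opaque reference into the separate recovery system
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the key table."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def save_key(conn: sqlite3.Connection, user_id: int, encrypted_key: bytes, recovery_id: str) -> None:
    """Insert the user's encrypted key, replacing any previous row for that user."""
    conn.execute(
        "INSERT OR REPLACE INTO user_keys (user, encrypted_key, recovery_id) VALUES (?, ?, ?)",
        (user_id, encrypted_key, recovery_id),
    )
    conn.commit()


def retrieve_key(conn: sqlite3.Connection, user_id: int) -> bytes:
    """Return the stored encrypted key, or raise KeyError if the user has none."""
    row = conn.execute(
        "SELECT encrypted_key FROM user_keys WHERE user = ?", (user_id,)
    ).fetchone()
    if row is None:
        raise KeyError(user_id)
    return row[0]
```

Note that only the encrypted key lives here. The plain backup sits in the separate recovery system and is referenced through `recovery_id`.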

On user login, the key needs to be decrypted:

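from Crypto.Cipher import AES  # assuming PyCryptodome, whose API matches AES.new


def decrypt_key(user_id: int, password: str) -> bytes:
    # Load the user's encrypted key, then unlock it with the password-derived key.
    encrypted_key: bytes = retrieve_key(user_id)
    derived_key: bytes = derive_key_from_password(password)
    return AES.new(derived_key, ...).decrypt(encrypted_key)


def derive_key_from_password(password: str) -> bytes:
    """Derive a secure key from the password. Must be deterministic."""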

The decrypted key then needs to be kept in the user's session. It could be stored encrypted in the user's browser as a cookie and decrypted on the server on each request with a master key.

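import os


def handle_request(req):
    # Read the encrypted user key sent by the browser, then decrypt it with the server-side master key.
    encrypted_key: str = req.headers.get("")
    key = AES.new(os.environ["MASTER_KEY"], ...).decrypt(encrypted_key)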

If the user's password changes, obtain the key backup from the password recovery server and encrypt it with a key derived from the new password.

def change_password(user, token, password):
    # Fetch the backup of the user's key, then re-encrypt it under the new password.
    backup_key: bytes = get_backup_key(token)
    derived_key: bytes = derive_key_from_password(password)
    key = AES.new(derived_key, ...).encrypt(backup_key)
    save_key(user, key)

Further reading on deriving a key from a password:

  • https://en.wikipedia.org/wiki/PBKDF2
  • https://security.stackexchange.com/questions/38828/how-can-i-securely-convert-a-string-password-to-a-key-used-in-aes
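Following the PBKDF2 link, `derive_key_from_password` could look roughly like this, using only the Python standard library. The `salt` parameter, the `new_salt` helper and the iteration count are my additions. The salt has to be stored per user (for example in an extra column) so that the same password always yields the same key.

```python
import hashlib
import os

PBKDF2_ITERATIONS = 600_000  # assumed work factor; raise or lower it for your hardware
KEY_LENGTH = 32              # 32 bytes, suitable as an AES-256 key


def new_salt() -> bytes:
    """Generate a random per-user salt; store it next to the encrypted key."""
    return os.urandom(16)


def derive_key_from_password(password: str, salt: bytes) -> bytes:
    """Deterministically derive a key from the password and the stored salt."""
    return hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        salt,
        PBKDF2_ITERATIONS,
        dklen=KEY_LENGTH,
    )
```

The derivation is deliberately slow, which makes guessing passwords against a leaked `encrypted_key` column more expensive.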

What if the algorithm changes?

In that case, you have to re-encrypt everything: each value has to be decrypted with the old algorithm and encrypted again with the new one.
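One way to make such a migration manageable is to prefix every stored value with a version byte, so you can tell which values still use the old algorithm. This is a sketch of that idea and not something the scheme above requires. The version numbers and the `reencrypt` helper are hypothetical, and the decrypt and encrypt functions are passed in as plain callables.

```python
from typing import Callable, Dict

CURRENT_VERSION = 2  # hypothetical identifier of the algorithm currently in use


def reencrypt(
    stored: bytes,
    decryptors: Dict[int, Callable[[bytes], bytes]],
    encrypt: Callable[[bytes], bytes],
) -> bytes:
    """Return `stored` re-encrypted under the current algorithm.

    `stored` is one version byte followed by the ciphertext. `decryptors` maps an
    old version number to the function that decrypts values of that version.
    """
    version, blob = stored[0], stored[1:]
    if version == CURRENT_VERSION:
        return stored  # already up to date, nothing to do
    plaintext = decryptors[version](blob)
    return bytes([CURRENT_VERSION]) + encrypt(plaintext)
```

Running this over every row is idempotent, so an interrupted migration can simply be started again.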

Column Encryption

It is also possible to encrypt data per column, so that only the sensitive columns of a table are encrypted.