Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
190 views
in Technique[技术] by (71.8m points)

python - Simple way to encode a string according to a password?

Does Python have a built-in, simple way of encoding/decoding strings using a password?

Something like this:

>>> encode('John Doe', password = 'mypass')
'sjkl28cn2sx0'
>>> decode('sjkl28cn2sx0', password = 'mypass')
'John Doe'

So the string "John Doe" gets encrypted as 'sjkl28cn2sx0'. To get the original string, I would "unlock" that string with the key 'mypass', which is a password in my source code. I'd like this to be the way I can encrypt/decrypt a Word document with a password.

I would like to use these encrypted strings as URL parameters. My goal is obfuscation, not strong security; nothing mission critical is being encoded. I realize I could use a database table to store keys and values, but am trying to be minimalist.

Question&Answers:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Reply

0 votes
by (71.8m points)

Python has no built-in encryption schemes, no. You also should take encrypted data storage serious; trivial encryption schemes that one developer understands to be insecure and a toy scheme may well be mistaken for a secure scheme by a less experienced developer. If you encrypt, encrypt properly.

You don’t need to do much work to implement a proper encryption scheme however. First of all, don’t re-invent the cryptography wheel, use a trusted cryptography library to handle this for you. For Python 3, that trusted library is cryptography.

I also recommend that encryption and decryption applies to bytes; encode text messages to bytes first; stringvalue.encode() encodes to UTF8, easily reverted again using bytesvalue.decode().

Last but not least, when encrypting and decrypting, we talk about keys, not passwords. A key should not be human memorable, it is something you store in a secret location but machine readable, whereas a password often can be human-readable and memorised. You can derive a key from a password, with a little care.

But for a web application or process running in a cluster without human attention to keep running it, you want to use a key. Passwords are for when only an end-user needs access to the specific information. Even then, you usually secure the application with a password, then exchange encrypted information using a key, perhaps one attached to the user account.

Symmetric key encryption

Fernet – AES CBC + HMAC, strongly recommended

The cryptography library includes the Fernet recipe, a best-practices recipe for using cryptography. Fernet is an open standard, with ready implementations in a wide range of programming languages and it packages AES CBC encryption for you with version information, a timestamp and an HMAC signature to prevent message tampering.

Fernet makes it very easy to encrypt and decrypt messages and keep you secure. It is the ideal method for encrypting data with a secret.

I recommend you use Fernet.generate_key() to generate a secure key. You can use a password too (next section), but a full 32-byte secret key (16 bytes to encrypt with, plus another 16 for the signature) is going to be more secure than most passwords you could think of.

The key that Fernet generates is a bytes object with URL and file safe base64 characters, so printable:

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # store in a secure location
print("Key:", key.decode())

To encrypt or decrypt messages, create a Fernet() instance with the given key, and call the Fernet.encrypt() or Fernet.decrypt(), both the plaintext message to encrypt and the encrypted token are bytes objects.

encrypt() and decrypt() functions would look like:

from cryptography.fernet import Fernet

def encrypt(message: bytes, key: bytes) -> bytes:
    return Fernet(key).encrypt(message)

def decrypt(token: bytes, key: bytes) -> bytes:
    return Fernet(key).decrypt(token)

Demo:

>>> key = Fernet.generate_key()
>>> print(key.decode())
GZWKEhHGNopxRdOHS4H4IyKhLQ8lwnyU7vRLrM3sebY=
>>> message = 'John Doe'
>>> encrypt(message.encode(), key)
'gAAAAABciT3pFbbSihD_HZBZ8kqfAj94UhknamBuirZWKivWOukgKQ03qE2mcuvpuwCSuZ-X_Xkud0uWQLZ5e-aOwLC0Ccnepg=='
>>> token = _
>>> decrypt(token, key).decode()
'John Doe'

Fernet with password – key derived from password, weakens the security somewhat

You can use a password instead of a secret key, provided you use a strong key derivation method. You do then have to include the salt and the HMAC iteration count in the message, so the encrypted value is not Fernet-compatible anymore without first separating salt, count and Fernet token:

import secrets
from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

from cryptography.fernet import Fernet
from cryptography.hazmat.backends import default_backend
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

backend = default_backend()
iterations = 100_000

def _derive_key(password: bytes, salt: bytes, iterations: int = iterations) -> bytes:
    """Derive a secret key from a given password and salt"""
    kdf = PBKDF2HMAC(
        algorithm=hashes.SHA256(), length=32, salt=salt,
        iterations=iterations, backend=backend)
    return b64e(kdf.derive(password))

def password_encrypt(message: bytes, password: str, iterations: int = iterations) -> bytes:
    salt = secrets.token_bytes(16)
    key = _derive_key(password.encode(), salt, iterations)
    return b64e(
        b'%b%b%b' % (
            salt,
            iterations.to_bytes(4, 'big'),
            b64d(Fernet(key).encrypt(message)),
        )
    )

def password_decrypt(token: bytes, password: str) -> bytes:
    decoded = b64d(token)
    salt, iter, token = decoded[:16], decoded[16:20], b64e(decoded[20:])
    iterations = int.from_bytes(iter, 'big')
    key = _derive_key(password.encode(), salt, iterations)
    return Fernet(key).decrypt(token)

Demo:

>>> message = 'John Doe'
>>> password = 'mypass'
>>> password_encrypt(message.encode(), password)
b'9Ljs-w8IRM3XT1NDBbSBuQABhqCAAAAAAFyJdhiCPXms2vQHO7o81xZJn5r8_PAtro8Qpw48kdKrq4vt-551BCUbcErb_GyYRz8SVsu8hxTXvvKOn9QdewRGDfwx'
>>> token = _
>>> password_decrypt(token, password).decode()
'John Doe'

Including the salt in the output makes it possible to use a random salt value, which in turn ensures the encrypted output is guaranteed to be fully random regardless of password reuse or message repetition. Including the iteration count ensures that you can adjust for CPU performance increases over time without losing the ability to decrypt older messages.

A password alone can be as safe as a Fernet 32-byte random key, provided you generate a properly random password from a similar size pool. 32 bytes gives you 256 ^ 32 number of keys, so if you use an alphabet of 74 characters (26 upper, 26 lower, 10 digits and 12 possible symbols), then your password should be at least math.ceil(math.log(256 ** 32, 74)) == 42 characters long. However, a well-selected larger number of HMAC iterations can mitigate the lack of entropy somewhat as this makes it much more expensive for an attacker to brute force their way in.

Just know that choosing a shorter but still reasonably secure password won’t cripple this scheme, it just reduces the number of possible values a brute-force attacker would have to search through; make sure to pick a strong enough password for your security requirements.

Alternatives

Obscuring

An alternative is not to encrypt. Don't be tempted to just use a low-security cipher, or a home-spun implementation of, say Vignere. There is no security in these approaches, but may give an inexperienced developer that is given the task to maintain your code in future the illusion of security, which is worse than no security at all.

If all you need is obscurity, just base64 the data; for URL-safe requirements, the base64.urlsafe_b64encode() function is fine. Don't use a password here, just encode and you are done. At most, add some compression (like zlib):

import zlib
from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

def obscure(data: bytes) -> bytes:
    return b64e(zlib.compress(data, 9))

def unobscure(obscured: bytes) -> bytes:
    return zlib.decompress(b64d(obscured))

This turns b'Hello world!' into b'eNrzSM3JyVcozy_KSVEEAB0JBF4='.

Integrity only

If all you need is a way to make sure that the data can be trusted to be unaltered after having been sent to an untrusted client and received back, then you want to sign the data, you can use the hmac library for this with SHA1 (still considered secure for HMAC signing) or better:

import hmac
import hashlib

def sign(data: bytes, key: bytes, algorithm=hashlib.sha256) -> bytes:
    assert len(key) >= algorithm().digest_size, (
        "Key must be at least as long as the digest size of the "
        "hashing algorithm"
    )
    return hmac.new(key, data, algorithm).digest()

def verify(signature: bytes, data: bytes, key: bytes, algorithm=hashlib.sha256) -> bytes:
    expected = sign(data, key, algorithm)
    return hmac.compare_digest(expected, signature)

Use this to sign data, then attach the signature with the data and send that to the client. When you receive the data back, split data and signature and verify. I've set the default algorithm to SHA256, so you'll need a 32-byte key:

key = secrets.token_bytes(32)

You may want to look at the itsdangerous library, which packages this all up with serialisation and de-serialisation in various formats.

Using AES-GCM encryption to provide encryption and integrity

Fernet builds on AEC-CBC with a HMAC signature to ensure integrity of the encrypted data; a malicious attacker can't feed your system nonsense data to keep your service busy running in circles with bad input, because the ciphertext is signed.

The Galois / Counter mode block cipher produces ciphertext and a tag to serve the same purpose, so can be used to serve the same purposes. The downside is that unlike Fernet there is no easy-to-use one-size-fits-all recipe to reuse on other platforms. AES-GCM also doesn't use padding, so this encryption ciphertext matches the length of the input message (whereas Fernet / AES-CBC encrypts messages to blocks of fixed length, obscuring the message length somewhat).

AES256-GCM takes the usual 32 byte secret as a key:

key = secrets.token_bytes(32)

then use

import binascii, time
from base64 import urlsafe_b64encode as b64e, urlsafe_b64decode as b64d

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, mode

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
OGeek|极客中国-欢迎来到极客的世界,一个免费开放的程序员编程交流平台!开放,进步,分享!让技术改变生活,让极客改变未来! Welcome to OGeek Q&A Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...