Lowercase URL Encoding with Urllib in Python3 (Python3)
Python's urllib
module is pretty flexible, and contains a bunch of useful tools - including the ability to urlencode strings via urllib.parse.quote()
Unfortunately, it doesn't allow you to specify what case should be used in URL encoding, and defaults to upper-case. Many other libraries/languages use lower case.
This becomes problematic where you're url encoding an authentication token (or similar) - when the other side mints a token for comparison, the two will not match in a like for like comparison (and if the other end urldecodes your input, they're doing more processing on untrusted input that should be necessary - similarly passing through lower
will affect the entire token and massively increase the chance of collisions).
This little wrapper function allows you to do a lowercase url-encode
Details
- Language: Python3
Snippet
import re
import urllib.parse
def quote_url_lower(url,safe='/'):
s=urllib.parse.quote(url,safe)
pattern = "%[A-Z,0-9][A-Z,0-9]"
# Convert the result to a set so each entry is unique
result = set(re.findall(pattern,s))
for r in result:
s = s.replace(r,r.lower())
return s
Usage Example
print(quote_url_lower("abcdef/ghi/123+="))
abcdef/ghi/123%2b%3d
print(quote_url_lower("abcdef/ghi/123+=",safe=''))
abcdef%2fghi%2f123%2b%3d