RSA challenges for fun and CTFs
RSA is one of the oldest asymmetric public-private key cryptosystem. It has a lot of fun special cases and considerations that enables fun edge case considerations that makes it perfect for Capture The Flag challenges and puzzles.
TL;DR:
- Lots of “intentionally vulnerable” RSA challenges can be cracked super easy with Python or similar.
- SageMath
(GitHub)
(Docker)
is great when you need to do more advanced math than “vanilla Python” offers
out of the box.
For example Coppersmith, LLL,
small_rootsis something Sage excels at. - AI / Large Language Models like Gemini, ChatGPT etc. are very familiar with RSA vulnerabilities. A simple prompt usually gives you clues and codes for exploitation.
- There are repositories of ready to use RSA attacks, for example: RsaCtfTool, Giapppp/RSA-Attacks: Some RSA Attacks I implemented and collected from Internet, xanhacks CTF-docs: RSA (GitHub), cado-nfs, yafu etc.
caveat: A bunch of RSA attack, factoring software are a bit wonky… Your millage may vary… I’ve experienced a bunch of fun pitfalls;
- official Docker-images based on very outdated Ubuntu versions
- build instructions failing on modern operating systems, very hard to build
- official Docker-images not working, for example: failing with error messages about python library functions missing
So eh, if you expect that you will have to deal with RSA in a CTF, it is recommend to install and test tools ahead of time!
Table of Contents
- Textbook RSA
- Factoring RSA keys
- Bad RSA keys
- Stereotyped Messages
- Broadcast Attack
- Weak padding
- PKCS1 padding modes
- Some final tidbits
Textbook RSA
Textbook RSA is the classic c = m^e mod n, n = c^d mod n, where d is hard
to calculate without knowing p * q = n needed to calculate
d = mod_inverse( (p-1) * (q-1) ).
i.e. the ciphertext is the
secret message m raised to the
public exponent e
modulus n.
In an ideal world, attacker should need to factor n to yield n = p * q
i.e. security should rely on solving the “RSA trapdoor” factoring problem.
Textbook RSA has a couple of weaknesses that enables quicker attacks:
Confidentiality only.
Recipient has no indication of if the message is received correctly,
unmodified.
Recipient does not even know if the provided ciphertext even is the output of
the RSA calculus c = m^e mod n or some arbitrary garbage attempting to
exploit / cryptanalyze the system.
One-to-one encryption. Attacker who has intercepted one ciphertext can bruteforce messages until they find a message that generates the ciphertext. I.e. the encryption function is a (slow) decryption Oracle.
Implementation issues. Bad keys and Bad messages (lack of secure padding) weakens the security of the system significantly.
Key reuse issues. Key reuse, using one RSA keys for multiple roles, escalates impact of RSA vulnerabilities. Notably an decryption function working as a Padding Oracle may enable Bleichenbacher Signature Forgery. i.e. you can trick a decryption function to become a signing function.
Factoring RSA keys
Generally, factoring RSA public modulo n is very time/resource expensive
and for CTF’s and puzzles this only applies to
very short keys or possibly
intentionally weak keys.
General factorization algorithms such as NFS, ECM etc. are implemented by tools such as:
Factoring bypass / factoring simplifications:
For some poorly chosen values p * q = n there exists much better/simpler
factoring attacks than a general factoring.
If you know or have reason to suspect a key is intentionally vulnerable
to a specific factoring attack,
the specialized attack may be faster than generic factoring algorithms.
Bad RSA keys
Several of RSA vulnerabilities easily exploitable in CTF / puzzle setups stem from bad keys. Here is some examples of weak keys!
Bad RSA keys: Short Public Exponent
Public exponent e is usually the prime 65537 (0x1001).
When the public exponent e is a small prime, such as 3, 7, 11 a lot of the
“randomness”/”complexity” from the modulo never kicks in.
c = m^e mod n behaves more like c = m^e, enabling simpler cryptanalysis.
This is of interest in a number of RSA attacks, for example:
- Stereotyped Message recovery
- Broadcast Message recovery
Bad RSA keys: Shared primes
One of the ROCA CVE-2017-15361 issues was shared primes.
Two keys sharing primes can simply be cracked instantly using Greatest common divisor (GCD) attack:
def attack(pub_a_path, pub_b_path, out_a_path, out_b_path):
print("--- Loading Public Keys ---")
alice_pub = load_public_key(pub_a_path)
bob_pub = load_public_key(pub_b_path)
N_A = alice_pub.n
N_B = bob_pub.n
e_A = alice_pub.e
e_B = bob_pub.e
print(f"Alice Modulo (N): {N_A}")
print(f"Bob Modulo (N): {N_B}\n")
# The Attack: Greatest Common Divisor
# If N_A = p1 * q and N_B = p2 * q, then GCD(N_A, N_B) = q
print("--- Performing GCD Attack ---")
shared_q = math.gcd(N_A, N_B)
if shared_q == 1:
print("Failed: No shared prime factors found. These keys do not share a prime.")
return
print(f"CRITICAL: Found Shared Prime (Q): {shared_q}\n"
BatchGCD enables more efficient scaling of GCD to attack a vast number of keys at the same time. It is significantly more efficient than looping GCD over all unique combinations of keys.
Bad RSA Keys: Predictable primes
One of the ROCA CVE-2017-15361 issues was predictable primes. “The Return of Coppersmith’s Attack: Practical Factorization of Widely Used RSA Moduli” employees the Lenstra–Lenstra–Lovász (LLL) lattice basis reduction algorithm algorithm to solve the unknown bits of an RSA key. If you know or can guess 50% or more of prime P, you can derive P, Q using LLL/Coppersmith.
You do not need to know the details as SageMath does all the LLL internally,
as you just setup the polynomial function f = x + P_known
and ask Sage to calculate:
roots = f.small_roots(X=2^bits_to_find, beta=0.5):
# --- SageMath Script to Crack RSA via Coppersmith ---
# Target: Recover P from N and partial MSBs
# Modulus N: {n_bits} bits
# Known bits of P: {known_bits_count} bits
N = {n}
P_known = {p_known}
bits_to_find = {bits_to_hide}
# Define the polynomial ring over Zmod(N)
PR.<x> = PolynomialRing(Zmod(N))
# We solve for x in f(x) = P_known + x = 0 (mod P)
f = x + P_known
print(f"\\n[!] Attempting to find root for {n_bits}-bit N...")
print(f"[!] Unknown bits: ")
# beta=0.5 because we are looking for a factor P ~ N^0.5
# X is the bound for our unknown 'x' (2^bits_to_find)
try:
roots = f.small_roots(X=2^bits_to_find, beta=0.5)
if roots:
recovered_p = P_known + int(roots[0])
recovered_q = N // recovered_p
hex_p = hex(recovered_p)
hex_q = hex(recovered_q)
print(f"\\n[+] Success! RSA Key Cracked.")
print("-" * 20)
print(f"P: {hex_p}")
print(f"Q: {hex_q}")
print("-" * 20)
assert N % recovered_p == 0
else:
print("\\n[-] No roots found. The lattice dimension may be too small or you have < 50% of bits.")
except Exception as e:
print(f"\\n[!] Sage Error: ")
References, LLL and Copperfield:
- YouTube/ Tanja Lange: Cryptology: RSA XI - LLL, Coppersmith/Howgrave-Graham, and stereotyped messages
video - YouTube/ David Wong: Attacking RSA with lattice reduction techniques (LLL)
video - YouTube/ media.ccc.de: 34C3 - LatticeHacks
video
References, Poor Randomness:
Bad RSA keys: related primes
RSA primes are supposed to be generated in some manner where they are unrelated.
CVE-2022-26320
is about software/hardware generating prime P,
and then picking prime Q close to P.
This enables cracking RSA keys super fast with an ancient algorithm… several hundred years old, invented by Pierre de Fermat:
def fermat_factorization(n):
"""
Finds p and q such that n = p * q using Fermat's Method.
Optimized for cases where p and q are close.
"""
# Start searching at the ceiling of the square root of n
a = math.isqrt(n)
if a * a < n:
a += 1
while True:
b2 = a*a - n
b = math.isqrt(b2)
# Check if b2 is a perfect square
if b*b == b2:
p = a - b
q = a + b
return p, q
a += 1
# In a real attack, you'd set a limit, but for this exercise
# with small increments, it will find it almost instantly.
References: Fermat Attack on RSA:
Stereotyped Messages
A stereotyped message is an message m for which many of the bits are known
or predictable.
A classic example is:
m = "The Secret Is: " || 20 unknown characters
LLL/Coppersmith attack can remove/ignore all of the known bits, and reduce the problem to a lattice analysis of the unknown bits.
Effectiveness is very much tied to the size of public exponent e,
as the LLL analysis can only recover less than approximately N/e bits.
For e=3 this is extremely powerful but for e=65537 this is moot.
Exponent e |
RSA modulo size N |
Recoverable bits N/e |
|---|---|---|
| 3 | 1024 | 341 |
| 3 | 2048 | 682 |
| 3 | 4096 | 1365 |
| 65537 | 4096 | 0 |
Example code:
def get_params(public_key_path, ciphertext_path):
# Load Public Key
with open(public_key_path, "rb") as f:
pub = serialization.load_pem_public_key(f.read())
n = pub.public_numbers().n
e = pub.public_numbers().e
# Read Ciphertext from file as binary
with open(ciphertext_path, "rb") as f:
c_bytes = f.read()
c = int.from_bytes(c_bytes, 'big')
return n, e, c
def generate_sage_script(n, e, c, template, unknown_len):
# 1. Convert template to integer and shift it
template_bytes = template.encode('utf-8')
m_known = int.from_bytes(template_bytes, 'big') << (8 * unknown_len)
# 2. Define the bound for the unknown x
x_max = 2**(8 * unknown_len)
# 3. Create the SageMath script string
sage_template = ...; # See SageMath script below
return sage_template
def main():
parser = argparse.ArgumentParser(description="Generate SageMath script for Raw RSA Coppersmith Attack")
# omitted for brevity
args = parser.parse_args()
try:
n, e, c = get_params(args.public_key, args.ciphertext)
sage_script = generate_sage_script(n, e, c, args.template, args.unknown_len)
print(sage_script)
except Exception as err:
print(f"Error: {err}")
SageMath script:
# --- Generated SageMath Script for Raw RSA Coppersmith Attack ---
n = ...
e = ...
c = ...
m_known = ...
x_max = ...
# Set up the Polynomial Ring
R.<x> = PolynomialRing(Zmod(n))
# Define the polynomial: f(x) = (m_known + x)^e - c
f = (m_known + x)^e - c
print("Searching for roots via LLL...")
# beta=1.0 is for modular roots (mod n)
roots = f.small_roots(X=x_max, beta=1.0)
if roots:
print("\\n[+] Root found!")
result_x = int(roots[0])
# Convert integer back to bytes
try:
# Calculate byte length
byte_len = (result_x.bit_length() + 7) // 8
decoded = result_x.to_bytes(byte_len, 'big').decode('utf-8', errors='ignore')
print(f"Decrypted Unknown Part: ")
print(f"Raw Integer: ")
except Exception as err:
print(f"Could not decode result to UTF-8: ")
print(f"Raw Integer: ")
else:
print("\\n[-] No roots found. Check your unknown-len or template.")
Broadcast Attack
Johan Håstad noted in 1985 that Textbook RSA is vulnerable to a broadcast attack.
If same message is sent to multiple recipients,
- with low public exponent
esuch as 3, 5, 7… - with no or insecure RSA padding,
- i.e. if
for i in 0..e: c[i] = m^e mod n[i],
it becomes trivial cracking m using
Chinese Reminder Theorem (CRT).
i.e.
C = CRT([ (c[0], n[0]), (c[1], n[1]), (c[2], n[2]) ])
m = mod_inverse( C )
or more verbose:
def nth_root(n, e):
"""Fast integer e-th root using Newton's Method."""
if n == 0: return 0
# Initial guess: 2^(bits/e)
x = 1 << ((n.bit_length() + e - 1) // e)
while True:
# x = ((e-1)*x + n // x^(e-1)) // e
x_prev = x
x = ((e - 1) * x + n // pow(x, e - 1)) // e
if x >= x_prev:
return x_prev
def extended_gcd(a, b):
"""Iterative Extended Euclidean Algorithm."""
old_r, r = a, b
old_s, s = 1, 0
old_t, t = 0, 1
while r != 0:
quotient = old_r // r
old_r, r = r, old_r - quotient * r
old_s, s = s, old_s - quotient * s
old_t, t = t, old_t - quotient * t
return old_r, old_s, old_t
def mod_inverse(a, m):
g, x, y = extended_gcd(a, m)
if g != 1: raise Exception('Modular inverse does not exist')
return x % m
def solve_crt(items):
"""Iterative Chinese Remainder Theorem."""
N = 1
for _, n in items:
N *= n
result = 0
for c_i, n_i in items:
M_i = N // n_i
y_i = mod_inverse(M_i, n_i)
result += c_i * M_i * y_i
return result % N
def main():
// Cut a lot of initialization for brevity
if len(recipients) < common_e:
print(f"ERROR: Need at least {common_e} recipients for e={common_e}, but only found {len(recipients)}.")
sys.exit(1)
C = solve_crt(recipients[:common_e]) # We only need e recipients
m_int = nth_root(C, common_e)
# Convert to bytes
byte_len = (common_bit_len + 7) // 8
recovered = m_int.to_bytes(byte_len, 'big').lstrip(b'\x00')
if args.destination:
with open(args.destination, "wb") as f:
f.write(recovered)
References:
Weak padding
Johan Håstad noted in 1985 that linear padding does not protect against his CRT Broadcast attack.
Daniel Bleichenbacher noted in 1998 that PKCS #1 1.5 padding does not protect against adaptive chosen-ciphertext attacks. The padding oracle can be abused to decrypt messages and sign messages.
Bleichenbacher Signature Forgery python implementations:
Hanno Böck, Juraj Somorovsky, Craig Young noted in 2018
Return Of Bleichenbacher’s Oracle Threat (ROBOT) attack that several
SSL implementations were vulnerable to Bleichenbacher decryption and
signature forgery.
They even signed a message using facebook.com TLS certificate…
PKCS1 padding modes
RSA padding modes attempts to secure RSA against implementation flaws.
- Encryption padding should generate long “random” messages to ensure
c = m^e mod ntrigger several high bits. Reducing Small Exponent vulnerability. Reducing Stereotyped Message vulnerability. - Enable recipient to detect invalid messages, chosen ciphertext attacks (CCA)
RFC 8017 PKCS #1: RSA Cryptography Specifications Version 2.2 defines four modes of RSA.
Optimal Asymmetric Encryption scheme (OAEP)
OAEP replaces the raw m with EM = OEAP(m, seed, hash, L):
+----------+------+--+-------+
DB = | lHash | PS |01| M |
+----------+------+--+-------+
|
+----------+ |
| seed | |
+----------+ |
| |
|-------> MGF ---> xor
| |
+--+ V |
|00| xor <----- MGF <-----|
+--+ | |
| | |
V V V
+--+----------+----------------------------+
EM = |00|maskedSeed| maskedDB |
+--+----------+----------------------------+
where:
PS = 0x00...lHash = HASH( L )MGF: mask generation function seeded byseed.
So upon encryption, you know many messages bits will “randomly” enabled,
making cryptanalysis hard, and defeating many attacks.
And upon decryption, there are several indicators of tampered messages,
such as lHash not generating expected value, or PS not generating
expected values.
seed should be selected randomly to prevent vulnerabilities such as
“Stereotyped Message Attack”.
Probabilistic Signature Scheme (PSS)
PSS implements a double hashing scheme with two salts.
mHash = Hash( M )M' = 00 00 00 00 00 00 00 00 || mHash || saltH = Hash( M' )
And similar to OAEP, it applies a masking function before running the RSA primitives.
PSS verifiers can extract the salt from the signature.
It is intended to be very difficult to attack PSS with known attack methods.
+-----------+
| M |
+-----------+
|
V
Hash
|
V
+--------+----------+----------+
M' = |Padding1| mHash | salt |
+--------+----------+----------+
|
+--------+----------+ V
DB = |Padding2| salt | Hash
+--------+----------+ |
| |
V |
xor <--- MGF <---|
| |
| |
V V
+-------------------+----------+--+
EM = | maskedDB | H |bc|
+-------------------+----------+--+
PKCS1 v1.5 Encryption
PKCS1 1.5 replaces m with EM = 0x00 || 0x02 || PS || 0x00 || M
where
PSis non-zero bytesPSfills the RSA messageEMPSis 8 bytes or more (i.e. fail ifsizeof(M)+3+8 > sizeof(EM))
PKCS1 1.5 padding effects is similar-ish to OAEP effects, with a few downsides:
- Secure generation of
PSis necessary, bigger dependence on random generator. IfPSis static or low entropy, “Stereotyped Message Attack” is a threat. - Recipient can randomly decipher invalid ciphertext as a valid message
EM, if the deciphered block begins with0x00 || 0x02 || 8 or more non-zero bytes || 0x00. - Bleichenbacher Signature Forgery, Return Of Bleichenbacher’s Oracle Threat (ROBOT), Lucky13, are some of the vulnerabilities potentially exposed by PKCS1 v1.5. You could make a CTF challenge around a simpler variant of these vulnerabilities.
PKCS1 v1.5 Signatures
PKCS1 v1.5 replaces m with
EM = 0x00 || 0x01 || PS || 0x00 || T.
where
T = DigestInfo(digestAlgorithm, digest)digest = hash( m )PS = 0xFF 0xFF 0xFF ...
Compared to PSS, it’s clearly a simpler design with less advanced countermeasures.
PKCS1 v1.5 signatures are legacy but kind of okay within many applications, with some major assumptions;
- Keys are used for signature only
- i.e. a PKCS1 1.5 decryption padding oracle must not exist within a PKCS1 v1.5 signing system.
Some final tidbits
A little bit of extra goodies I wanted to add but did not fit neatly into the main portions of the blog post!
Reasons to stop using RSA
While RSA is fun for CTF and puzzles, it may not be the correct tool for actual cryptography today.
- Do not implement your own RSA code.
- “Real” RSA implementations has suffered from many different RSA vulnerabilities: Bleichenbacher/ROBOT, Lucky13, Weak Keys, …
References:
RsaCtfTool with full SageMath
This seems to be an okayish Dockerfile to get RsaCtfTool working fully, but honestly I haven’t tested RsaCtfTool much.
FROM sagemath/sagemath
RUN sudo apt update \
&& sudo apt install -y git pip
RUN sudo mkdir /opt/rsactf \
&& sudo chown sage /opt/rsactf
RUN git clone https://github.com/RsaCtfTool/RsaCtfTool.git /opt/rsactf/RsaCtfTool
ENV PATH "${PATH}:/home/sage/.local/bin"
RUN pip3 install /opt/rsactf/RsaCtfTool
RUN cd /opt/rsactf/RsaCtfTool \
&& mv optional-requirements.txt optional-requirements.txt.original \
&& fgrep -v 'gmpy==2.1.15' optional-requirements.txt.original > optional-requirements.txt
RUN pip3 install -r /opt/rsactf/RsaCtfTool/optional-requirements.txt
ENTRYPOINT ["RsaCtfTool"]
CMD ["-h"]
Sage
SageMath is really nice.
- SageMath Mathematical Software System: SageMath - Open-Source Mathematical Software System
- GitHub - sagemath/sage: Main repository of SageMath · GitHub
Docker:
docker run --rm -it sagemath/sagemath
Podman:
podman pull docker.io/sagemath/sagemath
podman run --rm -it sagemath/sagemath
Yafu working Dockerfile
I had some issues getting Yafu to build, with all tools, variants actually working. Too many dependencies, too many download locations, too many build/runtime failures!
ProjectFactor has an actual working yafu Dockerfile. It is easy to rip their file and throw away all the blockchain related code (that you probably do not want).