RSA challenges for fun and CTFs

RSA is one of the oldest asymmetric public-private key cryptosystem. It has a lot of fun special cases and considerations that enables fun edge case considerations that makes it perfect for Capture The Flag challenges and puzzles.

TL;DR:

Lots of “intentionally vulnerable” RSA challenges can be cracked super easy with Python or similar.
SageMath (GitHub) (Docker) is great when you need to do more advanced math than “vanilla Python” offers out of the box. For example Coppersmith, LLL, small_roots is something Sage excels at.
AI / Large Language Models like Gemini, ChatGPT etc. are very familiar with RSA vulnerabilities. A simple prompt usually gives you clues and codes for exploitation.
There are repositories of ready to use RSA attacks, for example: RsaCtfTool, Giapppp/RSA-Attacks: Some RSA Attacks I implemented and collected from Internet, xanhacks CTF-docs: RSA (GitHub), cado-nfs, yafu etc.

caveat: A bunch of RSA attack, factoring software are a bit wonky… Your millage may vary… I’ve experienced a bunch of fun pitfalls;

official Docker-images based on very outdated Ubuntu versions
build instructions failing on modern operating systems, very hard to build
official Docker-images not working, for example: failing with error messages about python library functions missing

So eh, if you expect that you will have to deal with RSA in a CTF, it is recommend to install and test tools ahead of time!

Table of Contents

Textbook RSA
Factoring RSA keys
Bad RSA keys
Stereotyped Messages
Broadcast Attack
Weak padding
PKCS1 padding modes
Some final tidbits

Textbook RSA

Textbook RSA is the classic c = m^e mod n, n = c^d mod n, where d is hard to calculate without knowing p * q = n needed to calculate d = mod_inverse( (p-1) * (q-1) ). i.e. the ciphertext is the secret message m raised to the public exponent e modulus n.

In an ideal world, attacker should need to factor n to yield n = p * q i.e. security should rely on solving the “RSA trapdoor” factoring problem.

Textbook RSA has a couple of weaknesses that enables quicker attacks:

Confidentiality only. Recipient has no indication of if the message is received correctly, unmodified. Recipient does not even know if the provided ciphertext even is the output of the RSA calculus c = m^e mod n or some arbitrary garbage attempting to exploit / cryptanalyze the system.

One-to-one encryption. Attacker who has intercepted one ciphertext can bruteforce messages until they find a message that generates the ciphertext. I.e. the encryption function is a (slow) decryption Oracle.

Implementation issues. Bad keys and Bad messages (lack of secure padding) weakens the security of the system significantly.

Key reuse issues. Key reuse, using one RSA keys for multiple roles, escalates impact of RSA vulnerabilities. Notably an decryption function working as a Padding Oracle may enable Bleichenbacher Signature Forgery. i.e. you can trick a decryption function to become a signing function.

Factoring RSA keys

Generally, factoring RSA public modulo n is very time/resource expensive and for CTF’s and puzzles this only applies to very short keys or possibly intentionally weak keys.

General factorization algorithms such as NFS, ECM etc. are implemented by tools such as:

Factoring bypass / factoring simplifications:

For some poorly chosen values p * q = n there exists much better/simpler factoring attacks than a general factoring. If you know or have reason to suspect a key is intentionally vulnerable to a specific factoring attack, the specialized attack may be faster than generic factoring algorithms.

Bad RSA keys

Several of RSA vulnerabilities easily exploitable in CTF / puzzle setups stem from bad keys. Here is some examples of weak keys!

Bad RSA keys: Short Public Exponent

Public exponent e is usually the prime 65537 (0x1001).

When the public exponent e is a small prime, such as 3, 7, 11 a lot of the “randomness”/”complexity” from the modulo never kicks in. c = m^e mod n behaves more like c = m^e, enabling simpler cryptanalysis. This is of interest in a number of RSA attacks, for example:

Stereotyped Message recovery
Broadcast Message recovery

Bad RSA keys: Shared primes

One of the ROCA CVE-2017-15361 issues was shared primes.

Two keys sharing primes can simply be cracked instantly using Greatest common divisor (GCD) attack:

def attack(pub_a_path, pub_b_path, out_a_path, out_b_path):
    print("--- Loading Public Keys ---")
    alice_pub = load_public_key(pub_a_path)
    bob_pub = load_public_key(pub_b_path)

    N_A = alice_pub.n
    N_B = bob_pub.n
    e_A = alice_pub.e
    e_B = bob_pub.e

    print(f"Alice Modulo (N): {N_A}")
    print(f"Bob Modulo (N):   {N_B}\n")

    # The Attack: Greatest Common Divisor
    # If N_A = p1 * q and N_B = p2 * q, then GCD(N_A, N_B) = q
    print("--- Performing GCD Attack ---")
    shared_q = math.gcd(N_A, N_B)

    if shared_q == 1:
        print("Failed: No shared prime factors found. These keys do not share a prime.")
        return

    print(f"CRITICAL: Found Shared Prime (Q): {shared_q}\n"

BatchGCD enables more efficient scaling of GCD to attack a vast number of keys at the same time. It is significantly more efficient than looping GCD over all unique combinations of keys.

Bad RSA Keys: Predictable primes

One of the ROCA CVE-2017-15361 issues was predictable primes. “The Return of Coppersmith’s Attack: Practical Factorization of Widely Used RSA Moduli” employees the Lenstra–Lenstra–Lovász (LLL) lattice basis reduction algorithm algorithm to solve the unknown bits of an RSA key. If you know or can guess 50% or more of prime P, you can derive P, Q using LLL/Coppersmith.

You do not need to know the details as SageMath does all the LLL internally, as you just setup the polynomial function f = x + P_known and ask Sage to calculate: roots = f.small_roots(X=2^bits_to_find, beta=0.5):

# --- SageMath Script to Crack RSA via Coppersmith ---
# Target: Recover P from N and partial MSBs
# Modulus N: {n_bits} bits
# Known bits of P: {known_bits_count} bits

N = {n}
P_known = {p_known}
bits_to_find = {bits_to_hide}

# Define the polynomial ring over Zmod(N)
PR.<x> = PolynomialRing(Zmod(N))
# We solve for x in f(x) = P_known + x = 0 (mod P)
f = x + P_known

print(f"\\n[!] Attempting to find root for {n_bits}-bit N...")
print(f"[!] Unknown bits: ")

# beta=0.5 because we are looking for a factor P ~ N^0.5
# X is the bound for our unknown 'x' (2^bits_to_find)
try:
    roots = f.small_roots(X=2^bits_to_find, beta=0.5)

    if roots:
        recovered_p = P_known + int(roots[0])
        recovered_q = N // recovered_p
        hex_p = hex(recovered_p)
        hex_q = hex(recovered_q)
        print(f"\\n[+] Success! RSA Key Cracked.")
        print("-" * 20)
        print(f"P: {hex_p}")
        print(f"Q: {hex_q}")
        print("-" * 20)
        assert N % recovered_p == 0
    else:
        print("\\n[-] No roots found. The lattice dimension may be too small or you have < 50% of bits.")
except Exception as e:
    print(f"\\n[!] Sage Error: ")

References, LLL and Copperfield:

References, Poor Randomness:

YouTube/ CryptoAction: Exploiting Poor Randomness (Part 3) video

RSA primes are supposed to be generated in some manner where they are unrelated.

CVE-2022-26320 is about software/hardware generating prime P, and then picking prime Q close to P.

This enables cracking RSA keys super fast with an ancient algorithm… several hundred years old, invented by Pierre de Fermat:

def fermat_factorization(n):
    """
    Finds p and q such that n = p * q using Fermat's Method.
    Optimized for cases where p and q are close.
    """
    # Start searching at the ceiling of the square root of n
    a = math.isqrt(n)
    if a * a < n:
        a += 1

    while True:
        b2 = a*a - n
        b = math.isqrt(b2)

        # Check if b2 is a perfect square
        if b*b == b2:
            p = a - b
            q = a + b
            return p, q

        a += 1
        # In a real attack, you'd set a limit, but for this exercise
        # with small increments, it will find it almost instantly.

References: Fermat Attack on RSA:

Stereotyped Messages

A stereotyped message is an message m for which many of the bits are known or predictable.

A classic example is:

m = "The Secret Is: " || 20 unknown characters

LLL/Coppersmith attack can remove/ignore all of the known bits, and reduce the problem to a lattice analysis of the unknown bits.

Effectiveness is very much tied to the size of public exponent e, as the LLL analysis can only recover less than approximately N/e bits. For e=3 this is extremely powerful but for e=65537 this is moot.

Exponent `e`	RSA modulo size `N`	Recoverable bits `N/e`
3	1024	341
3	2048	682
3	4096	1365
65537	4096	0

Example code:

def get_params(public_key_path, ciphertext_path):
    # Load Public Key
    with open(public_key_path, "rb") as f:
        pub = serialization.load_pem_public_key(f.read())
    n = pub.public_numbers().n
    e = pub.public_numbers().e

    # Read Ciphertext from file as binary
    with open(ciphertext_path, "rb") as f:
        c_bytes = f.read()
    c = int.from_bytes(c_bytes, 'big')

    return n, e, c

def generate_sage_script(n, e, c, template, unknown_len):
    # 1. Convert template to integer and shift it
    template_bytes = template.encode('utf-8')
    m_known = int.from_bytes(template_bytes, 'big') << (8 * unknown_len)

    # 2. Define the bound for the unknown x
    x_max = 2**(8 * unknown_len)

    # 3. Create the SageMath script string
    sage_template = ...; # See SageMath script below
    return sage_template

def main():
    parser = argparse.ArgumentParser(description="Generate SageMath script for Raw RSA Coppersmith Attack")
    # omitted for brevity
    args = parser.parse_args()

    try:
        n, e, c = get_params(args.public_key, args.ciphertext)
        sage_script = generate_sage_script(n, e, c, args.template, args.unknown_len)
        print(sage_script)
    except Exception as err:
        print(f"Error: {err}")

SageMath script:

# --- Generated SageMath Script for Raw RSA Coppersmith Attack ---

n = ...
e = ...
c = ...
m_known = ...
x_max = ...

# Set up the Polynomial Ring
R.<x> = PolynomialRing(Zmod(n))

# Define the polynomial: f(x) = (m_known + x)^e - c
f = (m_known + x)^e - c

print("Searching for roots via LLL...")
# beta=1.0 is for modular roots (mod n)
roots = f.small_roots(X=x_max, beta=1.0)

if roots:
    print("\\n[+] Root found!")
    result_x = int(roots[0])

    # Convert integer back to bytes
    try:
        # Calculate byte length
        byte_len = (result_x.bit_length() + 7) // 8
        decoded = result_x.to_bytes(byte_len, 'big').decode('utf-8', errors='ignore')
        print(f"Decrypted Unknown Part: ")
        print(f"Raw Integer: ")
    except Exception as err:
        print(f"Could not decode result to UTF-8: ")
        print(f"Raw Integer: ")
else:
    print("\\n[-] No roots found. Check your unknown-len or template.")

Broadcast Attack

Johan Håstad noted in 1985 that Textbook RSA is vulnerable to a broadcast attack.

If same message is sent to multiple recipients,

with low public exponent e such as 3, 5, 7…
with no or insecure RSA padding,
i.e. if for i in 0..e: c[i] = m^e mod n[i],

it becomes trivial cracking m using Chinese Reminder Theorem (CRT). i.e.

C = CRT([ (c[0], n[0]), (c[1], n[1]), (c[2], n[2]) ])
m = mod_inverse( C )

or more verbose:

def nth_root(n, e):
    """Fast integer e-th root using Newton's Method."""
    if n == 0: return 0
    # Initial guess: 2^(bits/e)
    x = 1 << ((n.bit_length() + e - 1) // e)
    while True:
        # x = ((e-1)*x + n // x^(e-1)) // e
        x_prev = x
        x = ((e - 1) * x + n // pow(x, e - 1)) // e
        if x >= x_prev:
            return x_prev

def extended_gcd(a, b):
    """Iterative Extended Euclidean Algorithm."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t

def mod_inverse(a, m):
    g, x, y = extended_gcd(a, m)
    if g != 1: raise Exception('Modular inverse does not exist')
    return x % m

def solve_crt(items):
    """Iterative Chinese Remainder Theorem."""
    N = 1
    for _, n in items:
        N *= n

    result = 0
    for c_i, n_i in items:
        M_i = N // n_i
        y_i = mod_inverse(M_i, n_i)
        result += c_i * M_i * y_i
    return result % N

def main():
    // Cut a lot of initialization for brevity

    if len(recipients) < common_e:
        print(f"ERROR: Need at least {common_e} recipients for e={common_e}, but only found {len(recipients)}.")
        sys.exit(1)

    C = solve_crt(recipients[:common_e]) # We only need e recipients
    m_int = nth_root(C, common_e)

    # Convert to bytes
    byte_len = (common_bit_len + 7) // 8
    recovered = m_int.to_bytes(byte_len, 'big').lstrip(b'\x00')

    if args.destination:
        with open(args.destination, "wb") as f:
            f.write(recovered)

References:

Weak padding

Johan Håstad noted in 1985 that linear padding does not protect against his CRT Broadcast attack.

On using RSA with low exponent in a public key network

Daniel Bleichenbacher noted in 1998 that PKCS #1 1.5 padding does not protect against adaptive chosen-ciphertext attacks. The padding oracle can be abused to decrypt messages and sign messages.

Chosen Ciphertext Attacks Against Protocols Based on the RSA Encryption Standard PKCS #1

Bleichenbacher Signature Forgery python implementations:

Hanno Böck, Juraj Somorovsky, Craig Young noted in 2018 Return Of Bleichenbacher’s Oracle Threat (ROBOT) attack that several SSL implementations were vulnerable to Bleichenbacher decryption and signature forgery. They even signed a message using facebook.com TLS certificate…

robotattack.org

PKCS1 padding modes

RSA padding modes attempts to secure RSA against implementation flaws.

Encryption padding should generate long “random” messages to ensure c = m^e mod n trigger several high bits. Reducing Small Exponent vulnerability. Reducing Stereotyped Message vulnerability.
Enable recipient to detect invalid messages, chosen ciphertext attacks (CCA)

RFC 8017 PKCS #1: RSA Cryptography Specifications Version 2.2 defines four modes of RSA.

Optimal Asymmetric Encryption scheme (OAEP)

OAEP replaces the raw m with EM = OEAP(m, seed, hash, L):

                                +----------+------+--+-------+
                           DB = |  lHash   |  PS  |01|   M   |
                                +----------+------+--+-------+
                                               |
                     +----------+              |
                     |   seed   |              |
                     +----------+              |
                           |                   |
                           |-------> MGF ---> xor
                           |                   |
                  +--+     V                   |
                  |00|    xor <----- MGF <-----|
                  +--+     |                   |
                    |      |                   |
                    V      V                   V
                  +--+----------+----------------------------+
            EM =  |00|maskedSeed|          maskedDB          |
                  +--+----------+----------------------------+

where:

PS = 0x00...
lHash = HASH( L )
MGF: mask generation function seeded by seed.

So upon encryption, you know many messages bits will “randomly” enabled, making cryptanalysis hard, and defeating many attacks. And upon decryption, there are several indicators of tampered messages, such as lHash not generating expected value, or PS not generating expected values.

seed should be selected randomly to prevent vulnerabilities such as “Stereotyped Message Attack”.

Probabilistic Signature Scheme (PSS)

PSS implements a double hashing scheme with two salts.

mHash = Hash( M )
M' = 00 00 00 00 00 00 00 00 || mHash || salt
H = Hash( M' )

And similar to OAEP, it applies a masking function before running the RSA primitives.

PSS verifiers can extract the salt from the signature.

It is intended to be very difficult to attack PSS with known attack methods.

                                     +-----------+
                                     |     M     |
                                     +-----------+
                                           |
                                           V
                                         Hash
                                           |
                                           V
                             +--------+----------+----------+
                        M' = |Padding1|  mHash   |   salt   |
                             +--------+----------+----------+
                                            |
                  +--------+----------+     V
            DB =  |Padding2|   salt   |   Hash
                  +--------+----------+     |
                            |               |
                            V               |
                           xor <--- MGF <---|
                            |               |
                            |               |
                            V               V
                  +-------------------+----------+--+
            EM =  |    maskedDB       |     H    |bc|
                  +-------------------+----------+--+

PKCS1 v1.5 Encryption

PKCS1 1.5 replaces m with EM = 0x00 || 0x02 || PS || 0x00 || M where

PS is non-zero bytes
PS fills the RSA message EM
PS is 8 bytes or more (i.e. fail if sizeof(M)+3+8 > sizeof(EM))

PKCS1 1.5 padding effects is similar-ish to OAEP effects, with a few downsides:

Secure generation of PS is necessary, bigger dependence on random generator. If PS is static or low entropy, “Stereotyped Message Attack” is a threat.
Recipient can randomly decipher invalid ciphertext as a valid message EM, if the deciphered block begins with 0x00 || 0x02 || 8 or more non-zero bytes || 0x00.
Bleichenbacher Signature Forgery, Return Of Bleichenbacher’s Oracle Threat (ROBOT), Lucky13, are some of the vulnerabilities potentially exposed by PKCS1 v1.5. You could make a CTF challenge around a simpler variant of these vulnerabilities.

PKCS1 v1.5 Signatures

PKCS1 v1.5 replaces m with EM = 0x00 || 0x01 || PS || 0x00 || T.

where

T = DigestInfo(digestAlgorithm, digest)
digest = hash( m )
PS = 0xFF 0xFF 0xFF ...

Compared to PSS, it’s clearly a simpler design with less advanced countermeasures.

PKCS1 v1.5 signatures are legacy but kind of okay within many applications, with some major assumptions;

Keys are used for signature only
i.e. a PKCS1 1.5 decryption padding oracle must not exist within a PKCS1 v1.5 signing system.

Some final tidbits

A little bit of extra goodies I wanted to add but did not fit neatly into the main portions of the blog post!

Reasons to stop using RSA

While RSA is fun for CTF and puzzles, it may not be the correct tool for actual cryptography today.

Do not implement your own RSA code.
“Real” RSA implementations has suffered from many different RSA vulnerabilities: Bleichenbacher/ROBOT, Lucky13, Weak Keys, …

References:

YouTube/ trailofbits: Fuck RSA @ SummerCon 2019 video

RsaCtfTool with full SageMath

This seems to be an okayish Dockerfile to get RsaCtfTool working fully, but honestly I haven’t tested RsaCtfTool much.

FROM sagemath/sagemath

RUN sudo apt update \
 && sudo apt install -y git pip

RUN sudo mkdir /opt/rsactf \
 && sudo chown sage /opt/rsactf

RUN git clone https://github.com/RsaCtfTool/RsaCtfTool.git /opt/rsactf/RsaCtfTool

ENV PATH "${PATH}:/home/sage/.local/bin"
RUN pip3 install /opt/rsactf/RsaCtfTool
RUN cd /opt/rsactf/RsaCtfTool \
 && mv optional-requirements.txt optional-requirements.txt.original \
 && fgrep -v 'gmpy==2.1.15' optional-requirements.txt.original > optional-requirements.txt
RUN pip3 install -r /opt/rsactf/RsaCtfTool/optional-requirements.txt

ENTRYPOINT ["RsaCtfTool"]
CMD ["-h"]

Sage

SageMath is really nice.

Docker:

docker run --rm -it sagemath/sagemath

Podman:

podman pull docker.io/sagemath/sagemath
podman run --rm -it sagemath/sagemath

Yafu working Dockerfile

I had some issues getting Yafu to build, with all tools, variants actually working. Too many dependencies, too many download locations, too many build/runtime failures!

ProjectFactor has an actual working yafu Dockerfile. It is easy to rip their file and throw away all the blockchain related code (that you probably do not want).

GitHub - ProjectFactor/factoring: Mining code for the FACT0RN blockchain. · GitHub