UMass CTF | Reader Exercise w/ Gröbner
A bit ago, I played the UMass CTF with the CyberHero team. We performed pretty well, placing third overall, and had the opportunity to solve a lot of really cool challenges. In particular, I found RE to be especially interesting as it made me learn a bit about the Groebner basis (which actually turned out to be unintended!).
Reader Exercise
Author: ming
10 solves / 498 points
Kevin the sea cucumber became a math competition addict and developed this weird verification scheme. We could only recover this source file that’s missing a function that looks pretty important. Before retreating back into his math competition preparation cave, he said that breaking this scheme would be left as “an exercise for the readers”, so we now need your help to retrieve the flag!
The challenge implements polynomials over a field as one’d expect with the Polynomial
class and the (hidden) genPair
function, so I’ll leave out that part for brevity. The meat of the challenge is here:
if __name__ == "__main__":
with open("flag.txt", "r") as f:
FLAG = f.read()
size = 500
base = 16
degree = 8
print(f"Gimme a sec to generate the prime...")
while True:
n = getPrime(size)
if n % (base * 2) == 1:
break
print(f"n = {n}")
p, q = gen_pair(degree, n)
assert isinstance(p, Polynomial) and isinstance(q, Polynomial)
assert p.degree() == degree
assert q.degree() < p.degree()
p_squared = p * p
q_squared = q * q
while True:
decision = input("What would you like to do?\n")
if decision == "challenge":
challenge = int(input("I will never fail your challenges!\n"))
proof = (p_squared(challenge) + q_squared(challenge)) % n
assert proof == (pow(challenge, base, n) + 1) % n
print(f"See? {proof}")
elif decision == "verify":
token = getRandomNBitInteger(size - 1) % n
print("Here's how verification works: ")
print(f"I give you:")
print(f"token = {token}")
print(f"You should give back:")
print(f"p(token) = {p(token) % n}")
print(f"q(token) = {q(token) % n}")
print(f"Simple enough, right?")
token = getRandomNBitInteger(size) % n
print(f"token = {token}")
p_attempt = int(input("p(token) = "))
q_attempt = int(input("q(token) = "))
assert p_attempt == p(token) % n and q_attempt == q(token) % n
print("Great job!")
print(FLAG)
break
else:
print("Probably not that...")
So, we’re given two polynomials,
So, naturally, we take a pen and paper and write it all out.
Polynomials
We’re actually given a lot more immediate information than might be apparent. Simply brute forcing the polynomials isn’t efficient, but there’s a structure we can exploit. Consider the end-result polynomial
More formally, let us write the polynomials
Symbolically in SageMath, we write this as
R.<x> = SR[]
a = SR.var('a', 7)
b = SR.var('b', 8)
p = sum(a[k] * x^k for k in range(8)) + x^8
q = sum(b[k] * x^k for k in range(8))
As made apparent above, we know that the coefficient that goes with
Observe the symbolic sum of the squared polynomials. Notice that the (sum of the) coefficients with each of the variables is equal to zero for most variables (excluding
But can a computer do this? Turns out that yes!
Mr. Gröbner
So, over the years, I’ve consistently heard about this thing called the Gröbner basis. I knew it was something to do with nonlinear equations, but the details eluded me. Turns out, it’s actually a rather complicated topic and nobody really knows what’s going on - and the somewhat good news is that we don’t have to.
Let
The problem, though, is that this is a very slow algorithm for most cases, and we’d like
Through the lense of our problem, we can let
SageMath magic
SageMath has a Gröbner basis implemented (many version even - notable is Singular). Usually it’s a toy implementation, but there’s a proper one (albeit a slow one) for the multivariate case. But, again, we’d like to have as much data to feed into Sage lest we fail to solve (or it taking too long).
Prior to our own guess, we actually get two evaluations. This is just a linear combination of all the unknown coefficients, and gives us 2 more equations to throw into our solve!
# t: the point evaluated
# pt: p(t)
# qt: q(t)
# t2: the point we need to evaluate
def solve(t, pt, qt, t2):
global polys, F # Our original equations and the field in question
# We add the two aformentioned equations
p1 = (sum(a[k] * ZZ(t)^k for k in range(8)) + ZZ(t)^8 - pt).polynomial(F)
q1 = (sum(b[k] * ZZ(t)^k for k in range(8)) - qt).polynomial(F)
polys += [p1, q1]
I = Ideal(polys)
print("Computing basis...")
g = I.groebner_basis() # Takes a bit under 2 minutes
P.<z> = PolynomialRing(F)
# Reconstruct the polynomials from the new basis
ppoly = 0
for idx, value in enumerate(g[:7]): # We skip the last one because it's zero
v = -list(value)[1][0]
ppoly += z^idx * v
ppoly += z^8
print(ppoly)
qpoly = 0
for idx, value in enumerate(g[8:]):
v = -list(value)[1][0]
qpoly += z^idx * v
print(qpoly)
return [ppoly(t2), qpoly(t2)]
The above works, and produces an answer within the challenge’s time-frame! It’s in fact what solved the challenge for me.
But it’s still a bit slow, and one’d get curious how to speed it up…
Further reading
So, what do? Looking around, there’s some information we can additionally supply that could’ve helped. The main improvement is that we know that most of the coefficients are units i.e. not zero, as the trivial solution is the only case when multiple are equal to zero - so adding more equations of the form
Another option was simply not using SageMath to compute the Gröbner basis. The big problem with Sage, beyond making C/C++ calls from Python, is the lack of parallelization and threading in Singular’s implementation, making the computation rather single-core dependent and slow. There exist a few faster algos than Singular within SageMath, but most are limited to smaller working over smaller fields (ex. Macaulay2). The ones we’d ideally need are locked behind pricey software (Maple, Magma), but those offer trials, and it might’ve been worth giving them a shot for the sake of the challenge.
TL;DR: Gröbner is pretty much the LLL for ideals, it’s a pretty fun thing, and can help you solve multivariate problems much quicker. Just make sure to feed it enough equations and computing power