Paradigm CTF 2021 - swap

A guided walkthrough for swap, the hardest challenge in Paradigm CTF 2021

Paradigm CTF 2021 - swap

When you have eliminated all which is impossible, then whatever remains, however improbable, must be the truth - Sherlock Holmes

Paradigm CTF 2021 took place in early February and together, players solved all but two of the challenges during the competition (and one of the remaining two mere days later).

However, the second of the two remained unsolved for almost 2 months until just a few days ago.

Today, we'll be taking a look at swap in the form of a guided walkthrough. Each section will give you a small hint at the end in case you're stuck. If you haven't already, take a look at the challenge files here and familiarize yourself with the code.

The Challenge

This challenge implements a fictional stablecoin AMM. Users can do all the things you'd expect from an AMM:  swap assets and mint/burn/transfer LP shares. The contract is initialized with 100 ether worth of four stablecoins (DAI, TUSD, USDC, USDT) and the win condition is to reduce the TVL of the AMM to 1% of what it started with.

Initially, you might assume that there's a bug in the swap function, but it's easy to confirm with reasonable confidence that the function behaves as expected and won't give you any free stablecoins.

You might also try stealing the LP tokens from the setup contract itself, but the ERC20 implementation appears to be correct and so that's impossible too.

You could try minting and burning LP tokens, but you'll find that the invariants appear to be correctly maintained and that you can't mint more LP tokens than you deserve, or burn more LP tokens that you have.

Finally, just to really drive the point home, all functions are marked nonReentrant.

At this point, the challenge is looking pretty hopeless so under contest conditions it's perfectly reasonable to just drop it and move onto a different problem. However, now that we have all the time in the world, how might we tackle this?

Well, keen observers might notice that the win condition is slightly different in this challenge. In every other challenge, it's always required to drain the entire contract of all assets. However, in this challenge, you only need to drain a percentage. Why might this be?

Everything Is Intentional

One of the lessons I've learned the hard way is that everything in a CTF, no matter how innocuous, is intentional. In this sense, the win condition is a subtle hint towards what the solution requires. The fact that you don't need to drain all of the assets to win should strongly suggest that it's in fact impossible to drain all of the assets.

But what does this mean for us? Working from first principles, we can reset our assumptions by following this chain of logic:

  1. We need to remove (some, but not all) assets from the pool
  2. The only way for assets to leave the pool is through a swap or a burn
  3. Swapping appears to be correctly implemented such that no assets can be stolen
  4. Burning also appears to be correctly implemented such that no assets can be stolen
  5. However, while swapping is a single-step process, burning is a two-step process where the first step is somehow acquiring LP tokens. The burn operation itself says nothing about where the LP tokens come from, so there must be a way to get free LP tokens

The subtle hint also reinforces this theory. Assuming that we can't steal the LP tokens from the setup contract, we'll never be able to burn 100% of all LP tokens and so we'll never be able to drain the entire contract.

Under the assumption that there must be a way to get free LP tokens, can you figure out what the solution is?

No Matter How Improbable

There are only two ways to get LP tokens. The first is to mint them, and the second is to be transferred them. We've already established that the ERC20 implementation is sound (although you may want to double-check that before committing yourself to the more challenging route) which leaves us with the mint function.

     struct MintVars {
        uint totalSupply;
        uint totalBalanceNorm;
        uint totalInNorm;
        uint amountToMint;
        
        ERC20Like token;
        uint has;
        uint preBalance;
        uint postBalance;
        uint deposited;
    }
 
    function mint(uint[] memory amounts) public nonReentrant returns (uint) {
        MintVars memory v;
        v.totalSupply = supply;
        
        for (uint i = 0; i < underlying.length; i++) {
            v.token = underlying[i];
            
            v.preBalance = v.token.balanceOf(address(this));
            
            v.has = v.token.balanceOf(msg.sender);
            if (amounts[i] > v.has) amounts[i] = v.has;
            
            v.token.transferFrom(msg.sender, address(this), amounts[i]);
            
            v.postBalance = v.token.balanceOf(address(this));
            
            v.deposited = v.postBalance - v.preBalance;

            v.totalBalanceNorm += scaleFrom(v.token, v.preBalance);
            v.totalInNorm += scaleFrom(v.token, v.deposited);
        }
        
        if (v.totalSupply == 0) {
            v.amountToMint = v.totalInNorm;
        } else {
            v.amountToMint = v.totalInNorm * v.totalSupply / v.totalBalanceNorm;
        }
        
        supply += v.amountToMint;
        balances[msg.sender] += v.amountToMint;
        
        return v.amountToMint;
    }

The goal now is to somehow adjust v.amountToMint such that we are minted way more LP tokens than we deserve. We can start by reviewing the function itself again, but this time with a fine-tooth comb. The function begins by constructing an object to represent the local variables. Then, for each token, we record the current balance of the AMM as well as the balance of the sender. If the amount to be deposited is greater than the sender's balance, then we update the amount requested. We perform the transfer, record the current balance again, then update the total deposited counters appropriately.

Nothing is immediately obvious, so we must begin eliminating the impossible and leave ourselves with only the improbable. Unfortunately, sometimes we might not know enough to immediately differentiate between the impossible and the improbable, so we'll need to keep an open mind.

We can again reset our assumptions by working from first principles using the following chain of logic:

  1. We clearly need to change v.amountToMint in such a way that the value is larger than what it should be
  2. The equation to calculate the amount to mint is implemented correctly
  3. Therefore, there must be another way to either tamper with v.amountToMint directly or to tamper with one of the values it depends on.

Do you see any way to do this, even if it seems impossible or improbable at first?

True Names Have Power

Notice that v.amountToMint is a field on MintVars memory v. This means that any time we write to v.amountToMint, we're actually writing to memory. Unlike data on the stack (recall that the EVM is a stack-based VM), data in memory can be aliased, meaning that the same data can be accessed through multiple symbolic names.

It just so happens that we have two pointers to memory in scope. The first is MintVars memory v, and the second is uint[] memory amounts. If we could somehow cause the latter to point to the same memory address as the former, then we could trigger undefined behavior by updating one through the other.

In order to understand where the amounts variable comes from, we'll first take a brief detour to the land of the Solidity ABI. All transactions are but blobs of data, and it's the job of the ABI to define a standard way of parsing that blob into structed information. Before a contract's code is executed, Solidity inserts some logic which decodes the transaction data into the types that the code expects. Only then does the contract begin executing.

In the case of mint(uint256[]), the ABI decoder will decode the transaction data into a single array of uint256. Is there some way that we can trick the ABI decoder into causing the decoded array to alias with the MintVars object constructed later?

Around the World

According to the Solidity documentation, the ABI encoding of an array is as follows:

<- 32 bytes ->

OOOOO....OOOOO [offset to array start]
    [....]
LLLLL....LLLLL [length in words]
DDDDD....DDDDD [data #1]
DDDDD....DDDDD [data #2]
    [....]
DDDDD....DDDDD [data #L]

To parse that data, the ABI decoder runs the following pseudocode:

function decodeArrayAt(uint arg) private returns (bytes memory) {
    uint offset = read(0x04+0x20*arg);
    uint length = read(0x04+offset);
    bytes memory data = malloc(length);
    memcpy(data, 0x04 + offset + 0x20, length);
    return data;
}

So far so good. Let's take a look at how malloc is implemented, again in pseudocode:

function malloc(uint size) private returns (bytes memory) {
    bytes memory buf;
    assembly {
        buf := mload(0x40)
        mstore(0x40, add(add(buf, size), 0x20))
        mstore(buf, size)
    }
    return buf;
}

Solidity uses what is known as a linear memory allocator (or arena-based allocator). This just means that Solidity will allocate new memory linearly along the block of total available memory. To allocate a new chunk, Solidity reads the free-memory pointer stored at 0x40 to determine where the next free address is, and moves it forward to reflect the fact that a new chunk of memory was just allocated.

Notice that there are no checks to ensure that the amount of memory requested is not excessively large. This means that if one was to allocate a specific amount of memory, then the free memory pointer may overflow and begin re-allocating in-use memory. In this case, two calls to the pseudo-malloc might return pointers which alias each other.

Fortunately for us, we can control exactly the size of memory the allocator will allocate simply by changing the ABI-encoded representation of our array. This lets us cause the allocator to allocate v over top of our amounts array.

Can you figure out the final step to solving this challenge?

Technical Difficulties

Because v and amounts now alias each other, both will share the same values and updating one will update the other. This means that while you can manipulate v by writing into amounts, the reverse is true as well.

As such, if you just tried to force some initial values into v by calling mint with some custom values, you might've found that your data kept getting clobbered by the writes to v.totalSupply, v.totalBalanceNorm, and v.totalInNorm. This means that some careful planning is needed to make sure that your values stay consistent over all four iterations of the loop.

Additionally, v.amountToMint is updated after the four iterations, meaning that even if we try to write directly to amounts[3], it'll just be overwritten with the correctly calculated value in the end. This means that we'll need to either make v.totalInNorm or v.totalSupply extremely big, or make v.totalBalanceNorm extremely small.

To do this, we first swap out all of the stablecoins in the AMM except for DAI. This means that v.totalBalanceNorm will not increase after the first iteration because the AMM has no balance. This just makes our lives a bit easier later on.

Next, we'll purchase the right amount of DAI/TUSD/USDC such that when amounts[i] = v.has is executed, we'll write our desired values into v. The balance of DAI will be written to v.totalSupply, so we'll want to set this to a large number such as 10000e18. The balance of TUSD will be written to v.totalBalanceNorm, so we'll update that to 1. Finally, the balance of USDC will be written to v.totalInNorm so we'll also set that to 10000e18. There's no point manipulating the USDT balance because the value will be clobbered anyways.

Finally, we just need to assemble our payload and send the transaction, giving us the flag we worked so hard for: PCTF{hon3y_1_Ov3rFLOw3d_7h3_M3MOry}.

You can find the official solution here.

Conclusion

This was by far the most complex challenge in the CTF and for good reason - it made use of a vulnerability that most people haven't heard of and very few people actively think about. It also challenged basic assumptions about what Solidity source code really does. However, I think it made for a great exercise in solving problems from first principles, eliminating assumptions, and exploring the entire search space. Hopefully you enjoyed this writeup and I'll see you next year at Paradigm CTF 2022!