The dYdX Safety Module is a staking contract designed to bootstrap a decentralized pool of funds which can be used to backstop the dYdX protocol.
During deployment of the upgradeable Safety Module smart contract, the contract implementation was upgraded in a way that changed the contract's storage layout. This caused the exchange rate of DYDX to stkDYDX to change from one to zero. As a result, users who staked to the contract deposited DYDX without receiving stkDYDX.
This bug occurred due to an error made during the deployment of the smart contract, and we do not believe there is any error in the code itself. The Safety Module previously received a smart contract audit and is based on the Liquidity Module design, which was also audited. The Safety Module was thoroughly tested prior to deployment.
On August 3rd UTC, the Foundation performed a contract upgrade, updating the address of the implementation from 0x5132...aE87 to 0xd249...7c55. This change was intended as a minor gas optimization, based on feedback received during the audit and other internal code review of the Liquidity Module. The change was thoroughly tested, both in the context of the Liquidity Module and Safety Module.
The bug occurred because this upgrade modified the storage layout of the contract after it had been deployed and initialized. The specific problematic change was made in SM1Storage.sol, which defines the storage layout for the contract. The following lines...
...had been changed to
Because the original code used two storage slots and the new code uses only one, the storage layout shifted such that all variables in SM1Storage.sol following the modified lines ended up pointing to incorrect storage slots.
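To make the slot shift concrete, here is a minimal Python model of sequentially assigned storage slots. It is a sketch under stated assumptions: the variable names (`params`, `unset_counter`, `exchange_rate`), the slot sizes, and the mint formula `deposit * rate // EXCHANGE_RATE_BASE` are illustrative and are not taken from SM1Storage.sol. The model also ignores Solidity's packing rules.

```python
# Illustrative model only: names, sizes, and the mint formula are assumptions,
# not the actual contents of SM1Storage.sol.
EXCHANGE_RATE_BASE = 10**18

# Each entry is (variable name, number of 32-byte slots it occupies).
LAYOUT_BEFORE = [("params", 2), ("unset_counter", 1), ("exchange_rate", 1)]
LAYOUT_AFTER = [("params", 1), ("unset_counter", 1), ("exchange_rate", 1)]


def slot_map(layout):
    """Assign slot indices in declaration order (no packing modeled)."""
    slots, next_slot = {}, 0
    for name, size in layout:
        slots[name] = next_slot
        next_slot += size
    return slots


def read(storage, layout, name):
    """Read a variable through a given layout; unwritten slots read as zero."""
    return storage.get(slot_map(layout)[name], 0)


# The initializer ran against the original layout, so the exchange rate
# was written to the slot that layout assigns (slot 3 in this model).
storage = {slot_map(LAYOUT_BEFORE)["exchange_rate"]: EXCHANGE_RATE_BASE}

rate_before = read(storage, LAYOUT_BEFORE, "exchange_rate")
rate_after = read(storage, LAYOUT_AFTER, "exchange_rate")  # now reads slot 2

# Assumed mint formula: staked amount scales with the exchange rate.
deposit = 1_000 * 10**18
minted_after = deposit * rate_after // EXCHANGE_RATE_BASE

print(rate_before, rate_after, minted_after)
```

In this model the initialized value is still in storage. The upgraded code simply looks for it one slot too early, reads zero, and therefore mints nothing for a deposit.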
This type of problem is a known concern with upgradeable contracts. It is one of the reasons contract upgrades must be performed carefully and the very reason why all Safety Module storage is carefully consolidated into the single file called SM1Storage.sol. Unfortunately, the possibility of this type of error was overlooked in this case, and internal reviews failed to catch the bug.
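One way a review step could catch this class of error mechanically is to compare slot assignments between the old and new layouts before upgrading. The following is a simplified, hypothetical check, not the tooling used by dYdX. The layouts and variable names are invented for illustration, and packing is not modeled. It also shows how inserting an unused padding variable can bring later variables back to their original slots.

```python
def slot_map(layout):
    """Assign slot indices in declaration order (no packing modeled)."""
    slots, next_slot = {}, 0
    for name, size in layout:
        slots[name] = next_slot
        next_slot += size
    return slots


def moved_variables(old_layout, new_layout):
    """Return {name: (old_slot, new_slot)} for variables that moved or vanished."""
    old, new = slot_map(old_layout), slot_map(new_layout)
    return {
        name: (slot, new.get(name))
        for name, slot in old.items()
        if new.get(name) != slot
    }


# Hypothetical layouts: (variable name, number of slots).
OLD = [("params", 2), ("balance_total", 1), ("exchange_rate", 1)]
SHRUNK = [("params", 1), ("balance_total", 1), ("exchange_rate", 1)]
PADDED = [
    ("params", 1),
    ("unused_padding", 1),  # restores the slot positions of later variables
    ("balance_total", 1),
    ("exchange_rate", 1),
]

print(moved_variables(OLD, SHRUNK))
print(moved_variables(OLD, PADDED))
```

A non-empty result for an upgrade to an already-initialized contract would be a reason to stop and investigate before deploying.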
On Wednesday, September 8 at 15:00 UTC, the transfer restriction on the DYDX token was automatically lifted. This effectively opened up staking to the dYdX Safety Module.
Between 15:00 and 16:00 UTC, staking transactions were sent to the Safety Module from roughly 55 different addresses, contributing around 157,000 DYDX. The bug was detected when it was observed that no stkDYDX was being issued to stakers. At 15:57 UTC, access to staking was disabled in the dYdX governance UI.
At the moment, these users’ funds are safely held within the Safety Module. However, a contract upgrade is necessary in order to restore functionality to the contract. Currently, no Safety Module rewards are being distributed and withdrawals are not possible. By design, funds staked to the Safety Module are locked until the end of the 28-day epoch.
By deploying a fix before or shortly after the end of the epoch, impact to users can be minimized.
A full solution should do the following in order to minimize impact to users:
- Restore functionality to the Safety Module.
- Allow the users who are currently staked to recover their funds.
- Compensate those users for the missed rewards that they should have received for participating in the Safety Module.
We propose implementing this as follows:
1. Deploy a Safety Module Recovery contract with a hardcoded mapping from address to DYDX token amount. The amounts should include staked funds and any additional compensation agreed upon by governance.
   - As a precaution, this contract will be upgradeable via the Short Timelock.
2. Deploy a new Safety Module implementation contract with the following changes:
   - Add an unused variable to the storage layout to restore the correct layout. The details of this step will require a careful analysis of the smart contract storage.
   - Add a new initializer function which withdraws the full DYDX balance to the Safety Module Recovery contract, transfers any additional compensation from the Rewards Treasury to the Safety Module Recovery contract, and performs any other initialization needed to reset the smart contract state.
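The behavior described for the Safety Module Recovery contract can be sketched as a toy Python model: a fixed address-to-amount table, funding from two sources, and a one-time claim per address. The class name, method names, addresses, and amounts below are all hypothetical placeholders. The actual contract interface is not specified in this post.

```python
class SafetyModuleRecovery:
    """Toy model of a payout contract with a fixed address -> amount table."""

    def __init__(self, allocations):
        # Fixed at deployment: staked funds plus any agreed compensation.
        self._owed = dict(allocations)
        self.balance = 0

    def fund(self, amount):
        """Model a transfer made by the new Safety Module initializer."""
        self.balance += amount

    def claim(self, address):
        """Pay out the caller's full allocation exactly once."""
        amount = self._owed.get(address, 0)
        if amount == 0:
            raise ValueError("nothing to claim")
        if amount > self.balance:
            raise ValueError("contract is underfunded")
        self._owed[address] = 0
        self.balance -= amount
        return amount


# Placeholder addresses and amounts, for illustration only.
allocations = {"0xAlice": 110, "0xBob": 55}
recovery = SafetyModuleRecovery(allocations)
recovery.fund(150)  # staked DYDX withdrawn from the Safety Module
recovery.fund(15)   # additional compensation from the Rewards Treasury
paid = recovery.claim("0xAlice")
print(paid, recovery.balance)
```

Zeroing the allocation before reducing the balance means a repeated claim from the same address fails instead of paying twice.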
The on-chain proposal to be approved by governance will include the following steps to be executed via the Long Timelock:
1. Upgrade the Safety Module to the new implementation contract.
2. Call the Safety Module initializer.
Before submitting the proposal, mainnet forking will be used to simulate the approval of the proposal and validate the resulting smart contract state. The validation process will include the following:
- Running the deployment verification suite to check that the full smart contract state and all emitted logs are as expected.
- Running the full Safety Module unit test suite to ensure all functionality and code paths work as expected.
- Verifying that affected users are able to receive their full DYDX balance, including any compensation, from the Safety Module Recovery contract.
At dYdX we always require adherence to the highest smart contract security standards. We require that thorough audits be undertaken, and enforce 100% branching code coverage on all smart contracts. Unfortunately, when deploying a complex smart contract system, there are many ways in which things can break, and in this case our decisions and process around deployment and validation left some room for error.
The dYdX governance launch required 187 transactions to deploy and initialize 33 smart contracts. In a project like this, it is important to adopt a testing and validation framework that is not only thorough in the context of each individual smart contract, but also able to detect any problems at the intersection of different parts of the system.
To a large extent, we accomplished this with the following:
- Thorough automated test coverage of both individual smart contracts and the interactions between them.
- Integration testing of all contracts on Ropsten as a supplement to automated testing.
- Validation of on-chain state after every phase of the deployment.
In spite of this, our process had gaps in the following respects which allowed the bug to pass undetected:
- We allowed a different deployment process to be used in production as compared with the process that had been thoroughly tested (in this case, the difference was the addition of a smart contract upgrade after initialization), and then failed to thoroughly test the outcome of the new process (for example, using mainnet forking).
- We failed to keep the validation scripts themselves fully up-to-date with the latest version of the smart contracts. In this case, our validation script neglected to call the getExchangeRate() function to verify the value of the exchange rate.
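The missing check could look roughly like the sketch below. Only the `getExchangeRate()` name comes from the contract as described above. The stub class, the `check_exchange_rate` helper, and the expected value of `10**18` are assumptions made for illustration, and a real script would call the deployed contract (or a mainnet fork) rather than a stub.

```python
EXCHANGE_RATE_BASE = 10**18  # assumed expected initial exchange rate


class StubSafetyModule:
    """Stand-in for a client bound to the deployed contract."""

    def __init__(self, exchange_rate):
        self._exchange_rate = exchange_rate

    def getExchangeRate(self):
        return self._exchange_rate


def check_exchange_rate(module, expected=EXCHANGE_RATE_BASE):
    """Return a list of human-readable failures (empty if the check passes)."""
    actual = module.getExchangeRate()
    if actual != expected:
        return [f"getExchangeRate() returned {actual}, expected {expected}"]
    return []


healthy = check_exchange_rate(StubSafetyModule(EXCHANGE_RATE_BASE))
broken = check_exchange_rate(StubSafetyModule(0))
print(healthy)
print(broken)
```

Returning a list of failures rather than raising on the first one lets a validation run report every mismatch in a single pass.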
Going forward, we will take the following steps to prevent a problem like this from happening again:
- Ensure that tests are always run against the specific deployment process used in production. Any deviation in the deployment process between the tested and production environments could be a source of error. Process changes, even seemingly small ones, must not be made in production without thorough testing.
- Make the up-front investment to ensure that automated smart contract tests can be easily run in any environment: development, testnet, or mainnet fork. In this case, our validation scripts failed to detect the deployment error. Had we been set up to run our unit tests against a mainnet fork, we could have easily caught the bug before it was too late.
We look forward to assisting the community to come to a safe and effective resolution via Governance. Join the discussion on Commonwealth.