- \documentclass [10pt]{article}
-
-
- \usepackage{latexsym}
- \usepackage{amssymb}
- \usepackage{epsfig}
- \usepackage{fullpage}
- \usepackage{enumerate}
- \usepackage{xspace}
- \usepackage{todonotes}
- \usepackage{listings}
- \usepackage{url}
- \usepackage[ruled,linesnumbered]{algorithm2e} % Enables the writing of pseudo code.
- \usepackage{float}% http://ctan.org/pkg/float
-
- \newcommand{\true}{true}
- \newcommand{\false}{false}
- \pagestyle{plain}
- \bibliographystyle{plain}
-
-
- \title{192.127 Seminar in Software Engineering (Smart Contracts) \\
- SWC-124: Write to Arbitrary Storage Location}
- \author{Exercises}
-
- \date{WT 2023/24}
-
- \author{\textbf{Ivanov, Ivaylo (11777707) \& Millauer, Peter (01350868)}}
-
- \newtheorem{theorem}{Theorem}
- \newtheorem{lemma}[theorem]{Lemma}
- \newtheorem{corollary}[theorem]{Corollary}
- \newtheorem{proposition}[theorem]{Proposition}
- \newtheorem{conjecture}[theorem]{Conjecture}
- \newtheorem{definition}[theorem]{Definition}
- \newtheorem{example}[theorem]{Example}
- \newtheorem{remark}[theorem]{Remark}
- \newtheorem{exercise}[theorem]{Exercise}
-
-
- \renewcommand{\labelenumi}{(\alph{enumi})}
-
- \usepackage{xcolor}
-
- \definecolor{codegreen}{rgb}{0,0.6,0}
- \definecolor{codegray}{rgb}{0.5,0.5,0.5}
- \definecolor{codepurple}{rgb}{0.58,0,0.82}
- \definecolor{backcolour}{rgb}{0.95,0.95,0.92}
-
- \lstdefinestyle{mystyle}{
- backgroundcolor=\color{backcolour},
- commentstyle=\color{codegreen},
- keywordstyle=\color{magenta},
- numberstyle=\tiny\color{codegray},
- stringstyle=\color{codepurple},
- basicstyle=\ttfamily\footnotesize,
- breakatwhitespace=false,
- breaklines=true,
- captionpos=b,
- keepspaces=true,
- numbers=left,
- numbersep=5pt,
- showspaces=false,
- showstringspaces=false,
- showtabs=false,
- tabsize=2
- }
-
-
-
- \begin{document}
-
-
- \maketitle
-
- \section{Weakness and consequences}
-
- \subsection{Solidity storage layout}
-
- Any contract's storage is a contiguous address space of $2^{256}$ slots, each holding a 32-byte (256-bit) value. In order to implement dynamically sized data structures like maps and arrays, Solidity places their entries at pseudo-random locations. Due to the vast 256-bit range of addresses, collisions are statistically extremely improbable and of little practical relevance in safely implemented contracts.
-
- \medspace
-
- In the case of a dynamic array at variable slot $p$, data is written to contiguous locations starting at $keccak(p)$. Slot $p$ itself holds the array length as a \texttt{uint256} value. Even enormous arrays are unlikely to produce collisions due to the vast address space. However, an improperly managed array may store data at an unbounded, user-controlled offset, thereby allowing arbitrary data to be overwritten.
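-
- As a small worked formula, and assuming each element occupies one full slot (as is the case for \texttt{uint256} elements), element $i$ of a hypothetical array $arr$ declared at slot $p$ would be stored at
- \[
- storage\_address(arr[i]) = \bigl(keccak(p) + i\bigr) \bmod 2^{256},
- \]
- where the reduction modulo $2^{256}$ reflects that slot addresses wrap around at the end of the address space.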
-
- \medspace
-
- For maps stored in variable slot $p$, the data for key $k$ can be found at $keccak(k . p)$, where $.$ is the concatenation operator. This approach is statistically safe: for a known stored variable $x$, the chance of finding a key $k$ such that $keccak(k . p) == storage\_address(x)$ is about one in $2^{256}$ per attempt, provided that $keccak$ is a cryptographically secure hash function, as is widely believed.
-
- \subsection{The Weakness}
-
- Any unchecked array write is potentially dangerous, as the storage location of every variable is publicly known and an unconstrained array index can be chosen to target it. Given the known array slot $p$ and a target variable $x$, the attacker computes the offset $o$ such that $keccak(p) + o == storage\_address(x)$, where the addition wraps around modulo $2^{256}$.
-
- \medspace
-
- A trivial example of such a vulnerable write operation is shown in Algorithm~\ref{alg:vuln-write}.
-
- \lstset{style=mystyle}
- \begin{algorithm}[H]
- \begin{lstlisting}[language=Octave]
- pragma solidity 0.4.25;
-
- contract MyContract {
-     address private owner;
-     uint[] private arr;
-
-     constructor() public {
-         arr = new uint[](0);
-         owner = msg.sender;
-     }
-
-     // No restriction on who may call this or which index is written.
-     function write(uint index, uint value) public {
-         arr[index] = value;
-     }
- }
- \end{lstlisting}
- \caption{A completely unchecked array write}
- \label{alg:vuln-write}
- \end{algorithm}
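-
- As a worked instance of this computation, assume that the compiler assigns slots in declaration order, so that \texttt{owner} occupies slot $0$ and \texttt{arr} occupies slot $p = 1$ in Algorithm~\ref{alg:vuln-write} (an assumption for illustration; the actual layout should be confirmed for the compiler version in use). The offset that targets \texttt{owner} would then be
- \[
- o \equiv storage\_address(owner) - keccak(p) \equiv 0 - keccak(1) \equiv 2^{256} - keccak(1) \pmod{2^{256}},
- \]
- so that $keccak(1) + o$ wraps around the end of the address space and lands on slot $0$.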
-
- \medspace
-
- In the following example (Algorithm~\ref{alg:pop-incorrect}), the $pop$ function incorrectly checks for an array $length >= 0$, a condition that is always true for an unsigned integer, thereby allowing the $length$ value to underflow when $pop$ is called on an empty array. Once this weakness is triggered, $update$ in Algorithm~\ref{alg:pop-incorrect} behaves just like $write$ did in Algorithm~\ref{alg:vuln-write}.
-
- \medspace
-
- \lstset{style=mystyle}
- \begin{algorithm}[H]
- \begin{lstlisting}[language=Octave]
- pragma solidity 0.4.25;
-
- contract MyContract {
-     address private owner;
-     uint[] private arr;
-
-     constructor() public {
-         arr = new uint[](0);
-         owner = msg.sender;
-     }
-
-     function push(uint value) public {
-         // Grow the array first, then write into the new last element.
-         arr.length++;
-         arr[arr.length - 1] = value;
-     }
-
-     function pop() public {
-         // Flawed check: always true for an unsigned length.
-         require(arr.length >= 0);
-         arr.length--;
-     }
-
-     function update(uint index, uint value) public {
-         require(index < arr.length);
-         arr[index] = value;
-     }
- }
- \end{lstlisting}
- \caption{An incorrectly managed array length}
- \label{alg:pop-incorrect}
- \end{algorithm}
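-
- The underflow can be written out explicitly. Assuming unchecked 256-bit unsigned arithmetic, as in the compiler version named in the listing, calling $pop$ on an empty array gives
- \[
- length = 0 \quad\Longrightarrow\quad length - 1 \equiv 2^{256} - 1 \pmod{2^{256}}.
- \]
- After that, the check $index < length$ in $update$ holds for every $index \leq 2^{256} - 2$, so it no longer restricts which storage slot $keccak(p) + index$ (taken modulo $2^{256}$) is written.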
-
- Another weakness that allows arbitrary storage access is unchecked assembly code. Assembly is a powerful tool that lets developers get as close to the EVM as possible,
- but it can also be very dangerous when not tested thoroughly. As per the documentation\footnote{\url{https://docs.soliditylang.org/en/latest/assembly.html}}: \textit{``this [inline assembly]
- bypasses important safety features and checks of Solidity. You should only use it for tasks that need it, and only if you are confident with using it.''}
- When given access to such low-level primitives, a programmer can introduce not only weaknesses similar to the ones described previously, but also others, such as overwriting map locations,
- contract variables, etc.
-
- An example of such a weakness is given in Algorithm~\ref{alg:unchecked-assembly}.
-
- \medspace
-
- \lstset{style=mystyle}
- \begin{algorithm}[H]
- \begin{lstlisting}[language=Octave]
- pragma solidity 0.4.25;
-
- contract MyContract {
-     address private owner;
-     mapping(address => bool) public managers;
-
-     constructor() public {
-         owner = msg.sender;
-         setNextManager(msg.sender);
-     }
-
-     function setNextManager(address next) internal {
-         uint256 slot;
-         assembly {
-             // Writes to the base slot of the mapping itself.
-             slot := managers.slot
-             sstore(slot, next)
-         }
-
-         // The location does not depend on next, so it is the same on every call.
-         bytes32 location = keccak256(abi.encode(160, uint256(slot)));
-         assembly {
-             sstore(location, true)
-         }
-     }
-
-     function registerUser(address user) public {
-         require(msg.sender == owner);
-         setNextManager(user);
-     }
-
-     function cashout() public {
-         require(managers[msg.sender]);
-         address payable manager = msg.sender;
-         manager.transfer(address(this).balance);
-     }
- }
- \end{lstlisting}
- \caption{An unchecked assembly write to mapping}
- \label{alg:unchecked-assembly}
- \end{algorithm}
-
- The contract has a \texttt{managers} mapping, which is intended to be used as a stack.
- The developer has added the \texttt{setNextManager} function, which is supposed to place the latest user on top of the stack as a manager.
- The issue is that the function is implemented in such a way that the stack never grows and the first element is always overwritten. This arises from the fact that the storage slot
- of the \texttt{managers} mapping does not point to the top of the stack, but to its base.
- The function then uses this slot directly, without calculating any offset, and thereby overwrites the base of the stack. With social engineering, an attacker can persuade the
- owner to set them as a manager, which exploits the weakness directly and causes the owner to give up their own management rights.
- \subsection{Consequences}
-
- The consequences of exploiting an arbitrary storage access weakness can be of different types and severity.
- An attacker may gain read-write access to private contract data, which should only be accessible to owners, maintainers etc.
- They may also exploit the contract to circumvent authorization checks and drain the contract funds.
- According to Li Duan et al.~\cite{multilayer}, an attacker may also be able to destroy the contract storage structure and thus cause
- unexpected program flow, abnormal function execution or contract freeze.
-
- \section{Vulnerable contracts in literature}
-
- One example of a vulnerable contract, similar to Algorithm~\ref{alg:pop-incorrect}, is given in the paper by Li Duan et al.~\cite{multilayer}:
-
- \medspace
-
- \lstset{style=mystyle}
- \begin{algorithm}[H]
- \begin{lstlisting}[language=Octave]
- function PopBonusCode() public {
-     require(0 <= bonusCodes.length);
-     bonusCodes.length--;
- }
-
- function UpdateBonusCodeAt(uint idx, uint c) public {
-     require(idx < bonusCodes.length);
-     bonusCodes[idx] = c;
- }
- \end{lstlisting}
- \caption{Arbitrary write as per Li Duan et al.}
- \label{alg:multilayer-example}
- \end{algorithm}
-
- We will not go into a detailed explanation, as we already did this in the previous section.
-
-
- A more sophisticated example is presented in the paper by Sukrit Kalra et al.~\cite{Kalra2018ZEUSAS}:
-
- \medspace
-
- \lstset{style=mystyle}
- \begin{algorithm}[H]
- \begin{lstlisting}[language=Octave]
- uint payout = balance/participants.length;
- for (var i = 0; i < participants.length; i++)
-     participants[i].send(payout);
- \end{lstlisting}
- \caption{Arbitrary read as per Sukrit Kalra et al.}
- \label{alg:zeus-example}
- \end{algorithm}
-
- The vulnerability here is an integer overflow. Because the type of the variable \texttt{i} is inferred from its initial value, it gets the smallest type able to hold the value 0, namely \texttt{uint8}, which can hold non-negative integers up to 255.
-
- Because of this, if the length of the \texttt{participants} array is greater than 255, the counter overflows when it is incremented past 255 and, instead of moving on to \texttt{participants[256]}, the loop returns to the first element of the array. As a result, the first 256 participants (indices 0 to 255) split the payouts among themselves, whereas the rest get nothing.
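-
- The wrap-around can be made explicit. Writing $i_n$ for the value of the counter after $n$ increments, and assuming unchecked 8-bit unsigned arithmetic for \texttt{uint8},
- \[
- i_n = n \bmod 2^{8}, \qquad \mbox{so} \quad i_{255} = 255 \quad \mbox{and} \quad i_{256} = 0.
- \]
- The counter therefore never exceeds $255$, and for an array with more than $255$ entries the condition \texttt{i < participants.length} is never false.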
-
- \section{Code properties and automatic detection}
-
- Automatic detection tools can be broadly categorized into those employing static analysis and those using fuzzing, i.e., the application of semi-random inputs. Notable static analysis tools include Securify~\cite{securify} and teEther~\cite{teether}, which both function in a similar manner:
-
- \medspace
-
- Initially, the given EVM bytecode is disassembled into a control-flow graph (CFG). In the second step, the tools identify potentially risky instructions. In the case of arbitrary writes, the instruction of note is $sstore(k,v)$ where both $k$ and $v$ are input-controlled. The tools differ in the way they determine whether or not the values are input-controlled.
-
- \medspace
-
- In the case of Securify~\cite{securify}, the CFG is translated into what the authors call ``semantic facts'', to which an elaborate set of so-called security patterns is applied. These patterns consist of building blocks in the form of predicates, which allows the tool to generate its output directly from the (transitively) matched patterns.
-
- \medspace
-
- teEther~\cite{teether} employs a similar approach, but instead the authors opt to build a graph of dependent variables. If the graph arrives at a $sstore(k,v)$ instruction and a path can be found leading to user-controlled inputs, the tool infers a set of constraints which are then used to automatically generate an exploit.
-
- \medspace
-
- The fuzz-driven approach to vulnerability detection is more abstract, as general-purpose fuzzing tools generally have no knowledge of the analysed program. With the tool SmartFuzzDriverGenerator~\cite{fuzzdrivegen}, a multitude of such fuzzing libraries can be used. The problem, however, is that a fuzzer cannot interface with a smart contract out of the box. The ``glue'' between fuzzer and program is called a driver, hence the name ``driver generator''.
-
- \medspace
-
- SmartFuzzDriverGenerator aims to generate such a driver automatically. %TODO: I have no idea how it does this actually%
-
- \medspace
-
- The Smartian tool~\cite{smartian} attempts to find a middle ground between static and dynamic analysis by first transforming the EVM bytecode into control-flow facts. Based on this information, a set of seed inputs is generated that is expected to have a high probability of yielding usable results. Should no exploit be found, the seed inputs are then mutated in order to achieve higher code coverage. %TODO: This is probably extemely inprecise and should be re-written%
-
- \section{Exploit sketch}
-
- \cite{doughoyte}
- %TODO: just explain what this guy does: https://github.com/Arachnid/uscc/tree/master/submissions-2017/doughoyte%
-
-
- \bibliography{exercise.bib}
-
- \end{document}
-