Format string attacks
These vulnerabilities are associated with the ‘printf’ family of functions. Yes, you read that right: ‘printf’. Suppose a programmer wants to print a string held in a variable called output. He writes printf(output); instead of printf("%s", output); Now you may argue, “What is the difference between these two, as both of them will compile without any errors?” Imagine that output is set to “%d” in the first printf. printf will dutifully interpret it as a format for printing a decimal integer, and in turn it will go to the stack to grab an integer. Since no integer argument was passed, printf prints whatever value happens to be there. Nothing visibly fails: printf successfully prints a garbage value, so the bug is easy to overlook.
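The difference between the two calls can be sketched in C. This is a minimal illustration, not code from the article: the helper name format_safely and its parameters are assumptions made for the example, and the unsafe variant appears only in a comment because running it with attacker-controlled input is undefined behaviour.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (name and signature are assumptions for this sketch).
 * It copies user-controlled text into out using a fixed "%s" format, so any
 * '%' characters in user_text are treated as plain data. */
int format_safely(char *out, size_t out_size, const char *user_text)
{
    /* Unsafe variant, shown only for contrast:
     *     snprintf(out, out_size, user_text);
     * Here user_text itself would be parsed as the format string. */
    return snprintf(out, out_size, "%s", user_text);
}
```

With the fixed format, an input such as "%d %x %n" is copied into the buffer unchanged. GCC and Clang can also flag the unsafe pattern at compile time when -Wformat-security is enabled.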
Let’s dance in the stack
To understand this attack better, we will walk through an example and trace what happens on the stack.
Reading from stack
Consider an snprintf statement of the form snprintf(buffer, sizeof_buffer, input); In this snprintf statement the user's input is passed where the format string should be, so the call is open to a format string attack. Suppose the attacker enters “%x%x%x” as the input; the statement then becomes snprintf(buffer, sizeof_buffer, "%x%x%x"); Now the attacker's input is interpreted as a format string, and snprintf fetches the next three values from the stack, formats them as hexadecimal and stores the result in the variable buffer. If we now print buffer, for example with printf("%s", buffer); we will see the next three values from the stack in hexadecimal.
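To see what “%x%x%x” produces without relying on undefined behaviour, the sketch below passes three words explicitly. In the attack no such arguments exist, and snprintf takes whatever sits where it expects its arguments to be. The helper name dump_three_words and the sample values are assumptions made for illustration.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats three explicitly supplied words with the same
 * "%x%x%x" format an attacker would inject. In the real attack these three
 * arguments are missing and the values come from adjacent stack contents. */
int dump_three_words(char *out, size_t out_size,
                     unsigned int w1, unsigned int w2, unsigned int w3)
{
    return snprintf(out, out_size, "%x%x%x", w1, w2, w3);
}
```

Calling dump_three_words with 0xdeadbeef, 0x1 and 0xff leaves the string "deadbeef1ff" in the buffer, which shows how the leaked words run together unless the attacker adds separators to the format.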
Writing to the stack
So far we have been reading the contents of the stack by passing a format string as user input, but we can also write to the stack. Let’s see how. The “%n” format stores the number of characters written before %n was encountered into the integer that the corresponding argument points to. For example, consider a printf call such as printf("Hello %n", &test); which prints six characters before reaching %n. This will load the number 6 into the memory location of test. Notice that we have just written to the memory location of the variable ‘test’ using a printf. Now let’s try a more complex example. Suppose there is some value on the stack that the attacker wants to change, in a program that declares an integer ‘a’ set to 1 and a buffer, and passes user input to snprintf as the format string. The corresponding stack holds the arguments to snprintf, followed by ‘a’, followed by the buffer.
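The basic behaviour of %n can be checked with a well-defined call in which the pointer argument is supplied explicitly. This is a small sketch under stated assumptions: the string "Hello " and the function name count_before_n are chosen for illustration, and the C library is assumed to honour %n (some libraries disable or reject it by default).

```c
#include <stdio.h>

/* Hypothetical demo: returns the value that %n stores after six characters
 * ("Hello ") have been produced. The format is a string literal and the
 * pointer argument is supplied, so the call is well defined. */
int count_before_n(void)
{
    int test = 0;
    char scratch[16];

    /* %n writes the number of characters produced so far into test. */
    snprintf(scratch, sizeof scratch, "Hello %n", &test);
    return test;
}
```

On a library that supports %n, count_before_n returns 6, the length of "Hello ".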
Now let’s say the value to change is at address 0xaffbfca0. The attacker can find this address by looking at the source code or by printing the contents of the stack. In this example, let’s say the user input is “\xa0\xfc\xfb\xaf%d%n”, so that the snprintf statement becomes snprintf(buffer, sizeof_buffer, "\xa0\xfc\xfb\xaf%d%n") and the stack will look like this:
Let’s understand this input from the user. Note that “\” is the escape character and x indicates a hexadecimal number. Each pair of hexadecimal digits is translated into one character (one byte), so the input begins with 4 characters. After these the attacker enters %d, which means print a decimal integer, but which integer will it print? Look at the stack diagram above: the next value on the stack after the input is the integer ‘a’, which is set to 1, so there are now a total of five characters in the buffer (4 bytes + 1 decimal digit). The buffer on the stack now holds these five characters.
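The count of five can be reproduced with a well-defined call that passes the integer explicitly instead of letting snprintf pick it up from the stack. The helper name chars_before_percent_n is an assumption made for this sketch, and the %n itself is left out so that only the counting step is shown.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: formats the four address bytes from the attacker's
 * input followed by the integer a. Passing a explicitly keeps the call well
 * defined. The return value is the number of characters produced, which is
 * the value a trailing %n would store. */
int chars_before_percent_n(char *buffer, size_t buffer_size, int a)
{
    return snprintf(buffer, buffer_size, "\xa0\xfc\xfb\xaf%d", a);
}
```

With a set to 1 the function returns 5: four bytes for the address plus the single digit "1".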
After this, the next thing in the input is ‘%n’. As stated earlier, this format stores the number of characters written so far, which in this case is 5, at the memory location given by the next argument. Which memory location, one would ask? snprintf looks for the next argument. Normally it would be provided, but in this case there is no such argument, so snprintf looks at the stack and picks the next item, which is the buffer. The buffer begins with “\xa0\xfc\xfb\xaf”, which in memory is interpreted as the address 0xaffbfca0 (because it is read as little endian), and thus the value 5 is written to that location.
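The little-endian reading of those four bytes can be verified with a few shifts. The sketch below assembles the value by hand, so it gives the same answer on any host; the function name read_le32 is an assumption made for illustration.

```c
#include <stdint.h>

/* Hypothetical helper: interprets four bytes as a 32-bit little-endian value,
 * i.e. the first byte in memory is the least significant one. */
uint32_t read_le32(const unsigned char bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}
```

Feeding it the bytes a0, fc, fb, af in that order yields 0xaffbfca0, which is why the attacker writes the target address with its bytes reversed.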
So we have seen how a number can be written to a memory location. This memory location can be where the return address resides; by overwriting it, attackers can take control of the program.
Format string attack detection & defenses
The following section describes how we can detect and defend against format string attacks.
Whenever user input contains a backslash (\), %x, %d or %n, it is likely that a format string attack is under way. The best way to defend against a format string attack is to make sure the programmer always includes a format string in printf, sprintf, fprintf and snprintf function calls. Deploy patches whenever they are available.
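As a rough illustration of the detection idea, the sketch below flags input that contains a backslash (the escape character used in the attacker's input earlier) or one of the specifiers %x, %d and %n. It is only a heuristic under stated assumptions: the function name looks_like_format_probe is hypothetical, other specifiers and width modifiers are not covered, and harmless input can trigger it. A fixed format string in every call remains the actual defense.

```c
#include <stddef.h>

/* Hypothetical heuristic: returns 1 if input contains a backslash or one of
 * the specifiers %x, %d, %n, and 0 otherwise. It does not try to recognise
 * every possible format specifier. */
int looks_like_format_probe(const char *input)
{
    for (size_t i = 0; input[i] != '\0'; i++) {
        if (input[i] == '\\')
            return 1;
        if (input[i] == '%') {
            /* input[i + 1] is at worst the terminating '\0', so this read
             * stays inside the string. */
            char next = input[i + 1];
            if (next == 'x' || next == 'd' || next == 'n')
                return 1;
        }
    }
    return 0;
}
```

A check like this would suit logging or alerting on suspicious requests; it should not replace fixing the vulnerable call itself.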
So in this article we have seen how a small function like printf can lead to serious issues if it is not handled correctly and securely.