Why gets() is bad / Buffer Overflows

This item was added on: 2003/03/31

When people are introduced to C, they are often shown the `gets()` function as a way to read input from the user/keyboard. It appears that some teachers are also quite insistent that their pupils continue to use it. Well, this is OK for a day one lesson, but `gets()` has an inherent problem that causes most coders to avoid using it. This quick overview will hopefully help new coders understand the problem, how to get around it, and how it might also affect their own functions.

The problem

First, let's look at the prototype for this function:

`#include <stdio.h>`
`char *gets(char *s);`

You can see that the one and only parameter is a char pointer. So then, if we make an array like this:

`char buf[100];`

we could pass it to `gets()` like so:

`gets(buf);`

So far, so good. Or so it seems... but really our problem has already begun. `gets()` has only received the name of the array (a pointer). It does not know how big the array is, and it is impossible to determine this from the pointer alone. When the user enters their text, `gets()` will read the whole line into the array. This will be fine if the user is sensible and enters no more than 99 characters, which leaves room for the null terminator. However, if they enter more than 99, `gets()` will not stop writing at the end of the array. Instead, it continues writing past the end and into memory it doesn't own.

This problem may manifest itself in a number of ways:

- No visible effect whatsoever
- Immediate program termination (a crash)
- Termination at a later point in the program's lifetime (maybe 1 second later, maybe 15 days later)
- Termination of another, unrelated program
- Incorrect program behaviour and/or calculation

... and the list goes on. This is the problem with "buffer overflow" bugs: you just can't tell when and how they'll bite you.

A demonstration

Here is some sample code showing this problem. The output is subject to change due to its unpredictable nature.
```c
#include <stdio.h>

typedef struct MyStruct
{
  char buf[5];  /* Room for 4 characters plus the null terminator */
  int i;        /* Sits after buf in memory, so an overflow can reach it */
} MyStruct_t;

int main(void)
{
  MyStruct_t my;

  my.i = 10;
  printf ("my.i is %d\n", my.i);

  printf ("Enter a 10 digit number:");  /* Too big on purpose */
  gets(my.buf);                         /* No size is passed, so nothing stops the overflow */

  printf ("my.buf is >%s<\n", my.buf);
  printf ("my.i is %d\n", my.i);

  return 0;
}

/*
 * Output (on my BCC 5.5 compiler)
 *
 * my.i is 10
 * Enter a 10 digit number:1234567890
 * my.buf is >1234567890<
 * my.i is 12345
 */
```

As you can see, the input buffer is 5 bytes in length (4 data, plus one for the null terminator). The initial value of the int within the structure is set to 10, but after the `gets()` function has been called, the value has been changed: the extra characters were written past the end of `buf` and over the memory holding `i`.

Go here for more on buffer overflows and other security vulnerabilities.

A resolution

To get around this problem, ensure you use a more secure function for performing reads. For example, `fgets()` is a buffer safe function, because it is told the size of the buffer and never writes more than that many bytes into it. Its prototype is:

`#include <stdio.h>`
`char *fgets(char *s, int size, FILE *stream);`

There are some examples here, but for ease, here is a quick sample:

`fgets(buf, sizeof(buf), stdin);`

Written by Hammer
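As a slightly fuller sketch of the `fgets()` approach above, the helper below wraps the call so that the trailing newline is removed and an over-long line is reported rather than silently split across reads. The name `read_line` and its return codes are made up for this illustration and are not part of the standard library. Note that the helper takes the buffer size as a parameter, which is the same rule your own functions need to follow if they write into a caller's array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * read_line: hypothetical helper, shown only as an illustration.
 * Reads one line from stream into buf, writing at most size bytes
 * (including the null terminator) and stripping the newline.
 *
 * Returns  1 if the whole line fitted in buf,
 *          0 if the line was too long and the excess was discarded,
 *         -1 on end of file or read error (buf is not usable).
 */
int read_line(char *buf, int size, FILE *stream)
{
  size_t len;
  int ch;
  int discarded = 0;

  /* Passing no buffer or a useless size is a programming error */
  assert(buf != NULL && size > 1);

  if (fgets(buf, size, stream) == NULL)
    return -1;

  len = strlen(buf);
  if (len > 0 && buf[len - 1] == '\n')
  {
    buf[len - 1] = '\0';  /* Complete line: drop the newline */
    return 1;
  }

  /* No newline was stored: throw away whatever is left of this line */
  while ((ch = getc(stream)) != EOF && ch != '\n')
    discarded = 1;

  return discarded ? 0 : 1;
}
```

A call such as `read_line(my.buf, sizeof(my.buf), stdin)` in the demonstration program would store at most 4 characters plus the terminator and leave `my.i` untouched; a return value of 0 tells the caller that the user typed more than the buffer could hold.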
