Strings

Working with strings in C. String manipulation functions from the `string.h` library.


Safe String Handling in C

C is a powerful language, but its manual memory management and lack of built-in bounds checking make it prone to vulnerabilities, especially concerning string handling. Buffer overflows, format string bugs, and other issues can arise from improper string manipulation. This document outlines best practices to mitigate these risks and ensure safer string operations.

The Importance of Safe String Handling

String handling vulnerabilities can lead to serious security breaches. A buffer overflow, for example, occurs when you write data beyond the allocated memory boundary of a string buffer. This can overwrite adjacent memory, potentially corrupting data, crashing the program, or, worse, allowing an attacker to inject and execute arbitrary code. Therefore, understanding and implementing safe string handling techniques is crucial for writing secure C code.

Understanding Buffer Overflows

A buffer overflow happens when a program writes data beyond the allocated boundaries of a buffer. Consider this example:

#include <stdio.h>
#include <string.h>

int main() {
    char buffer[10];  // Allocate a buffer of size 10
    char input[] = "This is a very long string";

    strcpy(buffer, input); // Vulnerable: strcpy doesn't check bounds

    printf("Buffer contents: %s\n", buffer);
    return 0;
} 

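In this code, strcpy copies the contents of input into buffer. The string in input occupies 27 bytes including its terminating null byte, but buffer holds only 10, so strcpy writes 17 bytes past the end of buffer. The behavior is undefined: the program may crash, silently corrupt adjacent data, or become exploitable. This is a significant security risk.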

Safe String Handling Practices

The key to safe string handling in C is to always be mindful of buffer sizes and avoid functions that don't provide bounds checking. Here are some recommended practices:

1. Avoid Unsafe Functions

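Functions like strcpy, strcat, sprintf, and gets perform no bounds checking: they keep writing until the source data ends, whatever the size of the destination. gets cannot be used safely at all, because it has no way to learn the size of its buffer, and it was removed from the language in C11. Avoid the others unless you have already verified that the result fits in the destination.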
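
As a rough illustration of what "verifying that the result fits" can look like, the sketch below wraps the length check and the copy in one function. The name copy_checked and its return convention (0 on success, -1 on failure) are assumptions made for this example, not a standard API.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy src into dst only if it fits entirely,
// including the terminating null byte.
// Returns 0 on success, -1 if an argument is invalid or src is too long.
// On failure dst is left holding an empty string where possible.
int copy_checked(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0) {
        return -1;
    }
    if (src == NULL) {
        dst[0] = '\0';
        return -1;
    }

    size_t src_len = strlen(src);
    if (src_len >= dst_size) {
        dst[0] = '\0';              // refuse to copy rather than overflow
        return -1;
    }

    memcpy(dst, src, src_len + 1);  // +1 copies the terminator as well
    return 0;
}
```

Called as copy_checked(buffer, sizeof(buffer), input) with the 10-byte buffer from the overflow example, this version would return -1 instead of writing past the end.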

2. Use snprintf for Safe Formatting

snprintf is a safer alternative to sprintf. Its second argument is the size of the destination buffer: snprintf writes at most that many bytes, including the terminating null byte, which prevents overflows.

#include <stdio.h>

int main() {
    char buffer[20];
    int value = 12345;

    // Use snprintf to safely format the string
    int result = snprintf(buffer, sizeof(buffer), "Value: %d", value);

    if (result < 0) {
        fprintf(stderr, "Error in snprintf\n");
        return 1;
    } else if ((size_t)result >= sizeof(buffer)) {  // result is non-negative here, so the cast is safe
        fprintf(stderr, "String truncated by snprintf\n");
        // Handle truncation appropriately (e.g., reallocate buffer, truncate data)
    } else {
        printf("Buffer contents: %s\n", buffer);
    }
    return 0;
} 

Important considerations with `snprintf`:

  • Always provide the size of the destination buffer as the second argument.
  • Check the return value of `snprintf`.
  • A negative return value indicates an encoding error.
  • If the return value is greater than or equal to the buffer size, the output was truncated. You *must* handle this condition appropriately. Possible actions include reallocating the buffer to a larger size, truncating the data manually, or returning an error. Ignoring truncation can lead to data loss or other unexpected behavior.
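
One way to handle truncation by sizing the buffer to fit is sketched below. It relies on the C99 rule that snprintf called with a size of 0 writes nothing and returns the length the full output would have. The function name format_value_alloc is hypothetical, and the format string is the one from the example above.

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical helper: format "Value: <n>" into a heap buffer that is
// exactly large enough. Returns NULL on failure; the caller must free
// the result.
char *format_value_alloc(int value)
{
    // First pass: a size of 0 writes nothing and reports the needed length.
    int needed = snprintf(NULL, 0, "Value: %d", value);
    if (needed < 0) {
        return NULL;
    }

    size_t size = (size_t)needed + 1;   // +1 for the terminating null byte
    char *out = malloc(size);
    if (out == NULL) {
        return NULL;
    }

    // Second pass: the buffer is now known to be large enough.
    int written = snprintf(out, size, "Value: %d", value);
    if (written < 0 || (size_t)written >= size) {
        free(out);
        return NULL;
    }
    return out;
}
```

The second check looks redundant, but it keeps the function correct if the two calls are ever edited to use different format strings or arguments.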

3. Use Bounded String Copying (strncpy)

strncpy is a bounded version of strcpy, but it has a subtle pitfall: if the length of the source string is greater than or equal to the specified size, strncpy does not null-terminate the destination string. (When the source is shorter, it fills the rest of the specified size with null bytes.)

#include <stdio.h>
#include <string.h>

int main() {
    char buffer[10];
    char input[] = "This is a long string";

    strncpy(buffer, input, sizeof(buffer) - 1); // Copy at most 9 characters

    buffer[sizeof(buffer) - 1] = '\0'; // Manually null-terminate

    printf("Buffer contents: %s\n", buffer);
    return 0;
} 

To use strncpy safely, you must:

  • Specify a maximum length that is one less than the buffer size.
  • Always manually null-terminate the destination buffer: buffer[sizeof(buffer) - 1] = '\0';
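
The two rules above can be folded into a small wrapper so that they are never forgotten at a call site. This is only a sketch: copy_truncating is a made-up name, and reporting truncation through a bool return value is one possible design among several.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy src into dst, truncating if necessary, and
// always leave dst null-terminated.
// Returns true if src fit completely, false if it was truncated or an
// argument was invalid.
bool copy_truncating(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return false;
    }

    strncpy(dst, src, dst_size - 1);  // copy at most dst_size - 1 characters
    dst[dst_size - 1] = '\0';         // strncpy may not have terminated dst

    return strlen(src) < dst_size;    // false means characters were dropped
}
```

With a 10-byte destination and the input "This is a long string", the destination ends up holding "This is a" and the function returns false.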

4. Use Bounded String Concatenation (strncat)

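Similar to strncpy, strncat offers a bounded version of strcat. It appends at most `n` characters from a source string to the end of a destination string and then writes a terminating null byte. Note that `n` limits the number of characters taken from the source, not the total size of the destination. It is generally safer than `strcat`, but proper usage is still required.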

#include <stdio.h>
#include <string.h>

int main() {
    char buffer[20] = "Hello, ";
    char input[] = "world!";

    // Append at most the free space left in buffer, keeping one byte for the terminator.
    strncat(buffer, input, sizeof(buffer) - strlen(buffer) - 1);
    buffer[sizeof(buffer) - 1] = '\0'; // Redundant (strncat always terminates), kept as a defensive habit

    printf("Buffer contents: %s\n", buffer);
    return 0;
} 

Important considerations for `strncat`:

  • The destination buffer *must* be null-terminated before calling `strncat`.
  • The third argument should be calculated as the remaining space in the buffer minus one for the null terminator: `sizeof(buffer) - strlen(buffer) - 1`.
  • `strncat` always writes a terminating null byte after the appended characters, so it can write up to `n + 1` bytes in total. This is why the `- 1` in the calculation above matters. Manual termination afterwards is redundant but harmless.
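
The considerations above can be combined into one helper, sketched here under a few assumptions: the name append_truncating is hypothetical, the caller passes the total capacity of the destination, and memchr is used instead of strlen so that a destination without a terminator is detected rather than read past.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: append src to the string already stored in dst.
// dst_size is the total capacity of dst in bytes.
// Returns true if all of src was appended, false if it was truncated or
// an argument was invalid (including a dst with no terminator).
bool append_truncating(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0) {
        return false;
    }

    // Locate the terminator without reading more than dst_size bytes.
    const char *end = memchr(dst, '\0', dst_size);
    if (end == NULL) {
        return false;                    // dst is not a valid string
    }

    size_t used = (size_t)(end - dst);
    size_t room = dst_size - used - 1;   // free bytes, excluding the terminator

    strncat(dst, src, room);             // writes at most room bytes plus '\0'
    return strlen(src) <= room;
}
```

Starting from a 20-byte buffer holding "Hello, ", appending "world!" succeeds, while a later attempt to append more than the six remaining characters is truncated and reported as false.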

5. Functions from the BSD Family: strlcpy and strlcat (If Available)

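The strlcpy and strlcat functions are often considered safer alternatives. They originated in OpenBSD and are *not* part of ISO C, so you must check whether they are available on your target platform: the BSDs and macOS provide them, glibc added them in version 2.38, and on older Linux systems the libbsd library supplies them. Both functions null-terminate the destination whenever the size argument is nonzero, and both return the length of the string they tried to create. For strlcpy that is the length of the source; for strlcat it is the initial length of the destination plus the length of the source. A return value greater than or equal to the buffer size means the result was truncated.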

#include <stdio.h>
#include <string.h>     // Declares strlcpy on the BSDs, macOS, and glibc 2.38 or later
#include <bsd/string.h> // Only for libbsd on older Linux (link with -lbsd); omit it elsewhere

int main() {
    char buffer[10];
    char input[] = "This is a very long string";

    size_t result = strlcpy(buffer, input, sizeof(buffer));

    if (result >= sizeof(buffer)) {
        printf("String truncated\n");
        // Handle truncation appropriately
    }

    printf("Buffer contents: %s\n", buffer);
    return 0;
} 

Note: If strlcpy and strlcat are not available, you may need to find alternative implementations or use the strncpy/strncat patterns described above.
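
For platforms without these functions, a fallback with the same contract as strlcpy is short enough to write by hand. The version below is a sketch rather than the BSD source; it is named my_strlcpy (a made-up name) so that it cannot clash with a system-provided strlcpy.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical fallback with strlcpy semantics: copy up to dst_size - 1
// characters, always terminate when dst_size > 0, and return strlen(src)
// so the caller can detect truncation.
size_t my_strlcpy(char *dst, const char *src, size_t dst_size)
{
    size_t src_len = strlen(src);

    if (dst_size > 0) {
        // Copy as much of src as fits while leaving room for '\0'.
        size_t n = (src_len < dst_size - 1) ? src_len : dst_size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return src_len;
}
```

As with the real function, a return value greater than or equal to the size passed in means the copy was truncated.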

6. Check Buffer Sizes

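Always know the size of your buffers and ensure that you don't write beyond their boundaries. This often involves using the sizeof operator and carefully calculating the amount of space available. Keep in mind that sizeof yields the buffer size only when applied to an array in the scope where it is declared. Applied to a pointer, including an array received as a function parameter, it yields the size of the pointer, so functions should receive the buffer size as a separate argument.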
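
A common way to keep sizes honest is to compute them with sizeof where the array is declared and pass them along explicitly. The sketch below assumes a hypothetical ARRAY_LEN macro and a hypothetical write_label function; neither comes from a standard header.

```c
#include <stddef.h>
#include <stdio.h>

// Number of elements in a true array. This gives a wrong answer if it is
// applied to a pointer, so use it only where the array is declared.
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

// Inside this function dst is a pointer, so sizeof(dst) would be the size
// of a pointer, not of the caller's array. The capacity is therefore
// passed in as dst_size.
// Returns 0 on success, -1 if the label did not fit or formatting failed.
int write_label(char *dst, size_t dst_size, int id)
{
    int n = snprintf(dst, dst_size, "item-%d", id);
    return (n >= 0 && (size_t)n < dst_size) ? 0 : -1;
}
```

A caller that declares char label[8] would invoke write_label(label, sizeof(label), 42), computing the size at the one place where the compiler still knows it.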

7. Use Dynamic Memory Allocation with Caution

If you don't know the required string length at compile time, use dynamic memory allocation (malloc, calloc). Allocate one byte more than the string length to leave room for the terminating null byte, and *always* check that the allocation succeeds. When you are finished with dynamically allocated memory, release it using free to prevent memory leaks.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main() {
    size_t length = 50; // Desired string length
    char *buffer = malloc(length); // sizeof(char) is 1 by definition; no cast is needed in C

    if (buffer == NULL) {
        fprintf(stderr, "Memory allocation failed!\n");
        return 1;
    }

    snprintf(buffer, length, "This is a dynamically allocated string.");
    printf("Buffer contents: %s\n", buffer);

    free(buffer); // Free the allocated memory
    buffer = NULL; // Set the pointer to NULL to prevent dangling pointer issues
    return 0;
} 
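
When the length really is known only at run time, the allocation size has to be computed from the data. The following sketch joins two strings into a new heap buffer; join_strings is a hypothetical name, and the wrap-around check is included because an unchecked size calculation can itself cause an undersized allocation.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper: return a newly allocated string holding a followed
// by b, or NULL on invalid arguments, size overflow, or allocation failure.
// The caller must free the result.
char *join_strings(const char *a, const char *b)
{
    if (a == NULL || b == NULL) {
        return NULL;
    }

    size_t len_a = strlen(a);
    size_t len_b = strlen(b);

    // Reject lengths whose sum plus the terminator would wrap around size_t.
    if (len_b >= SIZE_MAX - len_a) {
        return NULL;
    }

    size_t size = len_a + len_b + 1;    // +1 for the terminating null byte
    char *out = malloc(size);
    if (out == NULL) {
        return NULL;
    }

    memcpy(out, a, len_a);
    memcpy(out + len_a, b, len_b + 1);  // copies b's terminator too
    return out;
}
```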

8. Use String Length Functions Safely

Use strlen with caution. It scans memory until it finds a null byte, so calling it on a buffer that has not been initialized or is not null-terminated reads beyond the allocated memory. It also reports the length of the string currently stored, not the capacity of the buffer. To get the capacity of an array, use the `sizeof` operator instead.
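
When a buffer might not be terminated, for example after reading raw bytes from a file or socket, the length can be measured with an explicit upper bound. The sketch below uses memchr from standard C; bounded_length is a hypothetical name chosen for this example.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: return the length of the string in buf if a null
// byte occurs within the first buf_size bytes, otherwise return buf_size.
// Unlike strlen, it never reads beyond buf_size bytes.
size_t bounded_length(const char *buf, size_t buf_size)
{
    const char *end = memchr(buf, '\0', buf_size);
    return (end != NULL) ? (size_t)(end - buf) : buf_size;
}
```

A result equal to buf_size signals that no terminator was found, so the buffer should not be treated as a C string.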

9. Use Input Validation and Sanitization

When dealing with user input (or any external data), validate and sanitize the data before processing it. This can involve:

  • Checking the length of the input.
  • Removing or escaping special characters.
  • Converting the input to a safe format.

#include <stdio.h>
#include <string.h>

int main(void) {
    char input[100];

    printf("Enter some text: ");
    // fgets never writes more than sizeof(input) bytes; NULL means EOF or a read error
    if (fgets(input, sizeof(input), stdin) == NULL) {
        fprintf(stderr, "No input read\n");
        return 1;
    }

    // Remove trailing newline character from fgets
    size_t len = strlen(input);
    if (len > 0 && input[len - 1] == '\n') {
        input[--len] = '\0';
    }

    // Validate length
    if (len > 50) {
        printf("Input too long!\n");
        return 1;
    }

    printf("You entered: %s\n", input);
    return 0;
} 
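
Length is only one property worth validating. The sketch below checks content as well, accepting a value only if every character is on an allow-list. The policy itself (letters, digits, and underscores, as classified by isalnum in the current locale) and the name is_valid_name are assumptions for illustration; real rules depend on where the data is used.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical policy check: accept s only if it has between 1 and
// max_len characters and each one is a letter, a digit, or '_'.
bool is_valid_name(const char *s, size_t max_len)
{
    if (s == NULL) {
        return false;
    }

    size_t len = strlen(s);
    if (len == 0 || len > max_len) {
        return false;
    }

    for (size_t i = 0; i < len; i++) {
        // The ctype functions require a value representable as unsigned char.
        unsigned char c = (unsigned char)s[i];
        if (!isalnum(c) && c != '_') {
            return false;
        }
    }
    return true;
}
```

Rejecting everything that is not explicitly allowed tends to be more robust than trying to enumerate and strip dangerous characters.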

10. Regularly Review and Test Your Code

Security vulnerabilities can be subtle. Conduct regular code reviews and use static analysis tools (e.g., Coverity, SonarQube) to identify potential issues. Test your code thoroughly, including with intentionally long or malformed input, to uncover buffer overflows and other string-related bugs.

11. Handle Format String Vulnerabilities

Format string vulnerabilities arise when user-controlled input is used directly as the format string in functions like printf, fprintf, or sprintf. An attacker can inject format specifiers (e.g., %s, %x, %n) to read from or write to arbitrary memory locations.

Prevention: Never use user-supplied input directly as the format string. Instead, use a fixed format string and pass the user input as an argument. For example:

#include <stdio.h>

int main(void) {
  char user_input[100];

  printf("Enter a string: ");
  if (fgets(user_input, sizeof(user_input), stdin) == NULL) {
    return 1;  // EOF or read error: user_input holds no valid string
  }

  // Vulnerable:
  // printf(user_input);  // DO NOT DO THIS

  // Safe:
  printf("%s", user_input);  // Always use a fixed format string
  printf("\n");  // Add a newline

  return 0;
} 

In the safe example, "%s" is the fixed format string, and user_input is passed as the argument to be printed. This prevents the attacker from injecting their own format specifiers.
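
The same rule applies when building strings rather than printing them. In the sketch below, a hypothetical helper named format_user_message embeds user text in a larger message; the user text is only ever passed as an argument, so any format specifiers it contains are copied literally instead of being interpreted.

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical helper: write "user said: <text>" into out.
// The format string is a fixed literal and user_text is data only.
// Returns the value of snprintf: the length the full message needs,
// or a negative value on error.
int format_user_message(char *out, size_t out_size, const char *user_text)
{
    return snprintf(out, out_size, "user said: %s", user_text);
}
```

Passing the hostile-looking input "%x %x %n" produces the message "user said: %x %x %n", with the specifiers treated as ordinary characters.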

Conclusion

Safe string handling in C requires vigilance and a thorough understanding of potential vulnerabilities. By following the best practices outlined above, you can significantly reduce the risk of buffer overflows, format string bugs, and other string-related security flaws in your C programs. Remember that security is an ongoing process, and it's crucial to stay informed about new threats and techniques.