Sunday, December 27, 2015

Select on Linux is really dangerous

Why is it dangerous?

The select system call can be used to check whether a file descriptor (usually a socket) is ready for reading or writing. The function takes three sets as input, each containing the file descriptors of the entities (pipes, sockets, ...) to test for readability, writability or an error state.
The file descriptors that can be stored in these sets are limited by the FD_SETSIZE constant, which is usually equal to 1024: only values from 0 to FD_SETSIZE - 1 are valid. If a file descriptor with a value of FD_SETSIZE or larger is "stored" in one of these sets, the behavior is undefined; in practice the operation generally results in an invalid memory access.
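
As a minimal sketch of the check that select-based code would need before every FD_SET, consider the helper below. The name safe_fd_set is hypothetical and does not come from the sample repository; it only illustrates that the descriptor value, not the number of descriptors, is what must stay below FD_SETSIZE.

```cpp
#include <sys/select.h>

// Hypothetical helper (not part of the sample repository).
// Adds fd to the set only when its value fits inside an fd_set.
bool safe_fd_set(int fd, fd_set* set) {
    if (fd < 0 || fd >= FD_SETSIZE) {
        // Calling FD_SET here would touch memory outside the fd_set.
        return false;
    }
    FD_SET(fd, set);
    return true;
}
```

A caller that gets false back has no way to wait on that descriptor with select, which is why the rest of this post switches to poll instead of adding guards.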

So if your program can potentially use many file descriptors, select and its sets may make it fail. The remainder of this post shows how to replace select with poll, which places no limit on the value of the file descriptors it can handle.

My GitHub account contains a sample that reproduces the select failure:
https://github.com/randomswdev/select_vs_poll

It is enough to clone the repository and issue the command:

make valgrind.select

This command builds the select-based code and runs it with an input parameter that makes it open more than 1000 file descriptors. Valgrind soon reports invalid memory accesses, and the program fails with a SIGSEGV.

In more detail, the test program's main thread opens a socket and listens on it while a secondary thread establishes multiple connections. Every established connection adds to the number of open file descriptors; since both ends of each connection live in the same process, each connection presumably consumes two of them. If the number of established connections is small, for example 100, the file descriptors stay below 1024 and the select logic works fine. If the program attempts to establish 1000 connections, it fails with SIGSEGV shortly after receiving a file descriptor of 1024 or larger.

By contrast, after executing the command:

make valgrind.poll

the program terminates successfully after establishing 1000 connections, without any memory corruption.

This last command compiles the program to use the poll function. The fd_set used by select is replaced with a class that builds a data structure suitable for use with poll. It basically contains two attributes:
  1. the first one, pollfds, is a vector of pollfd structures, i.e. the input structure required by poll. The vector contains one entry for every socket we want to check for a readable or writable status;
  2. the second one, fd_to_pollfd, maps the numerical value of a file descriptor to the index of the corresponding entry in the pollfds vector.
In this way it is possible to create a compact vector of pollfd structures even if the file descriptor values are very sparse.
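
The two attributes described above can be sketched roughly as follows. This is not the code from the repository: only the names pollfds and fd_to_pollfd come from the description, while the class name PollSet, its methods and the use of std::unordered_map are assumptions made for illustration.

```cpp
#include <poll.h>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical class; only pollfds and fd_to_pollfd are named in the post.
class PollSet {
public:
    // Watch fd for the given events, merging flags if it is already present.
    void add(int fd, short mask = POLLIN) {
        auto it = fd_to_pollfd.find(fd);
        if (it != fd_to_pollfd.end()) {
            pollfds[it->second].events |= mask;
            return;
        }
        pollfd p{};
        p.fd = fd;
        p.events = mask;
        fd_to_pollfd[fd] = pollfds.size();
        pollfds.push_back(p);
    }

    // Stop watching fd; the last entry is moved into the freed slot so the
    // vector stays compact however sparse the descriptor values are.
    void remove(int fd) {
        auto it = fd_to_pollfd.find(fd);
        if (it == fd_to_pollfd.end()) return;
        std::size_t idx = it->second;
        fd_to_pollfd.erase(it);
        if (idx != pollfds.size() - 1) {
            pollfds[idx] = pollfds.back();
            fd_to_pollfd[pollfds[idx].fd] = idx;
        }
        pollfds.pop_back();
    }

    // True if the most recent wait() reported fd as readable.
    bool readable(int fd) const {
        auto it = fd_to_pollfd.find(fd);
        return it != fd_to_pollfd.end() &&
               (pollfds[it->second].revents & POLLIN) != 0;
    }

    // Calls poll on the compact vector; -1 means wait with no timeout.
    int wait(int timeout_ms = -1) {
        return poll(pollfds.data(), static_cast<nfds_t>(pollfds.size()), timeout_ms);
    }

    std::size_t size() const { return pollfds.size(); }

private:
    std::vector<pollfd> pollfds;                        // one entry per watched fd
    std::unordered_map<int, std::size_t> fd_to_pollfd;  // fd value -> index in pollfds
};
```

With such a class, a descriptor like 5000 costs one vector entry and one map entry, whereas an fd_set cannot represent it at all.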

Looking at poll.h, select.h and main.cc, it is easy to verify how few changes are required to switch from select to poll. Obviously this is quite a simple scenario, in which we only want to test sockets for readability.
In a more complex scenario, it would be possible to reuse the same data structure to test both readability and writability, simply by setting more than one flag in the events field of the pollfd structures. The difference in the timeout parameter (select takes a pointer to a struct timeval, poll an int in milliseconds), which is infinite in our sample, can also be managed quite simply.
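
A rough sketch of those two adjustments follows. Both helper names, to_poll_timeout and watch_read_write, are hypothetical and are not part of the sample; the conversion assumes the select-style timeout arrives as a pointer to a struct timeval, with a null pointer meaning "wait forever".

```cpp
#include <poll.h>
#include <sys/time.h>
#include <climits>

// Hypothetical helper: converts a select-style timeout into the
// millisecond value expected by poll. A null pointer becomes -1,
// which poll interprets as an infinite wait.
int to_poll_timeout(const timeval* tv) {
    if (tv == nullptr) return -1;
    // Round microseconds up so a short non-zero wait does not turn into 0.
    long long ms = static_cast<long long>(tv->tv_sec) * 1000 +
                   (tv->tv_usec + 999) / 1000;
    if (ms > INT_MAX) return INT_MAX;
    return static_cast<int>(ms);
}

// Hypothetical helper: builds a pollfd entry that asks poll to report
// both readability and writability for the same descriptor.
pollfd watch_read_write(int fd) {
    pollfd p{};
    p.fd = fd;
    p.events = POLLIN | POLLOUT;
    return p;
}
```

After poll returns, the revents field of each entry tells which of the requested conditions actually occurred, so a single entry replaces the membership of that descriptor in both the read set and the write set of select.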

The select function and, above all, its accompanying fd_set structure are dangerous: they hide potential memory corruption that will occur as soon as a file descriptor value reaches FD_SETSIZE (usually 1024).
With the right data structure, select can be quickly replaced with poll, which removes this source of memory corruption and lets the program handle file descriptors of any value.
