C/C++ P-thread Code Safety – Avoiding Race Conditions

Some of the most serious problems when using threads are resource sharing, mutual exclusion locking, and race conditions.

When you start developing with threads, you may be tempted to share variables across multiple threads. This can be dangerous in many ways, because operations that look atomic often are not.

Even a simple i++ can be a nightmare. Every statement in C/C++ is translated into machine instructions, and an operator like i++ may be compiled into several separate instructions.

For example (i++):

        movl    i(%rip), %eax
        addl    $1, %eax
        movl    %eax, i(%rip)

Here, the first movl loads the value of i from memory into the EAX CPU register, addl increments the register by one, and the second movl stores the value of EAX back to the memory location of i.

But what happens if we use multiple threads doing that?

You can picture the instruction stream of each thread as a deck of cards. The scheduler shuffles the decks together, much like a riffle shuffle: instructions from different threads may interleave in any way, but each thread’s own instructions keep their order.

As a result, two or more threads doing i++ on a shared i can interleave in many different ways, for example:

        movl    i(%rip), %eax ; thread 2
; register save for thread 2 and restore of thread 1
        movl    i(%rip), %eax ; thread 1
; register save for thread 1 and restore of thread 2
        addl    $1, %eax ; thread 2
; register save for thread 2 and restore of thread 1
        addl    $1, %eax ; thread 1
; register save for thread 1 and restore of thread 2
        movl    %eax, i(%rip) ; thread 2
; register save for thread 2 and restore of thread 1
        movl    %eax, i(%rip) ; thread 1

The desired result is i++ followed by another i++ (starting from 0 with two threads: i == 2).

But let’s see what happens here:

  1. Each thread loads the value of i (which is zero) into its EAX register.
  2. Thread 2 adds 1 to its register copy of i, so EAX == 1.
  3. The context switch saves thread 2’s EAX and restores thread 1’s, so EAX == 0 again for thread 1 (the restored EAX).
  4. Then thread 1 increments EAX from 0 to 1.
  5. Each thread stores the EAX value from its own context into memory, so each thread writes 1.

This is how two i++ operations end up as a single increment.

This is a problem if you need an exact counter, but it can be far worse when the shared data is a pointer, a null-terminated string, or any other variable that a condition depends on.
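The interleaving above can be replayed deterministically in a single thread, which makes the lost update easy to see. This is only a sketch: the names shared_i, eax_t1 and eax_t2 are made up for the illustration, and each eax_* variable stands in for the private copy of EAX that the scheduler saves and restores for that thread.

```c
#include <assert.h>

/* Replays the six instructions in the interleaved order shown above. */
unsigned int replay_lost_update(void)
{
    unsigned int shared_i = 0;      /* plays the role of the global i */
    unsigned int eax_t1, eax_t2;    /* per-thread saved copies of EAX */

    eax_t2 = shared_i;              /* thread 2: movl i(%rip), %eax */
    eax_t1 = shared_i;              /* thread 1: movl i(%rip), %eax */
    assert(eax_t1 == 0 && eax_t2 == 0);  /* both saw the old value */

    eax_t2 = eax_t2 + 1;            /* thread 2: addl $1, %eax */
    eax_t1 = eax_t1 + 1;            /* thread 1: addl $1, %eax */

    shared_i = eax_t2;              /* thread 2: movl %eax, i(%rip) */
    shared_i = eax_t1;              /* thread 1: movl %eax, i(%rip) */

    return shared_i;                /* 1: one increment was lost */
}

/* The same six instructions when thread 2 finishes before thread 1 starts. */
unsigned int replay_serial(void)
{
    unsigned int shared_i = 0;
    unsigned int eax_t1, eax_t2;

    eax_t2 = shared_i;              /* thread 2 runs all three steps */
    eax_t2 = eax_t2 + 1;
    shared_i = eax_t2;

    eax_t1 = shared_i;              /* thread 1 now sees the new value */
    eax_t1 = eax_t1 + 1;
    shared_i = eax_t1;

    return shared_i;                /* 2: both increments counted */
}
```

Calling replay_lost_update() gives 1 and replay_serial() gives 2. The only difference between the two functions is the order of the same six steps.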

For example:

If a function returns a pointer to a static shared string and you call that function from two separate threads, each thread could read the wrong string, or you could even end up with memory corruption.
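To make the static-string problem concrete, here is a small sketch. Both label_static() and label_r() are hypothetical helpers invented for this illustration. The first returns a pointer to a single static buffer shared by every caller; the second writes into storage owned by the caller, which is the pattern the _r functions later in this post follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: every call (from any thread) reuses ONE buffer. */
const char *label_static(int id)
{
    static char buf[32];
    snprintf(buf, sizeof buf, "item-%d", id);
    return buf;     /* same address every time */
}

/* Hypothetical re-entrant variant: the caller supplies the storage. */
char *label_r(int id, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    snprintf(buf, len, "item-%d", id);
    return buf;
}
```

Even without threads the sharing is visible: after a = label_static(1) and b = label_static(2), a and b are the same pointer and both read "item-2". With two threads the second call can happen while the first thread is still reading, or while snprintf is halfway through writing.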

The answer:

Pthreads provides mutexes and semaphores. A mutex prevents multiple threads from executing the fenced code between the lock and the unlock at the same time. It’s like a zone: if thread 1 is in zone A, thread 2 cannot enter zone A until thread 1 leaves.

The wrong code:

#include <pthread.h>

unsigned int i = 0;

void * thread_1(void *data)
{
    i++;
    pthread_exit(NULL);
}
void * thread_2(void *data)
{
    i++;
    pthread_exit(NULL);
}
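int main()
{
    pthread_t x, y;
    pthread_create(&x,NULL,thread_1,NULL);
    pthread_create(&y,NULL,thread_2,NULL);
    /* wait for both threads, otherwise main may return before they run */
    pthread_join(x,NULL);
    pthread_join(y,NULL);
    return 0;
}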

The right code:

#include <pthread.h>

unsigned int i = 0;
pthread_mutex_t mp;

void * thread_1(void *data)
{
    pthread_mutex_lock(&mp);
    i++;
    pthread_mutex_unlock(&mp);
    pthread_exit(NULL);
}
void * thread_2(void *data)
{
    pthread_mutex_lock(&mp);
    i++;
    pthread_mutex_unlock(&mp);
    pthread_exit(NULL);
}
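int main()
{
    pthread_t x, y;
    pthread_mutex_init(&mp,NULL);
    pthread_create(&x,NULL,thread_1,NULL);
    pthread_create(&y,NULL,thread_2,NULL);
    /* wait for both threads, otherwise main may return before they run */
    pthread_join(x,NULL);
    pthread_join(y,NULL);
    pthread_mutex_destroy(&mp);
    return 0;
}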

Every shared variable (or section of code) that is susceptible to a race must be enclosed between pthread mutex lock and unlock calls.
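A single i++ per thread rarely collides in practice, so a loop makes the effect of the mutex easier to check. The sketch below is an assumption-laden variant of the code above: INCREMENTS, counter, counter_lock, worker() and run_two_workers() are names chosen for this example, and the mutex is initialized statically with PTHREAD_MUTEX_INITIALIZER instead of pthread_mutex_init().

```c
#include <assert.h>
#include <pthread.h>

#define INCREMENTS 100000   /* arbitrary iteration count for the demo */

static unsigned int counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker increments the shared counter INCREMENTS times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int n = 0; n < INCREMENTS; n++) {
        pthread_mutex_lock(&counter_lock);
        counter++;                      /* only one thread at a time */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers to completion and returns the final counter value. */
unsigned int run_two_workers(void)
{
    pthread_t a, b;
    int rc;

    counter = 0;
    rc = pthread_create(&a, NULL, worker, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, NULL);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;     /* safe to read: both workers have finished */
}
```

With the lock in place, run_two_workers() returns exactly 2 * INCREMENTS. If you remove the lock and unlock calls, the result is no longer reliable and is typically lower.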

Atomicity

If you protect a large block of code with a single lock, your application will be very inefficient, because every other thread that needs the lock must wait until the whole block has executed. That does not sound like parallel programming.

We therefore recommend restricting the locks to the operations that involve the shared variable. However, under certain circumstances you must hold the lock across several function calls.

In that case you have to go through the instruction card-deck shuffling again and check that every possible interleaving works for your purpose.
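As a rough sketch of keeping the lock narrow: each thread below does its slow work on local data with no lock held, and takes the mutex only for the one statement that touches the shared total. The names total, total_lock, slow_sum(), worker() and sum_in_parallel() are assumptions for this example, and slow_sum() is just a stand-in for any expensive computation.

```c
#include <assert.h>
#include <pthread.h>

static unsigned long total = 0;
static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for expensive thread-local work: the sum 1 + 2 + ... + n. */
static unsigned long slow_sum(unsigned long n)
{
    unsigned long s = 0;
    for (unsigned long k = 1; k <= n; k++)
        s += k;
    return s;
}

static void *worker(void *arg)
{
    unsigned long n = *(unsigned long *)arg;
    unsigned long partial = slow_sum(n);    /* no lock held here */

    pthread_mutex_lock(&total_lock);
    total += partial;                       /* the only shared access */
    pthread_mutex_unlock(&total_lock);
    return NULL;
}

/* Adds slow_sum(n1) and slow_sum(n2) using two threads. */
unsigned long sum_in_parallel(unsigned long n1, unsigned long n2)
{
    pthread_t a, b;
    int rc;

    total = 0;
    rc = pthread_create(&a, NULL, worker, &n1);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, worker, &n2);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);      /* n1 and n2 stay alive until both join */
    pthread_join(b, NULL);
    return total;
}
```

Had the lock been taken around the slow_sum() call as well, the two threads would simply run one after the other and the second thread would add nothing.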

Multiple Readers and Multiple Writers

Many threads may want to read a variable without modifying it, while some other thread is in charge of changing it.

The following example shows this:

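#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

char var[128];
pthread_rwlock_t rwlock;
bool bContinue = true;

void * thread_1(void *data)
{
    bool keepGoing;

    do
    {
        pthread_rwlock_rdlock(&rwlock);
        printf("%s\n",var);
        keepGoing = bContinue; /* the stop flag is shared too, read it under the lock */
        pthread_rwlock_unlock(&rwlock);
    }
    while (keepGoing);

    pthread_exit(NULL);
}
int main()
{
    pthread_t x,y;
    pthread_rwlock_init(&rwlock,NULL);
    pthread_create(&x,NULL,thread_1,NULL);
    pthread_create(&y,NULL,thread_1,NULL);

    sleep(1);

    for (unsigned int counter=0;counter<100000;counter++)
    {
        pthread_rwlock_wrlock(&rwlock);
        memset(var,0xFF,128);
        snprintf(var,128,"THIS VAR!: %08X",counter);
        pthread_rwlock_unlock(&rwlock);
    }

    /* write the stop flag under the write lock, like any other shared data */
    pthread_rwlock_wrlock(&rwlock);
    bContinue = false;
    pthread_rwlock_unlock(&rwlock);

    sleep(1);

    /* pthread_join takes the thread handle itself, not a pointer to it */
    pthread_join(x,NULL);
    pthread_join(y,NULL);

    pthread_rwlock_destroy(&rwlock);
    return 0;
}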

The variable called “var” must be protected with the write lock (wrlock) whenever it is modified, and with the read lock whenever it is read. After the memset, and while snprintf is still copying “THIS VAR!”, the string is not null-terminated, so a thread_1 that read it at that moment could crash by passing a non-null-terminated string to printf.

Multiple readers can run together in parallel, because they do not modify the value of var.

Using pthread_rwlock instead of a plain mutex can improve performance, because readers no longer block each other.
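The sharing rules of a read-write lock can be checked without starting any threads by using the non-blocking pthread_rwlock_tryrdlock() and pthread_rwlock_trywrlock() calls. This is a minimal sketch; the function name readers_share_writers_wait() is made up for the illustration, and the feature-test macro is only there so the example also compiles in strict ISO C mode.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Returns 1 if two read locks can be held at once while a write lock
 * is refused with EBUSY, and 0 otherwise. */
int readers_share_writers_wait(void)
{
    pthread_rwlock_t lock;
    int rc, first, second, writer;

    rc = pthread_rwlock_init(&lock, NULL);
    assert(rc == 0);
    (void)rc;

    first  = pthread_rwlock_tryrdlock(&lock);   /* first reader        */
    second = pthread_rwlock_tryrdlock(&lock);   /* second reader       */
    writer = pthread_rwlock_trywrlock(&lock);   /* writer must wait    */

    /* release whatever was actually acquired */
    if (writer == 0)
        pthread_rwlock_unlock(&lock);
    if (second == 0)
        pthread_rwlock_unlock(&lock);
    if (first == 0)
        pthread_rwlock_unlock(&lock);
    pthread_rwlock_destroy(&lock);

    return first == 0 && second == 0 && writer == EBUSY;
}
```

The try variants are used here on purpose: calling the blocking pthread_rwlock_wrlock() from a thread that already holds a read lock on the same rwlock can deadlock.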

Note: a function that can safely be called from multiple threads in parallel is called a thread-safe function.

Non thread-safe functions in Linux C

Many commonly used ANSI C functions and Linux-provided C functions are not thread-safe, which means you should not call them from several threads at the same time.

You have two options:

  • Write a static wrapper function with a lock that applies to the whole application (bad idea)
  • Use the re-entrant or thread-safe variant of the function that the operating system provides.
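The two options can be sketched side by side using inet_ntoa(), which returns a pointer to a static buffer. The wrapper locked_inet_ntoa(), the mutex ntoa_lock and the helper reentrant_ntoa() are hypothetical names for this example only; inet_ntoa() and inet_ntop() are the real library calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>

static pthread_mutex_t ntoa_lock = PTHREAD_MUTEX_INITIALIZER;

/* Option 1: application-wide lock around the non-thread-safe call.
 * The result is copied out while the lock is still held. */
void locked_inet_ntoa(struct in_addr addr, char *out, size_t len)
{
    assert(out != NULL && len > 0);
    pthread_mutex_lock(&ntoa_lock);
    strncpy(out, inet_ntoa(addr), len - 1);
    out[len - 1] = '\0';
    pthread_mutex_unlock(&ntoa_lock);
}

/* Option 2: the re-entrant replacement writes straight into the
 * caller's buffer, so no lock is needed. Returns NULL on failure. */
const char *reentrant_ntoa(struct in_addr addr, char *out, socklen_t len)
{
    return inet_ntop(AF_INET, &addr, out, len);
}
```

The wrapper only helps if every caller in the program, including third-party libraries, goes through it, and it serializes all callers on one lock. That is why the second option is usually the better choice.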

Here is a list of non-thread-safe functions and their replacements:

  • strtok() – strtok keeps the position of the current search in internal static storage between successive calls. The recommended function is strtok_r()
  • rand() – rand keeps the random state in internal static storage. Concurrent calls can corrupt that state and make rand behave erratically. rand_r() is recommended.
  • srand() – srand initializes the hidden state used by rand(). With rand_r() you pass the seed yourself, so srand() is not needed.
  • inet_ntoa() – Use: inet_ntop()
  • inet_aton() – Use: inet_pton()
  • asctime() – Use asctime_r()
  • crypt() – Use crypt_r()
  • ctime() – Use ctime_r()
  • drand48() – Use drand48_r()
  • ecvt() – Use ecvt_r()
  • encrypt() – Use encrypt_r()
  • erand48() – Use erand48_r()
  • ether_aton() – Use ether_aton_r()
  • ether_ntoa() – Use ether_ntoa_r()
  • fcvt() – Use fcvt_r()
  • fgetgrent() – Use fgetgrent_r()
  • fgetpwent() – Use fgetpwent_r()
  • fgetspent() – Use fgetspent_r()
  • getaliasbyname() – Use getaliasbyname_r()
  • getaliasent() – Use getaliasent_r()
  • getdate() – Use getdate_r()
  • getgrent() – Use getgrent_r()
  • getgrgid() – Use getgrgid_r()
  • getgrnam() – Use getgrnam_r()
  • gethostbyaddr() – Use gethostbyaddr_r()
  • gethostbyname2() – Use gethostbyname2_r()
  • gethostbyname() – Use gethostbyname_r()
  • gethostent() – Use gethostent_r()
  • getlogin() – Use getlogin_r()
  • getmntent() – Use getmntent_r()
  • getnetbyaddr() – Use getnetbyaddr_r()
  • getnetbyname() – Use getnetbyname_r()
  • getnetent() – Use getnetent_r()
  • getnetgrent() – Use getnetgrent_r()
  • getprotobyname() – Use getprotobyname_r()
  • getprotobynumber() – Use getprotobynumber_r()
  • getprotoent() – Use getprotoent_r()
  • getpwent() – Use getpwent_r()
  • getpwnam() – Use getpwnam_r()
  • getpwuid() – Use getpwuid_r()
  • getrpcbyname() – Use getrpcbyname_r()
  • getrpcbynumber() – Use getrpcbynumber_r()
  • getrpcent() – Use getrpcent_r()
  • getservbyname() – Use getservbyname_r()
  • getservbyport() – Use getservbyport_r()
  • getservent() – Use getservent_r()
  • getspent() – Use getspent_r()
  • getspnam() – Use getspnam_r()
  • getutent() – Use getutent_r()
  • getutid() – Use getutid_r()
  • getutline() – Use getutline_r()
  • gmtime() – Use gmtime_r()
  • hcreate() – Use hcreate_r()
  • hdestroy() – Use hdestroy_r()
  • hsearch() – Use hsearch_r()
  • initstate() – Use initstate_r()
  • jrand48() – Use jrand48_r()
  • lcong48() – Use lcong48_r()
  • lgammaf() – Use lgammaf_r()
  • lgammal() – Use lgammal_r()
  • lgamma() – Use lgamma_r()
  • localtime() – Use localtime_r()
  • lrand48() – Use lrand48_r()
  • mrand48() – Use mrand48_r()
  • nrand48() – Use nrand48_r()
  • ptsname() – Use ptsname_r()
  • qecvt() – Use qecvt_r()
  • qfcvt() – Use qfcvt_r()
  • random() – Use random_r()
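  • readdir() – Use readdir_r() (note: glibc has deprecated readdir_r() since version 2.24 and recommends readdir(), with your own locking if a directory stream is shared between threads)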
  • seed48() – Use seed48_r()
  • setkey() – Use setkey_r()
  • setstate() – Use setstate_r()
  • sgetspent() – Use sgetspent_r()
  • srand48() – Use srand48_r()
  • srandom() – Use srandom_r()
  • strerror() – Use strerror_r()
  • strtok() – Use strtok_r()
  • tmpnam() – Use tmpnam_r()
  • ttyname() – Use ttyname_r()
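Most of the _r variants in this list follow the same idea: the hidden static state becomes an argument that the caller owns. Here is a short sketch with two of them, strtok_r() and rand_r(). The helper names count_fields() and same_seed_same_value() are invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Counts fields in text such as "a,b;c,d,e;f": records are separated
 * by ';' and fields by ','. The nested loops work because each
 * strtok_r() loop has its own save pointer. Note: text is modified. */
int count_fields(char *text)
{
    char *outer_state, *inner_state;
    int fields = 0;

    assert(text != NULL);
    for (char *rec = strtok_r(text, ";", &outer_state);
         rec != NULL;
         rec = strtok_r(NULL, ";", &outer_state)) {
        for (char *fld = strtok_r(rec, ",", &inner_state);
             fld != NULL;
             fld = strtok_r(NULL, ",", &inner_state))
            fields++;
    }
    return fields;
}

/* rand_r() keeps its state in the caller's seed variable, so two
 * independent seeds with the same value give the same first number. */
int same_seed_same_value(unsigned int seed)
{
    unsigned int s1 = seed, s2 = seed;
    return rand_r(&s1) == rand_r(&s2);
}
```

With plain strtok() the inner loop would overwrite the position of the outer loop. The same thing happens when two threads tokenize different strings at the same time. In a threaded program each thread would keep its own save pointer and its own seed variable.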

I hope this helps 🙂
