Segmentation Fault Returned Using Strtok

I'm trying to write a function in C that splits strings, like the split function in Java and many other languages. This is what I came up with:

char **split(char * str, char *ch) {
  char **array = (char **)malloc((strlen(str)) * sizeof(*array));
  int i = 0;
  char *token = strtok(str, ch);
  while (token != NULL) {
    array[i++] = token;
    token = strtok(NULL, ch);
  }
  free(token);
  return array;
}

This seems to work, but not always and not properly. Suppose we call it in two different ways. The first one works:

int main(){
  
  while(1){
    sleep(1);
    char h = ':';
    char a[] = "test:1234";
    char ** result = split(a,&h);
    printf("%s\n",result[0]);
    printf("%s\n",result[1]);
    free(result);
  }
}

while the second one gives me a segmentation fault on the second iteration of the while loop:

int main(){
  char a[] = "test:1234";
  char h = ':';
  while(1){
    sleep(1);
    char ** result = split(a,&h);
    printf("%s\n",result[0]);
    printf("%s\n",result[1]);
    free(result);
  }
}

Output:

test
1234
test
Segmentation fault (core dumped)

I think this is due to a manipulation of the string index by the strtok function but I cannot understand how to fix it and exactly why it gives me a segmentation fault.

Answer

One problem is that you are calling strtok incorrectly.

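strtok expects two strings: the string to split and a string of delimiters.

But you are not passing a string of delimiters. You are passing a pointer to a single character, which is not followed by a terminating '\0'. strtok therefore reads past that character, which is undefined behavior.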

So change it as follows:

char h = ':';                  --->  char *h = ":";

and

char ** result = split(a,&h);  --->  char ** result = split(a,h);
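If the delimiter is only available as a single char, it can be copied into a small null-terminated array before it is handed to strtok. The following is a minimal sketch of that idea; the helper name first_token_by_char is hypothetical and not part of the question's code.

```c
#include <string.h>

/* Hypothetical helper (not from the question): returns the first token of str
   when the delimiter is only available as a single char. */
char *first_token_by_char(char *str, char delim) {
    /* strtok needs a null-terminated delimiter string, so build one. */
    char delims[2] = { delim, '\0' };
    return strtok(str, delims);
}
```

With char a[] = "test:1234"; the call first_token_by_char(a, ':') returns a pointer to "test".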

Another issue with your code is that you expect it to always return at least two valid tokens. That is a bad assumption, and it fails in the second iteration of the loop in your second code example.

In the first iteration, a effectively becomes the string "test", because strtok overwrites the ':' with a string termination character ('\0').

In the second iteration there is consequently only one token. Since the array came from malloc, result[1] is uninitialized and does not point to a valid token, so you are not allowed to print what it points to.
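The in-place modification can be seen with a small sketch. The helper count_tokens below is hypothetical (it is not in the question); it only counts the tokens strtok finds, and like split it lets strtok overwrite the delimiters in the caller's buffer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok finds in str.
   As in split, strtok overwrites each delimiter it consumes with '\0'. */
size_t count_tokens(char *str, const char *delims) {
    size_t n = 0;
    for (char *tok = strtok(str, delims); tok != NULL; tok = strtok(NULL, delims)) {
        n++;
    }
    return n;
}
```

With char a[] = "test:1234"; the first call count_tokens(a, ":") returns 2 and a second call on the same array returns 1, because a now reads as "test".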

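One way to fix that problem is to set all the result pointers to NULL in the function, e.g. by using calloc instead of malloc. Allocate one element more than strlen(str) so that the array always ends with a NULL pointer, even for an empty or one-character string:

char **array = calloc(strlen(str) + 1, sizeof(*array));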

and then do the printing like:

if (result[0]) printf("%s\n",result[0]);
if (result[1]) printf("%s\n",result[1]);

or better:

int i = 0;
while(result[i])
{
    printf("%s\n",result[i]);
    ++i;
}

Putting it all together:

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

char **split(char * str, char *ch) {
  // Use calloc to set all pointers to NULL; the extra element guarantees
  // a NULL terminator even for an empty or one-character string
  char **array = calloc(strlen(str) + 1, sizeof(*array));
  int i = 0;
  char *token = strtok(str, ch);
  while (token != NULL) {
    array[i++] = token;
    token = strtok(NULL, ch);
  }
  return array;
}

int main(){
  char a[] = "test:1234";
  char *h = ":";
  int z = 0;
  while(z < 5){    // Just loop 5 times
    //sleep(1);
    char ** result = split(a,h);
    int i = 0;
    while(result[i])   // Print all tokens, i.e. stop when a pointer is NULL
    {
        printf("%s\n",result[i]);
        ++i;
    }    
    free(result);
    ++z;
  }
}

Output:

test
1234
test
test
test
test

BTW:

This

free(token);

is the same as

free(NULL);

because the while loop only ends once token is NULL. Calling free(NULL) does nothing, so just delete that line.
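If the caller's string should survive repeated calls, one option is to tokenize a private copy instead of the original. The sketch below is an assumption-laden variant, not the code from the question: the name split_copy is hypothetical, and it places the pointer array and the copied string in a single allocation so that one free(result) still releases everything.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of split: tokenizes a private copy of str, so the
   caller's string is left untouched. Returns a NULL-terminated array of
   tokens, or NULL if the allocation fails. Release it with one free(). */
char **split_copy(const char *str, const char *delims) {
    size_t len = strlen(str);
    /* A string of length len holds at most (len + 1) / 2 tokens;
       len / 2 + 2 slots leave room for them plus the NULL terminator. */
    size_t slots = len / 2 + 2;
    char **array = malloc(slots * sizeof(*array) + len + 1);
    if (array == NULL) {
        return NULL;
    }
    /* The copy of the string lives right behind the pointer slots. */
    char *copy = (char *)(array + slots);
    memcpy(copy, str, len + 1);

    size_t i = 0;
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims)) {
        array[i++] = tok;
    }
    array[i] = NULL;
    return array;
}
```

Called in a loop on the same "test:1234" input, this version yields the tokens "test" and "1234" every time, since only the copy is modified.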

source: stackoverflow.com