The following code segfaults on line 2:

  char *str = "string";
  str[0] = 'z';
  printf("%s", str);

While this works perfectly well:

  char str[] = "string";
  str[0] = 'z';
  printf("%s", str);

Tested with MSVC and GCC.

MSVC gives "Access violation writing location 0x...". I think segmentation faults are specific to Linux/UNIX platforms. – M.S. Dousti
It's funny, but this actually compiles and runs perfectly when using the Windows compiler (cl) from a Visual Studio developer command prompt. Got me confused for a few moments... – David Refaeli
You can print char *str one character at a time with printf("%c\n", *(str + i)); as i runs over [0, 6]. – EsmaeelE

16 Answers

Because a string literal such as "string" in the 1st example must be treated as if it were const char * (in C its type is actually a non-const array of char, which is why assigning it to a plain char * compiles, but modifying it is undefined behavior), you shouldn't try to write to it.

Most compilers back this up by putting the string in a read-only part of memory, hence writing to it generates a segfault.

Normally, string literals are stored in read-only memory when the program is run. This is to prevent you from accidentally changing a string constant. In your first example, "string" is stored in read-only memory and str points to its first character. The segfault happens when you try to change the first character to 'z'.

In the second example, the string "string" is copied by the compiler from its read-only home to the str[] array. Then changing the first character is permitted. You can check this by printing the address of each:

printf("%p", (void *)str);

Also, printing the size of str in the second example will show you that the compiler has allocated 7 bytes for it:

printf("%zu", sizeof(str));
Whenever using "%p" with printf, you should cast the pointer to void *, as in printf("%p", (void *)str); When printing a size_t with printf, you should use "%zu" if your compiler supports C99 or later. – Chris
Also, the parentheses with sizeof are only needed when taking the size of a type (the argument then looks like a cast). Remember that sizeof is an operator, not a function. – unwind
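To make the address and size check concrete, here is a minimal sketch in C that follows the advice from the comments (cast to void * for "%p", use "%zu" for a size_t). The function names array_size, pointer_size and print_layout are invented for this illustration, and the printed addresses depend on the platform, so no particular output should be expected.

```c
#include <stddef.h>
#include <stdio.h>

/* Size of an array initialized from "string": six letters plus '\0'. */
size_t array_size(void)
{
    char arr[] = "string";
    return sizeof arr;
}

/* Size of a pointer to the same literal: the pointer's own size,
   not the length of the string it points to. */
size_t pointer_size(void)
{
    const char *p = "string";
    return sizeof p;
}

/* Prints where the literal and the array copy live, and their sizes. */
void print_layout(void)
{
    const char *p = "string";
    char arr[] = "string";

    printf("literal at %p\n", (void *)p);
    printf("array   at %p\n", (void *)arr);
    printf("sizeof arr = %zu, sizeof p = %zu\n", sizeof arr, sizeof p);
}
```

Calling print_layout() from main should show two different addresses, since the array holds its own copy of the characters.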
char *str = "string";  

The above sets str to point to the literal value "string" which is hard-coded in the program's binary image, which is probably flagged as read-only in memory.

So str[0] = 'z' is attempting to write to read-only memory of the application. I would guess this is probably compiler dependent though.

In the first code, "string" is a string constant, and string constants should never be modified because they are often placed into read-only memory. "str" is a pointer being used to modify the constant.

In the second code, "string" is an array initializer, sort of shorthand for

char str[7] =  { 's', 't', 'r', 'i', 'n', 'g', '\0' };

"str" is an array allocated on the stack and can be modified freely.

On the stack, or the data segment if str is global or static. – Gauthier
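The shorthand claim can be checked directly. The sketch below compares the two spellings byte for byte; initializers_match is a name made up for this example.

```c
#include <string.h>

/* Returns 1 if the string-literal initializer and the brace-enclosed
   character list produce the same 7 bytes, 0 otherwise. */
int initializers_match(void)
{
    char from_literal[] = "string";
    char from_list[7] = { 's', 't', 'r', 'i', 'n', 'g', '\0' };

    return sizeof from_literal == sizeof from_list &&
           memcmp(from_literal, from_list, sizeof from_list) == 0;
}
```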
char *str = "string";

allocates a pointer to a string literal, which the compiler is putting in a non-modifiable part of your executable;

char str[] = "string";

allocates and initializes a local array which is modifiable

Can we write int *b = {1,2,3} like we write char *s = "HelloWorld"? – Suraj Jain

String literals like "string" are probably allocated in your executable's address space as read-only data (give or take your compiler). When you go to touch it, it freaks out that you're in its bathing suit area and lets you know with a seg fault.

In your first example, you're getting a pointer to that const data. In your second example, you're initializing an array of 7 characters with a copy of the const data.

The

 char *str = "string";

line defines a pointer and points it to a literal string. The literal string is not writable so when you do:

  str[0] = 'z';

you get a seg fault. On some platforms, the literal might be in writable memory so you won't see a segfault, but it's invalid code (resulting in undefined behavior) regardless.

The line:

char str[] = "string";

allocates an array of characters and copies the literal string into that array, which is fully writable, so the subsequent update is no problem.
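When the string has to be modified but a fixed-size local array is not convenient, one option is to copy the literal into memory you own. This is only a sketch: mutable_copy and replace_first are hypothetical helper names, not anything from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of s that the caller may modify
   and must free(); returns NULL if the allocation fails. */
char *mutable_copy(const char *s)
{
    size_t n = strlen(s) + 1;   /* include the terminating '\0' */
    char *copy = malloc(n);

    if (copy != NULL)
        memcpy(copy, s, n);
    return copy;
}

/* Example use: changes the copy, never the literal. Returns 1 on success. */
int replace_first(void)
{
    char *s = mutable_copy("string");
    int ok;

    if (s == NULL)
        return 0;
    s[0] = 'z';                 /* fine: s points at writable heap memory */
    ok = strcmp(s, "ztring") == 0;
    free(s);
    return ok;
}
```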

Can we write int *b = {1,2,3} like we write char *s = "HelloWorld"? – Suraj Jain
Accepted answer (179 votes)

See the C FAQ, Question 1.32

Q: What is the difference between these initializations?
char a[] = "string literal";
char *p = "string literal";
My program crashes if I try to assign a new value to p[i].

A: A string literal (the formal term for a double-quoted string in C source) can be used in two slightly different ways:

  1. As the initializer for an array of char, as in the declaration of char a[] , it specifies the initial values of the characters in that array (and, if necessary, its size).
  2. Anywhere else, it turns into an unnamed, static array of characters, and this unnamed array may be stored in read-only memory, and which therefore cannot necessarily be modified. In an expression context, the array is converted at once to a pointer, as usual (see section 6), so the second declaration initializes p to point to the unnamed array's first element.

Some compilers have a switch controlling whether string literals are writable or not (for compiling old code), and some may have options to cause string literals to be formally treated as arrays of const char (for better error catching).
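You can get much of that error catching without any compiler switch by declaring the pointer as const char * yourself; GCC's -Wwrite-strings option is one example of the kind of switch the FAQ mentions. The sketch below is illustrative only, and the names greeting and first_char are invented for it.

```c
/* Declaring the pointer as const char * makes the read-only intent explicit. */
static const char *const greeting = "string literal";

/* Reading through the pointer is always fine. */
char first_char(void)
{
    return greeting[0];
}

/* Writing through it is rejected at compile time rather than crashing at
 * run time. Uncommenting the next line should produce a compiler error:
 *
 *     greeting[0] = 'z';
 */
```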

K&R section 5.5... Silly me, should have opened the book before asking stupid questions! – Markus
Couple of other points: (1) the segfault happens as described, but its occurrence is a function of the run environment; if the same code was in an embedded system, the write may have no effect, or it may actually change the s to a z. (2) Because string literals are non-writable, the compiler can save space by putting two instances of "string" in the same place; or, if somewhere else in the code you have "another string", then one chunk of memory could support both literals. Clearly, if code were then allowed to change those bytes, strange and difficult bugs could occur. – greggo
@greggo: Good point. There is also a way to do this on systems with an MMU by using mprotect to waive read-only protection (see here). – user405725
So char *p="blah" actually creates a temporary array? Weird. – rahul tyagi
And after 2 years of writing in C++...TIL – zeboidlund
would int*b = {1,2,3} work? – Suraj Jain
@rahultyagi what do you mean ? – Suraj Jain

In the first case, str is a pointer that points at "string". The compiler is allowed to put string literals in places in memory that you cannot write to, but can only read. (In C this assignment compiles without complaint by default, because a string literal is an array of plain char; C++ compilers, or GCC with -Wwrite-strings, treat it as const char and warn when it is assigned to a char *.)

In the second case, you're creating an array, which is memory that you've got full access to, and initializing it with "string". You're creating a char[7] (six for the letters, one for the terminating '\0'), and you can do whatever you like with it.

Does C also support const? – Ferruccio
would int*b = {1,2,3} work? – Suraj Jain
@Ferruccio: Yes, the const qualifier makes variables read-only. – EsmaeelE

The C FAQ that @matli linked to mentions it, but no one else here has yet, so for clarification: if a string literal (a double-quoted string in your source) is used anywhere other than to initialize a character array (i.e. @Mark's second example, which works correctly), that string is stored by the compiler in a special static string table, which is akin to creating a global static variable (typically read-only) that is essentially anonymous (has no variable "name"). The read-only part is the important part, and is why @Mark's first code example segfaults.

Can we write int *b = {1,2,3} like we write char *s = "HelloWorld"? – Suraj Jain

Most of these answers are correct, but just to add a little more clarity...

The "read-only memory" that people are referring to is usually the text segment in ASM terms, or a read-only data section mapped alongside it. It's the same place in memory where the instructions are loaded. This is read-only for obvious reasons like security. When you create a char* initialized to a string, the string data is compiled into that read-only area and the program initializes the pointer to point into it. So if you try to change it, kaboom. Segfault.

When written as an array, the compiler places the initialized string data in writable memory instead: the data segment, which is where your global variables and such live, if the array is global or static, or the stack if it is a local variable. This memory is mutable, since there are no instructions in it. This time the character array (which is a real array that owns its bytes, not just a char*) lives in memory that you can safely alter at run-time.
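As a rough illustration of the writable case, the sketch below uses a file-scope array. Such an array has static storage duration, is initialized once, and stays writable for the whole run. The names banner, mark_banner and banner_is_marked are invented for the example.

```c
#include <string.h>

/* File-scope array: its bytes live in writable static storage. */
static char banner[] = "string";

/* Overwrites the first character in place. */
void mark_banner(void)
{
    banner[0] = 'z';
}

/* Returns 1 if the change made by mark_banner() is visible, 0 otherwise. */
int banner_is_marked(void)
{
    return strcmp(banner, "ztring") == 0;
}
```

Unlike a local array, the change persists between function calls, because there is only one copy of banner for the whole program.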

But isn't it true that there can be implementations that allow modifying the "read-only memory"? – Pacerier
#include <stdio.h>

int main(void)
{
    // point at a string constant like this - the characters must be treated as read only
    char *str_p;
    str_p = "String constant";

    // create an array of characters like this, plus a pointer to its first element
    char *arr_p;
    char arr[] = "String in an array";
    arr_p = &arr[0];

    // now we try to change a character in the array first, this will work
    *arr_p = 'E';

    // let's try to change the first character of the string constant
    *str_p = 'G'; // undefined behavior, typically a segmentation fault. Comment it out to work.

    /*-----------------------------------------------------------------------------
     *  String constants can't be modified. A segmentation fault is the usual result,
     *  because most operating systems will not allow a write
     *  operation on read-only memory.
     *-----------------------------------------------------------------------------*/

    // print both strings to see if they have changed
    printf("%s\n", str_p); // the string constant, reached through the pointer
    printf("%s\n", arr_p); // the string, which is in an array

    return 0;
}

A segmentation fault is caused when you try to access memory that is not accessible to you.

char *str is a pointer to a string that is not modifiable (the reason for the segfault),

whereas char str[] is an array and can be modified.

The first is a constant string, which can't be modified. The second is an array with an initialized value, so it can be modified.

To understand this error, you should first know the difference between a pointer and an array, so I will explain that first.

string array

 char strarray[] = "hello";

In memory, an array is stored in contiguous memory cells, as [h][e][l][l][o][\0], where each [] is a memory cell one char in size, and these cells can be accessed through the name strarray. The string array strarray itself contains all the characters of the string it was initialized with, "hello" in this case, so we can easily change its contents by accessing each character by its index.

`strarray[0]='m'` accesses the character at index 0, which is 'h' in strarray,

and changes its value to 'm', so strarray now holds "mello".

One point to note here: we can change the contents of a string array character by character, but we cannot assign another string to it directly, so strarray="new string" is invalid.

Pointer

As we all know, a pointer points to a location in memory. An uninitialized pointer has an indeterminate value, so it may point anywhere; after initialization it points to a particular memory location.

char *ptr = "hello";

Here the pointer ptr is initialized to point at the string "hello", which is a constant string typically stored in read-only memory, so "hello" cannot be changed.

ptr itself is stored on the stack and points to the constant string "hello",

so ptr[0]='m' is invalid, since you cannot write to read-only memory.

But ptr can be pointed at another string directly, since it is just a pointer and can hold the address of any object of its data type:

ptr="new string"; is valid
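The contrast between re-pointing a pointer and refilling an array can be sketched as follows. The function names repoint and refill, and the buffer size of 16, are assumptions made for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* A pointer can be aimed at a different literal; no characters are written. */
const char *repoint(void)
{
    const char *ptr = "hello";

    ptr = "new string";          /* valid: only the pointer changes */
    return ptr;
}

/* An array cannot be assigned to, so new contents are copied in instead.
   Returns 1 if the array now holds the new text. */
int refill(void)
{
    char strarray[16] = "hello"; /* 16 is an arbitrary size large enough here */

    /* strarray = "new string";     would not compile */
    snprintf(strarray, sizeof strarray, "%s", "new string");
    return strcmp(strarray, "new string") == 0;
}
```

snprintf is used for the copy because it never writes past the size it is given; the array must still be large enough to hold the whole new string.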

Why do I get a segmentation fault when writing to a string?

C99 N1256 draft

There are two completely different uses of character string literals:

  1. Initialize char[]:

    char c[] = "abc";      
    

    This is "more magic", and described at 6.7.8/14 "Initialization":

    An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

    So this is just a shortcut for:

    char c[] = {'a', 'b', 'c', '\0'};
    

    Like any other regular array, c can be modified.

  2. Everywhere else: it generates an unnamed array of char with static storage duration, and modifying it gives UB.

    So when you write:

    char *c = "abc";
    

    This is similar to:

    /* __unnamed is magic because modifying it gives UB. */
    static char __unnamed[] = "abc";
    char *c = __unnamed;
    

    Note the implicit conversion from char[] to char *, which is always legal.

    Then if you modify c[0], you also modify __unnamed, which is UB.

    This is documented at 6.4.5 "String literals":

    5 In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence [...]

    6 It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.
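Paragraph 6 is one reason writes to literals are dangerous: two identical literals may or may not share storage. The sketch below only observes which choice the implementation made; literals_share_storage is a made-up name, and either result is permitted.

```c
/* Returns 1 if this implementation gave both "abc" literals the same
   address, 0 if it kept them distinct. Both answers are allowed. */
int literals_share_storage(void)
{
    const char *a = "abc";
    const char *b = "abc";

    return a == b;
}
```

If the function returns 1, a write through one pointer (were it not already undefined behavior) would silently change what the other pointer sees.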

6.7.8/32 "Initialization" gives a direct example:

EXAMPLE 8: The declaration

char s[] = "abc", t[3] = "abc";

defines "plain" char array objects s and t whose elements are initialized with character string literals.

This declaration is identical to

char s[] = { 'a', 'b', 'c', '\0' },
t[] = { 'a', 'b', 'c' };

The contents of the arrays are modifiable. On the other hand, the declaration

char *p = "abc";

defines p with type "pointer to char" and initializes it to point to an object with type "array of char" with length 4 whose elements are initialized with a character string literal. If an attempt is made to use p to modify the contents of the array, the behavior is undefined.
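The sizes in the standard's example can be checked with sizeof. In this sketch the names size_of_s, size_of_t and t_copied_safely are invented for illustration. Note that char t[3] = "abc" is valid C but is rejected by C++ compilers, and that t has no terminating '\0', so it must not be passed to functions that expect a C string.

```c
#include <stddef.h>
#include <string.h>

/* s gets room for the terminator: 'a', 'b', 'c', '\0'. */
size_t size_of_s(void)
{
    char s[] = "abc";
    return sizeof s;
}

/* t has exactly three elements, so the '\0' is dropped. */
size_t size_of_t(void)
{
    char t[3] = "abc";
    return sizeof t;
}

/* t is not a C string: copy it with an explicit length, then terminate.
   Returns 1 if the terminated copy equals "abc". */
int t_copied_safely(void)
{
    char t[3] = "abc";
    char buf[sizeof t + 1];

    memcpy(buf, t, sizeof t);
    buf[sizeof t] = '\0';
    return strcmp(buf, "abc") == 0;
}
```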

GCC 4.8 x86-64 Linux implementation

Let's see why this implementation segfaults.

Program:

#include <stdio.h>

int main() {
    char *s = "abc";
    printf("%s\n", s);
    return 0;
}

Compile and disassemble:

gcc -ggdb -std=c99 -c main.c
objdump -Sr main.o

Output contains:

 char *s = "abc";
8:  48 c7 45 f8 00 00 00    movq   $0x0,-0x8(%rbp)
f:  00 
        c: R_X86_64_32S .rodata

So the string is stored in the .rodata section.

Then:

readelf -l a.out

Contains (simplified):

Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
      [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
  LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                 0x0000000000000704 0x0000000000000704  R E    200000

 Section to Segment mapping:
  Segment Sections...
   02     .text .rodata

This means that the default linker script dumps both .text and .rodata into a segment that can be executed but not modified (Flags = R E). Attempting to modify such a segment leads to a segfault in Linux.

If we do the same for char[]:

 char s[] = "abc";

we obtain:

17:   c7 45 f0 61 62 63 00    movl   $0x636261,-0x10(%rbp)

so it gets stored in the stack (relative to %rbp), and we can of course modify it.
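Because the bytes are written into the stack frame each time the declaration is executed, every call starts with a fresh copy. A small sketch (first_before_write is a hypothetical name):

```c
/* Each call re-initializes the local array, so an earlier call's write
   is never visible. Returns the first character before it is overwritten. */
char first_before_write(void)
{
    char s[] = "abc";
    char before = s[0];

    s[0] = 'z';          /* modifies this call's copy only */
    return before;
}
```

Calling it repeatedly should return 'a' every time.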

would int*b = {1,2,3} work? – Suraj Jain
