Is std::string’s storage contiguous?

Take a look at the following code:

#include <cstddef>
#include <cstring>
#include <iostream>
#include <string>

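/* Dummy recv function: copies at most length bytes into buffer */
std::size_t recv(int /*socket*/, void *buffer, std::size_t length, int /*flags*/)
{
 const std::size_t n = length < 10 ? length : 10;
 std::memcpy(buffer, "Some stuff", n);
 return n;
}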

int main()
{
 //Create a string with space for some characters
 std::string x(64, char());

 std::cout << "x.size() = " << x.size() << ", x.capacity() = "
           << x.capacity() << std::endl;

 //filling the string with data assuming that &x[0]
 //is pointing at the beginning of a contiguous array
 std::size_t bytes_read = recv(0, &x[0], x.size(), 0);

 //Using the good old swap trick to free excess space
 std::string(&x[0], bytes_read).swap(x);

 std::cout << "x.size() = " << x.size() << ", x.capacity() = "
           << x.capacity() << std::endl;
 std::cout << x << std::endl;
}

Is that code… valid? I always thought it wasn’t, since I had heard lots of people say that the storage of a std::string isn’t guaranteed to be contiguous (I’ve yet to see an implementation whose storage isn’t), so the preceding code could cause undefined behavior. A few days ago, I was on Freenode’s ##C++ when someone asked the same question. I was giving him the classic answer when another user, “Tinodidriksen”, replied that I was wrong. What?!? What if he’s right? I mean, I’d never actually checked the standard on this; I had only taken the usual answer as the truth. Let’s see what the standard says about it.

1) basic_string constructor requirements (see tables 38-43 in the 14882:2003 standard):
Excerpt from table 39: “data() points at the first element of an allocated copy of rlen consecutive elements of the string controlled by str beginning at position pos”, rlen being equal to size() according to the same table.

2) 21.3.4 paragraph 1, basic_string indexed access:
Returns: If pos < size(), returns data()[pos]. Otherwise, if pos == size(), the const version returns charT().

3) 21.3.6 paragraph 4, const charT* data() const:
Requires: The program shall not alter any of the values stored in the character array.

Well, these are the three relevant excerpts I found. The first one guarantees that the storage is contiguous, because it’s allocated as a single block of X consecutive elements. The second shows that when you modify the string through the index operator, you modify the buffer pointed to by data(). (data() in that specific sentence probably means the pointer to the underlying storage and not the data() member function itself, since using the member function would force the implementor into an ugly const_cast and would violate the third and last point.)
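
To see what that contiguity claim means in practice, here is a minimal sketch that checks whether every character of a string sits at &s[0] + i. The helper name storage_looks_contiguous is mine, not something from the standard or any library, and a passing check only tells you what your implementation does; it proves nothing about what the standard guarantees.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): returns true when
// every character of s is found at address &s[0] + i.
bool storage_looks_contiguous(const std::string& s)
{
    if (s.empty())
        return true; // nothing to compare in an empty string

    const char* first = &s[0];
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (&s[i] != first + i)
            return false; // a character lives outside the expected block
    }
    return true;
}
```

Calling it on a short string and on a long one, for example std::string(4096, 'x'), exercises both the small-buffer and the heap-allocated case on implementations that distinguish the two.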

My conclusion: the standard does not say explicitly “std::basic_string must have contiguous storage” as it does for std::vector (ISO/IEC 14882:2003 only), but there are enough constraints to conclude that it can’t be otherwise. If I missed something or you have a different opinion, I’d love to hear it.
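
For what it’s worth, the standard that followed 14882:2003 (C++11) states the requirement outright: the characters of a basic_string shall be stored contiguously. Assuming a C++11 compiler, the read-into-a-string pattern can be sketched as below. fake_recv and read_into_string are hypothetical names I made up for illustration, and fake_recv is only a stand-in for a real socket read.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a socket read: copies at most `length`
// bytes of a fixed payload into `buffer` and returns the byte count.
std::size_t fake_recv(void* buffer, std::size_t length)
{
    static const char payload[] = "Some stuff";
    const std::size_t n = std::min(length, sizeof(payload) - 1);
    std::memcpy(buffer, payload, n);
    return n;
}

// Hypothetical helper: reads into a std::string used as a buffer,
// then trims the string to the number of bytes actually received.
std::string read_into_string(std::size_t max_bytes)
{
    std::string x(max_bytes, char());
    if (x.empty())
        return x; // no room to read anything

    const std::size_t bytes_read = fake_recv(&x[0], x.size());
    x.resize(bytes_read); // drop the unused tail
    x.shrink_to_fit();    // C++11: non-binding request to free excess capacity
    return x;
}
```

resize() followed by shrink_to_fit() plays the role of the swap trick here, with the difference that shrink_to_fit() is only a request the implementation is free to ignore.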

3 comments so far

  1. Dodheim on

    http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-defects.html#530

    Quote from Stephan T. Lavavej: “VC, along with every other Standard Library implementation, already conforms to this. So, &s[0] is a conformant way to get at the guts of a non-empty s.”

  2. rmn on

    I must say that no matter how convinced I am that you are right, it’s very likely I will never write code that relies on this being true. The standard must be very clear about such things, and unfortunately in this case it isn’t.

    Thanks for the interesting read.

  3. artyom on

    A few notes:

    I would suggest you take a look at: http://stackoverflow.com/questions/760790/is-it-legal-to-write-to-stdstring

    According to the standard, contiguity is not required, but because of const member functions like data() and c_str(), there is virtually no way to implement it otherwise. And all known implementations are actually contiguous.

    More than that, the next C++ standard, aka C++0x, does require contiguity explicitly, as for vector.

