Page 1 of 2

std::vector vs. array - FIGHT (or: STL vs CSL)

Posted: 2006-07-19 09:20pm
by phongn
Inspired by Faqa's thread about .NET, here comes a new thread about libraries and whatnot.

Durandal and I promptly got into an argument online about std::vector versus array and access speed times and whatnot. He maintained that the C array was always faster while I maintained that the vector was as fast (or faster).

So, thoughts on using the STL? Love it? Hate it? Prefer to use C constructs?

Posted: 2006-07-19 09:51pm
by Chris OFarrell
Unless I'm mistaken, isn't the STL libary actualy Guaranteed to have a minimum level of effeciency that isn't half bad?

Posted: 2006-07-19 10:00pm
by phongn
Chris OFarrell wrote:Unless I'm mistaken, isn't the STL libary actualy Guaranteed to have a minimum level of effeciency that isn't half bad?
Yes, there are specifications for performance and efficiency (some differences -- std::vector<bool> is a bitfield for space reasons instead of an array of eight-bit ints like in many other languages)

Posted: 2006-07-19 10:38pm
by Beowulf
Isn't an std::vector of things normally actually stored as an array of said objects? It just allows you to not accidentally run over the end of the array, and not have to worry about allocating the space. So I can see it being slightly slower, but not significantly so. And only if you use the function that actually checks to see if you went outside the bounds. The overloaded [] should be just as fast as a normal array access. The benefits of not having to worry about it certainly outweigh the downside, unless you really have a hard-on for efficiency.

Posted: 2006-07-19 11:36pm
by Durandal
By the way, the "he" Phong refers to is me. ;)

A vector is always guaranteed to take up more space than an array of the same size simply because of all the instance variables and method table. Depending on how it is implemented, the subscript operator could have function call overhead (though a smart compiler would inline it).

But the difference in access times given a smart compiler is probably negligible for most users. This still doesn't change the fact that real men dynamically allocate their memory. ;)

Posted: 2006-07-19 11:51pm
by Beowulf
Durandal wrote:By the way, the "he" Phong refers to is me. ;)

A vector is always guaranteed to take up more space than an array of the same size simply because of all the instance variables and method table. Depending on how it is implemented, the subscript operator could have function call overhead (though a smart compiler would inline it).

But the difference in access times given a smart compiler is probably negligible for most users. This still doesn't change the fact that real men dynamically allocate their memory. ;)
Real programmers consider Laziness to be a virtue. Why make it harder on yourself to write good code, when you can use tools available to you?

Posted: 2006-07-19 11:52pm
by Mad
Running the following simple benchmark, it appears that standard arrays are about twice as fast as vector arrays. I had expected the difference to be closer, but it looks like the overhead takes about as long as dereferencing the offset pointer itself takes.

Code: Select all

#include <cstdlib>
#include <iostream>
#include <vector>
#include <ctime>

#define SIZE    100000

using namespace std;

int main()
{
    vector<int> va;
    va.resize(SIZE);

    int aa[SIZE];
    int i;

    long vt_start, vt_end;
    long at_start, at_end;

    for(i=0; i<SIZE; ++i)
    {
        int num = rand()%SIZE;
        va[i] = num;
        aa[i] = num;
    }

    int swap1, swap2, val;

    long itr = SIZE * 1000;

    at_start = time(NULL);
    for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = va[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    }
    at_end = time(NULL);

    vt_start = time(NULL);
    for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = va[swap1];
        va[swap1] = va[swap2];
        va[swap2] = val;
    }
    vt_end = time(NULL);

    cout << "Time for vector array:   " << vt_end - vt_start << endl;
    cout << "Time for standard array: " << at_end - at_start << endl;

    return EXIT_SUCCESS;
}
Results for two runs (reversed the order of the tests) between runs:

Run 1:

Code: Select all

Time for vector array:   122
Time for standard array: 63
Run 2:

Code: Select all

Time for vector array:   120
Time for standard array: 64
Ran remotely via SSH using GCC 3.3.2 on Solaris 9 on a university SPARC machine. Compiled with compiler defaults ("g++ -o bench.exe bench.cpp").

Posted: 2006-07-20 12:19am
by Beowulf
You have a bug in your program:

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = va[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
Unfortunately, this makes the comparison even worse. it's more like a 1 to 4 advantage for the native array, compared to the std:vector. Just goes to show, don't believe anything without empirical testing.

EDIT: redid my test with release settings instead of debug. It's rough parity now. 14-13, 14-12, 14-12. Oh, and this is with MSVS 2005

EDIT2: with even more optimizations enabled for speed, the vector and array timings come out dead even, 13-14, 14-13, 13-14. So of course, double check your empirical results. It could be that your initial set-up is bad. (Example, Debug mode vs Release(debug turns off alot of optimizations)).

Posted: 2006-07-20 03:22am
by Xon
The difference between debug mode while running under a debugger, and release mode and is night and day.

I have some .NET code for a game conversion toolset for an upcoming RTS which runs over 50000% slower in debug mode under the debugger compared to release mode.

Posted: 2006-07-20 04:45am
by Dahak
When I wrote in C++, I always used the STL. I didn't feel like doing things probably worse than a library already could do...
Friend of mine didn't know the STL existed, was very sceptical of it. At the end, he used it too...

Posted: 2006-07-20 07:03am
by Dooey Jo
If I need a dynamic array, I use a std::vector. If I need a normal array, I don't use the vector. This is because it is the most convenient. It would be foolish and time-consuming to write my own vector template class if I don't need anything more than the STL already offers (or worse, write the allocation and everything directly in the code wherever I need it; holy debugging hell!).

If you're using C++, why pretend it's actually C?

Posted: 2006-07-20 09:08am
by Arrow
There are only two times I use an array: (1) for DMA transfers (because the hardware doesn't talk any other way) and (2) number to string conversions (because no one has shown me how to it directly in an stl::string). Everything else is done with stl::vector and stl::string. Why reinvent the wheel? And why risk the bugs from a typo in your own array and string handlers?

Posted: 2006-07-20 09:18am
by Mad
Beowulf wrote:Unfortunately, this makes the comparison even worse. it's more like a 1 to 4 advantage for the native array, compared to the std:vector. Just goes to show, don't believe anything without empirical testing.
Oops. It also just goes to show, don't copy-and-paste code and change the variable names when using very similar names without being very careful.

We also learned that turning compiler optimization on is very, very important. But if we're going the optimization route, we may want to perform more varied tests just in case the client code for this run just happens to be optimized differently for vector and standard arrays in this case than it would bef or other cases.

That is, would the time ratios vary significantly for other types of loops? Iterating? Loops that do function calls that does the actual accessing? Pass by value function calls vs pass by reference?

Posted: 2006-07-20 09:21am
by Dooey Jo
Arrow wrote:(2) number to string conversions (because no one has shown me how to it directly in an stl::string)
You can/should use a string stream for that.

Posted: 2006-07-20 09:31am
by Arrow
Dooey Jo wrote:
Arrow wrote:(2) number to string conversions (because no one has shown me how to it directly in an stl::string)
You can/should use a string stream for that.
Oh, I forget to mention that my team lead hates streams, and perfers printf/sprintf/fprintf, and its rubbed off my other team member and me. Its purely a code asthetic reason, not because of functionality. I really wish the stl::string class had native conversion functions (something I like about Java).

Posted: 2006-07-20 10:13am
by phongn
Durandal wrote:A vector is always guaranteed to take up more space than an array of the same size simply because of all the instance variables and method table. Depending on how it is implemented, the subscript operator could have function call overhead (though a smart compiler would inline it).
You're right - a vector takes up more space (16 bytes, IIRC). Depending on the circumstances this may be of nontrivial
But the difference in access times given a smart compiler is probably negligible for most users. This still doesn't change the fact that real men dynamically allocate their memory. ;)
But it is being dynamically allocated ... for you :P

Posted: 2006-07-20 02:00pm
by Durandal
Beowulf wrote:it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
I hate people who double-indent their loops. :)

A better swap would be something like this.

Code: Select all

for( i = 0; i < SIZE; i++ ) {
    swap1 = rand() % SIZE;
    swap2 = rand() % SIZE;
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.

Posted: 2006-07-20 02:47pm
by Arrow
Durandal wrote:
Beowulf wrote:it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
I hate people who double-indent their loops. :)

A better swap would be something like this.

Code: Select all

for( i = 0; i < SIZE; i++ ) {
    swap1 = rand() % SIZE;
    swap2 = rand() % SIZE;
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
Well, for starters the vector doesn't have to override XOR, because the operator is applied to the data type, not to the vector instance (aa[swap1] ^= vs. aa ^=). And second, you'd lose that bet:

Code: Select all

int main()
{
    vector<int> va;
    va.resize(SIZE);

    int aa[SIZE];
    int i;

    long vt_start, vt_end;
    long at_start, at_end;

    for(i=0; i<SIZE; ++i)
    {
        int num = rand()%SIZE;
        va[i] = num;
        aa[i] = num;
    }

    int swap1, swap2, val;

    long itr = SIZE * 1000;

    at_start = time(NULL);
    for( i = 0; i < itr; i++ ) 
	{
		swap1 = rand() % SIZE;
		swap2 = rand() % SIZE;
		aa[swap1] ^= aa[swap2];
		aa[swap2] ^= aa[swap1];
		aa[swap1] ^= aa[swap2];
	}
    at_end = time(NULL);

    vt_start = time(NULL);
	for( i = 0; i < itr; i++ ) 
	{
		swap1 = rand() % SIZE;
		swap2 = rand() % SIZE;
		va[swap1] ^= va[swap2];
		va[swap2] ^= va[swap1];
		va[swap1] ^= va[swap2];
	}
    vt_end = time(NULL);

    cout << "Time for vector array:   " << vt_end - vt_start << endl;
    cout << "Time for standard array: " << at_end - at_start << endl;

    return EXIT_SUCCESS;
}
Results wrote:Time for vector array: 12
Time for standard array: 11
Compiled in MSVS 2005 as a Win32 Console App (Release build), with the following changes in project properties:

C++ -> Optimization
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Any Suitable(/Ob2)
Favor Size or Speed: Favor Fast Code (/Ot)

Posted: 2006-07-20 02:48pm
by Mad
Durandal wrote:I hate people who double-indent their loops. :)
Java-style indents also suck, by the way. Harder to scan through and see where the braces match up from a quick scan because they no longer line up.
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
It doesn't. Your XOR is slower than the swap I did, at least on the system I tested on (same as my first post):

Code: Select all

[tango] ~/cpp/VvsA>g++ -O2 bench.cpp
[tango] ~/cpp/VvsA>a.out
Time for standard array: 36
Time for Durandal's swap, standard array: 37
Time for vector array:   36
Time for Durandal's swap, vector array: 37
[tango] ~/cpp/VvsA>a.out
Time for standard array: 35
Time for Durandal's swap, standard array: 37
Time for vector array:   36
Time for Durandal's swap, vector array: 36
[tango] ~/cpp/VvsA>a.out
Time for standard array: 36
Time for Durandal's swap, standard array: 37
Time for vector array:   35
Time for Durandal's swap, vector array: 37
You're doing 6 dereferences, 6 additions (array offset), and 3 XOR operations. My swap does 4 dereferences, 4 additions, and 3 assignments. In other words, mine is faster, even if it doesn't look as fancy.

Also, the vector was roughly as fast as the standard array using optimization.

Here's the relevant code I'm running:

Code: Select all

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    val = aa[swap1];
    aa[swap1] = aa[swap2];
    aa[swap2] = val;
}
t_end = time(NULL);
cout << "Time for standard array: " << t_end - t_start << endl;

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    // Durandal's optimization:
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
t_end = time(NULL);
cout << "Time for Durandal's swap, standard array: " << t_end - t_start << endl;

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    val = va[swap1];
    va[swap1] = va[swap2];
    va[swap2] = val;
}
t_end = time(NULL);
cout << "Time for vector array:   " << t_end - t_start << endl;

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    // Durandal's optimization:
    va[swap1] ^= va[swap2];
    va[swap2] ^= va[swap1];
    va[swap1] ^= va[swap2];
}
t_end = time(NULL);
cout << "Time for Durandal's swap, vector array: " << t_end - t_start << endl;

Posted: 2006-07-20 03:25pm
by Faqa
Compiled in MSVS 2005 as a Win32 Console App (Release build), with the following changes in project properties:

C++ -> Optimization
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Any Suitable(/Ob2)
Favor Size or Speed: Favor Fast Code (/Ot)
Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?

On-topic, I haven't used the STL much. When I was learning C++, our instructors deferred use of the STL to when we got our on-the-job training, and mine of course ended up in full-out .NET mode. Learned the basics, ground away at the low level. Arrays FTW! Still and all, vector use seems to save you quite a bit of standard array crap, so why not use it?[/quote]

Posted: 2006-07-20 03:33pm
by Dooey Jo
Faqa wrote:
Compiled in MSVS 2005 as a Win32 Console App (Release build), with the following changes in project properties:

C++ -> Optimization
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Any Suitable(/Ob2)
Favor Size or Speed: Favor Fast Code (/Ot)
Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?
I did a quick try with DevC++ (which uses Cygwin IIRC) and using the "best optimization" option I got
14 for STL and 13 for array.
Second try was 15 vs 14.

Posted: 2006-07-20 03:37pm
by Mad
Dooey Jo wrote:
Faqa wrote:Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?
I did a quick try with DevC++ (which uses Cygwin IIRC) and using the "best optimization" option I got
14 for STL and 13 for array.
Second try was 15 vs 14.
MinGW, actually, which is a port of GCC to the Win32 platform.

My results were using GCC running on a Sun SPARC system and Durandal's optimization was actually slower than the traditional swapping method.

Posted: 2006-07-20 03:42pm
by Arrow
Faqa wrote: Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?
Shouldn't be a factor, since this just standard stuff (no .NET libraries included). The speed is probably from my work machine - P4 3.6GHz with 4GB RAM.

I also did a run with Xilinx tools running in the background, and those results were 21 for the vector vs 20 for the array.

Posted: 2006-07-20 04:13pm
by Durandal
Mad wrote:You're doing 6 dereferences, 6 additions (array offset), and 3 XOR operations. My swap does 4 dereferences, 4 additions, and 3 assignments. In other words, mine is faster, even if it doesn't look as fancy.
You're right, of course. I forgot that we were swapping array values. XOR should be the faster method for non-pointer swaps though.

Posted: 2006-07-20 04:37pm
by Beowulf
Durandal wrote:
Beowulf wrote:it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
I hate people who double-indent their loops. :)

A better swap would be something like this.

Code: Select all

for( i = 0; i < SIZE; i++ ) {
    swap1 = rand() % SIZE;
    swap2 = rand() % SIZE;
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
Copy-paste job, with the beginning indent not copied. *shrug* shoulda been single indent. Your indent style sucks, because it's harder to find the open brace, when you've seen the close brace. Also, it's apparent you need more C++ experience, otherwise you'd never would have made such a silly mistake, as the comment above.