std::vector vs. array - FIGHT (or: STL vs CSL)

GEC: Discuss gaming, computers and electronics and venture into the bizarre world of STGODs.

Moderator: Thanas

User avatar
phongn
Rebel Leader
Posts: 18487
Joined: 2002-07-03 11:11pm

std::vector vs. array - FIGHT (or: STL vs CSL)

Post by phongn »

Inspired by Faqa's thread about .NET, here comes a new thread about libraries and whatnot.

Durandal and I promptly got into an argument online about std::vector versus array and access speed times and whatnot. He maintained that the C array was always faster while I maintained that the vector was as fast (or faster).

So, thoughts on using the STL? Love it? Hate it? Prefer to use C constructs?
Last edited by phongn on 2006-07-20 10:35am, edited 1 time in total.
User avatar
Chris OFarrell
Durandal's Bitch
Posts: 5724
Joined: 2002-08-02 07:57pm
Contact:

Post by Chris OFarrell »

Unless I'm mistaken, isn't the STL libary actualy Guaranteed to have a minimum level of effeciency that isn't half bad?
Image
User avatar
phongn
Rebel Leader
Posts: 18487
Joined: 2002-07-03 11:11pm

Post by phongn »

Chris OFarrell wrote:Unless I'm mistaken, isn't the STL libary actualy Guaranteed to have a minimum level of effeciency that isn't half bad?
Yes, there are specifications for performance and efficiency (some differences -- std::vector<bool> is a bitfield for space reasons instead of an array of eight-bit ints like in many other languages)
User avatar
Beowulf
The Patrician
Posts: 10621
Joined: 2002-07-04 01:18am
Location: 32ULV

Post by Beowulf »

Isn't an std::vector of things normally actually stored as an array of said objects? It just allows you to not accidentally run over the end of the array, and not have to worry about allocating the space. So I can see it being slightly slower, but not significantly so. And only if you use the function that actually checks to see if you went outside the bounds. The overloaded [] should be just as fast as a normal array access. The benefits of not having to worry about it certainly outweigh the downside, unless you really have a hard-on for efficiency.
"preemptive killing of cops might not be such a bad idea from a personal saftey[sic] standpoint..." --Keevan Colton
"There's a word for bias you can't see: Yours." -- William Saletan
User avatar
Durandal
Bile-Driven Hate Machine
Posts: 17927
Joined: 2002-07-03 06:26pm
Location: Silicon Valley, CA
Contact:

Post by Durandal »

By the way, the "he" Phong refers to is me. ;)

A vector is always guaranteed to take up more space than an array of the same size simply because of all the instance variables and method table. Depending on how it is implemented, the subscript operator could have function call overhead (though a smart compiler would inline it).

But the difference in access times given a smart compiler is probably negligible for most users. This still doesn't change the fact that real men dynamically allocate their memory. ;)
Damien Sorresso

"Ever see what them computa bitchez do to numbas? It ain't natural. Numbas ain't supposed to be code, they supposed to quantify shit."
- The Onion
User avatar
Beowulf
The Patrician
Posts: 10621
Joined: 2002-07-04 01:18am
Location: 32ULV

Post by Beowulf »

Durandal wrote:By the way, the "he" Phong refers to is me. ;)

A vector is always guaranteed to take up more space than an array of the same size simply because of all the instance variables and method table. Depending on how it is implemented, the subscript operator could have function call overhead (though a smart compiler would inline it).

But the difference in access times given a smart compiler is probably negligible for most users. This still doesn't change the fact that real men dynamically allocate their memory. ;)
Real programmers consider Laziness to be a virtue. Why make it harder on yourself to write good code, when you can use tools available to you?
"preemptive killing of cops might not be such a bad idea from a personal saftey[sic] standpoint..." --Keevan Colton
"There's a word for bias you can't see: Yours." -- William Saletan
User avatar
Mad
Jedi Council Member
Posts: 1923
Joined: 2002-07-04 01:32am
Location: North Carolina, USA
Contact:

Post by Mad »

Running the following simple benchmark, it appears that standard arrays are about twice as fast as vector arrays. I had expected the difference to be closer, but it looks like the overhead takes about as long as dereferencing the offset pointer itself takes.

Code: Select all

#include <cstdlib>
#include <iostream>
#include <vector>
#include <ctime>

#define SIZE    100000

using namespace std;

int main()
{
    vector<int> va;
    va.resize(SIZE);

    int aa[SIZE];
    int i;

    long vt_start, vt_end;
    long at_start, at_end;

    for(i=0; i<SIZE; ++i)
    {
        int num = rand()%SIZE;
        va[i] = num;
        aa[i] = num;
    }

    int swap1, swap2, val;

    long itr = SIZE * 1000;

    at_start = time(NULL);
    for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = va[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    }
    at_end = time(NULL);

    vt_start = time(NULL);
    for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = va[swap1];
        va[swap1] = va[swap2];
        va[swap2] = val;
    }
    vt_end = time(NULL);

    cout << "Time for vector array:   " << vt_end - vt_start << endl;
    cout << "Time for standard array: " << at_end - at_start << endl;

    return EXIT_SUCCESS;
}
Results for two runs (reversed the order of the tests) between runs:

Run 1:

Code: Select all

Time for vector array:   122
Time for standard array: 63
Run 2:

Code: Select all

Time for vector array:   120
Time for standard array: 64
Ran remotely via SSH using GCC 3.3.2 on Solaris 9 on a university SPARC machine. Compiled with compiler defaults ("g++ -o bench.exe bench.cpp").
Later...
User avatar
Beowulf
The Patrician
Posts: 10621
Joined: 2002-07-04 01:18am
Location: 32ULV

Post by Beowulf »

You have a bug in your program:

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = va[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
Unfortunately, this makes the comparison even worse. it's more like a 1 to 4 advantage for the native array, compared to the std:vector. Just goes to show, don't believe anything without empirical testing.

EDIT: redid my test with release settings instead of debug. It's rough parity now. 14-13, 14-12, 14-12. Oh, and this is with MSVS 2005

EDIT2: with even more optimizations enabled for speed, the vector and array timings come out dead even, 13-14, 14-13, 13-14. So of course, double check your empirical results. It could be that your initial set-up is bad. (Example, Debug mode vs Release(debug turns off alot of optimizations)).
"preemptive killing of cops might not be such a bad idea from a personal saftey[sic] standpoint..." --Keevan Colton
"There's a word for bias you can't see: Yours." -- William Saletan
User avatar
Xon
Sith Acolyte
Posts: 6206
Joined: 2002-07-16 06:12am
Location: Western Australia

Post by Xon »

The difference between debug mode while running under a debugger, and release mode and is night and day.

I have some .NET code for a game conversion toolset for an upcoming RTS which runs over 50000% slower in debug mode under the debugger compared to release mode.
"Okay, I'll have the truth with a side order of clarity." ~ Dr. Daniel Jackson.
"Reality has a well-known liberal bias." ~ Stephen Colbert
"One Drive, One Partition, the One True Path" ~ ars technica forums - warrens - on hhd partitioning schemes.
User avatar
Dahak
Emperor's Hand
Posts: 7292
Joined: 2002-10-29 12:08pm
Location: Admiralty House, Landing, Manticore
Contact:

Post by Dahak »

When I wrote in C++, I always used the STL. I didn't feel like doing things probably worse than a library already could do...
Friend of mine didn't know the STL existed, was very sceptical of it. At the end, he used it too...
Image
Great Dolphin Conspiracy - Chatter box
"Implications: we have been intercepted deliberately by a means unknown, for a purpose unknown, and transferred to a place unknown by a form of intelligence unknown. Apart from the unknown, everything is obvious." ZORAC
GALE Force Euro Wimp
Human dignity shall be inviolable. To respect and protect it shall be the duty of all state authority.
Image
User avatar
Dooey Jo
Sith Devotee
Posts: 3127
Joined: 2002-08-09 01:09pm
Location: The land beyond the forest; Sweden.
Contact:

Post by Dooey Jo »

If I need a dynamic array, I use a std::vector. If I need a normal array, I don't use the vector. This is because it is the most convenient. It would be foolish and time-consuming to write my own vector template class if I don't need anything more than the STL already offers (or worse, write the allocation and everything directly in the code wherever I need it; holy debugging hell!).

If you're using C++, why pretend it's actually C?
Image
"Nippon ichi, bitches! Boing-boing."
Mai smote the demonic fires of heck...

Faker Ninjas invented ninjitsu
User avatar
Arrow
Jedi Council Member
Posts: 2283
Joined: 2003-01-12 09:14pm

Post by Arrow »

There are only two times I use an array: (1) for DMA transfers (because the hardware doesn't talk any other way) and (2) number to string conversions (because no one has shown me how to it directly in an stl::string). Everything else is done with stl::vector and stl::string. Why reinvent the wheel? And why risk the bugs from a typo in your own array and string handlers?
Artillery. Its what's for dinner.
User avatar
Mad
Jedi Council Member
Posts: 1923
Joined: 2002-07-04 01:32am
Location: North Carolina, USA
Contact:

Post by Mad »

Beowulf wrote:Unfortunately, this makes the comparison even worse. it's more like a 1 to 4 advantage for the native array, compared to the std:vector. Just goes to show, don't believe anything without empirical testing.
Oops. It also just goes to show, don't copy-and-paste code and change the variable names when using very similar names without being very careful.

We also learned that turning compiler optimization on is very, very important. But if we're going the optimization route, we may want to perform more varied tests just in case the client code for this run just happens to be optimized differently for vector and standard arrays in this case than it would bef or other cases.

That is, would the time ratios vary significantly for other types of loops? Iterating? Loops that do function calls that does the actual accessing? Pass by value function calls vs pass by reference?
Later...
User avatar
Dooey Jo
Sith Devotee
Posts: 3127
Joined: 2002-08-09 01:09pm
Location: The land beyond the forest; Sweden.
Contact:

Post by Dooey Jo »

Arrow wrote:(2) number to string conversions (because no one has shown me how to it directly in an stl::string)
You can/should use a string stream for that.
Image
"Nippon ichi, bitches! Boing-boing."
Mai smote the demonic fires of heck...

Faker Ninjas invented ninjitsu
User avatar
Arrow
Jedi Council Member
Posts: 2283
Joined: 2003-01-12 09:14pm

Post by Arrow »

Dooey Jo wrote:
Arrow wrote:(2) number to string conversions (because no one has shown me how to it directly in an stl::string)
You can/should use a string stream for that.
Oh, I forget to mention that my team lead hates streams, and perfers printf/sprintf/fprintf, and its rubbed off my other team member and me. Its purely a code asthetic reason, not because of functionality. I really wish the stl::string class had native conversion functions (something I like about Java).
Artillery. Its what's for dinner.
User avatar
phongn
Rebel Leader
Posts: 18487
Joined: 2002-07-03 11:11pm

Post by phongn »

Durandal wrote:A vector is always guaranteed to take up more space than an array of the same size simply because of all the instance variables and method table. Depending on how it is implemented, the subscript operator could have function call overhead (though a smart compiler would inline it).
You're right - a vector takes up more space (16 bytes, IIRC). Depending on the circumstances this may be of nontrivial
But the difference in access times given a smart compiler is probably negligible for most users. This still doesn't change the fact that real men dynamically allocate their memory. ;)
But it is being dynamically allocated ... for you :P
User avatar
Durandal
Bile-Driven Hate Machine
Posts: 17927
Joined: 2002-07-03 06:26pm
Location: Silicon Valley, CA
Contact:

Post by Durandal »

Beowulf wrote:it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
I hate people who double-indent their loops. :)

A better swap would be something like this.

Code: Select all

for( i = 0; i < SIZE; i++ ) {
    swap1 = rand() % SIZE;
    swap2 = rand() % SIZE;
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
Damien Sorresso

"Ever see what them computa bitchez do to numbas? It ain't natural. Numbas ain't supposed to be code, they supposed to quantify shit."
- The Onion
User avatar
Arrow
Jedi Council Member
Posts: 2283
Joined: 2003-01-12 09:14pm

Post by Arrow »

Durandal wrote:
Beowulf wrote:it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
I hate people who double-indent their loops. :)

A better swap would be something like this.

Code: Select all

for( i = 0; i < SIZE; i++ ) {
    swap1 = rand() % SIZE;
    swap2 = rand() % SIZE;
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
Well, for starters the vector doesn't have to override XOR, because the operator is applied to the data type, not to the vector instance (aa[swap1] ^= vs. aa ^=). And second, you'd lose that bet:

Code: Select all

int main()
{
    vector<int> va;
    va.resize(SIZE);

    int aa[SIZE];
    int i;

    long vt_start, vt_end;
    long at_start, at_end;

    for(i=0; i<SIZE; ++i)
    {
        int num = rand()%SIZE;
        va[i] = num;
        aa[i] = num;
    }

    int swap1, swap2, val;

    long itr = SIZE * 1000;

    at_start = time(NULL);
    for( i = 0; i < itr; i++ ) 
	{
		swap1 = rand() % SIZE;
		swap2 = rand() % SIZE;
		aa[swap1] ^= aa[swap2];
		aa[swap2] ^= aa[swap1];
		aa[swap1] ^= aa[swap2];
	}
    at_end = time(NULL);

    vt_start = time(NULL);
	for( i = 0; i < itr; i++ ) 
	{
		swap1 = rand() % SIZE;
		swap2 = rand() % SIZE;
		va[swap1] ^= va[swap2];
		va[swap2] ^= va[swap1];
		va[swap1] ^= va[swap2];
	}
    vt_end = time(NULL);

    cout << "Time for vector array:   " << vt_end - vt_start << endl;
    cout << "Time for standard array: " << at_end - at_start << endl;

    return EXIT_SUCCESS;
}
Results wrote:Time for vector array: 12
Time for standard array: 11
Compiled in MSVS 2005 as a Win32 Console App (Release build), with the following changes in project properties:

C++ -> Optimization
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Any Suitable(/Ob2)
Favor Size or Speed: Favor Fast Code (/Ot)
Artillery. Its what's for dinner.
User avatar
Mad
Jedi Council Member
Posts: 1923
Joined: 2002-07-04 01:32am
Location: North Carolina, USA
Contact:

Post by Mad »

Durandal wrote:I hate people who double-indent their loops. :)
Java-style indents also suck, by the way. Harder to scan through and see where the braces match up from a quick scan because they no longer line up.
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
It doesn't. Your XOR is slower than the swap I did, at least on the system I tested on (same as my first post):

Code: Select all

[tango] ~/cpp/VvsA>g++ -O2 bench.cpp
[tango] ~/cpp/VvsA>a.out
Time for standard array: 36
Time for Durandal's swap, standard array: 37
Time for vector array:   36
Time for Durandal's swap, vector array: 37
[tango] ~/cpp/VvsA>a.out
Time for standard array: 35
Time for Durandal's swap, standard array: 37
Time for vector array:   36
Time for Durandal's swap, vector array: 36
[tango] ~/cpp/VvsA>a.out
Time for standard array: 36
Time for Durandal's swap, standard array: 37
Time for vector array:   35
Time for Durandal's swap, vector array: 37
You're doing 6 dereferences, 6 additions (array offset), and 3 XOR operations. My swap does 4 dereferences, 4 additions, and 3 assignments. In other words, mine is faster, even if it doesn't look as fancy.

Also, the vector was roughly as fast as the standard array using optimization.

Here's the relevant code I'm running:

Code: Select all

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    val = aa[swap1];
    aa[swap1] = aa[swap2];
    aa[swap2] = val;
}
t_end = time(NULL);
cout << "Time for standard array: " << t_end - t_start << endl;

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    // Durandal's optimization:
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
t_end = time(NULL);
cout << "Time for Durandal's swap, standard array: " << t_end - t_start << endl;

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    val = va[swap1];
    va[swap1] = va[swap2];
    va[swap2] = val;
}
t_end = time(NULL);
cout << "Time for vector array:   " << t_end - t_start << endl;

t_start = time(NULL);
for(i=0; i<itr; ++i)
{
    swap1 = rand()%SIZE;
    swap2 = rand()%SIZE;
    // Durandal's optimization:
    va[swap1] ^= va[swap2];
    va[swap2] ^= va[swap1];
    va[swap1] ^= va[swap2];
}
t_end = time(NULL);
cout << "Time for Durandal's swap, vector array: " << t_end - t_start << endl;
Later...
User avatar
Faqa
Jedi Master
Posts: 1340
Joined: 2004-06-02 09:32am
Contact:

Post by Faqa »

Compiled in MSVS 2005 as a Win32 Console App (Release build), with the following changes in project properties:

C++ -> Optimization
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Any Suitable(/Ob2)
Favor Size or Speed: Favor Fast Code (/Ot)
Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?

On-topic, I haven't used the STL much. When I was learning C++, our instructors deferred use of the STL to when we got our on-the-job training, and mine of course ended up in full-out .NET mode. Learned the basics, ground away at the low level. Arrays FTW! Still and all, vector use seems to save you quite a bit of standard array crap, so why not use it?[/quote]
"Peace on Earth and goodwill towards men? We are the United States Goverment - we don't DO that sort of thing!" - Sneakers. Best. Quote. EVER.

Periodic Pwnage Pantry:

"Faith? Isn't that another term for ignorance?" - Gregory House

"Isn't it interesting... religious behaviour is so close to being crazy that we can't tell them apart?" - Gregory House

"This is usually the part where people start screaming." - Gabriel Sylar
User avatar
Dooey Jo
Sith Devotee
Posts: 3127
Joined: 2002-08-09 01:09pm
Location: The land beyond the forest; Sweden.
Contact:

Post by Dooey Jo »

Faqa wrote:
Compiled in MSVS 2005 as a Win32 Console App (Release build), with the following changes in project properties:

C++ -> Optimization
Optimization: Maximize Speed (/O2)
Inline Function Expansion: Any Suitable(/Ob2)
Favor Size or Speed: Favor Fast Code (/Ot)
Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?
I did a quick try with DevC++ (which uses Cygwin IIRC) and using the "best optimization" option I got
14 for STL and 13 for array.
Second try was 15 vs 14.
Image
"Nippon ichi, bitches! Boing-boing."
Mai smote the demonic fires of heck...

Faker Ninjas invented ninjitsu
User avatar
Mad
Jedi Council Member
Posts: 1923
Joined: 2002-07-04 01:32am
Location: North Carolina, USA
Contact:

Post by Mad »

Dooey Jo wrote:
Faqa wrote:Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?
I did a quick try with DevC++ (which uses Cygwin IIRC) and using the "best optimization" option I got
14 for STL and 13 for array.
Second try was 15 vs 14.
MinGW, actually, which is a port of GCC to the Win32 platform.

My results were using GCC running on a Sun SPARC system and Durandal's optimization was actually slower than the traditional swapping method.
Later...
User avatar
Arrow
Jedi Council Member
Posts: 2283
Joined: 2003-01-12 09:14pm

Post by Arrow »

Faqa wrote: Interesting. You're sure the .NET framework isn't the factor here(if it JITed the whole thing into MSIL, Microsoft may have a few tricks of their own to make things faster)?
Shouldn't be a factor, since this just standard stuff (no .NET libraries included). The speed is probably from my work machine - P4 3.6GHz with 4GB RAM.

I also did a run with Xilinx tools running in the background, and those results were 21 for the vector vs 20 for the array.
Artillery. Its what's for dinner.
User avatar
Durandal
Bile-Driven Hate Machine
Posts: 17927
Joined: 2002-07-03 06:26pm
Location: Silicon Valley, CA
Contact:

Post by Durandal »

Mad wrote:You're doing 6 dereferences, 6 additions (array offset), and 3 XOR operations. My swap does 4 dereferences, 4 additions, and 3 assignments. In other words, mine is faster, even if it doesn't look as fancy.
You're right, of course. I forgot that we were swapping array values. XOR should be the faster method for non-pointer swaps though.
Damien Sorresso

"Ever see what them computa bitchez do to numbas? It ain't natural. Numbas ain't supposed to be code, they supposed to quantify shit."
- The Onion
User avatar
Beowulf
The Patrician
Posts: 10621
Joined: 2002-07-04 01:18am
Location: 32ULV

Post by Beowulf »

Durandal wrote:
Beowulf wrote:it should be

Code: Select all

for(i=0; i<itr; ++i)
    {
        swap1 = rand()%SIZE;
        swap2 = rand()%SIZE;
        val = aa[swap1];
        aa[swap1] = aa[swap2];
        aa[swap2] = val;
    } 
I hate people who double-indent their loops. :)

A better swap would be something like this.

Code: Select all

for( i = 0; i < SIZE; i++ ) {
    swap1 = rand() % SIZE;
    swap2 = rand() % SIZE;
    aa[swap1] ^= aa[swap2];
    aa[swap2] ^= aa[swap1];
    aa[swap1] ^= aa[swap2];
}
I don't know if the XOR or XOR-equal operators are overridden in the std::vector class, but if they're not, I'd be willing to bet that the code above for a standard array would spank a vector silly.
Copy-paste job, with the beginning indent not copied. *shrug* shoulda been single indent. Your indent style sucks, because it's harder to find the open brace, when you've seen the close brace. Also, it's apparent you need more C++ experience, otherwise you'd never would have made such a silly mistake, as the comment above.
"preemptive killing of cops might not be such a bad idea from a personal saftey[sic] standpoint..." --Keevan Colton
"There's a word for bias you can't see: Yours." -- William Saletan
Post Reply