Sorry for posting this; it appears that HN is not the place to post content if your only goal is to help people learn. The hacker ethic is dead, I guess, and Richard Stallman really was the last man standing. Remind me to come back and post something when I decide to make some silly iPhone/Android app to make money rather than share knowledge.
Not at all. Thanks for posting, OP. Any technical discussion is welcome here. Ignore the haters who think STL vectors are a drop-in replacement for lists.
This isn't how I'd do a linked list in C. Having to malloc separate storage for the next/prev pointers imposes a big cost in terms of performance (it's hardly O(1)) and robustness. Your linked list operations can all fail now - sooner or later, that's going to put you in a very tight spot.
I prefer to store the next/prev pointers with the data itself, and to use offsetof/container_of to find a pointer to the data struct which contains a given set of next/prev pointers.
One example of such a scheme is the list used in the Linux kernel:
What data structure you need really depends on what you're doing. A vector is all well and good for certain tasks, but sometimes a list is what you need. I will cover vectors too, but for now I am focused on C, not C++. Thanks though.
You need a list when you have routine insertions somewhere besides the front or the back of the container, and you don't have routine random-access reads. The vector covers more cases and is generally more performant. Use whatever you like, though.
That's a nice little demonstration of superior vector performance under a certain set of (very common) conditions:
# The value type (int in this case) is very small, in fact it's pointer-sized.
# This entire benchmark easily fits into L2 cache. The benchmark only goes up to 100000 elements, which for an int is only 390K. This can be an advantage of vector, but it's not necessarily such a slam-dunk in real-world programs where there are multiple data structures and other threads contending for the use of the cache.
# The value type has 'trivial' copy/move constructors and assignment operators. It's POD and outright memmove()-able.
# The benchmark code is only able to use list nodes one time. It doesn't take advantage of a list's ability to move items from one container to another without reallocation.
# The benchmark is thoroughly randomized. If it were more typical real-world data (such as supplied by another algorithm or even a malicious attacker) then you would want to consider the possibility of pathological access patterns arising.
So if you meet all of these conditions, definitely use a vector. If not, life is back to being complicated again.
That blog post doesn't address the worst-case running time argument, and doesn't measure worst-case running time costs, since it measures batches of 5000 operations at a time. And hey, it's about a workload that totally obviates worst-case running time considerations by requiring a linear scan for every operation.
If you had responded by pointing out that even when the vector consumes half of physical memory, the worst-case running time isn't going to be any worse than iterating through all of physical memory (which isn't really so bad), and that for the vast majority of vectors copying those values will cost less than a page fault, maybe a couple of microseconds, you might have had an argument that actually addressed spinlocked's point.
Of course, std::vectors are only good for copyable types, and copyable types are the devil.
Presbyterian. Well no, but see http://google-styleguide.googlecode.com/svn/trunk/cppguide.x... for some rationale in that direction. Generally speaking you might note that in Java and C#, if you want to copy a data structure, you have to do so explicitly.
Yeah I agree that one should intentionally enable or disable copy and assignment for a type. It might even make sense to disable implicit copying and make an explicit method for it instead.
The C++11 move constructor and move assignment operator seem to me to be a wonderful middle ground.
"insertions to the end of a variable-length array are very fast"
Really, tptacek? It's O(1) amortized over time, and when N grows, an individual insertion can become really slow because it involves reallocation and a linear copy of all elements.
The "variable" in "variable length" describes the fact that you don't know how much space you need, and insertions to the end of a variable-length array are very fast. I can see why you thought my comment was weird, now.
Not everyone went to university, and people have to learn from somewhere. This is also only an introduction; more complicated data structures and algorithms will be shown soon. No point in jumping into the heavier stuff. I'm trying to help people who are new to C.
See, that's more constructive! :) I figured it belonged here, especially since a lot of people are probably still interested in learning things like this. I was not aware of a subreddit for this.
I use a Chrome plugin that creates a new account each time I post a comment, and a proxy so HN can't track the number of accounts I've created. Keeps my privacy.
Thanks.