Sorry for posting this; it appears that HN is not the place to post content if your only goal is to help people learn. The hacker ethic is dead, I guess, and Richard Stallman really was the last man standing. Remind me to come back and post something when I decide to make some silly iPhone/Android app to make money rather than share knowledge.
Not at all. Thanks for posting, OP. Any technical discussion is welcome here. Ignore the haters who think STL vectors are a drop-in replacement for lists.
This isn't how I'd do a linked list in C. Having to malloc separate storage for the next/prev pointers imposes a big cost in terms of performance (it's hardly O(1)) and robustness. Your linked list operations can all fail now - sooner or later, that's going to put you in a very tight spot.
I prefer to store the next/prev pointers with the data itself, and to use offsetof/container_of to find a pointer to the data struct which contains a given set of next/prev pointers.
One example of such a scheme is the list used in the Linux kernel:
What data structure you need really depends on what you're doing. A vector is all well and good for certain tasks, but sometimes a list is what you need. I will cover vectors too, but for now I am focused on C, not C++. Thanks though.
You need a list when you have routine insertions somewhere besides the front or the back of the container, and you don't have routine random-access reads. The vector covers more cases and is generally more performant. Use whatever you like, though.
That's a nice little demonstration of superior vector performance under a certain set of (very common) conditions:
# The value type (int in this case) is very small, in fact it's pointer-sized.
# This entire benchmark easily fits into L2 cache. The benchmark only goes up to 100000 elements, which for an int is only 390K. This can be an advantage of vector, but it's not necessarily such a slam-dunk in real-world programs where there are multiple data structures and other threads contending for the use of the cache.
# The value type has 'trivial' copy/move constructors and assignment operators. It's POD and outright memmove()-able.
# The benchmark code is only able to use list nodes one time. It doesn't take advantage of a list's ability to move items from one container to another without reallocation.
# The benchmark is thoroughly randomized. If it were more typical real-world data (such as supplied by another algorithm or even a malicious attacker) then you would want to consider the possibility of pathological access patterns arising.
So if you meet all of these conditions, definitely use a vector. If not, life is back to being complicated again.
That blog post doesn't address the worst-case running time argument, and doesn't measure worst-case running time costs, since it measures batches of 5000 operations at a time. And hey, it's about a workload that totally obviates worst-case running time considerations by requiring a linear scan for every operation.
If you had responded by pointing out that even when the vector consumes half of physical memory, the worst-case running time isn't going to be any worse than iterating through all of physical memory (which isn't really so bad), and that for the vast majority of vectors copying those values will cost less than a page fault, maybe a couple of microseconds, you might have had an argument that actually addressed spinlocked's point.
Of course, std::vectors are only good for copyable types, and copyable types are the devil.
Presbyterian. Well no, but see http://google-styleguide.googlecode.com/svn/trunk/cppguide.x... for some rationale in that direction. Generally speaking you might note that in Java and C#, if you want to copy a data structure, you have to do so explicitly.
Yeah I agree that one should intentionally enable or disable copy and assignment for a type. It might even make sense to disable implicit copying and make an explicit method for it instead.
The C++11 move constructor and move assignment operator seem to me to be a wonderful middle ground.
"insertions to the end of a variable-length array are very fast"
Really, tptacek? It's O(1) amortized over time, and when N grows, an individual insertion can become really slow because it involves reallocation and a linear copy of all elements.
The "variable" in "variable length" describes the fact that you don't know how much space you need, and insertions to the end of a variable-length array are very fast. I can see why you thought my comment was weird, now.
Not everyone went to university, and people have to learn from somewhere. This is also only an introduction; more complicated data structures and algorithms will be shown soon. No point in jumping into the heavier stuff. I'm trying to help people who are new to C.
See, that's more constructive! :) I figured it belonged here, especially since a lot of people are probably still interested in learning things like this. I was not aware of a subreddit for this.
I use a Chrome plugin that creates a new account each time I post a comment, and a proxy so HN can't track the number of accounts I've created. Keeps my privacy.
Thanks.