I would not call it a performance comparison at all. When Python calls functions written in C, it is not Python's performance. Write those functions using plain Python and then see the results. Sure, for this basic example in the article it does not matter from a practical standpoint. But when you need to step away from canned cases, suddenly Python's performance sucks big time.
Can we move past the whole "it is not really python to use libraries written in C", especially when talking pure stdlib python?
Python is basically a DSL for C extensions. That is the whole point. It would be like criticizing any compiled language for essentially being a DSL for machine code, and not "Real Instructions".
Python's ability to interop with pre-built, optimized libraries with a lightweight interface is arguably its greatest selling point. Everyone and their dog knows purely interpreted CPython is slow. It doesn't need to be pointed out every single time python performance is brought up unless literally discussing optimizing the CPython vm.
I am not criticizing Python. It does well enough what it was made to do. It just makes no sense to call that particular example a "language performance comparison". It is anything but.
But you are still wrong. As mentioned, Dicts are incredibly efficient data structures in Python (because they underpin everything) and the Counter class is pure python. That's 100% pure python. Saying dicts "don't count" because they are implemented in C would disqualify the entire language of CPython, as virtually everything under the hood is a PyObject struct pointer. It just so happens that "counting abstract objects" is the class of problem* CPython does basically all the time in normal VM execution anyways, so this task is easy and fast.
* looking up a reference of an attribute with a string key underpins the meat and potatoes of python's execution model
Yes, that's exactly what I said. dict.update is in C, because it's a core feature of the python vm. It's pure CPython. What do you think "pure python" is? There's no python hardware ISA (afaik). All cpython is doing is manipulating data structures in C via Python VM opcodes. It just so happens that whatever opcodes are dispatched in the course of solving this problem are quite efficient.
If you say "it does not count as Real Python if you dispatch to C", then you literally cannot execute any CPython vm opcodes, because it's all dispatching to C under the hood.
“Pure Python” commonly means implemented using only the Python language. Something written in pure Python ought to be portable across Python implementations. I was merely pointing out that this line
isn’t exactly pure Python, because, under a different runtime (e.g. PyPy), the code would take a different path (the “pure Python” implementation of _count_elements[1] instead of the C implementation[2][3]). Yes, it's hard to draw exact lines when it comes to Python, especially as the language is so tied to its implementation. However, I think in this case it's relatively clear that the code that specific line is calling is an optimization in CPython, specifically intended to get around some of the VM overhead. Said optimization comes into play in the OP.
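For anyone who hasn't looked at it, here is a rough sketch of the fallback pattern being described. It is modeled on how the `collections` module defines a pure-Python tally and then tries to swap in a C helper; the names `count_elements_py` and `count_elements_c` and the sample sentence are mine, not the stdlib's, so treat this as an illustration rather than the actual source.

```python
def count_elements_py(mapping, iterable):
    """Pure-Python tally: every step goes through the interpreter loop."""
    mapping_get = mapping.get  # hoist the attribute lookup out of the loop
    for elem in iterable:
        mapping[elem] = mapping_get(elem, 0) + 1

# On CPython a C-accelerated helper is normally importable; on other
# runtimes the import may fail, in which case only the Python path exists.
try:
    from _collections import _count_elements as count_elements_c
except ImportError:
    count_elements_c = None

# Hypothetical sample input, just for illustration.
words = "the quick brown fox jumps over the lazy dog the end".split()

tally = {}
count_elements_py(tally, words)

# Use the C helper when present, otherwise fall back to the Python one.
tally_c = {}
(count_elements_c or count_elements_py)(tally_c, words)
```

Both paths should give the same counts; only the amount of work done in the interpreter differs, which is the whole disagreement in this thread.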
Apparently there will be someone who feels the need to point it out; and someone who feels the need to point out that it doesn't need to be pointed out; and …
Python's `collections.Counter` is written in Python and is a subclass of the builtin `dict` type. I don't think it's comparable to something like using `pandas` to solve the problem.
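A quick way to check the subclass claim yourself, as a minimal sketch using only the stdlib (the sample string is arbitrary):

```python
from collections import Counter

# Counter is a dict subclass, so every Counter instance is also a dict.
is_dict_subclass = issubclass(Counter, dict)

counts = Counter("mississippi")
is_dict_instance = isinstance(counts, dict)

# Ordinary dict access works; missing keys report a count of zero
# instead of raising KeyError.
s_count = counts["s"]
missing_count = counts["z"]
```

So the storage and lookups are the builtin `dict`, with Counter adding the tallying behaviour on top.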