The TIOBE index is meaningless

The TIOBE index ranks programming languages. It claims to be based "on the world-wide availability of skilled engineers, courses and third party vendors". But how can anyone reliably and automatically mine such information using just search engine results?

Not only is their data unreliable, it is also prone to "spamming", because search engines are! This is why we see a totally obscure experimental Forth-like language such as "Factor" get into the top 50. The most plausible explanation: the TIOBE index is simply a combination of the number of results of some search queries at major search engines; because a handful of people regularly post articles about Factor on social bookmarking sites such as Reddit, or on Wikipedia, its position is artificially inflated.
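To make the mechanism concrete, here is a minimal sketch of a ranking built purely from search hit counts. It is not TIOBE's actual formula, which I am only guessing at; the engine names, language names and hit counts are all made up for illustration. The assumption is that each engine's counts are turned into shares and the shares are averaged across engines.

```python
def ratings(hits_by_engine):
    """Map {engine: {language: hit_count}} to {language: rating in percent}.

    Assumed scheme (not TIOBE's published formula): for every engine, a
    language's share is its hits divided by that engine's total hits; the
    rating is the average share over all engines.
    """
    languages = set()
    for hits in hits_by_engine.values():
        languages.update(hits)
    scores = {lang: 0.0 for lang in languages}
    for hits in hits_by_engine.values():
        total = sum(hits.values())
        if total == 0:
            continue  # an engine with no hits contributes nothing
        for lang, count in hits.items():
            scores[lang] += count / total
    engines = len(hits_by_engine)
    return {lang: 100.0 * score / engines for lang, score in scores.items()}


def ranking(hits_by_engine):
    """Languages ordered from highest to lowest rating (ties broken by name)."""
    rated = ratings(hits_by_engine)
    return sorted(rated, key=lambda lang: (-rated[lang], lang))


# Hypothetical hit counts: one huge language and two niche ones.
before = {
    "engine_a": {"BigLang": 1_000_000, "NicheA": 900, "NicheB": 1000},
    "engine_b": {"BigLang": 1_000_000, "NicheA": 900, "NicheB": 1000},
}

# A few enthusiasts get 300 extra pages about NicheA indexed by one engine.
after = {engine: dict(hits) for engine, hits in before.items()}
after["engine_a"]["NicheA"] += 300

print(ranking(before))
print(ranking(after))
```

With these made-up numbers, 300 extra pages on a single engine are enough for NicheA to overtake NicheB, while BigLang is unaffected. At the bottom of such a list the absolute counts are so small that a handful of active posters can decide the order.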

The other explanation is that Factor is legitimately getting a lot of web attention. But that's absurd, since it doesn't deserve any serious attention. I mean, it is on the same level as Brainfuck. Brainfuck is interesting to programming language geeks. Factor can be interesting to Forth geeks, or compiler geeks. But that's not what TIOBE is about.

In the real world, there is no Factor. It is just a virtually unknown obscure experimental language with a small fandom that managed to get into a mostly meaningless index. You want proof?

There is not a single scholarly article about it, not a single PhD thesis about it, not a single known application written in Factor, and not a single school giving courses in Factor; in fact, Factor isn't even in the Debian distribution, while Brainfuck, which is also an obscure language, is. How many people in the world are paid to write Factor code?

But then it could be that Factor is the language of the future, and TIOBE is very good at picking languages of the future?

It seems rather that TIOBE is just very good at picking up spamming efforts. Consider the following important languages, which are not in the top 50.

Let's show that the rankings at the TIOBE index do not map to language importance according to any criteria other than web hype:

The other languages cited in the top 50 are usually vendor-specific languages of products that have some momentum; for many of those languages, knowledge of the language is indistinguishable from knowledge of the particular software product. And what the hell is PL fucking I doing in a 2008 list of the top 50 languages?

So, while obscure experimental languages and vendor-specific scripting frameworks clutter the top 50 list, industrially and academically important real-world languages such as VHDL, Verilog or OCaml are relegated to the end or not mentioned at all.

2008-02-02