The Crystal programming language has been mentioned a couple of times on this forum (by asterite, its original creator, almost two years ago), but I think it would be useful to have more discussion.

I recently came across it again while looking at these benchmarks, where Nim's JSON parsing performs much better against PyPy and Node than in my own benchmarks, but Crystal comes out on top. Crystal also surpasses Nim in the Havlak loop finder and base64 benchmarks.

I find this surprising, given that Nim seems to be further along in its development, and given Crystal's policy of "never have to specify the type of a variable or method argument". What is Crystal doing to be faster?

I personally prefer Nim: I like Python's syntax better than Ruby's, and Nim now has a better license. I think Nim has a better chance of breaking the ceiling of new programming languages and becoming mainstream, perhaps even dominant in certain fields someday. But performance is a crucial selling point for Nim - if it's not "faster enough" then people won't make the switch from something like Python, or they'll continue using another systems language for the small fraction of their code that requires optimization.

2015-05-21 00:56:54

The fact that the benchmarks were written by one of the top Crystal contributors might have something to do with it.

I took a short look and fixed a few minor things in these benchmarks already. You could improve Nim's performance by implementing better algorithms for these problems. For example, create a lookup table for base64, as the C implementation does, though your binaries then get a bit bigger. Or do a streaming parse for the JSON benchmark, as the Crystal implementation does. (I have the code for it, but want to clean it up and eventually add streaming JSON parsing to the Nim standard library.)
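To make the lookup-table idea concrete, here is a minimal sketch in Python (chosen only for readability, not speed). It is not the code from the benchmarks or from any standard library; the names `DECODE_TABLE` and `decode_with_table` are made up for illustration, and it assumes standard padded base64 input with no whitespace.

```python
import base64

ALPHABET = b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

# Built once: maps every possible byte value (0-255) to its 6-bit value,
# or -1 if the byte is not part of the base64 alphabet.
DECODE_TABLE = [-1] * 256
for value, char in enumerate(ALPHABET):
    DECODE_TABLE[char] = value


def decode_with_table(data: bytes) -> bytes:
    """Decode padded base64 using one table lookup per input byte."""
    data = data.rstrip(b"=")  # padding carries no payload bits
    out = bytearray()
    acc = 0    # bit accumulator
    nbits = 0  # number of valid bits currently held in acc
    for byte in data:
        value = DECODE_TABLE[byte]
        if value < 0:
            raise ValueError(f"invalid base64 byte: {byte!r}")
        acc = (acc << 6) | value
        nbits += 6
        if nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
            acc &= (1 << nbits) - 1  # keep only the leftover bits
    return bytes(out)


if __name__ == "__main__":
    sample = base64.b64encode(b"hello world")
    print(decode_with_table(sample) == base64.b64decode(sample))
```

The trade-off is the one mentioned above: the 256-entry table costs some memory (or binary size in a compiled language), but it replaces a chain of range comparisons per character with a single indexed load.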

While I enjoy optimizing benchmarks, I don't think they're as useful as some people believe. If performance really matters to you in Nim, you can disable runtime checks, you can manually manage memory instead of using the GC, you can write your own optimizations in the form of term rewriting macros, and you can go down to intrinsics or assembler if it's really necessary.

2015-05-21 01:15:20

I don't think benchmarks matter much. Too many things are mixed: the data structures used, the algorithms for some of the operations, the backends (gcc, llvm), etc. In some other benchmarks out there Nim wins. It does certainly win in the brainfuck interpreter. Also, Nim usually is shorter to write (probably because of the lack of "end").

What's important for me is (in no order): 1. code clarity, 2. good-enough performance in real applications. I think both Crystal and Nim can offer this.

If you check the JSON benchmark, Nim takes just a bit longer than Crystal. I'm sure there's a small optimization in Nim's JSON parsing that could be introduced, but that performance is already good enough for real applications, so it's not that important.

About the "never have to specify types": even if you don't specify them, the compiler still types things to the most narrow type (in most cases), so it's similar to Nim in that things have types and they can be optimized further than a dynamic language.

There's no need for a single language to win over others: many can coexist, so you can choose whatever style you like. Most importantly, have fun programming and optimizing stuff.

2015-05-21 01:30:42

Crystal fits a different niche. It's a nice, fast replacement for Ruby.

Nim is an incredibly powerful language, but with a fair amount of safety. I consider it the highest productivity language today, when maintenance costs are included. It sells itself.

As for JSON, I have the world's fastest JSON parser (based on vivkin/gason), but I cheated. (I have other plans for it.) There are many ways to cheat.

  • Don't construct actual dictionaries.
  • Delay decoding of Unicode or floats.
  • Retain pointers into the input buffer.
  • Avoid deserialization.
  • Etc.
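
A rough sketch of two of these tricks (keeping offsets into the input buffer and delaying float decoding), written in Python for readability. This is not how gason or nim-gason work internally; the class `LazyNumberArray` is hypothetical, it only handles a flat JSON array of numbers, and it does no validation.

```python
class LazyNumberArray:
    """Records offsets into the input buffer; decodes a number only on access."""

    def __init__(self, buf: bytes):
        self.buf = buf    # the input buffer is retained, not copied
        self.spans = []   # (start, end) offsets of each number token
        self._scan()

    def _scan(self) -> None:
        buf = self.buf
        i = buf.index(b"[") + 1
        end = buf.rindex(b"]")
        separators = b", \t\r\n"
        while i < end:
            # Skip commas and whitespace between tokens.
            while i < end and buf[i] in separators:
                i += 1
            start = i
            # Advance to the end of the token without interpreting it.
            while i < end and buf[i] not in separators:
                i += 1
            if i > start:
                self.spans.append((start, i))

    def __len__(self) -> int:
        return len(self.spans)

    def __getitem__(self, index: int) -> float:
        start, stop = self.spans[index]
        return float(self.buf[start:stop])  # decoding cost is paid here


if __name__ == "__main__":
    numbers = LazyNumberArray(b"[1.5, 2, 3e2]")
    print(len(numbers), numbers[2])
```

The "parse" step here only finds token boundaries, so a benchmark that times parsing alone never pays for float conversion. That is exactly why such numbers say little about the cost of actually using the data.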

gason benchmark, Clang 3.4 (tags/RELEASE_34/final), x86_64 1, SIZEOF_POINTER 8, NDEBUG 0

 Parse   Speed (units/s)  Parser
814.14    71.13           rapidjson normal
784.22    73.84           rapidjson insitu
554.78   104.38           gason
181.47   319.12           nim-gason

Benchmarks can be gamed. I worry only about whether idiomatic code is reasonably efficient. Nim seems pretty good at everything.

2015-05-21 03:20:16

Library benchmarks tell you little about the language; their performance is generally a function of the library implementation. LAPACK isn't fast because it's written in Fortran, but because a ton of work has gone into tuning the library, both mathematically and for modern CPU architectures.

Language performance these days is 90% about feeding your code in a suitable form to the established backends (LLVM/clang, gcc, JVM, CLR) and relying on the expertise of their implementors to do the actual heavy lifting [1]. As long as you can represent your data in a reasonably backend-friendly form, identical algorithms should result in roughly comparable performance for the same backend. Crystal has to do more work here because it has to infer types, but once it has (say) figured out that a variable is a plain integer, it can generate exactly the same code as a statically typed language does where the type is explicitly declared.

[1] The exception is if your team consists of actual backend gurus (e.g. V8, Dart, LuaJIT) who can do that themselves, but those are few and far between.

2015-05-21 04:22:13

Libman: their policy of "never have to specify the type of a variable or method argument".

This just means that they do aggressive type inference, a la Haskell. Since Crystal is, nonetheless, a statically typed language, this has no bearing on performance.

2015-05-22 04:02:38

It looks like Crystal has an LLVM backend. While I find the notion of a language producing intermediary (ugh) C questionable, it seems a bit less hairy to have your language produce C and require the existence of a "real" compiler than to have your language use the LLVM virtual machine and require the existence of LLVM specifically. Nim can (in theory) work seamlessly with LLVM, gcc, and a number of other C compilers, and someone even got cygwin to work, I think. As nice as it is that LLVM made a standard "compiler framework" backend thingy, what it amounts to for language design is effectively the same as just producing C, but also tying yourself to one single backend.

Also I hear tell that LLVM is pretty huge compared to some other compiler suite thingies. Rust (auuuuugh) for instance takes hours to compile on my machine, due to the resource requirements of preparing the LLVM stuff. Going through the massive effort of learning the binary LLVM plugin backend architecture thing, when you can just spit out a kilobyte of ugly C and end up with the same damn thing, always makes me question the choice to use an LLVM backend.

So I'm skeptical about the utility of Crystal, though I can bet it beats the pants off of RubyCorporate2.0Edition or whatever they call the optimized ruby thing.

Although aggressive type inference really is awfully nice...

2015-05-22 08:06:48

An approach based on intermediary C code has one more advantage over the Rust way: portability. Porting Rust to DragonFly BSD seems to be very troublesome, while "porting" Nim to NetBSD requires a single make command. And that makes me very happy.

PS. Yes, Nim works on NetBSD, at least with the default gcc, but I'm also playing around a little with PCC, which is available as an alternative.

2015-05-22 22:20:21

Yup, compiling to C is definitely the way to go, at least in the beginning. As a {F,O,D}BSD advocate, I greatly appreciate the instant portability. When running Linux (e.g. on a work laptop), I appreciate being able to compile Nim without compiling all of LLVM. Choice of compiler can also get you better performance on some platforms (e.g. Intel's icc optimized for Android, Solaris CC, MS vcc, AIX xlc, etc.), but I think LLVM is now close to being the champ on all platforms.

I'd love to see Nim do more aggressive type inference and come ever closer to Python's brevity. Perhaps even assumed "import"s too - shaving off lines of code is a good thing. As I suggested earlier, appealing to Python fans with something almost as productive but a lot faster would be a winning strategy for Nim - just gotta inch forward on both those fronts: performance (especially for Web apps) and Pythonic brevity...

2015-05-22 23:58:01

Compiling to C obviously has many advantages; my question is whether Nim can do better along some important metric (speed of executable, generated code size, etc) by targeting LLVM directly or with a native (direct to machine code) backend. Of course, even if the answer to that question is 'yes', the next question would be 'Is it worth it given the limited manpower?'. Note that my question isn't about replacing the C backend, rather about having multiple backends.

2015-05-23 15:50:15