Wednesday, February 22, 2012

Preliminary thoughts on Scala vs Clojure

I've been using Scala for a little over a week now. Here are some preliminary thoughts on Scala vs Clojure.

(i) Scala is a better Java. Scala is still Java, in the sense that it is an OOP language requiring you to put a lot of effort into determining what is public and what is private. I've written about alternative approaches to OOP before [here, here, and here]. I don't want to go through the same discussion here (and don't claim to have fully grasped all the issues). I'll just say that I'm not smart enough to keep track of all the things you have to keep track of to do Java-style OOP. I can write pure functions that operate on data structures. I can't keep track of how all the data and methods and private etc. interact with one another. I couldn't do it when I tried Java and I can't do it now.
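To make "pure functions that operate on data structures" a bit more concrete, here is a minimal sketch of that style written in Scala. The names (`Sample`, `SampleStats`, `mean`, `variance`, `scaled`) are hypothetical and only for illustration. The point is that there is no public/private bookkeeping and no hidden state to track.

```scala
// Hypothetical example: all names here are illustrative, not from any library.
// The data is a plain immutable value.
final case class Sample(values: Vector[Double])

object SampleStats {
  // Pure function: the result depends only on the argument, nothing is mutated.
  // Note: an empty sample gives NaN, since 0.0 / 0 is NaN for doubles.
  def mean(s: Sample): Double =
    s.values.sum / s.values.length

  // Unbiased sample variance, built on top of mean.
  def variance(s: Sample): Double = {
    val m = mean(s)
    s.values.map(v => (v - m) * (v - m)).sum / (s.values.length - 1)
  }

  // Returns a new Sample instead of modifying the old one.
  def scaled(s: Sample, k: Double): Sample =
    Sample(s.values.map(_ * k))
}
```

Calling `SampleStats.mean(Sample(Vector(1.0, 2.0, 3.0)))` returns `2.0`, and the original `Sample` is untouched after `scaled`.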

(ii) I miss macros. That surprised me because I don't write a lot of macros in Clojure. A lot of what I have done in Clojure makes use of Incanter, so it was just simple library calls. I don't even understand macros fully. Yet I miss them when they're not available. I spent a few minutes looking at Scala macros. That was enough.

(iii) Syntax is a double-edged sword. I'm a fan of "everything's a list". If anything, I'd prefer to cut down on the amount of syntax that ends up in Clojure programs. Yet "everything's a list" just can't be sold to others. Scala, on the other hand, probably has the best syntax I've seen. I'd imagine that Scala makes a very good first impression. I haven't talked to anyone else about Scala so I don't know.

(iv) Scala performance is better than Clojure's. It's too easy to write slow code in Clojure. The various benchmarks I've seen support this. There are a lot of good things I can say about Incanter, but you wouldn't choose it for performance. I don't mean this as a way to start a flamewar; it's just an observation, and I'd be happy for someone to point out how I'm wrong. By that I don't mean a list of ways to tune my Clojure code. Tuning is a PITA that should be done by the author of the library. If performance is critical, I can't see a reasonable argument for using Clojure rather than Scala.

(v) I do mostly numerical programming. The JVM is a serious drawback for both languages. Beyond that, I don't see much interest from the Clojure community in numerical computing. All I see is Incanter, which doesn't seem to be very active, doesn't offer great performance, and is far from a complete solution relative to R, Matlab, and Scipy. Clojure is focused on other areas, and that's fine, but this is my comparison of Clojure and Scala, and it's what matters to me. On the other hand, Scala is very active in this area. Five years from now Scala will probably be a strong competitor to Matlab, while offering a much better language. It would be interesting to compare Clojure against SBCL for numerical computing.

(vi) Both have adequate IDE support in Eclipse. Counterclockwise is pretty good. The Scala IDE is just unbelievable. You don't need to debug Scala code: you just look at the bottom of the screen to see if there are any errors. You can even understand Scala error messages.

(vii) I probably have a preference for static typing overall, but it depends on what I'm doing and my mood.

(viii) Clojure has some rough edges. The transition to 1.3 coinciding with changes to Clojure/core left me frustrated. I use Counterclockwise with 1.2 and the old Clojure/core. I value my time. I don't think Clojure is where it needs to be in terms of documentation. Also, I really, really wish that those who write Clojure documentation would understand that NOT ALL CLOJURE USERS ARE CURRENTLY EMPLOYED AS ENTERPRISE JAVA DEVELOPERS! I tried Java years ago, hated it, and moved on. I know nothing about Java. It took me quite a while to understand how the hideous "com.Enterprise.Verbose.For.No.Reason" thing works when using libraries. I had to look in a Java book to get the necessary background. I have had a much smoother experience with Scala. Maybe the difference is that I've learned the Java approach by using Clojure.

(ix) Clojure has an impressive set of books available. They all do a decent job of explaining the language. Yet none of them can compete with Programming in Scala by Odersky, Spoon and Venners. You learn a lot more than Scala in that book. I'd highly recommend it to any intermediate programmer, even those who don't plan to use Scala. There are excellent resources available for either language so I don't see this as a reason to choose Scala.

Conclusion: I like the Clojure language better. I'd probably be more productive in Scala because it has a stronger future for numerical computing. I could sell others on Scala, but probably not on Clojure, due to syntax. Neither is going to replace my current combination of R + Fortran for everyday work.

That's a lot more than I planned to write. I'll update as things come to mind or as I learn I was wrong.

18 comments:

Anonymous said...

Man, Scala for numerical computing doesn't seem like a good idea, and neither does Clojure. R, and also F#, fit perfectly. F# has several tools, and the CLR seems the most suitable for numerical computing. Maybe you should give it a try.

lmf said...

I've actually looked at F#. I'm not willing to switch to Windows, and I'm not willing to put faith in Mono because I don't trust its funding (if it even has any). I've also read that Mono is really slow.

That said, I'll maybe try OCaml at some point. It would require a pretty big investment.

Anonymous said...

I can't see R being better than Clojure performance-wise, but I'd love to see some benchmarks. Incanter was built in response to R's native language being slow, as I understand it.

lmf said...

R is slower than Clojure, but much of R is calls to C/C++/Fortran. Incanter is terrible at things like matrix algebra. Parallel Colt doesn't do very well in the Java matrix algebra benchmarks.

As indicated in the post, I don't see much interest in using Clojure for scientific computing, so I don't plan to pursue it anytime soon.

lmf said...

Also, for comparison of Scala and Clojure, there is the CLBG:

http://shootout.alioth.debian.org/u32q/benchmark.php?test=all&lang=clojure&lang2=scala

Scala does better in every benchmark. In almost every case it is faster by a factor of two or more, and in one case nine.

It's tough to compare with R because R calls so much native code, so R will be many times slower for a matrix multiplication written in pure R, but much faster doing x%*%y.

In addition, you have the option of easily calling Fortran or C++ code from R but not Clojure. For someone with Fortran experience, it's simple to rewrite a six-line bottleneck in an R program and get a speedup of 10 or 20 times. If you're willing to accept "99% R code" as R, nothing will be faster.

Anonymous said...

Here is an interesting blog post about how fast Clojure is. It starts off with a basic Clojure program and proceeds to perform several optimizations that result in C-like speed.

http://www.learningclojure.com/2010/09/clojure-is-fast.html

The example is certainly far from elaborate, but it does show how much faster Clojure can get with some rather simple tweaks. I doubt very much that the shootout did any of these things.

Your point about having the library writers do this by default is fair, but from what I have seen, the Clojure community's focus hasn't really been on these sorts of things, and that might change (slowly).

I am into numerical computing as well, and I don't want to go the Scala/Java/OOP route.

Anonymous said...

You are absolutely right about the benefits of a good native C interface. This is also how numpy and other scientific python libraries achieve good performance. It's certainly not due to python.

The JNI isn't quite as convenient for interfacing with C code, but I've still managed to make use of it in the NLP domain. In this scenario Clojure is just used as glue and for basic pre- and post-processing. It would be fantastic to have Clojure ported to a non-JVM host to make interop with C easier. I suppose there is Common Lisp, but I quite like some of what Clojure brings to the table there. I can dream. :-)

Bruce said...

Incanter isn't quite so moribund (we just had a release last week on Clojure 1.3). We're working on the speed issues as well (we think most of them are around reflection).

lmf said...

@Bruce That's great to hear. I'll be checking it out. I really want to use Clojure, and Incanter is critical to that.

@anon My recollection is that ABCL can use all the QuickLisp libraries. Not sure how it works, though, so I don't know if it would be an option for Clojure.

@anon-- Thanks for the link. I'll check it out.

Anonymous said...

Well, if the problem with F# is about "Microsoft hate" then I can't recommend it. Still, Mono != .NET, and Mono has no relation to Microsoft (actually Mono is as "open" as Java, because both are maintained by private companies but both are open source too). I love Clojure, but I don't like the JVM as a platform for functional languages, mainly because it doesn't support tail-call optimization and many other functional features.
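To make the tail recursion point concrete: the JVM does not eliminate tail calls in general, but the Scala compiler rewrites a self-recursive tail call into a loop, and the `@tailrec` annotation turns it into a compile error if it can't. (Clojure works around the same limitation with an explicit `recur`.) A minimal sketch, where `TailCallDemo` and `sumTo` are made-up names for illustration:

```scala
import scala.annotation.tailrec

// Hypothetical example: TailCallDemo and sumTo are illustrative names.
object TailCallDemo {
  // Sums 1..n with an accumulator. The recursive call is the last thing
  // loop does, so scalac compiles it to a loop and the stack does not grow.
  def sumTo(n: Long): Long = {
    @tailrec
    def loop(i: Long, acc: Long): Long =
      if (i > n) acc else loop(i + 1, acc + i)
    loop(1L, 0L)
  }
}
```

This only covers a function calling itself. Mutual recursion between two functions is not rewritten this way, and that is where the JVM limitation still shows.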

Mono isn't slow. I don't trust benchmarks much, but in the latest benchmarks the difference between C# on the CLR and on Mono is very small, and many times Mono is actually faster than .NET. In my experience, I'd say that Mono has better performance on Linux/OS than Java on Windows, and I don't feel any difference between Mono on OS/Windows.

http://geekswithblogs.net/CISCBrain/archive/2005/08/28/Mono_vs_dotNet_Performance_Test.aspx

I've read about OCaml on the JVM. Maybe it would be interesting.

lmf said...

My concern about mono is the opposite. I don't know if it will be around or if it will be able to keep up with the "official" version.

If Microsoft were to put out a Linux version of everything (under an open license of course) I'd be very happy to give it a try. I just don't trust that I'd be able to run something in five years if I write it in F#.

Anonymous said...

I am also very interested in using Clojure for numerical computing. It was my understanding that the changes in 1.3 would help us performance-wise. All we need is to bootstrap a virtuous cycle of user base and libraries.
Clojure usually solves this by leveraging JVM interoperability.
I don't know Scala so I may be way off, but maybe we could "assimilate" the numerical goodies that you speak of by wrapping them?

Anyway, I do believe in Incanter as the next R replacement, and Lisp syntax can and will become mainstream (it is not easy at first, but really simple in fact :) ).

Anonymous said...

Pidigits(Scala) using BigInteger
CPU Elapsed Memory Code
22.68sec 22.72sec 418,516KB 479B
http://shootout.alioth.debian.org/u32/program.php?test=pidigits&lang=scala&id=3

Pidigits(Scala) using GMP(native)
CPU Elapsed Memory Code
4.38sec 4.39sec 42,196KB 1125B
http://shootout.alioth.debian.org/u32/program.php?test=pidigits&lang=scala&id=4

Pidigits(F# Mono) using BigInteger
CPU Elapsed Memory Code
126.05sec 126.10sec 10,052KB 513B
http://shootout.alioth.debian.org/u32/program.php?test=pidigits&lang=fsharp&id=1

Pidigits(F# Mono) using GMP(native)
CPU Elapsed Memory Code
3.90sec 3.90sec 5,752KB 903B
http://shootout.alioth.debian.org/u32/program.php?test=pidigits&lang=fsharp&id=3

scala vs F# Mono
http://shootout.alioth.debian.org/u32/benchmark.php?test=all&lang=scala&lang2=fsharp

Ben Racine said...

I think it sounds like Julia [http://julialang.org/] would be of interest to you.

lmf said...

Thanks - I've checked out Julia, but it's still early days for that language. It'll probably be at least a year before I'll be willing to write anything in Julia, and even at that, it will depend on the available libraries and how much support exists for the things I want to do.

Anonymous said...

Your first mistake was giving up on Java "years ago". It takes a long time before a language is robust enough, and every language I've used is rough around the edges for the first 5 years. Java 1.5 brought generics, which 1.6 improved upon, and the nio package is pretty good.

Most developers hate Java because of Swing and left Java before the JEE took hold. I agree Swing was a pain (and still is to me), but Swing is far better today than 8 years ago. Add in Netbeans Platform 7, and you can produce an enterprise-level desktop application in half of the time you can in .NET.

I also use Python and Ruby for trivial stuff like document publishing and archival where Java isn't needed. Not every solution will work with every language - that's why today's developers need to be Polyglot developers.
