There's a joke that comes in a few flavors: (1) you need a PhD in Computer Science to program in Haskell, or (2) you need a PhD in Computer Science or Mathematics to perform I/O in Haskell. Why do people who use Haskell all brag (or complain) that Haskell is hard to wrap your brain around?
I've thought about this a little, as one who has read dozens of recent research papers to learn more about Haskell. (Disclaimer: I don't have a PhD, just a BS in CS.) Haskell is such a powerful language precisely because it was built out of just the right number of just the right concepts. Once you've learned those (and while you're learning them) you see through the language and into the realm of concepts that you can express with it. Of course no language (programming or natural) is free from a context, and Haskell definitely has a context you need in order to understand it. But instead of having you learn the language to express old concepts, Haskell has a way (for reasons I hope to tease out here) of teaching you new, more powerful, better ways of stating and, since it's a programming language, solving problems.
The authors of SICP make the interesting claim that learning how to program should be about the concepts and abstractions, and not about the syntax. They are able to do this because they use Scheme, a language with more or less no syntax, instead of a syntax-heavy language like Pascal. (The book was written when Pascal was the programming language of the future; in fact, I learned Pascal in high school. Today you might say they avoided syntax-heavy languages like Java, C++, and Perl.) The tradeoff made in the design of Haskell was to opt for slightly more built-in syntax in order to express programs with less, yet cleaner, code than Scheme.
Most languages (especially Perl and, to a sadly-too-large degree, Python) cloud the concepts in programs with the language itself. Haskell avoids this by tidying up the language all the way. The core of Haskell is pretty small and can be learned in a few chapters or a few weeks, maybe twice the time it takes to explain Scheme's syntax, especially since Haskell's syntax (and semantics) more closely resembles mathematics. But Haskell allows the programmer to extend the language in various very clean, idiomatic ways, without macros. (The core of the language consists of function application and case expressions over algebraic datatypes. That's it. With these you can implement most of the rest of Haskell. There is a bit of built-in syntactic sugar for lists, numbers, and the like, but you could easily define your own numbers, lists, trees, etc. with algebraic datatypes.)
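To make the parenthetical concrete, here is a minimal sketch of home-made numbers and lists built from nothing but algebraic datatypes, function application, and case expressions. The names Nat, List, add, len, and toInt are my own for illustration, not anything from the standard library, and toInt leans on the built-in Int only so the results are easy to read.

```haskell
-- Natural numbers, Peano style: a number is zero or the successor of a number.
data Nat = Zero | Succ Nat

-- A list is either empty or an element followed by another list.
data List a = Nil | Cons a (List a)

-- Addition by case analysis on the first argument.
add :: Nat -> Nat -> Nat
add m n = case m of
  Zero    -> n
  Succ m' -> Succ (add m' n)

-- Length of a home-made list, counted in home-made numbers.
len :: List a -> Nat
len xs = case xs of
  Nil         -> Zero
  Cons _ rest -> Succ (len rest)

-- Convert to a built-in Int, purely for display.
toInt :: Nat -> Int
toInt n = case n of
  Zero    -> 0
  Succ n' -> 1 + toInt n'
```

For instance, toInt (add (Succ (Succ Zero)) (Succ Zero)) evaluates to 3, and toInt (len (Cons 'a' (Cons 'b' Nil))) evaluates to 2.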
The upshot of all this is that the time you spend "learning Haskell" is really time spent learning with Haskell, or in many cases, learning from Haskell. A simple example that most Haskell learners will run into at some point is the monad, the source of the accusation that you need a PhD in Category Theory to do I/O in Haskell. You learn about monads to understand how a side-effect-free language can perform I/O. You learn the helpful "do" syntactic sugar.
But in the process, you learn something that many other languages only give you in pieces: that there is a simple, subtle, general way to structure the semantics of sequencing, and that I/O, parsing, continuation-passing, state-transforming, etc. are all just special cases of a similarly shaped "pattern". But this "pattern" is so subtle that OOP practitioners don't really identify it completely. It's like a totally hidden, implicit assumption that only comes to light when you step away from it.
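A small sketch of that shared shape: one function, written once in "do" notation, that sequences two steps in any monad. The names pairUp and pairUp' are hypothetical, made up for this illustration. The second version spells out roughly what the "do" sugar stands for in terms of the bind operator (>>=).

```haskell
-- Run two computations in order and pair their results, in any monad.
pairUp :: Monad m => m a -> m b -> m (a, b)
pairUp ma mb = do
  a <- ma
  b <- mb
  return (a, b)

-- The same function with the "do" sugar removed.
pairUp' :: Monad m => m a -> m b -> m (a, b)
pairUp' ma mb = ma >>= \a -> mb >>= \b -> return (a, b)
```

What "in order" means depends entirely on the monad you pick. In Maybe, pairUp (Just 1) (Just 'x') is Just (1,'x'), and a Nothing on either side makes the whole result Nothing. In the list monad, pairUp [1,2] "ab" is [(1,'a'),(1,'b'),(2,'a'),(2,'b')], every combination. In IO, pairUp getLine getLine reads two lines from standard input, one after the other.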
More specifically, the strict, implicit state-transforming semantics of almost every programming language, including Java, C++, Pascal, Perl, Python, Lisp and Scheme (though they build theirs on top of continuations and let you mess with the continuation guts), and even ML or OCaml, are hard-wired into the language, and you don't have the option to change this. Even the Good Ship Python, with its mantra of "explicit is better than implicit," doesn't recognize that it is sailing in implicit "strict, state-passing" water. This is a very subtle issue.
A friend of mine explained this as, in my words, "Haskell has no preferred semantics, and the monad just lets you swap out the semantics according to your needs." The beauty of identifying this, even though we have to use the scary word "monad", is that you are now free to write more easily extensible programs. (Incidentally, Simon Peyton Jones proposed a retroactive change to the word "monad": the Haskell community should instead have called it a "warm fuzzy thing.")
This is just one case where Haskell helps bring your implicit assumptions out into the light, that is, explicitly into your code. Now you can change the behavior of your program by replumbing the entire building without cutting open any walls or ceilings. I did this with a parser I was building that later needed to pass state around. I literally changed a line of code for the type of my parser monad, and now any place in my program that needs it has access to a threaded state.
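This is not my actual parser, just a minimal sketch of the same kind of one-line change, assuming the transformers package that ships with GHC. All the names here (Parser, satisfy, number, tick, countedNumber, runParser) are made up for the illustration. The parser is a state monad over the remaining input, layered on Maybe for failure. Slipping an extra StateT Int layer into the type synonym threads a counter through everything, while satisfy and number compile unchanged under either definition.

```haskell
import Control.Applicative (empty, many, some)
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, get, modify, put, runStateT)
import Data.Char (isDigit, isSpace)

-- The one line that changes. The stateless version would read:
--   type Parser = StateT String Maybe
type Parser = StateT String (StateT Int Maybe)

-- Old code, untouched: consume one character matching the predicate, or fail.
satisfy :: (Char -> Bool) -> Parser Char
satisfy ok = do
  input <- get
  case input of
    (c:rest) | ok c -> put rest >> pure c
    _               -> empty

-- Old code, untouched: a run of digits followed by optional whitespace.
number :: Parser Int
number = do
  digits <- some (satisfy isDigit)
  _ <- many (satisfy isSpace)
  pure (read digits)

-- New code made possible by the extra layer: bump the threaded counter.
tick :: Parser ()
tick = lift (modify (+ 1))

-- A number parser that also counts how many numbers it has seen.
countedNumber :: Parser Int
countedNumber = number <* tick

-- Run a parser from a counter of 0; return the result and the final count.
runParser :: Parser a -> String -> Maybe (a, Int)
runParser p input = do
  ((result, _rest), count) <- runStateT (runStateT p input) 0
  pure (result, count)
```

With these definitions, runParser (many countedNumber) "1 22 333" gives Just ([1,22,333],3). The only places that know about the counter are tick and the function that runs the parser; none of the walls or ceilings in between had to be opened.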
There are practical benefits to programming in a language as powerful as Haskell. I can't defend some of the weaknesses inherent in using Haskell in practice (well elaborated in "Why No One Uses Functional Languages"), but I will say that conceptually, no other programming language comes close to Haskell in showing you a broader landscape of new, amazing ways to solve problems, or in giving you the tools to take advantage of this elevated paradigm. Take the time and learn from Haskell. Maybe by the time you understand the whole thing, you will know enough to have a PhD!