I've just merged a Cascalog pull request of mine
[https://github.com/nathanmarz/cascalog/pull/270] that gives Cascalog operations
access to the statistics that Cascading generates at the end of each job. I've
also added global inc! and inc-by! functions that let you increment custom
[...]
Cascalog 2.0 In Depth
Cascalog 2.0 has been out for over a year now, and outside of a post to the
mailing list
[https://groups.google.com/d/topic/cascalog-user/F8EkFM7HiE0/discussion] and a
talk at Clojure/Conj 2013 [https://www.youtube.com/watch?v=uuJW3EaN_3Q] (
slides here
[https://speakerdeck.com/sritchie/
[...]
Hardcore Cascalog: Dynamic Queries
A little side note before I get started - pivoting from my last post on ski
mountaineering racing [http://www.samritchie.io/skimo-racing/] to this post on
advanced Cascalog [https://github.com/nathanmarz/cascalog] patterns has made me
realize that I'm a full-fledged connoisseur of the esoteric. I
[...]
Cascalog Testing 2.0
A few months ago I announced Midje-Cascalog
[http://sritchie.github.com/2011/09/30/testing-cascalog-with-midje.html], my
layer of Midje testing macros over the Cascalog MapReduce DSL. These allow you
to write tests for your Cascalog jobs in a style that mimics Cascalog's own
query execution syntax. In
[...]
Introducing Cascalog-Contrib
I've had the pleasure of working with Cascalog
[https://github.com/nathanmarz/cascalog] for about ten months now, and have seen
the community produce some fantastic work. A number of businesses
[https://www.assembla.com/spaces/cascalog/wiki/Who's_using_Cascalog] are using
Cascalog in production;
[...]
Testing Cascalog with Midje
I've been working on a Cascalog testing suite these past few weeks, an extension
to Brian Marick's Midje [https://github.com/marick/Midje], that eases much of
the pain of testing MapReduce workflows. I think a lot of the dull work we see
in the Hadoop
[...]
Getting Creative with MapReduce
One problem with many existing MapReduce abstraction layers is the utter
difficulty of testing queries and workflows. End-to-end tests are maddening to
craft in vanilla Hadoop and frustrating at best in Pig and Hive. The difficulty
of testing MapReduce workflows makes it scary to change code, and destroys your
desire
[...]
Cascalog 1.8.1 Released
Nathan Marz [http://nathanmarz.com/] and I are releasing Cascalog 1.8.1 today!
We've added a few interesting features, and I thought I'd provide a bit more
detail here for anyone interested.
Cross Join
cascalog.api now includes support for cross-joins
[http://en.wikipedia.org/
[...]