Mutterings and Musings

Lifting (J)Ruby Into Scala's Typeland

BACKGROUND

The Java Virtual Machine (JVM) is a "bytecode" interpreter that runs on top of a physical host platform. This means it provides a generalized translation layer that allows the same "binary" to run on any supported piece of hardware (compile-once-run-anywhere). While it's not uncommon for languages to provide a foreign function interface (FFI) for native interoperability, writing for the JVM usually means you're already writing for a "native" interface, which often makes interoperability more immediate.

Just as the JVM provides an abstraction over its host hardware, it also provides a common layer beneath many languages. The JVM hosts a Lisp (Clojure), strongly-typed functional languages (Scala, Kotlin, Flix), and at least one dynamically typed language (JRuby). These languages can call into one another to varying degrees, and mixing them in a single project is what I like to call polyglot programming.

For example: say you have some domain logic written in Ruby. You realize, however, that it would be helpful to parallelize this using a modern streaming framework written in Scala or Java. Well, you can compile your JRuby code to a JAR, import it into a Java project, and use the streaming framework as normal. (You can also often invert that relationship and use the streaming framework from JRuby, but that's not the point of this post!)

I'm going to provide an example here of using some JRuby code in Scala. Although there are several examples out there of using JVM libraries from JRuby, I haven't seen much that describes going in the opposite direction.

All of the code for this project can be found on Codeberg.

OUTLINE

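In this post, we will:

  1. Write a small JRuby library that renders templates with Tilt.
  2. Package that library, its gems, and the JRuby runtime as a self-contained JAR.
  3. Call the JAR from Scala and wrap the result in an Either.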

STEP 1: JRUBY

Start by installing a recent version of JRuby, ie:

export JRUBY_VERSION="10.0.2.0"
rvm install jruby-$JRUBY_VERSION
rvm use jruby-$JRUBY_VERSION --default

We're going to use Jeremy Evans' Tilt library to process incoming templates. Tilt is itself an abstraction over several templating languages; in this article, we'll just handle ERB, Haml, and Liquid, and we will stick to context interpolation via a hash input (so no environment context data). We'll take a payload with approximately the following structure:

{
  "suffix": "<format suffix, ie 'erb'>",
  "data": "<JSON-ified template string>",
  "context": {<JSON-ified hash of locals>}
}
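
To make the shape of that payload concrete, here is a minimal sketch that builds one with Ruby's standard json library and reads it back. It assumes the "data" field holds the template encoded as a JSON string literal (my reading of "JSON-ified"); the template text and the "name" local are made up for illustration.

```ruby
require 'json'

# Illustrative values only: the template and the "name" local are hypothetical.
template = 'Hello, <%= name %>!'

payload = JSON.generate(
  'suffix'  => 'erb',
  'data'    => JSON.generate(template), # template encoded as a JSON string literal
  'context' => { 'name' => 'world' }
)

# A consumer parses the outer document, then decodes "data" a second time.
message = JSON.parse(payload)
decoded_template = JSON.parse(message['data'])

puts decoded_template
```

The double encoding looks redundant, but it means the template can contain quotes and newlines without any special handling by whatever transports the message.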

Let's start by adding the appropriate packages to our Gemfile:

source 'https://rubygems.org'

gem 'tilt', '~> 2.0'
gem 'haml', '~> 6.0'
gem 'kramdown', '~> 2.0'
gem 'liquid', '~> 5.0'

group :development do
  gem 'rspec', '~> 3.0'
  gem 'rspec-parameterized'
end

...and install these using our chosen JRuby:

bundle install

Now, to the core code: for each message, we will decode the JSON-encoded template string, let Tilt try to parse it, and then attempt to render the template with the context (after converting the context's keys to symbols):

require 'tilt'
require 'json'

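class InTiltPret
  def self.render(suffix, data, context)
    # Look up the template class by suffix and hand it the decoded template string
    tmpl = Tilt[suffix].new { JSON.parse(data) }
    # Convert the keys to symbols so the hash can be splatted in as locals
    context_symbolified = context.reduce({}) { |acc,(k,v)| acc.merge({k.to_sym => v}) }
    tmpl.render(nil, **context_symbolified)
  end
end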
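
As a rough illustration of what render does with its three arguments, here is a minimal sketch that swaps Tilt for the standard library's ERB so that it runs without installing any gems. ErbOnlyRenderer is a hypothetical stand-in, not part of the project, and it only handles the 'erb' suffix.

```ruby
require 'erb'
require 'json'

# Hypothetical stand-in for InTiltPret.render: ERB only, standard library only.
module ErbOnlyRenderer
  def self.render(suffix, data, context)
    raise ArgumentError, "unsupported suffix: #{suffix}" unless suffix == 'erb'

    template = JSON.parse(data) # the template arrives as a JSON string literal
    # Same key-symbolizing reduction as in the Tilt version
    locals = context.reduce({}) { |acc, (k, v)| acc.merge(k.to_sym => v) }
    ERB.new(template).result_with_hash(locals)
  end
end

puts ErbOnlyRenderer.render('erb', '"Hello, <%= name %>!"', { 'name' => 'world' })
# prints: Hello, world!
```

The real class gets the same behaviour for every supported template language because Tilt picks the template class from the suffix.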

We can run our tests using RSpec:

bundle exec rspec spec/tests.rb

The Ruby tests pass, so we know our library is valid Ruby code. That's fine, but we're trying to make this accessible in Java; for that, we need to add annotations that instruct JRuby how to expose this to Java's type system (see here for more information):

java_package 'test.hoprocker'

require 'tilt'
require 'json'

java_import 'java.util.HashMap'

class InTiltPret

    java_signature 'static String render(String, String, HashMap)'
    def self.render(suffix, data, context)
        tmpl = Tilt[suffix].new { JSON.parse(data) }
        context_symbolified = context.reduce({}) { |acc,(k,v)| acc.merge({k.to_sym => v}) }
        tmpl.render(nil, **context_symbolified)
    end

end

When compiled, these annotations will provide type and package information for the JVM. The InTiltPret.render(..) method will be accessible from any JVM language after importing the test.hoprocker.InTiltPret class.

java_signature tells the JRuby compiler which method signature to expose on the JVM. JRuby is quite good about translating types, so inside the method we can treat the incoming Java arguments (including the HashMap) like ordinary Ruby objects.
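
To see that type translation on its own, here is a small sketch that runs the key-symbolizing reduction against a real java.util.HashMap when executed under JRuby, and against a plain Hash otherwise. The symbolize_keys helper and the sample data are hypothetical names introduced for this example.

```ruby
# Hypothetical helper wrapping the reduction used in InTiltPret.render.
def symbolize_keys(context)
  context.reduce({}) { |acc, (k, v)| acc.merge(k.to_sym => v) }
end

context =
  if RUBY_PLATFORM == 'java'
    # Under JRuby: build the kind of map a Java or Scala caller would pass in.
    map = java.util.HashMap.new
    map.put('name', 'world')
    map
  else
    # Under MRI: a plain Hash stands in for the Java map.
    { 'name' => 'world' }
  end

p symbolize_keys(context)
```

Either way the result is an ordinary Ruby Hash with symbol keys, which is what gets splatted into the template as locals.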

STEP 2: A JAR OF (J)RUBIES

We're now going to package this up as a JAR that can be used in any JVM application. We are going to pre-compile because a) this separates concerns, b) it's faster to do a JRuby-only or Java-only build, and c) it reifies our annotations into Java's type system, which gives us type checking in our IDE.

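To compile a (J)Ruby project as a fully-contained Java JAR, we need to:

  1. Install the project's gems into a directory under the build directory.
  2. Download and unpack the jruby-complete JAR that matches our JRuby version.
  3. Copy the installed gems into the unpacked jruby-complete tree.
  4. Compile our Ruby code and package everything as a single JAR.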

Information on this is scattered across the internet, but the basic idea is to have all of our gems (and their specifications) available inside of the jruby-complete JAR file. This will be enough to use the resulting JAR like any other Java library. We're going to expand on a process by one of the original JRuby developers outlined here, using Maven's AntRun plugin to automate the build steps [1].

A self-contained Ruby build

Our Ruby library is going to be a jar, so it gets its own pom.xml. These are the commands relevant to our build. XML is quite verbose, so I'll just show the commands themselves here, but please feel free to look at the entire pom for more context. Ant commands can take place in any subdirectory, so I'll annotate the descriptions with the execution location.


## root directory
mkdir -p ${project.build.directory}/bundle-gems ${project.build.directory}/jruby-complete

Make the directories we'll need for gem shuffling. Everything goes in ${project.build.directory} to integrate with Maven's clean lifecycle.


## root directory
cp Gemfile ${project.build.directory}/Gemfile

Because we're working out of the project build directory, the Gemfile needs to go there too.


## build directory
bundle cache --no-install --gemfile=${project.build.directory}/Gemfile

Make sure everything we need is available. This is mainly important for any locally-built gem dependencies.


## build directory
bundle config set --local path ${project.build.directory}/bundle-gems
bundle config set --local without 'test development'

Add some configuration for bundler.


## build directory
bundle install --gemfile=${project.build.directory}/Gemfile --prefer-local --no-cache

Install all of the necessary gems in our build directory. This is executed from the build directory to keep the Gemfile.lock there as well.


Now we get into the weeds a bit: we need to download a copy of the jruby-complete.jar matching our version and prep it for the build:

## root directory
curl -o ${project.build.directory}/jruby-complete.jar -s https://repo1.maven.org/maven2/org/jruby/jruby-complete/${JRUBY_VERSION}/jruby-complete-${JRUBY_VERSION}.jar
unzip -q ${project.build.directory}/jruby-complete.jar -d${project.build.directory}/jruby-complete

We then copy our dependency gems into the expanded jruby-complete, where the subdirectory matches the equivalent MRI minor version (e.g. 3.1.0 for JRuby 9.4.x, 3.4.0 for JRuby 10.x):

cp -r ${project.build.directory}/bundle-gems/jruby/3.4.0 ${project.build.directory}/jruby-complete
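
The version directory in the command above does not have to be hard-coded; it can be derived from the running interpreter. A minimal sketch, assuming Bundler lays gems out as <path>/<engine>/<ruby API version> (as the path above suggests); bundler_gem_dir is a hypothetical helper name, and 'target/bundle-gems' is only an example path.

```ruby
require 'rbconfig'

# Hypothetical helper: where Bundler is assumed to put gems under a custom path.
def bundler_gem_dir(bundle_path)
  File.join(bundle_path, RUBY_ENGINE, RbConfig::CONFIG['ruby_version'])
end

puts bundler_gem_dir('target/bundle-gems')
```

Run under the same JRuby that performed the bundle install, this should print the directory to copy, which keeps the build from breaking silently after a JRuby upgrade.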

At this point everything is set up for Maven to compile our project as a fully-contained JAR. Bundler has some weird quirks [2], so once again check the Ruby pom.xml in the Codeberg repo for the complete build description with workarounds. (We could also pre-compile the actual bundled gems, but I'm omitting that for clarity.)

With everything staged, we can run Maven to build the JAR and install it into our local .m2 repository. Run it through rvm so that the JRuby environment variables are set:

rvm jruby-$JRUBY_VERSION do mvn clean install

STEP 3: SCALA

Project structure

Now that we have compiled our Ruby library to a JAR, we can import that JAR into any JVM project. We'll go ahead and make a separate Scala project (I'm using Maven for continuity; you might prefer sbt or Mill):

mvn archetype:generate -DarchetypeGroupId=net.alchim31.maven -DarchetypeArtifactId=scala-archetype-simple -DgroupId=demo.hoprocker -DartifactId=TmplToption

This will create a pom.xml file with a bunch of boilerplate Scala build stuff. We now want to add our JRuby JAR in the <dependencies> section, using the group and artifact IDs from its pom.xml:

<dependencies>
    ...
    <dependency>
        <groupId>demo.hoprocker</groupId>
        <artifactId>in-tilt-pret</artifactId>
        <version>0.1-SNAPSHOT</version>
    </dependency>
    ...
</dependencies>

(It's useful to suffix your library versions with -SNAPSHOT during development because Maven repositories allow these packages to be overwritten during deployments.)

Code step 1: a container for our library

To make this really shine in Scala, we're going to encapsulate our JRuby/Java class in a Scala type that takes advantage of that language's more functional attributes. Among Scala's standard monadic types are Option and Either. We'll use Either to get either our successful output or an error that we can do something with. Here's the basic encapsulation:

// json4s imports here
import java.util
import scala.util.{Failure, Success, Try}
import test.hoprocker.InTiltPret

case class Message(format: String, data: String, context: Option[util.HashMap[String,String]]) {
  implicit lazy val formats: Formats = DefaultFormats
  // Render via the JRuby library; on failure, keep the exception and the re-serialized message
  def render: Either[(String, Throwable), String] = {
    Try {
      InTiltPret.render(format, data, context.getOrElse(new util.HashMap[String, String]()))
    } match {
      case Success(value) => Right(value)
      case Failure(exception) => Left((Serialization.write(this), exception))
    }
  }
}

object Message {
  implicit lazy val formats: Formats = DefaultFormats
  // Build a Message directly from a JSON string
  def apply(s: String): Message = parse(s).extract[Message]
}

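The apply method in the companion object lets us pass the whole JSON blob straight into the constructor; json4s extracts it into a Message. Calling render then returns the rendered output in a Right, while any exception raised during rendering (paired with the message re-serialized as JSON) is captured in a Left for us to deal with later however we choose. Neat! (Note that a failure while parsing the JSON itself is thrown by apply rather than captured in the Either.)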

Code step 2: wrapping it in a reduction

Here's a simple function that takes a sequence of JSON messages, parses and renders each one, and returns each result as an Either, with rendered output on the right and error details on the left:

def parseMessages(messages: Iterator[String]): Iterator[Either[(String, Throwable), String]] =
    messages.map(Message(_).render)  

That's it! Although this is a fairly naive implementation, we can modify or filter on this structure however we want, e.g.:

// materialize first: an Iterator can only be traversed once
val parsed = parseMessages(inputList.iterator).toList
// get successful outputs
val rendered = parsed.collect { case Right(output) => output }
// print information about failures
parsed.collect { case Left((msg, err)) => s"$msg: ${err.getMessage}" }.foreach(println)

Once again, quite naive, but not a bad start.

CONCLUSION

There we have it: we've taken some unique Ruby functionality, brought it into the JVM, and used it in Scala. The example is simple, but it shows how we can code in a polyglot manner to take advantage of different languages' strengths, and there's nothing to stop the approach from being refined and used in production (as we have been doing at Cavulus for several years).


Footnotes:

  1. Note that there may be a more streamlined way (ie a special library for this), but I haven't come across one and this is sufficient inside of the Java build paradigms.

  2. For example, CLI arguments automatically set global context (stored in the .bundle directory), and even if you use the correct flag to specify a different root directory, Bundler still uses the Gemfile in your current working directory.