Lifting (J)Ruby Into Scala's Typeland
BACKGROUND
The Java Virtual Machine (JVM) is a "bytecode" interpreter that runs on top of a physical host platform. This means it provides a generalized translation layer that allows the same "binary" to run on any supported piece of hardware (compile-once-run-anywhere). While it's not uncommon for languages to provide a foreign function interface (FFI) for native interoperability, writing for the JVM usually means you're already writing for a "native" interface, which often makes interoperability more immediate.
Just as the JVM provides an abstraction over its host hardware, it also provides an abstraction beneath many languages. The JVM offers a Lisp (ie Clojure), strongly-typed functional languages (Scala, Kotlin, Flix), and at least one dynamically typed language (JRuby). These languages have various amounts of language interoperability. I like to call this polyglot programming.
For example: say you have some domain logic written in Ruby. You realize, however, that it would be helpful to parallelize this using a modern streaming framework written in Scala or Java. Well, you can compile your JRuby code to a JAR, import it into a Java project, and use the streaming framework as normal. (You can also often invert that relationship and use the streaming framework from JRuby, but that's not the point of this post!)
I'm going to provide an example here of using some JRuby code in Scala. Although there are several examples out there of using JVM libraries from JRuby, I haven't seen much that describes going in the opposite direction.
All of the code for this project can be found on Codeberg.
OUTLINE
In this post, we will:
- write a template parsing library in (J)Ruby (including Ruby-side and Java-side tests)
- compile to a Java JAR file
- wrap this in a Scala class providing a monadic return type; and
- use some fancy Scala features to process data without side effects
STEP 1: JRUBY
Start by installing a recent version of JRuby, ie:
export JRUBY_VERSION="10.0.2.0"
rvm install jruby-$JRUBY_VERSION
rvm default jruby-$JRUBY_VERSION
We're going to use Jeremy Evans' Tilt library to process incoming templates. Tilt is itself an abstraction over several templating languages; in this article, we'll just handle ERB, Haml, and Liquid, and we will stick to context interpolation via a hash input (so no environment context data). We'll take a payload with approximately the following structure:
{
"suffix": "<format suffix, ie 'erb'>",
"data": "<JSON-ified template string>",
"context": {<JSON-ified hash of locals>}
}
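To make the payload shape concrete, here is a small sketch that uses only Ruby's standard json library. The helper names (build_payload, decode_payload) and the example template are made up for illustration; only the three keys come from the structure above. The point it shows is that data is itself a JSON-encoded string, so it is decoded separately from the outer envelope.

```ruby
require 'json'

# Build a payload in the shape described above. The template is encoded
# to a JSON string first, then embedded in the outer JSON envelope.
def build_payload(suffix, template, context)
  JSON.generate({
    'suffix'  => suffix,
    'data'    => JSON.generate(template),
    'context' => context
  })
end

# Decode the envelope, then decode the template string a second time.
def decode_payload(payload)
  message = JSON.parse(payload)
  message.merge('data' => JSON.parse(message['data']))
end

# Hypothetical example values, not taken from the project.
payload = build_payload('erb', 'Hello, <%= name %>!', { 'name' => 'world' })
puts payload
puts decode_payload(payload).inspect
```

The double encoding is why the render method further down calls JSON.parse on data before handing it to the template engine.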
Let's start by adding the appropriate packages to our Gemfile:
source 'https://rubygems.org'
gem 'tilt', '~> 2.0'
gem 'haml', '~> 6.0'
gem 'kramdown', '~> 2.0'
gem 'liquid', '~> 5.0'
group :development do
  gem 'rspec', '~> 3.0'
  gem 'rspec-parameterized'
end
...and install these using our chosen JRuby:
bundle install
Now, to the core code: for each message, we will unmarshal the context, let Tilt try and parse the template, and then attempt to render the template with the context:
require 'tilt'
require 'json'
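class InTiltPret
  def self.render(suffix, data, context)
    # Pick the template engine by suffix; the template arrives as a JSON-encoded string
    tmpl = Tilt[suffix].new { JSON.parse(data) }
    # Convert the context's string keys to symbols so they can be passed as locals
    context_symbolified = context.reduce({}) { |acc,(k,v)| acc.merge({k.to_sym => v}) }
    tmpl.render(nil, **context_symbolified)
  end
end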
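If you want to poke at the same unmarshal, symbolize, render flow without installing any gems, the following is a rough stand-in that swaps Tilt for the ERB class from Ruby's standard library. It is not the library code: render_erb_sketch is a hypothetical name, and it only understands the 'erb' suffix.

```ruby
require 'erb'
require 'json'

# Stand-in for InTiltPret.render that uses stdlib ERB instead of Tilt,
# so it only handles the 'erb' suffix. Illustration only.
def render_erb_sketch(suffix, data, context)
  raise ArgumentError, "unsupported suffix: #{suffix}" unless suffix == 'erb'

  template = JSON.parse(data)                     # data is a JSON-encoded string
  locals = context.to_h { |k, v| [k.to_sym, v] }  # string keys -> symbol keys
  ERB.new(template).result_with_hash(locals)      # render with the locals
end

# Hypothetical example values.
puts render_erb_sketch('erb', JSON.generate('Hello, <%= name %>!'), { 'name' => 'world' })
# prints "Hello, world!"
```

Tilt earns its place in the real code by doing the suffix-to-engine lookup for us, which is what lets one method cover ERB, Haml, and Liquid.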
We can run our tests using RSpec:
bundle exec rspec spec/tests.rb
The Ruby tests pass, so we know our library is valid Ruby code. That's fine, but we're trying to make this accessible in Java; for that, we need to add annotations that instruct JRuby how to expose this to Java's type system (see here for more information):
java_package 'test.hoprocker'
require 'tilt'
require 'json'
java_import 'java.util.HashMap'
class InTiltPret
  java_signature 'static String render(String, String, HashMap)'
  def self.render(suffix, data, context)
    tmpl = Tilt[suffix].new { JSON.parse(data) }
    context_symbolified = context.reduce({}) { |acc,(k,v)| acc.merge({k.to_sym => v}) }
    tmpl.render(nil, **context_symbolified)
  end
end
When compiled, these annotations will provide type and package information for the JVM. The InTiltPret.render(..) function will be accessible in any JVM language after importing the test.hoprocker.InTiltPret class.
java_signature tells the JRuby compiler to use that method signature on the JVM. JRuby is quite good about translating types, so inside the method we can keep treating the incoming Java values (such as the HashMap) like normal Ruby objects.
STEP 2: A JAR OF (J)RUBIES
We're now going to package this up as a JAR that can be used in any JVM application. We are going to pre-compile because a) this separates concerns, b) it's faster to do a JRuby-only or Java-only build, and c) it reifies our annotations into Java's type system, which gives us type checking in our IDE.
To compile a (J)Ruby project as a fully-contained Java JAR, we need to:
- locally cache all of the dependent gems
- compile our JRuby to Java using jruby-maven-plugin
- inject this into a vanilla copy of the version-appropriate jruby-complete jar
Information on this is scattered across the internet, but the basic idea is to have all of our gems (and their specifications) available inside of the jruby-complete JAR file. This will be enough to use the resulting JAR like any other Java library. We're going to expand on a process by one of the original JRuby developers outlined here, using Maven's AntRun plugin to automate the build steps [1].
A self-contained Ruby build
Our Ruby library is going to be a jar, so it gets its own pom.xml. These are the commands relevant to our build. XML is quite verbose, so I'll just show the commands themselves here, but please feel free to look at the entire pom for more context. Ant commands can take place in any subdirectory, so I'll annotate the descriptions with the execution location.
## root directory
mkdir -p ${project.build.directory}/bundle-gems ${project.build.directory}/jruby-complete
Make the directories we'll need for gem shuffling. Everything goes in ${project.build.directory} to integrate with Maven's clean lifecycle.
## root directory
cp Gemfile ${project.build.directory}/Gemfile
Because we're working out of the project build directory, the Gemfile needs to go there too.
## build directory
bundle cache --no-install --gemfile=${project.build.directory}/Gemfile
Make sure everything we need is available. This is mainly important for any locally-built gem dependencies.
## build directory
bundle config set --local path ${project.build.directory}/bundle-gems
bundle config set --local without 'test development'
Add some configuration for bundler.
## build directory
bundle install --gemfile=${project.build.directory}/Gemfile --prefer-local --no-cache
Install all of the necessary gems in our build directory. This is executed from the build directory to keep the Gemfile.lock there as well.
Now we get into the weeds a bit: we need to download a copy of the jruby-complete.jar matching our version and prep it for the build:
## root directory
curl -o ${project.build.directory}/jruby-complete.jar -s https://repo1.maven.org/maven2/org/jruby/jruby-complete/${JRUBY_VERSION}/jruby-complete-${JRUBY_VERSION}.jar
unzip -q ${project.build.directory}/jruby-complete.jar -d${project.build.directory}/jruby-complete
We then copy our dependency gems into the expanded jruby-complete, where the subdirectory matches the equivalent MRI version (ie 3.1.0 for JRuby 9.4.x, 3.4.0 for JRuby 10.x):
cp -r ${project.build.directory}/bundle-gems/jruby/3.4.0 ${project.build.directory}/jruby-complete
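Rather than remembering which version directory goes with which JRuby, you can ask the interpreter. This is a small sketch, assuming Bundler names its install subdirectory as the engine name followed by the interpreter's ruby_version setting, which is the layout the paths above follow; bundler_gem_subdir is a hypothetical helper name.

```ruby
require 'rbconfig'

# Relative directory expected under the Bundler install path:
# "<engine>/<ruby version directory>".
# Assumption: this mirrors how Bundler laid out bundle-gems above.
def bundler_gem_subdir
  File.join(RUBY_ENGINE, RbConfig::CONFIG['ruby_version'])
end

puts bundler_gem_subdir
```

Run under the JRuby 10 install from Step 1, I would expect this to print the jruby/3.4.0 path used in the cp command, but check it against your own bundle-gems directory.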
Everything is now set up for Maven to compile our project as a fully-contained JAR. Bundler has some weird quirks [2], so once again check the Ruby pom.xml in the Codeberg repo for the complete build description with workarounds. (We could also pre-compile the actual bundled gems, but I'm omitting that for clarity.)
We can now run Maven inside of our JRuby environment to build the JAR and install it to our local .m2 repository (make sure you run it in an rvm context so the appropriate environment variables are set):
rvm jruby-$JRUBY_VERSION do mvn clean install
STEP 3: SCALA
Project structure
Now that we have compiled our Ruby library to a JAR, we can import that JAR into any JVM project. We'll go ahead and make a separate Scala project (I'm using Maven for continuity; you might use sbt or Mill):
mvn archetype:generate -DarchetypeGroupId=net.alchim31.maven -DarchetypeArtifactId=scala-archetype-simple -DgroupId=demo.hoprocker -DartifactId=TmplToption
This will create a pom.xml file with a bunch of boilerplate Scala build stuff. We now want to add our JRuby JAR in the <dependencies> section, using the group and artifact IDs from its pom.xml:
<dependencies>
  ...
  <dependency>
    <groupId>demo.hoprocker</groupId>
    <artifactId>in-tilt-pret</artifactId>
    <version>0.1-SNAPSHOT</version>
  </dependency>
  ...
</dependencies>
(It's useful to suffix your library versions with -SNAPSHOT during development because Maven repositories allow these packages to be overwritten during deployments.)
Code step 1: a container for our library
To make this really shine in Scala, we're going to encapsulate our JRuby/Java class in a Scala type that takes advantage of that language's more functional attributes. Among Scala's standard monadic types are Option and Either. We'll use Either to get either our successful output or an error that we can do something with. Here's the basic encapsulation:
// json4s imports here
import java.util
import scala.util.{Failure, Success, Try}
import test.hoprocker.InTiltPret

// Field names must match the JSON keys for json4s extraction to work
case class Message(suffix: String, data: String, context: Option[util.HashMap[String, String]]) {
  implicit lazy val formats: Formats = DefaultFormats

  def render: Either[(String, Throwable), String] = {
    Try {
      InTiltPret.render(suffix, data, context.getOrElse(new util.HashMap[String, String]()))
    } match {
      case Success(value) => Right(value)
      // On failure, keep the re-serialized message alongside the exception
      case Failure(exception) => Left((Serialization.write(this), exception))
    }
  }
}

object Message {
  implicit lazy val formats: Formats = DefaultFormats

  def apply(s: String): Message = parse(s).extract[Message]
}
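The apply method in the companion object allows us to pass the whole JSON blob into the constructor; json4s will coerce it into a Message case class. Calling render then gives us the rendered template in a Right, while any rendering errors (and a reconstructed original message) will be encapsulated in a Left for us to deal with later however we choose. Neat! (Note that the JSON parsing itself happens in apply, outside of the Try, so a malformed payload will still throw.)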
Code step 2: wrapping it in a reduction
Here's a simple function to take a sequence of JSON messages, parse them, and return each result as an Either: a Right with the rendered output, or a Left with the error and the message that caused it:
def parseMessages(messages: Iterator[String]): Iterator[Either[(String, Throwable), String]] =
messages.map(Message(_).render)
That's it! Although this is a fairly naive implementation, we can modify or filter on this structure however we want, ie
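// materialize the results first: an Iterator can only be traversed once
val parsed = parseMessages(inputList.iterator).toList
// get successful outputs
parsed.collect { case Right(output) => output }
// print information about failures
parsed.collect { case Left((message, error)) => s"$message: ${error.getMessage}" }.foreach(println)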
Once again, quite naive, but not a bad start.
CONCLUSION
There we have it: we've taken some unique Ruby functionality, brought it into the JVM, and used it in Scala. It's a simple example, but it shows how we can code in a polyglot manner to take advantage of different languages' strengths, and there's nothing to say this couldn't be further refined and used on a production basis (as we have been doing at Cavulus for several years).
Footnotes:
[1] Note that there may be a more streamlined way (ie a special library for this), but I haven't come across one and this is sufficient inside of the Java build paradigms.
[2] Ie CLI arguments automatically set global context (stored in the .bundle directory), and even if you use the correct flag to specify a different root directory, it still uses the Gemfile in your current working directory.