security – Libera #java

Jackson-databind and Default Typing Vulnerabilities

Today, GitHub sent out security notices to owners of projects using old jackson-databind versions (older than 2.8.11.1 and 2.9.5). These notices pertain to this issue. I have talked about its relevance before on IRC, but since it is getting more attention now, I will describe it here again.

The Problem

The “bug” comes from using the so-called default typing. This feature allows a user to deserialize subclasses (or even Object) without specifying the full possible type hierarchy. Consider this model:

interface I {
}
@Data
class A implements I {
    private int i;
}
@Data
class B implements I {
    private boolean b;
}

Now, if we wish to serialize the interface I, you usually need to specify some sort of type info. This is typically done through an annotation on the interface:

@JsonTypeInfo(use = JsonTypeInfo.Id.NAME)
@JsonSubTypes({@JsonSubTypes.Type(value = A.class, name = "A"), @JsonSubTypes.Type(value = B.class, name = "B")})
interface I {
}

If we now serialize an object of type I, we get a result like this:

new ObjectMapper().writerFor(I.class).writeValueAsString(new A())
{"@type":"A","i":0}

Deserializing this JSON works as expected.

Default typing

This annotation-based registration is fine for small use cases, but can get cumbersome if the types are in different modules, or there are just a lot of them, or it would be bad style to reference them from the parent class. Normal Java serialization does not have this problem (it just carries the actual dynamic class name with it), so this could be a barrier for adoption of Jackson for previous Java serialization users.

The problem here is that we want people to migrate from Java serialization. There are lots of reasons, most of them compelling.

Enter default typing. With default typing, we don’t need the type info annotations at all:

new ObjectMapper().enableDefaultTyping().writerFor(I.class).writeValueAsString(new A())
["net.hawo.tv.tvui17.A",{"i":0}]

(Ignore the fact that this is now suddenly an array – this is one of the ways jackson may include type info)
The idea is simple: Include the full class name in the json, and you can simply get the proper class at runtime! Sounds good, right?

The Vulnerability

Well, turns out this is not that good of an idea. Java serialization does a very similar thing (though it still requires the named class to be serializable, which jackson doesn’t) and this has lead to what feels like a third of all serious security vulnerabilities in Java applications, ever. The problem lies with the fact that Java classpaths are often massive, and allowing any class on that classpath to be deserialized at will can be disastrous since it exposes a huge attack surface. If you can get any class on the classpath to execute code when deserialized with jackson, you have successfully achieved remote code execution.
This is exactly what happened with Jackson. Some classes that were common on user classpaths could be deserialized to execute arbitrary code. The fix for this issue is basically a blacklist of a few of these classes that could be exploited. Blacklists are not a solution, though, and since this first list, the list has been amended several times. The maintainers are playing whack-a-mole here, and in my opinion it is a waste of everyone’s time to be adding all exploitable classes to this list.

The Solution

Our experience with this same issue in the Java serialization world tells us not to deserialize untrusted data. Luckily, Jackson is much more secure than Java serialization – if you don’t use default typing. The only acceptable solution to this issue in the long run is: do not use default typing to deserialize untrusted data. Default typing is rarely necessary or even a good idea.
Unfortunately, online resources saying this are sparse. Default typing is an “easy” solution, and many people simply do not have the security awareness to see the issue with it – they will stumble over stackoverflow answers such as this one and simply enable default typing to easily serialize Object fields. The documentation of Jackson also doesn’t highlight this as much as it should.
Two alternatives to default typing exist in Jackson:

Normal, annotation-based typing as shown above. This allows you either to use the full name as with default typing, or even specify your own name for greater compatibility (you can later change the class name without affecting the serialized representation). This is the “standard” solution, and the appropriate one for most use cases.
Should you not know the possible subtypes of a class you wish to deserialize in advance, you can use the rich registerSubtypes API to dynamically add the types you desire. These types could be detected through an existing module system you are already using, or using something like SPI.

All in all, I am a bit dissatisfied with the attention this issue has gotten. The issue is a security vulnerability by design, and anyone using default typing should have been aware of it. Luckily, default typing is not on by default. I do not have statistics but I would be surprised if many people used it or knew of its existence in the first place, and so I find the attention GitHub has given this a bit over the top – the biggest thing these notifications will spread is uncertainty about Jackson, so this article was an attempt at clearing up what it’s actually all about.

JavaChannel’s Interesting Links podcast, episode 14

Welcome to the fourteenth ##java podcast. Your hosts, as usual, are Joseph Ottinger, dreamreal on the IRC channel, and Andrew Lombardi from Mystic Coders (kinabalu on the channel), and it’s Wednesday, February 7, 2018.
As always, this podcast is basically interesting content pulled from various sources, and funneled through the ##java IRC channel on freenode. You can find the show notes at the channel’s website, at javachannel.org; you can find all of the podcasts using the tag (or even “category”) “podcast”, and each podcast is tagged with its own identifier, too, so you can find this one by searching for the tag “podcast-14”.

javan-warty-pig is a fuzzer for Java. Basically, a fuzzer generates lots of potential inputs for a test; for example, if you were going to write a test to parse a number, well, you’d generate inputs like empty text, or “this is a test” or various numbers, and you’d expect that your tests would validate errors or demonstrate compliance to number conversion (this is a way of saying “it would parse the numbers.”) A Fuzzer generates this sort of thing largely randomly, and is a good way of really stressing the inputs for the methods; the fuzzer has no regard for boundary conditions, so it’s usually a good way of making sure you’ve covered cases. The question, therefore, becomes: would you use a fuzzer, or HAVE you used a fuzzer? Do you even see the applicability of such a tool? There’s no doubt that it can be useful, but potential doesn’t mean that the potential will be leveraged.
Simon Levermann, mentioned last week for having released pwhash, wrote up an article for the channel blog detailing its use and reason for existing. Thank you, Simon!
Scala is in a complex fight to overthrow Java, from DZone. Is the author willing to share the drugs they’re on? Scala’s been getting a ton of public notice lately – it’s like the Scala advocates finally figured out that everything Scala brought to the table, Kotlin does better, and with far less toxicity. If kotlin wanted to take aim at Scala, there’d be no contest – Kotlin would win immediately, unless “used in Spark and Kafka” were among the criteria for deciding a winner. It’s a fair criterion, though, honestly; Spark and Kafka are in fairly wide deployment. But Scala is incidental for them, and chances are that their developers would really rather have used something a lot more kind to them, like Kotlin, rather than Scala.
More of the joys of having a super-rapid release cycle in Java: according to a post on the openjdk mailing list, bugs marked as critical are basically being ignored because the java 9 project is being shuttered. It’s apparently on to Java 10. This is going to take some getting used to. It’s good to have the new features, I guess, after wondering for years if Java would get things like lambdas, multiline strings, and so forth, but the rapid abandonment of releases before we even have a chance to see widespread adoption of the runtimes is… strange.
James Ward posted Open Sourcing “Get You a License”, about a tool that allows you to pull up licenses for an entire github organization – and issue pull requests that automatically add a license for the various projects that need one. Brilliant idea. Laziness is the brother of invention, that’s what Uncle Grandpa always said.

Modular password hashing with pwhash

When building web applications, we usually also have to store user authentication data. When doing so, there’s generally two choices: Either we use an external authentication provider like OAuth, or we store passwords for our users in our database. This blog post focuses on how to do the latter correctly.

Password hashing basics

Before we dive into the specifics of the pwhash library, we briefly discuss the basics of what we need to be wary of when storing passwords. The first, and most important point is that we do not actually ever store our users’ passwords. Instead, we run them through a cryptographic hash function. This way, in case our database ever gets compromised, the attacker doesn’t immediately know all our users’ passwords. But this is not considered enough any more. Because if we just put our users’ passwords through a simple hash function, like SHA-3, two users that pick the same password will end up with the same hash value. Knowing this is of course useful to an attacker, because now they can attack multiple passwords at the same time, if they get the user data.

Editor’s note: so choose passwords likely to be unique… and unguessable even to people who know you. Please.

Enter salts

To mitigate this problem, we use so-called salts. For each password, we generate a random salt, and prepend (or append) it to the password, before passing it through the hash function. This leaves us with
sha3(firstsalt:password) = 0bee3940e2d74f5155e73a9e90ea75b5d06407db85527f62acefa97af2d59f69589b276630e20a1ff8b0b781a372ae17db88b9f782acf7ed0022ab4c2fc766df
sha3(secondsalt:password) = edaa1e8b31e2d2943d689928a5ba1c503bb3220ddc164d6feff9e178dd18a4727b1905400252ef071d28b370c0b9727420cd0e2011109ca3e8934a4082d2bf9e
As we can see, this yields two completely different hashes even though two users used the same (admittedly very bad) password. The salt can be stored alongside the hash in the database. An attacker now has to break each password individually, instead of attacking them all at the same time. Great, right? Surely, now we’re done and can get to the actual library? Not quite yet. The next problem we have is that modern graphics cards are really fast at calculating these kind of hashes. For example, an Nvidia GTX 1080 calculates around 800 million SHA-3 hashes per second.

Dedicated password hashing functions

To get rid of this problem, we do something we usually don’t want to do in computing: We make things intentionally slow. There are several functions that can be used for this. An older idea is simply applying many rounds of the same hash function over and over again. An example of this approach is the PBKDF2 (Password-Based-Key-Derivation-Function). But since GPUs are really fast these days, we also want functions that are harder to compute on GPUs. Functions that use a lot of memory, and a lot of branches are generally very hard to compute on graphics cards. An example of this is argon2, a function which was specifically designed for hashing passwords, and won the Password Hashing Competition in 2015. The question, then, is how do we create a re-usable approach that allows us to stay current with always-current password hashing requirements?

The pwhash library

While there are already Java implementations and bindings for argon2 and other password hashing functions such as bcrypt, what is still missing in the Java ecosystem is a library that unifies them under a single interface. Functions like argon2 come with different versions, and a lot of adjustable parameters. So if we hardcode those parameters into our application in 2018, the parameters are probably going to be outdated (and inefficient or insecure) in five or ten years. So ideally, we want our library to handle this for us. We change our parameters in one place, and the library automatically takes care of upgrading both new and existing password hashes every time a user logs in or signs up. In addition to this, it would also be convenient to not just switch between parameters of a single algorithm, but also to switch the algorithm to something newer and better altogether.
That’s the goal of the pwhash library.

The HashStrategy interface

To offer all this, the core functionality of pwhash is the HashStrategy interface. It offers 3 methods:

String hash(String password)
boolean verify(String password, String hash)
boolean needsRehash(String password, String hash)

This interface is largely inspired by the PHP APIs for password hashing, which offers the functions password_hash, password_verify and password_needs_rehash. In the author’s opinion, this is one of the better things in the PHP core library, and hence I decided to port the functionality to Java in a bit more Object-Oriented style.
The first two methods are relatively straightforward: String hash(myPassword) produces a hash, alongside with all its parameters, which can later be passed to boolean verify(myPassword, storedHash) in order to verify if the password actually matches the given hash.
The third method is the magic that lets us upgrade our hashes without needing to write new code every time. In the first step, it checks whether the given password actually verifies against the hash, in order to make sure that we don’t accidentally overwrite hashes when the user supplied an incorrect password. In the next step, the parameters used for the current HashStrategy (which are generally supplied by the constructor, or some factory method) are compared with the ones stored alongside the password hash. If they match, false is returned, and no further action needs to be taken. If they do not match, the method returns true instead. Now, we call hash(myPassword) again, and store the newly generated hash in the database.
This way, we only have to write code once for our login, and every time we change our parameters, they are automatically updated every time users log in. However, for a single implementation, like the Argon2Strategy, this only handles changes in parameters. If somehow a weakness is discovered, we still do not have a good way to migrate away from Argon2 to a different algorithm.

The MigrationStrategy class

Migration between two different algorithms is handled by the MigrationStrategy class, which implements the above interface, and in its constructor, accepts two other strategies: One to migrate from, and another to migrate to. Its implementation for String hash(String) simply calls the hash functions of the latter strategy, since we want all new hashes to be performed with the new algorithm.
Its implementation of boolean verify(String, String) first attempts to verify against the old strategy, and if that fails, against the new strategy. If neither succeed, the password was incorrect. If either succeeds, the password was correct, and true can be returned.
Again, the boolean needsRehash(String, String) method is a little bit more complicated. It first attempts to verify the given password against the old strategy. If this succeeds, the password is obviously in the old format, and needs to be rehashed, and we immediately return true. In the second step, we attempt to verify it with the new strategy. If this is also unsuccessful, we return false as we don’t want to rehash if the user supplied an incorrect password. If the verification succeeds, we return whatever newStrategy.needsRehash(String, String) returns.
This way, we can adjust parameters on our new strategy while some passwords are still hashed with the old strategy, and we receive the expected results.
Note that it is not necessary to use this class when migrating parameters inside one implementation. It is only used when switching between different classes.

Chaining MigrationStrategy

Fun fact about MigrationStrategy: Since it is also an implementation of the HashStrategy interface, you can even nest it in another MigrationStrategy like so:

MigrationStrategy one = new MigrationStrategy(veryOldStrategy, somewhatBetterStrategy);
MigrationStrategy two = new MigrationStrategy(one, reallyGoodStrategy);

This might not seem useful at first, but it is useful if you have a lot of users. You may want to switch to yet another strategy at some point, while not all your users are migrated to the intermediate strategy yet. Chaining the migrations like this allows you to successfully verify against all three strategies, still allowing logins for even your oldest users who haven’t logged in in a while, and still migrating all passwords to the newest possible algorithm.

Examples

Examples for the usage of the pwhash library are hosted in the repository on GitHub.
Currently, there are two examples. One uses only the Argon2Strategy and upgrades parameters within that implementation. The second example uses the MigrationStrategy to show code that migrates users from a legacy Pbkdf2Strategy to a more modern Argon2Strategy.
The best thing about these examples: The code that actually authenticates users is exactly the same in both examples. The only thing that is different is the HashStrategy that is plugged in. This is the main selling point of the pwhash library: You write your authentication code once, and when you want to upgrade, all you have to do is plug in a new strategy.

Where to get it

pwhash is available on Maven Central. For new applications, it is recommended to only use the -core artifact. Support for PBKDF2 is mainly targeted at legacy code bases wanting to migrate away from PBKDF2. JavaDoc is available online on the project homepage. JAR archives are also available on GitHub, but it is strongly recommended you use maven instead.

Interesting links podcast, episode 4

Welcome to the fourth ##java podcast. I’m Joseph Ottinger, dreamreal on the IRC channel, and it’s Monday, 2017 October 16.
This podcast covers news and interesting things from the ##java IRC channel; if you see something interesting that’s related to Java, feel free to submit it to the channel bot, with ~submit and a URL to the interesting thing, or you can also write an article for the channel blog as well; I’m pretty sure that if it’s interesting enough to write about and post on the channel blog, it’s interesting enough to include in the podcast.

Worth noting, not because it’s Java-related but because we’re all on the same Internet: there’s a security vulnerability with WPA2, the wireless encryption used by, well, pretty much everyone. Check your routers for security patches; if they’re not available, they should be soon, and if they’re not available soon, consider getting a good router.
Effective Java is one of the recommended books from the channel regulars; it covers a lot of things that affect efficiently written Java. However, Josh Bloch is working on an update for the third edition of Effective Java. It’s available for pre-order. Highly recommended; Josh Bloch is one of the people who really knows Java, to the point where he says he can write code in Java such that he can influence how the JIT works, to make it more efficient than code mere mortals like you or I would write. So when he has a book on writing effective Java, it’s probably pretty authoritative.
Facebook apparently uses their own build system, called “Buck.” It’s supposedly really fast; it apparently supports a lot of languages, which is a good thing; it does not, however, use the same build structure for source that Maven and Gradle use. That’s sort of okay; the Maven convention (which is what Gradle uses) is idiomatic in Java only because Maven itself became idiomatic, but it’s still something to consider if you’re moving to something different. My thought is that Buck might be cool but in a java-centric project, it’s probably not of sufficient interest to really move the needle. I looked; I considered; I moved on, seeing nothing really compelling in the description or tutorial that made me think “Wow!” like I did with, well, both Maven and Gradle, both of which I use regularly.
Excelsior JET – who makes an ahead-of-time compiler for the JVM, so you can deploy your Java applications as native binaries – has an interesting post called “The Folder of God.” No, it’s not a religious post, although religious fervor might be involved if you hate Windows enough. Basically, there’s a way to create a folder in Windows such that Java programs running from that folder will crash, every time. (I don’t know why you’d actually do that in practice.) It’s an actual Java bug, not an Excelsior bug – but Excelsior experiences the bug nonetheless. It’s apparently been addressed in the Java sources, but your JVM might not be updated with the fix. It’s fascinating reading, even if only to make you glad you’re not using Windows.
A report by realm.io suggests that “Java (on Android) is dying. There aren’t simply more Kotlin builders: they’re also switching their apps to Kotlin. In fact, 20% of apps built with Java before Google I/O are now being built in Kotlin. Kotlin may even change how Java is used on the server, too.” As a Kotlin user myself, I can say that the transition to Kotlin in dreamreal-land is progressing rather nicely… but what’s more relevant is that Kotlin on Android is increasing momentum, and that may very well drive server-side development as well, as there’s a strong tendency to be homogenous even if interoperability between languages like Kotlin and Java is quite strong. The channel recently had a discussion about Java’s checked exceptions, a feature Kotlin doesn’t share (because nobody really likes checked exceptions and the JVM itself doesn’t have them – they’re a javac thing, not a JVM thing); checked exceptions are actually a good thing in that they force you to think about your exception handling, but there’s no guarantee you’re going to actually handle your exceptions well, so they end up being an unnecessary burden in many peoples’ minds. Worth thinking about, in any event…
Something that comes up fairly often in the channel is the use of the Oracle JDK vs. OpenJDK, and what the differences are between them. I always said it was in a set of closed-source libraries used in Oracle JDK, such that some features might be present in Oracle’s JVM that OpenJDK did not have. Well, while that was true at one point, it’s like all Internet knowledge: it erodes. The Adopt OpenJDK project has a page on Differences between Adopt OpenJDK binaries and Oracle JDK Binaries that actually walks through the differences, which is a really short list: font rasterizers, color management, and graphics renderers. That doesn’t compare the differences with other implementations of the JVM – Zulu and whatnot – but what it does say is that OpenJDK and Oracle’s JDK are really closely aligned right now, just as designed.
The Jooq blog has an article called Benchmarking JDK String.replace() vs Apache Commons StringUtils.replace(). It walks through an optimization process and measures the effectiveness, offering a ton of apologies for what might appear to be premature optimization along the way; the upshot is that Java 9’s String.replace() works better than it used to, which might affect which implementation to use. (It turns out that Java 9’s version is slower for matches in long strings but faster for matches in short strings, which – in practice – are probably more common.) They ended up staying with Apache’s implementation for now if only because most people are still on Java 8 and thus the performance improvements are worthwhile. It’s a fascinating read.
Wrapping up, we have an article from Baeldung called “Introduction to Caffeine“. Caffeine is a caching library; the article walks through its use, as one might expect. All that’s fine. What the article does not do, though, is differentiate why one might use Caffeine as opposed to one of the other caching libraries out there, like EHCache, or Guava Cache – which inspired Caffeine, actually. Channel inhabitant dudeji – which I don’t know how to pronounce – points out that Caffeine has time-to-live (as most of them do) but also automatic elimination based on unused keys; I can see some use in that, although I’d be concerned that the cache was deciding that a key was unused more aggressively than I’d have liked. I’m sure it’s tunable, though.

Interesting Links, 2017-Feb-23

Maven Polyglot: replacing pom.xml with Clojure, Scala, or Groovy Script shows how you can, well, replace the XML configuration for a Maven project with a configuration in another language. This also provides some imperative-ish features to Maven projects, plus no XML except for a short description of the scripting language.
How does a relational database work provides a (very) high level description of how a relational database does its work – and if you read carefully you can see some of the motivations behind some of the less- or non-relational database decisions out there, too.
From user DerDingens: An Examination of Ineffective Certificate Pinning Implementations, which points out common mistakes while using certificates in Java.
Maldivia pointed out a Java Enhancement Process draft, for Epsilon GC: The Arbitrarily Low Overhead Garbage (Non-)Collector – basically a garbage collector for Java that does absolutely nothing, for the purposes of tuning, testing, and in some cases, performance enhancements (if you know your job is short-lived, why bother running a GC if you don’t need to?) Fascinating stuff.
From cheeser: Google, IBM back new open source graph database project, JanusGraph. JanusGraph is a fork of Titan, but its list of backers make it worth watching. Have you used a graph database before? How? What did you think of it?
Two from DZone on git:
- Lesser Known Git Commands has some handy shortcuts for git users (and chances are good that if you’re reading this, you’re a git user.)
- Git Submodules: Core Concept, Workflows, And Tips. From the article: “Including submodules as part of your Git development allows you to include other projects in your codebase, keeping their history separate but synchronized with yours.” Great stuff.

Interesting Links – 5 Oct 2016

In Accounts is Everything Meteor Does Right, Pete Corey points out that Meteor‘s Accounts package dictates how user authentication and authorization will work, period, and that this removal of choice actually makes it entirely usable. Having wrestled with Shiro and Crowd and Spring Security and JAAS, it’s a point that’s hard to argue with – and while users of Shiro, etc., will probably say that those packages work as well as Accounts does, I’d have to humbly disagree, even while acknowledging how nice it is to have things like Shiro and Crowd. It’d be nice to have something that was just as drop-in for Java was Accounts is for Meteor. (Meteor is, BTW, a reactive application framework for NodeJS.)
User jdlee posted Maslow’s hierarchy of dev needs from Twitter – One gets the impression that the developer’s needs weren’t being met when designing that graph.
Also from jdlee, who was apparently crawling Twitter: “In Ruby, everything is an object. In Clojure, everything is a list. In Javascript, everything is a terrible mistake.”
User adimit gave high praise to Growing Object-Oriented Software Guided by Tests, saying that “it’s really simple but helping me write a test-driven app right now” and (coming from a Haskell background) “I’ve always hated OOP, but now I see how to do it properly, and it can be fun.” Anything that helps code quality improve is awesome, if you ask me.
From DZone: The Rise and Fall of Scala describes the apparent slow demise of Scala. The author backs it up, and it’s not a bad article; Scala’s awesome (I love it) but the criticisms are quite valid. Two areas where the author sees Scala remaining strong: Big Data and custom DSLs, definitely two Scala strengths.

Link: Jenkins over HTTPS with JNLP slaves

A user in ##java recently ran into a problem where he needed to connect Jenkins slaves to a master, using SSL – and it didn’t work, thanks to security. He helpfully wrote up how using correct policy files got it working.
Jenkins is a continuous integration server – basically an application that runs your builds when changes occur in your source tree. Slaves allow the build to take place on different hardware configurations, as well as providing horizontal scalability to your builds.
Good stuff, and thanks!