November 2017 – Libera #java

How to use Scanner

The java.util.Scanner class built into java can be used to read inputs. It can for example be used to set up an interactive command line session, where you prompt the user for stuff and your program acts on what the user types in. You can also point it at a file or network connection if you like.

TL;DR: Do not use nextLine()

Phew, glad we got that out there. If you don’t want to read the rest, just never call that method. Instead, to read a full line of text, you call:

scanner.useDelimiter("\r?\n");
String entireLine = scanner.next();

Creating scanners

To create a scanner, just call its constructor, and pass in the data you want to read from. You have lots of options:

import java.util.Scanner;
import java.io.InputStream;
import java.nio.file.Paths;
import java.nio.charset.StandardCharsets;
Scanner in = new Scanner(System.in);
Scanner in = new Scanner(Paths.get("/path/to/file"), StandardCharsets.UTF_8);
Scanner in = new Scanner("input to read");
try (InputStream raw = MyClass.class.getResource("listOfStates.txt")) {
  Scanner in = new Scanner(raw, StandardCharsets.UTF_8);
}

The above code shows 4 different things you can read with scanner:

“Standard input” – what the user types into the command line.
Files. You can also pass a java.io.File object if you still use the old file API.
A string containing data you want to read. The string is read directly; you can’t pass a file path (see the second example for reading files).
Input streams, byte channels, and Readables.

Note that in all cases where you’re dealing with byte input, you can opt not to explicitly pass along a character encoding, but don’t do this! – this means you get the platform default and you have no idea what that’s going to be. That is a good way to write code that works on your computer but fails elsewhere. When in doubt, pass in StandardCharsets.UTF_8 just like in the examples above.

Reading data from a scanner

To read data from a scanner, first decide what you want to read. Then, call the right next method:

A whole number

Call nextInt(), nextLong(), or nextBigInteger(). If you don’t know what these are, call nextInt().

A number in hexadecimal

Call nextInt(0x10), or nextLong(0x10).

A number with fractional elements (i.e., numbers after a decimal point)

Call nextDouble(). However, be aware that how people type floating point numbers depends on where they live. You may want to call useLocale() first to explicitly set the style.

A boolean value

Call nextBoolean()

A single word

Call next()

A whole sentence

Ah, well, that’s tricky. Do not call nextLine(), that doesn’t do what you might think based on its name. Instead, read the next section.

Tokens?

The way Scanner works is by first reading a so-called ‘token’, and then converting this into the representation you asked for. So, if you call nextInt(), a single token is read and this token is then parsed as an integer. If it isn’t an integer, for example because the user typed hello instead of 85, an InputMismatchException is thrown. Catch this exception if you wish to respond to faulty inputs (I advise not calling hasNextInt() and friends either; but that’s for another article).
By default, ‘read the next token’ is done by reading until end-of-input, or any whitespace (tabs, spaces, enters, anything goes). This means that ‘read the next entire line’ just isn’t a thing scanner can do: That would involve reading multiple tokens.
The trick is to turn ‘the entire line’ into a single token. To do that, just tell Scanner to delimit on something else! The delimiter is what separates tokens. So, to read an entire line, first tell scanner that tokens are separated only by newlines, then just read the next token:

scanner.useDelimiter("\r?\n");
String entireLine = scanner.next();

This sets the delimiter to be the newline character, optionally preceded by the carriage return character (windows formatted text files tend to put that in front of their newlines, other OSes don’t).
If you want to return to whitespace-separated tokens, you just call scanner.reset(), though, note, that also resets useLocale.

So what does nextLine do?

nextLine() is unlike all the other next methods. It doesn’t read tokens. Instead, it just keeps reading characters up to the next newline character and returns what it read. It completely ignores the delimiter. This is problematic, because the next method family reads up to the delimiter but doesn’t ‘consume’ it. Thus, if you write this code:

Scanner scanner = new Scanner(System.in);
System.out.print("Enter your age: ");
int age = scanner.nextInt();
System.out.println("Enter your name: ");
String name = scanner.nextLine();

It won’t work: In the user types 5 and then hits enter, well, the 5 goes to nextInt() and that lone enter is all that nextLine is going to read. But if the user types 5 Jane Doe then you’ll end up with age = 5 and name = "Jane Doe". There’s no real way to try to fix this (you can try to call nextLine() twice but that breaks it for those who type spaces instead of enter); just don’t use nextLine, and change the delimiter instead.

Don't use raw types.

… just.. yeah. Don’t use raw types. This message brought to you by the color ‘4’ and the number ‘dictionary.’

Javachannel’s interesting links podcast, episode 7

Welcome to the seventh ##java podcast. I’m Joseph Ottinger, dreamreal on the IRC channel, and it’s Monday, 2017 November 6. Today feels slightly less anonymous than yesterday.
This week we have a co-host, Andrew Lombardi – kinabalu on ##java – and we also offer our humblest apologies to Ms. Debbie Gibson.
This podcast covers news and interesting things from the ##java IRC channel on Freenode; if you see something interesting that’s related to Java, feel free to submit it to the channel bot, with ~submit and a URL to the interesting thing, or you can also write an article for the channel blog as well; I’m pretty sure that if it’s interesting enough to write about and post on the channel blog, it’s interesting enough to include in the podcast.

Increment Development posted “Center stage: Best practices for staging environments,” an article by Alice Goldfuss that defends and describes the use of the staging environment. “Staging is where you gain confidence in your systems by consensus,” she writes – pointing out that development and testing are for testing known things (“when I do this, does that happen?”), and staging is for testing those things that you think might happen in production but can’t necessarily anticipate as part of development or explicit testing. The author points out that there’s an ongoing debate about this, with some well-known people saying “just test better!” but I’m on Alice’ side personally – staging is where you validate that all that testing didn’t let something get through before deployment to production.
Chase Roberts has written “How to unit test machine learning code.” It’s an interesting article – in that it focuses on expected results for a long pipeline of operations for stuff that’s really hard to test well. Machine learning libraries tend to be black-box tested – throw an input at it, pray a bit, hope you get the expected output, suffer for a while if you don’t – and he’s trying to show a way to avoid this cycle. Short summary: testing is hard. Long summary: know what your algorithms are doing, and test every step along the way.
One of the changes for Java 9’s release was unlimited strength cryptography. Well, all of you laggards on older JVMs might be getting it as well, assuming you update and/or patch – which might be questionable, depending on how far back in the revision cycle you are. If you’re still running Java 6, chances are you don’t update, ever, and this might be a scary process for you because it’s so rare. Do I sound like I’m filled with scorn? I don’t mean to be – pity, maybe, and confusion, but not scorn. (Seriously, folks: update to 8. Or 9. Something moderately current. The pain is coming; putting it off will only make it hurt worse when you run out of time.)
The first of multiple DZone articles for this edition of the podcast: “Switching Java Versions on MacOS” shows you how to use the java_home command on OSX to switch between your multiple JVM installations on OSX easily. This is apparently not a perfect process according to some on ##java, but it’s always worked for me when I’ve tried it – but that’s a very small sample set, so try it yourself and see. (The context of the failure was apparently Apache Ant, and my “success” was really just kicking the tires of Java 9. I’m not saying that the failure is incorrect or user error, by any means.)
Another DZone article: “The JSON-P API: A JSON Processing Primer” shows you a high level overview of JSON-P, with both an object model and a streaming model. The streaming model is more interesting; the object model is a lot like org.json, which… no. Just no. In the end, though, Jackson is probably still your best bet for JSON processing in Java.
Kafka has gone to version 1.0, according to Apache. Kafka is a distributed streaming platform – one way of thinking of it is that it’s a distributed event log, where you can write events that are processed at very high volume by various clients. Every client can have its own offset into the event log, so there’s a lot of flexibility in how you use it. Like most such types of data stores, it’s not a magic bullet for … anything, really, but leveraged properly it really can provide amazing throughput. Administration is great fun; I think I’d rather chew off my own neck than manage a Kafka cluster, but … again, if you need the features, it’s a great product.
Yet another DZone article: “https://dzone.com/articles/an-introduction-to-http2-support-in-java-9” shows us the new HTTP/2 client that’s being incubated in Java 9 – which means it probably won’t be fully realized until the next release of Java (which is itself the subject of another news item.) There are already HTTP/2 libraries for Java: Jetty, Netty, vert.x, OkHttp, and Firefly (among others, probably) – but this one will be part of the Java runtime itself. It looks pretty similar to some of the others already mentioned, but that’s not a bad thing; idiom is good and there probably are only so many ways you can think of building a request and issuing it.
Our third entry from DZone this week: “Machine Learning Algorithms: Which One to Choose for Your Problem” Tries to provide an overview of some of the core factors involved in choosing a machine learning algorithm: supervised vs. unsupervised (or semi-supervised, or reinforced) models, along with some of the models themselves and their applications. There are some examples of problems and math, but it’s got no code whatsoever (and if it did, would probably use Python) – still, it does a good job of going over some of the models and capabilities.
Oh, this seems relevant: Java 10 is coming! … maybe. In an email to the OpenJDK mailing list, Mark Reinhold has revised the version string for Java again – so we might actually get Java 10 instead of Java 18.3 for the next release. Versions are hard to get right – but I think going to a major release version scheme like this (or, rather, staying with a major version scheme) is a good idea, even if the release frequency is boosted. Of course, I also think a major version every year or two is a good thing, so now I’ll probably be complaining about the frequency, but … first world problems, I guess.
Speaking of Java 10, early access builds are available. I don’t know if they have variable inference – “var“, in other words – because I haven’t even truly migrated to Java 9 yet, so I’m far from being ready to test an early access build of 10! But if that’s your stimulant of choice, the builds are there for OSX, Linux, Windows, and even Solaris on SPARC, since even that guy Mike needs to play with Java 10 every now and then.
The next two entries are from DZone, too; they’re on fire. The first one is “Null Safety: Calling Java From Kotlin,” which shows the use of annotations in Java code such that Kotlin doesn’t have to pretend the Java method call can return null. It still can, of course, but in Kotlin, nullable types look and act differently than non-nullable types, so this annotation (@Nonnull(when = When.ALWAYS), if you’re interested) is really a way to suggest to Kotlin that the result is never expected to be null. Really pretty neat stuff.
Lastly, DZone came through with an article about the human mind, applied to programming: “Transcending the Limitations of the Human Mind.” It’s about cognitive capacity, a subject I’ve written about myself in the past; when I wrote about it, I referred to it as “chunking” (we manage only so many chunks of information at a time) and here, it’s the same concept with different terminology. The author – Robert Brautigam – walks through some of the tricks we developers use to manage incredibly detailed deployments with limited cognitive capacity through decomposition, generalization, abstract concept management; he also discusses ways in which our development processes work against our cognitive capacity (where our processes make a given mechanism more expensive cognitively than it otherwise should be, perhaps.) Good article.