Taco Bell Javap
Recently, I wanted to better understand a Java program’s execution. I wanted to know every usage of a particular class. For example, let’s look for calls to the static method BigInteger.valueOf. I have a funny constraint though: I don’t own the source code. My solution involves looking at Java bytecode.
All About That Bytecode
We write Java programs in syntax that’s supposed to be readable(0). Here is Fibonacci(1) with BigIntegers.
(0) Some developers find joy in taking that human readable definition to its lower limit.
(1) My implementation starts to make the word “next” look a little funny. Supposedly this is called “semantic satiation”.
import java.util.Iterator;
import java.math.BigInteger;
class MyFibIterator implements Iterator<BigInteger> {
    private BigInteger next = BigInteger.valueOf(0);
    private BigInteger nextNext = BigInteger.valueOf(1);
    @Override
    public boolean hasNext() { return true; }
    @Override
    public BigInteger next() {
        BigInteger toReturn = next;
        BigInteger newNextNext = next.add(nextNext);
        next = nextNext;
        nextNext = newNextNext;
        return toReturn;
    }
}
My computer can’t run this text outright. Instead, a program that can run on my computer runs Java programs. That program is the JVM. Except … even the JVM doesn’t want to look at my silly code. Instead it wants an intermediate form called Java bytecode. We transform Java text into bytecode with the javac “compiler”. Here’s the problem: that bytecode is opaque to plaintext tools.
% cat MyFibIterator.class
����C(
java/lang/Object<init>()V
java/math/BigIntegervalueOf(J)Ljava/math/BigInteger;
MyFibIteratornextLjava/math/BigInteger;
nextNext
add.(Ljava/math/BigInteger;)Ljava/math/BigInteger;
()Ljava/math/BigInteger;java/util/IteratorCodeLineNumberTablehasNext()Z()Ljava/lang/Object; Signature>Ljava/lang/Object;Ljava/util/Iterator<Ljava/math/BigInteger;>;
*ourceFileMyFibIterator.java 5*�* ��
���
*,�+� �� *�
A#*�� $%&'%
These bytes follow the structure of the class file format.
If I dump the bytes to hexadecimal digits with xxd -p, I get this.
cafebabe0000004300280a000200030700040c000500060100106a617661
2f6c616e672f4f626a6563740100063c696e69743e0100032829560a0008
000907000a0c000b000c0100146a6176612f6d6174682f426967496e7465
67657201000776616c75654f66010019284a294c6a6176612f6d6174682f
426967496e74656765723b09000e000f0700100c0011001201000d4d7946
69624974657261746f720100046e6578740100164c6a6176612f6d617468
2f426967496e74656765723b09000e00140c001500120100086e6578744e
6578740a000800170c0018001901000361646401002e284c6a6176612f6d
6174682f426967496e74656765723b294c6a6176612f6d6174682f426967
496e74656765723b0a000e001b0c0011001c01001828294c6a6176612f6d
6174682f426967496e74656765723b07001e0100126a6176612f7574696c
2f4974657261746f72010004436f646501000f4c696e654e756d62657254
61626c650100076861734e65787401000328295a01001428294c6a617661
2f6c616e672f4f626a6563743b0100095369676e617475726501003e4c6a
6176612f6c616e672f4f626a6563743b4c6a6176612f7574696c2f497465
7261746f723c4c6a6176612f6d6174682f426967496e74656765723b3e3b
01000a536f7572636546696c650100124d794669624974657261746f722e
6a6176610020000e00020001001d00020002001100120000000200150012
000000040000000500060001001f0000003500030001000000152ab70001
2a09b80007b5000d2a0ab80007b50013b10000000100200000000e000300
00000400040005000c00060001002100220001001f0000001a0001000100
00000204ac0000000100200000000600010000000800010011001c000100
1f0000004800020003000000202ab4000d4c2ab4000d2ab40013b600164d
2a2ab40013b5000d2a2cb500132bb0000000010020000000160005000000
0b0005000c0011000d0019000e001e000f1041001100230001001f000000
1d00010001000000052ab6001ab000000001002000000006000100000004
000200240000000200250026000000020027
Through magic methods I highlighted where we call the static method BigInteger.valueOf(2). Most of my readers won’t have my magic ball though. We need a better way to read this file.
(2) The opcode for invokestatic is b8. We call that static method twice. There are only two b8’s in our hexdump! That made the interpretation super easy.
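The start of that hexdump can be decoded by hand. Here is a minimal sketch of doing so in Java, assuming only the fixed header layout of the class file format (magic number, minor version, major version, constant pool count). The class name ClassHeader and its SAMPLE bytes are illustrative and not part of the original investigation; SAMPLE is just the first ten bytes of the dump above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class ClassHeader {
    // First ten bytes of the hexdump: cafebabe 0000 0043 0028.
    static final byte[] SAMPLE = {
        (byte) 0xca, (byte) 0xfe, (byte) 0xba, (byte) 0xbe,
        0x00, 0x00, 0x00, 0x43, 0x00, 0x28
    };

    /** Returns {minor, major, constantPoolCount}; throws if the magic number is wrong. */
    static int[] parse(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            int minor = in.readUnsignedShort();
            int major = in.readUnsignedShort();
            int constantPoolCount = in.readUnsignedShort();
            return new int[] {minor, major, constantPoolCount};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = parse(SAMPLE);
        // For modern releases the major version is the Java feature release plus 44.
        System.out.println("minor=" + header[0]
            + " major=" + header[1]
            + " (Java " + (header[1] - 44) + ")"
            + " constant_pool_count=" + header[2]);
    }
}
```

Everything after those ten bytes (the constant pool itself, fields, methods, attributes) is variable length, which is why a real disassembler is the saner way to read the rest.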
Javap
Javap is a class disassembler. This tool knows how to read the class file format. With the right flags we can use it to print the instructions we are sending to the JVM. I’ve cut out some lines, and added my own comments in brackets.
[The class file is printed because we used the -sysinfo flag.]
Classfile /Users/mathias.kools/Desktop/tacobelljavap/MyFibIterator.class
[...]
Compiled from "MyFibIterator.java"
class MyFibIterator implements java.util.Iterator<java.math.BigInteger> {
private java.math.BigInteger next;
private java.math.BigInteger nextNext;
MyFibIterator();
Code:
0: aload_0
[...]
[Check it out, our static method call!]
6: invokestatic #7 // Method java/math/BigInteger.valueOf:(J)Ljava/math/BigInteger;
9: putfield #13 // Field next:Ljava/math/BigInteger;
[...]
20: return
public boolean hasNext();
Code: [...]
public java.math.BigInteger next();
Code:
[...]
13: invokevirtual #22 // Method java/math/BigInteger.add:(Ljava/math/BigInteger;)Ljava/math/BigInteger;
[...]
31: areturn
[We didn’t write this next method!]
[This is a bridge method^3!]
public java.lang.Object next();
Code: [...]
}
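The search itself boils down to scanning this text for an invokestatic line whose comment names the method, while remembering which Classfile line it belongs to. A minimal sketch of that scan in Java follows. JavapGrep, filesCalling, and the /tmp path in the sample are hypothetical names for illustration, and the sketch assumes the comment format shown above ("// Method owner.name:descriptor").

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not from the original post.
final class JavapGrep {
    // A few lines shaped like `javap -c -sysinfo` output; the path is made up.
    static final List<String> SAMPLE = Arrays.asList(
        "Classfile /tmp/MyFibIterator.class",
        "class MyFibIterator implements java.util.Iterator<java.math.BigInteger> {",
        "  MyFibIterator();",
        "    Code:",
        "       6: invokestatic  #7   // Method java/math/BigInteger.valueOf:(J)Ljava/math/BigInteger;",
        "       9: putfield      #13  // Field next:Ljava/math/BigInteger;",
        "      14: invokestatic  #7   // Method java/math/BigInteger.valueOf:(J)Ljava/math/BigInteger;",
        "}");

    /** Returns each "Classfile ..." path whose disassembly has an invokestatic of target. */
    static List<String> filesCalling(List<String> javapLines, String target) {
        List<String> hits = new ArrayList<>();
        String current = null;
        for (String raw : javapLines) {
            String line = raw.trim();
            if (line.startsWith("Classfile ")) {
                current = line.substring("Classfile ".length());
            } else if (current != null
                    && line.contains("invokestatic")
                    && line.contains("// Method " + target)) {
                hits.add(current);
                current = null; // report each class file once
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String target = "java/math/BigInteger.valueOf:(J)Ljava/math/BigInteger;";
        filesCalling(SAMPLE, target).forEach(System.out::println);
    }
}
```

This only works when javap was run with -sysinfo (for the Classfile lines) and -c (for the instructions).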
Living Mas
I now have a way to search bytecode for static method usage. Let’s apply it. In my situation, I wanted to search two hundred thousand class files(3). To recreate this kind of scale, I’ll download thirty popular JARs.
% cd /Users/mathias.kools/Desktop/tacobelljavap/jars/
% wget https://repo1.maven.org/maven2/junit/junit/4.13.2/junit-4.13.2.jar
[...]
% wget https://repo1.maven.org/maven2/com/google/code/findbugs/jsr305/3.0.2/jsr305-3.0.2.jar
% for jar in *.jar; do
dir="${jar%.*}"
mkdir -p "$dir"
unzip -q "$jar" -d "$dir"
done
% cd ..; find jars -type f -name '*.class' | wc -l
19080
Shucks. I’m off by an order of magnitude. That’s okay, this is enough files to illustrate the challenges I faced. To look for this method, I want to resist the urge to go web scale. Instead I want to try to combine simple command line tools. This is taco bell programming.
I’m a terrible cook. There are only a few “simple”(4) ingredients here. But I’m still going to let an LLM ratatouille me through the command line arguments. That means the final result is going to be a bit of a taco bell sauce tote bag.
Attempt One, Just Xargs
We’ll run javap on every file. How bad could this possibly be?
% find jars -type f -name '*.class' | xargs -n1 javap -c -sysinfo
Pretty bad. Using later data, I estimate it would finish in one hour and 42 minutes. If I wanted to be efficient with work hours I’d shut up and eat my garbage. Instead, after hours, I’ve nibbled at this for something like 3 months(5).
Attempt Two, A Little Gusteau
To measure rate of improvement, I’m going to test on a simple random sample.
One wrinkle to any measurement we’re doing is that there are some HUGE class files (histogram in kilobytes).
% find jars -type f -name '*.class' | xargs -n1 -I % du -k % | awk 'BEGIN { binwidth=25; max=0 } { if ($1 < min) min = $1; if ($1 > max) max = $1; bins[int(($1)/binwidth)]++; } END { for (i=0; i<=int(max/binwidth); i++) { start = i * binwidth; end = start + binwidth; if (bins[i] > 0) { printf "%d-%d: %d\n", start, end, bins[i] } } }'
0-25: 18828
25-50: 185
50-75: 36
75-100: 17
100-125: 2
125-150: 5
150-175: 2
300-325: 1
325-350: 3
650-675: 1
The biggest offender is jars/kotlin-stdlib-2.1.10/kotlin/collections/ArraysKt___ArraysKt.class with something like 1223 members (comes from this monster generated source file).
First we’ll try parallelism. That’s a good trick.
P_VALUES=(1 2 4 6 8 12 24)
# gdate is GNU's latest date,
# OSX's is out of date.
for P in "${P_VALUES[@]}"; do
  START_TIME=$(gdate +%s%3N)
  cat sample.txt | xargs -n1 -P"$P" javap -c -sysinfo > /dev/null
  END_TIME=$(gdate +%s%3N)
  DURATION=$((END_TIME - START_TIME))
  echo "$P,$DURATION"
done
P | 1 | 2 | 4 | 6 | 8 | 12 | 24 |
---|---|---|---|---|---|---|---|
dur_ms | 270159 | 162260 | 113331 | 94999 | 77162 | 79271 | 84400 |
Looks like it is effective until around 8 processes. My MacBook Pro has 6 CPUs and 12 virtual cores. I could sip on my pipe in my study and ponder this. Or I could show you a better speedup!
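To put a number on “effective”, the usual arithmetic is speedup (single-process time divided by parallel time) and efficiency (speedup divided by process count). A minimal sketch using the durations from the table above; the class and method names are made up for illustration.

```java
// Hypothetical helper for illustration only.
final class Speedup {
    /** How many times faster the parallel run is than the baseline run. */
    static double speedup(long baselineMs, long parallelMs) {
        return (double) baselineMs / parallelMs;
    }

    /** Speedup per process; 1.0 would mean perfect scaling. */
    static double efficiency(long baselineMs, long parallelMs, int processes) {
        return speedup(baselineMs, parallelMs) / processes;
    }

    public static void main(String[] args) {
        // Values copied from the P / dur_ms table.
        int[] processes = {1, 2, 4, 6, 8, 12, 24};
        long[] durationMs = {270159, 162260, 113331, 94999, 77162, 79271, 84400};
        for (int i = 0; i < processes.length; i++) {
            System.out.printf("P=%d speedup=%.2f efficiency=%.2f%n",
                processes[i],
                speedup(durationMs[0], durationMs[i]),
                efficiency(durationMs[0], durationMs[i], processes[i]));
        }
    }
}
```

At 8 processes the speedup works out to roughly 3.5x, well short of 8x, which fits a workload dominated by per-process JVM startup rather than by the disassembly itself.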
Attempt Three, Zuzhing Xargs Further
Next we’ll try larger and larger batch sizes. That is we’ll pass longer and longer LISTS of files to a single invocation of javap.
N_VALUES=(4 64 256 1024)
P_VALUES=(1 2 4 8 12)
for P in "${P_VALUES[@]}"; do
  for N in "${N_VALUES[@]}"; do
    START_TIME=$(gdate +%s%3N)
    cat sample.txt | xargs -n"$N" -P"$P" javap -c -sysinfo > /dev/null
    END_TIME=$(gdate +%s%3N)
    DURATION=$((END_TIME - START_TIME))
    # Print N as well, otherwise rows for different batch sizes are indistinguishable.
    echo "$N,$P,$DURATION"
  done
done
N | dur_ms |
---|---|
1 | 270159 |
4 | 77407 |
64 | 9594 |
256 | 4491 |
1024 | 2403 |
Javap gets faster every second it is running. Here I have a cool experiment to measure the rate of improvement(6).
S_VALUES=($(seq 1 10 2000))
for S in "${S_VALUES[@]}"; do
START_TIME=$(gdate +%s%3N)
find jars -type f -name '*.class' | head -n "$S" | xargs -n"$S" javap -J-XX:+UnlockDiagnosticVMOptions -J-XX:+LogCompilation -J-XX:LogFile=/tmp/compiler.log > /dev/null
END_TIME=$(gdate +%s%3N)
DURATION=$((END_TIME - START_TIME))
COMPILER_LINES=$(cat /tmp/compiler.log | wc -l)
echo "$S,$DURATION,$COMPILER_LINES" >> times.csv
done
What’s going on? We are witnessing the power of Java’s JIT: it has tiered compilation(7).
(7) With -J-Djava.compiler=NONE, increasing batch size produces no speed improvements.
We can measure compiler activity by the number of log lines it produces(8). And we can measure the speed of javap by dividing the count of class files disassembled by the total time. We’ll graph the two together. We start with a lot of compiler activity. We taper off as we approach 2000 class files. Mirroring this shape is our javap speed. How neat is that relationship?
(8) This just so happened to produce nice graphs. I got a little lucky.
(9) R^2 of 0.898 when we graph log lines and speed together. High school me would write “highly correlated” on the stats test.
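For anyone who wants to redo that R^2 from their own times.csv, here is a minimal sketch. It assumes R^2 means the squared Pearson correlation between the two series (which is what a simple linear fit reports); the post does not say exactly how the 0.898 was computed. The rows in main are invented placeholders, not measurements.

```java
// Hypothetical helper for illustration only.
final class RSquared {
    /** Files disassembled per second for one row of times.csv. */
    static double filesPerSecond(int files, long durationMs) {
        return files * 1000.0 / durationMs;
    }

    /** Squared Pearson correlation; NaN if either series is constant. */
    static double rSquared(double[] xs, double[] ys) {
        if (xs.length != ys.length || xs.length < 2) {
            throw new IllegalArgumentException("need two series of equal length >= 2");
        }
        double meanX = 0, meanY = 0;
        for (int i = 0; i < xs.length; i++) {
            meanX += xs[i];
            meanY += ys[i];
        }
        meanX /= xs.length;
        meanY /= ys.length;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < xs.length; i++) {
            double dx = xs[i] - meanX;
            double dy = ys[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return (sxy * sxy) / (sxx * syy);
    }

    public static void main(String[] args) {
        // Placeholder rows in the shape "S,DURATION,COMPILER_LINES"; NOT real data.
        int[] files = {1, 501, 1001, 1501};
        long[] durationMs = {300, 1800, 2700, 3400};
        double[] compilerLines = {900, 2600, 3300, 3700};
        double[] speed = new double[files.length];
        for (int i = 0; i < files.length; i++) {
            speed[i] = filesPerSecond(files[i], durationMs[i]);
        }
        System.out.printf("R^2 = %.3f%n", rSquared(compilerLines, speed));
    }
}
```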
Attempt Four, Mixing Xargs Args
N_VALUES=(1024 2000 4000 8000 16000 20000 30000 40000)
P_VALUES=(1 2 4 6 8 12)
for N in "${N_VALUES[@]}"; do
for P in "${P_VALUES[@]}"; do
START_TIME=$(gdate +%s%3N)
find jars jars jars -type f -name '*.class' | xargs -n"$N" -P"$P" javap > /dev/null
END_TIME=$(gdate +%s%3N)
DURATION=$((END_TIME - START_TIME))
echo "$N,$P,$DURATION"
done
done
N/P | 1 | 2 | 4 | 6 | 8 | 12 |
---|---|---|---|---|---|---|
1024 | 116408 | 65166 | 50234 | 45763 | 44337 | 44127 |
2000 | 94860 | 52757 | 34322 | 34150 | 30540 | 29927 |
4000 | 69023 | 37320 | 27250 | 26806 | 25318 | 26006 |
8000 | 46223 | 27934 | 21434 | 16113 | 17168 | 17074 |
16000 | 35381 | 21787 | 15185 | 15966 | 15903 | 16175 |
20000 | 35059 | 21458 | 15338 | 15694 | 15942 | 16214 |
30000 | 43258 | 27232 | 17966 | 19023 | 18910 | 21210 |
40000 (10) | 43058 | 25605 | 18278 | 19514 | 19334 | 20203 |
(10) Out of curiosity, because it appears in the table: in xargs, what happens when BATCH SIZE * P > N?
seq 1 60000 | xargs -n40000 -P12 python3 -c "import uuid
import sys
print(f\"{uuid.uuid4()} {len(sys.argv) - 1}\")
"
8172f612-a49c-42e6-bd6a-189c7e3ab80c 20000
a150ac10-0c7d-4ab0-a7e0-66659818cab6 40000
It completely fills one invocation before giving arguments to the next. Some processes won’t be started at all.
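That greedy filling is easy to model. A minimal sketch, assuming xargs hands out arguments strictly in order and is limited only by -n (it ignores the command-line byte limit that real xargs also enforces); XargsBatches and batchSizes are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of `xargs -n`, for illustration only.
final class XargsBatches {
    /** Sizes of the argument lists produced when each batch is filled before the next starts. */
    static List<Integer> batchSizes(int totalArgs, int maxArgsPerCommand) {
        if (totalArgs < 0 || maxArgsPerCommand < 1) {
            throw new IllegalArgumentException("totalArgs >= 0 and maxArgsPerCommand >= 1 required");
        }
        List<Integer> sizes = new ArrayList<>();
        for (int remaining = totalArgs; remaining > 0; remaining -= maxArgsPerCommand) {
            sizes.add(Math.min(remaining, maxArgsPerCommand));
        }
        return sizes;
    }

    public static void main(String[] args) {
        // The seq 1 60000 experiment from the footnote.
        System.out.println(batchSizes(60000, 40000));
        // Three copies of the 19080 class files, as in `find jars jars jars`.
        int files = 3 * 19080;
        for (int n : new int[] {16000, 20000, 30000, 40000}) {
            System.out.println("-n" + n + " -> " + batchSizes(files, n));
        }
    }
}
```

Under this model, -n16000 yields four batches for the tripled file list while -n30000 and -n40000 yield only two, so any -P above 2 has nothing extra to run. That may be part of why the last two rows of the table get slower.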
Looks like batch size 16000 and parallelism 4 is the ideal crunchwrap supreme(10) combination.
Attempt Five, Swapping Taco Bell for Baja Fresh
I saw all this and assumed packing 20000 files into a list of arguments would be slow. So slow, that with a single process I figured if I read the files in as a stream I could outspeed xargs. I will just call into the methods that javap is implemented with(11). This isn’t taco bell programming anymore.
package io.github.math_ias;
import com.sun.tools.javap.Main;
import java.io.PrintWriter;
import java.io.InputStreamReader;
import java.io.BufferedReader;
import java.io.IOException;
import java.util.Arrays;
/**
* Program that takes javap flags first,
* followed by one file path at a time through standard input.
*/
public class MyMain {
public static void main(String[] flags) {
int flagsLength = flags.length;
String[] args = Arrays.copyOf(flags, flagsLength + 1);
PrintWriter writer = new PrintWriter(System.out);
try (
BufferedReader reader =
new BufferedReader(new InputStreamReader(System.in))
) {
String line;
while ((line = reader.readLine()) != null) {
args[flagsLength] = line;
Main.run(args, writer);
}
} catch (IOException e) {
e.printStackTrace();
}
System.exit(0);
}
}
(12) This isn’t trivial by the way. I did it like this with an unnamed module (drop --add-modules if you already have a module graph defined in a module-info.java).
javac --add-modules jdk.jdeps --add-exports jdk.jdeps/com.sun.tools.javap=ALL-UNNAMED io/github/math_ias/MyMain.java
You still add these exports when running with java.
java --add-exports jdk.jdeps/com.sun.tools.javap=ALL-UNNAMED io.github.math_ias.MyMain
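A different route that avoids the export flags, sketched here as an alternative and not as what the program above does: the JDK exposes javap through the standard java.util.spi.ToolProvider service, so it can be looked up by name and run in-process. This assumes a full JDK at runtime (the jdk.jdeps module must be present); JavapViaToolProvider and runJavap are hypothetical names.

```java
import java.io.PrintWriter;
import java.util.spi.ToolProvider;

// Hypothetical alternative for illustration; not the approach used in MyMain.
final class JavapViaToolProvider {
    /** Runs javap in this JVM and returns its exit code. */
    static int runJavap(PrintWriter out, PrintWriter err, String... javapArgs) {
        ToolProvider javap = ToolProvider.findFirst("javap")
            .orElseThrow(() -> new IllegalStateException("javap is not available in this runtime"));
        return javap.run(out, err, javapArgs);
    }

    public static void main(String[] args) {
        PrintWriter out = new PrintWriter(System.out, true);
        PrintWriter err = new PrintWriter(System.err, true);
        // With no arguments, just print the version; otherwise pass flags and paths through.
        String[] javapArgs = args.length == 0 ? new String[] {"-version"} : args;
        int exitCode = runJavap(out, err, javapArgs);
        out.flush();
        err.flush();
        if (exitCode != 0) {
            System.err.println("javap exited with " + exitCode);
        }
    }
}
```

Something like `java JavapViaToolProvider.java -c MyFibIterator.class` should then behave like the command line tool, with no --add-exports needed. Whether it is any faster than the direct Main.run call was not measured here.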
Check out this comparison of the best value pair of batch size and parallelism and my program.
for _ in {1..5}; do
  START_TIME=$(gdate +%s%3N)
  find jars jars jars -type f -name '*.class' | xargs -n 40000 -P 4 javap > /dev/null
  # OR: find ../../jars ../../jars ../../jars -type f -name '*.class' | java --add-exports jdk.jdeps/com.sun.tools.javap=ALL-UNNAMED io.github.math_ias.MyMain > /dev/null
  END_TIME=$(gdate +%s%3N)
  DURATION=$((END_TIME - START_TIME))
  echo "$DURATION"
done
Sample | Xargs Time | My Program Time |
---|---|---|
0 | 15857 | 14504 |
1 | 14964 | 14323 |
2 | 15506 | 15331 |
3 | 17200 | 15789 |
4 | 15913 | 14894 |
Average(13) | 15888 | 14968.2 |
(13) pbpaste | awk '{sum+=$1} END {print sum/NR}'
It’s a small improvement. But not worth the time I spent writing the code. We will come back to the idea of calling into the libraries that power javap later.
Attempt Six, Je Ne Sais Quoi(14)
I googled “how to make my java programs start faster”. I found this funky feature called AppCDS. It saves “class metadata” into a “JSA” file. The JVM loads classes from the JSA format faster than through class files or a JAR. The only blood sacrifice we have to make is giving up cross-platform-ness and file size.
First, I make the JSA file.
javap -J-XX:ArchiveClassesAtExit=my.jsa -c -p MyFibIterator.class
Then I use it.
N_VALUES=(1 4 16 64 256 1024 4096 16384 20000)
for n in "${N_VALUES[@]}"; do
START_TIME=$(gdate +%s%3N)
find jars -type f -name '*.class' | head -n "$n" | xargs -P1 -n"$n" javap -J-XX:SharedArchiveFile=my.jsa > /dev/null
END_TIME=$(gdate +%s%3N)
DURATION=$((END_TIME - START_TIME))
echo "$n,$DURATION"
done
It’s alright. I save around 50 milliseconds on small batches and about two seconds on the largest one.
Batch | No CDS (ms) | AppCDS (ms) |
---|---|---|
-n1 | 272 | 225 |
-n4 | 285 | 238 |
-n16 | 351 | 302 |
-n64 | 539 | 493 |
-n256 | 956 | 886 |
-n1024 | 2363 | 2272 |
-n4096 | 6694 | 6382 |
-n16384 | 23996 | 23366 |
-n20000 | 27250 | 25314 |
It reminds me of basil. It smells amazing, but when I cook with it the flavor disappears. Maybe I’m doing it wrong.
Attempt Seven, Native Image
What if we didn’t have to wait for tiered compilation? What if we compiled everything to machine code ahead of time? This is the idea behind GraalVM’s native image tool. The output looks nice.
% native-image com.sun.tools.javap.Main
========================================
GraalVM Native Image: Generating 'com.sun.tools.javap.main' (executable)...
========================================
[1/8] Initializing... (13.8s @ 0.09GB)
Java version: 23.0.1+11, vendor version: GraalVM CE 23.0.1+11.1
Graal compiler: optimization level: 2, target machine: x86-64-v3
C compiler: cc (apple, x86_64, 16.0.0)
Garbage collector: Serial GC (max heap size: 80% of RAM)
1 user-specific feature(s):
- com.oracle.svm.thirdparty.gson.GsonFeature
----------------------------------------------
Build resources:
- 12.09GB of memory (75.6% of 16.00GB system memory, determined at start)
- 12 thread(s) (100.0% of 12 available processor(s), determined at start)
[2/8] Performing analysis... [*****] (17.0s @ 0.56GB)
5,302 reachable types (75.0% of 7,071 total)
6,253 reachable fields (44.9% of 13,936 total)
23,246 reachable methods (48.8% of 47,615 total)
1,746 types, 15 fields, and 324 methods registered for reflection
58 types, 57 fields, and 52 methods registered for JNI access
4 native libraries: -framework Foundation, dl, pthread, z
[3/8] Building universe... (2.1s @ 0.63GB)
[4/8] Parsing methods... [*] (1.8s @ 0.39GB)
[5/8] Inlining methods... [***] (1.4s @ 0.47GB)
[6/8] Compiling methods... [****] (16.7s @ 0.84GB)
[7/8] Laying out methods... [**] (3.3s @ 0.97GB)
[8/8] Creating image... [**] (3.2s @ 0.49GB)
9.18MB (44.69%) for code area: 13,846 compilation units
11.13MB (54.14%) for image heap: 140,509 objects and 60 resources
245.58kB ( 1.17%) for other data
20.55MB in total
----------------------------------------------
Top 10 origins of code area: Top 10 object types in image heap:
6.51MB java.base 2.38MB byte[] for code metadata
1.07MB svm.jar (Native Image) 1.84MB byte[] for java.lang.String
489.63kB jdk.compiler 1.32MB java.lang.String
451.91kB jdk.jdeps 1.23MB java.lang.Class
237.65kB jdk.zipfs 527.32kB heap alignment
114.32kB java.logging 455.64kB com.oracle.svm.core.hub.DynamicHubCompanion
69.53kB org.graalvm.nativeimage.base 295.38kB byte[] for general heap data
49.71kB jdk.proxy2 279.84kB java.util.HashMap$Node
39.69kB jdk.proxy1 269.21kB java.lang.String[]
26.75kB jdk.internal.vm.ci 237.15kB java.lang.Object[]
61.46kB for 8 more packages 2.33MB for 1344 more object types
----------------------------------------------
Recommendations:
HEAP: Set max heap for improved and more predictable memory usage.
CPU: Enable more CPU features with '-march=native' for improved performance.
----------------------------------------------
3.2s (5.2% of total time) in 877 GCs | Peak RSS: 1.44GB | CPU load: 6.81
----------------------------------------------
Build artifacts:
[...]/com.sun.tools.javap.main (executable)
==============================================
Finished generating 'com.sun.tools.javap.main' in 1m 0s.
Unfortunately, it looks like I’ve made a bit of a “Taco Bell Beefer Burger” here. A recipe not in my comfort zone.
% ./com.sun.tools.javap.main
Exception in thread "main" java.lang.ExceptionInInitializerError
at jdk.compiler@23.0.1/com.sun.tools.javac.file.BaseFileManager.createLocations(BaseFileManager.java:126)
at jdk.compiler@23.0.1/com.sun.tools.javac.file.BaseFileManager.<init>(BaseFileManager.java:84)
at jdk.compiler@23.0.1/com.sun.tools.javac.file.JavacFileManager.<init>(JavacFileManager.java:162)
at jdk.jdeps@23.0.1/com.sun.tools.javap.JavapFileManager.<init>(JavapFileManager.java:46)
at jdk.jdeps@23.0.1/com.sun.tools.javap.JavapFileManager.create(JavapFileManager.java:57)
[...]
Attempt Eight, Maybe I Do Know Quoi
I brainstormed ways to go faster.
- What if we didn’t read files from standard input?
- What if the list of files never had to come from outside the java process?
- What if we pared down what we’re getting from javap? What if we just answered the question we have: what file uses this BigInteger method?
Here’s what I came up with. Straight off my ape brain.
package io.github.math_ias;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.InvalidPathException;
import java.lang.classfile.ClassFile;
import java.lang.classfile.ClassModel;
import java.lang.classfile.MethodModel;
import java.lang.classfile.attribute.CodeAttribute;
import java.lang.classfile.constantpool.PoolEntry;
import java.lang.classfile.constantpool.ClassEntry;
import java.lang.classfile.constantpool.MethodRefEntry;
import java.lang.classfile.constantpool.InterfaceMethodRefEntry;
import java.lang.constant.MethodTypeDesc;
import java.util.Iterator;
// Compile like ...
// javac --enable-preview -source 23 src/io/github/math_ias/MyMain.java -d target
// Run like ...
// java --enable-preview -cp target io.github.math_ias.MyMain jars 'java/math/BigInteger' 'valueOf' '(J)Ljava/math/BigInteger;'
/**
* Program that does a whole lot of things.
*/
public class MyMain {
public static void main(String[] args) {
if (args.length != 4) {
System.err.println("Expected exactly 4 arguments, the root directory, the class name (/'s), the method name, and the method type descriptor.");
System.exit(-1);
}
Path rootPath = null;
try {
rootPath = Path.of(args[0]);
} catch (InvalidPathException ipe) {
System.err.println("Expected first arg to be a valid root path (does not parse).");
ipe.printStackTrace(System.err);
System.exit(-1);
}
if (!Files.exists(rootPath)) {
System.err.println("Expected first arg to be a valid root path (does not exist).");
System.exit(-1);
}
String classToMatch = args[1];
String methodToMatch = args[2];
MethodTypeDesc methodTypeDescToMatch = null;
try {
methodTypeDescToMatch =
MethodTypeDesc.ofDescriptor(args[3]);
} catch (IllegalArgumentException e) {
System.err.println("Expected fourth arg to be a valid method type descriptor.");
e.printStackTrace(System.err);
System.exit(-1);
}
// To quiet javac on the lambda. :]
MethodTypeDesc effectivelyFinalValue =
methodTypeDescToMatch;
try {
Files.walk(rootPath)
.parallel()
// This lambda is likely repeating work Files.walk already does,
// plus this file extension business is nasty,
// but I'm betting that I can beat xargs without optimizing it.
.filter((Path path) ->
!Files.isDirectory(path) &&
path.getFileName().toString().endsWith(".class")
)
.forEach((Path path) -> printOnPathMatch(
path
, classToMatch
, methodToMatch
, effectivelyFinalValue
));
} catch (IOException ioe) {
System.err.println("Unexpected error occurred while traversing file tree.");
ioe.printStackTrace();
System.exit(-1);
}
System.exit(0);
}
public static void printOnPathMatch(
Path path
, String classToMatch
, String methodToMatch
, MethodTypeDesc methodTypeDescToMatch
) {
try {
ClassModel classModel =
ClassFile.of()
.parse(path);
if (classModelMatches(
classModel, classToMatch, methodToMatch, methodTypeDescToMatch
)) {
System.out.println(path.toString());
}
} catch (IOException io) {
System.err.println(String.format("Failed to read path %s, skipping.", path.toString()));
io.printStackTrace(System.err);
}
}
public static boolean classModelMatches(
ClassModel classModel
, String classToMatch
, String methodToMatch
, MethodTypeDesc methodTypeDescToMatch
) {
Iterator<PoolEntry> iterator = classModel.constantPool().iterator();
while (iterator.hasNext()) {
PoolEntry poolEntry = iterator.next();
if (poolEntry instanceof MethodRefEntry) {
MethodRefEntry methodRefEntry = (MethodRefEntry) poolEntry;
if (
methodRefEntry.name().equalsString(methodToMatch) &&
methodRefEntry.typeSymbol().equals(methodTypeDescToMatch) &&
methodRefEntry.owner().name().equalsString(classToMatch)
) {
return true;
}
} else if (poolEntry instanceof InterfaceMethodRefEntry) {
InterfaceMethodRefEntry interfaceMethodRefEntry =
(InterfaceMethodRefEntry) poolEntry;
if (
interfaceMethodRefEntry.name().equalsString(methodToMatch) &&
interfaceMethodRefEntry.typeSymbol().equals(methodTypeDescToMatch) &&
interfaceMethodRefEntry.owner().name().equalsString(classToMatch)
) {
return true;
}
}
}
return false;
}
}
Let’s measure it!(15)
for _ in {1..5}; do
  START_TIME=$(gdate +%s%3N)
  # My program only takes one root dir.
  # So for fairness I run it 3 times.
  # Restarting the JVM each time throws away some warmup.
  # But as you’ll see in the results,
  # it’s so fast it doesn’t matter.
  java --enable-preview io.github.math_ias.MyMain ../../jars 'java/math/BigInteger' 'valueOf' '(J)Ljava/math/BigInteger;' > /dev/null
  java --enable-preview io.github.math_ias.MyMain ../../jars 'java/math/BigInteger' 'valueOf' '(J)Ljava/math/BigInteger;' > /dev/null
  java --enable-preview io.github.math_ias.MyMain ../../jars 'java/math/BigInteger' 'valueOf' '(J)Ljava/math/BigInteger;' > /dev/null
  END_TIME=$(gdate +%s%3N)
  DURATION=$((END_TIME - START_TIME))
  echo "$DURATION"
done
(15) We can also measure fairness. For each file in the parallel stream let’s just print the file name.
java --enable-preview io.github.math_ias.MyMain ../../jars | wc -l
19080
We’re not skipping files by accident. What about accuracy?
find . -type f -name '*.class' | xargs -n 50000 -P 4 javap -sysinfo -c | rg "java/lang/Object\.equals:\(Ljava/lang/Object;\)Z|^Classfile" | python3 -c "
import sys, itertools
a, b = itertools.tee(sys.stdin)
next(b, None)
for x, y in zip(a, b):
    if 'Classfile' in x and 'invokevirtual' in y.lower():
        print(x, end='')
" | wc -l
571
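That gives a similar count to my program: 587. I suspect the class file spec allows you to add methods to the constant pool and not use them. My shell filter is also stricter than my program: it only counts a class when the first matching line after its Classfile line is an invokevirtual, so a class whose first hit is super.equals(...) (an invokespecial) is skipped.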
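To make “in the constant pool” concrete, here is a minimal sketch of the same question asked without the preview ClassFile API: walk the constant pool with a DataInputStream and look for a Methodref or InterfaceMethodref that names the method. It assumes a well-formed class file and does not look at the bytecode at all, so like my program it cannot tell a used entry from an unused one. PoolScan, mentions, and tinyPool are hypothetical names; tinyPool builds a six-entry pool purely for demonstration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; assumes a well-formed class file.
final class PoolScan {
    /** True if the constant pool has a (Interface)Methodref for owner.name:descriptor. */
    static boolean mentions(byte[] classBytes, String owner, String name, String descriptor) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readInt(); // minor and major version, unused here
            int count = in.readUnsignedShort();
            int[] tag = new int[count];
            int[] first = new int[count];   // first index operand of an entry
            int[] second = new int[count];  // second index operand of an entry
            String[] utf8 = new String[count];
            for (int i = 1; i < count; i++) {
                tag[i] = in.readUnsignedByte();
                switch (tag[i]) {
                    case 1: utf8[i] = in.readUTF(); break;            // Utf8
                    case 3: case 4: in.readInt(); break;              // Integer, Float
                    case 5: case 6: in.readLong(); i++; break;        // Long, Double use two slots
                    case 7: case 8: case 16: case 19: case 20:        // Class, String, MethodType, Module, Package
                        first[i] = in.readUnsignedShort(); break;
                    case 9: case 10: case 11: case 12: case 17: case 18: // member refs, NameAndType, (Invoke)Dynamic
                        first[i] = in.readUnsignedShort();
                        second[i] = in.readUnsignedShort();
                        break;
                    case 15: in.readUnsignedByte(); in.readUnsignedShort(); break; // MethodHandle
                    default: throw new IllegalArgumentException("unknown constant pool tag " + tag[i]);
                }
            }
            for (int i = 1; i < count; i++) {
                if (tag[i] != 10 && tag[i] != 11) continue; // Methodref or InterfaceMethodref only
                int classIndex = first[i];
                int nameAndType = second[i];
                if (owner.equals(utf8[first[classIndex]])
                        && name.equals(utf8[first[nameAndType]])
                        && descriptor.equals(utf8[second[nameAndType]])) {
                    return true;
                }
            }
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Header plus a six-entry pool holding one Methodref; demonstration data only. */
    static byte[] tinyPool() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);
            out.writeShort(0);  // minor
            out.writeShort(67); // major
            out.writeShort(7);  // constant_pool_count is entries + 1
            out.writeByte(10); out.writeShort(2); out.writeShort(3);      // #1 Methodref
            out.writeByte(7); out.writeShort(4);                           // #2 Class
            out.writeByte(12); out.writeShort(5); out.writeShort(6);      // #3 NameAndType
            out.writeByte(1); out.writeUTF("java/math/BigInteger");       // #4
            out.writeByte(1); out.writeUTF("valueOf");                    // #5
            out.writeByte(1); out.writeUTF("(J)Ljava/math/BigInteger;");  // #6
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Usage: PoolScan <file.class> <owner> <name> <descriptor>, or no arguments for the demo pool.
        byte[] bytes = args.length == 4 ? Files.readAllBytes(Paths.get(args[0])) : tinyPool();
        String owner = args.length == 4 ? args[1] : "java/math/BigInteger";
        String name = args.length == 4 ? args[2] : "valueOf";
        String descriptor = args.length == 4 ? args[3] : "(J)Ljava/math/BigInteger;";
        System.out.println(mentions(bytes, owner, name, descriptor));
    }
}
```

Telling used entries from unused ones would mean walking each method’s Code attribute for invoke instructions, which is exactly the part javap -c prints.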
It does raise the question: what IS the most used method in all these jars? Let’s go find out on another page.
It’s FAST. It is about 10 times faster than the fastest method we’ve come up with so far (14968 versus 1533 milliseconds).
SAMPLE | 0 | 1 | 2 | 3 | 4 | AVG |
---|---|---|---|---|---|---|
TIME | 1528 | 1548 | 1491 | 1536 | 1560 | 1532.6 |
And it only took me a couple of months to do it and then write about it!
Further Reading
There are two JVM implementations that have their own solutions to slow java startup times. I didn’t try them and perhaps they are another way to crack this problem.
- Azul Zulu has Coordinated Restore at Checkpoint.
- Azul Zing has ReadyNow.