Some of the binding code for accessing native functions and data structures is generated with jextract. The tool is still under construction and has its limitations.
To generate the code, the scripts in the subdirectories have to be run (linux/gen_linux.sh, macos/gen_macos.sh and windows/gen_win.cmd). Each script must be run on its target operating system.
The code is generated in directories below gen, i.e. main/java/net/codecrete/usb/linux/gen and similar for the other operating systems. For each library (xxx.so or xxx.dll) and each macOS framework, a separate package is created.
The scripts explicitly specify the functions, structs etc. to include as generating code for entire operating system header files can result in an excessive amount of Java source files and classes.
The resulting code is then committed to the source code repository. Before the commit, imports are cleaned up to get rid of superfluous imports. Most IDEs provide a convenient command to execute this on entire directories.
- The binaries for jextract on https://jdk.java.net/jextract/ have not been updated for JDK 21. So it must be built from source. Instructions can be found at Building & Testing.
- According to the jextract mailing list, separate code would have to be generated for the Intel x64 and ARM64 architectures, and jextract would need to be run on each architecture separately (no cross-compilation). Fortunately, this doesn't seem to be the case. Linux code generated on Intel x64 also runs on ARM64 without change. The same holds for macOS. However, jextract needs to be run on each operating system separately.
- JDK 20 introduced a new feature for saving the thread-specific error values (`GetLastError()` on Windows, `errno` on Linux). To use it, an additional parameter must be added to function calls. Unfortunately, this is not yet supported by jextract. So a good number of function bindings have to be written manually.
- `typedef` and `struct`:
  1. If only the `typedef` is included (`--include-typedef`), an empty Java class is generated.
  2. If both the `typedef` and the `struct` it refers to are included, the `typedef` class inherits from the `struct` class, which contains all the `struct` members.
  3. If the `typedef` refers to an unnamed `struct`, the generated class contains all the `struct` members.

  Case 1 looks like a bug.
- jextract is not really transparent about what it does. It often skips elements without providing any information. In particular, it will silently skip a requested element in these cases:
  - `--include-var myvar` if `myvar` is declared as `static`.
  - `--include-var myvar` if `myvar` is an `enum` constant. `enum` constants must be requested with `--include-constant`.
  - `--include-constant MYCONSTANT` if `MYCONSTANT` is function-like, even if it evaluates to a constant.
  - `--include-struct mystruct` if `mystruct` is actually a `typedef` to a `struct`.
  - `--include-typedef mystruct` if `mystruct` is actually a `struct`.
  - `--include-typedef mytypedef` if `mytypedef` is a `typedef` for a primitive type.
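The manually written bindings for error state capturing mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual code: it assumes Java 22 or later (where the FFM API is final), a Linux or macOS system, and uses the C function `close` purely as an illustration. The class name `ErrnoCapture` and the method `closeAndGetErrno` are hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.VarHandle;

// Hypothetical helper: calls close(fd) and reports the errno value it produced.
final class ErrnoCapture {
    private static final Linker LINKER = Linker.nativeLinker();

    // Layout of the buffer that receives the captured call state
    private static final StructLayout CAPTURE_LAYOUT = Linker.Option.captureStateLayout();

    // Accessor for the "errno" field within the captured state
    private static final VarHandle ERRNO =
            CAPTURE_LAYOUT.varHandle(MemoryLayout.PathElement.groupElement("errno"));

    // int close(int fd), linked with the option to capture errno after the call
    private static final MethodHandle CLOSE = LINKER.downcallHandle(
            LINKER.defaultLookup().find("close").orElseThrow(),
            FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
            Linker.Option.captureCallState("errno"));

    // Returns 0 on success, otherwise the errno value set by close()
    static int closeAndGetErrno(int fd) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment state = arena.allocate(CAPTURE_LAYOUT);
            // The capture buffer is the additional leading parameter
            int result = (int) CLOSE.invokeExact(state, fd);
            return result == 0 ? 0 : (int) ERRNO.get(state, 0L);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }
}
```

The extra `MemorySegment` argument is the "additional parameter" referred to above. Since jextract does not emit it, each function that needs `errno` or `GetLastError()` requires a hand-written method handle of this kind.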
To run the Linux script, the header files for libudev must be present. In most cases, they aren't installed by default (in contrast to the library itself):
sudo apt-get install libudev-dev
On Linux, the limitations are:
- `usbdevice_fs.h`: The macro `USBDEVFS_CONTROL` and all similar ones are not generated. They are probably considered function-like macros, for which jextract does not generate code, even though `USBDEVFS_CONTROL` evaluates to a constant.
- `sd-device.h` (header file for libsystemd): jextract fails with "Error: /usr/include/inttypes.h:290:8: error: unknown type name 'intmax_t'". The reason is not yet known. This code is currently not needed as libudev is used instead of libsystemd. The two are related: libsystemd is the future solution, but it is missing support for monitoring devices.
- `libudev.h`: After code generation, the class `RuntimeHelper.java` in `.../linux/gen/udev` must be manually modified as the code to access the library does not work for the directory the library is located in. So replace:
System.loadLibrary("udev");
SymbolLookup loaderLookup = SymbolLookup.loaderLookup();
with:
SymbolLookup loaderLookup = SymbolLookup.libraryLookup("libudev.so", MemorySession.openImplicit());
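Since `USBDEVFS_CONTROL` and similar macros are skipped, their values have to be provided by hand. The sketch below shows one way to compute such a constant in Java. It assumes the `asm-generic/ioctl.h` bit layout used on x86-64 and ARM64 (some other architectures use different values) and a 64-bit `struct usbdevfs_ctrltransfer` of 24 bytes. The class and method names are hypothetical and not taken from the project.

```java
// Hypothetical helper that re-creates the _IOWR() macro from asm-generic/ioctl.h.
final class UsbDevFsConstants {
    // Bit positions of the fields within an ioctl request code
    private static final int IOC_NRSHIFT = 0;
    private static final int IOC_TYPESHIFT = 8;
    private static final int IOC_SIZESHIFT = 16;
    private static final int IOC_DIRSHIFT = 30;

    // Direction bits
    private static final int IOC_WRITE = 1;
    private static final int IOC_READ = 2;

    // Equivalent of _IOC(dir, type, nr, size)
    static int ioc(int dir, int type, int nr, int size) {
        return (dir << IOC_DIRSHIFT) | (size << IOC_SIZESHIFT)
                | (type << IOC_TYPESHIFT) | (nr << IOC_NRSHIFT);
    }

    // Equivalent of _IOWR(type, nr, struct) with the struct size passed explicitly
    static int iowr(char type, int nr, int size) {
        return ioc(IOC_READ | IOC_WRITE, type, nr, size);
    }

    // Assumed sizeof(struct usbdevfs_ctrltransfer) on 64-bit Linux:
    // 8 bytes of request fields, 4 bytes timeout, 4 bytes padding, 8 bytes pointer
    static final int CTRLTRANSFER_SIZE = 24;

    // Equivalent of: #define USBDEVFS_CONTROL _IOWR('U', 0, struct usbdevfs_ctrltransfer)
    static final int USBDEVFS_CONTROL = iowr('U', 0, CTRLTRANSFER_SIZE);
}
```

Under these assumptions, `UsbDevFsConstants.USBDEVFS_CONTROL` evaluates to `0xC0185500`. In real code, the struct size would better be taken from the generated struct layout than hard-coded.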
Most of the required native functions on macOS are part of a framework. Frameworks internally have a more complex file organization of header and binary files than appears from the outside. Locating framework header files therefore requires special logic, which clang supports with the -F option. jextract allows compiler options to be specified in a compile_flags.txt file. Since the file must be in the local directory and since it does not apply to Linux and Windows, separate directories must be used for the operating systems.
The generated code has the same problem as the Linux code for udev. It must be manually changed to use SymbolLookup.libraryLookup() for the frameworks CoreFoundation and IOKit respectively, and use an absolute path starting with /System/Library/Frameworks/.
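A lookup by absolute path might be sketched as follows. This assumes Java 22 or later, where `SymbolLookup.libraryLookup()` takes an `Arena` as its second argument. The helper class `FrameworkLookup`, its method `tryLookup`, and the full framework path in the usage note are illustrative assumptions, not the project's code.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.SymbolLookup;
import java.util.Optional;

// Hypothetical helper: loads a library by path and reports failure as an empty Optional.
final class FrameworkLookup {
    static Optional<SymbolLookup> tryLookup(String absolutePath, Arena arena) {
        try {
            // The library stays loaded for as long as the arena is alive
            return Optional.of(SymbolLookup.libraryLookup(absolutePath, arena));
        } catch (IllegalArgumentException e) {
            // Thrown if the path does not identify a loadable library
            return Optional.empty();
        }
    }
}
```

On macOS, a call such as `FrameworkLookup.tryLookup("/System/Library/Frameworks/CoreFoundation.framework/CoreFoundation", Arena.global())` would be expected to return a usable lookup. On other systems it returns an empty `Optional`.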
Most Windows SDK header files are not independent. They require that Windows.h is included first. So instead of specifying the target header files directly, a helper header file (windows_headers.h in this directory) is specified.
Compared to Linux and macOS, the code generation on Windows is very slow (about 1 min vs 3 seconds). And jextract crashes sometimes.
The known limitations are:
- Variable size `struct`: Several Windows `struct`s are of variable size. The last member is an array. The `struct` definition specifies array length 1. But you are expected to allocate more space depending on the actual array size you need. jextract generates code for array length 1 and checks the length when the members are accessed. So the generated code is difficult to use. Variable size `struct`s are a pain in any language.
- GUID constants like `GUID_DEVINTERFACE_USB_DEVICE` do not work. While code is generated, the code fails at run-time as it is unable to locate the symbol. This is due to the fact that `GUID_DEVINTERFACE_USB_DEVICE` actually resolves to a variable definition and not to a variable declaration. The GUID constant is not contained in any library; instead the header files use linkage options to generate the constant in the caller's code, which does not work with FFM. Such constants should be skipped by jextract.
- jextract is a batch script and turns off echo mode. If a single batch script contains multiple calls to jextract, the following needs to be considered:
  - If the regular command interpreter `cmd.exe` is used, jextract must be called using `call`, i.e. `call jextract header.h`.
  - If PowerShell is used instead, `call` is not needed but PowerShell must be configured to allow the execution of scripts.
  - jextract turns off echo mode. So the first call will behave differently than the following calls.
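For variable size structs, one workaround is to bypass the generated accessors and work with offsets directly. The following is a minimal sketch under stated assumptions: Java 22 or later, and a hypothetical C struct `struct packet { uint32_t count; uint16_t items[1]; }` that stands in for the real Windows types. None of the names are taken from the Windows SDK or from the project.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical accessors for: struct packet { uint32_t count; uint16_t items[1]; };
final class VariableSizeStruct {
    // items[] starts right after the 4-byte count member
    static final long ITEMS_OFFSET = 4;

    // Allocates the header plus room for 'count' array elements
    static MemorySegment allocate(Arena arena, int count) {
        long size = ITEMS_OFFSET + (long) count * Short.BYTES;
        MemorySegment segment = arena.allocate(size, 4); // 4-byte alignment for 'count'
        segment.set(ValueLayout.JAVA_INT, 0, count);
        return segment;
    }

    // Writes items[index]; bounds are checked against the real allocation size
    static void setItem(MemorySegment segment, int index, short value) {
        segment.set(ValueLayout.JAVA_SHORT, ITEMS_OFFSET + (long) index * Short.BYTES, value);
    }

    // Reads items[index]
    static short getItem(MemorySegment segment, int index) {
        return segment.get(ValueLayout.JAVA_SHORT, ITEMS_OFFSET + (long) index * Short.BYTES);
    }
}
```

Because the segment is sized for the actual number of elements, the bounds check performed by `MemorySegment` matches the real array length instead of the declared length of 1.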
jextract generates a comprehensive set of methods for each function, struct, struct member etc. Most of it will not be used as a typical application just uses a subset of struct members, might only read or write them etc. So a considerable amount of code is generated. For some types, it's a bit excessive.
The worst example is IOUSBInterfaceStruct190 (macOS). This is a struct consisting of about 50 member functions. It's basically a vtable of a C++ class. For this single struct, jextract generates code resulting in 70 class files with a total size of 227 kByte.
The table below shows statistics for version 0.6.0 of the library:
Code Size (compiled), in bytes and percentage of total size
If jextract could generate code for error state capturing, there would be even more generated and less manually written code.
