Java – SBT: how to package instances of classes as jars?
My code is basically like this:
class FoodTrainer(images: S3Path) { // data is >100GB file living in S3
  def train(): FoodClassifier // Very expensive - takes ~5 hours!
}

class FoodClassifier { // Light-weight API class
  def isHotDog(input: Image): Boolean
}
At jar assembly time (sbt assembly), I want to call val classifier = new FoodTrainer(s3Dir).train() and publish a jar in which the classifier instance is immediately available to downstream library users.
What is the easiest way to do this? Are there established examples? I know that publishing pre-trained models is a fairly common idiom in ML projects, for example http://nlp.stanford.edu/software/stanford-corenlp-models-current.jar
How can I do this with sbt assembly without having to check a large model class or data file into version control?
Solution
You should serialize the data that results from training (the trained model) into its own file. You can then package this data file into a jar. Your production code opens the file and reads it instead of running the training algorithm.
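The serialize-then-load approach could be sketched as follows. This is a minimal illustration using plain Java serialization, not a definitive implementation. The FoodClassifier below is a hypothetical stand-in with a single threshold field and a Double input instead of Image, and the ModelIo object and the resource name /food-classifier.bin are assumptions made for the example.

```scala
import java.io.{InputStream, ObjectInputStream, ObjectOutputStream}
import java.nio.file.{Files, Path}

// Hypothetical stand-in for the trained model; a real FoodClassifier would
// hold learned parameters rather than a single threshold.
final case class FoodClassifier(threshold: Double) {
  // Placeholder rule, purely illustrative.
  def isHotDog(score: Double): Boolean = score >= threshold
}

// Hypothetical helper object, not part of any library.
object ModelIo {
  // Build time: write the trained model to a file that is later packaged into the jar.
  def save(model: FoodClassifier, target: Path): Unit = {
    val out = new ObjectOutputStream(Files.newOutputStream(target))
    try out.writeObject(model) finally out.close()
  }

  // Read a model back from any stream (a file or a classpath resource).
  def load(in: InputStream): FoodClassifier = {
    val ois = new ObjectInputStream(in)
    try ois.readObject().asInstanceOf[FoodClassifier] finally ois.close()
  }

  // Run time: load the model shipped inside the jar.
  // "/food-classifier.bin" is an assumed resource name.
  def loadBundled(): FoodClassifier = {
    val in = getClass.getResourceAsStream("/food-classifier.bin")
    require(in != null, "model resource not found on the classpath")
    load(in)
  }
}
```

Downstream users would then call ModelIo.loadBundled() instead of running train(). In an sbt build, a task registered under Compile / resourceGenerators could run the training step and write the file into the managed resources directory, so the model ends up in the jar without being checked into version control.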