Tensorflow_scala alternatives and similar packages
Based on the "Science and Data Analysis" category

PredictionIO
Machine learning server for developers and data scientists. Built on Apache Spark, HBase and Spray.
Smile
Statistical Machine Intelligence and Learning Engine. Smile is a fast and comprehensive machine learning system. 
Spark Notebook
Scalable and stable Scala- and Spark-focused notebook bridging the gap between the JVM and data scientists (incl. extendable, type-safe and reactive charts).
Figaro
Figaro is a probabilistic programming language that supports development of very rich probabilistic models. 
FACTORIE
A toolkit for deployable probabilistic modeling, implemented as a software library in Scala. 
ND4S
N-dimensional arrays and linear algebra for Scala with an API similar to NumPy. ND4S is a Scala wrapper around ND4J.
Libra
Libra is a dimensional analysis library based on shapeless, spire and singleton-ops. It contains out-of-the-box support for SI units for all numeric types.
Compute.scala
Scientific computing with N-dimensional arrays.
Optimus
Optimus is a library for linear and quadratic mathematical optimization written in the Scala programming language.
rscala
The Scala interpreter is embedded in R and callbacks to R from the embedded interpreter are supported. Conversely, the R interpreter is embedded in Scala. 
Clustering4Ever
Scala and Spark API to benchmark and analyse clustering algorithms on any vectorization you can generate 
Tyche
Probability distributions, stochastic & Markov processes, lattice walks, simple random sampling. A simple yet robust Scala library. 
Rings
An efficient library for polynomial rings. Commutative algebra, polynomial GCDs, polynomial factorization and other sci things at a really high speed. 
SwiftLearner
Simply written algorithms to help study Machine Learning or write your own implementations.
README
This library is a Scala API for TensorFlow (https://www.tensorflow.org). It attempts to provide most of the functionality of the official Python API, while at the same time being strongly typed and adding some new features. It is a work in progress and a project I started for my personal research purposes. Much of the API should be relatively stable by now, but things are still likely to change.
Please refer to the main website for documentation and tutorials.
Citation
If you end up using this project in your work, it would be greatly appreciated if you could cite it using the following BibTeX entry:
@misc{Platanios:2018:tensorflowscala,
title = {{TensorFlow Scala}},
author = {Platanios, Emmanouil Antonios},
howpublished = {\url{https://github.com/eaplatanios/tensorflow_scala}},
year = {2018}
}
Main Features
Easy manipulation of tensors and computations involving tensors (similar to NumPy in Python):
val t1 = Tensor(1.2, 4.5)
val t2 = Tensor(-0.2, 1.1)
t1 + t2 == Tensor(1.0, 5.6)
Low-level graph construction API, similar to that of the Python API, but strongly typed wherever possible:
val inputs      = tf.placeholder[Float](Shape(-1, 10))
val outputs     = tf.placeholder[Float](Shape(-1, 10))
val predictions = tf.nameScope("Linear") {
  val weights = tf.variable[Float]("weights", Shape(10, 1), tf.ZerosInitializer)
  tf.matmul(inputs, weights)
}
val loss        = tf.sum(tf.square(predictions - outputs))
val optimizer   = tf.train.AdaGrad(1.0f)
val trainOp     = optimizer.minimize(loss)
NumPy-like indexing/slicing for tensors. For example:
tensor(2 :: 5, ---, 1) // is equivalent to numpy's 'tensor[2:5, ..., 1]'
High-level API for creating, training, and using neural networks. For example, the following code shows how simple it is to train a multi-layer perceptron for MNIST using TensorFlow for Scala. For simplicity, we omit many powerful features here, such as summary and checkpoint savers, but these are also very simple to use.
// Load and batch data using pre-fetching.
val dataset = MNISTLoader.load(Paths.get("/tmp"))
val trainImages = tf.data.datasetFromTensorSlices(dataset.trainImages.toFloat)
val trainLabels = tf.data.datasetFromTensorSlices(dataset.trainLabels.toLong)
val trainData =
  trainImages.zip(trainLabels)
      .repeat()
      .shuffle(10000)
      .batch(256)
      .prefetch(10)

// Create the MLP model.
val input = Input(FLOAT32, Shape(-1, 28, 28))
val trainInput = Input(INT64, Shape(-1))
val layer = Flatten[Float]("Input/Flatten") >>
    Linear[Float]("Layer_0", 128) >> ReLU[Float]("Layer_0/Activation", 0.1f) >>
    Linear[Float]("Layer_1", 64) >> ReLU[Float]("Layer_1/Activation", 0.1f) >>
    Linear[Float]("Layer_2", 32) >> ReLU[Float]("Layer_2/Activation", 0.1f) >>
    Linear[Float]("OutputLayer", 10)
val loss = SparseSoftmaxCrossEntropy[Float, Long, Float]("Loss") >>
    Mean("Loss/Mean")
val optimizer = tf.train.GradientDescent(1e-6f)
val model = Model.simpleSupervised(input, trainInput, layer, loss, optimizer)

// Create an estimator and train the model.
val estimator = InMemoryEstimator(model)
estimator.train(() => trainData, StopCriteria(maxSteps = Some(1000000)))
And by changing a few lines to the following code, you can get checkpoint capability, summaries, and seamless integration with TensorBoard:
val loss = SparseSoftmaxCrossEntropy[Float, Long, Float]("Loss") >>
    Mean("Loss/Mean") >>
    ScalarSummary(name = "Loss", tag = "Loss")
val summariesDir = Paths.get("/tmp/summaries")
val estimator = InMemoryEstimator(
  modelFunction = model,
  configurationBase = Configuration(Some(summariesDir)),
  trainHooks = Set(
    SummarySaver(summariesDir, StepHookTrigger(100)),
    CheckpointSaver(summariesDir, StepHookTrigger(1000))),
  tensorBoardConfig = TensorBoardConfig(summariesDir))
estimator.train(() => trainData, StopCriteria(maxSteps = Some(100000)))
If you now browse to https://127.0.0.1:6006 while training, you can see the training progress.
Efficient interaction with the native library that avoids unnecessary copying of data. All tensors are created and managed by the native TensorFlow library. When they are passed to the Scala API (e.g., fetched from a TensorFlow session), we use a combination of weak references and a disposing thread running in the background to dispose of them once they are no longer referenced. Please refer to tensorflow/src/main/scala/org/platanios/tensorflow/api/utilities/Disposer.scala for the implementation.
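The Disposer.scala file mentioned above is not reproduced here. As a rough illustration of the general technique only, the following self-contained sketch pairs JDK reference objects with a cleanup pass. It uses phantom references (a close relative of the weak references mentioned above) together with a ReferenceQueue. The names DisposerSketch, register, drain, and startBackgroundThread are hypothetical and are not part of the TensorFlow Scala API; the actual implementation may differ.

```scala
import java.lang.ref.{PhantomReference, ReferenceQueue}
import java.util.concurrent.ConcurrentHashMap

// Hypothetical sketch, not the library's Disposer.
object DisposerSketch {
  // The JVM enqueues a reference here once its owner becomes unreachable.
  private val queue = new ReferenceQueue[AnyRef]

  // Keeps each reference strongly reachable, together with the action that
  // releases the native resource it stands for.
  private val cleanups = new ConcurrentHashMap[AnyRef, () => Unit]

  /** Registers `cleanup` to run after `owner` has been garbage collected.
    * The cleanup must not capture `owner`, or it will never be collected. */
  def register(owner: AnyRef, cleanup: () => Unit): PhantomReference[AnyRef] = {
    val ref = new PhantomReference[AnyRef](owner, queue)
    cleanups.put(ref, cleanup)
    ref
  }

  // Runs and forgets the cleanup tied to `ref`, if it has not run already.
  private def runCleanup(ref: AnyRef): Boolean = {
    val cleanup = cleanups.remove(ref)
    if (cleanup != null) { cleanup(); true } else false
  }

  /** Processes every reference currently in the queue without blocking.
    * Returns the number of cleanups that were run. */
  def drain(): Int = {
    var count = 0
    var ref = queue.poll()
    while (ref != null) {
      if (runCleanup(ref)) count += 1
      ref = queue.poll()
    }
    count
  }

  /** Starts a daemon thread that blocks on the queue and disposes in the background. */
  def startBackgroundThread(): Thread = {
    val worker: Runnable = () => {
      while (true) {
        runCleanup(queue.remove()) // blocks until a reference is enqueued
      }
    }
    val thread = new Thread(worker, "disposer-sketch")
    thread.setDaemon(true)
    thread.start()
    thread
  }
}
```

In this sketch, a wrapper around a native handle would call register(this, () => freeNative(handle)) on construction, where freeNative is a placeholder for whatever releases the native memory. Making the thread a daemon keeps it from blocking JVM shutdown.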
Compiling from Source
Note that in order to compile TensorFlow Scala on your machine you will first need to install the TensorFlow Python API. You also need to make sure that you have a python3 alias for your python binary. This is used by CMake to find the TensorFlow header files in your installation.
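Before starting a build, it can help to confirm that both preconditions hold. The sketch below is a hypothetical helper and not part of the project's build. It runs python3 and asks the installed TensorFlow Python package for its header directory via tf.sysconfig.get_include(). Whether the CMake scripts perform the lookup in exactly this way is not stated here, so treat it as a sanity check only. The object and method names are assumptions made for illustration.

```scala
import scala.sys.process._
import scala.util.Try

// Hypothetical pre-build check, not part of TensorFlow Scala.
object PythonTensorFlowCheck {
  /** Returns the TensorFlow header directory reported by the given Python binary,
    * or None if the binary or the TensorFlow package cannot be found. */
  def tensorFlowIncludeDir(python: String = "python3"): Option[String] = {
    val command = Seq(
      python, "-c",
      "import tensorflow as tf; print(tf.sysconfig.get_include())")
    // Discard stderr so that TensorFlow import warnings do not clutter the console.
    val silent = ProcessLogger((_: String) => ())
    // `!!` throws if the binary is missing or exits with a non-zero code.
    Try(command.!!(silent).trim).toOption.filter(_.nonEmpty)
  }

  def main(args: Array[String]): Unit =
    tensorFlowIncludeDir() match {
      case Some(dir) => println(s"TensorFlow headers reported under: $dir")
      case None      => println("python3 or the TensorFlow Python package was not found")
    }
}
```

If the check prints the second message, either the python3 alias does not resolve or the TensorFlow Python API is not installed for that interpreter.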
Tutorials
Funding
Funding for the development of this library has been generously provided by the following sponsors:
CMU Presidential Fellowship, awarded to Emmanouil Antonios Platanios
National Science Foundation, Grant #: IIS-1250956
Air Force Office of Scientific Research, Grant #: FA9550-17-1-0218
TensorFlow, the TensorFlow logo, and any related marks are trademarks of Google Inc.
<!--
Some TODOs
- [ ] Figure out what the proper way to handle Int vs Long shapes is, so that we can use Long shapes without hurting GPU performance.
- [ ] Make the optimizers typed (with respect to their state, at least).
- [ ] Make the gradients function retain types (we need a type trait for that).
- [ ] Dispose dataset iterators automatically.
- [ ] Fix all [TYPE] !!! code TODOs.
- [ ] Session execution context (I'm not sure if that's good to have)
- [ ] Session reset functionality
- [ ] Variables slicing
- [ ] Slice assignment
- [ ] Support for CriticalSection.
- [ ] tfdbg / debugging support
- [ ] tfprof / op statistics collection
- Switch to using JUnit for all tests.
- Add convenience implicit conversions for shapes (e.g., from tuples or sequences of integers).
- Create a "Scope" class and companion object.
- Variables API:
  - Clean up the implementation of variable scopes and stores and integrate it with "Scope".
  - Make 'PartitionedVariable' extend 'Variable'.
  - After that change, all 'getPartitionedVariable' methods can be integrated with the 'getVariable' methods, which will simplify the variables API.
- Switch to using "Seq" instead of "Array" wherever possible.
- Op creation:
  - Reset default graph
  - Register op statistics
- Fix Travis CI support (somehow load the native library)
- Website margins are a little large relative to the content on mobile
- Make the code blocks scroll rather than wrap
To publish a signed snapshot version of the package that is cross-compiled, we use the following commands from within an SBT shell:
set nativeCrossCompilationEnabled in jni := true
publishSigned
You can also test cross-compilation using the following command:
sbt jni/cross:nativeCrossCompile
Compile the TensorFlow dynamic libraries from source using:
bazel build --config=opt --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 //tensorflow:libtensorflow.so
On Ubuntu 18.04 you may get some linking errors, in which case you should use:
bazel build --config=opt --cxxopt=-D_GLIBCXX_USE_CXX11_ABI=0 --noincompatible_do_not_split_linking_cmdline //tensorflow:libtensorflow.so
To publish the documentation website we use the following commands:
sbt docs/previewSite # To preview the website
sbt docs/ghpagesPushSite # To publish the website
To prepare the precompiled TensorFlow binary packages, use the following commands:
mkdir lib
cp -av /usr/local/lib/libtensorflow* lib/
tar zcvf libtensorflow-2.2.0-cpu-darwin-x86_64.tar.gz lib
tar ztvf libtensorflow-2.2.0-cpu-darwin-x86_64.tar.gz
-->
*Note that all licence references and agreements mentioned in the Tensorflow_scala README section above
are relevant to that project's source code only.