BigDL v0.5.0 Release Notes
Release Date: 2018-03-30
Highlights
- Bring in a Keras-like API (Scala and Python). Users can easily run their Keras code (training and inference) on Apache Spark through BigDL. For more details, see this link.
- Support loading TensorFlow dynamic models (e.g. LSTM, RNN) in BigDL and support more TensorFlow operations; see this page.
- Support combining data preprocessing and neural network layers in the same model (to make model deployment easy)
- Speed up various modules in BigDL (BCECriterion, RMSprop, LeakyReLU, etc.)
- Add DataFrame-based image reader and transformer
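The third highlight above, bundling preprocessing and trainable layers into one model, can be pictured with a plain-Python sketch. The names `Pipeline`, `scale`, and `model` below are illustrative only (not BigDL API); the point is that the deployed artifact consumes raw input directly, so no separate preprocessing step must be shipped alongside the model:

```python
# Conceptual sketch: bundling preprocessing with a trainable model so
# the deployed object maps raw input -> prediction in a single call.
# (Plain Python for illustration; not the BigDL API.)

def scale(raw):
    # preprocessing step: normalize raw pixel values to [0, 1]
    return [v / 255.0 for v in raw]

def model(features):
    # stand-in for a trained network: a fixed linear scoring function
    weights = [0.2, -0.1, 0.4]
    return sum(w * x for w, x in zip(weights, features))

class Pipeline:
    """Chains preprocessing and the model into one deployable unit."""
    def __init__(self, *stages):
        self.stages = stages

    def __call__(self, x):
        for stage in self.stages:
            x = stage(x)
        return x

# The deployed object accepts raw, unpreprocessed input.
deployed = Pipeline(scale, model)
prediction = deployed([255, 0, 127])
```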
New Features
- Tensor can be converted to OpenCVMat
- Bring in a new Keras-like API for Scala and Python
- Support loading TensorFlow dynamic models (e.g. LSTM, RNN)
- Support loading more TensorFlow operations (InvertPermutation, ConcatOffset, Exit, NextIteration, Enter, RefEnter, LoopCond, ControlTrigger, TensorArrayV3, TensorArrayGradV3, TensorArrayGatherV3, TensorArrayScatterV3, TensorArrayConcatV3, TensorArraySplitV3, TensorArrayReadV3, TensorArrayWriteV3, TensorArraySizeV3, StackPopV2, StackPop, StackPushV2, StackPush, StackV2, Stack)
- ResizeBilinear supports NCHW
- ImageFrame supports loading Hadoop sequence files
- ImageFrame supports gray images
- Add Kv2Tensor operation (Scala)
- Add PGCriterion to compute the negative policy gradient given an action distribution, sampled action and reward
- Support gradually increasing the learning rate in LearningrateScheduler
- Add FixExpand and add more options to AspectScale for image preprocessing
- Add RowTransformer (Scala)
- Support adding preprocessors to Graph, which allows users to combine preprocessing and a trainable model into one model
- The ResNet on CIFAR-10 example supports loading images from HDFS
- Add CategoricalColHashBucket operation (Scala)
- Predictor supports Table as output
- Add BucketizedCol operation (Scala)
- Support using DenseTensor and SparseTensor together to create a Sample
- Add CrossProduct layer (Scala)
- Provide an option to allow users to bypass exceptions in a transformer
- DenseToSparse layer supports disabling backward propagation
- Add CategoricalColVocaList operation (Scala)
- Support ImageFrame in the Python optimizer
- Support getting the executor number and executor cores in Python
- Add IndicatorCol operation (Scala)
- Add TensorOp, an operation with Tensor[T]-formatted input and output that provides shortcuts for building Operations for tensor transformation from closures (Scala)
- Provide a Dockerfile to make it easy to set up a testing environment for BigDL
- Add CrossCol operation (Scala)
- Add MkString operation (Scala)
- Add a prediction service interface that supports concurrent calls and accepts bytes input
- Add SparseTensor.cast & SparseTensor.applyFun
- Add DataFrame-based image reader and transformer
- Support loading TensorFlow model files saved by the tf.saved_model API
- SparseMiniBatch supports multiple TensorDataTypes
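One of the features above, gradually increasing the learning rate, is a warmup schedule: the LR ramps linearly from a starting value to the target over the first iterations. A minimal plain-Python sketch of the idea (illustrative only; BigDL's actual LearningrateScheduler API differs):

```python
# Sketch of a gradual learning-rate warmup schedule (plain Python;
# illustrates the idea behind the LearningrateScheduler feature,
# not BigDL's actual API).

def warmup_lr(step, base_lr, target_lr, warmup_steps):
    """Linearly increase the LR from base_lr to target_lr over
    warmup_steps iterations, then hold it at target_lr."""
    if step >= warmup_steps:
        return target_lr
    delta = (target_lr - base_lr) / warmup_steps
    return base_lr + delta * step

# LR ramps from 0.01 toward 0.1 over the first 10 steps, then holds
schedule = [warmup_lr(s, 0.01, 0.1, 10) for s in range(12)]
```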
Enhancements
- ImageFrame supports serialization
- A default implementation of zeroGradParameter is added to AbstractModule
- Improve the style of the documentation website
- Models in different threads share weights in model training
- Speed up LeakyReLU
- Speed up RMSprop
- Speed up BCECriterion
- Support calling Java functions in the Python executor and ModelBroadcast in Python
- Add detailed instructions to run-on-ec2
- Optimize the padding mechanism
- Fix Maven compile warnings
- Check for duplicate layers in a container
- Refine the document that introduces how to automatically deploy BigDL on a Dataproc cluster
- Refactor adding extra jars/Python packages for Python users. Now only the env variables BIGDL_JARS & BIGDL_PACKAGES need to be set
- Implement appendColumn and avoid the error caused by API mismatches between different Spark versions
- Add a Python Inception training example on ImageNet
- Downgrade "can't find locality partition for partition ..." to a warning message
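Among the modules sped up above, LeakyReLU is simple enough to state precisely: it is the identity for positive inputs and multiplies negative inputs by a small slope. A plain-Python reference definition (the 0.01 slope here is illustrative, not necessarily BigDL's default):

```python
# Reference definition of LeakyReLU, one of the modules sped up in
# this release: identity for positive inputs, a small negative slope
# otherwise. (Plain Python for illustration; the 0.01 slope is an
# assumed example value.)

def leaky_relu(xs, negval=0.01):
    return [x if x > 0 else negval * x for x in xs]

out = leaky_relu([-2.0, 0.0, 3.0])
```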
API Changes
- Move the DataFrame-based API to the dlframe package
- Refine the Container hierarchy. The add method (used in Sequential, Concat, etc.) is moved to a subclass, DynamicContainer
- Refine the serialization code hierarchy
- Dynamic Graph becomes an internal class, used only to run TensorFlow models
- Operation is not allowed to be used outside of Graph
- The getParameter method is made final and private[bigdl]; it should only be used in model training
- Remove the updateParameter method, which was only used in internal tests
- Some TensorFlow-related operations are marked as internal and should only be used when running TensorFlow models
Bug Fixes
- Fix a sparse sample batch bug: it should add another dimension instead of concatenating the original tensor
- Fix some activations and layers not working in TimeDistributed and RnnCell
- Fix a bug in the SparseTensor resize method
- Fix a bug when converting SparseTensor to DenseTensor
- Fix a bug in SpatialFullConvolution
- Fix a bug in the Cosine equal method
- Fix optimization state getting mixed up when optimizer.optimize() is called multiple times
- Fix a bug in Recurrent forward after invoking reset
- Fix a bug in in-place LeakyReLU
- Fix a bug when saving/loading bi-directional RNN layers
- Fix getParameters() in a submodule creating new storage when parameters have been shared by the parent module
- Fix some incompatible syntax between Python 2.7 and 3.6
- Fix saving/loading a graph losing stop-gradient information
- Fix a bug in SReLU
- Fix a bug in DLModel
- Fix a sparse tensor dot product bug
- Fix a Maxout serialization issue
- Fix some serialization issues in customized Faster R-CNN models
- Fix and refine some example document instructions
- Fix a bug in the export_tf_checkpoint.py script
- Fix a bug in setting up the Python package
- Fix pickler initialization issues
- Fix a race condition in Spark 1.6 when broadcasting a model
- Fix Model.load in Python returning the wrong type
- Fix a bug when using pyspark-with-bigdl.sh to run jobs on YARN
- Fix empty tensor size and stride calls not throwing an exception