LynxKite v1.7.0 Release Notes

    • Major changes to importing and exporting data. We introduce the concept of tables to improve clarity and performance when working with external data.

    Projects no longer depend on the input files, which can be deleted after importing. Sharing raw data between projects becomes easier. There are fewer operations (for example, a single Import vertices from table replaces Import vertices from CSV files and Import vertices from database), but more formats are supported through a unified interface: JSON, Parquet and ORC have been added. Direct import from Hive is also supported.

    Tables are built on Apache Spark DataFrames. As a result, you can run SQL queries on graphs. (See the SQL section at the bottom of a project.) DataFrame-based data manipulation is also now possible from Groovy scripts.

    Export operations are gone. Data can be exported in various formats through the SQL interface. SQL results can also be saved as tables and re-imported as parts of a project.

    For more details about the new features, see the documentation.

    • The default home directory has been moved under the 'Users' folder.
    • On brand-new Kite installations, the root folder is by default readable by everyone and writable only by admin users.
    • Edges and segmentation links can now also be accessed as DataFrames from batch scripts.
    • New Derive scalar operation.
    • Visualizations can now use a lighter Google Maps background thanks to adjustable map filters.
    • Upgrade to Hadoop 2 in our default Amazon EC2 setup.
    • Remove support for Hadoop 1.
    • Introduce tools/emr.sh which starts up an Amazon Elastic MapReduce cluster. This is now the recommended way to run Kite clusters on Amazon.
    • Introduce operation Copy edges to base project.
    • emr.sh can now invoke Groovy scripts on a remote cluster.
    • Introduce explicit machine learning models. Create them with the Train linear regression model operation and use them for predictions with Predict from model.
    • Added a new centrality measure, the average distance.
    • The Convert vertices into edges operation has been removed. The same functionality is now available via tables. You can simply import the vertices table of one project as edges in another project.
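
To make the new average distance centrality measure more concrete, here is a minimal Python sketch of the general idea: for each vertex, take the mean shortest-path length (in hops) to every other vertex reachable from it. This is an illustration only and not Kite's implementation. The function name average_distance, the adjacency-dictionary input, and the handling of unreachable or isolated vertices are all assumptions made for this example; the operation in Kite may define these details differently.

```python
from collections import deque


def average_distance(adjacency, source):
    """Mean hop count from `source` to every other vertex reachable from it.

    `adjacency` maps each vertex to an iterable of its neighbors (a
    hypothetical input format chosen for this sketch). Returns None when
    no other vertex is reachable.
    """
    distances = {source: 0}
    queue = deque([source])
    # Breadth-first search yields shortest hop counts in an unweighted graph.
    while queue:
        vertex = queue.popleft()
        for neighbor in adjacency.get(vertex, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[vertex] + 1
                queue.append(neighbor)
    others = [d for v, d in distances.items() if v != source]
    return sum(others) / len(others) if others else None


# A small path graph: a - b - c - d
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
scores = {vertex: average_distance(graph, vertex) for vertex in graph}
```

In this sketch a lower score means a more central vertex: the two inner vertices of the path score lower than its two endpoints.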