- Workflow and batch API scripts now use Groovy instead of JSON. This makes them easier to read and the batch API gets more flexible too.
- Users can now better configure the stopping condition for modular clustering.
- Improved format for the graph storage. Note that this breaks compatibility of the data directory with 1.5.0.
Compatibility is retained with all versions before 1.5. One way to fix a data directory created/touched by Kite
1.5.0 is to delete the directory
- Kite configuration setting YARN_NUM_EXECUTORS is replaced by the more general NUM_EXECUTORS, which applies to the standalone Spark cluster setup as well.
- Reorganized operation categories. We hope you find them more logical.
- The Batch processing API is now ready for use. It allows you to run a sequence of operations from the command line. For more details see the Batch processing API section in the manual.
- Richer progress indicator.
- LynxKite on EC2 will now always use ephemeral HDFS to store data. This data is lost if you destroy the cluster. Use the new s3copy command if you want to save the data to S3. (The data does not have to be restored from S3 to HDFS. It will be read directly from S3.) This also means significant performance improvements for EC2 clusters.
- User passwords can now be changed.
- New operation Metagraph is useful for debugging and perhaps also for demos.
- Added an experimental tool for cleaning up old data. It is accessible as
- The title and tagline on the splash page can be customized through the
- A large number of stability and performance improvements.
- tags.journal files will not grow large anymore.
- Kite configuration setting
- Faster loading of the project list page.
- Fixed missing CSS.
- History editing improvements. Operations with problems (such as importing a file that no longer exists) can be edited now. Long histories can now be edited without problem. The UI has been revised a bit. (The Validate button has been removed.)
- Switching to Spark 1.4.0.
- New operation Copy graph into a segmentation can import the project as its own segmentation and create edges between the original vertices and their corresponding vertices in the segmentation.
- Operations Degree, Aggregate on neighbors, and Weighted aggregate on neighbors can now also make calculations directly on unique neighboring vertices (that is, regardless of the number of edges between the vertices).
- For neighbor aggregations and degree the meaning of "symmetric" edges makes more sense now: the number of symmetric edges between A and B is now understood as max(number of edges A->B, number of edges B->A).
- EC2 Kite instances now listen on port 4044 (instead of 5080 before).
- Smoke test script added: when you install a new Kite instance, you can run kite_xxx/tools/daily_test.sh to see if everything is set up correctly.
- Fixed saving workflows. The save dialog didn't actually show in prod deployments.
- Fixed a bug where files that were no longer needed were not closed, which caused S3 lockups.
- Appending data to an existing DB table is not supported anymore (as it's dangerous if done accidentally). In other words, you can only export to a DB by creating a new table.
- Removed the SQL dump option in file export operations. The only supported output format is CSV now.
- Improved watchdog to detect a wider range of potential problems.
- Fixed bug: editing the history now causes project reload.
- Fixed a bug where vertices became frozen when attributes were visualized on them.
- Fixed a bug where cross edges between project and segmentation could be broken for certain operations.
- Improvements to the import graph workflow: somewhat fewer computation stages, and the algorithm is no longer sensitive to vertices with millions of edges.
- Two new attribute operations implemented: Fill edge attribute with constant default value and Merge two edge attributes. These do for edges what Fill with constant default value and Merge two attributes did for vertices. Note that the latter two operations have been renamed to Fill vertex attribute with constant default value and Merge two vertex attributes, respectively.
- Lin's Centrality algorithm is added.
- Fixed regression. (Segmentations opened on the wrong side.)
- You can now omit columns when importing something from a CSV file.
- Moderate performance improvements for graph visualization.
- The User's Guide contents are now integrated into LynxKite. Relevant help messages are provided through contextual popups, and the document is also available as a single page, ready for printing, through a new link at the bottom right.
- New edge operation "Merge parallel edges by attribute" makes it possible for the user to merge those parallel edges between two vertices that have the same value for the given edge attribute.
- Admins can download the last server log using the link http://<kite ip>:<kite port>/logs.
- When running an EC2 cluster, you cannot directly reference S3 files as before (using s3n://AWS_ACCESS_KEY_ID:[email protected]/path); see the changelog entry below about the data file prefix notation. Instead, for EC2 clusters we automatically set up the prefix S3 to point to s3n://AWS_ACCESS_KEY_ID:[email protected]. In practice this means that when using an EC2 cluster you need to refer to files on S3 as:
- Improved stability and graceful degradation. Not having enough memory now only results in degraded performance, not failures.
- "Create scale-free random edge bundle" operation added which allows one to create a scale free random graph.
- One can save a sequence of operations as a workflow. The feature is accessible from the project history editor/viewer and saved workflows show up as a new operation category.
- Strings visualized as icons will be matched to neutral icons (circle, square, triangle, hexagon, pentagon, star) if an icon with the given name does not exist.
- "PageRank" operation can now be used without weights.
- Data files and directories can now only be accessed via a special prefix notation. For example, what used to be hdfs://nameservice1:8020/user/kite/data/uploads/file is now simply UPLOAD$/file. This enables the administrator to hide s3n passwords from the users; furthermore, it will be easier to move the data to another location. A new kiterc option KITE_PREFIX_DEFINITIONS can be used to provide extra prefixes (other than UPLOAD$). See the files
- New aggregation method:
- LynxKite pages can now be printed. Visualizations are also supported. This provides a method of exporting the visualization in a scalable vector graphics format.
- Segmentation coverage is automatically calculated.
- New vertex operation "Centrality" makes it possible to compute approximate harmonic centrality values.
- Edges can now be colored based on string attributes as well.
- SQL operations should be more portable now and will also work if field names contain special characters.
- One can now use the newly added .kiterc option, KITE_EXTRA_JARS, to add JARs to the classpath of all LynxKite components. The most important use case is adding JDBC drivers.
- Multiple ways of 2D graph rendering.
- Improved compatibility for Mozilla Firefox.
- Show all visualized attributes as part of the legend in sampled view.
- New vertex operation "Add rank attribute" makes it possible for the user to sort by an attribute (of double type). This can be used to identify vertices which have the highest or lowest values with respect to some attributes.
- Fixed project history performance issue.
- More than one type of aggregation can be selected for each attribute in one operation.
- New "Combine segmentations" operation makes it possible to create a segmentation by cities of residence and age, for example.
- New "Segment by double attribute" and "Segment by string attribute" operations make it possible to create a segmentation by cities of residence or age, for example.
- New "Create edges from co-occurrence" operation creates edges in the parent project for each pair of vertices that co-occur in a segment. If they co-occur multiple times, multiple parallel edges are created.
- Visualizations can be saved into the project for easier sharing.
- Backward compatibility for undo in pre-1.2.0 projects.
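
Two of the entries above describe behaviour precisely enough to sketch in a few lines: the "symmetric" edge count used by the degree and neighbor aggregation operations, and the "Create edges from co-occurrence" operation. The Python below is only an illustration of those descriptions, not LynxKite's implementation. The function names and the edge-list and segment-dictionary representations are hypothetical. The co-occurrence sketch also assumes one edge per unordered pair of vertices per shared segment, which the changelog does not specify.

```python
from collections import Counter
from itertools import combinations


def symmetric_edge_count(edges, a, b):
    """Number of 'symmetric' edges between a and b.

    Follows the changelog wording: max(#edges a->b, #edges b->a).
    `edges` is a list of (source, destination) pairs (hypothetical format).
    """
    counts = Counter(edges)
    return max(counts[(a, b)], counts[(b, a)])


def cooccurrence_edges(segments):
    """Create one edge for each pair of vertices that share a segment.

    `segments` maps a segment id to its member vertices (hypothetical format).
    Pairs that co-occur in several segments yield several parallel edges.
    Assumption: one edge per unordered pair per segment.
    """
    edges = []
    for members in segments.values():
        for u, v in combinations(sorted(set(members)), 2):
            edges.append((u, v))
    return edges


example_edges = [("A", "B"), ("A", "B"), ("B", "A")]
print(symmetric_edge_count(example_edges, "A", "B"))  # prints 2

example_segments = {"s1": ["A", "B", "C"], "s2": ["A", "B"]}
# A and B share two segments, so ("A", "B") appears twice in the result.
print(cooccurrence_edges(example_segments))
```

In the second example A and B co-occur in both segments, so the result contains two parallel A-B edges, matching the "multiple parallel edges" behaviour described in the entry.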