Scalding v0.16.0 Release Notes

    • Add tests around hashcode collisions : #1299
    • Fix performance bug in TypedPipeDiff : #1300
    • make serialization modules build on travis : #1301
    • Improve TypedParquetTuple : #1303
    • Add UnitOrderedSerialization : #1304
    • Revert "Add UnitOrderedSerialization" : #1306
    • Change groupRandomly & groupAll to use OrderedSerialization : #1307
    • Make test of Kmeans very very unlikely to fail : #1310
    • make LongThrift sources TypedSink : #1313
    • Fix testing VersionedKeyValSource#toIterator for non-Array[Byte] types : #1314
    • Make SketchJoin ordered serialization aware : #1316
    • Added a sealed trait ordered serializer. When it works it's great. Not as reliable as we'd like. But hopefully restrictions on it will do the job : #1320
    • Add secondary sorting using ordered serialization : #1321
    • Bails out from the length calculation if we don't succeed often : #1322
    • increased number of box instances to 250 : #1323
    • Apply merge strategy for pom.properties files : #1325
    • Apply merge strategy for pom.xml files : #1327
    • Add an OrderedSerialization.viaTransform with no dependencies, and a BijectedOrderedSerialization in scalding core : #1329
    • Precompute int hashes : #1330
    • Hide the deprecated string error for getting ASCII bytes. : #1332
    • Change defaults for Scalding reducer estimator : #1333
    • Execution id code : #1334
    • Add line numbers at .group and .toPipe boundaries : #1335
    • Ordered Serialization macros for thrift : #1338
    • make some repl components extensible : #1342
    • Remove the bootstrap section : #1346
    • Fix the execution test : #1347
    • Implement flatMapValues method : #1348
    • Consistent style in homepage example : #1349
    • Serialization folding : #1351
    • Collapses scalding-db packages : #1353
    • Merge scalding-macros into scalding-core : #1355
    • Migrate typedtext : #1356
    • Runtime reducer estimator : #1358
    • Update Build.scala : #1361
    • Allow overriding of hadoop configuration options for a single source/sink : #1362
    • Missing an extends Serializable, causes issues if capture Config's anywhere : #1365
    • Fix TypedPipe.limit to be correct, if slightly slower : #1366
    • Fix scala.Function2 showing up in line numbers : #1367
    • Drop with MacroGenerated from Fields macros : #1370
    • Fix deprecation warnings in TypedDelimited : #1371
    • Ianoc/revert changes around making file systems : #1372
    • Revert typed tsv behavior : #1373
    • A serialization error we were seeing in repl usage : #1376
    • Add NullSink and test : #1378
    • Add monoid and semigroup for Execution : #1379
    • Upgrade parquet to 1.8.1 : #1380
    • Upgrade sbt launcher script (sbt-extras) : #1381
    • Just move whitespace, add comments, simplify a few methods : #1383
    • Don't publish maple when doing 2.11 so we only publish it once -- needed for cross publishing to Maven repos : #1386
    • Support nesting Options in TypeDescriptor : #1387
    • Enable Scalding-REPL for Scala 2.11 : #1388
    • Updates for some upstream fixes/changes : #1390
    • Remove use of hadoop version in estimators : #1391
    • Set hadoop version to dummy value : #1392
    • Handle no history case in RatioBasedEstimator : #1393
    • Inline parquet-scrooge : #1395
    • RatioBasedEstimator - fix threshold edge case, add tests : #1397
    • Fixes the scrooge generator tasks not to generate code in the compile target, we were publishing these : #1399
    • Ianoc/configure set converter : #1400
    • Change hash function in GroupRandomly : #1401
    • Improve logging in runtime reducer estimators : #1402
    • Add the type in ScroogeReadSupport : #1403
    • Adds a function to test if a sink exists at the version we created : #1404
    • add .groupWith method to TypedPipe : #1406
    • Add some return types : #1407
    • add counter verification logic : #1409
    • Runtime reducer estimator fixes : #1411
    • Make sure Execution.zip fails fast : #1412
    • When using WriteExecution and forceToDisk we can share the same flow def closer in construction : #1414
    • Cache the zipped up write executions : #1415
    • Fix DateOps "match may not be exhaustive" warning : #1416
    • Factor out repeated code into FutureCache : #1417
    • Fix lack of Externalizer in joins. : #1421
    • Adds much more line number information through the NoStackAndThen class : #1423
    • Requires a DateRange's "end" to be after its "start" : #1425
    • Scalding viz options : #1426
    • Fixes map-only jobs to accommodate both an lzo source and sink binary converter : #1431
    • Fix Readme travis link : #1432
    • Fixes docs wording : #1433
    • Don't squash the exception in history service when there's a failure : #1434
    • Log the exception in RatioBasedEstimator when there's a failure : #1435
    • make getBytesPerReducer support human readable values like 128m and 1g : #1436
    • Fixes minor KeyedList docs wording : #1437
    • Fix readPathsFor to use the tz argument : #1439
    • Scalding viz options : #1440
    • call Job.validate when running tests under JobTest : #1441
    • opt-in to calling Job.validate in JobTest : #1444
    • Fix bug with sketch joins and single keys : #1451
    • Fix incorrect usage of percent. : #1455
    • Add OrderedSerialization2 support in Matrix2. : #1457
    • Add InvalidSourceTap to catch all cases for no good path. : #1458
    • Cluster info and fs shell in repl : #1462
    • Update Scala version to 2.10.6 : #1463
    • Fix median estimation : #1464
    • Makes the config transient in the KryoHadoop instantiator : #1466
    • Moves the default to 2.11 : #1467
    • Adds Error Message to REPL when Current Directory Not Readable : #1468
    • SuccessFileSource: correctness for multi-dir globs : #1470
    • Limit task history fields consumed from hraven : #1472
    • Remove dependency on dfs-datastores : #1473
    • ScaldingILoop should enable one to pass in in/out : #1475
    • Switch Chat to Gitter : #1477
    • Add two functions that assist in testing a TypedPipe : #1478
    • Makes permission failures non-fatal when looking for .scalding_repl files : #1479
    • Update TypeDescriptor to explain that Option[String] is not supported : #1480
    • Remove a type parameter that doesn't seem to do anything : #1481
    • Utility for expanding libjars : #1483
    • Shouldn't skip hidden files, user can decide such things with their glob : #1485
    • Fix FileSystem.get issue : #1487
    • Remove dependency on parquet-cascading : #1488
    • Add withConfig api to allow running an execution with a transformed config : #1489
    • Call validateTaps in toIterator codepath : #1490
    • Update the build : #1491
    • Arg Descriptions/Help for Execution Apps : #1492
    • Fix issue #1429 : #1493
    • Cache counters for stat updates : #1495
    • Pulls the core ExecutionTests back into scalding-core : #1498
    • Add a liftToTry function to Execution : #1499
    • Small improvements to the Boxed.scala module : #1500
    • Cache boxed classes : #1501
    • Fix unnecessary use of .get in Globifier.scala : #1502
    • Replace unintentional use of Unit with () : #1503
    • Fix unnecessary uses of Option.get : #1506
    • Utility methods for running Executions in parallel : #1507
    • Typed Mapside Reduce : #1508
    • Use wartremover to guard against careless use of _.get : #1509
    • Add in an API around cache isolation : #1511
    • Add implicit Ordering[RichDate] : #1512
    • Fix MultipleTextLineFiles source in JobTest : #1513
    • Adds support for sealed abstract classes : #1518
    • Update FixedPathSource to strip out '' in paths ending with '/' for writes : #1520
    • support for more formats to work with RichDate : #1522
    • WIP: Add forceToDisk parameter to hashJoin in TypedPipe : #1529
    • Fixing comments on partitioned delimited source : #1530
    • Remove weakly typed Source : #1531
    • Maple fix for HBaseTap : #1532
    • Add an enrichment for TypedPipe.inMemoryToList and use it in TypedPipeDiff test : #1533
    • Because, because... fun, the scala compiler has special naming rules it appears when there are leading underscores : #1534
    • Fix README examples link : #1536
    • Fixes Config to accommodate spaces in argument values : #1537
    • Add before() and after() to RichDate : #1538
    • Adds late tap validation for cases where race conditions cause it to fail : #1540
    • Fix Rounding Bug in RatioBasedEstimator : #1542