How to select the best coder for your data with Apache Beam by Romain Manni-Bucau, 2018-09-19
Apache Beam’s coder abstraction lets you switch between implementations without rewriting your pipeline. But how do you select a coder? Performance and disk space are likely the most important criteria; let’s see how to measure them.
Apache Beam: convert Row structure to an Avro IndexedRecord by Romain Manni-Bucau, 2018-09-12
We previously saw that the Beam Row structure lets you write generic transforms, but that relying on its serialization can be a bad bet. To illustrate how to switch from one format to another, this post shows how to convert a Row to an IndexedRecord.
Apache Beam and Row: a new Big Data record/serialization standard? by Romain Manni-Bucau, 2018-09-05
Handling data you don’t know at compile time is a common concern of processing libraries. Apache Beam can’t ignore it, since it lets you build portable pipelines for Big Data engines. Let’s see how the project started to address that concern!