Archive: 2018/6

Cloud Dataflow

Dataflow TextIO.write shards its output into multiple files by default; you can avoid this with withoutSharding(). Dataflow can do everything you do in MapReduce. ParDo: parallel processing; should not contain any state; process one item
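The ParDo idea in the excerpt above (a stateless function applied to each item independently, so the work can be spread across workers) can be sketched in plain Python. This is only an illustration of the concept, not the Dataflow or Beam API: the names par_do and to_upper are hypothetical, a thread pool stands in for distributed workers, and a real ParDo may emit zero or more outputs per element rather than exactly one.

```python
from concurrent.futures import ThreadPoolExecutor


def to_upper(element: str) -> str:
    """Stateless per-element function: the output depends only on the input."""
    return element.upper()


def par_do(fn, elements, workers=4):
    """Apply fn to every element independently, using a pool of workers.

    Because fn keeps no state between calls, elements can be processed
    in any order and on any worker without changing the result.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, whichever worker ran them.
        return list(pool.map(fn, elements))


if __name__ == "__main__":
    print(par_do(to_upper, ["a", "b", "c"]))
```

If to_upper kept a running counter or cache between calls, the result could depend on which worker handled which element, which is why the excerpt says a ParDo should not contain any state.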

Cloud Dataprep

Interactive graphical system for preparing structured or unstructured data for use in: analytics -> BigQuery; visualisation -> Data Studio; train machine learning models -> Input integration: G

Cloud Pub/Sub

Feature summary: messages persist for 7 days; low latency; capturing data and distributing data; unified global serverless service, not attached to a specific project, domain or user; smooths out traffic spikes

GCP Data Engineer preparation

Here is a summary of what I have learned so far on GCP data engineering. It includes all of the resources I found, such as: Coursera Google Data Engineer courses, Udemy courses, Google Cloud Pla