Category: Cloud

Common GCP command

GCP commands: App Engine: gcloud app create, mvn appengine:deploy. Choose project: gcloud config list, gcloud config list account, gcloud config list project, gcloud config set project gcp-test-xxxxx, gcloud projec

BigTable

Feature summary: It’s OLAP, so no transactions. More expensive than BigQuery. Low latency, high throughput: 100,000 QPS @ 6 ms latency for a 10-node cluster. You pay for the number of nodes in the cluster. Global availability. Al
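The throughput figure quoted above can be turned into a rough sizing estimate. The following is only a back-of-the-envelope sketch: it assumes throughput scales linearly with node count, and the function name `nodes_needed` is made up for illustration rather than taken from any GCP tool.

```python
import math

# Figures quoted in the summary: 100,000 QPS for a 10-node cluster.
REFERENCE_QPS = 100_000
REFERENCE_NODES = 10


def nodes_needed(target_qps: int) -> int:
    """Estimate a node count, ASSUMING throughput scales linearly with nodes."""
    if target_qps <= 0:
        raise ValueError("target_qps must be positive")
    qps_per_node = REFERENCE_QPS / REFERENCE_NODES  # 10,000 QPS per node
    # Round up: a fractional node still has to be provisioned as a whole node.
    return math.ceil(target_qps / qps_per_node)


if __name__ == "__main__":
    print(nodes_needed(25_000))  # 3
```

Real workloads depend on row size, access pattern, and storage type, so treat the result as a starting point only.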

BigQuery

Feature summary: Petabyte-scale data warehouse (DW) on GCP for interactive analysis (near-real-time analysis; it is not fully real time and cannot respond in milliseconds or microseconds). Prefer denormalised table structure
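To make the preference for a denormalised structure concrete, here is a small plain-Python sketch (not BigQuery code). The `customers` and `orders` data and the `denormalise` helper are hypothetical; the point is that customer fields are copied into every order row, so a later query needs no join.

```python
from typing import Dict, List

# Hypothetical normalised data: orders reference customers by id.
customers: Dict[int, dict] = {
    1: {"name": "Ada", "country": "UK"},
    2: {"name": "Linus", "country": "FI"},
}
orders: List[dict] = [
    {"order_id": 100, "customer_id": 1, "amount": 25.0},
    {"order_id": 101, "customer_id": 2, "amount": 40.0},
    {"order_id": 102, "customer_id": 1, "amount": 15.5},
]


def denormalise(orders: List[dict], customers: Dict[int, dict]) -> List[dict]:
    """Copy customer fields into each order row, producing one flat table."""
    flat = []
    for order in orders:
        customer = customers[order["customer_id"]]
        flat.append({
            **order,
            "customer_name": customer["name"],
            "customer_country": customer["country"],
        })
    return flat


if __name__ == "__main__":
    for row in denormalise(orders, customers):
        print(row)
```

The trade-off is repeated data (Ada's name appears twice) in exchange for simpler, join-free reads.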

Cloud Dataproc

Dataproc takes care of all the overhead and normally takes 90 secs to spin up. Cheap storage. Terabytes to petabytes of data is a good fit for running on the cloud. Use the CLI and GC console to operate. The console and GC com

Cloud Dataflow

Dataflow TextIO.write will shard your output file; you can avoid that with withoutSharding(). Dataflow can do everything you do in MapReduce. ParDo: parallel processing; should not contain any state. Process one item
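The "no state, one item at a time" rule for ParDo can be illustrated without the Dataflow SDK. This is a plain-Python sketch, not the Beam API: `par_do` and `to_upper` are hypothetical names, and a thread pool stands in for distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List


def to_upper(item: str) -> str:
    """Stateless per-item function: the output depends only on the input item."""
    return item.upper()


def par_do(fn: Callable[[str], str], items: Iterable[str], workers: int = 4) -> List[str]:
    """Apply fn to each item independently, using a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order; a real distributed
        # runner gives no such ordering guarantee.
        return list(pool.map(fn, items))


if __name__ == "__main__":
    print(par_do(to_upper, ["alpha", "beta", "gamma"]))
```

Because `to_upper` keeps no state between calls, the result is the same however the items are split across workers.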

Cloud Dataprep

Interactive graphical system for preparing structured or unstructured data for use in: analytics -> BigQuery; visualisation -> Data Studio; training machine learning models ->. Input integration: G

Cloud Pub/Sub

Feature summary: Messages persist for 7 days. Low latency. Capturing data and distributing data. Unified global serverless service, not attached to a specific project, domain or user. Smooth out traffic spike
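The traffic-smoothing idea can be shown with a toy in-memory queue. This is an illustrative sketch in plain Python, not the Pub/Sub client library: `drain`, `arrivals`, and `capacity_per_tick` are made-up names, and each "tick" is an arbitrary unit of time.

```python
from collections import deque
from typing import List


def drain(arrivals: List[int], capacity_per_tick: int) -> List[int]:
    """Return how many messages the consumer handles on each tick.

    arrivals[i] is the number of messages published on tick i; the queue
    holds any backlog until the consumer catches up.
    """
    if capacity_per_tick <= 0:
        raise ValueError("capacity_per_tick must be positive")
    backlog: deque = deque()
    processed: List[int] = []
    tick = 0
    while tick < len(arrivals) or backlog:
        if tick < len(arrivals):
            backlog.extend(range(arrivals[tick]))  # publisher burst is buffered
        batch = min(capacity_per_tick, len(backlog))
        for _ in range(batch):
            backlog.popleft()  # consumer takes the oldest messages first
        processed.append(batch)
        tick += 1
    return processed


if __name__ == "__main__":
    print(drain([10, 0, 0, 0], 3))  # [3, 3, 3, 1]
```

A burst of 10 messages in one tick reaches the consumer at a steady 3 per tick, and nothing is dropped.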

GCP Data Engineer preparation

Here is a summary of what I have learned so far about GCP data engineering. It includes all of the resources I found, such as: Coursera Google Data Engineer courses, Udemy courses, Google Cloud Pla