Google Cloud Platform Setup Instructions

GCP ---> Google Cloud Platform
1. Gmail account

2. In GCP, create a big data cluster ---> 3 instances: 1 master and 2 workers
   Machine size: 2 CPU cores and 7.5 GB memory
   Each instance is provided with 2 IPs, internal and external.
   Remote connections to the instances can be made through the external IP only.
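
The cluster layout above can also be written down as a command. The following is a minimal sketch, assuming the cluster is created with Dataproc through the gcloud CLI. The cluster name and region are hypothetical placeholders, and n1-standard-2 is assumed as the machine type because it matches 2 CPU cores and 7.5 GB. The function only builds the command; it does not run it.

```python
def dataproc_create_command(cluster_name, region, num_workers=2,
                            machine_type="n1-standard-2"):
    """Build the argv list for `gcloud dataproc clusters create`.

    One master is created by default; num_workers sets the worker count.
    """
    return [
        "gcloud", "dataproc", "clusters", "create", cluster_name,
        "--region", region,
        "--num-workers", str(num_workers),
        "--master-machine-type", machine_type,
        "--worker-machine-type", machine_type,
    ]


# "demo-cluster" and "us-central1" are placeholders, not values from these notes.
cmd = dataproc_create_command("demo-cluster", "us-central1")
print(" ".join(cmd))
```

To actually create the cluster, the list can be passed to subprocess.run(cmd, check=True) on a machine where the Google Cloud SDK is installed and authenticated.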

3. Connecting to the instance(s)
   SSH protocol ---> DSA/RSA algorithm: 2 keys, a public key and a private key.
4. Tool to connect to the remote instance
   PuTTY ---> SSH; needs the hostname/IP and the private key

5. Creation of SSH keys
   Use a tool called PuTTYgen ---> generates a public key and a private key.
   Add the generated public key to the instance (username:rsa-key-20181222@ key).
   Connect with the private key (.ppk).
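
When the public key is added through the Compute Engine SSH keys metadata, each entry is expected in the form USERNAME:KEY_TYPE KEY_VALUE USERNAME. The helper below is a small sketch of that formatting, assuming the key was copied from PuTTYgen in OpenSSH format. The function name, the username and the key text are all hypothetical.

```python
def gce_ssh_key_entry(username, openssh_public_key):
    """Format an OpenSSH public key as a Compute Engine ssh-keys entry.

    The trailing comment of the key (PuTTYgen writes something like
    rsa-key-<date>) is replaced by the login username.
    """
    parts = openssh_public_key.split()
    if len(parts) < 2:
        raise ValueError("expected '<key-type> <key-value> [comment]'")
    key_type, key_value = parts[0], parts[1]
    return "{0}:{1} {2} {0}".format(username, key_type, key_value)


# Placeholder key material, not a real key.
entry = gce_ssh_key_entry("alice", "ssh-rsa AAAAB3placeholder rsa-key-20181222")
print(entry)
```

The login name used in PuTTY then has to match the username part of the entry.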

6. List of services
   master ---> HDFS (NameNode, Secondary NameNode) ---> 2.9.0
          ---> YARN (ResourceManager, History Server)
          ---> Hive (RunJar)
          ---> Spark 2.3.2, Scala 2.11.8
          ---> Python 2.7.13

   HDFS HTTP port no: 9870
   YARN port no: 8088
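
A quick way to confirm that the two web ports answer is a plain TCP connection test. This is a sketch using only the Python standard library. The function names are hypothetical and the host passed in is whatever address the ports are reachable on from the machine running the check.

```python
import socket

# Port numbers taken from the service list above.
WEB_UI_PORTS = {"HDFS NameNode": 9870, "YARN ResourceManager": 8088}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


def check_web_uis(host):
    """Return {service name: reachable?} for the known web UI ports."""
    return {name: port_is_open(host, port)
            for name, port in WEB_UI_PORTS.items()}
```

For example, check_web_uis("localhost") run on the master node should report True for both entries once HDFS and YARN are up.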

7. Accessing the cluster's web UI (HDFS, YARN, Spark)
   1. Download and install the "Google Cloud SDK".
      The SDK is a command-line tool to interact with GCP.
   2. Create an SSH tunnel to connect to the web user interface.
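
One common way to open such a tunnel with the Cloud SDK is a SOCKS proxy through gcloud compute ssh. The sketch below only builds the command. The instance name, zone and local port are hypothetical placeholders, and the function name is made up for illustration.

```python
def socks_tunnel_command(master_instance, zone, local_port=1080):
    """Build a `gcloud compute ssh` command that opens a SOCKS proxy.

    Everything after "--" is passed to ssh itself:
    -D opens a dynamic (SOCKS) forward on local_port,
    -N keeps the session open without running a remote command.
    """
    return [
        "gcloud", "compute", "ssh", master_instance,
        "--zone", zone,
        "--", "-D", str(local_port), "-N",
    ]


# "cluster-m" and "us-central1-a" are placeholders.
tunnel_cmd = socks_tunnel_command("cluster-m", "us-central1-a")
print(" ".join(tunnel_cmd))
```

With the tunnel running, a browser configured to use localhost:1080 as its SOCKS proxy can open the web UIs on the master by host name and port.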
Cloud vendor ---> Storage service ---> Computing service

AWS ---> S3 ---> EMR (Elastic MapReduce)
GCP ---> GCS (Google Cloud Storage) ---> Dataproc
GCS ---> HDFS ---> process the data
Google Cloud SDK ---> to do everything with GCP

hdfs dfsadmin -printTopology
hostname -I

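8. Running a Spark application on a cluster ---> YARN, Standalone, Kubernetes, Mesos
from pyspark.sql import SparkSession
import sys

if __name__ == '__main__':
    # sys.argv[1] is the input path, sys.argv[2] is the output directory
    spark = SparkSession.builder.appName('wordcount').getOrCreate()
    r1 = spark.sparkContext.textFile(sys.argv[1])
    # split lines into words, pair each word with 1, then sum per word
    r2 = r1.flatMap(lambda x: x.split(' ')).map(lambda x: (x, 1)) \
           .reduceByKey(lambda x, y: x + y)
    r2.saveAsTextFile(sys.argv[2])
    spark.stop()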
To exchange data between the local machine and the remote nodes, use SCP (WinSCP).
spark-submit   hdfs://  /sample  /wordcountout
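
The word count logic can be checked locally before it is submitted to the cluster. The following is a sketch in plain Python that mirrors the flatMap, map and reduceByKey steps on an in-memory list. It does not use Spark, and the function name and sample lines are made up for illustration.

```python
from collections import defaultdict


def word_count(lines):
    """Count words the same way the flatMap/map/reduceByKey chain does."""
    # flatMap: split every line on single spaces and flatten the result
    words = [w for line in lines for w in line.split(' ')]
    # map: pair each word with the count 1
    pairs = [(w, 1) for w in words]
    # reduceByKey: add up the counts of identical words
    counts = defaultdict(int)
    for word, one in pairs:
        counts[word] += one
    return dict(counts)


result = word_count(["a b a", "b c"])
print(result)  # {'a': 2, 'b': 2, 'c': 1}
```

Because the split is on a single space, consecutive spaces in the input produce empty strings that are counted as a word, both here and in the Spark version.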

