Jan 02, 2020 | Machine Learning
This article gives an overview of how to build a machine learning model in a serverless manner with GCP. It also briefly explains machine learning concepts and how to implement them using BigQuery Machine Learning or TensorFlow and Keras.
In this tutorial we aim to build an ML model using the NYC taxicab dataset. A GCP project is needed to build the model. If you don’t have one, you can sign up for free here.
Navigate to AI Platform in the side menu and select Notebooks. If you are familiar with Jupyter Notebook or have used Google Colab, these notebooks work on exactly the same concepts.
Click New Instance and select TensorFlow 2.x without GPU. Wait about a minute, then click Open JupyterLab to open the notebook environment.
You can start writing your own code or clone a project from GitHub. To clone a repository from GitHub, open the Terminal and type the following command:
git clone REPO_PATH
For this project, you can clone Google's training code from here.
This tutorial covers two ways to build the ML model: using BigQuery ML (BQML), or using TensorFlow and Keras.
In this project, BQML serves two purposes: first, to explore the dataset, create the ML datasets, and create a benchmark; second, to create the first ML models.
To work with the dataset in AI Platform, navigate to
training-data-analyst/quests/serverlessml/01_explore/solution
and open explore_data.ipynb.
The code for preparing the dataset is in explore_data.ipynb. Clear the output by clicking the Clear button on the toolbar. Change the region, project, and bucket settings in the first cell to match your project. Then click the Run button to step through the notebook.
BigQuery ML provides a fast way to build ML models on large structured and semi-structured datasets. To build our first models for taxi fare prediction, navigate to
training-data-analyst/quests/serverlessml/02_bqml/solution
and open first_model.ipynb.
Clear the output by clicking the Clear button on the toolbar. Change the region, project, and bucket settings in the first cell to match your project. Then click the Run button to step through the notebook.
Root Mean Square Error (RMSE) measures how far a model's predictions are from the actual values, in the same units as the target (here, dollars). The lower the RMSE, the better the model performs.
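As a minimal sketch of how RMSE is computed, the snippet below compares two sets of predictions against the same actual fares. The fare values and the names model_a and model_b are made up for illustration and do not come from the notebook.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one value is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical fares in dollars, made up for illustration only.
actual_fares = [10.0, 12.0, 8.0, 15.0]
model_a = [11.0, 11.0, 9.0, 14.0]  # every prediction is off by $1
model_b = [13.0, 9.0, 11.0, 12.0]  # every prediction is off by $3

rmse_a = rmse(actual_fares, model_a)
rmse_b = rmse(actual_fares, model_b)
print(rmse_a, rmse_b)
```

Here model_a has the lower RMSE, so it would be the better of the two on this toy data.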
First step: learn how to read large datasets using TensorFlow
First we need to build a data pipeline that feeds our Keras model, and then construct the model. To build the data pipeline, navigate to
training-data-analyst/quests/serverlessml/03_tfdata/solution
and open input_pipeline.ipynb.
Clear the output by clicking the Clear button on the toolbar. Change the region, project, and bucket settings in the first cell to match your project. Then click the Run button to step through the notebook.
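The idea behind an input pipeline is to stream records in batches instead of loading the whole dataset into memory. The notebook does this with TensorFlow; the sketch below is not the notebook's code and only illustrates the same idea with the Python standard library. The column names, their order, and the sample rows are assumptions made for illustration.

```python
import csv
import io
from itertools import islice

# Hypothetical column layout: the label first, then the 5 input features.
CSV_COLUMNS = [
    "fare_amount",
    "pickup_longitude",
    "pickup_latitude",
    "dropoff_longitude",
    "dropoff_latitude",
    "passenger_count",
]
LABEL_COLUMN = "fare_amount"


def read_batches(csv_file, batch_size):
    """Lazily yield (features, labels) batches from a headerless CSV file."""
    reader = csv.reader(csv_file)
    while True:
        # Pull at most batch_size rows; the rest of the file stays unread.
        rows = list(islice(reader, batch_size))
        if not rows:
            return
        features, labels = [], []
        for row in rows:
            record = dict(zip(CSV_COLUMNS, map(float, row)))
            labels.append(record.pop(LABEL_COLUMN))
            features.append(record)
        yield features, labels


# Made-up rows standing in for a large CSV file on disk.
sample = io.StringIO(
    "11.3,-73.98,40.75,-73.99,40.74,1\n"
    "7.7,-73.95,40.78,-73.97,40.76,2\n"
    "5.3,-74.00,40.73,-73.99,40.72,1\n"
)
batches = list(read_batches(sample, batch_size=2))
for features, labels in batches:
    print(len(features), labels)
```

Because read_batches is a generator, a training loop can consume one batch at a time, which is what keeps memory use flat on large files.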
The next step is to build the DNN model using Keras to predict the fare amount for NYC taxicab rides. Navigate to
training-data-analyst/quests/serverlessml/04_keras/solution
and open keras_dnn.ipynb.
Clear the output by clicking the Clear button on the toolbar. Change the region, project, and bucket settings in the first cell to match your project. Then click the Run button to step through the notebook.
This model is a neural network with four layers. The first is the input layer, with 5 feature nodes. The second layer has 32 hidden nodes and 192 parameters to train, where parameters are the weights and biases. The third layer has 8 hidden nodes and 264 parameters. It connects to the output layer, which has a single output node and 9 parameters to train.
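The parameter counts above can be checked by hand: a fully connected layer has one weight per input-to-unit connection plus one bias per unit. The sketch below assumes each layer is fully connected to the previous one; the helper name dense_param_count is hypothetical.

```python
def dense_param_count(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer widths from the text: 5 inputs -> 32 hidden -> 8 hidden -> 1 output.
layer_sizes = [5, 32, 8, 1]

param_counts = [
    dense_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
]
total_params = sum(param_counts)
print(param_counts, total_params)  # [192, 264, 9] 465
```

For example, the second layer has 5 × 32 = 160 weights and 32 biases, which gives the 192 parameters quoted above.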
The number of epochs (training iterations) is set to only 5; increasing it is recommended. The model can also be improved through feature engineering. After training, the model can be deployed using the gcloud ai-platform command, which takes 5–10 minutes.
Predictions can be made with the gcloud ai-platform predict command. Before predicting, write the input as a JSON file containing the 5 input features shown in the figure Visualisation of DNN Model in Keras. The output shows a predicted fare of $11.43.
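A minimal sketch of preparing such an input file is shown below. The feature names and values are assumptions chosen for illustration, as is the layout of one JSON object per line; use the feature names your model was trained with, and check the gcloud documentation for the format the predict command expects.

```python
import json
import os
import tempfile

# Hypothetical instance: 5 input features with made-up values.
instance = {
    "pickup_longitude": -73.982683,
    "pickup_latitude": 40.742104,
    "dropoff_longitude": -73.983766,
    "dropoff_latitude": 40.755174,
    "passenger_count": 2,
}


def write_instances(path, instances):
    """Write one JSON object per line (newline-delimited JSON)."""
    with open(path, "w") as f:
        for item in instances:
            f.write(json.dumps(item) + "\n")


# Written to a temporary directory so the example leaves no stray files.
input_path = os.path.join(tempfile.mkdtemp(), "input.json")
write_instances(input_path, [instance])

with open(input_path) as f:
    lines = f.read().splitlines()
print(lines[0])
```

The resulting file can then be passed to the prediction command as its instances file.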
#Serverless #AppliedMachineLearning #GCP #BigQueryML #DNN #TensorFlow #Keras
Copyright © 2020 Cloud Ace, Inc. All rights reserved