Google BigQuery is a service for querying massive datasets. It is built on Google's Dremel technology and needs no hardware configuration or administrative setup.
This tutorial shows how to get started with BigQuery in Java.
I. Technology for BigQuery tutorial
– Google Cloud
– Java 1.8
– Maven: 3.3.9
– Editor: Spring Tool Suite – Version 3.7.3.RELEASE
II. Overview
A sample that authenticates with BigQuery and reads from a public Wikipedia dataset.
1. Project Structure
2. Steps
– Create Google Cloud Account
– Create Application Default Credentials
– Create Maven Project
– Add needed dependencies
– Create Java Authentication Client
– Create Java Query Execution
– Display Results
– Create Entry Point: Main function
– Setup GOOGLE_APPLICATION_CREDENTIALS & Run
III. Practice
1. Create Google Cloud Account
– Go to Google Cloud and register an account
2. Create Application Default Credentials
Log in to your Google Cloud account and go to the API Console Credentials page.
Select a project or create a new one.
Note: To create a project, go to the console page IAM-ADMIN/PROJECTS.
This tutorial selects an existing project: TestProject, with ID neon-deployment-141106.
Press Open.
Next, open the Create Credentials drop-down and choose Service account key.
Then choose New Service Account, fill in the details, and set the key type to JSON.
Press Create; a JSON key file is saved to your local machine.
Keep the JSON file somewhere safe for later use.
Format of the JSON file:
{
  "type": "service_account",
  "project_id": "neon-deployment-141106",
  "private_key_id": "xxx-info",
  "private_key": "-----BEGIN PRIVATE KEY-----Secret Info\n-----END PRIVATE KEY-----\n",
  "client_email": "demobigquery@neon-deployment-141106.iam.gserviceaccount.com",
  "client_id": "xxxx",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://accounts.google.com/o/oauth2/token",
  "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
  "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/demobigquery%40neon-deployment-141106.iam.gserviceaccount.com"
}
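The key file carries the project ID and the service account email, which are handy to print when checking that the right file was downloaded. The sketch below pulls string fields out of the key-file text using only the JDK. The class name KeyFileFields and the method stringField are made up for this illustration, and the regex is a shortcut, not a JSON parser: in a real project a parser such as Gson would be the safer choice.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class KeyFileFields {

    // Finds "name": "value" for a string field in the given JSON text.
    // Illustration only: this does not handle escaped quotes inside the value.
    public static Optional<String> stringField(String json, String name) {
        Pattern pattern =
                Pattern.compile("\"" + Pattern.quote(name) + "\"\\s*:\\s*\"([^\"]*)\"");
        Matcher matcher = pattern.matcher(json);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // A shortened stand-in for the contents of the downloaded key file.
        String json = "{ \"type\": \"service_account\", "
                + "\"project_id\": \"neon-deployment-141106\", "
                + "\"client_email\": "
                + "\"demobigquery@neon-deployment-141106.iam.gserviceaccount.com\" }";

        System.out.println(stringField(json, "project_id").orElse("(missing)"));
        System.out.println(stringField(json, "client_email").orElse("(missing)"));
    }
}
```

Never print or log the private_key field; it is the secret part of the file.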
3. Create Maven Project
– Open Spring Tool Suite. On the main menu, choose File -> New -> Maven Project, then select Create a simple project.
Press
Then press Finish; the Maven project is created.
4. Add needed dependencies
Open pom.xml and add the needed dependencies:
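<dependencies>
    <dependency>
        <groupId>com.google.apis</groupId>
        <artifactId>google-api-services-bigquery</artifactId>
        <version>v2-rev317-1.22.0</version>
    </dependency>
    <dependency>
        <groupId>com.google.oauth-client</groupId>
        <artifactId>google-oauth-client</artifactId>
        <version>1.21.0</version>
    </dependency>
    <dependency>
        <groupId>com.google.http-client</groupId>
        <artifactId>google-http-client-jackson2</artifactId>
        <version>1.21.0</version>
    </dependency>
    <dependency>
        <groupId>com.google.oauth-client</groupId>
        <artifactId>google-oauth-client-jetty</artifactId>
        <version>1.21.0</version>
    </dependency>
    <dependency>
        <groupId>com.google.code.gson</groupId>
        <artifactId>gson</artifactId>
        <version>2.7</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
        <scope>test</scope>
    </dependency>
    <dependency>
        <groupId>com.google.truth</groupId>
        <artifactId>truth</artifactId>
        <version>0.29</version>
        <scope>test</scope>
    </dependency>
</dependencies>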
5. Create Java Authentication Client
Use Application Default Credentials for authentication
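// Imports used by the authentication client
import java.io.IOException;
import com.google.api.client.googleapis.auth.oauth2.GoogleCredential;
import com.google.api.client.http.HttpTransport;
import com.google.api.client.http.javanet.NetHttpTransport;
import com.google.api.client.json.JsonFactory;
import com.google.api.client.json.jackson2.JacksonFactory;
import com.google.api.services.bigquery.Bigquery;
import com.google.api.services.bigquery.BigqueryScopes;

public static Bigquery createAuthorizedClient() throws IOException {
    // Create the credential from Application Default Credentials
    HttpTransport transport = new NetHttpTransport();
    JsonFactory jsonFactory = new JacksonFactory();
    GoogleCredential credential = GoogleCredential.getApplicationDefault(transport, jsonFactory);
    // Some credential types need their scopes set explicitly
    if (credential.createScopedRequired()) {
        credential = credential.createScoped(BigqueryScopes.all());
    }
    return new Bigquery.Builder(transport, jsonFactory, credential)
            .setApplicationName("Bigquery Samples")
            .build();
}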
6. Create Java Query Execution
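// Additional imports for this method
import java.util.List;
import com.google.api.services.bigquery.model.GetQueryResultsResponse;
import com.google.api.services.bigquery.model.QueryRequest;
import com.google.api.services.bigquery.model.QueryResponse;
import com.google.api.services.bigquery.model.TableRow;

private static List<TableRow> executeQuery(String querySql, Bigquery bigquery, String projectId)
        throws IOException {
    // Run the query as a job
    QueryResponse query =
            bigquery.jobs().query(projectId, new QueryRequest().setQuery(querySql)).execute();
    // Fetch the results of that job
    GetQueryResultsResponse queryResult = bigquery
            .jobs()
            .getQueryResults(
                    query.getJobReference().getProjectId(),
                    query.getJobReference().getJobId())
            .execute();
    return queryResult.getRows();
}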
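A long-running query may not be finished when the results are first requested, so the rows can come back empty. One way to handle that is to repeat the results call until the response says the job is done. The helper below is a minimal, library-free sketch of that polling loop. The class PollUntil, its method, and its parameters are hypothetical names for this illustration. To use it with BigQuery, the Supplier would wrap the getQueryResults call (converting its IOException to an unchecked exception) and the Predicate would check the response's jobComplete flag; that wiring is an assumption and is not shown.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

public class PollUntil {

    // Calls fetch up to maxAttempts times, sleeping between attempts, and
    // returns the first value accepted by done. Throws if no attempt is accepted.
    public static <T> T pollUntil(Supplier<T> fetch, Predicate<T> done,
                                  int maxAttempts, long sleepMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            T value = fetch.get();
            if (done.test(value)) {
                return value;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("Interrupted while polling", e);
                }
            }
        }
        throw new IllegalStateException("Not complete after " + maxAttempts + " attempts");
    }

    public static void main(String[] args) {
        // Stand-in for a job that reports completion on the third check.
        int[] calls = {0};
        Integer checks = pollUntil(() -> ++calls[0], n -> n >= 3, 5, 10L);
        System.out.println("Completed after " + checks + " checks");
    }
}
```

A fixed sleep keeps the sketch short; a growing delay between attempts is a common refinement.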
7. Display Results
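// Additional import for this method
import com.google.api.services.bigquery.model.TableCell;

private static void displayResults(List<TableRow> rows) {
    System.out.print("\nResults:\n------------\n");
    for (TableRow row : rows) {
        // Print each cell left-justified in a 50-character column
        for (TableCell field : row.getF()) {
            System.out.printf("%-50s", field.getV());
        }
        System.out.println();
    }
}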
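The "%-50s" format left-justifies each value and pads it with spaces to 50 characters, which is what lines the columns up. The sketch below shows the same padding on plain strings with a smaller width, so it can be tried without a BigQuery connection. The class RowFormat and the method formatRow are names invented for this example.

```java
import java.util.Arrays;
import java.util.List;

public class RowFormat {

    // Left-justifies each cell in a column of the given width, mirroring
    // the "%-50s" format used by displayResults.
    public static String formatRow(List<String> cells, int width) {
        StringBuilder line = new StringBuilder();
        for (String cell : cells) {
            line.append(String.format("%-" + width + "s", cell));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Brackets make the trailing padding visible in the console.
        System.out.println("[" + formatRow(Arrays.asList("John Glenn", "Taekwondo"), 15) + "]");
    }
}
```

Values longer than the width are not truncated; they simply push the following columns to the right.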
8. Create Entry Point: Main function
public static void main(String[] args) throws IOException {
    String projectId = "neon-deployment-141106";
    // Create a new Bigquery client authorized via Application Default Credentials.
    Bigquery bigquery = createAuthorizedClient();
    List<TableRow> rows = executeQuery(
            "SELECT title FROM [publicdata:samples.wikipedia] LIMIT 10",
            bigquery,
            projectId);
    displayResults(rows);
}
9. Setup GOOGLE_APPLICATION_CREDENTIALS & Run
Right-click the main class (BigQuery.java), choose Run Configuration …, and set the environment variable GOOGLE_APPLICATION_CREDENTIALS to the path of the JSON file.
Press Apply, then Run.
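If the run fails with a credentials error, the usual cause is that the variable is missing or points to the wrong path. A small check like the one below can be run with the same run configuration to confirm what the JVM actually sees. It is only a sketch: the class CredentialsCheck and the method describe are hypothetical names, and it checks that the file exists and is readable, not that its contents are a valid key.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class CredentialsCheck {

    // Returns a human-readable status for a candidate value of
    // GOOGLE_APPLICATION_CREDENTIALS.
    public static String describe(String value) {
        if (value == null || value.trim().isEmpty()) {
            return "NOT SET";
        }
        Path path = Paths.get(value);
        if (!Files.isRegularFile(path)) {
            return "NOT A FILE: " + path;
        }
        if (!Files.isReadable(path)) {
            return "NOT READABLE: " + path;
        }
        return "OK: " + path;
    }

    public static void main(String[] args) {
        String value = System.getenv("GOOGLE_APPLICATION_CREDENTIALS");
        System.out.println("GOOGLE_APPLICATION_CREDENTIALS -> " + describe(value));
    }
}
```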
Result
Results:
------------
Slither (2006 film)
War of the Worlds (2005 film)
John Glenn
SanDisk Cruzer
Luangwa River
Don Larsen
Kruševo
Rocky Romero
Taekwondo
Hip hop
IV. Source code
Last updated on June 4, 2017.
return new Bigquery.Builder(transport, jsonFactory, credential)
.setApplicationName("Bigquery Samples")
.build();
In the above lines of code there is a compilation issue with the Builder call. Please explain.
Look at this code segment:
HttpTransport transport = new NetHttpTransport();
JsonFactory jsonFactory = new JacksonFactory();
GoogleCredential credential = GoogleCredential.getApplicationDefault(transport, jsonFactory);
1. HttpTransport transport is the thread-safe HTTP transport used to send requests; NetHttpTransport is built on the JDK's java.net classes
2. JsonFactory jsonFactory is the JSON factory that parses and serializes JSON data
3. GoogleCredential credential is the OAuth 2.0 credential used for authentication