Getting started

This section shows how to connect to BigQuery from the R programming language and create visuals from that data:

  1. Open RStudio and run the following script, either in the console or from the source pane with the Run button, to install bigrquery and ggplot2:
install.packages("bigrquery")
install.packages("ggplot2")
  1. Run the next script in the console to query a Google BigQuery table:
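library(bigrquery)

# ID of the Google Cloud project that the query is billed to
project <- "Enter your project ID Here"

# Standard SQL: count visits per traffic medium
query <- "SELECT trafficSource.medium AS Medium,
COUNT(visitId) AS Visits
FROM `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910`
GROUP BY Medium"

# Run the query and return the rows as a data frame
result <- query_exec(query, project, use_legacy_sql = FALSE)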
  1. The bigrquery package authenticates against the BigQuery API with your Google account. The first time the script runs, R asks for confirmation to store a local file with the authorization information. Enter 1 for yes.
  1. A web browser window should then open with a response from Google that includes a code. Google uses this code to authenticate the session. Copy the code and paste it into the RStudio console.
  1. R then runs the script against the BigQuery API. BigQuery runs the query and returns the results as an R data frame object (named result in this case).
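  1. Run the next script in the R console to plot the results as a bar chart:
library(ggplot2)

# Bar chart of visits per medium; stat = "identity" uses the Visits values as bar heights
p <- ggplot(data = result, aes(x = Medium, y = Visits)) +
  geom_bar(stat = "identity")
p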
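
The query step above calls query_exec(). More recent bigrquery releases document a bq_-prefixed interface and mark query_exec() as deprecated, so depending on the installed version the call may warn or be unavailable. The following is a minimal sketch of the same query using bq_project_query() and bq_table_download(), assuming a recent bigrquery version. The helper name fetch_visits_by_medium is hypothetical and is not part of the package.

```r
# Sketch only: runs the same query through bigrquery's bq_* interface.
# `fetch_visits_by_medium` is a hypothetical helper name, not part of bigrquery.
fetch_visits_by_medium <- function(project) {
  sql <- "SELECT trafficSource.medium AS Medium,
                 COUNT(visitId) AS Visits
          FROM `google.com:analytics-bigquery.LondonCycleHelmet.ga_sessions_20130910`
          GROUP BY Medium"

  # Submit the query job, billed to `project`; this interface uses standard SQL by default.
  job_table <- bigrquery::bq_project_query(project, sql)

  # Download the rows of the result table into a data frame.
  bigrquery::bq_table_download(job_table)
}

# Usage (needs network access and Google authentication):
# result <- fetch_visits_by_medium("your-project-id")
```

The function is only defined here, not called, so nothing contacts BigQuery until you run the commented usage line with your own project ID.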

The main selling point of using a programming language like R for these types of visualizations is the flexibility that programming provides. Unlike Tableau and Google Data Studio, which provide simple but strict frameworks for creating visualizations, R allows the user to be creative with their visualizations.
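
As a small illustration of that flexibility, the sketch below reorders the bars, changes the fill color, and adds labels and a theme. It assumes a hypothetical data frame named visits with invented numbers standing in for result, so it can be run without a BigQuery connection.

```r
library(ggplot2)

# Hypothetical stand-in for the `result` data frame returned by BigQuery;
# the numbers are invented for illustration only.
visits <- data.frame(
  Medium = c("organic", "referral", "cpc", "(none)"),
  Visits = c(120, 45, 30, 80)
)

# Order the bars from most to fewest visits instead of alphabetically.
visits$Medium <- reorder(visits$Medium, -visits$Visits)

p <- ggplot(visits, aes(x = Medium, y = Visits)) +
  geom_col(fill = "steelblue") +  # geom_col() draws bars whose heights are the Visits values
  labs(title = "Visits by traffic medium", x = "Medium", y = "Visits") +
  theme_minimal()
p
```

Each customization is one more layer added with +, so individual lines can be removed or swapped without touching the rest of the plot.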

Let's try pushing R a bit further. The following script adds error bars to the bar chart we've just created, based on how each actual value compares with the standard deviation of the visit counts.

  1. Run the next script in the R console to compute the comparison with the standard deviation and add it to the visualization:
# ratio of each medium's visits to the standard deviation of all visit counts
result$sd <- result$Visits / sd(result$Visits)

# plot the result data as bars, with error bars of +/- that ratio
p <- ggplot(data = result, aes(x = Medium, y = Visits)) +
  geom_bar(stat = "identity") +
  geom_errorbar(aes(ymin = Visits - sd, ymax = Visits + sd), width = 0.2,
                position = position_dodge(.9))
p

Now we have a highly specialized visualization that shows how each Medium's visit count compares with the standard deviation of the visit counts.
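
To make the calculation concrete, here is a minimal sketch in base R using a hypothetical visits data frame with invented numbers. It computes the ratio used in the script above (value divided by the standard deviation) and, for comparison, a z-score, which also subtracts the mean. The z-score is a different, commonly used measure that the script above does not compute.

```r
# Hypothetical visit counts, invented for illustration; not taken from BigQuery.
visits <- data.frame(
  Medium = c("organic", "referral", "cpc", "(none)"),
  Visits = c(120, 45, 30, 80)
)

visits_mean <- mean(visits$Visits)
visits_sd <- sd(visits$Visits)  # sample standard deviation (n - 1 denominator)

# Ratio used in the script above: each value divided by the standard deviation.
visits$ratio <- visits$Visits / visits_sd

# Alternative measure: z-score, the distance from the mean in standard deviations.
visits$z <- (visits$Visits - visits_mean) / visits_sd

print(visits)
```

The ratio is always positive for positive counts, while the z-score is negative for mediums below the average and positive for those above it.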
