R data visualizations – Exploring ggplot2: Part 1

Data visualizations give business users a visual representation of analytics results in the form of charts, graphs, maps and other graphical formats. They illustrate difficult concepts, reveal relationships among data elements and help in spotting hidden trends and patterns within a data set.

R is a powerful open-source programming language and software environment for statistical computing and graphics. It compiles and runs on various UNIX platforms, Windows and macOS.

ggplot2, a plotting package based on the grammar of graphics, is a common starting point for data visualizations in R. You can use ggplot2 to build complex multi-layered graphs with ease.

I will be using the following R version in my examples:

R version 3.4.0 (2017-04-21) — “You Stupid Darkness”

Here are some examples to get you started –

If you do not have ggplot2 installed, you can install it with the following command:

install.packages("ggplot2")

To load the ggplot2 package, use the following syntax:

library(ggplot2) 
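
If you run the same script repeatedly, it can be convenient to install the package only when it is missing. Here is a minimal sketch of that idea, assuming a CRAN mirror is already configured for your R session:

```r
# Install ggplot2 only when it is not already available, then load it.
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
library(ggplot2)
```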

I will be using the famous mtcars (Motor Trend Car Road Tests) dataset, which is one of the preloaded datasets in R. To find more details about the mtcars dataset, please check the following link:

https://stat.ethz.ch/R-manual/R-devel/library/datasets/html/mtcars.html

You can print the mtcars dataset by typing its name:

mtcars
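
Printing the whole data frame can be a lot to read, so a few base R helpers may give a quicker overview. This is just a sketch of how I would inspect the data; the choice of the three summarized columns is mine, picked because they are the ones used in this post:

```r
# mtcars ships with base R (package 'datasets'), so nothing needs to be loaded.
head(mtcars)   # first six rows
str(mtcars)    # column names and types
dim(mtcars)    # number of rows and columns

# Ranges of the columns used in the plots in this post
summary(mtcars[, c("mpg", "disp", "cyl")])
```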

Let us first try to plot the columns miles per gallon (mpg) and displacement (disp) using the qplot (quick plot) function:

qplot(mtcars$mpg, mtcars$disp)

Here is another variant of the qplot function for the above plot:

qplot(mpg, disp, data = mtcars)
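
qplot() is a convenience wrapper, and as far as I know newer ggplot2 releases (3.4.0 onward) mark it as deprecated. The same scatter plot can be written with the full ggplot() grammar; the following is a sketch of the equivalent call, where the variable name p is just my choice:

```r
library(ggplot2)

# Same scatter plot with the full grammar:
# data + aesthetic mapping + a point layer.
p <- ggplot(mtcars, aes(x = mpg, y = disp)) +
  geom_point()
print(p)
```

The explicit form is a little longer, but each additional layer or label is added with another + term.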

You can see the difference between qplot() and the base plot() function in R by plotting the same two columns with plot(). The syntax to be used is:

plot(mtcars$mpg, mtcars$disp)

To understand the power of ggplot2, let us try to customize the visualization using qplot. Let us use the following syntax to add colors to a basic scatter plot:

qplot(mpg, disp, data = mtcars, color = cyl)

Here we are using the color argument of the qplot function, with the number of cylinders (cyl) column driving the colors. Because cyl is stored as a number, ggplot2 treats it as a continuous variable, so the colors used in the scatter plot are different shades of blue. If you need distinct colors for the different groups, convert the column to a factor with the following syntax:

qplot(mpg, disp, data = mtcars, colour=factor(cyl))
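
To see why factor() changes the colors, it may help to look at how R stores the column. The sketch below checks the type of cyl and builds both variants with ggplot(); the names p_gradient and p_discrete are only illustrative:

```r
library(ggplot2)

class(mtcars$cyl)            # numeric, so it is mapped to a continuous colour scale
levels(factor(mtcars$cyl))   # the distinct cylinder counts, now treated as groups

# Numeric mapping: one colour gradient
p_gradient <- ggplot(mtcars, aes(mpg, disp, colour = cyl)) +
  geom_point()

# Factor mapping: one distinct colour per group
p_discrete <- ggplot(mtcars, aes(mpg, disp, colour = factor(cyl))) +
  geom_point()

print(p_discrete)
```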

To vary the size of the points by the cyl column as well, you can use the following syntax:

qplot(mpg, disp, data = mtcars, colour=factor(cyl), size=cyl)

Also, to use a different shape for each value of the cyl column, the following syntax can be tried out:

qplot(mpg, disp, data = mtcars, colour=factor(cyl), size=cyl, shape=factor(cyl))

To provide a customized title for your chart, you can use the main argument of the qplot function: 

qplot(mpg, disp, data = mtcars, colour=factor(cyl), size=cyl, shape=factor(cyl), main="MTCars")

To customize the labels of X-axis and Y-axis, the arguments xlab and ylab are used for the qplot function:

qplot(mpg, disp, data = mtcars, colour=factor(cyl), size=cyl, shape=factor(cyl), main="MTCars", ylab="Displacement", xlab="Miles per Gallon")
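
For comparison, here is a sketch of how the same fully customized chart could be written with ggplot(), using labs() for the title and axis labels. I am assuming the same title and label text as in the qplot call; the variable name p is my own:

```r
library(ggplot2)

# Colour and shape use cyl as a factor (discrete groups);
# size uses the numeric cyl value.
p <- ggplot(mtcars, aes(x = mpg, y = disp,
                        colour = factor(cyl),
                        shape = factor(cyl),
                        size = cyl)) +
  geom_point() +
  labs(title = "MTCars", x = "Miles per Gallon", y = "Displacement")
print(p)
```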

That covers the basics of ggplot2. I am going to cover more features and different types of charts in the next blog post. Stay tuned!

IBM Cognos Audit

Cognos Audit is a lifesaver for Cognos administrators all over the world. It is very handy for tracking the different operations being performed in a Cognos environment, and IBM Cognos software provides a sample model and multiple reports to go with it. A separate database, called the Audit DB, has to be created for storing this information; setting it up is typical Cognos administration work.

Here is the complete documentation for Cognos Audit configuration:
http://www-01.ibm.com/support/docview.wss?uid=swg21996317

List of tables that get created once Cognos Audit is configured:

COGIPF_ACTION : Stores information about the operation performed on an object
COGIPF_USERLOGON : Stores user logon (including log off) information
COGIPF_NATIVEQUERY : Stores information about third-party queries to Cognos components
COGIPF_PARAMETER : Stores custom information logged by a component
COGIPF_RUNJOB : Stores information about job requests
COGIPF_RUNJOBSTEP : Stores information about job request steps
COGIPF_RUNREPORT : Stores information about report executions
COGIPF_VIEWREPORT : Stores information about report view requests
COGIPF_SYSPROPS : Stores version information about the schema for upgrade purposes
COGIPF_EDITQUERY : Stores information about query runs
COGIPF_AGENTBUILD : Stores information about agent mail delivery
COGIPF_AGENTRUN : Stores information about agent activity including tasks and delivery
COGIPF_THRESHOLD_VIOLATIONS : Stores information about threshold violations for system metrics

The 1 .. 2 .. 3 of Analytics

In today’s world, Analytics refers to a complete set of big data and analytics solutions which can solve the most difficult business questions, unearth otherwise unrecognizable patterns and proactively suggest improvements. To achieve this, the set of products needs to constantly engage with the organization’s data.

Primarily, we divide Analytics into three categories: Descriptive, Predictive and Prescriptive.

Descriptive Analytics tells you the “What and How” of your business. By showing the KPIs of an organization, it clearly depicts the current picture. This type of Analytics uses traditional Business Intelligence tools.

Predictive Analytics shows an organization what is likely to happen and the various possibilities for improving future business. This type of Analytics uses statistical modeling and forecasting tools.

Prescriptive Analytics answers the most important question: “What should you do?” This type of Analytics uses techniques like Simulation and Optimization.
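
To make the three categories concrete, here is a toy sketch in R on the built-in mtcars dataset. It is only an illustration under my own assumptions: the linear model, the grid of candidate designs and the "at least 100 horsepower" constraint are all hypothetical, and real Analytics products work at a very different scale.

```r
# Descriptive: what happened? Average miles per gallon per cylinder count.
avg_mpg <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(avg_mpg)

# Predictive: what is likely to happen? Fit a linear model of mpg on
# weight and horsepower, then predict for a car not in the data.
fit <- lm(mpg ~ wt + hp, data = mtcars)
predicted <- predict(fit, newdata = data.frame(wt = 3, hp = 120))
print(predicted)

# Prescriptive: what should you do? Search a grid of candidate designs
# (hypothetical values) and pick the one with the best predicted mpg,
# subject to a hypothetical constraint of at least 100 horsepower.
candidates <- expand.grid(wt = seq(2, 4, by = 0.5),
                          hp = seq(80, 200, by = 20))
candidates$pred_mpg <- predict(fit, newdata = candidates)
feasible <- candidates[candidates$hp >= 100, ]
best <- feasible[which.max(feasible$pred_mpg), ]
print(best)
```

The grid search stands in for the simulation and optimization techniques mentioned above; it is the simplest possible version of "try the options, keep the best feasible one".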

It is very important to choose the Analytics vendor carefully. Make sure the vendor offers tools for all of the above types of Analytics, so that they can be adopted when the organization reaches the next level in Analytics.

For further reading, please visit:

http://www-01.ibm.com/common/ssi/cgi-bin/ssialias?htmlfid=TIW14162USEN

 
