R Statistical Analysis for Beginners

In this guide, I will help you learn R, the statistical programming language.

 


Getting Started With R Programming

R is a programming language built for statistics. It handles data processing, data manipulation, and visualization well. You can download R from r-project.org or install it through your package manager, where the package usually goes by the same name. R is actively maintained, has a large user community, and is free. It runs on Windows, Mac, and Linux.

You do not compile R code before running it. R is an interpreted language: the interpreter reads and executes each statement as it goes. Interpreted code generally runs slower than compiled code. This won’t make any difference at the beginning; just keep it in mind if you decide to use R for heavier tasks. R is not meant to replace C++, because it is built for different work.

The R Environment

We can use any text editor and many different IDEs to write R code. Depending on your operating system, you will have a few different choices. Some of those include:

  • RGui
  • RStudio
  • Eclipse
  • Emacs
  • Tinn-R
  • R Commander     
  • Rattle for R

There are many more options. These are some of the better-known free ones.

Entering Commands

You can use R as a simple calculator if you want to. Enter an expression and hit enter.

3 + 5

4^2

After you enter your expression and hit enter, the interpreter returns an answer.  To close your session, type:

quit()

Your First Session

Our first session will complete the requisite “Hello World” program. 

print("Hello World")

That is nice and easy: it takes only one line. Basic math is just as easy.

3^3 + 4^4

You should get 283. This is only the most basic usage, though, and not very useful on its own.

Starting With Vectors

A vector is an ordered collection of values of the same type. If there are just numbers in the collection, it is called a numeric vector. To create a vector of numbers, you can type:

c(1,2,3,4,5,6,7)

This will create a vector of the numbers one through seven. 

You can also create a vector with the sequence operator, a colon.

1:100

This expression creates a vector of numbers from one to a hundred. That is much quicker than typing them out. You can add all of them together by using the sum function.

sum(1:100)
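
As a small sketch of the ideas above, the following builds the same vector three ways and then checks the sum. The variable names are made up for illustration, and seq() and length() are base R functions that have not been introduced yet.

```r
# Three ways to build the same numeric vector (names are illustrative)
typed_out = c(1, 2, 3, 4, 5, 6, 7)   # combine values by hand
by_colon = 1:7                       # sequence operator
by_seq = seq(from = 1, to = 7)       # seq() builds the same sequence

# All three hold the same values
all(typed_out == by_colon)   # TRUE
all(by_colon == by_seq)      # TRUE

# Sum of one through a hundred, and how many values were added
total = sum(1:100)
count = length(1:100)
total   # 5050
count   # 100
```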

Variables and Vectors

We can store numbers and vectors in variables. This lets us store values to work on them at a later point. When they are stored, we can do calculations on them at any time. 

x = 1:100
x

The above statement assigns a vector to the variable “x”. Then we print the value of “x” with the next statement. We can create another variable and add it to the value of “x”.

a = 6^2

x + a 
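
To see what that last line does, here is a quick sketch. R adds the single value in “a” to every element of “x”. The name “shifted” is made up for illustration, and head() and length() are base R functions not used above.

```r
x = 1:100
a = 6^2            # 36

shifted = x + a    # a is added to each of the 100 elements
head(shifted)      # first six values: 37 38 39 40 41 42
length(shifted)    # still 100 elements
```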

Interaction With User

We can get input from a user. To do this, we use the “readline()” function. It shows a prompt and returns whatever the user types as a string.

color = readline("What is your favorite color? ")

paste (color)
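
One thing to keep in mind is that readline() only waits for input in an interactive session. The sketch below assumes you may also want to run the code non-interactively, so it guards the prompt with interactive(). The helper greet_color() is hypothetical and not part of R.

```r
# greet_color() is a made-up helper for this example, not an R built-in
greet_color = function(color) {
  paste("Your favorite color is", color)
}

if (interactive()) {
  # Only prompt when someone is at the console to answer
  color = readline("What is your favorite color? ")
  print(greet_color(color))
}

greet_color("blue")   # "Your favorite color is blue"
```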

Your First Script

Anytime you want to repeat some action, you should place the commands in a script. A script runs several commands at once. A script can run several calculations or ask for input. The only limit is your creativity.  

To create a script, you need to open a script window. In RStudio:

  • In the top menu, click File
  • Select New File
  • Click R Script

This will open a window in the top left. You can paste the previous code into this window if you want. Save the file in the location that you prefer. Then, in the menu:

  • Select Code
  • Then click on Source towards the bottom

This runs the script, and its output appears in the console in your bottom left window.
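
You can also run a saved script from the console with the base R function source(). A minimal sketch, assuming a throwaway script written to a temporary file rather than one you saved from RStudio:

```r
# Write a two-line script to a temporary file (the path is illustrative)
script_path = tempfile(fileext = ".R")
writeLines(c("x = 1:100", "total = sum(x)"), script_path)

# source() runs every command in the file, top to bottom
source(script_path)
total   # 5050
```

With your own script, you would pass the path of the file you saved instead of the temporary one.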

 

Using Functions In R Programming

 

    

If you want to follow along in this section, load up your R editor or RStudio. It will be fun. Start by checking your version to see if you need to update.

 

R.Version()
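
R.Version() returns a fairly long list. If you only want the version number, something like the following sketch may be easier to read. The variable name “info” is made up for illustration.

```r
# R.version.string is a one-line summary of the running version
R.version.string

# R.Version() returns a list, so you can pick out the pieces you need
info = R.Version()
paste(info$major, info$minor, sep = ".")
```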

 

Vectorized Functions

You will use functions more than anything else. Almost everything you do will involve one. Let us start with vectorized functions. A vectorized function works on a whole set of values at once, so you do not have to handle each element one by one. To give a vectorized function something to work on, build a vector with the “c()” function. It stands for combine, and it combines the values inside the parentheses into one vector.
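
To make the difference concrete, here is a small sketch that computes square roots twice, once element by element with a loop and once with a single vectorized call. The numbers and variable names are made up for illustration.

```r
hits = c(4, 9, 16, 25)   # made-up numbers

# Element by element, with a loop
roots_loop = numeric(length(hits))
for (i in seq_along(hits)) {
  roots_loop[i] = sqrt(hits[i])
}

# The vectorized way: one call handles the whole vector
roots_vectorized = sqrt(hits)

roots_vectorized                      # 2 3 4 5
all(roots_loop == roots_vectorized)   # TRUE
```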

 

I play a game called EverQuest so I will use some data from it. Specifically, my rogue does damage and the individual hits are going to be the arguments of the “c()” function. 

 

damage_per_second = c(23453, 115200, 67441, 110688, 220665)

 

We have our data in a variable. Let us see what we can do. The total is the first property to come to mind.

 

sum(damage_per_second)
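
The total is only one of the summaries you can get from a single call. As a sketch, a few other base R functions work on the same vector in the same way:

```r
damage_per_second = c(23453, 115200, 67441, 110688, 220665)

sum(damage_per_second)      # 537447
mean(damage_per_second)     # 107489.4
max(damage_per_second)      # 220665
min(damage_per_second)      # 23453
length(damage_per_second)   # 5
```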

 

This gives us our total damage done to some deserving digital monster in our game. We can also work with string data. A good example would be the name of the monster. Let us call it a goblin berserker.

 

monster_name = c("Goblin", "Berserker")

 

So, we have created another vector with the first and last name of this kind of goblin. We can use the “paste()” function to join the pieces back together.

 

paste(monster_name, collapse = " ")

 

With the “collapse” argument, this function takes each item in the vector and concatenates them into a single string, “Goblin Berserker”. Without it, “paste()” returns the two items unchanged. It will do something else very cool. What if we did not want to type “goblin” every time because we are surveying a whole clan of goblins and we have a few hundred types? I won’t type all of those but I will do a few to show you how this works.

 

monster_first_name = c("goblin")

monster_last_name = c("berserker", "extremist", "digger")

paste (monster_first_name, monster_last_name)

 

Now hit enter after each statement. Do you see what happens? We have the word “goblin” in front of each of the types, because “paste()” reuses the single first name for every last name. I think that is so cool. This is a vectorized function at work!
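
As a sketch of where you can take this, paste() has two optional arguments, “sep” and “collapse”, that control how the pieces are joined. The names “full_names”, “tagged” and “roster” are made up for illustration.

```r
monster_first_name = c("goblin")
monster_last_name = c("berserker", "extremist", "digger")

# The single first name is reused for every last name
full_names = paste(monster_first_name, monster_last_name)
full_names   # "goblin berserker" "goblin extremist" "goblin digger"

# sep changes what goes between the pieces
tagged = paste(monster_first_name, monster_last_name, sep = "_")
tagged       # "goblin_berserker" "goblin_extremist" "goblin_digger"

# collapse joins all the results into one string
roster = paste(full_names, collapse = ", ")
roster       # "goblin berserker, goblin extremist, goblin digger"
```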

 

Function Arguments

We have just used a couple of functions. Remember the data we put into the parentheses? Those are the arguments of the function. The function is waiting for something to work on, whether it is numbers or names or whatever. There is more to arguments than this, but for now, let’s keep it simple. We will come back to function arguments later when we need to.
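
As a small taste of what is to come, here is a sketch showing that arguments can be given by position or by name, and that some are optional. It uses the base R functions round() and sum(), and the numbers are made up for illustration.

```r
# Arguments matched by position
rounded_by_position = round(3.14159, 2)

# The same call with the second argument named
rounded_by_name = round(3.14159, digits = 2)

rounded_by_position == rounded_by_name   # TRUE

# An optional argument can change how a function behaves
hits = c(10, NA, 30)        # NA marks a missing value
sum(hits)                   # NA, because one value is missing
sum(hits, na.rm = TRUE)     # 40, the missing value is skipped
```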

 

Command History

R keeps track of your commands. You can step through your past commands with the up and down arrows in the console. So, you just hit up, for example, and then press Enter when you see the command you want to execute. You can also use a function called “savehistory()” to save your commands to a file. If you do not give it any arguments between the parentheses, it saves the commands in a file called .Rhistory in your working directory. However, you can pass a “file” argument to give the file a different name.

 

savehistory(file = "functions.Rhistory")

 

This will let you see specific commands when learning about functions! 

 

Comments

I don’t think I have talked about comments in this section yet, so let’s do that. If you have written code before, you know the importance of comments. Your code should be self-documenting so you can easily see what it does. However, you need comments to tell you why you did it that way. I also use comments to divide my code into sections. That way it is easier for me and others to see what I did and why.

 

# This is a comment and it should explain something about your code.

 

Everything after a hash mark on a line is ignored.
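
Here is a short sketch of both uses described above: comments that divide a script into sections and a comment that explains why. The numbers and the variable names are made up for illustration, and median() is a base R function not used earlier.

```r
# ---- Data ----
# Made-up hit values for illustration
hits = c(120, 80, 100)

# ---- Calculation ----
# Use the median rather than the mean so that one lucky
# hit does not skew the result
typical_hit = median(hits)

typical_hit   # a comment can also follow code on the same line
```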