In this guide, I will help you learn R, the statistical programming language.
Getting Started With R Programming
R is a programming language built for statistics and data analysis. Visualization, data processing, and data manipulation are all easy in R. You can download R from r-project.org or install it from your package manager, where the package name varies (for example, r-base on Debian and Ubuntu). It is well maintained and has a large, active user base. It is also free, which is a major advantage. It runs on Windows, Mac, and Linux.
You do not compile R code before running it. R is an interpreted language: each statement is evaluated as it runs. Interpreted languages are generally slower than compiled ones. This won’t make any difference at the beginning. Just keep it in mind if you decide to use R for more demanding tasks. It is not meant to replace C++, as it is designed for different things.
The R Environment
We can use any text editor and many different IDEs to write R code. Depending on your operating system, you will have a few different choices. Some of those include:
- R Commander
- Rattle for R
There are many more options to choose from. These are some of the better-known free ones.
You can use R as a simple calculator if you want to. Enter an expression and hit enter.
3 + 5
After you enter your expression and hit enter, the interpreter returns an answer. To close your session, type:
q()
Your First Session
Our first session will complete the requisite “Hello World” program.
print("Hello World")
That is nice and easy. It only takes one line. Basic math is just as easy.
3^3 + 4^4
You should get the answer of 283. However, this is just the most basic usage and not really helpful.
Starting With Vectors
A vector is a collection of things of the same type. If there are just numbers in the collection, it is called a numeric vector. To create a vector of numbers, you can type:
c(1, 2, 3, 4, 5, 6, 7)
This will create a vector of the numbers one through seven.
You can also create a vector with the sequence operator, a colon.
1:100
This expression creates a vector of numbers from one to a hundred. That is much quicker than typing them out. You can add all of them together by using the sum function.
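As a quick sketch, the sum of the numbers from one to a hundred can be computed like this. The variable name total is only for illustration.

```r
# Add up every number in the vector 1, 2, ..., 100.
total = sum(1:100)
total  # 5050
```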
Variables and Vectors
We can store numbers and vectors in variables. This lets us store values to work on them at a later point. When they are stored, we can do calculations on them at any time.
x = 1:100
x
The above statement assigns a vector to the variable “x”. Then we print the value of “x” with the next statement. We can create another variable and add it to the value of “x”.
a = 6^2
x + a
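Adding a single number to a vector adds it to every element. A minimal sketch of what the statements above produce, assuming the result is stored in a new variable (shifted is a name made up for this example):

```r
x = 1:100
a = 6^2            # 36

# The single value in a is added to each of the 100 elements of x.
shifted = x + a

head(shifted)      # 37 38 39 40 41 42
length(shifted)    # 100
```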
Interaction With User
We can get input from a user. To do this, we use the “readline()” function.
color = readline("What is your favorite color? ")
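Once the answer is stored, it can be used like any other value. The sketch below assumes a fallback answer of "blue" when R is not running interactively, since readline() can only wait for typed input in an interactive session. It also uses paste(), which is covered later in this guide.

```r
# Ask the question only when someone is there to answer it.
color = if (interactive()) {
  readline("What is your favorite color? ")
} else {
  "blue"  # assumed fallback for non-interactive runs
}

print(paste("Your favorite color is", color))
```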
Your First Script
Anytime you want to repeat some action, you should place the commands in a script. A script runs several commands at once. A script can run several calculations or ask for input. The only limit is your creativity.
To create a script, you need to open a script window. In RStudio:
- In the top menu, click File
- Select New File
- Click R Script
This will open a window in the top left. You can paste the previous code into this window if you want. Save the file in the location that you prefer. Then, in the menu:
- Select Code
- Then click on Source towards the bottom
This will run the code, and the output appears in the console in your bottom left window.
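The same thing can be done from the console with the source() function. The sketch below is only an illustration: it writes a small script to a temporary file (the script contents and the variable names are assumptions for this example) and then runs it.

```r
# Create a throwaway script file so the example is self-contained.
script_path = tempfile(fileext = ".R")
writeLines(c("x = 1:100",
             "total = sum(x)",
             "print(total)"),
           script_path)

# Run every command in the script, in order.
source(script_path)  # prints 5050
```

With a script you saved yourself, pass its path to source() instead of the temporary file.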
Using Functions In R Programming
If you want to follow along in this section, load up your R editor or RStudio. It will be fun. Start by checking your version to see if you need to update.
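One way to check the version from the console is sketched below. The "4.0.0" threshold is an arbitrary example, not a requirement of this guide.

```r
# Show the full version string of the running R installation.
print(R.version.string)

# Compare the running version against a chosen minimum (example value).
getRversion() >= "4.0.0"
```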
You will use functions more than anything else. Almost everything you do will involve one. Let us start with vectorized functions. This type of function works on a whole set of values at once, so you do not have to handle each element one at a time. To give a vectorized function something to work on, build a vector with the “c()” function. It stands for combine, and it combines the values inside the parentheses into one vector.
I play a game called EverQuest so I will use some data from it. Specifically, my rogue does damage and the individual hits are going to be the arguments of the “c()” function.
damage_per_second = c(23453, 115200, 67441, 110688, 220665)
We have our data in a variable. Let us see what we can do. The total is the first property to come to mind.
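A minimal sketch of the total, using sum() on the vector defined above (repeated here so the snippet runs on its own):

```r
damage_per_second = c(23453, 115200, 67441, 110688, 220665)

# sum() adds up every element of the vector in one call.
sum(damage_per_second)  # 537447
```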
This gives us our total damage done to some deserving digital monster in our game. We can also work with string data. A good example would be the name of the monster. Let us call it a goblin berserker.
Monster_name = c("Goblin", "Berserker")
So, we have created another vector with the first and last name of this kind of goblin. We can use the “paste()” function to join the parts back together in any order.
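Here is one way this could look. It assumes the parts are picked out by position with square brackets, which this guide has not covered, so treat it as a sketch.

```r
Monster_name = c("Goblin", "Berserker")

# Monster_name[1] is the first item, Monster_name[2] the second.
paste(Monster_name[1], Monster_name[2])  # "Goblin Berserker"
paste(Monster_name[2], Monster_name[1])  # "Berserker Goblin"
```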
This function will look at each item and concatenate them together. It will do something else very cool. What if we did not want to type “goblin” every time because we are surveying a whole clan of goblins and we have a few hundred types? I won’t type all of those but I will do a few to show you how this works.
monster_first_name = c("goblin")
monster_last_name = c("berserker", "extremist", "digger")
paste(monster_first_name, monster_last_name)
Hit enter after each statement. Do you see what happens? The word “goblin” appears in front of each of the types, because paste() reuses the single first name for every last name. I think that is so cool. These are vectorized functions!
We have just used a couple functions. Remember the data we put into the parentheses? Those are arguments for the function. The function is waiting for something to work on, whether it is numbers or names or whatever. There is more to arguments than this, but for now, let’s keep it simple. We will come back to function arguments later when we need to.
R keeps track of your commands. You can look at your past commands by using the up and down arrows in your IDE. So, you just hit up, for example, and then select “enter” when you see the command you want to execute. You can also use a function called “savehistory()” to save your commands to a file. If you do not give it any arguments between the parentheses, it saves the commands in a file called .Rhistory in your working directory. However, you can give arguments to the function to give it a different name.
savehistory(file = "functions.Rhistory")
This saves the commands from your session to functions.Rhistory, so you can review them later while learning about functions!
I don’t think I have talked about comments in this section, yet. So, let’s do that. If you have written code before, you know the importance of comments in your code. Your code should be self documenting so you can easily see what the code does. However, you need comments to tell you why you did it that way. I also use comments in my code to divide it into sections. That way it is easier for me and others to see what and why I did something.
# This is a comment and it should explain something about your code.
Everything after a hash mark is ignored on a line.
Vectorizing Your Functions
Vectorized functions are very useful in R programming. They are a way to analyze data quickly and easily. I briefly mentioned them in the section before this but today I want to go more in depth.
Your first step in using vectorized functions is to create a vector of values. Let us make a vector using the high temperatures of Cincinnati, OH for the next few days. I don’t live there, but the temperatures are so nice, I wish I did. We need to make a vector name and then use the c() function. It looks like this:
Temps = c()
Now I will use my weather program to find the highs for the next few days and input those into the function. Use commas to separate values.
Temps = c(77,79,79,82,68,72)
With these values in hand, we can always call the vector name to see them again. The values could change, which is why this is useful. Do it like this:
Temps
We could get the sum of the temperatures if we wanted, though with temperatures that is not exactly useful. However, you should know how to use the sum function for other kinds of numerical data.
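A minimal sketch, with the vector repeated so the snippet runs on its own:

```r
Temps = c(77, 79, 79, 82, 68, 72)

# Add up all six temperatures.
sum(Temps)  # 457
```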
That will give you a sum of values.
We can also use functions that deal with string data or names, then pass the result of one function to another function.
Let me use some Magic: The Gathering cards I have for examples.
What I will do here is list the first part of the card name for a few different cards. Then I will do the same for the last part of the card name. Lastly, I will paste the second part to the first part.
first = c("Anointed", "Shivan", "Llanowar")
second = c("Peacekeeper", "Reef", "Wastes")
Now we can use the paste() function on our vector.
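A minimal sketch of the call, with both vectors repeated so it runs on its own. paste() matches the items up by position: first with first, second with second, and so on.

```r
first = c("Anointed", "Shivan", "Llanowar")
second = c("Peacekeeper", "Reef", "Wastes")

# Join each item of first to the item in the same position of second.
paste(first, second)
# "Anointed Peacekeeper" "Shivan Reef" "Llanowar Wastes"
```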
Those are the actual names of the cards I am looking at. The point is that you can pass the output of one function, here c(), to another function, here paste().
Most functions will allow you to use arguments. They let you tell the function exactly how to behave. This is called passing a value to a function. If you know other programming languages, this is not a foreign concept. Most functions across languages work like this. You can have arguments with default values and some without default values.
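The paste() function is a handy illustration. Its sep argument has a default value of a single space, so you only need to supply it when you want something different. The underscore below is just an example value.

```r
first = c("Anointed", "Shivan", "Llanowar")
second = c("Peacekeeper", "Reef", "Wastes")

# No sep given, so the default of one space is used.
paste(first, second)

# Passing sep changes how the two parts are joined.
paste(first, second, sep = "_")
# "Anointed_Peacekeeper" "Shivan_Reef" "Llanowar_Wastes"
```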
R keeps track of all the commands you use in a particular session. The purpose of this functionality is to let you see what you have done and let you easily repeat something. You can look at the history that you have typed by using the up arrow within your console. You can hit enter when you see something you want to repeat. This runs the same command again.
You can save the history of your commands with the “savehistory()” function. This will automatically save them in a file called .Rhistory, but if you want to specify a different file you can do this:
savehistory(file = "functions.Rhistory")
You can look at the file with any text editor. You can also load a previous history file.
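Loading is done with loadhistory(). The sketch below assumes the history was saved as functions.Rhistory in your working directory. Command history is only available in interactive sessions, and not every console supports it, so the call is guarded.

```r
history_file = "functions.Rhistory"  # assumed file name from the savehistory() example

# Only try to load when a person is at the console and the file exists.
if (interactive() && file.exists(history_file)) {
  loadhistory(file = history_file)
}
```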
Comments are good to use because they help with readability. You can use them to indicate who wrote a piece of code and what they were intending. It is also good to explain why something is there in the first place. This is done with the pound symbol (#).
# this line is a comment and does nothing else
You can place comments at the beginning of a line or any place in the line itself. To do multiple lines of comments, every line will need a pound symbol.
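A short sketch of both placements, reusing the temperature vector from earlier:

```r
# High temperatures from the earlier example.
# A second comment line needs its own pound symbol.
Temps = c(77, 79, 79, 82, 68, 72)

max(Temps)  # a comment can also follow code on the same line; result is 82
```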
Anyone can write functions in R and share them with others. Collections of shared functions are called packages. There are a few different online repositories that host these packages. The most important is: https://cran.r-project.org
You install packages by using the command:
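The function is install.packages(). In the sketch below, ggplot2 is only an example package name that this guide does not otherwise use. The guard is an assumption of this example: it skips the download if the package is already installed or if R is not running interactively.

```r
package_name = "ggplot2"  # example package; substitute the one you need

# Install only when the package is missing and someone is at the console.
if (interactive() && !requireNamespace(package_name, quietly = TRUE)) {
  install.packages(package_name)
}
```

At the console, the plain form install.packages("ggplot2") does the same job without the guard.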
The name of the package is the argument of the function.
Once the package is installed, you have to load it in order to use it. You do it like this:
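Loading is done with the library() function. The sketch below uses the tools package as a stand-in because it ships with R, so nothing has to be installed first. file_ext() is one of the functions it provides.

```r
# Load the package for the current session.
library(tools)

# A function from the loaded package is now available by name.
file_ext("functions.Rhistory")  # "Rhistory"
```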
After this step, you can officially use the commands of this package. The library is the directory where your packages are installed.