# Welcome to R. A line that starts with the pound sign is a comment
# line. You can type your commands in R line by line or create scripts
# if you prefer. We are going to explore mainly the use of line-by-line
# commands, which is interactive and friendly.

# To get into R from a Linux/Unix server, type R on the command line of a
# shell. To get to R from a Mac, click on the R icon and then OK on the
# console. (You have separate instruction files about R logistics on
# the Macs and on Linux/Unix machines.)

# Assignment. You can create any object in R by using the <- assignment sign.
# Note that in old versions of R (and in Splus) _ was equivalent to <- ,
# so the underscore could not be used in the names of objects. Current
# versions of R no longer treat _ as an assignment sign and allow it in names.

x<-1:10

# If you just enter the name of an R object, you see what it consists of.

x

# As you can see, : generates all the integers between two values.
# There are other ways of generating a sequence of numbers, for example
# with the function seq

y<-seq(0.4,35.8,length.out=10)

# To learn more about seq() (and any other function in R), type

help.start()

# and your web browser will be pointed to a very nice on-line help.
# You can also type ?seq to go straight to the help page for seq().
# Note that a function is always invoked with parentheses. So, if
# you want to quit R, you need to type

q()

# At this point you will be asked if you want to save the
# workspace. If you say yes, all the objects you have created
# will be saved and re-loaded next time you open R from the same
# directory. Otherwise they will be deleted. It is usually more
# convenient to save in a file a copy of the commands that
# create the objects, rather than saving the objects themselves, which
# are quite space-consuming and not very portable.

# To continue, you will need to start R again.

# Data can be stored in vectors, matrices, or data frames.

x<-c(1,2,7,9,20)               # is a vector
x[4]                           # is its fourth element
y<-matrix(c(1,2,3,1,2,3),3,2)  # is a matrix with 3 rows and 2
                               # columns. Note that it is filled in
                               # column by column.
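# A small sketch of the fill order just described. The names m_col and
# m_row are made up for this illustration; byrow is a standard argument
# of matrix().

m_col<-matrix(c(1,2,3,1,2,3),3,2)             # default: filled column by column
m_row<-matrix(c(1,2,3,1,2,3),3,2,byrow=TRUE)  # same numbers, filled row by row
m_col
m_row
dim(m_col)                                    # number of rows, number of columns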
y[1,]    # is the first row
y[,2]    # is the second column
y[3,1]   # is the element on row 3 and column 1

z<-1:5   # is another vector. You can put it together with x to make a
         # longer vector
c(x,z)
         # or to make
cbind(x,z) # a matrix

# You can also generate random data

x<-rnorm(100,5,3) # these are 100 random numbers from a normal
                  # distribution with mean 5 and standard deviation
                  # 3. You can look for other distributions using the
                  # help for rnorm() (see also ?Distributions).

# And, of course, you can load data from a file. Suppose your file
# contains three columns like the following
#   10 5 A
#   20 6 A
#    1 7 A
#   53 8 A
#   60 1 B
#    2 2 B
#   10 3 B
#   22 4 B
# and suppose that the file is called test. You can load that file in
# R using the function read.table.

x<-read.table(file="test",stringsAsFactors=TRUE)

# Each column will be interpreted as a variable. With
# stringsAsFactors=TRUE the last column will be interpreted as a
# categorical variable, a "factor". (Since R 4.0.0, read.table leaves
# text columns as character strings unless you ask for factors.)

# One nice thing you can do with your data is to plot it.

plot(x[,1],x[,2])  # scatter plot
hist(x[,1])        # histogram
plot(x[,3],x[,1])  # when the x-coordinate is a factor, you get these
                   # nice box-plots.

# If you want to print your graphs, you need to do a bit of work.
# You start by opening the postscript file that will contain them

postscript(file="mygraph",horizontal=FALSE,height=9,width=6)
# height and width are in inches and you can change them

# Then you execute your plots; this time we want to put them all on the
# same page, so we add an initial command (try it without opening a
# postscript file first, so that you get an idea of what is happening)

par(mfrow=c(3,1))
plot(x[,1],x[,2])
hist(x[,1])
plot(x[,3],x[,1])

# and finally you close the file on which you printed.

dev.off()

# Calculating statistics in R is easy:

mean(x[,1])      # gives you the mean of the first column of x.
x<-x[,-3]        # we now eliminate the last column of x, so that it is a
                 # data frame of numeric components
apply(x,2,mean)  # gives us the column means
apply(x,1,mean)  # gives us the row means. apply() is an important
                 # function in R.
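
# A hedged sketch of a few more uses of apply(), run on a small made-up
# matrix so that it does not depend on the file test. The name m is
# hypothetical; colMeans() and rowMeans() are base R shortcuts.

m<-cbind(c(10,20,1,53),c(5,6,7,8))
apply(m,2,mean)   # column means, the same values as colMeans(m)
colMeans(m)
apply(m,1,mean)   # row means, the same values as rowMeans(m)
rowMeans(m)
apply(m,2,sd)     # any function of a vector works, here the standard deviation
apply(m,2,range)  # range() returns two values, so the result is a 2-row matrix
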
# Look up apply() in the help.

# There are many functions for statistical analysis which are built
# into R and you should always look in the help first, to see if you
# can find something that does what you would like. But sometimes you
# will need to write your own function. This is also rather easy:

myfun<-function(arg1,arg2)
{
  y<-arg1*3+arg2/3
  return(y)
}

myfun(1,9)

# R is rather slow in loops, so you should try to avoid loops as much
# as possible. If they cannot be avoided, they are written like this:

test<-NULL
for(i in 1:100)
{
  test<-c(test,i/6)
  print(i)
}
test

# Of course, you could have avoided this loop by saying

test<-1:100/6

# The following is a web tutorial for Splus. R is not Splus, but it is
# very similar, especially with regard to the commands that you can use;
# so you may find it useful to consult the following (and I recommend you do)
# http://www.stat.sinica.edu.tw/splus_tutorial/contents.html
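
# Returning to the loop example: when a loop really cannot be avoided,
# one common habit (a suggestion, not something this tutorial requires)
# is to create the result vector at its full length first, instead of
# growing it with c() at every step. The names n and res are made up
# for this sketch.

n<-100
res<-numeric(n)          # a vector of n zeros, to be filled in
for(i in seq_len(n))
{
  res[i]<-i/6
}
all.equal(res,1:100/6)   # TRUE: same values as the version without a loop

# sapply() applies a function to each element and collects the results

sapply(1:5,function(i) i/6)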