R Basics, cont

Announcement
First functions to learn
Locating and deleting objects:
Vectors
Operations on vectors
Matrix
R commands on vector/matrix
Comparison (logic operator)
Other operators
Control flow
Define a function

Announcement

Lectures will be online with zoom recordings
- https://tulane.zoom.us/j/94892763737?pwd=VTlmMStLd3hUeUhEY0UxdUZJVnk3UT09
Introduce yourself on Canvas Discussion
Email me your GitHub user name and accept the invitation to course organization
HW1 posted (due in 3 weeks on Oct 1st, 2021)
- learn from the past: start early
Project description (one page) due in two weeks on Sept. 24, 2021

First functions to learn

symbol	use
?	get documentation
str	show structure

test.str <- 1:6
str(test.str)

##  int [1:6] 1 2 3 4 5 6
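
As a small sketch of how these two helpers fit together, the lines below inspect an object with more than one component. The list `course.info` and its contents are hypothetical and only serve the illustration.

```r
# A hypothetical list mixing several types, used only for illustration.
course.info <- list(title = "R Basics", week = 2L, scores = c(90.5, 82, 77.25))

# str() prints one compact line per component (type, length, first values).
str(course.info)

# ?str or help("str") opens the documentation page in an interactive session.
# class() reports the type of a single component.
class(course.info$week)
```

Calling str() on an unfamiliar object is usually the quickest way to see what it contains before working with it.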

Locating and deleting objects:

The commands “objects()” and “ls()” will provide a list of every object that you’ve created in a session.

objects()

## [1] "test.str"

ls()

## [1] "test.str"

The “rm()” and “remove()” commands let you delete objects (tip: always clean up your workspace as the first command in a script).

rm(list=ls())  # clean up workspace
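
rm() can also delete a single object by name. A minimal sketch, assuming two throwaway objects (`tmp.a` and `tmp.b` are hypothetical names):

```r
# Hypothetical objects created only for this illustration.
tmp.a <- 1:3
tmp.b <- "keep me"

rm(tmp.a)            # delete one object by name
exists("tmp.a")      # FALSE: the object is gone
exists("tmp.b")      # TRUE: other objects are untouched
```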

Vectors

Many commands in R generate a vector of output, rather than a single number.

The “c()” command creates a vector by combining the elements it is given.

Example 1

c(7, 3, 6, 0)

## [1] 7 3 6 0

c(73:60)

##  [1] 73 72 71 70 69 68 67 66 65 64 63 62 61 60

c(7:3, 6:0)

##  [1] 7 6 5 4 3 6 5 4 3 2 1 0

c(rep(7:3, 6), 0)

##  [1] 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 7 6 5 4 3 0

Example 2 The command “seq()” creates a sequence of numbers.

seq(7)

## [1] 1 2 3 4 5 6 7

seq(3, 70, by = 6)

##  [1]  3  9 15 21 27 33 39 45 51 57 63 69

seq(3, 70, length = 6)

## [1]  3.0 16.4 29.8 43.2 56.6 70.0
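
A few related calls are worth knowing alongside these examples. The sketch below uses only base R; the specific numbers are arbitrary.

```r
rep(c(7, 3), times = 3)      # repeat the whole vector: 7 3 7 3 7 3
rep(c(7, 3), each = 3)       # repeat each element:     7 7 7 3 3 3
seq(3, 70, length.out = 6)   # length.out is the full name of the `length` argument
seq_len(0)                   # integer(0): an empty sequence, safe in loops
1:0                          # 1 0: the colon operator counts down instead
```

The last two lines show why seq_len(n) is often preferred over 1:n when n might be zero.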

Operations on vectors

Use brackets to select elements of a vector. A negative index drops the listed positions instead of selecting them.

x <- 73:60
x[2]

## [1] 72

x[2:5]

## [1] 72 71 70 69

x[-(2:5)]

##  [1] 73 68 67 66 65 64 63 62 61 60

Elements can also be accessed by “name”, which stays correct even if the order of the elements (or of rows/columns) changes.

y <- 1:3
names(y) <- c("do", "re", "mi")
y[3]

## mi 
##  3

y["mi"]

## mi 
##  3
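
To see why name-based access is the safer choice, the sketch below reverses the vector and repeats both lookups. It also shows a logical index, which the comparison operators later in these notes produce. The object `y.rev` is a hypothetical name.

```r
y <- c(do = 1, re = 2, mi = 3)
y.rev <- rev(y)        # order is now mi, re, do

y.rev[3]               # position 3 now holds "do", not "mi"
y.rev["mi"]            # lookup by name still returns 3

x <- 73:60
x[x %% 2 == 0]         # logical index: keep only the even values
```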

Matrix

The matrix() command creates a matrix from the given set of values. By default the values fill the matrix column by column; byrow = TRUE fills it row by row.

matrix.example <- matrix(rnorm(100), nrow = 10, ncol = 10, byrow = TRUE)
matrix.example

##             [,1]        [,2]       [,3]        [,4]        [,5]       [,6]
##  [1,] -2.3323541 -2.18657747 -0.4014868  0.06691836  1.29324515 -1.4463168
##  [2,] -0.4753005  0.47294332  0.2244744 -0.84968969 -0.09993253 -0.2448627
##  [3,]  0.3734604 -0.71304198  0.8727049  2.51141778  0.74323031  1.2339513
##  [4,]  0.8692372 -0.96670992 -1.0119913 -0.51567157 -1.63234113  0.4997296
##  [5,] -0.4791038 -0.08670071  0.8407539  0.82354818 -0.94657799 -0.9801950
##  [6,]  1.1867074 -0.59849529 -1.0277184 -1.26812688  1.56010386 -0.4722041
##  [7,] -1.9479396 -0.78083551 -1.8031430  0.70548252  0.14169582  0.2453195
##  [8,]  1.8786771  0.86629994  0.8464386  0.04018114  1.34458457  1.3003608
##  [9,]  0.5132524  0.63782182 -1.8224493 -0.13809829 -1.04677710  0.1406130
## [10,]  0.8210053  1.49490255  1.4588468  0.74669792 -0.35114807  0.5957015
##              [,7]       [,8]        [,9]      [,10]
##  [1,] -0.53936590  0.7951344 -1.88023353  0.4737950
##  [2,] -2.05728909 -0.5203394 -1.07545960 -1.3900146
##  [3,]  0.66218435 -1.2966323 -0.72260372  0.1879630
##  [4,] -0.97358166 -0.1874043  1.11539389 -1.5238135
##  [5,]  0.51705074 -0.5444689 -0.92498199 -0.9871974
##  [6,] -0.95673632  0.7571397 -0.60433202  0.4158881
##  [7,] -0.99802080 -1.0719852  1.80768620 -0.7445013
##  [8,]  0.07774574 -1.1542600 -0.05219194 -0.5741227
##  [9,] -0.47623964 -0.2536268  0.89647350  1.0259948
## [10,] -0.65395919 -1.7114382 -0.75225713  0.4114466
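
With random values the effect of byrow is hard to see, so here is a small sketch with the integers 1 to 6, followed by element, column and row selection. The names `m.row` and `m.col` are hypothetical.

```r
m.row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)   # fill row by row
m.col <- matrix(1:6, nrow = 2, ncol = 3)                 # default: column by column

m.row          # rows are 1 2 3 and 4 5 6
m.col          # rows are 1 3 5 and 2 4 6

m.row[2, 3]    # single element (row 2, column 3): 6
m.row[, 2]     # whole second column: 2 5
m.row[1, ]     # whole first row: 1 2 3
```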

R commands on vector/matrix

command	usage
sum()	sum over elements in vector/matrix
mean()	compute average value
sort()	sort all elements in a vector/matrix
min(), max()	min and max values of a vector/matrix
length()	number of elements in a vector/matrix (for a matrix: rows times columns)
summary()	returns the min, Q1, median, mean, Q3, and max values of a vector (computed column by column for a matrix)
dim()	dimension of a matrix
cbind()	combine a sequence of vector, matrix or data-frame arguments by columns
rbind()	combine a sequence of vector, matrix or data-frame arguments by rows
names()	get or set names of an object
colnames()	get or set column names of a matrix-like object
rownames()	get or set row names of a matrix-like object

sum(matrix.example)

## [1] -14.75864

mean(matrix.example)

## [1] -0.1475864

sort(matrix.example)

##   [1] -2.33235405 -2.18657747 -2.05728909 -1.94793964 -1.88023353 -1.82244929
##   [7] -1.80314305 -1.71143817 -1.63234113 -1.52381346 -1.44631679 -1.39001464
##  [13] -1.29663228 -1.26812688 -1.15426002 -1.07545960 -1.07198520 -1.04677710
##  [19] -1.02771845 -1.01199130 -0.99802080 -0.98719736 -0.98019505 -0.97358166
##  [25] -0.96670992 -0.95673632 -0.94657799 -0.92498199 -0.84968969 -0.78083551
##  [31] -0.75225713 -0.74450128 -0.72260372 -0.71304198 -0.65395919 -0.60433202
##  [37] -0.59849529 -0.57412274 -0.54446894 -0.53936590 -0.52033943 -0.51567157
##  [43] -0.47910384 -0.47623964 -0.47530055 -0.47220408 -0.40148684 -0.35114807
##  [49] -0.25362682 -0.24486272 -0.18740428 -0.13809829 -0.09993253 -0.08670071
##  [55] -0.05219194  0.04018114  0.06691836  0.07774574  0.14061303  0.14169582
##  [61]  0.18796301  0.22447441  0.24531953  0.37346040  0.41144662  0.41588814
##  [67]  0.47294332  0.47379499  0.49972960  0.51325242  0.51705074  0.59570146
##  [73]  0.63782182  0.66218435  0.70548252  0.74323031  0.74669792  0.75713969
##  [79]  0.79513441  0.82100533  0.82354818  0.84075395  0.84643859  0.86629994
##  [85]  0.86923720  0.87270494  0.89647350  1.02599483  1.11539389  1.18670739
##  [91]  1.23395127  1.29324515  1.30036082  1.34458457  1.45884684  1.49490255
##  [97]  1.56010386  1.80768620  1.87867709  2.51141778

summary(matrix.example)

##        V1                 V2                V3                 V4          
##  Min.   :-2.33235   Min.   :-2.1866   Min.   :-1.82245   Min.   :-1.26813  
##  1st Qu.:-0.47815   1st Qu.:-0.7639   1st Qu.:-1.02379   1st Qu.:-0.42128  
##  Median : 0.44336   Median :-0.3426   Median :-0.08851   Median : 0.05355  
##  Mean   : 0.04076   Mean   :-0.1860   Mean   :-0.18236   Mean   : 0.21227  
##  3rd Qu.: 0.85718   3rd Qu.: 0.5966   3rd Qu.: 0.84502   3rd Qu.: 0.73639  
##  Max.   : 1.87868   Max.   : 1.4949   Max.   : 1.45885   Max.   : 2.51142  
##        V5                 V6                 V7                 V8         
##  Min.   :-1.63234   Min.   :-1.44632   Min.   :-2.05729   Min.   :-1.7114  
##  1st Qu.:-0.79772   1st Qu.:-0.41537   1st Qu.:-0.96937   1st Qu.:-1.1337  
##  Median : 0.02088   Median : 0.19297   Median :-0.59666   Median :-0.5324  
##  Mean   : 0.10061   Mean   : 0.08721   Mean   :-0.53982   Mean   :-0.5188  
##  3rd Qu.: 1.15574   3rd Qu.: 0.57171   3rd Qu.:-0.06075   3rd Qu.:-0.2040  
##  Max.   : 1.56010   Max.   : 1.30036   Max.   : 0.66218   Max.   : 0.7951  
##        V9               V10         
##  Min.   :-1.8802   Min.   :-1.5238  
##  1st Qu.:-0.8818   1st Qu.:-0.9265  
##  Median :-0.6635   Median :-0.1931  
##  Mean   :-0.2193   Mean   :-0.2705  
##  3rd Qu.: 0.6593   3rd Qu.: 0.4148  
##  Max.   : 1.8077   Max.   : 1.0260
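
The remaining commands in the table can be tried on a small example. This is a sketch with two hypothetical vectors `a` and `b`:

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

by.col <- cbind(a, b)   # 3 x 2 matrix; columns are named "a" and "b"
by.row <- rbind(a, b)   # 2 x 3 matrix; rows are named "a" and "b"

dim(by.col)             # 3 2
length(by.col)          # 6: total number of elements, not the number of rows
colnames(by.col)        # "a" "b"

colnames(by.col) <- c("first", "second")   # the same function also sets names
by.col[, "second"]      # 4 5 6
```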

Exercise Write a command to generate a random permutation of the numbers between 1 and 5 and save it to an object.

Comparison (logic operator)

symbol	use
!=	not equal
==	equal
>	greater
>=	greater or equal
<	smaller
<=	smaller or equal
is.na	is it “Not Available”/Missing
complete.cases	returns a logical vector specifying which observations/rows have no missing values
is.finite	if the value is finite
all	are all values in a logical vector true?
any	is any value in a logical vector true?

test.vec <- 73:68
test.vec

## [1] 73 72 71 70 69 68

test.vec < 70

## [1] FALSE FALSE FALSE FALSE  TRUE  TRUE

test.vec > 70

## [1]  TRUE  TRUE  TRUE FALSE FALSE FALSE

test.vec[3] <- NA
test.vec

## [1] 73 72 NA 70 69 68

is.na(test.vec)

## [1] FALSE FALSE  TRUE FALSE FALSE FALSE

complete.cases(test.vec)

## [1]  TRUE  TRUE FALSE  TRUE  TRUE  TRUE

all(is.na(test.vec))

## [1] FALSE

any(is.na(test.vec))

## [1] TRUE
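
complete.cases() is most useful on a data frame, where it checks each row, and is.finite() is stricter than !is.na(). A short sketch, assuming a hypothetical data frame `people`:

```r
people <- data.frame(height = c(170, NA, 182), weight = c(65, 80, NA))
complete.cases(people)            # TRUE FALSE FALSE
people[complete.cases(people), ]  # keeps only row 1

v <- c(1, NA, Inf, -Inf, NaN)
is.na(v)                          # FALSE TRUE FALSE FALSE TRUE (NaN counts as NA)
is.finite(v)                      # TRUE FALSE FALSE FALSE FALSE

mean(c(73, 72, NA, 70))                # NA: missing values propagate
mean(c(73, 72, NA, 70), na.rm = TRUE)  # 71.66667
```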

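Now let’s test the accuracy of doubles in R. Recall that double precision stores a 52-bit fraction, which gives approximately \(\log_{10}(2^{52}) \approx 16\) significant decimal digits. The comparisons below show where a small difference is lost to rounding: the larger the number, the larger the smallest difference that can still be represented.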

test.exponent <- -(7:18)
10^test.exponent == 0

##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE

1 - 10^test.exponent == 1

##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE

7360 - 10^test.exponent == 7360

##  [1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE

73600 - 10^test.exponent == 73600

##  [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
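
A practical consequence is that doubles should rarely be compared with ==. The sketch below shows the machine epsilon and two common tolerance-based alternatives from base R; the tolerance 1e-12 is an arbitrary choice for illustration.

```r
.Machine$double.eps                 # 2.220446e-16, i.e. 2^-52

0.1 + 0.2 == 0.3                    # FALSE: both sides carry rounding error
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal up to a small tolerance

sqrt(2)^2 == 2                      # FALSE
abs(sqrt(2)^2 - 2) < 1e-12          # TRUE: explicit tolerance
```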

Other operators

%in%, match

test.vec

## [1] 73 72 NA 70 69 68

66 %in% test.vec

## [1] FALSE

match(66, test.vec, nomatch = 0)

## [1] 0

70 %in% test.vec

## [1] TRUE

match(70, test.vec, nomatch = 0)

## [1] 4

match(70, test.vec, nomatch = 0) > 0 # the implementation of %in%

## [1] TRUE
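
Both operators are vectorized over their first argument. A brief sketch, where `queries` is a hypothetical vector of values to look up:

```r
test.vec <- c(73, 72, NA, 70, 69, 68)
queries <- c(70, 66, 73)

queries %in% test.vec         # TRUE FALSE TRUE
match(queries, test.vec)      # 4 NA 1 (NA is the default for "no match")
NA %in% test.vec              # TRUE: match() treats NA as matching NA
which(test.vec %in% queries)  # 1 4: positions in test.vec that appear in queries
```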

Control flow

These are the basic control-flow constructs of the R language. They work in much the same way as control statements in any Algol-like language (Algol is short for “Algorithmic Language”). They are all reserved words.

keyword	usage
if	if(cond) expr
if-else	if(cond) cons.expr else alt.expr
for	for(var in seq) expr
while	while(cond) expr
break	breaks out of the innermost for or while loop
next	halts the processing of the current iteration and advances the looping index
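
The keywords in the table can be combined as in the following sketch. The loop bounds and the variables `total` and `n` are arbitrary choices for illustration.

```r
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) {
    next              # skip even values of i
  } else if (i > 7) {
    break             # leave the loop entirely once an odd i exceeds 7
  }
  total <- total + i  # reached only for i = 1, 3, 5, 7
}
total                 # 16

n <- 0
while (2^n < 1000) {  # the condition is re-checked before every iteration
  n <- n + 1
}
n                     # 10, since 2^10 = 1024 is the first power of 2 above 1000
```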

Define a function

Read the Functions chapter of Advanced R by Hadley Wickham. We will revisit functions in more detail later.

DoNothing <- function() {
  return(invisible(NULL))
}
DoNothing()
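
A function that takes arguments follows the same pattern. The sketch below is a hypothetical helper, `RescaleUnit`, with one required argument and one argument that has a default value.

```r
# Hypothetical helper: rescale a numeric vector to the interval [0, 1].
RescaleUnit <- function(x, na.rm = TRUE) {
  lo <- min(x, na.rm = na.rm)
  hi <- max(x, na.rm = na.rm)
  (x - lo) / (hi - lo)     # the value of the last expression is returned
}

RescaleUnit(c(60, 65, 70))             # 0.0 0.5 1.0
RescaleUnit(c(60, NA, 70))             # 0 NA 1
RescaleUnit(x = 73:68, na.rm = FALSE)  # arguments can be matched by name
```

An explicit return() is optional here because R returns the value of the last evaluated expression.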

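In general, try to avoid loops in R by vectorizing your code. If you have to loop, try a for loop first, since its number of iterations is fixed in advance. while loops can be dangerous: if the condition never becomes false, the loop runs forever, and R will not detect this for you. The function below never terminates and keeps growing result until memory runs out, which is why the call is commented out.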

DoBadThing <- function() {
  result <- NULL
  while(TRUE) {
    result <- c(result, rnorm(100))
  }
  return(result)
}
# DoBadThing()
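
For contrast, here is a sketch of the same kind of work done safely: a preallocated for loop, its vectorized equivalent, and a bounded variant of the function above. `DoBoundedThing` and its argument `n.iter` are hypothetical names.

```r
n <- 1000
x <- seq_len(n)

# Loop version: preallocate the result, then fill it in place.
squares.loop <- numeric(n)
for (i in seq_len(n)) {
  squares.loop[i] <- x[i]^2
}

# Vectorized version: one expression, no explicit loop.
squares.vec <- x^2

identical(squares.loop, squares.vec)   # TRUE

# A bounded alternative to DoBadThing(): the iteration count is fixed up front.
DoBoundedThing <- function(n.iter = 5) {
  result <- vector("list", n.iter)
  for (i in seq_len(n.iter)) {
    result[[i]] <- rnorm(100)
  }
  unlist(result)
}
length(DoBoundedThing())               # 500
```

Collecting the pieces in a preallocated list and calling unlist() once avoids copying the growing vector on every iteration.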