Wednesday, January 23, 2013

Session # 3 Residual vs Standard Plot



ASSIGNMENT 1a: Fit ‘lm’ and comment on the applicability of ‘lm’

Plot1: Residuals vs. independent variable.
Plot2: Standardized residuals vs. independent variable.

Code:
> file<-read.csv(file.choose(),header=T)             
> file
    mileage groove
1       0     394.33
2       4     329.50
3       8     291.00
4      12    255.17
5      16    229.33
6      20    204.83
7      24    179.00
8      28    163.83
9      32    150.33


> x<-file$groove
> x
[1] 394.33 329.50 291.00 255.17 229.33 204.83 179.00 163.83 150.33
> y<-file$mileage
> y
[1]  0  4  8 12 16 20 24 28 32

> reg1<-lm(y~x)
> res<-resid(reg1)
> res
         1          2          3          4          5          6          7          8          9
 3.6502499 -0.8322206 -1.8696280 -2.5576878 -1.9386386 -1.1442614 -0.5239038  1.4912269  3.7248633

> plot(x,res)


The residuals follow a parabolic (U-shaped) pattern instead of scattering randomly around zero, so a straight-line lm fit is not appropriate for this data.
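Plot2 (standardized residuals) is not shown in the transcript above. A minimal sketch of how it could be produced, assuming the data are typed in directly instead of read with file.choose(); the data frame name tyre and the variable std_res are my own labels, not part of the original session:

```r
# Data typed in from the printed table above (tyre is an assumed name).
tyre <- data.frame(
  mileage = c(0, 4, 8, 12, 16, 20, 24, 28, 32),
  groove  = c(394.33, 329.50, 291.00, 255.17, 229.33,
              204.83, 179.00, 163.83, 150.33)
)

# Same model as in the post: mileage regressed on groove depth.
reg1 <- lm(mileage ~ groove, data = tyre)

res     <- resid(reg1)      # raw residuals (Plot1)
std_res <- rstandard(reg1)  # standardized residuals (Plot2)

# Standardized residuals against the independent variable.
plot(tyre$groove, std_res,
     xlab = "groove", ylab = "Standardized residual")
abline(h = 0, lty = 2)      # reference line at zero
```

If the standardized residuals show the same curved pattern as the raw ones, that supports the same conclusion about the straight-line fit.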


ASSIGNMENT 1b: Alpha-Pluto Data

Fit ‘lm’ and comment on the applicability of ‘lm’.

Plot1: Residuals vs. independent variable.

Plot2: Standardized residuals vs. independent variable.

Also draw a normal Q-Q plot (qqnorm) and add the reference line (qqline).

Code:
> file<-read.csv(file.choose(),header=T)
> file
   alpha     pluto
1  0.150    20
2  0.004     0
3  0.069    10
4  0.030     5
5  0.011     0
6  0.004     0
7  0.041     5
8  0.109    20
9  0.068    10
10 0.009     0
11 0.009     0
12 0.048    10
13 0.006     0
14 0.083    20
15 0.037     5
16 0.039     5
17 0.132    20
18 0.004     0
19 0.006     0
20 0.059    10
21 0.051    10
22 0.002     0
23 0.049     5


> x<-file$alpha
> y<-file$pluto
> x
 [1] 0.150 0.004 0.069 0.030 0.011 0.004 0.041 0.109 0.068 0.009 0.009 0.048
[13] 0.006 0.083 0.037 0.039 0.132 0.004 0.006 0.059 0.051 0.002 0.049
> y
 [1] 20  0 10  5  0  0  5 20 10  0  0 10  0 20  5  5 20  0  0 10 10  0  5
> reg1<-lm(y~x)
> res<-resid(reg1)
> res
         1          2          3          4          5          6          7
-4.2173758 -0.0643108 -0.8173877  0.6344584 -1.2223345 -0.0643108 -1.1852930
         8          9         10         11         12         13         14
 2.5653342 -0.6519557 -0.8914706 -0.8914706  2.6566833 -0.3951747  6.8665650
        15         16         17         18         19         20         21
-0.5235652 -0.8544291 -1.2396007 -0.0643108 -0.3951747  0.8369318  2.1603874
        22         23
 0.2665531 -2.5087486


> plot(x,res)





> qqnorm(res)



> qqline(res)
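The same steps can be run without the file dialog. This is a sketch only: the data frame name ap and the variable names fit and std_res are assumptions of mine, and the values are copied from the printed table above. It also adds the standardized-residual plot that the assignment asks for:

```r
# Alpha-pluto data typed in from the printed table (ap is an assumed name).
ap <- data.frame(
  alpha = c(0.150, 0.004, 0.069, 0.030, 0.011, 0.004, 0.041, 0.109,
            0.068, 0.009, 0.009, 0.048, 0.006, 0.083, 0.037, 0.039,
            0.132, 0.004, 0.006, 0.059, 0.051, 0.002, 0.049),
  pluto = c(20, 0, 10, 5, 0, 0, 5, 20,
            10, 0, 0, 10, 0, 20, 5, 5,
            20, 0, 0, 10, 10, 0, 5)
)

fit     <- lm(pluto ~ alpha, data = ap)  # same model as lm(y~x) above
res     <- resid(fit)                    # raw residuals
std_res <- rstandard(fit)                # standardized residuals

# Plot2: standardized residuals against the independent variable.
plot(ap$alpha, std_res, xlab = "alpha", ylab = "Standardized residual")
abline(h = 0, lty = 2)

# Normal Q-Q plot with its reference line.
qqnorm(std_res)
qqline(std_res)
```

Points that fall far from the qqline, such as observation 14 with the largest residual, are the ones to look at when judging whether lm is applicable.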



ASSIGNMENT 2: Test the Null Hypothesis using ANOVA

Code:
> file<-read.csv(file.choose(),header=T)
> file

   Chair Comfort.Level Chair1
1      I             2      a
2      I             3      a
3      I             5      a
4      I             3      a
5      I             2      a
6      I             3      a
7     II             5      b
8     II             4      b
9     II             5      b
10    II             4      b
11    II             1      b
12    II             3      b
13   III             3      c
14   III             4      c
15   III             4      c
16   III             5      c
17   III             1      c
18   III             2      c

> file.anova<-aov(file$Comfort.Level~file$Chair1)
> summary(file.anova)

            Df Sum Sq Mean Sq F value Pr(>F)
file$Chair1  2  1.444  0.7222   0.385  0.687

Conclusion: p-value = 0.687

Since the p-value is well above the usual 0.05 significance level, we cannot reject the null hypothesis. The data give no evidence that the mean comfort level differs among the three types of chairs.
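As a cross-check, the F value and p-value can be pulled out of the ANOVA table programmatically. The sketch below assumes the comfort ratings are typed in from the table above; the names chairs, Comfort, tab, f_value and p_value are mine, not from the original session:

```r
# Comfort ratings typed in from the table above, six per chair type.
chairs <- data.frame(
  Chair   = factor(rep(c("I", "II", "III"), each = 6)),
  Comfort = c(2, 3, 5, 3, 2, 3,
              5, 4, 5, 4, 1, 3,
              3, 4, 4, 5, 1, 2)
)

# One-way ANOVA using the formula interface with a data argument.
fit <- aov(Comfort ~ Chair, data = chairs)
tab <- summary(fit)[[1]]          # the ANOVA table as a data frame

f_value <- tab[["F value"]][1]    # F statistic for the Chair factor
p_value <- tab[["Pr(>F)"]][1]     # corresponding p-value

print(tab)
# Group means, to see how small the differences between chairs are.
print(tapply(chairs$Comfort, chairs$Chair, mean))
```

Passing data = chairs keeps the row label in the output short (Chair instead of file$Chair1) and avoids repeating the data frame name in the formula.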











Tuesday, January 15, 2013

Assignment 1: Binding of Matrices


The objective is to create two 3 x 3 matrices, select one column from each, and merge the two columns into a new matrix using the cbind command.

Command :-
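A minimal sketch of this task. The matrix values and the names m1, m2 and merged are made up for illustration and are not the values used in the original assignment:

```r
# Two 3 x 3 matrices; matrix() fills column by column by default.
m1 <- matrix(1:9,   nrow = 3)   # columns: (1,2,3), (4,5,6), (7,8,9)
m2 <- matrix(11:19, nrow = 3)   # columns: (11,12,13), (14,15,16), (17,18,19)

# Take column 1 of m1 and column 2 of m2, then bind them side by side.
merged <- cbind(m1[, 1], m2[, 2])

print(merged)   # a 3 x 2 matrix
```

cbind joins its arguments as columns, so both inputs must have the same number of rows; rbind is the row-wise counterpart.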







Assignment 2: Matrix Multiplication
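A short sketch of matrix multiplication in R. The matrices A and B are illustrative values of my own, not the ones from the original assignment:

```r
# Two 3 x 3 matrices, filled column by column.
A <- matrix(1:9, nrow = 3)
B <- matrix(9:1, nrow = 3)

# %*% is the matrix product (rows of A times columns of B).
prod_AB <- A %*% B

# * is element-by-element multiplication, which is a different operation.
elementwise <- A * B

print(prod_AB)
print(elementwise)
```

For %*% the number of columns of the first matrix must equal the number of rows of the second; for * the two matrices must have the same dimensions.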





Assignment 3: Regression plots using NSE past month Data

Command :-

nse<-read.csv(file.choose(),header=T)
reg<-lm(high~open,data=nse)
# To find the residuals
residuals(reg)
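The commands above stop at the residuals. A sketch of the residual plot itself, using a small made-up data frame in place of the NSE file (nse_demo and all of its values are hypothetical and only there to make the example runnable):

```r
# Hypothetical open/high prices standing in for the NSE CSV.
nse_demo <- data.frame(
  open = c(5900, 5925, 5950, 5940, 5980, 6010),
  high = c(5930, 5960, 5975, 5990, 6020, 6035)
)

reg <- lm(high ~ open, data = nse_demo)  # same formula as above
res <- residuals(reg)

# Residuals against the independent variable, with a zero reference line.
plot(nse_demo$open, res, xlab = "open", ylab = "Residual")
abline(h = 0, lty = 2)
```

With the real data, replacing nse_demo by nse gives the plot for the actual month of prices.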








Assignment 4: Generate a Normal distribution data and plot it


x<-seq(70,130,length.out=200)
y<-dnorm(x,mean=100,sd=10)
plot(x,y)
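The code above plots the normal density curve on a grid of x values. To generate random data from the same distribution and plot it, one option is rnorm with a histogram. This is a sketch: the sample size of 1000, the seed and the name sample_data are my assumptions:

```r
set.seed(1)  # assumed seed, only so the example is reproducible

# Draw 1000 random values from a normal distribution (mean 100, sd 10).
sample_data <- rnorm(1000, mean = 100, sd = 10)

# Histogram on the density scale so the theoretical curve can be overlaid.
hist(sample_data, breaks = 30, freq = FALSE,
     main = "Simulated normal data", xlab = "value")
curve(dnorm(x, mean = 100, sd = 10), add = TRUE)

print(mean(sample_data))
print(sd(sample_data))
```

The sample mean and standard deviation should come out close to 100 and 10, but not exactly equal, because the data are random draws.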




Tuesday, January 8, 2013

Intro to analyst through R project !!


What is R?

R is a language and environment for statistical computing and graphics. It is a GNU project similar to the S language and environment, which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues.

One of R's strengths is the ease with which well-designed publication-quality plots can be produced, including mathematical symbols and formulae where needed. Great care has been taken over the defaults for the minor design choices in graphics, but the user retains full control.

For these assignments we used four months of NSE data, stored in a variable z.


Assignment 1: Plotting Histogram

Code:
> x<-c(1,2,3)
> plot(x,type="h")

Histogram Plot
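Note that type="h" draws one vertical line per value, which looks like a histogram but does not count frequencies. A sketch contrasting it with hist(), using made-up values (values and h are names of my own):

```r
# Made-up data: 1 occurs once, 2 twice, 3 three times.
values <- c(1, 2, 2, 3, 3, 3)

# type = "h": one vertical line per element, height equal to the value.
plot(c(1, 2, 3), type = "h", main = "plot(type = \"h\")")

# hist(): bins the data and plots how many values fall into each bin.
h <- hist(values, breaks = seq(0.5, 3.5, by = 1),
          main = "hist()", xlab = "value")

print(h$counts)   # frequency in each bin
```

For a frequency histogram of the NSE data, hist() applied to a column of z would be the closer match to the assignment title.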


Assignment 2: Plotting point and line diagram 


Code:

> zcol3<-z[,3]
> plot(zcol3,type="b",main="NSE Data",xlab="Time",ylab="nifty")


Point and Line Plot:




Assignment 3: Scatter Diagram for High and Low data

Code:
> zcol4<-z[,4]
> plot(zcol3,zcol4,type="p",main="NSE Data",xlab="High",ylab="Low")

Scatter Diagram:



Assignment 4: Volatility of Max and Min of share value

Code:
> zcol3<-z[,3]
> zcol4<-z[,4]
> mergeddata<-c(zcol3,zcol4)
> summary(mergeddata)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4888    5660    5723    5758    5884    6021
> range(mergeddata)
[1] 4888.20 6020.75
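summary and range describe the overall spread of the merged values. A sketch of two further measures that could be used alongside them, on made-up prices (high, low and their values are hypothetical, not the NSE data):

```r
# Hypothetical daily high and low prices.
high <- c(5900, 6000, 5950)
low  <- c(5800, 5850, 5900)

merged <- c(high, low)

# Overall spread: distance between the largest and smallest value.
spread <- diff(range(merged))

# Day-by-day spread between high and low.
daily_range <- high - low

# Standard deviation of the merged values, another common spread measure.
merged_sd <- sd(merged)

print(spread)
print(daily_range)
print(merged_sd)
```

With the real data, high and low would be zcol3 and zcol4 from the code above.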



#disclaimer: This post was prepared as an assignment for the R project, BA course, VGSoM, IIT Kharagpur, spring semester, class of 2014.