ANALYSIS NOTES - A Multination Study of Socioeconomic Inequality in Expectations for Progression to Higher Education: The Role of Between School Tracking and Ability Stratification

0.1 Formula Used to Derive Primary and Secondary Effects in Current Research
0.2 Data Preperation
- 0.2.1 Creation of Principal Components
- 0.2.2 Construction of Intraclass correlations (ICC)
0.3 Descriptive Results
0.4 Mplus Probit Models
0.5 Parameter Extraction and Presentation
0.6 Plots and Correlations between Model Results and Country ICCs
0.7 Plots with highest parent education rather than highest parent occupational prestige
0.8 Plots with PISA Highest Parental Occupation

The supplementary material contains the formulars used to construct total, primary, and secondary effects as estimated in the paper. In addition, in keeping with growing calls for reproducible research (see Mesirov, 2010 Accessible reproducible research, Science, 327, 415-416) all analysis steps to obtain the final solutions presented in the paper and links to obtain the original data are presented. The paper below provides actual input code but also associated outputs. As the exact outputs are presented any warnings produced by R are maintained. These warnings were all investigated and found to not effect the results in any way. The warning are left in tact to provide a full record for researchers who wish to replicate the results.

0.1 Formula Used to Derive Primary and Secondary Effects in Current Research

The general formula is: \(\begin{equation}y^{i} = \alpha +B y^{i} + \Gamma x^{i} + \zeta_{i}\end{equation}\)

With matricies: \[ \begin{aligned}y_{i} = \begin{bmatrix} Expect\\ Ach \end{bmatrix}\end{aligned} \]

\[ \begin{aligned}\alpha = \begin{bmatrix} \beta^{Expect}_{0}\\ \beta^{Ach}_{0} \end{bmatrix}\end{aligned} \]

\[ \begin{aligned}B= \begin{bmatrix} 0 & \beta^{Ach}_{Expect}\\ 0&0 \end{bmatrix}\end{aligned} \]

\[ \begin{aligned}\Gamma= \begin{bmatrix} \beta^{Expect}_{SES}\\ \beta^{Ach}_{SES} \end{bmatrix}\end{aligned} \]

\[ \begin{aligned} \zeta = \begin{bmatrix} e^{Expect}_{i}\\ e^{Ach}_{i} \end{bmatrix} \sim N(o, \Psi ) \text{where} \Psi = \begin{bmatrix} \sigma^{2}_{ei:Expect} & 0\\ 0&\sigma^{2}_{ei:Ach} \end{bmatrix}\end{aligned} \]

The effects of interest are then derived from the coefficents from this model as follows:

Secondary effects \(=\beta^{Expect}_{SES}\)

Primary effects \(=\beta^{Ach}_{SES} \times \beta^{Expect}_{Ach}\)

Total effects \(=\beta^{Expect}_{SES} + (\beta^{Ach}_{SES} \times \beta^{Expect}_{Ach})\)

and where:

Expect = the expectations of obtaining a university level of education

Ach = academic achievement as measured by PISA achievement tests

SES = socioeconomic status of the child

0.2 Data Preperation

Data is taken from the PISA 2003 database which can be found at the ACER website. The data is read into R from the developed SPSS datafile (studentData.sav). In this step non-OECD countries are dropped.

University expectations variable is also created from the SISCED variable in the PISA database.

The R packages used in this paper are:

library(RSQLite)
library(psych)
library(multilevel)
library(MplusAutomation)
library(ppcor)
library(weights)
library(knitr)

HDI <- read.csv("~/Dropbox/Projects_Research/Social Forces/PISA2003/data/Table_1__Human_Development_Index_and_its_components.csv", na.strings = "..")
PISA <- dbConnect(SQLite(), dbname="~/Dropbox/Databases/LSAY_DATSETS/LSAY.sqlite")
#dbListTables(LSAY)
reducedData <- dbGetQuery(PISA, "SELECT CNT, SCHOOLID, AGE, GRADE, ST03Q01, HISEI, HISCED,ESCS,
                           IMMIG, OECD, SISCED, W_FSTUWT,
                          PV1MATH,PV2MATH,PV3MATH,PV4MATH,PV5MATH,
                          PV1READ,PV2READ,PV3READ,PV4READ,PV5READ,
                          PV1SCIE,PV2SCIE,PV3SCIE,PV4SCIE,PV5SCIE FROM PISA2003")
#summary(reducedData)
reducedData<- reducedData[reducedData$OECD==1,]
#Remove non-OECD from levels
reducedData$CNT<-factor(reducedData$CNT)
#Add university expectation binary variable
reducedData$UNI <- ifelse(reducedData$SISCED==5, 1, ifelse(reducedData$SISCED==9, NA, 0))
#For pooled standadization z-score HISEI and HISCED
reducedData[,6:7]<- scale(reducedData[,6:7])

To manage memory the smaller analysis dataset is saved and then the workspace is cleared. The smaller data set is then sourced for the remaining analysis.

0.2.1 Creation of Principal Components

Principals components of PISA math, verbal, and science tests were constructed as follows:

#Z-score achievement
reducedData[,grep("^PV[0-9]", names(reducedData))]<- scale(reducedData[,grep("^PV[0-9]",
                                                                        names(reducedData))]
                                                           )
#Principal Value analaysis
#Achievement scores
Ach <- list()
for ( i in 1:5){
                Ach[[i]]<-principal(reducedData[,grep(paste0("^PV", i, ".*$"),
                                                      names(reducedData))], 1, score=TRUE)$scores
                }
Ach<- do.call(cbind.data.frame, Ach)
names(Ach)<-paste0("Ach", 1:5)
#add principal components to dataframe
AchData <- data.frame(reducedData, Ach)

0.2.2 Construction of Intraclass correlations (ICC)

Country ICCs created using the psych package from a set of anova models. These ICCs were created independently for all plausable values and the resulting average across the plausible values is used in further analysis.

#ICCs
# load("C:/Users/30016475/Dropbox/Projects_Research/Social Forces/PISA2003/data/AchData19013013.RData")
ICCdata<- split(AchData, AchData$CNT)
ICC<-list()
for (i in 1:5) {
x<- lapply(ICCdata, function(x) ICC1(aov(x[,i+27]~as.factor(SCHOOLID), data=x, weights=W_FSTUWT)))
ICC[[i]]<- data.frame(do.call(rbind, x))
}

ICCs<-do.call(cbind,ICC)
ICCs<- apply(ICCs, 1, mean)
ICCs<- data.frame(ICCs=matrix(as.numeric(ICCs), nrow=30), CNT=names(ICCs))
names(ICCs)<-c("ICC", "CNT")

#Aggregate the ICCs back into the data set
AchData <- merge(AchData, ICCs, by="CNT")

0.3 Descriptive Results

Descriptive results consist of: * n.obs = Number of observations * selection = Age of first selection * Exptect.tables = Proportion of students expecting a university level of education or greater * ICCs = Intraclass correlations for between school academic ability stratification * avg.ach = Average achievement for each country on the principal components of the PISA academic ability tests.

# Human development index
HDI <- read.csv("~/Dropbox/Projects_Research/Social Forces/PISA2003/data/Table_1__Human_Development_Index_and_its_components.csv",
    na.strings = "..")
w.means <- split(AchData[, c("UNI", "CNT", "W_FSTUWT")], as.factor(AchData$CNT))
w.expect <- sapply(w.means, function(x) weighted.mean(x$UNI, x$W_FSTUWT, na.rm=TRUE) )
#Age of first selection
descriptives<-data.frame(n.obs=tapply(AchData$CNT, AchData$CNT, length),
                         strat = c(-1.043, 1.817, 1.018, -1.321, -0.138, 1.621, 1.862,
    -0.087, -1.02, -0.87, -0.474, -1.043, -0.474, 1.421, -0.302, -0.063, 0.166,
    -0.474, 0.072, 0.7, NA, 0.937, -1.043, -0.419, -0.083, -0.327, 1.621, -0.138,
    1.201, -1.321),
                         ICC=round(tapply(AchData$ICC, AchData$CNT, mean),3),
                         Expect.tables=round(w.expect,3),
                        avg.ach= round(tapply(AchData$Ach1, AchData$CNT, mean),3)
                         )
descriptives$CNT <- row.names(descriptives)
descriptives$HDI <- HDI[HDI$Abbreviation %in% descriptives$CNT, 5]
descriptives2 <- data.frame(descriptives[order(descriptives$Expect.tables), ])
DESplot <- as.table(t(descriptives2[, c("Expect.tables", "CNT")]))
barplot(DESplot, horiz = TRUE, cex.names = 0.5, las = 1, main = "University Expectations by Country")
abline(v = mean(descriptives$Expect.tables, na.rm = TRUE))

round(cor(descriptives[, c(2:5, 7)], use = "pairwise.complete.obs"),3)

##                strat    ICC Expect.tables avg.ach    HDI
## strat          1.000  0.687        -0.235  -0.130 -0.374
## ICC            0.687  1.000         0.098  -0.179 -0.355
## Expect.tables -0.235  0.098         1.000  -0.165 -0.289
## avg.ach       -0.130 -0.179        -0.165   1.000  0.726
## HDI           -0.374 -0.355        -0.289   0.726  1.000

pairs(descriptives[, c(2:5, 7)])

Descriptive Results

	n.obs	strat	ICC	Expect.tables	avg.ach	CNT	HDI
CHE	8420	-0.138	0.360	0.175	0.103	CHE	0.913
DEU	4660	1.862	0.552	0.190	0.110	DEU	0.920
AUT	4597	1.817	0.536	0.242	0.084	AUT	0.895
DNK	4218	-0.087	0.142	0.255	-0.018	DNK	0.901
NOR	4064	-1.043	0.078	0.257	-0.005	NOR	0.955
POL	4383	-0.083	0.142	0.301	-0.008	POL	0.821
GBR	9535	-1.043	0.238	0.315	0.220	GBR	0.875
SWE	4624	-0.138	0.098	0.329	0.154	SWE	0.916
FRA	4300	-0.474	0.473	0.345	0.172	FRA	0.893
BEL	8796	1.018	0.506	0.351	0.256	BEL	0.897
ISL	3350	-0.063	0.041	0.361	0.063	ISL	0.906
CZE	6320	1.621	0.462	0.372	0.342	CZE	0.873
NZL	4511	-0.419	0.162	0.388	0.319	NZL	0.919
NLD	3992	0.937	0.575	0.406	0.375	NLD	0.921
LUX	3923	0.700	0.314	0.408	-0.095	LUX	0.875
SVK	7346	1.621	0.446	0.428	-0.008	SVK	0.840
ESP	10791	-1.020	0.220	0.480	-0.022	ESP	0.885
MEX	29983	NA	0.457	0.488	-0.821	MEX	0.775
JPN	4707	-0.474	0.512	0.507	0.334	JPN	0.912
PRT	4608	-0.327	0.342	0.510	-0.258	PRT	0.816
FIN	5796	-0.870	0.044	0.513	0.507	FIN	0.892
ITA	11639	0.166	0.536	0.520	0.098	ITA	0.881
IRL	3880	-0.302	0.166	0.529	0.169	IRL	0.916
HUN	4765	1.421	0.553	0.532	-0.038	HUN	0.831
CAN	27953	-1.321	0.171	0.624	0.223	CAN	0.911
AUS	12551	-1.043	0.216	0.625	0.305	AUS	0.938
USA	5456	-1.321	0.222	0.642	-0.060	USA	0.937
GRC	4627	-0.474	0.403	0.645	-0.343	GRC	0.860
TUR	4855	1.201	0.549	0.765	-0.616	TUR	0.722
KOR	5444	0.072	0.405	0.783	0.448	KOR	0.909

0.4 Mplus Probit Models

First the R data file is saved for use in Mplus using the MplusAutomation package. Five data sets are created each with one plausable value for the Achievement principal component.

#{r mplusData}
library(MplusAutomation)
for (i in 1:5){
  capture.output(prepareMplusData(AchData[,c(1:28,34,28+i)],
                   filename=paste0("~/Dropbox/Projects_Research/Social Forces/AERJ/PISA2003_PV", i, ".dat")
                 ))
}

All models are run using a single Mplus template file. The template file is:

TITLE: Country XXX !Outcome = Expectation of university entry
                    !Treatment mediator Interaction;
DATA: FILE = "PISA2003_PV1.dat";
!type=imputation;

VARIABLE:
NAMES = CNT SCHOOLID AGE GRADE ST03Q01 HISEI HISCED ESCS IMMIG OECD SISCED W_FSTUWT
     PV1MATH PV2MATH PV3MATH PV4MATH PV5MATH PV1READ PV2READ PV3READ PV4READ PV5READ
     PV1SCIE PV2SCIE PV3SCIE PV4SCIE PV5SCIE UNI ICC PVAch;
MISSING=.;

USEVARIABLES UNI HISEI PVAch int;
categorical = UNI;
useobs CNT EQ 1;
cluster=schoolid;
weight=W_FSTUWT;

DEFINE: int = HISEI*PVAch;

analysis: type=complex; estimator = MLR; LINK = PROBIT;
DEFINE: STANDARDIZE HISEI PVAch int;

MODEL:
[UNI$1] (mbeta0);
UNI on HISEI (beta2);
UNI on PVAch (beta1);
UNI on int (beta3);
[PVAch](gamma0);
PVAch on HISEI (gamma1);
PVAch(sig2);

MODEL CONSTRAINT:
new(ind dir arg11 arg10 arg01 arg00 v1 v0
    probit11 probit10 probit01 probit00
    tie de pie ortie orde orpie Ctotal total prop);

    dir=beta3*gamma0+beta2;
    ind=beta1*gamma1+beta3*gamma1;
    arg11=-mbeta0+beta2+(beta1+beta3)*(gamma0+gamma1);
    arg10=-mbeta0+beta2+(beta1+beta3)*gamma0;
    arg01=-mbeta0+beta1*(gamma0+gamma1);
    arg00=-mbeta0+beta1*gamma0;
    v1=(beta1+beta3)^2*sig2+1;
    v0=beta1^2*sig2+1;
    probit11=arg11/sqrt(v1);
    probit10=arg10/sqrt(v1);
    probit01=arg01/sqrt(v0);
    probit00=arg00/sqrt(v0);
    ! Phi function needed below:
    tie=phi(probit11)-phi(probit10);
    de=phi(probit10)-phi(probit00);
    pie=phi(probit01)-phi(probit00);
    ortie=(phi(probit11)/(1-phi(probit11)))/
    (phi(probit10)/(1-phi(probit10)));
    orde=(phi(probit10)/(1-phi(probit10)))/
    (phi(probit00)/(1-phi(probit00)));
    orpie=(phi(probit01)/(1-phi(probit01)))/
    (phi(probit00)/(1-phi(probit00)));
    Ctotal = de + tie;
    total= ind + dir;
    prop = (tie)/ (Ctotal+.00001);

The template file is iterative over in order to create country specific syntax files. The creation and running of files is done via the following user defined function. Before creating these files relavent directories are created in the working directory; one each for every ses.var used.

{r fileCreation, warning=FALSE, message=FALSE}
fileCreation <- function(ses.var = "HISEI", link = "PROBIT") {
    temp <- readLines("~/Dropbox/Projects_Research/Social Forces/AERJ/TEMPLATEFILE19012013.inp")
    temp[10:60] <- unlist(sapply(temp[10:60], function(x) gsub("HISEI", ses.var,
        x)))
    countries <- levels(AchData$CNT)
    if (link == "LOGIT") {
        link = "LOGIT"
        temp <- gsub("PROBIT", "LOGIT", temp)
    } else {
        link = "PROBIT"
    }
    print(link)
    print(ses.var)
    for (i in 1:30) {
        x <- temp
        x[1] <- gsub("XXX", countries[i], x[1])
        x[13] <- paste0("useobs CNT EQ ", i, ";")
        cat(x, sep = "\n", file = paste0("~/Dropbox/Projects_Research/Social Forces/AERJ/",
            ses.var, "/", link, "/", countries[i], ".inp"))
    }
}
fileCreation(ses.var = "HISEI", link = "LOGIT")  # Highest occupation prestige of parents
fileCreation(ses.var="HISCED", link = "LOGIT")# Highest education level of parents
fileCreation(ses.var="ESCS", link = "LOGIT")# Highest education level of parents

fileCreation(ses.var = "HISEI", link = "PROBIT")  # Highest occupation prestige of parents
fileCreation(ses.var="HISCED", link = "PROBIT")# Highest education level of parents
fileCreation(ses.var="ESCS", link = "PROBIT")# Highest education level of parents

All the models in each of the folders are run using MplusAutomation package in R

runModels("~/Dropbox/Projects_Research/Social Forces/AERJ", recursive=TRUE)

0.5 Parameter Extraction and Presentation

Extraction of results for presentaton and further analysis was done using the following user defined functions.

#Run on mplus files one at a time
paths <- list.dirs("~/Dropbox/Projects_Research/Social Forces/AERJ/", full.names = TRUE, recursive = TRUE)
paths <- grep("(ESCS|HISEI|HISCED)/(LOGIT|PROBIT)", paths, value=TRUE)

parameterExtract<- function(path){
      files <- list.files(path,  full.names = TRUE)
      files <- grep(".out$", files, value=TRUE)
      #Extract relavent output
      temp<-grep("New/Additional Parameters", readLines(files[1]))
      lines<-temp+c(13, 14,19,21)
      output <- sapply(files, function(x) readLines(x)[lines])
      #Function for extracting relavent parameters
      f1<-function(x) unlist(strsplit(x,"[[:space:]]+"))[c(2:4)]
      results<-apply(output, c(1,2), f1)
      TIE<- matrix(results[,1,1:30],30, 3, byrow = TRUE)
      DE<- matrix(results[,2,1:30],30, 3, byrow = TRUE)
      total<- matrix(results[,3,1:30],30, 3, byrow = TRUE)
      prop<- matrix(results[,4,1:30],30, 3, byrow = TRUE)
      results<- cbind(total, TIE, DE, prop)
      results<-data.frame(results[,c(2:3,5:6,8:9, 11:12)],
                          descriptives2[order(descriptives2$CNT),c(2:3, 5,7)])
      names(results)<-c("total", "totalse", "ind", "indse", "dir", "dirse", "prop",
                        "propse","strat", "ICC", "avg.ability", "HDI")
      results<- data.frame(apply(results, 2, as.numeric))
      results$CNT<-levels(AchData$CNT)
      results
}

resultsList <-lapply(paths, parameterExtract)
names(resultsList)<- gsub("(/Users/phparker/Dropbox/Projects_Research/Social Forces/AERJ/)/(.*)/(.*)$",
                          "\\2 \\3", paths)

#Run on mplus files one at a time
paths <- list.dirs("~/Dropbox/Projects_Research/Social Forces/AERJ/", full.names = TRUE, recursive = TRUE)
paths <- grep("(ESCS|HISEI|HISCED)/LPM", paths, value=TRUE)

parameterExtract<- function(path){
      files <- list.files(path,  full.names = TRUE)
      files <- grep(".out$", files, value=TRUE)
      #Extract relavent output
      temp<-grep("New/Additional Parameters", readLines(files[1]))
      lines<-temp[1]+c(1:4)
      output <- sapply(files, function(x) readLines(x)[lines])
      #Function for extracting relavent parameters
      f1<-function(x) unlist(strsplit(x,"[[:space:]]+"))[c(2:4)]
      results<-apply(output, c(1,2), f1)
      IND <- matrix(results[,1,1:30],30, 3, byrow = TRUE)
      DIR<- matrix(results[,2,1:30],30, 3, byrow = TRUE)
      total<- matrix(results[,3,1:30],30, 3, byrow = TRUE)
      prop<- matrix(results[,4,1:30],30, 3, byrow = TRUE)
      results<- cbind(total, IND, DIR, prop)
      results<-data.frame(results[,c(2:3,5:6,8:9, 11:12)],
                          descriptives2[order(descriptives2$CNT),c(2:3, 5,7)])
      names(results)<-c("total", "totalse", "ind", "indse", "dir", "dirse", "prop",
                        "propse","strat", "ICC", "avg.ability", "HDI")
      results<- data.frame(apply(results, 2, as.numeric))
      results$CNT<-levels(AchData$CNT)
      results
}

resultsListb <-lapply(paths, parameterExtract)
names(resultsListb)<- gsub("(/Users/phparker/Dropbox/Projects_Research/Social Forces/AERJ/)/(.*)/(.*)$",
                          "\\2 \\3", paths)
resultsList <- c(resultsList, resultsListb)
resultsList2 <- lapply(resultsList, na.omit)

ESCS LOGIT

ESCS PROBIT

HISCED LOGIT

HISCED PROBIT

HISEI LOGIT

HISEI PROBIT

ESCS LPM

HISCED LPM

HISEI LPM

0.6 Plots and Correlations between Model Results and Country ICCs

inverseVar <- function(mu, omega){
  d = sum(mu/omega)
  n = sum(1/omega)
  y = d/n
  y
}

inv.weight <- function(se) {
  u.weight = 1/se
  u.weight.tot = sum(u.weight)
  a.factor = solve(u.weight.tot, length(se))
  u.weight*a.factor
}

w.pcor <- function(y,x, cov, weight){
  x.res = lm(x ~ ., data = data.frame(x,cov), weight = weight)$residuals
  y.res = lm(y ~ ., data = data.frame(y,cov), weight = weight)$residuals
  p.cor = wtd.cor(x.res, y.res, weight = weight)
  p.cor
}

#---------
#Average Effect Size
sapply(resultsList, function(x) inverseVar(x$total, x$totalse))

##    ESCS LOGIT   ESCS PROBIT  HISCED LOGIT HISCED PROBIT   HISEI LOGIT
##     0.2279349     0.1878100     0.1773621     0.1438383     0.1701240
##  HISEI PROBIT      ESCS LPM    HISCED LPM     HISEI LPM
##     0.1361035     0.1790321     0.1420209     0.1320916

sapply(resultsList, function(x) inverseVar(x$prop, x$propse))

##    ESCS LOGIT   ESCS PROBIT  HISCED LOGIT HISCED PROBIT   HISEI LOGIT
##     0.3886083     0.3929702     0.3663744     0.3723757     0.4576961
##  HISEI PROBIT      ESCS LPM    HISCED LPM     HISEI LPM
##     0.4553049     0.4090565     0.3824775     0.4642482

#--------------------------
#H1: Effect of tracking on total effect
#Unweighted
h1.uwa <- sapply(resultsList, function(x) wtd.cor(x$total, x$strat, weight = NULL) )[1,]
h1.uwb <- sapply(resultsList2, function(x) w.pcor(x$total, x$strat, cov = x[, c("ICC", "HDI", "avg.ability")],
       weight = NULL )
       )[1,]
#Weighted
h1.wa <- sapply(resultsList, function(x) wtd.cor(x$total, x$strat, weight = inv.weight(x$totalse)) )[1,]
h1.wb <- sapply(resultsList2, function(x) w.pcor(x$total, x$strat, cov = x[, c("ICC", "HDI", "avg.ability")],
       weight = inv.weight(x$totalse) )
       )[1,]
#--------------------------
#H2: Effect of Stratification on total effect
#unweighted
h2.uwa <- sapply(resultsList, function(x) wtd.cor(x$total, x$ICC, weight = NULL) )[1,]
h2.uwb <- sapply(resultsList2, function(x) w.pcor(x$total, x$ICC, cov = x[, c("strat", "HDI", "avg.ability")],
       weight = NULL )
       )[1,]
#weighted
h2.wa <- sapply(resultsList, function(x) wtd.cor(x$total, x$ICC, weight = inv.weight(x$totalse)) )[1,]
h2.wb <- sapply(resultsList2, function(x) w.pcor(x$total, x$ICC, cov = x[, c("strat", "HDI", "avg.ability")],
       weight = inv.weight(x$totalse) )
       )[1,]
#--------------------------
#H3: Effect of tracking on primary effect
#unweighted
h3.uwa <- sapply(resultsList, function(x) wtd.cor(x$prop, x$strat, weight = NULL) )[1,]
h3.uwb <- sapply(resultsList2, function(x) w.pcor(x$prop, x$strat, cov = x[, c("ICC", "HDI", "avg.ability")],
       weight = NULL)
       )[1,]
#weighted
h3.wa <- sapply(resultsList, function(x) wtd.cor(x$prop, x$strat, weight = inv.weight(x$propse)) )[1,]
h3.wb <- sapply(resultsList2, function(x) w.pcor(x$prop, x$strat, cov = x[, c("ICC", "HDI", "avg.ability")],
       weight = inv.weight(x$propse) )
       )[1,]
#--------------------------
#H4: Effect of Stratification on primary effect
#unweighted
h4.uwa <- sapply(resultsList, function(x) wtd.cor(x$prop, x$ICC, weight = NULL) )[1,]
h4.uwb <- sapply(resultsList2, function(x) w.pcor(x$prop, x$ICC, cov = x[, c("strat", "HDI", "avg.ability")],
       weight = NULL )
       )[1,]
#weighted
h4.wa <- sapply(resultsList, function(x) wtd.cor(x$prop, x$ICC, weight = inv.weight(x$propse)) )[1,]
h4.wb <- sapply(resultsList2, function(x) w.pcor(x$prop, x$ICC, cov = x[, c("strat", "HDI", "avg.ability")],
       weight = inv.weight(x$propse) )
       )[1,]
#--------------------------

Results: Unweighted and Unadjusted

	ESCS LOGIT	ESCS PROBIT	HISCED LOGIT	HISCED PROBIT	HISEI LOGIT	HISEI PROBIT	ESCS LPM	HISCED LPM	HISEI LPM
Hypothesis 1	0.375	0.483	0.315	0.471	0.416	0.498	0.273	0.279	0.384
Hypothesis 2	0.209	0.408	0.033	0.256	0.299	0.427	0.275	0.173	0.337
Hypothesis 3	0.437	0.423	0.565	0.545	0.289	0.298	0.422	0.559	0.368
Hypothesis 4	0.424	0.472	0.621	0.621	0.357	0.405	0.428	0.601	0.435

Results: Unweighted and Adjusted

	ESCS LOGIT	ESCS PROBIT	HISCED LOGIT	HISCED PROBIT	HISEI LOGIT	HISEI PROBIT	ESCS LPM	HISCED LPM	HISEI LPM
Hypothesis 1	0.340	0.308	0.414	0.417	0.294	0.293	0.168	0.269	0.229
Hypothesis 2	-0.082	0.117	-0.290	-0.125	-0.006	0.119	0.152	-0.028	0.117
Hypothesis 3	0.192	0.128	0.148	0.113	0.042	0.002	0.184	0.154	0.055
Hypothesis 4	0.220	0.307	0.409	0.424	0.265	0.319	0.249	0.398	0.289

Results: Weighted and Unadjusted

	ESCS LOGIT	ESCS PROBIT	HISCED LOGIT	HISCED PROBIT	HISEI LOGIT	HISEI PROBIT	ESCS LPM	HISCED LPM	HISEI LPM
Hypothesis 1	0.237	0.475	0.205	0.480	0.334	0.507	0.264	0.305	0.337
Hypothesis 2	0.059	0.381	-0.044	0.271	0.210	0.433	0.255	0.194	0.288
Hypothesis 3	0.454	0.453	0.576	0.569	0.303	0.301	0.471	0.580	0.350
Hypothesis 4	0.382	0.440	0.615	0.617	0.283	0.318	0.419	0.581	0.330

Results: Weighted and Adjusted

	ESCS LOGIT	ESCS PROBIT	HISCED LOGIT	HISCED PROBIT	HISEI LOGIT	HISEI PROBIT	ESCS LPM	HISCED LPM	HISEI LPM
Hypothesis 1	0.349	0.327	0.412	0.434	0.300	0.298	0.191	0.285	0.222
Hypothesis 2	-0.150	0.071	-0.276	-0.131	-0.036	0.111	0.139	-0.020	0.102
Hypothesis 3	0.264	0.206	0.163	0.133	0.137	0.095	0.259	0.210	0.162
Hypothesis 4	0.140	0.227	0.388	0.389	0.148	0.192	0.175	0.343	0.162

#Plots for weights with country specific standadization
# library(RSvgDevice)
# devSVG("revisionToAERJ/R3/ICC_total.svg")
results<-resultsList[[2]]
plot(results[,"ICC"], results[,"total"], ylab="Total Effect", xlab="ICC", ylim=c(0,.3),pch = '', main="Relationship between SES and Educational Expectations", bty = "n")
text(results[,"ICC"],results[,"total"],results[,"CNT"], cex=.6)
abline(lm(results[,"total"]~results[,"ICC"]), col="grey")
arrows(x0 = results[,"ICC"], y0 = results[,"total"]-results[,"totalse"],
       x1 = results[,"ICC"], y1 = results[,"total"]+results[,"totalse"], length=0, col="grey")

# dev.off()
# devSVG("revisionToAERJ/R3/strat_total.svg")
plot(results[,"strat"], results[,"total"], ylab="Total Effect", xlab="Stratification Index", ylim=c(0,.3),pch = '', main="Relationship between SES and Educational Expectations", bty = "n")
text(results[,"strat"],results[,"total"],results[,"CNT"], cex=.6)
abline(lm(results[,"total"]~results[,"strat"]), col="grey")
arrows(x0 = results[,"strat"], y0 = results[,"total"]-results[,"totalse"],
       x1 = results[,"strat"], y1 = results[,"total"]+results[,"totalse"], length=0, col="grey")

# dev.off()
#Indirect Effect
# devSVG("revisionToAERJ/R3/ICC_ind.svg")
plot(results[,"ICC"], results[,"prop"], ylab="Proportion of Total Effect ", xlab="ICC", ylim=c(0,.8), pch = '',
     main="Relationship between SES and Educational Expectations \n Explained by Achievement Differentials", bty = "n")
text(results[,"ICC"],results[,"prop"],results[,"CNT"], cex=.6)
abline(lm(results[,"prop"]~results[,"ICC"]), col="grey")
arrows(x0 = results[,"ICC"], y0 = results[,"prop"]-results[,"propse"],
       x1 = results[,"ICC"], y1 = results[,"prop"]+results[,"propse"], length=0, col="grey")

# dev.off()
# devSVG("revisionToAERJ/R3/strat_ind.svg")
plot(results[,"strat"], results[,"prop"], ylab="Proportion of Total Effect ", xlab="Stratification Index", ylim=c(0,.8), pch = '', bty = "n",
     main="Relationship between SES and Educational Expectations \n Explained by Achievement Differentials")
text(results[,"strat"],results[,"prop"],results[,"CNT"], cex=.6)
abline(lm(results[,"prop"]~results[,"strat"]), col="grey")
arrows(x0 = results[,"strat"], y0 = results[,"prop"]-results[,"propse"],
       x1 = results[,"strat"], y1 = results[,"prop"]+results[,"propse"], length=0, col="grey")

# dev.off()

0.7 Plots with highest parent education rather than highest parent occupational prestige

results<-resultsList[[4]]
plot(results[,"ICC"], results[,"total"], ylab="Total Effect", xlab="ICC", ylim=c(0,.3),pch = '', main="Relationship between SES and Educational Expectations", bty = "n")
text(results[,"ICC"],results[,"total"],results[,"CNT"], cex=.6)
abline(lm(results[,"total"]~results[,"ICC"]), col="grey")
arrows(x0 = results[,"ICC"], y0 = results[,"total"]-results[,"totalse"],
       x1 = results[,"ICC"], y1 = results[,"total"]+results[,"totalse"], length=0, col="grey")

# dev.off()
# devSVG("revisionToAERJ/R3/strat_total.svg")
plot(results[,"strat"], results[,"total"], ylab="Total Effect", xlab="Stratification Index", ylim=c(0,.3),pch = '', main="Relationship between SES and Educational Expectations", bty = "n")
text(results[,"strat"],results[,"total"],results[,"CNT"], cex=.6)
abline(lm(results[,"total"]~results[,"strat"]), col="grey")
arrows(x0 = results[,"strat"], y0 = results[,"total"]-results[,"totalse"],
       x1 = results[,"strat"], y1 = results[,"total"]+results[,"totalse"], length=0, col="grey")

# dev.off()
#Indirect Effect
# devSVG("revisionToAERJ/R3/ICC_ind.svg")
plot(results[,"ICC"], results[,"prop"], ylab="Proportion of Total Effect ", xlab="ICC", ylim=c(0,.8), pch = '',
     main="Relationship between SES and Educational Expectations \n Explained by Achievement Differentials", bty = "n")
text(results[,"ICC"],results[,"prop"],results[,"CNT"], cex=.6)
abline(lm(results[,"prop"]~results[,"ICC"]), col="grey")
arrows(x0 = results[,"ICC"], y0 = results[,"prop"]-results[,"propse"],
       x1 = results[,"ICC"], y1 = results[,"prop"]+results[,"propse"], length=0, col="grey")

# dev.off()
# devSVG("revisionToAERJ/R3/strat_ind.svg")
plot(results[,"strat"], results[,"prop"], ylab="Proportion of Total Effect ", xlab="Stratification Index", ylim=c(0,.8), pch = '', bty = "n",
     main="Relationship between SES and Educational Expectations \n Explained by Achievement Differentials")
text(results[,"strat"],results[,"prop"],results[,"CNT"], cex=.6)
abline(lm(results[,"prop"]~results[,"strat"]), col="grey")
arrows(x0 = results[,"strat"], y0 = results[,"prop"]-results[,"propse"],
       x1 = results[,"strat"], y1 = results[,"prop"]+results[,"propse"], length=0, col="grey")

# dev.off()

0.8 Plots with PISA Highest Parental Occupation

results<-resultsList[[6]]
plot(results[,"ICC"], results[,"total"], ylab="Total Effect", xlab="ICC", ylim=c(0,.3),pch = '', main="Relationship between SES and Educational Expectations", bty = "n")
text(results[,"ICC"],results[,"total"],results[,"CNT"], cex=.6)
abline(lm(results[,"total"]~results[,"ICC"]), col="grey")
arrows(x0 = results[,"ICC"], y0 = results[,"total"]-results[,"totalse"],
       x1 = results[,"ICC"], y1 = results[,"total"]+results[,"totalse"], length=0, col="grey")

# dev.off()
# devSVG("revisionToAERJ/R3/strat_total.svg")
plot(results[,"strat"], results[,"total"], ylab="Total Effect", xlab="Stratification Index", ylim=c(0,.3),pch = '', main="Relationship between SES and Educational Expectations", bty = "n")
text(results[,"strat"],results[,"total"],results[,"CNT"], cex=.6)
abline(lm(results[,"total"]~results[,"strat"]), col="grey")
arrows(x0 = results[,"strat"], y0 = results[,"total"]-results[,"totalse"],
       x1 = results[,"strat"], y1 = results[,"total"]+results[,"totalse"], length=0, col="grey")

# dev.off()
#Indirect Effect
# devSVG("revisionToAERJ/R3/ICC_ind.svg")
plot(results[,"ICC"], results[,"prop"], ylab="Proportion of Total Effect ", xlab="ICC", ylim=c(0,.8), pch = '',
     main="Relationship between SES and Educational Expectations \n Explained by Achievement Differentials", bty = "n")
text(results[,"ICC"],results[,"prop"],results[,"CNT"], cex=.6)
abline(lm(results[,"prop"]~results[,"ICC"]), col="grey")
arrows(x0 = results[,"ICC"], y0 = results[,"prop"]-results[,"propse"],
       x1 = results[,"ICC"], y1 = results[,"prop"]+results[,"propse"], length=0, col="grey")

# dev.off()
# devSVG("revisionToAERJ/R3/strat_ind.svg")
plot(results[,"strat"], results[,"prop"], ylab="Proportion of Total Effect ", xlab="Stratification Index", ylim=c(0,.8), pch = '', bty = "n",
     main="Relationship between SES and Educational Expectations \n Explained by Achievement Differentials")
text(results[,"strat"],results[,"prop"],results[,"CNT"], cex=.6)
abline(lm(results[,"prop"]~results[,"strat"]), col="grey")
arrows(x0 = results[,"strat"], y0 = results[,"prop"]-results[,"propse"],
       x1 = results[,"strat"], y1 = results[,"prop"]+results[,"propse"], length=0, col="grey")

# dev.off()

A Multination Study of Socioeconomic Inequality in Expectations for Progression to Higher Education: The Role of Between School Tracking and Ability Stratification

Philip D. Parker, John Jerrim, Ingrid Schoon, Herbert W. Marsh

28 September 2015