FE Success Rates 2003/04 sample Code

Note: This page is from an older version of the Data Dictionary and may therefore contain information that is now out of date. It is included here for reference only.


1 The following SPSS code is provided to illustrate the FE Success Rates Methodology 2003/04 and should be read in conjunction with that definition.

Initial processing of tri-files

*****************************************************************.
** Pre sort the hierarchy table.
*****************************************************************.


GET FILE = 'Hierarchy Table.sav'.
SORT CASES BY a09.
SAVE OUTFILE = 'Hierarchy Table.sav'.

*****************************************************************.
**Step to create startyr, actendyr and expendyr for ILR F05 xxxx/xx data
*****************************************************************.


Get file = 'Tri File xxxx/xx.sav'.

SORT CASES by A09.
MATCH FILES /FILE=*
/TABLE='Hierarchy Table.sav'
/BY A09.
RECODE A28 (0=SYSMIS) (ELSE=COPY).

* If mapcode is missing then recode mapcode to equal qual aim ref.

If mapcode = (" ")mapcode = A09.

compute startyr = 0.

Do if (A27 >= date.dmy( 1, 8, 99 ) and
A27 <= date.dmy( 31, 7, 2000)).
compute startyr=1999.

else if (A27 >= date.dmy( 1, 8,2000 ) and
A27 <= date.dmy( 31, 7, 2001)).
compute startyr=2000.

else if (A27 >= date.dmy( 1, 8,2001 ) and
A27 <= date.dmy( 31, 7, 2002)).
compute startyr=2001.

else if (A27 >= date.dmy( 1, 8,2002 ) and
A27 <= date.dmy( 31, 7, 2003)).
compute startyr=2002.

else if (A27 >= date.dmy( 1, 8,2003 ) and
A27 <= date.dmy( 31, 7, 2004)).
compute startyr=2003.

else if (A27 >= date.dmy( 1, 8,2004 ) and
A27 <= date.dmy( 31, 7, 2005)).
compute startyr=2004.

end if.

compute actendyr = 0.

Do if (A31 >= date.dmy( 1, 8, 2002 ) and
A31 <= date.dmy(31, 7, 2003)).
compute actendyr=2002.

else if (A31 >= date.dmy( 1, 8, 2003 ) and
A31 <= date.dmy(31, 7, 2004)).
compute actendyr=2003.

else if (A31 >= date.dmy( 1, 8, 2004 ) and
A31 <= date.dmy(31, 7, 2005)).
compute actendyr=2004.

else if (A31 >= date.dmy( 1, 8, 2005 ) and
A31 <= date.dmy(31, 7, 2006)).
compute actendyr=2005.

else if (A31 >= date.dmy( 1, 8, 2006) and
A31 <= date.dmy(31, 7, 2007)).
compute actendyr=2006.

else if (A31 >= date.dmy( 1, 8, 2007) and
A31 <= date.dmy(31, 7, 2008)).
compute actendyr=2007.
end if.
Execute.

compute expendyr = startyr.

Do if (A28 >= date.dmy( 1, 8, 2002 ) and
A28 <= date.dmy( 31, 7, 2003)).
compute expendyr=2002.

else if (A28 >= date.dmy( 1, 8, 2003 ) and
A28 <= date.dmy( 31, 7, 2004)).
compute expendyr=2003.

else if (A28 >= date.dmy( 1, 8, 2004 ) and
A28 <= date.dmy( 31, 7, 2005)).
compute expendyr=2004.

else if (A28 >= date.dmy( 1, 8, 2005 ) and
A28 <= date.dmy( 31, 7, 2006)).
compute expendyr=2005.

else if (A28 >= date.dmy( 1, 8, 2006 ) and
A28 <= date.dmy( 31, 7, 2007)).
compute expendyr=2006.

else if (A28 >= date.dmy( 1, 8, 2007 ) and
A28 <= date.dmy( 31, 7, 2008)).
compute expendyr=2007.

else if (A28>=date.dmy(1,8,2008)).
compute expendyr=2008.
end if.

value labels startyr, actendyr, expendyr
1997 '1997-98' 1998 '1998-99' 1999 '1999-2000'
2000 '2000-2001' 2001 '2001-2002' 2002 '2002-2003'
2003 '2003-2004' 2004 '2004-2005' 2005 '2005-2006'
2006 '2006-2007' 2007 '2007-2008' 0 'missing actual'.
Execute.

**This piece of code has been added to take out UFI qualifications.


Select if A46A<>1.
Select if A46B<>1.

SORT CASES by newref, mapcode, startyr, expendyr.
SAVE OUTFILE = 'Tri_File xxxx/xx.sav'.

**This code needs to be repeated for each years tri-file.

*****************************************************************.
**Identify qualifications expected to end in 03/04 from F05 0203 and F05 0304
*****************************************************************.


Get file ='Tri_file 2002/03.sav'.
Compute F050203=0.
If (startyr=2002 and expendyr=2003 and A34=1) F050203 =1.
Save outfile ='Tri_file 2002/03.sav'.

Get file = 'Tri_file 2003/04.sav'.
Compute F050304=0.
If (startyr=2002 and expendyr=2003) F050304=1.
Save outfile='Tri_file 2003/04.sav'.

Merge files

*****************************************************************.
** MERGE ALL FIVE ISRs TOGETHER
** If a student has the same newref and the same qualification
** aim code (mapcode) in two or more files, then the qualification in
** the 2003-04 file takes precedence over the 2002-03 file which in
** turn takes over the 2001-02 file etc.


Get file ='Tri_file 2003/04.sav'.

MATCH FILES FILE = *
/FILE = 'Tri_file 2002/03.sav'
/FILE = 'Tri_file 2001/02.sav'
/FILE = 'Tri_file 2000/01.sav'
/FILE = 'Tri_file 1999/00.sav'
/BY newref mapcode startyr expendyr.

*****************************************************************.
** Completion of code to identify mismatches across
** 2002/03 and 2003/04
*****************************************************************.


Recode F050203 (SYSMIS=0).
Recode F050304 (SYSMIS=0).

Compute match = F050304 + F050203.
Compute expmatch=0.
Compute didmatch=0.

If (F050203 = 1) expmatch=1.
If (match = 2) didmatch=1.

*****************************************************************.
** Drop cases that are not required for analysis.
*****************************************************************.

**code to ignore qualifications withdrawn before 1 November in first year of qualification adapted for new cyglh code.


SELECT IF NOT (A_EXP_A <=0 and (startyr=expendyr)).
SELECT IF NOT (A_EXP_A <=0 AND (A31 < DATE.DMY(1,11,startyr)) AND A34= 3 AND ANY (A35,0,3)).

Create categorical derived variables

*****************************************************************.
** CALC AGE AS AT 31 AUGUST OF THE YEAR THE QUALIFICATION
** STARTED (using startyr) then put this in groups using
** a 0/1 notation so easier for aggregation
*****************************************************************.


DO IF (NOT SYSMIS (L11)).
DO IF ( XDATE.MONTH(L11) LE 8).
COMPUTE l_age=startyr - XDATE.YEAR(L11).
ELSE.
COMPUTE l_age=(startyr - 1)-XDATE.YEAR(L11).
END IF.
END IF.

RECODE l_age (SYSMIS=9) (0 THRU 15=1) (16 THRU 18=2) (19 THRU 20=3)
(21 THRU 24=4) (25 THRU 59=5) (60 THRU 120=6) (ELSE=9) INTO l_ageb.

VARIABLE LABELS
l_age 'DV - Age of the learner as at 31 August of start year'
l_ageb 'DV - Age of the learner as at 31 August of start year banded'.

VALUE LABELS l_ageb
1 'Under 16'
2 '16-18'
3 '19-20'
4 '21-24'
5 '25-59'
6 '60 AND over'
9 'missing age'.

**Note 16-18 age group includes under 16 and 19&over age group includes unknowns for purpose of benchmarking.


!AGEAUG thisyr=startyr.

compute agegroup = 0.
if (L_AGEB = 1 or L_AGEB = 2) agegroup = 1.
variable labels agegroup 'age group of student'.
value labels agegroup 1 '16-18' 0 '19 & over'.

**Note 16-18 age group includes under 16 and
**19&over age group includes unknowns for purpose of benchmarking.


Execute.

*****************************************************************.
** Calculate the expected duration of the learning programme.
*****************************************************************.

** CALC NUMBER OF ACADEMIC YEARS THE QUALIFICATION SPANS
** The number of academic years the qualification spans is
** capped at 4 years.


compute noayears = (expendyr - startyr) + 1.

do if (NOT (MISSING(A28))).
+ compute noayears = ((expendyr - startyr) + 1).
else if (NOT (MISSING (A31))).
+ compute noayears = ((expendyr - startyr) + 1).
else if ((MISSING (A28)) and (MISSING (A31))).
+ compute noayears = 1.

end if.
if (noayears > 4) noayears = 4.
variable labels noayears 'Number of academic years qual spans'.
Execute.

** CALCULATE EXPECTED DURATION OF THE QUALIFICATION
** This uses expected end date where this is available
** or actual end date where this is not.
** Where both actual and expected end dates are missing,
** the qualification is assumed to have a short expected
** duration (99 days used to ensure qual is categorised
** as short, and to distingush it from 167 days used in step H)
** Note: we do not want qualifications with zero cyglh to
** have an expected duration, therefore code these as missing
** These should be the only quals with an expected duration
** of -9 (missing).


compute expdur = -9.

do if (NOT (MISSING(A28))).
+ compute expdur = CTIME.DAYS(A28) - CTIME.DAYS(A27) + 1.

else if (NOT (MISSING (A31))).
+ compute expdur = CTIME.DAYS(A31) - CTIME.DAYS(A27) + 1.

else if ((MISSING (A28)) and (MISSING (A31))).
+ compute expdur = 99.
end if.

variable labels expdur 'expected duration of the qualification'.
value labels expdur -9 'missing'.

execute.

** DEFINE CATEGORY OF DURATION OF QUALIFICATION
** Uses expdur and noayears to calculate this.
**
** DEFINITIONS
** SHORT - quals with an exp duration of at least 1 day and up to 24 weeks
** 1YEAR_1AY - quals betw 168 and 365 days in one teaching yr
** 1YEAR_2AY - quals betw 168 and 365 days in two teaching yrs
** 2YEAR_2AY - quals betw 366 and 730 days in two teaching yrs
** 2YEAR_3AY - quals betw 366 and 730 days in three teaching yrs
** 3YEAR_3AY - quals betw 731 and 1095 days in three teaching yrs
** 3YEAR_4AY - quals betw 731 and 1095 days in three teaching yrs
** 4YEARS OR MORE 4AY OR MORE - quals over 1095 days
** spanning 4 or more academic years.

**If there is a leap year during the duration of the course and extra day
**needs to be added to code. This needs to be updated for future leap years.
**THIS CODE HAS BEEN UPDATED FOR 2003/04 FOLLOWING A REQUEST BY A COLLEGE TO ENSURE
**ONLY LEARNING AIMS THAT SPAN THE 29 FEBRUARY ARE INCLUDED IN LEAP.


Compute leap =0.

DO IF (expendyr=1999 AND A27 <= 28/2/2000 AND A31 > 28/2/2000)
OR (expendyr=2000 AND noayears=2 AND A27 <= 28/2/2000 AND A31 > 28/2/2000)
OR (expendyr=2001 AND noayears=3 AND A27 <= 28/2/2000 AND A31 > 28/2/2000)
OR (expendyr=2002 AND noayears=4 AND A27 <= 28/2/2000 AND A31 > 28/2/2000)
OR (expendyr=2003 AND A27 <= 28/2/2004 AND A31 > 28/2/2004)
OR (expendyr=2004 AND noayears=2 AND A27 <= 28/2/2004 AND A31 > 28/2/2004)
OR (expendyr=2005 AND noayears=3 AND A27 <= 28/2/2004 AND A31 > 28/2/2004)
OR (expendyr=2006 AND noayears=4 AND A27 <= 28/2/2004 AND A31 > 28/2/2004).
COMPUTE leap=1.
END IF

* From preceding work, calculate the definitive duration of learning aims.


compute dur = 0.

Do if leap=0.
do if (expdur = -9).
+ compute dur = -9.
else if (expdur >= 1 and expdur <= 167).
+ compute dur = 1.
else if (expdur >= 168 and expdur <= 365 and noayears = 1).
+ compute dur = 2.
else if (expdur >= 168 and expdur <= 365 and noayears = 2).
+ compute dur = 3.
else if (expdur >= 366 and expdur <= 730 and noayears <= 2).
+ compute dur = 4.
else if (expdur >= 366 and expdur <= 730 and noayears = 3).
+ compute dur = 5.
else if (expdur >= 731 and expdur <= 1095 and noayears <= 3).
+ compute dur = 6.
else if (expdur >= 731 and expdur <= 1095 and noayears = 4).
+ compute dur = 7.
else if (expdur >= 1096).
+ compute dur = 8.
end if.
end if.

Do if leap=1.
do if (expdur = -9).
+ compute dur = -9.
else if (expdur >= 1 and expdur <= 167).
+ compute dur = 1.
else if (expdur >= 168 and expdur <= 366 and noayears = 1).
+ compute dur = 2.
else if (expdur >= 168 and expdur <= 366 and noayears = 2).
+ compute dur = 3.
else if (expdur >= 367 and expdur <= 731 and noayears <= 2).
+ compute dur = 4.
else if (expdur >= 367 and expdur <= 731 and noayears = 3).
+ compute dur = 5.
else if (expdur >= 732 and expdur <= 1096 and noayears <= 3).
+ compute dur = 6.
else if (expdur >= 732 and expdur <= 1096 and noayears = 4).
+ compute dur = 7.
else if (expdur >= 1097).
+ compute dur = 8.
end if.
end if.

variable labels dur 'category qualifications lie in'.

value lables dur -9 'missing' 1 'short'
2 '1year_1ay' 3 '1year_2ay' 4 '2year_2ay' 5 '2year_3ay'
6 '3year_3ay' 7 '3year_4ay' 8 '4year or more 4ay or more'.

*****************************************************************.
** CATEGORISE THE TYPE OF QUALIFICATION.
** Note there area no categories 11 to 14 since GNVQ precursors
** was separate but now are together.
*****************************************************************

****************************************************************
** Remove learning aims not used in analysis
***************************************************************

**Drop all qualifications that are additional units.

Select if mapqty<>"X901".

*** Remove Unitisation quals ***.

Select if mapqty<>"8009".
Select if mapqty<>"8008".

**Add a marker for UFI qualifications.
**Need to check format of A46 variable that we get**
**Hopefully these have all gone anyway!


Compute ufi=0.
If (A46A=1) or (A46B=1) OR (mapawb="UFI") ufi =1.

Variable labels ufi 'ufi quals'.
value labels ufi 0 'not ufi qual' 1 'ufi qual'.

**Drop all qualifications that are diagnostic tests

SELECT IF NOT(ANY(MAPCODE,'MP027552','MP027554','MP027555','MP027557')).

*****************************************************************.
** Create groupings of learning aims.
*****************************************************************.


recode mapqty
('0001'=3) ('0002'=1)('0003'=5) ('1414','1415','1416','1417','1418'=7) ('0017','0028'=10)
('0029','0030','0886'=10) ('0035'=7) ('0036'=15)
('0004','0005','0007'=17)('0031','0032'=18)('1327'=30)
('1413'=31)(ELSE=19)
into qtype2.

if (substr(A09,1,5)='X9000') qtype2=20.
if (A09 = '55555555' or A09 = '66666666') qtype2 = 20.
if (A09 = '77777777' or A09 = '88888888') qtype2 = 20.
if (A09 = '99999000' or A09 = '99999111') qtype2 = 20.
if (A09 = '99999222' or A09 = '99999333') qtype2 = 20.
if (A09 = '99999999') qtype2 = 20.

if (substr(A09,1,4)='X9CE') qtype2=20.
if (substr(A09,1,4)='X9CN') qtype2=20.
if (substr(A09,1,5)='X9G01') qtype2=2.
if (substr(A09,1,5)='X9G02') qtype2=4.
if (substr(A09,1,5)='X9G03' OR substr(A09,1,4)='X9G3') qtype2=6.
if (substr(A09,1,4)='X9GQ') qtype2=8.
if (substr(A09,1,4)='X9NQ') qtype2=16.
if (substr(A09,1,4)='X9VQ') qtype2=9.

variable labels qtype2 'reduced type of qual'.

value labels qtype2 1 'A' 2 'A generic' 3 'AS' 4 'AS generic'
5 'GCSE' 6 'GCSE generic' 7 'GNVQ and AVCE' 8 'GNVQ generic' 9 'GNVQ/NVQ units'
10 'GNVQ precursors' 15 'NVQ' 16 'NVQ generic' 17 'Access'
18 'HND/HNC' 19 'Other' 20 ' Other generic' 30 'Key Skills' 31 'A2'.

** Compute a field to say which spreadsheet each qualification
** will be on.
**A2 qualifications are put in spreadsheet 1 with A/AS.


compute ssheet=0.

do if ((qtype2 >=1 and qtype2 <=6) or qtype2=31).
+ compute ssheet=1.
else if (qtype2 >= 7 and qtype2 <= 10).
+ compute ssheet=2.
else if (qtype2=15 or qtype2=16).
+ compute ssheet=3.
else if (qtype2 >= 17 and dur=1).
+ compute ssheet=4.
else .
+ compute ssheet=5.
end if.

variable labels ssheet 'spreadsheet number'.

value labels ssheet 1 'A AS and GCSEs including A2' 2 'GNVQ and GNVQ pre'
3 'NVQs' 4 'Other short' 5 'Other not short'.

** Recode qtype2 into qtype3 for the summaries. This is a slightly
** aggregated version of qtype2, and even broader qualification
** type so to speak. It also combines the spreadsheet it is on
** for other qualifications so we have one BV with the two pieces
** of information instead.


RECODE qtype2 (2=1) (4=3) (6=5) (8=7) (16=15)
(ELSE=COPY) into qtype3.

if (qtype2 = 17 and ssheet = 4) qtype3 = 21.
if (qtype2 = 18 and ssheet = 4) qtype3 = 22.
if (qtype2 = 19 and ssheet = 4) qtype3 = 23.
if (qtype2 = 20 and ssheet = 4) qtype3 = 24.
if (qtype2 = 17 and ssheet = 5) qtype3 = 25.
if (qtype2 = 18 and ssheet = 5) qtype3 = 26.
if (qtype2 = 19 and ssheet = 5) qtype3 = 27.
if (qtype2 = 20 and ssheet = 5) qtype3 = 28.
if (qtype2 = 30 and ssheet = 4) qtype3 = 29.
if (qtype2 = 30 and ssheet = 5) qtype3 = 30.


variable labels qtype3 'reduced version of qtype2 with ss for other'.

value labels qtype3 1 'A level inc generic' 3 'AS level inc generic'
5 'GCSE inc generic' 7 'GNVQ/AVCE inc generic' 9 'GNVQ/NVQ units'
10 'GNVQ precursors' 15 'NVQs inc generic' 21 'Access short'
22 'HND/HNC short' 23 'Other short' 24 ' Other generic short'
25 'Access long' 26 'HND/HNC long' 27 'Other long' 28 ' Other generic long'
29 'Key Skills short' 30 'Key Skills long' 31 'A2 quals'.

Create scaled derived variables

*****************************************************************.
** DEFINITION OF ACHIEVE, HIGH, PASS, PARTIAL, COMPV, GRADE and TRANS
**
** A qual is achieved if it is flagged as 6 or 7 fully achieved, in
** ISR 16 19 22 and 25 or flagged as 1 in A35 in the ILR.
**
** COMPLETE is all completed qualifications regardless of
** outcome. COMPV is a subset of this, completed and any
** known outcome .
**
** Transfers is a new field from ISR13.


compute complete = 0.
if (A34 = 2) complete = 1.
variable labels complete 'qual is completed'.

compute achieved = 0.
if (A35 = 1) or (A35=6) or (A35=7) achieved = 1.
variable labels achieved 'qual is achieved'.

compute compv = 0.
if (A34 = 2) and (A35 =1 or A35 = 2 or A35 = 3
or A35 >= 6) compv = 1.
variable labels compv 'completed and valid outcome'.

*Code to create high grade variable - for A levels this is A or B
*for other quals this is A,B or C.

string grade (A8).
RECODE A36 (""="NA") ('A', '*', '*A', 'A*', 'AA', 'AB', 'AC', 'AD',
'AE', 'AF', 'AG', 'B', 'BB', 'BA', 'BC', 'BD', 'BE', 'BF', 'BG'='HIGH_AB')
('C', 'CA', 'CB', 'CC', 'CD', 'CE', 'CF', 'CG', 'ME', 'DS', 'DI', 'HI' = 'HIGH')
(ELSE=COPY) into grade

compute gradek = 1.
if (grade = "NA") gradek = 0.
variable labels gradek 'qual has a known grade'.

* Create a variable to represent A A/S and A2 quals.

Compute aa2as =0.
If (qtype3 = 1 or qtype3 =3 or qtype3 =31) aa2as=1.

*Finalise high grade marker.
* create code to deal with BTEC National Cert (mapqty = 1424) where DD, DM, MD or MM are high grades.
* BTEC National Dip with three grade letters is converted to A36 = "HI" in the current year tri file code.

compute high = 0.
if (achieved = 1 and grade = "HIGH_AB" and aa2as=1 ) high = 1.
if (achieved =1 and aa2as <> 1 and grade ="HIGH") high=1.
if (achieved =1 and aa2as <> 1 and grade = "HIGH_AB") high=1.
if (achieved = 1 and mapqty = "1424" and (grade = "DD" OR grade = "DM" OR grade = "MD" OR grade = "MM")) high = 1.
variable labels high 'qual achieved and high grade'.

compute trans = 0.
if (A34 = 4) trans = 1.
variable labels trans 'student has transferred onto another qual'.

** compute one / zero variables for each element of the outcome field
** exam and learn will be shown on the spreadsheet, ach_f and
** ach_nf will not be but will be used for credibility checks


compute exam = 0.
if (A35 = 4) exam = 1.
variable labels exam 'exam taken but result not yet known'.

compute learn = 0.
if (A35 = 5) learn = 1.
variable labels learn 'learning complete but exam not yet taken'.

** FIELD TO SAY WHETHER THE QUALIFICATION STARTED ON/BEFORE
** OR AFTER 1 NOVEMBER OF THE START YEAR.

compute nov = 0.
if (A27 <= date.dmy (1, 11, startyr )) nov = 1.
variable labels nov 'did qual start before 1 Nov'.
execute.

** CONTINUATION FIELDS
** Create continuing variable.

compute contin = 0.
if (A34 = 1) contin = 1.

** Compute a new field called count which is 1 for every record
** to enable sum in aggregation.

recode nov (1=1) (ELSE=1) INTO count.
variable lables count 'number of starts'.

**Compute a break variable called area1
**showing notional level and length of course
**for use in publication.
**Recode entry level to level 1


Recode map_nl ("E"="1")("4"="H")("5"="H")("M"="X")("U"="X")(else=copy).

compute area1 = 0.
if (qtype2 = 30) area1 =7.
if (dur = 1 and not (map_nl = "X") and not (qtype2=30)) area1 = 6.
if (dur > 1 and map_nl = "1" and not (qtype2=30)) area1 = 1.
if (dur > 1 and map_nl = "2"and not (qtype2=30)) area1 = 2.
if (dur > 1 and map_nl = "3"and not (qtype2=30)) area1 = 3.
if (dur > 1 and map_nl = "H"and not (qtype2=30)) area1 = 4.
if (dur > 1 and map_nl = "X"and not (qtype2=30)) area1 = 5.

**Note select area1 = 0 thru 6 to exclude keyskills.


variable labels area1 'high level area of publication falls into'.
value labels area1 6 'short nl 123H ' 1 'Long nl 1' 2 'Long nl 2'
3 'Long nl 3' 4 'Long nl H' 5 'Long nl X' 0 'short nl X' 7 'Key Skills'.

**Compute a break variable area 2
**which shows longs and shorts by broad qualification type
**for supporting data tables.


compute area2 = 0.

if (dur > 1 and ssheet = 2 and map_nl = "1") area2 = 1.
if (dur > 1 and ssheet = 3 and map_nl = "1") area2 = 2.
if (dur > 1 and ssheet = 5 and map_nl = "1" and not (qtype2=30)) area2 = 3.
if (dur > 1 and ssheet = 1 and map_nl = "2") area2 = 4.
if (dur > 1 and ssheet = 2 and map_nl = "2") area2 = 5.
if (dur > 1 and ssheet = 3 and map_nl = "2") area2 = 6.
if (dur > 1 and ssheet = 5 and map_nl = "2" and not (qtype2=30)) area2 = 7.
if (dur > 1 and ssheet = 1 and map_nl = "3") area2 = 8.
if (dur > 1 and ssheet = 2 and map_nl = "3") area2 = 9.
if (dur > 1 and ssheet = 3 and map_nl = "3") area2 = 10.
if (dur > 1 and ssheet = 5 and map_nl = "3" and not (qtype2=30)) area2 = 11.
if (dur > 1 and ssheet = 2 and map_nl = "H") area2 = 12.
if (dur > 1 and ssheet = 3 and map_nl = "H") area2 = 13.
if (dur > 1 and ssheet = 5 and map_nl = "H" and not (qtype2=30)) area2 = 14.
if (dur > 1 and ssheet = 1 and map_nl = "X") area2 = 15.
if (dur > 1 and ssheet = 2 and map_nl = "X") area2 = 16.
if (dur > 1 and ssheet = 3 and map_nl = "X") area2 = 17.
if (dur > 1 and ssheet = 5 and map_nl = "X" and not (qtype2=30)) area2 = 18.
if (dur = 1 and map_nl = "1"and not (qtype2=30)) area2 = 19.
if (dur = 1 and map_nl = "2"and not (qtype2=30)) area2 = 20.
if (dur = 1 and map_nl = "3"and not (qtype2=30)) area2 = 21.
if (dur = 1 and map_nl = "H" and not (qtype2=30)) area2 = 22.
if (dur = 1 and map_nl = "X" and not (qtype2=30)) area2 = 23.
if (qtype2 = 30) area2 = 24.

* Note area2 1-23 do not include keyskills

variable labels area2 'area of supporting data'.

value labels area2
1 'Long GNVQ nl 1' 2 'Long nvq nl 1' 3 'Long other nl 1'
4 'Long GCSEs nl 2' 5 'Long GNVQ nl 2' 6 'Long NVQ nl 2'
7 'Long other nl 2' 8 'Long AAS (including A2) nl 3'
9 'Long GNVQ nl 3' 10 'Long NVQ nl 3' 11 'Long other nl 3'
12 'Long GNVQ nl H' 13 'Long NVQ nl H' 14 'Long other nl H'
15 'Long AASGCSE nl X' 16 'Long GNVQ nl X' 17 'Long NVQ nl X'
18 'Long other nl X' 19 'Short all quals nl 1' 20 'Short all quals nl 2'
21 'Short all quals nl 3' 22 'Short all quals nl H'
23 'Short all quals nl X' 24 'Key Skills'.

Remove non LSC funded learner cohorts

*****************************************************************.
** Code to identify learner cohorts without aggregating the file
** and thus losing the learner and learning aim reference.
*****************************************************************.

*** Code to eliminate any missing areas of learning.

RECODE map_aol (SYSMIS = 98) (ELSE = COPY).

SORT CASES BY inst, ssheet, qtype2, qtype3, inspcode,mapcode, map_aol,
mapawb, maptitle, map_nl, dur, expendyr, agegroup, ufi, area1, area2, map_ssa1.

** Variable 'caseno' shows the row number as a numeric variable.

COMPUTE caseno = $casenum.

** Need to identify members of the same cohort. Having pre-sorted the variables used to define a cohort, the
** variable CHANGE is used to identify where any leaner is in a different cohort from the learner in the row above.
** This is achieved by setting the change variable to 0 for all cases and then the IF command moves the change variable to
** 1 where the learner is in the same cohort as the case in the row above. Therefore cases where change = 0 are the first
** case of the new cohort.


COMPUTE change =0.
IF (caseno = 1) change = 0.
IF (inst = LAG(inst,1)
AND ssheet = LAG(ssheet,1)
AND qtype2 = LAG(qtype2,1)
AND qtype3 = LAG(qtype3,1)
AND inspcode = LAG(inspcode,1)
AND mapcode = LAG(mapcode,1)
AND map_aol = LAG(map_aol,1)
AND mapawb = LAG(mapawb,1)
AND maptitle = LAG(maptitle,1)
AND map_nl = LAG(map_nl,1)
AND dur = LAG(dur,1)
AND expendyr = LAG(expendyr,1)
AND agegroup = LAG(agegroup,1)
AND ufi = LAG(ufi,1)
AND area1 = LAG(area1,1)
AND area2 = LAG(area2,1)) change = 1.

** The COHORT variable sequentially numbers each cohort. The start of a new cohort is identified by cases where.
** change = 0. Because this piece of code uses the LAG command, there are problems with the first row. By using 'caseno'.
** it is possible to force the value 1 for the first case - all other cases work from the LAG command without problem.


COMPUTE cohort = 0.
IF (caseno = 1) cohort = 1.
IF (caseno > 1 AND change = 0) cohort = LAG(cohort,1) + 1.
IF (caseno > 1 AND change = 1) cohort = LAG(cohort,1).

SAVE OUTFILE = 'Cohort_temp.sav'
/COMPRESSED.

** It is now necessary to find the number of council funded learners in each cohort - this is achieved by.
** the AGGREGATE command. This two column table is saved in a new file.

AGGREGATE OUTFILE = 'fundstat_agg.sav'
/BREAK cohort
/sum_fund = SUM(fundstat).

** The code re-opens the demographic file saved two commands earlier and matches in for each case the.
** number of council funded learners in their cohort.


GET FILE = 'Cohort_temp.sav'.

MATCH FILES / FILE = *
/TABLE = 'fundstat_agg.sav'
/BY cohort.

** Remove cohorts with no council funded learners, complimentary studies quals and similar quals and retain quals with.
** required expected end years.

SELECT IF ((sum_fund <> 0) AND (NOT (inspcode = "NS044051")) and
(expendyr = 2001 or expendyr = 2002 or expendyr = 2003)).

SAVE OUTFILE = 'Pre-agg.sav'
/DROP=L11.

Aggregate to create aim master file

****************************************************************.
** AGGREGATE QUALIFICATION FILES UP TO CREATE A SAV FILE
** AT COLLEGE LEVEL.
**
** This will create the raw detailed file for all colleges
** in this run.

** Pre-sort the inst lookup file.

GET FILE='Institution_lookup.sav'.
SORT CASES BY l01.
SAVE OUTFILE='Institution_lookup.sav'.

AGGREGATE OUTFILE='Aim_master.sav'
/PRESORTED
/BREAK=inst ssheet qtype2 qtype3 inspcode mapcode map_aol mapawb
maptitle map_nl dur expendyr agegroup ufi area1 area2 map_ssa1
/count = sum(count)
/nov = sum(nov)
/fundstat = sum(fundstat)
/trans=sum(trans)
/contin=sum(contin)
/complete = sum(complete)
/achieved = sum(achieved)
/gradek=sum(gradek)
/high = sum(high)
/compv=sum(compv)
/exam=sum(exam)
/learn=sum(learn)
/expglh=sum(A_EXP_A)
/expmatch=sum(expmatch)
/didmatch=sum(didmatch)
/batch=last(batch)
/date= last(date).

** MATCH ON COLLEGE TYPE DETAILS
** 1 GFEC/ TC
** 2 SFC
** 3 SPECIALIST
** 4 GFEC/ TC WITH HIGH WP FACTOR
** 5 SFC/TC WITH HIGH WP FACTOR
** MAKE SURE COLLEGE TYPE DETAILS ALSO INCLUDES LLSC NAME AND CODE.

*SORT CASES BY inst.


MATCH FILES /FILE=*
/TABLE='Institution_lookup.sav'
/BY inst.

SAVE OUTFILE='Aim_master.sav'
/COMPRESSED.

Aggregate to create demographic master file

****************************************************************.
**AGGREGATE PREAGG FILE TO PRODUCE DEMOGRAPHICS
****************************************************************.


Get file='Pre-agg.sav'.

*Remove key skills and ufi qualifications and make sure any missing blank disability lines are coded as unknown.

SELECT IF (area1 <> 7 and ufi = 0).
RECODE L14 (MISSING = 9) (else = copy).

*Recode ethnicity to avoid duplicate categories.

STRING ethncity (A20).
RECODE L12 (11,01 = "Bangladeshi") (12,06 = "Indian") (13, 07 = "Pakistani") (14, 09 = "Other Asian") (15, 02 = "Black African")
(16, 03 = "Black Caribbean") (17, 04 = "Black Other") (18, 05 = "Chinese") (19, 20, 21, 22 = "Mixed") (08, 23, 24, 25 = "White")
(98 = "Any other") (99 = "Unknown") (else = "Unknown") INTO ethncity.

*Match in full institution names and LLSC details.

sort cases by INST (a) .
MATCH FILES /FILE=*
/TABLE='Institution_lookup.sav'
/BY inst.

**Create length variable to show if qual is short or long.

String length (A6).
Recode area1 (0,6 = "short") (1,2,3,4,5 ="long") into length.

SORT CASES by L01, inst, name, col_type, llscname, area1, length, expendyr, agegroup, L13, L14, ethncity.

AGGREGATE OUTFILE=*
/PRESORTED
/BREAK=L01 inst name col_type llscname area1 length expendyr agegroup L13 L14 ethncity
/count = sum(count)
/trans=sum(trans)
/contin=sum(contin)
/complete = sum(complete)
/achieved = sum(achieved)
/compv=sum(compv).

Save outfile = 'Demographic_master.sav'
/COMPRESSED.

Creator

Analysis and MI Team

Date issued

22 March 2006

Date created

18 January 2006

Document ref.

\\records.lsc.local\NAT\23 LrngSkillsPolicyInfrastr\23-07 DataCollectAlysis\23-07-03 LrnrDataAlysisDiss\nat-fesuccessrates200304samplecode-report-22mar2006.doc

LSC office

Learning and Skills Council
Cheylesmore House Quinton Road Coventry CV1 2WT
T 0845 019 4170 F 024 7682 3675 www.lsc.gov.uk/

Last Modified: 22 Mar 06