Sunday, October 4, 2015

Adolescent Suicide Attempt Analysis in United States (Part 3)

The third step is data management, which includes coding out missing data, coding in valid data, recoding variables, creating secondary variables, and binning or grouping variables. The variables to be managed are H1TO31, H1TO35 and H1TO41, which record whether the adolescent has used drugs. For a detailed explanation of each variable, please refer to Adolescent Suicide Attempt Analysis in United States (Part 1).

Below is the SAS code:
LIBNAME mydata "/courses/d1406ae5ba27fe300" access=readonly;
DATA new2; set mydata.addhealth_pds;
LABEL 
H1TO31="During your life, how many times have you used marijuana?"
H1TO35="During your life, how many times have you used cocaine?"
H1TO41="During your life, how many times have you used any other illegal drug such as LSD, PCP, ecstasy, mushrooms, speed, ice, heroin or pills without a doctor’s prescription?"
MGRP= "Group of How many times Adolescent used Marijuana"
CGRP= "Group of How many times Adolescent used Cocaine"
OGRP= "Group of How many times Adolescent used Other Drugs"
DRUGS="Use drugs either marijuana or cocaine or other drugs";

Recall from the previous step that the data is filtered to include only adolescents who have seriously thought about suicide.
 
IF H1SU1=1; /*Seriously thinking about committing suicide*/
IF H1SU8 >= 2 and H1SU8 <=4 ;/*Somewhat Honest to Completely Honest */

1. Missing Data
These variables share the same missing-data codes: 996 (refused), 998 (don't know) and 999 (not applicable). These codes are therefore set to missing. Here is the additional SAS script:

/*Remove 996: refused, 998: Don't know and 999: Not Applicable*/
IF H1TO31 IN (996, 998, 999) THEN H1TO31 = .;
IF H1TO35 IN (996, 998, 999) THEN H1TO35 = .;
IF H1TO41 IN (996, 998, 999) THEN H1TO41 = .;

 
After this step, each variable has missing values: 29 rows for H1TO31, 8 rows for H1TO35 and 15 rows for H1TO41.
Note: The tables above are truncated to save space on the page.
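
The three recode statements follow the same pattern, so they could also be written once with an array. The sketch below is only an illustration on a few made-up rows: the data set name demo, the array name drugvars and the values under DATALINES are hypothetical and are not taken from the Add Health data.

```sas
/* Toy data: hypothetical values, not taken from Add Health */
DATA demo;
  INPUT H1TO31 H1TO35 H1TO41;
  /* Group the three drug-use variables so one loop can handle them */
  ARRAY drugvars {3} H1TO31 H1TO35 H1TO41;
  DO i = 1 TO 3;
    /* 996: refused, 998: don't know, 999: not applicable */
    IF drugvars{i} IN (996, 998, 999) THEN drugvars{i} = .;
  END;
  DROP i; /* the loop counter is not needed in the output */
  DATALINES;
3 997 996
998 12 997
997 999 40
;
RUN;

PROC PRINT DATA=demo;
RUN;
```

In the printed output the cells that held 996, 998 or 999 appear as a period (missing), while 997 and the real counts are left untouched.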

2. Group Data
The values of these variables range from 1 to 700 or 900, depending on the variable. To simplify them, the values are binned into 6 groups with the following ranges:
Group 1: 1 - 5 times
Group 2: 6 - 10 times
Group 3: 11 - 20 times
Group 4: 21 - 50 times
Group 5: > 50 times
Group 9: 997 (Has not tried)

/*Create bins for each variable*/
IF H1TO31 LE 5 THEN MGRP = 1;
ELSE IF H1TO31 LE 10 then MGRP = 2;
ELSE IF H1TO31 LE 20 then MGRP = 3;
ELSE IF H1TO31 LE 50 then MGRP = 4;
ELSE IF H1TO31 LE 900 then MGRP = 5;
ELSE IF H1TO31 = 997 then MGRP = 9;





IF H1TO35 LE 5 then CGRP = 1;
ELSE IF H1TO35 LE 10 then CGRP = 2;
ELSE IF H1TO35 LE 20 then CGRP = 3;
ELSE IF H1TO35 LE 50 then CGRP = 4;
ELSE IF H1TO35 LE 700 then CGRP = 5;
ELSE IF H1TO35 = 997 then CGRP = 9;





IF H1TO41 LE 5 then OGRP = 1;
ELSE IF H1TO41 LE 10 then OGRP = 2;
ELSE IF H1TO41 LE 20 then OGRP = 3;
ELSE IF H1TO41 LE 50 then OGRP = 4;
ELSE IF H1TO41 LE 900 then OGRP = 5;
ELSE IF H1TO41 = 997 then OGRP = 9;




Based on the tables above, among adolescents who have thought about suicide, 48.11% have used marijuana, 10.45% have used cocaine and 20.28% have used other drugs.
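
As an alternative to creating new group variables, the same six ranges could be expressed as a user-defined format and applied at reporting time. The sketch below assumes a hypothetical format name druggrp and a toy data set demo with made-up values. It also gives missing values their own category, which matters because SAS treats a missing numeric value as smaller than any number, so a test such as IF x LE 5 is also true for a missing value.

```sas
/* Hypothetical format: ranges mirror the six groups listed above */
PROC FORMAT;
  VALUE druggrp
    1 - 5    = "1: 1 - 5 times"
    6 - 10   = "2: 6 - 10 times"
    11 - 20  = "3: 11 - 20 times"
    21 - 50  = "4: 21 - 50 times"
    51 - 900 = "5: > 50 times"
    997      = "9: Has not tried"
    .        = "Missing";
RUN;

/* Toy data: hypothetical values, not taken from Add Health */
DATA demo;
  INPUT H1TO31;
  DATALINES;
3
8
15
30
200
997
.
;
RUN;

/* PROC FREQ groups rows by the formatted value; MISSING shows the missing row */
PROC FREQ DATA=demo;
  TABLES H1TO31 / MISSING;
  FORMAT H1TO31 druggrp.;
RUN;
```

With this approach the original counts stay in the data set, and the grouping can be changed later by editing the format only.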


3. Create New Variable
One new variable is created to indicate whether an adolescent who has thought about suicide uses one or more drugs.

/*New variable DRUGS, 1: uses one or more drugs, 0: doesn't use drugs*/
IF H1TO31 < 997 or H1TO35 < 997 or H1TO41 < 997 THEN DRUGS=1;
ELSE DRUGS=0;
In total, 400 adolescents, or about 50.4% of those who have thought about suicide, also use drugs (marijuana, cocaine or other drugs). For the other 49.62%, the reason for the suicidal thoughts is not clear and needs further investigation.
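
One detail worth keeping in mind with a flag like DRUGS is how SAS compares missing values: a missing numeric value sorts below every number, so a comparison such as x < 997 is true when x is missing. The sketch below shows the difference on three made-up rows. The data set demo and the variables naive and guarded are hypothetical names used only for this illustration.

```sas
/* Toy data: hypothetical values, not taken from Add Health */
DATA demo;
  INPUT H1TO31 H1TO35 H1TO41;
  /* Plain test: also true when a value is missing, because . < 997 */
  naive = (H1TO31 < 997 or H1TO35 < 997 or H1TO41 < 997);
  /* Guarded test: requires a non-missing value below 997 */
  guarded = (. < H1TO31 < 997) or (. < H1TO35 < 997) or (. < H1TO41 < 997);
  DATALINES;
3 997 997
997 997 997
. 997 997
;
RUN;

PROC PRINT DATA=demo;
RUN;
```

The first row gets 1 and the second row gets 0 under both tests. The third row, where the marijuana answer is missing, gets naive = 1 but guarded = 0, so the two definitions can give different totals whenever missing answers are present.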


Here is the rest of the code.
PROC SORT; by AID;
PROC FREQ; TABLES H1TO31 H1TO35 H1TO41 MGRP CGRP OGRP DRUGS;
RUN;


