Wednesday, June 16, 2010

Mean Graph with Standard Deviation Bars

************************************************************
Produce a mean graph with standard deviation bars with and
without showing plot points by using color commands
************************************************************

***Data is longitudinal in nature;

***Using SAS's symbol statement i=stdxxx, you can create a graph that will connect the mean of the y's on each x-axis point and produce standard deviation bars.;


symbol1 value=diamond i=std1jt color=black h=3;

***STD1jt=1 is 1 standard deviation, j=join together means, t=place tops and bottoms on the std dev bars;

axis1 order=(0 to 2000 by 500) minor=none label=(a=90 height=3 color=black "Mean Statistic");

axis2 order=(0 to 80 by 10) minor=none label=(height=3 color=black "Time (hr)");

proc gplot data=graphdata;
plot yvar*xvar/
vaxis=axis1
haxis=axis2;
run;
quit;

Using the following code, the graph produced looks like this:;




***Now, to take out the individual points, use color commands:;

symbol1 value=diamond i=std1jt cv=white ci=red co=black;

axis1 order=(0 to 2000 by 500) minor=none label=(a=90 height=3 color=black "Mean Statistic");

axis2 order=(0 to 80 by 10) minor=none label=(height=3 color=black "Time (hr)");

proc gplot data=graphdata;
plot yvar*xvar/
vaxis=axis1
haxis=axis2;
run;
quit;

***The graph produced would look like this:;



*/This works because:
COLOR=symbol-color | _style_ (or C=symbol-color) specifies a color for the entire definition, unless it is followed by a more explicit specification. For the GPLOT and GBARLINE procedures, this includes plot symbols, the plot line, confidence limit lines, and outlines.

CO=color specifies a color for staffs, boxes, and bars generated by the high-low interpolation methods: INTERPOL=HILO, INTERPOL=BOX, and INTERPOL=STD

CI=line-color|_style_ specifies a color for an interpolation line (GPLOT and GBARLINE) or a contour line (GCONTOUR).

CV=value-color|_style_ specifies a color for the plot symbols in the GPLOT procedure
/*

Friday, June 11, 2010

Merging, SQL, and other SAS features

***************************************************************************
SAS program that uses the following functions and statements:
*******first.group, last.group
*******retain statement, output statement
*******proc SQL
*******merge statement
***************************************************************************;

***This program adjusts the concentration of one group of data on day 2 of testing because of the long halflife of the analyte that causes spill-over concentration.;

***Import concentrations file and parameters file;

libname Adjust 'C:\Adjustments';

proc import datafile='C:\Concentrations_file' dbms=xls out=Adjust.Conc replace;
run;

proc import datafile='C:\Parameters_file' dbms=xls out=Adjust.Param replace;
run;

***Set up dataset with only Day 2 and Group A;
data Conc;
format Group Subject Day Time Concentration;
set Adjust.Conc;
if Group='A';
if Day =2;
proc sort data=Conc;
by subject time;
run;



***Get the first concentration of each subject;
data firstconc;
set conc;
by subject;
if first.subject;
if Concentration=. then firstconc=0;
else firstconc=Concentration;
keep subject firstconc;
proc sort data=firstconc;
by subject;
run;

***Import parameter dataset to get halflife;
data param;
set adjust.param(rename=(HL_LAMBDA_Z__HR_=halflife));
if group='A';
if Day=1;
keep subject halflife;
proc sort data=param;
by subject;
run;

***Option 1 for finding Maximum concentration for each subject (SQL): Must group by subject and output only the concentrations that match the maximum concentration ("having" command);

proc sql;
create table cmax as
select subject, max(Concentration) as cmax, Concentration as tempconc, time as tmax
from conc
group by subject
having calculated cmax=tempconc
;
quit;

proc sort data=cmax;
by subject;
run;

***Option 2 for finding Cmax (retain statement): Brings down the previous concentration and compares it with current concentration so the last observation of each subject will have the largest concentration of the group (must use "by" statment;

proc sort data=conc;
by subject time;
run;

data lastconcset;
set conc;
by subject;
if first.subject then lastconc=0;
if lastconc>concentration then lastconc=lastconc;
else lastconc=concentration;
retain lastconc;
output;
run;

data cmax2;
set lastconcset;
by subject;
if last.subject;
cmax=lastconc;
keep subject cmax;
*To get tmax, you would have to merge this back in with full concentration dataset and do where cmax=concentration. Or use Cmax instead of Tmax in the merge below;
run;



***Merge together all datasets, adjust concentration according to equation and perform Below Limit of Quantitation rule;

data mergedparam_conc;
format Group $1. Subject Day Time Concentration k Adjustedconc Concentration cmax tmax best12.;
merge firstconc param conc cmax ;
by subject;

*adjust concentration based on elimination kinetics;
k=log(2)/halflife; *elimination constant k;

*From plasma conc equation Ct=Co*e^(-kt);Adjustedconc=Concentration-(firstconc*exp(-k*time));


if Adjustedconc<0 adjustedconc="0;" style="color: rgb(0, 153, 0);">

*BQL rule;
if CharConc='BQL(<2.0)'> then do;
if time<=tmax then adjustedconc=0; else adjustedconc=.;
end;
drop tempconc;
run;

Wednesday, May 12, 2010

S-plus Graphics



********************************************
Create S-plus Graphics including
**histogram
**x-y plot w/abline
**boxplot
********************************************


## Clear variables and load datasets ##

rm(list=objects())
data<-importData("Splusdata.csv", stringsAsFactors=F, sortFactorLevels=F)

##Data is organized by Group, Subject, Time, Time after Last Dose (TAD), and Concentration

## Plot Histogram of Concentration data ##

par(mar=c(6,6,2,2),mfrow=c(2,2))
hist(data$CONC, xlab="Concentration (ng/mL)", ylab="Frequency", col=16, cex=1.25)


## Scatter plot of Concentrations by Time ##

plot(1, 1, type="n",xlim = c(0,max(data$TIME)), ylim=c(0,max(data$CONC,na.rm=T)), xlab="Time (hr)", ylab="Concentration (ng/ml)", cex=1.25) points(data$TIME,data$CONC)

## Scatter plot of Concentration by TAD ##

plot(1, 1, type="n",xlim = c(0,max(data$TAD)),ylim=c(0,max(data$CONC,na.rm=T)), xlab="Time Since Last Dose (hr)", ylab="Concentration (ng/ml)", cex=1.25)
points(data$TAD,data$CONC)
lines(lowess(data$TAD,data$CONC),col=16,lwd=3)

## Box plots by treatment group ##

boxplot(data[data$GROUP==1,"CONC"],data[data$GROUP==2,"CONC"],
names=c("A","B"),xlab="Treatment Group",
ylab="Concentration (ng/ml)",boxcol=16,cex=1.25)



**************************************
More on this:
1. Tutorial

Tuesday, April 27, 2010

Customize Legend in Figure


*******************************************************
Proc Gplot using Annotate
**Place legend depending on 'by' statement
**Resize so that 6 plots fit on one page in a word doc
*******************************************************


*****Sort concentration over time dataset that is composed of 3 analytes, 2 days, and 2 periods

proc sort data=exampledata;
by analyte day period;
run;

**********************************
Create annotation dataset for legend
**********************************


**Create dataset with only one observation for each analyte and day (the periods will be on the same graph)

data first;
set exampledata;
by analyte day;
if first.day;
keep analyte day;
run;

***Create the annotation dataset using xsys and ysys '1' (Referencing 0 to 100% of axis area), function, style, and text

data anno1;
*format variables - 'by' variables MUST be same format as in dataset. Annotation dataset variables MUST be in the proper format;

format analyte $4. day x y size best12. xsys ysys $1. function style text $8.;
set first;
by analyte day;

*retain the variables that will always remain the same;

retain xsys '1' ysys '1';

*Function 'move' will move the position to 67% of the x-axis and 20% of y-axis;

*Function 'draw' will draw a line of default type (black solid) until 70% of x-axis;

*Function 'label' will put in a marker associated with the text 'U' (a square) - font list here;


*Continue this process again to make a longer line and another marker, and then label with the text 'Period 1';

*Remember to output each line so that you will create one of each line for each observation in the 'first' dataset;


function='move'; x=67; y=20; size=.3; output;
function='draw'; x=70; output;
function='label'; x=72; style='markere'; text='U'; output;
function='draw'; x=75; style=''; text=''; output;
function='label'; x=78; style='markere'; text='U'; output;
function='draw'; x=81; style=''; text=''; output;
function='label'; x=88; style='hwcgm001';text='Period 1';size=.4; output;

function='move'; x=67; y=13; size=.3; style=''; text=''; output;
function='draw'; x=70; style=''; text=''; output;
function='label'; x=72; style='marker'; text='W'; output;
function='draw'; x=75; style=''; text=''; output;
function='label'; x=78; style='marker'; text='W'; output;
function='draw'; x=81; style=''; text=''; output;
function='label'; x=88; style='hwcgm001';text='Period 2';size=.4; output;

keep analyte day function x y xsys ysys size style text;
run;

**Change the y-axis location for day=14 for both legend symbols;
data annolegend;
set anno1;
if day=14 then do;
if y=20 then y=90;
if y=13 then y=83;
end;
run;

**Excerpt from what annolegend dataset looks like:





proc sort data=exampledata;
by analyte day;
run;

proc sort data=annolegend;
by analyte day;
run;

*******************************************
***Create the plots and output to CGM file;
*******************************************


filename outplot 'C:\Annotatedlegend\';

**Resize using hsize and vsize so that 6 plots will fit on one word doc page;

goptions
reset=ALL
gunit=pct
fby=hwcgm001
hby=3

ftext=hwcgm001
htext=3
dev=CGMOFML
gsfmode=replace
gsfname=outplot
hsize=3in vsize=2.32in;
**Make sure to set up the symbol statements to correspond to the annotated legend you just made!;

symbol1 value=square i=j color=black h=3;
symbol2 value=dot i=j color=black h=3;

**Here the axis is being cut to 68% to aid with resizing. Also use log scale;

axis1 offset=(,3) value=(a=90) length=68 pct
logbase=10 logstyle=expand label=(a=90 color=black "Concentration (ng/mL)");


axis2 order=(0 to 12 by 2) label=(height=3 color=black "Time (hr)") minor=none;

proc gplot data=exampledata;
by analyte day;
plot concentration*time=period/nolegend
vaxis=axis1
haxis=axis2
anno=annolegend;
run;
quit;

filename outplot clear;

See Also:
1. Good explanation of annotate facility

Tuesday, April 20, 2010

Embedded Graph


********************************************
Embed a smaller graph into the graph window
of a larger graph using Proc Greplay
*******************************************


****Data used is the average concentration at each timepoint, grouped by dose****

***plot data on linear scale, main graph;

goptions
reset=ALL
gunit=pct
colors=(black)
ftext=hwcgm001
ftitle=hwcgm009
htext=3
noborder
maxcolors=255
device=CGMOFML
hsize=10 in vsize=8;

symbol1 value=diamond i=j color=black h=2.5;
symbol2 value=triangle i=j color=black h=2.5;
symbol3 value=circle i=j color=black h=2.5;
symbol4 value=square i=j color=black h=2.5;
symbol5 value=dot i=j color=black h=2.5;
symbol6 value=diamond i=j color=black h=2.5;
symbol7 value=triangle i=j color=black h=2.5;
symbol8 value=circle i=j color=black h=2.5;

axis1 value=(a=90 h=3pct "0" "200" "400" "600" "800") order=(0 to 800 by 200) minor=none label=(a=90 color=black "Mean Concentration") offset=(1);

axis2 order=(0 to 80 by 20) minor=none label=(color=black "Time (hr)") offset=(1);

legend1 down=11 position=(right bottom inside) mode=share label=none value=(justify=left "10mg " "30mg " "90mg " "200mg " "500mg " "1000mg " "1500mg " "2000mg " );
*down=11 will push the legend up from it's original position at the bottom right;

proc gplot data=graphdata;
plot concentration*time=dose/
vaxis=axis1
haxis=axis2
legend=legend1;
run;
quit;

***2nd plot in log scale to be embedded;

symbol1 value=diamond i=j color=black h=2.5;
symbol2 value=triangle i=j color=black h=2.5;
symbol3 value=circle i=j color=black h=2.5;
symbol4 value=square i=j color=black h=2.5;
symbol5 value=dot i=j color=black h=2.5;
symbol6 value=diamond i=j color=black h=2.5;
symbol7 value=triangle i=j color=black h=2.5;
symbol8 value=circle i=j color=black h=2.5;

axis1 order=(.5,5,50,500) logbase=10 logstyle=expand label=none;

axis2 order=(0 to 80 by 20) minor=none label=none;

goptions hsize=5 in vsize=3 in noborder htext=3;

proc gplot data=graphdata;
plot concentration*time=dose/nolegend noframe
vaxis=axis1
haxis=axis2;
run;
quit;

*******Embed the second graph into the upper right area of the first graph;

filename outplot 'Z:\Embedded\Embeddedgraph.cgm';

goptions hsize=10 in vsize=8 in display gsfmode=replace gsfname=outplot;

proc greplay igout=work.gseg nofs;
tc work.templts;
tdef newtemp

1/llx=1 lly=1
ulx=1 uly=99
urx=99 ury=99
lrx=99 lry=1

2/llx=50 lly=50
ulx=50 uly=94.0
urx=98.2 ury=94.0
lrx=98.2 lry=50
clip;

template newtemp;
treplay 1:gplot 2:gplot1;
*make sure graphs are numbered properly from output screen;
run;
quit;

filename outplot clear;

**See also:
1. SAS Example

Thursday, April 8, 2010

Bioequivalence Using Proc Mixed

*****************************************************
Calculate Bioequivalence of Food Effect using Proc Mixed
**dataset has a parameter calculated for each subject
**can be cross-over or parallel study
****************************************************;

**create log data;
data logdata;
set dataset;
logparam=log(parameter);
keep Subject Parameter LOGparam fast period;
proc sort data=logdata;
by fast period subject;
run;
quit;

**ods output to data set, create mixed effect model ;
ods listing close;
ods output lsmeans=lsmeans_dataset;
ods output estimates=difflsmeans_dataset;

proc mixed data=logdata;
class fast period subject;
*class statement are categorical variables;
model LOGparam= fast;
*model statement is Y=Fixed Effects;
repeated period/subject=subject type=cs;
*means that for each period, there are repeated subjects, the covariance matrix type is compound symmetry;
lsmeans fast/pdiff cl alpha=.1;
*finds the least square means of groups in "fast", along with 90% CI;
estimate 'Fasted vs Fed' fast 1 -1/ cl alpha=0.1 E;
*estimates the difference between the groups in "fast" using 90% CI;
run;
ods output close;
ods listing;

LSM Table:



Estimates Table:



proc sql;
create table differences as
*create a table called "differences"
select referenceset.fast as reference,
referenceset.estimate as reflsm, exp(referenceset.estimate) as refgeolsm,
testset.fast as test, testset.estimate as testlsm, exp(testset.estimate) as testgeolsm,
testset.estimate-referenceset.estimate as Difference,
(calculated testgeolsm/calculated refgeolsm)*100 as Ratio
*select variables from dataset to calculate geometric least square means and ratio
from
(select * from lsmeans_dataset as referenceset where referenceset.fast='Y') inner join
(select * from lsmeans_dataset as testset where testset.fast='N')
on referenceset.effect=testset.effect
*pull individual rows from each dataset, linked by the same 'effect' variable
;
quit;

Differences Table Created using SQL:


**find 90%confidence intervals;

data CI;
set difflsmeans_dataset;
CI90_lower=100*exp(lower);
CI90_upper=100*exp(upper);
keep CI90_lower CI90_upper;
run;

***merge all data together;
data table;
merge differences ci;
run;

Final Table Output:



***Notes:
1. How does Proc Mixed compare to GLM and other statements?

Stickplot


****************************************************
Stickplot created by using SAS gplot procedure
***************************************************;


filename outplot "Z:\SAS\Stickplot.cgm";
goptions
reset=ALL
gunit=pct
colors=(black)
ftext=hwcgm001
ftitle=hwcgm009
htitle=2
htext=2
maxcolors=255
dev=cgmof97l
rotate=LANDSCAPE
gsfmode=replace
gsfname=outplot;


symbol1 value=diamond i=j c=black repeat=100;

axis1 label=(a=90 height=2.5 color=black) ;
axis2 label=none offset=(15,15);

proc gplot data=stickplotdata;
plot Yvar*Xvar=subject/noframe
vaxis=axis1
haxis=axis2
nolegend
;

run;
quit;

**Notes: unfortunately, there seems to be an issue with the frame when offsetting and outputting to CGM file at the same time. The "noframe" option gets rid of the frame altogether. Otherwise, I'm not sure how to resolve this (not a problem when using ODS PDF output)**;

Proc Gplot


*************************************************************************
Proc Gplot statement including Annotation
**outputs to CGM file
**specified Symbol, Axis, and Legend options
**uses Annotate option to add "N=" with number of observation in statistic
************************************************************************;


proc means data=dataset noprint;
by Time;
var Concentration;
output out=graphdata
n=N mean=Mean std=Std min=Min median=Median max=Max cv=CV;
run;
*the original "dataset" is set up with variables for subject, time, and concentration;

filename outplot 'Z:\SAS\Gplot\MeanGraph.cgm';
goptions reset=ALL
gunit=pct
ftext=hwcgm001
ftitle=hwcgm009
htitle=2
htext=2
noborder
maxcolors=255
dev=cgmof97l
rotate=LANDSCAPE
gsfmode=replace
gsfname=outplot;


symbol1 value=diamond i=j color=green h=2.5;
symbol2 value=diamond i=j color=red h=2.5;
symbol3 value=square i=j color=blue h=2.5;
symbol4 value=square i=j color=orange h=2.5;


legend1 down=4 position=(top right inside) frame shape=symbol(7,1.5) mode=protect label=none;

axis1 label=(a=90 height=2.5 color=black "Mean Statistic");
axis2 order=(0 to 30 by 5) label=(height=2.5 color=black "Time (hr)");


data annoplot;
length color function $8;
retain color 'black' function 'label' xsys ysys '2' hsys '1' size 2.5; *set up the coordinates ;
set graphdata;
where time=24;
*this is the last time point in the dataset so will use this y-value;

x=25;
*set up x value at one point after the last value;

y=mean;
text2=put(n,$3.);
*make the N value character;

text=compress('N='||text2);
*put together the character value with the "N=" string;

position='6';
*puts the string to the right ;

run;

proc gplot data=graphdata;
plot mean*time=group/
vaxis=axis1
haxis=axis2
legend=legend1
anno=annoplot;
run;
quit;


*For Further examples, see:
1. SAS examples
2. Robert Allison's SAS Graph Examples

Proc Boxplot




******************************************************
Proc Boxplot code including:
**outplot to CGM file
**axis options to split labels onto two lines
**insetgroup option to include statistics on top
**adjusting size and color of boxes
******************************************************


filename outplot 'Z:\SAS\Boxplot';
goptions reset=ALL
gunit=pct
ftext=hwcgm001
ftitle=hwcgm009
htitle=2
htext=2
noborder
maxcolors=255
dev=cgmof97l
rotate=LANDSCAPE
gsfmode=replace
gsfname=outplot;


axis1 label=none value=(t=1 j=c "Group 1" j=c "4/1/10"
t=2 j=c "Group 2" j=c "4/2/10"
t=3 j=c "Group 3" j=c "4/3/10"
t=4 j=c "Group 4" j=c "4/4/10"
t=5 j=c "Group 5" j=c "4/5/10");


proc boxplot data=boxplotdata;
plot (Var1 Var2)*XVar /cboxes=black boxwidth=10 nohlabel haxis=axis1;
insetgroup n mean (5.1)/ pos=top cfill=yellow;

run;

**This code would produce another boxplot with the Y-axis variable "Var2" as well. **

For Further examples see the following:
1. SAS support examples
2. PDF "Customizing Your Own Box Plot"