Course Materials for BIO/CSE/STAT 597/8F
Spring 2002

  • What is BIO/CSE 597F?
  • Nucleic Acids Research 2002 Database Issue
  • Study guide for exam
  • Schedule for class presentations

  • lecture 2
  • lecture 3 (without preliminary data)
  • supplementary material on SAGE (from NCBI)
  • more supplementary material on SAGE (from NCBI)
  • paper on SAGEmap
  • paper on SAGE data errors
  • SAGEmap Web site at NCBI
  • SAGE "home page" at Johns Hopkins
  • query yeast SAGE data at Stanford


  • introduction to spotted arrays
  • introduction to affy
  • normalization and missing values
  • smoothing and lowess
  • filtering and other transformations
  • homework 1 (due Feb. 6)
  • dimension reduction
  • PCA and plotting
  • more dimension reduction
  • multi-dimensional scaling (Susan Holmes)
  • clustering
  • clustering II
  • clustering III
  • Homework 2: assignment, data and groups
  • Robert Tibshirani, Guenther Walther and Trevor Hastie. "Estimating the number of clusters in a dataset via the Gap statistic." Here.
  • A. Ben-Hur, A. Elisseeff, and I. Guyon. "A Stability Based Method for Discovering Structure in Clustered Data." Here.
  • K. Y. Yeung, C. Fraley, A. Murua, A. E. Raftery and W. L. Ruzzo. "Model-based clustering and data transformations for gene expression data." Here.
  • F. Bartolucci and F. Chiaromonte. "Clustering of expression data from microarrays: a mixture-based approach." Here.
  • Microarray Gene Expression Database Group website
  • working with a response

    Combining expression data and genomic sequence data:

  • introductory lecture
  • Readings on binding-site clusters: Berman et al. and Krivan and Wasserman
  • Detecting binding-site clusters, a research problem: references
  • Regulatory Sequence Analysis Tools: paper, website, details on k-mer matches and details on spaced dyads
  • PROSPECT: paper and website
  • INCLUSive (INtegrated CLustering, Upstream Sequence retrieval and motif Sampler): website
  • another lecture
  • lecture on DNA sequence patterns
  • Is a given sequence pattern associated with co-expression? paper
  • Other websites: Gibbs Motif Sampler, Motif Sampler, MEME.


  • 2D gel databases: website
  • Database of Interacting Proteins: website
  • Biomolecular Interaction Network Database: paper and website
  • MIPS Database: paper and website
  • Expression data and protein-protein interactions: paper
  • Subcellular location of yeast proteome: paper
  • Expression level and subcellular location: paper