Research Interests:

My general areas of interest include Statistical Pattern Recognition, Machine Learning, and Data Mining. My doctoral research focuses on classical problems in machine learning such as clustering, feature selection, dimensionality reduction, classification, and density estimation. My approach to these problems relies heavily on statistical and probabilistic methods. My application areas include bioinformatics, machine vision, and text mining. A significant part of my thesis was devoted to gene expression microarray analysis, with the goal of identifying groups of genes that exhibit coherent expression patterns from which gene function may be inferred. Additional research interests include Latent Semantic Analysis (as applied to text mining) and financial modeling.

Publications:

• Ph.D. Thesis: Model-Based Linear Manifold Clustering. [ Thesis ] [ Abstract ] [ Defense Slides ]
• Linear manifold clustering in high dimensional spaces by stochastic search (with Robert Haralick), Pattern Recognition (2007), vol. 40(10), pp. 2672-2684. [ PDF of a recent version ]
• Linear Manifold Correlation Clustering (with Robert Haralick), International Journal of Information Technology and Intelligent Computing (2007), vol. 2, no. 2. Invited paper. [ PDF of a recent version ]
• Model-based Subspace Correlation Clustering (with Robert Haralick), Pattern Recognition (2008). Accepted pending minor revision. [ PDF of a recent version ]
• Modeling High-Dimensional Probability Distributions via Linear Manifold Clusters (with Robert Haralick), Pattern Recognition Letters (2008). Pending revision. [ PDF of a recent version ]
• Mining Subspace Correlations (with Robert Haralick), In Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2007), pp. 335-342. [ PDF of a recent version ]
• Exploiting the Geometry of Gene Expression Patterns for Unsupervised Learning (with Robert Haralick), In Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), vol. 2, pp. 670-674. [ PDF of a recent version ]
• Linear Manifold Clustering (with Robert Haralick), In Proceedings of the International Conference on Data Mining and Machine Learning (MLDM 2005), Lecture Notes in Computer Science, Springer Verlag LNAI 3587, pp. 132-141. [ PDF of a recent version ]
• Linear Manifold Embedding of Pattern Clusters (with Robert Haralick), DIMACS Workshop on Detecting and Processing Regularities in High Throughput Biological Data, 2005. [ Abstract/PDF ] [ Slides/PDF ]
• The EM Algorithm as a Lower Bound Optimization Technique, Technical Report TR-2006001, Graduate Center, City University of New York, 2006. [ Link to PDF ]
• Independent Component Analysis: An Introduction, Technical Report TR-2007007, Graduate Center, City University of New York, 2005. [ Link to PDF ]
• Fast Relational Matching, Technical Report TR-1401, ALPHATECH Inc., 2003. [ PDF of selected pages ]
• The Satisfiability Problem- From The Theory of NP-completeness to State-Of-The-Art SAT Solvers, Technical Report TR-2007008, Graduate Center, City University of New York, 2003. [ Link to PDF ]

Surveys & Research Proposals:

• Research Proposal: A Spectral-Graph-Pattern-Matching Approach to the SAT Problem [ PDF ]
• Introduction to the Theory of NP-Completeness [ PDF ]
• Probabilistic Computations and Complexity Classes [ PDF ]
• Hardness of Approximations [ PDF ]
• The Complexity of Some Problems in Cryptography [ PDF ]

Linear Manifold Clustering Source Code: Download Package (Tar file)

The package contains a "readme" file describing how to install and use Linear Manifold Clustering (LMCLUS).