Research Interests:
My general areas of interest include Statistical Pattern Recognition, Machine Learning, and Data Mining. My doctoral research focuses on classical problems in machine learning such as clustering, feature selection, dimensionality reduction, classification, and density estimation. My approach to these problems relies heavily on statistical and probabilistic methods. My application areas include bioinformatics, machine vision, and text mining. A significant part of my thesis is devoted to the analysis of gene expression microarray data, with the goal of identifying groups of genes that exhibit coherent expression patterns from which gene function may be inferred. Additional research interests include financial modeling and text mining with Latent Semantic Analysis.
Publications:
| Ph.D. Thesis: Model-Based Linear Manifold Clustering. [ Thesis ] [ Abstract ] [ Defense Slides ] | |
| Linear manifold clustering in high dimensional spaces by stochastic search (with Robert Haralick), Pattern Recognition (2007), vol. 40(10), pp 2672-2684. [ PDF of a recent version ] | |
| Linear Manifold Correlation Clustering (with Robert Haralick), International Journal of Information Technology and Intelligent Computing (2007), vol. 2, no. 2. Invited paper. [ PDF of a recent version ] | |
| Model-based Subspace Correlation Clustering (with Robert Haralick), Pattern Recognition (2008). Accepted pending minor revision. [ PDF of a recent version ] | |
| Modeling High-Dimensional Probability Distributions via Linear Manifold Clusters (with Robert Haralick), Pattern Recognition Letters (2008). Pending revision. [ PDF of a recent version ] | |
| Mining Subspace Correlations (with Robert Haralick), In Proceedings of the IEEE Symposium on Computational Intelligence and Data Mining (CIDM 2007), pp 335-342. [ PDF of a recent version ] | |
| Exploiting the Geometry of Gene Expression Patterns for Unsupervised Learning (with Robert Haralick), In Proceedings of the 18th International Conference on Pattern Recognition (ICPR 2006), vol. 2 pp 670-674. [ PDF of a recent version ] | |
| Linear Manifold Clustering (with Robert Haralick), In Proceedings of the International Conference on Data Mining and Machine Learning (MLDM 2005), Lecture Notes in Computer Science, Springer Verlag LNAI 3587 pp 132-141. [ PDF of a recent version ] | |
| Linear Manifold Embedding of Pattern Clusters (with Robert Haralick), DIMACS Workshop on Detecting and Processing Regularities in High Throughput Biological Data, 2005. [ Abstract/PDF ] [ Slides/PDF ] | |
| The EM Algorithm as a Lower Bound Optimization Technique, Technical Report TR-2006001, Graduate Center, City University of New York, 2006. [ Link to PDF ] | |
| Independent Component Analysis: An Introduction, Technical Report TR-2007007, Graduate Center, City University of New York, 2005. [ Link to PDF ] | |
| Fast Relational Matching, Technical Report TR-1401, ALPHATECH Inc., 2003. [ PDF of selected pages ] | |
| The Satisfiability Problem: From the Theory of NP-Completeness to State-of-the-Art SAT Solvers, Technical Report TR-2007008, Graduate Center, City University of New York, 2003. [ Link to PDF ] | |
Surveys & Research Proposals:
| Research Proposal: A Spectral-Graph-Pattern-Matching Approach to the SAT Problem [ PDF ] | |
| Introduction to the Theory of NP-Completeness [ PDF ] | |
| Probabilistic Computations and Complexity Classes [ PDF ] | |
| Hardness of Approximations [ PDF ] | |
| The Complexity of Some Problems in Cryptography [ PDF ] | |
Linear Manifold Clustering Source Code: Download Package (Tar file)
The package contains a "readme" file describing how to install and use Linear Manifold Clustering (LMCLUS).