About Me
I am currently an assistant professor in the Department of Mathematical Sciences at the Korea Advanced Institute of Science and Technology (KAIST). Previously, I was an ML scientist at AWS AI Labs. Before that, I was a Neyman Visiting Assistant Professor in the Department of Statistics at UC Berkeley, where I was very fortunate to be supervised by Bin Yu. Prior to that, I was a postdoctoral fellow at UC Berkeley’s Foundations of Data Analysis (FODA) Institute and Berkeley Institute for Data Science (BIDS). I obtained my PhD in Statistics at the University of Chicago, where I was very fortunate to be advised by Rina Foygel Barber. My PhD research was supported by the Kwanjeong Fellowship.
My research centers on high-dimensional statistics and machine learning, with a focus on sparse and low-rank optimization, local graph clustering, interpretable machine learning, and domain adaptation under distribution shifts. I am also interested in applying statistical and optimization methods to diverse scientific areas such as medical imaging, population genetics, and cosmology.
Education
- Ph.D. in Statistics, 2018 (Advisor: Rina Foygel Barber)
- M.S., Statistics, 2013 (Advisor: Byeong U. Park)
- B.S., Statistics, B.A., Economics, Minor in Mathematics, 2011
Preprints / Publications
- Variance-reduced zeroth-order methods for fine-tuning language models. Tanmay Gautam, Youngsuk Park, Hao Zhou, Parameswaran Raman, and Wooseok Ha. Accepted at the 41st International Conference on Machine Learning (ICML 2024). arXiv:2404.08080.
- Prominent roles of conditionally invariant components in domain adaptation: theory and algorithms. Keru Wu*, Yuansi Chen*, Wooseok Ha*, Bin Yu. arXiv:2309.10301. Submitted.
- The effect of SGD batch size on autoencoder learning: sparsity, sharpness, and feature learning. Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu. arXiv:2308.03215. Submitted.
- Gradient dynamics of single-neuron autoencoders on orthogonal data. Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu. OPT 2022: Optimization for Machine Learning (NeurIPS 2022 Workshop).
- Interpreting and improving deep-learning models with reality checks. Chandan Singh*, Wooseok Ha*, Bin Yu. International Workshop on Extending Explainable AI Beyond Deep Models and Classifiers. arXiv:2108.06847.
- Adaptive wavelet distillation from neural networks through interpretations. (Package.) Wooseok Ha, Chandan Singh, Francois Lanusse, Srigokul Upadhyayula, Bin Yu. 35th Annual Conference on Neural Information Processing Systems (NeurIPS 2021). arXiv:2107.09145.
- Fast and flexible estimation of effective migration surfaces. (Package, Reproducible Code.) Joseph H. Marcus*, Wooseok Ha*, Rina Foygel Barber, John Novembre. eLife. bioRxiv:2020.08.07.242214.
- Transformation importance with applications to cosmology. Chandan Singh*, Wooseok Ha*, Francois Lanusse, Vanessa Boehm, Jia Liu, Bin Yu. ICLR 2020 Workshop on Fundamental Science in the era of AI (Spotlight talk). arXiv:2003.01926.
- Statistical guarantees for local graph clustering. Wooseok Ha*, Kimon Fountoulakis*, Michael W. Mahoney. 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). Journal version: Journal of Machine Learning Research. arXiv:1906.04863.
- An equivalence between critical points for rank constraints versus low-rank factorizations. Wooseok Ha, Haoyang Liu, and Rina Foygel Barber. SIAM Journal on Optimization. arXiv:1812.00404.
- Alternating minimization based framework for simultaneous spectral calibration and image reconstruction in spectral CT. Wooseok Ha, Emil Y Sidky, Rina Foygel Barber, Taly Gilat Schmidt, and Xiaochuan Pan. 2018 IEEE Nuclear Science Symposium and Medical Imaging Conference.
- Estimating the spectrum in computed tomography via Kullback-Leibler divergence constrained optimization. Wooseok Ha, Emil Y Sidky, Rina Foygel Barber, Taly Gilat Schmidt, and Xiaochuan Pan. Medical Physics. arXiv:1805.00162. Selected as an Editor’s Pick.
- Alternating minimization and alternating descent over nonconvex sets. (Code.) Wooseok Ha and Rina Foygel Barber. arXiv:1709.04451.
- Gradient descent with nonconvex constraints: local concavity determines convergence. (Code.) Rina Foygel Barber and Wooseok Ha. Information and Inference. arXiv:1703.07755.
- X-ray spectral calibration from transmission measurements using Gaussian blur model. Wooseok Ha, Emil Y Sidky, and Rina Foygel Barber. Proceedings of the SPIE conference on Medical Imaging 2017: Physics of Medical Imaging.
- Trimmed conformal prediction for high-dimensional models. Wenyu Chen, Zhaokai Wang, Wooseok Ha, Rina Foygel Barber. arXiv:1611.09933.
- Robust PCA with compressed data. Wooseok Ha and Rina Foygel Barber. 29th Annual Conference on Neural Information Processing Systems (NIPS 2015).
Extended Abstracts
- Simultaneous spectral scaling and basis material map reconstruction for spectral CT with photon-counting detectors. Emil Y Sidky, Taly Gilat Schmidt, Rina Foygel Barber, Wooseok Ha, and Xiaochuan Pan. 4th International Conference on Image Formation in X-ray Computed Tomography (CT meeting 2016).
Teaching
- MAS555/DS512: Advanced Statistics (Spring 2024) at KAIST.
- STAT88: Probability and Mathematical Statistics in Data Science (Fall 2020) at UC Berkeley.
- STAT158: Design and Analysis of Experiments (Spring 2020) at UC Berkeley.