research
I’m a part-time theoretician and a part-time empiricist. I’m interested in how machine learning works.
goals
My long-term goal is to make AI responsible — accessible, sustainable, and righteous.
Currently, I focus on efficient AI, where I aim to characterize the fundamental trade-offs among (1) accuracy, (2) training cost, and (3) inference cost.
tools
As my key tool, I study the algorithmic biases of ML. From a theoretical standpoint, understanding these biases reveals why certain trade-offs emerge and helps us gauge the fundamental limits of the Pareto optimum. From a practical standpoint, harnessing these biases gives us a crucial tool for achieving the optimal trade-off.
Here are some examples of the biases that I have studied:
- initialization (a, b). We theoretically characterize how training speed trades off against generalizability as we change the initialization scale of neural nets.
- attention sinks (c, d). We harness the attention-biasing phenomenon to develop a pre-processing algorithm that improves a model's post-compression performance.
- data transformations (e). We discover that invertible transformations of the training data can dramatically affect both the training speed and the generalizability, and harness this phenomenon to accelerate neural field training.
works
See my group webpage or Google Scholar.