Career

With an interest in computing for scientific discovery, I've found my way into numerous scientifically relevant projects and jobs over the years.

I started early in the field of grid computing, becoming involved in the National Science Foundation (NSF) TeraGrid project at Indiana University. I also worked in various capacities on other scientific projects, including at Exxon Mobil Research Center. Since joining NASA's Jet Propulsion Laboratory, I have become involved with many more scientific and space-centric projects, some of which are detailed below. I've also become an active developer for the Apache Software Foundation as well as a project management committee member. Complementing my career work, I am pursuing a graduate degree at the University of Southern California, Viterbi School of Engineering.

My current area of interest is large-scale data management systems that aid scientific pursuits. This field is colloquially known as Big Data management and analytics.

Positions

Projects

Defense Advanced Research Projects Agency (DARPA) XDATA Project

This project seeks to build an extensible, open source set of solutions to common Big Data problems which the U.S. Defense Department is currently facing. This effort originated from President Barack Obama's Big Data initiative. For more information on this project, please visit the XDATA website.

  • I specialize in Big Data management techniques, from a data-triage perspective
  • I develop analytical workflows for the management of algorithms

Apache Object Oriented Data Technology (OODT)

This open source project, championed by the Apache Software Foundation, seeks to build a reusable set of data management and data processing tools for scientific discovery. The OODT toolbox spans technologies including NoSQL databases, workflows, grid computing, and web services.

  • I serve on the project management committee
  • I am a long-time code committer and user

Early Detection Research Network (EDRN) Informatics Center

The Early Detection Research Network (EDRN) is a collaborative group of scientists and medical practitioners seeking to build a high-quality, reliable database of biomarkers that can aid in the prediction of various cancers. For more information on this project, please visit the official EDRN website and the EDRN Informatics website.

  • I develop collaborative solutions to assist biomarker researchers in sharing and archiving their laboratory data
  • I am creating automated data analytics pipelines for centrally processing biomarker-related data sets

CO2 Virtual Science Data Environment

This project strives to build a cross-institutional collaborative portal for managing long-term, high-quality CO2 records. As a result of this project, a distributed data catalog spanning numerous CO2 repositories is actively maintained and curated. Additionally, distributed data services are provided for end users. For more information, please visit the CO2 Portal official website.

  • I construct server-side web services as well as data management software
  • I develop software to automatically synchronize a distributed CO2 data catalog

Airborne Snow Observatory

An ambitious project to measure snow-water equivalent (the amount of melt water contained in snow) within the California Sierras, located in the western United States. The technologies developed by this project have made the process of measuring snow-water equivalent more efficient and repeatable. For more information, please visit the ASO official website.

  • I help develop advanced OODT data management solutions for near real-time data processing

Megacities Carbon Project

A collaborative project seeking to study the carbon-emission properties of the world's largest cities. Starting with Los Angeles, the goal of this project is to use mobile sensors, advanced data management technology, and a collaborative approach to assess the impact "megacities" have on climate change. For more information, please visit the Megacities official website.

  • I help develop and implement a distributed data management and data services architecture

QuakeSim

A project seeking to forecast earthquakes using scientific techniques backed by a large body of earthquake measurement and prediction research. This ambitious project has been a decade in the making and seeks to provide researchers with access to forecast information. For more information on this project, please visit the QuakeSim project website.

Publications and conferences

EDRN Cancer Biomarkers Bioinformatics Workshop April, 2013 Pasadena, CA

Presenter: LabCAS: A system for cataloging, archiving, and analyzing laboratory data from the Early Detection Research Network

ApacheCon North America February, 2013 Portland, OR

Presenter: Searching for cancer biomarkers with Apache OODT

NASA Earth Science Data System Working Group Meeting November, 2012 Annapolis, MD

Presenter: Developing a GIS for CO2 analysis using lightweight, open source components

IEEE International Conference on Information Reuse and Integration August, 2012 Las Vegas, NV

Co-Author: Developing an Open Source, Reusable Platform for Distributed Collaborative Information Management in the Early Detection Research Network

American Geophysical Union Fall Meeting 2011 December, 2011 San Francisco, CA

Author: A virtual science data environment for carbon-dioxide observations