Welcome to Matthew Lee's Data Science Blog

Enthusiastic about applying data science and machine learning to real-world problems for practical, data-driven results.
Currently seeking data scientist and machine learning engineer positions.

About Me

I've always been curious about everything and drawn to problem-solving. As a kid, I disassembled more broken objects than I can count to understand how they worked and to try to fix them myself. After earning a Bachelor of Science in Mechanical Engineering from the University of Texas at Austin in 2015, I worked in the manufacturing industry.
While working in the field, I encountered a problem for which conventional solutions were inadequate but a machine learning approach was very promising. Since then, I have been teaching myself data science and machine learning while doing personal projects and competing on Kaggle, where I placed in the top 2% and top 9% in two competitions. Over time, I fell in love with data science and also realized that the position I held at the time offered limited resources for applying those skills. Eventually, I decided to fully pivot into data science, and I would like to share some of the end-to-end projects I completed in the Portfolio section below.

Portfolio

In this project, I trained LSTM models to generate melody and percussion, which are then combined to produce EDM music with sprinkles of classical music. Next, I built a neural network classifier to gauge the output of the LSTM models. Finally, I built an interactive Flask app that uses the pretrained models to easily generate and play songs.

Link to Project

I built a tree-based classifier (LightGBM) to classify a person's emotion from their actual voice. I used LightGBM instead of a neural network architecture such as a CNN combined with an RNN or LSTM because non-NN models are generally quicker than NNs at training and predicting, which makes them a better fit for real-world applications. However, a lot of manual feature engineering was necessary to achieve meaningful results, since non-NN models lack the automatic feature extraction that the filters in a CNN architecture provide.

Link to Project

I applied natural language processing (NLP) to news articles to perform topic modeling using a bag-of-words approach and sentiment analysis using open-source modules. Topic modeling gives the user a concise visual of the topics and trends surrounding Bitcoin and cryptocurrency over time. In addition, features created from sentiment analysis were combined with other features (Bitcoin's open price, close price, volume, etc.) to build an LSTM model to predict Bitcoin's price.

Link to Project
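
For readers unfamiliar with the bag-of-words approach mentioned in the news-article project above, here is a minimal sketch of the idea using only the Python standard library. It is not the code from the project: the function names and the sample headlines are hypothetical and made up for illustration, and a real pipeline would likely add steps such as stop-word removal.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(documents):
    """Return a sorted vocabulary and one word-count vector per document."""
    token_lists = [tokenize(doc) for doc in documents]
    # The vocabulary is every distinct token seen across all documents.
    vocabulary = sorted({tok for toks in token_lists for tok in toks})
    vectors = []
    for toks in token_lists:
        counts = Counter(toks)
        # Word order is discarded; only per-word counts are kept.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical headlines, invented purely for illustration.
headlines = [
    "Bitcoin price climbs as trading volume grows",
    "Regulators discuss Bitcoin and cryptocurrency rules",
]
vocabulary, vectors = bag_of_words(headlines)
```

Each document becomes a fixed-length vector of word counts, which is the kind of input a topic model can then work with.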

I built a regularized linear regression model (ElasticNet) to predict Chicago's daily crime rate from 10 years' worth of web-scraped weather data. I then added more data (total ridership and unemployment rate) and performed feature engineering.

Link to Project

Resume