Bea Stollnitz grad school projects

Here are some of the machine learning, signal processing, and data science projects I completed during grad school, with details below.

  • Package dieline classification
    Technologies: Python, Keras, NumPy, Scikit-learn, Plotly.
    Topics: deep learning, convolutional neural network (CNN), classification.

  • Human activity classification
    Technologies: Python, PyTorch, PyWavelets, NumPy, Plotly, TensorBoard, h5py, tqdm, Pillow.
    Topics: deep learning, convolutional neural network (CNN), continuous wavelet transform, Gabor transform, classification, signal processing, time-series.

  • Predicting time evolution of a pigment in water
    Technologies: Python, Keras, NumPy, Scikit-learn, Pandas, Plotly, Altair.
    Topics: deep learning, principal component analysis (PCA), time-series, model discovery.

  • Music classification using LDA
    Technologies: Python, NumPy, Plotly, Scikit-learn, SciPy.
    Topics: linear discriminant analysis (LDA), classification, principal component analysis (PCA), Gabor transform, time-series.

  • Eigenfaces
    Technologies: Python, NumPy, Plotly, Scikit-learn, Pillow.
    Topics: principal component analysis (PCA), support vector machine (SVM), classification.

  • Gabor transforms
    Technologies: Python, NumPy, Plotly.
    Topics: Gabor transform, Fourier transform, spectrogram, signal processing, time-series.

  • Feature reduction
    Technologies: Python, NumPy, Plotly, OpenCV, CSRT object tracker.
    Topics: principal component analysis (PCA), dimensionality reduction, object tracking, time-series.

  • Separation of background and foreground using DMD
    Technologies: Python, NumPy, Plotly, OpenCV.
    Topics: dynamic mode decomposition (DMD), video processing.

  • Denoising 3D scanned data
    Technologies: Python, NumPy, Plotly.
    Topics: fast Fourier transform (FFT), time-series.


Package dieline classification

GitHub folder with code

The goal of this project is to classify all the panels of a package dieline. This project is a collaboration with Adobe Research, and will allow researchers to build software that automatically simulates the folding of a 2D dieline into a 3D model of a package.

I collaborated with Adobe to define two datasets for this project. In the first dataset, each panel of a dieline is represented using a feature vector of integers expressing the number of occurrences of each angle in the panel outline. In the second dataset, each panel is represented as an image that includes a bit of the surrounding area.

I automated hyperparameter tuning by performing a grid search over a random sample of all hyperparameter combinations. I ran two rounds of tuning for each dataset type. The first round used a large search space, including hyperparameters such as the learning rate, batch size, and optimizer settings, as well as model-selection parameters for the network. The second round used the results of the first round to narrow the search space to the most promising values. This two-step approach helped increase the accuracy of the results.
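
As a rough illustration of this sampling strategy, here is a minimal sketch (not the project's actual code) using scikit-learn's ParameterSampler; the grid values and the train_and_evaluate helper are hypothetical placeholders.

```python
from sklearn.model_selection import ParameterSampler


def train_and_evaluate(**params) -> float:
    """Placeholder: train a model with `params` and return its validation accuracy."""
    return 0.0


# Round 1: a broad, illustrative search space.
round_1_grid = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "batch_size": [16, 32, 64, 128],
    "optimizer": ["adam", "sgd"],
    "num_conv_layers": [1, 2, 3],
}

# Evaluate a random sample of the grid rather than every combination.
candidates = list(ParameterSampler(round_1_grid, n_iter=20, random_state=0))
results = [(params, train_and_evaluate(**params)) for params in candidates]
best_params, best_accuracy = max(results, key=lambda item: item[1])

# Round 2 would repeat the same loop over a narrower grid built from the
# best-performing values found in round 1.
```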

I achieved the highest test accuracy with the image dataset (using a CNN with two convolution layers). However, the vector dataset has several advantages: the data is much more compact and the associated network is smaller, so both training and prediction run far more quickly.
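
For reference, a two-convolution-layer classifier of the kind mentioned above could be sketched in Keras as follows; the layer sizes, image resolution, and number of panel classes are illustrative assumptions rather than the project's settings.

```python
import tensorflow as tf
from tensorflow.keras import layers

num_classes = 5            # assumed number of panel types
image_shape = (64, 64, 1)  # assumed panel-image resolution

model = tf.keras.Sequential([
    layers.Input(shape=image_shape),
    layers.Conv2D(16, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(32, kernel_size=3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```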

You can find more details in the report and poster for this project.

This was my final project for the Deep Learning class (CSE 599) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Human activity classification

GitHub folder with code

In this project, I use three different approaches to classify temporal signals according to the associated activity. The input data consists of several thousand short snippets of measurements obtained from nine sensor channels (such as accelerometer and gyroscope readings) while people performed six different activities (such as walking or sitting). In my first approach, I train a simple feed-forward network on the raw temporal signals and associated labels. In my second approach, I compute spectrograms by applying a Gabor transform to the temporal signals, and train a CNN to classify the spectrograms. In my third approach, I compute scaleograms using a continuous wavelet transform, and train a CNN to classify the scaleograms.
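
As an illustration of the third approach, here is a minimal sketch of computing a scaleogram with PyWavelets; the synthetic signal, sampling rate, wavelet choice, and scale range are assumptions rather than the project's actual settings.

```python
import numpy as np
import pywt

fs = 50.0                           # assumed sampling rate in Hz
t = np.arange(0, 2.56, 1 / fs)      # one short snippet
signal = np.sin(2 * np.pi * 3 * t)  # stand-in for a real sensor measurement

scales = np.arange(1, 65)
coefficients, frequencies = pywt.cwt(signal, scales, "morl",
                                     sampling_period=1 / fs)

# |coefficients| is a (scales x time) image that a CNN can classify directly.
scaleogram = np.abs(coefficients)
```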

You can find more details in the report for this project.

This was my final project for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Predicting time evolution of a pigment in water

GitHub folder with code

The goal of this project is to explore techniques for predicting the behavior of food coloring in water. I started by recording several videos of pigment diffusing from colored candy immersed in water, to be used as the data source.

I used PCA to reduce the dimensionality of the data, and then trained a feed-forward neural network based on several videos from the dataset. Once the network was trained, I used it to predict the behavior in an entire video starting from just the first frame. Given the first frame, the neural network predicts the second frame, which I feed back into the network to predict the third frame, and so on.
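
The rollout can be sketched roughly as follows, assuming frames holds one flattened video frame per row and model is a trained network that maps one reduced frame to the next; the number of PCA components is an illustrative choice.

```python
import numpy as np
from sklearn.decomposition import PCA


def rollout(frames: np.ndarray, model, num_components: int = 20) -> np.ndarray:
    """Predict an entire video from its first frame in PCA-reduced space."""
    pca = PCA(n_components=num_components)
    reduced = pca.fit_transform(frames)     # shape: (num_frames, num_components)

    predictions = [reduced[0]]
    for _ in range(len(reduced) - 1):
        next_frame = model.predict(predictions[-1][np.newaxis, :])[0]
        predictions.append(next_frame)      # feed the prediction back in

    # Map the predicted sequence back to pixel space for comparison.
    return pca.inverse_transform(np.array(predictions))
```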

Next, I used a model discovery technique to find a partial differential equation (PDE) that describes the evolution of this physical phenomenon. The left-hand side of the PDE is u_t, and for the right-hand side, I considered a library of possible terms, such as u_x, u_yy, and x·u_x. I then used Lasso linear regression to find the appropriate coefficients for the terms, and obtained an equation that models the spreading of the food coloring. The main challenge of this task was the calculation of the derivative terms. I tried several techniques based on finite differences, but those led to noisy data and poor results. Therefore, I decided to compute derivatives by fitting a polynomial to the data within a small neighborhood around each pixel and calculating the derivatives of the polynomial at the pixel of interest. Although this approach is more costly to compute, it produces much smoother results and a more accurate prediction.
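
A minimal sketch of the regression step, assuming the derivative arrays have already been computed (for example, from the local polynomial fits described above); the library of candidate terms and the Lasso penalty are illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso


def discover_pde(u_t, u_x, u_y, u_xx, u_yy, x_grid):
    """Fit sparse coefficients so that u_t is approximated by library @ coefficients."""
    # Each column of the library is one candidate right-hand-side term,
    # flattened over all pixels and time steps.
    library = np.column_stack([
        u_x.ravel(), u_y.ravel(), u_xx.ravel(), u_yy.ravel(),
        (x_grid * u_x).ravel(),
    ])
    model = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10_000)
    model.fit(library, u_t.ravel())
    return model.coef_  # sparse: most entries should end up exactly zero
```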

If you'd like more details, you can read the report describing my findings.

This was my final project for the Inferring Structure of Complex Systems class (AMATH 563) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Music classification using LDA

GitHub repo with code

In this project, I classify 108 music clips from four different genres, with three different bands per genre, according to band name and genre. I first create a spectrogram for each music clip using a Gabor transform with a Gaussian filter. I then reduce the dimensionality of the spectrograms using the SVD and PCA. Finally, I classify each clip using my own custom implementation of linear discriminant analysis (LDA). I also classify the clips using scikit-learn's LDA classifier to make sure I'm on the right track.

In my LDA implementation, I solve a generalized eigenvalue problem with the between-class covariance matrix and the within-class covariance matrix. The resulting eigenvectors provide a transformation that maximizes the separation between class centroids while minimizing the variance within classes. I then classify each test point by applying this transformation and finding the closest class centroid. I show how to visualize the process in a 2D plot, using a Voronoi diagram to illustrate the classification boundaries.
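
A minimal sketch of this construction, assuming X holds the PCA-reduced features (with the within-class scatter matrix invertible) and y holds the integer class labels:

```python
import numpy as np
from scipy.linalg import eigh


def lda_projection(X: np.ndarray, y: np.ndarray, num_dims: int) -> np.ndarray:
    """Return a projection that separates class centroids (generalized eigenproblem)."""
    overall_mean = X.mean(axis=0)
    num_features = X.shape[1]
    S_w = np.zeros((num_features, num_features))  # within-class scatter
    S_b = np.zeros((num_features, num_features))  # between-class scatter
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        S_w += (X_c - mean_c).T @ (X_c - mean_c)
        diff = (mean_c - overall_mean)[:, np.newaxis]
        S_b += len(X_c) * diff @ diff.T
    # Solve S_b w = lambda S_w w; eigh returns eigenvalues in ascending order,
    # so the last `num_dims` eigenvectors give the most discriminative directions.
    eigenvalues, eigenvectors = eigh(S_b, S_w)
    return eigenvectors[:, -num_dims:]
```

Test points are then projected with this matrix and assigned to the nearest projected class centroid, as described above.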

From my experiments, I conclude that LDA is quite effective at classifying audio clips by band name, both across different genres and within a single genre. However, for the data set I used, LDA struggled to classify audio clips by genre. The results from my LDA implementation are consistent with those from the scikit-learn implementation.

You can find more details in the report for this project.

This was a homework assignment for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Eigenfaces

GitHub repo with code

In this project, I consider two data sets: one containing images of faces that have been cropped and aligned, and another containing images of faces that have not been aligned. I decompose this data using principal component analysis (PCA), and analyze the energy, coefficients, and visual representation of the data's spatial modes. I observe that the most relevant modes seem to capture, above all, changes in the lighting and position of the faces.

I show how PCA can be used in two applications: image compression and image classification. For the compression scenario, I choose an image from each dataset and reconstruct it using increasing numbers of modes. The more modes I use, the closer the approximation is to the original image. I compare the mean squared error between the original image and an image reconstructed with 50 modes for each dataset, and conclude that the error is smaller for the cropped images. The cropped images have their facial features well aligned, so it's natural that a better representation can be achieved with the same number of modes.
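
A rough sketch of the reconstruction and error measurement, assuming faces is a matrix with one flattened image per row:

```python
import numpy as np


def reconstruct(faces: np.ndarray, image_index: int, k: int):
    """Rebuild one image from its first k PCA modes and report the MSE."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    U, S, Vt = np.linalg.svd(centered, full_matrices=False)

    modes = Vt[:k]                            # top-k spatial modes
    coeffs = centered[image_index] @ modes.T  # projection coefficients
    approximation = mean_face + coeffs @ modes
    mse = np.mean((faces[image_index] - approximation) ** 2)
    return approximation, mse
```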

For the image classification scenario, I split the data into training and test sets, use PCA to find the 50 most informative modes for the training data, and use those modes as a basis to reduce both training and test data. I then use the support vector machine (SVM) method to classify which subject is photographed in each of the test images. I achieve very high accuracies for both the cropped and uncropped images. I conclude that just a few modes are sufficient to capture the identity of each photo, at least for these small data sets.
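
A minimal sketch of this pipeline using scikit-learn, with an assumed 80/20 train/test split; faces and subject_ids stand in for the flattened images and their labels:

```python
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC


def classify_subjects(faces, subject_ids) -> float:
    """Fit 50 PCA modes on the training images only, then classify with an SVM."""
    X_train, X_test, y_train, y_test = train_test_split(
        faces, subject_ids, test_size=0.2, random_state=0)

    pipeline = make_pipeline(PCA(n_components=50), SVC(kernel="rbf"))
    pipeline.fit(X_train, y_train)
    return pipeline.score(X_test, y_test)  # test accuracy
```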

You can find more details in the report for this project.

This was a homework assignment for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Gabor transforms

GitHub repo with code

I analyze three audio files in this project. The first contains music by Handel, and the other two contain the song "Mary Had a Little Lamb" played on the piano and on the recorder.

For the first audio file, my goal is to compare the effects of the application of different Gabor transforms. I produce several spectrograms by using a wide range of Gabor filters with different shapes, widths, and time steps. I analyze the resulting spectrograms and point out the compromises involved in making different choices.
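
A minimal sketch of a Gabor transform with a Gaussian window, where the window width and time step are the kinds of parameters varied in this comparison:

```python
import numpy as np


def gabor_spectrogram(signal: np.ndarray, t: np.ndarray,
                      width: float, time_step: float) -> np.ndarray:
    """Slide a Gaussian window across the signal and take an FFT at each step."""
    centers = np.arange(t[0], t[-1], time_step)
    spectrogram = []
    for center in centers:
        window = np.exp(-width * (t - center) ** 2)  # Gaussian window
        windowed_fft = np.fft.fft(signal * window)
        spectrogram.append(np.abs(np.fft.fftshift(windowed_fft)))
    return np.array(spectrogram).T  # rows: frequency, columns: window center
```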

For the second and third audio files, my goal is to produce a music score for the tune. I start by visualizing the logarithms of the spectrograms, then I simplify the data, remove overtones, and select a better frequency range for the visualization. The result is an image representation of the frequencies of each note played in the tune.

I conclude that Gabor transforms are effective at analyzing time-series data where both the frequency and time information are important.

You can find more details in the report for this project.

This was a homework assignment for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Feature reduction

GitHub repo with code

In this project, I analyze twelve videos in which three cameras recorded four different scenes: an object oscillating with vertical displacement only, a similar scene but with significant camera shake, an object oscillating with horizontal and vertical displacement, and an object that rotates in addition to oscillating horizontally and vertically. I use OpenCV's CSRT object tracker to follow the object and obtain a trajectory from each camera. I then combine the trajectory data from the different cameras for each scenario by including the x and y coordinates as the rows of the data matrix, with columns corresponding to the temporal dimension. I perform principal component analysis (PCA) using the singular value decomposition (SVD), and analyze the potential for dimensionality reduction of the data for each scenario.
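
A rough sketch of the tracking and mode-energy analysis; the video path and initial bounding box are assumed, and cv2.TrackerCSRT_create ships with opencv-contrib (in some versions it lives under cv2.legacy instead).

```python
import cv2
import numpy as np


def track_object(video_path: str, initial_box: tuple) -> np.ndarray:
    """Return the (x, y) center of the tracked box in each frame, as two rows."""
    capture = cv2.VideoCapture(video_path)
    ok, frame = capture.read()

    tracker = cv2.TrackerCSRT_create()
    tracker.init(frame, initial_box)

    trajectory = []
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        found, (x, y, w, h) = tracker.update(frame)
        if found:
            trajectory.append((x + w / 2, y + h / 2))
    capture.release()
    return np.array(trajectory).T


def mode_energies(trajectories: list) -> np.ndarray:
    """Fraction of variance captured by each spatial mode of the stacked data matrix."""
    # Stack the x and y rows from all cameras (trimmed to the same length),
    # subtract each row's mean, and inspect the singular value spectrum.
    data = np.vstack(trajectories)
    data = data - data.mean(axis=1, keepdims=True)
    singular_values = np.linalg.svd(data, compute_uv=False)
    return singular_values ** 2 / np.sum(singular_values ** 2)
```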

I observe that scenarios 1 and 3 present little ambiguity. Scenario 1 clearly needs only a single spatial mode to describe its behavior, which matches my intuition because the object only moves vertically (and therefore a single coordinate is enough to completely describe its movement). Scenario 3 needs two spatial modes, which also matches my intuition because the object moves both horizontally and vertically, roughly in a plane. Projecting the original data onto the corresponding number of spatial modes significantly reduces the size of the data by removing redundancy, while keeping the most expressive information intact.

Scenarios 2 and 4 are less intuitive, but the computational results are no less interesting. Scenario 2 can be represented using either one or three spatial modes: one mode would represent its vertical displacement, but the camera shake adds a new level of complexity to the movement that may need three spatial modes to encode. In Scenario 4, the object motion is very pronounced in the vertical direction and less pronounced in the horizontal direction, quickly decaying to just vertical motion. As a result, we could choose either one or two modes to represent the motion.

I conclude that PCA is an effective method for dimensionality reduction of data. This method enables us to encode the data in a fraction of the space by removing redundancy while keeping all the relevant information.

You can find more details in the report for this project.

This was a homework assignment for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Separation of background and foreground using DMD

GitHub repo with code

In this project, I explore two techniques for separating a moving foreground from a stationary background in videos. Both techniques rely on Dynamic Mode Decomposition (DMD), following somewhat different steps to arrive at the foreground and background pixel values. I use three videos of animal puppets exploring the Seattle cityscape in this analysis. No animals were harmed in this research.

In the first technique, I follow the approach described by Grosek and Kutz. I calculate a stationary background frame by keeping only the DMD mode with the eigenvalue of smallest magnitude. I then compute the foreground by subtracting the background mode from each frame of the original video. And finally, I extract all the negative values in the foreground into a residual matrix R, which I subtract from the foreground and add to the background. This technique has the convenient property that the background and foreground add up to the original video, but produces poor results for our particular data set. This is likely because the moving foreground in our videos is frequently darker than the background.
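
A minimal sketch of the background extraction via DMD, assuming frames is a matrix with one flattened grayscale frame per column; the time step dt is illustrative, and the residual-matrix bookkeeping of the first technique is omitted.

```python
import numpy as np


def dmd_background(frames: np.ndarray, dt: float = 1.0) -> np.ndarray:
    """Reconstruct the stationary background from the near-zero DMD eigenvalue."""
    X1, X2 = frames[:, :-1], frames[:, 1:]

    U, S, Vh = np.linalg.svd(X1, full_matrices=False)
    A_tilde = U.conj().T @ X2 @ Vh.conj().T @ np.diag(1.0 / S)
    eigenvalues, W = np.linalg.eig(A_tilde)

    modes = X2 @ Vh.conj().T @ np.diag(1.0 / S) @ W  # DMD modes
    omega = np.log(eigenvalues) / dt                 # continuous-time eigenvalues
    background_index = np.argmin(np.abs(omega))      # near-zero: stationary content

    # Amplitude of each mode in the first frame.
    amplitudes = np.linalg.lstsq(modes, X1[:, 0], rcond=None)[0]
    background = modes[:, background_index] * amplitudes[background_index]
    return np.abs(background)  # the stationary background frame
```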

In the second technique, I relax the requirement that the foreground and background must add up to the original video. I again compute the stationary background from the DMD mode with the eigenvalue of smallest magnitude, but without any adjustment from the residual matrix. The foreground pixels are assigned values of zero wherever the original video is similar to the background, and the original pixel values elsewhere. This technique produces better results for our data set, giving a cleanly separated background image and an imperfect but tolerable foreground video.

You can find more details in the report for this project.

This was a homework assignment for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.


Denoising 3D scanned data

GitHub repo with code

In this project, I denoise a time series of three-dimensional scan data to determine the path of a marble ingested by a dog. I accomplish this by converting the data into the frequency domain using an FFT, averaging the spectra over time, and using the average spectrum to construct a Gaussian filter. I then denoise the data in the frequency domain using the Gaussian filter, and convert it back into the spatial domain. The location of the marble can then be found by looking for the peak density in the spatial data.
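
A rough sketch of this pipeline, assuming measurements is a list of 3D arrays (one per time step) and KX, KY, KZ are the matching frequency grids; the filter width is an illustrative choice.

```python
import numpy as np


def locate_marble(measurements, KX, KY, KZ, filter_width=0.5):
    """Return the grid indices of the marble at each time step."""
    spectra = [np.fft.fftn(m) for m in measurements]
    average_spectrum = np.mean(spectra, axis=0)

    # Center frequency: the peak of the averaged (noise-suppressed) spectrum.
    peak = np.unravel_index(np.argmax(np.abs(average_spectrum)),
                            average_spectrum.shape)
    kx0, ky0, kz0 = KX[peak], KY[peak], KZ[peak]
    gaussian_filter = np.exp(-filter_width * ((KX - kx0) ** 2 +
                                              (KY - ky0) ** 2 +
                                              (KZ - kz0) ** 2))

    positions = []
    for spectrum in spectra:
        denoised = np.fft.ifftn(spectrum * gaussian_filter)
        positions.append(np.unravel_index(np.argmax(np.abs(denoised)),
                                          denoised.shape))
    return positions
```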

You can find more details in the report for this project.

This was a homework assignment for the Computational Methods for Data Analysis class (AMATH 582) at the University of Washington, which I completed as part of my master's in Applied Mathematics.
