GENE-CLASSIFICATION

This is an in-class Kaggle competition for the Kernel Methods for Machine Learning course at AMMI.

INTRODUCTION

Transcription factors (TFs) are regulatory proteins that bind specific sequence motifs in the genome to activate or repress transcription of target genes. Genome-wide protein-DNA binding maps can be profiled using experimental techniques, so genomic regions can be classified into two classes for a TF of interest: bound or unbound.

The main task of this project is to classify DNA sequences, that is, to predict whether a DNA sequence region is a binding site for a specific transcription factor.
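The kernels listed later in this README operate on numeric vectors, so each DNA string has to be turned into features first. This README does not say which encoding was used; the sketch below assumes a simple k-mer count (spectrum) representation, and the function name `kmer_counts` and its parameters are hypothetical.

```python
from itertools import product

import numpy as np


def kmer_counts(sequences, k=3, alphabet="ACGT"):
    """Map each DNA string to a vector of k-mer counts (length len(alphabet)**k).

    Assumption: this encoding is illustrative only and is not taken from the
    project's own code.
    """
    # Fixed column order: all k-mers over the alphabet, in lexicographic order.
    index = {"".join(p): i for i, p in enumerate(product(alphabet, repeat=k))}
    X = np.zeros((len(sequences), len(index)))
    for row, seq in enumerate(sequences):
        for start in range(len(seq) - k + 1):
            col = index.get(seq[start:start + k])
            if col is not None:  # skip k-mers containing other symbols, e.g. N
                X[row, col] += 1
    return X


# Two toy sequences encoded with 2-mers give a 2 x 16 feature matrix.
X = kmer_counts(["ACGTAC", "TTTTTT"], k=2)
```

Each row of `X` can then be passed to any of the vector kernels below.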

DATA SET

The data comes in two forms: the principal files and the optional files.

The principal files contain 2000 training sequences and 1000 test sequences.

MODELS USED

Ridge Regression
Kernel Ridge Regression
Naive Bayes Model
Logistic Regression
Kernel Logistic Regression
Weighted Kernel Logistic Regression
Kernel Support Vector Machine
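As an illustration of how one of the kernel models above can be built from scratch, here is a minimal sketch of kernel ridge regression used as a binary classifier. It assumes labels coded as -1/+1 and a regularization term of the form `lam * n`; the class name `KernelRidgeClassifier`, the helper `poly_kernel` and all parameter names are hypothetical and may differ from what MAIN_SCRIPT.py does.

```python
import numpy as np


def poly_kernel(A, B, degree=2, coef0=1.0):
    """Polynomial kernel matrix between the rows of A and the rows of B."""
    return (A @ B.T + coef0) ** degree


class KernelRidgeClassifier:
    """Kernel ridge regression on -1/+1 labels, thresholded at zero."""

    def __init__(self, kernel, lam=1e-3):
        self.kernel = kernel
        self.lam = lam

    def fit(self, X, y):
        n = X.shape[0]
        K = self.kernel(X, X)
        # Closed-form solution: (K + lam * n * I) alpha = y
        self.alpha = np.linalg.solve(K + self.lam * n * np.eye(n), y.astype(float))
        self.X_train = X
        return self

    def decision_function(self, X):
        return self.kernel(X, self.X_train) @ self.alpha

    def predict(self, X):
        return np.where(self.decision_function(X) >= 0, 1, -1)


# Toy XOR-style problem that a linear model cannot separate.
X_toy = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y_toy = np.array([-1, -1, 1, 1])
model = KernelRidgeClassifier(poly_kernel, lam=1e-6).fit(X_toy, y_toy)
predictions = model.predict(X_toy)
```

The same `fit`/`predict` structure carries over to the other kernel models; only the way `alpha` is computed changes.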

KERNELS IMPLEMENTED

Linear Kernel
Quadratic Kernel
Polynomial Kernel
Exponential Kernel
Radial Basis Kernel (RBF)
Laplacian Kernel
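The kernels above can be sketched in NumPy as follows. The exact parameterizations used in the project are not stated in this README, so the forms here are assumptions: the exponential kernel is taken as exp(-||x - y|| / (2 sigma^2)), the Laplacian kernel as exp(-||x - y|| / sigma) with the Euclidean norm, and the names `sigma`, `degree` and `coef0` are illustrative.

```python
import numpy as np


def _sq_dists(A, B):
    """Pairwise squared Euclidean distances between rows of A and rows of B."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2 * A @ B.T
    return np.maximum(sq, 0.0)  # clip tiny negatives caused by rounding


def linear_kernel(A, B):
    return A @ B.T


def quadratic_kernel(A, B, coef0=1.0):
    return (A @ B.T + coef0) ** 2


def polynomial_kernel(A, B, degree=3, coef0=1.0):
    return (A @ B.T + coef0) ** degree


def exponential_kernel(A, B, sigma=1.0):
    # Assumed form: exp(-||x - y|| / (2 * sigma^2))
    return np.exp(-np.sqrt(_sq_dists(A, B)) / (2 * sigma ** 2))


def rbf_kernel(A, B, sigma=1.0):
    # Gaussian kernel: exp(-||x - y||^2 / (2 * sigma^2))
    return np.exp(-_sq_dists(A, B) / (2 * sigma ** 2))


def laplacian_kernel(A, B, sigma=1.0):
    # Assumed form: exp(-||x - y|| / sigma), Euclidean norm
    return np.exp(-np.sqrt(_sq_dists(A, B)) / sigma)


# Two points at Euclidean distance 5 from each other.
A = np.array([[0.0, 0.0], [3.0, 4.0]])
K_rbf = rbf_kernel(A, A)
```

Each function returns the full Gram matrix, so `K_rbf[i, j]` is the kernel value between row `i` and row `j` of `A`.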

RESULTS AND FINDINGS

On the private leaderboard, the three best accuracies were 0.684, 0.662 and 0.648, obtained by kernel logistic regression (polynomial kernel), kernel ridge regression (polynomial kernel) and SVM with an RBF kernel, respectively.
This indicates that these kernels work well on the data set.
In addition, simple models generally performed better than the Support Vector Machine (SVM).

Name	Last commit message	Last commit date
Data set	add files	Jun 5, 2020
Gene Classification.ipynb	add files	Jun 5, 2020
MAIN_SCRIPT.py	add files	Jun 5, 2020
PRESENTATION.pdf	add files	Jun 5, 2020
README.md	Updated README.md	Dec 8, 2020
Latest commit: EmmanuelOwusu, "Updated README.md", Dec 8, 2020 (1175527); 16 commits in history.
