Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
Updated Apr 19, 2025 - Java
The foundational library of the Morpheus data science framework
Blockchain2graph extracts blockchain data (Bitcoin) and inserts it into a graph database (Neo4j).
Computer science data structures and algorithms implementation from scratch
Data repository software that helps researchers preserve, share, and discover data
Automatic feature engineering with Generative Adversarial Networks, using Deeplearning4j and Apache Spark.
Generation of Synthetic Populations Library
Age classification from text using PAN16, blogs, Fisher Callhome, and Cancer Forum
Disease Pattern Miner is a free, open-source mining framework for interactively discovering sequential disease patterns in medical health record datasets.
MetaDig Engine: multi-dialect metadata assessment engine
Where do we refactor next? A predictive maintenance approach to Java code smells.
We propose two algorithms to efficiently estimate, on very large graphs, the effective diameter and other distance metrics that are based on the neighborhood function, such as the exact diameter, the (effective) radius, and the average distance.
Makes the Facebook/Meta RocksDB key/value store easier to use in Java as a true object-oriented database.
Apache NiFi custom processors
Hackererath_The-Great-Indian-Data-Scientist-Hiring-Challenge | 96% accuracy
Estimating the number of clusters in a data set via the gap statistic. Implemented in H2O-3
This project demonstrates the application of data science to image recognition. The dataset used is the Stanford 120 dog breed image dataset. The transfer learning models selected are Xception, Inception, and EfficientNet. The model with the best validation accuracy will be converted to a TFLite model and embedded in a native Android app.
This Maven Java project implements three common measures for link prediction in graphs: Common Neighbors, Jaccard Coefficient, and Adamic-Adar. The project leverages the power of Apache Spark to efficiently process large graphs in a distributed environment.
This repository contains the final result of a TFG (final degree project) carried out during 2016. For certain algorithms, it adds the option of producing probabilistic rather than discrete results, both for further processing and for obtaining ROC curves for evaluation.
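One of the entries above names three link prediction measures: Common Neighbors, Jaccard Coefficient, and Adamic-Adar. As a rough illustration of what those measures compute, here is a minimal single-machine sketch in plain Java. It is not taken from that repository and does not use Apache Spark; the class name `LinkPredictionSketch`, the helper `graphOf`, and the toy edge list are all hypothetical, and the Adamic-Adar method assumes the two nodes are distinct.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class LinkPredictionSketch {

    // Builds an undirected graph as adjacency sets (hypothetical in-memory representation).
    static Map<String, Set<String>> graphOf(String[][] edges) {
        Map<String, Set<String>> adj = new HashMap<>();
        for (String[] e : edges) {
            adj.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            adj.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }
        return adj;
    }

    // Neighbors shared by u and v.
    static Set<String> shared(Map<String, Set<String>> adj, String u, String v) {
        Set<String> s = new HashSet<>(adj.getOrDefault(u, Set.of()));
        s.retainAll(adj.getOrDefault(v, Set.of()));
        return s;
    }

    // Common Neighbors: |N(u) ∩ N(v)|
    static int commonNeighbors(Map<String, Set<String>> adj, String u, String v) {
        return shared(adj, u, v).size();
    }

    // Jaccard Coefficient: |N(u) ∩ N(v)| / |N(u) ∪ N(v)|, defined here as 0 for an empty union.
    static double jaccard(Map<String, Set<String>> adj, String u, String v) {
        Set<String> union = new HashSet<>(adj.getOrDefault(u, Set.of()));
        union.addAll(adj.getOrDefault(v, Set.of()));
        return union.isEmpty() ? 0.0 : (double) commonNeighbors(adj, u, v) / union.size();
    }

    // Adamic-Adar: sum over shared neighbors z of 1 / ln(degree(z)).
    // Assumes u != v, so every shared neighbor has degree >= 2 and ln(degree) > 0.
    static double adamicAdar(Map<String, Set<String>> adj, String u, String v) {
        double score = 0.0;
        for (String z : shared(adj, u, v)) {
            score += 1.0 / Math.log(adj.get(z).size());
        }
        return score;
    }

    public static void main(String[] args) {
        // Toy graph, invented for illustration only.
        Map<String, Set<String>> g = graphOf(new String[][] {
            {"a", "b"}, {"a", "c"}, {"b", "d"}, {"c", "d"}, {"c", "e"}
        });
        System.out.println("Common neighbors (a, d): " + commonNeighbors(g, "a", "d"));
        System.out.println("Jaccard (a, d): " + jaccard(g, "a", "d"));
        System.out.println("Adamic-Adar (a, d): " + adamicAdar(g, "a", "d"));
    }
}
```

All three scores rank a candidate pair of unconnected nodes by how much their neighborhoods overlap; Adamic-Adar additionally down-weights shared neighbors that have many connections of their own.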