TINTOlib: Python Library to convert Tabular Data into Synthetic Images


TINTO Logo

🎉 New Free Course on Udemy! (Spanish) 🎉

We’ve just launched a 100% free course on Udemy about using TINTOlib and developing Hybrid Neural Networks.

Learn how to turn tabular data into synthetic images and apply CNNs, ViTs, and hybrid architectures like a pro.

Access the Course on Udemy


📺 Video Tutorial Course (English/Spanish)

🎥 Prefer not to register on Udemy, or looking for the English version of the course? No worries: you can follow the full course directly on GitHub!

This hands-on tutorial includes bilingual videos (English/Spanish) and practical notebooks to help you learn how to use TINTOlib with deep learning models like CNNs, ViTs, and hybrid architectures.

Access the Course on GitHub


🧠 Overview

TINTOlib is a state-of-the-art Python library that transforms tidy data (also known as tabular data) into synthetic images, enabling the application of advanced deep learning techniques, including Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs), to traditionally structured data. This transformation bridges the gap between tabular data and powerful vision-based machine learning models, unlocking new possibilities for tackling regression, classification, and other complex tasks.


📚 Features

  • Input formats: CSV or Pandas DataFrame
  • Designed for tidy data (target column last)
  • Output: grayscale images from reduction and transformation methods
  • Compatible with Linux, Windows, macOS
  • Requires Python 3.7+

🧪 Methods

TINTOlib includes a variety of methods for generating synthetic images. Below is a summary of the supported methods and their hyperparameters:

Method           Class              Hyperparameters
TINTO            TINTO()            problem, normalize, verbose, pixels, algorithm, blur, submatrix, amplification, distance, steps, option, times, train_m, zoom, random_seed
IGTD             IGTD()             problem, normalize, verbose, scale, fea_dist_method, image_dist_method, error, max_step, val_step, switch_t, min_gain, zoom, random_seed
REFINED          REFINED()          problem, normalize, verbose, hcIterations, n_processors, zoom, random_seed
BarGraph         BarGraph()         problem, normalize, verbose, pixel_width, gap, zoom
DistanceMatrix   DistanceMatrix()   problem, normalize, verbose, zoom
Combination      Combination()      problem, normalize, verbose, zoom
SuperTML         SuperTML()         problem, normalize, verbose, pixels, feature_importance, font_size, random_seed
FeatureWrap      FeatureWrap()      problem, normalize, verbose, size, bins, zoom
BIE              BIE()              problem, normalize, verbose, precision, zoom
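
Each method is instantiated from its own class with the hyperparameters listed above. As an illustrative sketch (the TINTOlib.igtd module path is assumed to mirror TINTOlib.tinto, and the values shown are arbitrary):

      from TINTOlib.igtd import IGTD   # module path assumed to mirror TINTOlib.tinto

      # Arbitrary illustrative values; omitted hyperparameters fall back to their defaults
      model = IGTD(problem="supervised", zoom=2, random_seed=42)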

⚠️ Platform-Specific Requirements for Certain Transformation Methods

Some transformation methods in TINTOlib have specific system requirements or limitations when used on platforms such as Google Colab, Windows, Linux, or macOS.

REFINED

This method relies on mpi4py, which enables parallel computation through MPI (Message Passing Interface). Setting up and running MPI with multiple processors typically requires administrative permissions, which makes REFINED incompatible with platforms such as Google Colab.

  • Linux: Ensure that the MPI environment is set up before installing mpi4py. Run the following commands:

      sudo apt-get install python3
      sudo apt install python3-pip
      sudo apt install python3-mpi4py

    Once MPI is installed:

      pip install mpi4py
  • macOS / Windows: Direct installation usually works out of the box:

      pip install mpi4py

SuperTML

The SuperTML method generates text-based synthetic images and requires the MS Sans Serif font.

  • On Windows, this font is typically available by default.
  • On Linux and macOS, it must be installed manually to avoid rendering issues.

Font Installation

  • Linux: Install Microsoft Core Fonts:

      sudo apt install ttf-mscorefonts-installer

On Google Colab, installing additional fonts is not permitted due to administrative restrictions.

📄 Getting Started

  • You can install TINTOlib from PyPI using pip:
      pip install TINTOlib

Installing TINTOlib this way pulls in all necessary dependencies, so there’s no need to install them individually.

However, if you prefer manual installation or want to explore the full environment:

  • The repository includes a requirements.txt file listing the core dependencies of TINTOlib. These are enough to run the TINTOlib-example.ipynb notebook located in the examples/ folder.
  • Other notebooks, which include training deep learning models on the generated images, require additional libraries. To run them, install the extended dependencies from requirements-example.txt:
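
      pip install -r requirements-example.txt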

🧩 Importing a Specific Model

  • To use a specific image transformation model, import it directly. For example, to use TINTO:

      from TINTOlib.tinto import TINTO
  • Create the model. If you don't set any hyperparameters, the model uses the default values; refer to the Methods section above or the TINTO documentation.

      model = TINTO(problem="supervised", blur=True)

🔧 Generating Synthetic Images

To generate synthetic images, use the following workflow with the fit, transform, and fit_transform methods:

Parameters:

  • data: A path to a CSV file or a Pandas DataFrame (target column must be the last column).
  • folder: Path to the folder where the synthetic images will be saved.

Syntax

  1. The fit method trains the model on the tabular data and prepares it for image generation.

      model.fit(data)
  2. The transform method generates and saves synthetic images in a specified folder. It requires the model to be fitted first.

      model.transform(data, folder)
  3. The fit_transform method combines the training and image generation steps. It fits the model to the data and generates synthetic images in one step.

      model.fit_transform(data, folder)

Notes:

  • The model must be fitted before using the transform method. If the model isn't fitted, a RuntimeError will be raised.
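
Putting the workflow together, here is a minimal end-to-end sketch (it assumes a file named data.csv whose last column is the target; the file name and output folder are placeholders):

      import pandas as pd
      from TINTOlib.tinto import TINTO

      df = pd.read_csv("data.csv")             # tidy data, target column last
      model = TINTO(problem="supervised")      # unspecified hyperparameters use defaults
      model.fit(df)                            # learn the pixel layout from the tabular data
      model.transform(df, "synthetic_images")  # save one grayscale image per row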

📚 Documentation

To get started with TINTOlib, a dedicated Crash Course Repository is available. It includes video tutorials, slides, and Jupyter Notebooks that demonstrate how to apply state-of-the-art vision models such as Vision Transformers (ViTs), Convolutional Neural Networks (CNNs), and Hybrid Neural Networks to tabular problems.

For example, the following table shows how the classic IRIS CSV dataset should look as input (the target column comes last):

sepal length   sepal width   petal length   petal width   target
4.9            3.0           1.4            0.2           1
7.0            3.2           4.7            1.4           2
6.3            3.3           6.0            2.5           3
  • The following example shows how to execute TINTOlib using the TINTO method and then display the synthetic image generated for the first row:

      model = TINTO(problem="supervised", pixels=30, algorithm="t-SNE", steps=5, blur=True)
      model.fit_transform("iris.csv", "synthetic_images")
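
To display the image generated for the first row, one option is to load it back with Pillow and matplotlib. This is a hedged sketch: the exact folder layout depends on the chosen method, so it simply picks the first PNG found under the output folder:

      import glob
      import matplotlib.pyplot as plt
      from PIL import Image

      # Pick the first generated PNG, wherever the method placed it
      first_image = sorted(glob.glob("synthetic_images/**/*.png", recursive=True))[0]
      plt.imshow(Image.open(first_image), cmap="gray")
      plt.axis("off")
      plt.show()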

TINTO blurring


🚀 Vision-based Neural Network Architectures

Using the synthetic images, you can experiment with vision models such as CNNs and ViTs, or explore hybrid models that combine the images with the original tabular features. The course presents the following architectures, which you can modify and reuse (see the sketch after this list):

  • Synthetic images using CNN
    Tabular-to-Image CNN

  • Synthetic images using Hybrid Neural Network with ViT (HyViT)
    Tabular-to-Image HyNNViT
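
The full implementations live in the course notebooks. Purely as an illustration of the hybrid idea (an image branch and a tabular branch whose features are concatenated before the prediction head), a minimal PyTorch sketch could look like the following; all layer sizes and class names are illustrative and not part of TINTOlib:

      import torch
      import torch.nn as nn

      class HybridCNNMLP(nn.Module):
          """Toy hybrid network: a CNN branch for the synthetic image and an
          MLP branch for the raw tabular features, fused before the head."""
          def __init__(self, num_tabular_features, num_classes, image_channels=1):
              super().__init__()
              self.cnn = nn.Sequential(
                  nn.Conv2d(image_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
                  nn.MaxPool2d(2),
                  nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
                  nn.AdaptiveAvgPool2d(1),  # (batch, 32, 1, 1), independent of image size
                  nn.Flatten(),
              )
              self.mlp = nn.Sequential(nn.Linear(num_tabular_features, 32), nn.ReLU())
              self.head = nn.Linear(32 + 32, num_classes)

          def forward(self, image, tabular):
              # Concatenate both branches and classify
              return self.head(torch.cat([self.cnn(image), self.mlp(tabular)], dim=1))

      # Dummy forward pass: batch of 8 grayscale 30x30 images plus 4 tabular features
      model = HybridCNNMLP(num_tabular_features=4, num_classes=3)
      logits = model(torch.randn(8, 1, 30, 30), torch.randn(8, 4))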


💬 More information


🧪 Citing TINTO:

If you used TINTO in your work, please cite the SoftwareX paper:

@article{softwarex_TINTO,
    title = {TINTO: Converting Tidy Data into Image for Classification with 2-Dimensional Convolutional Neural Networks},
    journal = {SoftwareX},
    author = {Manuel Castillo-Cara and Reewos Talla-Chumpitaz and Raúl García-Castro and Luis Orozco-Barbosa},
    volume={22},
    pages={101391},
    year = {2023},
    issn = {2352-7110},
    doi = {10.1016/j.softx.2023.101391}
}

And the use case developed in the Information Fusion paper:

@article{inffus_TINTO,
    title = {A novel deep learning approach using blurring image techniques for Bluetooth-based indoor localisation},
    journal = {Information Fusion},
    author = {Reewos Talla-Chumpitaz and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro},
    volume = {91},
    pages = {173-186},
    year = {2023},
    issn = {1566-2535},
    doi = {10.1016/j.inffus.2022.10.011}
}

🛡️ License

TINTOlib is available under the Apache License 2.0.

👥 Authors


🏛️ Contributors

  • Ontology Engineering Group
  • Universidad Politécnica de Madrid
  • Universidad Nacional de Educación a Distancia
  • Universidad de Castilla-La Mancha
