ad3ller/blindat

blindat

A Python library for blind analysis of measurement data.

Do you need to blind your data?

Maybe not. But if you expect (or want?!) a measurement to produce a certain result, blinding can remove the temptation to tune your analysis.

Do you need a package to blind your data?

Nope. To randomly offset the values in a column of a pandas.DataFrame:

import numpy as np
import pandas as pd

# parameters
fil = "./my_awesome_measurement.csv"
name = "A"
offset = 0.1

# load
df = pd.read_csv(fil)

# shift every value in the column by one random amount in [0, offset)
df[name] = df[name] + offset * np.random.rand()

blindat extends this simple concept using transformation rules.

Things to consider

This library is an experiment in developing a reasonable workflow for blind analysis. It is not intended to be a universal solution for all forms of data or blind analysis techniques.

I assume the user wants to avoid bias. This trust allows for a simple and reversible approach (stored data is never altered). More critical applications call for a more paranoid approach.

Install

Requires python>=3.10, numpy and pandas.

Activate your Python analysis environment. Clone the source code and cd into the directory. Then install using pip:

pip install .

Usage

Blind analysis can be as simple as applying an unknown transform to an appropriate column of data (e.g., laser wavelength or microwave frequency).

In this example, a random offset is selected from the range of 10 to 20 and added to column A of the pandas DataFrame, df.

import blindat as bd

rules = bd.generate_rules("A", offset=(10.0, 20.0), random_seed=42)

df1 = bd.blind(df, rules)
df1.head()
           A         B         C         D
0  14.264812  0.030766  0.064909  0.930325
1  14.014989  0.562393  0.227109  0.202936
2  14.114655  0.579577  0.015450  0.534170
3  14.417311  0.868601  0.142738  0.573955
4  14.648785  0.921365  0.019821  0.263312
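
For intuition, the effect of such a rule can be approximated with plain numpy and pandas. The sketch below is not blindat's implementation: the helper name `offset_column` is hypothetical, and it assumes the offset is a single value drawn uniformly from the range and added to every row. It uses its own random number stream and example data, so it will not reproduce the values shown above.

```python
import numpy as np
import pandas as pd


def offset_column(df, name, low, high, seed=None):
    """Return a copy of df with one uniform random offset added to column `name`.

    Hypothetical helper for illustration; not part of blindat.
    """
    rng = np.random.default_rng(seed)
    shift = rng.uniform(low, high)  # a single hidden offset for the whole column
    blinded = df.copy()             # leave the original data untouched
    blinded[name] = blinded[name] + shift
    return blinded


# example data: four columns of uniform random values in [0, 1)
data = np.random.default_rng(0).random((5, 4))
df = pd.DataFrame(data, columns=list("ABCD"))

df1 = offset_column(df, "A", 10.0, 20.0, seed=42)

# the blinded column differs from the original by a constant between 10 and 20
diff = df1["A"] - df["A"]
```

Because the original DataFrame is copied rather than modified, the blinding can be undone simply by going back to `df`.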

The @blindat decorator can be used to wrap functions that load a pandas.DataFrame, so that blind() is applied to the data they return.
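
The general decorator pattern can be sketched in a few lines. This is an illustration only, not blindat's code: `blind_on_load`, `add_fixed_offset`, and `load_measurement` are hypothetical names, and the real @blindat decorator may take different arguments.

```python
import functools

import pandas as pd


def blind_on_load(transform):
    """Build a decorator that applies `transform` to the DataFrame a loader returns."""
    def decorator(loader):
        @functools.wraps(loader)
        def wrapper(*args, **kwargs):
            df = loader(*args, **kwargs)
            return transform(df)
        return wrapper
    return decorator


def add_fixed_offset(df):
    """Stand-in for a blinding transform: shift column A by a fixed amount."""
    out = df.copy()
    out["A"] = out["A"] + 12.5
    return out


@blind_on_load(add_fixed_offset)
def load_measurement():
    """Stand-in for a function that reads measurement data from disk."""
    return pd.DataFrame({"A": [0.1, 0.2, 0.3], "B": [1.0, 2.0, 3.0]})


df = load_measurement()  # the caller only ever sees the blinded values
```

Wrapping the loader, rather than blinding later in the analysis, means there is no point in the workflow where the unblinded values are on screen by accident.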

The normalize() function blinds data by offsetting and scaling values to force a mean value of zero and a standard deviation of one.
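
Forcing a mean of zero and a standard deviation of one is the standard z-score transformation. A minimal sketch with pandas follows. `zscore` is a hypothetical helper, not blindat's normalize(), whose signature is not shown here. The sketch assumes the sample standard deviation, which is the pandas default.

```python
import pandas as pd


def zscore(df, name):
    """Return a copy of df with column `name` shifted and scaled to mean 0, std 1.

    Hypothetical helper for illustration; not part of blindat.
    """
    out = df.copy()
    col = out[name]
    out[name] = (col - col.mean()) / col.std()  # std() uses ddof=1 by default
    return out


df = pd.DataFrame(
    {
        "A": [14.2, 14.0, 14.1, 14.4, 14.6],
        "B": [0.03, 0.56, 0.58, 0.87, 0.92],
    }
)

df2 = zscore(df, "A")
```

After this step the absolute position and scale of column A are hidden, while the relative spacing of its values is preserved.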

See docs for examples.
