Datasets
load_sample
bcselector.datasets.load_sample(as_frame=True)
Load and return the sample artificial dataset.
Samples total: 10000
Dimensionality: 35
Target variables: 1
- Parameters
as_frame (bool, default=True) – If True, the data is a pandas DataFrame with appropriately named columns, and the target is a pandas Series holding the single target variable.
- Returns
data ({np.ndarray, pd.DataFrame} of shape (10000, 35)) – The data matrix. If as_frame=True, data will be a pd.DataFrame.
target ({np.ndarray, pd.Series} of shape (10000,)) – The binary classification target variable. If as_frame=True, target will be a pd.Series.
costs ({dict, list}) – Cost of every feature in data. If as_frame=True, costs will be a dict.
Examples
>>> from bcselector.datasets import load_sample
>>> data, target, costs = load_sample()
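The costs return value comes either as a dict or as a list, so downstream code may need to convert between the two forms. The sketch below shows one way to do that in plain Python. It assumes that the dict is keyed by column name and that the list follows the column order of data; neither detail is stated above. The feature names and cost values are made up for illustration.

```python
# Hypothetical column names and costs; real values come from load_sample().
columns = ["f0", "f1", "f2"]
costs_dict = {"f0": 1.0, "f1": 2.5, "f2": 0.5}  # assumed form when as_frame=True

# Positional form: one cost per column, in the same order as the data columns.
costs_list = [costs_dict[name] for name in columns]

# Back to a name-keyed dict, given the column order.
costs_by_name = dict(zip(columns, costs_list))

# Total cost of using every feature.
total_cost = sum(costs_list)
```

Working with the positional form is convenient when data is an np.ndarray, since features are then addressed by index rather than by name.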
load_hepatitis
bcselector.datasets.load_hepatitis(as_frame=True, discretize_data=True, **kwargs)
Load and return the hepatitis dataset. The hepatitis dataset is a small medical dataset with a single target variable, collected from the UCI repository [3].
Samples total: 155
Dimensionality: 19
Target variables: 1
- Parameters
as_frame (bool, default=True) – If True, the data is a pandas DataFrame with appropriately named columns, and the target is a pandas Series holding the single target variable.
discretize_data (bool, default=True) – If True, the returned data is discretized with sklearn.preprocessing.KBinsDiscretizer.
kwargs – Keyword arguments passed to the sklearn.preprocessing.KBinsDiscretizer constructor.
- Returns
data ({np.ndarray, pd.DataFrame} of shape (155, 19)) – The data matrix. If as_frame=True, data will be a pd.DataFrame.
target ({np.ndarray, pd.Series} of shape (155,)) – The binary classification target variable. If as_frame=True, target will be a pd.Series.
costs ({dict, list}) – Cost of every feature in data. If as_frame=True, costs will be a dict.
References
[3] Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
Examples
>>> from bcselector.datasets import load_hepatitis
>>> data, target, costs = load_hepatitis()
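To give a sense of what discretize_data=True and the forwarded kwargs do, here is a minimal standalone sketch that applies sklearn.preprocessing.KBinsDiscretizer to a toy array. The array values and the chosen arguments (n_bins, encode, strategy) are illustrative assumptions. Which arguments load_hepatitis sets on its own, and with what defaults, is not stated above.

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# Toy single-feature matrix (made-up values, not taken from the hepatitis data).
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0], [5.0]])

# Example keyword arguments of the kind that could be forwarded via **kwargs.
discretizer_kwargs = {"n_bins": 3, "encode": "ordinal", "strategy": "uniform"}

# Fit equal-width bins on X and replace every value with its bin index.
discretizer = KBinsDiscretizer(**discretizer_kwargs)
binned = discretizer.fit_transform(X)

print(binned.ravel())
```

With encode="ordinal" the output keeps the shape of the input, and each entry is the index of the bin its original value falls into.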