opengsl.data.Dataset
- class opengsl.data.Dataset(data, feat_norm=False, verbose=True, n_splits=1, split='public', split_params=None, homophily_control=None, path='./data/', cv=None, **kwargs)
Bases: object
Dataset class. This class loads, preprocesses, and splits various datasets.
- Parameters
data (str) – The name of dataset.
feat_norm (bool) – Whether to normalize the features.
verbose (bool) – Whether to print statistics.
n_splits (int) – Number of data splits.
homophily_control (float) – The target homophily ratio passed to homophily control. If set to None, the original adjacency matrix is kept unchanged.
path (str) – Path to save dataset files.
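As background for homophily_control, the sketch below computes an edge homophily ratio, i.e. the fraction of edges whose endpoints share a label. This is one common definition and is assumed here for illustration only; the documentation above does not state which definition opengsl uses. The helper name edge_homophily and the toy graph are hypothetical and not part of the opengsl API.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label.

    Hypothetical helper for illustration; not an opengsl function.
    """
    edges = list(edges)
    if not edges:
        # No edges: define the ratio as 0.0 to avoid dividing by zero.
        return 0.0
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Toy graph: 4 nodes with two classes, each undirected edge listed once.
labels = [0, 0, 1, 1]
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]

# Edges (0, 1) and (2, 3) join same-label nodes; the other two do not.
ratio = edge_homophily(edges, labels)
print(ratio)
```

A ratio near 1 indicates a homophilous graph, and a ratio near 0 a heterophilous one.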
- prepare_data(ds_name, feat_norm=False, verbose=True)
Loads various datasets. Homophilous datasets are loaded via PyG, while heterophilous datasets are loaded with hetero_load. The results are saved as self.feats, self.adj, self.labels, self.train_masks, self.val_masks, and self.test_masks. Note that self.adj is undirected and has no self loops.
- Parameters
ds_name (str) – The name of dataset.
feat_norm (bool) – Whether to normalize the features.
verbose (bool) – Whether to print statistics.
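To illustrate what "undirected and has no self loops" means for an adjacency structure, here is a minimal pure-Python sketch on an edge list. It is not how prepare_data is implemented; the helper name to_undirected_no_self_loops and the edge-list representation are assumptions made for the example.

```python
def to_undirected_no_self_loops(edges):
    """Return sorted (u, v) pairs that contain both directions of every
    edge and no self loops.

    Hypothetical helper for illustration; not an opengsl function.
    """
    pairs = set()
    for u, v in edges:
        if u == v:
            # Drop self loops such as (2, 2).
            continue
        # Store both directions so the result is symmetric.
        pairs.add((u, v))
        pairs.add((v, u))
    return sorted(pairs)


# Raw edges with a duplicate direction, a one-way edge, and a self loop.
raw_edges = [(0, 1), (1, 0), (1, 2), (2, 2)]
clean = to_undirected_no_self_loops(raw_edges)
print(clean)
```

In matrix terms, the result corresponds to a symmetric adjacency matrix with a zero diagonal.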