pseudodynamics.reader.TwoTimpepoint_AnnDS

class pseudodynamics.reader.TwoTimpepoint_AnnDS(AnnData, split='train', cellstate_key='cellstate', timepoint_key='timepoint_tx_days', timepoint_idx=None, n_dimension=5, knn_volume=False, batchsize=200, norm_time=False, deltax_key=None, density_funs=None, kde_kws={}, nearby_cellstate=1, base_cellstate=None, pop_dict=None, n_grid=300, collocation_points=600, log_transform=False, resampling_indensity=0.5, resampling_rate=0.5)[source]

Bases: HigDim_AnnDS

Dataset for high-dimensional cell states. Each batch returns the cell states and their densities at two consecutive timepoints.
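To make "two consecutive timepoints" concrete, here is a minimal sketch of how sorted collection times can be paired. It is not taken from the class's implementation: `consecutive_pairs` is a hypothetical helper, and the collection days are made up.

```python
def consecutive_pairs(timepoints):
    """Return (t_k, t_{k+1}) pairs over the sorted unique timepoints."""
    unique_sorted = sorted(set(timepoints))
    return list(zip(unique_sorted[:-1], unique_sorted[1:]))


# Hypothetical collection days, one entry per cell.
days = [3, 0, 7, 3, 0, 12, 7]
pairs = consecutive_pairs(days)
print(pairs)  # [(0, 3), (3, 7), (7, 12)]
```

Under this reading, a dataset with four distinct timepoints yields three consecutive pairs.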

Parameters:
  • AnnData (AnnData) – the single-cell dataset

  • cellstate_key (str) – the obsm key of the low-dimensional representation used to compute the density

  • timepoint_key (str) – the obs key indicating the experimental time at which the cells were collected

  • log_transform (bool) – default False (per the class signature), whether the population size is log-transformed to reduce the magnitude of the data

  • n_repeat (int) – the output file path from script

  • nearby_cellstate (int) – the number of nearby cell states

  • norm_time (bool) – whether to log-normalize the real timepoint

  • split (str) – the data split to use: train, val or test

  • knn_volume (bool) – whether to use the volume of the knn graph to rescale the density

Examples

>>> import pseudodynamics as pdp
>>> from pseudodynamics import reader
>>> config = pdp.ExperimentConfig(config=config_path)
>>> DS_sub = pdp.reader.TwoTimpepoint_AnnDS(
...     AnnData=adata, split='train', **config.dataset_config
... )
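
Since `config.dataset_config` is unpacked as keyword arguments, it presumably maps onto the constructor parameters. The sketch below shows an equivalent plain dictionary that uses only names and default values from the class signature. The choice of keys is illustrative, and the real layout of the config object is an assumption.

```python
# Hypothetical stand-in for config.dataset_config: keys are constructor
# keyword arguments, values are the defaults shown in the class signature.
dataset_config = {
    "cellstate_key": "cellstate",
    "timepoint_key": "timepoint_tx_days",
    "n_dimension": 5,
    "knn_volume": False,
    "batchsize": 200,
    "norm_time": False,
    "log_transform": False,
    "n_grid": 300,
    "collocation_points": 600,
}

# AnnData and split are passed explicitly in the example, so they must not
# also appear in the dictionary (Python raises TypeError on duplicate keywords).
assert "AnnData" not in dataset_config and "split" not in dataset_config
```

Such a dictionary would be passed in the same way as in the example: `TwoTimpepoint_AnnDS(AnnData=adata, split='train', **dataset_config)`.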
random_train_val_test_split()[source]

Randomly split the data into train, val and test sets with the ratio 0.8:0.1:0.1.

subset_dataset()[source]

When a valid data split is given, subset the adata, cell state and density.

Methods table

compute_density([density_funs])

Compute the density for self.cellstate. If no density functions are specified, a Gaussian KDE is used.

compute_volume([dim, smooth])

Compute the volume of each cell from KNN distances, assuming an inner dimension of 2 and using the minimum distance as the radius.

random_train_val_test_split()

Randomly split the data into train, val and test sets with the ratio 0.8:0.1:0.1.

resampling_by_density(n_samples[, p])

Sample meshes according to the time-averaged density distribution.

subset_dataset()

When a valid data split is given, subset the adata, cell state and density.

Methods

TwoTimpepoint_AnnDS.compute_density(density_funs=None)[source]

Compute the density for self.cellstate. If no density functions are specified, a Gaussian KDE is used.

Returns:

  • self.u_b (Tensor) – flattened, (n_time * n_cell)

  • self.t_b (Tensor) – flattened, (n_time * n_cell)

  • self.density_funs (list of callable) – one per timepoint, [n_time]

  • self.density_P (ndarray) – the total density averaged into a probability summing to 1

  • self.s_std – the standard deviation of self.cellstate
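
As an illustration of the default path, the sketch below fits one Gaussian KDE per timepoint, evaluates each on every cell, and normalises the time-averaged density into probabilities. It is a simplified NumPy stand-in and not the library's code: `gaussian_kde_eval`, the fixed bandwidth of 0.5, and the synthetic data are all assumptions.

```python
import numpy as np


def gaussian_kde_eval(train, query, bandwidth):
    """Evaluate an isotropic Gaussian KDE fitted on `train` at the `query` points."""
    n, d = train.shape
    # Pairwise squared distances, shape (n_query, n_train).
    sq_dist = ((query[:, None, :] - train[None, :, :]) ** 2).sum(axis=-1)
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2)).sum(axis=1) / norm


rng = np.random.default_rng(0)
# Hypothetical data: 2 timepoints, 100 cells each, 2-dimensional cell state.
cells_by_time = [rng.normal(loc=shift, size=(100, 2)) for shift in (0.0, 1.0)]
all_cells = np.concatenate(cells_by_time)

# One density estimate per timepoint, evaluated on every cell: (n_time, n_cell).
density = np.stack(
    [gaussian_kde_eval(cells, all_cells, bandwidth=0.5) for cells in cells_by_time]
)

# Average over time, then normalise into probabilities summing to 1.
density_p = density.mean(axis=0)
density_p = density_p / density_p.sum()
```

Flattening `density` with `density.ravel()` would give a vector of length n_time * n_cell, matching the flattened shape listed under Returns.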

TwoTimpepoint_AnnDS.compute_volume(dim=2, smooth=True)[source]

Compute the volume of each cell from KNN distances, assuming an inner dimension of 2 and using the minimum distance as the radius.
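
One way to read this description is sketched below: each cell gets the volume of a `dim`-dimensional ball whose radius is the distance to its nearest neighbour. This is an assumption about the geometry rather than the library's implementation. `knn_ball_volume` is a hypothetical helper, and the `smooth` option is not modelled.

```python
import math

import numpy as np


def knn_ball_volume(points, dim=2):
    """Volume of a dim-ball whose radius is each point's nearest-neighbour distance."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # ignore each point's distance to itself
    radius = dist.min(axis=1)
    # Volume of the unit ball in `dim` dimensions: pi^(d/2) / Gamma(d/2 + 1).
    unit_ball = math.pi ** (dim / 2.0) / math.gamma(dim / 2.0 + 1.0)
    return unit_ball * radius ** dim


points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0]])
volume = knn_ball_volume(points)  # radii are 1, 1 and 3
```

With `dim=2` the formula reduces to the area of a circle, pi * r**2, so the three cells above get pi, pi and 9 * pi.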

TwoTimpepoint_AnnDS.random_train_val_test_split()[source]

Randomly split the data into train, val and test sets with the ratio 0.8:0.1:0.1.
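
A minimal sketch of a 0.8:0.1:0.1 random split over cell indices follows. `random_split_labels` and its `seed` argument are hypothetical; the library's own method may store the split differently.

```python
import numpy as np


def random_split_labels(n_cells, ratios=(0.8, 0.1, 0.1), seed=0):
    """Assign each cell index a 'train', 'val' or 'test' label at random."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_cells)
    n_train = int(ratios[0] * n_cells)
    n_val = int(ratios[1] * n_cells)
    # Every cell starts as 'test'; the first two slices are then overwritten.
    labels = np.full(n_cells, "test", dtype="<U5")
    labels[order[:n_train]] = "train"
    labels[order[n_train:n_train + n_val]] = "val"
    return labels


labels = random_split_labels(1000)
n_train_cells = int((labels == "train").sum())
```

Any rounding remainder falls into the test set, so the three groups always cover every cell exactly once.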

TwoTimpepoint_AnnDS.resampling_by_density(n_samples, p=None)[source]

Sample meshes according to the time-averaged density distribution.
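
Density-weighted resampling can be sketched with NumPy's generator API. The assumption here is that `p` is a probability vector over mesh points and that the stored time-averaged density is used when `p` is None; neither is confirmed by this page. The four-point density below is made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical time-averaged density over 4 mesh points, summing to 1.
density_p = np.array([0.1, 0.2, 0.3, 0.4])

# Draw mesh indices with probability proportional to the density.
n_samples = 1000
idx = rng.choice(len(density_p), size=n_samples, replace=True, p=density_p)

# Empirical frequency of each mesh point among the draws.
freq = np.bincount(idx, minlength=len(density_p)) / n_samples
```

Points in dense regions are drawn more often, so `freq` should roughly track `density_p` for a large enough `n_samples`.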

TwoTimpepoint_AnnDS.subset_dataset()[source]

When a valid data split is given, subset the adata, cell state and density.
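
Subsetting by split amounts to applying one boolean mask to every per-cell array. The sketch below uses plain NumPy arrays with made-up values, and the variable names are hypothetical rather than the class's attributes.

```python
import numpy as np

# Hypothetical per-cell split labels and matching per-cell arrays.
split_labels = np.array(["train", "val", "train", "test", "train"])
cellstate = np.arange(10.0).reshape(5, 2)
density = np.array([0.1, 0.4, 0.2, 0.1, 0.2])

# Keep only the cells that belong to the requested split.
mask = split_labels == "train"
cellstate_train = cellstate[mask]
density_train = density[mask]
```

An AnnData object can be sliced along its observations with the same boolean mask, which keeps the adata, cell state and density aligned.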