pseudodynamics.reader

Classes

`AllTimepoint_MeshGrid`(*args, **kwargs)
`Duds_AnnDS`(*args[, precomputed_duds])
`Duds_AnnDS_fastmode`(*args[, n_pseudobulk, ...])
`HigDim_AnnDS`(AnnData[, cellstate_key, ...])	High-dimensional cell state dataset for trajectory-independent modeling
`MeshGrid_AnnDS`(*[, n_timepoint, n_repeat, ...])
`MeshGrid_DS`(Data_pt[, nearby_cellstate, ...])
`MeshGrid_Resample`(*args, **kwargs)
`MeshGrid_logDS`(*args, **kwargs)
`Pdyn_ExtractDataset`(Data_pt[, n_grid, ...])
`Random_ExtractDataset`(Data_pt[, ...])
`Simple_DS`(*, n_timepoint, **kwargs)
`SingleBranch_AnnDS`(*[, n_timepoint, ...])
`Syn_DS`(cellstate, density, integrate_time[, ...])
`TwoTimepoint_MeshGrid`(*args, **kwargs)
`TwoTimpepoint_AnnDS`(AnnData[, split, ...])	Dataset for high-dimensional cell states. Each batch returns the cell states and their densities at two consecutive timepoints.
`TwoTimpepoint_AnnDS_fastmode`(*args[, ...])

class pseudodynamics.reader.AllTimepoint_MeshGrid(*args, **kwargs): Bases: MeshGrid_Resample

class pseudodynamics.reader.Duds_AnnDS(*args, precomputed_duds=None, **kwargs): Bases: TwoTimpepoint_AnnDS

class pseudodynamics.reader.Duds_AnnDS_fastmode(*args, n_pseudobulk=None, pseudobulk_key='pseudo_bulk', resolution=None, **kwargs): Bases: Duds_AnnDS

class pseudodynamics.reader.HigDim_AnnDS(AnnData, cellstate_key='cellstate', timepoint_key='timepoint_tx_days', timepoint_idx=None, n_dimension=5, knn_volume=False, nearby_cellstate=1, norm_time=False, deltax_key=None, density_funs=None, kde_kws={}, base_cellstate=None, pop_dict=None, n_grid=300, collocation_points=600, log_transform=False, resampling_indensity=0.5, resampling_rate=0.5, equal_mass=False)

Bases: AnnDataset

High-dimensional cell state dataset for trajectory-independent modeling.

Parameters:

n_repeat (int) – the output file path from script
nearby_cellstate (int) – the number of nearby cell states
norm_time (bool) – whether to log-normalize the real timepoints
AnnData (AnnData) – the single-cell object
cellstate_key (str) – the obsm key of the lower-dimensional representation used to compute the density
timepoint_key (str) – the obs key indicating the experimental time at which the cells were collected
pop_dict (dict) – dictionary of population statistics, including collection timepoint, mean, and variation
log_transform (bool) – whether to log-transform the population size to reduce the magnitude of the data (default False)
base_cellstate (np.ndarray) – the space on which to evaluate the density

compute_density(density_funs=None)

Compute the density for self.cellstate. If no density functions are specified, a Gaussian KDE is used.

Returns:

self.u_b (Tensor) – flattened, (n_time * n_cell)
self.t_b (Tensor) – flattened, (n_time * n_cell)
self.density_funs (list of callable) – length n_time
self.density_P (ndarray) – the total density averaged into a probability summing to 1
self.s_std – the standard deviation of self.cellstate
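The per-timepoint density estimate described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the `gaussian_kde` helper, the fixed bandwidth of 0.5, and the synthetic `cellstate` and `timepoint` arrays are all assumptions made for illustration, and the real method stores its results as tensors on the dataset object.

```python
import numpy as np

def gaussian_kde(samples, bandwidth):
    """Hypothetical helper: isotropic Gaussian KDE fitted to `samples` of shape (n, d)."""
    samples = np.asarray(samples, dtype=float)
    n, d = samples.shape
    # Normalizing constant of n isotropic d-dimensional Gaussian kernels
    norm = n * (2.0 * np.pi * bandwidth ** 2) ** (d / 2.0)

    def density(points):
        points = np.asarray(points, dtype=float)
        # Squared distance between every query point and every sample
        diff = points[:, None, :] - samples[None, :, :]
        sq_dist = np.sum(diff ** 2, axis=-1)
        return np.exp(-0.5 * sq_dist / bandwidth ** 2).sum(axis=1) / norm

    return density

# Synthetic stand-ins for the cell states and their collection times (assumed data)
rng = np.random.default_rng(0)
cellstate = rng.normal(size=(300, 2))
timepoint = np.repeat([0.0, 1.0, 2.0], 100)
times = np.unique(timepoint)

# One density function per timepoint, each evaluated on all cells
density_funs = [gaussian_kde(cellstate[timepoint == t], bandwidth=0.5) for t in times]
u = np.stack([f(cellstate) for f in density_funs])  # shape (n_time, n_cell)

# Flattened layout matching the (n_time * n_cell) shape in the Returns list
u_b = u.reshape(-1)
t_b = np.repeat(times, cellstate.shape[0])
```

In this layout the first `n_cell` entries of `u_b` and `t_b` belong to the first timepoint, the next `n_cell` to the second, and so on.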

compute_volume(dim=2, smooth=True): Compute the volume of each cell from KNN distances, assuming the inner dimension is 2 and taking the minimum distance as the radius.
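One way to read this description is as the volume of a ball whose radius is each cell's nearest-neighbor distance. The sketch below follows that reading only; the function name `knn_ball_volume` is hypothetical, the brute-force distance matrix is chosen for brevity, and the `smooth` option is not modeled because the source does not describe it.

```python
import math
import numpy as np

def knn_ball_volume(cellstate, dim=2):
    """Hypothetical sketch: per-cell ball volume with nearest-neighbor distance as radius."""
    x = np.asarray(cellstate, dtype=float)
    # Brute-force pairwise Euclidean distances (adequate for small inputs only)
    dist = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude each cell's zero distance to itself
    radius = dist.min(axis=1)       # minimum neighbor distance used as the radius
    # Volume of the unit ball in `dim` dimensions (pi for dim=2)
    unit_ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1)
    return unit_ball * radius ** dim

# Four cells on a unit square: every nearest-neighbor distance is 1
square = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
volumes = knn_ball_volume(square, dim=2)
```

For larger datasets a KD-tree or an approximate neighbor index would avoid the quadratic memory cost of the full distance matrix.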

class pseudodynamics.reader.MeshGrid_AnnDS(*, n_timepoint=None, n_repeat=10, nearby_cellstate=10, norm_time=True, replicate_key='batch', **kwargs): Bases: AnnDataset, MeshGrid

class pseudodynamics.reader.MeshGrid_DS(Data_pt, nearby_cellstate=10, n_grid=300, collocation_points=600, n_repeat=10, log_transform=True, norm_time=True): Bases: Processed_baseDS, MeshGrid

class pseudodynamics.reader.MeshGrid_Resample(*args, **kwargs): Bases: MeshGrid_AnnDS

class pseudodynamics.reader.MeshGrid_logDS(*args, **kwargs): Bases: MeshGrid_Resample

class pseudodynamics.reader.Pdyn_ExtractDataset(Data_pt, n_grid=300, collocation_points=600, n_repeat=10, log_transform=True): Bases: Processed_baseDS

class pseudodynamics.reader.Random_ExtractDataset(Data_pt, nearby_cellstate=10, n_grid=300, collocation_points=600, n_repeat=10, log_transform=True): Bases: Pdyn_ExtractDataset

class pseudodynamics.reader.Simple_DS(*, n_timepoint, **kwargs): Bases: MeshGrid_AnnDS

class pseudodynamics.reader.SingleBranch_AnnDS(*, n_timepoint=None, n_repeat=10, nearby_cellstate=10, max_timespan=3, replicate_key='batch', **kwargs): Bases: AnnDataset, MeshGrid

class pseudodynamics.reader.Syn_DS(cellstate, density, integrate_time, deltax=None, batchsize=200): Bases: Dataset

class pseudodynamics.reader.TwoTimepoint_MeshGrid(*args, **kwargs): Bases: MeshGrid_Resample

class pseudodynamics.reader.TwoTimpepoint_AnnDS(AnnData, split='train', cellstate_key='cellstate', timepoint_key='timepoint_tx_days', timepoint_idx=None, n_dimension=5, knn_volume=False, batchsize=200, norm_time=False, deltax_key=None, density_funs=None, kde_kws={}, nearby_cellstate=1, base_cellstate=None, pop_dict=None, n_grid=300, collocation_points=600, log_transform=False, resampling_indensity=0.5, resampling_rate=0.5, equal_mass=False)

Bases: HigDim_AnnDS

Dataset for high-dimensional cell states. Each batch returns the cell states and their densities at two consecutive timepoints.

Parameters:

AnnData (AnnData) – the single-cell dataset
cellstate_key (str) – the obsm key of the lower-dimensional representation used to compute the density
timepoint_key (str) – the obs key indicating the experimental time at which the cells were collected
log_transform (bool) – whether to log-transform the population size to reduce the magnitude of the data (default False, per the signature above)
n_repeat (int) – the output file path from script
nearby_cellstate (int) – the number of nearby cell states
norm_time (bool) – whether to log-normalize the real timepoints
split (str) – 'train', 'val' or 'test'
knn_volume (bool) – whether to use the volume of the knn graph to rescale the density

Examples

>>> import pseudodynamics as pdp
>>> from pseudodynamics import reader
>>> # config_path and adata (an AnnData object) must be defined beforehand
>>> config = pdp.ExperimentConfig(config=config_path)
>>> DS_sub = pdp.reader.TwoTimpepoint_AnnDS(AnnData=adata, split='train',
...                                         **config.dataset_config)

random_train_val_test_split(): Randomly split the data into train, val and test sets with the ratio 0.8:0.1:0.1.
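A random 0.8:0.1:0.1 assignment of this kind can be sketched with NumPy as follows. The function name `random_split_labels`, the `seed` argument, and the choice to return a per-cell label array are assumptions for illustration; the source does not say how the method stores the split.

```python
import numpy as np

def random_split_labels(n_cells, ratios=(0.8, 0.1, 0.1), seed=0):
    """Hypothetical sketch: assign each cell to 'train', 'val' or 'test' at random."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_cells)  # shuffled cell indices
    n_train = int(round(ratios[0] * n_cells))
    n_val = int(round(ratios[1] * n_cells))
    # Start with everything in 'test', then overwrite the train and val portions
    labels = np.full(n_cells, "test", dtype="U5")
    labels[perm[:n_train]] = "train"
    labels[perm[n_train:n_train + n_val]] = "val"
    return labels

labels = random_split_labels(1000)
train_idx = np.flatnonzero(labels == "train")  # indices usable to subset cell states
```

Fixing the seed keeps the split reproducible across runs, which matters when the train and val subsets are built in separate dataset instances.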

subset_dataset(): When a valid data split is given, subset the adata, cell states and densities.

class pseudodynamics.reader.TwoTimpepoint_AnnDS_fastmode(*args, pseudobulk_key='pseudo_bulk', resolution=None, n_pseudobulk=None, **kwargs)

Bases: TwoTimpepoint_AnnDS

get_pseudobulk_vector(agg_ad, x_key, pseudobulk_key): Aggregate multi-dimensional vectors based on cell cluster labels.
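Label-based aggregation of this kind can be sketched on plain arrays. The example below is only an illustration: `aggregate_by_label` is a hypothetical helper, it takes arrays rather than an AnnData object, and using the mean as the aggregate is an assumption, since the source does not state which statistic the method computes.

```python
import numpy as np

def aggregate_by_label(vectors, labels):
    """Hypothetical sketch: mean vector per cluster label."""
    vectors = np.asarray(vectors, dtype=float)
    labels = np.asarray(labels)
    # `inverse` maps each cell to the row of its cluster in `groups`
    groups, inverse = np.unique(labels, return_inverse=True)
    sums = np.zeros((groups.size, vectors.shape[1]))
    np.add.at(sums, inverse, vectors)  # unbuffered add: accumulates repeated indices
    counts = np.bincount(inverse, minlength=groups.size)
    return groups, sums / counts[:, None]

# Three cells in two clusters; cluster 'a' holds the first and third cell
vectors = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
cluster = np.array(["a", "b", "a"])
groups, pseudobulk = aggregate_by_label(vectors, cluster)
```

With an AnnData object, the vectors would presumably come from the entry named by `x_key` and the labels from the column named by `pseudobulk_key`, though the source does not confirm where each is stored.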