
Out of Memory Error: MemoryError: Unable to allocate array with shape (249255, 249255) and data type float64 · Issue #12 · lanagarmire/deepimpute · GitHub


Hi! I'm trying to run DeepImpute on scATAC-Seq data. I've filtered my dataset to "high-quality" cells with at least 5,500 reads, and filtered my features (peaks) to those observed in more than 10 cells, leaving close to 250k peaks. When I try to run imputation on this matrix, it crashes.
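For context, the peak filtering described above can be sketched as follows (a minimal illustration on a toy cells × peaks count matrix; the variable names and matrix sizes are made up, not from DeepImpute):

```python
import numpy as np
import pandas as pd

# Toy cells x peaks count matrix (the real one is ~7706 x ~250k)
rng = np.random.default_rng(0)
counts = pd.DataFrame(rng.poisson(0.3, size=(100, 50)))

# Keep only peaks observed (non-zero) in more than 10 cells
cells_per_peak = (counts > 0).sum(axis=0)
filtered = counts.loc[:, cells_per_peak > 10]
```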

```
Input dataset is 7706 cells (rows) and 249255 genes (columns)
First 3 rows and columns:
                    W_14793_15289  W_37170_37548  W_46846_47099
AAACGAAAGTAATGTG-3              0              0              0
AAACGAACAGATGGCA-2              0              0              0
AAACGAACATTGTGAC-4              0              0              0
23040 genes selected for imputation
---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input> in <module>
      7 # Crashed, let's try with 50% of the data to fit the network.
      8
----> 9 multinet.fit(MACS_data,cell_subset=1,minVMR=0.5)

~/.local/lib/python3.7/site-packages/deepimpute/multinet.py in fit(self, raw, cell_subset, NN_lim, genes_to_impute, ntop, minVMR, mode)
    192             genes_to_impute = np.concatenate((genes_to_impute, fill_genes))
    193
--> 194         covariance_matrix = get_distance_matrix(raw)
    195
    196         self.setTargets(raw.reindex(columns=genes_to_impute), mode=mode)

~/.local/lib/python3.7/site-packages/deepimpute/multinet.py in get_distance_matrix(raw)
     22     potential_pred = raw.columns[raw.std() > 0]
     23
---> 24     covariance_matrix = pd.DataFrame(np.abs(np.corrcoef(raw.T.loc[potential_pred])),
     25                                      index=potential_pred,
     26                                      columns=potential_pred).fillna(0)

~/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in corrcoef(x, y, rowvar, bias, ddof)
   2524         warnings.warn('bias and ddof have no effect and are deprecated',
   2525                       DeprecationWarning, stacklevel=3)
-> 2526     c = cov(x, y, rowvar)
   2527     try:
   2528         d = diag(c)

~/miniconda3/lib/python3.7/site-packages/numpy/lib/function_base.py in cov(m, y, rowvar, bias, ddof, fweights, aweights)
   2452     else:
   2453         X_T = (X*w).T
-> 2454     c = dot(X, X_T.conj())
   2455     c *= np.true_divide(1, fact)
   2456     return c.squeeze()

MemoryError: Unable to allocate array with shape (249255, 249255) and data type float64
```

Could you explain why the program tries to build a matrix of all_genes × all_genes? I am running this on a server with ~200 GB of memory. I can override the error with `echo 1 > /proc/sys/vm/overcommit_memory`, but then it actually consumes all of the memory and crashes. Any thoughts would be appreciated! If I'm understanding correctly, I cannot fit the model on the identified subset of genes and then apply it to the remaining genes afterwards, correct? I also realize this tool is meant for scRNA-Seq, but I figured it should be applicable to scATAC-Seq as well.
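For reference, the allocation the traceback asks for can be sized directly. This back-of-the-envelope check (plain NumPy-free arithmetic, nothing from the DeepImpute code) shows why ~200 GB cannot hold it:

```python
n_genes = 249_255
bytes_per_float64 = 8

# Dense n_genes x n_genes correlation matrix in float64,
# before counting any temporaries created inside np.corrcoef
size_gb = n_genes ** 2 * bytes_per_float64 / 1e9
print(f"{size_gb:.0f} GB")  # ~497 GB
```

With overcommit enabled the allocation succeeds virtually, but touching the pages still needs the full ~497 GB, hence the crash.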

On another note, is it possible to specify which normalization to use, say, a square-root transform?
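If DeepImpute does not expose a normalization option, one possible workaround would be to transform the count matrix before passing it to `fit`. A minimal sketch of the square-root transform (the `fit` call shown in the comment is hypothetical; whether DeepImpute re-normalizes internally would need to be checked):

```python
import numpy as np
import pandas as pd

counts = pd.DataFrame([[0.0, 4.0, 9.0],
                       [1.0, 16.0, 0.0]])

# Square-root transform as an alternative to log normalization
sqrt_counts = np.sqrt(counts)
# multinet.fit(sqrt_counts, ...)  # hypothetical; verify DeepImpute does
# not apply its own transform on top of this
```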


