Python API reference

hicpep.peptools.calc_similarity(track1_np: ndarray, track2_np: ndarray)

Compare the signs of the given track1 and track2. The similar_rate is defined as the proportion of entries in track1 (e.g. pc1_np) that have the same positive/negative sign as the corresponding entries in track2 (e.g. the Estimated PC1-pattern).

Note that NaN entries are excluded before the comparison.

Parameters:
  • track1_np (numpy.ndarray) – PC1 or Estimated PC1-pattern in numpy.ndarray format.

  • track2_np (numpy.ndarray) – PC1 or Estimated PC1-pattern in numpy.ndarray format.

Returns:

{
    total_entry_num (int): Number of entries, including ``NaN``.
    valid_entry_num (int): Number of entries, excluding ``NaN``.
    similar_num (int): Number of entries where track1_np and track2_np have the same positive or negative sign.
    similar_rate (float): similar_num divided by valid_entry_num.
}
Return type:

dict
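
The sign comparison described above can be sketched with plain NumPy. This is an illustration of the definition, not the hicpep implementation: the helper name sign_agreement is hypothetical, and two details are assumptions because this reference does not state them (an entry is dropped if it is NaN in either track, and signs are compared with numpy.sign, so a zero only matches another zero).

```python
import numpy as np

def sign_agreement(track1_np, track2_np):
    """Hypothetical helper mirroring the documented return keys."""
    track1_np = np.asarray(track1_np, dtype=float)
    track2_np = np.asarray(track2_np, dtype=float)

    # Assumption: an entry is excluded if it is NaN in either track.
    valid = ~(np.isnan(track1_np) | np.isnan(track2_np))
    t1, t2 = track1_np[valid], track2_np[valid]

    # Assumption: signs are compared via np.sign (-1, 0 or 1).
    similar_num = int(np.sum(np.sign(t1) == np.sign(t2)))
    valid_entry_num = int(valid.sum())

    return {
        "total_entry_num": int(track1_np.size),
        "valid_entry_num": valid_entry_num,
        "similar_num": similar_num,
        # Avoid dividing by zero when every entry is NaN.
        "similar_rate": similar_num / valid_entry_num if valid_entry_num else float("nan"),
    }

result = sign_agreement(
    np.array([1.0, -2.0, np.nan, 3.0, -1.0]),
    np.array([2.0, -1.0, np.nan, -3.0, -4.0]),
)
```

In this example one of the five entries is NaN, and three of the remaining four pairs share a sign.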

hicpep.peptools.create_est(pearson_np: ndarray, output: str | None = None, method: str = 'cxmax', sampling_proportion: float = 1.0) → ndarray

Create the Estimated PC1-pattern of the given Hi-C Pearson matrix. The calculation is performed only on the valid sub-matrix: rows and columns whose diagonal value is NaN are excluded, since a NaN diagonal value implies that the entire row and column are NaN. These all-NaN rows and columns are excluded only from the calculation; their positions are kept in the returned Estimated PC1-pattern.

Parameters:
  • pearson_np (numpy.ndarray) – Hi-C Pearson matrix in NumPy format.

  • output (str) – (Optional) If the file path is specified, the Estimated PC1-pattern will be stored (e.g. output="./test/est_pc1.txt").

  • method (str) – cxmax or cxmin.

  • sampling_proportion (float) – If this parameter is specified (e.g. 0.1), then the function randomly samples the given proportion of rows in the Pearson matrix to create a partial covariance matrix, and selects the cxmax in this partial covariance matrix as the Estimated PC1-pattern.

Returns:

Estimated PC1-pattern in NumPy format.

Return type:

numpy.ndarray
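
The valid sub-matrix handling described above can be sketched as follows. This is not the hicpep implementation: apply_on_valid and estimate_fn are hypothetical names, the estimation step itself is left as a pluggable stand-in, and filling the excluded positions with NaN in the result is an assumption (the reference only says those positions are not removed).

```python
import numpy as np

def apply_on_valid(pearson_np, estimate_fn):
    """Run estimate_fn on the valid sub-matrix, keeping the original length."""
    # A NaN diagonal value marks a row/column that is entirely NaN.
    valid = ~np.isnan(np.diag(pearson_np))

    # Keep only the valid rows and columns for the calculation.
    sub = pearson_np[np.ix_(valid, valid)]

    # Assumption: excluded positions are reported as NaN in the output.
    est = np.full(pearson_np.shape[0], np.nan)
    est[valid] = estimate_fn(sub)
    return est

pearson = np.array([
    [1.0, np.nan, 0.5],
    [np.nan, np.nan, np.nan],
    [0.5, np.nan, 1.0],
])

# Stand-in estimator (column sums), used only to show the masking.
est = apply_on_valid(pearson, lambda m: m.sum(axis=0))
```

The returned vector has one entry per row of the input matrix, so it can be compared position by position with a PC1 track of the same length.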

hicpep.peptools.plot_comparison(pc1_np: ndarray, est_np: ndarray, figsize: int = 20, scatter: str | None = None, relative_magnitude: str | None = None, xticks: int = 50)

Plot the scatter or relative-magnitude comparison figure between the PC1 and the Estimated PC1-pattern. Please specify at least one output path: scatter, relative_magnitude, or both.

Note that for the relative_magnitude plot, all NaN entries are replaced with 0 in advance, and both the PC1 and the Estimated PC1-pattern are Z-score normalized.

Parameters:
  • pc1_np (numpy.ndarray) – PC1 in numpy.ndarray format.

  • est_np (numpy.ndarray) – Estimated PC1-pattern in numpy.ndarray format.

  • figsize (int) – Scaling factor for the figure size.

  • scatter (str) – (Optional) If the file path is specified, the scatter plot will be stored (e.g. scatter="./test/scatter.png").

  • relative_magnitude (str) – (Optional) If the file path is specified, the relative_magnitude plot will be stored (e.g. relative_magnitude="./test/relative_magnitude.png").
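
The preprocessing described for the relative_magnitude plot (NaN replaced with 0, then Z-score normalization) can be sketched like this. The helper name zscore_after_nan_fill is hypothetical, and the use of the population standard deviation (ddof=0) is an assumption, since this reference does not say which one hicpep uses.

```python
import numpy as np

def zscore_after_nan_fill(track_np):
    """Replace NaN with 0, then Z-score normalize the whole track."""
    filled = np.nan_to_num(np.asarray(track_np, dtype=float), nan=0.0)
    # Assumption: population standard deviation (NumPy's default, ddof=0).
    # A constant track would have std 0 and is not handled in this sketch.
    return (filled - filled.mean()) / filled.std()

pc1_z = zscore_after_nan_fill(np.array([1.0, np.nan, 3.0, -4.0]))
```

Because the NaN entries are filled before normalizing, they contribute zeros to the mean and standard deviation rather than being ignored.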

hicpep.peptools.read_pearson(pearson: str) → ndarray

Read the .txt file of an intra-chromosomal Hi-C Pearson matrix created by juicer_tools and return the Pearson matrix in NumPy format.

Parameters:

pearson (str) – Path to the intra-chromosomal Hi-C Pearson matrix .txt file created by juicer_tools.

Returns:

Intra-chromosomal Hi-C Pearson matrix in NumPy format.

Return type:

numpy.ndarray
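
As a rough illustration of what reading such a file involves, the sketch below loads a small whitespace-delimited text matrix containing NaN entries with numpy.loadtxt. It assumes the file is a plain whitespace-separated square matrix; the helper name load_text_matrix and the sample file are hypothetical, and read_pearson may parse the file differently.

```python
import os
import tempfile

import numpy as np

def load_text_matrix(path):
    """Load a whitespace-delimited text matrix; 'NaN' tokens become np.nan."""
    return np.loadtxt(path, dtype=float)

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample file standing in for a real Pearson matrix.
    path = os.path.join(tmp, "pearson.txt")
    with open(path, "w") as f:
        f.write("1.0 NaN 0.5\nNaN NaN NaN\n0.5 NaN 1.0\n")
    matrix = load_text_matrix(path)
```

The middle row and column of the sample are entirely NaN, which is the pattern create_est treats as invalid.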