hplc_data_analysis.pubchem

hplc_data_analysis.pubchem.get_compound_from_pubchempy(comp_name)[source]
Return type:

Compound

hplc_data_analysis.pubchem.get_iupac_from_pcp(comp_name)[source]

Get the IUPAC name of a compound using pubchempy. Requires an internet connection.

Parameters:

comp_name (str) – name of the compound to search for on PubChem.

Returns:

Lowercase IUPAC name of the compound.

Return type:

str
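The lookup described above can be sketched as follows. This is a minimal illustration, not the package's own implementation: the function name `sketch_lowercase_iupac` and its `lookup` parameter are hypothetical, and returning `None` when nothing is found is an assumption. The default path uses `pubchempy.get_compounds(name, "name")` and therefore needs network access.

```python
def sketch_lowercase_iupac(comp_name, lookup=None):
    """Return the lowercase IUPAC name for comp_name, or None if not found.

    Hypothetical helper for illustration only. `lookup` is any callable that
    maps a name to a list of objects with an `iupac_name` attribute.
    """
    if lookup is None:
        # Imported lazily so the sketch can be read and tested offline.
        import pubchempy as pcp

        def lookup(name):
            # Queries PubChem by compound name (needs internet access).
            return pcp.get_compounds(name, "name")

    compounds = lookup(comp_name)
    # Assumption: take the first hit and treat "no hit" as None.
    if not compounds or compounds[0].iupac_name is None:
        return None
    return compounds[0].iupac_name.lower()
```

Passing a stub as `lookup` makes it possible to exercise the lowercase conversion without contacting PubChem.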

hplc_data_analysis.pubchem.name_to_properties(comp_name, dict_classes_to_codes, dict_classes_to_mass_fractions, df=pd.DataFrame(), precision_sum_elements=0.05, precision_sum_functional_group=0.05)[source]

Retrieves the chemical properties of the compound named comp_name and stores them in df.

Return type:

DataFrame

Parameters:
  • GCname (str) – name from GC, used as a unique key.

  • search_name (str) – name to be used to search on pubchem.

  • df (pd.DataFrame) – DataFrame that contains all searched compounds.

  • df_class_code_frac (pd.DataFrame) – contains the functional group names, the codes to be searched, and the weight fraction of each group, used to automatically calculate the mass fraction of each functional group in each compound. Classes are given as SMARTS patterns and are matched against the SMILES string of the compound.

Returns:

  • df (pd.DataFrame) – updated dataframe with the searched compound.

  • CompNotFound (str) – set to GCname if the search for GCname returned no result.

hplc_data_analysis.pubchem.report_difference(rep1, rep2, diff_type='absolute')[source]

Calculates the average, standard deviation and percentage standard deviation of the difference between two reports that share the same columns and index. Replicates (indicated as XX_1, XX_2) are used to compute the standard deviation.

Parameters:
  • rep1 (pd.DataFrame) – report that is considered the reference to compute differences from.

  • rep2 (pd.DataFrame) – report with the data to compute the difference.

  • diff_type (str, optional) – type of difference, either 'absolute' or 'relative' (to rep1). The default is 'absolute'.

Returns:

  • dif_ave (pd.DataFrame) – contains the average difference.

  • dif_std (pd.DataFrame) – contains the std, same units as dif_ave.

  • dif_stdp (pd.DataFrame) – contains the percentage standard deviation relative to rep1.
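One way such a comparison could be computed is sketched below. It is an assumption-laden illustration rather than the package's implementation: the name `sketch_report_difference` is hypothetical, replicate columns are assumed to be named `XX_1`, `XX_2`, and the standard deviation of the difference is assumed to combine the replicate standard deviations of both reports in quadrature.

```python
import numpy as np
import pandas as pd


def sketch_report_difference(rep1, rep2, diff_type="absolute"):
    """Illustrative difference between two reports with replicate columns."""
    if diff_type not in ("absolute", "relative"):
        raise ValueError("diff_type must be 'absolute' or 'relative'")

    def sample_name(col):
        # Assumed naming scheme: "XX_1" and "XX_2" both belong to sample "XX".
        return col.rsplit("_", 1)[0]

    def ave_std(rep):
        # Transpose so replicate columns become rows, then group by sample.
        grouped = rep.T.groupby(sample_name)
        return grouped.mean().T, grouped.std().T

    ave1, std1 = ave_std(rep1)
    ave2, std2 = ave_std(rep2)

    dif_ave = ave2 - ave1
    # Assumption: independent errors, combined in quadrature.
    dif_std = np.sqrt(std1**2 + std2**2)
    # Percentage standard deviation relative to the reference averages.
    dif_stdp = dif_std / ave1 * 100

    if diff_type == "relative":
        # Express the difference and its spread as fractions of rep1.
        dif_ave = dif_ave / ave1
        dif_std = dif_std / ave1
    return dif_ave, dif_std, dif_stdp
```

With two reports that each hold columns `A_1` and `A_2`, the three returned frames have a single column `A` and keep the original index.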