gptbioinsightor.get_celltype

gptbioinsightor.get_celltype(input: AnnData | dict, out: Path | str = None, background: str = None, pathway: dict | None = None, key: str = 'rank_genes_groups', topnumber: int = 15, n_jobs: int | None = None, provider: str | None = None, model: str | None = None, search_model: str | None = None, group: str | Iterable[str] | None = None, base_url: str | None = None, rm_genes=True, score_prompt: str | None = None) → dict

Annotates gene sets using an LLM, providing cell types, supporting marker genes, reasoning, and potential cell state annotations.

Parameters:
  • input (AnnData | dict) – An AnnData object or geneset dict

  • out (Path | str, optional) – output path, by default None

  • background (str, optional) – background information about the input data, by default None

  • key (str, optional) – rank_genes_groups key, by default “rank_genes_groups”

  • topnumber (int, optional) – number of top genes selected for analysis, by default 15

  • n_jobs (int | None, optional) – number of parallel jobs used to query the LLM, by default None

  • provider (str | None, optional) – LLM provider, by default None. Use “openai” for ChatGPT, “aliyun” for Qwen, “deepseek” for DeepSeek, or “anthropic” for Claude.

  • model (str | None, optional) – set a model based on LLM provider, by default None

  • search_model (str | None, optional) – If provided, run Perplexity-based cell type search using this model before the main annotation workflow, by default None.

  • group (str | Iterable[str] | None, optional) – group or groups to annotate, by default None

  • base_url (str | None, optional) – customized LLM API url, by default None

  • rm_genes (bool, optional) – remove ribosomal (rb) and mitochondrial (mt) genes, by default True

  • score_prompt (str | None, optional) – custom scoring criteria prompt to override CELLTYPE_SCORE, by default None

Returns:

A dict of annotated cell types.

Return type:

dict
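
Example:

A minimal sketch of how a call might be assembled from the signature above. Several things here are assumptions and not confirmed by this page: the layout of the geneset dict (group name mapped to a list of gene symbols), the example gene lists and background string, and the helper names `build_kwargs` and `annotate`, which are hypothetical and not part of gptbioinsightor. Provider credentials are also not covered here.

```python
from typing import Any

# Keyword parameter names copied from the documented signature (all except `input`).
DOCUMENTED_PARAMS = {
    "out", "background", "pathway", "key", "topnumber", "n_jobs",
    "provider", "model", "search_model", "group", "base_url",
    "rm_genes", "score_prompt",
}


def build_kwargs(**kwargs: Any) -> dict:
    """Hypothetical helper: reject names that are not in the documented signature."""
    unknown = set(kwargs) - DOCUMENTED_PARAMS
    if unknown:
        raise TypeError(f"unexpected parameter(s): {sorted(unknown)}")
    return kwargs


# ASSUMPTION: the geneset dict maps a group name to a list of gene symbols.
# The exact layout expected by get_celltype is not specified on this page.
genesets = {
    "cluster_0": ["CD3D", "CD3E", "IL7R", "CCR7"],
    "cluster_1": ["MS4A1", "CD79A", "CD79B"],
}

kwargs = build_kwargs(
    background="Human PBMC, healthy donor",  # illustrative free-text context
    topnumber=15,                             # documented default
    provider="openai",                        # one of the documented providers
    rm_genes=True,                            # documented default
)


def annotate(genesets: dict, **kwargs: Any):
    """Hypothetical wrapper: forward to get_celltype when the package is installed."""
    try:
        import gptbioinsightor
    except ImportError:
        return None  # package not available, so nothing is annotated
    return gptbioinsightor.get_celltype(genesets, **kwargs)


# With the package installed and provider credentials configured:
# celltypes = annotate(genesets, **kwargs)
```

For an AnnData input, the same keyword arguments apply, with key naming the rank_genes_groups entry to read.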