Internship Questions Set Stage 3


This project delves into the critical intersection of antibiotic resistance and severe infections such as sepsis. Sepsis, a life-threatening condition, often involves E. coli, a bacterium commonly found in the human gut. This research aims to comprehensively analyze the prevalence of antibiotic resistance genes within E. coli genomes isolated from sepsis patients.

By focusing on specific antibiotics, the study seeks to identify the prevalence and distribution of resistance genes, shedding light on the effectiveness of current treatment regimens. Understanding the genomic landscape of antibiotic resistance in E. coli associated with sepsis is pivotal for informing clinical practices, antibiotic stewardship, and public health policies.

This research not only contributes valuable insights into the evolving landscape of bacterial resistance but also aids in refining therapeutic strategies for sepsis management. By deciphering the genetic basis of antibiotic resistance in E. coli, the project plays a crucial role in the ongoing global effort to combat antimicrobial resistance and enhance patient outcomes in the face of severe infections.

Selected antibiotics: Penicillin(s), Ciprofloxacin and Aminoglycoside
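
The prevalence analysis described above could be sketched as follows. This is a minimal illustration, assuming each isolate has already been annotated with a continent and a set of detected resistance genes; the gene names (`geneA`, `geneB`) and the input layout are hypothetical placeholders, not part of the task data.

```python
from collections import defaultdict


def prevalence_by_continent(isolates, gene):
    """Return the fraction of isolates per continent that carry `gene`.

    `isolates` is an iterable of (continent, set_of_detected_genes) pairs.
    """
    totals = defaultdict(int)
    carriers = defaultdict(int)
    for continent, genes in isolates:
        totals[continent] += 1
        if gene in genes:
            carriers[continent] += 1
    return {continent: carriers[continent] / totals[continent] for continent in totals}


# Hypothetical annotations, for illustration only (not real results).
isolates = [
    ("Africa", {"geneA"}),
    ("Africa", set()),
    ("Europe", {"geneA", "geneB"}),
    ("Asia", {"geneB"}),
]

for continent, fraction in prevalence_by_continent(isolates, "geneA").items():
    print(f"{continent}: {fraction:.0%} of isolates carry geneA")
```

Reporting a fraction per continent rather than a raw count keeps the comparison fair when the continents contribute different numbers of isolates.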

Notes:

  • Ensure that you focus on isolates obtained from sepsis patients
  • Your dataset should cover at least 3 continents
  • Prepare a graphical abstract using Biorender, Bioicons or any other tool at your disposal
  • As early as possible, prepare a 1-2 minute pitch of your research topic and publish it on Twitter, YouTube and Instagram. The pitch should be as simple as possible; your ‘grandma’ should be able to understand it. The links to your social media posts will be collected by the end of the first week in this stage.
  • Votes from social media will be counted alongside the final presentation.
  • Prepare PowerPoint slides for your final presentation to the world. This is a global event. To ensure fairness, voting will be done by the public. No HackBio organizer or mentor will be allowed to vote.

Recently, non-coding regions of the genome (introns and intergenic regions) have attracted growing interest for their role in the development of genetic disorders. In a recent project, gnomAD developed a score for identifying intergenic regions that could harbour variants with functional significance. They termed this the Gnocchi (z) score, which is calculated as (Obs - Exp)^2 / Exp.

Using the data source below (~2M variants), compute the gnocchi score (ignore the unfiltered_z).
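
A minimal sketch of the score calculation, using the formula exactly as quoted in the task text. The row layout and the field order (chromosome, start, end, observed, expected) are assumptions; adjust them to the columns of the actual data file.

```python
def gnocchi_score(observed: float, expected: float) -> float:
    """Return (Obs - Exp)^2 / Exp, the formula quoted in the task text."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return (observed - expected) ** 2 / expected


# Hypothetical rows: (chromosome, start, end, observed, expected).
rows = [
    ("chr1", 1000, 2000, 10.0, 20.0),
    ("chr2", 5000, 6000, 30.0, 25.0),
]

# Keep the coordinates and attach the computed score to each region.
scored = [
    (chrom, start, end, gnocchi_score(obs, exp))
    for chrom, start, end, obs, exp in rows
]
print(scored)
```

Rows with a non-positive expected count raise an error here instead of silently producing a division by zero; filtering them out beforehand is an equally reasonable choice.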

Next, using your own z_score, generate a chromosome-level distribution plot (histogram, boxplot or density plot) showing the distribution of the gnocchi score within each chromosome.

Assuming a cut-off of Z-score > 4.0, which chromosome has the most significant variants?
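
The grouping and thresholding steps could be sketched as below. It assumes the scores are already available as (chromosome, score) pairs, which is a hypothetical layout. No plot is drawn; the per-chromosome lists it builds are what a boxplot or histogram function would take as input.

```python
from collections import Counter, defaultdict
from statistics import median

CUTOFF = 4.0  # threshold stated in the task: score strictly greater than 4.0


def group_by_chromosome(records):
    """Collect scores into one list per chromosome."""
    groups = defaultdict(list)
    for chrom, score in records:
        groups[chrom].append(score)
    return dict(groups)


def count_significant(records, cutoff=CUTOFF):
    """Count records per chromosome whose score exceeds the cutoff."""
    return Counter(chrom for chrom, score in records if score > cutoff)


# Hypothetical scores, for illustration only.
records = [("chr1", 5.0), ("chr1", 1.2), ("chr1", 4.5), ("chr2", 4.0), ("chr2", 6.3)]

groups = group_by_chromosome(records)
for chrom, scores in groups.items():
    print(chrom, "n =", len(scores), "median =", median(scores))

counts = count_significant(records)
top_chrom, top_n = counts.most_common(1)[0]
print(f"{top_chrom} has the most significant regions ({top_n})")
```

Note that a score of exactly 4.0 is not counted, because the stated cut-off is a strict inequality.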

For all the significant regions you have selected, use the GenomicRanges package (in R) to identify the closest gene to each region, as defined by its start and end columns. Download any of the Super Enhancer datasets from SEdb (https://bio.liclab.net/sedb/download.php) and use the nearest() function from GenomicRanges to perform this step.
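
To make the idea behind a nearest-range lookup concrete, here is a small Python illustration. It is not the GenomicRanges implementation and does not replace the R step requested above; the distance convention used here (gap between interval ends, 0 on overlap) and the feature names are assumptions chosen for clarity, and the package's own convention may differ.

```python
def interval_distance(a_start, a_end, b_start, b_end):
    """Gap between two closed intervals on the same chromosome; 0 if they overlap."""
    if a_end < b_start:
        return b_start - a_end
    if b_end < a_start:
        return a_start - b_end
    return 0


def nearest_feature(region, features):
    """Return (name, distance) of the closest feature on the same chromosome.

    `region` is (chrom, start, end); `features` is a list of
    (name, chrom, start, end). Returns None if no feature shares the chromosome.
    """
    chrom, start, end = region
    best = None
    for name, f_chrom, f_start, f_end in features:
        if f_chrom != chrom:
            continue
        distance = interval_distance(start, end, f_start, f_end)
        if best is None or distance < best[1]:
            best = (name, distance)
    return best


# Hypothetical annotation table, for illustration only.
features = [
    ("SE_A", "chr1", 100, 200),
    ("SE_B", "chr1", 900, 1000),
    ("SE_C", "chr2", 300, 400),
]

print(nearest_feature(("chr1", 250, 300), features))
print(nearest_feature(("chr1", 950, 960), features))
```

This linear scan is fine for a handful of regions; for genome-scale inputs a sorted or indexed structure, such as the one the R package provides, is the better tool.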

Notes:

  • Prepare a graphical abstract using Biorender, Bioicons or any other tool at your disposal
  • As early as possible, prepare a 1-2 minute pitch of your research topic and publish it on Twitter, YouTube and Instagram. The pitch should be as simple as possible; your ‘grandma’ should be able to understand it. The links to your social media posts will be collected by the end of the first week in this stage.
  • Votes from social media will be counted alongside the final presentation.
  • Prepare PowerPoint slides for your final presentation to the world. This is a global event. To ensure fairness, voting will be done by the public. No HackBio organizer or mentor will be allowed to vote.

Bacterial infections are the second most common complication in patients with cancer due to both disease-related and treatment-related immunosuppression. They undermine treatment outcomes and reduce survival: an estimated 8.5% of cancer deaths are due to severe sepsis.

In this project, you will profile the resistance patterns of 8 pathogens across 64 patients.

Starting Dataset:

  1. 64 bacterial genomes reported under SRP417207. Use https://sra-explorer.info/ to download the dataset.

Selected antibiotics: Penicillin(s), Ciprofloxacin and Aminoglycoside

Selected genes: Do your research to identify at least 3 genes for each antibiotic class

Specific Task: Identify the pathogens in each of the 64 samples and describe the most common resistance pattern.
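
One way to find the most common resistance pattern is to treat each sample's set of detected genes as a pattern and tally identical sets. The sketch below assumes the per-sample gene calls already exist; the sample IDs and gene names are hypothetical placeholders, not results.

```python
from collections import Counter


def most_common_pattern(sample_genes):
    """Return (sorted gene list, sample count) for the most frequent gene set.

    `sample_genes` maps a sample ID to the set of resistance genes detected in it.
    """
    patterns = Counter(frozenset(genes) for genes in sample_genes.values())
    pattern, count = patterns.most_common(1)[0]
    return sorted(pattern), count


# Hypothetical gene calls, for illustration only.
sample_genes = {
    "sample_01": {"geneA", "geneB"},
    "sample_02": {"geneB", "geneA"},
    "sample_03": {"geneC"},
}

pattern, count = most_common_pattern(sample_genes)
print(f"Most common pattern: {pattern} in {count} samples")
```

Using `frozenset` makes the pattern independent of the order in which genes were reported, so `{"geneA", "geneB"}` and `{"geneB", "geneA"}` count as the same pattern.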

Notes:

  • Ensure that you focus on isolates obtained from cancer patients (any kind of cancer is allowed, just remember to mention which)
  • ⚠️Be sure you are not downloading metagenomic datasets
  • Prepare a graphical abstract using Biorender, Bioicons or any other tool at your disposal
  • As early as possible, prepare a 1-2 minute pitch of your research topic and publish it on Twitter, YouTube and Instagram. The pitch should be as simple as possible; your ‘grandma’ should be able to understand it. The links to your social media posts will be collected by the end of the first week in this stage.
  • Votes from social media will be counted alongside the final presentation.
  • Prepare PowerPoint slides for your final presentation to the world. This is a global event. To ensure fairness, voting will be done by the public. No HackBio organizer or mentor will be allowed to vote.

The Cancer Cell Line Encyclopedia (CCLE) project aims to characterize cancer cell lines. Using WGS techniques, it has fully sequenced cancer cell lines representing different primary sites in the body.

Your task is to select any 5 cell lines and perform variant calling and interpretation for each of them. Use a clinical-grade variant interpreter such as MutSigCV, GATK or Cutevariant.

Also, identify the mutations that are present across all of your selected cancer cell lines and those that are unique to a single line. Describe them and interpret their functional consequences. (Hint: gene enrichment analysis)
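
The shared-versus-unique comparison comes down to set operations on the variant calls. A minimal sketch, assuming each variant has been reduced to a comparable key such as "chrom:pos:ref>alt"; the cell line names and variant keys below are hypothetical.

```python
def shared_and_unique(calls):
    """Split variants into those shared by every cell line and those unique to one.

    `calls` maps a cell line name to an iterable of variant keys.
    Returns (shared_set, {cell_line: unique_set}).
    """
    sets = {line: set(variants) for line, variants in calls.items()}
    shared = set.intersection(*sets.values()) if sets else set()
    unique = {}
    for line, variants in sets.items():
        # Everything seen in any other cell line.
        others = set().union(*(v for other, v in sets.items() if other != line))
        unique[line] = variants - others
    return shared, unique


# Hypothetical variant keys, for illustration only.
calls = {
    "line_A": ["chr1:100:A>T", "chr2:200:G>C", "chr3:300:C>T"],
    "line_B": ["chr1:100:A>T", "chr3:300:C>T", "chr4:400:T>G"],
    "line_C": ["chr1:100:A>T", "chr5:500:G>A"],
}

shared, unique = shared_and_unique(calls)
print("Shared by all lines:", sorted(shared))
for line, variants in unique.items():
    print(f"Unique to {line}:", sorted(variants))
```

Variants found in some but not all lines (here `chr3:300:C>T`) fall into neither group, which is worth reporting separately if they turn out to be numerous.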

Data Source

Notes:

  • Ensure that you download WGS datasets from cancer cell lines.
  • ⚠️Be sure you are not downloading metagenomic, RNA-seq or epigenomic datasets
  • Prepare a graphical abstract using Biorender, Bioicons or any other tool at your disposal
  • As early as possible, prepare a 1-2 minute pitch of your research topic and publish it on Twitter, YouTube and Instagram. The pitch should be as simple as possible; your ‘grandma’ should be able to understand it. The links to your social media posts will be collected by the end of the first week in this stage.
  • Votes from social media will be counted alongside the final presentation.
