Bioinformatics 101
Bots and Robots, gather round, for a random, high-level, and completely non-scientific introduction to bioinformatics is about to begin!
Are you ready to learn about bioinformatics without all the boring details? Then you’ve come to the right place.
We’ll start with a general view of the field, and then zoom in to the details, just like a David Fincher film. But instead of mind-bending plots and psychological thrillers, we’ll be talking about DNA and proteins. And unlike a Fincher movie, you’ll actually understand what’s going on.
But before we begin, let’s get one thing straight: this is not a real introduction to bioinformatics. This is an introduction to understanding bioinformatics at a high level. So, if you’re looking for a detailed explanation of how to align a sequence or predict protein structure, you’re out of luck. What we’re about to discover is when we’ll learn about it.
My goal for each post is to develop some understanding that I can build on in the next post.
Why Bioinformatics? I strongly believe that biology has always been the backbone of research. In the 21st century, every generational company will be built through software, and within the ML field, everyone has the opportunity to get in on the ground floor and be on the team building it.
Thus, learning about Bioinformatics is getting the best of both worlds!
I would like to start by mentioning Scott H. Young’s TED Talk, which inspired me to research the form of this series. Looking to follow MIT’s CS course, he leveraged OpenCourseWare material, based on the school’s course catalog, and dedicated 12 months to following it.
Throughout my exploration, I gradually realized that simply writing down an introduction to a whole field might be more difficult than I expected. Fortunately, this preliminary exploration led me to discover a few inspirations and guided me toward a clear direction for how this series of posts should be written.
Despite naming this page Bioinformatics 101, it was meant to be an overview, a sort of table of contents from which I can source, each time, a clear list of subjects worth exploring.
As mentioned in my Preface, this is my ‘topic-bucket’ meant to provide me with prompts to know where my next post is headed.
For each subject, you should expect me to research it through Wikipedia, Google, and MIT’s OpenCourseWare, and finally to try to write a small blog post about it. Contrary to what I expected early on, there is no reason for my blog posts to be paper-like just because their subjects are academic. What could be interesting would be to extend any blog post by looking up its topic on arXiv to see the most recently published papers.
Without further ado, here’s the outline (the following sections are under construction):
Biology
Local and Global Alignment (BLAST, NW, SW, PAM, BLOSUM)
Mapping: Library Complexity & Short Read Alignment
Template preparation, Sequencing and Imaging, Genome Alignment and Assembly Approaches
Proteomics[1]
Introduction to Protein Structure, Structure Comparison & Classification
Protein structure prediction and protein-protein interactions.
Predicting Protein Structure & Interactions
Analysis and alignment of Protein-Protein Interaction networks
Protein folding
Genomics
Comparative Genomics & Gene Regulation
Causality, Natural Computing and Engineering Genomes
Genomic data, genome assembly, gene prediction, and functional genomics.
Markov Models of Genomics & Protein Features
ChIP-seq Analysis, DNA-Protein Interactions
RNA
Introduction to RNA, RNA Structure and modeling
RNA-seq Analysis, Expression, Isoforms, Splicing
RNA Secondary Structure, Biological Functions & Predictions
Modeling of Sequence Motifs
Single cell RNA-sequencing
scRNA-seq, dimensionality reduction
Dimensionality Reduction, Genetics, and Variation
GWAS and Rare variants
Minimum free energy and partition function with the Nearest neighbor energy model
Stochastic prediction of RNA secondary structures
Comparative modeling of RNAs
Simultaneous folding and alignment of structured RNAs.
RNA sequence-structure maps and simulating the evolution of RNA populations
Pseudo-knots and RNA-RNA interaction predictions
RNA functional structures are made of theoretically predicted overrepresented blocks
RNA 3D Modeling & structure prediction
Protein & Proteomics (again)
Introduction to Protein structure
Protein secondary structure prediction using Neural Networks
Protein residue contact prediction
Protein fold recognition and threading
Molecular dynamics simulation
Protein 3D structure prediction
Structural bioinformatics and machine learning for drug design
Minimalist models: Protein folding on HP lattice models
3D Genomics
Networks:
Gene Regulatory Networks
Protein Interaction Networks
Logic Modeling of Cell Signaling Networks
Introduction to Chromatin: Structure & Classification
Chromatin and gene regulation
eQTLs
Quantitative Trait Loci (QTLs)
Human Genetics, SNPs, and Genome-Wide Association Studies
DNA Accessibility, Promoters and Enhancers
Transcription Factors, DNA methylation
Gene Expression, Splicing
Drug Discovery and Design
Discovery and design of new drugs, including virtual screening, molecular dynamics simulations, and cheminformatics.
Biomedical Informatics
Bioinformatics in medicine and healthcare: electronic health records, clinical decision support systems, and personalized medicine.
Electronic health records and patient data
Imaging applications in healthcare
ML for health data
Synthetic Biology
Modules & Therapeutic Systems
Biological Engineering
Fundamentals of Biological Engineering
Introduction to Cell and molecular biology, biochemistry, and genetics.
Bioinformatics
Computational methods for sequence alignment, gene annotation, and phylogenetic analysis.
Bioengineering
Introduction to gene therapy, genetic engineering, and synthetic biology.
Bioprocess Engineering
Design and optimization of biological systems for the production of chemicals, drugs, and other products.
Biomedical Imaging
Principles and techniques of biomedical imaging: X-ray, CT, MRI, ultrasound, and PET/CT.
Tissue Engineering and Regenerative Medicine
Tissue-engineered products and therapies, including stem cells, scaffolds, and biomaterials. (I feel like this is starting to be very far-fetched.)
Computer Science
Systems Biology and Synthetic Biology
Computational methods used in systems biology: network inference, model building, data integration.
Biomedical Data Science
Data analysis, data visualization and ML methods used in biomedical research.
ML Foundations, CNN
Recurrent Neural Networks, Graph Neural Networks
Neural Networks Review
Interpretability, Dimensionality Reduction
Generative Models, GANs, VAE
Video processing, structure determination
Imaging and Cancer
EHRs and data mining
Neuroscience
If you’re reading this, It’s Too Late. Thank you for giving me the benefit of the doubt despite the obvious chaos.
But you asked for it, folks! A brief introduction list to bioinformatics. If you feel passionate about a missing topic, do not feel free to comment which topic is missing.
As with any Fincher movie, this is just a small glimpse into my project; from implementing bioinformatics methods to exploring the latest papers in genomics and proteomics, we’ll eventually be covering it all.
So, whether you’re a student, a researcher, or just an avid reader wondering if this can get any worse, stay tuned for more posts about Bioinformatics.
In the meantime, keep learning, keep exploring, and remember to never stop asking questions. Everything is constantly evolving, and there’s always more to discover.
Thank you for reading. And as usual, Godspeed :)
Bottom note: I think it would be interesting to find a way to express each blog post in a concrete format. Off the top of my head, I’m thinking of generating an artistic representation specific to the topic at hand. More to think about.
You might have noticed the rough edges of this list, from being too long to being repetitive. I wish I could truthfully say that it was purposely created this way, but it wasn’t: I’ll try to edit it as we go!
[1] I would emphasize this part, as I have a keen interest in proteins.