Marc Singleton

Resources

This page has shout-outs to other great resources for all things coding, statistics, machine learning, biology, etc., that I’ve come across on the web or elsewhere. At one point or another, I found all of these useful for something I was trying to learn about, so I want to do my part to promote those authors’ hard work here (or at least boost their search results by feeding the ranking algorithms).

Bioinformatics

UCSC Data File Formats

Half the battle in bioinformatics is often just parsing the data, and while there are plenty of tools for the standard formats, there’s no point in extracting a field from a file if it’s not clear what it even means. This page from UCSC is a valuable reference for a variety of formats. Though UCSC is not the maintainer for most of them, the page does contain links and references to many of the official specifications. The UCSC Genomics Institute is also a de facto authority for a variety of other genomics resources, so the site is well worth exploring.
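
To illustrate why knowing what a field means matters, here is a minimal Python sketch built around the BED format, whose start coordinate is 0-based and whose end coordinate is excluded from the feature. The function name `parse_bed_line`, the dictionary keys, and the example line are all made up for illustration, and the sketch assumes tab-separated fields and reads only the first three columns.

```python
def parse_bed_line(line: str) -> dict:
    """Parse the first three columns of a tab-separated BED line.

    Hypothetical helper for illustration only; it ignores optional columns,
    header lines, and any validation a real parser would need.
    """
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    start = int(fields[1])  # 0-based position of the first base
    end = int(fields[2])    # position just past the last base (excluded)
    return {
        "chrom": chrom,
        "start": start,
        "end": end,
        # Because the end is excluded, the length is a plain difference.
        "length": end - start,
        # The same first base in a 1-based coordinate system.
        "first_base_1_based": start + 1,
    }


# Made-up example line, not taken from any real data set.
record = parse_bed_line("chr1\t999\t2000\n")
print(record)
```

Reading the start column as if it were 1-based would shift every feature by one base, which is exactly the kind of mistake a format reference helps avoid.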

Genomics Tools

Tools for working with some of the major genomics file formats:

Coding

mCoding

mCoding, by James Murphy, is a YouTube channel known for its practical, hands-on deep dives into Python’s internals. I highly recommend its videos for intermediate Python users who are looking to level up to expert.

Reducible

Reducible is a YouTube channel that covers a variety of coding topics, ranging from algorithms to computer graphics. In many ways, it feels like the computer science equivalent of 3Blue1Brown for its use of animated visualizations.

Semantic Versioning

Everyone working with code eventually has to come to terms with versioning. Version control systems like Git are sufficient for internal or private uses, but once software goes public, it becomes convenient (and often necessary) to attach human-readable version numbers. Though they seem intuitive, version numbers are surprisingly easy to mess up in the absence of a clear standard. And when entire software ecosystems depend on a common language for managing dependencies, why reinvent a perfectly good wheel?
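
As a small example of how version numbers can go wrong, the Python sketch below compares a few versions written in the MAJOR.MINOR.PATCH form that Semantic Versioning defines. The helper `parse_version` is a hypothetical name of my own, and it assumes plain three-part versions with no pre-release or build metadata, which the full specification also covers.

```python
def parse_version(version: str) -> tuple:
    """Split a plain MAJOR.MINOR.PATCH string into a tuple of integers.

    Hypothetical helper for illustration; it does not handle pre-release
    tags or build metadata.
    """
    major, minor, patch = version.split(".")
    return int(major), int(minor), int(patch)


versions = ["1.9.0", "1.10.0", "1.2.3"]

# Plain string sorting compares character by character, so "1.10.0"
# lands before "1.2.3" and "1.9.0".
string_order = sorted(versions)

# Sorting on the parsed tuples compares each component numerically.
numeric_order = sorted(versions, key=parse_version)

print(string_order)
print(numeric_order)
```

The two orderings disagree, which is why tooling that manages dependencies relies on an agreed-upon format instead of treating versions as arbitrary strings.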

Composing Programs

Composing Programs is the companion text to UC Berkeley’s introductory computer science course, CS 61A. Though it uses Python for its examples, its focus is less on the specifics of that language and more on general concepts in computing, such as recursion and abstraction. While it’s not the first or only resource I would use when learning Python, I would recommend it and the lecture materials from previous offerings of the course for those who already have a grasp of Python basics and are looking for a deeper understanding of computer science fundamentals.

The Linux Command Line

Knowing the command line is essential for taking full advantage of the Linux operating system, and some tasks can’t even be done without it. To quote this book, “Graphical user interfaces make easy tasks easy, while command line interfaces make difficult tasks possible.” While there’s no shortage of introductions to the Linux command line, I particularly enjoy this one for both its breadth and accessibility, covering a cross-section of topics in a conversational style that lays a strong foundation for deeper dives with other sources. It also has interludes that explain the historical context behind certain design choices, which ensures the reader comes away knowing both the what and the why of the command line.

Pro Git

Like the Linux command line, Git is another piece of computing infrastructure that is extremely powerful but also highly intimidating to new users. While there are other guides that will get someone up and running with Git much faster, it’s worth developing familiarity with this resource for its clear explanations of both Git fundamentals and Git’s internal “plumbing.”

Machine Learning

Distill

Distill was a scientific journal for machine learning articles that fully embraced the interactivity of modern web pages. Its articles largely focused on the explanation and visualization of algorithms rather than the development of novel architectures or applications. Unfortunately, it’s now on indefinite hiatus, but its articles still set a high-water mark for scientific communication.

Mathematics

3Blue1Brown

3Blue1Brown, by Grant Sanderson, is likely the most popular YouTube channel in the math education space right now. Its videos cover a variety of topics, from the basics of calculus to theorems from complex analysis, in a signature style that combines slick animations with a narrative emphasis on intuition and discovery.

The Bright Side of Mathematics

The Bright Side of Mathematics is a YouTube channel by Julian Großmann that features Khan Academy-style short video lectures with a digital chalkboard. Unlike Khan Academy, TBSoM focuses on more advanced topics that would typically be taught in upper-division undergraduate or graduate courses. However, the channel has playlists that cover foundational concepts from logic and set theory as well.

Desmos Graphing Calculator

A simple but powerful graphing calculator! I’ve been using it for years, and throughout that time it’s remained an intuitive and stable web app. Don’t be fooled by the minimal interface. It supports a deep set of features, including a variety of coordinate systems, inequality graphing, parameter sliders, and more!

Scientific Writing and Presentation

Tools for compiling references with BibTeX:

Tools for creating figures and diagrams: