How I build this website

2024-01-01

Meta Productivity Programming Web

1 Introduction

At some point in my data science career, I realised that it isn't enough to grind away in your spare time learning everything data scientists are expected to know: you must take the next step and tell people about it.

It is your responsibility, to both yourself and others, to give the public an accurate idea of what you are interested in and just how much you know. When doing so, you're not bragging about yourself - you're politely conveying useful information to your audience about 'who you truly are', which no other medium expresses as effectively as a personal website. It may give some of your audience delightful content to consume, while empowering others to make better-informed judgements about you.

For example, suppose you're someone who truly, deep down, experiences the greatest joy from playing around in your mind with structures from abstract algebra as if they were toys, or perhaps from writing an elegant algorithm to solve an ancient problem which can be explained to a child yet has a nontrivial solution. Unfortunately, there are very few opportunities to convey this in person, because these topics practically never come up in discussion: they are far from 'middle-of-the-distribution' interests.

This is why data scientists, who (should) have these sorts of interests, must, in my opinion, have a personal website.

2 Shortlist of options for building my website

The following table contains my shortlist of options for building my website.

| Solution | Category | Description | Flexibility | Coding required | Link |
| --- | --- | --- | --- | --- | --- |
| Squarespace | Website builder | SaaS for website building and hosting. Allows users to use pre-built website templates and drag-and-drop elements to create and modify webpages. | Low | No to low | https://www.squarespace.com/ |
| Wix | Website builder | Tools for creating HTML5 websites and mobile sites using online drag-and-drop editing. | Low | No to low | https://www.wix.com/ |
| WordPress | Website builder | Web content management system. Originally created as a tool to publish blogs but has evolved to support publishing other web content. | Low | No to low | https://wordpress.com/ |
| Blogdown | Web building framework | R package for creating websites with R Markdown. | Medium | Medium | https://bookdown.org/yihui/blogdown/ |
| Hugo | Web building framework | Fast, open-source static site generator. Build websites using Markdown files. | Medium | Medium | https://gohugo.io/ |
| Quarto | Web building framework | Build websites using Jupyter notebooks or Markdown. | Medium | Medium | https://quarto.org/ |
| Fully custom | | Write your own scripts to build HTML pages from source files. | High | High | |

3 My decision: A fully custom solution

I chose to go with a fully custom solution for the following reason:

3.1 The sole reason I draft posts in MS Word: Writing math

Anyone reading this with a background in academia is not going to like hearing this, but I'll say it anyway:

For writing math, I have found no viable substitute for MS Word.

I have concluded that Word simply has the slickest, most seamless editing experience for writing math, period.

On more than one occasion, I have attempted to give LaTeX a chance. But I couldn't agree more with this YCombinator post, which argues that LaTeX is a good idea with a terrible implementation.

The math editing experience is the ONLY reason I use MS Word. Believe me, I would much rather be using plain text instead. If there were some other way to get as slick and seamless a math editing experience as MS Word in plain text, I would dispense with Word entirely and do it that way. But at the time of writing, there isn't.

4 My solution

My custom solution is essentially a collection of Python and bash shell scripts which build clean HTML files from dirty MS Word HTML files.
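To give a rough idea of what 'dirty' means: Word's filtered HTML typically wraps content in elements such as `<p class=MsoNormal>` and `<span style='...'>`, often with unquoted attribute values. The following is a minimal sketch of one way to strip that markup, not the actual script used for this site. The function name `clean_word_html` and the regex-based approach are assumptions made for illustration; a real script may well need an HTML parser and more rules.

```python
import re

# Presentational attributes Word tends to attach to elements in filtered HTML.
# Values may be double-quoted, single-quoted, or unquoted.
_ATTR = re.compile(r"""\s+(?:class|style|lang)=(?:"[^"]*"|'[^']*'|[^\s>]+)""", re.I)
_TAG = re.compile(r"<[^>]+>")
_SPAN = re.compile(r"</?span\b[^>]*>", re.I)
_BODY = re.compile(r"<body\b[^>]*>(.*?)</body>", re.I | re.S)


def clean_word_html(dirty: str) -> str:
    """Return the body content of a Word HTML export with Word-specific markup removed."""
    match = _BODY.search(dirty)
    # Fall back to the whole input if there is no <body> element.
    body = match.group(1) if match else dirty
    # Remove class/style/lang attributes inside tags only, leaving text untouched.
    body = _TAG.sub(lambda m: _ATTR.sub("", m.group(0)), body)
    # Unwrap <span> elements, which mostly exist to carry inline styles.
    body = _SPAN.sub("", body)
    return body.strip()


dirty = (
    "<html><head><style>p.MsoNormal{margin:0}</style></head>"
    "<body lang=EN-GB><div class=WordSection1>"
    "<p class=MsoNormal><span style='font-size:12.0pt'>Hello</span> world</p>"
    "</div></body></html>"
)
print(clean_word_html(dirty))
```

Regexes are fragile on arbitrary HTML, so this only suits input as regular as Word's own output.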

Simplified folder structure for my public-staging folder, which is a 'staging' folder for building my public folder (N.B. not all posts shown):


$ tree public-staging
public-staging/
├── assets
│         ├── style.css
│         ├── favicon.png
├── boilerplate
│         ├── about-middle.html
│         ├── contact-middle.html
│         ├── default-bottom.html
│         ├── default-top.html
│         ├── index-middle.html
│         ├── post-bottom.html
│         ├── post-middle.html
│         ├── posts-middle.html
│         ├── post-top.html
│         └── posts-top.html
├── out
│         ├── about.html
│         ├── contact.html
│         ├── index.html
│         └── posts.html
├── posts
│         ├── dense-percentile-rank-snowflake
│         │         ├── dense-percentile-rank-snowflake.docx
│         │         ├── dense-percentile-rank-snowflake.htm
│         │         ├── metadata.json
│         │         ├── post.html
│         │         └── thumbnail.png
│         ├── glorot-initialisation
│         │         ├── glorot-initialisation.docx
│         │         ├── glorot-initialisation.htm
│         │         ├── metadata.json
│         │         ├── post.html
│         │         └── thumbnail.png
│         ├── how-i-evaluate-ml-models
│         │         ├── how-i-evaluate-ml-models.docx
│         │         ├── how-i-evaluate-ml-models.htm
│         │         ├── metadata.json
│         │         ├── post.html
│         │         └── thumbnail.png
├── scripts
│         ├── cp-files.sh
│         ├── make-post.py
│         ├── make-posts.py
│         └── make-templates.sh
├── templates
│         ├── post-template.html
│         └── posts-template.html

With a post's DOCX file open, say how-i-evaluate-ml-models.docx, I do Save As > Web Page, Filtered (.htm), which creates an HTML file (strictly an HTM file, but the two differ only in the extension).

Then I run the following scripts in sequence:

  1. make-templates.sh (only if the boilerplate has changed): For post-template.html and posts-template.html, concatenate the relevant files in the boilerplate folder and move the resulting template files to the templates folder. For about.html, index.html and contact.html, concatenate the relevant files in the boilerplate folder and move the resulting final files to the out folder.
  2. make-post.py: Process the dirty MS Word HTML file into clean HTML, insert the result into post-template.html and save it as post.html in the post's subfolder.
  3. make-posts.py: Build the posts.html file by inserting its contents into posts-template.html, then move it to the out folder.
  4. cp-files.sh: Copy all final HTML files, including the posts.html file built in the previous step and the infrequently changed HTML files (about.html, contact.html, index.html), from public-staging to a separate, final public directory. It is this folder which gets uploaded and served by my web hosting provider (no, I don't DIY my hosting, at least at this stage).
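The first two steps above can be sketched roughly as follows. This is an illustration in Python only (the real make-templates.sh is a shell script), and several details are assumptions rather than facts from the scripts themselves: the `<!-- CONTENT -->` placeholder, the function names, and the guess that post-template.html is post-top.html, post-middle.html and post-bottom.html joined in that order.

```python
from pathlib import Path

# Hypothetical marker showing where the cleaned post body goes in the template.
PLACEHOLDER = "<!-- CONTENT -->"


def concatenate(parts: list[Path], destination: Path) -> None:
    """Join boilerplate fragments, in order, into a single HTML file."""
    destination.parent.mkdir(parents=True, exist_ok=True)
    joined = "".join(part.read_text(encoding="utf-8") for part in parts)
    destination.write_text(joined, encoding="utf-8")


def fill_template(template: Path, content: str, destination: Path) -> None:
    """Write a copy of the template with the placeholder replaced by content."""
    html = template.read_text(encoding="utf-8")
    if PLACEHOLDER not in html:
        raise ValueError(f"{template} has no {PLACEHOLDER} marker")
    destination.write_text(html.replace(PLACEHOLDER, content, 1), encoding="utf-8")


def build_post(root: Path, slug: str, content: str) -> Path:
    """Rebuild post-template.html, then write posts/<slug>/post.html from it."""
    boilerplate = root / "boilerplate"
    template = root / "templates" / "post-template.html"
    # Assumed ordering of the fragments: top, middle, bottom.
    fragments = [boilerplate / f"post-{part}.html" for part in ("top", "middle", "bottom")]
    concatenate(fragments, template)
    destination = root / "posts" / slug / "post.html"
    fill_template(template, content, destination)
    return destination
```

Under these assumptions, calling `build_post(Path("public-staging"), "glorot-initialisation", cleaned_html)` would regenerate the template and write that post's post.html, where `cleaned_html` stands for the already-cleaned post body.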

This solution currently meets all my requirements.
