AKA Gaussian process regression for the geosciences.
2023-10-06
Estimating the upper bound of a discrete uniform probability mass function from sampling without replacement using frequentist and Bayesian techniques.
2024-02-24
2023-12-08
Data engineering, Data cleansing, Best practices
For goodness' sake, can we please all agree on some naming conventions for temporal (date, time, datetime etc.) columns?
Suppose in some tables, you see this:
DT | WEEKDAY | MTH | X |
---|---|---|---|
2024-01-01 | 1 | 1 | 42 |
2024-01-02 | 2 | 1 | 42 |
2024-01-03 | 3 | 1 | 42 |
2024-01-04 | 4 | 1 | 42 |
2024-01-05 | 5 | 1 | 42 |
And in other tables, you see this:
DATE | WEEKDAY | MNTH | X |
---|---|---|---|
2024-01-01 | Monday | January | 42 |
2024-01-02 | Tuesday | January | 42 |
2024-01-03 | Wednesday | January | 42 |
2024-01-04 | Thursday | January | 42 |
2024-01-05 | Friday | January | 42 |
Now, you can't really label this incorrect. Through the lens of a single table, anyone could argue that 'date', 'weekday' and 'mnth' are acceptable names for a date, a day of the week and a month of the year. And anyone could argue that a weekday can be represented as either a string ('Monday') or an integer (1), so both are acceptable values for a column called 'weekday'.
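However, it is inconsistent, both in name and in meaning. The same date column is called DT in one table and DATE in another, the month column is MTH or MNTH, and WEEKDAY holds an integer in one table and a string in the other.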
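To make the problem concrete, here is a minimal sketch of how the two example tables could be brought into one shape. The canonical names (`date`, `weekday`, `month`), the `CANONICAL_NAMES` mapping and the `normalise_row` helper are all hypothetical choices made for illustration, not an agreed convention. The sketch also assumes the integer weekday follows ISO numbering (Monday = 1), which matches the sample rows.

```python
from datetime import date

# Hypothetical mapping from the column names seen in the wild
# to one canonical name each. The target names are an assumption.
CANONICAL_NAMES = {
    "DT": "date",
    "DATE": "date",
    "WEEKDAY": "weekday",
    "MTH": "month",
    "MNTH": "month",
}


def normalise_row(row):
    """Rename columns, then re-derive weekday and month from the date."""
    renamed = {
        CANONICAL_NAMES.get(key, key.lower()): value
        for key, value in row.items()
    }
    parsed = date.fromisoformat(renamed["date"])
    # Derive both fields from the date so their meaning is unambiguous:
    # ISO weekday (Monday = 1 ... Sunday = 7) and month number (1 to 12).
    renamed["weekday"] = parsed.isoweekday()
    renamed["month"] = parsed.month
    return renamed


# One row from each of the two example tables.
table_a = [{"DT": "2024-01-01", "WEEKDAY": 1, "MTH": 1, "X": 42}]
table_b = [{"DATE": "2024-01-01", "WEEKDAY": "Monday", "MNTH": "January", "X": 42}]

rows_a = [normalise_row(row) for row in table_a]
rows_b = [normalise_row(row) for row in table_b]

print(rows_a == rows_b)
```

Re-deriving the weekday and month from the date, rather than translating 'Monday' to 1 by lookup, is one possible design choice. It sidesteps the question of which encoding each source table used, at the cost of discarding whatever was stored there.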
Hi, I'm Tim. I'm an experienced technical data science leader with a passion for delivering value to businesses using data science, machine learning and artificial intelligence. In my 8+ years as a data scientist, I have worked across a diverse range of industries. I have worked for two ASX 200 companies in the energy and broadcast media industries, and have gained international exposure at a top-tier financial technology and consulting firm in the UK and Singapore.