WARCing up the wrong tree - Part I

Web archiving can be hard. Luckily, these projects exist -

- https://github.com/webrecorder/pywb (a complete web archive replay and recording solution)
- https://github.com/webrecorder/warcio (a library to write/read WARC files)
- https://github.com/chfoo/warcat (a library for handling WARC files)
- https://github.com/machawk1/warcreate (a Chrome extension for capturing webpages)

Code snippets

A couple of code snippets that I’ve put together which I’ll probably refer back to.
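To give a feel for what the WARC libraries above deal with, here is a minimal sketch of the record layout using only the standard library. It assumes the WARC/1.0 structure (a version line, named header fields, a blank line, the payload, then two CRLFs). The function names build_warc_record and parse_warc_record are made up for illustration and are not part of warcio or warcat. Real archives are usually gzip-compressed per record and carry several record types, which is exactly what those libraries handle properly.

```python
import io
import uuid
from datetime import datetime, timezone


def build_warc_record(target_uri, payload, content_type="text/plain"):
    """Serialise one WARC/1.0 'resource' record as bytes (illustrative only)."""
    headers = [
        ("WARC-Type", "resource"),
        ("WARC-Record-ID", f"<urn:uuid:{uuid.uuid4()}>"),
        ("WARC-Date", datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")),
        ("WARC-Target-URI", target_uri),
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ]
    head = "WARC/1.0\r\n" + "".join(f"{k}: {v}\r\n" for k, v in headers) + "\r\n"
    # A record ends with two CRLFs after the payload block.
    return head.encode("utf-8") + payload + b"\r\n\r\n"


def parse_warc_record(stream):
    """Read one uncompressed record from a binary stream.

    Returns a (headers dict, payload bytes) tuple.
    """
    version = stream.readline().strip()
    if not version.startswith(b"WARC/"):
        raise ValueError("not a WARC record")
    headers = {}
    while True:
        line = stream.readline()
        if line in (b"\r\n", b""):  # blank line separates headers from payload
            break
        name, _, value = line.decode("utf-8").partition(":")
        headers[name.strip()] = value.strip()
    payload = stream.read(int(headers["Content-Length"]))
    stream.read(4)  # consume the trailing CRLF CRLF
    return headers, payload


raw = build_warc_record("https://example.com/", b"hello archive")
parsed_headers, parsed_payload = parse_warc_record(io.BytesIO(raw))
print(parsed_headers["WARC-Target-URI"], parsed_payload)
```

For anything beyond a toy like this, one of the libraries listed above is the better choice.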

Python blockchain

I decided to have another look into blockchain - this time by building one. I found this excellent Medium article - https://medium.com/@vanflymen/learn-blockchains-by-building-one-117428612f46 - which goes through the process of building one in Python. The blockchain that I built is available here - I’ve added some instructions on how to use it with HTTPie.
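As a rough idea of what "building one" involves, here is a minimal sketch of a hash-linked chain with a simple proof of work. It is not the code from the article or from my repository. The function names, the difficulty of 3 leading zeros, and the genesis block values are all assumptions chosen for illustration, and there is no networking or HTTP API here.

```python
import hashlib
import json
import time


def hash_block(block):
    """SHA-256 of a block, with keys sorted so the hash is deterministic."""
    encoded = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(encoded).hexdigest()


def valid_proof(last_proof, proof, difficulty=3):
    """A proof is valid if sha256(last_proof + proof) starts with N zeros."""
    guess = f"{last_proof}{proof}".encode()
    return hashlib.sha256(guess).hexdigest().startswith("0" * difficulty)


def proof_of_work(last_proof, difficulty=3):
    """Brute-force the smallest non-negative integer that is a valid proof."""
    proof = 0
    while not valid_proof(last_proof, proof, difficulty):
        proof += 1
    return proof


def new_block(chain, transactions, difficulty=3):
    """Mine a block that links to the current tip and append it to the chain."""
    last = chain[-1]
    block = {
        "index": last["index"] + 1,
        "timestamp": time.time(),
        "transactions": transactions,
        "proof": proof_of_work(last["proof"], difficulty),
        "previous_hash": hash_block(last),
    }
    chain.append(block)
    return block


def is_valid_chain(chain, difficulty=3):
    """Check every hash link and every proof between neighbouring blocks."""
    for prev, curr in zip(chain, chain[1:]):
        if curr["previous_hash"] != hash_block(prev):
            return False
        if not valid_proof(prev["proof"], curr["proof"], difficulty):
            return False
    return True


# Hypothetical genesis block; the values are arbitrary.
genesis = {"index": 1, "timestamp": time.time(), "transactions": [],
           "proof": 100, "previous_hash": "1"}
chain = [genesis]
new_block(chain, [{"sender": "alice", "recipient": "bob", "amount": 5}])
new_block(chain, [{"sender": "bob", "recipient": "carol", "amount": 2}])
print(is_valid_chain(chain))
```

Because each block stores the hash of the one before it, editing an earlier transaction changes that block's hash and breaks the link held by its successor, so is_valid_chain would then return False.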

The great 2020 plan

Here’s an outline of what I’d like to achieve throughout 2020.

Math

I’ve decided to brush up on my math in preparation for a machine learning course that I’ll be starting in April.

- Complete Data Science Math Skills - Duke University
- Complete Mathematics for Machine Learning Specialization - Imperial College London
- Go through the 3Blue1Brown playlists
- Do some Khan Academy courses if necessary (Algebra I, Precalculus, Statistics & Probability, Calculus I, Multivariable Calculus, and Linear Algebra)

Data Science

Data Science-esque things took a back seat towards the end of last year (though I was using Jupyter Notebooks/pandas here and there), but I’m going to be getting back into it.

A New Year, a new OS!

A New Year, a new operating system! After trying and failing numerous times to get Linux running satisfactorily via Oracle VirtualBox, I finally bit the bullet and went for a dual-boot setup. After a week or so of using Ubuntu (18.04 LTS “Bionic Beaver”), I have to say that I’m loving it.

December 2019 - overview

It was a super busy month. Things that I did -

- Learnt about web crawling, detailed in a separate post - here
- Built my website using Hugo and learnt a bit about the Go programming language, detailed in a separate post - here
- Started to brush up on my math in preparation for a machine learning course that I will be undertaking in the near future
- Continued to learn more about JavaScript via The Modern JavaScript Tutorial
- Attended another London Python meetup

The topics of the London Python meetup were optimized human memory and MQTT.

Hugo & Typography

I decided to build my site using Hugo. I’ve also transferred my posts from my previous GitHub/Jekyll-powered blog. I just didn’t like the fact that Jekyll is built on a dying language - Ruby. I’m quite pleased with my site. Hugo is nice and easy to work with and it allows for endless customisation.

Web crawling

December started with me looking into web crawling. I managed to set up the Heritrix web crawler, which, as the GitHub README reads - “Heritrix is the Internet Archive’s open-source, extensible, web-scale, archival-quality web crawler project. Heritrix (sometimes spelled heretrix, or misspelled or missaid as heratrix/heritix/heretix/heratix) is an archaic word for heiress (woman who inherits).”

November 2019 - overview

November saw me finishing Make Your Own Neural Network (Rashid, 2016), making good progress with the freeCodeCamp JavaScript Algorithms and Data Structures Certification, and attending my first coding meetup with London Python. Make Your Own Neural Network was a great read - it definitely demystified the goings-on in neural networks.

October 2019 - overview

My greatest achievement of October was that I finally managed to finish the freeCodeCamp Responsive Web Design Certification; the GitHub repository is available here. It took a lot longer than I had originally anticipated, though towards the end I think I had got my workflows figured out. Next stop - the JavaScript Algorithms and Data Structures Certification.

September 2019 - overview

This month was a bit busier than the previous month. I have decided to take a break from DataCamp and hit the books instead. I’ll return to DataCamp after I’ve finished reading -

- Python for Data Analysis (McKinney, 2017)
- Data Science from Scratch (Grus, 2019)

I started to watch the 2019 CS50 lectures from Harvard University, CS50 being an introductory computer science course.