SHAURYA AGARWAL

All my technology and research in one place



About

Open Source Initiatives

Data Munging Using X

Data engineering workshops with Spark (PySpark), Pandas, Dask, Ray, and other popular libraries in the field. The notebooks support Google Colab: click the Open In Colab badge next to each notebook’s link.

Notebooks on Statistics and Machine Learning

These serve as practical notes and references on common machine learning algorithms, with an introduction to Pandas and NumPy.

Vogon Poetry

Concepts like zero-copy-columnar-layout-distributed-vectorized, etc., that sound like Vogon Poetry to data engineering teams trying to modernize their game…

The Data Engineering Rocket Ship

Long-form posts on data engineering and technology, written in collaboration with my team.

CoolRE

Pronounced “cooler”, CoolRE provides three approaches to building a regular expression engine: a toy overview, a backtracking-based implementation, and a finite-automata-based approach.
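
To give a flavour of the backtracking approach, here is a minimal sketch. It is not CoolRE's code: it is modelled on the well-known Kernighan and Pike toy matcher, supports only literals, `.`, `*`, `^` and `$`, and the function names are chosen for illustration.

```python
def match(pattern: str, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    if pattern.startswith("^"):
        # Anchored: only try the very start of the text.
        return match_here(pattern[1:], text)
    # Unanchored: try every start position, including the empty suffix.
    for start in range(len(text) + 1):
        if match_here(pattern, text[start:]):
            return True
    return False


def match_here(pattern: str, text: str) -> bool:
    """Return True if the pattern matches at the beginning of the text."""
    if not pattern:
        return True
    if len(pattern) >= 2 and pattern[1] == "*":
        return match_star(pattern[0], pattern[2:], text)
    if pattern == "$":
        return text == ""
    if text and (pattern[0] == "." or pattern[0] == text[0]):
        return match_here(pattern[1:], text[1:])
    return False


def match_star(char: str, rest: str, text: str) -> bool:
    """Match zero or more of `char`, then the rest of the pattern."""
    i = 0
    while True:
        # Try the rest of the pattern first; on failure, consume one more char.
        if match_here(rest, text[i:]):
            return True
        if i < len(text) and (char == "." or text[i] == char):
            i += 1
        else:
            return False
```

For example, `match("a*b", "aaab")` is true and `match("^ab$", "abc")` is false. A finite-automata-based engine avoids the repeated retries that this style of matcher performs on failure.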

Kandinsky

Discover the key colours in a painting or photograph using K-Means clustering, along with the proportion of each colour. Some results (more in the repo):

[Images: four example results, 01 to 04]
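
The idea can be sketched in plain Python as follows. This is an illustrative sketch rather than Kandinsky's implementation: `dominant_colours` is a hypothetical helper, pixels are assumed to be RGB tuples, and a real project would more likely use a library K-Means on a NumPy array of image pixels.

```python
import random


def dominant_colours(pixels, k=3, iterations=10, seed=0):
    """Hypothetical helper: return [(centroid, proportion), ...], largest first."""
    # Assumption: start from k distinct colours so no cluster begins as a duplicate.
    distinct = sorted(set(pixels))
    if k > len(distinct):
        raise ValueError("k cannot exceed the number of distinct colours")
    rng = random.Random(seed)
    centroids = rng.sample(distinct, k)

    def nearest(pixel):
        # Index of the centroid with the smallest squared Euclidean distance.
        return min(
            range(k),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(pixel, centroids[i])),
        )

    for _ in range(iterations):
        # Assignment step: group pixels by their nearest centroid.
        clusters = [[] for _ in range(k)]
        for pixel in pixels:
            clusters[nearest(pixel)].append(pixel)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:
                centroids[i] = tuple(sum(ch) / len(cluster) for ch in zip(*cluster))

    # Final pass: the proportion of pixels assigned to each centroid.
    counts = [0] * k
    for pixel in pixels:
        counts[nearest(pixel)] += 1
    result = [(centroids[i], counts[i] / len(pixels)) for i in range(k)]
    return sorted(result, key=lambda item: item[1], reverse=True)
```

Each returned pair is a cluster centre (a key colour) and the share of pixels nearest to it, which corresponds to the proportions mentioned above.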

Barnsley Fern Fractal

[Images: Barnsley Fern results 1 and 2]
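
For readers unfamiliar with the fractal, here is a minimal sketch of how Barnsley fern points can be generated. It uses the standard four affine maps of the iterated function system and is not necessarily how the images above were produced; `fern_points` and its parameters are illustrative names.

```python
import random

# Standard Barnsley fern coefficients: (a, b, c, d, e, f, probability)
# for the map x' = a*x + b*y + e, y' = c*x + d*y + f.
MAPS = [
    (0.0, 0.0, 0.0, 0.16, 0.0, 0.0, 0.01),      # stem
    (0.85, 0.04, -0.04, 0.85, 0.0, 1.6, 0.85),  # successively smaller leaflets
    (0.2, -0.26, 0.23, 0.22, 0.0, 1.6, 0.07),   # largest left-hand leaflet
    (-0.15, 0.28, 0.26, 0.24, 0.0, 0.44, 0.07), # largest right-hand leaflet
]


def fern_points(count, seed=0):
    """Hypothetical helper: return `count` (x, y) points of the fern."""
    rng = random.Random(seed)
    weights = [m[6] for m in MAPS]
    x, y = 0.0, 0.0
    points = []
    for _ in range(count):
        # Pick one affine map at random, weighted by its probability.
        a, b, c, d, e, f, _p = rng.choices(MAPS, weights=weights)[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        points.append((x, y))
    return points
```

Plotting the points from, say, `fern_points(50000)` as a scatter plot with any plotting library should reveal the fern shape.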

Phyllotaxis and L-systems example

[Images: phyllotaxis and L-systems results 1 and 2]
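
As a rough illustration of the two ideas, the sketch below expands an L-system by string rewriting and places points in a phyllotaxis spiral. It assumes Vogel's model (golden angle, radius growing with the square root of the index) and uses Lindenmayer's algae rules as the example grammar. The function names are illustrative and not taken from the project.

```python
import math


def expand(axiom, rules, generations):
    """Rewrite every symbol in parallel, once per generation."""
    current = axiom
    for _ in range(generations):
        # Symbols without a rule are copied unchanged.
        current = "".join(rules.get(symbol, symbol) for symbol in current)
    return current


# Golden angle in radians (about 137.5 degrees).
GOLDEN_ANGLE = math.pi * (3 - math.sqrt(5))


def phyllotaxis(count, scale=1.0):
    """Return `count` (x, y) points following Vogel's spiral model."""
    points = []
    for n in range(count):
        radius = scale * math.sqrt(n)
        theta = n * GOLDEN_ANGLE
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points
```

With the algae rules `{"A": "AB", "B": "A"}`, `expand("A", rules, 3)` gives `"ABAAB"`, and the string lengths follow the Fibonacci sequence.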

Certifications

Trainings Archive / Older Material

A quick FinOps project with Notional Data