Hashing Pipelines with Joblib
For the last six months or so, I’ve been building out the infrastructure for our machine-learning service at work. One thing that had me scratching my head last week was how to compare two fitted pipelines that may have been trained on the same data. To avoid re-uploading a duplicate fitted pipeline, I wanted to compare their MD5 hashes. Joblib has a way to do this, but I spent way too long trying to find a working example. ...
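As a rough illustration of the duplicate check described above, here is a minimal sketch built on `joblib.hash`, which returns an MD5 hex digest by default. To keep it runnable without scikit-learn, a plain dict of made-up parameters stands in for a fitted pipeline. The helper names (`fingerprint`, `make_fitted_stand_in`) and the `uploaded_hashes` set are my own assumptions for the example, not part of Joblib.

```python
import joblib


def fingerprint(obj):
    """Return the MD5 hex digest that joblib computes for a picklable object."""
    return joblib.hash(obj, hash_name="md5")


def make_fitted_stand_in(coefficients):
    # Hypothetical stand-in for a fitted pipeline: step names mapped to
    # "learned" parameters. The values are invented for illustration.
    return {
        "scaler": {"mean": [0.5, 1.5], "scale": [1.0, 2.0]},
        "model": {"coef": list(coefficients), "intercept": 0.1},
    }


first = make_fitted_stand_in([0.25, -0.75])
second = make_fitted_stand_in([0.25, -0.75])     # same parameters as `first`
different = make_fitted_stand_in([0.25, -0.5])   # one coefficient differs

# Assumed bookkeeping: the hashes of everything uploaded so far.
uploaded_hashes = {fingerprint(first)}

is_duplicate = fingerprint(second) in uploaded_hashes
is_new = fingerprint(different) not in uploaded_hashes

print("second is a duplicate:", is_duplicate)
print("different is new:", is_new)
```

A fitted pipeline object can be passed to `joblib.hash` in the same way, since it accepts arbitrary picklable Python objects. Whether two separately fitted pipelines produce the same digest depends on the fit being deterministic, so it is worth checking that on your own estimators before relying on it.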