Google Summer of Code with LibreHealth

Posts

Showing posts from July, 2020

Week 8 : Coding Period

- July 25, 2020

This week I focused on the ChestX-ray14 dataset and its available models. The benchmark model for this dataset is CheXNet, which was available on GitHub in two formats: PyTorch and TensorFlow. I downloaded the PyTorch model and set up a pipeline to convert it from PyTorch to ONNX to TensorFlow. Part 1, converting the PyTorch model to ONNX, was implemented successfully in spite of a lot of bugs (because DataParallel is not supported by ONNX). Part 2, converting the ONNX model to TensorFlow, is now generating a lot of errors, some of which have no solutions available on the internet! I will need to look into this part thoroughly. Along the way, I also searched for available Keras models of CheXNet. The one that I found did not have any concrete results, because the creator did not add a threshold/classifier; the model simply outputs scores per class. Another fishy aspect of this model is that it is only 28 MB in size. How can such a heavy, ...
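One common source of DataParallel trouble is that checkpoints saved from a wrapped model prefix every parameter name with `module.`. The following is a minimal sketch of stripping that prefix so the weights can be loaded into a plain, unwrapped model before export. It is an assumption about the kind of workaround involved, not the exact fix used here, and the key names and the helper `strip_dataparallel_prefix` are hypothetical.

```python
from collections import OrderedDict


def strip_dataparallel_prefix(state_dict, prefix="module."):
    """Return a copy of state_dict with a leading DataParallel prefix removed."""
    cleaned = OrderedDict()
    for key, value in state_dict.items():
        # Only strip the prefix when it is actually at the start of the key.
        new_key = key[len(prefix):] if key.startswith(prefix) else key
        cleaned[new_key] = value
    return cleaned


# Hypothetical checkpoint keys, standing in for a real DataParallel checkpoint.
wrapped = OrderedDict([
    ("module.features.conv0.weight", "w0"),
    ("module.classifier.0.bias", "b0"),
])
plain = strip_dataparallel_prefix(wrapped)
print(list(plain))
```

With PyTorch installed, the cleaned dictionary could then be passed to `load_state_dict` on an unwrapped model before calling `torch.onnx.export`. Alternatively, the underlying network is reachable as the `module` attribute of a `DataParallel` instance.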

Week 7 : Coding Period

- July 18, 2020

I have finished converting the pruned models to TFLite format, using dynamic range, float16 and int8 quantization. I have also evaluated them, and the results will be presented shortly. Currently I am working on the ChestX-ray14 models: I am converting the benchmark PyTorch models to TensorFlow format, using ONNX as the intermediate format. I am also implementing knowledge distillation by writing my own code for it. The available code has issues storing the soft targets, and I am finding a way to work around that. The code available on GitHub is in PyTorch and cannot be used on my models, and the available TensorFlow code is outdated and does not work correctly. So for the next week, I will be working on ChestX-ray14 and knowledge distillation simultaneously. The knowledge distillation script will be reusable for both datasets. I hope to complete these modules in the upcoming days. Happy Coding :)
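For readers new to knowledge distillation, here is a minimal, framework-free sketch of what "soft targets" are: the teacher's logits softened with a temperature, which the student is then trained to match. It assumes single-label softmax outputs; a multi-label dataset such as ChestX-ray14 would need a per-class sigmoid variant. The function names and example logits are made up for illustration and are not from my actual script.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits / temperature; higher temperature gives a flatter output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """Cross-entropy between the teacher's soft targets and the student's softened output."""
    soft_targets = softmax_with_temperature(teacher_logits, temperature)
    student_probs = softmax_with_temperature(student_logits, temperature)
    return -sum(t * math.log(p) for t, p in zip(soft_targets, student_probs))


# Made-up logits for a three-class problem.
teacher = [5.0, 1.0, -2.0]
student = [2.0, 2.5, 0.0]
print(softmax_with_temperature(teacher, temperature=4.0))
print(distillation_loss(student, teacher))
```

In practice this term is usually combined with the ordinary loss on the true labels, and is often scaled by the square of the temperature; both are left out here to keep the sketch short.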

Week 6 : Coding Period

- July 11, 2020

This week has been pretty stressful, with a lot of people in my locality getting infected with Covid-19. But work has been pretty smooth. This week, I resolved the problem with my pruned models. They had too many layers and parameters, and were larger than the original model! I went through all my code and finally got rid of all the extra layers that I had accidentally saved with the model. It is now easy to quantize these pruned models. I am also experimenting with knowledge distillation. At the same time, I have started working on my next models, based on the ChestX-ray14 dataset. I have also been trying to evaluate my int8 models on the validation and test datasets. For some reason this has been taking more than 12 hours, whether locally, on the plhi server or on Colab. I also lost my results a couple of times due to the long processing time. So I am running different evaluation scripts on each platform for faster processing. Hopefully this will get done faster. I am a ...
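One way to avoid losing results during a very long evaluation is to save progress after every batch so that an interrupted run can pick up where it stopped. The sketch below uses only the standard library. `evaluate_resumable`, `predict_fn` and the JSON layout are hypothetical names chosen for illustration, not my actual evaluation script.

```python
import json
import os


def evaluate_resumable(batches, predict_fn, results_path):
    """Evaluate batch by batch, saving after each one so a crashed run can resume."""
    results = {}
    if os.path.exists(results_path):
        with open(results_path) as f:
            results = json.load(f)  # reuse batches finished in an earlier run

    for batch_id, (inputs, labels) in enumerate(batches):
        key = str(batch_id)
        if key in results:
            continue  # already evaluated, skip the expensive inference
        predictions = [predict_fn(x) for x in inputs]
        correct = sum(int(p == y) for p, y in zip(predictions, labels))
        results[key] = {"correct": correct, "total": len(labels)}
        # Write to a temporary file first so a crash cannot corrupt saved results.
        tmp_path = results_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump(results, f)
        os.replace(tmp_path, results_path)

    correct = sum(r["correct"] for r in results.values())
    total = sum(r["total"] for r in results.values())
    return correct / total if total else 0.0
```

Here `predict_fn` stands in for whatever runs the model on a single input, for example a wrapper around a TFLite interpreter.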

Week 5 : Coding period

- July 04, 2020

Week 5 Update - Today I received the results of my first evaluation and I have passed!! Woohoo! I am happy that my mentors are happy with my work. They have asked me to improve my code by making it more reusable, so I will work on this feedback in the upcoming weeks. Work update for this week - I have (finally) successfully implemented int8 quantization. Although I had to migrate everything to Colab for this, it finally ran. Even though my code is split across various platforms, I am happy that most of what I have implemented is running well on my local machine. The second thing that I did this week was to make Python scripts for every module of my notebook. I made scripts for dataset preprocessing, model building, model training, quantization and pruning. I am now adding inference scripts for large as well as compressed models. This is the part that I have to make more reusable. I also have to cover the edge cases and erroneous inputs for these scripts. I want to present the results of...
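As a rough illustration of what int8 quantization does to a tensor, here is a simplified sketch of affine quantization in plain Python: floats are mapped to integers in [-128, 127] using a scale and a zero point, then mapped back. This shows only the arithmetic idea. Real converters such as the TFLite one make additional choices (per-channel scales, calibration data) that are not shown, and the function names and example weights are made up.

```python
def quantization_params(values, qmin=-128, qmax=127):
    """Pick a scale and zero point that cover the value range (including 0.0)."""
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # fall back to 1.0 for an all-zero tensor
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(values, scale, zero_point, qmin=-128, qmax=127):
    """Map floats to int8 values, clipping anything that falls outside the range."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]


def dequantize(quantized, scale, zero_point):
    """Map int8 values back to approximate floats."""
    return [scale * (q - zero_point) for q in quantized]


# Made-up weights for illustration.
weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
scale, zero_point = quantization_params(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
print(q)
print(restored)
```

The restored values differ from the originals by roughly one quantization step at most, which is the precision given up in exchange for storing each weight in a single byte.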