-
I know TensorFlow 1 models aren't supported (or at least I think so). However, I have gotten things to work by building a service with a TF1 artifact - the environment I build in is TF2, but TF1 is in the Docker environment. The reason is that certain models of interest are only easily available in TF1 (e.g. https://github.com/zhang0jhon/AttentionOCR). Everything seems to work okay with one exception: for a large model (e.g. the one above), the Docker container crashes when run. We managed to fix this by using only one gunicorn worker, but this is obviously a weird and not ideal solution. So two questions:
Replies: 2 comments
-
Hi! I'm working on the TensorFlow integration of BentoML. This information may be helpful to you.
In fact, BentoML was designed to support both TensorFlow 1 and 2 models. An example using TF1 is included in the gallery:
https://github.com/bentoml/gallery/blob/master/tensorflow/fashion-mnist/tensorflow_1_fashion_mnist.ipynb
In most cases, a crash like this is caused by insufficient system resources, especially memory. It would be great if you could provide the logs.
If the cause is insufficient system resources, limiting the number of workers is a good solution. With micro-batching, we can still achieve higher throughput with the same number of workers.
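To make the micro-batching idea concrete, here is a minimal sketch in plain Python. It is not BentoML's implementation: the names `micro_batches`, `serve`, and `fake_predict_batch` are hypothetical, and a real micro-batcher would also collect concurrent requests over a short time window, which this sketch leaves out. It only shows why one worker can do more work per model call.

```python
from typing import Callable, Iterator, List, Sequence


def micro_batches(items: Sequence, max_batch_size: int) -> Iterator[list]:
    """Split pending requests into chunks of at most max_batch_size."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(items), max_batch_size):
        yield list(items[start:start + max_batch_size])


def serve(
    requests: Sequence,
    predict_batch: Callable[[list], list],
    max_batch_size: int,
) -> List:
    """Run the model once per batch instead of once per request."""
    results: List = []
    for batch in micro_batches(requests, max_batch_size):
        results.extend(predict_batch(batch))
    return results


def fake_predict_batch(batch: list) -> list:
    """Hypothetical stand-in for a model: doubles every input."""
    return [x * 2 for x in batch]


if __name__ == "__main__":
    # Ten requests with a batch size of 4 need three model calls, not ten.
    print(serve(list(range(10)), fake_predict_batch, max_batch_size=4))
```

The results come back in the same order as the requests, so each caller can still be handed its own prediction.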
-
Hi! Thanks for the reply. I'm glad to hear that TF1 models are officially supported. I had been using a setup where I save a model in TF1 but use TF2 in the environment in which I'm predicting. This is the only way I was able to get it working before. One of the issues is that I'm using pretrained weights from others' models, which is why I have less flexibility in their format. However, I will try using TF1 exclusively and see how that goes. Could you clarify what exactly a gunicorn worker is? Is each worker loading the model into memory so that it can handle a request? If so, then it makes sense that the memory issues are happening.
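For context on the question above: gunicorn uses a pre-fork model, so each worker is a separate OS process. If every worker holds its own copy of the model, memory use grows roughly linearly with the worker count. Here is a rough sketch of that arithmetic. The function name `max_workers` and all numbers in the example are illustrative assumptions, not measurements from this thread's model.

```python
def max_workers(total_memory_mb: int, per_worker_mb: int, reserved_mb: int = 0) -> int:
    """Estimate how many workers fit in memory if each holds its own model copy.

    total_memory_mb: memory available to the container (assumed value).
    per_worker_mb:   memory one worker needs with the model loaded (assumed value).
    reserved_mb:     headroom kept free for the OS and other processes.
    """
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable = total_memory_mb - reserved_mb
    if usable <= 0:
        return 0
    return usable // per_worker_mb


if __name__ == "__main__":
    # Hypothetical: a 4096 MB container with a model needing about 700 MB per worker.
    print(max_workers(4096, 700))
    # Hypothetical: a much larger model that needs about 3000 MB per worker.
    print(max_workers(4096, 3000, reserved_mb=512))
```

If the estimate comes out at one worker, as in the second call, then running a single gunicorn worker is the expected configuration for that container size, not a workaround.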
Hi! I'm working on the TensorFlow integration of BentoML. This information may be helpful to you.
In fact, BentoML was designed to support both TensorFlow 1 and 2 models. An example using TF1 is included in the gallery:
https://github.com/bentoml/gallery/blob/master/tensorflow/fashion-mnist/tensorflow_1_fashion_mnist.ipynb
In most cases, a crash like this is caused by insufficient system resources, especially memory.
For example, for any classifier using BERT, each worker would take more than 700 MB of memory. Even on an EC2 c5.large instanc…