Model deployment fails when using pydantic
I'm attempting to deploy a model, but my model uses pydantic for the function argument. With this approach, the model fails to deploy. The main.py in this case is based on what would be in a FastAPI implementation.
import pickle
import json
import time

from pydantic import BaseModel
from sklearn.feature_extraction.text import TfidfVectorizer
import xgboost as xgb

clf = pickle.load(open('data/xgbmodel.pickle', 'rb'))
vect = pickle.load(open('data/tfidfvect.pickle', 'rb'))


class Data(BaseModel):
    id: str = None
    project: str
    messages: str


def predict(data: Data):
    start = time.time()
    data_l = [data.messages]
    to_predict = vect.transform(data_l)
    prediction = clf.predict(to_predict)
    end = round((time.time() - start), 3)
    return {
        "id": data.id,
        "project": data.project,
        "prediction": prediction[0],
        "execution_time": end
    }
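For context, this is roughly how the same function would be exposed in a FastAPI app, which is where the pydantic-annotated signature comes from (a sketch for comparison only; the route path is illustrative and this is not part of the deployed main.py):

from fastapi import FastAPI

app = FastAPI()

@app.post("/model")
def predict_endpoint(data: Data):
    # FastAPI parses and validates the JSON request body into Data automatically
    return predict(data)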
However, when I test the model with the following
import requests

response = requests.post(
    "https://dave.server.com:443/models/5f248eb11e43c7000602300b/latest/model",
    json={
        'data': {
            'project': 'PROJ',
            'messages': 'This is a test message and who knows what it will return\n'
        }
    }
)

print(f"{response.status_code}\n")
print(f"{response.headers}\n")
print(f"{response.json()}\n")
The following error is returned:
{'error': {'message': "predict() got an unexpected keyword argument 'project'"},
 'model_time_in_ms': 0,
 'release': {'harness_version': '0.1',
             'model_version': '5f248eb11e43c7000602300d',
             'model_version_number': 1},
 'request_id': 'MXP0ZITN2FR3DGB1',
 'timing': 0.1239776611328125}
Any thoughts on how to resolve this?
Comments
I was able to get this working by dropping pydantic and putting all required data features into the function call as arguments.
def predict(project, messages, id = None):
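The rest of the function can stay the same as in the original main.py; roughly (a sketch combining the new signature with the original body):

import pickle
import time

# Same pickled artifacts as in the original main.py
clf = pickle.load(open('data/xgbmodel.pickle', 'rb'))
vect = pickle.load(open('data/tfidfvect.pickle', 'rb'))

def predict(project, messages, id=None):
    # Each key in the request's 'data' object arrives as a keyword argument
    start = time.time()
    to_predict = vect.transform([messages])
    prediction = clf.predict(to_predict)
    end = round((time.time() - start), 3)
    return {
        "id": id,
        "project": project,
        "prediction": prediction[0],
        "execution_time": end
    }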
Hi @elsammons,
I'm not personally familiar with pydantic and I did a quick check through our internal and external docs and don't see any cases of pydantic usage or example syntax. So it appears we haven't looked into this before, meaning I don't have an immediate answer without some testing of our own. It makes sense that your code works without pydantic, just using conventional syntax, but I'll check to see if our API can be compatible with an annotation-based tool like this one.
-Zach
Hi @elsammons, am I correct in guessing that you did not change the test code for calling the model in any way when you got it to work by dropping pydantic?
Can you try adding back the pydantic syntax to your function signature (I'm changing the naming a little for reasons you'll see below):
def predict(inputdata):
    data = Data(**inputdata)
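Filled out with the body from the original main.py, that might look something like this (just a sketch; it assumes the same Data class and pickled objects as before):

def predict(inputdata):
    # inputdata arrives as a plain dict; pydantic validates/converts it here
    data = Data(**inputdata)
    start = time.time()
    to_predict = vect.transform([data.messages])
    prediction = clf.predict(to_predict)
    end = round((time.time() - start), 3)
    return {
        "id": data.id,
        "project": data.project,
        "prediction": prediction[0],
        "execution_time": end
    }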
And then try calling the model with an additional level of nesting in your json like so:
response = requests.post(
    "https://dave.server.com:443/models/5f248eb11e43c7000602300b/latest/model",
    json={
        'data': {
            'inputdata': {
                'project': 'PROJ',
                'messages': 'This is a test message and who knows what it will return\n'
            }
        }
    }
)
Hopefully my naming helps clarify what is going on: basically, the first data key is part of the Domino boilerplate for calling the model (you will see it in the sample calling code); it is required for Domino to know "this is what I pass to the model function". Then the second key, inputdata, corresponds to the argument of your function, which you want to use pydantic to help unpack and check. You'll also notice I took out the annotation on the function signature and added an explicit instantiation; I think this is needed since inputs will come in as a raw json/dict, but then the rest of the type annotations you define in your Data class should work as expected.
Let us know how it goes!
Melanie
Thank you @melanie.veale. Correct, I didn't change any code on the test side. But through some testing and additional print statements I was able to figure out what was going on.
That is, I determined that the Model API boilerplate essentially treats the 'data': {} object like kwargs: each key/value pair becomes an argument to the API function.
Once I determined this was what was going on, I concluded that I can simply pass the expected features as arguments to my function.
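In other words, the behavior seems to be roughly this (an illustration using the values from the test call above):

payload = {
    'data': {
        'project': 'PROJ',
        'messages': 'This is a test message and who knows what it will return\n'
    }
}
# The Model API boilerplate effectively unpacks the 'data' dict into keyword arguments:
result = predict(**payload['data'])  # i.e. predict(project='PROJ', messages='...')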
I will give your approach a shot as well; since I'm already adding the 'data': {} level for the boilerplate (something FastAPI does not require), adding 'inputdata': {} won't be an issue.