ML Model Deployment


Issues:

ML/Statistical Issue:

Given data X, a model F, and output y, changes in the data distribution cause two kinds of issue: concept drift and data drift.

Concept Drift

Refers to a change in the desired mapping from X to y. For example, due to inflation, a house of the same size can be priced higher.
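A minimal sketch of the house price example, assuming a toy model that charges a fixed price per square metre. The function names, sizes, and prices are all invented for illustration. The inputs X stay the same in both periods and only the mapping to y changes, so the error of the fixed model grows.

```python
# Hypothetical illustration of concept drift: the inputs X are unchanged,
# but the true mapping X -> y shifts, so a fixed model becomes wrong.

def model_f(size_sqm: float) -> float:
    """Toy model fitted on old data: 1000 per square metre (assumed)."""
    return 1000.0 * size_sqm


def mean_abs_error(sizes, prices):
    """Mean absolute error of model_f on (size, price) pairs."""
    errors = [abs(model_f(s) - p) for s, p in zip(sizes, prices)]
    return sum(errors) / len(errors)


sizes = [50.0, 80.0, 120.0]                   # same X in both periods

prices_before = [1000.0 * s for s in sizes]   # old mapping
prices_after = [1200.0 * s for s in sizes]    # after 20% inflation (assumed)

mae_before = mean_abs_error(sizes, prices_before)
mae_after = mean_abs_error(sizes, prices_after)
print(mae_before, mae_after)
```

The error is zero under the old mapping and large under the new one, even though the distribution of house sizes never moved.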

Data Drift

Refers to a change in the input distribution (X). For example, most people suddenly start building smaller houses.

Make sure you can detect and manage both concept drift and data drift. Data drift needs attention even when the mapping X -> y has not changed.
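One possible way to detect data drift is to compare a recent sample of an input feature against the training sample. The sketch below computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between the two empirical CDFs) using only the standard library. The house sizes and the 0.3 alert threshold are assumptions for illustration, not calibrated values; a real check would normally use a proper significance test.

```python
from bisect import bisect_right


def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest absolute gap between the empirical CDFs of the
    two samples: 0.0 for identical samples, 1.0 for disjoint ones.
    """
    ref = sorted(reference)
    cur = sorted(current)
    max_gap = 0.0
    # The largest gap between two step functions occurs at a data point.
    for v in ref + cur:
        cdf_ref = bisect_right(ref, v) / len(ref)
        cdf_cur = bisect_right(cur, v) / len(cur)
        max_gap = max(max_gap, abs(cdf_ref - cdf_cur))
    return max_gap


# Hypothetical house sizes (square metres) at training time and recently.
training_sizes = [90, 100, 110, 120, 130, 140]
recent_sizes = [60, 70, 80, 90, 100, 110]

DRIFT_THRESHOLD = 0.3  # assumed alert level, not a calibrated value

statistic = ks_statistic(training_sizes, recent_sizes)
drift_detected = statistic > DRIFT_THRESHOLD
print(statistic, drift_detected)
```

This check only looks at X, so it can raise an alert without any labels. Concept drift, by contrast, usually shows up only once true y values are available to compare against predictions.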

How has the data changed?

A. Gradually

The data or behaviour changes gradually. For example: the English language evolves, trends spread among teenagers, people start using a new iPhone model, etc.

B. Suddenly

A sudden shock to the system. For example: COVID, when purchasing patterns changed suddenly.

Software Issue:

Important Checklist:

  1. Realtime (ML model serving) or Batch Prediction (ML Batch Inference)
  2. Does it run in the Cloud/Server or Edge/Browser? Do we need to perform model resource optimization (especially if it’s on the edge)?
    1. On Edge: large, complex models may not fit on edge devices
    2. Cloud/Server: may not work when latency is critical (e.g. autonomous cars)
  3. Resource costs and constraints:
    1. Compute resources (CPU, GPU, RAM, etc)
    2. Latency (Delay between user’s action and response)
    3. Throughput (Number of successful requests served per unit time)
  4. Monitoring and Logging
  5. Security and Privacy
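The latency and throughput items in the checklist can be measured with a small timing loop. This is a rough sketch, assuming a single process that serves requests one at a time. `predict` is a hypothetical stand-in for a real model call and the request payloads are invented.

```python
import time


def predict(features):
    """Hypothetical stand-in for a real model call."""
    return sum(features) / len(features)


def measure(predict_fn, requests):
    """Return (mean latency in seconds, throughput in requests per second)
    for requests served sequentially by predict_fn."""
    latencies = []
    start = time.perf_counter()
    for features in requests:
        t0 = time.perf_counter()
        predict_fn(features)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    mean_latency = sum(latencies) / len(latencies)
    throughput = len(requests) / elapsed
    return mean_latency, throughput


requests = [[1.0, 2.0, 3.0]] * 1000  # invented payloads
mean_latency, throughput = measure(predict, requests)
print(f"mean latency: {mean_latency * 1e3:.4f} ms, "
      f"throughput: {throughput:.0f} req/s")
```

Numbers from a loop like this depend on the machine and ignore network time, so they are a lower bound on what a user of a cloud-hosted service would see.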

Knowing how much CPU/GPU resources are available to the prediction service helps when writing the software and choosing an appropriate ML model.

After deployment, continue to monitor and maintain the system, especially in the face of concept drift and/or data drift.
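One simple form of monitoring is to track the prediction error over a rolling window and raise an alert when it crosses a threshold. The sketch below assumes true labels arrive some time after each prediction. The class name `RollingErrorMonitor`, the window size, the threshold, and the data stream are all hypothetical.

```python
from collections import deque


class RollingErrorMonitor:
    """Tracks mean absolute error over the last `window` labelled
    predictions and flags when it exceeds `threshold`."""

    def __init__(self, window: int, threshold: float):
        self.errors = deque(maxlen=window)  # old errors drop off automatically
        self.threshold = threshold

    def update(self, predicted: float, actual: float) -> bool:
        """Record one (prediction, label) pair; return True on alert."""
        self.errors.append(abs(predicted - actual))
        window_full = len(self.errors) == self.errors.maxlen
        mean_error = sum(self.errors) / len(self.errors)
        # Only alert once the window is full, to avoid noisy early alerts.
        return window_full and mean_error > self.threshold


monitor = RollingErrorMonitor(window=3, threshold=10.0)

# Invented stream of (predicted, actual) pairs with a sudden shift.
stream = [(100, 101), (100, 99), (100, 102),
          (100, 130), (100, 135), (100, 140)]
alerts = [monitor.update(p, a) for p, a in stream]
print(alerts)  # [False, False, False, True, True, True]
```

A short window reacts quickly to sudden changes, while a longer window is less noisy but slower to notice gradual ones, so the window size is a trade-off to tune per application.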


References

  1. https://www.coursera.org/learn/introduction-to-machine-learning-in-production/home/week/1
  2. https://community.deeplearning.ai/t/mlep-course-1-lecture-notes/54446 (login required)