Model serving refers to the way trained models are made available for others to use.
There are two main types of model serving: batch and online. Batch serving means feeding the model a large amount of data, typically as a scheduled job, and writing the output to a database or a dashboard. Online serving means deploying the model behind an endpoint so that applications can send requests and receive responses with low latency.
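
To make the contrast concrete, here is a minimal sketch of both modes using only the Python standard library. Everything named here is an assumption for illustration: `predict` is a hypothetical stand-in for a trained model (it just averages its inputs), and `run_batch_job`, `PredictHandler`, `serve`, and port 8000 are made-up names and values, not part of any particular serving framework.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(features):
    """Hypothetical stand-in for a trained model: the mean of the inputs."""
    return sum(features) / len(features)


def run_batch_job(rows):
    """Batch serving: score a whole dataset in one pass.

    A scheduler would normally trigger this, and the results would be
    written to a database or dashboard; here they are simply returned.
    """
    return [{"id": row["id"], "score": predict(row["features"])} for row in rows]


class PredictHandler(BaseHTTPRequestHandler):
    """Online serving: score one request at a time behind an HTTP endpoint."""

    def do_POST(self):
        # Read the JSON request body, e.g. {"features": [1.0, 2.0]}.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        body = json.dumps({"score": predict(payload["features"])}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet; a real service would log requests


def serve(port=8000):
    """Start the online endpoint (assumed port) and block until interrupted."""
    HTTPServer(("127.0.0.1", port), PredictHandler).serve_forever()
```

Calling `run_batch_job(rows)` scores every row in one go, while calling `serve()` starts an endpoint that answers one POST request at a time. The same `predict` function backs both paths; only the way data reaches the model differs.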

See "Introduction to streaming for data scientists" for further details.