Inference

AI inference is the process by which a trained AI model applies its learned knowledge to new, unseen data to generate predictions, classifications, or decisions. For example, an image recognition system identifying objects in a new image, or a self-driving car recognizing a stop sign on a road it has never encountered, is performing inference.

Unlike model training, which focuses on learning patterns from data, inference uses the model to produce actionable results in real-world scenarios.
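The training/inference split can be sketched with a toy linear model. The weights below are hypothetical stand-ins for values a training step would have learned; inference simply applies them, fixed, to new inputs.

```python
# Toy "trained" model: inference applies already-learned weights to
# new inputs; no learning happens at this stage.
# WEIGHTS and BIAS are hypothetical stand-ins for trained values.
WEIGHTS = [0.8, -0.5]
BIAS = 0.1

def predict(features):
    """Run inference: weighted sum of inputs, thresholded to a label."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    return "positive" if score > 0 else "negative"

# A new, unseen data point
print(predict([1.0, 0.2]))  # score = 0.8 - 0.1 + 0.1 = 0.8 -> "positive"
```

The key point is that `predict` never updates `WEIGHTS`; learning happened earlier, during training.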

There are several types of AI inference:

  • Batch inference processes large datasets at once, typically offline.
  • Real-time (online) inference produces predictions immediately for each incoming data point.
  • Streaming inference continuously processes live data streams from sensors or events.
  • Edge inference runs directly on local hardware, such as smartphones, IoT devices, or industrial sensors.
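The batch vs. real-time distinction above can be illustrated with a minimal sketch. `predict` here is a hypothetical placeholder for any trained model's forward pass; the difference lies in how it is invoked.

```python
# Illustrative sketch: the same model served in batch vs. real-time mode.
# predict() is a hypothetical placeholder for a trained model.
def predict(x):
    return x * 2  # placeholder for the model's forward pass

# Batch inference: score an entire dataset in one offline pass.
def batch_infer(dataset):
    return [predict(x) for x in dataset]

# Real-time (online) inference: score each request as it arrives.
def handle_request(x):
    return predict(x)

print(batch_infer([1, 2, 3]))  # [2, 4, 6]
print(handle_request(5))       # 10
```

Streaming inference would apply `handle_request` continuously to events from a live source, and edge inference would run the same code on the device itself rather than on a server.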

AI inference is central to applications such as large language models (LLMs), predictive analytics, autonomous systems, and recommendation engines.
