In machine learning, numerical data is any data expressed as numbers, such as counts, measurements, or percentages. It is straightforward for computers to process, and it is among the most common kinds of data in real-world applications.

Machine learning algorithms depend heavily on numerical data to make predictions and decisions. In fact, most algorithms accept only numerical input, so other kinds of data, such as text or categories, must be converted to numbers before they can be used.

Numerical data can be continuous or discrete, and it can come from a variety of sources, such as a sensor in a factory, a customer database, or a stock market feed. It can be collected over time to form a time series, or it can be collected at a single point in time.

Continuous and Discrete Numerical Data :

Continuous numerical data is measured on a scale, such as distance or temperature, and can take any value within a range. For instance, the length of a pencil is a continuous value: it could be 17.5 cm, 17.52 cm, or anything in between.

Discrete numerical data, in contrast, can take only specific, separate values, typically whole numbers, such as the number of items sold. It is therefore limited to a countable set of possible values.
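
The distinction can be sketched in a few lines of Python. The values and the helper name `looks_discrete` below are invented for illustration, and the whole-number check is only a rough heuristic: a continuous quantity that happens to be recorded in whole units would also pass it.

```python
# Hypothetical measurements, invented for illustration.
pencil_lengths_cm = [17.5, 17.52, 12.048]  # continuous: any value in a range
items_sold = [3, 0, 12]                    # discrete: whole-number counts


def looks_discrete(values):
    """Return True if every value is a whole number (a rough heuristic)."""
    return all(float(v).is_integer() for v in values)


print(looks_discrete(pencil_lengths_cm))  # False
print(looks_discrete(items_sold))         # True
```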

Representing Numerical Data in Machine Learning :

To use machine learning algorithms, numerical data must be represented in a form that the algorithms can process. The most common method of representing numerical data is to create a data matrix, sometimes known as a feature matrix.

In this matrix, each row is an observation and each column is a feature. Features can be binary, categorical, or continuous, although categorical features must first be encoded as numbers.
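
As a minimal sketch of such a matrix, the example below builds one from three made-up customer records using plain Python lists. The field names (`age_years`, `monthly_spend`, `plan`) are hypothetical, and one-hot encoding is just one common way to turn a categorical feature into numbers. In practice, a NumPy array or pandas DataFrame would typically hold the matrix.

```python
# Hypothetical customer records, invented for illustration.
records = [
    {"age_years": 34, "monthly_spend": 120.50, "plan": "basic"},
    {"age_years": 27, "monthly_spend": 80.00, "plan": "premium"},
    {"age_years": 45, "monthly_spend": 310.25, "plan": "basic"},
]

# Fix the order of the categories so every row uses the same columns.
plans = sorted({r["plan"] for r in records})  # ['basic', 'premium']


def to_row(record):
    """Turn one record into a list of numbers (one row of the matrix)."""
    # One-hot encoding: one column per category, 1.0 where it matches.
    one_hot = [1.0 if record["plan"] == p else 0.0 for p in plans]
    return [float(record["age_years"]), record["monthly_spend"]] + one_hot


# Feature matrix: one row per observation, one column per feature.
X = [to_row(r) for r in records]

n_observations = len(X)
n_features = len(X[0])

# Reading a single feature means reading one column across all rows.
monthly_spend = [row[1] for row in X]

print(n_observations, n_features)  # 3 4
print(X[1])                        # [27.0, 80.0, 0.0, 1.0]
```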

Normalising Numerical Data :

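Normalising numerical data is often important because it stops features with large numerical ranges from dominating features with small ones. Without it, an algorithm may treat a feature as more relevant than the others simply because of its scale. This matters most for algorithms that rely on distances or gradients; some algorithms, such as decision trees, are largely insensitive to scale.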
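
The effect of differing scales can be seen with a distance calculation. The sketch below uses three made-up customers described by age and annual income; the names and numbers are assumptions for illustration only. Because income is measured in much larger units than age, it dominates the Euclidean distance.

```python
import math

# Hypothetical customers as [age_years, annual_income], invented for illustration.
customer_a = [25, 50_000]
customer_b = [55, 50_100]  # 30 years older, almost the same income
customer_c = [26, 52_000]  # almost the same age, income 2,000 higher

dist_ab = math.dist(customer_a, customer_b)  # about 104.4
dist_ac = math.dist(customer_a, customer_c)  # about 2000.0

# On the raw values, a looks far closer to b than to c, even though
# a and b differ by 30 years of age: income swamps the age feature.
print(dist_ab < dist_ac)  # True
```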

For continuous data, normalisation usually means scaling each feature to fit within a fixed range, often 0 to 1. This is known as min-max scaling. It is done by subtracting the feature's minimum value from each observation, then dividing the result by the feature's range (its maximum minus its minimum).
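
The subtract-and-divide procedure described above can be sketched as follows. The function names and sample values are hypothetical, and returning zeros for a feature whose values are all identical is just one possible convention, chosen here to avoid dividing by zero.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range 0 to 1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values identical: the formula would divide by zero.
        # Returning zeros is an assumption made for this sketch.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def min_max_scale_columns(X):
    """Scale each column (feature) of a row-major matrix independently."""
    columns = zip(*X)  # transpose: rows -> columns
    scaled_columns = [min_max_scale(list(col)) for col in columns]
    return [list(row) for row in zip(*scaled_columns)]  # transpose back


ages = [20, 30, 40, 60]
print(min_max_scale(ages))  # [0.0, 0.25, 0.5, 1.0]

# Two features on very different scales end up in the same range.
X = [[20, 1000.0], [30, 3000.0], [40, 5000.0]]
print(min_max_scale_columns(X))  # [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
```

Each feature is scaled using its own minimum and maximum, so no column's original units affect the others.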

As a machine learning developer or data scientist, your ability to gather, process, and present numerical data is crucial to the machine learning process. By understanding what numerical data is, how it is represented, and how it is normalised, you can build better models and get better results.