Hyperplanes

Many of you will have guessed it right: when we work in more than three dimensions, we use hyperplanes. We will define a hyperplane using a bit of mathematics.

A linear equation looks like this: y = ax + b. It has two variables, x and y, a slope, a, and a y-intercept, b. If we rename y as x₂ and x as x₁, the equation becomes x₂ = ax₁ + b, which implies ax₁ - x₂ + b = 0. If we define the 2D vectors x = (x₁, x₂) and w = (a, -1) and make use of the dot product, the equation becomes w.x + b = 0.

Remember, for two 2D vectors x = (x₁, x₂) and y = (y₁, y₂), the dot product is x.y = x₁y₁ + x₂y₂.
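As a quick numeric check of the rewriting above, the following sketch picks an arbitrary line and confirms that points on y = ax + b satisfy w.x + b = 0. The values a = 2 and b = 1 and the helper name dot are assumptions chosen only for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

a, b = 2.0, 1.0   # illustrative slope and intercept (assumed values)
w = (a, -1.0)     # w = (a, -1), as defined in the text

# Three points that lie on the line x2 = a*x1 + b
points_on_line = [(x1, a * x1 + b) for x1 in (-1.0, 0.0, 3.0)]

# w.x + b evaluated at each of those points; every entry should be zero
residuals = [dot(w, x) + b for x in points_on_line]

# A point that is not on the line gives a non-zero value
off_line = dot(w, (0.0, 5.0)) + b

print(residuals, off_line)
```

The residuals vanish only for points on the line, which is exactly what it means for the line to be the set of solutions of w.x + b = 0.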

So, a hyperplane is the set of points x that satisfy the preceding equation, w.x + b = 0. The same form holds in any number of dimensions. But how do we classify with the help of a hyperplane?

We define a hypothesis function h:

h(x_i) = +1 if w.x_i + b ≥ 0
h(x_i) = -1 if w.x_i + b < 0

This is equivalent to the following, provided we take sign(0) to be +1:

h(x_i)= sign(w.x_i + b)
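The hypothesis function can be sketched in a few lines of Python. This is a minimal illustration, not a full classifier: the weights w = (2, -1) and bias b = 1 are assumed values, and the names dot and h are chosen here for convenience.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def h(x, w, b):
    """Return +1 if w.x + b >= 0, otherwise -1."""
    return 1 if dot(w, x) + b >= 0 else -1

w, b = (2.0, -1.0), 1.0   # assumed hyperplane parameters

# One point on each side of the hyperplane, and one exactly on it
samples = [(0.0, 0.0), (0.0, 5.0), (0.0, 1.0)]
labels = [h(x, w, b) for x in samples]

print(labels)
```

Note that the point lying exactly on the hyperplane is assigned +1 here, matching the ≥ in the definition. Some library sign functions (for example, numpy.sign) return 0 at 0, so they would need that case handled separately.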

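By adding a constant component x₀ = 1 to each x_i and a matching component w₀ = b to w, the bias is absorbed into the dot product, and the hypothesis can also be written as follows:

h(x_i) = sign(w.x_i), with x₀ = 1 and w₀ = b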
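To see why setting x₀ = 1 and w₀ = b changes nothing, the sketch below compares the two forms on a handful of points. All numeric values and the names h, h_augmented, and w_aug are assumptions made for this illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def h(x, w, b):
    """Original form: the sign of w.x + b, with 0 mapped to +1."""
    return 1 if dot(w, x) + b >= 0 else -1

def h_augmented(x_aug, w_aug):
    """Augmented form: the bias is carried inside w_aug, so no separate b."""
    return 1 if dot(w_aug, x_aug) >= 0 else -1

w, b = (2.0, -1.0), 1.0   # assumed hyperplane parameters
w_aug = (b,) + w          # prepend w0 = b

samples = [(0.0, 0.0), (0.0, 5.0), (0.0, 1.0), (3.0, 7.0), (-2.0, 4.0)]

# Prepend x0 = 1 to every sample and compare the two predictions
same = all(h(x, w, b) == h_augmented((1.0,) + x, w_aug) for x in samples)

print(same)
```

The extra component contributes w₀x₀ = b to the dot product, so both functions evaluate the same quantity. The augmented form is convenient because the classifier is then described by a single vector.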

What this means is that the classifier uses the position of x with respect to the hyperplane to predict a value for the label y. A data point on one side of the hyperplane is assigned one class, and a data point on the other side is assigned the other class.

Because it uses the equation of a hyperplane, which is a linear combination of the input values, it is called a linear classifier. The hyperplane itself is determined by w: in the augmented form, w contains both a and b, which set its slope and its offset.