Hyperplanes 

Many of you will have guessed it: when we move beyond three dimensions, we use hyperplanes. We will define them with a bit of mathematics.

A linear equation such as y = ax + b has two variables, x and y, and a y-intercept b. If we rename y as x2 and x as x1, the equation becomes x2 = ax1 + b, which implies ax1 - x2 + b = 0. If we define the 2D vectors x = (x1, x2) and w = (a, -1) and use the dot product, the equation becomes w·x + b = 0.

Remember, x·y = x1y1 + x2y2.
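To make the rewrite concrete, here is a small sketch (the slope a and intercept b are arbitrary example values) checking that a point on the line y = ax + b satisfies w·x + b = 0:

```python
import numpy as np

# Line y = 2x + 3, i.e. a = 2, b = 3
a, b = 2.0, 3.0
w = np.array([a, -1.0])  # w = (a, -1)

# A point on the line: x1 = 5, x2 = a*5 + b = 13
x = np.array([5.0, a * 5.0 + b])

# w·x + b = 2*5 + (-1)*13 + 3 = 0
print(np.dot(w, x) + b)  # prints 0.0
```

Any point on the line gives exactly zero; points off the line give a nonzero value whose sign tells us which side they are on.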

So, a hyperplane is the set of points that satisfies the preceding equation. But how do we classify data with the help of a hyperplane?

We define a hypothesis function h:

h(xi) = +1 if w·xi + b ≥ 0

h(xi) = -1 if w·xi + b < 0

This could be equivalent to the following:

h(xi) = sign(w·xi + b)

It is also equivalent to the following, if we augment the vectors by setting x0 = 1 and w0 = b:

h(xi) = sign(w·xi)
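This equivalence folds the bias b into the weight vector: prepend a component x0 = 1 to x and a component w0 = b to w, and the dot product alone already includes the bias. A quick check with arbitrary example values:

```python
import numpy as np

w = np.array([2.0, -1.0])
b = 3.0
x = np.array([4.0, 1.0])

# Original form: sign(w·x + b)
h1 = np.sign(np.dot(w, x) + b)

# Augmented form: w0 = b, x0 = 1, so sign(w·x) suffices
w_aug = np.concatenate(([b], w))
x_aug = np.concatenate(([1.0], x))
h2 = np.sign(np.dot(w_aug, x_aug))

print(h1 == h2)  # prints True
```

Absorbing b into w this way is a common trick: it lets us write the classifier with a single dot product.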

What this means is that h uses the position of x with respect to the hyperplane to predict a value for y. A data point on one side of the hyperplane gets one class, and a data point on the other side gets the other class.
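Putting the pieces together, a minimal linear classifier over a few hand-picked points (the data and the values of w and b are hypothetical, chosen only for illustration):

```python
import numpy as np

w = np.array([2.0, -1.0])  # hyperplane: 2*x1 - x2 + 3 = 0
b = 3.0

def h(x):
    # +1 on one side of the hyperplane, -1 on the other
    return 1 if np.dot(w, x) + b >= 0 else -1

points = [np.array([0.0, 0.0]),   # 0 - 0 + 3 = 3  -> +1
          np.array([0.0, 10.0]),  # 0 - 10 + 3 = -7 -> -1
          np.array([1.0, 1.0])]   # 2 - 1 + 3 = 4  -> +1

print([h(p) for p in points])  # prints [1, -1, 1]
```

Note that the boundary case w·x + b = 0 is assigned to the +1 class here, matching the ≥ in the hypothesis function above.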

Because it uses the equation of a hyperplane, which is a linear combination of the input values, it is called a linear classifier. The shape of the hyperplane is determined by w: in the augmented form, w contains both b and a, the parameters responsible for the shape.