Introduction to Robotics

Robotics was evolving as a field long before the term 'robot' came into use. It is highly multidisciplinary, drawing on mechanical engineering, electronics, and computer science, and it was born out of a deliberate effort to bring about synergy between these fields. Hence, to learn robotics, one needs a wide breadth of knowledge along with depth in a specific area. This is crucial because robotic systems are highly coupled, and their beauty lies in how one branch of engineering flows elegantly into another. The basis of all this lies in linear algebra, calculus, probability and statistics: as with all engineering systems, a strong foundation in mathematics enables us to design and analyse robotic systems.

Having said all this, it is very difficult to define a 'robot'. Loosely speaking, even a dishwasher can be classified as a robot, and some may even feel that their online store's chatbot is one. According to most professionals in this field, a robot is a device that can 'sense, perceive and act on the environment'. I agree with this description very much, as a robot can be thought of as two distinct parts, the "mind" and the "body": it must be able both to perceive information intelligently and to interact physically with the environment. I briefly introduce topics within the domain of robotics below.

I would like to thank the QUT Robot Academy and Professor Peter Corke of the Australian Centre for Robotic Vision for their amazing contributions to robotics education. I shall be using some of his and his team's videos to illustrate the concepts presented below. I would also like to acknowledge Professor Kostas Alexis for his contributions.

Coordinate Transformations

Robotics necessarily deals with interaction with the environment around us. Robotic arms, rotorcraft, cars, humanoids, etc. - every type of robot has a set of moving parts. Since our objective is to control this motion, it becomes imperative to model it mathematically. Locating a rigid body in 2D space requires two coordinates, x and y. In general, however, the body's orientation must also be specified to fully define its 'state' in the environment. The combination of position and orientation is called 'pose'. Hence, a rigid body in 2D space needs three independent coordinates to define its pose (two for position and one for orientation), and a rigid body in 3D space needs six (three for position and three for orientation). The pose can be pictured as a Cartesian coordinate frame attached to the body. Coordinate transformations are matrices that encode these three or six independent quantities. They are required to describe the position of the same physical point (or body) in different coordinate frames. Once we have an expression for position, velocities and higher-order derivatives can be computed using calculus. This is helpful because many systems require the state to be described in the ground frame for analysis.
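To make the idea of a 2D pose concrete, here is a minimal sketch of a homogeneous transformation built from the three pose quantities (x, y and a rotation angle). The function names and the example pose are my own illustrative assumptions, not taken from any particular library.

```python
import math

def make_transform(x, y, theta):
    """Return the 3x3 homogeneous transform for a 2D pose (x, y, theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, x],
            [s,  c, y],
            [0.0, 0.0, 1.0]]

def transform_point(T, point):
    """Re-express a point given in the body frame in the frame that T maps into."""
    px, py = point
    return (T[0][0] * px + T[0][1] * py + T[0][2],
            T[1][0] * px + T[1][1] * py + T[1][2])

# Hypothetical robot pose in the ground frame: at (2, 1), rotated by 90 degrees.
T_ground_body = make_transform(2.0, 1.0, math.pi / 2)

# A point one unit ahead of the robot along its own x-axis...
p_body = (1.0, 0.0)
# ...lies approximately at (2, 2) when expressed in the ground frame.
p_ground = transform_point(T_ground_body, p_body)
print(p_ground)
```

The same pattern extends to 3D, where the matrix becomes 4x4 and the rotation block carries three orientation parameters instead of one.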


Kinematics

Kinematics is the study of motion. As we need to control robot motion according to our objective, a kinematic analysis is essential and is one of the first topics studied in robotics. For example, the kinematics of a robot arm covers the relationship between its joint angles and joint velocities and the position, orientation and velocity of any given point on the robot (usually the end-effector). There are two important formulations of the kinematic equations: forward and inverse. Forward kinematics gives the pose of the robot from the state of the directly controllable variables, e.g. the end-effector pose from the joint angles. Inverse kinematics is the inverse mapping: it tells us which values of the directly controllable variables achieve a desired position and orientation in Euclidean (Cartesian 3D) space.
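As a rough illustration of the forward and inverse mappings, consider a planar arm with two revolute joints. The link lengths, function names and joint angles below are assumptions chosen for the example, and the inverse solution shown is only one of the two possible elbow configurations.

```python
import math

def forward_kinematics(q1, q2, l1=1.0, l2=1.0):
    """End-effector position (x, y) of a planar two-link arm from joint angles."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y

def inverse_kinematics(x, y, l1=1.0, l2=1.0):
    """Joint angles that reach (x, y), choosing the solution with q2 >= 0."""
    # Law of cosines gives the elbow angle.
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        raise ValueError("target is outside the reachable workspace")
    q2 = math.acos(c2)
    # Shoulder angle: direction to the target minus the offset caused by the elbow.
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2

# Round trip: joint angles -> end-effector position -> joint angles.
x, y = forward_kinematics(0.3, 0.8)
q1, q2 = inverse_kinematics(x, y)
print(x, y, q1, q2)
```

Note that forward kinematics always has exactly one answer, whereas inverse kinematics may have several solutions or none, which is why the sketch has to pick a configuration and reject unreachable targets.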


Dynamics

Kinematics accounts only for the purely geometric properties of mechanisms and assumes that the motions can be executed regardless of the physical laws that actually govern them. The equations that relate the state variables (position, velocity, etc.) to the parameters of the system, generally expressed in the time domain, are called the system dynamics. Dynamic equations are characterised by the presence of time derivatives and, in general, impose temporal constraints on motion. They therefore govern how the system variables evolve as time progresses. In classical mechanics, applying Newton's second law gives the dynamic equations of rigid bodies. Similarly, applying Kirchhoff's laws to an LCR circuit gives the dynamic equations of the circuit. As robotics generally deals with physical motion, the Newton-Euler and Lagrangian methods are used frequently.
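To show what it means for dynamic equations to govern the evolution of a system over time, the sketch below integrates Newton's second law for a point mass moving in one dimension. The function name, the semi-implicit Euler integration scheme and all numbers are illustrative assumptions on my part.

```python
def simulate(force, mass, x0, v0, dt, steps):
    """Integrate m * x'' = force(x, v, t) with the semi-implicit Euler method."""
    x, v, t = x0, v0, 0.0
    for _ in range(steps):
        a = force(x, v, t) / mass  # Newton's second law: a = F / m
        v += a * dt                # velocity accumulates acceleration
        x += v * dt                # position accumulates velocity
        t += dt
    return x, v

# Hypothetical example: a 1 kg mass pushed from rest by a constant 2 N force for 1 s.
x_end, v_end = simulate(lambda x, v, t: 2.0, mass=1.0, x0=0.0, v0=0.0,
                        dt=0.001, steps=1000)

# The closed-form answer is x = 0.5 * a * t^2 = 1 m and v = a * t = 2 m/s;
# the numerical result is close to this, with a small integration error.
print(x_end, v_end)
```

Swapping the lambda for a spring force such as `lambda x, v, t: -k * x` would turn the same loop into a simulation of a mass-spring system.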


Control

As mentioned above, the dynamics of a system govern how the values of its variables evolve over time. This behaviour generally depends on properties of the system equations such as degree, order and parameter values. Hence, left to itself, the system is not "controllable". To make it controllable, i.e. to be able to reach desired states, a control input is required. This control input is usually a physical variable that enters the system dynamics through the physical laws governing the system. For example, applying an external force to a ball allows us to control its position and velocity. In general, controllers are devices that drive a system to a desired state by manipulating the system dynamics: through external forces and moments in mechanical systems, and through external voltages or currents in electromagnetic systems. There are many principles, approaches and methods for designing controllers, and feedback is generally employed for robust and accurate control. The mathematical expression for the control input is known as the 'control law'.
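The ball example can be sketched in a few lines. Below, a proportional-derivative (PD) feedback law, which is just one of many possible control laws, computes the force applied to a unit mass so that it settles at a desired position. The gains, time step and function names are assumptions made for illustration.

```python
def pd_control_law(x, v, x_ref, kp=4.0, kd=4.0):
    """Feedback control law: push towards the target and damp the velocity."""
    return kp * (x_ref - x) - kd * v

def run_closed_loop(x_ref, mass=1.0, dt=0.001, steps=10000):
    """Simulate a mass starting at rest at x = 0, driven by the PD control force."""
    x, v = 0.0, 0.0
    for _ in range(steps):
        u = pd_control_law(x, v, x_ref)  # control input (a force)
        a = u / mass                     # the input enters the dynamics via F = m * a
        v += a * dt
        x += v * dt
    return x, v

# After 10 simulated seconds the mass should sit very close to the target, at rest.
x_final, v_final = run_closed_loop(x_ref=1.0)
print(x_final, v_final)
```

The controller here uses feedback in the sense described above: the force is recomputed at every step from the measured position and velocity rather than being planned in advance.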

State Estimation

Once we are able to control the robot, we must answer the basic question of where the robot is in its environment. For example, a pilot needs to know where the aircraft is in the world in order to decide where to direct it to reach the destination. Extending this idea a little, the robot needs to know its entire state (which may include velocities, orientation and other variables of interest) in order to make control decisions. The task of estimating the robot state from physical models and/or sensor data is known as state estimation. State estimation is important because mathematical models are never a perfect representation of the world and sensor data is noisy. Hence, a probabilistic approach improves the reliability and robustness of robotic systems. An important example of state estimation is the Kalman Filter. The Kalman Filter uses Gaussians (a type of probability distribution) to model uncertainty in sensors and processes, and arrives at an estimate of the state that is optimal for linear systems with Gaussian noise. It is used, for example, in pose estimation of UAVs, where it fuses information from IMUs, GPS, cameras, etc.
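To give a feel for how the Kalman Filter combines a model with noisy measurements, here is a minimal one-dimensional sketch. It assumes the simplest possible setting: a single scalar state that is modelled as constant and measured directly. The function name, noise variances and measurement values are made up for the example.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar Kalman filter for a constant state."""
    # Predict: the model says the state stays put, but our uncertainty grows.
    mean_pred = mean
    var_pred = var + process_var
    # Update: blend prediction and measurement according to their uncertainties.
    gain = var_pred / (var_pred + measurement_var)
    mean_new = mean_pred + gain * (measurement - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new

# Hypothetical noisy readings of a quantity whose true value is about 1.0.
measurements = [1.1, 0.9, 1.0, 1.2, 0.8]

# Start from a very uncertain initial guess (Gaussian with mean 0, variance 1000).
mean, var = 0.0, 1000.0
for z in measurements:
    mean, var = kalman_step(mean, var, z, process_var=0.0, measurement_var=0.1)

# The estimate moves towards the measurements and the variance shrinks.
print(mean, var)
```

A real filter for a UAV would track a vector state with matrices in place of these scalars, but the predict and update steps have the same structure.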

Path Planning

Once we are able to control the motion of the robot, we need to decide how it should navigate through an environment. Every robot has a certain objective, commonly referred to as a "mission". During the mission, it has to constantly plan what it will do next, similar to how humans think and plan their actions. The goal of path planning is to generate trajectories in the robot's configuration space (the set of all possible states) that satisfy certain conditions. Mathematically, it can be expressed as an optimization problem. The most common example is finding the shortest path between two points without colliding with objects. Various optimal and sub-optimal search algorithms can be used to find such paths. This is a crucial step, as the mission or application is characterized by the robot's path. Path planning can also be motivated by factors other than finding the shortest route, such as exploration, mapping or inspection. A basic search algorithm used in path planning is A*. The '*' indicates optimality: A* returns a shortest path provided its heuristic never overestimates the remaining cost.
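The shortest-path example can be sketched with A* on a small occupancy grid. The grid, the 4-connected movement model and the Manhattan-distance heuristic are assumptions chosen to keep the example short; real planners typically search richer configuration spaces.

```python
import heapq

def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) and 1 (obstacle) cells."""
    def heuristic(cell):
        # Manhattan distance never overestimates the cost on a 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {start: None}
    best_cost = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > best_cost[cell]:
            continue  # stale entry: a cheaper route to this cell was already found
        if cell == goal:
            path = []
            while cell is not None:  # walk back through the parents
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_g
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None  # the goal cannot be reached

# Hypothetical map: the only gap in the wall on row 1 is at column 2.
grid = [[0, 0, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
path = a_star(grid, (0, 0), (3, 3))
print(path)
```

The returned path is a list of (row, column) cells from start to goal, or None when the obstacles cut the goal off entirely.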

Computer Vision

Computer vision (CV) forms a large part of robotics and is currently one of the most active fields of research. CV arose because of the most widely used sensor in robotics today: the camera. Engineers and researchers realised that the data coming from a camera can be used, beyond passive viewing, for a large array of functions and applications (pun intended). A camera image is fundamentally a 2-D matrix of intensity values. These values can either be gray-scale (single-valued) or a combination of R-G-B values. Similar to how the human brain uses information from the eye, the data from a camera can be processed to infer many things, including the nature, position and velocity of any object in sight. Mathematical tools from linear algebra are widely used in CV techniques. The most important application of CV in robotics is localization, especially in GPS-denied environments. By identifying certain "features" in the environment and relating the movement of these features from frame to frame to actual movement in space, the robot can localize itself in the world. The "fundamental" and "essential" matrices describe this mapping mathematically. "Features" are usually characteristic "image patches" in the field of view, which are used as anchor points. Like the human eyes, a system of two cameras (stereo vision) is generally used to infer depth. Other applications of CV include object recognition, classification and 3D reconstruction.
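As a small illustration of how two cameras yield depth, the sketch below uses the standard relation for an idealised, rectified stereo pair of pinhole cameras: depth equals focal length times baseline divided by disparity. The function name and the camera parameters are assumptions for the example, not values from any real device.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth (in metres) of a feature seen by a rectified stereo camera pair.

    disparity_px is the horizontal shift, in pixels, of the same feature
    between the left and right images.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity_px

# Hypothetical rig: 700 px focal length, cameras mounted 0.12 m apart.
near = depth_from_disparity(35.0, focal_length_px=700.0, baseline_m=0.12)
far = depth_from_disparity(7.0, focal_length_px=700.0, baseline_m=0.12)

# A larger disparity means the feature is closer to the cameras.
print(near, far)
```

In practice the hard part is finding which image patch in the left image corresponds to which patch in the right one; once that matching is done, the depth computation itself is this simple.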

Machine Learning

Machine learning (ML) is a class of algorithms in computer science that enables a computer to "learn" a specific task without being explicitly programmed for it. The performance of the algorithm generally improves with the amount of task-relevant data given to it. For example, a machine learning algorithm can be used to build a program that classifies emails as spam or not spam. The program takes in "features" of the input email (words, sender address, etc.) and outputs either "spam" or "not spam". The data supplied to the ML algorithm consists of hundreds or even thousands of labeled emails. This problem is known as "supervised learning" since labeled data is provided to the learning algorithm. A program of this kind is known in general as a "classifier".
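The spam example can be sketched with a perceptron, one of the simplest supervised classifiers, using only the words of each email as features. The four training emails are invented for illustration and are far too few for a real filter; the function names are likewise my own.

```python
def tokenize(text):
    """Features of an email: the set of lower-cased words it contains."""
    return set(text.lower().split())

def score(weights, bias, words):
    return bias + sum(weights.get(word, 0) for word in words)

def train_perceptron(examples, epochs=10):
    """Learn one weight per word from (text, is_spam) pairs of labeled data."""
    weights, bias = {}, 0
    for _ in range(epochs):
        for text, is_spam in examples:
            words = tokenize(text)
            target = 1 if is_spam else -1
            predicted = 1 if score(weights, bias, words) > 0 else -1
            if predicted != target:
                # Mistake: nudge the weights of the words present towards the label.
                for word in words:
                    weights[word] = weights.get(word, 0) + target
                bias += target
    return weights, bias

def classify(weights, bias, text):
    return "spam" if score(weights, bias, tokenize(text)) > 0 else "not spam"

# Tiny, made-up labeled data set (True means spam).
training_data = [
    ("win free money now", True),
    ("free prize claim now", True),
    ("meeting schedule for monday", False),
    ("project report attached", False),
]
weights, bias = train_perceptron(training_data)

print(classify(weights, bias, "claim free money"))
print(classify(weights, bias, "monday project meeting"))
```

Nothing in the code mentions which words indicate spam; the weights are learned entirely from the labeled examples, which is the sense in which the program is not explicitly programmed for the task.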

Mathematically, an ML algorithm is an optimization formulation whose cost function depends on the type of "learning" that is required. Within ML, the most popular formulation is the "artificial neural network" model. This model, inspired by human neurons, generates output values through a complex interconnection of nodes arranged in layers between the input and output layers. Other formulations include SVM, PCA, K-Means, etc. Machine learning has proven to be a game changer for problems that are very hard to model explicitly, as it develops "intelligence" from the data provided to it, much as a human being learns. The rise in the computational capabilities of modern devices has given further impetus to ML and its applications.
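To illustrate the view of learning as optimization, the sketch below fits a straight line to a handful of points by minimising a mean-squared-error cost with gradient descent. The data, learning rate and number of steps are assumptions picked so that the example converges quickly; neural networks are trained on the same principle, with many more parameters.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Gradients of the cost (1/n) * sum(error^2) with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step downhill on the cost surface.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Made-up data generated from the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = fit_line(xs, ys)

# The learned parameters approach the slope 2 and intercept 1 of the underlying line.
print(w, b)
```

Changing the cost function changes what is learned, which is the sense in which the cost depends on the type of learning required.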

Next Up: Design for Robotics