Matt Zand is a programmer, businessman, IT Consultant, and writer. He is the founder and owner of WEG2G Group. He is also the founder of DC Web Makers. His hobbies are hiking, biking, outdoor activities, traveling and mountain climbing.
Machine Language Overview
It’s amazing, isn’t it? You can type some words and numbers on a screen and, poof, you have created a software program. That is essentially how all modern high-level programming languages work, because they involve a great deal of abstraction. But as any programmer knows, the computer does not actually read the words that you write on screen; its processor can only execute machine language, or binary code. This means that a program written in a high-level language must be converted into a lower-level language, like assembly, and then finally into machine code. But how do words on a screen that make sense to us become thousands of ones and zeros?
Compiler and Assembly Language
As you can see from the previous image, the compiler is the program that takes the high-level source code and converts it into object code, which is essentially assembly language for our purposes. What actually happens is that the compiler runs through the high-level statements multiple times, in several passes, until it has completely built the object code. The compiler cannot just take one look at each line and convert it, because lines depend on each other. For example, a line may use a function or variable that is only defined further down, or a later line may change how an earlier line should be translated, meaning the compiler has to look at the earlier code again. So, the object code that the compiler finally builds goes through many iterations. But how does a compiler actually “choose” how a line of code gets converted? How does it know to convert “int x = 3;” to something like “ldx r22, 3”?
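The need for more than one pass can be sketched in a few lines of Python. This is only a toy illustration, not how any real compiler is written: the (kind, name) program format and the function names collect_definitions and check_calls are assumptions made up for this example. The first pass records every name that is defined anywhere, so that the second pass can check a call that appears before its definition.

```python
# Toy "program": each entry is (kind, name). This format is hypothetical.
# The first line calls helper before helper is defined further down.
SOURCE = [
    ("call", "helper"),
    ("define", "helper"),
    ("call", "missing"),
]


def collect_definitions(source):
    """Pass 1: record every name that is defined anywhere in the program."""
    return {name for kind, name in source if kind == "define"}


def check_calls(source, defined):
    """Pass 2: with all definitions known, list calls that have no definition."""
    return [name for kind, name in source
            if kind == "call" and name not in defined]


defined = collect_definitions(SOURCE)
unresolved = check_calls(SOURCE, defined)
print(unresolved)  # ['missing']
```

A single top-to-bottom pass would have wrongly rejected the call to helper, because its definition had not been seen yet.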
Well, essentially the compiler first breaks up the code you typed into many small tokens, such as keywords, names, numbers, and symbols. It then groups those tokens into a parse tree according to the grammar of the language, so that “int x = 3;” is recognized as the declaration of an integer variable x with the initial value 3. Each kind of node in the tree leads to the same pattern of object code. For example, the compiler may have a table that associates an integer declaration with the command “ldx [register value], [integer value]”. From the parse tree, it is possible to see all of the object code commands that will be created from the initial higher-level code. However, the compiler is not finished. There may still be inefficiencies within the code. Using the parse tree, the compiler can get rid of any unused variables, or unnecessary loops that don’t affect the result of the program, leading to a smaller parse tree and a more efficient program. After this is done, the object code has been created and optimized, which allows the linker to come in and combine all of the object code files into a single executable in machine language. It is worth noting that the best way to learn how compilers and assembly work is via projects and practice. For instance, DC Web Makers Company only offers project-based training where students learn concepts through real world projects.
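The tokenizing, tree-building, pruning, and code-generation steps described above can be sketched as a very small Python program. Everything here is an assumption made for illustration: the only statement it understands is “int NAME = VALUE;”, the function names are invented, and “ldx” and “mov” are stand-in instructions in the spirit of the “ldx r22, 3” example rather than a real instruction set.

```python
import re

KEYWORDS = {"int"}


def tokenize(source):
    """Break source text into (kind, text) tokens."""
    tokens = []
    for text in re.findall(r"[A-Za-z_]\w*|\d+|[=;]", source):
        if text in KEYWORDS:
            kind = "keyword"
        elif text.isdigit():
            kind = "number"
        elif text in {"=", ";"}:
            kind = "symbol"
        else:
            kind = "identifier"
        tokens.append((kind, text))
    return tokens


def parse(tokens):
    """Group tokens into declaration nodes of the form: int NAME = VALUE ;"""
    nodes = []
    for start in range(0, len(tokens), 5):
        chunk = tokens[start:start + 5]
        kinds = [kind for kind, _ in chunk]
        texts = [text for _, text in chunk]
        well_formed = (
            len(chunk) == 5
            and texts[0] == "int"
            and kinds[1] == "identifier"
            and texts[2] == "="
            and kinds[3] in ("number", "identifier")
            and texts[4] == ";"
        )
        if not well_formed:
            raise SyntaxError(f"expected 'int NAME = VALUE;' near {texts}")
        nodes.append({"name": texts[1], "value": texts[3]})
    return nodes


def remove_unused(nodes, result_name):
    """Optimization: keep only declarations that the result depends on."""
    needed = {result_name}
    kept = []
    for node in reversed(nodes):
        if node["name"] in needed:
            kept.append(node)
            if not node["value"].isdigit():
                needed.add(node["value"])  # the value is another variable
    return list(reversed(kept))


def generate(nodes, first_register=22):
    """Emit one made-up instruction per declaration node."""
    registers = {}
    lines = []
    for node in nodes:
        register = f"r{first_register + len(registers)}"
        registers[node["name"]] = register
        if node["value"].isdigit():
            lines.append(f"ldx {register}, {node['value']}")
        else:
            lines.append(f"mov {register}, {registers[node['value']]}")
    return lines


program = "int x = 3; int unused = 7; int y = x;"
tree = remove_unused(parse(tokenize(program)), result_name="y")
print(generate(tree))  # ['ldx r22, 3', 'mov r23, r22']
```

The declaration of unused never reaches the output, because nothing that contributes to y refers to it. Real compilers work on far richer trees and intermediate forms, but the order of the steps is the same.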
How Data Are Executed in a Machine
Combining object code files into a single executable is not a simple process. The linker is basically the final error catcher. It looks through all of the libraries that the object code files reference to make sure that every symbol they use, such as a function or variable name, is actually defined somewhere. If a particular symbol in the object code does not match up with any object file or library, the linker reports an error and no executable is produced. If all of the object code files link without errors, the linker then works out the memory address of every symbol. It places the contents of each object file at particular addresses and fills in each reference with the address of the symbol it refers to. If identical constants are referenced in one object file or in several, the linker can store a single copy at one address and point every reference to that address, instead of placing identical copies in multiple memory locations. Once every object file has been processed and its contents have been laid out in memory in an efficient manner, an executable is produced. In this simplified picture, the last step uses a lookup table, similar to the one a compiler uses, to convert the object code into binary machine instructions (in real toolchains this translation is normally done by an assembler before linking), effectively making a program that a computer can actually understand.
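The linker’s three jobs described above (checking that every symbol is defined, assigning addresses, and storing identical constants only once) can be sketched in Python as follows. This is a toy model under stated assumptions: the dictionary layout of an “object file”, the file names main.o and math.o, the base address 0x1000, and the fixed 16-byte slot size are all invented for the example and do not reflect any real object file format.

```python
# Hypothetical object files: the symbols each one defines, the symbols it
# needs from elsewhere, and the constants it uses.
OBJECT_FILES = [
    {"name": "main.o", "defines": ["main"], "needs": ["square"],
     "constants": ["hello", "3"]},
    {"name": "math.o", "defines": ["square"], "needs": [],
     "constants": ["3"]},
]


def link(object_files, base_address=0x1000, slot_size=16):
    """Resolve symbols, assign addresses, and merge identical constants."""
    addresses = {}
    next_address = base_address

    # Give every defined symbol its own address, rejecting duplicates.
    for obj in object_files:
        for symbol in obj["defines"]:
            if symbol in addresses:
                raise ValueError(f"duplicate symbol: {symbol}")
            addresses[symbol] = next_address
            next_address += slot_size

    # Every symbol a file needs must now have a definition somewhere.
    for obj in object_files:
        for symbol in obj["needs"]:
            if symbol not in addresses:
                raise LookupError(f"{obj['name']}: undefined symbol {symbol}")

    # Store each distinct constant once, however many files mention it.
    constants = {}
    for obj in object_files:
        for value in obj["constants"]:
            if value not in constants:
                constants[value] = next_address
                next_address += slot_size

    return addresses, constants


addresses, constants = link(OBJECT_FILES)
print(addresses)  # {'main': 4096, 'square': 4112}
print(constants)  # {'hello': 4128, '3': 4144}
```

The constant “3” appears in both files but receives a single address, and removing math.o from the list would make link stop with an undefined symbol error for square, which mirrors the failure case described above.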
And voilà! Now the program is an executable written in machine code! Ironically, there was some abstraction in this explanation of how a high-level language becomes machine code, as the technical lingo is extremely difficult to understand. There are serious university courses dedicated entirely to how language conversion and low-level languages work, so of course, this was only a brief overview of the subject.
Software engineers and mobile app designers usually learn and work with both lower-level languages such as C and higher-level languages such as Java. Thus, it’s important for new learners or students to learn both high-level and low-level programming. There are lots of online resources for learning software engineering. For teenagers and high school students, High School Technology Services offers a variety of hands-on training. For adults and professionals, Coding Bootcamps institute offers many basic to advanced programming classes with a focus on projects and algorithm design. For those who wish to learn more about coding and technology careers, I suggest reading the IT career roadmap article.