Matt Zand is a programmer, businessman, IT Consultant, and writer. He is the founder and owner of WEG2G Group. He is also the founder of DC Web Makers. His hobbies are hiking, biking, outdoor activities, traveling, and mountain climbing.
Machine Language Overview
It’s amazing. You type some words and numbers on a screen and, poof, you have created a software program. That is essentially how all modern high-level programming languages work, because they hide a great deal of detail behind abstraction. But as any programmer knows, the computer does not read the words you write on screen; it can only execute machine language, which is binary code. High-level languages must therefore be converted into lower-level forms, such as assembly, and then into machine code. But how do words on a screen that make sense to us become thousands of ones and zeros?
Compiler and Assembly Language
As you can see from the previous image, the compiler is the program that takes your code and converts it into object code, which is essentially assembly language for our purposes. The compiler typically runs through the high-level language statements multiple times until it has completely built the object code. It cannot just take one look at each line of the high-level code and convert it, because some high-level programs are too complex for that. For example, a line may refer to a variable or function that is only defined further down, or a line lower in the code might affect a line above it, meaning the compiler has to look at the code above again. So, the object code finally built by the compiler goes through many iterations. But how does a compiler actually “choose” how a line of code gets converted? How does it know to convert “int x = 3;” to “ldx r22, 3”?
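To make the idea of multiple passes concrete, here is a minimal sketch in Python. It assumes a made-up instruction set (“jmp”, “load”, “halt”) and a made-up “label:” syntax, neither of which belongs to a real compiler. The first pass only records where each label sits; the second pass can then translate a jump that refers to a label defined further down.

```python
# Toy two-pass translator. The instruction names and the "label:" syntax
# are invented for illustration; they do not belong to any real toolchain.

def collect_labels(lines):
    """Pass 1: record the instruction index each label refers to."""
    labels = {}
    index = 0
    for line in lines:
        line = line.strip()
        if line.endswith(":"):
            labels[line[:-1]] = index   # label points at the next instruction
        elif line:
            index += 1
    return labels


def resolve_jumps(lines, labels):
    """Pass 2: emit instructions, replacing label names with indices."""
    output = []
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):
            continue                    # labels produce no instruction
        op, _, arg = line.partition(" ")
        if op == "jmp":
            arg = str(labels[arg])      # known only after pass 1 has finished
        output.append(f"{op} {arg}".strip())
    return output


source = [
    "jmp end",        # refers to a label defined further down
    "load r1, 3",
    "end:",
    "halt",
]

labels = collect_labels(source)
program = resolve_jumps(source, labels)
print(labels)
print(program)
```

A single pass could not translate “jmp end” on the first line, because the position of “end” is not known until the rest of the code has been read.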
The compiler first breaks the code you typed into many small tokens. Then, it looks for keywords and patterns that always lead to the same kind of object code. For example, if a statement begins with “int,” the compiler may have a rule that associates that declaration with the command “ldx [register value], [integer value].” It then arranges the tokens into a parse tree, which captures the structure of the program. Once the parse tree exists, it is possible to see all of the object code commands that will be generated from the original higher-level code. However, the compiler is not finished. There may still be inefficiencies within the code. Using the parse tree, the compiler can eliminate unused variables or unnecessary loops that have no effect on the program’s result, leading to a smaller parse tree and a more efficient program. After this, the object code has been created and optimized, and the linker can combine all the object code files into an executable in machine language. It is worth noting that the best way to learn how compilers and assembly work is through projects and practice. For instance, DC Web Makers Company only offers project-based training where students learn concepts through real-world projects.
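The steps above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the token pattern, the function names, and the register numbering are hypothetical, it handles nothing but statements of the form “int name = number;”, and it skips the parse tree entirely by working on a flat token list. The set of used variables is simply given, whereas a real compiler would compute it.

```python
import re

# Hypothetical token pattern for a tiny C-like subset: names, integers, and
# a few punctuation characters.
TOKEN_PATTERN = re.compile(r"[A-Za-z_]\w*|\d+|[=;+]")


def tokenize(statement):
    """Split one statement into a flat list of tokens."""
    return TOKEN_PATTERN.findall(statement)


def translate_declaration(tokens, register):
    """Map 'int <name> = <number> ;' to the article's 'ldx' form."""
    keyword, name, equals, value, semicolon = tokens
    if keyword != "int" or equals != "=" or semicolon != ";":
        raise ValueError(f"unsupported statement: {tokens}")
    return f"ldx {register}, {value}"


def remove_unused(declarations, used_names):
    """Drop declarations whose variable is never read later on."""
    return [d for d in declarations if tokenize(d)[1] in used_names]


source = ["int x = 3;", "int unused = 7;", "int y = 5;"]
used = {"x", "y"}     # assumed result of an earlier usage analysis

kept = remove_unused(source, used)
object_code = [
    translate_declaration(tokenize(stmt), f"r{22 + i}")
    for i, stmt in enumerate(kept)
]
print(tokenize("int x = 3;"))
print(object_code)
```

The declaration of “unused” never reaches the output, which is the same effect, on a very small scale, as the optimization step described above.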
How Data Are Executed in a Machine
Combining object code files into a single executable is not a simple process. The linker is the error catcher. It looks through all of the libraries referenced in the object code files to make sure every reference can be resolved. If a particular symbol in the object code does not match a definition in any object file or library, the linker reports an error and no executable is produced. If all object code files pass this check, the linker looks up the memory addresses of the symbols in the referenced libraries. It then places the object code at particular addresses and updates each reference to point to the right location. If identical constants are referenced in one object file or in several, the linker can store a single copy instead of placing equal constants in various memory locations, and every reference then uses that one address. Once the linker has gone through every object file and stored its contents efficiently at memory addresses, an executable is produced. Based on a symbol table, similar to the compiler’s token table, the linker replaces symbolic names with concrete addresses, producing machine instructions that a computer can understand.
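As a rough illustration of what the linker does, the following Python sketch resolves symbols across two “object files” and stores each distinct constant only once. The dictionary layout (“defines”, “needs”, “constants”), the starting address, and the 4-byte slot size are all assumptions made for this example, not a real object file format.

```python
# Toy linker. The object-file layout below (dicts with "defines", "needs",
# and "constants") is invented for illustration, not a real file format.

def link(object_files, base_address=0x1000):
    """Resolve symbols, merge duplicate constants, and assign addresses."""
    symbols = {}
    constants = {}
    address = base_address

    # Give every defined symbol an address.
    for obj in object_files:
        for name in obj["defines"]:
            if name in symbols:
                raise ValueError(f"duplicate symbol: {name}")
            symbols[name] = address
            address += 4

    # Store each distinct constant once, however often it appears.
    for obj in object_files:
        for value in obj["constants"]:
            if value not in constants:
                constants[value] = address
                address += 4

    # Every referenced symbol must be defined somewhere.
    for obj in object_files:
        for name in obj["needs"]:
            if name not in symbols:
                raise ValueError(f"undefined symbol: {name}")

    return symbols, constants


main_obj = {"defines": ["main"], "needs": ["square"], "constants": [3, 10]}
math_obj = {"defines": ["square"], "needs": [], "constants": [10]}

symbols, constants = link([main_obj, math_obj])
print(symbols)
print(constants)
```

The constant 10 appears in both files but receives a single address, and removing the second file would make the call fail with an “undefined symbol” error instead of producing a result.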
And voilà! Now, the program is an executable written in machine code! Ironically, it takes some abstraction to explain how a high-level language becomes machine code, because the technical details are complicated. Entire university courses are dedicated to how language conversion and low-level languages work, so this was only a brief overview of the subject.
Summary
Software engineers and mobile app designers usually learn and work with both lower-level languages like C and high-level languages like Java. Thus, new learners and students should learn both high-level and low-level programming. There are lots of online resources for learning software engineering. High School Technology Services offers various hands-on training for teenagers and high school students. Coding Bootcamps Institute offers many basic to advanced programming classes for adults and professionals, focusing on projects and algorithm design. For those wanting to learn more about coding and technology careers, I suggest reading the IT career roadmap article.