Notions About the Java Language

From WikiLabs

Java is a programming language developed by Sun Microsystems (now part of Oracle) and launched in 1995. Countless technologies for web and distributed applications have been built on this language and on the idea of a virtual machine ([1]).

Compilation vs. Interpretation

Before presenting the virtual machine mechanism, we briefly describe how programs written in different programming languages are brought to execution.

A computer processor (or a computing machine in general) can execute a fixed set of instructions, called its instruction set. A program can be written directly with these instructions (using their mnemonics), i.e. in assembly language, or in a high-level language. Both approaches have advantages and disadvantages.

A program written in assembly language is not portable: it can be executed only on the kind of machine for which it was written, because it depends on the instruction set of the processor, on the operating system running on the machine, on the peripherals, etc. Another disadvantage is that even relatively simple functionality requires much more code than in a high-level language (such as C or Java), which makes the code harder to maintain and to extend. On the other hand, an experienced programmer can optimize very aggressively in assembly language, since they have control over all resources.

However, complex applications need a compromise between performance and the size and complexity of the code. This led to formal, high-level languages, which are more intuitive, easier to learn and understand, and abstract away many of the low-level details of a computing machine (instruction set, memory map, structure of the execution stack, memory allocation, and so on). The advantages are considerable, from simpler syntax and semantics to a certain degree of portability. This portability exists because the language is the same everywhere, so an algorithm is described in the same way for any machine. But a question arises: since a processor can only execute its own instruction set, how do we get from a program written in a generic high-level language to the instructions of a particular machine? There are two solutions: interpretation and compilation.

Interpretation

Diagram of the interpretation system

Interpretation of a program is done by another program, called an interpreter. The interpreter parses the source code of the program you wish to run and executes it step by step, so the translation into machine-level operations happens while the program is running. The advantage of this approach is that code can be generated dynamically, at run time. An example of an interpreted language is JavaScript, which accepts constructs like:

eval("x=10;y=20;document.write(x*y)");    // defines x and y, then writes 200
document.write("<br />" + eval("2+2"));   // writes 4 on a new line
document.write("<br />" + eval("x+17"));  // x is still 10, so this writes 27

The function eval takes a character string as its argument, which may be built dynamically, and evaluates it as a language expression. A purely compiled program cannot do this directly, because the translator is no longer present at run time. The major disadvantage of interpreted programs is their low execution speed, since the translation from the high-level language to machine operations is done at run time.
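The parse-and-execute idea described above can be sketched in Java. The class below is a hypothetical, minimal interpreter for integer expressions with +, - and *; it is only an illustration of the principle and says nothing about how real JavaScript engines are built.

```java
// A minimal sketch of an interpreter: it parses an arithmetic expression
// given as a string at run time and evaluates it while parsing.
// TinyInterpreter and its small grammar are hypothetical, for illustration only.
class TinyInterpreter {
    private final String src;
    private int pos = 0;

    TinyInterpreter(String src) {
        // Strip whitespace so the parser only sees digits and operators.
        this.src = src.replaceAll("\\s+", "");
    }

    // Entry point, loosely analogous to eval("2+2") in the JavaScript example.
    static int eval(String source) {
        TinyInterpreter interpreter = new TinyInterpreter(source);
        int value = interpreter.expression();
        if (interpreter.pos != interpreter.src.length()) {
            throw new IllegalArgumentException("Unexpected input at position " + interpreter.pos);
        }
        return value;
    }

    // expression := term (('+' | '-') term)*
    private int expression() {
        int value = term();
        while (pos < src.length() && (src.charAt(pos) == '+' || src.charAt(pos) == '-')) {
            char op = src.charAt(pos++);
            int right = term();
            value = (op == '+') ? value + right : value - right;
        }
        return value;
    }

    // term := number ('*' number)*
    private int term() {
        int value = number();
        while (pos < src.length() && src.charAt(pos) == '*') {
            pos++;
            value *= number();
        }
        return value;
    }

    // number := digit+
    private int number() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) {
            pos++;
        }
        if (start == pos) {
            throw new IllegalArgumentException("Number expected at position " + pos);
        }
        return Integer.parseInt(src.substring(start, pos));
    }

    public static void main(String[] args) {
        // The expression text could just as well come from user input.
        System.out.println(eval("2+2"));    // prints 4
        System.out.println(eval("10*20"));  // prints 200
    }
}
```

Every call to eval repeats the parsing work, which is the run-time cost mentioned above.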

Compilation

Diagram of the compile system

A compiler is a program that translates one programming language into another. It is a translator, not between two natural languages, but between two formal languages. The difference is that a formal language has a set of strict rules, which makes a rigorous translation of an algorithm possible. Unlike the interpreter, the compiler does the translation statically, at compile time, from the high-level language to the assembly language of a specific processor. Once the program has been translated to assembly language, and then into machine code with the help of an assembler and a linker, it is run directly on the processor, without any additional programs.

The Java Virtual Machine (JVM)

Diagram of the JVM execution environment

Java programs are first compiled and then interpreted. The reason is an extra layer inserted into the execution flow, called the Java Virtual Machine. As the name suggests, the JVM is a processor, though not a real, physical, silicon one: it is a virtual processor, simulated by the host CPU. In other words, the virtual machine is itself a program. This concept has two main advantages:

  • portability - the JVM has a clear specification and every implementation must behave the same way, regardless of the hardware it runs on; this means that once a program is compiled for the JVM, it will run on any implementation, on any host CPU and any operating system;
  • security - the virtual machine layer behaves like a sandbox: the program executes inside the virtual machine rather than directly on the real machine, which offers a high degree of protection.
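As a small illustration of the portability claim, the sketch below prints information about the host it runs on. The class name HostInfo is hypothetical; the system properties os.name, os.arch and java.version are standard. The same compiled class can be copied to another machine with a JVM and will report that machine's values without recompilation.

```java
// A minimal sketch: the compiled class is the same everywhere,
// only the values reported by the host JVM differ.
class HostInfo {
    static String describeHost() {
        return System.getProperty("os.name")
                + " / " + System.getProperty("os.arch")
                + " / Java " + System.getProperty("java.version");
    }

    public static void main(String[] args) {
        System.out.println(describeHost());
    }
}
```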

In the first phase, the source program, stored in a file with the .java extension, is compiled with the Java compiler (javac). The result is a new file with the .class extension, which contains code executable by the JVM (bytecode). In the second phase, this file is loaded by the virtual machine (java) and interpreted.
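The two phases can be tried with a very small program. The file and class name Hello are hypothetical; javac and java are the tools named above, and javap is the class file disassembler shipped with the JDK.

```java
// File: Hello.java (hypothetical example)
//
// Phase 1, compile:  javac Hello.java   (produces Hello.class, the JVM bytecode)
// Phase 2, run:      java Hello         (the JVM loads and executes Hello.class)
// Optional:          javap -c Hello     (lists the bytecode instructions)
class Hello {
    static String greeting() {
        return "Hello from the JVM";
    }

    public static void main(String[] args) {
        System.out.println(greeting());  // prints "Hello from the JVM"
    }
}
```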

Specific Features of the Java Language

Although the Java language syntax is derived mainly from C, there are certain features, especially related to memory access, which are fundamentally different:

  • in Java there are no pointers, only references; consequently, there is no pointer arithmetic;
  • runtime checking - every array access is verified at run time, and an exception is thrown when an index outside the array bounds is used;
  • garbage collection - the programmer does not need to keep track of allocated memory; a memory chunk is reclaimed automatically once there are no more references to it;
  • distributed computing - Java provides, in its default libraries, classes which facilitate connectivity (java.net) and distributed execution (java.rmi);
  • multithreading - Java provides, in its default libraries, classes which facilitate the concurrent execution of multiple methods and their synchronization.
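Some of the features listed above can be observed in a few lines of code. The following is a minimal sketch; the class and method names are hypothetical, and only standard language constructs and the Thread class are used.

```java
// A minimal sketch of references, runtime bounds checking and multithreading.
class LanguageFeaturesDemo {
    private int counter = 0;

    // synchronized: only one thread at a time may run these methods on a given object.
    synchronized void increment() { counter++; }
    synchronized int get() { return counter; }

    // References: a and b refer to the same array object; no addresses are exposed.
    static int sharedReference() {
        int[] a = {1, 2, 3};
        int[] b = a;      // copies the reference, not the array
        b[0] = 42;
        return a[0];      // the change is visible through a
        // After the method returns the array is unreachable, so the
        // garbage collector may reclaim it; there is no explicit free.
    }

    // Runtime checking: an invalid index raises an exception instead of
    // silently touching memory outside the array.
    static boolean outOfBoundsIsCaught() {
        int[] a = new int[3];
        try {
            a[3] = 1;     // valid indices are 0..2
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    // Multithreading: two threads increment a shared, synchronized counter.
    static int countWithTwoThreads() {
        LanguageFeaturesDemo demo = new LanguageFeaturesDemo();
        Runnable work = () -> {
            for (int i = 0; i < 10_000; i++) {
                demo.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();    // wait for both threads to finish
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(sharedReference());      // prints 42
        System.out.println(outOfBoundsIsCaught());  // prints true
        System.out.println(countWithTwoThreads());  // prints 20000
    }
}
```

Without the synchronized keyword on increment, the two threads could interleave their updates and the final count might be lower than 20000.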

Besides standalone applications, Java also supported another type of application called applets, which ran inside a web browser and made it easier to develop interactive web applications. Modern browsers no longer run applets, and the Applet API has been deprecated in recent Java releases.

Perhaps the most important advantage of the Java language, however, is the set of classes provided with the platform by Oracle, generically called the Application Programming Interface (API). It contains a vast collection of already implemented functionality, ready to be used in any type of application.
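As a small illustration of reusing the API, the sketch below sorts a list of words with classes from java.util instead of implementing a sorting algorithm by hand. The class name ApiDemo and the sample words are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// A minimal sketch: the collection and sorting code comes from the standard API.
class ApiDemo {
    // Returns a sorted copy, leaving the caller's list unchanged.
    static List<String> sortedCopy(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);  // natural (alphabetical) order for strings
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedCopy(Arrays.asList("rmi", "net", "util")));  // prints [net, rmi, util]
    }
}
```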